nastra commented on code in PR #14351:
URL: https://github.com/apache/iceberg/pull/14351#discussion_r2502761262


##########
core/src/main/java/org/apache/iceberg/RewriteTablePathUtil.java:
##########
@@ -592,6 +641,66 @@ record = recordIt.next();
     }
   }
 
+  /**
+   * Rewrite a DV (Deletion Vector) file, updating the referenced data file paths in blob metadata.
+   *
+   * @param deleteFile source DV file to be rewritten
+   * @param outputFile output file to write the rewritten DV to
+   * @param io file io
+   * @param sourcePrefix source prefix that will be replaced
+   * @param targetPrefix target prefix to replace it
+   */
+  private static void rewriteDVFile(
+      DeleteFile deleteFile,
+      OutputFile outputFile,
+      FileIO io,
+      String sourcePrefix,
+      String targetPrefix)
+      throws IOException {
+    InputFile sourceFile = io.newInputFile(deleteFile.location());
+
+    try (org.apache.iceberg.puffin.PuffinReader reader =

Review Comment:
   Can you please apply the diff below?
   
   ```
   private static void rewriteDVFile(
         DeleteFile deleteFile,
         OutputFile outputFile,
         FileIO io,
         String sourcePrefix,
         String targetPrefix)
         throws IOException {
       List<Blob> rewrittenBlobs = Lists.newArrayList();
       try (PuffinReader reader = Puffin.read(io.newInputFile(deleteFile.location())).build()) {
         // Read all blobs and rewrite them with updated referenced data file paths
         for (Pair<org.apache.iceberg.puffin.BlobMetadata, ByteBuffer> blobPair :
             reader.readAll(reader.fileMetadata().blobs())) {
           org.apache.iceberg.puffin.BlobMetadata blobMetadata = blobPair.first();
           ByteBuffer blobData = blobPair.second();
   
           // Get the original properties and update the referenced data file path
           Map<String, String> properties = Maps.newHashMap(blobMetadata.properties());
           String referencedDataFile = properties.get("referenced-data-file");
           if (referencedDataFile != null && referencedDataFile.startsWith(sourcePrefix)) {
             String newReferencedDataFile = newPath(referencedDataFile, sourcePrefix, targetPrefix);
             properties.put("referenced-data-file", newReferencedDataFile);
           }
   
           // Create a new blob with updated properties
           rewrittenBlobs.add(
               new Blob(
                   blobMetadata.type(),
                   blobMetadata.inputFields(),
                   blobMetadata.snapshotId(),
                   blobMetadata.sequenceNumber(),
                   blobData,
                   PuffinCompressionCodec.forName(blobMetadata.compressionCodec()),
                   properties));
         }
       }
   
       try (PuffinWriter writer =
           Puffin.write(outputFile).createdBy(IcebergBuild.fullVersion()).build()) {
         rewrittenBlobs.forEach(writer::write);
       }
     }
   ```
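
   For reference, the property rewrite in the loop above can be tried in isolation. This is only a minimal sketch using plain JDK types: `DvPathRewriteSketch`, `rewriteProperties`, and `replacePrefix` are hypothetical names, and `replacePrefix` is an assumed stand-in for `newPath`, whose real behavior may differ (for example in how it handles path separators).

   ```java
   import java.util.HashMap;
   import java.util.Map;

   final class DvPathRewriteSketch {
     // Property key as used in the snippet above.
     static final String REFERENCED_DATA_FILE = "referenced-data-file";

     private DvPathRewriteSketch() {}

     // Hypothetical stand-in for newPath(...): swaps a leading sourcePrefix for targetPrefix.
     // Callers must check startsWith(sourcePrefix) first, as rewriteProperties does.
     static String replacePrefix(String path, String sourcePrefix, String targetPrefix) {
       return targetPrefix + path.substring(sourcePrefix.length());
     }

     // Returns a copy of the blob properties. The referenced data file path is rewritten
     // only when it is present and starts with sourcePrefix; the input map is not modified.
     static Map<String, String> rewriteProperties(
         Map<String, String> original, String sourcePrefix, String targetPrefix) {
       Map<String, String> properties = new HashMap<>(original);
       String referenced = properties.get(REFERENCED_DATA_FILE);
       if (referenced != null && referenced.startsWith(sourcePrefix)) {
         properties.put(
             REFERENCED_DATA_FILE, replacePrefix(referenced, sourcePrefix, targetPrefix));
       }
       return properties;
     }

     public static void main(String[] args) {
       Map<String, String> props = new HashMap<>();
       props.put(REFERENCED_DATA_FILE, "s3://source-bucket/db/tbl/data/file-a.parquet");
       props.put("cardinality", "3");
       System.out.println(
           rewriteProperties(props, "s3://source-bucket/db/tbl", "s3://target-bucket/db/tbl"));
     }
   }
   ```

   Copying the map before mutating it mirrors the `Maps.newHashMap(blobMetadata.properties())` call above, so the properties read from the source file are left untouched.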


