anuragmantri commented on code in PR #15470:
URL: https://github.com/apache/iceberg/pull/15470#discussion_r2934182802


##########
core/src/main/java/org/apache/iceberg/RewriteTablePathUtil.java:
##########
@@ -445,25 +451,36 @@ private static RewriteResult<DeleteFile> writeDeleteFileEntry(
       String sourcePrefix,
       String targetPrefix,
       String stagingLocation,
-      ManifestWriter<DeleteFile> writer) {
+      ManifestWriter<DeleteFile> writer,
+      FileIO io,
+      PositionDeleteReaderWriter posDeleteReaderWriter) {
 
     DeleteFile file = entry.file();
     RewriteResult<DeleteFile> result = new RewriteResult<>();
 
     switch (file.content()) {
       case POSITION_DELETES:
-        DeleteFile posDeleteFile = newPositionDeleteEntry(file, spec, sourcePrefix, targetPrefix);
+        // Rewrite inline so the manifest records the actual file size, which changes because
+        // embedded data file paths are rewritten. The staging path is deterministic, so
+        // duplicates across manifests simply overwrite with identical content.
+        String staging = stagingPath(file.location(), sourcePrefix, stagingLocation);
+        OutputFile outputFile = io.newOutputFile(staging);
+        try {
+          rewritePositionDeleteFile(
+              file, outputFile, io, spec, sourcePrefix, targetPrefix, posDeleteReaderWriter);
+        } catch (IOException e) {
+          throw new UncheckedIOException(
+              "Failed to rewrite position delete file " + file.location(), e);
+        }
+        long actualSize = io.newInputFile(staging).getLength();

Review Comment:
   Can we change `rewritePositionDeleteFile()` to return the actual file 
size? At least for the non-DV path, the writer should know the actual size 
written once `writer.close()` returns. For DVs (Puffin files), I'm not sure we 
can get the length that way, so we may have to stick with `getLength()` there. 
What do you think?
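
   To make the suggestion concrete, here is a rough sketch of the idea using 
only plain `java.io`/`java.nio`, not the Iceberg classes. Every name in it 
(`RewriteSizeSketch`, `CountingOutputStream`, `rewritePaths`) is hypothetical 
and only stands in for the real rewrite; the point is that the method itself 
returns the number of bytes written, so the caller does not need a separate 
length lookup on the staged file.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Iceberg.
class RewriteSizeSketch {

  // Counts bytes as they pass through, so the size is known after close().
  static final class CountingOutputStream extends OutputStream {
    private final OutputStream delegate;
    private long count = 0;

    CountingOutputStream(OutputStream delegate) {
      this.delegate = delegate;
    }

    @Override
    public void write(int b) throws IOException {
      delegate.write(b);
      count++;
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
      delegate.write(b, off, len);
      count += len;
    }

    @Override
    public void flush() throws IOException {
      delegate.flush();
    }

    @Override
    public void close() throws IOException {
      delegate.close();
    }

    long count() {
      return count;
    }
  }

  // Stand-in for the rewrite: swaps the path prefix on each entry, writes one
  // entry per line, and returns the number of bytes written to the output.
  static long rewritePaths(
      List<String> paths, String sourcePrefix, String targetPrefix, Path output)
      throws IOException {
    CountingOutputStream out = new CountingOutputStream(Files.newOutputStream(output));
    try {
      for (String path : paths) {
        String rewritten =
            path.startsWith(sourcePrefix)
                ? targetPrefix + path.substring(sourcePrefix.length())
                : path;
        out.write((rewritten + "\n").getBytes(StandardCharsets.UTF_8));
      }
    } finally {
      out.close();
    }
    // The size is taken from the writer side; no second look at the file.
    return out.count();
  }
}
```

   With something of this shape, the caller could use the returned value as 
the recorded file size. If a particular writer does not expose its length 
after close, that path could still fall back to reading the length from the 
staged file.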


