aokolnychyi commented on code in PR #8972:
URL: https://github.com/apache/iceberg/pull/8972#discussion_r1379398550


##########
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/RewriteManifestsSparkAction.java:
##########
@@ -354,104 +323,90 @@ private void deleteFiles(Iterable<String> locations) {
         .run(location -> table.io().deleteFile(location));
   }
 
-  private static ManifestFile writeManifest(
-      List<Row> rows,
-      int startIndex,
-      int endIndex,
-      Broadcast<Table> tableBroadcast,
-      String location,
-      int format,
-      Types.StructType combinedPartitionType,
-      PartitionSpec spec,
-      StructType sparkType)
-      throws IOException {
+  private ManifestWriterFactory manifestWriters() {
+    return new ManifestWriterFactory(
+        sparkContext().broadcast(SerializableTableWithSize.copyOf(table)),
+        formatVersion,
+        spec.specId(),
+        stagingLocation,
+        // allow the actual size of manifests to be 20% higher as the estimation is not precise
+        (long) (1.2 * targetManifestSizeBytes));

Review Comment:
   Using 20% to be safe (instead of 10%). The goal is to avoid cutting a new 
file for just a few entries if the estimation is not precise.
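
   To illustrate the reasoning, here is a minimal, self-contained sketch of how a size-based roll behaves when the per-entry size estimate is slightly low. It is a simplified model and not the Iceberg writer: `ManifestRollSketch`, `countFiles`, and the byte sizes are hypothetical, and the roll rule (cut a new file when the next entry would exceed the limit) is an assumption made for the example.

```java
import java.util.Arrays;

// Hypothetical, simplified model of a size-based rolling writer (not Iceberg code).
final class ManifestRollSketch {

  // Counts the files produced when a new file is cut whenever adding the
  // next entry would push the current file past maxFileSizeBytes.
  static int countFiles(long[] entrySizes, long maxFileSizeBytes) {
    int files = 0;
    long currentSize = 0;
    for (long size : entrySizes) {
      boolean noOpenFile = files == 0;
      boolean wouldOverflow = currentSize + size > maxFileSizeBytes;
      if (noOpenFile || wouldOverflow) {
        files++; // start a new file
        currentSize = 0;
      }
      currentSize += size;
    }
    return files;
  }

  public static void main(String[] args) {
    long targetManifestSizeBytes = 1000L;

    // Assumption: the planner grouped 10 entries expecting 100 bytes each,
    // but each entry actually takes 110 bytes once written.
    long[] actualEntrySizes = new long[10];
    Arrays.fill(actualEntrySizes, 110L);

    // Rolling exactly at the target leaves a second file holding one entry.
    System.out.println(countFiles(actualEntrySizes, targetManifestSizeBytes));

    // Allowing 20% slack keeps all ten entries in a single file.
    System.out.println(countFiles(actualEntrySizes, (long) (1.2 * targetManifestSizeBytes)));
  }
}
```

   In this model a 10% underestimate is enough to split off a one-entry file when the limit equals the target, while the 1.2x limit absorbs it.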



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

