amogh-jahagirdar commented on code in PR #15310:
URL: https://github.com/apache/iceberg/pull/15310#discussion_r3028705792


##########
spark/v4.1/spark/src/main/java/org/apache/iceberg/spark/actions/RewriteManifestsSparkAction.java:
##########
@@ -329,7 +330,10 @@ private Column sortColumn() {
   }
 
   private Dataset<Row> repartitionAndSort(Dataset<Row> df, Column col, int numPartitions) {
-    return df.repartitionByRange(numPartitions, col).sortWithinPartitions(col);
+    // add xxhash64 of file path for range partition to make sure we have enough parallelism

Review Comment:
   Minor comment: in this project we try to avoid the first-person "we" in code
   comments, since it doesn't add much value in context. Suggested wording:
   "add xxhash64 of file path for range partition to ensure enough parallelism"



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

