[GitHub] [iceberg] rdblue commented on a change in pull request #2022: Implement logic to group and sort rows before writing rows for MERGE INTO.

GitBox Sat, 16 Jan 2021 16:01:46 -0800


rdblue commented on a change in pull request #2022:
URL: https://github.com/apache/iceberg/pull/2022#discussion_r559055221




##########
File path: spark3/src/main/java/org/apache/iceberg/spark/Spark3Util.java
##########
@@ -269,6 +279,51 @@ public Transform unknown(int fieldId, String sourceName, int sourceId, String tr
     return transforms.toArray(new Transform[0]);
   }
 
+  public static Distribution toRequiredDistribution(PartitionSpec spec, SortOrder sortOrder, boolean inferFromSpec) {
+    if (sortOrder.isUnsorted()) {
+      if (inferFromSpec) {
+        SortOrder specOrder = Partitioning.sortOrderFor(spec);
+        return Distributions.ordered(convert(specOrder));
+      }
+
+      return Distributions.unspecified();
+    }
+
+    Schema schema = spec.schema();
+    Multimap<Integer, SortField> sortFieldIndex = Multimaps.index(sortOrder.fields(), SortField::sourceId);
+
+    // build a sort prefix of partition fields that are not already in the sort order
+    SortOrder.Builder builder = SortOrder.builderFor(schema);
+    for (PartitionField field : spec.fields()) {
+      Collection<SortField> sortFields = sortFieldIndex.get(field.sourceId());
+      boolean isSorted = sortFields.stream().anyMatch(sortField ->
+              field.transform().equals(sortField.transform()) ||
+                      sortField.transform().satisfiesOrderOf(field.transform()));

Review comment:
       I think this would fit on one line if the indentation were fixed. A 
continuation indent should be 2 indents (4 spaces).
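
For illustration, here is a minimal, self-contained sketch of the suggested layout. `Transform` and `Field` below are hypothetical stand-ins (not Iceberg's classes), and the `satisfiesOrderOf` rule is a toy assumption; only the indentation of the lambda body is the point. With a 4-space continuation indent, the whole predicate fits on a single line, assuming a 120-column limit.

```java
import java.util.Arrays;
import java.util.List;

public class ContinuationIndentSketch {
  // Hypothetical stand-in for a transform; the ordering rule below is a toy, not Iceberg's.
  static final class Transform {
    final String name;

    Transform(String name) {
      this.name = name;
    }

    // Toy rule (assumption): only "identity" satisfies the order of another transform.
    boolean satisfiesOrderOf(Transform other) {
      return name.equals("identity");
    }

    @Override
    public boolean equals(Object o) {
      return o instanceof Transform && ((Transform) o).name.equals(name);
    }

    @Override
    public int hashCode() {
      return name.hashCode();
    }
  }

  // Hypothetical stand-in exposing only transform(); used for partition and sort fields alike.
  static final class Field {
    private final Transform transform;

    Field(String transformName) {
      this.transform = new Transform(transformName);
    }

    Transform transform() {
      return transform;
    }
  }

  static boolean isSorted(Field field, List<Field> sortFields) {
    // The continuation line is indented 4 spaces (2 indents) past the start of the statement.
    return sortFields.stream().anyMatch(sortField ->
        field.transform().equals(sortField.transform()) || sortField.transform().satisfiesOrderOf(field.transform()));
  }

  public static void main(String[] args) {
    Field day = new Field("day");
    System.out.println(isSorted(day, Arrays.asList(new Field("day"))));      // true: same transform
    System.out.println(isSorted(day, Arrays.asList(new Field("identity")))); // true: toy ordering rule
    System.out.println(isSorted(day, Arrays.asList(new Field("bucket"))));   // false: neither condition holds
  }
}
```

In the method under review the statement starts 6 spaces in, so the continuation would start at 10 spaces and the joined line would come to 120 characters.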



