jihoonson commented on a change in pull request #8925: Parallel indexing single 
dim partitions
URL: https://github.com/apache/incubator-druid/pull/8925#discussion_r355106547
 
 

 ##########
 File path: 
indexing-service/src/main/java/org/apache/druid/indexing/common/task/batch/parallel/distribution/PartitionBoundaries.java
 ##########
 @@ -20,28 +20,48 @@
 package org.apache.druid.indexing.common.task.batch.parallel.distribution;
 
 import com.google.common.collect.ForwardingList;
-import com.google.common.collect.ImmutableList;
 
 import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collections;
 import java.util.List;
+import java.util.stream.Collectors;
 
 /**
- * Convenience wrapper to make code more readable.
+ * List of range partition boundaries.
  */
-public class Partitions extends ForwardingList<String> implements List<String>
+public class PartitionBoundaries extends ForwardingList<String> implements 
List<String>
 {
   private final List<String> delegate;
 
   // For jackson
   @SuppressWarnings("unused")
-  private Partitions()
+  private PartitionBoundaries()
   {
     delegate = new ArrayList<>();
   }
 
-  public Partitions(String... partitions)
+  /**
+   * @param partitions Elements corresponding to evenly-spaced fractional 
ranks of the distribution
+   */
+  public PartitionBoundaries(String... partitions)
   {
-    delegate = ImmutableList.copyOf(partitions);
+    if (partitions.length == 0) {
+      delegate = Collections.emptyList();
+      return;
+    }
+
+    List<String> partitionBoundaries = Arrays.stream(partitions)
+                                             .distinct()
+                                             
.collect(Collectors.toCollection(ArrayList::new));
+
+    // First partition starts with null (see StringPartitionChunk.isStart())
+    partitionBoundaries.set(0, null);
+
+    // Last partition ends with null (see StringPartitionChunk.isEnd())
+    partitionBoundaries.add(null);
 
 Review comment:
   > Previously, I had logic to combine it with the second-to-last partition 
when it was small:
   > [#8925 
(comment)](https://github.com/apache/incubator-druid/pull/8925#discussion_r350539718)
   > 
   > If it's still desired to not have that logic to decide whether to combine 
it or not, then it needs to either always be combined or never be combined.
   
   I guess the small partitions are already being always combined by selecting 
distinct keys 
[here](https://github.com/apache/incubator-druid/pull/8925/files/83ab7a86a24c1eb739c3f6e44974d68b188ab898#diff-57c86134ac6a13111baab12eff55827bR54-R56)?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to