dpcollins-google commented on a change in pull request #11919:
URL: https://github.com/apache/beam/pull/11919#discussion_r453637965



##########
File path: sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Reshuffle.java
##########
@@ -109,16 +110,33 @@ public void processElement(
   public static class ViaRandomKey<T> extends PTransform<PCollection<T>, PCollection<T>> {
     private ViaRandomKey() {}
 
+    private ViaRandomKey(@Nullable Integer numBuckets) {
+      this.numBuckets = numBuckets;
+    }
+
+    // The number of buckets to shard into. This is a performance optimization to prevent having
+    // unit sized bundles on the output. If unset, uses a random integer key.
+    private @Nullable Integer numBuckets;
+
+    public ViaRandomKey<T> withNumBuckets(@Nullable Integer numBuckets) {

Review comment:
       millsd@ pointed out earlier in this PR that it would have degenerate
performance and lead to single-key bundles, so he suggested I rewrite it
like this.
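
For readers following the thread, here is a minimal sketch of the idea described in the code comment above: with no bucket count every element gets an arbitrary random integer key, and with a bucket count the key is folded into a fixed range so that many elements share each key. The class `BucketedKeySketch` and the method `assignKey` are hypothetical names for illustration only and are not the code in this PR; the use of `Math.floorMod` to fold the key is also an assumption.

```java
import java.util.Random;

// Hypothetical illustration only; not the implementation from this PR.
final class BucketedKeySketch {
  private BucketedKeySketch() {}

  /**
   * Picks the key to pair with an element before grouping.
   *
   * <p>If numBuckets is null, returns an unbounded random int, so almost every
   * element ends up with its own key. Otherwise returns a key in
   * [0, numBuckets), so elements are spread over a fixed number of keys.
   */
  static int assignKey(Integer numBuckets, Random random) {
    int key = random.nextInt();
    if (numBuckets == null) {
      return key;
    }
    if (numBuckets <= 0) {
      throw new IllegalArgumentException("numBuckets must be positive");
    }
    // floorMod keeps the result non-negative even when key is negative.
    return Math.floorMod(key, numBuckets);
  }
}
```

With a bucket count of 8, for example, every key falls in 0..7, which is what lets a runner place more than one element under the same key.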




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

