I suspect this was unintentional. It looks like @Daniel Collins added the numBuckets parameter in https://github.com/apache/beam/pull/11919; maybe they can confirm.

Brian

On Mon, May 3, 2021 at 5:17 PM Evan Galpin wrote:

> Hi all,
>
> While testing for a feature I’m implementing, I noticed that
> Reshuffle.AssignToShard[1] produces (N*2)-1 buckets, where N is the value
> of the user-defined numBuckets parameter. This is because the value of the
> variable having the remainder operator applied, hashOfShard, can be
> negative.
>
> Is it intentional to produce (N*2)-1 buckets? If not I’ll submit a small
> patch.
>
> I only worry about the implications of changing it for use cases already
> employing AssignToShard. Until recently (28 days ago[2]) the class was
> private, plus Reshuffle is marked as deprecated, so I imagine the impact
> would be low? Thoughts?
>
> Thanks,
> Evan
>
> [1] https://github.com/apache/beam/blob/abbe14f721327d51cce02876324e7feba98581e2/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Reshuffle.java#L160
> [2] https://github.com/apache/beam/commit/abbe14f721327d51cce02876324e7feba98581e2
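
For anyone following along, here is a minimal standalone sketch of the arithmetic Evan describes. It is not the Beam code: the class and method names (BucketCountDemo, remainderBucket, floorModBucket, countDistinct) are made up for illustration, and the hash values are just a range of ints rather than whatever AssignToShard actually computes. It only shows that Java's % keeps the sign of the dividend, so a hash that can be negative lands in -(N-1)..(N-1), which is (N*2)-1 distinct values, while Math.floorMod stays in 0..N-1.

```java
import java.util.HashSet;
import java.util.Set;

class BucketCountDemo {
    // Pattern described in the thread: remainder applied to a hash that may be negative.
    // In Java the result of % takes the sign of the dividend.
    static int remainderBucket(int hash, int numBuckets) {
        return hash % numBuckets;
    }

    // One possible fix (an assumption, not necessarily the patch that was submitted):
    // Math.floorMod returns a value in [0, numBuckets) for a positive numBuckets.
    static int floorModBucket(int hash, int numBuckets) {
        return Math.floorMod(hash, numBuckets);
    }

    // Counts how many distinct bucket ids appear for hashes in [-1000, 1000].
    static int countDistinct(int numBuckets, boolean useFloorMod) {
        Set<Integer> seen = new HashSet<>();
        for (int hash = -1000; hash <= 1000; hash++) {
            seen.add(useFloorMod
                ? floorModBucket(hash, numBuckets)
                : remainderBucket(hash, numBuckets));
        }
        return seen.size();
    }

    public static void main(String[] args) {
        int n = 4;
        System.out.println("remainder: " + countDistinct(n, false)); // 7, i.e. (4*2)-1
        System.out.println("floorMod:  " + countDistinct(n, true));  // 4
    }
}
```

Whether floorMod (or taking the absolute value first) is the right change for Reshuffle depends on whether anyone relies on the current key distribution, which is the compatibility concern raised above.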
