Hi all,

While testing a feature I’m implementing, I noticed that
Reshuffle.AssignToShard[1] produces (N*2)-1 buckets, where N is the value
of the user-defined numBuckets parameter. This is because hashOfShard, the
variable to which the remainder operator is applied, can be negative, so
the result ranges from -(N-1) to N-1 rather than from 0 to N-1.
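
To illustrate the arithmetic, here is a minimal standalone sketch. It is not
the Beam code itself: the class and method names (ShardBuckets,
remainderBucket, floorModBucket, countDistinct) are hypothetical, and
Math.floorMod is only one possible way to keep the result non-negative, not
necessarily what a patch would use.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class, not part of Beam.
class ShardBuckets {
    // Java's % takes the sign of the dividend, so for a negative
    // hashOfShard the result lies in [-(numBuckets-1), 0].
    static int remainderBucket(int hashOfShard, int numBuckets) {
        return hashOfShard % numBuckets;
    }

    // Math.floorMod takes the sign of the divisor, so for a positive
    // numBuckets the result always lies in [0, numBuckets-1].
    static int floorModBucket(int hashOfShard, int numBuckets) {
        return Math.floorMod(hashOfShard, numBuckets);
    }

    // Counts the distinct bucket ids seen over a range of positive
    // and negative hash values.
    static int countDistinct(int numBuckets, boolean useFloorMod) {
        Set<Integer> seen = new HashSet<>();
        for (int h = -1000; h <= 1000; h++) {
            seen.add(useFloorMod
                ? floorModBucket(h, numBuckets)
                : remainderBucket(h, numBuckets));
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(countDistinct(4, false)); // 7, i.e. (4*2)-1
        System.out.println(countDistinct(4, true));  // 4
    }
}
```

With numBuckets = 4, the plain remainder yields the ids -3..3 (seven
buckets), while the floorMod variant yields 0..3 (four buckets).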

Is it intentional to produce (N*2)-1 buckets? If not, I’ll submit a small
patch.

My only worry is the effect of changing it on use cases that already
employ AssignToShard. Until recently (28 days ago[2]) the class was
private, and Reshuffle is marked as deprecated, so I imagine the impact
would be low. Thoughts?

Thanks,
Evan

[1]
https://github.com/apache/beam/blob/abbe14f721327d51cce02876324e7feba98581e2/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Reshuffle.java#L160
[2]
https://github.com/apache/beam/commit/abbe14f721327d51cce02876324e7feba98581e2
