arunpandianp opened a new pull request, #33318:
URL: https://github.com/apache/beam/pull/33318

   This is a POC showing how state multiplexing can work for GroupByKey. 
   
   - Messages with small keys (<4K) are hashed and shuffled to a fixed set of 
virtual sharding keys, and are then reduced on those virtual keys.
   - The actual keys are sent to the virtual keys as part of the windows.
   - Existing combining and aggregation logic works at the window level, so it 
all works out of the box.
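
   The routing described in the bullets above can be sketched roughly as 
follows. This is an illustrative Python model, not the code in this PR: the 
names `route`, `group_by_key`, `NUM_VIRTUAL_KEYS`, the `virtual-N` key format, 
the choice of SHA-256, and the reading of "4K" as 4096 bytes are all 
assumptions made for the example.

```python
import hashlib
from collections import defaultdict

SMALL_KEY_LIMIT_BYTES = 4 * 1024  # assumed meaning of the "<4K" threshold
NUM_VIRTUAL_KEYS = 16             # hypothetical size of the fixed virtual key set


def route(key: bytes, window: str):
    """Return (shuffle_key, window_tag) for one element.

    Small keys are hashed onto a virtual sharding key and the real key is
    carried inside the window tag. Large keys are shuffled unchanged.
    """
    if len(key) >= SMALL_KEY_LIMIT_BYTES:
        return key, (window, None)
    digest = hashlib.sha256(key).digest()
    shard = int.from_bytes(digest[:8], "big") % NUM_VIRTUAL_KEYS
    return b"virtual-%d" % shard, (window, key)


def group_by_key(elements):
    """Group (key, window, value) triples the way a multiplexed GBK might.

    State is held per (shuffle_key, window_tag), so many small keys share one
    virtual key while their values stay separated by the window tag.
    """
    state = defaultdict(list)
    for key, window, value in elements:
        shuffle_key, window_tag = route(key, window)
        state[(shuffle_key, window_tag)].append(value)

    # Demultiplex: recover the real key from the window tag on output.
    result = {}
    for (shuffle_key, (window, real_key)), values in state.items():
        out_key = shuffle_key if real_key is None else real_key
        result[(out_key, window)] = values
    return result
```

   Because the real key is part of the window tag, any per-window reduction 
sees exactly the values of one original key, which is why window-level 
combining needs no changes in this model.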
   
   The 4K threshold is currently an arbitrary small value and can be tuned. 
The constraint is that no state tag may exceed 64K, and keys are now part of 
the windowed state tags.
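
   To make the size constraint concrete, here is a minimal Python sketch of 
the check it implies. The length-prefixed tag layout, the function names, and 
the reading of "4K" and "64K" as bytes are assumptions for illustration only; 
the real tag encoding is whatever the runner uses.

```python
MAX_STATE_TAG_BYTES = 64 * 1024   # assumed meaning of the 64K state tag limit
SMALL_KEY_LIMIT_BYTES = 4 * 1024  # assumed meaning of the 4K key threshold


def windowed_state_tag(window_id: bytes, key: bytes, state_name: bytes) -> bytes:
    """Build a hypothetical state tag that embeds the key next to the window.

    Each part gets a 4-byte big-endian length prefix so the parts can be
    split apart again unambiguously.
    """
    parts = (window_id, key, state_name)
    return b"".join(len(p).to_bytes(4, "big") + p for p in parts)


def can_multiplex(window_id: bytes, key: bytes, state_name: bytes) -> bool:
    """Return True if the key is small and the resulting tag fits the limit."""
    if len(key) >= SMALL_KEY_LIMIT_BYTES:
        return False
    tag = windowed_state_tag(window_id, key, state_name)
    return len(tag) <= MAX_STATE_TAG_BYTES
```

   With these assumed numbers, a key just under 4K leaves roughly 60K of the 
tag budget for the window and the state name, which is the headroom the 
threshold is trading against.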
   
   I still need to clean up comments and add tests; I am sending this now to 
share the idea and get initial feedback.
   
   
   R: @scwhittle 



To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to