Robert Bradshaw created BEAM-4565:
-------------------------------------

             Summary: Hot key fanout should not distribute keys to all shards.
                 Key: BEAM-4565
                 URL: https://issues.apache.org/jira/browse/BEAM-4565
             Project: Beam
          Issue Type: Task
          Components: sdk-java-core, sdk-py-core
    Affects Versions: 2.4.0, 2.3.0, 2.2.0, 2.1.0, 2.0.0, 2.5.0
            Reporter: Robert Bradshaw
            Assignee: Kenneth Knowles


The goal of hot key fanout is to reduce the number of values sent to a single
post-GBK worker. If combiner lifting happens, each bundle sends a single value
per sub-key, causing an N-fold blowup in shuffle data and leaving N reducers,
each with the same amount of data to consume as the single reducer in the
non-fanout case.
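
To make the arithmetic concrete, here is a minimal sketch in plain Python (not
Beam code) that counts the records each reducer receives after combiner
lifting. The function names, the bundle and value counts, and the round-robin
sub-key assignment are all assumptions made for illustration. The
"per_bundle_shard" policy is only one possible alternative and is not
necessarily the fix intended by this issue.

```python
from collections import Counter


def shuffled_records(num_bundles, values_per_bundle, fanout, assign):
    """Count the records each sub-key's reducer receives after lifting.

    Combiner lifting pre-combines values inside a bundle, so a bundle emits
    exactly one record for every distinct sub-key it touched.
    """
    per_reducer = Counter()
    for bundle in range(num_bundles):
        touched = {assign(bundle, i, fanout) for i in range(values_per_bundle)}
        for sub_key in touched:
            per_reducer[sub_key] += 1
    return per_reducer


def all_shards(bundle, index, fanout):
    # Hypothetical policy: every bundle spreads its values over all sub-keys.
    return index % fanout


def per_bundle_shard(bundle, index, fanout):
    # Hypothetical policy: every value in a bundle goes to the same sub-key.
    return bundle % fanout


# Assumed workload: 100 bundles, 1000 values of one hot key each, fanout N = 8.
no_fanout = shuffled_records(100, 1000, 1, all_shards)
spread = shuffled_records(100, 1000, 8, all_shards)
sticky = shuffled_records(100, 1000, 8, per_bundle_shard)

for name, counts in [("no fanout", no_fanout), ("all shards", spread),
                     ("per-bundle shard", sticky)]:
    print(name, "total:", sum(counts.values()), "max per reducer:",
          max(counts.values()))
```

Under these assumed numbers, spreading over all shards shuffles 800 records
instead of 100, and each of the 8 reducers still consumes 100 records, the same
as the single reducer without fanout. Keeping each bundle on one sub-key
shuffles 100 records in total, with at most 13 per reducer.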



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)