Ian Zhou created BEAM-367:
-----------------------------

             Summary: GetFractionConsumed() inaccurate for non-uniform records
                 Key: BEAM-367
                 URL: https://issues.apache.org/jira/browse/BEAM-367
             Project: Beam
          Issue Type: Improvement
          Components: sdk-java-gcp
            Reporter: Ian Zhou
            Assignee: Daniel Halperin
            Priority: Minor


GetFractionConsumed() provides inaccurate progress updates when records are
clustered within the range. For example, for a range spanning [1, 10], a
cluster of records around 5 (e.g. 5.000001, ..., 5.000009) is reported as ~50%
complete upon reading the first record, and stays at that percentage until the
final record has been read. Instead, the start of the range should be moved to
the first record seen (e.g. new range [5.000001, 10]). The end of the range can
then change over time through dynamic work rebalancing.
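
A minimal sketch of the difference, using the numbers from the example above.
It assumes progress is estimated by linear interpolation of the current
position over the range. The class and method names (FractionConsumedSketch,
naiveFraction, adjustedFraction) are hypothetical, for illustration only; they
are not the Beam API or the actual implementation.

```java
public class FractionConsumedSketch {
    // Current behaviour as described in this issue (assumed to be linear
    // interpolation): position of the current record within [start, end].
    static double naiveFraction(double start, double end, double current) {
        return (current - start) / (end - start);
    }

    // Proposed behaviour: measure progress from the first record actually
    // seen rather than from the nominal start of the range.
    static double adjustedFraction(double firstSeen, double end, double current) {
        if (end <= firstSeen) {
            return 1.0; // degenerate range: nothing left to read
        }
        return (current - firstSeen) / (end - firstSeen);
    }

    public static void main(String[] args) {
        double start = 1.0;
        double end = 10.0;
        double first = 5.000001; // first record in the cluster
        double last = 5.000009;  // last record in the cluster

        // The naive estimate jumps to roughly half on the first record and
        // barely moves for the rest of the cluster.
        System.out.println("naive, first record:    " + naiveFraction(start, end, first));
        System.out.println("naive, last record:     " + naiveFraction(start, end, last));

        // The adjusted estimate starts at zero and grows from there.
        System.out.println("adjusted, first record: " + adjustedFraction(first, end, first));
        System.out.println("adjusted, last record:  " + adjustedFraction(first, end, last));
    }
}
```

With the start moved to the first record seen, the estimate no longer jumps on
the first read. It stays small while the end remains at 10 and would only
approach 1 once the end of the range is pulled in, e.g. by dynamic work
rebalancing.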



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
