Ian Zhou created BEAM-367:
-----------------------------
Summary: GetFractionConsumed() inaccurate for non-uniform records
Key: BEAM-367
URL: https://issues.apache.org/jira/browse/BEAM-367
Project: Beam
Issue Type: Improvement
Components: sdk-java-gcp
Reporter: Ian Zhou
Assignee: Daniel Halperin
Priority: Minor
GetFractionConsumed() reports inaccurate progress when records are clustered
in a small part of the range. For example, for a range spanning [1, 10], a
cluster of records around 5 (e.g. 5.000001, ..., 5.000009) is reported as ~50%
complete upon reading the first record, and stays at that percentage until the
final record has been read. Instead, the start of the range should be moved to
the first record seen (e.g. new range [5.000001, 10]). The end of the range can
then be tightened over time through dynamic work rebalancing.
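
To make the proposal concrete, here is a minimal Java sketch. It is not Beam's
actual range tracker: the class RangeProgressSketch and its methods
(recordPosition, updateEnd, naiveFraction, adjustedFraction) are hypothetical
names, and positions are assumed to be plain doubles interpolated linearly. It
contrasts interpolating over the original range with interpolating from the
first record seen.

```java
/** Hypothetical illustration only; not part of the Beam SDK. */
public class RangeProgressSketch {
    private final double rangeStart;             // original start of the range
    private double rangeEnd;                     // may shrink via work rebalancing
    private double effectiveStart = Double.NaN;  // position of first record seen
    private double lastPosition = Double.NaN;    // position of latest record read

    public RangeProgressSketch(double rangeStart, double rangeEnd) {
        this.rangeStart = rangeStart;
        this.rangeEnd = rangeEnd;
    }

    /** Call once per record read, with that record's position. */
    public void recordPosition(double position) {
        if (Double.isNaN(effectiveStart)) {
            effectiveStart = position;  // re-anchor the start at the first record
        }
        lastPosition = position;
    }

    /** Stand-in for dynamic work rebalancing moving the end of the range. */
    public void updateEnd(double newEnd) {
        rangeEnd = newEnd;
    }

    /** Behavior described in the report: interpolate over the original range. */
    public double naiveFraction() {
        if (Double.isNaN(lastPosition)) {
            return 0.0;
        }
        return (lastPosition - rangeStart) / (rangeEnd - rangeStart);
    }

    /** Proposed behavior: interpolate over [first record seen, end]. */
    public double adjustedFraction() {
        if (Double.isNaN(lastPosition)) {
            return 0.0;
        }
        if (rangeEnd <= effectiveStart) {
            return 1.0;  // degenerate range: nothing left to read
        }
        return (lastPosition - effectiveStart) / (rangeEnd - effectiveStart);
    }
}
```

With the range [1, 10] and a first record at 5.000001, naiveFraction() jumps
to (5.000001 - 1) / 9, about 0.44, while adjustedFraction() starts at 0. The
adjusted estimate only advances meaningfully once the end of the range is
moved closer to the cluster, which is the role the report assigns to dynamic
work rebalancing.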