Sunil Pedapudi created BEAM-6819:
------------------------------------

             Summary: Remote sources provide insufficient metadata about 
relative sizes of splits
                 Key: BEAM-6819
                 URL: https://issues.apache.org/jira/browse/BEAM-6819
             Project: Beam
          Issue Type: Improvement
          Components: runner-dataflow, sdk-java-core
            Reporter: Sunil Pedapudi


In the current split protocol, SourceMetadata is reported only for the initial 
parent source; subsequent splits drop it. Without this information, downstream 
systems must make simplifying assumptions about split sizes, so the input 
fraction they compute for a task becomes decorrelated from the fraction of 
input that the task actually represents. 

This decorrelation of input fraction has cascading negative effects for any 
system relying on trends in input fraction (e.g., Cloud Dataflow's autotuning).
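
To make the decorrelation concrete, here is a small arithmetic sketch. It does 
not use the Beam SDK: every name in it is hypothetical, and the "treat all 
splits as equal" rule is only one plausible simplifying assumption a downstream 
system might make, not something this issue specifies.

```python
# Hypothetical illustration; none of these names come from the Beam SDK.

def assumed_fractions(num_splits):
    """Fractions a downstream system might assume without per-split metadata.

    Assumption (ours, for illustration): with no size information,
    every split is treated as an equal share of the parent.
    """
    return [1.0 / num_splits] * num_splits


def actual_fractions(split_sizes):
    """Fractions of the parent's input that each split really represents."""
    total = sum(split_sizes)
    return [size / total for size in split_sizes]


# A parent source of 1000 bytes split into three uneven children.
split_sizes = [700, 200, 100]

assumed = assumed_fractions(len(split_sizes))
actual = actual_fractions(split_sizes)

for i, (a, b) in enumerate(zip(assumed, actual)):
    print(f"split {i}: assumed {a:.2f}, actual {b:.2f}")

# Once only the first split has finished, the two estimates diverge.
assumed_progress = assumed[0]
actual_progress = actual[0]
```

In this made-up case the first task covers 70% of the input but would be 
credited with roughly a third of it, so a trend built on the assumed fraction 
would misjudge how much work remains.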



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
