[ 
https://issues.apache.org/jira/browse/BEAM-6819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil Pedapudi updated BEAM-6819:
---------------------------------
    Component/s:     (was: runner-dataflow)

> Remote sources provide insufficient metadata about relative sizes of splits
> ---------------------------------------------------------------------------
>
>                 Key: BEAM-6819
>                 URL: https://issues.apache.org/jira/browse/BEAM-6819
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-java-core
>            Reporter: Sunil Pedapudi
>            Priority: Minor
>
> In the current split protocol, SourceMetadata is reported only for the
> initial parent source; subsequent splits drop it. Without this information,
> downstream systems fall back on simplifying assumptions, so the input
> fraction attributed to a task no longer tracks the fraction of input that
> the task actually represents.
> This decorrelation has cascading negative effects on any system that relies
> on trends in input fraction (e.g., Cloud Dataflow's autotuning).
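
The arithmetic behind the mismatch can be illustrated with a minimal sketch.
Everything in it is hypothetical: the split sizes are made up, and the
"treat all splits as equal" fallback is only one example of a simplifying
assumption a downstream system might make. It is not taken from Beam or
Cloud Dataflow.

```python
# Hypothetical illustration only: the sizes and the equal-size fallback are
# assumptions for this sketch, not behavior taken from Beam or Cloud Dataflow.

def actual_fractions(split_sizes):
    """Fraction of the total input that each split really holds."""
    total = sum(split_sizes)
    return [size / total for size in split_sizes]


def assumed_fractions(num_splits):
    """Fallback when splits carry no size metadata: treat them all as equal."""
    return [1.0 / num_splits] * num_splits


# A parent source that splits unevenly into three children (sizes in bytes).
split_sizes = [700, 200, 100]

actual = actual_fractions(split_sizes)
assumed = assumed_fractions(len(split_sizes))

for i, (a, b) in enumerate(zip(actual, assumed)):
    print(f"split {i}: actual={a:.2f} assumed={b:.2f} error={b - a:+.2f}")
```

If each split carried its own size estimate, the downstream system could
compute the first list directly instead of guessing the second.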



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
