[ 
https://issues.apache.org/jira/browse/BEAM-10395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17178549#comment-17178549
 ] 

Beam JIRA Bot commented on BEAM-10395:
--------------------------------------

This issue was marked "stale-assigned" and has not received a public comment in 
7 days. It is now automatically unassigned. If you are still working on it, you 
can assign it to yourself again. Please also give an update about the status of 
the work.

> Dataflow runner should deduplicate files to stage by destination 
> -----------------------------------------------------------------
>
>                 Key: BEAM-10395
>                 URL: https://issues.apache.org/jira/browse/BEAM-10395
>             Project: Beam
>          Issue Type: Improvement
>          Components: runner-dataflow
>            Reporter: Steve Niemitz
>            Priority: P2
>          Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> If a pipeline contains multiple files with the same destination path, the 
> Dataflow runner will try to stage them all in parallel, and the upload 
> usually fails because the uploads conflict.
> The runner should upload only one file per destination, and ideally also 
> check that the sources are identical.
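
A minimal sketch of the deduplication proposed in the description, not the Dataflow runner's actual code. The function name `dedupe_by_destination` and the in-memory `(destination, content)` pairs are hypothetical, chosen only for illustration; a real runner would work with file paths and would probably hash file contents in a streaming fashion.

```python
import hashlib
from typing import Dict, Iterable, List, Tuple


def dedupe_by_destination(
    files: Iterable[Tuple[str, bytes]],
) -> List[Tuple[str, bytes]]:
    """Keep one (destination, content) pair per destination path.

    Hypothetical helper: raises ValueError when two entries share a
    destination but their contents differ.
    """
    seen: Dict[str, str] = {}  # destination -> SHA-256 hex digest of content
    unique: List[Tuple[str, bytes]] = []
    for destination, content in files:
        digest = hashlib.sha256(content).hexdigest()
        if destination not in seen:
            # First file for this destination: schedule it for upload.
            seen[destination] = digest
            unique.append((destination, content))
        elif seen[destination] != digest:
            # Same destination, different source: refuse rather than race.
            raise ValueError(
                f"conflicting sources for destination {destination!r}"
            )
        # Same destination and same content: skip the duplicate silently.
    return unique
```

Only the entries returned by the helper would then be uploaded in parallel, so no two uploads target the same destination.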



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
