[ 
https://issues.apache.org/jira/browse/BEAM-13010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17491626#comment-17491626
 ] 

Beam JIRA Bot commented on BEAM-13010:
--------------------------------------

This issue is assigned but has not received an update in 30 days, so it has been 
labeled "stale-assigned". If you are still working on the issue, please give an 
update and remove the label. If you are no longer working on the issue, please 
unassign so someone else may work on it. In 7 days the issue will be 
automatically unassigned.

> Delete orphaned files
> ---------------------
>
>                 Key: BEAM-13010
>                 URL: https://issues.apache.org/jira/browse/BEAM-13010
>             Project: Beam
>          Issue Type: Bug
>          Components: io-py-files
>    Affects Versions: 2.34.0
>            Reporter: David
>            Assignee: Pablo Estrada
>            Priority: P1
>              Labels: stale-assigned
>
> Up to version 2.33.0 of Apache Beam (tested with a Python streaming pipeline 
> consuming events from PubSub and writing them to GCS), some files were 
> deleted from the temporary folder before being moved to the destination. 
> This was the original issue: 
> https://issues.apache.org/jira/browse/BEAM-12950
> In version 2.34.0 we applied a temporary workaround to be sure that no data 
> is dropped. Instead of deleting the orphaned files, we just log them:
> [https://github.com/apache/beam/pull/15576]
> The most likely root cause of the missing events is that we were removing 
> files at the wrong time. We need to delete orphaned files in a subsequent 
> step (after we're sure that there won't be any retries).
> Once the original issue is fixed and the orphaned files are deleted at the 
> correct time, we should remove the skip decorator from the unit test that 
> was skipped in the pull request above.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
