[ 
https://issues.apache.org/jira/browse/BEAM-13010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Beam JIRA Bot updated BEAM-13010:
---------------------------------
    Labels:   (was: stale-assigned)

> Delete orphaned files
> ---------------------
>
>                 Key: BEAM-13010
>                 URL: https://issues.apache.org/jira/browse/BEAM-13010
>             Project: Beam
>          Issue Type: Bug
>          Components: io-py-files
>    Affects Versions: 2.34.0
>            Reporter: David
>            Priority: P1
>
> Up to version 2.33.0 of Apache Beam (tested with a Python streaming pipeline
> consuming events from PubSub and writing them into GCS), some files were
> being deleted from the temporary folder before being moved to the
> destination. This was the original issue:
> https://issues.apache.org/jira/browse/BEAM-12950
> In version 2.34.0 we applied a temporary workaround to make sure that no data
> is dropped. Instead of deleting the orphaned files, we just log them:
> [https://github.com/apache/beam/pull/15576]
> The most likely root cause of the missing events was that we were removing
> files at the wrong time. We need to delete orphaned files in a subsequent
> step (once we're sure that there won't be any retries).
> Once the original issue is fixed and the orphaned files are deleted at the
> correct time, we should remove the skip decorator from the unit test that
> was disabled in the Pull Request above.
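
The deferred cleanup described above could look roughly like the sketch below. It is not Beam's implementation: the function names, the `retries_possible` flag, and the use of local directories in place of GCS are all assumptions made for illustration. It shows orphans being reported (as in the 2.34.0 workaround) while a retry might still need them, and deleted only in a later step.

```python
import os
import tempfile


def find_orphans(temp_dir, committed_names):
    """Return sorted paths in temp_dir whose names are not in committed_names."""
    return sorted(
        os.path.join(temp_dir, name)
        for name in os.listdir(temp_dir)
        if name not in committed_names
    )


def finalize(temp_dir, dest_dir, committed_names, retries_possible):
    """Move committed files to dest_dir, then handle whatever is left over.

    Hypothetical helper: while retries are still possible, orphans are only
    reported (mirroring the logging workaround); they are deleted once the
    caller states that no retry can claim them anymore.
    """
    for name in committed_names:
        src = os.path.join(temp_dir, name)
        if os.path.exists(src):
            os.replace(src, os.path.join(dest_dir, name))

    orphans = find_orphans(temp_dir, committed_names)
    if retries_possible:
        # Too early to delete: a retried bundle might still need these files.
        return {"deleted": [], "kept": orphans}

    for path in orphans:
        os.remove(path)
    return {"deleted": orphans, "kept": []}
```

Calling `finalize(..., retries_possible=True)` first and `finalize(..., retries_possible=False)` in a later step keeps deletion separate from the move, which is the ordering the issue asks for.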



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
