[
https://issues.apache.org/jira/browse/BEAM-13010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Beam JIRA Bot updated BEAM-13010:
---------------------------------
Labels: stale-assigned (was: )
> Delete orphaned files
> ---------------------
>
> Key: BEAM-13010
> URL: https://issues.apache.org/jira/browse/BEAM-13010
> Project: Beam
> Issue Type: Bug
> Components: io-py-files
> Affects Versions: 2.34.0
> Reporter: David
> Assignee: Pablo Estrada
> Priority: P1
> Labels: stale-assigned
>
> Until version 2.33.0 of Apache Beam (tested with a Python streaming pipeline
> consuming events from PubSub and writing them into GCS), some files were
> being deleted from the temporary folder before being moved to the
> destination. This was the original issue:
> https://issues.apache.org/jira/browse/BEAM-12950
> In version 2.34.0 we applied a temporary workaround to be sure that no data
> is dropped. Instead of deleting the orphaned files, we just log them:
> [https://github.com/apache/beam/pull/15576]
> Most likely, the missing events were caused by removing files at the wrong
> time. We need to delete orphaned files in a subsequent step (after we're
> sure that there won't be retries).
> Once the original issue is fixed and the orphaned files are deleted at the
> correct time, we should remove the skip decorator from the unit test that
> was skipped in the pull request above.
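
The two-step approach described above (log orphaned files first, delete them only in a later step) could look roughly like the sketch below. This is not Beam's fileio code: it uses only the Python standard library on a local filesystem, and the names finalize_write and delete_orphans are hypothetical, chosen for illustration. It also assumes the caller knows when no more retries can happen.

```python
import logging
import os
import shutil

_LOG = logging.getLogger(__name__)


def finalize_write(temp_dir, dest_dir, committed_names):
    """Move committed temp files to dest_dir.

    Files in temp_dir that are not in committed_names are treated as
    orphaned. They are only logged and returned here, never deleted,
    because a retry of an earlier step might still need them.
    (Hypothetical helper, not part of Apache Beam.)
    """
    os.makedirs(dest_dir, exist_ok=True)
    committed = set(committed_names)
    orphans = []
    for name in sorted(os.listdir(temp_dir)):
        src = os.path.join(temp_dir, name)
        if name in committed:
            shutil.move(src, os.path.join(dest_dir, name))
        else:
            _LOG.info("Orphaned temp file (not deleted yet): %s", src)
            orphans.append(src)
    return orphans


def delete_orphans(orphans):
    """Subsequent step: delete orphaned files.

    Assumption: the caller invokes this only once it is certain that
    no retries of the finalize step can occur.
    (Hypothetical helper, not part of Apache Beam.)
    """
    deleted = []
    for path in orphans:
        if os.path.exists(path):
            os.remove(path)
            deleted.append(path)
    return deleted
```

Keeping the deletion in its own function means the finalize step never removes a file and so stays safe to retry; only the later cleanup step removes anything.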
--
This message was sent by Atlassian Jira
(v8.20.1#820001)