Eugene Kirpichov created BEAM-3145:
--------------------------------------

             Summary: Improve cleanup of zombie temporary files in WriteFiles
                 Key: BEAM-3145
                 URL: https://issues.apache.org/jira/browse/BEAM-3145
             Project: Beam
          Issue Type: Bug
          Components: sdk-java-core
            Reporter: Eugene Kirpichov
            Assignee: Reuven Lax


See the user issue at 
https://stackoverflow.com/questions/47113773/dataflow-2-1-0-streaming-application-is-not-cleaning-temp-folders

For windowed writes, the proper solution is probably to put temp files into 
finer-grained directories, e.g. sharded by date or hour, and then clean each 
one up by globbing and deleting the entire directory once the watermark has 
passed that date or hour. Late data and multiple trigger firings complicate 
this, of course.
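
A rough sketch of the bucketing idea, to make the proposal concrete. It uses 
only java.nio and java.time on a local filesystem and is not WriteFiles or 
Beam FileSystems code; the class HourlyTempDirs, its methods, the 
"yyyy-MM-dd-HH" directory naming, and the allowedLateness parameter are all 
hypothetical. It assumes a bucket may be deleted only once the watermark minus 
the allowed lateness has moved past the end of that bucket's hour, and it does 
not address repeated trigger firings.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: not part of the Beam SDK.
final class HourlyTempDirs {
  // Zero-padded UTC hour names sort lexicographically in chronological order.
  private static final DateTimeFormatter HOUR =
      DateTimeFormatter.ofPattern("yyyy-MM-dd-HH").withZone(ZoneOffset.UTC);
  private static final Pattern BUCKET_NAME =
      Pattern.compile("\\d{4}-\\d{2}-\\d{2}-\\d{2}");

  // Temp directory for a window, keyed by the hour of its max timestamp.
  static Path tempDirFor(Path tempRoot, Instant windowMaxTimestamp) {
    return tempRoot.resolve(HOUR.format(windowMaxTimestamp));
  }

  // Deletes every hourly bucket that lies entirely before
  // (watermark - allowedLateness) and returns the deleted directories.
  static List<Path> deleteExpired(
      Path tempRoot, Instant watermark, Duration allowedLateness) throws IOException {
    // The bucket containing the cutoff may still receive writes, so only
    // buckets with a strictly smaller name are treated as expired.
    String cutoffBucket = HOUR.format(watermark.minus(allowedLateness));
    List<Path> expired;
    try (Stream<Path> children = Files.list(tempRoot)) {
      expired = children
          .filter(Files::isDirectory)
          .filter(p -> BUCKET_NAME.matcher(p.getFileName().toString()).matches())
          .filter(p -> p.getFileName().toString().compareTo(cutoffBucket) < 0)
          .sorted()
          .collect(Collectors.toList());
    }
    for (Path bucket : expired) {
      deleteRecursively(bucket);
    }
    return expired;
  }

  // Deletes children before parents by walking the tree in reverse order.
  private static void deleteRecursively(Path dir) throws IOException {
    List<Path> paths;
    try (Stream<Path> walk = Files.walk(dir)) {
      paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
    }
    for (Path p : paths) {
      Files.delete(p);
    }
  }
}
```

Comparing directory names instead of parsing them back into timestamps keeps 
the cleanup step to a single directory listing; directories that do not match 
the bucket naming are left alone.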



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)