Daniel Halperin created BEAM-1507:
-------------------------------------

             Summary: java DataflowRunner should warn if the stagingLocation 
has a TTL
                 Key: BEAM-1507
                 URL: https://issues.apache.org/jira/browse/BEAM-1507
             Project: Beam
          Issue Type: Improvement
          Components: sdk-py
            Reporter: Daniel Halperin
            Assignee: Ahmet Altay
            Priority: Minor


We have seen a few customers run into a hard-to-track-down bug where the 
staging bucket has a TTL and staged files get TTL-deleted while they are still 
needed.

This can happen in at least two ways:

1. Long-lived batch jobs and streaming jobs can reference staged files 
arbitrarily long after submission and will fail in bad ways if the files have 
been deleted.
2. Some customers even hit issues where the "check file already exists" step 
succeeds when starting a job, but the file is then TTL-deleted before the job 
actually starts. (This sounds crazy, but it can happen if, for example, the TTL 
is 7 days and jobs run every 7 days. It is a race condition.)
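
The race in item 2 can be illustrated with a little date arithmetic. This is 
only a sketch under a simplifying assumption: a staged file disappears exactly 
one TTL after it was staged (real lifecycle deletion is asynchronous). The class 
and method names are hypothetical and are not part of Beam.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical model, not Beam code: a staged file is assumed to vanish
// exactly `ttl` after it was staged.
final class StagingTtlRace {
    // True while the file is still present under the simplified model.
    static boolean existsAt(Instant stagedAt, Duration ttl, Instant when) {
        return when.isBefore(stagedAt.plus(ttl));
    }

    // True when the "already exists" check passes at submission time
    // but the file is gone by the time the job actually uses it.
    static boolean deletedBetweenCheckAndUse(
            Instant stagedAt, Duration ttl, Instant checkedAt, Instant usedAt) {
        return existsAt(stagedAt, ttl, checkedAt)
                && !existsAt(stagedAt, ttl, usedAt);
    }

    public static void main(String[] args) {
        // File staged by the previous weekly run; TTL is 7 days.
        Instant stagedAt = Instant.parse("2017-02-01T00:00:00Z");
        Duration ttl = Duration.ofDays(7);
        // The next weekly run checks just before expiry, then starts just after.
        Instant checkedAt = Instant.parse("2017-02-07T23:59:00Z");
        Instant usedAt = Instant.parse("2017-02-08T00:05:00Z");
        System.out.println(
                deletedBetweenCheckAndUse(stagedAt, ttl, checkedAt, usedAt)); // prints "true"
    }
}
```

With a weekly schedule and a 7-day TTL, the check and the actual use can land 
on opposite sides of the expiry instant, which is the window described above.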

I'm hoping it's not hard to check whether staged files would have a TTL and to 
warn if so.
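
A minimal sketch of what such a check might look like, assuming the staging 
location is a Cloud Storage bucket whose lifecycle rules have already been 
fetched by the caller. The LifecycleRule class below is a hypothetical, 
simplified stand-in for a rule with a "Delete" action and an "age" condition in 
days. It is not a Beam or Cloud Storage client type, and the warning text is 
made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical helper, not part of Beam or the DataflowRunner.
final class StagingTtlWarning {
    // Simplified view of one bucket lifecycle rule (an assumption of this sketch).
    static final class LifecycleRule {
        final String actionType; // for example "Delete"
        final Integer ageDays;   // null when the rule has no age condition

        LifecycleRule(String actionType, Integer ageDays) {
            this.actionType = actionType;
            this.ageDays = ageDays;
        }
    }

    // Returns a warning for the first age-based delete rule, or empty if none exists.
    static Optional<String> warningFor(String stagingLocation, List<LifecycleRule> rules) {
        for (LifecycleRule rule : rules) {
            if ("Delete".equalsIgnoreCase(rule.actionType) && rule.ageDays != null) {
                return Optional.of(String.format(
                        "stagingLocation %s deletes objects after %d days; "
                                + "staged files may disappear while jobs still need them.",
                        stagingLocation, rule.ageDays));
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<LifecycleRule> rules = Arrays.asList(new LifecycleRule("Delete", 7));
        // Print the warning only when a TTL-style rule is present.
        warningFor("gs://example-bucket/staging", rules).ifPresent(System.err::println);
    }
}
```

Returning an Optional keeps the decision to log separate from the detection 
itself, so the runner could emit the message through whatever logger it already 
uses at job submission.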



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
