Is this expected behavior, or improper error recovery? *Task attempt_201306130117_0001_m_000009_0 failed to report status for 602 seconds. Killing!*
This was then followed by the retries of the task failing because the S3 output file the dead task had started writing already exists:

*org.apache.pig.backend.executionengine.ExecException: ERROR 2081: Unable to setup the store function.*
*...*
*Caused by: java.io.IOException: File already exists:s3n://n2ygk/reduced.1/useful/part-m-00009*

This seems like exactly the kind of task restart that should "just work" if the garbage from the failed task were properly cleaned up.

Is there a way to tell Pig to just clobber existing output files? And is there a technique for checkpointing Pig scripts so that I don't have to keep resubmitting this job and losing hours of work? I was even doing a STORE of intermediate aliases so that I could restart later, but the job failure causes those intermediate files to be deleted from S3. Roughly what I have in mind is sketched below.
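(The bucket name, paths, and aliases in this sketch are made up purely for illustration, and I am only assuming that grunt's rmf, which as I understand it removes a path without failing when it doesn't exist, is the right way to "clobber" old output before a STORE.)

-- hypothetical paths/bucket, just to show the pattern I mean
-- rmf = forced remove, so a re-run doesn't die on "File already exists"
rmf s3n://mybucket/checkpoint/useful

raw    = LOAD 's3n://mybucket/input/' USING PigStorage('\t');
useful = FILTER raw BY $0 IS NOT NULL;

-- checkpoint the intermediate alias so a later run could LOAD it back
-- instead of recomputing it (assuming the files survive the failed job)
STORE useful INTO 's3n://mybucket/checkpoint/useful';

-- on a restart I would comment out the lines above and start from:
-- useful = LOAD 's3n://mybucket/checkpoint/useful';

Is that pattern reasonable, or is there a better-supported way to do this?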
Thanks.

/a