Is this expected behavior or improper error recovery:

    Task attempt_201306130117_0001_m_000009_0 failed to report status for
    602 seconds. Killing!

The retries of the task then failed because the S3 output file that the
dead task had started writing already existed:

    org.apache.pig.backend.executionengine.ExecException: ERROR 2081:
    Unable to setup the store function.
    ...
    Caused by: java.io.IOException: File already exists:
    s3n://n2ygk/reduced.1/useful/part-m-00009

It seems like this is exactly the kind of task restart that should "just
work" if the leftover output from the failed task were properly cleaned up.

Is there a way to tell Pig to just clobber output files?
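Or is deleting the output path at the top of the script the intended
approach, so a retried run doesn't trip over leftovers? Something like the
sketch below, using the output path from the error above and assuming the
alias is called "useful" (rmf, unlike rm, shouldn't complain if the path is
missing) -- I haven't verified this is the right way:

    -- clear any partial output left behind by a failed/killed attempt
    rmf s3n://n2ygk/reduced.1/useful

    -- ... LOADs/FILTERs/JOINs that build the 'useful' alias ...

    STORE useful INTO 's3n://n2ygk/reduced.1/useful';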

Is there a technique for checkpointing Pig scripts so that I don't have to
keep resubmitting this job and losing hours of work? I was even doing a
"STORE" of intermediate aliases so that I could restart later, but the job
failure causes those intermediate files to be deleted from S3.

Thanks.
/a
