Hi Felix,

Two options I can think of:

1) Set a longer timeout with   -Dmapred.task.timeout=_____  (in milliseconds).
or
2) Have a separate thread that reports status back to the TaskTracker by 
writing to stderr
     https://issues.apache.org/jira/browse/HADOOP-1328
     Format:   "reporter:status:____"
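
For option 2, a minimal sketch in Python might look like the following. The 
names report_status and Heartbeat, the status text, and the 60-second interval 
are made up for illustration; the only part taken from the JIRA above is the 
"reporter:status:" line format written to stderr. It assumes a Python version 
where threading.Event.wait() returns the flag (2.7 or later).

```python
import sys
import threading
import time


def report_status(message, stream=None):
    """Write one status line in the "reporter:status:<message>" format."""
    # Resolve stderr at call time so a redirected sys.stderr is honored.
    stream = stream if stream is not None else sys.stderr
    stream.write("reporter:status:%s\n" % message)
    stream.flush()  # flush so the line is not held back in a buffer


class Heartbeat(object):
    """Hypothetical helper: emits a status line every interval_seconds
    from a background thread while the main thread does long-running I/O."""

    def __init__(self, message="still working", interval_seconds=60.0,
                 stream=None):
        self._message = message
        self._interval = interval_seconds
        self._stream = stream
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run)
        self._thread.daemon = True  # never keep the task alive on its own

    def _run(self):
        # wait() returns True once stop() has set the event, ending the loop;
        # otherwise it times out after the interval and we report again.
        while not self._stop.wait(self._interval):
            report_status(self._message, self._stream)

    def __enter__(self):
        report_status(self._message, self._stream)  # report once right away
        self._thread.start()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self._stop.set()
        self._thread.join()
        return False  # do not swallow exceptions from the wrapped work


if __name__ == "__main__":
    with Heartbeat("compressing output", interval_seconds=60.0):
        time.sleep(0.1)  # stand-in for the long gzip step
```

Pick an interval comfortably shorter than mapred.task.timeout, and wrap only 
the slow step so the heartbeat stops if the script itself hangs elsewhere.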

Hope it works.

Koji


On 1/28/11 3:51 PM, "felix gao" <[email protected]> wrote:

mighty user group,

I am trying to write a streaming job that does a lot of I/O in a Python program. 
 I know that if I don't report back every x minutes the job will be terminated.  
How do I report back to the task tracker from my streaming Python job while it 
is in the middle of a gzip, for example?

Thanks,

Felix
