Stefan Groschupf wrote:
Am I missing the section in the jobtracker where this is done, or would people be interested in my submitting a patch that implements this mechanism?
This is mostly already implemented. The tasktracker fails tasks that do not update their status within a configurable timeout. Task status is updated each time a task reads an input, writes an output or calls the Reporter.setStatus() method. The jobtracker will retry failed tasks up to four times.
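To make the mechanism concrete, here is a minimal sketch of a status-timeout watchdog with bounded attempts. It is not the actual tasktracker or jobtracker code: the class TaskWatchdog, its method names, and the reading of "up to four times" as four attempts in total are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a status-timeout watchdog; not the real tasktracker code. */
class TaskWatchdog {
    // Assumption: "retry up to four times" is read here as four attempts in total.
    static final int MAX_ATTEMPTS = 4;

    private final long timeoutMillis;
    private final Map<String, Long> lastUpdate = new HashMap<>();
    private final Map<String, Integer> attempts = new HashMap<>();

    TaskWatchdog(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Starts a new attempt at the task; returns false once all attempts are used up. */
    boolean startAttempt(String taskId, long now) {
        int used = attempts.getOrDefault(taskId, 0);
        if (used >= MAX_ATTEMPTS) {
            return false;
        }
        attempts.put(taskId, used + 1);
        lastUpdate.put(taskId, now);
        return true;
    }

    /** Called whenever the task reads an input, writes an output, or sets its status. */
    void reportProgress(String taskId, long now) {
        if (lastUpdate.containsKey(taskId)) {
            lastUpdate.put(taskId, now);
        }
    }

    /** Returns true, and drops the running attempt, if it has been silent for too long. */
    boolean checkTimedOut(String taskId, long now) {
        Long last = lastUpdate.get(taskId);
        if (last == null) {
            return false;
        }
        if (now - last > timeoutMillis) {
            lastUpdate.remove(taskId);
            return true;
        }
        return false;
    }
}
```

In this sketch the caller would poll checkTimedOut() periodically and call startAttempt() again for any task that timed out, giving up when it returns false.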
The mapred-based fetcher should not hang either: it exits even when some of its threads have hung. The task timeout should therefore be set to the maximum amount of time that fetching and parsing any single page should require. By default it is 10 minutes.
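One way a multi-threaded job can exit despite hung threads is to run the workers as daemon threads and wait for them only up to a deadline. The sketch below shows that general technique; it is not taken from the fetcher source, and the class BoundedFetchRun and its runAll() method are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch: wait for worker threads only up to a deadline, then return. */
class BoundedFetchRun {

    /** Runs each job on its own daemon thread and returns how many finished in time. */
    static int runAll(Runnable[] jobs, long maxWaitMillis) {
        AtomicInteger done = new AtomicInteger();
        Thread[] threads = new Thread[jobs.length];
        for (int i = 0; i < jobs.length; i++) {
            Runnable job = jobs[i];
            threads[i] = new Thread(() -> {
                job.run();
                done.incrementAndGet();
            });
            // A hung daemon thread cannot keep the JVM alive after main returns.
            threads[i].setDaemon(true);
            threads[i].start();
        }
        long deadline = System.currentTimeMillis() + maxWaitMillis;
        for (Thread t : threads) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                break; // out of time: stop waiting; join(0) would wait forever
            }
            try {
                t.join(remaining);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return done.get();
    }
}
```

With this shape, a single stuck page costs at most maxWaitMillis before the run returns, which is why the task timeout only needs to cover the slowest legitimate page.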
Doug
