I've been trying to get a RowSimilarityJob run to complete. It keeps
timing out on a RowSimilarityJob-CooccurrencesMapper-Reducer task, so I've
now raised the task timeout to 30 minutes. There are no errors in the logs
that I can see and no other task I've tried is acting like this. Is this
expected? Shouldn't the task check in more often?
It's doing 34,000 docs with 40 sim docs each on 8 cores, so it is a bit
slow anyway. Still, I shouldn't have to turn the timeout up so high,
should I?
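
To make the question concrete, here is the kind of check-in I would expect from a long-running map task. This is only a minimal sketch and not Mahout's actual code: ProgressSketch, ProgressReporter and processRecord are hypothetical names, with ProgressReporter standing in for Hadoop's Mapper.Context and its progress() call. As far as I understand, the 1800 seconds in the log below is the mapred.task.timeout setting (given in milliseconds), and calling progress() resets that timer.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ProgressSketch {

    /** Hypothetical stand-in for Hadoop's Mapper.Context; only the check-in call is modeled. */
    public interface ProgressReporter {
        void progress();
    }

    /**
     * Simulates one map() call that does a lot of work for a single input record.
     * It checks in every reportEvery inner iterations, so a framework watching for
     * status updates would hear from the task even before any output is written.
     */
    public static long processRecord(int innerIterations, int reportEvery,
                                     ProgressReporter reporter) {
        long checksum = 0;
        for (int i = 0; i < innerIterations; i++) {
            checksum += (long) i * i; // placeholder for the real per-pair work
            if ((i + 1) % reportEvery == 0) {
                reporter.progress(); // periodic check-in
            }
        }
        return checksum;
    }

    public static void main(String[] args) {
        AtomicInteger checkIns = new AtomicInteger();
        long result = processRecord(1_000_000, 100_000, checkIns::incrementAndGet);
        System.out.println("check-ins: " + checkIns.get() + ", checksum: " + result);
    }
}
```

If the mapper spends a long time inside a single record without anything like this, that would explain why it looks dead to the framework while it is still working.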
12/07/11 07:54:16 INFO mapred.JobClient: Task Id :
attempt_201207101248_0002_m_000000_1, Status : FAILED
Task attempt_201207101248_0002_m_000000_1 failed to report status for
1800 seconds. Killing!
12/07/11 07:58:06 INFO mapred.JobClient: map 1% reduce 0%
12/07/11 08:00:39 INFO mapred.JobClient: map 2% reduce 0%
12/07/11 08:02:21 INFO mapred.JobClient: map 3% reduce 0%
12/07/11 08:04:49 INFO mapred.JobClient: map 4% reduce 0%
12/