I often get this message while running the parse reduce: "Task failed to report status for 604 seconds. Killing." Normally I would expect this when a machine goes down, but the heartbeats are always up to date. The task fails numerous times and the jobtracker lists it as failed, yet when I try to re-parse the segment it throws an error saying the segment is already parsed. Has anyone else had this problem?
On a side note, I had a problem with the parse phase before: it would try to parse extremely long URLs. I fixed that by having the URL filters reject any URL that contains control characters or is longer than a few hundred characters.
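
For anyone wanting to try the same workaround, here is a rough sketch of the check in plain Java. It is not the actual Nutch URL filter code: the class name UrlSanityFilter, the accept method, and the 300-character cutoff are all made up for illustration, and in a real setup the same rule would go into whatever URL filter configuration is already in use.

```java
// Hypothetical helper, not part of Nutch: rejects URLs that are too long
// or that contain control characters.
class UrlSanityFilter {
    // Assumed cutoff; "a few hundred characters" is all that is known.
    static final int MAX_URL_LENGTH = 300;

    // Returns true if the URL looks safe to hand to the parser.
    static boolean accept(String url) {
        if (url == null || url.length() > MAX_URL_LENGTH) {
            return false;
        }
        for (int i = 0; i < url.length(); i++) {
            // isISOControl covers U+0000..U+001F and U+007F..U+009F.
            if (Character.isISOControl(url.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String[] samples = {
            "http://example.com/page.html",
            "http://example.com/bad\tpath"
        };
        for (String s : samples) {
            System.out.println(accept(s) + " <- " + s.replace("\t", "\\t"));
        }
    }
}
```

The length check runs first so that very long URLs are rejected without scanning every character.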
