Hi Ned, I have seen this error before as well. In my case, one of the reduce tasks would consistently get stuck and cause the error you mentioned. You see the message saying the segment is already parsed because the remaining reduce tasks finished successfully and wrote their output to the crawl_parse, parse_data and parse_text dirs in the segment folder. If you want to try reparsing the segment, delete or rename these directories in that segment before retrying the parse.
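As a rough illustration of the rename step, here is a minimal Python sketch. It assumes the segment lives on the local filesystem (for a segment on HDFS you would move the directories with the Hadoop filesystem tools instead), and the function name and the ".bak-" suffix are made up for this example, not anything Nutch provides.

```python
import os
import time

# Parse output directories named above; other segment dirs are left alone.
PARSE_DIRS = ("crawl_parse", "parse_data", "parse_text")


def set_aside_parse_output(segment_dir, suffix=None):
    """Rename existing parse output dirs in a local segment directory.

    Hypothetical helper: returns a list of (old_path, new_path) pairs
    for the directories that were actually renamed.
    """
    if suffix is None:
        # Timestamp suffix so repeated retries do not collide.
        suffix = time.strftime("%Y%m%d%H%M%S")
    renamed = []
    for name in PARSE_DIRS:
        src = os.path.join(segment_dir, name)
        if os.path.isdir(src):
            dst = "%s.bak-%s" % (src, suffix)
            os.rename(src, dst)
            renamed.append((src, dst))
    return renamed
```

Renaming rather than deleting keeps the partial output around in case you want to inspect which reduces finished.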
-vishal.

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Ned Rockson
Sent: Sunday, September 23, 2007 2:47 PM
To: [email protected]
Subject: Parse reduce task fails to respond?

I get this message:

Task failed to report status for 604 seconds. Killing.

often while running the parse reduce. Usually this would be because the machine went down, but the heartbeats are always up to date. Also, it will fail numerous times and the jobtracker will list the task as failed, but if I try to re-parse the segment it throws an error saying it's already parsed. Has anyone else had this problem?

On a side note, I've had a problem with the parse phase before - it would try to parse extremely long URLs, but I fixed that by having the URL filters reject URLs that contain control characters or are longer than a few hundred characters.
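
The URL check described in the side note could look roughly like the Python sketch below. This only illustrates the logic and is not Nutch's actual filter code or configuration; the function name and the 300-character limit are assumptions standing in for "a few hundred characters".

```python
import re

# ASCII control characters (0x00-0x1f) plus DEL (0x7f).
CONTROL_CHARS = re.compile(r"[\x00-\x1f\x7f]")


def url_passes_filter(url, max_length=300):
    """Return True if the URL is short enough and has no control characters.

    Hypothetical helper; max_length is an assumed threshold.
    """
    if len(url) > max_length:
        return False
    if CONTROL_CHARS.search(url):
        return False
    return True
```

URLs rejected here never reach the parser, which is the point of doing the check at the filter stage.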
