Hi Shawn,
I am trying to perform a large fetch (1 million pages), and I am
observing some reduce tasks dying with the following message:
Timed out.
java.io.IOException: Task process exit with nonzero status.
    at org.apache.hadoop.mapred.TaskRunner.runChild(TaskRunner.java:273)
    at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:145)
In the past, the two things that triggered this type of error when we
went to bigger jobs were:
1. Running out of file descriptors.
2. IPC timeouts with big splits.
So try bumping your file descriptor limit (on all servers) to, say,
16K, and increasing the IPC timeout value in your config file.
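
To check the first item on a given server, something like the sketch
below could be run as the same user that starts the Hadoop daemons. It
only reads the per-process open-file limit and compares it with the 16K
figure suggested above; it does not raise the limit. The names
TARGET_FDS and fd_limit_report are made up for this illustration and
are not part of Nutch or Hadoop, and the script assumes a Unix-like
system where Python's standard resource module is available.

```python
import resource

# The "say 16K" figure from the advice above; adjust as needed.
TARGET_FDS = 16 * 1024


def fd_limit_report(target=TARGET_FDS):
    """Compare this process's open-file limits with a target value."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)

    def meets(limit):
        # RLIM_INFINITY means "no limit", which always satisfies the target.
        return limit == resource.RLIM_INFINITY or limit >= target

    return {
        "soft": soft,
        "hard": hard,
        "target": target,
        "soft_ok": meets(soft),
        "hard_ok": meets(hard),
    }


if __name__ == "__main__":
    report = fd_limit_report()
    print("soft limit:", report["soft"])
    print("hard limit:", report["hard"])
    if not report["soft_ok"]:
        print("soft limit is below", report["target"], "and may need raising")
```

The soft limit is the one a running process actually hits. Raising it
past the hard limit normally requires a system-level change, and the
daemons have to be restarted to pick up a new value.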
-- Ken
A little bit about my environment:
- I am running a test cluster of 16 machines, dual 3GHz Xeons with
2GB of RAM each, running JRE 1.5.0_06
- Running Nutch 0.8-dev, built from trunk this afternoon. Hadoop
0.1.0 taken from the nightly build.
All fetch tasks (32 of 32) complete successfully, as do most reduce
tasks. However, one or two reduce tasks will fail with the above
message. Upon failure, they are rescheduled to another tracker as
expected.
The rescheduled reduce task will run up to the same point at which the
previous one died, then sit around for ~10 minutes and die with
the same message. The jobtracker will reschedule the reduce task a
few times before giving up, at which point the entire job is aborted.
I was able to perform a successful fetch of 250,000 pages in my
initial tests. I then tried to scale it up to 1M pages and I'm now
stuck :/
Can anyone provide some clues as to where I might start on debugging
this issue?
Regards,
-Shawn
--
Ken Krugler
Krugle, Inc.
+1 530-210-6378
"Find Code, Find Answers"
_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general