Hi,
I was wondering whether anyone else on the list has been experiencing an
issue similar to the one below. I'm running two independent crawls on a
single Hadoop cluster and regularly get "reduce copier failed" errors.
Most of the time Nutch is able to recover from them, but every now and
then it doesn't.
What would be the best way to go about re-tuning my Nutch/Hadoop
parameters to avoid this type of error altogether? I'm running both
crawls on a fairly small cluster (3 nodes, each with 8GB of RAM), and I
understand that I'm probably coming close to a practical limit on how
big a crawl I can perform in such a setup.

I would be most interested in:
- a rough heuristic for the maximum number of URLs to fetch in a single
  iteration, based on the number of nodes;
- whether there is any benefit to merging segments often as opposed to
  infrequently (e.g. merging 10/15 segments into an existing crawl);
- the maximum number of Hadoop tasks (both map and reduce) that should
  be allocated per node on clusters running Nutch. In my current setup I
  allocate 5 map and 5 reduce tasks to each node, and although I can see
  that, at times, I'm pushing my servers a bit too far, adding more
  servers is not currently an option.
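
To make the question concrete, I believe these are the relevant knobs in
mapred-site.xml (a sketch; the 5/5 slot counts are what I use now, the
heap value is just illustrative, not a recommendation):

<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <!-- map slots per node; one of the numbers I'm asking about -->
  <value>5</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <!-- reduce slots per node -->
  <value>5</value>
</property>
<property>
  <name>mapred.child.java.opts</name>
  <!-- heap per task JVM (illustrative): 10 concurrent tasks times this
       heap must fit in 8GB, with headroom left for the OS and for the
       fork/exec that Hadoop does when it shells out to "df" -->
  <value>-Xmx512m</value>
</property>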
Many thanks in advance; any pointers will be greatly appreciated.
Cheers,
-yp
Fetcher: starting
Fetcher: segment: /user/snoothbot/crawl-domain-bot/segments/20100222161426
java.io.IOException: Task: attempt_201002220008_0036_r_000004_0 - The reduce copier failed
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:375)
	at org.apache.hadoop.mapred.Child.main(Child.java:158)
Caused by: java.io.IOException: Cannot run program "bash": java.io.IOException: error=12, Cannot allocate memory
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:459)
	at org.apache.hadoop.util.Shell.runCommand(Shell.java:149)
	at org.apache.hadoop.util.Shell.run(Shell.java:134)
	at org.apache.hadoop.fs.DF.getAvailable(DF.java:73)
	at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:321)
	at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
	at org.apache.hadoop.mapred.MapOutputFile.getInputFileForWrite(MapOutputFile.java:160)
	at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.doInMemMerge(ReduceTask.java:2479)
	at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.run(ReduceTask.java:2447)
Caused by: java.io.IOException: java.io.IOException: error=12, Cannot allocate memory
	at java.lang.UNIXProcess.<init>(UNIXProcess.java:148)
	at java.lang.ProcessImpl.start(ProcessImpl.java:65)
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:452)
	... 8 more