Hi guys,
First of all, thanks for your help so far!
I have a Nutch server running in a single JVM, starting a new thread for
each crawl.
I ran into a new issue after about a week of continuously repeated crawls
(~10 sites, each crawled roughly once an hour).
My hadoop.log said:
2014-01-04 19:15:42,229 WARN mapred.LocalJobRunner - job_local_45262
java.io.IOException: Cannot run program "chmod": error=24, Too many open files
Later on, I get:
java.io.IOException: Cannot run program "bash": error=24, Too many open files
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1041)
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:149)
    at org.apache.hadoop.util.Shell.run(Shell.java:134)
    at org.apache.hadoop.fs.DF.getAvailable(DF.java:73)
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:329)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
    at org.apache.hadoop.mapred.MapOutputFile.getSpillFileForWrite(MapOutputFile.java:107)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1221)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1129)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:359)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
Eventually, all crawls fail.
I'm wondering whether this is a known file descriptor leak that has already
been fixed in a patch somewhere, or whether anyone has another idea on how
to fix it?
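
In the meantime, I'm thinking of logging the JVM's open file descriptor
count after each crawl to confirm that it really keeps growing. A rough
sketch of what I have in mind (class and method names are just my own;
it assumes a HotSpot-based JVM on Linux/Unix where
com.sun.management.UnixOperatingSystemMXBean is available):

import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

import com.sun.management.UnixOperatingSystemMXBean;

public class FdCountLogger {

    // Logs how many file descriptors this JVM currently holds open,
    // so I can see whether the count grows with every crawl cycle.
    // Assumes a HotSpot JVM on Linux/Unix; on other JVMs the OS bean
    // may not implement UnixOperatingSystemMXBean.
    public static void logOpenFdCount(String label) {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof UnixOperatingSystemMXBean) {
            UnixOperatingSystemMXBean unixOs = (UnixOperatingSystemMXBean) os;
            System.out.println(label + ": open fds = "
                    + unixOs.getOpenFileDescriptorCount()
                    + " (max = " + unixOs.getMaxFileDescriptorCount() + ")");
        } else {
            System.out.println(label + ": fd count not available on this JVM");
        }
    }
}

The idea would be to call it before and after each crawl thread finishes;
if the number keeps climbing even though every crawl has completed, that
would point to a descriptor leak rather than just a too-low ulimit.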
Thanks a lot,
Yann