Dear Jon, dear nutchers, thanks - do you by chance remember more about the background of that problem? I am using nutch-1.7 and am currently seeing the same issue while parsing.
My setup: nutch-1.7 on Hadoop in pseudo-distributed mode, 32 GB RAM total, 768 MB per mapper/reducer task, 8 GB for Hadoop, 2 GB for Nutch, 6 mappers, and at most 100,000 URLs per segment. According to the log, each URL takes 0-1 ms to parse. Then, suddenly, the 1-minute load of the machine climbs to 200 and higher - I have even seen 1/5/15-minute load averages of 200/500/1200 - yet CPU usage stays moderate, I/O wait is low, and the machine is about 50 percent idle.

Currently I am running under the same conditions but with only 10k URLs per segment; after 30 generate-fetch-parse-update cycles so far, no problem.

I am already a veteran of ulimit problems and have set the values very high (ulimit -n 250000, ulimit -u 320000). Now I am out of ideas. Any ideas or suggestions?
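For reference, this is roughly how I double-check the effective limits and the live thread count for the account running the Hadoop daemons while the load is spiking. The user name "hadoop" is just a placeholder for my setup, and note that ulimit only reports the limits of the current shell, not necessarily those the daemons were started with:

$ ulimit -n                           # max open file descriptors
250000
$ ulimit -u                           # max processes/threads per user
320000
$ ps -eLf | grep -c '^hadoop'         # threads currently owned by user "hadoop"
$ cat /proc/sys/kernel/threads-max    # system-wide thread ceiling

As far as I understand, error=11 (EAGAIN) from fork() means the kernel refused to create another process - typically because the per-user process/thread limit or the system-wide thread ceiling is reached, or the memory for the child cannot be allocated - so comparing the live thread count against ulimit -u seems like the relevant check here.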
Cheers,
Martin

-----Original Message-----
From: Jon Uhal <[email protected]>
Reply-to: [email protected]
To: [email protected]
Subject: Cannot run program "/bin/ls": java.io.IOException: error=11, Resource temporarily unavailable
Date: Wed, 20 Nov 2013 16:47:33 -0500

I just wanted to leave this here since it took me way too long to figure out. For some people this might be an obvious problem, but since it wasn't to me, I want to make sure anyone else who hits it can find this answer.

I kept getting the following error when running a crawl. For me it happened consistently, but I couldn't find any similar issues or solutions on the typical sites. The closest thing I could find was this:
http://www.nosql.se/2011/10/hadoop-tasktracker-java-lang-outofmemoryerror/

Below is the error I was seeing. It is just one of several exceptions that occurred during the parse; in the end, the parse step had too many errors and exceeded the Nutch error limit.

13/11/20 20:14:19 INFO parse.ParseSegment: ParseSegment: segment: test/segments/20131120201240
13/11/20 20:14:20 INFO mapred.FileInputFormat: Total input paths to process : 2
13/11/20 20:14:21 INFO mapred.JobClient: Running job: job_201311202006_0017
13/11/20 20:14:22 INFO mapred.JobClient: map 0% reduce 0%
13/11/20 20:14:34 INFO mapred.JobClient: map 40% reduce 0%
13/11/20 20:14:36 INFO mapred.JobClient: map 50% reduce 0%
13/11/20 20:14:36 INFO mapred.JobClient: Task Id : attempt_201311202006_0017_m_000001_0, Status : FAILED
java.lang.RuntimeException: Error while running command to get file permissions : java.io.IOException: Cannot run program "/bin/ls": java.io.IOException: error=11, Resource temporarily unavailable
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:200)
        at org.apache.hadoop.util.Shell.run(Shell.java:182)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:375)
        at org.apache.hadoop.util.Shell.execCommand(Shell.java:461)
        at org.apache.hadoop.util.Shell.execCommand(Shell.java:444)
        at org.apache.hadoop.fs.FileUtil.execCommand(FileUtil.java:712)
        at org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:448)
        at org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.getOwner(RawLocalFileSystem.java:431)
        at org.apache.hadoop.mapred.TaskLog.obtainLogDirOwner(TaskLog.java:267)
        at org.apache.hadoop.mapred.TaskLogsTruncater.truncateLogs(TaskLogsTruncater.java:124)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:260)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
        at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.io.IOException: java.io.IOException: error=11, Resource temporarily unavailable
        at java.lang.UNIXProcess.<init>(UNIXProcess.java:148)
        at java.lang.ProcessImpl.start(ProcessImpl.java:65)
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:453)
        ... 15 more
        at org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:473)
        at org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.getOwner(RawLocalFileSystem.java:431)
        at org.apache.hadoop.mapred.TaskLog.obtainLogDirOwner(TaskLog.java:267)
        at org.apache.hadoop.mapred.TaskLogsTruncater.truncateLogs(TaskLogsTruncater.java:124)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:260)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
        at org.apache.hadoop.mapred.Child.main(Child.java:249)
13/11/20 20:14:44 INFO mapred.JobClient: map 75% reduce 16%
13/11/20 20:14:47 INFO mapred.JobClient: map 100% reduce 16%
13/11/20 20:14:53 INFO mapred.JobClient: map 100% reduce 72%
13/11/20 20:14:54 INFO mapred.JobClient: map 100% reduce 100%
13/11/20 20:14:56 INFO mapred.JobClient: Job complete: job_201311202006_0017

I can't remember exactly what made me think it might be related to the number of available file handles, but that is where I found my solution. Originally my system was set to 1024 open files:

$ ulimit -n
1024

Bumping this up to 8096 has fixed my issue for the moment. The exact procedure differs depending on the system you run on, so I'm not going to try to cover each system's solution, but getting this limit increased was critical to getting my crawl to run through more than one iteration.

Hope this helps anyone with the same problem.
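P.S. One example of making the change permanent, since the ulimit command only affects the current shell: on a PAM-based Linux system, an entry along these lines in /etc/security/limits.conf should work. The exact file and syntax depend on your distribution, and "hadoop" stands in for whichever user runs the crawl:

# /etc/security/limits.conf - raise the open-file limit for the crawl user
# <domain>  <type>  <item>   <value>
hadoop      soft    nofile   8096
hadoop      hard    nofile   8096

Log back in (or restart the Hadoop daemons) and verify with ulimit -n.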

