Thanks for all your help. I applied that patch and also added the property that Brad described. I am no longer receiving an out-of-memory error, but the fetch still fails:
Reading content of SMB directory: 19A475BB-A31E-473A-BD05-62FA081F20F7/
-finishing thread FetcherThread, activeThreads=1
-finishing thread FetcherThread, activeThreads=0
-activeThreads=0, spinWaiting=0, fetchQueues.totalSize=0
-activeThreads=0
Exception in thread "main" java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1252)
        at org.apache.nutch.fetcher.Fetcher.fetch(Fetcher.java:1104)
        at org.apache.nutch.crawl.Crawl.main(Crawl.java:133)

Is there any way to mitigate this? I get through about 900 or so documents before it happens. On Linux, the free command shows that I only have 7 MB of free memory. I have set the Java heap size to 512 MB and the crawl gets a bit further, but it still dies at some point during the fetch. Are there any other places I can restrict memory use? I only have 1 GB to work with on this box.
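In case it helps to see it concretely, the only memory-related knobs I know of so far are the crawler heap (the NUTCH_HEAPSIZE environment variable picked up by bin/nutch, in MB) and, if I have the property names right, a couple of settings in nutch-site.xml. The values below are just what I am experimenting with, not recommendations:

  <!-- nutch-site.xml -->
  <property>
    <name>fetcher.threads.fetch</name>
    <value>2</value>
    <description>Fewer fetcher threads means fewer documents held in memory at once.</description>
  </property>
  <property>
    <name>file.content.limit</name>
    <value>65536</value>
    <description>Truncate large files so a single document cannot exhaust the heap.</description>
  </property>

Is there anything else along these lines worth trying?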