I am running this on a Linux box with about 950MB of RAM - do I need to increase my memory? I tried running with anywhere from 20-30 threads up to about 100 threads, and also specified the -Xmx option to set the max heap size - but after a while, the fetcher always fails with an OutOfMemory exception. Is there a minimum memory requirement for running Nutch?

Thanks
-Jha

On Feb 9, 2008 7:15 PM, DS jha <[EMAIL PROTECTED]> wrote:
> I removed my custom plugin and ran fetch again - but it is still
> failing with the same OutOfMemory exception.
>
> Thanks much,
>
> On Feb 8, 2008 6:12 PM, <[EMAIL PROTECTED]> wrote:
> > I don't know if you are using any custom plugins on the fetching stage.
> > I don't even know if this is possible (I don't need it). But, I have had
> > a similar experience with indexing. After a few thousand pages, Nutch
> > would start complaining about lack of memory. The culprit was my plugin
> > that created a connection to a database in each call.
> >
> > So, if you _are_ using custom plugins, make sure that they don't leak
> > resources and reduce dependency on garbage collection to the minimum.
> >
> > Regards,
> >
> > Arkadi
> >
> > > -----Original Message-----
> > > From: DS jha [mailto:[EMAIL PROTECTED]
> > > Sent: Friday, February 08, 2008 4:17 PM
> > > To: [email protected]
> > > Subject: fetcher failing with outofmemory exception
> > >
> > > Hello -
> > >
> > > I am using latest nutch trunk on a Linux machine (single file system)
> > > - I am trying to fetch about 5-10K pages and every time I run fetch
> > > command, after fetching few hundred pages, it starts throwing
> > > OutofMemory exception (not related to heapsize):
> > >
> > > 2008-02-08 02:41:01,395 FATAL fetcher.Fetcher - java.io.IOException: java.io.IOException: Cannot allocate memory
> > > 2008-02-08 02:41:01,719 FATAL fetcher.Fetcher - at java.lang.UNIXProcess.<init>(UNIXProcess.java:148)
> > > 2008-02-08 02:41:01,719 FATAL fetcher.Fetcher - at java.lang.ProcessImpl.start(ProcessImpl.java:65)
> > > 2008-02-08 02:41:01,719 FATAL fetcher.Fetcher - at java.lang.ProcessBuilder.start(ProcessBuilder.java:451)
> > > 2008-02-08 02:41:01,719 FATAL fetcher.Fetcher - at java.lang.Runtime.exec(Runtime.java:591)
> > > 2008-02-08 02:41:01,719 FATAL fetcher.Fetcher - at java.lang.Runtime.exec(Runtime.java:464)
> > > 2008-02-08 02:41:01,719 FATAL fetcher.Fetcher - at org.apache.hadoop.fs.ShellCommand.runCommand(ShellCommand.java:48)
> > > 2008-02-08 02:41:01,719 FATAL fetcher.Fetcher - at org.apache.hadoop.fs.ShellCommand.run(ShellCommand.java:42)
> > > 2008-02-08 02:41:01,720 FATAL fetcher.Fetcher - at org.apache.hadoop.fs.DF.getAvailable(DF.java:72)
> > > 2008-02-08 02:41:01,720 FATAL fetcher.Fetcher - at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:296)
> > > 2008-02-08 02:41:01,720 FATAL fetcher.Fetcher - at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
> > > 2008-02-08 02:41:01,720 FATAL fetcher.Fetcher - at org.apache.hadoop.mapred.MapOutputFile.getSpillFileForWrite(MapOutputFile.java:88)
> > > 2008-02-08 02:41:01,720 FATAL fetcher.Fetcher - at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpillToDisk(MapTask.java:382)
> > > 2008-02-08 02:41:01,720 FATAL fetcher.Fetcher - at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:364)
> > > 2008-02-08 02:41:01,720 FATAL fetcher.Fetcher - at org.apache.nutch.fetcher.Fetcher$FetcherThread.output(Fetcher.java:354)
> > > 2008-02-08 02:41:01,720 FATAL fetcher.Fetcher - at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:178)
> > >
> > > Hard-disk does have enough space (over 20GB of which <2 GB is used)
> > >
> > > I am mostly using default hadoop and nutch settings (I tried changing
> > > number of fetch threads - default 35 to 50, and 100 - but it doesn't
> > > have any impact - Fetcher keeps on throwing the above exception after
> > > a while).
> > >
> > > Any thoughts?
> > >
> > > Thanks
> > > Jha.
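
A note on the trace above: the exception is raised inside Runtime.exec, reached from Hadoop's DF.getAvailable through ShellCommand, so it occurs while a child process is being launched rather than while an object is being allocated on the Java heap. The sketch below is one way to try that kind of call in isolation. It is only an illustration under assumptions: the class DfProbe and its methods are hypothetical names (not Nutch or Hadoop APIs), and the command `df -k` is assumed, since the trace does not show the actual command line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

// Hypothetical helper for illustration only; not part of Nutch or Hadoop.
class DfProbe {

    // Returns true if the exception, or any of its causes, carries the
    // native error text seen in the trace ("Cannot allocate memory").
    static boolean isNativeAllocationFailure(Throwable t) {
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            String msg = cur.getMessage();
            if (msg != null && msg.contains("Cannot allocate memory")) {
                return true;
            }
        }
        return false;
    }

    // Launches `df -k <dir>` as a child process (assumed command) and
    // returns whatever it printed. ProcessBuilder.start() is the call
    // that appears in the trace and throws IOException on failure.
    static String runDf(String dir) throws IOException, InterruptedException {
        Process p = new ProcessBuilder("df", "-k", dir)
                .redirectErrorStream(true)
                .start();
        StringBuilder out = new StringBuilder();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream()))) {
            String line;
            while ((line = r.readLine()) != null) {
                out.append(line).append('\n');
            }
        }
        p.waitFor();
        return out.toString();
    }

    public static void main(String[] args) throws InterruptedException {
        String dir = args.length > 0 ? args[0] : ".";
        try {
            System.out.print(runDf(dir));
        } catch (IOException e) {
            if (isNativeAllocationFailure(e)) {
                System.err.println("Child process could not be started: " + e.getMessage());
            } else {
                System.err.println("df failed for another reason: " + e.getMessage());
            }
        }
    }
}
```

If a probe like this reports the same error when run on the same box with a similar heap setting, that would point toward process creation at the OS level rather than the -Xmx value itself; it does not by itself identify the cause.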
