Hi Deals,

You can edit the bin/nutch script and increase the permanent generation (PermGen) space by adding the options to the NUTCH_OPTS parameter. Something like this:
JAVA=$JAVA_HOME/bin/java
JAVA_HEAP_MAX=-Xmx1000m
NUTCH_OPTS="-XX:PermSize=128M -XX:MaxPermSize=256m"

On Wed, Mar 20, 2013 at 9:31 AM, Deals Collect <[email protected]> wrote:

> Hi all,
>
> I'm using Nutch 1.4 and Solr 3.6.1. The crawling is working well, it crawls
> data, send to Solr perfectly. But the problem happens when the crawl is
> failed sometimes, I get the java.lang.OutOfMemoryError: PermGen space right
> after that. Here is the log file:
>
> java.io.IOException: Job failed!
>     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1252)
>     at org.apache.nutch.parse.ParseSegment.parse(ParseSegment.java:157)
>     at org.apache.nutch.crawl.Crawl.run(Crawl.java:138)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>     at jobs.CrawlerUtils.crawlJob(CrawlerUtils.java:15)
>     at jobs.cudo.GoldCoastCudoNutchCrawler.doJob(GoldCoastCudoNutchCrawler.java:23)
>     at play.jobs.Job.doJobWithResult(Job.java:50)
>     at play.jobs.Job.call(Job.java:146)
>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:206)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>     at java.lang.Thread.run(Thread.java:662)
>
> java.io.IOException: Job failed!
>     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1252)
>     at org.apache.nutch.crawl.Injector.inject(Injector.java:217)
>     at org.apache.nutch.crawl.Crawl.run(Crawl.java:127)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>     at jobs.CrawlerUtils.crawlJob(CrawlerUtils.java:15)
>     at jobs.cudo.HobartCudoNutchCrawler.doJob(HobartCudoNutchCrawler.java:23)
>     at play.jobs.Job.doJobWithResult(Job.java:50)
>     at play.jobs.Job.call(Job.java:146)
>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:206)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>     at java.lang.Thread.run(Thread.java:662)
>
> Every time I get the "Job Failed!", I will get the problem with memory
> right after that:
>
> Exception in thread "TP-Processor8" Exception in thread "TP-Processor1"
> Exception in thread "TP-Processor7" java.lang.OutOfMemoryError: PermGen space
> Exception in thread "TP-Processor6" java.lang.OutOfMemoryError: PermGen space
> Exception in thread "TP-Processor9" java.lang.OutOfMemoryError: PermGen space
> Exception in thread "TP-Processor3" java.lang.OutOfMemoryError: PermGen space
> Exception in thread "TP-Processor12" java.lang.OutOfMemoryError: PermGen space
> Exception in thread "TP-Processor5" java.lang.OutOfMemoryError: PermGen space
> Exception in thread "TP-Processor10" java.lang.OutOfMemoryError: PermGen space
>
> Anyone knows this issue?
>
> Many thanks,
> Vu Pham
>

--
Don't Grow Old, Grow Up... :-)
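
P.S. Here is a rough sketch of how the NUTCH_OPTS change could be made without overwriting options that are already set. It assumes a POSIX shell and a HotSpot JVM older than Java 8, where the PermGen flags still exist. PERMGEN_OPTS is a name I made up for illustration, and the sizes are only the ones suggested above.

```shell
# Sketch only: append the PermGen settings to whatever NUTCH_OPTS already holds.
# PERMGEN_OPTS is a hypothetical helper variable, not something bin/nutch defines.
PERMGEN_OPTS="-XX:PermSize=128M -XX:MaxPermSize=256m"

case " ${NUTCH_OPTS:-} " in
  *" -XX:MaxPermSize="*)
    # A PermGen limit is already configured, so leave it alone.
    ;;
  *)
    # Add a separating space only when NUTCH_OPTS was non-empty.
    NUTCH_OPTS="${NUTCH_OPTS:+$NUTCH_OPTS }$PERMGEN_OPTS"
    ;;
esac
export NUTCH_OPTS

# Print the result so you can confirm what will be passed to the JVM.
echo "NUTCH_OPTS=$NUTCH_OPTS"
```

These lines would have to run before the script first uses NUTCH_OPTS. Also, the trace above goes through play.jobs.Job, so if the crawl is started from inside that application rather than through bin/nutch, the same two flags would probably need to go into that JVM's startup options instead.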

