Hi Deals,

You can edit the bin/nutch script and raise the Permanent Generation
(PermGen) space through the NUTCH_OPTS parameter, for example:

JAVA=$JAVA_HOME/bin/java
JAVA_HEAP_MAX=-Xmx1000m
NUTCH_OPTS="-XX:PermSize=128M -XX:MaxPermSize=256m"
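
If NUTCH_OPTS may already carry other options in your environment, a small variation appends the flags instead of replacing the existing value. This is only a sketch, assuming a POSIX shell and the same sizes as above; tune the numbers for your machine.

```shell
# Sketch: keep whatever NUTCH_OPTS already holds and append the PermGen flags.
# The 128m/256m sizes are the ones suggested above, not tuned values.
NUTCH_OPTS="${NUTCH_OPTS:-} -XX:PermSize=128m -XX:MaxPermSize=256m"
export NUTCH_OPTS

# Print the result so you can confirm what the script will pass to the JVM.
echo "NUTCH_OPTS=$NUTCH_OPTS"
```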


On Wed, Mar 20, 2013 at 9:31 AM, Deals Collect <[email protected]> wrote:

> Hi all,
>
> I'm using Nutch 1.4 and Solr 3.6.1. Crawling works well: it crawls the
> data and sends it to Solr without problems. But when a crawl occasionally
> fails, I get java.lang.OutOfMemoryError: PermGen space right after that.
> Here is the log file:
>
> java.io.IOException: Job failed!
>     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1252)
>     at org.apache.nutch.parse.ParseSegment.parse(ParseSegment.java:157)
>     at org.apache.nutch.crawl.Crawl.run(Crawl.java:138)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>     at jobs.CrawlerUtils.crawlJob(CrawlerUtils.java:15)
>     at jobs.cudo.GoldCoastCudoNutchCrawler.doJob(GoldCoastCudoNutchCrawler.java:23)
>     at play.jobs.Job.doJobWithResult(Job.java:50)
>     at play.jobs.Job.call(Job.java:146)
>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:206)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>     at java.lang.Thread.run(Thread.java:662)
>
> java.io.IOException: Job failed!
>     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1252)
>     at org.apache.nutch.crawl.Injector.inject(Injector.java:217)
>     at org.apache.nutch.crawl.Crawl.run(Crawl.java:127)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>     at jobs.CrawlerUtils.crawlJob(CrawlerUtils.java:15)
>     at jobs.cudo.HobartCudoNutchCrawler.doJob(HobartCudoNutchCrawler.java:23)
>     at play.jobs.Job.doJobWithResult(Job.java:50)
>     at play.jobs.Job.call(Job.java:146)
>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:206)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>     at java.lang.Thread.run(Thread.java:662)
>
> Every time I get "Job failed!", the memory problem follows right after
> it:
>
> Exception in thread "TP-Processor8" Exception in thread "TP-Processor1" Exception in thread "TP-Processor7" java.lang.OutOfMemoryError: PermGen space
> Exception in thread "TP-Processor6" java.lang.OutOfMemoryError: PermGen space
> Exception in thread "TP-Processor9" java.lang.OutOfMemoryError: PermGen space
> Exception in thread "TP-Processor3" java.lang.OutOfMemoryError: PermGen space
> Exception in thread "TP-Processor12" java.lang.OutOfMemoryError: PermGen space
> Exception in thread "TP-Processor5" java.lang.OutOfMemoryError: PermGen space
> Exception in thread "TP-Processor10" java.lang.OutOfMemoryError: PermGen space
>
> Does anyone know this issue?
>
> Many thanks,
> Vu Pham
>



-- 
Don't Grow Old, Grow Up... :-)
