Hi Feng,

Thanks again for your reply.

What I don't really understand is why I only get the PermGen issue after some jobs have failed. If the crawl doesn't fail, I never get the PermGen issue. But if the crawl fails, I get the PermGen issue right after that. I've tried using JProfiler to analyze the memory locally, but no luck.

Many thanks,
Vu

On Wed, Mar 20, 2013 at 2:13 PM, feng lu <[email protected]> wrote:

> Hi Deals
>
> <<
> - Can I add -XX:MaxPermSize into JAVA_OPTS instead?
> >>
> Yes, you can add this to JAVA_OPTS, but I cannot find this parameter in
> the bin/nutch script file.
>
> <<
> - Does Garbage Collector clean the stuff in PermGen Space? If it doesn't
> clean the stuff in PermGen and Nutch keeps adding the stuff into the
> PermGen Space, it will have the PermGen issue again I guess (with the
> bigger size) ?
> >>
> GC cannot clean the PermGen space, so if your application loads a lot of
> classes, it will cause a PermGen space error.
>
> I see that this problem also happens during inject processing, but I can't
> find any sign of what would cause a PermGen space error in that step. It
> just takes a flat file of URLs and adds them to the pages to be crawled.
> There may be additional factors causing this problem.
>
> On Wed, Mar 20, 2013 at 10:45 AM, Deals Collect <[email protected]> wrote:
>
> > Hi Feng,
> >
> > Thanks for your reply. I'm going to add these to the OPTS:
> > -XX:MaxPermSize, -XX:+CMSPermGenSweepingEnabled and
> > -XX:+CMSClassUnloadingEnabled.
> > I'm still not clear about something:
> >
> > - Can I add -XX:MaxPermSize into JAVA_OPTS instead?
> > - Does Garbage Collector clean the stuff in PermGen Space? If it doesn't
> > clean the stuff in PermGen and Nutch keeps adding the stuff into the
> > PermGen Space, it will have the PermGen issue again I guess (with the
> > bigger size) ?
> >
> > Many thanks,
> > Vu
> >
> > On Wed, Mar 20, 2013 at 1:20 PM, feng lu <[email protected]> wrote:
> >
> > > Hi Deals
> > >
> > > You can edit the bin/nutch script file and increase the Permanent
> > > Generation space via the NUTCH_OPTS param, with something like this:
> > >
> > > JAVA=$JAVA_HOME/bin/java
> > > JAVA_HEAP_MAX=-Xmx1000m
> > > NUTCH_OPTS="-XX:PermSize=128M -XX:MaxPermSize=256m"
> > >
> > > On Wed, Mar 20, 2013 at 9:31 AM, Deals Collect <[email protected]> wrote:
> > >
> > > > Hi all,
> > > >
> > > > I'm using Nutch 1.4 and Solr 3.6.1. Crawling works well: it crawls
> > > > the data and sends it to Solr perfectly. But whenever a crawl
> > > > fails, I get java.lang.OutOfMemoryError: PermGen space right
> > > > after that. Here is the log file:
> > > >
> > > > java.io.IOException: Job failed!
> > > > at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1252)
> > > > at org.apache.nutch.parse.ParseSegment.parse(ParseSegment.java:157)
> > > > at org.apache.nutch.crawl.Crawl.run(Crawl.java:138)
> > > > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> > > > at jobs.CrawlerUtils.crawlJob(CrawlerUtils.java:15)
> > > > at jobs.cudo.GoldCoastCudoNutchCrawler.doJob(GoldCoastCudoNutchCrawler.java:23)
> > > > at play.jobs.Job.doJobWithResult(Job.java:50)
> > > > at play.jobs.Job.call(Job.java:146)
> > > > at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> > > > at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> > > > at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98)
> > > > at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:206)
> > > > at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> > > > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> > > > at java.lang.Thread.run(Thread.java:662)
> > > >
> > > > java.io.IOException: Job failed!
> > > > at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1252)
> > > > at org.apache.nutch.crawl.Injector.inject(Injector.java:217)
> > > > at org.apache.nutch.crawl.Crawl.run(Crawl.java:127)
> > > > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> > > > at jobs.CrawlerUtils.crawlJob(CrawlerUtils.java:15)
> > > > at jobs.cudo.HobartCudoNutchCrawler.doJob(HobartCudoNutchCrawler.java:23)
> > > > at play.jobs.Job.doJobWithResult(Job.java:50)
> > > > at play.jobs.Job.call(Job.java:146)
> > > > at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> > > > at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> > > > at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98)
> > > > at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:206)
> > > > at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> > > > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> > > > at java.lang.Thread.run(Thread.java:662)
> > > >
> > > > Every time I get the "Job failed!" error, I get the memory problem
> > > > right after that:
> > > >
> > > > Exception in thread "TP-Processor8" Exception in thread "TP-Processor1"
> > > > Exception in thread "TP-Processor7" java.lang.OutOfMemoryError: PermGen space
> > > > Exception in thread "TP-Processor6" java.lang.OutOfMemoryError: PermGen space
> > > > Exception in thread "TP-Processor9" java.lang.OutOfMemoryError: PermGen space
> > > > Exception in thread "TP-Processor3" java.lang.OutOfMemoryError: PermGen space
> > > > Exception in thread "TP-Processor12" java.lang.OutOfMemoryError: PermGen space
> > > > Exception in thread "TP-Processor5" java.lang.OutOfMemoryError: PermGen space
> > > > Exception in thread "TP-Processor10" java.lang.OutOfMemoryError: PermGen space
> > > >
> > > > Does anyone know this issue?
> > > >
> > > > Many thanks,
> > > > Vu Pham
> > >
> > > --
> > > Don't Grow Old, Grow Up... :-)
>
> --
> Don't Grow Old, Grow Up... :-)

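For reference, the PermGen sizing suggested earlier in the thread and the class-unloading flag from the follow-up could be combined roughly as below. This is only a sketch under several assumptions: a Java 6/7 HotSpot JVM (where PermGen still exists), sizes copied from the thread rather than tuned, and -XX:+UseConcMarkSweepGC added on the assumption that -XX:+CMSClassUnloadingEnabled only has an effect when the CMS collector is selected. The stack traces show the crawl being started from play.jobs.Job rather than from bin/nutch, so the flags would probably have to reach whichever JVM hosts that application; JAVA_OPTS here is a placeholder for however that JVM is configured.

```shell
# Sketch only: assumes a Java 6/7 HotSpot JVM, where PermGen still exists.
# Sizes are the ones quoted in the thread, not tuned values.
PERM_OPTS="-XX:PermSize=128m -XX:MaxPermSize=256m"

# Assumption: class unloading under CMS also needs the CMS collector enabled.
GC_OPTS="-XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled"

# For crawls started through bin/nutch (variable name taken from the thread).
NUTCH_OPTS="${NUTCH_OPTS:-} $PERM_OPTS $GC_OPTS"

# For crawls started inside another application's JVM; JAVA_OPTS is a
# placeholder for however that JVM actually receives its flags.
JAVA_OPTS="${JAVA_OPTS:-} $PERM_OPTS $GC_OPTS"

export NUTCH_OPTS JAVA_OPTS
```

Raising the limit only buys headroom. If each failed job leaves its classes reachable, the larger PermGen would presumably still fill up eventually, which is consistent with the concern raised in the thread.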
