Add the following property to hadoop-site.xml.  It sets the JVM options, in
this case the maximum Java heap size, for each spawned child task process.
You can raise the value to whatever your machine can spare.  I believe the
default is 200 MB (-Xmx200m), which is far too small.

<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx512m</value>
</property>
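
If you want to see what heap limit a JVM actually ends up with for a given
-Xmx value, a small check along these lines can help.  This is only a sketch:
the class name HeapCheck is made up for illustration and is not part of Nutch
or Hadoop, and it reports on the JVM you launch it in, not on the Hadoop child
task itself.

```java
public class HeapCheck {
    // Maximum heap the current JVM will attempt to use, in megabytes.
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // The reported figure can come out slightly below the -Xmx value,
        // depending on the JVM and garbage collector in use.
        System.out.println("Max heap: " + maxHeapMb() + " MB");
    }
}
```

Run it with the same option you put in the property, for example
"java -Xmx512m HeapCheck", and compare the number against what you expected.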

-----Original Message-----
From: Kunal Wku [mailto:[EMAIL PROTECTED] 
Sent: Monday, November 05, 2007 12:29 PM
To: Nutch User
Subject: Out of Memory Error While Crawling

Hello Everyone,
   
  I encountered errors during the crawl process as follows:
   
  java.lang.OutOfMemoryError: Java heap space
fetcher caught:java.lang.OutOfMemoryError: Java heap space
java.lang.OutOfMemoryError: Java heap space
fetcher caught:java.lang.OutOfMemoryError: Java heap space
Exception in thread "main" java.io.IOException: Job failed!
 at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:604)
 at org.apache.nutch.fetcher.Fetcher.fetch(Fetcher.java:470)
 at org.apache.nutch.crawl.Crawl.main(Crawl.java:124)
   
  Please help me solve this.
   
  Thanks & Regards,
  Kunal Gosar
