Hello everybody! Yesterday I tried to run a crawl with depth 5 and topN 120000.
Partway through the 5th depth I got this error:

2014-03-19 19:16:11,608 WARN  fetcher.FetcherJob - fetch of 
http://www.weather.com/outlook/health/allergies/common/allergens/FL-allergen-716
 failed with: java.lang.OutOfMemoryError: Java heap space
2014-03-19 19:16:11,608 INFO  fetcher.FetcherJob - fetching 
http://www.weather.com/outlook/health/allergies/pollenalert/USCA9000 (queue 
crawl delay=0ms)
2014-03-19 19:16:22,291 ERROR http.Http - Failed with the following error: 
java.lang.OutOfMemoryError: Java heap space
2014-03-19 19:16:24,677 INFO  fetcher.FetcherJob - fetching 
http://www.weather.com/outlook/recreation/outdoors/fishing/29547:21 (queue 
crawl delay=0ms)
2014-03-19 19:16:24,677 WARN  fetcher.FetcherJob - fetch of 
http://www.weather.com/outlook/health/allergies/pollenalert/USCA9000 failed 
with: java.lang.OutOfMemoryError: Java heap space
2014-03-19 19:16:33,550 ERROR http.Http - Failed with the following error: 
java.lang.OutOfMemoryError: Java heap space
2014-03-19 19:16:35,568 INFO  fetcher.FetcherJob - fetching 
http://www.weather.com/outlook/health/allergies/common/allergens/NV-allergen-1187
 (queue crawl delay=0ms)
2014-03-19 19:16:35,568 WARN  fetcher.FetcherJob - fetch of 
http://www.weather.com/outlook/recreation/outdoors/fishing/29547:21 failed 
with: java.lang.OutOfMemoryError: Java heap space
2014-03-19 19:16:41,928 ERROR http.Http - Failed with the following error: 
java.lang.OutOfMemoryError: Java heap space
2014-03-19 19:16:43,535 INFO  fetcher.FetcherJob - fetching 
http://www.weather.com/outlook/health/allergies/common/allergens/OH-allergen-928
 (queue crawl delay=0ms)
2014-03-19 19:16:43,535 WARN  fetcher.FetcherJob - fetch of 
http://www.weather.com/outlook/health/allergies/common/allergens/NV-allergen-1187
 failed with: java.lang.OutOfMemoryError: Java heap space
2014-03-19 19:16:50,432 ERROR http.Http - Failed with the following error: 
java.lang.OutOfMemoryError: Java heap space
2014-03-19 19:16:50,888 WARN  fetcher.FetcherJob - fetch of 
http://www.weather.com/outlook/health/allergies/common/allergens/OH-allergen-928
 failed with: java.lang.OutOfMemoryError: Java heap space
2014-03-19 19:16:51,580 INFO  fetcher.FetcherJob - fetching 
http://www.weather.com/outlook/health/allergies/common/allergens/FL-allergen-235
 (queue crawl delay=0ms)
2014-03-19 19:16:53,120 ERROR http.Http - Failed with the following error: 
2014-03-19 19:16:53,711 INFO  fetcher.FetcherJob - fetching 
http://www.weather.com/outlook/recreation/outdoors/fishing/27891:21 (queue 
crawl delay=0ms)
2014-03-19 19:16:54,659 INFO  fetcher.FetcherJob - -finishing thread 
FetcherThread20, activeThreads=46
2014-03-19 19:17:06,734 INFO  fetcher.FetcherJob - -finishing thread 
FetcherThread48, activeThreads=44
2014-03-19 19:17:08,348 ERROR http.Http - Failed with the following error: 
java.lang.OutOfMemoryError: Java heap space

As you can see, the fetcher is running out of Java heap space. I ran this
crawl using Nutch 2.2.1, Eclipse and MySQL.

Any ideas on how to solve this?
Recently, I changed the metadata field from BLOB to LONGBLOB and set
http.content.limit to -1 (neither change has caused any trouble so far, though).
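
Since http.content.limit = -1 means fetched content is not truncated, it may
be worth checking how much heap the JVM actually gets when the crawl is
started from Eclipse. Below is a minimal sketch for that, using only the
standard java.lang.Runtime calls. The class name HeapCheck and the helper
toMegabytes are hypothetical and not part of Nutch.

```java
// Hypothetical helper, not part of Nutch: prints the heap limits of the
// JVM it runs in, using only standard java.lang.Runtime methods.
public class HeapCheck {

    // Convert a byte count to whole megabytes (integer division).
    static long toMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // Upper bound the heap may grow to (controlled by -Xmx).
        System.out.println("max heap (MB):   " + toMegabytes(rt.maxMemory()));
        // Heap currently reserved by the JVM.
        System.out.println("total heap (MB): " + toMegabytes(rt.totalMemory()));
        // Unused part of the currently reserved heap.
        System.out.println("free heap (MB):  " + toMegabytes(rt.freeMemory()));
    }
}
```

Running it with the same VM arguments as the crawl's Eclipse run
configuration should show whether a larger -Xmx value (the standard JVM
option for the maximum heap size) is actually being applied.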
                                          
