Hi Stefan and all,

This is the message I get when I catch the exception:

051225 161555 status: segment 20051225122501, 1371 pages, 85 errors, 50758960 bytes, 13849896 ms
051225 161555 status: 0.09898991 pages/s, 28.6323 kb/s, 37023.312 bytes/page
051225 161556 Updating C:/test_dir/crawl_dir/db
051225 161556 Updating for C:/test_dir/crawl_dir/segments/20051225122501
051225 161556 Processing document 0
051225 161656 Processing document 1000
051225 161723 Finishing update
java.lang.OutOfMemoryError

As you said, I didn't even run a search; I get this exception while updating the webdb. I only reduced the number of threads and increased the number of retries in nutch-site.xml.

I didn't change any other options.

Should I change the value of 'io.sort.mb' and/or 'io.sort.factor'?
If so, what values should I use to eliminate the error?

Also, is there a minimum RAM requirement for Nutch to do indexing and searching?
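
One way to confirm that a larger heap setting actually reached the JVM doing the work is a small check like the one below. This is only a sketch: the class name HeapCheck and its helper method are made up for illustration and are not part of Nutch. Note that options set for Tomcat may not apply to a separately launched crawl or update process.

```java
public class HeapCheck {
    // Convert a byte count to whole megabytes (1 MB = 1024 * 1024 bytes).
    static long toMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory(): the most memory the JVM will attempt to use (reflects -Xmx).
        System.out.println("max heap (MB):   " + toMegabytes(rt.maxMemory()));
        // totalMemory(): memory currently reserved by the JVM.
        System.out.println("total heap (MB): " + toMegabytes(rt.totalMemory()));
        // freeMemory(): unused part of the currently reserved memory.
        System.out.println("free heap (MB):  " + toMegabytes(rt.freeMemory()));
    }
}
```

Running it with the same options as the failing process, for example `java -Xmx512m HeapCheck` (512m is just an example value), shows whether the limit is the one expected.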

Any help is greatly appreciated
Thanks in advance

regards
-Hussain.



----- Original Message -----
From: "Stefan Groschupf" <[EMAIL PROTECTED]>
To: <[email protected]>
Sent: Monday, December 26, 2005 7:18 PM
Subject: Re: "Out of memor error" while updating


Do you have a stack trace?
Could it be related to a 'too many open files' exception?
You could also try reducing 'io.sort.mb' and/or 'io.sort.factor'.

Stefan

Am 26.12.2005 um 09:27 schrieb K.A.Hussain Ali:

Hi all,

I am using Nutch to crawl a few sites. When I crawl to a certain
depth and then update the webdb, I get an "Out of Memory" error
during the update.

I increased the JVM heap size using JAVA_OPTS and even reduced the
per-page token limit in nutch-default.xml, but I still get the
error.

I am using Tomcat, and it has only one application running on it.

What are the system requirements for Nutch to avoid this error?

I even tried the suggestions from the mailing list, but none of them
helped.

Any help is greatly appreciated.
Thanks in advance

regards
-Hussain.

---------------------------------------------------------------
company:        http://www.media-style.com
forum:        http://www.text-mining.org
blog:            http://www.find23.net



