Doug,
I think you mean update the database, the updatedb command. This should not require a huge amount of memory. What size heap are you giving to Java (NUTCH_HEAPSIZE)? Which JVM are you using? What is the thread dump when you get this exception?
This is the only output I get:
segments/20040619101950
040706 144148 loading file:/home/nutch/nutch-0.5-dev/conf/nutch-default.xml
040706 144149 loading file:/home/nutch/nutch-0.5-dev/conf/nutch-site.xml
040706 144149 Updating newdb
040706 144149 Updating for segments/20040619101950
040706 144149 Using URL filter: net.nutch.net.RegexURLFilter
...skipping...
Exception in thread "main" java.lang.OutOfMemoryError
I use -Xms1500M -Xmx1800M with Sun JDK 1.4.2.x.
I think the only thing which should use a lot of memory when doing db updates is the file sorting code, which is limited by the io.sort.mb config parameter. You could try making this a bit smaller.
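For illustration, a minimal sketch of such an override in conf/nutch-site.xml, one of the two files loaded in the log above. It assumes the root element and property layout match those in your nutch-default.xml. The value 50 is only a placeholder, not a recommended setting; compare it with the default listed in nutch-default.xml and pick something smaller.

```xml
<?xml version="1.0"?>
<!-- conf/nutch-site.xml: site-specific overrides of nutch-default.xml -->
<nutch-conf>
  <property>
    <name>io.sort.mb</name>
    <!-- Placeholder value, assumed to be in megabytes; choose one below your current default -->
    <value>50</value>
  </property>
</nutch-conf>
```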
Thanks for the hint, that wasn't clear to me. I will give it a try.
Stefan
