> >>>The one task crawls about 3% of my topN and stops
> >>>eventually with java.lang.OutOfMemoryError: Java heap space
> >>>errors.
> >>Are you running Fetcher in parsing mode? Try to use the -noParsing 
> >>option, and then parse the content in a separate step.

I am now running generate/fetch/parse/updatedb.
The fetch process still gets only about 3%-4% of
the URLs in the topN of the generate.
The fetch process logs messages similar to those from before:

-----
fetch of http://www.example.com/public/page.asp/85491 failed with: 
java.lang.OutOfMemoryError: Java heap space
fetch of http://www.example.com/public/page.asp/16154 failed with: 
java.lang.OutOfMemoryError: Java heap space
fetch of http://www.example.com/public/page.asp/20208 failed with: 
java.lang.OutOfMemoryError: Java heap space
fetch of http://www.example.com/public/page.asp/15411 failed with: 
java.lang.OutOfMemoryError: Java heap space
fetch of http://www.example.com/public/page.asp/178293 failed with: 
java.lang.OutOfMemoryError: Java heap space
fetch of http://www.example.com/public/page.asp/843060 failed with: 
java.lang.OutOfMemoryError: Java heap space
fetch of http://www.example.com/public/page.asp/967264 failed with: 
java.lang.OutOfMemoryError: Java heap space
java.lang.OutOfMemoryError: Java heap space
fetcher caught:java.lang.OutOfMemoryError: Java heap space
fetch of http://www.example.com/public/page.asp/97401 failed with: 
java.lang.OutOfMemoryError: Java heap space
java.lang.OutOfMemoryError: Java heap space
fetcher caught:java.lang.OutOfMemoryError: Java heap space
fetch of http://www.example.com/public/page.asp/1585146 failed with: 
java.lang.OutOfMemoryError: Java heap space
java.lang.OutOfMemoryError: Java heap space
fetcher caught:java.lang.OutOfMemoryError: Java heap space
fetch of http://www.example.com/public/page.asp/11 failed with: 
java.lang.OutOfMemoryError: Java heap space
java.lang.OutOfMemoryError: Java heap space
fetcher caught:java.lang.OutOfMemoryError: Java heap space
-----

The first few entries are just 'fetch of X failed with: Y'.
After a few of these, the entries come in sets of 3 messages:
a bare 'java.lang...' line, a 'fetcher caught:java.lang...' line,
and a 'fetch of X failed with: java.lang...' line.
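
To put numbers on the pattern described above, here is a minimal
sketch of a log tally. It assumes the fetcher output has been saved
to a text file and that the lines look like the excerpt above; the
function name summarize and the handling of a wrapped versus
single-line "failed with:" message are my own assumptions, not
anything provided by Nutch.

```python
import re
from collections import Counter

# "fetch of <url> failed with: <exception>", where the exception text may
# instead appear on the following line (as in the wrapped excerpt above).
FAILED = re.compile(r"^fetch of (\S+) failed with:\s*(.*)$")
CAUGHT = re.compile(r"^fetcher caught:\s*(.+)$")


def summarize(lines):
    """Tally failed fetches and 'fetcher caught' lines by exception text."""
    failures = Counter()  # exception text -> number of failed URLs
    caught = Counter()    # exception text -> number of 'fetcher caught' lines
    urls = []             # URLs that failed, in log order
    pending_url = None    # URL whose exception text is expected on the next line
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if pending_url is not None:
            # This line carries the exception for the previous 'fetch of' line.
            failures[line] += 1
            urls.append(pending_url)
            pending_url = None
            continue
        m = FAILED.match(line)
        if m:
            if m.group(2):
                failures[m.group(2)] += 1
                urls.append(m.group(1))
            else:
                pending_url = m.group(1)
            continue
        m = CAUGHT.match(line)
        if m:
            caught[m.group(1)] += 1
    return failures, caught, urls
```

Running summarize(open("fetch.log")) (fetch.log being a hypothetical
file name) and comparing len(urls) with the topN value would show
how many of the generated URLs failed this way rather than never
being attempted at all.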

I am not seeing any errors in the parse process.

How do I hunt down the Java heap space error
further? This only occurs in the fetch process.
Do I have too many threads?

I have it set to 24 threads, 32 max on a single
host.

I have the standard memory option on the java runs.
Every java process has the -Xmx1000m option.
Should this be increased?
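
For a rough sense of scale, a back-of-envelope sketch using the
numbers above. It assumes the heap is shared evenly among the
fetcher threads and that nothing else in the process uses it, which
will not hold exactly in practice; heap_per_thread_mb is just an
illustrative helper name.

```python
def heap_per_thread_mb(heap_mb, threads):
    """Average heap available per thread, assuming an even split."""
    return heap_mb / threads


# -Xmx1000m with 24 fetcher threads, as configured above.
print(round(heap_per_thread_mb(1000, 24), 1))  # 41.7
```

So under that assumption each thread has roughly 40 MB to work
with before the heap is exhausted.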

How do you deal with slaves that have different
amounts of memory? I have some with 1.5 GB of RAM,
and others with 4 GB of RAM.

Sorry for all the questions. The fetch issue is
the wall I am currently trying to overcome.

Should this be debugged in the fetch process or
is it possible the generate process is only
outputting 3%-4% of the topN value?

Thanks in advance for any pointers.

JohnM

-- 
john mendenhall
[EMAIL PROTECTED]
surf utopia
internet services
