Insurance Squared Inc. wrote:
Hi All,
We're experiencing problems with nutch freezing sporadically when
fetching. Not really too sure where to even start investigating. Some
digging into the archives suggested memory issues, so we did the
following:
TOMCAT_OPTS= -Xmx1024M to increase
Andrzej Bialecki wrote:
Insurance Squared Inc. wrote:
Hi All,
We're experiencing problems with nutch freezing sporadically when
fetching. Not really too sure where to even start investigating.
Some digging into the archives suggested memory issues, so we did the
following:
Hi,
Here's the output from when it freezes. Sorry it's a bit verbose,
wasn't sure what we're looking for so I've included it all:
051230 131519 fetching http://www.municipalaffairs.gov.ab.ca/fco/pdf/ab-clan6-1.pdf
Full thread dump Java HotSpot(TM) Client VM (1.4.2_10-b03 mixed mode):
I've been getting periodic freezes during fetching also. I tracked
down one of the causes to a Java regular expression in my parse
filters. Java's regex support has been a source of lots of frustration
for me. But I'm not positive this was the only cause since I've
frozen on URLs that I couldn't
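
For anyone hitting the same regex problem: one possible guard is to hand the matcher a CharSequence wrapper that checks a deadline on every charAt() call, so a runaway match throws instead of hanging the fetcher thread. This is only a rough sketch, not anything that ships with Nutch. The names BoundedRegex, DeadlineCharSequence, RegexTimeoutException and findWithTimeout are made up for illustration, and it assumes a JDK with System.nanoTime() (Java 5 or later) and that the regex engine reads its input through charAt(), which is how java.util.regex behaves as far as I know.

```java
import java.util.concurrent.TimeUnit;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Nutch or the JDK.
class BoundedRegex {

    /** Thrown when a match attempt runs past its deadline. */
    static class RegexTimeoutException extends RuntimeException {
        RegexTimeoutException(String message) {
            super(message);
        }
    }

    /** Wraps the input and aborts once the deadline has passed. */
    static final class DeadlineCharSequence implements CharSequence {
        private final CharSequence inner;
        private final long deadlineNanos;

        DeadlineCharSequence(CharSequence inner, long deadlineNanos) {
            this.inner = inner;
            this.deadlineNanos = deadlineNanos;
        }

        public char charAt(int index) {
            // The matcher reads input through charAt(), so a backtracking
            // match ends up here repeatedly and can be cut off.
            if (System.nanoTime() - deadlineNanos > 0) {
                throw new RegexTimeoutException("regex match exceeded its time budget");
            }
            return inner.charAt(index);
        }

        public int length() {
            return inner.length();
        }

        public CharSequence subSequence(int start, int end) {
            return new DeadlineCharSequence(inner.subSequence(start, end), deadlineNanos);
        }

        public String toString() {
            return inner.toString();
        }
    }

    /** Runs Matcher.find() with a time budget; throws RegexTimeoutException if exceeded. */
    static boolean findWithTimeout(Pattern pattern, CharSequence input, long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        Matcher matcher = pattern.matcher(new DeadlineCharSequence(input, deadline));
        return matcher.find();
    }
}
```

A parse filter could then call BoundedRegex.findWithTimeout(pattern, content, 2000), catch RegexTimeoutException, and log and skip the offending URL. The 2000 ms budget is an arbitrary value you would need to tune. This only covers time spent inside the regex engine, so it would not help with freezes that have some other cause.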
Howie Wang wrote:
I was wondering how to recover from a bad fetch. Should I consider
the segment corrupt and just delete it? Then should I reset the
fetch date in the webdb so that it will refetch it?
Unfinished segments are ok, you can use them for further processing. Of
course, the parts