nutch 0.8 (+ hadoop 0.5) does not crawl reliably

Teruhiko Kurosaka Tue, 24 Oct 2006 18:14:01 -0700

I am using nutch 0.8 (with hadoop 0.5, to get around
the Java exception that I asked about a few months ago)
with a custom analyzer plugin and some modifications to
NutchAnalysis.jj.


I ran "nutch crawl" over the same test site of just three HTML
files after clearing the index directory.  In two out of three tries,
the crawl session fetched only the index page.  Only one run
(out of three) successfully fetched all pages.  All the
crawl runs were done using the exact same parameters.

Has anybody experienced strange behavior like this?

-kuro
