Hi Luca,
Hi Sebastian, thanks for replying!
But after the 5th cycle the crawler stopped?
Yes
For Pierre this has worked... Any suggestions?
I can post info for each step, but please tell me which log matters most: the Hadoop log? The MySQL table? If the latter, which fields?
Alex says it's a MySQL problem. How can I verify after the generate step whether he is correct?
Well, Nutch and Hadoop are designed to process large amounts of data. Job management has some overhead (and some artificial sleeps): 5 cycles * 4 jobs (generate/fetch/parse/update) = 20 jobs. 6s per job seems roughly OK, though it could be slightly faster.
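To make that arithmetic concrete, here is a rough sketch in Python. The 6 s figure is only the approximate per-job time mentioned above, and the function and variable names are purely illustrative, not anything from Nutch or Hadoop.

```python
# Rough estimate of job-management overhead for a small test crawl.
# All numbers are the approximate ones quoted in this thread.
CYCLES = 5
JOBS_PER_CYCLE = ("generate", "fetch", "parse", "update")
SECONDS_PER_JOB = 6  # approximate, not a measured constant


def estimate_overhead(cycles, jobs_per_cycle, seconds_per_job):
    """Return (total number of jobs, total seconds spent across all jobs)."""
    total_jobs = cycles * len(jobs_per_cycle)
    return total_jobs, total_jobs * seconds_per_job


total_jobs, total_seconds = estimate_overhead(CYCLES, JOBS_PER_CYCLE, SECONDS_PER_JOB)
print(f"{total_jobs} jobs, about {total_seconds} s in total")
```

So for a crawl this small, most of the roughly two minutes would be per-job overhead rather than actual fetching.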
Yes, this test is not well suited to Nutch, but I was thinking, as Stefan said, of a config setting or hardcoded delay somewhere in the Nutch files that I could try to reduce, since I will use it on a single machine.
Luca

