Hi,
I ran a fetch with 2000 URLs in the seed list, using 1000 threads, with 100 
of the URLs coming from one single host.
Fetching was very quick (60-70 URLs/s), but INFO messages from 
mapred.LocalJobRunner kept appearing in between. During parsing, the number 
of these messages increased.
I see something like this:
2009-01-28 17:02:21,886 INFO  mapred.LocalJobRunner - 
file:/.../segments/20090128132015/content/part-00000/data:67108864+33554432
2009-01-28 17:02:24,890 INFO  mapred.LocalJobRunner - 
file:/.../segments/20090128132015/content/part-00000/data:67108864+33554432
2009-01-28 17:02:27,894 INFO  mapred.LocalJobRunner - 
file:/.../segments/20090128132015/content/part-00000/data:67108864+33554432
2009-01-28 17:02:30,898 INFO  mapred.LocalJobRunner - 
file:/.../segments/20090128132015/content/part-00000/data:67108864+33554432
2009-01-28 17:02:33,902 INFO  mapred.LocalJobRunner - 
file:/.../segments/20090128132015/content/part-00000/data:67108864+33554432
2009-01-28 17:02:33,998 INFO  mapred.JobClient -  map 74% reduce 0%
2009-01-28 17:02:36,906 INFO  mapred.LocalJobRunner - 
file:/.../segments/20090128132015/content/part-00000/data:67108864+33554432
2009-01-28 17:02:39,910 INFO  mapred.LocalJobRunner - 
file:/.../segments/20090128132015/content/part-00000/data:67108864+33554432
2009-01-28 17:02:42,914 INFO  mapred.LocalJobRunner - file:/...
These messages have been appearing for an hour, with only an occasional 
message that another URL has been parsed. I would say that 90% of all output 
comes from LocalJobRunner.
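To put a number on that estimate rather than guess, a rough sketch like the following could tally log lines per logger. It assumes the line layout shown in the excerpt above (timestamp, level, logger name, " - ", message); the function name `logger_share` and the sample lines are made up for illustration and are not part of Nutch or Hadoop.

```python
import re
from collections import Counter

# Assumed layout: "<date> <time>,<ms> <LEVEL>  <logger> - <message>"
LOG_LINE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}\s+(\w+)\s+(\S+)\s+-"
)


def logger_share(lines):
    """Return {logger name: fraction of the log lines that matched}."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match:  # lines without a timestamp prefix are skipped
            counts[match.group(2)] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: n / total for name, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical sample in the same shape as the excerpt above.
    sample = [
        "2009-01-28 17:02:27,894 INFO  mapred.LocalJobRunner - file:/...",
        "2009-01-28 17:02:30,898 INFO  mapred.LocalJobRunner - file:/...",
        "2009-01-28 17:02:33,902 INFO  mapred.LocalJobRunner - file:/...",
        "2009-01-28 17:02:33,998 INFO  mapred.JobClient -  map 74% reduce 0%",
    ]
    shares = logger_share(sample)
    for name in sorted(shares, key=shares.get, reverse=True):
        print(f"{name}: {shares[name]:.0%}")
```

Passing an open log file instead of the sample list (for example `logger_share(open(path))`, with `path` pointing at your own log) would give the share for a whole run.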
I think these mapred processes slow down my complete generate/fetch/parse cycle.
What can I do? Is this normal behavior? Any ideas? What did I do wrong?
We are running two Nutch instances on a single machine.
Thanks in advance,
Nadine.
