Hi,

Since Hadoop was split out of Nutch, the fetcher has been misbehaving.
All task trackers are accessing the same sites at the same time. Adding the following Hadoop call to Fetcher.java did not change this behavior:

    job.setBoolean("mapred.speculative.execution", false);

Here is what I found in the Job Tracker log, where I think the speculative execution happens during the fetch:

    060301 130053 Task 'task_m_349fnm' has completed.
    060301 130053 Adding task 'task_m_42lyfs' to tip tip_67i2wa, for tracker 'tracker_77986'

I do not know if it is related, but the task ended prematurely without any error:

    060301 130134 task_r_2c181b 0.0% reduce > copy >
    060301 130135 task_m_349fnm 0.031490237% 10 pages, 0 errors, 0.1 pages/s, 47 kb/s,
    060301 130135 task_m_349fnm 0.031490237% 10 pages, 0 errors, 0.1 pages/s, 47 kb/s,
    060301 130135 Task task_m_349fnm is done.

Gal
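
P.S. One way to check whether several tasks are being attached to the same tip might be to scan the Job Tracker log for the "Adding task" lines and group them by tip. The following is only a rough sketch, assuming the log line format quoted in this message. The class name TipAttemptScanner and the method tasksByTip are made up for illustration and are not part of Nutch or Hadoop. Treating a tip with more than one task as a sign of speculative or repeated execution is also an assumption on my part, not something the log confirms.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Nutch or Hadoop.
class TipAttemptScanner {

    // Assumed line format, taken from the log excerpt in this message:
    // 060301 130053 Adding task 'task_m_42lyfs' to tip tip_67i2wa, for tracker 'tracker_77986'
    private static final Pattern ADDING = Pattern.compile(
        "Adding task '([^']+)' to tip ([^,\\s]+), for tracker '([^']+)'");

    // Groups "task@tracker" entries by the tip they were added to, in log order.
    static Map<String, List<String>> tasksByTip(List<String> logLines) {
        Map<String, List<String>> byTip = new LinkedHashMap<>();
        for (String line : logLines) {
            Matcher m = ADDING.matcher(line);
            if (m.find()) {
                byTip.computeIfAbsent(m.group(2), k -> new ArrayList<>())
                     .add(m.group(1) + "@" + m.group(3));
            }
        }
        return byTip;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java TipAttemptScanner <jobtracker-log-file>");
            return;
        }
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        // Report only tips that received more than one task.
        for (Map.Entry<String, List<String>> e : tasksByTip(lines).entrySet()) {
            if (e.getValue().size() > 1) {
                System.out.println(e.getKey() + " -> " + e.getValue());
            }
        }
    }
}
```

If a tip shows up with tasks on more than one tracker while the speculative execution setting is false, that would at least suggest the setting is not being picked up by the job.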