The fetcher, for one, and the MapReduce step takes forever. That is, the
MapReduce is kind of annoying. Is it possible to disable it if I'm not
running on a DFS?
Matt
06/07/25 20:59:12 INFO mapred.LocalJobRunner: reduce > reduce
06/07/25 20:59:14 INFO mapred.LocalJobRunner: reduce > reduce
06/07/25 20:59:19 INFO mapred.LocalJobRunner: reduce > reduce
06/07/25 20:59:23 INFO mapred.LocalJobRunner: reduce > reduce
06/07/25 20:59:29 INFO mapred.LocalJobRunner: reduce > reduce
06/07/25 20:59:33 INFO mapred.LocalJobRunner: reduce > reduce
06/07/25 20:59:34 INFO mapred.JobClient: map 100% reduce 96%
06/07/25 20:59:40 INFO mapred.LocalJobRunner: reduce > reduce
06/07/25 20:59:41 INFO mapred.LocalJobRunner: reduce > reduce
06/07/25 20:59:42 INFO mapred.LocalJobRunner: reduce > reduce
06/07/25 20:59:47 INFO mapred.LocalJobRunner: reduce > reduce
06/07/25 20:59:48 INFO mapred.LocalJobRunner: reduce > reduce
06/07/25 20:59:52 INFO mapred.LocalJobRunner: reduce > reduce
06/07/25 20:59:53 INFO mapred.LocalJobRunner: reduce > reduce
06/07/25 21:00:05 INFO mapred.LocalJobRunner: reduce > reduce
06/07/25 21:00:22 INFO mapred.LocalJobRunner: reduce > reduce
06/07/25 21:00:29 INFO mapred.LocalJobRunner: reduce > reduce
06/07/25 21:00:39 INFO mapred.LocalJobRunner: reduce > reduce
06/07/25 21:01:07 INFO mapred.LocalJobRunner: reduce > reduce
06/07/25 21:01:08 INFO mapred.JobClient: map 100% reduce 97%
06/07/25 21:01:16 INFO mapred.LocalJobRunner: reduce > reduce
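To put a number on "takes forever", the JobClient progress lines in a log like the one above can be turned into a rough rate. The following is only a sketch, not part of Nutch or Hadoop: the function names are made up for illustration, and it assumes the timestamps are in yy/MM/dd HH:mm:ss order.

```python
import re
from datetime import datetime

# Matches JobClient progress lines such as:
#   06/07/25 20:59:34 INFO mapred.JobClient: map 100% reduce 96%
# Assumption: the timestamp layout is yy/MM/dd HH:mm:ss.
PROGRESS = re.compile(
    r"^(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) INFO mapred\.JobClient:\s+"
    r"map (\d+)% reduce (\d+)%"
)


def reduce_progress(log_text):
    """Return (timestamp, reduce_percent) pairs found in JobClient lines."""
    points = []
    for line in log_text.splitlines():
        m = PROGRESS.match(line.strip())
        if m:
            ts = datetime.strptime(m.group(1), "%y/%m/%d %H:%M:%S")
            points.append((ts, int(m.group(3))))
    return points


def seconds_per_percent(points):
    """Average seconds per reduce percentage point, first to last sample."""
    if len(points) < 2:
        return None
    (t0, p0), (t1, p1) = points[0], points[-1]
    if p1 == p0:
        return None
    return (t1 - t0).total_seconds() / (p1 - p0)


# Three lines copied from the log above; LocalJobRunner lines are ignored.
sample = """\
06/07/25 20:59:34 INFO mapred.JobClient: map 100% reduce 96%
06/07/25 21:01:07 INFO mapred.LocalJobRunner: reduce > reduce
06/07/25 21:01:08 INFO mapred.JobClient: map 100% reduce 97%
"""
points = reduce_progress(sample)
rate = seconds_per_percent(points)
print(rate)  # 94.0
```

On the two progress lines in this log, the reduce went from 96% to 97% in 94 seconds. Two samples are far too few to extrapolate from, so this is only useful for comparing runs.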
Sami Siren wrote:
Are you experiencing slowness in general, or just in some parts of the
process?
The current fetcher is dead slow and should be given immediate
attention. There has been some talk about the issue, but I haven't seen
any code yet.
--
Sami Siren
Matthew Holt wrote:
I agree. Is there any way to disable something to speed it up? That is,
is the MapReduce currently needed if we're not on a DFS?
Matt
Vasja Ocvirk wrote:
Hello,
I'm wondering if anyone can help. We injected 1000 seed URLs into
Nutch 0.7.2 (basic configuration + 1000 URLs in regexp filter) and
it processed them in just a few hours. We just switched to 0.8 with
the same configuration and the same URLs, but everything seems to have
slowed down significantly. The crawl script has 60 threads, same as
before, but now it runs much slower.
Thanks!
Best,
Vasja