Check:
http://issues.apache.org/jira/browse/NUTCH-233
and let us know if it helps.
Stefan
On 31.07.2006 at 07:46, Matthew Holt wrote:
The fetcher, for one, and the mapreduce takes forever... i.e. the
mapreduce is kind of annoying... Is it possible to disable it if
I'm not running on a DFS?
Matt
06/07/25 20:59:12 INFO mapred.LocalJobRunner: reduce > reduce
06/07/25 20:59:14 INFO mapred.LocalJobRunner: reduce > reduce
06/07/25 20:59:19 INFO mapred.LocalJobRunner: reduce > reduce
06/07/25 20:59:23 INFO mapred.LocalJobRunner: reduce > reduce
06/07/25 20:59:29 INFO mapred.LocalJobRunner: reduce > reduce
06/07/25 20:59:33 INFO mapred.LocalJobRunner: reduce > reduce
06/07/25 20:59:34 INFO mapred.JobClient: map 100% reduce 96%
06/07/25 20:59:40 INFO mapred.LocalJobRunner: reduce > reduce
06/07/25 20:59:41 INFO mapred.LocalJobRunner: reduce > reduce
06/07/25 20:59:42 INFO mapred.LocalJobRunner: reduce > reduce
06/07/25 20:59:47 INFO mapred.LocalJobRunner: reduce > reduce
06/07/25 20:59:48 INFO mapred.LocalJobRunner: reduce > reduce
06/07/25 20:59:52 INFO mapred.LocalJobRunner: reduce > reduce
06/07/25 20:59:53 INFO mapred.LocalJobRunner: reduce > reduce
06/07/25 21:00:05 INFO mapred.LocalJobRunner: reduce > reduce
06/07/25 21:00:22 INFO mapred.LocalJobRunner: reduce > reduce
06/07/25 21:00:29 INFO mapred.LocalJobRunner: reduce > reduce
06/07/25 21:00:39 INFO mapred.LocalJobRunner: reduce > reduce
06/07/25 21:01:07 INFO mapred.LocalJobRunner: reduce > reduce
06/07/25 21:01:08 INFO mapred.JobClient: map 100% reduce 97%
06/07/25 21:01:16 INFO mapred.LocalJobRunner: reduce > reduce
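To put a rough number on the reduce phase in the log above, one could diff the timestamps of the two JobClient progress lines. The following is only a minimal Python sketch: it assumes the timestamps are in yy/MM/dd HH:mm:ss form, and the helper names (reduce_progress, seconds_per_percent) are made up for illustration and are not part of Nutch or Hadoop.

```python
import re
from datetime import datetime

# The two JobClient progress lines copied from the log above.
LOG = """\
06/07/25 20:59:34 INFO mapred.JobClient: map 100% reduce 96%
06/07/25 21:01:08 INFO mapred.JobClient: map 100% reduce 97%
"""

# Matches lines such as "... INFO mapred.JobClient: map 100% reduce 96%".
PROGRESS = re.compile(
    r"^(\d\d/\d\d/\d\d \d\d:\d\d:\d\d) INFO mapred\.JobClient:\s+"
    r"map\s+(\d+)%\s+reduce\s+(\d+)%"
)


def reduce_progress(log_text):
    """Return a list of (timestamp, reduce_percent) for each progress line."""
    points = []
    for line in log_text.splitlines():
        m = PROGRESS.match(line.strip())
        if m:
            # Assumption: timestamps are yy/MM/dd HH:mm:ss.
            ts = datetime.strptime(m.group(1), "%y/%m/%d %H:%M:%S")
            points.append((ts, int(m.group(3))))
    return points


def seconds_per_percent(points):
    """Average seconds per reduce percentage point, first to last report."""
    (t0, p0), (t1, p1) = points[0], points[-1]
    if p1 == p0:
        raise ValueError("no reduce progress between the reports")
    return (t1 - t0).total_seconds() / (p1 - p0)


points = reduce_progress(LOG)
rate = seconds_per_percent(points)
print(f"{rate:.0f} s per reduce percentage point")
```

For the two lines shown this works out to 94 seconds for a single percentage point (96% to 97%), which matches the impression that the reduce step is where the time goes.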
Sami Siren wrote:
Are you experiencing slowness in general or just in some parts of
the process?
The current fetcher is dead slow and should be given immediate
attention. There has been some talk about the issue, but I haven't
seen any code yet.
--
Sami Siren
Matthew Holt wrote:
I agree. Is there any way to disable something to speed it up? I.e.,
is the map reduce currently needed if we're not on a DFS?
Matt
Vasja Ocvirk wrote:
Hello,
I'm wondering if anyone can help. We injected 1000 seed URLs
into Nutch 0.7.2 (basic configuration + 1000 URLs in the regexp
filter) and it processed them in just a few hours. We just
switched to 0.8 with the same configuration and the same URLs, but
it seems everything slowed down significantly. The crawl script has
60 threads, the same as before, but now it works much slower.
Thanks!
Best,
Vasja