Check: http://issues.apache.org/jira/browse/NUTCH-233 and let us know if it helps.

Stefan

On 31.07.2006 at 07:46, Matthew Holt wrote:

> The fetcher for one, and the mapreduce takes forever... i.e. the
> mapreduce is kind of annoying... Is it possible to disable it if
> I'm not running on a DFS?
> Matt
>
> 06/07/25 20:59:12 INFO mapred.LocalJobRunner: reduce > reduce
> 06/07/25 20:59:14 INFO mapred.LocalJobRunner: reduce > reduce
> 06/07/25 20:59:19 INFO mapred.LocalJobRunner: reduce > reduce
> 06/07/25 20:59:23 INFO mapred.LocalJobRunner: reduce > reduce
> 06/07/25 20:59:29 INFO mapred.LocalJobRunner: reduce > reduce
> 06/07/25 20:59:33 INFO mapred.LocalJobRunner: reduce > reduce
> 06/07/25 20:59:34 INFO mapred.JobClient: map 100% reduce 96%
> 06/07/25 20:59:40 INFO mapred.LocalJobRunner: reduce > reduce
> 06/07/25 20:59:41 INFO mapred.LocalJobRunner: reduce > reduce
> 06/07/25 20:59:42 INFO mapred.LocalJobRunner: reduce > reduce
> 06/07/25 20:59:47 INFO mapred.LocalJobRunner: reduce > reduce
> 06/07/25 20:59:48 INFO mapred.LocalJobRunner: reduce > reduce
> 06/07/25 20:59:52 INFO mapred.LocalJobRunner: reduce > reduce
> 06/07/25 20:59:53 INFO mapred.LocalJobRunner: reduce > reduce
> 06/07/25 21:00:05 INFO mapred.LocalJobRunner: reduce > reduce
> 06/07/25 21:00:22 INFO mapred.LocalJobRunner: reduce > reduce
> 06/07/25 21:00:29 INFO mapred.LocalJobRunner: reduce > reduce
> 06/07/25 21:00:39 INFO mapred.LocalJobRunner: reduce > reduce
> 06/07/25 21:01:07 INFO mapred.LocalJobRunner: reduce > reduce
> 06/07/25 21:01:08 INFO mapred.JobClient: map 100% reduce 97%
> 06/07/25 21:01:16 INFO mapred.LocalJobRunner: reduce > reduce
>
>
> Sami Siren wrote:
>> Are you experiencing slowness in general or just in some parts of
>> the process?
>>
>> The current fetcher is dead slow and it should be given immediate
>> attention. There has been some talk about the issue but I haven't
>> seen any code yet.
>>
>> --
>> Sami Siren
>>
>> Matthew Holt wrote:
>>> I agree. Is there any way to disable something to speed it up? I.e.
>>> is the map reduce currently needed if we're not on a DFS?
>>>
>>> Matt
>>>
>>> Vasja Ocvirk wrote:
>>>
>>>> Hello,
>>>>
>>>> I'm wondering if anyone can help. We injected 1000 seed URLs
>>>> into Nutch 0.7.2 (basic configuration + 1000 URLs in regexp
>>>> filter) and it processed them in just a few hours. We just
>>>> switched to 0.8 with the same configuration and the same URLs, but
>>>> it seems everything slowed down significantly. The crawl script has
>>>> 60 threads -- same as before, but now it works much slower.
>>>>
>>>> Thanks!
>>>>
>>>> Best,
>>>> Vasja

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
