On Fri, 2005-09-30 at 21:31 -0700, Doug Cutting wrote:
> Rod Taylor wrote:
> > With -numFetchers gone it appears I require a generate/update for each
> > fetch which serializes the process.
> parallelizing these made a big difference. But now, in my experience,
> the dbupdate/generate overhead is more like 10-20%. With mapreduce,
> what percent of the time do you find that you're not fetching?

At this moment I have an overloaded router causing communication
problems between systems. So I get a ton of socket timeouts which can
cause reduce %age complete to go backward.

I'll get back to you when I have a few hundred million more pages and a
corrected network.

-- 
Rod Taylor <[EMAIL PROTECTED]>
