Oh, I'd like to add that the biggest problem is memory, along with the 
possibility that a parser hangs, consumes resources, makes everything else 
time out and destroys the segment.
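
To illustrate the hanging-parser risk, here is a minimal sketch of bounding a parse call with a timeout so that one stuck document cannot stall the whole job. This is not Nutch's own code: the class ParseGuard, the method parseWithTimeout and the use of a plain Callable<String> as the "parser" are all hypothetical names chosen for the example.

```java
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of Nutch: runs a parse task with a hard time limit.
final class ParseGuard {

    // Returns the parse result, or Optional.empty() if the parser
    // timed out, failed, or the calling thread was interrupted.
    static Optional<String> parseWithTimeout(Callable<String> parser, long timeoutMillis) {
        // Daemon thread, so a parser that ignores interruption cannot keep the JVM alive.
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "parse-guard");
            t.setDaemon(true);
            return t;
        });
        Future<String> result = executor.submit(parser);
        try {
            return Optional.ofNullable(result.get(timeoutMillis, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            result.cancel(true); // ask the stuck parser to stop
            return Optional.empty();
        } catch (ExecutionException e) {
            return Optional.empty(); // the parser itself threw
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return Optional.empty();
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // A well-behaved parser returns quickly.
        System.out.println(parseWithTimeout(() -> "parsed", 1000).isPresent());
        // A parser that hangs is abandoned after 100 ms.
        System.out.println(parseWithTimeout(() -> {
            Thread.sleep(5000);
            return "late";
        }, 100).isPresent());
    }
}
```

A timeout like this only limits how long the caller waits. It does not reclaim memory a runaway parser has already allocated, which is one more reason to keep parsing out of the fetch job.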
 
 
-----Original message-----
> From:Weilei Zhang <[email protected]>
> Sent: Sat 09-Feb-2013 23:40
> To: [email protected]
> Subject: Re: performance question: fetcher and parser in separate map/reduce 
> jobs?
> 
> This is indeed helpful. Thanks Lewis.
> 
> On Wed, Feb 6, 2013 at 6:50 PM, Lewis John Mcgibbney
> <[email protected]> wrote:
> > I've finally added this to our FAQ
> >
> > http://wiki.apache.org/nutch/FAQ#Can_I_parse_during_the_fetching_process.3F
> >
> > This should explain it for you.
> > Lewis
> >
> > On Wed, Feb 6, 2013 at 6:31 PM, Weilei Zhang <[email protected]> wrote:
> >
> >> Hi
> >> I have a performance question:
> >> Why are the fetcher and parser staged in two separate jobs instead of one?
> >> Intuitively, the parser could be included as part of the fetcher reducer,
> >> couldn't it? This seems more efficient.
> >> Thanks
> >> --
> >> Best Regards
> >> -Weilei
> >>
> >
> >
> >
> > --
> > *Lewis*
> 
> 
> 
> -- 
> Best Regards
> -Weilei
> 
