Hi Kirby,

I couldn't have expressed it more clearly myself.

Thanks

Julien

On 7 July 2011 00:30, Kirby Bohling <[email protected]> wrote:

> What I remember from the list discussions is:
>
> Nutch shouldn't implement everything under the sun, and doesn't need
> to reinvent (and, worse, maintain) the wheel.  Lots of existing
> projects now handle big chunks of the problems that Nutch originally
> handled internally.
>
> * Nutch no longer implements Map-Reduce; that was spun off as Hadoop
> (as I understand it).
> * Tika started off by somebody taking the Nutch parsers and turning
> them into an independent project.
> * Nutch no longer directly indexes using Lucene; instead it lets Solr
> handle that.
>
> Nutch implemented a lot of useful and reusable infrastructure which
> others noticed, spun off and turned into separate projects (and, in
> Hadoop's case, ecosystems).  I am pretty sure that Julien's quote is
> about which pieces of the puzzle Nutch is going to do the heavy
> lifting on itself, and which pieces it is going to delegate to
> some other project.  Even the crawler-commons project mentioned on the
> list is all about spinning out useful reusable components.
>
> The problem Nutch is tackling is large and difficult.  The number of
> code contributors is actually fairly small, hence the extreme focus on
> re-using high quality code.
>
> All that is to say, Nutch still has the same goals and ultimately
> provides all the same functionality, it just isn't going to suffer
> from "Not Invented Here" syndrome.
>
> Kirby
>
>
> On Wed, Jul 6, 2011 at 6:04 PM, Mattmann, Chris A (388J)
> <[email protected]> wrote:
> > Also note that quotes can easily be taken out of context. Let's let
> > Julien be specific and explain what he means rather than interpret
> > his quotes.
> >
> > I'm not sure many of the high level goals of Nutch have changed one bit
> > since Doug started the project. The means, and the mechanism for getting
> > there, have a little bit, hopefully to its benefit.
> >
> > You can read about some of this in my ApacheCon NA 2010 presentation:
> >
> > http://s.apache.org/UvU
> >
> > Cheers,
> > Chris
> >
> > On Jul 6, 2011, at 1:21 PM, <[email protected]> <[email protected]> wrote:
> >
> >> Julien Nioche wrote:
> >>
> >> "This is a change in the scope of the project from being an open source
> >> large scale search engine to an open source crawler indeed. We should
> >> make this clearer on the website."
> >>
> >> Just a crawler? That is what worries me. When I knew Nutch 0.3, I loved
> >> its original purpose. I think that most users, like me, do not have
> >> the technical abilities to deal with further issues, which are quite
> >> complicated for non-programmers.
> >>
> >>
> >>
> >
> >
> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > Chris Mattmann, Ph.D.
> > Senior Computer Scientist
> > NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> > Office: 171-266B, Mailstop: 171-246
> > Email: [email protected]
> > WWW:   http://sunset.usc.edu/~mattmann/
> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > Adjunct Assistant Professor, Computer Science Department
> > University of Southern California, Los Angeles, CA 90089 USA
> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >
> >
>



-- 
Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
