Personally, I don't see the advantage of Nutch going for a TLP. It's not like new committers are having a hard time getting in today, or like they are being proposed and rejected. I also don't feel like Nutch lacks exposure/visibility -- lots of people know about it. It's just that very few people need the massively scalable, web-wide crawling machinery that Nutch provides.

Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Hadoop ecosystem search :: http://search-hadoop.com/

> From: "Mattmann, Chris A (388J)" <chris.a.mattm...@jpl.nasa.gov>
> To: "email@example.com" <firstname.lastname@example.org>
> Sent: Sat, March 20, 2010 7:30:54 PM
> Subject: Re: [DISCUSS] Nutch as a top level project (TLP)?
>
> Hey Andrzej,
>
> I’d be +1 for Nutch being a TLP. I don’t think it’ll change much (other than
> to provide more visibility/etc., and to allow more focused decision making by
> the folks in the Nutch community). The infrastructure moves required to move
> to TLP status are moving mailing lists, moving JIRA, moving SVN, and moving
> the website (a bit of redesign/etc.), which shouldn’t be that hard, and the
> infra team can probably help with (at least the first 3 parts if we file
> issues for them).
>
> I’d volunteer to help with things like list moderation, or whatever else I
> can do to help.
>
> The important things to decide would be:
>
>  * Who’s on the PMC (my suggestion, similar to Tika, make existing Nutch
>    committers PMC members)
>
>  * Who’s the VP (my +1 for you)
>
> Cheers,
> Chris
>
> On 3/19/10 12:51 PM, "Andrzej Bialecki" <a...@getopt.org> wrote:
>
>> Hi devs,
>>
>> The ASF Board indicated recently that so called "umbrella" projects,
>> i.e. projects that host many significant sub-projects, should examine
>> their structure towards simplification, such as merging or splitting out
>> sub-projects.
>>
>> Lucene TLP is such a project. Recently the Lucene PMC accepted the merge
>> of Solr and Lucene core projects. Mahout project will most likely split
>> to its own TLP soon. Which leaves Nutch as a sort of odd duck ;)
>>
>> Moving Nutch to its own TLP has some advantages, mostly an easier
>> decision process - voting on new committers and new releases involves
>> then only those who participate directly in Nutch dev., i.e. the Nutch
>> community.
>>
>> Also, from the coding point of view, Nutch is not intrinsically tied to
>> the Lucene development as if both would require some careful
>> coordination - we just use Lucene as one of many dependencies, and in
>> fact we aim to cleanly separate Nutch search API from Lucene-based API.
>> I can easily imagine Nutch dropping completely the low-level
>> Lucene-based components and moving to a more general search fabric (e.g.
>> SolrCloud).
>>
>> Being its own TLP could also give Nutch more exposure and help to
>> crystallize our mission.
>>
>> There are some disadvantages to such a split, too: we would need to
>> spend some more effort on various administrative tasks, and maintain a
>> separate web site (under Apache, but not under Lucene), and probably
>> some other tasks that I'm not yet aware of. This would also mean that
>> Nutch would have to stand on its own merit, which considering the small
>> number of active committers may be challenging.
>>
>> Let's discuss this, and after we collect some pros and cons I'm going to
>> call for a vote.
>>
>> --
>> Best regards,
>> Andrzej Bialecki <><
>> Information Retrieval, Semantic Web
>> Embedded Unix, System Integration
>> http://www.sigram.com Contact: info at sigram dot com
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Chris Mattmann, Ph.D.
> Senior Computer Scientist
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 171-266B, Mailstop: 171-246
> Email: chris.mattm...@jpl.nasa.gov
> WWW: http://sunset.usc.edu/~mattmann/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Adjunct Assistant Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++