Guys,

I thought I'd chime in here. I don't have a lot of time tonight (long day
out here in California), but perhaps I can add more thoughts tomorrow.

My +1 for moving Nutch into a TLP. With a 1.0 release, and several prior
releases (~10), I think that the discussion is reasonable. I also tend to
agree with Dennis's view regarding it being a positive thing to have a Nutch
PMC. The project has been around since 2005, and whether or not activity has
slowed of late, there are still folks who are actively interested in Nutch
and use it operationally day to day, myself included.

That said, I would like to revisit some of the ideas about the Next
Generation Nutch discussion:

http://markmail.org/message/mcnbgg7uf54snf55#query:next%20generation%20nutch%20mattmann+page:1+mid:ofk3ob3hv4djmrmn+state:results

And use this as a springboard for some of the things we should really think
about if we make Nutch a TLP. IMHO, these ideas really justify Nutch as a
TLP because we:

1. have a 1.0 release (and several official 0.x releases and 0.x.y patch
releases)
2. have the system in real-world operations
3. have a plan going forward for a "next gen" or 2.0 architecture

As for Nutch being an integration platform for existing Lucene components, I
think that Nutch should certainly make use of existing functionality where
it makes sense (Tika, Solr, etc.), but we really need to take a hard look at
insulating the core POJO model of Nutch (Brin and Page paper here, folks: I'm
talking about "The Anatomy of a Large-Scale Hypertextual Web Search Engine")
from the underlying technology substrate. That would be on my list of top
goals for Nutch as a TLP. In fact, thinking about this, I think it
lends itself very nicely to a category of sub-projects (e.g., Nutch-Hadoop,
Nutch-JMS, etc.) to think about from a TLP perspective.

Anyways, just wanted to chime in. I'll add more tomorrow.

Thanks,
Chris




On 3/17/09 7:05 PM, "Marc Boucher" <marc.bouc...@hyperix.com> wrote:

> Dennis,
> 
> That adds another dimension to the issue which I had not considered.
> One avenue as you suggest would be to add another committer to the
> Lucene PMC. If that does not work then maybe going the route of TLP is
> the best option.
> 
> Marc
> 
> 
>> Part of this is about releases.  Currently releases are voted on by Lucene
>> PMC members and it takes 3 members to confirm a vote.  There are only 2
>> Nutch committers on the Lucene PMC.  So for releases, not that we have had
>> many recently, other Lucene PMC members who may not be actively associated
>> with Nutch would need to vote to release.  If Nutch was a TLP there would be
>> a Nutch PMC which would most likely include all current Nutch committers.
>> The other option may be to add another Nutch committer to the Lucene PMC.
>> 
>>> 
>>> My thoughts. And hopefully in the near future my small team will be
>>> able to contribute to Nutch in a meaningful way.
>> 
>> Any and every contribution is welcome.
>> 
>> Dennis
>> 
> 

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.mattm...@jpl.nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


