Why don't we just collect VOTEs for each of the options a-e, and then
figure out based on that if there is a majority? If there's no majority, we
can whittle it down to, say, the top 2-3, and then VOTE on those, looking
for a majority again.
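
A rough sketch of that two-round procedure, just to make the rule concrete.
The function name tally, the keep parameter and the sample ballots are all
made up for illustration; "majority" is assumed to mean strictly more than
half of the votes cast, and ties in the runoff list fall back to the order
in which options were first seen.

```python
from collections import Counter


def tally(votes, keep=2):
    """Return (winner, finalists) for one round of voting.

    votes -- list of option labels, one per ballot (hypothetical input format)
    keep  -- how many top options go to the next round if nobody has a majority
    """
    counts = Counter(votes)
    total = sum(counts.values())
    if total == 0:
        # No ballots cast: no winner and nothing to carry forward.
        return None, []

    top_option, top_count = counts.most_common(1)[0]
    # Assumption: a majority is strictly more than half of all votes cast.
    if top_count * 2 > total:
        return top_option, []

    # No majority: whittle the field down to the `keep` most popular options.
    finalists = [option for option, _ in counts.most_common(keep)]
    return None, finalists


if __name__ == "__main__":
    # Made-up ballots, not real votes from this thread.
    first_round = ["a", "b", "b", "e", "e", "d"]
    winner, finalists = tally(first_round, keep=2)
    print(winner, finalists)
```

If the first round returns no winner, the second VOTE would simply be run
over the finalists and passed through the same function again.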

Cheers,
Chris

On Sep 16, 2011, at 11:44 AM, Markus Jelsma wrote:

> Option B) Shelve trunk in a branch and promote 1.4 to trunk. We can always 
> choose to hardwire HBase (option D) later.
> 
> Markus
> 
>> Am happy to call for a vote on the future of Nutch 2.0 if you want. Shall
>> we reduce the various options described before to a single one?
>> 
>> Julien
>> 
>> On 15 September 2011 19:55, Markus Jelsma <[email protected]> wrote:
>>>> Hi Guys,
>>>> 
>>>> I thought I'd chime in on this thread. My comments below:
>>>>> I understand and share your frustration, however you need to bear in
>>>>> mind that things are done only if people volunteer and have time -
>>>>> usually taken from their holiday, weekends, evenings. Chris (who is the
>>>>> de facto release master for Nutch and Gora) has not had the time and
>>>>> nobody else has volunteered to do it.
>>>> 
>>>> Yep I haven't had the time to push a Gora 0.1.1-incubating release that
>>>> will address the Maven issues. However it is on my roadmap for open
>>>> source stuff to get done in the next month, so that's a good thing. But
>>>> yes, that portion of my open source work is all volunteer time, so
>>>> sometimes other things take priority.
>>>> 
>>>>>> As it happens, yesterday was the 1 year anniversary of the last
>>>>>> successful Hudson/Jenkins build...  If that actually worked, we
>>>>>> could point people towards it as a useful recipe for how to get a
>>>>>> build working off trunk.  I haven't been following Nutch too
>>>>>> closely, but it always strikes me as really odd, that there's a
>>>>>> nightly build and it doesn't bother anybody that it fails all the
>>>>>> time (and that there isn't a nightly build for the stable
>>>>>> branches).
>>>>> 
>>>>> The real issue behind all this is what we should do with Nutch 2.0.
>>>>> What follows is only my opinion and I would love to hear what others
>>>>> have to say on this subject.
>>>>> 
>>>>> Since we (actually mostly Dogacan) wrote 2.0 and delegated the
>>>>> storage to Gora, the latter hasn't really taken off since incubation.
>>>>> There have been some modest contributions to it but it does not seem
>>>>> to be used much and there is virtually nothing happening on it in terms
>>>>> of development. More worryingly, the people who initially contributed
>>>>> to it are not very active on the project (such is life, new jobs,
>>>>> different projects, etc...) anymore. As for Nutch 2.0, it hasn't made
>>>>> any progress in the last 12 months: we still have the same bugs, the
>>>>> tests do not work, the build has to be done manually etc...
>>>> 
>>>> Yep.
>>>> 
>>>>> At the same time, there has been a new lease of life in Nutch as a
>>>>> whole: there is definitely more activity on the mailing lists, new
>>>>> users, new active committers etc... and quite a few bugfixes and
>>>>> improvements - most of them backported from what had been done in the
>>>>> trunk - and people seem fairly happy with what we can do with 1.4.
>>>> 
>>>> Totally agreed. I'm actually not super surprised -- ever since 1.1, I
>>>> kind of felt that maintaining a stable 1.X branch of Nutch (in parallel
>>>> to the 2.0 efforts) was really going to pay off since there was renewed
>>>> interest from users in leveraging (and furthermore accepting) the
>>>> nuances of 1.X.
>>>> 
>>>>> So the question is: what shall we do with 2.0? Here are a few
>>>>> possibilities:
>>>>> 
>>>>> a) put some effort into it, fix the bugs and make it so that it can be
>>>>> used instead of 1.x
>>>>> b) shelve it and leave it for enthusiasts to play with + make 1.x the
>>>>> trunk again
>>>>> c) do nothing: keep 2.0 and 1.x in parallel (but having to maintain
>>>>> two branches is quite a pain)
>>>>> d) abandon the idea of a neutral storage layer with Gora and hardwire
>>>>> it to e.g. HBase
>>>>> 
>>>>> Option (a) has not happened in the last 12 months and I am not very
>>>>> hopeful about it.
>>>>> 
>>>>> What do you guys think?
>>>> 
>>>> I'd suggest an option e). Evolve and keep releasing 1.X over the next 6
>>>> months, and keep 2.0 in the trunk. After 6 months, see how close 1.X is
>>>> to actually being 2.0 (e.g., did we release a 1.4, a 1.5, a 1.6?). If we
>>>> get to ~1.6 over the next 6 months and there is still no active
>>>> development on 2.0, I'd propose we do this at that point in time:
>>>> 
>>>> 1. branch the current trunk as
>>>>    https://svn.apache.org/repos/asf/nutch/branches/nutchgora
>>>> 2. grab latest stable branch (e.g.,
>>>>    https://svn.apache.org/repos/asf/nutch/branches/branch-1.6) and
>>>>    *replace* the Nutch trunk with it, and bump the version # to 1.7-dev
>>>> 3. active development on stable becomes active development in trunk and
>>>>    nutchgora still exists in case anyone ever resurrects it.
>>>> 
>>>> That way, we give another 6 months to see how it shakes out and
>>>> potentially allow for 1 or 2 or 3 more stable releases before switching
>>>> those over to trunk.
>>>> 
>>>> Thoughts?
>>> 
>>> Yes. I don't believe we should wait until January before discussing this
>>> topic again. I, for example, cannot spend considerable extra time on the
>>> issues I put in 1.4, also due to the fact that it's not entirely stable.
>>> 
>>> There are many things I can write about this topic right now but don't
>>> feel it's necessary. The choice is difficult and perhaps painful but
>>> when the voting round is opened by our project lead, I will vote for
>>> promoting 1.x back to trunk.
>>> 
>>> My apologies for my impatience and pessimism.
>>> 
>>>> BTW, I have a couple contributions from my CS572: Search Engines class
>>>> from a year ago that I'd love to port into the Nutch stable branch,
>>>> including Hubs/Authorities ranking and some other goodies. I'll try and
>>>> work on those over the next few months, I'm just letting everyone know
>>>> now so I don't forget again :-)
>>>> 
>>>> Cheers,
>>>> Chris
>>>> 
>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>> Chris Mattmann, Ph.D.
>>>> Senior Computer Scientist
>>>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>>>> Office: 171-266B, Mailstop: 171-246
>>>> Email: [email protected]
>>>> WWW:   http://sunset.usc.edu/~mattmann/
>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>> Adjunct Assistant Professor, Computer Science Department
>>>> University of Southern California, Los Angeles, CA 90089 USA
>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


