Why don't we just collect VOTEs for each of the options a-e, and then figure out based on that if there is a majority? If there's no majority, we can whittle it down to, say, the top 2-3, and then VOTE on those, looking for a majority again.

Cheers,
Chris

On Sep 16, 2011, at 11:44 AM, Markus Jelsma wrote:

> Option B) Shelve trunk in a branch and promote 1.4 to trunk. We can always
> choose to hardwire HBase (option D) later.
>
> Markus
>
>> Am happy to call for a vote on the future of Nutch 2.0 if you want. Shall
>> we reduce the various options described before to a single one?
>>
>> Julien
>>
>> On 15 September 2011 19:55, Markus Jelsma <[email protected]> wrote:
>>>> Hi Guys,
>>>>
>>>> I thought I'd chime in on this thread. My comments below:
>>>>
>>>>> I understand and share your frustration, however you need to bear in
>>>>> mind that things are done only if people volunteer and have time -
>>>>> usually taken from their holiday, weekends, evenings. Chris (who is
>>>>> the de facto release master for Nutch and Gora) has not had the time
>>>>> and nobody else has volunteered to do it.
>>>>
>>>> Yep I haven't had the time to push a Gora 0.1.1-incubating release that
>>>> will address the Maven issues. However it is on my roadmap for open
>>>> source stuff to get done in the next month, so that's a good thing. But
>>>> yes, that portion of my open source work is all volunteer time, so
>>>> sometimes other things take priority.
>>>>
>>>>>> As it happens, yesterday was the 1 year anniversary of the last
>>>>>> successful Hudson/Jenkins build... If that actually worked, we
>>>>>> could point people towards it as a useful recipe for how to get a
>>>>>> build working off trunk. I haven't been following Nutch too
>>>>>> closely, but it always strikes me as really odd, that there's a
>>>>>> nightly build and it doesn't bother anybody that it fails all the
>>>>>> time (and that there isn't a nightly build for the stable
>>>>>> branches).
>>>>>
>>>>> The real issue behind all this is what we should do with Nutch 2.0.
>>>>> What follows is only my opinion and I would love to hear what others
>>>>> have to say on this subject.
>>>>>
>>>>> Since we (actually mostly Dogacan) wrote 2.0 and delegated the
>>>>> storage to Gora, the latter hasn't really taken off since incubation.
>>>>> There have been some modest contributions to it but it does not seem
>>>>> to be used much and there is virtually nothing happening on it in
>>>>> terms of development. More worryingly, the people who initially
>>>>> contributed to it are not very active on the project (such is life,
>>>>> new jobs, different projects, etc...) anymore. As for Nutch 2.0, it
>>>>> hasn't made any progress in the last 12 months: we still have the
>>>>> same bugs, the tests do not work, the build has to be done manually
>>>>> etc...
>>>>
>>>> Yep.
>>>>
>>>>> At the same time, there has been a new lease of life into Nutch as a
>>>>> whole: there is definitely more activity on the mailing lists, new
>>>>> users, new active committers etc... and quite a few bugfixes and
>>>>> improvements - most of them backported from what had been done in the
>>>>> trunk and people seem fairly happy with what we can do with 1.4
>>>>
>>>> Totally agreed. I'm actually not super surprised -- ever since 1.1, I
>>>> kind of felt that maintaining a stable 1.X branch of Nutch (in parallel
>>>> to the 2.0 efforts) was really going to pay off since there was renewed
>>>> interest from users in leveraging (and furthermore accepting) the
>>>> nuances of 1.X.
>>>>
>>>>> So the question is: what shall we do with 2.0? Here are a few
>>>>> possibilities
>>>>>
>>>>> a) put some effort into it, fix the bugs and make it so that it can
>>>>>    be used instead of 1.x
>>>>> b) shelve it and leave it for enthusiasts to play with + make 1.x the
>>>>>    trunk again
>>>>> c) do nothing: keep 2.0 and 1.x in parallel (but having to maintain
>>>>>    two branches is quite a pain)
>>>>> d) abandon the idea of a neutral storage layer with Gora and hardwire
>>>>>    it to e.g. HBase
>>>>>
>>>>> Option (a) has not happened in the last 12 months and I am not very
>>>>> hopeful about it.
>>>>>
>>>>> What do you guys think?
>>>>
>>>> I'd suggest an option e). Evolve and keep releasing 1.X over the next 6
>>>> months, and keep 2.0 in the trunk. After 6 months, see how close 1.X is
>>>> to actually being 2.0 (e.g., did we release a 1.4, a 1.5, a 1.6?) If we
>>>> get to ~1.6 over the next 6 months and there is still no active
>>>> development on 2.0, I'd propose we do this at that point in time:
>>>>
>>>> 1. branch the current trunk as
>>>>    https://svn.apache.org/repos/asf/nutch/branches/nutchgora
>>>> 2. grab latest stable branch (e.g.,
>>>>    https://svn.apache.org/repos/asf/nutch/branches/branch-1.6) and
>>>>    *replace* the Nutch trunk with it, and bump the version # to 1.7-dev
>>>> 3. active development on stable becomes active development in trunk and
>>>>    nutchgora still exists in case anyone ever resurrects it.
>>>>
>>>> That way, we give another 6 months to see how it shakes out and
>>>> potentially allow for 1 or 2 or 3 more stable releases before switching
>>>> those over to trunk.
>>>>
>>>> Thoughts?
>>>
>>> Yes. I don't believe we should wait until January before discussing this
>>> topic again. I, for example, cannot spend considerable extra time on the
>>> issues I put in 1.4, also due to the fact that it's not entirely stable.
>>>
>>> There are many things I can write about this topic right now but don't
>>> feel it's necessary. The choice is difficult and perhaps painful but
>>> when the voting round is opened by our project lead, I will vote for
>>> promoting 1.x back to trunk.
>>>
>>> My apologies for my impatience and pessimism.
>>>
>>>> BTW, I have a couple contributions from my CS572: Search Engines class
>>>> from a year ago that I'd love to port into the Nutch stable branch
>>>> including Hubs/Authorities ranking and some other goodies. I'll try and
>>>> work on those over the next few months, I'm just letting everyone know
>>>> now so I don't forget again :-)
>>>>
>>>> Cheers,
>>>> Chris
>>>>
>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>> Chris Mattmann, Ph.D.
>>>> Senior Computer Scientist
>>>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>>>> Office: 171-266B, Mailstop: 171-246
>>>> Email: [email protected]
>>>> WWW: http://sunset.usc.edu/~mattmann/
>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>> Adjunct Assistant Professor, Computer Science Department
>>>> University of Southern California, Los Angeles, CA 90089 USA
>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

