Glad to see we're making progress here. Same here: I am ready to move on with the project and get out of this 'rut' we have been in with trunk.
Thanks

On Sat, Sep 17, 2011 at 6:56 PM, Mattmann, Chris A (388J) <[email protected]> wrote:

> Hey Markus,
>
> No worries. I actually have no dog in this fight to be honest.
>
> I want Gora to be successful, and I want Nutch to be successful.
> I haven't contributed much to Nutch 2.0 trunk but I have been
> to the 1.x series branch. I wish I knew more about Gora's internals (and
> am trying to learn) so I could help more with it. I think it will make a
> lot of sense to use it at some point.
>
> At the same time, I'm all for making 1.x releases and naturally getting to
> 2.0 over time based on our current progress and understanding. I'm also
> super excited about the 1.x versions of Nutch and when I think about it
> the reality is that they've always been Nutch trunk even though we
> artificially tried to turn the nutchbase branch into it.
>
> So to wrap it up, I'm totally fine with 1.x moving into trunk and with
> executing the plan I proposed a while back:
>
> ---snip
> 1. branch the current trunk as
> https://svn.apache.org/repos/asf/nutch/branches/nutchgora
> 2. grab latest stable branch (e.g.,
> https://svn.apache.org/repos/asf/nutch/branches/branch-1.6) and
> *replace* the Nutch trunk with it, and bump the version # to 1.7-dev
> 3. active development on stable becomes active development in trunk and
> nutchgora still exists in case anyone ever resurrects it.
> ---snip
>
> Of course, it's not 1.6 (I was optimistic about getting there in 6 months
> ;) ), but it's really 1.4.
> And we don't need to bump to -dev since we're already in full dev with the
> 1.4 cycle.
>
> So, I'm ready for a VOTE. Feel free to call one (or have Julien do it), and
> I'll VOTE +1.
>
> Cheers,
> Chris
>
>
> On Sep 17, 2011, at 10:18 AM, Markus Jelsma wrote:
>
> > Hi Chris,
> >
> > I initially respawned this thread with the suggestion not to wait until
> > january or so before the vote.
> > Hence my apologies for being impatient and
> > pessimistic about trunk :)
> >
> > Cheers,
> >
> >> Hey Julien,
> >>
> >> My option E was pretty much equivalent to B except I specified a time frame
> >> (next 6 months). Are we just saying that we'll accelerate the time frame
> >> to say, umm, next week or the week after? :)
> >>
> >> If so, fine by me. Since I moved nutchbase into the trunk at one point, I'd
> >> be happy once we've VOTEd and decided to be the one to execute moving it
> >> out.
> >>
> >> And yes, PMC votes will be binding and we'll do majority takes it, fine by
> >> me.
> >>
> >> Cheers,
> >> Chris
> >>
> >> On Sep 17, 2011, at 1:45 AM, Julien Nioche wrote:
> >>> Let's keep it simple. Let's vote for option B (i.e. shelve 2.0), if most
> >>> people are in favour then we don't need to look into other options at
> >>> all. If not, we'll see what alternatives or arguments come up and vote
> >>> on these later.
> >>>
> >>> I assume that only PMC votes will be binding and the majority takes it?
> >>>
> >>> Julien
> >>>
> >>> On 16 September 2011 22:30, Mattmann, Chris A (388J)
> >>> <[email protected]> wrote:
> >>> Why don't we just collect VOTEs
> >>> for each of the options a-e, and then figure out based on that if there
> >>> is a majority. If there's no majority, we can whittle it down to say the
> >>> top 2-3, and then VOTE on those, looking for majority again.
> >>>
> >>> Cheers,
> >>> Chris
> >>>
> >>> On Sep 16, 2011, at 11:44 AM, Markus Jelsma wrote:
> >>>> Option B) Shelve trunk in a branch and promote 1.4 to trunk. We can
> >>>> always choose to hardwire HBASE (option D) later.
> >>>>
> >>>> Markus
> >>>>
> >>>>> Am happy to call for a vote on the future of Nutch 2.0 if you want.
> >>>>> Shall we reduce the various options described before to a single one?
> >>>>>
> >>>>> Julien
> >>>>>
> >>>>> On 15 September 2011 19:55, Markus Jelsma <[email protected]> wrote:
> >>>>>>> Hi Guys,
> >>>>>>>
> >>>>>>> I thought I'd chime in on this thread.
> >>>>>>> My comments below:
> >>>>>>>> I understand and share your frustration, however you need to bear
> >>>>>>>> in mind that things are done only if people volunteer and have time -
> >>>>>>>> usually taken from their holiday, weekends, evenings. Chris (who
> >>>>>>>> is the de facto release master for Nutch and Gora) has not had the
> >>>>>>>> time and nobody else has volunteered to do it.
> >>>>>>>
> >>>>>>> Yep I haven't had the time to push a Gora 0.1.1-incubating release
> >>>>>>> that will address the Maven issues. However it is on my roadmap for
> >>>>>>> open source stuff to get done in the next month, so that's a good
> >>>>>>> thing. But yes, that portion of my open source work is all volunteer
> >>>>>>> time, so sometimes other things take priority.
> >>>>>>>
> >>>>>>>>> As it happens, yesterday was the 1 year anniversary of the last
> >>>>>>>>> successful Hudson/Jenkins build... If that actually worked, we
> >>>>>>>>> could point people towards it as a useful recipe for how to get a
> >>>>>>>>> build working off trunk. I haven't been following Nutch too
> >>>>>>>>> closely, but it always strikes me as really odd, that there's a
> >>>>>>>>> nightly build and it doesn't bother anybody that it fails all the
> >>>>>>>>> time (and that there isn't a nightly build for the stable
> >>>>>>>>> branches).
> >>>>>>>>
> >>>>>>>> The real issue behind all this is what we should do with Nutch 2.0.
> >>>>>>>> What follows is only my opinion and I would love to hear what others
> >>>>>>>> have to say on this subject.
> >>>>>>>>
> >>>>>>>> Since we (actually mostly Dogacan) wrote 2.0 and delegated the
> >>>>>>>> storage to Gora, the latter hasn't really taken off since incubation.
> >>>>>>>> There have been some modest contributions to it but it does not seem
> >>>>>>>> to be used much and there is virtually nothing happening on it in
> >>>>>>>> terms of development. More worryingly, the people who initially
> >>>>>>>> contributed to it are not very active on the project (such is life,
> >>>>>>>> new jobs, different projects, etc...) anymore. As for Nutch 2.0, it
> >>>>>>>> hasn't made any progress in the last 12 months: we still have the
> >>>>>>>> same bugs, the tests do not work, the build has to be done manually
> >>>>>>>> etc...
> >>>>>>>
> >>>>>>> Yep.
> >>>>>>>
> >>>>>>>> At the same time, there has been a new lease of life into Nutch as
> >>>>>>>> a whole: there is definitely more activity on the mailing lists,
> >>>>>>>> new users, new active committers etc... and quite a few bugfixes
> >>>>>>>> and improvements - most of them backported from what had been done
> >>>>>>>> in the trunk and people seem fairly happy with what we can do with
> >>>>>>>> 1.4
> >>>>>>>
> >>>>>>> Totally agreed. I'm actually not super surprised -- ever since 1.1,
> >>>>>>> I kind of felt that maintaining a stable 1.X branch of Nutch (in
> >>>>>>> parallel to the 2.0 efforts) was really going to pay off since there
> >>>>>>> was renewed interest from users in leveraging (and furthermore
> >>>>>>> accepting) the nuances of 1.X.
> >>>>>>>
> >>>>>>>> So the question is: what shall we do with 2.0?
> >>>>>>>> Here are a few possibilities:
> >>>>>>>>
> >>>>>>>> a) put some effort into it, fix the bugs and make it so that it can
> >>>>>>>> be used instead of 1.x
> >>>>>>>> b) shelve it and leave it for enthusiasts to play with + make 1.x
> >>>>>>>> the trunk again
> >>>>>>>> c) do nothing: keep 2.0 and 1.x in parallel (but having to
> >>>>>>>> maintain two branches is quite a pain)
> >>>>>>>> d) abandon the idea of a neutral storage layer with Gora and
> >>>>>>>> hardwire it to e.g. HBase
> >>>>>>>>
> >>>>>>>> Option (a) has not happened in the last 12 months and I am not very
> >>>>>>>> hopeful about it.
> >>>>>>>>
> >>>>>>>> What do you guys think?
> >>>>>>>
> >>>>>>> I'd suggest an option e). Evolve and keep releasing 1.X over the
> >>>>>>> next 6 months, and keep 2.0 in the trunk. After 6 months, see how
> >>>>>>> close 1.X is to actually being 2.0 (e.g., did we release a 1.4, a
> >>>>>>> 1.5, a 1.6?) If we get to ~1.6 over the next 6 months and there is
> >>>>>>> still no active development on 2.0, I'd propose we do this at that
> >>>>>>> point in time:
> >>>>>>>
> >>>>>>> 1. branch the current trunk as
> >>>>>>> https://svn.apache.org/repos/asf/nutch/branches/nutchgora
> >>>>>>> 2. grab latest stable branch (e.g.,
> >>>>>>> https://svn.apache.org/repos/asf/nutch/branches/branch-1.6) and
> >>>>>>> *replace* the Nutch trunk with it, and bump the version # to 1.7-dev
> >>>>>>> 3. active development on stable becomes active development in trunk
> >>>>>>> and nutchgora still exists in case anyone ever resurrects it.
> >>>>>>>
> >>>>>>> That way, we give another 6 months to see how it shakes out and
> >>>>>>> potentially allow for 1 or 2 or 3 more stable releases before
> >>>>>>> switching those over to trunk.
> >>>>>>>
> >>>>>>> Thoughts?
> >>>>>>
> >>>>>> Yes. I don't believe we should wait until january before discussing
> >>>>>> this topic again. I, for example, cannot spend considerable extra time
> >>>>>> on the issues i put in 1.4, also due to the fact that it's not entirely
> >>>>>> stable.
> >>>>>>
> >>>>>> There are many things i can write about this topic right now but
> >>>>>> don't feel it's necessary. The choice is difficult and perhaps
> >>>>>> painful but when the voting round is opened by our project lead, i
> >>>>>> will vote for promoting 1.x back to trunk.
> >>>>>>
> >>>>>> My apologies for my impatience and pessimism.
> >>>>>>
> >>>>>>> BTW, I have a couple contributions from my CS572: Search Engines
> >>>>>>> class from a year ago that I'd love to port into the Nutch stable
> >>>>>>> branch including Hubs/Authorities ranking and some other goodies.
> >>>>>>> I'll try and work on those over the next few months, I'm just letting
> >>>>>>> everyone know now so I don't forget again :-)
> >>>>>>>
> >>>>>>> Cheers,
> >>>>>>> Chris
> >>>>>>>
> >>>>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>>>>>> Chris Mattmann, Ph.D.
> >>>>>>> Senior Computer Scientist
> >>>>>>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> >>>>>>> Office: 171-266B, Mailstop: 171-246
> >>>>>>> Email: [email protected]
> >>>>>>> WWW: http://sunset.usc.edu/~mattmann/
> >>>>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>>>>>> Adjunct Assistant Professor, Computer Science Department
> >>>>>>> University of Southern California, Los Angeles, CA 90089 USA
> >>>>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>

--
*Lewis*

