Julien,
On Tue, Aug 9, 2011 at 10:10 AM, Julien Nioche <lists.digitalpeb...@gmail.com> wrote:

> Hi Kirby,
>
>> Grumble, Grumble. (adding dev@nutch, as that is more than likely
>> where this discussion really belongs)...
>>
>
> am adding gora-dev@incubator.apache.org as well
>
>
>> It'd be really nice if folks could just follow the commands in the
>> nightly build, and get a build pushed out. I've pointed this out
>> previously, and was told this would be fixed "shortly" (right after
>> GORA-0.1 finally got released, but not published in public maven repo,
>> which as far as I know, it still isn't published, but I stopped
>> checking on it).
>>
>
> I understand and share your frustration, however you need to bear in mind
> that things are done only if people volunteer and have time - usually taken
> from their holiday, weekends, evenings. Chris (who is the de facto release
> master for Nutch and Gora) has not had the time and nobody else has
> volunteered to do it.
>

I don't mean to be a complainer. I'd happily try to contribute fixes on this one, but most of this would likely have to be done on Hudson/Jenkins. I think you're addressing a larger issue than I really meant.

My point was that however a developer does a build on their desktop, that same process should be duplicated on Hudson/Jenkins. If you need the trunk of Gora, is it possible to check it out, build it, install it to a local repo, and then build Nutch via Hudson/Jenkins? Whatever it takes to get a build should be what the CI server is doing. The repeatable but failing builds are what really confuse and frustrate me. The nightly/CI build should be automating what devs do on their desktops, to ensure it'll work on a clean setup. Right now, it just tells you that for the last year, the totally obvious steps will lead to a failure. I can figure out all of the configuration issues for Hudson/Jenkins to make it work, if somebody can push that into the Apache version.

However, I think answering your questions first would be a good idea. My totally non-binding +1 for setting up a CI/nightly build for the various stable branches too; the only one I found on Apache was for trunk.

>
>> As it happens, yesterday was the 1 year anniversary of the last
>> successful Hudson/Jenkins build... If that actually worked, we could
>> point people towards it as a useful recipe for how to get a build
>> working off trunk. I haven't been following Nutch too closely, but it
>> always strikes me as really odd, that there's a nightly build and it
>> doesn't bother anybody that it fails all the time (and that there
>> isn't a nightly build for the stable branches).
>>
>
> The real issue behind all this is what we should do with Nutch 2.0. What
> follows is only my opinion and I would love to hear what others have to say
> on this subject.
>
> Since we (actually mostly Dogacan) wrote 2.0 and delegated the storage to
> Gora, the latter hasn't really taken off since incubation. There have been
> some modest contributions to it but it does not seem to be used much and
> there is virtually nothing happening on it in terms of development. More
> worryingly, the people who initially contributed to it are not very active
> on the project (such is life, new jobs, different projects, etc...)
> anymore. As for Nutch 2.0, it hasn't made any progress in the last 12
> months : we still have the same bugs, the tests do not work, the build has
> to be done manually etc...
>
> At the same time, there has been a new lease of life into Nutch as a whole
> : there is definitely more activity on the mailing lists, new users, new
> active committers etc... and quite a few bugfixes and improvements - most
> of them backported from what had been done in the trunk and people seem
> fairly happy with what we can do with 1.4
>
> So the question is : what shall we do with 2.0?
> Here are a few possibilities :
>
> a) put some effort into it, fix the bugs and make it so that it can be used
> instead of 1.x
> b) shelve it and leave it for enthusiasts to play with + make 1.x the trunk
> again
> c) do nothing : keep 2.0 and 1.x in parallel (but having to maintain two
> branches is quite a pain)
> d) abandon the idea of a neutral storage layer with Gora and hardwire it to
> e.g. HBase
>
> Option (a) has not happened in the last 12 months and I am not very hopeful
> about it.
>
> What do you guys think?
>

I know nothing about the 2.0 branch, and can't really contribute to that conversation (that job issue interferes with all my free time).

Kirby

> Julien
>
> --
> Open Source Solutions for Text Engineering
>
> http://digitalpebble.blogspot.com/
> http://www.digitalpebble.com
>