Re: Future of Nutch 2.0 [Was: Unresolved dependencies org.apache.gora#gora-hbase;0.1: not found in Nutch trunk]

2011-09-17 Thread Mattmann, Chris A (388J)
Hey Julien, My option E was pretty much equivalent to B except I specified a time frame (next 6 months). Are we just saying that we'll accelerate the time frame to say, umm, next week or the week after? :) If so, fine by me. Since I moved nutchbase into the trunk at one point, I'd be happy

Re: Future of Nutch 2.0 [Was: Unresolved dependencies org.apache.gora#gora-hbase;0.1: not found in Nutch trunk]

2011-09-17 Thread Markus Jelsma
Hi Chris, I initially respawned this thread with the suggestion not to wait until January or so before the vote. Hence my apologies for being impatient and pessimistic about trunk :) Cheers, Hey Julien, My option E was pretty much equivalent to B except I specified a time frame (next 6

Re: Future of Nutch 2.0 [Was: Unresolved dependencies org.apache.gora#gora-hbase;0.1: not found in Nutch trunk]

2011-09-17 Thread Mattmann, Chris A (388J)
Hey Markus, No worries. I actually have no dog in this fight to be honest. I want Gora to be successful, and I want Nutch to be successful. I haven't contributed much to Nutch 2.0 trunk but I have been to the 1.x series branch. I wish I knew more about Gora's internals (and am trying to

Re: Future of Nutch 2.0 [Was: Unresolved dependencies org.apache.gora#gora-hbase;0.1: not found in Nutch trunk]

2011-09-17 Thread lewis john mcgibbney
Glad to see we're making progress here. Same with me, I am ready to move on with the project and move out of this 'rut' we have been in with trunk. Thanks On Sat, Sep 17, 2011 at 6:56 PM, Mattmann, Chris A (388J) chris.a.mattm...@jpl.nasa.gov wrote: Hey Markus, No worries. I actually have

Re: Future of Nutch 2.0 [Was: Unresolved dependencies org.apache.gora#gora-hbase;0.1: not found in Nutch trunk]

2011-09-16 Thread Julien Nioche
Am happy to call for a vote on the future of Nutch 2.0 if you want. Shall we reduce the various options described before to a single one? Julien On 15 September 2011 19:55, Markus Jelsma markus.jel...@openindex.io wrote: Hi Guys, I thought I'd chime in on this thread. My comments below:

Re: Future of Nutch 2.0 [Was: Unresolved dependencies org.apache.gora#gora-hbase;0.1: not found in Nutch trunk]

2011-09-16 Thread lewis john mcgibbney
Hi Julien, I didn't want to jump ship with this one, but it seems that the binding community has already spoken their mind, and I for one shadow your suggestion. It's clear that trunk as it currently exists is not bleeding edge; there have been too many broken fronts to launch a concentrated

Re: Future of Nutch 2.0 [Was: Unresolved dependencies org.apache.gora#gora-hbase;0.1: not found in Nutch trunk]

2011-09-16 Thread Markus Jelsma
Option B) Shelve trunk in a branch and promote 1.4 to trunk. We can always choose to hardwire HBase (option D) later. Markus Am happy to call for a vote on the future of Nutch 2.0 if you want. Shall we reduce the various options described before to a single one? Julien On 15 September

Re: Future of Nutch 2.0 [Was: Unresolved dependencies org.apache.gora#gora-hbase;0.1: not found in Nutch trunk]

2011-09-16 Thread Mattmann, Chris A (388J)
Why don't we just collect VOTEs for each of the options a-e, and then figure out based on that if there is a majority. If there's no majority, we can whittle it down to say the top 2-3, and then VOTE on those, looking for majority again. Cheers, Chris On Sep 16, 2011, at 11:44 AM, Markus

Re: Future of Nutch 2.0 [Was: Unresolved dependencies org.apache.gora#gora-hbase;0.1: not found in Nutch trunk]

2011-09-15 Thread Markus Jelsma
Hi Guys, I thought I'd chime in on this thread. My comments below: I understand and share your frustration, however you need to bear in mind that things are done only if people volunteer and have time - usually taken from their holiday, weekends, evenings. Chris (who is the de facto

Re: Future of Nutch 2.0 [Was: Unresolved dependencies org.apache.gora#gora-hbase;0.1: not found in Nutch trunk]

2011-09-15 Thread Sami Siren
On Thu, Sep 15, 2011 at 9:55 PM, Markus Jelsma markus.jel...@openindex.io wrote: There are many things I can write about this topic right now but don't feel it's necessary. The choice is difficult and perhaps painful but when the voting round is opened by our project lead, I will vote for

Re: Future of Nutch 2.0 [Was: Unresolved dependencies org.apache.gora#gora-hbase;0.1: not found in Nutch trunk]

2011-08-10 Thread Julien Nioche
Hi Tom, I have been using Nutch 1.x for the last 9 months or so and it works well for large scale crawls up to around a billion pages. However, the inherent lack of random access in HDFS really starts to become a burden on our hadoop cluster when going through the whole

Re: Future of Nutch 2.0 [Was: Unresolved dependencies org.apache.gora#gora-hbase;0.1: not found in Nutch trunk]

2011-08-10 Thread Markus Jelsma
Julien, devs, users, I'd like to see bugs fixed in 2.0 but some of them are way out of my league or would cost me an absurd amount of time. I'd also really like to use Gora but Gora must be maintained. Gora will play a fundamental role in 2.0 and if something is broken there it is not trivial

Re: Future of Nutch 2.0 [Was: Unresolved dependencies org.apache.gora#gora-hbase;0.1: not found in Nutch trunk]

2011-08-10 Thread lewis john mcgibbney
Hi, Without changing the flow of conversation and the points which have already been touched upon, I would like to add: I am really split here between a couple of decisions. I like the abstraction that Gora provides, even though it is somewhat of a pain to configure; this also presents a barrier

Re: Future of Nutch 2.0 [Was: Unresolved dependencies org.apache.gora#gora-hbase;0.1: not found in Nutch trunk]

2011-08-09 Thread Kirby Bohling
Julien, On Tue, Aug 9, 2011 at 10:10 AM, Julien Nioche lists.digitalpeb...@gmail.com wrote: Hi Kirby, Grumble, Grumble. (adding dev@nutch, as that is more than likely where this discussion really belongs)... am adding gora-...@incubator.apache.org as well It'd be really nice if

RE: Future of Nutch 2.0 [Was: Unresolved dependencies org.apache.gora#gora-hbase;0.1: not found in Nutch trunk]

2011-08-09 Thread Tom Davidson
Hi All, I have been using Nutch 1.x for the last 9 months or so and it works well for large scale crawls up to around a billion pages. However, the inherent lack of random access in HDFS really starts to become a burden on our hadoop cluster when going through the whole generate/update/fetch