Hey Julien,
My option E was pretty much equivalent to B except I specified a time frame
(next 6 months). Are we just
saying that we'll accelerate the time frame to say, umm, next week or the week
after? :)
If so, fine by me. Since I moved nutchbase into the trunk at one point, I'd be
happy
Hi Chris,
I initially respawned this thread with the suggestion not to wait until
January or so before the vote. Hence my apologies for being impatient and
pessimistic about trunk :)
Cheers,
Hey Markus,
No worries. I actually have no dog in this fight to be honest.
I want Gora to be successful, and I want Nutch to be successful.
I haven't contributed much to Nutch 2.0 trunk but I have been
to the 1.x series branch. I wish I knew more about Gora's internals (and
am trying to
Glad to see we're making progress here.
Same with me, I am ready to move on with the project and move out of this
'rut' we have been in with trunk.
Thanks
On Sat, Sep 17, 2011 at 6:56 PM, Mattmann, Chris A (388J)
chris.a.mattm...@jpl.nasa.gov wrote:
Am happy to call for a vote on the future of Nutch 2.0 if you want. Shall we
reduce the various options described before to a single one?
Julien
On 15 September 2011 19:55, Markus Jelsma markus.jel...@openindex.io wrote:
Hi Guys,
I thought I'd chime in on this thread. My comments below:
Hi Julien,
I didn't want to jump ship with this one, but it seems that the binding
community has already spoken their mind, and I for one shadow your
suggestion.
It's clear that trunk as it currently exists is not bleeding edge, there
have been too many broken fronts to launch a concentrated
Option B) Shelve trunk in a branch and promote 1.4 to trunk. We can always
choose to hardwire HBASE (option D) later.
Markus
Why don't we just collect VOTEs for each of the options a-e, and then
figure out based on that if there is a majority. If there's no majority, we
can whittle it down to, say, the top 2-3, and then VOTE on those, looking
for majority again.
Cheers,
Chris
On Sep 16, 2011, at 11:44 AM, Markus
Hi Guys,
I thought I'd chime in on this thread. My comments below:
I understand and share your frustration, however you need to bear in mind
that things are done only if people volunteer and have time - usually
taken from their holiday, weekends, evenings. Chris (who is the de facto
On Thu, Sep 15, 2011 at 9:55 PM, Markus Jelsma
markus.jel...@openindex.io wrote:
There are many things I could write about this topic right now, but I don't feel
it's necessary. The choice is difficult and perhaps painful, but when the
voting round is opened by our project lead, I will vote for
Hi Tom,
I have been using Nutch 1.x for the last 9 months or so and it works well
for large scale crawls up to around a billion pages. However, the inherent
lack of random access in HDFS really starts to become a burden on our Hadoop
cluster when going through the whole
Julien, devs, users,
I'd like to see bugs fixed in 2.0 but some of them are way out of my league or
would cost me an absurd amount of time. I'd also really like to use Gora but
Gora must be maintained. Gora will play a fundamental role in 2.0 and if
something is broken there it is not trivial
Hi,
Without changing the flow of conversation and the points which have already
been touched upon, I would like to add:
I am really split here between a couple of decisions. I like the abstraction
that Gora provides, even though it is somewhat of a pain to configure, this
also presents a barrier
Julien,
On Tue, Aug 9, 2011 at 10:10 AM, Julien Nioche
lists.digitalpeb...@gmail.com wrote:
Hi Kirby,
Grumble, Grumble. (adding dev@nutch, as that is more than likely
where this discussion really belongs)...
am adding gora-...@incubator.apache.org as well
It'd be really nice if
Hi All,
I have been using Nutch 1.x for the last 9 months or so and it works well for
large scale crawls up to around a billion pages. However, the inherent lack of
random access in HDFS really starts to become a burden on our Hadoop cluster
when going through the whole generate/update/fetch