I can do that, but it will come after I finish some requirements for the next-gen Nutch. :) I do consider shard management to be part of that.

Dennis

Otis Gospodnetic wrote:
And there is http://wiki.apache.org/solr/DistributedSearch , but this talks 
*only* about search.

Dennis, are you the man to take what's on DistributedLucene and 
DistributedSearch and come up with a marriage proposal? :)

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
From: Andrzej Bialecki <[EMAIL PROTECTED]>
To: [email protected]
Sent: Monday, April 14, 2008 1:01:37 PM
Subject: Re: Next Generation Nutch

Dennis Kubes wrote:

Otis Gospodnetic wrote:
I suppose the first thing to do would be to describe the requirements for this shard management. I imagine you have very specific functionality in mind from your Wikia Search experience. Mind putting your ideas on the Wiki? I think it would be very good to share this with [EMAIL PROTECTED] early on, so we can come up with something general that fits both Nutch and Solr. It might turn out that this calls for a separate Lucene project, but we'll see that once the real discussion starts.

I completely agree. This would be better as a shared project. I will put my current thoughts down on the Nutch wiki, unless there is already a discussion going somewhere?

There is a description of a related concept here: http://wiki.apache.org/hadoop/DistributedLucene . However, this addresses only the index part of the shard - in our case shards also contain plain text (for summaries) and the original binary content (for cached preview), and possibly other parts (NUTCH-466), none of which is managed by this code.

