Dennis Kubes wrote:
Otis Gospodnetic wrote:
I suppose the first thing to do would be to describe the requirements for
this shard management. I imagine you have very specific functionality
in mind from your Wikia Search experience. Mind putting your ideas on
the Wiki? I think it would be very good to share this with
[EMAIL PROTECTED] early on, so we can come up with something general
that fits both Nutch and Solr. It might turn out that this calls for
a separate Lucene project, but we'll see that once the real discussion
starts.
I completely agree. This would be better as a shared project. I will
put my current thoughts down on the Nutch wiki, unless there is already
a discussion going somewhere?
There is a description of a related concept here:
http://wiki.apache.org/hadoop/DistributedLucene . However, this
addresses only the index part of the shard; in our case shards also
contain plain text (for summaries) and the original binary content (for
cached preview), and possibly other parts (NUTCH-466), none of which
is managed by this code.
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com