Re: Lucene-based Distributed Index Leveraging Hadoop

Ian Holsman Wed, 06 Feb 2008 17:31:55 -0800

Ning Li wrote:

One main focus is to provide fault-tolerance in this distributed index
system. Correct me if I'm wrong, but I think SOLR-303 is focusing on merging
results from multiple shards right now. We'd like to start an open source
project for a fault-tolerant distributed index system (or join if one
already exists) if there is enough interest. Making Solr work on top of such
a system could be an important goal and SOLR-303 is a big part of it in that
case.


I guess it depends on how you set up your shards in 303.

We plan on having a master/slave relationship on each shard, so that each shard would sync the same way Solr does currently.


regards
Ian

I should have made it clear that disjoint data sets are not a requirement of
the system.


On Feb 6, 2008 12:57 PM, Ian Holsman <[EMAIL PROTECTED]> wrote:

Hi.
AOL has a couple of projects going on in the Lucene/Hadoop/Solr space,
and we will be pushing more stuff out as we can. We don't have anything
going with Solr over Hadoop at the moment.

I'm not sure if this would be better than what SOLR-303 does, but you
should have a look at the work being done there.

One of the things you mentioned is that the data sets are disjoint.
SOLR-303 doesn't require this, and allows us to have a document stored
in multiple shards (with different caching/update characteristics).
