Hi,

thanks all, this has been very instructive. It looks like, in the short term, a combination of replication and sharding based on Upayavira's setup might be the safest thing to do, while in the longer term following the ZooKeeper integration and Solandra development might provide a more dynamic environment and perhaps an easier setup.

Please keep the good suggestions coming if you feel like it.

Thanks again,
Luca
On Dec 1, 2010, at 4:17 AM, Peter Karich wrote:

> Hi,
>
> also take a look at solandra:
>
> https://github.com/tjake/Lucandra/tree/solandra
>
> I don't have it in prod yet but regarding administration overhead it
> looks very promising.
> And you'll get some other neat features like (soft) real time, for free.
> So its same like A) + C) + X) - Y) ;-)
>
> Regards,
> Peter.
>
>
>> Hi,
>> I'd like to know if anybody has suggestions/opinions on what is
>> currently the best architecture for a distributed search system using Solr.
>> The use case is that of a system composed
>> of N indexes, each hosted on a separate machine, each index containing
>> unique content.
>>
>> Options that I know of are:
>>
>> A) Using Solr distributed search
>> B) Using Solr + Zookeeper integration
>> C) Using replication, i.e. each node replicates all the others
>>
>> It seems like options A) and B) would suffer from a fault-tolerance
>> standpoint: if any of the nodes goes down, the search won't -at this time-
>> return partial results, but instead report an exception.
>> Option C) would provide fault tolerance, at least for any search initiated
>> at a node that is available, but would incur into a large replication
>> overhead.
>>
>> Did I get any of the above wrong, or does somebody have some insight on what
>> is the best system architecture for this use case ?
>>
>> thanks in advance,
>> Luca
>
>
> --
> http://jetwick.com twitter search prototype
>
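
For anyone reading this thread later, here is a minimal sketch of how replication and sharding (options A and C above) might be combined on the client side. It is not Upayavira's actual setup, which is not described in this message. It assumes each shard has one or more replicas, picks one live replica per shard, and passes the result to Solr through the `shards` request parameter used by distributed search. The host names, the `is_alive` health check, and both function names are hypothetical.

```python
from urllib.parse import urlencode


def pick_replicas(shard_groups, is_alive):
    """Choose the first live replica of each shard.

    shard_groups: list of lists, one inner list of replica addresses per shard.
    is_alive: hypothetical callable returning True if a node answers.
    Raises RuntimeError when a shard has no live replica, since a
    distributed query would otherwise fail on that shard.
    """
    chosen = []
    for group in shard_groups:
        live = [node for node in group if is_alive(node)]
        if not live:
            raise RuntimeError("no live replica for shard: %r" % (group,))
        chosen.append(live[0])
    return chosen


def distributed_query_url(entry_node, shards, query):
    """Build a select URL that fans the query out via the `shards` parameter."""
    params = urlencode({"q": query, "shards": ",".join(shards)})
    return "http://%s/select?%s" % (entry_node, params)
```

With two shards of two replicas each, and one replica of the second shard down, the query is still routed to a complete set of shards:

```python
groups = [
    ["node-a1:8983/solr", "node-a2:8983/solr"],  # replicas of shard A
    ["node-b1:8983/solr", "node-b2:8983/solr"],  # replicas of shard B
]
down = {"node-b1:8983/solr"}
chosen = pick_replicas(groups, lambda node: node not in down)
print(distributed_query_url(chosen[0], chosen, "title:solr"))
```

The replication overhead is then limited to the replicas of each shard rather than every node replicating all the others.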