Re: Katta's goodness for Solr

Grant Ingersoll Wed, 12 Nov 2008 06:52:29 -0800


On Nov 11, 2008, at 1:15 PM, Otis Gospodnetic wrote:

Quick thought. I saw Stefan's Katta presentation last night. Katta seems nice and simple. If I understood correctly, juicy stuff that is interesting to Solr is:
- Katta has a notion of a Primary Master and N Secondary Slaves (no SPOF there)
- Search Nodes serve index shards copied locally from some shared storage
- Zookeeper instances (again Primary Master and N Secondary Slaves) that facilitate communication among distributed components
The master:
-- knows how to distribute a set of index shards it is given across a number of search nodes (distribution policy pluggable, similar to Hadoop's, but different)
-- has a map of which shard is on which search node (in Zookeeper)
-- knows how to replicate each shard (replication factor configurable)
-- knows when a search node goes down (via Zookeeper notification)
-- knows how to create more replicas of shards on dead search node (and remove extra replicas when search node is revived)
-- can notify search nodes when a new index is available (via Zookeeper)
More in:
http://joa23.files.wordpress.com/2008/09/katta-overview.pdf
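To make the master's bookkeeping above concrete, here is a rough sketch of shard placement with a replication factor, plus re-replication when a search node dies. It is not taken from Katta's code: the function names, the round-robin policy, and the least-loaded repair rule are all assumptions for illustration, and the placement map is a plain dict rather than anything stored in Zookeeper.

```python
def place_shards(shards, nodes, replication_factor):
    """Hypothetical round-robin policy: put each shard on
    `replication_factor` distinct nodes."""
    if replication_factor > len(nodes):
        raise ValueError("replication factor exceeds number of nodes")
    placement = {}
    for i, shard in enumerate(shards):
        placement[shard] = [
            nodes[(i + r) % len(nodes)] for r in range(replication_factor)
        ]
    return placement


def handle_node_down(placement, dead_node, live_nodes, replication_factor):
    """Return a new shard -> nodes map with the dead node removed and
    missing replicas recreated on the least-loaded live nodes."""
    # Count how many shards each live node currently serves.
    load = {n: 0 for n in live_nodes}
    for holders in placement.values():
        for n in holders:
            if n in load:
                load[n] += 1

    new_placement = {}
    for shard, holders in placement.items():
        holders = [n for n in holders if n != dead_node]
        while len(holders) < replication_factor:
            candidates = [n for n in live_nodes if n not in holders]
            if not candidates:
                break  # not enough live nodes to restore full replication
            target = min(candidates, key=lambda n: (load[n], n))
            holders.append(target)
            load[target] += 1
        new_placement[shard] = holders
    return new_placement
```

Swapping `place_shards` for another function with the same signature is roughly what a pluggable distribution policy would amount to in this sketch.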

Paul Noble will like slide #13 ;)

In particular, I think that:
- Making use of Zookeeper for index snapshot + replication might be useful (Master publishes the info about a new snapshot to Zookeeper and Search Slaves get notified immediately and start copying the index)
- Making use of Zookeeper for keeping a map of index shards + applying a replication factor would be very useful
- Making use of a pluggable shard placement policy would be useful
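The snapshot idea in the first point could look roughly like the sketch below. Note that `InMemoryCoordinator` is a hypothetical in-process stand-in, not the Zookeeper API, and the path `/index/latest_snapshot` and the class names are made up for illustration. Real Zookeeper watches fire once and must be re-registered, which this simplification ignores.

```python
class InMemoryCoordinator:
    """Hypothetical stand-in for a coordination service (NOT the Zookeeper API)."""

    def __init__(self):
        self._data = {}
        self._watchers = {}

    def watch(self, path, callback):
        # Register a callback that runs on every change to `path`.
        self._watchers.setdefault(path, []).append(callback)

    def set(self, path, value):
        # Store the value, then notify everyone watching this path.
        self._data[path] = value
        for callback in self._watchers.get(path, []):
            callback(value)


class SearchSlave:
    """A slave that reacts to snapshot announcements from the master."""

    def __init__(self, name, coordinator, snapshot_path="/index/latest_snapshot"):
        self.name = name
        self.current_snapshot = None
        coordinator.watch(snapshot_path, self.on_new_snapshot)

    def on_new_snapshot(self, snapshot_id):
        # A real slave would start copying the index files here;
        # this sketch only records which snapshot it was told about.
        self.current_snapshot = snapshot_id


# The master publishes a new snapshot; both slaves are notified at once.
coordinator = InMemoryCoordinator()
slaves = [SearchSlave("slave-a", coordinator), SearchSlave("slave-b", coordinator)]
coordinator.set("/index/latest_snapshot", "snapshot.20081112")
```

The point of the design is that slaves get pushed the news instead of polling the master on a timer.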

Thoughts?

+1. Zookeeper does seem like a logical thing to add to Solr to handle this, and it fits with Yonik's suggestions about Solr 2.0, I think, as long as we also can easily maintain the simplicity of a single node setup, which still serves most people quite well.

Also:
While Katta provides shard->search server functionality via pluggable impl, what both Solr and Katta are still missing is the doc->shard functionality. However, this might not be terribly hard if we do something similar to Katta's pluggable shard->search server distribution policy. Please mind I'm saying this without having looked at any of the Katta code.
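For what it's worth, a pluggable doc->shard policy might be as small as the sketch below. Neither Solr nor Katta has this class; `HashDocToShardPolicy` and `shard_for` are hypothetical names, and hashing the document id with CRC32 is just one plausible policy.

```python
import zlib


class HashDocToShardPolicy:
    """Hypothetical policy: route a document to a shard by hashing its id."""

    def __init__(self, num_shards):
        if num_shards < 1:
            raise ValueError("need at least one shard")
        self.num_shards = num_shards

    def shard_for(self, doc_id):
        # crc32 is stable across processes and machines, unlike Python's
        # built-in hash() for strings, so every indexer agrees on the shard.
        return zlib.crc32(doc_id.encode("utf-8")) % self.num_shards


policy = HashDocToShardPolicy(num_shards=4)
shard = policy.shard_for("doc-12345")  # always the same shard for this id
```

Any other object with a `shard_for(doc_id)` method (say, routing by date or by customer) could be plugged in the same way. One caveat with plain modulo hashing is that changing the number of shards remaps most documents.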
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


--------------------------
Grant Ingersoll

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ
