Dear Wiki user, You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.
The "ZooKeeperIntegration" page has been changed by ShalinMangar.
The comment on this change is: Added issues, re-organized content, added example configuration.
http://wiki.apache.org/solr/ZooKeeperIntegration?action=diff&rev1=4&rev2=5

--------------------------------------------------

+ <!> [[Solr1.5]]
+
+ <<TableOfContents>>
+
  = Introduction =
  Integrating Solr and !ZooKeeper allows us a lot more flexibility for dynamic, distributed configuration. Additionally, it does not require a breakage of back-compatibility and it can use the existing Solr infrastructure.
+ See
+  * http://hadoop.apache.org/zookeeper
- See https://issues.apache.org/jira/browse/SOLR-1277
+  * https://issues.apache.org/jira/browse/SOLR-1277
+  * https://issues.apache.org/jira/browse/SOLR-1431
+  * https://issues.apache.org/jira/browse/SOLR-1585
- See http://hadoop.apache.org/zookeeper
+ = ZooKeeper Component =
+ There will be a !ZooKeeperComponent in !SolrCore configured through the solrconfig.xml. The !ZooKeeperComponent may expose the !ZooKeeper client instance, which could be used by any plugin for purposes such as adding/removing key/values or performing master election.
- = Architecture =
+ == Configuration ==
- == Distributed Search ==
+ An example configuration for the !ZooKeeperComponent in solrconfig may look like the following:
+ {{{
+ <zookeeper>
+   <!-- See the ZooKeeper docs -->
+   <str name="zkhostPorts">localhost:2181</str>
+   <!-- TODO: figure out how to do this programmatically -->
+   <str name="me">localhost:8983/solr/core1</str>
+   <!-- Timeout for the ZooKeeper. Optional. Default 10000 -->
+   <!-- Timeout in ms -->
+   <str name="timeout">5000</str>
+   <!-- this is the directory in which this zk node will be added. The name of the node is a sequential number automatically assigned by zookeeper. The value is a NamedList which may contain as many values as other components wish to add. This component only adds the key-> value me=localhost:8983/solr/core1.
For instance, the ShardHandler may add a key value shard=shard1. ReplicationHandler may add something like version=124544 etc. -->
+   <str name="nodesDir">/myApp/solr</str>
+ </zookeeper>
+ }}}
- For distributed search, create a new !ShardHandler that moves the shard calculation code from !QueryComponent and handles both the current approach and the !ZooKeeper approach.
+ = ZooKeeper Aware Distributed Search =
- On startup, !ZooKeeper configuration contains whether the node is a shard or not. If it is, it registers itself with !ZooKeeper by adding a value under the appropriate path in !ZooKeeper (this is configurable).
+ For distributed search, a new !ShardHandler plugin will be created that moves the shard calculation code from !QueryComponent and handles both the current approach and the !ZooKeeper approach. There will be a new !ShardHandler called !ZooKeeperAwareShardHandler which will use the !ZooKeeper component to figure out the available shards and their nodes.
+ !ZooKeeperAwareShardHandler's configuration will contain the name of the shard to which this node belongs. On startup, it will get the core's !ZooKeeperComponent and add a key/value (the shard name) in !ZooKeeper to the current core's node.
- For example, if you a shard could register itself in the solr shard group "solr_shards" as:
- solr_shards/192.168.0.1_8080_solr [192.168.0.1:8080/solr] // Note, the [] contain the actual address that is used when constructing the rb.shards value
- Thus, solr_shards with two nodes might look like:
- solr_shards/
- 192.168.0.1_8080_solr [192.168.0.1:8080/solr]
- 192.168.0.2_8080_solr [192.168.0.2:8080/solr]
+ == Configuration ==
+ {{{
+ <requestHandler name="standard" class="solr.SearchHandler" default="true">
+   <!-- other params go here -->
+
+   <shardHandler class="ZooKeeperAwareShardHandler">
+     <str name="shardName">shard1/nodes</str>
+   </shardHandler>
+ </requestHandler>
+ }}}
+ With the above configuration, on initialization, the !ZooKeeperAwareShardHandler will get the ZKClient from the !SolrCore and register itself as a sequential node under the path "/myApp/solr/shard1/nodes" with the value me=localhost:8983/solr/core1.
+
+ TODO: Figure out where "me" should live: zk configuration or shard handler configuration.
+
- Shards are ephemeral nodes in ZK speak and thus go away if the node dies.
+ Shards are ephemeral and sequential nodes in ZK speak and thus go away if the node dies.
  Then, when a query comes in, the !ShardsComponent can build the !ResponseBuilder.shards value appropriately based on what's contained in the shard group that it is participating in.
  This shard group approach should allow for a fanout approach to be employed.
@@ -52, +83 @@
  Through the ZK req handler, slaves can be moved around, at which point they will pull the index from the master in their group and thus you can have rebalancing. Additionally, new nodes that come online w/o an index will go to their master and get the index. The replication handler already handles replicating configuration files, so this is just a config issue.
- = Implementation =
- The current patch implements this all by adding a !ZooKeeper onto the !SolrCore and configuring it via the solrconfig.xml.
The current patch only supports distributed search, but has some of the plumbing for setting up the master group stuff. The !ReplicationHandler has not been implemented yet.
-
- = Configuration and Running =
-
- Configure the !ZooKeeperComponent in solrconfig as follows:
- {{{
- <zookeeper>
- <!-- See the ZooKeeper docs -->
- <str name="zkhostPorts">localhost:2181</str>
- <!-- TODO: figure out how to do this programmatically -->
- <str name="me">localhost:8983/solr/core1</str>
- <!-- Timeout for the ZooKeeper. Optional. Default 10000 -->
- <!-- Timeout in ms -->
- <str name="timeout">5000</str>
- <!- this is the directory in which this node will be added to. The name of the node is a sequential number automatically assigned by zookeeper. The value is a Namedlist which may contain as many values as other components wish to add. This component only adds the key-> value me=localhost:8983/solr/core1. For instance , the Shardhandler may add a key value shard=shard1 . ReplicationHandler may add something like version=124544 etc. -->
- <str name="nodesDir">/domain/shard1/nodes</str>
- </zookeepr>
- }}}
-
- The ZookeeperComponent may expose the Zookeeper client instance which could be used by other plugins for other purposes such as Master Election etc
  The !ShardHandler is automatically setup.
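
The step in which the shards value for a query is built from the nodes registered under a shard group can be illustrated with a small Java sketch. It is not taken from the SOLR-1277 patch and does not use the !ZooKeeper client. The class name ShardListSketch, its method names, and the assumption that each node's value has already been decoded into a key/value map holding "me" and "shard" entries are all hypothetical. Picking the first registered node of each shard is only one possible policy.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical sketch: none of these names come from Solr or the patch.
public class ShardListSketch {

    // Groups the "me" address of every registered node by its "shard" key.
    // The input is keyed by node name (assumed to be the zero-padded
    // sequence number), so iteration follows registration order.
    // Nodes missing either key are skipped.
    public static SortedMap<String, List<String>> groupByShard(
            SortedMap<String, Map<String, String>> nodes) {
        SortedMap<String, List<String>> byShard = new TreeMap<>();
        for (Map<String, String> values : nodes.values()) {
            String shard = values.get("shard");
            String me = values.get("me");
            if (shard == null || me == null) {
                continue;
            }
            byShard.computeIfAbsent(shard, k -> new ArrayList<>()).add(me);
        }
        return byShard;
    }

    // Builds a comma-separated shards value, using the first registered
    // node of each shard (an assumed policy, not the patch's behaviour).
    public static String buildShards(SortedMap<String, Map<String, String>> nodes) {
        List<String> picks = new ArrayList<>();
        for (List<String> addresses : groupByShard(nodes).values()) {
            picks.add(addresses.get(0));
        }
        return String.join(",", picks);
    }

    public static void main(String[] args) {
        SortedMap<String, Map<String, String>> nodes = new TreeMap<>();
        nodes.put("0000000000", Map.of("me", "192.168.0.1:8080/solr", "shard", "shard1"));
        nodes.put("0000000001", Map.of("me", "192.168.0.2:8080/solr", "shard", "shard2"));
        nodes.put("0000000002", Map.of("me", "localhost:8983/solr/core1", "shard", "shard1"));
        // prints 192.168.0.1:8080/solr,192.168.0.2:8080/solr
        System.out.println(buildShards(nodes));
    }
}
```

Because the nodes are ephemeral, a node that dies simply disappears from the input map, and the next call to buildShards falls back to the next registered node of that shard.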
