Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change 
notification.

The "ZooKeeperIntegration" page has been changed by ShalinMangar.
The comment on this change is: Added issues, re-organized content, added 
example configuration.
http://wiki.apache.org/solr/ZooKeeperIntegration?action=diff&rev1=4&rev2=5

--------------------------------------------------

+ <!> [[Solr1.5]]
+ 
+ <<TableOfContents>>
+ 
  = Introduction =
  
  Integrating Solr and !ZooKeeper allows a lot more flexibility for dynamic, 
distributed configuration.  Additionally, it does not break 
back-compatibility and it can use the existing Solr infrastructure.
  
+ See
+  * http://hadoop.apache.org/zookeeper
- See https://issues.apache.org/jira/browse/SOLR-1277
+  * https://issues.apache.org/jira/browse/SOLR-1277
+  * https://issues.apache.org/jira/browse/SOLR-1431
+  * https://issues.apache.org/jira/browse/SOLR-1585
  
- See http://hadoop.apache.org/zookeeper
+ = ZooKeeper Component =
+ There will be a !ZooKeeperComponent in !SolrCore configured through 
solrconfig.xml. The !ZooKeeperComponent may expose the !ZooKeeper client 
instance, which could be used by any plugin for purposes such as adding/removing 
key/values or performing master election.
  
- = Architecture =
+ == Configuration ==
  
- == Distributed Search ==
+ An example configuration for the !ZooKeeperComponent in solrconfig may look 
like the following:
+ {{{
+ <zookeeper>
+     <!-- See the ZooKeeper docs -->
+     <str name="zkhostPorts">localhost:2181</str>
+     <!-- TODO: figure out how to do this programmatically -->
+     <str name="me">localhost:8983/solr/core1</str>
+     <!-- Timeout for the ZooKeeper.  Optional.  Default 10000 -->
+     <!-- Timeout in ms -->
+     <str name="timeout">5000</str>
+     <!-- This is the directory in which this zk node will be added. The name 
of the node is a sequential number automatically assigned by ZooKeeper. The 
value is a NamedList which may contain as many values as other components wish 
to add. This component only adds the key/value me=localhost:8983/solr/core1. 
For instance, the ShardHandler may add a key/value shard=shard1. 
ReplicationHandler may add something like version=124544 etc. -->
+     <str name="nodesDir">/myApp/solr</str>
+  </zookeeper>
+ }}}
  
- For distributed search, create a new !ShardHandler that moves the shard 
calculation code from !QueryComponent and handles both the current approach and 
the !ZooKeeper approach.
+ = ZooKeeper Aware Distributed Search =
  
- On startup, !ZooKeeper configuration contains whether the node is a shard or 
not.  If it is, it registers itself with !ZooKeeper by adding a value under the 
appropriate path in !ZooKeeper (this is configurable).  
+ For distributed search, a new !ShardHandler plugin will be created that moves 
the shard calculation code from !QueryComponent and handles both the current 
approach and the !ZooKeeper approach. There will be a new !ShardHandler called 
!ZooKeeperAwareShardHandler which will use the !ZooKeeper component to figure 
out the available shards and their nodes.
  
+ !ZooKeeperAwareShardHandler's configuration will contain the name of the 
shard to which this node belongs. On startup, it will get the core's 
!ZooKeeperComponent and add a key/value (the shard name) in !ZooKeeper to the 
current core's node.
- For example, if you a shard could register itself in the solr shard group 
"solr_shards" as:
- solr_shards/192.168.0.1_8080_solr [192.168.0.1:8080/solr]  // Note, the [] 
contain the actual address that is used when constructing the rb.shards value
  
- Thus, solr_shards with two nodes might look like:
- solr_shards/
-   192.168.0.1_8080_solr [192.168.0.1:8080/solr]
-   192.168.0.2_8080_solr [192.168.0.2:8080/solr]
+ == Configuration ==
+ {{{
+ <requestHandler name="standard" class="solr.SearchHandler" default="true">
+     <!-- other params go here -->
+  
+     <shardHandler class="ZooKeeperAwareShardHandler">
+        <str name="shardName">shard1/nodes</str>
+     </shardHandler>
+ </requestHandler>
+ }}}
  
+ With the above configuration, on initialization, the 
!ZooKeeperAwareShardHandler will get the ZKClient from the !SolrCore and 
register itself as a sequential node under the path "/myApp/solr/shard1/nodes" 
with the value me=localhost:8983/solr/core1.
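
The registration path in this example appears to be the nodesDir value from the zookeeper configuration followed by the shardName value from the shard handler configuration. The following is a minimal sketch of that join, assuming this is how the two settings combine; the class and method names are hypothetical and are not part of any Solr patch.

```java
class ShardPathSketch {
    // Hypothetical helper: joins nodesDir and shardName into one ZooKeeper
    // path, dropping one trailing or leading slash at the seam.
    static String registrationPath(String nodesDir, String shardName) {
        String base = nodesDir.endsWith("/")
                ? nodesDir.substring(0, nodesDir.length() - 1)
                : nodesDir;
        String child = shardName.startsWith("/")
                ? shardName.substring(1)
                : shardName;
        return base + "/" + child;
    }

    public static void main(String[] args) {
        // Values taken from the two example configurations on this page.
        System.out.println(registrationPath("/myApp/solr", "shard1/nodes"));
        // prints "/myApp/solr/shard1/nodes"
    }
}
```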
+ 
+ TODO: Figure out where "me" should live: in the zk configuration or in the 
shard handler configuration.
+ 
- Shards are ephemeral nodes in ZK speak and thus go away if the node dies.
+ Shards are registered as ephemeral, sequential nodes in ZK speak and thus go 
away if the Solr node dies.
  
  Then, when a query comes in, the !ShardsComponent can build the 
!ResponseBuilder.shards value appropriately based on what's contained in the 
shard group that it is participating in.  This shard group approach should 
allow for a fanout approach to be employed.
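
As a rough illustration of that step, the sketch below builds a comma-separated shards value from the children of a shard group. It assumes each child node's "me" address has already been read from !ZooKeeper into a map, and it treats every live node as a distinct shard; the class name, method name, node names and second address are hypothetical.

```java
import java.util.Map;
import java.util.StringJoiner;
import java.util.TreeMap;

class ShardsValueSketch {
    // children: sequential node name -> the "me" address stored in that node.
    static String buildShards(Map<String, String> children) {
        StringJoiner shards = new StringJoiner(",");
        // Copying into a TreeMap gives a stable order by node name.
        for (String address : new TreeMap<>(children).values()) {
            shards.add(address);
        }
        return shards.toString();
    }

    public static void main(String[] args) {
        Map<String, String> children = new TreeMap<>();
        children.put("0000000001", "localhost:8983/solr/core1");
        children.put("0000000002", "localhost:7574/solr/core1");
        System.out.println(buildShards(children));
        // prints "localhost:8983/solr/core1,localhost:7574/solr/core1"
    }
}
```

Because the nodes are ephemeral, a dead Solr node simply disappears from the map on the next read, so it drops out of the shards value without extra bookkeeping.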
  
@@ -52, +83 @@

  
  Through the ZK request handler, slaves can be moved around, at which point they 
will pull the index from the master in their group and thus you can have 
rebalancing.  Additionally, new nodes that come online without an index will go to 
their master and get the index.  The replication handler already handles 
replicating configuration files, so this is just a config issue.
  
- = Implementation =
  
- The current patch implements this all by adding a !ZooKeeper onto the 
!SolrCore and configuring it via the solrconfig.xml.  The current patch only 
supports distributed search, but has some of the plumbing for setting up the 
master group stuff.  The !ReplicationHandler has not been implemented yet.
- 
- = Configuration and Running =
- 
- Configure the !ZooKeeperComponent in solrconfig as follows:
- {{{
- <zookeeper>
-         <!-- See the ZooKeeper docs -->
-     <str name="zkhostPorts">localhost:2181</str>
-     <!-- TODO: figure out how to do this programmatically -->
-     <str name="me">localhost:8983/solr/core1</str>
-     <!-- Timeout for the ZooKeeper.  Optional.  Default 10000 -->
-     <!-- Timeout in ms -->
-     <str name="timeout">5000</str>
-     <!- this is the directory in which this node will be added to. The name 
of the node is a sequential number automatically assigned by zookeeper. The 
value is a Namedlist which may contain as many values as other components wish 
to add. This component only adds the key-> value me=localhost:8983/solr/core1. 
For instance , the Shardhandler may add a key value shard=shard1 . 
ReplicationHandler may add something like version=124544 etc. -->
-     <str name="nodesDir">/domain/shard1/nodes</str>
-  </zookeepr>
- }}}
- 
- The ZookeeperComponent may expose the Zookeeper client instance which could 
be used by other plugins for other purposes such as Master Election etc
  
  The !ShardHandler is set up automatically.
  
