[Solr Wiki] Update of "SolrCloud" by YonikSeeley

Apache Wiki Mon, 14 Dec 2009 10:00:36 -0800

The "SolrCloud" page has been changed by YonikSeeley.
The comment on this change is: cluster options.
http://wiki.apache.org/solr/SolrCloud?action=diff&rev1=3&rev2=4

--------------------------------------------------

  
  # Method #3 could move the shards listing for each node to the model of the 
node itself.
  
+ === Cluster Options ===
+ ==== search only ====
+ The most degenerate form of cluster we would support: the user manages 
everything and tells ZooKeeper (zk) where the shards are, and Solr uses that 
information to complete distributed search requests.  Master/slave 
relationships are not exposed.
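
As a rough illustration of the "search only" model, the sketch below turns a shard registry into the `shards` request parameter that Solr's distributed search accepts. The registry layout, shard names, and host addresses are assumptions made for the example; the page does not specify how the data would be stored in zk.

```python
from urllib.parse import urlencode

# Hypothetical shard registry, as it might be read from zk.
# Shard names and host addresses are illustrative only.
SHARD_LOCATIONS = {
    "shardX": ["host1:8983/solr", "host2:8983/solr"],
    "shardY": ["host3:8983/solr"],
}


def build_shards_param(locations):
    """Pick the first listed node for each shard and join them with commas."""
    picks = [nodes[0] for _, nodes in sorted(locations.items()) if nodes]
    return ",".join(picks)


def build_query_string(query, locations):
    """Build the query string for a distributed search over all shards."""
    return urlencode({"q": query, "shards": build_shards_param(locations)})
```

Always picking the first node keeps the sketch short; a real client would probably rotate or load-balance across the nodes serving the same shard.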
+ 
+ There could perhaps be a flag or role that indicates that a node is meant to 
be hit by user requests (as opposed to sub-requests issued as part of a 
distributed search).  This would allow servers to be specialized by function.
+ 
+ In order to further support user partitioning, users should still be able to 
specify a subset of the shards to query.  Perhaps even support optional shard 
groups, so a user could specify that only shards covering SF or NYC should be 
queried?
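
One possible reading of the shard-group idea, sketched under the assumption that groups are simply named lists of shards. The group table and shard names here are hypothetical and only serve to show how a request for "SF or NYC" could resolve to a subset of shards.

```python
# Hypothetical shard-group table; group and shard names are illustrative.
SHARD_GROUPS = {
    "SF": ["shard1", "shard2"],
    "NYC": ["shard2", "shard3"],
}


def shards_for_groups(groups, shard_groups):
    """Return the shards covering the requested groups, without duplicates.

    Order follows the order in which the groups were requested.
    Unknown group names contribute no shards.
    """
    selected = []
    for group in groups:
        for shard in shard_groups.get(group, []):
            if shard not in selected:
                selected.append(shard)
    return selected
```

A shard that belongs to several groups (shard2 above) is queried only once, which matters because distributed search would otherwise return its documents twice.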
+ 
+ Distributed search is optional - a cluster could simply be a number of 
servers with the same shard.
+ 
+ ==== local config ====
+ This could be an option in conjunction with any other cluster model: Solr 
need not load its config from ZooKeeper.  One advantage is that this breaks 
the startup dependency on ZooKeeper: one could start up a Solr server and 
index data to it without ZooKeeper being up.  The registration of this node in 
ZooKeeper could be asynchronous, happening when we finally do connect to zk.
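
The asynchronous registration described here could look roughly like the following sketch. `connect` and `register` are hypothetical callables standing in for a real ZooKeeper client, and the assumption that a failed connection raises `ConnectionError` is made only for the example.

```python
import threading


def register_when_available(connect, register, retry_delay=1.0, stop_event=None):
    """Start a background thread that registers the node once zk is reachable.

    `connect` and `register` are hypothetical stand-ins for a zk client:
    `connect()` returns a session or raises ConnectionError while zk is down,
    and `register(session)` publishes this node. The caller is not blocked,
    so the server can start and accept index updates in the meantime.
    """
    stop_event = stop_event or threading.Event()

    def worker():
        while not stop_event.is_set():
            try:
                session = connect()
            except ConnectionError:
                stop_event.wait(retry_delay)  # zk still down; retry later
                continue
            register(session)
            return

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread
```

Setting `stop_event` lets a server that is shutting down abandon the retry loop cleanly.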
+ 
+ A variant on this could copy config files out of zookeeper to local storage 
to provide the benefit of disconnected operation.
+ 
+ ==== search and replication ====
+ In addition to "search only", this models the master/slave relationship.  
Certain shards are marked as masters, and slaves will automatically enable 
replication to pull from the correct master.
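
To make "pull from the correct master" concrete, here is a minimal sketch of how a slave might look up its master from the cluster model. The node records, role names, and URLs are assumptions for illustration; the page does not define this layout.

```python
# Hypothetical cluster layout; node URLs and role names are illustrative.
NODES = [
    {"url": "http://host1:8983/solr", "shard": "shardX", "role": "master"},
    {"url": "http://host2:8983/solr", "shard": "shardX", "role": "slave"},
    {"url": "http://host3:8983/solr", "shard": "shardY", "role": "master"},
]


def master_for(node, nodes):
    """Return the URL of the master a slave should replicate from.

    Returns None for nodes that are not slaves. Raises ValueError when the
    shard has no master or more than one, since the slave could not tell
    which index to pull.
    """
    if node["role"] != "slave":
        return None
    masters = [
        n["url"]
        for n in nodes
        if n["shard"] == node["shard"] and n["role"] == "master"
    ]
    if len(masters) != 1:
        raise ValueError(
            "expected exactly one master for %s, found %d"
            % (node["shard"], len(masters))
        )
    return masters[0]
```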
+ 
+ Key/doc partitioning is still done by the user (i.e. shards are opaque and 
Solr will not know whether documentA belongs in shardX or shardY).
+ 
+ === Example Scenarios ===
+ User starts up a shard and says it has "shardX" (assume there is not already 
an existing shardX in the cluster).
+ 
  === Other Questions ===
  Is there a single master (cluster manager, not solr master) per collection, 
or a master for all collections in the cluster?
+ 
+ How do we build an index and test it out before adding it to a collection?  
We want to be part of the collection so we can get the config, but we don't 
want searchers to use the index yet.  Perhaps have a shard state that could 
indicate this.
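
A shard state along these lines might be used as in the sketch below: the shard is registered (so it can fetch config) but searchers skip it until it is marked active. The state names "active" and "building" and the record layout are hypothetical.

```python
# Hypothetical shard records; the state names are assumptions for this sketch.
SHARDS = [
    {"name": "shardX", "node": "host1:8983/solr", "state": "active"},
    {"name": "shardY", "node": "host2:8983/solr", "state": "building"},
]


def searchable_nodes(shards):
    """Return only the nodes whose shards are ready to serve searches."""
    return [s["node"] for s in shards if s["state"] == "active"]
```

Once the new index has been tested, flipping its state to "active" would be the only change needed for searchers to start including it.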
+ 
+ Have some sort of command list that every server should execute before 
certain actions (could involve hitting URLs, executing system commands, etc.)?
  
  == Resources ==
  http://sourceforge.net/mailarchive/forum.php?forum_name=bailey-developers
