On 1/29/2014 12:48 PM, Jeff Wartes wrote:
> And that, I think, is my misunderstanding. I had assumed that the link
> between a node and the collections it belongs to would be the (possibly
> chroot'ed) zookeeper reference *itself*, not the node's directory
> structure. Instead, it appears that ZK is simply a repository for the
> collection configuration, where nodes may look up what they need based on
> filesystem core references.

Work is underway towards a new mode where zookeeper is the ultimate
source of truth, and each node will behave accordingly to implement and
maintain that truth.  I can't seem to locate a Jira issue for it,
unfortunately.  It's possible that one doesn't exist yet, or that it has
an obscure title.  Mark Miller is the one who really understands the
full details, as he's a primary author of SolrCloud code.

Currently, what SolrCloud considers to be "truth" is dictated by both
zookeeper and an amalgamation of which cores each server actually has
present.  The collections API modifies both.  With the older solr.xml
format (supported in all current and future 4.x versions), the list of
cores is kept in solr.xml.  If you're using the new solr.xml format
(available in 4.4 and later, mandatory in 5.0), the cores are found via
Core Discovery.  Zookeeper has a list
of everything and coordinates the cluster state, but has no real control
over the cores that actually exist on each server.  When the two sources
of truth disagree, nothing happens automatically to fix the situation;
manual intervention is required.
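
To make the "two sources of truth" idea concrete, here is a rough Python
sketch of the kind of comparison you would do by hand on one node.  This
is not Solr code and does not call any Solr or Zookeeper API.  The
function name and the core names are made up for illustration; in
practice you would fill the two lists from the cluster state in
Zookeeper and from the cores actually present on the server.

```python
def find_disagreements(zk_cores, local_cores):
    """Compare the cores Zookeeper lists for a node with the cores
    that are actually present on that node (hypothetical helper)."""
    zk = set(zk_cores)
    local = set(local_cores)
    return {
        # Listed in Zookeeper, but no such core exists on the server.
        "missing_on_server": sorted(zk - local),
        # Present on the server, but Zookeeper does not list it.
        "unknown_to_zk": sorted(local - zk),
    }


# Made-up example data for a single node.
zk_view = ["collection1_shard1_replica1", "collection1_shard2_replica1"]
server_view = ["collection1_shard1_replica1", "old_core"]

report = find_disagreements(zk_view, server_view)
for problem, cores in report.items():
    print(problem, cores)
```

Anything that shows up in either list is a disagreement that, as far as
I know, currently has to be cleaned up manually.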

Any errors in my understanding of SolrCloud are my own.  I don't claim
that what I just wrote is error-free, but I am pretty sure that it's
essentially correct.

Thanks,
Shawn
