[ https://issues.apache.org/jira/browse/SOLR-1724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12801296#action_12801296 ]

Ted Dunning commented on SOLR-1724:
-----------------------------------

{quote}
We actually started out that way... (when a node went down there wasn't really 
any trace it ever existed) but have been moving away from it.
ZK may not just be a reflection of the cluster but may also control certain 
aspects of the cluster that you want persistent. For example, marking a node as 
"disabled" (i.e. don't use it). One could create APIs on the node to enable and 
disable and have that reflected in ZK, but it seems like more work than simply 
saying "change this znode".
{quote}

I see this as a conflation of two or three goals that leads to trouble.  All 
of the goals are worthy and important, but conflating them leads to 
difficult problems.  Taken separately, the goals are easily met.

One goal is the reflection of current cluster state.  That is most reliably 
done using ephemeral files roughly as I described.
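
For concreteness, a minimal sketch of what I mean using the ZooKeeper Java 
client (the paths and payload here are invented for illustration, not 
something from the patch; the parent path is assumed to already exist as a 
regular znode):

{code:java}
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class LiveNodePublisher {

    // Publish this host's live status as an ephemeral znode.
    public static void publish(ZooKeeper zk, String hostname)
            throws KeeperException, InterruptedException {
        // EPHEMERAL: ZK deletes the znode automatically when the session
        // ends, so a crashed or hung host leaves no stale "alive" record.
        zk.create("/production/live/" + hostname,
                  "serving".getBytes(),
                  ZooDefs.Ids.OPEN_ACL_UNSAFE,
                  CreateMode.EPHEMERAL);
    }
}
{code}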

Another goal is the reflection of constraints or desired state of the cluster.  
This is best handled as you describe, with permanent files since you don't want 
this desired state to disappear when a node disappears.
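
And the complementary persistent side, again with invented paths: the master 
(or an operator) writes the desired state once, and it survives any session 
loss:

{code:java}
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class DisabledMarker {

    // Mark a host as disabled with a persistent znode.  PERSISTENT means
    // the marking survives the writer's session, which is exactly what you
    // want for operator intent.
    public static void disable(ZooKeeper zk, String hostname)
            throws KeeperException, InterruptedException {
        zk.create("/production/disabled/" + hostname,
                  new byte[0],
                  ZooDefs.Ids.OPEN_ACL_UNSAFE,
                  CreateMode.PERSISTENT);
    }
}
{code}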

The real issue is making sure that each piece of information is published by 
the source most directly connected to its physical manifestation.  
Moreover, it is important in some cases (node state, for instance) that the 
state stay correct even when the source of that state loses control by 
crashing, hanging or becoming otherwise indisposed.  Inserting an intermediary 
into this chain of control is a bad idea.  Replicating ZK's rather well 
implemented ephemeral state mechanism with ad hoc heartbeats is also a bad idea 
(remember how *many* bugs there have been in Hadoop relating to heartbeats and 
the NameNode?).

A somewhat secondary issue is whether the cluster master has to be involved in 
every query.  That seems like a serious bottleneck to me, and Katta provides 
an existence proof that it is not necessary.

After trying several options in production, what I find works best is for the 
master to lay down a statement of desired state while the nodes publish their 
status in a different and ephemeral fashion.  The master can record a history, 
and there can be general directives such as your disabled list, arranged 
however you like, but those shouldn't be mixed into the node status; otherwise 
you get into a situation where ephemeral files can no longer be used for what 
they are good at.
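
To make the separation concrete, a sketch of the read side under the same 
invented layout: desired state lives under a persistent tree, observed state 
under an ephemeral one, and anyone interested just diffs the two:

{code:java}
import java.util.HashSet;
import java.util.List;
import java.util.Set;

import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooKeeper;

public class DesiredVsLive {

    // Diff the master's persistent desired list against the nodes'
    // ephemeral live list.  Both paths are invented for this sketch.
    public static Set<String> missing(ZooKeeper zk)
            throws KeeperException, InterruptedException {
        List<String> desired = zk.getChildren("/production/desired", false);
        List<String> live = zk.getChildren("/production/live", false);
        Set<String> gone = new HashSet<String>(desired);
        gone.removeAll(live);
        return gone;  // desired but not currently serving
    }
}
{code}

Because the live side is purely ephemeral, that diff stays trustworthy even 
when a node dies without saying goodbye.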



> Real Basic Core Management with Zookeeper
> -----------------------------------------
>
>                 Key: SOLR-1724
>                 URL: https://issues.apache.org/jira/browse/SOLR-1724
>             Project: Solr
>          Issue Type: New Feature
>          Components: multicore
>    Affects Versions: 1.4
>            Reporter: Jason Rutherglen
>             Fix For: 1.5
>
>
> Though we're implementing cloud, I need something real soon I can
> play with and deploy. So this'll be a patch that only deploys
> new cores, and that's about it. The arch is real simple:
> On Zookeeper there'll be a directory that contains files that
> represent the state of the cores of a given set of servers which
> will look like the following:
> /production/cores-1.txt
> /production/cores-2.txt
> /production/core-host-1-actual.txt (ephemeral node per host)
> Where each core-N.txt file contains:
> hostname,corename,instanceDir,coredownloadpath
> coredownloadpath is a URL such as file://, http://, hftp://, hdfs://, ftp://, 
> etc
> and
> core-host-actual.txt contains:
> hostname,corename,instanceDir,size
> Every time a new core-N.txt file is added, the listening host
> finds its entry in the list and begins the process of trying to
> match the entries. Upon completion, it updates its
> /core-host-1-actual.txt file to its completed state or logs an error.
> When all host actual files are written (without errors), then a
> new core-1-actual.txt file is written which can be picked up by
> another process that can create a new core proxy.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
