Yeah, exactly. At a minimum you want a connection state listener + watcher on the node to ensure it stays in existence.
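
To make that concrete, here is a rough sketch of the pattern. It is a dependency-free model, not real ZooKeeper or Curator code: FakeZk, LiveNodeKeeper, and the node path are made-up names for illustration only. One simplification to be aware of: real ZooKeeper watches fire once and have to be re-registered, whereas the watcher in this model simply stays registered.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical in-memory stand-in for a ZooKeeper session (NOT the real API).
class FakeZk {
    enum State { CONNECTED, LOST, RECONNECTED }

    private final Set<String> ephemerals = new HashSet<>();
    private Consumer<State> stateListener = s -> {};
    private Consumer<String> deleteWatcher = p -> {};

    void setStateListener(Consumer<State> listener) { stateListener = listener; }
    void setDeleteWatcher(Consumer<String> watcher) { deleteWatcher = watcher; }

    void createEphemeral(String path) { ephemerals.add(path); }
    boolean exists(String path) { return ephemerals.contains(path); }

    // Models session expiry: every ephemeral node owned by the session is dropped.
    void expireSession() {
        ephemerals.clear();
        stateListener.accept(State.LOST);
    }

    // Models the client establishing a fresh session.
    void reconnect() { stateListener.accept(State.RECONNECTED); }

    // Models the node being deleted while the session is still healthy.
    void deleteExternally(String path) {
        if (ephemerals.remove(path)) {
            deleteWatcher.accept(path);
        }
    }
}

// Keeps one ephemeral node alive using a state listener plus a delete watcher.
class LiveNodeKeeper {
    private final FakeZk zk;
    private final String path;
    private int recreations = 0;

    LiveNodeKeeper(FakeZk zk, String path) {
        this.zk = zk;
        this.path = path;
        // Listener: after a reconnect, the old ephemeral may be gone, so re-check.
        zk.setStateListener(state -> {
            if (state == FakeZk.State.RECONNECTED) {
                ensureNode();
            }
        });
        // Watcher: if our node is deleted while we are connected, put it back.
        zk.setDeleteWatcher(deleted -> {
            if (deleted.equals(path)) {
                ensureNode();
            }
        });
        zk.createEphemeral(path);
    }

    private void ensureNode() {
        if (!zk.exists(path)) {
            zk.createEphemeral(path);
            recreations++;
        }
    }

    int recreations() { return recreations; }
}

class LiveNodeKeeperDemo {
    public static void main(String[] args) {
        String path = "/live_nodes/example_node"; // illustrative path only
        FakeZk zk = new FakeZk();
        LiveNodeKeeper keeper = new LiveNodeKeeper(zk, path);

        zk.deleteExternally(path);            // watcher recreates the node
        System.out.println(zk.exists(path));  // true

        zk.expireSession();                   // node is gone while disconnected
        System.out.println(zk.exists(path));  // false

        zk.reconnect();                       // listener recreates the node
        System.out.println(zk.exists(path));  // true
        System.out.println(keeper.recreations()); // 2
    }
}
```

The watcher covers the case where the node vanishes while the session is fine, and the listener covers the case where the session itself was lost, so neither one alone is enough. As far as I understand it, the Curator recipe Steve linked packages up the same idea.
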

On Fri, Jan 15, 2016 at 9:35 AM, Steve Molloy <[email protected]> wrote:
> I'm not that familiar with how nodes in ZK are created and kept for
> Solr, but the mention of ephemeral nodes made me think that the
> connection being reset could cause the node to disappear. There is a
> Curator recipe to work around that specific issue:
> http://curator.apache.org/curator-recipes/persistent-ephemeral-node.html
>
> Don't know if it helps at all, but thought I'd share just in case. :)
>
> Steve
>
> On Fri, 2016-01-15 at 02:51 +0000, Mark Miller wrote:
> > bq. As for #2, I haven't found any tickets that mention anything like
> > that, that may not mean much though.
> >
> > I'll see if I can dig it up. Perhaps it's only been discussed and we
> > still need to make one, but I'm pretty sure someone did.
> >
> > - Mark
> >
> > On Thu, Jan 14, 2016 at 9:01 PM Erick Erickson <[email protected]> wrote:
> > > bq: A report of a 'spotting' or two in the wild is a very weak leg
> > > for such a hack to stand on.
> > >
> > > Can't disagree. The more I think about it, the harder it is to see
> > > some process that would be helpful. The fact that the node (and
> > > presumably all replicas on that node) is unavailable means you
> > > can't index to any replica on that node _and_ you can't do regular
> > > distributed queries. About the only thing you _can_ do is query the
> > > (stale) replicas on that node with &distrib=false, which is at
> > > least a little useful when trying to understand the state of the
> > > system but totally useless when it comes to a production setup.
> > >
> > > I guess "monitor and if it's repeatable try to find out why it was
> > > being removed in the first place".
> > >
> > > As for #2, I haven't found any tickets that mention anything like
> > > that, that may not mean much though.
> > >
> > > Scott:
> > >
> > > Right, but since the node was removed from live_nodes in the first
> > > place, presumably the Solr node wasn't reachable (speculation). So
> > > it wouldn't receive an event that it was removed from the
> > > live_node ephemeral and couldn't repair itself.
> > >
> > > On Thu, Jan 14, 2016 at 5:55 PM, Scott Blum <[email protected]> wrote:
> > > > Most ephemeral node uses include a monitoring component or watch
> > > > of some kind tho.
> > > >
> > > > On Thu, Jan 14, 2016 at 5:54 PM, Mark Miller <[email protected]> wrote:
> > > >>
> > > >> That is just silly though. There is no reason it should be gone
> > > >> in a legit situation. We can't have everything monitoring all its
> > > >> state all the time and trying to correct it.
> > > >>
> > > >> A report of a 'spotting' or two in the wild is a very weak leg
> > > >> for such a hack to stand on.
> > > >>
> > > >> - Mark
> > > >>
> > > >> On Thu, Jan 14, 2016 at 5:40 PM Scott Blum <[email protected]> wrote:
> > > >>>
> > > >>> For #1, I think each node should periodically ensure it's in
> > > >>> the live_nodes list in ZK.
> > > >>
> > > >> --
> > > >> - Mark
> > > >> about.me/markrmiller
> >
> > --
> > - Mark
> > about.me/markrmiller
> --
> Steve Molloy <[email protected]>
> OpenText
