I'm not that familiar with how nodes in ZK are created and kept for Solr, but the mention of ephemeral nodes made me think that a connection reset could cause the node to disappear. There is a Curator recipe to work around that specific issue: http://curator.apache.org/curator-recipes/persistent-ephemeral-node.html
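To make the idea a bit more concrete, here is a rough sketch of the "notice the ephemeral node is gone and put it back" behaviour. It is only an in-memory simulation: FakeZk, LiveNodeKeeper and the /live_nodes/host1:8983_solr path are names I made up for illustration, and none of this is Solr's or Curator's actual code. The polling style is my simplification of what Scott suggested for #1; as far as I understand, the real recipe reacts to watches and connection events instead.

```python
class FakeZk:
    """In-memory stand-in for a ZooKeeper ensemble (hypothetical, not a real client)."""

    def __init__(self):
        self._ephemerals = {}  # path -> id of the session that owns the node

    def create_ephemeral(self, path, session_id):
        if path in self._ephemerals:
            raise KeyError(f"node already exists: {path}")
        self._ephemerals[path] = session_id

    def exists(self, path):
        return path in self._ephemerals

    def expire_session(self, session_id):
        # Ephemeral nodes vanish when the session that created them ends.
        for path in [p for p, s in self._ephemerals.items() if s == session_id]:
            del self._ephemerals[path]


class LiveNodeKeeper:
    """Re-registers one ephemeral node whenever a check finds it missing."""

    def __init__(self, zk, path):
        self.zk = zk
        self.path = path
        self.session_id = 0

    def ensure_registered(self):
        """Return True if the node had to be (re)created, False if it was present."""
        if self.zk.exists(self.path):
            return False
        # Assumption: a lost node means the old session is gone, so start a new one.
        self.session_id += 1
        self.zk.create_ephemeral(self.path, self.session_id)
        return True


zk = FakeZk()
keeper = LiveNodeKeeper(zk, "/live_nodes/host1:8983_solr")
keeper.ensure_registered()              # initial registration
zk.expire_session(keeper.session_id)    # simulate the session being lost
recreated = keeper.ensure_registered()  # the next check repairs the entry
```

Note that this only helps if the process can still reach ZK when the check runs, which is the caveat Erick raised further down.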
Don't know if it helps at all, but thought I'd share just in case. :)

Steve

On Fri, 2016-01-15 at 02:51 +0000, Mark Miller wrote:
> bq. As for #2, I haven't found any tickets that mention anything like
> that, that may not mean much though.
>
> I'll see if I can dig it up. Perhaps it's only been discussed and we
> still need to make one, but I'm pretty sure someone did.
>
> - Mark
>
> On Thu, Jan 14, 2016 at 9:01 PM Erick Erickson <[email protected]> wrote:
> > bq: A report of a 'spotting' or two in the wild is a very weak leg for
> > such a hack to stand on.
> >
> > Can't disagree. The more I think about it, the harder it is to see
> > some process that would be helpful. The fact that the node (and
> > presumably all replicas on that node) are unavailable means you can't
> > index to any replica on that node _and_ you can't do regular
> > distributed queries. About the only thing you _can_ do is query the
> > (stale) replicas on that node with &distrib=false, which is at least
> > a little useful when trying to understand the state of the system but
> > totally useless when it comes to a production setup.
> >
> > I guess "monitor and if it's repeatable try to find out why it was
> > being removed in the first place".
> >
> > As for #2, I haven't found any tickets that mention anything like
> > that, that may not mean much though.
> >
> > Scott:
> >
> > Right, but since the node was removed from live_nodes in the first
> > place, presumably the Solr node wasn't reachable (speculation). So it
> > wouldn't receive an event that it was removed from the live_node
> > ephemeral and couldn't repair itself.
> >
> > On Thu, Jan 14, 2016 at 5:55 PM, Scott Blum <[email protected]> wrote:
> > > Most ephemeral node uses include a monitoring component or watch of
> > > some kind tho.
> > >
> > > On Thu, Jan 14, 2016 at 5:54 PM, Mark Miller <[email protected]> wrote:
> > >>
> > >> That is just silly though. There is no reason it should be gone in a
> > >> legit situation. We can't have everything monitoring all it's state
> > >> all the time and trying to correct it.
> > >>
> > >> A report of a 'spotting' or two in the wild is a very weak leg for
> > >> such a hack to stand on.
> > >>
> > >> - Mark
> > >>
> > >> On Thu, Jan 14, 2016 at 5:40 PM Scott Blum <[email protected]> wrote:
> > >>>
> > >>> For #1, I think each node should periodically ensure it's in the
> > >>> live_nodes list in ZK.
> > >>
> > >> --
> > >> - Mark
> > >> about.me/markrmiller
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [email protected]
> > For additional commands, e-mail: [email protected]
>
> --
> - Mark
> about.me/markrmiller

--
Steve Molloy <[email protected]>
OpenText
