Yeah, exactly.  At a minimum you want a connection state listener + watcher
on the node to ensure it stays in existence.
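
A rough sketch of what I mean, in case it helps. It uses an in-memory
stand-in rather than the real ZooKeeper or Curator API: FakeCoordinator,
LiveNodeGuard and the node path are all made-up names for illustration,
so treat it as a shape for the idea, not as working Solr code.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

// In-memory stand-in for a coordination service. Hypothetical; this is
// NOT the ZooKeeper or Curator API.
class FakeCoordinator {
    enum ConnectionState { LOST, RECONNECTED }

    private final Set<String> ephemeralNodes = new HashSet<>();
    private final List<Consumer<ConnectionState>> stateListeners = new ArrayList<>();
    private final List<Consumer<String>> deletionWatchers = new ArrayList<>();

    void addStateListener(Consumer<ConnectionState> listener) { stateListeners.add(listener); }
    void addDeletionWatcher(Consumer<String> watcher) { deletionWatchers.add(watcher); }
    void createEphemeral(String path) { ephemeralNodes.add(path); }
    boolean exists(String path) { return ephemeralNodes.contains(path); }

    // Node deleted while the session is alive: deletion watchers are told.
    void delete(String path) {
        if (ephemeralNodes.remove(path)) {
            for (Consumer<String> watcher : new ArrayList<>(deletionWatchers)) {
                watcher.accept(path);
            }
        }
    }

    // Session expiry: every ephemeral node vanishes and the client only
    // learns that the connection was lost.
    void expireSession() {
        ephemeralNodes.clear();
        fire(ConnectionState.LOST);
    }

    void reconnect() { fire(ConnectionState.RECONNECTED); }

    private void fire(ConnectionState state) {
        for (Consumer<ConnectionState> listener : new ArrayList<>(stateListeners)) {
            listener.accept(state);
        }
    }
}

// Keeps one ephemeral node in existence: recreates it when it is deleted
// and again when the connection comes back.
class LiveNodeGuard {
    private final FakeCoordinator coordinator;
    private final String path;

    LiveNodeGuard(FakeCoordinator coordinator, String path) {
        this.coordinator = coordinator;
        this.path = path;
        // Connection state listener: re-register after a reconnect.
        coordinator.addStateListener(state -> {
            if (state == FakeCoordinator.ConnectionState.RECONNECTED) ensureExists();
        });
        // Watcher on the node: re-register if it is deleted under us.
        coordinator.addDeletionWatcher(deleted -> {
            if (deleted.equals(path)) ensureExists();
        });
        ensureExists();
    }

    void ensureExists() {
        if (!coordinator.exists(path)) coordinator.createEphemeral(path);
    }

    public static void main(String[] args) {
        String node = "/live_nodes/example-host"; // made-up path
        FakeCoordinator coordinator = new FakeCoordinator();
        new LiveNodeGuard(coordinator, node);
        coordinator.delete(node);
        System.out.println("after delete: " + coordinator.exists(node));
        coordinator.expireSession();
        System.out.println("while disconnected: " + coordinator.exists(node));
        coordinator.reconnect();
        System.out.println("after reconnect: " + coordinator.exists(node));
    }
}
```

One difference to keep in mind: the watcher in this stand-in stays
registered forever, whereas a standard ZooKeeper watch fires once and
has to be set again. The Curator recipe Steve linked is meant to take
care of that kind of bookkeeping.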

On Fri, Jan 15, 2016 at 9:35 AM, Steve Molloy <[email protected]> wrote:

> I'm not that familiar with how nodes in ZK are created and kept for
> Solr, but the mention of ephemeral nodes made me think that the
> connection being reset could cause the node to disappear. There is a
> Curator recipe to work around that specific issue:
> http://curator.apache.org/curator-recipes/persistent-ephemeral-node.html
>
> Don't know if it helps at all, but thought I'd share just in case. :)
>
> Steve
>
> On Fri, 2016-01-15 at 02:51 +0000, Mark Miller wrote:
> > bq. As for #2, I haven't found any tickets that mention anything like
> >  that, that may not mean much though.
> >
> > I'll see if I can dig it up. Perhaps it's only been discussed and we
> > still need to make one, but I'm pretty sure someone did.
> >
> > - Mark
> >
> > On Thu, Jan 14, 2016 at 9:01 PM Erick Erickson <
> > [email protected]> wrote:
> > > > bq: A report of a 'spotting' or two in the wild is a very weak leg
> > > > for such a hack to stand on.
> > >
> > > > Can't disagree. The more I think about it, the harder it is to see
> > > > some process that would be helpful. The fact that the node (and
> > > > presumably all replicas on that node) are unavailable means you
> > > > can't index to any replica on that node _and_ you can't do regular
> > > > distributed queries. About the only thing you _can_ do is query the
> > > > (stale) replicas on that node with &distrib=false, which is at
> > > > least a little useful when trying to understand the state of the
> > > > system but totally useless when it comes to a production setup.
> > >
> > > I guess "monitor and if it's repeatable try to find out why it was
> > > being removed in the first place".
> > >
> > > As for #2, I haven't found any tickets that mention anything like
> > > that, that may not mean much
> > > though.
> > >
> > > Scott:
> > >
> > > > Right, but since the node was removed from live_nodes in the first
> > > > place, presumably the Solr node wasn't reachable (speculation). So
> > > > it wouldn't receive an event that it was removed from the
> > > > live_node ephemeral and couldn't repair itself.
> > >
> > > On Thu, Jan 14, 2016 at 5:55 PM, Scott Blum <[email protected]>
> > > wrote:
> > > > > Most ephemeral node uses include a monitoring component or
> > > > > watch of some kind tho.
> > > > >
> > > > > On Thu, Jan 14, 2016 at 5:54 PM, Mark Miller <[email protected]> wrote:
> > > >>
> > > > >> That is just silly though. There is no reason it should be gone
> > > > >> in a legit situation. We can't have everything monitoring all
> > > > >> its state all the time and trying to correct it.
> > > > >>
> > > > >> A report of a 'spotting' or two in the wild is a very weak leg
> > > > >> for such a hack to stand on.
> > > >>
> > > >>
> > > >> - Mark
> > > >>
> > > > >> On Thu, Jan 14, 2016 at 5:40 PM Scott Blum <[email protected]> wrote:
> > > > >>>
> > > > >>> For #1, I think each node should periodically ensure it's in
> > > > >>> the live_nodes list in ZK.
> > > >>
> > > >> --
> > > >> - Mark
> > > >> about.me/markrmiller
> > > >
> > > >
> > >
> > > -------------------------------------------------------------------
> > > --
> > > To unsubscribe, e-mail: [email protected]
> > > For additional commands, e-mail: [email protected]
> > >
> > >
> > --
> > - Mark
> > about.me/markrmiller
> --
> Steve Molloy <[email protected]>
> OpenText
>
