Re: live_nodes and state.json can get out of sync

Erick Erickson Thu, 14 Jan 2016 18:01:33 -0800

bq: A report of a 'spotting' or two in the wild is a very weak leg for
such a hack to stand on.

Can't disagree. The more I think about it, the harder it is to see
some process that would be helpful. The fact that the node (and
presumably all replicas on that node) is unavailable means you can't
index to any replica on that node _and_ you can't do regular
distributed queries. About the only thing you _can_ do is query the
(stale) replicas on that node with &distrib=false, which is at least a
little useful when trying to understand the state of the system but
totally useless when it comes to a production setup.
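
For anyone who wants to try the &distrib=false query mentioned above,
here is a minimal sketch that only builds the request URL. The host,
port, and core name are hypothetical placeholders, and the helper
local_query_url is made up for illustration; it is not part of Solr.

```python
from urllib.parse import urlencode


def local_query_url(base_url, core, q="*:*", rows=0):
    """Build a select URL that queries only the named core.

    distrib=false asks Solr not to fan the request out to other
    shards, so the response reflects just this replica's index.
    """
    params = urlencode({"q": q, "rows": rows, "distrib": "false"})
    return f"{base_url.rstrip('/')}/{core}/select?{params}"


# Hypothetical host and core name, for illustration only.
url = local_query_url("http://host1:8983/solr", "collection1_shard1_replica1")
print(url)
```

Comparing numFound from such a query across replicas of the same shard
is one rough way to see how stale a given replica is.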

I guess the answer is "monitor, and if it's repeatable, try to find out
why the node was being removed in the first place".
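
One possible shape for the "monitor" part, as a rough sketch only: given
the children of /live_nodes and one collection's entry from state.json
(both already fetched from ZK by whatever client you use), list the
replicas that state.json still marks active on a node that is not live.
The function name and the sample data are hypothetical, and the field
names (shards, replicas, node_name, state) assume the usual state.json
layout.

```python
def replicas_on_missing_nodes(live_nodes, collection_state):
    """Return (shard, replica, node_name) for every replica that is
    marked active but whose node is absent from live_nodes."""
    live = set(live_nodes)
    suspects = []
    for shard_name, shard in collection_state.get("shards", {}).items():
        for replica_name, replica in shard.get("replicas", {}).items():
            node = replica.get("node_name")
            if replica.get("state") == "active" and node not in live:
                suspects.append((shard_name, replica_name, node))
    return sorted(suspects)


# Hypothetical sample: host2 is still "active" in state.json but has
# dropped out of live_nodes.
sample_state = {
    "shards": {
        "shard1": {
            "replicas": {
                "core_node1": {"state": "active", "node_name": "host1:8983_solr"},
                "core_node2": {"state": "active", "node_name": "host2:8983_solr"},
            }
        }
    }
}
print(replicas_on_missing_nodes(["host1:8983_solr"], sample_state))
```

This only reports the mismatch; it does not try to repair anything,
which fits the point that the first job is finding out why the node
was removed.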

As for #2, I haven't found any tickets that mention anything like
that, though that may not mean much.

Scott:

Right, but since the node was removed from live_nodes in the first
place, presumably the Solr node wasn't reachable (speculation). So it
wouldn't receive an event telling it that its live_nodes ephemeral node
had been removed, and it couldn't repair itself.

On Thu, Jan 14, 2016 at 5:55 PM, Scott Blum <[email protected]> wrote:
> Most ephemeral node uses include a monitoring component or watch of some
> kind tho.
>
> On Thu, Jan 14, 2016 at 5:54 PM, Mark Miller <[email protected]> wrote:
>>
>> That is just silly though. There is no reason it should be gone in a legit
>> situation. We can't have everything monitoring all its state all the time
>> and trying to correct it.
>>
>> A report of a 'spotting' or two in the wild is a very weak leg for such a
>> hack to stand on.
>>
>>
>> - Mark
>>
>> On Thu, Jan 14, 2016 at 5:40 PM Scott Blum <[email protected]> wrote:
>>>
>>> For #1, I think each node should periodically ensure it's in the
>>> live_nodes list in ZK.
>>
>> --
>> - Mark
>> about.me/markrmiller
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
