bq: A report of a 'spotting' or two in the wild is a very weak leg for such a hack to stand on.
Can't disagree. The more I think about it, the harder it is to see a process that would actually help. The fact that the node (and presumably every replica on that node) is unavailable means you can't index to any replica on that node _and_ you can't do regular distributed queries. About the only thing you _can_ do is query the (stale) replicas on that node with &distrib=false, which is at least a little useful when trying to understand the state of the system but totally useless in a production setup. I guess the best advice is "monitor and, if it's repeatable, try to find out why it was being removed in the first place".

As for #2, I haven't found any tickets that mention anything like that, though that may not mean much.

Scott: Right, but since the node was removed from live_nodes in the first place, presumably the Solr node wasn't reachable (speculation). So it wouldn't receive an event saying its live_nodes ephemeral node had been removed, and it couldn't repair itself.

On Thu, Jan 14, 2016 at 5:55 PM, Scott Blum wrote:
> Most ephemeral node uses include a monitoring component or watch of some
> kind tho.
>
> On Thu, Jan 14, 2016 at 5:54 PM, Mark Miller wrote:
>>
>> That is just silly though. There is no reason it should be gone in a legit
>> situation. We can't have everything monitoring all it's state all the time
>> and trying to correct it.
>>
>> A report of a 'spotting' or two in the wild is a very weak leg for such a
>> hack to stand on.
>>
>>
>> - Mark
>>
>> On Thu, Jan 14, 2016 at 5:40 PM Scott Blum wrote:
>>>
>>> For #1, I think each node should periodically ensure it's in the
>>> live_nodes list in ZK.
>>
>> --
>> - Mark
>> about.me/markrmiller
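
For anyone who wants to try the &distrib=false check mentioned above, here is a minimal sketch of building such a request URL. The host, port, and core name are made up for illustration, and the `local_query_url` helper is hypothetical, not part of Solr or any client library. It only assumes the standard `/select` handler and the `distrib` request parameter.

```python
from urllib.parse import urlencode


def local_query_url(base_url, core, q="*:*", rows=0):
    """Build a /select URL that asks only the addressed core for results.

    distrib=false tells Solr not to fan the query out to other shards or
    replicas, so the response reflects this one (possibly stale) replica.
    """
    params = {"q": q, "rows": rows, "distrib": "false", "wt": "json"}
    return f"{base_url.rstrip('/')}/{core}/select?{urlencode(params)}"


# Hypothetical host and core name, for illustration only.
url = local_query_url("http://host1:8983/solr", "collection1_shard1_replica1")
print(url)
```

Comparing numFound from this kind of request across the replicas of a shard gives a rough idea of how far the unreachable node's copies have drifted, which is about all it is good for.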
