There is one more thing to consider: when a node goes offline and then comes back online, the ZooKeeper ensemble may have trouble responding to the cluster unless ZooKeeper has been configured properly.

Jim Keeney
President, FitterWeb
E: nextves...@gmail.com
M: 703-568-5887

*FitterWeb Consulting*
*Are you lean and agile enough for the web?*

On Thu, Sep 27, 2018 at 4:12 AM Shawn Heisey <apa...@elyograg.org> wrote:

> On 9/27/2018 8:00 AM, Shawn Heisey wrote:
> > On 9/27/2018 7:24 AM, Kimber, Mike wrote:
> >> I'm trying to determine if there is any health check available to
> >> determine the above and then if the issue happens then an automated
> >> mechanism in SolrCloud to restart the instance. Or is this something
> >> we have to code ourselves?
> >
> > As shipped by the project, Solr will never restart itself
> > automatically. If it dies, it's dead until you start it again, unless
> > you implement something to restart it automatically. This is
> > intentional -- Solr almost never dies unless there's some kind of
> > problem -- not enough memory, corrupt software, etc. If Solr *does*
> > die, you need to figure out why and fix it, not rely on an automatic
> > restart.
>
> Replying to myself. Probably a sign of insanity!
>
> The other side of that coin is a completely unresponsive server. Here's
> the thing about that situation: If it's really unresponsive, it
> probably wouldn't be possible to send Solr a message to tell it to
> restart itself. When a server in SolrCloud becomes unresponsive,
> SolrCloud will attempt to have it do an index recovery, but this does
> NOT involve a restart. Solr cannot restart itself automatically. It
> might be possible to write that functionality into Solr, but I think
> that using such functionality for automatic restarts on problem
> detection is a very bad idea. The root of the problem must be found and
> fixed, a restart probably isn't going to get rid of it.
>
> If a SolrCloud server remains unresponsive, then any recovery operation
> that is initiated is going to fail. Typically, problems that lead to an
> unresponsive server are not the kind of problems that will go away
> without action by the administrator -- adding memory, reducing the index
> size, etc. If the admin restarts the server to clear that kind of
> problem, it's very likely that the problem will happen again.
>
> Thanks,
> Shawn
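
For anyone who ends up coding the health check themselves, here is a minimal sketch of an external probe, not something that ships with Solr. It assumes the node answers on a base URL such as http://localhost:8983/solr and that the system info handler at /admin/info/system returns JSON with a responseHeader whose status is 0 when healthy. The function name probe_solr, the three status labels, and the injectable opener parameter are made up for illustration. In line with the advice quoted above, it only reports; it does not restart anything.

```python
import json
import urllib.error
import urllib.request


def probe_solr(base_url, timeout=5.0, opener=urllib.request.urlopen):
    """Probe one Solr node and return (status, detail).

    status is 'ok', 'error', or 'unresponsive' (labels are hypothetical).
    base_url is assumed to look like 'http://localhost:8983/solr'.
    opener is injectable so the probe can be exercised without a live node.
    """
    url = base_url.rstrip("/") + "/admin/info/system?wt=json"
    try:
        with opener(url, timeout=timeout) as resp:
            body = json.loads(resp.read().decode("utf-8"))
    except urllib.error.HTTPError as exc:
        # The node answered, but with an HTTP error code.
        return "error", "HTTP %d" % exc.code
    except (urllib.error.URLError, OSError) as exc:
        # Connection refused, DNS failure, or timeout: nothing answered.
        return "unresponsive", str(exc)
    except ValueError as exc:
        # Malformed URL, or a body that is not valid JSON.
        return "error", "bad request or response: %s" % exc

    status = body.get("responseHeader", {}).get("status")
    if status == 0:
        return "ok", "responseHeader.status=0"
    return "error", "responseHeader.status=%r" % status
```

A monitoring job could call probe_solr("http://localhost:8983/solr") on a schedule and alert an administrator on anything other than "ok", so that the root cause gets investigated instead of being hidden by an automatic restart.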