Hi

Scenario:
* 1) You have a Solr cloud cluster running - several Solr nodes across several machines - many collections with many replicas and documents indexed into them
* 2) One of the machines running a Solr node completely crashes - totally gone, including the local disk with the data/config etc. of the Solr node
* 3) You want to be able to insert a new empty machine, install/configure Solr on it, give it the same IP and hostname as the crashed machine had, and then start this new Solr node and have it take the place of the crashed Solr node, making the Solr cloud cluster work again
* 4) No replication (only one replica per shard), so we accept that the data on the crashed machine is gone forever, but of course we want the Solr cloud cluster to continue running with the documents indexed on the other Solr nodes

At my company we are establishing a procedure for what to do in 3) above.

Basically we use our "install script" to install/configure the new Solr node on the new machine exactly as it was installed/configured on the crashed machine back when the system was originally set up - this includes an "empty" solr.xml file (no cores mentioned). We then start all the Solr nodes (including the newly reestablished one) again. They all start successfully, but the Solr cloud cluster does not work - at least not for distributed searches touching replicas that used to run on the crashed Solr node, because those replicas are not loaded on the reestablished node.

How do we make sure that a reestablished Solr node, on a machine with the same IP and hostname as the machine that crashed, will load all the replicas that the old Solr node used to run?

Potential solutions
* We have tried making sure that the solr.xml on the reestablished Solr node contains the same core list as on the crashed one. Then everything works as we want. But this is a little fragile and it is a solution "outside" Solr - you need to figure out how to reestablish the solr.xml yourself - probably something like looking into clusterstate.json and generating the solr.xml from that
* Untested by us: Maybe we will also succeed by just running Core API LOAD operations against the new reestablished Solr node - one LOAD operation for each replica that used to run on the Solr node. But this is also a little fragile and it is also (partly) a solution "outside" Solr - you need to figure out which cores to load yourself.
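
To illustrate the "figure out which cores to load yourself" step that both options share, here is a minimal Python sketch. It assumes you have already fetched the contents of clusterstate.json from ZooKeeper as a JSON string, and that each replica entry carries "node_name" and "core" properties. The function name cores_on_node, the sample node names and the sample layout are made up for illustration; the exact nesting of clusterstate.json differs between Solr versions, so the sketch walks the whole structure instead of relying on one layout.

```python
import json


def cores_on_node(cluster_state, node_name):
    """Return sorted (collection, core) pairs for replicas hosted on node_name.

    Assumption: every replica entry is a JSON object that has both a
    "node_name" and a "core" property. The nesting around those entries
    is not assumed, so the whole structure is searched recursively.
    """
    found = []

    def walk(obj):
        if isinstance(obj, dict):
            if obj.get("node_name") == node_name and "core" in obj:
                # "collection" may be absent in some layouts; keep None then.
                found.append((obj.get("collection"), obj["core"]))
            for value in obj.values():
                walk(value)

    walk(cluster_state)
    return sorted(found, key=lambda pair: (pair[0] or "", pair[1]))


if __name__ == "__main__":
    # Hypothetical sample data, not taken from a real cluster.
    sample = json.loads("""
    {"collection1": {
        "shard1": {"host1:8983_solr_collection1_shard1": {
            "node_name": "host1:8983_solr",
            "core": "collection1_shard1",
            "collection": "collection1"}},
        "shard2": {"host2:8983_solr_collection1_shard2": {
            "node_name": "host2:8983_solr",
            "core": "collection1_shard2",
            "collection": "collection1"}}}}
    """)
    for collection, core in cores_on_node(sample, "host1:8983_solr"):
        print(collection, core)
```

The resulting list could then be used either to regenerate the core entries in solr.xml or to drive one Core API call per core. Either way it stays a workaround "outside" Solr.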

I have to say that we do not use the "latest" Solr version - we use a version of Solr based on 4.0.0. So there might be a solution already in Solr, but I would be surprised.

Any thoughts about how this "ought" to be done? Support in Solr? E.g. an "operation" to tell a Solr node to load all the replicas that used to run on a machine with the same IP and hostname? Or...?

Regards, Per Steffensen
