Hi
Scenario:
* 1) You have a Solr cloud cluster running - several Solr nodes across
several machines - many collections with many replicas and documents
indexed into them
* 2) One of the machines running a Solr node completely crashes -
totally gone including local disk with data/config etc. of the Solr node
* 3) You want to be able to insert a new empty machine,
install/configure Solr on this new machine, give it the same IP and
hostname as the crashed machine had, and then start this new Solr node
and have it take the place of the crashed Solr node, making the Solr
cloud cluster work again
* 4) No replication (only one replica per shard), so we will accept that
the data on the crashed machine is gone forever, but of course we want
the Solr cloud cluster to continue running with the documents indexed on
the other Solr nodes
At my company we are establishing a procedure for what to do in 3) above.
Basically we use our "install script" to install/configure the new Solr
node on the new machine as it was originally installed/configured on the
crashed machine back when the system was originally set up - this
includes an "empty" solr.xml file (no cores mentioned). We then start
all the Solr nodes (including the newly reestablished one) again. They
all start successfully, but the Solr cloud cluster does not work - at
least not for distributed searches touching replicas that used to run on
the crashed Solr node, because those replicas are not loaded on the
reestablished node.
How can we make sure that a reestablished Solr node on a machine with
the same IP and hostname as the machine that crashed will load all the
replicas that the old Solr node used to run?
Potential solutions
* We have tried making sure that the solr.xml on the reestablished Solr
node contains the same core list as on the crashed one. Then
everything works as we want. But this is a little fragile and it is a
solution "outside" Solr - you need to figure out how to reestablish the
solr.xml yourself - probably by looking into clusterstate.json and
generating the solr.xml from that
* Untested by us: Maybe we will also succeed by just running Core API
LOAD operations against the newly reestablished Solr node - one LOAD
operation for each replica that used to run on the Solr node. But this
is also a little fragile and it is also (partly) a solution "outside"
Solr - you need to figure out which cores to load yourself.
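To make the "figure out which cores yourself" step concrete, here is a
rough sketch of how the core list might be derived from a copy of
clusterstate.json. It is only an illustration under assumptions: that
every replica entry is a JSON object carrying "node_name" and "core"
(and usually "collection" and "shard"), and that instanceDir equals the
core name. The function names and the sample data are made up for this
example and are not part of Solr.

```python
import json
from xml.sax.saxutils import quoteattr


def replicas_on_node(cluster_state, node_name):
    """Return the replica entries in cluster_state hosted on node_name."""
    found = []

    def walk(obj):
        if not isinstance(obj, dict):
            return
        # Assumption: a replica entry is a dict with "node_name" and "core".
        # Walking recursively avoids depending on the exact nesting depth.
        if "node_name" in obj and "core" in obj:
            if obj["node_name"] == node_name:
                found.append(obj)
            return
        for value in obj.values():
            walk(value)

    walk(cluster_state)
    return sorted(found, key=lambda r: r["core"])


def core_elements(replicas):
    """Render one <core .../> line per replica for a legacy-style solr.xml."""
    lines = []
    for r in replicas:
        # Assumption: instanceDir is the same as the core name.
        attrs = [("name", r["core"]), ("instanceDir", r["core"])]
        for key in ("collection", "shard"):
            if r.get(key):
                attrs.append((key, r[key]))
        rendered = " ".join(f"{k}={quoteattr(v)}" for k, v in attrs)
        lines.append("<core " + rendered + "/>")
    return lines


if __name__ == "__main__":
    # Hypothetical cluster state, for illustration only.
    sample = json.loads("""
    {"collA": {
       "shard1": {"host1:8983_solr_collA_shard1": {
          "node_name": "host1:8983_solr", "core": "collA_shard1",
          "collection": "collA", "shard": "shard1"}},
       "shard2": {"host2:8983_solr_collA_shard2": {
          "node_name": "host2:8983_solr", "core": "collA_shard2",
          "collection": "collA", "shard": "shard2"}}}}
    """)
    for line in core_elements(replicas_on_node(sample, "host1:8983_solr")):
        print(line)
```

The same list of cores could also drive the per-replica Core API calls
from the second bullet. Either way the list comes from outside Solr,
which is exactly the fragility described above.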
I have to say that we do not use the "latest" Solr version - we use a
version of Solr based on 4.0.0. So there might be a solution already in
Solr, but I would be surprised.
Any thoughts about how this "ought" to be done? Support in Solr? E.g. an
"operation" to tell a Solr node to load all the replicas that used to
run on a machine with the same IP and hostname? Or...?
Regards, Per Steffensen