Ok, thanks. I think we will just reconstruct solr.xml (from
clusterstate.json) ourselves for now.
On 6/18/13 2:15 PM, Mark Miller wrote:
I don't know what the best method to use now is, but the slightly longer term
plan is to:
* Have a new mode where you cannot preconfigure cores, only use the
collection's API.
* ZK becomes the cluster state truth.
* The Overseer takes actions to ensure cores live/die in different places based
on the truth in ZK.
- Mark
On Jun 18, 2013, at 6:03 AM, Per Steffensen <[email protected]> wrote:
Hi
Scenario:
* 1) You have a Solr cloud cluster running - several Solr nodes across several
machines, many collections with many replicas, and documents indexed into them
* 2) One of the machines running a Solr node crashes completely - totally gone,
including the local disk with the Solr node's data/config etc.
* 3) You want to be able to bring in a new, empty machine, install/configure Solr
on it, give it the same IP and hostname as the crashed machine had, and then
start this new Solr node so that it takes the place of the crashed one, making
the Solr cloud cluster work again
* 4) There is no replication (only one replica per shard), so we accept that the
data on the crashed machine is gone forever, but of course we want the Solr
cloud cluster to continue running with the documents indexed on the other Solr
nodes
At my company we are establishing a procedure for what to do in 3) above.
Basically we use our "install script" to install/configure the new Solr node on the new
machine exactly as it was originally installed/configured on the crashed machine when the
system was first set up - this includes an "empty" solr.xml file (no cores mentioned). We
then start all the Solr nodes (including the newly reestablished one) again. They all start
successfully, but the Solr cloud cluster does not work - at least not for distributed searches
touching replicas that used to run on the crashed Solr node, because those replicas are not
loaded on the reestablished node.
How do we make sure that a reestablished Solr node, on a machine with the same IP
and hostname as the machine that crashed, will load all the replicas that the old
Solr node used to run?
Potential solutions
* We have tried making sure that the solr.xml on the reestablished Solr node
contains the same core list as the one on the crashed node. Then everything works as we want.
But this is a little fragile, and it is a solution "outside" Solr - you need to
figure out how to reestablish the solr.xml yourself, probably by looking
into clusterstate.json and generating the solr.xml from it.
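As a sketch of that reconstruction, a small script along the following lines could read clusterstate.json and emit a legacy-style solr.xml. The clusterstate.json layout and the solr.xml attributes below are assumptions based on the 4.x era (shards/replicas keyed maps, a <cores> element with per-core name/collection/shard attributes) and should be checked against your actual version:

```python
import json


def cores_for_node(clusterstate, node_name):
    """Collect (core, collection, shard) triples for replicas that were
    assigned to the given node, assuming a 4.x-style clusterstate.json:
    {collection: {"shards": {shard: {"replicas": {id: {...}}}}}}."""
    cores = []
    for collection, coll_state in clusterstate.items():
        for shard, shard_state in coll_state.get("shards", {}).items():
            for replica in shard_state.get("replicas", {}).values():
                if replica.get("node_name") == node_name:
                    cores.append((replica["core"], collection, shard))
    return cores


def render_solr_xml(cores):
    """Render a legacy-style (pre-5.x) solr.xml with one <core> entry per
    replica; instanceDir is assumed to equal the core name here."""
    lines = ['<solr persistent="true">',
             '  <cores adminPath="/admin/cores">']
    for core, collection, shard in cores:
        lines.append('    <core name="%s" instanceDir="%s" '
                     'collection="%s" shard="%s"/>'
                     % (core, core, collection, shard))
    lines.append('  </cores>')
    lines.append('</solr>')
    return "\n".join(lines)
```

The clusterstate.json itself would be fetched from ZooKeeper (e.g. via the zkCli script) before feeding it to such a script.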
* Untested by us: we might also succeed by running Core Admin API LOAD operations
against the newly reestablished Solr node - one LOAD operation for each replica that
used to run on it. But this is also a little fragile, and it is also (partly) a
solution "outside" Solr - you need to figure out yourself which cores to load.
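Building those per-replica Core Admin requests could look roughly like this. Whether LOAD is actually the right action (as opposed to, say, CREATE with collection/shard parameters) depends on the Solr version, so the action name here is just a parameter, not a claim about the API:

```python
from urllib.parse import urlencode


def core_admin_urls(base_url, cores, action="CREATE"):
    """Build one Core Admin request URL per replica that used to live on
    the crashed node. `cores` is a list of (core, collection, shard)
    triples; the action name is an assumption to verify against your
    Solr version's Core Admin API."""
    urls = []
    for core, collection, shard in cores:
        params = urlencode({"action": action,
                            "name": core,
                            "collection": collection,
                            "shard": shard})
        urls.append("%s/admin/cores?%s" % (base_url, params))
    return urls
```

Each URL would then be hit with a plain HTTP GET (curl or an HTTP client) against the reestablished node.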
I have to say that we do not use the "latest" Solr version - we use a version
based on 4.0.0. So there might already be a solution in Solr, but I would be
surprised.
Any thoughts about how this "ought" to be done? Support in Solr? E.g. an
"operation" telling a Solr node to load all the replicas that used to run on a
machine with the same IP and hostname? Or...?
Regards, Per Steffensen
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]