Hi,

Re "ZK becomes the cluster state truth." - I thought that was already the case, no? Who/what else holds (which) bits of the total truth?

Thanks,
Otis

On Tue, Jun 18, 2013 at 8:15 AM, Mark Miller <[email protected]> wrote:
> I don't know what the best method to use now is, but the slightly longer term
> plan is to:
>
> * Have a new mode where you cannot preconfigure cores, only use the
> Collections API.
> * ZK becomes the cluster state truth.
> * The Overseer takes actions to ensure cores live/die in different places
> based on the truth in ZK.
>
> - Mark
>
> On Jun 18, 2013, at 6:03 AM, Per Steffensen <[email protected]> wrote:
>
>> Hi
>>
>> Scenario:
>> * 1) You have a Solr cloud cluster running - several Solr nodes across
>> several machines - many collections with many replicas and documents indexed
>> into them.
>> * 2) One of the machines running a Solr node crashes completely - totally
>> gone, including the local disk with the data/config etc. of the Solr node.
>> * 3) You want to be able to insert a new, empty machine, install/configure
>> Solr on this new machine, give it the same IP and hostname as the crashed
>> machine had, and then start this new Solr node and have it take the place
>> of the crashed Solr node, making the Solr cloud cluster work again.
>> * 4) No replication (only one replica per shard), so we accept that the
>> data on the crashed machine is gone forever, but of course we want the Solr
>> cloud cluster to continue running with the documents indexed on the other
>> Solr nodes.
>>
>> At my company we are establishing a procedure for what to do in 3) above.
>>
>> Basically we use our "install script" to install/configure the new Solr node
>> on the new machine as it was originally installed/configured on the crashed
>> machine back when the system was originally set up - this includes an
>> "empty" solr.xml file (no cores mentioned). Then we start all the Solr nodes
>> (including the newly reestablished one) again.
>> They all start successfully, but
>> the Solr cloud cluster does not work - at least not when doing distributed
>> searches touching replicas that used to run on the crashed Solr node, because
>> those replicas are not loaded on the reestablished node.
>>
>> How do we make sure that a reestablished Solr node, on a machine with the same
>> IP and hostname as the machine that crashed, will load all the replicas that
>> the old Solr node used to run?
>>
>> Potential solutions:
>> * We have tried to make sure that the solr.xml on the reestablished Solr
>> node contains the same core list as on the crashed one. Then everything
>> works as we want. But this is a little fragile, and it is a solution
>> "outside" Solr - you need to figure out how to reestablish the solr.xml
>> yourself - probably something like looking into clusterstate.json and
>> generating the solr.xml from that.
>> * Untested by us: Maybe we would also succeed by just running Core API LOAD
>> operations against the newly reestablished Solr node - a LOAD operation for
>> each replica that used to run on the Solr node. But this is also a little
>> fragile, and it is also (partly) a solution "outside" Solr - you need to
>> figure out which cores to load yourself.
>>
>> I have to say that we do not use the "latest" Solr version - we use a
>> version of Solr based on 4.0.0. So there might be a solution already in
>> Solr, but I would be surprised.
>>
>> Any thoughts about how this "ought" to be done? Support in Solr? E.g. an
>> "operation" to tell a Solr node to load all the replicas that used to run on
>> a machine with the same IP and hostname? Or...?
>>
>> Regards, Per Steffensen
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [email protected]
>> For additional commands, e-mail: [email protected]
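The "generate the solr.xml from clusterstate.json" idea Per describes could be sketched roughly as below. This is a minimal sketch only, assuming the Solr 4.x clusterstate.json layout (collection -> "shards" -> "replicas", each replica carrying "core" and "node_name" properties); the function names `cores_for_node` and `solr_xml_for` are hypothetical, not part of any Solr API:

```python
import json

def cores_for_node(clusterstate_json, node_name):
    # Walk a 4.x-style clusterstate.json string and collect
    # (core, collection, shard) for every replica placed on the
    # given node; node_name typically looks like "host:8983_solr".
    state = json.loads(clusterstate_json)
    cores = []
    for collection, coll_state in state.items():
        for shard, shard_state in coll_state.get("shards", {}).items():
            for replica in shard_state.get("replicas", {}).values():
                if replica.get("node_name") == node_name:
                    cores.append((replica["core"], collection, shard))
    return cores

def solr_xml_for(cores):
    # Render a minimal pre-5.x <cores>-style solr.xml listing the cores,
    # to be dropped onto the reestablished node before startup.
    lines = ['<solr persistent="true">',
             '  <cores adminPath="/admin/cores">']
    for core, collection, shard in cores:
        lines.append('    <core name="%s" instanceDir="%s" '
                     'collection="%s" shard="%s"/>'
                     % (core, core, collection, shard))
    lines += ['  </cores>', '</solr>']
    return '\n'.join(lines)
```

The same core list could instead drive per-core Core Admin API calls (the "LOAD operation for each replica" variant Per mentions) rather than a generated solr.xml; either way the source of truth is the clusterstate.json read from ZooKeeper.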
