Re: Solr Master-Slave fail-over across multiple data-centers

Daniel Collins Fri, 13 Jun 2014 02:57:13 -0700

Why do you need to swap the replicas from one master to another?

If you have a cross-DC database that ensures both Masters are in sync, why
not just tie SolrSlave-B1 and SolrSlave-B2 to SolrMaster-B at all times?
Then you don't have any fail-over to do at all.

We have multiple DCs and a similar setup (though a bit larger, 16 machines
per DC comprising 4 replicas of the collection) and we do exactly that.  So
we have 2 "independent" Solr Clouds, but we feed them from a single input
stream, so they should be in sync (except commit times might vary slightly
from replica to replica).  Users query whichever replica is nearest/least
loaded, to minimize cross-DC traffic.

But then for us, availability beats consistency: we'd rather have a working
cloud if one DC dies, even if it is slightly inconsistent.  For us, that's
better (it's an NRT system) than the alternative.  If we do lose a DC, we'll
have to manually sync back up before we bring it on-line for users, but
that's a price we are willing to pay.

On 13 June 2014 00:52, Arcadius Ahouansou <arcad...@menelic.com> wrote:

> Hello.
>
> - We currently have Solr 4 in master-slave mode across 2 DataCenters.
>
> - We are planning to run the system in active-active mode, meaning that
> search requests will go to Solr Slaves in both DC-A and DC-B.
>
> - We have a highly available, cross-DC database that feeds the
> SolrMaster in both DCs. So, both Solr Masters are being kept up-to-date.
>
> - In order to allow all slaves in both DCs to have the very same index
> version, we have come up with the idea of having multiple masterUrl on each
> slave, i.e. masterUrl=masterUrl-A,masterUrl-B (and this is the main point of
> this post)
>
> - When both DCs are available, only masterUrl-A is used for fetching the
> index and the topology would look like the one shown at
> https://www.dropbox.com/s/4vqdx70af5ddn69/master-slave-failover.png
>
> - In case the worst happens and we lose DC-A, the slaves in DC-B will get
> network errors like NoRouteToHost or ConnectionTimeout.
>
> - After a few attempts, the slaves will switch to using the next URL in the
> masterUrl variable, which would be masterUrl-B
>
> - This should work pretty well and when DC-A becomes available, we could
> issue a REST API call to reset the masterUrl or restart the master in DC-B
> and slaves in DC-B should switch back to using masterUrl-A.
>
> - I would like to gather your thoughts about this idea.
>
> - If this makes sense, I could raise a Jira ticket to enable multiple
> masterUrl and the fail-over principle described here.
>
> Thank you very much.
>
> Arcadius.
>
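
The fail-over rule proposed in the quoted message (poll the first masterUrl,
move to the next one after a few consecutive network errors, and go back to
the first on an explicit reset) could be sketched roughly as below.  This is
only an illustration of the idea, not Solr code: as the proposal itself
notes, multiple masterUrl support would be a new feature.  The names
MasterSelector, poll_once, max_failures and the fetch callback are all
hypothetical, and the threshold of 3 is an arbitrary stand-in for "a few
attempts".

```python
from typing import Callable, Sequence


class MasterSelector:
    """Hypothetical helper that decides which master URL a slave polls."""

    def __init__(self, master_urls: Sequence[str], max_failures: int = 3) -> None:
        if not master_urls:
            raise ValueError("at least one master URL is required")
        self.master_urls = list(master_urls)
        self.max_failures = max_failures  # assumed meaning of "a few attempts"
        self.index = 0      # position of the master currently in use
        self.failures = 0   # consecutive failures against that master

    @property
    def current(self) -> str:
        return self.master_urls[self.index]

    def record_success(self) -> None:
        self.failures = 0

    def record_failure(self) -> None:
        self.failures += 1
        has_next = self.index + 1 < len(self.master_urls)
        if self.failures >= self.max_failures and has_next:
            # Too many consecutive errors: move on to the next master URL.
            self.index += 1
            self.failures = 0

    def reset(self) -> None:
        # Models the "reset the masterUrl" call made once DC-A is back.
        self.index = 0
        self.failures = 0


def poll_once(selector: MasterSelector, fetch: Callable[[str], None]) -> bool:
    """Try one index fetch from the current master; return True on success."""
    try:
        fetch(selector.current)
    except OSError:
        # ConnectionError and TimeoutError are both subclasses of OSError.
        selector.record_failure()
        return False
    selector.record_success()
    return True
```

A slave would call poll_once on every poll interval, passing whatever
function actually fetches the index.  Note that the sketch only moves forward
through the list on its own; switching back to masterUrl-A needs an explicit
reset(), matching the manual step described in the proposal.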
