Hi Amit,

It sounds like you need separate ES clusters (one per DC) and a way to feed the data into them all consistently.

I happened to scan-read the tribe node documentation - it looks like it could work well for reads, but (IIRC) it will not do writes.
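From a quick look at those docs, a tribe node is configured in elasticsearch.yml simply by listing the clusters it should join - roughly like this (the "dc1"/"dc2" names here are placeholders, adapted from the docs Michael linked below):

    tribe:
        dc1:
            cluster.name: cluster_dc1
        dc2:
            cluster.name: cluster_dc2

The tribe node merges the cluster states and can search across both clusters, but (as above) it isn't a solution for the writes.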
For the writes, I suspect you want some message-passing system (e.g. RabbitMQ) or Redis (acting as a cache). If you were shipping (system) logs then something like Logstash would help with the interfacing; however, I suspect that is not the case, so you would need to find the integration (between the message-passing system / Redis and ES) that fits your setup. In short, you could probably use something like tribe nodes for the reads and a message-passing/proxy system for the writes.

Cheers,
Ivan

On 22/02/2014 18:32, Amit Soni wrote:
> Hello Michael - understood that ES is not built to maintain consistent
> cluster state across data centers. What I am wondering is whether there
> is a way for Elasticsearch to continue to replicate data onto a
> different data center (with some delay, of course) so that when the
> primary center fails, the failover data center still has most of the
> data (except perhaps for the last few seconds/minutes/hours).
>
> Overall I am looking for the right way to implement a cross-data-center
> deployment of Elasticsearch!
>
> -Amit.
>
>
> On Fri, Feb 21, 2014 at 9:37 AM, Michael Sick
> <[email protected]> wrote:
>
>     Dario,
>
>     I believe that you're looking for tribe nodes:
>     http://www.elasticsearch.org/guide/en/elasticsearch/reference/master/modules-tribe.html
>
>     ES is not built to cluster consistently across DCs / larger network
>     lags.
>
>     On Fri, Feb 21, 2014 at 11:24 AM, Dario Rossi
>     <[email protected]> wrote:
>
>         Hi,
>         I have the following problem: our application publishes content
>         to an Elasticsearch cluster. We use local data-less nodes for
>         querying Elasticsearch, so we don't use the HTTP REST API and
>         the local nodes act as the load balancer. Now there is a new
>         requirement to have the cluster replicated to another data
>         center too (and maybe more in the future) for resilience.
>
>         At the very beginning we thought of having one large cluster
>         that spans data centers (crazy). This solution has the
>         following problems:
>
>         - The cluster has the split-brain problem (!)
>         - The client data-less nodes will try to send requests across
>           different data centers (is there a solution to this???). I
>           can't find a way to avoid this. We don't want this to happen
>           because of a) latency and b) firewalling issues.
>
>         So we started to think that this solution is not really viable,
>         and thought instead of having one cluster per data center, which
>         seems more sensible. But then we have the problem that we must
>         publish data to all clusters and, if one fails, we have no means
>         of rolling back (unless we set up a complicated version-based
>         rollback system). I find this very complicated and hard to
>         maintain, although it may be somewhat doable.
>
>         My biggest problem is that we have to keep the data centers in
>         the same state at all times, so that if one goes down, we can
>         readily switch to the other.
>
>         Any ideas, or can you recommend some approach to help us deal
>         with this?

--
Ivan Beveridge
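P.S. To make the write side a bit more concrete, here is a rough per-DC consumer sketch, assuming RabbitMQ (via the pika client, pre-1.0 API) and the elasticsearch-py client - the queue name, hosts and the "id" field are all made up for illustration, not a recommendation of specific libraries:

    import json

    import pika
    from elasticsearch import Elasticsearch

    # Index into the cluster that is local to this DC; every DC runs its
    # own copy of this consumer against a replicated/mirrored queue.
    es = Elasticsearch(["localhost:9200"])

    def handle(channel, method, properties, body):
        doc = json.loads(body)
        # Reuse the producer-assigned id so redeliveries are idempotent.
        es.index(index="content", doc_type="item", id=doc["id"], body=doc)
        channel.basic_ack(delivery_tag=method.delivery_tag)

    connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    channel = connection.channel()
    channel.queue_declare(queue="es-writes", durable=True)
    channel.basic_consume(handle, queue="es-writes")
    channel.start_consuming()

If a cluster goes down, its consumer simply stops acking and the queue buffers the writes until it catches up again - which also sidesteps the rollback problem Dario mentioned.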
