Have you seen the Tribe Node? It gives you a kind of "merged state" multi-master cluster.
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-tribe.html

Jörg

On Fri, Jun 27, 2014 at 1:39 AM, Matthew Parrott wrote:

> Hi!
>
> Have there been any further explorations in the area of WAN replication?
>
> I have ES clusters in multiple datacenters connected via a high-speed
> private network. I'm wondering if multi-master replication would be
> possible in this environment or if we'd need some type of 'shovel' plugin
> like the one described here to ship data between the DCs.
>
> Thanks,
> Matthew
>
> On Tuesday, July 23, 2013 10:06:10 AM UTC-7, Jörg Prante wrote:
>
>> Yes, I once examined Kafka and discovered that many of its components
>> are already there in Elasticsearch. For example, the activity stream is
>> already there as the ES translog (if you focus on indexing operations),
>> and the ES gateway is a useful persistence mechanism. What I didn't like
>> was the single Kafka JVM and the ZooKeeper infrastructure; it all adds
>> complexity besides ES.
>>
>> For cross-cluster replication, I think the best approach is distributed
>> log replication. This is hard, because logged ES operations must be
>> synchronized by an external time source (e.g. vector clocks) to use them
>> like a global event stream. A pubsub mechanism could then work at the
>> primary shards of an index in the ES node as a service, merging the
>> translogs for an external agent that previously subscribed to the
>> replication stream. The vector clock is required for distributed
>> time-machine-like behavior (snapshots), assuming the translog is not
>> deleted but stored for a certain time window.
>>
>> Jörg
>>
>> On Tue, Jul 23, 2013 at 3:55 PM, Vinicius Carvalho wrote:
>>
>>> Thanks again Jörg. Just so you know, I'm actually considering using
>>> Kafka for inter-cluster replication. We want to push the index
>>> operations to a topic, and then other clusters in different DCs would
>>> subscribe to it. Conflict resolution will be last commit wins. In case
>>> of a Kafka cluster failure, we will append changes to a local index and
>>> send them over once the bus is back. If an ES cluster dies, one nice
>>> thing about Kafka is that one can request messages based on an offset,
>>> so when the cluster recovers we could resume consuming messages from
>>> the last point at which it had consumed them.
>>>
>>> These are all ideas I'm working on right now. I'll probably have time
>>> to start coding them soon. Thanks for all the support :)
>>>
>>> Cheers

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAKdsXoFpFnUvDrVdcui2opE3iju%3DzL%3DPTCMH8RXLVX0E4%2BetMQ%40mail.gmail.com.
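For readers following the quoted Kafka proposal, here is a minimal sketch of the two ideas it combines: "last commit wins" conflict resolution and resuming consumption from a stored offset after a restart. It is an illustration under stated assumptions, not anyone's actual implementation: `IndexOp`, `ReplicaApplier`, and the `committed_at` field are hypothetical names, a plain Python list stands in for a Kafka topic partition, and no Kafka or Elasticsearch client is used.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class IndexOp:
    """One replicated index operation (hypothetical message format)."""
    doc_id: str
    source: Optional[dict]  # None represents a delete (kept as a tombstone)
    committed_at: float     # commit timestamp assigned by the originating cluster


class ReplicaApplier:
    """Applies operations from a shared log to a local copy of the index."""

    def __init__(self) -> None:
        self.docs: Dict[str, IndexOp] = {}  # latest winning op per document
        self.next_offset = 0                # first log position not yet consumed

    def apply(self, op: IndexOp) -> None:
        # Last commit wins: keep the op with the newest commit timestamp.
        # On a tie, the op that arrives later in the log replaces the old one.
        current = self.docs.get(op.doc_id)
        if current is None or op.committed_at >= current.committed_at:
            self.docs[op.doc_id] = op

    def consume(self, log: List[IndexOp]) -> int:
        # Resume from the stored offset, as a consumer would after a restart.
        start = self.next_offset
        for op in log[start:]:
            self.apply(op)
        self.next_offset = len(log)
        return self.next_offset - start  # number of ops applied in this call


# The list plays the role of the topic; ops may arrive out of commit order.
log = [
    IndexOp("1", {"title": "from DC A"}, committed_at=10.0),
    IndexOp("1", {"title": "from DC B"}, committed_at=5.0),  # older, loses
]
replica = ReplicaApplier()
replica.consume(log)

# Later, more ops are appended; only the new ones are read.
log.append(IndexOp("1", None, committed_at=20.0))  # delete wins as newest
replica.consume(log)
```

Note that timestamp-based "last commit wins" depends on the clocks of the originating clusters being comparable, which is the ordering problem the earlier reply raises when it mentions vector clocks.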
