Hi all,

Since my requirements for x-datacenter replication are intertwined with discussions of the various approaches, I was asked to lay them out independently of any solution.
- I have 3 data centers. I'm modeling both session and resource data in my data grid usage. Resources are modeled on physical devices.
- There are no backing stores on any of the caches. We're operating as a true PaaS model.
- In normal usage, this data is geographically partitioned, so there won't be concurrent writes *on the same key* across the three sites.
- Requesting clients will be routed to their "local" data center. These clients look up and cache the load balancer address of their local data center. A network failure of a data center will cause an immediate failover. However, because a legacy vendor provides the client, a fail-back can take ~5 minutes as clients slowly bleed off their backup data center. Because of this, between-site consistency must not be lost for transactional writes.
- My customer wants all sessions and resources requested by these clients to be "owned" by the local data center, with replicas of the data on BOTH backup data centers.
- The sessions are mostly ephemeral data and are a good candidate for async replication. The resource data is highly transactional and MUST be replicated synchronously. This does not really present a problem for my customer, as the round-trip time between data centers is at most 70ms.
- My customer wants a data center outage to be as minimally obtrusive as possible (which really goes without saying). In addition, there are a lot of planned outages as well. Right now, this means the clients would be routed to the next proximate data center to slowly take the affected data center "out of service."
- Any state transfer between nodes *should* be localized to the data center. Cross-data center traffic should be minimized as much as possible. Furthermore, the "local" data center should have 2 replicas of a key, while the backup data centers each have 1.

Here are the design challenges with the proposed solutions, as I see them.

#1. Virtual Cluster
- RELAY needs to support > 2 sites. Like 3 =)
- We would need a custom CH that can place data into its local site.
- We would need a configuration that extends numOwners into primary and backup sites.
- In the current state transfer model, a node joining means that all the nodes from all the sites are involved in rehashing. The rehash operation should be enhanced in this case to prefer the local nodes as state providers. I know that this cannot necessarily be guaranteed, as a loss of multiple nodes at a site might mean the state only exists in another data center. But with a minimum of 10M records, I can't afford the existing limitations on state transfer. (And it goes without saying that NBST is an absolute must.)
- A data center outage will result in a rehash in this model. My customer doesn't want this. If we can write to 2 of the 3 data centers, we can always retransmit the replicas once the data center comes back online.
- Conversely, this model DOES handle the case where there are concurrent writes due to clients migrating to their data center. This simply results in more latency for those requests during that period, as they would have to be routed to their owner for processing.

#2. Hot Rod
This isn't an option, as I have confirmed I cannot open up this many ports. Plus Hot Rod is missing many features I take advantage of in embedded mode.

#3. Cluster Bridge
- Most of the design challenges in #1 are greatly mitigated by #3 because the sites are independent clusters: there is no need for a custom CH, state transfer only affects one data center, etc.
- However, the largest challenge here is guaranteeing that cross-data center writes remain consistent during the period in which the clients migrate.
- There is also the question of the management of the cluster. My customer, in particular, would want to manage the aggregate of all three clusters.

Hopefully this facilitates design discussion for the meetings this week. Let me know if you need anything else.
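P.S. To make the custom-CH requirement under #1 concrete, here is a minimal sketch of the placement rule we'd need: the primary plus one backup in the key's local site, and one backup in each remote site. All names here (`SiteAwarePlacement`, `owners`, the DC/node labels) are hypothetical illustrations, not Infinispan API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of site-aware owner selection; none of these
// types are real Infinispan classes.
public class SiteAwarePlacement {

    // Picks 2 owners from the key's local site and 1 from each backup
    // site, walking each site's node list from a hash-derived offset.
    static List<String> owners(int keyHash, String localSite,
                               Map<String, List<String>> nodesBySite) {
        List<String> owners = new ArrayList<>();
        for (Map.Entry<String, List<String>> site : nodesBySite.entrySet()) {
            List<String> nodes = site.getValue();
            int copies = site.getKey().equals(localSite) ? 2 : 1;
            int start = Math.abs(keyHash) % nodes.size();
            for (int i = 0; i < copies; i++) {
                owners.add(nodes.get((start + i) % nodes.size()));
            }
        }
        return owners;
    }

    public static void main(String[] args) {
        // Insertion-ordered map so the owner list is deterministic.
        Map<String, List<String>> topology = new LinkedHashMap<>();
        topology.put("DC1", List.of("dc1-a", "dc1-b", "dc1-c"));
        topology.put("DC2", List.of("dc2-a", "dc2-b"));
        topology.put("DC3", List.of("dc3-a", "dc3-b"));
        // A key owned by DC1: 2 replicas there, 1 in each backup site.
        System.out.println(owners(42, "DC1", topology));
        // prints [dc1-a, dc1-b, dc2-a, dc3-a]
    }
}
```

This also shows why "extend numOwners into primary and backup sites" is really two knobs (copies-per-local-site and copies-per-backup-site) rather than one global count.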
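And for the rehash enhancement under #1, the "prefer local state providers" rule amounts to a one-line preference with a cross-site fallback. Again a hypothetical sketch under assumed names, not Infinispan's actual state transfer code:

```java
import java.util.List;
import java.util.Optional;

// Hypothetical sketch (not Infinispan API): when a joiner needs a
// segment's state, prefer an owner in its own data center so the
// transfer stays on the local network; fall back to a remote owner
// only when the local site has lost all of its replicas.
public class LocalStateProvider {

    record Node(String name, String site) {}

    static Node pickProvider(String joinerSite, List<Node> ownersOfSegment) {
        Optional<Node> local = ownersOfSegment.stream()
                .filter(n -> n.site().equals(joinerSite))
                .findFirst();
        // The cross-site pull is the documented fallback, not the norm.
        return local.orElse(ownersOfSegment.get(0));
    }

    public static void main(String[] args) {
        List<Node> owners = List.of(
                new Node("dc2-a", "DC2"),
                new Node("dc1-b", "DC1"));
        // A DC1 joiner pulls from dc1-b even though dc2-a is listed first.
        System.out.println(pickProvider("DC1", owners).name()); // prints dc1-b
    }
}
```

As noted above, this can't be guaranteed: if a site loses all replicas of a segment, the fallback necessarily crosses data centers.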
Erik

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Mircea Markus
Sent: Thursday, February 16, 2012 9:53 AM
To: infinispan -Dev List
Subject: Re: [infinispan-dev] Use cases for x-datacenter replication

> Yes, this makes sense.
>
> Although I doubt you will lose an entire data center. But an
> intermediate switch can always crash, making the DC inaccessible, which
> is the same problem for clients as if the entire DC crashed. Also, if
> we implement the follow-the-sun approach, then clients need to be
> switched over gracefully from one DC to another one, so this
> functionality definitely needs to be provided.
>
> We probably need to differentiate between a crash-failover and a
> graceful failover; in the latter case, clients can be switched over to
> the new DC within a certain time frame, so we minimize load spikes that
> might be caused by switching over all clients at exactly the same time.
>
> I added your comment to
> https://community.jboss.org/wiki/CrossDatacenterReplication-Design, so
> we can discuss this next week (we have a meeting on cross-data center
> clustering WED-FRI in London).
> Keep the suggestions coming!

+100

_______________________________________________
infinispan-dev mailing list
[email protected]
https://lists.jboss.org/mailman/listinfo/infinispan-dev
