On 05/30/2017 09:06 PM, Jay Pipes wrote:
On 05/30/2017 05:07 PM, Clint Byrum wrote:
Excerpts from Jay Pipes's message of 2017-05-30 14:52:01 -0400:
Sorry for the delay in getting back on this... comments inline.
On 05/18/2017 06:13 PM, Adrian Turjak wrote:
Hello fellow OpenStackers,
For the last while I've been looking at options for multi-region
multi-master Keystone, as well as multi-master for other services I've
been developing and one thing that always came up was there aren't many
truly good options for a true multi-master backend.
Not sure whether you've looked into Galera? We had a geo-distributed
12-site Galera cluster servicing our Keystone assignment/identity
information WAN-replicated. Worked a charm for us at AT&T. Much easier
to administer than master-slave replication topologies and the
performance (yes, even over WAN links) of the wsrep replication was
excellent. And yes, I'm aware Galera doesn't have complete snapshot
isolation support, but for Keystone's workloads (heavy, heavy read, very
little write) it is indeed ideal.
This has not been my experience.
We had a 3-site, 9-node global cluster and it was _extremely_ sensitive
to latency. We'd lose even read ability whenever we had a latency storm
due to quorum problems.
Our sites were London, Dallas, and Sydney, so it was pretty common for
there to be latency between any of them.
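
To make the quorum failure mode above concrete, here is a minimal sketch of
majority-based quorum for the 3-site, 9-node layout described. It assumes
every node carries equal weight and that a partition stays usable only if it
holds a strict majority of all nodes; the three-nodes-per-site split and the
function names are illustrative assumptions, not Galera's actual API.

```python
# Hypothetical layout: 3 sites x 3 nodes (the per-site split is assumed).
SITES = {"London": 3, "Dallas": 3, "Sydney": 3}
TOTAL_NODES = sum(SITES.values())


def has_quorum(reachable_nodes: int, total_nodes: int = TOTAL_NODES) -> bool:
    """Return True if a partition holds a strict majority of the cluster."""
    return 2 * reachable_nodes > total_nodes


def surviving_partition(groups):
    """Return the group of mutually reachable sites that keeps quorum.

    `groups` is a list of lists of site names. Returns None when no
    group holds a majority, i.e. the whole cluster stops serving.
    """
    for group in groups:
        if has_quorum(sum(SITES[site] for site in group)):
            return group
    return None
```

If one site is cut off, the other two keep 6 of 9 nodes and stay up. If a
latency storm isolates all three sites from each other, each holds only 3 of
9 nodes, no partition has a majority, and every node stops serving, which
would be consistent with losing even read ability.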
I lost track of it after some reorgs, but I believe the solution was
to just have a single-site 3-node Galera for writes, and then use async
replication for reads. We even helped land patches in Keystone to allow
split read/write host configuration.
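
A rough sketch of what a split read/write host setup amounts to: writes go
to the single-site cluster, reads go to an asynchronously replicated copy.
The class, method, and URLs below are hypothetical and are not taken from
Keystone or its patches; real deployments would configure this in the
database layer rather than by inspecting SQL text.

```python
from dataclasses import dataclass

# Statement verbs treated as read-only in this sketch (an assumption).
READ_VERBS = frozenset({"SELECT", "SHOW"})


@dataclass(frozen=True)
class SplitRouter:
    """Pick a database URL depending on whether a statement reads or writes."""

    write_url: str  # hypothetical: the single-site 3-node cluster
    read_url: str   # hypothetical: a local async replica

    def url_for(self, statement: str) -> str:
        words = statement.split()
        verb = words[0].upper() if words else ""
        # Anything not clearly a read goes to the write cluster.
        return self.read_url if verb in READ_VERBS else self.write_url


router = SplitRouter(
    write_url="mysql+pymysql://keystone@writer.example.org/keystone",
    read_url="mysql+pymysql://keystone@replica.example.org/keystone",
)
```

The trade-off is that an async replica can lag behind the writer, so a read
issued right after a write may return stale data.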
Interesting, thanks for the info. Can I ask, were you using the Galera
cluster for read-heavy data like Keystone identity/assignment storage?
Or did you have write-heavy data mixed in (like Keystone's old UUID
token storage...)?
I'd also throw in, there are lots of versions of Galera with different
bugfixes / improvements as we go along, not to mention configuration
settings.... if Jay observes it working great on a distributed cluster
and Clint observes it working terribly, it could be that these were not
the same Galera versions being used.
It should be noted that CockroachDB's documentation specifically calls
out that it is extremely sensitive to latency due to the way it measures
clock skew... so might not be suitable for WAN-separated clusters?
Best,
-jay
__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: [email protected]?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev