On 10/09/2018 06:34 AM, Florian Engelmann wrote:
On 10/9/18 at 11:41 AM, Jay Pipes wrote:
On 10/09/2018 04:34 AM, Christian Berendt wrote:
On 8. Oct 2018, at 19:48, Jay Pipes <jaypi...@gmail.com> wrote:
Why not send all read and all write traffic to a single haproxy
endpoint and just have haproxy spread all traffic across each Galera
node?
Galera, after all, is multi-master synchronous replication... so it
shouldn't matter which node in the Galera cluster you send traffic to.
Probably because of MySQL deadlocks in Galera:
--snip--
Galera cluster has known limitations, one of them is that it uses
cluster-wide optimistic locking. This may cause some transactions to
rollback. With an increasing number of writeable masters, the
transaction rollback rate may increase, especially if there is write
contention on the same dataset. It is of course possible to retry the
transaction and perhaps it will COMMIT in the retries, but this will
add to the transaction latency. However, some designs are deadlock
prone, e.g sequence tables.
--snap--
Source:
https://severalnines.com/resources/tutorials/mysql-load-balancing-haproxy-tutorial
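For illustration, here is a minimal Python sketch of the retry approach the quoted tutorial mentions. Galera reports a certification conflict to the client as a MySQL deadlock (error 1213), so the client can catch that error and re-run the whole transaction. DeadlockError and run_with_retry are hypothetical names used only for this sketch; a real database driver raises its own exception type, which would have to be mapped onto it.

```python
import random
import time

ER_LOCK_DEADLOCK = 1213  # MySQL error code reported for a deadlock


class DeadlockError(Exception):
    """Hypothetical stand-in for a driver's deadlock exception (error 1213)."""


def run_with_retry(txn, max_attempts=5, base_delay=0.01, sleep=time.sleep):
    """Run txn(), retrying the whole transaction when it hits a deadlock.

    txn is a callable that performs BEGIN ... COMMIT and returns a result.
    The delay between attempts grows exponentially, with random jitter so
    that competing writers do not retry in lockstep.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return txn()
        except DeadlockError:
            if attempt == max_attempts:
                raise  # give up and let the caller see the deadlock
            sleep(base_delay * (2 ** (attempt - 1)) * random.random())
```

Every retry adds its own round trip plus the backoff delay, which is the extra transaction latency the quote warns about.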
Have you seen the above in production?
Yes of course. Just depends on the application and how high the workload
gets.
Please read about deadlocks and Nova in the following report by Intel:
http://galeracluster.com/wp-content/uploads/2017/06/performance_analysis_and_tuning_in_china_mobiles_openstack_production_cloud_2.pdf
I have read the above. It's a synthetic workload analysis, which is why
I asked if you'd seen this in production.
For the record, we addressed much of the contention/races mentioned in
the above around scheduler resource consumption in the Ocata and Pike
releases of Nova.
I'm aware that the report above identifies the quota handling code in
Nova as the primary culprit of the deadlock issues but again, it's a
synthetic workload that is designed to find breaking points. It doesn't
represent a realistic production workload.
You can read about the deadlock issue in depth on my blog here:
http://www.joinfu.com/2015/01/understanding-reservations-concurrency-locking-in-nova/
That post explains where the problem comes from: the use of SELECT FOR
UPDATE, which has been removed from Nova's quota-handling code in the
Rocky release.
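As a rough illustration of how a read-modify-write can avoid SELECT FOR UPDATE, here is a compare-and-swap sketch in Python. It shows the general pattern only and is not taken from Nova: the quota_usages table, its columns, and the reserve and make_demo_db helpers are hypothetical, and sqlite3 stands in for MySQL/Galera so the example is self-contained.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with one hypothetical quota row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE quota_usages ("
        "resource TEXT PRIMARY KEY, in_use INTEGER, hard_limit INTEGER)"
    )
    conn.execute("INSERT INTO quota_usages VALUES ('instances', 0, 10)")
    conn.commit()
    return conn


def reserve(conn, resource, amount, max_attempts=5):
    """Reserve `amount` units without holding a row lock across the read."""
    for _ in range(max_attempts):
        row = conn.execute(
            "SELECT in_use, hard_limit FROM quota_usages WHERE resource = ?",
            (resource,),
        ).fetchone()
        if row is None:
            raise KeyError(resource)
        in_use, hard_limit = row
        if in_use + amount > hard_limit:
            return False  # over quota, nothing written
        # The UPDATE only matches if in_use is still the value we read.
        cur = conn.execute(
            "UPDATE quota_usages SET in_use = ? "
            "WHERE resource = ? AND in_use = ?",
            (in_use + amount, resource, in_use),
        )
        conn.commit()
        if cur.rowcount == 1:
            return True  # our write won
        # Another writer changed the row first; re-read and try again.
    raise RuntimeError("too much contention on " + resource)
```

With this pattern a concurrent writer causes a cheap re-read rather than a lock wait, at the cost of a retry loop in the application.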
If just Nova is affected we could also create an additional HAProxy
listener using all Galera nodes with round-robin for all other services?
I fail to see the point of using Galera with a single writer. At that
point, why bother with Galera at all? Just use a single database node
with a single slave for backup purposes.
Anyway - proxySQL would be a great extension.
I don't disagree that proxySQL is a good extension. However, it adds yet
another service to the mesh that needs to be deployed, configured and
maintained.
Best,
-jay
__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)