On 04/30/2016 10:50 AM, Clint Byrum wrote:
Excerpts from Roman Podoliaka's message of 2016-04-29 12:04:49 -0700:


I'm curious why you think setting wsrep_sync_wait=1 wouldn't help.

The exact example appears in the Galera documentation:

http://galeracluster.com/documentation-webpages/mysqlwsrepoptions.html#wsrep-sync-wait

The moment you say 'SET SESSION wsrep_sync_wait=1', the behavior should
prevent the list problem you see, and it should not matter that it is
a separate session, as that is the entire point of the variable:


We prefer to keep it off and just point applications at a single node using master/passive/passive in HAProxy, so that we don't take the unnecessary performance hit of waiting for all transactions to propagate; we just stick to one node at a time. We've fixed a lot of issues in our config to ensure that HAProxy definitely keeps all clients on exactly one Galera node at a time.


"When you enable this parameter, the node triggers causality checks in
response to certain types of queries. During the check, the node blocks
new queries while the database server catches up with all updates made
in the cluster to the point where the check was begun. Once it reaches
this point, the node executes the original query."

In the active/passive case where you never use the passive node as a
read slave, one could actually set wsrep_sync_wait=1 globally. This will
cause a ton of lag while new queries happen on the new active and old
transactions are still being applied, but that's exactly what you want,
so that when you fail over, nothing proceeds until all writes from the
original active node are applied and available on the new active node.
It would help if your failover technology actually _breaks_ connections
to a presumed dead node, so writes stop happening on the old one.

If HAProxy is failing over from the master, which is no longer reachable, to another passive node, which is reachable, that means the master is partitioned and will leave the Galera primary component. It also means all current database connections are going to be dropped, which will cause errors for those clients either in the middle of an operation, or when a pooled connection is reused before it is known that the connection has been reset. So from a database client perspective, failover is usually not an error-free situation in any case, and retry schemes are always going to be needed.
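
To be concrete about what I mean by a retry scheme, here's a minimal sketch in plain SQLAlchemy; the DSN, table, and helper names are made up for illustration, and a real deployment would lean on oslo.db's retry helpers and tighter exception filtering rather than this:

    import time

    from sqlalchemy import create_engine, exc, text

    # hypothetical DSN pointing at the HAProxy frontend
    engine = create_engine("mysql+pymysql://nova:secret@haproxy-vip/nova")

    def run_with_retry(work, retries=3, delay=1):
        # retry the whole logical operation when the connection is
        # dropped out from under us, e.g. during an HAProxy failover
        for attempt in range(retries):
            try:
                with engine.begin() as conn:
                    return work(conn)
            except exc.DBAPIError as err:
                # connection_invalidated is set when SQLAlchemy detects
                # that the connection was lost mid-operation
                if not err.connection_invalidated or attempt == retries - 1:
                    raise
                time.sleep(delay)

    def mark_active(conn):
        conn.execute(
            text("UPDATE instances SET vm_state = 'active' WHERE uuid = :u"),
            {"u": "some-uuid"})

    run_with_retry(mark_active)

The important part is that the retry wraps the whole logical operation, not just the statement that happened to fail.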

Additionally, the purpose of the enginefacade [1] is to allow OpenStack applications to fix their often incorrectly written database access logic so that, in many (most?) cases, a single logical operation is no longer unnecessarily split across multiple transactions. I know that this is not always feasible where multiple web requests are coordinating, however.
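
For example, with the enginefacade decorators, nested calls join the transaction already in progress instead of each opening their own; this is just a rough sketch with a toy model and no engine configuration (normally handled through oslo.config), so take the details loosely:

    from oslo_db.sqlalchemy import enginefacade
    from sqlalchemy import Column, Integer, String
    from sqlalchemy.ext.declarative import declarative_base

    Base = declarative_base()

    class Instance(Base):
        # toy model, for illustration only
        __tablename__ = 'instances'
        id = Column(Integer, primary_key=True)
        uuid = Column(String(36))
        host = Column(String(255))

    @enginefacade.transaction_context_provider
    class Context(object):
        pass

    @enginefacade.writer
    def create_instance(context, uuid):
        # joins the caller's transaction if one is already in progress
        instance = Instance(uuid=uuid)
        context.session.add(instance)
        return instance

    @enginefacade.writer
    def schedule_instance(context, uuid, host):
        # both steps of this logical operation share one transaction,
        # so they land on the database as a single atomic unit
        instance = create_instance(context, uuid)
        instance.host = host

The point is that the unit of work is defined at the top of the call chain, rather than each helper committing on its own.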

That leaves only the very infrequent scenario where the master has finished sending a write set off, the passives haven't finished committing that write set, the master goes down, and HAProxy fails over to one of the passives; the application then happens to connect fresh onto that new passive node to perform the next operation, one that relies upon the previously committed data, so it sees no database error and instead runs straight onto a node where the committed data it's expecting hasn't arrived yet.

I can't judge for all applications whether this scenario can be handled like any other transient error that occurs during a failover, but if there is a case where it can't, then IMO wsrep_sync_wait (formerly known as wsrep_causal_reads) may be used on a per-transaction basis for that very critical, not-retryable-even-during-failover operation. Allowing this variable to be set for the scope of a transaction and reset afterwards, and only when talking to Galera, is something we've planned to work into the enginefacade as well, as a declarative transaction attribute that would be a pass-through on other systems.
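
Until that declarative attribute exists, doing it by hand for one critical read looks roughly like this (illustrative names and DSN; the session variable is toggled explicitly and only for the one operation that truly can't tolerate a stale read):

    from sqlalchemy import create_engine, text

    # illustrative DSN; in practice this points at the HAProxy frontend
    engine = create_engine("mysql+pymysql://nova:secret@haproxy-vip/nova")

    def critical_read(uuid):
        with engine.connect() as conn:
            # ask Galera to block this query until the node has applied
            # every write set committed in the cluster before it began
            conn.execute(text("SET SESSION wsrep_sync_wait = 1"))
            try:
                return conn.execute(
                    text("SELECT host FROM instances WHERE uuid = :u"),
                    {"u": uuid}).first()
            finally:
                # restore the cheap default before the connection goes
                # back into the pool
                conn.execute(text("SET SESSION wsrep_sync_wait = 0"))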

[1] https://specs.openstack.org/openstack/oslo-specs/specs/kilo/make-enginefacade-a-facade.html



Also, if you thrash back and forth a bit, that could cause your app to
virtually freeze, but HAProxy and most other failover technologies allow
tuning timings so that you can stay off of a passive server long enough
to calm it down and fail more gracefully to it.

Anyway, this is why sometimes I do wonder if we'd be better off just
using MySQL with DRBD and good old pacemaker.


