[Bug 1896223] Re: [FFe] this is the no-quorum-policy feature (specially for mssql server)

Rafael David Tinoco Tue, 22 Sep 2020 12:41:54 -0700

The feature was announced by the following e-mail:

-------- Forwarded Message --------
Subject: [ClusterLabs] Coming in Pacemaker 2.0.5: on-fail=demote / no-quorum-policy=demote
Date: Mon, 10 Aug 2020 11:47:24 -0500
From: Ken Gaillot <[email protected]>
Reply-To: Cluster Labs - All topics related to open-source clustering welcomed <[email protected]>
Organization: Red Hat
To: Cluster Labs - All topics related to open-source clustering welcomed <[email protected]>


Hi all,

Looking ahead to the Pacemaker 2.0.5 release expected at the end of
this year, here is a new feature already in the master branch.

When configuring resource operations, Pacemaker lets you set an "on-fail"
policy to specify whether to restart the resource, fence the
node, etc., if the operation fails. With 2.0.5, a new possible value
will be "demote", which will mean "demote this resource but do not
fully restart it".

"Demote" will be a valid value only for promote actions, and for
recurring monitors with "role" set to "Master".
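
As a rough illustration of that validity rule, here is a minimal Python
sketch. It is a simplified model, not Pacemaker's own code; the function
name and parameters are hypothetical, and it assumes a monitor counts as
recurring when its interval is non-zero.

```python
from typing import Optional


def demote_allowed(op_name: str, interval_s: int = 0,
                   role: Optional[str] = None) -> bool:
    """Return True if on-fail="demote" would be valid for this operation.

    Hypothetical helper modelling the rule stated above; it is not part
    of Pacemaker.
    """
    # Promote actions may always use on-fail="demote".
    if op_name == "promote":
        return True
    # Otherwise only recurring monitors with role set to "Master" qualify.
    return op_name == "monitor" and interval_s > 0 and role == "Master"
```

Under this model, demote_allowed("monitor", 10, "Master") is accepted,
while a stop operation or a monitor without role "Master" is rejected.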

Once the resource is demoted, it will be eligible for promotion again,
so if the promotion scores have not changed, a promote on the same node
may be attempted. If this is not desired, the agent can change the
promotion scores either in the failed monitor or the demote.

The intended use case is an application where a successful demote assures
a well-functioning service, and a full restart would be unnecessarily
heavyweight. A large database might be an example.

Similarly, Pacemaker offers the cluster-wide "no-quorum-policy" option
to specify what happens to resources when quorum is lost (the default
being to stop them). With 2.0.5, "demote" will be a possible value here
as well, and will mean "demote all promotable resources and stop all
other resources".

The intended use case is an application that cannot cause any harm
after being demoted, and may be useful in a demoted role even if there
is no quorum. A database that operates read-only when demoted and
doesn't depend on any non-promotable resources might be an example.
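
The effect of no-quorum-policy=demote on a set of resources can be
sketched in Python as follows. This is only a simplified model under
stated assumptions, not the scheduler's implementation: the function name
is hypothetical, and each resource is reduced to a name plus a flag saying
whether it is promotable.

```python
from typing import Dict


def plan_on_quorum_loss(promotable: Dict[str, bool]) -> Dict[str, str]:
    """Map each resource name to its action when quorum is lost.

    Hypothetical model of no-quorum-policy=demote: promotable resources
    are demoted, all other resources are stopped.
    """
    return {
        name: "demote" if is_promotable else "stop"
        for name, is_promotable in promotable.items()
    }


# Example: a promotable database and a plain (non-promotable) virtual IP.
plan = plan_on_quorum_loss({"db": True, "vip": False})
# plan == {"db": "demote", "vip": "stop"}
```

The example also shows why the use case calls for a database that does
not depend on non-promotable resources: under this policy those resources
are stopped.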

Happy clustering :)
--
Ken Gaillot <[email protected]>

** Description changed:

  In bug:
  
  https://bugs.launchpad.net/ubuntu/+source/pacemaker/+bug/1895883
  
  I did the stabilization patches for pacemaker 2.0.4, and I did not
  merge this feature so it could have its own FFe (this bug).
  
  The following patch-set:
  
  * #### this is the no-quorum-policy feature microsoft needs for mssqlserver
  |\
  | * c4429d86e - Log: scheduler: downgrade "active on" messages to trace (3 mont
  | * 7eec572db - Build: libcrmcommon: bump CRM feature set (3 months ago) <Ken G
  | * 01c5ec67e - Low: scheduler: match initial no-quorum-policy struct value to
  | * 015b5c012 - Doc: Pacemaker Explained: document no-quorum-policy=demote (3 m
  | * 5d809e136 - Test: scheduler: add regression test for no-quorum-policy="demo
  | * b1ae35938 - Feature: scheduler: support "demote" choice for no-quorum-polic
  | * 0b6834453 - Refactor: scheduler: functionize checking quorum policy in effe
  | * d4b9117e7 - Doc: Pacemaker Explained: correct on-fail default (3 months ago
  | * 204961e95 - Doc: Pacemaker Explained: document new on-fail="demote" option
  | * d29433ea5 - Test: scheduler: add regression tests for on-fail="demote" (3 m
  | * 874f75e0f - Feature: scheduler: new on-fail="demote" recovery policy for pr
  | * 2f1e2df1f - Feature: xml: add on-fail="demote" option to resources schema (
  | * fd55a6660 - Doc: libpacemaker: improve comments for resource state and acti
  | * 98c3b649f - Log: libpacemaker: check for re-promotes specifically (3 months
  | * ff6aebecf - Doc: libpacemaker: improve comments when logging actions (3 mon
  | * f2d244bc4 - Test: scheduler: test forcing a restart instead of reload when
  | * a4d6a20a9 - Low: libpacemaker: don't force stop when skipping reload of fai
  | * 8dceba792 - Refactor: scheduler: use more appropriate types in a couple pla
  | * ef246ff05 - Fix: scheduler: disallow on-fail=stop for stop operations (3 mo
  | * f1f71b3f3 - Refactor: scheduler: functionize comparing on-fail values (3 mo
  *
  
  implements the:
  
  new on-fail="demote" recovery policy
  
  needed for large Microsoft SQL Server installations. MSSQL Server does not
  use pacemaker for fencing, but it does need the promote/demote features
  to coordinate its HA cluster. This request came directly from Microsoft
  after the feature was developed upstream.
  
+ I'm particularly interested in this feature because Microsoft SQL Server
+ can make use of it. MSSQL Server uses pacemaker to manage which node
+ is the primary node, but it does not need fencing mechanisms, and this
+ feature allows it to simply promote/demote the nodes. Without it, large
+ clusters with MSSQL Server installed might face long downtime because of
+ fencing and the recovery of database archive logs.
+ 
  Note: This will help me when I try to backport the feature to Focal.

** Also affects: pacemaker (Ubuntu Focal)
   Importance: Undecided
       Status: New

** Changed in: pacemaker (Ubuntu Focal)
   Importance: Undecided => Wishlist

** Changed in: pacemaker (Ubuntu)
   Importance: Undecided => High

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1896223

Title:
  [FFe] this is the no-quorum-policy feature (specially for mssql
  server)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/pacemaker/+bug/1896223/+subscriptions

-- 
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
