Re: [Cluster-devel] [PATCH 2/2] checkquorum.wdmd: add integration script with wdmd

2012-10-10 Thread Fabio M. Di Nitto
On 10/10/2012 1:04 PM, Heiko Nardmann wrote:
> On 10.10.2012 10:11, Fabio M. Di Nitto wrote:
>> [snip]
>> that doesn't scale well for Debian derivatives that don't ship
>> debian_version :) (see Ubuntu & co.)
>>
>> You can't even use something like "which dpkg", since the tool is
>> available on rpm-based distributions... or vice versa: there is rpm for
>> Debian & derivatives.
>>
>> Hardcoding all distributions is not optimal either, as they might change
>> policy by version.
>>
>> Fabio
>>
> 
> What about 'lsb_release'? Is that executable available on all platforms?


Not installed by default; it's generally shipped with the $distro-lsb
metapackage, which pulls in half a gazillion dependencies.

I doubt it would solve anything, since you still need to parse the
output. It's really no different from hardcoding /etc/$distro_release,
just with a few GB of extra packages ;)

Fabio
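
A minimal sketch of the point about parsing, assuming lsb_release is present
and that "lsb_release -si" prints a distributor ID. The helper name
conf_dir_for_distro and the list of IDs are hypothetical and deliberately
incomplete; the sketch only illustrates that a per-distribution table is
still needed even when lsb_release is available:

```shell
#!/bin/bash
# Hypothetical helper: map a distributor ID (as printed by "lsb_release -si")
# to the directory that conventionally holds service defaults.
# The ID list is illustrative and incomplete; every new derivative
# would need its own entry, which is the hardcoding problem again.
conf_dir_for_distro() {
    case "$1" in
        Debian|Ubuntu)
            echo /etc/default
            ;;
        Fedora|CentOS|RedHatEnterpriseServer)
            echo /etc/sysconfig
            ;;
        *)
            # Unknown distribution: nothing sensible to return.
            return 1
            ;;
    esac
}

# lsb_release may not be installed; the ID is simply empty in that case.
distro_id="$(lsb_release -si 2>/dev/null)"
conf_dir="$(conf_dir_for_distro "$distro_id")" || conf_dir=""
echo "distributor='${distro_id}' conf_dir='${conf_dir}'"
```

The case statement is the same kind of table as a list of
/etc/$distro_release files, only keyed on a different string.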



Re: [Cluster-devel] [PATCH 2/2] checkquorum.wdmd: add integration script with wdmd

2012-10-10 Thread Heiko Nardmann
On 10.10.2012 10:11, Fabio M. Di Nitto wrote:
> [snip]
> that doesn't scale well for Debian derivatives that don't ship
> debian_version :) (see Ubuntu & co.)
>
> You can't even use something like "which dpkg", since the tool is
> available on rpm-based distributions... or vice versa: there is rpm for
> Debian & derivatives.
>
> Hardcoding all distributions is not optimal either, as they might change
> policy by version.
>
> Fabio
>

What about 'lsb_release'? Is that executable available on all platforms?

Heiko



Re: [Cluster-devel] [PATCH 2/2] checkquorum.wdmd: add integration script with wdmd

2012-10-10 Thread Dietmar Maurer
> > [ -f /etc/debian_version ] && [ -d /etc/default ]
> >
> 
> that doesn't scale well for Debian derivatives that don't ship debian_version :)
> (see Ubuntu & co.)
> 
> You can't even use something like "which dpkg", since the tool is available on
> rpm-based distributions... or vice versa: there is rpm for Debian & derivatives.
> 
> Hardcoding all distributions is not optimal either, as they might change
> policy by version.

OK, I can see the problem now.

- Dietmar




Re: [Cluster-devel] [PATCH 2/2] checkquorum.wdmd: add integration script with wdmd

2012-10-10 Thread Fabio M. Di Nitto
On 10/10/2012 10:06 AM, Dietmar Maurer wrote:
>> On 10/10/2012 6:26 AM, Dietmar Maurer wrote:
>>>> +# rpm based distros
>>>> +[ -d /etc/sysconfig ] && \
>>>> +  [ -f /etc/sysconfig/checkquorum ] && \
>>>> +  . /etc/sysconfig/checkquorum
>>>> +
>>>> +# deb based distros
>>>> +[ ! -d /etc/sysconfig ] && \
>>>> +  [ -f /etc/default/checkquorum ] && \
>>>> +  . /etc/default/checkquorum
>>>> +
>>>
>>> FYI: Some RAID tool vendors deliver utilities for Debian which create the
>>> directory '/etc/sysconfig' on Debian boxes, so that test is not reliable.
>>>
>>>
>>
>> This might be a controversial argument.
> 
> I just thought there are better tests to see if you run on Debian, for example:
> 
> [ -f /etc/debian_version ] && [ -d /etc/default ]
> 

that doesn't scale well for Debian derivatives that don't ship
debian_version :) (see Ubuntu & co.)

You can't even use something like "which dpkg", since the tool is
available on rpm-based distributions... or vice versa: there is rpm for
Debian & derivatives.

Hardcoding all distributions is not optimal either, as they might change
policy by version.

Fabio
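
One way to sidestep distribution detection altogether can be sketched as
follows. This is not what the patch does: it simply tries both config
locations in a fixed order and sources the first file that exists. The
helper name load_first_config and the search order are assumptions made
for illustration only:

```shell
#!/bin/bash
# Hypothetical helper: source the first readable config file from a list of
# candidates instead of inferring the packaging family from directory names.
load_first_config() {
    local candidate
    for candidate in "$@"; do
        if [ -f "$candidate" ] && [ -r "$candidate" ]; then
            . "$candidate"
            return 0
        fi
    done
    # No config file found: the caller keeps its built-in defaults.
    return 1
}

# Defaults, as in the patch.
waittime=60
action=autodetect

# The order is an assumption: /etc/sysconfig first, then /etc/default.
load_first_config /etc/sysconfig/checkquorum /etc/default/checkquorum || true
echo "waittime=$waittime action=$action"
```

With this approach a stray /etc/sysconfig directory on a Debian box does
not matter, because only the presence of the checkquorum file itself is
tested. Which location should win when both files exist remains a policy
decision.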



Re: [Cluster-devel] [PATCH 2/2] checkquorum.wdmd: add integration script with wdmd

2012-10-10 Thread Dietmar Maurer
> Anyway, examples, setups, limitations... all in the doc as soon as it's
> ready. Be a bit patient :)

OK (I am just curious) - many thanks for your fast answers!

- Dietmar




Re: [Cluster-devel] [PATCH 2/2] checkquorum.wdmd: add integration script with wdmd

2012-10-10 Thread Dietmar Maurer
> On 10/10/2012 6:26 AM, Dietmar Maurer wrote:
> >> +# rpm based distros
> >> +[ -d /etc/sysconfig ] && \
> >> +  [ -f /etc/sysconfig/checkquorum ] && \
> >> +  . /etc/sysconfig/checkquorum
> >> +
> >> +# deb based distros
> >> +[ ! -d /etc/sysconfig ] && \
> >> +  [ -f /etc/default/checkquorum ] && \
> >> +  . /etc/default/checkquorum
> >> +
> >
> > FYI: Some RAID tool vendors deliver utilities for Debian which create the
> > directory '/etc/sysconfig' on Debian boxes, so that test is not reliable.
> >
> >
> 
> This might be a controversial argument.

I just thought there are better tests to see if you run on Debian, for example:

[ -f /etc/debian_version ] && [ -d /etc/default ]








Re: [Cluster-devel] [PATCH 2/2] checkquorum.wdmd: add integration script with wdmd

2012-10-10 Thread Fabio M. Di Nitto
On 10/10/2012 6:33 AM, Dietmar Maurer wrote:
> Will you add some documentation on how to use those scripts?

Yes, our documentation overlord is preparing an upstream wiki page for
it. It will be ready before a release.

> 
> It seems those scripts do not check whether the node has joined the fence domain?
> 

It doesn't really need to.

I'll put this in the simplest way possible:

- real fencing == murder
  there can only be one killer in the cluster at a time
  the fence domain coordinates who can/should be killed by whom

- checkquorum.wdmd == suicide
  there are N nodes in the cluster that can decide to commit suicide
  without really caring about what the others are doing.
  this can run without any fencing configuration at all.

Anyway, examples, setups, limitations... all in the doc as soon as
it's ready. Be a bit patient :)

Fabio
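
Until the documentation is available, a config fragment can be sketched
from the defaults in the patch. The variable names (waittime, action,
timerfile) and the list of actions come from the patch itself; the values
chosen here are purely illustrative and not a recommendation:

```shell
# /etc/sysconfig/checkquorum (or /etc/default/checkquorum on deb-based systems)
# This file is sourced by the script, so it must be valid shell.

# Seconds to wait after quorum is lost before acting (patch default: 60).
waittime=120

# One of: autodetect, hardreboot, crashdump, watchdog (patch default: autodetect).
action=watchdog

# Where the script records the last time the node was quorate (patch default).
timerfile="/var/run/cluster/checkquorum-timer"
```

No fencing-related setting appears here, which matches the statement above
that the script can run without any fencing configuration.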



Re: [Cluster-devel] [PATCH 2/2] checkquorum.wdmd: add integration script with wdmd

2012-10-10 Thread Fabio M. Di Nitto
On 10/10/2012 6:26 AM, Dietmar Maurer wrote:
>> +# rpm based distros
>> +[ -d /etc/sysconfig ] && \
>> +[ -f /etc/sysconfig/checkquorum ] && \
>> +. /etc/sysconfig/checkquorum
>> +
>> +# deb based distros
>> +[ ! -d /etc/sysconfig ] && \
>> +[ -f /etc/default/checkquorum ] && \
>> +. /etc/default/checkquorum
>> +
> 
> FYI: Some RAID tool vendors deliver utilities for Debian which create the
> directory '/etc/sysconfig' on Debian boxes, so that test is not reliable.
> 
> 

This might be a controversial argument.

Debian policy (1) defines the use of /etc/default as "should" (2) for
conffiles such as this one. On the other hand, it does not explicitly
forbid the use of sysconfig.

sysconfig is not found anywhere in the default Debian archive because
packages follow the formal *should* policy.

If third-party applications don't follow Debian packaging guidelines, it
is possible that they might break other components as well.

Of course we can argue on the definition of "should" forever and ever :)

As upstream we follow basic guidelines; distribution packagers and
porters should (pun intended ;)) make sure to provide us with porting
patches (that's also part of the Debian Maintainer duty).

Fabio

1) http://www.debian.org/doc/debian-policy/ch-opersys.html
   Section 9.3.2

"To ease the burden on the system administrator, such configurable values
 should not be placed directly in the script.
 Instead, they should be placed in a file in /etc/default, "

2) http://www.thefreedictionary.com/should
 should  (shd)
  aux.v. Past tense of shall
  1. Used to express obligation or duty






Re: [Cluster-devel] [PATCH 2/2] checkquorum.wdmd: add integration script with wdmd

2012-10-09 Thread Dietmar Maurer
Will you add some documentation on how to use those scripts?

It seems those scripts do not check whether the node has joined the fence domain?

> -----Original Message-----
> From: cluster-devel-boun...@redhat.com [mailto:cluster-devel-
> boun...@redhat.com] On Behalf Of Fabio M. Di Nitto
> Sent: Tuesday, 09 October 2012 11:36
> To: cluster-devel@redhat.com
> Cc: Fabio M. Di Nitto
> Subject: [Cluster-devel] [PATCH 2/2] checkquorum.wdmd: add integration
> script with wdmd
> 
> From: "Fabio M. Di Nitto" 
> 




Re: [Cluster-devel] [PATCH 2/2] checkquorum.wdmd: add integration script with wdmd

2012-10-09 Thread Dietmar Maurer
> +# rpm based distros
> +[ -d /etc/sysconfig ] && \
> + [ -f /etc/sysconfig/checkquorum ] && \
> + . /etc/sysconfig/checkquorum
> +
> +# deb based distros
> +[ ! -d /etc/sysconfig ] && \
> + [ -f /etc/default/checkquorum ] && \
> + . /etc/default/checkquorum
> +

FYI: Some RAID tool vendors deliver utilities for Debian which create the
directory '/etc/sysconfig' on Debian boxes, so that test is not reliable.





[Cluster-devel] [PATCH 2/2] checkquorum.wdmd: add integration script with wdmd

2012-10-09 Thread Fabio M. Di Nitto
From: "Fabio M. Di Nitto" 

requires wdmd >= 2.6

Resolves: rhbz#509056

Signed-off-by: Fabio M. Di Nitto 
---
 cman/scripts/Makefile |2 +-
 cman/scripts/checkquorum.wdmd |  104 +
 2 files changed, 105 insertions(+), 1 deletions(-)
 create mode 100644 cman/scripts/checkquorum.wdmd

diff --git a/cman/scripts/Makefile b/cman/scripts/Makefile
index b4866c8..7950311 100644
--- a/cman/scripts/Makefile
+++ b/cman/scripts/Makefile
@@ -1,4 +1,4 @@
-SHAREDIRTEX=checkquorum
+SHAREDIRTEX=checkquorum checkquorum.wdmd
 
 include ../../make/defines.mk
 include $(OBJDIR)/make/clean.mk
diff --git a/cman/scripts/checkquorum.wdmd b/cman/scripts/checkquorum.wdmd
new file mode 100644
index 0000000..1d81ff6
--- /dev/null
+++ b/cman/scripts/checkquorum.wdmd
@@ -0,0 +1,104 @@
+#!/bin/bash
+# Quorum detection watchdog script
+#
+# This script will return -2 if the node had quorum at one point
+# and then subsequently lost it
+#
+# Copyright 2012 Red Hat, Inc.
+
+# defaults
+
+# Amount of time in seconds to wait after quorum is lost to fail script
+waittime=60
+
+# action to take if quorum is missing for longer than waittime
+# autodetect|hardreboot|crashdump|watchdog
+action=autodetect
+
+# Location of temporary file to capture timeouts
+timerfile="/var/run/cluster/checkquorum-timer"
+
+# rpm based distros
+[ -d /etc/sysconfig ] && \
+   [ -f /etc/sysconfig/checkquorum ] && \
+   . /etc/sysconfig/checkquorum
+
+# deb based distros
+[ ! -d /etc/sysconfig ] && \
+   [ -f /etc/default/checkquorum ] && \
+   . /etc/default/checkquorum
+
+has_quorum() {
+   corosync-quorumtool -s 2>/dev/null | \
+   grep ^Quorate: | \
+   grep -q Yes$
+}
+
+had_quorum() {
+   output="$(corosync-objctl 2>/dev/null | \
+   grep runtime.totem.pg.mrp.srp.operational_entered | cut -d "=" -f 2)"
+   [ -n "$output" ] && {
+   [ "$output" -ge 1 ] && return 0
+   return 1
+   }
+}
+
+take_action() {
+   case "$action" in
+   watchdog)
+   [ -n "$wdmd_action" ] && return 1
+   ;;
+   hardreboot)
+   echo 1 > /proc/sys/kernel/sysrq
+   echo b > /proc/sysrq-trigger
+   ;;
+   crashdump)
+   echo 1 > /proc/sys/kernel/sysrq
+   echo c > /proc/sysrq-trigger
+   ;;
+   autodetect)
+   service kdump status > /dev/null 2>&1
+   usekexec="$?"
+   [ -n "$wdmd_action" ] && [ "$usekexec" != "0" ] && return 1
+   echo 1 > /proc/sys/kernel/sysrq
+   [ "$usekexec" = "0" ] && echo c > /proc/sysrq-trigger
+   echo b > /proc/sysrq-trigger
+   esac
+}
+
+# watchdog uses $1 = test or = repair
+# with no arguments we are called by wdmd
+[ -z "$1" ] && wdmd_action=yes
+
+# we don't support watchdog repair action
+[ "$1" = "repair" ] && exit 1
+
+service corosync status > /dev/null 2>&1
+ret=$?
+
+case "$ret" in
+   3) # corosync is not running (clean)
+   rm -f "$timerfile"
+   exit 0
+   ;;
+   1) # corosync crashed or exited abnormally (dirty - take action)
+   logger -t checkquorum.wdmd "corosync crashed or exited abnormally. Node will soon reboot"
+   take_action
+   ;;
+   0) # corosync is running (clean)
+   # check quorum here
+   has_quorum && {
+   echo -e "oldtime=$(date +%s)" > "$timerfile"
+   exit 0
+   }
+   . "$timerfile"
+   newtime="$(date +%s)" 
+   delta=$((newtime - oldtime))
+   logger -t checkquorum.wdmd "Node has lost quorum. Node will soon reboot"
+   had_quorum && [ "$delta" -gt "$waittime" ] && {
+   take_action
+   }
+   ;;
+esac
+
+exit $?
-- 
1.7.7.6
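
The grace-period check in the "corosync is running" branch of the patch can
be isolated into a small function for illustration. The function name
quorum_timeout_expired is hypothetical; the arithmetic mirrors the patch
(delta = newtime - oldtime, act only when delta is strictly greater than
waittime), but this is a sketch and not part of the submitted script:

```shell
#!/bin/bash
# Hypothetical helper: succeed (return 0) when more than "waittime" seconds
# have passed since the last timestamp recorded while the node was quorate.
# Arguments: <oldtime> <newtime> <waittime>, all in seconds.
quorum_timeout_expired() {
    local oldtime="$1" newtime="$2" waittime="$3"
    local delta=$((newtime - oldtime))
    [ "$delta" -gt "$waittime" ]
}

# Example: quorum last seen 90 seconds ago, with the patch default of 60.
now="$(date +%s)"
if quorum_timeout_expired "$((now - 90))" "$now" 60; then
    echo "timeout expired: the script would call take_action"
else
    echo "still within the grace period"
fi
```

Because the comparison is strict, a delta exactly equal to waittime does
not yet trigger the action.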