Re: [ClusterLabs] RFC: allowing soft recovery attempts before ignore/block/etc.

2016-09-21 Thread Ken Gaillot
On 09/20/2016 07:51 PM, Andrew Beekhof wrote:
> 
> 
> On Wed, Sep 21, 2016 at 6:25 AM, Ken Gaillot wrote:
> 
> Hi everybody,
> 
> Currently, Pacemaker's on-fail property allows you to configure how the
> cluster reacts to operation failures. The default "restart" means try to
> restart on the same node, optionally moving to another node once
> migration-threshold is reached. Other possibilities are "ignore",
> "block", "stop", "fence", and "standby".
> 
> Occasionally, we get requests to have something like migration-threshold
> for values besides restart. For example, try restarting the resource on
> the same node 3 times, then fence.
> 
> I'd like to get your feedback on two alternative approaches we're
> considering.
> 
> ###
> 
> Our first proposed approach would add a new hard-fail-threshold
> operation property. If specified, the cluster would first try restarting
> the resource on the same node, 
> 
> 
> Well, just as now, it would be _allowed_ to start on the same node, but
> this is not guaranteed.
>  
> 
> before doing the on-fail handling.
> 
> For example, you could configure a promote operation with
> hard-fail-threshold=3 and on-fail=fence, to fence the node after 3
> failures.
> 
> 
> One point that's not settled is whether failures of *any* operation
> would count toward the 3 failures (which is how migration-threshold
> works now), or only failures of the specified operation.
> 
> 
> I think if hard-fail-threshold is per-op, then only failures of that
> operation should count.
>  
> 
> 
> Currently, if a start fails (but is retried successfully), then a
> promote fails (but is retried successfully), then a monitor fails, the
> resource will move to another node if migration-threshold=3. We could
> keep that behavior with hard-fail-threshold, or only count monitor
> failures toward monitor's hard-fail-threshold. Each alternative has
> advantages and disadvantages.
> 
> ###
> 
> The second proposed approach would add a new on-restart-fail resource
> property.
> 
> Same as now, on-fail set to anything but restart would be done
> immediately after the first failure. A new value, "ban", would
> immediately move the resource to another node. (on-fail=ban would behave
> like on-fail=restart with migration-threshold=1.)
> 
> When on-fail=restart, and restarting on the same node doesn't work, the
> cluster would do the on-restart-fail handling. on-restart-fail would
> allow the same values as on-fail (minus "restart"), and would default to
> "ban". 
> 
> 
> I do wish you well tracking "is this a restart" across demote -> stop ->
> start -> promote in 4 different transitions :-)
>  
> 
> 
> So, if you want to fence immediately after any promote failure, you
> would still configure on-fail=fence; if you want to try restarting a few
> times first, you would configure on-fail=restart and
> on-restart-fail=fence.
> 
> This approach keeps the current threshold behavior -- failures of any
> operation count toward the threshold. We'd rename migration-threshold to
> something like hard-fail-threshold, since it would apply to more than
> just migration, but unlike the first approach, it would stay a resource
> property.
> 
> ###
> 
> Comparing the two approaches, the first is more flexible, but also more
> complex and potentially confusing.
> 
> 
> More complex to implement or more complex to configure?

I was thinking more complex in behavior, so perhaps harder to follow and
predict.

For example, "After two start failures, fence this node; after three
promote failures, put the node in standby; but if a monitor failure is
the third operation failure of any type, then move the resource to
another node."

Granted, someone would have to inflict that on themselves :) but another
sysadmin / support tech / etc. who had to deal with the config later
might have trouble following it.

To keep the current default behavior, the default would be complicated,
too: "1 for start and stop operations, and 0 for other operations" where
"0 is equivalent to 1 except when on-fail=restart, in which case
migration-threshold will be used instead".

And then add to that tracking fail-count per node+resource+operation
combination, with the associated status output and cleanup options.
"crm_mon -f" currently shows failures like:

* Node node1:
   rsc1: migration-threshold=3 fail-count=1 last-failure='Wed Sep 21
15:12:59 2016'

What should that look like with per-op thresholds and fail-counts?
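
Just as a sketch of the problem (nothing like this exists today), per-operation
counters might end up looking something like:

* Node node1:
   rsc1: fail-count=3 last-failure='Wed Sep 21 15:12:59 2016'
      monitor: hard-fail-threshold=3 fail-count=2
      promote: hard-fail-threshold=1 fail-count=1

which gets verbose quickly, and crm_failcount / crm_resource --cleanup would
presumably need matching per-operation options as well.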

I'm not saying it's a bad idea, just that it's more complicated than it
first sounds, so it's worth thinking through the implications.

> With either approach, we would deprecate the start-failure-is-fatal
> cluster property. start-failure-is-fatal=true would be equivalent to
> 

Re: [ClusterLabs] RFC: allowing soft recovery attempts before ignore/block/etc.

2016-09-21 Thread Ken Gaillot
On 09/21/2016 02:23 AM, Kristoffer Grönlund wrote:
> First of all, is there a use case for when fence-after-3-failures is a
> useful behavior? I seem to recall some case where someone expected that
> to be the behavior and was surprised by how pacemaker works, but that
> problem wouldn't be helped by adding another option for them not to know
> about.

I think I've most often encountered it with ignore/block. Sometimes
users have one particular service that's buggy and not really important,
so they ignore errors (or block). But they would like to try restarting
it a few times first.

I think fence-after-3-failures would make as much sense as
fence-immediately. The idea behind restarting a few times then taking a
more drastic action is that restarting is for the case where the service
crashed or is in a buggy state, and if that doesn't work, maybe
something's wrong with the node.
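
Under the first proposal, that use case might look something like this
(hypothetical syntax, since hard-fail-threshold is only being proposed, and
the resource is just a placeholder):

pcs resource create buggy-svc ocf:heartbeat:Dummy op monitor interval=10s on-fail=block hard-fail-threshold=3

i.e. try restarting after the first couple of monitor failures, and only
block the resource once the threshold is reached.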

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] kind=Optional order constraint not working at startup

2016-09-21 Thread Auer, Jens
Hi,

> shared_fs has to wait for the DRBD promotion, but the other resources
> have no such limitation, so they are free to start before shared_fs.
Isn't there an implicit limitation by the ordering constraint? I have 
drbd_promote < shared_fs < snmpAgent-clone,
and I would expect this to be a transitive relationship.

> The problem is "... only impacts the startup procedure". Pacemaker
> doesn't distinguish start-up from any other state of the cluster. Nodes
> (and entire partitions of nodes) can come and go at any time, and any or
> all resources can be stopped and started again at any time, so
> "start-up" is not really as meaningful as it sounds.
> Maybe try an optional constraint of the other resources on the DRBD
> promotion. That would make it more likely that all the resources end up
> starting in the same transition.

What is the meaning of "transition"? Is there any way I can force resource 
actions into transitions?
I tried to group them but this doesn't work with cloned resources, and an 
ordered set
seems to use mandatory constraints and thus is not what I need.

I've added ordering constraints:
MDA1PFP-S01 14:46:42 3432 127 ~ # pcs constraint show --full
Location Constraints:
  Resource: mda-ip
Enabled on: MDA1PFP-PCS01 (score:50) (id:location-mda-ip-MDA1PFP-PCS01-50)
Constraint: location-mda-ip
  Rule: score=-INFINITY boolean-op=or  (id:location-mda-ip-rule)
Expression: pingd lt 1  (id:location-mda-ip-rule-expr)
Expression: not_defined pingd  (id:location-mda-ip-rule-expr-1)
Ordering Constraints:
  promote drbd1_sync then start shared_fs (kind:Mandatory) 
(id:order-drbd1_sync-shared_fs-mandatory)
  start shared_fs then start snmpAgent-clone (kind:Optional) 
(id:order-shared_fs-snmpAgent-clone-Optional)
  start shared_fs then start supervisor-clone (kind:Optional) 
(id:order-shared_fs-supervisor-clone-Optional)
  start shared_fs then start clusterSwitchNotification (kind:Mandatory) 
(id:order-shared_fs-clusterSwitchNotification-mandatory)
  start snmpAgent-clone then start supervisor-clone (kind:Optional) 
(id:order-snmpAgent-clone-supervisor-clone-Optional)
  start supervisor-clone then start clusterSwitchNotification (kind:Optional) 
(id:order-supervisor-clone-clusterSwitchNotification-Optional)
  promote drbd1_sync then start supervisor-clone (kind:Optional) 
(id:order-drbd1_sync-supervisor-clone-Optional)
  promote drbd1_sync then start clusterSwitchNotification (kind:Optional) 
(id:order-drbd1_sync-clusterSwitchNotification-Optional)
  promote drbd1_sync then start snmpAgent-clone (kind:Optional) 
(id:order-drbd1_sync-snmpAgent-clone-Optional)
Colocation Constraints:
  ACTIVE with mda-ip (score:INFINITY) (id:colocation-ACTIVE-mda-ip-INFINITY)
  drbd1_sync with mda-ip (score:INFINITY) (rsc-role:Master) 
(with-rsc-role:Started) (id:colocation-drbd1_sync-mda-ip-INFINITY)
  shared_fs with drbd1_sync (score:INFINITY) (rsc-role:Started) 
(with-rsc-role:Master) (id:colocation-shared_fs-drbd1_sync-INFINITY)
  clusterSwitchNotification with shared_fs (score:INFINITY) 
(id:colocation-clusterSwitchNotification-shared_fs-INFINITY)

but it still starts in the wrong order:
Sep 21 14:45:59 MDA1PFP-S01 crmd[3635]:  notice: Operation snmpAgent_start_0: 
ok (node=MDA1PFP-PCS01, call=39, rc=0, cib-update=45, confirmed=true)
Sep 21 14:45:59 MDA1PFP-S01 crmd[3635]:  notice: Operation drbd1_start_0: ok 
(node=MDA1PFP-PCS01, call=40, rc=0, cib-update=46, confirmed=true)
Sep 21 14:46:01 MDA1PFP-S01 crmd[3635]:  notice: Operation ping_start_0: ok 
(node=MDA1PFP-PCS01, call=38, rc=0, cib-update=48, confirmed=true)
Sep 21 14:46:01 MDA1PFP-S01 crmd[3635]:  notice: Operation supervisor_start_0: 
ok (node=MDA1PFP-PCS01, call=45, rc=0, cib-update=51, confirmed=true)
Sep 21 14:46:06 MDA1PFP-S01 crmd[3635]:  notice: Operation ACTIVE_start_0: ok 
(node=MDA1PFP-PCS01, call=48, rc=0, cib-update=57, confirmed=true)
Sep 21 14:46:06 MDA1PFP-S01 crmd[3635]:  notice: Operation mda-ip_start_0: ok 
(node=MDA1PFP-PCS01, call=47, rc=0, cib-update=59, confirmed=true)
Sep 21 14:46:06 MDA1PFP-S01 crmd[3635]:  notice: Operation shared_fs_start_0: 
ok (node=MDA1PFP-PCS01, call=55, rc=0, cib-update=62, confirmed=true)
Sep 21 14:46:06 MDA1PFP-S01 crmd[3635]:  notice: Operation 
clusterSwitchNotification_start_0: ok (node=MDA1PFP-PCS01, call=57, rc=0, 
cib-update=64, confirmed=true)

Best wishes,
  Jens

--
Jens Auer | CGI | Software-Engineer
CGI (Germany) GmbH & Co. KG
Rheinstraße 95 | 64295 Darmstadt | Germany
T: +49 6151 36860 154
jens.a...@cgi.com

[ClusterLabs] Authoritative corosync's location (Was: corosync-quorum tool, output name key on Name column if set?)

2016-09-21 Thread Jan Pokorný
On 21/09/16 09:16 +0200, Jan Friesse wrote:
> Thomas Lamprecht wrote:
>> I also have another, organizational question. I saw on the GitHub page for
>> corosync that pull requests are preferred there, and also that the
> 
> True

At this point, it's worth noting that ClusterLabs/corosync is
currently a stale fork of the corosync/corosync repository at GitHub,
which may be a source of confusion.

It would make sense to settle on a single one as the clearly
authoritative place to be in touch with (not sure what the options
are -- aliasing/transferring?).

-- 
Jan (Poki)




Re: [ClusterLabs] kind=Optional order constraint not working at startup

2016-09-21 Thread Ken Gaillot
On 09/21/2016 09:00 AM, Auer, Jens wrote:
> Hi,
> 
> could this be issue 5039 (http://bugs.clusterlabs.org/show_bug.cgi?id=5039)? 
> It sounds similar.

Correct -- "Optional" means honor the constraint only if both resources
are starting *in the same transition*.

shared_fs has to wait for the DRBD promotion, but the other resources
have no such limitation, so they are free to start before shared_fs.

The problem is "... only impacts the startup procedure". Pacemaker
doesn't distinguish start-up from any other state of the cluster. Nodes
(and entire partitions of nodes) can come and go at any time, and any or
all resources can be stopped and started again at any time, so
"start-up" is not really as meaningful as it sounds.

Maybe try an optional constraint of the other resources on the DRBD
promotion. That would make it more likely that all the resources end up
starting in the same transition.
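
For example, something along these lines (untested, using the resource names
from your configuration):

pcs constraint order promote drbd1_sync then start snmpAgent-clone kind=Optional
pcs constraint order promote drbd1_sync then start supervisor-clone kind=Optional
pcs constraint order promote drbd1_sync then start clusterSwitchNotification kind=Optional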

> Cheers,
>   Jens
> 
> --
> Jens Auer | CGI | Software-Engineer
> CGI (Germany) GmbH & Co. KG
> Rheinstraße 95 | 64295 Darmstadt | Germany
> T: +49 6151 36860 154
> jens.a...@cgi.com
> 
> 
> 
> From: Auer, Jens [jens.a...@cgi.com]
> Sent: Wednesday, 21 September 2016 15:10
> To: users@clusterlabs.org
> Subject: [ClusterLabs] kind=Optional order constraint not working at startup
> 
> Hi,
> 
> in my cluster setup I have a couple of resources of which I need to start
> some in a specific order. Basically I have two cloned resources that should
> start after mounting a DRBD filesystem on all nodes, plus one resource that
> starts after the clone sets. It is important that this only impacts the
> startup procedure. Once the system is running, stopping or starting one of the
> clone resources should not impact the other resource's state. From reading
> the manual, this should be what an order constraint with kind=Optional
> implements. However, when I start the cluster, the filesystem is started after
> the other resources, ignoring the ordering constraint.
> 
> My cluster configuration:
> pcs cluster setup --name MDA1PFP MDA1PFP-PCS01,MDA1PFP-S01 
> MDA1PFP-PCS02,MDA1PFP-S02
> pcs cluster start --all
> sleep 5
> crm_attribute --type nodes --node MDA1PFP-PCS01 --name ServerRole --update 
> PRIME
> crm_attribute --type nodes --node MDA1PFP-PCS02 --name ServerRole --update 
> BACKUP
> pcs property set stonith-enabled=false
> pcs resource defaults resource-stickiness=100
> 
> rm -f mda; pcs cluster cib mda
> pcs -f mda property set no-quorum-policy=ignore
> 
> pcs -f mda resource create mda-ip ocf:heartbeat:IPaddr2 ip=192.168.120.20 
> cidr_netmask=24 nic=bond0 op monitor interval=1s
> pcs -f mda constraint location mda-ip prefers MDA1PFP-PCS01=50
> pcs -f mda resource create ping ocf:pacemaker:ping dampen=5s multiplier=1000 
> host_list=pf-pep-dev-1  params timeout=1 attempts=3  op monitor interval=1 
> --clone
> pcs -f mda constraint location mda-ip rule score=-INFINITY pingd lt 1 or 
> not_defined pingd
> 
> pcs -f mda resource create ACTIVE ocf:heartbeat:dummy
> pcs -f mda constraint colocation add ACTIVE with mda-ip score=INFINITY
> 
> pcs -f mda resource create drbd1 ocf:linbit:drbd drbd_resource=shared_fs op 
> monitor interval=60s
> pcs -f mda resource master drbd1_sync drbd1 master-max=1 master-node-max=1 
> clone-max=2 clone-node-max=1 notify=true
> pcs -f mda constraint colocation add master drbd1_sync with mda-ip 
> score=INFINITY
> 
> pcs -f mda resource create shared_fs Filesystem device="/dev/drbd1" 
> directory=/shared_fs fstype="xfs"
> pcs -f mda constraint order promote drbd1_sync then start shared_fs
> pcs -f mda constraint colocation add shared_fs with master drbd1_sync 
> score=INFINITY
> 
> pcs -f mda resource create supervisor ocf:pfpep:supervisor params 
> config="/shared_fs/pfpep.ini" --clone
> pcs -f mda resource create snmpAgent ocf:pfpep:snmpAgent params 
> config="/shared_fs/pfpep.ini" --clone
> pcs -f mda resource create clusterSwitchNotification ocf:pfpep:clusterSwitch 
> params config="/shared_fs/pfpep.ini"
> 
> pcs -f mda constraint order start shared_fs then snmpAgent-clone  
> kind=Optional
> pcs -f mda constraint order start shared_fs then supervisor-clone 
> kind=Optional
> pcs -f mda constraint order start snmpAgent-clone then supervisor-clone 
> kind=Optional
> pcs -f mda constraint order start supervisor-clone then 
> clusterSwitchNotification kind=Optional
> pcs -f mda constraint colocation add clusterSwitchNotification with shared_fs 
> score=INFINITY
> 
> pcs cluster cib-push mda
> 
> The order of resource startup in the log file is:
> Sep 21 13:01:21 MDA1PFP-S01 crmd[2760]:  notice: Operation snmpAgent_start_0: 
> ok (node=MDA1PFP-PCS01, call=40, rc=0, cib-update=82, confirmed=true)
> Sep 21 13:01:21 MDA1PFP-S01 crmd[2760]:  notice: Operation drbd1_start_0: ok 
> (node=MDA1PFP-PCS01, call=39, rc=0, cib-update=83, confirmed=true)

Re: [ClusterLabs] best practice fencing with ipmi in 2node-setups / cloneresource/monitor/timeout

2016-09-21 Thread Ken Gaillot
On 09/21/2016 01:51 AM, Stefan Bauer wrote:
> Hi Ken,
> 
> let met sum it up:
> 
> Pacemaker in recent versions is smart enough to run (trigger, execute) the
> fence operation on the node that is not the target.
> 
> If I have an external stonith device that can fence multiple nodes, a single
> primitive is enough in pacemaker.
> 
> If with external/ipmi I can only address a single node, I need to have
> multiple primitives - one for each node.
> 
> In this case it's recommended to let the primitive always run on the opposite
> node - right?

Yes, exactly :-)

In terms of implementation, I'd use a +INFINITY location constraint to
tie the device to the opposite node. This approach (as opposed to a
-INFINITY constraint on the target node) allows the target node to run
the fence device when the opposite node is unavailable.
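
For example, with one fence device per node (device and node names here are
just placeholders):

pcs constraint location fence-node1 prefers node2=INFINITY
pcs constraint location fence-node2 prefers node1=INFINITY

Since these are positive preferences rather than -INFINITY bans, node1 can
still run fence-node1 itself if node2 is unavailable (and vice versa).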

> thank you.
> 
> Stefan
>  
> -----Original Message-----
>> From: Ken Gaillot
>> Sent: Tue, 20 September 2016 16:49
>> To: users@clusterlabs.org
>> Subject: Re: [ClusterLabs] best practice fencing with ipmi in 2node-setups /
>> cloneresource/monitor/timeout
>>
>> On 09/20/2016 06:42 AM, Digimer wrote:
>>> On 20/09/16 06:59 AM, Stefan Bauer wrote:
 Hi,

 I run a 2-node cluster and want to be safe in split-brain scenarios. For
 this I set up external/ipmi to stonith the other node.
>>>
>>> Please use 'fence_ipmilan'. I believe that the older external/ipmi agents
>>> are deprecated (someone correct me if I am wrong on this).
>>
>> It's just an alternative. The "external/" agents come with the
>> cluster-glue package, which isn't provided by some distributions (such
>> as RHEL and its derivatives), so it's "deprecated" on those only.
>>
 Some possible issues jumped to my mind and I would like to find the best
 practice solution:

 - I have a primitive for each node to stonith. Many documents and guides
 recommend to never let them run on the host they should fence. I would
 set up clone resources to avoid dealing with locations that would also
 influence scoring. Does that make sense?
>>>
>>> Since v1.1.10 of pacemaker, you don't have to worry about this.
>>> Pacemaker is smart enough to know where to run a fence call from in
>>> order to terminate a target.
>>
>> Right, fence devices can run anywhere now, and in fact they don't even
>> have to be "running" for pacemaker to use them -- as long as they are
>> configured and not intentionally disabled, pacemaker will use them.
>>
>> There is still a slight advantage to not running a fence device on a
>> node it can fence. "Running" a fence device in pacemaker really means
>> running the recurring monitor for it. Since the node that runs the
>> monitor has "verified" access to the device, pacemaker will prefer to
>> use it to execute that device. However, pacemaker will not use a node to
>> fence itself, except as a last resort if no other node is available. So,
>> running a fence device on a node it can fence means that the preference
>> is lost.
>>
>> That's a very minor detail, not worth worrying about. It's more a matter
>> of personal preference.
>>
>> In this particular case, a more relevant concern is that you need
>> different configurations for the different targets (the IPMI address is
>> different).
>>
>> One approach is to define two different fence devices, each with one
>> IPMI address. In that case, it makes sense to use the location
>> constraints to ensure the device prefers the node that's not its target.
>>
>> Another approach (if the fence agent supports it) is to use
>> pcmk_host_map to provide a different "port" (IPMI address) depending on
>> which host is being fenced. In this case, you need only one fence device
>> to be able to fence both hosts. You don't need a clone. (Remember, the
>> node "running" the device merely refers to its monitor, so the cluster
>> can still use the fence device, even if that node crashes.)
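>>
>> As a rough sketch of that second approach (addresses and credentials are
>> made up, and the exact agent parameters vary):
>>
>> pcs stonith create ipmi-fence fence_ipmilan pcmk_host_map="node1:192.168.1.11;node2:192.168.1.12" login=admin passwd=secret op monitor interval=60s
>>
>> Here pcmk_host_map tells pacemaker which "port" value to hand to the agent
>> for whichever node is being fenced.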
>>
 - Monitoring operation on the stonith primitive is dangerous. I read
 that if monitor operations fail for the stonith device, stonith action
 is triggered. I think it's not clever to give the cluster the option to
 fence a node just because it has an issue to monitor a fence device.
 That should not be a reason to shut down a node. What is your opinion on
 this? Can I just set the primitive monitor operation to disabled?
>>>
>>> Monitoring is how you will detect that, for example, the IPMI cable
>>> failed or was unplugged. I do not believe the node will get fenced on
>>> fence agent monitor failing... At least not by default.
>>
>> I am not aware of any situation in which a failing fence monitor
>> triggers a fence. Monitoring is good -- it verifies that the fence
>> device is still working.
>>
>> One concern particular to on-board IPMI devices is that they typically
>> share the same power supply as their host. So if the machine loses
>> power, the cluster can't contact the IPMI to fence it -- which means it
>> 

Re: [ClusterLabs] kind=Optional order constraint not working at startup

2016-09-21 Thread Auer, Jens
Hi,

could this be issue 5039 (http://bugs.clusterlabs.org/show_bug.cgi?id=5039)? It 
sounds similar.

Cheers,
  Jens

--
Jens Auer | CGI | Software-Engineer
CGI (Germany) GmbH & Co. KG
Rheinstraße 95 | 64295 Darmstadt | Germany
T: +49 6151 36860 154
jens.a...@cgi.com



From: Auer, Jens [jens.a...@cgi.com]
Sent: Wednesday, 21 September 2016 15:10
To: users@clusterlabs.org
Subject: [ClusterLabs] kind=Optional order constraint not working at startup

Hi,

in my cluster setup I have a couple of resources of which I need to start
some in a specific order. Basically I have two cloned resources that should
start after mounting a DRBD filesystem on all nodes, plus one resource that
starts after the clone sets. It is important that this only impacts the
startup procedure. Once the system is running, stopping or starting one of the
clone resources should not impact the other resource's state. From reading the
manual, this should be what an order constraint with kind=Optional implements.
However, when I start the cluster, the filesystem is started after the other
resources, ignoring the ordering constraint.

My cluster configuration:
pcs cluster setup --name MDA1PFP MDA1PFP-PCS01,MDA1PFP-S01 
MDA1PFP-PCS02,MDA1PFP-S02
pcs cluster start --all
sleep 5
crm_attribute --type nodes --node MDA1PFP-PCS01 --name ServerRole --update PRIME
crm_attribute --type nodes --node MDA1PFP-PCS02 --name ServerRole --update 
BACKUP
pcs property set stonith-enabled=false
pcs resource defaults resource-stickiness=100

rm -f mda; pcs cluster cib mda
pcs -f mda property set no-quorum-policy=ignore

pcs -f mda resource create mda-ip ocf:heartbeat:IPaddr2 ip=192.168.120.20 
cidr_netmask=24 nic=bond0 op monitor interval=1s
pcs -f mda constraint location mda-ip prefers MDA1PFP-PCS01=50
pcs -f mda resource create ping ocf:pacemaker:ping dampen=5s multiplier=1000 
host_list=pf-pep-dev-1  params timeout=1 attempts=3  op monitor interval=1 
--clone
pcs -f mda constraint location mda-ip rule score=-INFINITY pingd lt 1 or 
not_defined pingd

pcs -f mda resource create ACTIVE ocf:heartbeat:dummy
pcs -f mda constraint colocation add ACTIVE with mda-ip score=INFINITY

pcs -f mda resource create drbd1 ocf:linbit:drbd drbd_resource=shared_fs op 
monitor interval=60s
pcs -f mda resource master drbd1_sync drbd1 master-max=1 master-node-max=1 
clone-max=2 clone-node-max=1 notify=true
pcs -f mda constraint colocation add master drbd1_sync with mda-ip 
score=INFINITY

pcs -f mda resource create shared_fs Filesystem device="/dev/drbd1" 
directory=/shared_fs fstype="xfs"
pcs -f mda constraint order promote drbd1_sync then start shared_fs
pcs -f mda constraint colocation add shared_fs with master drbd1_sync 
score=INFINITY

pcs -f mda resource create supervisor ocf:pfpep:supervisor params 
config="/shared_fs/pfpep.ini" --clone
pcs -f mda resource create snmpAgent ocf:pfpep:snmpAgent params 
config="/shared_fs/pfpep.ini" --clone
pcs -f mda resource create clusterSwitchNotification ocf:pfpep:clusterSwitch 
params config="/shared_fs/pfpep.ini"

pcs -f mda constraint order start shared_fs then snmpAgent-clone  kind=Optional
pcs -f mda constraint order start shared_fs then supervisor-clone kind=Optional
pcs -f mda constraint order start snmpAgent-clone then supervisor-clone 
kind=Optional
pcs -f mda constraint order start supervisor-clone then 
clusterSwitchNotification kind=Optional
pcs -f mda constraint colocation add clusterSwitchNotification with shared_fs 
score=INFINITY

pcs cluster cib-push mda

The order of resource startup in the log file is:
Sep 21 13:01:21 MDA1PFP-S01 crmd[2760]:  notice: Operation snmpAgent_start_0: 
ok (node=MDA1PFP-PCS01, call=40, rc=0, cib-update=82, confirmed=true)
Sep 21 13:01:21 MDA1PFP-S01 crmd[2760]:  notice: Operation drbd1_start_0: ok 
(node=MDA1PFP-PCS01, call=39, rc=0, cib-update=83, confirmed=true)
Sep 21 13:01:23 MDA1PFP-S01 crmd[2760]:  notice: Operation ping_start_0: ok 
(node=MDA1PFP-PCS01, call=38, rc=0, cib-update=85, confirmed=true)
Sep 21 13:01:23 MDA1PFP-S01 crmd[2760]:  notice: Operation supervisor_start_0: 
ok (node=MDA1PFP-PCS01, call=45, rc=0, cib-update=88, confirmed=true)
Sep 21 13:01:28 MDA1PFP-S01 crmd[2760]:  notice: Operation ACTIVE_start_0: ok 
(node=MDA1PFP-PCS01, call=48, rc=0, cib-update=94, confirmed=true)
Sep 21 13:01:28 MDA1PFP-S01 crmd[2760]:  notice: Operation mda-ip_start_0: ok 
(node=MDA1PFP-PCS01, call=47, rc=0, cib-update=96, confirmed=true)
Sep 21 13:01:28 MDA1PFP-S01 crmd[2760]:  notice: Operation 
clusterSwitchNotification_start_0: ok (node=MDA1PFP-PCS01, call=50, rc=0, 
cib-update=98, confirmed=true)
Sep 21 13:01:28 MDA1PFP-S01 crmd[2760]:  notice: Operation shared_fs_start_0: 
ok (node=MDA1PFP-PCS01, call=57, rc=0, cib-update=101, confirmed=true)

Why is the shared file system started after the other resources?

Best wishes,
  Jens


[ClusterLabs] kind=Optional order constraint not working at startup

2016-09-21 Thread Auer, Jens
Hi,

in my cluster setup I have a couple of resources of which I need to start
some in a specific order. Basically I have two cloned resources that should
start after mounting a DRBD filesystem on all nodes, plus one resource that
starts after the clone sets. It is important that this only impacts the
startup procedure. Once the system is running, stopping or starting one of the
clone resources should not impact the other resource's state. From reading the
manual, this should be what an order constraint with kind=Optional implements.
However, when I start the cluster, the filesystem is started after the other
resources, ignoring the ordering constraint.

My cluster configuration:
pcs cluster setup --name MDA1PFP MDA1PFP-PCS01,MDA1PFP-S01 
MDA1PFP-PCS02,MDA1PFP-S02
pcs cluster start --all
sleep 5
crm_attribute --type nodes --node MDA1PFP-PCS01 --name ServerRole --update PRIME
crm_attribute --type nodes --node MDA1PFP-PCS02 --name ServerRole --update 
BACKUP
pcs property set stonith-enabled=false
pcs resource defaults resource-stickiness=100

rm -f mda; pcs cluster cib mda
pcs -f mda property set no-quorum-policy=ignore

pcs -f mda resource create mda-ip ocf:heartbeat:IPaddr2 ip=192.168.120.20 
cidr_netmask=24 nic=bond0 op monitor interval=1s
pcs -f mda constraint location mda-ip prefers MDA1PFP-PCS01=50
pcs -f mda resource create ping ocf:pacemaker:ping dampen=5s multiplier=1000 
host_list=pf-pep-dev-1  params timeout=1 attempts=3  op monitor interval=1 
--clone
pcs -f mda constraint location mda-ip rule score=-INFINITY pingd lt 1 or 
not_defined pingd

pcs -f mda resource create ACTIVE ocf:heartbeat:dummy
pcs -f mda constraint colocation add ACTIVE with mda-ip score=INFINITY

pcs -f mda resource create drbd1 ocf:linbit:drbd drbd_resource=shared_fs op 
monitor interval=60s
pcs -f mda resource master drbd1_sync drbd1 master-max=1 master-node-max=1 
clone-max=2 clone-node-max=1 notify=true
pcs -f mda constraint colocation add master drbd1_sync with mda-ip 
score=INFINITY

pcs -f mda resource create shared_fs Filesystem device="/dev/drbd1" 
directory=/shared_fs fstype="xfs"
pcs -f mda constraint order promote drbd1_sync then start shared_fs
pcs -f mda constraint colocation add shared_fs with master drbd1_sync 
score=INFINITY 

pcs -f mda resource create supervisor ocf:pfpep:supervisor params 
config="/shared_fs/pfpep.ini" --clone 
pcs -f mda resource create snmpAgent ocf:pfpep:snmpAgent params 
config="/shared_fs/pfpep.ini" --clone
pcs -f mda resource create clusterSwitchNotification ocf:pfpep:clusterSwitch 
params config="/shared_fs/pfpep.ini"

pcs -f mda constraint order start shared_fs then snmpAgent-clone  kind=Optional
pcs -f mda constraint order start shared_fs then supervisor-clone kind=Optional
pcs -f mda constraint order start snmpAgent-clone then supervisor-clone 
kind=Optional
pcs -f mda constraint order start supervisor-clone then 
clusterSwitchNotification kind=Optional
pcs -f mda constraint colocation add clusterSwitchNotification with shared_fs 
score=INFINITY

pcs cluster cib-push mda

The order of resource startup in the log file is:
Sep 21 13:01:21 MDA1PFP-S01 crmd[2760]:  notice: Operation snmpAgent_start_0: 
ok (node=MDA1PFP-PCS01, call=40, rc=0, cib-update=82, confirmed=true)
Sep 21 13:01:21 MDA1PFP-S01 crmd[2760]:  notice: Operation drbd1_start_0: ok 
(node=MDA1PFP-PCS01, call=39, rc=0, cib-update=83, confirmed=true)
Sep 21 13:01:23 MDA1PFP-S01 crmd[2760]:  notice: Operation ping_start_0: ok 
(node=MDA1PFP-PCS01, call=38, rc=0, cib-update=85, confirmed=true)
Sep 21 13:01:23 MDA1PFP-S01 crmd[2760]:  notice: Operation supervisor_start_0: 
ok (node=MDA1PFP-PCS01, call=45, rc=0, cib-update=88, confirmed=true)
Sep 21 13:01:28 MDA1PFP-S01 crmd[2760]:  notice: Operation ACTIVE_start_0: ok 
(node=MDA1PFP-PCS01, call=48, rc=0, cib-update=94, confirmed=true)
Sep 21 13:01:28 MDA1PFP-S01 crmd[2760]:  notice: Operation mda-ip_start_0: ok 
(node=MDA1PFP-PCS01, call=47, rc=0, cib-update=96, confirmed=true)
Sep 21 13:01:28 MDA1PFP-S01 crmd[2760]:  notice: Operation 
clusterSwitchNotification_start_0: ok (node=MDA1PFP-PCS01, call=50, rc=0, 
cib-update=98, confirmed=true)
Sep 21 13:01:28 MDA1PFP-S01 crmd[2760]:  notice: Operation shared_fs_start_0: 
ok (node=MDA1PFP-PCS01, call=57, rc=0, cib-update=101, confirmed=true)

Why is the shared file system started after the other resources?

Best wishes,
  Jens



Re: [ClusterLabs] RFC: allowing soft recovery attempts before ignore/block/etc.

2016-09-21 Thread Klaus Wenninger
On 09/20/2016 10:25 PM, Ken Gaillot wrote:
> Hi everybody,
>
> Currently, Pacemaker's on-fail property allows you to configure how the
> cluster reacts to operation failures. The default "restart" means try to
> restart on the same node, optionally moving to another node once
> migration-threshold is reached. Other possibilities are "ignore",
> "block", "stop", "fence", and "standby".
>
> Occasionally, we get requests to have something like migration-threshold
> for values besides restart. For example, try restarting the resource on
> the same node 3 times, then fence.
>
> I'd like to get your feedback on two alternative approaches we're
> considering.
>
> ###
>
> Our first proposed approach would add a new hard-fail-threshold
> operation property. If specified, the cluster would first try restarting
> the resource on the same node, before doing the on-fail handling.
>
> For example, you could configure a promote operation with
> hard-fail-threshold=3 and on-fail=fence, to fence the node after 3 failures.
>
> One point that's not settled is whether failures of *any* operation
> would count toward the 3 failures (which is how migration-threshold
> works now), or only failures of the specified operation.
>
> Currently, if a start fails (but is retried successfully), then a
> promote fails (but is retried successfully), then a monitor fails, the
> resource will move to another node if migration-threshold=3. We could
> keep that behavior with hard-fail-threshold, or only count monitor
> failures toward monitor's hard-fail-threshold. Each alternative has
> advantages and disadvantages.
Having something like reset-failcount-on-success might
be interesting here as well (as an operation property, per resource,
or both).
>
> ###
>
> The second proposed approach would add a new on-restart-fail resource
> property.
>
> Same as now, on-fail set to anything but restart would be done
> immediately after the first failure. A new value, "ban", would
> immediately move the resource to another node. (on-fail=ban would behave
> like on-fail=restart with migration-threshold=1.)
>
> When on-fail=restart, and restarting on the same node doesn't work, the
> cluster would do the on-restart-fail handling. on-restart-fail would
> allow the same values as on-fail (minus "restart"), and would default to
> "ban".
>
> So, if you want to fence immediately after any promote failure, you
> would still configure on-fail=fence; if you want to try restarting a few
> times first, you would configure on-fail=restart and on-restart-fail=fence.
>
> This approach keeps the current threshold behavior -- failures of any
> operation count toward the threshold. We'd rename migration-threshold to
> something like hard-fail-threshold, since it would apply to more than
> just migration, but unlike the first approach, it would stay a resource
> property.
>
> ###
>
> Comparing the two approaches, the first is more flexible, but also more
> complex and potentially confusing.
>
> With either approach, we would deprecate the start-failure-is-fatal
> cluster property. start-failure-is-fatal=true would be equivalent to
> hard-fail-threshold=1 with the first approach, and on-fail=ban with the
> second approach. This would be both simpler and more useful -- it allows
> the value to be set differently per resource.
As said, both approaches have their pros and cons, and you
can probably invent more that seem especially suitable
for certain cases.

A different approach would be to gather a couple of statistics
(e.g. restart failures globally & per operation, and ...) and leave
it to the RA to derive a return value from that info and what
it knows itself.

I know we had a similar discussion already ... and this thread
is probably a result of that ...

We could introduce a new RA operation for getting the answer, or
the new statistics could just be another input to start.
A new operation might be appealing, as it could - on demand -
be replaced by calling a custom script - no need to edit RAs; one
RA, with different custom scripts for different resources using the same RA ...

I know that using scripts makes it impossible to derive the
cluster behavior from just seeing what is configured in the CIB,
and that the scripts have to be kept in sync across the nodes, ...

This could as well be an addition to one of the suggestions above.
Gathering additional statistics will probably be needed for them
anyway.



Re: [ClusterLabs] RFC: allowing soft recovery attempts before ignore/block/etc.

2016-09-21 Thread Kristoffer Grönlund
Kristoffer Grönlund  writes:

> If implementing the first option, I would prefer to keep the behavior of
> migration-threshold of counting all failures, not just
> monitors. Otherwise there would be two closely related thresholds with
> subtly divergent behavior, which seems confusing indeed.

I see now that the proposed threshold would be per-operation, in which
case I completely reverse my opinion and think that a per-operation
threshold should apply to instances of that operation only. :)

-- 
// Kristoffer Grönlund
// kgronl...@suse.com



Re: [ClusterLabs] Force Unmount - SLES 11 SP4

2016-09-21 Thread Kristoffer Grönlund
Jorge Fábregas  writes:

> Hi,
>
> I have an issue while shutting down one of our clusters.  The unmounting
> of an OCFS2 filesystem (ocf:heartbeat:Filesystem) is triggering a node
> fence (accordingly).  This is because the script for stopping the
> application is not killing all processes using the filesystem.  Is there
> a way to "force unmount" the filesystem using pacemaker as it is in SLES
> 11 SP4?
>
> I searched for something related and found the "force_unmount" parameter
> for ocf:heartbeat:Filesystem but it only works in RHEL (apparently it's
> a newer OCF version).
>
> It appears I'll have to deal with this out of pacemaker (perhaps thru an
> init script using "fuser -k" that would run prior to openais at system
> shutdown).
>
> If anyone here using SUSE has a better idea please let me know.
>

The force_unmount option is available in more recent versions of SLES as
well, but not in SLES 11 SP4. You could try installing the upstream
version of the Filesystem agent and see if that works for you.
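
If you do get a newer Filesystem agent in place, the parameter is set like any
other instance attribute; a minimal sketch (resource name, device and mount
point are placeholders):

crm configure primitive app-fs ocf:heartbeat:Filesystem \
    params device="/dev/sdb1" directory="/mnt/app" fstype="ocfs2" force_unmount=safe \
    op monitor interval=20s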

Cheers,
Kristoffer

-- 
// Kristoffer Grönlund
// kgronl...@suse.com



Re: [ClusterLabs] RFC: allowing soft recovery attempts before ignore/block/etc.

2016-09-21 Thread Kristoffer Grönlund
Ken Gaillot  writes:

> Hi everybody,
>
> Currently, Pacemaker's on-fail property allows you to configure how the
> cluster reacts to operation failures. The default "restart" means try to
> restart on the same node, optionally moving to another node once
> migration-threshold is reached. Other possibilities are "ignore",
> "block", "stop", "fence", and "standby".
>
> Occasionally, we get requests to have something like migration-threshold
> for values besides restart. For example, try restarting the resource on
> the same node 3 times, then fence.
>
> I'd like to get your feedback on two alternative approaches we're
> considering.
>
> ###
>
> Our first proposed approach would add a new hard-fail-threshold
> operation property. If specified, the cluster would first try restarting
> the resource on the same node, before doing the on-fail handling.
>
> For example, you could configure a promote operation with
> hard-fail-threshold=3 and on-fail=fence, to fence the node after 3 failures.
>
> One point that's not settled is whether failures of *any* operation
> would count toward the 3 failures (which is how migration-threshold
> works now), or only failures of the specified operation.
>
> Currently, if a start fails (but is retried successfully), then a
> promote fails (but is retried successfully), then a monitor fails, the
> resource will move to another node if migration-threshold=3. We could
> keep that behavior with hard-fail-threshold, or only count monitor
> failures toward monitor's hard-fail-threshold. Each alternative has
> advantages and disadvantages.
>
> ###
>
> The second proposed approach would add a new on-restart-fail resource
> property.
>
> Same as now, on-fail set to anything but restart would be done
> immediately after the first failure. A new value, "ban", would
> immediately move the resource to another node. (on-fail=ban would behave
> like on-fail=restart with migration-threshold=1.)
>
> When on-fail=restart, and restarting on the same node doesn't work, the
> cluster would do the on-restart-fail handling. on-restart-fail would
> allow the same values as on-fail (minus "restart"), and would default to
> "ban".
>
> So, if you want to fence immediately after any promote failure, you
> would still configure on-fail=fence; if you want to try restarting a few
> times first, you would configure on-fail=restart and on-restart-fail=fence.
>
> This approach keeps the current threshold behavior -- failures of any
> operation count toward the threshold. We'd rename migration-threshold to
> something like hard-fail-threshold, since it would apply to more than
> just migration, but unlike the first approach, it would stay a resource
> property.
>
> ###
>
> Comparing the two approaches, the first is more flexible, but also more
> complex and potentially confusing.
>
> With either approach, we would deprecate the start-failure-is-fatal
> cluster property. start-failure-is-fatal=true would be equivalent to
> hard-fail-threshold=1 with the first approach, and on-fail=ban with the
> second approach. This would be both simpler and more useful -- it allows
> the value to be set differently per resource.

Apologies for quoting the entire mail, but I had a hard time picking out
which part was more relevant when replying.

First of all, is there a use case for when fence-after-3-failures is a
useful behavior? I seem to recall some case where someone expected that
to be the behavior and was surprised by how pacemaker works, but that
problem wouldn't be helped by adding another option for them not to know
about.

My second comment would be that to me, the first option sounds less
complex, but then I don't know the internals of pacemaker that
well. Having a special case on-fail for restarts seems inelegant,
somehow.

If implementing the first option, I would prefer to keep the behavior of
migration-threshold of counting all failures, not just
monitors. Otherwise there would be two closely related thresholds with
subtly divergent behavior, which seems confusing indeed.

Cheers,
Kristoffer

> -- 
> Ken Gaillot 
>

-- 
// Kristoffer Grönlund
// kgronl...@suse.com



Re: [ClusterLabs] corosync-quorum tool, output name key on Name column if set?

2016-09-21 Thread Jan Friesse

Thomas Lamprecht wrote:

On 09/20/2016 12:36 PM, Christine Caulfield wrote:

On 20/09/16 10:46, Thomas Lamprecht wrote:

Hi,

when I'm using corosync-quorumtool [-l] and have my ring0_addr set to an
IP address which does not resolve to a hostname, I get the nodes' IP
addresses in the 'Name' column.

As I'm using the nodelist.node.X.name key to set the name of a node, it
seems a bit confusing to me that this one is not preferred, or at least
also output. It's quite a minor issue, if not nit-picking, but I associate
my nodes with their names.

I'd be ready to assemble a patch and one possibility would be adapting
the output to something
like:


# corosync-quorumtool

Quorum information
--
Date: Tue Sep 20 11:12:14 2016
Quorum provider:  corosync_votequorum
Nodes:3
Node ID:  1
Ring ID:  1/1784
Quorate:  Yes

Votequorum information
--
Expected votes:   3
Highest expected: 3
Total votes:  3
Quorum:   2
Flags:Quorate

Membership information
--
 Nodeid  Votes Name ring0_addr
  1  1 uno  10.10.20.1 (local)
  2  1 due  10.10.20.2
  3  1 tre  10.10.20.3


And respective:


# corosync-quorumtool -l

Membership information
--
 Nodeid  Votes Name ring0_addr
  1  1 uno  10.10.20.1 (local)
  2  1 due  10.10.20.2
  3  1 tre  10.10.20.3

additional ring1_addr could be also outputted if set.

This would be just a general idea, if there are suggestions I'll gladly
hear them.

As such a change may not be ideal during a stable release (e.g. a
corosync user could be parsing the corosync-quorumtool output; I mean
there are better places to get the info, but there may still be users
doing this), another possibility would be adding an option flag to
corosync-quorumtool similar to '-i' (show node IP addresses instead of
the resolved name) which then shows the nodelist.node.X.name value
instead of the IP or resolved name.

A third option would be leaving the output as is, but if the '-i'
option is not set, preferring the nodelist.node.X.name over the
resolved hostname and falling back to the IP if neither is available.
I'd prefer this change the most: it leaves the output as it is, and it
seems logical that the Name column outputs the name key if possible,
imo.

Would such a patch be welcomed or is this just something I find a little
strange?

Hi Tomas,

I'd be happy to receive such a patch. The main reason it's not done this
way is that it's not always obvious how to resolve a name from its IP
address. If corosync.conf has a nodelist then using that does seem like
the best option though (and bear in mind that more than 1 ring is
possible). If corosync.conf is set up to use multicast then we have no
choice but to guess at what the name might be (as happens now).

Most of corosync-quorumtool was written when nodelist was not the
dominant way of configuring a cluster which is why it is the way it is
at the moment.

As to what should be the default and which options are most useful, I
would be interested to hear the views of the community as to what they
would like to see :)

Chrissie


Hi Chrissie,

Thanks for your answer!



Thomas,


OK, then I'll look into it a bit more and try to figure out which options
really could make sense. Since the nodelist configuration wasn't that
dominant when the earlier code was written, I may find other places too
where it could be used for output or as an input parameter.

You mean if corosync.conf is set up without nodelist, just with multicast?


I believe so. Simply because the nodelist is a pretty new concept (2.x).


Since multicast is also possible with the nodelist section configured, I'm
asking just to make sure I understand you correctly.

Yes, it would be nice to get some input here; I'll wait a bit, else I'll
send a patch to get the discussion going. :)

I also have another, organizational question. I saw on the GitHub page for
corosync that pull requests are preferred there, and also that the


True


communication should occur through GitHub, so should I use GitHub
instead of
the clusterlabs list for mails like my initial one from this thread?


Nope, not necessarily. The "first discuss on the ML, then send a PR" way
works the same as "open an issue, discuss, and then send a PR". Actually,
from time to time the ML way is better because you get a broader audience.


Regards,
  Honza



Thanks,
Thomas


Re: [ClusterLabs] corosync-quorum tool, output name key on Name column if set?

2016-09-21 Thread Thomas Lamprecht

On 09/20/2016 12:36 PM, Christine Caulfield wrote:

On 20/09/16 10:46, Thomas Lamprecht wrote:

Hi,

when I'm using corosync-quorumtool [-l] and have my ring0_addr set to an
IP address which does not resolve to a hostname, I get the nodes' IP
addresses in the 'Name' column.

As I'm using the nodelist.node.X.name key to set the name of a node, it
seems a bit confusing to me that this one is not preferred, or at least
also output. It's quite a minor issue, if not nit-picking, but I associate
my nodes with their names.
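
For reference, the kind of configuration I mean is a nodelist along these
lines (the addresses and names match the example output below):

nodelist {
    node {
        ring0_addr: 10.10.20.1
        name: uno
        nodeid: 1
    }
    node {
        ring0_addr: 10.10.20.2
        name: due
        nodeid: 2
    }
}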

I'd be ready to assemble a patch and one possibility would be adapting
the output to something
like:


# corosync-quorumtool

Quorum information
--
Date: Tue Sep 20 11:12:14 2016
Quorum provider:  corosync_votequorum
Nodes:3
Node ID:  1
Ring ID:  1/1784
Quorate:  Yes

Votequorum information
--
Expected votes:   3
Highest expected: 3
Total votes:  3
Quorum:   2
Flags:Quorate

Membership information
--
 Nodeid  Votes Name ring0_addr
  1  1 uno  10.10.20.1 (local)
  2  1 due  10.10.20.2
  3  1 tre  10.10.20.3


And respective:


# corosync-quorumtool -l

Membership information
--
 Nodeid  Votes Name ring0_addr
  1  1 uno  10.10.20.1 (local)
  2  1 due  10.10.20.2
  3  1 tre  10.10.20.3

additional ring1_addr could be also outputted if set.

This would be just a general idea, if there are suggestions I'll gladly
hear them.

As such a change may not be ideal during a stable release (e.g. a
corosync user could be parsing the corosync-quorumtool output; I mean
there are better places to get the info, but there may still be users
doing this), another possibility would be adding an option flag to
corosync-quorumtool similar to '-i' (show node IP addresses instead of
the resolved name) which then shows the nodelist.node.X.name value
instead of the IP or resolved name.

A third option would be leaving the output as is, but if the '-i'
option is not set, preferring the nodelist.node.X.name over the
resolved hostname and falling back to the IP if neither is available.
I'd prefer this change the most: it leaves the output as it is, and it
seems logical that the Name column outputs the name key if possible,
imo.

Would such a patch be welcomed or is this just something I find a little
strange?

Hi Tomas,

I'd be happy to receive such a patch. The main reason it's not done this
way is that it's not always obvious how to resolve a name from its IP
address. If corosync.conf has a nodelist then using that does seem like
the best option though (and bear in mind that more than 1 ring is
possible). If corosync.conf is set up to use multicast then we have no
choice but to guess at what the name might be (as happens now).

Most of corosync-quorumtool was written when nodelist was not the
dominant way of configuring a cluster which is why it is the way it is
at the moment.

As to what should be the default and which options are most useful, I
would be interested to hear the views of the community as to what they
would like to see :)

Chrissie


Hi Chrissie,

Thanks for your answer!

OK, then I'll look into it a bit more and try to figure out which options
really could make sense. Since the nodelist configuration wasn't that
dominant when the earlier code was written, I may find other places too
where it could be used for output or as an input parameter.

You mean if corosync.conf is set up without nodelist, just with multicast?
Since multicast is also possible with the nodelist section configured, I'm
asking just to make sure I understand you correctly.

Yes, it would be nice to get some input here; I'll wait a bit, else I'll
send a patch to get the discussion going. :)

I also have another, organizational question. I saw on the GitHub page for
corosync that pull requests are preferred there, and also that the
communication should occur through GitHub, so should I use GitHub instead of
the clusterlabs list for mails like my initial one from this thread?

Thanks,
Thomas


