Re: [ClusterLabs] Fence agent for VirtualBox

2017-02-06 Thread Klaus Wenninger
Maybe you need a mapping between the vbox guest names and the pacemaker
node names (attribute pcmk_host_map)?

You wrote that you added the script as fence_virtual. That is probably a
typo in the mail, since a wrongly named script would probably produce a
different error message.
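
A rough sketch of what such a mapping could look like follows. The guest
names vbox-node1 and vbox-node2 are made up for illustration (the thread
never states the real VirtualBox VM names); node1, node2 and the device IDs
come from the status output quoted in this mail. The script only builds the
value and prints the pcs commands, it does not run them.

```shell
#!/bin/sh
# Sketch only: "vbox-node1" and "vbox-node2" are hypothetical VirtualBox guest
# names; replace them with the names the hypervisor actually reports.

# Join "node:guest" pairs with ';', the separator used in pcmk_host_map values.
build_host_map() {
    map=""
    for pair in "$@"; do
        map="${map:+$map;}$pair"
    done
    printf '%s\n' "$map"
}

HOST_MAP=$(build_host_map "node1:vbox-node1" "node2:vbox-node2")

# Print (do not execute) the commands that would attach the map to each device.
for dev in fence_vbox1 fence_vbox2; do
    echo "pcs stonith update $dev pcmk_host_map=\"$HOST_MAP\""
done
```

Whether this clears the "No such device" error depends on the agent itself
working, so treat it as a first thing to try rather than a confirmed fix.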

On 02/06/2017 07:06 PM, Jihed M'selmi wrote:
> Thanks for sharing. I tried to implement it by adding the script
> fence_virtual under /usr/sbin (chmod +x).
>
> I configured stonith on both nodes, but I can't fence the nodes. Any
> thoughts?
>
>  Resource: fence_vbox2 (class=stonith type=fence_virtualbox)
>   Attributes: ipaddr=192.168.1.77
>   Operations: monitor interval=60s (fence_vbox2-monitor-interval-60s)
>  Resource: fence_vbox1 (class=stonith type=fence_virtualbox)
>   Attributes: ipaddr=192.168.1.77
>   Operations: monitor interval=60s (fence_vbox1-monitor-interval-60s)
> [root@node1 ~]# pcs stonith fence node2
> Error: unable to fence 'node2'
> Command failed: No such device
>
>
> Cluster name: mycluster
> Stack: corosync
> Current DC: node1 (version 1.1.15-11.el7-e174ec8) - partition with quorum
> Last updated: Mon Feb  6 19:03:50 2017
> Last change: Mon Feb  6 18:56:19 2017 by root via cibadmin on node1
>
> 2 nodes and 2 resources configured
>
> Online: [ node1 node2 ]
>
> Full list of resources:
>
>  fence_vbox2 (stonith:fence_virtualbox): Started node1
>  fence_vbox1 (stonith:fence_virtualbox): Started node2
>
> Daemon Status:
>   corosync: active/enabled
>   pacemaker: active/enabled
>   pcsd: active/enabled
>
>
>
>
>
>
>
> Jihed MSELMI
> RHCE, RHCSA, VCP4
> 10 Villa Stendhal, 75020 Paris France
> Mobile: +33 (0) 753768653
>
>
> On Mon, Feb 6, 2017 at 5:50 PM, Jehan-Guillaume de Rorthais wrote:
>
> Hi,
>
> On Mon, 6 Feb 2017 14:20:45 +0100
> Marek Grac wrote:
>
> > I don't have one. But I see a lot of questions about fence_vbox in
> > the last days, is there any new material that references it?
>
> Here is a script a colleague of mine wrote (based on fence_virsh) to be
> able to fence a vbox VM:
>
>   https://gist.github.com/marco44/2a4e5213a328829acee60015bf9b5671
>
> He wrote it to be able to build a PoC cluster using vbox. It has not been
> tested in production, but it worked like a charm during some workshops so
> far.
>
> Regards,
> --
> Jehan-Guillaume de Rorthais
> Dalibo
>



___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] Antw: Re: Antw: Re: Pacemaker kill does not cause node fault ???

2017-02-06 Thread Ulrich Windl
>>> Ken Gaillot wrote on 06.02.2017 at 16:13 in message
<40eba339-2f46-28b8-4605-c7047e0ee...@redhat.com>:
> On 02/06/2017 03:28 AM, Ulrich Windl wrote:
>> RaSca wrote on 03.02.2017 at 14:00 in message
>> <0de64981-904f-5bdb-c98f-9c59ee47b...@miamammausalinux.org>:
>> 
>>> On 03/02/2017 11:06, Ferenc Wágner wrote:
>>>> Ken Gaillot writes:
>>>>
>>>>> On 01/10/2017 04:24 AM, Stefan Schloesser wrote:
>>>>>
>>>>>> I am currently testing a 2 node cluster under Ubuntu 16.04. The setup
>>>>>> seems to be working ok including the STONITH.
>>>>>> For test purposes I issued a "pkill -f pace" killing all pacemaker
>>>>>> processes on one node.
>>>>>>
>>>>>> Result:
>>>>>> The node is marked as "pending", all resources stay on it. If I
>>>>>> manually kill a resource it is not noticed. On the other node a drbd
>>>>>> "promote" command fails (drbd is still running as master on the first
>>>>>> node).
>>>>>
>>>>> I suspect that, when you kill pacemakerd, systemd respawns it quickly
>>>>> enough that fencing is unnecessary. Try "pkill -f pace; systemctl stop
>>>>> pacemaker".
>>>>
>>>> What exactly is "quickly enough"?
>>>
>>> What Ken is saying is that Pacemaker, as a service managed by systemd,
>>> has this option in its service definition file
>>> (/usr/lib/systemd/system/pacemaker.service):
>>>
>>> Restart=on-failure
>>>
>>> As explained in [1], systemd immediately restarts the process
>>> if it ends for some unexpected reason (like a forced kill).
>> 
>> Isn't the question: Is crmd a process that is expected to die (and thus
>> need restarting)? Or wouldn't one prefer to debug this situation? I fear
>> that restarting it might just cover some fatal failure...
> 
> If crmd or corosync dies, the node will be fenced (if fencing is enabled
> and working). If one of the crmd's persistent connections (such as to
> the cib) fails, it will exit, so it ends up the same. But the other

But isn't it due to crmd not responding to network packets? So if the timeout
is long enough, and crmd is started fast enough, will the node really be
fenced?

> daemons (such as pacemakerd or attrd) can die and respawn without any
> risk to services.
> 
> The failure will be logged, but it will not be reported in cluster
> status, so there is a chance of not noticing it.

I don't understand: A node is fenced, but it will not be noted in the cluster
status???

[...]

Regards,
Ulrich




Re: [ClusterLabs] HA/Clusterlabs Summit 2017 Proposal

2017-02-06 Thread Digimer
As of now, I have restarted the planning wiki used for the last summit:

http://plan.alteeve.ca/index.php/Main_Page

It's not the most professional, and the notes aren't as complete as I
would have liked (we didn't have anyone specifically taking notes; I'll
fix that this time). What notes there are, though, are here:

http://plan.alteeve.ca/index.php/HA_Cluster_Summit_2015

Please feel free to comment/edit as you wish. I can set up an account on
the wiki if you don't have one from last time (I only close it normally
to keep the spammers out).

digimer

On 07/02/17 12:47 AM, Gang He wrote:
> Hi Kristoffer,
> 
> The meeting looks very attractive.
> Just one question, does the meeting have any website to archive the previous 
> topics/presentations/materials?
> 
> 
> Thanks
> Gang 
> 
> 

>> Hi everyone!
>>
>> The last time we had an HA summit was in 2015, and the intention then
>> was to have SUSE arrange the next meetup in the following year. We did
>> try to find a date that would be suitable for everyone, but for various
>> reasons there was never a conclusion and 2016 came and went.
>>
>> Well, I'd like to give it another try this year! This time, I've already
>> got a proposal for a place and date: September 7-8 in Nuremberg, Germany
>> (SUSE main office). I've got the new event area in the SUSE office
>> already reserved for these dates.
>>
>> My suggestion is to do a two day event similar to the one in Brno, but I
>> am open to any suggestions as to format and content. The main reason for
>> having the event would be for everyone to have a chance to meet and get
>> to know each other, but it's also an opportunity to discuss the future
>> of Clusterlabs and the direction going forward.
>>
>> Any thoughts or feedback are more than welcome! Let me know if you are
>> interested in coming or unable to make it.
>>
>> Cheers,
>> Kristoffer
>>
>> -- 
>> // Kristoffer Grönlund
>> // kgronl...@suse.com 
>>


-- 
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of
Einstein’s brain than in the near certainty that people of equal talent
have lived and died in cotton fields and sweatshops." - Stephen Jay Gould



Re: [ClusterLabs] HA/Clusterlabs Summit 2017 Proposal

2017-02-06 Thread Gang He
Hi Kristoffer,

The meeting looks very attractive.
Just one question, does the meeting have any website to archive the previous 
topics/presentations/materials?


Thanks
Gang 


>>> 
> Hi everyone!
> 
> The last time we had an HA summit was in 2015, and the intention then
> was to have SUSE arrange the next meetup in the following year. We did
> try to find a date that would be suitable for everyone, but for various
> reasons there was never a conclusion and 2016 came and went.
> 
> Well, I'd like to give it another try this year! This time, I've already
> got a proposal for a place and date: September 7-8 in Nuremberg, Germany
> (SUSE main office). I've got the new event area in the SUSE office
> already reserved for these dates.
> 
> My suggestion is to do a two day event similar to the one in Brno, but I
> am open to any suggestions as to format and content. The main reason for
> having the event would be for everyone to have a chance to meet and get
> to know each other, but it's also an opportunity to discuss the future
> of Clusterlabs and the direction going forward.
> 
> Any thoughts or feedback are more than welcome! Let me know if you are
> interested in coming or unable to make it.
> 
> Cheers,
> Kristoffer
> 
> -- 
> // Kristoffer Grönlund
> // kgronl...@suse.com 
> 


Re: [ClusterLabs] Failure to configure iface-bridge resource causes cluster node fence action.

2017-02-06 Thread Ken Gaillot
On 02/06/2017 09:00 AM, Scott Greenlese wrote:
> Further explanation for my concern about --disabled not taking effect
> until after the iface-bridge was configured ...
> 
> The reason I wanted to create the iface-bridge resource "disabled", was
> to allow me the opportunity to impose
> a location constraint / rule on the resource to prevent it from being
> started on certain cluster nodes,
> where the specified slave vlan did not exist.
> 
> In my case, pacemaker assigned the resource to a cluster node where the
> specified slave vlan did not exist, which in turn
> triggered a fenced (off) action against that node (apparently, because
> the device could not be stopped, per Ken's reply earlier).
> 
> Again, my cluster is configured as "symmetric" , so I would have to "opt
> out" my new resource from
> certain cluster nodes via location constraint.
> 
> So, if this really is how --disabled is designed to work, is there any
> way to impose a location constraint rule BEFORE
> the iface-bridge resource gets assigned, configured and started on a
> cluster node in a symmetrical cluster?

I would expect --disabled to behave like that already; I'm not sure
what's happening there.

But, you can add a resource and any constraints that apply to it
simultaneously. How to do this depends on whether you want to do it
interactively or scripted, and whether you prefer the low-level tools,
crm shell, or pcs.

If you want to script it via pcs, you can do pcs cluster cib $SOME_FILE,
then pcs -f $SOME_FILE , then pcs cluster
cib-push $SOME_FILE --config.
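
As a sketch of the scripted pcs route for the case in this thread, the
function below stages the resource together with a location constraint in a
file copy of the CIB and pushes both at once. The file path, the PCS
override variable and the choice of "avoids" for the constraint are
illustrative assumptions; the resource definition and node name are taken
from the commands and logs quoted in this thread.

```shell
#!/bin/sh
# Sketch only. CIB_FILE and PCS are hypothetical knobs added for illustration;
# PCS exists so the function can be dry-run with PCS=echo.
CIB_FILE=${CIB_FILE:-/tmp/br0_r1.cib.xml}
PCS=${PCS:-pcs}

stage_bridge_with_constraint() {
    # 1. Save the live CIB to a file.
    "$PCS" cluster cib "$CIB_FILE" &&
    # 2. Apply changes to the file only; nothing reaches the cluster yet.
    "$PCS" -f "$CIB_FILE" resource create br0_r1 ocf:heartbeat:iface-bridge \
        bridge_name=br0 bridge_slaves=vlan1292 \
        op monitor timeout=20s interval=10s &&
    # Keep the bridge off the node that lacks the slave vlan (assumed here to
    # be zs93kjpcs1, the node named in the log excerpt).
    "$PCS" -f "$CIB_FILE" constraint location br0_r1 avoids zs93kjpcs1 &&
    # 3. Push resource and constraint to the cluster in one step.
    "$PCS" cluster cib-push "$CIB_FILE" --config
}

# Dry run:  PCS=echo; stage_bridge_with_constraint
# Real run: stage_bridge_with_constraint
```

Because the constraint is in place by the time the cluster first sees the
resource, the resource should never be assigned to the excluded node.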

> 
> Thanks,
> 
> Scott Greenlese ... IBM KVM on System Z - Solutions Test, Poughkeepsie, N.Y.
> INTERNET: swgre...@us.ibm.com
> 
> 
> 
> 
> From: Scott Greenlese/Poughkeepsie/IBM@IBMUS
> To: kgail...@redhat.com, Cluster Labs - All topics related to
> open-source clustering welcomed 
> Date: 02/03/2017 03:23 PM
> Subject: Re: [ClusterLabs] Failure to configure iface-bridge resource
> causes cluster node fence action.
> 
> 
> 
> 
> 
> Ken,
> 
> Thanks for the explanation.
> 
> One other thing, relating to the iface-bridge resource creation. I
> specified --disabled flag:
> 
>> [root@zs95kj VD]# date;pcs resource create br0_r1
>> ocf:heartbeat:iface-bridge bridge_name=br0 bridge_slaves=vlan1292 op
>> monitor timeout="20s" interval="10s" --disabled
> 
> Does the bridge device have to be successfully configured by pacemaker
> before disabling the resource? It seems
> that that was the behavior, since it failed the resource and fenced the
> node instead of disabling the resource.
> Just checking with you to be sure.
> 
> Thanks again..
> 
> Scott Greenlese ... IBM KVM on System Z Solutions Test, Poughkeepsie, N.Y.
> INTERNET: swgre...@us.ibm.com
> 
> 
> 
> 
> From: Ken Gaillot 
> To: users@clusterlabs.org
> Date: 02/02/2017 03:29 PM
> Subject: Re: [ClusterLabs] Failure to configure iface-bridge resource
> causes cluster node fence action.
> 
> 
> 
> 
> On 02/02/2017 02:14 PM, Scott Greenlese wrote:
>> Hi folks,
>>
>> I'm testing iface-bridge resource support on a Linux KVM on System Z
>> pacemaker cluster.
>>
>> pacemaker-1.1.13-10.el7_2.ibm.1.s390x
>> corosync-2.3.4-7.el7_2.ibm.1.s390x
>>
>> I created an iface-bridge resource, but specified a non-existent
>> bridge_slaves value, vlan1292 (i.e. vlan1292 doesn't exist).
>>
>> [root@zs95kj VD]# date;pcs resource create br0_r1
>> ocf:heartbeat:iface-bridge bridge_name=br0 bridge_slaves=vlan1292 op
>> monitor timeout="20s" interval="10s" --disabled
>> Wed Feb 1 17:49:16 EST 2017
>> [root@zs95kj VD]#
>>
>> [root@zs95kj VD]# pcs resource show |grep br0
>> br0_r1 (ocf::heartbeat:iface-bridge): FAILED zs93kjpcs1
>> [root@zs95kj VD]#
>>
>> As you can see, the resource was created, but failed to start on the
>> target node zs93kppcs1.
>>
>> To my surprise, the target node zs93kppcs1 was unceremoniously fenced.
>>
>> pacemaker.log shows a fence (off) action initiated against that target
>> node, "because of resource failure(s)" :
>>
>> Feb 01 17:55:56 [52941] zs95kj crm_resource: ( unpack.c:2719 ) debug:
>> determine_op_status: br0_r1_stop_0 on zs93kjpcs1 returned 'not
>> configured' (6) instead of the expected value: 'ok' (0)
>> Feb 01 17:55:56 [52941] zs95kj crm_resource: ( unpack.c:2602 ) warning:
>> unpack_rsc_op_failure: Processing failed op stop for br0_r1 on
>> zs93kjpcs1: not configured (6)
>> Feb 01 17:55:56 [52941] zs95kj crm_resource: ( 

Re: [ClusterLabs] Antw: Re: Pacemaker kill does not cause node fault ???

2017-02-06 Thread Ken Gaillot
On 02/06/2017 03:28 AM, Ulrich Windl wrote:
> RaSca wrote on 03.02.2017 at 14:00 in message
> <0de64981-904f-5bdb-c98f-9c59ee47b...@miamammausalinux.org>:
> 
>> On 03/02/2017 11:06, Ferenc Wágner wrote:
>>> Ken Gaillot writes:
>>>
>>>> On 01/10/2017 04:24 AM, Stefan Schloesser wrote:
>>>>
>>>>> I am currently testing a 2 node cluster under Ubuntu 16.04. The setup
>>>>> seems to be working ok including the STONITH.
>>>>> For test purposes I issued a "pkill -f pace" killing all pacemaker
>>>>> processes on one node.
>>>>>
>>>>> Result:
>>>>> The node is marked as "pending", all resources stay on it. If I
>>>>> manually kill a resource it is not noticed. On the other node a drbd
>>>>> "promote" command fails (drbd is still running as master on the first
>>>>> node).
>>>>
>>>> I suspect that, when you kill pacemakerd, systemd respawns it quickly
>>>> enough that fencing is unnecessary. Try "pkill -f pace; systemctl stop
>>>> pacemaker".
>>>
>>> What exactly is "quickly enough"?
>>
>> What Ken is saying is that Pacemaker, as a service managed by systemd,
>> has this option in its service definition file
>> (/usr/lib/systemd/system/pacemaker.service):
>>
>> Restart=on-failure
>>
>> As explained in [1], systemd immediately restarts the process
>> if it ends for some unexpected reason (like a forced kill).
> 
> Isn't the question: Is crmd a process that is expected to die (and thus need
> restarting)? Or wouldn't one prefer to debug this situation? I fear that
> restarting it might just cover some fatal failure...

If crmd or corosync dies, the node will be fenced (if fencing is enabled
and working). If one of the crmd's persistent connections (such as to
the cib) fails, it will exit, so it ends up the same. But the other
daemons (such as pacemakerd or attrd) can die and respawn without any
risk to services.

The failure will be logged, but it will not be reported in cluster
status, so there is a chance of not noticing it.
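
To see which restart policy applies on a given node, one could read the
unit file directly. The sketch below assumes the unit path mentioned in the
quoted text; the function name is made up for illustration, and
"systemctl show -p Restart pacemaker.service" would be the more direct
check on a live system.

```shell
#!/bin/sh
# Sketch only: restart_policy is a hypothetical helper, not part of Pacemaker.
# It prints the last Restart= setting in a unit file (the last assignment wins
# within a file), or nothing if the option is absent.
restart_policy() {
    sed -n 's/^Restart=//p' "$1" | tail -n 1
}

UNIT=/usr/lib/systemd/system/pacemaker.service
if [ -r "$UNIT" ]; then
    echo "pacemaker.service Restart=$(restart_policy "$UNIT")"
fi
```

Drop-in overrides under /etc/systemd/system are not considered by this
helper, so the value printed may differ from the effective one.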

> 
>>
>> [1] https://www.freedesktop.org/software/systemd/man/systemd.service.html 
>>
>> -- 
>> RaSca
>> ra...@miamammausalinux.org 



Re: [ClusterLabs] Fence agent for VirtualBox

2017-02-06 Thread Jihed M'selmi
Yeah, I see your point. :)

On Mon, Feb 6, 2017, 3:40 PM Klaus Wenninger  wrote:

> On 02/06/2017 03:35 PM, Jihed M'selmi wrote:
> >
> > So do you suggest using sbd?
> > VirtualBox was installed on top of Windows.
> >
>
> Just wanted to give you an option if you don't have
> a working fence-agent directly talking to the hypervisor -
> which I would always prefer.
>
> >
> > On Mon, Feb 6, 2017, 3:20 PM Klaus Wenninger wrote:
> >
> > No experience with fencing vbox-VMs on my side either ...
> > But as always when there is no physical fencing-device
> > available sbd might be a way to go - either with just
> > a watchdog (guess vbox offers a virtual watchdog that
> > is supported by the linux-kernel or at least if you install
> > the guest-support for vbox) or with shared block-devices
> > on top.
> >
> > Regards,
> > Klaus
> >
> > On 02/06/2017 02:22 PM, Jihed M'selmi wrote:
> > >
> > > Not really. I found something in a Google group, but it's not
> > > documented (if I am not wrong).
> > >
> > >
> > > On Mon, Feb 6, 2017, 2:21 PM Marek Grac wrote:
> > >
> > > Hi,
> > >
> > > I don't have one. But I see a lot of questions about fence_vbox in
> > > the last days, is there any new material that references it?
> > >
> > > m,
> > >
> > > On Mon, Feb 6, 2017 at 1:56 PM, Jihed M'selmi wrote:
> > >
> > > Hi,
> > >
> > > I want to set up a pcmk/corosync cluster using a couple of vbox nodes.
> > >
> > > Could anyone share how to install/configure a fence agent
> > > fence_vbox?
> > >
> > > Cheers
> > > JM
> > > --
> > >
> > > J.M
> > >
> > >
-- 

J.M


Re: [ClusterLabs] Failure to configure iface-bridge resource causes cluster node fence action.

2017-02-06 Thread Scott Greenlese

Further explanation for my concern about --disabled not taking effect until
after the iface-bridge was configured  ...

The reason I wanted to create the iface-bridge resource "disabled", was to
allow me the opportunity to impose
a location constraint / rule  on the resource to prevent it from being
started on certain cluster nodes,
where the specified slave vlan did not exist.

In my case, pacemaker assigned the resource to a cluster node where the
specified slave vlan did not exist, which in turn
triggered a fenced (off) action against that node (apparently, because the
device could not be stopped, per Ken's reply earlier).

Again, my cluster is configured as "symmetric" , so I would have to "opt
out" my new resource from
certain cluster nodes via location constraint.

So, if this really is how --disabled is designed to work, is there any way
to impose a location constraint rule BEFORE
the iface-bridge resource gets assigned, configured and started on a
cluster node in a symmetrical cluster?

Thanks,

Scott Greenlese ... IBM KVM on System Z -  Solutions Test,  Poughkeepsie,
N.Y.
  INTERNET:  swgre...@us.ibm.com





From: Scott Greenlese/Poughkeepsie/IBM@IBMUS
To: kgail...@redhat.com, Cluster Labs - All topics related to
open-source clustering welcomed
Date: 02/03/2017 03:23 PM
Subject: Re: [ClusterLabs] Failure to configure iface-bridge resource
causes cluster node fence action.



Ken,

Thanks for the explanation.

One other thing, relating to the iface-bridge resource creation. I
specified --disabled flag:

> [root@zs95kj VD]# date;pcs resource create br0_r1
> ocf:heartbeat:iface-bridge bridge_name=br0 bridge_slaves=vlan1292 op
> monitor timeout="20s" interval="10s" --disabled

Does the bridge device have to be successfully configured by pacemaker
before disabling the resource? It seems
that that was the behavior, since it failed the resource and fenced the
node instead of disabling the resource.
Just checking with you to be sure.

Thanks again..

Scott Greenlese ... IBM KVM on System Z Solutions Test, Poughkeepsie, N.Y.
INTERNET: swgre...@us.ibm.com




From: Ken Gaillot 
To: users@clusterlabs.org
Date: 02/02/2017 03:29 PM
Subject: Re: [ClusterLabs] Failure to configure iface-bridge resource
causes cluster node fence action.



On 02/02/2017 02:14 PM, Scott Greenlese wrote:
> Hi folks,
>
> I'm testing iface-bridge resource support on a Linux KVM on System Z
> pacemaker cluster.
>
> pacemaker-1.1.13-10.el7_2.ibm.1.s390x
> corosync-2.3.4-7.el7_2.ibm.1.s390x
>
> I created an iface-bridge resource, but specified a non-existent
> bridge_slaves value, vlan1292 (i.e. vlan1292 doesn't exist).
>
> [root@zs95kj VD]# date;pcs resource create br0_r1
> ocf:heartbeat:iface-bridge bridge_name=br0 bridge_slaves=vlan1292 op
> monitor timeout="20s" interval="10s" --disabled
> Wed Feb 1 17:49:16 EST 2017
> [root@zs95kj VD]#
>
> [root@zs95kj VD]# pcs resource show |grep br0
> br0_r1 (ocf::heartbeat:iface-bridge): FAILED zs93kjpcs1
> [root@zs95kj VD]#
>
> As you can see, the resource was created, but failed to start on the
> target node zs93kppcs1.
>
> To my surprise, the target node zs93kppcs1 was unceremoniously fenced.
>
> pacemaker.log shows a fence (off) action initiated against that target
> node, "because of resource failure(s)" :
>
> Feb 01 17:55:56 [52941] zs95kj crm_resource: ( unpack.c:2719 ) debug:
> determine_op_status: br0_r1_stop_0 on zs93kjpcs1 returned 'not
> configured' (6) instead of the expected value: 'ok' (0)
> Feb 01 17:55:56 [52941] zs95kj crm_resource: ( unpack.c:2602 ) warning:
> unpack_rsc_op_failure: Processing failed op stop for br0_r1 on
> zs93kjpcs1: not configured (6)
> Feb 01 17:55:56 [52941] zs95kj crm_resource: ( unpack.c:3244 ) error:
> unpack_rsc_op: Preventing br0_r1 from re-starting anywhere: operation
> stop failed 'not configured' (6)
> Feb 01 17:55:56 [52941] zs95kj crm_resource: ( unpack.c:2719 ) debug:
> determine_op_status: br0_r1_stop_0 on zs93kjpcs1 returned 'not
> configured' (6) instead of the expected value: 'ok' (0)
> Feb 01 17:55:56 [52941] zs95kj crm_resource: ( unpack.c:2602 ) warning:
> unpack_rsc_op_failure: Processing failed op stop for br0_r1 on
> zs93kjpcs1: not configured (6)
> Feb 01 17:55:56 [52941] zs95kj crm_resource: ( unpack.c:3244 ) error:
> unpack_rsc_op: Preventing br0_r1 from re-starting anywhere: operation
> stop failed 'not configured' (6)
> Feb 01 17:55:56 [52941] zs95kj crm_resource: ( unpack.c:96 ) warning:
> pe_fence_node: Node zs93kjpcs1 will be fenced because of resource
> failure(s)
>
>
> Thankfully, I was able to successfully create a iface-bridge resource
> when I changed the bridge_slaves value to an existent vlan interface.
>
> My main 

Re: [ClusterLabs] Fence agent for VirtualBox

2017-02-06 Thread Klaus Wenninger
On 02/06/2017 03:35 PM, Jihed M'selmi wrote:
>
> So do you suggest using sbd?
> VirtualBox was installed on top of Windows.
>

Just wanted to give you an option if you don't have
a working fence-agent directly talking to the hypervisor -
which I would always prefer.

>
> On Mon, Feb 6, 2017, 3:20 PM Klaus Wenninger wrote:
>
> No experience with fencing vbox-VMs on my side either ...
> But as always when there is no physical fencing-device
> available sbd might be a way to go - either with just
> a watchdog (guess vbox offers a virtual watchdog that
> is supported by the linux-kernel or at least if you install
> the guest-support for vbox) or with shared block-devices
> on top.
>
> Regards,
> Klaus
>
> On 02/06/2017 02:22 PM, Jihed M'selmi wrote:
> >
> > Not really. I found something in a Google group, but it's not
> > documented (if I am not wrong).
> >
> >
> > On Mon, Feb 6, 2017, 2:21 PM Marek Grac wrote:
> >
> > Hi,
> >
> > I don't have one. But I see a lot of questions about fence_vbox in
> > the last days, is there any new material that references it?
> >
> > m,
> >
> > On Mon, Feb 6, 2017 at 1:56 PM, Jihed M'selmi wrote:
> >
> > Hi,
> >
> > I want to set up a pcmk/corosync cluster using a couple of vbox nodes.
> >
> > Could anyone share how to install/configure a fence agent
> > fence_vbox?
> >
> > Cheers
> > JM
> > --
> >
> > J.M
> >
> >
>
> -- 
>
> J.M
>




Re: [ClusterLabs] Fence agent for VirtualBox

2017-02-06 Thread Jihed M'selmi
So do you suggest using sbd?
VirtualBox was installed on top of Windows.

On Mon, Feb 6, 2017, 3:20 PM Klaus Wenninger  wrote:

> No experience with fencing vbox-VMs on my side either ...
> But as always when there is no physical fencing-device
> available sbd might be a way to go - either with just
> a watchdog (guess vbox offers a virtual watchdog that
> is supported by the linux-kernel or at least if you install
> the guest-support for vbox) or with shared block-devices
> on top.
>
> Regards,
> Klaus
>
> On 02/06/2017 02:22 PM, Jihed M'selmi wrote:
> >
> > Not really. I found something in a Google group, but it's not
> > documented (if I am not wrong).
> >
> >
> > On Mon, Feb 6, 2017, 2:21 PM Marek Grac wrote:
> >
> > Hi,
> >
> > I don't have one. But I see a lot of questions about fence_vbox in
> > the last days, is there any new material that references it?
> >
> > m,
> >
> > On Mon, Feb 6, 2017 at 1:56 PM, Jihed M'selmi wrote:
> >
> > Hi,
> >
> > I want to set up a pcmk/corosync cluster using a couple of vbox nodes.
> >
> > Could anyone share how to install/configure a fence agent
> > fence_vbox?
> >
> > Cheers
> > JM
> > --
> >
> > J.M
> >
> >
>
-- 

J.M


Re: [ClusterLabs] Fence agent for VirtualBox

2017-02-06 Thread Klaus Wenninger
No experience with fencing vbox VMs on my side either ...
But, as always when there is no physical fencing device
available, sbd might be a way to go: either with just
a watchdog (I guess vbox offers a virtual watchdog that
is supported by the Linux kernel, or at least is once you install
the guest support for vbox), or with shared block devices
on top.

Regards,
Klaus

On 02/06/2017 02:22 PM, Jihed M'selmi wrote:
>
> Not really, I found something in some google group but, it's not
> documented (if I am not wrong).
>
>
> On Mon, Feb 6, 2017, 2:21 PM Marek Grac wrote:
>
> Hi,
>
> I don't have one. But I see a lot of question about fence_vbox in
> last days, is there any new material that references it?
>
> m,
>
> On Mon, Feb 6, 2017 at 1:56 PM, Jihed M'selmi wrote:
>
> Hi,
>
> I want set up a pcmk/corosync cluster using couple vbox nodes.
>
> Anyone could.share how to install/configure a fence agent
>  fence_vbox ?
>
> Cheers
> JM
> -- 
>
> J.M


Re: [ClusterLabs] Fence agent for VirtualBox

2017-02-06 Thread Jihed M'selmi
Not really. I found something in a Google group, but it's not documented
(if I'm not mistaken).


On Mon, Feb 6, 2017, 2:21 PM Marek Grac wrote:

> Hi,
>
> I don't have one. But I see a lot of question about fence_vbox in last
> days, is there any new material that references it?
>
> m,
>
> On Mon, Feb 6, 2017 at 1:56 PM, Jihed M'selmi wrote:
>
> Hi,
>
> I want set up a pcmk/corosync cluster using couple vbox nodes.
>
> Anyone could.share how to install/configure a fence agent  fence_vbox ?
>
> Cheers
> JM
> --
>
> J.M
-- 

J.M


Re: [ClusterLabs] Fence agent for VirtualBox

2017-02-06 Thread Marek Grac
Hi,

I don't have one. But I've seen a lot of questions about fence_vbox in the
last few days; is there any new material that references it?

m,

On Mon, Feb 6, 2017 at 1:56 PM, Jihed M'selmi wrote:

> Hi,
>
> I want set up a pcmk/corosync cluster using couple vbox nodes.
>
> Anyone could.share how to install/configure a fence agent  fence_vbox ?
>
> Cheers
> JM
> --
>
> J.M


Re: [ClusterLabs] resources management - redesign

2017-02-06 Thread Kristoffer Grönlund
Hi Florin,

I'm afraid I don't quite understand what it is that you are asking. You
can specify the resource ID when creating resources, and using resource
constraints, you can specify any order/colocation structure that you
need.
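
As a minimal sketch of that (not a tested configuration): with crmsh the
resource ID is simply the first argument to "primitive", and a "group" starts
its members in the order they are listed. The IDs, devices, mount points and
the "mydaemon" unit name below are hypothetical, the "run" helper only prints
the commands unless DRY_RUN=0 is set, and the last step assumes your crmsh
version provides "modgroup".

```shell
#!/bin/sh
# Dry-run helper: print each command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-1}" = "1" ] || "$@"
}

# Hypothetical IDs, devices and mount points; adapt them to your cluster.
define_rg1() {
    for n in 1 2 3; do
        run crm configure primitive "fs$n" ocf:heartbeat:Filesystem \
            params device="/dev/vg0/fs$n" directory="/srv/fs$n" fstype=ext4
    done
    # "mydaemon" stands in for the custom systemd unit.
    run crm configure primitive mydaemon systemd:mydaemon
    # Group members start in the listed order and stop in reverse.
    run crm configure group rg1 fs1 fs2 fs3 mydaemon
}

# Later addition of fs4 (assumed to be defined already) ahead of the daemon.
add_fs4() {
    run crm configure modgroup rg1 add fs4 after fs3
}

define_rg1
add_fs4
```

Because the group order is just the order of the IDs on the "group" line, a
misplaced member can be fixed by editing that one line in "crm configure edit".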

> 1. RG = rg1 + following resources: fs1, fs2,fs3, ocf:heartbeat[my custom
> systemd script] 

What do you mean by ocf:heartbeat[my custom systemd script]? If you've
got your own service with a systemd service file and you don't need
custom monitoring, you can use "systemd:" as the resource agent.

> Now, what solution exists ?  export cib, edit cib and re-import cib;
> what if  I will need a new fs:fs4, so what: export cib, create new
> resource inside exported cib and re-import it. 

One way to make large changes to the configuration is to:

1. Stop all resources

   crm configure property stop-all-resources=true

2. Edit configuration to what you need

   crm configure edit

3. Start all resources

   crm configure property stop-all-resources=false

You might have some success in keeping services running during editing
by using maintenance-mode=true instead, but that takes a lot more
care and is difficult to recommend in the general case.

It is also possible to use the shadow CIB facility to simulate changes
to the cluster before applying them:

http://clusterlabs.org/man/pacemaker/crm_simulate.8.html
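
A rough sketch of such a shadow CIB round trip, under stated assumptions: the
shadow name "sandbox" is made up, "crm -c <name>" is assumed to select the
shadow CIB for a one-shot crmsh command, and crm_simulate is assumed to read
the shadow when CIB_shadow is set. The "run" helper only prints the commands
unless DRY_RUN=0 is set, so check each command against your installed
versions before running it for real.

```shell
#!/bin/sh
# Dry-run helper: print each command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-1}" = "1" ] || "$@"
}

# Usage: try_in_shadow SHADOW_NAME CONFIGURE_ARGS...
try_in_shadow() {
    shadow="$1"; shift
    # Copy the live CIB into a named shadow CIB.
    run crm cib new "$shadow"
    # Apply the change to the shadow copy only.
    run crm -c "$shadow" configure "$@"
    # Ask the policy engine what it would do with the shadow configuration.
    run env CIB_shadow="$shadow" crm_simulate --live-check --simulate
    # Committing is left as a manual step after reviewing the simulation.
    echo "# when satisfied: crm cib commit $shadow"
}

try_in_shadow sandbox property stop-all-resources=true
```

Keeping the commit as a separate, manual step means nothing reaches the live
cluster until the simulated transition has been read.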

There's some documentation on using Hawk with the simulator which is
already outdated but might be of some help in figuring out what is
possible:

https://hawk-guide.readthedocs.io/en/latest/simulator.html

Cheers,
Kristoffer

-- 
// Kristoffer Grönlund
// kgronl...@suse.com



[ClusterLabs] Fence agent for VirtualBox

2017-02-06 Thread Jihed M'selmi
Hi,

I want to set up a pcmk/corosync cluster using a couple of vbox nodes.

Could anyone share how to install/configure a fence agent, fence_vbox?

Cheers
JM
-- 

J.M


[ClusterLabs] lrmd segfault

2017-02-06 Thread cys
Hi All.

Recently we got an lrmd core dump. It occurred only once and we don't know how to
reproduce it.
The version we use is pacemaker-1.1.15-11. The OS is CentOS 7.

Core was generated by `/usr/libexec/pacemaker/lrmd'.
Program terminated with signal 11, Segmentation fault.
#0  __strcasecmp_l_avx () at ../sysdeps/x86_64/multiarch/strcmp-sse42.S:164
164 movdqu  (%rdi), %xmm1
(gdb) bt
#0  __strcasecmp_l_avx () at ../sysdeps/x86_64/multiarch/strcmp-sse42.S:164
#1  0x7fd6d554a53c in crm_str_eq (a=, b=b@entry=0x7fd6d6d42800 "p_vip", use_case=use_case@entry=0) at utils.c:1454
#2  0x7fd6d5322baa in is_op_blocked (rsc=0x7fd6d6d42800 "p_vip") at services.c:653
#3  0x7fd6d5322ca5 in services_action_async (op=0x7fd6d6d5f8d0, action_callback=) at services.c:634
#4  0x7fd6d59af67c in lrmd_rsc_execute_service_lib (cmd=0x7fd6d6d69bd0, rsc=0x7fd6d6d5d6f0) at lrmd.c:1242
#5  lrmd_rsc_execute (rsc=0x7fd6d6d5d6f0) at lrmd.c:1308
#6  lrmd_rsc_dispatch (user_data=0x7fd6d6d5d6f0, user_data@entry=) at lrmd
#7  0x7fd6d55699f6 in crm_trigger_dispatch (source=0x7fd6d6d59190, callback=, userdata=) at mainloop.c:107
#8  0x7fd6d29757aa in g_main_context_dispatch () from /lib64/libglib-2.0.so.0
#9  0x7fd6d2975af8 in g_main_context_iterate.isra.24 () from /lib64/libglib-2.0.so.0
#10 0x7fd6d2975dca in g_main_loop_run () from /lib64/libglib-2.0.so.0
#11 0x7fd6d59ad3ad in main (argc=, argv=0x7fff4bd0def8) at main.c:476
(gdb) p inflight_ops->data
$4 = (gpointer) 0x7fd6d6d605c0
(gdb) x/10xg 0x7fd6d6d605c0
0x7fd6d6d605c0: 0x 0x00030002
0x7fd6d6d605d0: 0x00020004  0x0005
0x7fd6d6d605e0: 0x0008  0x
0x7fd6d6d605f0: 0x000d  0x000f000e
0x7fd6d6d60600: 0x00010001  0x0013

The memory at inflight_ops->data is not a valid svc_action_t object.

I saw a similar problem at
http://lists.clusterlabs.org/pipermail/users/2017-January/004906.html
but that thread said the problem had gone away in 1.1.15.

Any help would be appreciated.


Re: [ClusterLabs] Antw: Re: crm shell: How to display properties?

2017-02-06 Thread Kristoffer Grönlund
Ulrich Windl  writes:

> xin wrote on 06.02.2017 at 10:50 in message
> <65fbbdf9-f820-63e7-fe02-1d1acefc5...@suse.com>:
>> Hi Ulrich:
>> 
>>"crm configure show" can display what you set for properties.
>> 
>>Do you find another way?
>
> Yes, but it shows the whole configuration. If your configuration is long, the
> output can be very long.
> What I'm talking about is:
> crm(live)configure# show property
> ERROR: object property does not exist
> crm(live)configure# show pe-error-series-max
> ERROR: object pe-error-series-max does not exist
>
> But I found out: This one works: "crm(live)configure# show
> cib-bootstrap-options".
>

You can also use

crm configure show type:property

If you follow the *-options naming convention, you can do

crm configure show \*options

Cheers,
Kristoffer

> Regards,
> Ulrich
>
>>
>> On 2017-02-06 17:12, Ulrich Windl wrote:
>>> Ken Gaillot wrote on 02.02.2017 at 21:19:
>>>
>>> [...]
>>>> The files are not necessary for cluster operation, so you can clean them
>>>> as desired. The cluster can clean them for you based on cluster options;
>>>> see pe-error-series-max, pe-warn-series-max, and pe-input-series-max:
>>> [...]
>>>
>>> Related question:
>>> in crm shell I can set properties in configure context ("property ..."),
>>> but how can I display them (except from looking at the end of a "show")?
>>>
>>> Regards,
>>> Ulrich

-- 
// Kristoffer Grönlund
// kgronl...@suse.com



Re: [ClusterLabs] fence_vbox '--action=' not executing action

2017-02-06 Thread Kristoffer Grönlund
dur...@mgtsciences.com writes:

> Kristoffer Grönlund  wrote on 02/01/2017 10:49:54 PM:
>
>> 
>> Another possibility is that the command that fence_vbox tries to run
>> doesn't work for you for some reason. It will either call
>> 
>> VBoxManage startvm  --type headless
>> 
>> or
>> 
>> VBoxManage controlvm  poweroff
>> 
>> when passed on or off as the --action parameter.
>
> If there is no further work being done on fence_vbox, is there a 'dummy' fence
> which I might use to make STONITH happy in my configuration?  It need only send
> the correct signals to STONITH so that I might create an active/active cluster
> to experiment with?  This is only an experimental configuration.
>

Another option would be to use SBD for fencing if your hypervisor can
provide uncached shared storage:

https://github.com/ClusterLabs/sbd

This is what we usually use for our test setups here, both with
VirtualBox and qemu/kvm.

fence_vbox is actively maintained for sure, but we'd need to narrow down
what the correct changes would be to make it work in your
environment.

Trying to use a dummy fencing agent is likely to come back to bite you:
the cluster will act very unpredictably if it thinks that there is a
fencing option that doesn't actually work.

For fence_vbox, the best path forward is probably to create an issue
upstream, and attach as much relevant information about your environment
as possible:

https://github.com/ClusterLabs/fence-agents/issues/new

Cheers,
Kristoffer

> Thank you,
>
> Durwin
>

-- 
// Kristoffer Grönlund
// kgronl...@suse.com



[ClusterLabs] Antw: Antw: Re: crm shell: How to display properties?

2017-02-06 Thread Ulrich Windl
>>> "Ulrich Windl" wrote on 06.02.2017 at 11:25 in message
<58985d1a02a100024...@gwsmtp1.uni-regensburg.de>:
> xin wrote on 06.02.2017 at 10:50 in message
> <65fbbdf9-f820-63e7-fe02-1d1acefc5...@suse.com>:
>> Hi Ulrich:
>> 
>>"crm configure show" can display what you set for properties.
>> 
>>Do you find another way?
> 
> Yes,, but it shows the while configuration. If your configuration is long, 

Sorry: s/while/whole/

> the
> output can be very long.
> What I'm talking about is:
> crm(live)configure# show property
> ERROR: object property does not exist
> crm(live)configure# show pe-error-series-max
> ERROR: object pe-error-series-max does not exist
> 
> But I found out: This one works: "crm(live)configure# show
> cib-bootstrap-options".
> 
> Regards,
> Ulrich
> 
>>
>> On 2017-02-06 17:12, Ulrich Windl wrote:
>>> Ken Gaillot wrote on 02.02.2017 at 21:19:
>>>
>>> [...]
>>>> The files are not necessary for cluster operation, so you can clean them
>>>> as desired. The cluster can clean them for you based on cluster options;
>>>> see pe-error-series-max, pe-warn-series-max, and pe-input-series-max:
>>> [...]
>>>
>>> Related question:
>>> in crm shell I can set properties in configure context ("property ..."),
>>> but how can I display them (except from looking at the end of a "show")?
>>>
>>> Regards,
>>> Ulrich






[ClusterLabs] Antw: Re: crm shell: How to display properties?

2017-02-06 Thread Ulrich Windl
>>> xin wrote on 06.02.2017 at 10:50 in message
<65fbbdf9-f820-63e7-fe02-1d1acefc5...@suse.com>:
> Hi Ulrich:
> 
>"crm configure show" can display what you set for properties.
> 
>Do you find another way?

Yes, but it shows the whole configuration. If your configuration is long, the
output can be very long.
What I'm talking about is:
crm(live)configure# show property
ERROR: object property does not exist
crm(live)configure# show pe-error-series-max
ERROR: object pe-error-series-max does not exist

But I found out: This one works: "crm(live)configure# show
cib-bootstrap-options".

Regards,
Ulrich

>
> On 2017-02-06 17:12, Ulrich Windl wrote:
>> Ken Gaillot wrote on 02.02.2017 at 21:19:
>>
>> [...]
>>> The files are not necessary for cluster operation, so you can clean them
>>> as desired. The cluster can clean them for you based on cluster options;
>>> see pe-error-series-max, pe-warn-series-max, and pe-input-series-max:
>> [...]
>>
>> Related question:
>> in crm shell I can set properties in configure context ("property ..."),
>> but how can I display them (except from looking at the end of a "show")?
>>
>> Regards,
>> Ulrich






[ClusterLabs] restart of one instance of a clone resource causes restart of dependent resources

2017-02-06 Thread Daniel
Hi All,

I'm having issues with an ordering constraint on a clone resource in
pacemaker v1.1.14.
- I have a resourceA-clone (running on 2 nodes: node1 and node2).
- then I have 2 other resources: resourceB1 (allowed to run on node1 only)
and resourceB2 (allowed to run on node2 only).
- finally constraints:
--- start resourceA-clone then start resourceB1 (kind:Mandatory)
--- start resourceA-clone then start resourceB2 (kind:Mandatory)

Resources B need at least one instance of resourceA to be available to work
properly.
But when one instance of resourceA-clone is restarted, it causes a
restart of the local resourceB.

Can this behavior be disabled (while keeping the start order constraint globally)?

thank you in advance for any feedback!

best regards
Daniel


Re: [ClusterLabs] pcsd 99% CPU

2017-02-06 Thread Tomas Jelinek

On 3.2.2017 at 22:08, Scott Greenlese wrote:

Hi all..

Over the past few days, I noticed that pcsd and ruby process is pegged
at 99% CPU, and commands such as
pcs status pcsd take up to 5 minutes to complete. On all active cluster
nodes, top shows:

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
27225 haclust+ 20 0 116324 91600 23136 R 99.3 0.1 1943:40 cib
23277 root 20 0 12.868g 8.176g 8460 S 99.7 13.0 407:44.18 ruby

The system log indicates High CIB load detected over the past 2 days:

[root@zs95kj ~]# grep "High CIB load detected" /var/log/messages |grep
"Feb 3" |wc -l
1655
[root@zs95kj ~]# grep "High CIB load detected" /var/log/messages |grep
"Feb 2" |wc -l
1658
[root@zs95kj ~]# grep "High CIB load detected" /var/log/messages |grep
"Feb 1" |wc -l
147
[root@zs95kj ~]# grep "High CIB load detected" /var/log/messages |grep
"Jan 31" |wc -l
444
[root@zs95kj ~]# grep "High CIB load detected" /var/log/messages |grep
"Jan 30" |wc -l
352
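
The per-day counts above can also be collected in one pass. This is only a
sketch: it assumes the traditional "Mon DD HH:MM:SS host ..." syslog prefix
shown in the log excerpt, and the function name is made up.

```shell
#!/bin/sh
# Count "High CIB load detected" messages per day in a syslog-style file.
# The first two whitespace-separated fields are taken as month and day.
count_high_cib_load() {
    awk '/High CIB load detected/ { n[$1 " " $2]++ }
         END { for (d in n) print d ": " n[d] }' "$1" | LC_ALL=C sort
}

# Intended use on a cluster node (not run here):
#   count_high_cib_load /var/log/messages
```

The output is sorted as plain text, so days are grouped by month name rather
than listed chronologically.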


The first entries logged on Feb 2 started around 8:42am ...

Feb 2 08:42:12 zs95kj crmd[27233]: notice: High CIB load detected: 0.974333

This happens to coincide with the time that I had caused a node fence
(off) action by creating an iface-bridge resource that specified
a non-existent vlan slave interface (reported to the group yesterday in
a separate email thread). It also happened to cause me to lose
quorum in the cluster, because 2 of my 5 cluster nodes were already
offline.

My cluster currently has just over 200 VirtualDomain resources to
manage, plus one iface-bridge resource and one iface-vlan resource.
Both of which are currently configured properly and operational.

I would appreciate some guidance how to proceed with debugging this
issue. I have not taken any recovery actions yet.


Checking /var/log/pcsd/pcsd.log to see what pcsd is actually doing might 
be a good start. What pcsd version do you have?



I considered stopping the cluster, recycling pcsd.service on all nodes,
restarting cluster... and also, reboot the nodes, if
necessary. But, didn't want to clear it yet in case there's anything I
can capture while in this state.


Restarting just pcsd might be enough.

Tomas



Thanks..

Scott Greenlese ... KVM on System Z - Solutions Test, Poughkeepsie, N.Y.
INTERNET: swgre...@us.ibm.com




Re: [ClusterLabs] crm shell: How to display properties?

2017-02-06 Thread xin

Hi Ulrich:

  "crm configure show" can display what you set for properties.

  Do you find another way?

On 2017-02-06 17:12, Ulrich Windl wrote:
> Ken Gaillot wrote on 02.02.2017 at 21:19:
>
> [...]
>> The files are not necessary for cluster operation, so you can clean them
>> as desired. The cluster can clean them for you based on cluster options;
>> see pe-error-series-max, pe-warn-series-max, and pe-input-series-max:
> [...]
>
> Related question:
> in crm shell I can set properties in configure context ("property ..."),
> but how can I display them (except from looking at the end of a "show")?
>
> Regards,
> Ulrich





Re: [ClusterLabs] Antw: Huge amount of files in /var/lib/pacemaker/pengine

2017-02-06 Thread Oscar Segarra
Thanks a lot!

2017-02-06 9:55 GMT+01:00 Ulrich Windl:

> >>> Oscar Segarra wrote on 02.02.2017 at 19:49:

[ClusterLabs] Antw: Re: Pacemaker kill does not cause node fault ???

2017-02-06 Thread Ulrich Windl
>>> RaSca wrote on 03.02.2017 at 14:00 in message
<0de64981-904f-5bdb-c98f-9c59ee47b...@miamammausalinux.org>:

> On 03/02/2017 11:06, Ferenc Wágner wrote:
>> Ken Gaillot  writes:
>> 
>>> On 01/10/2017 04:24 AM, Stefan Schloesser wrote:
>>>
>>>> I am currently testing a 2 node cluster under Ubuntu 16.04. The setup
>>>> seems to be working ok including the STONITH.
>>>> For test purposes I issued a "pkill -f pace" killing all pacemaker
>>>> processes on one node.
>>>>
>>>> Result:
>>>> The node is marked as "pending", all resources stay on it. If I
>>>> manually kill a resource it is not noticed. On the other node a drbd
>>>> "promote" command fails (drbd is still running as master on the first
>>>> node).
>>>
>>> I suspect that, when you kill pacemakerd, systemd respawns it quickly
>>> enough that fencing is unnecessary. Try "pkill -f pace; systemctl stop
>>> pacemaker".
>> 
>> What exactly is "quickly enough"?
> 
> What Ken is saying is that Pacemaker, as a service managed by systemd,
> has this option in its service definition file
> (/usr/lib/systemd/system/pacemaker.service):
>
> Restart=on-failure
>
> As explained in [1], systemd immediately restarts the process
> if it ends for some unexpected reason (like a forced kill).

Isn't the question: Is crmd a process that is expected to die (and thus need
restarting)? Or wouldn't one prefer to debug this situation? I fear that
restarting it might just cover up some fatal failure...
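
To see which restart policy a given unit file actually requests, something
like the following could be used. It is a sketch only: the function name is
made up, it reads a single unit file and ignores drop-in overrides, and it
falls back to "no", which is systemd's default when Restart= is not set. On a
live system "systemctl show -p Restart pacemaker" reports the effective value.

```shell
#!/bin/sh
# Print the Restart= policy found in a systemd unit file.
# If the option appears more than once, the last assignment wins.
unit_restart_policy() {
    awk -F= '/^[ \t]*Restart=/ { value = $2 }
             END { print (value == "" ? "no" : value) }' "$1"
}

# Intended use (path taken from the quoted message, not run here):
#   unit_restart_policy /usr/lib/systemd/system/pacemaker.service
```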

> 
> [1] https://www.freedesktop.org/software/systemd/man/systemd.service.html 
> 
> -- 
> RaSca
> ra...@miamammausalinux.org 


[ClusterLabs] crm shell: How to display properties?

2017-02-06 Thread Ulrich Windl
>>> Ken Gaillot wrote on 02.02.2017 at 21:19:

[...]
> The files are not necessary for cluster operation, so you can clean them
> as desired. The cluster can clean them for you based on cluster options;
> see pe-error-series-max, pe-warn-series-max, and pe-input-series-max:
[...]

Related question:
in crm shell I can set properties in configure context ("property ..."), but 
how can I display them (except from looking at the end of a "show")?

Regards,
Ulrich





Re: [ClusterLabs] pcsd 99% CPU

2017-02-06 Thread Jan Pokorný
On 03/02/17 16:08 -0500, Scott Greenlese wrote:
> Over the past few days, I noticed that pcsd and ruby process is pegged at
> 99% CPU, and commands such as pcs status pcsd  take up to 5 minutes to 
> complete.
> On all active cluster nodes, top shows:
> 
>   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
> 27225 haclust+  20   0  116324  91600  23136 R  99.3  0.1   1943:40 cib
> 23277 root      20   0 12.868g 8.176g   8460 S  99.7 13.0 407:44.18 ruby
> 
> [...]
> 
> I would appreciate some guidance how to proceed with debugging this issue.
> I have not taken any recovery actions yet.
> I considered stopping the cluster, recycling pcsd.service on all nodes,
> restarting cluster... and also, reboot the nodes, if
> necessary.  But, didn't want to clear it yet in case there's anything I can
> capture while in this state.

If you still have the pcsd/ruby process in that state, it might be
worth dumping a core for further off-line examination.  Assuming you
have enough space to store it (in order of gigabytes, it seems) and
gdb installed, you can do it like: gcore -o pcsd.core 23277

I have no idea how far the support for Ruby interpretation in gdb
goes (Python is quite well supported in terms of high level
debugging), but could be enough for figuring out what's going on.

If you are confident enough your cluster configuration does not
contain anything too confidential, it would perhaps be best if
you shared this core file in a compressed form privately with
tojeline at redhat.  Otherwise, you can use gdb itself to look
around the call stack in the core file, strings utility to guess
if there's excessive accumulation of particular strings, and similar
analyses, some of which are applicable also on live process, and
some would be usable only on live process (like strace).
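
As a sketch of the "strings" idea (the function name and defaults are made
up, and it needs the strings utility from binutils): list the printable
strings that repeat most often in a file, which can hint at what is
accumulating in the process memory.

```shell
#!/bin/sh
# Print the most frequently repeated printable strings in a file.
# Usage: top_strings FILE [COUNT] [MIN_LENGTH]
top_strings() {
    strings -n "${3:-8}" "$1" | sort | uniq -c | sort -rn | head -n "${2:-20}"
}

# Intended use on the affected node (not run here; PID 23277 is the ruby
# process from the top output, and gcore appends the PID to the prefix):
#   gcore -o pcsd.core 23277
#   top_strings pcsd.core.23277 20
```

On a core of several gigabytes the sort step needs comparable temporary disk
space, so it may be worth pointing TMPDIR at a large filesystem first.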

Hope this helps at least a bit.

-- 
Jan (Poki)




[ClusterLabs] Antw: Huge amount of files in /var/lib/pacemaker/pengine

2017-02-06 Thread Ulrich Windl
>>> Oscar Segarra wrote on 02.02.2017 at 19:49:

[ClusterLabs] Antw: Re: [Question] About a change of crm_failcount.

2017-02-06 Thread Ulrich Windl
>>> Ken Gaillot wrote on 02.02.2017 at 19:33 in message
<91a83571-9930-94fd-e635-962830671...@redhat.com>:
> On 02/02/2017 12:23 PM, renayama19661...@ybb.ne.jp wrote:
>> Hi All,
>> 
>> By the next correction, the user was not able to set a value except zero in
>> crm_failcount.
>>
>>  - [Fix: tools: implement crm_failcount command-line options correctly]
>>    - https://github.com/ClusterLabs/pacemaker/commit/95db10602e8f646eefed335414e40a994498cafd#diff-6e58482648938fd488a920b9902daac4
>>
>> However, pgsql RA sets INFINITY in a script.
>>
>> ```
>> (snip)
>> CRM_FAILCOUNT="${HA_SBIN_DIR}/crm_failcount"
>> (snip)
>> ocf_exit_reason "My data is newer than new master's one. New   master's location : $master_baseline"
>> exec_with_retry 0 $CRM_FAILCOUNT -r $OCF_RESOURCE_INSTANCE -U $NODENAME -v INFINITY
>> return $OCF_ERR_GENERIC
>> (snip)
>> ```
>>
>> There seems to be the influence only in pgsql somehow or other.
>>
>> Can you revise it to set a value except zero in crm_failcount?
>> We make modifications to use crm_attribute in pgsql RA if we cannot revise
>> it.
>> Best Regards,
>> Hideo Yamauchi.
> 
> Hmm, I didn't realize that was used. I changed it because it's not a
> good idea to set fail-count without also changing last-failure and
> having a failed op in the LRM history. I'll have to think about what the
> best alternative is.

The question is also whether the RA can achieve the same effect some other way. I
thought CRM sets the failcount, not the RA...



[ClusterLabs] resources management - redesign

2017-02-06 Thread Florin Portase
Hello, 

It seems pacemaker has a weird way of managing resources:

it also seems to focus more on defining individual resources
and offers not much flexibility for creating resource groups.

Now, let's take this example: RG1 + deps [fs1, fs2, fs3] => ALL 3 file
systems must be mounted before the daemon starts.

1. RG = rg1 + the following resources: fs1, fs2, fs3, ocf:heartbeat[my custom
systemd script]

OK, so I define the rg + fs1, fs2, (by mistake heartbeat), fs3.

In this case RG1 will fail to start, because the resources were defined in
the order fs1, fs2, heartbeat, fs3 instead of fs1, fs2, fs3, heartbeat.

Now, what solutions exist? Export the cib, edit the cib and re-import the cib;
and what if I need a new fs, fs4? Again: export the cib, create the new
resource inside the exported cib and re-import it.

Well, this approach seems way too moronic; wouldn't it be better, when creating
a resource, to be able to specify the resid, or to have a much easier way to
manipulate resource order?
