Re: [ClusterLabs] HALVM monitor action fail on slave node. Possible bug?

2018-04-16 Thread Marco Marino
Hi Emmanuel, thank you for your support. I did a lot of checks during the weekend and there are some updates: - The main problem is that ocf:heartbeat:LVM is old. The current version on CentOS 7 is 3.9.5 (package resource-agents). More precisely, in 3.9.5 the monitor function has one important assumption: th

Re: [ClusterLabs] Pacemaker resources are not scheduled

2018-04-16 Thread Jan Pokorný
Lkxjtu, On 14/04/18 00:16 +0800, lkxjtu wrote: > My cluster version: > Corosync 2.4.0 > Pacemaker 1.1.16 > > There are many resource anomalies. Some resources are only monitored > and not recovered. Some resources are not monitored or recovered. > Only one resource of vnm is scheduled normally, b

Re: [ClusterLabs] Booth fail-over conditions

2018-04-16 Thread Kristoffer Grönlund
Zach Anderson writes: > Hey all, > > new user to pacemaker/booth and I'm fumbling my way through my first proof > of concept. I have a 2 site configuration setup with local pacemaker > clusters at each site (running rabbitmq) and a booth arbitrator. I've > successfully validated the base failove

Re: [ClusterLabs] Pacemaker resources are not scheduled

2018-04-16 Thread lkxjtu
> Lkxjtu, > On 14/04/18 00:16 +0800, lkxjtu wrote: >> My cluster version: >> Corosync 2.4.0 >> Pacemaker 1.1.16 There are many resource anomalies. Some resources are only monitored >> and not recovered. Some resources are not monitored or recovered. >> Only one resource of vnm is scheduled no

Re: [ClusterLabs] Pacemaker resources are not scheduled

2018-04-16 Thread Ken Gaillot
On Mon, 2018-04-16 at 23:52 +0800, lkxjtu wrote: > > Lkxjtu, > > > On 14/04/18 00:16 +0800, lkxjtu wrote: > >> My cluster version: > >> Corosync 2.4.0 > >> Pacemaker 1.1.16 > >>  > >> There are many resource anomalies. Some resources are only > monitored > >> and not recovered. Some resources are

[ClusterLabs] attrd/attrd_updater asynchronous behavior

2018-04-16 Thread Jehan-Guillaume de Rorthais
Hi, I have a question regarding attrd asynchronous behavior. In PAF, during the election process to pick the best PgSQL master, we are using private attributes to publish the status (LSN) of each pgsql instance during the pre-promote action. Because we need these LSNs from each node during

Re: [ClusterLabs] attrd/attrd_updater asynchronous behavior

2018-04-16 Thread Jehan-Guillaume de Rorthais
I got an answer on IRC from Ken Gaillot. Below is his answer, for tracking purposes. On Mon, 16 Apr 2018 23:28:39 +0200 Jehan-Guillaume de Rorthais wrote: [...] > * does looping until the value becomes available is enough to conclude all > other node have the same value? Or is available only locall

[ClusterLabs] No slave is promoted to be master

2018-04-16 Thread 范国腾
Hi, We installed a new lab which only has the postgres resource and the VIP resource. After the cluster is installed, the status is OK: one node is master and the other is slave. Then I run "pcs cluster stop --all" to stop the cluster and then I run "pcs cluster start --all" to start the

[ClusterLabs] Reply: No slave is promoted to be master

2018-04-16 Thread 范国腾
I checked the status again. It is not that the slave is never promoted; it is promoted about 15 minutes after the cluster starts. I tried in three labs and the results are the same: The promotion happens 15 minutes after the cluster starts. Why is there a delay of about 15 minutes every time? Apr 16 22:08:32 node1 attrd[16

Re: [ClusterLabs] Reply: No slave is promoted to be master

2018-04-16 Thread Andrei Borzenkov
Sent from iPhone > On 17 Apr 2018, at 7:16, 范国腾 wrote: > > I check the status again. It is not not promoted but it promoted about 15 > minutes after the cluster starts. > > I try in three labs and the results are same: The promotion happens 15 > minutes after the cluster starts. >