Re: [Pacemaker] crmd does abort if a stopped node is specified

2014-05-08 Thread Kristoffer Grönlund
On Thu, 8 May 2014 09:58:41 +1000
Andrew Beekhof and...@beekhof.net wrote:

  node $id=131 vm01
  node $id=132 vm02
  (snip)
  
  Is the method of setting up ID of the node which has not
  participated in a cluster using a corosync stack like this?  
 
 I don't know how crmsh works, sorry

$id= maps directly to the id attribute in the XML. The name maps to
the uname attribute. So those examples would generate the following XML:

  <node id="131" uname="vm01"/>
  <node id="132" uname="vm02"/>

Not sure if that answers the original question.
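
If in doubt, the XML that crmsh generates can be inspected directly. Assuming
crmsh and the standard Pacemaker command-line tools are installed, either of
the following will show it (--scope limits cibadmin's output to the nodes
section):

  # render the crmsh configuration as CIB XML
  crm configure show xml

  # or query only the nodes section of the live CIB
  cibadmin --query --scope nodes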

-- 
// Kristoffer Grönlund
// kgronl...@suse.com


___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] crmd does abort if a stopped node is specified

2014-05-08 Thread Yusuke Iida
Hi, Andrew

I read the code.
In the current implementation, the startup-fencing setting is read only
once, after startup:
https://github.com/ClusterLabs/pacemaker/blob/master/lib/pengine/unpack.c#L455

In Pacemaker 1.0, the setting was read every time unpack_nodes() was called:
https://github.com/ClusterLabs/pacemaker-1.0/blob/master/lib/pengine/unpack.c#L194

So while the cluster is running, the startup-fencing setting cannot be changed.
This looks like a regression.
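
(For reference, this is the kind of runtime update that no longer takes
effect. The attribute name is the real cluster option; the commands below are
only an illustrative way of changing it on a live cluster:

  # update the cluster property in the live CIB
  crm_attribute --type crm_config --name startup-fencing --update false

  # or the equivalent via crmsh
  crm configure property startup-fencing=false
)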

I have proposed a fix for this problem here:
https://github.com/ClusterLabs/pacemaker/pull/512

Is this fix acceptable?

Regards,
Yusuke

2014-05-08 15:59 GMT+09:00 Yusuke Iida yusk.i...@gmail.com:
 Hi, Andrew

 I configured the node IDs using the method shown above and had the
 cluster read the configuration.

 crmd no longer dumps core, and the node is added as an OFFLINE node.

 However, the node added as OFFLINE was fenced even though
 startup-fencing=false was set.
 I do not expect fencing here.
 Why does startup-fencing=false not take effect?

 I have attached a crm_report taken when the problem occurs.

 The version of Pacemaker used is as follows.
 https://github.com/ClusterLabs/pacemaker/commit/9fa1ed36e373768e84bee47b5d21b0bf80f608b7

 Regards,
 Yusuke

 2014-05-08 8:58 GMT+09:00 Andrew Beekhof and...@beekhof.net:

 On 7 May 2014, at 7:53 pm, Yusuke Iida yusk.i...@gmail.com wrote:

 Hi, Andrew

 I would also like to describe a node that has not yet joined the
 cluster in a crmsh file.

 From this mail thread, I understood that a uuid is required when
 defining such a node, as follows.

 # cat node.crm
 ### Cluster Option ###
 property no-quorum-policy=ignore \
stonith-enabled=true \
startup-fencing=false \
crmd-transition-delay=2s

 node $id=131 vm01
 node $id=132 vm02
 (snip)

 Is this the right way to set the ID of a node that has not yet joined
 the cluster when using the corosync stack?

 I don't know how crmsh works, sorry

 Is it sufficient to describe the nodelist and nodeid in corosync.conf?

 That is my understanding, yes.


 # cat corosync.conf
 (snip)
 nodelist {
  node {
ring0_addr: 192.168.101.131
ring1_addr: 192.168.102.131
nodeid: 131
  }
  node {
ring0_addr: 192.168.101.132
ring1_addr: 192.168.101.132
nodeid: 132
  }
 }
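
 (Assuming a corosync 2.x stack with the cmap service available, a quick way
 to confirm which nodeid corosync actually assigned is:

   # dump the runtime nodelist keys from cmap
   corosync-cmapctl | grep ^nodelist

   # or list the nodes as Pacemaker sees them
   crm_node -l
 )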

 Regards,
 Yusuke

 2014-04-24 12:33 GMT+09:00 Kazunori INOUE kazunori.ino...@gmail.com:
 2014-04-23 19:32 GMT+09:00 Andrew Beekhof and...@beekhof.net:

 On 23 Apr 2014, at 7:17 pm, Kazunori INOUE kazunori.ino...@gmail.com 
 wrote:

 2014-04-22 0:45 GMT+09:00 David Vossel dvos...@redhat.com:

 - Original Message -
 From: Kazunori INOUE kazunori.ino...@gmail.com
 To: pm pacemaker@oss.clusterlabs.org
 Sent: Friday, April 18, 2014 4:49:42 AM
 Subject: [Pacemaker] crmd does abort if a stopped node is specified

 Hi,

 crmd does abort if I load CIB which specified a stopped node.

 # crm_mon -1
 Last updated: Fri Apr 18 11:51:36 2014
 Last change: Fri Apr 18 11:51:30 2014
 Stack: corosync
 Current DC: pm103 (3232261519) - partition WITHOUT quorum
 Version: 1.1.11-cf82673
 1 Nodes configured
 0 Resources configured

 Online: [ pm103 ]

 # cat test.cli
 node pm103
 node pm104

 # crm configure load update test.cli

 Apr 18 11:52:42 pm103 crmd[11672]:    error: crm_int_helper:
 Characters left over after parsing 'pm104': 'pm104'
 Apr 18 11:52:42 pm103 crmd[11672]:    error: crm_abort: crm_get_peer:
 Triggered fatal assert at membership.c:420 : id > 0 || uname != NULL
 Apr 18 11:52:42 pm103 pacemakerd[11663]:    error: child_waitpid:
 Managed process 11672 (crmd) dumped core

 (gdb) bt
 #0  0x0033da432925 in raise () from /lib64/libc.so.6
 #1  0x0033da434105 in abort () from /lib64/libc.so.6
 #2  0x7f30241b7027 in crm_abort (file=0x7f302440b0b3
 "membership.c", function=0x7f302440b5d0 "crm_get_peer", line=420,
 assert_condition=0x7f302440b27e "id > 0 || uname != NULL", do_core=1,
 do_fork=0) at utils.c:1177
 #3  0x7f30244048ee in crm_get_peer (id=0, uname=0x0) at 
 membership.c:420
 #4  0x7f3024402238 in crm_peer_uname (uuid=0x113e7c0 "pm104") at

 is the uuid for your cluster nodes supposed to be the same as the 
 uname?  We're treating the uuid in this situation as if it should be a 
 number, which it clearly is not.

 OK, I got it.

 By the way, is there a way to know a node's id before starting
 Pacemaker?

 Normally it comes from corosync, so not really :-(

 It seems the only way is to specify the nodeid in the nodelist
 directive in corosync.conf.

 nodelist {
  node {
ring0_addr: 192.168.101.143
nodeid: 3
  }
  node {
ring0_addr: 192.168.101.144
nodeid: 4
  }
 }

 Thanks!




 -- Vossel


 cluster.c:386
 #5  0x0043afbd in abort_transition_graph
 (abort_priority=100, abort_action=tg_restart, abort_text=0x44d2f4
 "Non-status change", reason=0x113e4b0, fn=0x44df07 "te_update_diff",
 line=382) at te_utils.c:518
 #6  0x0043caa4 in te_update_diff (event=0x10f2240
 "cib_diff_notify", msg=0x1137660) at te_callbacks.c:382
 #7  0x7f302461d1bc in 

Re: [Pacemaker] [Question] About quorum-policy=freeze and promote.

2014-05-08 Thread emmanuel segura
Why are you using ssh for stonith? I don't think fencing is working,
because your nodes are in an unclean state.


2014-05-08 5:37 GMT+02:00 renayama19661...@ybb.ne.jp:

 Hi All,

 I composed a Master/Slave resource on three nodes with
 no-quorum-policy=freeze set.
 (I use the Stateful RA for the Master/Slave resource.)

 -
 Current DC: srv01 (3232238280) - partition with quorum
 Version: 1.1.11-830af67
 3 Nodes configured
 9 Resources configured


 Online: [ srv01 srv02 srv03 ]

  Resource Group: grpStonith1
  prmStonith1-1  (stonith:external/ssh): Started srv02
  Resource Group: grpStonith2
  prmStonith2-1  (stonith:external/ssh): Started srv01
  Resource Group: grpStonith3
  prmStonith3-1  (stonith:external/ssh): Started srv01
  Master/Slave Set: msPostgresql [pgsql]
  Masters: [ srv01 ]
  Slaves: [ srv02 srv03 ]
  Clone Set: clnPingd [prmPingd]
  Started: [ srv01 srv02 srv03 ]
 -


 A Master resource is started on every node when I interrupt the
 cluster interconnect communication between all of the nodes.

 -
 Node srv02 (3232238290): UNCLEAN (offline)
 Node srv03 (3232238300): UNCLEAN (offline)
 Online: [ srv01 ]

  Resource Group: grpStonith1
  prmStonith1-1  (stonith:external/ssh): Started srv02
  Resource Group: grpStonith2
  prmStonith2-1  (stonith:external/ssh): Started srv01
  Resource Group: grpStonith3
  prmStonith3-1  (stonith:external/ssh): Started srv01
  Master/Slave Set: msPostgresql [pgsql]
  Masters: [ srv01 ]
  Slaves: [ srv02 srv03 ]
  Clone Set: clnPingd [prmPingd]
  Started: [ srv01 srv02 srv03 ]
 (snip)
 Node srv01 (3232238280): UNCLEAN (offline)
 Node srv03 (3232238300): UNCLEAN (offline)
 Online: [ srv02 ]

  Resource Group: grpStonith1
  prmStonith1-1  (stonith:external/ssh): Started srv02
  Resource Group: grpStonith2
  prmStonith2-1  (stonith:external/ssh): Started srv01
  Resource Group: grpStonith3
  prmStonith3-1  (stonith:external/ssh): Started srv01
  Master/Slave Set: msPostgresql [pgsql]
  Masters: [ srv01 srv02 ]
  Slaves: [ srv03 ]
  Clone Set: clnPingd [prmPingd]
  Started: [ srv01 srv02 srv03 ]
 (snip)
 Node srv01 (3232238280): UNCLEAN (offline)
 Node srv02 (3232238290): UNCLEAN (offline)
 Online: [ srv03 ]

  Resource Group: grpStonith1
  prmStonith1-1  (stonith:external/ssh): Started srv02
  Resource Group: grpStonith2
  prmStonith2-1  (stonith:external/ssh): Started srv01
  Resource Group: grpStonith3
  prmStonith3-1  (stonith:external/ssh): Started srv01
  Master/Slave Set: msPostgresql [pgsql]
  Masters: [ srv01 srv03 ]
  Slaves: [ srv02 ]
  Clone Set: clnPingd [prmPingd]
  Started: [ srv01 srv02 srv03 ]
 -

 I think that promoting the Master/Slave resource even when the cluster
 loses quorum is Pacemaker's specified behavior.

 Is it the responsibility of the resource agent to prevent this state of
 multiple Masters?
  * I think the drbd RA has such a mechanism.
  * But the Stateful RA has no such mechanism.
  * As an example, I think a mechanism like drbd's is always necessary
 when creating a new Master/Slave resource.

 Is my understanding wrong?

 Best Regards,
 Hideo Yamauchi.


 ___
 Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
 http://oss.clusterlabs.org/mailman/listinfo/pacemaker

 Project Home: http://www.clusterlabs.org
 Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
 Bugs: http://bugs.clusterlabs.org




-- 
esta es mi vida e me la vivo hasta que dios quiera
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Question] About quorum-policy=freeze and promote.

2014-05-08 Thread renayama19661014
Hi Emmanuel,

 Why are you using ssh for stonith? I don't think fencing is working, 
 because your nodes are in an unclean state.

No. STONITH is not carried out because all of the nodes have lost quorum.
This is the correct behavior of Pacemaker.

The ssh STONITH plugin is only used here as an example.

Best Regards,
Hideo Yamauchi.
--- On Thu, 2014/5/8, emmanuel segura emi2f...@gmail.com wrote:

 
 Why are you using ssh for stonith? I don't think fencing is working, 
 because your nodes are in an unclean state.
 
 
 
 
 2014-05-08 5:37 GMT+02:00  renayama19661...@ybb.ne.jp:
 Hi All,
 
 I composed a Master/Slave resource on three nodes with
 no-quorum-policy=freeze set.
 (I use the Stateful RA for the Master/Slave resource.)
 
 -
 Current DC: srv01 (3232238280) - partition with quorum
 Version: 1.1.11-830af67
 3 Nodes configured
 9 Resources configured
 
 
 Online: [ srv01 srv02 srv03 ]
 
  Resource Group: grpStonith1
      prmStonith1-1      (stonith:external/ssh): Started srv02
  Resource Group: grpStonith2
      prmStonith2-1      (stonith:external/ssh): Started srv01
  Resource Group: grpStonith3
      prmStonith3-1      (stonith:external/ssh): Started srv01
  Master/Slave Set: msPostgresql [pgsql]
      Masters: [ srv01 ]
      Slaves: [ srv02 srv03 ]
  Clone Set: clnPingd [prmPingd]
      Started: [ srv01 srv02 srv03 ]
 -
 
 
 A Master resource is started on every node when I interrupt the
 cluster interconnect communication between all of the nodes.
 
 -
 Node srv02 (3232238290): UNCLEAN (offline)
 Node srv03 (3232238300): UNCLEAN (offline)
 Online: [ srv01 ]
 
  Resource Group: grpStonith1
      prmStonith1-1      (stonith:external/ssh): Started srv02
  Resource Group: grpStonith2
      prmStonith2-1      (stonith:external/ssh): Started srv01
  Resource Group: grpStonith3
      prmStonith3-1      (stonith:external/ssh): Started srv01
  Master/Slave Set: msPostgresql [pgsql]
      Masters: [ srv01 ]
      Slaves: [ srv02 srv03 ]
  Clone Set: clnPingd [prmPingd]
      Started: [ srv01 srv02 srv03 ]
 (snip)
 Node srv01 (3232238280): UNCLEAN (offline)
 Node srv03 (3232238300): UNCLEAN (offline)
 Online: [ srv02 ]
 
  Resource Group: grpStonith1
      prmStonith1-1      (stonith:external/ssh): Started srv02
  Resource Group: grpStonith2
      prmStonith2-1      (stonith:external/ssh): Started srv01
  Resource Group: grpStonith3
      prmStonith3-1      (stonith:external/ssh): Started srv01
  Master/Slave Set: msPostgresql [pgsql]
      Masters: [ srv01 srv02 ]
      Slaves: [ srv03 ]
  Clone Set: clnPingd [prmPingd]
      Started: [ srv01 srv02 srv03 ]
 (snip)
 Node srv01 (3232238280): UNCLEAN (offline)
 Node srv02 (3232238290): UNCLEAN (offline)
 Online: [ srv03 ]
 
  Resource Group: grpStonith1
      prmStonith1-1      (stonith:external/ssh): Started srv02
  Resource Group: grpStonith2
      prmStonith2-1      (stonith:external/ssh): Started srv01
  Resource Group: grpStonith3
      prmStonith3-1      (stonith:external/ssh): Started srv01
  Master/Slave Set: msPostgresql [pgsql]
      Masters: [ srv01 srv03 ]
      Slaves: [ srv02 ]
  Clone Set: clnPingd [prmPingd]
      Started: [ srv01 srv02 srv03 ]
 -
 
 I think that promoting the Master/Slave resource even when the cluster
 loses quorum is Pacemaker's specified behavior.
 
 Is it the responsibility of the resource agent to prevent this state of
 multiple Masters?
  * I think the drbd RA has such a mechanism.
  * But the Stateful RA has no such mechanism.
  * As an example, I think a mechanism like drbd's is always necessary
 when creating a new Master/Slave resource.
 
 Is my understanding wrong?
 
 Best Regards,
 Hideo Yamauchi.
 
 
 ___
 Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
 http://oss.clusterlabs.org/mailman/listinfo/pacemaker
 
 Project Home: http://www.clusterlabs.org
 Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
 Bugs: http://bugs.clusterlabs.org
 
 
 
 -- 
 esta es mi vida e me la vivo hasta que dios quiera

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[Pacemaker] Require pointers on how to build an rpm for a specific Pacemaker Release.

2014-05-08 Thread Monali Porob
Hi,
I was able to build an rpm from the current sources of Pacemaker on CentOS 6.5.
I followed the steps described at
http://blog.clusterlabs.org/blog/2013/Pacemaker-1-dot-1-10-final/ .

I would like to know how to make a change to a specific Pacemaker release,
then compile it and build an rpm.
That is, is there a way to download the Pacemaker 1.1.11 source code, make
some changes locally in the code, compile those changes, and build the
Pacemaker rpm?

Regards,
Monali
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[Pacemaker] [Problem][pacemaker1.0] The probe may not be carried out by difference in cib information of probe.

2014-05-08 Thread renayama19661014
Hi All,

We found a problem when we performed a cleanup of a Master/Slave
resource in Pacemaker 1.0.
When this problem occurs, the probe operation is not carried out.
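
(For reference, the cleanup in question is an ordinary resource cleanup,
e.g. invoked as follows; the resource name msSample is only a placeholder:

  # clear the resource's operation history, which normally triggers a re-probe
  crm_resource --cleanup --resource msSample

  # or the equivalent via crmsh
  crm resource cleanup msSample
)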

I registered the problem with Bugzilla.
 * http://bugs.clusterlabs.org/show_bug.cgi?id=5211

In addition, I described a cleanup procedure that avoids the problem in the
Bugzilla entry.
However, that workaround may not be usable depending on the combination of
resources.

I would like to request a fix for this problem in Pacemaker 1.0, if the
community can still revise it.

 * This problem appears to be fixed in Pacemaker 1.1 and does not occur there.

Best Regards,
Hideo Yamauchi.


___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[Pacemaker] Question about Pacemaker resource configuration troubleshooting

2014-05-08 Thread Jacob Nikom
Hi,

I created a Pacemaker resource configuration file, but Pacemaker does not
accept it. It complains:
/home/jnikom/Kiva/dev/Prod/flvr/pace loadConstraintsXML 
clu_con_init_2014_05_08_001.xml
   error: unpack_rsc_op: No further recovery can be attempted for 
mysqlres: stop action failed with 'not installed' (5)
   error: unpack_rsc_op: No further recovery can be attempted for 
mysqlres: stop action failed with 'not installed' (5)
   error: unpack_rsc_op: No further recovery can be attempted for 
mysqlres: stop action failed with 'not installed' (5)
   error: native_create_actions: Resource mysqlres 
(lsb::my_db_res_master) is active on 3 nodes attempting recovery
/home/jnikom/Kiva/dev/Prod/flvr/pace

Is there any tool that could help to find out what specifically Pacemaker does 
not like in the resource configuration?

Best regards,

Jacob Nikom
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] Question about Pacemaker resource configuration troubleshooting

2014-05-08 Thread Andrew Beekhof

On 9 May 2014, at 11:59 am, Jacob Nikom jni...@kivasystems.com wrote:

 Hi,
 
 I created a Pacemaker resource configuration file, but Pacemaker does not
 accept it. It complains:
 /home/jnikom/Kiva/dev/Prod/flvr/pace loadConstraintsXML 
 clu_con_init_2014_05_08_001.xml
error: unpack_rsc_op:No further recovery can be attempted for 
 mysqlres: stop action failed with 'not installed' (5)
error: unpack_rsc_op:No further recovery can be attempted for 
 mysqlres: stop action failed with 'not installed' (5)
error: unpack_rsc_op:No further recovery can be attempted for 
 mysqlres: stop action failed with 'not installed' (5)
error: native_create_actions:Resource mysqlres 
 (lsb::my_db_res_master) is active on 3 nodes attempting recovery
 /home/jnikom/Kiva/dev/Prod/flvr/pace   
 
 Is there any tool that could help to find out what specifically Pacemaker 
 does not like in the resource configuration?

Either the agent isn't on all nodes, or it's the agent that doesn't like your 
configuration.
Some tool it requires is probably missing.
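
(A couple of checks that usually narrow this down; the init script name below
is only inferred from the lsb::my_db_res_master resource class and may differ
on your system:

  # validate the live CIB / configuration itself
  crm_verify --live-check -V

  # on each node, confirm the LSB script actually exists and behaves sanely
  ls -l /etc/init.d/my_db_res_master
  /etc/init.d/my_db_res_master status; echo "exit=$?"
)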

 
 Best regards,
 
 Jacob Nikom
 ___
 Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
 http://oss.clusterlabs.org/mailman/listinfo/pacemaker
 
 Project Home: http://www.clusterlabs.org
 Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
 Bugs: http://bugs.clusterlabs.org



___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] Require pointers on how to build an rpm for a specific Pacemaker Release.

2014-05-08 Thread Andrew Beekhof

On 9 May 2014, at 2:32 am, Monali Porob monali.po...@huawei.com wrote:

 Hi,
 I was able to build an rpm from the current sources of Pacemaker on CentOS
 6.5. I followed the steps described at 
 http://blog.clusterlabs.org/blog/2013/Pacemaker-1-dot-1-10-final/ . 
  
 I would like to know how to make a change to a specific Pacemaker release,
 then compile it and build an rpm.
 That is, is there a way to download the Pacemaker 1.1.11 source code, make 
 some changes locally in the code, compile those changes, and build the 
 Pacemaker rpm?  

git co Pacemaker-${someversion}
make rpm
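
(Expanded into a full sequence; the tag name follows the project's
Pacemaker-X.Y.Z release-tag convention and the clone URL is the usual
upstream repository:

  # fetch the source and switch to the released tag
  git clone https://github.com/ClusterLabs/pacemaker.git
  cd pacemaker
  git checkout Pacemaker-1.1.11

  # ... apply your local changes here ...

  # build RPM packages from the checked-out tree
  make rpm
)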

  
 Regards,
 Monali
 ___
 Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
 http://oss.clusterlabs.org/mailman/listinfo/pacemaker
 
 Project Home: http://www.clusterlabs.org
 Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
 Bugs: http://bugs.clusterlabs.org



___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Question] About quorum-policy=freeze and promote.

2014-05-08 Thread Andrew Beekhof

On 8 May 2014, at 1:37 pm, renayama19661...@ybb.ne.jp wrote:

 Hi All,
 
 I composed a Master/Slave resource on three nodes with
 no-quorum-policy=freeze set.
 (I use the Stateful RA for the Master/Slave resource.)
 
 -
 Current DC: srv01 (3232238280) - partition with quorum
 Version: 1.1.11-830af67
 3 Nodes configured
 9 Resources configured
 
 
 Online: [ srv01 srv02 srv03 ]
 
 Resource Group: grpStonith1
 prmStonith1-1  (stonith:external/ssh): Started srv02 
 Resource Group: grpStonith2
 prmStonith2-1  (stonith:external/ssh): Started srv01 
 Resource Group: grpStonith3
 prmStonith3-1  (stonith:external/ssh): Started srv01 
 Master/Slave Set: msPostgresql [pgsql]
 Masters: [ srv01 ]
 Slaves: [ srv02 srv03 ]
 Clone Set: clnPingd [prmPingd]
 Started: [ srv01 srv02 srv03 ]
 -
 
 
 A Master resource is started on every node when I interrupt the
 cluster interconnect communication between all of the nodes.
 
 -
 Node srv02 (3232238290): UNCLEAN (offline)
 Node srv03 (3232238300): UNCLEAN (offline)
 Online: [ srv01 ]
 
 Resource Group: grpStonith1
 prmStonith1-1  (stonith:external/ssh): Started srv02 
 Resource Group: grpStonith2
 prmStonith2-1  (stonith:external/ssh): Started srv01 
 Resource Group: grpStonith3
 prmStonith3-1  (stonith:external/ssh): Started srv01 
 Master/Slave Set: msPostgresql [pgsql]
 Masters: [ srv01 ]
 Slaves: [ srv02 srv03 ]
 Clone Set: clnPingd [prmPingd]
 Started: [ srv01 srv02 srv03 ]
 (snip)
 Node srv01 (3232238280): UNCLEAN (offline)
 Node srv03 (3232238300): UNCLEAN (offline)
 Online: [ srv02 ]
 
 Resource Group: grpStonith1
 prmStonith1-1  (stonith:external/ssh): Started srv02 
 Resource Group: grpStonith2
 prmStonith2-1  (stonith:external/ssh): Started srv01 
 Resource Group: grpStonith3
 prmStonith3-1  (stonith:external/ssh): Started srv01 
 Master/Slave Set: msPostgresql [pgsql]
 Masters: [ srv01 srv02 ]
 Slaves: [ srv03 ]
 Clone Set: clnPingd [prmPingd]
 Started: [ srv01 srv02 srv03 ]
 (snip)
 Node srv01 (3232238280): UNCLEAN (offline)
 Node srv02 (3232238290): UNCLEAN (offline)
 Online: [ srv03 ]
 
 Resource Group: grpStonith1
 prmStonith1-1  (stonith:external/ssh): Started srv02 
 Resource Group: grpStonith2
 prmStonith2-1  (stonith:external/ssh): Started srv01 
 Resource Group: grpStonith3
 prmStonith3-1  (stonith:external/ssh): Started srv01 
 Master/Slave Set: msPostgresql [pgsql]
 Masters: [ srv01 srv03 ]
 Slaves: [ srv02 ]
 Clone Set: clnPingd [prmPingd]
 Started: [ srv01 srv02 srv03 ]
 -
 
 I think that promoting the Master/Slave resource even when the cluster
 loses quorum is Pacemaker's specified behavior.
 
 Is it the responsibility of the resource agent to prevent this state of
 multiple Masters?

No.

In this scenario, no nodes have quorum and therefore no additional instances 
should have been promoted.  That's the definition of freeze :)
Even if one partition DID have quorum, no instances should have been promoted 
without fencing occurring first.
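
(For reference, the behavior described here corresponds to cluster options
along these lines, shown in crm shell syntax; no-quorum-policy=freeze is the
setting under discussion, and the grpStonith resources imply stonith-enabled:

  crm configure property no-quorum-policy=freeze stonith-enabled=true

With these options, a partition that has lost quorum must keep its resources
frozen, and even a quorate partition must fence the unseen nodes before
promoting additional Master instances.)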

 * I think the drbd RA has such a mechanism.
 * But the Stateful RA has no such mechanism.
 * As an example, I think a mechanism like drbd's is always necessary
 when creating a new Master/Slave resource.
 
 Is my understanding wrong?
 
 Best Regards,
 Hideo Yamauchi.
 
 
 ___
 Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
 http://oss.clusterlabs.org/mailman/listinfo/pacemaker
 
 Project Home: http://www.clusterlabs.org
 Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
 Bugs: http://bugs.clusterlabs.org



___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Question] About quorum-policy=freeze and promote.

2014-05-08 Thread Andrew Beekhof

On 9 May 2014, at 2:05 pm, renayama19661...@ybb.ne.jp wrote:

 Hi Andrew,
 
 Thank you for comment.
 
 Is it the responsibility of the resource agent to prevent this state of
 multiple Masters?
 
 No.
 
 In this scenario, no nodes have quorum and therefore no additional instances 
 should have been promoted.  That's the definition of freeze :)
 Even if one partition DID have quorum, no instances should have been 
 promoted without fencing occurring first.
 
 Okay.
 I hope this problem will be fixed in the next release.

crm_report?
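
(A crm_report covering the test window can be generated with something along
these lines; the time range and the output name freeze-issue are placeholders:

  crm_report -f "2014-05-08 00:00" -t "2014-05-08 23:59" freeze-issue
)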

 
 Many Thanks!
 Hideo Yamauchi.
 
 --- On Fri, 2014/5/9, Andrew Beekhof and...@beekhof.net wrote:
 
 
 On 8 May 2014, at 1:37 pm, renayama19661...@ybb.ne.jp wrote:
 
 Hi All,
 
 I composed a Master/Slave resource on three nodes with
 no-quorum-policy=freeze set.
 (I use the Stateful RA for the Master/Slave resource.)
 
 -
 Current DC: srv01 (3232238280) - partition with quorum
 Version: 1.1.11-830af67
 3 Nodes configured
 9 Resources configured
 
 
 Online: [ srv01 srv02 srv03 ]
 
 Resource Group: grpStonith1
  prmStonith1-1  (stonith:external/ssh): Started srv02 
 Resource Group: grpStonith2
  prmStonith2-1  (stonith:external/ssh): Started srv01 
 Resource Group: grpStonith3
  prmStonith3-1  (stonith:external/ssh): Started srv01 
 Master/Slave Set: msPostgresql [pgsql]
  Masters: [ srv01 ]
  Slaves: [ srv02 srv03 ]
 Clone Set: clnPingd [prmPingd]
  Started: [ srv01 srv02 srv03 ]
 -
 
 
 A Master resource is started on every node when I interrupt the
 cluster interconnect communication between all of the nodes.
 
 -
 Node srv02 (3232238290): UNCLEAN (offline)
 Node srv03 (3232238300): UNCLEAN (offline)
 Online: [ srv01 ]
 
 Resource Group: grpStonith1
  prmStonith1-1  (stonith:external/ssh): Started srv02 
 Resource Group: grpStonith2
  prmStonith2-1  (stonith:external/ssh): Started srv01 
 Resource Group: grpStonith3
  prmStonith3-1  (stonith:external/ssh): Started srv01 
 Master/Slave Set: msPostgresql [pgsql]
  Masters: [ srv01 ]
  Slaves: [ srv02 srv03 ]
 Clone Set: clnPingd [prmPingd]
  Started: [ srv01 srv02 srv03 ]
 (snip)
 Node srv01 (3232238280): UNCLEAN (offline)
 Node srv03 (3232238300): UNCLEAN (offline)
 Online: [ srv02 ]
 
 Resource Group: grpStonith1
  prmStonith1-1  (stonith:external/ssh): Started srv02 
 Resource Group: grpStonith2
  prmStonith2-1  (stonith:external/ssh): Started srv01 
 Resource Group: grpStonith3
  prmStonith3-1  (stonith:external/ssh): Started srv01 
 Master/Slave Set: msPostgresql [pgsql]
  Masters: [ srv01 srv02 ]
  Slaves: [ srv03 ]
 Clone Set: clnPingd [prmPingd]
  Started: [ srv01 srv02 srv03 ]
 (snip)
 Node srv01 (3232238280): UNCLEAN (offline)
 Node srv02 (3232238290): UNCLEAN (offline)
 Online: [ srv03 ]
 
 Resource Group: grpStonith1
  prmStonith1-1  (stonith:external/ssh): Started srv02 
 Resource Group: grpStonith2
  prmStonith2-1  (stonith:external/ssh): Started srv01 
 Resource Group: grpStonith3
  prmStonith3-1  (stonith:external/ssh): Started srv01 
 Master/Slave Set: msPostgresql [pgsql]
  Masters: [ srv01 srv03 ]
  Slaves: [ srv02 ]
 Clone Set: clnPingd [prmPingd]
  Started: [ srv01 srv02 srv03 ]
 -
 
 I think that promoting the Master/Slave resource even when the cluster
 loses quorum is Pacemaker's specified behavior.
 
 Is it the responsibility of the resource agent to prevent this state of
 multiple Masters?
 
 No.
 
 In this scenario, no nodes have quorum and therefore no additional instances 
 should have been promoted.  That's the definition of freeze :)
 Even if one partition DID have quorum, no instances should have been 
 promoted without fencing occurring first.
 
 * I think the drbd RA has such a mechanism.
 * But the Stateful RA has no such mechanism.
 * As an example, I think a mechanism like drbd's is always necessary
 when creating a new Master/Slave resource.
 
 Is my understanding wrong?
 
 Best Regards,
 Hideo Yamauchi.
 
 
 ___
 Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
 http://oss.clusterlabs.org/mailman/listinfo/pacemaker
 
 Project Home: http://www.clusterlabs.org
 Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
 Bugs: http://bugs.clusterlabs.org
 
 



___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org