[Pacemaker] [Problem] A rebooted node does not recognize the state of a node cut off from the cluster.

2013-06-03 Thread renayama19661014
Hi All,

We confirmed how the cluster recognizes node state using the following procedure.
We tested with the following combination (RHEL 6.4 guests):
 * corosync-2.3.0
 * pacemaker-1.1.10-rc3

-

Step 1) Start all nodes and form a cluster.

[root@rh64-coro1 ~]# crm_mon -1 -Af
Last updated: Tue Jun  4 22:30:25 2013
Last change: Tue Jun  4 22:22:54 2013 via crmd on rh64-coro1
Stack: corosync
Current DC: rh64-coro3 (4231178432) - partition with quorum
Version: 1.1.9-db294e1
3 Nodes configured, unknown expected votes
0 Resources configured.


Online: [ rh64-coro1 rh64-coro2 rh64-coro3 ]


Node Attributes:
* Node rh64-coro1:
* Node rh64-coro2:
* Node rh64-coro3:

Migration summary:
* Node rh64-coro1: 
* Node rh64-coro3: 
* Node rh64-coro2: 


Step 2) Stop the first node (rh64-coro1).

[root@rh64-coro2 ~]# crm_mon -1 -Af
Last updated: Tue Jun  4 22:30:55 2013
Last change: Tue Jun  4 22:22:54 2013 via crmd on rh64-coro1
Stack: corosync
Current DC: rh64-coro3 (4231178432) - partition with quorum
Version: 1.1.9-db294e1
3 Nodes configured, unknown expected votes
0 Resources configured.


Online: [ rh64-coro2 rh64-coro3 ]
OFFLINE: [ rh64-coro1 ]


Node Attributes:
* Node rh64-coro2:
* Node rh64-coro3:

Migration summary:
* Node rh64-coro3: 
* Node rh64-coro2: 


Step 3) Restart the first node (rh64-coro1).

[root@rh64-coro1 ~]# crm_mon -1 -Af
Last updated: Tue Jun  4 22:31:29 2013
Last change: Tue Jun  4 22:22:54 2013 via crmd on rh64-coro1
Stack: corosync
Current DC: rh64-coro3 (4231178432) - partition with quorum
Version: 1.1.9-db294e1
3 Nodes configured, unknown expected votes
0 Resources configured.


Online: [ rh64-coro1 rh64-coro2 rh64-coro3 ]


Node Attributes:
* Node rh64-coro1:
* Node rh64-coro2:
* Node rh64-coro3:

Migration summary:
* Node rh64-coro1: 
* Node rh64-coro3: 
* Node rh64-coro2: 


Step 4) Break the interconnect between all nodes.

[root@kvm-host ~]# brctl delif virbr2 vnet1;brctl delif virbr2 vnet4;brctl 
delif virbr2 vnet7;brctl delif virbr3 vnet2;brctl delif virbr3 vnet5;brctl 
delif virbr3 vnet8

-


The two nodes that were not rebooted then recognize the other nodes correctly.

[root@rh64-coro2 ~]# crm_mon -1 -Af
Last updated: Tue Jun  4 22:32:06 2013
Last change: Tue Jun  4 22:22:54 2013 via crmd on rh64-coro1
Stack: corosync
Current DC: rh64-coro2 (4214401216) - partition WITHOUT quorum
Version: 1.1.9-db294e1
3 Nodes configured, unknown expected votes
0 Resources configured.


Node rh64-coro1 (4197624000): UNCLEAN (offline)
Node rh64-coro3 (4231178432): UNCLEAN (offline)
Online: [ rh64-coro2 ]


Node Attributes:
* Node rh64-coro2:

Migration summary:
* Node rh64-coro2: 

[root@rh64-coro3 ~]# crm_mon -1 -Af
Last updated: Tue Jun  4 22:33:17 2013
Last change: Tue Jun  4 22:22:54 2013 via crmd on rh64-coro1
Stack: corosync
Current DC: rh64-coro3 (4231178432) - partition WITHOUT quorum
Version: 1.1.9-db294e1
3 Nodes configured, unknown expected votes
0 Resources configured.


Node rh64-coro1 (4197624000): UNCLEAN (offline)
Node rh64-coro2 (4214401216): UNCLEAN (offline)
Online: [ rh64-coro3 ]


Node Attributes:
* Node rh64-coro3:

Migration summary:
* Node rh64-coro3: 


However, the node that was rebooted does not correctly recognize the state of
one of the nodes.

[root@rh64-coro1 ~]# crm_mon -1 -Af
Last updated: Tue Jun  4 22:33:31 2013
Last change: Tue Jun  4 22:22:54 2013 via crmd on rh64-coro1
Stack: corosync
Current DC: rh64-coro1 (4197624000) - partition WITHOUT quorum
Version: 1.1.9-db294e1
3 Nodes configured, unknown expected votes
0 Resources configured.


Node rh64-coro3 (4231178432): UNCLEAN (offline)   --> OK.
Online: [ rh64-coro1 rh64-coro2 ]   --> rh64-coro2 is wrong (NG).


Node Attributes:
* Node rh64-coro1:
* Node rh64-coro2:

Migration summary:
* Node rh64-coro1: 
* Node rh64-coro2: 


The correct behavior would be for the rebooted node to also recognize the other
nodes as UNCLEAN, but it appears to mis-recognize rh64-coro2.

This looks like a problem in Pacemaker itself.
 * There seems to be a problem with the crm_peer_cache hash table.

Best Regards,
Hideo Yamauchi.


___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Problem] A rebooted node does not recognize the state of a node cut off from the cluster.

2013-06-03 Thread renayama19661014
Hi All,

I registered this problem with Bugzilla.

 * http://bugs.clusterlabs.org/show_bug.cgi?id=5160

Best Regards,
Hideo Yamauchi.

--- On Tue, 2013/6/4, renayama19661...@ybb.ne.jp  
wrote:

> Hi All,
> 
> We confirmed a state of the recognition of the cluster in the next procedure.
> We confirm it by the next combination.(RHEL6.4 guest)
>  * corosync-2.3.0
>  * pacemaker-Pacemaker-1.1.10-rc3
> 
> -
> 
> Step 1) Start all nodes and constitute a cluster.
> 
> [root@rh64-coro1 ~]# crm_mon -1 -Af
> Last updated: Tue Jun  4 22:30:25 2013
> Last change: Tue Jun  4 22:22:54 2013 via crmd on rh64-coro1
> Stack: corosync
> Current DC: rh64-coro3 (4231178432) - partition with quorum
> Version: 1.1.9-db294e1
> 3 Nodes configured, unknown expected votes
> 0 Resources configured.
> 
> 
> Online: [ rh64-coro1 rh64-coro2 rh64-coro3 ]
> 
> 
> Node Attributes:
> * Node rh64-coro1:
> * Node rh64-coro2:
> * Node rh64-coro3:
> 
> Migration summary:
> * Node rh64-coro1: 
> * Node rh64-coro3: 
> * Node rh64-coro2: 
> 
> 
> Step 2) Stop the first unit node.
> 
> [root@rh64-coro2 ~]# crm_mon -1 -Af
> Last updated: Tue Jun  4 22:30:55 2013
> Last change: Tue Jun  4 22:22:54 2013 via crmd on rh64-coro1
> Stack: corosync
> Current DC: rh64-coro3 (4231178432) - partition with quorum
> Version: 1.1.9-db294e1
> 3 Nodes configured, unknown expected votes
> 0 Resources configured.
> 
> 
> Online: [ rh64-coro2 rh64-coro3 ]
> OFFLINE: [ rh64-coro1 ]
> 
> 
> Node Attributes:
> * Node rh64-coro2:
> * Node rh64-coro3:
> 
> Migration summary:
> * Node rh64-coro3: 
> * Node rh64-coro2: 
> 
> 
> Step 3) Restart the first unit node.
> 
> [root@rh64-coro1 ~]# crm_mon -1 -Af
> Last updated: Tue Jun  4 22:31:29 2013
> Last change: Tue Jun  4 22:22:54 2013 via crmd on rh64-coro1
> Stack: corosync
> Current DC: rh64-coro3 (4231178432) - partition with quorum
> Version: 1.1.9-db294e1
> 3 Nodes configured, unknown expected votes
> 0 Resources configured.
> 
> 
> Online: [ rh64-coro1 rh64-coro2 rh64-coro3 ]
> 
> 
> Node Attributes:
> * Node rh64-coro1:
> * Node rh64-coro2:
> * Node rh64-coro3:
> 
> Migration summary:
> * Node rh64-coro1: 
> * Node rh64-coro3: 
> * Node rh64-coro2: 
> 
> 
> Step 4) Interrupt the inter-connect of all nodes.
> 
> [root@kvm-host ~]# brctl delif virbr2 vnet1;brctl delif virbr2 vnet4;brctl 
> delif virbr2 vnet7;brctl delif virbr3 vnet2;brctl delif virbr3 vnet5;brctl 
> delif virbr3 vnet8
> 
> -
> 
> 
> Two nodes that do not reboot then recognize other nodes definitely.
> 
> [root@rh64-coro2 ~]# crm_mon -1 -Af
> Last updated: Tue Jun  4 22:32:06 2013
> Last change: Tue Jun  4 22:22:54 2013 via crmd on rh64-coro1
> Stack: corosync
> Current DC: rh64-coro2 (4214401216) - partition WITHOUT quorum
> Version: 1.1.9-db294e1
> 3 Nodes configured, unknown expected votes
> 0 Resources configured.
> 
> 
> Node rh64-coro1 (4197624000): UNCLEAN (offline)
> Node rh64-coro3 (4231178432): UNCLEAN (offline)
> Online: [ rh64-coro2 ]
> 
> 
> Node Attributes:
> * Node rh64-coro2:
> 
> Migration summary:
> * Node rh64-coro2: 
> 
> [root@rh64-coro3 ~]# crm_mon -1 -Af
> Last updated: Tue Jun  4 22:33:17 2013
> Last change: Tue Jun  4 22:22:54 2013 via crmd on rh64-coro1
> Stack: corosync
> Current DC: rh64-coro3 (4231178432) - partition WITHOUT quorum
> Version: 1.1.9-db294e1
> 3 Nodes configured, unknown expected votes
> 0 Resources configured.
> 
> 
> Node rh64-coro1 (4197624000): UNCLEAN (offline)
> Node rh64-coro2 (4214401216): UNCLEAN (offline)
> Online: [ rh64-coro3 ]
> 
> 
> Node Attributes:
> * Node rh64-coro3:
> 
> Migration summary:
> * Node rh64-coro3: 
> 
> 
> However, the node that rebooted does not recognize the state of one node 
> definitely.
> 
> [root@rh64-coro1 ~]# crm_mon -1 -Af
> Last updated: Tue Jun  4 22:33:31 2013
> Last change: Tue Jun  4 22:22:54 2013 via crmd on rh64-coro1
> Stack: corosync
> Current DC: rh64-coro1 (4197624000) - partition WITHOUT quorum
> Version: 1.1.9-db294e1
> 3 Nodes configured, unknown expected votes
> 0 Resources configured.
> 
> 
> Node rh64-coro3 (4231178432): UNCLEAN (offline)> OKay.
> Online: [ rh64-coro1 rh64-coro2 ] --> rh64-coro2 
> NG.
> 
> 
> Node Attributes:
> * Node rh64-coro1:
> * Node rh64-coro2:
> 
> Migration summary:
> * Node rh64-coro1: 
> * Node rh64-coro2: 
> 
> 
> It is right movement that recognize other nodes in a UNCLEAN state in the 
> node that rebooted, but seems to recognize it by mistake.
> 
> It is like the problem of Pacemaker somehow or other.
>  * There seems to be the problem with crm_peer_cache hush table.
> 
> Best Regards,
> Hideo Yamauchi.
> 
> 
> ___
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
> 

_

Re: [Pacemaker] [Problem] A rebooted node does not recognize the state of a node cut off from the cluster.

2013-06-04 Thread renayama19661014
Hi Andrew,

> Yep, sounds like a problem.
> I'll follow up on bugzilla

All right!

Many Thanks!
Hideo Yamauchi.

--- On Tue, 2013/6/4, Andrew Beekhof  wrote:

> 
> On 04/06/2013, at 3:00 PM, renayama19661...@ybb.ne.jp wrote:
> 
> > 
> > It is right movement that recognize other nodes in a UNCLEAN state in the 
> > node that rebooted, but seems to recognize it by mistake.
> > 
> > It is like the problem of Pacemaker somehow or other.
> > * There seems to be the problem with crm_peer_cache hush table.
> 
> Yep, sounds like a problem.
> I'll follow up on bugzilla

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[Pacemaker] [Problem] Two error entries are displayed for a single failure.

2013-08-28 Thread renayama19661014
Hi All,

Although the failure occurred only once, crm_mon displays two error entries.

-

[root@rh64-coro2 ~]# crm_mon -1 -Af
Last updated: Thu Aug 29 18:11:00 2013
Last change: Thu Aug 29 18:10:45 2013 via cibadmin on rh64-coro2
Stack: corosync
Current DC: NONE
1 Nodes configured
1 Resources configured


Online: [ rh64-coro2 ]


Node Attributes:
* Node rh64-coro2:

Migration summary:
* Node rh64-coro2: 
   dummy: migration-threshold=1 fail-count=1 last-failure='Thu Aug 29 18:10:57 
2013'

Failed actions:
dummy_monitor_3000 on (null) 'not running' (7): call=11, status=complete, 
last-rc-change='Thu Aug 29 18:10:57 2013', queued=0ms, exec=0ms
dummy_monitor_3000 on rh64-coro2 'not running' (7): call=11, 
status=complete, last-rc-change='Thu Aug 29 18:10:57 2013', queued=0ms, exec=0ms

-

The problem seems to be in the logic that adds the failed operation to the list
of failed actions: the operation is added twice, and the first copy is added
before the node name has been attached to it, which is why one of the entries
is reported as running on "(null)".

-
static void
unpack_rsc_op_failure(resource_t *rsc, node_t *node, int rc, xmlNode *xml_op,
                      enum action_fail_response *on_fail, pe_working_set_t * data_set)
{
    int interval = 0;
    bool is_probe = FALSE;
    action_t *action = NULL;
(snip)
    if (rc != PCMK_OCF_NOT_INSTALLED || is_set(data_set->flags, pe_flag_symmetric_cluster)) {
        if ((node->details->shutdown == FALSE) || (node->details->online == TRUE)) {
            /* first copy: added before the node name is set on xml_op */
            add_node_copy(data_set->failed, xml_op);
        }
    }

    crm_xml_add(xml_op, XML_ATTR_UNAME, node->details->uname);
    if ((node->details->shutdown == FALSE) || (node->details->online == TRUE)) {
        /* second copy: the same operation is added again, producing the duplicate entry */
        add_node_copy(data_set->failed, xml_op);
    }
(snip)
-


Please revise the handling that adds this error information.

Best Regards,
Hideo Yamauchi.


___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Problem] Two error entries are displayed for a single failure.

2013-08-29 Thread renayama19661014
Hi Andreas,

Thank you for comment.

> But to be seriously: I see this phaenomena, too.
> (pacemaker 1.1.11-1.el6-4f672bc)

If the version you tested is the same as the following, it is probably the same
problem. The code contains similar logic.
(https://github.com/ClusterLabs/pacemaker/blob/4f672bc85eefd33e2fb09b601bb8ec1510645468/lib/pengine/unpack.c)

Best Regards,
Hideo Yamauchi.

--- On Thu, 2013/8/29, Andreas Mock  wrote:

> Hi Hideo san,
> 
> the two line shall emphasis that you do not only have trouble
> but real trouble...  ;-)
> 
> But to be seriously: I see this phaenomena, too.
> (pacemaker 1.1.11-1.el6-4f672bc)
> 
> Best regards
> Andreas Mock
> 
> -Ursprüngliche Nachricht-
> Von: renayama19661...@ybb.ne.jp [mailto:renayama19661...@ybb.ne.jp] 
> Gesendet: Donnerstag, 29. August 2013 02:38
> An: PaceMaker-ML
> Betreff: [Pacemaker] [Problem]Two error information is displayed.
> 
> Hi All,
> 
> Though the trouble is only once, two error information is displayed in
> crm_mon.
> 
> -
> 
> [root@rh64-coro2 ~]# crm_mon -1 -Af
> Last updated: Thu Aug 29 18:11:00 2013
> Last change: Thu Aug 29 18:10:45 2013 via cibadmin on rh64-coro2
> Stack: corosync
> Current DC: NONE
> 1 Nodes configured
> 1 Resources configured
> 
> 
> Online: [ rh64-coro2 ]
> 
> 
> Node Attributes:
> * Node rh64-coro2:
> 
> Migration summary:
> * Node rh64-coro2: 
>    dummy: migration-threshold=1 fail-count=1 last-failure='Thu Aug 29
> 18:10:57 2013'
> 
> Failed actions:
>     dummy_monitor_3000 on (null) 'not running' (7): call=11,
> status=complete, last-rc-change='Thu Aug 29 18:10:57 2013', queued=0ms,
> exec=0ms
>     dummy_monitor_3000 on rh64-coro2 'not running' (7): call=11,
> status=complete, last-rc-change='Thu Aug 29 18:10:57 2013', queued=0ms,
> exec=0ms
> 
> -
> 
> There seems to be the problem with an additional judgment of the error
> information somehow or other.
> 
> -
> static void
> unpack_rsc_op_failure(resource_t *rsc, node_t *node, int rc, xmlNode
> *xml_op, enum action_fail_response *on_fail, pe_working_set_t * data_set) 
> {
>     int interval = 0;
>     bool is_probe = FALSE;
>     action_t *action = NULL;
> (snip)
>     if (rc != PCMK_OCF_NOT_INSTALLED || is_set(data_set->flags,
> pe_flag_symmetric_cluster)) {
>         if ((node->details->shutdown == FALSE) || (node->details->online ==
> TRUE)) {
>             add_node_copy(data_set->failed, xml_op);
>         }
>     }
> 
>     crm_xml_add(xml_op, XML_ATTR_UNAME, node->details->uname);
>     if ((node->details->shutdown == FALSE) || (node->details->online ==
> TRUE)) {
>         add_node_copy(data_set->failed, xml_op);
>     }
> (snip)
> -
> 
> 
> Please revise the additional handling of error information.
> 
> Best Regards,
> Hideo Yamauchi.
> 
> 
> ___
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
> 
> 

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Problem] Two error entries are displayed for a single failure.

2013-09-03 Thread renayama19661014
Hi Andrew,

> > Hi All,
> > 
> > Though the trouble is only once, two error information is displayed in 
> > crm_mon.
> 
> Have you got the full cib for when crm_mon is showing this?

No.
I will reproduce the problem once more and capture the CIB.

Best Regards,
Hideo Yamauchi.

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Problem] Two error entries are displayed for a single failure.

2013-09-03 Thread renayama19661014
Hi Andrew,

> > > Though the trouble is only once, two error information is displayed in 
> > > crm_mon.
> > 
> > Have you got the full cib for when crm_mon is showing this?
> 
> No.
> I reproduce a problem once again and acquire cib.

I am sending the output I captured with the cibadmin -Q command.

Best Regards,
Hideo Yamauchi.

--- On Tue, 2013/9/3, renayama19661...@ybb.ne.jp  
wrote:

> Hi Andrew,
> 
> > > Hi All,
> > > 
> > > Though the trouble is only once, two error information is displayed in 
> > > crm_mon.
> > 
> > Have you got the full cib for when crm_mon is showing this?
> 
> No.
> I reproduce a problem once again and acquire cib.
> 
> Best Regards,
> Hideo Yamauchi.
> 
> ___
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
>

[The cibadmin -Q output (CIB XML) pasted here has been stripped by the list archive.]

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Problem] Two error entries are displayed for a single failure.

2013-09-03 Thread renayama19661014
Hi Andrew,

I confirmed that the problem is solved by the fix.

Thanks!
Hideo Yamauchi.

--- On Wed, 2013/9/4, Andrew Beekhof  wrote:

> Thanks (also to Andreas for sending me an example too)!
> 
> Fixed:
>    https://github.com/beekhof/pacemaker/commit/a32474b
> 
> On 04/09/2013, at 11:02 AM, renayama19661...@ybb.ne.jp wrote:
> 
> > Hi Andrew,
> > 
>  Though the trouble is only once, two error information is displayed in 
>  crm_mon.
> >>> 
> >>> Have you got the full cib for when crm_mon is showing this?
> >> 
> >> No.
> >> I reproduce a problem once again and acquire cib.
> > 
> > I send the result that I acquired by cibadmin -Q command.
> > 
> > Best Regards,
> > Hideo Yamauchi.
> > 
> > --- On Tue, 2013/9/3, renayama19661...@ybb.ne.jp 
> >  wrote:
> > 
> >> Hi Andrew,
> >> 
>  Hi All,
>  
>  Though the trouble is only once, two error information is displayed in 
>  crm_mon.
> >>> 
> >>> Have you got the full cib for when crm_mon is showing this?
> >> 
> >> No.
> >> I reproduce a problem once again and acquire cib.
> >> 
> >> Best Regards,
> >> Hideo Yamauchi.
> >> 
> >> ___
> >> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> >> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> >> 
> >> Project Home: http://www.clusterlabs.org
> >> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> >> Bugs: http://bugs.clusterlabs.org
> > ___
> > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> > 
> > Project Home: http://www.clusterlabs.org
> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > Bugs: http://bugs.clusterlabs.org
> 
> 

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] crmd Segmentation fault at pacemaker 1.0.12

2013-11-20 Thread renayama19661014
Hi Andrew,
Hi Takatsuka san,

What procedure triggered this problem?
If the problem can easily occur in user environments, we think it is necessary
to notify users who are running Pacemaker 1.0.13.

Best Regards,
Hideo Yamauchi.

--- On Thu, 2013/11/14, TAKATSUKA Haruka  wrote:

> Hello Andrew,
> Thank you for the quick modification.
> 
> (Unfortunately the confirmation test is not possible because I don't 
> understand
>  a reproduction method for this crash ...)
> 
> regards,
> ---
> Haruka Takatsuka
> 
> 
> On Thu, 14 Nov 2013 09:32:24 +1100
> Andrew Beekhof  wrote:
> 
> > 
> > On 13 Nov 2013, at 7:36 pm, TAKATSUKA Haruka  wrote:
> > 
> > > Hello,  pacemaker hackers
> > > 
> > > I report crmd's crash at pacemaker 1.0.12 .
> > > 
> > > We are going to upgrade pacemaker 1.0.12 to 1.0.13 .
> > > But I was not able to find a fix for this problem from ChangeLog.
> > > tengine.c:do_te_invoke() is not seem to care for transition_graph==NULL
> > > case in even 1.0.x head code.
> > 
> > This should help:
> > 
> > https://github.com/ClusterLabs/pacemaker-1.0/commit/20f169d9cccb6c889946c64ab09ab4fb7f572f7c
> > 
> > > 
> > > regards,
> > > Haruka Takatsuka.
> > > -
> > > 
> > > [log]
> > > Nov 07 00:00:08 srv1 crmd: [21843]: ERROR: crm_abort: 
> > > abort_transition_graph: Triggered assert at te_utils.c:259 : 
> > > transition_graph != NULL
> > > Nov 07 00:00:08 srv1 heartbeat: [21823]: WARN: Managed 
> > > /usr/lib64/heartbeat/crmd process 21843 killed by signal 11 [SIGSEGV - 
> > > Segmentation violation].
> > > Nov 07 00:00:08 srv1 heartbeat: [21823]: ERROR: Managed 
> > > /usr/lib64/heartbeat/crmd process 21843 dumped core
> > > Nov 07 00:00:08 srv1 heartbeat: [21823]: EMERG: Rebooting system.  
> > > Reason: /usr/lib64/heartbeat/crmd
> > > 
> > > [gdb]
> > > $ gdb -c core.21843 -s crmd.debug crmd
> > > --(snip)--
> > > Program terminated with signal 11, Segmentation fault.
> > > #0  0x004199c4 in do_te_invoke (action=140737488355328,
> > >    cause=C_FSA_INTERNAL, cur_state=S_POLICY_ENGINE,
> > >    current_input=I_FINALIZED, msg_data=0x1b28e20) at tengine.c:186
> > > 186                     if(transition_graph->complete == FALSE) {
> > > --(snip)--
> 
> 
> ___
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
> 

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[Pacemaker] A resource starts on a standby node (the latest attrd does not replace the crmd-transition-delay parameter)

2014-01-13 Thread renayama19661014
Hi All,

I previously filed the following Bugzilla entry about a problem caused by
differences in the timing of attribute updates by attrd.
 * https://developerbugs.linuxfoundation.org/show_bug.cgi?id=2528

We can currently avoid this problem by using the crmd-transition-delay parameter.
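
As a point of reference, a minimal sketch of how this parameter is set in the
CIB. The crmd-transition-delay property itself is real, but the 2s value and
the ids below are only illustrative:

  <crm_config>
    <cluster_property_set id="cib-bootstrap-options">
      <!-- wait briefly before computing a transition so that attribute
           updates from every node's attrd have reached the CIB -->
      <nvpair id="cib-bootstrap-options-crmd-transition-delay"
              name="crmd-transition-delay" value="2s"/>
    </cluster_property_set>
  </crm_config>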

I recently checked whether the rewritten attrd lets us avoid this problem
without that parameter.
 * In the latest attrd, one instance becomes the leader and appears to perform
the attribute updates.

However, the latest attrd does not seem to be a substitute for
crmd-transition-delay.
 * I will post a detailed log later.

We are not happy about having to keep using crmd-transition-delay.
Is there a plan for attrd to handle this problem properly in the future?

Best Regards,
Hideo Yamauchi.


___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] A resource starts on a standby node (the latest attrd does not replace the crmd-transition-delay parameter)

2014-01-13 Thread renayama19661014
Hi Andrew,

Thank you for comments.

> Are you using the new attrd code or the legacy stuff?

I use new attrd.

> 
> If you're not using corosync 2.x or see:
> 
>     crm_notice("Starting mainloop...");
> 
> then its the old code.  The new code could also be used with CMAN but isn't 
> configured to build for in that situation.
> 
> Only the new code makes (or at least should do) crmd-transition-delay 
> redundant.

To me, the new attrd did not appear to make crmd-transition-delay unnecessary.
I will report the details separately.
# It will probably end up in Bugzilla...

Best Regards,
Hideo Yamauchi.

--- On Tue, 2014/1/14, Andrew Beekhof  wrote:

> 
> On 14 Jan 2014, at 3:52 pm, renayama19661...@ybb.ne.jp wrote:
> 
> > Hi All,
> > 
> > I contributed next bugzilla by a problem to occur for the difference of the 
> > timing of the attribute update by attrd before.
> > * https://developerbugs.linuxfoundation.org/show_bug.cgi?id=2528
> > 
> > We can evade this problem now by using crmd-transition-delay parameter.
> > 
> > I confirmed whether I could evade this problem by renewed attrd recently.
> > * In latest attrd, one became a leader and seemed to come to update an 
> > attribute.
> > 
> > However, latest attrd does not seem to substitute for crmd-transition-delay.
> > * I contribute detailed log later.
> > 
> > We are dissatisfied with continuing using crmd-transition-delay.
> > Is there the plan when attrd handles this problem well in the future?
> 
> Are you using the new attrd code or the legacy stuff?
> 
> If you're not using corosync 2.x or see:
> 
>     crm_notice("Starting mainloop...");
> 
> then its the old code.  The new code could also be used with CMAN but isn't 
> configured to build for in that situation.
> 
> Only the new code makes (or at least should do) crmd-transition-delay 
> redundant.
> 

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] A resource starts on a standby node (the latest attrd does not replace the crmd-transition-delay parameter)

2014-01-13 Thread renayama19661014
Hi Andrew,

> >> Are you using the new attrd code or the legacy stuff?
> > 
> > I use new attrd.
> 
> And the values are not being sent to the cib at the same time? 

As far as I could see...
When a node's attrd was late in transmitting its attribute, the attrd leader
appeared to send the attributes to the CIB without waiting for it.

> >> Only the new code makes (or at least should do) crmd-transition-delay 
> >> redundant.
> > 
> > It did not seem to work so that new attrd dispensed with 
> > crmd-transition-delay to me.
> > I report the details again.
> > # Probably it will be Bugzilla. . .
> 
> Sounds good

All right!

Many Thanks!
Hideo Yamauch.

--- On Tue, 2014/1/14, Andrew Beekhof  wrote:

> 
> On 14 Jan 2014, at 4:13 pm, renayama19661...@ybb.ne.jp wrote:
> 
> > Hi Andrew,
> > 
> > Thank you for comments.
> > 
> >> Are you using the new attrd code or the legacy stuff?
> > 
> > I use new attrd.
> 
> And the values are not being sent to the cib at the same time? 
> 
> > 
> >> 
> >> If you're not using corosync 2.x or see:
> >> 
> >>     crm_notice("Starting mainloop...");
> >> 
> >> then its the old code.  The new code could also be used with CMAN but 
> >> isn't configured to build for in that situation.
> >> 
> >> Only the new code makes (or at least should do) crmd-transition-delay 
> >> redundant.
> > 
> > It did not seem to work so that new attrd dispensed with 
> > crmd-transition-delay to me.
> > I report the details again.
> > # Probably it will be Bugzilla. . .
> 
> Sounds good
> 
> > 
> > Best Regards,
> > Hideo Yamauchi.
> > 
> > --- On Tue, 2014/1/14, Andrew Beekhof  wrote:
> > 
> >> 
> >> On 14 Jan 2014, at 3:52 pm, renayama19661...@ybb.ne.jp wrote:
> >> 
> >>> Hi All,
> >>> 
> >>> I contributed next bugzilla by a problem to occur for the difference of 
> >>> the timing of the attribute update by attrd before.
> >>> * https://developerbugs.linuxfoundation.org/show_bug.cgi?id=2528
> >>> 
> >>> We can evade this problem now by using crmd-transition-delay parameter.
> >>> 
> >>> I confirmed whether I could evade this problem by renewed attrd recently.
> >>> * In latest attrd, one became a leader and seemed to come to update an 
> >>> attribute.
> >>> 
> >>> However, latest attrd does not seem to substitute for 
> >>> crmd-transition-delay.
> >>> * I contribute detailed log later.
> >>> 
> >>> We are dissatisfied with continuing using crmd-transition-delay.
> >>> Is there the plan when attrd handles this problem well in the future?
> >> 
> >> Are you using the new attrd code or the legacy stuff?
> >> 
> >> If you're not using corosync 2.x or see:
> >> 
> >>     crm_notice("Starting mainloop...");
> >> 
> >> then its the old code.  The new code could also be used with CMAN but 
> >> isn't configured to build for in that situation.
> >> 
> >> Only the new code makes (or at least should do) crmd-transition-delay 
> >> redundant.
> >> 
> 
> 

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[Pacemaker] [Enhancement] Changing the "globally-unique" attribute of a resource.

2014-01-14 Thread renayama19661014
Hi All,

When a user changes the "globally-unique" attribute of a resource, a problem
occurs.

It affects resources that are managed via a PID file, because the PID file name
depends on the "globally-unique" attribute.

(snip)
if [ ${OCF_RESKEY_CRM_meta_globally_unique} = "false" ]; then
    : ${OCF_RESKEY_pidfile:="$HA_VARRUN/ping-${OCF_RESKEY_name}"}
else
    : ${OCF_RESKEY_pidfile:="$HA_VARRUN/ping-${OCF_RESOURCE_INSTANCE}"}
fi
(snip)


The problem can be reproduced with the following procedure.

* Step1: Start a resource.
(snip)
primitive prmPingd ocf:pacemaker:pingd \
    params name="default_ping_set" host_list="192.168.0.1" multiplier="200" \
    op start interval="0s" timeout="60s" on-fail="restart" \
    op monitor interval="10s" timeout="60s" on-fail="restart" \
    op stop interval="0s" timeout="60s" on-fail="ignore"
clone clnPingd prmPingd
(snip)

* Step2: Change the "globally-unique" attribute.

[root]# crm configure edit
(snip)
clone clnPingd prmPingd \
    meta clone-max="2" clone-node-max="2" globally-unique="true"
(snip)
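
For reference, in the CIB XML this change corresponds to a meta attribute on the
clone, roughly as sketched below (the ids are illustrative, and the primitive's
parameters and operations are omitted):

  <clone id="clnPingd">
    <meta_attributes id="clnPingd-meta_attributes">
      <nvpair id="clnPingd-meta-clone-max" name="clone-max" value="2"/>
      <nvpair id="clnPingd-meta-clone-node-max" name="clone-node-max" value="2"/>
      <!-- flipping this value changes which variable the agent uses for the
           PID file path (OCF_RESKEY_name vs OCF_RESOURCE_INSTANCE) -->
      <nvpair id="clnPingd-meta-globally-unique" name="globally-unique" value="true"/>
    </meta_attributes>
    <primitive id="prmPingd" class="ocf" provider="pacemaker" type="pingd"/>
  </clone>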

* Step3: Stop Pacemaker

However, the resource does not stop, because changing the "globally-unique"
attribute changed the PID file name that the resource agent looks for.

I think this may be a known problem.

I hope this problem will be solved in the future.

Best Regards,
Hideo Yamauchi.

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Enhancement] Changing the "globally-unique" attribute of a resource.

2014-01-14 Thread renayama19661014
Hi Andrew,

Thank you for comment.

> > But, the resource does not stop because PID file was changed as for the 
> > changed resource of the "globally-unique" attribute.
> 
> I'd have expected the stop action to be performed with the old attributes.
> crm_report tarball?

Okay.

I will register this topic in Bugzilla.
I will attach the log to Bugzilla.

Best Regards,
Hideo Yamauchi.
--- On Wed, 2014/1/15, Andrew Beekhof  wrote:

> 
> On 14 Jan 2014, at 7:26 pm, renayama19661...@ybb.ne.jp wrote:
> 
> > Hi All,
> > 
> > When a user changes the "globally-unique" attribute of the resource, a 
> > problem occurs.
> > 
> > When it manages the resource with PID file, this occurs, but this is 
> > because PID file name changes by "globally-unique" attribute.
> > 
> > (snip)
> > if [ ${OCF_RESKEY_CRM_meta_globally_unique} = "false" ]; then
> >    : ${OCF_RESKEY_pidfile:="$HA_VARRUN/ping-${OCF_RESKEY_name}"}
> > else
> >    : ${OCF_RESKEY_pidfile:="$HA_VARRUN/ping-${OCF_RESOURCE_INSTANCE}"}
> > fi
> > (snip)
> 
> This is correct.  The pid file cannot include the instance number when 
> globally-unique is false and must do so when it is true.
> 
> > 
> > 
> > The problem can reappear in the following procedure.
> > 
> > * Step1: Started a resource.
> > (snip)
> > primitive prmPingd ocf:pacemaker:pingd \
> >        params name="default_ping_set" host_list="192.168.0.1" 
> >multiplier="200" \
> >        op start interval="0s" timeout="60s" on-fail="restart" \
> >        op monitor interval="10s" timeout="60s" on-fail="restart" \
> >        op stop interval="0s" timeout="60s" on-fail="ignore"
> > clone clnPingd prmPingd
> > (snip)
> > 
> > * Step2: Change "globally-unique" attribute.
> > 
> > [root]# crm configure edit
> > (snip)
> > clone clnPingd prmPingd \
> >    meta clone-max="2" clone-node-max="2" globally-unique="true"
> > (snip)
> > 
> > * Step3: Stop Pacemaker
> > 
> > But, the resource does not stop because PID file was changed as for the 
> > changed resource of the "globally-unique" attribute.
> 
> I'd have expected the stop action to be performed with the old attributes.
> crm_report tarball?
> 
> 
> > 
> > I think that this is a known problem.
> 
> It wasn't until now.
> 
> > 
> > I wish this problem is solved in the future
> > 
> > Best Regards,
> > Hideo Yamauchi.
> > 
> > ___
> > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> > 
> > Project Home: http://www.clusterlabs.org
> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > Bugs: http://bugs.clusterlabs.org
> 
> 

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Enhancement] Changing the "globally-unique" attribute of a resource.

2014-01-14 Thread renayama19661014
Hi Andrew,

Sorry.

This problem applies to Pacemaker 1.0.
On Pacemaker 1.1.11, the resource stopped correctly.

When the "globally-unique" attribute is changed in Pacemaker 1.1, Pacemaker
appears to restart the resource.

(snip)
Jan 15 18:29:40 rh64-2744 pengine[3369]:  warning: process_rsc_state: Detected active orphan prmClusterMon running on rh64-2744
Jan 15 18:29:40 rh64-2744 pengine[3369]: info: clone_print:  Clone Set: clnClusterMon [prmClusterMon] (unique)
Jan 15 18:29:40 rh64-2744 pengine[3369]: info: native_print:  prmClusterMon:0#011(ocf::pacemaker:ClusterMon):#011Stopped
Jan 15 18:29:40 rh64-2744 pengine[3369]: info: native_print:  prmClusterMon:1#011(ocf::pacemaker:ClusterMon):#011Stopped
Jan 15 18:29:40 rh64-2744 pengine[3369]: info: native_print: prmClusterMon#011(ocf::pacemaker:ClusterMon):#011 ORPHANED Started rh64-2744
Jan 15 18:29:40 rh64-2744 pengine[3369]:   notice: DeleteRsc: Removing prmClusterMon from rh64-2744
Jan 15 18:29:40 rh64-2744 pengine[3369]: info: native_color: Stopping orphan resource prmClusterMon
Jan 15 18:29:40 rh64-2744 pengine[3369]: info: RecurringOp:  Start recurring monitor (10s) for prmClusterMon:0 on rh64-2744
Jan 15 18:29:40 rh64-2744 pengine[3369]: info: RecurringOp:  Start recurring monitor (10s) for prmClusterMon:1 on rh64-2744
Jan 15 18:29:40 rh64-2744 pengine[3369]:   notice: LogActions: Start   prmClusterMon:0#011(rh64-2744)
Jan 15 18:29:40 rh64-2744 pengine[3369]:   notice: LogActions: Start   prmClusterMon:1#011(rh64-2744)
Jan 15 18:29:40 rh64-2744 pengine[3369]:   notice: LogActions: Stop    prmClusterMon#011(rh64-2744)

(snip)

Best Regards,
Hideo Yamauchi.

--- On Wed, 2014/1/15, renayama19661...@ybb.ne.jp  
wrote:

> Hi Andrew,
> 
> Thank you for comment.
> 
> > > But, the resource does not stop because PID file was changed as for the 
> > > changed resource of the "globally-unique" attribute.
> > 
> > I'd have expected the stop action to be performed with the old attributes.
> > crm_report tarball?
> 
> Okay.
> 
> I register this topic with Bugzilla.
> I attach the log to Bugzilla.
> 
> Best Regards,
> Hideo Yamauchi.
> --- On Wed, 2014/1/15, Andrew Beekhof  wrote:
> 
> > 
> > On 14 Jan 2014, at 7:26 pm, renayama19661...@ybb.ne.jp wrote:
> > 
> > > Hi All,
> > > 
> > > When a user changes the "globally-unique" attribute of the resource, a 
> > > problem occurs.
> > > 
> > > When it manages the resource with PID file, this occurs, but this is 
> > > because PID file name changes by "globally-unique" attribute.
> > > 
> > > (snip)
> > > if [ ${OCF_RESKEY_CRM_meta_globally_unique} = "false" ]; then
> > >    : ${OCF_RESKEY_pidfile:="$HA_VARRUN/ping-${OCF_RESKEY_name}"}
> > > else
> > >    : ${OCF_RESKEY_pidfile:="$HA_VARRUN/ping-${OCF_RESOURCE_INSTANCE}"}
> > > fi
> > > (snip)
> > 
> > This is correct.  The pid file cannot include the instance number when 
> > globally-unique is false and must do so when it is true.
> > 
> > > 
> > > 
> > > The problem can reappear in the following procedure.
> > > 
> > > * Step1: Started a resource.
> > > (snip)
> > > primitive prmPingd ocf:pacemaker:pingd \
> > >        params name="default_ping_set" host_list="192.168.0.1" 
> > >multiplier="200" \
> > >        op start interval="0s" timeout="60s" on-fail="restart" \
> > >        op monitor interval="10s" timeout="60s" on-fail="restart" \
> > >        op stop interval="0s" timeout="60s" on-fail="ignore"
> > > clone clnPingd prmPingd
> > > (snip)
> > > 
> > > * Step2: Change "globally-unique" attribute.
> > > 
> > > [root]# crm configure edit
> > > (snip)
> > > clone clnPingd prmPingd \
> > >    meta clone-max="2" clone-node-max="2" globally-unique="true"
> > > (snip)
> > > 
> > > * Step3: Stop Pacemaker
> > > 
> > > But, the resource does not stop because PID file was changed as for the 
> > > changed resource of the "globally-unique" attribute.
> > 
> > I'd have expected the stop action to be performed with the old attributes.
> > crm_report tarball?
> > 
> > 
> > > 
> > > I think that this is a known problem.
> > 
> > It wasn't until now.
> > 
> > > 
> > > I wish this problem is solved in the future
> > > 
> > > Best Regards,
> > > Hideo Yamauchi.
> > > 
> > > ___
> > > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> > > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> > > 
> > > Project Home: http://www.clusterlabs.org
> > > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > > Bugs: http://bugs.clusterlabs.org
> > 
> > 
> 
> ___
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
>


___

[Pacemaker] [Question] About rewriting an order constraint using resource_set.

2014-01-16 Thread renayama19661014
Hi All,

We are checking the behavior of resource_set.

There is a group resource and a clone resource.

(snip)
Stack: corosync
Current DC: srv01 (3232238180) - partition WITHOUT quorum
Version: 1.1.10-f2d0cbc
1 Nodes configured
7 Resources configured


Online: [ srv01 ]

 Resource Group: grpPg
 A  (ocf::heartbeat:Dummy): Started srv01 
 B  (ocf::heartbeat:Dummy): Started srv01 
 C  (ocf::heartbeat:Dummy): Started srv01 
 D  (ocf::heartbeat:Dummy): Started srv01 
 E  (ocf::heartbeat:Dummy): Started srv01 
 F  (ocf::heartbeat:Dummy): Started srv01 
 Clone Set: clnPing [prmPing]
 Started: [ srv01 ]

Node Attributes:
* Node srv01:
+ default_ping_set  : 100   

Migration summary:
* Node srv01: 

(snip)

They are related by the following constraints (the XML was stripped by the list archive):

(snip)
  [constraint XML stripped by the archive]
(snip)
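
Judging from the attribute fragments preserved in the quoted copy further down
the thread, the two constraints were presumably along these lines (the ids are
illustrative and the first= attribute is a guess):

  <rsc_colocation id="rsc_colocation-grpPg-clnPing" score="INFINITY"
                  rsc="grpPg" with-rsc="clnPing"/>
  <rsc_order id="rsc_order-clnPing-grpPg"
             first="clnPing" then="grpPg" symmetrical="false"/>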


We tried rearranging the group members inside a resource_set.
I think the "colocation" constraint can be rewritten as follows (sketch after the snip):

(snip)
  [colocation XML stripped by the archive]
(snip)
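
A sketch of what that colocation might look like as resource sets, reusing the
resource names above (ids are illustrative; note that the relative order of the
two sets decides which side depends on the other, so the direction should be
verified against the original constraint):

  <rsc_colocation id="colocation-grpPg-with-clnPing" score="INFINITY">
    <resource_set id="colocation-set-members">
      <resource_ref id="A"/>
      <resource_ref id="B"/>
      <resource_ref id="C"/>
      <resource_ref id="D"/>
      <resource_ref id="E"/>
      <resource_ref id="F"/>
    </resource_set>
    <resource_set id="colocation-set-ping">
      <resource_ref id="clnPing"/>
    </resource_set>
  </rsc_colocation>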

How should I rewrite the "order" constraint using resource_set?

I thought the following two things were both necessary, but I could not find a
way to express them together:

 * "symmetrical=true" is needed among the resources that made up the group (A
to F).
 * "symmetrical=false" is needed between the former group resources (A to F)
and the clone resource.

I wrote it as follows (sketch after the snip).
However, I think symmetrical="false" then applies to every ordering in the
constraint.
(snip)
  [order XML stripped by the archive]
(snip)
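
The XML above was stripped, but from the description it was presumably a single
order constraint of roughly this shape (ids are illustrative):

  <rsc_order id="order-clnPing-then-members" symmetrical="false">
    <resource_set id="order-set-ping">
      <resource_ref id="clnPing"/>
    </resource_set>
    <resource_set id="order-set-members">
      <resource_ref id="A"/>
      <resource_ref id="B"/>
      <resource_ref id="C"/>
      <resource_ref id="D"/>
      <resource_ref id="E"/>
      <resource_ref id="F"/>
    </resource_set>
  </rsc_order>

Because symmetrical="false" sits on the rsc_order element itself, it applies to
every ordering the constraint generates, including the ordering among A to F,
which appears to be exactly the concern described above.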

Best Regards,
Hideo Yamauchi.



___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Question] About rewriting an order constraint using resource_set.

2014-01-21 Thread renayama19661014
Hi All,

My earlier test seems to have contained a mistake.
The constraint can apparently be replaced by two constraints (sketch below).

> However, I think that symmetircal="false" is applied to all order limitation 
> in this.
> (snip)
>   [quoted XML stripped by the archive]
> (snip)

[The corrected XML posted here was also stripped by the archive.]

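A sketch of how such a pair of constraints might look, reusing the resource
names from the first message (ids are illustrative; the actual XML posted was
stripped by the archive):

  <!-- ordering among the former group members, kept symmetrical -->
  <rsc_order id="order-members" symmetrical="true">
    <resource_set id="order-members-set">
      <resource_ref id="A"/>
      <resource_ref id="B"/>
      <resource_ref id="C"/>
      <resource_ref id="D"/>
      <resource_ref id="E"/>
      <resource_ref id="F"/>
    </resource_set>
  </rsc_order>

  <!-- clnPing before the first member only, asymmetrical; A..F are already
       chained by the constraint above -->
  <rsc_order id="order-ping-first" symmetrical="false">
    <resource_set id="order-ping-first-set1">
      <resource_ref id="clnPing"/>
    </resource_set>
    <resource_set id="order-ping-first-set2">
      <resource_ref id="A"/>
    </resource_set>
  </rsc_order>
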
If my understanding is mistaken, please point it out.

Best Regards,
Hideo Yamauchi.

--- On Fri, 2014/1/17, renayama19661...@ybb.ne.jp  
wrote:

> Hi All,
> 
> We confirm a function of resource_set.
> 
> There were the resource of the group and the resource of the clone.
> 
> (snip)
> Stack: corosync
> Current DC: srv01 (3232238180) - partition WITHOUT quorum
> Version: 1.1.10-f2d0cbc
> 1 Nodes configured
> 7 Resources configured
> 
> 
> Online: [ srv01 ]
> 
>  Resource Group: grpPg
>      A      (ocf::heartbeat:Dummy): Started srv01 
>      B      (ocf::heartbeat:Dummy): Started srv01 
>      C      (ocf::heartbeat:Dummy): Started srv01 
>      D      (ocf::heartbeat:Dummy): Started srv01 
>      E      (ocf::heartbeat:Dummy): Started srv01 
>      F      (ocf::heartbeat:Dummy): Started srv01 
>  Clone Set: clnPing [prmPing]
>      Started: [ srv01 ]
> 
> Node Attributes:
> * Node srv01:
>     + default_ping_set                  : 100       
> 
> Migration summary:
> * Node srv01: 
> 
> (snip)
> 
> These have limitation showing next.
> 
> (snip)
>        rsc="grpPg" with-rsc="clnPing">
>       
>        then="grpPg" symmetrical="false">
>       
> (snip)
> 
> 
> We tried that we rearranged a group in resource_set.
> I think that I can rearrange the limitation of "colocation" as follows.
> 
> (snip)
>       
>         
>           
>           
>           ...
>           
>         
>       
> (snip)
> 
> How should I rearrange the limitation of "order" in resource_set?
> 
> I thought that it was necessary to list two of the next, but a method to 
> express well was not found.
> 
>  * "symmetirical=true" is necessary between the resources that were a group(A 
> to F).
>  * "symmetirical=false" is necessary between the resource that was a group(A 
> to F) and the clone resources.
> 
> I wrote it as follows.
> However, I think that symmetircal="false" is applied to all order limitation 
> in this.
> (snip)
>       
>         
>           
>         
>         
>           
>           ...
>           
>         
>       
> (snip)
> 
> Best Reards,
> Hideo Yamauchi.
> 
> 
> 
> ___
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
> 

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] A resource starts on a standby node (the latest attrd does not replace the crmd-transition-delay parameter)

2014-01-30 Thread renayama19661014
Hi Andrew,

Sorry for the late reply.
I have registered this problem in Bugzilla.
The report file is attached there as well.

 * http://bugs.clusterlabs.org/show_bug.cgi?id=5194

Best Regards,
Hideo Yamauchi.

--- On Tue, 2014/1/14, Andrew Beekhof  wrote:

> 
> On 14 Jan 2014, at 4:33 pm, renayama19661...@ybb.ne.jp wrote:
> 
> > Hi Andrew,
> > 
>  Are you using the new attrd code or the legacy stuff?
> >>> 
> >>> I use new attrd.
> >> 
> >> And the values are not being sent to the cib at the same time? 
> > 
> > As far as I looked. . .
> > When the transmission of the attribute of attrd of the node was late, a 
> > leader of attrd seemed to send an attribute to cib without waiting for it.
> 
> And you have a delay configured?  And this value was set prior to that delay 
> expiring?
> 
> > 
>  Only the new code makes (or at least should do) crmd-transition-delay 
>  redundant.
> >>> 
> >>> It did not seem to work so that new attrd dispensed with 
> >>> crmd-transition-delay to me.
> >>> I report the details again.
> >>> # Probably it will be Bugzilla. . .
> >> 
> >> Sounds good
> > 
> > All right!
> > 
> > Many Thanks!
> > Hideo Yamauch.
> > 
> > --- On Tue, 2014/1/14, Andrew Beekhof  wrote:
> > 
> >> 
> >> On 14 Jan 2014, at 4:13 pm, renayama19661...@ybb.ne.jp wrote:
> >> 
> >>> Hi Andrew,
> >>> 
> >>> Thank you for comments.
> >>> 
>  Are you using the new attrd code or the legacy stuff?
> >>> 
> >>> I use new attrd.
> >> 
> >> And the values are not being sent to the cib at the same time? 
> >> 
> >>> 
>  
>  If you're not using corosync 2.x or see:
>  
>       crm_notice("Starting mainloop...");
>  
>  then its the old code.  The new code could also be used with CMAN but 
>  isn't configured to build for in that situation.
>  
>  Only the new code makes (or at least should do) crmd-transition-delay 
>  redundant.
> >>> 
> >>> It did not seem to work so that new attrd dispensed with 
> >>> crmd-transition-delay to me.
> >>> I report the details again.
> >>> # Probably it will be Bugzilla. . .
> >> 
> >> Sounds good
> >> 
> >>> 
> >>> Best Regards,
> >>> Hideo Yamauchi.
> >>> 
> >>> --- On Tue, 2014/1/14, Andrew Beekhof  wrote:
> >>> 
>  
>  On 14 Jan 2014, at 3:52 pm, renayama19661...@ybb.ne.jp wrote:
>  
> > Hi All,
> > 
> > I contributed next bugzilla by a problem to occur for the difference of 
> > the timing of the attribute update by attrd before.
> > * https://developerbugs.linuxfoundation.org/show_bug.cgi?id=2528
> > 
> > We can evade this problem now by using crmd-transition-delay parameter.
> > 
> > I confirmed whether I could evade this problem by renewed attrd 
> > recently.
> > * In latest attrd, one became a leader and seemed to come to update an 
> > attribute.
> > 
> > However, latest attrd does not seem to substitute for 
> > crmd-transition-delay.
> > * I contribute detailed log later.
> > 
> > We are dissatisfied with continuing using crmd-transition-delay.
> > Is there the plan when attrd handles this problem well in the future?
>  
>  Are you using the new attrd code or the legacy stuff?
>  
>  If you're not using corosync 2.x or see:
>  
>       crm_notice("Starting mainloop...");
>  
>  then its the old code.  The new code could also be used with CMAN but 
>  isn't configured to build for in that situation.
>  
>  Only the new code makes (or at least should do) crmd-transition-delay 
>  redundant.
>  
> >> 
> >> 
> 
> 

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[Pacemaker] [Question:crmsh] How to set sequential=true in crmsh.

2014-02-04 Thread renayama19661014
Hi All,

We tried to set the sequential attribute of a colocation resource_set to true
in crmsh.

We tried the following approaches, but could not get sequential set to true.
-
[pengine]# crm --version
2.0 (Build 7cd5688c164d2949009accc7f172ce559cadbc4b)

- Pattern 1 -
colocation rsc_colocation-grpPg-clnPing INFINITY: msPostgresql:Master vip-master vip-rep sequential=true

  [generated XML stripped by the list archive]

- Pattern 2 -
colocation rsc_colocation-grpPg-clnPing INFINITY: msPostgresql:Master vip-master vip-rep sequential=false

  [generated XML stripped by the list archive]

- Pattern 3 -
colocation rsc_colocation-grpPg-clnPing INFINITY: msPostgresql:Master sequential=true vip-master vip-rep sequential=true

  [generated XML stripped by the list archive]

- Pattern 4 -
colocation rsc_colocation-grpPg-clnPing INFINITY: msPostgresql:Master sequential=false vip-master vip-rep sequential=false

  [generated XML stripped by the list archive]

- Pattern 5 -
colocation rsc_colocation-grpPg-clnPing INFINITY: [ msPostgresql:Master ] [ vip-master vip-rep ]

  [generated XML stripped by the list archive]

- Pattern 6 -
colocation rsc_colocation-grpPg-clnPing INFINITY: ( msPostgresql:Master ) ( vip-master vip-rep )

  [generated XML stripped by the list archive]

- Pattern 7 -
colocation rsc_colocation-grpPg-clnPing INFINITY: [ msPostgresql:Master sequential=true ] [ vip-master vip-rep sequential=true ]

  [generated XML stripped by the list archive]

- Pattern 8 -
colocation rsc_colocation-grpPg-clnPing INFINITY: ( msPostgresql:Master sequential=true ) ( vip-master vip-rep sequential=true )

  [generated XML stripped by the list archive]
-

How can I set the sequential attribute to true from crmsh?
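
For reference, the XML we want crmsh to generate is roughly the following: a
colocation with two resource sets, each carrying sequential="true" (ids are
illustrative):

  <rsc_colocation id="rsc_colocation-grpPg-clnPing" score="INFINITY">
    <resource_set id="rsc_colocation-set-1" sequential="true" role="Master">
      <resource_ref id="msPostgresql"/>
    </resource_set>
    <resource_set id="rsc_colocation-set-2" sequential="true">
      <resource_ref id="vip-master"/>
      <resource_ref id="vip-rep"/>
    </resource_set>
  </rsc_colocation>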

Best Regards,
Hideo Yamauchi.



___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Question:crmsh] How to set sequential=true in crmsh.

2014-02-05 Thread renayama19661014
Hi Kristoffer.

Thank you for the comments.
We will wait for the fix.

Many Thanks!
Hideo Yamauchi.


--- On Wed, 2014/2/5, Kristoffer Grönlund  wrote:

> On Wed, 5 Feb 2014 15:55:42 +0900 (JST)
> renayama19661...@ybb.ne.jp wrote:
> 
> > Hi All,
> > 
> > We tried to set sequential attribute of resource_set of colocation in
> > true in crmsh.
> > 
> > We tried the next method, but true was not able to set it well.
> > 
> > [snipped]
> > 
> > How can true set sequantial attribute if I operate it in crmsh?
> > 
> 
> Hello,
> 
> Unfortunately, the parsing of resource sets in crmsh is incorrect in
> this case. crmsh will never explicitly set sequential to true, only to
> false. This is a bug in both the 2.0 development branch and in all
> previous versions.
> 
> A fix is on the way.
> 
> Thank you,
> Kristoffer
> 
> > Best Regards,
> > Hideo Yamauchi.
> > 
> > 
> > 
> > ___
> > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> > 
> > Project Home: http://www.clusterlabs.org
> > Getting started:
> > http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs:
> > http://bugs.clusterlabs.org
> > 
> 
> 
> 
> -- 
> // Kristoffer Grönlund
> // kgronl...@suse.com
> 

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Question:crmsh] How to set sequential=true in crmsh.

2014-02-06 Thread renayama19661014
Hi Kristoffer,

On RHEL 6.4, crmsh-c8f214020b2c fails to install with the following error.

Is there a problem with my installation procedure?

---

[root@srv01 crmsh-c8f214020b2c]# cat /etc/redhat-release 
Red Hat Enterprise Linux Server release 6.4 (Santiago)

[root@srv01 crmsh-c8f214020b2c]# ./autogen.sh
(snip)
[root@srv01 crmsh-c8f214020b2c]# ./configure --sysconfdir=/etc 
--localstatedir=/var
(snip)
[root@srv01 crmsh-c8f214020b2c]# make install
Making install in doc
make[1]: Entering directory `/opt/crmsh-c8f214020b2c/doc'
a2x -f manpage crm.8.txt
WARNING: crm.8.txt: line 621: missing [[cmdhelp_._status] section
WARNING: crm.8.txt: line 3935: missing [[cmdhelp_._report] section
./crm.8.xml:3137: element refsect1: validity error : Element refsect1 content 
does not follow the DTD, expecting (refsect1info? , (title , subtitle? , 
titleabbrev?) , (((calloutlist | glosslist | bibliolist | itemizedlist | 
orderedlist | segmentedlist | simplelist | variablelist | caution | important | 
note | tip | warning | literallayout | programlisting | programlistingco | 
screen | screenco | screenshot | synopsis | cmdsynopsis | funcsynopsis | 
classsynopsis | fieldsynopsis | constructorsynopsis | destructorsynopsis | 
methodsynopsis | formalpara | para | simpara | address | blockquote | graphic | 
graphicco | mediaobject | mediaobjectco | informalequation | informalexample | 
informalfigure | informaltable | equation | example | figure | table | msgset | 
procedure | sidebar | qandaset | task | anchor | bridgehead | remark | 
highlights | abstract | authorblurb | epigraph | indexterm | beginpage)+ , 
refsect2*) | refsect2+)), got (title simpara simpara
 simpara simpara literallayout refsect2 refsect2 refsect2 refsect2 refsect2 
refsect2 refsect2 refsect2 refsect2 refsect2 refsect2 refsect2 refsect2 
refsect2 refsect2 refsect2 simpara simpara simpara literallayout simpara 
literallayout refsect2 refsect2 refsect2 )

   ^
a2x: failed: xmllint --nonet --noout --valid "./crm.8.xml"
make[1]: *** [crm.8] Error 1
make[1]: Leaving directory `/opt/crmsh-c8f214020b2c/doc'
make: *** [install-recursive] Error 1

---


The same error also occurs with the latest crmsh-cc52dc69ceb1.

Best Regards,
Hideo Yamauchi.

--- On Thu, 2014/2/6, Kristoffer Grönlund  wrote:

> On Wed, 5 Feb 2014 23:17:36 +0900 (JST)
> renayama19661...@ybb.ne.jp wrote:
> 
> > Hi Kristoffer.
> > 
> > Thank you for comments.
> > We wait for a correction.
> > 
> > Many Thanks!
> > Hideo Yamauchi.
> > 
> 
> Hello,
> 
> A fix for this issue has now been committed as changeset c8f214020b2c,
> please let me know if it solves the problem for you.
> 
> This construction should now generate the correct XML:
> 
> colocation rsc_colocation-grpPg-clnPing INFINITY: \
>     [ msPostgresql:Master sequential=true ] \
>     [ vip-master vip-rep sequential=true ]
> 
> Thank you,
> 
> -- 
> // Kristoffer Grönlund
> // kgronl...@suse.com
> 

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Question:crmsh] How to set sequential=true in crmsh.

2014-02-11 Thread renayama19661014
Hi Kristoffer,

Thank you for comments.

I tested it.
However, the problem still seems to occur.

---
[root@srv01 crmsh-8d984b138fc4]# pwd
/opt/crmsh-8d984b138fc4

[root@srv01 crmsh-8d984b138fc4]# ./autogen.sh 
autoconf:   autoconf (GNU Autoconf) 2.63
automake:   automake (GNU automake) 1.11.1
(snip)

[root@srv01 crmsh-8d984b138fc4]# ./configure --sysconfdir=/etc 
--localstatedir=/var
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... /bin/mkdir -p
(snip)
  Prefix   = /usr
  Executables  = /usr/sbin
  Man pages= /usr/share/man
  Libraries= /usr/lib64
  Header files = ${prefix}/include
  Arch-independent files   = /usr/share
  State information= /var
  System configuration = /etc

[root@srv01 crmsh-8d984b138fc4]# make install
Making install in doc
make[1]: Entering directory `/opt/crmsh-8d984b138fc4/doc'
a2x -f manpage crm.8.txt
WARNING: crm.8.txt: line 621: missing [[cmdhelp_._status] section
WARNING: crm.8.txt: line 3936: missing [[cmdhelp_._report] section
./crm.8.xml:3137: element refsect1: validity error : Element refsect1 content 
does not follow the DTD, expecting (refsect1info? , (title , subtitle? , 
titleabbrev?) , (((calloutlist | glosslist | bibliolist | itemizedlist | 
orderedlist | segmentedlist | simplelist | variablelist | caution | important | 
note | tip | warning | literallayout | programlisting | programlistingco | 
screen | screenco | screenshot | synopsis | cmdsynopsis | funcsynopsis | 
classsynopsis | fieldsynopsis | constructorsynopsis | destructorsynopsis | 
methodsynopsis | formalpara | para | simpara | address | blockquote | graphic | 
graphicco | mediaobject | mediaobjectco | informalequation | informalexample | 
informalfigure | informaltable | equation | example | figure | table | msgset | 
procedure | sidebar | qandaset | task | anchor | bridgehead | remark | 
highlights | abstract | authorblurb | epigraph | indexterm | beginpage)+ , 
refsect2*) | refsect2+)), got (title simpara simpara
 simpara simpara literallayout refsect2 refsect2 refsect2 refsect2 refsect2 
refsect2 refsect2 refsect2 refsect2 refsect2 refsect2 refsect2 refsect2 
refsect2 refsect2 refsect2 simpara simpara simpara literallayout simpara 
literallayout refsect2 refsect2 refsect2 )

   ^
a2x: failed: xmllint --nonet --noout --valid "./crm.8.xml"
make[1]: *** [crm.8] Error 1
make[1]: Leaving directory `/opt/crmsh-8d984b138fc4/doc'
make: *** [install-recursive] Error 1

---


Best Regards,
Hideo Yamauchi.

--- On Mon, 2014/2/10, Kristoffer Grönlund  wrote:

> On Fri, 7 Feb 2014 09:21:12 +0900 (JST)
> renayama19661...@ybb.ne.jp wrote:
> 
> > Hi Kristoffer,
> > 
> > In RHEL6.4, crmsh-c8f214020b2c gives the next error and cannot
> > install it.
> > 
> > Does a procedure of any installation have a problem?
> 
> Hello,
> 
> It seems that docbook validation is stricter on RHEL 6.4 than on other
> systems I use to test. I have pushed a fix for this problem, please
> test again with changeset 8d984b138fc4.
> 
> Thank you,
> 
> -- 
> // Kristoffer Grönlund
> // kgronl...@suse.com
> 

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[Pacemaker] About the difference in handling of "sequential".

2014-02-11 Thread renayama19661014
Hi All,

There is a difference between the two ways that "sequential" of a colocation 
"resource_set" is handled.

Is one of the two a mistake?


static gboolean
unpack_colocation_set(xmlNode * set, int score, pe_working_set_t * data_set)
{
xmlNode *xml_rsc = NULL;
resource_t *with = NULL;
resource_t *resource = NULL;
const char *set_id = ID(set);
const char *role = crm_element_value(set, "role");
const char *sequential = crm_element_value(set, "sequential");
int local_score = score;

const char *score_s = crm_element_value(set, XML_RULE_ATTR_SCORE);

if (score_s) {
local_score = char2score(score_s);
}

/* When "sequential" is not set, "sequential" is treated as TRUE. */

if (sequential != NULL && crm_is_true(sequential) == FALSE) {
return TRUE;
(snip)
static gboolean
colocate_rsc_sets(const char *id, xmlNode * set1, xmlNode * set2, int score,
  pe_working_set_t * data_set)
{
xmlNode *xml_rsc = NULL;
resource_t *rsc_1 = NULL;
resource_t *rsc_2 = NULL;

const char *role_1 = crm_element_value(set1, "role");
const char *role_2 = crm_element_value(set2, "role");

const char *sequential_1 = crm_element_value(set1, "sequential");
const char *sequential_2 = crm_element_value(set2, "sequential");

/* When "sequential" is not set, "sequential" is treated as FALSE. */

if (crm_is_true(sequential_1)) {
/* get the first one */
for (xml_rsc = __xml_first_child(set1); xml_rsc != NULL; xml_rsc = 
__xml_next(xml_rsc)) {
if (crm_str_eq((const char *)xml_rsc->name, XML_TAG_RESOURCE_REF, 
TRUE)) {
EXPAND_CONSTRAINT_IDREF(id, rsc_1, ID(xml_rsc));
break;
}
}
}

if (crm_is_true(sequential_2)) {
/* get the last one */
(snip)
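
For illustration, one way to make the two call sites agree would be a single helper 
with an explicit default. This is only a sketch, assuming the intended default for an 
unset "sequential" is TRUE; it is not the actual upstream fix.

static gboolean
set_is_sequential(xmlNode * set)
{
    const char *sequential = crm_element_value(set, "sequential");

    /* crm_element_value() returns NULL when the attribute is absent;
     * handle that case explicitly so every caller uses the same default. */
    if (sequential == NULL) {
        return TRUE;
    }
    return crm_is_true(sequential);
}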



Best Regards,
Hideo Yamauchi.


___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Question:crmsh] About a setting method of sequential=true in crmsh.

2014-02-12 Thread renayama19661014
Hi Kristoffer,

Thank you for comments.

> Could you try with the latest changeset 337654e0cdc4?

However, the problem seems to still occur.

[root@srv01 crmsh-337654e0cdc4]# make install
Making install in doc
make[1]: Entering directory `/opt/crmsh-337654e0cdc4/doc'
a2x -f manpage crm.8.txt
WARNING: crm.8.txt: line 621: missing [[cmdhelp_._status] section
WARNING: crm.8.txt: line 3936: missing [[cmdhelp_._report] section
./crm.8.xml:3137: element refsect1: validity error : Element refsect1 content 
does not follow the DTD, expecting (refsect1info? , (title , subtitle? , 
titleabbrev?) , (((calloutlist | glosslist | bibliolist | itemizedlist | 
orderedlist | segmentedlist | simplelist | variablelist | caution | important | 
note | tip | warning | literallayout | programlisting | programlistingco | 
screen | screenco | screenshot | synopsis | cmdsynopsis | funcsynopsis | 
classsynopsis | fieldsynopsis | constructorsynopsis | destructorsynopsis | 
methodsynopsis | formalpara | para | simpara | address | blockquote | graphic | 
graphicco | mediaobject | mediaobjectco | informalequation | informalexample | 
informalfigure | informaltable | equation | example | figure | table | msgset | 
procedure | sidebar | qandaset | task | anchor | bridgehead | remark | 
highlights | abstract | authorblurb | epigraph | indexterm | beginpage)+ , 
refsect2*) | refsect2+)), got (title simpara simpara
 simpara simpara literallayout refsect2 refsect2 refsect2 refsect2 refsect2 
refsect2 refsect2 refsect2 refsect2 refsect2 refsect2 refsect2 refsect2 
refsect2 refsect2 refsect2 simpara simpara simpara literallayout simpara 
literallayout refsect2 refsect2 refsect2 )

   ^
a2x: failed: xmllint --nonet --noout --valid "./crm.8.xml"
make[1]: *** [crm.8] Error 1
make[1]: Leaving directory `/opt/crmsh-337654e0cdc4/doc'
make: *** [install-recursive] Error 1

Best Regards,
--- On Wed, 2014/2/12, Kristoffer Grönlund  wrote:

> On Wed, 12 Feb 2014 09:12:08 +0900 (JST)
> renayama19661...@ybb.ne.jp wrote:
> 
> > Hi Kristoffer,
> > 
> > Thank you for comments.
> > 
> > I tested it.
> > However, the problem seems to still occur.
> > 
> 
> Hello,
> 
> Could you try with the latest changeset 337654e0cdc4?
> 
> Thank you,
> 
> -- 
> // Kristoffer Grönlund
> // kgronl...@suse.com
> 

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Question:crmsh] About a setting method of sequential=true in crmsh.

2014-02-12 Thread renayama19661014
Hi Kristoffer,

Thank you for comments.

With crmsh-7f620e736895.tar.gz, "make install" completed successfully.

I can now set the sequential attribute correctly.
The sequential attribute does become true.

---
(snip)
colocation rsc_colocation-master INFINITY: [ vip-master vip-rep sequential=true 
] [ msPostgresql:Master sequential=true ]
(snip)

[the generated rsc_colocation XML, with sequential="true" on both resource sets, 
followed here, but the markup was stripped by the list archive]
(snip)
---

But the following output appeared when I loaded the configuration with crm.
Is the last message (the INFO line) a problem?

---
[root@srv01 ~]# crm configure load update db2-resource_set_0207.crm 
 
WARNING: pgsql: action monitor not advertised in meta-data, it may not be 
supported by the RA
WARNING: pgsql: action notify not advertised in meta-data, it may not be 
supported by the RA
WARNING: pgsql: action demote not advertised in meta-data, it may not be 
supported by the RA
WARNING: pgsql: action promote not advertised in meta-data, it may not be 
supported by the RA
INFO: object rsc_colocation-master cannot be represented in the CLI notation
---

In addition, when I check it with the "crm configure show" command, the colocation 
is displayed in xml notation and the sequential attribute disappears.
Is this result also a problem?

---
 crm configure show
(snip)
xml [the rsc_colocation was shown here in xml notation, but the markup was stripped 
by the list archive]
(snip)
---

In addition, an error occurs when I set the symmetrical attribute of an order 
constraint to false.

---
(snip)
### Resource Order ###
order rsc_order-clnPingd-msPostgresql-1 0: clnPingd msPostgresql 
symmetrical=false
order test-order-1 0: ( vip-master vip-rep ) symmetrical=false
order test-order-2 INFINITY: msPostgresql:promote vip-master:start 
symmetrical=false
order test-order-3 INFINITY: msPostgresql:promote vip-rep:start 
symmetrical=false
order test-order-4 0: msPostgresql:demote vip-master:stop symmetrical=false
order test-order-5 0: msPostgresql:demote vip-rep:stop symmetrical=false
(snip)
[root@srv01 ~]# crm configure load update db2-resource_set_0207.crm 
Traceback (most recent call last):
  File "/usr/sbin/crm", line 56, in 
rc = main.run()
  File "/usr/lib64/python2.6/site-packages/crmsh/main.py", line 433, in run
return do_work(context, user_args)
  File "/usr/lib64/python2.6/site-packages/crmsh/main.py", line 272, in do_work
if context.run(' '.join(l)):
  File "/usr/lib64/python2.6/site-packages/crmsh/ui_context.py", line 87, in run
rv = self.execute_command() is not False
  File "/usr/lib64/python2.6/site-packages/crmsh/ui_context.py", line 244, in 
execute_command
rv = self.command_info.function(*arglist)
  File "/usr/lib64/python2.6/site-packages/crmsh/ui_configure.py", line 432, in 
do_load
return set_obj.import_file(method, url)
  File "/usr/lib64/python2.6/site-packages/crmsh/cibconfig.py", line 314, in 
import_file
return self.save(s, no_remove=True, method=method)
  File "/usr/lib64/python2.6/site-packages/crmsh/cibconfig.py", line 529, in 
save
upd_type="cli", method=method)
  File "/usr/lib64/python2.6/site-packages/crmsh/cibconfig.py", line 3178, in 
set_update
if not self._set_update(edit_d, mk_set, upd_set, del_set, upd_type, method):
  File "/usr/lib64/python2.6/site-packages/crmsh/cibconfig.py", line 3170, in 
_set_update
return self._cli_set_update(edit_d, mk_set, upd_set, del_set, method)
  File "/usr/lib64/python2.6/site-packages/crmsh/cibconfig.py", line 3118, in 
_cli_set_update
obj = self.create_from_cli(cli)
  File "/usr/lib64/python2.6/site-packages/crmsh/cibconfig.py", line 3045, in 
create_from_cli
node = obj.cli2node(cli_list)
  File "/usr/lib64/python2.6/site-packages/crmsh/cibconfig.py", line 1014, in 
cli2node
node = self._cli_list2node(cli_list, oldnode)
  File "/usr/lib64/python2.6/site-packages/crmsh/cibconfig.py", line 1832, in 
_cli_list2node
headnode = mkxmlsimple(head, oldnode, '')
  File "/usr/lib64/python2.6/site-packages/crmsh/cibconfig.py", line 662, in 
mkxmlsimple
node.set(n, v)
  File "lxml.etree.pyx", line 634, in lxml.etree._Element.set 
(src/lxml/lxml.etree.c:31548)
  File "apihelpers.pxi", line 487, in lxml.etree._setAttributeValue 
(src/lxml/lxml.etree.c:13896)
  File "apihelpers.pxi", line 1240, in lxml.etree._utf8 
(src/lxml/lxml.etree.c:19826)
TypeError: Argument must be string or unicode.
---

This error goes away if I apply the patch that Mr. Inoue contributed (see below). 
Apparently the value of symmetrical=false reaches lxml's node.set() as a Python 
bool, which only accepts strings, and the patch converts it to "true" / "false".

 * http://www.gossamer-threads.com/lists/linuxha/dev/88660

-
v = v.lower() 
if n.startswith('$'): 
n = n.lstrip('$') 
+ if isinstance(v, bool): 
+ v = "true" if v else "false" 
if (type(v) != type('') and type(v) != type(u'')) \ 
or v: # skip empty strings 
node.set(n, v) 
-


Best Regards,
Hideo Yamauchi.

--- On Thu, 2014/2/13, Kristoffer Grönlund  wrote:

> On

Re: [Pacemaker] [Question:crmsh] About a setting method of sequential=true in crmsh.

2014-02-13 Thread renayama19661014
Hi Kristoffer,

Thank you for comments.

> > But the next information appeared when I put crm.
> > Does this last message not have any problem?
> > 
> > ---
> > [root@srv01 ~]# crm configure load update
> > db2-resource_set_0207.crm WARNING: pgsql: action monitor not
> > advertised in meta-data, it may not be supported by the RA WARNING:
> > pgsql: action notify not advertised in meta-data, it may not be
> > supported by the RA WARNING: pgsql: action demote not advertised in
> > meta-data, it may not be supported by the RA WARNING: pgsql: action
> > promote not advertised in meta-data, it may not be supported by the
> > RA INFO: object rsc_colocation-master cannot be represented in the
> > CLI notation ---
> 
> It seems that there is some problem parsing the pgsql RA, which
> I suspect is the underlying cause that makes crmsh display the
> constraint as XML. 
> 
> Which version of resource-agents is installed? On my test system, I
> have version resource-agents-3.9.5-70.2.x86_64. However, the version
> installed by centOS 6.5 seems to be a very old version,
> resource-agents-3.9.2-40.el6_5.6.x86_64.

I use resource-agents 3.9.5, too.
I will file it at savannah.nongnu.org if I still have a problem.

> It would be very helpful if you could file an issue for this problem at
> 
> https://savannah.nongnu.org/bugs/?group=crmsh
> 
> and also attach your configuration and an hb_report or crm_report,
> thank you.
> 
> I have also implemented a fix for the problem discovered by Mr. Inoue.
> His original email was unfortunately missed at the time.
> The fix is in changeset 71841e4559cf.
> 
> Thank you,

I told Mr. Inoue that his patch has been taken up.

I confirmed the behavior with the latest crmsh-364c59ee0612.

Most of the problems have been solved.
However, the problem that the colocation is displayed in xml still seems to remain 
after loading the crm file.


(snip)
colocation rsc_colocation-master INFINITY: [ vip-master vip-rep sequential=true 
] [ msPostgresql:Master sequential=true ]
(snip)
#crm(live)configure# show
(snip)
xml [the colocation was shown here in xml notation, but the markup was stripped by 
the list archive]
(snip)


I will report this problem at savannah.nongnu.org.

Many Thanks!
Hideo Yamauchi.


> 
> -- 
> // Kristoffer Grönlund
> // kgronl...@suse.com
> 

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Question] About replacing in resource_set of the order limitation.

2014-02-16 Thread renayama19661014
Hi Andrew,

Thank you for comments.

> Is this related to your email about symmetrical not being defaulted 
> consistently between colocate_rsc_sets() and unpack_colocation_set()?

Yes.
I think the default is not handled well.
There is no problem as long as the "sequential" attribute is always set explicitly 
in the cib.

I think the processing for when the "sequential" attribute is not set should be 
revised.

Best Regards,
Hideo Yamauchi.

> 
> On 22 Jan 2014, at 3:05 pm, renayama19661...@ybb.ne.jp wrote:
> 
> > Hi All,
> > 
> > My test seemed to include a mistake.
> > It seems to be replaced by two limitation.
> > 
> >> However, I think that symmetircal="false" is applied to all order 
> >> limitation in this.
> >> (snip)
> >>        >>symmetrical="false">
> >>         
> >>           
> >>         
> >>         
> >>           
> >>           ...
> >>           
> >>         
> >>       
> >> (snip)
> > 
> > 
> >       >then="prmEx" symmetrical="false">
> >      
> >      
> >        
> >          
> >          
> >          
> >          
> >          
> >          
> >        
> >      
> > 
> > If my understanding includes a mistake, please point it out.
> > 
> > Best Reagards,
> > Hideo Yamauchi.
> > 
> > --- On Fri, 2014/1/17, renayama19661...@ybb.ne.jp 
> >  wrote:
> > 
> >> Hi All,
> >> 
> >> We confirm a function of resource_set.
> >> 
> >> There were the resource of the group and the resource of the clone.
> >> 
> >> (snip)
> >> Stack: corosync
> >> Current DC: srv01 (3232238180) - partition WITHOUT quorum
> >> Version: 1.1.10-f2d0cbc
> >> 1 Nodes configured
> >> 7 Resources configured
> >> 
> >> 
> >> Online: [ srv01 ]
> >> 
> >> Resource Group: grpPg
> >>      A      (ocf::heartbeat:Dummy): Started srv01 
> >>      B      (ocf::heartbeat:Dummy): Started srv01 
> >>      C      (ocf::heartbeat:Dummy): Started srv01 
> >>      D      (ocf::heartbeat:Dummy): Started srv01 
> >>      E      (ocf::heartbeat:Dummy): Started srv01 
> >>      F      (ocf::heartbeat:Dummy): Started srv01 
> >> Clone Set: clnPing [prmPing]
> >>      Started: [ srv01 ]
> >> 
> >> Node Attributes:
> >> * Node srv01:
> >>     + default_ping_set                  : 100       
> >> 
> >> Migration summary:
> >> * Node srv01: 
> >> 
> >> (snip)
> >> 
> >> These have limitation showing next.
> >> 
> >> (snip)
> >>        >>rsc="grpPg" with-rsc="clnPing">
> >>       
> >>        >>then="grpPg" symmetrical="false">
> >>       
> >> (snip)
> >> 
> >> 
> >> We tried that we rearranged a group in resource_set.
> >> I think that I can rearrange the limitation of "colocation" as follows.
> >> 
> >> (snip)
> >>       
> >>         
> >>           
> >>           
> >>           ...
> >>           
> >>         
> >>       
> >> (snip)
> >> 
> >> How should I rearrange the limitation of "order" in resource_set?
> >> 
> >> I thought that it was necessary to list two of the next, but a method to 
> >> express well was not found.
> >> 
> >> * "symmetirical=true" is necessary between the resources that were a 
> >> group(A to F).
> >> * "symmetirical=false" is necessary between the resource that was a 
> >> group(A to F) and the clone resources.
> >> 
> >> I wrote it as follows.
> >> However, I think that symmetircal="false" is applied to all order 
> >> limitation in this.
> >> (snip)
> >>        >>symmetrical="false">
> >>         
> >>           
> >>         
> >>         
> >>           
> >>           ...
> >>           
> >>         
> >>       
> >> (snip)
> >> 
> >> Best Reards,
> >> Hideo Yamauchi.
> >> 
> >> 
> >> 
> >> ___
> >> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> >> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> >> 
> >> Project Home: http://www.clusterlabs.org
> >> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> >> Bugs: http://bugs.clusterlabs.org
> >> 
> > 
> > ___
> > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> > 
> > Project Home: http://www.clusterlabs.org
> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > Bugs: http://bugs.clusterlabs.org
> 
> 

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] About the difference in handling of "sequential".

2014-02-16 Thread renayama19661014
Hi Andrew,

I found your correction.

https://github.com/beekhof/pacemaker/commit/37ff51a0edba208e6240e812936717fffc941a41

Many Thanks!
Hideo Yamauchi.

--- On Wed, 2014/2/12, renayama19661...@ybb.ne.jp  
wrote:

> Hi All,
> 
> There is difference in two between handling of "sequential" of "resouce_set" 
> of colocation.
> 
> Is either one not a mistake?
> 
> 
> static gboolean
> unpack_colocation_set(xmlNode * set, int score, pe_working_set_t * data_set)
> {
>     xmlNode *xml_rsc = NULL;
>     resource_t *with = NULL;
>     resource_t *resource = NULL;
>     const char *set_id = ID(set);
>     const char *role = crm_element_value(set, "role");
>     const char *sequential = crm_element_value(set, "sequential");
>     int local_score = score;
> 
>     const char *score_s = crm_element_value(set, XML_RULE_ATTR_SCORE);
> 
>     if (score_s) {
>         local_score = char2score(score_s);
>     }
> 
> /* When "sequential" is not set, "sequential" is treat as TRUE. */
> 
>     if (sequential != NULL && crm_is_true(sequential) == FALSE) {
>         return TRUE;
> (snip)
> static gboolean
> colocate_rsc_sets(const char *id, xmlNode * set1, xmlNode * set2, int score,
>                   pe_working_set_t * data_set)
> {
>     xmlNode *xml_rsc = NULL;
>     resource_t *rsc_1 = NULL;
>     resource_t *rsc_2 = NULL;
> 
>     const char *role_1 = crm_element_value(set1, "role");
>     const char *role_2 = crm_element_value(set2, "role");
> 
>     const char *sequential_1 = crm_element_value(set1, "sequential");
>     const char *sequential_2 = crm_element_value(set2, "sequential");
> 
> /* When "sequential" is not set, "sequential" is treat as FALSE. */
> 
>     if (crm_is_true(sequential_1)) {
>         /* get the first one */
>         for (xml_rsc = __xml_first_child(set1); xml_rsc != NULL; xml_rsc = 
> __xml_next(xml_rsc)) {
>             if (crm_str_eq((const char *)xml_rsc->name, XML_TAG_RESOURCE_REF, 
> TRUE)) {
>                 EXPAND_CONSTRAINT_IDREF(id, rsc_1, ID(xml_rsc));
>                 break;
>             }
>         }
>     }
> 
>     if (crm_is_true(sequential_2)) {
>         /* get the last one */
> (snip)
> 
> 
> 
> Best Regards,
> Hideo Yamauchi.
> 
> 
> ___
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
> 

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Question] About replacing in resource_set of the order limitation.

2014-02-16 Thread renayama19661014
Hi Andrew,

> >> Is this related to your email about symmetrical not being defaulted 
> >> consistently between colocate_rsc_sets() and unpack_colocation_set()?
> > 
> > Yes.
> > I think that a default is not handled well.
> > I will not have any problem when "sequential" attribute is set in cib by 
> > all means.
> > 
> > I think that I should revise processing when "sequential" attribute is not 
> > set.
> 
> agreed. I've changed some occurrences locally but there may be more.

All right!

Many Thanks!
Hideo Yamauchi.


--- On Mon, 2014/2/17, Andrew Beekhof  wrote:

> 
> On 17 Feb 2014, at 12:47 pm, renayama19661...@ybb.ne.jp wrote:
> 
> > Hi Andrew,
> > 
> > Thank you for comments.
> > 
> >> Is this related to your email about symmetrical not being defaulted 
> >> consistently between colocate_rsc_sets() and unpack_colocation_set()?
> > 
> > Yes.
> > I think that a default is not handled well.
> > I will not have any problem when "sequential" attribute is set in cib by 
> > all means.
> > 
> > I think that I should revise processing when "sequential" attribute is not 
> > set.
> 
> agreed. I've changed some occurrences locally but there may be more.
> 
> > 
> > Best Regards,
> > Hideo Yamauchi.
> > 
> >> 
> >> On 22 Jan 2014, at 3:05 pm, renayama19661...@ybb.ne.jp wrote:
> >> 
> >>> Hi All,
> >>> 
> >>> My test seemed to include a mistake.
> >>> It seems to be replaced by two limitation.
> >>> 
>  However, I think that symmetircal="false" is applied to all order 
>  limitation in this.
>  (snip)
>          symmetrical="false">
>           
>             
>           
>           
>             
>             ...
>             
>           
>         
>  (snip)
> >>> 
> >>> 
> >>>        >>>then="prmEx" symmetrical="false">
> >>>       
> >>>        >>>symmetrical="true">
> >>>         
> >>>           
> >>>           
> >>>           
> >>>           
> >>>           
> >>>           
> >>>         
> >>>       
> >>> 
> >>> If my understanding includes a mistake, please point it out.
> >>> 
> >>> Best Reagards,
> >>> Hideo Yamauchi.
> >>> 
> >>> --- On Fri, 2014/1/17, renayama19661...@ybb.ne.jp 
> >>>  wrote:
> >>> 
>  Hi All,
>  
>  We confirm a function of resource_set.
>  
>  There were the resource of the group and the resource of the clone.
>  
>  (snip)
>  Stack: corosync
>  Current DC: srv01 (3232238180) - partition WITHOUT quorum
>  Version: 1.1.10-f2d0cbc
>  1 Nodes configured
>  7 Resources configured
>  
>  
>  Online: [ srv01 ]
>  
>  Resource Group: grpPg
>        A      (ocf::heartbeat:Dummy): Started srv01 
>        B      (ocf::heartbeat:Dummy): Started srv01 
>        C      (ocf::heartbeat:Dummy): Started srv01 
>        D      (ocf::heartbeat:Dummy): Started srv01 
>        E      (ocf::heartbeat:Dummy): Started srv01 
>        F      (ocf::heartbeat:Dummy): Started srv01 
>  Clone Set: clnPing [prmPing]
>        Started: [ srv01 ]
>  
>  Node Attributes:
>  * Node srv01:
>       + default_ping_set                  : 100       
>  
>  Migration summary:
>  * Node srv01: 
>  
>  (snip)
>  
>  These have limitation showing next.
>  
>  (snip)
>          score="INFINITY" rsc="grpPg" with-rsc="clnPing">
>         
>          then="grpPg" symmetrical="false">
>         
>  (snip)
>  
>  
>  We tried that we rearranged a group in resource_set.
>  I think that I can rearrange the limitation of "colocation" as follows.
>  
>  (snip)
>          score="INFINITY">
>           
>             
>             
>             ...
>             
>           
>         
>  (snip)
>  
>  How should I rearrange the limitation of "order" in resource_set?
>  
>  I thought that it was necessary to list two of the next, but a method to 
>  express well was not found.
>  
>  * "symmetirical=true" is necessary between the resources that were a 
>  group(A to F).
>  * "symmetirical=false" is necessary between the resource that was a 
>  group(A to F) and the clone resources.
>  
>  I wrote it as follows.
>  However, I think that symmetircal="false" is applied to all order 
>  limitation in this.
>  (snip)
>          symmetrical="false">
>           
>             
>           
>           
>             
>             ...
>             
>           
>         
>  (snip)
>  
>  Best Reards,
>  Hideo Yamauchi.
>  
>  
>  
>  ___
>  Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
>  http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>  
>  Project Home: http://www.clusterlabs.org
>  Getting started

[Pacemaker] [Patch]Information of "Connectivity is lost" is not displayed

2014-02-16 Thread renayama19661014
Hi All,

The crm_mon tool that ships with Pacemaker 1.1 seems to have a problem.
I am sending a patch.

Best Regards,
Hideo Yamauchi.

trac2781.patch
Description: Binary data
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Patch]Information of "Connectivity is lost" is not displayed

2014-02-16 Thread renayama19661014
Hi All,

The following change was made by Mr. Lars.

 
https://github.com/ClusterLabs/pacemaker/commit/6a17c003b0167de9fe51d5330fb6e4f1b4ffe64c

My patch may be missing corrections for other parts beyond what I sent.

Best Regards,
Hideo Yamauchi.

--- On Mon, 2014/2/17, renayama19661...@ybb.ne.jp  
wrote:

> Hi All,
> 
> The crm_mon tool which is attached to Pacemaker1.1 seems to have a problem.
> I send a patch.
> 
> Best Regards,
> Hideo Yamauchi.

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[Pacemaker] [Problem] Fail-over is delayed.(State transition is not calculated.)

2014-02-17 Thread renayama19661014
Hi All,

I confirmed the behavior at the time of a failure in a Master/Slave configuration 
in Pacemaker 1.1.11.

-

Step1) Constitute a cluster.

[root@srv01 ~]# crm_mon -1 -Af
Last updated: Tue Feb 18 18:07:24 2014
Last change: Tue Feb 18 18:05:46 2014 via crmd on srv01
Stack: corosync
Current DC: srv01 (3232238180) - partition with quorum
Version: 1.1.10-9d39a6b
2 Nodes configured
6 Resources configured


Online: [ srv01 srv02 ]

 vip-master (ocf::heartbeat:Dummy): Started srv01 
 vip-rep(ocf::heartbeat:Dummy): Started srv01 
 Master/Slave Set: msPostgresql [pgsql]
 Masters: [ srv01 ]
 Slaves: [ srv02 ]
 Clone Set: clnPingd [prmPingd]
 Started: [ srv01 srv02 ]

Node Attributes:
* Node srv01:
+ default_ping_set  : 100   
+ master-pgsql  : 10
* Node srv02:
+ default_ping_set  : 100   
+ master-pgsql  : 5 

Migration summary:
* Node srv01: 
* Node srv02: 

Step2) Monitor error in vip-master.

[root@srv01 ~]# rm -rf /var/run/resource-agents/Dummy-vip-master.state 

[root@srv01 ~]# crm_mon -1 -Af  
Last updated: Tue Feb 18 18:07:58 2014
Last change: Tue Feb 18 18:05:46 2014 via crmd on srv01
Stack: corosync
Current DC: srv01 (3232238180) - partition with quorum
Version: 1.1.10-9d39a6b
2 Nodes configured
6 Resources configured


Online: [ srv01 srv02 ]

 Master/Slave Set: msPostgresql [pgsql]
 Masters: [ srv01 ]
 Slaves: [ srv02 ]
 Clone Set: clnPingd [prmPingd]
 Started: [ srv01 srv02 ]

Node Attributes:
* Node srv01:
+ default_ping_set  : 100   
+ master-pgsql  : 10
* Node srv02:
+ default_ping_set  : 100   
+ master-pgsql  : 5 

Migration summary:
* Node srv01: 
   vip-master: migration-threshold=1 fail-count=1 last-failure='Tue Feb 18 
18:07:50 2014'
* Node srv02: 

Failed actions:
vip-master_monitor_1 on srv01 'not running' (7): call=30, 
status=complete, last-rc-change='Tue Feb 18 18:07:50 2014', queued=0ms, exec=0ms
-

However, the resource does not fail over.

But a fail-over is calculated when I check the cib with crm_simulate at this point 
in time.

-
[root@srv01 ~]# crm_simulate -L -s

Current cluster status:
Online: [ srv01 srv02 ]

 vip-master (ocf::heartbeat:Dummy): Stopped 
 vip-rep(ocf::heartbeat:Dummy): Stopped 
 Master/Slave Set: msPostgresql [pgsql]
 Masters: [ srv01 ]
 Slaves: [ srv02 ]
 Clone Set: clnPingd [prmPingd]
 Started: [ srv01 srv02 ]

Allocation scores:
clone_color: clnPingd allocation score on srv01: 0
clone_color: clnPingd allocation score on srv02: 0
clone_color: prmPingd:0 allocation score on srv01: INFINITY
clone_color: prmPingd:0 allocation score on srv02: 0
clone_color: prmPingd:1 allocation score on srv01: 0
clone_color: prmPingd:1 allocation score on srv02: INFINITY
native_color: prmPingd:0 allocation score on srv01: INFINITY
native_color: prmPingd:0 allocation score on srv02: 0
native_color: prmPingd:1 allocation score on srv01: -INFINITY
native_color: prmPingd:1 allocation score on srv02: INFINITY
clone_color: msPostgresql allocation score on srv01: 0
clone_color: msPostgresql allocation score on srv02: 0
clone_color: pgsql:0 allocation score on srv01: INFINITY
clone_color: pgsql:0 allocation score on srv02: 0
clone_color: pgsql:1 allocation score on srv01: 0
clone_color: pgsql:1 allocation score on srv02: INFINITY
native_color: pgsql:0 allocation score on srv01: INFINITY
native_color: pgsql:0 allocation score on srv02: 0
native_color: pgsql:1 allocation score on srv01: -INFINITY
native_color: pgsql:1 allocation score on srv02: INFINITY
pgsql:1 promotion score on srv02: 5
pgsql:0 promotion score on srv01: 1
native_color: vip-master allocation score on srv01: -INFINITY
native_color: vip-master allocation score on srv02: INFINITY
native_color: vip-rep allocation score on srv01: -INFINITY
native_color: vip-rep allocation score on srv02: INFINITY

Transition Summary:
 * Start   vip-master   (srv02)
 * Start   vip-rep  (srv02)
 * Demote  pgsql:0  (Master -> Slave srv01)
 * Promote pgsql:1  (Slave -> Master srv02)

-

In addition, the fail-over is calculated when "cluster_recheck_interval" is 
triggered.

The fail-over is also carried out if I run cibadmin -B.

-
[root@srv01 ~]# cibadmin -B

[root@srv01 ~]# crm_mon -1 -Af
Last updated: Tue Feb 18 18:21:15 2014
Last change: Tue Feb 18 18:21:00 2014 via cibadmin on srv01
Stack: corosync
Current DC: srv01 (3232238180) - partition with quorum
Version: 1.1.10-9d39a6b
2 Nodes configured
6 Resources configured


Online: [ srv01 srv02 ]

 vip-master (ocf::heartbeat:Dummy): Started srv02 
 vip-rep(ocf::heartbeat:Dummy): Started srv02 
 Master/Slave Set: 

Re: [Pacemaker] [Patch]Information of "Connectivity is lost" is not displayed

2014-02-17 Thread renayama19661014
Hi Andrew,

> I'm confused... that patch seems to be the reverse of yours.
> Are you saying that we need to undo Lars' one?

No, I do not understand the meaning of the correction of Mr. Lars.

However, as it is now, crm_mon does not display the attribute correctly.
Did you perhaps discuss the correction that put the meta data into rsc-parameters 
with Mr. Lars? Or with Mr. David?

Best Regards,
Hideo Yamauchi.

--- On Tue, 2014/2/18, Andrew Beekhof  wrote:

> 
> On 17 Feb 2014, at 5:43 pm, renayama19661...@ybb.ne.jp wrote:
> 
> > Hi All,
> > 
> > The next change was accomplished by Mr. Lars.
> > 
> > https://github.com/ClusterLabs/pacemaker/commit/6a17c003b0167de9fe51d5330fb6e4f1b4ffe64c
> 
> I'm confused... that patch seems to be the reverse of yours.
> Are you saying that we need to undo Lars' one?
> 
> > 
> > I may lack the correction of other parts which are not the patch which I 
> > sent.
> > 
> > Best Regards,
> > Hideo Yamauchi.
> > 
> > --- On Mon, 2014/2/17, renayama19661...@ybb.ne.jp 
> >  wrote:
> > 
> >> Hi All,
> >> 
> >> The crm_mon tool which is attached to Pacemaker1.1 seems to have a problem.
> >> I send a patch.
> >> 
> >> Best Regards,
> >> Hideo Yamauchi.
> > 
> > ___
> > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> > 
> > Project Home: http://www.clusterlabs.org
> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > Bugs: http://bugs.clusterlabs.org
> 
> 

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Patch]Information of "Connectivity is lost" is not displayed

2014-02-17 Thread renayama19661014
Hi Andrew,

Thank you for comments.

> can I see the config of yours that crm_mon is not displaying correctly?

It is displayed as follows.
-
[root@srv01 tmp]# crm_mon -1 -Af   
Last updated: Tue Feb 18 19:51:04 2014
Last change: Tue Feb 18 19:48:55 2014 via cibadmin on srv01
Stack: corosync
Current DC: srv01 (3232238180) - partition WITHOUT quorum
Version: 1.1.10-9d39a6b
1 Nodes configured
5 Resources configured


Online: [ srv01 ]

Clone Set: clnPingd [prmPingd]
 Started: [ srv01 ]

Node Attributes:
* Node srv01:
+ default_ping_set  : 0 

Migration summary:
* Node srv01: 

-

I uploaded the log to the following location. (trac2781.zip)

 * https://skydrive.live.com/?cid=3A14D57622C66876&id=3A14D57622C66876%21117

Best Regards,
Hideo Yamauchi.


--- On Tue, 2014/2/18, Andrew Beekhof  wrote:

> 
> On 18 Feb 2014, at 12:19 pm, renayama19661...@ybb.ne.jp wrote:
> 
> > Hi Andrew,
> > 
> >> I'm confused... that patch seems to be the reverse of yours.
> >> Are you saying that we need to undo Lars' one?
> > 
> > No, I do not understand the meaning of the correction of Mr. Lars.
> 
> 
> name, multiplier and host_list are all resource parameters, not meta 
> attributes.
> so lars' patch should be correct.
> 
> can I see the config of yours that crm_mon is not displaying correctly?
> 
> > 
> > However, as now, crm_mon does not display a right attribute.
> > Possibly did you not discuss the correction to put meta data in 
> > rsc-parameters with Mr. Lars? Or Mr. David?
> > 
> > Best Regards,
> > Hideo Yamauchi.
> > 
> > --- On Tue, 2014/2/18, Andrew Beekhof  wrote:
> > 
> >> 
> >> On 17 Feb 2014, at 5:43 pm, renayama19661...@ybb.ne.jp wrote:
> >> 
> >>> Hi All,
> >>> 
> >>> The next change was accomplished by Mr. Lars.
> >>> 
> >>> https://github.com/ClusterLabs/pacemaker/commit/6a17c003b0167de9fe51d5330fb6e4f1b4ffe64c
> >> 
> >> I'm confused... that patch seems to be the reverse of yours.
> >> Are you saying that we need to undo Lars' one?
> >> 
> >>> 
> >>> I may lack the correction of other parts which are not the patch which I 
> >>> sent.
> >>> 
> >>> Best Regards,
> >>> Hideo Yamauchi.
> >>> 
> >>> --- On Mon, 2014/2/17, renayama19661...@ybb.ne.jp 
> >>>  wrote:
> >>> 
>  Hi All,
>  
>  The crm_mon tool which is attached to Pacemaker1.1 seems to have a 
>  problem.
>  I send a patch.
>  
>  Best Regards,
>  Hideo Yamauchi.
> >>> 
> >>> ___
> >>> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> >>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> >>> 
> >>> Project Home: http://www.clusterlabs.org
> >>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> >>> Bugs: http://bugs.clusterlabs.org
> >> 
> >> 
> > 
> > ___
> > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> > 
> > Project Home: http://www.clusterlabs.org
> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > Bugs: http://bugs.clusterlabs.org
> 
> 

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Patch]Information of "Connectivity is lost" is not displayed

2014-02-17 Thread renayama19661014
Hi Andrew,

I attach the result of the cibadmin -Q command.

Best Regards,
Hideo Yamauchi.


--- On Tue, 2014/2/18, Andrew Beekhof  wrote:

> 
> On 18 Feb 2014, at 1:45 pm, renayama19661...@ybb.ne.jp wrote:
> 
> > Hi Andrew,
> > 
> > Thank you for comments.
> > 
> >> can I see the config of yours that crm_mon is not displaying correctly?
> > 
> > It is displayed as follows.
> 
> I mean the raw xml. Can you attach it?
> 
> > -
> > [root@srv01 tmp]# crm_mon -1 -Af                   
> > Last updated: Tue Feb 18 19:51:04 2014
> > Last change: Tue Feb 18 19:48:55 2014 via cibadmin on srv01
> > Stack: corosync
> > Current DC: srv01 (3232238180) - partition WITHOUT quorum
> > Version: 1.1.10-9d39a6b
> > 1 Nodes configured
> > 5 Resources configured
> > 
> > 
> > Online: [ srv01 ]
> > 
> > Clone Set: clnPingd [prmPingd]
> >     Started: [ srv01 ]
> > 
> > Node Attributes:
> > * Node srv01:
> >    + default_ping_set                  : 0         
> > 
> > Migration summary:
> > * Node srv01: 
> > 
> > -
> > 
> > I uploaded log in the next place.(trac2781.zip)
> > 
> > * https://skydrive.live.com/?cid=3A14D57622C66876&id=3A14D57622C66876%21117
> > 
> > Best Regards,
> > Hideo Yamauchi.
> > 
> > 
> > --- On Tue, 2014/2/18, Andrew Beekhof  wrote:
> > 
> >> 
> >> On 18 Feb 2014, at 12:19 pm, renayama19661...@ybb.ne.jp wrote:
> >> 
> >>> Hi Andrew,
> >>> 
>  I'm confused... that patch seems to be the reverse of yours.
>  Are you saying that we need to undo Lars' one?
> >>> 
> >>> No, I do not understand the meaning of the correction of Mr. Lars.
> >> 
> >> 
> >> name, multiplier and host_list are all resource parameters, not meta 
> >> attributes.
> >> so lars' patch should be correct.
> >> 
> >> can I see the config of yours that crm_mon is not displaying correctly?
> >> 
> >>> 
> >>> However, as now, crm_mon does not display a right attribute.
> >>> Possibly did you not discuss the correction to put meta data in 
> >>> rsc-parameters with Mr. Lars? Or Mr. David?
> >>> 
> >>> Best Regards,
> >>> Hideo Yamauchi.
> >>> 
> >>> --- On Tue, 2014/2/18, Andrew Beekhof  wrote:
> >>> 
>  
>  On 17 Feb 2014, at 5:43 pm, renayama19661...@ybb.ne.jp wrote:
>  
> > Hi All,
> > 
> > The next change was accomplished by Mr. Lars.
> > 
> > https://github.com/ClusterLabs/pacemaker/commit/6a17c003b0167de9fe51d5330fb6e4f1b4ffe64c
>  
>  I'm confused... that patch seems to be the reverse of yours.
>  Are you saying that we need to undo Lars' one?
>  
> > 
> > I may lack the correction of other parts which are not the patch which 
> > I sent.
> > 
> > Best Regards,
> > Hideo Yamauchi.
> > 
> > --- On Mon, 2014/2/17, renayama19661...@ybb.ne.jp 
> >  wrote:
> > 
> >> Hi All,
> >> 
> >> The crm_mon tool which is attached to Pacemaker1.1 seems to have a 
> >> problem.
> >> I send a patch.
> >> 
> >> Best Regards,
> >> Hideo Yamauchi.
> > 
> > ___
> > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> > 
> > Project Home: http://www.clusterlabs.org
> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > Bugs: http://bugs.clusterlabs.org
>  
>  
> >>> 
> >>> ___
> >>> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> >>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> >>> 
> >>> Project Home: http://www.clusterlabs.org
> >>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> >>> Bugs: http://bugs.clusterlabs.org
> >> 
> >> 
> > 
> > ___
> > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> > 
> > Project Home: http://www.clusterlabs.org
> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > Bugs: http://bugs.clusterlabs.org
> 
> 

[attachment: output of "cibadmin -Q" -- the XML content was stripped by the list 
archive]

Re: [Pacemaker] [Problem] Fail-over is delayed.(State transition is not calculated.)

2014-02-18 Thread renayama19661014
Hi David,

Thank you for comments.

> You have resource-stickiness=INFINITY, this is what is preventing the 
> failover from occurring. Set resource-stickiness=1 or 0 and the failover 
> should occur.
> 

However, the resource moves at the calculation of the next state transition.
Can the resource not be moved by the calculation triggered by the first failure?

In addition, the resource moves when the following colocation is deleted.

colocation rsc_colocation-master-3 INFINITY: vip-rep msPostgresql:Master

Is there a problem with Pacemaker's handling of this colocation?

Best Regards,
Hideo Yamauchi.

--- On Wed, 2014/2/19, David Vossel  wrote:

> 
> - Original Message -
> > From: renayama19661...@ybb.ne.jp
> > To: "PaceMaker-ML" 
> > Sent: Monday, February 17, 2014 7:06:53 PM
> > Subject: [Pacemaker] [Problem] Fail-over is delayed.(State transition is 
> > not    calculated.)
> > 
> > Hi All,
> > 
> > I confirmed movement at the time of the trouble in one of Master/Slave in
> > Pacemaker1.1.11.
> > 
> > -
> > 
> > Step1) Constitute a cluster.
> > 
> > [root@srv01 ~]# crm_mon -1 -Af
> > Last updated: Tue Feb 18 18:07:24 2014
> > Last change: Tue Feb 18 18:05:46 2014 via crmd on srv01
> > Stack: corosync
> > Current DC: srv01 (3232238180) - partition with quorum
> > Version: 1.1.10-9d39a6b
> > 2 Nodes configured
> > 6 Resources configured
> > 
> > 
> > Online: [ srv01 srv02 ]
> > 
> >  vip-master     (ocf::heartbeat:Dummy): Started srv01
> >  vip-rep        (ocf::heartbeat:Dummy): Started srv01
> >  Master/Slave Set: msPostgresql [pgsql]
> >      Masters: [ srv01 ]
> >      Slaves: [ srv02 ]
> >  Clone Set: clnPingd [prmPingd]
> >      Started: [ srv01 srv02 ]
> > 
> > Node Attributes:
> > * Node srv01:
> >     + default_ping_set                  : 100
> >     + master-pgsql                      : 10
> > * Node srv02:
> >     + default_ping_set                  : 100
> >     + master-pgsql                      : 5
> > 
> > Migration summary:
> > * Node srv01:
> > * Node srv02:
> > 
> > Step2) Monitor error in vip-master.
> > 
> > [root@srv01 ~]# rm -rf /var/run/resource-agents/Dummy-vip-master.state
> > 
> > [root@srv01 ~]# crm_mon -1 -Af
> > Last updated: Tue Feb 18 18:07:58 2014
> > Last change: Tue Feb 18 18:05:46 2014 via crmd on srv01
> > Stack: corosync
> > Current DC: srv01 (3232238180) - partition with quorum
> > Version: 1.1.10-9d39a6b
> > 2 Nodes configured
> > 6 Resources configured
> > 
> > 
> > Online: [ srv01 srv02 ]
> > 
> >  Master/Slave Set: msPostgresql [pgsql]
> >      Masters: [ srv01 ]
> >      Slaves: [ srv02 ]
> >  Clone Set: clnPingd [prmPingd]
> >      Started: [ srv01 srv02 ]
> > 
> > Node Attributes:
> > * Node srv01:
> >     + default_ping_set                  : 100
> >     + master-pgsql                      : 10
> > * Node srv02:
> >     + default_ping_set                  : 100
> >     + master-pgsql                      : 5
> > 
> > Migration summary:
> > * Node srv01:
> >    vip-master: migration-threshold=1 fail-count=1 last-failure='Tue Feb 18
> >    18:07:50 2014'
> > * Node srv02:
> > 
> > Failed actions:
> >     vip-master_monitor_1 on srv01 'not running' (7): call=30,
> >     status=complete, last-rc-change='Tue Feb 18 18:07:50 2014', queued=0ms,
> >     exec=0ms
> > -
> > 
> > However, the resource does not fail-over.
> > 
> > But, fail-over is calculated when I check cib in crm_simulate at this point
> > in time.
> > 
> > -
> > [root@srv01 ~]# crm_simulate -L -s
> > 
> > Current cluster status:
> > Online: [ srv01 srv02 ]
> > 
> >  vip-master     (ocf::heartbeat:Dummy): Stopped
> >  vip-rep        (ocf::heartbeat:Dummy): Stopped
> >  Master/Slave Set: msPostgresql [pgsql]
> >      Masters: [ srv01 ]
> >      Slaves: [ srv02 ]
> >  Clone Set: clnPingd [prmPingd]
> >      Started: [ srv01 srv02 ]
> > 
> > Allocation scores:
> > clone_color: clnPingd allocation score on srv01: 0
> > clone_color: clnPingd allocation score on srv02: 0
> > clone_color: prmPingd:0 allocation score on srv01: INFINITY
> > clone_color: prmPingd:0 allocation score on srv02: 0
> > clone_color: prmPingd:1 allocation score on srv01: 0
> > clone_color: prmPingd:1 allocation score on srv02: INFINITY
> > native_color: prmPingd:0 allocation score on srv01: INFINITY
> > native_color: prmPingd:0 allocation score on srv02: 0
> > native_color: prmPingd:1 allocation score on srv01: -INFINITY
> > native_color: prmPingd:1 allocation score on srv02: INFINITY
> > clone_color: msPostgresql allocation score on srv01: 0
> > clone_color: msPostgresql allocation score on srv02: 0
> > clone_color: pgsql:0 allocation score on srv01: INFINITY
> > clone_color: pgsql:0 allocation score on srv02: 0
> > clone_color: pgsql:1 allocation score on srv01: 0
> > clone_color: pgsql:1 allocation score on srv02: INFINITY
> > native_color: pgsql:0 allocation score on srv01: INFINITY
> > native_color: pgsql:0 allocation s

Re: [Pacemaker] [Problem] Fail-over is delayed.(State transition is not calculated.)

2014-02-18 Thread renayama19661014
Hi Andrew,

> I'll follow up on the bug.

Thanks!

Hideo Yamauch.

--- On Wed, 2014/2/19, Andrew Beekhof  wrote:

> I'll follow up on the bug.
> 
> On 19 Feb 2014, at 10:55 am, renayama19661...@ybb.ne.jp wrote:
> 
> > Hi David,
> > 
> > Thank you for comments.
> > 
> >> You have resource-stickiness=INFINITY, this is what is preventing the 
> >> failover from occurring. Set resource-stickiness=1 or 0 and the failover 
> >> should occur.
> >> 
> > 
> > However, the resource moves by a calculation of the next state transition.
> > By a calculation of the first trouble, can it not travel the resource?
> > 
> > In addition, the resource moves when the resource deletes next colocation.
> > 
> > colocation rsc_colocation-master-3 INFINITY: vip-rep msPostgresql:Master
> > 
> > There is the problem with handling of colocation of some Pacemaker?
> > 
> > Best Regards,
> > Hideo Yamauchi.
> > 
> > --- On Wed, 2014/2/19, David Vossel  wrote:
> > 
> >> 
> >> - Original Message -
> >>> From: renayama19661...@ybb.ne.jp
> >>> To: "PaceMaker-ML" 
> >>> Sent: Monday, February 17, 2014 7:06:53 PM
> >>> Subject: [Pacemaker] [Problem] Fail-over is delayed.(State transition is 
> >>> not    calculated.)
> >>> 
> >>> Hi All,
> >>> 
> >>> I confirmed movement at the time of the trouble in one of Master/Slave in
> >>> Pacemaker1.1.11.
> >>> 
> >>> -
> >>> 
> >>> Step1) Constitute a cluster.
> >>> 
> >>> [root@srv01 ~]# crm_mon -1 -Af
> >>> Last updated: Tue Feb 18 18:07:24 2014
> >>> Last change: Tue Feb 18 18:05:46 2014 via crmd on srv01
> >>> Stack: corosync
> >>> Current DC: srv01 (3232238180) - partition with quorum
> >>> Version: 1.1.10-9d39a6b
> >>> 2 Nodes configured
> >>> 6 Resources configured
> >>> 
> >>> 
> >>> Online: [ srv01 srv02 ]
> >>> 
> >>>   vip-master     (ocf::heartbeat:Dummy): Started srv01
> >>>   vip-rep        (ocf::heartbeat:Dummy): Started srv01
> >>>   Master/Slave Set: msPostgresql [pgsql]
> >>>       Masters: [ srv01 ]
> >>>       Slaves: [ srv02 ]
> >>>   Clone Set: clnPingd [prmPingd]
> >>>       Started: [ srv01 srv02 ]
> >>> 
> >>> Node Attributes:
> >>> * Node srv01:
> >>>      + default_ping_set                  : 100
> >>>      + master-pgsql                      : 10
> >>> * Node srv02:
> >>>      + default_ping_set                  : 100
> >>>      + master-pgsql                      : 5
> >>> 
> >>> Migration summary:
> >>> * Node srv01:
> >>> * Node srv02:
> >>> 
> >>> Step2) Monitor error in vip-master.
> >>> 
> >>> [root@srv01 ~]# rm -rf /var/run/resource-agents/Dummy-vip-master.state
> >>> 
> >>> [root@srv01 ~]# crm_mon -1 -Af
> >>> Last updated: Tue Feb 18 18:07:58 2014
> >>> Last change: Tue Feb 18 18:05:46 2014 via crmd on srv01
> >>> Stack: corosync
> >>> Current DC: srv01 (3232238180) - partition with quorum
> >>> Version: 1.1.10-9d39a6b
> >>> 2 Nodes configured
> >>> 6 Resources configured
> >>> 
> >>> 
> >>> Online: [ srv01 srv02 ]
> >>> 
> >>>   Master/Slave Set: msPostgresql [pgsql]
> >>>       Masters: [ srv01 ]
> >>>       Slaves: [ srv02 ]
> >>>   Clone Set: clnPingd [prmPingd]
> >>>       Started: [ srv01 srv02 ]
> >>> 
> >>> Node Attributes:
> >>> * Node srv01:
> >>>      + default_ping_set                  : 100
> >>>      + master-pgsql                      : 10
> >>> * Node srv02:
> >>>      + default_ping_set                  : 100
> >>>      + master-pgsql                      : 5
> >>> 
> >>> Migration summary:
> >>> * Node srv01:
> >>>     vip-master: migration-threshold=1 fail-count=1 last-failure='Tue Feb 
> >>>18
> >>>     18:07:50 2014'
> >>> * Node srv02:
> >>> 
> >>> Failed actions:
> >>>      vip-master_monitor_1 on srv01 'not running' (7): call=30,
> >>>      status=complete, last-rc-change='Tue Feb 18 18:07:50 2014', 
> >>>queued=0ms,
> >>>      exec=0ms
> >>> -
> >>> 
> >>> However, the resource does not fail-over.
> >>> 
> >>> But, fail-over is calculated when I check cib in crm_simulate at this 
> >>> point
> >>> in time.
> >>> 
> >>> -
> >>> [root@srv01 ~]# crm_simulate -L -s
> >>> 
> >>> Current cluster status:
> >>> Online: [ srv01 srv02 ]
> >>> 
> >>>   vip-master     (ocf::heartbeat:Dummy): Stopped
> >>>   vip-rep        (ocf::heartbeat:Dummy): Stopped
> >>>   Master/Slave Set: msPostgresql [pgsql]
> >>>       Masters: [ srv01 ]
> >>>       Slaves: [ srv02 ]
> >>>   Clone Set: clnPingd [prmPingd]
> >>>       Started: [ srv01 srv02 ]
> >>> 
> >>> Allocation scores:
> >>> clone_color: clnPingd allocation score on srv01: 0
> >>> clone_color: clnPingd allocation score on srv02: 0
> >>> clone_color: prmPingd:0 allocation score on srv01: INFINITY
> >>> clone_color: prmPingd:0 allocation score on srv02: 0
> >>> clone_color: prmPingd:1 allocation score on srv01: 0
> >>> clone_color: prmPingd:1 allocation score on srv02: INFINITY
> >>> native_color: prmPingd:0 allocation score on srv01: INFINITY
> >>> native_color: prmPingd:0 allocation scor

Re: [Pacemaker] [Patch]Information of "Connectivity is lost" is not displayed

2014-02-18 Thread renayama19661014
Hi Andrew,

Thank you for comments.

> So I'm confused as to what the problem is.
> What are you expecting crm_mon to show?

I would like it to be displayed as follows.


* Node srv01:
+ default_ping_set  : 0 : Connectivity is lost
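
(For reference, a rough sketch of the kind of check behind such an annotation, 
assuming the expected score is the number of host_list entries multiplied by 
multiplier; this is an illustration only, not the actual crm_mon code.)

/* Illustration only, not the actual crm_mon code: annotate a ping attribute
 * value, where 'expected' is assumed to be (host_list entries) * multiplier. */
static const char *
ping_annotation(int value, int expected)
{
    if (value <= 0) {
        return "Connectivity is lost";
    }
    if (expected > 0 && value < expected) {
        return "Connectivity is degraded";
    }
    return "";
}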

Best Regards,
Hideo Yamauchi.
--- On Wed, 2014/2/19, Andrew Beekhof  wrote:

> 
> On 18 Feb 2014, at 2:38 pm, renayama19661...@ybb.ne.jp wrote:
> 
> > Hi Andrew,
> > 
> > I attach the result of the cibadmin -Q command.
> 
> 
> So I see:
> 
>         
>           
>              id="prmPingd-instance_attributes-name"/>
>              id="prmPingd-instance_attributes-host_list"/>
>              id="prmPingd-instance_attributes-multiplier"/>
>              id="prmPingd-instance_attributes-attempts"/>
>              id="prmPingd-instance_attributes-timeout"/>
>           
> 
> The correct way to query those is as parameters, not as meta attributes.
> Which is what lars' patch achieves.
> 
> In your email I see:
> 
> >>> * Node srv01:
> >>>     + default_ping_set                  : 0         
> 
> 
> Which looks correct based on:
> 
>            name="default_ping_set" value="0"/>
> 
> So I'm confused as to what the problem is.
> What are you expecting crm_mon to show?
> 
> > 
> > Best Regards,
> > Hideo Yamauchi.
> > 
> > 
> > --- On Tue, 2014/2/18, Andrew Beekhof  wrote:
> > 
> >> 
> >> On 18 Feb 2014, at 1:45 pm, renayama19661...@ybb.ne.jp wrote:
> >> 
> >>> Hi Andrew,
> >>> 
> >>> Thank you for comments.
> >>> 
>  can I see the config of yours that crm_mon is not displaying correctly?
> >>> 
> >>> It is displayed as follows.
> >> 
> >> I mean the raw xml. Can you attach it?
> >> 
> >>> -
> >>> [root@srv01 tmp]# crm_mon -1 -Af                   
> >>> Last updated: Tue Feb 18 19:51:04 2014
> >>> Last change: Tue Feb 18 19:48:55 2014 via cibadmin on srv01
> >>> Stack: corosync
> >>> Current DC: srv01 (3232238180) - partition WITHOUT quorum
> >>> Version: 1.1.10-9d39a6b
> >>> 1 Nodes configured
> >>> 5 Resources configured
> >>> 
> >>> 
> >>> Online: [ srv01 ]
> >>> 
> >>> Clone Set: clnPingd [prmPingd]
> >>>      Started: [ srv01 ]
> >>> 
> >>> Node Attributes:
> >>> * Node srv01:
> >>>     + default_ping_set                  : 0         
> >>> 
> >>> Migration summary:
> >>> * Node srv01: 
> >>> 
> >>> -
> >>> 
> >>> I uploaded log in the next place.(trac2781.zip)
> >>> 
> >>> * 
> >>> https://skydrive.live.com/?cid=3A14D57622C66876&id=3A14D57622C66876%21117
> >>> 
> >>> Best Regards,
> >>> Hideo Yamauchi.
> >>> 
> >>> 
> >>> --- On Tue, 2014/2/18, Andrew Beekhof  wrote:
> >>> 
>  
>  On 18 Feb 2014, at 12:19 pm, renayama19661...@ybb.ne.jp wrote:
>  
> > Hi Andrew,
> > 
> >> I'm confused... that patch seems to be the reverse of yours.
> >> Are you saying that we need to undo Lars' one?
> > 
> > No, I do not understand the meaning of the correction of Mr. Lars.
>  
>  
>  name, multiplier and host_list are all resource parameters, not meta 
>  attributes.
>  so lars' patch should be correct.
>  
>  can I see the config of yours that crm_mon is not displaying correctly?
>  
> > 
> > However, as now, crm_mon does not display a right attribute.
> > Possibly did you not discuss the correction to put meta data in 
> > rsc-parameters with Mr. Lars? Or Mr. David?
> > 
> > Best Regards,
> > Hideo Yamauchi.
> > 
> > --- On Tue, 2014/2/18, Andrew Beekhof  wrote:
> > 
> >> 
> >> On 17 Feb 2014, at 5:43 pm, renayama19661...@ybb.ne.jp wrote:
> >> 
> >>> Hi All,
> >>> 
> >>> The next change was accomplished by Mr. Lars.
> >>> 
> >>> https://github.com/ClusterLabs/pacemaker/commit/6a17c003b0167de9fe51d5330fb6e4f1b4ffe64c
> >> 
> >> I'm confused... that patch seems to be the reverse of yours.
> >> Are you saying that we need to undo Lars' one?
> >> 
> >>> 
> >>> I may lack the correction of other parts which are not the patch 
> >>> which I sent.
> >>> 
> >>> Best Regards,
> >>> Hideo Yamauchi.
> >>> 
> >>> --- On Mon, 2014/2/17, renayama19661...@ybb.ne.jp 
> >>>  wrote:
> >>> 
>  Hi All,
>  
>  The crm_mon tool which is attached to Pacemaker1.1 seems to have a 
>  problem.
>  I send a patch.
>  
>  Best Regards,
>  Hideo Yamauchi.
> >>> 
> >>> ___
> >>> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> >>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> >>> 
> >>> Project Home: http://www.clusterlabs.org
> >>> Getting started: 
> >>> http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> >>> Bugs: http://bugs.clusterlabs.org
> >> 
> >> 
> > 
> > ___
> > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> 

Re: [Pacemaker] [Patch]Information of "Connectivity is lost" is not displayed

2014-02-18 Thread renayama19661014
Hi Andrew,

> > I wish it is displayed as follows.
> > 
> > 
> > * Node srv01:
> >    + default_ping_set                  : 0             : Connectivity is 
> >lost
> 
> Ah!   https://github.com/beekhof/pacemaker/commit/5d51930

It is now displayed correctly.

Many Thanks!
Hideo Yamauchi.

--- On Wed, 2014/2/19, Andrew Beekhof  wrote:

> 
> On 19 Feb 2014, at 2:55 pm, renayama19661...@ybb.ne.jp wrote:
> 
> > Hi Andrew,
> > 
> > Thank you for comments.
> > 
> >> So I'm confused as to what the problem is.
> >> What are you expecting crm_mon to show?
> > 
> > I wish it is displayed as follows.
> > 
> > 
> > * Node srv01:
> >    + default_ping_set                  : 0             : Connectivity is 
> >lost
> 
> Ah!   https://github.com/beekhof/pacemaker/commit/5d51930
> 
> > 
> > Best Regards,
> > Hideo Yamauchi.
> > --- On Wed, 2014/2/19, Andrew Beekhof  wrote:
> > 
> >> 
> >> On 18 Feb 2014, at 2:38 pm, renayama19661...@ybb.ne.jp wrote:
> >> 
> >>> Hi Andrew,
> >>> 
> >>> I attach the result of the cibadmin -Q command.
> >> 
> >> 
> >> So I see:
> >> 
> >>          >>type="ping">
> >>           
> >>              >>id="prmPingd-instance_attributes-name"/>
> >>              >>id="prmPingd-instance_attributes-host_list"/>
> >>              >>id="prmPingd-instance_attributes-multiplier"/>
> >>              >>id="prmPingd-instance_attributes-attempts"/>
> >>              >>id="prmPingd-instance_attributes-timeout"/>
> >>           
> >> 
> >> The correct way to query those is as parameters, not as meta attributes.
> >> Which is what lars' patch achieves.
> >> 
> >> In your email I see:
> >> 
> > * Node srv01:
> >      + default_ping_set                  : 0         
> >> 
> >> 
> >> Which looks correct based on:
> >> 
> >>            <nvpair ... name="default_ping_set" value="0"/>
> >> 
> >> So I'm confused as to what the problem is.
> >> What are you expecting crm_mon to show?
> >> 
> >>> 
> >>> Best Regards,
> >>> Hideo Yamauchi.
> >>> 
> >>> 
> >>> --- On Tue, 2014/2/18, Andrew Beekhof  wrote:
> >>> 
>  
>  On 18 Feb 2014, at 1:45 pm, renayama19661...@ybb.ne.jp wrote:
>  
> > Hi Andrew,
> > 
> > Thank you for comments.
> > 
> >> can I see the config of yours that crm_mon is not displaying correctly?
> > 
> > It is displayed as follows.
>  
>  I mean the raw xml. Can you attach it?
>  
> > -
> > [root@srv01 tmp]# crm_mon -1 -Af                   
> > Last updated: Tue Feb 18 19:51:04 2014
> > Last change: Tue Feb 18 19:48:55 2014 via cibadmin on srv01
> > Stack: corosync
> > Current DC: srv01 (3232238180) - partition WITHOUT quorum
> > Version: 1.1.10-9d39a6b
> > 1 Nodes configured
> > 5 Resources configured
> > 
> > 
> > Online: [ srv01 ]
> > 
> > Clone Set: clnPingd [prmPingd]
> >       Started: [ srv01 ]
> > 
> > Node Attributes:
> > * Node srv01:
> >      + default_ping_set                  : 0         
> > 
> > Migration summary:
> > * Node srv01: 
> > 
> > -
> > 
> > I uploaded log in the next place.(trac2781.zip)
> > 
> > * 
> > https://skydrive.live.com/?cid=3A14D57622C66876&id=3A14D57622C66876%21117
> > 
> > Best Regards,
> > Hideo Yamauchi.
> > 
> > 
> > --- On Tue, 2014/2/18, Andrew Beekhof  wrote:
> > 
> >> 
> >> On 18 Feb 2014, at 12:19 pm, renayama19661...@ybb.ne.jp wrote:
> >> 
> >>> Hi Andrew,
> >>> 
>  I'm confused... that patch seems to be the reverse of yours.
>  Are you saying that we need to undo Lars' one?
> >>> 
> >>> No, I do not understand the meaning of the correction of Mr. Lars.
> >> 
> >> 
> >> name, multiplier and host_list are all resource parameters, not meta 
> >> attributes.
> >> so lars' patch should be correct.
> >> 
> >> can I see the config of yours that crm_mon is not displaying correctly?
> >> 
> >>> 
> >>> However, as now, crm_mon does not display a right attribute.
> >>> Possibly did you not discuss the correction to put meta data in 
> >>> rsc-parameters with Mr. Lars? Or Mr. David?
> >>> 
> >>> Best Regards,
> >>> Hideo Yamauchi.
> >>> 
> >>> --- On Tue, 2014/2/18, Andrew Beekhof  wrote:
> >>> 
>  
>  On 17 Feb 2014, at 5:43 pm, renayama19661...@ybb.ne.jp wrote:
>  
> > Hi All,
> > 
> > The next change was accomplished by Mr. Lars.
> > 
> > https://github.com/ClusterLabs/pacemaker/commit/6a17c003b0167de9fe51d5330fb6e4f1b4ffe64c
>  
>  I'm confused... that patch seems to be the reverse of yours.
>  Are you saying that we need to undo Lars' one?
>  
> > 
> > I may lack the correction of other parts which are not the patch 
> > which I sent.
> > 
> > Best Regards,
> > Hideo Yamau

[Pacemaker] [Problem] The timer which does not stop is discarded.

2014-02-19 Thread renayama19661014
Hi All,

When the monitor operation of a master/slave resource is cancelled on the local node, the 
timer that should have been stopped keeps running.
As a result, crmd outputs a warning about cancelling that timer when it handles the 
transition for the new state.
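
For reference, the master/slave resource whose monitor gets cancelled is configured roughly 
as follows. This is only a sketch: the 9-second interval matches the pgsql_monitor_9000 
operation seen in the log below, while the other parameters and the Slave interval are 
illustrative.

primitive pgsql ocf:pacemaker:Stateful \
    op monitor interval=9s role=Master \
    op monitor interval=10s role=Slave
ms msPostgresql pgsql \
    meta master-max=1 clone-max=2 notify=true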

The behaviour can be confirmed with the following procedure.

Step 1) Set up the cluster.

[root@srv01 ~]# crm_mon -1 -Af
Last updated: Thu Feb 20 22:57:09 2014
Last change: Thu Feb 20 22:56:32 2014 via cibadmin on srv01
Stack: corosync
Current DC: srv01 (3232238180) - partition with quorum
Version: 1.1.10-c1a326d
2 Nodes configured
6 Resources configured


Online: [ srv01 srv02 ]

 vip-master (ocf::heartbeat:Dummy2):Started srv01 
 vip-rep(ocf::heartbeat:Dummy): Started srv01 
 Master/Slave Set: msPostgresql [pgsql]
 Masters: [ srv01 ]
 Slaves: [ srv02 ]
 Clone Set: clnPingd [prmPingd]
 Started: [ srv01 srv02 ]

Node Attributes:
* Node srv01:
+ default_ping_set  : 100   
+ master-pgsql  : 10
* Node srv02:
+ default_ping_set  : 100   
+ master-pgsql  : 5 

Migration summary:
* Node srv01: 
* Node srv02: 

Step 2) Inject a failure.
[root@srv01 ~]# rm -rf /var/run/resource-agents/Dummy-vip-master.state 

Step 3) A warning appears in the log.
(snip)
Feb 20 22:57:46 srv01 crmd[12107]:   notice: te_rsc_command: Initiating action 
5: cancel pgsql_cancel_9000 on srv01 (local)
Feb 20 22:57:46 srv01 lrmd[12104]: info: cancel_recurring_action: 
Cancelling operation pgsql_monitor_9000
Feb 20 22:57:46 srv01 crmd[12107]: info: match_graph_event: Action 
pgsql_monitor_9000 (5) confirmed on srv01 (rc=0)
(snip)
Feb 20 22:57:46 srv01 pengine[12106]: info: LogActions: Leave   prmPingd:1#011(Started srv02)
Feb 20 22:57:46 srv01 crmd[12107]: info: do_state_transition: State transition 
S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE 
origin=handle_response ]
Feb 20 22:57:46 srv01 crmd[12107]:  warning: destroy_action: Cancelling timer 
for action 5 (src=139)
(snip)

We think the timeout timer is unnecessary once the monitor of the master/slave resource on 
the local node has been stopped, so it should be stopped as well.

I registered this issue in Bugzilla.

 * http://bugs.clusterlabs.org/show_bug.cgi?id=5199

Best Regards,
Hideo Yamauchi.


___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Problem] The timer which does not stop is discarded.

2014-02-23 Thread renayama19661014
Hi All,

I have made a patch.
Please review its contents.
If there is no problem, please merge it into github.

Best Regards,
Hideo Yamauchi.

--- On Thu, 2014/2/20, renayama19661...@ybb.ne.jp  
wrote:

> Hi All,
> 
> The timer which is not stopped at the time of the stop of the monitor of the 
> master slave resource of the local node runs.
> Therefore, warning to cancel outputs a timer when crmd handles the transition 
> that is in a new state.
> 
> I confirm it in the next procedure.
> 
> Step1) Constitute a cluster.
> 
> [root@srv01 ~]# crm_mon -1 -Af
> Last updated: Thu Feb 20 22:57:09 2014
> Last change: Thu Feb 20 22:56:32 2014 via cibadmin on srv01
> Stack: corosync
> Current DC: srv01 (3232238180) - partition with quorum
> Version: 1.1.10-c1a326d
> 2 Nodes configured
> 6 Resources configured
> 
> 
> Online: [ srv01 srv02 ]
> 
>  vip-master     (ocf::heartbeat:Dummy2):        Started srv01 
>  vip-rep        (ocf::heartbeat:Dummy): Started srv01 
>  Master/Slave Set: msPostgresql [pgsql]
>      Masters: [ srv01 ]
>      Slaves: [ srv02 ]
>  Clone Set: clnPingd [prmPingd]
>      Started: [ srv01 srv02 ]
> 
> Node Attributes:
> * Node srv01:
>     + default_ping_set                  : 100       
>     + master-pgsql                      : 10        
> * Node srv02:
>     + default_ping_set                  : 100       
>     + master-pgsql                      : 5         
> 
> Migration summary:
> * Node srv01: 
> * Node srv02: 
> 
> Step2) Cause trouble.
> [root@srv01 ~]# rm -rf /var/run/resource-agents/Dummy-vip-master.state 
> 
> Step3) Warning is displayed by log.
> (snip)
> Feb 20 22:57:46 srv01 crmd[12107]:   notice: te_rsc_command: Initiating 
> action 5: cancel pgsql_cancel_9000 on srv01 (local)
> Feb 20 22:57:46 srv01 lrmd[12104]:     info: cancel_recurring_action: 
> Cancelling operation pgsql_monitor_9000
> Feb 20 22:57:46 srv01 crmd[12107]:     info: match_graph_event: Action 
> pgsql_monitor_9000 (5) confirmed on srv01 (rc=0)
> (snip)
> Feb 20 22:57:46 srv01 pengine[12106]:     info: LogActions: Leave   
> prmPingd:1#011(Started srv02)Feb 20 22:57:46 srv01 crmd[12107]:     info: 
> do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE 
> [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
> Feb 20 22:57:46 srv01 crmd[12107]:  warning: destroy_action: Cancelling timer 
> for action 5 (src=139)
> (snip)
> 
> The time-out monitoring with the timer thinks like an unnecessary at the time 
> of the stop of the monitor of the master slave resource of the local node.
> 
> I registered these contents with Bugzilla.
> 
>  * http://bugs.clusterlabs.org/show_bug.cgi?id=5199
> 
> Best Regards,
> Hideo Yamauchi.
> 
> 
> ___
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
>

trac2766.patch
Description: Binary data
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[Pacemaker] [Question] About "quorum-policy=freeze" and "promote".

2014-05-07 Thread renayama19661014
Hi All,

I built a three-node cluster with quorum-policy="freeze" and a Master/Slave resource.
(The Master/Slave resource uses the Stateful agent.)
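
The configuration is roughly the following, as a sketch in crm shell syntax. The resource 
names match the status output below; the cluster property is really called 
no-quorum-policy, and the remaining parameters (clone-max, the ping host_list, and so on) 
are only illustrative, not the exact values used.

property no-quorum-policy="freeze" stonith-enabled="true"
primitive pgsql ocf:pacemaker:Stateful
ms msPostgresql pgsql \
    meta master-max=1 clone-max=3 notify=true
primitive prmPingd ocf:pacemaker:ping \
    params name="default_ping_set" host_list="..." multiplier="100"
clone clnPingd prmPingd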

-
Current DC: srv01 (3232238280) - partition with quorum
Version: 1.1.11-830af67
3 Nodes configured
9 Resources configured


Online: [ srv01 srv02 srv03 ]

 Resource Group: grpStonith1
 prmStonith1-1  (stonith:external/ssh): Started srv02 
 Resource Group: grpStonith2
 prmStonith2-1  (stonith:external/ssh): Started srv01 
 Resource Group: grpStonith3
 prmStonith3-1  (stonith:external/ssh): Started srv01 
 Master/Slave Set: msPostgresql [pgsql]
 Masters: [ srv01 ]
 Slaves: [ srv02 srv03 ]
 Clone Set: clnPingd [prmPingd]
 Started: [ srv01 srv02 srv03 ]
-


When the interconnect between all nodes is interrupted, each node promotes its own Master 
instance.

-
Node srv02 (3232238290): UNCLEAN (offline)
Node srv03 (3232238300): UNCLEAN (offline)
Online: [ srv01 ]

 Resource Group: grpStonith1
 prmStonith1-1  (stonith:external/ssh): Started srv02 
 Resource Group: grpStonith2
 prmStonith2-1  (stonith:external/ssh): Started srv01 
 Resource Group: grpStonith3
 prmStonith3-1  (stonith:external/ssh): Started srv01 
 Master/Slave Set: msPostgresql [pgsql]
 Masters: [ srv01 ]
 Slaves: [ srv02 srv03 ]
 Clone Set: clnPingd [prmPingd]
 Started: [ srv01 srv02 srv03 ]
(snip)
Node srv01 (3232238280): UNCLEAN (offline)
Node srv03 (3232238300): UNCLEAN (offline)
Online: [ srv02 ]

 Resource Group: grpStonith1
 prmStonith1-1  (stonith:external/ssh): Started srv02 
 Resource Group: grpStonith2
 prmStonith2-1  (stonith:external/ssh): Started srv01 
 Resource Group: grpStonith3
 prmStonith3-1  (stonith:external/ssh): Started srv01 
 Master/Slave Set: msPostgresql [pgsql]
 Masters: [ srv01 srv02 ]
 Slaves: [ srv03 ]
 Clone Set: clnPingd [prmPingd]
 Started: [ srv01 srv02 srv03 ]
(snip)
Node srv01 (3232238280): UNCLEAN (offline)
Node srv02 (3232238290): UNCLEAN (offline)
Online: [ srv03 ]

 Resource Group: grpStonith1
 prmStonith1-1  (stonith:external/ssh): Started srv02 
 Resource Group: grpStonith2
 prmStonith2-1  (stonith:external/ssh): Started srv01 
 Resource Group: grpStonith3
 prmStonith3-1  (stonith:external/ssh): Started srv01 
 Master/Slave Set: msPostgresql [pgsql]
 Masters: [ srv01 srv03 ]
 Slaves: [ srv02 ]
 Clone Set: clnPingd [prmPingd]
 Started: [ srv01 srv02 srv03 ]
-

I assume that promoting the Master/Slave resource even when the cluster has lost quorum is 
the intended behaviour (specification) of Pacemaker.

Is it then the responsibility of the resource agent to prevent this multi-Master state?
 * I believe the drbd RA has such a safeguard.
 * The Stateful RA, however, has no such function.
 * So, as an example, is a mechanism like drbd's always required when writing a new 
Master/Slave resource agent?

Is my understanding wrong?

Best Regards,
Hideo Yamauchi.


___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Question] About "quorum-policy=freeze" and "promote".

2014-05-08 Thread renayama19661014
Hi Emmanuel,

> Why are you using ssh as stonith? i don't think the fencing is working 
> because your nodes are in unclean state

No, STONITH is not executed because every node has lost quorum.
That part is correct Pacemaker behaviour.

The ssh STONITH agent is used here only as an example.
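
For reference, each fencing resource is defined roughly like this. external/ssh is only 
suitable for test environments, and the hostlist value and the -INFINITY location (which 
keeps a node from running its own fencing device) are illustrative rather than the exact 
configuration used here.

primitive prmStonith1-1 stonith:external/ssh \
    params hostlist="srv01"
group grpStonith1 prmStonith1-1
location rsc_location-grpStonith1 grpStonith1 -INFINITY: srv01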

Best Regards,
Hideo Yamauchi.
--- On Thu, 2014/5/8, emmanuel segura  wrote:

> 
> Why are you using ssh as stonith? i don't think the fencing is working 
> because your nodes are in unclean state
> 
> 
> 
> 
> 2014-05-08 5:37 GMT+02:00  :
> Hi All,
> 
> I composed Master/Slave resource of three nodes that set 
> quorum-policy="freeze".
> (I use Stateful in Master/Slave resource.)
> 
> -
> Current DC: srv01 (3232238280) - partition with quorum
> Version: 1.1.11-830af67
> 3 Nodes configured
> 9 Resources configured
> 
> 
> Online: [ srv01 srv02 srv03 ]
> 
>  Resource Group: grpStonith1
>      prmStonith1-1      (stonith:external/ssh): Started srv02
>  Resource Group: grpStonith2
>      prmStonith2-1      (stonith:external/ssh): Started srv01
>  Resource Group: grpStonith3
>      prmStonith3-1      (stonith:external/ssh): Started srv01
>  Master/Slave Set: msPostgresql [pgsql]
>      Masters: [ srv01 ]
>      Slaves: [ srv02 srv03 ]
>  Clone Set: clnPingd [prmPingd]
>      Started: [ srv01 srv02 srv03 ]
> -
> 
> 
> Master resource starts in all nodes when I interrupt the internal 
> communication of all nodes.
> 
> -
> Node srv02 (3232238290): UNCLEAN (offline)
> Node srv03 (3232238300): UNCLEAN (offline)
> Online: [ srv01 ]
> 
>  Resource Group: grpStonith1
>      prmStonith1-1      (stonith:external/ssh): Started srv02
>  Resource Group: grpStonith2
>      prmStonith2-1      (stonith:external/ssh): Started srv01
>  Resource Group: grpStonith3
>      prmStonith3-1      (stonith:external/ssh): Started srv01
>  Master/Slave Set: msPostgresql [pgsql]
>      Masters: [ srv01 ]
>      Slaves: [ srv02 srv03 ]
>  Clone Set: clnPingd [prmPingd]
>      Started: [ srv01 srv02 srv03 ]
> (snip)
> Node srv01 (3232238280): UNCLEAN (offline)
> Node srv03 (3232238300): UNCLEAN (offline)
> Online: [ srv02 ]
> 
>  Resource Group: grpStonith1
>      prmStonith1-1      (stonith:external/ssh): Started srv02
>  Resource Group: grpStonith2
>      prmStonith2-1      (stonith:external/ssh): Started srv01
>  Resource Group: grpStonith3
>      prmStonith3-1      (stonith:external/ssh): Started srv01
>  Master/Slave Set: msPostgresql [pgsql]
>      Masters: [ srv01 srv02 ]
>      Slaves: [ srv03 ]
>  Clone Set: clnPingd [prmPingd]
>      Started: [ srv01 srv02 srv03 ]
> (snip)
> Node srv01 (3232238280): UNCLEAN (offline)
> Node srv02 (3232238290): UNCLEAN (offline)
> Online: [ srv03 ]
> 
>  Resource Group: grpStonith1
>      prmStonith1-1      (stonith:external/ssh): Started srv02
>  Resource Group: grpStonith2
>      prmStonith2-1      (stonith:external/ssh): Started srv01
>  Resource Group: grpStonith3
>      prmStonith3-1      (stonith:external/ssh): Started srv01
>  Master/Slave Set: msPostgresql [pgsql]
>      Masters: [ srv01 srv03 ]
>      Slaves: [ srv02 ]
>  Clone Set: clnPingd [prmPingd]
>      Started: [ srv01 srv02 srv03 ]
> -
> 
> I think even if the cluster loses Quorum, being "promote" the Master / Slave 
> resource that's specification of Pacemaker.
> 
> Is it responsibility of the resource agent side to prevent a state of these 
> plural Master?
>  * I think that drbd-RA has those functions.
>  * But, there is no function in Stateful-RA.
>  * As an example, I think that the mechanism such as drbd is necessary by all 
> means when I make a resource of Master/Slave newly.
> 
> Will my understanding be wrong?
> 
> Best Regards,
> Hideo Yamauchi.
> 
> 
> ___
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
> 
> 
> 
> -- 
> esta es mi vida e me la vivo hasta que dios quiera

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[Pacemaker] [Problem][pacemaker1.0] The "probe" may not be carried out by difference in cib information of "probe".

2014-05-08 Thread renayama19661014
Hi All,

We found a problem when performing "clean up" of a Master/Slave resource in Pacemaker 1.0.
When the problem occurs, the "probe" operation is not carried out.

I registered the problem in Bugzilla.
 * http://bugs.clusterlabs.org/show_bug.cgi?id=5211

I also described in Bugzilla a "clean up" procedure that avoids the problem, but that 
workaround may not be usable for every combination of resources.

I would like to request a fix if this problem can still be corrected in the community 
Pacemaker 1.0 tree.

 * Note that the problem appears to be fixed in Pacemaker 1.1 and does not occur there.

Best Regards,
Hideo Yamauchi.


___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Question] About "quorum-policy=freeze" and "promote".

2014-05-08 Thread renayama19661014
Hi Andrew,

Thank you for comment.

> > Is it responsibility of the resource agent side to prevent a state of these 
> > plural Master?
> 
> No.
> 
> In this scenario, no nodes have quorum and therefor no additional instances 
> should have been promoted.  Thats the definition of "freeze" :)
> Even if one partition DID have quorum, no instances should have been promoted 
> without fencing occurring first.

Okay.
I hope this problem will be fixed by the next release.

Many Thanks!
Hideo Yamauchi.

--- On Fri, 2014/5/9, Andrew Beekhof  wrote:

> 
> On 8 May 2014, at 1:37 pm, renayama19661...@ybb.ne.jp wrote:
> 
> > Hi All,
> > 
> > I composed Master/Slave resource of three nodes that set 
> > quorum-policy="freeze".
> > (I use Stateful in Master/Slave resource.)
> > 
> > -
> > Current DC: srv01 (3232238280) - partition with quorum
> > Version: 1.1.11-830af67
> > 3 Nodes configured
> > 9 Resources configured
> > 
> > 
> > Online: [ srv01 srv02 srv03 ]
> > 
> > Resource Group: grpStonith1
> >     prmStonith1-1      (stonith:external/ssh): Started srv02 
> > Resource Group: grpStonith2
> >     prmStonith2-1      (stonith:external/ssh): Started srv01 
> > Resource Group: grpStonith3
> >     prmStonith3-1      (stonith:external/ssh): Started srv01 
> > Master/Slave Set: msPostgresql [pgsql]
> >     Masters: [ srv01 ]
> >     Slaves: [ srv02 srv03 ]
> > Clone Set: clnPingd [prmPingd]
> >     Started: [ srv01 srv02 srv03 ]
> > -
> > 
> > 
> > Master resource starts in all nodes when I interrupt the internal 
> > communication of all nodes.
> > 
> > -
> > Node srv02 (3232238290): UNCLEAN (offline)
> > Node srv03 (3232238300): UNCLEAN (offline)
> > Online: [ srv01 ]
> > 
> > Resource Group: grpStonith1
> >     prmStonith1-1      (stonith:external/ssh): Started srv02 
> > Resource Group: grpStonith2
> >     prmStonith2-1      (stonith:external/ssh): Started srv01 
> > Resource Group: grpStonith3
> >     prmStonith3-1      (stonith:external/ssh): Started srv01 
> > Master/Slave Set: msPostgresql [pgsql]
> >     Masters: [ srv01 ]
> >     Slaves: [ srv02 srv03 ]
> > Clone Set: clnPingd [prmPingd]
> >     Started: [ srv01 srv02 srv03 ]
> > (snip)
> > Node srv01 (3232238280): UNCLEAN (offline)
> > Node srv03 (3232238300): UNCLEAN (offline)
> > Online: [ srv02 ]
> > 
> > Resource Group: grpStonith1
> >     prmStonith1-1      (stonith:external/ssh): Started srv02 
> > Resource Group: grpStonith2
> >     prmStonith2-1      (stonith:external/ssh): Started srv01 
> > Resource Group: grpStonith3
> >     prmStonith3-1      (stonith:external/ssh): Started srv01 
> > Master/Slave Set: msPostgresql [pgsql]
> >     Masters: [ srv01 srv02 ]
> >     Slaves: [ srv03 ]
> > Clone Set: clnPingd [prmPingd]
> >     Started: [ srv01 srv02 srv03 ]
> > (snip)
> > Node srv01 (3232238280): UNCLEAN (offline)
> > Node srv02 (3232238290): UNCLEAN (offline)
> > Online: [ srv03 ]
> > 
> > Resource Group: grpStonith1
> >     prmStonith1-1      (stonith:external/ssh): Started srv02 
> > Resource Group: grpStonith2
> >     prmStonith2-1      (stonith:external/ssh): Started srv01 
> > Resource Group: grpStonith3
> >     prmStonith3-1      (stonith:external/ssh): Started srv01 
> > Master/Slave Set: msPostgresql [pgsql]
> >     Masters: [ srv01 srv03 ]
> >     Slaves: [ srv02 ]
> > Clone Set: clnPingd [prmPingd]
> >     Started: [ srv01 srv02 srv03 ]
> > -
> > 
> > I think even if the cluster loses Quorum, being "promote" the Master / 
> > Slave resource that's specification of Pacemaker.
> > 
> > Is it responsibility of the resource agent side to prevent a state of these 
> > plural Master?
> 
> No.
> 
> In this scenario, no nodes have quorum and therefor no additional instances 
> should have been promoted.  Thats the definition of "freeze" :)
> Even if one partition DID have quorum, no instances should have been promoted 
> without fencing occurring first.
> 
> > * I think that drbd-RA has those functions.
> > * But, there is no function in Stateful-RA.
> > * As an example, I think that the mechanism such as drbd is necessary by 
> > all means when I make a resource of Master/Slave newly.
> > 
> > Will my understanding be wrong?
> > 
> > Best Regards,
> > Hideo Yamauchi.
> > 
> > 
> > ___
> > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> > 
> > Project Home: http://www.clusterlabs.org
> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > Bugs: http://bugs.clusterlabs.org
> 
> 

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Question] About "quorum-policy=freeze" and "promote".

2014-05-08 Thread renayama19661014
Hi Andrew,

> > Okay.
> > I wish this problem is revised by the next release.
> 
> crm_report?

I reproduced the problem again with PM1.1.12-rc1 and registered it in Bugzilla.
 * http://bugs.clusterlabs.org/show_bug.cgi?id=5212

I attached the crm_report file to the Bugzilla entry.

Best Regards,
Hideo Yamauchi.

--- On Fri, 2014/5/9, Andrew Beekhof  wrote:

> 
> On 9 May 2014, at 2:05 pm, renayama19661...@ybb.ne.jp wrote:
> 
> > Hi Andrew,
> > 
> > Thank you for comment.
> > 
> >>> Is it responsibility of the resource agent side to prevent a state of 
> >>> these plural Master?
> >> 
> >> No.
> >> 
> >> In this scenario, no nodes have quorum and therefor no additional 
> >> instances should have been promoted.  Thats the definition of "freeze" :)
> >> Even if one partition DID have quorum, no instances should have been 
> >> promoted without fencing occurring first.
> > 
> > Okay.
> > I wish this problem is revised by the next release.
> 
> crm_report?
> 
> > 
> > Many Thanks!
> > Hideo Yamauchi.
> > 
> > --- On Fri, 2014/5/9, Andrew Beekhof  wrote:
> > 
> >> 
> >> On 8 May 2014, at 1:37 pm, renayama19661...@ybb.ne.jp wrote:
> >> 
> >>> Hi All,
> >>> 
> >>> I composed Master/Slave resource of three nodes that set 
> >>> quorum-policy="freeze".
> >>> (I use Stateful in Master/Slave resource.)
> >>> 
> >>> -
> >>> Current DC: srv01 (3232238280) - partition with quorum
> >>> Version: 1.1.11-830af67
> >>> 3 Nodes configured
> >>> 9 Resources configured
> >>> 
> >>> 
> >>> Online: [ srv01 srv02 srv03 ]
> >>> 
> >>> Resource Group: grpStonith1
> >>>      prmStonith1-1      (stonith:external/ssh): Started srv02 
> >>> Resource Group: grpStonith2
> >>>      prmStonith2-1      (stonith:external/ssh): Started srv01 
> >>> Resource Group: grpStonith3
> >>>      prmStonith3-1      (stonith:external/ssh): Started srv01 
> >>> Master/Slave Set: msPostgresql [pgsql]
> >>>      Masters: [ srv01 ]
> >>>      Slaves: [ srv02 srv03 ]
> >>> Clone Set: clnPingd [prmPingd]
> >>>      Started: [ srv01 srv02 srv03 ]
> >>> -
> >>> 
> >>> 
> >>> Master resource starts in all nodes when I interrupt the internal 
> >>> communication of all nodes.
> >>> 
> >>> -
> >>> Node srv02 (3232238290): UNCLEAN (offline)
> >>> Node srv03 (3232238300): UNCLEAN (offline)
> >>> Online: [ srv01 ]
> >>> 
> >>> Resource Group: grpStonith1
> >>>      prmStonith1-1      (stonith:external/ssh): Started srv02 
> >>> Resource Group: grpStonith2
> >>>      prmStonith2-1      (stonith:external/ssh): Started srv01 
> >>> Resource Group: grpStonith3
> >>>      prmStonith3-1      (stonith:external/ssh): Started srv01 
> >>> Master/Slave Set: msPostgresql [pgsql]
> >>>      Masters: [ srv01 ]
> >>>      Slaves: [ srv02 srv03 ]
> >>> Clone Set: clnPingd [prmPingd]
> >>>      Started: [ srv01 srv02 srv03 ]
> >>> (snip)
> >>> Node srv01 (3232238280): UNCLEAN (offline)
> >>> Node srv03 (3232238300): UNCLEAN (offline)
> >>> Online: [ srv02 ]
> >>> 
> >>> Resource Group: grpStonith1
> >>>      prmStonith1-1      (stonith:external/ssh): Started srv02 
> >>> Resource Group: grpStonith2
> >>>      prmStonith2-1      (stonith:external/ssh): Started srv01 
> >>> Resource Group: grpStonith3
> >>>      prmStonith3-1      (stonith:external/ssh): Started srv01 
> >>> Master/Slave Set: msPostgresql [pgsql]
> >>>      Masters: [ srv01 srv02 ]
> >>>      Slaves: [ srv03 ]
> >>> Clone Set: clnPingd [prmPingd]
> >>>      Started: [ srv01 srv02 srv03 ]
> >>> (snip)
> >>> Node srv01 (3232238280): UNCLEAN (offline)
> >>> Node srv02 (3232238290): UNCLEAN (offline)
> >>> Online: [ srv03 ]
> >>> 
> >>> Resource Group: grpStonith1
> >>>      prmStonith1-1      (stonith:external/ssh): Started srv02 
> >>> Resource Group: grpStonith2
> >>>      prmStonith2-1      (stonith:external/ssh): Started srv01 
> >>> Resource Group: grpStonith3
> >>>      prmStonith3-1      (stonith:external/ssh): Started srv01 
> >>> Master/Slave Set: msPostgresql [pgsql]
> >>>      Masters: [ srv01 srv03 ]
> >>>      Slaves: [ srv02 ]
> >>> Clone Set: clnPingd [prmPingd]
> >>>      Started: [ srv01 srv02 srv03 ]
> >>> -
> >>> 
> >>> I think even if the cluster loses Quorum, being "promote" the Master / 
> >>> Slave resource that's specification of Pacemaker.
> >>> 
> >>> Is it responsibility of the resource agent side to prevent a state of 
> >>> these plural Master?
> >> 
> >> No.
> >> 
> >> In this scenario, no nodes have quorum and therefor no additional 
> >> instances should have been promoted.  Thats the definition of "freeze" :)
> >> Even if one partition DID have quorum, no instances should have been 
> >> promoted without fencing occurring first.
> >> 
> >>> * I think that drbd-RA has those functions.
> >>> * But, there is no function in Stateful-RA.
> >>> * As an example, I think that the mechanism such as drbd is necessary by 
> >>> all means when I make a resource of Master/Slave newly.
> >>

[Pacemaker] [Question] About control of colocation.(master-slave with primitive)

2014-05-12 Thread renayama19661014
Hi All,

We assume special resource constitution.
Master of master-slave depends on primitive resource for the constitution.

We performed the setting that Master stopped becoming it in Slave node 
experimentally.


   location rsc_location-msStateful-1 msPostgresql \
       rule $role="master" 200: #uname eq srv01 \
       rule $role="master" -INFINITY: #uname eq srv02

The Master resource depends on the primitive resource.

   colocation rsc_colocation-master-1 INFINITY: msPostgresql:Master A-master


Step 1) Start the slave node.
---
[root@srv02 ~]# crm_mon -1 -Af
Last updated: Tue May 13 22:28:12 2014
Last change: Tue May 13 22:28:07 2014
Stack: corosync
Current DC: srv02 (3232238190) - partition WITHOUT quorum
Version: 1.1.11-f0f09b8
1 Nodes configured
3 Resources configured


Online: [ srv02 ]

 A-master (ocf::heartbeat:Dummy): Started srv02 
 Master/Slave Set: msPostgresql [pgsql]
 Slaves: [ srv02 ]

Node Attributes:
* Node srv02:
+ master-pgsql  : 5 

Migration summary:
* Node srv02: 
---

Step 2) Start the master node.
---
[root@srv02 ~]# crm_mon -1 -Af
Last updated: Tue May 13 22:33:39 2014
Last change: Tue May 13 22:28:07 2014
Stack: corosync
Current DC: srv02 (3232238190) - partition with quorum
Version: 1.1.11-f0f09b8
2 Nodes configured
3 Resources configured


Online: [ srv01 srv02 ]

 A-master (ocf::heartbeat:Dummy): Started srv02 
 Master/Slave Set: msPostgresql [pgsql]
 Masters: [ srv01 ]
 Slaves: [ srv02 ]

Node Attributes:
* Node srv01:
+ master-pgsql  : 10
* Node srv02:
+ master-pgsql  : 5 

Migration summary:
* Node srv02: 
* Node srv01: 
---

 * The node on which the primitive resource is not running (srv01) is still promoted to 
Master.


We do not want a node to be promoted to Master while the primitive resource is not running 
there.
Is there a colocation and order configuration that prevents promotion on such a node?

 One workaround we can think of is the following (a rough sketch is shown after this list):
  * have the primitive resource update a node attribute when it starts, and
  * include that attribute in the condition for promotion to Master.
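
A minimal sketch of that idea follows. The attribute name is made up, and 
ocf:pacemaker:attribute (available in newer Pacemaker releases) is just one way to set such 
an attribute while a resource is running; in our actual test A-master is an 
ocf:heartbeat:Dummy resource.

# the primitive sets a node attribute while it is running
primitive A-master ocf:pacemaker:attribute \
    params name="A-master-active" active_value="1" inactive_value="0"
# never promote pgsql on a node where that attribute is not 1
location rsc_location-promote-guard msPostgresql \
    rule $role="master" -INFINITY: not_defined A-master-active or A-master-active ne 1

With something like this, the promotion rule only takes effect on nodes where the 
primitive has actually set the attribute.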


In addition, we are often confused by how colocation and order constraints behave,
in particular between primitive/group resources and clone/master-slave resources.
Could the details be described in the documentation?
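
The kind of combination we mean is, purely as an illustration (the resource names and 
scores here are made up, not our real configuration):

group grpPostgres vip-master vip-rep
clone clnPingd prmPingd
colocation rsc_colocation-grp-with-pingd INFINITY: grpPostgres clnPingd
order rsc_order-pingd-before-grp INFINITY: clnPingd grpPostgres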


Best Regards,
Hideo Yamauchi.

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Problem][pacemaker1.0] The "probe" may not be carried out by difference in cib information of "probe".

2014-05-13 Thread renayama19661014
Hi Andrew,

Thank you for comments.

> Do you guys have any timeframe for moving away from 1.0.x?
> The 1.1 series is over 4 years old now and quite usable :-)
> 
> There is really a (low) limit to how much effort I can put into support for 
> it.

We too are gradually moving from Pacemaker 1.0 to Pacemaker 1.1.

I simply thought the problem should be recorded as existing in Pacemaker 1.0, so I 
registered and reported it.
(Users who are late in moving to Pacemaker 1.1 may still run into the same problem.)

There is no need at all to fix it in Pacemaker 1.0.

Best Regards,
Hideo Yamauchi.


___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Problem][pacemaker1.0] The "probe" may not be carried out by difference in cib information of "probe".

2014-05-14 Thread renayama19661014
Hi Andrew,

> > It is not necessary at all to revise it for Pacemaker1.0.
> 
> Maybe we need to add KnownIssues.md to the repo for anyone thats slow to 
> update.
> Are there any 1.0 bugs that really really need fixing or shall we move them 
> all to the KnownIssues file?

That's a good idea.
It will be a big help for users who are late in moving to PM1.1.

Best Regards,
Hideo Yamauchi.



___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Question] About control of colocation.(master-slave with primitive)

2014-05-14 Thread renayama19661014
Hi Andrew,

Thank you for comments.

> > We do not want to be promoted to Master in the node that primitive resource 
> > does not start.
> > Is there the setting of colocation and order which are not promoted to 
> > Master of the Master node?
> 
> Your config looks reasonable... almost certainly a bug in the PE.
> Do you happen to have the relevant pengine input file available?

Really?
As far as I could tell from reading the PM1.1 source code, the PE seemed to be handling 
this as designed.
I will register the problem in Bugzilla and contact you.

Best Regards,
Hideo Yamauchi.

--- On Wed, 2014/5/14, Andrew Beekhof  wrote:

> 
> On 13 May 2014, at 3:14 pm, renayama19661...@ybb.ne.jp wrote:
> 
> > Hi All,
> > 
> > We assume special resource constitution.
> > Master of master-slave depends on primitive resource for the constitution.
> > 
> > We performed the setting that Master stopped becoming it in Slave node 
> > experimentally.
> > 
> > 
> >   location rsc_location-msStateful-1 msPostgresql \
> >        rule $role="master" 200: #uname eq srv01 \
> >        rule $role="master" -INFINITY: #uname eq srv02
> > 
> > The Master resource depends on the primitive resource.
> > 
> >   colocation rsc_colocation-master-1 INFINITY: msPostgresql:Master A-master
> > 
> > 
> > Step1) Start Slave node.
> > ---
> > [root@srv02 ~]# crm_mon -1 -Af
> > Last updated: Tue May 13 22:28:12 2014
> > Last change: Tue May 13 22:28:07 2014
> > Stack: corosync
> > Current DC: srv02 (3232238190) - partition WITHOUT quorum
> > Version: 1.1.11-f0f09b8
> > 1 Nodes configured
> > 3 Resources configured
> > 
> > 
> > Online: [ srv02 ]
> > 
> > A-master     (ocf::heartbeat:Dummy): Started srv02 
> > Master/Slave Set: msPostgresql [pgsql]
> >     Slaves: [ srv02 ]
> > 
> > Node Attributes:
> > * Node srv02:
> >    + master-pgsql                      : 5         
> > 
> > Migration summary:
> > * Node srv02: 
> > ---
> > 
> > Step2) Start Master node.
> > ---
> > [root@srv02 ~]# crm_mon -1 -Af
> > Last updated: Tue May 13 22:33:39 2014
> > Last change: Tue May 13 22:28:07 2014
> > Stack: corosync
> > Current DC: srv02 (3232238190) - partition with quorum
> > Version: 1.1.11-f0f09b8
> > 2 Nodes configured
> > 3 Resources configured
> > 
> > 
> > Online: [ srv01 srv02 ]
> > 
> > A-master     (ocf::heartbeat:Dummy): Started srv02 
> > Master/Slave Set: msPostgresql [pgsql]
> >     Masters: [ srv01 ]
> >     Slaves: [ srv02 ]
> > 
> > Node Attributes:
> > * Node srv01:
> >    + master-pgsql                      : 10        
> > * Node srv02:
> >    + master-pgsql                      : 5         
> > 
> > Migration summary:
> > * Node srv02: 
> > * Node srv01: 
> > ---
> > 
> > * The Master node that primitive node does not start becomes Master.
> > 
> > 
> > We do not want to be promoted to Master in the node that primitive resource 
> > does not start.
> > Is there the setting of colocation and order which are not promoted to 
> > Master of the Master node?
> 
> Your config looks reasonable... almost certainly a bug in the PE.
> Do you happen to have the relevant pengine input file available?
> 
> > 
> > I think that one method includes the next method.
> >  * I handle it to update an attribute when primitive resource starts.
> >  * I write an attribute in the condition to be promoted to Master.
> > 
> > 
> > In addition, we are often confused about control of colotaion and order.
> > It is in particular the control between primitive/group resource and 
> > clone/master-slave resources.
> > Will you describe detailed contents in a document?
> > 
> > 
> > Best Regards,
> > Hideo Yamauchi.
> > 
> > ___
> > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> > 
> > Project Home: http://www.clusterlabs.org
> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > Bugs: http://bugs.clusterlabs.org
> 
> 

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Question] About control of colocation.(master-slave with primitive)

2014-05-14 Thread renayama19661014
Hi Andrew,

> >> Your config looks reasonable... almost certainly a bug in the PE.
> >> Do you happen to have the relevant pengine input file available?
> > 
> > Really?
> 
> I would expect that:
> 
>   colocation rsc_colocation-master-1 INFINITY: msPostgresql:Master A-master
> 
> would only promote msPostgresql on a node where A-master was running.
> 
> Is that not what you were wanting?

Yes, that is what I wanted.

However, the PE does not end up applying this colocation, because the role of msPostgresql 
has not yet been decided when the placement of A-master is calculated.
 * In this case the colocation only seems to influence the priority of the Master/Slave 
resource.

I think the problem would go away if this PE calculation were revised.

Best Regards,
Hideo Yamauchi.

--- On Thu, 2014/5/15, Andrew Beekhof  wrote:

> 
> On 15 May 2014, at 9:57 am, renayama19661...@ybb.ne.jp wrote:
> 
> > Hi Andrew,
> > 
> > Thank you for comments.
> > 
> >>> We do not want to be promoted to Master in the node that primitive 
> >>> resource does not start.
> >>> Is there the setting of colocation and order which are not promoted to 
> >>> Master of the Master node?
> >> 
> >> Your config looks reasonable... almost certainly a bug in the PE.
> >> Do you happen to have the relevant pengine input file available?
> > 
> > Really?
> 
> I would expect that:
> 
>   colocation rsc_colocation-master-1 INFINITY: msPostgresql:Master A-master
> 
> would only promote msPostgresql on a node where A-master was running.
> 
> Is that not what you were wanting?
> 
> 
> > It was like right handling of PE as far as I confirmed a source code of 
> > PM1.1.
> > I register this problem with Bugzilla and contact you.
> > 
> > Best Regards,
> > Hideo Yamauchi.
> > 
> > --- On Wed, 2014/5/14, Andrew Beekhof  wrote:
> > 
> >> 
> >> On 13 May 2014, at 3:14 pm, renayama19661...@ybb.ne.jp wrote:
> >> 
> >>> Hi All,
> >>> 
> >>> We assume special resource constitution.
> >>> Master of master-slave depends on primitive resource for the constitution.
> >>> 
> >>> We performed the setting that Master stopped becoming it in Slave node 
> >>> experimentally.
> >>> 
> >>> 
> >>>    location rsc_location-msStateful-1 msPostgresql \
> >>>         rule $role="master" 200: #uname eq srv01 \
> >>>         rule $role="master" -INFINITY: #uname eq srv02
> >>> 
> >>> The Master resource depends on the primitive resource.
> >>> 
> >>>    colocation rsc_colocation-master-1 INFINITY: msPostgresql:Master 
> >>>A-master
> >>> 
> >>> 
> >>> Step1) Start Slave node.
> >>> ---
> >>> [root@srv02 ~]# crm_mon -1 -Af
> >>> Last updated: Tue May 13 22:28:12 2014
> >>> Last change: Tue May 13 22:28:07 2014
> >>> Stack: corosync
> >>> Current DC: srv02 (3232238190) - partition WITHOUT quorum
> >>> Version: 1.1.11-f0f09b8
> >>> 1 Nodes configured
> >>> 3 Resources configured
> >>> 
> >>> 
> >>> Online: [ srv02 ]
> >>> 
> >>> A-master     (ocf::heartbeat:Dummy): Started srv02 
> >>> Master/Slave Set: msPostgresql [pgsql]
> >>>      Slaves: [ srv02 ]
> >>> 
> >>> Node Attributes:
> >>> * Node srv02:
> >>>     + master-pgsql                      : 5         
> >>> 
> >>> Migration summary:
> >>> * Node srv02: 
> >>> ---
> >>> 
> >>> Step2) Start Master node.
> >>> ---
> >>> [root@srv02 ~]# crm_mon -1 -Af
> >>> Last updated: Tue May 13 22:33:39 2014
> >>> Last change: Tue May 13 22:28:07 2014
> >>> Stack: corosync
> >>> Current DC: srv02 (3232238190) - partition with quorum
> >>> Version: 1.1.11-f0f09b8
> >>> 2 Nodes configured
> >>> 3 Resources configured
> >>> 
> >>> 
> >>> Online: [ srv01 srv02 ]
> >>> 
> >>> A-master     (ocf::heartbeat:Dummy): Started srv02 
> >>> Master/Slave Set: msPostgresql [pgsql]
> >>>      Masters: [ srv01 ]
> >>>      Slaves: [ srv02 ]
> >>> 
> >>> Node Attributes:
> >>> * Node srv01:
> >>>     + master-pgsql                      : 10        
> >>> * Node srv02:
> >>>     + master-pgsql                      : 5         
> >>> 
> >>> Migration summary:
> >>> * Node srv02: 
> >>> * Node srv01: 
> >>> ---
> >>> 
> >>> * The Master node that primitive node does not start becomes Master.
> >>> 
> >>> 
> >>> We do not want to be promoted to Master in the node that primitive 
> >>> resource does not start.
> >>> Is there the setting of colocation and order which are not promoted to 
> >>> Master of the Master node?
> >> 
> >> Your config looks reasonable... almost certainly a bug in the PE.
> >> Do you happen to have the relevant pengine input file available?
> >> 
> >>> 
> >>> I think that one method includes the next method.
> >>>   * I handle it to update an attribute when primitive resource starts.
> >>>   * I write an attribute in the condition to be promoted to Master.
> >>> 
> >>> 
> >>> In addition, we are often confused about control

Re: [Pacemaker] [Problem][pacemaker1.0] The "probe" may not be carried out by difference in cib information of "probe".

2014-05-14 Thread renayama19661014
Hi Andrew,

> Here we go:
> 
>    https://github.com/ClusterLabs/pacemaker-1.0/blob/master/README.md
> 
> If any additional bugs are found in 1.0, we should create a new entry at 
> bugs.clusterlabs.org, add it to the above README and as long as 1.1 is 
> unaffected: close the bug as WONTFIX. 

All right!

Many Thanks!
Hideo Yamauchi.


--- On Thu, 2014/5/15, Andrew Beekhof  wrote:

> 
> On 15 May 2014, at 9:54 am, renayama19661...@ybb.ne.jp wrote:
> 
> > Hi Andrew,
> > 
> >>> It is not necessary at all to revise it for Pacemaker1.0.
> >> 
> >> Maybe we need to add KnownIssues.md to the repo for anyone thats slow to 
> >> update.
> >> Are there any 1.0 bugs that really really need fixing or shall we move 
> >> them all to the KnownIssues file?
> > 
> > That's a good idea.
> > In the user who is behind with a shift to PM1.1, it will help big.
> 
> Here we go:
> 
>    https://github.com/ClusterLabs/pacemaker-1.0/blob/master/README.md
> 
> If any additional bugs are found in 1.0, we should create a new entry at 
> bugs.clusterlabs.org, add it to the above README and as long as 1.1 is 
> unaffected: close the bug as WONTFIX. 
> 

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Question] About control of colocation.(master-slave with primitive)

2014-05-14 Thread renayama19661014
Hi Andrew,

I registered the problem in Bugzilla and attached a crm_report file.

 * http://bugs.clusterlabs.org/show_bug.cgi?id=5213

Best Regards,
Hideo Yamauchi.

--- On Thu, 2014/5/15, renayama19661...@ybb.ne.jp  
wrote:

> Hi Andrew,
> 
> > >> Your config looks reasonable... almost certainly a bug in the PE.
> > >> Do you happen to have the relevant pengine input file available?
> > > 
> > > Really?
> > 
> > I would expect that:
> > 
> >   colocation rsc_colocation-master-1 INFINITY: msPostgresql:Master A-master
> > 
> > would only promote msPostgresql on a node where A-master was running.
> > 
> > Is that not what you were wanting?
> 
> Yes. I wanted it.
> 
> However, this colocation does not come to be applied by handling of PE.
> This is because role of msPostgresql is not decided when it calculates 
> placement of A-MASTER.
>  * In this case colocation seems to affect only the priority of the 
> Master/Slave resource.
> 
> I think that this problem disappears if this calculation of the PE is revised.
> 
> Best Regards,
> Hideo Yamauchi.
> 
> --- On Thu, 2014/5/15, Andrew Beekhof  wrote:
> 
> > 
> > On 15 May 2014, at 9:57 am, renayama19661...@ybb.ne.jp wrote:
> > 
> > > Hi Andrew,
> > > 
> > > Thank you for comments.
> > > 
> > >>> We do not want to be promoted to Master in the node that primitive 
> > >>> resource does not start.
> > >>> Is there the setting of colocation and order which are not promoted to 
> > >>> Master of the Master node?
> > >> 
> > >> Your config looks reasonable... almost certainly a bug in the PE.
> > >> Do you happen to have the relevant pengine input file available?
> > > 
> > > Really?
> > 
> > I would expect that:
> > 
> >   colocation rsc_colocation-master-1 INFINITY: msPostgresql:Master A-master
> > 
> > would only promote msPostgresql on a node where A-master was running.
> > 
> > Is that not what you were wanting?
> > 
> > 
> > > It was like right handling of PE as far as I confirmed a source code of 
> > > PM1.1.
> > > I register this problem with Bugzilla and contact you.
> > > 
> > > Best Regards,
> > > Hideo Yamauchi.
> > > 
> > > --- On Wed, 2014/5/14, Andrew Beekhof  wrote:
> > > 
> > >> 
> > >> On 13 May 2014, at 3:14 pm, renayama19661...@ybb.ne.jp wrote:
> > >> 
> > >>> Hi All,
> > >>> 
> > >>> We assume special resource constitution.
> > >>> Master of master-slave depends on primitive resource for the 
> > >>> constitution.
> > >>> 
> > >>> We performed the setting that Master stopped becoming it in Slave node 
> > >>> experimentally.
> > >>> 
> > >>> 
> > >>>    location rsc_location-msStateful-1 msPostgresql \
> > >>>         rule $role="master" 200: #uname eq srv01 \
> > >>>         rule $role="master" -INFINITY: #uname eq srv02
> > >>> 
> > >>> The Master resource depends on the primitive resource.
> > >>> 
> > >>>    colocation rsc_colocation-master-1 INFINITY: msPostgresql:Master 
> > >>>A-master
> > >>> 
> > >>> 
> > >>> Step1) Start Slave node.
> > >>> ---
> > >>> [root@srv02 ~]# crm_mon -1 -Af
> > >>> Last updated: Tue May 13 22:28:12 2014
> > >>> Last change: Tue May 13 22:28:07 2014
> > >>> Stack: corosync
> > >>> Current DC: srv02 (3232238190) - partition WITHOUT quorum
> > >>> Version: 1.1.11-f0f09b8
> > >>> 1 Nodes configured
> > >>> 3 Resources configured
> > >>> 
> > >>> 
> > >>> Online: [ srv02 ]
> > >>> 
> > >>> A-master     (ocf::heartbeat:Dummy): Started srv02 
> > >>> Master/Slave Set: msPostgresql [pgsql]
> > >>>      Slaves: [ srv02 ]
> > >>> 
> > >>> Node Attributes:
> > >>> * Node srv02:
> > >>>     + master-pgsql                      : 5         
> > >>> 
> > >>> Migration summary:
> > >>> * Node srv02: 
> > >>> ---
> > >>> 
> > >>> Step2) Start Master node.
> > >>> ---
> > >>> [root@srv02 ~]# crm_mon -1 -Af
> > >>> Last updated: Tue May 13 22:33:39 2014
> > >>> Last change: Tue May 13 22:28:07 2014
> > >>> Stack: corosync
> > >>> Current DC: srv02 (3232238190) - partition with quorum
> > >>> Version: 1.1.11-f0f09b8
> > >>> 2 Nodes configured
> > >>> 3 Resources configured
> > >>> 
> > >>> 
> > >>> Online: [ srv01 srv02 ]
> > >>> 
> > >>> A-master     (ocf::heartbeat:Dummy): Started srv02 
> > >>> Master/Slave Set: msPostgresql [pgsql]
> > >>>      Masters: [ srv01 ]
> > >>>      Slaves: [ srv02 ]
> > >>> 
> > >>> Node Attributes:
> > >>> * Node srv01:
> > >>>     + master-pgsql                      : 10        
> > >>> * Node srv02:
> > >>>     + master-pgsql                      : 5         
> > >>> 
> > >>> Migration summary:
> > >>> * Node srv02: 
> > >>> * Node srv01: 
> > >>> ---
> > >>> 
> > >>> * The Master node that primitive node does not start becomes Master.
> > >>> 
> > >>> 
> > >>> We do not want to be promoted to Master in the node that primitive 
> > >>> resource does not start.
> > >>> Is 

[Pacemaker] [Problem] The "dampen" parameter of the attrd_updater command is ignored, and an attribute is updated.

2014-05-26 Thread renayama19661014
Hi All,

The "dampen" parameter passed to the attrd_updater command is ignored and the attribute is 
written out immediately.

Step1) Start one node.
[root@srv01 ~]# crm_mon -1 -Af
Last updated: Tue May 27 19:36:35 2014
Last change: Tue May 27 19:34:59 2014
Stack: corosync
Current DC: srv01 (3232238180) - partition WITHOUT quorum
Version: 1.1.11-f0f09b8
1 Nodes configured
0 Resources configured


Online: [ srv01 ]


Node Attributes:
* Node srv01:

Migration summary:
* Node srv01: 

Step 2) Update an attribute with the attrd_updater command.
[root@srv01 ~]# attrd_updater -n default_ping_set -U 500 -d 3000   

Step 3) The attribute is written without waiting for the "dampen" delay.
[root@srv01 ~]# cibadmin -Q | grep ping_set
  <nvpair ... name="default_ping_set" value="500"/>

The following code appears to be where the problem lies:

--- attrd/commands.c -
(snip)
    /* this only involves cluster nodes. */
    if(v->nodeid == 0 && (v->is_remote == FALSE)) {
        if(crm_element_value_int(xml, F_ATTRD_HOST_ID, (int*)&v->nodeid) == 0) {
            /* Create the name/id association */
            crm_node_t *peer = crm_get_peer(v->nodeid, host);
            crm_trace("We know %s's node id now: %s", peer->uname, peer->uuid);
            if(election_state(writer) == election_won) {
                /* the writer flushes all attributes out immediately here,
                 * before the dampen timer has had a chance to run */
                write_attributes(FALSE, TRUE);
                return;
            }
        }
    }

Best Regards,
Hideo Yamauchi.


___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Problem] The "dampen" parameter of the attrd_updater command is ignored, and an attribute is updated.

2014-05-27 Thread renayama19661014
Hi Andrew,

Thank you for comment.

> > --- attrd/command.c -
> > (snip)
> >    /* this only involves cluster nodes. */
> >    if(v->nodeid == 0 && (v->is_remote == FALSE)) {
> >        if(crm_element_value_int(xml, F_ATTRD_HOST_ID, (int*)&v->nodeid) == 
> >0) {
> >            /* Create the name/id association */
> >            crm_node_t *peer = crm_get_peer(v->nodeid, host);
> >            crm_trace("We know %s's node id now: %s", peer->uname, 
> >peer->uuid);
> >            if(election_state(writer) == election_won) {
> >                write_attributes(FALSE, TRUE);
> >                return;
> >            }
> >        }
> >    }
> 
> This is for 5194 right?

No.
I mentioned the same code in 5194, but this does not seem to be the root cause of 5194.

I have been trying to reproduce the 5194 problem, but so far it has not reappeared.
Possibly 5194 no longer happens in PM1.1.12-rc1.
 * Please give me a little more time on the 5194 issue.

> 
> I'd expect that block to hit this clause though:
> 
>      } else if(mainloop_timer_running(a->timer)) {
>         crm_info("Write out of '%s' delayed: timer is running", a->id);
>         return;

Which part of the source code does the clause you quoted above refer to?
(Which line of the source code is it?)

Best Regards,
Hideo Yamauchi.

--- On Wed, 2014/5/28, Andrew Beekhof  wrote:

> 
> On 27 May 2014, at 12:13 pm, renayama19661...@ybb.ne.jp wrote:
> 
> > Hi All,
> > 
> > The attrd_updater command ignores the "dampen" parameter and updates an 
> > attribute.
> > 
> > Step1) Start one node.
> > [root@srv01 ~]# crm_mon -1 -Af
> > Last updated: Tue May 27 19:36:35 2014
> > Last change: Tue May 27 19:34:59 2014
> > Stack: corosync
> > Current DC: srv01 (3232238180) - partition WITHOUT quorum
> > Version: 1.1.11-f0f09b8
> > 1 Nodes configured
> > 0 Resources configured
> > 
> > 
> > Online: [ srv01 ]
> > 
> > 
> > Node Attributes:
> > * Node srv01:
> > 
> > Migration summary:
> > * Node srv01: 
> > 
> > Step2) Update an attribute by attrd_updater command.
> > [root@srv01 ~]# attrd_updater -n default_ping_set -U 500 -d 3000       
> > 
> > Step3) The attribute is updated without waiting for the time of the 
> > "dampen" parameter.
> > [root@srv01 ~]# cibadmin -Q | grep ping_set            
> >           >name="default_ping_set" value="500"/>
> > 
> > The next code seems to have a problem somehow or other.
> > 
> > --- attrd/command.c -
> > (snip)
> >    /* this only involves cluster nodes. */
> >    if(v->nodeid == 0 && (v->is_remote == FALSE)) {
> >        if(crm_element_value_int(xml, F_ATTRD_HOST_ID, (int*)&v->nodeid) == 
> >0) {
> >            /* Create the name/id association */
> >            crm_node_t *peer = crm_get_peer(v->nodeid, host);
> >            crm_trace("We know %s's node id now: %s", peer->uname, 
> >peer->uuid);
> >            if(election_state(writer) == election_won) {
> >                write_attributes(FALSE, TRUE);
> >                return;
> >            }
> >        }
> >    }
> 
> This is for 5194 right?
> 
> I'd expect that block to hit this clause though:
> 
>      } else if(mainloop_timer_running(a->timer)) {
>         crm_info("Write out of '%s' delayed: timer is running", a->id);
>         return;
> 
> 

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Problem] The "dampen" parameter of the attrd_updater command is ignored, and an attribute is updated.

2014-05-27 Thread renayama19661014
Hi Andrew,

> > I'd expect that block to hit this clause though:
> > 
> >      } else if(mainloop_timer_running(a->timer)) {
> >         crm_info("Write out of '%s' delayed: timer is running", a->id);
> >         return;
> 
> Which point of the source code does the suggested code mentioned above revise?
> (Which line of the source code is it?)

Is the following the code you were pointing at?

void
write_attribute(attribute_t *a)
{
    int updates = 0;
(snip)
    } else if(mainloop_timer_running(a->timer)) {
        crm_info("Write out of '%s' delayed: timer is running", a->id);
        return;
    }
(snip)

When the problem occurs, this check does not block the write, because the timer has not 
been started yet.

Best Regards,
Hideo Yamauchi.

--- On Wed, 2014/5/28, renayama19661...@ybb.ne.jp  
wrote:

> Hi Andrew,
> 
> Thank you for comment.
> 
> > > --- attrd/command.c -
> > > (snip)
> > >    /* this only involves cluster nodes. */
> > >    if(v->nodeid == 0 && (v->is_remote == FALSE)) {
> > >        if(crm_element_value_int(xml, F_ATTRD_HOST_ID, (int*)&v->nodeid) 
> > >== 0) {
> > >            /* Create the name/id association */
> > >            crm_node_t *peer = crm_get_peer(v->nodeid, host);
> > >            crm_trace("We know %s's node id now: %s", peer->uname, 
> > >peer->uuid);
> > >            if(election_state(writer) == election_won) {
> > >                write_attributes(FALSE, TRUE);
> > >                return;
> > >            }
> > >        }
> > >    }
> > 
> > This is for 5194 right?
> 
> No.
> I listed the same thing in 5194, but this does not seem to be 5194 basic 
> problems.
> 
> I try the reproduction of 5194 problems, but have not been able to yet 
> reappear.
> Possibly 5194 problems may not happen in PM1.1.12-rc1.
>  * As for 5194 matters, please give me time a little more.
> 
> > 
> > I'd expect that block to hit this clause though:
> > 
> >      } else if(mainloop_timer_running(a->timer)) {
> >         crm_info("Write out of '%s' delayed: timer is running", a->id);
> >         return;
> 
> Which point of the source code does the suggested code mentioned above revise?
> (Which line of the source code is it?)
> 
> Best Regards,
> Hideo Yamauchi.
> 
> --- On Wed, 2014/5/28, Andrew Beekhof  wrote:
> 
> > 
> > On 27 May 2014, at 12:13 pm, renayama19661...@ybb.ne.jp wrote:
> > 
> > > Hi All,
> > > 
> > > The attrd_updater command ignores the "dampen" parameter and updates an 
> > > attribute.
> > > 
> > > Step1) Start one node.
> > > [root@srv01 ~]# crm_mon -1 -Af
> > > Last updated: Tue May 27 19:36:35 2014
> > > Last change: Tue May 27 19:34:59 2014
> > > Stack: corosync
> > > Current DC: srv01 (3232238180) - partition WITHOUT quorum
> > > Version: 1.1.11-f0f09b8
> > > 1 Nodes configured
> > > 0 Resources configured
> > > 
> > > 
> > > Online: [ srv01 ]
> > > 
> > > 
> > > Node Attributes:
> > > * Node srv01:
> > > 
> > > Migration summary:
> > > * Node srv01: 
> > > 
> > > Step2) Update an attribute by attrd_updater command.
> > > [root@srv01 ~]# attrd_updater -n default_ping_set -U 500 -d 3000       
> > > 
> > > Step3) The attribute is updated without waiting for the time of the 
> > > "dampen" parameter.
> > > [root@srv01 ~]# cibadmin -Q | grep ping_set            
> > >           > >name="default_ping_set" value="500"/>
> > > 
> > > The next code seems to have a problem somehow or other.
> > > 
> > > --- attrd/command.c -
> > > (snip)
> > >    /* this only involves cluster nodes. */
> > >    if(v->nodeid == 0 && (v->is_remote == FALSE)) {
> > >        if(crm_element_value_int(xml, F_ATTRD_HOST_ID, (int*)&v->nodeid) 
> > >== 0) {
> > >            /* Create the name/id association */
> > >            crm_node_t *peer = crm_get_peer(v->nodeid, host);
> > >            crm_trace("We know %s's node id now: %s", peer->uname, 
> > >peer->uuid);
> > >            if(election_state(writer) == election_won) {
> > >                write_attributes(FALSE, TRUE);
> > >                return;
> > >            }
> > >        }
> > >    }
> > 
> > This is for 5194 right?
> > 
> > I'd expect that block to hit this clause though:
> > 
> >      } else if(mainloop_timer_running(a->timer)) {
> >         crm_info("Write out of '%s' delayed: timer is running", a->id);
> >         return;
> > 
> > 
> 
> ___
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
> 

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Problem] The "dampen" parameter of the attrd_updater command is ignored, and an attribute is updated.

2014-05-27 Thread renayama19661014
Hi Andrew,

> Perhaps try:
> 
> diff --git a/attrd/commands.c b/attrd/commands.c
> index 7f1b4b0..7342e23 100644
> --- a/attrd/commands.c
> +++ b/attrd/commands.c
> @@ -464,6 +464,15 @@ attrd_peer_update(crm_node_t *peer, xmlNode *xml, bool 
> filter)
>  
>      a->changed |= changed;
>  
> +    if(changed) {
> +        if(a->timer) {
> +            crm_trace("Delayed write out (%dms) for %s", a->timeout_ms, 
> a->id);
> +            mainloop_timer_start(a->timer);
> +        } else {
> +            write_or_elect_attribute(a);
> +        }
> +    }
> +
>      /* this only involves cluster nodes. */
>      if(v->nodeid == 0 && (v->is_remote == FALSE)) {
>          if(crm_element_value_int(xml, F_ATTRD_HOST_ID, (int*)&v->nodeid) == 
> 0) {
> @@ -476,15 +485,6 @@ attrd_peer_update(crm_node_t *peer, xmlNode *xml, bool 
> filter)
>              }
>          }
>      }
> -
> -    if(changed) {
> -        if(a->timer) {
> -            crm_trace("Delayed write out (%dms) for %s", a->timeout_ms, 
> a->id);
> -            mainloop_timer_start(a->timer);
> -        } else {
> -            write_or_elect_attribute(a);
> -        }
> -    }
>  }
>  
>  void

Okay!
I will check the behavior.

Many Thanks!
Hideo Yamauchi.

--- On Wed, 2014/5/28, Andrew Beekhof  wrote:

> 
> On 28 May 2014, at 4:10 pm, Andrew Beekhof  wrote:
> 
> > 
> > On 28 May 2014, at 3:04 pm, renayama19661...@ybb.ne.jp wrote:
> > 
> >> Hi Andrew,
> >> 
>  I'd expect that block to hit this clause though:
>  
>      } else if(mainloop_timer_running(a->timer)) {
>         crm_info("Write out of '%s' delayed: timer is running", a->id);
>         return;
> >>> 
> >>> Which point of the source code does the suggested code mentioned above 
> >>> revise?
> >>> (Which line of the source code is it?)
> >> 
> >> Is it the next cord that you pointed?
> > 
> > right
> > 
> >> 
> >> void
> >> write_attribute(attribute_t *a)
> >> {
> >>   int updates = 0;
> >> (snip)
> >>   } else if(mainloop_timer_running(a->timer)) {
> >>       crm_info("Write out of '%s' delayed: timer is running", a->id);
> >>       return;
> >>   }
> >> (snip)
> >> 
> >> At the time of phenomenon of the problem, the timer does not yet block it 
> >> by this processing because it does not start.
> > 
> > Thats the curious part
> 
> Perhaps try:
> 
> diff --git a/attrd/commands.c b/attrd/commands.c
> index 7f1b4b0..7342e23 100644
> --- a/attrd/commands.c
> +++ b/attrd/commands.c
> @@ -464,6 +464,15 @@ attrd_peer_update(crm_node_t *peer, xmlNode *xml, bool 
> filter)
>  
>      a->changed |= changed;
>  
> +    if(changed) {
> +        if(a->timer) {
> +            crm_trace("Delayed write out (%dms) for %s", a->timeout_ms, 
> a->id);
> +            mainloop_timer_start(a->timer);
> +        } else {
> +            write_or_elect_attribute(a);
> +        }
> +    }
> +
>      /* this only involves cluster nodes. */
>      if(v->nodeid == 0 && (v->is_remote == FALSE)) {
>          if(crm_element_value_int(xml, F_ATTRD_HOST_ID, (int*)&v->nodeid) == 
> 0) {
> @@ -476,15 +485,6 @@ attrd_peer_update(crm_node_t *peer, xmlNode *xml, bool 
> filter)
>              }
>          }
>      }
> -
> -    if(changed) {
> -        if(a->timer) {
> -            crm_trace("Delayed write out (%dms) for %s", a->timeout_ms, 
> a->id);
> -            mainloop_timer_start(a->timer);
> -        } else {
> -            write_or_elect_attribute(a);
> -        }
> -    }
>  }
>  
>  void
> 
> 
> 
> 

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Problem] The "dampen" parameter of the attrd_updater command is ignored, and an attribute is updated.

2014-05-27 Thread renayama19661014
Hi Andrew,

I checked the behavior right away.
Your patch solves the problem.
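
For reference, the kind of check we ran is roughly the following (the attribute 
name and dampen value follow the reproduction steps earlier in this thread, and the 
timings are only an example):

  attrd_updater -n default_ping_set -U 500 -d 3000   # update with a 3000ms dampen
  cibadmin -Q | grep default_ping_set                # right after: the value should not be in the CIB yet
  sleep 4
  cibadmin -Q | grep default_ping_set                # after the dampen interval, value="500" should appear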

Many Thanks!
Hideo Yamauchi.

--- On Wed, 2014/5/28, renayama19661...@ybb.ne.jp  
wrote:

> Hi Andrew,
> 
> > Perhaps try:
> > 
> > diff --git a/attrd/commands.c b/attrd/commands.c
> > index 7f1b4b0..7342e23 100644
> > --- a/attrd/commands.c
> > +++ b/attrd/commands.c
> > @@ -464,6 +464,15 @@ attrd_peer_update(crm_node_t *peer, xmlNode *xml, bool 
> > filter)
> >  
> >      a->changed |= changed;
> >  
> > +    if(changed) {
> > +        if(a->timer) {
> > +            crm_trace("Delayed write out (%dms) for %s", a->timeout_ms, 
> > a->id);
> > +            mainloop_timer_start(a->timer);
> > +        } else {
> > +            write_or_elect_attribute(a);
> > +        }
> > +    }
> > +
> >      /* this only involves cluster nodes. */
> >      if(v->nodeid == 0 && (v->is_remote == FALSE)) {
> >          if(crm_element_value_int(xml, F_ATTRD_HOST_ID, (int*)&v->nodeid) 
> > == 0) {
> > @@ -476,15 +485,6 @@ attrd_peer_update(crm_node_t *peer, xmlNode *xml, bool 
> > filter)
> >              }
> >          }
> >      }
> > -
> > -    if(changed) {
> > -        if(a->timer) {
> > -            crm_trace("Delayed write out (%dms) for %s", a->timeout_ms, 
> > a->id);
> > -            mainloop_timer_start(a->timer);
> > -        } else {
> > -            write_or_elect_attribute(a);
> > -        }
> > -    }
> >  }
> >  
> >  void
> 
> Okay!
> I confirm movement.
> 
> Many Thanks!
> Hideo Yamauchi.
> 
> --- On Wed, 2014/5/28, Andrew Beekhof  wrote:
> 
> > 
> > On 28 May 2014, at 4:10 pm, Andrew Beekhof  wrote:
> > 
> > > 
> > > On 28 May 2014, at 3:04 pm, renayama19661...@ybb.ne.jp wrote:
> > > 
> > >> Hi Andrew,
> > >> 
> >  I'd expect that block to hit this clause though:
> >  
> >      } else if(mainloop_timer_running(a->timer)) {
> >         crm_info("Write out of '%s' delayed: timer is running", a->id);
> >         return;
> > >>> 
> > >>> Which point of the source code does the suggested code mentioned above 
> > >>> revise?
> > >>> (Which line of the source code is it?)
> > >> 
> > >> Is it the next cord that you pointed?
> > > 
> > > right
> > > 
> > >> 
> > >> void
> > >> write_attribute(attribute_t *a)
> > >> {
> > >>   int updates = 0;
> > >> (snip)
> > >>   } else if(mainloop_timer_running(a->timer)) {
> > >>       crm_info("Write out of '%s' delayed: timer is running", a->id);
> > >>       return;
> > >>   }
> > >> (snip)
> > >> 
> > >> At the time of phenomenon of the problem, the timer does not yet block 
> > >> it by this processing because it does not start.
> > > 
> > > Thats the curious part
> > 
> > Perhaps try:
> > 
> > diff --git a/attrd/commands.c b/attrd/commands.c
> > index 7f1b4b0..7342e23 100644
> > --- a/attrd/commands.c
> > +++ b/attrd/commands.c
> > @@ -464,6 +464,15 @@ attrd_peer_update(crm_node_t *peer, xmlNode *xml, bool 
> > filter)
> >  
> >      a->changed |= changed;
> >  
> > +    if(changed) {
> > +        if(a->timer) {
> > +            crm_trace("Delayed write out (%dms) for %s", a->timeout_ms, 
> > a->id);
> > +            mainloop_timer_start(a->timer);
> > +        } else {
> > +            write_or_elect_attribute(a);
> > +        }
> > +    }
> > +
> >      /* this only involves cluster nodes. */
> >      if(v->nodeid == 0 && (v->is_remote == FALSE)) {
> >          if(crm_element_value_int(xml, F_ATTRD_HOST_ID, (int*)&v->nodeid) 
> > == 0) {
> > @@ -476,15 +485,6 @@ attrd_peer_update(crm_node_t *peer, xmlNode *xml, bool 
> > filter)
> >              }
> >          }
> >      }
> > -
> > -    if(changed) {
> > -        if(a->timer) {
> > -            crm_trace("Delayed write out (%dms) for %s", a->timeout_ms, 
> > a->id);
> > -            mainloop_timer_start(a->timer);
> > -        } else {
> > -            write_or_elect_attribute(a);
> > -        }
> > -    }
> >  }
> >  
> >  void
> > 
> > 
> > 
> > 
> 
> ___
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
> 

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[Pacemaker] [Enhancement] When attrd reboots, the attribute disappears.

2014-06-08 Thread renayama19661014
Hi All,

I reported this problem in the following Bugzilla entry in the past.
 * https://developerbugs.linuxfoundation.org/show_bug.cgi?id=2501

A similar phenomenon occurs in attrd of the latest Pacemaker.

Step 1) Configure the cluster as follows.
 export PCMK_fail_fast=no

Step 2) Start a cluster.

Step 3) Cause a failure in a resource so that its failure count (fail-count) increases.

[root@srv01 ~]# crm_mon -1 -Af
(snip)
Online: [ srv01 ]

 before-dummy   (ocf::heartbeat:Dummy): Started srv01 
 vip-master (ocf::heartbeat:Dummy2):Started srv01 


Migration summary:
* Node srv01: 
   before-dummy: migration-threshold=10 fail-count=1 last-failure='Mon Jun  9 
19:21:07 2014'

Failed actions:
before-dummy_monitor_1 on srv01 'not running' (7): call=11, 
status=complete, last-rc-change='Mon Jun  9 19:21:07 2014', queued=0ms, exec=0ms


Step 4) Restart attrd by killing it. (This simulates attrd crashing and being respawned.)

Step 5) Cause the same resource failure as in Step 3 again.
 * The failure count (fail-count) goes back to 1 instead of increasing to 2.


[root@srv01 ~]# crm_mon -1 -Af 
(snip)
Online: [ srv01 ]

 before-dummy   (ocf::heartbeat:Dummy): Started srv01 
 vip-master (ocf::heartbeat:Dummy2):Started srv01 

Migration summary:
* Node srv01: 
   before-dummy: migration-threshold=10 fail-count=1 last-failure='Mon Jun  9 
19:22:47 2014'

Failed actions:
before-dummy_monitor_1 on srv01 'not running' (7): call=17, 
status=complete, last-rc-change='Mon Jun  9 19:22:47 2014', queued=0ms, exec=0ms


I think attrd should be improved so that attributes are reliably preserved even 
when attrd restarts.
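
For anyone who wants to reproduce this quickly, a rough shell sketch (the resource 
name matches the example above; the pkill pattern is only an assumption about how 
the attrd process is named on this system):

  crm_mon -1 -Af | grep fail-count      # fail-count=1 after the failure in Step 3
  pkill -9 -f 'pacemaker.*attrd'        # Step 4: simulate an attrd crash; pacemakerd respawns it
  crm_mon -1 -Af | grep fail-count      # the fail-count attribute is now gone
  # cause the same resource failure as in Step 3 again, then:
  crm_mon -1 -Af | grep fail-count      # fail-count starts again at 1 instead of going to 2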

Best Regards,
Hideo Yamauch.


___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Enhancement] When attrd reboots, the attribute disappears.

2014-06-09 Thread renayama19661014
Hi Andrew,

Thank you for your comments.

> Please use bugs.clusterlabs.org in future.
> I'll follow up in bugzilla

Okay!

Best Regards,
Hideo Yamauchi.

--- On Tue, 2014/6/10, Andrew Beekhof  wrote:

> 
> On 9 Jun 2014, at 12:01 pm, renayama19661...@ybb.ne.jp wrote:
> 
> > Hi All,
> > 
> > I submitted a problem in next bugziila in the past.
> > * https://developerbugs.linuxfoundation.org/show_bug.cgi?id=2501
> 
> Please use bugs.clusterlabs.org in future.
> I'll follow up in bugzilla
> 
> > 
> > A similar phenomenon is generated in attrd of latest Pacemaker.
> > 
> > Step 1) Set the setting of the cluster as follows.
> > export PCMK_fail_fast=no
> > 
> > Step 2) Start a cluster.
> > 
> > Step 3) Cause trouble in a resource and improve a trouble count.(fail-count)
> > 
> > [root@srv01 ~]# crm_mon -1 -Af
> > (snip)
> > Online: [ srv01 ]
> > 
> > before-dummy   (ocf::heartbeat:Dummy): Started srv01 
> > vip-master     (ocf::heartbeat:Dummy2):        Started srv01 
> > 
> > 
> > Migration summary:
> > * Node srv01: 
> >   before-dummy: migration-threshold=10 fail-count=1 last-failure='Mon Jun  
> >9 19:21:07 2014'
> > 
> > Failed actions:
> >    before-dummy_monitor_1 on srv01 'not running' (7): call=11, 
> >status=complete, last-rc-change='Mon Jun  9 19:21:07 2014', queued=0ms, 
> >exec=0ms
> > 
> > 
> > Step 4) Reboot attrd in kill.(I assume that attrd breaks down and rebooted.)
> > 
> > Step 5) Produce trouble in a resource same as step 3 again.
> > * The trouble number(fail-count) of times returns to 1.
> > 
> > 
> > [root@srv01 ~]# crm_mon -1 -Af         
> > (snip)
> > Online: [ srv01 ]
> > 
> > before-dummy   (ocf::heartbeat:Dummy): Started srv01 
> > vip-master     (ocf::heartbeat:Dummy2):        Started srv01 
> > 
> > Migration summary:
> > * Node srv01: 
> >   before-dummy: migration-threshold=10 fail-count=1 last-failure='Mon Jun  
> >9 19:22:47 2014'
> > 
> > Failed actions:
> >    before-dummy_monitor_1 on srv01 'not running' (7): call=17, 
> >status=complete, last-rc-change='Mon Jun  9 19:22:47 2014', queued=0ms, 
> >exec=0ms
> > 
> > 
> > Even if attrd reboots, I think that it is necessary to improve attrd so 
> > that an attribute is maintained definitely.
> > 
> > Best Regards,
> > Hideo Yamauch.
> > 
> > 
> > ___
> > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> > 
> > Project Home: http://www.clusterlabs.org
> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > Bugs: http://bugs.clusterlabs.org
> 
> 

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[Pacemaker] [Question] About snmp trap of crm_mon.

2014-07-23 Thread renayama19661014
Hi All,

We were testing the snmptrap function of crm_mon in Pacemaker 1.1.12.
However, crm_mon does not seem to handle the new CIB diff format.


void
crm_diff_update(const char *event, xmlNode * msg)
{
    int rc = -1;
    long now = time(NULL);
(snip)
    if (crm_mail_to || snmp_target || external_agent) {
        /* Process operation updates */
        xmlXPathObject *xpathObj = xpath_search(msg,
                                                "//" F_CIB_UPDATE_RESULT "//" 
XML_TAG_DIFF_ADDED
                                                "//" XML_LRM_TAG_RSC_OP);
        int lpc = 0, max = numXpathResults(xpathObj);
(snip)
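
For reference, we exercise the trap path roughly as follows (a sketch: the trap 
destination address is made up, and it assumes crm_mon was built with SNMP support 
and still accepts --snmp-traps, and that net-snmp's snmptrapd is available on the 
receiver):

  # on the trap receiver
  snmptrapd -f -Lo

  # on a cluster node: run crm_mon in daemon mode so that it sends traps
  crm_mon --daemonize --snmp-traps=192.168.1.100

  # then cause a resource failure or recovery and check whether snmptrapd logs a trap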

Best Regards,
Hideo Yamauch.


___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Question] About snmp trap of crm_mon.

2014-07-24 Thread renayama19661014
Hi Andrew,

> Perhaps someone feels like testing this:
>   https://github.com/beekhof/pacemaker/commit/3df6aff
> 
> Otherwise I'll do it on monday


Thank you for the quick fix.
I will test the SNMP traps by the end of Monday.

Many Thanks!
Hideo Yamauchi.



- Original Message -
> From: Andrew Beekhof 
> To: renayama19661...@ybb.ne.jp; The Pacemaker cluster resource manager 
> 
> Cc: 
> Date: 2014/7/25, Fri 14:02
> Subject: Re: [Pacemaker] [Question] About snmp trap of crm_mon.
> 
> 
> On 24 Jul 2014, at 6:32 pm, Andrew Beekhof  wrote:
> 
>> 
>>  On 24 Jul 2014, at 11:54 am, renayama19661...@ybb.ne.jp wrote:
>> 
>>>  Hi All,
>>> 
>>>  We were going to confirm snmptrap function in crm_mon of 
> Pacemaker1.1.12.
>>>  However, crm_mon does not seem to support a message for a new 
> difference of cib.
>> 
>>  dammit :(
> 
> Perhaps someone feels like testing this:
>   https://github.com/beekhof/pacemaker/commit/3df6aff
> 
> Otherwise I'll do it on monday
> 
>> 
>>> 
>>> 
>>>  void
>>>  crm_diff_update(const char *event, xmlNode * msg)
>>>  {
>>>     int rc = -1;
>>>     long now = time(NULL);
>>>  (snip)
>>>     if (crm_mail_to || snmp_target || external_agent) {
>>>         /* Process operation updates */
>>>         xmlXPathObject *xpathObj = xpath_search(msg,
>>>                                                 "//" 
> F_CIB_UPDATE_RESULT "//" XML_TAG_DIFF_ADDED
>>>                                                 "//" 
> XML_LRM_TAG_RSC_OP);
>>>         int lpc = 0, max = numXpathResults(xpathObj);
>>>  (snip)
>>> 
>>>  Best Regards,
>>>  Hideo Yamauch.
>>> 
>>> 
>>>  ___
>>>  Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
>>>  http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>> 
>>>  Project Home: http://www.clusterlabs.org
>>>  Getting started: 
> http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>  Bugs: http://bugs.clusterlabs.org
>> 
> 

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Question] About snmp trap of crm_mon.

2014-07-27 Thread renayama19661014

Hi Andrew,

>> Perhaps someone feels like testing this:
>>   https://github.com/beekhof/pacemaker/commit/3df6aff
>> 
>> Otherwise I'll do it on monday


I confirmed the SNMP traps for both resource operations and STONITH.
With your correction, the crm_mon command now sends the traps.

Please merge the correction into the master repository.

Best Regards,
Hideo Yamauchi.



- Original Message -
>From: "renayama19661...@ybb.ne.jp" 
>To: Andrew Beekhof ; The Pacemaker cluster resource 
>manager  
>Date: 2014/7/25, Fri 14:21
>Subject: Re: [Pacemaker] [Question] About snmp trap of crm_mon.
> 
>Hi Andrew,
>
>> Perhaps someone feels like testing this:
>>   https://github.com/beekhof/pacemaker/commit/3df6aff
>> 
>> Otherwise I'll do it on monday
>
>
>An immediate correction, thank you.
>I confirm snmp by the end of Monday.
>
>Many Thanks!
>Hideo Yamauchi.
>
>
>
>- Original Message -
>> From: Andrew Beekhof 
>> To: renayama19661...@ybb.ne.jp; The Pacemaker cluster resource manager 
>> 
>> Cc: 
>> Date: 2014/7/25, Fri 14:02
>> Subject: Re: [Pacemaker] [Question] About snmp trap of crm_mon.
>> 
>> 
>> On 24 Jul 2014, at 6:32 pm, Andrew Beekhof  wrote:
>> 
>>> 
>>>  On 24 Jul 2014, at 11:54 am, renayama19661...@ybb.ne.jp wrote:
>>> 
  Hi All,
 
  We were going to confirm snmptrap function in crm_mon of 
>> Pacemaker1.1.12.
  However, crm_mon does not seem to support a message for a new 
>> difference of cib.
>>> 
>>>  dammit :(
>> 
>> Perhaps someone feels like testing this:
>>   https://github.com/beekhof/pacemaker/commit/3df6aff
>> 
>> Otherwise I'll do it on monday
>> 
>>> 
 
 
  void
  crm_diff_update(const char *event, xmlNode * msg)
  {
     int rc = -1;
     long now = time(NULL);
  (snip)
     if (crm_mail_to || snmp_target || external_agent) {
         /* Process operation updates */
         xmlXPathObject *xpathObj = xpath_search(msg,
                                                 "//" 
>> F_CIB_UPDATE_RESULT "//" XML_TAG_DIFF_ADDED
                                                 "//" 
>> XML_LRM_TAG_RSC_OP);
         int lpc = 0, max = numXpathResults(xpathObj);
  (snip)
 
  Best Regards,
  Hideo Yamauch.
 
 
  ___
  Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
  http://oss.clusterlabs.org/mailman/listinfo/pacemaker
 
  Project Home: http://www.clusterlabs.org
  Getting started: 
>> http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
  Bugs: http://bugs.clusterlabs.org
>>> 
>> 
>
>___
>Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
>http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
>Project Home: http://www.clusterlabs.org
>Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>Bugs: http://bugs.clusterlabs.org
>
>
>

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[Pacemaker] [Problem]colocation condition does not become effective.

2012-06-11 Thread renayama19661014
Hi All,

We built a cluster combining a Master/Slave pgsql resource with an RA of our own.

However, we ran into a problem where a colocation constraint did not take effect.
As a result, promote is carried out.

Step1) First start the node that should not become Master.
Step2) The VIPcheck resource starts there.
Step3) Then start the node that should become Master.

Because of the VIPcheck resource and the colocation constraint, promote should not 
be possible on that node.

(snip)
  
(snip)

However, promote is carried out in violation of the colocation constraint.

---crm_mon-

Last updated: Mon Jun 11 18:58:35 2012
Stack: Heartbeat
Current DC: 03-sl6 (fa2f3fed-c63f-4dfb-b0c1-bd02a49ef9b9) - partition with
quorum
Version: 1.0.12-066152e
2 Nodes configured, unknown expected votes
7 Resources configured.


Online: [ 02-sl6 03-sl6 ]

 vip-slave(ocf::heartbeat:IPaddr2):Started 02-sl6
 vipCheck(ocf::heartbeat:VIPcheck):Started 03-sl6
 Master/Slave Set: msPostgresql
 Masters: [ 02-sl6 ]
 Slaves: [ 03-sl6 ]
(snip)

---ptest
[root@rh62-test1 ~]# ptest -x pe-input-778 -VVV
ptest[5261]: 2012/06/12_22:48:56 notice: unpack_config: On loss of CCM Quorum:
Ignore
ptest[5261]: 2012/06/12_22:48:56 WARN: unpack_nodes: Blind faith: not fencing
unseen nodes
ptest[5261]: 2012/06/12_22:48:56 notice: native_print: vip-slave   
(ocf::heartbeat:IPaddr2):   Stopped 
ptest[5261]: 2012/06/12_22:48:56 notice: native_print: vipCheck
(ocf::heartbeat:VIPcheck):  Started 03-sl6
ptest[5261]: 2012/06/12_22:48:56 notice: group_print:  Resource Group:
master-group
ptest[5261]: 2012/06/12_22:48:56 notice: native_print:  vip-master 
(ocf::heartbeat:IPaddr2):   Stopped 
ptest[5261]: 2012/06/12_22:48:56 notice: native_print:  vip-rep
(ocf::heartbeat:IPaddr2):   Stopped 
ptest[5261]: 2012/06/12_22:48:56 notice: clone_print:  Master/Slave Set:
msPostgresql
ptest[5261]: 2012/06/12_22:48:56 notice: short_print:  Slaves: [ 03-sl6
02-sl6 ]
ptest[5261]: 2012/06/12_22:48:56 notice: clone_print:  Clone Set: clnDiskd1
ptest[5261]: 2012/06/12_22:48:56 notice: short_print:  Started: [ 03-sl6
02-sl6 ]
ptest[5261]: 2012/06/12_22:48:56 notice: clone_print:  Clone Set: clnDiskd2
ptest[5261]: 2012/06/12_22:48:56 notice: short_print:  Started: [ 03-sl6
02-sl6 ]
ptest[5261]: 2012/06/12_22:48:56 notice: clone_print:  Clone Set: clnPingd
ptest[5261]: 2012/06/12_22:48:56 notice: short_print:  Started: [ 03-sl6
02-sl6 ]
ptest[5261]: 2012/06/12_22:48:56 notice: RecurringOp:  Start recurring monitor
(9s) for postgresql:1 on 02-sl6
ptest[5261]: 2012/06/12_22:48:56 notice: RecurringOp:  Start recurring monitor
(9s) for postgresql:1 on 02-sl6
ptest[5261]: 2012/06/12_22:48:56 notice: LogActions: Leave   resource vip-slave
(Stopped)
ptest[5261]: 2012/06/12_22:48:56 notice: LogActions: Leave   resource vipCheck 
(Started 03-sl6)
ptest[5261]: 2012/06/12_22:48:56 notice: LogActions: Leave   resource
vip-master(Stopped)
ptest[5261]: 2012/06/12_22:48:56 notice: LogActions: Leave   resource vip-rep  
(Stopped)
ptest[5261]: 2012/06/12_22:48:56 notice: LogActions: Leave   resource
postgresql:0  (Slave 03-sl6)
ptest[5261]: 2012/06/12_22:48:56 notice: LogActions: Promote postgresql:1  
(Slave -> Master 02-sl6)
ptest[5261]: 2012/06/12_22:48:56 notice: LogActions: Leave   resource
prmDiskd1:0   (Started 03-sl6)
ptest[5261]: 2012/06/12_22:48:56 notice: LogActions: Leave   resource
prmDiskd1:1   (Started 02-sl6)
ptest[5261]: 2012/06/12_22:48:56 notice: LogActions: Leave   resource
prmDiskd2:0   (Started 03-sl6)
ptest[5261]: 2012/06/12_22:48:56 notice: LogActions: Leave   resource
prmDiskd2:1   (Started 02-sl6)
ptest[5261]: 2012/06/12_22:48:56 notice: LogActions: Leave   resource
pingCheck:0   (Started 03-sl6)
ptest[5261]: 2012/06/12_22:48:56 notice: LogActions: Leave   resource
pingCheck:1   (Started 02-sl6)
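
When we analyse a pe-input like this we also look at the allocation and promotion 
scores; a sketch, assuming the -s/--show-scores option is available in your 
ptest/crm_simulate build:

  ptest -x pe-input-778 -s -VVV 2>&1 | grep -E 'postgresql|vipCheck'
  crm_simulate -x pe-input-778 -s | grep -E 'postgresql|vipCheck'   # newer tools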


We tried the same scenario with Pacemaker 1.1.7, and the problem occurred in the
same way.

---crm_mon-

Last updated: Mon Jun 11 19:52:59 2012
Last change: Mon Jun 11 19:49:30 2012 via crm_attribute on 16-sl6
Stack: Heartbeat
Current DC: 17-sl6 (706ba487-030a-41f8-bcb0-b8c6371bfb89) - partition with
quorum
Version: 1.1.7-512f868
2 Nodes configured, unknown expected votes
6 Resources configured.


Online: [ 16-sl6 17-sl6 ]

vip-slave   (ocf::heartbeat:IPaddr2):   Started 16-sl6
vipCheck(ocf::heartbeat:VIPcheck):  Started 17-sl6
 Master/Slave Set: msPostgresql [postgresql]
 Masters: [ 16-sl6 ]
 Slaves: [ 17-sl6 ]
(snip)


 I registered these contents with Bugzilla.
  * http://bugs.clusterlabs.org/show_bug.cgi?id=5070

Best Regards,
Hideo Yamauchi.

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://ww

[Pacemaker] [Problem and Question] About negative setting of colocation.

2012-06-26 Thread renayama19661014
Hi All,

We tested a negative (score="-INFINITY") colocation setting.

It is a colocation constraint that uses with-rsc-role.

We checked it with the following procedure.

Step1) Start the first node. And we send cib.


Last updated: Wed Jun 27 22:58:15 2012
Stack: Heartbeat
Current DC: rh62-test1 (4a7c480c-ad95-4238-8506-7e5f75c551a4) - partition with 
quorum
Version: 1.0.12-unknown
1 Nodes configured, unknown expected votes
2 Resources configured.


Online: [ rh62-test1 ]

 vipCheck   (ocf::pacemaker:Dummy): Started rh62-test1
 Master/Slave Set: msPostgresql
 Masters: [ rh62-test1 ]
 Stopped: [ postgresql:1 ]

Node Attributes:
* Node rh62-test1:
+ master-postgresql:0   : 10

Migration summary:
* Node rh62-test1:


Step2) Start the second node next.

[root@rh62-test1 ~]# ptest -x pe-input-5 -VVV
ptest[29213]: 2012/06/27_23:30:43 notice: unpack_config: On loss of CCM Quorum: 
Ignore
ptest[29213]: 2012/06/27_23:30:43 WARN: unpack_nodes: Blind faith: not fencing 
unseen nodes
ptest[29213]: 2012/06/27_23:30:43 notice: native_print: vipCheck
(ocf::pacemaker:Dummy): Started rh62-test1
ptest[29213]: 2012/06/27_23:30:43 notice: clone_print:  Master/Slave Set: 
msPostgresql
ptest[29213]: 2012/06/27_23:30:43 notice: short_print:  Masters: [ 
rh62-test1 ]
ptest[29213]: 2012/06/27_23:30:43 notice: short_print:  Stopped: [ 
postgresql:1 ]
ptest[29213]: 2012/06/27_23:30:43 notice: RecurringOp:  Start recurring monitor 
(10s) for postgresql:0 on rh62-test2
ptest[29213]: 2012/06/27_23:30:43 notice: RecurringOp:  Start recurring monitor 
(9s) for postgresql:1 on rh62-test1
ptest[29213]: 2012/06/27_23:30:43 notice: RecurringOp:  Start recurring monitor 
(10s) for postgresql:0 on rh62-test2
ptest[29213]: 2012/06/27_23:30:43 notice: RecurringOp:  Start recurring monitor 
(9s) for postgresql:1 on rh62-test1
ptest[29213]: 2012/06/27_23:30:43 notice: LogActions: Leave   resource vipCheck 
(Started rh62-test1)
ptest[29213]: 2012/06/27_23:30:43 notice: LogActions: Demote  postgresql:0  
(Master -> Slave rh62-test1)
ptest[29213]: 2012/06/27_23:30:43 notice: LogActions: Stopresource 
postgresql:0 (rh62-test1)
ptest[29213]: 2012/06/27_23:30:43 notice: LogActions: Start   postgresql:0  
(rh62-test2)
ptest[29213]: 2012/06/27_23:30:43 notice: LogActions: Start   postgresql:1  
(rh62-test1)
ptest[29213]: 2012/06/27_23:30:43 notice: LogActions: Promote postgresql:1  
(Stopped -> Master rh62-test1)


However, the "postgresql:0" resource is demoted,
and it is replaced by the "postgresql:1" resource, which is then promoted.
This swap is a problem because it is unnecessary movement.

After that the cluster keeps repeating Promote and Demote and never settles the 
resources correctly.

Does a colocation constraint with score="-INFINITY" and with-rsc-role work as 
expected?
Is it a mistake in how we specified it?

  <rsc_colocation id="..." rsc="..." rsc-role="Master" score="INFINITY" with-rsc="vipCheck"/>
  <rsc_colocation id="..." rsc="..." score="-INFINITY" with-rsc="msPostgresql" with-rsc-role="Slave"/>
  <rsc_order id="..." first="..." score="INFINITY" then="msPostgresql" then-action="promote"/>

Or is this a bug?

 * I also checked with ptest from Pacemaker 1.1.7, but the result was the same.
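
Purely to make the intent easier to read, the same constraints expressed as crm 
shell commands would look roughly like the following; the ids are made up and some 
resource names are our guess, since parts of the XML were stripped:

  crm configure colocation col-master-with-vipcheck inf: msPostgresql:Master vipCheck
  crm configure colocation col-vipcheck-not-with-slave -inf: vipCheck msPostgresql:Slave
  crm configure order ord-vipcheck-then-promote inf: vipCheck msPostgresql:promote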

Best Regards,
Hideo Yamauchi.

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Problem and Question] About negative setting of colocation.

2012-06-26 Thread renayama19661014
Hi All,

I registered these contents with Bugzilla.
 * http://bugs.clusterlabs.org/show_bug.cgi?id=5074

And I attached hb_report.

Best Regards,
Hideo Yamauchi.

--- On Wed, 2012/6/27, renayama19661...@ybb.ne.jp  
wrote:

> Hi All,
> 
> We confirmed it about negative setting of colocation.
> 
> It is colocation with with-rsc-role designation.
> 
> We checked it in the next procedure.
> 
> Step1) Start the first node. And we send cib.
> 
> 
> Last updated: Wed Jun 27 22:58:15 2012
> Stack: Heartbeat
> Current DC: rh62-test1 (4a7c480c-ad95-4238-8506-7e5f75c551a4) - partition 
> with quorum
> Version: 1.0.12-unknown
> 1 Nodes configured, unknown expected votes
> 2 Resources configured.
> 
> 
> Online: [ rh62-test1 ]
> 
>  vipCheck       (ocf::pacemaker:Dummy): Started rh62-test1
>  Master/Slave Set: msPostgresql
>      Masters: [ rh62-test1 ]
>      Stopped: [ postgresql:1 ]
> 
> Node Attributes:
> * Node rh62-test1:
>     + master-postgresql:0               : 10        
> 
> Migration summary:
> * Node rh62-test1:
> 
> 
> Step2) Start the second node next.
> 
> [root@rh62-test1 ~]# ptest -x pe-input-5 -VVV
> ptest[29213]: 2012/06/27_23:30:43 notice: unpack_config: On loss of CCM 
> Quorum: Ignore
> ptest[29213]: 2012/06/27_23:30:43 WARN: unpack_nodes: Blind faith: not 
> fencing unseen nodes
> ptest[29213]: 2012/06/27_23:30:43 notice: native_print: vipCheck        
> (ocf::pacemaker:Dummy): Started rh62-test1
> ptest[29213]: 2012/06/27_23:30:43 notice: clone_print:  Master/Slave Set: 
> msPostgresql
> ptest[29213]: 2012/06/27_23:30:43 notice: short_print:      Masters: [ 
> rh62-test1 ]
> ptest[29213]: 2012/06/27_23:30:43 notice: short_print:      Stopped: [ 
> postgresql:1 ]
> ptest[29213]: 2012/06/27_23:30:43 notice: RecurringOp:  Start recurring 
> monitor (10s) for postgresql:0 on rh62-test2
> ptest[29213]: 2012/06/27_23:30:43 notice: RecurringOp:  Start recurring 
> monitor (9s) for postgresql:1 on rh62-test1
> ptest[29213]: 2012/06/27_23:30:43 notice: RecurringOp:  Start recurring 
> monitor (10s) for postgresql:0 on rh62-test2
> ptest[29213]: 2012/06/27_23:30:43 notice: RecurringOp:  Start recurring 
> monitor (9s) for postgresql:1 on rh62-test1
> ptest[29213]: 2012/06/27_23:30:43 notice: LogActions: Leave   resource 
> vipCheck (Started rh62-test1)
> ptest[29213]: 2012/06/27_23:30:43 notice: LogActions: Demote  postgresql:0    
>   (Master -> Slave rh62-test1)
> ptest[29213]: 2012/06/27_23:30:43 notice: LogActions: Stop    resource 
> postgresql:0     (rh62-test1)
> ptest[29213]: 2012/06/27_23:30:43 notice: LogActions: Start   postgresql:0    
>   (rh62-test2)
> ptest[29213]: 2012/06/27_23:30:43 notice: LogActions: Start   postgresql:1    
>   (rh62-test1)
> ptest[29213]: 2012/06/27_23:30:43 notice: LogActions: Promote postgresql:1    
>   (Stopped -> Master rh62-test1)
> 
> 
> However, as for the "postgresql:0" resource, it is done Demote.
> And replaced with "postgresql:1" resource and am done Promote.
> This replacement has a problem by useless movement.
> 
> And the cluster repeats Promote,Demote after this and does not constitute a 
> resource definitely.
> 
> Does it work definitely to set -INFINITY by with-rsc-role designation in 
> colocation?
> Is it a mistake to appoint it in specifications?
> 
>        rsc-role="Master" score="INFINITY" with-rsc="vipCheck"/>
>        with-rsc="msPostgresql" with-rsc-role="Slave"/>
>        score="INFINITY" then="msPostgresql" then-action="promote"/>
> 
> Or is this a bug?
> 
>  * I confirmed it in ptest of Pacemaker1.1.7, but the result was the same.
> 
> Best Regards,
> Hideo Yamauchi.

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Problem and Question] About negative setting of colocation.

2012-06-27 Thread renayama19661014
Hi Andrew,

All right.
We will wait for the fix.

Best Regards,
Hideo  Yamauchi.

--- On Thu, 2012/6/28, Andrew Beekhof  wrote:

> On Wed, Jun 27, 2012 at 4:27 PM,   wrote:
> > Hi All,
> >
> > I registered these contents with Bugzilla.
> >  * http://bugs.clusterlabs.org/show_bug.cgi?id=5074
> 
> Excellent, thankyou!
> David and I have almost finished our "break everything and put it back
> together" phase, we'll be diving back into bug reports in the coming
> days.
> 
> >
> > And I attached hb_report.
> >
> > Best Regards,
> > Hideo Yamauchi.
> >
> > --- On Wed, 2012/6/27, renayama19661...@ybb.ne.jp 
> >  wrote:
> >
> >> Hi All,
> >>
> >> We confirmed it about negative setting of colocation.
> >>
> >> It is colocation with with-rsc-role designation.
> >>
> >> We checked it in the next procedure.
> >>
> >> Step1) Start the first node. And we send cib.
> >>
> >> 
> >> Last updated: Wed Jun 27 22:58:15 2012
> >> Stack: Heartbeat
> >> Current DC: rh62-test1 (4a7c480c-ad95-4238-8506-7e5f75c551a4) - partition 
> >> with quorum
> >> Version: 1.0.12-unknown
> >> 1 Nodes configured, unknown expected votes
> >> 2 Resources configured.
> >> 
> >>
> >> Online: [ rh62-test1 ]
> >>
> >>  vipCheck       (ocf::pacemaker:Dummy): Started rh62-test1
> >>  Master/Slave Set: msPostgresql
> >>      Masters: [ rh62-test1 ]
> >>      Stopped: [ postgresql:1 ]
> >>
> >> Node Attributes:
> >> * Node rh62-test1:
> >>     + master-postgresql:0               : 10
> >>
> >> Migration summary:
> >> * Node rh62-test1:
> >>
> >>
> >> Step2) Start the second node next.
> >>
> >> [root@rh62-test1 ~]# ptest -x pe-input-5 -VVV
> >> ptest[29213]: 2012/06/27_23:30:43 notice: unpack_config: On loss of CCM 
> >> Quorum: Ignore
> >> ptest[29213]: 2012/06/27_23:30:43 WARN: unpack_nodes: Blind faith: not 
> >> fencing unseen nodes
> >> ptest[29213]: 2012/06/27_23:30:43 notice: native_print: vipCheck        
> >> (ocf::pacemaker:Dummy): Started rh62-test1
> >> ptest[29213]: 2012/06/27_23:30:43 notice: clone_print:  Master/Slave Set: 
> >> msPostgresql
> >> ptest[29213]: 2012/06/27_23:30:43 notice: short_print:      Masters: [ 
> >> rh62-test1 ]
> >> ptest[29213]: 2012/06/27_23:30:43 notice: short_print:      Stopped: [ 
> >> postgresql:1 ]
> >> ptest[29213]: 2012/06/27_23:30:43 notice: RecurringOp:  Start recurring 
> >> monitor (10s) for postgresql:0 on rh62-test2
> >> ptest[29213]: 2012/06/27_23:30:43 notice: RecurringOp:  Start recurring 
> >> monitor (9s) for postgresql:1 on rh62-test1
> >> ptest[29213]: 2012/06/27_23:30:43 notice: RecurringOp:  Start recurring 
> >> monitor (10s) for postgresql:0 on rh62-test2
> >> ptest[29213]: 2012/06/27_23:30:43 notice: RecurringOp:  Start recurring 
> >> monitor (9s) for postgresql:1 on rh62-test1
> >> ptest[29213]: 2012/06/27_23:30:43 notice: LogActions: Leave   resource 
> >> vipCheck (Started rh62-test1)
> >> ptest[29213]: 2012/06/27_23:30:43 notice: LogActions: Demote  
> >> postgresql:0      (Master -> Slave rh62-test1)
> >> ptest[29213]: 2012/06/27_23:30:43 notice: LogActions: Stop    resource 
> >> postgresql:0     (rh62-test1)
> >> ptest[29213]: 2012/06/27_23:30:43 notice: LogActions: 
> >> Start   postgresql:0      (rh62-test2)
> >> ptest[29213]: 2012/06/27_23:30:43 notice: LogActions: 
> >> Start   postgresql:1      (rh62-test1)
> >> ptest[29213]: 2012/06/27_23:30:43 notice: LogActions: Promote 
> >> postgresql:1      (Stopped -> Master rh62-test1)
> >>
> >>
> >> However, as for the "postgresql:0" resource, it is done Demote.
> >> And replaced with "postgresql:1" resource and am done Promote.
> >> This replacement has a problem by useless movement.
> >>
> >> And the cluster repeats Promote,Demote after this and does not constitute 
> >> a resource definitely.
> >>
> >> Does it work definitely to set -INFINITY by with-rsc-role designation in 
> >> colocation?
> >> Is it a mistake to appoint it in specifications?
> >>
> >>        >> rsc-role="Master" score="INFINITY" with-rsc="vipCheck"/>
> >>        >> score="-INFINITY" with-rsc="msPostgresql" with-rsc-role="Slave"/>
> >>        >> score="INFINITY" then="msPostgresql" then-action="promote"/>
> >>
> >> Or is this a bug?
> >>
> >>  * I confirmed it in ptest of Pacemaker1.1.7, but the result was the same.
> >>
> >> Best Regards,
> >> Hideo Yamauchi.
> >
> > ___
> > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> >
> > Project Home: http://www.clusterlabs.org
> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > Bugs: http://bugs.clusterlabs.org
> 

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Problem]colocation condition does not become effective.

2012-06-27 Thread renayama19661014
Hi Andrew,

Thank you for the comment.

All right.
We will wait for the fix.

Best Regards,
Hideo Yamauchi.

--- On Thu, 2012/6/28, Andrew Beekhof  wrote:

> On Tue, Jun 12, 2012 at 3:33 PM,   wrote:
> > Hi All,
> ...
> >  I registered these contents with Bugzilla.
> >  * http://bugs.clusterlabs.org/show_bug.cgi?id=5070
> 
> Excellent.  I believe David has been looking into this already.
> 

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[Pacemaker] [Problem] Order is ignored, and promote is carried out.

2012-06-28 Thread renayama19661014
Hi All,

We set up ordering between a primitive resource and a master/slave resource.

We set the order constraint as follows.
However, promote was carried out even though the primitive resource failed to 
start.


Last updated: Fri Jun 29 19:20:09 2012
Stack: Heartbeat
Current DC: rh62-test1 (c9aea7b3-4fe9-4766-9b09-b1f3ab2c329d) - partition with 
quorum
Version: 1.0.12-unknown
1 Nodes configured, unknown expected votes
2 Resources configured.


Online: [ rh62-test1 ]

 Master/Slave Set: msPostgresql
 Masters: [ rh62-test1 ]
 Stopped: [ postgresql:1 ]

Migration summary:
* Node rh62-test1: 
   vipCheck: migration-threshold=1 fail-count=100

Failed actions:
vipCheck_start_0 (node=rh62-test1, call=4, rc=1, status=complete): unknown 
error


Is there a setting that enforces the ordering of start and promote reliably?
Or is this a bug?

The same phenomenon seems to occur in Pacemaker 1.1.7.

I registered these contents with Bugzilla. 
 * http://bugs.clusterlabs.org/show_bug.cgi?id=5075
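
To double-check exactly which order and colocation constraints the cluster is 
using, we usually dump the constraints section of the live CIB:

  cibadmin -Q -o constraints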

Best Regards,
Hideo Yamauchi.

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Problem] Order is ignored, and promote is carried out.

2012-06-28 Thread renayama19661014
Hi All,

Sorry... I forgot to include it.

> We set order limitation as follows.

  <rsc_colocation id="..." rsc="..." score="INFINITY" with-rsc="msPostgresql" with-rsc-role="Master"/>
  <rsc_order id="..." first="..." score="INFINITY" then="msPostgresql" then-action="promote"/>


Best Regards,
Hideo Yamauchi.

--- On Fri, 2012/6/29, renayama19661...@ybb.ne.jp  
wrote:

> Hi All,
> 
> We identified order of the master/slave resource as primitive resource.
> 
> We set order limitation as follows.
> However, promote was carried out even if primitvei resource caused start 
> trouble.
> 
> 
> Last updated: Fri Jun 29 19:20:09 2012
> Stack: Heartbeat
> Current DC: rh62-test1 (c9aea7b3-4fe9-4766-9b09-b1f3ab2c329d) - partition 
> with quorum
> Version: 1.0.12-unknown
> 1 Nodes configured, unknown expected votes
> 2 Resources configured.
> 
> 
> Online: [ rh62-test1 ]
> 
>  Master/Slave Set: msPostgresql
>      Masters: [ rh62-test1 ]
>      Stopped: [ postgresql:1 ]
> 
> Migration summary:
> * Node rh62-test1: 
>    vipCheck: migration-threshold=1 fail-count=100
> 
> Failed actions:
>     vipCheck_start_0 (node=rh62-test1, call=4, rc=1, status=complete): 
> unknown error
> 
> 
> Is there setting to let you carry out order of start and promote definitely?
> Or is this a bug?
> 
> The same phenomenon seems to occur in Pacemaker 1.1.7.
> 
> I registered these contents with Bugzilla. 
>  * http://bugs.clusterlabs.org/show_bug.cgi?id=5075
> 
> Best Regards,
> Hideo Yamauchi.
> 
> ___
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
> 

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Problem] Order is ignored, and promote is carried out.

2012-07-01 Thread renayama19661014
Hi Phillip,

Thank you for the comment.
However, the result was the same even when I used a group resource.

(snip)



  
(snip)
  
(snip)


Last updated: Mon Jul  2 17:56:27 2012
Stack: Heartbeat
Current DC: rh62-test1 (6d534d4e-a3a1-4a92-86c7-eadf6c2f7570) - partition with 
quorum
Version: 1.0.12-unknown
1 Nodes configured, unknown expected votes
2 Resources configured.


Online: [ rh62-test1 ]

 Master/Slave Set: msPostgresql
 Masters: [ rh62-test1 ]
 Stopped: [ postgresql:1 ]

Migration summary:
* Node rh62-test1: 
   vipCheck: migration-threshold=1 fail-count=100

Failed actions:
vipCheck_start_0 (node=rh62-test1, call=4, rc=1, status=complete): unknown 
error

Best Regards,
Hideo Yamauchi.

--- On Fri, 2012/6/29, Phillip Frost  wrote:

> 
> On Jun 28, 2012, at 10:26 PM, renayama19661...@ybb.ne.jp wrote:
> 
> >> We set order limitation as follows.
> > 
> >       >with-rsc="msPostgresql" with-rsc-role="Master"/>
> >       >score="INFINITY" then="msPostgresql" then-action="promote"/>
> > 
> >> However, promote was carried out even if primitvei resource caused start 
> >> trouble.
> >> 
> >> Online: [ rh62-test1 ]
> >> 
> >> Master/Slave Set: msPostgresql
> >>      Masters: [ rh62-test1 ]
> >>      Stopped: [ postgresql:1 ]
> >> 
> >> Migration summary:
> >> * Node rh62-test1: 
> >>    vipCheck: migration-threshold=1 fail-count=100
> >> 
> >> Failed actions:
> >>     vipCheck_start_0 (node=rh62-test1, call=4, rc=1, status=complete): 
> >>unknown error
> 
> What happens if you reverse the order of the of the colocation constraint? 
> You've told pacemaker to decide where to put msPostgresql:Master first, and 
> if it can't run that, then don't run vipCheck, but to start them in the 
> opposite order. I'm not sure an order constraint will prevent one resource 
> from running if another fails to start, but a colocation constraint will, if 
> you get it in the right order.
> 
> You could also use a resource group, which combines colocation and order 
> constraints in the order you'd expect.
> 

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Problem] Order is ignored, and promote is carried out.

2012-07-01 Thread renayama19661014
Hi All,

The cluster came up correctly after we added a redundant resource and changed the 
constraints as follows.
However, we do not want to have to use such a redundant configuration.

(snip)
  



  

  
  



  

  
(snip)
  
  
  
  
(snip)

Last updated: Mon Jul  2 18:35:19 2012
Stack: Heartbeat
Current DC: rh62-test1 (90e5d5b7-d217-4386-a03d-069111772b54) - partition with 
quorum
Version: 1.0.12-unknown
1 Nodes configured, unknown expected votes
3 Resources configured.


Online: [ rh62-test1 ]

 Master/Slave Set: msPostgresql
 Slaves: [ rh62-test1 ]
 Stopped: [ postgresql:1 ]

Migration summary:
* Node rh62-test1: 
   vipCheck: migration-threshold=1 fail-count=100

Failed actions:
vipCheck_start_0 (node=rh62-test1, call=5, rc=1, status=complete): unknown 
error


Best Regards,
Hideo Yamauchi.

--- On Mon, 2012/7/2, renayama19661...@ybb.ne.jp  
wrote:

> Hi Phillip,
> 
> Thank you for comment.
> However, the result was the same even if I used the Group resource.
> 
> (snip)
>              provider="pacemaker" type="Dummy">         id="vipCheck-instance_attributes">                
>            on-fail="restart" start-delay="4s" timeout="90s"/>              
> 
>       
> (snip)
>        score="INFINITY" then="msPostgresql" then-action="promote"/>
> (snip)
> 
> 
> Last updated: Mon Jul  2 17:56:27 2012
> Stack: Heartbeat
> Current DC: rh62-test1 (6d534d4e-a3a1-4a92-86c7-eadf6c2f7570) - partition 
> with quorum
> Version: 1.0.12-unknown
> 1 Nodes configured, unknown expected votes
> 2 Resources configured.
> 
> 
> Online: [ rh62-test1 ]
> 
>  Master/Slave Set: msPostgresql
>      Masters: [ rh62-test1 ]
>      Stopped: [ postgresql:1 ]
> 
> Migration summary:
> * Node rh62-test1: 
>    vipCheck: migration-threshold=1 fail-count=100
> 
> Failed actions:
>     vipCheck_start_0 (node=rh62-test1, call=4, rc=1, status=complete): 
> unknown error
> 
> Best Regards,
> Hideo Yamauchi.
> 
> --- On Fri, 2012/6/29, Phillip Frost  wrote:
> 
> > 
> > On Jun 28, 2012, at 10:26 PM, renayama19661...@ybb.ne.jp wrote:
> > 
> > >> We set order limitation as follows.
> > > 
> > >       > >score="INFINITY" with-rsc="msPostgresql" with-rsc-role="Master"/>
> > >       > >score="INFINITY" then="msPostgresql" then-action="promote"/>
> > > 
> > >> However, promote was carried out even if primitvei resource caused start 
> > >> trouble.
> > >> 
> > >> Online: [ rh62-test1 ]
> > >> 
> > >> Master/Slave Set: msPostgresql
> > >>      Masters: [ rh62-test1 ]
> > >>      Stopped: [ postgresql:1 ]
> > >> 
> > >> Migration summary:
> > >> * Node rh62-test1: 
> > >>    vipCheck: migration-threshold=1 fail-count=100
> > >> 
> > >> Failed actions:
> > >>     vipCheck_start_0 (node=rh62-test1, call=4, rc=1, status=complete): 
> > >>unknown error
> > 
> > What happens if you reverse the order of the of the colocation constraint? 
> > You've told pacemaker to decide where to put msPostgresql:Master first, and 
> > if it can't run that, then don't run vipCheck, but to start them in the 
> > opposite order. I'm not sure an order constraint will prevent one resource 
> > from running if another fails to start, but a colocation constraint will, 
> > if you get it in the right order.
> > 
> > You could also use a resource group, which combines colocation and order 
> > constraints in the order you'd expect.
> > 
> 
> ___
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
> 

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[Pacemaker] [Problem] Though order limitation exists, start which is not carried out is made.

2012-07-05 Thread renayama19661014
Hi All,

We configured the cluster so as to avoid the following problems.

 * http://www.gossamer-threads.com/lists/linuxha/pacemaker/80250 
  -> http://bugs.clusterlabs.org/show_bug.cgi?id=5070 

 * http://www.gossamer-threads.com/lists/linuxha/pacemaker/80549 
  -> http://bugs.clusterlabs.org/show_bug.cgi?id=5075 

The resources finally started correctly on the Master node.



Last updated: Fri Jul  6 10:19:42 2012
Stack: Heartbeat
Current DC: sr1 (31411278-b503-4e17-9cf1-38638babb4ff) - partition with quorum
Version: 1.0.12-unknown
1 Nodes configured, unknown expected votes
7 Resources configured.


Online: [ sr1 ]

 vipCheck   (ocf::heartbeat:VIPcheck):  Started sr1
 vipCheck2  (ocf::pacemaker:Dummy): Started sr1
 vip-slave  (ocf::heartbeat:IPaddr2):   Started sr1
 Resource Group: master-group
 vip-master (ocf::heartbeat:IPaddr2):   Started sr1
 vip-rep(ocf::heartbeat:IPaddr2):   Started sr1
 Master/Slave Set: msPostgresql
 Masters: [ sr1 ]
 Stopped: [ postgresql:1 ]
 Clone Set: clnPingd
 Started: [ sr1 ]
 Clone Set: clnDiskd1
 Started: [ sr1 ]



But when we watch the progress along the way, a start of vipCheck2 is scheduled 
needlessly.
 * This start of vipCheck2 is actually a state transition that is shown (in red) as 
one that is not carried out.
 * It can be seen in "pe-input-1.bz2".


(snip)
Jul  6 09:52:52 sr1 pengine: [2771]: notice: RecurringOp:  Start recurring 
monitor (10s) for prmPingd:0 on sr1
Jul  6 09:52:52 sr1 pengine: [2771]: notice: RecurringOp:  Start recurring 
monitor (10s) for prmDiskd1:0 on sr1
Jul  6 09:52:52 sr1 pengine: [2771]: notice: LogActions: Leave   resource 
vipCheck  (Stopped)
Jul  6 09:52:52 sr1 pengine: [2771]: notice: LogActions: Start   vipCheck2  
(sr1)
Jul  6 09:52:52 sr1 pengine: [2771]: notice: LogActions: Leave   resource 
vip-slave (Stopped)
Jul  6 09:52:52 sr1 pengine: [2771]: notice: LogActions: Leave   resource 
vip-master(Stopped)
Jul  6 09:52:52 sr1 pengine: [2771]: notice: LogActions: Leave   resource 
vip-rep   (Stopped)
Jul  6 09:52:52 sr1 pengine: [2771]: notice: LogActions: Leave   resource 
postgresql:0  (Stopped)
Jul  6 09:52:52 sr1 pengine: [2771]: notice: LogActions: Leave   resource 
postgresql:1  (Stopped)
Jul  6 09:52:52 sr1 pengine: [2771]: notice: LogActions: Start   prmPingd:0 
(sr1)
Jul  6 09:52:52 sr1 pengine: [2771]: notice: LogActions: Start   prmDiskd1:0
(sr1)
Jul  6 09:52:52 sr1 crmd: [2766]: info: do_state_transition: State transition 
S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE 
origin=handle_response ]
Jul  6 09:52:52 sr1 crmd: [2766]: info: unpack_graph: Unpacked transition 1: 20 
actions in 20 synapses
Jul  6 09:52:52 sr1 crmd: [2766]: info: do_te_invoke: Processing graph 1 
(ref=pe_calc-dc-1341535972-14) derived from /var/lib/pengine/pe-input-1.bz2
(snip)



Because there is a constraint between vipCheck and vipCheck2, we think this useless 
start transition for vipCheck2 should not be generated in the first place.

(snip)
  
(snip)


Is this the intended behaviour?
Or is it a bug?

 * I registered a problem with Bugzilla.
  * http://bugs.clusterlabs.org/show_bug.cgi?id=5079
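
For reference, this is how we look at which actions a given pe-input would 
schedule; a sketch, assuming the -D/--save-dotfile option of ptest/crm_simulate is 
available:

  ptest -x pe-input-1.bz2 -D pe-input-1.dot -VVV
  grep vipCheck2 pe-input-1.dot   # actions that will not actually be executed are styled differently (e.g. red) in the graph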

Best Regards,
Hideo Yamauchi.

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[Pacemaker] [Problem] Order which combined a master with clone is invalid.

2012-07-19 Thread renayama19661014
Hi All,

We confirmed movement of order which combined a master with clone.
We performed it by a very simple combination.

Step1) We modify the Dummy resource agent so that start always fails.

(snip)
dummy_start() {
    return $OCF_ERR_GENERIC    # injected failure: start always fails
    dummy_monitor
(snip)

Step2) We start one node and send cib. 


However, the master/slave resource is started even though the clone fails to start,
and it ends up in the Slave state.


Last updated: Fri Jul 20 15:36:10 2012
Stack: Heartbeat
Current DC: NONE
1 Nodes configured, unknown expected votes
2 Resources configured.


Online: [ drbd1 ]

 Master/Slave Set: msDrPostgreSQLDB
 Slaves: [ drbd1 ]
 Stopped: [ prmDrPostgreSQLDB:1 ]

Migration summary:
* Node drbd1: 
   prmPingd:0: migration-threshold=1 fail-count=100

Failed actions:
prmPingd:0_start_0 (node=drbd1, call=4, rc=1, status=complete): unknown 
error


We also checked with Pacemaker 1.1.7 just to make sure.
The problem was the same.


Last updated: Fri Jul 20 22:53:22 2012
Last change: Fri Jul 20 22:53:09 2012 via cibadmin on fedora17-1
Stack: corosync
Current DC: fedora17-1 (1) - partition with quorum
Version: 1.1.7-e6922a70f742d3eab63d7e22f3ea0408b54b5dae
1 Nodes configured, unknown expected votes
4 Resources configured.


Online: [ fedora17-1 ]

 Master/Slave Set: msDrPostgreSQLDB [prmDrPostgreSQLDB]
 Slaves: [ fedora17-1 ]
 Stopped: [ prmDrPostgreSQLDB:1 ]

Migration summary:
* Node fedora17-1: 
   prmPingd:0: migration-threshold=1 fail-count=100

Failed actions:
prmPingd:0_start_0 (node=fedora17-1, call=14, rc=1, status=complete): 
unknown error


I think that this problem is similar to the bug that I reported before.

 * http://bugs.clusterlabs.org/show_bug.cgi?id=5075.

Is this problem a bug?
Or can it be avoided with different settings?

Best Regards,
Hideo Yamauchi.


___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Problem] Order which combined a master with clone is invalid.

2012-07-20 Thread renayama19661014
Hi All,

I registered hb_report file with Bugzilla.

 * http://bugs.clusterlabs.org/show_bug.cgi?id=5086

Best Regards,
Hideo Yamauchi.

--- On Fri, 2012/7/20, renayama19661...@ybb.ne.jp  
wrote:

> Hi All,
> 
> We confirmed movement of order which combined a master with clone.
> We performed it by a very simple combination.
> 
> Step1) We change it to produce start error in Dummy resource.
> 
> (snip)
> dummy_start() {
> return $OCF_ERR_GENERIC
>     dummy_monitor
> (snip)
> 
> Step2) We start one node and send cib. 
> 
> 
> However, as for the master, it is done start even if start of clone fails.
> And it becomes the Slave state.
> 
> 
> Last updated: Fri Jul 20 15:36:10 2012
> Stack: Heartbeat
> Current DC: NONE
> 1 Nodes configured, unknown expected votes
> 2 Resources configured.
> 
> 
> Online: [ drbd1 ]
> 
>  Master/Slave Set: msDrPostgreSQLDB
>      Slaves: [ drbd1 ]
>      Stopped: [ prmDrPostgreSQLDB:1 ]
> 
> Migration summary:
> * Node drbd1: 
>    prmPingd:0: migration-threshold=1 fail-count=100
> 
> Failed actions:
>     prmPingd:0_start_0 (node=drbd1, call=4, rc=1, status=complete): unknown 
> error
> 
> 
> We confirmed it just to make sure in Pacemaker1.1.7.
> However, the problem was the same.
> 
> 
> Last updated: Fri Jul 20 22:53:22 2012
> Last change: Fri Jul 20 22:53:09 2012 via cibadmin on fedora17-1
> Stack: corosync
> Current DC: fedora17-1 (1) - partition with quorum
> Version: 1.1.7-e6922a70f742d3eab63d7e22f3ea0408b54b5dae
> 1 Nodes configured, unknown expected votes
> 4 Resources configured.
> 
> 
> Online: [ fedora17-1 ]
> 
>  Master/Slave Set: msDrPostgreSQLDB [prmDrPostgreSQLDB]
>      Slaves: [ fedora17-1 ]
>      Stopped: [ prmDrPostgreSQLDB:1 ]
> 
> Migration summary:
> * Node fedora17-1: 
>    prmPingd:0: migration-threshold=1 fail-count=100
> 
> Failed actions:
>     prmPingd:0_start_0 (node=fedora17-1, call=14, rc=1, status=complete): 
> unknown error
> 
> 
> I think that this problem is similar to the bug that I reported before.
> 
>  * http://bugs.clusterlabs.org/show_bug.cgi?id=5075.
> 
> Is this problem a bug?
> Or can we be improved by setting?
> 
> Best Regards,
> Hideo Yamauchi.
> 
> 
> ___
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
> 

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Problem] Order which combined a master with clone is invalid.

2012-07-22 Thread renayama19661014
Hi David,

Thank you for comments.

> http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/ch06s03s02.html

I confirmed it in INFINITY.

(snip)
  <!-- reconstructed from the copy quoted later in this thread; the constraint ids were stripped from the archived mail -->
  <rsc_colocation id="..." rsc="msDrPostgreSQLDB" rsc-role="Master" score="INFINITY" with-rsc="clnPingd"/>
  <rsc_order first="clnPingd" id="..." symmetrical="false" then="msDrPostgreSQLDB"/>
(snip)


When only the Master node was started, the resource was controlled correctly.

[root@drbd1 ~]# crm_mon -1 -f 

Last updated: Mon Jul 23 08:24:38 2012
Stack: Heartbeat
Current DC: NONE
1 Nodes configured, unknown expected votes
2 Resources configured.


Online: [ drbd1 ]


Migration summary:
* Node drbd1: 
   prmPingd:0: migration-threshold=1 fail-count=100

Failed actions:
prmPingd:0_start_0 (node=drbd1, call=4, rc=1, status=complete): unknown 
error


However, the problem occurs when I load the cib after the Slave node has started as well.


Last updated: Mon Jul 23 08:35:41 2012
Stack: Heartbeat
Current DC: drbd2 (6d4b04de-12c0-499a-b388-febba50eaec2) - partition with quorum
Version: 1.0.12-unknown
2 Nodes configured, unknown expected votes
2 Resources configured.


Online: [ drbd1 drbd2 ]

 Master/Slave Set: msDrPostgreSQLDB
 Masters: [ drbd2 ]
 Slaves: [ drbd1 ]  ---> started and remains in the Slave state.
 Clone Set: clnPingd
 Started: [ drbd2 ]
 Stopped: [ prmPingd:0 ]

Migration summary:
* Node drbd1: 
   prmPingd:0: migration-threshold=1 fail-count=100
* Node drbd2: 

Failed actions:
prmPingd:0_start_0 (node=drbd1, call=4, rc=1, status=complete): unknown 
error

Best Regards,
Hideo Yamauchi.



--- On Sat, 2012/7/21, David Vossel  wrote:

> 
> 
> - Original Message -
> > From: renayama19661...@ybb.ne.jp
> > To: "PaceMaker-ML" 
> > Sent: Friday, July 20, 2012 1:39:51 AM
> > Subject: [Pacemaker] [Problem] Order which combined a master with clone 
> > is    invalid.
> > 
> > Hi All,
> > 
> > We confirmed movement of order which combined a master with clone.
> > We performed it by a very simple combination.
> > 
> > Step1) We change it to produce start error in Dummy resource.
> > 
> > (snip)
> > dummy_start() {
> > return $OCF_ERR_GENERIC
> >     dummy_monitor
> > (snip)
> > 
> > Step2) We start one node and send cib.
> > 
> > 
> > However, as for the master, it is done start even if start of clone
> > fails.
> > And it becomes the Slave state.
> 
> Not a bug,  You are using advisory ordering in your order constraint.
> 
> http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/ch06s03s02.html
> 
> > 
> > 
> > Last updated: Fri Jul 20 15:36:10 2012
> > Stack: Heartbeat
> > Current DC: NONE
> > 1 Nodes configured, unknown expected votes
> > 2 Resources configured.
> > 
> > 
> > Online: [ drbd1 ]
> > 
> >  Master/Slave Set: msDrPostgreSQLDB
> >      Slaves: [ drbd1 ]
> >      Stopped: [ prmDrPostgreSQLDB:1 ]
> > 
> > Migration summary:
> > * Node drbd1:
> >    prmPingd:0: migration-threshold=1 fail-count=100
> > 
> > Failed actions:
> >     prmPingd:0_start_0 (node=drbd1, call=4, rc=1, status=complete):
> >     unknown error
> > 
> > 
> > We confirmed it just to make sure in Pacemaker1.1.7.
> > However, the problem was the same.
> > 
> > 
> > Last updated: Fri Jul 20 22:53:22 2012
> > Last change: Fri Jul 20 22:53:09 2012 via cibadmin on fedora17-1
> > Stack: corosync
> > Current DC: fedora17-1 (1) - partition with quorum
> > Version: 1.1.7-e6922a70f742d3eab63d7e22f3ea0408b54b5dae
> > 1 Nodes configured, unknown expected votes
> > 4 Resources configured.
> > 
> > 
> > Online: [ fedora17-1 ]
> > 
> >  Master/Slave Set: msDrPostgreSQLDB [prmDrPostgreSQLDB]
> >      Slaves: [ fedora17-1 ]
> >      Stopped: [ prmDrPostgreSQLDB:1 ]
> > 
> > Migration summary:
> > * Node fedora17-1:
> >    prmPingd:0: migration-threshold=1 fail-count=100
> > 
> > Failed actions:
> >     prmPingd:0_start_0 (node=fedora17-1, call=14, rc=1,
> >     status=complete): unknown error
> > 
> > 
> > I think that this problem is similar to the bug that I reported
> > before.
> > 
> >  * http://bugs.clusterlabs.org/show_bug.cgi?id=5075.
> > 
> > Is this problem a bug?
> > Or can we be improved by setting?
> 
> see advisory ordering
> 
> > 
> > Best Regards,
> > Hideo Yamauchi.
> > 
> > 
> > ___
> > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> > 
> > Project Home: http://www.clusterlabs.org
> > Getting started:
> > http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > Bugs: http://bugs.clusterlabs.org
> > 
> 

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Problem] Order which combined a master with clone is invalid.

2012-07-29 Thread renayama19661014
Hi Andrew,

Thank you for your comments.

> > Online: [ drbd1 drbd2 ]
> >
> >  Master/Slave Set: msDrPostgreSQLDB
> >  Masters: [ drbd2 ]
> >  Slaves: [ drbd1 ] ---> Started and Status 
> > Slave.
> 
> Yep, looks like a bug.  I'll follow up on the bugzilla.

I discussed this with David on Bugzilla.

And I confirmed that the following two methods work well.

The first method)
 * Set a colocation between clnPingd and msDrPostgreSQLDB.

The second method)
 * Set the interleave option on clnPingd.

Is there anything wrong with these two methods?
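
For reference, a minimal crm shell sketch of the two methods, using the resource names from this thread (the constraint id and score are my own illustration, not the original configuration):

# Method 1: colocate msDrPostgreSQLDB with clnPingd
colocation col-db-with-pingd inf: msDrPostgreSQLDB clnPingd

# Method 2: make the pingd clone interleaved
clone clnPingd prmPingd \
        meta interleave="true"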

If you still suspect a bug, please add a comment in Bugzilla.

Many Thanks,
Hideo Yamauchi.


--- On Mon, 2012/7/30, Andrew Beekhof  wrote:

> On Mon, Jul 23, 2012 at 9:43 AM,   wrote:
> > Hi David,
> >
> > Thank you for comments.
> >
> >> http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/ch06s03s02.html
> >
> > I confirmed it in INFINITY.
> >
> > (snip)
> >        >rsc-role="Master" score="INFINITY" with-rsc="clnPingd"/>
> >        >symmetrical="false" then="msDrPostgreSQLDB"/>
> > (snip)
> >
> >
> > By the start only for Master nodes, it was controlled well.
> >
> > [root@drbd1 ~]# crm_mon -1 -f
> > 
> > Last updated: Mon Jul 23 08:24:38 2012
> > Stack: Heartbeat
> > Current DC: NONE
> > 1 Nodes configured, unknown expected votes
> > 2 Resources configured.
> > 
> >
> > Online: [ drbd1 ]
> >
> >
> > Migration summary:
> > * Node drbd1:
> >    prmPingd:0: migration-threshold=1 fail-count=100
> >
> > Failed actions:
> >     prmPingd:0_start_0 (node=drbd1, call=4, rc=1, status=complete): unknown 
> >error
> >
> >
> > However, the problem occurs when I send cib after the Slave node started 
> > together.
> >
> > 
> > Last updated: Mon Jul 23 08:35:41 2012
> > Stack: Heartbeat
> > Current DC: drbd2 (6d4b04de-12c0-499a-b388-febba50eaec2) - partition with 
> > quorum
> > Version: 1.0.12-unknown
> > 2 Nodes configured, unknown expected votes
> > 2 Resources configured.
> > 
> >
> > Online: [ drbd1 drbd2 ]
> >
> >  Master/Slave Set: msDrPostgreSQLDB
> >      Masters: [ drbd2 ]
> >      Slaves: [ drbd1 ] ---> Started and Status 
> >Slave.
> 
> Yep, looks like a bug.  I'll follow up on the bugzilla.
> 
> >  Clone Set: clnPingd
> >      Started: [ drbd2 ]
> >      Stopped: [ prmPingd:0 ]
> >
> > Migration summary:
> > * Node drbd1:
> >    prmPingd:0: migration-threshold=1 fail-count=100
> > * Node drbd2:
> >
> > Failed actions:
> >     prmPingd:0_start_0 (node=drbd1, call=4, rc=1, status=complete): unknown 
> >error
> >
> > Best Regards,
> > Hideo Yamauchi.
> >
> >
> >
> > --- On Sat, 2012/7/21, David Vossel  wrote:
> >
> >>
> >>
> >> - Original Message -
> >> > From: renayama19661...@ybb.ne.jp
> >> > To: "PaceMaker-ML" 
> >> > Sent: Friday, July 20, 2012 1:39:51 AM
> >> > Subject: [Pacemaker] [Problem] Order which combined a master with clone 
> >> > is    invalid.
> >> >
> >> > Hi All,
> >> >
> >> > We confirmed movement of order which combined a master with clone.
> >> > We performed it by a very simple combination.
> >> >
> >> > Step1) We change it to produce start error in Dummy resource.
> >> >
> >> > (snip)
> >> > dummy_start() {
> >> > return $OCF_ERR_GENERIC
> >> >     dummy_monitor
> >> > (snip)
> >> >
> >> > Step2) We start one node and send cib.
> >> >
> >> >
> >> > However, as for the master, it is done start even if start of clone
> >> > fails.
> >> > And it becomes the Slave state.
> >>
> >> Not a bug,  You are using advisory ordering in your order constraint.
> >>
> >> http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/ch06s03s02.html
> >>
> >> >
> >> > 
> >> > Last updated: Fri Jul 20 15:36:10 2012
> >> > Stack: Heartbeat
> >> > Current DC: NONE
> >> > 1 Nodes configured, unknown expected votes
> >> > 2 Resources configured.
> >> > 
> >> >
> >> > Online: [ drbd1 ]
> >> >
> >> >  Master/Slave Set: msDrPostgreSQLDB
> >> >      Slaves: [ drbd1 ]
> >> >      Stopped: [ prmDrPostgreSQLDB:1 ]
> >> >
> >> > Migration summary:
> >> > * Node drbd1:
> >> >    prmPingd:0: migration-threshold=1 fail-count=100
> >> >
> >> > Failed actions:
> >> >     prmPingd:0_start_0 (node=drbd1, call=4, rc=1, status=complete):
> >> >     unknown error
> >> >
> >> >
> >> > We confirmed it just to make sure in Pacemaker1.1.7.
> >> > However, the problem was the same.
> >> >
> >> > 
> >> > Last updated: Fri Jul 20 22:53:22 2012
> >> > Last change: Fri Jul 20 22:53:09 2012 via cibadmin on fedora17-1
> >> > Stack: corosync
> >> > Current DC: fedora17-1 (1) - partition with quorum
> >> > Version: 1.1.7-e6922a70f742d3eab63d7e22f3ea0408b54b5dae
> >> > 1 Nodes configured, unknown expected votes
> >> > 4 Resources configured.
> >> > 
> >> >
> >> > Online: [ fedora17-1 ]
> >> >
> >> >  Master/Slave Set: msDrPostgreSQLDB [prmDrPostgreSQLDB]
> >> >      Slaves: [ fedora17-1 ]
> >> >      Stopped: [ prmDrPostgreSQ

Re: [Pacemaker] [Patch] When the ID of the resource changes, influence may be reflected on an application of colocation.

2012-08-05 Thread renayama19661014
Hi All,

I registered this problem with Bugzilla.

 * http://bugs.clusterlabs.org/show_bug.cgi?id=5089

Best Regards,
Hideo Yamauchi.

--- On Wed, 2012/8/1, renayama19661...@ybb.ne.jp  
wrote:

> Hi All,
> 
> When the ID of the resource changed, we confirmed that control of different 
> colocation was carried out.
> The control of the resource may vary according to this problem only in the 
> difference in ID name of the resource.
> 
> 
> The first pattern) The state transition that we expect is made. (pe-input-423)
> 
> [root@drbd2 trac2114]# ptest -x pe-input-423 -VVV
> ptest[13220]: 2012/08/01_10:29:16 notice: unpack_config: On loss of CCM 
> Quorum: Ignore
> ptest[13220]: 2012/08/01_10:29:16 WARN: unpack_nodes: Blind faith: not 
> fencing unseen nodes
> ptest[13220]: 2012/08/01_10:29:16 WARN: unpack_rsc_op: Processing failed op 
> postgresql:0_monitor_9000 on 02-sl6: not running (7)
> ptest[13220]: 2012/08/01_10:29:16 notice: native_print: vipCheck        
> (ocf::pacemaker:Dummy): Started 02-sl6
> ptest[13220]: 2012/08/01_10:29:16 notice: native_print: vipCheckSupport 
> (ocf::pacemaker:Dummy): Started 02-sl6
> ptest[13220]: 2012/08/01_10:29:16 notice: group_print:  Resource Group: 
> master-group
> ptest[13220]: 2012/08/01_10:29:16 notice: native_print:      vip-master 
> (ocf::heartbeat:IPaddr2):       Started 02-sl6
> ptest[13220]: 2012/08/01_10:29:16 notice: native_print:      vip-rep    
> (ocf::heartbeat:IPaddr2):       Started 02-sl6
> ptest[13220]: 2012/08/01_10:29:16 notice: clone_print:  Master/Slave Set: 
> msPostgresql
> ptest[13220]: 2012/08/01_10:29:16 notice: native_print:      postgresql:0     
>   (ocf::heartbeat:pgsql): Slave 02-sl6 FAILED
> ptest[13220]: 2012/08/01_10:29:16 notice: short_print:      Slaves: [ 03-sl6 ]
> ptest[13220]: 2012/08/01_10:29:16 notice: clone_print:  Clone Set: clnDiskd1
> ptest[13220]: 2012/08/01_10:29:16 notice: short_print:      Started: [ 02-sl6 
> 03-sl6 ]
> ptest[13220]: 2012/08/01_10:29:16 notice: clone_print:  Clone Set: clnDiskd2
> ptest[13220]: 2012/08/01_10:29:16 notice: short_print:      Started: [ 02-sl6 
> 03-sl6 ]
> ptest[13220]: 2012/08/01_10:29:16 notice: clone_print:  Clone Set: clnPingd
> ptest[13220]: 2012/08/01_10:29:16 notice: short_print:      Started: [ 02-sl6 
> 03-sl6 ]
> ptest[13220]: 2012/08/01_10:29:16 WARN: common_apply_stickiness: Forcing 
> msPostgresql away from 02-sl6 after 1 failures (max=1)
> ptest[13220]: 2012/08/01_10:29:16 WARN: common_apply_stickiness: Forcing 
> msPostgresql away from 02-sl6 after 1 failures (max=1)
> ptest[13220]: 2012/08/01_10:29:16 notice: RecurringOp:  Start recurring 
> monitor (10s) for vip-master on 03-sl6
> ptest[13220]: 2012/08/01_10:29:16 notice: RecurringOp:  Start recurring 
> monitor (10s) for vip-rep on 03-sl6
> ptest[13220]: 2012/08/01_10:29:16 notice: RecurringOp:  Start recurring 
> monitor (9s) for postgresql:1 on 03-sl6
> ptest[13220]: 2012/08/01_10:29:16 notice: RecurringOp:  Start recurring 
> monitor (9s) for postgresql:1 on 03-sl6
> ptest[13220]: 2012/08/01_10:29:16 notice: LogActions: Move    resource 
> vipCheck (Started 02-sl6 -> 03-sl6)
> ptest[13220]: 2012/08/01_10:29:16 notice: LogActions: Move    resource 
> vipCheckSupport  (Started 02-sl6 -> 03-sl6)
> ptest[13220]: 2012/08/01_10:29:16 notice: LogActions: Move    resource 
> vip-master       (Started 02-sl6 -> 03-sl6)
> ptest[13220]: 2012/08/01_10:29:16 notice: LogActions: Move    resource 
> vip-rep  (Started 02-sl6 -> 03-sl6)
> ptest[13220]: 2012/08/01_10:29:16 notice: LogActions: Stop    resource 
> postgresql:0     (02-sl6)
> ptest[13220]: 2012/08/01_10:29:16 notice: LogActions: Promote postgresql:1    
>   (Slave -> Master 03-sl6)
> ptest[13220]: 2012/08/01_10:29:16 notice: LogActions: Leave   resource 
> prmDiskd1:0      (Started 02-sl6)
> ptest[13220]: 2012/08/01_10:29:16 notice: LogActions: Leave   resource 
> prmDiskd1:1      (Started 03-sl6)
> ptest[13220]: 2012/08/01_10:29:16 notice: LogActions: Leave   resource 
> prmDiskd2:0      (Started 02-sl6)
> ptest[13220]: 2012/08/01_10:29:16 notice: LogActions: Leave   resource 
> prmDiskd2:1      (Started 03-sl6)
> ptest[13220]: 2012/08/01_10:29:16 notice: LogActions: Leave   resource 
> pingCheck:0      (Started 02-sl6)
> ptest[13220]: 2012/08/01_10:29:16 notice: LogActions: Leave   resource 
> pingCheck:1      (Started 03-sl6)
> 
> 
> The second pattern) Different state transition is made only by resource ID 
> being different.(pe-input-396)
>  * I changed a resource name into gtmproxy1 from vipCheck.
>  * I changed a resource name into gtmproxy1Support from vipCheckSupport.
> 
> [root@drbd2 trac2114]# ptest -x pe-input-396 -VVV
> ptest[13221]: 2012/08/01_10:29:36 notice: unpack_config: On loss of CCM 
> Quorum: Ignore
> ptest[13221]: 2012/08/01_10:29:36 WARN: unpack_nodes: Blind faith: not 
> fencing unseen nodes
> ptest[13221]: 2012/08/01_10:29:36 WARN: unpack_rsc_op: Processing failed op 
> datanode1:0_monitor_9000 on 02-sl6: not running (7)
> ptest[13221]: 2012

Re: [Pacemaker] [Patch] When the ID of the resource changes, influence may be reflected on an application of colocation.

2012-08-07 Thread renayama19661014
Hi Andrew

Thank you for comments.

> The problem with this approach is that ordering of the constraints in
> the cib is not preserved between the nodes.
> I will follow up further on the bugzilla.

All right!

Many Thanks,
Hideo Yamauchi.


--- On Tue, 2012/8/7, Andrew Beekhof  wrote:

> The problem with this approach is that ordering of the constraints in
> the cib is not preserved between the nodes.
> I will follow up further on the bugzilla.
> 
> On Wed, Aug 1, 2012 at 11:35 AM,   wrote:
> > Hi All,
> >
> > When the ID of the resource changed, we confirmed that control of different 
> > colocation was carried out.
> > The control of the resource may vary according to this problem only in the 
> > difference in ID name of the resource.
> >
> >
> > The first pattern) The state transition that we expect is made. 
> > (pe-input-423)
> >
> > [root@drbd2 trac2114]# ptest -x pe-input-423 -VVV
> > ptest[13220]: 2012/08/01_10:29:16 notice: unpack_config: On loss of CCM 
> > Quorum: Ignore
> > ptest[13220]: 2012/08/01_10:29:16 WARN: unpack_nodes: Blind faith: not 
> > fencing unseen nodes
> > ptest[13220]: 2012/08/01_10:29:16 WARN: unpack_rsc_op: Processing failed op 
> > postgresql:0_monitor_9000 on 02-sl6: not running (7)
> > ptest[13220]: 2012/08/01_10:29:16 notice: native_print: vipCheck        
> > (ocf::pacemaker:Dummy): Started 02-sl6
> > ptest[13220]: 2012/08/01_10:29:16 notice: native_print: vipCheckSupport 
> > (ocf::pacemaker:Dummy): Started 02-sl6
> > ptest[13220]: 2012/08/01_10:29:16 notice: group_print:  Resource Group: 
> > master-group
> > ptest[13220]: 2012/08/01_10:29:16 notice: native_print:      vip-master 
> > (ocf::heartbeat:IPaddr2):       Started 02-sl6
> > ptest[13220]: 2012/08/01_10:29:16 notice: native_print:      vip-rep    
> > (ocf::heartbeat:IPaddr2):       Started 02-sl6
> > ptest[13220]: 2012/08/01_10:29:16 notice: clone_print:  Master/Slave Set: 
> > msPostgresql
> > ptest[13220]: 2012/08/01_10:29:16 notice: native_print:      postgresql:0   
> >     (ocf::heartbeat:pgsql): Slave 02-sl6 FAILED
> > ptest[13220]: 2012/08/01_10:29:16 notice: short_print:      Slaves: [ 
> > 03-sl6 ]
> > ptest[13220]: 2012/08/01_10:29:16 notice: clone_print:  Clone Set: clnDiskd1
> > ptest[13220]: 2012/08/01_10:29:16 notice: short_print:      Started: [ 
> > 02-sl6 03-sl6 ]
> > ptest[13220]: 2012/08/01_10:29:16 notice: clone_print:  Clone Set: clnDiskd2
> > ptest[13220]: 2012/08/01_10:29:16 notice: short_print:      Started: [ 
> > 02-sl6 03-sl6 ]
> > ptest[13220]: 2012/08/01_10:29:16 notice: clone_print:  Clone Set: clnPingd
> > ptest[13220]: 2012/08/01_10:29:16 notice: short_print:      Started: [ 
> > 02-sl6 03-sl6 ]
> > ptest[13220]: 2012/08/01_10:29:16 WARN: common_apply_stickiness: Forcing 
> > msPostgresql away from 02-sl6 after 1 failures (max=1)
> > ptest[13220]: 2012/08/01_10:29:16 WARN: common_apply_stickiness: Forcing 
> > msPostgresql away from 02-sl6 after 1 failures (max=1)
> > ptest[13220]: 2012/08/01_10:29:16 notice: RecurringOp:  Start recurring 
> > monitor (10s) for vip-master on 03-sl6
> > ptest[13220]: 2012/08/01_10:29:16 notice: RecurringOp:  Start recurring 
> > monitor (10s) for vip-rep on 03-sl6
> > ptest[13220]: 2012/08/01_10:29:16 notice: RecurringOp:  Start recurring 
> > monitor (9s) for postgresql:1 on 03-sl6
> > ptest[13220]: 2012/08/01_10:29:16 notice: RecurringOp:  Start recurring 
> > monitor (9s) for postgresql:1 on 03-sl6
> > ptest[13220]: 2012/08/01_10:29:16 notice: LogActions: Move    resource 
> > vipCheck (Started 02-sl6 -> 03-sl6)
> > ptest[13220]: 2012/08/01_10:29:16 notice: LogActions: Move    resource 
> > vipCheckSupport  (Started 02-sl6 -> 03-sl6)
> > ptest[13220]: 2012/08/01_10:29:16 notice: LogActions: Move    resource 
> > vip-master       (Started 02-sl6 -> 03-sl6)
> > ptest[13220]: 2012/08/01_10:29:16 notice: LogActions: Move    resource 
> > vip-rep  (Started 02-sl6 -> 03-sl6)
> > ptest[13220]: 2012/08/01_10:29:16 notice: LogActions: Stop    resource 
> > postgresql:0     (02-sl6)
> > ptest[13220]: 2012/08/01_10:29:16 notice: LogActions: Promote postgresql:1  
> >     (Slave -> Master 03-sl6)
> > ptest[13220]: 2012/08/01_10:29:16 notice: LogActions: Leave   resource 
> > prmDiskd1:0      (Started 02-sl6)
> > ptest[13220]: 2012/08/01_10:29:16 notice: LogActions: Leave   resource 
> > prmDiskd1:1      (Started 03-sl6)
> > ptest[13220]: 2012/08/01_10:29:16 notice: LogActions: Leave   resource 
> > prmDiskd2:0      (Started 02-sl6)
> > ptest[13220]: 2012/08/01_10:29:16 notice: LogActions: Leave   resource 
> > prmDiskd2:1      (Started 03-sl6)
> > ptest[13220]: 2012/08/01_10:29:16 notice: LogActions: Leave   resource 
> > pingCheck:0      (Started 02-sl6)
> > ptest[13220]: 2012/08/01_10:29:16 notice: LogActions: Leave   resource 
> > pingCheck:1      (Started 03-sl6)
> >
> >
> > The second pattern) Different state transition is made only by resource ID 
> > being different.(pe-input-396)
> >  * I changed a resource name into gtmproxy1 from vipCheck.
>

Re: [Pacemaker] [Problem] Order which combined a master with clone is invalid.

2012-08-07 Thread renayama19661014
Hi Andrew,

> > The first method)
> >  * Set colocation in clnPingd and msDrPostgreSQLDB.
> >
> > The second method)
> >  * Set interleave option in clnPingd.
> >
> > Do my two methods include a mistake?
> 
> No.
> 
> Looking closer, the initial constraint says only that the Master must
> be on a node running clnPingd.
> Slaves are free to run anywhere :)

All right!

We are not using the "interleave" option at the moment.
We have decided to use the first method for the time being.
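
As a hedged illustration of the point above (the constraint ids are mine, not from the original cib): a colocation that names only the Master role constrains just the promoted instance, while one on the whole master/slave resource ties the Slaves to clnPingd as well.

# only the Master instance must follow clnPingd; Slaves may run anywhere
colocation col-master-only inf: msDrPostgreSQLDB:Master clnPingd

# the whole master/slave resource (Master and Slaves) must follow clnPingd
colocation col-whole-ms inf: msDrPostgreSQLDB clnPingd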

Many Thanks,
Hideo Yamauchi.


--- On Tue, 2012/8/7, Andrew Beekhof  wrote:

> On Mon, Jul 30, 2012 at 2:08 PM,   wrote:
> > Hi Andrew,
> >
> > Thank you or commets.
> >
> >> > Online: [ drbd1 drbd2 ]
> >> >
> >> >  Master/Slave Set: msDrPostgreSQLDB
> >> >      Masters: [ drbd2 ]
> >> >      Slaves: [ drbd1 ] ---> Started and 
> >> >Status Slave.
> >>
> >> Yep, looks like a bug.  I'll follow up on the bugzilla.
> >
> > I talked with David by Bugzilla.
> >
> > And I confirmed that I worked by two next methods well.
> >
> > The first method)
> >  * Set colocation in clnPingd and msDrPostgreSQLDB.
> >
> > The second method)
> >  * Set interleave option in clnPingd.
> >
> > Do my two methods include a mistake?
> 
> No.
> 
> Looking closer, the initial constraint says only that the Master must
> be on a node running clnPingd.
> Slaves are free to run anywhere :)
> 
> >
> > If you suspect the bug, please write in comment at Bugzilla.
> >
> > Many Thanks,
> > Hideo Yamauchi.
> >
> >
> > --- On Mon, 2012/7/30, Andrew Beekhof  wrote:
> >
> >> On Mon, Jul 23, 2012 at 9:43 AM,   wrote:
> >> > Hi David,
> >> >
> >> > Thank you for comments.
> >> >
> >> >> http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/ch06s03s02.html
> >> >
> >> > I confirmed it in INFINITY.
> >> >
> >> > (snip)
> >> >        >> >rsc-role="Master" score="INFINITY" with-rsc="clnPingd"/>
> >> >        >> >symmetrical="false" then="msDrPostgreSQLDB"/>
> >> > (snip)
> >> >
> >> >
> >> > By the start only for Master nodes, it was controlled well.
> >> >
> >> > [root@drbd1 ~]# crm_mon -1 -f
> >> > 
> >> > Last updated: Mon Jul 23 08:24:38 2012
> >> > Stack: Heartbeat
> >> > Current DC: NONE
> >> > 1 Nodes configured, unknown expected votes
> >> > 2 Resources configured.
> >> > 
> >> >
> >> > Online: [ drbd1 ]
> >> >
> >> >
> >> > Migration summary:
> >> > * Node drbd1:
> >> >    prmPingd:0: migration-threshold=1 fail-count=100
> >> >
> >> > Failed actions:
> >> >     prmPingd:0_start_0 (node=drbd1, call=4, rc=1, status=complete): 
> >> >unknown error
> >> >
> >> >
> >> > However, the problem occurs when I send cib after the Slave node started 
> >> > together.
> >> >
> >> > 
> >> > Last updated: Mon Jul 23 08:35:41 2012
> >> > Stack: Heartbeat
> >> > Current DC: drbd2 (6d4b04de-12c0-499a-b388-febba50eaec2) - partition 
> >> > with quorum
> >> > Version: 1.0.12-unknown
> >> > 2 Nodes configured, unknown expected votes
> >> > 2 Resources configured.
> >> > 
> >> >
> >> > Online: [ drbd1 drbd2 ]
> >> >
> >> >  Master/Slave Set: msDrPostgreSQLDB
> >> >      Masters: [ drbd2 ]
> >> >      Slaves: [ drbd1 ] ---> Started and 
> >> >Status Slave.
> >>
> >> Yep, looks like a bug.  I'll follow up on the bugzilla.
> >>
> >> >  Clone Set: clnPingd
> >> >      Started: [ drbd2 ]
> >> >      Stopped: [ prmPingd:0 ]
> >> >
> >> > Migration summary:
> >> > * Node drbd1:
> >> >    prmPingd:0: migration-threshold=1 fail-count=100
> >> > * Node drbd2:
> >> >
> >> > Failed actions:
> >> >     prmPingd:0_start_0 (node=drbd1, call=4, rc=1, status=complete): 
> >> >unknown error
> >> >
> >> > Best Regards,
> >> > Hideo Yamauchi.
> >> >
> >> >
> >> >
> >> > --- On Sat, 2012/7/21, David Vossel  wrote:
> >> >
> >> >>
> >> >>
> >> >> - Original Message -
> >> >> > From: renayama19661...@ybb.ne.jp
> >> >> > To: "PaceMaker-ML" 
> >> >> > Sent: Friday, July 20, 2012 1:39:51 AM
> >> >> > Subject: [Pacemaker] [Problem] Order which combined a master with 
> >> >> > clone is    invalid.
> >> >> >
> >> >> > Hi All,
> >> >> >
> >> >> > We confirmed movement of order which combined a master with clone.
> >> >> > We performed it by a very simple combination.
> >> >> >
> >> >> > Step1) We change it to produce start error in Dummy resource.
> >> >> >
> >> >> > (snip)
> >> >> > dummy_start() {
> >> >> > return $OCF_ERR_GENERIC
> >> >> >     dummy_monitor
> >> >> > (snip)
> >> >> >
> >> >> > Step2) We start one node and send cib.
> >> >> >
> >> >> >
> >> >> > However, as for the master, it is done start even if start of clone
> >> >> > fails.
> >> >> > And it becomes the Slave state.
> >> >>
> >> >> Not a bug,  You are using advisory ordering in your order constraint.
> >> >>
> >> >> http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/ch06s03s02.html
> >> >>
> >> >> >
> >> >> > 
> >> >> > Last updated: Fri Jul 20 15:36:10 2012
> >> >> > Stack: Heartbeat
> >> 

Re: [Pacemaker] [Problem] Though order limitation exists, start which is not carried out is made.

2012-08-15 Thread renayama19661014
Hi Andrew,

Thank you for comments.

> > Is this specifications?
> > Or is it a bug?
> >
> >  * I registered a problem with Bugzilla.
> >   * http://bugs.clusterlabs.org/show_bug.cgi?id=5079
> 
> Excellent.  I'll follow up there (likewise for the other bugs you've
> sent to the list recently :-)

I understand that this is the specified behavior.

If you judge that a correction is warranted, please revise it so that the useless state transition is not generated,
because the useless transition may confuse us.

Many thanks,
Hideo Yamauchi.

--- On Wed, 2012/8/15, Andrew Beekhof  wrote:

> On Fri, Jul 6, 2012 at 11:38 AM,   wrote:
> > Hi All,
> >
> > We performed setting avoiding the next problem.
> >
> >  * http://www.gossamer-threads.com/lists/linuxha/pacemaker/80250
> >   -> http://bugs.clusterlabs.org/show_bug.cgi?id=5070
> >
> >  * http://www.gossamer-threads.com/lists/linuxha/pacemaker/80549
> >   -> http://bugs.clusterlabs.org/show_bug.cgi?id=5075
> >
> > The resource finally started in Master node definitely.
> >
> > 
> > 
> > Last updated: Fri Jul  6 10:19:42 2012
> > Stack: Heartbeat
> > Current DC: sr1 (31411278-b503-4e17-9cf1-38638babb4ff) - partition with 
> > quorum
> > Version: 1.0.12-unknown
> > 1 Nodes configured, unknown expected votes
> > 7 Resources configured.
> > 
> >
> > Online: [ sr1 ]
> >
> >  vipCheck       (ocf::heartbeat:VIPcheck):      Started sr1
> >  vipCheck2      (ocf::pacemaker:Dummy): Started sr1
> >  vip-slave      (ocf::heartbeat:IPaddr2):       Started sr1
> >  Resource Group: master-group
> >      vip-master (ocf::heartbeat:IPaddr2):       Started sr1
> >      vip-rep    (ocf::heartbeat:IPaddr2):       Started sr1
> >  Master/Slave Set: msPostgresql
> >      Masters: [ sr1 ]
> >      Stopped: [ postgresql:1 ]
> >  Clone Set: clnPingd
> >      Started: [ sr1 ]
> >  Clone Set: clnDiskd1
> >      Started: [ sr1 ]
> > 
> >
> >
> > But start of vipCheck2 is made idly when we watch progress on the way.
> >  * Start of this vipCheck2 is really state transition of the red limit that 
> >is not carried out.
> >  * It exists in "pe-input-1.bz2".
> >
> > 
> > (snip)
> > Jul  6 09:52:52 sr1 pengine: [2771]: notice: RecurringOp:  Start recurring 
> > monitor (10s) for prmPingd:0 on sr1
> > Jul  6 09:52:52 sr1 pengine: [2771]: notice: RecurringOp:  Start recurring 
> > monitor (10s) for prmDiskd1:0 on sr1
> > Jul  6 09:52:52 sr1 pengine: [2771]: notice: LogActions: Leave   resource 
> > vipCheck      (Stopped)
> > Jul  6 09:52:52 sr1 pengine: [2771]: notice: LogActions: Start   vipCheck2  
> >     (sr1)
> > Jul  6 09:52:52 sr1 pengine: [2771]: notice: LogActions: Leave   resource 
> > vip-slave     (Stopped)
> > Jul  6 09:52:52 sr1 pengine: [2771]: notice: LogActions: Leave   resource 
> > vip-master    (Stopped)
> > Jul  6 09:52:52 sr1 pengine: [2771]: notice: LogActions: Leave   resource 
> > vip-rep       (Stopped)
> > Jul  6 09:52:52 sr1 pengine: [2771]: notice: LogActions: Leave   resource 
> > postgresql:0  (Stopped)
> > Jul  6 09:52:52 sr1 pengine: [2771]: notice: LogActions: Leave   resource 
> > postgresql:1  (Stopped)
> > Jul  6 09:52:52 sr1 pengine: [2771]: notice: LogActions: 
> > Start   prmPingd:0     (sr1)
> > Jul  6 09:52:52 sr1 pengine: [2771]: notice: LogActions: 
> > Start   prmDiskd1:0    (sr1)
> > Jul  6 09:52:52 sr1 crmd: [2766]: info: do_state_transition: State 
> > transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS 
> > cause=C_IPC_MESSAGE origin=handle_response ]
> > Jul  6 09:52:52 sr1 crmd: [2766]: info: unpack_graph: Unpacked transition 
> > 1: 20 actions in 20 synapses
> > Jul  6 09:52:52 sr1 crmd: [2766]: info: do_te_invoke: Processing graph 1 
> > (ref=pe_calc-dc-1341535972-14) derived from /var/lib/pengine/pe-input-1.bz2
> > (snip)
> > 
> >
> >
> > Because there is limitation of vipCheck and vipCheck2, we think that the 
> > state transition of start of this useless vipCheck2 should not be made.
> >
> > (snip)
> >        >symmetrical="false" then="vipCheck2"/>
> > (snip)
> >
> >
> > Is this specifications?
> > Or is it a bug?
> >
> >  * I registered a problem with Bugzilla.
> >   * http://bugs.clusterlabs.org/show_bug.cgi?id=5079
> 
> Excellent.  I'll follow up there (likewise for the other bugs you've
> sent to the list recently :-)
> 

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch

[Pacemaker] [Question] About the stop order at the time of the Probe error.

2012-08-21 Thread renayama19661014
Hi All,

We found a problem that occurs at the time of a probe error.

The resource configuration is as simple as the following.


Last updated: Wed Aug 22 15:19:50 2012
Stack: Heartbeat
Current DC: drbd1 (6081ac99-d941-40b9-a4a3-9f996ff291c0) - partition with quorum
Version: 1.0.12-c6770b8
1 Nodes configured, unknown expected votes
1 Resources configured.


Online: [ drbd1 ]

 Resource Group: grpTest
 resource1  (ocf::pacemaker:Dummy): Started drbd1
 resource2  (ocf::pacemaker:Dummy): Started drbd1
 resource3  (ocf::pacemaker:Dummy): Started drbd1
 resource4  (ocf::pacemaker:Dummy): Started drbd1

Node Attributes:
* Node drbd1:

Migration summary:
* Node drbd1: 


Depending on which resources the probe error occurs for, the resources are not stopped in reverse order.

I confirmed it with the following procedure.

Step 1) Make resource2 and resource4 appear already started.

[root@drbd1 ~]# touch /var/run/Dummy-resource2.state
[root@drbd1 ~]# touch /var/run/Dummy-resource4.state

Step 2) Start a node and send cib.

Step 3) Resource2 and resource4 are stopped, but not in reverse order.

(snip)
Aug 22 15:19:47 drbd1 pengine: [32722]: notice: group_print:  Resource Group: 
grpTest
Aug 22 15:19:47 drbd1 pengine: [32722]: notice: native_print:  
resource1#011(ocf::pacemaker:Dummy):#011Stopped 
Aug 22 15:19:47 drbd1 pengine: [32722]: notice: native_print:  
resource2#011(ocf::pacemaker:Dummy):#011Started drbd1
Aug 22 15:19:47 drbd1 pengine: [32722]: notice: native_print:  
resource3#011(ocf::pacemaker:Dummy):#011Stopped 
Aug 22 15:19:47 drbd1 pengine: [32722]: notice: native_print:  
resource4#011(ocf::pacemaker:Dummy):#011Started drbd1
(snip)
Aug 22 15:19:47 drbd1 crmd: [32719]: info: te_rsc_command: Initiating action 6: 
stop resource2_stop_0 on drbd1 (local)
Aug 22 15:19:47 drbd1 crmd: [32719]: info: do_lrm_rsc_op: Performing 
key=6:2:0:5c924067-0d20-48fd-9772-88e530661270 op=resource2_stop_0 )
Aug 22 15:19:47 drbd1 lrmd: [32716]: info: rsc:resource2 stop[6] (pid 32745)
Aug 22 15:19:47 drbd1 crmd: [32719]: info: te_rsc_command: Initiating action 
11: stop resource4_stop_0 on drbd1 (local)
Aug 22 15:19:47 drbd1 crmd: [32719]: info: do_lrm_rsc_op: Performing 
key=11:2:0:5c924067-0d20-48fd-9772-88e530661270 op=resource4_stop_0 )
Aug 22 15:19:47 drbd1 lrmd: [32716]: info: rsc:resource4 stop[7] (pid 32746)
Aug 22 15:19:47 drbd1 lrmd: [32716]: info: operation stop[6] on resource2 for 
client 32719: pid 32745 exited with return code 0
(snip)


I know that the ordering within the group is the cause of this stop order.

In this case, our users definitely want the resources stopped in reverse order:

 * resource4_stop -> resource2_stop

The stop order is important for our users' resources.


I have the following questions.

Question 1) Is there a setting in cib.xml that avoids this problem?

Question 2) Does this problem also occur in Pacemaker 1.1?

Question 3) I added the following order constraint.






Adding this order constraint seems to solve the problem.
Is adding such an order also an acceptable way to solve it?
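
As a purely illustrative sketch (the constraint id is only an example, and this is not necessarily the exact XML I added), such an order could be written in crm shell syntax as:

# a mandatory, symmetrical order: resource2 starts before resource4,
# and therefore resource4 stops before resource2
order o-r2-then-r4 inf: resource2 resource4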


Best Regards,
Hideo Yamauchi.


___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Question] About the stop order at the time of the Probe error.

2012-09-04 Thread renayama19661014
Hi Andrew, 

Thank you for comments.

> > Question 1) Is there right setting in cib.xml to evade this problem?
> 
> No.
> 
> >
> > Question 2) In Pacemaker1.1, does this problem occur?
> 
> Yes.  I'll see what I can do.
> 
> >
> > Question 3) I added following order.
> >
> >
> > 
> > 
> > 
> >
> > And the addition of this order seems to solve a problem.
> > Is the addition of order right as one method of the solution, 
> > too?
> 
> Really the PE should handle this implicitly, without need for
> additional constraints.

All right.

I wish this problem is solved.
I registered a demand with Bugzilla about this problem.
 * http://bugs.clusterlabs.org/show_bug.cgi?id=5101

Best Regards,
Hideo Yamauchi.



--- On Wed, 2012/9/5, Andrew Beekhof  wrote:

> On Wed, Aug 22, 2012 at 4:44 PM,   wrote:
> > Hi All,
> >
> > We found a problem at the time of Porobe error.
> >
> > It is the following simple resource constitution.
> >
> > 
> > Last updated: Wed Aug 22 15:19:50 2012
> > Stack: Heartbeat
> > Current DC: drbd1 (6081ac99-d941-40b9-a4a3-9f996ff291c0) - partition with 
> > quorum
> > Version: 1.0.12-c6770b8
> > 1 Nodes configured, unknown expected votes
> > 1 Resources configured.
> > 
> >
> > Online: [ drbd1 ]
> >
> >  Resource Group: grpTest
> >      resource1  (ocf::pacemaker:Dummy): Started drbd1
> >      resource2  (ocf::pacemaker:Dummy): Started drbd1
> >      resource3  (ocf::pacemaker:Dummy): Started drbd1
> >      resource4  (ocf::pacemaker:Dummy): Started drbd1
> >
> > Node Attributes:
> > * Node drbd1:
> >
> > Migration summary:
> > * Node drbd1:
> >
> >
> > Depending on the resource that the Probe error occurs, the stop of the 
> > resource does not become the inverse order.
> >
> > I confirmed it in the next procedure.
> >
> > Step 1) Make resource2 and resource4 a starting state.
> >
> > [root@drbd1 ~]# touch /var/run/Dummy-resource2.state
> > [root@drbd1 ~]# touch /var/run/Dummy-resource4.state
> >
> > Step 2) Start a node and send cib.
> >
> > Step 3) Resource2 and resource3 stop, but are not inverse order.
> >
> > (snip)
> > Aug 22 15:19:47 drbd1 pengine: [32722]: notice: group_print:  Resource 
> > Group: grpTest
> > Aug 22 15:19:47 drbd1 pengine: [32722]: notice: native_print:      
> > resource1#011(ocf::pacemaker:Dummy):#011Stopped
> > Aug 22 15:19:47 drbd1 pengine: [32722]: notice: native_print:      
> > resource2#011(ocf::pacemaker:Dummy):#011Started drbd1
> > Aug 22 15:19:47 drbd1 pengine: [32722]: notice: native_print:      
> > resource3#011(ocf::pacemaker:Dummy):#011Stopped
> > Aug 22 15:19:47 drbd1 pengine: [32722]: notice: native_print:      
> > resource4#011(ocf::pacemaker:Dummy):#011Started drbd1
> > (snip)
> > Aug 22 15:19:47 drbd1 crmd: [32719]: info: te_rsc_command: Initiating 
> > action 6: stop resource2_stop_0 on drbd1 (local)
> > Aug 22 15:19:47 drbd1 crmd: [32719]: info: do_lrm_rsc_op: Performing 
> > key=6:2:0:5c924067-0d20-48fd-9772-88e530661270 op=resource2_stop_0 )
> > Aug 22 15:19:47 drbd1 lrmd: [32716]: info: rsc:resource2 stop[6] (pid 32745)
> > Aug 22 15:19:47 drbd1 crmd: [32719]: info: te_rsc_command: Initiating 
> > action 11: stop resource4_stop_0 on drbd1 (local)
> > Aug 22 15:19:47 drbd1 crmd: [32719]: info: do_lrm_rsc_op: Performing 
> > key=11:2:0:5c924067-0d20-48fd-9772-88e530661270 op=resource4_stop_0 )
> > Aug 22 15:19:47 drbd1 lrmd: [32716]: info: rsc:resource4 stop[7] (pid 32746)
> > Aug 22 15:19:47 drbd1 lrmd: [32716]: info: operation stop[6] on resource2 
> > for client 32719: pid 32745 exited with return code 0
> > (snip)
> 
> Hmmm. Thats not good.
> 
> >
> > I know that there is a cause of this stop order for order in group.
> >
> > In this case our user wants to stop a resource in inverse order definitely.
> >
> >  * resource4_stop -> resource2_stop
> >
> > Stop order is important to the resource of our user.
> >
> >
> > I ask next question.
> >
> > Question 1) Is there right setting in cib.xml to evade this problem?
> 
> No.
> 
> >
> > Question 2) In Pacemaker1.1, does this problem occur?
> 
> Yes.  I'll see what I can do.
> 
> >
> > Question 3) I added following order.
> >
> >
> >         
> >         
> >         
> >
> >             And the addition of this order seems to solve a problem.
> >             Is the addition of order right as one method of the solution, 
> >too?
> 
> Really the PE should handle this implicitly, without need for
> additional constraints.
> 

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] fencing best practices for virtual environments

2012-09-09 Thread renayama19661014
Hi Alberto,

If your concern is how stonith behaves when vCenter goes down and becomes unusable, I think you should configure multiple external/vcenter resources.

Please refer to the following email and patch.
 * http://www.gossamer-threads.com/lists/linuxha/dev/78702
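
As a rough, hypothetical sketch of what "plural external/vcenter" can look like in crm shell terms (the parameter names VI_SERVER, VI_CREDSTORE and HOSTLIST are from my memory of the plugin, and the values are placeholders, so please check them against your plugin version):

# one stonith resource per vCenter instance, so fencing can still work
# when one vCenter is down
primitive stonith-vc1 stonith:external/vcenter \
        params VI_SERVER="vcenter1.example.com" \
               VI_CREDSTORE="/etc/vicredentials.xml" \
               HOSTLIST="node1=vm_node1;node2=vm_node2"
primitive stonith-vc2 stonith:external/vcenter \
        params VI_SERVER="vcenter2.example.com" \
               VI_CREDSTORE="/etc/vicredentials.xml" \
               HOSTLIST="node1=vm_node1;node2=vm_node2"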

Best Regards,
Hideo Yamauchi.

--- On Sun, 2012/9/9, Alberto Menichetti  wrote:

> Hi all,
> 
> I'm setting up a two-node pacemaker cluster (SLES-HA Extension) on vmware 
> vsphere 5.
> I've successfully configured and tested the stonith plugin 
> "external/vcenter"; but this plugin introduces a single point of failure in 
> my cluster infrastructure because it depends on the availability of the 
> virtual center (which is, in the customer environment, a virtual machine).
> I was thinking to introduce an additional fencing device, to be used when the 
> virtual center is unavailable; is this a suggested deployment?
> The fencing device I'd like to use is sdb.
> 
> Are there some best practices or validated configurations for a deploy like 
> this?
> 
> Thank you.
> Alberto
> 
> ___
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
> 

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[Pacemaker] [Problem] About the replacement of the master/slave resource.

2012-09-09 Thread renayama19661014
Hi All,

We checked the behavior when a clone resource combined with a master/slave resource fails.

Under the influence of the clone resource's failure, the master and slave instances are swapped between nodes.

We confirmed it with the following procedure.


Step1) We start a cluster and send cib.


Last updated: Mon Sep 10 15:26:25 2012
Stack: Heartbeat
Current DC: drbd2 (08607c71-da7b-4abf-b6d5-39ee39552e89) - partition with quorum
Version: 1.0.12-c6770b8
2 Nodes configured, unknown expected votes
6 Resources configured.


Online: [ drbd1 drbd2 ]

 Resource Group: grpPostgreSQLDB
 prmApPostgreSQLDB  (ocf::pacemaker:Dummy): Started drbd1
 Resource Group: grpStonith1
 prmStonith1-2  (stonith:external/ssh): Started drbd2
 prmStonith1-3  (stonith:meatware): Started drbd2
 Resource Group: grpStonith2
 prmStonith2-2  (stonith:external/ssh): Started drbd1
 prmStonith2-3  (stonith:meatware): Started drbd1
 Master/Slave Set: msDrPostgreSQLDB
 Masters: [ drbd1 ]
 Slaves: [ drbd2 ]
 Clone Set: clnDiskd1
 Started: [ drbd1 drbd2 ]
 Clone Set: clnPingd
 Started: [ drbd1 drbd2 ]

Step2) We cause a monitor error in pingd.

[root@drbd1 ~]# rm -rf /var/run/pingd-default_ping_set 

Step3) FailOver is finished.


Last updated: Mon Sep 10 15:27:08 2012
Stack: Heartbeat
Current DC: drbd2 (08607c71-da7b-4abf-b6d5-39ee39552e89) - partition with quorum
Version: 1.0.12-c6770b8
2 Nodes configured, unknown expected votes
6 Resources configured.


Online: [ drbd1 drbd2 ]

 Resource Group: grpPostgreSQLDB
 prmApPostgreSQLDB  (ocf::pacemaker:Dummy): Started drbd2
 Resource Group: grpStonith1
 prmStonith1-2  (stonith:external/ssh): Started drbd2
 prmStonith1-3  (stonith:meatware): Started drbd2
 Resource Group: grpStonith2
 prmStonith2-2  (stonith:external/ssh): Started drbd1
 prmStonith2-3  (stonith:meatware): Started drbd1
 Master/Slave Set: msDrPostgreSQLDB
 Masters: [ drbd2 ]
 Stopped: [ prmDrPostgreSQLDB:1 ]
 Clone Set: clnDiskd1
 Started: [ drbd1 drbd2 ]
 Clone Set: clnPingd
 Started: [ drbd2 ]
 Stopped: [ prmPingd:0 ]

Failed actions:
prmPingd:0_monitor_1 (node=drbd1, call=14, rc=7, status=complete): not 
running



However, when we looked at the log, the Master/Slave instances appeared to have been swapped between the nodes.

Sep 10 15:26:53 drbd2 pengine: [2668]: notice: LogActions: Moveresource 
prmApPostgreSQLDB#011(Started drbd1 -> drbd2)
Sep 10 15:26:53 drbd2 pengine: [2668]: notice: LogActions: Leave   resource 
prmStonith1-2#011(Started drbd2)
Sep 10 15:26:53 drbd2 pengine: [2668]: notice: LogActions: Leave   resource 
prmStonith1-3#011(Started drbd2)
Sep 10 15:26:53 drbd2 pengine: [2668]: notice: LogActions: Leave   resource 
prmStonith2-2#011(Started drbd1)
Sep 10 15:26:53 drbd2 pengine: [2668]: notice: LogActions: Leave   resource 
prmStonith2-3#011(Started drbd1)
Sep 10 15:26:53 drbd2 pengine: [2668]: notice: LogActions: Moveresource 
prmDrPostgreSQLDB:0#011(Master drbd1 -> drbd2)
Sep 10 15:26:53 drbd2 pengine: [2668]: notice: LogActions: Stopresource 
prmDrPostgreSQLDB:1#011(drbd2)
Sep 10 15:26:53 drbd2 pengine: [2668]: notice: LogActions: Leave   resource 
prmDiskd1:0#011(Started drbd1)
Sep 10 15:26:53 drbd2 pengine: [2668]: notice: LogActions: Leave   resource 
prmDiskd1:1#011(Started drbd2)
Sep 10 15:26:53 drbd2 pengine: [2668]: notice: LogActions: Stopresource 
prmPingd:0#011(drbd1)
Sep 10 15:26:53 drbd2 pengine: [2668]: notice: LogActions: Leave   resource 
prmPingd:1#011(Started drbd2)

The swap is unnecessary: the Slave should simply become Master, and the failed Master should only have to stop.

However, this problem seems to be solved in Pacemaker 1.1.

Will a correction be possible for Pacemaker 1.0?
Because the placement processing differs greatly from Pacemaker 1.1, I think a correction for Pacemaker 1.0 is difficult.

 * This problem may have been reported as a known problem.
 * I registered this problem with Bugzilla.
  * http://bugs.clusterlabs.org/show_bug.cgi?id=5103

Best Regards,
Hideo Yamauchi.


___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[Pacemaker] [Patch] The log when I lost Quorum is never output.

2012-10-17 Thread renayama19661014
Hi All,

I looked at the crmd source in
ClusterLabs-pacemaker-1.0-Pacemaker-1.0.12-116-gf372204.zip.

I found that the "Quorum lost:" log message is never output.
The cause is fsa_have_quorum: this value is always FALSE.

(snip)
* crmd/callback.c
#if SUPPORT_HEARTBEAT
static gboolean fsa_have_quorum = FALSE;
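/* note: this static is never set to TRUE anywhere, so it stays FALSE
   and the "Quorum lost" message below is never logged */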
(snip)
if(update_quorum) {
crm_have_quorum = ccm_have_quorum(event);
crm_update_quorum(crm_have_quorum, FALSE);

if(crm_have_quorum == FALSE) {
/* did we just loose quorum? */
if(fsa_have_quorum) {
crm_info("Quorum lost: %s", ccm_event_name(event));
}
}
}
(snip)

I made a patch so that this log message is output.
Please apply it to the repository if there is no problem with it.

Best Regards,
Hideo Yamauchi.

trac2198.patch
Description: Binary data
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Patch] The log when I lost Quorum is never output.

2012-10-17 Thread renayama19661014
Hi Andrew,

Thank you for comments.

Your correction is fine, but one more change is necessary.

 #if SUPPORT_HEARTBEAT
-static gboolean fsa_have_quorum = FALSE;
 
 gboolean ccm_dispatch(int fd, gpointer user_data)
 {
@@ -575,14 +574,14 @@
 
if(update_quorum) {
crm_have_quorum = ccm_have_quorum(event);
-   crm_update_quorum(crm_have_quorum, FALSE);
 
if(crm_have_quorum == FALSE) {
/* did we just loose quorum? */
-   if(fsa_have_quorum) {
+   if(fsa_has_quorum) {
crm_info("Quorum lost: %s", ccm_event_name(event));
}
}
+   crm_update_quorum(crm_have_quorum, FALSE);
}

if(update_cache) {


And I think fsa_have_quorum should be deleted because it is no longer necessary.


Best Regards,
Hideo Yamauchi.


--- On Thu, 2012/10/18, Andrew Beekhof  wrote:

> What about this instead?
> 
> diff --git a/crmd/heartbeat.c b/crmd/heartbeat.c
> index cae143b..3a7f31d 100644
> --- a/crmd/heartbeat.c
> +++ b/crmd/heartbeat.c
> @@ -354,14 +354,13 @@ crmd_ccm_msg_callback(oc_ed_t event, void
> *cookie, size_t size, const void *data
> 
>      if (update_quorum) {
>          crm_have_quorum = ccm_have_quorum(event);
> -        crm_update_quorum(crm_have_quorum, FALSE);
> -
>          if (crm_have_quorum == FALSE) {
>              /* did we just loose quorum? */
>              if (fsa_have_quorum) {
>                  crm_info("Quorum lost: %s", ccm_event_name(event));
>              }
>          }
> +        crm_update_quorum(crm_have_quorum, FALSE);
>      }
> 
>      if (update_cache) {
> 
> 
> On Wed, Oct 17, 2012 at 6:41 PM,   wrote:
> > Hi All,
> >
> > I watched a source of crmd of 
> > ClusterLabs-pacemaker-1.0-Pacemaker-1.0.12-116-gf372204.zip.
> >
> > I found the log("Quorum lost: ") processing that was never output.
> > The cause that log is not output is fsa_have_quorum.
> > This value is always FALSE.
> >
> > (snip)
> > * crmd/callback.c
> > #if SUPPORT_HEARTBEAT
> > static gboolean fsa_have_quorum = FALSE;
> > (snip)
> >         if(update_quorum) {
> >             crm_have_quorum = ccm_have_quorum(event);
> >             crm_update_quorum(crm_have_quorum, FALSE);
> >
> >             if(crm_have_quorum == FALSE) {
> >                         /* did we just loose quorum? */
> >                         if(fsa_have_quorum) {
> >                         crm_info("Quorum lost: %s", ccm_event_name(event));
> >                         }
> >             }
> >         }
> > (snip)
> >
> > I made a patch to output this log.
> > Please apply to a repository if this patch does not have a problem.
> >
> > Best Regards,
> > Hideo Yamauchi.
> > ___
> > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> >
> > Project Home: http://www.clusterlabs.org
> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > Bugs: http://bugs.clusterlabs.org
> >
>

new.patch
Description: Binary data
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] The strange behavior of Master/Slave when it failed to demote

2013-01-22 Thread renayama19661014
Hi All,

I registered the problem in Bugzilla on behalf of Ms. Ikeda.
 * http://bugs.clusterlabs.org/show_bug.cgi?id=5133

Best Regards,
Hideo Yamauchi.


--- On Thu, 2013/1/10, Junko IKEDA  wrote:

> 
> 
> Hi,
> 
> I'm running Stateful RA with Pacemaker 1.0.12, and found that its demote 
> behavior is something wrong.
> 
> This is my configuration;
> There is no stonith devices, and demote/stop are set as on-fail="block".
> 
> # crm configure show
> node $id="21c624bd-c426-43dc-9665-bbfb92054bcd" dl380g5c \
> node $id="3f6ec88d-ee47-4f63-bfeb-652b8dd96027" dl380g5d
> primitive dummy ocf:pacemaker:Stateful \
>         op start interval="0s" timeout="100s" on-fail="restart" \
>         op monitor interval="10s" role="Master" timeout="100s" 
> on-fail="restart" \
>         op monitor interval="20s" role="Slave" timeout="100s" 
> on-fail="restart" \
>         op promote interval="0s" timeout="100s" on-fail="restart" \
>         op demote interval="0s" timeout="100s" on-fail="block" \
>         op stop interval="0s" timeout="100s" on-fail="block"
> ms stateful dummy
> property $id="cib-bootstrap-options" \
>         dc-version="1.0.12-066152e" \
>         cluster-infrastructure="Heartbeat" \
>         no-quorum-policy="ignore" \
>         stonith-enabled="false" \
>         startup-fencing="false" \
>         crmd-transition-delay="2s"
> rsc_defaults $id="rsc-options" \
>         resource-stickiness="INFINITY" \
>         migration-threshold="1"
> 
> 
> 
> 1) Initial status (dl380g5c=Master/dl380g5d=Slave)
> # crm_mon -1 -n
> 
> 
> Last updated: Thu Jan 10 18:25:17 2013
> Stack: Heartbeat
> Current DC: dl380g5d (3f6ec88d-ee47-4f63-bfeb-652b8dd96027) - partition with 
> quorum
> Version: 1.0.12-066152e
> 2 Nodes configured, unknown expected votes
> 1 Resources configured.
> 
> 
> Node dl380g5c (21c624bd-c426-43dc-9665-bbfb92054bcd): online
>         dummy:0 (ocf::pacemaker:Stateful) Master
> Node dl380g5d (3f6ec88d-ee47-4f63-bfeb-652b8dd96027): online
>         dummy:1 (ocf::pacemaker:Stateful) Started
> 
> 
> 
> 2) Modify Stateful RA to reprodece "demote NG", and put the Master node into 
> standby mode.
> 
> # vim /usr/lib/ocf/resource.d/pacemaker/Stateful
> stateful_demote() {
> return $OCF_ERR_GENERIC
> 
>     stateful_check_state
>     if [ $? = 0 ]; then
>         # CRM Error - Should never happen
>         return $OCF_NOT_RUNNING
> 
> ...
> 
> 
> # crm node standby dl380g5c
> # crm_mon -1 -n
> 
> Last updated: Thu Jan 10 18:27:04 2013
> Stack: Heartbeat
> Current DC: dl380g5d (3f6ec88d-ee47-4f63-bfeb-652b8dd96027) - partition with 
> quorum
> Version: 1.0.12-066152e
> 2 Nodes configured, unknown expected votes
> 1 Resources configured.
> 
> 
> Node dl380g5c (21c624bd-c426-43dc-9665-bbfb92054bcd): standby
>         dummy:0 (ocf::pacemaker:Stateful) Slave  (unmanaged) FAILED
> Node dl380g5d (3f6ec88d-ee47-4f63-bfeb-652b8dd96027): online
>         dummy:1 (ocf::pacemaker:Stateful) Master
> 
> Failed actions:
>     dummy:0_demote_0 (node=dl380g5c, call=4, rc=1, status=complete): unknown 
> error
> 
> 
> In the above crm_mon, dl380g5c's status is "Slave", but it might be still 
> "Master" because it failed to demote.
> So dl380g5d should be prohibited from its promoting action to prevent the 
> multiple Master.
> It seems that Pacemaker 1.1 shows the same behavior as 1.0.12.
> I'm not sure but Pacemaker 1.0.11's behavior is correct(dl380g5d can not 
> promote).
> Please see the attached hb_report.
> 
> 
> Jan 10 18:27:01 dl380g5d pengine: [4297]: info: determine_online_status: Node 
> dl380g5c is standby
> Jan 10 18:27:01 dl380g5d pengine: [4297]: info: determine_online_status: Node 
> dl380g5d is online
> Jan 10 18:27:01 dl380g5d pengine: [4297]: notice: unpack_rsc_op: Operation 
> dummy:0_monitor_0 found resource dummy:0 active in master mode on dl380g5c
> Jan 10 18:27:01 dl380g5d pengine: [4297]: WARN: unpack_rsc_op: Processing 
> failed op dummy:0_demote_0 on dl380g5c: unknown error (1)
> Jan 10 18:27:01 dl380g5d pengine: [4297]: WARN: unpack_rsc_op: Forcing 
> dummy:0 to stop after a failed demote action
> Jan 10 18:27:01 dl380g5d pengine: [4297]: info: native_add_running: resource 
> dummy:0 isnt managed
> Jan 10 18:27:01 dl380g5d pengine: [4297]: notice: clone_print:  Master/Slave 
> Set: stateful
> Jan 10 18:27:01 dl380g5d pengine: [4297]: notice: native_print:      dummy:0  
> (ocf::pacemaker:Stateful):  Slave dl380g5c (unmanaged) FAILED
> Jan 10 18:27:01 dl380g5d pengine: [4297]: notice: short_print:      Slaves: [ 
> dl380g5d ]
> Jan 10 18:27:01 dl380g5d pengine: [4297]: info: get_failcount: stateful has 
> failed 1 times on dl380g5c
> Jan 10 18:27:01 dl380g5d pengine: [4297]: WARN: common_apply_stickiness: 
> Forcing stateful away from dl380g5c after 1 failures (max=1)
> Jan 10 18:27:01 dl380g5d pengine: [4297]: info: get_failcount: stateful has 
> failed 1 times on dl380g5c
> Jan 10 18:27:01 dl380g5d pengine: [4297]: WARN: common_apply_stickiness:

[Pacemaker] [Problem][crmsh]The designation of the 'ordered' attribute becomes the error.

2013-03-05 Thread renayama19661014
Hi Dejan,
Hi Andrew,

In the crm shell, the meta attribute check was revised by the following patch.

 * http://hg.savannah.gnu.org/hgweb/crmsh/rev/d1174f42f4b3

This patch was backported in Pacemaker1.0.13.

 * 
https://github.com/ClusterLabs/pacemaker-1.0/commit/fa1a99ab36e0ed015f1bcbbb28f7db962a9d1abc#shell/modules/cibconfig.py

However, the ordered/colocated meta attributes of a group resource are treated as an error when I use a crm shell that includes this patch.

--
(snip)
### Group Configuration ###
group master-group \
vip-master \
vip-rep \
meta \
ordered="false"
(snip)

[root@rh63-heartbeat1 ~]# crm configure load update test2339.crm 
INFO: building help index
crm_verify[20028]: 2013/03/06_17:57:18 WARN: unpack_nodes: Blind faith: not 
fencing unseen nodes
WARNING: vip-master: specified timeout 60s for start is smaller than the 
advised 90
WARNING: vip-master: specified timeout 60s for stop is smaller than the advised 
100
WARNING: vip-rep: specified timeout 60s for start is smaller than the advised 90
WARNING: vip-rep: specified timeout 60s for stop is smaller than the advised 100
ERROR: master-group: attribute ordered does not exist  -> WHY?
Do you still want to commit? y
--

If I answer `yes` to the confirmation prompt, the configuration is applied, but it is a problem that the error message is shown at all.
 * The same error occurs when I specify the colocated attribute.
And I noticed that the online help of Pacemaker has no explanation of ordered/colocated for group resources.

I think that specifying the ordered/colocated attributes on a group resource should not be treated as an error.
In addition, I think ordered/colocated should be added to the online help.

Best Regards,
Hideo Yamauchi.


___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Problem][crmsh]The designation of the 'ordered' attribute becomes the error.

2013-03-06 Thread renayama19661014
Hi Dejan,
Hi Andrew,

Thank you for your comment.
I will check the behavior with the patch and report back.

Best Regards,
Hideo Yamauchi.

--- On Wed, 2013/3/6, Dejan Muhamedagic  wrote:

> Hi Hideo-san,
> 
> On Wed, Mar 06, 2013 at 10:37:44AM +0900, renayama19661...@ybb.ne.jp wrote:
> > Hi Dejan,
> > Hi Andrew,
> > 
> > As for the crm shell, the check of the meta attribute was revised with the 
> > next patch.
> > 
> >  * http://hg.savannah.gnu.org/hgweb/crmsh/rev/d1174f42f4b3
> > 
> > This patch was backported in Pacemaker1.0.13.
> > 
> >  * 
> >https://github.com/ClusterLabs/pacemaker-1.0/commit/fa1a99ab36e0ed015f1bcbbb28f7db962a9d1abc#shell/modules/cibconfig.py
> > 
> > However, the ordered,colocated attribute of the group resource is treated 
> > as an error when I use crm Shell which adopted this patch.
> > 
> > --
> > (snip)
> > ### Group Configuration ###
> > group master-group \
> >         vip-master \
> >         vip-rep \
> >         meta \
> >                 ordered="false"
> > (snip)
> > 
> > [root@rh63-heartbeat1 ~]# crm configure load update test2339.crm 
> > INFO: building help index
> > crm_verify[20028]: 2013/03/06_17:57:18 WARN: unpack_nodes: Blind faith: not 
> > fencing unseen nodes
> > WARNING: vip-master: specified timeout 60s for start is smaller than the 
> > advised 90
> > WARNING: vip-master: specified timeout 60s for stop is smaller than the 
> > advised 100
> > WARNING: vip-rep: specified timeout 60s for start is smaller than the 
> > advised 90
> > WARNING: vip-rep: specified timeout 60s for stop is smaller than the 
> > advised 100
> > ERROR: master-group: attribute ordered does not exist  -> WHY?
> > Do you still want to commit? y
> > --
> > 
> > If it chooses `yes` by a confirmation message, it is reflected, but it is a 
> > problem that error message is displayed.
> >  * The error occurs in the same way when I appoint colocated attribute.
> > AndI noticed that there was not explanation of ordered,colocated of the 
> > group resource in online help of Pacemaker.
> > 
> > I think that the designation of the ordered,colocated attribute should not 
> > become the error in group resource.
> > In addition, I think that ordered,colocated should be added to online help.
> 
> These attributes are not listed in crmsh. Does the attached patch
> help?
> 
> Thanks,
> 
> Dejan
> > 
> > Best Regards,
> > Hideo Yamauchi.
> > 
> > 
> > ___
> > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> > 
> > Project Home: http://www.clusterlabs.org
> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > Bugs: http://bugs.clusterlabs.org
> 

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Problem][crmsh]The designation of the 'ordered' attribute becomes the error.

2013-03-06 Thread renayama19661014
Hi Dejan,

The problem was solved by your patch.

However, I have a question.
I want to use the "resource_set" that Andrew proposed, but I do not understand how 
to use it with the crm shell.

I loaded the following two cib.xml variants and checked the result with the crm shell.

Case 1) sequential="false". 
(snip)








(snip)
 * When I confirm it with crm shell ...
(snip)
group master-group vip-master vip-rep
order test-order : _rsc_set_ ( vip-master vip-rep )
(snip)
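
The constraint XML for Case 1 was stripped by the mail archive; reconstructed 
from the crm shell output above and the ids visible in the quotes later in this 
thread, it was presumably along these lines (a sketch, not the exact original):

--
<constraints>
  <rsc_order id="test-order">
    <resource_set id="test-order-resource_set" sequential="false">
      <resource_ref id="vip-master"/>
      <resource_ref id="vip-rep"/>
    </resource_set>
  </rsc_order>
</constraints>
--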

Case 2) sequential="true"
(snip)

  

  
  

  

(snip)
 * When I confirm it with crm shell ...
(snip)
   group master-group vip-master vip-rep
   xml  \
 \
 \
 \
 \

(snip)

Does the "sequential=true" designation have to be written in raw xml?
Is there a proper way to specify an attribute of "resource_set" with the crm 
shell?
Or is "resource_set" perhaps not usable with the crm shell of Pacemaker 1.0.13?

Best Regards,
Hideo Yamauchi.

--- On Thu, 2013/3/7, renayama19661...@ybb.ne.jp  
wrote:

> Hi Dejan,
> Hi Andrew,
> 
> Thank you for comment.
> I confirm the movement of the patch and report it.
> 
> Best Regards,
> Hideo Yamauchi.
> 
> --- On Wed, 2013/3/6, Dejan Muhamedagic  wrote:
> 
> > Hi Hideo-san,
> > 
> > On Wed, Mar 06, 2013 at 10:37:44AM +0900, renayama19661...@ybb.ne.jp wrote:
> > > Hi Dejan,
> > > Hi Andrew,
> > > 
> > > As for the crm shell, the check of the meta attribute was revised with 
> > > the next patch.
> > > 
> > >  * http://hg.savannah.gnu.org/hgweb/crmsh/rev/d1174f42f4b3
> > > 
> > > This patch was backported in Pacemaker1.0.13.
> > > 
> > >  * 
> > >https://github.com/ClusterLabs/pacemaker-1.0/commit/fa1a99ab36e0ed015f1bcbbb28f7db962a9d1abc#shell/modules/cibconfig.py
> > > 
> > > However, the ordered,colocated attribute of the group resource is treated 
> > > as an error when I use crm Shell which adopted this patch.
> > > 
> > > --
> > > (snip)
> > > ### Group Configuration ###
> > > group master-group \
> > >         vip-master \
> > >         vip-rep \
> > >         meta \
> > >                 ordered="false"
> > > (snip)
> > > 
> > > [root@rh63-heartbeat1 ~]# crm configure load update test2339.crm 
> > > INFO: building help index
> > > crm_verify[20028]: 2013/03/06_17:57:18 WARN: unpack_nodes: Blind faith: 
> > > not fencing unseen nodes
> > > WARNING: vip-master: specified timeout 60s for start is smaller than the 
> > > advised 90
> > > WARNING: vip-master: specified timeout 60s for stop is smaller than the 
> > > advised 100
> > > WARNING: vip-rep: specified timeout 60s for start is smaller than the 
> > > advised 90
> > > WARNING: vip-rep: specified timeout 60s for stop is smaller than the 
> > > advised 100
> > > ERROR: master-group: attribute ordered does not exist  -> WHY?
> > > Do you still want to commit? y
> > > --
> > > 
> > > If it chooses `yes` by a confirmation message, it is reflected, but it is 
> > > a problem that error message is displayed.
> > >  * The error occurs in the same way when I appoint colocated attribute.
> > > AndI noticed that there was not explanation of ordered,colocated of 
> > > the group resource in online help of Pacemaker.
> > > 
> > > I think that the designation of the ordered,colocated attribute should 
> > > not become the error in group resource.
> > > In addition, I think that ordered,colocated should be added to online 
> > > help.
> > 
> > These attributes are not listed in crmsh. Does the attached patch
> > help?
> > 
> > Thanks,
> > 
> > Dejan
> > > 
> > > Best Regards,
> > > Hideo Yamauchi.
> > > 
> > > 
> > > ___
> > > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> > > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> > > 
> > > Project Home: http://www.clusterlabs.org
> > > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > > Bugs: http://bugs.clusterlabs.org
> > 
> 
> ___
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
> 

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[Pacemaker] [Question]About "sequential" designation of resource_set.

2013-03-06 Thread renayama19661014
Hi Andrew,

I tried "resource_set  sequential" designation.
 *  http://www.gossamer-threads.com/lists/linuxha/pacemaker/84578

I caused an error in start of the vip-master resource and confirmed movement.

(snip)
  

  



  


  



  

  
(snip)

With the ordered designation on the group resource, the difference that I 
expected appeared (Case 1 and Case 2).
However, with the "sequential" designation on the resource_set, the difference 
that I expected did not appear (Case 3 and Case 4).
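
The constraint XML quoted below was largely stripped by the mail archive. Judging 
from the surviving "---> or" annotation and the fragments quoted later in this 
thread (which still show the id "test-order-resource_set"), it was presumably an 
ordering set along these lines (a sketch only; layout and ids are illustrative):

--
<rsc_order id="test-order">
  <resource_set id="test-order-resource_set" sequential="true">   ---> or "false"
    <resource_ref id="vip-master"/>
    <resource_ref id="vip-rep"/>
  </resource_set>
</rsc_order>
--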

(snip)


  
---> or "false"





(snip)


Case 1) group meta_attribute ordered=false 
 * The start of vip-rep is issued without waiting for the start of vip-master.

[root@rh63-heartbeat2 ~]# grep "Initiating action" /var/log/ha-log
Mar  7 19:40:50 rh63-heartbeat2 crmd: [18992]: info: te_rsc_command: Initiating 
action 2: probe_complete probe_complete on rh63-heartbeat1 - no waiting
Mar  7 19:40:50 rh63-heartbeat2 crmd: [18992]: info: te_rsc_command: Initiating 
action 3: probe_complete probe_complete on rh63-heartbeat2 (local) - no waiting
Mar  7 19:41:24 rh63-heartbeat2 crmd: [18992]: info: te_rsc_command: Initiating 
action 4: monitor vip-master_monitor_0 on rh63-heartbeat1
Mar  7 19:41:24 rh63-heartbeat2 crmd: [18992]: info: te_rsc_command: Initiating 
action 7: monitor vip-master_monitor_0 on rh63-heartbeat2 (local)
Mar  7 19:41:24 rh63-heartbeat2 crmd: [18992]: info: te_rsc_command: Initiating 
action 5: monitor vip-rep_monitor_0 on rh63-heartbeat1
Mar  7 19:41:24 rh63-heartbeat2 crmd: [18992]: info: te_rsc_command: Initiating 
action 8: monitor vip-rep_monitor_0 on rh63-heartbeat2 (local)
Mar  7 19:41:24 rh63-heartbeat2 crmd: [18992]: info: te_rsc_command: Initiating 
action 6: probe_complete probe_complete on rh63-heartbeat2 (local) - no waiting
Mar  7 19:41:25 rh63-heartbeat2 crmd: [18992]: info: te_rsc_command: Initiating 
action 3: probe_complete probe_complete on rh63-heartbeat1 - no waiting
Mar  7 19:41:25 rh63-heartbeat2 crmd: [18992]: info: te_rsc_command: Initiating 
action 5: start vip-master_start_0 on rh63-heartbeat1
Mar  7 19:41:25 rh63-heartbeat2 crmd: [18992]: info: te_rsc_command: Initiating 
action 7: start vip-rep_start_0 on rh63-heartbeat1 
Mar  7 19:41:26 rh63-heartbeat2 crmd: [18992]: info: te_rsc_command: Initiating 
action 8: monitor vip-rep_monitor_1 on rh63-heartbeat1
Mar  7 19:41:27 rh63-heartbeat2 crmd: [18992]: info: te_rsc_command: Initiating 
action 2: stop vip-master_stop_0 on rh63-heartbeat1
Mar  7 19:41:28 rh63-heartbeat2 crmd: [18992]: info: te_rsc_command: Initiating 
action 6: stop vip-rep_stop_0 on rh63-heartbeat1


Case 2) group meta_attribute ordered=true
 * The start of vip-rep waits for the start of vip-master before it is issued.

[root@rh63-heartbeat2 ~]# grep "Initiating action" /var/log/ha-log
Mar  7 19:34:37 rh63-heartbeat2 crmd: [18865]: info: te_rsc_command: Initiating 
action 2: probe_complete probe_complete on rh63-heartbeat1 - no waiting
Mar  7 19:34:37 rh63-heartbeat2 crmd: [18865]: info: te_rsc_command: Initiating 
action 3: probe_complete probe_complete on rh63-heartbeat2 (local) - no waiting
Mar  7 19:35:42 rh63-heartbeat2 crmd: [18865]: info: te_rsc_command: Initiating 
action 4: monitor vip-master_monitor_0 on rh63-heartbeat1
Mar  7 19:35:42 rh63-heartbeat2 crmd: [18865]: info: te_rsc_command: Initiating 
action 7: monitor vip-master_monitor_0 on rh63-heartbeat2 (local)
Mar  7 19:35:42 rh63-heartbeat2 crmd: [18865]: info: te_rsc_command: Initiating 
action 5: monitor vip-rep_monitor_0 on rh63-heartbeat1
Mar  7 19:35:42 rh63-heartbeat2 crmd: [18865]: info: te_rsc_command: Initiating 
action 8: monitor vip-rep_monitor_0 on rh63-heartbeat2 (local)
Mar  7 19:35:42 rh63-heartbeat2 crmd: [18865]: info: te_rsc_command: Initiating 
action 6: probe_complete probe_complete on rh63-heartbeat2 (local) - no waiting
Mar  7 19:35:43 rh63-heartbeat2 crmd: [18865]: info: te_rsc_command: Initiating 
action 3: probe_complete probe_complete on rh63-heartbeat1 - no waiting
Mar  7 19:35:43 rh63-heartbeat2 crmd: [18865]: info: te_rsc_command: Initiating 
action 5: start vip-master_start_0 on rh63-heartbeat1
Mar  7 19:35:45 rh63-heartbeat2 crmd: [18865]: info: te_rsc_command: Initiating 
action 1: stop vip-master_stop_0 on rh63-heartbeat1


Case 3) group resource_set sequential=false
 * The start of vip-rep waits for the start of vip-master before it is issued.
 * I expected the same result as Case 1.

[root@rh63-heartbeat2 ~]# grep "Initiating action" /var/log/ha-log
Mar  7 19:43:50 rh63-heartbeat2 crmd: [19113]: info: te_rsc_command: Initiating 
action 2: probe_complete probe_complete on rh63-heartbeat1 - no waiting
Mar  7 19:43:50 rh63-heartbeat2 crmd: [19113]: info: te_rsc_command: Initiating 
action 3: probe_complete probe_complete on rh63-heartbeat2 (local) - no waitin

Re: [Pacemaker] [Question]About "sequential" designation of resource_set.

2013-03-06 Thread renayama19661014
Hi Andrew,

Thank you for the comment.

It was the colocation set that I needed.
I will modify my configuration and confirm the behaviour.

Many Thanks!
Hideo Yamauchi.

--- On Thu, 2013/3/7, Andrew Beekhof  wrote:

> Oh!
> 
> You use the resource sets _instead_ of a group.
> If you want group.ordered=false, then use a colocation set (with
> sequential=true).
> If you want group.colocated=false, then use an ordering set (with
> sequential=true).
> 
> Hope that helps :)
> 
> On Thu, Mar 7, 2013 at 3:16 PM,   wrote:
> > Hi Andrew,
> >
> > Thank you for comments.
> >
> >> > Case 3) group resource_set sequential=false
> >> >  * Start of vip-rep waits for start of vip-master and is published.
> >> >  * I expected a result same as the first case.
> >>
> >> Me too. Have you got the relevant PE file?
> >
> > I attached the thing which just collected hb_report.
> >
> > Best Regards,
> > Hideo Yamauchi.
> >
> >
> >
> > --- On Thu, 2013/3/7, Andrew Beekhof  wrote:
> >
> >> On Thu, Mar 7, 2013 at 1:27 PM,   wrote:
> >> > Hi Andrew,
> >> >
> >> > I tried "resource_set  sequential" designation.
> >> >  *  http://www.gossamer-threads.com/lists/linuxha/pacemaker/84578
> >> >
> >> > I caused an error in start of the vip-master resource and confirmed 
> >> > movement.
> >> >
> >> > (snip)
> >> >       
> >> >          >> >type="Dummy2">
> >> >           
> >> >              >> >on-fail="restart" timeout="60s"/>
> >> >              >> >name="monitor" on-fail="restart" timeout="60s"/>
> >> >              >> >on-fail="block" timeout="60s"/>
> >> >           
> >> >         
> >> >          >> >type="Dummy">
> >> >           
> >> >              >> >on-fail="stop" timeout="60s"/>
> >> >              >> >on-fail="restart" timeout="60s"/>
> >> >              >> >on-fail="block" timeout="60s"/>
> >> >           
> >> >         
> >> >       
> >> > (snip)
> >> >
> >> > By the ordered designation of the group resource, the difference that I 
> >> > expected appeared.( Case 1 and Case 2)
> >> > However, by the "sequential" designation, the difference that I expected 
> >> > did not appear.(Case 3 and Case 4)
> >> >
> >> > (snip)
> >> >     
> >> >         
> >> >                  >> >id="test-order-resource_set">  ---> or "false"
> >> >                         
> >> >                         
> >> >                 
> >> >         
> >> >     
> >> > (snip)
> >> >
> >> >
> >> > Case 1) group meta_attribute ordered=false
> >> >  * Start of vip-rep is published without waiting for start of vip-master.
> >> >
> >> > [root@rh63-heartbeat2 ~]# grep "Initiating action" /var/log/ha-log
> >> > Mar  7 19:40:50 rh63-heartbeat2 crmd: [18992]: info: te_rsc_command: 
> >> > Initiating action 2: probe_complete probe_complete on rh63-heartbeat1 - 
> >> > no waiting
> >> > Mar  7 19:40:50 rh63-heartbeat2 crmd: [18992]: info: te_rsc_command: 
> >> > Initiating action 3: probe_complete probe_complete on rh63-heartbeat2 
> >> > (local) - no waiting
> >> > Mar  7 19:41:24 rh63-heartbeat2 crmd: [18992]: info: te_rsc_command: 
> >> > Initiating action 4: monitor vip-master_monitor_0 on rh63-heartbeat1
> >> > Mar  7 19:41:24 rh63-heartbeat2 crmd: [18992]: info: te_rsc_command: 
> >> > Initiating action 7: monitor vip-master_monitor_0 on rh63-heartbeat2 
> >> > (local)
> >> > Mar  7 19:41:24 rh63-heartbeat2 crmd: [18992]: info: te_rsc_command: 
> >> > Initiating action 5: monitor vip-rep_monitor_0 on rh63-heartbeat1
> >> > Mar  7 19:41:24 rh63-heartbeat2 crmd: [18992]: info: te_rsc_command: 
> >> > Initiating action 8: monitor vip-rep_monitor_0 on rh63-heartbeat2 (local)
> >> > Mar  7 19:41:24 rh63-heartbeat2 crmd: [18992]: info: te_rsc_command: 
> >> > Initiating action 6: probe_complete probe_complete on rh63-heartbeat2 
> >> > (local) - no waiting
> >> > Mar  7 19:41:25 rh63-heartbeat2 crmd: [18992]: info: te_rsc_command: 
> >> > Initiating action 3: probe_complete probe_complete on rh63-heartbeat1 - 
> >> > no waiting
> >> > Mar  7 19:41:25 rh63-heartbeat2 crmd: [18992]: info: te_rsc_command: 
> >> > Initiating action 5: start vip-master_start_0 on rh63-heartbeat1
> >> > Mar  7 19:41:25 rh63-heartbeat2 crmd: [18992]: info: te_rsc_command: 
> >> > Initiating action 7: start vip-rep_start_0 on rh63-heartbeat1
> >> > Mar  7 19:41:26 rh63-heartbeat2 crmd: [18992]: info: te_rsc_command: 
> >> > Initiating action 8: monitor vip-rep_monitor_1 on rh63-heartbeat1
> >> > Mar  7 19:41:27 rh63-heartbeat2 crmd: [18992]: info: te_rsc_command: 
> >> > Initiating action 2: stop vip-master_stop_0 on rh63-heartbeat1
> >> > Mar  7 19:41:28 rh63-heartbeat2 crmd: [18992]: info: te_rsc_command: 
> >> > Initiating action 6: stop vip-rep_stop_0 on rh63-heartbeat1
> >> >
> >> >
> >> > Case 2) group meta_attribute ordered=true
> >> >  * Start of vip-rep waits for start of vip-master and is published.
> >> >
> >> > [root@rh63-heartbeat2 ~]# grep "Initiating action" /var/log/ha-log
> >> > Mar  7 19:34:37 rh63-heartbeat2 crmd: [18865]: info: te_rsc_command: 
> >> > Initiating action 2: probe_complete probe_co

Re: [Pacemaker] [Question]About "sequential" designation of resource_set.

2013-03-06 Thread renayama19661014
Hi Andrew,

> > You use the resource sets _instead_ of a group.
> > If you want group.ordered=false, then use a colocation set (with
> > sequential=true).

In "colocation", I used "resource_set".
However, a result did not include the change.

Will this result be a mistake of my setting?
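
The cib.xml fragments for the two cases below were stripped by the mail archive. 
Presumably the colocation constraint was along these lines (a sketch only; the id 
and the INFINITY score are illustrative, not taken from the original file):

--
<rsc_colocation id="test-colocation" score="INFINITY">
  <resource_set id="test-colocation-resource_set" sequential="false">   ---> or "true"
    <resource_ref id="vip-master"/>
    <resource_ref id="vip-rep"/>
  </resource_set>
</rsc_colocation>
--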

Case 1) sequential=false
(snip)

  

  
  

  

(snip)
[root@rh63-heartbeat2 ~]# grep "Initiating action" /var/log/ha-log
Mar  8 00:20:52 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: Initiating 
action 2: probe_complete probe_complete on rh63-heartbeat1 - no waiting
Mar  8 00:20:52 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: Initiating 
action 3: probe_complete probe_complete on rh63-heartbeat2 (local) - no waiting
Mar  8 00:20:55 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: Initiating 
action 4: monitor vip-master_monitor_0 on rh63-heartbeat1
Mar  8 00:20:55 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: Initiating 
action 7: monitor vip-master_monitor_0 on rh63-heartbeat2 (local)
Mar  8 00:20:55 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: Initiating 
action 5: monitor vip-rep_monitor_0 on rh63-heartbeat1
Mar  8 00:20:55 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: Initiating 
action 8: monitor vip-rep_monitor_0 on rh63-heartbeat2 (local)
Mar  8 00:20:55 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: Initiating 
action 6: probe_complete probe_complete on rh63-heartbeat2 (local) - no waiting
Mar  8 00:20:56 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: Initiating 
action 3: probe_complete probe_complete on rh63-heartbeat1 - no waiting
Mar  8 00:20:56 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: Initiating 
action 5: start vip-master_start_0 on rh63-heartbeat1
Mar  8 00:20:58 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: Initiating 
action 1: stop vip-master_stop_0 on rh63-heartbeat1


Case 2) sequential=true
(snip)

  

  
  

  

(snip)
[root@rh63-heartbeat2 ~]# grep "Initiating action" /var/log/ha-log
Mar  7 23:54:44 rh63-heartbeat2 crmd: [4]: info: te_rsc_command: Initiating 
action 2: probe_complete probe_complete on rh63-heartbeat1 - no waiting
Mar  7 23:54:44 rh63-heartbeat2 crmd: [4]: info: te_rsc_command: Initiating 
action 3: probe_complete probe_complete on rh63-heartbeat2 (local) - no waiting
Mar  7 23:54:48 rh63-heartbeat2 crmd: [4]: info: te_rsc_command: Initiating 
action 4: monitor vip-master_monitor_0 on rh63-heartbeat1
Mar  7 23:54:48 rh63-heartbeat2 crmd: [4]: info: te_rsc_command: Initiating 
action 7: monitor vip-master_monitor_0 on rh63-heartbeat2 (local)
Mar  7 23:54:48 rh63-heartbeat2 crmd: [4]: info: te_rsc_command: Initiating 
action 5: monitor vip-rep_monitor_0 on rh63-heartbeat1
Mar  7 23:54:48 rh63-heartbeat2 crmd: [4]: info: te_rsc_command: Initiating 
action 8: monitor vip-rep_monitor_0 on rh63-heartbeat2 (local)
Mar  7 23:54:48 rh63-heartbeat2 crmd: [4]: info: te_rsc_command: Initiating 
action 6: probe_complete probe_complete on rh63-heartbeat2 (local) - no waiting
Mar  7 23:54:49 rh63-heartbeat2 crmd: [4]: info: te_rsc_command: Initiating 
action 3: probe_complete probe_complete on rh63-heartbeat1 - no waiting
Mar  7 23:54:49 rh63-heartbeat2 crmd: [4]: info: te_rsc_command: Initiating 
action 5: start vip-master_start_0 on rh63-heartbeat1
Mar  7 23:54:51 rh63-heartbeat2 crmd: [4]: info: te_rsc_command: Initiating 
action 1: stop vip-master_stop_0 on rh63-heartbeat1


Best Regards,
Hideo Yamauchi.


___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Problem][crmsh]The designation of the 'ordered' attribute becomes the error.

2013-03-10 Thread renayama19661014
Hi Dejan,

Thank you for comment.

> sequential=true is the default. In that case it's not possible to
> have an unequivocal representation for the same construct and, in
> this particular case, the conversion XML->CLI->XML yields a
> different XML. There's a later commit which helps here, I think
> that it should be possible to backport it to 1.0:
> 
> changeset:   789:916d1b15edc3
> user:Dejan Muhamedagic 
> date:Thu Aug 16 17:01:24 2012 +0200
> summary: Medium: cibconfig: drop attributes set to default on cib import

I will apply the backport that you mentioned and confirm the behaviour.
I will contact you again if I run into a problem.

> > Is there a right method to appoint an attribute of "resource_set" with crm 
> > shell?
> > Possibly is not "resource_set" usable with crm shell of Pacemaker1.0.13?
> 
> Should work. It's just that using it with two resources, well,
> it's sort of unusual use case.

All right!

Many Thanks!
Hideo Yamauchi.

--- On Fri, 2013/3/8, Dejan Muhamedagic  wrote:

> Hi Hideo-san,
> 
> On Thu, Mar 07, 2013 at 10:18:09AM +0900, renayama19661...@ybb.ne.jp wrote:
> > Hi Dejan,
> > 
> > The problem was settled with your patch.
> > 
> > However, I have a question.
> > I want to use "resource_set" which Mr. Andrew proposed, but do not 
> > understand a method to use with crm shell.
> > 
> > I read two next cib.xml and confirmed it with crm shell.
> > 
> > Case 1) sequential="false". 
> > (snip)
> >     
> >         
> >                  >id="test-order-resource_set">
> >                         
> >                         
> >                 
> >         
> >     
> > (snip)
> >  * When I confirm it with crm shell ...
> > (snip)
> >     group master-group vip-master vip-rep
> >     order test-order : _rsc_set_ ( vip-master vip-rep )
> > (snip)
> 
> Yes. All size two resource sets get the _rsc_set_ keyword,
> otherwise it's not possible to distinguish them from "normal"
> constraints. Resource sets are supposed to help cases when it is
> necessary to express relation between three or more resources.
> Perhaps this case should be an exception.
> 
> > Case 2) sequential="true"
> > (snip)
> >     
> >       
> >         
> >           
> >           
> >         
> >       
> >     
> > (snip)
> >  * When I confirm it with crm shell ...
> > (snip)
> >    group master-group vip-master vip-rep
> >    xml  \
> >          \
> >                  \
> >                  \
> >          \
> > 
> > (snip)
> > 
> > Does the designation of "sequential=true" have to describe it in xml?
> 
> sequential=true is the default. In that case it's not possible to
> have an unequivocal representation for the same construct and, in
> this particular case, the conversion XML->CLI->XML yields a
> different XML. There's a later commit which helps here, I think
> that it should be possible to backport it to 1.0:
> 
> changeset:   789:916d1b15edc3
> user:        Dejan Muhamedagic 
> date:        Thu Aug 16 17:01:24 2012 +0200
> summary:     Medium: cibconfig: drop attributes set to default on cib import
> 
> > Is there a right method to appoint an attribute of "resource_set" with crm 
> > shell?
> > Possibly is not "resource_set" usable with crm shell of Pacemaker1.0.13?
> 
> Should work. It's just that using it with two resources, well,
> it's sort of unusual use case.
> 
> Cheers,
> 
> Dejan
> 
> > Best Regards,
> > Hideo Yamauchi.
> > 
> > --- On Thu, 2013/3/7, renayama19661...@ybb.ne.jp 
> >  wrote:
> > 
> > > Hi Dejan,
> > > Hi Andrew,
> > > 
> > > Thank you for comment.
> > > I confirm the movement of the patch and report it.
> > > 
> > > Best Regards,
> > > Hideo Yamauchi.
> > > 
> > > --- On Wed, 2013/3/6, Dejan Muhamedagic  wrote:
> > > 
> > > > Hi Hideo-san,
> > > > 
> > > > On Wed, Mar 06, 2013 at 10:37:44AM +0900, renayama19661...@ybb.ne.jp 
> > > > wrote:
> > > > > Hi Dejan,
> > > > > Hi Andrew,
> > > > > 
> > > > > As for the crm shell, the check of the meta attribute was revised 
> > > > > with the next patch.
> > > > > 
> > > > >  * http://hg.savannah.gnu.org/hgweb/crmsh/rev/d1174f42f4b3
> > > > > 
> > > > > This patch was backported in Pacemaker1.0.13.
> > > > > 
> > > > >  * 
> > > > >https://github.com/ClusterLabs/pacemaker-1.0/commit/fa1a99ab36e0ed015f1bcbbb28f7db962a9d1abc#shell/modules/cibconfig.py
> > > > > 
> > > > > However, the ordered,colocated attribute of the group resource is 
> > > > > treated as an error when I use crm Shell which adopted this patch.
> > > > > 
> > > > > --
> > > > > (snip)
> > > > > ### Group Configuration ###
> > > > > group master-group \
> > > > >         vip-master \
> > > > >         vip-rep \
> > > > >         meta \
> > > > >                 ordered="false"
> > > > > (snip)
> > > > > 
> > > > > [root@rh63-heartbeat1 ~]# crm configure load update test2339.crm 
> > > > > INFO: building help index
> > > > > crm_verify[20028]: 2013/03/06_17:57:18 WARN: unpack_nodes: Blind 
> > > > > faith: not fencing unseen nodes
> > > >

Re: [Pacemaker] [Question]About "sequential" designation of resource_set.

2013-03-13 Thread renayama19661014
Hi Andrew,

> In "colocation", I used "resource_set".
> However, a result did not include the change.

Please give me your comments on the result that I reported.

Best Regards,
Hideo Yamauchi.

--- On Thu, 2013/3/7, renayama19661...@ybb.ne.jp  
wrote:

> Hi Andrew,
> 
> > > You use the resource sets _instead_ of a group.
> > > If you want group.ordered=false, then use a colocation set (with
> > > sequential=true).
> 
> In "colocation", I used "resource_set".
> However, a result did not include the change.
> 
> Will this result be a mistake of my setting?
> 
> Case 1) sequential=false
> (snip)
>     
>       
>         
>           
>           
>         
>       
>     
> (sip)
> [root@rh63-heartbeat2 ~]# grep "Initiating action" /var/log/ha-log
> Mar  8 00:20:52 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: 
> Initiating action 2: probe_complete probe_complete on rh63-heartbeat1 - no 
> waiting
> Mar  8 00:20:52 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: 
> Initiating action 3: probe_complete probe_complete on rh63-heartbeat2 (local) 
> - no waiting
> Mar  8 00:20:55 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: 
> Initiating action 4: monitor vip-master_monitor_0 on rh63-heartbeat1
> Mar  8 00:20:55 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: 
> Initiating action 7: monitor vip-master_monitor_0 on rh63-heartbeat2 (local)
> Mar  8 00:20:55 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: 
> Initiating action 5: monitor vip-rep_monitor_0 on rh63-heartbeat1
> Mar  8 00:20:55 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: 
> Initiating action 8: monitor vip-rep_monitor_0 on rh63-heartbeat2 (local)
> Mar  8 00:20:55 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: 
> Initiating action 6: probe_complete probe_complete on rh63-heartbeat2 (local) 
> - no waiting
> Mar  8 00:20:56 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: 
> Initiating action 3: probe_complete probe_complete on rh63-heartbeat1 - no 
> waiting
> Mar  8 00:20:56 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: 
> Initiating action 5: start vip-master_start_0 on rh63-heartbeat1
> Mar  8 00:20:58 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: 
> Initiating action 1: stop vip-master_stop_0 on rh63-heartbeat1
> 
> 
> Case 2) sequential=true
> (snip)
>     
>       
>         
>           
>           
>         
>       
>     
> (snip)
> [root@rh63-heartbeat2 ~]# grep "Initiating action" /var/log/ha-log
> Mar  7 23:54:44 rh63-heartbeat2 crmd: [4]: info: te_rsc_command: 
> Initiating action 2: probe_complete probe_complete on rh63-heartbeat1 - no 
> waiting
> Mar  7 23:54:44 rh63-heartbeat2 crmd: [4]: info: te_rsc_command: 
> Initiating action 3: probe_complete probe_complete on rh63-heartbeat2 (local) 
> - no waiting
> Mar  7 23:54:48 rh63-heartbeat2 crmd: [4]: info: te_rsc_command: 
> Initiating action 4: monitor vip-master_monitor_0 on rh63-heartbeat1
> Mar  7 23:54:48 rh63-heartbeat2 crmd: [4]: info: te_rsc_command: 
> Initiating action 7: monitor vip-master_monitor_0 on rh63-heartbeat2 (local)
> Mar  7 23:54:48 rh63-heartbeat2 crmd: [4]: info: te_rsc_command: 
> Initiating action 5: monitor vip-rep_monitor_0 on rh63-heartbeat1
> Mar  7 23:54:48 rh63-heartbeat2 crmd: [4]: info: te_rsc_command: 
> Initiating action 8: monitor vip-rep_monitor_0 on rh63-heartbeat2 (local)
> Mar  7 23:54:48 rh63-heartbeat2 crmd: [4]: info: te_rsc_command: 
> Initiating action 6: probe_complete probe_complete on rh63-heartbeat2 (local) 
> - no waiting
> Mar  7 23:54:49 rh63-heartbeat2 crmd: [4]: info: te_rsc_command: 
> Initiating action 3: probe_complete probe_complete on rh63-heartbeat1 - no 
> waiting
> Mar  7 23:54:49 rh63-heartbeat2 crmd: [4]: info: te_rsc_command: 
> Initiating action 5: start vip-master_start_0 on rh63-heartbeat1
> Mar  7 23:54:51 rh63-heartbeat2 crmd: [4]: info: te_rsc_command: 
> Initiating action 1: stop vip-master_stop_0 on rh63-heartbeat1
> 
> 
> Best Regards,
> Hideo Yamauchi.
> 
> 
> ___
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
> 

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Problem][crmsh]The designation of the 'ordered' attribute becomes the error.

2013-03-18 Thread renayama19661014
Hi Dejan,

> > changeset:   789:916d1b15edc3
> > user:Dejan Muhamedagic 
> > date:Thu Aug 16 17:01:24 2012 +0200
> > summary: Medium: cibconfig: drop attributes set to default on cib import

I confirmed that, with the modification you mentioned, the constraint is now 
shown correctly in crm shell syntax instead of falling back to a raw xml stanza.

* When I set sequential=true in the cib.xml file:
(snip)








(snip)

[root@rh64-heartbeat1 ~]# crm 
crm(live)# configure
crm(live)configure# show
(snip)
group testGroup01 Dummy01 Dummy02
order test-order : _rsc_set_ Dummy01 Dummy02
(snip)
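
For what it is worth, the two crm shell forms seen in this thread seem to 
correspond to the two sequential values as follows (my understanding of the crm 
shell set syntax; the parentheses mark the non-sequential members):

--
order test-order : _rsc_set_ ( Dummy01 Dummy02 )    ---> sequential="false"
order test-order : _rsc_set_ Dummy01 Dummy02        ---> sequential="true" (the default)
--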

Many Thanks!
Hideo Yamauchi.


--- On Mon, 2013/3/11, renayama19661...@ybb.ne.jp  
wrote:

> Hi Dejan,
> 
> Thank you for comment.
> 
> > sequential=true is the default. In that case it's not possible to
> > have an unequivocal representation for the same construct and, in
> > this particular case, the conversion XML->CLI->XML yields a
> > different XML. There's a later commit which helps here, I think
> > that it should be possible to backport it to 1.0:
> > 
> > changeset:   789:916d1b15edc3
> > user:        Dejan Muhamedagic 
> > date:        Thu Aug 16 17:01:24 2012 +0200
> > summary:     Medium: cibconfig: drop attributes set to default on cib import
> 
> I apply the backporting that you taught and confirm movement.
> I talk with you again if I have a problem.
> 
> > > Is there a right method to appoint an attribute of "resource_set" with 
> > > crm shell?
> > > Possibly is not "resource_set" usable with crm shell of Pacemaker1.0.13?
> > 
> > Should work. It's just that using it with two resources, well,
> > it's sort of unusual use case.
> 
> All right!
> 
> Many Thanks!
> Hideo Yamauchi.
> 
> --- On Fri, 2013/3/8, Dejan Muhamedagic  wrote:
> 
> > Hi Hideo-san,
> > 
> > On Thu, Mar 07, 2013 at 10:18:09AM +0900, renayama19661...@ybb.ne.jp wrote:
> > > Hi Dejan,
> > > 
> > > The problem was settled with your patch.
> > > 
> > > However, I have a question.
> > > I want to use "resource_set" which Mr. Andrew proposed, but do not 
> > > understand a method to use with crm shell.
> > > 
> > > I read two next cib.xml and confirmed it with crm shell.
> > > 
> > > Case 1) sequential="false". 
> > > (snip)
> > >     
> > >         
> > >                  > >id="test-order-resource_set">
> > >                         
> > >                         
> > >                 
> > >         
> > >     
> > > (snip)
> > >  * When I confirm it with crm shell ...
> > > (snip)
> > >     group master-group vip-master vip-rep
> > >     order test-order : _rsc_set_ ( vip-master vip-rep )
> > > (snip)
> > 
> > Yes. All size two resource sets get the _rsc_set_ keyword,
> > otherwise it's not possible to distinguish them from "normal"
> > constraints. Resource sets are supposed to help cases when it is
> > necessary to express relation between three or more resources.
> > Perhaps this case should be an exception.
> > 
> > > Case 2) sequential="true"
> > > (snip)
> > >     
> > >       
> > >         
> > >           
> > >           
> > >         
> > >       
> > >     
> > > (snip)
> > >  * When I confirm it with crm shell ...
> > > (snip)
> > >    group master-group vip-master vip-rep
> > >    xml  \
> > >          \
> > >                  \
> > >                  \
> > >          \
> > > 
> > > (snip)
> > > 
> > > Does the designation of "sequential=true" have to describe it in xml?
> > 
> > sequential=true is the default. In that case it's not possible to
> > have an unequivocal representation for the same construct and, in
> > this particular case, the conversion XML->CLI->XML yields a
> > different XML. There's a later commit which helps here, I think
> > that it should be possible to backport it to 1.0:
> > 
> > changeset:   789:916d1b15edc3
> > user:        Dejan Muhamedagic 
> > date:        Thu Aug 16 17:01:24 2012 +0200
> > summary:     Medium: cibconfig: drop attributes set to default on cib import
> > 
> > > Is there a right method to appoint an attribute of "resource_set" with 
> > > crm shell?
> > > Possibly is not "resource_set" usable with crm shell of Pacemaker1.0.13?
> > 
> > Should work. It's just that using it with two resources, well,
> > it's sort of unusual use case.
> > 
> > Cheers,
> > 
> > Dejan
> > 
> > > Best Regards,
> > > Hideo Yamauchi.
> > > 
> > > --- On Thu, 2013/3/7, renayama19661...@ybb.ne.jp 
> > >  wrote:
> > > 
> > > > Hi Dejan,
> > > > Hi Andrew,
> > > > 
> > > > Thank you for comment.
> > > > I confirm the movement of the patch and report it.
> > > > 
> > > > Best Regards,
> > > > Hideo Yamauchi.
> > > > 
> > > > --- On Wed, 2013/3/6, Dejan Muhamedagic  wrote:
> > > > 
> > > > > Hi Hideo-san,
> > > > > 
> > > > > On Wed, Mar 06, 2013 at 10:37:44AM +0900, renayama19661...@ybb.ne.jp 
> > > > > wrote:
> > > > > > Hi Dejan,
> > > > > > Hi Andrew,
> > > > > > 
> > > > > > As for the crm shell, the check of the meta attribute was r

Re: [Pacemaker] [Question]About "sequential" designation of resource_set.

2013-03-21 Thread renayama19661014
Hi Andrew,

I registered this question with Bugzilla.

 * http://bugs.clusterlabs.org/show_bug.cgi?id=5147

Best Regards,
Hideo Yamauchi.

--- On Thu, 2013/3/14, renayama19661...@ybb.ne.jp  
wrote:

> Hi Andrew,
> 
> > In "colocation", I used "resource_set".
> > However, a result did not include the change.
> 
> Please, about the result that I tried, give me comment.
> 
> Best Regards,
> Hideo Yamauchi.
> 
> --- On Thu, 2013/3/7, renayama19661...@ybb.ne.jp  
> wrote:
> 
> > Hi Andrew,
> > 
> > > > You use the resource sets _instead_ of a group.
> > > > If you want group.ordered=false, then use a colocation set (with
> > > > sequential=true).
> > 
> > In "colocation", I used "resource_set".
> > However, a result did not include the change.
> > 
> > Will this result be a mistake of my setting?
> > 
> > Case 1) sequential=false
> > (snip)
> >     
> >       
> >         
> >           
> >           
> >         
> >       
> >     
> > (sip)
> > [root@rh63-heartbeat2 ~]# grep "Initiating action" /var/log/ha-log
> > Mar  8 00:20:52 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: 
> > Initiating action 2: probe_complete probe_complete on rh63-heartbeat1 - no 
> > waiting
> > Mar  8 00:20:52 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: 
> > Initiating action 3: probe_complete probe_complete on rh63-heartbeat2 
> > (local) - no waiting
> > Mar  8 00:20:55 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: 
> > Initiating action 4: monitor vip-master_monitor_0 on rh63-heartbeat1
> > Mar  8 00:20:55 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: 
> > Initiating action 7: monitor vip-master_monitor_0 on rh63-heartbeat2 (local)
> > Mar  8 00:20:55 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: 
> > Initiating action 5: monitor vip-rep_monitor_0 on rh63-heartbeat1
> > Mar  8 00:20:55 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: 
> > Initiating action 8: monitor vip-rep_monitor_0 on rh63-heartbeat2 (local)
> > Mar  8 00:20:55 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: 
> > Initiating action 6: probe_complete probe_complete on rh63-heartbeat2 
> > (local) - no waiting
> > Mar  8 00:20:56 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: 
> > Initiating action 3: probe_complete probe_complete on rh63-heartbeat1 - no 
> > waiting
> > Mar  8 00:20:56 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: 
> > Initiating action 5: start vip-master_start_0 on rh63-heartbeat1
> > Mar  8 00:20:58 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: 
> > Initiating action 1: stop vip-master_stop_0 on rh63-heartbeat1
> > 
> > 
> > Case 2) sequential=true
> > (snip)
> >     
> >       
> >         
> >           
> >           
> >         
> >       
> >     
> > (snip)
> > [root@rh63-heartbeat2 ~]# grep "Initiating action" /var/log/ha-log
> > Mar  7 23:54:44 rh63-heartbeat2 crmd: [4]: info: te_rsc_command: 
> > Initiating action 2: probe_complete probe_complete on rh63-heartbeat1 - no 
> > waiting
> > Mar  7 23:54:44 rh63-heartbeat2 crmd: [4]: info: te_rsc_command: 
> > Initiating action 3: probe_complete probe_complete on rh63-heartbeat2 
> > (local) - no waiting
> > Mar  7 23:54:48 rh63-heartbeat2 crmd: [4]: info: te_rsc_command: 
> > Initiating action 4: monitor vip-master_monitor_0 on rh63-heartbeat1
> > Mar  7 23:54:48 rh63-heartbeat2 crmd: [4]: info: te_rsc_command: 
> > Initiating action 7: monitor vip-master_monitor_0 on rh63-heartbeat2 (local)
> > Mar  7 23:54:48 rh63-heartbeat2 crmd: [4]: info: te_rsc_command: 
> > Initiating action 5: monitor vip-rep_monitor_0 on rh63-heartbeat1
> > Mar  7 23:54:48 rh63-heartbeat2 crmd: [4]: info: te_rsc_command: 
> > Initiating action 8: monitor vip-rep_monitor_0 on rh63-heartbeat2 (local)
> > Mar  7 23:54:48 rh63-heartbeat2 crmd: [4]: info: te_rsc_command: 
> > Initiating action 6: probe_complete probe_complete on rh63-heartbeat2 
> > (local) - no waiting
> > Mar  7 23:54:49 rh63-heartbeat2 crmd: [4]: info: te_rsc_command: 
> > Initiating action 3: probe_complete probe_complete on rh63-heartbeat1 - no 
> > waiting
> > Mar  7 23:54:49 rh63-heartbeat2 crmd: [4]: info: te_rsc_command: 
> > Initiating action 5: start vip-master_start_0 on rh63-heartbeat1
> > Mar  7 23:54:51 rh63-heartbeat2 crmd: [4]: info: te_rsc_command: 
> > Initiating action 1: stop vip-master_stop_0 on rh63-heartbeat1
> > 
> > 
> > Best Regards,
> > Hideo Yamauchi.
> > 
> > 
> > ___
> > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> > 
> > Project Home: http://www.clusterlabs.org
> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > Bugs: http://bugs.clusterlabs.org
> > 
> 
> ___
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> 
> Project Home: http://www.clusterlabs.org
> Getting starte

Re: [Pacemaker] [Question]About "sequential" designation of resource_set.

2013-03-21 Thread renayama19661014
Hi Andrew,

Thank you for your comments.
 
> > > You use the resource sets _instead_ of a group.
> > > If you want group.ordered=false, then use a colocation set (with
> > > sequential=true).
> 
> In "colocation", I used "resource_set".
> However, a result did not include the change.
> Will this result be a mistake of my setting?
> 
> Case 1) sequential=false
> (snip)
> 
>   
> 
>   
>   
> 
>   
> 
> 
> What are trying to achieve with this?  It doesn't do anything because there 
> is nothing to collocate master or rep with.
> The only value here is to show that rep would not be stopped when master is. 

However, you made the following reply earlier, so I used a colocation set as a 
substitute for ordered=false.

>>You use the resource sets _instead_ of a group. 
>>If you want group.ordered=false, then use a colocation set (with 
>>sequential=true). 
>>If you want group.colocated=false, then use an ordering set (with 
>>sequential=true). 

So, in the end, is it right that the substitute for ordered=false of a group is an 
order_set?
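
For reference, a minimal sketch of the two set forms being contrasted in this 
thread, assuming only vip-master and vip-rep are involved (ids and the INFINITY 
score are illustrative, not taken from the original configuration):

--
<!-- keep colocation, drop ordering: the suggested substitute for group ordered="false" -->
<rsc_colocation id="col-master-rep" score="INFINITY">
  <resource_set id="col-master-rep-set" sequential="true">
    <resource_ref id="vip-master"/>
    <resource_ref id="vip-rep"/>
  </resource_set>
</rsc_colocation>

<!-- keep ordering, drop colocation: the suggested substitute for group colocated="false" -->
<rsc_order id="ord-master-rep">
  <resource_set id="ord-master-rep-set" sequential="true">
    <resource_ref id="vip-master"/>
    <resource_ref id="vip-rep"/>
  </resource_set>
</rsc_order>
--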

Best Regards,
Hideo Yamauchi.



--- On Fri, 2013/3/22, Andrew Beekhof  wrote:

> 
> 
> On Thursday, March 7, 2013,   wrote:
> Hi Andrew,
> 
> > > You use the resource sets _instead_ of a group.
> > > If you want group.ordered=false, then use a colocation set (with
> > > sequential=true).
> 
> In "colocation", I used "resource_set".
> However, a result did not include the change.
> Will this result be a mistake of my setting?
> 
> Case 1) sequential=false
> (snip)
>     
>       
>         
>           
>           
>         
>       
>     
> 
> What are trying to achieve with this?  It doesn't do anything because there 
> is nothing to collocate master or rep with.
> The only value here is to show that rep would not be stopped when master is. 
>  (sip)
> [root@rh63-heartbeat2 ~]# grep "Initiating action" /var/log/ha-log
> Mar  8 00:20:52 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: 
> Initiating action 2: probe_complete probe_complete on rh63-heartbeat1 - no 
> waiting
> Mar  8 00:20:52 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: 
> Initiating action 3: probe_complete probe_complete on rh63-heartbeat2 (local) 
> - no waiting
> Mar  8 00:20:55 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: 
> Initiating action 4: monitor vip-master_monitor_0 on rh63-heartbeat1
> Mar  8 00:20:55 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: 
> Initiating action 7: monitor vip-master_monitor_0 on rh63-heartbeat2 (local)
> Mar  8 00:20:55 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: 
> Initiating action 5: monitor vip-rep_monitor_0 on rh63-heartbeat1
> Mar  8 00:20:55 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: 
> Initiating action 8: monitor vip-rep_monitor_0 on rh63-heartbeat2 (local)
> Mar  8 00:20:55 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: 
> Initiating action 6: probe_complete probe_complete on rh63-heartbeat2 (local) 
> - no waiting
> Mar  8 00:20:56 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: 
> Initiating action 3: probe_complete probe_complete on rh63-heartbeat1 - no 
> waiting
> Mar  8 00:20:56 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: 
> Initiating action 5: start vip-master_start_0 on rh63-heartbeat1
> Mar  8 00:20:58 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: 
> Initiating action 1: stop vip-master_stop_0 on rh63-heartbeat1
> 
> 
> Case 2) sequential=true
> (snip)
>     
>       
>         
>           
>           
>         
>       
>     
> (snip)
> [root@rh63-heartbeat2 ~]# grep "Initiating action" /var/log/ha-log
> Mar  7 23:54:44 rh63-heartbeat2 crmd: [4]: info: te_rsc_command: 
> Initiating action 2: probe_complete probe_complete on rh63-heartbeat1 - no 
> waiting
> Mar  7 23:54:44 rh63-heartbeat2 crmd: [4]: info: te_rsc_command: 
> Initiating action 3: probe_complete probe_complete on rh63-heartbeat2 (local) 
> - no waiting
> Mar  7 23:54:48 rh63-heartbeat2 crmd: [4]: info: te_rsc_command: 
> Initiating action 4: monitor vip-master_monitor_0 on rh63-heartbeat1
> Mar  7 23:54:48 rh63-heartbeat2 crmd: [4]: info: te_rsc_command: 
> Initiating action 7: monitor vip-master_monitor_0 on rh63-heartbeat2 (local)
> Mar  7 23:54:48 rh63-heartbeat2 crmd: [4]: info: te_rsc_command: 
> Initiating action 5: monitor vip-rep_monitor_0 on rh63-heartbeat1
> Mar  7 23:54:48 rh63-heartbeat2 crmd: [4]: info: te_rsc_command: 
> Initiating action 8: monitor vip-rep_monitor_0 on rh63-heartbeat2 (local)
> Mar  7 23:54:48 rh63-heartbeat2 crmd: [4]: info: te_rsc_command: 
> Initiating action 6: probe_complete probe_complete on rh63-heartbeat2 (local) 
> - no waiting
> Mar  7 23:54:49 rh63-heartbeat2 crmd: [4]: info: te_rsc_command: 
> Initiating action 3: probe_complete probe_complete on rh63-heartbeat1 - no 
> waiting
> Mar  7 23:54:49 rh63-heartbeat2 crmd: [4]: info: te_rsc_command: 
> Initiating action 5: start vip-master_start_0 on rh63-heartbeat1
> Mar  7 23:

Re: [Pacemaker] [Question]About "sequential" designation of resource_set.

2013-03-21 Thread renayama19661014
Hi Andrew,

Thank you for your comment.

> Sorry, I'm not sure I understand the question.

Sorry

When we use a resource_set as a substitute for the ordered attribute of a group 
resource, do we use a "colocation set"?
Or do we use an "ordering set"?

If the "ordering set" is the right one to use, it does not seem to behave the same 
as "group ordered=false".

Best Regards,
Hideo Yamauchi.

--- On Fri, 2013/3/22, Andrew Beekhof  wrote:

> On Fri, Mar 22, 2013 at 12:34 PM,   wrote:
> > Hi Andrew,
> >
> > Thank your for comments.
> >
> >> > > You use the resource sets _instead_ of a group.
> >> > > If you want group.ordered=false, then use a colocation set (with
> >> > > sequential=true).
> >>
> >> In "colocation", I used "resource_set".
> >> However, a result did not include the change.
> >> Will this result be a mistake of my setting?
> >>
> >> Case 1) sequential=false
> >> (snip)
> >>     
> >>       
> >>         
> >>           
> >>           
> >>         
> >>       
> >>     
> >>
> >> What are trying to achieve with this?  It doesn't do anything because 
> >> there is nothing to collocate master or rep with.
> >> The only value here is to show that rep would not be stopped when master 
> >> is.
> >
> > However, you made next reply.
> > I used colocation_set in substitution for ordered=false.
> >
> >>>You use the resource sets _instead_ of a group.
> >>>If you want group.ordered=false, then use a colocation set (with
> >>>sequential=true).
> >>>If you want group.colocated=false, then use an ordering set (with
> >>>sequential=true).
> >
> > After all is it right that the substitute for ordered=false of group sets 
> > order_set?
> 
> Sorry, I'm not sure I understand the question.
> 
> >
> > Best Regards,
> > Hideo Yamauchi.
> >
> >
> >
> > --- On Fri, 2013/3/22, Andrew Beekhof  wrote:
> >
> >>
> >>
> >> On Thursday, March 7, 2013,   wrote:
> >> Hi Andrew,
> >>
> >> > > You use the resource sets _instead_ of a group.
> >> > > If you want group.ordered=false, then use a colocation set (with
> >> > > sequential=true).
> >>
> >> In "colocation", I used "resource_set".
> >> However, a result did not include the change.
> >> Will this result be a mistake of my setting?
> >>
> >> Case 1) sequential=false
> >> (snip)
> >>     
> >>       
> >>         
> >>           
> >>           
> >>         
> >>       
> >>     
> >>
> >> What are trying to achieve with this?  It doesn't do anything because 
> >> there is nothing to collocate master or rep with.
> >> The only value here is to show that rep would not be stopped when master 
> >> is.
> >>  (sip)
> >> [root@rh63-heartbeat2 ~]# grep "Initiating action" /var/log/ha-log
> >> Mar  8 00:20:52 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: 
> >> Initiating action 2: probe_complete probe_complete on rh63-heartbeat1 - no 
> >> waiting
> >> Mar  8 00:20:52 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: 
> >> Initiating action 3: probe_complete probe_complete on rh63-heartbeat2 
> >> (local) - no waiting
> >> Mar  8 00:20:55 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: 
> >> Initiating action 4: monitor vip-master_monitor_0 on rh63-heartbeat1
> >> Mar  8 00:20:55 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: 
> >> Initiating action 7: monitor vip-master_monitor_0 on rh63-heartbeat2 
> >> (local)
> >> Mar  8 00:20:55 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: 
> >> Initiating action 5: monitor vip-rep_monitor_0 on rh63-heartbeat1
> >> Mar  8 00:20:55 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: 
> >> Initiating action 8: monitor vip-rep_monitor_0 on rh63-heartbeat2 (local)
> >> Mar  8 00:20:55 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: 
> >> Initiating action 6: probe_complete probe_complete on rh63-heartbeat2 
> >> (local) - no waiting
> >> Mar  8 00:20:56 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: 
> >> Initiating action 3: probe_complete probe_complete on rh63-heartbeat1 - no 
> >> waiting
> >> Mar  8 00:20:56 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: 
> >> Initiating action 5: start vip-master_start_0 on rh63-heartbeat1
> >> Mar  8 00:20:58 rh63-heartbeat2 crmd: [22372]: info: te_rsc_command: 
> >> Initiating action 1: stop vip-master_stop_0 on rh63-heartbeat1
> >>
> >>
> >> Case 2) sequential=true
> >> (snip)
> >>     
> >>       
> >>         
> >>           
> >>           
> >>         
> >>       
> >>     
> >> (snip)
> >> [root@rh63-heartbeat2 ~]# grep "Initiating action" /var/log/ha-log
> >> Mar  7 23:54:44 rh63-heartbeat2 crmd: [4]: info: te_rsc_command: 
> >> Initiating action 2: probe_complete probe_complete on rh63-heartbeat1 - no 
> >> waiting
> >> Mar  7 23:54:44 rh63-heartbeat2 crmd: [4]: info: te_rsc_command: 
> >> Initiating action 3: probe_complete probe_complete on rh63-heartbeat2 
> >> (local) - no waiting
> >> Mar  7 23:54:48 rh63-heartbeat2 crmd: [4]: info: te_rsc_command: 
> >> Initiating action 4: monitor vip-master_monitor_0 on rh63-heartbeat1
> >> Mar  7 23:54:48 rh63-heart
