Re: [Linux-ha-dev] OCF RA for named

2011-08-05 Thread Serge Dubrouski
No interest?

On Tue, Jul 12, 2011 at 3:50 PM, Serge Dubrouski serge...@gmail.com wrote:

 Hello -

 I've created an OCF RA for the named (BIND) server. There is an existing one in
 the redhat directory, but I don't like how it does monitoring, and I doubt that
 it can work with Pacemaker. So please review the attached RA and see if it can
 be included in the project.


 --
 Serge Dubrouski.




-- 
Serge Dubrouski.
___
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/


[Linux-HA] Antw: Re: [ha-wg-technical] The mess with OCF_CHECK_LEVEL (crm aborts during commit)

2011-08-05 Thread Ulrich Windl
 Dejan Muhamedagic de...@suse.de wrote on 04.08.2011 at 18:32 in message
20110804163245.GA28585@rondo.homenet:
 Hi,
 
 On Thu, Aug 04, 2011 at 05:45:16PM +0200, Ulrich Windl wrote:
  Hi!
  
  Some RAs support OCF_CHECK_LEVEL (e.g. ocf:heartbeat:Raid1). However the 
 OCF_CHECK_LEVEL is not advertised in the metadata. Also, OCF_CHECK_LEVEL is 
 not a global parameter (wouldn't make much sense).
  
  So obviously using the crm_gui one can add OCF_CHECK_LEVEL for some 
 resource, and that seems to work.
  
  So far, so good. Now I tried to add more resources without an 
 OCF_CHECK_LEVEL using the crm command line. I added the new resources to a 
 group that contained resources using OCF_CHECK_LEVEL.
 
 OCF_CHECK_LEVEL is to be defined on a per-monitor basis, like
 this:
 
 primitive ...
   op monitor OCF_CHECK_LEVEL=10 interval=...

[...]

So, is a configuration like the following incorrect?

primitive prm_c11_as_1_raid1 ocf:heartbeat:Raid1 \
        params raidconf="/etc/mdadm/mdadm.conf" raiddev="/dev/md15" OCF_CHECK_LEVEL="1" \
        operations $id="prm_c11_as_1_raid1-operations" \
        op start interval="0" timeout="20s" \
        op stop interval="0" timeout="20s" \
        op monitor interval="60" timeout="60s"


Ulrich
P.S. Moving the issue to the linux-ha list as requested.

___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] Antw: Re: location and orders : Question about a behavior ...

2011-08-05 Thread Ulrich Windl
 Maloja01 maloj...@arcor.de wrote on 04.08.2011 at 18:49 in message
4e3acd86.1020...@arcor.de:
 Hi Ulrich,
 
 I did not follow the complete thread, just jumped in - sorry. Is the
 resource inside a resource group? In that case the stickiness is
 multiplied, and so the stickiness could be greater than the location
 rule (score).

Hi!

Yes, a group with about 20 resources has a resource-stickiness=10 and a 
location loc_grp_cbw grp_cbw 50: node. As the group is somewhat 
indivisible, assigning varying stickinesses to individual resources just makes 
things unreadable and complicated. I feel that a group stickiness should 
override individual resource stickinesses, and not be used as a default 
stickiness for every resource in the group.

Regards,
Ulrich

 
 Regards
 Fabian
 
 On 08/04/2011 03:10 PM, Ulrich Windl wrote:
 Maloja01 maloj...@arcor.de wrote on 04.08.2011 at 12:58 in message
  4e3a7b5c.1030...@arcor.de:
  On 08/04/2011 08:28 AM, Ulrich Windl wrote:
  Hi!
 
  Isn't the stickiness effectively based on the failcount? We have one
  resource
  that has a location constraint for one node with a weight of 50 and a
  stickiness of 10. The resource runs on a different node and shows no
  tendency to move back (not even after restarts).
 
  No, stickiness has nothing to do with the failcount. The policy engine
  takes both into account: the stickiness (for RUNNING resources) and
  the failcount (for RUNNING or non-running resources).
 
  If you ever had an on-start failure of a resource on a node, the failcount
  is set to infinity, which means the resource cannot be started on
  that node.
 
  fabian,
 
  I know that, and the errors were removed by crm_resource -C. Still the 
 resource is happy where it is, and doesn't want to move away.
 
 
  If the policy engine needs to evaluate where to run a resource it uses
  the location/anti-colocation/colocation constraints, failcounts,
  stickiness and maybe some other scores to evaluate WHERE to run a resource.
 
  So in my opinion the stickiness does exactly what you are asking for.
 
  Unfortunately someone did a manual migrate yesterday, so I cannot show the 
 scores that led to the problem.
 
  Regards,
  Ulrich
 
 
  ___
  Linux-HA mailing list
  Linux-HA@lists.linux-ha.org 
  http://lists.linux-ha.org/mailman/listinfo/linux-ha 
  See also: http://linux-ha.org/ReportingProblems 
 
 ___
 Linux-HA mailing list
 Linux-HA@lists.linux-ha.org 
 http://lists.linux-ha.org/mailman/listinfo/linux-ha 
 See also: http://linux-ha.org/ReportingProblems 
 

 
 

___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] Antw: Re: [ha-wg-technical] The mess with OCF_CHECK_LEVEL (crm aborts during commit)

2011-08-05 Thread Dejan Muhamedagic
Hi,

On Fri, Aug 05, 2011 at 08:23:43AM +0200, Ulrich Windl wrote:
  Dejan Muhamedagic de...@suse.de wrote on 04.08.2011 at 18:32 in message
 20110804163245.GA28585@rondo.homenet:
  Hi,
  
  On Thu, Aug 04, 2011 at 05:45:16PM +0200, Ulrich Windl wrote:
   Hi!
   
   Some RAs support OCF_CHECK_LEVEL (e.g. ocf:heartbeat:Raid1). However the 
  OCF_CHECK_LEVEL is not advertised in the metadata. Also, OCF_CHECK_LEVEL is 
  not a global parameter (wouldn't make much sense).
   
   So obviously using the crm_gui one can add OCF_CHECK_LEVEL for some 
  resource, and that seems to work.
   
   So far, so good. Now I tried to add more resources without an 
  OCF_CHECK_LEVEL using the crm command line. I added the new resources to a 
  group that contained resources using OCF_CHECK_LEVEL.
  
  OCF_CHECK_LEVEL is to be defined on a per-monitor basis, like
  this:
  
  primitive ...
  op monitor OCF_CHECK_LEVEL=10 interval=...
 
 [...]
 
 So, is a configuration like the following incorrect?
 
  primitive prm_c11_as_1_raid1 ocf:heartbeat:Raid1 \
          params raidconf="/etc/mdadm/mdadm.conf" raiddev="/dev/md15" OCF_CHECK_LEVEL="1" \
          operations $id="prm_c11_as_1_raid1-operations" \
          op start interval="0" timeout="20s" \
          op stop interval="0" timeout="20s" \
          op monitor interval="60" timeout="60s"

Yes. See an example here:

http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/s-operation-monitor-multiple.html

Though it's XML, you can see that OCF_CHECK_LEVEL is defined
within a monitor operation.
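
In crm shell syntax that would look roughly like this (a sketch based on the
primitive quoted above; the timeout values are only illustrative):

primitive prm_c11_as_1_raid1 ocf:heartbeat:Raid1 \
        params raidconf="/etc/mdadm/mdadm.conf" raiddev="/dev/md15" \
        operations $id="prm_c11_as_1_raid1-operations" \
        op start interval="0" timeout="20s" \
        op stop interval="0" timeout="20s" \
        op monitor interval="60" timeout="60s" OCF_CHECK_LEVEL="1"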

 Ulrich
 P.S. Moving the issue to the linux-ha list as requested.

Thanks,

Dejan
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


[Linux-HA] About OCF RA exportfs

2011-08-05 Thread alain . moulle
Hi,

I checked all the threads about HA NFS active/active, and I understood that 
the solution was to have a periodic backup of rmtab into a .rmtab kept locally 
in the shared FS, as is effectively done in the OCF RA exportfs delivered in 
resource-agents-3.0.12-15.

I just wonder whether this is still the right solution for HA-NFS 
active/active, and whether there is a newer version of this exportfs OCF RA 
somewhere?

The thing I don't get in this exportfs OCF RA script is that the monitor 
function greps for OCF_RESKEY_directory in the rmtab and fails if it does not 
find it in the file; but it seems that unless at least one NFS client mounts 
the directory via NFS, there is no chance for this directory to appear in the 
rmtab file ... so, once the resource is started, the first monitor fails.
Where am I wrong?
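
For illustration, the check described above would look roughly like this (a
sketch only, not the actual exportfs RA code; the rmtab path and line format
are assumptions):

# sketch of the described monitor logic (NOT the real RA code)
# /var/lib/nfs/rmtab holds "client:directory:counter" lines, and an export
# only appears there after at least one client has actually mounted it
if grep -q ":${OCF_RESKEY_directory}:" /var/lib/nfs/rmtab; then
    exit $OCF_SUCCESS       # a client entry exists for this export
else
    exit $OCF_NOT_RUNNING   # no client has mounted yet, so the check fails
fi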

Thanks
Alain Moullé
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] Antw: Re: [ha-wg-technical] The mess with OCF_CHECK_LEVEL (crm aborts during commit)

2011-08-05 Thread Ulrich Windl
 Dejan Muhamedagic de...@suse.de wrote on 05.08.2011 at 08:39 in message
20110805063900.GB31749@rondo.homenet:
 Hi,
 
 On Fri, Aug 05, 2011 at 08:23:43AM +0200, Ulrich Windl wrote:
   Dejan Muhamedagic de...@suse.de wrote on 04.08.2011 at 18:32 in message
   20110804163245.GA28585@rondo.homenet:
   Hi,
   
   On Thu, Aug 04, 2011 at 05:45:16PM +0200, Ulrich Windl wrote:
Hi!

Some RAs support OCF_CHECK_LEVEL (e.g. ocf:heartbeat:Raid1). However 
the 
   OCF_CHECK_LEVEL is not advertised in the metadata. Also, OCF_CHECK_LEVEL 
 is 
   not a global parameter (wouldn't make much sense).

So obviously using the crm_gui one can add OCF_CHECK_LEVEL for some 
   resource, and that seems to work.

So far, so good. Now I tried to add more resources without an 
   OCF_CHECK_LEVEL using the crm command line. I added the new resources to 
   a 
 
   group that contained resources using OCF_CHECK_LEVEL.
   
   OCF_CHECK_LEVEL is to be defined on a per-monitor basis, like
   this:
   
   primitive ...
 op monitor OCF_CHECK_LEVEL=10 interval=...
  
  [...]
  
  So, is a configuration like the following incorrect?
  
  primitive prm_c11_as_1_raid1 ocf:heartbeat:Raid1 \
          params raidconf="/etc/mdadm/mdadm.conf" raiddev="/dev/md15" OCF_CHECK_LEVEL="1" \
          operations $id="prm_c11_as_1_raid1-operations" \
          op start interval="0" timeout="20s" \
          op stop interval="0" timeout="20s" \
          op monitor interval="60" timeout="60s"
 
 Yes. See an example here:
 
  http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/s-operation-monitor-multiple.html
 
 Though it's XML, you can see that OCF_CHECK_LEVEL is defined
 within a monitor operation.

Amazingly crm_verify -LV does not report any problem however.

Regards,
Ulrich


___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] Antw: Re: location and orders : Question about a behavior ...

2011-08-05 Thread Maloja01
On 08/05/2011 08:30 AM, Ulrich Windl wrote:
 Maloja01 maloj...@arcor.de wrote on 04.08.2011 at 18:49 in message
 4e3acd86.1020...@arcor.de:
 Hi Ulrich,

 I did not follow the complete thread, just jumped in - sorry. Is the
 resource inside a resource group? In that case the stickiness is
 multiplied, and so the stickiness could be greater than the location
 rule (score).
 
 Hi!
 
 Yes, a group with about 20 resources has a resource-stickiness=10 

In this case - if I remember correctly - the score for a RUNNING
group is 20*10 = 200 > 50.

Can you describe your problem, what are you missing?

a) You want a RUNNING group NOT to fall back - stickiness should do that
here: 2M (active node) > 500K (preferred node) [if active node != preferred
node ;-)]
b) You want a STOPPED group to be placed on a specific node (to have an
ordered administration at least at the start point) - a location score should
help here: 500K (preferred node) > 0 (not preferred node)
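
As a rough crm shell sketch of those two knobs (grp_cbw is the group named in
this thread, while node1 and the score values are assumptions; the exact shell
syntax may vary by version):

# sketch only -- names and values partly assumed
# (b) place the group on its preferred node via a high location score:
crm configure location loc_grp_cbw grp_cbw 500000: node1
# (a) avoid fallback of the running group: set a high stickiness once, as a
#     meta attribute on the group (it is inherited by the group's members):
crm resource meta grp_cbw set resource-stickiness 2000000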

I miss the point where you argued that stickiness is not implemented as you
expected it to be. Could you explain what is missing or wrong? Maybe we can
try it as a state description: status-before (e.g. group on node1), change in
the cluster (either event- or admin-based), and status-after (here both the
currently implemented behaviour and the one you expected).

Kind regards
Fabian

and a location loc_grp_cbw grp_cbw 50: node. As the group is
somewhat indivisible, assigning varying stickinesses to individual
resources just makes things unreadable and complicated. I feel that a
group stickiness should override individual resource stickinesses, and
not be used as a default stickiness for every resource in the group.
 
 Regards,
 Ulrich
 

 Regards
 Fabian

 On 08/04/2011 03:10 PM, Ulrich Windl wrote:
  Maloja01 maloj...@arcor.de wrote on 04.08.2011 at 12:58 in message
  4e3a7b5c.1030...@arcor.de:
 On 08/04/2011 08:28 AM, Ulrich Windl wrote:
 Hi!

  Isn't the stickiness effectively based on the failcount? We have one
  resource
  that has a location constraint for one node with a weight of 50 and a
  stickiness of 10. The resource runs on a different node and shows no
  tendency to move back (not even after restarts).

  No, stickiness has nothing to do with the failcount. The policy engine
  takes both into account: the stickiness (for RUNNING resources) and
  the failcount (for RUNNING or non-running resources).

  If you ever had an on-start failure of a resource on a node, the failcount
  is set to infinity, which means the resource cannot be started on
  that node.

 fabian,

 I know that, and the errors were removed by crm_resource -C. Still the 
 resource is happy where it is, and doesn't want to move away.


  If the policy engine needs to evaluate where to run a resource it uses
  the location/anti-colocation/colocation constraints, failcounts,
  stickiness and maybe some other scores to evaluate WHERE to run a resource.

  So in my opinion the stickiness does exactly what you are asking for.

 Unfortunately someone did a manual migrate yesterday, so I cannot show the 
 scores that lead to the problem.

 Regards,
 Ulrich


 ___
 Linux-HA mailing list
 Linux-HA@lists.linux-ha.org 
 http://lists.linux-ha.org/mailman/listinfo/linux-ha 
 See also: http://linux-ha.org/ReportingProblems 

 ___
 Linux-HA mailing list
 Linux-HA@lists.linux-ha.org 
 http://lists.linux-ha.org/mailman/listinfo/linux-ha 
 See also: http://linux-ha.org/ReportingProblems 

 
  
  
 
 ___
 Linux-HA mailing list
 Linux-HA@lists.linux-ha.org
 http://lists.linux-ha.org/mailman/listinfo/linux-ha
 See also: http://linux-ha.org/ReportingProblems

___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] Antw: Re: location and orders : Question about a behavior ...

2011-08-05 Thread alain . moulle
Hi,
I was the guy who initiated this thread with a simple question, but
the thread has been re-oriented with other similar questions ...
so I don't know who is answering whom ... please Fabian,
if you could just reopen my first msg in this thread, that would be nice
for me ...
Thanks a lot anyway.
Alain



From:    Maloja01 maloj...@arcor.de
To:      linux-ha@lists.linux-ha.org
Date:    05/08/2011 11:02
Subject: Re: [Linux-HA] Antw: Re: location and orders : Question about a 
behavior ...
Sent by: linux-ha-boun...@lists.linux-ha.org



On 08/05/2011 08:30 AM, Ulrich Windl wrote:
 Maloja01 maloj...@arcor.de wrote on 04.08.2011 at 18:49 in message
 4e3acd86.1020...@arcor.de:
 Hi Ulrich,

 I did not follow the complete thread, just jumped in - sorry. Is the
 resource inside a resource group? In that case the stickiness is
 multiplied, and so the stickiness could be greater than the location
 rule (score).
 
 Hi!
 
 Yes, a group with about 20 resources has a resource-stickiness=10 

In this case - if I remember correctly - the score for a RUNNING
group is 20*10 = 200 > 50.

Can you describe your problem, what are you missing?

a) You want a RUNNING group NOT to fall back - stickiness should do that
here: 2M (active node) > 500K (preferred node) [if active node != preferred
node ;-)]
b) You want a STOPPED group to be placed on a specific node (to have an
ordered administration at least at the start point) - a location score should
help here: 500K (preferred node) > 0 (not preferred node)

I miss the point where you argued that stickiness is not implemented as you
expected it to be. Could you explain what is missing or wrong? Maybe we can
try it as a state description: status-before (e.g. group on node1), change in
the cluster (either event- or admin-based), and status-after (here both the
currently implemented behaviour and the one you expected).

Kind regards
Fabian

and a location loc_grp_cbw grp_cbw 50: node. As the group is
somewhat indivisible, assigning varying stickinesses to individual
resources just makes things unreadable and complicated. I feel that a
group stickyness should override individual resource stickynesses, and
not be used a a default stickyness for every resource in the group.
 
 Regards,
 Ulrich
 

 Regards
 Fabian

 On 08/04/2011 03:10 PM, Ulrich Windl wrote:
 Maloja01 maloj...@arcor.de wrote on 04.08.2011 at 12:58 in message
 4e3a7b5c.1030...@arcor.de:
 On 08/04/2011 08:28 AM, Ulrich Windl wrote:
 Hi!

 Isn't the stickyness effectively based on the failcount? We have one
 resource
 that has a location constraint for one node with a weight of 50 
and a
 sticky ness of 10. The resource runs on a different node and 
shows no
 tendency of moving back (not even after restarts).

 No stickiness has nothing to do with the failcount. The policy engine
 could take both into account the stickiness (for RUNNING resources) 
and
 the failcount for (RUNNING or non-running ressources).

 If you ever had a on-start-failure of a resource on a node the 
failcount
 is set to infinity which means, the resource could not be started at
 this node.

 fabian,

 I know that, and the errors were removed by crm_resource -C. Still 
the 
 resource is happy where it is, and doesn't want to move away.


 If the policy engine needs to evaluate where to run a resource it 
uses
 the location/antcolocation/cololaction constraints, failcounts,
 stickiness and maybe some other scores to evaluate WHERE to run a 
resource.

 So in my opinion the stiness does exactly what you are asking for.

 Unfortunately someone did a manual migrate yesterday, so I cannot show 
the 
 scores that lead to the problem.

 Regards,
 Ulrich


 ___
 Linux-HA mailing list
 Linux-HA@lists.linux-ha.org 
 http://lists.linux-ha.org/mailman/listinfo/linux-ha 
 See also: http://linux-ha.org/ReportingProblems 

 ___
 Linux-HA mailing list
 Linux-HA@lists.linux-ha.org 
 http://lists.linux-ha.org/mailman/listinfo/linux-ha 
 See also: http://linux-ha.org/ReportingProblems 

 
 
 
 
 ___
 Linux-HA mailing list
 Linux-HA@lists.linux-ha.org
 http://lists.linux-ha.org/mailman/listinfo/linux-ha
 See also: http://linux-ha.org/ReportingProblems

___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems

___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] Antw: Re: location and orders : Question about a behavior ...

2011-08-05 Thread Maloja01
On 08/05/2011 11:26 AM, alain.mou...@bull.net wrote:
 Hi,
 I was the guy who initiated this thread with a simple question, but
 the thread has been re-oriented with other similar questions ...
 so I don't know who is answering whom ... please Fabian,
 if you could just reopen my first msg in this thread, that would be nice
 for me ...

Yes you are right - so I will rewind the thread beginning from message
1 :)

 Thanks a lot anyway.
 Alain
 
 
 
 From:    Maloja01 maloj...@arcor.de
 To:      linux-ha@lists.linux-ha.org
 Date:    05/08/2011 11:02
 Subject: Re: [Linux-HA] Antw: Re: location and orders : Question about a 
 behavior ...
 Sent by: linux-ha-boun...@lists.linux-ha.org
 
 
 
 On 08/05/2011 08:30 AM, Ulrich Windl wrote:
 Maloja01 maloj...@arcor.de wrote on 04.08.2011 at 18:49 in message
 4e3acd86.1020...@arcor.de:
 Hi Ulrich,

 I did not follow the complete thread, just jumped in - sorry. Is the
 resource inside a resource group? In that case the stickiness is
 multiplied, and so the stickiness could be greater than the location
 rule (score).

 Hi!

 Yes, a group with about 20 resources has a resource-stickiness=10 
 
 In this case - if I remember correctly - the score for a RUNNING
 group is 20*10 = 200 > 50.
 
 Can you describe your problem, what are you missing?
 
 a) You want a RUNNING group NOT to fall back - stickiness should do that
 here: 2M (active node) > 500K (preferred node) [if active node != preferred
 node ;-)]
 b) You want a STOPPED group to be placed on a specific node (to have an
 ordered administration at least at the start point) - a location score should
 help here: 500K (preferred node) > 0 (not preferred node)
 
 I miss the point where you argued that stickiness is not implemented as you
 expected it to be. Could you explain what is missing or wrong? Maybe we can
 try it as a state description: status-before (e.g. group on node1), change in
 the cluster (either event- or admin-based), and status-after (here both the
 currently implemented behaviour and the one you expected).
 
 Kind regards
 Fabian
 
 and a location loc_grp_cbw grp_cbw 50: node. As the group is
 somewhat indivisible, assigning varying stickinesses to individual
 resources just makes things unreadable and complicated. I feel that a
 group stickyness should override individual resource stickynesses, and
 not be used a a default stickyness for every resource in the group.

 Regards,
 Ulrich


 Regards
 Fabian

 On 08/04/2011 03:10 PM, Ulrich Windl wrote:
 Maloja01 maloj...@arcor.de wrote on 04.08.2011 at 12:58 in message
 4e3a7b5c.1030...@arcor.de:
 On 08/04/2011 08:28 AM, Ulrich Windl wrote:
 Hi!

 Isn't the stickyness effectively based on the failcount? We have one
 resource
 that has a location constraint for one node with a weight of 50 
 and a
 sticky ness of 10. The resource runs on a different node and 
 shows no
 tendency of moving back (not even after restarts).

 No stickiness has nothing to do with the failcount. The policy engine
 could take both into account the stickiness (for RUNNING resources) 
 and
 the failcount for (RUNNING or non-running ressources).

 If you ever had a on-start-failure of a resource on a node the 
 failcount
 is set to infinity which means, the resource could not be started at
 this node.

 fabian,

 I know that, and the errors were removed by crm_resource -C. Still 
 the 
 resource is happy where it is, and doesn't want to move away.


 If the policy engine needs to evaluate where to run a resource it 
 uses
 the location/antcolocation/cololaction constraints, failcounts,
 stickiness and maybe some other scores to evaluate WHERE to run a 
 resource.

 So in my opinion the stiness does exactly what you are asking for.

 Unfortunately someone did a manual migrate yesterday, so I cannot show 
 the 
 scores that lead to the problem.

 Regards,
 Ulrich


 ___
 Linux-HA mailing list
 Linux-HA@lists.linux-ha.org 
 http://lists.linux-ha.org/mailman/listinfo/linux-ha 
 See also: http://linux-ha.org/ReportingProblems 

 ___
 Linux-HA mailing list
 Linux-HA@lists.linux-ha.org 
 http://lists.linux-ha.org/mailman/listinfo/linux-ha 
 See also: http://linux-ha.org/ReportingProblems 





 ___
 Linux-HA mailing list
 Linux-HA@lists.linux-ha.org
 http://lists.linux-ha.org/mailman/listinfo/linux-ha
 See also: http://linux-ha.org/ReportingProblems
 
 ___
 Linux-HA mailing list
 Linux-HA@lists.linux-ha.org
 http://lists.linux-ha.org/mailman/listinfo/linux-ha
 See also: http://linux-ha.org/ReportingProblems
 
 ___
 Linux-HA mailing list
 Linux-HA@lists.linux-ha.org
 http://lists.linux-ha.org/mailman/listinfo/linux-ha
 See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] location and orders : Question about a behavior ...

2011-08-05 Thread Maloja01
On 08/02/2011 05:06 PM, alain.mou...@bull.net wrote:
 Hi
 
 I have this simple configuration of locations and orders between resources 
 group-1 , group-2 and clone-1
 (on a two nodes ha cluster with Pacemaker-1.1.2-7 /corosync-1.2.3-21) :
 
 location loc1-group-1   group-1 +100: node2
 location loc1-group-2   group-2 +100: node3
 
 order order-group-1   inf: group-1   clone-1
 order order-group-2   inf: group-2   clone-1
 
 property $id="cib-bootstrap-options" \
         dc-version="1.1.2-f059ec7ced7a86f18e5490b67ebf4a0b963bccfe" \
         cluster-infrastructure="openais" \
         expected-quorum-votes="2" \
         stonith-enabled="true" \
         no-quorum-policy="ignore" \
         default-resource-stickiness="5000" \
 
 (and no current cli- preferences)
 
 When I stop node2, group-1 migrates to node3 as expected.
 But when node2 is up again and I start Pacemaker on node2 again,
 group-1 automatically comes back to node2, and I wonder why?
 
 I have another similar configuration with the same location constraints and
 the same default-resource-stickiness value, but without an order constraint
 involving a clone resource, and there the group does not come back
 automatically. I don't understand why this order constraint would change the
 behavior ...

We should focus on the fact that when node2 comes back
into the cluster, clone-1 sees a change, because it is now started
on node2 as well - am I right? I do not have a good explanation at this
point in time, but this could be why group-1 loses its
stickiness: it is first stopped and then restarted (after the
clone is completely up again).

Can you check the following in your setup: either set max_clone to 1
(just for a test, of course) or add an anti-location constraint so that
clone-1 will not run on node2 (so that after node2 rejoins, clone-1 will not
see a change in its setup).
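
As a rough crm shell sketch of those two tests (clone-1 and node2 are the
names from this thread; the meta attribute that limits the number of clone
instances is clone-max, and the constraint id is made up):

# sketch only -- for a temporary test
# 1) limit the clone to a single instance:
crm resource meta clone-1 set clone-max 1
# 2) or keep clone-1 off node2 with an anti-location constraint:
crm configure location loc-clone-1-not-node2 clone-1 -inf: node2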

With your current config (without my changes):
You should also check whether you see any stops of clone instances when
node2 rejoins the cluster. That could be the case if you have
limited the number of clone instances and have additional location
constraints for the clone.

Can you tell more about the clone and the group? Are there any possible
side effects in the functionality of the resources?

Kind regards
Fabian

 
 Thanks for your help
 Alain Moullé
 
 ___
 Linux-HA mailing list
 Linux-HA@lists.linux-ha.org
 http://lists.linux-ha.org/mailman/listinfo/linux-ha
 See also: http://linux-ha.org/ReportingProblems

___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


[Linux-HA] ocf::LVM monitor needs excessive time to complete

2011-08-05 Thread Ulrich Windl
Hi,

we run a cluster that has about 30 LVM VGs that are monitored every minute with 
a timeout of 90s. Surprisingly, even when the system is in a nominal state, 
the LVM monitor times out.

I suspect this has to do with multiple LVM commands being run in parallel like 
this:
# ps ax | grep vg
 2014 pts/0    D+     0:00 vgs
 2580 ?        D      0:00 vgdisplay -v NFS_C11_IO
 2638 ?        D      0:00 vgck CBW_DB_BTD
 2992 ?        D      0:00 vgdisplay -v C11_DB_Exe
 3002 ?        D      0:00 vgdisplay -v C11_DB_15k
 4564 pts/2    S+     0:00 grep vg
# ps ax | grep vg
 8095 ?        D      0:00 vgck CBW_DB_Exe
 8119 ?        D      0:00 vgdisplay -v C11_DB_FATA
 8194 ?        D      0:00 vgdisplay -v NFS_SAP_Exe

When I tried a vgs manually, it could not be suspended or killed, and it took 
more than 30 seconds to complete.

Thus the LVM monitoring is quite useless as it is now (SLES 11 SP1 x86_64 on a 
machine with lots of disks, RAM and CPUs).

As I had changed all the timeouts via crm configure edit, I suspect the LRM 
starts all these monitors at the same time, creating massive parallelism. Maybe 
a random start delay would be more useful than having the user specify a 
start delay for each monitor. Possibly those stuck monitor operations 
also affect monitors that would otherwise finish in time.
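
One way to de-synchronize the monitors without touching the RA would be to
give the LVM resources slightly different monitor intervals, e.g. (a sketch;
the resource and VG names are taken from the logs below, the intervals are
illustrative):

# sketch: spread the LVM monitors over time by varying the intervals
primitive prm_cbw_ci_mnt_lvm ocf:heartbeat:LVM \
        params volgrpname="CBW_CI" \
        op monitor interval="61s" timeout="90s"
# ... and give the next VG resource interval="67s", the next "71s", and so on.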

Here's a part of the mess on one node:
Aug  5 13:50:55 h03 lrmd: [14526]: WARN: operation monitor[360] on 
ocf::LVM::prm_cbw_ci_mnt_lvm for client 14529, its parameters: 
CRM_meta_name=[monitor] crm_feature_set=[3.0.5] CRM_meta_record_pending=[true] 
CRM_meta_timeout=[3] CRM_meta_interval=[1] volgrpname=[CBW_CI] : pid 
[29910] timed out
Aug  5 13:50:55 h03 crmd: [14529]: ERROR: process_lrm_event: LRM operation 
prm_cbw_ci_mnt_lvm_monitor_1 (360) Timed Out (timeout=3ms)
Aug  5 13:50:55 h03 lrmd: [14526]: WARN: perform_ra_op: the operation operation 
monitor[154] on ocf::IPaddr2::prm_a20_ip_1 for client 14529, its parameters: 
CRM_meta_name=[monitor] crm_feature_set=[3.0.5] CRM_meta_record_pending=[true] 
CRM_meta_timeout=[2] CRM_meta_interval=[1] iflabel=[a20] 
ip=[172.20.17.54]  stayed in operation list for 24020 ms (longer than 1 ms)
Aug  5 13:50:56 h03 lrmd: [14526]: WARN: perform_ra_op: the operation operation 
monitor[179] on ocf::Raid1::prm_nfs_cbw_trans_raid1 for client 14529, its 
parameters: CRM_meta_record_pending=[true] raidconf=[/etc/mdadm/mdadm.conf] 
crm_feature_set=[3.0.5] OCF_CHECK_LEVEL=[1] raiddev=[/dev/md8] 
CRM_meta_name=[monitor] CRM_meta_timeout=[6] CRM_meta_interval=[6]  
stayed in operation list for 24010 ms (longer than 1 ms)
Aug  5 13:50:56 h03 attrd: [14527]: notice: attrd_ais_dispatch: Update relayed 
from h04
Aug  5 13:50:56 h03 attrd: [14527]: info: attrd_local_callback: Expanded 
fail-count-prm_cbw_ci_mnt_lvm=value++ to 9
Aug  5 13:50:56 h03 attrd: [14527]: info: attrd_trigger_update: Sending flush 
op to all hosts for: fail-count-prm_cbw_ci_mnt_lvm (9)
Aug  5 13:50:56 h03 attrd: [14527]: info: attrd_perform_update: Sent update 
416: fail-count-prm_cbw_ci_mnt_lvm=9
Aug  5 13:50:56 h03 attrd: [14527]: notice: attrd_ais_dispatch: Update relayed 
from h04

Regards,
Ulrich




___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] location and orders : Question about a behavior ...

2011-08-05 Thread alain . moulle
Hi Fabian,

Many thanks for having a look at my initial problem.
I can't try it again today as I'm testing another configuration on both 
servers (HA NFS active/active; I posted another thread about it this morning), 
but I should be able to try again next week.

But if I understand your explanation correctly: you suppose that when the 
clone-1 instance on node2 starts again after the reboot, it could disturb the 
clone-1 instance on node3 by stopping/restarting it on node3 as well?
I have not noticed via crm_mon any state change of the clone-1 instance on 
node3 when node2 is restarted, nor any state change of group-2, which remains 
started on node3 (if clone-1 had been stopped/restarted on node3, even 
quickly, I should also have seen group-2 stopped/restarted due to the 
order-group-2 constraint).

Hope it helps to clarify ...
Thanks again
Alain



From:    Maloja01 maloj...@arcor.de
To:      linux-ha@lists.linux-ha.org
Date:    05/08/2011 11:40
Subject: Re: [Linux-HA] location and orders : Question about a behavior ...
Sent by: linux-ha-boun...@lists.linux-ha.org



On 08/02/2011 05:06 PM, alain.mou...@bull.net wrote:
 Hi
 
 I have this simple configuration of locations and orders between 
resources 
 group-1 , group-2 and clone-1
 (on a two nodes ha cluster with Pacemaker-1.1.2-7 /corosync-1.2.3-21) :
 
 location loc1-group-1   group-1 +100: node2
 location loc1-group-2   group-2 +100: node3
 
 order order-group-1   inf: group-1   clone-1
 order order-group-2   inf: group-2   clone-1
 
 property $id=cib-bootstrap-options \
 dc-version=1.1.2-f059ec7ced7a86f18e5490b67ebf4a0b963bccfe \
 cluster-infrastructure=openais \
 expected-quorum-votes=2 \
 stonith-enabled=true \
 no-quorum-policy=ignore \
 default-resource-stickiness=5000 \
 
 (and no current cli- preferences)
 
 When I stop the node2, the group-1 is well migrated on node3
 But when node2 is up again, and that I start Pacemaker again on node2,
 the group-1 automatically comes back on node2 , and I wonder why ?
 
 I have other similar configuration with same location constraints and 
same
 default-resource-stickiness value, but without order with a clone 
 resource,
 and the group does not come back automatically. But I don't understand 
why
 this order constraint would change this behavior ...

We should focus our thoughts on the fact, that when node2 comes back
into the cluster the clone-1 gets a change, because it is started now
also on node2 - am I right? I do not have a good explanatio at this
point of time but this could be the point why the group-1 looses its
stickiness, because its first stopped and than restarted (after the
clone is completely up again).

Can you check the following in your setup: Either set max_clone to 1
(just for a test of course) or doing an anti-location that clone-1 will
not run on node2 (so after rejoining node2 clone-1 will not get a
change in its setup).

With your current config (without my changes):
You should also check, if you see any stops on clone-instances when
node2 is rejoining the cluster. That could be the case, if you have
limitted the number of clones and have additional location
constraints for the clone.

Can you tell more about the clone and the group? Are there any possible
side effects in the functionality of the resources?

Kind regards
Fabian

 
 Thanks for your help
 Alain Moullé
 
 ___
 Linux-HA mailing list
 Linux-HA@lists.linux-ha.org
 http://lists.linux-ha.org/mailman/listinfo/linux-ha
 See also: http://linux-ha.org/ReportingProblems

___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems

___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] Antw: Re: [ha-wg-technical] The mess with OCF_CHECK_LEVEL (crm aborts during commit)

2011-08-05 Thread Dejan Muhamedagic
On Fri, Aug 05, 2011 at 09:15:33AM +0200, Ulrich Windl wrote:
   Dejan Muhamedagic de...@suse.de wrote on 05.08.2011 at 08:39 in message
  20110805063900.GB31749@rondo.homenet:
  Hi,
  
  On Fri, Aug 05, 2011 at 08:23:43AM +0200, Ulrich Windl wrote:
    Dejan Muhamedagic de...@suse.de wrote on 04.08.2011 at 18:32 in message
    20110804163245.GA28585@rondo.homenet:
Hi,

On Thu, Aug 04, 2011 at 05:45:16PM +0200, Ulrich Windl wrote:
 Hi!
 
 Some RAs support OCF_CHECK_LEVEL (e.g. ocf:heartbeat:Raid1). However 
 the 
OCF_CHECK_LEVEL is not advertised in the metadata. Also, 
OCF_CHECK_LEVEL 
  is 
not a global parameter (wouldn't make much sense).
 
 So obviously using the crm_gui one can add OCF_CHECK_LEVEL for some 
resource, and that seems to work.
 
 So far, so good. Now I tried to add more resources without an 
OCF_CHECK_LEVEL using the crm command line. I added the new resources 
to a 
  
group that contained resources using OCF_CHECK_LEVEL.

OCF_CHECK_LEVEL is to be defined on a per-monitor basis, like
this:

primitive ...
op monitor OCF_CHECK_LEVEL=10 interval=...
   
   [...]
   
   So, is a configuration like the following incorrect?
   
   primitive prm_c11_as_1_raid1 ocf:heartbeat:Raid1 \
   params raidconf=/etc/mdadm/mdadm.conf raiddev=/dev/md15 
  OCF_CHECK_LEVEL=1 \
   operations $id=prm_c11_as_1_raid1-operations \
   op start interval=0 timeout=20s \
   op stop interval=0 timeout=20s \
   op monitor interval=60 timeout=60s
  
  Yes. See an example here:
  
   http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/s-operation-monitor-multiple.html
  
  Though it's XML, you can see that OCF_CHECK_LEVEL is defined
  within a monitor operation.
 
 Amazingly crm_verify -LV does not report any problem however.

crm_verify doesn't know which parameters the RA supports. crm
configure verify should complain, however, because it looks at
the RA meta-data and does checks which are beyond crm_verify.
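
In short (a sketch of the two checks, both run against the live configuration):

# checks CIB syntax/semantics only, knows nothing about RA parameters:
crm_verify -LV
# additionally checks resource parameters against the RA meta-data:
crm configure verify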

Thanks,

Dejan

 Regards,
 Ulrich
 
 
 ___
 Linux-HA mailing list
 Linux-HA@lists.linux-ha.org
 http://lists.linux-ha.org/mailman/listinfo/linux-ha
 See also: http://linux-ha.org/ReportingProblems
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] ocf::LVM monitor needs excessive time to complete

2011-08-05 Thread Dejan Muhamedagic
Hi,

On Fri, Aug 05, 2011 at 01:55:25PM +0200, Ulrich Windl wrote:
 Hi,
 
 we run a cluster that has about 30 LVM VGs that are monitored every minute 
 with a timeout interval of 90s. Surprisingly even if the system is in nominal 
 state, the LVM monitor times out.
 
 I suspect this has to do with multiple LVM commands being run in parallel 
 like this:
 # ps ax |grep vg
  2014 pts/0D+ 0:00 vgs
  2580 ?D  0:00 vgdisplay -v NFS_C11_IO
  2638 ?D  0:00 vgck CBW_DB_BTD
  2992 ?D  0:00 vgdisplay -v C11_DB_Exe
  3002 ?D  0:00 vgdisplay -v C11_DB_15k
  4564 pts/2S+ 0:00 grep vg
 # ps ax |grep vg
  8095 ?D  0:00 vgck CBW_DB_Exe
  8119 ?D  0:00 vgdisplay -v C11_DB_FATA
  8194 ?D  0:00 vgdisplay -v NFS_SAP_Exe
 
 When I tried a vgs manually, it could not be suspended or killed, and it 
 took more than 30 seconds to complete.
 
 Thus the LVM monitoring is quite useless as it is now (SLES 11 SP1 x86_64 on 
 a machine with lots of disks, RAM and CPUs).

I guess that this is somehow related to the storage. Best to
report directly to SUSE.

 As I had changed all the timeouts via crm configure edit, I suspect the LRM 
 starts all these monitors at the same time, creating massive parallelism. 
 Maybe a random star delay would be more useful than having the user specify a 
 variable start delay for the monitor. Possibly those stuck monitor operations 
 also affect monitors that would finish in time.

lrmd starts at most max-children operations in parallel. That's 4
by default.
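
If the node can cope with more parallel operations, that limit can be queried
and raised with lrmadmin (a sketch from memory; check lrmadmin -h for the
exact option names on your version):

# sketch: inspect and raise lrmd's parallelism limit (default 4)
lrmadmin -g max-children
lrmadmin -p max-children 8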

Thanks,

Dejan

 Here's a part of the mess on one node:
 Aug  5 13:50:55 h03 lrmd: [14526]: WARN: operation monitor[360] on 
 ocf::LVM::prm_cbw_ci_mnt_lvm for client 14529, its parameters: 
 CRM_meta_name=[monitor] crm_feature_set=[3.0.5] 
 CRM_meta_record_pending=[true] CRM_meta_timeout=[3] 
 CRM_meta_interval=[1] volgrpname=[CBW_CI] : pid [29910] timed out
 Aug  5 13:50:55 h03 crmd: [14529]: ERROR: process_lrm_event: LRM operation 
 prm_cbw_ci_mnt_lvm_monitor_1 (360) Timed Out (timeout=3ms)
 Aug  5 13:50:55 h03 lrmd: [14526]: WARN: perform_ra_op: the operation 
 operation monitor[154] on ocf::IPaddr2::prm_a20_ip_1 for client 14529, its 
 parameters: CRM_meta_name=[monitor] crm_feature_set=[3.0.5] 
 CRM_meta_record_pending=[true] CRM_meta_timeout=[2] 
 CRM_meta_interval=[1] iflabel=[a20] ip=[172.20.17.54]  stayed in 
 operation list for 24020 ms (longer than 1 ms)
 Aug  5 13:50:56 h03 lrmd: [14526]: WARN: perform_ra_op: the operation 
 operation monitor[179] on ocf::Raid1::prm_nfs_cbw_trans_raid1 for client 
 14529, its parameters: CRM_meta_record_pending=[true] 
 raidconf=[/etc/mdadm/mdadm.conf] crm_feature_set=[3.0.5] OCF_CHECK_LEVEL=[1] 
 raiddev=[/dev/md8] CRM_meta_name=[monitor] CRM_meta_timeout=[6] 
 CRM_meta_interval=[6]  stayed in operation list for 24010 ms (longer than 
 1 ms)
 Aug  5 13:50:56 h03 attrd: [14527]: notice: attrd_ais_dispatch: Update 
 relayed from h04
 Aug  5 13:50:56 h03 attrd: [14527]: info: attrd_local_callback: Expanded 
 fail-count-prm_cbw_ci_mnt_lvm=value++ to 9
 Aug  5 13:50:56 h03 attrd: [14527]: info: attrd_trigger_update: Sending flush 
 op to all hosts for: fail-count-prm_cbw_ci_mnt_lvm (9)
 Aug  5 13:50:56 h03 attrd: [14527]: info: attrd_perform_update: Sent update 
 416: fail-count-prm_cbw_ci_mnt_lvm=9
 Aug  5 13:50:56 h03 attrd: [14527]: notice: attrd_ais_dispatch: Update 
 relayed from h04
 
 Regards,
 Ulrich
 
 
 
 
 ___
 Linux-HA mailing list
 Linux-HA@lists.linux-ha.org
 http://lists.linux-ha.org/mailman/listinfo/linux-ha
 See also: http://linux-ha.org/ReportingProblems
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] ocf::LVM monitor needs excessive time to complete

2011-08-05 Thread Dimitri Maziuk
On 8/5/2011 7:18 AM, Dejan Muhamedagic wrote:
 Hi,

 On Fri, Aug 05, 2011 at 01:55:25PM +0200, Ulrich Windl wrote:
...
 When I tried a vgs manually, it could not be suspended or killed, and
it took more than 30 seconds to complete.

 Thus the LVM monitoring is quite useless as it is now (SLES 11 SP1
x86_64 on a machine with lots of disks, RAM and CPUs).

 I guess that this is somehow related to the storage. Best to
 report directly to SUSE.


What sort of disks and how many? -- last time we ran out of room, I had 
to add a different-sized IDE disk (smaller, because you couldn't buy a 
big one anymore), so I had to use LVM. I/O performance went down the 
drain right away. (That was CentOS 5 a couple of years ago.)

Dima (thank Cthulhu for sata and mdadm)
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


[Linux-HA] ocf:heartbeat:exportfs and crm configure verify

2011-08-05 Thread Ulrich Windl
Hi!

I think the RA for exportfs needs to be changed to allow a list of hosts (I had 
mentioned that before). Linux only allows either a hostname pattern, an IP 
mask, or a netgroup, but you cannot specify a thing like host[358] or 
host{3,5,8}.

So as an ugly work-around one uses one resource per host. This works, but crm 
configure verify complains about it:
WARNING: Resources 
prm_nfs_cbw_trans_exp_h02,prm_nfs_cbw_trans_exp_h03,prm_nfs_cbw_trans_exp_h04,prm_nfs_cbw_trans_exp_h06,prm_nfs_cbw_trans_exp_h07,prm_nfs_cbw_trans_exp_n01,prm_nfs_cbw_trans_exp_v01,prm_nfs_cbw_trans_exp_v03
 violate uniqueness for parameter fsid: ba57bee9-5872-46f2-9a87-0d178851d795

So for one filesystem it seems to be required (by the RA only?) that only one 
exportfs resource exists. That's bad.

Also the documentation for clientspec is not that precise 
(resource-agents-1.0.3-0.10.1):

-
clientspec* (string): Client ACL.
The client specification allowing remote machines to mount the directory
over NFS.
-

If I find time, I'll suggest a patch for the RA.
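
For reference, the per-host workaround looks roughly like this in crm shell (a
sketch; the resource names and fsid follow the warning above, while the
directory and options values are made up):

# sketch of the one-resource-per-client workaround (values illustrative)
primitive prm_nfs_cbw_trans_exp_h02 ocf:heartbeat:exportfs \
        params clientspec="h02" directory="/srv/nfs/cbw_trans" \
               fsid="ba57bee9-5872-46f2-9a87-0d178851d795" options="rw" \
        op monitor interval="30s"
primitive prm_nfs_cbw_trans_exp_h03 ocf:heartbeat:exportfs \
        params clientspec="h03" directory="/srv/nfs/cbw_trans" \
               fsid="ba57bee9-5872-46f2-9a87-0d178851d795" options="rw" \
        op monitor interval="30s"
# ... one primitive per client host; crm configure verify then warns that
# fsid is not unique across these resources.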

Regards,
Ulrich


___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


[Linux-HA] Q: default vs. default (e.g. exportfs)

2011-08-05 Thread Ulrich Windl
Hi!

I frequently see problems I don't understand:
When configuring an exportfs resource using crm shell without explicitly 
specifying operations or timeouts, I get warnings like these:
WARNING: prm_nfs_v03: default timeout 20s for start is smaller than the advised 
40

I wonder: if the default is 40s and I specify none, why isn't that default 
used?
Is it because the CRM has its own defaults?
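
For reference, the warning goes away if the advised timeout is set explicitly,
either on the operation or cluster-wide (a sketch; the resource name is taken
from the warning above):

# per resource: add 'op start timeout="40s" interval="0"' to the primitive, e.g. via
crm configure edit prm_nfs_v03
# or cluster-wide: raise the default operation timeout
crm configure op_defaults timeout="40s"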

Regards,
Ulrich


___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] OCF RA for named

2011-08-05 Thread Serge Dubrouski
No interest?

On Tue, Jul 12, 2011 at 3:50 PM, Serge Dubrouski serge...@gmail.com wrote:

 Hello -

 I've created an OCF RA for the named (BIND) server. There is an existing one in
 the redhat directory, but I don't like how it does monitoring, and I doubt that
 it can work with Pacemaker. So please review the attached RA and see if it can
 be included in the project.


 --
 Serge Dubrouski.




-- 
Serge Dubrouski.
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


[Linux-HA] Antw: Re: ocf::LVM monitor needs excessive time to complete

2011-08-05 Thread Ulrich Windl
 Dejan Muhamedagic deja...@fastmail.fm wrote on 05.08.2011 at 14:18 in
message 20110805121851.GB950@rondo.homenet:
 Hi,
 
 On Fri, Aug 05, 2011 at 01:55:25PM +0200, Ulrich Windl wrote:
  Hi,
  
  we run a cluster that has about 30 LVM VGs that are monitored every minute 
 with a timeout interval of 90s. Surprisingly even if the system is in nominal 
 state, the LVM monitor times out.
  
  I suspect this has to do with multiple LVM commands being run in parallel 
 like this:
  # ps ax |grep vg
   2014 pts/0D+ 0:00 vgs
   2580 ?D  0:00 vgdisplay -v NFS_C11_IO
   2638 ?D  0:00 vgck CBW_DB_BTD
   2992 ?D  0:00 vgdisplay -v C11_DB_Exe
   3002 ?D  0:00 vgdisplay -v C11_DB_15k
   4564 pts/2S+ 0:00 grep vg
  # ps ax |grep vg
   8095 ?D  0:00 vgck CBW_DB_Exe
   8119 ?D  0:00 vgdisplay -v C11_DB_FATA
   8194 ?D  0:00 vgdisplay -v NFS_SAP_Exe
  
  When I tried a vgs manually, it could not be suspended or killed, and it 
 took more than 30 seconds to complete.
  
  Thus the LVM monitoring is quite useless as it is now (SLES 11 SP1 x86_64 
 on a machine with lots of disks, RAM and CPUs).
 
 I guess that this is somehow related to the storage. Best to
 report directly to SUSE.
 

Hi!

I suspect that LVM uses an exclusive lock while examining the state. Basically 
vgdisplay in Linux does a stupid thing: it always scans all disks to find PVs. 
Compare that to HP-UX LVM, which only scans the disks if you explicitly 
request it via vgscan. A simple vgdisplay there accesses kernel in-RAM 
structures, but you can only vgdisplay VGs that are active (otherwise the 
kernel doesn't know about them). The PVs for the VGs are stored in a file there.

I don't think the disk system is the problem; it's the LVM implementation. A 
very quick test series showed that vgdisplay for a named VG that exists takes 
0.3 to 0.8 seconds, which is rather slow. Looking for a VG that does not 
exist takes 0.8 to 1.5 seconds.

The system in question has 192 SCSI disks that are combined into 44 multipath 
devices. About half of those are combined into RAID1s, and a few of those RAIDs 
are partitioned. All RAIDs have a VG with at least one LV. This gives 72 device 
mapper devices. Now if LVM scans all of those devices, it can take a while 
to complete.

While playing I made an interesting observation: if you use just vgdisplay to 
display all VGs, the command takes about 0.05s, but when you specify a name, it 
takes about 0.7s. Finally, when using awk to locate the desired VG, the command 
isn't much slower than without awk:

# time (vgdisplay | awk '$1 == "VG" && $2 == "Name" && $3 == "dd" { print $3 }')

real    0m0.082s
user    0m0.020s
sys     0m0.012s
# time (vgdisplay | awk '$1 == "VG" && $2 == "Name" && $3 == "sys" { print $3 }')
sys

real    0m0.098s
user    0m0.012s
sys     0m0.020s
# time vgdisplay sys
[...]
real    0m0.063s
user    0m0.020s
sys     0m0.004s
# time vgdisplay sysX
  Volume group "sysX" not found

real    0m0.806s
user    0m0.012s
sys     0m0.060s

So the status check as it is implemented now takes much longer to return 
"stopped" than it takes to return "started". Maybe someone wants to have a look 
at what terrible things happen when a non-existent VG is specified for vgdisplay ...
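
As an alternative sketch for the presence check (not what the LVM RA currently
does; $VOLGRPNAME stands in for the RA's volgrpname parameter): vgs can report
a single field and returns non-zero for an unknown VG, which avoids the
verbose vgdisplay output:

# sketch: check whether a VG is known, using vgs instead of vgdisplay
if vgs --noheadings -o vg_name "$VOLGRPNAME" >/dev/null 2>&1; then
    echo "VG $VOLGRPNAME found"      # would map to "started"
else
    echo "VG $VOLGRPNAME not found"  # would map to "stopped"
fi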

Regards,
Ulrich


___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] ocf::LVM monitor needs excessive time to complete

2011-08-05 Thread Maloja01
Hi,

Processes in state D look like they are blocked in a kernel call/device request.
Do you have a problem with your storage? This is not cluster-related.

Kind regards
Fabian

On 08/05/2011 01:55 PM, Ulrich Windl wrote:
 Hi,
 
 we run a cluster that has about 30 LVM VGs that are monitored every minute 
 with a timeout interval of 90s. Surprisingly even if the system is in nominal 
 state, the LVM monitor times out.
 
 I suspect this has to do with multiple LVM commands being run in parallel 
 like this:
 # ps ax |grep vg
  2014 pts/0D+ 0:00 vgs
  2580 ?D  0:00 vgdisplay -v NFS_C11_IO
  2638 ?D  0:00 vgck CBW_DB_BTD
  2992 ?D  0:00 vgdisplay -v C11_DB_Exe
  3002 ?D  0:00 vgdisplay -v C11_DB_15k
  4564 pts/2S+ 0:00 grep vg
 # ps ax |grep vg
  8095 ?D  0:00 vgck CBW_DB_Exe
  8119 ?D  0:00 vgdisplay -v C11_DB_FATA
  8194 ?D  0:00 vgdisplay -v NFS_SAP_Exe
 
 When I tried a vgs manually, it could not be suspended or killed, and it 
 took more than 30 seconds to complete.
 
 Thus the LVM monitoring is quite useless as it is now (SLES 11 SP1 x86_64 on 
 a machine with lots of disks, RAM and CPUs).
 
 As I had changed all the timeouts via crm configure edit, I suspect the LRM 
 starts all these monitors at the same time, creating massive parallelism. 
 Maybe a random star delay would be more useful than having the user specify a 
 variable start delay for the monitor. Possibly those stuck monitor operations 
 also affect monitors that would finish in time.
 
 Here's a part of the mess on one node:
 Aug  5 13:50:55 h03 lrmd: [14526]: WARN: operation monitor[360] on 
 ocf::LVM::prm_cbw_ci_mnt_lvm for client 14529, its parameters: 
 CRM_meta_name=[monitor] crm_feature_set=[3.0.5] 
 CRM_meta_record_pending=[true] CRM_meta_timeout=[3] 
 CRM_meta_interval=[1] volgrpname=[CBW_CI] : pid [29910] timed out
 Aug  5 13:50:55 h03 crmd: [14529]: ERROR: process_lrm_event: LRM operation 
 prm_cbw_ci_mnt_lvm_monitor_1 (360) Timed Out (timeout=3ms)
 Aug  5 13:50:55 h03 lrmd: [14526]: WARN: perform_ra_op: the operation 
 operation monitor[154] on ocf::IPaddr2::prm_a20_ip_1 for client 14529, its 
 parameters: CRM_meta_name=[monitor] crm_feature_set=[3.0.5] 
 CRM_meta_record_pending=[true] CRM_meta_timeout=[2] 
 CRM_meta_interval=[1] iflabel=[a20] ip=[172.20.17.54]  stayed in 
 operation list for 24020 ms (longer than 1 ms)
 Aug  5 13:50:56 h03 lrmd: [14526]: WARN: perform_ra_op: the operation 
 operation monitor[179] on ocf::Raid1::prm_nfs_cbw_trans_raid1 for client 
 14529, its parameters: CRM_meta_record_pending=[true] 
 raidconf=[/etc/mdadm/mdadm.conf] crm_feature_set=[3.0.5] OCF_CHECK_LEVEL=[1] 
 raiddev=[/dev/md8] CRM_meta_name=[monitor] CRM_meta_timeout=[6] 
 CRM_meta_interval=[6]  stayed in operation list for 24010 ms (longer than 
 1 ms)
 Aug  5 13:50:56 h03 attrd: [14527]: notice: attrd_ais_dispatch: Update 
 relayed from h04
 Aug  5 13:50:56 h03 attrd: [14527]: info: attrd_local_callback: Expanded 
 fail-count-prm_cbw_ci_mnt_lvm=value++ to 9
 Aug  5 13:50:56 h03 attrd: [14527]: info: attrd_trigger_update: Sending flush 
 op to all hosts for: fail-count-prm_cbw_ci_mnt_lvm (9)
 Aug  5 13:50:56 h03 attrd: [14527]: info: attrd_perform_update: Sent update 
 416: fail-count-prm_cbw_ci_mnt_lvm=9
 Aug  5 13:50:56 h03 attrd: [14527]: notice: attrd_ais_dispatch: Update 
 relayed from h04
 
 Regards,
 Ulrich
 
 
 
 
 ___
 Linux-HA mailing list
 Linux-HA@lists.linux-ha.org
 http://lists.linux-ha.org/mailman/listinfo/linux-ha
 See also: http://linux-ha.org/ReportingProblems

___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] location and orders : Question about a behavior ...

2011-08-05 Thread Maloja01
Hi Alain,

yes, your argument about group-2 makes sense. To get an idea whether you are
seeing a side effect of one resource disturbing the other, OR a reproducible
plan of the pengine, you should check whether this also happens if you only
set node2 to standby and active again.

If so, you could create a shadow CIB, change only the node status in the
shadow CIB, and run the what-if analysis.

This gives us an idea whether the cluster does that relocation purely because
of your configuration, or whether there are also some external factors which
only occur on really running resources.
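
A rough crm shell sketch of such a what-if run (the shadow CIB name "whatif"
is made up, and the exact behaviour of the commands may vary by version):

crm
  cib new whatif          # copy the live CIB into a shadow CIB
  cib use whatif          # further changes only touch the shadow
  node standby node2      # simulate node2 going to standby (in the shadow)
  configure ptest         # let the policy engine show what it would do
  cib use live            # switch back to the live CIB
  cib delete whatif       # throw the experiment away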

Kind regards
Fabian

On 08/05/2011 02:17 PM, alain.mou...@bull.net wrote:
 Hi Fabian,
 
 Many thanks to have a look at my initial problem.
 I can't make it again today as I'm trying another configuration on both 
 servers (HA NFS active/active I post
 another thread about it this morning) but I should be able to try again 
 next week.
 
 But if I well understand your explanation: 
 you suppose that clone-1 instance on node2 when it starts
 again after the reboot, it could disturb the clone-1 instance on node 3 by 
 stop/restart it also on node3 ? 
 I have not noticed via crm_mon any state change of the clone-1 instance on 
 node3 when node2 is restarted
 neither any state change on the group-2 which remain started on node3 (if 
 clone-1 has been stopped/restarted
 on node3 even quickly, I should have also seen group-2 stopped/restarted 
 due to the order-group-2 constraint)
 
 Hope it helps to clarify ...
 Thanks again
 Alain
 
 
 
 De :Maloja01 maloj...@arcor.de
 A : linux-ha@lists.linux-ha.org
 Date :  05/08/2011 11:40
 Objet : Re: [Linux-HA] location and orders : Question about a behavior ...
 Envoyé par :linux-ha-boun...@lists.linux-ha.org
 
 
 
 On 08/02/2011 05:06 PM, alain.mou...@bull.net wrote:
 Hi

 I have this simple configuration of locations and orders between 
 resources 
 group-1 , group-2 and clone-1
 (on a two nodes ha cluster with Pacemaker-1.1.2-7 /corosync-1.2.3-21) :

 location loc1-group-1   group-1 +100: node2
 location loc1-group-2   group-2 +100: node3

 order order-group-1   inf: group-1   clone-1
 order order-group-2   inf: group-2   clone-1

 property $id=cib-bootstrap-options \
 dc-version=1.1.2-f059ec7ced7a86f18e5490b67ebf4a0b963bccfe \
 cluster-infrastructure=openais \
 expected-quorum-votes=2 \
 stonith-enabled=true \
 no-quorum-policy=ignore \
 default-resource-stickiness=5000 \

 (and no current cli- preferences)

 When I stop the node2, the group-1 is well migrated on node3
 But when node2 is up again, and that I start Pacemaker again on node2,
 the group-1 automatically comes back on node2 , and I wonder why ?

 I have other similar configuration with same location constraints and 
 same
 default-resource-stickiness value, but without order with a clone 
 resource,
 and the group does not come back automatically. But I don't understand 
 why
 this order constraint would change this behavior ...
 
 We should focus our thoughts on the fact, that when node2 comes back
 into the cluster the clone-1 gets a change, because it is started now
 also on node2 - am I right? I do not have a good explanatio at this
 point of time but this could be the point why the group-1 looses its
 stickiness, because its first stopped and than restarted (after the
 clone is completely up again).
 
 Can you check the following in your setup: Either set max_clone to 1
 (just for a test of course) or doing an anti-location that clone-1 will
 not run on node2 (so after rejoining node2 clone-1 will not get a
 change in its setup).
 
 With your current config (without my changes):
 You should also check, if you see any stops on clone-instances when
 node2 is rejoining the cluster. That could be the case, if you have
 limitted the number of clones and have additional location
 constraints for the clone.
 
 Can you tell more about the clone and the group? Are there any possible
 side effects in the functionality of the resources?
 
 Kind regards
 Fabian
 

 Thanks for your help
 Alain Moullé

 ___
 Linux-HA mailing list
 Linux-HA@lists.linux-ha.org
 http://lists.linux-ha.org/mailman/listinfo/linux-ha
 See also: http://linux-ha.org/ReportingProblems
 
 ___
 Linux-HA mailing list
 Linux-HA@lists.linux-ha.org
 http://lists.linux-ha.org/mailman/listinfo/linux-ha
 See also: http://linux-ha.org/ReportingProblems
 
 ___
 Linux-HA mailing list
 Linux-HA@lists.linux-ha.org
 http://lists.linux-ha.org/mailman/listinfo/linux-ha
 See also: http://linux-ha.org/ReportingProblems

___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems