Re: [Linux-ha-dev] [Problem] The designation of the S option seems to have a problem.

2016-05-03 Thread renayama19661014
Hi Dejan,

I agree with you.

I think it is right to retire the hb_report that is included in glue.
If that is decided, we will use crm_report or the hb_report of the crm shell 
instead.

Best Regards,
Hideo Yamauchi.



- Original Message -
> From: Dejan Muhamedagic 
> To: MLLIST-HA-DEV 
> Cc: 
> Date: 2016/5/3, Tue 20:58
> Subject: Re: [Linux-ha-dev] [Problem] The designation of the S option seems 
> to have a problem.
> 
> Hi Hideo-san,
> 
> On Mon, May 02, 2016 at 04:57:09PM +0900, renayama19661...@ybb.ne.jp wrote:
>>  Hi All,
>> 
>>  The S option of hb_report does not work correctly.
>>  Mr. Kristoffer made a similar modification to hb_report in the crm shell.
>> 
>>   * https://github.com/ClusterLabs/crmsh/issues/137
>> 
>>  I would like to request the same correction in glue.
> 
> Thanks for the patch. But I think that we should deprecate
> hb_report in favour of crm report, no use keeping two copies
> around.
> 
> Cheers,
> 
> Dejan
> 
>>  Best Regards,
>>  Hideo Yamauchi.
> 
> 
>>  ___
>>  Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
>>  http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
>>  Home Page: http://linux-ha.org/
> 
> ___
> Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
> Home Page: http://linux-ha.org/
> 
___
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/


[Linux-ha-dev] [Problem] The designation of the S option seems to have a problem.

2016-05-02 Thread renayama19661014
Hi All,

The S option of hb_report does not work correctly.
Mr. Kristoffer made a similar modification to hb_report in the crm shell.

 * https://github.com/ClusterLabs/crmsh/issues/137

I would like to request the same correction in glue.

Best Regards,
Hideo Yamauchi.


option_s.patch
Description: Binary data


Re: [Linux-ha-dev] [Patch] oracle RA - Change of the judgment of the check_mon_user processing.

2014-07-22 Thread renayama19661014
Hi Dejan,

Thank you for your comments.


 
 415        if echo $output | grep -w EXPIRED >/dev/null; then
 
 Also, could you verify if common_sql_filter() needs modifications?

I will check it again tomorrow.
I will send a revised patch if necessary.


Best Regards,
Hideo Yamauchi.



- Original Message -
 From: Dejan Muhamedagic deja...@fastmail.fm
 To: renayama19661...@ybb.ne.jp; High-Availability Linux Development List 
 linux-ha-dev@lists.linux-ha.org
 Cc: 
 Date: 2014/7/22, Tue 18:53
 Subject: Re: [Linux-ha-dev] [Patch] oracle RA - Change of the judgment of the 
 check_mon_user processing.
 
 On Tue, Jul 22, 2014 at 11:57:04AM +0900, renayama19661...@ybb.ne.jp wrote:
  Hi All,
 
  The oracle resource agent needs to take into account the case where NLS_LANG 
  is set to another language.
  I have attached a patch.
 
 The patch looks good. I wonder if this string is also
 translated:
 
 415         if echo $output | grep -w EXPIRED >/dev/null; then
 
 Also, could you verify if common_sql_filter() needs modifications?
 
 Cheers,
 
 Dejan
 
 
  Best Regards,
  Hideo Yamauchi.
 
 


Re: [Linux-ha-dev] [Question] About the change of the oracle resource agent.

2014-07-22 Thread renayama19661014
Hi Dejan,

All right!!

 Is that with the latest version?


I am now checking the RA on Oracle 12c.
It is the latest version of Oracle.

Many Thanks!
Hideo Yamauchi.



- Original Message -
 From: Dejan Muhamedagic deja...@fastmail.fm
 To: renayama19661...@ybb.ne.jp; High-Availability Linux Development List 
 linux-ha-dev@lists.linux-ha.org
 Cc: 
 Date: 2014/7/22, Tue 18:46
 Subject: Re: [Linux-ha-dev] [Question] About the change of the oracle 
 resource agent.
 
 Hi Hideo-san,
 
 On Tue, Jul 22, 2014 at 11:07:29AM +0900, renayama19661...@ybb.ne.jp wrote:
  Hi All,
 
  I am going to explain the following changes to our users.
 
   * https://github.com/ClusterLabs/resource-agents/pull/367
   * https://github.com/ClusterLabs/resource-agents/pull/439
 
 
  Let me confirm that I understand what the patches intend.
 
  1) It was a problem that the OCFMON user was added without the Oracle 
  administrator knowing about it, so the patches let it be specified explicitly.
 
 The OCFMON user and password parameters are optional, hence in
 this respect nothing really changed. The user is still created
 by the RA. However, it is good that they're now visible in the
 meta-data.
 
  2) The patches changed the password expiry of OCFMON. (The default password 
  lifetime may be 180 days.)
 
 That's the problem we had with the previous version. Now there's
 a profile created for the monitoring user which has unlimited
 password expiry. If the password expired in the meantime, due to
 a missing profile, then it is reset.
 
 If the monitor still fails, the RA tries as sysdba again.
 
  3) The patches kept compatibility with the old RA.
 
 Yes.
 
  Do the patches have any other main points?
 
 No.
 
  If a problem actually occurred before this change, please tell me about it.
 
 As mentioned above, the issue was that the password could
 expire.
 
  I would like to show our users a problem that really happened.
   * For example, the OCFMON password expired and the oracle monitor failed.
 
 Is that with the latest version?
 
 Cheers,
 
 Dejan
 
  I am going to send a patch later.
 
  Best Regards,
  Hideo Yamauchi.


Re: [Linux-ha-dev] [Question] About the change of the oracle resource agent.

2014-07-22 Thread renayama19661014
Hi Dejan,

I tested in an environment where NLS_LANG is set to Japanese 
(Japanese_Japan.AL32UTF8).

I changed the password expiry of the OCFMON user and moved the system date 
forward by one year.
I confirmed that the following processing works correctly (on Oracle 12c).

Confirmed 1) After the OCFMON user expires (EXPIRED), monitoring as the sysdba 
user succeeds.
Confirmed 2) The grep check for the EXPIRED string works correctly.
Confirmed 3) When oracle is started again after the OCFMON user has expired, 
the password expiry of the OCFMON user is updated.

 
 415        if echo $output | grep -w EXPIRED >/dev/null; then
 
 Also, could you verify if common_sql_filter() needs modifications?

As a result, the grep quoted above does not need to be changed (Confirmed 2, 
Confirmed 3).

Best Regards,
Hideo Yamauchi.
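
Note: the whole-word check discussed above can be tried in isolation. The 
following is a minimal sketch, not code from the oracle RA. The is_expired 
helper and the sample status strings are assumptions made for illustration, 
and the sketch presumes that the status value itself is not translated by 
NLS_LANG, as the test results above suggest.

```shell
#!/bin/sh
# Hypothetical helper (not taken from the oracle RA): succeed if the given
# status string contains EXPIRED as a whole word.
is_expired() {
    # -w matches whole words only; the output is discarded and only the
    # exit status of grep is used.
    echo "$1" | grep -w EXPIRED >/dev/null
}

# Sample status strings, chosen for illustration only.
for status in "OPEN" "EXPIRED" "EXPIRED(GRACE)" "UNEXPIRED"; do
    if is_expired "$status"; then
        echo "$status: expired"
    else
        echo "$status: not expired"
    fi
done
```

Because -w requires a word boundary on both sides, a string such as 
UNEXPIRED does not match, while EXPIRED(GRACE) does.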





[Linux-ha-dev] [Question] About the change of the oracle resource agent.

2014-07-21 Thread renayama19661014
Hi All,

I am going to explain the following changes to our users.

 * https://github.com/ClusterLabs/resource-agents/pull/367
 * https://github.com/ClusterLabs/resource-agents/pull/439


Let me confirm that I understand what the patches intend.

1) It was a problem that the OCFMON user was added without the Oracle 
administrator knowing about it, so the patches let it be specified explicitly.
2) The patches changed the password expiry of OCFMON. (The default password 
lifetime may be 180 days.)
3) The patches kept compatibility with the old RA.

Do the patches have any other main points?

If a problem actually occurred before this change, please tell me about it.
I would like to show our users a problem that really happened.
 * For example, the OCFMON password expired and the oracle monitor failed.

I am going to send a patch later.

Best Regards,
Hideo Yamauchi.


[Linux-ha-dev] [Patch] oracle RA - Change of the judgment of the check_mon_user processing.

2014-07-21 Thread renayama19661014
Hi All,

The oracle resource agent needs to take into account the case where NLS_LANG 
is set to another language.
I have attached a patch.

Best Regards,
Hideo Yamauchi.


trac2891.patch
Description: Binary data


Re: [Linux-ha-dev] [Patch:crmsh] Correction of the mistake of the processing to transfer comment.

2014-01-15 Thread renayama19661014
Hi Kristoffer,

 Sorry, I should have mentioned that I applied the patch to the
 development version, not to 1.2.5, when testing.
 
 I suspect that the difference is that in older versions, comments were
 stripped completely from the configuration, but in newer versions,
 comments are kept. However, it seems that with this patch there are
 comments generated in the XML code that the CLI syntax cannot
 represent.
 
 I have not had time to completely investigate. I will look into the
 problem further and let you know what I find.

It turned out that the patch I contributed was unnecessary after all.
The rpm that we used seems to have had some kind of problem.

I withdraw the patch.

Best Regards,
Hideo Yamauchi.

--- On Wed, 2014/1/15, Kristoffer Grönlund kgronl...@suse.com wrote:

 On Tue, 14 Jan 2014 12:31:29 +0900 (JST)
 renayama19661...@ybb.ne.jp wrote:
 
  Hi Kristoffer,
  
  In addition, the error did not occur in the edit test for me.
  The edit test passed both with and without my patch applied.
  
  Which test command did you run?
 
 Sorry, I should have mentioned that I applied the patch to the
 development version, not to 1.2.5, when testing.
 
 I suspect that the difference is that in older versions, comments were
 stripped completely from the configuration, but in newer versions,
 comments are kept. However, it seems that with this patch there are
 comments generated in the XML code that the CLI syntax cannot
 represent.
 
 I have not had time to completely investigate. I will look into the
 problem further and let you know what I find.
 
 Thank you,
 
  
   * on crmsh-7cd5688c164d.tar(tip)
  (snip)
  [root@rh64-2744 test]# ./regression.sh 
  confbasic. checking... PASS
  confbasic-xml. checking... PASS
  edit checking... PASS
  (snip)
  
   * on crmsh-ef3f08547688(1.2.5)
  (snip)
  [root@rh64-2744 test]# ./regression.sh 
  confbasic. checking... PASS
  confbasic-xml. checking... FAIL
  edit. checking... PASS
  (snip)
  
  Best Regards,
  Hideo Yamauchi.
  
 
 
 
 -- 
 // Kristoffer Grönlund
 // kgronl...@suse.com
 


Re: [Linux-ha-dev] [Patch:crmsh] Correction of the mistake of the processing to transfer comment.

2014-01-13 Thread renayama19661014
Hi Kristoffer,

It seems that the crmsh 1.2.5 rpm that we used has some kind of problem.
The problem did not occur with a crm that I built from the crmsh 1.2.5 source 
code.

The patch that I contributed seems to be unnecessary.

I will check the details and contact you again.

Best Regards,
Hideo Yamauchi.

--- On Sun, 2014/1/12, renayama19661...@ybb.ne.jp renayama19661...@ybb.ne.jp 
wrote:

 Hi Kristoffer,
 
 Thank you for your comment.
 
 I will look into it the day after tomorrow.
 
 Best Regards,
 Hideo Yamauchi.
 
 --- On Fri, 2014/1/10, Kristoffer Grönlund kgronl...@suse.com wrote:
 
  On Fri, 10 Jan 2014 16:27:47 +0900 (JST)
  renayama19661...@ybb.ne.jp wrote:
  
   Hi Dejan,
   
   I am sending a patch for crmsh 1.2.5.
   A similar correction seems to be necessary for the latest crmsh.
   
   Best Regards,
   Hideo Yamauchi
  
  Hello,
  
  Thank you for the patch!
  
  I tried applying the patch to the latest crmsh, but when running the
  regression test suite, I got an error. I think the patch is fixing a
  bug, but unfortunately it seems to reveal a different problem.
  
  Maybe you can help me figure out what is going wrong!
  
  Failing test case output included below:
  
  [   77s] Fri Jan 10 12:38:47 UTC 2014: BEGIN testcase edit
  [   77s] --
  [   77s] testcase edit failed
  [   77s] output is in crmtestout/edit.out
  [   77s] diff (from crmtestout/edit.diff):
  [   77s] --- /usr/share/crmsh/tests/testcases/edit.exp    2014-01-10 12:38:36.0 +
  [   77s] +++ -    2014-01-10 12:38:53.149599264 +
  [   77s] @@ -84,4 +84,8 @@
  [   77s]  .TRY configure rsc_defaults $id=rsc_options failure-timeout=10m
  [   77s] +INFO: object loc-d1 cannot be represented in the CLI notation
  [   77s]  .TRY configure filter sed 's/2m/60s/' cib-bootstrap-options
  [   77s] +INFO: object loc-d1 cannot be represented in the CLI notation
  [   77s] +INFO: object loc-d1 cannot be represented in the CLI notation
  [   77s]  .TRY configure show rsc_options
  [   77s] +INFO: object loc-d1 cannot be represented in the CLI notation
  [   77s]  rsc_defaults $id=rsc_options \
  [   77s] @@ -89,3 +93,6 @@
  [   77s]  .TRY configure property stonith-enabled=true
  [   77s] +INFO: object loc-d1 cannot be represented in the CLI notation
  [   77s] +INFO: object loc-d1 cannot be represented in the CLI notation
  [   77s]  .TRY configure show cib-bootstrap-options
  [   77s] +INFO: object loc-d1 cannot be represented in the CLI notation
  [   77s]  property $id=cib-bootstrap-options \
  [   77s] @@ -94,4 +101,8 @@
  [   77s]  .TRY configure filter 'sed s/stonith-enabled=.true.//'
  [   77s] +INFO: object loc-d1 cannot be represented in the CLI notation
  [   77s] +ERROR: 13: syntax: Unknown command near xml parsing 'xml <rsc_location id="loc-d1" rsc="d1"> <!--# --> <rule id="r1" score="-INFINITY" boolean-op="or"> <expression operation="not_defined" attribute="webserver" id="loc-d1-expression"/> <expression attribute="mem" type="number" operation="lte" value="0" id="loc-d1-expression-3"/> </rule> <rule id="loc-d1-rule" score="-INFINITY"> <expression operation="not_defined" attribute="a2" id="loc-d1-expression-2"/> </rule> <rule id="r2" score-attribute="webserver"> <expression operation="defined" attribute="webserver" id="loc-d1-expression-0"/> </rule> <!--# --> </rsc_location>'
  [   77s]  .TRY configure show cib-bootstrap-options
  [   77s] +INFO: object loc-d1 cannot be represented in the CLI notation
  [   77s]  property $id=cib-bootstrap-options \
  [   77s] -    default-action-timeout=60s
  [   77s] +    default-action-timeout=60s \
  [   77s] +    stonith-enabled=true
  [   77s] --
  [   77s] Fri Jan 10 12:38:53 UTC 2014: END testcase edit
  
  -- 
  // Kristoffer Grönlund
  // kgronl...@suse.com
  


Re: [Linux-ha-dev] [Patch:crmsh] Correction of the mistake of the processing to transfer comment.

2014-01-13 Thread renayama19661014
Hi Kristoffer,

In addition, the error did not occur in the edit test for me.
The edit test passed both with and without my patch applied.

Which test command did you run?

 * on crmsh-7cd5688c164d.tar(tip)
(snip)
[root@rh64-2744 test]# ./regression.sh 
confbasic. checking... PASS
confbasic-xml. checking... PASS
edit checking... PASS
(snip)

 * on crmsh-ef3f08547688(1.2.5)
(snip)
[root@rh64-2744 test]# ./regression.sh 
confbasic. checking... PASS
confbasic-xml. checking... FAIL
edit. checking... PASS
(snip)

Best Regards,
Hideo Yamauchi.


Re: [Linux-ha-dev] [Patch:crmsh] Correction of the mistake of the processing to transfer comment.

2014-01-11 Thread renayama19661014
Hi Kristoffer,

Thank you for your comment.

I will look into it the day after tomorrow.

Best Regards,
Hideo Yamauchi.



[Linux-ha-dev] [Patch:crmsh] Correction of the mistake of the processing to transfer comment.

2014-01-09 Thread renayama19661014
Hi Dejan,

I am sending a patch for crmsh 1.2.5.
A similar correction seems to be necessary for the latest crmsh.

Best Regards,
Hideo Yamauchi

trac2744.patch
Description: Binary data


Re: [Linux-ha-dev] [Patch]Exit code of reset of external/libvirt is wrong.

2013-07-05 Thread renayama19661014
Hi Dejan,

Thank you for your comments.

 Many thanks for the patch. Applied (slightly modified).
 I wonder if we should also ignore the outcome of libvirt_start.
 What are the chances that it fails?

This is the first time I have seen libvirt_start fail.
It occurs when I use libvirt in a vSphere environment.
In the vSphere environment, libvirt can run libvirt_stop on the failover host 
of vSphere HA, but it cannot run libvirt_start.
I think this problem probably does not happen with KVM.

Best Regards,
Hideo Yamauchi.
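
Note: the question of whether reset should ignore the outcome of 
libvirt_start can be made concrete with a small sketch. This is not the code 
of external/libvirt and not the applied patch. The two reset_* functions and 
the STOP_RC / START_RC variables are hypothetical, and libvirt_stop and 
libvirt_start are replaced by stubs so that the two policies can be compared.

```shell
#!/bin/sh
# Hypothetical stand-ins for the plugin's libvirt_stop / libvirt_start:
# they succeed or fail according to STOP_RC / START_RC (default 0).
libvirt_stop()  { return "${STOP_RC:-0}"; }
libvirt_start() { return "${START_RC:-0}"; }

# Strict policy: reset succeeds only if both stop and start succeed.
reset_strict() {
    libvirt_stop || return 1
    libvirt_start || return 1
    return 0
}

# Lenient policy: the guest counts as fenced once it has been stopped,
# so a failed start is reported but does not change the exit code.
reset_lenient() {
    libvirt_stop || return 1
    libvirt_start || echo "warning: start failed after stop" >&2
    return 0
}
```

With START_RC=1, reset_strict returns 1 while reset_lenient returns 0. Both 
return 1 when the stop itself fails.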

--- On Fri, 2013/7/5, Dejan Muhamedagic de...@suse.de wrote:

 Hi Hideo-san,
 
 On Fri, Jul 05, 2013 at 08:51:14AM +0900, renayama19661...@ybb.ne.jp wrote:
  Hi All,
  
  The exit code of reset of external/libvirt is wrong.
 
 Indeed. Quite sloppy the latest change, my apologies.
 
  I attached a patch.
 
 Many thanks for the patch. Applied (slightly modified).
 I wonder if we should also ignore the outcome of libvirt_start.
 What are the chances that it fails?
 
 Cheers,
 
 Dejan
 
  Best Regards,
  Hideo Yamauchi.
 
 
 
 


[Linux-ha-dev] [Patch]Exit code of reset of external/libvirt is wrong.

2013-07-04 Thread renayama19661014
Hi All,

The exit code of reset of external/libvirt is wrong.
I attached a patch.

Best Regards,
Hideo Yamauchi.

libvirt.patch
Description: Binary data
___
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/


Re: [Linux-ha-dev] [Problem] external/vcenter fails in stonith of the guest of the similar name.

2012-10-22 Thread renayama19661014
Hi Dejan,

  Please revise it to add a character of ^ to a search.
 
 Applied. Thanks!

I confirmed it.
(http://hg.linux-ha.org/glue/rev/0809ed6abeb7)

Many Thanks,
Hideo Yamauchi.

--- On Tue, 2012/10/23, Dejan Muhamedagic de...@suse.de wrote:

 Hi Hideo-san,
 
 On Mon, Oct 22, 2012 at 09:20:53AM +0900, renayama19661...@ybb.ne.jp wrote:
  Hi All,
  
  external/vcenter fences the wrong guest when guests have similar names.
  
  For example, when two guests named sr2 and backup-sr2 exist, a stonith 
  request for sr2 fences backup-sr2.
  
  The problem is caused by the following search.
  
   $vm = Vim::find_entity_view(view_type => 'VirtualMachine', filter => { name => qr/\Q$host_to_vm{$targetHost}\E/i });
  
  
  It seems that the correction Mr. Lars pointed out earlier was not applied 
  here.
  
   * http://lists.community.tummy.com/pipermail/linux-ha-dev/2011-April/018397.html
  
  (snip)
  Unless this filter thing has a special mode where it internally does a
  $x eq $y for scalars and $x =~ $y for explicitly designated qr//
  Regexp objects, I'd suggest to here also do
      filter => { name => qr/^\Q$realTarget\E$/i }
  (snip)
  
  Please revise the search to add the ^ anchor.
 
 Applied. Thanks!
 
 Dejan
 
  Best Regards,
  Hideo Yamauchi.
  
  


[Linux-ha-dev] [Problem] external/vcenter fails in stonith of the guest of the similar name.

2012-10-21 Thread renayama19661014
Hi All,

external/vcenter fences the wrong guest when guests have similar names.

For example, when two guests named sr2 and backup-sr2 exist, a stonith 
request for sr2 fences backup-sr2.

The problem is caused by the following search.

 $vm = Vim::find_entity_view(view_type => 'VirtualMachine', filter => { name => qr/\Q$host_to_vm{$targetHost}\E/i });


It seems that the correction Mr. Lars pointed out earlier was not applied here.

 * http://lists.community.tummy.com/pipermail/linux-ha-dev/2011-April/018397.html

(snip)
Unless this filter thing has a special mode where it internally does a
$x eq $y for scalars and $x =~ $y for explicitly designated qr//
Regexp objects, I'd suggest to here also do
filter => { name => qr/^\Q$realTarget\E$/i }
(snip)

Please revise the search to add the ^ anchor.

Best Regards,
Hideo Yamauchi.
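
Note: the effect of the missing anchors can be reproduced outside Perl. The 
following is a minimal sketch that uses grep as a stand-in for the qr// 
filter. It does not call the vSphere API. The two function names are 
hypothetical, and the order of the guest list is an assumption chosen to 
reproduce the symptom from the report.

```shell
#!/bin/sh
# Guest names from the report above (list order is an assumption).
guests='backup-sr2
sr2'

# Unanchored, case-insensitive search: returns the first name that merely
# contains the target, similar to qr/\Q...\E/i without ^ and $.
find_unanchored() {
    printf '%s\n' "$guests" | grep -i -F -- "$1" | head -n 1
}

# Anchored search: -x requires the whole line to match, similar to
# qr/^\Q...\E$/i. -F treats the target as a literal string, like \Q...\E.
find_anchored() {
    printf '%s\n' "$guests" | grep -i -F -x -- "$1" | head -n 1
}

echo "unanchored: $(find_unanchored sr2)"
echo "anchored:   $(find_anchored sr2)"
```

The unanchored search returns backup-sr2 for the target sr2, while the 
anchored search returns sr2. Adding only ^ already excludes backup-sr2; the $ 
from the quoted suggestion would additionally exclude names that merely start 
with the target.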




Re: [Linux-ha-dev] [Patch] The problem that the cord of the digest cord of crmd becomes mismatched for.

2012-10-12 Thread renayama19661014
Hi Dejan,
Hi Andrew,

I confirmed that glue was updated with the patch.
 * http://hg.linux-ha.org/glue/rev/579e45f957b6

Many Thanks!
Hideo Yamauchi.


--- On Fri, 2012/10/12, Dejan Muhamedagic de...@suse.de wrote:

 Hi,
 
 On Fri, Oct 12, 2012 at 08:31:21AM +0900, renayama19661...@ybb.ne.jp wrote:
  Hi Andrew,
  Hi Dejan,
  
   Makes sense to me.
   With the patch, the effective options are create+op rather than
   create+op1+op2+op3...
  
  Would it make sense to change the structure of the op-done message?
  Considering the influence on other parts, I cannot change the op message.
  I think that the patch is right for the op message of the present lrmd and crmd.
  
  We would like the patch to be applied to glue soon, if possible.
 
 I'll do some testing first.
 
 Cheers,
 
 Dejan
 
  Best Regards,
  Hideo Yamauchi.
  
  --- On Thu, 2012/10/11, Andrew Beekhof beek...@gmail.com wrote:
  
   On Wed, Oct 10, 2012 at 11:21 PM, Dejan Muhamedagic de...@suse.de wrote:
Hi Hideo-san,
   
On Wed, Oct 10, 2012 at 03:22:08PM +0900, renayama19661...@ybb.ne.jp 
wrote:
Hi All,
   
We found pacemaker that we could not judge a result of the operation 
of lrmd well.
   
When we carry out following crm, a parameter of the operation of start 
is given back to crmd as a result of operation of monitor.
   
(snip)
primitive prmDiskd ocf:pacemaker:Dummy \
            params name=diskcheck_status_internal device=/dev/vda 
   interval=30 \
            op start interval=0 timeout=60s on-fail=restart 
   prereq=fencing \
            op monitor interval=30s timeout=60s on-fail=restart \
            op stop interval=0s timeout=60s on-fail=block
(snip)
   
This is because lrmd gives back prereq parameter of start as a result 
of monitor operation.
As a result, crmd judge mismatched with a parameter of the monitor 
operation that crmd asked lrmd for for the parameter that Irmd carried 
out of the monitor operation.
   
We can confirm this problem by the next command in Pacemaker1.0.12.
   
Command 1) crm_verify command outputs the difference in digest cord.
   
[root@rh63-heartbeat1 ~]# crm_verify -L
crm_verify[19988]: 2012/10/10_20:29:58 CRIT: check_action_definition: 
Parameters to prmDiskd:0_monitor_3 on rh63-heartbeat1 changed: 
recorded 7d7c9f601095389fc7cc0c6b29c61a7a vs. 
d38c85388dea5e8e2568c3d699eb9cce (reload:3.0.1) 
0:0;6:1:0:ca6a5ad2-0340-4769-bab7-289a00862ba6
   
   
Command 2) The ptest command outputs the difference in digest cord, 
too.
   
[root@rh63-heartbeat1 ~]# ptest -L -VV
ptest[19992]: 2012/10/10_20:30:19 WARN: unpack_nodes: Blind faith: not 
fencing unseen nodes
ptest[19992]: 2012/10/10_20:30:19 CRIT: check_action_definition: 
Parameters to prmDiskd:0_monitor_3 on rh63-heartbeat1 changed: 
recorded 7d7c9f601095389fc7cc0c6b29c61a7a vs. 
d38c85388dea5e8e2568c3d699eb9cce (reload:3.0.1) 
0:0;6:1:0:ca6a5ad2-0340-4769-bab7-289a00862ba6
[root@rh63-heartbeat1 ~]#
   
Command 3) By cibadmin -B command, pengine restart monitor of an 
unnecessary resource.
   
Oct 10 20:31:00 rh63-heartbeat1 pengine: [19842]: CRIT: 
check_action_definition: Parameters to prmDiskd:0_monitor_3 on 
rh63-heartbeat1 changed: recorded 7d7c9f601095389fc7cc0c6b29c61a7a vs. 
d38c85388dea5e8e2568c3d699eb9cce (reload:3.0.1) 
0:0;6:1:0:ca6a5ad2-0340-4769-bab7-289a00862ba6
Oct 10 20:31:00 rh63-heartbeat1 pengine: [19842]: notice: 
RecurringOp:  Start recurring monitor (30s) for prmDiskd:0 on 
rh63-heartbeat1
Oct 10 20:31:00 rh63-heartbeat1 pengine: [19842]: notice: LogActions: 
Leave   resource prmDiskd:0#011(Started rh63-heartbeat1)
Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: 
do_state_transition: State transition S_POLICY_ENGINE - 
S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE 
origin=handle_response ]
Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: unpack_graph: 
Unpacked transition 2: 1 actions in 1 synapses
Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: do_te_invoke: 
Processing graph 2 (ref=pe_calc-dc-1349868660-20) derived from 
/var/lib/pengine/pe-input-2.bz2
Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: te_rsc_command: 
Initiating action 1: monitor prmDiskd:0_monitor_3 on 
rh63-heartbeat1 (local)
Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: do_lrm_rsc_op: 
Performing key=1:2:0:ca6a5ad2-0340-4769-bab7-289a00862ba6 
op=prmDiskd:0_monitor_3 )
Oct 10 20:31:00 rh63-heartbeat1 lrmd: [19836]: info: cancel_op: 
operation monitor[4] on prmDiskd:0 for client 19839, its parameters: 
CRM_meta_clone=[0] CRM_meta_prereq=[fencing] device=[/dev/vda] 
name=[diskcheck_status_internal] CRM_meta_clone_node_max=[1] 
CRM_meta_clone_max=[1] CRM_meta_notify=[false] 
CRM_meta_globally_unique=[false] crm_feature_set=[3.0.1] interval=[30] 
prereq=[fencing] 

[Linux-ha-dev] [Patch] The problem that the cord of the digest cord of crmd becomes mismatched for.

2012-10-10 Thread renayama19661014
Hi All,

We found a case in Pacemaker where the result of an lrmd operation cannot be 
judged correctly.

When we load the following crm configuration, a parameter of the start 
operation is returned to crmd as part of the result of the monitor operation.

(snip)
primitive prmDiskd ocf:pacemaker:Dummy \
        params name=diskcheck_status_internal device=/dev/vda interval=30 \
        op start interval=0 timeout=60s on-fail=restart prereq=fencing \
        op monitor interval=30s timeout=60s on-fail=restart \
        op stop interval=0s timeout=60s on-fail=block
(snip)

This is because lrmd returns the prereq parameter of start as part of the 
result of the monitor operation.
As a result, crmd judges that the parameters of the monitor operation it 
requested from lrmd do not match the parameters with which lrmd carried out 
the monitor operation.

We can confirm this problem with the following commands in Pacemaker 1.0.12.

Command 1) The crm_verify command reports the difference in digest.

[root@rh63-heartbeat1 ~]# crm_verify -L
crm_verify[19988]: 2012/10/10_20:29:58 CRIT: check_action_definition: 
Parameters to prmDiskd:0_monitor_3 on rh63-heartbeat1 changed: recorded 
7d7c9f601095389fc7cc0c6b29c61a7a vs. d38c85388dea5e8e2568c3d699eb9cce 
(reload:3.0.1) 0:0;6:1:0:ca6a5ad2-0340-4769-bab7-289a00862ba6


Command 2) The ptest command reports the difference in digest, too.

[root@rh63-heartbeat1 ~]# ptest -L -VV
ptest[19992]: 2012/10/10_20:30:19 WARN: unpack_nodes: Blind faith: not fencing 
unseen nodes
ptest[19992]: 2012/10/10_20:30:19 CRIT: check_action_definition: Parameters to 
prmDiskd:0_monitor_3 on rh63-heartbeat1 changed: recorded 
7d7c9f601095389fc7cc0c6b29c61a7a vs. d38c85388dea5e8e2568c3d699eb9cce 
(reload:3.0.1) 0:0;6:1:0:ca6a5ad2-0340-4769-bab7-289a00862ba6
[root@rh63-heartbeat1 ~]# 

Command 3) After the cibadmin -B command, pengine restarts the monitor of the 
resource unnecessarily.

Oct 10 20:31:00 rh63-heartbeat1 pengine: [19842]: CRIT: 
check_action_definition: Parameters to prmDiskd:0_monitor_3 on 
rh63-heartbeat1 changed: recorded 7d7c9f601095389fc7cc0c6b29c61a7a vs. 
d38c85388dea5e8e2568c3d699eb9cce (reload:3.0.1) 
0:0;6:1:0:ca6a5ad2-0340-4769-bab7-289a00862ba6
Oct 10 20:31:00 rh63-heartbeat1 pengine: [19842]: notice: RecurringOp:  Start 
recurring monitor (30s) for prmDiskd:0 on rh63-heartbeat1
Oct 10 20:31:00 rh63-heartbeat1 pengine: [19842]: notice: LogActions: Leave   
resource prmDiskd:0#011(Started rh63-heartbeat1)
Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: do_state_transition: State 
transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS 
cause=C_IPC_MESSAGE origin=handle_response ]
Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: unpack_graph: Unpacked 
transition 2: 1 actions in 1 synapses
Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: do_te_invoke: Processing 
graph 2 (ref=pe_calc-dc-1349868660-20) derived from 
/var/lib/pengine/pe-input-2.bz2
Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: te_rsc_command: Initiating 
action 1: monitor prmDiskd:0_monitor_3 on rh63-heartbeat1 (local)
Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: do_lrm_rsc_op: Performing 
key=1:2:0:ca6a5ad2-0340-4769-bab7-289a00862ba6 op=prmDiskd:0_monitor_3 )
Oct 10 20:31:00 rh63-heartbeat1 lrmd: [19836]: info: cancel_op: operation 
monitor[4] on prmDiskd:0 for client 19839, its parameters: CRM_meta_clone=[0] 
CRM_meta_prereq=[fencing] device=[/dev/vda] name=[diskcheck_status_internal] 
CRM_meta_clone_node_max=[1] CRM_meta_clone_max=[1] CRM_meta_notify=[false] 
CRM_meta_globally_unique=[false] crm_feature_set=[3.0.1] interval=[30] 
prereq=[fencing] CRM_meta_on_fail=[restart] CRM_meta_name=[monitor] 
CRM_meta_interval=[3] CRM_meta_timeout=[6]  cancelled
Oct 10 20:31:00 rh63-heartbeat1 lrmd: [19836]: info: rsc:prmDiskd:0 monitor[5] 
(pid 20009)
Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: process_lrm_event: LRM 
operation prmDiskd:0_monitor_3 (call=4, status=1, cib-update=0, 
confirmed=true) Cancelled
Oct 10 20:31:00 rh63-heartbeat1 lrmd: [19836]: info: operation monitor[5] on 
prmDiskd:0 for client 19839: pid 20009 exited with return code 0
Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: append_digest:  
yamauchi Calculated digest 7d7c9f601095389fc7cc0c6b29c61a7a for 
prmDiskd:0_monitor_3 (0:0;1:2:0:ca6a5ad2-0340-4769-bab7-289a00862ba6). 
Source: parameters device=/dev/vda name=diskcheck_status_internal 
interval=30 prereq=fencing CRM_meta_timeout=6/
Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: process_lrm_event: LRM 
operation prmDiskd:0_monitor_3 (call=5, rc=0, cib-update=53, 
confirmed=false) ok
Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: match_graph_event: Action 
prmDiskd:0_monitor_3 (1) confirmed on rh63-heartbeat1 (rc=0)


It is a problem that crmd judges the digest to have changed even though no 
parameter was changed at all.
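
Why one extra parameter is enough to cause the mismatch can be shown with a simplified model. This Python sketch is an assumption for illustration only: Pacemaker calculates its digest over its own parameter representation, while `params_digest` here is a hypothetical stand-in that hashes sorted key=value pairs. The point is only that the prereq attribute of start, once mixed into the monitor parameters, changes the hash although the monitor definition itself is unchanged.

```python
import hashlib


def params_digest(params):
    """MD5 over sorted key=value pairs (simplified stand-in, not Pacemaker's format)."""
    canonical = " ".join(f"{key}={value}" for key, value in sorted(params.items()))
    return hashlib.md5(canonical.encode()).hexdigest()


# Parameters crmd asked for when it requested the monitor operation.
requested = {
    "device": "/dev/vda",
    "name": "diskcheck_status_internal",
    "interval": "30",
}
# Parameters as returned when the prereq attribute of start leaks in.
returned = dict(requested, prereq="fencing")

print(params_digest(requested) == params_digest(returned))  # False
```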

I made a patch.
The lrmd always gives back only a parameter depended on to a result from crmd 
and is a patch copying 

Re: [Linux-ha-dev] [Patch] The problem that the cord of the digest cord of crmd becomes mismatched for.

2012-10-10 Thread renayama19661014
Hi Dejan,

Thank you for your comments.

I will wait for Andrew's comment.
I hope that the problem is settled with the patch.

Many thanks,
Hideo Yamauchi.

--- On Wed, 2012/10/10, Dejan Muhamedagic de...@suse.de wrote:

 Hi Hideo-san,
 
 On Wed, Oct 10, 2012 at 03:22:08PM +0900, renayama19661...@ybb.ne.jp wrote:
  Hi All,
  
  We found pacemaker that we could not judge a result of the operation of 
  lrmd well.
  
  When we carry out following crm, a parameter of the operation of start is 
  given back to crmd as a result of operation of monitor.
  
  (snip)
  primitive prmDiskd ocf:pacemaker:Dummy \
          params name=diskcheck_status_internal device=/dev/vda 
 interval=30 \
          op start interval=0 timeout=60s on-fail=restart 
 prereq=fencing \
          op monitor interval=30s timeout=60s on-fail=restart \
          op stop interval=0s timeout=60s on-fail=block
  (snip)
  
  This is because lrmd gives back prereq parameter of start as a result of 
  monitor operation.
  As a result, crmd judge mismatched with a parameter of the monitor 
  operation that crmd asked lrmd for for the parameter that Irmd carried out 
  of the monitor operation.
  
  We can confirm this problem by the next command in Pacemaker1.0.12.
  
  Command 1) crm_verify command outputs the difference in digest cord.
  
  [root@rh63-heartbeat1 ~]# crm_verify -L
  crm_verify[19988]: 2012/10/10_20:29:58 CRIT: check_action_definition: 
  Parameters to prmDiskd:0_monitor_3 on rh63-heartbeat1 changed: recorded 
  7d7c9f601095389fc7cc0c6b29c61a7a vs. d38c85388dea5e8e2568c3d699eb9cce 
  (reload:3.0.1) 0:0;6:1:0:ca6a5ad2-0340-4769-bab7-289a00862ba6
  
  
  Command 2) The ptest command outputs the difference in digest cord, too.
  
  [root@rh63-heartbeat1 ~]# ptest -L -VV
  ptest[19992]: 2012/10/10_20:30:19 WARN: unpack_nodes: Blind faith: not 
  fencing unseen nodes
  ptest[19992]: 2012/10/10_20:30:19 CRIT: check_action_definition: Parameters 
  to prmDiskd:0_monitor_3 on rh63-heartbeat1 changed: recorded 
  7d7c9f601095389fc7cc0c6b29c61a7a vs. d38c85388dea5e8e2568c3d699eb9cce 
  (reload:3.0.1) 0:0;6:1:0:ca6a5ad2-0340-4769-bab7-289a00862ba6
  [root@rh63-heartbeat1 ~]# 
  
  Command 3) By cibadmin -B command, pengine restart monitor of an 
  unnecessary resource.
  
  Oct 10 20:31:00 rh63-heartbeat1 pengine: [19842]: CRIT: 
  check_action_definition: Parameters to prmDiskd:0_monitor_3 on 
  rh63-heartbeat1 changed: recorded 7d7c9f601095389fc7cc0c6b29c61a7a vs. 
  d38c85388dea5e8e2568c3d699eb9cce (reload:3.0.1) 
  0:0;6:1:0:ca6a5ad2-0340-4769-bab7-289a00862ba6
  Oct 10 20:31:00 rh63-heartbeat1 pengine: [19842]: notice: RecurringOp:  
  Start recurring monitor (30s) for prmDiskd:0 on rh63-heartbeat1
  Oct 10 20:31:00 rh63-heartbeat1 pengine: [19842]: notice: LogActions: 
  Leave   resource prmDiskd:0#011(Started rh63-heartbeat1)
  Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: do_state_transition: 
  State transition S_POLICY_ENGINE - S_TRANSITION_ENGINE [ 
  input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
  Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: unpack_graph: Unpacked 
  transition 2: 1 actions in 1 synapses
  Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: do_te_invoke: 
  Processing graph 2 (ref=pe_calc-dc-1349868660-20) derived from 
  /var/lib/pengine/pe-input-2.bz2
  Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: te_rsc_command: 
  Initiating action 1: monitor prmDiskd:0_monitor_3 on rh63-heartbeat1 
  (local)
  Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: do_lrm_rsc_op: 
  Performing key=1:2:0:ca6a5ad2-0340-4769-bab7-289a00862ba6 
  op=prmDiskd:0_monitor_3 )
  Oct 10 20:31:00 rh63-heartbeat1 lrmd: [19836]: info: cancel_op: operation 
  monitor[4] on prmDiskd:0 for client 19839, its parameters: 
  CRM_meta_clone=[0] CRM_meta_prereq=[fencing] device=[/dev/vda] 
  name=[diskcheck_status_internal] CRM_meta_clone_node_max=[1] 
  CRM_meta_clone_max=[1] CRM_meta_notify=[false] 
  CRM_meta_globally_unique=[false] crm_feature_set=[3.0.1] interval=[30] 
  prereq=[fencing] CRM_meta_on_fail=[restart] CRM_meta_name=[monitor] 
  CRM_meta_interval=[3] CRM_meta_timeout=[6]  cancelled
  Oct 10 20:31:00 rh63-heartbeat1 lrmd: [19836]: info: rsc:prmDiskd:0 
  monitor[5] (pid 20009)
  Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: process_lrm_event: LRM 
  operation prmDiskd:0_monitor_3 (call=4, status=1, cib-update=0, 
  confirmed=true) Cancelled
  Oct 10 20:31:00 rh63-heartbeat1 lrmd: [19836]: info: operation monitor[5] 
  on prmDiskd:0 for client 19839: pid 20009 exited with return code 0
  Oct 10 20:31:00 rh63-heartbeat1 crmd: [19839]: info: append_digest:  
  yamauchi Calculated digest 7d7c9f601095389fc7cc0c6b29c61a7a for 
  prmDiskd:0_monitor_3 (0:0;1:2:0:ca6a5ad2-0340-4769-bab7-289a00862ba6). 
  Source: parameters device=/dev/vda name=diskcheck_status_internal 
  interval=30 prereq=fencing CRM_meta_timeout=6/
  Oct 10 20:31:00 rh63-heartbeat1 crmd: 

Re: [Linux-ha-dev] improvements of the libvirt stonith plugin

2012-07-13 Thread renayama19661014
Hi All,

We tested the libvirt connection to ESX (VMware) on RHEL6.3.

When we connect to ESXi, a different result is returned, as with RHEL5.

[root@rh63-1 ~]# virsh -c esx://root@192.168.133.1/?no_verify=1 destroy sr2
Enter root's password for 192.168.133.1: 
error: Failed to destroy domain sr2
error: Requested operation is not valid: Domain is not powered on


Because the following results are returned, the patch which Mr. Matsuo 
contributed is useful for ESX as well.

[root@rh63-1 ~]# virsh -c esx://root@192.168.133.2/?no_verify=1 dominfo sr1
Enter root's password for 192.168.133.2: 
Id:             68
Name:           sr1
UUID:           423b6068-2b19-b80d-0ef2-0c64e3ee25b3
OS Type:        hvm
State:          running
CPU(s):         2
Max memory:     2097152 kB
Used memory:    2097152 kB
Persistent:     yes
Autostart:      disable
Managed save:   unknown

[root@rh63-1 ~]# virsh -c esx://root@192.168.133.1/?no_verify=1 dominfo sr2
Enter root's password for 192.168.133.1: 
Id:             -
Name:           sr2
UUID:           423b9c27-cded-5616-b5c9-f04f4215b663
OS Type:        hvm
State:          shut off
CPU(s):         2
Max memory:     2097152 kB
Used memory:    2097152 kB
Persistent:     yes
Autostart:      disable
Managed save:   unknown
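
The three "already stopped" messages quoted in this thread (KVM on RHEL6, Xen on RHEL5, ESX) differ in wording and in the case of "Domain". The following Python sketch shows one way to treat them uniformly. It is not the code of the libvirt stonith plugin, which is a shell script; `already_off` and the pattern are assumptions made for illustration.

```python
import re

# Wordings taken from the virsh output quoted in this thread.
ALREADY_OFF = re.compile(
    r"domain .*(is not running|isn't running|is not powered on)",
    re.IGNORECASE,
)


def already_off(virsh_error):
    """Return True if a failed 'virsh destroy' only means the domain was already off."""
    return ALREADY_OFF.search(virsh_error) is not None


kvm = "error: Requested operation is not valid: domain is not running"
xen = "error: invalid argument in Domain host1 isn't running."
esx = "error: Requested operation is not valid: Domain is not powered on"
for message in (kvm, xen, esx):
    print(already_off(message))
```

A fencing agent can then report success for a domain that is already off instead of failing the stonith operation.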


Best Regards,
Hideo Yamauchi.

--- On Fri, 2012/6/1, Takatoshi MATSUO matsuo@gmail.com wrote:

 Hi
 
 I found that fencing of libvirt plugin is failed on Xen(RHEL5),
 because virsh's outputs are different.
 
 KVM on RHEL6
 -
 # virsh destroy host1
 error: Failed to destroy domain host1
 error: Requested operation is not valid: domain is not running
 -
 
 Xen on RHEL5
 -
 # virsh destroy host1
 error: Failed to destroy domain host1
 error: invalid argument in Domain host1 isn't running.
 -
 
 I attached a patch.
 
 Regards,
 Takatoshi MATSUO
 


Re: [Linux-ha-dev] improvements of the libvirt stonith plugin

2012-07-13 Thread renayama19661014
Hi Dejan,

I confirmed that the patch was applied.

Many Thanks!
Hideo Yamauchi.

--- On Fri, 2012/7/13, Dejan Muhamedagic de...@suse.de wrote:

 Hi Hideo-san,
 
 On Fri, Jul 13, 2012 at 03:52:09PM +0900, renayama19661...@ybb.ne.jp wrote:
  Hi All,
  
  We tested the libvirt connection to ESX (VMware) on RHEL6.3.
  
  When we connect to ESXi, a different result is returned, as with RHEL5.
  
  [root@rh63-1 ~]# virsh -c esx://root@192.168.133.1/?no_verify=1 destroy sr2
  Enter root's password for 192.168.133.1: 
  error: Failed to destroy domain sr2
  error: Requested operation is not valid: Domain is not powered on
  
  
  Because the following results are returned, the patch which Mr. Matsuo 
  contributed is useful for ESX as well.
 
 Good. I missed the patch somehow, sorry about that.
 Applied now. Many thanks for the patch to Takatoshi MATSUO.
 
 I also modified the search string a bit in a later changeset.
 
 Cheers,
 
 Dejan
 
 
  [root@rh63-1 ~]# virsh -c esx://root@192.168.133.2/?no_verify=1 dominfo sr1
  Enter root's password for 192.168.133.2: 
  Id:             68
  Name:           sr1
  UUID:           423b6068-2b19-b80d-0ef2-0c64e3ee25b3
  OS Type:        hvm
  State:          running 
  CPU(s):         2
  Max memory:     2097152 kB
  Used memory:    2097152 kB
  Persistent:     yes
  Autostart:      disable
  Managed save:   unknown
  
  [root@rh63-1 ~]# virsh -c esx://root@192.168.133.1/?no_verify=1 dominfo sr2
  Enter root's password for 192.168.133.1: 
  Id:             -
  Name:           sr2
  UUID:           423b9c27-cded-5616-b5c9-f04f4215b663
  OS Type:        hvm
  State:          shut off
  CPU(s):         2
  Max memory:     2097152 kB
  Used memory:    2097152 kB
  Persistent:     yes
  Autostart:      disable
  Managed save:   unknown
  
  
  Best Regards,
  Hideo Yamauchi.
  
  --- On Fri, 2012/6/1, Takatoshi MATSUO matsuo@gmail.com wrote:
  
   Hi
   
   I found that fencing of libvirt plugin is failed on Xen(RHEL5),
   because virsh's outputs are different.
   
   KVM on RHEL6
   -
   # virsh destroy host1
   error: Failed to destroy domain host1
   error: Requested operation is not valid: domain is not running
   -
   
   Xen on RHEL5
   -
   # virsh destroy host1
   error: Failed to destroy domain host1
   error: invalid argument in Domain host1 isn't running.
   -
   
   I attached a patch.
   
   Regards,
   Takatoshi MATSUO
   


Re: [Linux-ha-dev] pull request 82 for postfix ra *call for help*

2012-05-16 Thread renayama19661014
Hi Raoul,

I understand what you mean.
I also understand that multiple directories cannot be set in data_dir, and 
that this loop is kept with a future extension of the directory check in mind.

Is my understanding wrong?

Many Thanks.
Hideo Yamauchi.

--- On Wed, 2012/5/16, Raoul Bhatia [IPAX] r.bha...@ipax.at wrote:

 Hello Hideo-san!
 
 On 16.05.2012 06:22, renayama19661...@ybb.ne.jp wrote:
  Hi Raoul,
  
  I forgot it.
  
  If you keep the loop over data_dir, is it not necessary to convert the commas 
  in data_dir into spaces?
  
  example) data_dir=`echo $data_dir | tr ',' ' '` 
 
 I think we still have a major misunderstanding :)
 
 This loop is *not* about looping multiple data directories
 (multiple data directories are not possible and an error is
 issued by the new patch)
 
 This loop is kept in place if we want to loop different, additional
 directories, for example the data_dir *and* the mail_spool_directory
 *and* the queue_directory.
 
 As of now, we do not loop more directories but the loop does not harm
 in any way, so I would rather keep it there.
 
 
 Can anyone help me to express myself in a better way or help me
 understand the real issue which Hideo-san wants to address?
 *Please* :)
 
 Cheers,
 Raoul
 


Re: [Linux-ha-dev] pull request 82 for postfix ra *call for help*

2012-05-16 Thread renayama19661014
Hi Raoul,

Thank you for comments.

I agree to your correction.
I am sorry that I confused you.

Many Thanks!
Hideo Yamauchi.

--- On Wed, 2012/5/16, Raoul Bhatia [IPAX] r.bha...@ipax.at wrote:

 Hello Hideo-san!
 
 On 16.05.2012 08:12, renayama19661...@ybb.ne.jp wrote:
  Hi Raoul,
  
  I understand what you mean.
  I also understand that multiple directories cannot be set in data_dir, and 
  that this loop is kept with a future extension of the directory check in mind.
 
 Mhm, I *think* so.
 
 So can we agree that there is nothing left to do and I can issue another
 pull request? :)
 
 Otherwise, I'm confused on what you're expecting from me.
 (If it is simply removing the loop because there currently is *no need*
 for looping, to which I agree, I would still refrain from this
 particular change because we would not gain anything here.)
 
 Thanks,
 Raoul
 
 


Re: [Linux-ha-dev] [Patch] The patch which revises memory leak.

2012-05-16 Thread renayama19661014
Hi Lars,

 Pushed to http://hg.linux-ha.org/glue

We confirmed that the problem is resolved with your patch.

Many Thanks!
Hideo Yamauchi.

--- On Thu, 2012/5/17, Lars Ellenberg lars.ellenb...@linbit.com wrote:

 On Wed, May 16, 2012 at 09:33:48AM +0900, renayama19661...@ybb.ne.jp wrote:
  Hi Lars,
  
  I will test your patch in the environment where we confirmed the leak.
 
 Pushed to http://hg.linux-ha.org/glue
 
 Thanks,
 
 -- 
 : Lars Ellenberg
 : LINBIT | Your Way to High Availability
 : DRBD/HA support and consulting http://www.linbit.com
 
 DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.


Re: [Linux-ha-dev] [Patch] The patch which revises memory leak.

2012-05-15 Thread renayama19661014
Hi Lars,

Sorry for the late reply.

I will test your patch in the environment where we confirmed the leak.

Many Thanks,
Hideo Yamauchi.

--- On Wed, 2012/5/16, Lars Ellenberg lars.ellenb...@linbit.com wrote:

 On Tue, May 15, 2012 at 11:14:53AM +0200, Lars Ellenberg wrote:
  On Mon, May 14, 2012 at 05:44:55PM +0200, Lars Ellenberg wrote:
By the way, I suspect Lars' suggestion would work fine.  I would 
certainly explain what the better patch is in the comments when you 
apply this one.
 
  Hm. Looks like it *does* explode (aka segfault)
 
 Continuing my monologue ...
 it may just have been incomplete.
 
 The patch below seems to work just fine.
 
 I managed to occasionally trigger the
     Attempt to remove timeout (%u) with NULL source
 message, but I have seen that one without the patch as well,
 so that may just be some other oddity somewhere:
 double removal of timeout resources ;-)
 
 We can find and drop those later,
 they look harmless enough.
 
 I do not see any memleak anywhere anymore with this patch applied.
 
 Comments/review/testing welcome.
 
 # HG changeset patch
 # User Lars Ellenberg l...@linbit.com
 # Date 1337066453 -7200
 # Node ID e63dd41f46b7bd150a23a62303bde6be78305c9c
 # Parent  63d968249025b245e38b1da6d0202438ec45ebf3
 [mq]: potential-fix-for-timer-leak
 
 diff --git a/lib/clplumbing/GSource.c b/lib/clplumbing/GSource.c
 --- a/lib/clplumbing/GSource.c
 +++ b/lib/clplumbing/GSource.c
 @@ -1507,6 +1507,7 @@
      g_source_set_callback(source, function, data, notify); 
  
      append->gsourceid = g_source_attach(source, NULL);
 +    g_source_unref(source);
      return append->gsourceid;
  
  }
 @@ -1517,14 +1518,12 @@
      GSource* source = g_main_context_find_source_by_id(NULL,tag);
      struct GTimeoutAppend* append = GTIMEOUT(source);
      
 -    g_source_remove(tag);
 -    
      if (source == NULL){
          cl_log(LOG_ERR, "Attempt to remove timeout (%u)"
          " with NULL source",    tag);
      }else{
          g_assert(IS_TIMEOUTSRC(append));
 -        g_source_unref(source);
 +        g_source_remove(tag);
      }
      
      return;
 
 -- 
 : Lars Ellenberg
 : LINBIT | Your Way to High Availability
 : DRBD/HA support and consulting http://www.linbit.com


Re: [Linux-ha-dev] pull request 82 for postfix ra

2012-05-15 Thread renayama19661014
Hi Raoul,

  I think the only patch left is postfix.patch.1121 from
  http://www.gossamer-threads.com/lists/linuxha/dev/76532#76532 right?
  
   diff -r aaf72a017c98 postfix
   --- a/postfixMon Nov 21 10:32:33 2011 +0900
   +++ b/postfixMon Nov 21 10:34:08 2011 +0900
   @@ -264,7 +264,13 @@
fi
   
if ocf_is_true $status_support; then
   -data_dir=`postconf $OPTION_CONFIG_DIR -h data_directory 
   2>/dev/null`
   +orig_data_dir=`postconf $OPTION_CONFIG_DIR -h data_directory 
   2>/dev/null`
   +data_dir=`echo $orig_data_dir | tr ',' ' '`
   +dcount=`echo $data_dir | wc -w`
   +if [ $dcount -gt 1 ]; then
   +ocf_log err Postfix data directory '$orig_data_dir' 
   cannot set plural parameters.
   +return $OCF_ERR_PERM
   +fi
if [ ! -d $data_dir ]; then
if ocf_is_probe; then
ocf_log info Postfix data directory '$data_dir' not 
   readable during probe.
  
  i would slightly modify this:
  
  - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
  diff --git a/heartbeat/postfix b/heartbeat/postfix
  index 273d5c9..2f4ab13 100755
  --- a/heartbeat/postfix
  +++ b/heartbeat/postfix
  @@ -264,6 +264,11 @@ postfix_validate_all()
  
   if ocf_is_true $status_support; then
   data_dir=`postconf $OPTION_CONFIG_DIR -h data_directory 
  2>/dev/null`
  +data_dir_count=`echo $data_dir | tr ',' ' ' | wc -w`
  +if [ $data_dir_count -gt 1 ]; then
  +   ocf_log err Postfix data directory '$orig_data_dir' cannot 
  be set to multiple directories.
  +return $OCF_ERR_INSTALLED
  +fi
   if [ ! -d $data_dir ]; then
   if ocf_is_probe; then
   ocf_log info Postfix data directory '$data_dir' not 
  readable during probe.
  
  - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Thanks!
I agree to the patch which you changed.


  
  what do you think about that?
  
   @@ -278,16 +284,14 @@
# check directory permissions
if ocf_is_true $status_support; then
user=`postconf $OPTION_CONFIG_DIR -h mail_owner 2>/dev/null`
   -for dir in $data_dir; do
   -if ! su -s /bin/sh - $user -c test -w $dir; then
   -if ocf_is_probe; then
   -ocf_log info Directory '$dir' is not writable 
   by user '$user' during probe.
   -else
   -ocf_log err Directory '$dir' is not writable by 
   user '$user'.
   -return $OCF_ERR_PERM;
   -fi
   +if ! su -s /bin/sh - $user -c test -w $data_dir; then
   +if ocf_is_probe; then
   +ocf_log info Directory '$data_dir' is not writable 
   by user '$user' during probe.
   +else
   +ocf_log err Directory '$data_dir' is not writable 
   by user '$user'.
   +return $OCF_ERR_PERM;
fi
   -done
   +fi
fi
fi
   
  
  As outlined, i see no benefit in removing the loop and would like to
  keep it in case we want to check some other directories in the future.

Okay.
But then, does the loop over data_dir not have to be changed as follows?

   -for dir in "$data_dir"; do
   +for dir in $data_dir; do

Many Thanks,
Hideo Yamauchi.


--- On Tue, 2012/5/15, renayama19661...@ybb.ne.jp renayama19661...@ybb.ne.jp 
wrote:

 Hi Raoul,
 
 Thank you for comments.
 
 I am slightly busy.
 I confirm it and will send an email tomorrow.
 
 Best Regards,
 Hideo Yamauchi.
 
 --- On Fri, 2012/5/11, Raoul Bhatia [IPAX] r.bha...@ipax.at wrote:
 
  Hi Hideo-san!
  
  On 2012-05-11 02:09, renayama19661...@ybb.ne.jp wrote:
   Hi Raoul,
   Hi Dejan,
   
   Thank you for merging it into the repository.
   
   To Raoul :
     The matter in the following email is still open.
     Please tell me your opinion.
     * http://www.gossamer-threads.com/lists/linuxha/dev/76409
  
  I think the only patch left is postfix.patch.1121 from
  http://www.gossamer-threads.com/lists/linuxha/dev/76532#76532 right?
  
   diff -r aaf72a017c98 postfix
   --- a/postfix    Mon Nov 21 10:32:33 2011 +0900
   +++ b/postfix    Mon Nov 21 10:34:08 2011 +0900
   @@ -264,7 +264,13 @@
            fi
   
            if ocf_is_true $status_support; then
   -            data_dir=`postconf $OPTION_CONFIG_DIR -h data_directory 
   2>/dev/null`
   +            orig_data_dir=`postconf $OPTION_CONFIG_DIR -h data_directory 
   2>/dev/null`
   +            data_dir=`echo $orig_data_dir | tr ',' ' '`
   +            dcount=`echo $data_dir | wc -w`
   +            if [ $dcount -gt 1 ]; then
   +                    ocf_log err Postfix data directory 

Re: [Linux-ha-dev] pull request 82 for postfix ra

2012-05-15 Thread renayama19661014
Hi Raoul,

I forgot one thing.

If you keep the loop over data_dir, is it not necessary to convert the commas 
in data_dir into spaces?

example) data_dir=`echo $data_dir | tr ',' ' '` 
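
What the conversion changes for a loop can be sketched in Python. This is an illustration only, not the resource agent code: `split_data_dir` is a hypothetical helper that mimics `tr ',' ' '` followed by the shell's word splitting, and the two paths are made-up example values.

```python
def split_data_dir(value):
    """Mimic `echo $data_dir | tr ',' ' '` followed by shell word splitting."""
    return value.replace(",", " ").split()


# Without the conversion, a comma-separated value stays a single word.
print("/var/lib/postfix,/var/spool/postfix".split())
# With the conversion, the loop sees one entry per directory.
for directory in split_data_dir("/var/lib/postfix,/var/spool/postfix"):
    print(directory)
```

The same count is what `wc -w` returns in the patches quoted in this thread, so a result greater than 1 marks the multiple-directory case.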

Best Regards,
Hideo Yamauchi.


--- On Wed, 2012/5/16, renayama19661...@ybb.ne.jp renayama19661...@ybb.ne.jp 
wrote:

 Hi Raoul,
 
   I think the only patch left is postfix.patch.1121 from
   http://www.gossamer-threads.com/lists/linuxha/dev/76532#76532 right?
   
diff -r aaf72a017c98 postfix
--- a/postfix    Mon Nov 21 10:32:33 2011 +0900
+++ b/postfix    Mon Nov 21 10:34:08 2011 +0900
@@ -264,7 +264,13 @@
             fi

             if ocf_is_true $status_support; then
-            data_dir=`postconf $OPTION_CONFIG_DIR -h data_directory 
2>/dev/null`
+            orig_data_dir=`postconf $OPTION_CONFIG_DIR -h 
data_directory 2>/dev/null`
+            data_dir=`echo $orig_data_dir | tr ',' ' '`
+            dcount=`echo $data_dir | wc -w`
+            if [ $dcount -gt 1 ]; then
+                    ocf_log err Postfix data directory 
'$orig_data_dir' cannot set plural parameters.
+                    return $OCF_ERR_PERM
+            fi
                 if [ ! -d $data_dir ]; then
                     if ocf_is_probe; then
                         ocf_log info Postfix data directory '$data_dir' 
   not readable during probe.
   
   i would slightly modify this:
   
   - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
   diff --git a/heartbeat/postfix b/heartbeat/postfix
   index 273d5c9..2f4ab13 100755
   --- a/heartbeat/postfix
   +++ b/heartbeat/postfix
   @@ -264,6 +264,11 @@ postfix_validate_all()
   
            if ocf_is_true $status_support; then
                data_dir=`postconf $OPTION_CONFIG_DIR -h data_directory 
  2>/dev/null`
   +            data_dir_count=`echo $data_dir | tr ',' ' ' | wc -w`
   +            if [ $data_dir_count -gt 1 ]; then
   +               ocf_log err Postfix data directory '$orig_data_dir' 
   cannot be set to multiple directories.
   +                return $OCF_ERR_INSTALLED
   +            fi
                if [ ! -d $data_dir ]; then
                    if ocf_is_probe; then
                        ocf_log info Postfix data directory '$data_dir' not 
  readable during probe.
   
   - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
 
 Thanks!
 I agree to the patch which you changed.
 
 
   
   what do you think about that?
   
@@ -278,16 +284,14 @@
             # check directory permissions
             if ocf_is_true $status_support; then
                 user=`postconf $OPTION_CONFIG_DIR -h mail_owner 2>/dev/null`
-            for dir in $data_dir; do
-                if ! su -s /bin/sh - $user -c "test -w $dir"; then
-                    if ocf_is_probe; then
-                        ocf_log info "Directory '$dir' is not writable by user '$user' during probe."
-                    else
-                        ocf_log err "Directory '$dir' is not writable by user '$user'."
-                        return $OCF_ERR_PERM;
-                    fi
+            if ! su -s /bin/sh - $user -c "test -w $data_dir"; then
+                if ocf_is_probe; then
+                    ocf_log info "Directory '$data_dir' is not writable by user '$user' during probe."
+                else
+                    ocf_log err "Directory '$data_dir' is not writable by user '$user'."
+                    return $OCF_ERR_PERM;
                     fi
-            done
+            fi
             fi
         fi

   
   As outlined, i see no benefit in removing the loop and would like to
   keep it in case we want to check some other directories in the future.
 
 Okay.
 But in that case, doesn't the loop over data_dir have to be changed as follows?
 
-            for dir in $data_dir; do
+            for dir in $data_dir; do
 
 Many Thanks,
 Hideo Yamauchi.
 
 
 --- On Tue, 2012/5/15, renayama19661...@ybb.ne.jp 
 renayama19661...@ybb.ne.jp wrote:
 
  Hi Raoul,
  
  Thank you for comments.
  
  I am a little busy at the moment.
  I will check it and send an email tomorrow.
  
  Best Regards,
  Hideo Yamauchi.
  
  --- On Fri, 2012/5/11, Raoul Bhatia [IPAX] r.bha...@ipax.at wrote:
  
   Hi Hideo-san!
   
   On 2012-05-11 02:09, renayama19661...@ybb.ne.jp wrote:
Hi Raoul,
Hi Dejan,

Thank you for merging the patches into the repository.

To Raoul :
      The matter raised in the following email is still open.
      Please let me know your opinion.
      * http://www.gossamer-threads.com/lists/linuxha/dev/76409
   
   I think the only patch left is postfix.patch.1121 from
   http://www.gossamer-threads.com/lists/linuxha/dev/76532#76532 right?
   
diff -r aaf72a017c98 postfix
--- a/postfix    Mon Nov 21 10:32:33 2011 +0900
+++ b/postfix    Mon Nov 21 10:34:08 2011 +0900
@@ -264,7 +264,13 @@
             fi

             if 

Re: [Linux-ha-dev] pull request 82 for postfix ra

2012-05-14 Thread renayama19661014
Hi Raoul,

Thank you for comments.

I am a little busy at the moment.
I will check it and send an email tomorrow.

Best Regards,
Hideo Yamauchi.

--- On Fri, 2012/5/11, Raoul Bhatia [IPAX] r.bha...@ipax.at wrote:

 Hi Hideo-san!
 
 On 2012-05-11 02:09, renayama19661...@ybb.ne.jp wrote:
  Hi Raoul,
  Hi Dejan,
  
  Thank you for merging the patches into the repository.
  
  To Raoul :
    The matter raised in the following email is still open.
    Please let me know your opinion.
    * http://www.gossamer-threads.com/lists/linuxha/dev/76409
 
 I think the only patch left is postfix.patch.1121 from
 http://www.gossamer-threads.com/lists/linuxha/dev/76532#76532 right?
 
  diff -r aaf72a017c98 postfix
  --- a/postfix    Mon Nov 21 10:32:33 2011 +0900
  +++ b/postfix    Mon Nov 21 10:34:08 2011 +0900
  @@ -264,7 +264,13 @@
           fi
  
           if ocf_is_true $status_support; then
  -            data_dir=`postconf $OPTION_CONFIG_DIR -h data_directory 2>/dev/null`
  +            orig_data_dir=`postconf $OPTION_CONFIG_DIR -h data_directory 2>/dev/null`
  +            data_dir=`echo $orig_data_dir | tr ',' ' '`
  +            dcount=`echo $data_dir | wc -w`
  +            if [ $dcount -gt 1 ]; then
  +                    ocf_log err "Postfix data directory '$orig_data_dir' cannot set plural parameters."
  +                    return $OCF_ERR_PERM
  +            fi
               if [ ! -d $data_dir ]; then
                   if ocf_is_probe; then
                       ocf_log info "Postfix data directory '$data_dir' not readable during probe."
 
 i would slightly modify this:
 
 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
 diff --git a/heartbeat/postfix b/heartbeat/postfix
 index 273d5c9..2f4ab13 100755
 --- a/heartbeat/postfix
 +++ b/heartbeat/postfix
 @@ -264,6 +264,11 @@ postfix_validate_all()
 
          if ocf_is_true $status_support; then
              data_dir=`postconf $OPTION_CONFIG_DIR -h data_directory 2>/dev/null`
 +            data_dir_count=`echo $data_dir | tr ',' ' ' | wc -w`
 +            if [ $data_dir_count -gt 1 ]; then
 +               ocf_log err "Postfix data directory '$orig_data_dir' cannot be set to multiple directories."
 +                return $OCF_ERR_INSTALLED
 +            fi
              if [ ! -d $data_dir ]; then
                  if ocf_is_probe; then
                      ocf_log info "Postfix data directory '$data_dir' not readable during probe."
 
 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
 
 what do you think about that?
 
  @@ -278,16 +284,14 @@
           # check directory permissions
           if ocf_is_true $status_support; then
               user=`postconf $OPTION_CONFIG_DIR -h mail_owner 2>/dev/null`
  -            for dir in $data_dir; do
  -                if ! su -s /bin/sh - $user -c "test -w $dir"; then
  -                    if ocf_is_probe; then
  -                        ocf_log info "Directory '$dir' is not writable by user '$user' during probe."
  -                    else
  -                        ocf_log err "Directory '$dir' is not writable by user '$user'."
  -                        return $OCF_ERR_PERM;
  -                    fi
  +            if ! su -s /bin/sh - $user -c "test -w $data_dir"; then
  +                if ocf_is_probe; then
  +                    ocf_log info "Directory '$data_dir' is not writable by user '$user' during probe."
  +                else
  +                    ocf_log err "Directory '$data_dir' is not writable by user '$user'."
  +                    return $OCF_ERR_PERM;
                   fi
  -            done
  +            fi
           fi
       fi
  
 
 As outlined, i see no benefit in removing the loop and would like to
 keep it in case we want to check some other directories in the future.
 
 quoting http://www.gossamer-threads.com/lists/linuxha/dev/76453#76453 :
 
  the current loop:
  for dir in $data_dir; do
  ...
  done
  (looping exactly one dir)
  
  could easily be enhanced to check more dirs, e.g.:
  for dir in $data_dir $data_dir/active $data_dir/incoming; do
  ...
  done
  (looping three dirs)
  
  without having to re-introduce the loop.
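 
 The loop quoted above can be sketched as a stand-alone script. This is only an illustration under assumptions: check_dirs_writable is a hypothetical helper, and the writability test runs as the current user instead of through su as the mail_owner, so that it works without root.

```shell
#!/bin/sh
# Hypothetical stand-alone version of the permission loop. The RA runs the
# test as the mail_owner via su; here it runs as the current user.
check_dirs_writable() {
    for dir in "$@"; do
        if ! test -w "$dir"; then
            echo "Directory '$dir' is not writable." >&2
            return 1
        fi
    done
    return 0
}

# Sample directory only; the RA would use the value reported by postconf.
data_dir=${TMPDIR:-/tmp}

# One directory today ...
check_dirs_writable "$data_dir" && echo "ok: $data_dir"

# ... and checking more directories only means listing more arguments.
check_dirs_writable "$data_dir" "$data_dir/does-not-exist" \
    || echo "extended check stopped at the first failing directory"
```

 Keeping the loop means that a later change only touches the list of directories, not the body of the check.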
 
 Cheers,
 Raoul
 -- 
 DI (FH) Raoul Bhatia M.Sc.          email.          r.bha...@ipax.at
 Technischer Leiter
 
 IPAX - Aloy Bhatia Hava OG          web.          http://www.ipax.at
 Barawitzkagasse 10/2/2/11           email.            off...@ipax.at
 1190 Wien                           tel.               +43 1 3670030
 FN 277995t HG Wien                  fax.            +43 1 3670030 15
 
 
 
 


Re: [Linux-ha-dev] pull request 82 for postfix ra

2012-05-10 Thread renayama19661014
Hi Raoul,
Hi Dejan,

Thank you for merging the patches into the repository.

To Raoul :
 The matter raised in the following email is still open.
 Please let me know your opinion.
 * http://www.gossamer-threads.com/lists/linuxha/dev/76409

Best Regards,
Hideo Yamauchi.

--- On Thu, 2012/5/10, Dejan Muhamedagic de...@suse.de wrote:

 Hi Raoul,
 
 On Thu, May 10, 2012 at 01:21:42PM +0200, Raoul Bhatia [IPAX] wrote:
  Hi!
  
  While reviewing my repository and patch collection for the
  resource agents, i opened a pull request [1] for the postfix patches
  that have been lurking around in my repository since a couple of months.
  
  I think there is some outstanding discussion with Hideo-san but
  I would like to pick them up afterwards.
  
  Comments and feedback are welcome!
 
 I just pulled the patches. Thanks!
 
 Cheers,
 
 Dejan
 
  Thanks,
  Raoul
  
  [1] https://github.com/ClusterLabs/resource-agents/pull/82
  -- 
  
  DI (FH) Raoul Bhatia M.Sc.          email.          r.bha...@ipax.at
  Technischer Leiter
  
  IPAX - Aloy Bhatia Hava OG          web.          http://www.ipax.at
  Barawitzkagasse 10/2/2/11           email.            off...@ipax.at
  1190 Wien                           tel.               +43 1 3670030
  FN 277995t HG Wien                  fax.            +43 1 3670030 15
  
  


Re: [Linux-ha-dev] [Patch] The patch which revises memory leak.

2012-05-08 Thread renayama19661014
Hi Alan,

Thank you for comments.

 FYI: there is code in the heartbeat communication layer which is quite happy 
 to simulate lost packets.
 
 I made it difficult to turn on accidentally.  Read the code for details if 
 you're interested.

All right.

Many Thanks,
Hideo Yamauchi.

--- On Tue, 2012/5/8, Alan Robertson al...@unix.sh wrote:

 FYI: there is code in the heartbeat communication layer which is quite happy 
 to simulate lost packets.
 
 I made it difficult to turn on accidentally.  Read the code for details if 
 you're interested.
 
 
 
 On 04/30/2012 10:21 PM, renayama19661...@ybb.ne.jp wrote:
  Hi Lars,
  
  We confirmed that this problem occurred with v1 mode of Heartbeat.
    * The problem happens with the v2 mode in the same way.
  
  We reproduced the problem with the following procedure.
  
  Step 1) Put a special device in the network that drops the communication 
  packets of Heartbeat.
  
  Step 2) The nodes then retransmit messages to each other repeatedly.
  
  Step 3) The memory of the master process then grows little by little.
  
  
   As a result of the ps command of the master process --
  * node1
  (start)
  32126 ?        SLs    0:00      0   182 53989  7128  0.0 heartbeat: master 
  control process
  (One hour later)
  32126 ?        SLs    0:03      0   182 54729  7868  0.0 heartbeat: master 
  control process
  (Two hour later)
  32126 ?        SLs    0:08      0   182 55317  8456  0.0 heartbeat: master 
  control process
  (Four hours later)
  32126 ?        SLs    0:24      0   182 56673  9812  0.0 heartbeat: master 
  control process
  
  * node2
  (start)
  31928 ?        SLs    0:00      0   182 53989  7128  0.0 heartbeat: master 
  control process
  (One hour later)
  31928 ?        SLs    0:02      0   182 54481  7620  0.0 heartbeat: master 
  control process
  (Two hour later)
  31928 ?        SLs    0:08      0   182 55353  8492  0.0 heartbeat: master 
  control process
  (Four hours later)
  31928 ?        SLs    0:23      0   182 56689  9828  0.0 heartbeat: master 
  control process
  
  
  The amount of leaked memory seems to differ between nodes, depending on the 
  number of retransmissions.
  
  With my patch applied, the memory no longer grows.
  
  A similar fix seems to be necessary in send_reqnodes_msg(), but that leak 
  appears to be small.
  
  Best Regards,
  Hideo Yamauchi.
  
  
  --- On Sat, 2012/4/28, renayama19661...@ybb.ne.jp 
  renayama19661...@ybb.ne.jp wrote:
  
  Hi Lars,
  
  Thank you for comments.
  
  Have you actually been able to measure that memory leak you observed,
  and you can confirm this patch will fix it?
  
  Because I don't think this patch has any effect.
  
  Yes.
  I actually measured the leak.
  I can show the results next week.
  (Japan is on holiday until Tuesday.)
  
  send_rexmit_request() is only used as parameter to
  Gmain_timeout_add_full, and it returns FALSE always,
  which should cause the respective sourceid to be auto-removed.
  
  It seems that the GSource has to be released explicitly somehow.
  lrmd appears to release it in a similar way.
  
  Best Regards,
  Hideo Yamauchi.
  
  
  --- On Fri, 2012/4/27, Lars Ellenberg lars.ellenb...@linbit.com wrote:
  
  On Thu, Apr 26, 2012 at 10:56:30AM +0900, renayama19661...@ybb.ne.jp 
  wrote:
  Hi All,
  
  We ran a test that assumes a remote cluster environment,
  and we tested packet loss.
  
  The retransmission timer of Heartbeat causes a memory leak.
  
  I am contributing a patch.
  Please review the contents of the patch
  and apply it to the Heartbeat repository.
  
  Have you actually been able to measure that memory leak you observed,
  and you can confirm this patch will fix it?
  
  Because I don't think this patch has any effect.
  
  send_rexmit_request() is only used as parameter to
  Gmain_timeout_add_full, and it returns FALSE always,
  which should cause the respective sourceid to be auto-removed.
  
  
  diff -r 106ca984041b heartbeat/hb_rexmit.c
  --- a/heartbeat/hb_rexmit.c    Thu Apr 26 19:28:26 2012 +0900
  +++ b/heartbeat/hb_rexmit.c    Thu Apr 26 19:31:44 2012 +0900
  @@ -164,6 +164,8 @@
         seqno_t seq = (seqno_t) ri->seq;
         struct node_info* node = ri->node;
         struct ha_msg*    hmsg;
  +    unsigned long           sourceid;
  +    gpointer value;
  
         if (STRNCMP_CONST(node->status, UPSTATUS) != 0 &&
             STRNCMP_CONST(node->status, ACTIVESTATUS) !=0) {
  @@ -196,11 +198,17 @@
  
         node->track.last_rexmit_req = time_longclock();
  
  -    if (!g_hash_table_remove(rexmit_hash_table, ri)){
  -        cl_log(LOG_ERR, "%s: entry not found in rexmit_hash_table"
  -               "for seq/node(%ld %s)",
  -               __FUNCTION__, ri->seq, ri->node->nodename);
  -        return FALSE;
  +    value = g_hash_table_lookup(rexmit_hash_table, ri);
  +    if ( value != NULL) {
  +        sourceid = (unsigned long) value;
  +        

Re: [Linux-ha-dev] [Patch] The patch which revises memory leak.

2012-05-02 Thread renayama19661014
Hi Lars,

Thank you for comments.

  
  And after more than a full day has passed:
  
  * node1
  32126 ?        SLs   79:52      0   182 71189 24328  0.1 heartbeat: master 
  control process                        
  
  * node2
  31928 ?        SLs   77:01      0   182 70869 24008  0.1 heartbeat: master 
  control process
 
 Oh, I see.
 
 This is a design choice (maybe not even intentional) of the Gmain_*
 wrappers used throughout the heartbeat code.
 
 The real glib g_timeout_add_full(), and most other similar functions,
 internally do
  id = g_source_attach(source, ...);
  g_source_unref(source);
  return id;
 
 Thus in g_main_dispatch, the
  need_destroy = ! dispatch (...)
  if (need_destroy)
      g_source_destroy_internal()
 
 in fact ends up destroying it,
 if dispatch() returns FALSE,
 as documented: 
     The function is called repeatedly until it returns FALSE, at
     which point the timeout is automatically destroyed and the
     function will not be called again.
 
 Not so with the heartbeat/glue/libplumbing Gmain_timeout_add_full.
 It does not g_source_unref(), so we keep the extra reference around
 until someone explicitly calls Gmain_timeout_remove().
 
 Talk about principle of least surprise :(
 
 Changing this behaviour to match glib's, i.e. unref'ing after
 g_source_attach, would seem like the correct thing to do,
 but is likely to break other pieces of code in subtle ways,
 so it may not be the right thing to do at this point.

Thank you for the detailed explanation.
If you find a method that is more appropriate than the fix I suggested, I am 
happy to go with it.

 I'm going to take your patch more or less as is.
 If it does not show up soon, prod me again.
 

All right.

Many Thanks!
Hideo Yamauchi.  


 Thank you for tracking this down.
 
  Best Regards,
  Hideo Yamauchi.
  
  
  --- On Tue, 2012/5/1, renayama19661...@ybb.ne.jp 
  renayama19661...@ybb.ne.jp wrote:
  
   Hi Lars,
   
   We confirmed that this problem occurred with v1 mode of Heartbeat.
    * The problem happens with the v2 mode in the same way.
   
   We reproduced the problem with the following procedure.
   
   Step 1) Put a special device in the network that drops the communication 
   packets of Heartbeat.
   
   Step 2) The nodes then retransmit messages to each other repeatedly.
   
   Step 3) The memory of the master process then grows little by little.
   
   
    As a result of the ps command of the master process --
   * node1
   (start)
   32126 ?        SLs    0:00      0   182 53989  7128  0.0 heartbeat: 
   master control process
   (One hour later)
   32126 ?        SLs    0:03      0   182 54729  7868  0.0 heartbeat: 
   master control process
   (Two hour later)
   32126 ?        SLs    0:08      0   182 55317  8456  0.0 heartbeat: 
   master control process
   (Four hours later)
   32126 ?        SLs    0:24      0   182 56673  9812  0.0 heartbeat: 
   master control process 
   
   * node2
   (start)
   31928 ?        SLs    0:00      0   182 53989  7128  0.0 heartbeat: 
   master control process
   (One hour later)
   31928 ?        SLs    0:02      0   182 54481  7620  0.0 heartbeat: 
   master control process
   (Two hour later)
   31928 ?        SLs    0:08      0   182 55353  8492  0.0 heartbeat: 
   master control process
   (Four hours later)
   31928 ?        SLs    0:23      0   182 56689  9828  0.0 heartbeat: 
   master control process
   
   
   The amount of leaked memory seems to differ between nodes, depending on the 
   number of retransmissions.
   
   With my patch applied, the memory no longer grows.
   
   A similar fix seems to be necessary in send_reqnodes_msg(), but that leak 
   appears to be small.
   
   Best Regards,
   Hideo Yamauchi.
   
   
   --- On Sat, 2012/4/28, renayama19661...@ybb.ne.jp 
   renayama19661...@ybb.ne.jp wrote:
   
Hi Lars,

Thank you for comments.

 Have you actually been able to measure that memory leak you observed,
 and you can confirm this patch will fix it?
 
 Because I don't think this patch has any effect.

Yes.
I actually measured the leak.
I can show the results next week.
(Japan is on holiday until Tuesday.)

 
 send_rexmit_request() is only used as parameter to
 Gmain_timeout_add_full, and it returns FALSE always,
 which should cause the respective sourceid to be auto-removed.

It seems that the GSource has to be released explicitly somehow.
lrmd appears to release it in a similar way.

Best Regards,
Hideo Yamauchi.


--- On Fri, 2012/4/27, Lars Ellenberg lars.ellenb...@linbit.com wrote:

 On Thu, Apr 26, 2012 at 10:56:30AM +0900, renayama19661...@ybb.ne.jp 
 wrote:
  Hi All,
  
  We ran a test that assumes a remote cluster environment,
  and we tested packet loss.
  
  The retransmission timer of Heartbeat causes a memory leak.
  
  I donate a 

Re: [Linux-ha-dev] [Patch] The patch which revises memory leak.

2012-05-01 Thread renayama19661014
Hi Lars,

And after more than a full day has passed:

* node1
32126 ?        SLs   79:52      0   182 71189 24328  0.1 heartbeat: master control process

* node2
31928 ?        SLs   77:01      0   182 70869 24008  0.1 heartbeat: master control process


Best Regards,
Hideo Yamauchi.


--- On Tue, 2012/5/1, renayama19661...@ybb.ne.jp renayama19661...@ybb.ne.jp 
wrote:

 Hi Lars,
 
 We confirmed that this problem occurred with v1 mode of Heartbeat.
  * The problem happens with the v2 mode in the same way.
 
 We reproduced the problem with the following procedure.
 
 Step 1) Put a special device in the network that drops the communication 
 packets of Heartbeat.
 
 Step 2) The nodes then retransmit messages to each other repeatedly.
 
 Step 3) The memory of the master process then grows little by little.
 
 
  As a result of the ps command of the master process --
 * node1
 (start)
 32126 ?        SLs    0:00      0   182 53989  7128  0.0 heartbeat: master 
 control process
 (One hour later)
 32126 ?        SLs    0:03      0   182 54729  7868  0.0 heartbeat: master 
 control process
 (Two hour later)
 32126 ?        SLs    0:08      0   182 55317  8456  0.0 heartbeat: master 
 control process
 (Four hours later)
 32126 ?        SLs    0:24      0   182 56673  9812  0.0 heartbeat: master 
 control process 
 
 * node2
 (start)
 31928 ?        SLs    0:00      0   182 53989  7128  0.0 heartbeat: master 
 control process
 (One hour later)
 31928 ?        SLs    0:02      0   182 54481  7620  0.0 heartbeat: master 
 control process
 (Two hour later)
 31928 ?        SLs    0:08      0   182 55353  8492  0.0 heartbeat: master 
 control process
 (Four hours later)
 31928 ?        SLs    0:23      0   182 56689  9828  0.0 heartbeat: master 
 control process
 
 
 The amount of leaked memory seems to differ between nodes, depending on the 
 number of retransmissions.
 
 With my patch applied, the memory no longer grows.
 
 A similar fix seems to be necessary in send_reqnodes_msg(), but that leak 
 appears to be small.
 
 Best Regards,
 Hideo Yamauchi.
 
 
 --- On Sat, 2012/4/28, renayama19661...@ybb.ne.jp 
 renayama19661...@ybb.ne.jp wrote:
 
  Hi Lars,
  
  Thank you for comments.
  
   Have you actually been able to measure that memory leak you observed,
   and you can confirm this patch will fix it?
   
   Because I don't think this patch has any effect.
  
  Yes.
  I actually measured the leak.
  I can show the results next week.
  (Japan is on holiday until Tuesday.)
  
   
   send_rexmit_request() is only used as parameter to
   Gmain_timeout_add_full, and it returns FALSE always,
   which should cause the respective sourceid to be auto-removed.
  
  It seems that the GSource has to be released explicitly somehow.
  lrmd appears to release it in a similar way.
  
  Best Regards,
  Hideo Yamauchi.
  
  
  --- On Fri, 2012/4/27, Lars Ellenberg lars.ellenb...@linbit.com wrote:
  
   On Thu, Apr 26, 2012 at 10:56:30AM +0900, renayama19661...@ybb.ne.jp 
   wrote:
Hi All,

We ran a test that assumes a remote cluster environment,
and we tested packet loss.

The retransmission timer of Heartbeat causes a memory leak.

I am contributing a patch.
Please review the contents of the patch
and apply it to the Heartbeat repository.
   
   Have you actually been able to measure that memory leak you observed,
   and you can confirm this patch will fix it?
   
   Because I don't think this patch has any effect.
   
   send_rexmit_request() is only used as parameter to
   Gmain_timeout_add_full, and it returns FALSE always,
   which should cause the respective sourceid to be auto-removed.
   
   
diff -r 106ca984041b heartbeat/hb_rexmit.c
--- a/heartbeat/hb_rexmit.c    Thu Apr 26 19:28:26 2012 +0900
+++ b/heartbeat/hb_rexmit.c    Thu Apr 26 19:31:44 2012 +0900
@@ -164,6 +164,8 @@
         seqno_t seq = (seqno_t) ri->seq;
         struct node_info* node = ri->node;
         struct ha_msg*    hmsg;
+    unsigned long           sourceid;
+    gpointer value;
     
         if (STRNCMP_CONST(node->status, UPSTATUS) != 0 &&
             STRNCMP_CONST(node->status, ACTIVESTATUS) !=0) {
@@ -196,11 +198,17 @@
         
         node->track.last_rexmit_req = time_longclock();
         
-    if (!g_hash_table_remove(rexmit_hash_table, ri)){
-        cl_log(LOG_ERR, "%s: entry not found in rexmit_hash_table"
-               "for seq/node(%ld %s)",
-               __FUNCTION__, ri->seq, ri->node->nodename);
-        return FALSE;
+    value = g_hash_table_lookup(rexmit_hash_table, ri);
+    if ( value != NULL) {
+        sourceid = (unsigned long) value;
+        Gmain_timeout_remove(sourceid);
+
+        if (!g_hash_table_remove(rexmit_hash_table, ri)){
+            cl_log(LOG_ERR, "%s: entry not found in rexmit_hash_table"
+                 

Re: [Linux-ha-dev] [Patch] The patch which revises memory leak.

2012-04-30 Thread renayama19661014
Hi Lars,

We confirmed that this problem occurred with v1 mode of Heartbeat.
 * The problem happens with the v2 mode in the same way.

We reproduced the problem with the following procedure.

Step 1) Put a special device in the network that drops the communication 
packets of Heartbeat.

Step 2) The nodes then retransmit messages to each other repeatedly.

Step 3) The memory of the master process then grows little by little.


 Result of the ps command for the master process:
* node1
(start)
32126 ?        SLs    0:00      0   182 53989  7128  0.0 heartbeat: master control process
(One hour later)
32126 ?        SLs    0:03      0   182 54729  7868  0.0 heartbeat: master control process
(Two hours later)
32126 ?        SLs    0:08      0   182 55317  8456  0.0 heartbeat: master control process
(Four hours later)
32126 ?        SLs    0:24      0   182 56673  9812  0.0 heartbeat: master control process

* node2
(start)
31928 ?        SLs    0:00      0   182 53989  7128  0.0 heartbeat: master control process
(One hour later)
31928 ?        SLs    0:02      0   182 54481  7620  0.0 heartbeat: master control process
(Two hours later)
31928 ?        SLs    0:08      0   182 55353  8492  0.0 heartbeat: master control process
(Four hours later)
31928 ?        SLs    0:23      0   182 56689  9828  0.0 heartbeat: master control process
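
A measurement like the one above can be scripted. The following is a rough sketch, not the procedure that was actually used: sample_mem and watch_mem are hypothetical helpers, and the values are read from /proc/PID/status, which assumes Linux (the figures above came from ps).

```shell
#!/bin/sh
# Hypothetical helper: print "epoch_seconds vsz_kb rss_kb" for one process.
# Assumes a Linux /proc file system.
sample_mem() {
    status=/proc/$1/status
    [ -r "$status" ] || return 1
    vsz=$(awk '/^VmSize:/ {print $2}' "$status")
    rss=$(awk '/^VmRSS:/ {print $2}' "$status")
    echo "$(date +%s) $vsz $rss"
}

# Hypothetical helper: take COUNT samples of PID, INTERVAL seconds apart.
watch_mem() {
    pid=$1 count=$2 interval=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        sample_mem "$pid" || return 1
        i=$((i + 1))
        # No need to wait after the last sample.
        [ "$i" -lt "$count" ] && sleep "$interval"
    done
    return 0
}

# Demonstration on this shell itself. For the test described above one would
# pass the PID of the heartbeat master control process and a long interval,
# for example 3600 seconds.
watch_mem $$ 2 1
```

A steadily rising third column over many samples would correspond to the growth shown in the ps output.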


The amount of leaked memory seems to differ between nodes, depending on the 
number of retransmissions.

With my patch applied, the memory no longer grows.

A similar fix seems to be necessary in send_reqnodes_msg(), but that leak 
appears to be small.

Best Regards,
Hideo Yamauchi.


--- On Sat, 2012/4/28, renayama19661...@ybb.ne.jp renayama19661...@ybb.ne.jp 
wrote:

 Hi Lars,
 
 Thank you for comments.
 
  Have you actually been able to measure that memory leak you observed,
  and you can confirm this patch will fix it?
  
  Because I don't think this patch has any effect.
 
 Yes.
 I actually measured the leak.
 I can show the results next week.
 (Japan is on holiday until Tuesday.)
 
  
  send_rexmit_request() is only used as parameter to
  Gmain_timeout_add_full, and it returns FALSE always,
  which should cause the respective sourceid to be auto-removed.
 
 It seems that the GSource has to be released explicitly somehow.
 lrmd appears to release it in a similar way.
 
 Best Regards,
 Hideo Yamauchi.
 
 
 --- On Fri, 2012/4/27, Lars Ellenberg lars.ellenb...@linbit.com wrote:
 
  On Thu, Apr 26, 2012 at 10:56:30AM +0900, renayama19661...@ybb.ne.jp wrote:
   Hi All,
   
   We ran a test that assumes a remote cluster environment,
   and we tested packet loss.
   
   The retransmission timer of Heartbeat causes a memory leak.
   
   I am contributing a patch.
   Please review the contents of the patch
   and apply it to the Heartbeat repository.
  
  Have you actually been able to measure that memory leak you observed,
  and you can confirm this patch will fix it?
  
  Because I don't think this patch has any effect.
  
  send_rexmit_request() is only used as parameter to
  Gmain_timeout_add_full, and it returns FALSE always,
  which should cause the respective sourceid to be auto-removed.
  
  
   diff -r 106ca984041b heartbeat/hb_rexmit.c
   --- a/heartbeat/hb_rexmit.c    Thu Apr 26 19:28:26 2012 +0900
   +++ b/heartbeat/hb_rexmit.c    Thu Apr 26 19:31:44 2012 +0900
   @@ -164,6 +164,8 @@
        seqno_t seq = (seqno_t) ri->seq;
        struct node_info* node = ri->node;
        struct ha_msg*    hmsg;
   +    unsigned long           sourceid;
   +    gpointer value;
    
        if (STRNCMP_CONST(node->status, UPSTATUS) != 0 &&
            STRNCMP_CONST(node->status, ACTIVESTATUS) !=0) {
   @@ -196,11 +198,17 @@
        
        node->track.last_rexmit_req = time_longclock();
        
   -    if (!g_hash_table_remove(rexmit_hash_table, ri)){
   -        cl_log(LOG_ERR, "%s: entry not found in rexmit_hash_table"
   -               "for seq/node(%ld %s)",
   -               __FUNCTION__, ri->seq, ri->node->nodename);
   -        return FALSE;
   +    value = g_hash_table_lookup(rexmit_hash_table, ri);
   +    if ( value != NULL) {
   +        sourceid = (unsigned long) value;
   +        Gmain_timeout_remove(sourceid);
   +
   +        if (!g_hash_table_remove(rexmit_hash_table, ri)){
   +            cl_log(LOG_ERR, "%s: entry not found in rexmit_hash_table"
   +                   "for seq/node(%ld %s)",
   +                   __FUNCTION__, ri->seq, ri->node->nodename);
   +            return FALSE;
   +        }
        }
        
        schedule_rexmit_request(node, seq, max_rexmit_delay);
  
  
  -- 
  : Lars Ellenberg
  : LINBIT | Your Way to High Availability
  : DRBD/HA support and consulting http://www.linbit.com
  
  DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
  

Re: [Linux-ha-dev] [Patch] The patch which revises memory leak.

2012-04-27 Thread renayama19661014
Hi Lars,

Thank you for comments.

 Have you actually been able to measure that memory leak you observed,
 and you can confirm this patch will fix it?
 
 Because I don't think this patch has any effect.

Yes.
I actually measured the leak.
I can show the results next week.
(Japan is on holiday until Tuesday.)

 
 send_rexmit_request() is only used as parameter to
 Gmain_timeout_add_full, and it returns FALSE always,
 which should cause the respective sourceid to be auto-removed.

It seems that the GSource has to be released explicitly somehow.
lrmd appears to release it in a similar way.

Best Regards,
Hideo Yamauchi.


--- On Fri, 2012/4/27, Lars Ellenberg lars.ellenb...@linbit.com wrote:

 On Thu, Apr 26, 2012 at 10:56:30AM +0900, renayama19661...@ybb.ne.jp wrote:
  Hi All,
  
  We ran a test that assumes a remote cluster environment,
  and we tested packet loss.
  
  The retransmission timer of Heartbeat causes a memory leak.
  
  I am contributing a patch.
  Please review the contents of the patch
  and apply it to the Heartbeat repository.
 
 Have you actually been able to measure that memory leak you observed,
 and you can confirm this patch will fix it?
 
 Because I don't think this patch has any effect.
 
 send_rexmit_request() is only used as parameter to
 Gmain_timeout_add_full, and it returns FALSE always,
 which should cause the respective sourceid to be auto-removed.
 
 
  diff -r 106ca984041b heartbeat/hb_rexmit.c
  --- a/heartbeat/hb_rexmit.c    Thu Apr 26 19:28:26 2012 +0900
  +++ b/heartbeat/hb_rexmit.c    Thu Apr 26 19:31:44 2012 +0900
  @@ -164,6 +164,8 @@
       seqno_t seq = (seqno_t) ri->seq;
       struct node_info* node = ri->node;
       struct ha_msg*    hmsg;
  +    unsigned long           sourceid;
  +    gpointer value;
   
       if (STRNCMP_CONST(node->status, UPSTATUS) != 0 &&
           STRNCMP_CONST(node->status, ACTIVESTATUS) !=0) {
  @@ -196,11 +198,17 @@
       
       node->track.last_rexmit_req = time_longclock();
       
  -    if (!g_hash_table_remove(rexmit_hash_table, ri)){
  -        cl_log(LOG_ERR, "%s: entry not found in rexmit_hash_table"
  -               "for seq/node(%ld %s)",
  -               __FUNCTION__, ri->seq, ri->node->nodename);
  -        return FALSE;
  +    value = g_hash_table_lookup(rexmit_hash_table, ri);
  +    if ( value != NULL) {
  +        sourceid = (unsigned long) value;
  +        Gmain_timeout_remove(sourceid);
  +
  +        if (!g_hash_table_remove(rexmit_hash_table, ri)){
  +            cl_log(LOG_ERR, "%s: entry not found in rexmit_hash_table"
  +                   "for seq/node(%ld %s)",
  +                   __FUNCTION__, ri->seq, ri->node->nodename);
  +            return FALSE;
  +        }
       }
       
       schedule_rexmit_request(node, seq, max_rexmit_delay);
 
 
 -- 
 : Lars Ellenberg
 : LINBIT | Your Way to High Availability
 : DRBD/HA support and consulting http://www.linbit.com
 
 DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.


[Linux-ha-dev] [Patch] The patch which revises memory leak.

2012-04-25 Thread renayama19661014
Hi All,

We ran tests that assume a remote cluster environment,
and we tested packet loss.

The retransmission timer of Heartbeat causes a memory leak.

I am contributing a patch.
Please review the contents of the patch
and apply it to the Heartbeat repository.

Best Regards,
Hideo Yamauchi.

rexmit_leak.patch
Description: Binary data


Re: [Linux-ha-dev] LVM monitor change

2012-04-10 Thread renayama19661014
Hi Dejan,

Thank you for comments.

 That's not a good reason. Testing if binaries exist on every
 monitor operation really doesn't make much sense. Why would you
 expect programs to start disappearing? And if they do, we may
 have a much more serious problem to deal with.

All right.

We withdraw this patch.
Let me raise it again when we next review the RAs as a whole.
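
For reference, the approach suggested in this thread (validate before start, but not on every monitor) could look roughly like the sketch below. The function names are placeholders chosen for illustration and are not taken from the LVM RA; the counter exists only to make the behavior visible.

```shell
#!/bin/sh
# Hypothetical dispatcher sketch; names are illustrative, not the LVM RA's.
validate_calls=0

# Stand-in for the expensive checks (binaries present, parameters sane).
lvm_validate_all() { validate_calls=$((validate_calls + 1)); return 0; }
lvm_start()   { echo "start"; }
lvm_monitor() { echo "monitor"; }

run_action() {
    case "$1" in
        start)        lvm_validate_all && lvm_start ;;  # validate once, before start
        monitor)      lvm_monitor ;;                    # recurring monitor skips validation
        validate-all) lvm_validate_all ;;
        *)            return 3 ;;                       # OCF_ERR_UNIMPLEMENTED
    esac
}
```

Calling run_action start followed by any number of run_action monitor calls leaves validate_calls at 1, which is the point Dejan makes: the recurring monitor is only scheduled after a successful start.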

Many Thanks,
Hideo Yamauchi.


--- On Tue, 2012/4/10, Dejan Muhamedagic de...@suse.de wrote:

 Hi Hideo-san,
 
 On Tue, Apr 10, 2012 at 12:43:00PM +0900, renayama19661...@ybb.ne.jp wrote:
  Hi Dejan,
  
  Thank you for comments.
  
  
   Hi Hideo-san,
   
   On Mon, Apr 09, 2012 at 09:18:07AM +0900, renayama19661...@ybb.ne.jp 
   wrote:
Hi Dejan,

Thank you for comments.

  I change validate-all and want to change it to always carry out 
  validate-all.
  I abolish vgck/vgdisplay carried out in validate-all and intend to 
  make only the check of the parameter simply.
  
  How do you think?
 
 Isn't it that validate-all may be really necessary only in the
 start action? The repeating monitor is scheduled only after a
 successful start.

It may be surely necessary as you say.
However, I think validate-all to unify it so that it is always carried 
out.
   
   But why?
  
  There is the resource to carry out validate-all every time a lot.
  We wish it becomes LVM in the same way.
 
 That's not a good reason. Testing if binaries exist on every
 monitor operation really doesn't make much sense. Why would you
 expect programs to start disappearing? And if they do, we may
 have a much more serious problem to deal with.
 
 Cheers,
 
 Dejan
 
How about what the check of vgck/vgdisplay chooses it in a parameter 
and can carry out?
   
   Again, why? It doesn't make any difference for a running
   resource? We may do this before the start operation, of course.
  
  My correction is different from original LVM in big validate-all.
  
  There were many mistakes to my patch.
  And I think about a patch again and send it.
  
  Best Regards,
  Hideo Yamauchi.
  
   
   Cheers,
   
   Dejan
   

Best Regards,
Hideo Yamauchi.

--- On Fri, 2012/4/6, Dejan Muhamedagic de...@suse.de wrote:

 Hi Hideo-san,
 
 On Fri, Apr 06, 2012 at 10:50:39AM +0900, renayama19661...@ybb.ne.jp 
 wrote:
  Hi Dejan,
  
  I change validate-all and want to change it to always carry out 
  validate-all.
  I abolish vgck/vgdisplay carried out in validate-all and intend to 
  make only the check of the parameter simply.
  
  How do you think?
 
 Isn't it that validate-all may be really necessary only in the
 start action? The repeating monitor is scheduled only after a
 successful start.
 
 Cheers,
 
 Dejan
 
  Best Regards,
  Hideo Yamauchi.
  
  --- On Fri, 2012/4/6, Dejan Muhamedagic de...@suse.de wrote:
  
   Hi Hideo-san,
   
   On Thu, Apr 05, 2012 at 11:32:05AM +0900, 
   renayama19661...@ybb.ne.jp wrote:
Hi Dejan,

I agree to your patch.
   
   Thank you for the reply.
   
   BTW, the monitor was shamelessly stolen from Vladislav.
   
   Applied.
   
   ocft test passed (after some struggle and eventually fixing the
   ocft source).
   
   Cheers,
   
   Dejan
   
Best Regards,
Hideo Yamauchi.

--- On Thu, 2012/4/5, Dejan Muhamedagic de...@suse.de wrote:

 Hi all,
 
 This is a proposed set of two patches which would eliminate 
 use
 of LVM commands in the monitor path. We already discussed the
 issue elsewhere and I don't see any point in keeping
 vgck/vgdisplay given that they don't result in better 
 monitoring
 under normal circumstances. And if the circumstances are such
 that the new monitoring fails, I think that there'll be many
 more problems on the node than a failed volume group.
 
 Cheers,
 
 Dejan
 

Re: [Linux-ha-dev] LVM monitor change

2012-04-09 Thread renayama19661014
Hi Dejan,

Thank you for comments.


 Hi Hideo-san,
 
 On Mon, Apr 09, 2012 at 09:18:07AM +0900, renayama19661...@ybb.ne.jp wrote:
  Hi Dejan,
  
  Thank you for comments.
  
I change validate-all and want to change it to always carry out 
validate-all.
I abolish vgck/vgdisplay carried out in validate-all and intend to make 
only the check of the parameter simply.

How do you think?
   
   Isn't it that validate-all may be really necessary only in the
   start action? The repeating monitor is scheduled only after a
   successful start.
  
  It may be surely necessary as you say.
  However, I think validate-all to unify it so that it is always carried out.
 
 But why?

Many resource agents carry out validate-all every time.
We would like LVM to behave in the same way.

 
  How about what the check of vgck/vgdisplay chooses it in a parameter and 
  can carry out?
 
 Again, why? It doesn't make any difference for a running
 resource? We may do this before the start operation, of course.

My change to validate-all differs considerably from the original LVM RA.

There were many mistakes in my patch.
I will rework the patch and send it again.

Best Regards,
Hideo Yamauchi.

 
 Cheers,
 
 Dejan
 
  
  Best Regards,
  Hideo Yamauchi.
  
  --- On Fri, 2012/4/6, Dejan Muhamedagic de...@suse.de wrote:
  
   Hi Hideo-san,
   
   On Fri, Apr 06, 2012 at 10:50:39AM +0900, renayama19661...@ybb.ne.jp 
   wrote:
Hi Dejan,

I change validate-all and want to change it to always carry out 
validate-all.
I abolish vgck/vgdisplay carried out in validate-all and intend to make 
only the check of the parameter simply.

How do you think?
   
   Isn't it that validate-all may be really necessary only in the
   start action? The repeating monitor is scheduled only after a
   successful start.
   
   Cheers,
   
   Dejan
   
Best Regards,
Hideo Yamauchi.

--- On Fri, 2012/4/6, Dejan Muhamedagic de...@suse.de wrote:

 Hi Hideo-san,
 
 On Thu, Apr 05, 2012 at 11:32:05AM +0900, renayama19661...@ybb.ne.jp 
 wrote:
  Hi Dejan,
  
  I agree to your patch.
 
 Thank you for the reply.
 
 BTW, the monitor was shamelessly stolen from Vladislav.
 
 Applied.
 
 ocft test passed (after some struggle and eventually fixing the
 ocft source).
 
 Cheers,
 
 Dejan
 
  Best Regards,
  Hideo Yamauchi.
  
  --- On Thu, 2012/4/5, Dejan Muhamedagic de...@suse.de wrote:
  
   Hi all,
   
   This is a proposed set of two patches which would eliminate use
   of LVM commands in the monitor path. We already discussed the
   issue elsewhere and I don't see any point in keeping
   vgck/vgdisplay given that they don't result in better monitoring
   under normal circumstances. And if the circumstances are such
   that the new monitoring fails, I think that there'll be many
   more problems on the node than a failed volume group.
   
   Cheers,
   
   Dejan
   


Re: [Linux-ha-dev] LVM monitor change

2012-04-08 Thread renayama19661014
Hi Dejan,

Thank you for comments.

  I change validate-all and want to change it to always carry out 
  validate-all.
  I abolish vgck/vgdisplay carried out in validate-all and intend to make 
  only the check of the parameter simply.
  
  How do you think?
 
 Isn't it that validate-all may be really necessary only in the
 start action? The repeating monitor is scheduled only after a
 successful start.

That may well be true, as you say.
However, I would like to unify the behavior so that validate-all is always carried out.

How about making the vgck/vgdisplay check optional, selectable through a 
parameter?


Best Regards,
Hideo Yamauchi.

--- On Fri, 2012/4/6, Dejan Muhamedagic de...@suse.de wrote:

 Hi Hideo-san,
 
 On Fri, Apr 06, 2012 at 10:50:39AM +0900, renayama19661...@ybb.ne.jp wrote:
  Hi Dejan,
  
  I change validate-all and want to change it to always carry out 
  validate-all.
  I abolish vgck/vgdisplay carried out in validate-all and intend to make 
  only the check of the parameter simply.
  
  How do you think?
 
 Isn't it that validate-all may be really necessary only in the
 start action? The repeating monitor is scheduled only after a
 successful start.
 
 Cheers,
 
 Dejan
 
  Best Regards,
  Hideo Yamauchi.
  
  --- On Fri, 2012/4/6, Dejan Muhamedagic de...@suse.de wrote:
  
   Hi Hideo-san,
   
   On Thu, Apr 05, 2012 at 11:32:05AM +0900, renayama19661...@ybb.ne.jp 
   wrote:
Hi Dejan,

I agree to your patch.
   
   Thank you for the reply.
   
   BTW, the monitor was shamelessly stolen from Vladislav.
   
   Applied.
   
   ocft test passed (after some struggle and eventually fixing the
   ocft source).
   
   Cheers,
   
   Dejan
   
Best Regards,
Hideo Yamauchi.

--- On Thu, 2012/4/5, Dejan Muhamedagic de...@suse.de wrote:

 Hi all,
 
 This is a proposed set of two patches which would eliminate use
 of LVM commands in the monitor path. We already discussed the
 issue elsewhere and I don't see any point in keeping
 vgck/vgdisplay given that they don't result in better monitoring
 under normal circumstances. And if the circumstances are such
 that the new monitoring fails, I think that there'll be many
 more problems on the node than a failed volume group.
 
 Cheers,
 
 Dejan
 


Re: [Linux-ha-dev] LVM monitor change

2012-04-05 Thread renayama19661014
Hi Dejan,

I would like to change the RA so that validate-all is always carried out.
I intend to remove the vgck/vgdisplay calls from validate-all so that it only 
checks the parameters.

What do you think?

Best Regards,
Hideo Yamauchi.

--- On Fri, 2012/4/6, Dejan Muhamedagic de...@suse.de wrote:

 Hi Hideo-san,
 
 On Thu, Apr 05, 2012 at 11:32:05AM +0900, renayama19661...@ybb.ne.jp wrote:
  Hi Dejan,
  
  I agree to your patch.
 
 Thank you for the reply.
 
 BTW, the monitor was shamelessly stolen from Vladislav.
 
 Applied.
 
 ocft test passed (after some struggle and eventually fixing the
 ocft source).
 
 Cheers,
 
 Dejan
 
  Best Regards,
  Hideo Yamauchi.
  
  --- On Thu, 2012/4/5, Dejan Muhamedagic de...@suse.de wrote:
  
   Hi all,
   
   This is a proposed set of two patches which would eliminate use
   of LVM commands in the monitor path. We already discussed the
   issue elsewhere and I don't see any point in keeping
   vgck/vgdisplay given that they don't result in better monitoring
   under normal circumstances. And if the circumstances are such
   that the new monitoring fails, I think that there'll be many
   more problems on the node than a failed volume group.
   
   Cheers,
   
   Dejan
   


Re: [Linux-ha-dev] LVM monitor change

2012-04-04 Thread renayama19661014
Hi Dejan,

I agree with your patch.

Best Regards,
Hideo Yamauchi.

--- On Thu, 2012/4/5, Dejan Muhamedagic de...@suse.de wrote:

 Hi all,
 
 This is a proposed set of two patches which would eliminate use
 of LVM commands in the monitor path. We already discussed the
 issue elsewhere and I don't see any point in keeping
 vgck/vgdisplay given that they don't result in better monitoring
 under normal circumstances. And if the circumstances are such
 that the new monitoring fails, I think that there'll be many
 more problems on the node than a failed volume group.
 
 Cheers,
 
 Dejan
 


[Linux-ha-dev] [Patch] Patch for external/vcenter.

2012-03-14 Thread renayama19661014
Hi All,

We used external/vcenter in a vSphere 5 environment.
We tried external/vcenter against both the vCenter server and the ESXi servers.

We found some problems.

Problem 1) external/vcenter does not support adding VMs later.
  external/vcenter fails to start when HOSTLIST contains a VM that has not been 
created yet.

Problem 2) Because of the problem above, start also fails when I add a STONITH 
resource that goes through an ESXi server to cover an outage of the vCenter 
server.
 With the current external/vcenter, a STONITH resource configured for an ESXi 
server fails to start when HOSTLIST names a VM that does not exist on that server.
 However, a VM may move between ESXi servers through vMotion and DRS.
 When the vCenter server is down, STONITH must be possible from whichever ESXi 
server the VM has moved to.

 -

To cover a failure of the vCenter server, we added STONITH resources for the 
ESXi servers.

(server)
vCenter (192.168.133.40)
db1 on ESXi server 1(192.168.133.1)
db2 on ESXi server 2(192.168.133.2)

(snip)
### Group Configuration ###
group grpStonith1 \
prmStonith1-1 \ <--- for vCenter
prmStonith1-2 \ <--- for ESXi server 1
prmStonith1-3 \ <--- for ESXi server 2
(snip)
primitive prmStonith1-2 stonith:external/vcenter \
params \
priority=3 \
stonith-timeout=60s \
VI_SERVER=192.168.133.1 \
VI_CREDSTORE=/etc/vicredentials.xml \
HOSTLIST=db1;db2 \ <--- Because HOSTLIST includes a VM that does not exist 
on this ESXi server, external/vcenter fails to start.
RESETPOWERON=0 \
op start interval=0s timeout=60s \
op monitor interval=3600s timeout=60s \
op stop interval=0s timeout=60s

primitive prmStonith1-3 stonith:external/vcenter \
params \
priority=4 \
stonith-timeout=60s \
VI_SERVER=192.168.133.2 \
VI_CREDSTORE=/etc/vicredentials.xml \
HOSTLIST=db1;db2 \ <--- Because HOSTLIST includes a VM that does not exist 
on this ESXi server, external/vcenter fails to start.
RESETPOWERON=0 \
op start interval=0s timeout=60s \
op monitor interval=3600s timeout=60s \
op stop interval=0s timeout=60s
 --

I think the check in the gethosts processing is unnecessary.
It blocks start processing.

I think it is enough for external/vcenter to check the VM (HOSTLIST) when 
STONITH is actually performed.

I made a sample patch.
Like other STONITH modules, this patch simply returns HOSTLIST.

Please consider applying this patch.

Best Regards,
Hideo Yamauchi.

vcenter.patch
Description: Binary data


Re: [Linux-ha-dev] New RA: IPredirect

2012-02-01 Thread renayama19661014
Hi David,

Here are my requests.

1) Please implement parameter checks in ipredirect_validate. (For example, 
check the format of the addresses and that the ports are numeric.)
2) Please actually call ipredirect_validate. (I think it should be called for 
every action other than meta-data.)
3) Please handle the exit code returned by iptables.
4) Please log a message when an error occurs.
5) In IPredirect, the script should check that the iptables command is usable. 
(check_binary $IPTABLES)
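
As an illustration of request 1), the parameter checks could be sketched as below. This is only a sketch under my own assumptions: the helper names is_ipv4 and is_port are hypothetical and do not appear in the posted RA, and only dotted-quad IPv4 addresses are considered.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of the posted IPredirect RA.

# Succeeds if $1 is a dotted-quad IPv4 address with every octet in 0..255.
is_ipv4() {
    case "$1" in
        ""|*[!0-9.]*|.*|*.|*..*) return 1 ;;   # empty, stray chars, empty octets
    esac
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address into its octets
    IFS=$old_ifs
    [ $# -eq 4 ] || return 1
    for octet in "$@"; do
        [ "${#octet}" -le 3 ] && [ "$octet" -le 255 ] || return 1
    done
    return 0
}

# Succeeds if $1 is a decimal port number in 1..65535.
is_port() {
    case "$1" in
        ""|*[!0-9]*) return 1 ;;
    esac
    [ "${#1}" -le 5 ] && [ "$1" -ge 1 ] && [ "$1" -le 65535 ]
}
```

Inside ipredirect_validate, a failed check would typically log with ocf_log err and exit with $OCF_ERR_CONFIGURED, for example: is_port "$OCF_RESKEY_external_port" || exit $OCF_ERR_CONFIGURED.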

Best Regards,
Hideo Yamauchi.

--- On Wed, 2012/2/1, David Gersic dger...@niu.edu wrote:

 Somewhat based on Dummy, and somewhat based on IPaddr2, here's an RA I put 
 together to do port redirection via iptables.
 
 I have an application (Shibboleth Identity Provider) that runs under Tomcat. 
 Because Tomcat runs as a non-root user, the application server can only 
 listen on ports over 1024. But this particular app must be on ports 80 and 
 443. The only way to do that is to use iptables and redirect traffic to the 
 external ip address to an internal (10.0.0.1) address, changing the port used 
 along the way. In order to manage this from Linux/HA, I needed a way to add 
 and remove the necessary iptables rules as part of my resource group.
 
 Setting up the resource group, I have this in it:
 
          <primitive class="ocf" type="IPredirect" provider="heartbeat" 
 is_managed="true" id="IPR_8_2">
            <instance_attributes id="IPR_8_2_instance_attrs">
              <attributes>
                <nvpair name="interface" value="eth3"/>
                <nvpair name="external_ip" value="131.156.21.44"/>
                <nvpair name="external_port" value="443"/>
                <nvpair name="internal_ip" value="10.0.0.1"/>
                <nvpair name="internal_port" value="8443"/>
              </attributes>
            </instance_attributes>
            <operations>
              <op name="monitor" interval="10" timeout="10" start_delay="10"/>
              <op name="start" timeout="10"/>
              <op name="stop" timeout="10"/>
            </operations>
          </primitive>
 
 to redirect external port 443 traffic to internal port 8443 where the 
 application is actually listening. I'm using two IPaddr2 primitives to bind 
 the external (131.156.21.44) and internal (10.0.0.1) to eth3. This group will 
 have Filesystem and Tomcat primitives as well, to manage the shared storage 
 and application server.
 
 Tested here and seems to work. Comments or changes appreciated.
 
 
 #!/bin/sh
 #
 # Description:  Manages iptables port redirection firewall rules
 #               needed for a resource group under Heartbeat/LinuxHA
 #               control.
 #
 # Copyright 2012 Northern Illinois University, David Gersic
 #                    All Rights Reserved.
 #
 # This program is free software; you can redistribute it and/or
 # modify it under the terms of the GNU General Public License
 # as published by the Free Software Foundation; either version 2
 # of the License, or (at your option) any later version.
 # 
 # This program is distributed in the hope that it will be useful,
 # but WITHOUT ANY WARRANTY; without even the implied warranty of
 # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 # GNU General Public License for more details.
 # 
 # You should have received a copy of the GNU General Public License
 # along with this program; if not, write to the Free Software
 # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  
 # 02110-1301, USA.
 #
 #
 # OCF parameters:
 #   OCF_RESKEY_interface - Which interface to apply the rules to (ie: eth0,
 #                          eth1, etc.)
 #   OCF_RESKEY_external_ip - External IP address to redirect from
 #   OCF_RESKEY_external_port - External IP port to redirect from
 #   OCF_RESKEY_internal_ip - Internal IP address to redirect to
 #   OCF_RESKEY_internal_port - Internal IP port to redirect to
 #
 
 ###
 # Initialization:
 . ${OCF_ROOT}/resource.d/heartbeat/.ocf-shellfuncs
 ###
 
 meta_data() {
     cat <<END
 <?xml version="1.0"?>
 <!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
 <resource-agent name="IPredirect" version="0.9">
 <version>1.0</version>
 
 <longdesc lang="en">
 This resource agent enables port redirection from an external IP address to 
 an internal
 IP address. This is useful for applications that must be reachable on a port 
 below 1024,
 but that must also run as non-root.
 </longdesc>
 <shortdesc lang="en">IPredirect resource agent</shortdesc>
 
 <parameters>
 <parameter name="interface" unique="1">
 <longdesc lang="en">
 Which interface to apply the rules to (ie: eth0, eth1, etc.)
 </longdesc>
 <shortdesc lang="en">Network interface</shortdesc>
 

Re: [Linux-ha-dev] [Patch] Patch for IPsrcaddr.(2/2)

2012-01-28 Thread renayama19661014
Hi Dejan,

Thank you for comments.

 OK. Applied that too. The ocft test passes, but cannot work
 without specifying the existing address. I'm not sure, but I
 think that ocft cannot ask for user input, so the test is going
 to be semi-automatic.

All right!
I confirmed the following commits.
 * 
https://github.com/ClusterLabs/resource-agents/commit/9cd054d15112bd7053763c7655059a07e07f4e69
 * 
https://github.com/ClusterLabs/resource-agents/commit/7bfd0597a1d2efcd4cd2f579675510cff725ec17

Many thanks!!
Hideo Yamauchi.

--- On Sat, 2012/1/28, Dejan Muhamedagic de...@suse.de wrote:

 Hi Hideo-san,
 
 On Wed, Jan 25, 2012 at 10:09:26AM +0900, renayama19661...@ybb.ne.jp wrote:
  Hi Dejan,
  
  Thank you for comments.
  
   Now the ocft test fails:
   
   2012/01/23_21:39:40 ERROR: IP address [127.0.0.3] is a loopback
   address, thus can not be preferred source address
   
   Any idea how to update the ocft test case?
  
  I will look into this problem, too.
 
 I carried out ocf-tester with three cases.
 
 Case 1) I ran it after adding an address with the ifconfig command.
 
 [root@rh57-3 ClusterLabs-resource-agents-7edbe1d]# ifconfig eth0:1 
 192.168.40.7 up
 [root@rh57-3 ClusterLabs-resource-agents-7edbe1d]# ocf-tester -v -n 
 IPsrcaddr -o ipaddress=192.168.40.7 
 /usr/lib/ocf/resource.d/heartbeat/IPsrcaddr
 Beginning tests for /usr/lib/ocf/resource.d/heartbeat/IPsrcaddr...
 Testing permissions with uid nobody
 Testing: meta-data
[...] [Note to myself: drop the meta-data output]
 ERROR: Setup problem: couldn't find command: gawk

Install gawk perhaps?
  
  It is strange... gawk was already installed, but this error was still 
  reported.
 
 Sorry, it was my mistake. ocf-tester does this on purpose.
 
  The following environment variable (OCF_TESTER_FAIL_HAVE_BINARY) set by 
  ocf-tester seems to be what causes it.
  
  (snip)
  OCF_TESTER_FAIL_HAVE_BINARY=1
  export OCF_TESTER_FAIL_HAVE_BINARY
  OCF_RESKEY_CRM_meta_interval=0
  test_command monitor
  (snip)
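
To illustrate why an installed binary can still be reported as missing, here is a simplified sketch of a lookup helper that honors this variable. It is an assumption made for illustration, not a copy of the real helper in the shared OCF shell functions; the name have_binary_sketch is hypothetical.

```shell
#!/bin/sh
# Simplified sketch; the name and body are assumptions, not the shipped code.
have_binary_sketch() {
    # When the tester sets the variable, pretend every binary is missing
    # so that the RA's error path gets exercised.
    if [ "${OCF_TESTER_FAIL_HAVE_BINARY:-0}" = "1" ]; then
        return 1
    fi
    command -v "$1" >/dev/null 2>&1
}
```

With the variable unset, have_binary_sketch sh succeeds; with OCF_TESTER_FAIL_HAVE_BINARY=1 exported, the same call fails even though sh exists, which matches the "couldn't find command" message seen during the forced-failure monitor test.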
  
  Similar error occurs in IPaddr2.
  
  [root@rh57-3 heartbeat]# ocf-tester -v -n IPaddr2 -o ip=192.168.40.8 
  /usr/lib/ocf/resource.d/heartbeat/IPaddr2
  Beginning tests for /usr/lib/ocf/resource.d/heartbeat/IPaddr2...
  Testing permissions with uid nobody
  (snip)
  Checking current state
  Testing: monitor
  Testing: monitor
  ERROR: Setup problem: couldn't find command: ip
  Testing: start
  (snip)
  
  Doesn't ocf-tester need to be corrected?
  

[...]
 INFO: The ip route has been already set.(192.168.40.0/24, eth0, 
 default via 192.168.40.1 dev eth0 )

Hmm, I saw different stuff:

ERROR: command 'ip route replace 10.2.13.0/24 169.254.0.0/16 dev eth0 
src 10.2.13.154' failed

Debugging:

+ ip route replace 10.2.13.0/24 169.254.0.0/16 dev eth0 src 10.2.13.154
Error: either to is duplicate, or 169.254.0.0/16 is a garbage.

The route list:

xen-d:~ # ip route list
default via 10.2.13.1 dev eth0 
10.2.13.0/24 dev eth0  proto kernel  scope link  src 10.2.13.54 
127.0.0.0/8 dev lo  scope link 
169.254.0.0/16 dev eth0  scope link 

It seems like the last entry confuses the new calculation code.
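
The effect can be reproduced without touching the routing table. The sketch below feeds the two link-scope eth0 entries from the route list above (adapted as sample data; no ip command is run) through the same grep that the RA uses to compute NETWORK. The function name route_list_eth0_link is a placeholder for illustration.

```shell
#!/bin/sh
# Sample data adapted from the route list in this thread; `ip` is not called.
route_list_eth0_link() {
    cat <<'EOF'
10.2.13.0/24 dev eth0  proto kernel  scope link  src 10.2.13.54
169.254.0.0/16 dev eth0  scope link
EOF
}

# Same extraction as in the RA: keep the first field of every line.
NETWORK=`route_list_eth0_link | grep -o '^[^ ]*'`

# Unquoted expansion yields one word per matching route.
set -- $NETWORK
word_count=$#
echo "networks found: $word_count"
echo "command would be: ip route replace $* dev eth0 src 10.2.13.154"
```

With two link-scope routes on the interface, NETWORK holds two prefixes, so the resulting command line carries two destinations. That matches the failing command shown above.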
  
  In my environment, I have set NOZEROCONF=yes.
  Therefore, the last entry does not exist.
 
 Right. But it's still better that the RA can handle this
 situation too.
 
   It turns out that the problem is here (nothing to do with your
   patch):
   
   NETWORK=`ip route list dev $INTERFACE scope link|grep -o '^[^ ]*'`
   
   Perhaps we should do:
   
   NETWORK=`ip route list dev $INTERFACE match $ipaddress scope link|grep -o 
   '^[^ ]*'`
   
   Opinions?
  
  I think that the method you showed is more correct.
 
 OK. Applied that too. The ocft test passes, but cannot work
 without specifying the existing address. I'm not sure, but I
 think that ocft cannot ask for user input, so the test is going
 to be semi-automatic.
 
 Cheers,
 
 Dejan
 
  Best Regards,
  Hideo Yamauchi.
  


Re: [Linux-ha-dev] [Patch] Patch for IPsrcaddr.(1/2)

2012-01-15 Thread renayama19661014
Hi Dejan,

Thank you for comments.

  This patch makes the following changes.
  
   * When the route has already been assigned, the RA skips the assignment. 
 
 Is this just a performance improvement? Or did you see anything
 wrong happen when running the current code?

No problem has actually occurred.
We found this redundant work during a review.
I think that it may affect performance.

 
   * Added error log of FINDIF.
   * Deleted the unused sentence.
 
 It would be good to have at least two patches, because we should
 always try to have patches with single self-contained
 modification.

Sorry.
Because the patch was small, I did not split it into separate patches.


Best Regards,
Hideo Yamauchi.

--- On Sat, 2012/1/14, Dejan Muhamedagic de...@suse.de wrote:

 Hi Hideo-san,
 
 Sorry for picking up this so late.
 
 On Tue, Nov 29, 2011 at 02:48:52PM +0900, renayama19661...@ybb.ne.jp wrote:
  Hi All,
  
  We made a patch to IPsrcaddr.
  
  This patch revises the next point.
  
   * When route has been already assigned, RA skips an allotment. 
 
 Is this just a performance improvement? Or did you see anything
 wrong happen when running the current code?
 
   * Added error log of FINDIF.
   * Deleted the unused sentence.
 
 It would be good to have at least two patches, because we should
 always try to have patches with single self-contained
 modification.
 
 Cheers,
 
 Dejan
 
  Please review my change 
  and commit it. 
  
  Best Regards,
  Hideo Yamauchi
 
  diff -r 2107bc4f5c8b heartbeat/IPsrcaddr
  --- a/heartbeat/IPsrcaddr    Thu Nov 24 14:13:11 2011 +0900
  +++ b/heartbeat/IPsrcaddr    Thu Nov 24 14:13:53 2011 +0900
  @@ -167,13 +167,20 @@
   srca_start() {
       srca_read $1
   
  -    ip route replace $NETWORK dev $INTERFACE src $1 || \
  -        errorexit command 'ip route replace $NETWORK dev $INTERFACE src 
  $1' failed
  +    rc=$?
  +    if [ $rc = 0 ]; then 
  +        rc=$OCF_SUCCESS
  +        ocf_log info The ip route has been already set.($NETWORK, 
  $INTERFACE, $ROUTE_WO_SRC)
  +    else
  +        ip route replace $NETWORK dev $INTERFACE src $1 || \
  +            errorexit command 'ip route replace $NETWORK dev $INTERFACE 
  src $1' failed
   
  -    $CMDCHANGE $ROUTE_WO_SRC src $1 || \
  -        errorexit command '$CMDCHANGE $ROUTE_WO_SRC src $1' failed
  +        $CMDCHANGE $ROUTE_WO_SRC src $1 || \
  +            errorexit command '$CMDCHANGE $ROUTE_WO_SRC src $1' failed
  +        rc=$?
  +    fi
   
  -    return $?
  +    return $rc
   }
   
   #
  @@ -252,7 +259,6 @@
       else
           true
       fi       
  -#    return $OCF_SUCCESS
       ;;
       *) #less than three decimal dots
       false;;
  @@ -377,7 +383,6 @@
         
         Linux|SunOS)        
         IF=`find_interface $BASEIP`
  -#      echo $IF
         if [ -z $IF ]; then
             return $OCF_NOT_RUNNING
         fi
  @@ -455,7 +460,11 @@
   
   findif_out=`$FINDIF -C`
   rc=$?
  -[ $rc -ne 0 ] && exit $rc
  +[ $rc -ne 0 ] && {
  +    ocf_log err "[$FINDIF -C] failed"
  +    exit $rc
  +}
  +
   INTERFACE=`echo $findif_out | awk '{print $1}'`
   NETWORK=`ip route list dev $INTERFACE scope link|grep -o '^[^ ]*'`
   
 


Re: [Linux-ha-dev] [Patch] Patch for IPsrcaddr.(2/2)

2012-01-15 Thread renayama19661014
Hi Dejan,

Thank you for comments.

 On Tue, Nov 29, 2011 at 02:49:24PM +0900, renayama19661...@ybb.ne.jp wrote:
  Hi All,
  
  We made a patch to IPsrcaddr.
  
  This patch revises the next point.
  
   * Made modifications to carry out validate_all processing.
 
 I'm not necessarily against it, but I wonder why. This would make
 monitor validate the environment every time. Is that really
 necessary? What was your motivation for this change?

I think validate-all should be handled in the same way as in other RAs.
That is why we suggested this change.
 * Not all RAs are the same, but a similar structure improves readability and 
maintainability.
 
   * Deleted the unused, undefined IPROUTE variable
 
 OK.
 
   * Revised find_interface_generic so that it searches with the ip 
 command.
 
 Good.
 
  However, we cannot test on environments other than Linux.
  Therefore, we limited this processing to 
  Linux.
 
 That's fine too.

Many Thanks!
Hideo Yamauchi.


 
 Cheers,
 
 Dejan
 
  (snip)
  @@ -458,6 +440,10 @@
   
   ipaddress=$OCF_RESKEY_ipaddress
   
  +if [ x$SYSTYPE = xLinux ]; then
  +    srca_validate_all
  +fi
  +
  (snip)
  
  
  Please review my change 
  and commit it. 
  
  
  
  Best Regards,
  Hideo Yamauchi
 
  diff -r e4d9d86a9577 IPsrcaddr
  --- a/IPsrcaddr    Mon Nov 28 20:02:26 2011 +0900
  +++ b/IPsrcaddr    Mon Nov 28 20:03:07 2011 +0900
  @@ -307,35 +307,14 @@
   #
   find_interface_generic() {
   
  -  $IFCONFIG $IFCONFIG_A_OPT  |
  -  while read ifname linkstuff
  -  do
  -    : Read gave us ifname = $ifname
  -
  -    read inet addr junk
  -    : Read gave us inet = $inet addr = $addr
  -
  -    while
  -      read line && [ "X$line" != "X" ]
  -    do
  -      : Nothing
  -    done
  -
  -    case $SYSTYPE in
  -      *BSD)
  -        $IFCONFIG | grep $BASEIP -B`$IFCONFIG | grep -c inet` | grep 
  UP, | cut -d : -f 1
  -        return 0;;
  -      *)
  -            : comparing $BASEIP to $addr (from ifconfig)
  -        case $addr in
  -          addr:$BASEIP)    echo $ifname; return $OCF_SUCCESS;;
  -          $BASEIP)    echo $ifname; return $OCF_SUCCESS;;
  -            esac
  -        continue;;
  -    esac
  -
  -  done
  -  return $OCF_ERR_GENERIC 
  +    local iface=`$IP2UTIL -o -f inet addr show | grep "\ $BASEIP" \
  +            | cut -d ' ' -f2 | grep -v '^ipsec[0-9][0-9]*$'`
  +        if [ -z $iface ]; then
  +            return $OCF_ERR_GENERIC
  +        else 
  +            echo $iface
  +            return $OCF_SUCCESS
  +        fi
   }
   
   
  @@ -409,7 +388,6 @@
   srca_validate_all() {
   
       check_binary $AWK
  -    check_binary $IPROUTE
       check_binary $IFCONFIG
   
   #    The IP address should be in good shape
  @@ -420,6 +398,10 @@
         exit $OCF_ERR_CONFIGURED
       fi
   
  +    if ocf_is_probe; then
  +      return $OCF_SUCCESS
  +    fi
  +
   #    We should serve this IP address of course
       if ip_status $ipaddress; then
         :
  @@ -458,6 +440,10 @@
   
   ipaddress=$OCF_RESKEY_ipaddress
   
  +if [ x$SYSTYPE = xLinux ]; then
  +    srca_validate_all
  +fi
  +
   findif_out=`$FINDIF -C`
   rc=$?
   [ $rc -ne 0 ] && {
 


[Linux-ha-dev] [Patch] OCF_RESKEY_CRM_meta_timeout not matching monitor timeout meta-data.

2011-12-15 Thread renayama19661014
Hi All,

I made a patch that fixes the following old problem.

 * http://www.gossamer-threads.com/lists/linuxha/users/70262

To limit the impact when a parameter is changed, I replace only the timeout 
value.

Please review my patch and commit it.

Best Regards,
Hideo Yamauchi.

trac1467.patch
Description: Binary data


Re: [Linux-ha-dev] [Patch] OCF_RESKEY_CRM_meta_timeout not matching monitor timeout meta-data.

2011-12-15 Thread renayama19661014
Hi Dejan,

Thank you for your comment.

 It looks like a wrong place for a fix. Shouldn't crmd send all
 environment? It is only by chance that we have the timeout value
 available in this function.

In the case of stop, crmd does not ask lrmd to replace the parameters.

(snip)
        /* reset the resource's parameters? */
        if(op->interval == 0) {
            if(safe_str_eq(CRMD_ACTION_START, operation)
               || safe_str_eq(CRMD_ACTION_STATUS, operation)) {
                op->copyparams = 1;
            }
        }
(snip)

I think this is because, if the changed parameters were copied, they would 
affect how lrmd stops the resource.
The changed parameters must not be copied for stop.

My patch is an example of handling this in lrmd.

Is there a better fix?
* For example, it may be good to give copyparams a different value.

Best Regards,
Hideo Yamauchi.
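
As background for the variable named in the subject line: a resource agent sees the operation timeout through OCF_RESKEY_CRM_meta_timeout, which Pacemaker passes in milliseconds. The following is only a minimal sketch of reading it; the helper name op_timeout_seconds and the 20000 ms fallback are assumptions for illustration and are not part of the patch.

```shell
#!/bin/sh
# Hypothetical helper: report the operation timeout in whole seconds.
# Assumes OCF_RESKEY_CRM_meta_timeout holds milliseconds and falls back
# to an assumed default of 20000 ms when the variable is unset.
op_timeout_seconds() {
    timeout_ms=${OCF_RESKEY_CRM_meta_timeout:-20000}
    echo $((timeout_ms / 1000))
}
```

An agent could call op_timeout_seconds before a long-running step to decide how long it may wait.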


--- On Thu, 2011/12/15, Dejan Muhamedagic de...@suse.de wrote:

 Hi Hideo-san,
 
 On Thu, Dec 15, 2011 at 06:21:00PM +0900, renayama19661...@ybb.ne.jp wrote:
  Hi All,
  
  I made the patch which revised the old next problem.
  
   * http://www.gossamer-threads.com/lists/linuxha/users/70262
  
  In consideration of influence when a parameter was changed, I replace only 
  a value of timeout.
  
  Please confirm my patch. 
  And please commit a patch. 
 
 It looks like a wrong place for a fix. Shouldn't crmd send all
 environment? It is only by chance that we have the timeout value
 available in this function.
 
 Cheers,
 
 Dejan
 
  Best Regards,
  Hideo Yamauchi.
 
 


Re: [Linux-ha-dev] [Patch] OCF_RESKEY_CRM_meta_timeout not matching monitor timeout meta-data.

2011-12-15 Thread renayama19661014
Hi Andrew,

Thank you for your comment.

 When stopping, you always want to use the old parameters (think of
 someone changing 'ip' for an IPaddr resource).
 Options that are interpreted by the crmd or lrmd are a different
 matter which resulted in:
 
 https://github.com/ClusterLabs/pacemaker/commit/fcfe6fe522138343e4138248829926700fac213e
 

All right.
Will you apply this fix to Pacemaker 1.0?

Best Regards,
Hideo Yamauchi.




--- On Fri, 2011/12/16, Andrew Beekhof and...@beekhof.net wrote:

 On Thu, Dec 15, 2011 at 8:45 PM,  renayama19661...@ybb.ne.jp wrote:
  Hi Dejan,
 
  Thank you for comment.
 
  It looks like a wrong place for a fix. Shouldn't crmd send all
  environment? It is only by chance that we have the timeout value
  available in this function.
 
  In the case of stop, crmd does not ask lrmd for the substitution of the 
  parameter. .
 
  (snip)
         /* reset the resource's parameters? */
         if(op->interval == 0) {
             if(safe_str_eq(CRMD_ACTION_START, operation)
                || safe_str_eq(CRMD_ACTION_STATUS, operation)) {
                 op->copyparams = 1;
             }
         }
  (snip)
 
  When the parameter of the resource is changed, I think this to be because I 
  influence the stop of the resource of lrmd.
  It is necessary for the changed parameter not to copy it.
 
 When stopping, you always want to use the old parameters (think of
 someone changing 'ip' for an IPaddr resource).
 Options that are interpreted by the crmd or lrmd are a different
 matter which resulted in:
     
 https://github.com/ClusterLabs/pacemaker/commit/fcfe6fe522138343e4138248829926700fac213e
 
 
  My patch is an example when I handle it in lrmd.
 
  Is there a better patch?
  * For example, it may be good to give copyparams a different value.
 
  Best Regards,
  Hideo Yamauchi.
 
 
  --- On Thu, 2011/12/15, Dejan Muhamedagic de...@suse.de wrote:
 
  Hi Hideo-san,
 
  On Thu, Dec 15, 2011 at 06:21:00PM +0900, renayama19661...@ybb.ne.jp wrote:
   Hi All,
  
   I made the patch which revised the old next problem.
  
    * http://www.gossamer-threads.com/lists/linuxha/users/70262
  
   In consideration of influence when a parameter was changed, I replace 
   only a value of timeout.
  
   Please confirm my patch.
   And please commit a patch.
 
  It looks like a wrong place for a fix. Shouldn't crmd send all
  environment? It is only by chance that we have the timeout value
  available in this function.
 
  Cheers,
 
  Dejan
 
   Best Regards,
   Hideo Yamauchi.
 
 


Re: [Linux-ha-dev] [Patch] OCF_RESKEY_CRM_meta_timeout not matching monitor timeout meta-data.

2011-12-15 Thread renayama19661014
Hi Andrew,

  All right.
  Will you apply this correction to 1.0 of Pacemaker?
 
 Sure.  We'll pick it up for .13

Many Thanks!!

Hideo Yamauchi.

--- On Fri, 2011/12/16, Andrew Beekhof and...@beekhof.net wrote:

 On Fri, Dec 16, 2011 at 1:21 PM,  renayama19661...@ybb.ne.jp wrote:
  Hi Andrew,
 
  Thank you for comment.
 
  When stopping, you always want to use the old parameters (think of
  someone changing 'ip' for an IPaddr resource).
  Options that are interpreted by the crmd or lrmd are a different
  matter which resulted in:
      
  https://github.com/ClusterLabs/pacemaker/commit/fcfe6fe522138343e4138248829926700fac213e
 
 
  All right.
  Will you apply this correction to 1.0 of Pacemaker?
 
 Sure.  We'll pick it up for .13
 
 
  Best Regards,
  Hideo Yamauchi.
 
 
 
 
  --- On Fri, 2011/12/16, Andrew Beekhof and...@beekhof.net wrote:
 
  On Thu, Dec 15, 2011 at 8:45 PM,  renayama19661...@ybb.ne.jp wrote:
   Hi Dejan,
  
   Thank you for comment.
  
   It looks like a wrong place for a fix. Shouldn't crmd send all
   environment? It is only by chance that we have the timeout value
   available in this function.
  
   In the case of stop, crmd does not ask lrmd for the substitution of the 
   parameter. .
  
   (snip)
          /* reset the resource's parameters? */
          if(op->interval == 0) {
              if(safe_str_eq(CRMD_ACTION_START, operation)
                 || safe_str_eq(CRMD_ACTION_STATUS, operation)) {
                  op->copyparams = 1;
              }
          }
   (snip)
  
   When the parameter of the resource is changed, I think this to be 
   because I influence the stop of the resource of lrmd.
   It is necessary for the changed parameter not to copy it.
 
  When stopping, you always want to use the old parameters (think of
  someone changing 'ip' for an IPaddr resource).
  Options that are interpreted by the crmd or lrmd are a different
  matter which resulted in:
      
  https://github.com/ClusterLabs/pacemaker/commit/fcfe6fe522138343e4138248829926700fac213e
 
  
   My patch is an example when I handle it in lrmd.
  
   Is there a better patch?
   * For example, it may be good to give copyparams a different value.
  
   Best Regards,
   Hideo Yamauchi.
  
  
   --- On Thu, 2011/12/15, Dejan Muhamedagic de...@suse.de wrote:
  
   Hi Hideo-san,
  
   On Thu, Dec 15, 2011 at 06:21:00PM +0900, renayama19661...@ybb.ne.jp 
   wrote:
Hi All,
   
I made the patch which revised the old next problem.
   
     * http://www.gossamer-threads.com/lists/linuxha/users/70262
   
In consideration of influence when a parameter was changed, I replace 
only a value of timeout.
   
Please confirm my patch.
And please commit a patch.
  
   It looks like a wrong place for a fix. Shouldn't crmd send all
   environment? It is only by chance that we have the timeout value
   available in this function.
  
   Cheers,
  
   Dejan
  
Best Regards,
Hideo Yamauchi.
  
  


[Linux-ha-dev] [Patch] Patch for LVM.(2/3)

2011-12-04 Thread renayama19661014
Hi All,

This patch fixes the following point.

* Corrects the wrong log message output when the status action is run.

Please review my patch and commit it.

Best Regards, 
Hideo Yamauchi

diff -r 46f87af89d20 heartbeat/LVM
--- a/heartbeat/LVM Mon Dec 05 19:21:11 2011 +0900
+++ b/heartbeat/LVM Mon Dec 05 19:21:44 2011 +0900
@@ -162,12 +162,14 @@
   fi
 
   # Report on LVM volume status to stdout...
-  if
-echo $VGOUT | grep -i 'Access.*read/write' >/dev/null
-  then
-ocf_log debug "Volume $1 is available read/write (running)"
-  else
-ocf_log debug "Volume $1 is available read-only (running)"
+  if [ $rc -eq 0 ]; then
+if
+   echo $VGOUT | grep -i 'Access.*read/write' >/dev/null
+then
+   ocf_log debug "Volume $1 is available read/write (running)"
+else
+   ocf_log debug "Volume $1 is available read-only (running)"
+fi
   fi
  
   return $OCF_SUCCESS
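
The idea of the patch above, reporting the access mode only when the status command succeeded, can be sketched on its own as follows. This is a simplified illustration and not the LVM agent itself; the function name describe_vg_access and its two arguments are assumptions.

```shell
#!/bin/sh
# Hypothetical helper: describe the access mode of a volume group,
# but only when the status command itself succeeded (rc = 0).
# $1 = exit code of the status command, $2 = its captured output.
describe_vg_access() {
    rc=$1
    vgout=$2
    if [ "$rc" -ne 0 ]; then
        # The status command failed, so its output says nothing reliable.
        return 1
    fi
    if echo "$vgout" | grep -i 'Access.*read/write' >/dev/null; then
        echo "read/write"
    else
        echo "read-only"
    fi
}
```

With this shape, a failed status call no longer produces a misleading "read-only (running)" message.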


[Linux-ha-dev] [Patch] Patch for LVM.(3/3)

2011-12-04 Thread renayama19661014
Hi All,

This patch fixes the following point.

* Removes unused, commented-out statements.

Please review my patch and commit it.

Best Regards, 
Hideo Yamauchi

diff -r a85a5ba1712f heartbeat/LVM
--- a/heartbeat/LVM Mon Dec 05 22:43:03 2011 +0900
+++ b/heartbeat/LVM Mon Dec 05 22:44:59 2011 +0900
@@ -325,9 +325,7 @@
 if 
   [ -z $OCF_RESKEY_volgrpname ]
 then
-#  echo You must identify the volume group name!
   ocf_log err You must identify the volume group name!
-#  usage
   exit $OCF_ERR_CONFIGURED 
 fi
 


[Linux-ha-dev] [Patch] Patch for IPsrcaddr.(2/2)

2011-11-28 Thread renayama19661014
Hi All,

We made a patch for IPsrcaddr.

This patch makes the following changes.

 * Runs the validate_all processing.
 * Removes the definition of the unused IPROUTE variable.
 * Changes find_interface_generic to look up the interface with the ip command.


However, we cannot test on environments other than Linux.
Therefore, we limited the validate_all call to Linux environments.

(snip)
@@ -458,6 +440,10 @@
 
 ipaddress=$OCF_RESKEY_ipaddress
 
+if [ x$SYSTYPE = xLinux ]; then
+   srca_validate_all
+fi
+
(snip)
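
For reference, the ip-based lookup can be sketched as a small filter over the one-line output of the ip command. This is not the code in the patch (which uses grep and cut); it is an awk variant that compares the whole address, so that 192.168.1.10 does not also match 192.168.1.100. The names find_iface_for_ip and sample_addr_output, and the sample addresses, are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the interface that carries the address "$1",
# reading `ip -o -f inet addr show` style lines from stdin.
find_iface_for_ip() {
    awk -v ip="$1" '
        {
            # In one-line output, field 2 is the interface name,
            # field 3 is the family and field 4 is ADDRESS/PREFIX.
            split($4, a, "/")
            if ($3 == "inet" && a[1] == ip && $2 !~ /^ipsec[0-9]+$/) {
                print $2
                found = 1
                exit
            }
        }
        END { exit(found ? 0 : 1) }
    '
}

# Sample input in the shape that `ip -o -f inet addr show` prints on Linux.
sample_addr_output() {
    cat <<'EOF'
1: lo    inet 127.0.0.1/8 scope host lo
2: eth0    inet 192.168.1.10/24 brd 192.168.1.255 scope global eth0
3: eth1    inet 192.168.1.100/24 brd 192.168.1.255 scope global eth1
EOF
}
```

On a real Linux host the input would come from the command itself, for example `ip -o -f inet addr show | find_iface_for_ip "$BASEIP"`.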


Please review my correction and commit it.



Best Regards,
Hideo Yamauchi
diff -r e4d9d86a9577 IPsrcaddr
--- a/IPsrcaddr Mon Nov 28 20:02:26 2011 +0900
+++ b/IPsrcaddr Mon Nov 28 20:03:07 2011 +0900
@@ -307,35 +307,14 @@
 #
 find_interface_generic() {
 
-  $IFCONFIG $IFCONFIG_A_OPT  |
-  while read ifname linkstuff
-  do
-: Read gave us ifname = $ifname
-
-read inet addr junk
-: Read gave us inet = $inet addr = $addr
-
-while
-  read line && [ X$line != X ]
-do
-  : Nothing
-done
-
-case $SYSTYPE in
-  *BSD)
-   $IFCONFIG | grep $BASEIP -B`$IFCONFIG | grep -c inet` | grep UP, | cut -d : -f 1
-   return 0;;
-  *)
-   : comparing $BASEIP to $addr (from ifconfig)
-   case $addr in
- addr:$BASEIP) echo $ifname; return $OCF_SUCCESS;;
- $BASEIP)  echo $ifname; return $OCF_SUCCESS;;
-   esac
-   continue;;
-esac
-
-  done
-  return $OCF_ERR_GENERIC 
+   local iface=`$IP2UTIL -o -f inet addr show | grep "\ $BASEIP" \
+| cut -d ' ' -f2 | grep -v '^ipsec[0-9][0-9]*$'`
+if [ -z $iface ]; then
+return $OCF_ERR_GENERIC
+else 
+echo $iface
+return $OCF_SUCCESS
+fi
 }
 
 
@@ -409,7 +388,6 @@
 srca_validate_all() {
 
 check_binary $AWK
-check_binary $IPROUTE
 check_binary $IFCONFIG
 
 #  The IP address should be in good shape
@@ -420,6 +398,10 @@
  exit $OCF_ERR_CONFIGURED
fi
 
+   if ocf_is_probe; then
+ return $OCF_SUCCESS
+   fi
+
 #  We should serve this IP address of course
if ip_status $ipaddress; then
  :
@@ -458,6 +440,10 @@
 
 ipaddress=$OCF_RESKEY_ipaddress
 
+if [ x$SYSTYPE = xLinux ]; then
+   srca_validate_all
+fi
+
 findif_out=`$FINDIF -C`
 rc=$?
 [ $rc -ne 0 ] && {


Re: [Linux-ha-dev] [Patch]Remove unnecessary loop handling of data_directory for postfix.

2011-11-27 Thread renayama19661014
Hi Raoul,

What do you think about the second patch that I contributed?

Best Regards,
Hideo Yamauchi.

--- On Mon, 2011/11/21, renayama19661...@ybb.ne.jp renayama19661...@ybb.ne.jp 
wrote:

 Hi Raoul,
 
 Thank you for comment.
 
 Because postfix check does not return the details of a failure to the RA, I 
 realized that the detailed log messages are necessary.
 
 I changed the check of data_directory.
 And, as suggested earlier, I removed the loop.
 This is because the added check no longer allows multiple values to be set.
 
 Please review my correction and commit it.
 
 Best Regards,
 Hideo Yamauchi.
 
 
 --- On Sat, 2011/11/19, Raoul Bhatia [IPAX] r.bha...@ipax.at wrote:
 
  Hi Hideo-san!
  
  On 2011-11-16 11:36, renayama19661...@ybb.ne.jp wrote:
   I think that the same check has been already carried out in a resource 
   agent.
   
   (snip)
        # run Postfix internal check, if not probing
        if ! ocf_is_probe; then
            $binary $OPTIONS check >/dev/null 2>&1
            ret=$?
            if [ $ret -ne 0 ]; then
                ocf_log err "Postfix 'check' failed." $ret
                return $OCF_ERR_GENERIC
            fi
   fi
   (snip)
   
   
   That means, after all is not the loop check of data_directory unnecessary?
  
  postfix check is called after all other checks have passed and, you're
  right, it also checks the required directories.
  
  i think i had some issues though:
       # check spool/queue and data directories (if applicable)
       # this is required because postfix check does not catch all errors
  
  but i cannot remember the exact problems anymore.
  
  anyways, postfix check will return a OCF_ERR_GENERIC
  which is regarded as a soft error (!) [1] and will
  
  a. not hint the user or a gui application to the exact problem and
  b. will lead to a restart of the failed resource on the same node
  
  
  the more in-depth check will fail with OCF_ERR_INSTALLED [2] or
  OCF_ERR_PERM [3] and will
  
  c. give more information in this regard and
  d. migrates the resource to a different node,
     which makes sense if i.e. the shared queue directory (nfs, etc.)
     isn't available.
  
  
  i think that this behavior is good and checking the most commonly
  modified directories separately has been very helpful in my setups.
  
  but of course, i'm open for comments.
  
   #Sorry...Because English is weak, I may understand your opinion by 
   mistake.
  
  no worries. english isn't my first language either and until now we
  managed to work things out, right? :)
  
  cheers,
  raoul
  
  [1] 
  http://www.linux-ha.org/doc/dev-guides/_literal_ocf_err_generic_literal_1.html
  [2] 
  http://www.linux-ha.org/doc/dev-guides/_literal_ocf_err_installed_literal_5.html
  [3] 
  http://www.linux-ha.org/doc/dev-guides/_literal_ocf_err_perm_literal_4.html
  -- 
  DI (FH) Raoul Bhatia M.Sc.          email.          r.bha...@ipax.at
  Technischer Leiter
  
  IPAX - Aloy Bhatia Hava OG          web.          http://www.ipax.at
  Barawitzkagasse 10/2/2/11           email.            off...@ipax.at
  1190 Wien                           tel.               +43 1 3670030
  FN 277995t HG Wien                  fax.            +43 1 3670030 15
  
 


Re: [Linux-ha-dev] ocf:heartbeat:postfix postfix_running (was: Re: [Patch]The patch which revises log and an unnecessary loop for postfix resource agent.)

2011-11-21 Thread renayama19661014
Hi Raoul,

Thank you for your comment.

 https://github.com/raoulbhatia/resource-agents/commit/4a5afaa217

All right!
I have checked your changes.

Cheers,
Hideo Yamauchi.



--- On Tue, 2011/11/22, Raoul Bhatia [IPAX] r.bha...@ipax.at wrote:

 Hi Hideo-san!
 
 On 2011-11-21 03:07, renayama19661...@ybb.ne.jp wrote:
  It is judged that postfix works definitely and stops.
 
  The version that I confirmed is 2.6.6 on RHEL6.1.
 
  There seems to be a mistake with one patch.
  The postfix status command does not seem to return a detailed result.
  This is the same as postfix check command.
 
  I think that next is more right.
    * I abolished output and omitted output from log.
 
  (snip)
  postfix_running() {
       local loglevel
       loglevel=${1:-err}
 
       # run Postfix status if available
       if ocf_is_true $status_support; then
           $binary $OPTION_CONFIG_DIR status 2>&1
           ret=$?
           if [ $ret -ne 0 ]; then
               ocf_log $loglevel "Postfix status: $ret"
           fi
           return $ret
       fi
  (snip)
 
 i applied this change and also updated the other of ocf_log lines
 a little bit:
 
 https://github.com/raoulbhatia/resource-agents/commit/4a5afaa217
 
 i would like to resolve another issue though:
 
 if we expect to log an error, e.g.:
 
 1. postfix stop
 2. call postfix_running to see if postfix actually stopped.
 
 so there is an expected error if postfix_running which will get
 logged and will possibly trouble the admin, right?
 
 thinking about how to solve this for the postfix ra (e.g. using a
 -q parameter) i thought about using the ocf_run function.
 but the ocf_run function will log an error too...
 
 so i'll leave this issue until my other email is answered ;)
 
 cheers,
 raoul
 -- 
 
 DI (FH) Raoul Bhatia M.Sc.          email.          r.bha...@ipax.at
 Technischer Leiter
 
 IPAX - Aloy Bhatia Hava OG          web.          http://www.ipax.at
 Barawitzkagasse 10/2/2/11           email.            off...@ipax.at
 1190 Wien                           tel.               +43 1 3670030
 FN 277995t HG Wien                  fax.            +43 1 3670030 15
 
 


Re: [Linux-ha-dev] add an option to ocf_run to surpress *all* output

2011-11-21 Thread renayama19661014
Hi Raoul,

I think the option that Raoul proposes is a good addition.
It will surely be useful in the future.

Best Regards,
Hideo Yamauchi.
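
The behaviour asked for in this thread, running a command without logging anything even when it fails, can be sketched independently of ocf_run as follows. run_silently is a hypothetical name and is not part of ocf-shellfuncs; how a -qq option would actually be implemented in ocf_run is not shown here.

```shell
#!/bin/sh
# Hypothetical helper (not part of ocf-shellfuncs): run a command,
# discard everything it writes, and hand back only its exit code.
run_silently() {
    "$@" >/dev/null 2>&1
}
```

A stop action could then call something like `run_silently postfix status` and act only on the return code, so an expected "not running" result leaves no error in the log.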

--- On Tue, 2011/11/22, Raoul Bhatia [IPAX] r.bha...@ipax.at wrote:

 hi all!
 
 i'm using the following logic in my postfix ra:
 
 1. stop postfix
 2. check if postfix is actually stopped by checking it's status.
     if so, everything is working as intended!
 
 i now wanted to switch to using ocf_run in my postfix ra but there
 is no parameter to completely suppress the entire output of a command.
 
 what about adding a special option, e.g. -qq, to not log *anything* even
 if the command to run returns an error?
 
 thanks,
 raoul
 -- 
 
 DI (FH) Raoul Bhatia M.Sc.          email.          r.bha...@ipax.at
 Technischer Leiter
 
 IPAX - Aloy Bhatia Hava OG          web.          http://www.ipax.at
 Barawitzkagasse 10/2/2/11           email.            off...@ipax.at
 1190 Wien                           tel.               +43 1 3670030
 FN 277995t HG Wien                  fax.            +43 1 3670030 15
 


Re: [Linux-ha-dev] [Patch]Remove unnecessary loop handling of data_directory for postfix.

2011-11-20 Thread renayama19661014
Hi Raoul,

Thank you for your comment.

Because postfix check does not return the details of a failure to the RA, I 
realized that the detailed log messages are necessary.

I changed the check of data_directory.
And, as suggested earlier, I removed the loop.
This is because the added check no longer allows multiple values to be set.

Please review my correction and commit it.

Best Regards,
Hideo Yamauchi.
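
The added check counts the entries in the data_directory value and rejects anything other than a single directory. A minimal sketch of that counting step follows; count_entries and is_single_directory are hypothetical names chosen for this example, and the real patch does the same work inline with tr and wc.

```shell
#!/bin/sh
# Hypothetical helper: count the comma- or space-separated entries in a value.
count_entries() {
    # tr turns commas into spaces; wc -w counts the resulting words.
    # The arithmetic expansion strips the padding some wc versions add.
    echo $(( $(echo "$1" | tr ',' ' ' | wc -w) ))
}

# Hypothetical check: succeed only when exactly one directory is configured.
is_single_directory() {
    [ "$(count_entries "$1")" -eq 1 ]
}
```

For example, `is_single_directory /var/lib/postfix,/var/lib/postfix2` fails, which is the case where the agent should report a configuration problem.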


--- On Sat, 2011/11/19, Raoul Bhatia [IPAX] r.bha...@ipax.at wrote:

 Hi Hideo-san!
 
 On 2011-11-16 11:36, renayama19661...@ybb.ne.jp wrote:
  I think that the same check has been already carried out in a resource 
  agent.
  
  (snip)
       # run Postfix internal check, if not probing
       if ! ocf_is_probe; then
           $binary $OPTIONS check >/dev/null 2>&1
           ret=$?
           if [ $ret -ne 0 ]; then
               ocf_log err "Postfix 'check' failed." $ret
               return $OCF_ERR_GENERIC
           fi
  fi
  (snip)
  
  
  That means, after all is not the loop check of data_directory unnecessary?
 
 postfix check is called after all other checks have passed and, you're
 right, it also checks the required directories.
 
 i think i had some issues though:
      # check spool/queue and data directories (if applicable)
      # this is required because postfix check does not catch all errors
 
 but i cannot remember the exact problems anymore.
 
 anyways, postfix check will return a OCF_ERR_GENERIC
 which is regarded as a soft error (!) [1] and will
 
 a. not hint the user or a gui application to the exact problem and
 b. will lead to a restart of the failed resource on the same node
 
 
 the more in-depth check will fail with OCF_ERR_INSTALLED [2] or
 OCF_ERR_PERM [3] and will
 
 c. give more information in this regard and
 d. migrates the resource to a different node,
    which makes sense if i.e. the shared queue directory (nfs, etc.)
    isn't available.
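
 The difference between the generic and the specific return codes can be sketched like this. The numeric values are the ones in the OCF developer guide pages referenced in this message; check_directory is a hypothetical helper for illustration and is not taken from the postfix agent.

```shell
#!/bin/sh
# Return codes as numbered in the OCF developer guide pages cited in
# this message.
OCF_SUCCESS=0
OCF_ERR_GENERIC=1
OCF_ERR_PERM=4
OCF_ERR_INSTALLED=5

# Hypothetical helper: classify a directory problem with a specific code
# instead of the catch-all OCF_ERR_GENERIC.
check_directory() {
    dir=$1
    if [ ! -d "$dir" ]; then
        return $OCF_ERR_INSTALLED
    fi
    if [ ! -w "$dir" ]; then
        return $OCF_ERR_PERM
    fi
    return $OCF_SUCCESS
}
```

 Returning one of the specific codes is what lets the cluster treat the failure as a hard error and move the resource, as described in points c and d.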
 
 
 i think that this behavior is good and checking the most commonly
 modified directories separately has been very helpful in my setups.
 
 but of course, i'm open for comments.
 
  #Sorry...Because English is weak, I may understand your opinion by mistake.
 
 no worries. english isn't my first language either and until now we
 managed to work things out, right? :)
 
 cheers,
 raoul
 
 [1] 
 http://www.linux-ha.org/doc/dev-guides/_literal_ocf_err_generic_literal_1.html
 [2] 
 http://www.linux-ha.org/doc/dev-guides/_literal_ocf_err_installed_literal_5.html
 [3] 
 http://www.linux-ha.org/doc/dev-guides/_literal_ocf_err_perm_literal_4.html
 -- 
 DI (FH) Raoul Bhatia M.Sc.          email.          r.bha...@ipax.at
 Technischer Leiter
 
 IPAX - Aloy Bhatia Hava OG          web.          http://www.ipax.at
 Barawitzkagasse 10/2/2/11           email.            off...@ipax.at
 1190 Wien                           tel.               +43 1 3670030
 FN 277995t HG Wien                  fax.            +43 1 3670030 15
 
diff -r aaf72a017c98 postfix
--- a/postfix   Mon Nov 21 10:32:33 2011 +0900
+++ b/postfix   Mon Nov 21 10:34:08 2011 +0900
@@ -264,7 +264,13 @@
 fi
 
 if ocf_is_true $status_support; then
-data_dir=`postconf $OPTION_CONFIG_DIR -h data_directory 2>/dev/null`
+orig_data_dir=`postconf $OPTION_CONFIG_DIR -h data_directory 2>/dev/null`
+data_dir=`echo $orig_data_dir | tr ',' ' '`
+dcount=`echo $data_dir | wc -w`
+if [ $dcount -gt 1 ]; then
+ocf_log err "Postfix data directory '$orig_data_dir' cannot set plural parameters."
+return $OCF_ERR_PERM
+fi
 if [ ! -d $data_dir ]; then
 if ocf_is_probe; then
 ocf_log info "Postfix data directory '$data_dir' not readable during probe."
@@ -278,16 +284,14 @@
 # check directory permissions
 if ocf_is_true $status_support; then
 user=`postconf $OPTION_CONFIG_DIR -h mail_owner 2>/dev/null`
-for dir in $data_dir; do
-if ! su -s /bin/sh - $user -c "test -w $dir"; then
-if ocf_is_probe; then
-ocf_log info "Directory '$dir' is not writable by user '$user' during probe."
-else
-ocf_log err "Directory '$dir' is not writable by user '$user'."
-return $OCF_ERR_PERM;
-fi
+if ! su -s /bin/sh - $user -c "test -w $data_dir"; then
+if ocf_is_probe; then
+ocf_log info "Directory '$data_dir' is not writable by user '$user' during probe."
+else
+ocf_log err "Directory '$data_dir' is not writable by user '$user'."
+return $OCF_ERR_PERM;
 fi
-done
+fi
 fi
 fi
 

Re: [Linux-ha-dev] ocf:heartbeat:postfix postfix_running (was: Re: [Patch]The patch which revises log and an unnecessary loop for postfix resource agent.)

2011-11-20 Thread renayama19661014
Hi Raoul,

  2. we log an error (rc 1) which actually is expected and good
  (postfix is not running; we're eligible to start it)
 
  the same happens upon stopping postfix:
  Nov 18 15:01:07 m01 crmd: [2063]: info: do_lrm_rsc_op: Performing 
  key=116:55885:0:9582c8d2-c69a-4d79-91f6-04ea7bbe1853 
  op=m-mail-postfix_stop_0 )
  Nov 18 15:01:07 m01 lrmd: [2060]: info: rsc:m-mail-postfix stop[175] (pid 
  10420)
  Nov 18 15:01:08 m01 postfix/postfix-script[10632]: the Postfix mail system 
  is not running
  Nov 18 15:01:08 m01 postfix[10420]: INFO: Postfix status: ''. 1
  Nov 18 15:01:08 m01 postfix/postfix-script[10652]: the Postfix mail system 
  is not running
  Nov 18 15:01:08 m01 postfix[10420]: INFO: Postfix status: ''. 1
  Nov 18 15:01:08 m01 postfix[10420]: INFO: Postfix stopped.
  Nov 18 15:01:08 m01 lrmd: [2060]: info: operation stop[175] on 
  m-mail-postfix for client 2063: pid 10420 exited with return code 0
 
 this is still a pending issue.

In my environment, the same log does not appear.

Nov 21 10:48:22 rhel6-1 attrd: [5964]: info: attrd_ha_callback: Update relayed 
from rhel6-2
Nov 21 10:48:22 rhel6-1 attrd: [5964]: info: attrd_trigger_update: Sending 
flush op to all hosts for: shutdown (1321840102)
Nov 21 10:48:22 rhel6-1 attrd: [5964]: info: attrd_perform_update: Sent update 
8: shutdown=1321840102
Nov 21 10:48:23 rhel6-1 lrmd: [5962]: info: cancel_op: operation monitor[4] on 
prmDummy1 for client 5965, its parameters: CRM_meta_name=[monitor] 
crm_feature_set=[3.0.1] CRM_meta_on_fail=[restart] CRM_meta_interval=[1] 
CRM_meta_timeout=[2]  cancelled
Nov 21 10:48:23 rhel6-1 crmd: [5965]: info: do_lrm_rsc_op: Performing 
key=6:2:0:bf49e695-7079-40fd-803b-f732619084f4 op=prmDummy1_stop_0 )
Nov 21 10:48:23 rhel6-1 lrmd: [5962]: info: rsc:prmDummy1 stop[5] (pid 6488)
Nov 21 10:48:23 rhel6-1 crmd: [5965]: info: process_lrm_event: LRM operation 
prmDummy1_monitor_1 (call=4, status=1, cib-update=0, confirmed=true) 
Cancelled
Nov 21 10:48:26 rhel6-1 postfix(prmDummy1)[6488]: [6637]: INFO: Postfix status: 
''. 1
Nov 21 10:48:26 rhel6-1 postfix(prmDummy1)[6488]: [6639]: INFO: Postfix stopped.
Nov 21 10:48:26 rhel6-1 lrmd: [5962]: info: operation stop[5] on prmDummy1 for 
client 5965: pid 6488 exited with return code 0
Nov 21 10:48:26 rhel6-1 crmd: [5965]: info: process_lrm_event: LRM operation 
prmDummy1_stop_0 (call=5, rc=0, cib-update=14, confirmed=true) ok
Nov 21 10:48:27 rhel6-1 crmd: [5965]: info: handle_request: Shutting down

This shows that postfix is correctly detected as running and is then stopped.

The version I tested is 2.6.6 on RHEL 6.1.

There seems to be one mistake in the patch.
The postfix status command does not seem to return a detailed result.
This is the same as with the postfix check command.

I think the following is more correct.
 * I removed the output variable and dropped the output from the log message.

(snip)
postfix_running() {
    local loglevel
    loglevel=${1:-err}

    # run Postfix status if available
    if ocf_is_true $status_support; then
        $binary $OPTION_CONFIG_DIR status 2>&1
        ret=$?
        if [ $ret -ne 0 ]; then
            ocf_log $loglevel "Postfix status: $ret"
        fi
        return $ret
    fi
(snip)
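
The point of the loglevel argument is that a non-zero status is sometimes expected, for example when checking right after a stop. A self-contained sketch of that pattern follows; log_msg stands in for ocf_log so that the example runs on its own, and check_running is a hypothetical name, not the agent's function.

```shell
#!/bin/sh
# log_msg stands in for ocf_log here so the sketch runs on its own.
log_msg() {
    level=$1
    shift
    echo "$level: $*"
}

# Hypothetical sketch: run a status command and log a failure at a
# caller-chosen level.
# $1 = log level for a non-zero status, remaining arguments = the command.
check_running() {
    loglevel=${1:-err}
    shift
    ret=0
    "$@" >/dev/null 2>&1 || ret=$?
    if [ "$ret" -ne 0 ]; then
        log_msg "$loglevel" "status: $ret"
    fi
    return "$ret"
}
```

A stop action would call it with info, so that "not running" is recorded without looking like an error, while a monitor action would keep the err level.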

Best Regards,
Hideo Yamauchi.




--- On Sat, 2011/11/19, Raoul Bhatia [IPAX] r.bha...@ipax.at wrote:

 On 2011-11-18 15:16, Raoul Bhatia [IPAX] wrote:
  1. we do not capture the the Postfix mail system is not running
       output. maybe this is a result from running in an interactive shell?
 
 i can answer this myself.
 
 postfix, at least on debian, only displays the output to stdout if
 there is a connected terminal.
 
 e.g.
  # postfix blar; echo $?
  postfix/postfix-script: fatal: usage: postfix start (or stop, reload, 
  abort, flush, check, status, set-permissions, upgrade-configuration)
  1
  # ssh localhost postfix blar; echo $?
  1
 
 so i do not know whether there is any sense in logging
 the postfix_running output.
 
  2. we log an error (rc 1) which actually is expected and good
      (postfix is not running; we're eligible to start it)
 
  the same happens upon stopping postfix:
  Nov 18 15:01:07 m01 crmd: [2063]: info: do_lrm_rsc_op: Performing 
  key=116:55885:0:9582c8d2-c69a-4d79-91f6-04ea7bbe1853 
  op=m-mail-postfix_stop_0 )
  Nov 18 15:01:07 m01 lrmd: [2060]: info: rsc:m-mail-postfix stop[175] (pid 
  10420)
  Nov 18 15:01:08 m01 postfix/postfix-script[10632]: the Postfix mail system 
  is not running
  Nov 18 15:01:08 m01 postfix[10420]: INFO: Postfix status: ''. 1
  Nov 18 15:01:08 m01 postfix/postfix-script[10652]: the Postfix mail system 
  is not running
  Nov 18 15:01:08 m01 postfix[10420]: INFO: Postfix status: ''. 1
  Nov 18 15:01:08 m01 postfix[10420]: INFO: Postfix stopped.
  Nov 18 15:01:08 m01 lrmd: [2060]: info: operation stop[175] on 
  m-mail-postfix for client 2063: pid 10420 exited with return code 0
 
 this is still a pending issue.
 
 thanks,
 raoul
 -- 
 
 DI (FH) Raoul 

Re: [Linux-ha-dev] [Patch]Remove unnecessary loop handling of data_directory for postfix.

2011-11-16 Thread renayama19661014
Hi Raoul,

Thank you for your comment.

 On 2011-11-16 01:16, renayama19661...@ybb.ne.jp wrote:
  From these results I concluded that multiple data_directory values cannot 
  be set, and so I contributed a patch.
  Is my conclusion wrong?
 
 to my knowledge, you're correct.
 multiple data_directories are not possible (and imho make no sense)

All right!
Thanks!

 
 
  (Example) I confirmed this with postfix 2.6.6 on RHEL6.
 
    * Step1 : I set two directories in main.cf.
 
  (snip)
  # The data_directory parameter specifies the location of Postfix-writable
  # data files (caches, random numbers). This directory must be owned
  # by the mail_owner account (see below).
  #
  data_directory = /var/lib/postfix,/var/lib/postfix2
  (snip)
 
 so you set the directory to the single value
 /var/lib/postfix,/var/lib/postfix2
 
 which is not tokenized/split into an array.
 
    * Step2 : I created the second directory and gave it access permissions.
 
  [root@rhel6-1 ~]# mkdir /var/lib/postfix2
  [root@rhel6-1 ~]# chown postfix:postfix /var/lib/postfix2
 
 
    * Step3 : I ran the postfix check command. (ERROR)
 
  [root@rhel6-1 ~]# postfix check
  mkdir: cannot create directory `/var/lib/postfix,/var/lib/postfix2': No 
  such file or directory
  postfix/postfix-script: fatal: unable to create missing queue directories
  [root@rhel6-1 ~]# echo $?
  1
 
 that is expected and the resource agent should check the same.

I think the resource agent already performs the same check.

(snip)
    # run Postfix internal check, if not probing
    if ! ocf_is_probe; then
        $binary $OPTIONS check >/dev/null 2>&1
        ret=$?
        if [ $ret -ne 0 ]; then
            ocf_log err "Postfix 'check' failed." $ret
            return $OCF_ERR_GENERIC
        fi
    fi
(snip)
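
A small sketch of why a loop over data_directory adds nothing: with the default IFS the shell splits an unquoted value on whitespace only, so a comma-joined value stays a single word and the loop body runs once. The name data_dir follows the agent; the count_words helper is hypothetical and exists only for this illustration.

```shell
#!/bin/sh
# Count how many times a for-loop over an unquoted value iterates.
count_words() {
    n=0
    for dir in $1; do
        n=$((n + 1))
    done
    echo "$n"
}

data_dir="/var/lib/postfix,/var/lib/postfix2"
count_words "$data_dir"   # prints 1: the comma does not split the value
```

Since the loop can only ever see one value, a single write-permission test on $data_dir does the same work.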


So, after all, isn't the loop check of data_directory unnecessary?

# Sorry, my English is weak, so I may have misunderstood your opinion.

Best Regards,
Hideo Yamauchi.


 -- 
 
 DI (FH) Raoul Bhatia M.Sc.          email.          r.bha...@ipax.at
 Technischer Leiter
 
 IPAX - Aloy Bhatia Hava OG          web.          http://www.ipax.at
 Barawitzkagasse 10/2/2/11           email.            off...@ipax.at
 1190 Wien                           tel.               +43 1 3670030
 FN 277995t HG Wien                  fax.            +43 1 3670030 15
 
 
___
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/


Re: [Linux-ha-dev] [Patch]The patch which revises log and an unnecessary loop for postfix resource agent.

2011-11-15 Thread renayama19661014
Hi Raoul,

Thank you for your comment.

 1. simply break the loop when postfix isn't running anymore.
 2. ocf_log info "Postfix stopped." will be called at the end of the
    postfix_stop() method.
 
 any objections?

All right.

I think that the correction that you suggested is right.
I approve of it.
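
The agreed behaviour can be sketched in isolation as follows. This is only an illustration: service_running is a hypothetical stand-in for postfix_running, and it pretends the service is still running for the first two checks.

```shell
#!/bin/sh
# Hypothetical stand-in: reports "running" for the first two checks only.
checks=0
service_running() {
    checks=$((checks + 1))
    [ "$checks" -le 2 ]
}

wait_for_stop() {
    # recheck up to 5 times, leaving the loop as soon as the service is gone
    for i in 1 2 3 4 5; do
        if service_running; then
            sleep 1
        else
            break
        fi
    done
    # code after the loop (logging, escalation) still runs after the break
    echo "checks made: $checks"
}

wait_for_stop   # prints "checks made: 3"
```

With break instead of return, the rest of the stop function keeps a single exit path, so the final log message is written in one place only.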

Thanks,
Hideo Yamauchi.

--- On Tue, 2011/11/15, Raoul Bhatia [IPAX] r.bha...@ipax.at wrote:

 hi Hideo-san!
 
 On 2011-11-15 01:14, renayama19661...@ybb.ne.jp wrote:
  Hi Raoul,
  
  why do you want to return here and not simply break and let the
  postfix_stop() continue its work?
 
 ok, so i would change the patch to:
 
  --- a/heartbeat/postfix
  +++ b/heartbeat/postfix
  @@ -173,6 +173,8 @@ postfix_stop()
       for i in 1 2 3 4 5; do
           if postfix_running info; then
               sleep 1
  +        else
  +            break
           fi
       done
 
 1. simply break the loop when postfix isn't running anymore.
 2. ocf_log info "Postfix stopped." will be called at the end of the
    postfix_stop() method.
 
 any objections?
 
 thanks,
 raoul
 -- 
 DI (FH) Raoul Bhatia M.Sc.          email.          r.bha...@ipax.at
 Technischer Leiter
 
 IPAX - Aloy Bhatia Hava OG          web.          http://www.ipax.at
 Barawitzkagasse 10/2/2/11           email.            off...@ipax.at
 1190 Wien                           tel.               +43 1 3670030
 FN 277995t HG Wien                  fax.            +43 1 3670030 15
 
 


Re: [Linux-ha-dev] [Patch]Remove unnecessary loop handling of data_directory for postfix.

2011-11-15 Thread renayama19661014
Hi Raoul,

Thank you for your comment.

Since I do not know much about postfix configuration, my opinion may be wrong.

The data_directory parameter in postfix's main.cf accepts a value that lists 
two or more directories.
However, with such a setting, the postfix check command returns an error.

Because the postfix resource agent runs the postfix check command in the 
same way, its validate processing also returns an error.

From these results I concluded that multiple data_directory values cannot 
be set, and so I contributed a patch.
Is my conclusion wrong?

(Example) I confirmed this with postfix 2.6.6 on RHEL6.

 * Step1 : I set two directories in main.cf.

(snip)
# The data_directory parameter specifies the location of Postfix-writable
# data files (caches, random numbers). This directory must be owned
# by the mail_owner account (see below).
#
data_directory = /var/lib/postfix,/var/lib/postfix2
(snip)

 * Step2 : I created the second directory and gave it access permissions.

[root@rhel6-1 ~]# mkdir /var/lib/postfix2
[root@rhel6-1 ~]# chown postfix:postfix /var/lib/postfix2


 * Step3 : I ran the postfix check command. (ERROR)

[root@rhel6-1 ~]# postfix check
mkdir: cannot create directory `/var/lib/postfix,/var/lib/postfix2': No such 
file or directory
postfix/postfix-script: fatal: unable to create missing queue directories
[root@rhel6-1 ~]# echo $?
1

Best Regards,
Hideo Yamauchi.


--- On Tue, 2011/11/15, Raoul Bhatia [IPAX] r.bha...@ipax.at wrote:

 Hi Hideo-san!
 
 On 2011-11-15 03:09, renayama19661...@ybb.ne.jp wrote:
  Hi Raoul,
  Hi All,
 
  I removed unnecessary loop handling of data_directory.
 
  This patch should be applied after the following patch has been applied.
    * http://www.gossamer-threads.com/lists/linuxha/dev/76354
 
 
  Please confirm my correction.
  And please commit it.
 
 the reason i kept this loop is that
 if we need to check another directory for write permissions in the
 future
 we only need to add this directory to the loop.
 
 i used to have two loops in the ra:
 - one for checking if important directories exists and
 - one for checking if important directories are writable.
 
 see https://github.com/raoulbhatia/resource-agents/commit/136dd79
 
 after your status_support patches in mid 2011,
 the first loop got unfolded.
 i kept the second loop on purpose.
 
 
 what are your thoughts? did you simply remove the loop because it is
 unnecessary or did you have anything else in mind?
 
 
 if it is not too much of a problem, i'd like to keep the write
 check loop intact just in case we need it for another directory.
 
 cheers,
 raoul
 -- 
 
 DI (FH) Raoul Bhatia M.Sc.          email.          r.bha...@ipax.at
 Technischer Leiter
 
 IPAX - Aloy Bhatia Hava OG          web.          http://www.ipax.at
 Barawitzkagasse 10/2/2/11           email.            off...@ipax.at
 1190 Wien                           tel.               +43 1 3670030
 FN 277995t HG Wien                  fax.            +43 1 3670030 15
 
 


Re: [Linux-ha-dev] [Patch]The patch which revises log and an unnecessary loop for postfix resource agent.

2011-11-14 Thread renayama19661014
Hi Raoul,

 why do you want to return here and not simply break and let the
 postfix_stop() continue its work?

No, I have no problem with using the break statement instead.
Using the return statement was simply my preference.

Cheers,
Hideo Yamauchi.


--- On Mon, 2011/11/14, Raoul Bhatia [IPAX] r.bha...@ipax.at wrote:

 hi!
 
 thanks for your contribution!
 
 On 2011-11-14 07:04, renayama19661...@ybb.ne.jp wrote:
  diff -r 52dcb4318e21 heartbeat/postfix
  --- a/heartbeat/postfix    Mon Nov 14 14:46:36 2011 +0900
  +++ b/heartbeat/postfix    Mon Nov 14 14:47:03 2011 +0900
 ...
  @@ -168,14 +171,17 @@
 
       # grant some time for shutdown and recheck 5 times
       for i in 1 2 3 4 5; do
  -        if postfix_running; then
  +        if postfix_running info; then
               sleep 1
  +        else
  +            ocf_log info Postfix stopped.
  +            return $OCF_SUCCESS
           fi
       done
 why do you want to return here and not simply break and let the
 postfix_stop() continue its work?
 
 
 besides that, your patch looks fine upon the first check.
 
 cheers,
 raoul
 -- 
 
 DI (FH) Raoul Bhatia M.Sc.          email.          r.bha...@ipax.at
 Technischer Leiter
 
 IPAX - Aloy Bhatia Hava OG          web.          http://www.ipax.at
 Barawitzkagasse 10/2/2/11           email.            off...@ipax.at
 1190 Wien                           tel.               +43 1 3670030
 FN 277995t HG Wien                  fax.            +43 1 3670030 15
 
 


[Linux-ha-dev] [Patch]Remove unnecessary loop handling of data_directory for postfix.

2011-11-14 Thread renayama19661014
Hi Raoul,
Hi All,

I removed the unnecessary loop handling of data_directory.

This patch should be applied after the following patch has been applied.
 * http://www.gossamer-threads.com/lists/linuxha/dev/76354


Please confirm my correction.
And please commit it.

Best Regards,
Hideo Yamauchi.
diff -r b2a771cba975 heartbeat/postfix
--- a/heartbeat/postfix Tue Nov 15 10:53:38 2011 +0900
+++ b/heartbeat/postfix Tue Nov 15 10:57:10 2011 +0900
@@ -278,16 +278,14 @@
         # check directory permissions
         if ocf_is_true $status_support; then
             user=`postconf $OPTION_CONFIG_DIR -h mail_owner 2>/dev/null`
-            for dir in $data_dir; do
-                if ! su -s /bin/sh - $user -c "test -w $dir"; then
-                    if ocf_is_probe; then
-                        ocf_log info "Directory '$dir' is not writable by user '$user' during probe."
-                    else
-                        ocf_log err "Directory '$dir' is not writable by user '$user'."
-                        return $OCF_ERR_PERM;
-                    fi
+            if ! su -s /bin/sh - $user -c "test -w $data_dir"; then
+                if ocf_is_probe; then
+                    ocf_log info "Directory '$data_dir' is not writable by user '$user' during probe."
+                else
+                    ocf_log err "Directory '$data_dir' is not writable by user '$user'."
+                    return $OCF_ERR_PERM;
                 fi
-            done
+            fi
         fi
     fi
 


[Linux-ha-dev] [Patch]The patch which revises log and an unnecessary loop for postfix resource agent.

2011-11-13 Thread renayama19661014
Hi Raoul,
Hi All,

I am sending a modified patch for the postfix resource agent.

It makes two corrections.

 * Change of the log level used in connection with monitor processing.
 * Removal of an unnecessary loop in stop processing.
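
The first point can be illustrated with a minimal sketch. The PROBE variable and this ocf_is_probe stub are assumptions for the illustration only; the real ocf_is_probe comes from the OCF shell library and inspects the operation that the cluster requested.

```shell
#!/bin/sh
# Stand-in for illustration only (hypothetical, not the OCF library version).
PROBE=${PROBE:-0}
ocf_is_probe() { [ "$PROBE" = "1" ]; }

monitor_loglevel() {
    # a stopped service is expected during a probe, so log it at info
    if ocf_is_probe; then
        echo info
    else
        echo err
    fi
}

monitor_loglevel             # prints "err" unless PROBE=1 is set in the environment
PROBE=1; monitor_loglevel    # prints "info"
```

The chosen level is then passed on to the status check, so that only an unexpected "not running" result is reported as an error.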

Please confirm my correction.
And please commit it.

Best Regards,
Hideo Yamauchi.
diff -r 52dcb4318e21 heartbeat/postfix
--- a/heartbeat/postfix Mon Nov 14 14:46:36 2011 +0900
+++ b/heartbeat/postfix Mon Nov 14 14:47:03 2011 +0900
@@ -96,12 +96,15 @@
 }
 
 postfix_running() {
+    local loglevel
+    loglevel=${1:-err}
+
     # run Postfix status if available
     if ocf_is_true $status_support; then
         output=`$binary $OPTION_CONFIG_DIR status 2>&1`
         ret=$?
         if [ $ret -ne 0 ]; then
-            ocf_log err "Postfix status: '$output'." $ret
+            ocf_log $loglevel "Postfix status: '$output'." $ret
         fi
         return $ret
     fi
@@ -121,7 +124,7 @@
 postfix_start()
 {
     # if Postfix is running return success
-    if postfix_running; then
+    if postfix_running info; then
         ocf_log info "Postfix already running."
         return $OCF_SUCCESS
     fi
@@ -140,7 +143,7 @@
     while true; do
         sleep 1
         # break if postfix is up and running; log failure otherwise
-        postfix_running && break
+        postfix_running info && break
         ocf_log info "Postfix failed initial monitor action." $ret
     done
 
@@ -152,7 +155,7 @@
 postfix_stop()
 {
     # if Postfix is not running return success
-    if ! postfix_running; then
+    if ! postfix_running info; then
         ocf_log info "Postfix already stopped."
         return $OCF_SUCCESS
     fi
@@ -168,14 +171,17 @@
 
     # grant some time for shutdown and recheck 5 times
     for i in 1 2 3 4 5; do
-        if postfix_running; then
+        if postfix_running info; then
             sleep 1
+        else
+            ocf_log info "Postfix stopped."
+            #return $OCF_SUCCESS
         fi
     done
 
     # escalate to abort if we did not stop by now
     # @TODO shall we loop here too?
-    if postfix_running; then
+    if postfix_running info; then
         ocf_log err "Postfix failed to stop. Escalating to 'abort'."
 
         $binary $OPTIONS abort >/dev/null 2>&1; ret=$?
@@ -202,7 +208,14 @@
 
 postfix_monitor()
 {
-    if postfix_running; then
+    local status_loglevel=err
+
+    # Set loglevel to info during probe
+    if ocf_is_probe; then
+        status_loglevel=info
+    fi
+
+    if postfix_running $status_loglevel; then
         return $OCF_SUCCESS
     fi
 


Re: [Linux-ha-dev] [Patch]The patch which revises log and an unnecessary loop for postfix resource agent.

2011-11-13 Thread renayama19661014
Hi All,

Sorry.
There was an error in the patch, so I am sending it again.

Best Regards,
Hideo Yamauchi.

--- On Mon, 2011/11/14, renayama19661...@ybb.ne.jp renayama19661...@ybb.ne.jp 
wrote:

 Hi Raoul,
 Hi All,
 
 I am sending a modified patch for the postfix resource agent.
 
 It makes two corrections.
 
  * Change of the log level used in connection with monitor processing.
  * Removal of an unnecessary loop in stop processing.
 
 Please confirm my correction.
 And please commit it.
 
 Best Regards,
 Hideo Yamauchi.
diff -r 52dcb4318e21 heartbeat/postfix
--- a/heartbeat/postfix Mon Nov 14 14:46:36 2011 +0900
+++ b/heartbeat/postfix Mon Nov 14 14:47:03 2011 +0900
@@ -96,12 +96,15 @@
 }
 
 postfix_running() {
+    local loglevel
+    loglevel=${1:-err}
+
     # run Postfix status if available
     if ocf_is_true $status_support; then
         output=`$binary $OPTION_CONFIG_DIR status 2>&1`
         ret=$?
         if [ $ret -ne 0 ]; then
-            ocf_log err "Postfix status: '$output'." $ret
+            ocf_log $loglevel "Postfix status: '$output'." $ret
         fi
         return $ret
     fi
@@ -121,7 +124,7 @@
 postfix_start()
 {
     # if Postfix is running return success
-    if postfix_running; then
+    if postfix_running info; then
         ocf_log info "Postfix already running."
         return $OCF_SUCCESS
     fi
@@ -140,7 +143,7 @@
     while true; do
         sleep 1
         # break if postfix is up and running; log failure otherwise
-        postfix_running && break
+        postfix_running info && break
         ocf_log info "Postfix failed initial monitor action." $ret
     done
 
@@ -152,7 +155,7 @@
 postfix_stop()
 {
     # if Postfix is not running return success
-    if ! postfix_running; then
+    if ! postfix_running info; then
         ocf_log info "Postfix already stopped."
         return $OCF_SUCCESS
     fi
@@ -168,14 +171,17 @@
 
     # grant some time for shutdown and recheck 5 times
     for i in 1 2 3 4 5; do
-        if postfix_running; then
+        if postfix_running info; then
             sleep 1
+        else
+            ocf_log info "Postfix stopped."
+            return $OCF_SUCCESS
         fi
     done
 
     # escalate to abort if we did not stop by now
     # @TODO shall we loop here too?
-    if postfix_running; then
+    if postfix_running info; then
         ocf_log err "Postfix failed to stop. Escalating to 'abort'."
 
         $binary $OPTIONS abort >/dev/null 2>&1; ret=$?
@@ -202,7 +208,14 @@
 
 postfix_monitor()
 {
-    if postfix_running; then
+    local status_loglevel=err
+
+    # Set loglevel to info during probe
+    if ocf_is_probe; then
+        status_loglevel=info
+    fi
+
+    if postfix_running $status_loglevel; then
         return $OCF_SUCCESS
     fi
 


Re: [Linux-ha-dev] [Patch]Patch for LVM resource agents.

2011-10-03 Thread renayama19661014
Hi Dejan,

Many Thanks!!

Hideo Yamauchi.

--- On Tue, 2011/10/4, Dejan Muhamedagic de...@suse.de wrote:

 Hi Hideo-san,
 
 On Fri, Sep 30, 2011 at 11:17:19AM +0900, renayama19661...@ybb.ne.jp wrote:
  Hi Dejan,
  
  Sorry
  
  I sent the main body which was not a patch.
  I send it again.
 
 Patch applied.
 
 Cheers,
 
 Dejan
 
  Best Regards,
  Hideo Yamauchi.
  
  --- On Fri, 2011/9/30, renayama19661...@ybb.ne.jp 
  renayama19661...@ybb.ne.jp wrote:
  
   Hi Dejan,
   
   Sorry
   
   There was still a mistake in the patch I sent a while ago.
   With that patch, the valuable detailed log output is lost.
   I am sending a further revised patch.
   
   Best Regards,
   Hideo Yamauchi.
   
   
   --- On Fri, 2011/9/30, renayama19661...@ybb.ne.jp 
   renayama19661...@ybb.ne.jp wrote:
   
Hi Dejan,

  ocft test reports this:
  
  'LVM' case 7:   FAILED. Agent returns unexpected value: 
  'OCF_NOT_RUNNING'. See details below:
  2011/09/29_17:00:49 WARNING: LVM Volume ocft-vg is not available 
  (stopped).     Using volume group(s) on command line
  Finding volume group ocft-vg
  --- Volume group ---
  VG Name               ocft-vg
  System ID
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  2
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                1
  Open LV               0
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               4.00 MiB
  PE Size               4.00 KiB
  Total PE              1024
  Alloc PE / Size       150 / 600.00 KiB
  Free  PE / Size       874 / 3.41 MiB
  VG UUID               csVKm6-Bzdp-s40E-9O2S-uttx-PrcW-fq6Wtz
  
  --- Logical volume ---
  LV Name                /dev/ocft-vg/ocft-lv
  VG Name                ocft-vg
  LV UUID                XjMtXj-DLzy-J8Rb-6Bfb-HNoM-7o6x-VOPnMG
  LV Write Access        read/write
  LV Status              NOT available
  LV Size                600.00 KiB
  Current LE             150
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  
  --- Physical volumes ---
  PV Name               /dev/loop0
  PV UUID               z6deWo-42uN-HPrZ-nLC4-wrba-34IZ-N98cmL
  PV Status             allocatable
  Total PE / Free PE    1024 / 874
  
  2011/09/29_17:00:49 INFO: LVM Volume ocft-vg is offline
  
  That's for double stop, I think. OTOH, ocf-tester says that it
  passed all tests. Somebody's lying :)

I do not know much about ocft.
I ran ocft with the -v option.
     *  I ran it against the LVM agent with the patch attached to this email 
   applied.

[root@bl460g1a heartbeat]# /usr/sbin/ocft test -v LVM 
Initialing LVM...done
(snip)
Starting 'LVM' case 7 'monitor when running':
Setting agent environment:    export 
OCFT_pv=/var/run/resource-agents/ocft-LVM-pv
Setting agent environment:    export OCFT_vg=ocft-vg
Setting agent environment:    export OCFT_lv=ocft-lv
Setting agent environment:    export OCFT_loop=/dev/loop0
Setting agent environment:    export OCF_RESKEY_volgrpname=ocft-vg
Running agent:                ./LVM stop  ?
Running agent:                ./LVM monitor
Checking return value:        FAILED. The return value 
'OCF_NOT_RUNNING' != 'OCF_SUCCESS'. See details below:
2011/09/30_10:16:49 INFO: LVM Volume ocft-vg is offline
(snip)

In the 'monitor when running' test, monitor seems to be run after 
LVM stop has been run.

Isn't this a problem in ocft?

  When I tried by hand to stop a running VG:
  
  # OCF_RESKEY_volgrpname=$OCFT_vg 
  /usr/lib/ocf/resource.d/heartbeat/LVM stop
  INFO: Deactivating volume group ocft-vg
  INFO: 0 logical volume(s) in volume group ocft-vg now active
  ERROR: LVM Volume ocft-vg is not available (stopped).     Using 
  volume group(s) on command line
      Finding volume group ocft-vg
  ...
  # echo $?
  0
  
  The exit code is OK, but there's an error message. Further stops
  produced the same. Can you please verify this.
  
  Hence, there seems to be a problem with the ocft test case.

This was a mistake in my patch.
I have attached the revised patch.

Best Regards,
Hideo Yamauchi.



--- On Fri, 2011/9/30, renayama19661...@ybb.ne.jp 
renayama19661...@ybb.ne.jp wrote:

 Hi Dejan,
 
 Thank you for your comment.
 I will check the information you provided and revise the patch again.
 
 Best Regards,
 Hideo Yamauchi.
 
 --- On Fri, 2011/9/30, Dejan Muhamedagic de...@suse.de wrote:
 
  Hi Hideo-san,
  
  On Mon, 

Re: [Linux-ha-dev] [Patch]Patch for LVM resource agents.

2011-09-29 Thread renayama19661014
Hi Dejan,

Thank you for your comment.
I will check the information you provided and revise the patch again.

Best Regards,
Hideo Yamauchi.

--- On Fri, 2011/9/30, Dejan Muhamedagic de...@suse.de wrote:

 Hi Hideo-san,
 
 On Mon, Sep 12, 2011 at 02:44:22PM +0900, renayama19661...@ybb.ne.jp wrote:
  Hi All, 
  
  We made a patch for the LVM resource agent with the following points in mind.
  
   Point 1) On an error, the LVM resource agent logs the details for the 
 system administrator.
   Point 2) The LVM resource agent uses OCF variables for its return codes.
   Point 3) The patch merges the status processing and the report_status 
 processing of the LVM resource agent.
  
   * We did not address the TODO about vgimport/vgexport in the LVM resource 
 agent.
  
  Please review this patch. 
 
 ocft test reports this:
 
 'LVM' case 7:   FAILED. Agent returns unexpected value: 'OCF_NOT_RUNNING'. 
 See details below:
 2011/09/29_17:00:49 WARNING: LVM Volume ocft-vg is not available (stopped).   
   Using volume group(s) on command line
 Finding volume group ocft-vg
 --- Volume group ---
 VG Name               ocft-vg
 System ID
 Format                lvm2
 Metadata Areas        1
 Metadata Sequence No  2
 VG Access             read/write
 VG Status             resizable
 MAX LV                0
 Cur LV                1
 Open LV               0
 Max PV                0
 Cur PV                1
 Act PV                1
 VG Size               4.00 MiB
 PE Size               4.00 KiB
 Total PE              1024
 Alloc PE / Size       150 / 600.00 KiB
 Free  PE / Size       874 / 3.41 MiB
 VG UUID               csVKm6-Bzdp-s40E-9O2S-uttx-PrcW-fq6Wtz
 
 --- Logical volume ---
 LV Name                /dev/ocft-vg/ocft-lv
 VG Name                ocft-vg
 LV UUID                XjMtXj-DLzy-J8Rb-6Bfb-HNoM-7o6x-VOPnMG
 LV Write Access        read/write
 LV Status              NOT available
 LV Size                600.00 KiB
 Current LE             150
 Segments               1
 Allocation             inherit
 Read ahead sectors     auto
 
 --- Physical volumes ---
 PV Name               /dev/loop0
 PV UUID               z6deWo-42uN-HPrZ-nLC4-wrba-34IZ-N98cmL
 PV Status             allocatable
 Total PE / Free PE    1024 / 874
 
 2011/09/29_17:00:49 INFO: LVM Volume ocft-vg is offline
 
 That's for double stop, I think. OTOH, ocf-tester says that it
 passed all tests. Somebody's lying :)
 
 When I tried by hand to stop a running VG:
 
 # OCF_RESKEY_volgrpname=$OCFT_vg /usr/lib/ocf/resource.d/heartbeat/LVM stop
 INFO: Deactivating volume group ocft-vg
 INFO: 0 logical volume(s) in volume group ocft-vg now active
 ERROR: LVM Volume ocft-vg is not available (stopped).     Using volume 
 group(s) on command line
     Finding volume group ocft-vg
 ...
 # echo $?
 0
 
 The exit code is OK, but there's an error message. Further stops
 produced the same. Can you please verify this.
 
 Hence, there seems to be a problem with the ocft test case.
 
 Cheers,
 
 Dejan
 
  Best Regards,
  Hideo Yamauchi.
  diff -r fc1e82852f7a heartbeat/LVM
  --- a/heartbeat/LVM    Wed Aug 31 01:39:02 2011 +0900
  +++ b/heartbeat/LVM    Mon Sep 12 14:29:36 2011 +0900
  @@ -123,22 +123,17 @@
   #    Return LVM status (silently)
   #
   LVM_status() {
  -  if 
  -    [ $LVM_MAJOR -eq 1 ]
  -  then
  -    vgdisplay $1 2>&1 | grep -i 'Status.*available' 2>&1 >/dev/null
  -    return $?
  -  else
  -    vgdisplay -v $1 2>&1 | grep -i 'Status[ \t]*available' 2>&1 >/dev/null
  -    return $?
  +  local rc
  +  loglevel=debug
  +
  +  # Set the log level of the error message
  +  if [ X${2} == X ]; then
  +    loglevel=err
  +    if ocf_is_probe; then
  +      loglevel=warn
  +    fi
     fi
  -}
  -
  -#
  -#    Report on LVM volume status to stdout...
  -#
  -LVM_report_status() {
  -
  +  
     if 
       [ $LVM_MAJOR -eq 1 ]
     then
  @@ -150,16 +145,16 @@
       echo "$VGOUT" | grep -i 'Status[ \t]*available' >/dev/null
       rc=$?
     fi
  -
  -  if
  -    [ $rc -eq 0 ]
  -  then
  -    : Volume $1 is available
  -  else
  -    ocf_log debug "LVM Volume $1 is not available (stopped)"
  -    return $OCF_NOT_RUNNING
  +  if [ $rc -ne 0 ]; then
  +        ocf_log $loglevel "LVM Volume $1 is not available (stopped). ${VGOUT}"
  +  fi
  +  
  +  if [ X${2} == X ]; then
  +    # status call return
  +      return $rc
     fi
   
  +  # Report on LVM volume status to stdout...
     if
       echo "$VGOUT" | grep -i 'Access.*read/write' >/dev/null
     then
  @@ -167,8 +162,9 @@
     else
       ocf_log debug "Volume $1 is available read-only (running)"
     fi
  -  
  + 
     return $OCF_SUCCESS
  +
   }
   
   #
  @@ -176,6 +172,7 @@
   #
   #
   LVM_monitor() {
  +  local rc
     if
       LVM_status $1
     then
  @@ -185,9 +182,14 @@
       return $OCF_NOT_RUNNING
     fi
   
  -  vgck $1 >/dev/null 2>&1
  +  VGOUT=`vgck $1 2>&1`
  +  rc=$?
  +  if [ $rc -ne 0 ]; then
  +    ocf_log err "LVM Volume $1 is not found. ${VGOUT}:${rc}"
  +    return $OCF_ERR_GENERIC
  +  fi
   
  -  return $?
  + 

Re: [Linux-ha-dev] [Patch]Patch for LVM resource agents.

2011-09-29 Thread renayama19661014
Hi Dejan,

  ocft test reports this:
  
  'LVM' case 7:   FAILED. Agent returns unexpected value: 'OCF_NOT_RUNNING'. 
  See details below:
  2011/09/29_17:00:49 WARNING: LVM Volume ocft-vg is not available (stopped). 
  Using volume group(s) on command line
  Finding volume group ocft-vg
  --- Volume group ---
  VG Name               ocft-vg
  System ID
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  2
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                1
  Open LV               0
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               4.00 MiB
  PE Size               4.00 KiB
  Total PE              1024
  Alloc PE / Size       150 / 600.00 KiB
  Free  PE / Size       874 / 3.41 MiB
  VG UUID               csVKm6-Bzdp-s40E-9O2S-uttx-PrcW-fq6Wtz
  
  --- Logical volume ---
  LV Name                /dev/ocft-vg/ocft-lv
  VG Name                ocft-vg
  LV UUID                XjMtXj-DLzy-J8Rb-6Bfb-HNoM-7o6x-VOPnMG
  LV Write Access        read/write
  LV Status              NOT available
  LV Size                600.00 KiB
  Current LE             150
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  
  --- Physical volumes ---
  PV Name               /dev/loop0
  PV UUID               z6deWo-42uN-HPrZ-nLC4-wrba-34IZ-N98cmL
  PV Status             allocatable
  Total PE / Free PE    1024 / 874
  
  2011/09/29_17:00:49 INFO: LVM Volume ocft-vg is offline
  
  That's for double stop, I think. OTOH, ocf-tester says that it
  passed all tests. Somebody's lying :)

I do not know much about ocft.
I ran ocft with the -v option.
 *  I ran it against the LVM agent with the patch attached to this email 
applied.

[root@bl460g1a heartbeat]# /usr/sbin/ocft test -v LVM 
Initialing LVM...done
(snip)
Starting 'LVM' case 7 'monitor when running':
Setting agent environment:    export OCFT_pv=/var/run/resource-agents/ocft-LVM-pv
Setting agent environment:    export OCFT_vg=ocft-vg
Setting agent environment:    export OCFT_lv=ocft-lv
Setting agent environment:    export OCFT_loop=/dev/loop0
Setting agent environment:    export OCF_RESKEY_volgrpname=ocft-vg
Running agent:                ./LVM stop  ?
Running agent:                ./LVM monitor
Checking return value:        FAILED. The return value 'OCF_NOT_RUNNING' != 
'OCF_SUCCESS'. See details below:
2011/09/30_10:16:49 INFO: LVM Volume ocft-vg is offline
(snip)

In the 'monitor when running' test, monitor seems to be run after 
LVM stop has been run.

Isn't this a problem in ocft?

  When I tried by hand to stop a running VG:
  
  # OCF_RESKEY_volgrpname=$OCFT_vg /usr/lib/ocf/resource.d/heartbeat/LVM stop
  INFO: Deactivating volume group ocft-vg
  INFO: 0 logical volume(s) in volume group ocft-vg now active
  ERROR: LVM Volume ocft-vg is not available (stopped). Using volume 
  group(s) on command line
  Finding volume group ocft-vg
  ...
  # echo $?
  0
  
  The exit code is OK, but there's an error message. Further stops
  produced the same. Can you please verify this.
  
  Hence, there seems to be a problem with the ocft test case.

This was a mistake in my patch.
I have attached the revised patch.
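
As an illustration of the kind of problem reported for stop (exit code 0 but an ERROR line in the log), here is a minimal sketch of a status helper whose caller chooses the log level. It is not the attached patch; vg_status, vg_is_active and the ocf_log stub are hypothetical names used only here, and 7 is the numeric value of OCF_NOT_RUNNING.

```shell
#!/bin/sh
# Stand-ins for illustration only (hypothetical).
ocf_log() { level=$1; shift; echo "$level: $*"; }
vg_is_active() { return 1; }   # pretends the volume group is already stopped

# $1 = volume group, $2 = level to use when the VG is not available
vg_status() {
    if vg_is_active "$1"; then
        return 0
    fi
    ocf_log "${2:-err}" "LVM Volume $1 is not available (stopped)"
    return 7   # value of OCF_NOT_RUNNING
}

vg_status ocft-vg info || true   # after a stop, "not available" is expected
vg_status ocft-vg || true        # a monitor caller keeps the default err level
```

With such a split, a stop action that checks the result of its own deactivation no longer reports the expected state as an error.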

Best Regards,
Hideo Yamauchi.



--- On Fri, 2011/9/30, renayama19661...@ybb.ne.jp renayama19661...@ybb.ne.jp 
wrote:

 Hi Dejan,
 
 Thank you for your comment.
 I will check the information you provided and revise the patch again.
 
 Best Regards,
 Hideo Yamauchi.
 
 --- On Fri, 2011/9/30, Dejan Muhamedagic de...@suse.de wrote:
 
  Hi Hideo-san,
  
  On Mon, Sep 12, 2011 at 02:44:22PM +0900, renayama19661...@ybb.ne.jp wrote:
   Hi All, 
   
   We made a patch for the LVM resource agent with the following points in mind.
   
    Point 1) On an error, the LVM resource agent logs the details for the 
  system administrator.
    Point 2) The LVM resource agent uses OCF variables for its return codes.
    Point 3) The patch merges the status processing and the report_status 
  processing of the LVM resource agent.
   
    * We did not address the TODO about vgimport/vgexport in the LVM 
  resource agent.
   
   Please review this patch. 
  
  ocft test reports this:
  
  'LVM' case 7:   FAILED. Agent returns unexpected value: 'OCF_NOT_RUNNING'. 
  See details below:
  2011/09/29_17:00:49 WARNING: LVM Volume ocft-vg is not available 
  (stopped).     Using volume group(s) on command line
  Finding volume group ocft-vg
  --- Volume group ---
  VG Name               ocft-vg
  System ID
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  2
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                1
  Open LV               0
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               4.00 MiB
  PE Size               4.00 KiB
  Total PE        

Re: [Linux-ha-dev] [Patch]Patch for LVM resource agents.

2011-09-29 Thread renayama19661014
Hi Dejan,

Sorry

There was still a mistake in the patch I sent a while ago.
With that patch, the valuable detailed log output is lost.
I am sending a further revised patch.

Best Regards,
Hideo Yamauchi.


--- On Fri, 2011/9/30, renayama19661...@ybb.ne.jp renayama19661...@ybb.ne.jp 
wrote:

 Hi Dejan,
 
   ocft test reports this:
   
   'LVM' case 7:   FAILED. Agent returns unexpected value: 
   'OCF_NOT_RUNNING'. See details below:
   2011/09/29_17:00:49 WARNING: LVM Volume ocft-vg is not available 
   (stopped).     Using volume group(s) on command line
   Finding volume group ocft-vg
   --- Volume group ---
   VG Name               ocft-vg
   System ID
   Format                lvm2
   Metadata Areas        1
   Metadata Sequence No  2
   VG Access             read/write
   VG Status             resizable
   MAX LV                0
   Cur LV                1
   Open LV               0
   Max PV                0
   Cur PV                1
   Act PV                1
   VG Size               4.00 MiB
   PE Size               4.00 KiB
   Total PE              1024
   Alloc PE / Size       150 / 600.00 KiB
   Free  PE / Size       874 / 3.41 MiB
   VG UUID               csVKm6-Bzdp-s40E-9O2S-uttx-PrcW-fq6Wtz
   
   --- Logical volume ---
   LV Name                /dev/ocft-vg/ocft-lv
   VG Name                ocft-vg
   LV UUID                XjMtXj-DLzy-J8Rb-6Bfb-HNoM-7o6x-VOPnMG
   LV Write Access        read/write
   LV Status              NOT available
   LV Size                600.00 KiB
   Current LE             150
   Segments               1
   Allocation             inherit
   Read ahead sectors     auto
   
   --- Physical volumes ---
   PV Name               /dev/loop0
   PV UUID               z6deWo-42uN-HPrZ-nLC4-wrba-34IZ-N98cmL
   PV Status             allocatable
   Total PE / Free PE    1024 / 874
   
   2011/09/29_17:00:49 INFO: LVM Volume ocft-vg is offline
   
   That's for double stop, I think. OTOH, ocf-tester says that it
   passed all tests. Somebody's lying :)
 
 I do not know much about ocft.
 I ran ocft with the -v option.
  *  The LVM agent I ran it against has the patch attached to this email
 applied.
 
 [root@bl460g1a heartbeat]# /usr/sbin/ocft test -v LVM 
 Initialing LVM...done
 (snip)
 Starting 'LVM' case 7 'monitor when running':
 Setting agent environment:    export 
 OCFT_pv=/var/run/resource-agents/ocft-LVM-pv
 Setting agent environment:    export OCFT_vg=ocft-vg
 Setting agent environment:    export OCFT_lv=ocft-lv
 Setting agent environment:    export OCFT_loop=/dev/loop0
 Setting agent environment:    export OCF_RESKEY_volgrpname=ocft-vg
 Running agent:                ./LVM stop  ?
 Running agent:                ./LVM monitor
 Checking return value:        FAILED. The return value 'OCF_NOT_RUNNING' != 
 'OCF_SUCCESS'. See details below:
 2011/09/30_10:16:49 INFO: LVM Volume ocft-vg is offline
 (snip)
 
 In the 'monitor when running' test, monitor seems to be run right after
 the LVM stop has been run.
 
 Isn't this a problem in ocft?
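
The ordering seen in the verbose log can be reproduced with a small sketch. It assumes nothing about the real LVM agent: the state file and the demo_* functions are hypothetical stand-ins, and only the OCF exit code values (0 for OCF_SUCCESS, 7 for OCF_NOT_RUNNING) are taken from the OCF resource agent API. It shows why a monitor that follows a stop is expected to return OCF_NOT_RUNNING.

```shell
#!/bin/sh
# Standard OCF exit codes (values as defined by the OCF resource agent API).
OCF_SUCCESS=0
OCF_NOT_RUNNING=7

# Hypothetical resource: it counts as running while this state file exists.
STATE_FILE="${TMPDIR:-/tmp}/ocf-demo-state.$$"

demo_start() { : > "$STATE_FILE"; }
demo_stop()  { rm -f "$STATE_FILE"; }
demo_monitor() {
    if [ -f "$STATE_FILE" ]; then
        return $OCF_SUCCESS
    fi
    return $OCF_NOT_RUNNING
}

# Same order as in the verbose log: stop, then monitor.
demo_start
demo_stop
rc_after_stop=0
demo_monitor || rc_after_stop=$?

# For comparison: monitor while the resource is running.
demo_start
rc_while_running=0
demo_monitor || rc_while_running=$?
demo_stop

echo "monitor after stop:    $rc_after_stop"
echo "monitor while running: $rc_while_running"
```

With this ordering the agent itself behaves correctly, so a test case that expects OCF_SUCCESS after a stop would fail regardless of the patch.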
 
   When I tried by hand to stop a running VG:
   
   # OCF_RESKEY_volgrpname=$OCFT_vg /usr/lib/ocf/resource.d/heartbeat/LVM 
   stop
   INFO: Deactivating volume group ocft-vg
   INFO: 0 logical volume(s) in volume group ocft-vg now active
   ERROR: LVM Volume ocft-vg is not available (stopped).     Using volume 
   group(s) on command line
       Finding volume group ocft-vg
   ...
   # echo $?
   0
   
   The exit code is OK, but there's an error message. Further stops
   produced the same. Can you please verify this.
   
   Hence, there seems to be a problem with the ocft test case.
 
 This was a mistake in my patch.
 I have attached the revised patch.
 
 Best Regards,
 Hideo Yamauchi.
 
 
 
 --- On Fri, 2011/9/30, renayama19661...@ybb.ne.jp 
 renayama19661...@ybb.ne.jp wrote:
 
  Hi Dejan,
  
  Thank you for your comment.
  I will check the information you gave and revise the patch again.
  
  Best Regards,
  Hideo Yamauchi.
  
  --- On Fri, 2011/9/30, Dejan Muhamedagic de...@suse.de wrote:
  
   Hi Hideo-san,
   
   On Mon, Sep 12, 2011 at 02:44:22PM +0900, renayama19661...@ybb.ne.jp 
   wrote:
Hi All, 

We made the patch of the LVM resource agent at the next point of view.

     Point 1) The LVM resource agent outputs the details of the log at the 
   time of the error for a system administrator.
     Point 2) The LVM resource agent uses OCF variable for a return code.
     Point 3) With a patch, the LVM resource agent merge status processing 
   and report_status processing.

     * We did not revise it about TODO of vgimport/vgexport in the LVM 
   resource agent.

Please examine this patch. 
   
   ocft test reports this:
   
   'LVM' case 7:   FAILED. Agent returns unexpected value: 
   'OCF_NOT_RUNNING'. See details below:
   2011/09/29_17:00:49 WARNING: LVM Volume ocft-vg is not available 
   (stopped).     Using volume group(s) on command line
   

Re: [Linux-ha-dev] [Patch]Patch for LVM resource agents.

2011-09-29 Thread renayama19661014
Hi Dejan,

Sorry.

I sent only the message body, without the patch.
I am sending it again.

Best Regards,
Hideo Yamauchi.

--- On Fri, 2011/9/30, renayama19661...@ybb.ne.jp renayama19661...@ybb.ne.jp 
wrote:

 Hi Dejan,
 
 Sorry
 
 There was still a mistake to the patch which I sent a while ago.
 With the patch which I sent a while ago, precious detailed log is canceled.
 Furthermore, I send the patch which I revised.
 
 Best Regards,
 Hideo Yamauchi.
 
 
 --- On Fri, 2011/9/30, renayama19661...@ybb.ne.jp 
 renayama19661...@ybb.ne.jp wrote:
 
  Hi Dejan,
  
ocft test reports this:

'LVM' case 7:   FAILED. Agent returns unexpected value: 
'OCF_NOT_RUNNING'. See details below:
2011/09/29_17:00:49 WARNING: LVM Volume ocft-vg is not available 
(stopped).     Using volume group(s) on command line
Finding volume group ocft-vg
--- Volume group ---
VG Name               ocft-vg
System ID
Format                lvm2
Metadata Areas        1
Metadata Sequence No  2
VG Access             read/write
VG Status             resizable
MAX LV                0
Cur LV                1
Open LV               0
Max PV                0
Cur PV                1
Act PV                1
VG Size               4.00 MiB
PE Size               4.00 KiB
Total PE              1024
Alloc PE / Size       150 / 600.00 KiB
Free  PE / Size       874 / 3.41 MiB
VG UUID               csVKm6-Bzdp-s40E-9O2S-uttx-PrcW-fq6Wtz

--- Logical volume ---
LV Name                /dev/ocft-vg/ocft-lv
VG Name                ocft-vg
LV UUID                XjMtXj-DLzy-J8Rb-6Bfb-HNoM-7o6x-VOPnMG
LV Write Access        read/write
LV Status              NOT available
LV Size                600.00 KiB
Current LE             150
Segments               1
Allocation             inherit
Read ahead sectors     auto

--- Physical volumes ---
PV Name               /dev/loop0
PV UUID               z6deWo-42uN-HPrZ-nLC4-wrba-34IZ-N98cmL
PV Status             allocatable
Total PE / Free PE    1024 / 874

2011/09/29_17:00:49 INFO: LVM Volume ocft-vg is offline

That's for double stop, I think. OTOH, ocf-tester says that it
passed all tests. Somebody's lying :)
  
  I do not know a lot about ocft.
  I carried out ocft with -v option.
   *  It is LVM which applied the patch which I attached to this email to 
 have carried out.
  
  [root@bl460g1a heartbeat]# /usr/sbin/ocft test -v LVM 
  Initialing LVM...done
  (snip)
  Starting 'LVM' case 7 'monitor when running':
  Setting agent environment:    export 
  OCFT_pv=/var/run/resource-agents/ocft-LVM-pv
  Setting agent environment:    export OCFT_vg=ocft-vg
  Setting agent environment:    export OCFT_lv=ocft-lv
  Setting agent environment:    export OCFT_loop=/dev/loop0
  Setting agent environment:    export OCF_RESKEY_volgrpname=ocft-vg
  Running agent:                ./LVM stop  ?
  Running agent:                ./LVM monitor
  Checking return value:        FAILED. The return value 'OCF_NOT_RUNNING' != 
  'OCF_SUCCESS'. See details below:
  2011/09/30_10:16:49 INFO: LVM Volume ocft-vg is offline
  (snip)
  
  After stop of LVM was carried out on 'monitor when running' test, monitor 
  seems to be carried out.
  
  Is not it a problem of ocft?
  
When I tried by hand to stop a running VG:

# OCF_RESKEY_volgrpname=$OCFT_vg /usr/lib/ocf/resource.d/heartbeat/LVM 
stop
INFO: Deactivating volume group ocft-vg
INFO: 0 logical volume(s) in volume group ocft-vg now active
ERROR: LVM Volume ocft-vg is not available (stopped).     Using volume 
group(s) on command line
        Finding volume group ocft-vg
...
# echo $?
0

The exit code is OK, but there's an error message. Further stops
produced the same. Can you please verify this.

Hence, there seems to be a problem with the ocft test case.
  
  This was a mistake of my patch.
  I attached the patch which I revised.
  
  Best Regards,
  Hideo Yamauchi.
  
  
  
  --- On Fri, 2011/9/30, renayama19661...@ybb.ne.jp 
  renayama19661...@ybb.ne.jp wrote:
  
   Hi Dejan,
   
   Thank you for comment.
   I confirm your information and revise a patch again.
   
   Best Regards,
   Hideo Yamauchi.
   
   --- On Fri, 2011/9/30, Dejan Muhamedagic de...@suse.de wrote:
   
Hi Hideo-san,

On Mon, Sep 12, 2011 at 02:44:22PM +0900, renayama19661...@ybb.ne.jp 
wrote:
 Hi All, 
 
 We made the patch of the LVM resource agent at the next point of view.
 
  Point 1) The LVM resource agent outputs the details of the log at 
the time of the error for a system administrator.
  Point 2) The LVM resource agent uses OCF variable for a return code.
  Point 3) With a patch, the LVM resource agent merge status 
processing and report_status processing.
 
  * We did not revise it about TODO of 

Re: [Linux-ha-dev] [Patch 3]Change avoiding the stop error of the mysql resource agent.

2011-09-21 Thread renayama19661014
Hi Raoul,

 thanks for clearing this for me!
 i've committed your change:
 
 https://github.com/raoulbhatia/resource-agents/commit/d828b7f91abff87e930b11097e6543e2bdc87023
 
 thank you for your contribution!

Many thanks!!

Best Regards,
Hideo Yamauchi.



--- On Wed, 2011/9/21, Raoul Bhatia [IPAX] r.bha...@ipax.at wrote:

 hello hideo-san!
 
 On 09/21/2011 02:28 AM, renayama19661...@ybb.ne.jp wrote:
  No.
  Because it repeats with status processing, it should delete it that RA 
  checks pid file.
 
  -    if [ ! -f $OCF_RESKEY_pid ]; then
  -    ocf_log info MySQL is not running
  -        return $OCF_SUCCESS
  +    mysql_status info
  +    rc=$?
  +    if [ $rc = $OCF_NOT_RUNNING ]; then
  +       return $OCF_SUCCESS
         fi
 
 thanks for clearing this for me!
 i've committed your change:
 
 https://github.com/raoulbhatia/resource-agents/commit/d828b7f91abff87e930b11097e6543e2bdc87023
 
 thank you for your contribution!
 raoul
 
___
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/


Re: [Linux-ha-dev] [Patch 2]Change of the output level of the log of the resource agent of mysql.

2011-09-20 Thread renayama19661014
Hi Raoul,

I agree with the modifications.
Just to make sure, I will verify the behaviour of this resource agent tomorrow.

Please wait until tomorrow.

Best Regards,
Hideo Yamauchi.


--- On Tue, 2011/9/20, Raoul Bhatia [IPAX] r.bha...@ipax.at wrote:

 On 09/20/2011 02:11 AM, renayama19661...@ybb.ne.jp wrote:
  Hi Raoul,
 
  Thank you for comment.
 
  The log level of my patch is not decision at suggestion.
  We think the log level to be allowed to unify it in WARN or INFO either 
  other than ERROR.
 
  If log in status carried out by start and stop does not come to appear at 
  ERROR level, the system administrator is not confused.
 
 i've committed your patch with a slight modification in log levels.
 
 please verify:
 
 https://github.com/raoulbhatia/resource-agents/commit/65b7b4202549bc087d3759dc9636b4966e2dafd2
 
 thanks,
 raoul
 


Re: [Linux-ha-dev] [Patch 3]Change avoiding the stop error of the mysql resource agent.

2011-09-20 Thread renayama19661014
Hi Raoul,

Thank you for your comment.

 ok, but why do you omit the check whether the pidfile exists and then cat
 this very file? if you cat a non-existing file, you'll get errors.
 basically, the ra does the following:
 
  kill `cat /tmp/mysql.pid 2>/dev/null` > /dev/null; echo $?
  kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ... or kill -l [sigspec]
  1
 
 so the ra will exit with:
   ocf_log err "MySQL couldn't be stopped"
   return $OCF_ERR_GENERIC

However, the stop does not need to fail just because mysql has crashed.
A stop error can prevent failover (FO) in some cases.
 * pgsql already handles this situation in a similar way.

 shouldn't the following patch be enough
 (just adding the mysql_status check but not removing the pid check?)
 
  diff --git a/heartbeat/mysql b/heartbeat/mysql
  index e449de4..474f62e 100755
  --- a/heartbeat/mysql
  +++ b/heartbeat/mysql
  @@ -898,6 +898,12 @@ mysql_stop() {
          $CRM_MASTER -D
       fi
 
  +    mysql_status info
  +    rc=$?
  +    if [ $rc = $OCF_NOT_RUNNING ]; then
  +       return $OCF_SUCCESS
  +    fi
  +
       if [ ! -f $OCF_RESKEY_pid ]; then
          ocf_log info MySQL is not running
           return $OCF_SUCCESS

No.
Because that check duplicates the status processing, the RA's pid file check
should be removed.

-    if [ ! -f $OCF_RESKEY_pid ]; then
-        ocf_log info MySQL is not running
-        return $OCF_SUCCESS
+    mysql_status info
+    rc=$?
+    if [ $rc = $OCF_NOT_RUNNING ]; then
+       return $OCF_SUCCESS
     fi


Sorry... I may have misunderstood your comment.


Best Regards,
Hideo Yamauchi.


--- On Tue, 2011/9/20, Raoul Bhatia [IPAX] r.bha...@ipax.at wrote:

 hello hideo-san!
 
 On 09/20/2011 02:19 AM, renayama19661...@ybb.ne.jp wrote:
 ...
  -    if [ ! -f $OCF_RESKEY_pid ]; then
  -    ocf_log info MySQL is not running
  -        return $OCF_SUCCESS
  +    mysql_status info
  +    rc=$?
  +    if [ $rc = $OCF_NOT_RUNNING ]; then
  +       return $OCF_SUCCESS
         fi
 ...
 
  i'm sorry but i do not understand the problem you're addressing.
  can you please describe with different words?
 
  Sorry
 
  For example, mysql fails in stop processing and causes an error when the 
  next trouble happens.
 
  Step1 ) For example, for switch over, stop handling of Mysql begins.
  Step2 ) However, Mysql fell by process trouble just after that. The pid 
  file is left.
  Step3 ) When pid file is left by the current stop processing, an error 
  happens.
 
    * The stop processing is finished normally by checking pid file and the 
 existence of the process by this patch definitely.
    * And the switch over excess succeeds.
 
 ok, but why do you omit the check whether the pidfile exists and then cat
 this very file? if you cat a non-existing file, you'll get errors.
 basically, the ra does the following:
 
  kill `cat /tmp/mysql.pid 2>/dev/null` > /dev/null; echo $?
  kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ... or kill -l [sigspec]
  1
 
 so the ra will exit with:
           ocf_log err "MySQL couldn't be stopped"
           return $OCF_ERR_GENERIC
 
 shouldn't the following patch be enough
 (just adding the mysql_status check but not removing the pid check?)
 
  diff --git a/heartbeat/mysql b/heartbeat/mysql
  index e449de4..474f62e 100755
  --- a/heartbeat/mysql
  +++ b/heartbeat/mysql
  @@ -898,6 +898,12 @@ mysql_stop() {
          $CRM_MASTER -D
       fi
 
  +    mysql_status info
  +    rc=$?
  +    if [ $rc = $OCF_NOT_RUNNING ]; then
  +       return $OCF_SUCCESS
  +    fi
  +
       if [ ! -f $OCF_RESKEY_pid ]; then
          ocf_log info MySQL is not running
           return $OCF_SUCCESS
 
 cheers,
 raoul
 


Re: [Linux-ha-dev] [Patch 2]Change of the output level of the log of the resource agent of mysql.

2011-09-20 Thread renayama19661014
Hi Raoul,

  i've committed your patch with a slight modification in log levels.
  
  please verify:
  
  https://github.com/raoulbhatia/resource-agents/commit/65b7b4202549bc087d3759dc9636b4966e2dafd2

I verified the behaviour of the RA.
I checked the log output of the RA.

There is no problem.

Thanks,
Hideo Yamauchi.



--- On Tue, 2011/9/20, renayama19661...@ybb.ne.jp renayama19661...@ybb.ne.jp 
wrote:

 Hi Raoul,
 
 I agree to modified contents.
 I will confirm movement tomorrow just to make sure in this resource agent.
 
 Please wait until tomorrow.
 
 Best Regards,
 Hideo Yamauchi.
 
 
 --- On Tue, 2011/9/20, Raoul Bhatia [IPAX] r.bha...@ipax.at wrote:
 
  On 09/20/2011 02:11 AM, renayama19661...@ybb.ne.jp wrote:
   Hi Raoul,
  
   Thank you for comment.
  
   The log level of my patch is not decision at suggestion.
   We think the log level to be allowed to unify it in WARN or INFO either 
   other than ERROR.
  
   If log in status carried out by start and stop does not come to appear at 
   ERROR level, the system administrator is not confused.
  
  i've committed your patch with a slight modification in log levels.
  
  please verify:
  
  https://github.com/raoulbhatia/resource-agents/commit/65b7b4202549bc087d3759dc9636b4966e2dafd2
  
  thanks,
  raoul
  


Re: [Linux-ha-dev] [Patch 2]Change of the output level of the log of the resource agent of mysql.

2011-09-19 Thread renayama19661014
Hi Raoul,

Thank you for your comment.

The log levels in my patch are only a suggestion, not a final decision.
We think it is fine to unify them to either WARN or INFO, as long as it is not
ERROR.

As long as the status check run during start and stop does not log at ERROR
level, the system administrator will not be confused.
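
The idea of letting the caller choose the log level can be sketched as follows. This is only an illustration, not the code of the mysql resource agent: ocf_log is replaced by a stub that prints the level and the message, and demo_status with its pid file argument is a hypothetical stand-in for the status function.

```shell
#!/bin/sh
OCF_SUCCESS=0
OCF_NOT_RUNNING=7

# Stub for ocf_log from ocf-shellfuncs; here it only prints "level: message".
ocf_log() {
    level=$1
    shift
    echo "$level: $*"
}

# Hypothetical status check. $1 is the level used for the "not running"
# message (err when empty); $2 is the pid file to inspect.
demo_status() {
    loglevel=${1:-err}
    pidfile=$2
    if [ -f "$pidfile" ] && kill -0 "$(cat "$pidfile")" 2>/dev/null; then
        return $OCF_SUCCESS
    fi
    ocf_log "$loglevel" "MySQL is not running"
    return $OCF_NOT_RUNNING
}

missing="${TMPDIR:-/tmp}/no-such-pidfile.$$"
msg_stop=$(demo_status info "$missing") || true     # as a stop would call it
msg_default=$(demo_status "" "$missing") || true    # no level given
echo "$msg_stop"
echo "$msg_default"
```

A stop or start that expects the resource to be down can pass info or warn, so that only an unexpected "not running" result is logged at error level.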

Best Regards,
Hideo Yamauchi.



--- On Tue, 2011/9/20, Raoul Bhatia [IPAX] r.bha...@ipax.at wrote:

 hello hideo-san!
 
 i just got to review your patch and got one or two questions (see below)
 
 On 08/30/2011 10:20 AM, renayama19661...@ybb.ne.jp wrote:
  mysql.1651-2.patch
 
 
  diff -r a2d0d723bc62 heartbeat/mysql
  --- a/heartbeat/mysql    Wed Aug 31 01:32:38 2011 +0900
  +++ b/heartbeat/mysql    Wed Aug 31 01:38:08 2011 +0900
  @@ -807,7 +817,7 @@
        # Let the CRM/LRM time us out if required
        start_wait=1
        while [ $start_wait = 1 ]; do
  -    mysql_status
  +    mysql_status warn
    rc=$?
    if [ $rc = $OCF_SUCCESS ]; then
        start_wait=0
  @@ -908,7 +918,7 @@
        count=0
        while [ $count -lt $shutdown_timeout ]
        do
  -    mysql_status
  +    mysql_status info
    rc=$?
    if [ $rc = $OCF_NOT_RUNNING ]; then
        break
 
 in mysql_start() you use the warn level
 in mysql_stop() you use the info level.
 
 shouldn't these two be the same levels?
 (e.g. both warn?)
 
  @@ -918,7 +928,7 @@
    ocf_log debug "MySQL still hasn't stopped yet. Waiting..."
        done
 
  -    mysql_status
  +    mysql_status info
        if [ $? != $OCF_NOT_RUNNING ]; then
    ocf_log info "MySQL failed to stop after ${shutdown_timeout}s using SIGTERM. Trying SIGKILL..."
    /bin/kill -KILL $pid > /dev/null
 
 while reviewing the log levels, should we set this last ocf_log line
 to warn? (sorry for mixing this in here - but maybe you can comment
 on that too :) )
 
 cheers,
 raoul
 


Re: [Linux-ha-dev] [Patch 3]Change avoiding the stop error of the mysql resource agent.

2011-09-19 Thread renayama19661014
Hi Raoul,

Thank you for your comment.

  diff -r cb5f9b84cc5f heartbeat/mysql
  --- a/heartbeat/mysql    Wed Aug 31 01:38:15 2011 +0900
  +++ b/heartbeat/mysql    Wed Aug 31 01:38:55 2011 +0900
  @@ -897,9 +897,10 @@
    $CRM_MASTER -D
        fi
 
  -    if [ ! -f $OCF_RESKEY_pid ]; then
  -    ocf_log info MySQL is not running
  -        return $OCF_SUCCESS
  +    mysql_status info
  +    rc=$?
  +    if [ $rc = $OCF_NOT_RUNNING ]; then
  +       return $OCF_SUCCESS
        fi
 
        pid=`cat $OCF_RESKEY_pid 2> /dev/null `
 
 i'm sorry but i do not understand the problem you're addressing.
 can you please describe with different words?

Sorry.

For example, the stop operation of mysql fails with an error when the
following sequence occurs.

Step 1) The stop operation of MySQL begins, for example for a switchover.
Step 2) Just after that, however, MySQL dies because of a process failure. The
pid file is left behind.
Step 3) Because the pid file is left behind, the current stop processing
reports an error.

 * With this patch, stop checks both the pid file and the existence of the
process, so it finishes normally.
 * The switchover then succeeds.
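
The three steps can be reproduced with a short sketch. It is not the mysql resource agent: demo_status, stop_pidfile_only and stop_with_status are hypothetical functions that only model the two variants of the check, and the "crashed" process is a short-lived shell whose pid is written to a temporary pid file.

```shell
#!/bin/sh
OCF_SUCCESS=0
OCF_ERR_GENERIC=1
OCF_NOT_RUNNING=7

# Hypothetical status check: running only if the pid file exists AND the
# process it names is alive.
demo_status() {
    [ -f "$1" ] || return $OCF_NOT_RUNNING
    pid=$(cat "$1" 2>/dev/null)
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        return $OCF_SUCCESS
    fi
    return $OCF_NOT_RUNNING
}

# Stop that only tests for the pid file (the behaviour before the patch).
stop_pidfile_only() {
    [ -f "$1" ] || return $OCF_SUCCESS
    kill "$(cat "$1")" 2>/dev/null || return $OCF_ERR_GENERIC
    return $OCF_SUCCESS
}

# Stop that asks the status check first (the idea of the patch).
stop_with_status() {
    rc=0
    demo_status "$1" || rc=$?
    if [ "$rc" -eq "$OCF_NOT_RUNNING" ]; then
        return $OCF_SUCCESS
    fi
    kill "$(cat "$1")" 2>/dev/null || return $OCF_ERR_GENERIC
    return $OCF_SUCCESS
}

# Simulate step 2: the process has died but its pid file is left behind.
pidfile=$(mktemp)
sh -c 'exit 0' &
dead_pid=$!
wait "$dead_pid"
echo "$dead_pid" > "$pidfile"

rc_old=0
stop_pidfile_only "$pidfile" || rc_old=$?
rc_new=0
stop_with_status "$pidfile" || rc_new=$?
rm -f "$pidfile"
echo "pid-file-only stop: $rc_old, status-based stop: $rc_new"
```

The pid-file-only variant tries to signal a process that no longer exists and reports a generic error, while the status-based variant recognises the stale pid file and reports success.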

Best Regards,
Hideo Yamauchi.



--- On Tue, 2011/9/20, Raoul Bhatia [IPAX] r.bha...@ipax.at wrote:

 hello hideo-san!
 
 On 08/30/2011 10:20 AM, renayama19661...@ybb.ne.jp wrote:
  Hi,
 
  When a process of mysql falls just after the check of the pid file of 
  monitor, the mysql resource agent causes an error by a stop.
 
  This is caused by the fact that a resource agent checks pid of the mysql 
  process that fell at the time of a stop.
  The resource agent should check the effectiveness of the pid file before a 
  check again.
 
 ...
  diff -r cb5f9b84cc5f heartbeat/mysql
  --- a/heartbeat/mysql    Wed Aug 31 01:38:15 2011 +0900
  +++ b/heartbeat/mysql    Wed Aug 31 01:38:55 2011 +0900
  @@ -897,9 +897,10 @@
    $CRM_MASTER -D
        fi
 
  -    if [ ! -f $OCF_RESKEY_pid ]; then
  -    ocf_log info MySQL is not running
  -        return $OCF_SUCCESS
  +    mysql_status info
  +    rc=$?
  +    if [ $rc = $OCF_NOT_RUNNING ]; then
  +       return $OCF_SUCCESS
        fi
 
        pid=`cat $OCF_RESKEY_pid 2> /dev/null `
 
 i'm sorry but i do not understand the problem you're addressing.
 can you please describe with different words?
 
 thanks,
 raoul
 


Re: [Linux-ha-dev] [Patch 2]Change of the output level of the log of the resource agent of mysql.

2011-09-11 Thread renayama19661014
Hi Raoul,

What do you think about the patch for this issue?

Best Regards,
Hideo Yamauchi.

--- On Tue, 2011/8/30, renayama19661...@ybb.ne.jp renayama19661...@ybb.ne.jp 
wrote:

 Hi,
 
 The mysql resource agent logs an error every time on probe, start, and stop.
 
 
  Aug 11 13:38:31 ib01 mysql[15764]: ERROR: MySQL is not running
 
 
 When the resource is simply not running, the resource agent should log this
 at a different level.
 
 Otherwise the operator is confused by the error.
 I changed the log level, following the example of other resource agents.
 
 I am sending a patch.
 
 Please review this patch.
 
 Best Regards,
 Hideo Yamauchi.


Re: [Linux-ha-dev] [Patch 3]Change avoiding the stop error of the mysql resource agent.

2011-09-11 Thread renayama19661014
Hi Raoul,

What do you think about the patch for this issue?

Best Regards,
Hideo Yamauchi.

--- On Tue, 2011/8/30, renayama19661...@ybb.ne.jp renayama19661...@ybb.ne.jp 
wrote:

 Hi,
 
 When the mysql process dies just after monitor has checked the pid file, the
 mysql resource agent fails with an error on stop.
 
 This happens because, at stop time, the resource agent uses the pid of the
 mysql process that has already died.
 The resource agent should check again that the pid file is still valid before
 using it.
 
 I am sending a patch.
 
 Please review this patch.
 
 
 Best Regards,
 Hideo Yamauchi.


[Linux-ha-dev] [Patch]Patch for LVM resource agents.

2011-09-11 Thread renayama19661014
Hi All, 

We made a patch for the LVM resource agent with the following points in mind.

 Point 1) On error, the LVM resource agent logs the details for the system
administrator.
 Point 2) The LVM resource agent uses the OCF variables for its return codes.
 Point 3) The patch merges the status processing and the report_status
processing of the LVM resource agent.

 * We did not address the vgimport/vgexport TODO in the LVM resource agent.
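
Points 2 and 3 can be illustrated with a minimal sketch. It is not the patch itself: fake_vgdisplay prints canned text in place of the real vgdisplay, ocf_log is a stub, and demo_lvm_status is a hypothetical name. With only a volume group name it returns an OCF code; with a second argument it additionally reports the access mode.

```shell
#!/bin/sh
OCF_SUCCESS=0
OCF_NOT_RUNNING=7

# Stub for ocf_log from ocf-shellfuncs; prints "level: message".
ocf_log() { level=$1; shift; echo "$level: $*"; }

# Hypothetical stand-in for `vgdisplay -v <vg>`; prints canned output so the
# sketch runs without LVM. Only "activevg" is reported as available.
fake_vgdisplay() {
    echo "  VG Access             read/write"
    if [ "$1" = "activevg" ]; then
        echo "  LV Status              available"
    else
        echo "  LV Status              NOT available"
    fi
}

# Merged check: $1 is the volume group, $2 (optional) requests a report.
demo_lvm_status() {
    vg=$1
    vgout=$(fake_vgdisplay "$vg")
    if ! echo "$vgout" | grep -i 'Status[[:space:]]*available' >/dev/null; then
        ocf_log warn "LVM Volume $vg is not available (stopped)"
        return $OCF_NOT_RUNNING
    fi
    if [ -z "${2:-}" ]; then
        return $OCF_SUCCESS
    fi
    if echo "$vgout" | grep -i 'Access.*read/write' >/dev/null; then
        ocf_log debug "Volume $vg is available read/write (running)"
    else
        ocf_log debug "Volume $vg is available read-only (running)"
    fi
    return $OCF_SUCCESS
}

report=$(demo_lvm_status activevg status)
rc_stopped=0
demo_lvm_status stoppedvg >/dev/null || rc_stopped=$?
echo "$report"
echo "stopped VG returns $rc_stopped"
```

Returning the named OCF variables instead of bare 0 and 1 keeps the callers readable and lets monitor pass the result through unchanged.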

Please examine this patch. 

Best Regards,
Hideo Yamauchi.

diff -r fc1e82852f7a heartbeat/LVM
--- a/heartbeat/LVM Wed Aug 31 01:39:02 2011 +0900
+++ b/heartbeat/LVM Mon Sep 12 14:29:36 2011 +0900
@@ -123,22 +123,17 @@
 #  Return LVM status (silently)
 #
 LVM_status() {
-  if 
-[ $LVM_MAJOR -eq 1 ]
-  then
-   vgdisplay $1 2>&1 | grep -i 'Status.*available' 2>&1 >/dev/null
-   return $?
-  else
-   vgdisplay -v $1 2>&1 | grep -i 'Status[ \t]*available' 2>&1 >/dev/null
-   return $?
+  local rc
+  loglevel=debug
+
+  # Set the log level of the error message
+  if [ X${2} == X ]; then
+   loglevel=err
+   if ocf_is_probe; then
+ loglevel=warn
+   fi
   fi
-}
-
-#
-#  Report on LVM volume status to stdout...
-#
-LVM_report_status() {
-
+  
   if 
 [ $LVM_MAJOR -eq 1 ]
   then
@@ -150,16 +145,16 @@
echo $VGOUT | grep -i 'Status[ \t]*available' >/dev/null
rc=$?
   fi
-
-  if
-[ $rc -eq 0 ]
-  then
-: Volume $1 is available
-  else
-ocf_log debug LVM Volume $1 is not available (stopped)
-return $OCF_NOT_RUNNING
+  if [ $rc -ne 0 ]; then
+   ocf_log $loglevel LVM Volume $1 is not available (stopped). ${VGOUT}
+  fi
+  
+  if [ X${2} == X ]; then
+   # status call return
+   return $rc
   fi
 
+  # Report on LVM volume status to stdout...
   if
 echo $VGOUT | grep -i 'Access.*read/write' >/dev/null
   then
@@ -167,8 +162,9 @@
   else
 ocf_log debug Volume $1 is available read-only (running)
   fi
-  
+ 
   return $OCF_SUCCESS
+
 }
 
 #
@@ -176,6 +172,7 @@
 #
 #
 LVM_monitor() {
+  local rc
   if
 LVM_status $1
   then
@@ -185,9 +182,14 @@
 return $OCF_NOT_RUNNING
   fi
 
-  vgck $1 >/dev/null 2>&1
+  VGOUT=`vgck $1 2>&1`
+  rc=$?
+  if [ $rc -ne 0 ]; then
+ocf_log err LVM Volume $1 is not found. ${VGOUT}:${rc}
+return $OCF_ERR_GENERIC
+  fi
 
-  return $?
+  return $OCF_SUCCESS
 }
 
 #
@@ -232,10 +234,10 @@
 
   vgdisplay $1 2>&1 | grep 'Volume group .* not found' >/dev/null && {
 ocf_log info Volume group $1 not found
-return 0
+return $OCF_SUCCESS
   }
   ocf_log info Deactivating volume group $1
-  ocf_run vgchange -a ln $1 || return 1
+  ocf_run vgchange -a ln $1 || return $OCF_ERR_GENERIC
 
   if
 LVM_status $1
@@ -256,10 +258,10 @@
   check_binary $AWK
 
 #  Off-the-shelf tests...  
-  vgck $VOLUME >/dev/null 2>&1
+  VGOUT=`vgck ${VOLUME} 2>&1`
   
   if [ $? -ne 0 ]; then
-   ocf_log err Volume group [$VOLUME] does not exist or contains error!
+   ocf_log err Volume group [$VOLUME] does not exist or contains error! ${VGOUT}
exit $OCF_ERR_GENERIC
   fi
 
@@ -267,13 +269,13 @@
   if 
 [ $LVM_MAJOR -eq 1 ]
   then
-   vgdisplay $VOLUME >/dev/null 2>&1
+   VGOUT=`vgdisplay ${VOLUME} 2>&1`
   else
-   vgdisplay -v $VOLUME >/dev/null 2>&1
+   VGOUT=`vgdisplay -v ${VOLUME} 2>&1`
   fi
 
   if [ $? -ne 0 ]; then
-   ocf_log err Volume group [$VOLUME] does not exist or contains error!
+   ocf_log err Volume group [$VOLUME] does not exist or contains error! ${VGOUT}
exit $OCF_ERR_GENERIC
   fi
 
@@ -350,7 +352,7 @@
   stop)LVM_stop $VOLUME
exit $?;;
 
-  status)  LVM_report_status $VOLUME
+  status)  LVM_status $VOLUME $1
exit $?;;
 
   monitor) LVM_monitor $VOLUME


Re: [Linux-ha-dev] Postfix status

2011-09-08 Thread renayama19661014
Hi Raoul,
Hi Florian,

Thank you for updating the repository.

Best Regards,
Hideo Yamauchi.

--- On Thu, 2011/9/8, Florian Haas f.g.h...@gmx.net wrote:

 On 09/08/11 10:34, Raoul Bhatia [IPAX] wrote:
  On 09/08/2011 04:49 AM, renayama19661...@ybb.ne.jp wrote:
    do not apply a patch even if you apply this patch, there is not the big 
 problem.
  I am lacking in my explanation, and I'm sorry.
  
  ok. i just updated my pull request.
  
  https://github.com/ClusterLabs/resource-agents/pull/20
  
  dejan, can you please review and apply our patches?
 
 Taking the liberty to step in for Dejan, I've merged and pushed your
 changes. Thanks for your contribution!
 
 Cheers,
 Florian
 
 


Re: [Linux-ha-dev] Postfix status

2011-09-07 Thread renayama19661014
Hi Raoul,

 does it hurt if we leave this patch in? i do not see any problem with
 that code.

Whether or not you apply this patch, there is no big problem.
I am sorry that my explanation was insufficient.

Best Regards,
Hideo Yamauchi.


--- On Wed, 2011/9/7, Raoul Bhatia [IPAX] r.bha...@ipax.at wrote:

 hi Hideo-san!
 
 On 09/07/2011 01:50 AM, renayama19661...@ybb.ne.jp wrote:
  However, my patch made a mistake.
      I do not seem to get the result of postfix status.
      It is necessary to watch log of postfix in the details of the 
  problem after all.
    
      Therefore, I withdraw the patch of the part of postfix status.
    
      diff -r 19c97e0021f0 postfix
      --- a/postfix   Thu Jun 16 21:45:53 2011 +0900
      +++ b/postfix   Thu Jun 16 21:46:01 2011 +0900
      @@ -98,12 +98,8 @@
         postfix_running() {
             # run Postfix status if available
             if ocf_is_true $status_support; then
      -        output=`$binary $OPTION_CONFIG_DIR status 2>&1`
      -        ret=$?
      -        if [ $ret -ne 0 ]; then
      -            ocf_log err Postfix status: '$output'. $ret
      -        fi
      -        return $ret
      +        $binary $OPTION_CONFIG_DIR status 2>&1
      +        return $?
             fi
    
             # manually check Postfix's pid
 [...]
  I thought that output could acquire the details of the problem of postfix 
  status with a former patch.
  And I thought the output of the details of the problem to be useful for an 
  operator.
  However, the details of the problem only were really reflected on log of 
  postfix in the environment that I tried.
 
  Therefore I want to withdraw the suggestion of the patch of this part.
 
 does it hurt if we leave this patch in? i do not see any problem with
 that code.
 
 thanks,
 raoul
 


Re: [Linux-ha-dev] [Patch 1]Change of the monitor log of the resource agent of mysql.

2011-09-06 Thread renayama19661014
Hi Raoul,

All right.

Thanks!!

Hideo Yamauchi. 

--- On Wed, 2011/9/7, Raoul Bhatia [IPAX] r.bha...@ipax.at wrote:

 On 08/30/2011 10:19 AM, renayama19661...@ybb.ne.jp wrote:
  Hi,
 
  The log that the mysql resource agent writes on every monitor is very
  noisy.
 
   Aug 11 13:40:01 ib01 mysql[18164]: INFO: COUNT(*) 4
   Aug 11 13:40:01 ib01 mysql[18164]: INFO: MySQL monitor succeeded
     -  repeated for every monitor.
 
  I suggest the following patch.
 
    * Add the -q option to ocf_run.
    * Change the monitor completion log to debug level.
 
  I am sending a patch.
 
  Please review this patch.
 
 ack. i applied it to my mysql branch:
 
 https://github.com/raoulbhatia/resource-agents/commits/mysql
 
 thanks,
 raoul
 


Re: [Linux-ha-dev] [Patch]Mistake of the table name variable.

2011-09-06 Thread renayama19661014
Hi Raoul,

All right.

Thanks!!

Hideo Yamauchi. 


--- On Wed, 2011/9/7, Raoul Bhatia [IPAX] r.bha...@ipax.at wrote:

 On 08/30/2011 08:58 AM, renayama19661...@ybb.ne.jp wrote:
  Hi,
 
  I am contributing a patch that fixes a wrong variable in the mysql resource
  agent.
 
 ack. i applied it to my mysql branch:
 
 https://github.com/raoulbhatia/resource-agents/commits/mysql
 
 thanks,
 raoul
 


Re: [Linux-ha-dev] Postfix status

2011-09-06 Thread renayama19661014
Hi Raoul,

 thanks for testing my ra. i'll check the ra and will then issue a
 pull request.

Okay.
We hope that the fix will be included in the next release of the resource agents.

 
  However, my patch made a mistake.
  I do not seem to get the result of postfix status.
  It is necessary to watch log of postfix in the details of the problem after 
  all.
 
  Therefore, I withdraw the patch of the part of postfix status.
 
  diff -r 19c97e0021f0 postfix
  --- a/postfix   Thu Jun 16 21:45:53 2011 +0900
  +++ b/postfix   Thu Jun 16 21:46:01 2011 +0900
  @@ -98,12 +98,8 @@
postfix_running() {
# run Postfix status if available
if ocf_is_true $status_support; then
  -        output=`$binary $OPTION_CONFIG_DIR status 2>&1`
  -        ret=$?
  -        if [ $ret -ne 0 ]; then
  -            ocf_log err Postfix status: '$output'. $ret
  -        fi
  -        return $ret
  +        $binary $OPTION_CONFIG_DIR status 2>&1
  +        return $?
fi
 
# manually check Postfix's pid
 
 it's been a while since i looked into the code.
 
 why do you want to issue postfix status if /usr/sbin/postfix
 does not support this command?

With the earlier patch, I expected the captured output of postfix status to 
contain the details of the problem, and I thought that logging those details 
would be useful for an operator.
However, in the environment I tested, the details were written only to the 
postfix log.

Therefore I want to withdraw this part of the patch.

Best Regards,
Hideo Yamauchi.

--- On Wed, 2011/9/7, Raoul Bhatia [IPAX] r.bha...@ipax.at wrote:

 On 06/16/2011 05:48 AM, renayama19661...@ybb.ne.jp wrote:
  The postfix ra worked well.
 
 thanks for testing my ra. i'll check the ra and will then issue a
 pull request.
 
  However, my patch made a mistake.
  I do not seem to get the result of postfix status.
  It is necessary to watch log of postfix in the details of the problem after 
  all.
 
  Therefore, I withdraw the patch of the part of postfix status.
 
  diff -r 19c97e0021f0 postfix
  --- a/postfix   Thu Jun 16 21:45:53 2011 +0900
  +++ b/postfix   Thu Jun 16 21:46:01 2011 +0900
  @@ -98,12 +98,8 @@
    postfix_running() {
        # run Postfix status if available
        if ocf_is_true $status_support; then
  -        output=`$binary $OPTION_CONFIG_DIR status 2>&1`
  -        ret=$?
  -        if [ $ret -ne 0 ]; then
  -            ocf_log err "Postfix status: '$output'. $ret"
  -        fi
  -        return $ret
  +        $binary $OPTION_CONFIG_DIR status 2>&1
  +        return $?
        fi
 
        # manually check Postfix's pid
 
 it's been a while since i looked into the code.
 
 why do you want to issue postfix status if /usr/sbin/postfix
 does not support this command?
 
 thanks,
 raoul
 
___
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/


[Linux-ha-dev] [Patch]Mistake of the table name variable.

2011-08-30 Thread renayama19661014
Hi, 

I am contributing a patch that fixes a mistake in the table name variable of 
the mysql resource agent.

Best Regards,
Hideo Yamauchi.

mysql.1662.patch
Description: Binary data
___
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/


[Linux-ha-dev] [Patch 1]Change of the monitor log of the resource agent of mysql.

2011-08-30 Thread renayama19661014
Hi,

The mysql resource agent writes very noisy log output on every monitor 
operation.

 Aug 11 13:40:01 ib01 mysql[18164]: INFO: COUNT(*) 4
 Aug 11 13:40:01 ib01 mysql[18164]: INFO: MySQL monitor succeeded
  - these lines repeat on every monitor.

I suggest the following changes.

 * Add the -q option to ocf_run.
 * Change the log of the monitor completion to debug level.
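
The effect of these two changes can be sketched as follows. This is only an 
illustration under assumptions: run_quiet, log_msg and monitor are 
hypothetical stand-ins written for this example, not functions from 
ocf-shellfuncs, and the real patch relies on ocf_run -q and ocf_log instead.

```shell
#!/bin/sh
# Hypothetical logger: prints "level: message".
# Debug messages are shown only when DEBUG=1.
log_msg() {
    level=$1; shift
    if [ "$level" = "debug" ] && [ "${DEBUG:-0}" != "1" ]; then
        return 0
    fi
    echo "$level: $*"
}

# Hypothetical helper: run a command and log its output only on failure.
run_quiet() {
    out=$("$@" 2>&1) && rc=0 || rc=$?
    if [ "$rc" -ne 0 ]; then
        log_msg err "command failed ($rc): $out"
    fi
    return "$rc"
}

# Hypothetical monitor: silent on success unless debugging is enabled.
monitor() {
    run_quiet "$@" || return 1
    log_msg debug "monitor succeeded"
    return 0
}
```

With this shape a healthy resource produces no log lines on each monitor, 
while a failing check still logs the command output at error level.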

I am attaching a patch.

Please review it.

Best Regards,
Hideo Yamauchi.

mysql.1651-1.patch
Description: Binary data
___
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/


[Linux-ha-dev] [Patch 2]Change of the output level of the log of the resource agent of mysql.

2011-08-30 Thread renayama19661014
Hi,

The mysql resource agent outputs an error log every time on probe, start, and stop.


 Aug 11 13:38:31 ib01 mysql[15764]: ERROR: MySQL is not running


When the resource has simply not been started, the resource agent should 
output this message at a different log level.

Otherwise the operator is confused by the error.
I changed the log level, modelled on other resource agents.
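
One way to picture the change is a helper that picks the severity from the 
action being run. This is a minimal sketch, not the contents of the attached 
patch: not_running_level is a hypothetical name, and treating a monitor with 
interval 0 as a probe is an assumption made for the example.

```shell
#!/bin/sh
# Hypothetical helper: pick the severity of the "not running" message.
# $1 = action name, $2 = monitor interval (0 is treated as a probe here).
not_running_level() {
    action=$1
    interval=${2:-0}
    if [ "$action" = "monitor" ] && [ "$interval" -ne 0 ]; then
        echo err     # a resource that should be running has gone away
    else
        echo info    # probe, start, stop: "not running" is an expected state
    fi
}

# Example: prints "info: MySQL is not running" for a stop action.
echo "$(not_running_level stop): MySQL is not running"
```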

I am attaching a patch.

Please review it.

Best Regards,
Hideo Yamauchi.

mysql.1651-2.patch
Description: Binary data
___
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/


[Linux-ha-dev] [Patch 3]Change avoiding the stop error of the mysql resource agent.

2011-08-30 Thread renayama19661014
Hi,

When the mysql process dies just after monitor has checked the pid file, the 
mysql resource agent fails with an error on stop.

This happens because, at stop time, the resource agent checks the pid of the 
mysql process that has already died.
The resource agent should verify that the pid file is still valid before 
checking the pid again.
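
The check can be sketched as follows. This is an illustration only, assuming 
hypothetical helpers pidfile_is_valid and stop_service; it is not the code of 
the attached patch, and the real agent has its own stop logic.

```shell
#!/bin/sh
# Returns 0 if the pid file exists and names a live process, 1 otherwise.
pidfile_is_valid() {
    pidfile=$1
    [ -f "$pidfile" ] || return 1
    pid=$(head -n 1 "$pidfile" 2>/dev/null)
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;   # empty or not a number: stale or corrupt
    esac
    kill -s 0 "$pid" 2>/dev/null   # signal 0 only tests for existence
}

# Hypothetical stop action: a stale pid file means there is nothing to stop,
# so stop reports success instead of an error.
stop_service() {
    if ! pidfile_is_valid "$1"; then
        echo "already stopped"
        return 0
    fi
    kill "$(head -n 1 "$1")"
}
```

Treating a stale pid file as "already stopped" matters because a failed stop 
is usually handled much more severely by the cluster than a failed monitor.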

I am attaching a patch.

Please review it.


Best Regards,
Hideo Yamauchi.

mysql.1651-3.patch
Description: Binary data
___
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/


[Linux-ha-dev] [Repost][Patch]The revision of noize log of the iSCSITarget resource.

2011-07-11 Thread renayama19661014
Hi,

I have reworked my earlier patch for resource agents 3.9.2 and am contributing 
it again.

With the patch, the check of the portals parameter is performed before the 
default value is set.

As a result, when users run tgt and have not set the portals parameter, the 
iSCSITarget RA no longer outputs the warning log.
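
Why the order matters can be sketched as follows. This is a simplified 
illustration: the validate function here is a hypothetical stand-in for the 
agent's real validation, and implementation=tgt is assumed to be a backend 
that ignores portals.

```shell
#!/bin/sh
implementation=tgt   # assumed: an implementation that ignores "portals"

# Hypothetical stand-in for the validate step: warn when portals has a value
# but the implementation does not support it.
validate() {
    if [ "$implementation" = "tgt" ] && [ -n "${OCF_RESKEY_portals:-}" ]; then
        echo "WARNING: portals is not supported and will be ignored"
    fi
}

# Old order: the default is assigned first, so validation always sees a value
# and warns even though the user configured nothing.
old_order() {
    unset OCF_RESKEY_portals
    : ${OCF_RESKEY_portals=0.0.0.0:3260}
    validate
}

# Patched order: validate what the user configured, then apply the default.
new_order() {
    unset OCF_RESKEY_portals
    validate
    : ${OCF_RESKEY_portals=0.0.0.0:3260}
}
```

The ": ${VAR=default}" idiom assigns only when the variable is unset, so 
after it runs the agent can no longer tell a user-supplied value from the 
default.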


With the patch, the following noisy log messages are no longer output.

Jul 12 19:15:54 srv01 iSCSITarget[13839]: WARNING: Configuration parameter 
portals is not supported by the iSCSI implementation and will be ignored.
Jul 12 19:16:04 srv01 iSCSITarget[13957]: WARNING: Configuration parameter 
portals is not supported by the iSCSI implementation and will be ignored.
Jul 12 19:16:14 srv01 iSCSITarget[14060]: WARNING: Configuration parameter 
portals is not supported by the iSCSI implementation and will be ignored.
Jul 12 19:16:25 srv01 iSCSITarget[14189]: WARNING: Configuration parameter 
portals is not supported by the iSCSI implementation and will be ignored.

* The earlier email is available here:
 * http://www.gossamer-threads.com/lists/linuxha/dev/70274

Please confirm the contents of the patch.

Best Regards,
Hideo Yamauchi.

diff -r f4df06073f4d iSCSITarget
--- a/iSCSITarget   Tue Jul 12 19:37:14 2011 +0900
+++ b/iSCSITarget   Tue Jul 12 19:37:40 2011 +0900
@@ -42,9 +42,6 @@
 fi
 : ${OCF_RESKEY_implementation=${OCF_RESKEY_implementation_default}}
 
-# Listen on 0.0.0.0:3260 by default
-OCF_RESKEY_portals_default=0.0.0.0:3260
-: ${OCF_RESKEY_portals=${OCF_RESKEY_portals_default}}
 
 # Lockfile, used for selecting a target ID
 LOCKFILE=${HA_RSCTMP}/iSCSITarget-${OCF_RESKEY_implementation}.lock
@@ -552,6 +549,7 @@
 
 case $1 in
   meta-data)
+   OCF_RESKEY_portals_default=0.0.0.0:3260
        meta_data
        exit $OCF_SUCCESS
        ;;
@@ -564,6 +562,10 @@
 # Everything except usage and meta-data must pass the validate test
 iSCSITarget_validate
 
+# Listen on 0.0.0.0:3260 by default
+OCF_RESKEY_portals_default=0.0.0.0:3260
+: ${OCF_RESKEY_portals=${OCF_RESKEY_portals_default}}
+
 case $__OCF_ACTION in
 start) iSCSITarget_start;;
 stop)  iSCSITarget_stop;;
___
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/


[Linux-ha-dev] Prototypic declaration is insufficient.

2011-06-16 Thread renayama19661014
Hi all,

The tip of the glue source does not compile for me because a prototype 
declaration is missing.

diff -r 7d9a54d5da6c main.c
--- a/main.c   Fri Jun 17 18:34:21 2011 +0900
+++ b/main.c   Fri Jun 17 18:34:55 2011 +0900
@@ -78,6 +78,7 @@
 void log_buf(int severity, char *buf);
 void log_msg(int severity, const char * fmt, ...)G_GNUC_PRINTF(2,3);
 void trans_log(int priority, const char * fmt, ...)G_GNUC_PRINTF(2,3);
+void setup_cl_log(void);
 
 static int pil_loglevel_to_syslog_severity[] = {
/* Indices: none=0, PIL_FATAL=1, PIL_CRIT=2, PIL_WARN=3,


Best Regards,
Hideo Yamauchi.

___
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/


Re: [Linux-ha-dev] Postfix status

2011-06-15 Thread renayama19661014
Hi Raoul,

I'm sorry.
My English was poor, and it confused you.

 please refetch one last time from
 https://github.com/raoulbhatia/resource-agents/blob/master/heartbeat/postfix
 
 i think i got the probing issue fixed!

I will check the behaviour and report the result tomorrow.

Best Regards,
Hideo Yamauchi

--- On Wed, 2011/6/15, Raoul Bhatia [IPAX] r.bha...@ipax.at wrote:

 Hi Hideo-san!
 
 On 06/15/2011 10:53 AM, renayama19661...@ybb.ne.jp wrote:
  Hi Raoul,
  
  Thank you for comment.
   
  please test the postfix ra from my repository:
  https://github.com/raoulbhatia/resource-agents/blob/master/heartbeat/postfix
 
  there is a minor issue regarding probes and a resulting double start,
  which is left to be resolved. no other issues in my production
  environment so far.
 
  so i'd be glad if you could give it a shot!
  
  All right.
  
  I confirm movement in postfix which you showed.
 
 i'm sorry but i do not understand what you mean by that.
 can you please rephrase that?
 
 
  Because our environment is RHEL, I report a test result on RHEL5 and RHEL6.
 perfect!
 
 please refetch one last time from
 https://github.com/raoulbhatia/resource-agents/blob/master/heartbeat/postfix
 
 i think i got the probing issue fixed!
 
 thanks,
 raoul
 -- 
 
 DI (FH) Raoul Bhatia M.Sc.          email.          r.bha...@ipax.at
 Technischer Leiter
 
 IPAX - Aloy Bhatia Hava OG          web.          http://www.ipax.at
 Barawitzkagasse 10/2/2/11           email.            off...@ipax.at
 1190 Wien                           tel.               +43 1 3670030
 FN 277995t HG Wien                  fax.            +43 1 3670030 15
 
 
___
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/


Re: [Linux-ha-dev] Postfix status

2011-06-15 Thread renayama19661014
Hi Raoul,

I checked the behaviour of the postfix ra in the following environments.
 
 * RHEL5 - postfix 2.3.3
 * RHEL6 - postfix 2.6.6

The postfix ra worked well.

However, my patch contained a mistake.
I cannot get the details of the result from postfix status.
To see the details of a problem, it is necessary to look at the postfix log 
after all.

Therefore, I withdraw the postfix status part of the patch.

diff -r 19c97e0021f0 postfix
--- a/postfix   Thu Jun 16 21:45:53 2011 +0900
+++ b/postfix   Thu Jun 16 21:46:01 2011 +0900
@@ -98,12 +98,8 @@
 postfix_running() {
     # run Postfix status if available
     if ocf_is_true $status_support; then
-        output=`$binary $OPTION_CONFIG_DIR status 2>&1`
-        ret=$?
-        if [ $ret -ne 0 ]; then
-            ocf_log err "Postfix status: '$output'. $ret"
-        fi
-        return $ret
+        $binary $OPTION_CONFIG_DIR status 2>&1
+        return $?
     fi
 
     # manually check Postfix's pid


Best Regards,
Hideo Yamauchi.


--- On Wed, 2011/6/15, renayama19661...@ybb.ne.jp renayama19661...@ybb.ne.jp 
wrote:

 Hi Raoul,
 
 I'm sorry.
 I was weak in English, and it confused you.
 
  please refetch one last time from
  https://github.com/raoulbhatia/resource-agents/blob/master/heartbeat/postfix
  
  i think i got the probing issue fixed!
 
 I confirm movement and will inform it of a result tomorrow.
 
 Best Regards,
 Hideo Yamauchi
 
 --- On Wed, 2011/6/15, Raoul Bhatia [IPAX] r.bha...@ipax.at wrote:
 
  Hi Hideo-san!
  
  On 06/15/2011 10:53 AM, renayama19661...@ybb.ne.jp wrote:
   Hi Raoul,
   
   Thank you for comment.
    
   please test the postfix ra from my repository:
   https://github.com/raoulbhatia/resource-agents/blob/master/heartbeat/postfix
  
   there is a minor issue regarding probes and a resulting double start,
   which is left to be resolved. no other issues in my production
   environment so far.
  
   so i'd be glad if you could give it a shot!
   
   All right.
   
   I confirm movement in postfix which you showed.
  
  i'm sorry but i do not understand what you mean by that.
  can you please rephrase that?
  
  
   Because our environment is RHEL, I report a test result on RHEL5 and 
   RHEL6.
  perfect!
  
  please refetch one last time from
  https://github.com/raoulbhatia/resource-agents/blob/master/heartbeat/postfix
  
  i think i got the probing issue fixed!
  
  thanks,
  raoul
  -- 
  
  DI (FH) Raoul Bhatia M.Sc.          email.          r.bha...@ipax.at
  Technischer Leiter
  
  IPAX - Aloy Bhatia Hava OG          web.          http://www.ipax.at
  Barawitzkagasse 10/2/2/11           email.            off...@ipax.at
  1190 Wien                           tel.               +43 1 3670030
  FN 277995t HG Wien                  fax.            +43 1 3670030 15
  
  
 
___
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/


Re: [Linux-ha-dev] Postfix status

2011-06-14 Thread renayama19661014
Hi Raoul,

  to my knowledge, the ra's output is logged by pacemaker.
  moreover, postfix logs to the mail facility itself.
  
  what are the reasons for separately capturing and logging
  all output?
 
 When a problem occurred, the output of detailed log helps an operator.
 In addition, pacemaker can give only the log that ra output in std.

My third patch was wrong.
The postfix log by itself gives an administrator enough information.

Please discard my third patch.

Best Regards,
Hideo Yamauchi.



--- On Thu, 2011/6/9, renayama19661...@ybb.ne.jp renayama19661...@ybb.ne.jp 
wrote:

 Hi Raoul,
 
 Thank you for the merge of the patch.
 
  to my knowledge, the ra's output is logged by pacemaker.
  moreover, postfix logs to the mail facility itself.
  
  what are the reasons for separately capturing and logging
  all output?
 
 When a problem occurred, the output of detailed log helps an operator.
 In addition, pacemaker can give only the log that ra output in std.
 
 
 Best Regards,
 Hideo Yamauchi.
 
  
 
 --- On Thu, 2011/6/9, Raoul Bhatia [IPAX] r.bha...@ipax.at wrote:
 
  On 07.06.2011 04:40, renayama19661...@ybb.ne.jp wrote:
   Hi All,
  
   I contribute my last patch.(patch3)
   This is a patch for the sources which applied patch 1.
   It is the patch which output the details of the error in log.
  
  hi!
  
  to my knowledge, the ra's output is logged by pacemaker.
  moreover, postfix logs to the mail facility itself.
  
  what are the reasons for separately capturing and logging
  all output?
  
  (mainly patch3)
  
  thanks,
  raoul
 
 
___
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/


Re: [Linux-ha-dev] Postfix status

2011-06-08 Thread renayama19661014
Hi Raoul,

Thank you for the merge of the patch.

 to my knowledge, the ra's output is logged by pacemaker.
 moreover, postfix logs to the mail facility itself.
 
 what are the reasons for separately capturing and logging
 all output?

When a problem occurs, detailed log output helps the operator.
In addition, pacemaker can log only what the ra writes to its standard output.


Best Regards,
Hideo Yamauchi.

 

--- On Thu, 2011/6/9, Raoul Bhatia [IPAX] r.bha...@ipax.at wrote:

 On 07.06.2011 04:40, renayama19661...@ybb.ne.jp wrote:
  Hi All,
 
  I contribute my last patch.(patch3)
  This is a patch for the sources which applied patch 1.
  It is the patch which output the details of the error in log.
 
 hi!
 
 to my knowledge, the ra's output is logged by pacemaker.
 moreover, postfix logs to the mail facility itself.
 
 what are the reasons for separately capturing and logging
 all output?
 
 (mainly patch3)
 
 thanks,
 raoul
 
___
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/


Re: [Linux-ha-dev] Postfix status (was Re: state of heartbeat resource agents)

2011-06-06 Thread renayama19661014
Hi Raoul,

 Hideo-san, i updated your postfix.patch2 the way i would improve it.
 any objections?

No.
Thanks!

Best Regards,
Hideo Yamauchi.

--- On Mon, 2011/6/6, Raoul Bhatia [IPAX] r.bha...@ipax.at wrote:

 Hideo-san, i updated your postfix.patch2 the way i would improve it.
 any objections?
 
 cheers,
 raoul
 -- 
 
 DI (FH) Raoul Bhatia M.Sc.          email.          r.bha...@ipax.at
 Technischer Leiter
 
 IPAX - Aloy Bhatia Hava OG          web.          http://www.ipax.at
 Barawitzkagasse 10/2/2/11           email.            off...@ipax.at
 1190 Wien                           tel.               +43 1 3670030
 FN 277995t HG Wien                  fax.            +43 1 3670030 15
 
 
___
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/


Re: [Linux-ha-dev] Postfix status (was Re: state of heartbeat resource agents)

2011-06-06 Thread renayama19661014
Hi Dejan,

Thank you for comment.

 In the latest version of ocf-shellfuncs there is some support for
 version checks.

I did not know that the new ocf-shellfuncs has support for version checks.
I will update the patch to use it.
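
For illustration, the comparison I was trying to write by hand can be 
sketched in plain sh as follows. version_lt is a hypothetical helper made up 
for this example and is not the function provided by ocf-shellfuncs; the 
2.5.0 threshold is the one used in this thread.

```shell
#!/bin/sh
# Succeeds (returns 0) when dotted version $1 is strictly older than $2.
# Compares the first three numeric fields; missing fields count as 0.
version_lt() {
    a="$1.0.0"
    b="$2.0.0"
    for i in 1 2 3; do
        fa=$(echo "$a" | cut -d. -f"$i")
        fb=$(echo "$b" | cut -d. -f"$i")
        if [ "$fa" -lt "$fb" ]; then return 0; fi
        if [ "$fa" -gt "$fb" ]; then return 1; fi
    done
    return 1   # equal versions are not "older"
}

# Usage: enable the status command only for 2.5.0 or newer.
ver=2.3.3
if version_lt "$ver" 2.5.0; then
    status_support=false
else
    status_support=true
fi
echo "status_support=$status_support"
```

Comparing field by field avoids the mistake in my first test, where a 
version such as 1.9 was rejected only because its minor number was large.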

Thanks.
Hideo Yamauchi.

--- On Mon, 2011/6/6, Dejan Muhamedagic de...@suse.de wrote:

 Hi Hideo-san,
 
 On Mon, Jun 06, 2011 at 01:36:01PM +0900, renayama19661...@ybb.ne.jp wrote:
  Hi All,
  
  Sorry
  
  +    if [ ${ver_str[0]} -le 2 -a ${ver_str[1]} -le 5 ]; then
  
  I missed.
  
  +    if [ ${ver_str[0]} -lt 2 -o ${ver_str[0]} -eq 2 -a ${ver_str[1]} -lt 5 ]; then
 
 In the latest version of ocf-shellfuncs there is some support for
 version checks.
 
 Cheers,
 
 Dejan
 
  Thanks.
  Hideo Yamauchi.
  
  
  --- On Mon, 2011/6/6, renayama19661...@ybb.ne.jp 
  renayama19661...@ybb.ne.jp wrote:
  
   Hi All,
   
   I send a patch in conjunction with the status processing.
   It is made the following modifications.
   
    * Carry out status processing in a version judgment 
    * Change of the parameter check 
    * Error log when status processing failed
    * Value set of the ret variable
   
   I send the patch of other corrections later.
   
   Please comment on all of you for the patch.
   
   
   Best Regards,
   Hideo Yamauchi.
   
   
   --- On Fri, 2011/6/3, Dejan Muhamedagic de...@suse.de wrote:
   
On Fri, Jun 03, 2011 at 12:03:20PM +0200, Raoul Bhatia [IPAX] wrote:
 On 06/03/2011 11:45 AM, Dejan Muhamedagic wrote:
  Regressions are bad. You have to keep in mind that not everybody
  runs the latest release of postfix. This really needs to be fixed
  before the release.
 
 it's no regression but has been like that since the initial release.
 see commit e7af463d or
 
 https://github.com/ClusterLabs/resource-agents/blame/master/heartbeat/postfix#LID100
 
 i didn't know this until Noah brought this to my/our attention:
 http://www.gossamer-threads.com/lists/linuxha/pacemaker/72379#72379

OK.  I misunderstood the post, it seemed to me as if status had
been introduced in the latest set of patches.  This is another
matter then.

Cheers,

Dejan

 thanks,
 raoul
 -- 
 
 DI (FH) Raoul Bhatia M.Sc.          email.          r.bha...@ipax.at
 Technischer Leiter
 
 IPAX - Aloy Bhatia Hava OG          web.          http://www.ipax.at
 Barawitzkagasse 10/2/2/11           email.            off...@ipax.at
 1190 Wien                           tel.               +43 1 3670030
 FN 277995t HG Wien                  fax.            +43 1 3670030 15
 
 
___
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/


Re: [Linux-ha-dev] Postfix status (was Re: state of heartbeat resource agents)

2011-06-06 Thread renayama19661014
Hi Raoul,

Thank you for comment.

 i think we could safely do the kill -s 0 for *any*
 version and call postfix status only if available.

I think so.

However, I do not know postfix very well.
I would like the opinion of someone who knows it in detail.

 btw. quickly looking at your patch, i spotted 1
 typo: status_suuport instead of status_support
 (douple u/p)

Sorry...
It is my typo.
 
 for the version check, i think we should try using the
 ocf internal function.

Ok.

 
   * Change of the parameter check
 the checks are basically fine. i would slightly update the
 logging information. (i can do this when i apply your patches)

Thanks!

 
   * Error log when status processing failed
   * Value set of the ret variable
 
 i don't think that the use of $ret is correct.

I made the change because the ret variable was not set at that point in the 
original resource agent. 
However, since it is not set, it may not be necessary to output the ret 
variable in the log at all.

Best Regards,
Hideo Yamauchi.

--- On Mon, 2011/6/6, Raoul Bhatia [IPAX] r.bha...@ipax.at wrote:

 Hi Hideo-san!
 
 On 06/06/2011 04:51 AM, renayama19661...@ybb.ne.jp wrote:
  Hi All,
  
  I send a patch in conjunction with the status processing.
  It is made the following modifications.
  
   * Carry out status processing in a version judgment
 
 i think we could safely do the kill -s 0 for *any*
 version and call postfix status only if available.
 
 btw. quickly looking at your patch, i spotted 1
 typo: status_suuport instead of status_support
 (douple u/p)
 
 for the version check, i think we should try using the
 ocf internal function.
 
   * Change of the parameter check
 the checks are basically fine. i would slightly update the
 logging information. (i can do this when i apply your patches)
 
   * Error log when status processing failed
   * Value set of the ret variable
 
 i don't think that the use of $ret is correct.
 
 please comment on my suggestions and/or update the
 ra in this regard.
 
 thanks,
 raoul
 -- 
 
 DI (FH) Raoul Bhatia M.Sc.          email.          r.bha...@ipax.at
 Technischer Leiter
 
 IPAX - Aloy Bhatia Hava OG          web.          http://www.ipax.at
 Barawitzkagasse 10/2/2/11           email.            off...@ipax.at
 1190 Wien                           tel.               +43 1 3670030
 FN 277995t HG Wien                  fax.            +43 1 3670030 15
 
 
___
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/


Re: [Linux-ha-dev] Postfix status (was Re: state of heartbeat resource agents)

2011-06-06 Thread renayama19661014
Hi All,

I revised the first patch.
Please confirm contents.

Best Regards,
Hideo Yamauchi.


--- On Tue, 2011/6/7, renayama19661...@ybb.ne.jp renayama19661...@ybb.ne.jp 
wrote:

 Hi Raoul,
 
 Thank you for comment.
 
  i think we could safely do the kill -s 0 for *any*
  version and call postfix status only if available.
 
 I think so.
 
 However, I do not know a lot about postfix so.
 I want the opinion of the detailed person.
 
  btw. quickly looking at your patch, i spotted 1
  typo: status_suuport instead of status_support
  (douple u/p)
 
 Sorry...
 It is my typo.
  
  for the version check, i think we should try using the
  ocf internal function.
 
 Ok.
 
  
    * Change of the parameter check
  the checks are basically fine. i would slightly update the
  logging information. (i can do this when i apply your patches)
 
 Thanks!
 
  
    * Error log when status processing failed
    * Value set of the ret variable
  
  i don't think that the use of $ret is correct.
 
 I made modifications to set unsettled ret variable in an original resource 
 agent. 
 But I am unsettled, the ret variable may not have to output it in log.
 
 Best Regards,
 Hideo Yamauchi.
 
 --- On Mon, 2011/6/6, Raoul Bhatia [IPAX] r.bha...@ipax.at wrote:
 
  Hi Hideo-san!
  
  On 06/06/2011 04:51 AM, renayama19661...@ybb.ne.jp wrote:
   Hi All,
   
   I send a patch in conjunction with the status processing.
   It is made the following modifications.
   
    * Carry out status processing in a version judgment
  
  i think we could safely do the kill -s 0 for *any*
  version and call postfix status only if available.
  
  btw. quickly looking at your patch, i spotted 1
  typo: status_suuport instead of status_support
  (douple u/p)
  
  for the version check, i think we should try using the
  ocf internal function.
  
    * Change of the parameter check
  the checks are basically fine. i would slightly update the
  logging information. (i can do this when i apply your patches)
  
    * Error log when status processing failed
    * Value set of the ret variable
  
  i don't think that the use of $ret is correct.
  
  please comment on my suggestions and/or update the
  ra in this regard.
  
  thanks,
  raoul
  -- 
  
  DI (FH) Raoul Bhatia M.Sc.          email.          r.bha...@ipax.at
  Technischer Leiter
  
  IPAX - Aloy Bhatia Hava OG          web.          http://www.ipax.at
  Barawitzkagasse 10/2/2/11           email.            off...@ipax.at
  1190 Wien                           tel.               +43 1 3670030
  FN 277995t HG Wien                  fax.            +43 1 3670030 15
  
  
diff -r a18d987956c7 postfix
--- a/postfix   Tue Jun 07 11:07:11 2011 +0900
+++ b/postfix   Tue Jun 07 11:13:16 2011 +0900
@@ -97,10 +97,25 @@
 
 running() {
 # run Postfix status
-$binary $OPTION_CONFIG_DIR status >/dev/null 2>&1
+local rcode
+if ocf_is_true $status_support; then
+output=`$binary $OPTION_CONFIG_DIR status`
+rcode=$?
+if [ $rcode -ne 0 ]; then
+ocf_log err Postfix status: $output
+fi
+return $rcode
+else
+PIDFILE=${queue_dir}/pid/master.pid
+if [ -f $PIDFILE ]; then
+ PID=`head -n 1 $PIDFILE`
+ kill -s 0 $PID >/dev/null 2>&1 && [ `ps -p $PID | grep master | wc -l` -eq 1 ]
+ return $?
+fi
+false
+fi
 }
 
-
 postfix_status()
 {
 running
@@ -219,25 +234,42 @@
 fi
 fi
 
+# check postfix version
+status_support=false
+output=`postconf $OPTION_CONFIG_DIR -h mail_version`
+if [ $? -ne 0 ]; then
+ocf_log err Postfix config mail_version does not exist. $output
+fi
+ocf_version_cmp $output 2.5.0
+if [ $? -ne 0 ]; then
+status_support=true
+fi
+
 # check spool/queue and data directories
 # this is required because postfix check does not catch all errors
 queue_dir=`postconf $OPTION_CONFIG_DIR -h queue_directory 2>/dev/null`
-data_dir=`postconf $OPTION_CONFIG_DIR -h data_directory 2>/dev/null`
-for dir in $queue_dir $data_dir; do
-if [ ! -d $dir ]; then
-ocf_log err Postfix directory '$queue_dir' does not exist. $ret
+if [ ! -d $queue_dir ]; then
+ocf_log err Postfix directory '$queue_dir' does not exist.
+return $OCF_ERR_INSTALLED
+fi
+if ocf_is_true $status_support; then
+data_dir=`postconf $OPTION_CONFIG_DIR -h data_directory 2>/dev/null`
+if [ ! -d $data_dir ]; then
+ocf_log err Postfix directory '$data_dir' does not exist.
 return $OCF_ERR_INSTALLED
 fi
-done
+fi
 
 # check permissions
-   

Re: [Linux-ha-dev] Postfix status (was Re: state of heartbeat resource agents)

2011-06-06 Thread renayama19661014
Hi All,

Here is my last patch (patch3).
It applies on top of the sources with patch 1 applied.
It outputs the details of errors to the log.

Best Regards,
Hideo Yamauchi.


--- On Tue, 2011/6/7, renayama19661...@ybb.ne.jp renayama19661...@ybb.ne.jp 
wrote:

 Hi All,
 
 I revised the first patch.
 Please confirm contents.
 
 Best Regards,
 Hideo Yamauchi.
 
 
 --- On Tue, 2011/6/7, renayama19661...@ybb.ne.jp renayama19661...@ybb.ne.jp 
 wrote:
 
  Hi Raoul,
  
  Thank you for comment.
  
   i think we could safely do the kill -s 0 for *any*
   version and call postfix status only if available.
  
  I think so.
  
  However, I do not know a lot about postfix so.
  I want the opinion of the detailed person.
  
   btw. quickly looking at your patch, i spotted 1
   typo: status_suuport instead of status_support
   (douple u/p)
  
  Sorry...
  It is my typo.
   
   for the version check, i think we should try using the
   ocf internal function.
  
  Ok.
  
   
     * Change of the parameter check
   the checks are basically fine. i would slightly update the
   logging information. (i can do this when i apply your patches)
  
  Thanks!
  
   
     * Error log when status processing failed
     * Value set of the ret variable
   
   i don't think that the use of $ret is correct.
  
  I made modifications to set unsettled ret variable in an original resource 
  agent. 
  But I am unsettled, the ret variable may not have to output it in log.
  
  Best Regards,
  Hideo Yamauchi.
  
  --- On Mon, 2011/6/6, Raoul Bhatia [IPAX] r.bha...@ipax.at wrote:
  
   Hi Hideo-san!
   
   On 06/06/2011 04:51 AM, renayama19661...@ybb.ne.jp wrote:
Hi All,

I send a patch in conjunction with the status processing.
It is made the following modifications.

     * Carry out status processing in a version judgment
   
   i think we could safely do the kill -s 0 for *any*
   version and call postfix status only if available.
   
   btw. quickly looking at your patch, i spotted 1
   typo: status_suuport instead of status_support
   (douple u/p)
   
   for the version check, i think we should try using the
   ocf internal function.
   
     * Change of the parameter check
   the checks are basically fine. i would slightly update the
   logging information. (i can do this when i apply your patches)
   
     * Error log when status processing failed
     * Value set of the ret variable
   
   i don't think that the use of $ret is correct.
   
   please comment on my suggestions and/or update the
   ra in this regard.
   
   thanks,
   raoul
   -- 
   
   DI (FH) Raoul Bhatia M.Sc.          email.          r.bha...@ipax.at
   Technischer Leiter
   
   IPAX - Aloy Bhatia Hava OG          web.          http://www.ipax.at
   Barawitzkagasse 10/2/2/11           email.            off...@ipax.at
   1190 Wien                           tel.               +43 1 3670030
   FN 277995t HG Wien                  fax.            +43 1 3670030 15
   
   
 diff -r 303d9d19eb61 postfix
--- a/postfix   Tue Jun 07 11:22:15 2011 +0900
+++ b/postfix   Tue Jun 07 11:35:52 2011 +0900
@@ -102,7 +102,7 @@
 output=`$binary $OPTION_CONFIG_DIR status`
 rcode=$?
 if [ $rcode -ne 0 ]; then
-ocf_log err Postfix status: $output
+ocf_log err Postfix status: $rcode : $output
 fi
 return $rcode
 else
@@ -130,11 +130,11 @@
 fi
 
 # start Postfix
-$binary $OPTIONS start >/dev/null 2>&1
+output=`$binary $OPTIONS start >/dev/null 2>&1`
 ret=$?
 
 if [ $ret -ne 0 ]; then
-ocf_log err Postfix returned error. $ret
+ocf_log err Postfix returned error. $ret : $output 
 return $OCF_ERR_GENERIC
 fi
 
@@ -163,11 +163,11 @@
 fi
 
 # stop Postfix
-$binary $OPTIONS stop >/dev/null 2>&1
+output=`$binary $OPTIONS stop >/dev/null 2>&1`
 ret=$?
 
 if [ $ret -ne 0 ]; then
-ocf_log err Postfix returned an error while stopping. $ret
+ocf_log err Postfix returned an error while stopping. $ret : $output 
 return $OCF_ERR_GENERIC
 fi
 
@@ -201,7 +201,12 @@
 {
 if postfix_status; then
 ocf_log info Reloading Postfix.
-$binary $OPTIONS reload
+output=`$binary $OPTIONS reload`
+ret=$?
+if [ $ret -ne 0 ]; then
+ocf_log err Postfix reload error. $ret : $output 
+fi
+return $ret
 fi
 }
 
@@ -237,8 +242,9 @@
 # check postfix version
 status_support=false
 output=`postconf $OPTION_CONFIG_DIR -h mail_version`
-if [ $? -ne 0 ]; then
-ocf_log err Postfix config mail_version does not exist. 

Re: [Linux-ha-dev] Postfix status (was Re: state of heartbeat resource agents)

2011-06-05 Thread renayama19661014
Hi All,

I am sending a patch for the status processing.
It makes the following changes.

 * Run the status processing only after a version check
 * Change the parameter check
 * Log an error when the status processing fails
 * Set a value for the ret variable

I will send the patch with the other corrections later.

I would appreciate comments on the patch from all of you.


Best Regards,
Hideo Yamauchi.


--- On Fri, 2011/6/3, Dejan Muhamedagic <de...@suse.de> wrote:

> On Fri, Jun 03, 2011 at 12:03:20PM +0200, Raoul Bhatia [IPAX] wrote:
> > On 06/03/2011 11:45 AM, Dejan Muhamedagic wrote:
> > > Regressions are bad. You have to keep in mind that not everybody
> > > runs the latest release of postfix. This really needs to be fixed
> > > before the release.
> >
> > it's no regression but has been like that since the initial release.
> > see commit e7af463d or
> >
> > https://github.com/ClusterLabs/resource-agents/blame/master/heartbeat/postfix#LID100
> >
> > i didn't know this until Noah brought this to my/our attention:
> > http://www.gossamer-threads.com/lists/linuxha/pacemaker/72379#72379
>
> OK.  I misunderstood the post, it seemed to me as if status had
> been introduced in the latest set of patches.  This is another
> matter then.
>
> Cheers,
>
> Dejan
>
> > thanks,
> > raoul
> > -- 
> >
> > DI (FH) Raoul Bhatia M.Sc.          email.          r.bha...@ipax.at
> > Technischer Leiter
> >
> > IPAX - Aloy Bhatia Hava OG          web.          http://www.ipax.at
> > Barawitzkagasse 10/2/2/11           email.            off...@ipax.at
> > 1190 Wien                           tel.               +43 1 3670030
> > FN 277995t HG Wien                  fax.            +43 1 3670030 15
  
diff -r fd372ca4d647 postfix
--- a/postfix   Mon Jun 06 11:45:51 2011 +0900
+++ b/postfix   Mon Jun 06 11:46:32 2011 +0900
@@ -97,10 +97,22 @@
 
 running() {
 # run Postfix status
-$binary $OPTION_CONFIG_DIR status >/dev/null 2>&1
+if [ "$status_support" = true ]; then
+output=`$binary $OPTION_CONFIG_DIR status`
+if [ $? -ne 0 ]; then
+ocf_log err "Postfix status. %s" $output
+fi
+else
+PIDFILE=${queue_dir}/pid/master.pid
+if [ -f $PIDFILE ]; then
+ PID=`head -n 1 $PIDFILE`
+ kill -s 0 $PID >/dev/null 2>&1 && [ `ps -p $PID | grep master | wc -l` -eq 1 ]
+ return $?
+fi
+false
+fi
 }
 
-
 postfix_status()
 {
 running
@@ -219,25 +231,44 @@
 fi
 fi
 
+# check postfix version
+status_support=true
+output=`postconf $OPTION_CONFIG_DIR -h mail_version`
+if [ $? -ne 0 ]; then
+ocf_log err "Postfix config mail_version does not exist. %s" $output
+fi
+ver_str=(`echo $output | tr '.' ' '`)
+if [ ${ver_str[0]} -le 2 -a ${ver_str[1]} -le 5 ]; then
+status_support=false
+fi
+
 # check spool/queue and data directories
 # this is required because postfix check does not catch all errors
 queue_dir=`postconf $OPTION_CONFIG_DIR -h queue_directory 2>/dev/null`
-data_dir=`postconf $OPTION_CONFIG_DIR -h data_directory 2>/dev/null`
-for dir in $queue_dir $data_dir; do
-if [ ! -d $dir ]; then
-ocf_log err "Postfix directory '$queue_dir' does not exist." $ret
+ret=$?
+if [ ! -d $queue_dir ]; then
+ocf_log err "Postfix directory '$queue_dir' does not exist." $ret
+return $OCF_ERR_INSTALLED
+fi
+if [ ! $status_support ]; then
+data_dir=`postconf $OPTION_CONFIG_DIR -h data_directory 2>/dev/null`
+ret=$?
+if [ ! -d $data_dir ]; then
+ocf_log err "Postfix directory '$data_dir' does not exist." $ret
 return $OCF_ERR_INSTALLED
 fi
-done
+fi
 
 # check permissions
-user=`postconf $OPTION_CONFIG_DIR -h mail_owner 2>/dev/null`
-for dir in $data_dir; do
-if ! su -s /bin/sh - $user -c "test -w $dir"; then
-ocf_log err "Directory '$dir' is not writable by user '$user'."
-exit $OCF_ERR_PERM;
-fi
-done
+if [ ! $status_support ]; then
+user=`postconf $OPTION_CONFIG_DIR -h mail_owner 2>/dev/null`
+for dir in $data_dir; do
+if ! su -s /bin/sh - $user -c "test -w $dir"; then
+ocf_log err "Directory '$dir' is not writable by user '$user'."
+exit $OCF_ERR_PERM;
+fi
+done
+fi
 
 # run Postfix internal check
 $binary $OPTIONS check >/dev/null 2>&1
@@ -355,3 +386,4 @@
 exit $OCF_ERR_UNIMPLEMENTED
 ;;
 esac
+

Re: [Linux-ha-dev] Postfix status (was Re: state of heartbeat resource agents)

2011-06-05 Thread renayama19661014
Hi All,

The next patch adds a loop that waits for the start to complete.
The start processing now waits for Postfix to come up, as other resource agents do.
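
The loop in the patch retries the monitor action until it succeeds and has no upper bound of its own, so it presumably relies on the cluster manager's start timeout to end it. A bounded variant can be sketched as follows; `wait_for_success` and its attempt limit are hypothetical and not part of the patch.

```shell
#!/bin/sh
# Hypothetical helper: poll a command until it succeeds.
# $1 = maximum number of attempts (0 = retry forever, as the patch does)
# $2 = seconds to sleep between attempts
# remaining arguments = command to poll
wait_for_success() {
    max=$1
    interval=$2
    shift 2
    tries=0
    while :; do
        "$@" && return 0
        tries=$((tries + 1))
        if [ "$max" -gt 0 ] && [ "$tries" -ge "$max" ]; then
            return 1    # gave up after $max failed attempts
        fi
        sleep "$interval"
    done
}
```

In the agent this could be called as `wait_for_success 0 1 running` to keep the unbounded behaviour of the patch, or with a non-zero limit to fail on its own.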

Best Regards,
Hideo Yamauchi.


diff -r 6f405d0b697b postfix
--- a/postfix   Mon Jun 06 11:56:47 2011 +0900
+++ b/postfix   Mon Jun 06 12:04:58 2011 +0900
@@ -139,12 +139,16 @@
 sleep 2
 
 # initial monitoring action
-running
-ret=$?
-if [ $ret -ne $OCF_SUCCESS ]; then
-ocf_log err "Postfix failed initial monitor action." $ret
-return $OCF_ERR_GENERIC
-fi
+while :
+do
+running
+ret=$?
+if [ $ret -eq $OCF_SUCCESS ]; then
+break;
+fi
+sleep 1
+ocf_log debug "Postfix failed initial monitor action." $ret
+done
 
 ocf_log info "Postfix started."
 return $OCF_SUCCESS