[Pacemaker] About a combination with OpenAIS.

2009-06-10 Thread renayama19661014
Hi, I am going to check the operation of a combination of Pacemaker and OpenAIS. Following the page (http://www.clusterlabs.org/wiki/Install), I built the environment from source. I used the following source versions. *

[Pacemaker] A demand to the process trouble.(OpenAIS/Corosync and Pacemaker)

2009-07-16 Thread renayama19661014
Hi, We have begun investigating a migration to the combination of Pacemaker and corosync/openais. We combined Pacemaker with openais (whitetank) and checked the behavior when a process fails. (With Heartbeat, this is the case in which an emergency reboot occurred.) I

Re: [Pacemaker] A demand to the process trouble.(OpenAIS/Corosync and Pacemaker)

2009-07-22 Thread renayama19661014
Hi Andrew, If the crmd dies, then I (IIRC) the lrmd cancels all existing resource monitoring. However, when the crmd is recovered, it should setup the resource monitoring again. Is the second part not happening? There are some patterns in a problem. When crmd/stonithd restarts in a ACT

Re: [Pacemaker] A demand to the process trouble.(OpenAIS/Corosync and Pacemaker)

2009-07-22 Thread renayama19661014
Hi Andrew, Can you open a bug for that. I suspect the lrmd might be doing the wrong thing, but assign it to pacemaker until I can prove that :-) All right. Had better I register problem in bugzilla? #As for me, a problem understands the thing in lrmd. Best Regards, Hideo Yamauchi. ---

Re: [Pacemaker] A demand to the process trouble.(OpenAIS/Corosync and Pacemaker)

2009-07-28 Thread renayama19661014
Hi Andrew, Yes please. http://developerbugs.linux-foundation.org/show_bug.cgi?id=2161 Best Regards, Hideo Yamauchi. --- Andrew Beekhof and...@beekhof.net wrote: 2009/7/23 renayama19661...@ybb.ne.jp: Hi Andrew, Can you open a bug for that. I suspect the lrmd might be doing the wrong

Re: [Pacemaker] An error occurs by a crm command.

2009-07-28 Thread renayama19661014
Hi Andrew, When I do not specify target_role, an error does not occur. Is my operation wrong? role=Stopped needs to occur after the meta keyword, not as part of the operation I was wrong. This target_role was for specifying MASTER/SLAVE. Thanks, Hideo Yamauchi. --- Andrew Beekhof

[Pacemaker] About a stop/restart of the monitor.

2009-08-05 Thread renayama19661014
Hi, We examined how to stop and then resume the monitor of a resource in an environment running the latest Pacemaker. Of course, we understand that monitors stop when the cluster is put into maintenance mode. However, our customer wants to stop a monitor individually without putting the cluster into

Re: [Pacemaker] About a stop/restart of the monitor.

2009-08-06 Thread renayama19661014
Hi Andrew, iirc, there you can also set enabled=false. Check out the schema files. It may even be in the PDF Thank you. I tried to test the next command. # cibadmin -R -X '<op id="prmDummy1-monitor" interval="10s" name="monitor" on-fail="restart" timeout="60s" enabled="false" />' But, the monitor of

Re: [Pacemaker] About a substitute of initdead.(for Corosync and Pacemaker)

2009-08-27 Thread renayama19661014
Hi Andrew, Basically yes. Granted the name we used for the option could be better :-) All right. Thank you. Best Regards, Hideo Yamauchi. --- Andrew Beekhof and...@beekhof.net wrote: On Thu, Aug 27, 2009 at 10:01 AM, renayama19661...@ybb.ne.jp wrote: Hi, I asked about the setting of

Re: [Pacemaker] A function demand of the new environment.

2009-09-17 Thread renayama19661014
Hi Andrew, I still don't understand why you can't simply make them regular cluster resources. Can you please explain? I understand your opinion. I surely understand that simple RA can operate an object of respawn by shifting. However, an object of respawn of all users cannot shift to RA. For

Re: [Pacemaker] A function demand of the new environment.

2009-09-21 Thread renayama19661014
Hi Lars, I understand that I can come true by the method that you showed enough. However, we wanted to do respawn under the cluster software if possible. We talk about a substitute method again, too. Thank you. Hideo Yamauchi. --- Lars Ellenberg lars.ellenb...@linbit.com wrote: On Mon, Sep

Re: [Pacemaker] A function demand of the new environment.

2009-09-21 Thread renayama19661014
Hi Andrew, I'm almost afraid to ask, why would someone need this capability? In the case of the environment that shifted from version 1 of Heartbeat, I think that the user may do how to use such respawn. We talk about a substitute method again, too. Thank you. Hideo Yamauchi. --- Andrew

Re: [Pacemaker] A problem to fail in a stop of Pacemaker.

2009-09-29 Thread renayama19661014
Hi Remi, It appears that this is a similar problem to the one that I reported, yes. It appears to not be a bug in Corosync, but rather one in Pacemaker. This bug has been filed in Red Hat Bugzilla, see it at: https://bugzilla.redhat.com/show_bug.cgi?id=525589 Perhaps you could add

Re: [Pacemaker] Failed in restart of Corosync.

2009-10-18 Thread renayama19661014
Hi Steven, All right. Thank you. Best Regards, Hideo Yamauchi. --- Steven Dake sd...@redhat.com wrote: This bug is reported and we are working on a solution. Regards -steve On Mon, 2009-10-19 at 11:05 +0900, renayama19661...@ybb.ne.jp wrote: Hi, I understand that a combination

Re: [Pacemaker] A function demand of the new environment.

2009-10-30 Thread renayama19661014
Hi Andrew, Or are you looking for the ability to do more than simply prevent resources from running? Yes. For the realization of the function to assure the practice of the resources of the cluster, we use respawn. We found the function that it might be replaced of respawn in a road map of

Re: [Pacemaker] [PATCH]The abolition of distinguishing a node name by a small and capital letter.

2009-12-10 Thread renayama19661014
Hi Dejan, What do you think about this matter? Please tell me your opinion. Best Regards, Hideo Yamauchi. --- renayama19661...@ybb.ne.jp wrote: Hi, We made the following mistakes. * The host name is lowercase. * The host name in the RULE is uppercase. * The host name

Re: [Pacemaker] [PATCH]The abolition of distinguishing a node name by a small and capital letter.

2009-12-14 Thread renayama19661014
Hi Dejan, Are node names uppercase? And then stonith doesn't work? The node name is lowercase. stonith runs, but fencing is not carried out because the target node is not found. This is obviously caused by a configuration mistake. However, some kind of countermeasure is necessary for users.

Re: [Pacemaker] The order of resources changes when resources register themselves by a crm command.

2009-12-14 Thread renayama19661014
Hi, Everybody thank you for comment. If my understanding is right, I think that I will register myself with Bugzilla as a demand. Is my understanding right? Best Regards, Hideo Yamauchi. --- Andrew Beekhof and...@beekhof.net wrote: On Fri, Dec 11, 2009 at 11:28 AM, Lars Marowsky-Bree

Re: [Pacemaker] [PATCH]The abolition of distinguishing a node name by a small and capital letter.

2009-12-15 Thread renayama19661014
Hi all, Thank you for understanding of all of you. Yes, though that means changing all plugins. I can assist the revision of the plug in. Best Regards, Hideo Yamauchi. --- Dejan Muhamedagic deja...@fastmail.fm wrote: Hi Lars, On Tue, Dec 15, 2009 at 02:24:42PM +0100, Lars Ellenberg

Re: [Pacemaker] The order of resources changes when resources register themselves by a crm command.

2009-12-15 Thread renayama19661014
Hi Dejan, I registered. http://developerbugs.linux-foundation.org/show_bug.cgi?id=2290 Best Regards, Hideo Yamauchi. --- Dejan Muhamedagic deja...@fastmail.fm wrote: Hi Hideo-san, On Tue, Dec 15, 2009 at 09:41:13AM +0900, renayama19661...@ybb.ne.jp wrote: Hi, Everybody thank you

Re: [Pacemaker] [PATCH]The abolition of distinguishing a node name by a small and capital letter.

2009-12-16 Thread renayama19661014
Hi Dejan, That would be great since I'm overwhelmed with other business now. And please open a bugzilla if there isn't any so that we can track this. All right. Please wait Best Regards, Hideo Yamauchi. --- Dejan Muhamedagic deja...@fastmail.fm wrote: Hi Hideo-san, On Wed, Dec 16,

Re: [Pacemaker] [PATCH]The abolition of distinguishing a node name by a small and capital letter.

2009-12-16 Thread renayama19661014
Hi Dejan, I registered. http://developerbugs.linux-foundation.org/show_bug.cgi?id=2292 Best Regards, Hideo Yamauchi. --- renayama19661...@ybb.ne.jp wrote: Hi Dejan, That would be great since I'm overwhelmed with other business now. And please open a bugzilla if there isn't any so that

[Pacemaker] rsc_location does not work.

2009-12-21 Thread renayama19661014
Hi, We built a complicated three-node cluster (2ACT+1STB) with the following combination. * corosync-1.1.2 * Reusable-Cluster-Components-fa44a169d55f * Cluster-Resource-Agents-6f02f8ad7fd4 * Pacemaker-1-0-d990c453b999 We wanted the group02-1 resource to start

Re: [Pacemaker] rsc_location does not work.

2010-01-05 Thread renayama19661014
Hi Andrew, Thank you for comment. You would be better off with a constraint like example 9.3 which will exclude any unconnected node and leave the previous location scores unchanged:

Re: [Pacemaker] rsc_location does not work.

2010-01-05 Thread renayama19661014
Hi Andrew, By the method that you showed, the problem was solved. Thank you. Hideo Yamauchi. --- renayama19661...@ybb.ne.jp wrote: Hi Andrew, Thank you for comment. You would be better off with a constraint like example 9.3 which will exclude any unconnected node and leave the

Re: [Pacemaker] The problem that a resource does not move by the monitor error of the clone.

2010-01-24 Thread renayama19661014
Hi All, About this problem, please give me advice. Best Regards, Hideo Yamauchi. --- renayama19661...@ybb.ne.jp wrote: Hi, Step1) It started in three nodes as follows. [r...@srv01 ~]# crm_mon -1 Last updated: Tue Jan 19 10:36:20 2010 Stack: openais Current DC: srv01 -

Re: [Pacemaker] The problem that a resource does not move by the monitor error of the clone.

2010-01-28 Thread renayama19661014
Hi Andrew, Thank you. By your advice, the problem was solved. Understanding for my resource-stickiness was insufficient. Best Regards, Hideo Yamauchi. --- Andrew Beekhof and...@beekhof.net wrote: On Thu, Jan 28, 2010 at 12:13 PM, Andrew Beekhof and...@beekhof.net wrote: 2010/1/19

Re: [Pacemaker] [Problem]It does not start by single node configuration.

2010-01-28 Thread renayama19661014
Hi Andrew, Very odd. CTS (successfully) tests this condition all the time. What does corosync.conf look like? Is there a firewall involved? It _looks_ like the crmd isn't getting its own messages back. My test environment is next. * Single VM(RHEL5.4-32bit) on Esxi. * NIC x 3 * 1: A

Re: [Pacemaker] crm_attribte failed with Multiple attributes match

2010-02-03 Thread renayama19661014
Hi Andrew, This problem occurred in our environment. * RHEL5.4(x86) on Esxi 2node * corosync 1.1.2 * Pacemaker-1-0-6f67420618b0.tar.gz It sometimes occurs when I run test.sh for a long time. [r...@srv01 ~]# ./test.sh (snip) scope=status name=master-vmrd-res: value=132 Multiple

Re: [Pacemaker] crm_attribte failed with Multiple attributes match

2010-02-04 Thread renayama19661014
Hi Andrew, After all is it a problem of libxml2? Is it necessary to update libxml2? It seems so. If I understood hj correctly, that was the only part he replaced in order for it to function correctly. Thank you for comment. I update libxml2, too and confirm it. Best Regards, Hideo

Re: [Pacemaker] [Problem]It does not start by single node configuration.

2010-02-04 Thread renayama19661014
Hi Andrew, The problem does not occur with a single ring. It is strange that this problem does not happen in Pacemaker1.0.6. Could this be a corosync problem? Have you heard of any similar phenomenon? Best Regards, Hideo Yamauchi. --- renayama19661...@ybb.ne.jp wrote: Hi Andrew, Could you

Re: [Pacemaker] [Problem]It does not start by single node configuration.

2010-02-07 Thread renayama19661014
Hi Andrew, Odd. You might want to report this to the openais guys. All right. The problem goes away when you just downgrade pacemaker to 1.0.6 (and leave corosync at the same version)? Sorry... It was my mistake. When I put corosync1.2.0 and Pacemaker1.0.6 together, the same problem

Re: [Pacemaker] [GUI][PATCH]An orphan resource was displayed by GUI.

2010-02-07 Thread renayama19661014
Hi Yan, This problem seems to still occur somehow or other. The Orphaned resource of the clone is extremely rarely displayed. I confirm a condition and report it again. Best Regards, Hideo Yamauchi. --- renayama19661...@ybb.ne.jp wrote: Hi Yan, Could you please try the attached patch?

Re: [Pacemaker] [GUI][PATCH]An orphan resource was displayed by GUI.

2010-02-07 Thread renayama19661014
Hi Yan, Hi Andrew, The problem does not reappear. I tried many times, but it does not readily reproduce. However, I found interesting output in the log from when the problem happened. When the problem happened, the GUI showed the following display.

Re: [Pacemaker] Cannot control the placement of the resource.

2010-02-10 Thread renayama19661014
Hi Andrew, Thank you for comment. Actually, none of the included pe files seem to match the failure case... can you supply the cib from that condition? I report the result that carried out cibadmin -Q at the stage of each problem. Is this information(cibadmin -Q) enough? Best Regards, Hideo

[Pacemaker] We cannot stop specific clone resources by a crm_resource order.

2010-02-17 Thread renayama19661014
Hi, We cannot stop specific clone resources by a crm_resource order. Last updated: Wed Feb 17 16:56:26 2010 Stack: openais Current DC: srv01 - partition with quorum Version: 1.0.7-0d97eb2e69533c1352044394c88d7c05802a09a5 2 Nodes configured, 2 expected votes 2 Resources configured.

Re: [Pacemaker] Cannot control the placement of the resource.

2010-02-21 Thread renayama19661014
Hi Andrew, About the cause of this problem, did you understand what it was? Please contact me if there is the information that is necessary for investigation else. Best Regards, Hideo Yamauchi. --- renayama19661...@ybb.ne.jp wrote: Hi Andrew, I confirmed it in the next environment.

Re: [Pacemaker] We cannot stop specific clone resources by a crm_resource order.

2010-02-22 Thread renayama19661014
Hi Andrew, Can we stop one clone by a crm_resource command and a crm command? Or is it necessary to set -INFINITY in a rule? Right, a location rule Is it perhaps the specification that clone instances cannot be stopped individually? Correct. You can however do: crm configure location

Re: [Pacemaker] [GUI][PATCH]An orphan resource was displayed by GUI.

2010-02-23 Thread renayama19661014
Hi Yan, Thank you for a reply. Probability of the reproduction of the problem is not high in my environment, but tries your patch for the moment. I e-mail a result to you. Best Regards, Hideo Yamauchi. --- Yan Gao y...@novell.com wrote: Hi Hideo, On 02/08/10 15:35,

Re: [Pacemaker] Cannot control the placement of the resource.

2010-02-23 Thread renayama19661014
Hi Andrew, Thank you for a reply. I test your patch. Best Regards, Hideo Yamauchi. --- Andrew Beekhof and...@beekhof.net wrote: 2010/2/13 renayama19661...@ybb.ne.jp: Hi Andrew, I confirmed it in the next environment. * CentOS 5.4 (on ESXi) * Pacemaker-1-0-0bf7d14dd554

Re: [Pacemaker] About the wrong calculation of the value of expected vote.

2010-02-24 Thread renayama19661014
Hi Andrew, Thank you for your comment. I registered the problem in Bugzilla. * http://developerbugs.linux-foundation.org/show_bug.cgi?id=2359 Best Regards, Hideo Yamauchi. --- Andrew Beekhof and...@beekhof.net wrote: 2010/2/24 renayama19661...@ybb.ne.jp: Hi, I found a problem of expected

[Pacemaker] [PATCH]An option of quorum-policy does not become effective.

2010-03-04 Thread renayama19661014
Hi All, The quorum-policy setting does not take effect when we use the Pacemaker development version (Pacemaker-1-0-93b87931206e.tar.gz). The cause is that the votes of the lost node are still added when the count is processed. I offer a patch. Because I call member_loop_fn

[Pacemaker] [PATCH]The change of the output level of the log.(for stonithd)

2010-03-07 Thread renayama19661014
Hi, We checked the stonithd log with a configuration in which the stonith operation takes a long time. When STONITH is carried out in that configuration, the following error is written to the log. Jan 29 14:00:53 cgl60 stonithd: [7524]: ERROR:

Re: [Pacemaker] [PATCH]The change of the output level of the log.(for stonithd)

2010-03-08 Thread renayama19661014
Hi Dejan, Anything not to upset the operator :) Similar patch applied. Thanks. BTW, can't recall seeing this error. Still not clear to me when did you encounter it. In the case of ssh/external, it seems to be generated when I let do sleep of status on purpose in slightly long time. During

Re: [Pacemaker] Problem : By colocations limitation, the resource appointment of the combination does not become effective.

2010-03-08 Thread renayama19661014
Hi Andrew, This is normal for constraints with scores < INFINITY. Anything < INFINITY is preferable but not mandatory Sorry, my question was poorly worded. At STEP9, is there a setting that prevents the UMgroup01 resource from starting? I do not perform the INFINITY setting in

Re: [Pacemaker] [PATCH]The change of the output level of the log.(for stonithd)

2010-03-09 Thread renayama19661014
Hi Dejan, Andrew fixed it already. Sorry about that. All right. Thanks. Hideo Yamauchi. --- Dejan Muhamedagic deja...@fastmail.fm wrote: Hi, On Tue, Mar 09, 2010 at 04:31:22PM +0900, renayama19661...@ybb.ne.jp wrote: Hi Dejan, There seem to be some problems for a retouch somehow

[Pacemaker] [PATCH GUI]A Japanization file.

2010-03-11 Thread renayama19661014
Hi Yan, I updated the Japanese localization file for Pacemaker-Python-GUI-14fd66fafbfa.tar.gz. Please apply this patch to the latest version of the GUI. The following error occurs when building Pacemaker-Python-GUI-14fd66fafbfa.tar.gz. Please fix the error. cc1: warnings being treated as errors

Re: [Pacemaker] [PATCH GUI]A Japanization file.

2010-03-11 Thread renayama19661014
Hi Yan, There are other two _fuzzy_ translations in the ja.po. You may want to update them too:-) Thanks! I'm sorry. I send a patch again. Prototypes of some functions have changed. You need to update to the latest pacemaker. I used Pacemaker of the next place. *

Re: [Pacemaker] [PATCH GUI]A Japanization file.

2010-03-11 Thread renayama19661014
Hi Yan, You may want to try pacemaker 1.1: http://hg.clusterlabs.org/pacemaker/1.1 All right. Thanks!! Otherwise you could reverse the change: http://hg.clusterlabs.org/pacemaker/pygui/rev/4dc8cb63f29b Thanks!! Best Regards, Hideo Yamauchi. --- Yan Gao y...@novell.com wrote: On

Re: [Pacemaker] [PATCH GUI]A Japanization file.

2010-03-14 Thread renayama19661014
Hi Yan, With the version you pointed me to, I confirmed that the GUI displays in Japanese. I made the patch again and attached it. Please apply this patch to the development version of the GUI. Best Regards, Hideo Yamauchi. ja.po.20100316.patch Description: 4280310379-ja.po.20100316.patch

[Pacemaker] Problem : Sometimes failed in the start of the guest(on KVM).

2010-03-18 Thread renayama19661014
Hi, I use the VirtualDomain RA to build a cluster on KVM. However, a guest sometimes fails to start. Mar 16 15:16:52 x3650e lrmd: [13457]: info: RA output: (guest-kvm1:start:stderr) error: Failed to start domain kvm1 error: internal error unable to start guest: inet_listen:

Re: [Pacemaker] Problem : Sometimes failed in the start of the guest(on KVM).

2010-03-19 Thread renayama19661014
Hi Dejan, IIRC, that port has to do with vnc and something else (another VNC server?) has already been started on that port. Thank you for comment. I examine it a little more. Best Regards, Hideo Yamauchi. --- Dejan Muhamedagic deja...@fastmail.fm wrote: Hi Hideo-san, On Fri, Mar 19,

Re: [Pacemaker] About replacement of clone and handling of the fail number of times.

2010-03-23 Thread renayama19661014
Hi Andrew, Thank you for comment. So if I can summarize, you're saying that clnUMdummy02 should not be allowed to run on srv01 because the combined number of failures is 6 (and clnUMdummy02 is a non-unique clone). And that the current behavior is that clnUMdummy02 continues to run. Is

Re: [Pacemaker] Problem : By colocations limitation, the resource appointment of the combination does not become effective.

2010-03-23 Thread renayama19661014
Hi Andrew, Thank you for comment. I was suggesting: <rsc_colocation id="rsc_colocation01-3" rsc="UMgroup01" with-rsc="clnUMgroup01" score="INFINITY"/> <rsc_location id="no-connectivity-01-1" rsc="UMgroup01"> <rule id="clnPingd-exclude-rule" score="-INFINITY" boolean-op="or"> <expression

Re: [Pacemaker] Problem : By colocations limitation, the resource appointment of the combination does not become effective.

2010-03-23 Thread renayama19661014
Hi Andrew, I have one more question. Our real resource configuration is a little more complicated. We colocate with clones (clnG3dummy1, clnG3dummy2) that do not update an attribute the way pingd does. (snip) <clone id="clnG3dummy1"> <primitive class="ocf"

Re: [Pacemaker] About replacement of clone and handling of the fail number of times.

2010-03-24 Thread renayama19661014
Hi Andrew, Do you mean: why is the clone on srv01 always $clone:0 but on srv02 its sometimes $clone:0 and sometimes $clone:1 ? yes. The replacement thought both nodes to be the same movement. Because it is globally-unique=false. Best Regards, Hideo Yamauchi. --- Andrew Beekhof

Re: [Pacemaker] About replacement of clone and handling of the fail number of times.

2010-03-25 Thread renayama19661014
Hi Andrew, globally-unique=false means that :0 and :1 are actually the same resource. its perfectly valid for entries for both to exist on the node, but the PE should fold them together internally. in most ways it does, just not for failures (yet). Thank you for comment. Some we were

Re: [Pacemaker] Problem : By colocations limitation, the resource appointment of the combination does not become effective.

2010-04-18 Thread renayama19661014
Hi Andrew, Are you busy? Please give my question an answer. Best Regards, Hideo Yamauchi. --- renayama19661...@ybb.ne.jp wrote: Hi Andrew, I ask you a question one more. Our real resource constitution is a little more complicated. We do colocation of the clone(clnG3dummy1,

Re: [Pacemaker] Problem : By colocations limitation, the resource appointment of the combination does not become effective.

2010-04-19 Thread renayama19661014
Hi Andrew, We want to realize start in order of the next. 1) clnPingd, clnG3dummy1, clnG3dummy2, clnUMgroup01 (All resources start) - UMgroup01 start * And the resource moves if a clone of one stops. 2) clnPingd, clnG3dummy1, clnG3dummy2 (All resources start) - OVDBgroup02-1

Re: [Pacemaker] Problem : By colocations limitation, the resource appointment of the combination does not become effective.

2010-04-19 Thread renayama19661014
Hi Andrew, Thank you for comment. But, does not the problem of the next email recur when I change it in INFINITY? * http://www.gossamer-threads.com/lists/linuxha/pacemaker/60342 No, as I previously explained: By an answer before you, pingd moves well. However, does setting of

Re: [Pacemaker] Problem : By colocations limitation, the resource appointment of the combination does not become effective.

2010-04-20 Thread renayama19661014
Hi Andrew, Yes, because the -INFINITY + INFINITY = -INFINITY and therefore the node won't be allowed to host the service. Thank you for your comment. It seems my worry was unnecessary. The initial placement of the resource went well, too. I will test the behavior with various patterns. Best
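
The score arithmetic quoted in the message above (-INFINITY + INFINITY = -INFINITY) can be illustrated with a minimal Python sketch. It assumes Pacemaker's documented convention that INFINITY is 1,000,000 and that -INFINITY outweighs everything else. The helper name add_scores is hypothetical and is not Pacemaker's own code.

```python
# Illustrative only: a sketch of Pacemaker-style score addition.
# Assumption: INFINITY is 1,000,000, as in the Pacemaker documentation.
INFINITY = 1_000_000


def add_scores(a: int, b: int) -> int:
    """Add two scores with saturation (hypothetical helper)."""
    # -INFINITY wins over everything, including +INFINITY.
    if a <= -INFINITY or b <= -INFINITY:
        return -INFINITY
    # Otherwise +INFINITY absorbs any finite score.
    if a >= INFINITY or b >= INFINITY:
        return INFINITY
    # Finite scores add normally, clamped to the +/-INFINITY range.
    return max(-INFINITY, min(INFINITY, a + b))


if __name__ == "__main__":
    # A node banned by a -INFINITY location rule stays banned even if a
    # colocation constraint contributes +INFINITY.
    print(add_scores(-INFINITY, INFINITY))
```

Under these assumptions, a node excluded by a -INFINITY rule (for example the pingd exclude rule discussed in this thread) cannot be made eligible again by a +INFINITY colocation score.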

Re: [Pacemaker] About influence of resouce-stickiness which used colocation for limitation.

2010-04-23 Thread renayama19661014
Hi Andrew, Fixed in: http://hg.clusterlabs.org/pacemaker/1.1/rev/4c775a4abc87 Thanks! Best Regards, Hideo Yamauchi. --- Andrew Beekhof and...@beekhof.net wrote: Fixed in: http://hg.clusterlabs.org/pacemaker/1.1/rev/4c775a4abc87 2010/4/22 renayama19661...@ybb.ne.jp: Hi, We

Re: [Pacemaker] About influence of resouce-stickiness which used colocation for limitation.

2010-04-25 Thread renayama19661014
Hi Andrew, Version 1.0 is necessary for us. Please backport your revision to version 1.0. Best Regards, Hideo Yamauchi. --- renayama19661...@ybb.ne.jp wrote: Hi Andrew, Fixed in: http://hg.clusterlabs.org/pacemaker/1.1/rev/4c775a4abc87 Thanks! Best Regards, Hideo Yamauchi.

Re: [Pacemaker] About influence of resouce-stickiness which used colocation for limitation.

2010-04-27 Thread renayama19661014
Hi Andrew, Done. http://hg.clusterlabs.org/pacemaker/stable-1.0/rev/f7da9d09ebd2 It seems to move with your patch definitely. But, the following error is reflected on log. Does not this error have any problem? Apr 27 16:37:00 srv01 pengine: [5839]: ERROR: native_merge_weights:

Re: [Pacemaker] About influence of resouce-stickiness which used colocation for limitation.

2010-04-27 Thread renayama19661014
Hi Andrew, Oh, that was some development logging I forgot to remove. I'll backport that fix in a moment too. All right. Thanks! Best Regards, Hideo Yamauchi. --- Andrew Beekhof and...@beekhof.net wrote: On Tue, Apr 27, 2010 at 9:42 AM, renayama19661...@ybb.ne.jp wrote: Hi Andrew,

[Pacemaker] [Problem] A fail count is up by a postponed monitor.

2010-05-11 Thread renayama19661014
Hi, In a recent test of Pacemaker, the following problem happened. * corosync 1.2.1 * Pacemaker-1-0-8463260ff667 * Reusable-Cluster-Components-c447fc25e119 * Cluster-Resource-Agents-f92935082277 The problem is that a monitor error of the prmFsPostgreSQLDB3-2 resource that stopped

[Pacemaker] An error of log_data_element is noisy.

2010-05-13 Thread renayama19661014
Hi, In the latest 1.0 version, an error is written to the log. The behavior has no problem, but the log is very noisy. (snip) May 13 16:21:34 srv01 cib: [24342]: ERROR: log_data_element: cib_config_changed: Diff /lrm May 13 16:21:34 srv01 cib: [24342]: ERROR: log_data_element:

Re: [Pacemaker] An error of log_data_element is noisy.

2010-05-13 Thread renayama19661014
Fixed: http://hg.clusterlabs.org/pacemaker/stable-1.0/rev/94bf2cc9219b Thanks. Hideo Yamauchi. --- Andrew Beekhof and...@beekhof.net wrote: Fixed: http://hg.clusterlabs.org/pacemaker/stable-1.0/rev/94bf2cc9219b On Thu, May 13, 2010 at 9:26 AM, renayama19661...@ybb.ne.jp wrote:

Re: [Pacemaker] [Problem] A fail count is up by a postponed monitor.

2010-05-16 Thread renayama19661014
Hi Andrew, The next patch seems to influence the cause of this problem. * http://hg.linux-ha.org/glue/rev/3112dd90ecd8 Best Regards, Hideo Yamauchi. --- renayama19661...@ybb.ne.jp wrote: Hi Andrew, I registered this problem with Bugzilla. *

[Pacemaker] A mistake of the log output.

2010-06-10 Thread renayama19661014
Hi, There seems to be an error in the log output of the source of Pacemaker. void clone_expand(resource_t *rsc, pe_working_set_t *data_set) { clone_variant_data_t *clone_data = NULL; get_clone_variant_data(clone_data, rsc); crm_err("Processing actions from %s", rsc->id);

Re: [Pacemaker] A mistake of the log output.

2010-06-10 Thread renayama19661014
Hi Andrew, Thanks! Another one... Not a great problem. The next if is the same. static gboolean determine_online_status_no_fencing(pe_working_set_t *data_set, xmlNode * node_state, node_t *this_node) { (snip) if(!crm_is_true(ccm_state) || safe_str_eq(ha_state, DEADSTATUS)){

[Pacemaker] [Problem]Cib cannot update an attribute by 16 node constitution.

2010-06-13 Thread renayama19661014
We tested a 16-node configuration (15+1) with the following procedure. Step1) Start 16 nodes. Step2) Send the cib after the DC node is decided. An error occurs when the pingd attribute is updated after probe processing is over.

Re: [Pacemaker] [Problem]Cib cannot update an attribute by 16 node constitution.

2010-06-14 Thread renayama19661014
Hi Andrew, Thank you for comment. More likely of the underlying messaging infrastructure, but I'll take a look. Perhaps the default cib operation timeouts are too low for larger clusters. I attached the log to the following Bugzilla entry. *

[Pacemaker] [PATCH]Omitted STONITH of useless broadcast.(only 2 nodes configuration)

2010-07-06 Thread renayama19661014
Hi All, I wrote a patch that applies only to a two-node configuration. When STONITH fails in a two-node configuration, the STONITH request to the other node is useless, because STONITH can be carried out only from one node. We should omit the

Re: [Pacemaker] [PATCH]Omitted STONITH of useless broadcast.(only 2 nodes configuration)

2010-07-06 Thread renayama19661014
Hi Andrew, Thank you for your comment. stonithd should also have access to the membership list, so I don't think clients like the crmd need to be involved here. My understanding may be wrong. When stonithd accesses memberlist, stonithd can know the node configuration. It is from memberlist and does

Re: [Pacemaker] [PATCH]Omitted STONITH of useless broadcast.(only 2 nodes configuration)

2010-07-07 Thread renayama19661014
Hi Andrew, Thank you for comment. It should be able to calculate the expected number of votes/replies from the ais/heartbeat membership list directly. The crmd shouldn't need to pass that info in IMHO. OK. I think about a patch of the form to acquire memberlist from stonithd. Best Regards,

[Pacemaker] [PATCH]The changing of the log level of pengine process.

2010-07-28 Thread renayama19661014
Hi All, Our user showed a demand in a level of log output after handling of pengine. When STONITH is carried out, pengine wants to output log at a warning level if a repeating resource is only an STONITH resource. Because plural STONITH may be started when STONITH is carried out. However, it

[Pacemaker] [Problem]The problem of the combination of Pacemaker and corosync1.2.7.

2010-08-01 Thread renayama19661014
Hi, I checked the behavior of corosync 1.2.7 combined with Pacemaker. The combination is as follows. * corosync 1.2.7 * Pacemaker-1-0-74392a28b7f3.tar * Cluster-Resource-Agents-bfcc4e050a07.tar * Reusable-Cluster-Components-8286b46c91e3.tar I checked the following behavior on two nodes of a

Re: [Pacemaker] [Problem]The problem of the combination of Pacemaker and corosync1.2.7.

2010-08-03 Thread renayama19661014
Hi Vladislav, Thank you for your comment. This is probably connected to http://marc.info/?l=openais&m=127977785007234&w=2 Steven promised to look at that issue after his vacation. I will wait for Steven's fix. Meanwhile, I will use Pacemaker 1.1, as Andrew recommended. Best Regards, Hideo

Re: [Pacemaker] [Problem]The problem of the combination of Pacemaker and corosync1.2.7.

2010-08-03 Thread renayama19661014
Hi Andrew, Thank you for comment. No need to wait, the current tip of Pacemaker 1.1 is perfectly stable (and included for RHEL6.0). Almost all the testing has been done for 1.1.3, I've just been busy helping out with some other projects at Red Hat and haven't had time to do the actual

[Pacemaker] [PATCH]A redundant if sentence.

2010-08-03 Thread renayama19661014
Hi, This is a patch for a redundant if statement in pengine. void unpack_operation( action_t *action, xmlNode *xml_obj, pe_working_set_t* data_set) { (snip) if(safe_str_eq(class, "stonith")) { action->needs = rsc_req_nothing; value = "nothing (fencing

[Pacemaker] [Problem]A compilation error of Pacemaker1.1.

2010-08-03 Thread renayama19661014
Hi, I compiled Pacemaker 1.1, but the following error happened. [r...@srv01 Pacemaker-1-1-5ce5b34cf3ab]# export PREFIX=/usr;export LCRSODIR=$PREFIX/libexec/lcrso;export CLUSTER_USER=hacluster;export CLUSTER_GROUP=haclient [r...@srv01 Pacemaker-1-1-5ce5b34cf3ab]# ./autogen.sh ./configure

Re: [Pacemaker] [Problem]A compilation error of Pacemaker1.1.

2010-08-04 Thread renayama19661014
Hi Andrew, Fixed: http://hg.clusterlabs.org/pacemaker/1.1/rev/6bad7c6bbe7d Thanks! Best Regards, Hideo Yamauchi. --- Andrew Beekhof and...@beekhof.net wrote: On Wed, Aug 4, 2010 at 10:08 AM, Andrew Beekhof and...@beekhof.net wrote: On Wed, Aug 4, 2010 at 6:34 AM,

Re: [Pacemaker] [Problem]A compilation error of Pacemaker1.1.

2010-08-04 Thread renayama19661014
Hi Andrew, Let me ask you a question. I deleted service from corosync.conf. I started the pacemaker service after starting corosync. However, the two nodes do not form a cluster. * srv01 [r...@srv01 ~]# ps -ef | grep heartbeat root 4 0 0 10:25 ?00:00:00

Re: [Pacemaker] [Problem]A compilation error of Pacemaker1.1.

2010-08-05 Thread renayama19661014
Hi Andrew, Thank you for comment. Sorry, I gave you the wrong information yesterday. Apparently working on GUIs for two weeks is enough to rot one's brain ;-) Since I've also been talking to other people about this, I've taken the time to write it up here:

[Pacemaker] A demand for the expected votes indication and a question.

2010-08-05 Thread renayama19661014
Hi, Our user uses corosync and Pacemaker. Last updated: Fri Aug 6 13:25:37 2010 Stack: openais Current DC: srv01 - partition with quorum Version: 1.1.2-230655711dc7b8579747ddeafc6f39247f8e87fc 3 Nodes configured, 3 expected votes 1 Resources configured. Online: [

[Pacemaker] About specifications of on-fail=block.

2010-08-08 Thread renayama19661014
Hi, Let me confirm the specification of on-fail=block. I configured the following cluster. Last updated: Mon Aug 9 11:18:29 2010 Stack: openais Current DC: srv01 - partition with quorum Version: 1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b 2 Nodes configured, 2 expected

Re: [Pacemaker] About specifications of on-fail=block.

2010-08-13 Thread renayama19661014
Hi, I compared the behavior across Pacemaker versions for this problem. * 1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b [r...@srv02 ~]# crm_mon -1 Last updated: Fri Aug 13 13:02:24 2010 Stack: openais Current DC: srv01 - partition with quorum Version:

Re: [Pacemaker] [PATCH]The changing of the log level of pengine process.

2010-08-26 Thread renayama19661014
Hi Andrew, Thank you for your comment. Why not simply remove the if(was_processing_error) block? It's just a summary message, the place that set was_processing_error will also have logged an error. Does this mean removing the following code? - if(was_processing_error) { -

Re: [Pacemaker] A demand for the expected votes indication and a question.

2010-08-26 Thread renayama19661014
Hi Andrew, crm_mon shouldn't really display expected votes for heartbeat clusters... they're not used in any way when heartbeat is in use. expected votes is only relevant for ver: 0 of the pacemaker/corosync plugin. in the future pacemaker will obtain quorum information directly from

Re: [Pacemaker] About specifications of on-fail=block.

2010-08-26 Thread renayama19661014
Hi Andrew, I registered this problem on Bugzilla. * http://developerbugs.linux-foundation.org/show_bug.cgi?id=2476 Best Regards, Hideo Yamauchi. --- renayama19661...@ybb.ne.jp wrote: Hi, I compared the behavior across Pacemaker versions for this problem. *

Re: [Pacemaker] [PATCH]The changing of the log level of pengine process.

2010-09-01 Thread renayama19661014
Hi Andrew, Thank you for your comment. We discussed this matter a little. We are withdrawing the proposed change to the log output for the moment. Best Regards, Hideo Yamauchi. --- Andrew Beekhof and...@beekhof.net wrote: On Fri, Aug 27, 2010 at 3:03 AM, renayama19661...@ybb.ne.jp wrote:

[Pacemaker] A patch of crm_mon for the trouble actions.

2010-09-12 Thread renayama19661014
Hi, I am contributing a patch for the crm_mon command. It changes crm_mon so that failed actions are not displayed when a node is offline and was shut down. Please review the patch and, if there is no problem, include this change in the development version. diff -r 9b95463fde99 tools/crm_mon.c ---

Re: [Pacemaker] A patch of crm_mon for the trouble actions.

2010-09-13 Thread renayama19661014
Hi Andrew, Thank you for comment. I assume this is for the stonith-enabled=true case, since offline nodes are ignored for stonith-enabled=false. Once the node is shot, then its status section is erased and no failed actions will be shown... so why do we need this patch? I know that trouble

Re: [Pacemaker] A patch of crm_mon for the trouble actions.

2010-09-13 Thread renayama19661014
Hi Andrew, Thank you for comment. Thanks for the explanation, I think you're right that we shouldn't be showing these failed actions. I think we want to do it in the PE though, eg. stop them from making it into the failed_ops list in the first place. Does your answer mean that the next

Re: [Pacemaker] About Quorum control at the time of the service stop.(no-quorum-policy=freeze)

2010-09-13 Thread renayama19661014
Hi Andrew, Thank you for comment. As a conclusion in case of the freeze setting * At the divided point in time, the resource maintains it. * When a node shuts it down, in divided constitution, the resource does migrate. - Maintaining a resource in divided constitution. Is my

Re: [Pacemaker] About Quorum control at the time of the service stop.(no-quorum-policy=freeze)

2010-09-14 Thread renayama19661014
Hi Andrew, I'd probably summarize it as: resources are frozen to their current _partition_ They can only move around within their partition. So if the partition does not have quorum and * a node shuts down, the partition can reallocate any services on that node, but * a node

Re: [Pacemaker] A patch of crm_mon for the trouble actions.

2010-09-14 Thread renayama19661014
Hi Andrew, Perfect. Pushed. Thanks! http://hg.clusterlabs.org/pacemaker/1.1/rev/d932da0b886b Thanks!! Hideo Yamauchi. --- Andrew Beekhof and...@beekhof.net wrote: Perfect. Pushed. Thanks! http://hg.clusterlabs.org/pacemaker/1.1/rev/d932da0b886b 2010/9/14

[Pacemaker] [Problem or Enhancement]When attrd reboots, a fail count is initialized.

2010-09-26 Thread renayama19661014
Hi, While investigating another problem, I discovered this behavior. If attrd fails and does not restart, the problem does not occur. Step 1) After startup, cause a monitor error in UmIPaddr twice. Online: [ srv01 srv02 ] Resource Group: UMgroup01 UmVIPcheck

Re: [Pacemaker] About behavior in Action Lost.

2010-09-28 Thread renayama19661014
Hi Andrew, Pushed as: http://hg.clusterlabs.org/pacemaker/1.1/rev/8433015faf18 Not sure about applying to 1.0 though, it's a dramatic change in behavior. The change at this link cannot be found. Where did you update it? Best Regards, Hideo Yamauchi. --- Andrew Beekhof and...@beekhof.net
