Hi,
I am going to test the combination of Pacemaker and OpenAIS.
Following the page (http://www.clusterlabs.org/wiki/Install), I built the
environment from source.
I used the following source versions.
*
Hi,
We have now begun investigating a migration to the combination of Pacemaker and
corosync/openais.
We combined Pacemaker with openais (whitetank) and verified the behavior when a
process fails.
(With Heartbeat, this is the case where an emergency reboot occurred.)
I
Hi Andrew,
If the crmd dies, then (IIRC) the lrmd cancels all existing resource
monitoring.
However, when the crmd is recovered, it should setup the resource
monitoring again.
Is the second part not happening?
There are several patterns to this problem.
When crmd/stonithd restarts on an ACT
Hi Andrew,
Can you open a bug for that.
I suspect the lrmd might be doing the wrong thing, but assign it to
pacemaker until I can prove that :-)
All right.
Should I register the problem in Bugzilla?
# I also think the problem is in the lrmd.
Best Regards,
Hideo Yamauchi.
---
Hi Andrew,
Yes please.
http://developerbugs.linux-foundation.org/show_bug.cgi?id=2161
Best Regards,
Hideo Yamauchi.
--- Andrew Beekhof and...@beekhof.net wrote:
2009/7/23 renayama19661...@ybb.ne.jp:
Hi Andrew,
Can you open a bug for that.
I suspect the lrmd might be doing the wrong
Hi Andrew,
When I do not specify target_role, the error does not occur.
Is my usage wrong?
role=Stopped needs to occur after the meta keyword, not as part of the
operation
I was wrong.
This target_role was for the MASTER/SLAVE specification.
Thanks,
Hideo Yamauchi.
--- Andrew Beekhof
Hi,
We examined a way to restart a resource's monitor after stopping it, in an
environment with the latest Pacemaker.
Of course we understand that monitors stop when the cluster is put into
maintenance mode.
However, our customer wants to stop a monitor individually without putting the
cluster into a
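For reference, the cluster-wide maintenance mode mentioned above is a single
property in the crm shell (a sketch; it disables management and monitoring of
all resources at once, which is exactly what the customer wants to avoid here):

```shell
# Put the whole cluster into maintenance mode: Pacemaker stops managing
# and monitoring every resource.
crm configure property maintenance-mode=true

# Leave maintenance mode again once the work is done.
crm configure property maintenance-mode=false
```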
Hi Andrew,
iirc, there you can also set enabled=false. Check out the schema
files. It may even be in the PDF
Thank you.
I tried the following command.
# cibadmin -R -X '<op id="prmDummy1-monitor" interval="10s" name="monitor"
on-fail="restart" timeout="60s" enabled="false"/>'
But, the monitor of
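The same per-operation disable can also be written in crm shell syntax (a
sketch; the resource name follows the thread, the ocf:heartbeat:Dummy agent is
an assumption, and `enabled=false` on the op is the attribute being tested
above):

```shell
# Define the primitive with its monitor op disabled; only this operation
# stops, the resource itself keeps running. Names/agent are hypothetical.
crm configure primitive prmDummy1 ocf:heartbeat:Dummy \
    op monitor interval=10s timeout=60s on-fail=restart enabled=false
```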
Hi Andrew,
Basically yes. Granted the name we used for the option could be better :-)
All right.
Thank you.
Best Regards,
Hideo Yamauchi.
--- Andrew Beekhof and...@beekhof.net wrote:
On Thu, Aug 27, 2009 at 10:01 AM, renayama19661...@ybb.ne.jp wrote:
Hi,
I asked about the setting of
Hi Andrew,
I still don't understand why you can't simply make them regular
cluster resources.
Can you please explain?
I understand your opinion.
I do understand that a simple respawn target can be migrated to an RA.
However, not all users' respawn targets can be migrated to an RA.
For
Hi Lars,
I understand that the method you showed can fully achieve this.
However, we wanted to handle respawn under the cluster software if possible.
We will also discuss an alternative method again.
Thank you.
Hideo Yamauchi.
--- Lars Ellenberg lars.ellenb...@linbit.com wrote:
On Mon, Sep
Hi Andrew,
I'm almost afraid to ask, why would someone need this capability?
In an environment migrated from Heartbeat version 1, I think users may
use respawn in that way.
We will also discuss an alternative method again.
Thank you.
Hideo Yamauchi.
--- Andrew
Hi Remi,
It appears that this is a similar problem to the one that I reported,
yes. It appears to not be a bug in Corosync, but rather one in
Pacemaker. This bug has been filed in Red Hat Bugzilla, see it at:
https://bugzilla.redhat.com/show_bug.cgi?id=525589
Perhaps you could add
Hi Steven,
All right.
Thank you.
Best Regards,
Hideo Yamauchi.
--- Steven Dake sd...@redhat.com wrote:
This bug is reported and we are working on a solution.
Regards
-steve
On Mon, 2009-10-19 at 11:05 +0900, renayama19661...@ybb.ne.jp wrote:
Hi,
I understand that a combination
Hi Andrew,
Or are you looking for the ability to do more than simply prevent
resources from running?
Yes.
To guarantee the execution of the cluster's resources, we use
respawn.
We found a function that might replace respawn in the roadmap of
Hi Dejan,
What do you think about this matter?
Please tell me your opinion.
Best Regards,
Hideo Yamauchi.
--- renayama19661...@ybb.ne.jp wrote:
Hi,
We made the following mistakes:
* The host name is lowercase.
* The host name in the RULE is uppercase.
* The host name
Hi Dejan,
Are node names uppercase? And then stonith doesn't work?
The node name is lowercase.
stonith runs, but it is not carried out because the target node cannot be found.
Obviously this is caused by a configuration mistake.
However, some kind of safeguard is necessary for users.
Hi,
Thank you all for your comments.
If my understanding is correct, I will register this in Bugzilla as an
enhancement request.
Is my understanding right?
Best Regards,
Hideo Yamauchi.
--- Andrew Beekhof and...@beekhof.net wrote:
On Fri, Dec 11, 2009 at 11:28 AM, Lars Marowsky-Bree
Hi all,
Thank you all for your understanding.
Yes, though that means changing all plugins.
I can assist with revising the plugins.
Best Regards,
Hideo Yamauchi.
--- Dejan Muhamedagic deja...@fastmail.fm wrote:
Hi Lars,
On Tue, Dec 15, 2009 at 02:24:42PM +0100, Lars Ellenberg
Hi Dejan,
I registered.
http://developerbugs.linux-foundation.org/show_bug.cgi?id=2290
Best Regards,
Hideo Yamauchi.
--- Dejan Muhamedagic deja...@fastmail.fm wrote:
Hi Hideo-san,
On Tue, Dec 15, 2009 at 09:41:13AM +0900, renayama19661...@ybb.ne.jp wrote:
Hi,
Everybody thank you
Hi Dejan,
That would be great since I'm overwhelmed with other business now.
And please open a bugzilla if there isn't any so that we can
track this.
All right.
Please wait
Best Regards,
Hideo Yamauchi.
--- Dejan Muhamedagic deja...@fastmail.fm wrote:
Hi Hideo-san,
On Wed, Dec 16,
Hi Dejan,
I registered.
http://developerbugs.linux-foundation.org/show_bug.cgi?id=2292
Best Regards,
Hideo Yamauchi.
--- renayama19661...@ybb.ne.jp wrote:
Hi Dejan,
That would be great since I'm overwhelmed with other business now.
And please open a bugzilla if there isn't any so that
Hi,
We built a complicated three-node cluster (2 ACT + 1 STB).
We built the cluster with the following combination:
* corosync-1.1.2
* Reusable-Cluster-Components-fa44a169d55f
* Cluster-Resource-Agents-6f02f8ad7fd4
* Pacemaker-1-0-d990c453b999
We expected the group02-1 resource to start
Hi Andrew,
Thank you for comment.
You would be better off with a constraint like example 9.3 which will
exclude any unconnected node and leave the previous location scores
unchanged:
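In crm shell terms, the kind of constraint Andrew refers to (excluding any
unconnected node while leaving other location scores unchanged) might look like
this; the group and attribute names are hypothetical:

```shell
# Forbid grpPg from running on any node whose pingd attribute is undefined
# or 0, i.e. on nodes with no network connectivity. Other nodes' scores
# are left untouched.
crm configure location loc-grpPg-connected grpPg \
    rule -inf: not_defined pingd or pingd lte 0
```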
Hi Andrew,
The problem was solved by the method you showed.
Thank you.
Hideo Yamauchi.
--- renayama19661...@ybb.ne.jp wrote:
Hi Andrew,
Thank you for comment.
You would be better off with a constraint like example 9.3 which will
exclude any unconnected node and leave the
Hi All,
Please give me advice about this problem.
Best Regards,
Hideo Yamauchi.
--- renayama19661...@ybb.ne.jp wrote:
Hi,
Step1) It started in three nodes as follows.
[r...@srv01 ~]# crm_mon -1
Last updated: Tue Jan 19 10:36:20 2010
Stack: openais
Current DC: srv01 -
Hi Andrew,
Thank you.
Thanks to your advice, the problem was solved.
My understanding of resource-stickiness was insufficient.
Best Regards,
Hideo Yamauchi.
--- Andrew Beekhof and...@beekhof.net wrote:
On Thu, Jan 28, 2010 at 12:13 PM, Andrew Beekhof and...@beekhof.net wrote:
2010/1/19
Hi Andrew,
Very odd. CTS (successfully) tests this condition all the time.
What does corosync.conf look like? Is there a firewall involved?
It _looks_ like the crmd isn't getting its own messages back.
My test environment is as follows.
* Single VM (RHEL5.4 32-bit) on ESXi.
* NIC x 3
* 1: A
Hi Andrew,
This problem occurred in our environment.
* RHEL5.4(x86) on Esxi 2node
* corosync 1.1.2
* Pacemaker-1-0-6f67420618b0.tar.gz
It sometimes occurs when I run test.sh for a long time.
[r...@srv01 ~]# ./test.sh
(snip)
scope=status name=master-vmrd-res: value=132
Multiple
Hi Andrew,
Is this a problem of libxml2 after all?
Is it necessary to update libxml2?
It seems so.
If I understood hj correctly, that was the only part he replaced in
order for it to function correctly.
Thank you for comment.
I will also update libxml2 and confirm it.
Best Regards,
Hideo
Hi Andrew,
The problem does not occur with a single ring.
It is strange that this problem does not happen in Pacemaker 1.0.6.
Could this be a problem in corosync?
Have you heard of any similar phenomenon?
Best Regards,
Hideo Yamauchi.
--- renayama19661...@ybb.ne.jp wrote:
Hi Andrew,
Could you
Hi Andrew,
Odd. You might want to report this to the openais guys.
All right.
The problem goes away when you just downgrade pacemaker to 1.0.6 (and
leave corosync at the same version)?
Sorry...
It was my mistake.
When I put corosync 1.2.0 and Pacemaker 1.0.6 together, the same problem
Hi Yan,
This problem still seems to occur occasionally.
The clone's orphaned resource is displayed extremely rarely.
I will confirm the conditions and report again.
Best Regards,
Hideo Yamauchi.
--- renayama19661...@ybb.ne.jp wrote:
Hi Yan,
Could you please try the attached patch?
Hi Yan,
Hi Andrew,
The problem does not reproduce.
I have tried quite a lot, but it does not readily reappear.
However, I found interesting output in the log from when the problem happened.
When the problem happened, the GUI showed the following display.
Hi Andrew,
Thank you for comment.
Actually, none of the included pe files seem to match the failure
case... can you supply the cib from that condition?
I am reporting the result of running cibadmin -Q at each stage of the problem.
Is this information (cibadmin -Q) enough?
Best Regards,
Hideo
Hi,
We cannot stop a specific clone resource with a crm_resource command.
Last updated: Wed Feb 17 16:56:26 2010
Stack: openais
Current DC: srv01 - partition with quorum
Version: 1.0.7-0d97eb2e69533c1352044394c88d7c05802a09a5
2 Nodes configured, 2 expected votes
2 Resources configured.
Hi Andrew,
Have you figured out the cause of this problem?
Please contact me if any other information is necessary for the
investigation.
Best Regards,
Hideo Yamauchi.
--- renayama19661...@ybb.ne.jp wrote:
Hi Andrew,
I confirmed it in the following environment.
Hi Andrew,
Can we stop one clone instance with a crm_resource command or a crm command?
Or is it necessary to set -INFINITY in a rule?
Right, a location rule
Or is it by design that a clone instance cannot be stopped individually?
Correct.
You can however do:
crm configure location
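Spelled out as a sketch in crm shell syntax (clone and node names are
hypothetical): a -INFINITY location rule stops the clone's instance on one
node only.

```shell
# Ban clnDummy from node srv01; the instance running there is stopped,
# while instances on other nodes are unaffected.
crm configure location loc-ban-clnDummy-srv01 clnDummy \
    rule -inf: '#uname' eq srv01
```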
Hi Yan,
Thank you for a reply.
The probability of reproducing the problem is not high in my environment,
but I will try your patch for now.
I will e-mail you the result.
Best Regards,
Hideo Yamauchi.
--- Yan Gao y...@novell.com wrote:
Hi Hideo,
On 02/08/10 15:35,
Hi Andrew,
Thank you for a reply.
I will test your patch.
Best Regards,
Hideo Yamauchi.
--- Andrew Beekhof and...@beekhof.net wrote:
2010/2/13 renayama19661...@ybb.ne.jp:
Hi Andrew,
I confirmed it in the following environment.
* CentOS 5.4 (on ESXi)
* Pacemaker-1-0-0bf7d14dd554
Hi Andrew,
Thank you for comment.
I registered a problem on Bugzilla.
* http://developerbugs.linux-foundation.org/show_bug.cgi?id=2359
Best Regards,
Hideo Yamauchi.
--- Andrew Beekhof and...@beekhof.net wrote:
2010/2/24 renayama19661...@ybb.ne.jp:
Hi,
I found a problem with expected
Hi All,
The quorum-policy setting does not take effect when we use the Pacemaker
development version (Pacemaker-1-0-93b87931206e.tar.gz).
The reason is that it adds and counts the votes of nodes that it has lost.
I offer a patch.
Because I call member_loop_fn
Hi,
We checked the stonithd log with a configuration in which the stonith
operation takes a long time.
When STONITH is carried out with such a configuration, the following error is
output to the log.
Jan 29 14:00:53 cgl60 stonithd: [7524]: ERROR:
Hi Dejan,
Anything not to upset the operator :) Similar patch applied.
Thanks.
BTW, can't recall seeing this error. Still not clear to me when
did you encounter it.
In the case of ssh/external, it seems to occur when I deliberately make the
status operation sleep for a slightly long time.
During
Hi Andrew,
This is normal for constraints with scores INFINITY.
Anything INFINITY is preferable but not mandatory
Sorry, my question was phrased badly.
As of STEP9, is there a setting so that the UMgroup01 resource does not
start?
I do not use the INFINITY setting in
Hi Dejan,
Andrew fixed it already. Sorry about that.
All right.
Thanks.
Hideo Yamauchi.
--- Dejan Muhamedagic deja...@fastmail.fm wrote:
Hi,
On Tue, Mar 09, 2010 at 04:31:22PM +0900, renayama19661...@ybb.ne.jp wrote:
Hi Dejan,
There seem to be some problems with the fix somehow
Hi Yan,
I updated the Japanese localization file for
Pacemaker-Python-GUI-14fd66fafbfa.tar.gz.
Please merge this patch into the latest GUI.
The following error occurs when building
Pacemaker-Python-GUI-14fd66fafbfa.tar.gz.
Please fix the error.
cc1: warnings being treated as errors
Hi Yan,
There are other two _fuzzy_ translations in the ja.po. You may want to
update them too:-) Thanks!
I'm sorry.
I am sending a patch again.
Prototypes of some functions have changed. You need to update to the
latest pacemaker.
I used Pacemaker from the following location.
*
Hi Yan,
You may want to try pacemaker 1.1:
http://hg.clusterlabs.org/pacemaker/1.1
All right. Thanks!!
Otherwise you could reverse the change:
http://hg.clusterlabs.org/pacemaker/pygui/rev/4dc8cb63f29b
Thanks!!
Best Regards,
Hideo Yamauchi.
--- Yan Gao y...@novell.com wrote:
On
Hi Yan,
In the version you told me about, I confirmed that the GUI displays in
Japanese.
I made a patch again and attached it.
Please merge this patch into the development version of the GUI.
Best Regards,
Hideo Yamauchi.
ja.po.20100316.patch
Hi,
I use the VirtualDomain RA to build a cluster on KVM.
However, a guest sometimes fails to start.
Mar 16 15:16:52 x3650e lrmd: [13457]: info: RA output:
(guest-kvm1:start:stderr) error: Failed to
start domain kvm1 error: internal error unable to start guest: inet_listen:
Hi Dejan,
IIRC, that port has to do with vnc and something else (another
VNC server?) has already been started on that port.
Thank you for comment.
I will examine it a little more.
Best Regards,
Hideo Yamauchi.
--- Dejan Muhamedagic deja...@fastmail.fm wrote:
Hi Hideo-san,
On Fri, Mar 19,
Hi Andrew,
Thank you for comment.
So if I can summarize, you're saying that clnUMdummy02 should not be
allowed to run on srv01 because the combined number of failures is 6
(and clnUMdummy02 is a non-unique clone).
And that the current behavior is that clnUMdummy02 continues to run.
Is
Hi Andrew,
Thank you for comment.
I was suggesting:
<rsc_colocation id="rsc_colocation01-3" rsc="UMgroup01"
    with-rsc="clnUMgroup01" score="INFINITY"/>
<rsc_location id="no-connectivity-01-1" rsc="UMgroup01">
  <rule id="clnPingd-exclude-rule" score="-INFINITY" boolean-op="or">
    <expression
Hi Andrew,
Let me ask you one more question.
Our real resource configuration is a little more complicated.
We colocate clones (clnG3dummy1, clnG3dummy2) that do not handle attribute
updates the way pingd does.
(snip)
<clone id="clnG3dummy1">
  <primitive class="ocf"
Hi Andrew,
Do you mean: why is the clone on srv01 always $clone:0 but on srv02
its sometimes $clone:0 and sometimes $clone:1 ?
yes.
I expected both nodes to behave the same way,
because it is globally-unique=false.
Best Regards,
Hideo Yamauchi.
--- Andrew Beekhof
Hi Andrew,
globally-unique=false means that :0 and :1 are actually the same resource.
its perfectly valid for entries for both to exist on the node, but the
PE should fold them together internally.
in most ways it does, just not for failures (yet).
Thank you for comment.
Some we were
Hi Andrew,
Are you busy?
Please give me an answer to my question.
Best Regards,
Hideo Yamauchi.
--- renayama19661...@ybb.ne.jp wrote:
Hi Andrew,
Let me ask you one more question.
Our real resource configuration is a little more complicated.
We colocate clones (clnG3dummy1,
Hi Andrew,
We want resources to start in the following order:
1) clnPingd, clnG3dummy1, clnG3dummy2, clnUMgroup01 (all resources start)
- UMgroup01 starts
* And the resource moves if one clone instance stops.
2) clnPingd, clnG3dummy1, clnG3dummy2 (all resources start) -
OVDBgroup02-1
Hi Andrew,
Thank you for comment.
But doesn't the problem from the following email recur when I change it to
INFINITY?
* http://www.gossamer-threads.com/lists/linuxha/pacemaker/60342
No, as I previously explained:
With your earlier answer, pingd works well.
However, does the setting of
Hi Andrew,
Yes, because the -INFINITY + INFINITY = -INFINITY and therefore the
node wont be allowed to host the service.
Thank you for comment.
My worry was unfounded after all.
The initial placement of the resources went well, too.
I will test the behavior with various patterns.
Best
Hi Andrew,
Fixed in:
http://hg.clusterlabs.org/pacemaker/1.1/rev/4c775a4abc87
Thanks!
Best Regards,
Hideo Yamauchi.
--- Andrew Beekhof and...@beekhof.net wrote:
Fixed in:
http://hg.clusterlabs.org/pacemaker/1.1/rev/4c775a4abc87
2010/4/22 renayama19661...@ybb.ne.jp:
Hi,
We
Hi Andrew,
We need version 1.0.
Please backport your fix to version 1.0.
Best Regards,
Hideo Yamauchi.
--- renayama19661...@ybb.ne.jp wrote:
Hi Andrew,
Fixed in:
http://hg.clusterlabs.org/pacemaker/1.1/rev/4c775a4abc87
Thanks!
Best Regards,
Hideo Yamauchi.
Hi Andrew,
Done.
http://hg.clusterlabs.org/pacemaker/stable-1.0/rev/f7da9d09ebd2
It seems to work correctly with your patch.
But the following error appears in the log.
Is this error a problem?
Apr 27 16:37:00 srv01 pengine: [5839]: ERROR: native_merge_weights:
Hi Andrew,
Oh, that was some development logging I forgot to remove.
I'll backport that fix in a moment too.
All right.
Thanks!
Best Regards,
Hideo Yamauchi.
--- Andrew Beekhof and...@beekhof.net wrote:
On Tue, Apr 27, 2010 at 9:42 AM, renayama19661...@ybb.ne.jp wrote:
Hi Andrew,
Hi,
A little while ago, in a test of Pacemaker, the following problem happened.
* corosync 1.2.1
* Pacemaker-1-0-8463260ff667
* Reusable-Cluster-Components-c447fc25e119
* Cluster-Resource-Agents-f92935082277
The problem is that the monitor error of the stopped prmFsPostgreSQLDB3-2
resource
Hi,
In the latest 1.0 version, an error appears in the log.
The behavior itself has no problem, but it is very noisy.
(snip)
May 13 16:21:34 srv01 cib: [24342]: ERROR: log_data_element:
cib_config_changed: Diff /lrm
May 13 16:21:34 srv01 cib: [24342]: ERROR: log_data_element:
Fixed:
http://hg.clusterlabs.org/pacemaker/stable-1.0/rev/94bf2cc9219b
Thanks.
Hideo Yamauchi.
--- Andrew Beekhof and...@beekhof.net wrote:
Fixed:
http://hg.clusterlabs.org/pacemaker/stable-1.0/rev/94bf2cc9219b
On Thu, May 13, 2010 at 9:26 AM, renayama19661...@ybb.ne.jp wrote:
Hi Andrew,
The following patch seems to be related to the cause of this problem.
* http://hg.linux-ha.org/glue/rev/3112dd90ecd8
Best Regards,
Hideo Yamauchi.
--- renayama19661...@ybb.ne.jp wrote:
Hi Andrew,
I registered this problem with Bugzilla.
*
Hi,
There seems to be erroneous log output in the Pacemaker source.
void clone_expand(resource_t *rsc, pe_working_set_t *data_set)
{
    clone_variant_data_t *clone_data = NULL;
    get_clone_variant_data(clone_data, rsc);
    crm_err("Processing actions from %s", rsc->id);
Hi Andrew,
Thanks!
Another one...
Not a great problem. The next if is the same.
static gboolean
determine_online_status_no_fencing(pe_working_set_t *data_set,
                                   xmlNode *node_state, node_t *this_node)
{
    (snip)
    if(!crm_is_true(ccm_state) || safe_str_eq(ha_state, DEADSTATUS)) {
We tested a 16-node configuration (15+1).
We carried out the following procedure.
Step1) Start 16 nodes.
Step2) Load the cib after the DC node was decided.
An error occurs on updating the pingd attribute after probe processing
finished.
Hi Andrew,
Thank you for comment.
More likely of the underlying messaging infrastructure, but I'll take a look.
Perhaps the default cib operation timeouts are too low for larger clusters.
I attached the log to the following Bugzilla entry.
*
Hi All,
I wrote a patch.
This patch is limited to two-node configurations.
When STONITH fails in a two-node configuration, the STONITH request to the
other node is useless,
because STONITH can only be carried out from one node.
We should omit the
Hi Andrew,
Thank you for comment.
stonithd should also have access to the memebership list, so i dont
think clients like the crmd need to be involved here.
My understanding may be wrong.
When stonithd accesses the membership list, stonithd can know the node
configuration.
It is from the membership list and
Hi Andrew,
Thank you for comment.
It should be able to calculate the expected number of votes/replies
from the ais/heartbeat membership list directly.
The crmd shouldn't need to pass that info in IMHO.
OK.
I will think about a patch in which stonithd acquires the membership list.
Best Regards,
Hi All,
Our user made a request about the log level of pengine output.
When STONITH is carried out, we want pengine to log at warning level if the
only restarting resource is a STONITH resource,
because multiple STONITH resources may be started when STONITH is carried out.
However, it
Hi,
I verified the behavior of corosync 1.2.7 combined with Pacemaker.
The combination is as follows:
* corosync 1.2.7
* Pacemaker-1-0-74392a28b7f3.tar
* Cluster-Resource-Agents-bfcc4e050a07.tar
* Reusable-Cluster-Components-8286b46c91e3.tar
I verified the following behavior with two nodes of a
Hi Vladislav,
Thank you for comment.
This is probably connected to
http://marc.info/?l=openaism=127977785007234w=2
Steven promised to look at that issue after his vacation.
I will wait for Steven's fix.
Meanwhile, I will use Pacemaker 1.1, as Andrew recommends.
Best Regards,
Hideo
Hi Andrew,
Thank you for comment.
No need to wait, the current tip of Pacemaker 1.1 is perfectly stable
(and included for RHEL6.0).
Almost all the testing has been done for 1.1.3, I've just been busy
helping out with some other projects at Red Hat and haven't had time
to do the actual
Hi,
This is a patch for a redundant if statement in pengine.
void
unpack_operation(
    action_t *action, xmlNode *xml_obj, pe_working_set_t *data_set)
{
    (snip)
    if(safe_str_eq(class, "stonith")) {
        action->needs = rsc_req_nothing;
        value = "nothing (fencing
Hi,
I compiled Pacemaker 1.1.
But the following error happened.
[r...@srv01 Pacemaker-1-1-5ce5b34cf3ab]# export PREFIX=/usr;export
LCRSODIR=$PREFIX/libexec/lcrso;export CLUSTER_USER=hacluster;export
CLUSTER_GROUP=haclient
[r...@srv01 Pacemaker-1-1-5ce5b34cf3ab]# ./autogen.sh ./configure
Hi Andrew,
Fixed:
http://hg.clusterlabs.org/pacemaker/1.1/rev/6bad7c6bbe7d
Thanks!
Best Regards,
Hideo Yamauchi.
--- Andrew Beekhof and...@beekhof.net wrote:
On Wed, Aug 4, 2010 at 10:08 AM, Andrew Beekhof and...@beekhof.net wrote:
On Wed, Aug 4, 2010 at 6:34 AM,
Hi Andrew,
Let me ask you a question.
I deleted the service section from corosync.conf.
I started the pacemaker service after starting corosync.
However, the two nodes do not form a cluster.
* srv01
[r...@srv01 ~]# ps -ef | grep heartbeat
root 4 0 0 10:25 ?00:00:00
Hi Andrew,
Thank you for comment.
Sorry, I gave you the wrong information yesterday.
Apparently working on GUIs for two weeks is enough to rot one's brain ;-)
Since I've also been talking to other people about this, I've taken
the time to write it up here:
Hi,
Our user uses corosync and Pacemaker.
Last updated: Fri Aug 6 13:25:37 2010
Stack: openais
Current DC: srv01 - partition with quorum
Version: 1.1.2-230655711dc7b8579747ddeafc6f39247f8e87fc
3 Nodes configured, 3 expected votes
1 Resources configured.
Online: [
Hi,
Let me confirm the specification of on-fail=block.
I built the following cluster.
Last updated: Mon Aug 9 11:18:29 2010
Stack: openais
Current DC: srv01 - partition with quorum
Version: 1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b
2 Nodes configured, 2 expected
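For context, an on-fail=block monitor might be configured like this in crm
shell (a sketch; the resource name and agent are hypothetical). With block,
a failed monitor freezes the resource in place instead of triggering recovery:

```shell
# On monitor failure the resource is left as-is (unmanaged) rather than
# being stopped and restarted.
crm configure primitive prmDummy ocf:heartbeat:Dummy \
    op monitor interval=10s on-fail=block
```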
Hi,
I compared the behavior between pacemaker versions for this problem.
* 1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b
[r...@srv02 ~]# crm_mon -1
Last updated: Fri Aug 13 13:02:24 2010
Stack: openais
Current DC: srv01 - partition with quorum
Version:
Hi Andrew,
Thank you for comment.
Why not simply remove the if(was_processing_error) block?
Its just a summary message, the place that set was_processing_error
will also have logged an error.
Does this mean removing the following code?
- if(was_processing_error) {
-
Hi Andrew,
crm_mon shouldn't really display expected votes for heartbeat
clusters... they're not used in any way when heartbeat is in use.
expected votes is only relevant for ver: 0 of the pacemaker/corosync plugin.
in the future pacemaker will obtain quorum information directly from
Hi Andrew,
I registered this problem on Bugzilla.
* http://developerbugs.linux-foundation.org/show_bug.cgi?id=2476
Best Regards,
Hideo Yamauchi.
--- renayama19661...@ybb.ne.jp wrote:
Hi,
I compared the behavior between pacemaker versions for this problem.
*
Hi Andrew,
Thank you for comment.
We discussed this matter a little.
We will withdraw the log-output fix for the moment.
Best Regards,
Hideo Yamauchi.
--- Andrew Beekhof and...@beekhof.net wrote:
On Fri, Aug 27, 2010 at 3:03 AM, renayama19661...@ybb.ne.jp wrote:
Hi,
I am contributing a patch for the crm_mon command.
It is revised so that when a node is offline due to shutdown, its failed
actions are not displayed.
Please review the patch.
If there is no problem, please merge this fix into the development version.
diff -r 9b95463fde99 tools/crm_mon.c
---
Hi Andrew,
Thank you for comment.
I assume this is for the stonith-enabled=true case, since offline
nodes are ignored for stonith-enabled=false.
Once the node is shot, then its status section is erased and no failed
actions will be shown... so why do we need this patch?
I know that trouble
Hi Andrew,
Thank you for comment.
Thanks for the explanation, I think you're right that we shouldn't be
showing these failed actions.
I think we want to do it in the PE though, eg. stop them from making
it into the failed_ops list in the first place.
Does your answer mean that the next
Hi Andrew,
Thank you for comment.
My conclusion about the freeze setting:
* At the point of partition, resources stay where they are.
* When a node shuts down in the partitioned configuration, its resources do
migrate.
- Rather than being kept in place in the partitioned configuration.
Is my
Hi Andrew,
I'd probably summarize it as:
resources are frozen to their current _partition_
They can only move around within their partition. So if the partition
does not have quorum and
* a node shuts down, the partition can reallocate any services on
that node, but
* a node
Hi Andrew,
Perfect. Pushed. Thanks!
http://hg.clusterlabs.org/pacemaker/1.1/rev/d932da0b886b
Thanks!!
Hideo Yamauchi.
--- Andrew Beekhof and...@beekhof.net wrote:
Perfect. Pushed. Thanks!
http://hg.clusterlabs.org/pacemaker/1.1/rev/d932da0b886b
2010/9/14
Hi,
While investigating another problem, I discovered this phenomenon.
If the attrd process fails and does not restart, the problem does not
occur.
Step1) After startup, cause a monitor error in UmIPaddr twice.
Online: [ srv01 srv02 ]
Resource Group: UMgroup01
UmVIPcheck
Hi Andrew,
Pushed as:
http://hg.clusterlabs.org/pacemaker/1.1/rev/8433015faf18
Not sure about applying to 1.0 though, its a dramatic change in behavior.
The change at this link cannot be found.
Where did you push it?
Best Regards,
Hideo Yamauchi.
--- Andrew Beekhof and...@beekhof.net