Hi All,
The crm_attribute command expands the value of the OCF_RESOURCE_INSTANCE
environment variable when the -p option is not specified.
However, if -INFINITY is specified as the value of the -v option in that case,
crm_attribute will incorrectly parse -INFINITY as an option
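This is a classic option-parsing pitfall: a value beginning with "-" looks like an option string. The sketch below is illustrative only (it is not crm_attribute's actual parser); Python's argparse exhibits the same class of failure, and shows the usual `-v=value` workaround.

```python
# Illustration only, not crm_attribute's real code: a value starting with
# "-" is mis-read as an option by many CLI parsers.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("-v", dest="value")

# "-INFINITY" looks like an option, so it is not accepted as -v's value:
try:
    parser.parse_args(["-v", "-INFINITY"])
except SystemExit:
    print("rejected: -INFINITY was treated as an option")

# Attaching the value to the flag removes the ambiguity:
args = parser.parse_args(["-v=-INFINITY"])
print(args.value)  # -INFINITY
```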
Hi Ken,
Hi All,
The problem that occurred in RHEL8.4 also occurs in the pacemaker bundled with
RHEL8.7beta.
(snip)
Oct 06 11:40:55 pgsql(pgsql)[17503]:WARNING: Retrying(remain 86115).
"crm_mon -1 --output-as=xml" failed. rc=102. stdout="
crm_mon: Error: cluster is not
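The "Retrying(remain ...)" warning above suggests a countdown retry loop around the status command. A hedged sketch of that pattern (the command, countdown, and messages are illustrative, not the pgsql RA's actual code):

```python
# Illustrative retry loop: re-run a cluster status command until it
# succeeds or a countdown expires, as the pgsql RA's log output suggests.
import subprocess
import time

def wait_for_cluster(cmd=("crm_mon", "-1", "--output-as=xml"),
                     remain=86400, interval=1):
    while remain > 0:
        proc = subprocess.run(cmd, capture_output=True, text=True)
        if proc.returncode == 0:
            return proc.stdout          # cluster answered; hand back XML
        print(f"WARNING: Retrying(remain {remain}). rc={proc.returncode}")
        remain -= interval
        time.sleep(interval)
    raise RuntimeError("cluster did not become available")
```

If pacemakerd stops responding while shutting down (as discussed later in this thread), such a loop keeps retrying until the countdown runs out.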
Hi Ken,
Hi Klaus,
Thanks for your comment.
>We did not have time to get it into the RHEL 8.4 GA (general
>availability) release, which means for example it will not be in 8.4
>install images, but we did get a 0-day fix, which means that it will be
>available via "yum update" the same day that
Hi All,
Sorry...
Due to a mistake on my part, the same email was sent multiple times.
Best Regards,
Hideo Yamauchi.
- Original Message -
> From: "renayama19661...@ybb.ne.jp"
> To: Cluster Labs - All topics related to open-source clustering welcomed
> ; Cluster Labs - All topics
Hi Klaus,
Hi Ken,
We have confirmed that the operation is improved by the test.
Thank you for your prompt response.
We look forward to including this fix in the release version of RHEL 8.4.
Best Regards,
Hideo Yamauchi.
- Original Message -
> From: "renayama19661...@ybb.ne.jp"
> To:
Hi Klaus,
Hi Ken,
> I've opened https://github.com/ClusterLabs/pacemaker/pull/2342 with
> I guess the simplest possible solution to the immediate issue so
> that we can discuss it.
Thank you for the fix.
I have confirmed that the fixes have been merged.
I'll test this fix today just in
Hi Klaus,
Thanks for your comment.
> Hmm ... is that with selinux enabled?
> Respectively do you see any related avc messages?
SELinux is not enabled.
Could the crm_mon failure be caused by pacemakerd not returning a response
while it is preparing to stop?
pgsql needs the result of crm_mon in demote processing
Hi Ken,
Hi All,
In the pgsql resource, crm_mon is executed during demote and stop processing,
and its result is processed.
However, the pacemaker included in RHEL8.4beta fails to execute this crm_mon.
- The problem also occurs on github
master(c40e18f085fad9ef1d9d79f671ed8a69eb3e753f).
The
Hi Steffen,
I've been experimenting with it since last weekend, but I haven't been able to
reproduce the same situation.
The cause seems to be that the reproduction steps cannot be narrowed down.
Could you attach a log of the problem?
Best Regards,
Hideo Yamauchi.
- Original Message -
> From:
Hi Ulrich,
> So you were asking for a specific section of the CIB like "cibadmin -Q -o
> status"?
No.
There is no need for a specific section of the CIB.
Best Regards,
Hideo Yamauchi.
- Original Message -
> From: Ulrich Windl
> To: users@clusterlabs.org; renayama19661...@ybb.ne.jp
>
Hi Steffen,
> Unfortunately not sure about the exact scenario. But I have been doing
> some recent experiments with node standby/unstandby stop/start. This
> to get procedures right for updating node rpms etc.
>
> Later I noticed the uncomforting "pending fencing actions" status msg.
Okay!
Hi Steffen,
Hi Reid,
The fencing history is kept inside stonith-ng and is not written to the CIB.
However, having the entire CIB sent will help us reproduce the problem.
Best Regards,
Hideo Yamauchi.
- Original Message -
>From: Reid Wahl
>To:
Hi Steffen,
> Here CIB settings attached (pcs config show) for all 3 of my nodes
> (all 3 seems 100% identical), node03 is the DC.
Thank you for the attachment.
What is the scenario when this situation occurs?
In what steps did the problem appear when fencing was performed (or failed)?
Best
Hi Reid,
Hi Steffen,
> According to Steffen's description, the "pending" is displayed
> only on
> node 1, while the DC is node 3. That's another thing that makes me
> wonder if this is a distinct issue.
The problem may not be the same.
I think it's a good idea to have bugzilla or ML provide
Hi Steffen,
Hi Reid,
I also checked the CentOS source rpm, and it seems to include a fix for the
problem.
As Steffen suggested, if you share your CIB settings, I might learn something.
If this issue is the same as the one fixed, the pending action will only be
shown on the DC node and will not affect
Hi Steffen,
The fix pointed out by Reid is relevant here.
Because the fencing action requested by the DC node exists only on the DC node,
such an event occurs.
You will need to use the fixed pacemaker to resolve the issue.
Best Regards,
Hideo Yamauchi.
- Original Message
Hi Jan,
Hi Ken,
Thanks for your comment.
I am going to look a little more into the libqb problem.
Many thanks,
Hideo Yamauchi.
- Original Message -
> From: Ken Gaillot
> To: Cluster Labs - All topics related to open-source clustering welcomed
>
> Cc:
> Date: 2019/1/3, Thu
Hi All,
This problem occurred for one of our users.
The following problem occurred in a two-node cluster without STONITH
configured.
The problem seems to have occurred through the following procedure.
Step 1) Configure the cluster with 2 nodes. The DC node is the second node.
Step 2) Several resources are
Hi All,
In clusters that do not use STONITH, the nodes erased each other's attributes.
The problem occurs when the CPU load rises and the corosync token does not
stabilize.
I confirmed that the problem occurs with a simple configuration.
Step1) Configure the cluster.
Hi All,
Sorry...
I made a mistake in the line breaks.
Sending again.
---
Hi All,
We have confirmed a slightly strange behavior in a bundle configuration.
There is only one bundle resource, and it is associated with a group
resource.
The behavior was confirmed in PM 1.1.19.
Step1) Configure the
Hi All,
The behavior of glue-based STONITH plug-ins such as external/ipmi has changed
since PM 1.1.16.
Up to PM 1.1.15, "status" was executed for a STONITH QUERY.
From PM 1.1.16 onward, "list" is executed.
This is due to the following change.
-
Hi All,
I have built the following environment.
* RHEL7.3@KVM
* libqb-1.0.2
* corosync 2.4.4
* pacemaker 2.0-rc4
Start the cluster and load a crm file containing 180 Dummy resources.
Node 3 will not start.
--
[root@rh73-01 ~]# crm_mon -1
Stack:
Hi All,
[Sorry... there was a defect in the line breaks. Sending again.]
I was checking the operation of Bundle with Pacemaker version 2.0.0-9cd0f6cb86.
When a Bundle resource is configured in Pacemaker and an attribute is changed,
pengine core dumps.
Step1) Start Pacemaker and load the settings.
Hi Octavian,
Are you possibly using the free version of ESXi?
On the free version of ESXi, the power on/off operation fails.
The same phenomenon also occurs with virsh.
- https://communities.vmware.com/thread/542433
Best Regards,
Hideo Yamauchi.
- Original Message -
>From:
Hi Ken,
Thank you for the comment.
Okay!
I will wait for the fix.
Many thanks!
Hideo Yamauchi.
- Original Message -
> From: Ken Gaillot
> To: users@clusterlabs.org
> Cc:
> Date: 2017/4/8, Sat 05:04
> Subject: Re: [ClusterLabs] [Problem] The crmd causes an error of
Hi All,
I tested a development build of Pacemaker.
-
https://github.com/ClusterLabs/pacemaker/tree/71dbd128c7b0a923c472c8e564d33a0ba1816cb5
property no-quorum-policy="ignore" \
stonith-enabled="true" \
startup-fencing="false"
rsc_defaults
Hi All,
Due to the following fix, users are no longer able to set any value other than
zero with crm_failcount.
- [Fix: tools: implement crm_failcount command-line options correctly]
-
Hi Ken,
Thank you for the comment.
For example, our user does not use pacemaker.log or corosync.log.
Via syslog, the user configures all logs to be output to /var/log/ha-log.
-
(/etc/corosync/corosync.conf)
logging {
syslog_facility: local1
debug: off
}
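For reference, the receiving side of such a setup might use a syslog rule like the following (assuming rsyslog; the facility and target path are taken from the mail above, the file name is illustrative):

```
# /etc/rsyslog.d/ha.conf (illustrative file name)
# Route everything corosync/pacemaker emit on facility local1 to one file.
local1.*    /var/log/ha-log
```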
Hi All,
When I run Pacemaker 1.1.15 or 1.1.16 on RHEL7.3, the logs related to
pacemaker are not collected into the files gathered by sosreport.
This seems to be caused by the following change and the pacemaker.py script of
RHEL7.3.
-
Hi Klaus,
Hi Jan,
Hi All,
Regarding watchdog using the WD service, there seems to be no opposing opinion.
I will start work on an official patch next week.
Best Regards,
Hideo Yamauchi.
- Original Message -
> From: "renayama19661...@ybb.ne.jp"
> To:
Hi Ken,
Hi All,
Regarding a future SNMP trap function, we request the following feature.
* An SNMP trap on attribute change. This is a function to transmit an SNMP
trap when a specific attribute changes.
It is useful for trapping a change in the drbd score and a change in the score
of the
Hi Klaus,
Hi Jan,
Hi All,
Our members discussed watchdog using the WD service.
1) The WD service is not abolished.
2) In pacemaker_remote, it is available by starting corosync on localhost.
3) Contention for the watchdog device needs to be considered.
4) Because I am thinking about the case which does
Hi Klaus,
Hi All,
I tried a prototype of watchdog using the WD service.
-
https://github.com/HideoYamauchi/pacemaker/commit/3ee97b76e0212b1790226864dfcacd1a327dbcc9
Please comment.
Best Regards,
Hideo Yamauchi.
- Original Message -
> From: "renayama19661...@ybb.ne.jp"
Hi Klaus,
Thank you for the comment.
I am making a prototype patch using the WD service.
Please wait a little.
Best Regards,
Hideo Yamauchi.
- Original Message -
> From: Klaus Wenninger
> To: users@clusterlabs.org
> Cc:
> Date: 2016/10/10, Mon 21:03
> Subject:
Hi All,
We discovered a problem in a cluster that has neither quorum control nor
STONITH.
The problem can be confirmed with the following procedure.
Step1) Configure a cluster.
[root@rh72-01 ~]# crm configure load update trac3437.crm
[root@rh72-01 ~]# crm_mon -1 -Af
Stack: corosync
Current DC:
Hi Honza,
Thank you for the comment.
>> Our user constituted a cluster in corosync and Pacemaker in the next
> environment.
>> The cluster constituted it among guests.
>>
>> * Host/Guest : RHEL6.6 - kernel : 2.6.32-504.el6.x86_64
>> * libqb 0.17.1
>> * corosync 2.3.4
>> * Pacemaker 1.1.12
Hi All,
When a node joins while a resource start is taking a long time, the start of
the resource is carried out twice.
Step 1) Put a sleep in the start of the Dummy
resource. (/usr/lib/ocf/resource.d/heartbeat/Dummy)
(snip)
dummy_start() {
    sleep 60
    dummy_monitor
    if [ $? = $OCF_SUCCESS ]; then
(snip)
Hi Klaus,
I will do it by the end of next week and write a patch using a queue.
Also, please tell me your opinion.
Many thanks!
Hideo Yamauchi.
- Original Message -
> From: Klaus Wenninger
> To: "users@clusterlabs.org"
> Cc:
>
Hi Klaus,
After all, we want to control the transmission order.
I think that using the meta_attribute you suggested will be enough.
(snip)
(snip)
I intend to write a fix that implements a queue via this meta_attribute; what
do you think?
I think that the processing when the queue is changed is
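The queue-based approach under discussion can be sketched as follows. This is illustrative only, not the actual patch: instead of firing each notification asynchronously, traps are pushed onto a queue and one worker sends them, preserving event order.

```python
# Illustrative sketch: serialize trap transmission through a single
# FIFO queue so traps leave in the order the events occurred.
import queue
import threading

trap_queue = queue.Queue()
sent = []  # stands in for actually transmitting the SNMP trap

def sender():
    while True:
        trap = trap_queue.get()
        if trap is None:          # sentinel: shut the worker down
            break
        sent.append(trap)         # traps leave in FIFO order
        trap_queue.task_done()

worker = threading.Thread(target=sender)
worker.start()

for event in ["start", "monitor", "stop"]:   # events arrive in this order
    trap_queue.put(event)
trap_queue.put(None)
worker.join()
print(sent)  # ['start', 'monitor', 'stop']
```

With asynchronous per-event handlers there is no such guarantee, which matches the out-of-order traps reported in this thread.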
Hi All,
After all, our members need control over the transmission order of the
SNMP traps.
We will make a patch to control the transmission order and intend to
send it.
Probably, with the patch, we will add the "ordered" attribute that we sent by
email before.
Best Regards,
Hi Klaus,
Because the scripts are executed asynchronously, I think that it is difficult
to set "uptime" by the method in the sample.
After all, we may need to request ordered transmission.
#My earlier patch only controls the execution order of the async handling and
does not give
Hi All,
We have a request for the new SNMP function.
The order of the traps is not right: the trap order is sometimes not preserved.
This is because the notification handling executes each "path" asynchronously.
I think that it is necessary to wait for completion of execution at the "path"
unit of
Hi All,
I tried the new SNMP function planned for release in
Pacemaker1.1.14.(pacemaker-87bc29e4b821fd2a98c978d5300e43eef41c2367)
However, with the following procedure, the SNMP trap is not transmitted.
Step 1) Start node A.
Step 2) Load the CLI file.(The trap is transmitted correctly at that point.)
Dec 3 14:25:25
Hi Dejan,
All right!
Thank you for merging the patch.
Many Thanks!
Hideo Yamauchi.
- Original Message -
> From: Dejan Muhamedagic
> To: users@clusterlabs.org
> Cc:
> Date: 2015/11/16, Mon 18:02
> Subject: Re: [ClusterLabs] Antw: Re: [Question] Question about
Hi Ken,
Hi Ulrich,
Hi All,
I sent a patch.
* https://github.com/ClusterLabs/resource-agents/pull/698
Please confirm it.
Best Regards,
Hideo Yamauchi.
- Original Message -
> From: "renayama19661...@ybb.ne.jp"
> To: Cluster Labs - All topics related to
Hi Ken,
Hi Ulrich,
Thank you for the comment.
Judging from Ken's and Ulrich's opinions, the mysql RA seems to have had a
problem from the beginning.
I will wait a little longer for other people's opinions and then make a patch.
Best Regards,
Hideo Yamauchi.
- Original
Hi All,
I have contributed patches for mysql several times, too.
I did not mind it much before, but the mysql RA behaves as follows.
Step1) Configure a cluster using mysql in Pacemaker.
Step2) Kill the mysql process with signal SIGKILL.
Step3) Stop Pacemaker before a monitor error occurs and stop mysql
Hi Dejan,
Thank you for a reply.
> It somehow slipped.
>
> I suppose that you tested the patch well and nobody objected so
> far, so lets apply it.
>
> Many thanks! And sorry about the delay.
I confirmed the merge of the patch.
* http://hg.linux-ha.org/glue/rev/56f40ec5d37e
Many Thanks!
Hi Dejan,
Hi All,
What about the patch which I contributed in a previous email?
I would like an opinion.
Best Regards,
Hideo Yamauchi.
- Original Message -
> From: "renayama19661...@ybb.ne.jp"
> To: Cluster Labs - All topics related to open-source clustering
Hi Ken,
Thank you for comments.
> The above is the reason for the behavior you're seeing.
>
> A fenced node can come back up and rejoin the cluster before the fence
> command reports completion. When Pacemaker sees the rejoin, it assumes
> the fence command completed.
>
> However in this case,
Hi All,
The following problem occurred for us in Pacemaker1.1.12.
A resource was moved while STONITH had not yet completed.
The following seems to have happened in the cluster.
Step1) Start a cluster.
Step2) Node 1 breaks down.
Step3) Node 1 is reconnected before the fencing operation is completed from
node 2
Hi Yan,
Hi All,
The problem seems to be occurring somewhere inside run_alarms, which is called
from hbagent.
I confirmed that hbagent received SIGTERM.
There seems to be a problem with the connect() called from run_alarms.
We continue investigating, including a different
Hi All,
We intend to revise some patches.
We withdraw this patch.
Best Regards,
Hideo Yamauchi.
- Original Message -
> From: "renayama19661...@ybb.ne.jp"
> To: ClusterLabs-ML
> Cc:
> Date: 2015/9/7, Mon 09:06
> Subject:
Hi Yan,
Thank you for comment.
> Sounds weird. I've never encountered the issue before. Actually I
> haven't run it with heartbeat for years ;-) We'd probably have to find
> the pattern and produce it.
We have only just begun the investigation.
If there is any point that you think is the
Hi All,
When a cluster carries out STONITH, Pacemaker handles host names in lowercase.
When a user sets the host name of the OS and the host names in the hostlist of
external/libvirt in capital letters, STONITH is not carried out.
The external/libvirt should convert the host names in hostlist, and
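The mismatch is easy to see in a sketch (illustrative only, not the plugin's actual code): Pacemaker hands over the fence target in lowercase, so a hostlist written in capitals never matches a naive comparison, while a case-insensitive one does.

```python
# Illustrative only: why a capitalized hostlist fails against Pacemaker's
# lowercased fence target, and how case-insensitive matching fixes it.
hostlist = ["NODE1", "NODE2"]   # as a user might write it in hostlist
target = "node1"                # as Pacemaker hands the target over

naive_match = target in hostlist
fixed_match = target.lower() in (h.lower() for h in hostlist)
print(naive_match, fixed_match)  # False True
```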
Hi Andrew,
I used the built-in SNMP.
I started it as a daemon with the -d option.
> Is it running on both nodes or just snmp1?
On both nodes.
On both nodes.
[root@snmp1 ~]# ps -ef |grep crm_mon
root 4923 1 0 09:42 ? 00:00:00 crm_mon -d -S 192.168.40.2 -W
-p /tmp/ClusterMon-upstart.pid
Hi Andrew,
The fix still seems to have a problem.
It is awaiting demote, and the master-group resource cannot move.
[root@bl460g8n3 ~]# crm_mon -1 -Af
Last updated: Tue Aug 18 11:13:39 2015 Last change: Tue Aug 18
11:11:01 2015 by root via crm_resource on bl460g8n4
Stack:
Hi Andrew,
Thank you for comments.
I will confirm it tomorrow.
I am on vacation today.
Best Regards,
Hideo Yamauchi.
- Original Message -
From: Andrew Beekhof and...@beekhof.net
To: renayama19661...@ybb.ne.jp; Cluster Labs - All topics related to
open-source clustering welcomed
Hi All,
We verified the behavior of
pacemaker_remote.(version:pacemaker-ad1f397a8228a63949f86c96597da5cecc3ed977)
The cluster configuration is as follows.
* sl7-01(KVM host)
* snmp1(Guest on the sl7-01 host)
* snmp2(Guest on the sl7-01 host)
We prepared the following CLI file to confirm the
Hi All,
We have a question about the following change.
*
https://github.com/ClusterLabs/pacemaker/commit/a97c28d75347aa7be76092aa22459f0f56a220ff
We understand that this follows a change in systemd.
In Pacemaker1.1.13, SysVStartPriority=99 is set.
It is set in Pacemaker1.1.12, too.
When we
Hi Jan,
Thank you for comments.
When we use Pacemaker1.1.12, is there a problem with deleting
SysVStartPriority=99?
Or must we not delete it when using Pacemaker1.1.12 and Pacemaker1.1.13?
Or is it necessary to decide based on the version of systemd in the OS?
It was a leftover