22120911.uz4cgybmxsced...@redhat.com>:
> Sounds like your RA returns e.g. OCF_ERR_ARGS or similar where it
> shouldn't.
>
> Try starting the resource with crm_resource and add -VV which should
> show you the code as it's being run.
>
> On 22/07/19 13:55 +0200, Ulrich Wi
Hi!
Playing with some new RA that won't start, I found this in crm_resource's man page:
-Y, --why
Show why resources are not running, optionally filtered by
--resource and/or --node
When I tried it, all I got was:
# crm_resource -r prm_idredir_test -Y
Resource
>>> wrote on 16.07.2019 at 15:20 in message
<87k1cixgii.fsf...@lant.ki.iif.hu>:
> "Ulrich Windl" writes:
>
>> wrote on 15.07.2019 at 18:41 in message
> <87o91vp7vv@lant.ki.iif.hu>:
>>
>>> In a mostly symmetrical clus
>>> Nishant Nakate wrote on 16.07.2019 at 10:11 in message
...
> Maybe because my knowledge of resource agents is not enough. Having my
> processes as resources automatically handled by Pacemaker will work out. I
> will need to find out more on making my services (written in C++) resource
>
>>> Nishant Nakate wrote on 16.07.2019 at 08:58 in message:
> On Tue, Jul 16, 2019 at 11:33 AM Ulrich Windl <
> ulrich.wi...@rz.uni-regensburg.de> wrote:
>
>> >>> Nishant Nakate wrote on 16.07.2019 at 05:37 in message:
>>> Nishant Nakate wrote on 16.07.2019 at 05:37 in message:
> Hi All,
>
> I am new to this community and HA tools. I need some guidance on my current
> handling of Pacemaker.
>
> For one of my projects, I am using Pacemaker for high availability.
> Following the instructions provided in setup
>>> Jehan-Guillaume de Rorthais wrote on 10.07.2019 at 13:14 in message
<20190710131427.3876ea36@firost>:
> On Wed, 10 Jul 2019 12:53:59 +0300
> Andrei Borzenkov wrote:
>
>> On Wed, Jul 10, 2019 at 12:42 PM Jehan-Guillaume de Rorthais
>> wrote:
>>
>> >
>> > > > Jul 09 09:16:32 [2679]
>>> Jan Pokorný wrote on 01.07.2019 at 14:42 in message
<20190701124215.gn31...@redhat.com>:
> On 01/07/19 13:26 +0200, Ulrich Windl wrote:
>>>>> Jan Pokorný wrote on 27.06.2019 at 12:02
>>>>> in message <20190627100209.gf31...@redhat.c
Would running pcsd under valgrind be an option? In addition to checking for
leaks, it can also provide some memory usage statistics (who is using how
much)...
>>> Tomas Jelinek wrote on 27.06.2019 at 15:30 in message
<363f827e-d05d-309f-7ab6-c43e268df...@redhat.com>:
> Hi,
>
> We (pcs
>>> Jan Pokorný wrote on 27.06.2019 at 12:02 in message
<20190627100209.gf31...@redhat.com>:
> On 25/06/19 12:20 -0500, Ken Gaillot wrote:
>> On Tue, 2019-06-25 at 11:06 +, Somanath Jeeva wrote:
>> Addressing the root cause, I'd first make sure corosync is running at
>> real-time priority
>>> Somanath Jeeva wrote on 25.06.2019 at 13:06 in message
> I have not configured fencing in our setup . However I would like to know if
> the split brain can be avoided when high CPU occurs.
It seems you like to ride a bicycle with crossed arms while trying to avoid
falling ;-)
>
>
>>> Ken Gaillot wrote on 24.06.2019 at 16:57 in message
<95f51b52283d05bbd948e4508c406d7ccb64.ca...@redhat.com>:
> On Mon, 2019-06-24 at 08:52 +0200, Jan Friesse wrote:
>> Somanath,
>>
>> > Hi All,
>> >
>> > I have a two node cluster with multicast (udp) transport . The
>> > multicast
>>> Jan Friesse wrote on 24.06.2019 at 08:52 in message:
> Somanath,
>
>> Hi All,
>>
>> I have a two node cluster with multicast (udp) transport. The multicast IP
> used is 224.1.1.1.
>
> Would you mind giving UDPU (unicast) a try? For a two node cluster
> there is going to be no
To me it looks like a broken migration configuration.
>>> "Lentes, Bernd" wrote on 19.06.2019 at 16:46 in message
<1654529492.1465807.1560962767193.javamail.zim...@helmholtz-muenchen.de>:
> - On Jun 15, 2019, at 4:30 PM, Bernd Lentes
bernd.lentes@helmholtz-muenchen.de
> wrote:
>
>>
>>> Tiemen Ruiten wrote on 14.06.2019 at 16:43 in message:
> Right, so I may have been too fast to give up. I set maintenance mode back
> on and promoted ph-sql-04 manually. Unfortunately I don't have the logs of
> ph-sql-03 anymore because I reinitialized it.
>
> You mention that demote
>>> Indivar Nair wrote on 09.06.2019 at 14:52 in message:
> Hello ...,
>
> I have an Active-Passive cluster with two nodes hosting an XFS Filesystem
> over a CLVM Volume.
>
> If a failover happens, the volume is mounted on the other node without a
> recovery that usually happens to a
>>> Andrei Borzenkov wrote on 29.05.2019 at 20:31 in
message <1d0c775a-6854-bede-e241-8c23d5919...@gmail.com>:
> 29.05.2019 11:12, Ulrich Windl writes:
>>>>> Jan Pokorný wrote on 28.05.2019 at 16:31 in
>> message
>> <20190528143145.ga29...@
>>> Jan Pokorný wrote on 28.05.2019 at 16:31 in message
<20190528143145.ga29...@redhat.com>:
> On 27/05/19 08:28 +0200, Ulrich Windl wrote:
>> I configured ocf:pacemaker:NodeUtilization more or less for fun, and I
> realized that the cluster reports no pro
Hi!
I configured ocf:pacemaker:NodeUtilization more or less for fun, and I
realized that the cluster reports no problems, but in syslog I have these
unusual messages:
2019-05-27T08:21:07.748149+02:00 h06 lrmd[16599]: notice:
prm_node_util_monitor_30:15028:stderr [ info: Writing node
>>> "Lentes, Bernd" wrote on 23.05.2019 at 15:01 in message
<1029244418.9784641.1558616472505.javamail.zim...@helmholtz-muenchen.de>:
>
> - On May 20, 2019, at 8:28 AM, Ulrich Windl
ulrich.wi...@rz.uni-regensburg.de
> wrote:
>
>>>>>
Hi!
Reading the release Notes on SLES12 HAE, I found the ``op monitor
role="Stopped" ...`` operation that had been discussed here before, too.
When trying to configure it, I get an error message (from crm shell):
WARNING: prm_ping_gw1-v582: action 'monitor_Stopped' not found in Resource
Agent
Hi!
So maybe the original defective RA would be valuable for debugging the issue.
I guess the RA was invalid in some way that wasn't detected or handled
properly...
Regards,
Ulrich
>>> Andrei Borzenkov wrote on 21.05.2019 at 09:13 in
message:
> 21.05.2019 0:46, Ken Gaillot writes:
>>>
>>> Kadlecsik József wrote on 20.05.2019 at 23:15 in message:
[...]
> stopping/starting of the resources. :-) I haven't thought that "id" is
> reserved as a parameter name.
The attribute "id" is "very much reserved" even in HTML, XML, SGML, etc. I
don't know about "id" being reserved as an
What worries me is "Rejecting name for unique".
>>> Kadlecsik József wrote on 20.05.2019 at 14:37 in message:
> On Sun, 19 May 2019, Kadlecsik József wrote:
>
>> On Sat, 18 May 2019, Kadlecsik József wrote:
>>
>> > On Sat, 18 May 2019, Kadlecsik József wrote:
>> >
>> > > On Sat, 18 May
>>> "Lentes, Bernd" wrote on 16.05.2019 at 17:10 in message
<1151882511.6631123.1558019430655.javamail.zim...@helmholtz-muenchen.de>:
> Hi,
>
> my HA-Cluster with two nodes fenced one on 14th of may.
> ha-idg-1 has been the DC, ha-idg-2 was fenced.
> It happened around 11:30 am.
> The log
>>> Klecho wrote on 07.05.2019 at 08:59 in message
<5e84375f-a631-cf06-50dc-58150fd78...@gmail.com>:
> Hi,
>
> During the weekend my corosync daemon suddenly died without anything in
> the logs, except this:
>
> May 5 20:39:16 ZZZ kernel: [1605277.136049] traps: corosync[2811] trap
>
>>> Andrei Borzenkov wrote on 05.05.2019 at 07:43 in
message <033573b9-188f-baf6-e4b9-ba73150a3...@gmail.com>:
> 30.04.2019 19:47, Олег Самойлов writes:
>>
>>
>>> On Apr 30, 2019, at 19:38, Andrei Borzenkov
>>> wrote:
>>>
>>> 30.04.2019 19:34, Олег Самойлов writes:
> No. I
>>> "Lentes, Bernd" wrote on 03.05.2019 at 19:18 in message
<595056197.709875.1556903922143.javamail.zim...@helmholtz-muenchen.de>:
> Hi,
>
> on my cluster nodes I established a systemd service which starts crm_mon,
> which writes cluster information into an HTML file so I can see the state
>>> Arkadiy Kulev wrote on 05.05.2019 at 15:14 in
>>> message:
> Hello!
>
> I run pacemaker on 2 active/active hosts which balance the load of 2 public
> IP addresses.
> A few days ago we ran a very CPU/network intensive process on one of the 2
> hosts and Pacemaker failed.
>
> I've
Hi!
Trying to upgrade one corosync 1 cluster (SLES11 SP4) to corosync 2 (SLES12
SP4) resulted in a two-node cluster that happily fences each node, and little
else.
A first investigation indicated that I had simply placed the "transport: udpu"
line within each "interface" section instead of globally. It
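For reference, a minimal sketch of where the transport option belongs in a
corosync 2 corosync.conf: at the top level of the totem section, not inside an
interface block (the cluster name and network address here are illustrative):

```
totem {
    version: 2
    cluster_name: demo           # illustrative
    transport: udpu              # correct: set once, at totem level
    interface {
        ringnumber: 0
        bindnetaddr: 192.168.1.0 # illustrative network
        # no "transport:" line in here
    }
}
```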
>>> Jan Pokorný wrote on 29.04.2019 at 17:22 in message
<20190429152200.ga19...@redhat.com>:
> On 29/04/19 14:58 +0200, Jan Pokorný wrote:
>> On 29/04/19 08:20 +0200, Ulrich Windl wrote:
>>>>>> Jan Pokorný wrote on 25.04.2019 at 18:49
>>>
>>> Jan Pokorný wrote on 25.04.2019 at 18:49 in message
<20190425164946.gf23...@redhat.com>:
> On 24/04/19 09:32 -0500, Ken Gaillot wrote:
>> On Wed, 2019-04-24 at 16:08 +0200, wf...@niif.hu wrote:
>>> Make install creates /var/log/pacemaker with mode 0770, owned by
>>> hacluster:haclient.
Hi!
I managed to get my cluster up again after upgrading from SLES11 SP4 to SLES12
SP4, but my CTDB Samba won't start any more. The problem is:
CTDB(prm_s02_ctdb)[30904]: ERROR: Failed to execute /usr/sbin/ctdbd.
lrmd[27341]: notice: prm_s02_ctdb_start_0:30857:stderr [ Invalid option
Hi!
I have a question: What is the difference between "confirmed=true" and
"confirmed=false" actions, like here:
Apr 24 08:30:20 h01 crmd[10774]: notice: process_lrm_event: Operation
prm_xen_v01_migrate_from_0: ok (node=h01, call=169, rc=0, cib-update=150,
confirmed=true)
Apr 24 08:30:26 h01
Hi!
I know that April 1st is gone, but maybe we should have "user-friendly
durations" too? Maybe like:
"a deep breath", meaning "30 seconds"
"a guru meditation", meaning "5 minutes"
"a coffee break", meaning "15 minutes"
"a lunch break", meaning "half an hour"
...
Typical maintenance tasks can
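Tongue-in-cheek as the suggestion is, such a mapping would be trivial to
implement; a minimal sketch in Python (the names and values are the ones
jokingly proposed above, everything here purely illustrative):

```python
# Hypothetical "user-friendly duration" lookup; values in seconds.
FRIENDLY_DURATIONS = {
    "a deep breath": 30,          # 30 seconds
    "a guru meditation": 5 * 60,  # 5 minutes
    "a coffee break": 15 * 60,    # 15 minutes
    "a lunch break": 30 * 60,     # half an hour
}

def parse_duration(text: str) -> int:
    """Return seconds, accepting a friendly name or a plain integer string."""
    if text in FRIENDLY_DURATIONS:
        return FRIENDLY_DURATIONS[text]
    return int(text)  # fall back to a literal number of seconds

print(parse_duration("a coffee break"))  # 900
```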
Hi!
After some tweaking following the update from SLES11 to SLES12, I built a new
config file for corosync.
Corosync is happy, pacemaker says the nodes are online, but the cluster status
still says both nodes are "UNCLEAN (offline)". Why?
Messages I see are:
crmd: info: peer_update_callback:Client
>>> Tomas Jelinek wrote on 23.04.2019 at 12:36 in message:
> The files are listed as ghost files in order to let rpm know they belong
> to pcs but are not distributed in rpm packages. Those files are created
> by pcsd at runtime. I guess the 000 permissions come from the fact those
>
Hi!
Fighting with the changes between corosync 1 (SLES11 SP4) and corosync 2
(SLES12 SP4), I got a "funny" error message:
corosync[13979]: [MAIN ] parse error in config: The token hold timeout
parameter (16 ms) may not be less than (30 ms).
The funny part is that "hold" is not set at all
Hi!
Reading the corosync.conf manual page of corosync 2.3.6 (SLES12 SP4), I have
some random comments:
The tag indent for "clear_node_high_bit" seems broken.
Shouldn't "hold" be "token_hold"? Why is "token_retransmits_before_loss_const"
so long (why the "_const")?
The description states
>>> Christopher Lumens wrote on 18.04.2019 at 17:07 in
message <2044533839.17332252.1555600059522.javamail.zim...@redhat.com>:
>> As XML is only as good as its structure, would you present the structure of
>> such XML output?
>
> If you enjoy reading XML that describes XML, you can check
>>> "Lentes, Bernd" wrote on 18.04.2019 at 16:11 in message
<63714399.378325.196700800.javamail.zim...@helmholtz-muenchen.de>:
> Hi,
>
> I have a two-node cluster; both servers are buffered by a UPS.
> If power is gone, the UPS sends after a configurable time a signal via
> network to
>>> Ken Gaillot wrote on 17.04.2019 at 22:53 in message
<57f8ec2dcee0c715e2bc1146005e86e7844306d1.ca...@redhat.com>:
> Hi all,
>
> Another new feature considered experimental in the upcoming pacemaker
> release is XML output from tools for easier automated parsing. To begin
> with, only
>>> JCA <1.41...@gmail.com> wrote on 17.04.2019 at 22:50 in message:
> I am trying to get fencing working, as described in the "Clusters from
> Scratch" guide, and I am stymied at the get-go :-(
>
> The document mentions a property named stonith-enabled. When I was trying
> to get my first
>>> Ken Gaillot wrote on 16.04.2019 at 00:30 in message
<144df656215fc1ed6b3a35cffd1cbd2436f2a785.ca...@redhat.com>:
[...]
> The cluster successfully probed the service on both nodes, and started
> it on node one. It then tried to start a 30-second recurring monitor
> for the service, but the
>>> JCA <1.41...@gmail.com> wrote on 15.04.2019 at 23:30 in message:
> Well, I remain puzzled. I added a statement to the end of my script in
> order to capture its return value. Much to my surprise, when I create the
> associated resource (as described in my previous post) myapp-script gets
>>> wrote on 15.04.2019 at 13:03 in message
<566fe1cd-b8fd-41e0-bc07-1722be14e...@ya.ru>:
>
>> On Apr 14, 2019, at 10:12, Andrei Borzenkov
wrote:
>
> Thanks for the explanation, I think this will be a good addition to the SBD
> manual. (The SBD manual needs this.) But my
>>> Sven Möller wrote on 08.04.2019 at 16:11 in message
<20190408141109.horde.fk4h-6rlnqpo3s3muppw...@cloudy.nichthelfer.de>:
> Hi,
> we were running a corosync config including 2 Rings for about 2.5 years on a
> two node NFS Cluster (active/passive). The first ring (ring 0) is configured
>
>>> Valentin Vidic wrote on 03.04.2019 at 09:26
in message <20190403072602.gw9...@gavran.carpriv.carnet.hr>:
> On Wed, Apr 03, 2019 at 09:13:58AM +0200, Ulrich Windl wrote:
>> I'm surprised: Once sbd writes the fence command, it usually takes
>> less than 3 se
>>> Digimer wrote on 02.04.2019 at 19:49 in message
<6c6302f4-844b-240d-8d0e-727dddf36...@alteeve.ca>:
[...]
> It's worth noting that SBD fencing is "better than nothing", but slow.
> IPMI and/or PDU fencing completes a lot faster.
I'm surprised: Once sbd writes the fence command, it
>>> Brian Reichert wrote on 26.03.2019 at 21:12 in
message <20190326201259.gj36...@numachi.com>:
> This will sound like a dumb question:
>
> The manpage for pcs(8) implies that to set up a cluster, one needs
> to provide a name.
>
> Why do clusters have names?
Seems to be traditional.
>>> Ken Gaillot wrote on 26.03.2019 at 20:28 in message
<1d8d000ab946586783fc9adec3063a1748a5b06f.ca...@redhat.com>:
> On Tue, 2019-03-26 at 22:12 +0300, Andrei Borzenkov wrote:
>> 26.03.2019 17:14, Ken Gaillot writes:
>> > On Tue, 2019-03-26 at 14:11 +0100, Thomas Singleton wrote:
>> > > Dear
>>> Valentin Vidic wrote on 20.03.2019 at 19:00
in message <20190320180007.gx9...@gavran.carpriv.carnet.hr>:
> On Wed, Mar 20, 2019 at 01:47:56PM -0400, Digimer wrote:
>> Not when DRBD is configured correctly. You set 'fencing
>> resource-and-stonith;' and set the appropriate fence handler.
>>> Digimer wrote on 20.03.2019 at 18:47 in message
<38f790d0-4bb6-53b8-7eb4-b285ed147...@alteeve.ca>:
> On 2019-03-20 1:46 p.m., Valentin Vidic wrote:
>> On Wed, Mar 20, 2019 at 01:34:52PM -0400, Digimer wrote:
>>> Depending on your fail-over tolerances, I might add NFS to the mix and
>>>
>>> Digimer wrote on 20.03.2019 at 17:37 in message
<37a2b613-62ce-a552-804a-df5199674...@alteeve.ca>:
> Note;
>
> Cluster filesystems are amazing if you need them, and to be avoided if
> at all possible. The overhead from the cluster locking hurts performance
> quite a lot, and adds a
Hi!
On "2." (provider): I think "heartbeat" is purely historical (as you might
have found out, the provider is mostly a subdirectory in the tree where RAs
are located), and for compatibility nobody dared or cared to change it.
Personally I'm using my own provider (consisting of a four-letter hash of
>>> Alex Crow wrote on 11.03.2019 at 22:28 in message:
> On 11/03/2019 21:18, Full Name wrote:
>> I am a complete newbie here, so please bear with me, if I ask something
> stupid and/or obvious.
>>
>> I have been able to deploy and configure the software across three
> nodes,
Hi!
I'm playing with upgrading a SLES11 SP4 cluster to SLES12 SP4. After having
upgraded one node in a two-node cluster, I see the following messages on the
SLES12 node:
[...]
Feb 27 08:46:16 h02 corosync[5198]: [TOTEM ] A new membership
(172.20.16.2:2336) was formed. Members joined:
>>> Ken Gaillot wrote on 26.02.2019 at 16:27 in
>>> message:
[...]
>
> Actions that have been *scheduled* but not *initiated* can be aborted.
> But anytime a resource agent has been invoked, we wait for that process
> to complete.
I guess it's to receive the regular exit code.
>
[...]
>
>>> Valer Nur wrote on 25.02.2019 at 17:22 in message
<863485302.5185430.155721...@mail.yahoo.com>:
> Hi,
> I am a newbie on this. I have followed the great documentation to create an
> active/passive cluster. I skipped the Apache part since I do not need it. All
> seems to be working
>>> Edwin Török wrote on 20.02.2019 at 12:30 in
message <0a49f593-1543-76e4-a8ab-06a48c596...@citrix.com>:
> On 20/02/2019 07:57, Jan Friesse wrote:
>> Edwin,
>>>
>>>
>>> On 19/02/2019 17:02, Klaus Wenninger wrote:
On 02/19/2019 05:41 PM, Edwin Török wrote:
> On 19/02/2019 16:26,
>>> Eric Robinson wrote on 19.02.2019 at 21:06 in message
>> -Original Message-
>> From: Users On Behalf Of Ken Gaillot
>> Sent: Tuesday, February 19, 2019 10:31 AM
>> To: Cluster Labs - All topics related to open-source clustering welcomed
>>
>> Subject: Re: [ClusterLabs] Why Do
error 4 in libc-2.17.so[7f221c554000+1c2000]
>> [ 5390.361918] Code: b8 00 00 00 04 00 00 00 74 07 48 8d 05 f8 f2 0d 00
>> c3 0f 1f 80 00 00 00 00 48 31 c0 89 f9 83 e1 3f 66 0f ef c0 83 f9 30 77
>> 19 0f 6f 0f 66 0f 74 c1 66 0f d7
>>> Jan Pokorný wrote on 18.02.2019 at 21:08 in message
<20190218200816.gd23...@redhat.com>:
> On 15/02/19 08:48 +0100, Jan Friesse wrote:
>> Ulrich Windl wrote:
>>> IMHO any process running at real-time priorities must make sure
>>> that
Hi!
I also wonder: With SCHED_RR, would a sched_yield() at a proper place in the
100% CPU loop also fix this issue? Or do you think "we need real-time, and
cannot allow any other task to run"?
Regards,
Ulrich
>>> Edwin Török wrote on 15.02.2019 at 17:58 in message:
> On 15/02/2019 16:08,
I would expect that, as strace interrupts the RT task to query the code; you
should run strace at the same RT priority ;-)
>>> Edvin Torok 14.02.19 19:54 >>>
Apologies for top posting; the strace you asked for is available here (although
running strace itself had the side effect of getting
Hi!
IMHO any process running at real-time priority must make sure that it
consumes the CPU only for short moments that are really critical to be
performed in time. Specifically, code that performs poorly (for
various reasons) is absolutely _not_ a candidate to be run with real-time
Hi!
I wonder: Can we close this thread with "You have been warned, so please don't
come back later, crying! In the meantime you can do what you want to do."?
Regards,
Ulrich
>>> Jehan-Guillaume de Rorthais wrote on 13.02.2019 at 15:05 in
message <20190213150549.47634671@firost>:
> On Wed,
Hello!
I'd like to comment as an "old" SuSE customer:
I'm amazed that lighttpd is dropped in favor of some new go application:
SuSE now has a base system that needs (correct me if I'm wrong): shell, perl,
python, java, go, ruby, ...?
Maybe each programmer has his favorite. Personally I also
>>> Maciej S wrote on 11.02.2019 at 12:34 in message:
> I was wondering if anyone can give a plain answer if fencing is really
> needed in case there are no shared resources being used (as far as I define
> shared resource).
>
> We want to use PAF or other Postgres (with replicated data
>>> Jan Pokorný wrote on 01.02.2019 at 08:10 in message
<20190201071011.gb7...@redhat.com>:
> On 28/01/19 09:47 -0600, Ken Gaillot wrote:
>> On Mon, 2019-01-28 at 18:04 +0530, Dileep V Nair wrote:
>> Pacemaker can handle the clock jumping forward, but not backward.
>
> I am rather surprised,
Hi!
IMHO it's like in Perl: when relying on the hash keys being returned in any
particular (or even stable) order, the idea is just broken! Either keep the
keys in an extra array for ordering, or sort them in some way...
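The point generalizes beyond Perl; here is a small sketch of the two safe
alternatives just named, written in Python (the parameter names and values are
made up for illustration):

```python
# Don't rely on a hash/dict returning keys in any particular order; either
# carry an explicit key list alongside the data, or sort the keys.
params = {"ip": "10.0.0.1", "nic": "eth0", "cidr_netmask": "24"}

# Alternative 1: an explicit ordering kept in a separate list.
key_order = ["nic", "ip", "cidr_netmask"]
ordered_by_list = [(k, params[k]) for k in key_order]

# Alternative 2: a sorted, hence reproducible, ordering.
ordered_by_sort = sorted(params.items())

print(ordered_by_list)
print(ordered_by_sort)
```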
Regards,
Ulrich
>>> Jan Pokorný wrote on 18.01.2019 at 20:32 in message
>>> Ken Gaillot wrote on 17.01.2019 at 18:45 in message:
> On Thu, 2019-01-17 at 07:49 +0100, Ulrich Windl wrote:
>> > > > Ken Gaillot wrote on 16.01.2019 at
>> > > > 16:34 in message
>>
>> <02e51f6d4f7c7c11161d54e2968c23
properly, so the other node
will assume some failure and fence (to be sure the other side is dead).
Regards,
Ulrich
>>> "Bryan K. Walton" wrote on 16.01.2019 at 16:36 in message
<20190116153625.zybof7hrkuueh...@mygeeto.inside.leepfrog.com>:
> On Wed, Jan 16, 2019 at 04
>>> Ken Gaillot wrote on 16.01.2019 at 16:34 in
>>> message
<02e51f6d4f7c7c11161d54e2968c23c77c4a1eed.ca...@redhat.com>:
[...]
> In retrospect, interleave=true should have been the default. I've never
> seen a case where false made sense, and people get bit by overlooking
> it all the time.
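For illustration, the meta-attribute in question as it would appear on a clone
in the CIB (a hedged sketch; the ids and the Dummy resource are made up):

```
<clone id="my-clone">
  <meta_attributes id="my-clone-meta">
    <nvpair id="my-clone-interleave" name="interleave" value="true"/>
  </meta_attributes>
  <primitive id="my-rsc" class="ocf" provider="heartbeat" type="Dummy"/>
</clone>
```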
Hi!
I guess we need more logs; especially some events from storage2 before fencing
is triggered.
Regards,
Ulrich
>>> "Bryan K. Walton" wrote on 16.01.2019 at 16:03 in message
<20190116150321.3j2f2upz67eth...@mygeeto.inside.leepfrog.com>:
> I have posed this question on the DRBD-user list,
>>> wrote on 14.01.2019 at 11:48 in message
<87va2rjyzv@lant.ki.iif.hu>:
> Hi,
>
> Recently I spent some time mapping the interrelations of the C header
> files constituting the Pacemaker API. In the end I decided they were so
> tightly interdependent that there was really no useful way
>>> Ken Gaillot wrote on 08.01.2019 at 18:28 in message
<2ecefc63baa56a76a6eeca7c696fc7a1653eb620.ca...@redhat.com>:
> On Tue, 2019-01-08 at 17:23 +0100, Kristoffer Grönlund wrote:
>> On Tue, 2019-01-08 at 10:07 -0600, Ken Gaillot wrote:
>> > On Tue, 2019-01-08 at 10:30 +0100, Kristoffer
>>> Ken Gaillot wrote on 08.01.2019 at 17:55 in message:
> On Tue, 2019-01-08 at 07:35 -0600, Bryan K. Walton wrote:
>> Hi,
>>
>> I'm building a two node cluster with Centos 7.6 and DRBD. These
>> nodes
>> are connected upstream to two Brocade switches. I'm trying to enable
>> fencing by
>>> Ken Gaillot wrote on 08.01.2019 at 00:52 in message:
> There has been some discussion in the past about generating more
> machine-friendly output from pacemaker CLI tools for scripting and
> high-level interfaces, as well as possibly adding a pacemaker REST API.
Interesting: XML being
>>> Andrei Borzenkov wrote on 22.12.2018 at 05:27 in
message <3897ef3e-220e-7377-9647-24965eab4...@gmail.com>:
> 21.12.2018 12:09, Klaus Wenninger writes:
>> On 12/21/2018 08:15 AM, Fulong Wang wrote:
>>> Hello Experts,
>>>
>>> I'm new to these mailing lists.
>>> Pls kindly forgive me if this mail
Hi!
Offline a SCSI disk: "echo offline > /sys/block/sdX/device/state". The
opposite is not "online", BTW, but: "echo running >
/sys/block/sdX/device/state".
You could also try "echo 'scsi remove-single-device MAGIC' >
/proc/scsi/scsi", where MAGIC is (AFAIR) "HOST BUS TARGET LUN".
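The two state transitions above can be exercised against a scratch file
standing in for the sysfs node (writing the real /sys path needs root and an
actual disk; "sdX" is a placeholder device name):

```python
import pathlib
import tempfile

# Stand-in for /sys/block/sdX/device/state: a scratch file in a temp dir.
state = pathlib.Path(tempfile.mkdtemp()) / "state"
state.write_text("running\n")

def set_scsi_state(path: pathlib.Path, value: str) -> None:
    """Equivalent of: echo <value> > /sys/block/sdX/device/state"""
    path.write_text(value + "\n")

set_scsi_state(state, "offline")  # take the disk offline
assert state.read_text().strip() == "offline"
set_scsi_state(state, "running")  # the opposite of offline is "running", not "online"
```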
Regards,
Ulrich
>>> Chris Walker wrote on 18.12.2018 at 17:13 in message:
[...]
> 2. As Ken mentioned, synchronize the starting of Corosync and Pacemaker. I
> did this with a simple ExecStartPre systemd script:
>
> [root@bug0 ~]# cat /etc/systemd/system/corosync.service.d/ha_wait.conf
> [Service]
>
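The body of the drop-in is truncated above; one plausible shape of such a file
(the wait script name and its path are my own illustration, not what the
poster actually used) would be:

```
# /etc/systemd/system/corosync.service.d/ha_wait.conf (hypothetical content)
[Service]
# Delay corosync startup until the cluster interconnect is reachable,
# so corosync and pacemaker come up in a synchronized state.
ExecStartPre=/usr/local/bin/wait-for-interconnect.sh
```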
Hi!
I just noticed that in SLES12 SP3 there is a syslog-ng RA, but it seems there
is no syslog-ng package available, and "$SYSLOG_NG_EXE" is not set any more.
Thus that RA is unusable IMHO.
I was running a non-privileged syslog-ng with its own log files on a high port
in SLES11, but it seems
>>> Vitaly Zolotusky wrote on 17.12.2018 at 21:43 in
>>> message
<1782126841.215210.1545079428...@webmail6.networksolutionsemail.com>:
> Hello,
> I have a 2 node cluster and stonith is configured for SBD and fence_ipmilan.
> fence_ipmilan for node 1 is configured for 0 delay and for node 2
>>> Jan Friesse wrote on 14.12.2018 at 15:06 in message
<991569e4-2430-30f1-1bbc-827be7637...@redhat.com>:
[...]
> - UDP/UDPU transports are still present, but support only a single ring
> (RRP is gone in favor of Knet) and don't support encryption
[...]
I wonder: Is there a migration
Hi!
Once again you forgot the one-line summary of what it is ;-) I guess a quorum
device...
Regards,
Ulrich
>>> Jan Friesse wrote on 12.12.2018 at 15:20 in message:
> I am pleased to announce the first stable release of Corosync-Qdevice
> 3.0.0 available immediately from GitHub at
>
>>> Ken Gaillot wrote on 11.12.2018 at 21:48 in message
<33316cc0570a12255c7de7dd387caee9c5058e37.ca...@redhat.com>:
> Hi all,
>
> I expect to have the first release candidate for Pacemaker 2.0.1
> available soon! It will mostly be a bug fix release, but one
> interesting new feature is
Hi!
It seems your systems run with inoperative fencing, and the cluster wants to
fence a node. Maybe bring the cluster to a clean state first, then repeat the
test.
Regards,
Ulrich
>>> "Lentes, Bernd" wrote on 03.12.2018 at 16:40 in message
>>> lejeczek wrote on 23.11.2018 at 15:56 in message
<46d2baf6-a03d-9aac-fceb-7bcffb383...@yahoo.co.uk>:
> hi guys,
>
> Do we have tools, or maybe outside of the cluster suite there is a way to
> back up the cluster?
>
> I'm obviously talking about configuration so potentially cluster could
>
>>> Klechomir wrote on 20.11.2018 at 11:40 in message
<12860117.ByXx81i3mo@bobo>:
> Hi list,
> Bumped onto the following issue lately:
>
> When multiple VMs are given shutdown one after another and the shutdown of
> the first VM takes long, the others aren't being shut down at all
>>> Michael Schwartzkopff wrote on 20.11.2018 at 08:41 in
>>> message:
> On 20.11.18 at 08:35, Bernd wrote:
>> On 2018-11-20 08:06, Ulrich Windl wrote:
>>>>>> Bernd wrote on 20.11.2018 at 07:21 in
>>>>>> message
Hi!
You forgot the most important piece of information: "What is it?" I guess it's
so obvious to you that you forgot to mention it. ;-)
Regards,
Ulrich
>>> Digimer wrote on 20.11.2018 at 08:25 in message
<3ff31468-4052-dda7-7841-4c04985ad...@alteeve.ca>:
> *
>>> Bernd wrote on 20.11.2018 at 07:21 in message:
> Hi,
>
> I'd like to run a certain bunch of cronjobs from time to time on the
> cluster node (four node cluster) that has the lowest load of all four
> nodes.
>
> The parameters wanted for this system yet to build are
>
> * automatic
Hi!
Maybe syslog provides a hint on what's going wrong...
Regards,
Ulrich
>>> Michael Gaberkorn wrote on 14.11.2018 at 13:17
in message <408efd9b-2c9b-431b-8bb3-108b37dc0...@bd-innovations.com>:
> Hello.
>
>
> I installed an HA cluster with PostgreSQL 11 with a high amount of data (5-9
>
>>> Valentin Vidic wrote on 13.11.2018 at 17:04 in
message <20181113160419.gv3...@gavran.carpriv.carnet.hr>:
> On Tue, Nov 13, 2018 at 04:06:34PM +0100, Valentin Vidic wrote:
>> Could be some kind of ARP inspection going on in the networking equipment,
>> so check switch logs if you have
>>> Ken Gaillot wrote on 08.11.2018 at 17:58 in
>>> message
<1541696332.5197.3.ca...@redhat.com>:
> On Thu, 2018-11-08 at 12:14 +, Ian Underhill wrote:
[...]
>
> Each transition is a set of actions needed to get to the desired state.
> "Complete" are actions that were initiated and a
Hi!
While analyzing some odd cluster problem in SLES11 SP4, I found this message
repeating quite a lot (several times per second) with the same text:
[...more...]
Nov 10 22:10:47 h05 cmirrord[17741]: [yEa32lLX] Retry #1 of cpg_mcast_joined:
SA_AIS_ERR_TRY_AGAIN
Nov 10 22:10:47 h05
>>> Ken Gaillot wrote on 06.11.2018 at 00:12 in
>>> message
<1541459570.5061.11.ca...@redhat.com>:
> On Mon, 2018-11-05 at 16:14 -0600, Ryan Thomas wrote:
>> I have a two node cluster. I restart the network after making
>> changes to the network settings. But, as soon as I restart the
>>
>>> "T. Ladd Omar" wrote on 23.10.2018 at 15:06 in
>>> message:
> Hi all, I send this message to get some answers for my questions about
> Pacemaker.
> 1. In order to clean up start-failed resources automatically, I add a
> failure-timeout attribute for resources; however, the common way to
ace to disappear...
> unless you disable fencing for DLM.
>
> I am now speculating that DLM restarts when the communications fail, and
> the theory that disabling startup fencing for DLM
> (enable_startup_fencing=0) may be the solution to my problem (reverting my
> enable_fencing=0 DLM