>>> Ken Gaillot wrote on 08.03.2021 at 17:47 in message
<76793e7b39e2194d328821c7ac9a5d3b82778d5e.ca...@redhat.com>:
> On Mon, 2021‑03‑08 at 09:57 +0100, Ulrich Windl wrote:
>> > > > Reid Wahl wrote on 08.03.2021 at 08:42 in
>> > > > message
>>> Andrei Borzenkov wrote on 08.03.2021 at 11:46 in
message <366c7071-8b7e-9ea8-5ea1-cbb6de6d4...@gmail.com>:
...
> Probe needs to answer "is resource active *now*". If a probe for a resource
> is impossible until some other resources are active, something is really
> wrong with the design. Either
detail :)
I see what you mean.
Regards,
Ulrich
>
> On Sun, Mar 7, 2021 at 11:10 PM Ulrich Windl <
> ulrich.wi...@rz.uni-regensburg.de> wrote:
>
>> >>> Reid Wahl wrote on 05.03.2021 at 21:22 in
>> message:
>> > On Fri, Mar 5, 2021 a
>>> Reid Wahl wrote on 05.03.2021 at 21:22 in message:
> On Fri, Mar 5, 2021 at 10:13 AM Ken Gaillot wrote:
>
>> On Fri, 2021-03-05 at 11:39 +0100, Ulrich Windl wrote:
>> > Hi!
>> >
>> > I'm unsure what actually causes a problem I
>>> Digimer wrote on 05.03.2021 at 18:05 in message
<5c062e2b-8742-4a9a-0e7c-bc8dec251...@alteeve.ca>:
> On 2021-03-05 3:34 a.m., Oyvind Albrigtsen wrote:
>> Hi,
>>
>> We are considering merging the fence-virt repo into the fence-agents
>> git repository.
>>
>> Tell us if you have any
Hi!
I'm unsure what actually causes a problem I see (a resource was "detected
running" when it actually was not), but I'm sure some probe started on cluster
node start cannot provide a useful result until some other resource has been
started. AFAIK there is no way to make a probe obey ordering
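(Probes themselves cannot be ordered, but they can at least be suppressed on a
node where they cannot give a useful answer yet, via the resource-discovery
option of a location constraint. A hedged sketch, with prm_foo and h16 as
placeholder names and the pcs syntax from memory; crmsh has an equivalent
resource-discovery option on location constraints:)

# pcs constraint location add no-probe-foo prm_foo h16 0 resource-discovery=never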
>>> Digimer wrote on 04.03.2021 at 06:38 in message
<41edb705-6b8a-2221-fc8b-a367aac98...@alteeve.ca>:
> On 2021-03-03 6:53 p.m., Eric Robinson wrote:
>>
>>> -Original Message-----
>>> From: Users On Behalf Of Ulrich Windl
>>> Se
>>> Digimer wrote on 04.03.2021 at 06:35 in message:
> On 2021-03-03 1:56 a.m., Ulrich Windl wrote:
>>>>> Eric Robinson wrote on 02.03.2021 at 19:26 in
>> message
>>
> m>
>>
>>>> -Original Message-
>>>
>>> Eric Robinson wrote on 02.03.2021 at 19:26 in message
>> -Original Message-
>> From: Users On Behalf Of Digimer
>> Sent: Monday, March 1, 2021 11:02 AM
>> To: Cluster Labs - All topics related to open-source clustering welcomed
>> ; Ulr
>>> Valentin Vidic wrote on 28.02.2021 at 16:59
in message <20210228155921.gm29...@valentin-vidic.from.hr>:
> On Sun, Feb 28, 2021 at 03:34:20PM +, Eric Robinson wrote:
>> 001db02b rebooted. After it came back up, I tried it in the other
direction.
>>
>> On node 001db02b, the command...
>>> Eric Robinson wrote on 28.02.2021 at 16:34 in message
> I just configured STONITH in Azure for the first time. My initial test went
> fine.
>
> On node 001db02a, the command...
>
> # pcs stonith fence 001db02b
>
> ...produced output...
>
> 001db02b fenced.
>
> 001db02b rebooted.
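(After a test like this, the fencing history can be checked from any node; a
hedged sketch, flags from memory:)

# stonith_admin --history '*'          # last fencing operation per node
# pcs stonith history show 001db02b    # recent pcs versions only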
pt. I couldn't
>> find the script under /usr/lib/ocf, nor on the internet.
>>
>> Regards,
>> Niveditha
>> --
>> *From:* Users on behalf of Ulrich Windl <
>> ulrich.wi...@rz.uni-regensburg.de>
>> *Sent:* Frida
>>> Eric Robinson wrote on 26.02.2021 at 19:58 in message
>> -Original Message-
>> From: Users On Behalf Of Andrei
>> Borzenkov
>> Sent: Friday, February 26, 2021 11:27 AM
>> To: users@clusterlabs.org
>> Subject: Re: [ClusterLabs] Our 2-Node Cluster with a Separate Qdevice Went
>>> Eric Robinson wrote on 26.02.2021 at 18:23 in message
>> ‑Original Message‑
>> From: Digimer
>> Sent: Friday, February 26, 2021 10:35 AM
>> To: Cluster Labs ‑ All topics related to open‑source clustering welcomed
>> ; Eric Robinson
>> Subject: Re: [ClusterLabs] Our 2‑Node
>>> Digimer wrote on 26.02.2021 at 17:34 in message
<699432c7-89a6-41bf-c805-f4a7a0a4a...@alteeve.ca>:
> On 2021‑02‑26 11:19 a.m., Eric Robinson wrote:
>> At 5:16 am Pacific time Monday, one of our cluster nodes failed and its
>> mysql services went down. The cluster did not automatically
>>> Eric Robinson wrote on 26.02.2021 at 17:19 in message
> At 5:16 am Pacific time Monday, one of our cluster nodes failed and its mysql
> services went down. The cluster did not automatically recover.
>
> We're trying to figure out:
>
>
> 1. Why did it fail?
> 2. Why did it not
not have the ocf-tester script with me to test the RA script. I couldn't
> find the script under /usr/lib/ocf, nor on the internet.
>
> Regards,
> Niveditha
>
> From: Users on behalf of Ulrich Windl
>
> Sent: Friday, February 26, 2021 5:35
>>> Niveditha U wrote on 26.02.2021 at 12:39 in message
> Hi Team,
>
> We have an XML database called xdb which we want to use as a pcs resource.
> Hence, we created a custom resource agent script for it. We are able to
> start/stop the xdb resource using debug-start and debug-stop
> and in the corosync sources tarball tests/testvotequorum1.c
>
> Chrissie
>
>
> On 25/02/2021 07:16, Ulrich Windl wrote:
>> Hi!
>>
>> I'm thinking about some simple cluster status display that is updated
> periodically.
>> I wonder how to get some "cluster facts" efficiently. Among those are:
Hi!
According to the help message, "-p" provides "machine readable" output for
corosync-quorumtool's "-s" and "-l" modes.
However, I don't see any noticeable format change in the output with or without
"-p":
# /usr/sbin/corosync-quorumtool -l
Membership information
----------------------
Nodeid
resources_running="7" type="member"/>
>
Yes, it's all in the CIB, but parsing XML is not what I'd consider efficient
;-)
In most cases using XML just speeds up global warming ;-)
Regards,
Ulrich
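(For single facts a full XML parser is indeed not needed; a hedged sketch using
cibadmin's XPath support, assuming the usual status-section attributes, which
counts the nodes whose crmd is online:)

# cibadmin -Q -A "//node_state[@crmd='online']" | grep -c '<node_state'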
>
>
> On Thu, 2021‑02‑25 at 11:26 +0100, Ulrich Windl wrote:
for fencing
>> > > > dlm_controld[1616]: 91659 lvm_global wait for fencing
>> > > >
>> > > > These were messages when postgresql‑12 service was being
>> > started on
>> > > > node2.
>> > > > As postgresql service i
>
> Chrissie
>
>
> On 25/02/2021 07:16, Ulrich Windl wrote:
>> Hi!
>>
>> I'm thinking about some simple cluster status display that is updated
> periodically.
>> I wonder how to get some "cluster facts" efficiently. Among those are:
>>
>
>> > > These were messages when postgresql-12 service was being started on
>> > > node2.
>> > > As the postgresql service is dependent on these services (dlm, lvmlockd
>> > > and gfs2), it has not started in time on node2.
>> > > And node2 fenced it
>>> Ken Gaillot wrote on 24.02.2021 at 23:45 in message
<6373352fd18e819bada715a7d610499a658eda29.ca...@redhat.com>:
> On Wed, 2021‑02‑24 at 11:16 +0100, Ulrich Windl wrote:
>> Hi!
>>
>> Using a utilization‑based placement strategy (placement‑
>>
Hi!
I'm thinking about some simple cluster status display that is updated
periodically.
I wonder how to get some "cluster facts" efficiently. Among those are:
* Is corosync running, and how many nodes can be seen?
* Is Pacemaker running, how many nodes does it see, and does it have a quorum?
*
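(A hedged sketch of collecting those facts from the shell; the commands exist
in corosync/pacemaker, the exact output fields are from memory:)

# corosync-quorumtool -s | grep -E 'Quorate|Nodes'   # corosync up? node count? quorum?
# crm_node -q                                        # prints 1 if this partition has quorum
# crm_node -l | wc -l                                # number of nodes pacemaker knows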
Hi!
Using a utilization-based placement strategy (placement-strategy=balanced), I
wonder why pacemaker chose node h16 to place a new resource.
The situation before placement looks like this:
Remaining: h16 capacity: utl_ram=207124 utl_cpu=340
Remaining: h18 capacity: utl_ram=209172 utl_cpu=360
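(For reference, a hedged crmsh sketch of how such capacities and demands are
declared; the numbers, the primitive, and its config path are placeholders:)

node h16 \
        utilization utl_ram=262144 utl_cpu=400
primitive prm_xen_example ocf:heartbeat:VirtualDomain \
        params config="/etc/libvirt/qemu/example.xml" \
        utilization utl_ram=4096 utl_cpu=20
property placement-strategy=balanced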
>>> Ken Gaillot wrote on 19.02.2021 at 17:48 in message:
> On Fri, 2021‑02‑19 at 17:54 +0300, Andrei Borzenkov wrote:
>> In the latest PDF versions I downloaded recently code samples appear
>> truncated quite often ‑ they do not fit on page. I compared with
>> previous versions I have and
>>> lejeczek wrote on 19.02.2021 at 17:40 in message:
> Hi guys.
>
> I have a simple cluster with simple constraints:
>
> Colocation Constraints:
>check-jupyterhub with openvpn-server (score:INFINITY)
>secret-dropbox with openvpn-server (score:INFINITY)
> Ticket Constraints:
>
>
Hi!
Inspecting the logs after the cluster had rebalanced resources, I'm wondering:
It looks as if pacemaker-controld does log a success message when a local
migration succeeded, but not if a remote one did.
Actions planned:
Migrate    prm_xen_test-jeos1     ( h16 -> h18 )
Migrate
>>> Strahil Nikolov wrote on 17.02.2021 at 17:46 in
message <2134183555.2122291.1613580414...@mail.yahoo.com>:
> Hello All,
> I'm currently in the process of building a SAP HANA Scale-out cluster and the
> HANA team has asked that all nodes on the active instance should have one IP
> for backup
Hi Ken,
personally I think systemd is already logging too much, and I don't think that
adding instructions to many log messages is actually helpful (it could already
be done as a separate log message, maybe at severity info).
In Windows I see the problem that it's very hard to find real problems
>>> "Lentes, Bernd" schrieb am 16.02.2021
>>> um
10:37 in Nachricht
<151181584.46426249.1613468259150.javamail.zim...@helmholtz-muenchen.de>:
>
> - On Feb 15, 2021, at 10:24 PM, Bernd Lentes
> bernd.len...@helmholtz-muenchen.de wrote:
>
>> - On Feb 15, 2021, at 9:00 PM, kgaillot
> pcmk_monitor_action=metadata pcmk_reboot_action=off
> Meta Attrs: provides=unfencing
> Operations: monitor interval=60s (scsi-monitor-interval-60s)
>
> On Mon, Feb 15, 2021 at 7:17 AM Ulrich Windl <
> ulrich.wi...@rz.uni-regensburg.de> wrote:
>
>> >>> shivra
>>> "Lentes, Bernd" schrieb am 13.02.2021
um
01:23 in Nachricht
<547781995.41340156.1613175834146.javamail.zim...@helmholtz-muenchen.de>:
>
> - On Feb 12, 2021, at 12:50 PM, Yan Gao y...@suse.com wrote:
>
>
>>
>>
>> It seems that crmsh has difficulty parsing the "random" ids of the
>>
>>> shivraj dongawe wrote on 14.02.2021 at 12:03 in message:
> We are running a two node cluster on Ubuntu 20.04 LTS. Cluster related
> package version details are as
> follows: pacemaker/focal-updates,focal-security 2.0.3-3ubuntu4.1 amd64
> pacemaker/focal 2.0.3-3ubuntu3 amd64
>
>>> "Lentes, Bernd" schrieb am 12.02.2021
um
11:05 in Nachricht
<2012472669.39955087.1613124328501.javamail.zim...@helmholtz-muenchen.de>:
> Hi,
>
> I have problems with a configured alert which does not alert anymore.
> I played around with it a bit and changed the configuration several times
>>> Ken Gaillot wrote on 11.02.2021 at 19:13 in message
<5ddea954b8e8a45cf73a7a169752146e27f69083.ca...@redhat.com>:
> On Thu, 2021-02-11 at 13:59 +0100, Ulrich Windl wrote:
>> Hi!
>>
>> After that problem I see this in crm_mon output:
>> Failed Fe
is there.
Regards,
Ulrich
>>> Ulrich Windl wrote on 09.02.2021 at 16:32 in message <6022AB1C.645:161:60728>:
>>>> Klaus Wenninger wrote on 09.02.2021 at 16:12 in
> message:
> > On 2/9/21 3:10 PM, Ulrich Windl wrote:
> >>>>> "Ulrich Windl"
>>> "Ben .T.George" schrieb am 10.02.2021 um 20:28 in
Nachricht
:
> HI
>
> I have 2 resources and I would like to configure them in such a way that both
> should always run from same node,
"from" == "on"?
see "colocation" constraints.
>
> Also, is it safe to give the below values for a 2-node cluster:
>
>
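(A hedged sketch of the suggested constraints, with placeholder resource names:)

# pcs constraint colocation add rscB with rscA INFINITY
# pcs constraint order start rscA then start rscB    # only if a start order matters too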
>>> "Ben .T.George" schrieb am 10.02.2021 um 19:56 in
Nachricht
:
> HI
>
> Is it mandatory for a 2-node pcs cluster to require a quorum and a separate
> heartbeat network?
Question: What do you expect when the network link goes down?
>
> Regards,
> Ben
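(For context, the usual two-node quorum arrangement in corosync.conf; a sketch,
see votequorum(5) for the details:)

quorum {
    provider: corosync_votequorum
    two_node: 1
    wait_for_all: 1
}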
>>> "Ben .T.George" schrieb am 10.02.2021 um 16:14 in
Nachricht
:
> HI
>
> Thanks for the help. I have done "pcs resource clear" and tried the same
> method again; now the resource is not going back.
>
> One more thing I noticed is that my service was from systemd and I have
> created a
>>> Klaus Wenninger wrote on 09.02.2021 at 16:12 in message:
> On 2/9/21 3:10 PM, Ulrich Windl wrote:
>>>>> "Ulrich Windl" schrieb am
09.02.2021
>> um
>> 15:00 in Nachricht <6022956302a10003e...@gwsmtp.uni-regensburg.de>:
>
>>> "Ulrich Windl" schrieb am 09.02.2021
um
15:00 in Nachricht <6022956302a10003e...@gwsmtp.uni-regensburg.de>:
> Hi!
>
> I had made a mistake, leading to node h16 being fenced. After recovery (h16
> had re‑joined the cluster) I had stopp
age of your "X", then decide what Y should be ;-)
Regards,
Ulrich
> Regards,
> Stuart
>
> On Tue, Feb 9, 2021 at 2:34 AM Ulrich Windl <
> ulrich.wi...@rz.uni-regensburg.de> wrote:
>
>> Hi!
>>
>> Maybe you just misunderstand what maintenance mo
Hi!
I had made a mistake, leading to node h16 being fenced. After recovery (h16 had
re-joined the cluster) I had stopped the node, reconfigured the network, then
started the node again.
Then I did the same thing (not the unwanted fencing) with h18. When I started
the node again, I saw these
Hi!
Maybe you just misunderstand what maintenance mode for a single node means: CIB
updates will still be performed, but not the resource actions. If CIB updates
are sent to another node, that node will perform actions.
Maybe just explain what you really want to do with one node in maintenance
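(For reference, single-node maintenance is toggled roughly like this; a hedged
sketch, command names from memory:)

# crm node maintenance h16    # crmsh: resources on h16 are left alone
# crm node ready h16          # crmsh: back to managed
# pcs node maintenance h16    # pcs equivalent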
>>> Ken Gaillot wrote on 08.02.2021 at 17:43 in message
<5ee981d3893dd7712c747661de05240df1ccd8eb.ca...@redhat.com>:
> On Mon, 2021‑02‑08 at 08:41 +0100, Ulrich Windl wrote:
>> Hi!
>>
>> There were previous indications of this problem, but today I had it
>>> Ken Gaillot wrote on 05.02.2021 at 16:47 in
>>> message
<7247097610e6ab4f3a44a7648e0acf32fbdb9937.ca...@redhat.com>:
Hi!
...
>> Doesn't systemctl return a proper exit status?
>
> It does, but we don't use systemctl, we use the systemd C library
> interface. And unfortunately, our
>>> Andrei Borzenkov wrote on 05.02.2021 at 15:31 in
message <4572fad7-c5ae-6d93-2559-741d052e3...@gmail.com>:
> 05.02.2021 12:54, Ulrich Windl wrote:
>>>>> Ulrich Windl wrote on 01.02.2021 at 11:59 in message
>> <6017DF04.888:161:60728>:
Hi!
There were previous indications of this problem, but today I had it again:
I restarted a node (h18, DC) via "crm cluster restart", and the node shut down
cleanly (at least it came to an end), but when restarting, the node was fenced
by the new DC (h16):
Feb 08 08:12:24 h18
> (action_complete) debug: nfs-daemon systemd start is now complete
> (elapsed=2397ms, remaining=97603ms): ok (0)
> Feb 05 02:06:53.521 fastvm-rhel-8-0-23 pacemaker-execd [19354]
> (log_finished) debug: nfs-daemon monitor (call 20) exited with status
> 0 (execution
>>> Ulrich Windl wrote on 01.02.2021 at 11:59 in message <6017DF04.888:161:60728>:
>>>> Andrei Borzenkov wrote on 01.02.2021 at 11:05 in
> message:
> > On Mon, Feb 1, 2021 at 12:53 PM Ulrich Windl
> > wrote:
> ...
>
Hi!
While analyzing cluster problems I noticed this:
Normal resources executed via OCF RAs create two log entries by
pacemaker-execd: one when the resource is being started and another when the
start has completed.
However for systemd units I only get a start message. Is that intentional? Does
>>> Ken Gaillot wrote on 03.02.2021 at 00:02 in message
<396cc52f2d27b8aab611d2312ba172b07bdc9d7f.ca...@redhat.com>:
> On Tue, 2021‑02‑02 at 14:27 ‑0700, Brent Jensen wrote:
>> I've been trying to get my DRBD cluster on Centos8 / Pacemaker 2 to
>> work but have had issues with cluster not
>>> S Sathish S wrote on 02.02.2021 at 07:20 in message
> Hi Team,
>
> We have taken the latest pacemaker version; after that we found the pcs status
> command output consists of * in each line. Is this expected behavior?
>
> https://github.com/ClusterLabs/pacemaker/tree/Pacemaker‑2.0.5
>
> pcs
>>> Ken Gaillot wrote on 02.02.2021 at 17:40 in message
<5d7d52f14417e6e8baee49dfbc23884b5183b073.ca...@redhat.com>:
> Hi all,
>
> Pacemaker has a feature allowing CIB modifications to be made from
> hosts that are not cluster nodes:
>
>
Hi!
I'm wondering:
I had a failed clone resource. After fixing the problem, I performed a cleanup,
but the fail-counts weren't reset (I thought they were reset in older
versions of pacemaker):
Before:
Full List of Resources:
* Clone Set: cln_iotw-md10 [prm_iotw-md10]:
* Started: [ h19
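(A hedged sketch of checking whether a cleanup really reset the counter; option
letters from memory:)

# crm_resource --cleanup -r prm_iotw-md10    # should clear failed ops and fail counts
# crm_failcount -G -r prm_iotw-md10 -N h19   # query the remaining fail count on h19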
>>> Ken Gaillot wrote on 01.02.2021 at 17:27 in message
<9b99d08faf4ddbe496ede10165f586afd81aa850.ca...@redhat.com>:
> On Mon, 2021-02-01 at 11:16 -0500, Stuart Massey wrote:
>> Andrei,
>> You are right, thank you. I have an earlier thread on which I posted
>> a pacemaker.log for this issue,
; adding this line works for me:
>>
>> pcs constraint colocation add lta-subscription-backend-ope-s1 with
>> s3srvnotificationdispatcher INFINITY
>>
>> I would like to thank everyone who helped me and spent their time.
>>
>> Have a good Week!
>>
>> Best
>>
>> Damian
>
>>> Andrei Borzenkov wrote on 01.02.2021 at 12:07 in message:
> On Mon, Feb 1, 2021 at 1:59 PM Ulrich Windl
> wrote:
>>
>> But the VM *wasn't* stopped on h16!
>>
>
> I am not sure what you mean here. It was not stopped during migration?
> Y
>>> Andrei Borzenkov wrote on 01.02.2021 at 11:05 in message:
> On Mon, Feb 1, 2021 at 12:53 PM Ulrich Windl
> wrote:
...
>> Feb 01 10:33:08 h16 pacemaker‑execd[7464]: notice:
> prm_xen_test‑jeos5_stop_0[33137] erro
>>> Andrei Borzenkov wrote on 01.02.2021 at 11:05 in message:
> On Mon, Feb 1, 2021 at 12:53 PM Ulrich Windl
> wrote:
>>
>> Hi!
>>
>> While fighting to get the wrong configuration, I broke libvirt
live‑migration
Of course I meant "*right*
Hi!
While fighting to get the wrong configuration, I broke libvirt live-migration
by not enabling the TLS socket.
When testing to live-migrate a VM from h16 to h18, these are the essential
events:
Feb 01 10:30:10 h16 pacemaker-schedulerd[7466]: notice: * Move
prm_cron_snap_test-jeos5
>>> "Sharma, Jaikumar" schrieb am 30.01.2021 um
13:41
in Nachricht
>> fence_drac5, fence_drac (not sure about that), SBD
> I've configured IPMI over LAN, giving static IP addresses to both nodes (at
> iDRAC level) in the cluster, and I can power reset/reboot both nodes in the
> cluster by
>>> Stuart Massey wrote on 29.01.2021 at 18:37 in message:
> Can someone help me with this?
> Background:
>
> "node01" is failing, and has been placed in "maintenance" mode. It
> occasionally loses connectivity.
>
> "node02" is able to run our resources
>
> Consider the following messages
>>> Andrei Borzenkov wrote on 29.01.2021 at 18:36 in
message <7bd34d6c-642f-0e44-e424-1445ebb30...@gmail.com>:
> 29.01.2021 14:19, Ulrich Windl wrote:
>> Hi!
>>
>> I'm having an odd failure using a systemd socket unit controlled by the
> cl
Hi!
I'm having an odd failure using a systemd socket unit controlled by the cluster.
(Personally I feel: "cluster and systemd: One resource controller too much".
But when you need to control a systemd unit...)
When the unit is already active, a start operation fails:
Jan 29 12:12:46 h16
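(For reference, such a unit would be configured roughly like this hedged crmsh
sketch; foo.socket is a placeholder, and the explicit .socket suffix keeps
pacemaker from assuming a .service unit:)

primitive prm_foo_socket systemd:foo.socket \
        op start timeout=100s interval=0 \
        op monitor interval=60s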
>>> Andrei Borzenkov wrote on 28.01.2021 at 18:30 in message:
> 27.01.2021 22:03, Ken Gaillot wrote:
>>
>> With a group, later members depend on earlier members. If an earlier
>> member can't run, then no members after it can run.
>>
>> However we can't make the dependency go in both
example?
Regards,
Ulrich
> I really hoped there was a solution or workaround for this, but as Ken
> clarified, pacemaker can't handle these exceptions.
>
> Many thanks for your quick and effective support.
>
> Have a good evening!
>
> Damiano
>
>
> Il giorno gio
Hi again!
I had made a mistake: defining resource utilization with a name that doesn't
exist as node capacity/utilization (mistyped it).
The effect was that the resource was stopped, but unfortunately ptest did not
tell me why
("Insuffient node capacity for resource ...")
However I'd think
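(A hedged way to make the placement reasoning visible; crm_simulate is ptest's
successor, flags from memory:)

# crm_simulate -L --show-scores --show-utilization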
Regards,
Ulrich
>
> Regards,
> xin
> ________
> From: Users on behalf of Ulrich Windl
>
> Sent: Thursday, January 28, 2021 4:46 PM
> To: users@clusterlabs.org
> Subject: [ClusterLabs] SLES15 SP2: crm shell crashed
>
> Hi!
>
> Trying
Ken,
thanks for analyzing the logs! See comments inline...
>>> Ken Gaillot wrote on 27.01.2021 at 19:55 in message
<644fc719a2e8870c332db859bcdef275d986249a.ca...@redhat.com>:
> On Wed, 2021‑01‑27 at 12:36 +0100, Ulrich Windl wrote:
...
>> Jan 27 10:43:48 h
>>> damiano giuliani wrote on 27.01.2021 at 19:25 in message:
> Hi Andrei, thanks for your help.
> If one of my resources in the group fails or the primary node goes down (
> in my case acspcmk-02 ), the probe notices it and pacemaker tries to
> restart the whole resource group on the second
>>> Ken Gaillot wrote on 27.01.2021 at 18:46 in message
<02cd90fcc10f1021d9f51649e2991da3209a6935.ca...@redhat.com>:
> On Wed, 2021-01-27 at 08:35 +0100, Ulrich Windl wrote:
>> > > > Tomas Jelinek wrote on 26.01.2021 at
>> > > > 16:15
Hi!
Trying to add one resource to my cluster that is very similar to an existing
one, I had forgotten to define one related resource.
When editing the config to "clone and adjust" constraints, crm shell crashed
when saving:
crm(live/h16)configure# edit
ERROR: constraint
>>> Tomas Jelinek wrote on 26.01.2021 at 16:15 in message
<48f935a5-184f-d2d7-7f1a-db596aa6c...@redhat.com>:
> Dne 25. 01. 21 v 17:01 Ken Gaillot napsal(a):
>> On Mon, 2021‑01‑25 at 09:51 +0100, Jehan‑Guillaume de Rorthais wrote:
>>> Hi Digimer,
>>>
>>> On Sun, 24 Jan 2021 15:31:22 ‑0500
>>>
>>> Ken Gaillot wrote on 26.01.2021 at 16:08 in message:
> On Tue, 2021‑01‑26 at 02:12 ‑0500, Digimer wrote:
>> Hi all,
>>
>> I created a resource with an INFINITE stop timeout;
>>
>> pcs resource create srv01‑test ocf:alteeve:server name="srv01‑test"
>> meta
>> allow‑migrate="true"
>>> Digimer wrote on 25.01.2021 at 19:18 in message
<18d77f26-b21b-4f2e-184c-c2280876d...@alteeve.ca>:
...
> If I understand what's been said in this thread, the host node got a
> shutdown request so it migrated the resource. Then the peer (new host)
> would have gotten the shutdown request,
>>> Strahil Nikolov wrote on 25.01.2021 at 12:28 in
message <1768184755.3488991.1611574085...@mail.yahoo.com>:
> Hi All,
> As you all know migrating a resource is actually manipulating the location
> constraint for that resource.
> Is there any plan for an option to control a default timeout
Hi!
I reconfigured my cluster to let it control virtlockd (instead of just "enable"
it in systemd). However I still have problems I don't understand:
When live-migrating a Xen PV I still get these messages:
Jan 25 12:38:06 h18 virtlockd[42724]: libvirt version: 6.0.0
Jan 25 12:38:06 h18
>>> Jehan-Guillaume de Rorthais wrote on 25.01.2021 at 09:51 in
message <20210125095132.575f55aa@firost>:
> Hi Digimer,
>
> On Sun, 24 Jan 2021 15:31:22 ‑0500
> Digimer wrote:
> [...]
>> I had a test server (srv01‑test) running on node 1 (el8‑a01n01), and on
>> node 2 (el8‑a01n02) I ran
drbd resource
> since we put the failing node in maintenance.
When you are in maintenance mode, monitor operations won't run AFAIK.
> We will watch for a bit longer.
> Thanks again
>
> On Thu, Jan 21, 2021 at 2:23 AM Ulrich Windl <
> ulrich.wi...@rz.uni-regensburg.de&
>>> Gang He wrote on 22.01.2021 at 09:44 in message:
>
> On 2021/1/22 16:17, Ulrich Windl wrote:
>>>>> Gang He wrote on 22.01.2021 at 09:13 in message
>> <1fd1c07d-d12c-fea9-4b17-90a977fe7...@suse.com>:
>>> Hi Ulrich,
>>&g
mlock
But cln_lockspace_ocfs2 provides the shared filesystem that lvmlockd uses. I
thought that for locking in a cluster it needs a cluster-wide filesystem.
>
>
> Thanks
> Gang
>
> On 2021/1/21 20:08, Ulrich Windl wrote:
>>>>> Gang He wrote on 21.01.2021 at 11:30 in message
>>> Ken Gaillot wrote on 22.01.2021 at 00:51 in message:
> Hi all,
>
> A recurring request we've seen from Pacemaker users is a feature called
> "non‑critical resources" in a proprietary product and "independent
> subtrees" in the old rgmanager project.
>
> An example is a large database
>>> Ken Gaillot wrote on 21.01.2021 at 17:24 in message
<28f8b077a30233efa41d04688eb21e82c8432ddd.ca...@redhat.com>:
> On Thu, 2021‑01‑21 at 08:19 +0100, Ulrich Windl wrote:
>> Hi!
>>
>> I have a question about utilization‑based resource placement
ched.
The only VG the cluster node sees is:
ph16:~ # vgs
VG   #PV #LV #SN Attr   VSize   VFree
sys    1   3   0 wz--n- 222.50g    0
Regards,
Ulrich
> I feel the problem was probably caused by lvmlock resource agent script,
> which did not handle this corner case correctly.
>
> Thanks
Hi!
I have a problem: For tests I had configured lvmlockd. Now that the tests have
ended, no LVM is used for cluster resources any more, but lvmlockd is still
configured.
Unfortunately I ran into this problem:
One OCFS2 mount was unmounted successfully, another holding the lockspace for
; IP, since that is the route to the IP addresses resolved for the host
>> names; that will certainly be the only route to the quorum device. I can
>> say that this cluster has run reasonably well for quite some time with
this
>> configuration prior to the recently developed hardware issues o
Hi!
I have a question about utilization-based resource placement (specifically:
placement-strategy=balanced):
Assume you have two resource capacities (say A and B) on each node, and each
resource also has a utilization parameter for both.
Both nodes have enough capacity for a resource to be
>>> "Ulrich Windl" schrieb am 15.01.2021
um
09:36 in Nachricht <6001541002a10003e...@gwsmtp.uni-regensburg.de>:
> Hi!
>
> The cluster I'm configuring (SLES15 SP2) fenced a node last night. Still
> unsure what exactly caused the fencing, but looking at
>>> Reid Wahl wrote on 19.01.2021 at 08:22 in message:
> On Mon, Jan 18, 2021 at 11:18 PM Ulrich Windl <
> ulrich.wi...@rz.uni-regensburg.de> wrote:
>
>> >>> Ken Gaillot wrote on 18.01.2021 at 19:29 in
>> message
>> <104
>>> Stuart Massey wrote on 19.01.2021 at 04:46 in message:
> So, we have a 2-node cluster with a quorum device. One of the nodes (node1)
> is having some trouble, so we have added constraints to prevent any
> resources migrating to it, but have not put it in standby, so that drbd in
>
>>> Ken Gaillot wrote on 18.01.2021 at 19:29 in message
<1047fd943be77f4a6fd4cd4dd19b65d1550512f8.ca...@redhat.com>:
> On Fri, 2021‑01‑15 at 11:40 +0100, Ulrich Windl wrote:
>> Hi!
>>
>> With a cluster recheck interval, I see periodic log messages
>>> Ken Gaillot wrote on 18.01.2021 at 19:20 in message
<06d171c5d33bcb20af71d534a94ce26a56bdd530.ca...@redhat.com>:
> On Fri, 2021‑01‑15 at 09:36 +0100, Ulrich Windl wrote:
>> Hi!
>>
>> The cluster I'm configuring (SLES15 SP2) fenced a node last night.
Hi!
It should be easy (I guess), but when requiring both masters to be on the same
node, can't you do with one DRBD device (something like putting an LVM VG on
that and providing two LVs)?
Regards,
Ulrich
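(A hedged sketch of the suggested layout; the device and names are placeholders:)

# pvcreate /dev/drbd0
# vgcreate vg_ha /dev/drbd0
# lvcreate -L 20G -n lv_one vg_ha
# lvcreate -L 20G -n lv_two vg_ha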
>>> wrote on 18.01.2021 at 18:43 in message:
> Hi again,
>
> I need some help to
>>> Andrei Borzenkov wrote on 18.01.2021 at 10:54 in message:
> On Mon, Jan 18, 2021 at 11:55 AM Ulrich Windl
> wrote:
> .
>>
>> So can someone explan, or direct me to some helpful docs?
>>
>
> Are you aware of https://libvirt.org/kbase/locki
d line was never written to persistent journal
What might help is running "journalctl -f" on a terminal: that way you see the
last messages received, even if they were not written to the filesystem (I
think). So when the host goes down, you still see the last messages.
Disk writes frequently miss the
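(Related: journald only keeps messages across reboots when persistent storage
is enabled, e.g. in /etc/systemd/journald.conf:)

[Journal]
Storage=persistent
# then: systemctl restart systemd-journald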
Hi!
I'm migrating our Xen PVM environment from SLES11 SP4 to SLES15 SP2. As it
seems libvirt is the preferred framework to use, I configured VirtualDomain
resources instead of Xen ones. I had to move the configuration from Xen xm via
Xen xl to libvirt.
What I couldn't get from the docs is whether and when I