[ClusterLabs] Release crmsh 4.6.0-rc1

2023-12-07 Thread Xin Liang via Users
Hello everyone!


I am pleased to announce that crmsh 4.6.0-rc1 is now available for release!

Changes since tag 4.5.0

Features:
- bootstrap: Support ssh-agent, so crmsh no longer needs to rely on private 
keys stored on the cluster nodes (#1261)
- prun: Replace parallax with crmsh.prun to support non-root sudoer users 
(#1147)
- report: Rewrite crm report and improve performance noticeably 
(#1246)

Major fixes:
- Fix: bootstrap: fix the owner and permission of file authorized_keys 
(bsc#1217279)
- Fix: prun: should not call user_pair_for_ssh() when target host is localhost 
(bsc#1217094)
- Fix: utils: Add 'sudo' only when there is a sudoer (bsc#1215549)
- Fix: report: Pick up tarball suffix dynamically (bsc#1215438)
- Fix: report: Pick 'gzip' as the first compression program for cross-platform 
compatibility (bsc#1215438)
- Fix: constants: Add several resource meta attributes (bsc#1215319)
- Fix: upgradeutil: reduce the timeout for getting sequence from remote node 
(bsc#1213797)
- Fix: userdir: Get the effective user name instead of using getpass.getuser 
(bsc#1213821)
- Fix: upgradeutil: support the change of path of upgrade_seq in crmsh-4.5 
(bsc#1213050)
- Fix: ui_context: wait4dc should assume a subcommand completes successfully if 
no exceptions are raised (bsc#1212992)
- Fix: bootstrap: fix the validation of option -N and -c (bsc#1212436)
- Fix: geo_cluster: the behavior of choosing a default user in 
geo_join/geo_init_arbitrator is different from `cluster join` (bsc#1211817)
- Fix: utils: do not use sudoer user to create ssh session unless it is 
specified explicitly (bsc#1211817)
- Dev: refine non-root sudoer support for crmsh.parallax.parallax_call 
(bsc#1210709)
- Fix: bootstrap: `init --qnetd-hostname` fails when username is not specified 
(bsc#1211200)
- Fix: bootstrap: crm cluster join default behavior change in ssh key handling 
(bsc#1210693)
- Fix: help: Long time to load and parse crm.8.adoc (bsc#1210198)
- Fix: cibconfig: use any existing rsc_defaults set rather than create another 
one (bsc#1210614)
- Fix: lock: Join node failed to wait for the init node to finish (bsc#1210332)

For more change details please see 
https://github.com/ClusterLabs/crmsh/blob/crmsh-4.6/ChangeLog
Thanks to everyone who contributed to this release!
Any feedback and suggestions are welcome!

Regards,
xin

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] ethernet link up/down - ?

2023-12-07 Thread Reid Wahl
On Thu, Dec 7, 2023 at 7:34 AM lejeczek via Users  wrote:
>
>
>
> On 04/12/2023 20:58, Reid Wahl wrote:
> > On Thu, Nov 30, 2023 at 10:30 AM lejeczek via Users
> >  wrote:
> >>
> >>
> >> On 07/02/2022 20:09, lejeczek via Users wrote:
> >>> Hi guys
> >>>
> >>> How do you guys go about doing link up/down as a resource?
> >>>
> >>> many thanks, L.
> >>>
> >> With simple tests I confirmed that indeed Linux - on my
> >> hardware at least - can easily power an eth link down and up - if
> >> a @devel reads this:
> >> Is there an agent in the suite which a non-programmer could
> >> easily (and, above all, safely) adapt for such a purpose?
> >> I understand such an agent has to be cloneable & promotable.
> > The iface-bridge resource appears to do something similar for bridges.
> > I don't see anything currently for links in general.
> >
> >
> Where can I find that agent?

https://github.com/ClusterLabs/resource-agents/blob/main/heartbeat/iface-bridge

> Any comment on the idea of introducing such a
> "link" agent into the resource-agents suite in the future?
> Should I go to GitHub and suggest it there, perhaps?
> Ideally it would be done by a developer, rather than
> by us users/admins.

You can file an issue on GitHub, yes:
https://github.com/ClusterLabs/resource-agents/issues

Developer resources are finite and we get requests for new agents
regularly. If you're a subscriber to RHEL or SUSE or something similar,
then I'd recommend filing an RFE through your distribution.
(That would be necessary anyway if you want such an agent to get into
a downstream package.)

You could also write the agent and submit a pull request to get it
merged into the resource-agents repo.
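
For anyone tempted to try, here is a bare sketch of what the core actions of such a hypothetical "link" agent might look like. This is illustrative only, not a real resource agent: the function name and interface are made up, the ip(8) commands are echoed as a dry run rather than executed, and a real OCF agent would also need meta-data and validate-all actions plus proper OCF exit codes.

```shell
#!/bin/sh
# Illustrative core of a hypothetical "iface-link" agent -- NOT a real
# resource agent. Commands are echoed (dry run) instead of executed.

iface_link_action() {
    iface="$1"; action="$2"
    case "$action" in
        # A real agent would execute these ip(8) commands directly
        # and translate their results into OCF exit codes.
        start)   echo "ip link set $iface up" ;;
        stop)    echo "ip link set $iface down" ;;
        monitor) echo "ip link show $iface" ;;
        *)       return 3 ;;   # OCF_ERR_UNIMPLEMENTED
    esac
}

iface_link_action eth0 start   # prints: ip link set eth0 up
```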

> thanks, L.



-- 
Regards,

Reid Wahl (He/Him)
Senior Software Engineer, Red Hat
RHEL High Availability - Pacemaker



Re: [ClusterLabs] How to achieve next order behavior for start/stop action for failover

2023-12-07 Thread Reid Wahl
On Thu, Dec 7, 2023 at 7:07 AM Novik Arthur  wrote:
>
> Thank you Reid!
> "Mandatory" with "symmetrical=false" did exactly what I wanted.
>
> Sincerely yours,
> A

Adding the list back to confirm this is resolved.

Thanks for confirming and for correcting my typo! Glad it's working as desired.
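
For archive readers, the combination that resolved this looks roughly like the following (crmsh syntax, resource names taken from the thread; a sketch under those assumptions, not a verified configuration):

```
order mdt-after-mgs Mandatory: mgs:start mdt:start symmetrical=false
order ost-after-mgs Mandatory: mgs:start ost:start symmetrical=false
order ost-after-mdt Mandatory: mdt:start ost:start symmetrical=false
```

With symmetrical=false the constraints govern start order only, so stopping mgs does not cascade stop actions onto mdt/ost.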

>
> пн, 4 дек. 2023 г. в 22:51, Reid Wahl :
>>
>> On Mon, Dec 4, 2023 at 8:48 AM Novik Arthur  wrote:
>> >
>> > Hello community!
>> > I'm not sure if pacemaker can do it or not with its current logic (maybe it 
>> > could be a new feature), but it's worth asking before starting to "build 
>> > my own Luna-park, with blackjack and "
>> >
>> > Right now I have something like:
>> > MGS -> MDT -> OST
>> > order mdt-after-mgs Optional: mgs:start mdt:start
>> > order ost-after-mgs Optional: mgs:start ost:start
>> > order ost-after-mdt Optional: mdt:start ost:start
>> >
>> > We have 4 nodes (A,B,C,D).
>> > Nodes A and B carry MGS.
>> > Nodes A,B,C,D carry MDT000[0-3] - one per node.
>> > Nodes A,B,C,D carry OST000[0-3] - one per node.
>> > If we stop nodes A and B, MGS will be stopped since there is NO placement 
>> > left for it to start on, but MDT000[0-1] and OST000[0-1] could fail over; they 
>> > will try to do so and will fail, since MGS is mandatory for us (and in the 
>> > end will be blocked). But I use Optional to avoid an unnecessary stop/start 
>> > chain for MDT/OST.
>> >
>> > I want to avoid unnecessary STOP/START actions on each dependent resource 
>> > in case of failover, but preserve the order and enforce the MGS dependency for 
>> > those resources which are stopped (so, to start I need to follow the chain, 
>> > and if already started then do nothing). Think of it as separate procedures 
>> > for first start and for failover during operation... like soft-mandatory or 
>> > something like that.
>>
>> You might try adding "symmetric=false". That option means the
>> constraints apply to start but not to stop.
>>
>> Otherwise, I'm struggling to understand the actual vs. desired
>> behavior here. Perhaps some example outputs or logs would be helpful
>> to illustrate it.
>>
>> All of these ordering constraints are optional. This means they're
>> applied only when both actions would be scheduled. For example, if mgs
>> and mdt are both scheduled to start, then mgs must start first; but if
>> only mdt is scheduled to start, then it does not depend on mgs.
>>
>> Perhaps the fact that these are cloned resources is causing the
>> ordering constraints to behave differently from expectation... I'm not
>> sure.
>>
>>
>> > I think that if I tweak OCF start/stop (make them dummy and always 
>> > succeed) and move all logic to monitors with deep checks, so that the monitor 
>> > could mount/umount etc., and assign transient attributes which track 
>> > readiness to start, and create location rules which prefer/honor the 
>> > transient attributes, then I could achieve the desired state, but it looks very 
>> > complex and probably isn't worth it...
>>
>> Sometimes for complex dependencies, it can be helpful to configure an
>> ocf:pacemaker:attribute resource (which usually depends on other
>> resources in turn). This resource agent sets a node attribute on the
>> node where it's running. The attribute can be useful in rules as part
>> of more complicated constraint schemes.
>>
>> >
>> > I would love to see any thoughts about it.
>> >
>> > Thanks,
>> > A
>> >
>>
>>
>>
>> --
>> Regards,
>>
>> Reid Wahl (He/Him)
>> Senior Software Engineer, Red Hat
>> RHEL High Availability - Pacemaker
>>


-- 
Regards,

Reid Wahl (He/Him)
Senior Software Engineer, Red Hat
RHEL High Availability - Pacemaker
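
The ocf:pacemaker:attribute pattern mentioned above could be sketched like this in crmsh syntax (resource, constraint, and attribute names are illustrative, not taken from the thread):

```
# Set node attribute "mgs-active" on whichever node runs this resource.
# Ordered after mgs, so the attribute loosely tracks mgs availability.
primitive mgs-flag ocf:pacemaker:attribute \
    params name=mgs-active active_value=1 inactive_value=0
order flag-after-mgs Mandatory: mgs:start mgs-flag:start
# Location or colocation rules elsewhere can then test the
# mgs-active node attribute as part of a larger constraint scheme.
```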



[ClusterLabs] Pacemaker 2.1.7-rc3 now available (likely final)

2023-12-07 Thread Ken Gaillot
Hi all,

Source code for the third (and likely final) release candidate for
Pacemaker version 2.1.7 is available at:

 
https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-2.1.7-rc3

This release candidate fixes a couple of issues introduced in rc1. See the
ChangeLog or the link above for details.

Everyone is encouraged to download, build, and test the new release. We
do many regression tests and simulations, but we can't cover all
possible use cases, so your feedback is important and appreciated.

This is probably your last chance to test before the final release,
which I expect in about two weeks. If anyone needs more time, let me
know and I can delay it till early January.
-- 
Ken Gaillot 



Re: [ClusterLabs] ethernet link up/down - ?

2023-12-07 Thread lejeczek via Users



On 04/12/2023 20:58, Reid Wahl wrote:

On Thu, Nov 30, 2023 at 10:30 AM lejeczek via Users
 wrote:



On 07/02/2022 20:09, lejeczek via Users wrote:

Hi guys

How do you guys go about doing link up/down as a resource?

many thanks, L.


With simple tests I confirmed that indeed Linux - on my
hardware at least - can easily power an eth link down and up - if
a @devel reads this:
Is there an agent in the suite which a non-programmer could
easily (and, above all, safely) adapt for such a purpose?
I understand such an agent has to be cloneable & promotable.

The iface-bridge resource appears to do something similar for bridges.
I don't see anything currently for links in general.



Where can I find that agent?

Any comment on the idea of introducing such a
"link" agent into the resource-agents suite in the future?

Should I go to GitHub and suggest it there, perhaps?
Ideally it would be done by a developer, rather than
by us users/admins.

thanks, L.


Re: [ClusterLabs] [EXT] Prevent cluster transition when resource unavailable on both nodes

2023-12-07 Thread Windl, Ulrich
Hi!

What about this: run a ping resource that monitors a remote node and sets a 
score attribute. If the remote is unreachable, the score will reflect that.
Then add a rule checking that score, deciding whether to run the virtual IP or not.
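
A rough crmsh sketch of that pattern (the gateway IP, multiplier, and all resource/constraint names are placeholders, and "virtual-ip" stands in for the cluster's actual virtual IP resource):

```
# Ping an external host; ocf:pacemaker:ping sets the "pingd" node
# attribute, scaled by the multiplier, on each node running the clone.
primitive ping-gw ocf:pacemaker:ping \
    params host_list=192.168.1.1 multiplier=1000 \
    op monitor interval=15s
clone ping-gw-clone ping-gw
# Forbid the virtual IP on nodes where the gateway is unreachable.
location vip-needs-gw virtual-ip \
    rule -inf: not_defined pingd or pingd lte 0
```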

Regards,
Ulrich

-Original Message-
From: Users  On Behalf Of Alexander Eastwood
Sent: Wednesday, December 6, 2023 5:56 PM
To: users@clusterlabs.org
Subject: [EXT] [ClusterLabs] Prevent cluster transition when resource 
unavailable on both nodes

Hello, 

I administrate a Pacemaker cluster consisting of 2 nodes, which are connected 
to each other via ethernet cable to ensure that they are always able to 
communicate with each other. A network switch is also connected to each node 
via ethernet cable and provides external access.

One of the managed resources of the cluster is a virtual IP, which is assigned 
to a physical network interface card and thus depends on the network switch 
being available. The virtual IP is always hosted on the active node.

We had the situation where the network switch lost power or was rebooted, as a 
result both servers reported `NIC Link is Down`. The recover operation on the 
Virtual IP resource then failed repeatedly on the active node, and a transition 
was initiated. Since the other node was also unable to start the resource, the 
cluster was bouncing between the two nodes until the NIC links were up again.

Is there a way to change this behaviour? I am thinking of the following 
sequence of events, but have not been able to find a way to configure this:

 1. active node detects NIC Link is Down, which affects a resource managed by 
the cluster (monitor operation on the resource starts to fail)
 2. active node checks if the other (passive) node in the cluster would be able 
to start the resource
 3. if passive node can start the resource, transition all resources to passive 
node
 4. if the passive node is unable to start the resource, then there is nothing to 
be gained by a transition, so no action should be taken

Any pointers or advice will be much appreciated!

Thank you and kind regards,

Alex Eastwood