Re: [ClusterLabs] resource cloned group colocations

2023-03-02 Thread Andrei Borzenkov
On Thu, Mar 2, 2023 at 4:16 PM Gerald Vogt  wrote:
>
> On 02.03.23 13:51, Klaus Wenninger wrote:
> > Now if I stop pacemaker on one of those nodes, e.g. on node ha2, it's
> > fine. ip2 will be moved immediately to ha3. Good.
> >
> > However, if pacemaker on ha2 starts up again, it will immediately
> > remove ip2 from ha3 and keep it offline, while the services in the
> > group are starting on ha2. As the services unfortunately take some
> > time to come up, ip2 is offline for more than a minute.
> >
> > It seems the colocations with the clone are already satisfied once the
> > clone group begins to start services, which allows the ip to be
> > removed from the current node.
> >
> >
> > To achieve this you have to add orders on top of colocations.
>
> I don't understand that.
>
> "order" and "colocation" are constraints. They work on resources.
>
> I don't see how I could add an order on top of a colocation constraint...
>

You cannot, but an asymmetrical serializing constraint may do it:

first start clone, then stop ip

When the new node comes up, pacemaker builds a transition which starts
the clone on the new node and moves the ip (stops it on the old node and
starts it on the new node). These actions are (should be) part of the
same transition, so serializing constraints should apply.
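
For example, something along these lines in crm shell syntax (the
constraint IDs are illustrative; a Mandatory order with symmetrical=false
is one reading of "asymmetrical serializing", kind=Serialize would be
another):

   order clone-start-before-ip1-stop Mandatory: mail-services-clone:start mail1-ip:stop symmetrical=false
   order clone-start-before-ip2-stop Mandatory: mail-services-clone:start mail2-ip:stop symmetrical=false

With symmetrical=false, pacemaker should not add the implied reverse
ordering, so this only delays the ip stop until the clone has started.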


Re: [ClusterLabs] resource cloned group colocations

2023-03-02 Thread Vladislav Bogdanov
On Thu, 2023-03-02 at 08:41 +0100, Gerald Vogt wrote:
> Hi,
> 
> I am setting up a mail relay cluster whose main purpose is to maintain
> the service ips via IPaddr2 and move them between cluster nodes when
> necessary.
> 
> The service ips should only be active on nodes which are running all 
> necessary mail (systemd) services.
> 
> So I have set up a resource for each of those services, put them into
> a group in the order they should start, and cloned the group, as they
> are normally supposed to run on the nodes at all times.
> 
> Then I added an order constraint
>    start mail-services-clone then start mail1-ip
>    start mail-services-clone then start mail2-ip
> 
> and colocations to prefer running the ips on different nodes but only
> with the clone running:
> 
>    colocation add mail2-ip with mail1-ip -1000
>    colocation ip1 with mail-services-clone
>    colocation ip2 with mail-services-clone
> 
> as well as a location constraint to prefer running the first ip on
> the 
> first node and the second on the second
> 
>    location ip1 prefers ha1=2000
>    location ip2 prefers ha2=2000
> 
> Now if I stop pacemaker on one of those nodes, e.g. on node ha2, it's
> fine. ip2 will be moved immediately to ha3. Good.
> 
> However, if pacemaker on ha2 starts up again, it will immediately
> remove 
> ip2 from ha3 and keep it offline, while the services in the group are
> starting on ha2. As the services unfortunately take some time to come
> up, ip2 is offline for more than a minute.
> 
> It seems the colocations with the clone are already satisfied once the
> clone group begins to start services, which allows the ip to be
> removed from the current node.
> 
> I was wondering: how can I define the colocation so that it is
> satisfied only once all services in the clone have been started, and
> not as soon as the first service in the clone is starting?
> 
> Thanks,
> 
> Gerald
> 

I noticed such behavior many years ago - it is especially visible with
long-starting resources - and one of the techniques to deal with it is
to use transient node attributes instead of colocation/order constraints
between the group and the VIP.
I'm not sure there is a suitable open-source resource agent which just
manages a specified node attribute, but it should not be hard to compose
one which implements a pseudo-resource handler together with
attrd_updater calls. You could probably trim everything ethernet-related
from the ethmonitor agent to make such an almost-dummy resource agent.
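
A minimal sketch of such a pseudo-resource agent (shell, untested; the
agent itself, its 'name' parameter and the default attribute name are
illustrative assumptions, not an existing agent):

  #!/bin/sh
  # Pseudo-resource agent whose only job is to maintain a transient node
  # attribute via attrd_updater (sketch, not a complete/tested OCF agent).
  OCF_SUCCESS=0; OCF_ERR_UNIMPLEMENTED=3; OCF_NOT_RUNNING=7

  ATTR="${OCF_RESKEY_name:-mail-clone-started}"

  case "$1" in
    start)
      # set the attribute on this node; attrd_updater-managed attributes
      # are transient, i.e. kept in the status section, not the CIB config
      attrd_updater -n "$ATTR" -U 1 || exit 1
      exit $OCF_SUCCESS ;;
    stop)
      # remove the attribute so rule-based constraints stop matching here
      attrd_updater -n "$ATTR" -D
      exit $OCF_SUCCESS ;;
    monitor)
      # report "running" only while the attribute is set to 1 on this node
      attrd_updater -n "$ATTR" -Q 2>/dev/null | grep -q 'value="1"' \
        && exit $OCF_SUCCESS
      exit $OCF_NOT_RUNNING ;;
    meta-data)
      # a real agent must print proper OCF meta-data XML here (omitted)
      exit $OCF_SUCCESS ;;
    *)
      exit $OCF_ERR_UNIMPLEMENTED ;;
  esac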

Once the RA is there, you can add it as the last resource in the group,
and then rely on the attribute it manages to start your VIP.
That is done with location constraints; just use score-attribute in
their rules:
https://clusterlabs.org/pacemaker/doc/2.1/Pacemaker_Explained/singlehtml/index.html#rule-properties

So, the idea is: your custom RA sets the attribute 'mail-clone-started'
to something like 1, and you have a location constraint which prevents
the cluster from starting your VIP resource on a node if the value of
'mail-clone-started' on that node is less than 1 or not defined.
Once a node has that attribute set (which happens at the very end of
the group's start sequence), then (and only then) the cluster decides
to move your VIP to that node (because of the other location constraints
with preferences you already have).
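
A hedged sketch of those pieces in crm shell syntax (the resource and
constraint IDs, and the ocf:custom:node-attr agent, are illustrative
placeholders for whatever you end up composing):

  # attribute-setting resource; append it as the last member of the
  # existing mail services group so it only starts once the rest is up
  primitive mail-attr ocf:custom:node-attr \
    params name=mail-clone-started op monitor interval=10s
  # keep the VIPs off any node where the attribute is missing or not 1 yet
  location ip1-needs-clone mail1-ip \
    rule -inf: not_defined mail-clone-started or mail-clone-started number:lt 1
  location ip2-needs-clone mail2-ip \
    rule -inf: not_defined mail-clone-started or mail-clone-started number:lt 1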

Just make sure attributes are transient (not stored into CIB).




Re: [ClusterLabs] resource cloned group colocations

2023-03-02 Thread Gerald Vogt

On 02.03.23 13:51, Klaus Wenninger wrote:

> > Now if I stop pacemaker on one of those nodes, e.g. on node ha2, it's
> > fine. ip2 will be moved immediately to ha3. Good.
> >
> > However, if pacemaker on ha2 starts up again, it will immediately
> > remove ip2 from ha3 and keep it offline, while the services in the
> > group are starting on ha2. As the services unfortunately take some
> > time to come up, ip2 is offline for more than a minute.
> >
> > It seems the colocations with the clone are already satisfied once the
> > clone group begins to start services, which allows the ip to be
> > removed from the current node.
>
> To achieve this you have to add orders on top of colocations.


I don't understand that.

"order" and "colocation" are constraints. They work on resources.

I don't see how I could add an order on top of a colocation constraint...

Thanks,

Gerald



Re: [ClusterLabs] resource cloned group colocations

2023-03-02 Thread Klaus Wenninger
On Thu, Mar 2, 2023 at 8:41 AM Gerald Vogt  wrote:

> Hi,
>
> I am setting up a mail relay cluster whose main purpose is to maintain
> the service ips via IPaddr2 and move them between cluster nodes when
> necessary.
>
> The service ips should only be active on nodes which are running all
> necessary mail (systemd) services.
>
> So I have set up a resource for each of those services, put them into a
> group in the order they should start, and cloned the group, as they are
> normally supposed to run on the nodes at all times.
>
> Then I added an order constraint
>start mail-services-clone then start mail1-ip
>start mail-services-clone then start mail2-ip
>
> and colocations to prefer running the ips on different nodes but only
> with the clone running:
>
>colocation add mail2-ip with mail1-ip -1000
>colocation ip1 with mail-services-clone
>colocation ip2 with mail-services-clone
>
> as well as a location constraint to prefer running the first ip on the
> first node and the second on the second
>
>location ip1 prefers ha1=2000
>location ip2 prefers ha2=2000
>
> Now if I stop pacemaker on one of those nodes, e.g. on node ha2, it's
> fine. ip2 will be moved immediately to ha3. Good.
>
> However, if pacemaker on ha2 starts up again, it will immediately remove
> ip2 from ha3 and keep it offline, while the services in the group are
> starting on ha2. As the services unfortunately take some time to come
> up, ip2 is offline for more than a minute.
>
> It seems the colocations with the clone are already satisfied once the
> clone group begins to start services, which allows the ip to be removed
> from the current node.
>

To achieve this you have to add orders on top of colocations.

Klaus


>
> I was wondering: how can I define the colocation so that it is
> satisfied only once all services in the clone have been started, and
> not as soon as the first service in the clone is starting?
>
> Thanks,
>
> Gerald
>
>


[ClusterLabs] resource cloned group colocations

2023-03-01 Thread Gerald Vogt

Hi,

I am setting up a mail relay cluster whose main purpose is to maintain
the service ips via IPaddr2 and move them between cluster nodes when
necessary.


The service ips should only be active on nodes which are running all 
necessary mail (systemd) services.


So I have set up a resource for each of those services, put them into a
group in the order they should start, and cloned the group, as they are
normally supposed to run on the nodes at all times.


Then I added an order constraint
  start mail-services-clone then start mail1-ip
  start mail-services-clone then start mail2-ip

and colocations to prefer running the ips on different nodes but only 
with the clone running:


  colocation add mail2-ip with mail1-ip -1000
  colocation ip1 with mail-services-clone
  colocation ip2 with mail-services-clone

as well as a location constraint to prefer running the first ip on the 
first node and the second on the second


  location ip1 prefers ha1=2000
  location ip2 prefers ha2=2000

Now if I stop pacemaker on one of those nodes, e.g. on node ha2, it's 
fine. ip2 will be moved immediately to ha3. Good.


However, if pacemaker on ha2 starts up again, it will immediately remove 
ip2 from ha3 and keep it offline, while the services in the group are 
starting on ha2. As the services unfortunately take some time to come 
up, ip2 is offline for more than a minute.


It seems the colocations with the clone are already satisfied once the
clone group begins to start services, which allows the ip to be removed
from the current node.


I was wondering: how can I define the colocation so that it is
satisfied only once all services in the clone have been started, and
not as soon as the first service in the clone is starting?


Thanks,

Gerald

