Re: [ClusterLabs] resource cloned group colocations
On Thu, Mar 2, 2023 at 4:16 PM Gerald Vogt wrote:
> On 02.03.23 13:51, Klaus Wenninger wrote:
> > > Now if I stop pacemaker on one of those nodes, e.g. on node ha2, it's
> > > fine. ip2 will be moved immediately to ha3. Good.
> > >
> > > However, if pacemaker on ha2 starts up again, it will immediately
> > > remove ip2 from ha3 and keep it offline, while the services in the
> > > group are starting on ha2. As the services unfortunately take some
> > > time to come up, ip2 is offline for more than a minute.
> > >
> > > It seems the colocations with the clone are already satisfied once
> > > the clone group begins to start services, which allows the ip to be
> > > removed from the current node.
> >
> > To achieve this you have to add orders on top of collocations.
>
> I don't understand that. "order" and "colocation" are constraints. They
> work on resources. I don't see how I could add an order on top of a
> colocation constraint...

You cannot, but an asymmetrical serializing constraint may do it: first
start the clone, then stop the IP. When the new node comes up, Pacemaker
builds a transition which starts the clone on the new node and moves the
IP (stops it on the old node and starts it on the new node). These
actions are (should be) part of the same transition, so serializing
constraints should apply.
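In pcs syntax that might look like the following (an untested sketch using
the resource names from the original post; symmetrical=false is what keeps
the constraint from also ordering the reverse direction):

   # Only stop mail2-ip once the clone has finished starting on the
   # rejoining node; the reverse direction is deliberately left unordered.
   pcs constraint order start mail-services-clone then stop mail2-ip symmetrical=false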
Re: [ClusterLabs] resource cloned group colocations
On Thu, 2023-03-02 at 08:41 +0100, Gerald Vogt wrote:
> Hi,
>
> I am setting up a mail relay cluster whose main purpose is to maintain
> the service IPs via IPaddr2 and move them between cluster nodes when
> necessary.
>
> The service IPs should only be active on nodes which are running all
> necessary mail (systemd) services.
>
> So I have set up a resource for each of those services, put them into a
> group in the order they should start, and cloned the group, since they
> are normally supposed to run on all nodes at all times.
>
> Then I added order constraints:
>
>    start mail-services-clone then start mail1-ip
>    start mail-services-clone then start mail2-ip
>
> and colocations to prefer running the IPs on different nodes, but only
> with the clone running:
>
>    colocation add mail2-ip with mail1-ip -1000
>    colocation ip1 with mail-services-clone
>    colocation ip2 with mail-services-clone
>
> as well as location constraints to prefer running the first IP on the
> first node and the second on the second:
>
>    location ip1 prefers ha1=2000
>    location ip2 prefers ha2=2000
>
> Now if I stop pacemaker on one of those nodes, e.g. on node ha2, it's
> fine. ip2 will be moved immediately to ha3. Good.
>
> However, if pacemaker on ha2 starts up again, it will immediately
> remove ip2 from ha3 and keep it offline, while the services in the
> group are starting on ha2. As the services unfortunately take some time
> to come up, ip2 is offline for more than a minute.
>
> It seems the colocations with the clone are already satisfied once the
> clone group begins to start services, which allows the ip to be removed
> from the current node.
>
> I was wondering: how can I define the colocation so that it is
> satisfied only once all services in the clone have been started, and
> not once the first service in the clone is starting?
>
> Thanks,
>
> Gerald

I noticed such behavior many years ago - it is especially visible with
long-starting resources - and one technique to deal with it is to use
transient node attributes instead of colocation/order constraints
between the group and the VIP.

I'm not sure there is a suitable open-source resource agent which just
manages a specified node attribute, but it should not be hard to compose
one which implements a pseudo-resource handler together with
attrd_updater calls. You could probably strip everything
ethernet-related from ethmonitor to make such an almost-dummy resource
agent.

Once the RA is there, you can add it as the last resource in the group
and then rely on the attribute it manages to start your VIP. That is
done with location constraints; just use score-attribute in their
rules -
https://clusterlabs.org/pacemaker/doc/2.1/Pacemaker_Explained/singlehtml/index.html#rule-properties

So, the idea is: your custom RA sets the attribute 'mail-clone-started'
to something like 1, and you have a location constraint which prevents
the cluster from starting your VIP resource on a node if the value of
'mail-clone-started' on that node is less than 1 or not defined. Once a
node has that attribute set (which happens at the very end of the
group's start sequence), then (and only then) the cluster decides to
move your VIP to that node (because of the other location constraints
with preferences you already have). Just make sure the attributes are
transient (not stored in the CIB).
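A minimal sketch of such a pseudo-resource agent (untested; the agent
name, the 'name' parameter, and the attribute default are made up, and a
real agent must also print OCF metadata XML, omitted here for brevity):

   #!/bin/sh
   # attribute-publisher: hypothetical pseudo-resource agent which only
   # publishes a transient node attribute while it is "running".
   : ${OCF_ROOT:=/usr/lib/ocf}
   . ${OCF_ROOT}/lib/heartbeat/ocf-shellfuncs

   ATTR="${OCF_RESKEY_name:-mail-clone-started}"

   case "$1" in
     start)
         # attrd_updater writes transient attributes (status section only)
         attrd_updater -n "$ATTR" -U 1 || exit $OCF_ERR_GENERIC
         exit $OCF_SUCCESS
         ;;
     stop)
         attrd_updater -n "$ATTR" -D
         exit $OCF_SUCCESS
         ;;
     monitor)
         # "Running" means the attribute is currently set on this node;
         # the -Q output format is assumed from Pacemaker 2.x
         attrd_updater -n "$ATTR" -Q 2>/dev/null | grep -q 'value="1"' \
             && exit $OCF_SUCCESS
         exit $OCF_NOT_RUNNING
         ;;
     meta-data)
         exit $OCF_SUCCESS   # real metadata XML omitted in this sketch
         ;;
     *)
         exit $OCF_ERR_UNIMPLEMENTED
         ;;
   esac

With the agent appended to the group, the guarding rule could be
expressed in pcs like this (this variant uses a plain -INFINITY score;
the score-attribute form mentioned above is another option):

   # Keep the VIP off any node where the attribute is unset or below 1
   pcs constraint location mail2-ip rule score=-INFINITY \
       not_defined mail-clone-started or mail-clone-started lt integer 1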
Re: [ClusterLabs] resource cloned group colocations
On 02.03.23 13:51, Klaus Wenninger wrote:
> > Now if I stop pacemaker on one of those nodes, e.g. on node ha2, it's
> > fine. ip2 will be moved immediately to ha3. Good.
> >
> > However, if pacemaker on ha2 starts up again, it will immediately
> > remove ip2 from ha3 and keep it offline, while the services in the
> > group are starting on ha2. As the services unfortunately take some
> > time to come up, ip2 is offline for more than a minute.
> >
> > It seems the colocations with the clone are already satisfied once the
> > clone group begins to start services, which allows the ip to be
> > removed from the current node.
>
> To achieve this you have to add orders on top of collocations.

I don't understand that. "order" and "colocation" are constraints. They
work on resources. I don't see how I could add an order on top of a
colocation constraint...

Thanks,

Gerald
Re: [ClusterLabs] resource cloned group colocations
On Thu, Mar 2, 2023 at 8:41 AM Gerald Vogt wrote:
> Hi,
>
> I am setting up a mail relay cluster whose main purpose is to maintain
> the service IPs via IPaddr2 and move them between cluster nodes when
> necessary.
>
> The service IPs should only be active on nodes which are running all
> necessary mail (systemd) services.
>
> So I have set up a resource for each of those services, put them into a
> group in the order they should start, and cloned the group, since they
> are normally supposed to run on all nodes at all times.
>
> Then I added order constraints:
>
>    start mail-services-clone then start mail1-ip
>    start mail-services-clone then start mail2-ip
>
> and colocations to prefer running the IPs on different nodes, but only
> with the clone running:
>
>    colocation add mail2-ip with mail1-ip -1000
>    colocation ip1 with mail-services-clone
>    colocation ip2 with mail-services-clone
>
> as well as location constraints to prefer running the first IP on the
> first node and the second on the second:
>
>    location ip1 prefers ha1=2000
>    location ip2 prefers ha2=2000
>
> Now if I stop pacemaker on one of those nodes, e.g. on node ha2, it's
> fine. ip2 will be moved immediately to ha3. Good.
>
> However, if pacemaker on ha2 starts up again, it will immediately
> remove ip2 from ha3 and keep it offline, while the services in the
> group are starting on ha2. As the services unfortunately take some time
> to come up, ip2 is offline for more than a minute.
>
> It seems the colocations with the clone are already satisfied once the
> clone group begins to start services, which allows the ip to be removed
> from the current node.

To achieve this you have to add orders on top of collocations.

Klaus

> I was wondering: how can I define the colocation so that it is
> satisfied only once all services in the clone have been started, and
> not once the first service in the clone is starting?
>
> Thanks,
>
> Gerald
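That is, for each colocation of an IP with the clone there would also be
a matching order constraint, e.g. in pcs syntax (an untested sketch
using the resource names from the quoted post, which already shows the
start-start orders; the point is that both constraint types are needed
together):

   pcs constraint colocation add mail1-ip with mail-services-clone INFINITY
   pcs constraint order start mail-services-clone then start mail1-ip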
[ClusterLabs] resource cloned group colocations
Hi,

I am setting up a mail relay cluster whose main purpose is to maintain
the service IPs via IPaddr2 and move them between cluster nodes when
necessary.

The service IPs should only be active on nodes which are running all
necessary mail (systemd) services.

So I have set up a resource for each of those services, put them into a
group in the order they should start, and cloned the group, since they
are normally supposed to run on all nodes at all times.

Then I added order constraints:

   start mail-services-clone then start mail1-ip
   start mail-services-clone then start mail2-ip

and colocations to prefer running the IPs on different nodes, but only
with the clone running:

   colocation add mail2-ip with mail1-ip -1000
   colocation ip1 with mail-services-clone
   colocation ip2 with mail-services-clone

as well as location constraints to prefer running the first IP on the
first node and the second on the second:

   location ip1 prefers ha1=2000
   location ip2 prefers ha2=2000

Now if I stop pacemaker on one of those nodes, e.g. on node ha2, it's
fine. ip2 will be moved immediately to ha3. Good.

However, if pacemaker on ha2 starts up again, it will immediately remove
ip2 from ha3 and keep it offline, while the services in the group are
starting on ha2. As the services unfortunately take some time to come
up, ip2 is offline for more than a minute.

It seems the colocations with the clone are already satisfied once the
clone group begins to start services, which allows the ip to be removed
from the current node.

I was wondering: how can I define the colocation so that it is satisfied
only once all services in the clone have been started, and not once the
first service in the clone is starting?

Thanks,

Gerald
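For reference, assuming pcs, the setup described above corresponds
roughly to the commands below (the systemd unit names and IP addresses
are placeholders, and the mixed ip1/mail1-ip naming from the post is
normalized to mail1-ip/mail2-ip):

   # Mail services in a group, cloned so they run on every node
   pcs resource create mail-postfix systemd:postfix --group mail-services
   pcs resource create mail-smtpd systemd:opensmtpd --group mail-services
   pcs resource clone mail-services

   # Service IPs
   pcs resource create mail1-ip ocf:heartbeat:IPaddr2 ip=192.0.2.101 cidr_netmask=24
   pcs resource create mail2-ip ocf:heartbeat:IPaddr2 ip=192.0.2.102 cidr_netmask=24

   # Orders: start the clone before each IP
   pcs constraint order start mail-services-clone then start mail1-ip
   pcs constraint order start mail-services-clone then start mail2-ip

   # Colocations: keep the IPs apart, and only on nodes running the clone
   pcs constraint colocation add mail2-ip with mail1-ip -1000
   pcs constraint colocation add mail1-ip with mail-services-clone INFINITY
   pcs constraint colocation add mail2-ip with mail-services-clone INFINITY

   # Location preferences
   pcs constraint location mail1-ip prefers ha1=2000
   pcs constraint location mail2-ip prefers ha2=2000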