Re: [ClusterLabs] cloned ethmonitor - upon failure of all nodes

2019-08-15 Thread solarmon
Thanks for the tips on the constraint rules.

I currently have two separate constraints, one for each ethmonitor - I did
not realise that a single constraint rule could combine multiple
ethmonitors.
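
For reference, the two ethmonitor clones in a setup like this would have
been created along roughly these lines (the `net4`/`net5` interface names
come from the thread; the resource names and everything else are
illustrative pcs syntax, with agent defaults left in place):

```shell
# Monitor each external interface on every node. The cloned agent
# publishes a node attribute (ethmonitor-net4 / ethmonitor-net5)
# set to 1 when the link is usable and 0 when it is not.
pcs resource create mon-net4 ocf:heartbeat:ethmonitor interface=net4 clone
pcs resource create mon-net5 ocf:heartbeat:ethmonitor interface=net5 clone
```

Those `ethmonitor-*` node attributes are what the location-constraint rules
below test against.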

Based on your suggestion, I have now used one constraint, as such:

score=-INFINITY ethmonitor-net4 eq 0 and ethmonitor-net5 eq 0

and I have tested it (simulating a network interface failure by
disconnecting the interface in vmware) and it works as expected - when, for
example, the net4 interface is down on both nodes, the cluster does not try
to move the virtual ip resource group or take it down.
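
Expressed as a pcs command, the combined rule looks roughly like this (the
group name `virtual-ip-group` is a placeholder; the attribute names match
the ethmonitor resources from the thread):

```shell
# Ban the virtual IP group only from nodes where BOTH monitored
# interfaces are down; one working interface is enough to stay.
pcs constraint location virtual-ip-group rule score=-INFINITY \
    ethmonitor-net4 eq 0 and ethmonitor-net5 eq 0
```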

I'll now need to apply this same logic to some ping monitor resources.
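
A sketch of the analogous ping setup (the target address, attribute name
and resource names are placeholders, not from the thread): the cloned
ocf:pacemaker:ping agent stores the count of reachable hosts in a node
attribute, which can be tested in a rule the same way:

```shell
# Ping a gateway from every node; the number of reachable hosts is
# recorded in the node attribute named by 'name' (here pingd-gw).
pcs resource create ping-gw ocf:pacemaker:ping host_list=192.0.2.1 \
    name=pingd-gw clone

# Keep the IPs off a node only when its connectivity check fails,
# i.e. when the attribute is missing or zero.
pcs constraint location virtual-ip-group rule score=-INFINITY \
    not_defined pingd-gw or pingd-gw lte 0
```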

Thank you for your help!

On Thu, 15 Aug 2019 at 15:13, Ken Gaillot  wrote:

> On Thu, 2019-08-15 at 10:59 +0100, solarmon wrote:
> > Hi,
> >
> > I have a two node cluster setup where each node is multi-homed over
> > two separate external interfaces - net4 and net5 - that can have
> > traffic load balanced between them.
> >
> > I have created multiple virtual ip resources (grouped together) that
> > should be active on only one of the two nodes.
> >
> > I have created ethmonitor resources for net4 and net5 and have
> > created a constraint against the virtual ip resources group.
> >
> > When one of the net4/net5 interfaces is taken on the active node
> > (where the virtual IPs are), the virtual ip resource group switches
> > to the other node. This is working as expected.
> >
> > However, when either of the net4/net5 interfaces is down on BOTH
> > nodes - for example, if net4 is down on BOTH nodes - the cluster
> > seems to get itself into a flapping state where the virtual IP
> > resources keep becoming available then unavailable. Or the virtual
> > IP resources group isn't running on any node.
> >
> > Since the net4 and net5 interfaces can have traffic load-balanced
> > across them, it is acceptable for the virtual IP resources to be
> > running on either node, even if the same interface (for example,
> > net4) is down on both nodes, since the other interface (for example,
> > net5) is still available on both nodes.
> >
> > What is the recommended way to configure the ethmonitor and
> > constraint resources for this type of multi-homed setup?
>
> It's probably the constraint. When monitoring a single interface, the
> location constraint should have a rule giving a score of -INFINITY
> when the special node attribute's value is 0.
>
> However, in your case, your goal is more complicated, so the rule has to
> be as well. I'd set a -INFINITY score when *both* attributes are 0
> (e.g. ethmonitor-net4 eq 0 and ethmonitor-net5 eq 0). That will keep
> the IPs on a node where at least one interface is working.
>
> If you additionally want to prefer a node with both interfaces working,
> I'd add 2 more rules giving a slightly negative preference to a node
> where a single attribute is 0 (one rule for each attribute).
> --
> Ken Gaillot 
>
>
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/

Re: [ClusterLabs] cloned ethmonitor - upon failure of all nodes

2019-08-15 Thread solarmon
I have tried both ifdown and using vmware to disconnect the interface from
the VM.

It is likely a constraint issue, as I don't have much experience with
corosync/pacemaker.

On Thu, 15 Aug 2019 at 14:01, Jan Pokorný  wrote:

> On 15/08/19 10:59 +0100, solarmon wrote:
> > I have a two node cluster setup where each node is multi-homed over two
> > separate external interfaces - net4 and net5 - that can have traffic load
> > balanced between them.
> >
> > I have created multiple virtual ip resources (grouped together) that
> > should be active on only one of the two nodes.
> >
> > I have created ethmonitor resources for net4 and net5 and have created a
> > constraint against the virtual ip resources group.
> >
> > When one of the net4/net5 interfaces is taken
>
> clarification request: taken _down_ ?
>
> (and if so, note that there's a common misconception that ifdown
> equals pulling the cable out or cutting it apart physically - and I am
> not sure whether ethmonitor is as sensitive to this difference as
> corosync was for years, even though people often did not get it right)
>
> > on the active node (where the virtual IPs are), the virtual ip
> > resource group switches to the other node.  This is working as
> > expected.
> >
> > However, when either of the net4/net5 interfaces is down on BOTH nodes -
> > for example, if net4 is down on BOTH nodes - the cluster seems to get
> > itself into a flapping state where the virtual IP resources keep
> > becoming available then unavailable. Or the virtual IP resources group
> > isn't running on any node.
> >
> > Since the net4 and net5 interfaces can have traffic load-balanced
> > across them, it is acceptable for the virtual IP resources to be
> > running on either node, even if the same interface (for example, net4)
> > is down on both nodes, since the other interface (for example, net5)
> > is still available on both nodes.
> >
> > What is the recommended way to configure the ethmonitor and constraint
> > resources for this type of multi-homed setup?
>
> --
> Jan (Poki)
>

Re: [ClusterLabs] cloned ethmonitor - upon failure of all nodes

2019-08-15 Thread Ken Gaillot
On Thu, 2019-08-15 at 10:59 +0100, solarmon wrote:
> Hi,
> 
> I have a two node cluster setup where each node is multi-homed over
> two separate external interfaces - net4 and net5 - that can have
> traffic load balanced between them.
> 
> I have created multiple virtual ip resources (grouped together) that
> should be active on only one of the two nodes.
> 
> I have created ethmonitor resources for net4 and net5 and have
> created a constraint against the virtual ip resources group.
> 
> When one of the net4/net5 interfaces is taken on the active node
> (where the virtual IPs are), the virtual ip resource group switches
> to the other node. This is working as expected.
> 
> However, when either of the net4/net5 interfaces is down on BOTH
> nodes - for example, if net4 is down on BOTH nodes - the cluster
> seems to get itself into a flapping state where the virtual IP
> resources keep becoming available then unavailable. Or the virtual
> IP resources group isn't running on any node.
> 
> Since the net4 and net5 interfaces can have traffic load-balanced
> across them, it is acceptable for the virtual IP resources to be
> running on either node, even if the same interface (for example,
> net4) is down on both nodes, since the other interface (for example,
> net5) is still available on both nodes.
> 
> What is the recommended way to configure the ethmonitor and
> constraint resources for this type of multi-homed setup?

It's probably the constraint. When monitoring a single interface, the
location constraint should have a rule giving a score of -INFINITY when
the special node attribute's value is 0.

However, in your case, your goal is more complicated, so the rule has to
be as well. I'd set a -INFINITY score when *both* attributes are 0
(e.g. ethmonitor-net4 eq 0 and ethmonitor-net5 eq 0). That will keep
the IPs on a node where at least one interface is working.

If you additionally want to prefer a node with both interfaces working,
I'd add 2 more rules giving a slightly negative preference to a node
where a single attribute is 0 (one rule for each attribute).
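
That combination can be sketched in pcs syntax as follows (the group name
`virtual-ip-group` is assumed, and the -500 scores are arbitrary "slightly
negative" values, not a recommendation):

```shell
# Hard ban: never run the IPs on a node with BOTH interfaces down.
pcs constraint location virtual-ip-group rule score=-INFINITY \
    ethmonitor-net4 eq 0 and ethmonitor-net5 eq 0

# Soft preference: avoid a node with one dead interface when a fully
# healthy node is available.
pcs constraint location virtual-ip-group rule score=-500 \
    ethmonitor-net4 eq 0
pcs constraint location virtual-ip-group rule score=-500 \
    ethmonitor-net5 eq 0
```

With only the soft rules triggered on both nodes, the scores cancel out and
the group stays put rather than flapping.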
-- 
Ken Gaillot 



Re: [ClusterLabs] cloned ethmonitor - upon failure of all nodes

2019-08-15 Thread Jan Pokorný
On 15/08/19 10:59 +0100, solarmon wrote:
> I have a two node cluster setup where each node is multi-homed over two
> separate external interfaces - net4 and net5 - that can have traffic load
> balanced between them.
> 
> I have created multiple virtual ip resources (grouped together) that
> should be active on only one of the two nodes.
> 
> I have created ethmonitor resources for net4 and net5 and have created a
> constraint against the virtual ip resources group.
> 
> When one of the net4/net5 interfaces is taken

clarification request: taken _down_ ?

(and if so, note that there's a common misconception that ifdown
equals pulling the cable out or cutting it apart physically - and I am
not sure whether ethmonitor is as sensitive to this difference as
corosync was for years, even though people often did not get it right)

> on the active node (where the virtual IPs are), the virtual ip
> resource group switches to the other node.  This is working as
> expected.
> 
> However, when either of the net4/net5 interfaces is down on BOTH nodes -
> for example, if net4 is down on BOTH nodes - the cluster seems to get
> itself into a flapping state where the virtual IP resources keep
> becoming available then unavailable. Or the virtual IP resources group
> isn't running on any node.
> 
> Since the net4 and net5 interfaces can have traffic load-balanced across
> them, it is acceptable for the virtual IP resources to be running on
> either node, even if the same interface (for example, net4) is down on
> both nodes, since the other interface (for example, net5) is still
> available on both nodes.
> 
> What is the recommended way to configure the ethmonitor and constraint
> resources for this type of multi-homed setup?

-- 
Jan (Poki)

