Re: [ClusterLabs] IPaddr2 works for 12 seconds then stops
On Tue, 2018-11-13 at 18:41 +0100, Valentin Vidic wrote:
> On Tue, Nov 13, 2018 at 11:01:46AM -0600, Ken Gaillot wrote:
> > Clone instances have a default stickiness of 1 (instead of the
> > usual 0) so that they aren't needlessly shuffled around nodes
> > every transition. You can temporarily set an explicit stickiness
> > of 0 to let them rebalance, then unset it to go back to the
> > default.
>
> Thanks, this works as expected now:
>
>   clone cip-clone cip \
>     meta clone-max=2 clone-node-max=2 globally-unique=true interleave=true \
>       resource-stickiness=0 target-role=Started
>
> Clone instance moves when a node is down but also returns when the
> node is back online.
>
> Do you perhaps know if CLUSTERIP has any special network requirements
> to work properly?

Yes, the switch must support multicast MAC (which is different from
multicast IP). Sometimes this is an option that must be turned on.

--
Ken Gaillot

___
Users mailing list: Users@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
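To illustrate the distinction between multicast MAC and multicast IP: CLUSTERIP pairs an ordinary unicast virtual IP with a link-layer multicast MAC, so the switch has to deliver frames for that MAC to every node. A multicast MAC is any MAC whose first octet has its lowest bit set, while a multicast IPv4 address is one in 224.0.0.0/4. The following is a minimal sketch of both checks; the function names and the sample addresses are made up for illustration and do not come from the thread.

```python
import ipaddress


def is_multicast_mac(mac: str) -> bool:
    """Return True if the group bit (lowest bit of the first octet) is set."""
    octets = mac.replace("-", ":").split(":")
    if len(octets) != 6:
        raise ValueError(f"expected 6 octets in MAC address, got {mac!r}")
    return bool(int(octets[0], 16) & 0x01)


def is_multicast_ipv4(addr: str) -> bool:
    """Return True if the address lies in the IPv4 multicast range 224.0.0.0/4."""
    return ipaddress.IPv4Address(addr).is_multicast


if __name__ == "__main__":
    # Hypothetical values: a unicast VIP combined with a multicast cluster MAC.
    vip = "192.168.1.10"
    cluster_mac = "01:00:5e:12:34:56"
    print(vip, "multicast IP?", is_multicast_ipv4(vip))
    print(cluster_mac, "multicast MAC?", is_multicast_mac(cluster_mac))
```

Comparing the MAC shown in the ARP table of a client for the virtual IP against this check can help confirm whether the switch is being asked to forward a multicast MAC at all.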
Re: [ClusterLabs] IPaddr2 works for 12 seconds then stops
On Tue, Nov 13, 2018 at 11:01:46AM -0600, Ken Gaillot wrote:
> Clone instances have a default stickiness of 1 (instead of the usual 0)
> so that they aren't needlessly shuffled around nodes every transition.
> You can temporarily set an explicit stickiness of 0 to let them
> rebalance, then unset it to go back to the default.

Thanks, this works as expected now:

  clone cip-clone cip \
    meta clone-max=2 clone-node-max=2 globally-unique=true interleave=true \
      resource-stickiness=0 target-role=Started

Clone instance moves when a node is down but also returns when the node
is back online.

Do you perhaps know if CLUSTERIP has any special network requirements to
work properly?

--
Valentin
Re: [ClusterLabs] IPaddr2 works for 12 seconds then stops
On Tue, 2018-11-13 at 17:27 +0100, Valentin Vidic wrote:
> On Tue, Nov 13, 2018 at 05:04:19PM +0100, Valentin Vidic wrote:
> > Also it seems to require multicast, so better check for that too :)
>
> And while the CLUSTERIP resource seems to work for me in a test
> cluster, the following clone definition:
>
>   clone cip-clone cip \
>     meta clone-max=2 clone-node-max=2 globally-unique=true interleave=true target-role=Started
>
> allows for both clone instances to end up on the same node:
>
>   Clone Set: cip-clone [cip] (unique)
>     cip:0 (ocf::heartbeat:IPaddr2): Started sid2
>     cip:1 (ocf::heartbeat:IPaddr2): Started sid2
>
> Is there a way to spread the resources other than setting
> clone-node-max=1 for a while?

Clone instances have a default stickiness of 1 (instead of the usual 0)
so that they aren't needlessly shuffled around nodes every transition.
You can temporarily set an explicit stickiness of 0 to let them
rebalance, then unset it to go back to the default.

--
Ken Gaillot
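The "set stickiness to 0, then unset it" procedure could be scripted roughly as follows. This is only a sketch: it assumes Pacemaker's crm_resource tool is available on the node and that the meta attribute is managed on the clone itself (cip-clone, the name used in this thread). The helper names are invented for illustration, and nothing is executed unless apply_step() is called explicitly.

```python
import subprocess


def stickiness_commands(clone_id: str) -> dict:
    """Build the crm_resource command lines for the two steps.

    "rebalance" sets an explicit resource-stickiness of 0 on the clone;
    "restore" deletes that meta attribute so the default applies again.
    """
    base = ["crm_resource", "--resource", clone_id, "--meta"]
    return {
        "rebalance": base + ["--set-parameter", "resource-stickiness",
                             "--parameter-value", "0"],
        "restore": base + ["--delete-parameter", "resource-stickiness"],
    }


def apply_step(step: str, clone_id: str = "cip-clone") -> None:
    """Run one step on a cluster node (needs crm_resource and privileges)."""
    subprocess.run(stickiness_commands(clone_id)[step], check=True)


if __name__ == "__main__":
    # Print the commands instead of running them, so this is safe anywhere.
    for step, argv in stickiness_commands("cip-clone").items():
        print(f"{step}: {' '.join(argv)}")
```

Between the two steps one would wait until the cluster status shows the instances spread across the nodes before removing the explicit value again.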
Re: [ClusterLabs] IPaddr2 works for 12 seconds then stops
On Tue, Nov 13, 2018 at 05:04:19PM +0100, Valentin Vidic wrote:
> Also it seems to require multicast, so better check for that too :)

And while the CLUSTERIP resource seems to work for me in a test
cluster, the following clone definition:

  clone cip-clone cip \
    meta clone-max=2 clone-node-max=2 globally-unique=true interleave=true target-role=Started

allows for both clone instances to end up on the same node:

  Clone Set: cip-clone [cip] (unique)
    cip:0 (ocf::heartbeat:IPaddr2): Started sid2
    cip:1 (ocf::heartbeat:IPaddr2): Started sid2

Is there a way to spread the resources other than setting
clone-node-max=1 for a while?

--
Valentin
Re: [ClusterLabs] IPaddr2 works for 12 seconds then stops
On Tue, Nov 13, 2018 at 04:06:34PM +0100, Valentin Vidic wrote:
> Could be some kind of ARP inspection going on in the networking equipment,
> so check switch logs if you have access to that.

Also it seems to require multicast, so better check for that too :)

--
Valentin
Re: [ClusterLabs] IPaddr2 works for 12 seconds then stops
On Tue, Nov 13, 2018 at 09:06:56AM -0500, Daniel Ragle wrote:
> Thanks, finally getting back to this. Putting a tshark on both nodes and
> then restarting the VIP-clone resource shows the pings coming through for 12
> seconds, always on node2, then stop. I.E., before/after those 12 seconds
> nothing on either node from the server initiating the pings.

Could be some kind of ARP inspection going on in the networking equipment,
so check switch logs if you have access to that.

--
Valentin
Re: [ClusterLabs] IPaddr2 works for 12 seconds then stops
On 10/11/2018 5:00 PM, Valentin Vidic wrote:
> On Thu, Oct 11, 2018 at 01:25:52PM -0400, Daniel Ragle wrote:
> > For the 12 second window it *does* work in, it appears as though it works
> > only on one of the two servers (and always the same one). My twelve seconds
> > of pings runs continuously then stops; while attempts to hit the Web server
> > work hit or miss depending on my source port (I'm using
> > sourceip-sourceport). I.E., as if anything that would be handled by the
> > other server isn't making it through. But after the 12 seconds neither
> > server responds to the requests against the VIP (but they do respond fine to
> > their own static IPs at all times).
>
> Could be that the switch in front of the servers does not like to see
> the same MAC on two ports or something like that.
>
> > During the 12 seconds that it works I get these in the logs of the server
> > that *is* responding:
> >
> > Oct 11 12:17:43 node2 kernel: ipt_CLUSTERIP: unknown protocol 1
> > Oct 11 12:17:44 node2 kernel: ipt_CLUSTERIP: unknown protocol 1
> > Oct 11 12:17:45 node2 kernel: ipt_CLUSTERIP: unknown protocol 1
>
> Protocol 1 once per second should be ICMP PING so this is just
> CLUSTERIP complaining that it can't calculate sourceip-sourceport
> for those packets (ICMP has no source port).
>
> So maybe try recording the traffic using tcpdump on both servers
> and see if any requests are coming in at all from the network
> equipment.

Thanks, finally getting back to this. Putting a tshark on both nodes and
then restarting the VIP-clone resource shows the pings coming through for 12
seconds, always on node2, then stop. I.E., before/after those 12 seconds
nothing on either node from the server initiating the pings.

Dan
Re: [ClusterLabs] IPaddr2 works for 12 seconds then stops
On Thu, Oct 11, 2018 at 01:25:52PM -0400, Daniel Ragle wrote:
> For the 12 second window it *does* work in, it appears as though it works
> only on one of the two servers (and always the same one). My twelve seconds
> of pings runs continuously then stops; while attempts to hit the Web server
> work hit or miss depending on my source port (I'm using
> sourceip-sourceport). I.E., as if anything that would be handled by the
> other server isn't making it through. But after the 12 seconds neither
> server responds to the requests against the VIP (but they do respond fine to
> their own static IPs at all times).

Could be that the switch in front of the servers does not like to see
the same MAC on two ports or something like that.

> During the 12 seconds that it works I get these in the logs of the server
> that *is* responding:
>
> Oct 11 12:17:43 node2 kernel: ipt_CLUSTERIP: unknown protocol 1
> Oct 11 12:17:44 node2 kernel: ipt_CLUSTERIP: unknown protocol 1
> Oct 11 12:17:45 node2 kernel: ipt_CLUSTERIP: unknown protocol 1

Protocol 1 once per second should be ICMP PING so this is just
CLUSTERIP complaining that it can't calculate sourceip-sourceport
for those packets (ICMP has no source port).

So maybe try recording the traffic using tcpdump on both servers
and see if any requests are coming in at all from the network
equipment.

--
Valentin
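As a rough illustration of why Web requests can be "hit or miss depending on source port" while pings from one client always land on the same node: in sourceip-sourceport mode each packet is hashed on source IP and source port, and the hash selects the node that answers. The sketch below is not the kernel's hash. It substitutes zlib.crc32 purely for demonstration, and it assumes that a packet without ports (such as ICMP) is hashed as if the port were 0. The function name and the client address are hypothetical.

```python
import ipaddress
import zlib
from typing import Optional


def responsible_node(src_ip: str, src_port: Optional[int], total_nodes: int) -> int:
    """Pick the 1-based node number that would answer a packet.

    Stand-in hash (CRC32) over source IP and source port. Assumption:
    protocols without a source port (e.g. ICMP) are treated as port 0.
    """
    port = 0 if src_port is None else src_port
    key = ipaddress.IPv4Address(src_ip).packed + port.to_bytes(2, "big")
    return zlib.crc32(key) % total_nodes + 1


if __name__ == "__main__":
    client = "10.0.0.5"  # hypothetical client address
    # TCP connections from different ephemeral ports spread over both nodes.
    for port in range(40000, 40008):
        print(f"tcp {client}:{port} -> node {responsible_node(client, port, 2)}")
    # ICMP has no port, so every ping from this client maps to one node.
    print(f"icmp {client} -> node {responsible_node(client, None, 2)}")
```

Under these assumptions, if only one node ever receives traffic, connections whose hash selects the other node go unanswered, which would match the hit-or-miss pattern described above.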