On Thu, Feb 16, 2012 at 11:14:37PM -0500, William Seligman wrote:
> On 2/16/12 8:13 PM, Andrew Beekhof wrote:
> >On Fri, Feb 17, 2012 at 5:05 AM, Dejan Muhamedagic <[email protected]> wrote:
> >>Hi,
> >>
> >>On Wed, Feb 15, 2012 at 04:24:15PM -0500, William Seligman wrote:
> >>>On 2/10/12 4:53 PM, William Seligman wrote:
> >>>>I'm trying to set up an Active/Active cluster (yes, I hear the sounds of 
> >>>>kittens
> >>>>dying). Versions:
> >>>>
> >>>>Scientific Linux 6.2
> >>>>pacemaker-1.1.6
> >>>>resource-agents-3.9.2
> >>>>
> >>>>I'm using cloned IPaddr2 resources:
> >>>>
> >>>>primitive ClusterIP ocf:heartbeat:IPaddr2 \
> >>>>         params ip="129.236.252.13" cidr_netmask="32" \
> >>>>         op monitor interval="30s"
> >>>>primitive ClusterIPLocal ocf:heartbeat:IPaddr2 \
> >>>>         params ip="10.44.7.13" cidr_netmask="32" \
> >>>>         op monitor interval="31s"
> >>>>primitive ClusterIPSandbox ocf:heartbeat:IPaddr2 \
> >>>>         params ip="10.43.7.13" cidr_netmask="32" \
> >>>>         op monitor interval="32s"
> >>>>group ClusterIPGroup ClusterIP ClusterIPLocal ClusterIPSandbox
> >>>>clone ClusterIPClone ClusterIPGroup
> >>>>
> >>>>When both nodes of my two-node cluster are running, everything looks and
> >>>>functions OK. From "service iptables status" on node 1 (hypatia-tb):
> >>>>
> >>>>5    CLUSTERIP  all  --  0.0.0.0/0            10.43.7.13           CLUSTERIP hashmode=sourceip-sourceport clustermac=F1:87:E1:64:60:A5 total_nodes=2 local_node=1 hash_init=0
> >>>>6    CLUSTERIP  all  --  0.0.0.0/0            10.44.7.13           CLUSTERIP hashmode=sourceip-sourceport clustermac=11:8F:23:B9:CA:09 total_nodes=2 local_node=1 hash_init=0
> >>>>7    CLUSTERIP  all  --  0.0.0.0/0            129.236.252.13       CLUSTERIP hashmode=sourceip-sourceport clustermac=B1:95:5A:B5:16:79 total_nodes=2 local_node=1 hash_init=0
> >>>>
> >>>>On node 2 (orestes-tb):
> >>>>
> >>>>5    CLUSTERIP  all  --  0.0.0.0/0            10.43.7.13           CLUSTERIP hashmode=sourceip-sourceport clustermac=F1:87:E1:64:60:A5 total_nodes=2 local_node=2 hash_init=0
> >>>>6    CLUSTERIP  all  --  0.0.0.0/0            10.44.7.13           CLUSTERIP hashmode=sourceip-sourceport clustermac=11:8F:23:B9:CA:09 total_nodes=2 local_node=2 hash_init=0
> >>>>7    CLUSTERIP  all  --  0.0.0.0/0            129.236.252.13       CLUSTERIP hashmode=sourceip-sourceport clustermac=B1:95:5A:B5:16:79 total_nodes=2 local_node=2 hash_init=0
> >>>>
> >>>>If I do a simple test of ssh'ing into 129.236.252.13, I see that I
> >>>>alternately log into hypatia-tb and orestes-tb. All is good.
> >>>>
> >>>>Now take orestes-tb offline. The iptables rules on hypatia-tb are 
> >>>>unchanged:
> >>>>
> >>>>5    CLUSTERIP  all  --  0.0.0.0/0            10.43.7.13           CLUSTERIP hashmode=sourceip-sourceport clustermac=F1:87:E1:64:60:A5 total_nodes=2 local_node=1 hash_init=0
> >>>>6    CLUSTERIP  all  --  0.0.0.0/0            10.44.7.13           CLUSTERIP hashmode=sourceip-sourceport clustermac=11:8F:23:B9:CA:09 total_nodes=2 local_node=1 hash_init=0
> >>>>7    CLUSTERIP  all  --  0.0.0.0/0            129.236.252.13       CLUSTERIP hashmode=sourceip-sourceport clustermac=B1:95:5A:B5:16:79 total_nodes=2 local_node=1 hash_init=0
> >>>>
> >>>>If I attempt to ssh to 129.236.252.13, whether or not I get in seems to be
> >>>>machine-dependent. On one machine I get in, from another I get a 
> >>>>time-out. Both
> >>>>machines show the same MAC address for 129.236.252.13:
> >>>>
> >>>>arp 129.236.252.13
> >>>>Address                  HWtype  HWaddress           Flags Mask  Iface
> >>>>hamilton-tb.nevis.colum  ether   B1:95:5A:B5:16:79   C           eth0
> >>>>
> >>>>Is this the way the cloned IPaddr2 resource is supposed to behave in the 
> >>>>event
> >>>>of a node failure, or have I set things up incorrectly?
> >>>
> >>>I spent some time looking over the IPaddr2 script. As far as I can
> >>>tell, the script has no mechanism for reconfiguring iptables when
> >>>the number of active clones changes.
> >>>
> >>>I might be stupid -- er -- dedicated enough to make this change on my own, 
> >>>then
> >>>share the code with the appropriate group. The change seems to be 
> >>>relatively
> >>>simple. It would be in the monitor operation. In pseudo-code:
> >>>
> >>>if <IPaddr2 resource is already started>; then
> >>>  if [ "$OCF_RESKEY_CRM_meta_clone_max" != <value last time> ] ||
> >>>     [ "$OCF_RESKEY_CRM_meta_clone"     != <value last time> ]; then
> >>>    ip_stop
> >>>    ip_start
> >>
> >>Just changing the iptables entries should suffice, right?
> >>Besides, doing stop/start in the monitor is sort of unexpected.
> >>Another option is to add the missing node to one of the nodes
> >>which are still running (echo "+<n>" >> /proc/net/ipt_CLUSTERIP/<ip>).
> >>But any of that would be extremely tricky to implement properly
> >>(if not impossible).
> >>
> >>>   fi
> >>>fi
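A rough, untested sketch of the /proc-based takeover idea discussed above; the function name, the dry-run fallback, and the example node number are mine, purely for illustration:

```shell
# Hypothetical helper: when a peer node fails, a surviving node takes
# over the departed node's CLUSTERIP hash bucket by writing "+<n>" to
# the per-address proc file that the ipt_CLUSTERIP module exposes.
claim_departed_node() {
    ip="$1"        # clustered IP address, e.g. 129.236.252.13
    departed="$2"  # local_node number of the failed peer, e.g. 2
    proc_file="/proc/net/ipt_CLUSTERIP/$ip"
    if [ -w "$proc_file" ]; then
        # real action: tell the kernel this node now handles that bucket
        echo "+$departed" > "$proc_file"
    else
        # dry run when no CLUSTERIP target exists (e.g. outside the cluster)
        echo "would run: echo +$departed > $proc_file"
    fi
}

claim_departed_node 129.236.252.13 2
```

The writing is the easy part; the hard part, as noted above, is deciding reliably *when* to run it, since membership changes are not visible to the resource instances.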
> >>>
> >>>If this would work, then I'd have two questions for the experts:
> >>>
> >>>- Would the values of OCF_RESKEY_CRM_meta_clone_max and/or
> >>>OCF_RESKEY_CRM_meta_clone change if the number of cloned copies of a 
> >>>resource
> >>>changed?
> >>
> >>OCF_RESKEY_CRM_meta_clone_max definitely not.
> >>OCF_RESKEY_CRM_meta_clone may change but also probably not; it's
> >>just a clone sequence number. In short, there's no way to figure
> >>out the total number of clones by examining the environment.
> >>Information such as membership changes doesn't trickle down to
> >>the resource instances.
> >
> >What about notifications?  That would be the right point to
> >re-configure things, I'd have thought.
> 
> I ran a simple test: I added "notify" to the IPaddr2 actions, and
> logged the values of every one of the variables in "Pacemaker
> Explained" that related to clones. I brought the IPaddr2 up and down
> a few times on both my machines. No values changed at all, and no
> "notify" actions were logged, though the appropriate "stop",
> "start", and "monitor" actions were. It looks like a cloned IPaddr2
> resource doesn't get a notify signal.
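One thing worth checking: pacemaker only sends notify actions if the clone requests them via its meta attributes, and the agent must also advertise a notify action in its meta-data. Something like this in the configuration (untested here) would at least request them:

```
clone ClusterIPClone ClusterIPGroup \
        meta notify="true"
```

Since IPaddr2 itself doesn't declare or implement a notify operation, the agent would need to be extended as well before this does anything useful.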
> 
> At this point, it looks like my notion of re-writing IPaddr2 won't
> work. I'm redesigning my cluster configuration so I don't require
> cloned/highly-available IP addresses.
> 
> Is this a bug?

Looks like a deficiency. I'm not sure how to deal with it though.

> Is there a bugzilla or similar resource for resource agents?

https://developerbugs.linuxfoundation.org/enter_bug.cgi?product=Linux-HA

Then choose Resource agent. Or create an issue at
https://github.com/ClusterLabs/resource-agents

Cheers,

Dejan

> -- 
> Bill Seligman             | mailto://[email protected]
> Nevis Labs, Columbia Univ | http://www.nevis.columbia.edu/~seligman/
> PO Box 137                |
> Irvington NY 10533  USA   | Phone: (914) 591-2823
> 



> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
