Greetings from a confused user,

We are running Pacemaker as part of a load-balanced cluster of two members,
both VMware VMs, with both acting as stepping-stones to our DNS recursive
resolvers (RR).  The use is simple: /etc/resolv.conf on the *NIX boxes points
at both cluster IPs, and the cluster forwards to one of multiple RRs for DNS
resolution.

Today, for an as-yet undetermined reason, one of the two members started 
failing to connect to the RRs. Intermittently. And quite annoyingly, as this 
has affected data center operations.  No matter what we've tried, one member 
fails intermittently, the other is fine.  
And we've tried:
 - rebooting the affected member; it came back up clean and fine, but the
   issue remained.
 - failing the cluster, moving both IPs to the second member server; failover
   was successful, but the problem remained.
   -- this moved the entire cluster to a different VM on a different VMware
      host server, so different NIC, etc.
 - failing the cluster back to the original server; both IPs appeared on the
   'suspect' VM, and the problem remained.
 - restoring the cluster; both IPs are on the proper VMs, but the one still
   fails intermittently while the second just chugs along.
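
To put numbers on "intermittently", something like the following could be run
from one of the client boxes against each cluster IP. This is only a rough
sketch using the Python standard library: the function names, the 2-second
timeout, and the attempt count are my own assumptions, and it sends a plain
UDP A-record query, which may not match exactly what the real clients send.

```python
import socket
import struct
import time


def build_query(name, query_id=0x1234):
    """Build a minimal DNS query packet for an A record (RFC 1035 layout)."""
    # Header: ID, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: length-prefixed labels, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A), QCLASS=1 (IN).
    return header + qname + struct.pack("!HH", 1, 1)


def probe(server, name, attempts=20, timeout=2.0, port=53):
    """Send UDP queries to one server; return (failures, latencies)."""
    failures = 0
    latencies = []
    for i in range(attempts):
        query_id = i % 65536
        packet = build_query(name, query_id)
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.settimeout(timeout)
            start = time.monotonic()
            try:
                sock.sendto(packet, (server, port))
                reply, _ = sock.recvfrom(512)
            except OSError:
                # Timeouts and ICMP errors both count as a failed attempt.
                failures += 1
                continue
        # Accept the reply only if its ID matches the query we sent.
        if len(reply) >= 2 and struct.unpack("!H", reply[:2])[0] == query_id:
            latencies.append(time.monotonic() - start)
        else:
            failures += 1
    return failures, latencies
```

Calling probe() once per cluster IP, with the same query name, and comparing
the failure counts and latencies side by side should show whether the failures
follow the IP address or the VM hosting it.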

Any ideas what could be causing this?  Is this something that could be caused 
by the cluster config?  Anybody ever seen anything similar?

Our current unsustainable workaround is to remove the IP for the affected
member from /etc/resolv.conf on the *NIX boxes.
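
For reference, the workaround amounts to something like the sketch below. It
is illustrative only: the function name is made up, the addresses in the usage
note are documentation placeholders rather than our real IPs, and it comments
the entry out instead of deleting it so the change is easy to reverse.

```python
def drop_nameserver(resolv_text, bad_ip):
    """Return resolv.conf contents with the `nameserver bad_ip` line commented out."""
    out = []
    for line in resolv_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver" and fields[1] == bad_ip:
            # Keep the original entry visible so it can be restored later.
            out.append("# disabled (intermittent failures): " + line.strip())
        else:
            out.append(line)
    return "\n".join(out) + "\n"
```

Given a file containing "nameserver 192.0.2.10" and "nameserver 192.0.2.11",
drop_nameserver(text, "192.0.2.10") leaves only the second entry active.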

I appreciate any reasonable suggestions.  (I am not the creator of the cluster,
just the guy trying to figure it out. Unfortunately the creator and my mentor is
dearly departed and, in times like this, sorely missed.)

Any replies will be read and responded to early tomorrow AM.  Thanks for
understanding.
--
Jeff Westgate
_______________________________________________
Users mailing list: Users@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
