Chrissie: Perhaps you have some insight here?

On 23/08/2013, at 8:42 PM, Jakob Curdes <j...@info-systems.de> wrote:

> Hmmm, the problem turns out to be DNS-related. At startup, some of the virtual
> interfaces are inactive and the DNS servers are unreachable. And CMAN seems
> to do a lookup for all IP addresses on the machine; I have the names of all
> cluster members in the hosts file but not the names of all the other addresses
> (i.e. the ones managed by the cluster). Anyway, I wonder why even with -d64
> it doesn't tell me anything about what it is doing. I think the timespan of
> an hour is just because we have lots of VLAN interfaces that it wants to get a
> DNS name for....
>>
>> Hi,
>>
>> we have a simple 2-node cluster running CMAN and pacemaker under CentOS 6.
>> The problem is that upon startup the machines (even if "alone", i.e. second
>> machine is off) will give a cman timeout saying "Timed-out waiting for
>> cluster".
>> *If I start the services manually an hour later or so, everything works
>> fine* until I reboot one machine; that machine will typically time out again
>> on startup.
>> The issue seems not to be related to firewalling (I retried the failed start
>> with open firewalls, no change) nor to multicast communication, as the
>> cluster is set up to use unicast via a bonded interface that has no other
>> purpose.
>> For the machine that times out, I do not see ANY communication attempts with
>> the other node (running tcpdump on both machines), so I suppose it is waiting
>> for something local to happen.
>> Also, I do not see ANY log entries related to the failed start; the logfiles
>> only get updated once the cluster has started successfully.
>> What might be the cause here? Could that be a fencing problem (we have IPMI
>> fencing configured, which works)?
>>
>> Here is the cluster.conf:
>>
>> <cluster config_version="13" name="fw-cluster">
>>   <logging logfile="/var/log/cluster/cluster.log" logfile_priority="info"
>>            syslog_priority="crit" to_logfile="yes" to_syslog="yes"/>
>>   <dlm protocol="sctp"/>
>>   <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"
>>                 skip_undefined="1"/>
>>   <clusternodes>
>>     <clusternode name="gw1" nodeid="1">
>>       <fence>
>>         <method name="pcmk-redirect">
>>           <device name="pcmk" port="gw1"/>
>>         </method>
>>       </fence>
>>     </clusternode>
>>     <clusternode name="gw2" nodeid="2">
>>       <fence>
>>         <method name="pcmk-redirect">
>>           <device name="pcmk" port="gw2"/>
>>         </method>
>>       </fence>
>>     </clusternode>
>>   </clusternodes>
>>   <cman expected_votes="1" transport="udpu" two_node="1"/>
>>   <fencedevices>
>>     <fencedevice agent="fence_pcmk" name="pcmk"/>
>>   </fencedevices>
>>   <rm>
>>     <failoverdomains/>
>>     <resources/>
>>   </rm>
>> </cluster>
>>
>
> Regards,
> Jakob Curdes
>
> _______________________________________________
> Linux-HA mailing list
> Linux-HA@lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
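
If the diagnosis quoted above is right (startup stalls on name lookups for local addresses while DNS is unreachable), one way to narrow it down is to list which local addresses have no entry in the hosts file. The following is only a rough sketch of that check, not anything CMAN itself provides: the function names, the sample hosts content and the sample addresses are all made up for illustration, and the address list is assumed to be collected separately (for example from the interface configuration).

```python
import ipaddress


def parse_hosts(hosts_text):
    """Return the set of addresses that have at least one name in hosts-file text."""
    known = set()
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # an address without a name resolves nothing
        try:
            known.add(ipaddress.ip_address(fields[0]))
        except ValueError:
            continue  # first field is not a valid address; skip the line
    return known


def missing_hosts_entries(hosts_text, local_addresses):
    """Return the addresses (in input order) that have no hosts-file entry.

    local_addresses is assumed to hold valid IPv4/IPv6 address strings.
    """
    known = parse_hosts(hosts_text)
    return [a for a in local_addresses if ipaddress.ip_address(a) not in known]


if __name__ == "__main__":
    # Hypothetical hosts file: only the two cluster members are listed.
    sample_hosts = """\
127.0.0.1   localhost
10.0.0.1    gw1
10.0.0.2    gw2
"""
    # Hypothetical local addresses: the node addresses plus one
    # cluster-managed VLAN address that has no name in the hosts file.
    addresses = ["10.0.0.1", "10.0.0.2", "192.168.50.1"]
    print(missing_hosts_entries(sample_hosts, addresses))
```

Any address this reports would be a candidate for an extra hosts entry, so that a lookup for it does not have to wait for an unreachable DNS server.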