Chrissie: Perhaps you have some insight here?

On 23/08/2013, at 8:42 PM, Jakob Curdes <j...@info-systems.de> wrote:

> Hmmm, the problem turns out to be DNS-related. At startup, some of the virtual
> interfaces are inactive and the DNS servers are unreachable. CMAN seems
> to do a lookup for every IP address on the machine; I have the names of all
> cluster members in the hosts file, but not names for all the other addresses
> (i.e. the ones managed by the cluster). Anyway, I wonder why even with -d64
> it doesn't tell me anything about what it is doing. I think the timespan of
> an hour is just because we have lots of VLAN interfaces that it wants to get a
> DNS name for....
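
If the per-address lookups described above are indeed the cause, one way to narrow it down is to list which local addresses have no entry in the hosts file. The following is only a minimal sketch in Python: the helper names `parse_hosts` and `missing_from_hosts` and the sample addresses are hypothetical, and it assumes that an address with a hosts-file entry resolves without contacting DNS, which depends on the resolver configuration of the machine.

```python
import ipaddress


def parse_hosts(text):
    """Return the set of IP addresses that have at least one name in hosts-file text."""
    known = set()
    for line in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            # An address without any name does not help resolution.
            continue
        try:
            known.add(ipaddress.ip_address(fields[0]))
        except ValueError:
            # Not a valid IP address in the first column; skip the line.
            continue
    return known


def missing_from_hosts(addresses, hosts_text):
    """Return the addresses (as given) that have no entry in the hosts-file text."""
    known = parse_hosts(hosts_text)
    return [a for a in addresses if ipaddress.ip_address(a) not in known]


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the real machines.
    sample_hosts = """
    127.0.0.1    localhost
    10.0.0.1     gw1
    10.0.0.2     gw2   # second cluster member
    """
    local_addresses = ["10.0.0.1", "192.168.10.1"]
    print(missing_from_hosts(local_addresses, sample_hosts))  # ['192.168.10.1']
```

On a real node, the address list would come from the configured interfaces and the text from /etc/hosts; every address reported as missing is a candidate for a slow lookup while the DNS servers are unreachable.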
>> 
>> Hi,
>> 
>> we have a simple 2-node cluster running CMAN and Pacemaker under CentOS 6.
>> The problem is that upon startup the machines (even if "alone", i.e. the second
>> machine is off) report a cman timeout saying "Timed-out
>> waiting for cluster".
>> *If I start the services manually an hour or so later, everything works
>> fine* until I reboot one machine; that machine will typically time out again
>> on startup.
>> The issue seems to be related neither to firewalling (I retried the failed start
>> with open firewalls, no change) nor to multicast communication, as the
>> cluster is set up to use unicast via a bonded interface that has no other
>> purpose.
>> For the machine that times out, I do not see ANY communication attempts with
>> the other node (running tcpdump on both machines), so I suppose it is waiting
>> for something local to happen.
>> Also, I do not see ANY log entries related to the failed start; the logfiles
>> only get updated once the cluster has started successfully.
>> What might be the cause here? Could it be a fencing problem (we have IPMI
>> fencing configured, which works)? Here is the cluster.conf:
>> 
>> <cluster config_version="13" name="fw-cluster">
>>  <logging logfile="/var/log/cluster/cluster.log" logfile_priority="info"
>>           syslog_priority="crit" to_logfile="yes" to_syslog="yes"/>
>>  <dlm protocol="sctp"/>
>>  <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"
>>                skip_undefined="1"/>
>>  <clusternodes>
>>    <clusternode name="gw1" nodeid="1">
>>      <fence>
>>        <method name="pcmk-redirect">
>>          <device name="pcmk" port="gw1"/>
>>        </method>
>>      </fence>
>>    </clusternode>
>>    <clusternode name="gw2" nodeid="2">
>>      <fence>
>>        <method name="pcmk-redirect">
>>          <device name="pcmk" port="gw2"/>
>>        </method>
>>      </fence>
>>    </clusternode>
>>  </clusternodes>
>>  <cman expected_votes="1" transport="udpu" two_node="1"/>
>>  <fencedevices>
>>    <fencedevice agent="fence_pcmk" name="pcmk"/>
>>  </fencedevices>
>>  <rm>
>>    <failoverdomains/>
>>    <resources/>
>>  </rm>
>> </cluster>
>> 
> 
> Regards,
> Jakob Curdes
> 
> _______________________________________________
> Linux-HA mailing list
> Linux-HA@lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems

