Which versions of pacemaker and the dlm are you running? What does the stack trace from the core look like?
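
Until a core is available, the kernel's segfault line quoted below already gives a rough hint. What follows is only a minimal sketch in Python, not part of the original report: it assumes the usual x86-64 format of that dmesg message, and the names decode_segfault and SEGFAULT_RE are made up for illustration. It subtracts the mapping base from the instruction pointer to get the offset inside libc, and decodes the page-fault error bits.

```python
import re

# Assumed format of the x86-64 kernel segfault message, as seen in the quoted dmesg output.
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>\d+) "
    r"in (?P<obj>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Return the interesting fields of a segfault line, or None if it does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    err = int(m["err"])
    return {
        "process": m["comm"],
        "pid": int(m["pid"]),
        "fault_address": int(m["addr"], 16),   # 0 means a NULL pointer was dereferenced
        "object": m["obj"],
        "offset_in_object": ip - base,         # offset of the faulting instruction in the library
        "page_present": bool(err & 1),         # x86 page-fault error code, bit 0
        "write_access": bool(err & 2),         # bit 1: write (otherwise read)
        "user_mode": bool(err & 4),            # bit 2: fault happened in user mode
    }


# The line from the report, joined back into one string.
line = ("[ 46.979087] dlm_controld.pc[2533]: segfault at 0 ip 00007f30f7d68022 sp "
        "00007fffddf0e288 error 4 in libc-2.11.1.so[7f30f7ce5000+178000]")
info = decode_segfault(line)
print(info["process"], hex(info["offset_in_object"]), info["fault_address"])
# prints: dlm_controld.pc 0x83022 0
```

So this looks like a user-mode read through a NULL pointer somewhere inside libc, at offset 0x83022 of the mapping. That offset can be looked up with a tool such as addr2line or gdb against a libc with debug symbols, but it only identifies the libc function; the backtrace from the core is still needed to see which dlm_controld call passed the NULL.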

On Sun, Apr 25, 2010 at 1:15 PM, Oliver Heinz <ohe...@fbihome.de> wrote:
> On Saturday, April 24, 2010, at 17:27:42, Pål Simensen wrote:
>> Can you check your dmesg to see if DLM is segfaulting? I might be
>> experiencing the same problem. If corosync is started at boot, DLM
>> segfaults, but if it's started manually everything is ok. Still trying to
>> find out more about what is going on, and I sadly can't provide more
>> information before Monday when I get to work. We even tried bootchart to
>> see if that could provide some more information, but sadly no. We also
>> changed the start order of corosync by renaming the init symlink to
>> S98corosync, but that didn't work out either.
>
> You are right, dlm is segfaulting, and the network is already up at that time.
>
> [ 15.654093] br53: port 1(vlan53) entering forwarding state
> [ 15.664083] br83: port 1(vlan83) entering forwarding state
> ...
> [ 46.979087] dlm_controld.pc[2533]: segfault at 0 ip 00007f30f7d68022 sp 00007fffddf0e288 error 4 in libc-2.11.1.so[7f30f7ce5000+178000]
>
> I rebuilt the packages from
> http://ppa.launchpad.net/ubuntu-ha/lucid-cluster/ubuntu/pool/main/r/redhat-cluster
> on a freshly installed lucid VM, but this didn't change anything. I even
> upgraded them to the current 3.0.11; still segfaulting. So trial and error
> doesn't seem to work. Maybe someone with a little more understanding of
> what's going on can make an educated guess?
>
> TIA,
> Oliver
>
>> On Sat, Apr 24, 2010 at 12:25 PM, Oliver Heinz <ohe...@fbihome.de> wrote:
>> > Hi,
>> >
>> > when rebooting my cluster nodes they won't bring up the ocfs2-fs because
>> > of resDLM failing. When I issue a '/etc/init.d/pacemaker restart'
>> > afterwards everything is fine.
>> >
>> > The machine needs quite a while to bring up the (bonding) network
>> > interfaces.
>> > Do timeout values need to be adjusted? Or should I rather try to start
>> > pacemaker after the network is completely up?
>> >
>> > my current config:
>> >
>> > node server-c \
>> >         attributes standby="off"
>> > node server-d
>> > primitive failover-ip ocf:heartbeat:IPaddr \
>> >         params ip="192.168.5.150" \
>> >         op monitor interval="10s"
>> > primitive resDLM ocf:pacemaker:controld \
>> >         op monitor interval="120s"
>> > primitive resFS ocf:heartbeat:Filesystem \
>> >         params device="/dev/mapper/data-data" directory="/srv/data" fstype="ocfs2" \
>> >         op monitor interval="120s"
>> > primitive resO2CB ocf:pacemaker:o2cb \
>> >         op monitor interval="120s"
>> > clone cloneDLM resDLM \
>> >         meta globally-unique="false" interleave="true"
>> > clone cloneFS resFS \
>> >         meta interleave="true" ordered="true"
>> > clone cloneO2CB resO2CB \
>> >         meta globally-unique="false" interleave="true"
>> > colocation colFSO2CB inf: cloneFS cloneO2CB
>> > colocation colO2CBDLM inf: cloneO2CB cloneDLM
>> > order ordDLMO2CB 0: cloneDLM cloneO2CB
>> > order ordO2CBFS 0: cloneO2CB cloneFS
>> > property $id="cib-bootstrap-options" \
>> >         dc-version="1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd" \
>> >         cluster-infrastructure="openais" \
>> >         expected-quorum-votes="2" \
>> >         stonith-enabled="false" \
>> >         last-lrm-refresh="1272026744"
>> >
>> > I tried something like
>> >
>> > primitive resDLM ocf:pacemaker:controld \
>> >         op start timeout="100s" \
>> >         op monitor interval="120s"
>> >
>> > but this didn't help.
>> >
>> > TIA,
>> > Oliver
>> >
>> > _______________________________________________
>> > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
>> > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>> >
>> > Project Home: http://www.clusterlabs.org
>> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf