Hello everyone

Can anyone please assist me with the following problem? In syslog I get the following messages:

kernel: stonithd[2029]: segfault at 0 ip 00000000004047ed sp 00007fffe886c8c0 error 4 in stonithd[400000+17000]
pacemakerd[2025]: notice: pcmk_child_exit: Child process stonith-ng terminated with signal 11 (pid=2029, core=128)
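
As far as I can tell, the kernel line can be decoded as sketched below. This is only a rough illustration, not anything from Pacemaker itself: it assumes the standard x86 page-fault error-code bits (bit 0 = protection violation, bit 1 = write access, bit 2 = user mode), and the helper name decode_segfault is made up for this example.

```python
import re

# Hypothetical helper (not part of Pacemaker): decode a kernel "segfault at" line.
# Assumes the usual x86 page-fault error-code bits:
#   bit 0: 0 = page not present, 1 = protection violation
#   bit 1: 0 = read access,      1 = write access
#   bit 2: 0 = kernel mode,      1 = user mode
SEGFAULT_RE = re.compile(
    r"(?P<proc>[^\s\[]+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>\d+)"
    r"(?: in (?P<obj>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\])?"
)


def decode_segfault(line):
    """Return a dict describing the fault, or None if the line does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m.group("err"))
    info = {
        "process": m.group("proc"),
        "pid": int(m.group("pid")),
        "fault_address": int(m.group("addr"), 16),
        "access": "write" if err & 2 else "read",
        "mode": "user" if err & 4 else "kernel",
        "cause": "protection violation" if err & 1 else "page not present",
    }
    if m.group("base") is not None:
        # Offset of the faulting instruction relative to the mapping base.
        info["offset_in_object"] = int(m.group("ip"), 16) - int(m.group("base"), 16)
    return info


line = ("kernel: stonithd[2029]: segfault at 0 ip 00000000004047ed "
        "sp 00007fffe886c8c0 error 4 in stonithd[400000+17000]")
info = decode_segfault(line)
print(info)
print(hex(info["offset_in_object"]))
```

If these assumptions hold, "segfault at 0" with "error 4" would mean a user-mode read of an unmapped page at address 0, which looks like a NULL pointer dereference inside stonithd itself.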

pacemakerd then tries to respawn stonith-ng, but it crashes again, and this cycle repeats indefinitely.

I found a very similar problem in the mailing list archives, but it had already been fixed and was related to Heartbeat only, whereas I'm using Corosync.

What I have noticed is that this is somehow related to the DRBD resource I configure. With an empty configuration (no RAs) or with other RAs (IPaddr2, ...), stonithd runs without any problem. At the same time, despite the issue, the DRBD Master/Slave resource seems to work correctly.

Here is my configuration:

node $id="1" fio-node1 \
    attributes standby="off"
node $id="2" fio-node2 \
    attributes standby="off"
rsc_template drbd-r ocf:linbit:drbd \
    op start interval="0" timeout="240" \
    op promote interval="0" timeout="90" \
    op demote interval="0" timeout="90" \
    op notify interval="0" timeout="90" \
    op stop interval="0" timeout="100" \
    op monitor interval="20" role="Slave" timeout="20" \
    op monitor interval="10" role="Master" timeout="20"
primitive drbd-r1 @drbd-r \
    params drbd_resource="r1"
ms ms-r1 drbd-r1 \
    meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
property $id="cib-bootstrap-options" \
    dc-version="1.1.9-2a917dd" \
    cluster-infrastructure="corosync" \
    stonith-enabled="false" \
    last-lrm-refresh="1366018562"

and here is drbd.conf:

include "drbd.d/global_common.conf";
include "drbd.d/*.res";

resource r1 {
    device /dev/drbd1;
    disk /dev/vg-bio/lv1;
    meta-disk internal;
    on fio-node1 {
        address 172.17.68.128:7789;
    }
    on fio-node2 {
        address 172.17.68.129:7789;
    }
}

You can download the full configuration (cib, corosync.conf, drbd.conf, drbd.d/global_common.conf) here: http://up.iteam.ua/download/152101/50aa518a439747e72/

I'm using Pacemaker 1.1.9 with Corosync 2.3.0 and crmsh 1.2.5 all built from source on Ubuntu Server 12.10 x64.
Build options for the above are:
pacemaker: ./configure --with-corosync --with-cs-quorum --without-ais --without-heartbeat --without-cman --with-snmp
corosync: ./configure --disable-rdma --disable-testagents --disable-dbus --enable-snmp --enable-qdevices
crmsh: ./configure

Any help or guidance is highly appreciated. Thanks!




_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
