Hi all,
I have a problem with clvmd while setting up a 2-node cluster.
On both nodes SLES11 (Linux 2.6.27.23-0.1-xen) is installed with
pacemaker-1.0.3-4.1, openais-0.80.3-26.1, libdlm-2.99.08-8.4
and lvm2-clvm-2.02.39-18.4.
The locking_type in lvm.conf was changed to 3 on both nodes.
A DLM clone resource and a clvm clone resource were added to
the pacemaker configuration, with clvm depending on DLM.
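A minimal sketch of the setup described above, not a verbatim copy of the actual files. The clone names and the ocf:lvm2:clvmd agent are taken from the crm_mon output below; the DLM agent (ocf:pacemaker:controld), the primitive name "dlm", the constraint IDs and the monitor interval are assumptions for illustration only.

```
# /etc/lvm/lvm.conf (both nodes): use clustered locking
global {
    locking_type = 3
}

# crm configure (sketch; "dlm", the constraint IDs and the
# monitor interval are hypothetical)
primitive dlm ocf:pacemaker:controld op monitor interval="60s"
primitive clvm ocf:lvm2:clvmd op monitor interval="60s"
clone dlm-clone dlm meta interleave="true"
clone clvm-clone clvm meta interleave="true"
# clvm runs only where DLM runs, and starts after it
colocation clvm-with-dlm inf: clvm-clone dlm-clone
order dlm-before-clvm inf: dlm-clone clvm-clone
```

The colocation and order constraints together express "clvm depending on DLM": clvmd is placed only on nodes where the DLM clone is running, and is started only after DLM has come up there.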
When openais is started on both nodes, DLM and clvmd start
on the first node, but clvmd hangs on the second node and times
out after 5 minutes.
crm_mon output:
============
Last updated: Fri Oct 23 07:26:31 2009
Current DC: cuzzonia - partition with quorum
Version: 1.0.3-0080ec086ae9c20ad5c4c3562000c0ad68374f0a
2 Nodes configured, 2 expected votes
4 Resources configured.
============
Online: [ cuzzonia cuzzonib ]
Full list of resources:
Clone Set: iRMC_cuzzonib
Started: [ cuzzonia ]
Clone Set: iRMC_cuzzonia
Started: [ cuzzonib ]
Clone Set: dlm-clone
Started: [ cuzzonia cuzzonib ]
Clone Set: clvm-clone
clvm:1 (ocf::lvm2:clvmd): Started cuzzonib (unmanaged) FAILED
Started: [ cuzzonia ]
Failed actions:
clvm:1_start_0 (node=cuzzonib, call=9, rc=-2, status=Timed Out): unknown exec error
clvm:1_stop_0 (node=cuzzonib, call=10, rc=1, status=complete): unknown error
An strace of "clvmd start" shows the following difference at the end of the trace:
First node (cuzzonia):
....
select(5, [4], NULL, NULL, NULL) = 1 (in [4])
read(4, "\0\0\0\0", 4) = 4
exit_group(0) = ?
Second node (cuzzonib):
............
select(5, [4], NULL, NULL, NULL) = ? ERESTARTNOHAND (To be restarted)
--- SIGTERM (Terminated) @ 0 (0) ---
+++ killed by SIGTERM +++
What could be the reason for this problem?
Thanks and regards,
Armin Haußecker