Hi!

Update: After the reboot the cLVM and OCFS2 filesystems seem OK, but the LV is
re-mirroring:

# lvs
  LV   VG  Attr      LSize    Pool Origin Data%  Move Log Copy%  Convert
  cLV1 cVG mwi-aom-- 100.00g                              13.19

(The LV had just been re-mirrored yesterday.)
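To watch the resync without rerunning lvs by hand, a small polling sketch could be used (an assumption on my side: the LV path cVG/cLV1 from above, and `copy_percent` as the report field behind the Copy% column):

```shell
#!/bin/sh
# Sketch: poll mirror sync until Copy% reaches 100 (assumes LV cVG/cLV1).
# parse_pct strips the padding that `lvs --noheadings` puts around the value.
parse_pct() { tr -d ' \n'; }

# On a cluster node the loop would look like this (not executed here):
#   while pct=$(lvs --noheadings -o copy_percent cVG/cLV1 | parse_pct); do
#     [ "$pct" = "100.00" ] && break
#     echo "resync at $pct%"; sleep 60
#   done

# Demonstrate the parsing on the value seen in the lvs output above:
printf '  13.19\n' | parse_pct
```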
The difference I see is this:

kernel: [  148.951183] device-mapper: dm-log-userspace: version 1.1.0 loaded
dmeventd[15783]: dmeventd ready for processing.

In the past days that node was rebooted about every 7 minutes because of that
problem...

Regards,
Ulrich

>>> "Ulrich Windl" <[email protected]> wrote on 18.12.2014 at 09:48 in message
<[email protected]>:
> Hello!
>
> I have a four-node test cluster running SLES11 SP3 with HAE. One of the nodes
> is having a problem I cannot solve:
> The node repeatedly reboots, and the problem seems to be cLVM. Soon after cLVM
> starts, some requests seem to time out, and the cluster fences the node.
> Here is what I have:
>
> cluster-dlm starts and seems to be running OK.
> OCFS2 tries to mount:
> kernel: [62593.937391] ocfs2: Mounting device (253,9) on (node 5874044, slot 2) with ordered data mode.
> kernel: [62593.969439] (mount.ocfs2,27848,3):ocfs2_global_read_info:403 ERROR: status = 24
> kernel: [62593.983920] (mount.ocfs2,27848,3):ocfs2_global_read_info:403 ERROR: status = 24
> kernel: [62594.017919] ocfs2: Mounting device (253,11) on (node 5874044, slot 2) with ordered data mode.
> kernel: [62594.027811] (mount.ocfs2,27847,3):ocfs2_global_read_info:403 ERROR: status = 24
> kernel: [62594.036353] (mount.ocfs2,27847,3):ocfs2_global_read_info:403 ERROR: status = 24
> kernel: [62594.040442] ocfs2: Mounting device (253,4) on (node 5874044, slot 1) with ordered data mode.
> kernel: [62594.044470] (mount.ocfs2,27916,2):ocfs2_global_read_info:403 ERROR: status = 24
> kernel: [62594.083156] (mount.ocfs2,27916,3):ocfs2_global_read_info:403 ERROR: status = 24
> The RAs all report that the mount succeeded.
> cmirrord[28116]: Starting cmirrord:
> cmirrord[28116]: Built: May 29 2013 15:04:35
> LVM(prm_LVM_cVG)[28196]: INFO: Activating volume group cVG
> (I guess the cVG should be activated before OCFS2; maybe that's the problem.)
> kernel: [62597.721116] device-mapper: dm-log-userspace: version 1.1.0 loaded
> kernel: [62612.720078] device-mapper: dm-log-userspace: [35cRCORE] Request timed out: [5/2] - retrying
> kernel: [62627.720027] device-mapper: dm-log-userspace: [35cRCORE] Request timed out: [5/4] - retrying
> kernel: [62642.720022] device-mapper: dm-log-userspace: [35cRCORE] Request timed out: [5/5] - retrying
> kernel: [62657.721517] device-mapper: dm-log-userspace: [35cRCORE] Request timed out: [5/6] - retrying
>
> A short time later the node is fenced. Before some updates, this node also
> worked fine; I don't know where to start searching. Ideas?
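[Inline note on the activation-order guess above: if the VG really must be
active before the OCFS2 mounts, an explicit order constraint would enforce
that. A crm configure sketch; prm_LVM_cVG appears in the logs, while
prm_fs_ocfs2 is a hypothetical name for the Filesystem resource:]

# Sketch (assumption): start the cLVM volume group before the OCFS2 mount.
# prm_LVM_cVG is from the logs; prm_fs_ocfs2 is a hypothetical mount resource.
order o_cVG_before_ocfs2 inf: prm_LVM_cVG prm_fs_ocfs2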
>
> The relevant updates may be these:
> ocfs2-kmp-xen-1.6_3.0.101_0.40-0.20.98                 Thu Oct 23 10:56:55 2014
> cluster-network-kmp-xen-1.4_3.0.101_0.40-2.27.98       Thu Oct 23 10:56:51 2014
> ocfs2-kmp-default-1.6_3.0.101_0.40-0.20.98             Thu Oct 23 10:56:45 2014
> cluster-network-kmp-default-1.4_3.0.101_0.40-2.27.98   Thu Oct 23 10:56:40 2014
> xen-kmp-default-4.2.4_04_3.0.101_0.40-0.9.1            Thu Oct 23 10:56:35 2014
> xen-tools-4.2.4_04-0.9.1                               Thu Oct 23 10:56:21 2014
> kernel-xen-3.0.101-0.40.1                              Thu Oct 23 10:55:29 2014
> kernel-default-3.0.101-0.40.1                          Thu Oct 23 10:54:35 2014
> xen-libs-4.2.4_04-0.9.1                                Thu Oct 23 10:54:28 2014
> xen-4.2.4_04-0.9.1                                     Thu Oct 23 10:54:27 2014
> kernel-xen-base-3.0.101-0.40.1                         Thu Oct 23 10:53:49 2014
> kernel-default-base-3.0.101-0.40.1                     Thu Oct 23 10:52:58 2014
>
> Regards,
> Ulrich

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
