Hi,
I have a two-node ocfs2 cluster running UEK 2.6.32-100.0.19.el5,
ocfs2console-1.6.3-2.el5, ocfs2-tools-1.6.3-2.el5.
My problem is that every time I try to run /etc/init.d/o2cb stop
it fails with this error:
Stopping O2CB cluster CLUSTER: Failed
Unable to stop cluster as
ls -lR /sys/kernel/config/cluster
What does this return?
On 10/18/2011 02:05 PM, Laurentiu Gosu wrote:
Hi,
I have a two-node ocfs2 cluster running UEK 2.6.32-100.0.19.el5,
ocfs2console-1.6.3-2.el5, ocfs2-tools-1.6.3-2.el5.
My problem is that every time I try to run /etc/init.d/o2cb
What does this return?
cat
/sys/kernel/config/cluster/CLUSTER/heartbeat/918673F06F8F4ED188DDCE14F39945F6/dev
Also, do:
ls -lR /sys/kernel/debug/ocfs2
ls -lR /sys/kernel/debug/o2dlm
On 10/18/2011 02:14 PM, Laurentiu Gosu wrote:
Here is the output:
ls -lR /sys/kernel/config/cluster
Again the outputs:
cat
/sys/kernel/config/cluster/CLUSTER/heartbeat/918673F06F8F4ED188DDCE14F39945F6/dev
dm-2
(I guess this should be volgr1-lvol0?)
ls -lR /sys/kernel/debug/ocfs2
ls: /sys/kernel/debug/ocfs2: No such file or directory
ls -lR /sys/kernel/debug/o2dlm
ls:
mount -t debugfs debugfs /sys/kernel/debug
Then list that dir.
Also, do:
ocfs2_hb_ctl -I -d /dev/dm-2
Be careful before killing. We want to be sure that dev is not mounted.
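The "is it mounted" check can be sketched as a small shell helper. This is only a sketch under stated assumptions: `is_mounted` is a hypothetical name, and it compares the device field of the mounts table literally, so a device known under several names (for example /dev/dm-2 and /dev/mapper/volgr1-lvol0) has to be checked under each name.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given device name appears as the
# first field of a mounts table (defaults to /proc/mounts).
is_mounted() {
    dev=$1
    mounts=${2:-/proc/mounts}
    # Literal comparison only; device aliases are NOT resolved here.
    awk -v d="$dev" '$1 == d { found = 1 } END { exit found ? 0 : 1 }' "$mounts"
}

# Example use (not executed here):
#   if is_mounted /dev/mapper/volgr1-lvol0; then
#       echo "still mounted, do not kill the heartbeat"
#   fi
```

A mount-table lookup only covers the local node; mounted.ocfs2 -f is still the way to see which nodes have the volume mounted.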
On 10/18/2011 02:23 PM, Laurentiu Gosu wrote:
Again the outputs:
cat
ls -lR /sys/kernel/debug/ocfs2
/sys/kernel/debug/ocfs2:
total 0
ls -lR /sys/kernel/debug/o2dlm
/sys/kernel/debug/o2dlm:
total 0
ocfs2_hb_ctl -I -d /dev/dm-2
ocfs2_hb_ctl: Device name specified was not found while reading uuid
There is no /dev/dm-2 mounted.
On 10/19/2011 00:27, Sunil Mushran wrote:
So it is not mounted. But we still have a hb thread because
hb could not be stopped during umount. The reason for that
could be the same that causes ocfs2_hb_ctl to fail.
Do:
mounted.ocfs2 -d
On 10/18/2011 02:32 PM, Laurentiu Gosu wrote:
ls -lR /sys/kernel/debug/ocfs2
/sys/kernel/debug/ocfs2:
mounted.ocfs2 -d
Device                    FS     Stack  UUID                              Label
/dev/mapper/volgr1-lvol0  ocfs2  o2cb   0C4AB55FE9314FA5A9F81652FDB9B22D  ocfs2
mounted.ocfs2 -f
Device                    FS     Nodes
/dev/mapper/volgr1-lvol0  ocfs2  ro02xsrv001
ro02xsrv001 = the other
ocfs2_hb_ctl -I -u 0C4AB55FE9314FA5A9F81652FDB9B22D
On 10/18/2011 02:40 PM, Laurentiu Gosu wrote:
mounted.ocfs2 -d
Device                    FS     Stack  UUID                              Label
/dev/mapper/volgr1-lvol0  ocfs2  o2cb   0C4AB55FE9314FA5A9F81652FDB9B22D  ocfs2
mounted.ocfs2 -f
ocfs2_hb_ctl -I -u 0C4AB55FE9314FA5A9F81652FDB9B22D
0C4AB55FE9314FA5A9F81652FDB9B22D: 0 refs
On 10/19/2011 00:43, Sunil Mushran wrote:
ocfs2_hb_ctl -I -u 0C4AB55FE9314FA5A9F81652FDB9B22D
On 10/18/2011 02:40 PM, Laurentiu Gosu wrote:
mounted.ocfs2 -d
Device FS Stack UUID
ocfs2_hb_ctl -K -u 0C4AB55FE9314FA5A9F81652FDB9B22D
ocfs2_hb_ctl: File not found by ocfs2_lookup while stopping heartbeat
No improvement :(
On 10/19/2011 00:50, Sunil Mushran wrote:
See if this cleans it up.
ocfs2_hb_ctl -K -u 0C4AB55FE9314FA5A9F81652FDB9B22D
On 10/18/2011 02:44 PM,
See if this cleans it up.
ocfs2_hb_ctl -K -u 0C4AB55FE9314FA5A9F81652FDB9B22D
On 10/18/2011 02:44 PM, Laurentiu Gosu wrote:
ocfs2_hb_ctl -I -u 0C4AB55FE9314FA5A9F81652FDB9B22D
0C4AB55FE9314FA5A9F81652FDB9B22D: 0 refs
On 10/19/2011 00:43, Sunil Mushran wrote:
ocfs2_hb_ctl -I -u
Let's do it by hand.
rm -rf /sys/kernel/config/cluster/.../heartbeat/0C4AB55FE9314FA5A9F81652FDB9B22D
On 10/18/2011 02:52 PM, Laurentiu Gosu wrote:
ocfs2_hb_ctl -K -u 0C4AB55FE9314FA5A9F81652FDB9B22D
ocfs2_hb_ctl: File not found by ocfs2_lookup while stopping heartbeat
No improvement :(
On
Well, this is weird.
ls /sys/kernel/config/cluster/CLUSTER/heartbeat/
*918673F06F8F4ED188DDCE14F39945F6* dead_threshold
It looks like we have different UUIDs. Where is this one coming from?
ocfs2_hb_ctl -I -u 918673F06F8F4ED188DDCE14F39945F6
918673F06F8F4ED188DDCE14F39945F6: 1 refs
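The mismatch found here can be spotted with a short script. What follows is a minimal sketch, not an ocfs2-tools feature: `list_hb_regions` and `find_stale_regions` are hypothetical helper names. It assumes each heartbeat region appears as a directory under the cluster's heartbeat directory in configfs, and that the known volume UUIDs (for example from mounted.ocfs2 -d) are passed in by hand.

```shell
#!/bin/sh
# Hypothetical helper: print the name of every heartbeat region.
# Regions are directories; attribute files such as dead_threshold are skipped.
list_hb_regions() {
    hb_dir=$1
    for entry in "$hb_dir"/*; do
        [ -d "$entry" ] && basename "$entry"
    done
    return 0
}

# Hypothetical helper: print every region whose name matches none of the
# UUIDs given after the directory argument.
find_stale_regions() {
    hb_dir=$1
    shift
    for region in $(list_hb_regions "$hb_dir"); do
        stale=yes
        for uuid in "$@"; do
            [ "$region" = "$uuid" ] && stale=no
        done
        [ "$stale" = yes ] && echo "$region"
    done
    return 0
}

# Example use (not executed here):
#   find_stale_regions /sys/kernel/config/cluster/CLUSTER/heartbeat \
#       0C4AB55FE9314FA5A9F81652FDB9B22D
```

With the values from this thread, the example call would print 918673F06F8F4ED188DDCE14F39945F6, the region that no longer matches the UUID on disk.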
On 10/19/2011
Did you reformat the volume recently? or, when did you format last?
On 10/18/2011 03:13 PM, Laurentiu Gosu wrote:
Well, this is weird.
ls /sys/kernel/config/cluster/CLUSTER/heartbeat/
*918673F06F8F4ED188DDCE14F39945F6* dead_threshold
looks like we have different UUIDs. Where is this coming
Yes, I did reformat it (more than once, I think, last week). This is
a pre-production system and I'm trying various options before moving
into real life.
On 10/19/2011 01:19, Sunil Mushran wrote:
Did you reformat the volume recently? or, when did you format last?
On 10/18/2011 03:13 PM,
One way this can happen is if one starts the hb manually and then
force-formats that volume. The format generates a new UUID. Once that
happens, the hb tool cannot map the region to the device and thus fails
to stop it. Right now the easiest option on this box is to reset it.
On
OK, I rebooted one of the nodes (both had similar issues), but
something is still fishy.
- I mounted the device: mount -t ocfs2 /dev/volgr1/lvol0 /mnt/tmp/
- I unmounted it: umount /mnt/tmp/
- I tried to stop o2cb: /etc/init.d/o2cb stop
Stopping O2CB cluster CLUSTER: Failed
Unable to stop cluster
Manual delete will only work if there are no references. In your case
there are references.
You may want to start both nodes from scratch. Do not start/stop
heartbeat manually. Also, do not force-format.
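The "no references" condition can be checked before touching anything by hand. This is a rough sketch that assumes the output line has the shape seen earlier in this thread ("<UUID>: <N> refs"); `hb_refs` and `can_remove_region` are hypothetical names, not part of ocfs2-tools.

```shell
#!/bin/sh
# Hypothetical helper: extract N from a line shaped like "<UUID>: <N> refs".
# Prints nothing if the line does not have that shape.
hb_refs() {
    printf '%s\n' "$1" |
        sed -n 's/^[0-9A-Fa-f]\{1,\}: \([0-9]\{1,\}\) refs$/\1/p'
}

# Hypothetical helper: succeed only when the line parses and reports 0 refs.
can_remove_region() {
    refs=$(hb_refs "$1")
    [ -n "$refs" ] && [ "$refs" -eq 0 ]
}

# Example use (not executed here):
#   line=$(ocfs2_hb_ctl -I -u 918673F06F8F4ED188DDCE14F39945F6)
#   if can_remove_region "$line"; then
#       echo "no references"
#   else
#       echo "still referenced, leave it alone"
#   fi
```

The check deliberately fails on any line it cannot parse, so an error message from the tool is never mistaken for a zero count.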
On 10/18/2011 03:54 PM, Laurentiu Gosu wrote:
OK, i rebooted one of the nodes(both had