Yes, I did reformat it (more than once, I think, last week). This is a pre-production system and I'm trying various options before moving it into production.

On 10/19/2011 01:19, Sunil Mushran wrote:
Did you reformat the volume recently? Or, when did you last format it?

On 10/18/2011 03:13 PM, Laurentiu Gosu wrote:
Well... this is weird.
ls /sys/kernel/config/cluster/CLUSTER/heartbeat/
918673F06F8F4ED188DDCE14F39945F6  dead_threshold

Looks like we have different UUIDs. Where is this one coming from?

ocfs2_hb_ctl -I -u 918673F06F8F4ED188DDCE14F39945F6
918673F06F8F4ED188DDCE14F39945F6: 1 refs
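
A mismatch like this can be spotted by comparing the UUID reported by mounted.ocfs2 -d with the region directories under configfs. The following is a minimal shell sketch, not part of the original exchange: the function name find_stale_hb_regions is hypothetical, and the default path assumes the cluster is named CLUSTER, as in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print heartbeat regions in configfs whose UUID does
# not match the UUID of the volume we expect (e.g. from `mounted.ocfs2 -d`).
#   $1 = expected volume UUID
#   $2 = heartbeat directory (the default assumes the cluster name CLUSTER)
find_stale_hb_regions() {
    expected="$1"
    hb_dir="${2:-/sys/kernel/config/cluster/CLUSTER/heartbeat}"
    for region in "$hb_dir"/*/; do
        # Skip the unexpanded glob when no region directory exists.
        [ -d "$region" ] || continue
        uuid=$(basename "$region")
        if [ "$uuid" != "$expected" ]; then
            # The dev attribute names the block device the region uses.
            dev=$(cat "$region/dev" 2>/dev/null)
            printf '%s %s\n' "$uuid" "${dev:-unknown}"
        fi
    done
}
```

Called as find_stale_hb_regions 0C4AB55FE9314FA5A9F81652FDB9B22D, it lists every region that does not belong to the expected volume, together with the device name recorded for it. It only reads configfs and changes nothing.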


On 10/19/2011 01:04, Sunil Mushran wrote:
Let's do it by hand.
rm -rf /sys/kernel/config/cluster/.../heartbeat/0C4AB55FE9314FA5A9F81652FDB9B22D

On 10/18/2011 02:52 PM, Laurentiu Gosu wrote:
 ocfs2_hb_ctl -K -u 0C4AB55FE9314FA5A9F81652FDB9B22D
ocfs2_hb_ctl: File not found by ocfs2_lookup while stopping heartbeat

No improvement :(


On 10/19/2011 00:50, Sunil Mushran wrote:
See if this cleans it up.
ocfs2_hb_ctl -K -u 0C4AB55FE9314FA5A9F81652FDB9B22D

On 10/18/2011 02:44 PM, Laurentiu Gosu wrote:
ocfs2_hb_ctl -I -u 0C4AB55FE9314FA5A9F81652FDB9B22D
0C4AB55FE9314FA5A9F81652FDB9B22D: 0 refs
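
If the reference count has to be checked from a script, the "UUID: N refs" line shown above can be parsed. This is a small sketch, assuming the output keeps exactly the format seen in this thread; hb_refs is a hypothetical name, not part of ocfs2-tools.

```shell
#!/bin/sh
# Hypothetical helper: read `ocfs2_hb_ctl -I` output on stdin and print the
# reference count. Assumes the "UUID: N refs" format shown in this thread;
# prints nothing for lines that do not match (e.g. error messages).
hb_refs() {
    sed -n 's/^[0-9A-Fa-f]\{32\}: \([0-9][0-9]*\) refs$/\1/p'
}

# Example using a line from this thread instead of a live command:
#   echo "0C4AB55FE9314FA5A9F81652FDB9B22D: 0 refs" | hb_refs
```

On a live system the input would come from ocfs2_hb_ctl -I -u UUID piped into hb_refs.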


On 10/19/2011 00:43, Sunil Mushran wrote:
ocfs2_hb_ctl -I -u 0C4AB55FE9314FA5A9F81652FDB9B22D

On 10/18/2011 02:40 PM, Laurentiu Gosu wrote:
mounted.ocfs2 -d
Device                    FS     Stack  UUID                              Label
/dev/mapper/volgr1-lvol0  ocfs2  o2cb   0C4AB55FE9314FA5A9F81652FDB9B22D  ocfs2

mounted.ocfs2 -f
Device                FS     Nodes
/dev/mapper/volgr1-lvol0  ocfs2  ro02xsrv001

ro02xsrv001 = the other node in the cluster.

By the way, there is no /dev/dm-2:
ls /dev/dm-*
/dev/dm-0  /dev/dm-1


On 10/19/2011 00:37, Sunil Mushran wrote:
So it is not mounted, but we still have an hb thread because
hb could not be stopped during umount. The reason for that
could be the same one that causes ocfs2_hb_ctl to fail.

Do:
mounted.ocfs2 -d

On 10/18/2011 02:32 PM, Laurentiu Gosu wrote:
ls -lR /sys/kernel/debug/ocfs2
/sys/kernel/debug/ocfs2:
total 0

ls -lR /sys/kernel/debug/o2dlm
/sys/kernel/debug/o2dlm:
total 0

ocfs2_hb_ctl -I -d /dev/dm-2
ocfs2_hb_ctl: Device name specified was not found while reading uuid

There is no /dev/dm-2 mounted.


On 10/19/2011 00:27, Sunil Mushran wrote:
mount -t debugfs debugfs /sys/kernel/debug

Then list that dir.

Also, do:
ocfs2_hb_ctl -I -d /dev/dm-2

Be careful before killing. We want to be sure that dev is not mounted.

On 10/18/2011 02:23 PM, Laurentiu Gosu wrote:
Here are the outputs again:
cat /sys/kernel/config/cluster/CLUSTER/heartbeat/918673F06F8F4ED188DDCE14F39945F6/dev
dm-2
---> this should be volgr1-lvol0, I guess?
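
The dev attribute holds a kernel device name (dm-N), not the /dev/mapper name. One way to translate between the two is sketched below. It assumes the kernel exposes /sys/block/dm-N/dm/name, which may not hold on every kernel; dm_name is a hypothetical helper, not an existing tool.

```shell
#!/bin/sh
# Hypothetical helper: print the device-mapper name behind a dm-N node.
#   $1 = kernel name, e.g. dm-2
#   $2 = sysfs block directory (default /sys/block)
# Assumption: the kernel provides <sysfs>/dm-N/dm/name.
dm_name() {
    node="$1"
    sys_block="${2:-/sys/block}"
    name_file="$sys_block/$node/dm/name"
    if [ -r "$name_file" ]; then
        cat "$name_file"
    else
        # A missing node suggests the region refers to a device that is gone.
        echo "no such device-mapper node: $node" >&2
        return 1
    fi
}
```

Running dm_name dm-2 prints the mapper name when the node exists and returns a non-zero status when it does not, which is itself a hint that the heartbeat region points at a device that no longer exists.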

ls -lR /sys/kernel/debug/ocfs2
ls: /sys/kernel/debug/ocfs2: No such file or directory

ls -lR /sys/kernel/debug/o2dlm
ls: /sys/kernel/debug/o2dlm: No such file or directory

I think I have to enable debug first somehow?

Laurentiu.

On 10/19/2011 00:17, Sunil Mushran wrote:
What does this return?
cat /sys/kernel/config/cluster/CLUSTER/heartbeat/918673F06F8F4ED188DDCE14F39945F6/dev

Also, do:
ls -lR /sys/kernel/debug/ocfs2
ls -lR /sys/kernel/debug/o2dlm

On 10/18/2011 02:14 PM, Laurentiu Gosu wrote:
Here is the output:

ls -lR /sys/kernel/config/cluster
/sys/kernel/config/cluster:
total 0
drwxr-xr-x 4 root root 0 Oct 19 00:12 CLUSTER

/sys/kernel/config/cluster/CLUSTER:
total 0
-rw-r--r-- 1 root root 4096 Oct 19 00:12 fence_method
drwxr-xr-x 3 root root    0 Oct 19 00:12 heartbeat
-rw-r--r-- 1 root root 4096 Oct 19 00:12 idle_timeout_ms
-rw-r--r-- 1 root root 4096 Oct 19 00:12 keepalive_delay_ms
drwxr-xr-x 4 root root    0 Oct 11 20:23 node
-rw-r--r-- 1 root root 4096 Oct 19 00:12 reconnect_delay_ms

/sys/kernel/config/cluster/CLUSTER/heartbeat:
total 0
drwxr-xr-x 2 root root 0 Oct 19 00:12 918673F06F8F4ED188DDCE14F39945F6
-rw-r--r-- 1 root root 4096 Oct 19 00:12 dead_threshold

/sys/kernel/config/cluster/CLUSTER/heartbeat/918673F06F8F4ED188DDCE14F39945F6:
total 0
-rw-r--r-- 1 root root 4096 Oct 19 00:12 block_bytes
-rw-r--r-- 1 root root 4096 Oct 19 00:12 blocks
-rw-r--r-- 1 root root 4096 Oct 19 00:12 dev
-r--r--r-- 1 root root 4096 Oct 19 00:12 pid
-rw-r--r-- 1 root root 4096 Oct 19 00:12 start_block

/sys/kernel/config/cluster/CLUSTER/node:
total 0
drwxr-xr-x 2 root root 0 Oct 19 00:12 ro02xsrv001
drwxr-xr-x 2 root root 0 Oct 19 00:12 ro02xsrv002

/sys/kernel/config/cluster/CLUSTER/node/ro02xsrv001:
total 0
-rw-r--r-- 1 root root 4096 Oct 19 00:12 ipv4_address
-rw-r--r-- 1 root root 4096 Oct 19 00:12 ipv4_port
-rw-r--r-- 1 root root 4096 Oct 19 00:12 local
-rw-r--r-- 1 root root 4096 Oct 19 00:12 num

/sys/kernel/config/cluster/CLUSTER/node/ro02xsrv002:
total 0
-rw-r--r-- 1 root root 4096 Oct 19 00:12 ipv4_address
-rw-r--r-- 1 root root 4096 Oct 19 00:12 ipv4_port
-rw-r--r-- 1 root root 4096 Oct 19 00:12 local
-rw-r--r-- 1 root root 4096 Oct 19 00:12 num




On 10/19/2011 00:12, Sunil Mushran wrote:
ls -lR /sys/kernel/config/cluster

What does this return?

On 10/18/2011 02:05 PM, Laurentiu Gosu wrote:
Hi,
I have a 2-node ocfs2 cluster running UEK 2.6.32-100.0.19.el5,
ocfs2console-1.6.3-2.el5, ocfs2-tools-1.6.3-2.el5.
My problem is that every time I try to run /etc/init.d/o2cb stop
it fails with this error:
      Stopping O2CB cluster CLUSTER: Failed
Unable to stop cluster as heartbeat region still active

There is no active mount point. I tried to manually stop the heartbeat with
"ocfs2_hb_ctl -K -d /dev/mapper/volgr1-lvol0 ocfs2" (after finding the refs
number with "ocfs2_hb_ctl -I -d /dev/mapper/volgr1-lvol0"). But even when the
refs number is zero, the "heartbeat region still active" error still occurs.
How can I fix this?

Thank you in advance.
Laurentiu.


_______________________________________________
Ocfs2-users mailing list
Ocfs2-users@oss.oracle.com
http://oss.oracle.com/mailman/listinfo/ocfs2-users















