So the problem you are encountering is with killing the heartbeat by UUID. You could kill it by device name instead.
By now you have the list of heartbeat regions. To get the device name for a region, do:

$ cat /sys/kernel/config/cluster/CLUSTERNAME/heartbeat/C43CB881C2C84B09BAC14546BF6DCAD9/dev
sdf1
$ ocfs2_hb_ctl -K -d /dev/sdf1

Now make sure that the device is not mounted. It should not be. If it is, then you have probably used force-uuid-reset to change the UUID of an active device. In that case, I see no solution other than a node reset.

But before you do this, I would like some more info.

1. strace -o /tmp/hbctl.out ocfs2_hb_ctl -K -u F5F0522D39FC4EB2824C3E68C0B1D589
2. uname -a
3. rpm -qa | grep ocfs2
4. rpm -qf `which ocfs2_hb_ctl`
5. mounted.ocfs2 -d >/tmp/mounted.out

Thanks
Sunil

Daniel Keisling wrote:
> I wrote a script to easily get the heartbeats that should have been
> killed. However, I get a segmentation fault every time I try to kill
> the "dead" heartbeats:
>
> [EMAIL PROTECTED] tmp]# mounted.ocfs2 -d | grep -i f5f0 | wc -l
> 0
>
> [EMAIL PROTECTED] tmp]# ocfs2_hb_ctl -K -u F5F0522D39FC4EB2824C3E68C0B1D589
> Segmentation fault (core dumped)
>
> The process is still active:
>
> [EMAIL PROTECTED] tmp]# ps -ef | grep -i f5f0
> root       620   169  0 Nov29 ?        00:00:30 [o2hb-F5F0522D39]
> root     22608 18491  0 14:07 pts/4    00:00:00 grep -i f5f0
>
> Attached is the core.
>
> While I can create and mount snapshot filesystems on my development
> node, a dead heartbeat on one of my production nodes is not letting me
> mount the snapshot for a newly presented filesystem (thus causing our
> backups to fail). What else can I do? I really don't want to open an
> SR with Oracle...
>
> Thanks,
>
> Daniel
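
The `cat .../dev` lookup from the reply can be repeated for every heartbeat region in one pass. What follows is only a minimal sketch, assuming each region directory under the cluster's `heartbeat` directory in configfs exposes a `dev` file as shown in the reply. The function name `list_hb_devices` is hypothetical and not part of ocfs2-tools, and `CLUSTERNAME` remains a placeholder for the real cluster name.

```shell
#!/bin/sh
# Print "<region-uuid> /dev/<device>" for each o2hb heartbeat region.
# $1: heartbeat directory (assumed layout), e.g.
#     /sys/kernel/config/cluster/CLUSTERNAME/heartbeat
list_hb_devices() {
    hb_dir=$1
    # The trailing slash makes the glob match region directories only.
    for region in "$hb_dir"/*/; do
        # Skip entries without a readable "dev" attribute
        # (this also covers the case where the glob matched nothing).
        [ -r "${region}dev" ] || continue
        uuid=$(basename "$region")
        dev=$(cat "${region}dev")
        printf '%s /dev/%s\n' "$uuid" "$dev"
    done
}
```

Calling `list_hb_devices /sys/kernel/config/cluster/CLUSTERNAME/heartbeat` should print one line per region. The function only reads from configfs. Whether to pass any of the listed devices to `ocfs2_hb_ctl -K -d` is left as a manual decision, after checking the `mounted.ocfs2 -d` output.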
