This doesn't happen all that often. I suppose that had I read the manual on ocfs2_hb_ctl instead of just relying on my init scripts, I would have been able to stop and start the heartbeat gracefully. If it ever happens again (tempting to force it), I'll make sure to cat /proc/mounts.
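For anyone else who hits this, the two checks described in the quoted reply below can be roughed out in shell. This is only a sketch: the function names `stale_mtab_entries` and `device_is_unmounted`, and the optional file arguments, are made up for illustration and are not part of ocfs2-tools. It assumes the usual layout of /etc/mtab and /proc/mounts (device in field 1, mount point in field 2) and that the device appears in /proc/mounts under the same name you pass in.

```shell
#!/bin/sh
# Sketch only: function names and the optional file arguments are
# hypothetical, added so the logic can be tried against sample files.

# Print mount points listed in the mtab file but missing from the
# kernel's own list. Field 2 of each line is the mount point.
stale_mtab_entries() {
    mtab="${1:-/etc/mtab}"
    kernel="${2:-/proc/mounts}"
    # First file read is the kernel view; remember its mount points,
    # then print any mtab mount point the kernel does not know about.
    awk 'FILENAME == ARGV[1] { live[$2] = 1; next }
         !($2 in live)       { print $2 }' "$kernel" "$mtab"
}

# Succeed only if the device is absent from the kernel's mount list.
# Assumption: the device is listed under exactly the name given here.
device_is_unmounted() {
    dev="$1"
    kernel="${2:-/proc/mounts}"
    ! awk -v d="$dev" '$1 == d { found = 1 } END { exit !found }' "$kernel"
}

# Possible use (not run here):
#   stale_mtab_entries
#   if device_is_unmounted /dev/sdf1; then
#       ocfs2_hb_ctl -I -d /dev/sdf1    # check the references first
#   fi
```

The second check matters because, as the reply notes, the heartbeat will let itself be stopped even while the volume is mounted, and the node gets fenced if you do that. Looking at /proc/mounts first is a cheap guard, not a guarantee.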
Thanks for the update.

Michael

-----Original Message-----
From: Sunil Mushran [mailto:[EMAIL PROTECTED]
Sent: Monday, January 07, 2008 4:03 PM
To: Michael M.
Cc: [email protected]
Subject: Re: [Ocfs2-users] Bug when unmounting

The mount utility is lying. Not only does the fs appear to be umounted, it is umounted.

This is because the mount command gets the list of mounted volumes from /etc/mtab. (The mount and umount utilities update this file accordingly.) When you do a ctrl-c, the kernel/fs still umounts the vol, but the umount tool is unable to remove the corresponding entry from /etc/mtab.

Next time you encounter this issue, do cat /proc/mounts. That's the kernel view of the mounted volumes, and if I am correct you will not see your umounted vol listed there.

The other side effect of this is that the heartbeat does not stop. But that can be stopped manually.

To view the hb references, do:

    $ ocfs2_hb_ctl -I -d /dev/sdf1

You should see just 1 reference.

To stop the heartbeat, do:

    $ ocfs2_hb_ctl -K -d /dev/sdf1

This will stop the heartbeat cleanly.

Note that heartbeat has no knowledge of the mount. As in, it will allow you to stop the hb even if the volume is mounted. And if you do that, the fs/nodemanager will fence the node. So manually stop only if you encounter the above situation.

Lastly, while yes, this is a bug, we won't be able to address it till OCFS2 1.4. The newer util-linux (mount/umount) allows fs extensions for umounts like it already does for mount. We will have umount.ocfs2 (like we do mount.ocfs2), which will block signals for the duration of umount/stop hb.

How often does this happen?

Sunil

Michael M. wrote:
> Dear OCFS2-Devs,
>
> There is a slight bug when unmounting an ocfs2 volume. I was
> unmounting it, and typing other commands while waiting for it to
> unmount, and accidentally hit ctrl-c. The filesystem appears
> unmounted; when I run umount /mountpoint/ it reports it as not
> mounted. However, when I attempt to stop o2cb, it says at least one
> heartbeat region is still active. At this point, there was nothing I
> could do (not umount, not mount, not stop or start the heartbeat), and
> I needed to powercycle the machine, which of course caused the other
> machines to wait until the heartbeat check timeout was up.
>
> This happens every so often. Is there a way to FORCE the heartbeat to
> stop?
>
> Or to force mount/unmount?
>
> Michael
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> Ocfs2-users mailing list
> [email protected]
> http://oss.oracle.com/mailman/listinfo/ocfs2-users
