On Wed, Aug 10, 2011 at 11:29 PM, Bruno Wolff III <[email protected]> wrote:
> On Wed, Aug 10, 2011 at 16:35:23 -0500,
>  Ed Sutton <[email protected]> wrote:
>> After every livecd-creator build I must restart and reboot my CentOS 5.2
>> machine using the fsck option to fix disk issues I do not understand. I
>> tried a work-around without success for a similar sounding problem:
>>
>> Bug 509427 - livecd-creator fails to unmount
>> https://bugzilla.redhat.com/show_bug.cgi?id=509427
>>
>> Any suggestions on troubleshooting are much appreciated.
>
> The devices get umount'd in the wrong order. In later versions of
> livecd-creator we changed things so that lazy umounts are used.
> I think there is still a place where an exception can occur where things
> don't get cleaned up nicely, but for the most part exceptions don't
> leave lots of stuff mounted these days.
>
> I am not sure how well livecd-creator for recent Fedoras will work on CentOS
> 5.

It's not just CentOS: I've been seeing umount failures with
livecd-tools-16.3-1.fc16.x86_64.

At first it failed when unmounting bind mounts, with a "block devices not
permitted on fs" message. That message is what umount prints for EACCES:
http://git.kernel.org/?p=utils/util-linux/util-linux.git;a=blob;f=mount/umount.c;h=64f320c71b0f5ec1a4e6c6190bc186d22f23034b;hb=HEAD#l200

Adding a 2-second sleep seems to paper over the issue; I haven't seen _that_
failure since applying the following patch:

--- fs.py.ORIG 2011-03-31 19:53:44.000000000 -0400
+++ fs.py 2011-08-10 12:19:31.219339890 -0400
@@ -142,6 +142,9 @@
         if not self.mounted:
             return
 
+        # sleep to try to avoid umount shenanigans
+        # e.g. umount: XXX/imgcreate-3OdaNp/install_root//var/cache/yum: block devices not permitted on fs
+        time.sleep(2)
         rc = call(["/bin/umount", self.dest])
         if rc != 0:
             logging.info("Unable to unmount %s normally, using lazy unmount" % self.dest)

After that I saw sporadic failures when removing the loop device:

Losetup remove /dev/loop14
loop: can't delete device /dev/loop14: Device or resource busy

followed by an fsck failure on ext3fs.img.

I tried to see what was holding the loop device, but with the following
patch applied I can't reproduce the problem any more; it probably adds
enough delay to avoid the issue:

--- fs.py.ORIG 2011-03-31 19:53:44.000000000 -0400
+++ fs.py 2011-08-10 12:19:31.219339890 -0400
@@ -320,6 +323,8 @@
         if self.device is None:
             return
 
         logging.info("Losetup remove %s" % self.device)
+        rc = call(["fuser", "-m", self.device])
+        logging.info("fuser rc=%s" % rc)
         rc = call(["/sbin/losetup", "-d", self.device])
         self.device = None
@@ -389,6 +394,7 @@
 
     def cleanup(self):
         Mount.cleanup(self)
+        logging.info("Mount.cleaup done")
         self.disk.cleanup()
 
     def unmount(self):
@@ -396,6 +402,7 @@
             logging.info("Unmounting directory %s" % self.mountdir)
             rc = call(["/bin/umount", self.mountdir])
             if rc == 0:
+                logging.info("umount rc=0, ismount=%s" % os.path.ismount(self.mountdir))
                 self.mounted = False
             else:
                 logging.warn("Unmounting directory %s failed, using lazy umount" % self.mountdir)

We have a continuous livecd build running, so I'll keep watching for this
issue. Has anyone else hit this on F15/F16?

Alan

--
livecd mailing list
[email protected]
https://admin.fedoraproject.org/mailman/listinfo/livecd
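
The fixed time.sleep(2) in the first patch above delays every unmount,
whether or not the delay is needed. A minimal sketch of an alternative,
assuming the failure really is transient: retry the normal umount a few
times with a growing delay, and only then fall back to a lazy umount. This
is not code from livecd-tools. The name umount_with_retry and its
parameters are hypothetical, and the sketch targets Python 3 rather than
the Python 2 that fs.py was written for.

```python
import subprocess
import time


def umount_with_retry(path, attempts=5, delay=0.5,
                      run=subprocess.call, sleep=time.sleep):
    """Try a normal umount several times, then fall back to a lazy umount.

    Hypothetical helper, not part of livecd-tools. Returns "clean" if a
    normal umount succeeded, "lazy" if only the lazy umount succeeded,
    and "failed" if neither did. `run` and `sleep` are injectable so the
    logic can be exercised without touching real mounts.
    """
    for attempt in range(attempts):
        if run(["/bin/umount", path]) == 0:
            return "clean"
        # Wait a little longer after each failure, but not after the last.
        if attempt < attempts - 1:
            sleep(delay * (attempt + 1))

    # Last resort: detach now, let the kernel clean up once no longer busy.
    if run(["/bin/umount", "-l", path]) == 0:
        return "lazy"
    return "failed"
```

Returning a status instead of a bare return code lets the caller log how
often the lazy fallback is actually being hit, which is the information
needed to tell whether the underlying race is still there.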
