[Bug 1680997] Re: Container file system corruption on libvirtd restart
Upstream bug report is at https://bugzilla.redhat.com/show_bug.cgi?id=1452701

** Bug watch added: Red Hat Bugzilla #1452701
   https://bugzilla.redhat.com/show_bug.cgi?id=1452701
[Bug 1680997] Re: Container file system corruption on libvirtd restart
The steps outlined in the initial bug report reliably (100%) reproduce the problem for me on Ubuntu 16.04; I have tested this in several environments (1x AMD, ca. 10x Intel). Here's the short way to get there:

- Install a basic Ubuntu 16.04 Server
- apt-get install virt-manager (installing the GUI pulls in the heavy-lifting components)
- Create a libvirt/lxc container from a definition along these lines: the XML tags did not survive here, leaving only the values AnyName, 2097152, 2097152, 4, /machine, exe, /sbin/init, destroy, restart, restart and /dev/net/tun (apparently the name, memory/currentMemory, vcpu count, resource partition, OS type and init, the lifecycle actions, and a TUN device). I have experimented quite a lot, and it boils down to the loop-mounted file system. A rough reconstruction of such a definition is sketched after this message.
- Start the container via virsh or virt-manager
- Restart libvirtd
- Examine the state of the container in virsh or virt-manager vs. the state of the loop device via losetup

The important parts are:

- The container is shown as stopped
- The container doesn't reply to network requests or console connection requests (i.e. it seems truly dead)
- The loop device doesn't show up in host-side "mount | grep loop"
- libvirtd allows the container to be (re-)started, ending up with a double-mounted file system

Migrating to lxd is not feasible in many environments; besides, I am fully aware (and not criticizing!) that libvirt-lxc was/is unsupported. For me the real bug is that this scenario is possible at all: if Ubuntu were to simply exclude libvirt's lxc driver, that would not be great, but at least it would be fool-proof. The blocker to lxd adoption is not on the admin side (me), but on the end user side: virt-manager is the favorite tool of SMB/NGO local admins, typically run via XQuartz on a Mac or Xming on Windows.

Please let me know if and when I can be of further help - I am willing to test and have quite a few testbeds at hand where I can easily create throw-away containers and ruin them. Since I tripped over this, I have rearranged things so that at every single customer one node runs no containers, just so I can do exactly that.
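For reference, a domain definition built around the surviving values above would look roughly as follows. This is a hedged reconstruction, not the exact XML from the report: the image path, network interface, and console settings are assumptions filled in only to make the sketch self-contained.

<domain type='lxc'>
  <name>AnyName</name>
  <memory>2097152</memory>
  <currentMemory>2097152</currentMemory>
  <vcpu>4</vcpu>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type>exe</type>
    <init>/sbin/init</init>
  </os>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <!-- loop-mounted root file system; the image path is an assumption -->
    <filesystem type='file' accessmode='passthrough'>
      <driver type='loop' format='raw'/>
      <source file='/var/lib/libvirt/images/AnyName-root.img'/>
      <target dir='/'/>
    </filesystem>
    <interface type='network'>
      <source network='default'/>
    </interface>
    <console type='pty'/>
    <!-- expose /dev/net/tun inside the container -->
    <hostdev mode='capabilities' type='misc'>
      <source>
        <char>/dev/net/tun</char>
      </source>
    </hostdev>
  </devices>
</domain>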
[Bug 1680997] [NEW] Container file system corruption on libvirtd restart
Public bug reported:

A data corruption bug exists in the LXC driver for libvirt; it has just cost me a MySQL server.

Steps to reproduce:

- (for visualization only) In virt-manager, add a connection to local lxc://
- Create an LXC container that has a loop-mounted image file, and start it
- (for visualization only) The container shows as running in virt-manager
- systemctl stop libvirtd ; sleep 2 ; sync ; systemctl start libvirtd
- (for visualization only) The container shows as shut off in virt-manager
- The container no longer responds to network requests and has no attachable console
- The loop mount no longer shows up in host-side "mount" output
  BUT: losetup -a reveals that a loop device is still attached to the image file
  BUT: In reality this loop device is still mounted, and processes in the container still access the file system
  BUT: There is no way to unmount or free it - losetup -d exits without an error but does nothing
- Restart the container (virsh -c lxc:// start name-of-container, or via virt-manager) - THIS SHOULD NOT BE ALLOWED
- The image file is now mounted twice and corruption starts creeping in
- Depending on how long this state persists (in terms of I/O), the damage can be significant

Once the problem is finally discovered, the only way to unstick the container is a reboot. This is the final nail in the coffin: the hidden instance syncs AFTER the new instance, effectively pushing back the past. This can be quite nasty if the libvirt restart results from an unattended upgrade. (A short host-side check for spotting this state is sketched at the end of this report.)

I do understand that libvirt/LXC is deprecated - this strikes me as a rather unsubtle way to push users to the newest incarnation, though. In non-enterprisey environments (read: SMB or NGO) virt-manager is often used as a "power user" tool, and those end users are unwilling, if not unable, to use different toolsets for containers and full-fledged VMs. And disabling unattended upgrades in such an environment is inviting trouble.

** Affects: libvirt (Ubuntu)
   Importance: Undecided
   Status: New

** Package changed: udev (Ubuntu) => libvirt (Ubuntu)
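As an illustration of the dangerous window described above (not part of the original report), the following commands show how the inconsistency can be spotted from the host before the container is started a second time. The container name "name-of-container" follows the report; the image path is an assumption.

# libvirt believes the container is down ...
virsh -c lxc:/// domstate name-of-container        # reports "shut off"

# ... yet the image file is still attached to a loop device ...
losetup -a | grep name-of-container-root.img       # e.g. /dev/loop0: ... (/var/lib/libvirt/images/name-of-container-root.img)

# ... while the mount is no longer visible in the host mount table:
mount | grep loop0 || echo "loop0 attached but not listed in mount output"

# If losetup still shows the attachment, starting the container again would
# mount the image a second time; reboot (or otherwise tear down the hidden
# mount) before restarting it.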
[Bug 1635729] [NEW] bcache won't start on boot due to exotic block devices filtered in udev rules
Public bug reported:

If a bcache is created with a Curtiss-Wright NVRAM card as caching device, the bcache device will not show up on boot without manual intervention.

Tracking this down shows that it is due to line 9 of /lib/udev/rules.d/60-persistent-storage.rules filtering on a whitelist of devices (loop*|mmcblk*[0-9]|msblk*[0-9]|mspblk*[0-9]|nvme*|sd*|sr*|vd*|xvd*|bcache*|cciss*|dasd*|ubd*) that doesn't include the NVRAM devices (umem*). A few steps later, this results in bcache_register() not being called.

Trivially patching

-KERNEL!="loop*|mmcblk*[0-9]|msblk*[0-9]|mspblk*[0-9]|nvme*|sd*|sr*|vd*|xvd*|bcache*|cciss*|dasd*|ubd*", GOTO="persistent_storage_end"
+KERNEL!="loop*|mmcblk*[0-9]|msblk*[0-9]|mspblk*[0-9]|nvme*|sd*|sr*|vd*|xvd*|bcache*|cciss*|dasd*|ubd*|umem*", GOTO="persistent_storage_end"

resolves the issue.

** Affects: ubuntu
   Importance: Undecided
   Status: New

** Description changed:

- If a bcache is created not with a Curtiss-Wright NVRAM card as caching
+ If a bcache is created with a Curtiss-Wright NVRAM card as caching
  device, the bcache device will not show up on boot without manual
  intervention.
  [remainder of the description unchanged]
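Not from the original report, but a quick way to sanity-check such a rule change from the host - assuming the NVRAM card shows up as /sys/block/umema (the exact device name is an assumption):

# Confirm the whitelist line being patched:
grep -n '^KERNEL!=' /lib/udev/rules.d/60-persistent-storage.rules

# After adding |umem* to the whitelist, reload the rules and dry-run the
# add event for the caching device; the rule processing should now continue
# past persistent_storage_end instead of skipping the device:
udevadm control --reload-rules
udevadm test --action=add /sys/block/umema 2>&1 | less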