[Bug 1680997] Re: Container file system corruption on libvirtd restart

2017-05-19 Thread Eugen Rieck
Upstream bug report is at
https://bugzilla.redhat.com/show_bug.cgi?id=1452701

** Bug watch added: Red Hat Bugzilla #1452701
   https://bugzilla.redhat.com/show_bug.cgi?id=1452701

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1680997

Title:
  Container file system corruption on libvirtd restart

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1680997/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 1680997] Re: Container file system corruption on libvirtd restart

2017-04-10 Thread Eugen Rieck
The steps outlined in the initial bug report reliably (100%) reproduce the
problem for me on Ubuntu 16.04; I have tested this in different environments
(1x AMD, ca. 10x Intel).
Here's the short way to get there:

- Install a basic Ubuntu 16.04 Server
- apt-get install virt-manager (installing the GUI pulls in the heavy-lifting components)
- create a libvirt/lxc container of something like the following (image and
network details elided):

  <domain type='lxc'>
    <name>AnyName</name>
    <memory>2097152</memory>
    <currentMemory>2097152</currentMemory>
    <vcpu>4</vcpu>
    <resource>
      <partition>/machine</partition>
    </resource>
    <os>
      <type>exe</type>
      <init>/sbin/init</init>
    </os>
    <on_poweroff>destroy</on_poweroff>
    <on_reboot>restart</on_reboot>
    <on_crash>restart</on_crash>
    <devices>
      <filesystem type='file'>
        <driver type='loop' format='raw'/>
        <source file='...'/>   <!-- the loop-mounted image file -->
        <target dir='/'/>
      </filesystem>
      <!-- ... interface and console definitions ... -->
      <hostdev mode='capabilities' type='misc'>
        <source>
          <char>/dev/net/tun</char>
        </source>
      </hostdev>
    </devices>
  </domain>

(I have experimented quite a lot, and it boils down to the loop-mounted
file system)

- Start the container via virsh or virt-manager
- Restart libvirtd
- Examine state of the container in virsh or virt-manager vs. the state of the 
loop device via losetup

The important parts are:
- The container is shown as stopped
- The container doesn't reply to network requests or console connection requests (i.e. it seems truly dead)
- The loop device doesn't show up in host-side "mount | grep loop"
- libvirtd allows (re-)starting the container, ending up with a double-mounted file system (condensed into a command sketch below)
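
For reference, a condensed command sketch of the above (the container name
AnyName comes from the example definition, and AnyName.xml is just an assumed
file name for it):

  virsh -c lxc:// define AnyName.xml   # define the container from the XML above
  virsh -c lxc:// start AnyName
  mount | grep loop                    # the loop-mounted image is visible on the host
  systemctl restart libvirtd
  virsh -c lxc:// list --all           # AnyName is now reported as shut off
  mount | grep loop                    # the mount entry is gone ...
  losetup -a                           # ... but the loop device is still attached
  virsh -c lxc:// start AnyName        # allowed again - the image ends up mounted twice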

Migrating to LXD is not feasible in many environments. In addition, I am
fully aware (and not criticizing!) that libvirt-lxc was/is unsupported. For
me the real bug is that this scenario is possible at all: if Ubuntu were to
just exclude libvirt's LXC driver, that would not really be fine, but at
least it would be fool-proof.

The blocker to LXD adoption is not on the admin side (me), but on the
end-user side: virt-manager is the favorite toy of SMB/NGO local admins,
typically run via XQuartz on a Mac or Xming on Windows.

Please let me know if and when I can be of further help. I am willing to
test and have quite a few testbeds at hand where I can easily create
throw-away containers and ruin them. Since I tripped over this, I have
shuffled things around so that every single customer has one node running no
containers, just to be able to do exactly that.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1680997

Title:
  Container file system corruption on libvirtd restart

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1680997/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 1680997] [NEW] Container file system corruption on libvirtd restart

2017-04-07 Thread Eugen Rieck
Public bug reported:

A data-corruption bug exists in the LXC driver for libvirt that has just
cost me a MySQL server.

Steps to reproduce:
- (for visualization only) In virt-manager, add a connection to local lxc://
- Create an LXC container that has a loop-mounted image file and start it
- (for visualization only) The container shows as running in virt-manager
- systemctl stop libvirtd ; sleep 2 ; sync ; systemctl start libvirtd
- (for visualization only) The container shows as shut off in virt-manager
- The container no longer responds to network requests and has no attachable console
- The loop mount no longer shows up in host-side "mount" output
  BUT: losetup -a reveals that a loop device is still attached to the image file
  BUT: in reality this loop device is still mounted; processes in the container still access the file system
  BUT: there is no way to unmount or free it - losetup -d ends without an error but does nothing (see the sketch below)
- Restart the container (virsh -c lxc:// start name-of-container, or via virt-manager)
  THIS SHOULD NOT BE ALLOWED
- The image file is now mounted twice and corruption starts creeping in
- Depending on how long this state persists (in terms of I/O), the damage can be significant
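
The failed cleanup attempts above, spelled out as a sketch (/dev/loop0 stands
in for whatever losetup -a actually reports):

  losetup -a              # still lists e.g. /dev/loop0 attached to the image file
  umount /dev/loop0       # fails - the mount is no longer in the host's mount table
  losetup -d /dev/loop0   # returns without an error, but the device stays attached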

When the problem is finally discovered, the only way to unstick the
container is a reboot. This is the final nail in the coffin: the hidden
instance syncs AFTER the new instance, effectively pushing the past back
onto the disk.

This can be quite nasty if a libvirt restart results from an unattended
upgrade.

I do understand that libvirt/LXC is deprecated - this strikes me as a rather
unsubtle way to push users to the newest incarnation, though.
In non-enterprisy environments (read: SMB or NGO), virt-manager is often used
as a "power user" tool, and those end users are unwilling, if not unable, to
use different toolsets for containers and full-fledged VMs. And disabling
unattended upgrades in such an environment is inviting trouble.

** Affects: libvirt (Ubuntu)
 Importance: Undecided
 Status: New

** Package changed: udev (Ubuntu) => libvirt (Ubuntu)

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1680997

Title:
  Container file system corruption on libvirtd restart

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1680997/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 1635729] [NEW] bcache won't start on boot due to exotic block devices filtered in udev rules

2016-10-21 Thread Eugen Rieck
Public bug reported:

If a bcache is created with a Curtiss-Wright NVRAM card as the caching
device, the bcache device will not show up on boot without manual
intervention.

Tracking this down shows that line 9 of
/lib/udev/rules.d/60-persistent-storage.rules filters on a whitelist of
devices
(loop*|mmcblk*[0-9]|msblk*[0-9]|mspblk*[0-9]|nvme*|sd*|sr*|vd*|xvd*|bcache*|cciss*|dasd*|ubd*)
that does not include the NVRAM devices (umem*). A few steps later, this
results in bcache_register() not being called.

Trivially patching

-KERNEL!="loop*|mmcblk*[0-9]|msblk*[0-9]|mspblk*[0-9]|nvme*|sd*|sr*|vd*|xvd*|bcache*|cciss*|dasd*|ubd*", GOTO="persistent_storage_end"
+KERNEL!="loop*|mmcblk*[0-9]|msblk*[0-9]|mspblk*[0-9]|nvme*|sd*|sr*|vd*|xvd*|bcache*|cciss*|dasd*|ubd*|umem*", GOTO="persistent_storage_end"

resolves the issue.
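
Until the packaged rule includes the extra pattern, the same change can be
applied locally through the usual override mechanism (a sketch; the one-line
edit is exactly the patch above):

  # a file of the same name in /etc/udev/rules.d overrides the one in /lib/udev/rules.d
  cp /lib/udev/rules.d/60-persistent-storage.rules /etc/udev/rules.d/
  # edit the copy: append |umem* to the KERNEL!= whitelist as shown above
  udevadm control --reload
  udevadm trigger --subsystem-match=block
  # if the bcache device is needed early in boot, the initramfs may also need rebuilding:
  update-initramfs -u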

** Affects: ubuntu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1635729

Title:
  bcache won't start on boot due to exotic block devices filtered in
  udev rules

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+bug/1635729/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs