Public bug reported:
I lost some data last night, and I think the problem has to do with how
lxd cleans up when there are attached devices that can't be umounted.
This is my testcase:
---
# / is ext4
# /home is ext4
# /var/lib/lxd is btrfs
# /usr/local/nfs is a remote nfs directory
mkdir -p /home/testdir
touch /home/testdir/this_file_should_remain
lxc init ubuntu:14.04 c1
lxc config device add c1 home disk source=/home/testdir path=/home/testdir
lxc config device add c1 nfs disk source=/usr/local/nfs path=/usr/local/nfs
lxc start c1
sleep 5
# Unexport the nfs share on the server (causing "Stale file handle")
lxc stop c1 && lxc delete c1
[ -f /home/testdir/this_file_should_remain ] || echo "File is missing!"
# In the log:
Apr 28 09:50:04 karai lxd[1273]: t=2016-04-28T09:50:04+0200 lvl=eror msg="Unable to remove unix devices" err="lstat /var/lib/lxd/devices/c1/disk.usr-local-nfs: stale NFS file handle"
Apr 28 09:50:04 karai lxd[1273]: t=2016-04-28T09:50:04+0200 lvl=eror msg="Unable to remove disk devices" err="lstat /var/lib/lxd/devices/c1/disk.usr-local-nfs: stale NFS file handle"
---
Expected outcome:
/home/testdir/this_file_should_remain should be left alone. lxc stop and lxc
delete should probably return errors.
Actual outcome:
/home/testdir is empty.
---
At first I thought the problem was in storage_btrfs.go/ContainerDelete as
it seems wrong to do an os.RemoveAll() instead of failing if there's
anything left after removing the subvolume. Commenting out that call and
rebuilding the package did not help, however.
My current unconfirmed suspicion is that the problem is in
container_lxc.go/cleanup(). removeUnixDevices and removeDiskDevices both
seem to return on the first failure, and cleanup() then continues
without checking the return value and runs
os.RemoveAll(c.DevicesPath()).
Additional information:
---
# lsb_release -rd
Description: Ubuntu 16.04 LTS
Release: 16.04
** Affects: lxd (Ubuntu)
Importance: Undecided
Status: New
** Summary changed:
- lxd handles cleanu
+ lxd handles cleanup of un-umountable devices badly
--
https://bugs.launchpad.net/bugs/1576082
Title:
lxd handles cleanup of un-umountable devices badly