Public bug reported:
We've just come across a situation where cloud-init races systemd to
mount a partition, with the result that either cloud-init fails to load
its seed, or a relatively important partition fails to mount. I'm not
entirely sure this is a bug - it could well be argued this is mis-
configuration on our part, and there's a trivial workaround, but I'll
leave the cloud-init devs to make that call.
The configuration is as follows:
1. On recent Ubuntu classic images for the Raspberry Pi, cloud-init's
seed is located on the boot partition (/dev/mmcblk0p1, labelled "system-
boot"). This is desirable as it's a straight-forward FAT partition, and
thus easily accessible and editable on any OS (as opposed to the
original seed location on the root ext4 partition).
2. Cloud-init is configured to look for its seed with "fs_label: system-
boot"
3. Systemd is configured (via /etc/fstab) to mount /dev/disk/by-label
/system-boot (/dev/mmcblk0p1) on /boot/firmware
The result on boot-up is one of three situations:
1. cloud-init mounts /dev/mmcblk0p1 on a temp path. While attempting to
read its seed, systemd attempts to mount /dev/mmcblk0p1 on
/boot/firmware but fails because the device is already mounted
elsewhere. cloud-init succeeds in reading its seed, umounts
/dev/mmcblk0p1 and the system boots successfully, but /boot/firmware
isn't mounted (leading to issues further down the line with other things
like flash-kernel that expect it to be mounted).
2. cloud-init checks for /dev/mmcblk0p1 in /proc/mounts, and doesn't see
it there. It goes to mount /dev/mmcblk0p1 on a temp path, but before it
can do so, systemd mounts it on /boot/firmware. cloud-init's mount fails
(device already mounted), so it fails to read its seed and uses a
default config instead (which isn't entirely desirable for us as the
default involves a long wait if ethernet is not connected; our default
seed overrides that).
3. Everything works fine. This occurs when either a) systemd mounts
/boot/firmware first, then cloud-init sees this mount in /proc/mounts
and uses it or b) cloud-init mounts /dev/mmcblk0p1 on a temp path, reads
its seed and unmounts it, *then* systemd mounts /boot/firmware.
Here's the relevant snippet of traceback from when the second result
occurs:
...
2020-01-14 20:48:01,337 - util.py[DEBUG]: Read 386 bytes from /etc/os-release
2020-01-14 20:48:01,338 - util.py[DEBUG]: Running command ['blkid', '-odevice',
'/dev/sr0'] with allowed return codes [0, 2] (shell=False, capture=True)
2020-01-14 20:48:01,348 - util.py[DEBUG]: Running command ['blkid', '-odevice',
'/dev/sr1'] with allowed return codes [0, 2] (shell=False, capture=True)
2020-01-14 20:48:01,357 - util.py[DEBUG]: Running command ['blkid',
'-tTYPE=vfat', '-odevice'] with allowed return codes [0, 2] (shell=False,
capture=True)
2020-01-14 20:48:01,415 - util.py[DEBUG]: Running command ['blkid',
'-tTYPE=iso9660', '-odevice'] with allowed return codes [0, 2] (shell=False,
capture=True)
2020-01-14 20:48:01,432 - util.py[DEBUG]: Running command ['blkid',
'-tLABEL=SYSTEM-BOOT', '-odevice'] with allowed return codes [0, 2]
(shell=False, capture=True)
2020-01-14 20:48:01,448 - util.py[DEBUG]: Running command ['blkid',
'-tLABEL=system-boot', '-odevice'] with allowed return codes [0, 2]
(shell=False, capture=True)
2020-01-14 20:48:01,466 - DataSourceNoCloud.py[DEBUG]: Attempting to use data
from /dev/mmcblk0p1
2020-01-14 20:48:01,467 - util.py[DEBUG]: Reading from /proc/mounts
(quiet=False)
2020-01-14 20:48:01,467 - util.py[DEBUG]: Read 2062 bytes from /proc/mounts
2020-01-14 20:48:01,468 - util.py[DEBUG]: Fetched {'sysfs': {'fstype': 'sysfs',
'mountpoint': '/sys', 'opts': 'rw,nosuid,nodev,noexec,relatime'}, 'proc':
{'fstype': 'proc', 'mountpoint': '/proc', 'opts':
'rw,nosuid,nodev,noexec,relatime'}, 'udev': {'fstype': 'devtmpfs',
'mountpoint': '/dev', 'opts':
'rw,nosuid,relatime,size=1913472k,nr_inodes=156992,mode=755'}, 'devpts':
{'fstype': 'devpts', 'mountpoint': '/dev/pts', 'opts':
'rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000'}, 'tmpfs': {'fstype':
'tmpfs', 'mountpoint': '/etc/machine-id', 'opts':
'ro,relatime,size=391432k,mode=755'}, '/dev/mmcblk0p2': {'fstype': 'ext4',
'mountpoint': '/', 'opts': 'rw,relatime'}, 'securityfs': {'fstype':
'securityfs', 'mountpoint': '/sys/kernel/security', 'opts':
'rw,nosuid,nodev,noexec,relatime'}, 'cgroup': {'fstype': 'cgroup',
'mountpoint': '/sys/fs/cgroup/freezer', 'opts':
'rw,nosuid,nodev,noexec,relatime,freezer'}, 'pstore': {'fstype': 'pstore',
'mountpoint': '/sys/fs/pstore', 'opts': 'rw,nosuid,nodev,noexec,relatime'},
'systemd-1': {'fstype': 'autofs', 'mountpoint': '/proc/sys/fs/binfmt_misc',
'opts':
'rw,relatime,fd=26,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=1728'},
'debugfs': {'fstype': 'debugfs', 'mountpoint': '/sys/kernel/debug', 'opts':
'rw,relatime'}, 'mqueue': {'fstype': 'mqueue', 'mountpoint': '/dev/mqueue',
'opts': 'rw,relatime'}, 'tracefs': {'fstype': 'tracefs', 'mountpoint':
'/sys/kernel/debug/tracing', 'opts': 'rw,relatime'}, 'configfs': {'fstype':
'configfs', 'mountpoint': '/sys/kernel/config', 'opts': 'rw,relatime'},
'fusectl': {'fstype': 'fusectl', 'mountpoint': '/sys/fs/fuse/connections',
'opts': 'rw,relatime'}} mounts from proc
2020-01-14 20:48:01,468 - util.py[DEBUG]: Running command ['mount', '-o', 'ro',
'-t', 'auto', '/dev/mmcblk0p1', '/run/cloud-init/tmp/tmp2z8rueu4'] with allowed
return codes [0] (shell=False, capture=True)
2020-01-14 20:48:01,529 - util.py[DEBUG]: Failed mount of '/dev/mmcblk0p1' as
'auto': Unexpected error while running command.
Command: ['mount', '-o', 'ro', '-t', 'auto', '/dev/mmcblk0p1',
'/run/cloud-init/tmp/tmp2z8rueu4']
Exit code: 32
Reason: -
Stdout:
Stderr: mount: /run/cloud-init/tmp/tmp2z8rueu4: /dev/mmcblk0p1 already mounted
on /boot/firmware.
2020-01-14 20:48:01,530 - util.py[WARNING]: Failed to mount /dev/mmcblk0p1 when
looking for data
2020-01-14 20:48:01,533 - util.py[DEBUG]: Failed to mount /dev/mmcblk0p1 when
looking for data
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/cloudinit/sources/DataSourceNoCloud.py",
line 128, in _get_data
pp2d_kwargs)
File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 1758, in
mount_cb
(device, tmpd, failure_reason))
cloudinit.util.MountFailedError: Failed mounting /dev/mmcblk0p1 to
/run/cloud-init/tmp/tmp2z8rueu4 due to: Unexpected error while running command.
Command: ['mount', '-o', 'ro', '-t', 'auto', '/dev/mmcblk0p1',
'/run/cloud-init/tmp/tmp2z8rueu4']
Exit code: 32
Reason: -
Stdout:
Stderr: mount: /run/cloud-init/tmp/tmp2z8rueu4: /dev/mmcblk0p1 already mounted
on /boot/firmware.
2020-01-14 20:48:01,549 - __init__.py[DEBUG]: Datasource DataSourceNoCloud
[seed=None][dsmode=net] not updated for events: New instance first boot
2020-01-14 20:48:01,549 - handlers.py[DEBUG]: finish:
init-local/search-NoCloud: SUCCESS: no local data found from DataSourceNoCloud
2020-01-14 20:48:01,549 - main.py[DEBUG]: No local datasource found
...
The relevant code is mount_cb() in util.py which is calling mounts() to
read /proc/mounts then attempting to mount the device when it's not
found (leading to a classic race between the check and the action).
It might be argued that it should simply attempt the mount, and check
/proc/mounts in the event of failure, but I'm wary that mounting FS' is
a potentially costly exercise (can the seed be on a remote mount?), and
there's probably edge cases here I'm unaware of (is it possible, perhaps
dangerous, to double-mount certain devices?).
Perhaps it could be adjusted to check /proc/mounts, attempt the mount,
then *re-check* /proc/mounts in the case of failure?
Nevertheless, there's a trivial work-around in our case: just add an
override for "RequiresMountFor=/boot/firmware" to our systemd config for
cloud-init-local, thus ensuring the mount is definitely available prior
to cloud-init looking for its seed. This in turn argues the issue could
simply be viewed as mis-configuration on our part: we *know* the seed is
on /dev/mmcblk0p1 as that's where we've configured it to be; should we
not also configure cloud-init in such a way that we guarantee the device
containing its seed is definitely mounted prior to it running?
** Affects: cloud-init (Ubuntu)
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1860046
Title:
Race to mount seed device
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/1860046/+subscriptions
--
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs