[N.B. I wrote the below before I saw Ryan's comment, so there is some
repetition.]

OK, I've spent some time catching up on this properly so I can
summarise: per comment #24, the issue is that when udev processes the
events emitted by the kernel, it (sometimes) doesn't determine the
correct partition information.  The kernel _does_ emit all the events we
would expect, and udev _does_ handle all the events we would expect
(which is to say that `udevadm settle` doesn't change behaviour here, it
merely ensures that the broken behaviour has completed before we
proceed).  The hypothesised race condition is somewhere between the
kernel and udev: I believe the kernel emits the event before the
partition table is guaranteed to be fully updated, so when udev
processes the event and reads the partition table, sometimes it finds
the partition and sometimes it doesn't.  To be clear, the kernel event
generation and the buggy udev event handling both happen as a result of
the resize command, _not_ as a result of anything else cloud-init runs
subsequently.

So as far as I can tell, this bug would occur regardless of what runs
the resize command, and no matter what commands are executed after the
resize command.  (It might be possible to work around this bug by
issuing commands that force a re-read of the partition table on a disk,
for example, but this bug _would_ still have occurred before then.)
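
For illustration only, here is a minimal sketch of what such a
workaround could look like.  It is not something cloud-init does or
should necessarily do.  It assumes the symptom is a missing device node
for the partition, and that `partprobe` (from parted) is installed; the
function name `reread_partition_table` and all of its parameters are
hypothetical.

```python
import os
import subprocess
import time


def reread_partition_table(disk, partition, run=subprocess.run,
                           exists=os.path.exists, attempts=5,
                           delay=1.0, sleep=time.sleep):
    """Return True once `partition` is visible, forcing re-reads if not.

    `disk` and `partition` are device paths such as /dev/sda and
    /dev/sda1.  `run`, `exists` and `sleep` are injectable so the
    retry logic can be exercised without touching a real disk.
    """
    for _ in range(attempts):
        # Wait for udev to finish processing the events already queued.
        run(["udevadm", "settle"], check=False)
        if exists(partition):
            return True
        # Ask the kernel to re-read the partition table, which should
        # make it emit fresh events for udev to process.
        run(["partprobe", disk], check=False)
        sleep(delay)
    # One last settle and check after the final re-read.
    run(["udevadm", "settle"], check=False)
    return exists(partition)
```

Note that this only papers over the problem after the fact: by the time
the first check fails, udev has already mishandled the original event.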

cloud-init could potentially work around a (kernel|systemd) that isn't
handling partitions correctly, but we really shouldn't have to.  Until
we're satisfied that the underlying problem cannot actually be fixed
there, we shouldn't do that.  (I am _not_ convinced that this cannot be
fixed in (the kernel|systemd), because using a different kernel and
using a different udevadm have both caused the issue to stop
reproducing.)

So, let me be a little more categorical.  The information we have at the
moment indicates an issue in the interactions between the kernel and
udev on partition resize.  cloud-init's involvement is merely as the
initiator of that resize.  Until we have more information that indicates
the issue to be in cloud-init, this isn't a valid cloud-init issue.
Once we have more information from the kernel and/or systemd folks, if
it indicates that cloud-init _is_ at fault, please move this back to
New.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1834875

Title:
  cloud-init growpart race with udev

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1834875/+subscriptions
