[Yahoo-eng-team] [Bug 1611074] Re: Reformatting of ephemeral drive fails on resize of Azure VM
This is fixed in 0.7.9.

** Changed in: cloud-init
       Status: Fix Committed => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1611074

Title:
  Reformatting of ephemeral drive fails on resize of Azure VM

Status in cloud-init:
  Fix Released
Status in cloud-init package in Ubuntu:
  Fix Released
Status in cloud-init source package in Xenial:
  Fix Released
Status in cloud-init source package in Yakkety:
  Fix Released

Bug description:
  === Begin SRU Template ===
  [Impact]
  In some cases, cloud-init writes entries to /etc/fstab, and on Azure it
  will even format a disk for mounting and then write the entry for that
  'ephemeral' disk there.

  A supported operation on Azure is to "resize" the system. When you do
  this, the system is shut down, resized (given larger/faster disks and
  more CPU) and then brought back up. In that process, the "ephemeral"
  disk is re-initialized to its original NTFS format. The designed goal
  is for cloud-init to recognize this situation and re-format the disk
  to ext4.

  The problem is that the mount of that disk happens before cloud-init
  can reformat it. That's because the entry in fstab has 'auto' and is
  automatically mounted. The end result is that after a resize operation
  the user is left with the ephemeral disk mounted at /mnt with an NTFS
  filesystem rather than ext4.

  [Test Case]
  The text in comment 3 describes how the original reporter recreated
  the problem. Another way to do this is to just re-format the ephemeral
  disk as NTFS and then reboot. The result *should* be that after reboot
  it comes back up and has an ext4 filesystem on it.

  1.) boot system on azure
     (for this, I use https://gist.github.com/smoser/5806147, but you
     can use web ui or any other way).

     Save output of
       journalctl --no-pager > journalctl.orig
       systemctl status --no-pager > systemctl-status.orig
       systemctl --no-pager > systemctl.orig

  2.) unmount the ephemeral disk
     $ umount /mnt

  3.) repartition it so that mkfs.ntfs does less and is faster
     This is not strictly necessary, but mkfs.ntfs can take upwards of
     20 minutes. Shrinking /dev/sdb2 to be 200M means it will finish
     in < 1 minute.

     $ disk=/dev/disk/cloud/azure_resource
     $ part=/dev/disk/cloud/azure_resource-part1
     $ echo "2048,$((2*1024*100)),7" | sudo sfdisk "$disk"
     $ time mkfs.ntfs --quick "$part"

  4.) reboot

  5.) expect that /proc/mounts has /dev/disk/cloud/azure_resource-part1
     as ext4 and that fstab has x-systemd.requires in it.

     $ awk '$2 == "/mnt" { print $0 }' /proc/mounts
     /dev/sdb1 /mnt ext4 rw,relatime,data=ordered 0 0

     $ awk '$2 == "/mnt" { print $0 }' /etc/fstab
     /dev/sdb1 /mnt auto defaults,nofail,x-systemd.requires=cloud-init.service,comment=cloudconfig 0 2

  6.) collect journal and systemctl information as described in step 1
     above. Compare output, specifically looking for case-insensitive
     "breaks".

  [Regression Potential]
  Regression is unlikely. The likely failure case is just that the
  problem is not correctly fixed, and the user ends up with either an
  NTFS formatted disk that is mounted at /mnt or nothing mounted at /mnt.
  === End SRU Template ===

  After resizing a 16.04 VM on Azure, the VM is presented with a new
  ephemeral drive (of a different size), which initially is NTFS
  formatted. Cloud-init tries to format the appropriate partition ext4,
  but fails because it is mounted. Cloud-init has unmount logic for
  exactly this case in the get_data call on the Azure data source, but
  this is never called because a fresh cache is found.

  Jun 27 19:07:47 azubuntu1604arm [CLOUDINIT] handlers.py[DEBUG]: start: init-network/check-cache: attempting to read from cache [trust]
  Jun 27 19:07:47 azubuntu1604arm [CLOUDINIT] util.py[DEBUG]: Reading from /var/lib/cloud/instance/obj.pkl (quiet=False)
  Jun 27 19:07:47 azubuntu1604arm [CLOUDINIT] util.py[DEBUG]: Read 5950 bytes from /var/lib/cloud/instance/obj.pkl
  Jun 27 19:07:47 azubuntu1604arm [CLOUDINIT] stages.py[DEBUG]: restored from cache: DataSourceAzureNet [seed=/dev/sr0]
  Jun 27 19:07:47 azubuntu1604arm [CLOUDINIT] handlers.py[DEBUG]: finish: init-network/check-cache: SUCCESS: restored from cache: DataSourceAzureNet [seed=/dev/sr0]
  ...
  Jun 27 19:07:48 azubuntu1604arm [CLOUDINIT] cc_disk_setup.py[DEBUG]: Creating file system None on /dev/sdb1
  Jun 27 19:07:48 azubuntu1604arm [CLOUDINIT] cc_disk_setup.py[DEBUG]: Using cmd: /sbin/mkfs.ext4 /dev/sdb1
  Jun 27 19:07:48 azubuntu1604arm [CLOUDINIT] util.py[DEBUG]: Running command ['/sbin/mkfs.ext4', '/dev/sdb1'] with allowed return codes [0] (shell=False, capture=True)
  Jun 27 19:07:48 azubuntu1604arm [CLOUDINIT] util.py[DEBUG]: Creating fs for /dev/disk/cloud/azure_resource took 0.052 seconds
  Jun 27 19:07:48 azubuntu1604arm
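
Step 5 of the test case can also be checked programmatically. The following is only a minimal sketch, not part of cloud-init: the function names `find_mount_entry` and `ephemeral_disk_ok` are made up for illustration, and it assumes the whitespace-separated six-field layout that /proc/mounts and /etc/fstab share.

```python
def find_mount_entry(table_text, mount_point):
    """Return the fields of the first entry for mount_point, or None.

    Works on the text of /proc/mounts or /etc/fstab, which both use
    whitespace-separated fields: device, mount point, type, options, ...
    """
    for line in table_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and fstab comments
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            return fields
    return None


def ephemeral_disk_ok(proc_mounts_text, fstab_text, mount_point="/mnt"):
    """True if mount_point is mounted as ext4 and its fstab entry
    orders the mount after cloud-init via x-systemd.requires."""
    mounted = find_mount_entry(proc_mounts_text, mount_point)
    configured = find_mount_entry(fstab_text, mount_point)
    if mounted is None or configured is None:
        return False
    options = configured[3].split(",")
    return (mounted[2] == "ext4"
            and "x-systemd.requires=cloud-init.service" in options)
```

On a real system the two arguments would come from reading /proc/mounts and /etc/fstab; passing the text in keeps the check easy to try against the sample output shown in step 5.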
[Yahoo-eng-team] [Bug 1611074] Re: Reformatting of ephemeral drive fails on resize of Azure VM
This bug was fixed in the package cloud-init - 0.7.8-49-g9e904bb-0ubuntu1~16.10.1

---
cloud-init (0.7.8-49-g9e904bb-0ubuntu1~16.10.1) yakkety; urgency=medium

  * debian/cloud-init.templates: enable DigitalOcean by default [Ben Howard]
  * debian/cloud-init.postinst: update /etc/fstab on Azure to fix future
    resize operations. (LP: #1611074)
  * New upstream snapshot.
    - systemd/cloud-init-local.service:
      + replace 'Wants' and 'After' on local-fs.target with more granular
        After=systemd-remount-fs.service and RequiresMountsFor=/var/lib
        and Before=sysinit.target.
        This is done to run early enough to update /etc/fstab.
        (LP: #1611074)
    - systemd/cloud-init.service:
      + add Before=sysinit.target and DefaultDependencies=no (LP: #1611074)
      + drop Requires=networking.service to work where networking.service
        is not needed.
      + add Conflicts=shutdown.target
      + drop unnecessary Wants=local-fs.target
    - net: support reading ipv6 dhcp config from initramfs [LaMont Jones]
      (LP: #1621615)
    - dmidecode: Allow dmidecode to be used on aarch64, and only attempt
      usage on x86, x86_64, and aarch64. [Robert Schweikert]
    - disk-config: udev settle after partitioning in gpt format.
      (LP: #1626243)
    - Add support for snap create-user on Ubuntu Core images. [Ryan Harper]
      (LP: #1619393)
    - Fix sshd restarts for rhel distros. [Jim Gorz]
    - Move user/group functions to new ug_util file [Joshua Harlow]
    - update Gentoo initscripts to run in the correct order [Matthew Thode]
    - MAAS: improve the debugging tool in datasource to consider
      config provided on kernel cmdline.
    - DataSources:
      + Ec2: protect against non-dictionary in block-device-mapping.
      + AliYun: Add new datasource for Ali-Cloud ECS, that is
        available but not enabled by default [kaihuan.pkh]
      + OpenNebula: replace parsing of 'ip' command with similar function
        available in cloudinit.net. This fixed unit tests when running
        in environment with no networking.
    - doc changes:
      + Add documentation on stages of boot.
      + make the RST files consistently formatted and other improvements.
      + fixed example to not overwrite /etc/hosts [Chris Glass]
      + fix spelling / typos in ca_certs and scripts_vendor.
      + improve HACKING.rst file
      + Add documentation for logging features. [Wesley Wiedenmeier]
    - code style and unit test changes:
      + pep8: fix style errors reported by pycodestyle 2.1.0
      + pyflakes: fix issue with pyflakes 1.3 found in ubuntu zesty-proposed.
      + Add coverage dependency to bddeb to fix package build.
      + Add coverage collection to tox unit tests. [Joshua Powers]
      + do not read system /etc/cloud/cloud.cfg.d (LP: #1635350)
      + tests: silence the Cheetah UserWarning about NameMapper C version.
      + Fix python2.6 things found running in centos 6.

 -- Scott Moser  Tue, 22 Nov 2016 17:04:36 -0500

** Changed in: cloud-init (Ubuntu Yakkety)
       Status: Fix Committed => Fix Released

Status in cloud-init:
  Fix Committed
Status in cloud-init package in Ubuntu:
  Fix Released
Status in cloud-init source package in Xenial:
  Fix Released
Status in cloud-init source package in Yakkety:
  Fix Released
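
The changelog says debian/cloud-init.postinst updates /etc/fstab on Azure so that future resizes work, and the test case expects the /mnt entry to carry x-systemd.requires=cloud-init.service. The sketch below illustrates that kind of rewrite in Python. It is an assumption-laden illustration, not the packaged postinst: the function name `add_requires_option` is hypothetical, and it assumes that only entries already tagged comment=cloudconfig should be touched.

```python
REQUIRES_OPT = "x-systemd.requires=cloud-init.service"
MARKER_OPT = "comment=cloudconfig"  # entries written by cloud-init carry this


def add_requires_option(fstab_text):
    """Return fstab_text with REQUIRES_OPT added to cloud-init's entries.

    Lines that are comments, blank, or not marked with MARKER_OPT are
    returned unchanged, and entries that already have the option are left
    alone, so running this twice gives the same result.
    """
    out = []
    for line in fstab_text.splitlines():
        fields = line.split()
        is_entry = len(fields) >= 4 and not line.lstrip().startswith("#")
        if is_entry:
            options = fields[3].split(",")
            if MARKER_OPT in options and REQUIRES_OPT not in options:
                # Place the new option just before the marker, matching the
                # ordering shown in the expected fstab line.
                options.insert(options.index(MARKER_OPT), REQUIRES_OPT)
                fields[3] = ",".join(options)
                line = " ".join(fields)
        out.append(line)
    trailing = "\n" if fstab_text.endswith("\n") else ""
    return "\n".join(out) + trailing
```

With that option in place, systemd orders the /mnt mount after cloud-init.service, which gives cloud-init the chance to reformat the disk before it is mounted.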
[Yahoo-eng-team] [Bug 1611074] Re: Reformatting of ephemeral drive fails on resize of Azure VM
This bug was fixed in the package cloud-init - 0.7.8-49-g9e904bb-0ubuntu1~16.04.1

---
cloud-init (0.7.8-49-g9e904bb-0ubuntu1~16.04.1) xenial-proposed; urgency=medium

  * debian/cloud-init.postinst: update /etc/fstab on Azure to fix future
    resize operations. (LP: #1611074)
  * New upstream snapshot.
    - Add activate_datasource, for datasource specific code paths.
      (LP: #1611074)
    - systemd: cloud-init-local use RequiresMountsFor=/var/lib/cloud
      (LP: #1642062)

cloud-init (0.7.8-47-gb6561a1-0ubuntu1~16.04.1) xenial-proposed; urgency=medium

  * debian/cloud-init.templates: enable DigitalOcean by default [Ben Howard]
  * New upstream snapshot.
    - systemd/cloud-init-local.service:
      + replace 'Wants' and 'After' on local-fs.target with more granular
        After=systemd-remount-fs.service and RequiresMountsFor=/var/lib
        and Before=sysinit.target.
        This is done to run early enough to update /etc/fstab.
        (LP: #1611074)
      + add Before=NetworkManager.service so that cloud-init can render
        NetworkManager network config before it would apply them.
    - systemd/cloud-init.service:
      + add Before=sysinit.target and DefaultDependencies=no (LP: #1611074)
      + drop Requires=networking.service to work where networking.service
        is not needed.
      + add Conflicts=shutdown.target
      + drop unnecessary Wants=local-fs.target
    - net: support reading ipv6 dhcp config from initramfs [LaMont Jones]
      (LP: #1621615)
    - dmidecode: Allow dmidecode to be used on aarch64, and only attempt
      usage on x86, x86_64, and aarch64. [Robert Schweikert]
    - disk-config: udev settle after partitioning in gpt format.
      (LP: #1626243)
    - Add support for snap create-user on Ubuntu Core images. [Ryan Harper]
      (LP: #1619393)
    - Fix sshd restarts for rhel distros. [Jim Gorz]
    - Move user/group functions to new ug_util file [Joshua Harlow]
    - update Gentoo initscripts to run in the correct order [Matthew Thode]
    - MAAS: improve the debugging tool in datasource to consider
      config provided on kernel cmdline.
    - lxd: Update network config for LXD 2.3 [Stéphane Graber]
      (LP: #1640556)
    - Decode unicode types in decode_binary [Robert Schweikert]
    - Allow ephemeral drive to be unpartitioned [Paul Meyer]
    - subp: add 'update_env' argument which allows for more easily adding
      environment variables to a subprocess call.
    - Adjust mounts and disk configuration for systemd. (LP: #1611074)
    - DataSources:
      + Ec2: protect against non-dictionary in block-device-mapping.
      + AliYun: Add new datasource for Ali-Cloud ECS, that is
        available but not enabled by default [kaihuan.pkh]
      + DigitalOcean: use meta-data for network configuration and
        enable data source by default. [Ben Howard]
      + OpenNebula: replace parsing of 'ip' command with similar function
        available in cloudinit.net. This fixed unit tests when running
        in environment with no networking.
    - doc changes:
      + Add documentation on stages of boot.
      + make the RST files consistently formatted and other improvements.
      + fixed example to not overwrite /etc/hosts [Chris Glass]
      + fix spelling / typos in ca_certs and scripts_vendor.
      + improve HACKING.rst file
      + Add documentation for logging features. [Wesley Wiedenmeier]
      + Improve module documentation and doc cleanup. [Wesley Wiedenmeier]
    - code style and unit test changes:
      + pep8: fix style errors reported by pycodestyle 2.1.0
      + pyflakes: fix issue with pyflakes 1.3 found in ubuntu zesty-proposed.
      + Add coverage dependency to bddeb to fix package build.
      + Add coverage collection to tox unit tests. [Joshua Powers]
      + do not read system /etc/cloud/cloud.cfg.d (LP: #1635350)
      + tests: silence the Cheetah UserWarning about NameMapper C version.
      + Fix python2.6 things found running in centos 6.

 -- Scott Moser  Fri, 18 Nov 2016 16:51:54 -0500

** Changed in: cloud-init (Ubuntu Xenial)
       Status: Fix Committed => Fix Released

Status in cloud-init:
  Fix Committed
Status in cloud-init package in Ubuntu:
  Fix Released
Status in cloud-init source package in Xenial:
  Fix Released
[Yahoo-eng-team] [Bug 1611074] Re: Reformatting of ephemeral drive fails on resize of Azure VM
This bug was fixed in the package cloud-init - 0.7.8-49-g9e904bb-0ubuntu1

---
cloud-init (0.7.8-49-g9e904bb-0ubuntu1) zesty; urgency=medium

  * debian/cloud-init.postinst: update /etc/fstab on Azure to fix future
    resize operations. (LP: #1611074)
  * New upstream snapshot.
    - Add activate_datasource, for datasource specific code paths.
      Use that on Azure to handle re-formatting of ephemeral disk.
      (LP: #1611074)

 -- Scott Moser  Fri, 18 Nov 2016 16:37:34 -0500

** Changed in: cloud-init (Ubuntu)
       Status: Confirmed => Fix Released

Status in cloud-init:
  Fix Committed
Status in cloud-init package in Ubuntu:
  Fix Released
Status in cloud-init source package in Xenial:
  Confirmed
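
The bug report notes that the unmount logic lived in get_data, which is skipped when the datasource is restored from cache, and the changelog says an activate_datasource step was added for datasource-specific code paths. The toy model below shows why a hook that runs on every boot closes that gap. It is a sketch under stated assumptions: `ToyDataSource`, `boot`, and the `is_new_instance` parameter are illustrative names, not cloud-init's actual classes or signatures.

```python
class ToyDataSource:
    """Stand-in for a datasource object that may be pickled and reused."""

    def __init__(self):
        self.get_data_calls = 0
        self.activate_calls = 0

    def get_data(self):
        # Metadata crawl. Only runs when no cached object exists, so any
        # per-boot fix-up placed here is skipped on later boots.
        self.get_data_calls += 1
        return True

    def activate(self, is_new_instance):
        # Per-boot hook. A real datasource could check here whether the
        # ephemeral disk came back as NTFS and needs reformatting.
        self.activate_calls += 1


def boot(cached_datasource=None):
    """Model one boot: reuse the cached datasource if there is one."""
    if cached_datasource is None:
        datasource = ToyDataSource()
        datasource.get_data()
        is_new_instance = True
    else:
        datasource = cached_datasource  # "restored from cache"
        is_new_instance = False
    # Runs on every boot, cached or not.
    datasource.activate(is_new_instance=is_new_instance)
    return datasource
```

Booting once and then booting again with the returned object leaves the get_data count at one while the activate count rises to two, which mirrors the first boot followed by the post-resize boot in the log.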
[Yahoo-eng-team] [Bug 1611074] Re: Reformatting of ephemeral drive fails on resize of Azure VM
** Changed in: cloud-init (Ubuntu)
       Status: Fix Released => Confirmed

** Changed in: cloud-init (Ubuntu Xenial)
       Status: Fix Committed => Confirmed

Status in cloud-init:
  Fix Committed
Status in cloud-init package in Ubuntu:
  Confirmed
Status in cloud-init source package in Xenial:
  Confirmed
[Yahoo-eng-team] [Bug 1611074] Re: Reformatting of ephemeral drive fails on resize of Azure VM
** Also affects: cloud-init (Ubuntu Xenial)
   Importance: Undecided
       Status: New

** Changed in: cloud-init (Ubuntu Xenial)
       Status: New => Confirmed

** Changed in: cloud-init (Ubuntu Xenial)
   Importance: Undecided => Medium

Status in cloud-init:
  Fix Committed
Status in cloud-init package in Ubuntu:
  Fix Released
Status in cloud-init source package in Xenial:
  Confirmed

Bug description:
  After resizing a 16.04 VM on Azure, the VM is presented with a new
  ephemeral drive (of a different size), which initially is NTFS
  formatted. Cloud-init tries to format the appropriate partition ext4,
  but fails because it is mounted. Cloud-init has unmount logic for
  exactly this case in the get_data call on the Azure data source, but
  this is never called because fresh cache is found.

  Jun 27 19:07:47 azubuntu1604arm [CLOUDINIT] handlers.py[DEBUG]: start: init-network/check-cache: attempting to read from cache [trust]
  Jun 27 19:07:47 azubuntu1604arm [CLOUDINIT] util.py[DEBUG]: Reading from /var/lib/cloud/instance/obj.pkl (quiet=False)
  Jun 27 19:07:47 azubuntu1604arm [CLOUDINIT] util.py[DEBUG]: Read 5950 bytes from /var/lib/cloud/instance/obj.pkl
  Jun 27 19:07:47 azubuntu1604arm [CLOUDINIT] stages.py[DEBUG]: restored from cache: DataSourceAzureNet [seed=/dev/sr0]
  Jun 27 19:07:47 azubuntu1604arm [CLOUDINIT] handlers.py[DEBUG]: finish: init-network/check-cache: SUCCESS: restored from cache: DataSourceAzureNet [seed=/dev/sr0]
  ...
  Jun 27 19:07:48 azubuntu1604arm [CLOUDINIT] cc_disk_setup.py[DEBUG]: Creating file system None on /dev/sdb1
  Jun 27 19:07:48 azubuntu1604arm [CLOUDINIT] cc_disk_setup.py[DEBUG]: Using cmd: /sbin/mkfs.ext4 /dev/sdb1
  Jun 27 19:07:48 azubuntu1604arm [CLOUDINIT] util.py[DEBUG]: Running command ['/sbin/mkfs.ext4', '/dev/sdb1'] with allowed return codes [0] (shell=False, capture=True)
  Jun 27 19:07:48 azubuntu1604arm [CLOUDINIT] util.py[DEBUG]: Creating fs for /dev/disk/cloud/azure_resource took 0.052 seconds
  Jun 27 19:07:48 azubuntu1604arm [CLOUDINIT] util.py[WARNING]: Failed during filesystem operation#012Failed to exec of '['/sbin/mkfs.ext4', '/dev/sdb1']':#012Unexpected error while running command.#012Command: ['/sbin/mkfs.ext4', '/dev/sdb1']#012Exit code: 1#012Reason:
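
The `#012` sequences in the WARNING line are how syslog daemons commonly escape control characters: `#` followed by the three-digit octal code, so `#012` stands for a newline. A small helper can turn such a line back into its multi-line form for reading. This is a sketch, assuming that escaping convention; the name `decode_syslog_escapes` is illustrative, and only codes below octal 040 are decoded so that text such as "LP: #161" is left alone.

```python
import re

# '#' followed by exactly three octal digits, e.g. '#012' for newline.
_OCTAL_ESCAPE = re.compile(r"#([0-7]{3})")


def decode_syslog_escapes(message):
    """Replace '#ooo' escapes for control characters with the character."""
    def _replace(match):
        code = int(match.group(1), 8)
        # Only control characters are escaped this way; leave anything
        # else (for example a bug number like '#161') untouched.
        return chr(code) if code < 32 else match.group(0)

    return _OCTAL_ESCAPE.sub(_replace, message)
```

Applied to the WARNING message above, the result splits into separate lines for the failing command, its exit code, and the (truncated) reason.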
[Yahoo-eng-team] [Bug 1611074] Re: Reformatting of ephemeral drive fails on resize of Azure VM
This bug was fixed in the package cloud-init - 0.7.8-3-g80f5ec4-0ubuntu1

---
cloud-init (0.7.8-3-g80f5ec4-0ubuntu1) yakkety; urgency=medium

  * New upstream snapshot.
    - Adjust mounts and disk configuration for systemd. (LP: #1611074)
    - dmidecode: run dmidecode only on i?86 or x86_64 arch.
      [Robert Schweikert]

 -- Scott Moser  Tue, 20 Sep 2016 13:59:20 -0400

** Changed in: cloud-init (Ubuntu)
       Status: Confirmed => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1611074

Title:
  Reformatting of ephemeral drive fails on resize of Azure VM

Status in cloud-init:
  Confirmed
Status in cloud-init package in Ubuntu:
  Fix Released

Bug description:
  === Begin SRU Template ===
  [Impact]
  In some cases, cloud-init writes entries to /etc/fstab, and on Azure
  it will even format a disk for mounting and then write the entry for
  that 'ephemeral' disk there.

  A supported operation on Azure is to "resize" the system. When you do
  this the system is shut down, resized (given larger/faster disks and
  more CPU) and then brought back up. In that process, the "ephemeral"
  disk is re-initialized to its original NTFS format. The designed goal
  is for cloud-init to recognize this situation and re-format the disk
  to ext4.

  The problem is that the mount of that disk happens before cloud-init
  can reformat. That's because the entry in fstab has 'auto' and is
  automatically mounted. The end result is that after a resize operation
  the user will be left with the ephemeral disk mounted at /mnt and
  having an NTFS filesystem rather than ext4.

  [Test Case]
  The text in comment 3 describes how to recreate by the original
  reporter. Another way to do this is to just re-format the ephemeral
  disk as NTFS and then reboot. The result *should* be that after reboot
  it comes back up and has an ext4 filesystem on it.

  1.) boot system on azure
      (for this, I use https://gist.github.com/smoser/5806147, but you
      can use web ui or any other way).

  2.) unmount the ephemeral disk
      $ umount /mnt

  3.) repartition it so that mkfs.ntfs does less and is faster
      This is not strictly necessary, but mkfs.ntfs can take upwards of
      20 minutes. shrinking /dev/sdb2 to be 200M means it will finish
      in < 1 minute.

      $ disk=/dev/disk/cloud/azure_resource
      $ part=/dev/disk/cloud/azure_resource-part1
      $ echo "2048,$((2*1024*100)),7" | sudo sfdisk "$disk"
      $ time mkfs.ntfs --quick "$part"

  4.) reboot

  5.) expect that /proc/mounts has /dev/disk/cloud/azure_resource-part1
      as ext4 and that fstab has x-systemd.requires in it.

      $ awk '$2 == "/mnt" { print $0 }' /proc/mounts
      /dev/sdb1 /mnt ext4 rw,relatime,data=ordered 0 0

      $ awk '$2 == "/mnt" { print $0 }' /etc/fstab
      /dev/sdb1 /mnt auto defaults,nofail,x-systemd.requires=cloud-init.service,comment=cloudconfig 0 2

  [Regression Potential]
  Regression is unlikely. The likely failure case is just that the
  problem is not correctly fixed, and the user ends up with either an
  NTFS-formatted disk mounted at /mnt or nothing mounted at /mnt.
  === End SRU Template ===

  After resizing a 16.04 VM on Azure, the VM is presented with a new
  ephemeral drive (of a different size), which initially is NTFS
  formatted. Cloud-init tries to format the appropriate partition ext4,
  but fails because it is mounted. Cloud-init has unmount logic for
  exactly this case in the get_data call on the Azure data source, but
  this is never called because fresh cache is found.

  Jun 27 19:07:47 azubuntu1604arm [CLOUDINIT] handlers.py[DEBUG]: start: init-network/check-cache: attempting to read from cache [trust]
  Jun 27 19:07:47 azubuntu1604arm [CLOUDINIT] util.py[DEBUG]: Reading from /var/lib/cloud/instance/obj.pkl (quiet=False)
  Jun 27 19:07:47 azubuntu1604arm [CLOUDINIT] util.py[DEBUG]: Read 5950 bytes from /var/lib/cloud/instance/obj.pkl
  Jun 27 19:07:47 azubuntu1604arm [CLOUDINIT] stages.py[DEBUG]: restored from cache: DataSourceAzureNet [seed=/dev/sr0]
  Jun 27 19:07:47 azubuntu1604arm [CLOUDINIT] handlers.py[DEBUG]: finish: init-network/check-cache: SUCCESS: restored from cache: DataSourceAzureNet [seed=/dev/sr0]
  ...
  Jun 27 19:07:48 azubuntu1604arm [CLOUDINIT] cc_disk_setup.py[DEBUG]: Creating file system None on /dev/sdb1
  Jun 27 19:07:48 azubuntu1604arm [CLOUDINIT] cc_disk_setup.py[DEBUG]: Using cmd: /sbin/mkfs.ext4 /dev/sdb1
  Jun 27 19:07:48 azubuntu1604arm [CLOUDINIT] util.py[DEBUG]: Running command ['/sbin/mkfs.ext4', '/dev/sdb1'] with allowed return codes [0] (shell=False, capture=True)
  Jun 27 19:07:48 azubuntu1604arm [CLOUDINIT] util.py[DEBUG]: Creating fs for /dev/disk/cloud/azure_resource took 0.052 seconds
  Jun 27 19:07:48 azubuntu1604arm [CLOUDINIT] util.py[WARNING]: Failed during filesystem
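
For anyone scripting the verification, the step 5 check from the test case (ext4 mounted at /mnt, and an fstab entry carrying x-systemd.requires) could be sketched in Python roughly as below. This is only an illustration, not part of cloud-init: the function names mount_entry and check_ephemeral are made up for this sketch, and it takes the contents of /proc/mounts and /etc/fstab as strings so it can be tried without touching a real system.

```python
def mount_entry(table_text, mount_point):
    """Return the fields of the first entry whose mount point matches, or None.

    Works for both /proc/mounts and /etc/fstab, which share the layout
    "device mountpoint fstype options dump pass".
    """
    for line in table_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and fstab comments
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            return fields
    return None


def check_ephemeral(proc_mounts_text, fstab_text, mount_point="/mnt"):
    """Return a list of problems; an empty list means the step 5 check passed."""
    problems = []

    mounted = mount_entry(proc_mounts_text, mount_point)
    if mounted is None:
        problems.append("nothing mounted at %s" % mount_point)
    elif mounted[2] != "ext4":
        problems.append("%s is %s, expected ext4" % (mount_point, mounted[2]))

    entry = mount_entry(fstab_text, mount_point)
    if entry is None:
        problems.append("no fstab entry for %s" % mount_point)
    else:
        options = entry[3].split(",")
        if not any(opt.startswith("x-systemd.requires=") for opt in options):
            problems.append("fstab entry lacks x-systemd.requires")

    return problems


if __name__ == "__main__":
    # Sample lines copied from the expected output in the test case.
    mounts = "/dev/sdb1 /mnt ext4 rw,relatime,data=ordered 0 0\n"
    fstab = ("/dev/sdb1 /mnt auto defaults,nofail,"
             "x-systemd.requires=cloud-init.service,comment=cloudconfig 0 2\n")
    print(check_ephemeral(mounts, fstab) or "ok")
```

On a real instance the two strings would instead be read from /proc/mounts and /etc/fstab after the reboot in step 4.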
[Yahoo-eng-team] [Bug 1611074] Re: Reformatting of ephemeral drive fails on resize of Azure VM
** Also affects: cloud-init
   Importance: Undecided
       Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1611074

Title:
  Reformatting of ephemeral drive fails on resize of Azure VM

Status in cloud-init:
  New
Status in cloud-init package in Ubuntu:
  New

Bug description:
  After resizing a 16.04 VM on Azure, the VM is presented with a new
  ephemeral drive (of a different size), which initially is NTFS
  formatted. Cloud-init tries to format the appropriate partition ext4,
  but fails because it is mounted. Cloud-init has unmount logic for
  exactly this case in the get_data call on the Azure data source, but
  this is never called because fresh cache is found.

  Jun 27 19:07:47 azubuntu1604arm [CLOUDINIT] handlers.py[DEBUG]: start: init-network/check-cache: attempting to read from cache [trust]
  Jun 27 19:07:47 azubuntu1604arm [CLOUDINIT] util.py[DEBUG]: Reading from /var/lib/cloud/instance/obj.pkl (quiet=False)
  Jun 27 19:07:47 azubuntu1604arm [CLOUDINIT] util.py[DEBUG]: Read 5950 bytes from /var/lib/cloud/instance/obj.pkl
  Jun 27 19:07:47 azubuntu1604arm [CLOUDINIT] stages.py[DEBUG]: restored from cache: DataSourceAzureNet [seed=/dev/sr0]
  Jun 27 19:07:47 azubuntu1604arm [CLOUDINIT] handlers.py[DEBUG]: finish: init-network/check-cache: SUCCESS: restored from cache: DataSourceAzureNet [seed=/dev/sr0]
  ...
  Jun 27 19:07:48 azubuntu1604arm [CLOUDINIT] cc_disk_setup.py[DEBUG]: Creating file system None on /dev/sdb1
  Jun 27 19:07:48 azubuntu1604arm [CLOUDINIT] cc_disk_setup.py[DEBUG]: Using cmd: /sbin/mkfs.ext4 /dev/sdb1
  Jun 27 19:07:48 azubuntu1604arm [CLOUDINIT] util.py[DEBUG]: Running command ['/sbin/mkfs.ext4', '/dev/sdb1'] with allowed return codes [0] (shell=False, capture=True)
  Jun 27 19:07:48 azubuntu1604arm [CLOUDINIT] util.py[DEBUG]: Creating fs for /dev/disk/cloud/azure_resource took 0.052 seconds
  Jun 27 19:07:48 azubuntu1604arm [CLOUDINIT] util.py[WARNING]: Failed during filesystem operation#012Failed to exec of '['/sbin/mkfs.ext4', '/dev/sdb1']':#012Unexpected error while running command.#012Command: ['/sbin/mkfs.ext4', '/dev/sdb1']#012Exit code: 1#012Reason: -#012Stdout: ''#012Stderr: 'mke2fs 1.42.13 (17-May-2015)\n/dev/sdb1 is mounted; will not make a filesystem here!\n'

  $ lsb_release -rd
  Description:    Ubuntu 16.04.1 LTS
  Release:        16.04

  $ cat /etc/cloud/build.info
  build_name: server
  serial: 20160721

  ~$ dpkg -l cloud-init
  Desired=Unknown/Install/Remove/Purge/Hold
  | Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
  |/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
  ||/ Name        Version               Architecture  Description
  +++-===========-=====================-=============-================================
  ii  cloud-init  0.7.7~bzr1256-0ubuntu all           Init scripts for cloud instances

  We're seeing ~100% repro of this bug on resize, where the only success
  cases are caused by another bug that messes up fstab and prevents
  mounting of the drive.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1611074/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp