** Description changed:

- During an Autopilot deployment on gMAAS, Juju had hung running a mon-
- relation-changed hook
+ [Impact]
+ Disks with invalid metadata can cause hangs during cleaning; resulting in stuck deployments.

- $ ps afxwww | grep -A 4 [m]on-relation-changed
- 29118 ?        S      0:03  \_ /usr/bin/python /var/lib/juju/agents/unit-ceph-1/charm/hooks/mon-relation-changed
- 37996 ?        S      0:00      \_ /bin/sh /usr/sbin/ceph-disk-prepare --fs-type xfs --zap-disk /dev/sdb
- 37998 ?        S      0:00          \_ /usr/bin/python /usr/sbin/ceph-disk prepare --fs-type xfs --zap-disk /dev/sdb
- 38016 ?        D      0:00              \_ /sbin/sgdisk --zap-all --clear --mbrtogpt -- /dev/sdb
+ [Test Case]
+ Initialize a disk with invalid metadata using the '--zap-disk' option.
+
+ [Regression Potential]
+ Minimal; already in later Ubuntu releases.
+
+ [Original Bug Report]
+ During an Autopilot deployment on gMAAS, Juju had hung running a mon-relation-changed hook
+
+ $ ps afxwww | grep -A 4 [m]on-relation-changed
+ 29118 ?        S      0:03  \_ /usr/bin/python /var/lib/juju/agents/unit-ceph-1/charm/hooks/mon-relation-changed
+ 37996 ?        S      0:00      \_ /bin/sh /usr/sbin/ceph-disk-prepare --fs-type xfs --zap-disk /dev/sdb
+ 37998 ?        S      0:00          \_ /usr/bin/python /usr/sbin/ceph-disk prepare --fs-type xfs --zap-disk /dev/sdb
+ 38016 ?        D      0:00              \_ /sbin/sgdisk --zap-all --clear --mbrtogpt -- /dev/sdb

  This had been in this state for > 10m. The logs[1] from the unit in
  question showed that something was up with the partition tables on that
  disk. I fixed this by hand using gdisk[2]

  [1] https://pastebin.canonical.com/135426/
  [2] http://paste.ubuntu.com/11887096/
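
In the process listing above, sgdisk sits in state "D" (uninterruptible sleep), which is the sign of the hang. As a rough sketch only, and not part of the ceph or charm code, something like the following could be used to spot such processes on a unit. The function names parse_ps and find_uninterruptible are made up for illustration, and the sketch assumes a procps-style ps that accepts "-eo pid,stat,args".

```python
import subprocess


def parse_ps(output):
    """Return (pid, stat, command) tuples for processes whose state starts with 'D'.

    Expects text in the layout produced by `ps -eo pid,stat,args`.
    The header line is skipped because its first field is not numeric.
    """
    stuck = []
    for line in output.splitlines():
        fields = line.split(None, 2)
        if len(fields) < 3 or not fields[0].isdigit():
            continue
        pid, stat, cmd = fields
        if stat.startswith("D"):
            stuck.append((int(pid), stat, cmd))
    return stuck


def find_uninterruptible():
    """Run ps (assumed to be procps) and return processes currently in state D."""
    result = subprocess.run(
        ["ps", "-eo", "pid,stat,args"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_ps(result.stdout)
```

Calling find_uninterruptible() a few times, some seconds apart, and seeing the same sgdisk PID each time would suggest a hang like the one reported here rather than ordinary short-lived disk I/O.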
-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1475247

Title:
  ceph-disk-prepare --zap-disk hang

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1475247/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs