[Bug 1732028] Re: timeout in iscsi boot fail with overlayroot [open-iscsi autopkg tests on LP Infra]
Please reopen if this is still an issue.

** Changed in: systemd (Ubuntu)
       Status: Confirmed => Invalid

--
You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1732028

Title:
  timeout in iscsi boot fail with overlayroot [open-iscsi autopkg tests on LP Infra]

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/open-iscsi/+bug/1732028/+subscriptions

--
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1732028] Re: timeout in iscsi boot fail with overlayroot [open-iscsi autopkg tests on LP Infra]
\o/
All worked on the first run now, which gives some confidence that the flakiness is gone for now.

Version           Trigger                     Date                 Result
2.0.874-5ubuntu5  debconf/1.5.68              2018-07-20 06:12:24  pass
2.0.874-5ubuntu5  targetcli-fb/2.1.43-2       2018-07-20 06:05:57  pass
2.0.874-5ubuntu5  qemu/1:2.12+dfsg-3ubuntu3   2018-07-20 06:05:48  pass
2.0.874-5ubuntu5  python3-defaults/3.6.6-1    2018-07-20 06:05:44  pass
2.0.874-5ubuntu5  netifaces/0.10.4-1build1    2018-07-20 06:04:45  pass

If said runtime increase of the test turns out to affect e.g. Bionic SRUs as well, by blocking on open-iscsi more than in the past, then we will need to consider backporting the fix.
[Bug 1732028] Re: timeout in iscsi boot fail with overlayroot [open-iscsi autopkg tests on LP Infra]
2.0.874-5ubuntu5 migrated and had the first good new run.
Retriggering all those currently blocked on it ...
[Bug 1732028] Re: timeout in iscsi boot fail with overlayroot [open-iscsi autopkg tests on LP Infra]
This bug was fixed in the package open-iscsi - 2.0.874-5ubuntu5

---
open-iscsi (2.0.874-5ubuntu5) cosmic; urgency=medium

  * Harden dep8 tests against effects due to slow execution on
    Launchpad infrastructure (LP: #1732028).
    - debian/tests/patch-image: remove problematic fstab entries
    - debian/tests/tgt-boot-test: ran xkvm in verbose mode

 -- Christian Ehrhardt  Thu, 19 Jul 2018 18:22:39 +0200

** Changed in: open-iscsi (Ubuntu)
       Status: New => Fix Released
[Bug 1732028] Re: timeout in iscsi boot fail with overlayroot [open-iscsi autopkg tests on LP Infra]
** Merge proposal linked:
   https://code.launchpad.net/~paelzer/ubuntu/+source/open-iscsi/+git/open-iscsi/+merge/350076
[Bug 1732028] Re: timeout in iscsi boot fail with overlayroot [open-iscsi autopkg tests on LP Infra]
First good run:
https://objectstorage.prodstack4-5.canonical.com/v1/AUTH_77e2ada1e7a84929a74ba3b87153c0ac/autopkgtest-cosmic-ci-train-ppa-service-3325/cosmic/amd64/o/open-iscsi/20180719_180433_98a21@/log.gz
[Bug 1732028] Re: timeout in iscsi boot fail with overlayroot [open-iscsi autopkg tests on LP Infra]
Note to myself: WIP at
https://code.launchpad.net/~paelzer/ubuntu/+source/open-iscsi/+git/open-iscsi/+ref/mask-uefi-part-to-unbreak-tests
[Bug 1732028] Re: timeout in iscsi boot fail with overlayroot [open-iscsi autopkg tests on LP Infra]
Due to snapd still looping in this case (bug 1782602), the sum of changes is:

$ systemctl disable snapd.seeded.service
$ systemctl disable snapd.service

$ cat /etc/fstab
LABEL=cloudimg-rootfs   /           ext4    defaults    0 0
#LABEL=UEFI             /boot/efi   vfat    defaults    0 0
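The fstab change above can be sketched as a small shell helper. This is only an illustration, not the code that went into debian/tests/patch-image; the function name mask_fstab_label and its arguments are hypothetical, and it assumes GNU sed (for -i) and that the path given points into the mounted image rather than the host's /etc/fstab.

```shell
#!/bin/sh
# Hypothetical helper (not from the open-iscsi package): comment out every
# active fstab entry whose first field is LABEL=<label>.
# Usage: mask_fstab_label UEFI /path/to/image-root/etc/fstab
mask_fstab_label() {
    label="$1"
    fstab="$2"
    # '&' re-inserts the matched text, so the line is only prefixed with '#'.
    # Lines that are already commented out do not match the '^LABEL=' anchor.
    sed -i "s|^LABEL=${label}[[:space:]]|#&|" "$fstab"
}
```

Running it with UEFI as the label leaves the cloudimg-rootfs entry untouched, which matches the fstab shown above.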
[Bug 1732028] Re: timeout in iscsi boot fail with overlayroot [open-iscsi autopkg tests on LP Infra]
In a discussion with smoser we decided to try just clearing /etc/fstab. It might be sufficient for our test intention and at the same time avoid the issue.

With that done we still have the following failing with timeouts:
  [ TIME ] Timed out waiting for device dev-ttyS0.device.
  [FAILED] Failed to start Journal Service.

Not sure why the first happens, but the second is due to the lack of a root FS. In my slowed-down environments it then boots REALLY slowly, like 30 minutes. Several services are unforgiving of the root not being there, especially snapd, which got into a loop of Starting/Stopping/... and thereby did not let the boot complete.

That attempt was (intentionally) too aggressive, so I stepped back to just replacing the labels with the (in this case) known partition device names /dev/sda1 and /dev/sda15. I first booted this without the slowdown to check that the config works at all. It works.

Note: without the slowdown the TCG system is at 4*100% for the vcpus and 400% for TCG. ~800% vs ~98% means running at about 1/8 of the speed, which matches what I see as the effective slowdown.

A few more tries together with Scott led to disabling the fstab entry for EFI while keeping the main root as-is. Again we verified that this config is valid (boot without CPU limit) and then we set the limit. With that it finally seems to work.

Let's code that up in open-iscsi and put it in a PPA to test against on LP infra.
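The slowdown estimate in the note above is simple integer arithmetic. A minimal sketch, using the rounded CPU figures from the comment (~800% unthrottled, ~98% throttled); the function name slowdown_factor is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: ratio of unthrottled to throttled CPU usage (both in
# percent), rounded to the nearest integer.
slowdown_factor() {
    full="$1"
    limited="$2"
    # Adding half the divisor before dividing rounds to nearest instead of
    # truncating.
    echo $(( (full + limited / 2) / limited ))
}

slowdown_factor 800 98   # prints 8
```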
[Bug 1732028] Re: timeout in iscsi boot fail with overlayroot [open-iscsi autopkg tests on LP Infra]
I worked on the two messages and related boot messages:
1. The ordering cycle is about media-root; maybe the config is not yet perfect.
2. Timeouts on
     dev-ttyS0.device
     dev-disk-by\x2dlabel-UEFI.device

Hmm, that is:
  Dependency failed for /boot/efi.
  Dependency failed for Local File Systems.
  local-fs.target: Job local-fs.target/start failed

I checked for the network coming up and was at first confused between
  systemd[1]: Failed to start Network Service.
and the network in general coming up very late. In a log:
  eth0: DHCPv4 address 10.0.12.15/24
  systemd-networkd[529]: eth0: Configured
In one of my cases this was ~10 minutes after I got into the emergency console. Afterwards boot-efi.mount completed automatically just fine, btw.

Early boot is a swarm of red herrings. I was checking further, e.g.
  Started Remount Root and Kernel File Systems.
appearing before
  Started Network Service
It seems the latter just depends on time synchronization and therefore waits.
  Reached target Network is Online.
comes before it, as it should.

And as trivial as that may sound, the network seems to be related at least. Of all the logs only the good cases reach:
  Reached target Network is Online
But that is only reachable after local-fs, which fails in the bad case, so it is OK for it not to be reached if the root is missing.

I also checked whether the ordering changed, e.g. compared to Bionic. But it seemed similar, at least for network init, in the old bad and good cases.

Arr - early boot dependencies are lovely :-)

A discussion about fstab modifications follows next ...
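Sorting captured console logs into good and bad cases, as described above, can be done with a few grep calls. The sketch below is only an assumption about how one might do it; the function names are hypothetical and the matched strings are the messages quoted in this bug. The emergency check comes first because, as noted, the network can still come up after the emergency console appears.

```shell
#!/bin/sh
# Hypothetical helpers for triaging a saved serial console log.

# Print "emergency", "good" or "unknown" for the given log file.
classify_boot_log() {
    log="$1"
    if grep -q 'You are in emergency mode' "$log"; then
        echo emergency
    elif grep -q 'Reached target Network is Online' "$log"; then
        echo good
    else
        echo unknown
    fi
}

# List the timeout and dependency failures seen during boot.
list_boot_timeouts() {
    grep -E 'Timed out waiting for device|Dependency failed for' "$1"
}
```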
[Bug 1732028] Re: timeout in iscsi boot fail with overlayroot [open-iscsi autopkg tests on LP Infra]
** Description changed:

  This issue keeps cropping up. It shows itself in open-iscsi autopkg tests.
  I think it might just be "really slow system". It seems the timeout is only
  1 minute 30 seconds for the disk to appear, and in a happy run you
  might see something very close:

  [ ***  ] A start job is running for dev-disk…-UEFI.device (1min 29s / 1min 32s)
  [ ***  ] A start job is running for dev-disk…-UEFI.device (1min 30s / 1min 32s)
  [  OK  ] Found device VIRTUAL-DISK UEFI.
           Mounting /boot/efi...

  ---

  There is information on the open-iscsi tests at [1].

  [1] https://git.launchpad.net/~usd-import-team/ubuntu/+source/open-iscsi/tree/debian/tests/README-boot-test.md

  The tests set up an iscsi target and boot a kvm guest off that read-only
  root with overlayroot.

  # get open-iscsi source
  $ apt-get source open-iscsi
  $ sudo apt-get install -qy simplestreams tgt qemu-system-x86 \
      cloud-image-utils distro-info
  $ cd open-iscsi-2.0.874/

  ## we're now mostly following debian/tests/README-boot-test.md
  # download the image and get kernel/initrd
  $ PATH=$PWD/debian/tests:$PATH
  $ get-image bionic.d bionic
  $ sudo `which patch-image` \
      --kernel=bionic.d/kernel --initrd=bionic.d/initrd bionic.d/disk.img

  $ tgt-boot-test -v bionic.d/disk.img bionic.d/kernel bionic.d/initrd

  Success is being able to log in with 'ubuntu' and 'passw0rd'.
  Failure as seen in the log is dropping into an emergency shell.

  Once inside (this was successful) you'll see a mostly sane system.

  Some things to note:
  a.) tgt-boot-test boots without kvm enabled. This is because using
      kvm with qemu in nested virt would cause system lockups. It's slower
      but more reliable to go without.
  b.) under bug 1723183 I made overlayroot comment out the root filesystem
      from the rendered /etc/fstab. That was because systemd got confused
      and assumed that /media/root-ro had to be on top of /.
  c.) you can enable or disable kvm by setting _USE_KVM=0 or _USE_KVM=1
      in your environment.

  $ grep -v "^# " /etc/fstab
  #
  #
  #LABEL=cloudimg-rootfs /media/root-ro/ ext4 ro,defaults,noauto 0 0
  /media/root-ro/ / overlay lowerdir=/media/root-ro/,upperdir=/media/root-rw/overlay/,workdir=/media/root-rw/overlay-workdir/_ 0 0
  LABEL=UEFI /boot/efi vfat defaults 0 0 # overlayroot:fs-unsupported

  $ sudo blkid
  /dev/sda1: LABEL="cloudimg-rootfs" UUID="7b1980bd-9102-4356-8df0-ec7a0c062411" TYPE="ext4" PARTUUID="c0b5ace0-4703-4667-babb-3d38137cab88"
  /dev/sda15: LABEL="UEFI" UUID="B177-3CC9" TYPE="vfat" PARTUUID="0ab0b9fd-2c28-4724-857a-1559f0cf76ea"
  /dev/sda14: PARTUUID="221662d6-cab0-4290-ba1c-e72acf2bf193"

  $ cat /run/systemd/generator/local-fs.target.requires/boot-efi.mount
  # Automatically generated by systemd-fstab-generator

  [Unit]
  SourcePath=/etc/fstab
  Documentation=man:fstab(5) man:systemd-fstab-generator(8)
  Before=local-fs.target

  [Mount]
  Where=/boot/efi
  What=/dev/disk/by-label/UEFI
  Type=vfat

  Related bugs:
   * bug 1680197: Zesty deployments failing sporadically
   * bug 1723183: transient systemd ordering issue when using overlayroot
+  * bug 1666573: transient systemd ordering cycle in boot with overlayroot ver read-only open-iscsi root

  ProblemType: Bug
  DistroRelease: Ubuntu 18.04
  Package: systemd 234-2ubuntu12
  ProcVersionSignature: User Name 4.13.0-16.19-generic 4.13.4
  Uname: Linux 4.13.0-16-generic x86_64
  ApportVersion: 2.20.7-0ubuntu4
  Architecture: amd64
  Date: Mon Nov 13 21:06:36 2017
  Lsusb: Error: command ['lsusb'] failed with exit code 1:
  MachineType: QEMU Standard PC (i440FX + PIIX, 1996)
  ProcEnviron:
   TERM=vt220
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=
   LANG=C.UTF-8
   SHELL=/bin/bash
  ProcKernelCmdLine: nomodeset iscsi_initiator=maas-enlist iscsi_target_name=tgt-boot-test-7xuhwl iscsi_target_ip=10.0.12.2 iscsi_target_port=3260 iscsi_initiator=maas-enlist ip=maas-enlist:BOOTIF ro net.ifnames=0 BOOTIF_DEFAULT=eth0 root=/dev/disk/by-path/ip-10.0.12.2:3260-iscsi-tgt-boot-test-7xuhwl-lun-1-part1 overlayroot=tmpfs console=ttyS0 ds=nocloud-net;seedfrom=http://10.0.12.2:32600/
  SourcePackage: system
[Bug 1732028] Re: timeout in iscsi boot fail with overlayroot [open-iscsi autopkg tests on LP Infra]
I attached https://hackmd.io/E0ydu7Y7QEe-kroPb6-OOA as a reference. At some point we can improve the doc in the debian/tests directory with that.
[Bug 1732028] Re: timeout in iscsi boot fail with overlayroot [open-iscsi autopkg tests on LP Infra]
Running just with tgt-boot-test in interactive mode, with no user-data and no extra disk. The issue is still reproducible with that.

I saved the image so I can try different modifications while being able to get back to the current state.

Then I started where Scott had already experimented: the timeouts. I tried the big hammer first, setting all of them.

# Inside: sudo mount-image-callback --system-mounts disk.img /bin/bash
/etc/fstab:
LABEL=cloudimg-rootfs / ext4 defaults,x-systemd.device-timeout=600,x-systemd.mount-timeout=600,_netdev 0 0

$ mkdir etc/systemd/user.conf.d
$ echo DefaultTimeoutStartSec=600s > etc/systemd/user.conf.d/longtimeout.conf

I (accidentally) verified that without a CPU limit this modified root disk boots fine. I will now run it at 40% over lunchtime (a bit less, to trigger the issue more reliably; since I don't "wait" for it to complete, a lower % won't hurt).

But that still ran into the issues. At least I now have an interactive emergency shell there, so I can take a look before the next tries.

Also, not only the disk fails but also (probably a dependency):
  Timed out waiting for device dev-ttyS0.device.
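The fstab edit above was done by hand inside the mounted image. As a rough sketch of how the same change could be scripted, assuming POSIX awk and a whitespace-separated fstab: the helper name add_mount_options is hypothetical, while x-systemd.device-timeout and x-systemd.mount-timeout are the options used in the comment. Note that the rewritten line is re-joined with single spaces.

```shell
#!/bin/sh
# Hypothetical helper: append extra mount options to the fstab entry for one
# mount point.
# Usage: add_mount_options etc/fstab / x-systemd.device-timeout=600
add_mount_options() {
    fstab="$1"
    mountpoint="$2"
    extra="$3"
    awk -v mp="$mountpoint" -v extra="$extra" '
        # Comments and incomplete lines pass through unchanged.
        /^[[:space:]]*#/ || NF < 4 { print; next }
        # Field 2 is the mount point, field 4 the option list.
        $2 == mp { $4 = $4 "," extra }
        { print }
    ' "$fstab" > "$fstab.new" && mv "$fstab.new" "$fstab"
}
```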
[Bug 1732028] Re: timeout in iscsi boot fail with overlayroot [open-iscsi autopkg tests on LP Infra]
It does not consume "a lot" of CPU, but let's still try to slow it down into the error.

$ sudo apt install cgroup-tools
$ sudo cgcreate -g cpu:/cpulimited
# you might want to ensure that cpu.cfs_period_us: 10
# e.g. set 10% as hard limit
$ echo 1 | sudo tee /sys/fs/cgroup/cpu,cpuacct/cpulimited/cpu.cfs_quota_us
# This was done on the host; all tests run in the qemu guest (as on LP), so move qemu into that cgroup
$ for task in $(ls -1 /proc/$(pidof qemu-system-x86_64)/task/); do echo $task | sudo tee -a /sys/fs/cgroup/cpu,cpuacct/cpulimited/tasks; done

I tested different shares using TCG in the guest. 50% seemed to be a good tradeoff: not so slow that it fails totally or makes you wait too long. It gets slower, with TCG & emulation capped at 50% and the vcpus at ~12%.

With that I was able to hit this locally:

[  432.218071] systemd[1]: Timed out waiting for device dev-disk-by\x2dlabel-UEFI.device.
[ TIME ] Timed out waiting for device dev-disk-by\x2dlabel-UEFI.device.
[DEPEND] Dependency failed for /boot/efi.
[DEPEND] Dependency failed for Local File Systems.
[...]
You are in emergency mode. After logging in, type "journalctl -xb" to view
system logs, "systemctl reboot" to reboot, "systemctl default" or "exit"
to boot into default mode.
Press Enter for maintenance
(or press Control-D to continue):

From here I am going step by step down to the smallest test (this was the full autopkgtest). There we can again start trying tweaks as in comment #5 and comment #6, but hopefully iterate faster on them.
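The hard limit above is the ratio of cpu.cfs_quota_us to cpu.cfs_period_us. A minimal sketch of that arithmetic, assuming the cgroup v1 cpu controller used in the commands above; the helper name cfs_quota_for_percent is hypothetical, and the 100000 µs period in the usage comment is the kernel's default rather than a value taken from this bug.

```shell
#!/bin/sh
# Hypothetical helper: quota (in microseconds) that caps a cgroup at the
# given percentage of one CPU for the given CFS period.
cfs_quota_for_percent() {
    period_us="$1"
    percent="$2"
    echo $(( period_us * percent / 100 ))
}

# Example (not executed here): cap the cpulimited group at 50%.
#   cfs_quota_for_percent 100000 50 | \
#       sudo tee /sys/fs/cgroup/cpu,cpuacct/cpulimited/cpu.cfs_quota_us
```

The quota applies to the cgroup as a whole, so all qemu threads moved into it share the same budget.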