[Bug 1732028] Re: transient boot fail with overlayroot [open-iscsi autopkg tests]
I could confirm that I can run the guest that way and it uses the intended root=/dev/disk/by-path/ip-10.0.12.2:3260-iscsi-tgt-boot-test- b7X8g2-lun-1-part1. The only and unfortunate difference to the issue when run on LP infra stays, that it works all of the time :-/ Note: I ran most of them in background after verifying once how they behave. But wanted to make sure they complete reproducibly. a - manual login, no user data case 3/3 b - user data collect data to disk and shut down 3/3 c - user data collect just shut down 3/3 d - user data collect data to disk and shut down, no tty (nohup CMD > log 2>&1) 3/3 e - non interactive like autopkgtest would run it 3/3 f - non interactive like autopkgtest would run it forcing KVM mode 3/3 Next I was parsing all the logs that the fails on LP accumulated recently. I found this errors whichi is interesting: iscsistart: initiator reported error (15 - session exists) I realized this is present on ALL logs that we gathered. But after thinking I had a lead on this I realized that the good cases had those messages as well. Also found the mentioned "ordering cycle on media-root\x2dro.mount/start" In good as well as bad logs. So neither of these "is it" Essentially the boot around the iscsi root has these steps with some noise in between - looking for differences in good/bad cases. They start the same even with sharing a few errors that seem to be red herrings: [...] (early boot) all logs (id changes) - Logging into tgt-boot-test-o3PlsL 10.1.1.2:3260,1 all logs - mounted filesystem with ordered data on 7/17 logs (also good) - Found ordering cycle ... only all bad cases - Dependency failed for Local File Systems only all bad cases - Timed out waiting for device (devices change) only all bad cases - Started Emergency Shell That matches this bug here. -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1732028 Title: timeout in iscsi boot fail with overlayroot [open-iscsi autopkg tests on LP Infra] To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/open-iscsi/+bug/1732028/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1732028] Re: transient boot fail with overlayroot [open-iscsi autopkg tests]
There also was a slight remaining uncertainty if this might be the detection of KVM being broken and running in nested KVm on Launchpad. But I was able to confirm that it runs as intended in TCG mode: should_try_kvm = no. virt=kvm (nested kvm is finicky). set _USE_KVM=1 to force. [1]: https://objectstorage.prodstack4-5.canonical.com/v1/AUTH_77e2ada1e7a84929a74ba3b87153c0ac /autopkgtest-cosmic-ci-train-ppa-service-3325/cosmic/amd64/o/open- iscsi/20180719_104631_271d7@/log.gz -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1732028 Title: timeout in iscsi boot fail with overlayroot [open-iscsi autopkg tests on LP Infra] To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/open-iscsi/+bug/1732028/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1732028] Re: transient boot fail with overlayroot [open-iscsi autopkg tests]
FYI: In Cosmic this currently blocks (and will block more soon due to python bound on it): qemu, debconf, python3, targetcli-fb, netifaces It also is no more transient, but happens always on LP infrastructure recently (~20 reruns now). Knowing it likely is "too slow" even being unsure why LP in particular is affected we can try to recreate knowing that. I checked CPU consumption in: - KVM mode (super low) - TCG mode (~1.2 CPUs) Adapting title as well ... ** Summary changed: - transient boot fail with overlayroot [open-iscsi autopkg tests] + timeout in iscsi boot fail with overlayroot [open-iscsi autopkg tests] ** Summary changed: - timeout in iscsi boot fail with overlayroot [open-iscsi autopkg tests] + timeout in iscsi boot fail with overlayroot [open-iscsi autopkg tests on LP Infra] -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1732028 Title: timeout in iscsi boot fail with overlayroot [open-iscsi autopkg tests on LP Infra] To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/open-iscsi/+bug/1732028/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1732028] Re: transient boot fail with overlayroot [open-iscsi autopkg tests]
CE: Moving over some of the discussion that started on a related MP [1]. Smoser: I put together some doc on how to run the https://hackmd.io/E0ydu7Y7QEe-kroPb6-OOA That should help you to possibly recreate outside of the adt test harness. [1]: https://code.launchpad.net/~paelzer/ubuntu/+source/open-iscsi/+git /open-iscsi/+merge/349799 -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1732028 Title: timeout in iscsi boot fail with overlayroot [open-iscsi autopkg tests on LP Infra] To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/open-iscsi/+bug/1732028/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1732028] Re: transient boot fail with overlayroot [open-iscsi dep8 tests]
** Summary changed: - transient boot fail with overlayroot + transient boot fail with overlayroot [open-iscsi dep8 tests] ** Attachment added: "bionic failure log 2.0.874-5ubuntu2 qemu/1:2.11+dfsg-1ubuntu2" https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1732028/+attachment/5063566/+files/log.gz ** Summary changed: - transient boot fail with overlayroot [open-iscsi dep8 tests] + transient boot fail with overlayroot [open-iscsi autopkg tests] ** Also affects: open-iscsi (Ubuntu) Importance: Undecided Status: New -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1732028 Title: transient boot fail with overlayroot [open-iscsi autopkg tests] To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/open-iscsi/+bug/1732028/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1732028] Re: transient boot fail with overlayroot
Bummer. https://objectstorage.prodstack4-5.canonical.com/v1/AUTH_77e2ada1e7a84929a74ba3b87153c0ac/autopkgtest-bionic/bionic/amd64/o/open-iscsi/20171122_204418_8ac74@/log.gz shows [ 43.793658] systemd[1]: media-root\x2dro.mount: Found ordering cycle on -.mount/start [ 43.801914] systemd[1]: media-root\x2dro.mount: Found dependency on media-root\x2dro.mount/start [ 43.812260] systemd[1]: media-root\x2dro.mount: Unable to break cycle starting with media-root\x2dro.mount/start [ 43.822135] systemd[1]: Requested transaction contains an unfixable cyclic ordering dependency: Resource deadlock avoided Yet, the *same* set of packages https://objectstorage.prodstack4-5.canonical.com/v1/AUTH_77e2ada1e7a84929a74ba3b87153c0ac/autopkgtest-bionic/bionic/amd64/o/open-iscsi/20171122_211728_e853a@/log.gz this time had no mention of 'ordering cycle' and took 1m7 seconds for uefi device to appear. So I really do think the failures are a matter of simply needing more time. However, the 'ordering cycle' is concerning as I had thought that it shoudl have been fixed. -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1732028 Title: transient boot fail with overlayroot To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1732028/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1732028] Re: transient boot fail with overlayroot
Well, I tried updating /etc/fstab https://code.launchpad.net/~smoser/ubuntu/+source/open-iscsi/+git/open-iscsi/+ref/fix/get-journal-publish-artifacts but that doesnt help. It still fails and goes on after 90 seconds. I verified this by just changing the LABEL= to a different string that would never be there. It should have waited 6m, but didnt. -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1732028 Title: transient boot fail with overlayroot To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1732028/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1732028] Re: transient boot fail with overlayroot
Note that this failure could just be due to slowness at this point. The open-iscsi tests that are running are running in nested qemu with kvm disabled. So it is quite slow to boot. We seem to be getting usually around 40 seconds for that to arrive. My statement is based on messages like: [[0;31m*[0;1;31m*[0m] A start job is running for dev-disk…label-UEFI.device (34s / 1min 30s) [K[[0;32m OK [0m] Found device VIRTUAL-DISK UEFI. There isn't a good report on exactly what time it arrived other than the last 'XXs' second listed' perhaps we could/should collect the journal from inside the vm into artifacts. -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1732028 Title: transient boot fail with overlayroot To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1732028/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1732028] Re: transient boot fail with overlayroot
Is there a way to bump that 1m30s timeout from the kernel command line? Even image modification would be ok, as we're already modifying the image to insert the deb. -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1732028 Title: transient boot fail with overlayroot To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1732028/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1732028] Re: transient boot fail with overlayroot
It looks like: x-systemd.device-timeout= might be useful https://www.freedesktop.org/software/systemd/man/systemd.mount.html we'd have to update the /etc/fstab entry in 'patch-image' of the open- iscsi test, but we could do that and just bump it to 6m or something. As some evidence, that this might have just been slow, here is one run that took 1m11 seconds: https://objectstorage.prodstack4-5.canonical.com/v1/AUTH_77e2ada1e7a84929a74ba3b87153c0ac/autopkgtest-bionic/bionic/amd64/o/open-iscsi/20171112_225036_bf82c@/log.gz -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1732028 Title: transient boot fail with overlayroot To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1732028/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1732028] Re: transient boot fail with overlayroot
Another fail log at https://objectstorage.prodstack4-5.canonical.com/v1/AUTH_77e2ada1e7a84929a74ba3b87153c0ac/autopkgtest-bionic/bionic/amd64/o/open-iscsi/20171114_192057_17bf1@/log.gz Basically if you look at bionic failures http://autopkgtest.ubuntu.com/packages/o/open-iscsi/bionic/amd64 if it took 11 hours and failed, this is why. ** Changed in: systemd (Ubuntu) Status: New => Confirmed ** Changed in: systemd (Ubuntu) Importance: Undecided => Medium -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1732028 Title: transient boot fail with overlayroot To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1732028/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs