** Description changed:

  SRU Justification:
  ==================

  [Impact]

  * Hardware diagnose data (diag 318) of KVM guest kernel cannot be handled.
  * A fix is needed to enhance problem determination of guest kernel under KVM using DIAG 0x318 instruction execution.
  * The s390x diagnose 318 instruction sets the control program name code (CPNC) and control program version code (CPVC) to provide useful information regarding the OS during debugging.
  * The CPNC is explicitly set to 4 to indicate a Linux/KVM environment.
+ * The user story here is that s390x is a platform that has had
+   virtualization for ages and, as part of that, established diag calls
+   that allow you to add data to guests. This helps live management
+   and/or guest debugging in case of problems. For KVM guests this data
+   has been wrong/incomplete so far, and this is the fix for it.
+   You might want to see [1] for the base feature that this fixes. And do
+   not say it is ugly, I did not send the PoP page about diags :-)
+
+ [1]: https://git.mentality.rip/OpenE2K/qemu-e2k/commit/fabdada9357b

  [Fix]

  * In general the following 9 patches (backports) are needed:
  * "[PATCH 1/9] s390/sclp: get machine once during read scp/cpu info"
    https://launchpadlibrarian.net/581388471/0001-s390-sclp-get-machine-once-during-read-scp-cpu-info.patch
  * "[PATCH 2/9] s390/sclp: rework sclp boundary checks"
    https://launchpadlibrarian.net/581388472/0002-s390-sclp-rework-sclp-boundary-checks.patch
  * "[PATCH 3/9] s390/sclp: read sccb from mem based on provided length"
    https://launchpadlibrarian.net/581388474/0003-s390-sclp-read-sccb-from-mem-based-on-provided-lengt.patch
  * "[PATCH 4/9] s390/sclp: check sccb len before filling in data"
    https://launchpadlibrarian.net/581388476/0004-s390-sclp-check-sccb-len-before-filling-in-data.patch
  * "[PATCH 5/9] s390/sclp: use cpu offset to locate cpu entries"
    https://launchpadlibrarian.net/581389965/0005-s390-sclp-use-cpu-offset-to-locate-cpu-entries.patch
  * "[PATCH 6/9] s390/sclp: add extended-length sccb support for kvm guest"
    https://launchpadlibrarian.net/581389970/0006-s390-sclp-add-extended-length-sccb-support-for-kvm-g.patch
  * "[PATCH 7/9] s390: guest support for diagnose 0x318"
    https://launchpadlibrarian.net/581389974/0007-s390-guest-support-for-diagnose-0x318.patch
  * "[PATCH 8/9] s390x: pv: Remove sclp boundary checks"
    https://launchpadlibrarian.net/581389981/0008-s390x-pv-Remove-sclp-boundary-checks.patch
  * "[PATCH 9/9] s390x: pv: Fix diag318 PV fencing"
    https://launchpadlibrarian.net/581389982/0009-s390x-pv-Fix-diag318-PV-fencing.patch

  [Test Case]

  * Set up an IBM Z or LinuxONE LPAR with Ubuntu Server as KVM host.
  * Then set up an Ubuntu KVM virtual machine on top.
  * It can then be observed whether the CPNC (diag318 data) has been successfully set by looking at the s390dbf messages for the KVM guest.
  * The CPNC will always be 4 (denotes Linux environment).
  * Another way to test this is by running the sync_regs_test under tools/testing/selftests/kvm/s390x/sync_regs_test. Just running the kernel self test suite can trigger this.
  * It is important that the patched qemu is tested with the correspondingly patched kernel, since each requires the other. They can be found here:
    qemu: https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/4767
    kernel: https://launchpad.net/~fheimes/+archive/ubuntu/lp1953334

  [Where problems could occur]

  * The approach here is to provide additional debug and diagnose information on top.
  * Hence even if the diag318 changes are broken, the existing functionality shouldn't be harmed.
  * However, with the functional changes, broken code could be introduced (e.g. due to erroneous pointer arithmetic) that does not compile or causes crashes. But this is what the test builds are for:
    qemu: https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/4767
    (kernel: https://launchpad.net/~fheimes/+archive/ubuntu/lp1953334)
  * On top of that, the diag318 diagnose data might not be properly provided (it may be empty or wrong). Again, that is what the test builds and the later verification are targeted at.

  [Other]

  * LP#1953334 is related to this bug and covers the kernel part.

  __________

  Hardware diagnose data (diag 318) of KVM guest kernel cannot be handled.
  Fix needed to enhance problem determination of guest kernel under KVM.

  Solution provided by Collin:
  All patches are provided to enable the DIAGNOSE 0x318 problem determination aid for a QEMU guest. Analogous KVM patches are required. This solution required the prerequisite Extended-Length SCCB patches as well.
I've applied a bugfix related to resetting the diag318 to one of the patches (one line fix -- upstream conversation here: https://lists.gnu.org/archive/html/qemu-devel/2021-11/msg03618.html)
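
For readers unfamiliar with the diag318 data mentioned in [Impact], here is a minimal sketch of how a CPNC and CPVC could be combined into one 64-bit value. The 8-bit CPNC / 56-bit CPVC split and all names below are assumptions made for illustration; this report does not state the layout, and the snippet is not taken from the QEMU or kernel patches. Only the CPNC value 4 comes from the description above.

```python
# Illustrative sketch only: the field widths and helper names are assumptions,
# not taken from the patches referenced in this bug.
CPNC_BITS = 8          # assumed width of the control program name code
CPVC_BITS = 56         # assumed width of the control program version code
CPNC_LINUX_KVM = 4     # value named in the bug description for Linux/KVM


def pack_diag318(cpnc, cpvc):
    """Combine CPNC (high bits) and CPVC (low bits) into one 64-bit integer."""
    if not 0 <= cpnc < (1 << CPNC_BITS):
        raise ValueError("cpnc does not fit in the assumed 8-bit field")
    if not 0 <= cpvc < (1 << CPVC_BITS):
        raise ValueError("cpvc does not fit in the assumed 56-bit field")
    return (cpnc << CPVC_BITS) | cpvc


def unpack_diag318(info):
    """Split a 64-bit value back into (cpnc, cpvc)."""
    return info >> CPVC_BITS, info & ((1 << CPVC_BITS) - 1)


info = pack_diag318(CPNC_LINUX_KVM, 0)
print(f"{info:#018x}")  # prints 0x0400000000000000
```

Under these assumptions, a guest reporting CPNC 4 with an empty CPVC would show up as a value whose top byte is 0x04, which is roughly what one would expect to find when checking the test case.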
-- 
You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1953338

Title:
  [UBUNTU 20.04] KVM hardware diagnose data improvements for guest kernel - qemu part
