On Fri, May 13, 2022 at 12:59:39PM +0200, Laszlo Ersek wrote:
> On 05/12/22 09:52, Richard W.M. Jones wrote:
> > In https://bugzilla.redhat.com/show_bug.cgi?id=2082806 we've been
> > tracking an insidious qemu bug which intermittently prevents the
> > libguestfs appliance from starting.  The symptoms are that SeaBIOS
> > starts and displays its messages, but the kernel isn't reached.  We
> > found that the kernel does in fact start, but when it tries to set up
> > page tables and jump to protected mode it gets a triple fault which
> > causes the emulated CPU in qemu to reset (qemu exits).
> >
> > This seems to only affect TCG (not KVM).
> >
> > Yesterday I found that this is caused by using -cpu max which enables
> > the "la57" feature (5-level page tables[0]), and that we can make the
> > problem go away using -cpu max,la57=off.  Note that I still don't
> > fully understand the qemu bug, so this is only a workaround.
> >
> > I chose to disable 5-level page tables for both TCG and KVM, partly to
> > make the patch simpler, and partly because I guess it's not a feature
> > (ie. 57 bit linear addresses) that is useful for the libguestfs
> > appliance case, where we have limited physical memory and no need to
> > run any programs with huge address spaces.
> >
> > I tested this by running both the direct & libvirt paths overnight.  I
> > expect that this patch will fail with old qemu/libvirt which doesn't
> > understand the "la57" feature, but this is only intended as a
> > temporary workaround.
> >
> > [0] Article about 5-level page tables as background:
> > https://lwn.net/Articles/717293/
> >
> > Thanks: Laszlo Ersek
> > Fixes:
> > https://answers.launchpad.net/ubuntu/+source/libguestfs/+question/701625
> > ---
> >  lib/launch-direct.c  | 15 +++++++++++++--
> >  lib/launch-libvirt.c |  7 +++++++
> >  2 files changed, 20 insertions(+), 2 deletions(-)
> >
> > diff --git a/lib/launch-direct.c b/lib/launch-direct.c
> > index c07a8d78f..ff0eaeb62 100644
> > --- a/lib/launch-direct.c
> > +++ b/lib/launch-direct.c
> > @@ -518,8 +518,19 @@ launch_direct (guestfs_h *g, void *datav, const char
> > *arg)
> >      } end_list ();
> >
> >    cpu_model = guestfs_int_get_cpu_model (has_kvm && !force_tcg);
> > -  if (cpu_model)
> > -    arg ("-cpu", cpu_model);
> > +  if (cpu_model) {
> > +#if defined(__x86_64__)
> > +    /* Temporary workaround for RHBZ#2082806 */
> > +    if (STREQ (cpu_model, "max")) {
> > +      start_list ("-cpu") {
> > +        append_list (cpu_model);
> > +        append_list ("la57=off");
> > +      } end_list ();
> > +    }
> > +    else
> > +#endif
> > +      arg ("-cpu", cpu_model);
> > +  }
> >
> >    if (g->smp > 1)
> >      arg_format ("-smp", "%d", g->smp);
> > diff --git a/lib/launch-libvirt.c b/lib/launch-libvirt.c
> > index 87da2f40e..03d69e027 100644
> > --- a/lib/launch-libvirt.c
> > +++ b/lib/launch-libvirt.c
> > @@ -1185,6 +1185,13 @@ construct_libvirt_xml_cpu (guestfs_h *g,
> >      else if (STREQ (cpu_model, "max")) {
> >        /* https://bugzilla.redhat.com/show_bug.cgi?id=1935572#c11 */
> >        attribute ("mode", "maximum");
> > +#if defined(__x86_64__)
> > +      /* Temporary workaround for RHBZ#2082806 */
> > +      start_element ("feature") {
> > +        attribute ("policy", "disable");
> > +        attribute ("name", "la57");
> > +      } end_element ();
> > +#endif
> >      }
> >      else
> >        single_element ("model", cpu_model);
> >
>
> Acked-by: Laszlo Ersek <[email protected]>

I pushed this as commit 59d7e6e017.

FYI these bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=2084566
https://bugzilla.redhat.com/show_bug.cgi?id=2084567
https://bugzilla.redhat.com/show_bug.cgi?id=2084568

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
