[Bug 1769053] Re: Ability to control phys-bits through libvirt

2023-04-18 Thread Christian Ehrhardt
While the Bugzilla case wasn't updated, this landed in v8.7.0 via a series around
https://gitlab.com/libvirt/libvirt/-/commit/e6c29f09e5b75d7a8d79ae670407060446282c78

libvirt v9.0.0 is in Ubuntu Lunar, so from now on one can control the
physical address bit settings in a defined way through libvirt.

See maxphysaddr in [1] for how to use that.

In the mid term, Ubuntu will consider no longer adding further variants
of the workaround, which provided machine types with the -hpb suffix to
allow larger guests.

[1]: https://libvirt.org/formatdomain.html#cpu-model-and-topology
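
To illustrate why a guest with more than 1 TiB of RAM runs into the
address width limit, here is a minimal C sketch. It assumes a guest CPU
that advertises 40 physical address bits (my assumption for the example;
the value is not stated in this bug), and the helper name is
hypothetical, not taken from libvirt or QEMU. Real guests also need room
for MMIO holes, so treat the result as a lower bound.

```c
#include <stdint.h>

/*
 * Hypothetical helper: smallest number of address bits such that every
 * byte address in the range 0 .. top_bytes-1 is representable.
 */
unsigned min_phys_bits(uint64_t top_bytes)
{
    unsigned bits = 0;

    /* Grow the width until 2^bits covers the requested range. */
    while (bits < 64 && ((uint64_t)1 << bits) < top_bytes) {
        bits++;
    }
    return bits;
}
```

With this helper, exactly 1 TiB (1ULL << 40) still fits into 40 bits,
while anything above it needs at least 41.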

** Changed in: libvirt (Ubuntu)
   Status: Triaged => Fix Released

-- 
You received this bug notification because you are a member of
qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1769053

Title:
  Ability to control phys-bits through libvirt

Status in libvirt:
  Confirmed
Status in QEMU:
  Invalid
Status in libvirt package in Ubuntu:
  Fix Released
Status in qemu package in Ubuntu:
  Invalid

Bug description:
  Attempting to start a KVM guest with more than 1TB of RAM fails.

  It looks like we might need some extra patches:
  https://lists.gnu.org/archive/html/qemu-discuss/2017-12/msg5.html

  ProblemType: Bug
  DistroRelease: Ubuntu 18.04
  Package: qemu-system-x86 1:2.11+dfsg-1ubuntu7
  ProcVersionSignature: Ubuntu 4.15.0-20.21-generic 4.15.17
  Uname: Linux 4.15.0-20-generic x86_64
  ApportVersion: 2.20.9-0ubuntu7
  Architecture: amd64
  CurrentDesktop: Unity:Unity7:ubuntu
  Date: Fri May  4 16:21:14 2018
  InstallationDate: Installed on 2017-04-05 (393 days ago)
  InstallationMedia: Ubuntu 16.10 "Yakkety Yak" - Release amd64 (20161012.2)
  MachineType: Dell Inc. XPS 13 9360
  ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-4.15.0-20-generic root=/dev/mapper/ubuntu--vg-root ro quiet splash transparent_hugepage=madvise vt.handoff=1
  SourcePackage: qemu
  UpgradeStatus: Upgraded to bionic on 2018-04-30 (3 days ago)
  dmi.bios.date: 02/26/2018
  dmi.bios.vendor: Dell Inc.
  dmi.bios.version: 2.6.2
  dmi.board.name: 0PF86Y
  dmi.board.vendor: Dell Inc.
  dmi.board.version: A00
  dmi.chassis.type: 9
  dmi.chassis.vendor: Dell Inc.
  dmi.modalias: dmi:bvnDellInc.:bvr2.6.2:bd02/26/2018:svnDellInc.:pnXPS139360:pvr:rvnDellInc.:rn0PF86Y:rvrA00:cvnDellInc.:ct9:cvr:
  dmi.product.family: XPS
  dmi.product.name: XPS 13 9360
  dmi.sys.vendor: Dell Inc.

To manage notifications about this bug go to:
https://bugs.launchpad.net/libvirt/+bug/1769053/+subscriptions




Re: [RESEND PATCH v2] target/i386: Switch back XFRM value

2023-03-27 Thread Christian Ehrhardt
On Thu, Oct 27, 2022 at 2:36 AM Yang, Weijiang wrote:
>
>
> On 10/26/2022 7:57 PM, Zhong, Yang wrote:
> > The previous patch wrongly replaced FEAT_XSAVE_XCR0_{LO|HI} with
> > FEAT_XSAVE_XSS_{LO|HI} in CPUID(EAX=12,ECX=1):{ECX,EDX}, which made
> > SGX enclave only supported SSE and x87 feature(xfrm=0x3).
> >
> > Fixes: 301e90675c3f ("target/i386: Enable support for XSAVES based features")
> >
> > Signed-off-by: Yang Zhong 
> > ---
> >   target/i386/cpu.c | 4 ++--
> >   1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > index ad623d91e4..19aaed877b 100644
> > --- a/target/i386/cpu.c
> > +++ b/target/i386/cpu.c
> > @@ -5584,8 +5584,8 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
> >   } else {
> >   *eax &= env->features[FEAT_SGX_12_1_EAX];
> >   *ebx &= 0; /* ebx reserve */
> > -*ecx &= env->features[FEAT_XSAVE_XSS_LO];
> > -*edx &= env->features[FEAT_XSAVE_XSS_HI];
> > +*ecx &= env->features[FEAT_XSAVE_XCR0_LO];
> > +*edx &= env->features[FEAT_XSAVE_XCR0_HI];
>
> Oops, that's my fault to replace with wrong definitions, thanks for the fix!
>
> Reviewed-by:  Yang Weijiang 

Hi,
I do not have any background on this, but I stumbled over it and wondered:
is there any particular reason why this hasn't been applied yet?

It seems to fix an earlier mistake, was acked, and then ... silence.

> >
> >   /* FP and SSE are always allowed regardless of XSAVE/XCR0. */
> >   *ecx |= XSTATE_FP_MASK | XSTATE_SSE_MASK;
>


-- 
Christian Ehrhardt
Senior Staff Engineer, Ubuntu Server
Canonical Ltd
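
For readers without the background, the effect described in the quoted
patch can be sketched in plain C. This is only an illustration under my
own assumptions: the function name is hypothetical, the masks are passed
in as plain integers instead of QEMU's env->features[] words, and the
XSS word is assumed to contain none of the user-state bits.

```c
#include <stdint.h>

#define XSTATE_FP_MASK  (1u << 0)  /* x87 state component */
#define XSTATE_SSE_MASK (1u << 1)  /* SSE state component */

/*
 * Hypothetical stand-in for the ECX half of CPUID(EAX=0x12,ECX=1):
 * mask the host-reported value with a guest feature word, then
 * force-enable FP and SSE as the quoted code does.
 */
uint32_t sgx_xfrm_lo(uint32_t host_ecx, uint32_t feature_mask)
{
    uint32_t ecx = host_ecx & feature_mask;

    ecx |= XSTATE_FP_MASK | XSTATE_SSE_MASK;
    return ecx;
}
```

Masking with a word that has no user-state bits set leaves only the
forced FP and SSE bits (0x3), whereas masking with an XCR0-style word
such as 0x7 keeps bit 2 as well.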



Re: [PATCH] acpi: cpuhp: fix guest-visible maximum access size to the legacy reg block

2023-03-02 Thread Christian Ehrhardt
On Wed, Mar 1, 2023 at 9:04 AM Laszlo Ersek wrote:
>
> Hello Christian,
>
> On 3/1/23 08:17, Christian Ehrhardt wrote:
> > On Thu, Jan 5, 2023 at 8:14 AM Laszlo Ersek  wrote:
> >>
> >> On 1/4/23 13:35, Michael S. Tsirkin wrote:
> >>> On Wed, Jan 04, 2023 at 10:01:38AM +0100, Laszlo Ersek wrote:
> >>>> The modern ACPI CPU hotplug interface was introduced in the following
> >>>> series (aa1dd39ca307..679dd1a957df), released in v2.7.0:
> >>>>
> >>>>   1  abd49bc2ed2f docs: update ACPI CPU hotplug spec with new protocol
> >>>>   2  16bcab97eb9f pc: piix4/ich9: add 'cpu-hotplug-legacy' property
> >>>>   3  5e1b5d93887b acpi: cpuhp: add CPU devices AML with _STA method
> >>>>   4  ac35f13ba8f8 pc: acpi: introduce AcpiDeviceIfClass.madt_cpu hook
> >>>>   5  d2238cb6781d acpi: cpuhp: implement hot-add parts of CPU hotplug
> >>>>   interface
> >>>>   6  8872c25a26cc acpi: cpuhp: implement hot-remove parts of CPU hotplug
> >>>>   interface
> >>>>   7  76623d00ae57 acpi: cpuhp: add cpu._OST handling
> >>>>   8  679dd1a957df pc: use new CPU hotplug interface since 2.7 machine type
> >>>>
> > ...
> >>
> >> The solution to the riddle
> >
> > Hi,
> > just to add to this nicely convoluted case an FYI to everyone involved
> > back then,
> > the fix seems to have caused a regression [1] in - as far as I've
> > found - an edge case.
> >
> > [1]: https://gitlab.com/qemu-project/qemu/-/issues/1520
>
> After reading the gitlab case, here's my theory on it:
>
> - Without the patch applied, the CPU hotplug register block in QEMU is
> broken. Effectively, it has *always* been broken; to put it differently,
> you have most likely *never* seen a QEMU in which the CPU hotplug
> register block was not broken. The reason is that the only QEMU release
> without the breakage (as far as a guest could see it!) was v5.0.0, but
> it got exposed to the guest as early as v5.1.0 (IOW, in the 5.* series,
> the first stable release already exposed the issue), and the symptom has
> existed since (up to and including 7.2).
>
> - With the register block broken, OVMF's multiprocessing is broken, and
> the random chaos just happens to play out in a way that makes OVMF think
> it's running on a uniprocessor system.
>
> - With the register block *fixed* (commit dab30fbe applied), OVMF
> actually boots up your VCPUs. With MT-TCG, this translates to as many
> host-side VCPU threads running in your QEMU process as you have VCPUs.
>
> - Furthermore, if your OVMF build includes the SMM driver stack, then
> each UEFI variable update will require all VCPUs to enter SMM. All VCPUs
> entering SMM is a "thundering herd" event, so it seriously spins up all
> your host-side threads. (I assume the SMM-enabled binaries are what you
> refer to as "signed OVMF cases" in the gitlab ticket.)
>
> - If you overcommit the VCPUs (#vcpus > #pcpus), then your host-side
> threads will be competing for PCPUs. On s390x, there is apparently some
> bottleneck in QEMU's locking or in the host kernel or wherever else that
> penalizes (#threads > #pcpus) heavily, while on other host arches, the
> penalty is (apparently) not as severe.
>
> So, the QEMU fix actually "only exposes" the high penalty of the MT-TCG
> VCPU thread overcommit that appears characteristic of s390x hosts.
> You've not seen this symptom before because, regardless of how many
> VCPUs you've specified in the past, OVMF has never actually attempted to
> bring those up, due to the hotplug regblock breakage "masking" the
> actual VCPU counts (the present-at-boot VCPU count and the possible max
> VCPU count).

Thank you for the detailed analysis. If we can confirm this, we can
close the case as "it is odd that the penalty is so high, but =>
Won't Fix / Works as Intended".

> Here's a test you could try: go back to QEMU v5.0.0 *precisely*, and try
> to reproduce the symptom. I expect that it should reproduce.

v5.0.0 - 1 host cpu vs 2 vcpu - 58.47s
v5.0.0 - 1 host cpu vs 1 vcpu -  5.33s
v5.0.0 - 2 host cpu vs 2 vcpu -  5.27s
v5.1.0 - 1 host cpu vs 2 vcpu -  7.18s
v5.1.0 - 1 host cpu vs 1 vcpu -  5.22s
v5.1.0 - 2 host cpu vs 2 vcpu -  5.40s

Yes, v5.0.0 behaves exactly like the recent master branch has since your fix.
And v5.1.0 no longer does, just as you predicted.

> Here's another test you can try: with latest QEMU, boot an x86 Linux
> guest, but using SeaBIOS, not OVMF, on your s390x host. Then, in the
> Linux guest, run as many busy loops (e.g. i

Re: [PATCH] acpi: cpuhp: fix guest-visible maximum access size to the legacy reg block

2023-02-28 Thread Christian Ehrhardt
On Thu, Jan 5, 2023 at 8:14 AM Laszlo Ersek wrote:
>
> On 1/4/23 13:35, Michael S. Tsirkin wrote:
> > On Wed, Jan 04, 2023 at 10:01:38AM +0100, Laszlo Ersek wrote:
> >> The modern ACPI CPU hotplug interface was introduced in the following
> >> series (aa1dd39ca307..679dd1a957df), released in v2.7.0:
> >>
> >>   1  abd49bc2ed2f docs: update ACPI CPU hotplug spec with new protocol
> >>   2  16bcab97eb9f pc: piix4/ich9: add 'cpu-hotplug-legacy' property
> >>   3  5e1b5d93887b acpi: cpuhp: add CPU devices AML with _STA method
> >>   4  ac35f13ba8f8 pc: acpi: introduce AcpiDeviceIfClass.madt_cpu hook
> >>   5  d2238cb6781d acpi: cpuhp: implement hot-add parts of CPU hotplug
> >>   interface
> >>   6  8872c25a26cc acpi: cpuhp: implement hot-remove parts of CPU hotplug
> >>   interface
> >>   7  76623d00ae57 acpi: cpuhp: add cpu._OST handling
> >>   8  679dd1a957df pc: use new CPU hotplug interface since 2.7 machine type
> >>
...
>
> The solution to the riddle

Hi,
just to add to this nicely convoluted case an FYI to everyone involved
back then,
the fix seems to have caused a regression [1] in - as far as I've
found - an edge case.

[1]: https://gitlab.com/qemu-project/qemu/-/issues/1520

...

> Laszlo
>
>


-- 
Christian Ehrhardt
Senior Staff Engineer, Ubuntu Server
Canonical Ltd



Re: [PATCH-for-7.0] build: disable fcf-protection on -march=486 -m16

2022-03-24 Thread Christian Ehrhardt
On Wed, Mar 23, 2022 at 11:54 AM Philippe Mathieu-Daudé wrote:
>
> On 23/3/22 10:07, christian.ehrha...@canonical.com wrote:
> > From: Christian Ehrhardt 
> >
> > Some of the ROMs build with -march=i486 -m16, which is incompatible
> > with -fcf-protection. That option in turn can be enabled by default,
> > for example in Ubuntu [1].
> > That causes:
> >   cc1: error: ‘-fcf-protection’ is not compatible with this target
> >
> > Since -fcf-protection cannot work with -march=i486 -m16, override it
> > to "none" regardless of whether it was set, provided the compiler
> > knows the option, so that the build works reliably.
> >
> > Fixes: https://gitlab.com/qemu-project/qemu/-/issues/889
> >
> > [1]: https://wiki.ubuntu.com/ToolChain/CompilerFlags#A-fcf-protection
> >
> > Signed-off-by: Christian Ehrhardt 
> > ---
> >   pc-bios/optionrom/Makefile | 4 ++++
> >   1 file changed, 4 insertions(+)
>
> Reviewed-by: Philippe Mathieu-Daudé 

Thank you for the reviews, Thomas and Philippe!
For additional testing beyond my local build checks, all the GitLab CI
jobs at [1] passed for this as well.

[1]: https://gitlab.com/paelzer/qemu/-/pipelines/498917375

-- 
Christian Ehrhardt
Staff Engineer, Ubuntu Server
Canonical Ltd



[PATCH] build: disable fcf-protection on -march=486 -m16

2022-03-23 Thread christian . ehrhardt
From: Christian Ehrhardt 

Some of the ROMs build with -march=i486 -m16, which is incompatible
with -fcf-protection. That option in turn can be enabled by default, for
example in Ubuntu [1].
That causes:
 cc1: error: ‘-fcf-protection’ is not compatible with this target

Since -fcf-protection cannot work with -march=i486 -m16, override it to
"none" regardless of whether it was set, provided the compiler knows the
option, so that the build works reliably.

Fixes: https://gitlab.com/qemu-project/qemu/-/issues/889

[1]: https://wiki.ubuntu.com/ToolChain/CompilerFlags#A-fcf-protection

Signed-off-by: Christian Ehrhardt 
---
 pc-bios/optionrom/Makefile | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/pc-bios/optionrom/Makefile b/pc-bios/optionrom/Makefile
index 5d55d25acc..f1ef898073 100644
--- a/pc-bios/optionrom/Makefile
+++ b/pc-bios/optionrom/Makefile
@@ -14,6 +14,10 @@ cc-option = $(if $(shell $(CC) $1 -c -o /dev/null -xc /dev/null >/dev/null 2>&1
 
 override CFLAGS += -march=i486 -Wall
 
+# If -fcf-protection is enabled in flags or compiler defaults that will
+# conflict with -march=i486
+override CFLAGS += $(call cc-option, -fcf-protection=none)
+
 # Flags for dependency generation
 override CPPFLAGS += -MMD -MP -MT $@ -MF $(@D)/$(*F).d
 
-- 
2.35.1




Re: [PATCH] tcg: Remove dh_alias indirection for dh_typecode

2022-02-17 Thread Christian Ehrhardt
On Thu, Feb 17, 2022 at 4:48 AM Richard Henderson wrote:
>
> The dh_alias redirect is intended to handle TCG types as distinguished
> from C types.  TCG does not distinguish signed int from unsigned int,
> because they are the same size.  However, we need to retain this
> distinction for dh_typecode, lest we fail to extend abi types properly
> for the host call parameters.

Thank you Richard and Keith for the fix for
- https://github.com/keith-packard/snek/issues/58
- https://gitlab.com/qemu-project/qemu/-/issues/876

I applied it and tested it on s390x with the workload that originally exposed the bug.
With QEMU 6.2 plus this patch, I ran the full test suite of snek-arm on
s390x and it works great now.
If you like, feel free to add any combination of the following when committing:

Fixes: #876
Fixes: 7319d83a735 ("tcg: Combine dh_is_64bit and dh_is_signed to dh_typecode")
Reported-by: Christian Ehrhardt 
Tested-by: Christian Ehrhardt 

> This bug was detected when running the 'arm' emulator on an s390
> system. The s390 uses TCG_TARGET_EXTEND_ARGS which triggers code
> in tcg_gen_callN to extend 32 bit values to 64 bits; the incorrect
> sign data in the typemask for each argument caused the values to be
> extended as unsigned values.
>
> This simple program exhibits the problem:
>
> #include <stdio.h>
> #include <stdlib.h>
>
> static volatile int num = -9;
> static volatile int den = -5;
>
> int
> main(void)
> {
>     int quo = num / den;
>     printf("num %d den %d quo %d\n", num, den, quo);
>     exit(0);
> }
>
> When run on the broken qemu, this results in:
>
> num -9 den -5 quo 0
>
> The correct result is:
>
> num -9 den -5 quo 1
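
One plausible way to see how the wrong quotient arises is to model the
widening of the 32-bit arguments to 64 bits in plain C. This is only a
sketch under my own assumptions, with hypothetical function names; it
does not reproduce what tcg_gen_callN or the helper actually execute.

```c
#include <stdint.h>

/* Widen a 32-bit value as if it were unsigned (the buggy case). */
int64_t widen_as_unsigned(int32_t v)
{
    return (int64_t)(uint32_t)v;
}

/* Widen with sign extension (what a C 'int' argument needs). */
int64_t widen_as_signed(int32_t v)
{
    return (int64_t)v;
}

/* Divide in 64 bits after widening both operands. */
int64_t widened_div(int32_t num, int32_t den, int sign_extend)
{
    int64_t n = sign_extend ? widen_as_signed(num) : widen_as_unsigned(num);
    int64_t d = sign_extend ? widen_as_signed(den) : widen_as_unsigned(den);

    return n / d;
}
```

With zero extension, -9 and -5 become 4294967287 and 4294967291, so the
quotient is 0; with sign extension the quotient is 1, matching the two
outputs shown above.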
>
> Reported-by: Keith Packard 
> Signed-off-by: Richard Henderson 
> ---
>  include/exec/helper-head.h   | 19 ++-
>  target/hppa/helper.h |  2 ++
>  target/i386/ops_sse_header.h |  3 +++
>  target/m68k/helper.h |  1 +
>  target/ppc/helper.h  |  3 +++
>  5 files changed, 19 insertions(+), 9 deletions(-)
>
> diff --git a/include/exec/helper-head.h b/include/exec/helper-head.h
> index b974eb394a..734af067fe 100644
> --- a/include/exec/helper-head.h
> +++ b/include/exec/helper-head.h
> @@ -53,13 +53,16 @@
>  # ifdef TARGET_LONG_BITS
>  #  if TARGET_LONG_BITS == 32
>  #   define dh_alias_tl i32
> +#   define dh_typecode_tl dh_typecode_i32
>  #  else
>  #   define dh_alias_tl i64
> +#   define dh_typecode_tl dh_typecode_i64
>  #  endif
>  # endif
> -# define dh_alias_env ptr
>  # define dh_ctype_tl target_ulong
> +# define dh_alias_env ptr
>  # define dh_ctype_env CPUArchState *
> +# define dh_typecode_env dh_typecode_ptr
>  #endif
>
>  /* We can't use glue() here because it falls foul of C preprocessor
> @@ -92,18 +95,16 @@
>  #define dh_typecode_i64 4
>  #define dh_typecode_s64 5
>  #define dh_typecode_ptr 6
> -#define dh_typecode(t) glue(dh_typecode_, dh_alias(t))
> +#define dh_typecode_int dh_typecode_s32
> +#define dh_typecode_f16 dh_typecode_i32
> +#define dh_typecode_f32 dh_typecode_i32
> +#define dh_typecode_f64 dh_typecode_i64
> +#define dh_typecode_cptr dh_typecode_ptr
> +#define dh_typecode(t) dh_typecode_##t
>
>  #define dh_callflag_i32  0
> -#define dh_callflag_s32  0
> -#define dh_callflag_int  0
>  #define dh_callflag_i64  0
> -#define dh_callflag_s64  0
> -#define dh_callflag_f16  0
> -#define dh_callflag_f32  0
> -#define dh_callflag_f64  0
>  #define dh_callflag_ptr  0
> -#define dh_callflag_cptr dh_callflag_ptr
>  #define dh_callflag_void 0
>  #define dh_callflag_noreturn TCG_CALL_NO_RETURN
>  #define dh_callflag(t) glue(dh_callflag_, dh_alias(t))
> diff --git a/target/hppa/helper.h b/target/hppa/helper.h
> index fe8a9ce493..c7e35ce8c7 100644
> --- a/target/hppa/helper.h
> +++ b/target/hppa/helper.h
> @@ -1,7 +1,9 @@
>  #if TARGET_REGISTER_BITS == 64
>  # define dh_alias_tr i64
> +# define dh_typecode_tr  dh_typecode_i64
>  #else
>  # define dh_alias_tr i32
> +# define dh_typecode_tr  dh_typecode_i32
>  #endif
>  #define dh_ctype_tr  target_ureg
>
> diff --git a/target/i386/ops_sse_header.h b/target/i386/ops_sse_header.h
> index e68af5c403..cef28f2aae 100644
> --- a/target/i386/ops_sse_header.h
> +++ b/target/i386/ops_sse_header.h
> @@ -30,6 +30,9 @@
>  #define dh_ctype_Reg Reg *
>  #define dh_ctype_ZMMReg ZMMReg *
>  #define dh_ctype_MMXReg MMXReg *
> +#define dh_typecode_Reg dh_typecode_ptr
> +#define dh_typecode_ZMMReg dh_typecode_ptr
> +#define dh_typecode_MMXReg dh_typecode_ptr
>
>  DEF_HELPER_3(glue(psrlw, SUFFIX), void, env, Reg, Reg)
>  DEF_HELPER_3(glue(psraw, SUFFIX), void, env, Reg, Reg)

[PATCH] tools/virtiofsd: Add rseq syscall to the seccomp allowlist

2022-02-09 Thread christian . ehrhardt
From: Christian Ehrhardt 

virtiofsd currently crashes when used with glibc 2.35.
That is because glibc now issues the rseq system call on every thread
creation [1][2].

[1]: https://www.efficios.com/blog/2019/02/08/linux-restartable-sequences/
[2]: https://sourceware.org/pipermail/libc-alpha/2022-February/136040.html

This does not happen at daemon start, but when a guest connects:

/usr/lib/qemu/virtiofsd -f --socket-path=/tmp/testvfsd -o sandbox=chroot \
-o source=/var/guests/j-virtiofs --socket-group=kvm
virtio_session_mount: Waiting for vhost-user socket connection...
# start ok, now guest will connect
virtio_session_mount: Received vhost-user socket connection
virtio_loop: Entry
fv_queue_set_started: qidx=0 started=1
fv_queue_set_started: qidx=1 started=1
Bad system call (core dumped)

We have to put rseq on the seccomp allowlist to keep the daemon from
crashing in this case.

Reported-by: Michael Hudson-Doyle 
Signed-off-by: Christian Ehrhardt 
---
 tools/virtiofsd/passthrough_seccomp.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/virtiofsd/passthrough_seccomp.c b/tools/virtiofsd/passthrough_seccomp.c
index a3ce9f898d..21b8f53bd9 100644
--- a/tools/virtiofsd/passthrough_seccomp.c
+++ b/tools/virtiofsd/passthrough_seccomp.c
@@ -116,6 +116,9 @@ static const int syscall_allowlist[] = {
 SCMP_SYS(write),
 SCMP_SYS(writev),
 SCMP_SYS(umask),
+#ifdef __NR_rseq
+SCMP_SYS(rseq), /* required since glibc 2.35 */
+#endif
 };
 
 /* Syscalls used when --syslog is enabled */
-- 
2.35.0
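
As a rough illustration of the allowlist idea in the patch above, here
is a small standalone C sketch. It is not virtiofsd's code and does not
use libseccomp: the table and function names are hypothetical, and the
lookup only models "is this syscall number on the list". It assumes a
Linux system where <sys/syscall.h> provides the SYS_* constants.

```c
#include <stddef.h>
#include <sys/syscall.h>

/* Illustrative allowlist; not virtiofsd's real table. */
static const long example_allowlist[] = {
    SYS_write,
    SYS_writev,
    SYS_umask,
#ifdef SYS_rseq
    SYS_rseq, /* issued by glibc >= 2.35 on thread creation */
#endif
};

/* Returns 1 if the syscall number is on the list, 0 otherwise. */
int example_syscall_allowed(long nr)
{
    size_t n = sizeof(example_allowlist) / sizeof(example_allowlist[0]);

    for (size_t i = 0; i < n; i++) {
        if (example_allowlist[i] == nr) {
            return 1;
        }
    }
    return 0;
}
```

The #ifdef guard mirrors the one in the patch, so the table still
compiles against older kernel headers that do not define rseq.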




[Bug 1749393] Re: sbrk() not working under qemu-user with a PIE-compiled binary?

2022-01-03 Thread Christian Ehrhardt
Thank you, Frank, for that extra confirmation.
By now all the blockers on the other fixed bug are resolved as well. I expect
this to be released as soon as the SRU team is back from the Christmas shutdown.

-- 
https://bugs.launchpad.net/bugs/1749393

Title:
  sbrk() not working under qemu-user with a PIE-compiled binary?

Status in QEMU:
  Fix Released
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Focal:
  Fix Committed

Bug description:
  [Impact]

   * The space currently reserved can be too small, and we can end up
     with no space at all for BRK. This can happen in any case, but is
     much more likely with the now common PIE binaries.

   * Backport the upstream fix, which reserves a bit more space while
     loading and gives it back after the interpreter and stack are loaded.

  [Test Plan]

   * On x86 run:
  sudo apt install -y qemu-user-static docker.io
  sudo docker run --rm arm64v8/debian:bullseye bash -c 'apt update && apt install -y wget'
  ...
  Running hooks in /etc/ca-certificates/update.d...
  done.
  Errors were encountered while processing:
   libc-bin
  E: Sub-process /usr/bin/dpkg returned an error code (1)

  
  Second test from bug 1928075

  $ sudo qemu-debootstrap --arch=arm64 bullseye bullseye-arm64 http://ftp.debian.org/debian

  In the bad case this fails like:
  W: Failure trying to run: /sbin/ldconfig
  W: See //debootstrap/debootstrap.log for details

  And in that log file you'll see the segfault
  $ tail -n 2 bullseye-arm64/debootstrap/debootstrap.log
  qemu: uncaught target signal 11 (Segmentation fault) - core dumped
  Segmentation fault (core dumped)

  [Where problems could occur]

   * Regressions would be around use cases of linux-user, that is,
     emulation not of a system but of individual binaries.
     It is commonly used for cross-tests and cross-builds, so that is
     the area to watch for regressions.

  [Other Info]

   * n/a

  ---

  In Debian unstable, we recently switched bash to be a PIE-compiled
  binary (for hardening). Unfortunately this resulted in bash being
  broken when run under qemu-user (for all target architectures, host
  being amd64 for me).

  $ sudo chroot /srv/chroots/sid-i386/ qemu-i386-static /bin/bash
  bash: xmalloc: .././shell.c:1709: cannot allocate 10 bytes (0 bytes allocated)

  bash has its own malloc implementation based on sbrk():
  https://git.savannah.gnu.org/cgit/bash.git/tree/lib/malloc/malloc.c

  When we disable this internal implementation and rely on glibc's
  malloc, then everything is fine. But it might be that glibc has a
  fallback when sbrk() is not working properly and it might hide the
  underlying problem in qemu-user.
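
  A direct probe of sbrk(), bypassing any malloc fallback, could look
  like the following C sketch. The function name is hypothetical and the
  probe is only an illustration, not the test used in this bug.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* make sbrk() visible under strict -std= modes */
#endif
#include <stdint.h>
#include <unistd.h>

/* Returns 1 if the program break can be grown by 'bytes', else 0. */
int brk_can_grow(intptr_t bytes)
{
    /* sbrk(0) reports the current break without changing it. */
    if (sbrk(0) == (void *)-1) {
        return 0;
    }
    /* Try to extend the heap; (void *)-1 signals failure. */
    if (sbrk(bytes) == (void *)-1) {
        return 0;
    }
    /* Give the memory back so the probe has no lasting effect. */
    sbrk(-bytes);
    return 1;
}
```

  Run natively, such a probe is expected to succeed for a small size
  like 4096; a failure under qemu-user would point at the emulated
  brk handling rather than at the allocator.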

  This issue has also been reported to the bash upstream author and he
  suggested that the issue might be in qemu-user so I'm opening a ticket
  here. Here's the discussion with the bash upstream author:
  https://lists.gnu.org/archive/html/bug-bash/2018-02/threads.html#00080

  You can find the problematic bash binary in that .deb file:
  http://snapshot.debian.org/archive/debian/20180206T154716Z/pool/main/b/bash/bash_4.4.18-1_i386.deb

  The version of qemu I have been using is 2.11 (Debian package
  qemu-user-static version 1:2.11+dfsg-1) but I have had reports that the
  problem is reproducible with older versions (back to 2.8 at least).

  Here are the related Debian bug reports:
  https://bugs.debian.org/889869
  https://bugs.debian.org/865599

  It's worth noting that bash used to have this problem (when compiled as
  a PIE binary) even when run directly but then something got fixed in the
  kernel and now the problem only appears when run under qemu-user:
  https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1518483

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1749393/+subscriptions




[Bug 1749393] Re: sbrk() not working under qemu-user with a PIE-compiled binary?

2021-12-16 Thread Christian Ehrhardt
FYI: the release of this is delayed by the slow verification of bug
https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1929926

-- 
https://bugs.launchpad.net/bugs/1749393

Title:
  sbrk() not working under qemu-user with a PIE-compiled binary?

Status in QEMU:
  Fix Released
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Focal:
  Fix Committed





[Bug 1749393] Re: sbrk() not working under qemu-user with a PIE-compiled binary?

2021-11-30 Thread Christian Ehrhardt
Focal

old

$ sudo apt install --reinstall qemu-user-static=1:4.2-3ubuntu6.18
Reading package lists... Done
Building dependency tree   
Reading state information... Done
0 upgraded, 0 newly installed, 1 reinstalled, 0 to remove and 0 not upgraded.
Need to get 21.3 MB of archives.
After this operation, 0 B of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu focal-updates/universe amd64 
qemu-user-static amd64 1:4.2-3ubuntu6.18 [21.3 MB]
Fetched 21.3 MB in 1s (16.4 MB/s)   
(Reading database ... 126154 files and directories currently installed.)
Preparing to unpack .../qemu-user-static_1%3a4.2-3ubuntu6.18_amd64.deb ...
Unpacking qemu-user-static (1:4.2-3ubuntu6.18) over (1:4.2-3ubuntu6.18) ...
Setting up qemu-user-static (1:4.2-3ubuntu6.18) ...
Processing triggers for man-db (2.9.1-1) ...

ubuntu@f-1928075-qemuuserstatic:~$ sudo chroot /home/ubuntu/bullseye-arm64 /bin/sh /debootstrap/debootstrap --second-stage
W: Failure trying to run:  /sbin/ldconfig
W: See //debootstrap/debootstrap.log for details
ubuntu@f-1928075-qemuuserstatic:~$ tail -n 2 bullseye-arm64/debootstrap/debootstrap.log
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
Segmentation fault (core dumped)

Upgrade


ubuntu@f-1928075-qemuuserstatic:~$ apt-cache policy qemu-user-static
qemu-user-static:
  Installed: 1:4.2-3ubuntu6.18
  Candidate: 1:4.2-3ubuntu6.19
  Version table:
     1:4.2-3ubuntu6.19 500
        500 http://archive.ubuntu.com/ubuntu focal-proposed/universe amd64 Packages
 *** 1:4.2-3ubuntu6.18 500
        500 http://archive.ubuntu.com/ubuntu focal-updates/universe amd64 Packages
        100 /var/lib/dpkg/status
     1:4.2-3ubuntu6.17 500
        500 http://security.ubuntu.com/ubuntu focal-security/universe amd64 Packages
     1:4.2-3ubuntu6 500
        500 http://archive.ubuntu.com/ubuntu focal/universe amd64 Packages
ubuntu@f-1928075-qemuuserstatic:~$ sudo apt install qemu-user-static
Reading package lists... Done
Building dependency tree   
Reading state information... Done
The following packages will be upgraded:
  qemu-user-static
1 upgraded, 0 newly installed, 0 to remove and 65 not upgraded.
Need to get 21.3 MB of archives.
After this operation, 0 B of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu focal-proposed/universe amd64 
qemu-user-static amd64 1:4.2-3ubuntu6.19 [21.3 MB]
Fetched 21.3 MB in 2s (9092 kB/s)   
(Reading database ... 126160 files and directories currently installed.)
Preparing to unpack .../qemu-user-static_1%3a4.2-3ubuntu6.19_amd64.deb ...
Unpacking qemu-user-static (1:4.2-3ubuntu6.19) over (1:4.2-3ubuntu6.18) ...
Setting up qemu-user-static (1:4.2-3ubuntu6.19) ...
Processing triggers for man-db (2.9.1-1) ...
ubuntu@f-1928075-qemuuserstatic:~$ sudo update-binfmts --test --display qemu-aarch64
qemu-aarch64 (enabled):
     package = qemu-user-static
        type = magic
      offset = 0
       magic = \x7f\x45\x4c\x46\x02\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\xb7\x00
        mask = \xff\xff\xff\xff\xff\xff\xff\x00\xff\xff\xff\xff\xff\xff\xff\xff\xfe\xff\xff\xff
 interpreter = /usr/bin/qemu-aarch64-static
    detector =


Test with new version

ubuntu@f-1928075-qemuuserstatic:~$ sudo chroot /home/ubuntu/bullseye-arm64 /bin/sh /debootstrap/debootstrap --second-stage
I: Installing core packages...
W: Failure trying to run:  dpkg --force-depends --install /var/cache/apt/archives/base-passwd_3.5.51_arm64.deb
W: See //debootstrap/debootstrap.log for details
ubuntu@f-1928075-qemuuserstatic:~$ tail -n 2 bullseye-arm64/debootstrap/debootstrap.log
dpkg: error: parsing file '/var/lib/dpkg/status' near line 5 package 'dpkg':
 duplicate value for 'Package' field


That is the good case and also a full run now completes.

$ sudo rm -rf bullseye-arm64; sudo qemu-debootstrap --arch=arm64 bullseye bullseye-arm64 http://ftp.debian.org/debian
I: Running command: debootstrap --arch arm64 --foreign bullseye bullseye-arm64 http://ftp.debian.org/debian
W: Cannot check Release signature; keyring file not available /usr/share/keyrings/debian-archive-keyring.gpg
I: Retrieving InRelease 
I: Retrieving Packages 
...
I: Configuring tasksel...
I: Configuring libc-bin...
I: Base system installed successfully.


I can't run the docker test due to networking restrictions, but it was
the same fault and the same fix - so that should be ok. If anyone else
can test -proposed with docker please feel free to do so.

** Tags removed: verification-needed verification-needed-focal
** Tags added: verification-done verification-done-focal

-- 
https://bugs.launchpad.net/bugs/1749393

Title:
  sbrk() not working under qemu-user with a PIE-compiled binary?

Status in QEMU:
  Fix Released
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Focal:
  Fix Committed

Bug description:
  [Impact]

 

[Bug 1749393] Re: sbrk() not working under qemu-user with a PIE-compiled binary?

2021-11-30 Thread Christian Ehrhardt
Uploaded to F-unapproved, waiting for the SRU team to accept it.

-- 
https://bugs.launchpad.net/bugs/1749393

Title:
  sbrk() not working under qemu-user with a PIE-compiled binary?

Status in QEMU:
  Fix Released
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Focal:
  In Progress

Bug description:
  [Impact]

   * The space currently reserved can be too small, and we can end up
     with no space at all for BRK. This can happen in any case, but is
     much more likely with the now common PIE binaries.

   * Backport the upstream fix, which reserves a bit more space while loading
     and gives it back after the interpreter and stack are loaded.

  [Test Plan]

   * On x86 run:
  sudo apt install -y qemu-user-static docker.io
  sudo docker run --rm arm64v8/debian:bullseye bash -c 'apt update && apt install -y wget'
  ...
  Running hooks in /etc/ca-certificates/update.d...
  done.
  Errors were encountered while processing:
   libc-bin
  E: Sub-process /usr/bin/dpkg returned an error code (1)

  
  Second test from bug 1928075

  $ sudo qemu-debootstrap --arch=arm64 bullseye bullseye-arm64 http://ftp.debian.org/debian

  In the bad case this fails like:
  W: Failure trying to run: /sbin/ldconfig
  W: See //debootstrap/debootstrap.log for details

  And in that log file you'll see the segfault
  $ tail -n 2 bullseye-arm64/debootstrap/debootstrap.log
  qemu: uncaught target signal 11 (Segmentation fault) - core dumped
  Segmentation fault (core dumped)

  [Where problems could occur]

   * Regressions would be around use cases of linux-user, that is,
     emulation not of a system but of individual binaries.
     It is commonly used for cross-tests and cross-builds, so that is the
     space to watch for regressions.

  [Other Info]

   * n/a

  ---

  In Debian unstable, we recently switched bash to be a PIE-compiled
  binary (for hardening). Unfortunately this resulted in bash being
  broken when run under qemu-user (for all target architectures, host
  being amd64 for me).

  $ sudo chroot /srv/chroots/sid-i386/ qemu-i386-static /bin/bash
  bash: xmalloc: .././shell.c:1709: cannot allocate 10 bytes (0 bytes allocated)

  bash has its own malloc implementation based on sbrk():
  https://git.savannah.gnu.org/cgit/bash.git/tree/lib/malloc/malloc.c

  When we disable this internal implementation and rely on glibc's
  malloc, then everything is fine. But it might be that glibc has a
  fallback when sbrk() is not working properly and it might hide the
  underlying problem in qemu-user.
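
To check sbrk() directly, without going through any malloc implementation, a small probe along the following lines could be built for the target architecture and run under qemu-user. This is only a sketch; the function name brk_can_grow is an assumption for illustration and not part of bash, glibc, or QEMU.

```c
#define _DEFAULT_SOURCE /* expose sbrk() in glibc's <unistd.h> */
#include <stdint.h>
#include <unistd.h>

/*
 * Hypothetical probe: try to move the program break up by `bytes`.
 * Returns 1 if sbrk() succeeded and 0 if it failed.
 * The space is deliberately not given back, to keep the probe trivial.
 */
int brk_can_grow(intptr_t bytes)
{
    void *prev = sbrk(bytes);  /* the previous break on success */

    return prev != (void *)-1; /* (void *)-1 signals failure */
}
```

Given the error message above, one would expect brk_can_grow(4096) to return 0 in the broken setup and 1 once enough space is reserved for the break.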

  This issue has also been reported to the bash upstream author and he
  suggested that the issue might be in qemu-user so I'm opening a ticket here.
  Here's the discussion with the bash upstream author:
  https://lists.gnu.org/archive/html/bug-bash/2018-02/threads.html#00080

  You can find the problematic bash binary in that .deb file:
  http://snapshot.debian.org/archive/debian/20180206T154716Z/pool/main/b/bash/bash_4.4.18-1_i386.deb

  The version of qemu I have been using is 2.11 (Debian package qemu-
  user-static version 1:2.11+dfsg-1) but I have had reports that the
  problem is reproducible with older versions (back to 2.8 at least).

  Here are the related Debian bug reports:
  https://bugs.debian.org/889869
  https://bugs.debian.org/865599

  It's worth noting that bash used to have this problem (when compiled as a PIE
  binary) even when run directly but then something got fixed in the kernel and
  now the problem only appears when run under qemu-user:
  https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1518483

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1749393/+subscriptions




[Bug 1749393] Re: sbrk() not working under qemu-user with a PIE-compiled binary?

2021-11-30 Thread Christian Ehrhardt
** Changed in: qemu (Ubuntu Focal)
   Status: Triaged => In Progress

** Changed in: qemu (Ubuntu Focal)
 Assignee: (unassigned) => Christian Ehrhardt  (paelzer)

-- 
https://bugs.launchpad.net/bugs/1749393

Title:
  sbrk() not working under qemu-user with a PIE-compiled binary?

Status in QEMU:
  Fix Released
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Focal:
  In Progress





[Bug 1749393] Re: sbrk() not working under qemu-user with a PIE-compiled binary?

2021-11-30 Thread Christian Ehrhardt
SRU template updated, PPA rebuilt, Merge requests updated.
Also bundled another bug fix.

Waiting for MR review now.

-- 
https://bugs.launchpad.net/bugs/1749393

Title:
  sbrk() not working under qemu-user with a PIE-compiled binary?

Status in QEMU:
  Fix Released
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Focal:
  In Progress





[Bug 1749393] Re: sbrk() not working under qemu-user with a PIE-compiled binary?

2021-11-30 Thread Christian Ehrhardt
Hi,
sorry, this has fallen through the cracks, but bug 1928075 made me rediscover
it, and it is finally time to complete it.

** Tags added: server-next

** Description changed:

  [Impact]
  
-  * The current space reserved can be too small and we can end up
-with no space at all for BRK. It can happen to any case, but is
-much more likely with the now common PIE binaries.
+  * The current space reserved can be too small and we can end up
+    with no space at all for BRK. It can happen to any case, but is
+    much more likely with the now common PIE binaries.
  
-  * Backport the upstream fix which reserves a bit more space while loading
-and giving it back after interpreter and stack is loaded.
+  * Backport the upstream fix which reserves a bit more space while loading
+    and giving it back after interpreter and stack is loaded.
  
  [Test Plan]
  
-  * On x86 run:
+  * On x86 run:
  sudo apt install -y qemu-user-static docker.io
  sudo docker run --rm arm64v8/debian:bullseye bash -c 'apt update && apt 
install -y wget'
  ...
  Running hooks in /etc/ca-certificates/update.d...
  done.
  Errors were encountered while processing:
-  libc-bin
+  libc-bin
  E: Sub-process /usr/bin/dpkg returned an error code (1)
  
  
+ Second test from bug 1928075
+ 
+ $ sudo qemu-debootstrap --arch=arm64 bullseye bullseye-arm64
+ http://ftp.debian.org/debian
+ 
+ In the bad case this is failing like
+ W: Failure trying to run: /sbin/ldconfig
+ W: See //debootstrap/debootstrap.log for detail
+ 
+ And in that log file you'll see the segfault
+ $ tail -n 2 bullseye-arm64/debootstrap/debootstrap.log
+ qemu: uncaught target signal 11 (Segmentation fault) - core dumped
+ Segmentation fault (core dumped)
+ 
  [Where problems could occur]
  
-  * Regressions would be around use-cases of linux-user that is
-emulation not of a system but of binaries.
-Commonly uses for cross-tests and cross-builds so that is the
-space to watch for regressions
+  * Regressions would be around use-cases of linux-user that is
+    emulation not of a system but of binaries.
+    Commonly uses for cross-tests and cross-builds so that is the
+    space to watch for regressions
  
  [Other Info]
-  
-  * n/a
  
+  * n/a
  
  ---
  
  In Debian unstable, we recently switched bash to be a PIE-compiled
  binary (for hardening). Unfortunately this resulted in bash being broken
  when run under qemu-user (for all target architectures, host being amd64
  for me).
  
  $ sudo chroot /srv/chroots/sid-i386/ qemu-i386-static /bin/bash
  bash: xmalloc: .././shell.c:1709: cannot allocate 10 bytes (0 bytes allocated)
  
  bash has its own malloc implementation based on sbrk():
  https://git.savannah.gnu.org/cgit/bash.git/tree/lib/malloc/malloc.c
  
  When we disable this internal implementation and rely on glibc's malloc,
  then everything is fine. But it might be that glibc has a fallback when
  sbrk() is not working properly and it might hide the underlying problem
  in qemu-user.
  
  This issue has also been reported to the bash upstream author and he 
suggested that the issue might be in qemu-user so I'm opening a ticket here. 
Here's the discussion with the bash upstream author:
  https://lists.gnu.org/archive/html/bug-bash/2018-02/threads.html#00080
  
  You can find the problematic bash binary in that .deb file:
  
http://snapshot.debian.org/archive/debian/20180206T154716Z/pool/main/b/bash/bash_4.4.18-1_i386.deb
  
  The version of qemu I have been using is 2.11 (Debian package qemu-user-
  static version 1:2.11+dfsg-1) but I have had reports that the problem is
  reproducible with older versions (back to 2.8 at least).
  
  Here are the related Debian bug reports:
  https://bugs.debian.org/889869
  https://bugs.debian.org/865599
  
  It's worth noting that bash used to have this problem (when compiled as a PIE 
binary) even when run directly but then something got fixed in the kernel and 
now the problem only appears when run under qemu-user:
  https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1518483

-- 
https://bugs.launchpad.net/bugs/1749393

Title:
  sbrk() not working under qemu-user with a PIE-compiled binary?

Status in QEMU:
  Fix Released
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Focal:
  Triaged

Bug description:
  [Impact]

   * The current space reserved can be too small and we can end up
     with no space at all for BRK. It can happen to any case, but is
     much more likely with the now common PIE binaries.

   * Backport the upstream fix which reserves a bit more space while loading
     and giving it back after interpreter and stack is loaded.

  [Test Plan]

   * On x86 run:
  sudo apt install -y qemu-user-static docker.io
  sudo docker run --rm arm64v8/debian:bullseye bash -c 'apt update && apt 
install -y wget'
  ...
  Running hooks in 

[Bug 1749393] Re: sbrk() not working under qemu-user with a PIE-compiled binary?

2021-09-20 Thread Christian Ehrhardt
Yeah Sebastian, a new ticket (with a reference to this bug as being
similar) would be preferred.

** Changed in: qemu (Ubuntu)
 Assignee: Christian Ehrhardt  (paelzer) => (unassigned)

-- 
https://bugs.launchpad.net/bugs/1749393

Title:
  sbrk() not working under qemu-user with a PIE-compiled binary?

Status in QEMU:
  Fix Released
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Focal:
  Triaged

Bug description:
  [Impact]

   * The space currently reserved can be too small, and we can end up
     with no space at all for BRK. This can happen in any case, but is
     much more likely with the now common PIE binaries.

   * Backport the upstream fix, which reserves a bit more space while loading
     and gives it back after the interpreter and stack are loaded.

  [Test Plan]

   * On x86 run:
  sudo apt install -y qemu-user-static docker.io
  sudo docker run --rm arm64v8/debian:bullseye bash -c 'apt update && apt install -y wget'
  ...
  Running hooks in /etc/ca-certificates/update.d...
  done.
  Errors were encountered while processing:
   libc-bin
  E: Sub-process /usr/bin/dpkg returned an error code (1)

  
  [Where problems could occur]

   * Regressions would be around use cases of linux-user, that is,
     emulation not of a system but of individual binaries.
     It is commonly used for cross-tests and cross-builds, so that is the
     space to watch for regressions.

  [Other Info]
   
   * n/a


  ---

  In Debian unstable, we recently switched bash to be a PIE-compiled
  binary (for hardening). Unfortunately this resulted in bash being
  broken when run under qemu-user (for all target architectures, host
  being amd64 for me).

  $ sudo chroot /srv/chroots/sid-i386/ qemu-i386-static /bin/bash
  bash: xmalloc: .././shell.c:1709: cannot allocate 10 bytes (0 bytes allocated)

  bash has its own malloc implementation based on sbrk():
  https://git.savannah.gnu.org/cgit/bash.git/tree/lib/malloc/malloc.c

  When we disable this internal implementation and rely on glibc's
  malloc, then everything is fine. But it might be that glibc has a
  fallback when sbrk() is not working properly and it might hide the
  underlying problem in qemu-user.

  This issue has also been reported to the bash upstream author and he
  suggested that the issue might be in qemu-user so I'm opening a ticket here.
  Here's the discussion with the bash upstream author:
  https://lists.gnu.org/archive/html/bug-bash/2018-02/threads.html#00080

  You can find the problematic bash binary in that .deb file:
  http://snapshot.debian.org/archive/debian/20180206T154716Z/pool/main/b/bash/bash_4.4.18-1_i386.deb

  The version of qemu I have been using is 2.11 (Debian package qemu-
  user-static version 1:2.11+dfsg-1) but I have had reports that the
  problem is reproducible with older versions (back to 2.8 at least).

  Here are the related Debian bug reports:
  https://bugs.debian.org/889869
  https://bugs.debian.org/865599

  It's worth noting that bash used to have this problem (when compiled as a PIE
  binary) even when run directly but then something got fixed in the kernel and
  now the problem only appears when run under qemu-user:
  https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1518483

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1749393/+subscriptions




[Bug 1921664] Re: Coroutines are racy for risc64 emu on arm64 - crash on Assertion

2021-08-24 Thread Christian Ehrhardt
Hmm, thanks for the hint Thomas.

Of the two previously referenced builds (same source, different results):

[1] => built 2021-03-23 in Hirsute => works
[2] => built 2021-04-12 in Hirsute => fails

[1]: https://launchpad.net/ubuntu/+source/qemu/1:5.2+dfsg-9ubuntu1/+build/21196422
[2]: https://launchpad.net/~paelzer/+archive/ubuntu/lp-1921664-testbuilds-rebuildold/+build/21392458

The default flags changed in
  https://launchpad.net/ubuntu/+source/dpkg/1.20.7.1ubuntu4
and according to the build logs both ran with that.
Copy-Pasta from the log:
  dpkg (= 1.20.7.1ubuntu4),
=> In between those we did not switch the LTO default flags


For clarification: LTO is the default nowadays and we are not disabling it
generally in qemu. So yes, the builds are with LTO, but both the good and the
bad one are.

Although, looking at versions, I see we have:
- good case 10.2.1-23ubuntu2
- bad case  10.3.0-1ubuntu1

So maybe, while it wasn't the LTO default, something in 10.3 (maybe even
LTO since 10.3) is what is broken?

@Tommy - I don't have any of the test systems around anymore. If I built
you a no-LTO qemu for testing, which release would you need these days:
Hirsute, Impish, ...?

-- 
https://bugs.launchpad.net/bugs/1921664

Title:
  Coroutines are racy for risc64 emu on arm64 - crash on Assertion

Status in QEMU:
  Incomplete
Status in qemu package in Ubuntu:
  Incomplete

Bug description:
  Note: this could as well be "riscv64 on arm64" for being slow@slow and affect
  other architectures as well.

  The following case triggers on a Raspberry Pi4 running with arm64 on
  Ubuntu 21.04 [1][2]. It might trigger in other environments as well,
  but that is where we have seen it so far.

     $ wget https://github.com/carlosedp/riscv-bringup/releases/download/v1.0/UbuntuFocal-riscv64-QemuVM.tar.gz
 $ tar xzf UbuntuFocal-riscv64-QemuVM.tar.gz
 $ ./run_riscvVM.sh
  (wait ~2 minutes)
 [ OK ] Reached target Local File Systems (Pre).
 [ OK ] Reached target Local File Systems.
  Starting udev Kernel Device Manager...
  qemu-system-riscv64: ../../util/qemu-coroutine-lock.c:57: qemu_co_queue_wait_impl: Assertion `qemu_in_coroutine()' failed.

  This is often, but not 100%, reproducible, and the cases differ slightly; we
  see either of:
  - qemu-system-riscv64: ../../util/qemu-coroutine-lock.c:57: qemu_co_queue_wait_impl: Assertion `qemu_in_coroutine()' failed.
  - qemu-system-riscv64: ../../block/aio_task.c:64: aio_task_pool_wait_one: Assertion `qemu_coroutine_self() == pool->main_co' failed.

  Rebuilding working cases has been shown to make them fail, and rebuilding
  (or even reinstalling) bad cases has made them work. Also, the same builds on
  different arm64 CPUs behave differently. TL;DR: The full list of conditions
  influencing the good/bad case is not yet known.

  [1]: https://ubuntu.com/tutorials/how-to-install-ubuntu-on-your-raspberry-pi#1-overview
  [2]: http://cdimage.ubuntu.com/daily-preinstalled/pending/hirsute-preinstalled-desktop-arm64+raspi.img.xz

  
  --- --- original report --- ---

  I regularly run a RISC-V (RV64GC) QEMU VM, but an update a few days
  ago broke it.  Now when I launch it, it hits an assertion:

  OpenSBI v0.6
     ____                    _____ ____ _____
    / __ \                  / ____|  _ \_   _|
   | |  | |_ __   ___ _ __ | (___ | |_) || |
   | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
   | |__| | |_) |  __/ | | |____) | |_) || |_
    \____/| .__/ \___|_| |_|_____/|____/_____|
          | |
          |_|

  ...
  Found /boot/extlinux/extlinux.conf
  Retrieving file: /boot/extlinux/extlinux.conf
  618 bytes read in 2 ms (301.8 KiB/s)
  RISC-V Qemu Boot Options
  1:  Linux kernel-5.5.0-dirty
  2:  Linux kernel-5.5.0-dirty (recovery mode)
  Enter choice: 1:Linux kernel-5.5.0-dirty
  Retrieving file: /boot/initrd.img-5.5.0-dirty
  qemu-system-riscv64: ../../block/aio_task.c:64: aio_task_pool_wait_one: Assertion `qemu_coroutine_self() == pool->main_co' failed.
  ./run.sh: line 31:  1604 Aborted (core dumped) qemu-system-riscv64 -machine virt -nographic -smp 8 -m 8G -bios fw_payload.bin -device virtio-blk-device,drive=hd0 -object rng-random,filename=/dev/urandom,id=rng0 -device virtio-rng-device,rng=rng0 -drive file=riscv64-UbuntuFocal-qemu.qcow2,format=qcow2,id=hd0 -device virtio-net-device,netdev=usernet -netdev user,id=usernet,$ports

  Interestingly this doesn't happen on the AMD64 version of Ubuntu 21.04
  (fully updated).

  Think you have everything already, but just in case:

  $ lsb_release -rd
  Description:    Ubuntu Hirsute Hippo (development branch)
  Release:        21.04

  $ uname -a
  Linux minimacvm 5.11.0-11-generic #12-Ubuntu SMP Mon Mar 1 19:27:36 UTC 2021 aarch64 aarch64 aarch64 GNU/Linux
  (note this is a VM running on macOS/M1)

  $ apt-cache policy qemu
  qemu:
    Installed: 1:5.2+dfsg-9ubuntu1
    

[Bug 1907952] Re: qemu-system-aarch64: with "-display gtk" arrow keys are received as just ^[ on ttyAMA0

2021-06-15 Thread Christian Ehrhardt
** Tags added: qemu-21.10

** Also affects: qemu (Ubuntu)
   Importance: Undecided
   Status: New

** Changed in: qemu (Ubuntu)
   Status: New => Triaged

-- 
https://bugs.launchpad.net/bugs/1907952

Title:
  qemu-system-aarch64: with "-display gtk" arrow keys are received as
  just ^[ on ttyAMA0

Status in QEMU:
  Fix Released
Status in qemu package in Ubuntu:
  Triaged
Status in qemu package in Debian:
  Confirmed

Bug description:
  I originally observed this on Debian packaged qemu 5.2 at
  https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=976808

  Today I checked out the latest git source at
  Sun, 13 Dec 2020 19:21:09 +0900
  and configured the source as follows:

  ./configure --prefix=/usr --sysconfdir=/etc --libexecdir=/usr/lib/qemu \
   --localstatedir=/var --disable-blobs --disable-strip --localstatedir=/var \
   --libdir=/usr/lib/aarch64-linux-gnu \
   --firmwarepath=/usr/share/qemu:/usr/share/seabios:/usr/lib/ipxe/qemu \
   --target-list=aarch64-softmmu,arm-softmmu --disable-werror \
   --disable-user  --enable-gtk --enable-vnc
  then executed "make" on an ARM64 (not an x86_64) host,
  running the latest Debian testing.

  I ran the following script on an arm64 host with the Debian Installer Alpha 3 at
  https://cdimage.debian.org/cdimage/bullseye_di_alpha3/arm64/iso-cd/debian-bullseye-DI-alpha3-arm64-netinst.iso

  #!/bin/sh

  ARCH=arm64
  IMAGE=`pwd`/qemu-disk-${ARCH}.qcow2
  CDROM=`pwd`/debian-bullseye-DI-alpha3-${ARCH}-netinst.iso
  rm -f $IMAGE
  qemu-img create -f qcow2 -o compat=1.1 -o lazy_refcounts=on -o preallocation=off $IMAGE 20G
  cd /var/tmp
  cp /usr/share/AAVMF/AAVMF_VARS.fd .
  $HOME/qemu-git/qemu/build/qemu-system-aarch64 \
  -display gtk -enable-kvm -machine virt -cpu host -m 3072 -smp 2 \
  -net nic,model=virtio -net user -object rng-random,filename=/dev/urandom,id=rng0 \
  -device virtio-rng-pci,rng=rng0,id=rng-device0 \
  -drive if=virtio,file=${IMAGE},index=0,format=qcow2,discard=unmap,detect-zeroes=unmap,media=disk \
  -drive if=virtio,file=${CDROM},index=1,format=raw,readonly=on,media=cdrom \
  -drive if=pflash,format=raw,unit=0,file=/usr/share/AAVMF/AAVMF_CODE.fd,readonly=on \
  -drive if=pflash,format=raw,unit=1,file=`pwd`/AAVMF_VARS.fd

  Then the 4 arrow keys on the physical keyboard are each received as just "^[".

  This symptom was not observed on qemu-system-x86_64.
  This symptom was not observed with virt-manager on my arm64 host, either.
  This seems unique to -display gtk of qemu-system-aarch64.
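
For context, terminals conventionally send each arrow key as a three-byte sequence, ESC [ A through ESC [ D, and "^[" is how a lone ESC byte is displayed. A minimal sketch of telling the two apart, assuming the guest sees only the first byte of each sequence (the report does not confirm where the remaining bytes get lost); the helper name classify_key is hypothetical:

```c
#include <stddef.h>

/*
 * Hypothetical helper: classify the bytes received for one key press.
 * Returns 'U', 'D', 'R' or 'L' for a complete cursor-key sequence,
 * 'E' for a lone ESC byte (shown as "^["), and 0 for anything else.
 */
char classify_key(const unsigned char *buf, size_t len)
{
    if (len == 0 || buf[0] != 0x1b)
        return 0;
    if (len == 1)
        return 'E'; /* only ESC arrived */
    if (len >= 3 && buf[1] == '[') {
        switch (buf[2]) {
        case 'A': return 'U'; /* up */
        case 'B': return 'D'; /* down */
        case 'C': return 'R'; /* right */
        case 'D': return 'L'; /* left */
        }
    }
    return 0;
}
```

With the behaviour described above, every arrow key would be classified as 'E' rather than as one of the four directions.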

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1907952/+subscriptions



[Bug 1868116] Re: QEMU monitor no longer works

2021-05-04 Thread Christian Ehrhardt
@Thomas - there is a leftover task here and I've filed [1] for it in the new
tracker.
What is the right state to move this bug into now?

[1]: https://gitlab.com/qemu-project/qemu/-/issues/137

** Bug watch added: gitlab.com/qemu-project/qemu/-/issues #137
   https://gitlab.com/qemu-project/qemu/-/issues/137

-- 
https://bugs.launchpad.net/bugs/1868116

Title:
  QEMU monitor no longer works

Status in QEMU:
  Incomplete
Status in qemu package in Ubuntu:
  Invalid
Status in vte2.91 package in Ubuntu:
  Fix Released
Status in qemu package in Debian:
  Fix Released

Bug description:
  Repro:
  VTE
  $ meson _build && ninja -C _build && ninja -C _build install

  qemu:
  $ ../configure --python=/usr/bin/python3 --disable-werror --disable-user --disable-linux-user --disable-docs --disable-guest-agent --disable-sdl --enable-gtk --disable-vnc --disable-xen --disable-brlapi --disable-fdt --disable-hax --disable-vde --disable-netmap --disable-rbd --disable-libiscsi --disable-libnfs --disable-smartcard --disable-libusb --disable-usb-redir --disable-seccomp --disable-glusterfs --disable-tpm --disable-numa --disable-opengl --disable-virglrenderer --disable-xfsctl --disable-vxhs --disable-slirp --disable-blobs --target-list=x86_64-softmmu --disable-rdma --disable-pvrdma --disable-attr --disable-vhost-net --disable-vhost-vsock --disable-vhost-scsi --disable-vhost-crypto --disable-vhost-user --disable-spice --disable-qom-cast-debug --disable-vxhs --disable-bochs --disable-cloop --disable-dmg --disable-qcow1 --disable-vdi --disable-vvfat --disable-qed --disable-parallels --disable-sheepdog --disable-avx2 --disable-nettle --disable-gnutls --disable-capstone --disable-tools --disable-libpmem --disable-iconv --disable-cap-ng
  $ make

  Test:
  $ LD_LIBRARY_PATH=/usr/local/lib/x86_64-linux-gnu/:$LD_LIBRARY_PATH ./build/x86_64-softmmu/qemu-system-x86_64 -enable-kvm --drive media=cdrom,file=http://archive.ubuntu.com/ubuntu/dists/bionic/main/installer-amd64/current/images/netboot/mini.iso
  - switch to monitor with CTRL+ALT+2
  - try to enter something

  Affects head of both upstream git repos.

  
  --- original bug ---

  It was observed that the QEMU console (normally accessible using
  Ctrl+Alt+2) accepts no input, so it can't be used. This is
  problematic because there are cases where it's required to send
  commands to the guest, or key combinations that the host would grab
  (such as Ctrl-Alt-F1 or Alt-F4).

  ProblemType: Bug
  DistroRelease: Ubuntu 20.04
  Package: qemu 1:4.2-3ubuntu2
  Uname: Linux 5.6.0-rc6+ x86_64
  ApportVersion: 2.20.11-0ubuntu20
  Architecture: amd64
  CurrentDesktop: XFCE
  Date: Thu Mar 19 12:16:31 2020
  Dependencies:

  InstallationDate: Installed on 2017-06-13 (1009 days ago)
  InstallationMedia: Xubuntu 17.04 "Zesty Zapus" - Release amd64 (20170412)
  KvmCmdLine:
   COMMAND         STAT  EUID  RUID     PID    PPID %CPU COMMAND
   qemu-system-x86 Sl+   1000  1000   34275   25235 29.2 qemu-system-x86_64 -m 4G -cpu Skylake-Client -device virtio-vga,virgl=true,xres=1280,yres=720 -accel kvm -device nec-usb-xhci -serial vc -serial stdio -hda /home/usuario/Sistemas/androidx86.img -display gtk,gl=on -device usb-audio
   kvm-nx-lpage-re S        0     0   34284       2  0.0 [kvm-nx-lpage-re]
   kvm-pit/34275   S        0     0   34286       2  0.0 [kvm-pit/34275]
  MachineType: LENOVO 80UG
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.6.0-rc6+ root=UUID=6b4ae5c0-c78c-49a6-a1ba-029192618a7a ro quiet ro kvm.ignore_msrs=1 kvm.report_ignored_msrs=0 kvm.halt_poll_ns=0 kvm.halt_poll_ns_grow=0 i915.enable_gvt=1 i915.fastboot=1 cgroup_enable=memory swapaccount=1 zswap.enabled=1 zswap.zpool=z3fold resume=UUID=a82e38a0-8d20-49dd-9cbd-de7216b589fc log_buf_len=16M usbhid.quirks=0x0079:0x0006:0x10 config_scsi_mq_default=y scsi_mod.use_blk_mq=1 mtrr_gran_size=64M mtrr_chunk_size=64M nbd.nbds_max=2 nbd.max_part=63
  SourcePackage: qemu
  UpgradeStatus: Upgraded to focal on 2019-12-22 (87 days ago)
  dmi.bios.date: 08/09/2018
  dmi.bios.vendor: LENOVO
  dmi.bios.version: 0XCN45WW
  dmi.board.asset.tag: NO Asset Tag
  dmi.board.name: Toronto 4A2
  dmi.board.vendor: LENOVO
  dmi.board.version: SDK0J40679 WIN
  dmi.chassis.asset.tag: NO Asset Tag
  dmi.chassis.type: 10
  dmi.chassis.vendor: LENOVO
  dmi.chassis.version: Lenovo ideapad 310-14ISK
  dmi.modalias: 
dmi:bvnLENOVO:bvr0XCN45WW:bd08/09/2018:svnLENOVO:pn80UG:pvrLenovoideapad310-14ISK:rvnLENOVO:rnToronto4A2:rvrSDK0J40679WIN:cvnLENOVO:ct10:cvrLenovoideapad310-14ISK:
  dmi.product.family: IDEAPAD
  dmi.product.name: 80UG
  dmi.product.sku: LENOVO_MT_80UG_BU_idea_FM_Lenovo ideapad 310-14ISK
  dmi.product.version: Lenovo ideapad 310-14ISK
  dmi.sys.vendor: LENOVO
  mtime.conffile..etc.apport.crashdb.conf: 2019-08-29T08:39:36.787240

To manage notifications about this bug go to:

[Bug 1749393] Re: sbrk() not working under qemu-user with a PIE-compiled binary?

2021-04-26 Thread Christian Ehrhardt
For Focal:
- SRU Template added to the bug
- MP: https://code.launchpad.net/~paelzer/ubuntu/+source/qemu/+git/qemu/+merge/401771
- PPA: https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/4535/+packages (still building)

I'd ask anyone affected by this on Focal to try the build from the PPA and
let us know whether this fix works for you.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1749393

Title:
  sbrk() not working under qemu-user with a PIE-compiled binary?

Status in QEMU:
  Fix Released
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Focal:
  Triaged

Bug description:
  [Impact]

   * The space currently reserved can be too small, and we can end up
     with no space at all for BRK. This can happen in any case, but is
     much more likely with the now-common PIE binaries.

   * Backport the upstream fix, which reserves a bit more space while
     loading and gives it back after the interpreter and stack are loaded.

  [Test Plan]

   * On x86 run:
  sudo apt install -y qemu-user-static docker.io
  sudo docker run --rm arm64v8/debian:bullseye bash -c 'apt update && apt install -y wget'
  ...
  Running hooks in /etc/ca-certificates/update.d...
  done.
  Errors were encountered while processing:
   libc-bin
  E: Sub-process /usr/bin/dpkg returned an error code (1)

  
  [Where problems could occur]

   * Regressions would be around use cases of linux-user, that is,
     emulation not of a system but of individual binaries. It is
     commonly used for cross-tests and cross-builds, so that is the
     space to watch for regressions.

  [Other Info]
   
   * n/a


  ---

  In Debian unstable, we recently switched bash to be a PIE-compiled
  binary (for hardening). Unfortunately this resulted in bash being
  broken when run under qemu-user (for all target architectures, host
  being amd64 for me).

  $ sudo chroot /srv/chroots/sid-i386/ qemu-i386-static /bin/bash
  bash: xmalloc: .././shell.c:1709: cannot allocate 10 bytes (0 bytes allocated)

  bash has its own malloc implementation based on sbrk():
  https://git.savannah.gnu.org/cgit/bash.git/tree/lib/malloc/malloc.c

  When we disable this internal implementation and rely on glibc's
  malloc, then everything is fine. But it might be that glibc has a
  fallback when sbrk() is not working properly and it might hide the
  underlying problem in qemu-user.
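
To check sbrk() directly rather than through a malloc that may fall back to
something else, a small probe can call it and look at the return value. The
following is only a sketch under assumptions not taken from this report: it
assumes a Python interpreter is available inside the emulated environment and
a libc that exports sbrk(). The helper name sbrk_can_grow is hypothetical.

```python
import ctypes


def sbrk_can_grow(increment: int = 4096) -> bool:
    """Return True if sbrk() can move the program break up by `increment` bytes."""
    # CDLL(None) resolves symbols from the running process, including libc's sbrk().
    libc = ctypes.CDLL(None, use_errno=True)
    libc.sbrk.restype = ctypes.c_void_p
    libc.sbrk.argtypes = [ctypes.c_ssize_t]

    # sbrk() returns (void *)-1 on error, otherwise the previous break.
    failure = ctypes.c_void_p(-1).value
    old_break = libc.sbrk(increment)

    # The extra bytes are deliberately not handed back: the process's own
    # malloc may have moved the break in the meantime.
    return old_break != failure


if __name__ == "__main__":
    print("sbrk() can grow the heap:", sbrk_can_grow())
```

If this prints False under qemu-user but True when run natively, that would
point at the emulator's brk handling rather than at bash's allocator.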

  This issue has also been reported to the bash upstream author and he
  suggested that the issue might be in qemu-user so I'm opening a ticket
  here. Here's the discussion with the bash upstream author:
  https://lists.gnu.org/archive/html/bug-bash/2018-02/threads.html#00080

  You can find the problematic bash binary in this .deb file:
  http://snapshot.debian.org/archive/debian/20180206T154716Z/pool/main/b/bash/bash_4.4.18-1_i386.deb

  The version of qemu I have been using is 2.11 (Debian package qemu-
  user-static version 1:2.11+dfsg-1) but I have had reports that the
  problem is reproducible with older versions (back to 2.8 at least).

  Here are the related Debian bug reports:
  https://bugs.debian.org/889869
  https://bugs.debian.org/865599

  It's worth noting that bash used to have this problem (when compiled as
  a PIE binary) even when run directly but then something got fixed in the
  kernel and now the problem only appears when run under qemu-user:
  https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1518483

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1749393/+subscriptions



[Bug 1749393] Re: sbrk() not working under qemu-user with a PIE-compiled binary?

2021-04-26 Thread Christian Ehrhardt
** Description changed:

+ [Impact]
+ 
+  * The space currently reserved can be too small, and we can end up
+    with no space at all for BRK. This can happen in any case, but is
+    much more likely with the now-common PIE binaries.
+ 
+  * Backport the upstream fix, which reserves a bit more space while
+    loading and gives it back after the interpreter and stack are loaded.
+ 
+ [Test Plan]
+ 
+  * On x86 run:
+ sudo apt install -y qemu-user-static docker.io
+ sudo docker run --rm arm64v8/debian:bullseye bash -c 'apt update && apt install -y wget'
+ ...
+ Running hooks in /etc/ca-certificates/update.d...
+ done.
+ Errors were encountered while processing:
+  libc-bin
+ E: Sub-process /usr/bin/dpkg returned an error code (1)
+ 
+ 
+ [Where problems could occur]
+ 
+  * Regressions would be around use cases of linux-user, that is,
+    emulation not of a system but of individual binaries. It is
+    commonly used for cross-tests and cross-builds, so that is the
+    space to watch for regressions.
+ 
+ [Other Info]
+  
+  * n/a
+ 
+ 
+ ---
+ 
  In Debian unstable, we recently switched bash to be a PIE-compiled
  binary (for hardening). Unfortunately this resulted in bash being broken
  when run under qemu-user (for all target architectures, host being amd64
  for me).
  
  $ sudo chroot /srv/chroots/sid-i386/ qemu-i386-static /bin/bash
  bash: xmalloc: .././shell.c:1709: cannot allocate 10 bytes (0 bytes allocated)
  
  bash has its own malloc implementation based on sbrk():
  https://git.savannah.gnu.org/cgit/bash.git/tree/lib/malloc/malloc.c
  
  When we disable this internal implementation and rely on glibc's malloc,
  then everything is fine. But it might be that glibc has a fallback when
  sbrk() is not working properly and it might hide the underlying problem
  in qemu-user.
  
  This issue has also been reported to the bash upstream author and he
  suggested that the issue might be in qemu-user so I'm opening a ticket
  here. Here's the discussion with the bash upstream author:
  https://lists.gnu.org/archive/html/bug-bash/2018-02/threads.html#00080
  
  You can find the problematic bash binary in this .deb file:
  http://snapshot.debian.org/archive/debian/20180206T154716Z/pool/main/b/bash/bash_4.4.18-1_i386.deb
  
  The version of qemu I have been using is 2.11 (Debian package qemu-user-
  static version 1:2.11+dfsg-1) but I have had reports that the problem is
  reproducible with older versions (back to 2.8 at least).
  
  Here are the related Debian bug reports:
  https://bugs.debian.org/889869
  https://bugs.debian.org/865599
  
  It's worth noting that bash used to have this problem (when compiled as
  a PIE binary) even when run directly but then something got fixed in the
  kernel and now the problem only appears when run under qemu-user:
  https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1518483

https://bugs.launchpad.net/bugs/1749393

Title:
  sbrk() not working under qemu-user with a PIE-compiled binary?

Status in QEMU:
  Fix Released
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Focal:
  Triaged


[Bug 1921664] Re: Recent update broke qemu-system-riscv64

2021-04-14 Thread Christian Ehrhardt
I've also rebuilt the most recent master, c1e90def01, about 55 commits newer 
than 6.0-rc2.
As in Tommy's experiments, I was unable to reproduce the issue there.
But given the data from the earlier tests, this is more likely an accident of
slightly different timing than an actual fix (to be clear, I'd appreciate it
if there were a fix; I just can't conclude from this build being good that I
could, e.g., bisect).

export CFLAGS="-O0 -g -fPIC"
../configure --enable-system --disable-xen --disable-werror --disable-docs 
--disable-libudev --disable-guest-agent --disable-sdl --disable-gtk 
--disable-vnc --disable-xen --disable-brlapi  --disable-hax --disable-vde 
--disable-netmap --disable-rbd --disable-libiscsi --disable-libnfs 
--disable-smartcard --disable-libusb --disable-usb-redir --disable-seccomp 
--disable-glusterfs --disable-tpm --disable-numa --disable-opengl 
--disable-virglrenderer --disable-xfsctl --disable-slirp --disable-blobs 
--disable-rdma --disable-pvrdma --disable-attr --disable-vhost-net 
--disable-vhost-vsock --disable-vhost-scsi --disable-vhost-crypto 
--disable-vhost-user --disable-spice --disable-qom-cast-debug --disable-bochs 
--disable-cloop --disable-dmg --disable-qcow1 --disable-vdi --disable-vvfat 
--disable-qed --disable-parallels --disable-sheepdog --disable-avx2 
--disable-nettle --disable-gnutls --disable-capstone --enable-tools 
--disable-libssh --disable-libpmem --disable-cap-ng --disable-vte 
--disable-iconv --disable-curses --disable-linux-aio --disable-linux-io-uring 
--disable-kvm --disable-replication --audio-drv-list="" --disable-vhost-kernel 
--disable-vhost-vdpa --disable-live-block-migration --disable-keyring 
--disable-auth-pam --disable-curl --disable-strip --enable-fdt 
--target-list="riscv64-softmmu"
make -j10

Just like the package build that configures as
   coroutine backend: ucontext
   coroutine pool: YES

5/5 runs with that were ok
But since we know it is racy I'm unsure if that implies much :-/
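
To put a number on how little 5 clean runs say about a racy failure, here is a
rough sketch. It assumes each run fails independently with a fixed
probability; the failure rates used below are purely hypothetical and were not
measured on this bug.

```python
import math


def chance_all_runs_pass(failure_rate: float, runs: int) -> float:
    """Probability that `runs` independent runs all pass although the bug is present."""
    if not 0.0 < failure_rate < 1.0:
        raise ValueError("failure_rate must be strictly between 0 and 1")
    return (1.0 - failure_rate) ** runs


def runs_needed(failure_rate: float, confidence: float = 0.95) -> int:
    """Smallest number of runs after which a bug with this failure rate
    would have shown up at least once with probability >= `confidence`."""
    if not 0.0 < failure_rate < 1.0 or not 0.0 < confidence < 1.0:
        raise ValueError("failure_rate and confidence must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - failure_rate))


if __name__ == "__main__":
    # Hypothetical per-run failure rates, for illustration only.
    for rate in (0.5, 0.3, 0.1):
        print(f"failure rate {rate}: "
              f"P(5/5 pass) = {chance_all_runs_pass(rate, 5):.3f}, "
              f"runs for 95% confidence = {runs_needed(rate)}")
```

The lower the real failure rate on a given build, the more clean runs are
needed before a "good" result means anything for a bisect.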

P.S. I have not yet gone into a build-option bisect, but chances are it could be
related. That is too much stabbing in the dark, though; maybe someone experienced
in the coroutines code can already make sense of all the info we have gathered
so far.
I'll update the bug description and add an upstream task so that all the info
we have gets mirrored to the qemu mailing lists.

** Summary changed:

- Recent update broke qemu-system-riscv64
+ Coroutines are racy for risc64 emu on arm64 - crash on Assertion

** Description changed:

+ Note: this may just be a matter of "riscv64 on arm64" being slow-on-slow, and
+ it could affect other architectures as well.
+ 
+ The following case triggers on a Raspberry Pi 4 running arm64
+ Ubuntu 21.04 [1][2]. It might trigger in other environments as well,
+ but that is where we have seen it so far.
+ 
+   $ wget https://github.com/carlosedp/riscv-bringup/releases/download/v1.0/UbuntuFocal-riscv64-QemuVM.tar.gz
+   $ tar xzf UbuntuFocal-riscv64-QemuVM.tar.gz
+   $ ./run_riscvVM.sh
+ (wait ~2 minutes)
+   [ OK ] Reached target Local File Systems (Pre).
+   [ OK ] Reached target Local File Systems.
+ Starting udev Kernel Device Manager...
+ qemu-system-riscv64: ../../util/qemu-coroutine-lock.c:57: qemu_co_queue_wait_impl: Assertion `qemu_in_coroutine()' failed.
+ 
+ This is often, but not 100%, reproducible, and the cases differ slightly; we
+ see either of:
+ - qemu-system-riscv64: ../../util/qemu-coroutine-lock.c:57: qemu_co_queue_wait_impl: Assertion `qemu_in_coroutine()' failed.
+ - qemu-system-riscv64: ../../block/aio_task.c:64: aio_task_pool_wait_one: Assertion `qemu_coroutine_self() == pool->main_co' failed.
+ 
+ Rebuilding working cases has been shown to make them fail, and rebuilding
+ (or even reinstalling) bad cases has made them work. Also, the same builds
+ behave differently on different arm64 CPUs. TL;DR: the full list of conditions
+ influencing good/bad cases here is not yet known.
+ 
+ [1]: https://ubuntu.com/tutorials/how-to-install-ubuntu-on-your-raspberry-pi#1-overview
+ [2]: http://cdimage.ubuntu.com/daily-preinstalled/pending/hirsute-preinstalled-desktop-arm64+raspi.img.xz
+ 
+ 
+ --- --- original report --- ---
+ 
  I regularly run a RISC-V (RV64GC) QEMU VM, but an update a few days ago
  broke it.  Now when I launch it, it hits an assertion:
  
-
- OpenSBI v0.6
-    ____                    _____ ____ _____
-   / __ \                  / ____|  _ \_   _|
-  | |  | |_ __   ___ _ __ | (___ | |_) || |
-  | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
-  | |__| | |_) |  __/ | | |____) | |_) || |_

[Bug 1915063] Re: Windows 10 wil not install using qemu-system-x86_64

2021-04-09 Thread Christian Ehrhardt
Thanks @Babu for the clarifications!
I really hope that the qemu patch makes it into v6.0; then I can better consider
picking it up as a backport for qemu (that is already tracked in bug 1921754,
so I'm setting the qemu task here to Invalid).

The last step I can contribute to the kernel bug that this one really is (before
the rest of the work is with the kernel team) is to verify whether it also
affects the non-oem linux-generic kernel.
There, the latest version is 5.4.0.71.74 from focal-proposed, and the latest
already released one is 5.4.0.70.73.

5.4.0.70.73 - failing
5.4.0.71.74 - failing

So while the almost-released oem kernel based on 5.10 will cover this,
the patch should indeed also be backported to linux-generic and all the
other flavours. Otherwise Windows (and potentially more) will no longer
be usable as a KVM guest on such chips (Threadrippers, and maybe other
AMD chips not yet known to be affected).

** Changed in: qemu (Ubuntu)
   Status: Confirmed => Invalid

https://bugs.launchpad.net/bugs/1915063

Title:
  Windows 10 wil not install using qemu-system-x86_64

Status in QEMU:
  New
Status in linux package in Ubuntu:
  Confirmed
Status in linux-oem-5.10 package in Ubuntu:
  Fix Released
Status in linux-oem-5.6 package in Ubuntu:
  Confirmed
Status in qemu package in Ubuntu:
  Invalid

Bug description:
  Steps to reproduce
  install virt-manager and ovmf if not already there
  copy windows and virtio iso files to /var/lib/libvirt/images

  Use virt-manager from local machine to create your VMs with the disk, CPUs
  and memory required
  Select customize configuration then select OVMF (UEFI) instead of SeaBIOS
  set first CDROM to the windows installation iso (enable in boot options)
  add a second CDROM and load with the virtio iso
  change spice display to VNC

  Always get a security error from windows and it fails to launch the
  installer (works on RHEL and Fedora)
  I tried updating the qemu version from Focal's 4.2 to Groovy's 5.0, which
  was of no help
  --- 
  ProblemType: Bug
  ApportVersion: 2.20.11-0ubuntu27.14
  Architecture: amd64
  CasperMD5CheckResult: skip
  CurrentDesktop: ubuntu:GNOME
  DistributionChannelDescriptor:
   # This is the distribution channel descriptor for the OEM CDs
   # For more information see 
http://wiki.ubuntu.com/DistributionChannelDescriptor
   
canonical-oem-sutton-focal-amd64-20201030-422+pc-sutton-bachman-focal-amd64+X00
  DistroRelease: Ubuntu 20.04
  InstallationDate: Installed on 2021-01-20 (19 days ago)
  InstallationMedia: Ubuntu 20.04 "Focal" - Build amd64 LIVE Binary 
20201030-14:39
  MachineType: LENOVO 30E102Z
  NonfreeKernelModules: nvidia_modeset nvidia
  Package: linux (not installed)
  ProcEnviron:
   TERM=xterm-256color
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 EFI VGA
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.6.0-1042-oem 
root=UUID=389cd165-fc52-4814-b837-a1090b9c2387 ro locale=en_US quiet splash 
vt.handoff=7
  ProcVersionSignature: Ubuntu 5.6.0-1042.46-oem 5.6.19
  RelatedPackageVersions:
   linux-restricted-modules-5.6.0-1042-oem N/A
   linux-backports-modules-5.6.0-1042-oem  N/A
   linux-firmware  1.187.8
  RfKill:
   
  Tags:  focal
  Uname: Linux 5.6.0-1042-oem x86_64
  UpgradeStatus: No upgrade log present (probably fresh install)
  UserGroups: adm cdrom dip docker kvm libvirt lpadmin plugdev sambashare sudo
  _MarkForUpload: True
  dmi.bios.date: 07/29/2020
  dmi.bios.vendor: LENOVO
  dmi.bios.version: S07KT08A
  dmi.board.name: 1046
  dmi.board.vendor: LENOVO
  dmi.board.version: Not Defined
  dmi.chassis.type: 3
  dmi.chassis.vendor: LENOVO
  dmi.chassis.version: None
  dmi.modalias: 
dmi:bvnLENOVO:bvrS07KT08A:bd07/29/2020:svnLENOVO:pn30E102Z:pvrThinkStationP620:rvnLENOVO:rn1046:rvrNotDefined:cvnLENOVO:ct3:cvrNone:
  dmi.product.family: INVALID
  dmi.product.name: 30E102Z
  dmi.product.sku: LENOVO_MT_30E1_BU_Think_FM_ThinkStation P620
  dmi.product.version: ThinkStation P620
  dmi.sys.vendor: LENOVO

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1915063/+subscriptions



[Bug 1915063] Re: Windows 10 wil not install using qemu-system-x86_64

2021-04-08 Thread Christian Ehrhardt
David used "5.6.0-1042.46-oem", the closest I had was "5.6.0-1052-oem"
so I tried that one.

With that my win10 install immediately crashed into the reported issue.

So to summarize:
1. I can reproduce it
2. Chances are high that it is fixed by kernel commit 841c2be0 "kvm: x86: 
replace kvm_spec_ctrl_test_value with runtime test on the host"
3. there are some qemu changes which might be related, but we need Babu to 
reply about if/how those are related

I need to get myself up to date on Ubuntu oem kernels.
If there is a 5.6 series that is supposed to work on that, then this patch
needs to be backported.
But if, on the other hand, it is a valid upgrade path that you'll get the
5.10.0-1020-oem kernel that I had (or later) as part of your 20.04 OEM install,
then that "is the fix" for you @David.



** Also affects: linux (Ubuntu)
   Importance: Undecided
   Status: New

** Also affects: linux-oem-5.6 (Ubuntu)
   Importance: Undecided
   Status: New

** Also affects: linux-oem-5.10 (Ubuntu)
   Importance: Undecided
   Status: New

** Changed in: linux-oem-5.10 (Ubuntu)
   Status: New => Fix Released

** Changed in: linux-oem-5.6 (Ubuntu)
   Status: New => Confirmed

https://bugs.launchpad.net/bugs/1915063

Title:
  Windows 10 wil not install using qemu-system-x86_64

Status in QEMU:
  New
Status in linux package in Ubuntu:
  New
Status in linux-oem-5.10 package in Ubuntu:
  Fix Released
Status in linux-oem-5.6 package in Ubuntu:
  Confirmed
Status in qemu package in Ubuntu:
  Confirmed




[Bug 1915063] Re: Windows 10 wil not install using qemu-system-x86_64

2021-04-08 Thread Christian Ehrhardt
Finally I'm able to test on a Threadripper myself now.

Note: In regard to the commit that Babu identified - I'm on kernel
5.10.0-1020-oem, so that patch would already be applied. I need to find
an older kernel to retry with that as well.

On that new kernel I did a full Win10 install and it worked fine for
me.

In regard to CPU types (for comparison) I got

qemu 1:4.2-3ubuntu6.15 / libvirt 6.0.0-0ubuntu8.8:

  EPYC-Rome
  AMD
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  


With a more recent qemu/libvirt it isn't much different for this chip
(there recently were some Milan changes, but those seem not to matter
for this chip).

qemu  1:5.2+dfsg-9ubuntu1 / libvirt 7.0.0-2ubuntu1


  EPYC-Rome
  AMD
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  



I wasn't able to crash this setup with either an old (18.04) or a new (21.04)
Ubuntu guest.
Installing Win10 worked fine for a while and didn't break as reported. But the
setup I have goes through triple ssh-tunnels and around the globe, which slows
things down a lot :-/
This is how far I've got:
1. start up the install
2. select no license key -> custom install -> it started copying files
3. it goes into the first reboot

After this the latency kills me and virt-manager starts to abort the
installation.
So far I did not hit (https://launchpadlibrarian.net/529734412/security.png) as
reported by David.
@David - did this already pass the critical step for you? How early or late in
the install did you hit the issue?


As I said, I'll probably need to find an older kernel anyway (one from before
the commit that Babu referenced).

https://bugs.launchpad.net/bugs/1915063

Title:
  Windows 10 wil not install using qemu-system-x86_64

Status in QEMU:
  New
Status in qemu package in Ubuntu:
  Confirmed




[Bug 1915063] Re: Windows 10 wil not install using qemu-system-x86_64

2021-04-08 Thread Christian Ehrhardt
Thanks Babu/Igor for chiming in!

@Babu
That exposed STIBP but not IBRS - isn't that what you tried to solve (for 
userspace) in qemu via a v2 for the Rome chips?
=> https://lists.gnu.org/archive/html/qemu-devel/2021-03/msg01020.html

I was recently pinging that, as it wasn't merged into the qemu 6.0-rc.
Do you have any more insight into why this is still held back?

If I might ask - how does the kernel fix you referenced interact with this
proposed qemu change?
Assumptions (please correct me):
1. with the qemu change and using that Rome-v2 it would ask to expose both
features and no longer crash (even on unfixed kernels)
2. with the kernel fix it will no longer crash, even with an unfixed qemu?
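
For anyone who wants to compare what a Linux guest actually sees in the
different kernel/qemu combinations, a rough sketch follows. It assumes the
guest lists these features under the flag names "ibrs" and "stibp" in
/proc/cpuinfo; those names and both helper functions are assumptions of this
sketch, not something confirmed in this bug.

```python
from typing import Iterable, Set


def cpu_flags(cpuinfo_text: str) -> Set[str]:
    """Collect the flags of the first CPU in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_flags(cpuinfo_text: str, wanted: Iterable[str]) -> Set[str]:
    """Return the subset of `wanted` flags that the CPU does not advertise."""
    return set(wanted) - cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    # Made-up sample input; inside a guest, pass the contents of /proc/cpuinfo.
    sample = "processor\t: 0\nflags\t\t: fpu vme stibp ssbd\n"
    print("not exposed:", sorted(missing_flags(sample, ["ibrs", "stibp"])))
```

Running this in a guest once per combination (fixed/unfixed kernel,
fixed/unfixed qemu) would give a small matrix to check the two assumptions
against.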

https://bugs.launchpad.net/bugs/1915063

Title:
  Windows 10 wil not install using qemu-system-x86_64

Status in QEMU:
  New
Status in qemu package in Ubuntu:
  Confirmed




[Bug 1915063] Re: Windows 10 wil not install using qemu-system-x86_64

2021-04-06 Thread Christian Ehrhardt
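Ok, so you should be able to drop these lines one by one:

  [XML <feature .../> lines not preserved in this archive copy]

If that does not yet make it work, add those one by one (removing the features 
of the named type):

  [XML <feature .../> lines not preserved in this archive copy]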

Eventually I'd hope you can identify one feature that breaks it (re-add
everything but that one to verify). Is there any chance you could do this
iterative test? You could also "bisect" this list to save some time.
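
The "bisect" idea could be sketched roughly as below. This is only an
illustration under stated assumptions: `guest_boots` is a hypothetical
callback (not a libvirt or qemu API) standing in for "start the guest with
only these suspect features enabled and see whether the installer launches",
and it assumes exactly one feature is the culprit.

```python
from typing import Callable, List


def find_breaking_feature(
    features: List[str],
    guest_boots: Callable[[List[str]], bool],
) -> str:
    """Return the single feature whose presence breaks the guest.

    `guest_boots(enabled)` is a hypothetical, user-supplied check: it should
    start the guest with only the `enabled` suspect features active and
    return True if the guest works. Assumes exactly one bad feature.
    """
    if not features:
        raise ValueError("need at least one candidate feature")
    candidates = list(features)
    while len(candidates) > 1:
        half = len(candidates) // 2
        first, second = candidates[:half], candidates[half:]
        # Enable only the first half of the remaining suspects.
        # If the guest still works, the culprit must be in the other half.
        if guest_boots(first):
            candidates = second
        else:
            candidates = first
    return candidates[0]
```

With N suspect features this needs about log2(N) guest starts instead of N,
at the cost of assuming the features can be toggled independently.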

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1915063

Title:
  Windows 10 wil not install using qemu-system-x86_64

Status in QEMU:
  New
Status in qemu package in Ubuntu:
  Confirmed

Bug description:
  Steps to reproduce
  install virt-manager and ovmf if not already there
  copy windows and virtio iso files to /var/lib/libvirt/images

  Use virt-manager from local machine to create your VMs with the disk, CPUs 
and memory required
  Select customize configuration then select OVMF(UEFI) instead of seabios
  set first CDROM to the windows installation iso (enable in boot options)
  add a second CDROM and load with the virtio iso
  change spice display to VNC

  Always get a security error from windows and it fails to launch the 
installer (works on RHEL and Fedora)
  I tried updating the qemu version from Focal's 4.2 to Groovy's 5.0 which was of 
no help
  --- 
  ProblemType: Bug
  ApportVersion: 2.20.11-0ubuntu27.14
  Architecture: amd64
  CasperMD5CheckResult: skip
  CurrentDesktop: ubuntu:GNOME
  DistributionChannelDescriptor:
   # This is the distribution channel descriptor for the OEM CDs
   # For more information see 
http://wiki.ubuntu.com/DistributionChannelDescriptor
   
canonical-oem-sutton-focal-amd64-20201030-422+pc-sutton-bachman-focal-amd64+X00
  DistroRelease: Ubuntu 20.04
  InstallationDate: Installed on 2021-01-20 (19 days ago)
  InstallationMedia: Ubuntu 20.04 "Focal" - Build amd64 LIVE Binary 
20201030-14:39
  MachineType: LENOVO 30E102Z
  NonfreeKernelModules: nvidia_modeset nvidia
  Package: linux (not installed)
  ProcEnviron:
   TERM=xterm-256color
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 EFI VGA
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.6.0-1042-oem 
root=UUID=389cd165-fc52-4814-b837-a1090b9c2387 ro locale=en_US quiet splash 
vt.handoff=7
  ProcVersionSignature: Ubuntu 5.6.0-1042.46-oem 5.6.19
  RelatedPackageVersions:
   linux-restricted-modules-5.6.0-1042-oem N/A
   linux-backports-modules-5.6.0-1042-oem  N/A
   linux-firmware  1.187.8
  RfKill:
   
  Tags:  focal
  Uname: Linux 5.6.0-1042-oem x86_64
  UpgradeStatus: No upgrade log present (probably fresh install)
  UserGroups: adm cdrom dip docker kvm libvirt lpadmin plugdev sambashare sudo
  _MarkForUpload: True
  dmi.bios.date: 07/29/2020
  dmi.bios.vendor: LENOVO
  dmi.bios.version: S07KT08A
  dmi.board.name: 1046
  dmi.board.vendor: LENOVO
  dmi.board.version: Not Defined
  dmi.chassis.type: 3
  dmi.chassis.vendor: LENOVO
  dmi.chassis.version: None
  dmi.modalias: 
dmi:bvnLENOVO:bvrS07KT08A:bd07/29/2020:svnLENOVO:pn30E102Z:pvrThinkStationP620:rvnLENOVO:rn1046:rvrNotDefined:cvnLENOVO:ct3:cvrNone:
  dmi.product.family: INVALID
  dmi.product.name: 30E102Z
  dmi.product.sku: LENOVO_MT_30E1_BU_Think_FM_ThinkStation P620
  dmi.product.version: ThinkStation P620
  dmi.sys.vendor: LENOVO

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1915063/+subscriptions



[Bug 1915063] Re: Windows 10 wil not install using qemu-system-x86_64

2021-04-03 Thread Christian Ehrhardt
That is awesome David,
qemu64 is like a very low common denominator with only very basic CPU features.
While "copy host" means "enable all you can".

We can surely work with that a bit, but until I get access to the same
HW I need you to do it.


If you run `virsh domcapabilities` in a console, it will print some XML. 
One of the sections will be for "host-model". In my case that looks like:


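<mode name='host-model' supported='yes'>
  <model fallback='forbid'>Skylake-Client-IBRS</model>
  <vendor>Intel</vendor>
  <feature policy='require' name='...'/>
...
</mode>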



That means a named CPU type (the one that is closest to what you have) and some 
features additionally enabled/disabled.

If you could please post the full output you have, that would be useful.
From there you could take two steps.
1. As you see in my example, it will list some CPU features on top of the named 
   type.
   If you remove them one by one you might be able to identify the single CPU 
   feature that breaks in your case.
2. The named CPU that you have also has a representation; it can be found in
   /usr/share/libvirt/cpu_map...
   That will list all the CPU features that make up the named type.
   If #1 wasn't sufficient, you can now add those to your guest definition one 
   by one in disabled state, i.e. as <feature> elements with policy='disable'.


A description of the underlying mechanism is here
https://libvirt.org/formatdomain.html#cpu-model-and-topology
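
As a rough sketch of step 2, the snippet below generates a guest <cpu>
element with a list of features in disabled state. The element and attribute
names (<cpu>, <model>, <feature policy='disable' name='...'/>) follow the
libvirt domain XML format linked above; the choice of mode='custom',
match='exact' and fallback='allow', and the feature names used in the usage
note, are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET
from typing import Iterable


def build_cpu_xml(model: str, disabled: Iterable[str]) -> str:
    """Build a libvirt-style <cpu> element that disables the given features.

    mode/match/fallback values are illustrative assumptions; adjust them to
    whatever your existing guest definition uses.
    """
    cpu = ET.Element("cpu", {"mode": "custom", "match": "exact"})
    model_el = ET.SubElement(cpu, "model", {"fallback": "allow"})
    model_el.text = model
    for name in disabled:
        # One <feature> element per CPU feature to switch off.
        ET.SubElement(cpu, "feature", {"policy": "disable", "name": name})
    return ET.tostring(cpu, encoding="unicode")
```

For example, `build_cpu_xml("Skylake-Client-IBRS", ["hle", "rtm"])` returns
an XML string that could be pasted into the guest definition via
`virsh edit`; growing or shrinking the list one entry at a time gives the
iterative test described above.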

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1915063

Title:
  Windows 10 wil not install using qemu-system-x86_64

Status in QEMU:
  New
Status in qemu package in Ubuntu:
  Confirmed




[Bug 1915063] Re: Windows 10 wil not install using qemu-system-x86_64

2021-03-26 Thread Christian Ehrhardt
Thanks David,
I have no Threadripper around at the moment. I think I can get my hands on an 
EPYC Rome next week, but that isn't 100% the same.

Fortunately you tried this on the latest qemu 5.2, and since it is failing there 
as well it might be worth reporting it upstream too. That is a great community 
which might have run things on a Threadripper already and be able to point us 
to a qemu/kernel fix - or at least to an existing discussion about it.
For now I'm adding a qemu task here which will mirror this case to the mailing 
list.

** Also affects: qemu
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1915063

Title:
  Windows 10 wil not install using qemu-system-x86_64

Status in QEMU:
  New
Status in qemu package in Ubuntu:
  Confirmed




[Bug 1920784] Re: qemu-system-ppc64le fails with kvm acceleration

2021-03-24 Thread Christian Ehrhardt
@Sadoon - yes, that is the same fix that Laurent pointed to a few hours
before.

@Frank - the kernel I had before was 5.11.0-11-generic (failing). I've
tested "5.11.0-13-generic #14~lp1920784" from your PPA and can confirm
that this fixes the issue.

Thanks Laurent for identifying the fix and thanks Frank for the kernel.
I'll mark bug tasks accordingly and @Frank you'll let me know if there is 
anything else you need to drive this to completion.

** Changed in: qemu
   Status: New => Invalid

** Changed in: glibc (Ubuntu)
   Status: New => Invalid

** Changed in: qemu (Ubuntu)
   Status: Confirmed => Invalid

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1920784

Title:
  qemu-system-ppc64le fails with kvm acceleration

Status in QEMU:
  Invalid
Status in The Ubuntu-power-systems project:
  Confirmed
Status in glibc package in Ubuntu:
  Invalid
Status in linux package in Ubuntu:
  Confirmed
Status in qemu package in Ubuntu:
  Invalid

Bug description:
  (Suspected glibc issue!)

  qemu-system-ppc64(le) fails when invoked with kvm acceleration with
  error "illegal instruction"

  > qemu-system-ppc64(le) -M pseries,accel=kvm

  Illegal instruction (core dumped)

  In dmesg:

  Facility 'SCV' unavailable (12), exception at 0x7624f8134c0c,
  MSR=9280f033

  
  Version-Release number of selected component (if applicable):
  qemu 5.2.0 
  Linux kernel 5.11
  glibc 2.33
  all latest updates as of submitting the bug report

  How reproducible:
  Always

  Steps to Reproduce:
  1. Run qemu with kvm acceleration

  Actual results:
  Illegal instruction

  Expected results:
  Normal VM execution

  Additional info:
  The machine is a Raptor Talos II Lite with a Sforza V1 8-core, but was also 
observed on a Raptor Blackbird with the same processor.

  This was also observed on Fedora 34 beta, which uses glibc 2.33
  Also tested on ArchPOWER (unofficial port of Arch Linux for ppc64le) with 
glibc 2.33
  Fedora 33 and Ubuntu 20.10, both using glibc 2.32 do not have this issue, and 
downgrading the Linux kernel from 5.11 to 5.4 LTS on ArchPOWER solved the 
problem. Kernel 5.9 and 5.10 have the same issue when combined with glibc2.33

  ProblemType: Bug
  DistroRelease: Ubuntu 21.04
  Package: qemu-system 1:5.2+dfsg-6ubuntu2
  ProcVersionSignature: Ubuntu 5.11.0-11.12-generic 5.11.0
  Uname: Linux 5.11.0-11-generic ppc64le
  .sys.firmware.opal.msglog: Error: [Errno 13] Permission denied: 
'/sys/firmware/opal/msglog'
  ApportVersion: 2.20.11-0ubuntu60
  Architecture: ppc64el
  CasperMD5CheckResult: pass
  CurrentDesktop: Unity:Unity7:ubuntu
  Date: Mon Mar 22 14:48:39 2021
  InstallationDate: Installed on 2021-03-22 (0 days ago)
  InstallationMedia: Ubuntu-Server 21.04 "Hirsute Hippo" - Alpha ppc64el 
(20210321)
  KvmCmdLine: COMMAND STAT  EUID  RUID PID PPID %CPU COMMAND
  ProcKernelCmdLine: root=UUID=f3d03315-0944-4a02-9c87-09c00eba9fa1 ro
  ProcLoadAvg: 1.20 0.73 0.46 1/1054 6071
  ProcSwaps:
   Filename    Type  Size     Used  Priority
   /swap.img   file  8388544  0     -2
  ProcVersion: Linux version 5.11.0-11-generic (buildd@bos02-ppc64el-002) (gcc 
(Ubuntu 10.2.1-20ubuntu1) 10.2.1 20210220, GNU ld (GNU Binutils for Ubuntu) 
2.36.1) #12-Ubuntu SMP Mon Mar 1 19:26:20 UTC 2021
  SourcePackage: qemu
  UpgradeStatus: No upgrade log present (probably fresh install)
  VarLogDump_list: total 0
  acpidump:
   
  cpu_cores: Number of cores present = 8
  cpu_coreson: Number of cores online = 8
  cpu_smt: SMT=4

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1920784/+subscriptions



[Bug 1920784] Re: qemu-system-ppc64le fails with kvm acceleration

2021-03-24 Thread Christian Ehrhardt
Fortunately this was only added in kernels >= 5.9, and we have Groovy (5.8)
and Hirsute (5.11), so only the Hirsute kernel needs to be adapted and no
further backports are needed.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1920784

Title:
  qemu-system-ppc64le fails with kvm acceleration

Status in QEMU:
  Invalid
Status in The Ubuntu-power-systems project:
  Confirmed
Status in glibc package in Ubuntu:
  Invalid
Status in linux package in Ubuntu:
  Confirmed
Status in qemu package in Ubuntu:
  Invalid




[Bug 1920784] Re: qemu-system-ppc64le fails with kvm acceleration

2021-03-23 Thread Christian Ehrhardt
I might be spoiled by the s390x-POP style of defining instructions, but
the following PowerISA document unfortunately has no list of
reasons-for-SIGILL. Therefore I'm out of options on this and am waiting
for someone - most likely IBM - to chime in.

https://wiki.raptorcs.com/w/images/f/f5/PowerISA_public.v3.1.pdf

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1920784

Title:
  qemu-system-ppc64le fails with kvm acceleration

Status in QEMU:
  New
Status in The Ubuntu-power-systems project:
  New
Status in glibc package in Ubuntu:
  New
Status in qemu package in Ubuntu:
  Confirmed




[Bug 1920784] Re: qemu-system-ppc64le fails with kvm acceleration

2021-03-23 Thread Christian Ehrhardt
As my other repro code didn't trigger the issue I looked at qemu again
and found that before the failing ioctl->scv call there are plenty of
other calls, some of them very similar (the same?), that work just fine.

I wonder if on guest setup qemu (or e.g. the ROM we load) might set some
arch bits or similar which then break the next "scv 0" call.

I attached the full ioctl log here.

** Attachment added: "ioctl log of qemu until the crash happens"
   
https://bugs.launchpad.net/qemu/+bug/1920784/+attachment/5480011/+files/qemu-ioctls-util-crash.txt

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1920784

Title:
  qemu-system-ppc64le fails with kvm acceleration

Status in QEMU:
  New
Status in The Ubuntu-power-systems project:
  New
Status in glibc package in Ubuntu:
  New
Status in qemu package in Ubuntu:
  Confirmed




[Bug 1920784] Re: qemu-system-ppc64le fails with kvm acceleration

2021-03-23 Thread Christian Ehrhardt
qemu calls this ioctl on ppc64 as:
  sysdeps/unix/sysv/linux/powerpc/ioctl.c
result = INLINE_SYSCALL (ioctl, 3, fd, request, arg);

The mapping of macros in sysdeps/unix/sysv/linux/powerpc/sysdep.h seems to be:
INTERNAL_SYSCALL -> INTERNAL_SYSCALL_NCS -> TRY_SYSCALL_SCV -> SYSCALL_SCV

#define SYSCALL_SCV(nr) \
  ({ \
    __asm__ __volatile__ \
      (".machine \"push\"\n\t" \
       ".machine \"power9\"\n\t" \
       "scv 0\n\t" \
       ".machine \"pop\"\n\t" \
       "0:" \
       : "=&r" (r0), \
         "=&r" (r3), "=&r" (r4), "=&r" (r5), \
         "=&r" (r6), "=&r" (r7), "=&r" (r8) \
       : ASM_INPUT_##nr \
       : "r9", "r10", "r11", "r12", \
         "lr", "ctr", "memory"); \
    r3; \
  })

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1920784

Title:
  qemu-system-ppc64le fails with kvm acceleration

Status in QEMU:
  New
Status in The Ubuntu-power-systems project:
  New
Status in glibc package in Ubuntu:
  New
Status in qemu package in Ubuntu:
  Confirmed




[Bug 1920784] Re: qemu-system-ppc64le fails with kvm acceleration

2021-03-23 Thread Christian Ehrhardt
[10] outlined to use PPC_FEATURE2_SCV but [4] does just that.
In addition [6] added power9 machine settings as only on this ISA it
is available - like:
+   .machine "push"
+   .machine "power9"
scv 0
+   .machine "pop"

Maybe there is some generated "scv 0" left that needs the same [6]
treatment?
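
For reference, the PPC_FEATURE2_SCV check mentioned above boils down to
testing one bit of AT_HWCAP2. The following is a minimal sketch, not the
glibc implementation: the two constants are the values I know from <elf.h>
and the kernel's powerpc cputable.h, and `read_hwcap2` assumes a Linux libc
that exports getauxval().

```python
import ctypes

AT_HWCAP2 = 26                 # auxiliary vector key, as in <elf.h>
PPC_FEATURE2_SCV = 0x00100000  # "scv syscall" bit in the powerpc HWCAP2 mask


def scv_advertised(hwcap2: int) -> bool:
    """Return True if the given AT_HWCAP2 value advertises scv support."""
    return bool(hwcap2 & PPC_FEATURE2_SCV)


def read_hwcap2() -> int:
    """Read AT_HWCAP2 for the current process via libc's getauxval()."""
    libc = ctypes.CDLL(None)
    libc.getauxval.restype = ctypes.c_ulong
    libc.getauxval.argtypes = [ctypes.c_ulong]
    return libc.getauxval(AT_HWCAP2)
```

On a ppc64le host, `scv_advertised(read_hwcap2())` shows what the kernel
advertises to userspace; on other architectures the bit has a different
meaning, so the result is only meaningful on powerpc.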

OTOH In a normal test program I can run "scv 0" just fine.
But not other scv levels (expected).

# cat test.c
#include <stdio.h>

int main() {
   printf("Hello scv 0\n");
   /* level 0: this one runs fine here */
   __asm__(
   "scv 0\n\t"
   );
   printf("survived\n");
   /* any other level: expected to fail */
   __asm__(
   "scv 1\n\t"
   );
   printf("survived level 1\n");
   return 0;
}
# gcc -Wall -o test test.c
# ./test
Hello scv 0
survived
Illegal instruction (core dumped)

IIRC .machine is only a pseudo-op for the assembler.
So it is correct that I can't see it in the live disassembly of gdb.

The failing "scv 0" from glibc's __GI___ioctl is
   0x766c49a0 <+320>:   01 00 00 44 scv 0
And in my test program it is
   0x00010848 <+44>:    01 00 00 44 scv 0

Hmm, this is the same opcode but it fails in just one of the cases.
This might need someone who is more of a ppc64/glibc expert than me :-/
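
To double check that both sites really hold the same instruction, the four
bytes gdb prints can be decoded by hand. The sketch below is a simplified
decoder based on my reading of the Power ISA encoding (primary opcode 17,
a 7-bit LEV field, and the low two bits distinguishing sc from scv); it does
not validate the reserved bits, and the function name is made up for this
example.

```python
from typing import Optional, Tuple


def decode_sc_family(raw_le: bytes) -> Optional[Tuple[str, int]]:
    """Decode 4 little-endian instruction bytes as sc/scv, else None.

    Returns (mnemonic, LEV). Reserved bits are not checked.
    """
    if len(raw_le) != 4:
        raise ValueError("expected exactly 4 bytes")
    word = int.from_bytes(raw_le, "little")
    primary = word >> 26       # primary opcode: top 6 bits
    lev = (word >> 5) & 0x7F   # LEV field
    low = word & 0x3           # low two bits select sc vs scv
    if primary != 17:
        return None
    if low == 1:
        return ("scv", lev)
    if low == 2:
        return ("sc", lev)
    return None
```

Feeding it `bytes.fromhex("01000044")`, the byte sequence shown in both
disassembly lines, yields `("scv", 0)`, which matches gdb's view that the
instruction itself is identical and the difference must be in the state the
process is in when it executes.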

@Frank - could you modify this bug to become mirrored to IBM for their
arch-expertise please?

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1920784

Title:
  qemu-system-ppc64le fails with kvm acceleration

Status in QEMU:
  New
Status in The Ubuntu-power-systems project:
  New
Status in glibc package in Ubuntu:
  New
Status in qemu package in Ubuntu:
  Confirmed

Bug description:
  (Suspected glibc issue!)

  qemu-system-ppc64(le) fails when invoked with kvm acceleration with
  error "illegal instruction"

  > qemu-system-ppc64(le) -M pseries,accel=kvm

  Illegal instruction (core dumped)

  In dmesg:

  Facility 'SCV' unavailable (12), exception at 0x7624f8134c0c,
  MSR=9280f033

  
  Version-Release number of selected component (if applicable):
  qemu 5.2.0 
  Linux kernel 5.11
  glibc 2.33
  all latest updates as of submitting the bug report

  How reproducible:
  Always

  Steps to Reproduce:
  1. Run qemu with kvm acceleration

  Actual results:
  Illegal instruction

  Expected results:
  Normal VM execution

  Additional info:
  The machine is a Raptor Talos II Lite with a Sforza V1 8-core, but was also
  observed on a Raptor Blackbird with the same processor.

  This was also observed on Fedora 34 beta, which uses glibc 2.33.
  Also tested on ArchPOWER (unofficial port of Arch Linux for ppc64le) with
  glibc 2.33.
  Fedora 33 and Ubuntu 20.10, both using glibc 2.32, do not have this issue, and
  downgrading the Linux kernel from 5.11 to 5.4 LTS on ArchPOWER solved the
  problem. Kernels 5.9 and 5.10 have the same issue when combined with glibc 2.33.

  ProblemType: Bug
  DistroRelease: Ubuntu 21.04
  Package: qemu-system 1:5.2+dfsg-6ubuntu2
  ProcVersionSignature: Ubuntu 5.11.0-11.12-generic 5.11.0
  Uname: Linux 5.11.0-11-generic ppc64le
  .sys.firmware.opal.msglog: Error: [Errno 13] Permission denied: 
'/sys/firmware/opal/msglog'
  ApportVersion: 2.20.11-0ubuntu60
  Architecture: ppc64el
  CasperMD5CheckResult: pass
  CurrentDesktop: Unity:Unity7:ubuntu
  Date: Mon Mar 22 14:48:39 2021
  InstallationDate: Installed on 2021-03-22 (0 days ago)
  InstallationMedia: Ubuntu-Server 21.04 "Hirsute Hippo" - Alpha ppc64el 
(20210321)
  KvmCmdLine: COMMAND STAT  EUID  RUID PIDPPID %CPU COMMAND
  ProcKernelCmdLine: root=UUID=f3d03315-0944-4a02-9c87-09c00eba9fa1 ro
  ProcLoadAvg: 1.20 0.73 0.46 1/1054 6071
  ProcSwaps:
   Filename   Type  Size     Used  Priority
   /swap.img  file  8388544  0     -2
  ProcVersion: Linux version 5.11.0-11-generic (buildd@bos02-ppc64el-002) (gcc 
(Ubuntu 10.2.1-20ubuntu1) 10.2.1 20210220, GNU ld (GNU Binutils for Ubuntu) 
2.36.1) #12-Ubuntu SMP Mon Mar 1 19:26:20 UTC 2021
  SourcePackage: qemu
  UpgradeStatus: No upgrade log present (probably fresh install)
  VarLogDump_list: total 0
  acpidump:
   
  cpu_cores: Number of cores present = 8
  cpu_coreson: Number of cores online = 8
  cpu_smt: SMT=4

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1920784/+subscriptions



[Bug 1920784] Re: qemu-system-ppc64le fails with kvm acceleration

2021-03-23 Thread Christian Ehrhardt
Since this seems to be broken on all distributions as soon as the triggering
combination of kernel/glibc is present, I think we'd want to open this up to
upstream qemu for a wider discussion and to also reach the ppc64 architecture
experts.

Furthermore I'm not entirely sure if this needs to be fixed in qemu; it
might be the case that a fix is needed in glibc instead.

Therefore I'm adding a qemu (upstream) bug task for now to have the bug
reported there as well (might be worth it for awareness anyway) - but
chances are that after some debugging it will turn out to be a glibc
issue instead.

If only I could break this test out of the kvm ioctl into something simpler,
then we could properly file it against glibc.

** Also affects: glibc (Ubuntu)
   Importance: Undecided
   Status: New

** Also affects: qemu
   Importance: Undecided
   Status: New




[Bug 1920784] Re: qemu-system-ppc64le fails with kvm acceleration

2021-03-23 Thread Christian Ehrhardt
Hi Sadoon,
thanks for the report!
There isn't much to find about this issue yet.
One automatic syzkaller crash report [1].
On the emulation side there is [2][3].

On the glibc side we have [4][5] adding the use of it with [6] being a fix.
All those seem to be in glibc 2.33 - so I'd expect with [6] it should only
be issued on power9 which in turn should HW-support the instruction.

I was trying to recreate this on power8 and power9 machines.
As expected, on power8 just nothing happens (the instruction isn't used due to
[6]).
TBH I first wondered if these Sforza chips [7][8][9] you mentioned are
fully identical to a classic IBM p9 box - but I was indeed able to reproduce
the issue just fine on an IBM-sold P9.
dmesg:
[ 1516.438442] Facility 'SCV' unavailable (12), exception at 0x76c9f84c49a0, 
MSR=9280f033
[ 1516.438472] qemu-system-ppc[42884]: illegal instruction (4) at 76c9f84c49a0 
nip 76c9f84c49a0 lr 1f12839d9f0 code 1 in libc-2.33.so[76c9f838+22]
[ 1516.438489] qemu-system-ppc[42884]: code: e8010010 7c0803a6 4e800020 
6042 7ca42b78 4bffed65 6000 38210020 
[ 1516.438493] qemu-system-ppc[42884]: code: e8010010 7c0803a6 4e800020 
6042 <4401> 4bb8 6000 6042

The chip I used for this test is:
Model:   2.2 (pvr 004e 1202)
Model name:  POWER9, altivec supported

The syscall this crashes in belongs to the ioctl
(gdb) bt
#0  __GI___ioctl (fd=, request=536915584) at 
../sysdeps/unix/sysv/linux/powerpc/ioctl.c:56
#1  0x0cb63ef7d9f0 in kvm_vcpu_ioctl (cpu=cpu@entry=0x7d0f48010010, 
type=type@entry=536915584) at ../../accel/kvm/kvm-all.c:2654
#2  0x0cb63ef7dbdc in kvm_cpu_exec (cpu=0x7d0f48010010) at 
../../accel/kvm/kvm-all.c:2491
#3  0x0cb63ee78344 in kvm_vcpu_thread_fn (arg=0x7d0f48010010) at 
../../accel/kvm/kvm-cpus.c:49
#4  0x0cb63f1d14bc in qemu_thread_start (args=) at 
../../util/qemu-thread-posix.c:521
#5  0x7d0f4ac69114 in start_thread (arg=0x7d0f23dfe720) at 
pthread_create.c:473
#6  0x7d0f4ab755c0 in clone () at 
../sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S:103

And jumping into the code of the  __GI___ioctl we can clearly see
the scv instruction is indeed there in the executed code path:

   0x766c4984 <__GI___ioctl+292>   bl      0x766c36e8 <__GI___tcgetattr+8>
   0x766c4988 <__GI___ioctl+296>   nop
   0x766c498c <__GI___ioctl+300>   addi    r1,r1,32
   0x766c4990 <__GI___ioctl+304>   ld      r0,16(r1)
   0x766c4994 <__GI___ioctl+308>   mtlr    r0
   0x766c4998 <__GI___ioctl+312>   blr
   0x766c499c <__GI___ioctl+316>   ori     r2,r2,0
  >0x766c49a0 <__GI___ioctl+320>   scv     0


[1]: 
https://webcache.googleusercontent.com/search?q=cache:uS0jhPekyqMJ:https://syzkaller-ppc64.appspot.com/text%3Ftag%3DCrashReport%26x%3D17d9988300+=2=de=clnk=uk
[2]: 
https://git.qemu.org/?p=qemu.git;a=commit;h=3c89b8d6ac5b8728cd7620f9885bd953edd18a11
[3]: https://lists.gnu.org/archive/html/qemu-devel/2021-03/msg05425.html
[4]: 
https://sourceware.org/git/?p=glibc.git;a=commit;h=68ab82f56690ada86ac1e0c46bad06ba189a10ef
[5]: 
https://sourceware.org/git/?p=glibc.git;a=commit;h=41f013cef24884604c303435dd1915be2ea5c0e0
[6]: 
https://sourceware.org/git/?p=glibc.git;a=commit;h=527c89cd32f8522859f58343be3d3dc8f754b783
[7]: https://wiki.raptorcs.com/wiki/Sforza
[8]: https://wiki.raptorcs.com/wiki/Talos_II
[9]: https://wiki.raptorcs.com/wiki/POWER9
[10]: https://lwn.net/Articles/822867/


Re: [PATCH] disas: Fix build with glib2.0 >=2.67.3

2021-03-08 Thread Christian Ehrhardt
On Wed, Feb 24, 2021 at 2:15 PM Daniel P. Berrangé  wrote:
>
> On Wed, Feb 24, 2021 at 01:07:33PM +, Peter Maydell wrote:
> > On Wed, 24 Feb 2021 at 11:04, Daniel P. Berrangé  
> > wrote:
> > > So from osdep.h I think something like this is likely sufficient:
> > >
> > > diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
> > > index ba15be9c56..7a1d83a8b6 100644
> > > --- a/include/qemu/osdep.h
> > > +++ b/include/qemu/osdep.h
> > > @@ -126,6 +126,7 @@ extern int daemon(int, int);
> > >  #include "glib-compat.h"
> > >  #include "qemu/typedefs.h"
> > >
> > > +extern "C" {
> >
> > Needs to be protected by #ifdef so it's only relevant for the
> > C++ compiler, right?
> >
> > >  /*
> > >   * For mingw, as of v6.0.0, the function implementing the assert macro is
> > >   * not marked as noreturn, so the compiler cannot delete code following 
> > > an
> > > @@ -722,4 +723,6 @@ static inline int 
> > > platform_does_not_support_system(const char *command)
> > >  }
> > >  #endif /* !HAVE_SYSTEM_FUNCTION */
> > >
> > > +}
> > > +
> > >  #endif
> > >
> > >
> > > We'll also need to them protect any local headers we use before this 
> > > point.
> > >
> > > $ grep #include ../../../include/qemu/osdep.h  | grep -v '<'
> > > #include "config-host.h"
> > > #include CONFIG_TARGET
> > > #include "exec/poison.h"
> > > #include "qemu/compiler.h"
> > > #include "sysemu/os-win32.h"
> > > #include "sysemu/os-posix.h"
> > > #include "glib-compat.h"
> > > #include "qemu/typedefs.h"
> > >
> > > and transitively through that list, but I think there's no too many
> > > more there.
> >
> > Is there anything we can do to make the compiler complain if we
> > get this wrong? Otherwise it seems likely that we'll end up
> > accidentally putting things inside or outside 'extern "C"'
> > declarations when they shouldn't be, as we make future changes
> > to our headers.
>
> There's nothing easy I know of to highlight this.  It is more the kind
> of thing checkpatch would have to look at - complain if there is
> anything which isn't a  preprocessor include directive or comment
> before the 'extern'.
>
> > (The other approach would be to try to get rid of the
> > C++ in the codebase. We could probably say 'drop vixl
> > and always use capstone', for instance.)
>
> Yeah, getting rid of C++ would probably be the sanest solution long
> term.

Just as input on short-term alternatives:
in open-vm-tools we found that following
https://developer.gnome.org/glib/stable/glib-Version-Information.html#GLIB-VERSION-MIN-REQUIRED:CAPS
was an easy fix for the time being.
In the autoconf magic there that was just:
  +AC_DEFINE(GLIB_VERSION_MIN_REQUIRED, GLIB_VERSION_2_34, [Ignore post 2.34 deprecations])
  +AC_DEFINE(GLIB_VERSION_MAX_ALLOWED, GLIB_VERSION_2_34, [Prevent post 2.34 APIs])
(Or any other/newer version that one would want to select)

Not sure what would apply to qemu here, but I wanted to let you know
about the overall concept with regard to this issue.


> Regards,
> Daniel
> --
> |: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o-https://fstop138.berrange.com :|
> |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
>


-- 
Christian Ehrhardt
Staff Engineer, Ubuntu Server
Canonical Ltd



Re: [PATCH] disas: Fix build with glib2.0 >=2.67.3

2021-02-23 Thread Christian Ehrhardt
On Tue, Feb 23, 2021 at 5:12 PM Daniel P. Berrangé  wrote:
>
> On Tue, Feb 23, 2021 at 03:43:48PM +, Peter Maydell wrote:
> > On Tue, 23 Feb 2021 at 15:03, Christian Ehrhardt
> >  wrote:
> > >
> > > glib2.0 introduced a change in 2.67.3 and later which triggers an
> > > issue [1] for anyone including it's headers in a "extern C" context
> > > which a few files in disas/* do. An example of such an include chain
> > > and error look like:
> > >
> > > ../../disas/arm-a64.cc
> > > In file included from /usr/include/glib-2.0/glib/gmacros.h:241,
> > >  from 
> > > /usr/lib/x86_64-linux-gnu/glib-2.0/include/glibconfig.h:9,
> > >  from /usr/include/glib-2.0/glib/gtypes.h:32,
> > >  from /usr/include/glib-2.0/glib/galloca.h:32,
> > >  from /usr/include/glib-2.0/glib.h:30,
> > >  from 
> > > /<>/qemu-5.2+dfsg/include/glib-compat.h:32,
> > >  from 
> > > /<>/qemu-5.2+dfsg/include/qemu/osdep.h:126,
> > >  from ../../disas/arm-a64.cc:21:
> > > /usr/include/c++/10/type_traits:56:3: error: template with C linkage
> > >56 |   template
> > >   |   ^~~~
> > > ../../disas/arm-a64.cc:20:1: note: ‘extern "C"’ linkage started here
> > >20 | extern "C" {
> > >   | ^~
> > >
> > > To fix that move the include of osdep.h out of that section. It was added
> > > already as C++ fixup by e78490c44: "disas/arm-a64.cc: Include osdep.h 
> > > first".
> > >
> > > [1]: https://gitlab.gnome.org/GNOME/glib/-/issues/2331
> >
> > I'm not convinced by this as a fix, though I'm happy to be corrected
> > by somebody with a fuller understanding of C++. glib.h may be supposed
> > to work as a C++ header, but osdep.h as a whole is definitely a C header,
> > so I think it ought to be inside 'extern C'; and it has to be
> > the first header included; and it happens to want to include glib.h.
> >
> > Fixing glib.h seems like it would be nicer, assuming they haven't
> > already shipped this breakage.

They have already shipped this. In the linked issue, and in the MR
discussions linked from there, you'll find that everyone acknowledges that
the "error" is in the consuming projects, and that glib upstream declines to
fix it, citing for example backward compatibility.

> Failing that, does it work to do:
>
> This was raised in Fedora and upstream GLib already
>
> https://gitlab.gnome.org/GNOME/glib/-/merge_requests/1935
> https://lists.fedoraproject.org/archives/list/de...@lists.fedoraproject.org/thread/J3P4TRHLWNDIKXF76OLYZNAPTABCZ3U5/#7LXFUDBBBIT23FE44QJYWX3I7U4EHW6M
>
> The key comment there is this one:
>
>   "Note that wrapping the header include in an extern declaration violates
>C++ standard requirements.  ("A translation unit shall include a header
>only outside of any declaration or definition", [using.headers]/3)"
>
> IOW, if we need to make osdep.h safe to use from C++, then we
> need to put the 'extern "C" { ... }'  bit in osdep.h itself,
> not in the things which include it.
>
> >
> > /*
> >  * glib.h expects to be included as a C++ header if we're
> >  * building a C++ file, but osdep.h and thus glib-compat.h are
> >  * C headers and should be inside an "extern C" block.
> >  */
> > #ifdef __cplusplus
> > extern "C++" {
> > #include 
> > #if defined(G_OS_UNIX)
> > #include 
> > #endif
> > }
> >
> > in include/glib-compat.h ?
>
> That'd be even worse.
>
> We need to make headers that need to be used from C++ code follow
> the pattern:
>
> #include 
> #include 
> #include 
> ...all other includs..
>
> extern "C" {
> ..
> only the declarations, no #includes
> ...
> };

While I can follow the words and the always awesome explanations by Daniel,
I must admit that I'm a bit lost as to what a v2 of this could look like.

osdep.h as of today unfortunately isn't as trivial as "1. includes, 2.
declarations".
There are late includes deep in cascading ifdefs, and we all know that "just
moving includes around for the above fix to work in an easy way" in headers
will likely (maybe even silently) break things.

So I wonder, is this going to become a massive patch either moving a lot or
adding many extern "C" declarations all over the place in osdep.h? Or did I
just fail to see that there is an obviously better approach to this?

>
> Regards,
> Daniel
> --
> |: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o-https://fstop138.berrange.com :|
> |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
>


-- 
Christian Ehrhardt
Staff Engineer, Ubuntu Server
Canonical Ltd



[PATCH] disas: Fix build with glib2.0 >=2.67.3

2021-02-23 Thread Christian Ehrhardt
glib2.0 introduced a change in 2.67.3 and later which triggers an
issue [1] for anyone including its headers in an "extern C" context,
which a few files in disas/* do. An example of such an include chain
and the resulting error looks like:

../../disas/arm-a64.cc
In file included from /usr/include/glib-2.0/glib/gmacros.h:241,
 from /usr/lib/x86_64-linux-gnu/glib-2.0/include/glibconfig.h:9,
 from /usr/include/glib-2.0/glib/gtypes.h:32,
 from /usr/include/glib-2.0/glib/galloca.h:32,
 from /usr/include/glib-2.0/glib.h:30,
 from /<>/qemu-5.2+dfsg/include/glib-compat.h:32,
 from /<>/qemu-5.2+dfsg/include/qemu/osdep.h:126,
 from ../../disas/arm-a64.cc:21:
/usr/include/c++/10/type_traits:56:3: error: template with C linkage
   56 |   template
  |   ^~~~
../../disas/arm-a64.cc:20:1: note: ‘extern "C"’ linkage started here
   20 | extern "C" {
  | ^~

To fix that, move the include of osdep.h out of that section. It was already
added there as a C++ fixup by e78490c44: "disas/arm-a64.cc: Include osdep.h first".

[1]: https://gitlab.gnome.org/GNOME/glib/-/issues/2331

Signed-off-by: Christian Ehrhardt 
---
 disas/arm-a64.cc   | 2 +-
 disas/nanomips.cpp | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/disas/arm-a64.cc b/disas/arm-a64.cc
index 9fa779e175..27613d4b25 100644
--- a/disas/arm-a64.cc
+++ b/disas/arm-a64.cc
@@ -17,8 +17,8 @@
  * along with this program.  If not, see <http://www.gnu.org/licenses/>.
  */
 
-extern "C" {
 #include "qemu/osdep.h"
+extern "C" {
 #include "disas/dis-asm.h"
 }
 
diff --git a/disas/nanomips.cpp b/disas/nanomips.cpp
index 90e63b8367..3c202075cc 100644
--- a/disas/nanomips.cpp
+++ b/disas/nanomips.cpp
@@ -27,8 +27,8 @@
  *  Reference Manual", Revision 01.01, April 27, 2018
  */
 
-extern "C" {
 #include "qemu/osdep.h"
+extern "C" {
 #include "disas/dis-asm.h"
 }
 
-- 
2.30.0




[PATCH] build: -no-pie is no functional linker flag

2020-12-14 Thread Christian Ehrhardt
Recent binutils changes dropping unsupported options [1] caused a build
issue in regard to the optionroms.

  ld -m elf_i386 -T /<>/pc-bios/optionrom//flat.lds -no-pie \
-s -o multiboot.img multiboot.o
  ld.bfd: Error: unable to disambiguate: -no-pie (did you mean --no-pie ?)

This isn't really a regression in ld.bfd; filing the bug upstream
revealed that this never worked as an ld flag [2] - in fact it seems we
were accidentally setting --nmagic.

Since it never had the wanted effect, this usage of LDFLAGS_NOPIE should be
droppable without any effect. This is also the only use of LDFLAGS_NOPIE
in .mak, therefore we can also remove it from being added there.

[1]: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=983d925d
[2]: https://sourceware.org/bugzilla/show_bug.cgi?id=27050#c5

Signed-off-by: Christian Ehrhardt 
---
 configure  | 3 ---
 pc-bios/optionrom/Makefile | 1 -
 2 files changed, 4 deletions(-)

diff --git a/configure b/configure
index 3f823ed163..61c17c2dde 100755
--- a/configure
+++ b/configure
@@ -2133,7 +2133,6 @@ EOF
 # Check we support --no-pie first; we will need this for building ROMs.
 if compile_prog "-Werror -fno-pie" "-no-pie"; then
   CFLAGS_NOPIE="-fno-pie"
-  LDFLAGS_NOPIE="-no-pie"
 fi
 
 if test "$static" = "yes"; then
@@ -2149,7 +2148,6 @@ if test "$static" = "yes"; then
   fi
 elif test "$pie" = "no"; then
   CONFIGURE_CFLAGS="$CFLAGS_NOPIE $CONFIGURE_CFLAGS"
-  CONFIGURE_LDFLAGS="$LDFLAGS_NOPIE $CONFIGURE_LDFLAGS"
 elif compile_prog "-Werror -fPIE -DPIE" "-pie"; then
   CONFIGURE_CFLAGS="-fPIE -DPIE $CONFIGURE_CFLAGS"
   CONFIGURE_LDFLAGS="-pie $CONFIGURE_LDFLAGS"
@@ -6768,7 +6766,6 @@ echo "QEMU_CXXFLAGS=$QEMU_CXXFLAGS" >> $config_host_mak
 echo "GLIB_CFLAGS=$glib_cflags" >> $config_host_mak
 echo "GLIB_LIBS=$glib_libs" >> $config_host_mak
 echo "QEMU_LDFLAGS=$QEMU_LDFLAGS" >> $config_host_mak
-echo "LDFLAGS_NOPIE=$LDFLAGS_NOPIE" >> $config_host_mak
 echo "LD_I386_EMULATION=$ld_i386_emulation" >> $config_host_mak
 echo "EXESUF=$EXESUF" >> $config_host_mak
 echo "HOST_DSOSUF=$HOST_DSOSUF" >> $config_host_mak
diff --git a/pc-bios/optionrom/Makefile b/pc-bios/optionrom/Makefile
index 084fc10f05..30771f8d17 100644
--- a/pc-bios/optionrom/Makefile
+++ b/pc-bios/optionrom/Makefile
@@ -41,7 +41,6 @@ override CFLAGS += $(call cc-option, $(Wa)-32)
 
 LD_I386_EMULATION ?= elf_i386
 override LDFLAGS = -m $(LD_I386_EMULATION) -T $(SRC_DIR)/flat.lds
-override LDFLAGS += $(LDFLAGS_NOPIE)
 
 all: multiboot.bin linuxboot.bin linuxboot_dma.bin kvmvapic.bin pvh.bin
 
-- 
2.29.2




[Bug 1894804] Re: Second DEVICE_DELETED event missing during virtio-blk disk device detach

2020-10-03 Thread Christian Ehrhardt
In my manual tests I was seeing that the delete->Event was taking a bit
longer; maybe your retry logic now kicks in before it is complete, and
due to that the problem takes place.

So with the PPA you should:
a) get a warning, but no longer the cancelled/undefined behavior on double delete
b) be able to tune your retry logic to no longer trigger the warning messages

I wasn't sure from the logs - is the fix in the PPA good enough to overcome
the issue completely, or is it missing something?

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1894804

Title:
  Second DEVICE_DELETED event missing during virtio-blk disk device
  detach

Status in QEMU:
  New
Status in qemu package in Ubuntu:
  Incomplete

Bug description:
  We are in the process of moving OpenStack CI across to use 20.04 Focal
  as the underlying OS and encountering the following issue in any test
  attempting to detach disk devices from running QEMU instances.

  We can see a single DEVICE_DELETED event raised to libvirtd for the
  /machine/peripheral/virtio-disk1/virtio-backend device but we do not
  see a second event for the actual disk. As a result the device is
  still marked as present in libvirt but QEMU reports it as missing in
  subsequent attempts to remove the device.

  The following log snippets can also be found in the following pastebin
  that's slightly easier to grok:

  http://paste.openstack.org/show/797564/

  https://review.opendev.org/#/c/746981/ libvirt: Bump
  MIN_{LIBVIRT,QEMU}_VERSION and NEXT_MIN_{LIBVIRT,QEMU}_VERSION

  https://zuul.opendev.org/t/openstack/build/4c56def513884c5eb3ba7b0adf7fa260
  nova-ceph-multistore

  
https://zuul.opendev.org/t/openstack/build/4c56def513884c5eb3ba7b0adf7fa260/log/controller/logs/dpkg-l.txt

  ii  libvirt-daemon                 6.0.0-0ubuntu8.3  amd64  Virtualization daemon
  ii  libvirt-daemon-driver-qemu     6.0.0-0ubuntu8.3  amd64  Virtualization daemon QEMU connection driver
  ii  libvirt-daemon-system          6.0.0-0ubuntu8.3  amd64  Libvirt daemon configuration files
  ii  libvirt-daemon-system-systemd  6.0.0-0ubuntu8.3  amd64  Libvirt daemon configuration files (systemd)
  ii  libvirt-dev:amd64              6.0.0-0ubuntu8.3  amd64  development files for the libvirt library
  ii  libvirt0:amd64                 6.0.0-0ubuntu8.3  amd64  library for interfacing with different virtualization systems
  [..]
  ii  qemu-block-extra:amd64         1:4.2-3ubuntu6.4  amd64  extra block backend modules for qemu-system and qemu-utils
  ii  qemu-slof                      20191209+dfsg-1   all    Slimline Open Firmware -- QEMU PowerPC version
  ii  qemu-system                    1:4.2-3ubuntu6.4  amd64  QEMU full system emulation binaries
  ii  qemu-system-arm                1:4.2-3ubuntu6.4  amd64  QEMU full system emulation binaries (arm)
  ii  qemu-system-common             1:4.2-3ubuntu6.4  amd64  QEMU full system emulation binaries (common files)
  ii  qemu-system-data               1:4.2-3ubuntu6.4  all    QEMU full system emulation (data files)
  ii  qemu-system-mips               1:4.2-3ubuntu6.4  amd64  QEMU full system emulation binaries (mips)
  ii  qemu-system-misc               1:4.2-3ubuntu6.4  amd64  QEMU full system emulation binaries (miscellaneous)
  ii  qemu-system-ppc                1:4.2-3ubuntu6.4  amd64  QEMU full system emulation binaries (ppc)
  ii  qemu-system-s390x              1:4.2-3ubuntu6.4  amd64  QEMU full system emulation binaries (s390x)
  ii  qemu-system-sparc              1:4.2-3ubuntu6.4  amd64  QEMU full system emulation binaries (sparc)
  ii  qemu-system-x86                1:4.2-3ubuntu6.4  amd64  QEMU full system emulation binaries (x86)
  ii  qemu-utils                     1:4.2-3ubuntu6.4  amd64  QEMU utilities

  
https://zuul.opendev.org/t/openstack/build/4c56def513884c5eb3ba7b0adf7fa260/log/controller/logs/libvirt/qemu
  /instance-003a_log.txt

  2020-09-07 19:29:55.021+: starting up libvirt version: 6.0.0, package: 
0ubuntu8.3 (Marc Deslauriers  Thu, 30 Jul 2020 
06:40:28 -0400), qemu version: 4.2.0Debian 1:4.2-3ubuntu6.4, kernel: 
5.4.0-45-generic, hostname: ubuntu-focal-ovh-bhs1-0019682147
  LC_ALL=C \
  PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin \
  HOME=/var/lib/libvirt/qemu/domain-86-instance-003a \
  

[Bug 1894804] Re: Second DEVICE_DELETED event missing during virtio-blk disk device detach

2020-10-02 Thread Christian Ehrhardt
Thanks Lee,
builds are most reliably done in Launchpad IMHO, and the Ubuntu delta is a quilt
stack which isn't as easily bisectable.
If we end up searching outside of the Ubuntu delta I can help to get you qemu
built from git for that.
But for some initial probing of which range of changes we actually want to look
at, let me provide you PPAs.

Try #1 in PPA [1] is the fix you referred to [2] that prevents the double
delete from causing unexpected results. It will also add the warnings you
have seen. So giving it a test should be a great first try.

Let me know what the results with that are.

[1]: https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/4294/+packages
[2]: 
https://github.com/qemu/qemu/commit/cce8944cc9efab47d4bf29cfffb3470371c3541b


[Bug 1849644] Re: QEMU VNC websocket proxy requires non-standard 'binary' subprotocol

2020-09-30 Thread Christian Ehrhardt
Trying to connect using
novnc   latest/stable:1.2.0  2020-07-31 (6) 18MB -

As-is, this fails to connect.
Keeping VNC up and refreshing qemu.


Updating to the new qemu from focal-proposed (the archive publishing issues we
had before this morning are resolved by now).
Get:67 http://archive.ubuntu.com/ubuntu focal-proposed/main amd64 qemu-utils amd64 1:4.2-3ubuntu6.7 [975 kB]
Get:68 http://archive.ubuntu.com/ubuntu focal-proposed/main amd64 qemu-system-common amd64 1:4.2-3ubuntu6.7 [1056 kB]
Get:69 http://archive.ubuntu.com/ubuntu focal-proposed/main amd64 qemu-block-extra amd64 1:4.2-3ubuntu6.7 [53.8 kB]
Get:70 http://archive.ubuntu.com/ubuntu focal-proposed/main amd64 qemu-system-data all 1:4.2-3ubuntu6.7 [563 kB]
Get:71 http://archive.ubuntu.com/ubuntu focal-proposed/main amd64 qemu-kvm amd64 1:4.2-3ubuntu6.7 [13.1 kB]
Get:72 http://archive.ubuntu.com/ubuntu focal-proposed/main amd64 qemu-system-x86 amd64 1:4.2-3ubuntu6.7 [6720 kB]
Get:73 http://archive.ubuntu.com/ubuntu focal-proposed/main amd64 qemu-system-gui amd64 1:4.2-3ubuntu6.7 [40.8 kB]
Get:74 http://archive.ubuntu.com/ubuntu focal-proposed/main amd64 qemu-system-mips amd64 1:4.2-3ubuntu6.7 [12.9 MB]

Now the same noVNC 1.2 can connect to it \o/
Setting verified.

** Tags removed: verification-needed verification-needed-focal
** Tags added: verification-done verification-done-focal

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1849644

Title:
  QEMU VNC websocket proxy requires non-standard 'binary' subprotocol

Status in QEMU:
  Fix Released
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Focal:
  Fix Committed

Bug description:
  [Impact]

   * The exact details of the protocol/subprotocol were slightly unclear
     between the projects. So qemu ended up insisting on "binary" being
     used, but newer noVNC clients no longer use it.

   * QEMU got fixed in 5.0 to be more tolerant and accept an empty
     subprotocol as well. This SRU backports that fix to Focal.

  [Test Case]

   * Without the fix the following will report "Failed to connect", but
     with the fix it will work.

  $ sudo apt install qemu-system-x86
  # will only boot into a failing bootloader, but that is enough
  $ qemu-system-x86_64 -vnc :0,websocket
  # We need version 1.2 or later, so use the snap
  $ snap install novnc
  $ novnc --vnc localhost:5700
  Connect browser to http://:6080/vnc.html
  Click "Connect"

   * Cross-check with an older noVNC (e.g. the one in Focal) that
     connectivity still works with those as well.

     - Reminders when switching between the noVNC implementations:
       - always fully refresh the browser (all clear, ctrl+alt+f5)
       - start/stop the snapped one via snap.novnc.novncsvc.service

  [Regression Potential]

   * This is exclusive to the functionality of noVNC, so regressions would
     have to be expected in there. The tests try to exercise the basics, but
     e.g. OpenStack would be a major product using

  [Other Info]

   * The noVNC in Focal itself does not yet have the offending change, but
     we want the qemu in Focal to be connectable from ~any type of client.

  ---


  
  When running a machine using "-vnc" and the "websocket" option QEMU
  seems to require the subprotocol called 'binary'. This subprotocol
  does not exist in the WebSocket specification. In fact it has never
  existed in the spec; in one of the very early drafts of WebSockets it
  was briefly mentioned but it never made it to a final version.

  When the WebSocket server requires a non-standard subprotocol any
  WebSocket client that works correctly won't be able to connect.

  One example of such a client is noVNC, it tells the server that it
  doesn't want to use any subprotocol. QEMU's WebSocket proxy doesn't
  let noVNC connect. If noVNC is modified to ask for 'binary' it will
  work, this is, however, incorrect behavior.

  Looking at the code in "io/channel-websock.c" it seems it's quite
  hard-coded to binary:

  Look at line 58 and 433 here:
  https://git.qemu.org/?p=qemu.git;a=blob;f=io/channel-websock.c

  This code has to be made more dynamic, and shouldn't require binary.
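
For illustration, the difference between a strict and a tolerant handshake can be sketched as below. This is only a minimal Python sketch, not QEMU's C code in io/channel-websock.c. All function names are hypothetical, and it assumes that a tolerant server accepts a request that carries no Sec-WebSocket-Protocol header while still honouring 'binary' when a client offers it.

```python
from typing import List, Optional, Tuple

BINARY = "binary"  # the subprotocol name this report is about


def parse_offered(header_value: Optional[str]) -> List[str]:
    """Split a Sec-WebSocket-Protocol request header into its tokens."""
    if not header_value:
        return []
    return [tok.strip() for tok in header_value.split(",") if tok.strip()]


def negotiate_strict(offered: List[str]) -> Tuple[bool, Optional[str]]:
    """Behaviour as reported: refuse every client that does not offer 'binary'."""
    if BINARY in offered:
        return True, BINARY
    return False, None


def negotiate_tolerant(offered: List[str]) -> Tuple[bool, Optional[str]]:
    """Assumed fixed behaviour: also accept clients offering no subprotocol."""
    if not offered:
        return True, None  # accept, and send no Sec-WebSocket-Protocol back
    if BINARY in offered:
        return True, BINARY
    return False, None  # client only offered subprotocols we do not speak


if __name__ == "__main__":
    # Each result is (accepted, chosen subprotocol) for strict and tolerant.
    for header in (None, "binary", "chat, binary", "chat"):
        offered = parse_offered(header)
        print(header, negotiate_strict(offered), negotiate_tolerant(offered))
```

A client that sends no subprotocol at all, as the report describes for noVNC, is rejected by the strict variant and accepted by the tolerant one. Older clients that still offer 'binary' get the same answer from both.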

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1849644/+subscriptions



[Bug 1894804] Re: Second DEVICE_DELETED event missing during virtio-blk disk device detach

2020-09-29 Thread Christian Ehrhardt
Thanks James, but now I'm unsure where to go from here, as it isn't
reproducible with the many tries at different scales that James and I did.

@Sean/Lee
Since you wondered if it might be due to the Ubuntu delta on top of 4.2, there
are two things we could compare Ubuntu's qemu to:
1. qemu 4.2 as released by upstream
2. qemu 4.2 as built in CentOS (they have some delta as well)
Not sure I can provide you #2 easily, and #1 will always need a bit of delta to
build correctly. All of that would still be doable, but in general: if I
provided you PPA builds, could you try those in your environment to !reliably!
trigger the issue, allowing us to play bisect-ping-pong? The word "reliable" is
important here as we'd need to sort out builds/patches by a reliable yes/no on
each step.

@Sean/Lee
- Which version exactly is "the centos 8 build of the same qemu"? I might
take a look at comparing the patches applied. We are at 4.2.1 already; the
latest I found there was 4.2.0-29.el8.3.x86_64.rpm, which already is the
advanced-virt version (otherwise 2.12 based).

@Sean/Lee
All of the following suggested approaches depend on whether you can test this
reliably with different qemu PPA builds:
- qemu 4.2 (as upstream) vs the usual Ubuntu build -> find the offending patch
  in our delta
- test Ubuntu 20.10, which has qemu 5.0 and libvirt 6.6 -> if fixed there, find
  by which change
- qemu 4.2 Ubuntu vs qemu 4.2 as in CentOS (but built for Ubuntu) -> if the
  latter works better, then let us find by which (set of) patches.
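
To show why the "reliable yes/no" matters for the bisect-ping-pong, here is a minimal Python sketch of the search over an ordered patch delta. The function and patch names are hypothetical. It assumes the build without the delta is good, the build with the whole delta is bad, and that each test gives a trustworthy answer.

```python
from typing import Callable, Sequence


def first_bad_patch(patches: Sequence[str],
                    shows_issue: Callable[[int], bool]) -> str:
    """Return the first patch whose inclusion makes the issue appear.

    shows_issue(n) stands for "build with patches[0..n] applied, test it,
    report a reliable yes/no".
    """
    if not patches:
        raise ValueError("nothing to bisect")
    lo, hi = 0, len(patches) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if shows_issue(mid):
            hi = mid          # culprit is at mid or earlier
        else:
            lo = mid + 1      # culprit is after mid
    return patches[lo]


if __name__ == "__main__":
    delta = [f"patch-{i:02d}" for i in range(27)]   # hypothetical patch names
    culprit = 18                                    # hypothetical position
    tested = []

    def shows_issue(n: int) -> bool:
        tested.append(n)
        return n >= culprit

    print(first_bad_patch(delta, shows_issue), "after", len(tested), "builds")
```

Each step halves the candidate range, so only a handful of PPA builds are needed. A single wrong answer from a flaky reproducer sends the search into the wrong half without any way to notice, which is why the issue has to trigger reliably first.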


I was checking the delta in the CentOS 8 advanced-virt qemu 4.2, as that was
reported to (maybe) work better. I compared which patches are applied there
that are not on Ubuntu, and which of those might be related.
Among several individual fixes for some issues, the biggest patch sets are
feature backports for enhanced LUKS/backup/snapshot handling, multifd
migration/cancel, block-mirror, HMAT changes, virtiofs, qemu-img zero write,
arm time handling and some related build time self tests.
Due to their nature, some of these changes affect block handling by touching
block/job/hotplug. But they might only do so by accident; nothing clearly
addresses the issue that was reported. And even of the list below most seem
unrelated - so, as Sean assumed, maybe it just isn't exercised on CentOS enough
to be seen there?

eca0f3524a4eb57d03a56b0cbcef5527a0981ce4 backup: don't acquire aio_context in backup_clean
58226634c4b02af7b10862f7fbd3610a344bfb7f backup: Improve error for bdrv_getlength() failure
958a04bd32af18d9a207bcc78046e56a202aebc2 backup: Make sure that source and target size match
7b8e4857426f2e2de2441749996c6161b550bada block: Add flags to bdrv(_co)_truncate()
92b92799dc8662b6f71809100a4aabc1ae408ebb block: Add flags to BlockDriver.bdrv_co_truncate()
087ab8e775f48766068e65de1bc99d03b40d1670 block: always fill entire LUKS header space with zeros
8c6242b6f383e43fd11d2c50f8bcdd2bba1100fc block-backend: Add flags to blk_truncate()
564806c529d4e0acad209b1e5b864a8886092f1f block-backend: Reorder flush/pdiscard function definitions
0abf2581717a19d9749d5c2ff8acd0ac203452c2 block/backup-top: Don't acquire context while dropping top
1de6b45fb5c1489b450df7d1a4c692bba9678ce6 block: bdrv_reopen() with backing file in different AioContext
e1d7f8bb1ec0c6911dcea81641ce6139dbded02d block.c: adding bdrv_co_delete_file
69032253c33ae1774233c63cedf36d32242a85fc block/curl: HTTP header field names are case insensitive
7788a319399f17476ff1dd43164c869e320820a2 block/curl: HTTP header fields allow whitespace around values
91005a495e228ebd7e5e173cd18f952450eef82d blockdev: Acquire AioContext on dirty bitmap functions
471ded690e19689018535e3f48480507ed073e22 blockdev: fix coding style issues in drive_backup_prepare
3ea67e08832775a28d0bd2795f01bc77e7ea1512 blockdev: honor bdrv_try_set_aio_context() context requirements
c6996cf9a6c759c29919642be9a73ac64b38301b blockdev: Promote several bitmap functions to non-static
377410f6fb4f6b0d26d4a028c20766fae05de17e blockdev: Return bs to the proper context on snapshot abort
bb4e58c6137e80129b955789dd4b66c1504f20dc blockdev: Split off basic bitmap operations for qemu-img
5b7bfe515ecbd584b40ff6e41d2fd8b37c7d5139 blockdev: unify qmp_blockdev_backup and blockdev-backup transaction paths
2288ccfac96281c316db942d10e3f921c1373064 blockdev: unify qmp_drive_backup and drive-backup transaction paths
7f16476fab14fc32388e0ebae793f64673848efa block: Fix blk->in_flight during blk_wait_while_drained()
30dd65f307b647eef8156c4a33bd007823ef85cb block: Fix cross-AioContext blockdev-snapshot
eeea1faa099f82328f5831cf252f8ce0a59a9287 block: Fix leak in bdrv_create_file_fallback()
fd17146cd93d1704cd96d7c2757b325fc7aac6fd block: Generic file creation fallback
fbb92b6798894d3bf62fe3578d99fa62c720b242 block: Increase BB.in_flight for coroutine and sync interfaces
17e1e2be5f9e84e0298e28e70675655b43e225ea block: Introduce 'bdrv_reopen_commit_post' step

[Bug 1849644] Re: QEMU VNC websocket proxy requires non-standard 'binary' subprotocol

2020-09-25 Thread Christian Ehrhardt
SRU Template for qemu added and MP linked to fix this in Ubuntu 20.04

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1849644

Title:
  QEMU VNC websocket proxy requires non-standard 'binary' subprotocol

Status in QEMU:
  Fix Released
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Focal:
  In Progress

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1849644/+subscriptions



[Bug 1849644] Re: QEMU VNC websocket proxy requires non-standard 'binary' subprotocol

2020-09-24 Thread Christian Ehrhardt
** Description changed:

- When running a machine using "-vnc" and the "websocket" option QEMU
- seems to require the subprotocol called 'binary'. This subprotocol does
- not exist in the WebSocket specification. In fact it has never existed
- in the spec, in one of the very early drafts of WebSockets it was
- briefly mentioned but it never made it to a final version.
+ [Impact]
+ 
+  * The exact details of the protocol/subprotocal was slightly unclear
+between the projects. So qemu ended up insisting on "binary" being
+used but newer noVNC clients no more used it.
+ 
+  * Qemu got fixed in 5.0 to be more tolerant and accept an empty sub-
+protocol as well. This SRU backports that fix to Focal.
+ 
+ [Test Case]
+ 
+  * Without the fix the following will "Failed to connect", but with the
+ fix it will work.
+ 
+ $ sudo apt install qemu-system-x86
+ # will only boot into a failing bootloader, but that is enough
+ $ qemu-system-x86_64 -vnc :0,websocket
+ # We need version 1.2 or later, so use the snap
+ $ snap install novnc
+ $ novnc --vnc localhost:5700
+ Connect browser to http://:6080/vnc.html
+ Click "Connect"
+ 
+  * Cross check with an older noVNC (e.g. the one in Focal) if the 
+connectivity still works on those as well
+ 
+- Reminders when switching between the noVNC implementations
+  - always refresh the browser with all clear ctrl+alt+f5
+  - start/stop the snapped one via snap.novnc.novncsvc.service
+ 
+ [Regression Potential]
+ 
+  * This is exclusive to the functionality of noVNC, so regressions would 
+have to be expected in there. The tests try to exercise the basics, but
+e.g. Openstack would be a major product using 
+ 
+ [Other Info]
+  
+  * The noVNC in Focal itself does not yet have the offending change, but
+we want the qemu in focal to be connecteable from ~any type of client
+ 
+ 
+ ---
+ 
+ 
+ 
+ When running a machine using "-vnc" and the "websocket" option QEMU seems to
+ require the subprotocol called 'binary'. This subprotocol does not exist in the
+ WebSocket specification. In fact it has never existed in the spec, in one of
+ the very early drafts of WebSockets it was briefly mentioned but it never made
+ it to a final version.
  
  When the WebSocket server requires a non-standard subprotocol any
  WebSocket client that works correctly won't be able to connect.
  
  One example of such a client is noVNC, it tells the server that it
  doesn't want to use any subprotocol. QEMU's WebSocket proxy doesn't let
  noVNC connect. If noVNC is modified to ask for 'binary' it will work,
  this is, however, incorrect behavior.
  
  Looking at the code in "io/channel-websock.c" it seems it's quite hard-
  coded to binary:
  
  Look at line 58 and 433 here:
  https://git.qemu.org/?p=qemu.git;a=blob;f=io/channel-websock.c
  
  This code has to be made more dynamic, and shouldn't require binary.

** Changed in: qemu (Ubuntu Focal)
   Status: Confirmed => Triaged

** Changed in: qemu (Ubuntu Focal)
   Status: Triaged => In Progress

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1849644

Title:
  QEMU VNC websocket proxy requires non-standard 'binary' subprotocol

Status in QEMU:
  Fix Released
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Focal:
  In Progress

[Bug 1849644] Re: QEMU VNC websocket proxy requires non-standard 'binary' subprotocol

2020-09-24 Thread Christian Ehrhardt
I can confirm the test steps mentioned above. In an otherwise unchanged
scenario, a formerly failing connection started working after replacing the
qemu in use (on Focal) with a patched one.

Adding an SRU Template for Focal to the bug description ...

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1849644

Title:
  QEMU VNC websocket proxy requires non-standard 'binary' subprotocol

Status in QEMU:
  Fix Released
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Focal:
  Confirmed

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1849644/+subscriptions



[Bug 1894804] Re: Second DEVICE_DELETED event missing during virtio-blk disk device detach

2020-09-24 Thread Christian Ehrhardt
As outlined before, the time from device_del to DEVICE_DELETED is
indeed "a bit longer" in Focal (from ~0.1s to ~6s), but other than that
I couldn't find anything else yet.

Due to [1] in a related bug we were wondering if it might depend on
(over)load.
I was consuming disk and CPU in a controlled fashion (stress-ng --cpu 8 --io 4
--hdd 4, 1x in the container the load is in and 1x on the host). On top of that
I was running attach/detach loops, but in all of the two hundred retries the
sequence was
"->device_del <-DEVICE_DELETED <-DEVICE_DELETED ->blockdev-del ->blockdev-del".

James Page was doing tests (thanks) with real OpenStack and Ceph but
couldn't - so far - trigger the issue either. He will likely update
the bug later as well, but is currently trying to ramp up concurrency to
make hitting the issue more likely.
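
As a sketch of how the monitor sequence from the attach/detach loops above could be checked automatically over many runs, something like the following Python might do. The "->"/"<-" strings are just the shorthand used in this comment, not a libvirt or QEMU format, and the function names are hypothetical.

```python
from collections import Counter
from typing import Iterable, List

# Sequence seen in every one of the healthy runs described above.
HEALTHY = ["->device_del", "<-DEVICE_DELETED", "<-DEVICE_DELETED",
           "->blockdev-del", "->blockdev-del"]


def classify_detach(messages: List[str]) -> str:
    """Classify one detach attempt by its ordered monitor messages."""
    if messages == HEALTHY:
        return "ok"
    if "->device_del" in messages and messages.count("<-DEVICE_DELETED") == 1:
        return "second DEVICE_DELETED missing"
    return "unexpected sequence"


def tally(runs: Iterable[List[str]]) -> Counter:
    """Count the outcomes over many attach/detach iterations."""
    return Counter(classify_detach(run) for run in runs)


if __name__ == "__main__":
    runs = [list(HEALTHY) for _ in range(200)]
    print(tally(runs))
```

With a tally like this a rare failure under higher concurrency would show up as a non-zero count for the missing-event case instead of having to be spotted by eye in the logs.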

[1]: https://bugs.launchpad.net/nova/+bug/1882521/comments/8

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1894804

Title:
  Second DEVICE_DELETED event missing during virtio-blk disk device
  detach

Status in QEMU:
  New
Status in qemu package in Ubuntu:
  New

Bug description:
  We are in the process of moving OpenStack CI across to use 20.04 Focal
  as the underlying OS and encountering the following issue in any test
  attempting to detach disk devices from running QEMU instances.

  We can see a single DEVICE_DELETED event raised to libvirtd for the
  /machine/peripheral/virtio-disk1/virtio-backend device but we do not
  see a second event for the actual disk. As a result the device is
  still marked as present in libvirt but QEMU reports it as missing in
  subsequent attempts to remove the device.

  The following log snippets can also be found in the following pastebin
  that's slightly easier to grok:

  http://paste.openstack.org/show/797564/

  https://review.opendev.org/#/c/746981/ libvirt: Bump
  MIN_{LIBVIRT,QEMU}_VERSION and NEXT_MIN_{LIBVIRT,QEMU}_VERSION

  https://zuul.opendev.org/t/openstack/build/4c56def513884c5eb3ba7b0adf7fa260
  nova-ceph-multistore

  
https://zuul.opendev.org/t/openstack/build/4c56def513884c5eb3ba7b0adf7fa260/log/controller/logs/dpkg-l.txt

  ii  libvirt-daemon                 6.0.0-0ubuntu8.3   amd64  Virtualization daemon
  ii  libvirt-daemon-driver-qemu     6.0.0-0ubuntu8.3   amd64  Virtualization daemon QEMU connection driver
  ii  libvirt-daemon-system          6.0.0-0ubuntu8.3   amd64  Libvirt daemon configuration files
  ii  libvirt-daemon-system-systemd  6.0.0-0ubuntu8.3   amd64  Libvirt daemon configuration files (systemd)
  ii  libvirt-dev:amd64              6.0.0-0ubuntu8.3   amd64  development files for the libvirt library
  ii  libvirt0:amd64                 6.0.0-0ubuntu8.3   amd64  library for interfacing with different virtualization systems
  [..]
  ii  qemu-block-extra:amd64         1:4.2-3ubuntu6.4   amd64  extra block backend modules for qemu-system and qemu-utils
  ii  qemu-slof                      20191209+dfsg-1    all    Slimline Open Firmware -- QEMU PowerPC version
  ii  qemu-system                    1:4.2-3ubuntu6.4   amd64  QEMU full system emulation binaries
  ii  qemu-system-arm                1:4.2-3ubuntu6.4   amd64  QEMU full system emulation binaries (arm)
  ii  qemu-system-common             1:4.2-3ubuntu6.4   amd64  QEMU full system emulation binaries (common files)
  ii  qemu-system-data               1:4.2-3ubuntu6.4   all    QEMU full system emulation (data files)
  ii  qemu-system-mips               1:4.2-3ubuntu6.4   amd64  QEMU full system emulation binaries (mips)
  ii  qemu-system-misc               1:4.2-3ubuntu6.4   amd64  QEMU full system emulation binaries (miscellaneous)
  ii  qemu-system-ppc                1:4.2-3ubuntu6.4   amd64  QEMU full system emulation binaries (ppc)
  ii  qemu-system-s390x              1:4.2-3ubuntu6.4   amd64  QEMU full system emulation binaries (s390x)
  ii  qemu-system-sparc              1:4.2-3ubuntu6.4   amd64  QEMU full system emulation binaries (sparc)
  ii  qemu-system-x86                1:4.2-3ubuntu6.4   amd64  QEMU full system emulation binaries (x86)
  ii  qemu-utils                     1:4.2-3ubuntu6.4   amd64  QEMU utilities

  
https://zuul.opendev.org/t/openstack/build/4c56def513884c5eb3ba7b0adf7fa260/log/controller/logs/libvirt/qemu
  /instance-003a_log.txt

  2020-09-07 

[Bug 1849644] Re: QEMU VNC websocket proxy requires non-standard 'binary' subprotocol

2020-09-24 Thread Christian Ehrhardt
Repro steps:
$ sudo apt install qemu-system-x86 novnc
$ cd /usr/share/novnc
$ python3 -m http.server

Connect browser to http://:8000/vnc.html
  Click "Settings"
  Open "advanced"
  Open "websocket"
  Set port 5700
  Clear path
  Click "Connect"

And it works ...

Turns out my former check on the offending noVNC commit was wrong.
There is
https://github.com/novnc/noVNC/commit/f8318361b1b62c4d76b091132d4a8ccfdd2957e4
(referenced on this bug before)
$ git tag --contains f8318361b1b62c4d76b091132d4a8ccfdd2957e4
v1.0.0
But 'binary' was only really gone later with:
https://github.com/novnc/noVNC/commit/c912230309806aacbae4295faf7ad6406da97617
$ git tag --contains c912230309806aacbae4295faf7ad6406da97617
v1.2.0

So the noVNC of Focal isn't affected, but anyone who uses a newer noVNC >=1.2
would be.
=> lower SRU priority
=> Modify the above repro steps to not use noVNC from Focal via apt, but use 1.2
from snaps


Repro steps:
$ snap install novnc
$ novnc --vnc localhost:5700
Connect browser to http://:6080/vnc.html
Click "Connect"

TODO: repro steps to be verified with a qemu that has the fix applied

** Changed in: qemu (Ubuntu Focal)
   Importance: Undecided => Low

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1849644

Title:
  QEMU VNC websocket proxy requires non-standard 'binary' subprotocol

Status in QEMU:
  Fix Released
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Focal:
  Confirmed

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1849644/+subscriptions



[Bug 1849644] Re: QEMU VNC websocket proxy requires non-standard 'binary' subprotocol

2020-09-24 Thread Christian Ehrhardt
This is in v5.0.0, so it is fixed in Groovy.
The related noVNC change is in upstream version >=v1.0.0, which correlates with
Ubuntu >=20.04; therefore fixing this in Focal's QEMU seems good.

** Also affects: qemu (Ubuntu)
   Importance: Undecided
   Status: New

** Also affects: qemu (Ubuntu Focal)
   Importance: Undecided
   Status: New

** Changed in: qemu (Ubuntu)
   Status: New => Fix Released

** Changed in: qemu (Ubuntu Focal)
   Status: New => Confirmed

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1849644

Title:
  QEMU VNC websocket proxy requires non-standard 'binary' subprotocol

Status in QEMU:
  Fix Released
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Focal:
  Confirmed

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1849644/+subscriptions



[Bug 1894804] Re: Second DEVICE_DELETED event missing during virtio-blk disk device detach

2020-09-21 Thread Christian Ehrhardt
Hi,
interesting bug report - thanks Lee, Kashyap and Sean - as well as Danil for
taking a look already.

If this always failed, no unplug would ever work, which I know can't
be right as I test that. So we need to find what is different...


@OpenStack people - is that reliably triggering at the very first hot-detach
for you, or more like "happening in one of a million detaches triggered by the
CI"?


First I simplified the case from "massive OpenStack CI" to just a few commands
and small XMLs...
So I tried to cut Ceph out of the equation here and compared hot remove between
Bionic and Focal qemu/libvirt when removing a local image file.

$ qemu-img create -f qcow2 /var/lib/uvtool/libvirt/images/testdisk.qcow 20m
Disk:

  
  
  


The systems have just this guest each and the log debug config is like:
 /etc/libvirt/libvirtd.conf:
 > log_filters="3:qemu 3:libvirt 4:object 2:json 4:event 3:util"  
 > log_outputs="2:file:/var/log/libvirtd.log"
 $ rm /var/log/libvirtd.log
 $ systemctl restart libvirtd

Detach via:
 $ virsh detach-disk f-on-b vdc --live

I compared the monitor streams in those two cases, but they both deliver
DEVICE_DELETED for /machine/peripheral/virtio-disk2/virtio-backend as well as
/machine/peripheral/virtio-disk2.


@Jamespage could you get me a system with at least one auxiliary Ceph disk that
I can attach/detach to check if this might be in any way Ceph-specific?


Details in case we need them for comparison later on ...:

## Bionic ##

libvirt checks devices before removal:

2020-09-21 11:00:19.287+: 5755: info : qemuMonitorSend:1076 : 
QEMU_MONITOR_SEND_MSG: mon=0x7fdf94001430 
msg={"execute":"qom-list","arguments":{"path":"/machine/peripheral"},"id":"libvirt-12"}
2020-09-21 11:00:19.288+: 5686: info : qemuMonitorJSONIOProcessLine:213 : 
QEMU_MONITOR_RECV_REPLY: mon=0x7fdf94001430 reply={"return": [{"name": 
"virtio-disk2", "type": "child"}, {"name": "virtio-disk1", 
"type": "child"}, {"name": "virtio-disk0", "type": 
"child"}, {"name": "video0", "type": "child"}, 
{"name": "serial0", "type": "child"}, {"name": "balloon0", "type": 
"child"}, {"name": "net0", "type": 
"child"}, {"name": "usb", "type": "child"}, 
{"name": "type", "type": "string"}], "id": "libvirt-12"}

Then send removal command

2020-09-21 11:00:48.672+: 5691: info : qemuMonitorSend:1076 : 
QEMU_MONITOR_SEND_MSG: mon=0x7fdf94001430 
msg={"execute":"device_del","arguments":{"id":"virtio-disk2"},"id":"libvirt-13"}
2020-09-21 11:00:48.673+: 5686: info : qemuMonitorJSONIOProcessLine:213 : 
QEMU_MONITOR_RECV_REPLY: mon=0x7fdf94001430 reply={"return": {}, "id": 
"libvirt-13"}
2020-09-21 11:00:48.721+: 5686: info : qemuMonitorJSONIOProcessLine:208 : 
QEMU_MONITOR_RECV_EVENT: mon=0x7fdf94001430 event={"timestamp": {"seconds": 
1600686048, "microseconds": 720908}, "event": "DEVICE_DELETED", "data": 
{"path": "/machine/peripheral/virtio-disk2/virtio-backend"}}
2020-09-21 11:00:48.723+: 5686: info : qemuMonitorJSONIOProcessLine:208 : 
QEMU_MONITOR_RECV_EVENT: mon=0x7fdf94001430 event={"timestamp": {"seconds": 
1600686048, "microseconds": 723091}, "event": "DEVICE_DELETED", "data": 
{"device": "virtio-disk2", "path": "/machine/peripheral/virtio-disk2"}}

^^ both DEVICE_DELETED occur in response to the "device_del".
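
The same check can be done mechanically on the libvirtd log. Below is a minimal Python sketch that pulls the DEVICE_DELETED paths out of QEMU_MONITOR_RECV_EVENT records and tests whether both the backend and the device itself were reported. The helper names are hypothetical, and it assumes each record sits on one line in the real log file (they are only wrapped in this mail).

```python
import json
import re
from typing import Iterable, List

# Matches the event payload of a libvirt "QEMU_MONITOR_RECV_EVENT" record.
EVENT_RE = re.compile(r"QEMU_MONITOR_RECV_EVENT: mon=\S+ event=(\{.*\})\s*$")

# The two records quoted above, joined back into single lines.
SAMPLE = [
    '2020-09-21 11:00:48.721+: 5686: info : qemuMonitorJSONIOProcessLine:208 : '
    'QEMU_MONITOR_RECV_EVENT: mon=0x7fdf94001430 event={"timestamp": {"seconds": '
    '1600686048, "microseconds": 720908}, "event": "DEVICE_DELETED", "data": '
    '{"path": "/machine/peripheral/virtio-disk2/virtio-backend"}}',
    '2020-09-21 11:00:48.723+: 5686: info : qemuMonitorJSONIOProcessLine:208 : '
    'QEMU_MONITOR_RECV_EVENT: mon=0x7fdf94001430 event={"timestamp": {"seconds": '
    '1600686048, "microseconds": 723091}, "event": "DEVICE_DELETED", "data": '
    '{"device": "virtio-disk2", "path": "/machine/peripheral/virtio-disk2"}}',
]


def deleted_paths(log_lines: Iterable[str]) -> List[str]:
    """Return the QOM paths of all DEVICE_DELETED events, in log order."""
    paths = []
    for line in log_lines:
        match = EVENT_RE.search(line)
        if not match:
            continue
        event = json.loads(match.group(1))
        if event.get("event") == "DEVICE_DELETED":
            paths.append(event.get("data", {}).get("path"))
    return paths


def detach_complete(paths: List[str], alias: str) -> bool:
    """True if both the virtio backend and the device itself were deleted."""
    frontend = f"/machine/peripheral/{alias}"
    return f"{frontend}/virtio-backend" in paths and frontend in paths


if __name__ == "__main__":
    print(detach_complete(deleted_paths(SAMPLE), "virtio-disk2"))
```

Feeding it only the first record gives the situation from the bug title: the backend event is there, but the one for the device itself is missing.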

vv Now also remove the device, but that already happened implicitly

2020-09-21 11:00:48.723+: 5691: info : qemuMonitorSend:1076 : 
QEMU_MONITOR_SEND_MSG: mon=0x7fdf94001430 
msg={"execute":"human-monitor-command","arguments":{"command-line":"drive_del 
drive-virtio-disk2"},"id":"libvirt-14"}
2020-09-21 11:00:48.724+: 5686: info : qemuMonitorJSONIOProcessLine:213 : 
QEMU_MONITOR_RECV_REPLY: mon=0x7fdf94001430 reply={"return": "Device 
'drive-virtio-disk2' not found\r\n", "id": "libvirt-14"}

In the follow-up query, virtio-disk2 is gone:

2020-09-21 11:00:48.875+: 5691: info : qemuMonitorSend:1076 : 
QEMU_MONITOR_SEND_MSG: mon=0x7fdf94001430 
msg={"execute":"qom-list","arguments":{"path":"/machine/peripheral"},"id":"libvirt-15"}
2020-09-21 11:00:48.876+: 5686: info : qemuMonitorJSONIOProcessLine:213 : 
QEMU_MONITOR_RECV_REPLY: mon=0x7fdf94001430 reply={"return": [{"name": 
"virtio-disk1", "type": "child"}, {"name": "virtio-disk0", 
"type": "child"}, {"name": "video0", "type": 
"child"}, {"name": "serial0", "type": "child"}, 
{"name": "balloon0", "type": "child"}, {"name": "net0", 
"type": "child"}, {"name": "usb", "type": 
"child"}, {"name": "type", "type": "string"}], "id": 
"libvirt-15"}

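The before/after comparison of the two qom-list replies can also be done mechanically. A minimal Python sketch, assuming the replies are available as JSON strings; the helper name peripheral_names is made up for illustration and is not part of libvirt or QEMU, and the replies are abbreviated from the Bionic log above:

```python
import json


def peripheral_names(reply_text):
    """Return the names of the child devices listed in a qom-list reply."""
    reply = json.loads(reply_text)
    return {entry["name"] for entry in reply["return"] if entry["type"] == "child"}


# Abbreviated copies of the replies to libvirt-12 (before) and libvirt-15 (after).
before = (
    '{"return": [{"name": "virtio-disk2", "type": "child"}, '
    '{"name": "virtio-disk1", "type": "child"}, '
    '{"name": "virtio-disk0", "type": "child"}, '
    '{"name": "type", "type": "string"}], "id": "libvirt-12"}'
)
after = (
    '{"return": [{"name": "virtio-disk1", "type": "child"}, '
    '{"name": "virtio-disk0", "type": "child"}, '
    '{"name": "type", "type": "string"}], "id": "libvirt-15"}'
)

# Devices present before the device_del but missing afterwards.
removed = peripheral_names(before) - peripheral_names(after)
print(sorted(removed))
```

Running the same comparison on the Focal replies would show whether the device really disappears from /machine/peripheral there as well.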

## Focal ##

libvirt checks devices before removal:

2020-09-21 11:00:20.856+: 6395: info : qemuMonitorSend:993 : 
QEMU_MONITOR_SEND_MSG: mon=0x7f463800b3e0 
msg={"execute":"qom-list","arguments":{"path":"/machine/peripheral"},"id":"libvirt-12"}
2020-09-21 11:00:20.857+: 6319: info : qemuMonitorJSONIOProcessLine:239 : 
QEMU_MONITOR_RECV_REPLY: mon=0x7f463800b3e0 reply={"return": [{"name": "type", 
"type": "string"}, {"name": "pci.7", "type": "child"}, 

[Bug 1589923] Re: https websockets not working in 2.5 or 2.6

2020-08-31 Thread Christian Ehrhardt
** Tags removed: server-next

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1589923

Title:
  https websockets not working in 2.5 or 2.6

Status in QEMU:
  Fix Released
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Xenial:
  Incomplete
Status in qemu source package in Yakkety:
  Won't Fix
Status in Arch Linux:
  New

Bug description:
  % gdb --args ./x86_64-softmmu/qemu-system-x86_64 -vnc 
0.0.0.0:1,tls,x509=/etc/pki/libvirt-le,websocket=5701 
  
  GNU gdb (GDB) 7.11
  Copyright (C) 2016 Free Software Foundation, Inc.
  License GPLv3+: GNU GPL version 3 or later 
  This is free software: you are free to change and redistribute it.
  There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
  and "show warranty" for details.
  This GDB was configured as "x86_64-pc-linux-gnu".
  Type "show configuration" for configuration details.
  For bug reporting instructions, please see:
  .
  Find the GDB manual and other documentation resources online at:
  .
  For help, type "help".
  Type "apropos word" to search for commands related to "word"...
  Reading symbols from ./x86_64-softmmu/qemu-system-x86_64...done.
  (gdb) run
  Starting program: /home/ben/qemu/qemu-2.6.0/x86_64-softmmu/qemu-system-x86_64 
-vnc 0.0.0.0:1,tls,x509=/etc/pki/libvirt-le,websocket=5701
  [Thread debugging using libthread_db enabled]
  Using host libthread_db library "/usr/lib/libthread_db.so.1".
  [New Thread 0x7fffe16f6700 (LWP 12767)]
  [New Thread 0x7fffde2d4700 (LWP 12768)]
  [New Thread 0x7fffd3fff700 (LWP 12769)]
  Initializing VNC server with x509 no auth
  Client sioc=0x5874d6b0 ws=1 auth=1 subauth=0
  New client on socket 0x5874d6b0
  vnc_set_share_mode/0x5874d6b0: undefined -> connecting
  TLS Websocket connection required
  Start TLS WS handshake process
  Handshake failed TLS handshake failed: The TLS connection was non-properly 
terminated.
  Closing down client sock: protocol error
  vnc_set_share_mode/0x5779f510: connecting -> disconnected
  Client sioc=0x5873c6a0 ws=1 auth=1 subauth=0
  New client on socket 0x5873c6a0
  vnc_set_share_mode/0x5873c6a0: undefined -> connecting
  TLS Websocket connection required
  Start TLS WS handshake process
  TLS handshake complete, starting websocket handshake
  Websocket negotiate starting
  Websock handshake complete, starting VNC protocol
  Write Plain: Pending output 0x57b91c60 size 4096 offset 12. Wait SSF 0
  Wrote wire 0x57b91c60 12 -> 12

  Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
  0x0001 in ?? ()
  (gdb) thread apply all bt

  Thread 4 (Thread 0x7fffd3fff700 (LWP 12769)):
  #0  0x7fffef35a09f in pthread_cond_wait@@GLIBC_2.3.2 () from 
/usr/lib/libpthread.so.0
  #1  0x55a20bd9 in qemu_cond_wait (cond=cond@entry=0x587267e0, 
  mutex=mutex@entry=0x58726810) at util/qemu-thread-posix.c:123
  #2  0x559770ab in vnc_worker_thread_loop 
(queue=queue@entry=0x587267e0)
  at ui/vnc-jobs.c:228
  #3  0x559775e8 in vnc_worker_thread (arg=0x587267e0) at 
ui/vnc-jobs.c:335
  #4  0x7fffef354474 in start_thread () from /usr/lib/libpthread.so.0
  #5  0x7fffea43c69d in clone () from /usr/lib/libc.so.6

  Thread 3 (Thread 0x7fffde2d4700 (LWP 12768)):
  #0  0x7fffef35a09f in pthread_cond_wait@@GLIBC_2.3.2 () from 
/usr/lib/libpthread.so.0
  #1  0x55a20bd9 in qemu_cond_wait (cond=, 
  ---Type  to continue, or q  to quit---
  emu_global_mutex>) at util/qemu-thread-posix.c:123
  #2  0x55715edf in qemu_tcg_wait_io_event (cpu=0x564ee840) at 
/home/ben/qemu/qemu-2.6.0/cpus.c:1015
  #3  qemu_tcg_cpu_thread_fn (arg=) at 
/home/ben/qemu/qemu-2.6.0/cpus.c:1161
  #4  0x7fffef354474 in start_thread () from /usr/lib/libpthread.so.0
  #5  0x7fffea43c69d in clone () from /usr/lib/libc.so.6

  Thread 2 (Thread 0x7fffe16f6700 (LWP 12767)):
  #0  0x7fffea438229 in syscall () from /usr/lib/libc.so.6
  #1  0x55a20ee8 in futex_wait (val=, ev=) at util/qemu-thread-posix.c:292
  #2  qemu_event_wait (ev=ev@entry=0x5641ece4 ) at 
util/qemu-thread-posix.c:399
  #3  0x55a2f2ae in call_rcu_thread (opaque=) at 
util/rcu.c:250
  #4  0x7fffef354474 in start_thread () from /usr/lib/libpthread.so.0
  #5  0x7fffea43c69d in clone () from /usr/lib/libc.so.6

  Thread 1 (Thread 0x77f5bb00 (LWP 12763)):
  #0  0x0001 in ?? ()
  #1  0x559efb53 in qio_task_free (task=0x58212140) at io/task.c:58
  #2  0x559efd89 in qio_task_complete (task=task@entry=0x58212140) 
at io/task.c:145
  #3  0x559ef5aa in qio_channel_websock_handshake_send 
(ioc=0x5873c6a0, condition=, 
  user_data=0x58212140) at 

[Bug 1883984] Re: QEMU S/390x sqxbr (128-bit IEEE 754 square root) crashes qemu-system-s390x

2020-08-26 Thread Christian Ehrhardt
old version
sudo apt install qemu-system-s390x=1:4.2-3ubuntu6.4
...test as listed in the test instructions ...

ubuntu@focal-sqxbr:~$ ./a.out 
Segmentation fault
(qemu is dead at this point)

$ sudo apt install qemu-system-s390x=1:4.2-3ubuntu6.5
Reading package lists... Done
Building dependency tree   
Reading state information... Done
The following packages will be upgraded:
  qemu-system-s390x
1 upgraded, 0 newly installed, 0 to remove and 315 not upgraded.
Need to get 2334 kB of archives.
After this operation, 4096 B of additional disk space will be used.
Get:1 http://ports.ubuntu.com focal-proposed/main s390x qemu-system-s390x s390x 
1:4.2-3ubuntu6.5 [2334 kB]
Fetched 2334 kB in 1s (3927 kB/s)  
(Reading database ... 203254 files and directories currently installed.)
Preparing to unpack .../qemu-system-s390x_1%3a4.2-3ubuntu6.5_s390x.deb ...
Unpacking qemu-system-s390x (1:4.2-3ubuntu6.5) over (1:4.2-3ubuntu6.4) ...
Setting up qemu-system-s390x (1:4.2-3ubuntu6.5) ...
Processing triggers for man-db (2.9.3-2) ...
ubuntu@s1lp05:~$ 

ubuntu@focal-sqxbr:~$ ./a.out 
(no crash)


Setting verified

** Tags removed: verification-needed verification-needed-focal
** Tags added: verification-done verification-done-focal

-- 
https://bugs.launchpad.net/bugs/1883984

Title:
  QEMU S/390x sqxbr (128-bit IEEE 754 square root) crashes qemu-system-
  s390x

Status in QEMU:
  Fix Released
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Focal:
  Fix Committed

Bug description:
  [Impact]

   * An instruction was described incorrectly, so using it would crash the
     program.

  [Test Case]

   * Run s390x in emulation and use the program below there.
     For simplicity and speed you can prepare the test in a KVM guest as usual
     on s390x and afterwards run that guest in qemu-tcg like:

     $ sudo qemu-system-s390x -machine s390-ccw-virtio,accel=tcg \
         -cpu max,zpci=on -serial mon:stdio -display none -m 4096 \
         -nic user,model=virtio,hostfwd=tcp::-:22 \
         -drive file=/var/lib/uvtool/libvirt/images/focal-sqxbr.qcow,if=none,id=drive-virtio-disk0,format=qcow2,cache=none \
         -device virtio-blk-ccw,devno=fe.0.0001,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1,scsi=off

     Obviously, if you have no s390x access you need to use emulation right
     away.

   * Build and run failing program
 $ sudo apt install clang
 $ cat > bug-sqrtl-one-line.c << EOF
  int main(void) { volatile long double x, r; x = 4.0L; __asm__ 
  __volatile__("sqxbr %0, %1" : "=f" (r) : "f" (x)); return (0);}
  EOF
 $ cc bug-sqrtl-one-line.c
 $ ./a.out
 Segmentation fault (core dumped)

 qemu is dead by now as long as the bug is present

  [Regression Potential]

   * The change only modifies the 128-bit square root on s390x, so regressions
     should be limited to exactly that instruction, which was broken before
     this fix.

  [Other Info]
   
   * n/a

  ---

  In porting software to guest Ubuntu 18.04 and 20.04 VMs for S/390x, I
  discovered that some of my own numerical programs, and also a GNU configure
  script for at least one package with CC=clang, would cause an instant crash
  of the VM, sometimes also destroying recently opened files, and producing
  long strings of NUL characters in /var/log/syslog in the S/390 guest O/S.

  Further detective work narrowed the cause of the crash down to a single IBM
  S/390 instruction: sqxbr (128-bit IEEE 754 square root).  Here is a one-line
  program that, when compiled and run on a VM hosted on QEMU emulator version
  4.2.0 (Debian 1:4.2-3ubuntu6.1) [hosted on Ubuntu 20.04 on a Dell Precision
  7920 workstation with an Intel Xeon Platinum 8253 CPU], and also on QEMU
  emulator version 5.0.0, reproducibly produces a VM crash under
  qemu-system-s390x.

  % cat bug-sqrtl-one-line.c
  int main(void) { volatile long double x, r; x = 4.0L; __asm__ __volatile__("sqxbr %0, %1" : "=f" (r) : "f" (x)); return (0);}

  % cc bug-sqrtl-one-line.c && ./a.out
  Segmentation fault (core dumped)

  The problem code may be the function float128_sqrt() defined in
  qemu-5.0.0/fpu/softfloat.c starting at line 7619.  I have NOT attempted to
  run the qemu-system-s390x executable under a debugger.  However, I observe
  that S/390 is the only CPU family that I know of, except possibly for a
  Fujitsu SPARC-64, that has a 128-bit square root in hardware.  Thus, this
  instruction bug may not have been seen before.
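
For reference, a correctly emulated sqxbr on the test input 4.0L should simply yield 2.0L. The following Python sketch shows the expected IEEE 754 binary128 bit patterns of the operand and the result, assuming long double on s390x is binary128 as the bug title implies. The helper binary128_hex is hypothetical, written only for this illustration, and handles only positive finite values that fit in a Python float:

```python
import math

EXP_BIAS = 16383   # binary128 exponent bias
FRAC_BITS = 112    # width of the binary128 fraction field


def binary128_hex(x):
    """Encode a positive finite Python float as a binary128 hex string."""
    if not (x > 0 and math.isfinite(x)):
        raise ValueError("sketch handles positive finite values only")
    m, e = math.frexp(x)                     # x == m * 2**e with 0.5 <= m < 1
    biased = (e - 1) + EXP_BIAS              # exponent of the 1.f form, biased
    frac = int((2 * m - 1) * 2**FRAC_BITS)   # exact: only a power-of-two scaling
    return format((biased << FRAC_BITS) | frac, "032x")


operand = 4.0
print("operand :", binary128_hex(operand))
print("expected:", binary128_hex(math.sqrt(operand)))
```

Comparing these patterns with the register contents in a fixed QEMU is one way to confirm the result and not only the absence of a crash.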

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1883984/+subscriptions



[Bug 1883984] Re: QEMU S/390x sqxbr (128-bit IEEE 754 square root) crashes qemu-system-s390x

2020-08-19 Thread Christian Ehrhardt
Note: final upstream commit link
https://git.qemu.org/?p=qemu.git;a=commit;h=9bf728a09bf7509b27543664f9cca6f4f337f608

** Changed in: qemu (Ubuntu)
 Assignee: Christian Ehrhardt  (paelzer) => (unassigned)

-- 
https://bugs.launchpad.net/bugs/1883984

Title:
  QEMU S/390x sqxbr (128-bit IEEE 754 square root) crashes qemu-system-
  s390x

Status in QEMU:
  Fix Committed
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Focal:
  Triaged

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1883984/+subscriptions



[Bug 1883984] Re: QEMU S/390x sqxbr (128-bit IEEE 754 square root) crashes qemu-system-s390x

2020-08-19 Thread Christian Ehrhardt
** Description changed:

+ [Impact]
+ 
+  * An instruction was described wrong so that on usage the program would 
+crash.
+ 
+ [Test Case]
+ 
+  * Run s390x in emulation and there use this program:
+For simplicity and speed you can use KVM guest as usual on s390x, that 
+after prep of the test you run in qemu-tcg like:
+ 
+$ sudo qemu-system-s390x -machine s390-ccw-virtio,accel=tcg -cpu 
max,zpci=on -serial mon:stdio -display none -m 4096 -nic 
user,model=virtio,hostfwd=tcp::-:22 -drive 
file=/var/lib/uvtool/libvirt/images/focal-sqxbr.qcow,if=none,id=drive-virtio-disk0,format=qcow2,cache=none
 -device 
virtio-blk-ccw,devno=fe.0.0001,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1,scsi=off
+Obviously is you have no s390x access you need to use emulation right 
+away.
+ 
+  * Build and run failing program
+$ sudo apt install clang
+$ cat > bug-sqrtl-one-line.c << EOF
+ int main(void) { volatile long double x, r; x = 4.0L; __asm__ 
+ __volatile__("sqxbr %0, %1" : "=f" (r) : "f" (x)); return (0);}
+ EOF
+$ cc bug-sqrtl-one-line.c
+$ ./a.out
+Segmentation fault (core dumped)
+ 
+qemu is dead by now as long as the bug is present
+ 
+ [Regression Potential]
+ 
+  * The change only modifies 128 bit square root on s390x so regressions
+should be limited to exactly that - which formerly before this fix was 
+a broken instruction.
+ 
+ [Other Info]
+  
+  * n/a
+ 
+ ---
+ 
  In porting software to guest Ubuntu 18.04 and 20.04 VMs for S/390x, I 
discovered
  that some of my own numerical programs, and also a GNU configure script for at
  least one package with CC=clang, would cause an instant crash of the VM, 
sometimes
  also destroying recently opened files, and producing long strings of NUL 
characters
  in /var/log/syslog in the S/390 guest O/S.
  
  Further detective work narrowed the cause of the crash down to a single IBM 
S/390
  instruction: sqxbr (128-bit IEEE 754 square root).  Here is a one-line program
- that when compiled and run on a VM hosted on QEMUcc emulator version 4.2.0 
- (Debian 1:4.2-3ubuntu6.1) [hosted on Ubuntu 20.04 on a Dell Precision 7920 
- workstation with an Intel Xeon Platinum 8253 CPU],  and also on QEMU emulator 
+ that when compiled and run on a VM hosted on QEMUcc emulator version 4.2.0
+ (Debian 1:4.2-3ubuntu6.1) [hosted on Ubuntu 20.04 on a Dell Precision 7920
+ workstation with an Intel Xeon Platinum 8253 CPU],  and also on QEMU emulator
  version 5.0.0, reproducibly produces a VM crash under qemu-system-s390x.
  
  % cat bug-sqrtl-one-line.c
  int main(void) { volatile long double x, r; x = 4.0L; __asm__ 
__volatile__("sqxbr %0, %1" : "=f" (r) : "f" (x)); return (0);}
  
  % cc bug-sqrtl-one-line.c && ./a.out
  Segmentation fault (core dumped)
  
  The problem code may be the function float128_sqrt() defined in 
qemu-5.0.0/fpu/softfloat.c
  starting at line 7619.  I have NOT attempted to run the qemu-system-s390x 
executable
  under a debugger.  However, I observe that S/390 is the only CPU family that 
I know of,
  except possibly for a Fujitsu SPARC-64, that has a 128-bit square root in 
hardware.
  Thus, this instruction bug may not have been seen before.

-- 
https://bugs.launchpad.net/bugs/1883984

Title:
  QEMU S/390x sqxbr (128-bit IEEE 754 square root) crashes qemu-system-
  s390x

Status in QEMU:
  Fix Committed
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Focal:
  Triaged


[Bug 1886811] Re: systemd complains Failed to enqueue loopback interface start request: Operation not supported

2020-08-19 Thread Christian Ehrhardt
An SRU needs the fix for bug 1890881 to be really helpful, but the dependency
chain of that fix is not SRUable.
See: https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1890881/comments/17

Users (of this valid but rare use case) can either use Groovy, which fixes
this, or wait until OpenStack Victoria makes it available for Focal via the
Ubuntu Cloud Archive [1].

[1]: https://wiki.ubuntu.com/OpenStack/CloudArchive

** Changed in: qemu (Ubuntu Focal)
   Status: Triaged => Won't Fix

-- 
https://bugs.launchpad.net/bugs/1886811

Title:
  systemd complains Failed to enqueue loopback interface start request:
  Operation not supported

Status in QEMU:
  Fix Committed
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Focal:
  Won't Fix
Status in qemu package in Debian:
  Fix Released

Bug description:
  This symptom seems similar to
  https://bugs.launchpad.net/qemu/+bug/1823790

  Host Linux: Debian 11 Bullseye (testing) on x86-64 architecture
  qemu version: latest git, commit hash eb2c66b10efd2b914b56b20ae90655914310c925
  compiled with "./configure --static --disable-system"

  Downstream bug report at
  https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=964289
  Bug report (closed) to systemd:
  https://github.com/systemd/systemd/issues/16359

  systemd in armhf and armel (both little endian 32-bit) containers fails to
  start with
  Failed to enqueue loopback interface start request: Operation not supported

  How to reproduce on Debian (and probably Ubuntu):
  mmdebstrap --components="main contrib non-free" --architectures=armhf --variant=important bullseye /var/lib/machines/armhf-bullseye
  systemd-nspawn -D /var/lib/machines/armhf-bullseye -b

  When "armhf" architecture is replaced with "mips" (32-bit big endian) or
  "ppc64" (64-bit big endian), the container starts up fine.

  The same symptom is also observed with "powerpc" (32-bit big endian)
  architecture.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1886811/+subscriptions



[Bug 1886811] Re: systemd complains Failed to enqueue loopback interface start request: Operation not supported

2020-08-18 Thread Christian Ehrhardt
To fully work this also needs the fix for bug 1890881 as identified
there.

** Changed in: qemu (Ubuntu Focal)
   Status: New => Triaged

-- 
https://bugs.launchpad.net/bugs/1886811

Title:
  systemd complains Failed to enqueue loopback interface start request:
  Operation not supported

Status in QEMU:
  Fix Committed
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Focal:
  Triaged
Status in qemu package in Debian:
  Fix Released

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1886811/+subscriptions



[Bug 1823790] Re: QEMU mishandling of SO_PEERSEC forces systemd into tight loop

2020-08-18 Thread Christian Ehrhardt
Sorry, posted this on the wrong bug :-/
I beg your pardon for the noise.

-- 
https://bugs.launchpad.net/bugs/1823790

Title:
  QEMU mishandling of SO_PEERSEC forces systemd into tight loop

Status in QEMU:
  Fix Released

Bug description:
  While building Debian images for embedded ARM target systems I
  detected that QEMU seems to force newer systemd daemons into a tight
  loop.

  My setup is the following:

  Host machine: Ubuntu 18.04, amd64
  LXD container: Debian Buster, arm64, systemd 241
  QEMU: qemu-aarch64-static, 4.0.0-rc2 (custom build) and 3.1.0 (Debian 1:3.1+dfsg-7)

  To easily reproduce the issue I have created the following repository:
  https://github.com/lueschem/edi-qemu

  The call where systemd starts looping is the following:
  2837 getsockopt(3,1,31,274891889456,274887218756,274888927920) = -1 errno=34 (Numerical result out of range)
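
The numeric arguments in that trace line can be decoded into symbolic names. A small Python sketch follows, assuming the constants from Linux asm-generic/socket.h (SOL_SOCKET = 1, SO_PEERSEC = 31; some architectures use different values) and assuming the leading number is the PID of the traced process. The function decode is made up for this illustration:

```python
import errno
import re

# Assumed values from Linux asm-generic/socket.h; not valid for every ABI.
LEVELS = {1: "SOL_SOCKET"}
OPTNAMES = {31: "SO_PEERSEC"}

LINE = ("2837 getsockopt(3,1,31,274891889456,274887218756,274888927920)"
        " = -1 errno=34 (Numerical result out of range)")


def decode(line):
    """Turn the numeric getsockopt trace line into symbolic names."""
    m = re.match(
        r"(\d+) getsockopt\((\d+),(\d+),(\d+),[^)]*\) = (-?\d+) errno=(\d+)",
        line,
    )
    if m is None:
        raise ValueError("not a getsockopt trace line")
    pid, fd, level, optname, ret, err = (int(g) for g in m.groups())
    return {
        "pid": pid,
        "fd": fd,
        "level": LEVELS.get(level, str(level)),
        "optname": OPTNAMES.get(optname, str(optname)),
        "ret": ret,
        "errno": errno.errorcode.get(err, str(err)),
    }


print(decode(LINE))
```

Under these assumptions the call is getsockopt(fd, SOL_SOCKET, SO_PEERSEC, ...) failing with ERANGE, which matches the bug title.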

  Furthermore I also verified that the issue is not related to LXD.
  The same behavior can be reproduced using systemd-nspawn.

  This issue reported against systemd seems to be related:
  https://github.com/systemd/systemd/issues/11557

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1823790/+subscriptions



[Bug 1823790] Re: QEMU mishandling of SO_PEERSEC forces systemd into tight loop

2020-08-18 Thread Christian Ehrhardt
Bisect worked, and once you find it, it seems obvious that this is exactly
our case:

commit 65b261a63a48fbb3b11193361d4ea0c38a3c3dfd
Author: Laurent Vivier 
Date:   Thu Jul 9 09:23:32 2020 +0200

linux-user: add netlink RTM_SETLINK command

This command is needed to be able to boot systemd in a container.

  $ sudo systemd-nspawn -D /chroot/armhf/sid/ -b
  Spawning container sid on /chroot/armhf/sid.
  Press ^] three times within 1s to kill container.
  systemd 245.6-2 running in system mode.
  Detected virtualization systemd-nspawn.
  Detected architecture arm.

  Welcome to Debian GNU/Linux bullseye/sid!

  Set hostname to .
  Failed to enqueue loopback interface start request: Operation not supported
  Caught , dumped core as pid 3.
  Exiting PID 1...
  Container sid failed with error code 255.

Signed-off-by: Laurent Vivier 
Message-Id: <20200709072332.890440-2-laur...@vivier.eu>

 linux-user/fd-trans.c | 1 +
 1 file changed, 1 insertion(+)
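
Why a one-line change can make the difference is easy to picture with a simplified model. The Python sketch below is not QEMU's code; it only assumes that the linux-user netlink translation rejects message types it does not know with EOPNOTSUPP ("Operation not supported"), and that the commit adds RTM_SETLINK to the accepted types. Which other types were already accepted is an assumption made for the example; the RTM_* values are those from linux/rtnetlink.h.

```python
import errno

# rtnetlink link message types as defined in linux/rtnetlink.h
RTM_NEWLINK, RTM_DELLINK, RTM_GETLINK, RTM_SETLINK = 16, 17, 18, 19


def translate(nlmsg_type, supported):
    """Hypothetical stand-in for a per-message-type translation step."""
    if nlmsg_type not in supported:
        return -errno.EOPNOTSUPP   # surfaces as "Operation not supported"
    return 0


# Assumed sets of handled types before and after the commit.
before_fix = {RTM_NEWLINK, RTM_DELLINK, RTM_GETLINK}
after_fix = before_fix | {RTM_SETLINK}

print(translate(RTM_SETLINK, before_fix))  # negative errno value
print(translate(RTM_SETLINK, after_fix))   # 0
```

In this model systemd's request to bring up the loopback interface is the RTM_SETLINK message that gets rejected before the fix.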

-- 
https://bugs.launchpad.net/bugs/1823790

Title:
  QEMU mishandling of SO_PEERSEC forces systemd into tight loop

Status in QEMU:
  Fix Released

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1823790/+subscriptions



[Bug 1886811] Re: systemd complains Failed to enqueue loopback interface start request: Operation not supported

2020-08-12 Thread Christian Ehrhardt
@Ryutaroh - could you test [1] if it gets you around this bug (1886811)
and if bug 1890881 is present in focal as well?

[1]: https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/4197

-- 
https://bugs.launchpad.net/bugs/1886811

Title:
  systemd complains Failed to enqueue loopback interface start request:
  Operation not supported

Status in QEMU:
  Fix Committed
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Focal:
  New
Status in qemu package in Debian:
  Fix Released

https://bugs.launchpad.net/qemu/+bug/1886811/+subscriptions



[Bug 1886811] Re: systemd complains Failed to enqueue loopback interface start request: Operation not supported

2020-08-12 Thread Christian Ehrhardt
** Also affects: qemu (Ubuntu Focal)
   Importance: Undecided
   Status: New

-- 
https://bugs.launchpad.net/bugs/1886811

Title:
  systemd complains Failed to enqueue loopback interface start request:
  Operation not supported

Status in QEMU:
  Fix Committed
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Focal:
  New
Status in qemu package in Debian:
  Fix Released

https://bugs.launchpad.net/qemu/+bug/1886811/+subscriptions



[Bug 1877052] Re: KVM Win 10 guest pauses after kernel upgrade

2020-08-10 Thread Christian Ehrhardt
I haven't seen any similar reports nor any updates here.
May I ask whether you have got any further since then?

QEMU 5.0 is available in Ubuntu 20.10 now; if you are willing to upgrade or
install a test system, that might be worth a try (the new libvirt is still WIP,
but unlikely to play a role here).
20.10 proposed even has a 5.8.0.12.14 kernel; since a kernel change might have
been what started this, that might be worth a check as well.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1877052

Title:
  KVM Win 10 guest pauses after kernel upgrade

Status in QEMU:
  New
Status in qemu package in Ubuntu:
  Incomplete

Bug description:
  Hello!
  Unfortunately the bug has apparently reappeared. I have a Windows 10 running
  in a VM, which after today's "apt upgrade" goes into pause mode after a few
  seconds of running time.

  Until yesterday it used to work and I was able to boot the VM. During
  the kernel update (from 5.4.0-28.33 to 5.4.0-29.34) the VM was active
  and then went into pause mode. Even after a reboot of my host system
  the problem still persists: the VM boots for a few seconds and then
  switches to pause mode.

  Current Kernel: Linux andreas-laptop 5.4.0-29-generic #33-Ubuntu SMP
  Wed Apr 29 14:32:27 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

  Maybe relevant logfile lines:
  2020-05-06T07:46:42.857574Z qemu-system-x86_64: warning: host doesn't support requested feature: MSR(48FH).vmx-exit-load-perf-global-ctrl [bit 12]
  2020-05-06T07:46:42.857718Z qemu-system-x86_64: warning: host doesn't support requested feature: MSR(490H).vmx-entry-load-perf-global-ctrl [bit 13]
  2020-05-06T07:46:42.860567Z qemu-system-x86_64: warning: host doesn't support requested feature: MSR(48FH).vmx-exit-load-perf-global-ctrl [bit 12]
  2020-05-06T07:46:42.860582Z qemu-system-x86_64: warning: host doesn't support requested feature: MSR(490H).vmx-entry-load-perf-global-ctrl [bit 13]
  2020-05-06T07:47:22.901057Z qemu-system-x86_64: terminating on signal 15 from pid 1593 (/usr/sbin/libvirtd)
  2020-05-06 07:47:23.101+: shutting down, reason=destroyed


  Kind regards,
     Andreas

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1877052/+subscriptions



[Bug 1883984] Re: QEMU S/390x sqxbr (128-bit IEEE 754 square root) crashes qemu-system-s390x

2020-08-03 Thread Christian Ehrhardt
** Also affects: qemu (Ubuntu Focal)
   Importance: Undecided
   Status: New

** Changed in: qemu (Ubuntu Focal)
   Status: New => Triaged

** Changed in: qemu (Ubuntu Focal)
   Importance: Undecided => Medium

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1883984

Title:
  QEMU S/390x sqxbr (128-bit IEEE 754 square root) crashes qemu-system-
  s390x

Status in QEMU:
  Fix Committed
Status in qemu package in Ubuntu:
  In Progress
Status in qemu source package in Focal:
  Triaged

Bug description:
  In porting software to guest Ubuntu 18.04 and 20.04 VMs for S/390x, I
  discovered that some of my own numerical programs, and also a GNU configure
  script for at least one package with CC=clang, would cause an instant crash
  of the VM, sometimes also destroying recently opened files, and producing
  long strings of NUL characters in /var/log/syslog in the S/390 guest O/S.

  Further detective work narrowed the cause of the crash down to a single IBM
  S/390 instruction: sqxbr (128-bit IEEE 754 square root).  Here is a one-line
  program that, when compiled and run on a VM hosted on QEMU emulator version
  4.2.0 (Debian 1:4.2-3ubuntu6.1) [hosted on Ubuntu 20.04 on a Dell Precision
  7920 workstation with an Intel Xeon Platinum 8253 CPU], and also on QEMU
  emulator version 5.0.0, reproducibly produces a VM crash under
  qemu-system-s390x.

  % cat bug-sqrtl-one-line.c
  int main(void) { volatile long double x, r; x = 4.0L; __asm__ __volatile__("sqxbr %0, %1" : "=f" (r) : "f" (x)); return (0);}

  % cc bug-sqrtl-one-line.c && ./a.out
  Segmentation fault (core dumped)

  The problem code may be the function float128_sqrt() defined in
  qemu-5.0.0/fpu/softfloat.c starting at line 7619.  I have NOT attempted to
  run the qemu-system-s390x executable under a debugger.  However, I observe
  that S/390 is the only CPU family that I know of, except possibly for a
  Fujitsu SPARC-64, that has a 128-bit square root in hardware.  Thus, this
  instruction bug may not have been seen before.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1883984/+subscriptions



[Bug 1883984] Re: QEMU S/390x sqxbr (128-bit IEEE 754 square root) crashes qemu-system-s390x

2020-08-03 Thread Christian Ehrhardt
** Also affects: qemu (Ubuntu)
   Importance: Undecided
   Status: New

** Changed in: qemu (Ubuntu)
   Status: New => In Progress

** Changed in: qemu (Ubuntu)
 Assignee: (unassigned) => Christian Ehrhardt  (paelzer)

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1883984

Title:
  QEMU S/390x sqxbr (128-bit IEEE 754 square root) crashes qemu-system-
  s390x

Status in QEMU:
  Fix Committed
Status in qemu package in Ubuntu:
  In Progress

Bug description:
  In porting software to guest Ubuntu 18.04 and 20.04 VMs for S/390x, I
  discovered that some of my own numerical programs, and also a GNU configure
  script for at least one package with CC=clang, would cause an instant crash
  of the VM, sometimes also destroying recently opened files, and producing
  long strings of NUL characters in /var/log/syslog in the S/390 guest O/S.

  Further detective work narrowed the cause of the crash down to a single IBM
  S/390 instruction: sqxbr (128-bit IEEE 754 square root).  Here is a one-line
  program that, when compiled and run on a VM hosted on QEMU emulator version
  4.2.0 (Debian 1:4.2-3ubuntu6.1) [hosted on Ubuntu 20.04 on a Dell Precision
  7920 workstation with an Intel Xeon Platinum 8253 CPU], and also on QEMU
  emulator version 5.0.0, reproducibly produces a VM crash under
  qemu-system-s390x.

  % cat bug-sqrtl-one-line.c
  int main(void) { volatile long double x, r; x = 4.0L; __asm__ __volatile__("sqxbr %0, %1" : "=f" (r) : "f" (x)); return (0);}

  % cc bug-sqrtl-one-line.c && ./a.out
  Segmentation fault (core dumped)

  The problem code may be the function float128_sqrt() defined in
  qemu-5.0.0/fpu/softfloat.c starting at line 7619.  I have NOT attempted to
  run the qemu-system-s390x executable under a debugger.  However, I observe
  that S/390 is the only CPU family that I know of, except possibly for a
  Fujitsu SPARC-64, that has a 128-bit square root in hardware.  Thus, this
  instruction bug may not have been seen before.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1883984/+subscriptions



Re: [PULL 06/16] accel/tcg: better handle memory constrained systems

2020-07-28 Thread Christian Ehrhardt
On Mon, Jul 27, 2020 at 2:24 PM Alex Bennée  wrote:

> It turns out there are some 64 bit systems that have relatively low
> amounts of physical memory available to them (typically CI system).
> Even with swapping available a 1GB translation buffer that fills up
> can put the machine under increased memory pressure. Detect these low
> memory situations and reduce tb_size appropriately.
>
> Fixes: 600e17b2615 ("accel/tcg: increase default code gen buffer size for
> 64 bit")
> Signed-off-by: Alex Bennée 
> Reviewed-by: Richard Henderson 
> Reviewed-by: Robert Foley 
> Cc: BALATON Zoltan 
> Cc: Christian Ehrhardt 
> Message-Id: <20200724064509.331-7-alex.ben...@linaro.org>
>

I beg your pardon for the late reply, but I was out for a week.
I see this is already the pull request and my former feedback was included
- thanks.

Nevertheless I took the chance to test it in the context in which I found and
reported the initial bug.
If only to show that I didn't fire this case :-)

We know there is quite some noise/deviation, but I only ran single tests as
the problem was easily visible despite the noise. Amount of memory qemu
settles on:

Host 32G, Guest 512M
4.2        633M
5.0       1672M
5.0+ Fix  1670M

Host 1.5G, Guest 512M
4.2        692M
5.0       16xxM (OOM)
5.0+ Fix   766M

So we seem to have achieved that small environments no longer break (a very
small number of very densely sized systems might still) but at the same
time get the bigger cache for any normal/large system.
Tested-by: Christian Ehrhardt 
Reviewed-by: Christian Ehrhardt 


> diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
> index 2afa46bd2b1..2d83013633b 100644
> --- a/accel/tcg/translate-all.c
> +++ b/accel/tcg/translate-all.c
> @@ -976,7 +976,12 @@ static inline size_t size_code_gen_buffer(size_t
> tb_size)
>  {
>  /* Size the buffer.  */
>  if (tb_size == 0) {
> -tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE;
> +size_t phys_mem = qemu_get_host_physmem();
> +if (phys_mem == 0) {
> +tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE;
> +} else {
> +tb_size = MIN(DEFAULT_CODE_GEN_BUFFER_SIZE, phys_mem / 8);
> +}
>  }
>      if (tb_size < MIN_CODE_GEN_BUFFER_SIZE) {
>  tb_size = MIN_CODE_GEN_BUFFER_SIZE;
> --
> 2.20.1
>
>

-- 
Christian Ehrhardt
Staff Engineer, Ubuntu Server
Canonical Ltd


Re: [PATCH v1 4/5] util: add qemu_get_host_physmem utility function

2020-07-17 Thread Christian Ehrhardt
On Fri, Jul 17, 2020 at 3:32 PM BALATON Zoltan  wrote:

> On Fri, 17 Jul 2020, Alex Bennée wrote:
> > This will be used in a future patch. For POSIX systems _SC_PHYS_PAGES
> > isn't standardised but at least appears in the man pages for
> > Open/FreeBSD. The result is advisory so any users of it shouldn't just
> > fail if we can't work it out.
> >
> > The win32 stub currently returns 0 until someone with a Windows system
> > can develop and test a patch.
> >
> > Signed-off-by: Alex Bennée 
> > Cc: BALATON Zoltan 
> > Cc: Christian Ehrhardt 
> > ---
> > include/qemu/osdep.h | 10 ++
> > util/oslib-posix.c   | 11 +++
> > util/oslib-win32.c   |  6 ++
> > 3 files changed, 27 insertions(+)
> >
> > diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
> > index 4841b5c6b5f..7ff209983e2 100644
> > --- a/include/qemu/osdep.h
> > +++ b/include/qemu/osdep.h
> > @@ -665,4 +665,14 @@ static inline void qemu_reset_optind(void)
> >  */
> > char *qemu_get_host_name(Error **errp);
> >
> > +/**
> > + * qemu_get_host_physmem:
> > + *
> > + * Operating system agnostiv way of querying host memory.
>
> Typo: agnostiv -> agnostic
>
> > + *
> > + * Returns amount of physical memory on the system. This is purely
> > + * advisery and may return 0 if we can't work it out.
> > + */
> > +size_t qemu_get_host_physmem(void);
> > +
> > #endif
> > diff --git a/util/oslib-posix.c b/util/oslib-posix.c
> > index 36bf8593f8c..d9da782b896 100644
> > --- a/util/oslib-posix.c
> > +++ b/util/oslib-posix.c
> > @@ -839,3 +839,14 @@ char *qemu_get_host_name(Error **errp)
> >
> > return g_steal_pointer();
> > }
> > +
> > +size_t qemu_get_host_physmem(void)
> > +{
> > +#ifdef _SC_PHYS_PAGES
> > +long pages = sysconf(_SC_PHYS_PAGES);
> > +if (pages > 0) {
> > +return pages * qemu_real_host_page_size;
>
> The Linux man page warns that this product may overflow so maybe you could
> return pages here.
>

The caller might be even less aware of that than this function - so maybe
better handle it here.
How about handling overflows and cutting it to MiB before returning?
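
A minimal sketch of what such overflow handling could look like, assuming a
helper that takes the raw page count and page size and reports MiB. The name
sketch_physmem_mib and the saturating behaviour are illustrative assumptions,
not what was merged into QEMU:

```c
#include <stdint.h>

/*
 * Hypothetical helper (not QEMU code): convert a page count and a page
 * size, as returned by sysconf(), into MiB.  Returns 0 when either value
 * is unknown and saturates instead of wrapping around on overflow.
 */
uint64_t sketch_physmem_mib(long pages, long page_size)
{
    const uint64_t mib = UINT64_C(1024) * 1024;
    uint64_t p, s;

    if (pages <= 0 || page_size <= 0) {
        return 0; /* advisory value: 0 means "could not work it out" */
    }
    p = (uint64_t)pages;
    s = (uint64_t)page_size;
    if (p > UINT64_MAX / s) {
        return UINT64_MAX / mib; /* product would not fit in 64 bits */
    }
    return (p * s) / mib;
}
```

With 4K pages, a 1.5G host (393216 pages) comes out as 1536 and a 32G host
(8388608 pages) as 32768, so the caller never has to multiply large values
itself.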


> > +}
> > +#endif
> > +return 0;
> > +}
> > diff --git a/util/oslib-win32.c b/util/oslib-win32.c
> > index 7eedbe5859a..31030463cc9 100644
> > --- a/util/oslib-win32.c
> > +++ b/util/oslib-win32.c
> > @@ -828,3 +828,9 @@ char *qemu_get_host_name(Error **errp)
> >
> > return g_utf16_to_utf8(tmp, size, NULL, NULL, NULL);
> > }
> > +
> > +size_t qemu_get_host_physmem(void)
> > +{
> > +/* currently unimplemented */
> > +return 0;
> > +}
>
> For Windows this may help:
>
> https://stackoverflow.com/questions/5553665/get-ram-system-size
>
> not sure about other OSes.
>
> Regards,
> BALATON Zoltan



-- 
Christian Ehrhardt
Staff Engineer, Ubuntu Server
Canonical Ltd


Re: [PATCH v1 5/5] accel/tcg: better handle memory constrained systems

2020-07-17 Thread Christian Ehrhardt
On Fri, Jul 17, 2020 at 12:51 PM Alex Bennée  wrote:

> It turns out there are some 64 bit systems that have relatively low
> amounts of physical memory available to them (typically CI system).
> Even with swapping available a 1GB translation buffer that fills up
> can put the machine under increased memory pressure. Detect these low
> memory situations and reduce tb_size appropriately.
>
> Fixes: 600e17b261
> Signed-off-by: Alex Bennée 
> Cc: BALATON Zoltan 
> Cc: Christian Ehrhardt 
> ---
>  accel/tcg/translate-all.c | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
> index 2afa46bd2b1..2ff0ba6d19b 100644
> --- a/accel/tcg/translate-all.c
> +++ b/accel/tcg/translate-all.c
> @@ -976,7 +976,12 @@ static inline size_t size_code_gen_buffer(size_t
> tb_size)
>  {
>  /* Size the buffer.  */
>  if (tb_size == 0) {
> -tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE;
> +size_t phys_mem = qemu_get_host_physmem();
> +if (phys_mem > 0 && phys_mem < (2 *
> DEFAULT_CODE_GEN_BUFFER_SIZE)) {
> +tb_size = phys_mem / 4;
>

In my experiments I've found that /8 more closely matches the former behavior
on small hosts while at the same time not affecting common large hosts.
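
To make the difference concrete, here is a rough sketch that applies a
"smaller of default and host memory / divisor" rule. The function and macro
names are made up for illustration, the 1G default is the one discussed in
this thread, and the 1 MiB lower bound is an assumption standing in for
MIN_CODE_GEN_BUFFER_SIZE:

```c
#include <stdint.h>

#define SKETCH_MIB             (UINT64_C(1024) * 1024)
#define SKETCH_DEFAULT_TB_SIZE (1024 * SKETCH_MIB) /* 1G default since v5.0 */
#define SKETCH_MIN_TB_SIZE     (1 * SKETCH_MIB)    /* assumed lower bound */

/*
 * Hypothetical sizing rule: cap the default by phys_mem / divisor.
 * phys_mem == 0 means the host size is unknown, so the default is kept.
 */
uint64_t sketch_tb_size(uint64_t phys_mem, unsigned divisor)
{
    uint64_t tb_size = SKETCH_DEFAULT_TB_SIZE;

    if (phys_mem > 0 && divisor > 0 && phys_mem / divisor < tb_size) {
        tb_size = phys_mem / divisor;
    }
    if (tb_size < SKETCH_MIN_TB_SIZE) {
        tb_size = SKETCH_MIN_TB_SIZE;
    }
    return tb_size;
}
```

For a 1.5G host this gives 384M with /4 and 192M with /8. The pre-5.0 rule
of guest ram_size/4 gave 128M for a 512M guest, so /8 is the closer match,
while a 32G host still gets the full 1G either way.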


> +} else {
> +tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE;
> +}
>  }
>  if (tb_size < MIN_CODE_GEN_BUFFER_SIZE) {
>  tb_size = MIN_CODE_GEN_BUFFER_SIZE;
> --
> 2.20.1
>
>

-- 
Christian Ehrhardt
Staff Engineer, Ubuntu Server
Canonical Ltd


Re: [PATCH] accel/tcg: reduce default code gen buffer on small hosts

2020-07-17 Thread Christian Ehrhardt
On Fri, Jul 17, 2020 at 4:07 PM Christian Ehrhardt <
christian.ehrha...@canonical.com> wrote:

> Since v5.0.0 and 600e17b2 "accel/tcg: increase default code gen buffer
> size for 64 bit" in particular qemu with TCG regularly gets OOM Killed
> on small hosts.
>
> The former 47a2def4 "accel/tcg: remove link between guest ram and TCG
> cache size" removed the link to guest size which is right, but at least
> some connection to the host size needs to be retained to avoid growing
> out of control on common CI setups which run at 1-2G host sizes.
>
> The lower value of 1/8th of the host memory size and the default (of
> currently 1G) will be taken to initialize the TB. There already is a
> Min/Max check in place to not reach ridiculously small values.
>
> Fixes: 600e17b2
>

Just found "[PATCH v1 0/5] candidate fixes for 5.1-rc1 (shippable,
semihosting, OOM tcg)"
which was submitted while I was prepping this one (this is a busy day since
I'll be off for a week).

Please ignore this patch here and give the series of Alex a look as it is
the more advanced version :-).


> Signed-off-by: Christian Ehrhardt 
> ---
>  accel/tcg/translate-all.c | 23 +++
>  1 file changed, 23 insertions(+)
>
> diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
> index 2afa46bd2b..ffcd67060e 100644
> --- a/accel/tcg/translate-all.c
> +++ b/accel/tcg/translate-all.c
> @@ -977,6 +977,29 @@ static inline size_t size_code_gen_buffer(size_t
> tb_size)
>  /* Size the buffer.  */
>  if (tb_size == 0) {
>  tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE;
> +/*
> + * A static default of 1G turned out to break (OOM Kill) many
> common
> + * CI setups that run at 1-2G Host memory size.
> + * At the same time the former default of ram_size/4 wasted
> performance
> + * on large host systems when running small guests.
> + * Common CI guest sizes are 0.5-1G which meant ~128M-256M TB
> size.
> + * A Default of 1/8th of the host size will get small hosts a
> + * similar TB size than they had prior to v5.0 and common bare
> metal
> + * systems (>=8G) the new 1G default that was set in v5.0
> + */
> +#if defined _SC_PHYS_PAGES && defined _SC_PAGESIZE
> +{
> +unsigned long max = DEFAULT_CODE_GEN_BUFFER_SIZE;
> +double pages = (double)sysconf(_SC_PHYS_PAGES);
> +
> +if (pages > 0 && pagesize > 0) {
> +max = (unsigned long)((pages * qemu_real_host_page_size)
> / 8);
> +}
> +if (max < DEFAULT_CODE_GEN_BUFFER_SIZE) {
> +tb_size = max;
> +}
> +}
> +#endif
>  }
>  if (tb_size < MIN_CODE_GEN_BUFFER_SIZE) {
>  tb_size = MIN_CODE_GEN_BUFFER_SIZE;
> --
> 2.27.0
>
>

-- 
Christian Ehrhardt
Staff Engineer, Ubuntu Server
Canonical Ltd


[PATCH] accel/tcg: reduce default code gen buffer on small hosts

2020-07-17 Thread Christian Ehrhardt
Since v5.0.0 and 600e17b2 "accel/tcg: increase default code gen buffer
size for 64 bit" in particular qemu with TCG regularly gets OOM Killed
on small hosts.

The former 47a2def4 "accel/tcg: remove link between guest ram and TCG
cache size" removed the link to guest size which is right, but at least
some connection to the host size needs to be retained to avoid growing
out of control on common CI setups which run at 1-2G host sizes.

The lower value of 1/8th of the host memory size and the default (of
currently 1G) will be taken to initialize the TB. There already is a
Min/Max check in place to not reach ridiculously small values.

Fixes: 600e17b2

Signed-off-by: Christian Ehrhardt 
---
 accel/tcg/translate-all.c | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 2afa46bd2b..ffcd67060e 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -977,6 +977,29 @@ static inline size_t size_code_gen_buffer(size_t tb_size)
 /* Size the buffer.  */
 if (tb_size == 0) {
 tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE;
+/*
+ * A static default of 1G turned out to break (OOM Kill) many common
+ * CI setups that run at 1-2G Host memory size.
+ * At the same time the former default of ram_size/4 wasted performance
+ * on large host systems when running small guests.
+ * Common CI guest sizes are 0.5-1G which meant ~128M-256M TB size.
+ * A Default of 1/8th of the host size will get small hosts a
+ * similar TB size than they had prior to v5.0 and common bare metal
+ * systems (>=8G) the new 1G default that was set in v5.0
+ */
+#if defined _SC_PHYS_PAGES && defined _SC_PAGESIZE
+{
+unsigned long max = DEFAULT_CODE_GEN_BUFFER_SIZE;
+double pages = (double)sysconf(_SC_PHYS_PAGES);
+
+if (pages > 0 && pagesize > 0) {
+max = (unsigned long)((pages * qemu_real_host_page_size) / 8);
+}
+if (max < DEFAULT_CODE_GEN_BUFFER_SIZE) {
+tb_size = max;
+}
+}
+#endif
 }
 if (tb_size < MIN_CODE_GEN_BUFFER_SIZE) {
 tb_size = MIN_CODE_GEN_BUFFER_SIZE;
-- 
2.27.0




Re: TB Cache size grows out of control with qemu 5.0

2020-07-17 Thread Christian Ehrhardt
On Thu, Jul 16, 2020 at 6:27 PM Alex Bennée  wrote:

>
> Christian Ehrhardt  writes:
>
> > On Wed, Jul 15, 2020 at 5:58 PM BALATON Zoltan wrote:
> >
> >> See commit 47a2def4533a2807e48954abd50b32ecb1aaf29a and the next two
> >> following it.
> >>
> >
> > Thank you Zoltan for pointing out this commit, I agree that this seems to be
> > the trigger for the issues I'm seeing. Unfortunately the common CI host size
> > is 1-2G. For example on Ubuntu Autopkgtests 1.5G.
> > Those of them running guests do so in 0.5-1G size in TCG mode
> > (as they often can't rely on having KVM available).
> >
> > The 1G TB buffer + 0.5G actual guest size + lack of dynamic downsizing
> > on memory pressure (never existed) makes these systems go OOM-Killing
> > the qemu process.
>
> Ooops. I admit the assumption was that most people running system
> emulation would be doing it on beefier machines.
>
> > The patches indicated that the TB flushes on a full guest boot are a
> > good indicator of the TB size efficiency. From my old checks I had:
> >
> > - Qemu 4.2 512M guest with 32M default overwritten by ram-size/4
> > TB flush count  14, 14, 16
> > - Qemu 5.0 512M guest with 1G default
> > TB flush count  1, 1, 1
> >
> > I agree that ram/4 seems odd, especially on huge guests that is a lot
> > potentially wasted. And most environments have a bit of breathing
> > room 1G is too big in small host systems and the common CI system falls
> > into this category. So I tuned it down to 256M for a test.
> >
> > - Qemu 4.2 512M guest with tb-size 256M
> > TB flush count  5, 5, 5
> > - Qemu 5.0 512M guest with tb-size 256M
> > TB flush count  5, 5, 5
> > - Qemu 5.0 512M guest with 256M default in code
> > TB flush count  5, 5, 5
> >
> > So performance wise the results are as much in-between as you'd think from a
> > TB size in between. And the memory consumption which (for me) is the actual
> > current issue to fix would be back in line again as expected.
>
> So I'm glad you have the workaround.
>
> >
> > So on one hand I'm suggesting something like:
> > --- a/accel/tcg/translate-all.c
> > +++ b/accel/tcg/translate-all.c
> > @@ -944,7 +944,7 @@ static void page_lock_pair(PageDesc **re
> >   * Users running large scale system emulation may want to tweak their
> >   * runtime setup via the tb-size control on the command line.
> >   */
> > -#define DEFAULT_CODE_GEN_BUFFER_SIZE_1 (1 * GiB)
> > +#define DEFAULT_CODE_GEN_BUFFER_SIZE_1 (256 * MiB)
>
> The problem we have is any number we pick here is arbitrary. And while
> it did regress your use-case changing it again just pushes a performance
> regression onto someone else.


Thanks for your feedback Alex!

That is true "for you" since 5.0 is released from upstream's POV.
But from the downstream POV no 5.0 exists for Ubuntu yet and I'd break
many places by releasing it like that.
Sadly the performance gain in the other cases will most likely go unnoticed.


> The most (*) 64 bit desktop PCs have 16Gb
> of RAM, almost all have more than 8gb. And there is a workaround.
>

Due to our work around virtualization the values representing
"most 64 bit desktop PCs" aren't the only thing that matters :-)

...


> > This is a bit more tricky than it seems as ram_sizes is no more
> > present in that context but it is enough to discuss it.
> > That should serve all cases - small and large - better as a pure
> > static default of 1G or always ram/4?
>
> I'm definitely against re-introducing ram_size into the mix. The
> original commit (a1b18df9a4) that broke this introduced an ordering
> dependency which we don't want to bring back.
>

I agree with that reasoning, but currently without any size dependency
the "arbitrary value" we picked to be 1G is even more fixed than it was
before.
Compared to pre v5.0 for now I can only decide to
a) tune it down -> performance impact for huge guests
b) keep it at 1G -> functional breakage with small hosts

> I'd be more amenable to something that took into account host memory and
> limited the default if it was smaller than a threshold. Is there a way
> to probe that that doesn't involve slurping /proc/meminfo?
>

I agree that a host-size dependency might be the better way to go,
yet I have no great cross-platform resilient way to get that.
Maybe we can make it like "if I can get some value consider it,
otherwise use the current default".
That would improve many places already, while keeping the rest at the
current behavior.
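
As a rough sketch of that "use it if available" idea on POSIX-like systems,
assuming sysconf() exposes _SC_PHYS_PAGES (which is not standardised) and
using a made-up function name:

```c
#include <stdint.h>
#include <unistd.h>

/*
 * Hypothetical probe (not QEMU code): host physical memory in bytes,
 * or 0 if the platform gives us no usable answer.  Callers treat 0 as
 * "unknown" and keep their current default.
 */
uint64_t sketch_host_physmem(void)
{
#if defined(_SC_PHYS_PAGES) && defined(_SC_PAGESIZE)
    long pages = sysconf(_SC_PHYS_PAGES);
    long page_size = sysconf(_SC_PAGESIZE);

    if (pages > 0 && page_size > 0) {
        /* widen before multiplying so 32-bit longs cannot overflow */
        return (uint64_t)pages * (uint64_t)page_size;
    }
#endif
    return 0;
}
```

This avoids parsing /proc/meminfo, and platforms without the sysconf names
simply fall through to 0 and keep today's behavior.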


> >
> > P.S. I added Alex being the Author of the offending 

Re: TB Cache size grows out of control with qemu 5.0

2020-07-16 Thread Christian Ehrhardt
On Wed, Jul 15, 2020 at 5:58 PM BALATON Zoltan  wrote:

> See commit 47a2def4533a2807e48954abd50b32ecb1aaf29a and the next two
> following it.
>

Thank you Zoltan for pointing out this commit, I agree that this seems to be
the trigger for the issues I'm seeing. Unfortunately the common CI host size
is 1-2G. For example on Ubuntu Autopkgtests 1.5G.
Those of them running guests do so in 0.5-1G size in TCG mode
(as they often can't rely on having KVM available).

The 1G TB buffer + 0.5G actual guest size + lack of dynamic downsizing
on memory pressure (never existed) makes these systems go OOM-Killing
the qemu process.

The patches indicated that the TB flushes on a full guest boot are a
good indicator of the TB size efficiency. From my old checks I had:

- Qemu 4.2 512M guest with 32M default overwritten by ram-size/4
TB flush count  14, 14, 16
- Qemu 5.0 512M guest with 1G default
TB flush count  1, 1, 1

I agree that ram/4 seems odd; especially on huge guests that is potentially
a lot wasted. And while most environments have a bit of breathing room,
1G is too big on small host systems, and the common CI system falls
into this category. So I tuned it down to 256M for a test.

- Qemu 4.2 512M guest with tb-size 256M
TB flush count  5, 5, 5
- Qemu 5.0 512M guest with tb-size 256M
TB flush count  5, 5, 5
- Qemu 5.0 512M guest with 256M default in code
TB flush count  5, 5, 5

So performance wise the results are as much in-between as you'd think from a
TB size in between. And the memory consumption which (for me) is the actual
current issue to fix would be back in line again as expected.

So on one hand I'm suggesting something like:
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -944,7 +944,7 @@ static void page_lock_pair(PageDesc **re
  * Users running large scale system emulation may want to tweak their
  * runtime setup via the tb-size control on the command line.
  */
-#define DEFAULT_CODE_GEN_BUFFER_SIZE_1 (1 * GiB)
+#define DEFAULT_CODE_GEN_BUFFER_SIZE_1 (256 * MiB)
 #endif
 #endif

OTOH I understand someone else might want to get the more speedy 1G
especially for large guests. If someone used to run a 4G guest in TCG the
TB Size was 1G all along.
How about picking the smaller of (1G || ram-size/4) as default?

This might then look like:
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -956,7 +956,12 @@ static inline size_t size_code_gen_buffe
 {
 /* Size the buffer.  */
 if (tb_size == 0) {
-tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE;
+unsigned long max_default = (unsigned long)(ram_size / 4);
+if (max_default < DEFAULT_CODE_GEN_BUFFER_SIZE) {
+tb_size = max_default;
+} else {
+   tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE;
+}
 }
 if (tb_size < MIN_CODE_GEN_BUFFER_SIZE) {
 tb_size = MIN_CODE_GEN_BUFFER_SIZE;

This is a bit more tricky than it seems as ram_size is no longer
present in that context, but it is enough to discuss it.
That should serve all cases - small and large - better than a pure
static default of 1G or always ram/4?

P.S. I added Alex being the Author of the offending patch and Richard/Paolo
for being listed in the Maintainers file for TCG.

-- 
Christian Ehrhardt
Staff Engineer, Ubuntu Server
Canonical Ltd


TB Cache size grows out of control with qemu 5.0

2020-07-15 Thread Christian Ehrhardt
eb for this behavior, but either I use
the wrong keywords or it wasn't reported/discussed yet.
Nor does [1] list anything that sounds related.
But if this already rings a bell for someone please let me know.
Thanks in advance!

[1]: https://wiki.qemu.org/ChangeLog/5.0#TCG

-- 
Christian Ehrhardt
Staff Engineer, Ubuntu Server
Canonical Ltd


[Bug 1805256] Re: qemu-img hangs on rcu_call_ready_event logic in Aarch64 when converting images

2020-06-30 Thread Christian Ehrhardt
This is re-opened for Bionic because bug 1885419 forces a revert of the
former backports.
After a deeper evaluation of whether the assert is wrong in the backport or is
just flagging a problem that already existed in Bionic, this will be re-fixed.

** Changed in: qemu (Ubuntu Bionic)
   Status: Fix Released => Triaged

** Changed in: qemu (Ubuntu Bionic)
 Assignee: (unassigned) => Rafael David Tinoco (rafaeldtinoco)

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1805256

Title:
  qemu-img hangs on rcu_call_ready_event logic in Aarch64 when
  converting images

Status in kunpeng920:
  Fix Released
Status in kunpeng920 ubuntu-18.04 series:
  Fix Released
Status in kunpeng920 ubuntu-18.04-hwe series:
  Fix Released
Status in kunpeng920 ubuntu-19.10 series:
  Fix Released
Status in kunpeng920 ubuntu-20.04 series:
  Fix Released
Status in kunpeng920 upstream-kernel series:
  Invalid
Status in QEMU:
  Fix Released
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Bionic:
  Triaged
Status in qemu source package in Eoan:
  Fix Released
Status in qemu source package in Focal:
  Fix Released

Bug description:
  [Impact]

  * QEMU locking primitives might face a race condition in QEMU Async
  I/O bottom halves scheduling. This leads to a deadlock, making either
  QEMU or one of its tools hang indefinitely.

  [Test Case]

  * qemu-img convert -f qcow2 -O qcow2 ./disk01.qcow2 ./output.qcow2

  Hangs indefinitely approximately 30% of the runs in Aarch64.

  [Regression Potential]

  * This is a change to a core part of QEMU: the AIO scheduling. It
  works like a "kernel" scheduler: whereas the kernel schedules OS tasks,
  the QEMU AIO code is responsible for scheduling QEMU coroutines or event
  listener callbacks.

  * There was a long discussion upstream about primitives and Aarch64.
  After quite some time Paolo released this patch and it solves the
  issue. Tested platforms were: amd64 and aarch64 based on his commit
  log.

  * Christian suggests that this fix stay a little longer in -proposed to
  make sure it won't cause any regressions.

  * dannf suggests we also check for performance regressions; e.g. how
  long it takes to convert a cloud image on high-core systems.

  [Other Info]

   * Original Description below:

  Command:

  qemu-img convert -f qcow2 -O qcow2 ./disk01.qcow2 ./output.qcow2

  Hangs indefinitely approximately 30% of the runs.

  

  Workaround:

  qemu-img convert -m 1 -f qcow2 -O qcow2 ./disk01.qcow2 ./output.qcow2

  Run "qemu-img convert" with "a single coroutine" to avoid this issue.

  

  (gdb) thread 1
  ...
  (gdb) bt
  #0 0xbf1ad81c in __GI_ppoll
  #1 0xaabcf73c in ppoll
  #2 qemu_poll_ns
  #3 0xaabd0764 in os_host_main_loop_wait
  #4 main_loop_wait
  ...

  (gdb) thread 2
  ...
  (gdb) bt
  #0 syscall ()
  #1 0xaabd41cc in qemu_futex_wait
  #2 qemu_event_wait (ev=ev@entry=0xaac86ce8 )
  #3 0xaabed05c in call_rcu_thread
  #4 0xaabd34c8 in qemu_thread_start
  #5 0xbf25c880 in start_thread
  #6 0xbf1b6b9c in thread_start ()

  (gdb) thread 3
  ...
  (gdb) bt
  #0 0xbf11aa20 in __GI___sigtimedwait
  #1 0xbf2671b4 in __sigwait
  #2 0xaabd1ddc in sigwait_compat
  #3 0xaabd34c8 in qemu_thread_start
  #4 0xbf25c880 in start_thread
  #5 0xbf1b6b9c in thread_start

  

  (gdb) run
  Starting program: /usr/bin/qemu-img convert -f qcow2 -O qcow2
  ./disk01.ext4.qcow2 ./output.qcow2

  [New Thread 0xbec5ad90 (LWP 72839)]
  [New Thread 0xbe459d90 (LWP 72840)]
  [New Thread 0xbdb57d90 (LWP 72841)]
  [New Thread 0xacac9d90 (LWP 72859)]
  [New Thread 0xa7ffed90 (LWP 72860)]
  [New Thread 0xa77fdd90 (LWP 72861)]
  [New Thread 0xa6ffcd90 (LWP 72862)]
  [New Thread 0xa67fbd90 (LWP 72863)]
  [New Thread 0xa5ffad90 (LWP 72864)]

  [Thread 0xa5ffad90 (LWP 72864) exited]
  [Thread 0xa6ffcd90 (LWP 72862) exited]
  [Thread 0xa77fdd90 (LWP 72861) exited]
  [Thread 0xbdb57d90 (LWP 72841) exited]
  [Thread 0xa67fbd90 (LWP 72863) exited]
  [Thread 0xacac9d90 (LWP 72859) exited]
  [Thread 0xa7ffed90 (LWP 72860) exited]

  
  """

  All the tasks left are blocked in a system call, so no task left to call
  qemu_futex_wake() to unblock thread #2 (in futex()), which would unblock
  thread #1 (doing poll() in a pipe with thread #2).

  Those 7 threads exit before disk conversion is complete (sometimes in
  the beginning, sometimes at the end).

  

  On the HiSilicon D06 system - a 96 core NUMA arm64 box - qemu-img
  frequently hangs (~50% of the time) with this command:

  qemu-img convert -f qcow2 -O qcow2 /tmp/cloudimg /tmp/cloudimg2

  Where "cloudimg" is a standard qcow2 Ubuntu cloud image. This
  qcow2->qcow2 conversion happens to be something uvtool does every time
  it fetches 

[Bug 1749393] Re: sbrk() not working under qemu-user with a PIE-compiled binary?

2020-06-17 Thread Christian Ehrhardt
** Changed in: qemu (Ubuntu)
 Assignee: Richard Henderson (rth) => Christian Ehrhardt  (paelzer)

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1749393

Title:
  sbrk() not working under qemu-user with a PIE-compiled binary?

Status in QEMU:
  Fix Released
Status in qemu package in Ubuntu:
  Triaged

Bug description:
  In Debian unstable, we recently switched bash to be a PIE-compiled
  binary (for hardening). Unfortunately this resulted in bash being
  broken when run under qemu-user (for all target architectures, host
  being amd64 for me).

  $ sudo chroot /srv/chroots/sid-i386/ qemu-i386-static /bin/bash
  bash: xmalloc: .././shell.c:1709: cannot allocate 10 bytes (0 bytes allocated)

  bash has its own malloc implementation based on sbrk():
  https://git.savannah.gnu.org/cgit/bash.git/tree/lib/malloc/malloc.c

  When we disable this internal implementation and rely on glibc's
  malloc, then everything is fine. But it might be that glibc has a
  fallback when sbrk() is not working properly and it might hide the
  underlying problem in qemu-user.
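
  To take bash's allocator out of the picture, a small probe along the
  following lines might help. It is only a sketch: probe_sbrk() is a
  hypothetical helper name, and the assumption is that the failure can be
  seen directly as sbrk() returning (void *)-1.

```c
#define _DEFAULT_SOURCE   /* ask glibc to declare sbrk() */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 if the program break can be grown by
 * `bytes` and the new memory is writable, -1 otherwise. */
static int probe_sbrk(intptr_t bytes)
{
    assert(bytes > 0);

    /* sbrk(0) only queries the current program break. */
    if (sbrk(0) == (void *)-1) {
        perror("sbrk(0)");
        return -1;
    }

    /* Grow the heap; on success sbrk() returns the previous break,
     * which is the start of the newly added region. */
    void *p = sbrk(bytes);
    if (p == (void *)-1) {
        perror("sbrk");
        return -1;
    }

    /* Touch the first and last byte to confirm the region is usable. */
    volatile char *mem = p;
    mem[0] = 1;
    mem[bytes - 1] = 1;

    /* Shrink again so the probe leaves the break where it found it. */
    sbrk(-bytes);
    return 0;
}
```

  Called from a main() that is built as a PIE (for example with
  -fPIE -pie) and run once natively and once under qemu-user, a differing
  return value would point at the emulator rather than at bash.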

  This issue has also been reported to the bash upstream author and he
  suggested that the issue might be in qemu-user so I'm opening a ticket
  here. Here's the discussion with the bash upstream author:
  https://lists.gnu.org/archive/html/bug-bash/2018-02/threads.html#00080

  You can find the problematic bash binary in that .deb file:
  http://snapshot.debian.org/archive/debian/20180206T154716Z/pool/main/b/bash/bash_4.4.18-1_i386.deb

  The version of qemu I have been using is 2.11 (Debian package qemu-
  user-static version 1:2.11+dfsg-1) but I have had reports that the
  problem is reproducible with older versions (back to 2.8 at least).

  Here are the related Debian bug reports:
  https://bugs.debian.org/889869
  https://bugs.debian.org/865599

  It's worth noting that bash used to have this problem (when compiled
  as a PIE binary) even when run directly but then something got fixed
  in the kernel and now the problem only appears when run under
  qemu-user:
  https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1518483

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1749393/+subscriptions



[Bug 1805256] Re: qemu-img hangs on rcu_call_ready_event logic in Aarch64 when converting images

2020-06-16 Thread Christian Ehrhardt
We had the 14 (instead of 7) days in -proposed for some extended maturing.
Nothing came up in regard to this and all validations were good.
Dropping block-proposed so this can be released once the SRU Team gets to it.

** Tags removed: block-proposed-bionic block-proposed-eoan block-
proposed-focal

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1805256

Title:
  qemu-img hangs on rcu_call_ready_event logic in Aarch64 when
  converting images

Status in kunpeng920:
  Fix Committed
Status in kunpeng920 ubuntu-18.04 series:
  Fix Committed
Status in kunpeng920 ubuntu-18.04-hwe series:
  Fix Committed
Status in kunpeng920 ubuntu-19.10 series:
  Fix Committed
Status in kunpeng920 ubuntu-20.04 series:
  Fix Committed
Status in kunpeng920 upstream-kernel series:
  Invalid
Status in QEMU:
  Fix Released
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Bionic:
  Fix Committed
Status in qemu source package in Eoan:
  Fix Committed
Status in qemu source package in Focal:
  Fix Committed

Bug description:
  [Impact]

  * QEMU locking primitives might face a race condition in QEMU Async
  I/O bottom halves scheduling. This leads to a deadlock that makes
  either QEMU or one of its tools hang indefinitely.

  [Test Case]

  * qemu-img convert -f qcow2 -O qcow2 ./disk01.qcow2 ./output.qcow2

  Hangs indefinitely approximately 30% of the runs in Aarch64.
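
  Because the hang is intermittent, it may be easier to judge a fix by
  running the test case many times under a timeout and counting the runs
  that had to be killed. The sketch below is one way to do that; the
  helper names run_with_timeout() and count_hangs() and the polling
  interval are assumptions for illustration, not part of QEMU.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: runs argv[0] with its arguments.
 * Returns the child's exit status if it finishes in time (127 if the
 * exec itself failed), -1 on fork/wait errors or abnormal termination,
 * and -2 if the child was still running after timeout_sec and was killed. */
static int run_with_timeout(char *const argv[], int timeout_sec)
{
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);                      /* only reached if exec failed */
    }

    const struct timespec tick = { 0, 100 * 1000 * 1000 };   /* 100 ms */
    for (int waited_ms = 0; waited_ms < timeout_sec * 1000; waited_ms += 100) {
        int status;
        pid_t r = waitpid(pid, &status, WNOHANG);   /* poll, do not block */
        if (r == pid) {
            return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
        }
        if (r < 0) {
            return -1;
        }
        nanosleep(&tick, NULL);
    }

    /* Treat the run as hung: kill the child and reap it. */
    kill(pid, SIGKILL);
    waitpid(pid, NULL, 0);
    return -2;
}

/* Hypothetical helper: repeats the command and counts timed-out runs. */
static int count_hangs(char *const argv[], int runs, int timeout_sec)
{
    int hangs = 0;
    for (int i = 0; i < runs; i++) {
        if (run_with_timeout(argv, timeout_sec) == -2) {
            hangs++;
        }
    }
    return hangs;
}
```

  With argv set to the qemu-img convert command line above and a timeout
  comfortably longer than a normal conversion, a nonzero count_hangs()
  result would indicate the problem is still present.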

  [Regression Potential]

  * This is a change to a core part of QEMU: the AIO scheduling. It
  works like a "kernel" scheduler: whereas the kernel schedules OS
  tasks, the QEMU AIO code is responsible for scheduling QEMU
  coroutines or event listener callbacks.

  * There was a long discussion upstream about primitives and Aarch64.
  After quite some time Paolo released this patch and it solves the
  issue. Tested platforms were amd64 and aarch64, based on his commit
  log.

  * Christian suggests that this fix stay a little longer in -proposed
  to make sure it won't cause any regressions.

  * dannf suggests we also check for performance regressions; e.g. how
  long it takes to convert a cloud image on high-core systems.

  [Other Info]

   * Original Description below:

  Command:

  qemu-img convert -f qcow2 -O qcow2 ./disk01.qcow2 ./output.qcow2

  Hangs indefinitely approximately 30% of the runs.

  

  Workaround:

  qemu-img convert -m 1 -f qcow2 -O qcow2 ./disk01.qcow2 ./output.qcow2

  Run "qemu-img convert" with "a single coroutine" to avoid this issue.

  

  (gdb) thread 1
  ...
  (gdb) bt
  #0 0xbf1ad81c in __GI_ppoll
  #1 0xaabcf73c in ppoll
  #2 qemu_poll_ns
  #3 0xaabd0764 in os_host_main_loop_wait
  #4 main_loop_wait
  ...

  (gdb) thread 2
  ...
  (gdb) bt
  #0 syscall ()
  #1 0xaabd41cc in qemu_futex_wait
  #2 qemu_event_wait (ev=ev@entry=0xaac86ce8 )
  #3 0xaabed05c in call_rcu_thread
  #4 0xaabd34c8 in qemu_thread_start
  #5 0xbf25c880 in start_thread
  #6 0xbf1b6b9c in thread_start ()

  (gdb) thread 3
  ...
  (gdb) bt
  #0 0xbf11aa20 in __GI___sigtimedwait
  #1 0xbf2671b4 in __sigwait
  #2 0xaabd1ddc in sigwait_compat
  #3 0xaabd34c8 in qemu_thread_start
  #4 0xbf25c880 in start_thread
  #5 0xbf1b6b9c in thread_start

  

  (gdb) run
  Starting program: /usr/bin/qemu-img convert -f qcow2 -O qcow2
  ./disk01.ext4.qcow2 ./output.qcow2

  [New Thread 0xbec5ad90 (LWP 72839)]
  [New Thread 0xbe459d90 (LWP 72840)]
  [New Thread 0xbdb57d90 (LWP 72841)]
  [New Thread 0xacac9d90 (LWP 72859)]
  [New Thread 0xa7ffed90 (LWP 72860)]
  [New Thread 0xa77fdd90 (LWP 72861)]
  [New Thread 0xa6ffcd90 (LWP 72862)]
  [New Thread 0xa67fbd90 (LWP 72863)]
  [New Thread 0xa5ffad90 (LWP 72864)]

  [Thread 0xa5ffad90 (LWP 72864) exited]
  [Thread 0xa6ffcd90 (LWP 72862) exited]
  [Thread 0xa77fdd90 (LWP 72861) exited]
  [Thread 0xbdb57d90 (LWP 72841) exited]
  [Thread 0xa67fbd90 (LWP 72863) exited]
  [Thread 0xacac9d90 (LWP 72859) exited]
  [Thread 0xa7ffed90 (LWP 72860) exited]

  
  """

  All the tasks left are blocked in a system call, so no task left to call
  qemu_futex_wake() to unblock thread #2 (in futex()), which would unblock
  thread #1 (doing poll() in a pipe with thread #2).

  Those 7 threads exit before disk conversion is complete (sometimes in
  the beginning, sometimes at the end).

  

  On the HiSilicon D06 system - a 96 core NUMA arm64 box - qemu-img
  frequently hangs (~50% of the time) with this command:

  qemu-img convert -f qcow2 -O qcow2 /tmp/cloudimg /tmp/cloudimg2

  Where "cloudimg" is a standard qcow2 Ubuntu cloud image. This
  qcow2->qcow2 conversion happens to be something uvtool does every time
  it fetches images.

  Once hung, attaching gdb gives the following backtrace:

  (gdb) bt
  #0  0xae4f8154 in __GI_ppoll 

Re: [PATCH] qga: fix assert regression on guest-shutdown

2020-06-15 Thread Christian Ehrhardt
On Tue, Jun 9, 2020 at 1:15 PM Christian Ehrhardt <
christian.ehrha...@canonical.com> wrote:

>
>
> On Thu, Jun 4, 2020 at 3:43 PM Christian Ehrhardt <
> christian.ehrha...@canonical.com> wrote:
>
>>
>>
>> On Thu, Jun 4, 2020 at 11:46 AM Marc-André Lureau <
>> marcandre.lur...@redhat.com> wrote:
>>
>>> Since commit 781f2b3d1e ("qga: process_event() simplification"),
>>> send_response() is called unconditionally, but will assert when "rsp" is
>>> NULL. This may happen with QCO_NO_SUCCESS_RESP commands, such as
>>> "guest-shutdown".
>>>
>>> Fixes: 781f2b3d1e5ef389b44016a897fd55e7a780bf35
>>> Cc: Michael Roth 
>>> Reported-by: Christian Ehrhardt 
>>> Signed-off-by: Marc-André Lureau 
>>> ---
>>>  qga/main.c | 6 +-
>>>  1 file changed, 5 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/qga/main.c b/qga/main.c
>>> index f0e454f28d3..3febf3b0fdf 100644
>>> --- a/qga/main.c
>>> +++ b/qga/main.c
>>> @@ -531,7 +531,11 @@ static int send_response(GAState *s, const QDict
>>> *rsp)
>>>  QString *payload_qstr, *response_qstr;
>>>  GIOStatus status;
>>>
>>> -g_assert(rsp && s->channel);
>>> +g_assert(s->channel);
>>> +
>>> +    if (!rsp) {
>>> +return 0;
>>> +}
>>>
>>>
>>>
>> Thanks Marc-André,
>> LGTM and should fix the issues I was seeing.
>>
>> Reviewed-by: Christian Ehrhardt 
>>
>
> In the meantime I also got to test this against the initially reported
> issue, LGTM as well (ran as no-change backport onto 4.2).
>
> Tested-by: Christian Ehrhardt 
>

This LGTM, with two reviews, one test and 11 days on the list without any
negative feedback.
I just wanted to re-check whether anything else is needed for this to be
committed?


Re: [PATCH] qga: fix assert regression on guest-shutdown

2020-06-09 Thread Christian Ehrhardt
On Thu, Jun 4, 2020 at 3:43 PM Christian Ehrhardt <
christian.ehrha...@canonical.com> wrote:

>
>
> On Thu, Jun 4, 2020 at 11:46 AM Marc-André Lureau <
> marcandre.lur...@redhat.com> wrote:
>
>> Since commit 781f2b3d1e ("qga: process_event() simplification"),
>> send_response() is called unconditionally, but will assert when "rsp" is
>> NULL. This may happen with QCO_NO_SUCCESS_RESP commands, such as
>> "guest-shutdown".
>>
>> Fixes: 781f2b3d1e5ef389b44016a897fd55e7a780bf35
>> Cc: Michael Roth 
>> Reported-by: Christian Ehrhardt 
>> Signed-off-by: Marc-André Lureau 
>> ---
>>  qga/main.c | 6 +-
>>  1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/qga/main.c b/qga/main.c
>> index f0e454f28d3..3febf3b0fdf 100644
>> --- a/qga/main.c
>> +++ b/qga/main.c
>> @@ -531,7 +531,11 @@ static int send_response(GAState *s, const QDict
>> *rsp)
>>  QString *payload_qstr, *response_qstr;
>>  GIOStatus status;
>>
>> -g_assert(rsp && s->channel);
>> +g_assert(s->channel);
>> +
>> +if (!rsp) {
>> +return 0;
>> +}
>>
>>
>>
> Thanks Marc-André,
> LGTM and should fix the issues I was seeing.
>
> Reviewed-by: Christian Ehrhardt 
>

In the meantime I also got to test this against the initially reported
issue, LGTM as well (ran as no-change backport onto 4.2).

Tested-by: Christian Ehrhardt 


[Bug 1805256] Re: qemu-img hangs on rcu_call_ready_event logic in Aarch64 when converting images

2020-06-04 Thread Christian Ehrhardt
I've looked and retried the tests - all green now.
Let us give it a few extra days in proposed as planned, but other than that it 
looks ok to be released.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1805256

Title:
  qemu-img hangs on rcu_call_ready_event logic in Aarch64 when
  converting images

Status in kunpeng920:
  In Progress
Status in kunpeng920 ubuntu-18.04 series:
  In Progress
Status in kunpeng920 ubuntu-18.04-hwe series:
  In Progress
Status in kunpeng920 ubuntu-19.10 series:
  In Progress
Status in kunpeng920 ubuntu-20.04 series:
  In Progress
Status in kunpeng920 upstream-kernel series:
  Invalid
Status in QEMU:
  Fix Released
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Bionic:
  Fix Committed
Status in qemu source package in Eoan:
  Fix Committed
Status in qemu source package in Focal:
  Fix Committed

Bug description:
  [Impact]

  * QEMU locking primitives might face a race condition in QEMU Async
  I/O bottom halves scheduling. This leads to a deadlock that makes
  either QEMU or one of its tools hang indefinitely.

  [Test Case]

  * qemu-img convert -f qcow2 -O qcow2 ./disk01.qcow2 ./output.qcow2

  Hangs indefinitely approximately 30% of the runs in Aarch64.

  [Regression Potential]

  * This is a change to a core part of QEMU: the AIO scheduling. It
  works like a "kernel" scheduler: whereas the kernel schedules OS
  tasks, the QEMU AIO code is responsible for scheduling QEMU
  coroutines or event listener callbacks.

  * There was a long discussion upstream about primitives and Aarch64.
  After quite some time Paolo released this patch and it solves the
  issue. Tested platforms were amd64 and aarch64, based on his commit
  log.

  * Christian suggests that this fix stay a little longer in -proposed
  to make sure it won't cause any regressions.

  * dannf suggests we also check for performance regressions; e.g. how
  long it takes to convert a cloud image on high-core systems.

  [Other Info]

   * Original Description below:

  Command:

  qemu-img convert -f qcow2 -O qcow2 ./disk01.qcow2 ./output.qcow2

  Hangs indefinitely approximately 30% of the runs.

  

  Workaround:

  qemu-img convert -m 1 -f qcow2 -O qcow2 ./disk01.qcow2 ./output.qcow2

  Run "qemu-img convert" with "a single coroutine" to avoid this issue.

  

  (gdb) thread 1
  ...
  (gdb) bt
  #0 0xbf1ad81c in __GI_ppoll
  #1 0xaabcf73c in ppoll
  #2 qemu_poll_ns
  #3 0xaabd0764 in os_host_main_loop_wait
  #4 main_loop_wait
  ...

  (gdb) thread 2
  ...
  (gdb) bt
  #0 syscall ()
  #1 0xaabd41cc in qemu_futex_wait
  #2 qemu_event_wait (ev=ev@entry=0xaac86ce8 )
  #3 0xaabed05c in call_rcu_thread
  #4 0xaabd34c8 in qemu_thread_start
  #5 0xbf25c880 in start_thread
  #6 0xbf1b6b9c in thread_start ()

  (gdb) thread 3
  ...
  (gdb) bt
  #0 0xbf11aa20 in __GI___sigtimedwait
  #1 0xbf2671b4 in __sigwait
  #2 0xaabd1ddc in sigwait_compat
  #3 0xaabd34c8 in qemu_thread_start
  #4 0xbf25c880 in start_thread
  #5 0xbf1b6b9c in thread_start

  

  (gdb) run
  Starting program: /usr/bin/qemu-img convert -f qcow2 -O qcow2
  ./disk01.ext4.qcow2 ./output.qcow2

  [New Thread 0xbec5ad90 (LWP 72839)]
  [New Thread 0xbe459d90 (LWP 72840)]
  [New Thread 0xbdb57d90 (LWP 72841)]
  [New Thread 0xacac9d90 (LWP 72859)]
  [New Thread 0xa7ffed90 (LWP 72860)]
  [New Thread 0xa77fdd90 (LWP 72861)]
  [New Thread 0xa6ffcd90 (LWP 72862)]
  [New Thread 0xa67fbd90 (LWP 72863)]
  [New Thread 0xa5ffad90 (LWP 72864)]

  [Thread 0xa5ffad90 (LWP 72864) exited]
  [Thread 0xa6ffcd90 (LWP 72862) exited]
  [Thread 0xa77fdd90 (LWP 72861) exited]
  [Thread 0xbdb57d90 (LWP 72841) exited]
  [Thread 0xa67fbd90 (LWP 72863) exited]
  [Thread 0xacac9d90 (LWP 72859) exited]
  [Thread 0xa7ffed90 (LWP 72860) exited]

  
  """

  All the tasks left are blocked in a system call, so no task left to call
  qemu_futex_wake() to unblock thread #2 (in futex()), which would unblock
  thread #1 (doing poll() in a pipe with thread #2).

  Those 7 threads exit before disk conversion is complete (sometimes in
  the beginning, sometimes at the end).

  

  On the HiSilicon D06 system - a 96 core NUMA arm64 box - qemu-img
  frequently hangs (~50% of the time) with this command:

  qemu-img convert -f qcow2 -O qcow2 /tmp/cloudimg /tmp/cloudimg2

  Where "cloudimg" is a standard qcow2 Ubuntu cloud image. This
  qcow2->qcow2 conversion happens to be something uvtool does every time
  it fetches images.

  Once hung, attaching gdb gives the following backtrace:

  (gdb) bt
  #0  0xae4f8154 in __GI_ppoll (fds=0xe8a67dc0, 
nfds=187650274213760,
  timeout=, timeout@entry=0x0, sigmask=0xc123b950)
  at ../sysdeps/unix/sysv/linux/ppoll.c:39
  #1  

Re: [PATCH] qga: fix assert regression on guest-shutdown

2020-06-04 Thread Christian Ehrhardt
On Thu, Jun 4, 2020 at 11:46 AM Marc-André Lureau <
marcandre.lur...@redhat.com> wrote:

> Since commit 781f2b3d1e ("qga: process_event() simplification"),
> send_response() is called unconditionally, but will assert when "rsp" is
> NULL. This may happen with QCO_NO_SUCCESS_RESP commands, such as
> "guest-shutdown".
>
> Fixes: 781f2b3d1e5ef389b44016a897fd55e7a780bf35
> Cc: Michael Roth 
> Reported-by: Christian Ehrhardt 
> Signed-off-by: Marc-André Lureau 
> ---
>  qga/main.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/qga/main.c b/qga/main.c
> index f0e454f28d3..3febf3b0fdf 100644
> --- a/qga/main.c
> +++ b/qga/main.c
> @@ -531,7 +531,11 @@ static int send_response(GAState *s, const QDict *rsp)
>  QString *payload_qstr, *response_qstr;
>  GIOStatus status;
>
> -g_assert(rsp && s->channel);
> +g_assert(s->channel);
> +
> +if (!rsp) {
> +    return 0;
> +    }
>
>
>
Thanks Marc-André,
LGTM and should fix the issues I was seeing.

Reviewed-by: Christian Ehrhardt 


-- 
Christian Ehrhardt
Staff Engineer, Ubuntu Server
Canonical Ltd


qemu-guest agent asserts on shutdown

2020-06-04 Thread Christian Ehrhardt
Hi,
while debugging a report I got in Ubuntu I found that since qemu 4.0 the
guest agent shutdown feature works (the guest shuts down) but the agent
crashes each time it does so. This can be a big red herring when debugging
other things, and people start to get "an application crashed, do you want
to report" pop-ups if they have set up automatic crash reports.

If you start the guest again afterwards and check the guest-agent
status you will see in the journal:
-- Logs begin at Tue 2020-06-02 07:41:32 UTC, end at Thu 2020-06-04 08:07:37 UTC. --
Jun 02 07:47:58 focal systemd[1]: Started QEMU Guest Agent.
Jun 02 07:49:03 focal qemu-ga[1984]: info: guest-shutdown called, mode: (null)
Jun 02 07:49:03 focal qemu-ga[1984]: **
Jun 02 07:49:03 focal qemu-ga[1984]: ERROR:/build/qemu-7aKH5L/qemu-4.2/qga/main.c:532:send_response: assertion failed: (rsp && s->channel)
Jun 02 07:49:03 focal qemu-ga[1984]: Bail out! ERROR:/build/qemu-7aKH5L/qemu-4.2/qga/main.c:532:send_response: assertion failed: (rsp && s->channel)
Jun 02 07:49:04 focal systemd[1]: Stopping QEMU Guest Agent...
Jun 02 07:49:04 focal systemd[1]: qemu-guest-agent.service: Succeeded.
Jun 02 07:49:04 focal systemd[1]: Stopped QEMU Guest Agent.

The actual assert is from "forever" [3] (v0.15) which is the initial
addition of qemu guest agent in 2011. That was later restructured in [1]
(v1.1) and [2] (v4.0).

In a check through Ubuntu releases I got
1) Host: Q 2.11 L 4.0 (Bionic) - G 2.11 (Bionic)
2) Host: Q 4.0 L 5.4 (Eoan) - G 2.11 (Bionic)
3) Host: Q 4.2 L 6.0 (Focal) - G 2.11 (Bionic)
4) Host: Q 2.11 L 4.0 (Bionic) - G 4.0 (Eoan)
5) Host: Q 4.0 L 5.4 (Eoan) - G 4.0 (Eoan)
6) Host: Q 4.2 L 6.0 (Focal) - G 4.0 (Eoan)
7) Host: Q 2.11 L 4.0 (Bionic) - G 4.2 (Focal)
8) Host: Q 4.0 L 5.4 (Eoan) - G 4.2 (Focal)
9) Host: Q 4.2 L 6.0 (Focal) - G 4.2 (Focal)

So it seemed to be the qemu-guest-agent portion since >=4.0.
I did a build with [2] reverted and the crash is gone.

I see from the host:
$ virsh qemu-agent-command focal '{"execute": "guest-shutdown"}'
"error: Guest agent is not responding: Guest agent disappeared while
executing command"

I'm not sure which part of the communication breaks first, but it could be
trying to send on a dying connection. The old code had:

rsp = qmp_dispatch(_commands, QOBJECT(req), false);
if (rsp) {
ret = send_response(s, rsp)

While the new code is like:

rsp = qmp_dispatch(_commands, obj, false);
end:
 ret = send_response(s, rsp);

Maybe it runs send_response despite qmp_dispatch failing now?
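
The difference between the two call patterns can be modelled outside of
QEMU. The following is only a sketch under stated assumptions: the types
and the *_model() functions are hypothetical stand-ins, not the real qga
code, and the NULL check in the send path is just one possible way to
tolerate a dispatcher that returns no response.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real QEMU types; not the QEMU API. */
typedef struct { int fd; } Channel;
typedef struct { Channel *channel; int sent; } AgentState;
typedef struct { const char *payload; } Response;

/* Models a send path that still treats a missing channel as a
 * programming error, but accepts "no response" as nothing to send. */
static int send_response_model(AgentState *s, const Response *rsp)
{
    assert(s->channel);
    if (!rsp) {
        return 0;              /* nothing to send, not an error */
    }
    s->sent++;                 /* stands in for writing to the channel */
    return 0;
}

/* Models a dispatcher: some commands produce no response object. */
static const Response *dispatch_model(int wants_response)
{
    static const Response ok = { "{\"return\": {}}" };
    return wants_response ? &ok : NULL;
}

/* Models the newer caller, which sends unconditionally instead of
 * checking the dispatch result first. */
static int process_event_model(AgentState *s, int wants_response)
{
    const Response *rsp = dispatch_model(wants_response);
    return send_response_model(s, rsp);
}
```

In this model an assertion that also required a non-NULL response would
abort for every command without a response, which matches the shape of
the failure in the journal above.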

I didn't stare at it long enough to have a solution yet, but wanted to make
the maintainer of qga and the Author aware.

[1]: https://git.qemu.org/?p=qemu.git;a=commit;h=125b310e1d62
[2]: https://git.qemu.org/?p=qemu.git;a=commit;h=781f2b3d1e5e
[3]: https://git.qemu.org/?p=qemu.git;a=commit;h=48ff7a625b36

-- 
Christian Ehrhardt
Staff Engineer, Ubuntu Server
Canonical Ltd


[Bug 1805256] Re: qemu-img hangs on rcu_call_ready_event logic in Aarch64 when converting images

2020-05-29 Thread Christian Ehrhardt
Migrated right now, sponsoring the related SRU portions into B/E/F ...
for consideration by the SRU Team.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1805256

Title:
  qemu-img hangs on rcu_call_ready_event logic in Aarch64 when
  converting images

Status in kunpeng920:
  In Progress
Status in kunpeng920 ubuntu-18.04 series:
  Triaged
Status in kunpeng920 ubuntu-18.04-hwe series:
  Triaged
Status in kunpeng920 ubuntu-19.10 series:
  Triaged
Status in kunpeng920 ubuntu-20.04 series:
  Triaged
Status in kunpeng920 upstream-kernel series:
  Fix Committed
Status in QEMU:
  Fix Released
Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Bionic:
  In Progress
Status in qemu source package in Eoan:
  In Progress
Status in qemu source package in Focal:
  In Progress

Bug description:
  [Impact]

  * QEMU locking primitives might face a race condition in QEMU Async
  I/O bottom halves scheduling. This leads to a deadlock that makes
  either QEMU or one of its tools hang indefinitely.

  [Test Case]

  * qemu-img convert -f qcow2 -O qcow2 ./disk01.qcow2 ./output.qcow2

  Hangs indefinitely approximately 30% of the runs in Aarch64.

  [Regression Potential]

  * This is a change to a core part of QEMU: the AIO scheduling. It
  works like a "kernel" scheduler: whereas the kernel schedules OS
  tasks, the QEMU AIO code is responsible for scheduling QEMU
  coroutines or event listener callbacks.

  * There was a long discussion upstream about primitives and Aarch64.
  After quite some time Paolo released this patch and it solves the
  issue. Tested platforms were amd64 and aarch64, based on his commit
  log.

  * Christian suggests that this fix stay a little longer in -proposed
  to make sure it won't cause any regressions.

  * dannf suggests we also check for performance regressions; e.g. how
  long it takes to convert a cloud image on high-core systems.

  [Other Info]

   * Original Description below:

  Command:

  qemu-img convert -f qcow2 -O qcow2 ./disk01.qcow2 ./output.qcow2

  Hangs indefinitely approximately 30% of the runs.

  

  Workaround:

  qemu-img convert -m 1 -f qcow2 -O qcow2 ./disk01.qcow2 ./output.qcow2

  Run "qemu-img convert" with "a single coroutine" to avoid this issue.

  

  (gdb) thread 1
  ...
  (gdb) bt
  #0 0xbf1ad81c in __GI_ppoll
  #1 0xaabcf73c in ppoll
  #2 qemu_poll_ns
  #3 0xaabd0764 in os_host_main_loop_wait
  #4 main_loop_wait
  ...

  (gdb) thread 2
  ...
  (gdb) bt
  #0 syscall ()
  #1 0xaabd41cc in qemu_futex_wait
  #2 qemu_event_wait (ev=ev@entry=0xaac86ce8 )
  #3 0xaabed05c in call_rcu_thread
  #4 0xaabd34c8 in qemu_thread_start
  #5 0xbf25c880 in start_thread
  #6 0xbf1b6b9c in thread_start ()

  (gdb) thread 3
  ...
  (gdb) bt
  #0 0xbf11aa20 in __GI___sigtimedwait
  #1 0xbf2671b4 in __sigwait
  #2 0xaabd1ddc in sigwait_compat
  #3 0xaabd34c8 in qemu_thread_start
  #4 0xbf25c880 in start_thread
  #5 0xbf1b6b9c in thread_start

  

  (gdb) run
  Starting program: /usr/bin/qemu-img convert -f qcow2 -O qcow2
  ./disk01.ext4.qcow2 ./output.qcow2

  [New Thread 0xbec5ad90 (LWP 72839)]
  [New Thread 0xbe459d90 (LWP 72840)]
  [New Thread 0xbdb57d90 (LWP 72841)]
  [New Thread 0xacac9d90 (LWP 72859)]
  [New Thread 0xa7ffed90 (LWP 72860)]
  [New Thread 0xa77fdd90 (LWP 72861)]
  [New Thread 0xa6ffcd90 (LWP 72862)]
  [New Thread 0xa67fbd90 (LWP 72863)]
  [New Thread 0xa5ffad90 (LWP 72864)]

  [Thread 0xa5ffad90 (LWP 72864) exited]
  [Thread 0xa6ffcd90 (LWP 72862) exited]
  [Thread 0xa77fdd90 (LWP 72861) exited]
  [Thread 0xbdb57d90 (LWP 72841) exited]
  [Thread 0xa67fbd90 (LWP 72863) exited]
  [Thread 0xacac9d90 (LWP 72859) exited]
  [Thread 0xa7ffed90 (LWP 72860) exited]

  
  """

  All the tasks left are blocked in a system call, so no task left to call
  qemu_futex_wake() to unblock thread #2 (in futex()), which would unblock
  thread #1 (doing poll() in a pipe with thread #2).

  Those 7 threads exit before disk conversion is complete (sometimes in
  the beginning, sometimes at the end).

  

  On the HiSilicon D06 system - a 96 core NUMA arm64 box - qemu-img
  frequently hangs (~50% of the time) with this command:

  qemu-img convert -f qcow2 -O qcow2 /tmp/cloudimg /tmp/cloudimg2

  Where "cloudimg" is a standard qcow2 Ubuntu cloud image. This
  qcow2->qcow2 conversion happens to be something uvtool does every time
  it fetches images.

  Once hung, attaching gdb gives the following backtrace:

  (gdb) bt
  #0  0xae4f8154 in __GI_ppoll (fds=0xe8a67dc0, 
nfds=187650274213760,
  timeout=, timeout@entry=0x0, sigmask=0xc123b950)
  at ../sysdeps/unix/sysv/linux/ppoll.c:39
  #1  0xbbefaf00 in ppoll (__ss=0x0, __timeout=0x0, __nfds=,
   

[Bug 1805256] Re: qemu-img hangs on rcu_call_ready_event logic in Aarch64 when converting images

2020-05-28 Thread Christian Ehrhardt
FYI: sponsored into groovy

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1805256

Title:
  qemu-img hangs on rcu_call_ready_event logic in Aarch64 when
  converting images

Status in kunpeng920:
  In Progress
Status in kunpeng920 ubuntu-18.04 series:
  Triaged
Status in kunpeng920 ubuntu-18.04-hwe series:
  Triaged
Status in kunpeng920 ubuntu-19.10 series:
  Triaged
Status in kunpeng920 ubuntu-20.04 series:
  Triaged
Status in kunpeng920 upstream-kernel series:
  Fix Committed
Status in QEMU:
  Fix Released
Status in qemu package in Ubuntu:
  In Progress
Status in qemu source package in Bionic:
  In Progress
Status in qemu source package in Eoan:
  In Progress
Status in qemu source package in Focal:
  In Progress

Bug description:
  [Impact]

  * QEMU locking primitives might face a race condition in QEMU Async
  I/O bottom halves scheduling. This leads to a deadlock that makes
  either QEMU or one of its tools hang indefinitely.

  [Test Case]

  * qemu-img convert -f qcow2 -O qcow2 ./disk01.qcow2 ./output.qcow2

  Hangs indefinitely approximately 30% of the runs in Aarch64.

  [Regression Potential]

  * This is a change to a core part of QEMU: the AIO scheduling. It
  works like a "kernel" scheduler: whereas the kernel schedules OS
  tasks, the QEMU AIO code is responsible for scheduling QEMU
  coroutines or event listener callbacks.

  * There was a long discussion upstream about primitives and Aarch64.
  After quite some time Paolo released this patch and it solves the
  issue. Tested platforms were amd64 and aarch64, based on his commit
  log.

  * Christian suggests that this fix stay a little longer in -proposed
  to make sure it won't cause any regressions.

  * dannf suggests we also check for performance regressions; e.g. how
  long it takes to convert a cloud image on high-core systems.

  [Other Info]

   * Original Description below:

  Command:

  qemu-img convert -f qcow2 -O qcow2 ./disk01.qcow2 ./output.qcow2

  Hangs indefinitely approximately 30% of the runs.

  

  Workaround:

  qemu-img convert -m 1 -f qcow2 -O qcow2 ./disk01.qcow2 ./output.qcow2

  Run "qemu-img convert" with "a single coroutine" to avoid this issue.

  

  (gdb) thread 1
  ...
  (gdb) bt
  #0 0xbf1ad81c in __GI_ppoll
  #1 0xaabcf73c in ppoll
  #2 qemu_poll_ns
  #3 0xaabd0764 in os_host_main_loop_wait
  #4 main_loop_wait
  ...

  (gdb) thread 2
  ...
  (gdb) bt
  #0 syscall ()
  #1 0xaabd41cc in qemu_futex_wait
  #2 qemu_event_wait (ev=ev@entry=0xaac86ce8 )
  #3 0xaabed05c in call_rcu_thread
  #4 0xaabd34c8 in qemu_thread_start
  #5 0xbf25c880 in start_thread
  #6 0xbf1b6b9c in thread_start ()

  (gdb) thread 3
  ...
  (gdb) bt
  #0 0xbf11aa20 in __GI___sigtimedwait
  #1 0xbf2671b4 in __sigwait
  #2 0xaabd1ddc in sigwait_compat
  #3 0xaabd34c8 in qemu_thread_start
  #4 0xbf25c880 in start_thread
  #5 0xbf1b6b9c in thread_start

  

  (gdb) run
  Starting program: /usr/bin/qemu-img convert -f qcow2 -O qcow2
  ./disk01.ext4.qcow2 ./output.qcow2

  [New Thread 0xbec5ad90 (LWP 72839)]
  [New Thread 0xbe459d90 (LWP 72840)]
  [New Thread 0xbdb57d90 (LWP 72841)]
  [New Thread 0xacac9d90 (LWP 72859)]
  [New Thread 0xa7ffed90 (LWP 72860)]
  [New Thread 0xa77fdd90 (LWP 72861)]
  [New Thread 0xa6ffcd90 (LWP 72862)]
  [New Thread 0xa67fbd90 (LWP 72863)]
  [New Thread 0xa5ffad90 (LWP 72864)]

  [Thread 0xa5ffad90 (LWP 72864) exited]
  [Thread 0xa6ffcd90 (LWP 72862) exited]
  [Thread 0xa77fdd90 (LWP 72861) exited]
  [Thread 0xbdb57d90 (LWP 72841) exited]
  [Thread 0xa67fbd90 (LWP 72863) exited]
  [Thread 0xacac9d90 (LWP 72859) exited]
  [Thread 0xa7ffed90 (LWP 72860) exited]

  
  """

  All the tasks left are blocked in a system call, so no task left to call
  qemu_futex_wake() to unblock thread #2 (in futex()), which would unblock
  thread #1 (doing poll() in a pipe with thread #2).

  Those 7 threads exit before disk conversion is complete (sometimes in
  the beginning, sometimes at the end).

  

  On the HiSilicon D06 system - a 96 core NUMA arm64 box - qemu-img
  frequently hangs (~50% of the time) with this command:

  qemu-img convert -f qcow2 -O qcow2 /tmp/cloudimg /tmp/cloudimg2

  Where "cloudimg" is a standard qcow2 Ubuntu cloud image. This
  qcow2->qcow2 conversion happens to be something uvtool does every time
  it fetches images.

  Once hung, attaching gdb gives the following backtrace:

  (gdb) bt
  #0  0xae4f8154 in __GI_ppoll (fds=0xe8a67dc0, 
nfds=187650274213760,
  timeout=, timeout@entry=0x0, sigmask=0xc123b950)
  at ../sysdeps/unix/sysv/linux/ppoll.c:39
  #1  0xbbefaf00 in ppoll (__ss=0x0, __timeout=0x0, __nfds=,
  __fds=) at /usr/include/aarch64-linux-gnu/bits/poll2.h:77
  #2  qemu_poll_ns 

[Bug 1877052] Re: KVM Win 10 guest pauses after kernel upgrade

2020-05-28 Thread Christian Ehrhardt
@Andreas - If we find nothing else to try I'll ping you when I have a
newer qemu build for Ubuntu 20.10 for you to try.

** Tags added: qemu-20.10

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1877052

Title:
  KVM Win 10 guest pauses after kernel upgrade

Status in QEMU:
  New
Status in qemu package in Ubuntu:
  Incomplete

Bug description:
  Hello!
  Unfortunately the bug has apparently reappeared. I have a Windows 10 running 
in a VM, which after my today's "apt upgrade" goes into pause mode after a few 
seconds of running time.

  Until yesterday it used to work and I was able to boot the VM. During
  the kernel update (from 5.4.0-28.33 to 5.4.0-29.34) the VM was active
  and then went into pause mode. Even after a reboot of my host system
  the problem still persists: the VM boots for a few seconds and then
  switches to pause mode.

  Current Kernel: Linux andreas-laptop 5.4.0-29-generic #33-Ubuntu SMP
  Wed Apr 29 14:32:27 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

  Maybe relevant logfile lines:
  2020-05-06T07:46:42.857574Z qemu-system-x86_64: warning: host doesn't support 
requested feature: MSR(48FH).vmx-exit-load-perf-global-ctrl [bit 12]
  2020-05-06T07:46:42.857718Z qemu-system-x86_64: warning: host doesn't support 
requested feature: MSR(490H).vmx-entry-load-perf-global-ctrl [bit 13]
  2020-05-06T07:46:42.860567Z qemu-system-x86_64: warning: host doesn't support 
requested feature: MSR(48FH).vmx-exit-load-perf-global-ctrl [bit 12]
  2020-05-06T07:46:42.860582Z qemu-system-x86_64: warning: host doesn't support 
requested feature: MSR(490H).vmx-entry-load-perf-global-ctrl [bit 13]
  2020-05-06T07:47:22.901057Z qemu-system-x86_64: terminating on signal 15 from 
pid 1593 (/usr/sbin/libvirtd)
  2020-05-06 07:47:23.101+: shutting down, reason=destroyed


  Kind regards,
     Andreas

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1877052/+subscriptions



[Bug 1877052] Re: KVM Win 10 guest pauses after kernel upgrade

2020-05-28 Thread Christian Ehrhardt
The warnings in the report like "MSR(48FH).vmx-exit-load-perf-global-ctrl" are
unrelated (in regard to the guest hang).
Those happen on
a) too old kernels that don't support the feature
b) a mismatch between the expectations of a chip and its actual capabilities
E.g. if libvirt thinks a feature should be supported by a chip, but it isn't.
There are too many SKUs out there to be perfect - so these are red herrings at
best.

I have not seen similar reports recently nor anyone else chiming in on this one.
After losing what we thought could be a track to the bug I'm puzzled what to do
next on this.

@Andreas - did you in the meantime find any new insight on this?

** Changed in: qemu (Ubuntu)
   Status: New => Incomplete

-- 
https://bugs.launchpad.net/bugs/1877052

Title:
  KVM Win 10 guest pauses after kernel upgrade

Status in QEMU:
  New
Status in qemu package in Ubuntu:
  Incomplete




[Bug 1805256] Re: qemu-img hangs on rcu_call_ready_event logic in Aarch64 when converting images

2020-05-26 Thread Christian Ehrhardt
** No longer affects: qemu (Ubuntu Disco)

-- 
https://bugs.launchpad.net/bugs/1805256

Title:
  qemu-img hangs on rcu_call_ready_event logic in Aarch64 when
  converting images

Status in kunpeng920:
  In Progress
Status in kunpeng920 ubuntu-18.04 series:
  Triaged
Status in kunpeng920 ubuntu-18.04-hwe series:
  Triaged
Status in kunpeng920 ubuntu-19.10 series:
  Triaged
Status in kunpeng920 ubuntu-20.04 series:
  Triaged
Status in kunpeng920 upstream-kernel series:
  Fix Committed
Status in QEMU:
  Fix Released
Status in qemu package in Ubuntu:
  In Progress
Status in qemu source package in Bionic:
  In Progress
Status in qemu source package in Eoan:
  In Progress
Status in qemu source package in Focal:
  In Progress

Bug description:
  [Impact]

  * QEMU locking primitives might face a race condition in QEMU Async
  I/O bottom halves scheduling. This leads to a deadlock, causing either
  QEMU or one of its tools to hang indefinitely.

  [Test Case]

  * qemu-img convert -f qcow2 -O qcow2 ./disk01.qcow2 ./output.qcow2

  Hangs indefinitely in approximately 30% of the runs on AArch64.

  [Regression Potential]

  * This is a change to a core part of QEMU: the AIO scheduling. It
  works like a "kernel" scheduler: whereas the kernel schedules OS tasks,
  the QEMU AIO code is responsible for scheduling QEMU coroutines and
  event listener callbacks.

  * There was a long discussion upstream about primitives and AArch64.
  After quite some time Paolo released this patch and it solves the
  issue. According to his commit log, the tested platforms were amd64
  and aarch64.

  * Christian suggests that this fix stay a little longer in -proposed to
  make sure it won't cause any regressions.

  * dannf suggests we also check for performance regressions; e.g. how
  long it takes to convert a cloud image on high-core systems.

  [Other Info]

   * Original description below:

  Command:

  qemu-img convert -f qcow2 -O qcow2 ./disk01.qcow2 ./output.qcow2

  Hangs indefinitely approximately 30% of the runs.

  

  Workaround:

  qemu-img convert -m 1 -f qcow2 -O qcow2 ./disk01.qcow2 ./output.qcow2

  Run "qemu-img convert" with "a single coroutine" to avoid this issue.
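
The test case and the workaround above can be scripted along these lines. This is a rough sketch, not part of the original report: the helper names and the 600-second timeout are assumptions chosen for illustration, while `-m` is the qemu-img convert option used in the workaround.

```python
import subprocess


def build_convert_cmd(src, dst, coroutines=None):
    """Build the qcow2 -> qcow2 convert command from the report.

    Passing coroutines=1 adds "-m 1", i.e. the single-coroutine workaround.
    """
    cmd = ["qemu-img", "convert"]
    if coroutines is not None:
        cmd += ["-m", str(coroutines)]
    cmd += ["-f", "qcow2", "-O", "qcow2", src, dst]
    return cmd


def convert_hangs(cmd, timeout_s=600):
    """Run cmd and return True if it was killed after timeout_s seconds.

    The timeout is an assumed value; pick one well above the normal
    conversion time for the image in question.
    """
    try:
        subprocess.run(cmd, check=True, timeout=timeout_s)
        return False
    except subprocess.TimeoutExpired:
        return True
```

Calling `convert_hangs(build_convert_cmd("./disk01.qcow2", "./output.qcow2"))` in a loop gives a rough hang rate for the default setting; repeating it with `coroutines=1` checks whether the workaround holds on a given machine.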

  

  (gdb) thread 1
  ...
  (gdb) bt
  #0 0xbf1ad81c in __GI_ppoll
  #1 0xaabcf73c in ppoll
  #2 qemu_poll_ns
  #3 0xaabd0764 in os_host_main_loop_wait
  #4 main_loop_wait
  ...

  (gdb) thread 2
  ...
  (gdb) bt
  #0 syscall ()
  #1 0xaabd41cc in qemu_futex_wait
  #2 qemu_event_wait (ev=ev@entry=0xaac86ce8 )
  #3 0xaabed05c in call_rcu_thread
  #4 0xaabd34c8 in qemu_thread_start
  #5 0xbf25c880 in start_thread
  #6 0xbf1b6b9c in thread_start ()

  (gdb) thread 3
  ...
  (gdb) bt
  #0 0xbf11aa20 in __GI___sigtimedwait
  #1 0xbf2671b4 in __sigwait
  #2 0xaabd1ddc in sigwait_compat
  #3 0xaabd34c8 in qemu_thread_start
  #4 0xbf25c880 in start_thread
  #5 0xbf1b6b9c in thread_start

  

  (gdb) run
  Starting program: /usr/bin/qemu-img convert -f qcow2 -O qcow2
  ./disk01.ext4.qcow2 ./output.qcow2

  [New Thread 0xbec5ad90 (LWP 72839)]
  [New Thread 0xbe459d90 (LWP 72840)]
  [New Thread 0xbdb57d90 (LWP 72841)]
  [New Thread 0xacac9d90 (LWP 72859)]
  [New Thread 0xa7ffed90 (LWP 72860)]
  [New Thread 0xa77fdd90 (LWP 72861)]
  [New Thread 0xa6ffcd90 (LWP 72862)]
  [New Thread 0xa67fbd90 (LWP 72863)]
  [New Thread 0xa5ffad90 (LWP 72864)]

  [Thread 0xa5ffad90 (LWP 72864) exited]
  [Thread 0xa6ffcd90 (LWP 72862) exited]
  [Thread 0xa77fdd90 (LWP 72861) exited]
  [Thread 0xbdb57d90 (LWP 72841) exited]
  [Thread 0xa67fbd90 (LWP 72863) exited]
  [Thread 0xacac9d90 (LWP 72859) exited]
  [Thread 0xa7ffed90 (LWP 72860) exited]

  
  """

  All the tasks left are blocked in a system call, so no task left to call
  qemu_futex_wake() to unblock thread #2 (in futex()), which would unblock
  thread #1 (doing poll() in a pipe with thread #2).

  Those 7 threads exit before disk conversion is complete (sometimes in
  the beginning, sometimes at the end).

  

  On the HiSilicon D06 system - a 96 core NUMA arm64 box - qemu-img
  frequently hangs (~50% of the time) with this command:

  qemu-img convert -f qcow2 -O qcow2 /tmp/cloudimg /tmp/cloudimg2

  Where "cloudimg" is a standard qcow2 Ubuntu cloud image. This
  qcow2->qcow2 conversion happens to be something uvtool does every time
  it fetches images.

  Once hung, attaching gdb gives the following backtrace:

  (gdb) bt
  #0  0xae4f8154 in __GI_ppoll (fds=0xe8a67dc0, nfds=187650274213760,
      timeout=, timeout@entry=0x0, sigmask=0xc123b950)
      at ../sysdeps/unix/sysv/linux/ppoll.c:39
  #1  0xbbefaf00 in ppoll (__ss=0x0, __timeout=0x0, __nfds=,
      __fds=) at /usr/include/aarch64-linux-gnu/bits/poll2.h:77
  

[Bug 1868116] Re: QEMU monitor no longer works

2020-05-25 Thread Christian Ehrhardt
** Changed in: qemu (Ubuntu)
   Status: Triaged => Invalid

-- 
https://bugs.launchpad.net/bugs/1868116

Title:
  QEMU monitor no longer works

Status in QEMU:
  New
Status in qemu package in Ubuntu:
  Invalid
Status in vte2.91 package in Ubuntu:
  Fix Released
Status in qemu package in Debian:
  Fix Released

Bug description:
  Repro:
  VTE
  $ meson _build && ninja -C _build && ninja -C _build install

  qemu:
  $ ../configure --python=/usr/bin/python3 --disable-werror --disable-user \
      --disable-linux-user --disable-docs --disable-guest-agent --disable-sdl \
      --enable-gtk --disable-vnc --disable-xen --disable-brlapi --disable-fdt \
      --disable-hax --disable-vde --disable-netmap --disable-rbd --disable-libiscsi \
      --disable-libnfs --disable-smartcard --disable-libusb --disable-usb-redir \
      --disable-seccomp --disable-glusterfs --disable-tpm --disable-numa \
      --disable-opengl --disable-virglrenderer --disable-xfsctl --disable-vxhs \
      --disable-slirp --disable-blobs --target-list=x86_64-softmmu --disable-rdma \
      --disable-pvrdma --disable-attr --disable-vhost-net --disable-vhost-vsock \
      --disable-vhost-scsi --disable-vhost-crypto --disable-vhost-user \
      --disable-spice --disable-qom-cast-debug --disable-vxhs --disable-bochs \
      --disable-cloop --disable-dmg --disable-qcow1 --disable-vdi --disable-vvfat \
      --disable-qed --disable-parallels --disable-sheepdog --disable-avx2 \
      --disable-nettle --disable-gnutls --disable-capstone --disable-tools \
      --disable-libpmem --disable-iconv --disable-cap-ng
  $ make

  Test:
  $ LD_LIBRARY_PATH=/usr/local/lib/x86_64-linux-gnu/:$LD_LIBRARY_PATH \
      ./build/x86_64-softmmu/qemu-system-x86_64 -enable-kvm \
      --drive media=cdrom,file=http://archive.ubuntu.com/ubuntu/dists/bionic/main/installer-amd64/current/images/netboot/mini.iso
  - switch to monitor with CTRL+ALT+2
  - try to enter something

  Affects head of both upstream git repos.

  
  --- original bug ---

  It was observed that the QEMU console (normally accessible using
  Ctrl+Alt+2) accepts no input, so it can't be used. This is
  problematic because there are cases where it's required to send
  commands to the guest, or key combinations that the host would grab
  (such as Ctrl-Alt-F1 or Alt-F4).

  ProblemType: Bug
  DistroRelease: Ubuntu 20.04
  Package: qemu 1:4.2-3ubuntu2
  Uname: Linux 5.6.0-rc6+ x86_64
  ApportVersion: 2.20.11-0ubuntu20
  Architecture: amd64
  CurrentDesktop: XFCE
  Date: Thu Mar 19 12:16:31 2020
  Dependencies:

  InstallationDate: Installed on 2017-06-13 (1009 days ago)
  InstallationMedia: Xubuntu 17.04 "Zesty Zapus" - Release amd64 (20170412)
  KvmCmdLine:
   COMMAND          STAT  EUID  RUID    PID   PPID %CPU COMMAND
   qemu-system-x86  Sl+   1000  1000  34275  25235 29.2 qemu-system-x86_64 -m 4G -cpu Skylake-Client -device virtio-vga,virgl=true,xres=1280,yres=720 -accel kvm -device nec-usb-xhci -serial vc -serial stdio -hda /home/usuario/Sistemas/androidx86.img -display gtk,gl=on -device usb-audio
   kvm-nx-lpage-re  S        0     0  34284      2  0.0 [kvm-nx-lpage-re]
   kvm-pit/34275    S        0     0  34286      2  0.0 [kvm-pit/34275]
  MachineType: LENOVO 80UG
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.6.0-rc6+ 
root=UUID=6b4ae5c0-c78c-49a6-a1ba-029192618a7a ro quiet ro kvm.ignore_msrs=1 
kvm.report_ignored_msrs=0 kvm.halt_poll_ns=0 kvm.halt_poll_ns_grow=0 
i915.enable_gvt=1 i915.fastboot=1 cgroup_enable=memory swapaccount=1 
zswap.enabled=1 zswap.zpool=z3fold 
resume=UUID=a82e38a0-8d20-49dd-9cbd-de7216b589fc log_buf_len=16M 
usbhid.quirks=0x0079:0x0006:0x10 config_scsi_mq_default=y 
scsi_mod.use_blk_mq=1 mtrr_gran_size=64M mtrr_chunk_size=64M nbd.nbds_max=2 
nbd.max_part=63
  SourcePackage: qemu
  UpgradeStatus: Upgraded to focal on 2019-12-22 (87 days ago)
  dmi.bios.date: 08/09/2018
  dmi.bios.vendor: LENOVO
  dmi.bios.version: 0XCN45WW
  dmi.board.asset.tag: NO Asset Tag
  dmi.board.name: Toronto 4A2
  dmi.board.vendor: LENOVO
  dmi.board.version: SDK0J40679 WIN
  dmi.chassis.asset.tag: NO Asset Tag
  dmi.chassis.type: 10
  dmi.chassis.vendor: LENOVO
  dmi.chassis.version: Lenovo ideapad 310-14ISK
  dmi.modalias: 
dmi:bvnLENOVO:bvr0XCN45WW:bd08/09/2018:svnLENOVO:pn80UG:pvrLenovoideapad310-14ISK:rvnLENOVO:rnToronto4A2:rvrSDK0J40679WIN:cvnLENOVO:ct10:cvrLenovoideapad310-14ISK:
  dmi.product.family: IDEAPAD
  dmi.product.name: 80UG
  dmi.product.sku: LENOVO_MT_80UG_BU_idea_FM_Lenovo ideapad 310-14ISK
  dmi.product.version: Lenovo ideapad 310-14ISK
  dmi.sys.vendor: LENOVO
  mtime.conffile..etc.apport.crashdb.conf: 2019-08-29T08:39:36.787240




[Bug 1877052] Re: KVM Win 10 guest pauses after kernel upgrade

2020-05-06 Thread Christian Ehrhardt
Note: this might (or might not) be related to bug 1866870.
Let's analyze it independently and mark it as a duplicate if it turns out to be one.

-- 
https://bugs.launchpad.net/bugs/1877052

Title:
  KVM Win 10 guest pauses after kernel upgrade

Status in QEMU:
  New
Status in qemu package in Ubuntu:
  New




[Bug 1866870] Re: KVM Guest pauses after upgrade to Ubuntu 20.04

2020-05-05 Thread Christian Ehrhardt
Hi Andreas,
so the only upgrade you did to trigger this was bumping the kernel from
5.4.0-28.33 to 5.4.0-29.34 - nothing else? I have not (yet?) heard other
similar reports, but it might just be too early.
At least on my system things still work with the new kernel as before.

I'd recommend filing a new bug, referring to this one as possibly related,
and adding the following right away:
- kernel version (you have this here, I know)
- qemu/libvirt/seabios/ovmf versions (if you don't mind, just attach `dpkg -l`)
- guest XML (if using libvirt), otherwise the qemu command line
- a cross-check reporting what happens with other guest configs (e.g.
  non-Windows guests, another BIOS as the former issue was tied to seabios,
  different guest CPU types)
- the full /var/log/apt/history.log
- the date when you last started the VM successfully (not just
  still-had-it-running, but started it) and the date when it started to
  fail (probably yesterday then, I guess)
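
Most of that list can be gathered in one go. The following is just a sketch under assumptions: the helper names and the dictionary keys are invented for illustration, `uname -a` stands in for "kernel version", and `virsh dumpxml` only applies when the guest is managed by libvirt.

```python
import subprocess


def debug_info_commands(domain=None):
    """Map a label to the command that produces each requested item."""
    commands = {
        "kernel": ["uname", "-a"],
        "packages": ["dpkg", "-l"],
        "apt-history": ["cat", "/var/log/apt/history.log"],
    }
    if domain is not None:
        # Guest XML, only meaningful for libvirt-managed guests.
        commands["guest-xml"] = ["virsh", "dumpxml", domain]
    return commands


def collect(commands):
    """Run each command and return its stdout keyed by label."""
    results = {}
    for name, argv in commands.items():
        try:
            proc = subprocess.run(argv, capture_output=True, text=True, check=False)
            results[name] = proc.stdout
        except FileNotFoundError:
            # The tool is not installed on this host; record an empty result.
            results[name] = ""
    return results
```

For example, `collect(debug_info_commands("win10"))` would gather everything for a guest named `win10` (a hypothetical domain name); the start and failure dates still have to be added by hand.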

-- 
https://bugs.launchpad.net/bugs/1866870

Title:
  KVM Guest pauses after upgrade to Ubuntu 20.04

Status in QEMU:
  Invalid
Status in qemu package in Ubuntu:
  Invalid
Status in seabios package in Ubuntu:
  Fix Released

Bug description:
  Symptom:
  Error unpausing domain: internal error: unable to execute QEMU command 'cont': Resetting the Virtual Machine is required

  Traceback (most recent call last):
    File "/usr/share/virt-manager/virtManager/asyncjob.py", line 75, in cb_wrapper
      callback(asyncjob, *args, **kwargs)
    File "/usr/share/virt-manager/virtManager/asyncjob.py", line 111, in tmpcb
      callback(*args, **kwargs)
    File "/usr/share/virt-manager/virtManager/object/libvirtobject.py", line 66, in newfn
      ret = fn(self, *args, **kwargs)
    File "/usr/share/virt-manager/virtManager/object/domain.py", line 1311, in resume
      self._backend.resume()
    File "/usr/lib/python3/dist-packages/libvirt.py", line 2174, in resume
      if ret == -1: raise libvirtError ('virDomainResume() failed', dom=self)
  libvirt.libvirtError: internal error: unable to execute QEMU command 'cont': Resetting the Virtual Machine is required

  
  ---

  As outlined here:
  https://bugs.launchpad.net/qemu/+bug/1813165/comments/15

  After the upgrade, all KVM guests are in a paused state by default. Even
  after forcing them off via virsh and restarting them, the guests are
  paused.

  These guests are not nested.

  A lot of diagnostic information is outlined in the previous bug
  report linked above. The solution mentioned in the previous report has
  allegedly been integrated into the downstream updates.




[Bug 1749393] Re: sbrk() not working under qemu-user with a PIE-compiled binary?

2020-05-01 Thread Christian Ehrhardt
Will be merged in 20.10 with qemu >= 5.0, where the fix landed upstream.

** Tags added: qemu-20.10

** Changed in: qemu (Ubuntu)
   Status: Confirmed => Triaged

-- 
https://bugs.launchpad.net/bugs/1749393

Title:
  sbrk() not working under qemu-user with a PIE-compiled binary?

Status in QEMU:
  Fix Released
Status in qemu package in Ubuntu:
  Triaged

Bug description:
  In Debian unstable, we recently switched bash to be a PIE-compiled
  binary (for hardening). Unfortunately this resulted in bash being
  broken when run under qemu-user (for all target architectures, host
  being amd64 for me).

  $ sudo chroot /srv/chroots/sid-i386/ qemu-i386-static /bin/bash
  bash: xmalloc: .././shell.c:1709: cannot allocate 10 bytes (0 bytes allocated)

  bash has its own malloc implementation based on sbrk():
  https://git.savannah.gnu.org/cgit/bash.git/tree/lib/malloc/malloc.c

  When we disable this internal implementation and rely on glibc's
  malloc, then everything is fine. But it might be that glibc has a
  fallback when sbrk() is not working properly and it might hide the
  underlying problem in qemu-user.
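
To probe sbrk() on its own, independently of bash's allocator, a small check like the one below could be run inside the affected environment. This is only a sketch and not taken from the report: the function name is hypothetical, it calls the C library's sbrk() through ctypes, and whether the interpreter itself is a PIE binary (which the report suggests is what matters) depends on the system.

```python
import ctypes


def sbrk_can_grow(increment=4096):
    """Return True if sbrk(increment) succeeds in this process.

    sbrk() returns (void *)-1 on failure. The break is deliberately not
    moved back afterwards, so a small amount of memory stays allocated.
    """
    libc = ctypes.CDLL(None, use_errno=True)
    libc.sbrk.restype = ctypes.c_void_p
    libc.sbrk.argtypes = [ctypes.c_ssize_t]
    failed = ctypes.c_void_p(-1).value  # integer value of (void *)-1
    return libc.sbrk(increment) != failed


if __name__ == "__main__":
    # The failing request in the report was only 10 bytes.
    print("sbrk(10) works:", sbrk_can_grow(10))
```

If this reports a failure under qemu-user but succeeds natively, that would point in the same direction as the bash error above; a success does not rule the bug out.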

  This issue has also been reported to the bash upstream author and he
  suggested that the issue might be in qemu-user so I'm opening a ticket
  here. Here's the discussion with the bash upstream author:
  https://lists.gnu.org/archive/html/bug-bash/2018-02/threads.html#00080

  You can find the problematic bash binary in that .deb file:
  http://snapshot.debian.org/archive/debian/20180206T154716Z/pool/main/b/bash/bash_4.4.18-1_i386.deb

  The version of qemu I have been using is 2.11 (Debian package
  qemu-user-static version 1:2.11+dfsg-1) but I have had reports that the
  problem is reproducible with older versions (back to 2.8 at least).

  Here are the related Debian bug reports:
  https://bugs.debian.org/889869
  https://bugs.debian.org/865599

  It's worth noting that bash used to have this problem (when compiled as
  a PIE binary) even when run directly, but then something got fixed in
  the kernel and now the problem only appears when run under qemu-user:
  https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1518483




[Bug 1868116] Re: QEMU monitor no longer works

2020-03-29 Thread Christian Ehrhardt
Thanks Ken!
I verified it and the new version indeed fixes the issue in focal.

-- 
https://bugs.launchpad.net/bugs/1868116

Title:
  QEMU monitor no longer works

Status in QEMU:
  New
Status in qemu package in Ubuntu:
  Triaged
Status in vte2.91 package in Ubuntu:
  Fix Released
Status in qemu package in Debian:
  Unknown




[Bug 1868116] Re: QEMU monitor no longer works

2020-03-27 Thread Christian Ehrhardt
As VTE upstream wants to get rid of this implementation style in the long
term, Christian Persch provided a qemu patch [1]. That is too much UI
for me to really have an in-depth opinion, but I can say that it builds
and input works fine with it.

I suggested on [2] to send it to qemu-devel, but in case that doesn't
happen it might be great if Gerd Hoffmann and Cole Robinson could take a
look at it.

[1]: https://gitlab.gnome.org/GNOME/vte/uploads/1e8ccb6aaf2e8fcef91dd67d23f47fae/qemu.patch
[2]: https://gitlab.gnome.org/GNOME/vte/issues/222

-- 
https://bugs.launchpad.net/bugs/1868116

Title:
  QEMU monitor no longer works

Status in QEMU:
  New
Status in qemu package in Ubuntu:
  Triaged
Status in vte2.91 package in Ubuntu:
  Triaged


[Bug 1868116] Re: QEMU monitor no longer works

2020-03-26 Thread Christian Ehrhardt
Subscribed and assigned to Ubuntu Desktop to get to 0.60.1 before Focal
releases.
I'd be happy about an update here confirming that this is on your todo list.

-- 
https://bugs.launchpad.net/bugs/1868116

Title:
  QEMU monitor no longer works

Status in QEMU:
  New
Status in qemu package in Ubuntu:
  Triaged
Status in vte2.91 package in Ubuntu:
  Triaged




[Bug 1868116] Re: QEMU monitor no longer works

2020-03-26 Thread Christian Ehrhardt
From IRC:
[16:10]  cpaelzer, @vte, we should get 0.60.1 for focal, 0.59.91 is an 
rc1 for 0.60, we are lagging behind merging the stable version from Debian but 
it's on our backlog (kenvandine was looking at that one), the .1 is part of GNOME 
3.36.1 which we plan to get before release (I would understand if you would 
like to backport a patch to help testing rather than waiting though)

From VTE Bug:
The standard Ubuntu freeze doesn't apply to GNOME packages. Usually Ubuntu aims 
to ship latest GNOME x.1. VTE is part of GNOME, VTE 0.60.0 is part of GNOME 
3.36.0, VTE 0.60.1 belongs to GNOME 3.36.1 etc. Accordingly, 0.60.0 -> 0.60.1 
contains important bugfixes only, no new features. In this particular case, 
0.60.1 will bring a trivial shell script fix (quite important for non-VTE 
users), and hopefully this one. It would be outright ridiculous for an LTS 
distro to ship an unstable VTE. So, the only reasonable thing for Ubuntu 20.04 
is to ship VTE 0.60.1. Anyway, this is not the right place to discuss it.

Fortunately, there is now a commit with a fix:
https://gitlab.gnome.org/GNOME/vte/-/commit/277ee003066b3993cf6d55a05606009caac69015

I agree that we need this for 20.04, so I will raise the priority of this
task and assign it to the Desktop team.
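
To see whether a given system already carries a VTE at or beyond 0.60.1, the
release expected to contain the fix, the installed version can be compared
against that number. The following is only a minimal sketch: it assumes GNU
sort with the -V option is available and that the VTE development files
expose the pkg-config module vte-2.91. The helper name version_at_least is
made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if version $1 is at least version $2.
# Relies on GNU sort's -V (version sort) option.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Query the VTE version known to pkg-config (module name vte-2.91).
if installed=$(pkg-config --modversion vte-2.91 2>/dev/null); then
    if version_at_least "$installed" 0.60.1; then
        echo "VTE $installed: at or beyond 0.60.1"
    else
        echo "VTE $installed: older than 0.60.1"
    fi
else
    echo "vte-2.91 not found via pkg-config"
fi
```

A version number alone does not prove the fix is present, since a distribution
may backport the patch to an older version, so this is a first check only.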

** Changed in: vte2.91 (Ubuntu)
 Assignee: (unassigned) => Ubuntu Desktop (ubuntu-desktop)

** Changed in: vte2.91 (Ubuntu)
   Status: New => Triaged

** Changed in: vte2.91 (Ubuntu)
   Importance: Undecided => Critical

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1868116

Title:
  QEMU monitor no longer works

Status in QEMU:
  New
Status in qemu package in Ubuntu:
  Triaged
Status in vte2.91 package in Ubuntu:
  Triaged

Bug description:
  Repro:
  VTE
  $ meson _build && ninja -C _build && ninja -C _build install

  qemu:
  $ ../configure --python=/usr/bin/python3 --disable-werror --disable-user 
--disable-linux-user --disable-docs --disable-guest-agent --disable-sdl 
--enable-gtk --disable-vnc --disable-xen --disable-brlapi --disable-fdt 
--disable-hax --disable-vde --disable-netmap --disable-rbd --disable-libiscsi 
--disable-libnfs --disable-smartcard --disable-libusb --disable-usb-redir 
--disable-seccomp --disable-glusterfs --disable-tpm --disable-numa 
--disable-opengl --disable-virglrenderer --disable-xfsctl --disable-vxhs 
--disable-slirp --disable-blobs --target-list=x86_64-softmmu --disable-rdma 
--disable-pvrdma --disable-attr --disable-vhost-net --disable-vhost-vsock 
--disable-vhost-scsi --disable-vhost-crypto --disable-vhost-user 
--disable-spice --disable-qom-cast-debug --disable-vxhs --disable-bochs 
--disable-cloop --disable-dmg --disable-qcow1 --disable-vdi --disable-vvfat 
--disable-qed --disable-parallels --disable-sheepdog --disable-avx2 
--disable-nettle --disable-gnutls --disable-capstone --disable-tools 
--disable-libpmem --disable-iconv --disable-cap-ng
  $ make

  Test:
  $ LD_LIBRARY_PATH=/usr/local/lib/x86_64-linux-gnu/:$LD_LIBRARY_PATH 
./build/x86_64-softmmu/qemu-system-x86_64 -enable-kvm --drive 
media=cdrom,file=http://archive.ubuntu.com/ubuntu/dists/bionic/main/installer-amd64/current/images/netboot/mini.iso
  - switch to monitor with CTRL+ALT+2
  - try to enter something

  Affects head of both upstream git repos.

  
  --- original bug ---

  It was observed that the QEMU console (normally accessible using
  Ctrl+Alt+2) accepts no input, so it can't be used. This is
  problematic because there are cases where it's required to send
  commands to the guest, or key combinations that the host would grab
  (such as Ctrl-Alt-F1 or Alt-F4).

  ProblemType: Bug
  DistroRelease: Ubuntu 20.04
  Package: qemu 1:4.2-3ubuntu2
  Uname: Linux 5.6.0-rc6+ x86_64
  ApportVersion: 2.20.11-0ubuntu20
  Architecture: amd64
  CurrentDesktop: XFCE
  Date: Thu Mar 19 12:16:31 2020
  Dependencies:

  InstallationDate: Installed on 2017-06-13 (1009 days ago)
  InstallationMedia: Xubuntu 17.04 "Zesty Zapus" - Release amd64 (20170412)
  KvmCmdLine:
   COMMAND         STAT  EUID  RUID   PID  PPID %CPU COMMAND
   qemu-system-x86 Sl+   1000  1000 34275 25235 29.2 qemu-system-x86_64 -m 
4G -cpu Skylake-Client -device virtio-vga,virgl=true,xres=1280,yres=720 -accel 
kvm -device nec-usb-xhci -serial vc -serial stdio -hda 
/home/usuario/Sistemas/androidx86.img -display gtk,gl=on -device usb-audio
   kvm-nx-lpage-re S        0     0 34284     2  0.0 [kvm-nx-lpage-re]
   kvm-pit/34275   S        0     0 34286     2  0.0 [kvm-pit/34275]
  MachineType: LENOVO 80UG
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.6.0-rc6+ 
root=UUID=6b4ae5c0-c78c-49a6-a1ba-029192618a7a ro quiet ro kvm.ignore_msrs=1 
kvm.report_ignored_msrs=0 kvm.halt_poll_ns=0 kvm.halt_poll_ns_grow=0 
i915.enable_gvt=1 i915.fastboot=1 cgroup_enable=memory swapaccount=1 
zswap.enabled=1 zswap.zpool=z3fold 
resume=UUID=a82e38a0-8d20-49dd-9cbd-de7216b589fc log_buf_len=16M 

[Bug 1868116] Re: QEMU monitor no longer works

2020-03-26 Thread Christian Ehrhardt
I'm not sure how many of you are tracking the VTE bug [1], so here is a
summary of the latest insight from there.

- Short term, it seems that the new behavior will be reverted in VTE 0.60.1.
- Long term, the VTE devs might want to deprecate no-pty use cases, or at least 
better understand why apps use it that way.

For more details please read [1].

[1]: https://gitlab.gnome.org/GNOME/vte/issues/222

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1868116

Title:
  QEMU monitor no longer works

Status in QEMU:
  New
Status in qemu package in Ubuntu:
  Triaged
Status in vte2.91 package in Ubuntu:
  New


[Bug 1868116] Re: QEMU monitor no longer works

2020-03-25 Thread Christian Ehrhardt
I'm not really a UI guy, so I checked what I might lose by disabling 
VTE and found the very old commit [1]. The list of features it mentions makes 
disabling VTE not a real option:
  "It's also screen reader accessible, supports copy/paste, proper scrolling and
   most of the other features you would expect from a terminal widget."

After seeing that Cole authored the "drop PTY" patch [3], I have
subscribed him here as well.

I have tried to answer and ask a few questions on the VTE issue [2] to
help it make progress, but it would really benefit from the attention
of Gerd and Cole (or anyone else who feels the UI-power).

[1]: 
https://git.qemu.org/?p=qemu.git;a=commit;h=d861def367b516055dc4c46dc1305143ee653c84
[2]: https://gitlab.gnome.org/GNOME/vte/issues/222
[3]: 
https://git.qemu.org/?p=qemu.git;a=commit;h=d4370741402a97b8b6d0c38fef18ab38bf25ab22

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1868116

Title:
  QEMU monitor no longer works

Status in QEMU:
  New
Status in qemu package in Ubuntu:
  Triaged
Status in vte2.91 package in Ubuntu:
  New


[Bug 1868116] Re: QEMU monitor no longer works

2020-03-25 Thread Christian Ehrhardt
As a bit of reverse confirmation of the findings so far: if I build qemu
without VTE, so that configure reports
GTK support   yes (3.24.14)
VTE support   no

then it works, due to the fallback implemented by [1][2].
But that obviously loses all the VTE features, so I'd prefer a more fine-grained 
fix than disabling VTE :-)

[1]: 
https://git.qemu.org/?p=qemu.git;a=commit;h=f8c223f69ac58488ea830597281b7ddd33037c4c
[2]: 
https://git.qemu.org/?p=qemu.git;a=commit;h=bbbf9bfb9c27e389340cf50a11c22fa46c572150
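
The VTE-less build described above can be reproduced with QEMU's
--disable-vte configure switch. This is a minimal sketch, assuming it is run
from a build directory below the QEMU source tree, as in the repro steps of
the bug description. The variable name configure_flags is only illustrative,
and the flag list is trimmed to the options that matter here.

```shell
#!/bin/sh
# Keep the GTK UI but build without VTE, so QEMU uses its fallback console.
configure_flags="--enable-gtk --disable-vte --target-list=x86_64-softmmu"

# Only run when invoked from a build directory below the QEMU source tree.
if [ -x ../configure ]; then
    # Word splitting of $configure_flags is intentional here.
    ../configure $configure_flags && make
else
    echo "run this from a build directory inside the QEMU source tree"
fi
```

The configure summary should then report "VTE support" as "no", matching the
output quoted above.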

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1868116

Title:
  QEMU monitor no longer works

Status in QEMU:
  New
Status in qemu package in Ubuntu:
  Triaged
Status in vte2.91 package in Ubuntu:
  New


[Bug 1868116] Re: QEMU monitor no longer works

2020-03-25 Thread Christian Ehrhardt
Thank you Egmont for the VTE bug in the GNOME tracker!

Graphics isn't an area where I'm usually at home. The related qemu code is
mostly in ui/gtk.c and, per the MAINTAINERS file, Gerd Hoffmann is the expert. I
subscribed him to the bug here to raise visibility for him.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1868116

Title:
  QEMU monitor no longer works

Status in QEMU:
  New
Status in qemu package in Ubuntu:
  Triaged
Status in vte2.91 package in Ubuntu:
  New




[Bug 1867519] Re: qemu 4.2 segfaults on VF detach

2020-03-25 Thread Christian Ehrhardt
** Merge proposal unlinked:
   
https://code.launchpad.net/~paelzer/ubuntu/+source/qemu/+git/qemu/+merge/381033

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1867519

Title:
  qemu 4.2 segfaults on VF detach

Status in QEMU:
  Fix Committed
Status in qemu package in Ubuntu:
  Fix Released

Bug description:
  After updating Ubuntu 20.04 to the Beta version, we get the following
  error and the virtual machine gets stuck when detaching PCI devices using
  the virsh command:

  Error:
  error: Failed to detach device from /tmp/vf_interface_attached.xml
  error: internal error: End of file from qemu monitor

  steps to reproduce:
   1. create a VM over Ubuntu 20.04 (5.4.0-14-generic)
   2. attach PCI device to this VM (Mellanox VF for example)
   3. try to detach the PCI device using the virsh command:
 a. create a PCI interface XML file:
  






  
 b.  #virsh detach-device  


  - Ubuntu release:
Description: Ubuntu Focal Fossa (development branch)
Release: 20.04

  - Package ver:
libvirt0:
Installed: 6.0.0-0ubuntu3
Candidate: 6.0.0-0ubuntu5
Version table:
   6.0.0-0ubuntu5 500
  500 http://il.archive.ubuntu.com/ubuntu focal/main amd64 Packages
   *** 6.0.0-0ubuntu3 100
  100 /var/lib/dpkg/status

  - What you expected to happen: 
PCI device detached without any errors.

  - What happened instead:
getting the errors above and the VM gets stuck

  additional info:
  after downgrading the libvirt0 package and all the dependent packages to 5.4, 
the previous version, the issue seems to have disappeared

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1867519/+subscriptions



[Bug 1868116] Re: QEMU monitor no longer works

2020-03-24 Thread Christian Ehrhardt
The last commit mentioning VTE was a while ago:
6415994 Thu Oct 11 17:30:39 2018 +0200 gtk: Don't vte_terminal_set_encoding() 
on new VTE versions

I built head of qemu against head of vte to check whether I even need to look for 
existing fixes.
=> That still fails, so it is probably time for a bug report to get other 
people to think with us.


** Description changed:

+ Repro:
+ VTE
+ $ meson _build && ninja -C _build && ninja -C _build install
+ 
+ qemu:
+ $ ../configure --python=/usr/bin/python3 --disable-werror --disable-user 
--disable-linux-user --disable-docs --disable-guest-agent --disable-sdl 
--enable-gtk --disable-vnc --disable-xen --disable-brlapi --disable-fdt 
--disable-hax --disable-vde --disable-netmap --disable-rbd --disable-libiscsi 
--disable-libnfs --disable-smartcard --disable-libusb --disable-usb-redir 
--disable-seccomp --disable-glusterfs --disable-tpm --disable-numa 
--disable-opengl --disable-virglrenderer --disable-xfsctl --disable-vxhs 
--disable-slirp --disable-blobs --target-list=x86_64-softmmu --disable-rdma 
--disable-pvrdma --disable-attr --disable-vhost-net --disable-vhost-vsock 
--disable-vhost-scsi --disable-vhost-crypto --disable-vhost-user 
--disable-spice --disable-qom-cast-debug --disable-vxhs --disable-bochs 
--disable-cloop --disable-dmg --disable-qcow1 --disable-vdi --disable-vvfat 
--disable-qed --disable-parallels --disable-sheepdog --disable-avx2 
--disable-nettle --disable-gnutls --disable-capstone --disable-tools 
--disable-libpmem --disable-iconv --disable-cap-ng
+ $ make
+ 
+ Test:
+ $ LD_LIBRARY_PATH=/usr/local/lib/x86_64-linux-gnu/:$LD_LIBRARY_PATH 
./build/x86_64-softmmu/qemu-system-x86_64 -enable-kvm --drive 
media=cdrom,file=http://archive.ubuntu.com/ubuntu/dists/bionic/main/installer-amd64/current/images/netboot/mini.iso
+ - switch to monitor with CTRL+ALT+2
+ - try to enter something
+ 
+ Affects head of both upstream git repos.
+ 
+ 
+ --- original bug ---
+ 
  It was observed that the QEMU console (normally accessible using
  Ctrl+Alt+2) accepts no input, so it can't be used. This is being
  problematic because there are cases where it's required to send commands
  to the guest, or key combinations that the host would grab (as Ctrl-
  Alt-F1 or Alt-F4).
  
  ProblemType: Bug
  DistroRelease: Ubuntu 20.04
  Package: qemu 1:4.2-3ubuntu2
  Uname: Linux 5.6.0-rc6+ x86_64
  ApportVersion: 2.20.11-0ubuntu20
  Architecture: amd64
  CurrentDesktop: XFCE
  Date: Thu Mar 19 12:16:31 2020
  Dependencies:
-  
+ 
  InstallationDate: Installed on 2017-06-13 (1009 days ago)
  InstallationMedia: Xubuntu 17.04 "Zesty Zapus" - Release amd64 (20170412)
  KvmCmdLine:
-  COMMAND STAT  EUID  RUID PIDPPID %CPU COMMAND
-  qemu-system-x86 Sl+   1000  1000   34275   25235 29.2 qemu-system-x86_64 -m 
4G -cpu Skylake-Client -device virtio-vga,virgl=true,xres=1280,yres=720 -accel 
kvm -device nec-usb-xhci -serial vc -serial stdio -hda 
/home/usuario/Sistemas/androidx86.img -display gtk,gl=on -device usb-audio
-  kvm-nx-lpage-re S0 0   34284   2  0.0 [kvm-nx-lpage-re]
-  kvm-pit/34275   S0 0   34286   2  0.0 [kvm-pit/34275]
+  COMMAND STAT  EUID  RUID PIDPPID %CPU COMMAND
+  qemu-system-x86 Sl+   1000  1000   34275   25235 29.2 qemu-system-x86_64 -m 
4G -cpu Skylake-Client -device virtio-vga,virgl=true,xres=1280,yres=720 -accel 
kvm -device nec-usb-xhci -serial vc -serial stdio -hda 
/home/usuario/Sistemas/androidx86.img -display gtk,gl=on -device usb-audio
+  kvm-nx-lpage-re S0 0   34284   2  0.0 [kvm-nx-lpage-re]
+  kvm-pit/34275   S0 0   34286   2  0.0 [kvm-pit/34275]
  MachineType: LENOVO 80UG
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.6.0-rc6+ 
root=UUID=6b4ae5c0-c78c-49a6-a1ba-029192618a7a ro quiet ro kvm.ignore_msrs=1 
kvm.report_ignored_msrs=0 kvm.halt_poll_ns=0 kvm.halt_poll_ns_grow=0 
i915.enable_gvt=1 i915.fastboot=1 cgroup_enable=memory swapaccount=1 
zswap.enabled=1 zswap.zpool=z3fold 
resume=UUID=a82e38a0-8d20-49dd-9cbd-de7216b589fc log_buf_len=16M 
usbhid.quirks=0x0079:0x0006:0x10 config_scsi_mq_default=y 
scsi_mod.use_blk_mq=1 mtrr_gran_size=64M mtrr_chunk_size=64M nbd.nbds_max=2 
nbd.max_part=63
  SourcePackage: qemu
  UpgradeStatus: Upgraded to focal on 2019-12-22 (87 days ago)
  dmi.bios.date: 08/09/2018
  dmi.bios.vendor: LENOVO
  dmi.bios.version: 0XCN45WW
  dmi.board.asset.tag: NO Asset Tag
  dmi.board.name: Toronto 4A2
  dmi.board.vendor: LENOVO
  dmi.board.version: SDK0J40679 WIN
  dmi.chassis.asset.tag: NO Asset Tag
  dmi.chassis.type: 10
  dmi.chassis.vendor: LENOVO
  dmi.chassis.version: Lenovo ideapad 310-14ISK
  dmi.modalias: 
dmi:bvnLENOVO:bvr0XCN45WW:bd08/09/2018:svnLENOVO:pn80UG:pvrLenovoideapad310-14ISK:rvnLENOVO:rnToronto4A2:rvrSDK0J40679WIN:cvnLENOVO:ct10:cvrLenovoideapad310-14ISK:
  dmi.product.family: IDEAPAD
  dmi.product.name: 80UG
  dmi.product.sku: LENOVO_MT_80UG_BU_idea_FM_Lenovo ideapad 310-14ISK
  dmi.product.version: Lenovo ideapad 
