Re: P2P problem on QEMU

2021-05-27 Thread James Harvey
On Wed, May 26, 2021 at 1:44 PM Gabriele Borello
 wrote:
> The following kernel version was used: Linux 5.9.0-rc8 x86_64,
> ...
>  The kernel was compiled by configuring peer-to-peer as described in the 
> p2pmem-test guide. Trying to run the command suggested in the p2pmem-test 
> guide ( ./p2pmem-test /dev/nvme0n1 /dev/nvme1n1 /dev/p2pmem0 -c 1 -s 4k),  we 
> get a kernel panic.

I've never looked at PCI Peer-to-Peer DMA.  Briefly, I see the
p2pmem-test developer sbates130272 has a GitHub repo, linux-p2pmem,
which has been maintained through 5.13-rc1.  It could be worth trying
the more recent kernel.

> We don't know where we went wrong. If you could give us feedback we would be 
> very grateful.

The stacktrace from the panic would be needed for others to give much
feedback.  If I'm understanding correctly that you aren't trying to
really do P2P between the host and guest OS's, but between a virtual
NVMe in the guest to a p2pmem device in the guest, my next step would
be to get an NVMe and install exactly what you installed in QEMU to it
(identical versions and configuration) and see if you could replicate
the stacktrace on bare metal.  If so, it would be a kernel issue
rather than QEMU.  If you couldn't, that could suggest the QEMU NVMe
device support may be incomplete or bugged in a way preventing this.
If you're running an older version of QEMU, it would also be worth
trying 6.0.0 to see if the issue has been resolved.



[Bug 1905562] [NEW] Guest seems suspended after host freed memory for it using oom-killer

2020-11-25 Thread James Harvey
Public bug reported:

Host: qemu 5.1.0, linux 5.5.13
Guest: Windows 7 64-bit

This guest ran a memory-intensive process, and triggered oom-killer on
the host.  Luckily, it killed chromium.  My understanding is that qemu
should therefore have continued running unharmed.  But, the spice
connection shows the guest system clock is stuck at the exact time
oom-killer was triggered.  The guest is completely unresponsive.

I can telnet to the qemu monitor.  "info status" shows "running".  But,
multiple times running "info registers -a" and saving the output to text
files shows the registers are 100% unchanged, so it's not really
running.
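
A minimal sketch of that comparison, assuming each "info registers -a"
capture has been saved as text.  The register values in the sample
strings are made up for illustration; only the line-by-line comparison
is the point.

```python
from itertools import zip_longest


def changed_lines(dump_a: str, dump_b: str) -> list[tuple[int, str, str]]:
    """Return (line number, old, new) for every line that differs."""
    pairs = zip_longest(dump_a.splitlines(), dump_b.splitlines(), fillvalue="")
    return [(n, old, new) for n, (old, new) in enumerate(pairs, start=1)
            if old != new]


# Made-up captures; in practice read these from the saved text files.
capture1 = "RAX=0000000000000001\nRIP=fffff80002a5c3d0"
capture2 = "RAX=0000000000000001\nRIP=fffff80002a5c3d0"

delta = changed_lines(capture1, capture2)
print("registers unchanged" if not delta else delta)
```

An empty result across captures taken seconds apart is what suggests
the vCPUs are not making progress.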

On the host, top shows around 4% CPU usage by qemu.  strace shows about
1,000 times a second, these 6 lines repeat:

0.000698 ioctl(18, KVM_IRQ_LINE_STATUS, 0x7fff1f030c10) = 0 <0.000010>
0.000034 ioctl(18, KVM_IRQ_LINE_STATUS, 0x7fff1f030c60) = 0 <0.000009>
0.000031 ioctl(18, KVM_IRQ_LINE_STATUS, 0x7fff1f030c20) = 0 <0.000007>
0.000028 ioctl(18, KVM_IRQ_LINE_STATUS, 0x7fff1f030c70) = 0 <0.000007>
0.000030 ppoll([{fd=4, events=POLLIN}, {fd=6, events=POLLIN},
    {fd=7, events=POLLIN}, {fd=8, events=POLLIN}, {fd=9, events=POLLIN},
    {fd=11, events=POLLIN}, {fd=16, events=POLLIN}, {fd=32, events=POLLIN},
    {fd=34, events=POLLIN}, {fd=39, events=POLLIN}, {fd=40, events=POLLIN},
    {fd=41, events=POLLIN}, {fd=42, events=POLLIN}, {fd=43, events=POLLIN},
    {fd=44, events=POLLIN}, {fd=45, events=POLLIN}], 16,
    {tv_sec=0, tv_nsec=0}, NULL, 8) = 0 (Timeout) <0.000009>
0.000043 ppoll([{fd=4, events=POLLIN}, {fd=6, events=POLLIN},
    {fd=7, events=POLLIN}, {fd=8, events=POLLIN}, {fd=9, events=POLLIN},
    {fd=11, events=POLLIN}, {fd=16, events=POLLIN}, {fd=32, events=POLLIN},
    {fd=34, events=POLLIN}, {fd=39, events=POLLIN}, {fd=40, events=POLLIN},
    {fd=41, events=POLLIN}, {fd=42, events=POLLIN}, {fd=43, events=POLLIN},
    {fd=44, events=POLLIN}, {fd=45, events=POLLIN}], 16,
    {tv_sec=0, tv_nsec=769662}, NULL, 8) = 0 (Timeout) <0.000788>

In the monitor, "info irq" shows IRQ 0 is increasing about 1,000 times a
second.  IRQ 0 seems to be for the system clock, and 1,000 times a
second seems to be the frequency a windows 7 guest might have the clock
at.
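
As a rough cross-check of the "about 1,000 times a second" figure, the
relative timestamps in the strace excerpt can be summed to get the
length of one six-call cycle.  This is a sketch only: it assumes the
leading numbers are strace relative timestamps in seconds with
microsecond precision (so a value such as 0.000034), and the ppoll
argument lists in the sample are abbreviated.

```python
import re

# One repeating cycle; ppoll arguments abbreviated for readability.
SAMPLE = """\
0.000698 ioctl(18, KVM_IRQ_LINE_STATUS, 0x7fff1f030c10) = 0
0.000034 ioctl(18, KVM_IRQ_LINE_STATUS, 0x7fff1f030c60) = 0
0.000031 ioctl(18, KVM_IRQ_LINE_STATUS, 0x7fff1f030c20) = 0
0.000028 ioctl(18, KVM_IRQ_LINE_STATUS, 0x7fff1f030c70) = 0
0.000030 ppoll([...], 16, {tv_sec=0, tv_nsec=0}, NULL, 8) = 0 (Timeout)
0.000043 ppoll([...], 16, {tv_sec=0, tv_nsec=769662}, NULL, 8) = 0 (Timeout)
"""


def cycle_seconds(text: str) -> float:
    """Sum the leading relative timestamp of every line."""
    total = 0.0
    for line in text.splitlines():
        match = re.match(r"\s*(\d+\.\d+)\s", line)
        if match:
            total += float(match.group(1))
    return total


period = cycle_seconds(SAMPLE)
print(f"{period * 1e6:.0f} us per cycle, roughly {1 / period:.0f} cycles/s")
```

Under those assumptions one cycle takes a little under a millisecond,
which is in line with a timer interrupt being raised on the order of
1,000 times a second.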

Those fd's are for: (9) [eventfd]; [signalfd], type=STREAM, 4 x the
spice socket file, and "TCP localhost:ftnmtp->localhost:36566
(ESTABLISHED)".

Because the guest's registers aren't changing, it seems to me like
monitor thinks the VM is running, but it's actually effectively in a
paused state.  I think all the strace activity shown above must be
generated by the host.  Perhaps it's repeatedly trying to contact the
guest to inject a new clock, and communicate with it on the various
eventfd's, spice socket, etc.  So, I'm thinking the strace doesn't give
any information about the real reason why the VM is acting as if it's
paused.

I've checked "info block", and there's nothing showing that a device is
paused, or that there's any issues with them.  (Can't remember what term
can be there, but a paused/blocked/etc block device I think caused a VM
to act like this for me in the past.)


Is there something I can provide to help fix the bug here?

Is there something I can do, to try to get the VM running again?  (I
sadly have unsaved work in it.)

** Affects: qemu
 Importance: Undecided
 Status: New

** Attachment added: "qemu with arguments"
   
https://bugs.launchpad.net/bugs/1905562/+attachment/5437888/+files/qemu-arguments

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1905562

[Bug 1905562] Re: Guest seems suspended after host freed memory for it using oom-killer

2020-11-25 Thread James Harvey
Am I correct to expect the VM to continue successfully, after oom-killer
successfully freed up memory?  This journalctl does show a call trace
which includes "vmx_vmexit", and I'm not sure what that function is for,
but it looks a little worrisome.

** Attachment added: "section or journalctl from host, showing oom-killer"
   
https://bugs.launchpad.net/qemu/+bug/1905562/+attachment/5437889/+files/journalctl.oom-killer




[Bug 1842787] Re: Writes permanently hang with very heavy I/O on virtio-scsi - worse on virtio-blk

2019-09-21 Thread James Harvey
Apologies, it looks like I ran into two separate bugs, one with XFS, and
one with BTRFS, that had the same symptom, initially making me think
this must be a QEMU issue.

Using blktrace, I was able to see within the VM, that the virtio block
device wasn't getting the writes that were going into uninterruptible
sleep.

So, this should be able to be closed.  For some reason, virtio-blk
seemed to trigger the bugs more rapidly, but at this point, I can't say
there is anything at fault with it or virtio-scsi.


The BTRFS issue was discussed and linked to here:
https://lore.kernel.org/linux-btrfs/CAL3q7H4peDv_bQa5vGJeOM=V--yq1a1=ahat5qcsxjbndos...@mail.gmail.com/
The fix has been released.  I've been able to run with it for several
days without a lockup, so it seems to have fixed the issue for me.

I just emailed the XFS list about the separate problems with it.  No
idea if it's an issue in more recent kernels than 5.1.15-5.1.16, which
is what I was running at the time of the XFS errors.  (Like the original
report said, I was on 5.2.11 at that point.)  See
https://www.spinics.net/lists/linux-xfs/msg31927.html

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1842787

Title:
  Writes permanently hang with very heavy I/O on virtio-scsi - worse on
  virtio-blk

Status in QEMU:
  New

Bug description:
  Up to date Arch Linux on host and guest.  linux 5.2.11.  QEMU 4.1.0.
  Full command line at bottom.

  Host gives QEMU two thin LVM volumes.  The first is the root
  filesystem, and the second is for heavy I/O, on a Samsung 970 Evo 1TB.

  When maxing out the I/O on the second virtual block device using
  virtio-blk, I often get a "lockup" in about an hour or two.  On the
  advice of iggy on IRC, I switched over to virtio-scsi.  It ran
  perfectly for a few days, but then "locked up" in the same way.

  By "lockup", I mean writes to the second virtual block device
  permanently hang.  I can read files from it, but even "touch foo"
  never times out, cannot be "kill -9"'ed, and is stuck in
  uninterruptible sleep.
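
  One way to confirm that a stuck process really is in uninterruptible
  sleep is to read its state field from /proc/<pid>/stat, where "D"
  marks that state.  The following is a minimal sketch; the sample line
  is made up and truncated, and task_state is a hypothetical helper
  name.

```python
def task_state(stat_line: str) -> tuple[int, str, str]:
    """Parse pid, command name and state from one /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the last ')' rather than on whitespace.
    head, _, tail = stat_line.rpartition(")")
    pid_text, _, comm = head.partition("(")
    return int(pid_text), comm, tail.split()[0]


# Made-up, truncated sample of what a hung "touch foo" might look like.
sample = "4242 (touch) D 1 4242 4242 0 -1 4194304"
pid, comm, state = task_state(sample)
print(pid, comm, "uninterruptible sleep" if state == "D" else state)
```

  On a live system the same function could be fed the contents of each
  /proc/<pid>/stat file to list every task currently in state "D".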

  When this happens, writes to the first virtual block device with the
  root filesystem are fine, so the O/S itself remains responsive.

  The second virtual block device uses BTRFS.  But, I have also tried
  XFS and reproduced the issue.

  In the guest, when this starts, it begins logging "task X blocked for
  more than Y seconds".  Below is an example of one of these.  From this
  point on, anything that is writing to this block device, or that
  writes to it later, gets stuck in uninterruptible sleep.

  -

  INFO: task kcompactd0:232 blocked for more than 860 seconds.
    Not tainted 5.2.11-1 #1
  "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  kcompactd0      D    0   232      2 0x80004000
  Call Trace:
   ? __schedule+0x27f/0x6d0
   schedule+0x3d/0xc0
   io_schedule+0x12/0x40
   __lock_page+0x14a/0x250
   ? add_to_page_cache_lru+0xe0/0xe0
   migrate_pages+0x803/0xb70
   ? isolate_migratepages_block+0x9f0/0x9f0
   ? __reset_isolation_suitable+0x110/0x110
   compact_zone+0x6a2/0xd30
   kcompactd_do_work+0x134/0x260
   ? kvm_clock_read+0x14/0x30
   ? kvm_sched_clock_read+0x5/0x10
   kcompactd+0xd3/0x220
   ? wait_woken+0x80/0x80
   kthread+0xfd/0x130
   ? kcompactd_do_work+0x260/0x260
   ? kthread_park+0x80/0x80
   ret_from_fork+0x35/0x40

  -

  In guest, there are no other dmesg/journalctl entries other than
  "task...blocked".

  On host, there are no dmesg/journalctl entries whatsoever.  Everything
  else in host continues to work fine, including other QEMU VM's on the
  same underlying SSD (but obviously different lvm volumes.)

  I understand there might not be enough to go on here, and I also
  understand it's possible this isn't a QEMU bug.  Happy to run given
  commands or patches to help diagnose what's going on here.

  I'm now running a custom compiled QEMU 4.1.0, with debug symbols, so I
  can get a meaningful backtrace from the host point of view.

  I've only recently tried this level of I/O, so can't say if this is a
  new issue.

  When writes are hanging, on host, I can connect to the monitor.
  Running "info block" shows nothing unusual.

  -

  /usr/bin/qemu-system-x86_64
     -name arch,process=qemu:arch
     -no-user-config
     -nodefaults
     -nographic
     -uuid 0528162b-2371-41d5-b8da-233fe61b6458
     -pidfile /tmp/0528162b-2371-41d5-b8da-233fe61b6458.pid
     -machine q35,accel=kvm,vmport=off,dump-guest-core=off
     -cpu SandyBridge-IBRS
     -smp cpus=24,cores=12,threads=1,sockets=2
     -m 24G
     -drive if=pflash,format=raw,readonly,file=/usr/share/ovmf/x64/OVMF_CODE.fd
     -drive if=pflash,format=raw,readonly,file=/var/qemu/0528162b-2371-41d5-b8da-233fe61b6458.fd
     -monitor telnet:localhost:8000,server,nowait,nodelay
     -spice unix,addr=/tmp/0528162b-2371-41d5-b8da-233fe61b6458.sock,disable-ticketing
     -device ioh3420,id=pcie.1,bus=pcie.0,slot=0
 

[Bug 1843151] Re: Regression: QEMU 4.1.0 qxl and KMS resoluiton only 4x10

2019-09-19 Thread James Harvey
Yes, I first replicated the issue by removing "max_outputs=1", then
patched spice server, and the issue no longer happens.

QEMU 4.1.0 still changed something.  If I understand correctly, in some
circumstances it now says there are 0 monitors, even though there's a
graphics card?

Fixing this in spice to effectively ignore being told 0, and go with 1
instead, gets around the bug, but still makes me think there's something
wrong in QEMU 4.1.0.  Granted, perhaps with this spice fix, it might not
cause any negative effects anymore.

But, I don't know if there are any third party applications especially
on Windows that don't use upstream spice-server and might be thrown off
by this in a similar way.  So, I wonder if QEMU 4.1.0 should still have
something fixed.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1843151

Title:
  Regression: QEMU 4.1.0 qxl and KMS resoluiton only 4x10

Status in QEMU:
  New

Bug description:
  Host is Arch Linux.  linux 5.2.13, qemu 4.1.0.  virt-viewer 8.0.

  Guest is Arch Linux Sept 2019 ISO.  linux 5.2.11.

  Have replicated this both on a system using amdgpu and one using
  integrated ASPEED graphics.

  Downgrading from 4.1.0 to 4.0.0 works as usual, see:
  https://www.youtube.com/watch?v=NyMdcYwOCvY

  Going back to 4.1.0 reproduces, see:
  https://www.youtube.com/watch?v=H3nGG2Mk6i0

  4.1.0 displays fine until KMS kicks in.

  Using 4.1.0 with virtio-vga doesn't cause this.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1843151/+subscriptions



[Qemu-devel] [Bug 1844053] Re: task blocked for more than X seconds - events drm_fb_helper_dirty_work

2019-09-15 Thread James Harvey
** Description changed:

  I've had bunches of these errors on 9 different boots, between
  2019-08-21 and now, with Arch host and guest, from linux 5.1.16 to
  5.2.14 on host and guest, with QEMU 4.0.0 and 4.1.0.  spice 0.14.2,
  spice-gtk 0.37, spice-protocol 0.14.0, virt-viewer 8.0.
  
  I've been fighting with some other issues related to a 5.2 btrfs
  regression, a QEMU qxl regression (see bug 1843151) which I ran into
  when trying to temporarily abandon virtio-vga, and I haven't paid enough
  attention to what impact it has on the system when these occur.  In
  journalctl, I can see I often rebooted minutes after they occurred, but
  sometimes much later.  That must mean whenever I saw it happen that I
  rebooted the VM, or potentially it impacted functionality of the system.
  
  Please let me know if and how I can get more information for you if
  needed.
  
  I've replicated this on both a system with integrated ASPEED video, and
  on an AMD Vega 64 running amdgpu.
  
  As an example, I have one boot which reported at 122 seconds, 245, 368,
  491, 614, 737, 860, 983, 1105, 1228, then I rebooted.
  
  I have another that reported 122/245/368/491/614/737, went away for 10
  minutes, then started reporting again 122/245/368/491, and went away.
  Then, I rebooted about 20 hours later.
  
  Host system has no graphical impact when this happens, and logs nothing
  in its journalctl.
  
+ Guest is tty mode only, with kernel argument "video=1280x1024".  No x
+ server.
+ 
  ==
  
  INFO: task kworker/0:1:15 blocked for more than 122 seconds.
-   Not tainted 5.2.14-1 #1
+   Not tainted 5.2.14-1 #1
  "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  kworker/0:1     D    0    15      2 0x80004000
  Workqueue: events drm_fb_helper_dirty_work [drm_kms_helper]
  Call Trace:
-  ? __schedule+0x27f/0x6d0
-  schedule+0x3d/0xc0
-  virtio_gpu_queue_fenced_ctrl_buffer+0xa1/0x130 [virtio_gpu]
-  ? wait_woken+0x80/0x80
-  virtio_gpu_surface_dirty+0x2a5/0x300 [virtio_gpu]
-  drm_fb_helper_dirty_work+0x156/0x160 [drm_kms_helper]
-  process_one_work+0x19a/0x3b0
-  worker_thread+0x50/0x3a0
-  kthread+0xfd/0x130
-  ? process_one_work+0x3b0/0x3b0
-  ? kthread_park+0x80/0x80
-  ret_from_fork+0x35/0x40
+  ? __schedule+0x27f/0x6d0
+  schedule+0x3d/0xc0
+  virtio_gpu_queue_fenced_ctrl_buffer+0xa1/0x130 [virtio_gpu]
+  ? wait_woken+0x80/0x80
+  virtio_gpu_surface_dirty+0x2a5/0x300 [virtio_gpu]
+  drm_fb_helper_dirty_work+0x156/0x160 [drm_kms_helper]
+  process_one_work+0x19a/0x3b0
+  worker_thread+0x50/0x3a0
+  kthread+0xfd/0x130
+  ? process_one_work+0x3b0/0x3b0
+  ? kthread_park+0x80/0x80
+  ret_from_fork+0x35/0x40
  
  ==
  
  /usr/bin/qemu-system-x86_64 \
--name vm,process=qemu:vm \
--no-user-config \
--nodefaults \
--nographic \
--uuid  \
--pidfile  \
--machine q35,accel=kvm,vmport=off,dump-guest-core=off \
--cpu SandyBridge-IBRS \
--smp cpus=4,cores=2,threads=1,sockets=2 \
--m 4G \
--drive if=pflash,format=raw,readonly,file=/usr/share/ovmf/x64/OVMF_CODE.fd \
--drive if=pflash,format=raw,file=/var/qemu/efivars/vm.fd \
--monitor telnet:localhost:8000,server,nowait,nodelay \
--spice unix,addr=/tmp/spice.vm.sock,disable-ticketing \
--device ioh3420,id=pcie.1,bus=pcie.0,slot=0 \
--device virtio-vga,bus=pcie.1,addr=0 \
--usbdevice tablet \
--netdev bridge,id=network0,br=br0 \
--device virtio-net-pci,netdev=network0,mac=F4:F6:34:F6:34:2d,bus=pcie.0,addr=3 \
--device virtio-scsi-pci,id=scsi1 \
--drive driver=raw,node-name=hd0,file=/dev/lvm/vm,if=none,discard=unmap,cache=none,aio=threads
+    -name vm,process=qemu:vm \
+    -no-user-config \
+    -nodefaults \
+    -nographic \
+    -uuid  \
+    -pidfile  \
+    -machine q35,accel=kvm,vmport=off,dump-guest-core=off \
+    -cpu SandyBridge-IBRS \
+    -smp cpus=4,cores=2,threads=1,sockets=2 \
+    -m 4G \
+    -drive if=pflash,format=raw,readonly,file=/usr/share/ovmf/x64/OVMF_CODE.fd \
+    -drive if=pflash,format=raw,file=/var/qemu/efivars/vm.fd \
+    -monitor telnet:localhost:8000,server,nowait,nodelay \
+    -spice unix,addr=/tmp/spice.vm.sock,disable-ticketing \
+    -device ioh3420,id=pcie.1,bus=pcie.0,slot=0 \
+    -device virtio-vga,bus=pcie.1,addr=0 \
+    -usbdevice tablet \
+    -netdev bridge,id=network0,br=br0 \
+    -device virtio-net-pci,netdev=network0,mac=F4:F6:34:F6:34:2d,bus=pcie.0,addr=3 \
+    -device virtio-scsi-pci,id=scsi1 \
+    -drive driver=raw,node-name=hd0,file=/dev/lvm/vm,if=none,discard=unmap,cache=none,aio=threads

[Qemu-devel] [Bug 1844053] [NEW] task blocked for more than X seconds - events drm_fb_helper_dirty_work

2019-09-15 Thread James Harvey
Public bug reported:

I've had bunches of these errors on 9 different boots, between
2019-08-21 and now, with Arch host and guest, from linux 5.1.16 to
5.2.14 on host and guest, with QEMU 4.0.0 and 4.1.0.  spice 0.14.2,
spice-gtk 0.37, spice-protocol 0.14.0, virt-viewer 8.0.

I've been fighting with some other issues related to a 5.2 btrfs
regression, a QEMU qxl regression (see bug 1843151) which I ran into
when trying to temporarily abandon virtio-vga, and I haven't paid enough
attention to what impact it has on the system when these occur.  In
journalctl, I can see I often rebooted minutes after they occurred, but
sometimes much later.  That must mean whenever I saw it happen that I
rebooted the VM, or potentially it impacted functionality of the system.

Please let me know if and how I can get more information for you if
needed.

I've replicated this on both a system with integrated ASPEED video, and
on an AMD Vega 64 running amdgpu.

As an example, I have one boot which reported at 122 seconds, 245, 368,
491, 614, 737, 860, 983, 1105, 1228, then I rebooted.

I have another that reported 122/245/368/491/614/737, went away for 10
minutes, then started reporting again 122/245/368/491, and went away.
Then, I rebooted about 20 hours later.
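
The spacing of those reports can be checked with a couple of lines.
This sketch only uses the numbers quoted above; reading the result as
the kernel's hung task check firing once per timeout period (120
seconds by default) is an assumption about this guest's configuration,
not something the logs state.

```python
# "blocked for more than N seconds" values from the first example boot.
reports = [122, 245, 368, 491, 614, 737, 860, 983, 1105, 1228]

# Difference between each report and the one before it.
gaps = [later - earlier for earlier, later in zip(reports, reports[1:])]
print(gaps)
```

Every gap is 122 or 123 seconds, so the same task appears to have been
blocked continuously and re-reported at a fixed interval, rather than
blocking and recovering repeatedly.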

Host system has no graphical impact when this happens, and logs nothing
in its journalctl.

==

INFO: task kworker/0:1:15 blocked for more than 122 seconds.
  Not tainted 5.2.14-1 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/0:1     D    0    15      2 0x80004000
Workqueue: events drm_fb_helper_dirty_work [drm_kms_helper]
Call Trace:
 ? __schedule+0x27f/0x6d0
 schedule+0x3d/0xc0
 virtio_gpu_queue_fenced_ctrl_buffer+0xa1/0x130 [virtio_gpu]
 ? wait_woken+0x80/0x80
 virtio_gpu_surface_dirty+0x2a5/0x300 [virtio_gpu]
 drm_fb_helper_dirty_work+0x156/0x160 [drm_kms_helper]
 process_one_work+0x19a/0x3b0
 worker_thread+0x50/0x3a0
 kthread+0xfd/0x130
 ? process_one_work+0x3b0/0x3b0
 ? kthread_park+0x80/0x80
 ret_from_fork+0x35/0x40

==

/usr/bin/qemu-system-x86_64 \
   -name vm,process=qemu:vm \
   -no-user-config \
   -nodefaults \
   -nographic \
   -uuid  \
   -pidfile  \
   -machine q35,accel=kvm,vmport=off,dump-guest-core=off \
   -cpu SandyBridge-IBRS \
   -smp cpus=4,cores=2,threads=1,sockets=2 \
   -m 4G \
   -drive if=pflash,format=raw,readonly,file=/usr/share/ovmf/x64/OVMF_CODE.fd \
   -drive if=pflash,format=raw,file=/var/qemu/efivars/vm.fd \
   -monitor telnet:localhost:8000,server,nowait,nodelay \
   -spice unix,addr=/tmp/spice.vm.sock,disable-ticketing \
   -device ioh3420,id=pcie.1,bus=pcie.0,slot=0 \
   -device virtio-vga,bus=pcie.1,addr=0 \
   -usbdevice tablet \
   -netdev bridge,id=network0,br=br0 \
   -device virtio-net-pci,netdev=network0,mac=F4:F6:34:F6:34:2d,bus=pcie.0,addr=3 \
   -device virtio-scsi-pci,id=scsi1 \
   -drive driver=raw,node-name=hd0,file=/dev/lvm/vm,if=none,discard=unmap,cache=none,aio=threads

** Affects: qemu
 Importance: Undecided
 Status: New


[Qemu-devel] [Bug 1843151] Re: Regression: QEMU 4.1.0 qxl and KMS resoluiton only 4x10

2019-09-12 Thread James Harvey
Sorry, my #8 was really long.  All builds I've done were in clean
chroots, so starting from scratch with just git source, with no
interference from other builds.  Also later in #8, I show that
--disable-glusterfs doesn't work because some part of the build looks
for the .so that was never built.

Luckily, be812c0 was easy enough to just manually revert on top of
4.1.0.

And, good news.  (I hope!)  4.1.0 with be812c0 manually reverted on top
of it prevents the bug, even WITHOUT "max_outputs=1".




[Qemu-devel] [Bug 1843151] Re: Regression: QEMU 4.1.0 qxl and KMS resoluiton only 4x10

2019-09-12 Thread James Harvey
Bisection is not going well at all with this code base!

Before your last reply, I started bisecting, and the first commit to
test between 4.0.0 and 4.1.0 was aae6500972, which fails compilation:

==

...
  CC  stubs/pci-host-piix.o
  CC  stubs/ram-block.o
  CC  stubs/ramfb.o
  CC  stubs/fw_cfg.o
  CC  stubs/semihost.o
  CC  qemu-keymap.o
  CC  util/filemonitor-stub.o

Warning, treated as error:
/build/qemu-bisect/src/qemu/docs/interop/bitmaps.rst:202:Could not lex literal_block as "json". Highlighting skipped.
  CC  ui/input-keymap.o
  CC  contrib/elf2dmp/main.o
  CC  contrib/elf2dmp/addrspace.o
  CC  contrib/elf2dmp/download.o
  CC  contrib/elf2dmp/pdb.o
  CC  contrib/elf2dmp/qemu_elf.o
  CC  contrib/ivshmem-client/ivshmem-client.o
  CC  contrib/ivshmem-client/main.o
  CC  contrib/ivshmem-server/ivshmem-server.o

==

I tried just marking it as good and hoping it was a more recent
regression, instead of even doing a skip, but efa85a4d1a fails with the
same error.  I double checked that 4.0.0 and 4.1.0 still get past that
spot for me, and they do.
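
To illustrate why marking an unbuildable commit as good can mislead a
bisection while skipping it cannot, here is a simplified model over a
linear list of commits.  It is only a sketch: the commit names are
hypothetical, and real "git bisect" works on a commit graph and chooses
its test points differently.

```python
def bisect(commits, test):
    """Narrow down the first bad commit in an oldest-to-newest list.

    commits[0] is known good and commits[-1] is known bad.
    test(commit) returns "good", "bad" or "skip" (cannot be built).
    Returns every commit that could still be the first bad one.
    """
    lo, hi = 0, len(commits) - 1
    skipped = set()
    while True:
        untested = [i for i in range(lo + 1, hi) if i not in skipped]
        if not untested:
            break
        # Test the untested commit closest to the middle of the range.
        mid = min(untested, key=lambda i: abs(i - (lo + hi) // 2))
        verdict = test(commits[mid])
        if verdict == "good":
            lo = mid
        elif verdict == "bad":
            hi = mid
        else:
            skipped.add(mid)
    return commits[lo + 1:hi + 1]


# Hypothetical history: c4 and c5 do not build, c6 introduces the bug.
history = [f"c{n}" for n in range(10)]


def verdict(commit):
    n = int(commit[1:])
    if n in (4, 5):
        return "skip"
    return "bad" if n >= 6 else "good"


print(bisect(history, verdict))
```

With skips, the result is an honest list of candidates that still
contains the real culprit.  Had c4 or c5 been marked good without
being tested, and the bug had actually been introduced there, the
search would have moved past it.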

I tried your suggestion, be812c0, but that failed to compile with this error:

==

  CC  crypto/cipher.o
  CC  crypto/tlscreds.o
  CC  crypto/tlscredsanon.o
/build/qemu-bisect/src/qemu/block/gluster.c: In function ‘qemu_gluster_co_pwrite_zeroes’:
/build/qemu-bisect/src/qemu/block/gluster.c:994:52: warning: passing argument 4 of ‘glfs_zerofill_async’ from incompatible pointer type [-Wincompatible-pointer-types]
  994 | ret = glfs_zerofill_async(s->fd, offset, size, gluster_finish_aiocb, );
      | ^~~~
      | |
      | void (*)(struct glfs_fd *, ssize_t,  void *) {aka void (*)(struct glfs_fd *, long int,  void *)}
In file included from /build/qemu-bisect/src/qemu/block/gluster.c:12:
/usr/include/glusterfs/api/glfs.h:993:73: note: expected ‘glfs_io_cbk’ {aka ‘void (*)(struct glfs_fd *, long int,  struct glfs_stat *, struct glfs_stat *, void *)’} but argument is of type ‘void (*)(struct glfs_fd *, ssize_t,  void *)’ {aka ‘void (*)(struct glfs_fd *, long int,  void *)’}
  993 | glfs_zerofill_async(glfs_fd_t *fd, off_t length, off_t len, glfs_io_cbk fn,
      | ^~
/build/qemu-bisect/src/qemu/block/gluster.c: In function ‘qemu_gluster_do_truncate’:
/build/qemu-bisect/src/qemu/block/gluster.c:1035:13: error: too few arguments to function ‘glfs_ftruncate’
 1035 | if (glfs_ftruncate(fd, offset)) {
      | ^~
In file included from /build/qemu-bisect/src/qemu/block/gluster.c:12:
/usr/include/glusterfs/api/glfs.h:768:1: note: declared here
  768 | glfs_ftruncate(glfs_fd_t *fd, off_t length, struct glfs_stat *prestat,
      | ^~
/build/qemu-bisect/src/qemu/block/gluster.c:1046:13: error: too few arguments to function ‘glfs_ftruncate’
 1046 | if (glfs_ftruncate(fd, offset)) {
      | ^~

==

So, I looked at configure and saw a "--disable-glusterfs" option, and
tried it.  It still failed with:

==

  GEN it.mo
  GEN bg.mo
  GEN fr_FR.mo
  GEN zh_CN.mo
  GEN de_DE.mo
  GEN hu.mo
  GEN tr.mo
for obj in hu.mo tr.mo it.mo bg.mo fr_FR.mo zh_CN.mo de_DE.mo; do \
base=$(basename $obj .mo); \
install -d /build/qemu-bisect/pkg/qemu-bisect/usr/share/locale/$base/LC_MESSAGES; \
install -m644 $obj /build/qemu-bisect/pkg/qemu-bisect/usr/share/locale/$base/LC_MESSAGES/qemu.mo; \
done
make[1]: Leaving directory '/build/qemu-bisect/src/build-full/po'
install -d -m 0755 "/build/qemu-bisect/pkg/qemu-bisect/usr/share/qemu/keymaps"
set -e; for x in da en-gb  et  fr fr-ch  is  lt  no  pt-br  sv ar  de en-us  fi  fr-be  hr it  lv  nl pl  ru th de-ch  es fo  fr-ca  hu ja  mk  pt  sl tr bepocz; do \
install -c -m 0644 /build/qemu-bisect/src/qemu/pc-bios/keymaps/$x "/build/qemu-bisect/pkg/qemu-bisect/usr/share/qemu/keymaps"; \
done
install -c -m 0644 /build/qemu-bisect/src/build-full/trace-events-all "/build/qemu-bisect/pkg/qemu-bisect/usr/share/qemu/trace-events-all"
for d in aarch64-softmmu alpha-softmmu arm-softmmu cris-softmmu hppa-softmmu 
i386-softmmu lm32-softmmu m68k-softmmu microblazeel-softmmu microblaze-softmmu 
mips64el-softmmu mips64-softmmu mipsel-softmmu mips-softmmu moxie-softmmu 
nios2-softmmu or1k-softmmu ppc64-softmmu ppc-softmmu riscv32-softmmu 
riscv64-softmmu s390x-softmmu sh4eb-softmmu sh4-softmmu sparc64-softmmu 
sparc-softmmu tricore-softmmu unicore32-softmmu x86_64-softmmu xtensaeb-softmmu 
xtensa-softmmu aarch64_be-linux-user aarch64-linux-user alpha-linux-user 
armeb-linux-user arm-linux-user cris-linux-user hppa-linux-user i386-linux-user 
m68k-linux-user 

[Qemu-devel] [Bug 1843151] Re: Regression: QEMU 4.1.0 qxl and KMS resolution only 4x10

2019-09-12 Thread James Harvey
P.S. It looks like I can use --disable-docs to hopefully get around the
json parsing error, but that doesn't help with the gluster error, or with
something still looking for the .so despite --disable-glusterfs.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1843151

Title:
  Regression: QEMU 4.1.0 qxl and KMS resolution only 4x10

Status in QEMU:
  New

Bug description:
  Host is Arch Linux.  linux 5.2.13, qemu 4.1.0.  virt-viewer 8.0.

  Guest is Arch Linux Sept 2019 ISO.  linux 5.2.11.

  Have replicated this both on a system using amdgpu and one using
  integrated ASPEED graphics.

  Downgrading from 4.1.0 to 4.0.0 works as usual, see:
  https://www.youtube.com/watch?v=NyMdcYwOCvY

  Going back to 4.1.0 reproduces, see:
  https://www.youtube.com/watch?v=H3nGG2Mk6i0

  4.1.0 displays fine until KMS kicks in.

  Using 4.1.0 with virtio-vga doesn't cause this.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1843151/+subscriptions



[Qemu-devel] [Bug 1843151] Re: Regression: QEMU 4.1.0 qxl and KMS resolution only 4x10

2019-09-12 Thread James Harvey
a) spice 0.14.2.  Also spice-gtk 0.37, and spice-protocol 0.14.0.

b) Swapping with "-device qxl-vga,max_outputs=1" does fix the problem.
Swapping with "-device qxl-vga" still has the bug.

c) Knowing b, would the bisect still help?  If needed, sure, I will.




[Qemu-devel] [Bug 1843151] Re: Regression: QEMU 4.1.0 qxl and KMS resolution only 4x10

2019-09-11 Thread James Harvey
Sorry, in comment #2 for the native graphics window command line, I
copied from the wrong trial.  The argument for QXL should have been
included, because that works with a native graphics window:

   (...bootindex=0) \
   -vga qxl




[Qemu-devel] [Bug 1843151] Re: Regression: QEMU 4.1.0 qxl and KMS resolution only 4x10

2019-09-11 Thread James Harvey
Finding a minimal case did shed some light on this.

Using QEMU's native graphics window, this works fine:

$ /usr/bin/qemu-system-x86_64 \
   -m 1G \
   -blockdev raw,node-name=install_iso,read-only=on,file.driver=file,file.filename=/mnt/losable/ISOs/archlinux-2019.09.01-x86_64.iso \
   -device ide-cd,drive=install_iso,bus=ide.0,bootindex=0

But, introducing spice reproduces the problem:

$ /usr/bin/qemu-system-x86_64 \
   -m 1G \
   -blockdev raw,node-name=install_iso,read-only=on,file.driver=file,file.filename=/mnt/losable/ISOs/archlinux-2019.09.01-x86_64.iso \
   -spice unix,addr=/tmp/spice.qxl.sock,disable-ticketing \
   -device ide-cd,drive=install_iso,bus=ide.0,bootindex=0 \
   -vga qxl

$ remote-viewer "spice+unix:///tmp/spice.qxl.sock"

I've been running remote-viewer (from the virt-viewer package) since
around March 13, and at version 8.0 that whole time.  It's only upgrading
QEMU from 4.0.0 to 4.1.0 that introduces the problem.

Running remote-viewer this way also shows that it outputs these, right
when KMS changes resolution:

(remote-viewer:15090): GLib-GObject-WARNING **: 23:56:03.914: value "64"
of type 'gint' is invalid or out of range for property 'desktop-width'
of type 'gint'

(remote-viewer:15090): GLib-GObject-WARNING **: 23:56:03.915: value "64"
of type 'gint' is invalid or out of range for property 'desktop-height'
of type 'gint'

When downgrading to QEMU 4.0.0, remote-viewer STILL outputs these lines
regarding desktop-width and height, when KMS changes resolution.

In case it helps, below are spice-debug logs from remote-viewer.  I've
included the whole log, but also added a bunch of spacing and a header
marking the second's worth of output that correlates with the KMS
resolution change.

QEMU 4.0.0 without the bug: http://ix.io/1USn

QEMU 4.1.0 with the bug: http://ix.io/1USo

So, it's always possible the fix might need to be in remote-viewer, but
at minimum, the case it would need to handle properly wasn't being given
to it until QEMU 4.1.0.




[Qemu-devel] [Bug 1843151] Re: Regression: QEMU 4.1.0 qxl and KMS resolution only 4x10

2019-09-11 Thread James Harvey
Comparing the spice debug logs, where I see this with QEMU 4.0.0 without
the bug:

(remote-viewer:19270): GSpice-DEBUG: 00:05:21.201: channel-display.c:1979 display-2:0: received new monitors config from guest: n: 1/4
(remote-viewer:19270): GSpice-DEBUG: 00:05:21.201: channel-display.c:1997 display-2:0: monitor id: 0, surface id: 0, +0+0-1024x768

I see this with QEMU 4.1.0 with the bug:

(remote-viewer:19896): GSpice-DEBUG: 00:07:40.019: channel-display.c:1975 display-2:0: received empty monitor config
(remote-viewer:19896): GSpice-DEBUG: 00:07:40.049: channel-cursor.c:542 cursor-4:0: cursor_handle_reset, init_done: 1
(remote-viewer:19896): GSpice-DEBUG: 00:07:40.049: channel-display.c:1951 display-2:0: 0: FIXME primary destroy, but is display really disabled?
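
For anyone wanting to check their own remote-viewer debug log for the
same difference, a rough sketch of the comparison is below.  It just
looks for the two message strings quoted above; the function name
monitor_config_events and the "new"/"empty" labels are mine, and I'm
assuming those two messages are the relevant signal, which is only what
these two logs suggest.

```python
def monitor_config_events(log_text):
    """Return 'new' / 'empty' for each monitor-config message, in log order."""
    events = []
    for line in log_text.splitlines():
        if "received new monitors config from guest" in line:
            events.append("new")
        elif "received empty monitor config" in line:
            events.append("empty")
    return events

# Lines taken from the two logs quoted above.
good_log = (
    "(remote-viewer:19270): GSpice-DEBUG: 00:05:21.201: channel-display.c:1979 "
    "display-2:0: received new monitors config from guest: n: 1/4\n"
    "(remote-viewer:19270): GSpice-DEBUG: 00:05:21.201: channel-display.c:1997 "
    "display-2:0: monitor id: 0, surface id: 0, +0+0-1024x768\n"
)
bad_log = (
    "(remote-viewer:19896): GSpice-DEBUG: 00:07:40.019: channel-display.c:1975 "
    "display-2:0: received empty monitor config\n"
)

print("4.0.0:", monitor_config_events(good_log))
print("4.1.0:", monitor_config_events(bad_log))
```

A log that only ever shows "empty" around the KMS switch would match
what I see with 4.1.0.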




[Qemu-devel] [Bug 1843151] Re: Regression: QEMU 4.1.0 qxl and KMS resolution only 4x10

2019-09-07 Thread James Harvey
** Description changed:

- Host is Arch Linux.  linux 5.2.13, qemu 4.1.0.
+ Host is Arch Linux.  linux 5.2.13, qemu 4.1.0.  virt-viewer 8.0.
  
  Guest is Arch Linux Sept 2019 ISO.  linux 5.2.11.
  
  Have replicated this both on a system using amdgpu and one using
  integrated ASPEED graphics.
  
  Downgrading from 4.1.0 to 4.0.0 works as usual, see:
  https://www.youtube.com/watch?v=NyMdcYwOCvY
  
  Going back to 4.1.0 reproduces, see:
  https://www.youtube.com/watch?v=H3nGG2Mk6i0
  
  4.1.0 displays fine until KMS kicks in.
  
  Using 4.1.0 with virtio-vga doesn't cause this.




[Qemu-devel] [Bug 1843151] [NEW] Regression: QEMU 4.1.0 qxl and KMS resolution only 4x10

2019-09-07 Thread James Harvey
Public bug reported:

Host is Arch Linux.  linux 5.2.13, qemu 4.1.0.

Guest is Arch Linux Sept 2019 ISO.  linux 5.2.11.

Have replicated this both on a system using amdgpu and one using
integrated ASPEED graphics.

Downgrading from 4.1.0 to 4.0.0 works as usual, see:
https://www.youtube.com/watch?v=NyMdcYwOCvY

Going back to 4.1.0 reproduces, see:
https://www.youtube.com/watch?v=H3nGG2Mk6i0

4.1.0 displays fine until KMS kicks in.

Using 4.1.0 with virtio-vga doesn't cause this.

** Affects: qemu
 Importance: Undecided
 Status: New

** Description changed:

  Host is Arch Linux.  linux 5.2.13, qemu 4.1.0.
  
  Guest is Arch Linux Sept 2019 ISO.  linux 5.2.11.
  
  Have replicated this both on a system using amdgpu and one using
  integrated ASPEED graphics.
  
  Downgrading from 4.1.0 to 4.0.0 works as usual, see:
  https://www.youtube.com/watch?v=NyMdcYwOCvY
  
  Going back to 4.1.0 reproduces, see:
  https://www.youtube.com/watch?v=H3nGG2Mk6i0
  
  4.1.0 displays fine until KMS kicks in.
+ 
+ Using 4.1.0 with virtio-vga doesn't cause this.




[Qemu-devel] [Bug 1842787] Re: Writes permanently hang with very heavy I/O on virtio-scsi - worse on virtio-blk

2019-09-04 Thread James Harvey
** Description changed:

  Up to date Arch Linux on host and guest.  linux 5.2.11.  QEMU 4.1.0.
  Full command line at bottom.
  
  Host gives QEMU two thin LVM volumes.  The first is the root filesystem,
  and the second is for heavy I/O, on a Samsung 970 Evo 1TB.
  
  When maxing out the I/O on the second virtual block device using
  virtio-blk, I often get a "lockup" in about an hour or two.  On the
  advice of iggy in IRC, I switched over to virtio-scsi.  It ran perfectly
  for a few days, but then "locked up" in the same way.
  
  By "lockup", I mean writes to the second virtual block device
  permanently hang.  I can read files from it, but even "touch foo" never
  times out, cannot be "kill -9"'ed, and is stuck in uninterruptible
  sleep.
  
  When this happens, writes to the first virtual block device with the
  root filesystem are fine, so the O/S itself remains responsive.
  
  The second virtual block device uses BTRFS.  But, I have also tried XFS
  and reproduced the issue.
  
  In guest, when this starts, it starts logging "task X blocked for more
  than Y seconds".  Below is an example of one of these.  At this point,
  anything that is or does in the future write to this block device gets
  stuck in uninterruptible sleep.
  
  -
  
  INFO: task kcompactd:232 blocked for more than 860 seconds.
    Not tainted 5.2.11-1 #1
  "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  kcompactd0  D0   232  2 0x80004000
  Call Trace:
   ? __schedule+0x27f/0x6d0
   schedule+0x3d/0xc0
   io_schedule+0x12/0x40
   __lock_page+0x14a/0x250
   ? add_to_page_cache_lru+0xe0/0xe0
   migrate_pages+0x803/0xb70
   ? isolate_migratepages_block+0x9f0/0x9f0
   ? __reset_isolation_suitable+0x110/0x110
   compact_zone+0x6a2/0xd30
   kcompactd_do_work+0x134/0x260
   ? kvm_clock_read+0x14/0x30
   ? kvm_sched_clock_read+0x5/0x10
   kcompactd+0xd3/0x220
   ? wait_woken+0x80/0x80
   kthread+0xfd/0x130
   ? kcompactd_do_work+0x260/0x260
   ? kthread_park+0x80/0x80
   ret_from_fork+0x35/0x40
  
  -
  
  In guest, there are no other dmesg/journalctl entries other than
  "task...blocked".
  
  On host, there are no dmesg/journalctl entries whatsoever.  Everything
  else in host continues to work fine, including other QEMU VM's on the
  same underlying SSD (but obviously different lvm volumes.)
  
  I understand there might not be enough to go on here, and I also
  understand it's possible this isn't a QEMU bug.  Happy to run given
  commands or patches to help diagnose what's going on here.
  
  I'm now running a custom compiled QEMU 4.1.0, with debug symbols, so I
  can get a meaningful backtrace from the host point of view.
  
  I've only recently tried this level of I/O, so can't say if this is a
  new issue.
  
+ When writes are hanging, on host, I can connect to the monitor.  Running
+ "info block" shows nothing unusual.
+ 
  -
  
  /usr/bin/qemu-system-x86_64
     -name arch,process=qemu:arch
     -no-user-config
     -nodefaults
     -nographic
     -uuid 0528162b-2371-41d5-b8da-233fe61b6458
     -pidfile /tmp/0528162b-2371-41d5-b8da-233fe61b6458.pid
     -machine q35,accel=kvm,vmport=off,dump-guest-core=off
     -cpu SandyBridge-IBRS
     -smp cpus=24,cores=12,threads=1,sockets=2
     -m 24G
     -drive if=pflash,format=raw,readonly,file=/usr/share/ovmf/x64/OVMF_CODE.fd
     -drive if=pflash,format=raw,readonly,file=/var/qemu/0528162b-2371-41d5-b8da-233fe61b6458.fd
     -monitor telnet:localhost:8000,server,nowait,nodelay
     -spice unix,addr=/tmp/0528162b-2371-41d5-b8da-233fe61b6458.sock,disable-ticketing
     -device ioh3420,id=pcie.1,bus=pcie.0,slot=0
     -device virtio-vga,bus=pcie.1,addr=0
     -usbdevice tablet
     -netdev bridge,id=network0,br=br0
     -device virtio-net-pci,netdev=network0,mac=02:37:de:79:19:09,bus=pcie.0,addr=3
     -device virtio-scsi-pci,id=scsi1
     -drive driver=raw,node-name=hd0,file=/dev/lvm/arch_root,if=none,discard=unmap
     -device scsi-hd,drive=hd0,bootindex=1
     -drive driver=raw,node-name=hd1,file=/dev/lvm/arch_nvme,if=none,discard=unmap
     -device scsi-hd,drive=hd1,bootindex=2
  
  -

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1842787

Title:
  Writes permanently hang with very heavy I/O on virtio-scsi - worse on
  virtio-blk

Status in QEMU:
  New

Bug description:
  Up to date Arch Linux on host and guest.  linux 5.2.11.  QEMU 4.1.0.
  Full command line at bottom.

  Host gives QEMU two thin LVM volumes.  The first is the root
  filesystem, and the second is for heavy I/O, on a Samsung 970 Evo 1TB.

  When maxing out the I/O on the second virtual block device using
  virtio-blk, I often get a "lockup" in about an hour or two.  On the
  advice of iggy in IRC, I switched over to virtio-scsi.  It ran
  perfectly for a few days, but then "locked up" in the same way.

  By "lockup", I mean writes to the second 

[Qemu-devel] [Bug 1842787] Re: Writes permanently hang with very heavy I/O on virtio-scsi - worse on virtio-blk

2019-09-04 Thread James Harvey
** Description changed:

  Up to date Arch Linux on host and guest.  linux 5.2.11.  QEMU 4.1.0.
  Full command line at bottom.
  
  Host gives QEMU two thin LVM volumes.  The first is the root filesystem,
  and the second is for heavy I/O, on a Samsung 970 Evo 1TB.
  
  When maxing out the I/O on the second virtual block device using
  virtio-blk, I often get a "lockup" in about an hour or two.  On the
  advice of iggy in IRC, I switched over to virtio-scsi.  It ran perfectly
  for a few days, but then "locked up" in the same way.
  
  By "lockup", I mean writes to the second virtual block device
  permanently hang.  I can read files from it, but even "touch foo" never
  times out, cannot be "kill -9"'ed, and is stuck in uninterruptible
  sleep.
  
  When this happens, writes to the first virtual block device with the
  root filesystem are fine, so the O/S itself remains responsive.
  
  The second virtual block device uses BTRFS.  But, I have also tried XFS
  and reproduced the issue.
  
  In guest, when this starts, it starts logging "task X blocked for more
  than Y seconds".  Below is an example of one of these.  At this point,
  anything that is or does in the future write to this block device gets
  stuck in uninterruptible sleep.
  
  -
  
  INFO: task kcompactd:232 blocked for more than 860 seconds.
-   Not tainted 5.2.11-1 #1
+   Not tainted 5.2.11-1 #1
  "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  kcompactd0  D0   232  2 0x80004000
  Call Trace:
-  ? __schedule+0x27f/0x6d0
-  schedule+0x3d/0xc0
-  io_schedule+0x12/0x40
-  __lock_page+0x14a/0x250
-  ? add_to_page_cache_lru+0xe0/0xe0
-  migrate_pages+0x803/0xb70
-  ? isolate_migratepages_block+0x9f0/0x9f0
-  ? __reset_isolation_suitable+0x110/0x110
-  compact_zone+0x6a2/0xd30
-  kcompactd_do_work+0x134/0x260
-  ? kvm_clock_read+0x14/0x30
-  ? kvm_sched_clock_read+0x5/0x10
-  kcompactd+0xd3/0x220
-  ? wait_woken+0x80/0x80
-  kthread+0xfd/0x130
-  ? kcompactd_do_work+0x260/0x260
-  ? kthread_park+0x80/0x80
-  ret_from_fork+0x35/0x40
+  ? __schedule+0x27f/0x6d0
+  schedule+0x3d/0xc0
+  io_schedule+0x12/0x40
+  __lock_page+0x14a/0x250
+  ? add_to_page_cache_lru+0xe0/0xe0
+  migrate_pages+0x803/0xb70
+  ? isolate_migratepages_block+0x9f0/0x9f0
+  ? __reset_isolation_suitable+0x110/0x110
+  compact_zone+0x6a2/0xd30
+  kcompactd_do_work+0x134/0x260
+  ? kvm_clock_read+0x14/0x30
+  ? kvm_sched_clock_read+0x5/0x10
+  kcompactd+0xd3/0x220
+  ? wait_woken+0x80/0x80
+  kthread+0xfd/0x130
+  ? kcompactd_do_work+0x260/0x260
+  ? kthread_park+0x80/0x80
+  ret_from_fork+0x35/0x40
  
  -
  
  In guest, there are no other dmesg/journalctl entries other than
  "task...blocked".
  
  On host, there are no dmesg/journalctl entries whatsoever.  Everything
  else in host continues to work fine, including other QEMU VM's on the
  same underlying SSD (but obviously different lvm volumes.)
  
  I understand there might not be enough to go on here, and I also
  understand it's possible this isn't a QEMU bug.  Happy to run given
  commands or patches to help diagnose what's going on here.
  
  I'm now running a custom compiled QEMU 4.1.0, with debug symbols, so I
  can get a meaningful backtrace from the host point of view.
  
+ I've only recently tried this level of I/O, so can't say if this is a
+ new issue.
+ 
  -
  
  /usr/bin/qemu-system-x86_64
--name arch,process=qemu:arch
--no-user-config
--nodefaults
--nographic
--uuid 0528162b-2371-41d5-b8da-233fe61b6458
--pidfile /tmp/0528162b-2371-41d5-b8da-233fe61b6458.pid
--machine q35,accel=kvm,vmport=off,dump-guest-core=off
--cpu SandyBridge-IBRS
--smp cpus=24,cores=12,threads=1,sockets=2
--m 24G
--drive if=pflash,format=raw,readonly,file=/usr/share/ovmf/x64/OVMF_CODE.fd
--drive 
if=pflash,format=raw,readonly,file=/var/qemu/0528162b-2371-41d5-b8da-233fe61b6458.fd
--monitor telnet:localhost:8000,server,nowait,nodelay
--spice 
unix,addr=/tmp/0528162b-2371-41d5-b8da-233fe61b6458.sock,disable-ticketing
--device ioh3420,id=pcie.1,bus=pcie.0,slot=0
--device virtio-vga,bus=pcie.1,addr=0
--usbdevice tablet
--netdev bridge,id=network0,br=br0
--device 
virtio-net-pci,netdev=network0,mac=02:37:de:79:19:09,bus=pcie.0,addr=3
--device virtio-scsi-pci,id=scsi1
--drive 
driver=raw,node-name=hd0,file=/dev/lvm/arch_root,if=none,discard=unmap
--device scsi-hd,drive=hd0,bootindex=1
--drive 
driver=raw,node-name=hd1,file=/dev/lvm/arch_nvme,if=none,discard=unmap
--device scsi-hd,drive=hd1,bootindex=2
+    -name arch,process=qemu:arch
+    -no-user-config
+    -nodefaults
+    -nographic
+    -uuid 0528162b-2371-41d5-b8da-233fe61b6458
+    -pidfile /tmp/0528162b-2371-41d5-b8da-233fe61b6458.pid
+    -machine q35,accel=kvm,vmport=off,dump-guest-core=off
+    -cpu SandyBridge-IBRS
+    -smp cpus=24,cores=12,threads=1,sockets=2
+    -m 24G
+    -drive 

[Qemu-devel] [Bug 1842787] [NEW] Writes permanently hang with very heavy I/O on virtio-scsi - worse on virtio-blk

2019-09-04 Thread James Harvey
Public bug reported:

Up to date Arch Linux on host and guest.  linux 5.2.11.  QEMU 4.1.0.
Full command line at bottom.

Host gives QEMU two thin LVM volumes.  The first is the root filesystem,
and the second is for heavy I/O, on a Samsung 970 Evo 1TB.

When maxing out the I/O on the second virtual block device using
virtio-blk, I often get a "lockup" in about an hour or two.  On the
advice of iggy in IRC, I switched over to virtio-scsi.  It ran perfectly
for a few days, but then "locked up" in the same way.

By "lockup", I mean writes to the second virtual block device
permanently hang.  I can read files from it, but even "touch foo" never
times out, cannot be "kill -9"'ed, and is stuck in uninterruptible
sleep.
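
For reference, one way to list the tasks stuck like this from inside the
guest is to read the state field out of /proc/<pid>/stat.  The sketch
below is only an illustration of that idea: the helper names are mine,
and it assumes a Linux /proc layout (state "D" meaning uninterruptible
sleep).

```python
import os

def state_from_stat(stat_line):
    """Extract (comm, state) from the contents of one /proc/<pid>/stat file."""
    # comm is wrapped in parentheses and may itself contain ')' or spaces,
    # so split on the LAST ')' rather than on whitespace.
    open_paren = stat_line.index("(")
    close_paren = stat_line.rindex(")")
    comm = stat_line[open_paren + 1:close_paren]
    state = stat_line[close_paren + 1:].split()[0]
    return comm, state

def uninterruptible_tasks(proc="/proc"):
    """Return (pid, comm) for every process currently in state D."""
    blocked = []
    if not os.path.isdir(proc):
        return blocked  # not a Linux-style system; nothing to scan
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "stat")) as f:
                comm, state = state_from_stat(f.read())
        except (OSError, ValueError):
            continue  # process exited, or its stat was unreadable
        if state == "D":
            blocked.append((int(entry), comm))
    return blocked

if __name__ == "__main__":
    for pid, comm in uninterruptible_tasks():
        print(pid, comm)
```

On a healthy system this normally prints little or nothing; during the
lockup I'd expect the hung "touch foo" and friends to show up here.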

When this happens, writes to the first virtual block device with the
root filesystem are fine, so the O/S itself remains responsive.

The second virtual block device uses BTRFS.  But, I have also tried XFS
and reproduced the issue.

In guest, when this starts, it starts logging "task X blocked for more
than Y seconds".  Below is an example of one of these.  At this point,
anything that is writing, or later tries to write, to this block device
gets stuck in uninterruptible sleep.

-

INFO: task kcompactd:232 blocked for more than 860 seconds.
  Not tainted 5.2.11-1 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kcompactd0  D0   232  2 0x80004000
Call Trace:
 ? __schedule+0x27f/0x6d0
 schedule+0x3d/0xc0
 io_schedule+0x12/0x40
 __lock_page+0x14a/0x250
 ? add_to_page_cache_lru+0xe0/0xe0
 migrate_pages+0x803/0xb70
 ? isolate_migratepages_block+0x9f0/0x9f0
 ? __reset_isolation_suitable+0x110/0x110
 compact_zone+0x6a2/0xd30
 kcompactd_do_work+0x134/0x260
 ? kvm_clock_read+0x14/0x30
 ? kvm_sched_clock_read+0x5/0x10
 kcompactd+0xd3/0x220
 ? wait_woken+0x80/0x80
 kthread+0xfd/0x130
 ? kcompactd_do_work+0x260/0x260
 ? kthread_park+0x80/0x80
 ret_from_fork+0x35/0x40

-
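
If it's useful for tracking how many tasks pile up over time, the
"blocked for more than" lines can be pulled out of dmesg/journalctl
output with something like the sketch below.  The regex is written
against the message format shown above; the function name hung_tasks is
made up, and it assumes the task name contains no whitespace, which may
not hold for every task.

```python
import re

# Format as seen above: INFO: task <comm>:<pid> blocked for more than <N> seconds.
HUNG = re.compile(
    r"INFO: task (?P<comm>\S+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds\."
)

def hung_tasks(log_text):
    """Return (comm, pid, seconds) for each hung-task report in the log."""
    return [
        (m.group("comm"), int(m.group("pid")), int(m.group("secs")))
        for m in HUNG.finditer(log_text)
    ]

sample = "INFO: task kcompactd:232 blocked for more than 860 seconds.\n"
for comm, pid, secs in hung_tasks(sample):
    print(f"{comm} (pid {pid}) blocked > {secs}s")
```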

In guest, there are no dmesg/journalctl entries other than
"task...blocked".

On host, there are no dmesg/journalctl entries whatsoever.  Everything
else in host continues to work fine, including other QEMU VMs on the
same underlying SSD (but obviously different LVM volumes).

I understand there might not be enough to go on here, and I also
understand it's possible this isn't a QEMU bug.  Happy to run given
commands or patches to help diagnose what's going on here.

I'm now running a custom compiled QEMU 4.1.0, with debug symbols, so I
can get a meaningful backtrace from the host point of view.

-

/usr/bin/qemu-system-x86_64
   -name arch,process=qemu:arch
   -no-user-config
   -nodefaults
   -nographic
   -uuid 0528162b-2371-41d5-b8da-233fe61b6458
   -pidfile /tmp/0528162b-2371-41d5-b8da-233fe61b6458.pid
   -machine q35,accel=kvm,vmport=off,dump-guest-core=off
   -cpu SandyBridge-IBRS
   -smp cpus=24,cores=12,threads=1,sockets=2
   -m 24G
   -drive if=pflash,format=raw,readonly,file=/usr/share/ovmf/x64/OVMF_CODE.fd
   -drive if=pflash,format=raw,readonly,file=/var/qemu/0528162b-2371-41d5-b8da-233fe61b6458.fd
   -monitor telnet:localhost:8000,server,nowait,nodelay
   -spice unix,addr=/tmp/0528162b-2371-41d5-b8da-233fe61b6458.sock,disable-ticketing
   -device ioh3420,id=pcie.1,bus=pcie.0,slot=0
   -device virtio-vga,bus=pcie.1,addr=0
   -usbdevice tablet
   -netdev bridge,id=network0,br=br0
   -device virtio-net-pci,netdev=network0,mac=02:37:de:79:19:09,bus=pcie.0,addr=3
   -device virtio-scsi-pci,id=scsi1
   -drive driver=raw,node-name=hd0,file=/dev/lvm/arch_root,if=none,discard=unmap
   -device scsi-hd,drive=hd0,bootindex=1
   -drive driver=raw,node-name=hd1,file=/dev/lvm/arch_nvme,if=none,discard=unmap
   -device scsi-hd,drive=hd1,bootindex=2

-

** Affects: qemu
 Importance: Undecided
 Status: New

** Summary changed:

- irtiWrites permanently hang with very heavy I/O on vo-scsi - worse on virtio-blk
+ Writes permanently hang with very heavy I/O on virtio-scsi - worse on virtio-blk


[Qemu-devel] [Bug 1811543] [NEW] virtio-scsi gives improper discard sysfs entries

2019-01-12 Thread James Harvey
Public bug reported:

Apologies if this is just an inherent part of paravirtualization that
should be expected.

In my host, I have an LVM thin pool with chunk_size 128MB.  Within it, I
have a thin volume "tmp".  In the host:

# fdisk -l /dev/lvm/tmp
Disk /dev/lvm/tmp: 256 MiB, 268435456 bytes, 524288 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 262144 bytes / 134217728 bytes
Disklabel type: gpt
Disk identifier: BAE3154E-6E85-F642-8129-BAD7B58B2775

Device         Start    End Sectors  Size Type
/dev/lvm/tmp1   2048 524254  522207  255M Linux filesystem

$ lsblk
...
  └─lvm-tmp  254:13   0   256M  0 lvm
└─lvm-tmp1   254:14   0   255M  0 part

$ cat /sys/dev/block/254:13/discard_alignment
0
$ cat /sys/dev/block/254:13/queue/discard_granularity
134217728
$ cat /sys/dev/block/254:13/queue/discard_max_bytes
17179869184
$ cat /sys/dev/block/254:13/queue/discard_max_hw_bytes
0
$ cat /sys/dev/block/254:13/queue/discard_zeroes_data
0

$ cat /sys/dev/block/254:14/discard_alignment
133169152
$ cat /sys/dev/block/254:14/queue/discard_granularity
134217728
$ cat /sys/dev/block/254:14/queue/discard_max_bytes
17179869184
$ cat /sys/dev/block/254:14/queue/discard_max_hw_bytes
0
$ cat /sys/dev/block/254:14/queue/discard_zeroes_data
0

If this is given to QEMU using virtio-scsi:

   -device virtio-scsi-pci,id=scsi1 \
   -drive driver=raw,node-name=hdb,file=/dev/lvm/tmp,if=none,discard=unmap,id=hd2 \
   -device scsi-hd,drive=hd2,bootindex=1 \

Then incorrect values are reported in the guest:

$ lsblk
...
sdb 8:16   0   256M  0 disk
└─sdb1  8:17   0   255M  0 part /mnt

$ cat /sys/dev/block/8:16/discard_alignment
0
$ cat /sys/dev/block/8:16/queue/discard_granularity
4096
$ cat /sys/dev/block/8:16/queue/discard_max_bytes
1073741824
$ cat /sys/dev/block/8:16/queue/discard_max_hw_bytes
1073741824
$ cat /sys/dev/block/8:16/queue/discard_zeroes_data
0

$ cat /sys/dev/block/8:17/discard_alignment
133169152

And, there isn't even a /sys/dev/block/8:17/queue directory.
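
As a sanity check on the numbers, here is my understanding of how a
partition's discard_alignment falls out of the partition start and the
queue's discard_granularity.  Treat this as a sketch under that
assumption rather than the kernel's actual code: the function name is
made up, and the formula is my reading of it (bytes from the partition
start to the next granularity boundary).

```python
SECTOR = 512  # bytes per logical sector, as reported by fdisk above

def partition_discard_alignment(start_sector, granularity, alignment=0):
    """Bytes from the partition start to the next discard-granularity boundary."""
    if granularity == 0:
        return 0  # device does not advertise discard support
    offset = (start_sector * SECTOR) % granularity
    return (granularity + alignment - offset) % granularity

# Partition 1 starts at sector 2048 in both host and guest.
print(partition_discard_alignment(2048, 134217728))  # host granularity, 128 MiB
print(partition_discard_alignment(2048, 4096))       # guest granularity, 4 KiB
```

The first call gives 133169152, matching the host's value for the
partition.  If this reading is right, a 4096-byte granularity would give
0, so the guest's 133169152 for 8:17 doesn't look consistent with the
4096 it reports for 8:16.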

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1811543

Title:
  virtio-scsi gives improper discard sysfs entries

Status in QEMU:
  New

Bug description:
  Apologies if this is just an inherent part of paravirtualization that
  should be expected.

  In my host, I have an LVM thin pool with chunk_size 128MB.  Within it,
  I have a thin volume "tmp".  In the host:

  # fdisk -l /dev/lvm/tmp
  Disk /dev/lvm/tmp: 256 MiB, 268435456 bytes, 524288 sectors
  Units: sectors of 1 * 512 = 512 bytes
  Sector size (logical/physical): 512 bytes / 4096 bytes
  I/O size (minimum/optimal): 262144 bytes / 134217728 bytes
  Disklabel type: gpt
  Disk identifier: BAE3154E-6E85-F642-8129-BAD7B58B2775

  Device         Start    End Sectors  Size Type
  /dev/lvm/tmp1   2048 524254  522207  255M Linux filesystem

  $ lsblk
  ...
└─lvm-tmp  254:13   0   256M  0 lvm
  └─lvm-tmp1   254:14   0   255M  0 part

  $ cat /sys/dev/block/254:13/discard_alignment
  0
  $ cat /sys/dev/block/254:13/queue/discard_granularity
  134217728
  $ cat /sys/dev/block/254:13/queue/discard_max_bytes
  17179869184
  $ cat /sys/dev/block/254:13/queue/discard_max_hw_bytes
  0
  $ cat /sys/dev/block/254:13/queue/discard_zeroes_data
  0

  $ cat /sys/dev/block/254:14/discard_alignment
  133169152
  $ cat /sys/dev/block/254:14/queue/discard_granularity
  134217728
  $ cat /sys/dev/block/254:14/queue/discard_max_bytes
  17179869184
  $ cat /sys/dev/block/254:14/queue/discard_max_hw_bytes
  0
  $ cat /sys/dev/block/254:14/queue/discard_zeroes_data
  0

  If this is given to QEMU using virtio-scsi:

 -device virtio-scsi-pci,id=scsi1 \
 -drive driver=raw,node-name=hdb,file=/dev/lvm/tmp,if=none,discard=unmap,id=hd2 \
 -device scsi-hd,drive=hd2,bootindex=1 \

  Then incorrect values are given:

  $ lsblk
  ...
  sdb 8:16   0   256M  0 disk
  └─sdb1  8:17   0   255M  0 part /mnt

  $ cat /sys/dev/block/8:16/discard_alignment
  0
  $ cat /sys/dev/block/8:16/queue/discard_granularity
  4096
  $ cat /sys/dev/block/8:16/queue/discard_max_bytes
  1073741824
  $ cat /sys/dev/block/8:16/queue/discard_max_hw_bytes
  1073741824
  $ cat /sys/dev/block/8:16/queue/discard_zeroes_data
  0

  $ cat /sys/dev/block/8:17/discard_alignment
  133169152

  And, there isn't even a /sys/dev/block/8:17/queue directory.
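
As a rough cross-check of the numbers above, here is a sketch of how a partition's discard_alignment could be derived. It assumes the kernel computes it in 512-byte sectors as (granularity + alignment - start % granularity) % granularity. That formula is an assumption about the block layer, not something stated in this report, and the function name is hypothetical. Under that assumption the host value follows from the 2048-sector partition start and the 128 MiB granularity, while a 4096-byte granularity would give 0.

```python
SECTOR_BYTES = 512


def partition_discard_alignment(start_sector, granularity_bytes, alignment_bytes=0):
    """Hypothetical model of a partition's discard_alignment, in bytes.

    Assumed formula (not taken from the report):
        (granularity + alignment - start % granularity) % granularity
    evaluated in 512-byte sectors, then converted back to bytes.
    """
    granularity = granularity_bytes // SECTOR_BYTES
    alignment = alignment_bytes // SECTOR_BYTES
    if granularity == 0:
        return 0
    offset = start_sector % granularity
    return ((granularity + alignment - offset) % granularity) * SECTOR_BYTES


# Partition starts at sector 2048 (from the fdisk output above).
host = partition_discard_alignment(2048, 134217728)  # host granularity, 128 MiB
guest = partition_discard_alignment(2048, 4096)      # guest granularity, 4 KiB

print("host :", host)
print("guest:", guest)
```

If the assumption holds, the 133169152 reported for 8:17 would match the host's granularity rather than the 4096 reported for 8:16.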




[Qemu-devel] [Bug 1523246] Re: Virtio-blk does not support TRIM

2019-01-11 Thread James Harvey
I believe this feature was just merged by Linus about a week ago, and is
in linux 5.0-rc1:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d548e65904ae43b0637d200a2441fc94e0589c30

-- 
You received this bug notification because you are a member of
qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1523246

Title:
  Virtio-blk does not support TRIM

Status in QEMU:
  Confirmed

Bug description:
  When model=virtio is used, TRIM is not supported.

  # mount -o discard /dev/vda4 /mnt
  # mount | tail -1
  /dev/vda4 on /mnt type fuseblk (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other,blksize=4096)
  # fstrim /mnt/
  fstrim: /mnt/: the discard operation is not supported

  Booting without model=virtio allows using TRIM (in Windows as well).

  Full QEMU line:

  qemu-system-x86_64 -enable-kvm -cpu host -bios
  /usr/share/ovmf/ovmf_x64.bin -smp 2 -m 7G -vga qxl -usbdevice tablet
  -net nic,model=virtio -net user -drive
  discard=unmap,detect-zeroes=unmap,cache=none,file=vms/win10.hd.img.vmdk,format=vmdk,if=virtio




Re: [Qemu-devel] hw/input/ps2.c : BTN_SIDE and BTN_EXTRA not forwarded

2018-12-16 Thread james harvey
I think this is a spice issue.  I think anyone who has successfully
used these buttons wasn't using spice.

spice-protocol::spice/enums.h only gives SpiceMouseButton types of
LEFT, MIDDLE, RIGHT, UP, and DOWN.

spice-gtk::src/spice-widget.c::button_event() calls
button_gdk_to_spice() which also only gives the same 5 button types,
so I think winds up effectively ignoring the event and never passing
it along.
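
To illustrate what I mean, here is a small model of that lookup. It is a sketch and not the actual spice-gtk code: the table contents, the numbering of GDK buttons 1 to 5, and the name gdk_to_spice are assumptions made for illustration. The point is only that buttons 8 and 9 from the debug output have no entry, so nothing would be sent on.

```python
# Assumed mapping from GDK button numbers to the five SpiceMouseButton
# types named above. Illustrative only, not copied from spice-gtk.
SPICE_BUTTONS = {
    1: "LEFT",
    2: "MIDDLE",
    3: "RIGHT",
    4: "UP",
    5: "DOWN",
}


def gdk_to_spice(gdk_button):
    """Return the spice button name, or None if there is no mapping."""
    return SPICE_BUTTONS.get(gdk_button)


for button in (1, 8, 9):
    mapped = gdk_to_spice(button)
    if mapped is None:
        print(f"button {button}: no spice button type, event dropped")
    else:
        print(f"button {button}: forwarded as {mapped}")
```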

Off to the spice guys.

On Sun, Dec 16, 2018 at 5:10 PM james harvey  wrote:
>
> Running "remote-viewer --spice-debug" does show events for pushing the
> scroll wheel left and right:
>
> (remote-viewer:14226): GSpice-DEBUG: 17:09:00.043: spice-widget.c:2007
> 0:0 button_event press: button 8, state 0x10
> (remote-viewer:14226): GSpice-DEBUG: 17:09:00.414: spice-widget.c:2007
> 0:0 button_event release: button 8, state 0x10
> (remote-viewer:14226): GSpice-DEBUG: 17:09:01.045: spice-widget.c:2007
> 0:0 button_event press: button 9, state 0x10
> (remote-viewer:14226): GSpice-DEBUG: 17:09:01.473: spice-widget.c:2007
> 0:0 button_event release: button 9, state 0x10
>
> On Sun, Dec 16, 2018 at 5:00 PM james harvey  wrote:
> >
> > I didn't make it very clear that my Arch Linux guest didn't have tablet
> > emulation.  On it, evtest only shows "ImExPS/2 Generic Explorer
> > Mouse".  On its monitor, "info mice" shows:
> >
> > * Mouse #2: QEMU PS/2 Mouse
> >
> > On the Windows 7 guest, you're right:
> >
> >   Mouse #2: QEMU PS/2 Mouse
> > * Mouse #3: QEMU HID Tablet (absolute)
> >
> > After restarting qemu without the tablet option it only shows:
> >
> > * Mouse #2: QEMU PS/2 Mouse
> >
> > But, the SIDE and EXTRA buttons still don't work, with Device Manager
> > using "Microsoft PS/2 Mouse".
> >
> > Not knowing enough to know if the SPICE Guest Tools or QEMU guest
> > agent could be affecting things, I uninstalled those from the Windows
> > 7 guest, so the mouse is grabbed by the client.  The extra buttons
> > still don't work.  In the Arch guest, I uninstalled spice-vdagent,
> > which also didn't help.
> >
> > I ran across "-device virtio-mouse-pci", so tried adding that to my
> > Arch guest, and it does become used over the PS/2, but the SIDE and
> > EXTRA events still aren't working.  In the guest, evtest shows "QEMU
> > Virtio Mouse" and shows these buttons as supported events, and
> > hw/input/virtio-input-hid.c maps the SIDE and EXTRA buttons, so I'm
> > thinking it would be expected for it to work this way as well.  Since
> > it isn't, I'm thinking that might indicate the problem is a step
> > closer to the physical hardware, from
> > hw/input/{ps2,virtio-input-hid}.c.
> >
> > What is it that should send the events to the hw/input code?  Could it
> > be that remote-viewer isn't listening to the physical hardware events
> > for these buttons?  Or, is there an area within QEMU that should be
> > receiving the events?  Or could this be a spice issue?  I'm not sure
> > how these different parts interact for mouse events, or how to break
> > down where this is happening.
> >
> > Also trying through libvirt rather than qemu directly, and using the
> > virt-manager interface acts the same way.  I'm guessing that's using
> > remote-viewer, just embedded in the libvirt window, but I'm not
> > positive on that.
> >
> > On Sun, Dec 16, 2018 at 1:48 PM Fabian Lesniak  wrote:
> > >
> > > Probably the PS/2 mouse is not used at all because the HID Tablet takes
> > > precedence. By entering "info mice" on the monitor console you can see
> > > which mouse is currently used. If you disable or uninstall the HID
> > > Tablet, the PS/2 mouse should take over.
> > >
> > > "IMEX" is short for Intelli Mouse Explorer.
> > >
> > > Am 16.12.18 um 06:40 schrieb james harvey:
> > > > Running qemu 3.1.0.  virt-viewer 7.0.  spice, spice-gtk, and
> > > > spice-protocol all git versions from the past week or so.
> > > >
> > > > I have a Logitech G600 mouse.  The scroll wheel can be pushed left or right.
> > > >
> > > > On Arch Linux host, "evtest" shows these as event codes 275 (BTN_SIDE)
> > > > and 276 (BTN_EXTRA.)  In host, they work as expected, by default as
> > > > back and forward in supporting programs such as web browsers.
> > > >
> > > > On Arch Linux guest, "evtest" shows these events as supported for the
> > > > "ImExPS/2 Generic Explorer Mouse", but it doe

Re: [Qemu-devel] hw/input/ps2.c : BTN_SIDE and BTN_EXTRA not forwarded

2018-12-16 Thread james harvey
Running "remote-viewer --spice-debug" does show events for pushing the
scroll wheel left and right:

(remote-viewer:14226): GSpice-DEBUG: 17:09:00.043: spice-widget.c:2007
0:0 button_event press: button 8, state 0x10
(remote-viewer:14226): GSpice-DEBUG: 17:09:00.414: spice-widget.c:2007
0:0 button_event release: button 8, state 0x10
(remote-viewer:14226): GSpice-DEBUG: 17:09:01.045: spice-widget.c:2007
0:0 button_event press: button 9, state 0x10
(remote-viewer:14226): GSpice-DEBUG: 17:09:01.473: spice-widget.c:2007
0:0 button_event release: button 9, state 0x10

On Sun, Dec 16, 2018 at 5:00 PM james harvey  wrote:
>
> I didn't make it very clear that my Arch Linux guest didn't have tablet
> emulation.  On it, evtest only shows "ImExPS/2 Generic Explorer
> Mouse".  On its monitor, "info mice" shows:
>
> * Mouse #2: QEMU PS/2 Mouse
>
> On the Windows 7 guest, you're right:
>
>   Mouse #2: QEMU PS/2 Mouse
> * Mouse #3: QEMU HID Tablet (absolute)
>
> After restarting qemu without the tablet option it only shows:
>
> * Mouse #2: QEMU PS/2 Mouse
>
> But, the SIDE and EXTRA buttons still don't work, with Device Manager
> using "Microsoft PS/2 Mouse".
>
> Not knowing enough to know if the SPICE Guest Tools or QEMU guest
> agent could be affecting things, I uninstalled those from the Windows
> 7 guest, so the mouse is grabbed by the client.  The extra buttons
> still don't work.  In the Arch guest, I uninstalled spice-vdagent,
> which also didn't help.
>
> I ran across "-device virtio-mouse-pci", so tried adding that to my
> Arch guest, and it does become used over the PS/2, but the SIDE and
> EXTRA events still aren't working.  In the guest, evtest shows "QEMU
> Virtio Mouse" and shows these buttons as supported events, and
> hw/input/virtio-input-hid.c maps the SIDE and EXTRA buttons, so I'm
> thinking it would be expected for it to work this way as well.  Since
> it isn't, I'm thinking that might indicate the problem is a step
> closer to the physical hardware, from
> hw/input/{ps2,virtio-input-hid}.c.
>
> What is it that should send the events to the hw/input code?  Could it
> be that remote-viewer isn't listening to the physical hardware events
> for these buttons?  Or, is there an area within QEMU that should be
> receiving the events?  Or could this be a spice issue?  I'm not sure
> how these different parts interact for mouse events, or how to break
> down where this is happening.
>
> Also trying through libvirt rather than qemu directly, and using the
> virt-manager interface acts the same way.  I'm guessing that's using
> remote-viewer, just embedded in the libvirt window, but I'm not
> positive on that.
>
> On Sun, Dec 16, 2018 at 1:48 PM Fabian Lesniak  wrote:
> >
> > Probably the PS/2 mouse is not used at all because the HID Tablet takes
> > precedence. By entering "info mice" on the monitor console you can see
> > which mouse is currently used. If you disable or uninstall the HID
> > Tablet, the PS/2 mouse should take over.
> >
> > "IMEX" is short for Intelli Mouse Explorer.
> >
> > Am 16.12.18 um 06:40 schrieb james harvey:
> > > Running qemu 3.1.0.  virt-viewer 7.0.  spice, spice-gtk, and
> > > spice-protocol all git versions from the past week or so.
> > >
> > > I have a Logitech G600 mouse.  The scroll wheel can be pushed left or right.
> > >
> > > On Arch Linux host, "evtest" shows these as event codes 275 (BTN_SIDE)
> > > and 276 (BTN_EXTRA.)  In host, they work as expected, by default as
> > > back and forward in supporting programs such as web browsers.
> > >
> > > On Arch Linux guest, "evtest" shows these events as supported for the
> > > "ImExPS/2 Generic Explorer Mouse", but it doesn't show those events as
> > > happening when I push the scroll wheel left or right.  Other events
> > > work fine.
> > >
> > > On Windows 7 guest, there's no effect from pushing the scroll wheel
> > > left or right, either.
> > >
> > > I'm happy to help debug where the event forwarding is breaking down,
> > > but have no idea how to do that.
> > >
> > >
> > > Patch v1 for these buttons from Nov 24, 2016:
> > > https://www.mail-archive.com/qemu-devel@nongnu.org/msg415246.html
> > >
> > > Patch v2 from Nov 28, 2016:
> > > https://www.mail-archive.com/qemu-devel@nongnu.org/msg415690.html
> > >
> > > Patch v3 from Dec 6, 2016:
> > > https://www.mail-archive.com/qemu-devel@nongnu.org/msg417007.html
> > >
> > >
> > &

Re: [Qemu-devel] hw/input/ps2.c : BTN_SIDE and BTN_EXTRA not forwarded

2018-12-16 Thread james harvey
I didn't make it very clear that my Arch Linux guest didn't have tablet
emulation.  On it, evtest only shows "ImExPS/2 Generic Explorer
Mouse".  On its monitor, "info mice" shows:

* Mouse #2: QEMU PS/2 Mouse

On the Windows 7 guest, you're right:

  Mouse #2: QEMU PS/2 Mouse
* Mouse #3: QEMU HID Tablet (absolute)

After restarting qemu without the tablet option it only shows:

* Mouse #2: QEMU PS/2 Mouse

But, the SIDE and EXTRA buttons still don't work, with Device Manager
using "Microsoft PS/2 Mouse".

Not knowing enough to know if the SPICE Guest Tools or QEMU guest
agent could be affecting things, I uninstalled those from the Windows
7 guest, so the mouse is grabbed by the client.  The extra buttons
still don't work.  In the Arch guest, I uninstalled spice-vdagent,
which also didn't help.

I ran across "-device virtio-mouse-pci", so tried adding that to my
Arch guest, and it does become used over the PS/2, but the SIDE and
EXTRA events still aren't working.  In the guest, evtest shows "QEMU
Virtio Mouse" and shows these buttons as supported events, and
hw/input/virtio-input-hid.c maps the SIDE and EXTRA buttons, so I'm
thinking it would be expected for it to work this way as well.  Since
it isn't, I'm thinking that might indicate the problem is a step
closer to the physical hardware, from
hw/input/{ps2,virtio-input-hid}.c.

What is it that should send the events to the hw/input code?  Could it
be that remote-viewer isn't listening to the physical hardware events
for these buttons?  Or, is there an area within QEMU that should be
receiving the events?  Or could this be a spice issue?  I'm not sure
how these different parts interact for mouse events, or how to break
down where this is happening.

Also trying through libvirt rather than qemu directly, and using the
virt-manager interface acts the same way.  I'm guessing that's using
remote-viewer, just embedded in the libvirt window, but I'm not
positive on that.

On Sun, Dec 16, 2018 at 1:48 PM Fabian Lesniak  wrote:
>
> Probably the PS/2 mouse is not used at all because the HID Tablet takes
> precedence. By entering "info mice" on the monitor console you can see
> which mouse is currently used. If you disable or uninstall the HID
> Tablet, the PS/2 mouse should take over.
>
> "IMEX" is short for Intelli Mouse Explorer.
>
> Am 16.12.18 um 06:40 schrieb james harvey:
> > Running qemu 3.1.0.  virt-viewer 7.0.  spice, spice-gtk, and
> > spice-protocol all git versions from the past week or so.
> >
> > I have a Logitech G600 mouse.  The scroll wheel can be pushed left or right.
> >
> > On Arch Linux host, "evtest" shows these as event codes 275 (BTN_SIDE)
> > and 276 (BTN_EXTRA.)  In host, they work as expected, by default as
> > back and forward in supporting programs such as web browsers.
> >
> > On Arch Linux guest, "evtest" shows these events as supported for the
> > "ImExPS/2 Generic Explorer Mouse", but it doesn't show those events as
> > happening when I push the scroll wheel left or right.  Other events
> > work fine.
> >
> > On Windows 7 guest, there's no effect from pushing the scroll wheel
> > left or right, either.
> >
> > I'm happy to help debug where the event forwarding is breaking down,
> > but have no idea how to do that.
> >
> >
> > Patch v1 for these buttons from Nov 24, 2016:
> > https://www.mail-archive.com/qemu-devel@nongnu.org/msg415246.html
> >
> > Patch v2 from Nov 28, 2016:
> > https://www.mail-archive.com/qemu-devel@nongnu.org/msg415690.html
> >
> > Patch v3 from Dec 6, 2016:
> > https://www.mail-archive.com/qemu-devel@nongnu.org/msg417007.html
> >
> >
> > The v1 notes say: 'Note that the guest has to switch the ps2 mouse
> > into IMEX mode, otherwise events of the extra buttons are ignored. For
> > example on a Windows guest one needs to manually select the "Microsoft
> > PS/2 Mouse" driver.'
> >
> > I'll admit I'm not sure what IMEX mode is.  QEMU is providing the PS/2
> > mouse emulation by default, and I don't see a way to give qemu options
> > for it.
> >
> > Regardless, following this note's instructions for "IMEX mode", in
> > Windows 7 guest, changing the driver from the default "Microsoft -
> > PS/2 Compatible Mouse" to "Microsoft - Microsoft PS/2 Mouse" and
> > rebooting guest has no effect.  The extra buttons still don't work.
> >
> > Windows 7 Device Manager does show 2 "Mice and other pointing
> > devices".  First is "HID-compliant mouse"
> > (HID\VID_0627_0001_) which shows it's USB, so I'm guessing
> > that's the absolute movement EvTouch USB Graphics Tablet.  Second is
> > the PS/2 - currently set to "Microsoft PS/2 Mouse" (ACPI\PNP0F13).



[Qemu-devel] hw/input/ps2.c : BTN_SIDE and BTN_EXTRA not forwarded

2018-12-15 Thread james harvey
Running qemu 3.1.0.  virt-viewer 7.0.  spice, spice-gtk, and
spice-protocol all git versions from the past week or so.

I have a Logitech G600 mouse.  The scroll wheel can be pushed left or right.

On Arch Linux host, "evtest" shows these as event codes 275 (BTN_SIDE)
and 276 (BTN_EXTRA).  In host, they work as expected, by default as
back and forward in supporting programs such as web browsers.

On Arch Linux guest, "evtest" shows these events as supported for the
"ImExPS/2 Generic Explorer Mouse", but it doesn't show those events as
happening when I push the scroll wheel left or right.  Other events
work fine.

On Windows 7 guest, there's no effect from pushing the scroll wheel
left or right, either.

I'm happy to help debug where the event forwarding is breaking down,
but have no idea how to do that.


Patch v1 for these buttons from Nov 24, 2016:
https://www.mail-archive.com/qemu-devel@nongnu.org/msg415246.html

Patch v2 from Nov 28, 2016:
https://www.mail-archive.com/qemu-devel@nongnu.org/msg415690.html

Patch v3 from Dec 6, 2016:
https://www.mail-archive.com/qemu-devel@nongnu.org/msg417007.html


The v1 notes say: 'Note that the guest has to switch the ps2 mouse
into IMEX mode, otherwise events of the extra buttons are ignored. For
example on a Windows guest one needs to manually select the "Microsoft
PS/2 Mouse" driver.'

I'll admit I'm not sure what IMEX mode is.  QEMU is providing the PS/2
mouse emulation by default, and I don't see a way to give qemu options
for it.

Regardless, following this note's instructions for "IMEX mode", in
Windows 7 guest, changing the driver from the default "Microsoft -
PS/2 Compatible Mouse" to "Microsoft - Microsoft PS/2 Mouse" and
rebooting guest has no effect.  The extra buttons still don't work.

Windows 7 Device Manager does show 2 "Mice and other pointing
devices".  First is "HID-compliant mouse"
(HID\VID_0627_0001_) which shows it's USB, so I'm guessing
that's the absolute movement EvTouch USB Graphics Tablet.  Second is
the PS/2 - currently set to "Microsoft PS/2 Mouse" (ACPI\PNP0F13).