[Bug 1905562] Re: Guest seems suspended after host freed memory for it using oom-killer

2021-07-09 Thread Launchpad Bug Tracker
[Expired for QEMU because there has been no activity for 60 days.]

** Changed in: qemu
   Status: Incomplete => Expired

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1905562

Title:
  Guest seems suspended after host freed memory for it using oom-killer

Status in QEMU:
  Expired

Bug description:
  Host: qemu 5.1.0, linux 5.5.13
  Guest: Windows 7 64-bit

  This guest ran a memory-intensive process and triggered the
  oom-killer on the host.  Luckily, the oom-killer killed chromium
  rather than qemu, so my understanding is that qemu should have
  continued running unharmed.  But the spice connection shows the
  guest's system clock stuck at the exact time the oom-killer was
  triggered, and the guest is completely unresponsive.

  I can telnet to the qemu monitor, and "info status" shows "running".
  But running "info registers -a" several times and saving the output
  to text files shows the registers are 100% unchanged, so the guest
  isn't really running.
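
  (Roughly how I compared the dumps, as a sketch; the monitor on
  telnet port 4444 is just an example, and nc flags vary between
  netcat variants:)

  echo 'info registers -a' | nc -q1 localhost 4444 > regs1.txt
  sleep 5
  echo 'info registers -a' | nc -q1 localhost 4444 > regs2.txt
  diff regs1.txt regs2.txt   # empty diff => the vCPUs made no progress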

  On the host, top shows around 4% CPU usage by qemu.  strace shows
  this group of 6 calls repeating about 1,000 times a second:

  0.000698 ioctl(18, KVM_IRQ_LINE_STATUS, 0x7fff1f030c10) = 0 <0.10>
  0.34 ioctl(18, KVM_IRQ_LINE_STATUS, 0x7fff1f030c60) = 0 <0.09>
  0.31 ioctl(18, KVM_IRQ_LINE_STATUS, 0x7fff1f030c20) = 0 <0.07>
  0.28 ioctl(18, KVM_IRQ_LINE_STATUS, 0x7fff1f030c70) = 0 <0.07>
  0.30 ppoll([{fd=4, events=POLLIN}, {fd=6, events=POLLIN},
    {fd=7, events=POLLIN}, {fd=8, events=POLLIN}, {fd=9, events=POLLIN},
    {fd=11, events=POLLIN}, {fd=16, events=POLLIN}, {fd=32, events=POLLIN},
    {fd=34, events=POLLIN}, {fd=39, events=POLLIN}, {fd=40, events=POLLIN},
    {fd=41, events=POLLIN}, {fd=42, events=POLLIN}, {fd=43, events=POLLIN},
    {fd=44, events=POLLIN}, {fd=45, events=POLLIN}], 16,
    {tv_sec=0, tv_nsec=0}, NULL, 8) = 0 (Timeout) <0.09>
  0.43 ppoll([{fd=4, events=POLLIN}, {fd=6, events=POLLIN},
    {fd=7, events=POLLIN}, {fd=8, events=POLLIN}, {fd=9, events=POLLIN},
    {fd=11, events=POLLIN}, {fd=16, events=POLLIN}, {fd=32, events=POLLIN},
    {fd=34, events=POLLIN}, {fd=39, events=POLLIN}, {fd=40, events=POLLIN},
    {fd=41, events=POLLIN}, {fd=42, events=POLLIN}, {fd=43, events=POLLIN},
    {fd=44, events=POLLIN}, {fd=45, events=POLLIN}], 16,
    {tv_sec=0, tv_nsec=769662}, NULL, 8) = 0 (Timeout) <0.000788>
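
  (For the record, the trace was captured with something along these
  lines; the process name match is just an example:)

  # -r/-T add relative timestamps and per-call durations, -f follows threads
  strace -f -r -T -e trace=ioctl,ppoll \
         -p "$(pgrep -f qemu-system | head -n1)" -o qemu-strace.log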

  In the monitor, "info irq" shows the count for IRQ 0 increasing
  about 1,000 times a second.  IRQ 0 appears to be the system timer,
  and 1,000 ticks a second seems to be the rate a Windows 7 guest
  might program its clock to.
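
  (One way to estimate that rate, as a sketch, sampling "info irq" the
  same way as the register dumps above:)

  echo 'info irq' | nc -q1 localhost 4444 > irq1.txt
  sleep 10
  echo 'info irq' | nc -q1 localhost 4444 > irq2.txt
  diff irq1.txt irq2.txt   # change in the IRQ 0 count / 10 = rate per second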

  Those fds are: 9 x [eventfd], 1 x [signalfd], a type=STREAM socket,
  4 x the spice socket file, and "TCP localhost:ftnmtp->localhost:36566
  (ESTABLISHED)", which accounts for the 16 fds in the ppoll set.

  Because the guest's registers aren't changing, it seems to me that
  the monitor thinks the VM is running while it is actually, in
  effect, paused.  I think all the strace activity shown above must be
  generated on the host side: perhaps qemu is repeatedly trying to
  inject the next clock interrupt into the guest and to communicate
  with it over the various eventfds, the spice socket, etc.  So I
  suspect the strace output doesn't reveal the real reason why the VM
  is acting as if it's paused.

  I've checked "info block", and nothing shows a device being paused
  or having any other issue.  (I can't remember the exact term that
  can appear there, but I think a paused/blocked block device caused a
  VM to act like this for me in the past.)
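
  (For reference, the monitor commands involved; "info block-jobs" is
  just an additional check that might be worth running:)

  (qemu) info block        # per-device state, including any I/O status flag
  (qemu) info block-jobs   # progress/state of background block jobs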

  
  Is there something I can provide to help fix the bug here?

  Is there something I can do to try to get the VM running again?  (I
  sadly have unsaved work in it.)
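
  (Hedged ideas only, in case someone can confirm whether they are
  safe here; I'd first try to save the state to a file and then poke
  the vCPUs, roughly:)

  (qemu) migrate "exec:cat > /tmp/vmstate.bin"   # snapshot the VM state first
  (qemu) stop                                    # pause the vCPUs explicitly
  (qemu) cont                                    # then try resuming them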

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1905562/+subscriptions



[Bug 1905562] Re: Guest seems suspended after host freed memory for it using oom-killer

2021-05-09 Thread Thomas Huth
The QEMU project is currently moving its bug tracking to another system.
For this we need to know which bugs are still valid and which could be
closed already. Thus we are setting the bug state to "Incomplete" now.

If the bug has already been fixed in the latest upstream version of QEMU,
then please close this ticket as "Fix released".

If it is not fixed yet and you think that this bug report here is still
valid, then you have two options:

1) If you already have an account on gitlab.com, please open a new ticket
for this problem in our new tracker here:

https://gitlab.com/qemu-project/qemu/-/issues

and then close this ticket here on Launchpad (or let it expire
automatically after 60 days). Please mention the URL of this bug ticket
on Launchpad in the new ticket on GitLab.

2) If you don't have an account on gitlab.com and don't intend to get
one, but would still like to keep this ticket open, then please switch
the state back to "New" or "Confirmed" within the next 60 days (otherwise
it will get closed as "Expired"). We will then eventually migrate the
ticket automatically to the new system (but you won't be the reporter of
the bug in the new system and thus you won't get notified of changes
anymore).

Thank you and sorry for the inconvenience.


** Changed in: qemu
   Status: New => Incomplete


[Bug 1905562] Re: Guest seems suspended after host freed memory for it using oom-killer

2020-11-25 Thread James Harvey
Am I correct to expect the VM to continue running successfully after
the oom-killer has freed up memory?  The attached journalctl output
does show a call trace that includes "vmx_vmexit"; I'm not sure what
that function is for, but it looks a little worrisome.

** Attachment added: "section of journalctl from host, showing oom-killer"
   
https://bugs.launchpad.net/qemu/+bug/1905562/+attachment/5437889/+files/journalctl.oom-killer
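
(A sketch of how such a journal section can be pulled out on the host;
the time window is just an example:)

  journalctl -k --since "2020-11-25 00:00" --until "2020-11-25 12:00" \
      | grep -i -B5 -A80 "out of memory" > journalctl.oom-killer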
