Re: virtlogd spinning on 100% CPU with the latest libvirt

2020-02-18 Thread Ján Tomko

On Tue, Feb 18, 2020 at 08:34:53PM +0100, Andrea Bolognani wrote:

[Dropped Peter from CC. Please don't CC individual developers
unless they've explicitly requested that you do so.]

On Mon, 2020-02-17 at 17:50 +, Richard W.M. Jones wrote:

Build libvirt from git (ccf7567329f).

Using the libvirt ‘run’ script, run something like
libguestfs-test-tool.  I think basically any command which runs a
guest will do.  NB These commands are all run as NON-root:

  killall libvirtd lt-libvirtd virtlogd lt-virtlogd
  ./build/run libguestfs-test-tool

Now there will be a lt-virtlogd process using 100% of CPU:

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
2572972 rjones    20   0   47880  16256  14516 R 100.0   0.1   0:19.27 lt-virt+


It's actually worse than that: not only does virtlogd use an
unwarranted amount of CPU, but it also keeps the log file for the
domain busy, thus preventing the same domain from being started
again:

 $ virsh start alpine
 Domain alpine started

 $ virsh destroy alpine
 Domain alpine destroyed

 $ virsh start alpine
 error: Failed to start domain alpine
 error: can't connect to virtlogd: Cannot open log file: '/var/log/libvirt/qemu/alpine.log': Device or resource busy

 $ sudo lsof | grep alpine.log
 virtlogd  146845  root   16w  REG  253,0   35103   17195654 /var/log/libvirt/qemu/alpine.log
 $

Restarting virtlogd makes things operational again:

 $ sudo systemctl restart virtlogd
 $ virsh start alpine
 Domain alpine started

 $

My guess is that virtlogd doesn't realize the QEMU process is gone,
and sits there spinning forever waiting for some output that will
never arrive.


Yeah, since the switch to the GLib event loop, virLogHandlerDomainLogFileEvent,
which was supposed to clean that up, is no longer called on hangup.

My naive fix:
https://www.redhat.com/archives/libvir-list/2020-February/msg00651.html
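
For readers unfamiliar with the failure mode, here is a generic sketch of how
an unhandled hangup on a polled file descriptor turns into a busy loop. It is
not libvirt code and not the patch linked above: it uses a plain pipe and
Python's select.poll on Linux, and count_wakeups with its parameters is a
hypothetical name made up for this illustration. Once the writer is gone,
poll() reports POLLHUP on every call, so a loop that never unregisters the
descriptor wakes up immediately, over and over.

```python
import os
import select


def count_wakeups(handle_hangup, max_iterations=1000):
    """Poll the read end of a pipe whose writer has already gone away.

    Returns how many times poll() woke the loop before it stopped.
    (Hypothetical helper for illustration only.)
    """
    read_fd, write_fd = os.pipe()
    os.close(write_fd)  # stands in for the QEMU process exiting

    poller = select.poll()
    poller.register(read_fd, select.POLLIN)

    wakeups = 0
    try:
        while wakeups < max_iterations:
            events = poller.poll(100)  # timeout in milliseconds
            if not events:
                break  # nothing pending; an idle loop would sleep here
            wakeups += 1
            _, revents = events[0]
            if handle_hangup and revents & (select.POLLHUP | select.POLLERR):
                # Cleanup path: stop watching the dead descriptor.
                poller.unregister(read_fd)
                break
            # No cleanup: the descriptor stays registered and poll()
            # returns again immediately on the next iteration.
    finally:
        os.close(read_fd)
    return wakeups


if __name__ == "__main__":
    print("with hangup handling:   ", count_wakeups(True))
    print("without hangup handling:", count_wakeups(False))
```

The max_iterations cap exists only so the demonstration terminates; a real
daemon without such a cap would spin at 100% CPU, as in the report.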

Jano




Re: virtlogd spinning on 100% CPU with the latest libvirt

2020-02-18 Thread Andrea Bolognani
[Dropped Peter from CC. Please don't CC individual developers
 unless they've explicitly requested that you do so.]

On Mon, 2020-02-17 at 17:50 +, Richard W.M. Jones wrote:
> Build libvirt from git (ccf7567329f).
> 
> Using the libvirt ‘run’ script, run something like
> libguestfs-test-tool.  I think basically any command which runs a
> guest will do.  NB These commands are all run as NON-root:
> 
>   killall libvirtd lt-libvirtd virtlogd lt-virtlogd
>   ./build/run libguestfs-test-tool
> 
> Now there will be a lt-virtlogd process using 100% of CPU:
> 
>     PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
> 2572972 rjones    20   0   47880  16256  14516 R 100.0   0.1   0:19.27 lt-virt+

It's actually worse than that: not only does virtlogd use an
unwarranted amount of CPU, but it also keeps the log file for the
domain busy, thus preventing the same domain from being started
again:

  $ virsh start alpine
  Domain alpine started

  $ virsh destroy alpine
  Domain alpine destroyed

  $ virsh start alpine
  error: Failed to start domain alpine
  error: can't connect to virtlogd: Cannot open log file: '/var/log/libvirt/qemu/alpine.log': Device or resource busy

  $ sudo lsof | grep alpine.log
  virtlogd  146845  root   16w  REG  253,0   35103   17195654 /var/log/libvirt/qemu/alpine.log
  $
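
If lsof is not installed, roughly the same check can be done by walking /proc.
The following is a minimal sketch, assuming a Linux /proc layout; find_holders
is a hypothetical helper name, not part of libvirt or any existing tool. It
returns (pid, fd) pairs for every process that has the given path open.

```python
import os


def find_holders(path, proc_root="/proc"):
    """Return a list of (pid, fd) pairs for processes holding `path` open.

    Hypothetical helper; relies on the Linux /proc/<pid>/fd symlinks.
    """
    target = os.path.realpath(path)
    holders = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission to inspect it
        for fd in fds:
            try:
                if os.readlink(os.path.join(fd_dir, fd)) == target:
                    holders.append((int(entry), int(fd)))
            except OSError:
                continue  # descriptor was closed while we were scanning
    return holders


if __name__ == "__main__":
    for pid, fd in find_holders("/var/log/libvirt/qemu/alpine.log"):
        print(f"pid {pid} holds fd {fd}")
```

As with the lsof invocation above, this needs to run as root to see the
descriptors of a system-wide virtlogd.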

Restarting virtlogd makes things operational again:

  $ sudo systemctl restart virtlogd
  $ virsh start alpine
  Domain alpine started

  $

My guess is that virtlogd doesn't realize the QEMU process is gone,
and sits there spinning forever waiting for some output that will
never arrive.

-- 
Andrea Bolognani / Red Hat / Virtualization