Re: [libvirt] Reporting log/error messages through capabilities

2014-02-19 Thread Michal Privoznik

On 19.02.2014 00:11, Richard W.M. Jones wrote:

When qemu is completely broken, libvirtd starts up OK but exists in a
kind of broken state where no guests can possibly be run.  I hit this
problem ... again ... today:

https://bugzilla.redhat.com/show_bug.cgi?id=1066630#c0

There is a libvirt bug here, which is that it's very hard to diagnose
what is going on when qemu fails to work at all.  The logging system
in libvirt(d) is tremendously powerful, but ultimately confusing to
use, and requires users to edit config files which makes it a
non-starter for programs using libvirt through the API [1].

From a libguestfs point of view, it's impossible for us to report back
to the user that there is a problem two layers below in qemu.

So my idea is that libvirt capabilities output should have an <info>
section containing log messages/errors.

  <capabilities>
    ...
    <info>
      Could not run qemu-system-blah:
      symbol lookup error: /usr/bin/qemu-system-s390x: undefined symbol: glfs_discard_async
    </info>
  </capabilities>


This makes sense, although we should make it more versatile to 
distinguish different qemu targets. For example -s390x can be missing a 
symbol, while -x86_64 can be missing a shared library, or have denied 
access somewhere, whatever. If that's the case, we should be able to 
report errors independently.
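
A minimal sketch of what per-binary reporting could look like, in Python rather than libvirt's own C. Everything here is an assumption for illustration: the helper name probe_emulators is hypothetical, and running each binary with --version is only a stand-in for however libvirt really probes QEMU. The point is just that each target gets its own error (or none).

```python
import subprocess


def probe_emulators(binaries, args=("--version",), timeout=10):
    """Run each binary once; return {path: error text, or None on success}."""
    results = {}
    for path in binaries:
        try:
            proc = subprocess.run(
                [path, *args],
                stdout=subprocess.PIPE,
                stderr=subprocess.PIPE,
                universal_newlines=True,
                timeout=timeout,
            )
        except (OSError, subprocess.TimeoutExpired) as exc:
            # Missing binary, permission denied, or a hang.
            results[path] = str(exc)
            continue
        if proc.returncode != 0:
            # e.g. "symbol lookup error: ... undefined symbol: ..."
            results[path] = proc.stderr.strip() or "exit status %d" % proc.returncode
        else:
            results[path] = None
    return results
```

With this shape, a broken -s390x binary and a missing -x86_64 library show up as two separate entries instead of one merged message.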




Libguestfs queries for libvirt capabilities anyway.  If we don't get a
satisfactory set of <guest/> elements, then we could list out the
<info/> section.  Easy for us.

The problem is the <info/> element hardly fits into capabilities.  If
we didn't put it there, could it go some other place?  Or a new API?

Are there other unanticipated problems here?  I think one is that
libvirt doesn't appear to collect detailed log information by default,
(unless the user edits log_level).  That's assuming I understand the
code correctly.  Personally I think libvirt should always collect
debug information, because you never know when it could be useful, but
for the above, collecting errors & warnings unconditionally is
sufficient.

Rich.




[1] By the way, this is a general complaint about libvirt.  Please
DON'T add any more stuff to the configuration file.  Everything should
be configurable through the API, or not at all.  There are two other
settings I can think of that libguestfs would like to adjust but
cannot because they are only available in a configuration file.



This will all be solved by the administration module, once we implement it.
I don't know of anybody working on it, though.


Michal

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] Reporting log/error messages through capabilities

2014-02-19 Thread Kashyap Chamarthy
On Tue, Feb 18, 2014 at 11:11:10PM +0000, Richard W.M. Jones wrote:

[. . .]

 Are there other unanticipated problems here?  I think one is that
 libvirt doesn't appear to collect detailed log information by default,
 (unless the user edits log_level).  That's assuming I understand the
 code correctly.  Personally I think libvirt should always collect
 debug information, 

Interestingly, there was recently a similar debate[1] in OpenStack
upstream and RDO land.  A CI fix landed to enable libvirt debug logs on
the OpenStack Jenkins machines; after a discussion on the list it was
reverted, and the outcome was to use these filters, per Dan's suggestion:

  log_filters=1:libvirt 1:qemu 1:conf 1:security 3:event 3:json 3:file 1:util
  log_outputs=1:file:/var/log/libvirt/libvirtd.log
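
To make the filter string easier to read, here is a small Python sketch that splits it into (match, level) pairs. The level numbering (1 debug, 2 info, 3 warning, 4 error) is libvirt's; the function name parse_log_filters is hypothetical and this is not how libvirtd itself parses the setting.

```python
# libvirt log levels: lower number means more verbose.
LEVELS = {1: "debug", 2: "info", 3: "warning", 4: "error"}


def parse_log_filters(value):
    """Split 'LEVEL:MATCH LEVEL:MATCH ...' into (match, level name) pairs."""
    filters = []
    for item in value.split():
        level, _, match = item.partition(":")
        filters.append((match, LEVELS[int(level)]))
    return filters


filters = parse_log_filters(
    "1:libvirt 1:qemu 1:conf 1:security 3:event 3:json 3:file 1:util"
)
for match, level in filters:
    print("%-10s %s" % (match, level))
```

Read this way, the setting keeps debug output for the libvirt, qemu, conf, security and util sources, while event, json and file are held at warning level.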


 [1] https://bugzilla.redhat.com/show_bug.cgi?id=1061753


-- 
/kashyap



Re: [libvirt] Reporting log/error messages through capabilities

2014-02-19 Thread Daniel P. Berrange
On Tue, Feb 18, 2014 at 11:11:10PM +0000, Richard W.M. Jones wrote:
 When qemu is completely broken, libvirtd starts up OK but exists in a
 kind of broken state where no guests can possibly be run.  I hit this
 problem ... again ... today:
 
 https://bugzilla.redhat.com/show_bug.cgi?id=1066630#c0
 
 There is a libvirt bug here, which is that it's very hard to diagnose
 what is going on when qemu fails to work at all.  The logging system
 in libvirt(d) is tremendously powerful, but ultimately confusing to
 use, and requires users to edit config files which makes it a
 non-starter for programs using libvirt through the API [1].
 
 From a libguestfs point of view, it's impossible for us to report back
 to the user that there is a problem two layers below in qemu.
 
 So my idea is that libvirt capabilities output should have an <info>
 section containing log messages/errors.
 
   <capabilities>
     ...
     <info>
       Could not run qemu-system-blah:
       symbol lookup error: /usr/bin/qemu-system-s390x: undefined symbol: glfs_discard_async
     </info>
   </capabilities>
 
 Libguestfs queries for libvirt capabilities anyway.  If we don't get a
 satisfactory set of <guest/> elements, then we could list out the
 <info/> section.  Easy for us.
 
 The problem is the <info/> element hardly fits into capabilities.  If
 we didn't put it there, could it go some other place?  Or a new API?

Yeah, I don't really like the idea of doing this in the capabilities
XML. 

I'm not even really convinced this should be in the API at all in fact.

What we could usefully do in libvirt though is to log a structured
message to the systemd journal when we find a QEMU binary that we
fail to extract capabilities from. Apps that care about it could
directly query the journal for the precise well-known log UUID.
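
A rough sketch of the application side, in Python. It assumes journalctl is available and that libvirt would publish a fixed MESSAGE_ID for this failure; the ID value below is a placeholder and the function names are hypothetical. MESSAGE_ID matching and --output=json (one JSON object per line) are standard journalctl features.

```python
import json

# Placeholder only: libvirt would have to define and document the real ID.
QEMU_PROBE_FAILED_ID = "00000000000000000000000000000000"


def journalctl_command(message_id):
    """Build a journalctl invocation that matches one well-known message ID."""
    return ["journalctl", "--output=json", "--no-pager", "MESSAGE_ID=" + message_id]


def parse_journal_json(output):
    """Extract the MESSAGE field from journalctl's line-delimited JSON output."""
    messages = []
    for line in output.splitlines():
        if line.strip():
            entry = json.loads(line)
            messages.append(entry.get("MESSAGE"))
    return messages
```

An app would feed the output of subprocess.check_output(journalctl_command(QEMU_PROBE_FAILED_ID), universal_newlines=True) into parse_journal_json() and show the result to the user.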

And of course it goes without saying we should never have got into
this scenario in the first place. We need better testing of QEMU
binaries to make sure such brokenness can get detected at build
time.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [libvirt] Reporting log/error messages through capabilities

2014-02-19 Thread Daniel P. Berrange
On Wed, Feb 19, 2014 at 02:01:43PM +0100, Michal Privoznik wrote:
 On 19.02.2014 00:11, Richard W.M. Jones wrote:
 [1] By the way, this is a general complaint about libvirt.  Please
 DON'T add any more stuff to the configuration file.  Everything should
 be configurable through the API, or not at all.  There are two other
 settings I can think of that libguestfs would like to adjust but
 cannot because they are only available in a configuration file.
 
 
 This all will be solved by administration module, once we implement
 it. I don't know about anybody working on it though.

Yeah, we really need to get our act together on that. I might even be
able to squeeze out some free time for this in the next few weeks. At
least to get a proof of concept working with 1 or 2 example APIs.

Daniel


Re: [libvirt] Reporting log/error messages through capabilities

2014-02-19 Thread Daniel P. Berrange
On Tue, Feb 18, 2014 at 11:11:10PM +0000, Richard W.M. Jones wrote:
 There is a libvirt bug here, which is that it's very hard to diagnose
 what is going on when qemu fails to work at all.  The logging system
 in libvirt(d) is tremendously powerful, but ultimately confusing to
 use, and requires users to edit config files which makes it a
 non-starter for programs using libvirt through the API [1].

The problem with allowing apps to change the logging config is that
it is global state, not per client. So multiple apps would conflict
in what they could do with changes here. While we could probably
make it possible for apps to register their own callback to receive
log messages, the setting of actual log levels would still be global.
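
A toy model of the conflict, in Python. It is not libvirt code; the Logger class and its methods are invented for illustration. It shows per-client callbacks sitting behind a single global level, so the last client to change the level wins for everybody.

```python
class Logger:
    """Toy model: one global level, many per-client callbacks."""

    def __init__(self, level=3):
        self.level = level      # global state shared by every client
        self.callbacks = {}     # client name -> callable(level, text)

    def register(self, client, callback):
        self.callbacks[client] = callback

    def set_level(self, level):
        self.level = level

    def log(self, level, text):
        if level < self.level:
            return              # filtered before any client sees it
        for callback in self.callbacks.values():
            callback(level, text)


logger = Logger()
seen_a, seen_b = [], []
logger.register("a", lambda lvl, txt: seen_a.append(txt))
logger.register("b", lambda lvl, txt: seen_b.append(txt))

logger.set_level(1)   # client a asks for debug output
logger.set_level(4)   # client b asks for errors only, overriding a
logger.log(1, "debug detail")
logger.log(4, "fatal error")
```

After this runs, client a has lost the debug message it asked for, even though it has its own callback.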

 [1] By the way, this is a general complaint about libvirt.  Please
 DON'T add any more stuff to the configuration file.  Everything should
 be configurable through the API, or not at all.  There are two other
 settings I can think of that libguestfs would like to adjust but
 cannot because they are only available in a configuration file.

What are the other settings you're thinking of here?  The stuff in the
global config file is primarily intended for things that globally
affect libvirt behaviour and so would not be appropriate for individual
apps to change independently.  Or for things where we want to make global
defaults tweakable, but still allow app overrides, e.g. the VNC listen
address is global, but still tweakable in XML per VM.

Regards,
Daniel


Re: [libvirt] Reporting log/error messages through capabilities

2014-02-19 Thread Richard W.M. Jones
On Wed, Feb 19, 2014 at 01:41:21PM +0000, Daniel P. Berrange wrote:
 And of course it goes without saying we should never have got into
 this scenario in the first place. We need better testing of QEMU
 binaries to make sure such brokenness can get detected at build
 time.

To be fair in this case it was because I was cherry picking packages
(qemu) from Rawhide, without updating the whole system.  Although if
gluster had symbol versioning, I guess dependencies would have pulled
in the updated glusterfs-libs package too ...

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://libguestfs.org



Re: [libvirt] Reporting log/error messages through capabilities

2014-02-19 Thread Richard W.M. Jones
On Wed, Feb 19, 2014 at 01:56:24PM +0000, Daniel P. Berrange wrote:
 On Wed, Feb 19, 2014 at 02:01:43PM +0100, Michal Privoznik wrote:
  On 19.02.2014 00:11, Richard W.M. Jones wrote:
  [1] By the way, this is a general complaint about libvirt.  Please
  DON'T add any more stuff to the configuration file.  Everything should
  be configurable through the API, or not at all.  There are two other
  settings I can think of that libguestfs would like to adjust but
  cannot because they are only available in a configuration file.
  
  
  This all will be solved by administration module, once we implement
  it. I don't know about anybody working on it though.
 
 Yeah, we really need to get our act together on that. I might even be
 able to squeeze out some free time for this in the next few weeks. At
 least to get a proof of concept working with 1 or 2 example APIs.

Is there some background reading on this feature?

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW



Re: [libvirt] Reporting log/error messages through capabilities

2014-02-19 Thread Daniel P. Berrange
On Wed, Feb 19, 2014 at 04:50:42PM +0000, Richard W.M. Jones wrote:
 On Wed, Feb 19, 2014 at 01:41:21PM +0000, Daniel P. Berrange wrote:
  And of course it goes without saying we should never have got into
  this scenario in the first place. We need better testing of QEMU
  binaries to make sure such brokenness can get detected at build
  time.
 
 To be fair in this case it was because I was cherry picking packages
 (qemu) from Rawhide, without updating the whole system.  Although if
 gluster had symbol versioning, I guess dependencies would have pulled
 in the updated glusterfs-libs package too ...

Or, in the absence of symbol versioning, QEMU's RPM spec must explicitly
declare a versioned dependency on gluster, instead of relying just
on the automatic ELF deps.

Daniel


Re: [libvirt] Reporting log/error messages through capabilities

2014-02-19 Thread Daniel P. Berrange
On Wed, Feb 19, 2014 at 04:51:17PM +0000, Richard W.M. Jones wrote:
 On Wed, Feb 19, 2014 at 01:56:24PM +0000, Daniel P. Berrange wrote:
  On Wed, Feb 19, 2014 at 02:01:43PM +0100, Michal Privoznik wrote:
   On 19.02.2014 00:11, Richard W.M. Jones wrote:
   [1] By the way, this is a general complaint about libvirt.  Please
   DON'T add any more stuff to the configuration file.  Everything should
   be configurable through the API, or not at all.  There are two other
   settings I can think of that libguestfs would like to adjust but
   cannot because they are only available in a configuration file.
   
   
   This all will be solved by administration module, once we implement
   it. I don't know about anybody working on it though.
  
  Yeah, we really need to get our act together on that. I might even be
  able to squeeze out some free time for this in the next few weeks. At
  least to get a proof of concept working with 1 or 2 example APIs.
 
 Is there some background reading on this feature?

Nothing nicely written up in any one place.

The general idea though is that we'll create an administrative API for
libvirtd, e.g. a libvirtadmin.so that connects to a dedicated UNIX socket
like /var/run/libvirt/libvirt-admin, which has its own RPC program running
separately from the main RPC program. This library / RPC protocol would
thus be independent of any specific HV connection. The original motivation
was to provide the host admin with a way to turn logging levels on/off
without having to restart libvirtd itself. We also wanted a way to inspect
which clients are connected and which API calls they are waiting on.

Regards,
Daniel


Re: [libvirt] Reporting log/error messages through capabilities

2014-02-19 Thread Richard W.M. Jones
On Wed, Feb 19, 2014 at 02:00:21PM +0000, Daniel P. Berrange wrote:
 On Tue, Feb 18, 2014 at 11:11:10PM +0000, Richard W.M. Jones wrote:
  There is a libvirt bug here, which is that it's very hard to diagnose
  what is going on when qemu fails to work at all.  The logging system
  in libvirt(d) is tremendously powerful, but ultimately confusing to
  use, and requires users to edit config files which makes it a
  non-starter for programs using libvirt through the API [1].
 
 The problem with allowing apps to change the logging config is that
 it is global state, not per client. So multiple apps would conflict
 in what they could do with changes here. While we could probably
 make it possible for apps to register their own callback to receive
 log messages, the setting of actual log levels would still be global.

This comes back to the whole private libvirtd thing.  Even sharing a
single session libvirtd with the current user has proven problematic
for libguestfs, and it's a big mess for root.  See bugs passim.  In an
ideal world we'd have one private libvirtd per connection.

  [1] By the way, this is a general complaint about libvirt.  Please
  DON'T add any more stuff to the configuration file.  Everything should
  be configurable through the API, or not at all.  There are two other
  settings I can think of that libguestfs would like to adjust but
  cannot because they are only available in a configuration file.
 
 What are the other settings you're thinking of here?

- log_level / log messages as discussed.

- qemu user/group: when libguestfs runs as root, we'd like to set it
  to root/root

- max_clients

There are lots of problems (still) with max_clients.  The default
setting is far too low.  The default was recently increased but we
cannot read what it is, thus cannot make a decision on how many
threads we can run safely.  And being a global setting (and us being a
local library) it wouldn't help much even if we could read it, because
another libguestfs instance might be running threads.  Ideally it
would be just a safety mechanism, stopping someone from connecting
thousands of times, and would also depend on the size of the main
memory.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming blog: http://rwmj.wordpress.com
Fedora now supports 80 OCaml packages (the OPEN alternative to F#)



Re: [libvirt] Reporting log/error messages through capabilities

2014-02-19 Thread Daniel P. Berrange
On Wed, Feb 19, 2014 at 05:00:43PM +0000, Richard W.M. Jones wrote:
 On Wed, Feb 19, 2014 at 02:00:21PM +0000, Daniel P. Berrange wrote:
  On Tue, Feb 18, 2014 at 11:11:10PM +0000, Richard W.M. Jones wrote:
   There is a libvirt bug here, which is that it's very hard to diagnose
   what is going on when qemu fails to work at all.  The logging system
   in libvirt(d) is tremendously powerful, but ultimately confusing to
   use, and requires users to edit config files which makes it a
   non-starter for programs using libvirt through the API [1].
  
  The problem with allowing apps to change the logging config is that
  it is global state, not per client. So multiple apps would conflict
  in what they could do with changes here. While we could probably
  make it possible for apps to register their own callback to receive
  log messages, the setting of actual log levels would still be global.
 
 This comes back to the whole private libvirtd thing.  Even sharing a
 single session libvirtd with the current user has proven problematic
 for libguestfs, and it's a big mess for root.  See bugs passim.  In an
 ideal world we'd have one private libvirtd per connection.

Yep, I don't have a perfect answer for that yet. Conceptually the
idea of having a dedicated root session libvirtd is appealing, but it
does raise co-ordination issues, e.g. when libvirtd tracks which VMs are
using which PCI devices, it assumes there's only one privileged libvirtd
instance. We could avoid this by saying that the 'session' instance
that runs as root is unprivileged and thus not allowed to use PCI
devices and other such things, but there could be dragons here.

   [1] By the way, this is a general complaint about libvirt.  Please
   DON'T add any more stuff to the configuration file.  Everything should
   be configurable through the API, or not at all.  There are two other
   settings I can think of that libguestfs would like to adjust but
   cannot because they are only available in a configuration file.
  
  What are the other settings you're thinking of here?
 
 - log_level / log messages as discussed.
 
 - qemu user/group: when libguestfs runs as root, we'd like to set it
   to root/root

We do have an override for that in the XML now, though I do
recall there were some issues. Conceptually though, the goal
of the XML override is that it /ought/ to be functionally
identical to changing the global config file.
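
For illustration, a Python sketch that builds a per-domain override of this kind. It assumes the override meant here is the DAC <seclabel> element in the domain XML; that reading, and the helper name dac_seclabel, are assumptions rather than something stated in this thread.

```python
import xml.etree.ElementTree as ET


def dac_seclabel(user, group):
    """Return a <seclabel> fragment asking for QEMU to run as user:group."""
    seclabel = ET.Element(
        "seclabel", {"type": "static", "model": "dac", "relabel": "yes"}
    )
    ET.SubElement(seclabel, "label").text = "%s:%s" % (user, group)
    return ET.tostring(seclabel, encoding="unicode")


fragment = dac_seclabel("root", "root")
print(fragment)
```

The fragment would go inside the <domain> element of the guest definition, so the choice travels with the VM instead of living in the global config file.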

 - max_clients
 
 There are lots of problems (still) with max_clients.  The default
 setting is far too low.  The default was recently increased but we
 cannot read what it is, thus cannot make a decision on how many
 threads we can run safely.  And being a global setting (and us being a
 local library) it wouldn't help much even if we could read it, because
 another libguestfs instance might be running threads.  Ideally it
 would be just a safety mechanism, stopping someone from connecting
 thousands of times, and would also depend on the size of the main
 memory.

We are partway through changing this. The end goal is that we'll have a
new 'max_anonymous_clients' setting that will be a low value and just
prevents DOS from clients which have not yet authenticated. The existing
'max_clients' value will then only apply to authenticated clients, and
can thus be raised to a very high value such that people won't hit it
in any reasonably normal usage.
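
A toy model of the two limits as described above, in Python. The class ClientLimits and the numbers used are made up for illustration; this is not the code under review, only the accounting idea: unauthenticated connections count against a small limit, authenticated ones against a large one.

```python
class ClientLimits:
    """Toy accounting for anonymous vs. authenticated client limits."""

    def __init__(self, max_anonymous_clients, max_clients):
        self.max_anonymous_clients = max_anonymous_clients
        self.max_clients = max_clients
        self.anonymous = 0
        self.authenticated = 0

    def connect(self):
        """A new, not yet authenticated connection arrives."""
        if self.anonymous >= self.max_anonymous_clients:
            return False        # too many unauthenticated connections
        self.anonymous += 1
        return True

    def authenticate(self):
        """One anonymous connection completes authentication."""
        if self.anonymous == 0 or self.authenticated >= self.max_clients:
            return False
        self.anonymous -= 1
        self.authenticated += 1
        return True


limits = ClientLimits(max_anonymous_clients=2, max_clients=1000)
print(limits.connect(), limits.connect(), limits.connect())
limits.authenticate()
print(limits.connect())
```

The third connect() is refused because two clients are still unauthenticated; once one of them authenticates, a slot frees up again.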

I thought we'd finished it, but I see it is still awaiting review:

  https://www.redhat.com/archives/libvir-list/2013-December/msg00453.html

Regards,
Daniel


[libvirt] Reporting log/error messages through capabilities

2014-02-18 Thread Richard W.M. Jones
When qemu is completely broken, libvirtd starts up OK but exists in a
kind of broken state where no guests can possibly be run.  I hit this
problem ... again ... today:

https://bugzilla.redhat.com/show_bug.cgi?id=1066630#c0

There is a libvirt bug here, which is that it's very hard to diagnose
what is going on when qemu fails to work at all.  The logging system
in libvirt(d) is tremendously powerful, but ultimately confusing to
use, and requires users to edit config files which makes it a
non-starter for programs using libvirt through the API [1].

From a libguestfs point of view, it's impossible for us to report back
to the user that there is a problem two layers below in qemu.

So my idea is that libvirt capabilities output should have an <info>
section containing log messages/errors.

  <capabilities>
    ...
    <info>
      Could not run qemu-system-blah:
      symbol lookup error: /usr/bin/qemu-system-s390x: undefined symbol: glfs_discard_async
    </info>
  </capabilities>

Libguestfs queries for libvirt capabilities anyway.  If we don't get a
satisfactory set of <guest/> elements, then we could list out the
<info/> section.  Easy for us.
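
Roughly what the consumer side could look like, sketched in Python. Note that the <info> element is only the proposal made in this mail and does not exist in libvirt's capabilities XML; the function name guests_or_diagnostics and the sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET


def guests_or_diagnostics(capabilities_xml):
    """Return (guest elements, info messages); messages only if no guests."""
    root = ET.fromstring(capabilities_xml)
    guests = root.findall("guest")
    if guests:
        return guests, []
    # <info> is the element proposed here, not an existing libvirt element.
    infos = [(el.text or "").strip() for el in root.findall("info")]
    return [], infos


broken = """<capabilities>
  <info>Could not run qemu-system-blah:
symbol lookup error: /usr/bin/qemu-system-s390x: undefined symbol: glfs_discard_async</info>
</capabilities>"""

guests, infos = guests_or_diagnostics(broken)
if not guests:
    for message in infos:
        print("libvirt reported:", message)
```

In real use the XML string would come from the capabilities call (conn.getCapabilities() in the Python bindings) rather than from a literal.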

The problem is the <info/> element hardly fits into capabilities.  If
we didn't put it there, could it go some other place?  Or a new API?

Are there other unanticipated problems here?  I think one is that
libvirt doesn't appear to collect detailed log information by default,
(unless the user edits log_level).  That's assuming I understand the
code correctly.  Personally I think libvirt should always collect
debug information, because you never know when it could be useful, but
for the above, collecting errors & warnings unconditionally is
sufficient.

Rich.

[1] By the way, this is a general complaint about libvirt.  Please
DON'T add any more stuff to the configuration file.  Everything should
be configurable through the API, or not at all.  There are two other
settings I can think of that libguestfs would like to adjust but
cannot because they are only available in a configuration file.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/
