Re: any advantage to using pool type glusterfs if no GFAPI

2023-10-09 Thread Peter Krempa
On Fri, Oct 06, 2023 at 13:23:18 -0700, W Kern wrote:
> I have used fuse mounted gluster with 'dir' for years. Works great, aside
> from  having to use the --unsafe flag on migrations.
> 
> I am building up a new system  and wondering if using the glusterfs type
> would be a better/safer choice.
> 
> I'm not concerned about gfapi (host is U22LTS so I'm not even sure if it has
> it).
> 
> I'm not concerned about the automount feature as that happens on start up
> anyway.
> 
> I'm more interested if there is any technical or performance reasons to
> prefer 'glusterfs' pool type over 'dir'
> 
> of course not having the --unsafe flag would be nice, but thats all scripted
> in anyway.

Let's break down multiple things:

Firstly, regarding the use of '--unsafe' for migration on fuse-mounted
glusterfs: this should work without the need to use --unsafe at least
since:

commit 478da65fb46c866973886848ae17f1e16199a77d
Author: Michal Prívozník 
Date:   Thu Sep 27 16:19:31 2018 +0200

virFileIsSharedFSType: Check for fuse.glusterfs too

$ git desc 478da65fb46c866973886848ae17f1e16199a77d
v4.8.0-21-g478da65fb4

Thus released in libvirt-4.9


Secondly, let's discuss how qemu actually interacts with the storage.
With fuse in use, the gluster volume is mounted on the host and qemu
accesses it directly like any other file. Thus for qemu this is
transparent.

With libgfapi, each qemu instance runs its own gluster client
internally. The advantage is that you don't have the volume mounted on
the host and may save a few cycles since fuse is not needed. The
disadvantage is that you have multiple copies of the gluster client in
memory.



Now the third point: the storage pool type in libvirt.

In the first place, having a storage pool in libvirt is not really
needed at all. Whether one is used depends on how you configure the
disks of the VM:

If you do it via:

 <disk type='file'>
   [...]
   <source file='...'/>
 </disk>

Then you are not even using a storage pool in the first place. That
happens only if you have a <disk type='volume'>.

In such case I'd suggest you don't even set up one.
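
For contrast, a minimal sketch of the pool-based variant (the pool and
volume names here are hypothetical) would look roughly like:

  <disk type='volume' device='disk'>
    <driver name='qemu' type='qcow2'/>
    <source pool='glusterpool' volume='image.qcow2'/>
    <target dev='vda' bus='virtio'/>
  </disk>

Only with such a <disk type='volume'> configuration does libvirt consult
a storage pool at all.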

This brings us to the pool type 'gluster'. This pool type uses libgfapi
to natively access the gluster volume without mounting it on the host.
Thus, if you were expecting it to automount the images, it will not do
that.
Additionally support for  was never finished for
that one either.

Conclusion: You don't want to use a storage pool of type 'gluster', and
very likely you don't need any storage pool at all.



Re: Help ! libvirt

2023-09-21 Thread Peter Krempa
On Thu, Sep 21, 2023 at 10:50:07 +0100, Bhasker C V wrote:
> Attaching win11.xml
> Please note that this used to work fine. It is failing now on libvirt-
> 9.7.0-1
> 
> On Thu, Sep 21, 2023 at 9:13 AM Peter Krempa  wrote:
> 
> > On Thu, Sep 21, 2023 at 09:05:43 +0100, Bhasker C V wrote:
> > > Adding libvirt mailing list
> > > apologies for cross-posting
> > > libvirt version: 9.7.0-1
> > >
> > > On Thu, Sep 21, 2023 at 8:39 AM john doe  wrote:
> > >
> > > > On 9/21/23 09:32, Bhasker C V wrote:
> > > > > I am getting an error with libivrt when I create a VM
> > > > >
> > > > > ```
> > > > >   $ sudo virsh create ./win11.xml
> >
> > Please attach the XML used here. It comes from a code path which
> > shouldn't be possible to reach.
> >
> > > > > error: Failed to create domain from ./win11.xml
> > > > > error: internal error: mishandled storage format 'none'
> > > > >
> > > > > ```
> > > > >
> > > > > This is after I have done a dist-upgrade (was working fine before)
> > > > > debian trixie.
> >
> > Which version did you have before?
> >
> > > > >
> > > > > error message says
> > > > > qemuBlockStorageSourceGetBlockdevFormatProps:1227 : internal error:
> > > > > mishandled storage format 'none'
> >
> >

>   destroy
>   restart
>   destroy
>   
> 
> 
>   
>   
> /usr/bin/qemu-system-x86_64
> 
>   
>   
>   
> 

Could you please also attach the output of:

  qemu-img info --backing-chain '/var/virt/WINDOWS/WIN11'


> 
>   
>   
> 
>   
>   
> 

In the definition I don't see anything that would hint that the disk
config is broken. For qcow2 volumes we do auto-detection of backing
images, hence the request for the output of the command above.



Re: Help ! libvirt

2023-09-21 Thread Peter Krempa
On Thu, Sep 21, 2023 at 09:05:43 +0100, Bhasker C V wrote:
> Adding libvirt mailing list
> apologies for cross-posting
> libvirt version: 9.7.0-1
> 
> On Thu, Sep 21, 2023 at 8:39 AM john doe  wrote:
> 
> > On 9/21/23 09:32, Bhasker C V wrote:
> > > I am getting an error with libivrt when I create a VM
> > >
> > > ```
> > >   $ sudo virsh create ./win11.xml

Please attach the XML used here. It comes from a code path which
shouldn't be possible to reach.

> > > error: Failed to create domain from ./win11.xml
> > > error: internal error: mishandled storage format 'none'
> > >
> > > ```
> > >
> > > This is after I have done a dist-upgrade (was working fine before)
> > > debian trixie.

Which version did you have before?

> > >
> > > error message says
> > > qemuBlockStorageSourceGetBlockdevFormatProps:1227 : internal error:
> > > mishandled storage format 'none'



Re: Activate storage during domain migration

2023-08-02 Thread Peter Krempa
On Tue, Aug 01, 2023 at 12:28:56 +0200, e...@mailbox.org wrote:
> Hi,
> 
> I have a block storage which I only want to be mounted on a single node. I
> know that there are many possibilities for shared storage usage but I want
> to know if the following is possible (using the API).
> - Have a domain running on node-A
> - Initialize a migration for that domain to node-B
> - Run a hook or something just before the domain starts on node-B to:
>     - unmount storage on node-A
>     - mount/prepare storage on node-B

This is not possible with qemu, because during migration the process
running the VM exists on both node-A and node-B and has the storage
open (although not accessing it) on both sides.

At the time the migration is switching over, the source flushes buffers
and the destination starts writing into the image.

This means that the storage must be mounted on both nodes during the
migration.

What you can do, though, is save the VM state into a file and restore
it on the other node (virsh save, virsh restore). This uses basically
the same mechanism as migration, but the VM state is dumped to a file
and preserved, so you can unmount the storage for as long as necessary
until it's moved to the next node.
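
As a rough sketch of that workflow (domain name, device and mount
points below are made up):

  # on node-A
  virsh save myguest /var/tmp/myguest.save
  umount /mnt/guest-storage

  # move the storage and /var/tmp/myguest.save over, then on node-B
  mount /dev/mapper/guest-storage /mnt/guest-storage
  virsh restore /var/tmp/myguest.save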



Re: ignored ?

2023-05-16 Thread Peter Krempa
On Mon, May 15, 2023 at 20:14:58 +0200, lejeczek wrote:
> 
> 
> On 15/05/2023 19:14, Marc wrote:
> > > Is there something else which is a prerequisite to 'qemu:commandline'
> > > but if yes and I'm missing those, why would not then 'virsh' and/or
> > > 'virtqemud' say something?
> > No what you have looks ok, this is what I have as a test and is working ok. 
> > You can try and these to see if something is shown in the guest.
> > 
> >
> >  
> >  
> >  
> >   > value='type=0,vendor=LENOVO,version=FBKTB4AUS,date=07/01/2015,release=1.180'/>
> >  
> >   > value='type=1,manufacturer=LENOVO,product=30AH001GPB,version="ThinkStation
> >  
> > P300",serial=S4M62281,uuid=1ecefe02-f1b6-4bf8-a925-c9f4ae512209,sku=LENOVO_MT_30AH,family=P300'/>
> >
> > 
> yes, for those who might stumble upon this/similar - it turns out to be the
> structure declaration in xml
>   
> VS
>   
> of whose the latter _works_ ! on centos 9 with
> libvirt-daemon-9.0.0-7.el9.x86_64 but not the former.
> 
> @devel - should this go as BZ report into bugzilla?

Did you declare the appropriate XML namespace?:

  xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'

https://www.libvirt.org/drvqemu.html#pass-through-of-arbitrary-qemu-commands
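
For reference, a rough sketch of the expected placement, with the
namespace declared on the root <domain> element and the SMBIOS value
shortened from the quoted example:

  <domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
    ...
    <qemu:commandline>
      <qemu:arg value='-smbios'/>
      <qemu:arg value='type=0,vendor=LENOVO,...'/>
    </qemu:commandline>
  </domain>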



Re: how to get domain name into the logs - ?

2023-05-09 Thread Peter Krempa
On Mon, May 08, 2023 at 10:02:17 +0200, lejeczek wrote:
> Hi guys.
> 
> I have libvirt/qemu setup with pretty vanilla settings and my logs, a
> snippet, look like this:
> 
> ...
> migration successfully aborted
> internal error: qemu unexpectedly closed the monitor:
> 2023-05-08T07:56:37.785886Z qemu-kvm: Failed to load
> virtio_pci/modern_queue_state:desc
> 2023-05-08T07:56:37.786024Z qemu-kvm: Failed to load
> virtio_pci/modern_state:vqs
> 2023-05-08T07:56:37.786029Z qemu-kvm: Failed to load
> virtio/extra_state:extra_state
> 2023-05-08T07:56:37.786033Z qemu-kvm: Failed to load virtio-rng:virtio
> 2023-05-08T07:56:37.786036Z qemu-kvm: error while loading state for instance
> 0x0 of device ':00:02.5:00.0/virtio-rng'
> 2023-05-08T07:56:37.786279Z qemu-kvm: load of migration failed: Input/output
> error
> 
> Would you know how to make libvirt/qemu to get domains/guests names into the
> logs? (ideally without going into debug/similar level)

The above snippet looks like it's from the VM log file
(/var/log/libvirt/qemu/*.log). The VM name is the name of the file.



Re: Using pki/ssl/tls connection.

2023-05-04 Thread Peter Krempa
On Wed, May 03, 2023 at 17:38:16 +0200, Kamil Jońca wrote:
> 
> I am thinking of using tls connection between my client and server
> instead of current ssh.
> I found
> https://libvirt.org/kbase/tlscerts.html and I want to know if it is
> possible to customise some setting (e.g. use my own cert names, or
> locations)
> but I was not able to.

The server-side location of the certificates can be configured in the
appropriate config file, depending on how your host is set up
(/etc/libvirt/virtproxyd.conf or /etc/libvirt/libvirtd.conf; you also
need to enable virtproxyd's TLS socket).

In the config file you have the following config options:

key_file, cert_file, ca_file, crl_file
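
A hedged sketch of what that could look like in e.g.
/etc/libvirt/virtproxyd.conf (the paths below are made-up, non-default
locations):

  key_file  = "/etc/pki/custom/serverkey.pem"
  cert_file = "/etc/pki/custom/servercert.pem"
  ca_file   = "/etc/pki/custom/cacert.pem"
  crl_file  = "/etc/pki/custom/cacrl.pem"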

> Moreover
> https://github.com/libvirt/libvirt/blob/44520f6e01580d6bada88b47e5b77e6bee023ac6/src/rpc/virnettlscontext.c
> suggests that these values are hardcoded.

The client-side certificate file names need to conform to the expected values.

> So my questions are: is it possible to customise these values? If so,
> how? How can I configure virt-manager with two connections, each with
> different CA?

The path to the directory containing the certificates can be changed per
connection using the 'pkipath' URI argument. See:

  https://libvirt.org/uri.html#tls-transport
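
So e.g. a connection URI along these lines (the hostname and directory
are made up):

  qemu+tls://host1.example.org/system?pkipath=/home/user/.pki/host1

The directory is then expected to contain the client certificate, key
and CA certificate under the usual expected file names.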



Re: UEFI and External Snapshots

2023-04-27 Thread Peter Krempa
On Wed, Apr 26, 2023 at 21:04:36 +0100, Simon Fairweather wrote:
> Is there a plan to support UEFI and Ext Snaps? Or is there a location for
> documentation as the info I can find is quite old.

Your question is a bit vague.

You already can take an external snapshot of a VM using UEFI.

There is one slight missing bit, which should not hinder most usage
though: the UEFI variable store itself is not snapshotted.



Re: which daemon/service for live migration - ?

2023-04-17 Thread Peter Krempa
On Mon, Apr 17, 2023 at 16:38:17 +0200, lejeczek wrote:
> 
> 
> On 17/04/2023 15:34, Peter Krempa wrote:
> > On Mon, Apr 17, 2023 at 14:39:18 +0200, lejeczek wrote:
> > > 
> > > On 17/04/2023 14:31, Peter Krempa wrote:
> > > > On Mon, Apr 17, 2023 at 14:24:32 +0200, lejeczek wrote:

[...]

> > If you think more explanation is needed then please submit a issue and
> > describe your request and suggestion how you'd like that to be worded.
> > 
> I do. I did - I said that it appeared to be more specific.
> I said:
> "
> migration with 'qemu+tls' fails if receiving node does not have
> 'virtproxyd-tls.socket' up&running,
> even though 'virtproxyd.socket' & 'virtqemud.service' are running on that
> node.
> "

I'm sorry, but 'migration' fits fairly and squarely into the 'remote
access' category IMO. The only reasonable way I can see this changed is
to do e.g.

  Remote off-host (TLS socket) access ...

Enumerating everything that needs it doesn't make much sense; the
document would become even more off-putting for others to read. In
recent times I've made a similar mistake when trying to improve the
knowledge base page about debug logging by covering every possible
scenario based on what people were doing wrong, and the document is now
so massive that people simply don't read it.

> I said - if that is the business logic, also for 'tcp' - then those would
> certainly be worth an explanation in man pages. Saves many some time.

TCP is insecure and deprecated and should not be used for any real
use case.



Re: which daemon/service for live migration - ?

2023-04-17 Thread Peter Krempa
On Mon, Apr 17, 2023 at 14:39:18 +0200, lejeczek wrote:
> 
> 
> On 17/04/2023 14:31, Peter Krempa wrote:
> > On Mon, Apr 17, 2023 at 14:24:32 +0200, lejeczek wrote:
> > > On 17/04/2023 12:27, Peter Krempa wrote:
> > > > On Sun, Apr 16, 2023 at 08:54:57 +0200, lejeczek wrote:

[...]

> > > So I wonder - if that is the business logic here - if man pages which are
> > > already are very good, could enhance even more to explain those bits 
> > > too...
> > The proxy daemon is necessary when you need very old clients which don't
> > support the modular topology to work with the modern daemon topology.
> > 
> > That's not a strict migration requirement though as you can run the
> > migration from a modern client. In case you are migrating *from* an
> > older daemon, that would mean that you can't use '--p2p' mode.
> > 
> They are all the same - in my case - decently modern - in my mind - servers
> & clients.
> It is all Centos 9 Stream with everything from default repos up-to-date. 
> Are those "old"?

No, that is fine. I forgot about the fact that 'virtproxyd' is required
when you want to use TLS because I always use SSH as transport.

> And even if so then my suggestion - to explain & include all that, that
> modular relevance to certain operations, in man pages - I still share.
> That will certainly safe admins like myself, good chunks of time.

The man page for 'virtqemud' states in second paragraph:

  The virtqemud daemon only listens for requests on a local Unix domain
  socket. Remote off-host access and backwards compatibility with legacy
  clients expecting libvirtd is provided by the virtproxy daemon.

If you think more explanation is needed then please submit a issue and
describe your request and suggestion how you'd like that to be worded.



Re: which daemon/service for live migration - ?

2023-04-17 Thread Peter Krempa
On Mon, Apr 17, 2023 at 14:24:32 +0200, lejeczek wrote:
> 
> 
> On 17/04/2023 12:27, Peter Krempa wrote:
> > On Sun, Apr 16, 2023 at 08:54:57 +0200, lejeczek wrote:
> > > Hi guys.
> > > 
> > > With this relatively new modular approach in libvirt - which service is
> > > needed in order to migrate guests via tcp?
> > There is nothing special needed for migration when compared to running a
> > VM.
> > 
> > With new daemons you need 'virtqemud' to manage the VM and optionally
> > 'virtnetworkd' if the VM uses libvirt-managed networks, 'virtstoraged'
> > if it uses libvirt managed storage, and/or 'virtsecretd' if it uses
> > secrets storage.
> > 
> > Configuration of daemon options moved to the appropriate per-daemon
> > config file.
> > 
> I have a feeling - have not tested all thoroughly - that specific
> modules/daemons need to be up & running for specific methods of
> transportation.
> Say..
> migration with 'qemu+tls' fails if receiving node does not have
> 'virtproxyd-tls.socket' up&running,
> even though 'virtproxyd.socket' & 'virtqemud.service' are running on that
> node.
> 
> So I wonder - if that is the business logic here - if man pages which are
> already are very good, could enhance even more to explain those bits too...

The proxy daemon is necessary when you need very old clients which don't
support the modular topology to work with the modern daemon topology.

That's not a strict migration requirement though as you can run the
migration from a modern client. In case you are migrating *from* an
older daemon, that would mean that you can't use '--p2p' mode.



Re: which daemon/service for live migration - ?

2023-04-17 Thread Peter Krempa
On Mon, Apr 17, 2023 at 12:49:09 +0200, Peter Krempa wrote:
> On Mon, Apr 17, 2023 at 10:15:00 +, Marc wrote:

[...]

>  - hypervisor drivers:
>- virtqemud - for managing qemu machines
>- virtlxcd - for lxc
>- ... for all other hypervisor drivers...
>  - network driver
> - virtnetworkd
>  - storage driver
> - virtstoraged
>  - secret driver
> - virtsecretd
> 
> etc...
> 
> If you are running modular daemons you then don't use 'libvirtd' at all,
> but rather the client library communicates with the appropriate daemon
> based on the object type.

Further reading:

https://libvirt.org/daemons.html



Re: which daemon/service for live migration - ?

2023-04-17 Thread Peter Krempa
On Mon, Apr 17, 2023 at 10:15:00 +, Marc wrote:
> 
> > 
> > With this relatively new modular approach in libvirt - which service is
> > needed in order to migrate guests via tcp?
> > 
> 
> I am not sure what you mean with 'new modular'. I am still on el7 going to 
> el9 this year. I am doing live migrations just with these. 
> 
> libvirtd.service  enabled

'libvirtd' was separated into specific sub-daemons for specific object
or hypervisor:

 - hypervisor drivers:
   - virtqemud - for managing qemu machines
   - virtlxcd - for lxc
   - ... for all other hypervisor drivers...
 - network driver
   - virtnetworkd
 - storage driver
   - virtstoraged
 - secret driver
   - virtsecretd

etc...

If you are running modular daemons you then don't use 'libvirtd' at all,
but rather the client library communicates with the appropriate daemon
based on the object type.
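
As an illustration only (socket names taken from the default systemd
presets), enabling the pieces discussed in this thread could look like:

  systemctl enable --now virtqemud.socket virtnetworkd.socket virtstoraged.socket
  # TLS-based remote access/migration additionally needs the proxy's TLS socket
  systemctl enable --now virtproxyd.socket virtproxyd-tls.socket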



Re: ecrypting image file breaks efi/boot of the guest/Ubuntu - ?

2023-04-17 Thread Peter Krempa
On Fri, Apr 14, 2023 at 14:26:53 +0200, lejeczek wrote:
> 
> 
> On 14/04/2023 13:57, Peter Krempa wrote:
> > On Fri, Apr 14, 2023 at 13:39:17 +0200, lejeczek wrote:
> > > 
> > > On 11/04/2023 09:13, Peter Krempa wrote:
> > > > On Sat, Apr 08, 2023 at 11:25:18 +0200, lejeczek wrote:
> > > > > Hi guys.
> > > > > 
> > > > > I've have a guest and that guest differs from all other guest by:
> > > > > 
> > > > >     
> > > > >       hvm
> > > > >        > > > > type='pflash'>/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd
> > > > > /var/lib/libvirt/qemu/nvram/ubusrv1_VARS.fd
> > > > >       
> > > > >       
> > > > >     
> > > > > 
> > > > > whereas everything else has:
> > > > > 
> > > > >     
> > > > >       hvm
> > > > >       
> > > > >       
> > > > >       
> > > > >     
> > > > > 
> > > > > Now, that different guest fails - as the only one - to start, to boot 
> > > > > after
> > > > > its qcow2 image was luks-encrypted.
> > > > > Guest starts but says that:
> > > > > 
> > > > > BdsDxe: failed to load Boot0001 "Uefi Misc Device" from PciRoot
> > > > > (0x0)/Pci(0x2,0x3)/Pci(0x0,0x0): Not found
> > > > > 
> > > > > revert back to original, non-encrypted qcow2 image and all works a ok.
> > > > Please attach either the full XML or at least the disk part for *both*
> > > > the case where it doesn't work and where it does work.
> > [...]
> > 
> > >    
> > >      /usr/libexec/qemu-kvm
> > >      
> > >    
> > >    
> > >    
> > >     > > function='0x0'/>
> > >      
> > > ...
> > > 
> > > When I add encryption to  & use encrypted qcow2 then VM fails as I
> > > described.
> > I specifically asked for '*both*' XMLs. The working one. And the
> > non-working one.
> > 
> In case it might not be clear - which in my mind should not have been as
> it's simple, sure only in my mind - it is:
> all guests use almost identical "template" with obvious differences, such as
> names/paths, hw adresses, now...
> 
> - tree guests have used 'pflash' in  from the beginning, always.
> 
> and I point to that as only those three guest fail to boot after their
> qcow2s were encrypted, just like all other VM's were, but those other VM's
> start & boot okey.
> If in those three VMs I use original, non-encrypted qcow2 - without changing
> anything else in XML definition but 'encryption' relevant in  - then
> those VMs start & boot just fine.

Well, your problem is then most likely with the guest image or the way
you've created/converted it.

I've run a few experiments and my machines that I've converted to
encrypted qcow2s run properly with UEFI.

> BdsDxe: failed to load Boot0001 "Uefi Misc Device" from PciRoot
> (0x0)/Pci(0x2,0x3)/Pci(0x0,0x0): Not found

The error seems to indicate that the boot entry was not found, but the
disk encryption is transparent once the disk is decrypted by qemu, so
there should be no difference between an encrypted and an unencrypted
image.

That is why I suspect that the guest image is somehow not properly
converted.

Specifically, I've managed to get this error when I used an empty
encrypted image with a VM.
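
For comparison, one hedged way to convert an existing image to a
LUKS-encrypted qcow2 with qemu-img (the file names and inline
passphrase are placeholders; in production you'd reference a libvirt
secret rather than a literal passphrase):

  qemu-img convert -O qcow2 \
    --object secret,id=sec0,data=mypassphrase \
    -o encrypt.format=luks,encrypt.key-secret=sec0 \
    original.qcow2 encrypted.qcow2

  # sanity check: the converted image should report the same virtual
  # size and an 'encrypt' section in the format specific information
  qemu-img info encrypted.qcow2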



Re: which daemon/service for live migration - ?

2023-04-17 Thread Peter Krempa
On Sun, Apr 16, 2023 at 08:54:57 +0200, lejeczek wrote:
> Hi guys.
> 
> With this relatively new modular approach in libvirt - which service is
> needed in order to migrate guests via tcp?

There is nothing special needed for migration when compared to running a
VM.

With new daemons you need 'virtqemud' to manage the VM and optionally
'virtnetworkd' if the VM uses libvirt-managed networks, 'virtstoraged'
if it uses libvirt managed storage, and/or 'virtsecretd' if it uses
secrets storage.

Configuration of daemon options moved to the appropriate per-daemon
config file.



Re: ecrypting image file breaks efi/boot of the guest/Ubuntu - ?

2023-04-14 Thread Peter Krempa
On Fri, Apr 14, 2023 at 13:39:17 +0200, lejeczek wrote:
> 
> 
> On 11/04/2023 09:13, Peter Krempa wrote:
> > On Sat, Apr 08, 2023 at 11:25:18 +0200, lejeczek wrote:
> > > Hi guys.
> > > 
> > > I've have a guest and that guest differs from all other guest by:
> > > 
> > >    
> > >      hvm
> > >       > > type='pflash'>/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd
> > > /var/lib/libvirt/qemu/nvram/ubusrv1_VARS.fd
> > >      
> > >      
> > >    
> > > 
> > > whereas everything else has:
> > > 
> > >    
> > >      hvm
> > >      
> > >      
> > >      
> > >    
> > > 
> > > Now, that different guest fails - as the only one - to start, to boot 
> > > after
> > > its qcow2 image was luks-encrypted.
> > > Guest starts but says that:
> > > 
> > > BdsDxe: failed to load Boot0001 "Uefi Misc Device" from PciRoot
> > > (0x0)/Pci(0x2,0x3)/Pci(0x0,0x0): Not found
> > > 
> > > revert back to original, non-encrypted qcow2 image and all works a ok.
> > Please attach either the full XML or at least the disk part for *both*
> > the case where it doesn't work and where it does work.

[...]

>   
>     /usr/libexec/qemu-kvm
>     
>   
>   
>   
>    function='0x0'/>
>     
> ...
> 
> When I add encryption to  & use encrypted qcow2 then VM fails as I
> described.

I specifically asked for '*both*' XMLs. The working one. And the
non-working one.



Re: storage backup with encryption on-the-fly ?

2023-04-11 Thread Peter Krempa
On Tue, Apr 11, 2023 at 09:21:30 +0200, Peter Krempa wrote:
> On Fri, Apr 07, 2023 at 19:42:11 +0200, lejeczek wrote:
> > 
> > 
> > On 06/04/2023 16:12, Peter Krempa wrote:
> > > On Thu, Apr 06, 2023 at 15:22:10 +0200, lejeczek wrote:
> > > > Hi guys.
> > > > 
> > > > Is there a solution, perhaps a function of libvirt, to backup guest's
> > > > storage and encrypt the resulting image file?
> > > > On-the-fly ideally.
> > > > If not ready/built-in solution then perhaps a best technique you
> > > > recommend/use?
> > > > I currently use 'backup-begin' on qcow2s, which are LUKS encrypted.
> > > libvirt's block code supports the raw+luks and qcow2+luks encrypted
> > > image formats with qemu. You should be able to use both for backups too:
> > > 
> > > 
> > >   
> > > 
> > >   
> > > 
> > > 
> > >   
> > >  > > uuid='d5c7780c-80c4-45eb-bee9-9fbbc1f3847c'/>
> > >   
> > > 
> > >   
> > >   
> > > 
> > > Another option would be to use an encrypted device-mapper device via the
> > > block backend.
> > > 
> > > Lastly if you need any other storage format the 'pull' mode of backups
> > > exposes a (optionally TLS-encrypted) NBD socket from where a client
> > > application can pull the blocks for backup and store them in any way it
> > > wants.
> > > 
> > That works as I hoped, nice & smooth, I've not had the right xml syntax.
> > Are there any docs with more details on the other two alternatives?
> > many thanks, L.
> 
> Well, the backup to a (externally provided) device mapper target is
> quite straihtforward:
> 
>  
>
>  
>
>
>  
>  
> 
> The pull-mode backup with NBD where you handle the encryption in the
> client program (not provided by libvirt, but you can have a look at e.g
> https://www.libvirt.org/apps.html#backup or oVirt which both implement a
> NBD backup flow). To setup a backup in pull mode, simply use:
> 
>  
>
>
>  
>
>  
>
>  
> 
> To setup TLS to encrypt the transport you can use tls='on' and need to
> setup the TLS certs. Have a look at the docs for 'server':
> 
>  https://www.libvirt.org/formatbackup.html
> 

Note: The document explains what the optional <scratch> element does,
but for a pull backup you need a temporary file where the blocks which
the guest overwrote but weren't backed up yet are stored.



Re: storage backup with encryption on-the-fly ?

2023-04-11 Thread Peter Krempa
On Fri, Apr 07, 2023 at 19:42:11 +0200, lejeczek wrote:
> 
> 
> On 06/04/2023 16:12, Peter Krempa wrote:
> > On Thu, Apr 06, 2023 at 15:22:10 +0200, lejeczek wrote:
> > > Hi guys.
> > > 
> > > Is there a solution, perhaps a function of libvirt, to backup guest's
> > > storage and encrypt the resulting image file?
> > > On-the-fly ideally.
> > > If not ready/built-in solution then perhaps a best technique you
> > > recommend/use?
> > > I currently use 'backup-begin' on qcow2s, which are LUKS encrypted.
> > libvirt's block code supports the raw+luks and qcow2+luks encrypted
> > image formats with qemu. You should be able to use both for backups too:
> > 
> > 
> >   
> > 
> >   
> > 
> > 
> >   
> >  > uuid='d5c7780c-80c4-45eb-bee9-9fbbc1f3847c'/>
> >   
> > 
> >   
> >   
> > 
> > Another option would be to use an encrypted device-mapper device via the
> > block backend.
> > 
> > Lastly if you need any other storage format the 'pull' mode of backups
> > exposes a (optionally TLS-encrypted) NBD socket from where a client
> > application can pull the blocks for backup and store them in any way it
> > wants.
> > 
> That works as I hoped, nice & smooth, I've not had the right xml syntax.
> Are there any docs with more details on the other two alternatives?
> many thanks, L.

Well, the backup to an (externally provided) device-mapper target is
quite straightforward:
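
(a hedged sketch; the disk name and device-mapper path below are
placeholders)

  <domainbackup>
    <disks>
      <disk name='vda' backup='yes' type='block'>
        <target dev='/dev/mapper/backup_vda'/>
        <driver type='raw'/>
      </disk>
    </disks>
  </domainbackup>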

 
   
 
   
   
 
 

With the pull-mode backup over NBD you handle the encryption in the
client program (not provided by libvirt, but you can have a look at e.g.
https://www.libvirt.org/apps.html#backup or oVirt, which both implement
an NBD backup flow). To set up a backup in pull mode, simply use:
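
(a hedged sketch; the hostname, port and scratch-file path below are
placeholders)

  <domainbackup mode='pull'>
    <server transport='tcp' name='backup.example.org' port='10809'/>
    <disks>
      <disk name='vda' backup='yes' type='file'>
        <scratch file='/var/lib/libvirt/images/vda.scratch.qcow2'/>
      </disk>
    </disks>
  </domainbackup>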

 
   
   
 
   
 
   
 

To set up TLS to encrypt the transport you can use tls='on', and you
need to set up the TLS certs. Have a look at the docs for 'server':

 https://www.libvirt.org/formatbackup.html



Re: ecrypting image file breaks efi/boot of the guest/Ubuntu - ?

2023-04-11 Thread Peter Krempa
On Sat, Apr 08, 2023 at 11:25:18 +0200, lejeczek wrote:
> Hi guys.
> 
> I've have a guest and that guest differs from all other guest by:
> 
>   
>     hvm
>      type='pflash'>/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd
> /var/lib/libvirt/qemu/nvram/ubusrv1_VARS.fd
>     
>     
>   
> 
> whereas everything else has:
> 
>   
>     hvm
>     
>     
>     
>   
> 
> Now, that different guest fails - as the only one - to start, to boot after
> its qcow2 image was luks-encrypted.
> Guest starts but says that:
> 
> BdsDxe: failed to load Boot0001 "Uefi Misc Device" from PciRoot
> (0x0)/Pci(0x2,0x3)/Pci(0x0,0x0): Not found
> 
> revert back to original, non-encrypted qcow2 image and all works a ok.

Please attach either the full XML or at least the disk part for *both*
the case where it doesn't work and where it does work.



Re: Virtiofsd

2023-04-06 Thread Peter Krempa
On Tue, Apr 04, 2023 at 17:31:37 +0100, Simon Fairweather wrote:
> Thanks these are missing in rust.
> 
> 

You'll have to raise this with the virtiofsd project:

  https://gitlab.com/virtio-fs/virtiofsd/-/issues

> 
> On Tue, Apr 4, 2023 at 5:25 PM Peter Krempa  wrote:
> 
> > On Tue, Apr 04, 2023 at 17:12:15 +0100, Simon Fairweather wrote:
> > > Hi
> > >
> > > In QEMU 8 virtiofsd has been removed in favor of the rust version. Which
> > > includes options that are not longer supported,
> > >
> > >
> > > Do you have a view on what should be used going forwards to support
> > > virtiofsd in libvirt with qemu 8?
> > >
> > > The options are showing as depreciated,
> > > -o ...
> > > Options in a format compatible with the legacy implementation
> > > [deprecated]
> > >
> > > Rust version options
> >
> > The rust version of virtiofsd is supposed to be compatible with
> > everything that libvirt used with the C version via the deprecated
> > compat options above.
> >
> > Going forward we'd like to switch to the native syntax:
> >
> > https://gitlab.com/libvirt/libvirt/-/issues/431
> >
> >



Re: storage backup with encryption on-the-fly ?

2023-04-06 Thread Peter Krempa
On Thu, Apr 06, 2023 at 15:22:10 +0200, lejeczek wrote:
> Hi guys.
> 
> Is there a solution, perhaps a function of libvirt, to backup guest's
> storage and encrypt the resulting image file?
> On-the-fly ideally.
> If not ready/built-in solution then perhaps a best technique you
> recommend/use?
> I currently use 'backup-begin' on qcow2s, which are LUKS encrypted.

libvirt's block code supports the raw+luks and qcow2+luks encrypted
image formats with qemu. You should be able to use both for backups too:
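
(a hedged sketch for a qcow2+luks target; the disk name and target
path are placeholders, the secret UUID is illustrative)

  <domainbackup>
    <disks>
      <disk name='vda' backup='yes' type='file'>
        <target file='/backup/vda.backup.qcow2'>
          <encryption format='luks'>
            <secret type='passphrase' uuid='d5c7780c-80c4-45eb-bee9-9fbbc1f3847c'/>
          </encryption>
        </target>
        <driver type='qcow2'/>
      </disk>
    </disks>
  </domainbackup>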


 
   
 
   
   
 
   
 
   
 
 

Another option would be to use an encrypted device-mapper device via the
block backend.

Lastly if you need any other storage format the 'pull' mode of backups
exposes a (optionally TLS-encrypted) NBD socket from where a client
application can pull the blocks for backup and store them in any way it
wants.



Re: backup-begin

2023-04-06 Thread Peter Krempa
On Wed, Apr 05, 2023 at 17:37:31 +0200, André Malm wrote:
> Den 2023-04-05 kl. 09:47, skrev Peter Krempa:
> > The backup operation is quite complex so it is possible.
> > Please have a look into /var/log/libvirt/qemu/$VMNAME.log to see whether
> > qemu logged something like an assertion failure before crashing.
> > 
> > Additionally you can have a look into 'coredumpctl' whether there are
> > any recorded crashes of 'qemu-system-x86_64' and also possibly collect
> > the backtrace.
> > 
> > Also make sure to try updating the qemu package and see whether the bug
> > reproduces.
> > 
> > If yes, please collect the stack/back-trace, versions of qemu and
> > libvirt, the contents of the VM log file and also ideally configure
> > libvirt for debug logging and collect the debug log as well:
> > 
> >https://www.libvirt.org/kbase/debuglogs.html
> 
> In the $VMNAME.log: qemu-system-x86_64: ../../block/qcow2.c:5175:
> qcow2_get_specific_info: Assertion `false' failed.
> 
> I'm running libvirt 8.0.0, which is the latest version for my dist (ubuntu
> 22.04). If required how would I properly upgrade?
> 
> Looking at 
> https://github.com/qemu/qemu/blob/0c8022876f2183f93e23a7314862140c94ee62e7/block/qcow2.c
> which would be the version of qcow2.c for 8.0.0 there seem to be some issue
> with qcow2 compat.

Huh, that is weird. Both images seem to be qcow2v3, so it's strange that
the code reaches the assertion.

I think at this point you should report an issue with qemu:

 https://gitlab.com/qemu-project/qemu/-/issues

or report it on the qemu-bl...@nongnu.org mailing list

You'll be asked what operations led to the failure, so please make
sure to collect the libvirt debug log as I've requested.

I can help the qemu team to analyze it so make sure to mention me (or my
gitlab handle 'pipo.sk') on the issue.



Re: backup-begin

2023-04-05 Thread Peter Krempa
(preferably don't top-post on technical lists)

On Wed, Apr 05, 2023 at 07:44:21 +0200, André Malm wrote:
> The reason given is shut off (crashed).
> 
> So something virsh backup-begin does is causing he guest to crash?

The backup operation is quite complex so it is possible.

Please have a look into /var/log/libvirt/qemu/$VMNAME.log to see whether
qemu logged something like an assertion failure before crashing.

Additionally you can have a look into 'coredumpctl' whether there are
any recorded crashes of 'qemu-system-x86_64' and also possibly collect
the backtrace.

Also make sure to try updating the qemu package and see whether the bug
reproduces.

If yes, please collect the stack/back-trace, versions of qemu and
libvirt, the contents of the VM log file and also ideally configure
libvirt for debug logging and collect the debug log as well:

  https://www.libvirt.org/kbase/debuglogs.html



> 
> Den 2023-04-04 kl. 16:58, skrev Peter Krempa:
> > On Tue, Apr 04, 2023 at 16:28:18 +0200, André Malm wrote:
> > > Hello,
> > > 
> > > For some vms the virsh backup-begin sometimes shuts off the vm and returns
> > > "error: operation failed: domain is not running" although it was clearly 
> > > in
> > > state running (or paused).
> > > 
> > > Is the idea that you should guest-fsfreeze-freeze / virsh suspend before
> > > virsh backup-begin? I have tried with both with the same results.
> > Freezing the guest filesystems is a good idea to increase the data
> > consistency of the backup, but is not necessary. Nor it should have any
> > influence on the lifecycle of the VM.
> > 
> > > What could be causing the machine to shut off?
> > The VM most likely crashed, or was turned off in a different way.
> > 
> > Try running
> > 
> >   virsh domstate --reason $VMNAME
> > 
> > to see what the reason for the current state is.
> > 
> 



Re: Virtiofsd

2023-04-04 Thread Peter Krempa
On Tue, Apr 04, 2023 at 17:12:15 +0100, Simon Fairweather wrote:
> Hi
> 
> In QEMU 8 virtiofsd has been removed in favor of the rust version. Which
> includes options that are not longer supported,
> 
> 
> Do you have a view on what should be used going forwards to support
> virtiofsd in libvirt with qemu 8?
> 
> The options are showing as depreciated,
> -o ...
> Options in a format compatible with the legacy implementation
> [deprecated]
> 
> Rust version options

The rust version of virtiofsd is supposed to be compatible with
everything that libvirt used with the C version via the deprecated
compat options above.

Going forward we'd like to switch to the native syntax:

https://gitlab.com/libvirt/libvirt/-/issues/431



Re: backup-begin

2023-04-04 Thread Peter Krempa
On Tue, Apr 04, 2023 at 16:28:18 +0200, André Malm wrote:
> Hello,
> 
> For some vms the virsh backup-begin sometimes shuts off the vm and returns
> "error: operation failed: domain is not running" although it was clearly in
> state running (or paused).
> 
> Is the idea that you should guest-fsfreeze-freeze / virsh suspend before
> virsh backup-begin? I have tried with both with the same results.

Freezing the guest filesystems is a good idea to increase the data
consistency of the backup, but it is not necessary. Nor should it have
any influence on the lifecycle of the VM.

> What could be causing the machine to shut off?

The VM most likely crashed, or was turned off in a different way.

Try running

 virsh domstate --reason $VMNAME

to see what the reason for the current state is.



Re: Option Flags

2023-03-23 Thread Peter Krempa
On Thu, Mar 23, 2023 at 09:35:44 +, Simon Fairweather wrote:
> Hi
> 
> Are the flags documented? can this function be used to  specify  same as virsh
> undefine --nvram "name of VM"
> 
> libvirt_domain_undefine_flags($res, $flags)
> 
> [Since version (null)]
> 
> Function is used to undefine(with flags) the domain identified by it's
> resource.
> *@res [resource]*: libvirt domain resource, e.g. from
> libvirt_domain_lookup_by_*()
> *@flags [int]*: optional flags
> *Returns*: : TRUE for success, FALSE on error

I'm not sure which bindings you use but all API flags are documented in
our main API documentation. In your case for the virDomainUndefineFlags
API it's:
https://libvirt.org/html/libvirt-libvirt-domain.html#virDomainUndefineFlagsValues



Re: Upgrade machine type during migration

2023-03-13 Thread Peter Krempa
On Sat, Mar 11, 2023 at 18:35:20 +0100, Michael Schwartzkopff wrote:
> Hi,
> 
> 
> I have an old system. The guest there is defined with:
> 
>   
>     hvm
>   
> 
> 
> When I try to migrate this guest to a new system I get the error:
> 
> error: internal error: unable to execute QEMU command 'blockdev-add': Failed
> to connect socket: Permission denied

You seem to be using migration with non-shared storage. Could you please
post the full set of parameters you are passing to the migration API?

(or the full virsh commandline arguments)

This error seems to be a problem with setting up the disk migration
connection and (as also Daniel pointed out) has nothing to do with the
machine type.
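
For reference, a non-shared-storage migration invocation typically
looks something like this (the domain name and destination URI are made
up):

  virsh migrate --live --persistent --copy-storage-all \
      guest1 qemu+ssh://newhost.example.org/system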



Re: Attach a GPS in preboot mode

2023-01-25 Thread Peter Krempa
On Tue, Jan 24, 2023 at 00:10:46 +0100, lnj@gmail.com wrote:
> Hello everyone and best wishes for 2023 :)
> 
> I have an old *Garmin Drive Smart 50 GPS* and I want to be able to attach it
> when it is in preboot mode to a VM hosted by a *Debian 11 host*.
> 
> From what I understand, the preboot mode allows us to flash a firmware
> before the GPS actually starts (used among other things when the GPS is soft
> bricked).
> 
> When I connect the GPS to the host it gives:

[...]

> My questions :
> 
> Q1 : Is it possible ?

It should be.

> 
> Q2 : Am I doing it the right way ?

I don't think so. You are using an emulated serial device which is then
connected to the host-side USB-tty device.

A better way is simply to use USB assignment/passthrough and have the
guest OS handle the USB side as well.

> 
> Q3 : Is there a method to pass such a device directly (pass-through) ?

To pass a USB device use the following XML: 
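
(a minimal sketch; the vendor/product IDs below are placeholders)

  <hostdev mode='subsystem' type='usb' managed='yes'>
    <source>
      <vendor id='0x1234'/>
      <product id='0x5678'/>
    </source>
  </hostdev>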


  


  
  


Obviously tweak vendor/product.

Alternatively, when you are using the virt-manager GUI there's a menu to
pass through any device; also, by default, when you have a VM window
active and plug in a USB device it's passed to the VM (this can be
disabled, though).



Re: [Question] Which qcow2 options could be inherited to snapshot/blockcopy?

2022-12-11 Thread Peter Krempa
On Sun, Dec 11, 2022 at 20:36:26 +0100, Gionatan Danti wrote:
> Il 2022-11-26 10:36 Gionatan Danti ha scritto:
> > Il 2022-11-21 10:30 Peter Krempa ha scritto:
> > > In regards to the 'compat' option in terms of snapshots/blockcopy, we
> > > don't set it and thus use the qemu default. Since both operations
> > > create
> > > a new image with an existing qemu instance, the default new qemu
> > > format
> > > is okay.
> > 
> > Hi, some idea on why creating a qcow2 volume and snapshotting it via a
> > backing file results in two qcow2 files with different format (v2 vs
> > v3)? Example below:
> > 
> > # volume creation
> > [root@whitehole ~]# virsh vol-create-as default zzz.qcow2 1G --format
> > qcow2
> > Vol zzz.qcow2 created
> > 
> > [root@whitehole ~]# qemu-img info /var/lib/libvirt/images/zzz.qcow2
> > image: /var/lib/libvirt/images/zzz.qcow2
> > file format: qcow2
> > virtual size: 1 GiB (1073741824 bytes)
> > disk size: 16.5 KiB
> > cluster_size: 65536
> > Format specific information:
> > compat: 0.10
> > compression type: zlib
> > refcount bits: 16
> > 
> > [root@whitehole ~]# file /var/lib/libvirt/images/zzz.qcow2
> > /var/lib/libvirt/images/zzz.qcow2: QEMU QCOW2 Image (v2), 1073741824
> > bytes
> > 
> > # snapshot creation
> > [root@whitehole ~]# virsh snapshot-create-as zzz --name zsnap1
> > --disk-only
> > Domain snapshot zsnap1 created
> > 
> > [root@whitehole ~]# qemu-img info /var/lib/libvirt/images/zzz.zsnap1
> > image: /var/lib/libvirt/images/zzz.zsnap1
> > file format: qcow2
> > virtual size: 1 GiB (1073741824 bytes)
> > disk size: 16.5 KiB
> > cluster_size: 65536
> > backing file: /var/lib/libvirt/images/zzz.qcow2
> > backing file format: qcow2
> > Format specific information:
> > compat: 1.1
> > compression type: zlib
> > lazy refcounts: false
> > refcount bits: 16
> > corrupt: false
> > extended l2: false
> > 
> > [root@whitehole ~]# file /var/lib/libvirt/images/zzz.zsnap1
> > /var/lib/libvirt/images/zzz.zsnap1: QEMU QCOW2 Image (v3), has backing
> > file (path /var/lib/libvirt/images/zzz.qcow2), 1073741824 bytes
> > 
> > Regards.
> 
> Hi all,
> anyone with some ideas/suggestions?

Hi,

When creating a snapshot libvirt uses the qemu default for the created
qcow2 image's version which is now v3.

Do you have any specific use case for keeping them in v2 mode?



Re: detach-interface success but domiflist still saw interface

2022-11-24 Thread Peter Krempa
On Thu, Nov 24, 2022 at 16:47:48 +0800, Jiatong Shen wrote:
> Hello Commnunity,
> 
> I saw an weird situation on a phytium machine (arm64 v8),  after the
> following commands, I can still see the interface which should be
> successfully detached.
> 
> virsh # detach-interface 4a365b06-2597-4c17-8b44-dbb6953f9ced bridge --mac
> fa:16:3e:c5:62:40
> Interface detached successfully
> 
> Future qmp commands shows, that a device hostnet23 exists but seems no
> front end device exists.
> 
> So what could be the problem? Thank you very much.

The man page for virsh states for the 'detach-interface' command:

 "Please see documentation for detach-device for known quirks."

And the 'quirks' part of 'detach-device':

  Quirk:  Device  unplug is asynchronous in most cases and requires guest
  cooperation. This means that it's up to the discretion of the guest  to
  disallow  or  delay  the unplug arbitrarily. As the libvirt API used in
  this command was designed as synchronous it returns success after  some
  timeout  even  if the device was not unplugged yet to allow further in‐
  teractions with the domain e.g. if the guest is  unresponsive.  Callers
  which  need  to make sure that the device was unplugged can use libvirt
  events (see virsh event) to be notified when  the  device  is  removed.
  Note that the event may arrive before the command returns.
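
For example, one hedged way to wait for the unplug completion from a
script (domain taken from the quoted commands, timeout arbitrary):

  virsh event 4a365b06-2597-4c17-8b44-dbb6953f9ced --event device-removed --timeout 60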



Re: [Question] Which qcow2 options could be inherited to snapshot/blockcopy?

2022-11-21 Thread Peter Krempa
On Mon, Nov 21, 2022 at 11:22:40 +0800, Meina Li wrote:
> Hi,
> 
> I'm trying to find out which qcow2 options could be inherited to
> snapshot/blockcopy and then test them.
> 
> From https://github.com/qemu/qemu/blob/master/qapi/block-core.json#L4720 we
> can know all the qcow2 options. As far as I know, size, cluster_size and
> extended-l2 have already been implemented. So I'm curious that:
> 1) What are the current status and future plans of other options? Like
> compat option.

The 'size' option is obviously needed to get a correctly sized image.

The 'cluster_size' and 'extended_l2' options were identified as having
potential performance implications, with users actually wanting to
tweak them for their images. Thus we inherit them.

For anything else there wasn't any specific request, nor a note that it
could have performance implications, so those options are omitted for
now.

In regards to the 'compat' option in terms of snapshots/blockcopy, we
don't set it and thus use the qemu default. Since both operations create
a new image with an existing qemu instance, the default new qemu format
is okay.

For the other options:

 - 'encrypt' and co.
- encryption can be explicitly enabled via XML
 - 'data-file-raw'
- not supported by libvirt, no plans for now
 - 'preallocation'
- not implemented
- some options don't make sense, e.g. full allocation for a snapshot
 - 'lazy-refcounts'/'refcount-bits'
- not implemented, no plans
 - 'compression-type'
- libvirt for now doesn't allow to use compression IIRC

> 2) Also whether the vol-clone can cover all options?

Note that 'vol-clone' uses qemu-img, so the logic is different there.

> Thank you very much. And please help to correct me if I have something
> wrong.
> 
> Best Regards,
> Meina Li



Re: qemu-block-rbd be removed ?

2022-11-21 Thread Peter Krempa
On Sat, Nov 12, 2022 at 08:41:44 +0100, vrms wrote:

Firstly please don't start a conversation by replying to an existing one
as it gets threaded improperly.

> I am running qemu/kvm on a Manjaro Laptop. There is a thing with a package
> named ceph-libs being removed from manjaro repos apparently.
> This requires to build it from AUR which fails. The only reason for
> ceph-libs seems that apparently qemu-block-rbd depends on it. However
> nothing else seems to depend on qemu-block-rbd.
> 
> Can anybody tell me whether in your eyes qemu-block-rbd may be save to
> remove and likewise resolve this problem?

'qemu-block-rbd' is a qemu backend for accessing ceph/RBD disks. qemu
is modular, so it will work without the backend if you don't need to
access RBD storage; you can remove the package without any problem.



Re: blockcopy preserve original cache mode value

2022-11-16 Thread Peter Krempa
On Wed, Nov 16, 2022 at 09:17:43 +0800, Jiatong Shen wrote:
> Hello Community,
> 
>We are observing that after a blockcopy, the disk backend will inherit
> from original disk's cache mode configuration even though a new cache mode
> is given. Specifically, the function
> https://github.com/libvirt/libvirt/blob/cd94d891fb4b5cdda229f58b1dee261d5514082b/src/qemu/qemu_domain.c#L10887
> which eventually calls  qemuDomainPrepareDiskSourceData and override input
> config.
> My question is is this on purpose? Because seems like qemu allows different
> configs.
> Thnak you very much for the help.

This is an historical artifact from the time when the block copy API
used the 'drive-mirror' QMP API of qemu which didn't allow configuring
any image specific parameters.

Nowadays we use 'blockdev-mirror' which takes an image instantiated via
'blockdev-add' which can configure everything the same way as the disk,
so it would be possible to honour the cache mode as specified in the
block copy XML.

Note that if you intend to implement it, the code must still preserve
the original cache mode if the user doesn't specify it in the XML.



Re: DOS

2022-11-02 Thread Peter Krempa
On Wed, Nov 02, 2022 at 08:16:01 +0100, Tomas By wrote:
> Hi all,
> 
> I am trying to set up an `isolated virtual network' with two or more
> MS-DOS guests, all on one single Linux box.
> 
> The aim is for them to share a disk (or just one directory on one
> disk).
> 
> As I could not get it to work with IPX, am now trying TCP/IP. However,
> I suspect the MS tools are still attempting to use IPX as well.
> 
> Is there some simple solution that I am missing?
> 
> Does libvirt support IPX at all?

Libvirt's network definition creates a bridge (a switch), so it
shouldn't really matter which protocol you run across it. Obviously
libvirt does not configure IPX on the host-facing side of the bridge
because we don't support it, but if the VMs talk IPX among themselves it
should work.


Re: DOS

2022-11-02 Thread Peter Krempa
On Wed, Nov 02, 2022 at 08:33:52 +0100, Tomas By wrote:
> On Wed, 02 Nov 2022 08:16:01 +0100, Tomas By wrote:
> > As I could not get it to work with IPX, am now trying TCP/IP. However,
> > I suspect the MS tools are still attempting to use IPX as well.
> 
> 
> In fact, it seems that DHCP does not even work.
> 
> In virt-manager/Connection details, it says "DHCP range..." for that
> network, and virsh net-list says it is active.
> 
> DOS says, after a while, "No DHCP server found".
> 
> Anything else I can try?

Please post the full XML definition of the network you use for the VMs.

It can be obtained by running 'virsh net-dumpxml $NETWORKNAME'



Re: Predictable and consistent net interface naming in guests

2022-10-31 Thread Peter Krempa
On Mon, Oct 31, 2022 at 16:32:27 +0200, Edward Haas wrote:

[...]

> Are there any plans to add the acpi_index support?

https://www.libvirt.org/formatdomain.html#network-interfaces

"Since 7.3.0, one can set the ACPI index against network interfaces. With
 some operating systems (eg Linux with systemd), the ACPI index is used
 to provide network interface device naming, that is stable across
 changes in PCI addresses assigned to the device. This value is required
 to be unique across all devices and be between 1 and (16*1024-1)."
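
So a hedged sketch of an interface using it (the network name and
index are arbitrary):

  <interface type='network'>
    <source network='default'/>
    <model type='virtio'/>
    <acpi index='1'/>
  </interface>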



Re: How to merge incremental backups generated with virsh backup-begin?

2022-10-19 Thread Peter Krempa
On Tue, Oct 18, 2022 at 14:54:03 -0300, Jorge Luiz Correa wrote:
> I'm trying to create incremental backups that I could restore when
> necessary. All backups are being generated fine but I couldn't find a way
> to recreate an image using all or some of the backup files.
> 
> For example, my domain is called jammy and is running.
> 
> jammy-backup.xml:
> 
>   
> 
> 
> jammy-checkpoint.xml
> 
>   
> 
>   
> 
> 
> ~# virsh backup-begin jammy jammy-backup.xml jammy-checkpoint.xml
> 
> * This command with these files generates backup file in
> /var/lib/libvirt/images appending the checkpoint timestamp in file name
> (jammy.qcow2 -> jammy.qcow2.TIMESTAMP).
> 
> ~# virsh checkpoint-list jammy
>  Name Creation Time
> -
>  1666006874   2022-10-17 08:41:14 -0300
> 
> ~# ls -lh /var/lib/libvirt/images
> -rw-r--r-- 1 libvirt-qemu kvm  2,6G out 18 14:38 jammy.qcow2
> -rw--- 1 root root 1,5G out 17 08:41 jammy.qcow2.1666006874
> 
> If I create a new domain using jammy.qcow2.1666006874 as disk, everything
> works good.
> 
> Then, I've created some incremental backups.
> 
> jammy-backup.xml:
> 
>   1666006874
> 
> 
> ~# virsh backup-begin jammy jammy-backup.xml jammy-checkpoint.xml
> 
> ~# ls -lh /var/lib/libvirt/images
> -rw-r--r-- 1 libvirt-qemu kvm  2,6G out 18 14:38 jammy.qcow2
> -rw--- 1 root root 1,5G out 17 08:41 jammy.qcow2.1666006874
> -rw--- 1 root root 247M out 17 14:42 jammy.qcow2.1666010735
> 
> At this point, if I need to restore the backup with checkpoint 1666010735,
> how can I create a new image using jammy.qcow2.1666010735 and
> jammy.qcow2.1666006874?
> 
> I would like to merge the incremental backup jammy.qcow2.1666010735 into
> the full backup jammy.qcow2.1666006874, to get a new image so I can create
> a new domain using it as disk. Am I doing it the right way?

To restore incremental backups you need to restore the layering of the
qcow2 images in the order they were created. This is needed as a single
incremental image contains only differences from the previous backup
(full or incremental). You need to do that until you reach your original
full backup.

The metadata in question is the backing image and the backing image
path. The full backing chain should look like:

incr4.qcow2 -> incr3.qcow2 -> incr2.qcow2 -> incr1.qcow2 -> fullbackup.qcow2

The '->' represents the backing image relationship, thus 'incr4.qcow2'
would list 'incr3.qcow2' as its backing image.

The steps to fix the image metadata using qemu-img are the same as in
the following guide:

https://www.libvirt.org/kbase/backing_chains.html#vm-refuses-to-start-due-to-misconfigured-backing-store-format

After you do that, you should use e.g. 'qemu-img convert' to copy out
the data into a new image and use that new image as the base for your
new VM.
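
With the image names from above, a hedged sketch of those two steps
would be:

  # point the incremental backup at the full backup it was taken on top of
  qemu-img rebase -u -F qcow2 -b /var/lib/libvirt/images/jammy.qcow2.1666006874 \
      /var/lib/libvirt/images/jammy.qcow2.1666010735

  # flatten the chain into a standalone image for the new VM
  qemu-img convert -O qcow2 /var/lib/libvirt/images/jammy.qcow2.1666010735 \
      /var/lib/libvirt/images/jammy-restored.qcow2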

Note that using the backup images directly as the source for a disk
used by a VM invalidates/destroys any further increments in that chain,
as the VM can then write data into the image.



Re: Help in porting smack related patches to meson build system

2022-09-19 Thread Peter Krempa
On Mon, Sep 19, 2022 at 16:31:25 +, Vishal Gupta (vishagu2) wrote:
> Hi Peter ,

Hi,

I firstly want to ask you to avoid top-posting on technical mailing
lists. 

> 
> 
> 
> Thanks for ur reply .
> 
> 
> 
> Kirkstone is the latest release from Yocto foundation . details 
> https://docs.yoctoproject.org/dev/migration-guides/migration-4.0.html
> 
> 
> 
> Kirkstone support libvirt 8.10  
> https://layers.openembedded.org/layerindex/recipe/3052/
>
>
>
> We are trying to port 2016 smack patch  " 
> https://listman.redhat.com/archives/libvir-list/2016-July/msg00456.html";   
> for libvirt which is based on automake  make file

Okay so that is quite an old patch which was never finished and merged.

> But above patch is incompatible with libvirt 8.10 .as libvirt 8.1 supports 
> only meson
> 
> 
> I was just wondering if above smack patch is already ported for meson based 
> libvirt 8.10?
> 
> We have issue in porting m4/virt-smack.m4   to  meson.build file. 
> virt-smack.m4is define in 2016 smack patch

No, it is not ported. I want to warn you, though, that porting the
build system part to meson will not be the biggest of the problems.

1) The patch is more than 6 years old at this point

  There has been quite a lot of change in libvirt during that time. Many
  helpers and internal APIs were refactored. Even once you port the
  build system, the patch will require *significant* rework to actually
  compile.

2) libsmack (smack-utils) used by the new security module is not present
   in distros

  The library it requires doesn't seem to be adopted much:

  https://repology.org/project/smack-utils/versions

  Did you make sure that it's present in your environment?

Additionally, if you want to get the patch accepted upstream, note that
there were outstanding problems with the last version of the patch on
the mailing list. I can see that there are definitely problems with
coding style, and the patch will need to be split into logical chunks as
it's now just a big blob of code.

Also, given that the library is simply not present in distros, it will
require a justification as to why upstream should carry the patch.
Carrying code which is not compiled because of a lack of dependencies is
a burden to upstream and can still bitrot in the future.

Even if you are not striving for upstreaming the patch, be prepared for
more work than just porting the build system.



Re: Libvirt slow after a couple of months uptime

2022-09-19 Thread Peter Krempa
On Fri, Sep 16, 2022 at 19:41:28 +0200, André Malm wrote:
> Hello,
> 
> I have some issues with libvirtd getting slow over time.
> 
> After a fresh reboot (or systemctl restart libvirtd) virsh list /
> virt-install is fast, as expected, but after a couple of months uptime they
> both take a significantly longer time.
> 
> Virsh list takes around 3 seconds (from 0.04s on a fresh reboot) and
> virt-install takes over a minute (from around a second).
> 
> Running strace on virsh list it seems to get stuck in a loop on this:
> poll([{fd=5, events=POLLOUT},
> {fd=6, events=POLLIN}], 2, -1) = 2 ([{fd=5,
> revents=POLLOUT}, {fd=6, revents=POLLIN}])

Unfortunately this bit doesn't help much. Virsh is simply a client
which does RPC over a unix socket to the libvirtd/virtqemud daemon,
depending on your host configuration.

This means that what you straced was simply an event loop waiting for
the communication with the server. In fact, there's a whole thread just
for polling and dispatching the calls, so it's expected that it's always
stuck in a poll().

> While restarting libvirtd fixes it

So it looks like the problem isn't in virsh at all. In that case
stracing virsh won't help, as it's a completely different process from
the daemon.

> a restart takes around 1 minute where
> ebtables rules etc are recreated and it does interrupt the service. What
> could cause this? How would I troubleshoot this?

The best way to at least get an idea where the problem might be would be
to collect debug logs of the libvirt daemon (libvirtd/virtqemud based on
how your host is configured).

To enable debug logs you can use the following guide, which also
explains how to figure out which daemon is in use and also outlines how
to set it without restarting the daemon. Make sure to read the
appropriate chapters:

https://www.libvirt.org/kbase/debuglogs.html

The log contains timestamps so we'll be able to see what bogs down the
runtime once it's in the 'slow' period.
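
One hedged way to do that at runtime without restarting (assuming the
modular virtqemud daemon; substitute libvirtd for the monolithic
daemon):

  virt-admin -c virtqemud:///system daemon-log-filters "1:qemu 1:libvirt"
  virt-admin -c virtqemud:///system daemon-log-outputs "1:file:/var/log/libvirt/virtqemud-debug.log"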



Re: Help in porting smack related patches to meson build system

2022-09-19 Thread Peter Krempa
On Fri, Sep 16, 2022 at 11:10:14 +, Akash Bhaskaran (akabhask)
wrote:
> Hi,
> 
> We are trying to port some patches pertaining to files like
> Makefile.am ,configure.ac and .m4’s to libvirt kirkstone release. We
> see that latest kirskstone now uses meson build architecture.

What is 'kirkstone'? I've never heard this word in conjunction with
libvirt.

> We have
> difficulties in porting the changes in the above-mentioned files to
> meson.

You didn't really show what you have problems with. I can point you to
the 'rough' equivalents of the files you've mentioned above.

> Is there any script or reference you can provide for helping us
> port these files to meson?

You'll have to refer to meson's docs and to how we use it; there isn't
anything automatic for the conversion.

Based on what you want to change, the equivalent of a makefile is the
'meson.build' file. Now note there are multiple 'meson.build' files.

The top level meson.build has mostly dependency-related definitions. In
case you need to depend on a new library, that will most likely be the
correct file.

Then each sub-module has their own meson.build file: e.g.
src/meson.build, src/qemu/meson.build, src/security/meson.build etc.

If you need to declare configure time options look into
'meson_options.txt'.
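As a very rough sketch of what wiring in a new dependency can look like
with meson (all names below are purely illustrative, not existing libvirt
options or variables):

  # meson_options.txt
  option('smack', type: 'feature', value: 'auto', description: 'Smack support')

  # top-level meson.build
  smack_dep = dependency('libsmack', required: get_option('smack'))

  # src/security/meson.build
  security_driver_smack_sources = files('security_smack.c')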


Next time if you want more specific answers, please be a bit more
specific in your question.



Re: [libvirt-users] [virtual interface] detach interface during boot succeed with no changes

2022-09-08 Thread Peter Krempa
On Thu, Sep 08, 2022 at 15:16:56 +0800, Yalan Zhang wrote:
> Hi Laine,
> 
> As for the hot-unplug behavior, I have one more question about it, could
> you please help to confirm?
> 
> unplugging a PCI device properly requires cooperation from the guest OS.
> > If the guest OS isn't running yet, the unplug won't complete, so qemu
> > (and libvirt) still show the device as plugged into the guest.
> >
> > virsh reports success on the unplug because unplugging a device is done
> > asynchronously - the "success" means "libvirt successfully told qemu to
> > unplug the device, qemu has told the virtual machine to unplug the
> > device, and is waiting for acknowledgment from the virtual machine that
> > the guest has completed removal". At some later time the guest OS may
> > complete its part of the unplug; when that happens, qemu will get a
> > notification and will send an event to libvirt - at that time the device
> > will be removed from libvirt's list of devices.
> >
> > tl;dr - this is all expected.
> >
> 
> The question is that, when I unplug it during boot, the virsh cmd will
> succeed but the interface still exists, which is expected.
> After the vm boot successfully, the guest OS will *not* complete this
> removal. When I tried to detach it again, it reported that the device was
> in the process of unplugging.
> Is this acceptable?
> 
> # virsh detach-interface rhel_new network 52:54:00:36:a8:d4
> Interface detached successfully
> # virsh domiflist rhel_new
>  Interface   Type  SourceModelMAC
> -
>  vnet4   network   default   virtio   52:54:00:36:a8:d4
> 
> # virsh detach-interface rhel_new network 52:54:00:36:a8:d4
> error: Failed to detach interface
> error: internal error: unable to execute QEMU command 'device_del': Device
> net0 is already in the process of unplug

The same problem was already reported for disks:

https://bugzilla.redhat.com/show_bug.cgi?id=2087047

https://gitlab.com/libvirt/libvirt/-/issues/309

The main problem is that qemu doesn't re-send the request to unplug the
device and rather reports an error. At the same time the guest OS
doesn't notice it any more, so the unplug can't be finished until the VM
is rebooted.



Re: Problem with a disk device of type 'volume'

2022-08-18 Thread Peter Krempa
On Tue, Aug 16, 2022 at 13:51:20 +0200, Frédéric Lespez wrote:
> Hi,
> 
> I have progressed in my research.
> 
> I created a minimal test case in order to reproduce the problem (see below).
> 
> I made tests on 3 (physical) machines under Debian 11.4: the problem is
> present on 2 machines but there is no problem on the third.
> 
> I booted a machine where the problem is present into a Debian 11.4 live OS
> and made the test : it works, no problem.
> 
> So far, all my tests lead me to the following conclusions:
>  - The problem is tied to the configuration of the system.
>  - It's not 'file permission' problem. The directory structure of the
> storage pool, the file permissions on this structure, the configuration of
> libvirt and qemu and the user under which the daemon runs are the same on
> all systems.
>  - I have made the test with libvirt 7.0.0 & qemu 1.5.2 and with libvirt
> 8.0.0 and qemu 1.7.0 (from Debian 11 backports). The different versions have
> the same behavior.
>  - Apparmor is not the culprit (no errors in the logs). I have also disabled it
> and the behavior is still the same.
> 
> I will appreciate any hint about what I should check to find the difference
> between the working systems and the failing ones.
> 
> Regards,
> Fred
> 
> How to make the test (as root):
> 
> 1/ Install libvirt & qemu if needed
> apt install libvirt-daemon-system qemu-system-x86 virtinst
> 
> 2/ Start libvirt daemon if needed
> systemctl start libvirtd
> 
> 3/ Create the default pool storage (if it is not created automatically)
> virsh pool-define-as default dir - - - - /var/lib/libvirt/images/
> virsh pool-build default
> virsh pool-start default
> 
> 5/ Download Debian 11.4 Generic cloud image and put it in the default
> storage pool
> wget -O /var/lib/libvirt/images/debian.qcow2 
> https://cloud.debian.org/images/cloud/bullseye/latest/debian-11-genericcloud-amd64.qcow2
> 
> 6/ Refresh the default storage and check the Debian image is visible.
> virsh pool-refresh default
> virsh vol-list --pool default
> 
> 7) Start the default network
> virsh net-start default
> 
> 8) Create a VM based on the Debian 11.4 Generic cloud image
> virt-install -n TESTBUG --disk vol=default/debian.qcow2  --memory 1024
> --import --noreboot --graphics none
> 
> 9/ Start the VM, it should start and work fine
> virsh start TESTBUG
> 
> 10/ Stop the VM
> virsh shutdown TESTBUG
> 
> 11/ Change the disk definition to switch the disk type from 'file' to
> 'volume' and adapt the 'source' attributes accordingly.
> virsh edit --domain TESTBUG
> 
> Change this section:
> 
> <disk type='file' device='disk'>
>   <driver name='qemu' type='qcow2'/>
>   <source file='/var/lib/libvirt/images/debian.qcow2'/>
>   <target dev='vda' bus='virtio'/>
> </disk>
> 
> to :
> 
> <disk type='volume' device='disk'>
>   <driver name='qemu' type='qcow2'/>
>   <source pool='default' volume='debian.qcow2'/>
>   <target dev='vda' bus='virtio'/>
> </disk>
> 
> 12/ Start the VM again. It will either succeed or fail with the following
> error:
> error creating libvirt domain: internal error: qemu unexpectedly closed the
> monitor: 2022-08-11T16:12:22.987252Z qemu-system-x86_64: -blockdev 
> {"driver":"file","filename":"/var/lib/libvirt/images/debian.qcow2","node-name":"libvirt-3-storage","auto-read-only":true,"discard":"unmap"}:
> Could not open '/var/lib/libvirt/images/debian.qcow2': Permission denied

Hi,

I'm fairly certain that the above is because of Apparmor. Specifically
the apparmor labelling code does not translate the pool/volume name to
the path to the image, while for other security drivers we use the
existing definition and thus do translate it.

I'm not familiar enough with apparmor to point you to how to configure
logging properly, though.

The issue originates from the fact that the apparmor driver uses a
helper process to setup the labelling and the helper process itself is
not able to access libvirt's storage driver and thus unable to do the
translation.

I'll try to think about a possibility to pass the path though.



Re: Memory locking limit and zero-copy migrations

2022-08-17 Thread Peter Krempa
On Wed, Aug 17, 2022 at 10:56:54 +0200, Milan Zamazal wrote:
> Hi,
> 
> do I read libvirt sources right that when <hard_limit> is not used in the
> libvirt domain then libvirt takes proper care about setting memory
> locking limits when zero-copy is requested for a migration?

Well yes, for a definition of "proper". In this instance qemu can lock
up to the guest-visible memory size of memory for the migration, thus we
set the lockable size to the guest memory size. This is a simple upper
bound which is supposed to work in all scenarios. Qemu is also unlikely
to ever use up all the allowed locking.

> I also wonder whether there are any other situations where memory limits
> could be set by libvirt or QEMU automatically rather than having no
> memory limits?  We had oVirt bugs in the past where certain VMs with
> VFIO devices couldn't be started due to extra requirements on the amount
> of locked memory and adding <hard_limit> to the domain apparently
> helped.

<hard_limit> is not only the amount of memory qemu can lock into RAM, but
an upper bound on all the memory the qemu process can consume. This
includes any qemu overhead, e.g. memory used for the emulation layer.

Guessing the correct size of overhead still has the same problems it had
and libvirt is not going to be in the business of doing that.
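For reference, a minimal sketch of the element in question (the value is
only an example and has to cover guest RAM plus all overhead):

  <memtune>
    <hard_limit unit='GiB'>20</hard_limit>
  </memtune>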



Re: Storage Pool - NFS v4.1

2022-07-01 Thread Peter Krempa
On Wed, Jun 29, 2022 at 17:58:28 +0200, Charles Koprowski wrote:
> Le jeu. 23 juin 2022 à 17:19, Peter Krempa  a écrit :
> 
> >
> > I've posted a patch that fixes this:
> >
> > https://listman.redhat.com/archives/libvir-list/2022-June/232541.html
> >
> >
> Thank you !
> 
> This has been reported to Canonical support :
> https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1980134

It's now pushed as commit c44930d932203b4a58dccbbeaa814fff6cea8216



Re: Storage Pool - NFS v4.1

2022-06-23 Thread Peter Krempa
On Thu, Jun 23, 2022 at 16:42:25 +0200, Charles Koprowski wrote:
> Hello,
> 
> In order to use pNFS which is only available in NFS version 4.1, I'm trying
> to create a netfs storage pool specifying the protocol version to be used :
> 
> <pool type='netfs'>
>   <name>vms</name>
>   <source>
>     ...
>     <format type='nfs'/>
>     <protocol ver='4.1'/>
>     ...
>   </source>
>   ...
> </pool>
> But the storage pool XML documentation [1] states that :

[...]

I've posted a patch that fixes this:

https://listman.redhat.com/archives/libvir-list/2022-June/232541.html



Re: managed save returns "migration with virtiofs device is not supported"

2022-06-16 Thread Peter Krempa
On Thu, Jun 16, 2022 at 13:09:11 +0200, Francesc Guasch wrote:
> On 15/6/22 at 23:25, Ján Tomko wrote:
> > On a Wednesday in 2022, Francesc Guasch wrote:
> > > Hello. I have a virtual machine with a virtiofs entry configured.
> > > 
> > > When I try to do a managed save it fails with this message:
> > > 
> > >    libvirt error code: 55, message: Requested operation is not valid:
> > >    migration with virtiofs device is not supported
> > > 
> > 
> > Yes, this is not yet implemented.
> > 
> 
> ok ! Thanks for answering me.
> 
> What still puzzles me is I didn't want to do a migration.
> I just wanted to do managed save.
> 
> Does this error message still applies ?

Saving of the execution state of a VM is basically identical to
migration. In case of qemu it is actually implemented using a migration
into the save file.

Certain configurations simply don't work with it, either due to technical
reasons (e.g. restoring the state of a hardware device might not be
possible to the extent that it's transparent to the guest OS) or because
support is simply not implemented.



Re: Domain XML and VLAN tagging

2022-06-16 Thread Peter Krempa
On Thu, Jun 16, 2022 at 09:20:21 +0200, Gionatan Danti wrote:
> Hi all,
> from here [1]:
> 
> "Network connections that support guest-transparent VLAN tagging include 1)
> type='bridge' interfaces connected to an Open vSwitch bridge Since 0.10.0 ,
> 2) SRIOV Virtual Functions (VF) used via type='hostdev' (direct device
> assignment) Since 0.10.0 , and 3) SRIOV VFs used via type='direct' with
> mode='passthrough' (macvtap "passthru" mode) Since 1.3.5 . All other
> connection types, including standard linux bridges and libvirt's own virtual
> networks, do not support it."
> 
> I read it correctly that when used on a classical linux bridge these vlan
> tags does nothing? If so, it is due to something related to the underlying
> bridge device (ie: incomplete support for vlan filtering) or it is because
> libvirt lacks the necessary "plumbing" to use advanced bridge features?

AFAIK it was simply never implemented. There's also an upstream feature
request for this:

https://gitlab.com/libvirt/libvirt/-/issues/157

> 
> Thanks.
> 
> [1] 
> https://libvirt.org/formatdomain.html#setting-vlan-tag-on-supported-network-types-only
> 
> -- 
> Danti Gionatan
> Supporto Tecnico
> Assyoma S.r.l. - www.assyoma.it
> email: g.da...@assyoma.it - i...@assyoma.it
> GPG public key ID: FF5F32A8
> 



Re: vepa-mode directly attached interface

2022-06-13 Thread Peter Krempa
On Sun, Jun 12, 2022 at 00:16:20 +0200, Gionatan Danti wrote:
> On 2022-06-11 19:32 Laine Stump wrote:

[...]

> > I guess you'll need to do a "virsh dumpxml --inactive" for each guest
> > and parse it out of there.
> 
> It matches my results. However, virsh dumpxml is an expensive operation -
> polkitd can easily burn >30% of a single core for 1s polling interval with
> virsh dumpxml. I settled (for testing) with grepping "mode..vepa" in
> /etc/libvirtd/qemu and it seems to work well (1s polling not even register
> in top).

The unfortunate thing about using `virsh dumpxml $VM` as written is that
it opens a connection (which uses polkit to auth), gets the XML and then
closes the connection.

If you want it to be more optimized you can e.g. write a script using
python bindings for libvirt and simply keep the connection open ...

> Does libvirt support calling some external script when a new virtual machine
> is defined?

... which additionally allows you to register 'domain lifecycle events'
[1] which give you a trigger when a new VM is defined.


https://www.libvirt.org/html/libvirt-libvirt-domain.html#virConnectDomainEventRegister
https://www.libvirt.org/html/libvirt-libvirt-domain.html#virDomainEventID
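If you'd rather stay in the shell, a rough equivalent sketch is to keep a
single virsh session open and let it print lifecycle events as they
arrive:

  $ virsh event --event lifecycle --loop

That keeps one connection (and thus one polkit authentication) open
instead of re-authenticating on every poll.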



Re: how to use "virsh shutdown domain --mode [initctl|signal|paravirt) ?

2022-06-01 Thread Peter Krempa
On Wed, Jun 01, 2022 at 12:05:58 +0200, Lentes, Bernd wrote:
> Hi,
> 
> occasionally my virtual domains running on a pacemaker cluster don't shut down,
> although being told to do it.
> "virsh help shutdown" says:
>  ...
> --mode   shutdown mode: acpi|agent|initctl|signal|paravirt
> 
> How is it possible to use initctl or signal or paravirt ?
> What do i have to do ? What are the prerequisites ?

I presume you use qemu/kvm as the hypervisor, right? In such a case only
'acpi' and 'agent' are available.

'initctl'/'signal' is meant for LXC containers, and
'paravirt' is a mode available for the Xen hypervisor.



Re: how to change emulator path during live migration

2022-04-27 Thread Peter Krempa
[re-adding libvirt-users list]

Please always reply to the list so that the follow-up conversation is
archived and delivered to all subscribers.

On Wed, Apr 27, 2022 at 15:36:54 +0800, Jiatong Shen wrote:
> Thank you for the feedback!
> 
> Is it ok if the source node does not contain a emulator path used by the
> dest node? for example, on src emulator path is /a/b/c, but
> on dest it is /a/b/d, and /a/b/d does not exist on src.

You can change the emulator path arbitrarily. The only limitation is
that the emulator you pick (the binary, not the path) must be able to
run the VM, but that will be validated during the migration.



Re: how to change emulator path during live migration

2022-04-27 Thread Peter Krempa
On Wed, Apr 27, 2022 at 14:11:23 +0800, Jiatong Shen wrote:
> Hello libvirt experts,
> 
>   I am facing the following exceptions during live migrating a virtual
> machine from one compute node to another.
> 
> file or directory: libvirt.libvirtError: Cannot check QEMU binary
> /usr/bin/kvm-spice: No such file or directory
>   File "/var/lib/openstack/lib/python3.6/site-packages/eventlet/tpool.py",
> line 83, in tworker
> rv = meth(*args, **kwargs)
>   File "/var/lib/openstack/lib/python3.6/site-packages/libvirt.py", line
> 1745, in migrateToURI3
> if ret == -1: raise libvirtError ('virDomainMigrateToURI3() failed',
> dom=self)
> libvirt.libvirtError: Cannot check QEMU binary /usr/bin/kvm-spice: No such
> file or directory
> 
> After some investigation, we found that this error is triggered because we
> do not have qemu-kvm installed in our container, btw the libvirt is
> directly installed on the source node.
> 
> I have following questions
> 
> Is it possible to change emulator during live migration? I try to to remove
> emulator under devices but looks like it does not help.

The migration API you've used (virDomainMigrateToURI3) supports
additional parameters. One of the supported parameters is:

VIR_MIGRATE_PARAM_DEST_XML:

/**
 * VIR_MIGRATE_PARAM_DEST_XML:
 *
 * virDomainMigrate* params field: the new configuration to be used for the
 * domain on the destination host as VIR_TYPED_PARAM_STRING. The configuration
 * must include an identical set of virtual devices, to ensure a stable guest
 * ABI across migration. Only parameters related to host side configuration
 * can be changed in the XML. Hypervisors which support this field will forbid
 * migration if the provided XML would cause a change in the guest ABI. This
 * field cannot be used to rename the domain during migration (use
 * VIR_MIGRATE_PARAM_DEST_NAME field for that purpose). Domain name in the
 * destination XML must match the original domain name.
 *
 * Omitting this parameter keeps the original domain configuration. Using this
 * field with hypervisors that do not support changing domain configuration
 * during migration will result in a failure.
 *
 * Since: v1.1.0
 */

So you fetch a migratable version of the XML (VIR_DOMAIN_XML_MIGRATABLE
flag for the XML dumping API) and update the emulator path. Then feed it
in as the additional parameter.
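If you end up driving this via virsh instead, a rough equivalent sketch
would be (VM name and destination URI are placeholders):

  $ virsh dumpxml --migratable $VM > dest.xml
  # edit the <emulator> path in dest.xml to the binary present on the destination
  $ virsh migrate --live --xml dest.xml $VM qemu+ssh://destination/system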



Re: set fixed time at vm guest startup?

2022-04-26 Thread Peter Krempa
On Fri, Apr 22, 2022 at 12:54:35 -0600, Fred Clift wrote:
> I'm looking at the  libvirt parameters.
> 
> Is there a way to set the guests clock to a specific time/date at  virt
> startup?  I'm trying to virtualize a system with pci passthrough for a
> hardware device that wants to always live in pre-2011.  The license manager
> for the software/hardware has 31-bit overflow of time calculations.

Hi, currently there is no such thing, only stuff that would allow you to
set an arbitrary offset to the current time, which would not be zero
maintenance.

However it's very simple to implement
https://listman.redhat.com/archives/libvir-list/2022-April/230492.html

That patchset introduces an 'absolute' mode for clock with an element
'start' where you can put an arbitrary unix epoch timestamp which is set
always on boot of the VM.



Re: race condition? virsh migrate --copy-storage-all

2022-04-19 Thread Peter Krempa
On Tue, Apr 19, 2022 at 15:51:32 +0200, Valentijn Sessink wrote:
> Hi Peter,
> 
> Thanks.
> 
> On 19-04-2022 13:22, Peter Krempa wrote:
> > It would be helpful if you provide the VM XML file to see how your disks
> > are configured and the debug log file when the bug reproduces:
> 
> I created a random VM to show the effect. XML file attached.
> 
> > Without that my only hunch would be that you ran out of disk space on
> > the destination which caused the I/O error.
> 
> ... it's an LVM2 volume with exact the same size as the source machine, so
> that would be rather odd ;-)

Oh, you are using raw disks backed by block volumes. That was not
obvious before ;)

> 
> I'm guessing that it's this weird message at the destination machine:
> 
> 2022-04-19 13:31:09.394+: 1412559: error : virKeepAliveTimerInternal:137
> : internal error: connection closed due to keepalive timeout

That certainly could be a hint ...

> 
> Source machine says:
> 2022-04-19 13:31:09.432+: 2641309: debug :
> qemuMonitorJSONIOProcessLine:220 : Line [{"timestamp": {"seconds":
> 1650375069, "microseconds": 432613}, "event": "BLOCK_JOB_ERROR", "data":
> {"device": "drive-virtio-disk2", "operation": "write", "action": "report"}}]
> 2022-04-19 13:31:09.432+: 2641309: debug : virJSONValueFromString:1822 :
> string={"timestamp": {"seconds": 1650375069, "microseconds": 432613},
> "event": "BLOCK_JOB_ERROR", "data": {"device": "drive-virtio-disk2",
> "operation": "write", "action": "report"}}

The migration of non-shared storage works as follows:

1) libvirt sets up everything
2) libvirt asks destination qemu to open an NBD server exporting the
   disk backends
3) source libvirt instructs qemu to copy the disks to the NBD server via
   a block-copy job
4) when the block jobs converge, source qemu is instructed to migrate
   memory
5) when memory migrates, source qemu is killed and destination is
resumed

Now from the keepalive failure on the destination it seems that the
network connection, at least between the migration controller and the
destination libvirt, broke. That might also cause the NBD connection to
break, and in such a case the block job gets an I/O error.

Note that the I/O error here stems from the network connection and not
from any storage issue.

So at this point I suspect that something in the network broke and the
migration was aborted in the storage copy phase, but it could have been
aborted in any other phase as well.



Re: race condition? virsh migrate --copy-storage-all

2022-04-19 Thread Peter Krempa
On Fri, Apr 15, 2022 at 16:58:08 +0200, Valentijn Sessink wrote:
> Hi list,
> 
> I'm trying to migrate a few qemu virtual machines between two 1G ethernet
> connected hosts, with local storage only. I got endless "error: operation
> failed: migration of disk vda failed: Input/output error" errors and
> thought: something wrong with settings.
> 
> However, then, suddenly: I succeeded without changing anything. And, hey:
>  while ! time virsh migrate --live --persistent --undefinesource
> --copy-storage-all ubuntu20.04 qemu+ssh://duikboot/system; do a=$(( $a + 1
> )); echo $a; done
> 
> ... retried 8 times, but then: success. This smells like a race condition,
> doesn't it? A bit weird is the fact that the migration seems to succeed
> every time while copying from revolving disks to SSD; but the other way
> around has this Input/output error.
> 
> There are some messages in /var/log/syslog, but not at the time of the
> failure, and no disk errors. These disks are LVM2 volumes and they live on
> raid arrays - and/so there is not a real, as in physical, I/O-error.
> 
> Source system has SSD's, target system has regular disks.
> 
> 1) is this the right mailing list? I'm not 100% sure.
> 2) how can I research this further? Spending hours on a "while / then" loop
> to try and retry live migration looks like a dull job for my poor computers
> ;-)

It would be helpful if you provide the VM XML file to see how your disks
are configured and the debug log file when the bug reproduces:

https://www.libvirt.org/kbase/debuglogs.html#less-verbose-logging-for-qemu-vms

Without that my only hunch would be that you ran out of disk space on
the destination which caused the I/O error.



Re: Virtio-scsi and block mirroring

2022-04-19 Thread Peter Krempa
On Thu, Apr 14, 2022 at 16:36:38 +, Bjoern Teipel wrote:
> Hello everyone,

Hi,

> 
> I’m looking at an issue where I do see guests freezing (Dl) process state 
> during a block disk mirror from one storage to another storage (NFS) where 
> the network stack of the guest can freeze for up to 10 seconds.
> Looking at the storage and IO I noticed good throughput ad low latency <3ms 
> and I am having trouble to track down the source for the issue, as neither 
> storage nor networking  show issues. Interestingly when I do the same test 
> with virtio-blk I do not really see the process freezes at the frequency or 
> duration compared to virtio-scsi which seem to indicate a client side rather 
> than storage side problem.

Hmm, this is really weird if the difference is in the guest-facing
device frontend.

Since libvirt is merely setting up the block job for the copy and the
copy itself is handled by qemu I suggest you contact the
qemu-bl...@nongnu.org mailing list.

Unfortunately you didn't provide any information on the disk
configuration (the VM XML) or how you start the blockjob, which I could
translate for you into qemu specifics. If you provide such information I
can do that to ensure that the qemu folks have all the relevant
information.



Re: Debugging hanging libvirt

2022-04-01 Thread Peter Krempa
On Wed, Mar 30, 2022 at 01:11:50 +, Tobias Hofmann (tohofman) wrote:
> Hello all,
> 
> I have a system with one VM running. After some time the VM needs to be 
> temporarily stopped and started again. This start operation fails and from 
> that point on any virsh command is hanging and does not execute.
> This issue is reproducible and I have already figured out that restarting 
> libvirtd resolves this issue. However, I’m now trying to understand why it’s 
> getting stuck in the first place.
> 
> I try not to get too much into detail because I think this would be more 
> confusing than it would actually help to understand the problem. In general, 
> I’m wondering what approach you should follow to debug why libvirt gets stuck.
> Online I’ve read that you should run this command: `# gdb -batch -p $(pidof 
> libvirtd) -ex 't a a bt f'`. I’ve run that command and attached the output to 
> this mail. However, I have to admit that I have no idea what to do with it.

So from the backtrace:


Thread 18 (Thread 0x7f58bcf3e700 (LWP 19010)):
#0  0x7f58c9e3a9f5 in pthread_cond_wait@@GLIBC_2.3.2 () from 
/lib64/libpthread.so.0
#1  0x7f58ccc79e26 in virCondWait () from /lib64/libvirt.so.0
#2  0x7f58a046f70b in qemuMonitorSend () from 
/usr/lib64/libvirt/connection-driver/libvirt_driver_qemu.so
#3  0x7f58a04811d0 in qemuMonitorJSONCommandWithFd () from 
/usr/lib64/libvirt/connection-driver/libvirt_driver_qemu.so
#4  0x7f58a0482a01 in qemuMonitorJSONSetCapabilities () from 
/usr/lib64/libvirt/connection-driver/libvirt_driver_qemu.so
#5  0x7f58a044f567 in qemuConnectMonitor () from 
/usr/lib64/libvirt/connection-driver/libvirt_driver_qemu.so
#6  0x7f58a04506f8 in qemuProcessWaitForMonitor () from 
/usr/lib64/libvirt/connection-driver/libvirt_driver_qemu.so
#7  0x7f58a0456a52 in qemuProcessLaunch () from 
/usr/lib64/libvirt/connection-driver/libvirt_driver_qemu.so
#8  0x7f58a045a4b2 in qemuProcessStart () from 
/usr/lib64/libvirt/connection-driver/libvirt_driver_qemu.so
#9  0x7f58a04bd5c6 in qemuDomainObjStart.constprop.50 () from 
/usr/lib64/libvirt/connection-driver/libvirt_driver_qemu.so
#10 0x7f58a04bdd26 in qemuDomainCreateWithFlags () from 
/usr/lib64/libvirt/connection-driver/libvirt_driver_qemu.so
#11 0x7f58cce1524c in virDomainCreate () from /lib64/libvirt.so.0
#12 0x55e7499a3da3 in remoteDispatchDomainCreateHelper ()

Libvirtd is not actually entirely stuck but it's waiting for qemu to
start communicating with us when starting the VM.

Now why qemu got stuck it's not clear from this.

To un-stick libvirt it should not be necessary to restart it; it should
be enough to run 'virsh destroy $VM', which kills off the stuck qemu.

Now you can use the same approach you did with collecting the backtrace
of libvirtd to check what qemu is doing.

Additionally you can also have a look in /var/log/libvirt/qemu/$VM.log
to see whether qemu logged something.
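A rough sketch of those steps (VM and binary names are placeholders; pick
the qemu process that belongs to your VM):

  $ gdb -batch -p $(pidof qemu-kvm) -ex 't a a bt'
  $ tail /var/log/libvirt/qemu/$VM.log
  $ virsh destroy $VM    # kills the stuck qemu and releases the job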

> System related info:
> 
>   *   OS: CentOS 7.8.2003
>   *   libvirt version: 4.5.0-33



Re: Resize backing store

2022-03-08 Thread Peter Krempa
On Tue, Mar 08, 2022 at 06:17:32 +0200, Jeff Brown wrote:
> Apologies in advance if this be a stupid question, as everything I've read
> seems to suggest it's impossible.
> 
> Slowly running out of space on a 200GB root partition.
> 
> Is it possible to create a snapshot - leaving the QCOW2 image 'quiescent' -
> resize it using 'qemu-img resize' - then pivot the snapshot back? (Then
> shutdown, attach the storage to another bootable OS, and run fdisk and
> resize2fs.)
> 
> Everything I've read says to shutdown the VM; but besides the real
> possibility of the snapshot blockcommit being corrupted when pivoted into
> the newly resized backing store, I don't see why it shouldn't work. And save
> some significant downtime.
> 
> Please can someone advise if this is actually possible, or if I'm wasting my
> time?

We actually also have 'virsh blockresize', which allows you to modify the
size of the disk while the VM is running.

There are some caveats, e.g. certain disk bus and operating system
combinations will not notice that the size increased, but in most common
cases it actually works while the VM is running. The workaround if that
happens is to just reboot the guest.

The common approach is to:

virsh blockresize $VM $DISK $NEWSIZE (look into the manual for how size is 
treated)

in the vm then:

1) increase the size of the partition
2) resize LVM (physical volume) if used
3) resize filesystem using resize2fs or similar
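As a sketch of those steps for the simple non-LVM case (disk target,
partition and size are placeholders; growpart comes from cloud-utils):

  $ virsh blockresize $VM vda 100G

  # inside the guest:
  $ growpart /dev/vda 1
  $ resize2fs /dev/vda1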



Re: libvirtd daemon missing in LFS

2022-02-10 Thread Peter Krempa
On Thu, Feb 10, 2022 at 11:39:17 +0530, Sai Kiran Kumar Reddy wrote:
> Hi,
> 
> There was some issue with pkg-config-path. I have fixed it. I see that it
> looks for wireshark and other dependencies. I get an error saying "remote
> driver is required for libvirtd daemon". I am not sure what this error

So you are missing some of the dependencies needed by the remote driver,
which in turn is needed for the libvirtd daemon. What I've suggested
before can be applied similarly to any other option.

To list full configuration of the project along with options that were
selected you can run 'meson configure' from the builddir and it will
print all options:

$ meson configure

[... snipped ... ]

  Project options        Current Value   Possible Values              Description
  ---------------        -------------   ---------------              -----------
  apparmor               auto            [enabled, disabled, auto]    apparmor support
  apparmor_profiles      auto            [enabled, disabled, auto]    install apparmor profiles
  attr                   auto            [enabled, disabled, auto]    attr support
  audit                  auto            [enabled, disabled, auto]    audit support
  bash_completion        auto            [enabled, disabled, auto]    bash-completion support
  bash_completion_dir                                                 directory containing bash completion scripts
  blkid                  auto            [enabled, disabled, auto]    blkid support
  capng                  auto            [enabled, disabled, auto]    cap-ng support
  ch_group                                                            groupname to run Cloud-Hypervisor system instance as
  ch_user                                                             username to run Cloud-Hypervisor system instance as
  chrdev_lock_files                                                   location for UUCP style lock files for character devices
                                                                      (leave empty for default paths on some platforms)
  curl                   auto            [enabled, disabled, auto]    curl support
  docdir                                                              documentation installation directory
  docs                   auto            [enabled, disabled, auto]    whether to generate documentation
  driver_bhyve           auto            [enabled, disabled, auto]    bhyve driver
  driver_ch              auto            [enabled, disabled, auto]    Cloud-Hypervisor driver
  driver_esx             auto            [enabled, disabled, auto]    esx driver
  driver_hyperv          auto            [enabled, disabled, auto]    Hyper-V driver
  driver_interface       auto            [enabled, disabled, auto]    host interface driver
  driver_libvirtd        auto            [enabled, disabled, auto]    libvirtd driver
  driver_libxl           auto            [enabled, disabled, auto]    libxenlight driver
  driver_lxc             auto            [enabled, disabled, auto]    Linux Container driver
  driver_network         auto            [enabled, disabled, auto]    virtual network driver
  driver_openvz          auto            [enabled, disabled, auto]    OpenVZ driver
  driver_qemu            auto            [enabled, disabled, auto]    QEMU/KVM driver
  driver_remote          auto            [enabled, disabled, auto]    remote driver
  driver_secrets         auto            [enabled, disabled, auto]    local secrets management driver
  driver_test            auto            [enabled, disabled, auto]    test driver
  driver_vbox            auto            [enabled, disabled, auto]    VirtualBox XPCOMC driver
  driver_vmware          auto            [enabled, disabled, auto]    VMware driver
  driver_vz              auto            [enabled, disabled, auto]    Virtuozzo driver
  dtrace                 auto            [enabled, disabled, auto]    use dtrace for static probing
  expensive_tests        auto            [enabled, disabled, auto]    set the default for enabling expensive tests (long

[...]


> means. Does it mean that I have to install wireshark or is it looking for

No, wireshark is optional; it's needed only if you want to build the
dissector for the libvirt protocol as a plugin for wireshark.

> something else. Could you please help me out here.

So 

Re: libvirtd daemon missing in LFS

2022-02-09 Thread Peter Krempa
On Wed, Feb 09, 2022 at 16:34:43 +0530, Sai Kiran Kumar Reddy wrote:
> Hi,
> 
> I am Sai Kiran. I am trying to build libvirt from source on my Linux From
> Scratch(LFS) system. I have successfully installed libvirt and its
> dependencies. When I start virt-manager, there is a prompt in GUI saying
> that "libvirtd service is not installed".  I also do not see any libvirtd
> in my system. I would like to install the libvirtd service(build from
> source). But I am not able to find the source code for it and also, I am
> not sure about the build process for the daemon. Could you please help me
> out here.

The libvirt daemon is integral part of the libvirt project so the
sources you used to build the library also contain the daemon sources.

In your instance it's most likely that you are missing a dependancy and
libvirtd was not auto-enabled. To force-enable it configure libvirt with

 '-Ddriver_libvirtd=enabled'

which should report what you are missing. Since you are using LFS you
need to ensure that you have all deps yourself.
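For example, roughly (the remote driver is also required by libvirtd, so
you may need to enable it as well):

  $ meson setup build -Ddriver_libvirtd=enabled -Ddriver_remote=enabled
  $ ninja -C build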



Re: NVMe drive PCI passthrough and suprise hotplug

2022-02-04 Thread Peter Krempa
On Thu, Feb 03, 2022 at 23:25:05 +, Kalra, Ashish wrote:
> [AMD Official Use Only]

Well, I hope it's okay to use it for libvirt officially too ;)

> 
> Hi,
> I am using Fedora 33, with the following KVM, qemu and libvirt versions:

Note that Fedora 33 is already end-of-life, it would be great if you can
re-test with a more recent version

> QEMU 5.1.0
> libvirt 6.6.0

specifically this was released 1.5 years ago

> KVM 5.14.18
> 
> We have done pass-through of a PCIe NVMe device to the guest running on FC33
> using either virt-manager or virsh and then we do the hot-unplug of the device
> while it is attached to the guest.
> 
> The device is no longer seen on the guest hardware device list on virt-manager
> and then we hotplug the device again and we are able to use it on the Host,
> but when we try to re-attach it to the guest, we get the following error 
> message:
> 
> Requested operation is not valid, PCI device :c4::00.0 is in use by 
> driver QEMU,
> Domain fedora 33.

[...]

Unfortunately the tracing you've done doesn't really help in seeing what
gone wrong in libvirt.

Please attach debug logs per https://www.libvirt.org/kbase/debuglogs.html



Re: udmabuf error with libvirt + QEMU

2022-02-01 Thread Peter Krempa
On Tue, Feb 01, 2022 at 14:27:55 +, M, Shivakumar wrote:
> Hi,
> 
> We are seeing an issue with udmabuf, where it says "open /dev/udmabuf: No
> such file or directory" even if the device exists. This issue we are seeing
> particularly with libvirt. When we run the QEMU args on the command line,
> everything works as expected.
> It seems to be some permission issue when we use libvirt; please help us
> with resolving this.

libvirt runs qemu in a mount namespace where we propagate only nodes
from /dev/ which are known and used by libvirt, so that the qemu process
is confined to only what it needs.

Ideally you'd implement support for the 'blob' parameter you are using in
a way which allows the use of the appropriate files for qemu:

> 
> Libvirt XML:
> 
> http://libvirt.org/schemas/domain/qemu/1.0 type="kvm">
>   win-vm-0
>   4194304
>   4194304
>   
> 
>   
>   6
>   
> hvm
> 
> 
>   
>   
> 
> 
> 
>   
>   
>   
> 
> 
>   
>   
> 
>   
>   
>  
>   destroy
>   restart
>   destroy
>   
> /usr/bin/qemu-system-x86_64
> 
>   
>   
>   
>   
>   
>   
> 
> 
>   
>   
> 
> 
>   
>   
> 
>   
> 
> 
>   
>   
> 
> 
>   
>   
> 
>   
> 
>   
>   
> 
> 
> 
> 
> 

Libvirt doesn't interpret this in any way so you'll need to implement
support for what you want.

Alternatively as a proof-of-concept/workaround you can set the
'cgroup_device_acl' setting in /etc/libvirt/qemu.conf. But as noted that
is not really a supportable solution.
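As a sketch of that workaround: extend the device ACL in qemu.conf with
the node qemu needs and restart the daemon (the default entries shown
here may differ slightly between versions):

  # /etc/libvirt/qemu.conf
  cgroup_device_acl = [
      "/dev/null", "/dev/full", "/dev/zero",
      "/dev/random", "/dev/urandom",
      "/dev/ptmx", "/dev/kvm",
      "/dev/udmabuf"
  ]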



Re: Backend memory object creation - query

2022-01-27 Thread Peter Krempa
On Thu, Jan 27, 2022 at 04:05:19 +, M, Shivakumar wrote:
> Hello,

Hi,

note that there's no need to CC both the libvir-list and libvirt-users
mailing lists.

> 
> 
> For our use-case with Libvirt we want to create the Memory backend object ,
> 
> Expected QEMU args would be
> -object memory-backend-memfd,id=mem1,size=4096M

Your description is a bit vague. Do you want to use it to back the guest
memory?

By itself a memory backend object is useless without being attached
somewhere.

In case you want to configure qemu to back the default memory of the VM
by a memfd:

Per https://www.libvirt.org/formatdomain.html#memory-backing


  <domain>
    ...
    <memoryBacking>
      <source type='memfd'/>
    </memoryBacking>
    ...
  </domain>


Results into invoking qemu with:

-machine 
pc-i440fx-2.9,usb=off,vmport=off,dump-guest-core=off,memory-backend=pc.ram \
...
-object 
'{"qom-type":"memory-backend-memfd","id":"pc.ram","x-use-canonical-path-for-ramblock-id":false,"size":1048576000,"host-nodes":[0],"policy":"bind"}'
 \


> 
> Request you to please help us to specify this arg in the libvirt XML.
> 
> Thanks,
> Shiv



Re: What does the positional parameters of "virsh backup-begin" actually do?

2022-01-14 Thread Peter Krempa
On Fri, Jan 14, 2022 at 23:39:10 +0600, Ahmad Ismail wrote:
> Normally when I backup a kvm machine I shutdown the machine then run:

[...]

> The problem with this help is, it is not clear enough.
> 
> I understand that I should use virsh backup-begin vm1 to backup a live kvm
> machine. However, this command only creates .qcow2 files. What about the .xml
> file?
> 
> What does --backupxml , --checkpointxml & --reuse-external actually do?
> 
> When should I use them?

'man virsh' states:

   backup-begin
   Syntax:

      backup-begin domain [backupxml] [checkpointxml] [--reuse-external]

   Begin a new backup job. If backupxml is omitted, this defaults to a
   full backup using a push model to filenames generated by libvirt;
   supplying XML allows fine-tuning such as requesting an incremental
   backup relative to an earlier checkpoint, controlling which disks
   participate or which filenames are involved, or requesting the use of
   a pull model backup. The backup-dumpxml command shows any resulting
   values assigned by libvirt. For more information on backup XML, see:
   https://libvirt.org/formatbackup.html

   If --reuse-external is used it instructs libvirt to reuse temporary
   and output files provided by the user in backupxml.

   If checkpointxml is specified, a second file with a top-level element
   of domaincheckpoint is used to create a simultaneous checkpoint, for
   doing a later incremental backup relative to the time the backup was
   created. See checkpoint-create for more details on checkpoints.

   This command returns as soon as possible, and the backup job runs in
   the background; the progress of a push model backup can be checked
   with domjobinfo or by waiting for an event with event (the progress
   of a pull model backup is under the control of whatever third party
   connects to the NBD export). The job is ended with domjobabort.
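As for the .xml file part of your question: backup-begin only deals with
the disk data; if you want the domain configuration stored next to it, a
simple sketch is:

   virsh backup-begin vm1
   virsh backup-dumpxml vm1 > vm1-backup.xml   # XML describing the backup job
   virsh dumpxml vm1 > vm1.xml                 # the domain definition itself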


Does this clarify it sufficiently?



Re: Allow unsafe migration - ?

2022-01-03 Thread Peter Krempa
On Mon, Jan 03, 2022 at 10:14:48 +0100, Peter Krempa wrote:
> On Wed, Dec 29, 2021 at 17:56:57 +, lejeczek wrote:
> > Hi guys.
> > 
> > Is it possible to allow 'unsafe' migration globally?
> 
> No there's no global switch for this and there won't be one as there are
> multiple reasons why migration can be unsafe.

Additionally, would you mind describing your setup and why you even need
unsafe? Maybe there's a gap that we can fix.



Re: Allow unsafe migration - ?

2022-01-03 Thread Peter Krempa
On Wed, Dec 29, 2021 at 17:56:57 +, lejeczek wrote:
> Hi guys.
> 
> Is it possible to allow 'unsafe' migration globally?

No there's no global switch for this and there won't be one as there are
multiple reasons why migration can be unsafe.



Re: Qemu monitor info tlb gives unable to encode message payload

2021-12-08 Thread Peter Krempa
On Wed, Dec 08, 2021 at 10:30:27 +, Philipp Klocke wrote:
> Hi,
> 
> the command
> virsh qemu-monitor-command ubuntu_vm --hmp --cmd "info tlb"
> fails with "error: Unable to encode message payload".
> 
> I found a bugtracker entry for a similar error [1], but I don't know if this is
> the same error (message too large). I also don't know how large an info tlb
> message is.
> Preferably I would not have to recompile libvirt just to issue this monitor 
> command..

Libvirt unfortunately limits strings to 4MiB:

const REMOTE_STRING_MAX = 4194304;

And the reply from qemu-monitor-command is a single string. Now
internally we process JSON messages up to 10 MiB so one could argue that
we should increase the size for the 'qemu-monitor-command' reply up to
10MiB. This could be straightforward but it's questionable whether it's
worth it.

> Then I thought about circumventing the error by connecting directly to the 
> qemu monitor via netcat, but I found a thread [2] that says I cannot add my 
> own "-monitor tcp:..." to the Qemu commandline arguments.

IIRC at that point qemu wasn't able to handle two monitor connections.
Nowadays it is possible to have two concurrent connections to the
monitor. Obviously things may break and you get to keep the pieces if
they do.

By adding:

  <qemu:commandline>
    <qemu:arg value='-qmp'/>
    <qemu:arg value='tcp:127.0.0.1:1235'/>
  </qemu:commandline>

I then get a connection from qemu on the socket when starting the VM:

  $ nc -l -p 1235
  {"QMP": {"version": {"qemu": {"micro": 93, "minor": 1, "major": 6}, 
"package": "v6.2.0-rc3"}, "capabilities": ["oob"]}}

I can then start conversing with the monitor:

  {"execute":"qmp_capabilities"}
  {"return": {}}
  {"execute":"human-monitor-command","arguments":{"command-line":"info"}}
  {"return": "info balloon  - 


Out of curiosity, what do you specifically need 'info tlb' for?



Re: Libvirt Snapshot Question

2021-12-02 Thread Peter Krempa
On Thu, Dec 02, 2021 at 16:27:00 +0100, Elias Mobery wrote:
> OK, so I tried both ways. (The VM image is in the read-only squashfs.)
> 
> 1.) Editing VM domain XML:
> 
> 
> 
> 
> 
> Error:  Internal Error - Could not open
> /var/lib/libvirt/images/vm.snapshot1 : Permission Denied
> 
> I chowned everything to libvirt but no changes. Maybe a read-only conflict?

Yes, the overlay is created in the same path as the original image, just
with a suffix, so if that is a read-only FS it will not work.

I've thought about adding a possibility to specify the location for the
overlay but didn't ever get to implementing it actually.

> 2.) Adding second disk to VM domain XML
> (shortened )
> 
>  
> source /var/lib/images/vm.qcow2
> target dev=vda
>  
> 
> 
>  
>  source /var/lib./images/vm.qcow2
>  target dev=vdb
> 
>  
> 
> Error: unsupported configuration: cannot create external snapshot for disk
> vdb - collision with disk vdb.

Could you please post the unabbreviated steps? I don't really know
what's going on based on this.

> After googling I found that a lot of people get permission denied errors
> using transient. I will keep looking, really not sure.



Re: Libvirt Snapshot Question

2021-12-02 Thread Peter Krempa
On Wed, Dec 01, 2021 at 18:35:44 +0100, Elias Mobery wrote:
> Hey man, I tried both suggestions, thanks again.
> 
> Option 1.) Create/attach a second qcow2 disk & unplug when wiping.
> This did not work unfortunately. After wiping the overlay filesystem on the
> host (where the snapshots are sitting), I cannot create another snapshot, a
> similar error message like before appears.

Okay, that's weird. Could you please post your exact steps?

Additionally if you have a new enough libvirt you can actually possibly
even try using a disk with . This transparently creates a
overlay (in the same path as the image though, not sure if it suits your
setup) and automatically discards it when unplugging or shutting down
the VM.

In such case you could simply unplug and replug the disk, but note that
this feature was implemented only this year IIRC, so you need a fairly
recent libvirt to do so.

> 
> Option 2.) Enabling Discard
> 
> Now I think this worked, but I'm not sure. I read all the docs, and enabled
> discard like so on the VM domain:
> 
> <driver name='qemu' type='qcow2' discard='unmap'/>

This is the correct setting. Unmap means that the qcow2 image should
unmap the blocks which were discarded.

> 
> I tried both discard=unmap and discard=on (same thing)
> 
> Then I enabled fstrim in the guest like this:
> (Not entirely sure if I am missing smth)
> 
> sudo fstrim -av
> 
> Or permanently
> 
> sudo systemctl enable --now fstrim.timer

Yup, both should do fine. Now if you delete something in the VM (which
would result in freeing blocks in the overlay; deleting something from
the base image won't help obviously) the overlay image should shrink.



Re: Libvirt Snapshot Question

2021-11-30 Thread Peter Krempa
On Tue, Nov 30, 2021 at 14:51:54 +0100, Elias Mobery wrote:
> Hello Peter, thank you so much for that detailed info!
> 
> Sorry, you were right, when trying to delete my external snapshot via
> snapshot-delete, the error says "deletion of external snapshots unsupported"
> 
> I can't merge the snapshot because the VM image is in a read-only
> filesystem. Sorry I should've said, it's a live system. So the image is in
> the read-only squashfs and external snapshot in overlay is used for writing.

Okay, that changes the situation quite a bit:

> 
> Now I would like the snapshot emptied or deleted/recreated when it reaches
> 4GB.
> 
> Is there even a way to do this with the image being read-only?

If the base image is on a read-only filesystem you obviously can't
commit to it.

Now the question is what should happen to the data in the overlay.

Discarding/recreating the overlay image is possible only once you turn
off the VM because it basically rolls back the state of the disk back to
the time when the overlay was created. This means that everything
written to the disk will be lost.

Filesystems obviously can't handle that so that's why it simply won't be
possible to do live with the root image.

You can have a second disk, which you hot-unplug, wipe the overlay and
plug it back.

Another possibility is to enable trim/discard and just simply delete the
data in the VM which was added after the overlay was created. When
trim/discard is enabled on all layers including the guest filesystem,
then deleting stuff inside the VM will also mean that the overlay will
shrink again.
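A rough sketch of what enabling that looks like (the disk's driver line
in the domain XML, plus trimming inside the guest):

  <driver name='qemu' type='qcow2' discard='unmap'/>

  # inside the guest:
  $ fstrim -av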



Re: Libvirt Snapshot Question

2021-11-30 Thread Peter Krempa
On Mon, Nov 29, 2021 at 13:05:40 +0100, Elias Mobery wrote:
> Hi everybody!
> 
> I'm using a snapshot purely for writes and was wondering if it's possible
> to clear/empty the snapshot after it reaches a certain size?
> 
> Created with:
>  virsh snapshot-create-as  --disk-only

So this creates a so-called external snapshot, which for disks means an
overlay image.

> 
> I tried deleting and recreating it but get an error at deletion:
> 
> virsh snapshot-delete

but unfortunately deleting external snapshots is not yet implemented.

The usual approach is to do a block commit operation to merge the
overlay images back into the backing image manually.

The basic syntax via virsh is:

  virsh blockcommit --active --pivot --verbose $VM $DISK

This merges any overlay images into the backing image. The backing image
is the deepest nested image when you look at the <backingStore> elements
in the XML.

> Error: source for disk 'vda' is not a regular file

This error doesn't make much sense though. The error should state that
deletion of external snapshots is not supported.

> 
> I also tried virsh snapshot-create-as --no-metadata --disk-only

This creates a snapshot without libvirt metadata. This means that
libvirt just creates the overlay, but doesn't create any internal state
related to it.

It saves you the step of deleting metadata, but you still need to
manually merge the image.

> 
> But the same error as above pops up when trying to delete.

This makes absolutely no sense though. Are you sure you are copying the
correct error message? 'virsh snapshot-delete' works _only_ with
snapshots which do have metadata, so if you create a snapshot without
there's nothing to delete.



Re: virsh blockjob $domain --abort

2021-11-29 Thread Peter Krempa
On Tue, Nov 30, 2021 at 08:15:30 +0200, Jeff Brown wrote:
> Please advise:
> 
> After a scheduled crontab snapshot backup of a VM it failed to do a
> blockcommit, and is currently writing to both the snapshot and the original
> QCOW2 images.
> 
> (I don't think it's a bug, and suspect that I caused it by running the
> backup script manually while the snapshot was already running.)
> 
> $ virsh blockcommit VM sda --active --pivot --shallow --verbose
> error: block copy still active: disk 'sda' already in active block job

The '--pivot' switch here ...

> 
> $ virsh blockjob VM sda --info
> Active Block Commit: [100 %]
> 
> $ virsh domblkerror VM
> No errors found
> 
> Running on Debian 9 (Stretch).
> 
> $ virsh version
> Compiled against library: libvirt 3.0.0
> Using library: libvirt 3.0.0
> Using API: QEMU 3.0.0
> Running hypervisor: QEMU 2.8.1

(this is a rather old libvirt version, but what I suggest below should
work)

> 
> Now, I've read up on the probable solution, somewhat, but running $ virsh
> blockjob $domain --abort on this production server scares me; and I'm just
> asking whether it would be advisable to stop services and make a full data
> backup prior to doing so? It would entail taking the services offline for at
> least an hour. The snapshot is currently at 12GB,  and according to blockjob
> --info the VM image is 100% in synch.

... is equivalent to doing a 'virsh blockjob $VM $DISK --pivot' after the
blockjob is complete. In your case, the original virsh invocation was
probably killed or missed the finishing event and thus didn't finalize
the blockjob.



Re: Compiling libvirt on ubuntu

2021-09-13 Thread Peter Krempa
On Mon, Sep 13, 2021 at 01:36:57 +, Or Ozeri wrote:
> Hi,
> 
> 
> I'm trying to compile libvirt on ubuntu machine.
> I installed meson 0.59.1 using pip.
> Then installed a few more packages that were required by meson build:
> sudo apt-get install libxml2-utils xsltproc libpciaccess-dev ninja-build
> 
> 
> 
> Finally, I followed the simple instructions over here: 
> https://libvirt.org/compiling.html
> 
> xz -dc libvirt-7.7.0.tar.xz | tar xvf -
> cd libvirt-7.7.0
> meson build
> ninja -C build
> 
> The last command fails with:
> 
> FAILED: src/libvirt-admin.so.0.7007.0
> cc  -o src/libvirt-admin.so.0.7007.0 src/libvirt_probes.o 
> src/libvirt-admin.so.0.7007.0.p/meson-generated_.._admin_admin_protocol.c.o 
> src/libvirt-admin.so.0.7007.0.p/admin_libvirt-admin.c.o 
> src/libvirt-admin.so.0.7007.0.p/datatypes.c.o -Wl,--as-needed 
> -Wl,--no-undefined -shared -fPIC -Wl,--start-group 
> -Wl,-soname,libvirt-admin.so.0 
> '-Wl,-rpath,$ORIGIN/:' 
> -Wl,-rpath-link,/home/oro/ozeri/libvirt-7.7.0/build/src 
> src/libvirt.so.0.7007.0 
> -Wl,--version-script=/home/oro/ozeri/libvirt-7.7.0/build/src/admin/libvirt_admin.syms
>  -Wl,-z,nodelete /usr/lib/x86_64-linux-gnu/libcap-ng.so 
> /usr/lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu/libdevmapper.so 
> /usr/lib/x86_64-linux-gnu/libgnutls.so -Wl,-z,relro -Wl,-z,now 
> -Wl,--no-copy-dt-needed-entries /usr/lib/x86_64-linux-gnu/libglib-2.0.so 
> /usr/lib/x86_64-linux-gnu/libgobject-2.0.so 
> /usr/lib/x86_64-linux-gnu/libgio-2.0.so /usr/lib/x86_64-linux-gnu/libxml2.so 
> /usr/lib/x86_64-linux-gnu/libyajl.so -Wl,--end-grou
 p
> /usr/bin/ld: src/libvirt-admin.so.0.7007.0.p/admin_libvirt-admin.c.o: in 
> function `callFull':
> /home/oro/ozeri/libvirt-7.7.0/build/../src/admin/admin_remote.c:99: undefined 
> reference to `virNetClientProgramCall'
> /usr/bin/ld: src/libvirt-admin.so.0.7007.0.p/admin_libvirt-admin.c.o: in 
> function `remoteAdminConnectClose':
> /home/oro/ozeri/libvirt-7.7.0/build/../src/admin/admin_remote.c:197: 
> undefined reference to `virNetClientSetCloseCallback'

You've seem to have run into the same issue that is reported as:

https://gitlab.com/libvirt/libvirt/-/issues/196

If I read the issue correctly the following should fix it for you:

meson build -Ddriver_remote=enabled



Re: startupPolicy issue when changing CD

2021-09-10 Thread Peter Krempa
On Fri, Sep 10, 2021 at 15:34:57 +0200, Vojtech Juranek wrote:
> On Friday, 10 September 2021 15:12:07 CEST Peter Krempa wrote:
> > On Fri, Sep 10, 2021 at 14:53:23 +0200, Vojtech Juranek wrote:
> > > Hi,
> > > when adding support for CD disk on block based storage into oVirt,
> > > I spotted following issue. When starting VM without CD, we add
> > > startupPolicy='optional' attribute into  element.
> > > 
> > > Whole XML looks like this:
> > > 
> > > 
> > >   
> > >   
> > >   
> > >   
> > >   
> > >   
> > > 
> > > 
> > > 
> > > To change/insert CD we use libvirt.updateDeviceFlags() with XML which
> > > looks like this (for block based disk):
> > > 
> > > 
> > > 
> > > 
> > > <source dev="/rhev/data-center/mnt/blockSD/cdac2a0c-b110-456d-a988-7d588626c871/images/638247d7-b4b1-4d98-87fa-c90235fcf4b1/145e7cd2-f92d-4eec-a8fb-6835b4b652e1" />

Please note that according to the updateDevice API docs you should
provide a fully-defined XML of the device, and not omit fields you don't
want to change.

This fact was even discussed today on the mailing list:

https://listman.redhat.com/archives/libvir-list/2021-September/msg00246.html

> > > 
> > > 
> > > 
> > > However, updateDeviceFlags() fails with
> > > 
> > > libvirt.libvirtError: XML error: 'startupPolicy' is only valid for
> > > 'file' type volume> 
> > > What is the reason for this error? We don't use `startupPolicy` attribute
> > > for block based disks, as shown on example above.
> > 
> > Hmm, the bug is that the disk source change is attempted before the
> > update to startup policy, thus the validator is unhappy.
> 
> is this a libvirt bug or intentional behavior and the CD change should be 
> done 
> in two steps:
> * remove startupPolicy from the disk
> * do actual change of the disk
> ?

Fix proposed:

https://listman.redhat.com/archives/libvir-list/2021-September/msg00283.html



Re: startupPolicy issue when changing CD

2021-09-10 Thread Peter Krempa
On Fri, Sep 10, 2021 at 14:53:23 +0200, Vojtech Juranek wrote:
> Hi,
> when adding support for CD disk on block based storage into oVirt, 
> I spotted following issue. When starting VM without CD, we add 
> startupPolicy='optional' attribute into <source> element.
> Whole XML looks like this:
> 
> 
>   
>   
>   
>   
>   
>   
> 
> 
> To change/insert CD we use libvirt.updateDeviceFlags() with XML which 
> looks like this (for block based disk):
> 
> 
> 
> <source dev="/rhev/data-center/mnt/blockSD/cdac2a0c-b110-456d-a988-7d588626c871/images/638247d7-b4b1-4d98-87fa-c90235fcf4b1/145e7cd2-f92d-4eec-a8fb-6835b4b652e1" />
> 
> 
> 
> However, updateDeviceFlags() fails with 
> 
> libvirt.libvirtError: XML error: 'startupPolicy' is only valid for 'file' 
> type volume
> 
> What is the reason for this error? We don't use `startupPolicy` attribute for 
> block
> based disks, as shown on example above.

Hmm, the bug is that the disk source change is attempted before the
update to startup policy, thus the validator is unhappy.



Re: one qustion about the snapshot of the libvirt, thanks.

2021-09-05 Thread Peter Krempa
On Tue, Aug 31, 2021 at 16:09:50 +0800, Guozhonghua wrote:
> 
> Hello, 
> 
> When we test snapshot features and review the code of libvirt, there is one 
> question, not an issue. 
> 
> /* do the memory snapshot if necessary */
> if (memory) {
> /* check if migration is possible */
> if (!qemuMigrationSrcIsAllowed(driver, vm, false, 0))
> goto cleanup;
> 
> While making a snapshot with memory on one VM, it is not allowed when
> the VM has some src devices, such as PCI devs, with which the VM is not
> allowed to be migrated.
> I want to know the reason: why should it check this condition?

The reason for this check is that the state of host devices namely PCI
devices can't be serialized by qemu and saved in the snapshot. That
means that when reverting the devices would not be configured properly
and would not work.

Internally qemu and libvirt use the migration code to serialize the
state of the VM and that is the reason why 'qemuMigrationSrcIsAllowed'
is called here, because it uses the same implementation.



Re: Disk extend during migration

2021-08-02 Thread Peter Krempa
On Mon, Aug 02, 2021 at 15:34:52 +0200, Vojtech Juranek wrote:
> On Monday, 2 August 2021 14:30:05 CEST Peter Krempa wrote:
> > On Mon, Aug 02, 2021 at 14:20:44 +0200, Vojtech Juranek wrote:
> > > Hi,
> > > as a follow-up of BZ #1883399 [1], we are reviewing vdsm VM migration
> > > flows and solve few follow-up bugs, e.g. BZ #1981079 [2]. I have couple
> > > of questions related to libvirt:
> > > 
> > > * if we run disk extend during migration, it can happen that migration
> > > finishes sooner than disk extend. In such case we will try to set disk
> > > threshold on already stopped VM (we handle libvirt event that VM was
> > > stopper, but due to Python GIL there can be a delay between obtaining
> > > appropriate signal from libvirt  and handling it). In such case we get
> > > libvirt
> > > VIR_ERR_OPERATION_INVALID when setting disk threshold. 
> 
> actually I was wrong here and the issue is actually caused by a delayed libvirt
> setBlockThreshold() call; from the vdsm log:
> 
> 2021-08-02 09:06:01,918-0400 WARN  (mailbox-hsm/3) [virt.vm] 
> (vmId='2dad9038-3e3a-4b5e-8d20-b0da37d9ef79') setting theshold using dom 
>  (drivemonitor:122)
> 
> [...]
> 
> 2021-08-02 09:06:03,967-0400 WARN  (libvirt/events) [virt.vm] 
> (vmId='2dad9038-3e3a-4b5e-8d20-b0da37d9ef79') libvirt event Stopped detail 3 
> opaque None (vm:5657)
> 
> [...]
> 
> 2021-08-02 09:06:03,969-0400 WARN  (mailbox-hsm/3) [virt.vm] 
> (vmId='2dad9038-3e3a-4b5e-8d20-b0da37d9ef79') Domain not connected, skipping 
> set block threshold for drive 'sdc' (drivemonitor:133)
> 
> 
> so it took about 2 seconds for the libvirt setBlockThreshold() call to return,
> and in the meantime
> the migration finished and we got a VIR_ERR_OPERATION_INVALID error from the
> setBlockThreshold() call.
> 
> What is the reason for this delay? Is this operation intentionally delayed 
> until
> migration finishes?

Actually, qemuDomainSetBlockThreshold which is the backend for
virDomainSetBlockThreshold requires a QEMU_JOB_MODIFY job on the domain,
so this actually can't even be set _during_ migration.

In fact what happens is that the API call is waiting to be able to
obtain the MODIFY job and that can happen only after the migration is
finished, thus it always serializes after the migration.



Re: Disk extend during migration

2021-08-02 Thread Peter Krempa
On Mon, Aug 02, 2021 at 14:20:44 +0200, Vojtech Juranek wrote:
> Hi,
> as a follow-up of BZ #1883399 [1], we are reviewing vdsm VM migration flows
> and solving a few follow-up bugs, e.g. BZ #1981079 [2]. I have a couple of
> questions related to libvirt:
> 
> * if we run a disk extend during migration, it can happen that the migration
> finishes sooner than the disk extend. In such a case we will try to set the
> disk threshold on an already stopped VM (we handle the libvirt event that the
> VM was stopped, but due to the Python GIL there can be a delay between
> obtaining the appropriate signal from libvirt and handling it). In such a case
> we get a libvirt VIR_ERR_OPERATION_INVALID error when setting the disk
> threshold. Is it safe to catch this exception and ignore it, or is it thrown
> for various reasons and the root cause can be something else than a stopped VM?

The API to set the block threshold level can return the following errors,
along with the cases in which they can happen:

VIR_ERR_OPERATION_UNSUPPORTED <- unlikely, any new qemu supports it
VIR_ERR_INVALID_ARG <- the disk was not found in the VM definition
VIR_ERR_INTERNAL_ERROR <- on an error reported by qemu

Thus VIR_ERR_OPERATION_INVALID seems to be safe to ignore in your
specific case, while not ignoring others can be used to catch problems.



Re: investigate locks on a domain

2021-07-20 Thread Peter Krempa
On Tue, Jul 20, 2021 at 15:52:58 +0800, Jiatong Shen wrote:
> Hello community,
> 
> I am seeing following log in production,
> 
> 2021-07-20 07:43:49.417+: 3918294: error :
> qemuDomainObjBeginJobInternal:4945 : Timed out during operation: cannot
> acquire state change lock (held by qemuProcessReconnect)
> 2021-07-20 07:44:19.424+: 3918296: warning :
> qemuDomainObjBeginJobInternal:4933 : Cannot start job (modify, none) for
> domain instance-074e; current job is (modify, none) owned by (3919429
> qemuProcessReconnect, 0 ) for (2183193s, 0s)
> 2021-07-20 07:44:19.424+: 3918296: error :
> qemuDomainObjBeginJobInternal:4945 : Timed out during operation: cannot
> acquire state change lock (held by qemuProcessReconnect)
> 2021-07-20 07:44:49.428+: 3918296: warning :
> qemuDomainObjBeginJobInternal:4933 : Cannot start job (query, none) for
> domain instance-074e; current job is (modify, none) owned by (3919429
> qemuProcessReconnect, 0 ) for (2183223s, 0s)
> 2021-07-20 07:44:49.428+: 3918296: error :
> qemuDomainObjBeginJobInternal:4945 : Timed out during operation: cannot
> acquire state change lock (held by qemuProcessReconnect)
> 2021-07-20 07:45:19.429+: 3918298: warning :
> qemuDomainObjBeginJobInternal:4933 :
> 
> I am confused about what qemuProcessReconnect is and why it acquires a
> domain state lock.

qemuProcessReconnect is an operation that is executed in a separate
thread which re-establishes connection to a qemu process once you
restart libvirtd.

This usually means that the reconnection process got stuck for some
reason. Unfortunately your log doesn't show why, and unless you've got a
debug log from before that happened it won't be possible to tell.

Theoretically seeing which function the thread doing the reconnect is
stuck at could perhaps show why.
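
One way to check that (assuming gdb and debug symbols are available on the
host) is to dump the stacks of all libvirtd threads, e.g.:

    gdb -batch -p $(pidof libvirtd) -ex 'thread apply all backtrace' > /tmp/libvirtd-stacks.txt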

Usually the only fix for this is to destroy the VM which is stuck.



Re: Libvirt hit a issue when do VM migration

2021-06-17 Thread Peter Krempa
Firstly please don't CC random people who are subscribed to the list.

Additionally this is a user question, so it's not entirely appropriate
for libvir-l...@redhat.com, only for libvirt-users@redhat.com.

On Thu, Jun 17, 2021 at 16:52:42 +0800, 梁朝军 wrote:
> Hi All,
> 
> Who can give me a hand? I hit another issue when doing VM migration. Libvirt
> throws out the error below:
> 
> "libvirt: Domain Config error : unsupported configuration: vcpu enable order 
> of vCPU '0' differs between source and destination definitions"

This error happens when the 'order' attribute ...

> 
>  
> The cpu information defined in the VM domain on the source host is like: 
> 2
>   
> 
> 

... such you can see here doesn't match for a vcpu of a particular ID.

>   
>   
> 
> 
>   
> 
> The destination host's pre-generated XML for the VM is like:
> 
> 2
>   
> 
> 

Yours do match though. Did you post the correct XMLs?

>   
>   
> 
> 
>   
> 
> What's wrong with it? Or am I missing some configuration steps regarding 
> the migration?

It's unfortunately impossible to tell; your report doesn't describe what
steps you took to do the migration, nor does it contain debug logs.



Re: why cannot connect to libvirtd using virsh ???

2021-06-10 Thread Peter Krempa
On Thu, Jun 10, 2021 at 17:37:43 +0800, tommy wrote:
> [root@test2 ~]# virsh -c quem+ssh://root@192.168.10.175/system
> 
> root@192.168.10.175's   password:
> 
> error: failed to connect to the hypervisor
> 
> error: no connection driver available for quem:///system

You've got a typo here, you wrote "quem" but it's "qemu".
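
The corrected command would simply be:

    virsh -c qemu+ssh://root@192.168.10.175/system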



Re: How to hot plugin a new vhost-user-blk-pci device to running VM?

2021-06-04 Thread Peter Krempa
On Fri, Jun 04, 2021 at 19:22:31 +0800, 梁朝军 wrote:
> Hi Guys:
> 
> Who can help me ? What does this issue mean?  When I attach a network I hit 
> this issue.
> 
> libvirt: QEMU Driver error : internal error: unable to execute QEMU command 
> 'netdev_add': Invalid parameter type for 'vhost', expected: boolean

Your qemu is too new for libvirt. The 'netdev_add' command was converted
to a strict description by the QMP schema and libvirt wasn't ready for that.

commit b6738ffc9f8be5a2a61236cd9bef7fd317982f01
Author: Peter Krempa 
Date:   Thu May 14 22:50:59 2020 +0200

qemu: command: Generate -netdev command line via JSON->cmdline conversion

The 'netdev_add' command was recently formally described in qemu via the
QMP schema. This means that it also requires the arguments to be
properly formatted. Our current approach is to generate the command line
and then use qemuMonitorJSONKeywordStringToJSON to get the JSON
properties for the monitor. This will not work if we need to pass some
fields as numbers or booleans.

In this step we re-do internals of qemuBuildHostNetStr to format a JSON
object which is converted back via virQEMUBuildNetdevCommandlineFromJSON
to the equivalent command line. This will later allow fixing of the
monitor code to use the JSON object directly rather than rely on the
conversion.

v6.3.0-139-gb6738ffc9f

Thus you need at least libvirt 6.4.0 with that qemu.



Re: How to hot plugin a new vhost-user-blk-pci device to running VM?

2021-05-14 Thread Peter Krempa
On Fri, May 14, 2021 at 14:33:37 +0800, Liang Chaojun wrote:
> 
> 
> Thanks, I have tried the qemu monitor as below. I used chardev-add to add a 
> chardev and device_add to add the device to the running VM. But I often hit an 
> issue that causes the VM to crash: qemu-system-x86_64: ../hw/virtio/vhost.c:1566: 
> vhost_dev_get_config: Assertion `hdev->vhost_ops' failed.
> 
> virsh qemu-monitor-command spdk1 --hmp --cmd "chardev-add socket,id=spdk_vhost_blk0,path=/var/tmp/vhost.0,reconnect=1"
> virsh qemu-monitor-command spdk1 --hmp --cmd "device_add vhost-user-blk-pci,chardev=spdk_vhost_blk0,num-queues=4"

As noted, this is not an interface we'd provide support for. Please
upgrade both libvirt and qemu and try the supported interface if you
want us to deal with any problems you might have.



Re: How to hot plugin a new vhost-user-blk-pci device to running VM?

2021-05-13 Thread Peter Krempa
On Thu, May 13, 2021 at 23:11:36 +0800, Liang Chaojun wrote:
> 
> 
> Thanks Peter for your quick response. Is there any workaround to do that? As 
> you know, we must take care of the risk of using the latest version in a 
> production environment.

The manual approach is to use 'virsh qemu-monitor-command' or the equivalent
to attach the appropriate backends and frontends manually.

Obviously that is very far from anything I'd recommend using in any
production environment.



Re: How to hot plugin a new vhost-user-blk-pci device to running VM?

2021-05-13 Thread Peter Krempa
On Thu, May 13, 2021 at 15:25:23 +0800, 梁朝军 wrote:
>Hi Guys,
> 
>Does anyone know how to hot-plug a new vhost-user-blk-pci device into a
>running VM?
> 
>Before starting the VM, I pass the disk through the QEMU command line like below.
> 
>
> 
> 
> value='memory-backend-file,id=mem0,size=4G,mem-path=/dev/hugepages,share=on'/>
> 
> 
> 
> 
> value='socket,id=spdk_vhost_blk721ea46a-b306-11eb-a280-525400a98761,path=/var/tmp/vhost.721ea46a-b306-11eb-a280-525400a98761,reconnect=1'/>
> 
> 
> value='vhost-user-blk-pci,chardev=spdk_vhost_blk721ea46a-b306-11eb-a280-525400a98761,bootindex=1,num-queues=4'/>
> 
> 
> value='socket,id=spdk_vhost_blk2f699c58-d222-4629-9fdc-400c3aadc55e,path=/var/tmp/vhost.2f699c58-d222-4629-9fdc-400c3aadc55e,reconnect=1'/>
> 
> 
> value='vhost-user-blk-pci,chardev=spdk_vhost_blk2f699c58-d222-4629-9fdc-400c3aadc55e,num-queues=4'/>
>
> 
>But I don't know how to live-add a vhost-user-blk-pci device to a running
>VM even when calling the attachDevice API now.

You need to use the proper and supported way to use vhost-user-blk, i.e. a
disk of type 'vhostuser'.

That works also with attachDevice.
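
A minimal sketch of such a 'vhostuser' disk definition (socket path, target
device and reconnect timeout are placeholders) looks roughly like:

    <disk type='vhostuser' device='disk'>
      <driver name='qemu' type='raw'/>
      <source type='unix' path='/var/tmp/vhost.0'>
        <reconnect enabled='yes' timeout='5'/>
      </source>
      <target dev='vda' bus='virtio'/>
    </disk>

Note that vhost-user disks also generally need the guest memory to be shared
with the backend (e.g. <memoryBacking> with <access mode='shared'/>), similar
to the 'share=on' memory-backend-file in your command line above.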


>OS: redhat 7.4 Libvirt version: 3.4

This is obviously way too old for it. You'll need at least libvirt-7.1
for that.



Re: external snapshot create error

2021-04-20 Thread Peter Krempa
On Mon, Apr 19, 2021 at 20:19:31 -0500, Eyüp Hakan Duran wrote:
> Dear all,
> 
> I have been creating external snapshots of my KVM/QEMU VMs for more than a
> year on a host machine that runs Manjaro Linux. The current version of
> libvirt I am using is 1:7.1.0-3. I just noticed that the script I am using
> for this purpose has been failing. More specifically the command below
> returns with the error message indicated underneath the command:
> 
> sudo virsh snapshot-create-as --no-metadata --domain myVM myVM-state
> --diskspec hda,file=overlay.qcow2 --disk-only --atomic
> error: XML document failed to validate against schema: Unable to validate
> doc against /usr/share/libvirt/schemas/domainsnapshot.rng

The problem is that file=overlay.qcow2 points to a relative path (not
starting with a leading '/'). Libvirt enforces that the path must be a
full path.
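
In other words, something along these lines should pass (the overlay path is
just an example):

    virsh snapshot-create-as --no-metadata --domain myVM myVM-state \
          --diskspec hda,file=/var/lib/libvirt/images/overlay.qcow2 --disk-only --atomic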

> Extra element disks in interleave
> Element domainsnapshot failed to validate content

I've recently pushed a commit which removes the XML validation in this
case as it's counter-productive and the XML parser provides a more
understandable error:

https://gitlab.com/libvirt/libvirt/-/commit/f1c9fed2ca8f2509495f46b355dab019604f7475

(note that this commit will be in the upcoming libvirt-7.3 release)

> I also tried different versions of the command above with similar results,
> such as:
> sudo virsh snapshot-create-as mvVM --no-metadata myVM-state --diskspec
> hda,snapshot=external,file=overlay.qcow2 --disk-only --atomic
> 
> Any pointers will be greatly appreciated!
> 
> Hakan Duran



Re: could not start libvirt service

2021-04-13 Thread Peter Krempa
On Tue, Apr 13, 2021 at 15:56:57 +0530, shafnamol N wrote:
> Hi,
> 
> I am using *CentOS 8*. I have built *libvirt* with the following method:
> 
> $ meson build -Dsystem=true
> $ ninja -C build
> $ ninja -C build install
> 
> But the problem is when i started it
> 
> # systemctl start libvirtd
> Job for libvirtd.service failed because the control process exited
> with error code.
> See "systemctl status libvirtd.service" and "journalctl -xe" for details.
> 
> i tried to get the status of libvirt
> 
> # systemctl status libvirtd
> 
> libvirtd.service - Virtualization daemon
>Loaded: loaded (/usr/lib/systemd/system/libvirtd.service; enabled;
> vendor preset: enabled)
>Active: failed (Result: exit-code)
> 
> What will be the problem?

Well, what does the log say?

https://www.libvirt.org/kbase/debuglogs.html
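
Apart from the libvirtd debug log, the systemd journal usually shows the
immediate startup failure as well, e.g.:

    journalctl -u libvirtd.service -b --no-pager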



Re: how to use external snapshots with memory state

2021-04-11 Thread Peter Krempa
On Fri, Apr 09, 2021 at 19:00:41 +0200, Riccardo Ravaioli wrote:
> Hi all,
> 
> Can I get some feedback on these two questions I posted a while back,
> concerning external snapshots:
> 
>  On Tue, 26 Jan 2021 at 00:15, Riccardo Ravaioli 
> wrote:
> [...]
> 
> > 1) When creating an external online snapshot (disks+memory) of a qemu-KVM
> > VM, I'd like to store the backing files in a separate folder and rename the
> > newly-created delta files with the respective names of the original disks
> > (now the backing files). Is this possible at all? Say a VM has a disk
> > vm/disk1.qcow2 and I take an external snapshot, I'd like the backing file
> > to appear in snap/disk1.qcow2 and the new delta disk to appear as

No, this is not possible during the snapshot operation, you can control
only the name of the new overlay image.

> > vm/disk1.qcow2. Currently I wait for the VM to be shutdown by the user
> > before I move around disks as desired and run qemu-img rebase in unsafe
> > mode to update the delta disks. Can I accomplish the same result while the
> > VNF is running?

Very theoretically you can achieve the same during the lifetime of the
VM by using 'virsh blockcopy' with the following algorithm:

1) ensure you have an overlay image that is small enough (but that's what
the snapshot operation does if you do this right after taking the snapshot).

Note that the snapshot target should be something temporary, you'll see
why below.

2) symlink/hardlink the original backing image (vm/disk1.qcow2) to the
new location

3) create an empty qcow2 overlay file with the desired name of the disk
_after_ snapshot, pointing to the hardlink of the new backing file
destination

4) use a shallow 'blockcopy' with the --reuse-external option to move the
temp overlay to the file created in 3); this ensures that its backing
image, as described in the metadata, is used
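
Put together, an untested sketch of those four steps (disk target, image
paths and the temporary overlay name are all illustrative):

    # 1) snapshot into a small temporary overlay
    virsh snapshot-create-as VM --no-metadata --disk-only \
          --diskspec vda,file=/vm/tmp-overlay.qcow2

    # 2) hardlink the original backing image to its new location
    ln /vm/disk1.qcow2 /snap/disk1.qcow2

    # 3) build an empty overlay pointing at the new backing location and
    #    rename it over the original name (don't truncate the shared inode)
    qemu-img create -f qcow2 -F qcow2 -b /snap/disk1.qcow2 /vm/disk1.new.qcow2
    mv /vm/disk1.new.qcow2 /vm/disk1.qcow2

    # 4) shallow-copy the temporary overlay into it and pivot
    virsh blockcopy VM vda /vm/disk1.qcow2 --shallow --reuse-external \
          --transient-job --pivot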

The above has IMO too many moving parts that can go wrong. I'd suggest
adapting to the "backwards" naming of the images which happens with
external snapshots.

> > 2) As you suggested, I now use "virsh restore" to launch a VM with the
> > memory state I had previously backed up. The "virsh restore" API applies to
> > a VM that is not running. However, I see that for internal snapshots,
> > libvirt also supports a snapshot-revert operation on a /live/ VM. Is this
> > possible at all with an external snapshot? If so, what are the commands

No, the operation is not implemented for external snapshots yet. The use
of 'virsh restore' is a workaround, which requires manual steps to
preserve images in certain scenarios.

> > needed? My current way of automating a snapshot-revert with an external
> > online snapshot only applies to a shutdown VM. It consists in:
> > - shutting down the VM;
> > - erasing all its delta disks, replacing them with new ones;
> > - executing "virsh restore" on the memory state, making sure that the disk
> > paths referenced in the embedded XML file are correct.

Yes, that sounds reasonable. Note that the XML embedded in the
save image points to the original base file, so you'll always need to
modify it to use the new deltas.
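
As an untested sketch of that manual revert (file names are illustrative):

    virsh destroy VM                       # or a regular shutdown
    # recreate a fresh, empty delta on top of the snapshot's backing image
    qemu-img create -f qcow2 -F qcow2 -b /snap/disk1.qcow2 /vm/disk1.qcow2
    # fix up the disk paths recorded in the embedded XML, then restore
    virsh save-image-edit /snap/VM-memory.save
    virsh restore /snap/VM-memory.save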

> > Does this seem reasonable? Can I do the same while the VNF is running?
> > I suppose that for a snapshot-delete operation on a live VM I just need to
> > run "virsh blockpull" on each disk, but I couldn't figure out how to do a
> > revert.

No, to use 'virsh restore' you must shut down the VM, but note that in
many cases with internal snapshots libvirt will also need to terminate
and restart the QEMU process, so this situation doesn't really differ.

In fact the internal snapshot code is way more complicated in order to
needlessly preserve the running process, and I'd probably elect to always
restart qemu when reverting them nowadays.

The only drawback is that the remote video connection needs to be
reopened.



Re: live migrate storage to different volume on the same host

2021-04-11 Thread Peter Krempa
On Sat, Apr 10, 2021 at 01:05:31 +0800, Jing-Wei Su wrote:
> Hello experts,
> 
> I would like to move the qcow2 disk images of a running VM to a
> different storage because of the maintenance of the underlying storage
> on the same host.

You can use the 'virDomainBlockCopy' API to achieve this.

To use it via virsh, the simplest way would be to:

virsh blockcopy $VMNAME $DISKTARGET --dest /path/to/destination
--transient-job --pivot (--verbose)

($DISKTARGET is 'vda' for example)

In case you need a more complicated description of the target of the
copy such as a network storage target you can also use the --xml option
to use a XML description of the target.

> I found the live migration is not allowable on the same host using
> virsh. Is there any suggestion or better practices for this?

Full migration would also try to move the qemu process which is not
required in your case.



Re: libvirt 7.2.0 domainbackup

2021-04-11 Thread Peter Krempa
On Thu, Apr 08, 2021 at 15:53:14 +0200, Thomas Stein wrote:
> Cool! So just to be clear, if I start the backup at 14:00 and it's finished at 
> 14:10, changes to the image at 14:05 are not included?

Yes, exactly.

> It's the same quality as my current backup solution then. Thank you for your 
> answer Peter.

If you do full backups (create just 1 overlay, copy over the full
original image, merge the 1 overlay back) then yes. If you were doing an
"incremental" backup (2 overlays when backing up, 1 overlay in normal
use) then that is not yet supported. The code for incremental backups
via 'virsh backup-begin' exists in libvirt but qemu doesn't yet support
the 'blockdev-reopen' command which is used by the incremental backup
code.

> 
> cheers, t.
> 
> Am 8. April 2021 15:46:56 MESZ schrieb Peter Krempa :
> >On Wed, Apr 07, 2021 at 19:46:54 +0200, Thomas Stein wrote:
> >> 
> >> Got a little bit further. "virsh backup-begin jitsi jitsi-backup.xml"
> >works
> >> and I have a backup file now. VM was running all the time. Question.
> >Is it
> >> save to use this function this way? Currently I do:
> >
> >The backup file you'll get will represent the state of the VM at the
> >time the backup operation has started, but should contain a full copy
> >of
> >the image from that time.
> 
> 



Re: libvirt 7.2.0 domainbackup

2021-04-08 Thread Peter Krempa
On Wed, Apr 07, 2021 at 19:46:54 +0200, Thomas Stein wrote:
> 
> Got a little bit further. "virsh backup-begin jitsi jitsi-backup.xml" works
> and I have a backup file now. VM was running all the time. Question. Is it
> safe to use this function this way? Currently I do:

The backup file you'll get will represent the state of the VM at the
time the backup operation has started, but should contain a full copy of
the image from that time.



Re: libvirt 7.2.0 domainbackup

2021-04-08 Thread Peter Krempa
On Wed, Apr 07, 2021 at 19:03:21 +0200, Thomas Stein wrote:
> Hello one and all.
> 
> Just wanted to check out the new domainbackup feature but I'm having trouble
> configuring it. I thought I just have to put
> 
> 
>   
> 
>   
>   
> 
>   
> 
> 
> at the top level of my domain xml file but that does not seem to work. I
> get:
> 
> error: XML document failed to validate against schema: Unable to validate
> doc against /usr/share/libvirt/schemas/domain.rng
> Element domain has extra content: domainbackup

That's not how it's used; if you want to create a backup you have to use
the virDomainBackupBegin API, e.g. via `virsh backup-begin
VMNAME /path/to/backup.xml`.

Your backup xml file seems to look correct.
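
For a full (non-incremental) push-mode backup the XML is typically along
these lines (disk name and output path are just examples):

    <domainbackup>
      <disks>
        <disk name='vda' backup='yes' type='file'>
          <driver type='qcow2'/>
          <target file='/backup/vda.backup.qcow2'/>
        </disk>
      </disks>
    </domainbackup>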



Re: Xen: libvirt.libvirtError: An error occurred, but the cause is unknown

2021-04-07 Thread Peter Krempa
On Wed, Apr 07, 2021 at 08:14:10 +, Mathieu Tarral wrote:
> Hi,
> 
> I'm facing a strange issue with libvirt and the Xen driver.
> 
> Whenever I try to define or start a domain, I always get this error:
> "libvirt.libvirtError: An error occurred, but the cause is unknown" (in 
> Python virt-manager interface)
> 
> Which is not very helpful.
> Trying to use virsh directly leads to the same error message:
> 
> virsh # start win10_xen
> error: Failed to start domain win10_xen
> error: An error occurred, but the cause is unknown

This is reported when we fail to set an error message.

> 
> I checked /var/log/libvirt/libxl/libxl-driver.log, but the log file is empty.

Please try enabling full debug logging; it may at least show what the
last thing we were doing was.


https://www.libvirt.org/kbase/debuglogs.html#persistent-setting
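
A minimal persistent setup (the log file path is just an example) is to add
something like this to /etc/libvirt/libvirtd.conf and restart libvirtd:

    log_outputs="1:file:/var/log/libvirt/libvirtd.log"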



Re: GlusterFS libgfapi with TLS

2021-04-01 Thread Peter Krempa
On Thu, Apr 01, 2021 at 15:13:02 +0100, lejeczek wrote:
> Hi guys.
> 
> I have KVM guests stored on glusterFS volume and I recently added TLS
> encryption to Gluster.
> What changes, tweaks are required at libvirtd/qemu's end?

Looking at the definition of the gluster backend object in qemu:


##
# @BlockdevOptionsGluster:
#
# Driver specific block device options for Gluster
#
# @volume: name of gluster volume where VM image resides
#
# @path: absolute path to image file in gluster volume
#
# @server: gluster servers description
#
# @debug: libgfapi log level (default '4' which is Error)
# (Since 2.8)
#
# @logfile: libgfapi log file (default /dev/stderr) (Since 2.8)
#
# Since: 2.9
##
{ 'struct': 'BlockdevOptionsGluster',
  'data': { 'volume': 'str',
            'path': 'str',
            'server': ['SocketAddress'],
            '*debug': 'int',
            '*logfile': 'str' } }



it doesn't seem to support TLS encryption of the transport yet, nor a way
to set it explicitly (it still might be possible to trick
libgfapi into supporting it via a config file or such).

That means you'll probably need to submit qemu patches implementing the
support for configuring TLS for gluster to qemu first, and then do the
same for libvirt.

Libvirt already has some infrastructure for that for NBD and VXHS disks,
so you can then take inspiration there when implementing it in libvirt.



Re: how to check a virtual disk

2021-03-29 Thread Peter Krempa
On Mon, Mar 29, 2021 at 13:59:11 +0200, Lentes, Bernd wrote:
> 
> - On Mar 29, 2021, at 12:58 PM, Bernd Lentes 
> bernd.len...@helmholtz-muenchen.de wrote:

[...]

> 
> > 
> 
> I forgot:
> host is SLES 12 SP5, virtual domain too.
> The image file is in raw format.

Please always attach the VM config XMLs, so that we don't have to guess
how your disks are configured.



Re: Snapshot operation aborted and volume usage

2021-03-11 Thread Peter Krempa
On Thu, Mar 11, 2021 at 10:51:13 +0200, Liran Rotenberg wrote:
> We recently had this bug[1]. The thought that came from it is the handling
> of error code after running virDomainSnapshotCreateXML, we encountered
> VIR_ERR_OPERATION_ABORTED(78).

VIR_ERR_OPERATION_ABORTED is an error code which is emitted by the
migration code only. That means that the error comes from the failure to
take a memory image/snapshot of the VM.

A quick skim through the bug report seems to mention a timeout, so your code
probably aborted the snapshot because it was taking too long.

> Apparently, the new volume is in use. Are there cases where this will
> happen and the new volume won't appear in the volumes chain? Can we detect
> / know when?

In the vast majority of cases if virDomainSnapshotCreateXML returns
failure the new disk volumes are NOT used at that point.

Libvirt tries very hard to ensure that everything is atomic. The memory
snapshot is taken before installing volumes into the backing chain, so
if that one fails we don't even attempt to do anything with the disks.

There are three extremely unlikely reasons where the snapshot API returns
failure and new images were already installed into the backing chain:

1) resuming of the VM failed after snapshot
2) thawing (domfsthaw) of filesystems has failed
(easily avoided by not using the _QUIESCE flag, but freezing
manually)
3) saving of the internal VM state XML failed

Any error except those above can happen only if the images weren't
installed or the VM died while installing the images.

In addition, if resuming the CPUs after the snapshot fails, the CPUs
didn't run, so the guest couldn't have written anything to the image.
Since the snapshot is supposed to flush qemu caches, if you destroy the
VM without running the vCPUs it's safe to discard the overlays, as the
guest didn't write anything into them yet.

> Thinking aloud, if we can detect such cases we can prevent rolling back by
> reporting it back from VDSM to ovirt. Or, if it can't be detected to go on
> the safe side in order to save data corruption and prevent the rollback as
> well.

In general, except for the case when saving of the guest XML has failed,
the new disk images will not be used by the VM so it's safe to delete
them.

> Currently, in ovirt, if the job is aborted, we will look into the chain to
> decide whether to rollback or not.

This is okay, we update the XML only if qemu successfully installed the
overlays.



Re: virsh snapshot-create-as with quiesce option times out

2021-03-05 Thread Peter Krempa
On Thu, Mar 04, 2021 at 18:12:06 -0300, David Wells wrote:
> Hi all!
> 
> I'm still working on the live backup of a couple of VMs and what happens most
> of the time is that when I execute virsh snapshot-create-as with
> the --quiesce option the process finishes with an error that reads
> > error: Timed out during operation: cannot acquire state change lock
> 
> I tried turning up the debug level but found nothing that appears to be of
> interest and I can connect successfully to the qemu guest agent since the
> following command returns the correct value
> > sudo /usr/sbin/virsh domfsinfo slackware-current
> > Mountpoint   Name   Type   Target
> > ----------------------------------
> > /            vda1   ext4   vda
> 
> I also tried running the qemu-ga agent on the guest with debugging enabled
> and when I issue this last command I can see the agent talking to the host
> but when I issue the quiesce option the guest agent shows nothing at all.
> 
> Is this by any chance a known bug? Is there something obvious I'm missing?
> What else can I provide to help debug this issue?

It's not a known bug. Ideally post the libvirtd debug log. Make sure you
are starting the VM fresh when doing so to clear out any prior state.

Post the log either here or file an issue:

https://gitlab.com/libvirt/libvirt/-/issues/new?issue%5Bassignee_id%5D=&issue%5Bmilestone_id%5D=

Please observe our guideline on filing proper bug reports:

 https://libvirt.org/bugs.html#quality
 https://libvirt.org/kbase/debuglogs.html



Re: Live backups create-snapshot-as and memspec

2021-03-01 Thread Peter Krempa
On Fri, Feb 26, 2021 at 15:29:45 -0300, David Wells wrote:
> 
> El 25/2/2021 a las 11:37, Peter Krempa escribió:
> > On Wed, Feb 24, 2021 at 14:49:19 -0300, David Wells wrote:
> > > Hi all!
> > > 
> > > I've been using libvirt for some time and until now I have treated backups
> > > of virtual computers as if they where physical computers installing the
> > > backup client on the guest. I am now however facing the need to backup a
> > > couple a couple of guest at the host level so I've been trying to catch up
> > > by reading, googling and by trial and error too. Up to now I've been able 
> > > to
> > > backup a live machine whith a command like the following
> > > 
> > > > virsh snapshot-create-as --domain test --name backup --atomic --diskspec
> > > > vda,snapshot=external --disk-only
> > > This command creates a file test.backup and in the meantime I can backup 
> > > the
> > > original test.qcow2 but for what I saw this disk image is in a "dirty"
> > > state, as if the machine I could restore from this file had been turned 
> > > off
> > > whithout a proper shutdown.
> > > 
> > > I know that I can later restore the machine to its original state by 
> > > issuing
> > > a command like this
> > > 
> > > > virsh blockcommit --domain test vda --active --pivot
> > > > virsh snapshot-delete test --metadata backup
> > > I have seen that it is possible to create the snapshot using a memspec
> > > parameter which would make the backup of the guest as if it where in a 
> > > clean
> > > state, however I haven't found the equivalent of the blockcommit for the
> > > memory file, in a sort of speak, to be able to restore the guest to it's
> > > original state.
> > The memory image file doesn't depend on any other state nor does the VM
> > use it after the snapshot is taken, if you don't need it you can delete
> > it.
> > 
> > The disk overlay you create by  'virsh snapshot-create-as' records only
> > differences to the original image, but for memory state it doesn't make
> > much sense as the memory is small and changes a lot, so we take a
> > snapshot of the whole memory.
> > 
> > Now to restore the state of a VM to a snapshot taken with memory,
> > libvirt's native APIs for reverting don't work on external snapshots
> > yet.
> > 
> > You can use 'virsh restore memimg' to load VM's state as the memory
> > snaphsot images have the same format as images created by 'virsh save'.
> > 
> > Please note though that the configuration of the VM is taken from the
> > save image if you do so, which includes paths to disk images, which may
> > no longer be correct or desired, but virsh save-image-edit can be used
> > to modify the XML if needed.
> > 
> Hi all!
> 
> Peter, thank you very much for your reply! So if I understood correctly
> there is no way, at this time, to backup a live guest in a way that I can
> restore it to a "clean" state. Also, if I understood correctly, the

No, you misunderstood. Creating an external snapshot with --memspec in
fact creates a state capture which can be completely restored in terms
of consistency of filesystems and in-memory data.

What currently isn't possible is to use the virDomainRevertToSnapshot
API to do so, but there are manual steps which can be taken to do so.

Obviously state beyond the VM can't be restored, so network connections
existing at the time of the snapshot will be broken.

> probability of a corrupt system on restore of the VM is minimized by having the
> qemu guest agent installed on the guest and issuing virsh
> create-snapshot-as with the --quiesce parameter, is this correct?

Yes, that is correct; the filesystem is still mounted, but the guest agent
ensures that caches were flushed beforehand. AFAIK the guest agent even
provides guest-OS-side hooks which allow flushing databases and other
state that is not filesystem-bound, so that the on-disk state is as clean
as possible.
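
So a quiesced, disk-only snapshot of the example VM would look something like
(disk target and overlay path are illustrative):

    virsh snapshot-create-as --domain test --name backup --atomic --quiesce \
          --disk-only --diskspec vda,snapshot=external,file=/var/lib/libvirt/images/test.backup.qcow2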



Re: Live backups create-snapshot-as and memspec

2021-02-25 Thread Peter Krempa
On Wed, Feb 24, 2021 at 14:49:19 -0300, David Wells wrote:
> Hi all!
> 
> I've been using libvirt for some time and until now I have treated backups
> of virtual computers as if they were physical computers, installing the
> backup client on the guest. I am now however facing the need to back up a
> couple of guests at the host level, so I've been trying to catch up
> by reading, googling and by trial and error too. Up to now I've been able to
> back up a live machine with a command like the following
> 
> > virsh snapshot-create-as --domain test --name backup --atomic --diskspec
> > vda,snapshot=external --disk-only
> 
> This command creates a file test.backup and in the meantime I can back up the
> original test.qcow2, but from what I saw this disk image is in a "dirty"
> state, as if the machine I could restore from this file had been turned off
> without a proper shutdown.
> 
> I know that I can later restore the machine to its original state by issuing
> a command like this
> 
> > virsh blockcommit --domain test vda --active --pivot
> > virsh snapshot-delete test --metadata backup
> 
> I have seen that it is possible to create the snapshot using a memspec
> parameter which would make the backup of the guest as if it were in a clean
> state; however, I haven't found the equivalent of the blockcommit for the
> memory file, so to speak, to be able to restore the guest to its
> original state.

The memory image file doesn't depend on any other state nor does the VM
use it after the snapshot is taken, if you don't need it you can delete
it.

The disk overlay you create by  'virsh snapshot-create-as' records only
differences to the original image, but for memory state it doesn't make
much sense as the memory is small and changes a lot, so we take a
snapshot of the whole memory.

Now to restore the state of a VM to a snapshot taken with memory,
libvirt's native APIs for reverting don't work on external snapshots
yet.

You can use 'virsh restore memimg' to load the VM's state, as the memory
snapshot images have the same format as images created by 'virsh save'.

Please note though that the configuration of the VM is taken from the
save image if you do so, which includes paths to disk images, which may
no longer be correct or desired, but virsh save-image-edit can be used
to modify the XML if needed.



Re: Fwd: virsh backup-begin problem

2021-01-21 Thread Peter Krempa
On Thu, Jan 21, 2021 at 18:12:54 +0300, Andrey Fokin wrote:
> Do you mean the "experimental" feature to enable incremental backup? No, I didn't,
> because I'm trying to implement a full backup (not an incremental one). Or am I
> missing something and it is the same?

It's the same. Implementing lockouts for every single sub-feature which
is actually complete would be too much of a hassle.

If you don't do incrementals (don't create checkpoints) the
snapshot/blockjob APIs will work as expected.

And yes, it's still considered experimental.



Re: Fwd: virsh backup-begin problem

2021-01-21 Thread Peter Krempa
On Thu, Jan 21, 2021 at 17:34:25 +0300, Andrey Fokin wrote:
> Peter, thanks! There were a lot of XML errors, you are right about that. But it
> doesn't work... I get an error about the backup type:
> Operation not supported: incremental backup is not supported yet
> How is it possible to describe a full backup operation?

Did you apply the VM XML workaround that I've pointed out twice already:

https://www.redhat.com/archives/libvirt-users/2021-January/msg00034.html

And if yes, did you restart your VM after adding the XML to enable the
feature?


