[ovirt-users] Re: What is this error message from?
Am 17.02.2020 um 16:16 hat Nir Soffer geschrieben:
> On Mon, Feb 17, 2020, 16:53 wrote:
> > I have seen this error message repeatedly when reviewing events.
> >
> > VDSM vmh.cyber-range.lan command HSMGetAllTasksStatusesVDS failed: low
> > level Image copy failed: ("Command ['/usr/bin/qemu-img', 'convert', '-p',
> > '-t', 'none', '-T', 'none', '-f', 'raw',
> > u'/rhev/data-center/mnt/glusterSD/storage.cyber-range.lan:_vmstore/dd69364b-2c02-4165-bc4b-2f2a3b7fc10d/images/c651575f-75a0-492e-959e-8cfee6b6a7b5/9b5601fe-9627-4a8a-8a98-4959f68fb137',
> > '-O', 'qcow2', '-o', 'compat=1.1',
> > u'/rhev/data-center/mnt/glusterSD/storage.cyber-range.lan:_vmstore/dd69364b-2c02-4165-bc4b-2f2a3b7fc10d/images/6a2ce11a-deec-41e0-a726-9de6ba6d4ddd/6d738c08-0f8c-4a10-95cd-eeaa2d638db5']
> > failed with rc=1 out='' err=bytearray(b'qemu-img: error while reading
> > sector 24117243: No such file or directory\\n')",)
>
> Looks like copying the image failed with ENOENT while reading offset 12348028416 (11.49 GiB).
>
> I have never seen such a failure; typically, once a file is open, reads will never fail with this error, but with gluster it may be possible.
>
> Please share the vdsm log showing this error, it may add useful info.
>
> Also the glusterfs client logs from /var/log/glusterfs*/*storage.cyber-range.lan*.log
>
> Kevin, Krutika, do you have an idea about this error?

This is a weird one. Not only should a read never be looking up a filename, but the error also occurs not at offset 0 but suddenly somewhere in the middle of the image file. I think it's pretty safe to say that this error doesn't come from QEMU, but from the kernel. Did you (or some software) change anything about the volume in the background while the convert operation was running?
Kevin ___ Users mailing list -- users@ovirt.org To unsubscribe send an email to users-le...@ovirt.org Privacy Statement: https://www.ovirt.org/site/privacy-policy/ oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/ List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/6C3666HJSRP5BEGTZUTV6C6A6GV2N37I/
[ovirt-users] Re: max number of snapshot per vm
Am 25.11.2018 um 19:55 hat Nir Soffer geschrieben:
> On Fri, Nov 23, 2018 at 1:01 PM Nathanaël Blanchet wrote:
> > What are the best practices about vm snapshots?
>
> I think the general guideline is to keep only the snapshots you need, since every snapshot has a potential performance cost.
>
> qemu caches some image metadata in memory, so accessing data from an image with 20 snapshots should be as efficient as an image with 2 snapshots, but using more snapshots will consume more memory.

The memory will also only really be used when the backing file is actually accessed. So if you keep very old snapshots, but all data has already been overwritten and the old snapshot is never accessed, it doesn't need that memory.

Additionally, starting with QEMU 3.1, cache-clean-interval is active by default and frees the cached tables of a snapshot layer when it hasn't been accessed for 10 minutes. (You had to enable it explicitly before that, which I don't think oVirt does.)

> Kevin, do we have performance tests comparing VMs with different numbers of snapshots?

I'm not aware of a proper performance test where this aspect was tested.

Kevin
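The memory cost Nir mentions can be estimated from the qcow2 format itself: with the default 64 KiB clusters and 8-byte L2 entries, fully caching the guest mapping of one layer costs roughly 1/8192 of the virtual disk size. A minimal sketch of the arithmetic (the helper name is mine, not a QEMU API):

```python
# Rough estimate of the qcow2 L2 metadata cache needed to fully cover one
# layer of a backing chain. Assumes the qcow2 defaults: 64 KiB clusters and
# 8-byte L2 entries, i.e. one entry per 64 KiB of guest data.
def l2_cache_bytes(virtual_size, cluster_size=64 * 1024, entry_size=8):
    entries = -(-virtual_size // cluster_size)  # ceil(virtual_size / cluster)
    return entries * entry_size

# Full coverage of a 100 GiB disk costs ~12.5 MiB per actively used layer;
# a chain of 20 snapshots could need up to 20x that in the worst case, which
# is what cache-clean-interval reclaims for idle layers.
per_layer = l2_cache_bytes(100 * 1024**3)
```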
[ovirt-users] Re: [Qemu-block] Libvirt ERROR cannot access backing file after importing VM from OpenStack
Am 30.05.2018 um 18:14 hat Arik Hadas geschrieben:
> On Wed, May 30, 2018 at 6:33 PM, Kevin Wolf wrote:
> > I think the problem is that we're talking about two different things in one thread. If I understand correctly, what oVirt does today is:
> >
> > 1. qemu-img convert to create a temporary qcow2 image that merges the whole backing chain in a single file
> >
> > 2. tar to create a temporary OVA archive that contains, amongst others, the temporary qcow2 image. This is a second temporary file.
> >
> > 3. Stream this temporary OVA archive over HTTP
>
> Well, today we suggest that users mount a shared storage to multiple hosts that reside in different oVirt/RHV deployments so they can export VMs/templates as OVAs to that shared storage and import these OVAs from the shared storage to a destination deployment. This process involves only #1 and #2.
>
> The technique you proposed earlier for writing disks directly into an OVA, assuming that the target size can be retrieved with 'qemu-img measure', sounds like a nice approach to accelerate this process. I think we should really consider doing that if that's as easy as it sounds.

Writing the image to a given offset in a file is the example that I gave further down in the mail:

> > You added another host into the mix, which just receives the image content via NBD and then re-exports it as HTTP. Does this host actually exist or is it the same host where the original images are located?
> > Because if you stay local for this step, there is no need to use NBD at all:
> >
> > $ ./qemu-img measure -O qcow2 ~/images/hd.img
> > required size: 67436544
> > fully allocated size: 67436544
> > $ ./qemu-img create -f file /tmp/test.qcow2 67436544
> > Formatting '/tmp/test.qcow2', fmt=file size=67436544
> > $ ./qemu-img convert -n --target-image-opts ~/images/hd.img driver=raw,file.driver=file,file.filename=/tmp/test.qcow2,offset=65536
> >
> > hexdump verifies that this does the expected thing.

> But #3 is definitely something we are interested in because we expect the next step to be exporting the OVAs to a remote instance of Glance that serves as a shared repository for the different deployments. Being able to stream the collapsed form of a volume chain without writing anything to the storage device would be fantastic. I think that even at the expense of iterating the chain twice - once to map the structure of the jump tables (right?) and once to stream the whole data.

If the target is not a stupid web browser, but something actually virt-related like Glance, I'm sure it can offer a more suitable protocol than HTTP? If you could talk NBD to Glance, you'd get rid of the streaming requirement. I think it would make more sense to invest the effort there.

Kevin
[ovirt-users] Re: [Qemu-block] Libvirt ERROR cannot access backing file after importing VM from OpenStack
Am 30.05.2018 um 17:05 hat Eric Blake geschrieben:
> If I understood the question, we start with a local:
>
> T (any format) <- S (qcow2) <- V (qcow2)
>
> and want to create a remote tar file:
>
> dest.tar == | header ... | qcow2 image |
>
> where we write a single collapsed view of the T<-S<-V chain as a qcow2 image in the subset of the remote tar file.

I think the problem is that we're talking about two different things in one thread. If I understand correctly, what oVirt does today is:

1. qemu-img convert to create a temporary qcow2 image that merges the whole backing chain in a single file

2. tar to create a temporary OVA archive that contains, amongst others, the temporary qcow2 image. This is a second temporary file.

3. Stream this temporary OVA archive over HTTP

Your proposal is about getting rid of the temporary file from step 1, but keeping the temporary file from step 2. I was kind of ignoring step 2 and answering how you can avoid a temporary file by creating and streaming a qcow2 file in a single step, but if you already have the code to create a qcow2 image as a stream, adding a tar header as well shouldn't be that hard... I think Nir was talking about both.

Ideally, we'd somehow get rid of HTTP, which introduces the requirement of a non-seekable stream.
> So, first use qemu-img to learn how big to size the collapsed qcow2 image, > and by extension, the overall tar image > $ qemu-img measure -f qcow2 -O qcow2 V > > then pre-create a large enough tar file on the destination > $ create header > $ truncate --size=XXX dest.qcow2 > $ tar cf dest.tar header dest.qcow2 > > (note that I explicitly did NOT use tar --sparse; dest.qcow2 is sparse and > occupies practically no disk space, but dest.tar must NOT be sparse because > neither tar nor NBD work well with after-the-fact resizing) > > then set up an NBD server on the destination that can write to the subset of > the tar file: > > $ learn the offset of dest.qcow2 within dest.tar (probably a multiple of > 10240, given default GNU tar options) > $ qemu-nbd --image-opts > driver=raw,offset=YYY,size=XXX,file.driver=file,file.filename=dest.tar > > (I'm not sure if I got the --image-opts syntax exactly correct. nbdkit has > more examples of learning offsets within a tar file, and may be a better > option as a server than qemu-nbd - but the point remains: serve up the > subset of the dest.tar file as raw bytes) > > finally set up qemu as an NBD client on the source: > $ qemu-img convert -f qcow2 V -O qcow2 nbd://remote > > (now the client collapses the qcow2 chain onto the source, and writes that > into a qcow2 subset of the tar file on the destination, where the > destination was already sized large enough to hold the qcow2 image, and > where no other temporary storage was needed other than the sparse dest.qcow2 > used in creating a large enough tar file) You added another host into the mix, which just receives the image content via NBD and then re-exports it as HTTP. Does this host actually exist or is it the same host where the original images are located? 
Because if you stay local for this step, there is no need to use NBD at all:

$ ./qemu-img measure -O qcow2 ~/images/hd.img
required size: 67436544
fully allocated size: 67436544
$ ./qemu-img create -f file /tmp/test.qcow2 67436544
Formatting '/tmp/test.qcow2', fmt=file size=67436544
$ ./qemu-img convert -n --target-image-opts ~/images/hd.img driver=raw,file.driver=file,file.filename=/tmp/test.qcow2,offset=65536

hexdump verifies that this does the expected thing.

> > Exporting to a stream is possible if we're allowed to make two passes over the source, but the existing QEMU code is useless for that because it inherently requires seeking. I think if I had to get something like this, I'd probably implement such an exporter as a script external to QEMU.
>
> Wait. What are we trying to stream? A qcow2 file, or what the guest would see? If you stream just what the guest sees, then 'qemu-img map' tells you which portions of which source files to read in order to reconstruct data in the order it would be seen by the guest.

I think the requirement was that the HTTP client downloads a qcow2 image. Did I get this wrong?

> But yeah, an external exporter that takes a raw file, learns its size and where the holes are, and then writes a trivial qcow2 header and appends L1/L2/refcount tables on the end to convert the raw file into a slightly-larger qcow2 file, might be a valid way to create a qcow2 file from a two-pass read.

Right. It may have to calculate the size of the L1 and refcount table first so it can write the right offsets into the header, so maybe it's easiest to precreate the whole metadata. But that's an implementation detail. Anyway, I don't think the existing QEMU code helps you with this.

Kevin
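Eric's point that 'qemu-img map' tells you which portions to read is easy to wire up; a hedged sketch that parses the JSON output of 'qemu-img map --output=json' (field names are as qemu-img documents them; the helper name is mine):

```python
import json

# Sketch: extract the allocated (data-carrying) extents from the JSON that
# 'qemu-img map --output=json IMAGE' prints. An external exporter would use
# this map to learn the size and hole layout before streaming anything.
def allocated_extents(map_json):
    return [(e["start"], e["length"])
            for e in json.loads(map_json)
            if e["data"]]
```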
[ovirt-users] Re: [Qemu-block] Libvirt ERROR cannot access backing file after importing VM from OpenStack
Am 30.05.2018 um 15:44 hat Eric Blake geschrieben:
> On 05/29/2018 04:18 PM, Nir Soffer wrote:
> > > You CAN get a logically collapsed view of storage (that is, what the guest would see), by using an NBD export of volume V. Reading from that volume will then pull sectors from whichever portion of the chain you need. You can use either qemu-nbd (if no guest is writing to the chain), or within a running qemu, you can use nbd-server-start and nbd-server-add (over QMP) to get such an NBD server running.
> >
> > NBD exposes the guest data, but we want the qcow2 stream - without creating a new image.
>
> NBD can do both. You choose whether it exposes the guest data or the qcow2 data, by whether the client or the server is interpreting qcow2 data.

But if I understand correctly, it doesn't result in the image Nir wants. You would only export an existing qcow2 file, i.e. a single layer in the backing chain, this way. The question was about a collapsed image, i.e. the disk content as the guest sees it.

The problem is that qcow2 just isn't made to be streamable. Importing a qcow2 stream without saving it into a temporary file (or a memory buffer as large as the image file) simply isn't possible in the general case.

Exporting to a stream is possible if we're allowed to make two passes over the source, but the existing QEMU code is useless for that because it inherently requires seeking. I think if I had to get something like this, I'd probably implement such an exporter as a script external to QEMU.

Kevin
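The two-pass exporter Kevin sketches in words can be outlined like this (a toy model, not qemu code: the cluster read callback and emit sink are stand-ins for real image I/O):

```python
# Toy outline of a two-pass streaming exporter: pass 1 scans the source to
# learn which clusters are allocated (so header and metadata sizes are known
# before anything is emitted), pass 2 re-reads and emits everything in
# stream order, never seeking backwards.
def two_pass_export(read_cluster, n_clusters, emit):
    # Pass 1: map only; nothing is written yet.
    allocated = [i for i in range(n_clusters) if read_cluster(i) is not None]
    # With the map known, all offsets can be computed up front.
    emit(("header", len(allocated)))
    # Pass 2: stream the data in order.
    for i in allocated:
        emit(("data", i, read_cluster(i)))
```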
[ovirt-users] Re: Libvirt ERROR cannot access backing file after importing VM from OpenStack
Am 29.05.2018 um 11:27 hat Richard W.M. Jones geschrieben: > On Mon, May 28, 2018 at 01:27:21PM +0300, Arik Hadas wrote: > > Let me demonstrate briefly the flow for OVA: > > Let's say that we have a VM that is based on a template and has one disk > > and one snapshot, so its volume-chain would be: > > T -> S -> V > > (V is the volume the VM writes to, S is the backing file of V and T is the > > backing file of S). > > When exporting that VM to an OVA file we want the produced tar file to be > > comprised of: > > (1) OVF configuration > > (2) single disk volume (preferably qcow). > > > > So we need to collapse T, S, V into a single volume. > > Sure, we can do 'qemu-img convert'. That's what we do now in oVirt 4.2: > > (a) qemu-img convert produces a 'temporary' collapsed volume > > (b) make a tar file of the OVf configuration and that 'temporary' volume > > (c) delete the temporary volume > > > > But the fact that we produce that 'temporary' volume obviously slows down > > the entire operation. > > It would be much better if we could "open" a stream that we can read from > > the 'collapsed' form of that chain and stream it directly into the > > appropriate tar file entry, without extra writes to the storage device. > > A custom nbdkit plugin is possible here. In fact it's almost possible > using the existing nbdkit-tar-plugin[1], except that it doesn't > support resizing the tarball so you'd need a way to predict the size > of the final qcow2 file. I think you can predict the size with 'qemu-img measure'. But how do you create a tar archive that contains an empty file of the right size without actually processing and writing gigabytes of zero bytes? Is there an existing tool that can do that or would you have to write your own? > The main difficulty for modifying nbdkit-tar-plugin is working out how > to resize tar files. If you can do that then it's likely just a few > lines of code. 
This sounds impossible to do when the tar archive needs to stay consistent at all times.

Kevin
[ovirt-users] Re: Libvirt ERROR cannot access backing file after importing VM from OpenStack
Am 28.05.2018 um 16:06 hat Tomáš Golembiovský geschrieben: > > On Mon, 28 May 2018 13:37:59 +0200 > Kevin Wolf wrote: > > > Am 28.05.2018 um 12:27 hat Arik Hadas geschrieben: > > > On Mon, May 28, 2018 at 11:25 AM, Kevin Wolf wrote: > > > > > > > [ Adding qemu-block ] > > > > > > > > Am 27.05.2018 um 10:36 hat Arik Hadas geschrieben: > > > > > On Thu, May 24, 2018 at 6:13 PM, Nir Soffer > > > > > wrote: > > > > > > > > > > > On Thu, May 24, 2018 at 6:06 PM Vrgotic, Marko < > > > > m.vrgo...@activevideo.com> > > > > > > wrote: > > > > > > > > > > > >> Dear Nir, > > > > > >> > > > > > >> Thank you for quick reply. > > > > > >> > > > > > >> Ok, why it will not work? > > > > > >> > > > > > > > > > > > > Because the image has a backing file which is not accessible to > > > > > > oVirt. > > > > > > > > > > > > > > > > > >> I used qemu+tcp connection, via import method through engine admin > > > > > >> UI. > > > > > >> > > > > > >> Images was imported and converted according logs, still “backing > > > > > >> file” > > > > > >> invalid entry remained. > > > > > >> > > > > > >> Also, I did use same method before, connecting to plain “libvirt > > > > > >> kvm” > > > > > >> host, import and conversion went smooth, no backend file. > > > > > >> > > > > > >> Image format is qcow(2) which is supported by oVirt. > > > > > >> > > > > > >> What am I missing? Should I use different method? > > > > > >> > > > > > > > > > > > > I guess this is not a problem on your side, but a bug in our side. > > > > > > > > > > > > Either we should block the operation that cannot work, or fix the > > > > process > > > > > > so we don't refer to non-existing image. > > > > > > > > > > > > When importing we have 2 options: > > > > > > > > > > > > - import the entire chain, importing all images in the chain, > > > > converting > > > > > > each image to oVirt volume, and updating the backing file of each > > > > layer > > > > > > to point to the oVirt image. 
> > > > > > > > > > > > - import the current state of the image into a new image, using > > > > > > either > > > > raw > > > > > > or qcow2, but without any backing file. > > > > > > > > > > > > Arik, do you know why we create qcow2 file with invalid backing > > > > > > file? > > > > > > > > > > > > > > > > It seems to be a result of a bit naive behavior of the kvm2ovirt > > > > > module > > > > > that tries to download only the top-level volume the VM uses, > > > > > assuming > > > > each > > > > > of the disks to be imported is comprised of a single volume. > > > > > > > > > > Maybe it's time to finally asking QEMU guys to provide a way to > > > > > consume > > > > the > > > > > 'collapsed' form of a chain of volumes as a stream if that's not > > > > available > > > > > yet? ;) It can also boost the recently added process of exporting VMs > > > > > as > > > > > OVAs... > > > > > > > > Not sure which operation we're talking about on the QEMU level, but > > > > generally the "collapsed" view is the normal thing because that's what > > > > guests see. > > > > > > > > For example, if you use 'qemu-img convert', you have to pass options to > > > > specifically disable it and convert only a single layer if you want to > > > > keep using backing files instead of getting a standalone image that > > > > contains everything. > > > > > > > > > > Yeah, some context was missing. Sorry about that. > > > > > >
[ovirt-users] Re: Libvirt ERROR cannot access backing file after importing VM from OpenStack
Am 28.05.2018 um 12:27 hat Arik Hadas geschrieben: > On Mon, May 28, 2018 at 11:25 AM, Kevin Wolf wrote: > > > [ Adding qemu-block ] > > > > Am 27.05.2018 um 10:36 hat Arik Hadas geschrieben: > > > On Thu, May 24, 2018 at 6:13 PM, Nir Soffer wrote: > > > > > > > On Thu, May 24, 2018 at 6:06 PM Vrgotic, Marko < > > m.vrgo...@activevideo.com> > > > > wrote: > > > > > > > >> Dear Nir, > > > >> > > > >> Thank you for quick reply. > > > >> > > > >> Ok, why it will not work? > > > >> > > > > > > > > Because the image has a backing file which is not accessible to oVirt. > > > > > > > > > > > >> I used qemu+tcp connection, via import method through engine admin UI. > > > >> > > > >> Images was imported and converted according logs, still “backing file” > > > >> invalid entry remained. > > > >> > > > >> Also, I did use same method before, connecting to plain “libvirt kvm” > > > >> host, import and conversion went smooth, no backend file. > > > >> > > > >> Image format is qcow(2) which is supported by oVirt. > > > >> > > > >> What am I missing? Should I use different method? > > > >> > > > > > > > > I guess this is not a problem on your side, but a bug in our side. > > > > > > > > Either we should block the operation that cannot work, or fix the > > process > > > > so we don't refer to non-existing image. > > > > > > > > When importing we have 2 options: > > > > > > > > - import the entire chain, importing all images in the chain, > > converting > > > > each image to oVirt volume, and updating the backing file of each > > layer > > > > to point to the oVirt image. > > > > > > > > - import the current state of the image into a new image, using either > > raw > > > > or qcow2, but without any backing file. > > > > > > > > Arik, do you know why we create qcow2 file with invalid backing file? 
> > > > > > > > > > It seems to be a result of a bit naive behavior of the kvm2ovirt module > > > that tries to download only the top-level volume the VM uses, assuming > > each > > > of the disks to be imported is comprised of a single volume. > > > > > > Maybe it's time to finally asking QEMU guys to provide a way to consume > > the > > > 'collapsed' form of a chain of volumes as a stream if that's not > > available > > > yet? ;) It can also boost the recently added process of exporting VMs as > > > OVAs... > > > > Not sure which operation we're talking about on the QEMU level, but > > generally the "collapsed" view is the normal thing because that's what > > guests see. > > > > For example, if you use 'qemu-img convert', you have to pass options to > > specifically disable it and convert only a single layer if you want to > > keep using backing files instead of getting a standalone image that > > contains everything. > > > > Yeah, some context was missing. Sorry about that. > > Let me demonstrate briefly the flow for OVA: > Let's say that we have a VM that is based on a template and has one disk > and one snapshot, so its volume-chain would be: > T -> S -> V > (V is the volume the VM writes to, S is the backing file of V and T is the > backing file of S). > When exporting that VM to an OVA file we want the produced tar file to be > comprised of: > (1) OVF configuration > (2) single disk volume (preferably qcow). > > So we need to collapse T, S, V into a single volume. > Sure, we can do 'qemu-img convert'. That's what we do now in oVirt 4.2: > (a) qemu-img convert produces a 'temporary' collapsed volume > (b) make a tar file of the OVf configuration and that 'temporary' volume > (c) delete the temporary volume > > But the fact that we produce that 'temporary' volume obviously slows down > the entire operation. 
> It would be much better if we could "open" a stream that we can read from > the 'collapsed' form of that chain and stream it directly into the > appropriate tar file entry, without extra writes to the storage device. > > Few months ago people from the oVirt-storage team checked the qemu toolset > and replied that thi
[ovirt-users] Re: Libvirt ERROR cannot access backing file after importing VM from OpenStack
[ Adding qemu-block ] Am 27.05.2018 um 10:36 hat Arik Hadas geschrieben: > On Thu, May 24, 2018 at 6:13 PM, Nir Soffer wrote: > > > On Thu, May 24, 2018 at 6:06 PM Vrgotic, Marko > > wrote: > > > >> Dear Nir, > >> > >> Thank you for quick reply. > >> > >> Ok, why it will not work? > >> > > > > Because the image has a backing file which is not accessible to oVirt. > > > > > >> I used qemu+tcp connection, via import method through engine admin UI. > >> > >> Images was imported and converted according logs, still “backing file” > >> invalid entry remained. > >> > >> Also, I did use same method before, connecting to plain “libvirt kvm” > >> host, import and conversion went smooth, no backend file. > >> > >> Image format is qcow(2) which is supported by oVirt. > >> > >> What am I missing? Should I use different method? > >> > > > > I guess this is not a problem on your side, but a bug in our side. > > > > Either we should block the operation that cannot work, or fix the process > > so we don't refer to non-existing image. > > > > When importing we have 2 options: > > > > - import the entire chain, importing all images in the chain, converting > > each image to oVirt volume, and updating the backing file of each layer > > to point to the oVirt image. > > > > - import the current state of the image into a new image, using either raw > > or qcow2, but without any backing file. > > > > Arik, do you know why we create qcow2 file with invalid backing file? > > > > It seems to be a result of a bit naive behavior of the kvm2ovirt module > that tries to download only the top-level volume the VM uses, assuming each > of the disks to be imported is comprised of a single volume. > > Maybe it's time to finally asking QEMU guys to provide a way to consume the > 'collapsed' form of a chain of volumes as a stream if that's not available > yet? ;) It can also boost the recently added process of exporting VMs as > OVAs... 
Not sure which operation we're talking about on the QEMU level, but generally the "collapsed" view is the normal thing because that's what guests see. For example, if you use 'qemu-img convert', you have to pass options to specifically disable it and convert only a single layer if you want to keep using backing files instead of getting a standalone image that contains everything. Kevin > > > > > Nir > > > > > >> > >> Kindly awaiting your reply. > >> > >> — — — > >> Met vriendelijke groet / Best regards, > >> > >> Marko Vrgotic > >> Sr. System Engineer > >> ActiveVideo > >> > >> Tel. +31 (0)35 677 4131 <+31%2035%20677%204131> > >> email: m.vrgo...@activevideo.com > >> skype: av.mvrgotic.se > >> www.activevideo.com > >> -- > >> *From:* Nir Soffer > >> *Sent:* Thursday, May 24, 2018 4:09:40 PM > >> *To:* Vrgotic, Marko > >> *Cc:* users@ovirt.org; Richard W.M. Jones; Arik Hadas > >> *Subject:* Re: [ovirt-users] Libvirt ERROR cannot access backing file > >> after importing VM from OpenStack > >> > >> > >> > >> On Thu, May 24, 2018 at 5:05 PM Vrgotic, Marko > >> wrote: > >> > >> Dear oVirt team, > >> > >> > >> > >> When trying to start imported VM, it fails with following message: > >> > >> > >> > >> ERROR > >> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] > >> (ForkJoinPool-1-worker-2) [] EVENT_ID: VM_DOWN_ERROR(119), VM > >> instance-0673 is down with error. Exit message: Cannot access backing > >> file > >> '/var/lib/nova/instances/_base/2f4f8c5fc11bb83bcab03f4c829ddda4da8c0bce' > >> of storage file '/rhev/data-center/mnt/glusterSD/aws-gfs-01.awesome. > >> lan:_gv0__he/2607c265-248c-40ad-b020-f3756454839e/images/ > >> 816ac00f-ba98-4827-b5c8-42a8ba496089/8ecfcd5b-db67-4c23-9869-0e20d7553aba' > >> (as uid:107, gid:107): No such file or directory. > >> > >> > >> > >> Platform details: > >> > >> Ovirt SHE > >> > >> Version 4.2.2.6-1.el7.centos > >> > >> GlusterFS, unmanaged by oVirt. 
> >> > >> > >> > >> VM is imported & converted from OpenStack, according to log files, > >> successfully (one WARN, related to different MAC address): > >> > >> 2018-05-24 12:03:31,028+02 INFO [org.ovirt.engine.core. > >> vdsbroker.vdsbroker.GetVmsNamesFromExternalProviderVDSCommand] (default > >> task-29) [cc5931a2-1af5-4d65-b0b3-362588db9d3f] FINISH, > >> GetVmsNamesFromExternalProviderVDSCommand, return: [VM > >> [instance-0001f94c], VM [instance-00078f6a], VM [instance-0814], VM > >> [instance-0001f9ac], VM [instance-01ff], VM [instance-0001f718], VM > >> [instance-0673], VM [instance-0001ecf2], VM [instance-00078d38]], log > >> id: 7f178a5e > >> > >> 2018-05-24 12:48:33,722+02 INFO [org.ovirt.engine.core. > >> vdsbroker.vdsbroker.GetVmsNamesFromExternalProviderVDSCommand] (default > >> task-8) [103d56e1-7449-4853-ae50-48ee94d43d77] FINISH, > >> GetVmsNamesFromExternalProviderVDSCommand, return: [VM > >> [instance-0001f94c], VM [instance-00078f6a], VM [instance-0814], VM > >> [instance-0001f9ac], VM [instance-01ff], VM [instance-0001f718],
Re: [ovirt-users] [Qemu-block] qcow2 images corruption
Am 07.02.2018 um 18:06 hat Nicolas Ecarnot geschrieben:
> TL;DR: qcow2 images keep getting corrupted. Any workaround?

Not without knowing the cause.

The first thing to make sure is that the image isn't touched by a second process while QEMU is running a VM. The classic one is using 'qemu-img snapshot' on the image of a running VM, which is instant corruption (and newer QEMU versions have locking in place to prevent this), but we have seen more absurd cases of things outside QEMU tampering with the image when we were investigating previous corruption reports. This covers the majority of all reports; we haven't had a real corruption caused by a QEMU bug in ages.

> After having found (https://access.redhat.com/solutions/1173623) the right logical volume hosting the qcow2 image, I can run qemu-img check on it.
> - On 80% of my VMs, I find no errors.
> - On 15% of them, I find leaked cluster errors that I can correct using "qemu-img check -r all".
> - On 5% of them, I find leaked cluster errors and further fatal errors, which cannot be corrected with qemu-img.
> In rare cases, qemu-img can correct them but destroys large parts of the image (it becomes unusable), and in other cases it cannot correct them at all.

It would be good if you could make the 'qemu-img check' output available somewhere. It would be even better if we could have a look at the respective image. I seem to remember that John (CCed) had a few scripts to analyse corrupted qcow2 images, maybe we would be able to see something there.

> What I read similar to my case is:
> - usage of qcow2
> - heavy disk I/O
> - using the virtio-blk driver
>
> In the proxmox thread, they tend to say that using virtio-scsi is the solution. Having asked this question of oVirt experts (https://lists.ovirt.org/pipermail/users/2018-February/086753.html), it's not clear the driver is to blame.

This seems very unlikely. The corruption you're seeing is in the qcow2 metadata, not only in the guest data.
If anything, virtio-scsi exercises more qcow2 code paths than virtio-blk, so any potential bug that affects virtio-blk should also affect virtio-scsi, but not the other way around.

> I agree with the answer Yaniv Kaul gave to me, saying I have to properly report the issue, so I'm longing to know which particular information I can give you now.

To be honest, debugging corruption after the fact is pretty hard. We'd need the 'qemu-img check' output and ideally the image to do anything, but I can't promise that anything would come out of this.

Best would be a reproducer, or at least some operation that you can link to the appearance of the corruption. Then we could take a more targeted look at the respective code.

> As you can imagine, all this setup is in production, and for most of the VMs, I can not "play" with them. Moreover, we launched a campaign of nightly stopping every VM, running qemu-img check on them one by one, then booting.
> So it might take some time before I find another corrupted image.
> (which I'll preciously store for debug)
>
> Other information: We very rarely do snapshots, but I'm inclined to imagine that automated migrations of VMs could trigger similar behaviors on qcow2 images.

To my knowledge, oVirt only uses external snapshots and creates them with QMP. This should be perfectly safe because from the perspective of the qcow2 image being snapshotted, it just means that it gets no new write requests.

Migration is something more involved, and if you could relate the problem to migration, that would certainly be something to look into. In that case, it would be important to know more about the setup, e.g. is it migration with shared or non-shared storage?

> Last point about the versions we use: yes that's old, yes we're planning to upgrade, but we don't know when.

That would be helpful, too.
Nothing is more frustrating than debugging a bug in an old version only to find that it's already fixed in the current version (well, except maybe debugging and finding nothing).

Kevin
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users
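As a footnote to the corruption discussion above: the first thing analysis scripts like the ones mentioned usually do is sanity-check the fixed qcow2 header fields before walking any refcounts. A minimal sketch of that first pass, based on the public qcow2 on-disk specification (illustrative only, and no substitute for 'qemu-img check'):

```python
import struct

QCOW2_MAGIC = b"QFI\xfb"
# Fixed header fields up to refcount_table_clusters, all big-endian
HEADER_FMT = ">4sIQIIQIIQQI"

def check_qcow2_header(data: bytes) -> dict:
    """Parse the fixed part of a qcow2 header and run basic sanity checks."""
    if len(data) < struct.calcsize(HEADER_FMT):
        raise ValueError("short read: not a qcow2 header")
    (magic, version, backing_off, backing_len, cluster_bits,
     size, crypt, l1_size, l1_off, reft_off, reft_clusters) = \
        struct.unpack(HEADER_FMT, data[:struct.calcsize(HEADER_FMT)])
    if magic != QCOW2_MAGIC:
        raise ValueError("bad magic: not a qcow2 image")
    if version not in (2, 3):
        raise ValueError("unsupported qcow2 version %d" % version)
    if not 9 <= cluster_bits <= 21:  # 512 B .. 2 MiB, per the spec
        raise ValueError("implausible cluster_bits %d" % cluster_bits)
    cluster_size = 1 << cluster_bits
    # Metadata tables must start on cluster boundaries
    for name, off in (("L1 table", l1_off), ("refcount table", reft_off)):
        if off % cluster_size:
            raise ValueError("%s offset not cluster-aligned" % name)
    return {"version": version, "cluster_size": cluster_size,
            "virtual_size": size, "has_backing_file": backing_off != 0}
```

A header that fails these checks is corrupted beyond what 'qemu-img check -r all' can repair; a header that passes can still hide refcount or L2 table damage, which is why the full check matters.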
Re: [ovirt-users] [Qemu-block] slow performance with export storage on glusterfs
Am 07.12.2017 um 23:45 hat Nir Soffer geschrieben:
> The qemu bug https://bugzilla.redhat.com/713743 explains the issue:
> qemu-img was writing disk images using writeback and filling up the
> cache buffers, which are then flushed by the kernel, preventing other
> processes from accessing the storage. This is particularly bad in
> cluster environments where time-based algorithms might be in place and
> accessing the storage within certain timeouts is critical.
>
> I'm not sure if this issue is relevant now. We now use sanlock instead of
> safelease (except for the export domain, which still uses safelease), and qemu
> or the kernel may have better options to avoid thrashing the host cache, or
> to guarantee reliable access to storage.

Non-direct means that the data goes through the kernel page cache, and the kernel doesn't know that it won't be needed again, so it will fill up the cache with the image. I'm also not aware that cache coherency is now provided by all backends for shared storage, so O_DIRECT still seems to be the only way to avoid using stale caches.

Since the problem is about stale caches, I don't see how the locking mechanism could make a difference.

The only thing I can suggest, given that there is a "glusterfs" in the subject line of the email, is that the native gluster driver in QEMU takes a completely different path and never uses the kernel page cache, which should make both problems disappear. Maybe it would be worth having a look at this.

Kevin
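For completeness, the cache-pollution half of the problem (not the coherency half) can also be mitigated without O_DIRECT by advising the kernel to drop pages behind the copy. A rough sketch of that approach; note this is not what qemu-img does internally (it opens with O_DIRECT when given '-t none'), and posix_fadvise is only advisory:

```python
import os

def copy_without_cache_pollution(src: str, dst: str, chunk: int = 1 << 20) -> None:
    """Copy src to dst while asking the kernel not to keep the pages cached.

    Approximates the cache behaviour of qemu-img's '-t none' without the
    buffer-alignment requirements O_DIRECT imposes: after each chunk is
    flushed, POSIX_FADV_DONTNEED asks the kernel to evict it from the page
    cache, so the copy cannot push more useful data out of the cache.
    """
    sfd = os.open(src, os.O_RDONLY)
    dfd = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        offset = 0
        while True:
            buf = os.read(sfd, chunk)
            if not buf:
                break
            os.write(dfd, buf)
            os.fdatasync(dfd)  # pages must be clean before they can be dropped
            os.posix_fadvise(sfd, offset, len(buf), os.POSIX_FADV_DONTNEED)
            os.posix_fadvise(dfd, offset, len(buf), os.POSIX_FADV_DONTNEED)
            offset += len(buf)
    finally:
        os.close(sfd)
        os.close(dfd)
```

This only addresses filling up the host cache; it does nothing about stale caches on shared storage, which is why O_DIRECT (or the native gluster driver) remains the answer for the coherency problem.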
Re: [ovirt-users] [Qemu-block] Scheduling daily Snapshot
Am 07.12.2017 um 23:19 hat Nir Soffer geschrieben:
> On Wed, Dec 6, 2017 at 6:02 PM Jason Lelievre wrote:
>
> > Hello,
> >
> > What is the best way to set up a daily live snapshot for all VMs, and have
> > the possibility to recover, for example, a specific VM on a specific day?
>
> Each snapshot you create makes reads and writes slower, as qemu has to
> look up data through the entire chain.

This is true in principle. However, as long as the lookup is purely in memory and doesn't involve I/O, you won't even notice it in average use cases. Whether additional I/O is necessary depends on whether the metadata caches already cover the part of the image that you're accessing. By choosing the right cache size values for the use case, it can normally be achieved that everything is already in memory.

> When we take a snapshot, we create a new file (or block device) and make
> the new file the active layer of the chain.
>
> For example, assuming you have a base image in raw format:
>
> image-1.raw (top)
>
> After taking a snapshot, you have:
>
> image-1.raw <- image-2.qcow2 (top)
>
> Now when qemu needs to read data from the image, it will try to get the
> data from the top layer (image-2); if it does not exist there, it will try
> the backing file (image-1). The same happens when writing data: if qemu
> needs to write a small amount of data, it may need to get an entire sector
> from another layer in the chain and copy it to the top layer.

Yes, though for this operation it doesn't matter whether it has to copy the data from the second image in the chain or the thirtieth. As soon as you do a partial write to a cluster that hasn't been written to since the last snapshot was taken, you get to copy data, no matter the length of the chain.

Kevin
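The lookup and copy-on-write behaviour described above can be modelled in a few lines. This toy simulation (illustrative only, with a tiny 4-byte "cluster"; real qcow2 uses L1/L2 tables and 64 KiB clusters by default) shows why the copy cost of a partial write is the same regardless of chain depth:

```python
class Qcow2ChainSim:
    """Toy model of reads and writes through a qcow2 backing chain.

    Each layer maps cluster index -> bytes; layers[0] is the active (top)
    layer and is the only writable one, mirroring external snapshots.
    """
    CLUSTER = 4  # tiny cluster size to keep the example readable

    def __init__(self, n_layers: int):
        self.layers = [dict() for _ in range(n_layers)]

    def read_cluster(self, idx: int) -> bytes:
        # Walk from the active layer down the backing chain until some
        # layer has the cluster allocated -- the lookup Nir describes.
        for layer in self.layers:
            if idx in layer:
                return layer[idx]
        return b"\0" * self.CLUSTER  # unallocated clusters read as zeros

    def write(self, idx: int, offset: int, data: bytes) -> None:
        # Partial write: copy-on-write pulls the whole cluster into the
        # top layer first, no matter how deep in the chain the old copy
        # lives -- one copy either way.
        old = self.read_cluster(idx)
        new = old[:offset] + data + old[offset + len(data):]
        self.layers[0][idx] = new
```

After a single partial write, the cluster lives in the top layer and all later writes to it are plain overwrites, which is Kevin's point: the chain length only matters until the cluster has been copied up once.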
Re: [ovirt-users] [Qemu-block] Enabling libgfapi disk access with oVirt 4.2
Am 15.11.2017 um 23:05 hat Nir Soffer geschrieben:
> On Wed, Nov 15, 2017 at 8:58 AM Misak Khachatryan wrote:
>
> > Hi,
> >
> > Would that be a cleaner approach? I can't tolerate a full stop of all
> > VMs just to enable it; that seems too disastrous for a real production
> > environment. Will there be some migration mechanism in the future?
>
> You can enable it per VM, you don't need to stop all of them. But I think
> we do not support upgrading a machine with running VMs, so upgrading
> requires:
>
> 1. migrating the VMs from the host you want to upgrade
> 2. upgrading the host
> 3. stopping the VM you want to switch to libgfapi
> 4. starting this VM on the upgraded host
>
> Theoretically qemu could switch from one disk to another, but I'm not
> sure this is supported when switching to the same disk using different
> transports. I know it is not supported now to mirror a network drive to
> another network drive.

I don't think this is possible yet, but we're relatively close to actually allowing it. No promises as to when qemu will allow QMP to make arbitrary changes to the block graph, but it is being worked on.

Kevin
Re: [ovirt-users] Performance of cloning
Am 28.09.2017 um 12:44 hat Nir Soffer geschrieben:
> On Thu, Sep 28, 2017 at 12:03 PM Gianluca Cecchi wrote:
>
> > Hello,
> > I'm on 4.1.5 and I'm cloning a snapshot of a VM with 3 disks, for a total
> > of about 200 GB to copy.
> > The target I chose is on a different domain than the source one.
> > They are both FC storage domains, with the source on SSD disks and the
> > target on SAS disks.
> >
> > The disks are preallocated.
> >
> > Now I have 3 processes of this kind:
> > /usr/bin/qemu-img convert -p -t none -T none -f raw
> > /rhev/data-center/59b7af54-0155-01c2-0248-0195/fad05d79-254d-4f40-8201-360757128ede/images/8f62600a-057d-4d59-9655-631f080a73f6/21a8812f-6a89-4015-a79e-150d7e202450
> > -O raw
> > /rhev/data-center/mnt/blockSD/6911716c-aa99-4750-a7fe-f83675a2d676/images/c3973d1b-a168-4ec5-8c1a-630cfc4b66c4/27980581-5935-4b23-989a-4811f80956ca
> >
> > but despite the hardware's capabilities it seems it is copying using very
> > low system resources.
>
> We run qemu-img convert (and other storage-related commands) with:
>
> nice -n 19 ionice -c 3 qemu-img ...
>
> ionice should not have any effect unless you use the CFQ I/O scheduler.
>
> The intent is to limit the effect on virtual machines.
>
> > I see this using both iotop and vmstat.
> >
> > vmstat 3 gives:
> > ----io---- -system-- ------cpu------
> > bi bo in cs us sy id wa st
> > 2527 698 3771 29394 1 0 89 10 0
>
> us 94% also seems very high - maybe this hypervisor is overloaded with
> other workloads?
> wa 89% seems very high

The alignment in the table is a bit off, but us is 1%. The 94 you saw is part of cs=29394.

A high percentage for wait is generally a good sign, because it means that the system is busy with actual I/O work. Obviously, this I/O work is rather slow, but at least qemu-img is making requests to the kernel instead of doing other work; otherwise, user would be much higher.

Kevin
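A small helper makes the column confusion above hard to repeat: pair vmstat's field-name line with a sample line by position instead of reading the misaligned table by eye. (Illustrative only; field names can vary slightly between vmstat versions.)

```python
def parse_vmstat(header: str, values: str) -> dict:
    """Pair vmstat column names with their values by position.

    Pass vmstat's field-name line (skip the group header line such as
    '----io---- -system-- ------cpu------') and one sample line. Splitting
    on whitespace sidesteps the visual misalignment that made us=1 look
    like 94% in the thread above.
    """
    names = header.split()
    nums = [int(v) for v in values.split()]
    if len(names) != len(nums):
        raise ValueError("field/value count mismatch: %d names, %d values"
                         % (len(names), len(nums)))
    return dict(zip(names, nums))
```

Applied to the sample in the email, this yields us=1, wa=10 and cs=29394, matching Kevin's reading.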
Re: [ovirt-users] VM get stuck randomly
Hi Christophe,

Am 30.03.2016 um 13:45 hat Christophe TREFOIS geschrieben:
> Another host went down, so I have to prepare info for this one.
>
> I could not SSH to it anymore.
> The console would show the login screen, but no keystrokes were registered.
>
> I could "suspend" the VM and "run" it again, but still can't SSH to it.
> Before suspension, all QEMU threads were around 0%; after resuming, 3 of them
> hover at 100%.
>
> Attached you can find the gdb output, core dump, and other logs.
>
> Logs: https://dl.dropboxusercontent.com/u/63261/ubuntu2-logs.tar.gz
>
> Core dump: https://dl.dropboxusercontent.com/u/63261/core-ubuntu2.tar.gz
>
> Is there anything else we could provide?

This sounds very much like it's not qemu that hangs (because then stopping and resuming wouldn't work any more), but just the guest OS that is running inside the VM. We have had cases before where qemu was reported to hang with 100% CPU usage, and in the end it turned out that the guest kernel had panicked.

Can you check whether a guest kernel crash could be the cause? If this is reproducible, maybe the easiest way would be to attach a serial console to the VM and let the kernel print its messages there.

Kevin
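For reference, attaching a serial console to a libvirt-managed VM typically looks like the fragment below. This is a sketch of the standard libvirt device XML; oVirt normally generates the domain XML itself, so on oVirt you would enable the console through the engine rather than editing XML by hand.

```xml
<!-- In the domain XML: a pty-backed serial port the guest kernel can log to -->
<serial type='pty'>
  <target port='0'/>
</serial>
<console type='pty'>
  <target type='serial' port='0'/>
</console>
```

Inside the guest, append something like `console=ttyS0,115200n8` to the kernel command line so panic messages go to the serial port, then read them from the host with `virsh console <domain>`.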
Re: [ovirt-users] VM get stuck randomly
Am 27.03.2016 um 22:38 hat Christophe TREFOIS geschrieben:
> Hi,
>
> MS does not like my previous email, so here it is again with a link to Dropbox
> instead of as an attachment.
>
> ——
> Hi Nir,
>
> Inside the core dump tarball is also the output of the two gdb commands you
> mentioned.
>
> Understandably, you might not want to download the big files for that, so I
> attached them here separately.

The gdb dump looks pretty much like an idle qemu that just sits there and waits for events. The vcpu threads seem to be running guest code, the I/O thread and the SPICE thread are in poll() waiting for events to respond to, and finally the RCU thread is idle as well.

Does the qemu process still respond to monitor commands? For example, can you still pause and resume the guest?

Kevin

> For the other logs, here you go.
>
> For gluster I didn't know which one, so I sent all of them.
>
> I got the icinga notification at 17:06 CEST on March 27th (today). So for vdsm,
> I provided logs from 16h-18h.
> The check said that the VM had been down for 11 minutes at that time.
>
> https://dl.dropboxusercontent.com/u/63261/bioservice-1.tar.gz
>
> Please do let me know if there is anything else I can provide.
>
> Best regards,
>
> > On 27 Mar 2016, at 21:24, Nir Soffer wrote:
> >
> > On Sun, Mar 27, 2016 at 8:39 PM, Christophe TREFOIS wrote:
> >> Hi Nir,
> >>
> >> Here is another one, this time with strace of the children and a gdb dump.
> >>
> >> Interestingly, this time qemu seems stuck at 0%, vs 100% in the other cases.
> >>
> >> The files for strace are attached.
> >
> > Hopefully Kevin can take a look.
> >
> >> The gdb + core dump is found here (too big):
> >>
> >> https://dl.dropboxusercontent.com/u/63261/gdb-core.tar.gz
> >
> > I think it will be more useful to extract a traceback of all threads
> > and send the tiny traceback.
> >
> > gdb --pid --batch --eval-command='thread apply all bt'
> >
> >> If it helps, most machines get stuck on the host hosting the self-hosted
> >> engine, which runs a local 1-node glusterfs.
> >
> > Also getting /var/log/messages and the sanlock, vdsm, glusterfs and
> > libvirt logs for this timeframe would be helpful.
> >
> > Nir
> >
> >> Thank you for your help,
> >>
> >> —
> >> Christophe
> >>
> >> Dr Christophe Trefois, Dipl.-Ing.
> >> Technical Specialist / Post-Doc
> >>
> >> UNIVERSITÉ DU LUXEMBOURG
> >>
> >> LUXEMBOURG CENTRE FOR SYSTEMS BIOMEDICINE
> >> Campus Belval | House of Biomedicine
> >> 6, avenue du Swing
> >> L-4367 Belvaux
> >> T: +352 46 66 44 6124
> >> F: +352 46 66 44 6949
> >> http://www.uni.lu/lcsb
> >>
> >> This message is confidential and may contain privileged information.
> >> It is intended for the named recipient only.
> >> If you receive it in error please notify me and permanently delete the
> >> original message and any copies.
> >>
> >>> On 25 Mar 2016, at 11:53, Nir Soffer wrote:
> >>>
> >>> gdb --pid --batch --eval-command='thread apply all bt'
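Kevin's question in this thread, whether the qemu process still responds to monitor commands, can be checked programmatically over QMP. A sketch: `qmp_capabilities` and `query-status` are standard QMP commands, but the socket path and the helper names here are made up for the example, and the probe assumes qemu was started with a `-qmp unix:...,server` option.

```python
import json
import socket

def qmp_command(name: str, **args) -> bytes:
    """Serialize a QMP command as one JSON object per line."""
    cmd = {"execute": name}
    if args:
        cmd["arguments"] = args
    return (json.dumps(cmd) + "\r\n").encode()

def qmp_is_running(status_reply: bytes) -> bool:
    """Interpret the reply to 'query-status'."""
    return json.loads(status_reply)["return"]["status"] == "running"

def vm_is_responsive(qmp_socket_path: str) -> bool:
    """Probe a qemu QMP socket: does the monitor answer query-status?

    A hang here (timeout) points at qemu itself; a prompt 'running'
    answer while the guest is frozen points at a guest kernel problem.
    """
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.settimeout(5)
        s.connect(qmp_socket_path)
        f = s.makefile("rwb")
        f.readline()                                  # greeting banner
        f.write(qmp_command("qmp_capabilities"))      # leave capabilities mode
        f.flush()
        f.readline()                                  # {"return": {}}
        f.write(qmp_command("query-status"))
        f.flush()
        return qmp_is_running(f.readline())
```

In the cases described in this thread, a VM whose monitor still answers but whose console is dead is exactly the "guest kernel panicked, qemu is fine" pattern Kevin describes.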