Re: [Qemu-block] [PATCH 1/4 for-2.11?] block/dirty-bitmap: add lock to bdrv_enable/disable_dirty_bitmap
13.11.2017 20:50, Eric Blake wrote:
> On 11/13/2017 10:20 AM, Vladimir Sementsov-Ogievskiy wrote:
>> Like other setters here these functions should take a lock.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy
>> ---
>>  block/dirty-bitmap.c | 4 ++++
>>  1 file changed, 4 insertions(+)
>
> Should this patch be in 2.11?
>
>> diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
>> index bd04e991b1..2a0bcd9e51 100644
>> --- a/block/dirty-bitmap.c
>> +++ b/block/dirty-bitmap.c
>> @@ -397,15 +397,19 @@ void bdrv_remove_persistent_dirty_bitmap(BlockDriverState *bs,
>>  /* Called with BQL taken. */
>>  void bdrv_disable_dirty_bitmap(BdrvDirtyBitmap *bitmap)
>>  {
>> +    bdrv_dirty_bitmap_lock(bitmap);
>>      assert(!bdrv_dirty_bitmap_frozen(bitmap));
>>      bitmap->disabled = true;
>> +    bdrv_dirty_bitmap_unlock(bitmap);
>
> Why do we need this lock in addition to BQL?

In fact, the comment about BQL should be dropped.  block_int.h:

    /* Writing to the list requires the BQL _and_ the dirty_bitmap_mutex.
     * Reading from the list can be done with either the BQL or the
     * dirty_bitmap_mutex.  Modifying a bitmap only requires
     * dirty_bitmap_mutex. */
    QemuMutex dirty_bitmap_mutex;

>>  }
>>
>>  /* Called with BQL taken. */
>>  void bdrv_enable_dirty_bitmap(BdrvDirtyBitmap *bitmap)
>>  {
>> +    bdrv_dirty_bitmap_lock(bitmap);
>>      assert(!bdrv_dirty_bitmap_frozen(bitmap));
>>      bitmap->disabled = false;
>> +    bdrv_dirty_bitmap_unlock(bitmap);
>
> Again, why do we need this in addition to BQL?

And here.

> The commit message needs more details about a scenario where our
> existing BQL lock is insufficient to prevent misuse of
> bitmap->disabled.

-- 
Best regards,
Vladimir
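The locking rule quoted above from block_int.h (state changes need only dirty_bitmap_mutex, while the list itself also needs the BQL) can be sketched as a toy model. This is an illustrative Python sketch of the pattern under discussion, not qemu's C implementation; the class and method names are invented for the example.

```python
import threading


class DirtyBitmapModel:
    """Toy model of a bitmap whose state changes are guarded by a
    dedicated mutex (the dirty_bitmap_mutex rule), so readers on other
    threads never observe a half-updated state."""

    def __init__(self):
        self._lock = threading.Lock()   # stands in for bdrv_dirty_bitmap_lock()
        self._frozen = False
        self.disabled = False

    def set_disabled(self, value):
        # Mirrors the patch: lock, check the frozen invariant, flip the
        # flag, unlock -- all while holding only the bitmap's own mutex.
        with self._lock:
            assert not self._frozen
            self.disabled = value
```

Usage: `bm.set_disabled(True)` corresponds to `bdrv_disable_dirty_bitmap()`, `bm.set_disabled(False)` to `bdrv_enable_dirty_bitmap()`.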
Re: [Qemu-block] [PATCH for-2.11] nbd/client: Don't hard-disconnect on ESHUTDOWN from server
13.11.2017 22:48, Eric Blake wrote:

The NBD spec says that a server may fail any transmission request with
ESHUTDOWN when it is apparent that no further request from the client
can be successfully honored.  The client is supposed to then initiate a
soft shutdown (wait for all remaining in-flight requests to be
answered, then send NBD_CMD_DISC).  However, since qemu's server never
uses ESHUTDOWN errors, this code was mostly untested since its
introduction in commit b6f5d3b5.

More recently, I learned that nbdkit as the NBD server is able to send
ESHUTDOWN errors, so I finally tested this code, and noticed that our
client was special-casing ESHUTDOWN to cause a hard shutdown (immediate
disconnect, with no NBD_CMD_DISC), but only if the server sends this
error as a simple reply.  Further investigation found that commit
d2febedb introduced a regression where structured replies behave
differently than simple replies - but that the structured reply
behavior is more in line with the spec (even if we still lack code in
nbd-client.c to properly quit sending further requests).

So this patch reverts the portion of b6f5d3b5 that introduced an
improper hard-disconnect special-case at the lower level, and leaves
the future enhancement of a nicer soft-disconnect at the higher level
for another day.
CC: qemu-triv...@nongnu.org
Signed-off-by: Eric Blake

Reviewed-by: Vladimir Sementsov-Ogievskiy

---
 nbd/client.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/nbd/client.c b/nbd/client.c
index 4e15fc484d..eea236ca06 100644
--- a/nbd/client.c
+++ b/nbd/client.c
@@ -996,15 +996,9 @@ int nbd_receive_reply(QIOChannel *ioc, NBDReply *reply, Error **errp)
         if (ret < 0) {
             break;
         }
-
         trace_nbd_receive_simple_reply(reply->simple.error,
                                        nbd_err_lookup(reply->simple.error),
                                        reply->handle);
-        if (reply->simple.error == NBD_ESHUTDOWN) {
-            /* This works even on mingw which lacks a native ESHUTDOWN */
-            error_setg(errp, "server shutting down");
-            return -EINVAL;
-        }
         break;
     case NBD_STRUCTURED_REPLY_MAGIC:
         ret = nbd_receive_structured_reply_chunk(ioc, &reply->structured, errp);

-- 
Best regards,
Vladimir
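The soft-disconnect behavior the commit message describes — on ESHUTDOWN, stop submitting new requests, let in-flight ones drain, then send NBD_CMD_DISC — can be modeled as a small state machine. This is a hedged illustrative sketch of the spec's sequence, not qemu's actual nbd-client.c; the class and method names are invented.

```python
class NbdClientModel:
    """Toy model of the NBD soft-shutdown sequence after ESHUTDOWN."""

    def __init__(self):
        self.in_flight = set()      # handles of outstanding requests
        self.shutting_down = False  # server signaled ESHUTDOWN
        self.sent_disc = False      # NBD_CMD_DISC sent

    def submit(self, handle):
        # Per the spec, no new requests once the server said ESHUTDOWN.
        if self.shutting_down:
            raise RuntimeError("no new requests after ESHUTDOWN")
        self.in_flight.add(handle)

    def on_reply(self, handle, error):
        self.in_flight.discard(handle)
        if error == "ESHUTDOWN":
            # Soft shutdown: keep the connection up so the remaining
            # in-flight replies can still arrive.
            self.shutting_down = True
        if self.shutting_down and not self.in_flight and not self.sent_disc:
            self.sent_disc = True   # here a real client would send NBD_CMD_DISC
```

The point of the revert above is exactly that the old code short-circuited this sequence with an immediate hard disconnect at the lowest layer.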
Re: [Qemu-block] [PATCH 0/5] compressed block-stream
On Tue, 11/14 13:16, Anton Nefedov wrote:
> It might be useful to compress images during block-stream;
> this way the user can merge compressed images of a backing chain and
> the result will remain compressed.

I haven't looked at the patches yet so maybe the answer is obvious, but
still: would the "compress" option still be necessary if what we had
were blockdev-stream instead?

Fam
Re: [Qemu-block] [Qemu-devel] [PATCH for-2.11] nbd/server: Fix error reporting for bad requests
Hi,

This series failed build test on s390x host. Please find the details below.

Type: series
Subject: [Qemu-devel] [PATCH for-2.11] nbd/server: Fix error reporting for bad requests
Message-id: 20171115211917.789-1-ebl...@redhat.com

=== TEST SCRIPT BEGIN ===
#!/bin/bash
# Testing script will be invoked under the git checkout with
# HEAD pointing to a commit that has the patches applied on top of "base"
# branch
set -e
echo "=== ENV ==="
env
echo "=== PACKAGES ==="
rpm -qa
echo "=== TEST BEGIN ==="
CC=$HOME/bin/cc
INSTALL=$PWD/install
BUILD=$PWD/build
echo -n "Using CC: "
realpath $CC
mkdir -p $BUILD $INSTALL
SRC=$PWD
cd $BUILD
$SRC/configure --cc=$CC --prefix=$INSTALL
make -j4
# XXX: we need reliable clean up
# make check -j4 V=1
make install
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 - [tag update] patchew/1510654613-47868-1-git-send-email-anton.nefe...@virtuozzo.com -> patchew/1510654613-47868-1-git-send-email-anton.nefe...@virtuozzo.com
 - [tag update] patchew/20171115180732.31753-1-mre...@redhat.com -> patchew/20171115180732.31753-1-mre...@redhat.com
 * [new tag]    patchew/20171115211917.789-1-ebl...@redhat.com -> patchew/20171115211917.789-1-ebl...@redhat.com
Switched to a new branch 'test'
48097f6 nbd/server: Fix error reporting for bad requests

=== OUTPUT BEGIN ===
=== ENV ===
XDG_SESSION_ID=92762
SHELL=/bin/sh
USER=fam
PATCHEW=/home/fam/patchew/patchew-cli -s http://patchew.org --nodebug
PATH=/usr/bin:/bin
PWD=/var/tmp/patchew-tester-tmp-6bl2h6ka/src
LANG=en_US.UTF-8
HOME=/home/fam
SHLVL=2
LOGNAME=fam
DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1012/bus
XDG_RUNTIME_DIR=/run/user/1012
_=/usr/bin/env
=== PACKAGES ===
gpg-pubkey-873529b8-54e386ff
xz-libs-5.2.2-2.fc24.s390x
libxshmfence-1.2-3.fc24.s390x
giflib-4.1.6-15.fc24.s390x
trousers-lib-0.3.13-6.fc24.s390x
ncurses-base-6.0-6.20160709.fc25.noarch
gmp-6.1.1-1.fc25.s390x
libidn-1.33-1.fc25.s390x
slang-2.3.0-7.fc25.s390x
Re: [Qemu-block] [Libguestfs] [Qemu-devel] [qemu-img] support for XVA
On 2017-11-15 23:55, Max Reitz wrote:
> On 2017-11-15 23:28, Max Reitz wrote:
>> On 2017-11-15 22:50, Gandalf Corvotempesta wrote:
>>> https://stacklet.com/downloads/XenServer-XVA-Template-Debian-7.8-Lightweight-x86
>>
>> Hmmm, that gives me a timeout when downloading.  And another image I got
>> gives me an "argument list too long" -- which I had feared already...
>> It has a size of 3 MB and Linux "only" allows 2...
>
> OK, being a bit clever helped a lot.
>
> https://xanclic.moe/convert-xva.rb -- does this work?
> (It seems to work on the two example images I found...)

Note that I have no idea how to get the total image size from the
ova.xml (or wherever else), so you may have to call qemu-img resize on
the output to get it to the right size.  (It only creates an image that
covers all of the input data blocks.)

Max
Re: [Qemu-block] [Libguestfs] [Qemu-devel] [qemu-img] support for XVA
On 2017-11-15 23:28, Max Reitz wrote:
> On 2017-11-15 22:50, Gandalf Corvotempesta wrote:
>> https://stacklet.com/downloads/XenServer-XVA-Template-Debian-7.8-Lightweight-x86
>
> Hmmm, that gives me a timeout when downloading.  And another image I got
> gives me an "argument list too long" -- which I had feared already...
> It has a size of 3 MB and Linux "only" allows 2...

OK, being a bit clever helped a lot.

https://xanclic.moe/convert-xva.rb -- does this work?
(It seems to work on the two example images I found...)

An example is in the code; you use it like this:

$ ./convert-xva.rb ~/Downloads/stats-appliance-2.36.020502.xva Ref:73
$ ./convert-xva.rb ~/Downloads/stats-appliance-2.36.020502.xva Ref:73 \
      -O qcow2 -p out.qcow2
    (100.00/100%)
$ ./qemu-img info out.qcow2
image: out.qcow2
file format: qcow2
virtual size: 80G (85916123136 bytes)
disk size: 4.0G
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

Max
Re: [Qemu-block] [Libguestfs] [Qemu-devel] [qemu-img] support for XVA
On 15 Nov 2017 11:28 PM, "Max Reitz" wrote:

> Hmmm, that gives me a timeout when downloading.  And another image I got
> gives me an "argument list too long" -- which I had feared already...
> It has a size of 3 MB and Linux "only" allows 2...

https://cloud.vates.fr/index.php/s/jynIvCjhso47nR8/download
Re: [Qemu-block] [Libguestfs] [Qemu-devel] [qemu-img] support for XVA
On 2017-11-15 22:50, Gandalf Corvotempesta wrote:
> https://stacklet.com/downloads/XenServer-XVA-Template-Debian-7.8-Lightweight-x86

Hmmm, that gives me a timeout when downloading.  And another image I got
gives me an "argument list too long" -- which I had feared already...
It has a size of 3 MB and Linux "only" allows 2...

Max
Re: [Qemu-block] [PATCH v2] qapi: block-core: Clarify events emitted by 'block-job-cancel'
On Wed, Nov 15, 2017 at 02:15:57PM -0500, John Snow wrote:
> On 11/15/2017 04:09 AM, Kashyap Chamarthy wrote:
> > When you cancel an in-progress live block operation with QMP
> > `block-job-cancel`, it emits the event: BLOCK_JOB_CANCELLED.  However,
> > when `block-job-cancel` is issued after `drive-mirror` has indicated (by
> > emitting the event BLOCK_JOB_READY) that the source and destination
> > remain synchronized:

[...]

> > But this is expected behaviour, where the _COMPLETED event indicates
> > that synchronization has successfully ended (and the destination has a
> > point-in-time copy, which is at the time of cancel).
> >
> > So add a small note to this effect.  (Thanks: Max Reitz for reminding
> > me of this on IRC.)
>
> I suppose this difference probably isn't covered in what was the
> bitmaps.md doc file (we don't bother covering mirror there, only
> backup);

Your supposition is correct: looking at the now-renamed
'bitmaps.rst'[1], it isn't covered there.

> is it covered sufficiently in live-block-operations.rst ?

I looked in there[2] too.  Short answer: no.  Long answer: in the "Live
disk synchronization — drive-mirror and blockdev-mirror" section, I
simply declared:

    "Issuing the command ``block-job-cancel`` after it emits the event
    ``BLOCK_JOB_CANCELLED``"

as if that's the *only* event it emits, which is clearly not the case.
So while at it, I wonder if I should also update
'live-block-operations.rst' too.

[1] https://git.qemu.org/?p=qemu.git;a=blob;f=docs/interop/bitmaps.rst
[2] https://git.qemu.org/?p=qemu.git;a=blob;f=docs/interop/live-block-operations.rst#l513

-- 
/kashyap
Re: [Qemu-block] [PATCH v2] qapi: block-core: Clarify events emitted by 'block-job-cancel'
On 11/15/2017 04:54 PM, Kashyap Chamarthy wrote:
> On Wed, Nov 15, 2017 at 02:15:57PM -0500, John Snow wrote:
>> On 11/15/2017 04:09 AM, Kashyap Chamarthy wrote:
>>> When you cancel an in-progress live block operation with QMP
>>> `block-job-cancel`, it emits the event: BLOCK_JOB_CANCELLED.  However,
>>> when `block-job-cancel` is issued after `drive-mirror` has indicated (by
>>> emitting the event BLOCK_JOB_READY) that the source and destination
>>> remain synchronized:
>
> [...]
>
>>> But this is expected behaviour, where the _COMPLETED event indicates
>>> that synchronization has successfully ended (and the destination has a
>>> point-in-time copy, which is at the time of cancel).
>>>
>>> So add a small note to this effect.  (Thanks: Max Reitz for reminding
>>> me of this on IRC.)
>>
>> I suppose this difference probably isn't covered in what was the
>> bitmaps.md doc file (we don't bother covering mirror there, only
>> backup);
>
> Your supposition is correct: looking at the now-renamed
> 'bitmaps.rst'[1], it isn't covered there.

Makes sense; I don't think we need to correct this, and

>> is it covered sufficiently in live-block-operations.rst ?
>
> I looked in there[2] too.  Short answer: no.  Long answer: in the "Live
> disk synchronization — drive-mirror and blockdev-mirror" section, I
> simply declared:
>
>     "Issuing the command ``block-job-cancel`` after it emits the event
>     ``BLOCK_JOB_CANCELLED``"
>
> as if that's the *only* event it emits, which is clearly not the case.
> So while at it, I wonder if I should also update
> 'live-block-operations.rst' too.

It's an interesting gotcha that I wasn't really acutely aware of myself,
so having it in the doc format for API programmers who aren't
necessarily digging through our source sounds like a pleasant courtesy.

> [1] https://git.qemu.org/?p=qemu.git;a=blob;f=docs/interop/bitmaps.rst
> [2] https://git.qemu.org/?p=qemu.git;a=blob;f=docs/interop/live-block-operations.rst#l513

Thanks Kashyap :)
Re: [Qemu-block] [Libguestfs] [Qemu-devel] [qemu-img] support for XVA
https://stacklet.com/downloads/XenServer-XVA-Template-Debian-7.8-Lightweight-x86

Some XVAs

On 15 Nov 2017 10:42 PM, "Gandalf Corvotempesta" <
gandalf.corvotempe...@gmail.com> wrote:

> I'm thinking on how to provide you a sample XVA
> I have to create (and populate) a VM because an empty image will result in
> an empty XVA
> And a VM is 300-400Mb as minimum
>
> On 15 Nov 2017 10:30 PM, "Max Reitz" wrote:
>
>> On 2017-11-15 21:41, Gandalf Corvotempesta wrote:
>>> 2017-11-15 21:29 GMT+01:00 Richard W.M. Jones:
>>>> Gandalf, is there an XVA file publically available (pref. not
>>>> too big) that we can look at?
>>>
>>> I can try to provide one, but it's simple:
>>>
>>> # tar tvf 20160630_124823_aa72_.xva.gz | head -n 50
>>> -- 0/0    42353 1970-01-01 01:00 ova.xml
>>> -- 0/0  1048576 1970-01-01 01:00 Ref:175/
>>> -- 0/0       40 1970-01-01 01:00 Ref:175/.checksum
>>> -- 0/0  1048576 1970-01-01 01:00 Ref:175/0001
>>> -- 0/0       40 1970-01-01 01:00 Ref:175/0001.checksum
>>> -- 0/0  1048576 1970-01-01 01:00 Ref:175/0003
>>> -- 0/0       40 1970-01-01 01:00 Ref:175/0003.checksum
>>> -- 0/0  1048576 1970-01-01 01:00 Ref:175/0004
>>> -- 0/0       40 1970-01-01 01:00 Ref:175/0004.checksum
>>> -- 0/0  1048576 1970-01-01 01:00 Ref:175/0005
>>> -- 0/0       40 1970-01-01 01:00 Ref:175/0005.checksum
>>> -- 0/0  1048576 1970-01-01 01:00 Ref:175/0006
>>> -- 0/0       40 1970-01-01 01:00 Ref:175/0006.checksum
>>> -- 0/0  1048576 1970-01-01 01:00 Ref:175/0007
>>> -- 0/0       40 1970-01-01 01:00 Ref:175/0007.checksum
>>> -- 0/0  1048576 1970-01-01 01:00 Ref:175/0009
>>> -- 0/0       40 1970-01-01 01:00 Ref:175/0009.checksum
>>> -- 0/0  1048576 1970-01-01 01:00 Ref:175/0010
>>> -- 0/0       40 1970-01-01 01:00 Ref:175/0010.checksum
>>>
>>> You can ignore the ova.xml and just use the "Ref:175" directory.
>>> Inside the XVA you'll find one "Ref" directory for each virtual disk
>>> (the ref number is different for each disk).
>>> Inside each Ref directory, you'll get tons of 1MB files, corresponding
>>> to the RAW image.
>>> You have to merge these files into a single raw file with just one
>>> exception: the numbers are not sequential.
>>> As you can see above, we have: 0001, 0003
>>>
>>> The 0002 is missing, because it's totally blank.  XenServer doesn't
>>> export any empty block, thus it will skip the corresponding 1MB file.
>>> When building the raw image, you have to replace empty blocks with 1MB
>>> full of zeros.
>>>
>>> This is (in addition to the tar extract) the most time-consuming part.
>>> Currently I'm rebuilding a 250GB image, with tons of files to be
>>> merged.
>>> If qemu-img can be patched to automatically convert this kind of
>>> format, I can save about 3 hours (30 minutes for extracting the
>>> tarball, and about 2 hours to merge the 250-300GB image)
>>
>> Hmmm... OK, so as Rich said, you can use the raw driver with an offset
>> and a size.  And you can use the null-co driver to emulate holes.  And
>> qemu-img convert supports concatenation.
>>
>> Taking all of that together, it should be possible to create an input
>> "file" per 1 MB block and concatenate all of those with qemu-img
>> convert.  The only issue is that it'll get you a *really* long command
>> line.
>>
>> (And you'll still have to gunzip the archive before, of course.)
>>
>> I'll try to write a script, but of course it's hard without an example...
>>
>> Max
Re: [Qemu-block] [Qemu-devel] [qemu-img] support for XVA
2017-11-15 20:59 GMT+01:00 Max Reitz:
> Well, you can't just add support to qemu-img alone either.  Every image
> format supported by qemu-img is one supported by qemu as a whole and
> thus needs a proper block driver that needs to support random accesses
> as well.

I was talking about qemu-img convert, just to convert an XVA image to
something different, in a single pass, without having to extract the tar.
Re: [Qemu-block] [Libguestfs] [Qemu-devel] [qemu-img] support for XVA
I'm thinking on how to provide you a sample XVA
I have to create (and populate) a VM because an empty image will result in
an empty XVA
And a VM is 300-400Mb as minimum

On 15 Nov 2017 10:30 PM, "Max Reitz" wrote:

> On 2017-11-15 21:41, Gandalf Corvotempesta wrote:
>> 2017-11-15 21:29 GMT+01:00 Richard W.M. Jones:
>>> Gandalf, is there an XVA file publically available (pref. not
>>> too big) that we can look at?
>>
>> I can try to provide one, but it's simple:
>>
>> # tar tvf 20160630_124823_aa72_.xva.gz | head -n 50
>> -- 0/0    42353 1970-01-01 01:00 ova.xml
>> -- 0/0  1048576 1970-01-01 01:00 Ref:175/
>> -- 0/0       40 1970-01-01 01:00 Ref:175/.checksum
>> -- 0/0  1048576 1970-01-01 01:00 Ref:175/0001
>> -- 0/0       40 1970-01-01 01:00 Ref:175/0001.checksum
>> -- 0/0  1048576 1970-01-01 01:00 Ref:175/0003
>> -- 0/0       40 1970-01-01 01:00 Ref:175/0003.checksum
>> -- 0/0  1048576 1970-01-01 01:00 Ref:175/0004
>> -- 0/0       40 1970-01-01 01:00 Ref:175/0004.checksum
>> -- 0/0  1048576 1970-01-01 01:00 Ref:175/0005
>> -- 0/0       40 1970-01-01 01:00 Ref:175/0005.checksum
>> -- 0/0  1048576 1970-01-01 01:00 Ref:175/0006
>> -- 0/0       40 1970-01-01 01:00 Ref:175/0006.checksum
>> -- 0/0  1048576 1970-01-01 01:00 Ref:175/0007
>> -- 0/0       40 1970-01-01 01:00 Ref:175/0007.checksum
>> -- 0/0  1048576 1970-01-01 01:00 Ref:175/0009
>> -- 0/0       40 1970-01-01 01:00 Ref:175/0009.checksum
>> -- 0/0  1048576 1970-01-01 01:00 Ref:175/0010
>> -- 0/0       40 1970-01-01 01:00 Ref:175/0010.checksum
>>
>> You can ignore the ova.xml and just use the "Ref:175" directory.
>> Inside the XVA you'll find one "Ref" directory for each virtual disk
>> (the ref number is different for each disk).
>> Inside each Ref directory, you'll get tons of 1MB files, corresponding
>> to the RAW image.
>> You have to merge these files into a single raw file with just one
>> exception: the numbers are not sequential.
>> As you can see above, we have: 0001, 0003
>>
>> The 0002 is missing, because it's totally blank.  XenServer doesn't
>> export any empty block, thus it will skip the corresponding 1MB file.
>> When building the raw image, you have to replace empty blocks with 1MB
>> full of zeros.
>>
>> This is (in addition to the tar extract) the most time-consuming part.
>> Currently I'm rebuilding a 250GB image, with tons of files to be
>> merged.
>> If qemu-img can be patched to automatically convert this kind of
>> format, I can save about 3 hours (30 minutes for extracting the
>> tarball, and about 2 hours to merge the 250-300GB image)
>
> Hmmm... OK, so as Rich said, you can use the raw driver with an offset
> and a size.  And you can use the null-co driver to emulate holes.  And
> qemu-img convert supports concatenation.
>
> Taking all of that together, it should be possible to create an input
> "file" per 1 MB block and concatenate all of those with qemu-img
> convert.  The only issue is that it'll get you a *really* long command
> line.
>
> (And you'll still have to gunzip the archive before, of course.)
>
> I'll try to write a script, but of course it's hard without an example...
>
> Max
Re: [Qemu-block] [Libguestfs] [Qemu-devel] [qemu-img] support for XVA
2017-11-15 21:29 GMT+01:00 Richard W.M. Jones:
> Gandalf, is there an XVA file publically available (pref. not
> too big) that we can look at?

I can try to provide one, but it's simple:

# tar tvf 20160630_124823_aa72_.xva.gz | head -n 50
-- 0/0    42353 1970-01-01 01:00 ova.xml
-- 0/0  1048576 1970-01-01 01:00 Ref:175/
-- 0/0       40 1970-01-01 01:00 Ref:175/.checksum
-- 0/0  1048576 1970-01-01 01:00 Ref:175/0001
-- 0/0       40 1970-01-01 01:00 Ref:175/0001.checksum
-- 0/0  1048576 1970-01-01 01:00 Ref:175/0003
-- 0/0       40 1970-01-01 01:00 Ref:175/0003.checksum
-- 0/0  1048576 1970-01-01 01:00 Ref:175/0004
-- 0/0       40 1970-01-01 01:00 Ref:175/0004.checksum
-- 0/0  1048576 1970-01-01 01:00 Ref:175/0005
-- 0/0       40 1970-01-01 01:00 Ref:175/0005.checksum
-- 0/0  1048576 1970-01-01 01:00 Ref:175/0006
-- 0/0       40 1970-01-01 01:00 Ref:175/0006.checksum
-- 0/0  1048576 1970-01-01 01:00 Ref:175/0007
-- 0/0       40 1970-01-01 01:00 Ref:175/0007.checksum
-- 0/0  1048576 1970-01-01 01:00 Ref:175/0009
-- 0/0       40 1970-01-01 01:00 Ref:175/0009.checksum
-- 0/0  1048576 1970-01-01 01:00 Ref:175/0010
-- 0/0       40 1970-01-01 01:00 Ref:175/0010.checksum

You can ignore the ova.xml and just use the "Ref:175" directory.
Inside the XVA you'll find one "Ref" directory for each virtual disk
(the ref number is different for each disk).
Inside each Ref directory, you'll get tons of 1MB files, corresponding
to the RAW image.
You have to merge these files into a single raw file with just one
exception: the numbers are not sequential.
As you can see above, we have: 0001, 0003

The 0002 is missing, because it's totally blank.  XenServer doesn't
export any empty block, thus it will skip the corresponding 1MB file.
When building the raw image, you have to replace empty blocks with 1MB
full of zeros.

This is (in addition to the tar extract) the most time-consuming part.
Currently I'm rebuilding a 250GB image, with tons of files to be
merged.
If qemu-img can be patched to automatically convert this kind of
format, I can save about 3 hours (30 minutes for extracting the
tarball, and about 2 hours to merge the 250-300GB image)
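The merge step described above (concatenate numbered 1 MB chunk files, treating missing numbers as all-zero blocks) can be sketched in a few lines. This is a hedged illustration of the layout as Gandalf describes it, not official XenServer format documentation; it reads the chunk file names from the extracted Ref:NNN directory rather than assuming a fixed digit width, and it leaves holes (sparse zeros) instead of writing zero blocks explicitly.

```python
import os

CHUNK_SIZE = 1024 * 1024  # each data file under Ref:NNN/ is a 1 MB block


def merge_xva_chunks(ref_dir, out_path):
    """Reassemble numbered 1 MB chunk files into one raw image.

    Chunk numbers that are absent were all-zero blocks the exporter
    skipped; seeking over them leaves sparse zero-filled holes in the
    output, which reads back as the required 1 MB of zeros.
    """
    # Map block index -> file name, ignoring the .checksum companions.
    chunks = {int(n): n for n in os.listdir(ref_dir) if n.isdigit()}
    with open(out_path, "wb") as out:
        for idx in sorted(chunks):
            out.seek(idx * CHUNK_SIZE)
            with open(os.path.join(ref_dir, chunks[idx]), "rb") as f:
                out.write(f.read())
```

As noted elsewhere in the thread, the result may still need a qemu-img resize to the virtual size recorded in ova.xml, since this only covers the blocks that were actually exported.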
Re: [Qemu-block] [Libguestfs] [Qemu-devel] [qemu-img] support for XVA
On Wed, Nov 15, 2017 at 10:30:31PM +0100, Max Reitz wrote:
> Hmmm... OK, so as Rich said, you can use the raw driver with an offset
> and a size.  And you can use the null-co driver to emulate holes.  And
> qemu-img convert supports concatenation.

Nice, I didn't know qemu-img convert could concatenate :-)

So that's two distinct ways to do this, neither needing any qemu
modifications.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines.  Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v
Re: [Qemu-block] [Libguestfs] [Qemu-devel] [qemu-img] support for XVA
On 2017-11-15 22:50, Gandalf Corvotempesta wrote:
> https://stacklet.com/downloads/XenServer-XVA-Template-Debian-7.8-Lightweight-x86
>
> Some XVAs

Thanks!
Re: [Qemu-block] [Libguestfs] [Qemu-devel] [qemu-img] support for XVA
On Wed, Nov 15, 2017 at 10:42:42PM +0100, Gandalf Corvotempesta wrote:
> I'm thinking on how to provide you a sample XVA
> I have to create (and populate) a VM because an empty image will result in
> an empty XVA
> And a VM is 300-400Mb as minimum

That's fine to download.  Also you could compress it (for transfer)
which should make it smaller.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://libguestfs.org
Re: [Qemu-block] [Qemu-devel] [PATCH for-2.11] nbd/server: Fix error reporting for bad requests
Hi,

This series failed build test on ppc host. Please find the details below.

Type: series
Message-id: 20171115211917.789-1-ebl...@redhat.com
Subject: [Qemu-devel] [PATCH for-2.11] nbd/server: Fix error reporting for bad requests

=== TEST SCRIPT BEGIN ===
#!/bin/bash
# Testing script will be invoked under the git checkout with
# HEAD pointing to a commit that has the patches applied on top of "base"
# branch
set -e
echo "=== ENV ==="
env
echo "=== PACKAGES ==="
rpm -qa
echo "=== TEST BEGIN ==="
INSTALL=$PWD/install
BUILD=$PWD/build
mkdir -p $BUILD $INSTALL
SRC=$PWD
cd $BUILD
$SRC/configure --prefix=$INSTALL
make -j100
# XXX: we need reliable clean up
# make check -j100 V=1
make install
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 - [tag update] patchew/1510654613-47868-1-git-send-email-anton.nefe...@virtuozzo.com -> patchew/1510654613-47868-1-git-send-email-anton.nefe...@virtuozzo.com
 - [tag update] patchew/20171115180732.31753-1-mre...@redhat.com -> patchew/20171115180732.31753-1-mre...@redhat.com
 * [new tag]    patchew/20171115211917.789-1-ebl...@redhat.com -> patchew/20171115211917.789-1-ebl...@redhat.com
Submodule 'dtc' (git://git.qemu-project.org/dtc.git) registered for path 'dtc'
Submodule 'pixman' (git://anongit.freedesktop.org/pixman) registered for path 'pixman'
Submodule 'roms/SLOF' (git://git.qemu-project.org/SLOF.git) registered for path 'roms/SLOF'
Submodule 'roms/ipxe' (git://git.qemu-project.org/ipxe.git) registered for path 'roms/ipxe'
Submodule 'roms/openbios' (git://git.qemu-project.org/openbios.git) registered for path 'roms/openbios'
Submodule 'roms/openhackware' (git://git.qemu-project.org/openhackware.git) registered for path 'roms/openhackware'
Submodule 'roms/qemu-palcode' (git://github.com/rth7680/qemu-palcode.git) registered for path 'roms/qemu-palcode'
Submodule 'roms/seabios' (git://git.qemu-project.org/seabios.git/) registered for path 'roms/seabios'
Submodule 'roms/sgabios' (git://git.qemu-project.org/sgabios.git) registered for path 'roms/sgabios'
Submodule 'roms/u-boot' (git://git.qemu-project.org/u-boot.git) registered for path 'roms/u-boot'
Submodule 'roms/vgabios' (git://git.qemu-project.org/vgabios.git/) registered for path 'roms/vgabios'
Cloning into 'dtc'...
Submodule path 'dtc': checked out '65cc4d2748a2c2e6f27f1cf39e07a5dbabd80ebf'
Cloning into 'pixman'...
Submodule path 'pixman': checked out '87eea99e443b389c978cf37efc52788bf03a0ee0'
Cloning into 'roms/SLOF'...
Submodule path 'roms/SLOF': checked out 'e3d05727a074619fc12d0a67f05cf2c42c875cce'
Cloning into 'roms/ipxe'...
Submodule path 'roms/ipxe': checked out '04186319181298083ef28695a8309028b26fe83c'
Cloning into 'roms/openbios'...
Submodule path 'roms/openbios': checked out 'e79bca64838c96ec44fd7acd508879c5284233dd'
Cloning into 'roms/openhackware'...
Submodule path 'roms/openhackware': checked out 'c559da7c8eec5e45ef1f67978827af6f0b9546f5'
Cloning into 'roms/qemu-palcode'...
Submodule path 'roms/qemu-palcode': checked out 'c87a92639b28ac42bc8f6c67443543b405dc479b'
Cloning into 'roms/seabios'...
Submodule path 'roms/seabios': checked out 'e2fc41e24ee0ada60fc511d60b15a41b294538be'
Cloning into 'roms/sgabios'...
Submodule path 'roms/sgabios': checked out '23d474943dcd55d0550a3d20b3d30e9040a4f15b'
Cloning into 'roms/u-boot'...
Submodule path 'roms/u-boot': checked out '2072e7262965bb48d7fffb1e283101e6ed8b21a8'
Cloning into 'roms/vgabios'...
Submodule path 'roms/vgabios': checked out '19ea12c230ded95928ecaef0db47a82231c2e485'
warning: unable to rmdir pixman: Directory not empty
Switched to a new branch 'test'
 M dtc
 M roms/SLOF
 M roms/ipxe
 M roms/openbios
 M roms/qemu-palcode
 M roms/seabios
 M roms/sgabios
 M roms/u-boot
48097f6 nbd/server: Fix error reporting for bad requests

=== OUTPUT BEGIN ===
=== ENV ===
XDG_SESSION_ID=33456
SHELL=/bin/sh
USER=patchew
PATCHEW=/home/patchew/patchew/patchew-cli -s http://patchew.org --nodebug
PATH=/usr/bin:/bin
PWD=/var/tmp/patchew-tester-tmp-fidfv7g9/src
LANG=en_US.UTF-8
HOME=/home/patchew
SHLVL=2
LOGNAME=patchew
XDG_RUNTIME_DIR=/run/user/1000
_=/usr/bin/env
=== PACKAGES ===
plymouth-core-libs-0.8.9-0.28.20140113.el7.centos.ppc64le
vim-common-7.4.160-2.el7.ppc64le
perl-Test-Simple-0.98-243.el7.noarch
hplip-common-3.15.9-3.el7.ppc64le
valgrind-3.12.0-8.el7.ppc64le
gamin-0.1.10-16.el7.ppc64le
libpeas-loader-python-1.20.0-1.el7.ppc64le
telepathy-filesystem-0.0.2-6.el7.noarch
colord-libs-1.3.4-1.el7.ppc64le
kbd-legacy-1.15.5-13.el7.noarch
perl-CPAN-Meta-YAML-0.008-14.el7.noarch
libvirt-daemon-driver-nwfilter-3.2.0-14.el7.ppc64le
ntsysv-1.7.4-1.el7.ppc64le
kernel-bootwrapper-3.10.0-693.el7.ppc64le
telepathy-farstream-0.6.0-5.el7.ppc64le
kdenetwork-common-4.10.5-8.el7_0.noarch
elfutils-devel-0.168-8.el7.ppc64le
pm-utils-1.4.1-27.el7.ppc64le
perl-Error-0.17020-2.el7.noarch
usbmuxd-1.1.0-1.el7.ppc64le
[Qemu-block] [PATCH v2 for-2.11] nbd/server: Fix error reporting for bad requests
The NBD spec says an attempt to NBD_CMD_TRIM on a read-only export should fail with EPERM, as a trim has the potential to change disk contents, but we were relying on the block layer to catch that for us, which might not always give the right error (and even if it does, it does not let us pass back a sane message for structured replies). The NBD spec says an attempt to NBD_CMD_WRITE_ZEROES out of bounds should fail with ENOSPC, not EINVAL. Our check for u64 offset + u32 length wraparound up front is pointless; nothing uses offset until after the second round of sanity checks, and we can just as easily ensure there is no wraparound by checking whether offset is in bounds (since a disk size cannot exceed off_t which is 63 bits, adding a 32-bit number for a valid offset can't overflow). Solve all of these issues by some code motion and improved request validation. Signed-off-by: Eric Blake--- v2: actually commit the compiler-error fixes before submitting... nbd/server.c | 36 1 file changed, 12 insertions(+), 24 deletions(-) diff --git a/nbd/server.c b/nbd/server.c index df771fd42f..7d6801b427 100644 --- a/nbd/server.c +++ b/nbd/server.c @@ -1366,15 +1366,6 @@ static int nbd_co_receive_request(NBDRequestData *req, NBDRequest *request, return -EIO; } -/* Check for sanity in the parameters, part 1. Defer as many - * checks as possible until after reading any NBD_CMD_WRITE - * payload, so we can try and keep the connection alive. */ -if ((request->from + request->len) < request->from) { -error_setg(errp, - "integer overflow detected, you're probably being attacked"); -return -EINVAL; -} - if (request->type == NBD_CMD_READ || request->type == NBD_CMD_WRITE) { if (request->len > NBD_MAX_BUFFER_SIZE) { error_setg(errp, "len (%" PRIu32" ) is larger than max len (%u)", @@ -1399,12 +1390,21 @@ static int nbd_co_receive_request(NBDRequestData *req, NBDRequest *request, request->len); } -/* Sanity checks, part 2. 
*/ -if (request->from + request->len > client->exp->size) { +/* Sanity checks. */ +if (client->exp->nbdflags & NBD_FLAG_READ_ONLY && +(request->type == NBD_CMD_WRITE || + request->type == NBD_CMD_WRITE_ZEROES || + request->type == NBD_CMD_TRIM)) { +error_setg(errp, "Export is read-only"); +return -EROFS; +} +if (request->from > client->exp->size || +request->from + request->len > client->exp->size) { error_setg(errp, "operation past EOF; From: %" PRIu64 ", Len: %" PRIu32 ", Size: %" PRIu64, request->from, request->len, (uint64_t)client->exp->size); -return request->type == NBD_CMD_WRITE ? -ENOSPC : -EINVAL; +return (request->type == NBD_CMD_WRITE || +request->type == NBD_CMD_WRITE_ZEROES) ? -ENOSPC : -EINVAL; } valid_flags = NBD_CMD_FLAG_FUA; if (request->type == NBD_CMD_READ && client->structured_reply) { @@ -1482,12 +1482,6 @@ static coroutine_fn void nbd_trip(void *opaque) break; case NBD_CMD_WRITE: -if (exp->nbdflags & NBD_FLAG_READ_ONLY) { -error_setg(_err, "Export is read-only"); -ret = -EROFS; -break; -} - flags = 0; if (request.flags & NBD_CMD_FLAG_FUA) { flags |= BDRV_REQ_FUA; @@ -1500,12 +1494,6 @@ static coroutine_fn void nbd_trip(void *opaque) break; case NBD_CMD_WRITE_ZEROES: -if (exp->nbdflags & NBD_FLAG_READ_ONLY) { -error_setg(_err, "Export is read-only"); -ret = -EROFS; -break; -} - flags = 0; if (request.flags & NBD_CMD_FLAG_FUA) { flags |= BDRV_REQ_FUA; -- 2.13.6
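The effect of the code motion is easier to see outside the diff; here is a minimal Python sketch of the resulting check order (an illustration only — command codes are abbreviated and the real server also validates NBD_CMD_WRITE payloads and per-command flags):

```python
import errno

# Command codes abbreviated; the actual values don't matter for the logic.
NBD_CMD_READ, NBD_CMD_WRITE, NBD_CMD_WRITE_ZEROES, NBD_CMD_TRIM = range(4)

def validate_request(cmd, offset, length, export_size, read_only):
    """Mirror of the reordered sanity checks: read-only first, then bounds.

    Returns 0 on success or a negative errno, as the server code does.
    Checking 'offset > export_size' makes a separate wraparound test
    unnecessary: the export size fits in 63 bits and length is a u32,
    so 'offset + length' cannot overflow a u64 for any valid offset.
    """
    if read_only and cmd in (NBD_CMD_WRITE, NBD_CMD_WRITE_ZEROES, NBD_CMD_TRIM):
        return -errno.EROFS
    if offset > export_size or offset + length > export_size:
        # Writes past EOF are ENOSPC per the NBD spec; reads are EINVAL.
        if cmd in (NBD_CMD_WRITE, NBD_CMD_WRITE_ZEROES):
            return -errno.ENOSPC
        return -errno.EINVAL
    return 0
```

Note how NBD_CMD_WRITE_ZEROES now shares both the EROFS path and the ENOSPC path with NBD_CMD_WRITE, which is the behavioral fix the commit message describes.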
Re: [Qemu-block] [Qemu-devel] [PATCH for-2.11] nbd/server: Fix error reporting for bad requests
On 11/15/2017 03:23 PM, no-re...@patchew.org wrote: > Hi, > > This series failed build test on s390x host. Please find the details below. EPEBCAK. Serves me right for tweaking the patch in between compile-testing and posting. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
Re: [Qemu-block] [Libguestfs] [Qemu-devel] [qemu-img] support for XVA
On 2017-11-15 21:41, Gandalf Corvotempesta wrote: > 2017-11-15 21:29 GMT+01:00 Richard W.M. Jones: >> Gandalf, is there an XVA file publically available (pref. not >> too big) that we can look at? > > I can try to provide one, but it's simple: > > # tar tvf 20160630_124823_aa72_.xva.gz | head -n 50 > -- 0/0 42353 1970-01-01 01:00 ova.xml > -- 0/0 1048576 1970-01-01 01:00 Ref:175/ > -- 0/0 40 1970-01-01 01:00 Ref:175/.checksum > -- 0/0 1048576 1970-01-01 01:00 Ref:175/0001 > -- 0/0 40 1970-01-01 01:00 Ref:175/0001.checksum > -- 0/0 1048576 1970-01-01 01:00 Ref:175/0003 > -- 0/0 40 1970-01-01 01:00 Ref:175/0003.checksum > -- 0/0 1048576 1970-01-01 01:00 Ref:175/0004 > -- 0/0 40 1970-01-01 01:00 Ref:175/0004.checksum > -- 0/0 1048576 1970-01-01 01:00 Ref:175/0005 > -- 0/0 40 1970-01-01 01:00 Ref:175/0005.checksum > -- 0/0 1048576 1970-01-01 01:00 Ref:175/0006 > -- 0/0 40 1970-01-01 01:00 Ref:175/0006.checksum > -- 0/0 1048576 1970-01-01 01:00 Ref:175/0007 > -- 0/0 40 1970-01-01 01:00 Ref:175/0007.checksum > -- 0/0 1048576 1970-01-01 01:00 Ref:175/0009 > -- 0/0 40 1970-01-01 01:00 Ref:175/0009.checksum > -- 0/0 1048576 1970-01-01 01:00 Ref:175/0010 > -- 0/0 40 1970-01-01 01:00 Ref:175/0010.checksum > > > You can ignore the ova.xml and just use the "Ref:175" directory. > Inside the XVA you'll fine one "Ref" directory for each virtual disk > (ref number is different for each disk) > Inside each Ref directory, you'll get tons of 1MB file, corrisponding > to the RAW image. > You have to merge these files in a single raw file with just an > exception: numbers are not sequential. > as you can see above, we have: , 0001, 0003 > > The 0002 is missing, because it's totally blank. XenServer doesn't > export any empty block, thus it will skip the corrisponding 1MB file. > When building the raw image, you have to replace empty blocks with 1MB > full of zeros. > > This is (in addition to the tar extract) the most time-consuming part. 
> Currently I'm rebuilding a 250GB image, with tons of files to be > merge. > If qemu-img can be patched to automatically convert this kind of > format, I can save about 3 hours (30 minutes for extracting the > tarball, and about 2 hours to merge 250-300GB image) Hmmm... OK, so as Rich said, you can use the raw driver with an offset and a size. And you can use the null-co driver to emulate holes. And qemu-img convert supports concatenation. Taking all of that together, it should be possible to create an input "file" per 1 MB block and concatenate all of those with qemu-img convert. The only issue is that it'll get you a *really* long command line. (And you'll still have to gunzip the archive before, of course.) I'll try to write a script, but of course it's hard without an example... Max
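Max's per-block concatenation idea can be sketched as a small generator that emits one source specification per 1 MiB chunk. This is a sketch under stated assumptions: the chunk-file zero-padding and the exact raw/null-co option spellings are guesses to verify against the archive and the qemu documentation.

```python
CHUNK = 1024 * 1024  # XVA stores each virtual disk as 1 MiB chunk files

def chunk_sources(ref_dir, present, total, name_fmt="%04d"):
    """Yield one qemu-img source spec per chunk of the virtual disk.

    present:  set of chunk numbers that exist in the extracted Ref:NNN
              directory (missing numbers were all-zero and not exported).
    name_fmt: zero-padding of the chunk file names -- an assumption,
              check it against your archive's listing.
    """
    for i in range(total):
        if i in present:
            # Present chunk: expose the 1 MiB file through the raw driver.
            yield "driver=raw,file.driver=file,file.filename=%s/%s" % (
                ref_dir, name_fmt % i)
        else:
            # Absent chunk: a 1 MiB hole, emulated with the null-co driver.
            yield "driver=null-co,size=%d,read-zeroes=on" % CHUNK
```

Each emitted string would become one --image-opts source of a single qemu-img convert invocation — which is exactly the *really* long command line Max predicts.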
Re: [Qemu-block] [Qemu-devel] [PATCH for-2.11] nbd/server: Fix error reporting for bad requests
Hi, This series failed automatic build test. Please find the testing commands and their output below. If you have docker installed, you can probably reproduce it locally. Subject: [Qemu-devel] [PATCH for-2.11] nbd/server: Fix error reporting for bad requests Type: series Message-id: 20171115211917.789-1-ebl...@redhat.com === TEST SCRIPT BEGIN === #!/bin/bash set -e git submodule update --init dtc # Let docker tests dump environment info export SHOW_ENV=1 export J=8 time make docker-test-quick@centos6 time make docker-test-build@min-glib time make docker-test-mingw@fedora time make docker-test-block@fedora === TEST SCRIPT END === Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384 Switched to a new branch 'test' 48097f65f5 nbd/server: Fix error reporting for bad requests === OUTPUT BEGIN === Submodule 'dtc' (git://git.qemu-project.org/dtc.git) registered for path 'dtc' Cloning into '/var/tmp/patchew-tester-tmp-vt57qj11/src/dtc'... Submodule path 'dtc': checked out '558cd81bdd432769b59bff01240c44f82cfb1a9d' BUILD centos6 make[1]: Entering directory '/var/tmp/patchew-tester-tmp-vt57qj11/src' GEN /var/tmp/patchew-tester-tmp-vt57qj11/src/docker-src.2017-11-15-16.22.44.14040/qemu.tar Cloning into '/var/tmp/patchew-tester-tmp-vt57qj11/src/docker-src.2017-11-15-16.22.44.14040/qemu.tar.vroot'... done. 
Checking out files: 100% (5654/5654), done. Your branch is up-to-date with 'origin/test'. Submodule 'dtc' (git://git.qemu-project.org/dtc.git) registered for path 'dtc' Cloning into '/var/tmp/patchew-tester-tmp-vt57qj11/src/docker-src.2017-11-15-16.22.44.14040/qemu.tar.vroot/dtc'... Submodule path 'dtc': checked out '558cd81bdd432769b59bff01240c44f82cfb1a9d' Submodule 'ui/keycodemapdb' (git://git.qemu.org/keycodemapdb.git) registered for path 'ui/keycodemapdb' Cloning into '/var/tmp/patchew-tester-tmp-vt57qj11/src/docker-src.2017-11-15-16.22.44.14040/qemu.tar.vroot/ui/keycodemapdb'... Submodule path 'ui/keycodemapdb': checked out '10739aa26051a5d49d88132604539d3ed085e72e' COPY RUNNER RUN test-quick in qemu:centos6 Packages installed: (centos6 package list omitted) package g++ is not installed
Re: [Qemu-block] [Qemu-devel] [PATCH for-2.12 0/4] qmp dirty bitmap API
On 11/13/2017 11:20 AM, Vladimir Sementsov-Ogievskiy wrote: > Hi all. > > There are three qmp commands, needed to implement external backup API. > > Using these three commands, client may do all needed bitmap management by > hand: > > on backup start we need to do a transaction: > {disable old bitmap, create new bitmap} > > on backup success: > drop old bitmap > > on backup fail: > enable old bitmap > merge new bitmap to old bitmap > drop new bitmap > Can you give me an example of how you expect these commands to be used, and why they're required? I'm a little wary about how error-prone these commands might be and the potential for incorrect usage seems... high. Why do we require them?
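For reference, the workflow Vladimir describes is short when written down as data. A sketch with descriptive operation names — not the actual QMP command names, which the series itself defines:

```python
def backup_workflow(old_bitmap, new_bitmap):
    """The three phases of the proposed external-backup dance, as
    (operation, *bitmaps) tuples; each tuple stands for one QMP call."""
    return {
        # Issued atomically, inside a single 'transaction'.
        "start": [("disable", old_bitmap), ("create", new_bitmap)],
        "success": [("drop", old_bitmap)],
        # Roll back: fold the newly dirtied bits into the re-enabled
        # old bitmap, then discard the temporary one.
        "fail": [("enable", old_bitmap),
                 ("merge", new_bitmap, old_bitmap),
                 ("drop", new_bitmap)],
    }
```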
[Qemu-block] [PATCH for-2.11] nbd/server: Fix error reporting for bad requests
The NBD spec says an attempt to NBD_CMD_TRIM on a read-only export should fail with EPERM, as a trim has the potential to change disk contents, but we were relying on the block layer to catch that for us, which might not always give the right error (and even if it does, it does not let us pass back a sane message over structured replies). The NBD spec says an attempt to NBD_CMD_WRITE_ZEROES out of bounds should fail with ENOSPC, not EINVAL. Our check for u64 offset + u32 length wraparound up front is pointless; nothing uses offset until after the second round of sanity checks, and we can just as easily ensure there is no wraparound by checking whether offset is in bounds (since a disk size cannot exceed off_t which is 63 bits, adding a 32-bit number for a valid offset can't overflow). Solve all of these issues by some code motion and improved request validation. Signed-off-by: Eric Blake--- nbd/server.c | 35 +++ 1 file changed, 11 insertions(+), 24 deletions(-) diff --git a/nbd/server.c b/nbd/server.c index df771fd42f..a27183b427 100644 --- a/nbd/server.c +++ b/nbd/server.c @@ -1366,15 +1366,6 @@ static int nbd_co_receive_request(NBDRequestData *req, NBDRequest *request, return -EIO; } -/* Check for sanity in the parameters, part 1. Defer as many - * checks as possible until after reading any NBD_CMD_WRITE - * payload, so we can try and keep the connection alive. */ -if ((request->from + request->len) < request->from) { -error_setg(errp, - "integer overflow detected, you're probably being attacked"); -return -EINVAL; -} - if (request->type == NBD_CMD_READ || request->type == NBD_CMD_WRITE) { if (request->len > NBD_MAX_BUFFER_SIZE) { error_setg(errp, "len (%" PRIu32" ) is larger than max len (%u)", @@ -1399,12 +1390,20 @@ static int nbd_co_receive_request(NBDRequestData *req, NBDRequest *request, request->len); } -/* Sanity checks, part 2. */ -if (request->from + request->len > client->exp->size) { +/* Sanity checks. 
*/ +if (client->exp->nbdflags & NBD_FLAG_READ_ONLY && +(cmd == NBD_CMD_WRITE || cmd == NBD_CMD_WRITE_ZEROES || + cmd == NBD_CMD_TRIM)) { +error_setg(_err, "Export is read-only"); +return -EROFS; +} +if (request->from > client->exp->size || +request->from + request->len > client->exp->size) { error_setg(errp, "operation past EOF; From: %" PRIu64 ", Len: %" PRIu32 ", Size: %" PRIu64, request->from, request->len, (uint64_t)client->exp->size); -return request->type == NBD_CMD_WRITE ? -ENOSPC : -EINVAL; +return (request->type == NBD_CMD_WRITE || +request->type == NBD_CMD_WRITE_ZEROES) ? -ENOSPC : -EINVAL; } valid_flags = NBD_CMD_FLAG_FUA; if (request->type == NBD_CMD_READ && client->structured_reply) { @@ -1482,12 +1481,6 @@ static coroutine_fn void nbd_trip(void *opaque) break; case NBD_CMD_WRITE: -if (exp->nbdflags & NBD_FLAG_READ_ONLY) { -error_setg(_err, "Export is read-only"); -ret = -EROFS; -break; -} - flags = 0; if (request.flags & NBD_CMD_FLAG_FUA) { flags |= BDRV_REQ_FUA; @@ -1500,12 +1493,6 @@ static coroutine_fn void nbd_trip(void *opaque) break; case NBD_CMD_WRITE_ZEROES: -if (exp->nbdflags & NBD_FLAG_READ_ONLY) { -error_setg(_err, "Export is read-only"); -ret = -EROFS; -break; -} - flags = 0; if (request.flags & NBD_CMD_FLAG_FUA) { flags |= BDRV_REQ_FUA; -- 2.13.6
Re: [Qemu-block] [Qemu-devel] [Libguestfs] [qemu-img] support for XVA
On Wed, Nov 15, 2017 at 09:05:00PM +, Richard W.M. Jones wrote: > This will give you one offset/size/filename tuple for each file. The > plugin will then simply need to calculate which file to access to > resolve each virtual file range (or substitute zeroes for missing > files). By which I mean, of course, that it accesses the offset/size directly by opening the tar file. The file doesn't have to be extracted. I'm still interested if you have a small XVA example you can share. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com Fedora Windows cross-compiler. Compile Windows programs, test, and build Windows installers. Over 100 libraries supported. http://fedoraproject.org/wiki/MinGW
Re: [Qemu-block] [Qemu-devel] [Libguestfs] [qemu-img] support for XVA
On Wed, Nov 15, 2017 at 09:41:20PM +0100, Gandalf Corvotempesta wrote: > 2017-11-15 21:29 GMT+01:00 Richard W.M. Jones: > > Gandalf, is there an XVA file publically available (pref. not > > too big) that we can look at? > > I can try to provide one, but it's simple: > > # tar tvf 20160630_124823_aa72_.xva.gz | head -n 50 > -- 0/0 42353 1970-01-01 01:00 ova.xml > -- 0/0 1048576 1970-01-01 01:00 Ref:175/ > -- 0/0 40 1970-01-01 01:00 Ref:175/.checksum > -- 0/0 1048576 1970-01-01 01:00 Ref:175/0001 > -- 0/0 40 1970-01-01 01:00 Ref:175/0001.checksum > -- 0/0 1048576 1970-01-01 01:00 Ref:175/0003 > -- 0/0 40 1970-01-01 01:00 Ref:175/0003.checksum > -- 0/0 1048576 1970-01-01 01:00 Ref:175/0004 > -- 0/0 40 1970-01-01 01:00 Ref:175/0004.checksum > -- 0/0 1048576 1970-01-01 01:00 Ref:175/0005 > -- 0/0 40 1970-01-01 01:00 Ref:175/0005.checksum > -- 0/0 1048576 1970-01-01 01:00 Ref:175/0006 > -- 0/0 40 1970-01-01 01:00 Ref:175/0006.checksum > -- 0/0 1048576 1970-01-01 01:00 Ref:175/0007 > -- 0/0 40 1970-01-01 01:00 Ref:175/0007.checksum > -- 0/0 1048576 1970-01-01 01:00 Ref:175/0009 > -- 0/0 40 1970-01-01 01:00 Ref:175/0009.checksum > -- 0/0 1048576 1970-01-01 01:00 Ref:175/0010 > -- 0/0 40 1970-01-01 01:00 Ref:175/0010.checksum > > > You can ignore the ova.xml and just use the "Ref:175" directory. > Inside the XVA you'll fine one "Ref" directory for each virtual disk > (ref number is different for each disk) > Inside each Ref directory, you'll get tons of 1MB file, corrisponding > to the RAW image. > You have to merge these files in a single raw file with just an > exception: numbers are not sequential. > as you can see above, we have: , 0001, 0003 > > The 0002 is missing, because it's totally blank. XenServer doesn't > export any empty block, thus it will skip the corrisponding 1MB file. > When building the raw image, you have to replace empty blocks with 1MB > full of zeros. > > This is (in addition to the tar extract) the most time-consuming part. 
> Currently I'm rebuilding a 250GB image, with tons of files to be > merge. > If qemu-img can be patched to automatically convert this kind of > format, I can save about 3 hours (30 minutes for extracting the > tarball, and about 2 hours to merge 250-300GB image) I guess the nbdkit approach would be better given the multiple and missing files within the tar file. You'll have to use ‘tar tRvf file.xva’ to extract the offset and size of each file. (See the function ‘find_file_in_tar’ in virt-v2v source for exactly how). This will give you one offset/size/filename tuple for each file. The plugin will then simply need to calculate which file to access to resolve each virtual file range (or substitute zeroes for missing files). Note that nbdkit lets you write plugins in high-level languages like Perl or Python which would be ideally suited to this kind of task. You can then use "captive nbdkit" (see the manual) like this: nbdkit -U - \ perl script=/path/to/xva-parser.pl file=/path/to/file.xva \ --run 'qemu-img convert $nbd -O qcow2 output.qcow2' At no point does the tar file need to be fully unpacked and it should all be reasonably efficient. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-builder quickly builds VMs from scratch http://libguestfs.org/virt-builder.1.html
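The core of the plugin Rich describes is the range-to-chunk mapping. A hedged Python sketch, assuming the chunk offsets/sizes have already been parsed out of 'tar tRvf' output and the 1 MiB chunk size from the XVA layout above:

```python
CHUNK = 1024 * 1024  # XVA chunk size

def resolve(chunks, offset, length):
    """Split a virtual read of [offset, offset+length) into per-chunk pieces.

    chunks maps chunk number -> (tar_offset, size) as parsed from
    'tar tRvf'; a missing entry is an unexported, all-zero chunk.
    Returns (chunk_entry_or_None, offset_in_chunk, nbytes) tuples.
    """
    pieces = []
    while length > 0:
        idx, in_off = divmod(offset, CHUNK)
        n = min(length, CHUNK - in_off)
        pieces.append((chunks.get(idx), in_off, n))
        offset += n
        length -= n
    return pieces
```

An nbdkit pread() callback would then, for each piece, read nbytes at tar_offset + offset_in_chunk straight out of the archive for present chunks and fill zeroes for the rest — no extraction needed.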
Re: [Qemu-block] [PATCH for-2.11] qcow2: Fix overly broad madvise()
On 2017-11-14 19:41, Max Reitz wrote: > @mem_size and @offset are both size_t, thus subtracting them from one > another will just return a big size_t if mem_size < offset -- even more > obvious here because the result is stored in another size_t. > > Checking that result to be positive is therefore not sufficient to > excluse the case that offset > mem_size. Thus, we currently sometimes > issue an madvise() over a very large address range. > > This is triggered by iotest 163, but with -m64, this does not result in > tangible problems. But with -m32, this test produces three segfaults, > all of which are fixed by this patch. > > Signed-off-by: Max Reitz> --- > block/qcow2-cache.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) Fixed the typo and applied to my block branch: https://github.com/XanClic/qemu/commits/block Max
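The arithmetic in that commit message is easy to demonstrate outside C; a sketch assuming a 64-bit size_t:

```python
U64 = 2 ** 64  # value range of a 64-bit size_t

def madvise_length(mem_size, offset):
    """The buggy computation: size_t subtraction wraps around instead of
    going negative, so whenever offset > mem_size the result is huge --
    not negative -- and a 'result > 0' test cannot reject it."""
    return (mem_size - offset) % U64
```

Hence the bounds have to be checked against each other before subtracting, rather than inspecting the difference afterwards.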
Re: [Qemu-block] [Libguestfs] [Qemu-devel] [qemu-img] support for XVA
On Wed, Nov 15, 2017 at 09:27:14PM +0100, Max Reitz wrote: > On 2017-11-15 21:24, Richard W.M. Jones wrote: > > On Wed, Nov 15, 2017 at 09:07:12PM +0100, Max Reitz wrote: > >> On 2017-11-15 21:06, Gandalf Corvotempesta wrote: > >>> 2017-11-15 20:59 GMT+01:00 Max Reitz: > Well, you can't just add support to qemu-img alone either. Every image > format supported by qemu-img is one supported by qemu as a whole and > thus needs a proper block driver that needs to support random accesses > as well. > >>> > >>> I was talking about qemu-img convert, just to convert an XVA image to > >>> something different, in a single pass, without having to extract the > >>> tar. > >> > >> I know, but that doesn't work. qemu-img convert uses the normal > >> general-purpose block drivers for that. > > > > In any case there's no need as qemu/qemu-img can already access files > > inside an uncompressed tarball using the offset/size support added to > > the raw driver exactly for this purpose: > > > > https://lists.gnu.org/archive/html/qemu-devel/2016-10/msg03945.html > > If that works, yes. To me it doesn't look like XVA is just a single raw > image inside of a tarball, but I may very well be wrong, of course. Gandalf, is there an XVA file publically available (pref. not too big) that we can look at? Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com Fedora Windows cross-compiler. Compile Windows programs, test, and build Windows installers. Over 100 libraries supported. http://fedoraproject.org/wiki/MinGW
Re: [Qemu-block] [Libguestfs] [Qemu-devel] [qemu-img] support for XVA
On 2017-11-15 21:24, Richard W.M. Jones wrote: > On Wed, Nov 15, 2017 at 09:07:12PM +0100, Max Reitz wrote: >> On 2017-11-15 21:06, Gandalf Corvotempesta wrote: >>> 2017-11-15 20:59 GMT+01:00 Max Reitz: Well, you can't just add support to qemu-img alone either. Every image format supported by qemu-img is one supported by qemu as a whole and thus needs a proper block driver that needs to support random accesses as well. >>> >>> I was talking about qemu-img convert, just to convert an XVA image to >>> something different, in a single pass, without having to extract the >>> tar. >> >> I know, but that doesn't work. qemu-img convert uses the normal >> general-purpose block drivers for that. > > In any case there's no need as qemu/qemu-img can already access files > inside an uncompressed tarball using the offset/size support added to > the raw driver exactly for this purpose: > > https://lists.gnu.org/archive/html/qemu-devel/2016-10/msg03945.html If that works, yes. To me it doesn't look like XVA is just a single raw image inside of a tarball, but I may very well be wrong, of course. Max
Re: [Qemu-block] [PATCH for-2.11 0/5] qcow2: Fixes for corrupted images
On 2017-11-10 21:31, Max Reitz wrote: > This series contains fixes for another batch of qcow2-related crashes > reported on Launchpad by Nageswara (the first batch was > http://lists.nongnu.org/archive/html/qemu-block/2017-11/msg00082.html by > Berto). > > Patch 4 fixes an out-of-bounds array access in memory which is not > really a security issue for multiple reasons (really, at most you can > read eight bytes from somewhere with an extremely high chance of > crashing qemu and requiring the user to invoke a block_resize shrinking > the qcow2 image (and also reset some bit in the image from 1 to 0, but > only if the overlap checks don't catch you)), but most importantly that > code hasn't been in 2.10, so we're fine. > > > Max Reitz (5): > qcow2: check_errors are fatal > qcow2: Unaligned zero cluster in handle_alloc() > block: Guard against NULL bs->drv > qcow2: Add bounds check to get_refblock_offset() > qcow2: Refuse to get unaligned offsets from cache > > block/qcow2.h | 6 --- > block.c| 19 ++- > block/io.c | 36 + > block/qapi.c | 8 ++- > block/qcow2-cache.c| 21 > block/qcow2-cluster.c | 13 - > block/qcow2-refcount.c | 26 +- > block/qcow2.c | 5 +- > block/replication.c| 15 ++ > block/vvfat.c | 2 +- > tests/qemu-iotests/060 | 125 > + > tests/qemu-iotests/060.out | 115 + > 12 files changed, 379 insertions(+), 12 deletions(-) Applied to my block tree: https://github.com/XanClic/qemu/commits/block Max
Re: [Qemu-block] [Libguestfs] [Qemu-devel] [qemu-img] support for XVA
On Wed, Nov 15, 2017 at 09:07:12PM +0100, Max Reitz wrote: > On 2017-11-15 21:06, Gandalf Corvotempesta wrote: > > 2017-11-15 20:59 GMT+01:00 Max Reitz: > >> Well, you can't just add support to qemu-img alone either. Every image > >> format supported by qemu-img is one supported by qemu as a whole and > >> thus needs a proper block driver that needs to support random accesses > >> as well. > > > > I was talking about qemu-img convert, just to convert an XVA image to > > something different, in a single pass, without having to extract the > > tar. > > I know, but that doesn't work. qemu-img convert uses the normal > general-purpose block drivers for that. In any case there's no need as qemu/qemu-img can already access files inside an uncompressed tarball using the offset/size support added to the raw driver exactly for this purpose: https://lists.gnu.org/archive/html/qemu-devel/2016-10/msg03945.html Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-p2v converts physical machines to virtual machines. Boot with a live CD or over the network (PXE) and turn machines into KVM guests. http://libguestfs.org/virt-v2v
Re: [Qemu-block] [Qemu-devel] [qemu-img] support for XVA
On 2017-11-15 21:06, Gandalf Corvotempesta wrote: > 2017-11-15 20:59 GMT+01:00 Max Reitz: >> Well, you can't just add support to qemu-img alone either. Every image >> format supported by qemu-img is one supported by qemu as a whole and >> thus needs a proper block driver that needs to support random accesses >> as well. > > I was talking about qemu-img convert, just to convert an XVA image to > something different, in a single pass, without having to extract the > tar. I know, but that doesn't work. qemu-img convert uses the normal general-purpose block drivers for that. Max
Re: [Qemu-block] [Qemu-devel] [qemu-img] support for XVA
On 2017-11-15 18:44, Gandalf Corvotempesta wrote: > XVA is a tar archive. > I don't think would be possible to directly use the image stored inside > without extracting and merging each chunks > > Any random reads would be impossible to do, only a huge sequential dump to > build the raw image Well, you can't just add support to qemu-img alone either. Every image format supported by qemu-img is one supported by qemu as a whole and thus needs a proper block driver that needs to support random accesses as well. Max
Re: [Qemu-block] [PATCH 5/5] iotest 030: add compressed block-stream test
On 2017-11-14 11:16, Anton Nefedov wrote: > Signed-off-by: Anton Nefedov> --- > tests/qemu-iotests/030 | 69 > +- > tests/qemu-iotests/030.out | 4 +-- > 2 files changed, 70 insertions(+), 3 deletions(-) > > diff --git a/tests/qemu-iotests/030 b/tests/qemu-iotests/030 > index 1883894..52cbe5d 100755 > --- a/tests/qemu-iotests/030 > +++ b/tests/qemu-iotests/030 > @@ -21,7 +21,7 @@ > import time > import os > import iotests > -from iotests import qemu_img, qemu_io > +from iotests import qemu_img, qemu_img_pipe, qemu_io > > backing_img = os.path.join(iotests.test_dir, 'backing.img') > mid_img = os.path.join(iotests.test_dir, 'mid.img') > @@ -800,5 +800,72 @@ class TestSetSpeed(iotests.QMPTestCase): > > self.cancel_and_wait() > > +class TestCompressed(iotests.QMPTestCase): > +supported_fmts = ['qcow2'] > +cluster_size = 64 * 1024; > +image_len = 1 * 1024 * 1024; > + > +def setUp(self): > +qemu_img('create', backing_img, str(TestCompressed.image_len)) First, this is missing the '-f raw', if you want to keep it in line with the next command. > +qemu_io('-f', 'raw', '-c', 'write -P 1 0 ' + > str(TestCompressed.image_len), backing_img) But why would you want this to be raw? > +qemu_img('create', '-f', iotests.imgfmt, '-o', 'backing_file=%s' % > backing_img, test_img) > + > +# write '4' in every 4th cluster > +step=4 I'd prefer spaces around operators. (Also in _pattern() and in the call to pattern ([1,4,2] should be [1, 4, 2]).) > +for i in range(TestCompressed.image_len / > TestCompressed.cluster_size / step + 1): As far as I know, range() has an actual step parameter, maybe that would be simpler -- and I don't know what the +1 is supposed to mean. How about "for ofs in range(0, TestCompressed.image_len, TestCompressed.cluster_size * step)"? Would that work? 
> +qemu_io('-c', 'write -P %d %d %d' % > +(step, step * i * TestCompressed.cluster_size,> + > TestCompressed.cluster_size), > +test_img) > + > +self.vm = iotests.VM().add_drive(test_img) > +self.vm.launch() > + > +def tearDown(self): > +os.remove(test_img) > +os.remove(backing_img) > + > +def _pattern(self, x, divs): > +return divs[-1] if not x%divs[-1] else self._pattern(x, divs[:-1]) I have no idea what this function does. ...OK, having looked into how it's used, I understand better. A comment would certainly be helpful, though. Maybe also call it differently. It doesn't really return a pattern. What this function does is it returns the first integer from @divs (starting from the end, which is weird) which is a divider of @x. So maybe call it _first_divider(self, x, dividers), that would help already. > + > +def test_compressed(self): > +self.assert_no_active_block_jobs() > + > +result = self.vm.qmp('block-stream', device='drive0', compress=True) > +if iotests.imgfmt not in TestCompressed.supported_fmts: > +self.assert_qmp( > +result, 'error/desc', > +'Compression is not supported for this drive drive0') > +return > +self.assert_qmp(result, 'return', {}) > + > +# write '2' in every 2nd cluster > +step = 2 > +for i in range(TestCompressed.image_len / > TestCompressed.cluster_size / step + 1): > +result = self.vm.qmp('human-monitor-command', > + command_line= > + 'qemu-io drive0 \"write -P %d %d %d\"' % Just " instead of \" would be enough, I think. > + (step, step * i * > TestCompressed.cluster_size, > + TestCompressed.cluster_size)) > +self.assert_qmp(result, 'return', "") > + > +self.wait_until_completed() > +self.assert_no_active_block_jobs() > + > +self.vm.shutdown() > + > +for i in range(TestCompressed.image_len / > TestCompressed.cluster_size): > +outp = qemu_io('-c', 'read -P %d %d %d' % > + (self._pattern(i, [1,4,2]), The 4 is useless, because everything that's divisible by 4 is divisible by 2 already. 
Also, I don't quite see what this should achieve exactly. You first write the backing image full of 1s, OK. Then you write 4s to every fourth cluster in the top image. Then you start streaming, and then you write 2s to ever second cluster in the top image, which overwrites all of the 4s you wrote. Do you maybe not want to write 2 into every second cluster of the backing image in setUp() instead of 4 into the top image? (And then 4th into every fourth cluster here in test_compressed().) Then you would need [1, 2, 4] here, which makes much more sense. Max > +
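Max's suggested rename makes the helper self-describing; here is an iterative equivalent of the recursive _pattern() (same last-to-first scan), sketched with the [1, 2, 4] divider ordering he proposes:

```python
def first_divider(x, dividers):
    """Return the first element of dividers, scanning from the end of
    the list, that evenly divides x.  Under the write scheme Max
    proposes, first_divider(i, [1, 2, 4]) is the pattern byte that
    cluster i was last written with."""
    for d in reversed(dividers):
        if x % d == 0:
            return d
    raise ValueError("no element of %r divides %d" % (dividers, x))
```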
Re: [Qemu-block] [PATCH 4/5] block-stream: add compress option
On 2017-11-14 11:16, Anton Nefedov wrote: > Signed-off-by: Anton Nefedov> --- > qapi/block-core.json | 4 > include/block/block_int.h | 4 +++- > block/stream.c| 16 > blockdev.c| 13 - > hmp.c | 2 ++ > hmp-commands.hx | 4 ++-- > 6 files changed, 35 insertions(+), 8 deletions(-) Looks good to me overall, two notes below. Max > diff --git a/qapi/block-core.json b/qapi/block-core.json > index ab96e34..b0d022f 100644 > --- a/qapi/block-core.json > +++ b/qapi/block-core.json > @@ -2007,6 +2007,9 @@ > # > # @speed: the maximum speed, in bytes per second > # > +# @compress: true to compress data, if the target format supports it. > +#(default: false). Since 2.12. > +# Too many full stops (one before, one after the parentheses). Also, this makes it sound a bit like it would still be possible to specify this parameter even if the target doesn't support it (and it would just be ignored then), but that's not true. Also, the format most parameter documentation follows for block-stream is more "@parameter: Description. Default. (Since version)". So I'd suggest: "true to compress data; may only be set if the target format supports it. Defaults to false if omitted. (Since 2.12)" But I won't argue too much over the format, so the following is OK, too: "true to compress data; may only be set if the target format supports it (default: false). (Since 2.12)" > # @on-error: the action to take on an error (default report). > #'stop' and 'enospc' can only be used if the block device > #supports io-status (see BlockInfo). Since 1.3. 
> @@ -2026,6 +2029,7 @@ > { 'command': 'block-stream', >'data': { '*job-id': 'str', 'device': 'str', '*base': 'str', > '*base-node': 'str', '*backing-file': 'str', '*speed': 'int', > +'*compress': 'bool', > '*on-error': 'BlockdevOnError' } } > > ## > diff --git a/include/block/block_int.h b/include/block/block_int.h > index a548277..093bf9b 100644 > --- a/include/block/block_int.h > +++ b/include/block/block_int.h > @@ -863,6 +863,7 @@ int is_windows_drive(const char *filename); > * @backing_file_str: The file name that will be written to @bs as the > * the new backing file if the job completes. Ignored if @base is %NULL. > * @speed: The maximum speed, in bytes per second, or 0 for unlimited. > + * @compress: True to compress data. > * @on_error: The action to take upon error. > * @errp: Error object. > * > @@ -875,7 +876,8 @@ int is_windows_drive(const char *filename); > */ > void stream_start(const char *job_id, BlockDriverState *bs, >BlockDriverState *base, const char *backing_file_str, > - int64_t speed, BlockdevOnError on_error, Error **errp); > + int64_t speed, bool compress, > + BlockdevOnError on_error, Error **errp); > > /** > * commit_start: > diff --git a/block/stream.c b/block/stream.c > index e6f7234..75c9d66 100644 > --- a/block/stream.c > +++ b/block/stream.c > @@ -38,23 +38,29 @@ typedef struct StreamBlockJob { > BlockdevOnError on_error; > char *backing_file_str; > int bs_flags; > +bool compress; > } StreamBlockJob; > > static int coroutine_fn stream_populate(BlockBackend *blk, > int64_t offset, uint64_t bytes, > -void *buf) > +void *buf, bool compress) > { > struct iovec iov = { > .iov_base = buf, > .iov_len = bytes, > }; > QEMUIOVector qiov; > +int flags = BDRV_REQ_COPY_ON_READ; > + > +if (compress) { > +flags |= BDRV_REQ_WRITE_COMPRESSED; > +} > > assert(bytes < SIZE_MAX); > qemu_iovec_init_external(, , 1); > > /* Copy-on-read the unallocated clusters */ > -return blk_co_preadv(blk, offset, qiov.size, , > BDRV_REQ_COPY_ON_READ); > +return 
blk_co_preadv(blk, offset, qiov.size, , flags); > } > > typedef struct { > @@ -166,7 +172,7 @@ static void coroutine_fn stream_run(void *opaque) > } > trace_stream_one_iteration(s, offset, n, ret); > if (copy) { > -ret = stream_populate(blk, offset, n, buf); > +ret = stream_populate(blk, offset, n, buf, s->compress); > } > if (ret < 0) { > BlockErrorAction action = > @@ -227,7 +233,8 @@ static const BlockJobDriver stream_job_driver = { > > void stream_start(const char *job_id, BlockDriverState *bs, >BlockDriverState *base, const char *backing_file_str, > - int64_t speed, BlockdevOnError on_error, Error **errp) > + int64_t speed, bool compress, > + BlockdevOnError on_error, Error **errp) > { > StreamBlockJob *s; > BlockDriverState
Re: [Qemu-block] [PATCH v2] qapi: block-core: Clarify events emitted by 'block-job-cancel'
On 11/15/2017 04:09 AM, Kashyap Chamarthy wrote: > When you cancel an in-progress live block operation with QMP > `block-job-cancel`, it emits the event: BLOCK_JOB_CANCELLED. However, > when `block-job-cancel` is issued after `drive-mirror` has indicated (by > emitting the event BLOCK_JOB_READY) that the source and destination > remain synchronized: > > [...] # Snip `drive-mirror` invocation & outputs > { > "execute":"block-job-cancel", > "arguments":{ > "device":"virtio0" > } > } > > {"return": {}} > > It (`block-job-cancel`) will counterintuitively emit the event > 'BLOCK_JOB_COMPLETED': > > { > "timestamp":{ > "seconds":1510678024, > "microseconds":526240 > }, > "event":"BLOCK_JOB_COMPLETED", > "data":{ > "device":"virtio0", > "len":41126400, > "offset":41126400, > "speed":0, > "type":"mirror" > } > } > > But this is expected behaviour, where the _COMPLETED event indicates > that synchronization has successfully ended (and the destination has a > point-in-time copy, which is at the time of cancel). > > So add a small note to this effect. (Thanks: Max Reitz for reminding > me of this on IRC.) > I suppose this difference probably isn't covered in what was the bitmaps.md doc file (we don't bother covering mirror there, only backup); is it covered sufficiently in live-block-operations.rst ? --js
Re: [Qemu-block] [Qemu-devel] [PATCH] iotests: Make 087 pass without AIO enabled
On 11/15/2017 12:07 PM, Max Reitz wrote: > If AIO has not been enabled in the qemu build that is to be tested, we > should skip the "aio=native without O_DIRECT" test instead of failing. > > Signed-off-by: Max Reitz> --- > Cleber wanted to fix this in July with his "build configuration query > tool and conditional (qemu-io)test skip" series > (https://lists.gnu.org/archive/html/qemu-block/2017-07/msg01303.html), > but unfortunately there hasn't been any activity on that (as far as I > can see), so let's just solve it the simple way. > --- > tests/qemu-iotests/087 | 9 - > 1 file changed, 8 insertions(+), 1 deletion(-) Yep, that's about as simple as possible. Reviewed-by: Eric Blake -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-block] [PATCH 3/5] block: support compressed write for copy-on-read
On 2017-11-14 11:16, Anton Nefedov wrote: > Signed-off-by: Anton Nefedov> --- > block/io.c | 30 -- > block/trace-events | 2 +- > 2 files changed, 25 insertions(+), 7 deletions(-) > > diff --git a/block/io.c b/block/io.c > index 3d5ef2c..93c6b24 100644 > --- a/block/io.c > +++ b/block/io.c [...] > @@ -1209,6 +1220,13 @@ int coroutine_fn bdrv_co_preadv(BdrvChild *child, > return ret; > } > > +/* write compressed only makes sense with copy on read */ > +if ((flags & BDRV_REQ_WRITE_COMPRESSED) && > +!(flags & BDRV_REQ_COPY_ON_READ)) > +{ > +return -EINVAL; > +} > + I think the assertion in bdrv_aligned_preadv() should be enough, but either way: Reviewed-by: Max Reitz > bdrv_inc_in_flight(bs); > > /* Don't do copy-on-read if we read data before write operation */
[Qemu-block] [PATCH] iotests: Make 087 pass without AIO enabled
If AIO has not been enabled in the qemu build that is to be tested, we should skip the "aio=native without O_DIRECT" test instead of failing. Signed-off-by: Max Reitz--- Cleber wanted to fix this in July with his "build configuration query tool and conditional (qemu-io)test skip" series (https://lists.gnu.org/archive/html/qemu-block/2017-07/msg01303.html), but unfortunately there hasn't been any activity on that (as far as I can see), so let's just solve it the simple way. --- tests/qemu-iotests/087 | 9 - 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/tests/qemu-iotests/087 b/tests/qemu-iotests/087 index 27ab6c5151..2561a14456 100755 --- a/tests/qemu-iotests/087 +++ b/tests/qemu-iotests/087 @@ -102,7 +102,14 @@ echo echo === aio=native without O_DIRECT === echo -run_qemu <
Re: [Qemu-block] [PATCH v3] ssh: switch from libssh2 to libssh
On Wed, Nov 15, 2017 at 05:26:48PM +0100, Pino Toscano wrote: > Rewrite the implementation of the ssh block driver to use libssh instead > of libssh2. The libssh library has various advantages over libssh2: > - easier API for authentication (for example for using ssh-agent) > - easier API for known_hosts handling > - supports newer types of keys in known_hosts > > Kerberos authentication can be enabled once the libssh bug for it [1] is > fixed. > > The development version of libssh (i.e. the future 0.8.x) supports > fsync, so reuse the build time check for this. > > [1] https://red.libssh.org/issues/242 > > Signed-off-by: Pino ToscanoI haven't reviewed this in detail, but I have tested it and I was able to boot a VM remotely over ssh, do lots of heavy reads and writes, and at no time did it hang/crash. So: Tested-by: Richard W.M. Jones I'll take a look at the code in more detail later. Rich. > Changes from v2: > - used again an own fd > - fixed co_yield() implementation > > Changes from v1: > - fixed jumbo packets writing > - fixed missing 'err' assignment > - fixed commit message > > block/Makefile.objs | 6 +- > block/ssh.c | 494 > > configure | 65 --- > 3 files changed, 259 insertions(+), 306 deletions(-) > > diff --git a/block/Makefile.objs b/block/Makefile.objs > index 6eaf78a046..c0df21d0b4 100644 > --- a/block/Makefile.objs > +++ b/block/Makefile.objs > @@ -20,7 +20,7 @@ block-obj-$(CONFIG_CURL) += curl.o > block-obj-$(CONFIG_RBD) += rbd.o > block-obj-$(CONFIG_GLUSTERFS) += gluster.o > block-obj-$(CONFIG_VXHS) += vxhs.o > -block-obj-$(CONFIG_LIBSSH2) += ssh.o > +block-obj-$(CONFIG_LIBSSH) += ssh.o > block-obj-y += accounting.o dirty-bitmap.o > block-obj-y += write-threshold.o > block-obj-y += backup.o > @@ -41,8 +41,8 @@ rbd.o-libs := $(RBD_LIBS) > gluster.o-cflags := $(GLUSTERFS_CFLAGS) > gluster.o-libs := $(GLUSTERFS_LIBS) > vxhs.o-libs:= $(VXHS_LIBS) > -ssh.o-cflags := $(LIBSSH2_CFLAGS) > -ssh.o-libs := $(LIBSSH2_LIBS) > +ssh.o-cflags := $(LIBSSH_CFLAGS) > 
+ssh.o-libs := $(LIBSSH_LIBS) > block-obj-$(if $(CONFIG_BZIP2),m,n) += dmg-bz2.o > dmg-bz2.o-libs := $(BZIP2_LIBS) > qcow.o-libs:= -lz > diff --git a/block/ssh.c b/block/ssh.c > index b049a16eb9..9e96c9d684 100644 > --- a/block/ssh.c > +++ b/block/ssh.c > @@ -24,8 +24,8 @@ > > #include "qemu/osdep.h" > > -#include > -#include > +#include > +#include > > #include "block/block_int.h" > #include "qapi/error.h" > @@ -41,14 +41,12 @@ > /* DEBUG_SSH=1 enables the DPRINTF (debugging printf) statements in > * this block driver code. > * > - * TRACE_LIBSSH2= enables tracing in libssh2 itself. Note > - * that this requires that libssh2 was specially compiled with the > - * `./configure --enable-debug' option, so most likely you will have > - * to compile it yourself. The meaning of is described > - * here: http://www.libssh2.org/libssh2_trace.html > + * TRACE_LIBSSH= enables tracing in libssh itself. > + * The meaning of is described here: > + * http://api.libssh.org/master/group__libssh__log.html > */ > #define DEBUG_SSH 0 > -#define TRACE_LIBSSH2 0 /* or try: LIBSSH2_TRACE_SFTP */ > +#define TRACE_LIBSSH 0 /* see: SSH_LOG_* */ > > #define DPRINTF(fmt, ...) \ > do {\ > @@ -64,18 +62,14 @@ typedef struct BDRVSSHState { > > /* SSH connection. */ > int sock; /* socket */ > -LIBSSH2_SESSION *session; /* ssh session */ > -LIBSSH2_SFTP *sftp; /* sftp session */ > -LIBSSH2_SFTP_HANDLE *sftp_handle; /* sftp remote file handle */ > +ssh_session session; /* ssh session */ > +sftp_session sftp;/* sftp session */ > +sftp_file sftp_handle;/* sftp remote file handle */ > > -/* See ssh_seek() function below. */ > -int64_t offset; > -bool offset_op_read; > - > -/* File attributes at open. We try to keep the .filesize field > +/* File attributes at open. We try to keep the .size field > * updated if it changes (eg by writing at the end of the file). 
> */ > -LIBSSH2_SFTP_ATTRIBUTES attrs; > +sftp_attributes attrs; > > InetSocketAddress *inet; > > @@ -87,26 +81,23 @@ static void ssh_state_init(BDRVSSHState *s) > { > memset(s, 0, sizeof *s); > s->sock = -1; > -s->offset = -1; > qemu_co_mutex_init(>lock); > } > > static void ssh_state_free(BDRVSSHState *s) > { > +if (s->attrs) { > +sftp_attributes_free(s->attrs); > +} > if (s->sftp_handle) { > -libssh2_sftp_close(s->sftp_handle); > +sftp_close(s->sftp_handle); > } > if (s->sftp) { > -libssh2_sftp_shutdown(s->sftp); > +sftp_free(s->sftp); > } >
Re: [Qemu-block] segfault in parallel blockjobs (iotest 30)
On 15/11/2017 6:42 PM, Alberto Garcia wrote: On Fri 10 Nov 2017 04:02:23 AM CET, Fam Zheng wrote: I'm thinking that perhaps we should add the pause point directly to block_job_defer_to_main_loop(), to prevent any block job from running the exit function when it's paused. I was trying this and unfortunately this breaks the mirror job at least (reproduced with a simple block-commit on the topmost node, which uses commit_active_start() -> mirror_start_job()). So what happens is that mirror_run() always calls bdrv_drained_begin() before returning, pausing the block job. The closing bdrv_drained_end() call is at the end of mirror_exit(), already in the main loop. So the mirror job is always calling block_job_defer_to_main_loop() and mirror_exit() when the job is paused. FWIW, I think Max's report on 194 failures is related: https://lists.gnu.org/archive/html/qemu-devel/2017-11/msg01822.html so perhaps it's worth testing his patch too: https://lists.gnu.org/archive/html/qemu-devel/2017-11/msg01835.html Well, that doesn't solve the original crash with parallel block jobs. The root of the crash is that the mirror job manipulates the graph _while being paused_, so the BlockReopenQueue in bdrv_reopen_multiple() gets messed up, and pausing the jobs (commit 40840e419be31e) won't help. I have the impression that one major source of headaches is the fact that the reopen queue contains nodes that don't need to be reopened at all. Ideally this should be detected early on in bdrv_reopen_queue(), so there's no chance that the queue contains nodes used by a different block job. If we had that then op blockers should be enough to prevent these things. Or am I missing something? Berto After applying Max's patch I tried the similar approach; that is keep BDSes referenced while they are in the reopen queue. Now I get the stream job hanging. Somehow one blk_root_drained_begin() is not paired with blk_root_drained_end(). So the job stays paused. 
Didn't dig deeper yet, but at first glance the reduced reopen queue won't help with this, as reopen drains all BDSes anyway (or can we avoid that too?) /Anton
Re: [Qemu-block] [PATCH 2/5] qcow2: multiple clusters write compressed
On 2017-11-15 17:28, Anton Nefedov wrote: > On 15/11/2017 6:11 PM, Max Reitz wrote: >> On 2017-11-14 11:16, Anton Nefedov wrote: >>> From: Pavel Butsykin>>> >>> At the moment, qcow2_co_pwritev_compressed can process the requests size >>> less than or equal to one cluster. This patch added possibility to write >>> compressed data in the QCOW2 more than one cluster. The implementation >>> is simple, we just split large requests into separate clusters and write >>> using existing functionality. For unaligned requests we use a workaround >>> and do write data without compression. >>> >>> Signed-off-by: Anton Nefedov >>> --- >>> block/qcow2.c | 77 >>> +++ >>> 1 file changed, 56 insertions(+), 21 deletions(-) >> >> On one hand, it might be better to do this centrally somewhere in >> block/io.c. On the other, that would require more work because it would >> probably mean having to introduce another field in BlockLimits, and it >> wouldn't do much -- because qcow (v1) is, well, qcow v1... And vmdk >> seems to completely not care about this single cluster limitation. So >> for now we probably wouldn't even gain anything by doing this in >> block/io.c. >> >> So long story short, it's OK to do this locally in qcow2, yes. >> > > [..] > >>> + qemu_iovec_reset(_qiov); >>> + chunk_size = MIN(bytes, s->cluster_size); >>> + qemu_iovec_concat(_qiov, qiov, curr_off, chunk_size); >>> + >>> + ret = qcow2_co_pwritev_cluster_compressed(bs, offset + >>> curr_off, >>> + chunk_size, >>> _qiov); >>> + if (ret == -ENOTSUP) { >> >> Why this? I mean, I can see the appeal, but then we should probably >> actually return -ENOTSUP somewhere (e.g. for unaligned clusters and the >> like) -- and we should not abort if offset_into_cluster(s, cluster) is >> true, but we should write the header uncompressed and compress the main >> bulk. >> >> Max >> > > Right, sorry, missed this part when porting the patch. > > I think this needs to be removed (and the commit message fixed > accordingly). 
> Returning an error, rather than silently falling back to uncompressed > seems preferable to me. "Compressing writers" (backup, img convert and > now stream) are aware that they have to cluster-align, and if they fail > to do so that means there is an error somewhere. OK for me. > If it won't return an error anymore, then the unaligned tail shouldn't > either. So we can end up 'letting' the callers send small unaligned > requests which will never get compressed. Either way is fine. It just looks to me like vmdk falls back to uncompressed writes, so it's not like it would be completely new behavior... (But I won't judge whether what vmdk does is a good idea.) Max
Re: [Qemu-block] [PATCH 2/5] qcow2: multiple clusters write compressed
On 15/11/2017 6:11 PM, Max Reitz wrote: On 2017-11-14 11:16, Anton Nefedov wrote: From: Pavel ButsykinAt the moment, qcow2_co_pwritev_compressed can process the requests size less than or equal to one cluster. This patch added possibility to write compressed data in the QCOW2 more than one cluster. The implementation is simple, we just split large requests into separate clusters and write using existing functionality. For unaligned requests we use a workaround and do write data without compression. Signed-off-by: Anton Nefedov --- block/qcow2.c | 77 +++ 1 file changed, 56 insertions(+), 21 deletions(-) On one hand, it might be better to do this centrally somewhere in block/io.c. On the other, that would require more work because it would probably mean having to introduce another field in BlockLimits, and it wouldn't do much -- because qcow (v1) is, well, qcow v1... And vmdk seems to completely not care about this single cluster limitation. So for now we probably wouldn't even gain anything by doing this in block/io.c. So long story short, it's OK to do this locally in qcow2, yes. [..] +qemu_iovec_reset(_qiov); +chunk_size = MIN(bytes, s->cluster_size); +qemu_iovec_concat(_qiov, qiov, curr_off, chunk_size); + +ret = qcow2_co_pwritev_cluster_compressed(bs, offset + curr_off, + chunk_size, _qiov); +if (ret == -ENOTSUP) { Why this? I mean, I can see the appeal, but then we should probably actually return -ENOTSUP somewhere (e.g. for unaligned clusters and the like) -- and we should not abort if offset_into_cluster(s, cluster) is true, but we should write the header uncompressed and compress the main bulk. Max Right, sorry, missed this part when porting the patch. I think this needs to be removed (and the commit message fixed accordingly). Returning an error, rather than silently falling back to uncompressed seems preferable to me. 
"Compressing writers" (backup, img convert and now stream) are aware that they have to cluster-align, and if they fail to do so that means there is an error somewhere. If it won't return an error anymore, then the unaligned tail shouldn't either. So we can end up 'letting' the callers send small unaligned requests which will never get compressed. /Anton +ret = qcow2_co_pwritev(bs, offset + curr_off, chunk_size, + _qiov, 0);
[Qemu-block] [PATCH v3] ssh: switch from libssh2 to libssh
Rewrite the implementation of the ssh block driver to use libssh instead of libssh2. The libssh library has various advantages over libssh2: - easier API for authentication (for example for using ssh-agent) - easier API for known_hosts handling - supports newer types of keys in known_hosts Kerberos authentication can be enabled once the libssh bug for it [1] is fixed. The development version of libssh (i.e. the future 0.8.x) supports fsync, so reuse the build time check for this. [1] https://red.libssh.org/issues/242 Signed-off-by: Pino Toscano--- Changes from v2: - used again an own fd - fixed co_yield() implementation Changes from v1: - fixed jumbo packets writing - fixed missing 'err' assignment - fixed commit message block/Makefile.objs | 6 +- block/ssh.c | 494 configure | 65 --- 3 files changed, 259 insertions(+), 306 deletions(-) diff --git a/block/Makefile.objs b/block/Makefile.objs index 6eaf78a046..c0df21d0b4 100644 --- a/block/Makefile.objs +++ b/block/Makefile.objs @@ -20,7 +20,7 @@ block-obj-$(CONFIG_CURL) += curl.o block-obj-$(CONFIG_RBD) += rbd.o block-obj-$(CONFIG_GLUSTERFS) += gluster.o block-obj-$(CONFIG_VXHS) += vxhs.o -block-obj-$(CONFIG_LIBSSH2) += ssh.o +block-obj-$(CONFIG_LIBSSH) += ssh.o block-obj-y += accounting.o dirty-bitmap.o block-obj-y += write-threshold.o block-obj-y += backup.o @@ -41,8 +41,8 @@ rbd.o-libs := $(RBD_LIBS) gluster.o-cflags := $(GLUSTERFS_CFLAGS) gluster.o-libs := $(GLUSTERFS_LIBS) vxhs.o-libs:= $(VXHS_LIBS) -ssh.o-cflags := $(LIBSSH2_CFLAGS) -ssh.o-libs := $(LIBSSH2_LIBS) +ssh.o-cflags := $(LIBSSH_CFLAGS) +ssh.o-libs := $(LIBSSH_LIBS) block-obj-$(if $(CONFIG_BZIP2),m,n) += dmg-bz2.o dmg-bz2.o-libs := $(BZIP2_LIBS) qcow.o-libs:= -lz diff --git a/block/ssh.c b/block/ssh.c index b049a16eb9..9e96c9d684 100644 --- a/block/ssh.c +++ b/block/ssh.c @@ -24,8 +24,8 @@ #include "qemu/osdep.h" -#include -#include +#include +#include #include "block/block_int.h" #include "qapi/error.h" @@ -41,14 +41,12 @@ /* DEBUG_SSH=1 enables the 
DPRINTF (debugging printf) statements in * this block driver code. * - * TRACE_LIBSSH2= enables tracing in libssh2 itself. Note - * that this requires that libssh2 was specially compiled with the - * `./configure --enable-debug' option, so most likely you will have - * to compile it yourself. The meaning of is described - * here: http://www.libssh2.org/libssh2_trace.html + * TRACE_LIBSSH= enables tracing in libssh itself. + * The meaning of is described here: + * http://api.libssh.org/master/group__libssh__log.html */ #define DEBUG_SSH 0 -#define TRACE_LIBSSH2 0 /* or try: LIBSSH2_TRACE_SFTP */ +#define TRACE_LIBSSH 0 /* see: SSH_LOG_* */ #define DPRINTF(fmt, ...) \ do {\ @@ -64,18 +62,14 @@ typedef struct BDRVSSHState { /* SSH connection. */ int sock; /* socket */ -LIBSSH2_SESSION *session; /* ssh session */ -LIBSSH2_SFTP *sftp; /* sftp session */ -LIBSSH2_SFTP_HANDLE *sftp_handle; /* sftp remote file handle */ +ssh_session session; /* ssh session */ +sftp_session sftp;/* sftp session */ +sftp_file sftp_handle;/* sftp remote file handle */ -/* See ssh_seek() function below. */ -int64_t offset; -bool offset_op_read; - -/* File attributes at open. We try to keep the .filesize field +/* File attributes at open. We try to keep the .size field * updated if it changes (eg by writing at the end of the file). 
*/ -LIBSSH2_SFTP_ATTRIBUTES attrs; +sftp_attributes attrs; InetSocketAddress *inet; @@ -87,26 +81,23 @@ static void ssh_state_init(BDRVSSHState *s) { memset(s, 0, sizeof *s); s->sock = -1; -s->offset = -1; qemu_co_mutex_init(>lock); } static void ssh_state_free(BDRVSSHState *s) { +if (s->attrs) { +sftp_attributes_free(s->attrs); +} if (s->sftp_handle) { -libssh2_sftp_close(s->sftp_handle); +sftp_close(s->sftp_handle); } if (s->sftp) { -libssh2_sftp_shutdown(s->sftp); +sftp_free(s->sftp); } if (s->session) { -libssh2_session_disconnect(s->session, - "from qemu ssh client: " - "user closed the connection"); -libssh2_session_free(s->session); -} -if (s->sock >= 0) { -close(s->sock); +ssh_disconnect(s->session); +ssh_free(s->session); } } @@ -121,13 +112,13 @@ session_error_setg(Error **errp, BDRVSSHState *s, const char *fs, ...) va_end(args); if (s->session) { -char *ssh_err; +const char *ssh_err; int ssh_err_code; -/*
Re: [Qemu-block] segfault in parallel blockjobs (iotest 30)
On Fri 10 Nov 2017 04:02:23 AM CET, Fam Zheng wrote: >> > I'm thinking that perhaps we should add the pause point directly to >> > block_job_defer_to_main_loop(), to prevent any block job from >> > running the exit function when it's paused. >> >> I was trying this and unfortunately this breaks the mirror job at >> least (reproduced with a simple block-commit on the topmost node, >> which uses commit_active_start() -> mirror_start_job()). >> >> So what happens is that mirror_run() always calls >> bdrv_drained_begin() before returning, pausing the block job. The >> closing bdrv_drained_end() call is at the end of mirror_exit(), >> already in the main loop. >> >> So the mirror job is always calling block_job_defer_to_main_loop() >> and mirror_exit() when the job is paused. > > FWIW, I think Max's report on 194 failures is related: > > https://lists.gnu.org/archive/html/qemu-devel/2017-11/msg01822.html > > so perhaps it's worth testing his patch too: > > https://lists.gnu.org/archive/html/qemu-devel/2017-11/msg01835.html Well, that doesn't solve the original crash with parallel block jobs. The root of the crash is that the mirror job manipulates the graph _while being paused_, so the BlockReopenQueue in bdrv_reopen_multiple() gets messed up, and pausing the jobs (commit 40840e419be31e) won't help. I have the impression that one major source of headaches is the fact that the reopen queue contains nodes that don't need to be reopened at all. Ideally this should be detected early on in bdrv_reopen_queue(), so there's no chance that the queue contains nodes used by a different block job. If we had that then op blockers should be enough to prevent these things. Or am I missing something? Berto
Re: [Qemu-block] [PATCH 1/5] qcow2: reject unaligned offsets in write compressed
On 2017-11-14 11:16, Anton Nefedov wrote: > Misaligned compressed write is not supported. > > Signed-off-by: Anton Nefedov> --- > block/qcow2.c | 4 > 1 file changed, 4 insertions(+) Thanks, applied to my block branch for 2.11: https://github.com/XanClic/qemu/commits/block Max
Re: [Qemu-block] [PATCH 2/5] qcow2: multiple clusters write compressed
On 2017-11-14 11:16, Anton Nefedov wrote: > From: Pavel Butsykin> > At the moment, qcow2_co_pwritev_compressed can process the requests size > less than or equal to one cluster. This patch added possibility to write > compressed data in the QCOW2 more than one cluster. The implementation > is simple, we just split large requests into separate clusters and write > using existing functionality. For unaligned requests we use a workaround > and do write data without compression. > > Signed-off-by: Anton Nefedov > --- > block/qcow2.c | 77 > +++ > 1 file changed, 56 insertions(+), 21 deletions(-) On one hand, it might be better to do this centrally somewhere in block/io.c. On the other, that would require more work because it would probably mean having to introduce another field in BlockLimits, and it wouldn't do much -- because qcow (v1) is, well, qcow v1... And vmdk seems to completely not care about this single cluster limitation. So for now we probably wouldn't even gain anything by doing this in block/io.c. So long story short, it's OK to do this locally in qcow2, yes. 
> diff --git a/block/qcow2.c b/block/qcow2.c > index 45c5651..3d5f17d 100644 > --- a/block/qcow2.c > +++ b/block/qcow2.c > @@ -3325,11 +3325,9 @@ static int qcow2_truncate(BlockDriverState *bs, > int64_t offset, > return 0; > } > > -/* XXX: put compressed sectors first, then all the cluster aligned > - tables to avoid losing bytes in alignment */ > static coroutine_fn int > -qcow2_co_pwritev_compressed(BlockDriverState *bs, uint64_t offset, > -uint64_t bytes, QEMUIOVector *qiov) > +qcow2_co_pwritev_cluster_compressed(BlockDriverState *bs, uint64_t offset, > +uint64_t bytes, QEMUIOVector *qiov) > { > BDRVQcow2State *s = bs->opaque; > QEMUIOVector hd_qiov; > @@ -3339,25 +3337,12 @@ qcow2_co_pwritev_compressed(BlockDriverState *bs, > uint64_t offset, > uint8_t *buf, *out_buf; > int64_t cluster_offset; > > -if (bytes == 0) { > -/* align end of file to a sector boundary to ease reading with > - sector based I/Os */ > -cluster_offset = bdrv_getlength(bs->file->bs); > -if (cluster_offset < 0) { > -return cluster_offset; > -} > -return bdrv_truncate(bs->file, cluster_offset, PREALLOC_MODE_OFF, > NULL); > -} > - > -if (offset_into_cluster(s, offset)) { > -return -EINVAL; > -} > +assert(bytes <= s->cluster_size); > +assert(!offset_into_cluster(s, offset)); > > buf = qemu_blockalign(bs, s->cluster_size); > -if (bytes != s->cluster_size) { > -if (bytes > s->cluster_size || > -offset + bytes != bs->total_sectors << BDRV_SECTOR_BITS) > -{ > +if (bytes < s->cluster_size) { > +if (offset + bytes != bs->total_sectors << BDRV_SECTOR_BITS) { > qemu_vfree(buf); > return -EINVAL; > } > @@ -3437,6 +3422,56 @@ fail: > return ret; > } > > +/* XXX: put compressed sectors first, then all the cluster aligned > + tables to avoid losing bytes in alignment */ > +static coroutine_fn int > +qcow2_co_pwritev_compressed(BlockDriverState *bs, uint64_t offset, > +uint64_t bytes, QEMUIOVector *qiov) > +{ > +BDRVQcow2State *s = bs->opaque; > +QEMUIOVector hd_qiov; > +uint64_t curr_off = 0; > +int ret; 
> + > +if (bytes == 0) { > +/* align end of file to a sector boundary to ease reading with > + sector based I/Os */ > +int64_t cluster_offset = bdrv_getlength(bs->file->bs); > +if (cluster_offset < 0) { > +return cluster_offset; > +} > +return bdrv_truncate(bs->file, cluster_offset, PREALLOC_MODE_OFF, > NULL); > +} > + > +if (offset_into_cluster(s, offset)) { > +return -EINVAL; > +} > + > +qemu_iovec_init(_qiov, qiov->niov); > +do { > +uint32_t chunk_size; > + > +qemu_iovec_reset(_qiov); > +chunk_size = MIN(bytes, s->cluster_size); > +qemu_iovec_concat(_qiov, qiov, curr_off, chunk_size); > + > +ret = qcow2_co_pwritev_cluster_compressed(bs, offset + curr_off, > + chunk_size, _qiov); > +if (ret == -ENOTSUP) { Why this? I mean, I can see the appeal, but then we should probably actually return -ENOTSUP somewhere (e.g. for unaligned clusters and the like) -- and we should not abort if offset_into_cluster(s, cluster) is true, but we should write the header uncompressed and compress the main bulk. Max > +ret = qcow2_co_pwritev(bs, offset + curr_off, chunk_size, > + _qiov, 0); > +} > +if (ret < 0) { > +break; > +} > +curr_off += chunk_size; > +bytes
Re: [Qemu-block] [PATCH for-2.11] qcow2: Fix overly broad madvise()
On 2017-11-15 10:09, Alberto Garcia wrote: > On Tue 14 Nov 2017 07:41:27 PM CET, Max Reitz wrote: >> @mem_size and @offset are both size_t, thus subtracting them from one >> another will just return a big size_t if mem_size < offset -- even more >> obvious here because the result is stored in another size_t. >> >> Checking that result to be positive is therefore not sufficient to >> excluse the case that offset > mem_size. Thus, we currently sometimes >> issue an madvise() over a very large address range. >> >> This is triggered by iotest 163, but with -m64, this does not result in >> tangible problems. But with -m32, this test produces three segfaults, >> all of which are fixed by this patch. >> >> Signed-off-by: Max Reitz> > Oh, I guess this happens when the page size is larger than the cluster > size? Otherwise I don't see how... > > Reviewed-by: Alberto Garcia Yes, the test uses 512 byte clusters. Max
Re: [Qemu-block] [PATCH for-2.11] qcow2: Fix overly broad madvise()
Should have said that this is subject to the typo that Eric pointed out,
of course.

Thanks,

Darren.

On Wed, Nov 15, 2017 at 11:04:19AM +, Darren Kenny wrote:
> FWIW,
>
> Reviewed-by: Darren Kenny
>
> Thanks,
>
> Darren.
>
> On Tue, Nov 14, 2017 at 07:41:27PM +0100, Max Reitz wrote:
>> @mem_size and @offset are both size_t, thus subtracting them from one
>> another will just return a big size_t if mem_size < offset -- even more
>> obvious here because the result is stored in another size_t.
>>
>> Checking that result to be positive is therefore not sufficient to
>> excluse the case that offset > mem_size. Thus, we currently sometimes
>> issue an madvise() over a very large address range.
>>
>> This is triggered by iotest 163, but with -m64, this does not result in
>> tangible problems. But with -m32, this test produces three segfaults,
>> all of which are fixed by this patch.
>>
>> Signed-off-by: Max Reitz
>> ---
>>  block/qcow2-cache.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
>> index 75746a7f43..5222a7b94d 100644
>> --- a/block/qcow2-cache.c
>> +++ b/block/qcow2-cache.c
>> @@ -73,7 +73,7 @@ static void qcow2_cache_table_release(BlockDriverState *bs, Qcow2Cache *c,
>>      size_t mem_size = (size_t) s->cluster_size * num_tables;
>>      size_t offset = QEMU_ALIGN_UP((uintptr_t) t, align) - (uintptr_t) t;
>>      size_t length = QEMU_ALIGN_DOWN(mem_size - offset, align);
>> -    if (length > 0) {
>> +    if (mem_size > offset && length > 0) {
>>          madvise((uint8_t *) t + offset, length, MADV_DONTNEED);
>>      }
>>  #endif
>> --
>> 2.13.6
Re: [Qemu-block] [PATCH for-2.11] qcow2: Fix overly broad madvise()
FWIW,

Reviewed-by: Darren Kenny

Thanks,

Darren.

On Tue, Nov 14, 2017 at 07:41:27PM +0100, Max Reitz wrote:
> @mem_size and @offset are both size_t, thus subtracting them from one
> another will just return a big size_t if mem_size < offset -- even more
> obvious here because the result is stored in another size_t.
>
> Checking that result to be positive is therefore not sufficient to
> excluse the case that offset > mem_size. Thus, we currently sometimes
> issue an madvise() over a very large address range.
>
> This is triggered by iotest 163, but with -m64, this does not result in
> tangible problems. But with -m32, this test produces three segfaults,
> all of which are fixed by this patch.
>
> Signed-off-by: Max Reitz
> ---
>  block/qcow2-cache.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
> index 75746a7f43..5222a7b94d 100644
> --- a/block/qcow2-cache.c
> +++ b/block/qcow2-cache.c
> @@ -73,7 +73,7 @@ static void qcow2_cache_table_release(BlockDriverState *bs, Qcow2Cache *c,
>      size_t mem_size = (size_t) s->cluster_size * num_tables;
>      size_t offset = QEMU_ALIGN_UP((uintptr_t) t, align) - (uintptr_t) t;
>      size_t length = QEMU_ALIGN_DOWN(mem_size - offset, align);
> -    if (length > 0) {
> +    if (mem_size > offset && length > 0) {
>          madvise((uint8_t *) t + offset, length, MADV_DONTNEED);
>      }
>  #endif
> --
> 2.13.6
[Qemu-block] [PATCH v2] qapi: block-core: Clarify events emitted by 'block-job-cancel'
When you cancel an in-progress live block operation with QMP
`block-job-cancel`, it emits the event: BLOCK_JOB_CANCELLED.

However, when `block-job-cancel` is issued after `drive-mirror` has
indicated (by emitting the event BLOCK_JOB_READY) that the source and
destination remain synchronized:

    [...] # Snip `drive-mirror` invocation & outputs
    {
      "execute":"block-job-cancel",
      "arguments":{
        "device":"virtio0"
      }
    }
    {"return": {}}

It (`block-job-cancel`) will counterintuitively emit the event
'BLOCK_JOB_COMPLETED':

    {
      "timestamp":{
        "seconds":1510678024,
        "microseconds":526240
      },
      "event":"BLOCK_JOB_COMPLETED",
      "data":{
        "device":"virtio0",
        "len":41126400,
        "offset":41126400,
        "speed":0,
        "type":"mirror"
      }
    }

But this is expected behaviour, where the _COMPLETED event indicates that
synchronization has successfully ended (and the destination has a
point-in-time copy, which is at the time of cancel). So add a small note
to this effect.

(Thanks: Max Reitz for reminding me of this on IRC.)

Signed-off-by: Kashyap Chamarthy
---
Changes in v2:
 - "Note:" seems to be a special construct in Patchew, my usage caused a
   build failure. So do: s/Note:/Note that/
 - Add the missing 'Signed-off-by'
---
 qapi/block-core.json | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index ab96e348e6317bb42769ae20f4a4519bac02e93a..4ecfd1fbc5bc231c69da15d489c44e3e8b0706a5 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -2065,6 +2065,14 @@
 # BLOCK_JOB_CANCELLED event. Before that happens the job is still visible when
 # enumerated using query-block-jobs.
 #
+# Note that the 'block-job-cancel' command will emit the event
+# BLOCK_JOB_COMPLETED if you issue it ('block-job-cancel') after 'drive-mirror'
+# has indicated (by emitting the event BLOCK_JOB_READY) that the source and
+# destination remain synchronized.
In this case, the BLOCK_JOB_COMPLETED
+# event indicates that synchronization (from 'drive-mirror') has successfully
+# ended and the destination now has a point-in-time copy, which is at the time
+# of cancel.
+#
 # For streaming, the image file retains its backing file unless the streaming
 # operation happens to complete just as it is being cancelled. A new streaming
 # operation can be started at a later time to finish copying all data from the
--
2.9.5
Re: [Qemu-block] [PATCH for-2.11] qcow2: Fix overly broad madvise()
On Tue 14 Nov 2017 07:41:27 PM CET, Max Reitz wrote:
> @mem_size and @offset are both size_t, thus subtracting them from one
> another will just return a big size_t if mem_size < offset -- even more
> obvious here because the result is stored in another size_t.
>
> Checking that result to be positive is therefore not sufficient to
> excluse the case that offset > mem_size. Thus, we currently sometimes
> issue an madvise() over a very large address range.
>
> This is triggered by iotest 163, but with -m64, this does not result in
> tangible problems. But with -m32, this test produces three segfaults,
> all of which are fixed by this patch.
>
> Signed-off-by: Max Reitz

Oh, I guess this happens when the page size is larger than the cluster
size? Otherwise I don't see how...

Reviewed-by: Alberto Garcia

Berto
Re: [Qemu-block] [Qemu-devel] [PATCH] qapi: block-core: Clarify events emitted by 'block-job-cancel'
On Tue, Nov 14, 2017 at 11:26:59AM -0800, no-re...@patchew.org wrote:
> Hi,
>
> This series failed automatic build test. Please find the testing commands and
> their output below. If you have docker installed, you can probably reproduce it
> locally.

[...]

> In file included from /tmp/qemu-test/src/qapi-schema.json:85:
> In file included from /tmp/qemu-test/src/qapi/block.json:7:
> /tmp/qemu-test/src/qapi/block-core.json:2081:1: '@device:' can't follow
> 'Note' section

Hmm, I don't know the syntax well enough to see why the 'Note' section is
not allowed. But it's not strictly needed at all. So I'll just do:

s/Note:/It is worth noting/

> make: *** [qapi-visit.h] Error 1
> make: *** Deleting file `qapi-visit.h'
> make: *** Waiting for unfinished jobs

[...]

--
/kashyap