Re: [Qemu-devel] [PATCH] qemu-img.c: Add examples section
On Thu, 08/02 20:50, John Arbuckle wrote: > Add an examples section to the help output. > > Signed-off-by: John Arbuckle > --- > qemu-img.c | 11 +++ > 1 file changed, 11 insertions(+) > > diff --git a/qemu-img.c b/qemu-img.c > index 1acddf693c..f77c82695d 100644 > --- a/qemu-img.c > +++ b/qemu-img.c > @@ -199,6 +199,17 @@ static void QEMU_NORETURN help(void) > > printf("%s\nSupported formats:", help_msg); > bdrv_iterate_format(format_print, NULL); > + > +printf("\n\nExamples:\n\n" > + "Create: qemu-img create -f qcow2 image.qcow2 10G\n\n" > + "Info: qemu-img info image.qcow2\n\n" > + "Resize: qemu-img resize image.qcow2 20G\n\n" > + "Convert: qemu-img convert -f raw -O qcow2 image.img > image.qcow2\n\n" > + "Check: qemu-img check image.qcow2\n\n" > + "Map: qemu-img map -f qcow2 --output=human image.qcow2\n\n" > + "Rebase: qemu-img rebase -b new_backing_file.qcow2 image.qcow2" > + ); The text looks good but maybe it's better to condense the section by using only one \n between command lines instead of two? Fam > + > printf("\n\n" QEMU_HELP_BOTTOM "\n"); > exit(EXIT_SUCCESS); > } > -- > 2.14.3 (Apple Git-98) > >
Re: [Qemu-devel] 'make vm-build-freebsd' don't work if KVM isn't enabled
On Mon, 07/30 16:20, Peter Maydell wrote: > On 30 July 2018 at 14:40, Peter Maydell wrote: > > On 30 July 2018 at 14:36, Peter Maydell wrote: > >> On 30 July 2018 at 14:23, Peter Maydell wrote: > >>> The tests in tests/vm/ seem to make some attempt to cope with the > >>> host system not allowing the user to use KVM, but it doesn't quite > >>> work. > >> > >> Also, passing V=1 to the outer make command like: > >> make vm-build-freebsd J=16 V=1 > >> isn't propagated through to the VM: test output looks like > >> GTESTER check-qtest-aarch64 > >> GTESTER check-qtest-alpha > >> instead of the proper verbose output with a line per subtest. > > > > Nor is the J=16 option passed through to "make check": > > Patches for these things on the way, once I've completed a test run. All good catches! Looking forward to the fixes. Thanks. Fam
Re: [Qemu-devel] [PATCH for-3.1] s390x: remove 's390-squash-mcss' option
On Wed, Aug 1, 2018 at 9:18 PM Cornelia Huck wrote: > > On Wed, 1 Aug 2018 15:11:23 +0200 > Cornelia Huck wrote: > > > On Wed, 1 Aug 2018 14:54:51 +0200 > > Cornelia Huck wrote: > > > > > On Wed, 1 Aug 2018 18:21:27 +0800 > > > Fam Zheng wrote: > > > > > > > On Tue, 07/24 11:24, Cornelia Huck wrote: > > > > > diff --git a/hw/s390x/css-bridge.c b/hw/s390x/css-bridge.c > > > > > index a02d708239..1bd6c8b458 100644 > > > > > --- a/hw/s390x/css-bridge.c > > > > > +++ b/hw/s390x/css-bridge.c > > > > > @@ -106,7 +106,6 @@ VirtualCssBus *virtual_css_bus_init(void) > > > > > /* Create bus on bridge device */ > > > > > bus = qbus_create(TYPE_VIRTUAL_CSS_BUS, dev, "virtual-css"); > > > > > cbus = VIRTUAL_CSS_BUS(bus); > > > > > > > > Not used now? > > > > > > > > Fam > > > > > > Indeed, we can ditch the cbus variable. > > > > ...or not :) We still need it for the return value, which is processed > > in ccw_init(). We could change the return code of the function to > > BusState, but I'm not sure it is worth the hassle. > > ...but we can indeed get rid of the cbus and qbus variables in > s390_ccw_realize(). I got this from a patchew.org test (make docker-test-clang@ubuntu): /tmp/qemu-test/src/hw/s390x/3270-ccw.c:103:20: error: unused variable 'cbus' [-Werror,-Wunused-variable] VirtualCssBus *cbus = VIRTUAL_CSS_BUS(qbus); ^ 1 error generated. /tmp/qemu-test/src/rules.mak:69: recipe for target 'hw/s390x/3270-ccw.o' failed
Re: [Qemu-devel] [PATCH for-3.1] s390x: remove 's390-squash-mcss' option
On Tue, 07/24 11:24, Cornelia Huck wrote: > diff --git a/hw/s390x/css-bridge.c b/hw/s390x/css-bridge.c > index a02d708239..1bd6c8b458 100644 > --- a/hw/s390x/css-bridge.c > +++ b/hw/s390x/css-bridge.c > @@ -106,7 +106,6 @@ VirtualCssBus *virtual_css_bus_init(void) > /* Create bus on bridge device */ > bus = qbus_create(TYPE_VIRTUAL_CSS_BUS, dev, "virtual-css"); > cbus = VIRTUAL_CSS_BUS(bus); Not used now? Fam
Re: [Qemu-devel] [PATCH] scsi: mptsas: Mark as storage device
On Tue, 07/31 15:28, Guenter Roeck wrote: > mptsas1068 is currently listed as uncategorized device. > Mark it as storage device. > > Signed-off-by: Guenter Roeck > --- > hw/scsi/mptsas.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/hw/scsi/mptsas.c b/hw/scsi/mptsas.c > index 4176e87..929404f 100644 > --- a/hw/scsi/mptsas.c > +++ b/hw/scsi/mptsas.c > @@ -1431,6 +1431,7 @@ static void mptsas1068_class_init(ObjectClass *oc, void > *data) > dc->reset = mptsas_reset; > dc->vmsd = _mptsas; > dc->desc = "LSI SAS 1068"; > +set_bit(DEVICE_CATEGORY_STORAGE, dc->categories); > } > > static const TypeInfo mptsas_info = { > -- > 2.7.4 > Reviewed-by: Fam Zheng
Re: [Qemu-devel] [PATCH 0/2] iotests: Fix 226 on _my_ system
On Fri, Jul 27, 2018 at 6:04 PM Kevin Wolf wrote: > > Am 27.07.2018 um 11:24 hat Fam Zheng geschrieben: > > On Tue, 07/24 16:47, Fam Zheng wrote: > > > Something has locked /dev/null on my system (I still don't know what to > > > do with > > > the annoying incapability of lslocks, or more precisely /proc/locks, on > > > inspecting OFD lock information), and as a result 226 cannot pass due to > > > the > > > unexpected image locking error. > > > > > > Fix the test case by disabling locking, and add a doc text about using > > > test > > > images. > > > > > > Fam Zheng (2): > > > docs: Describe using images in writing iotests > > > iotests: Don't lock /dev/null in 226 > > > > > > docs/devel/testing.rst | 11 +++ > > > tests/qemu-iotests/226 | 4 ++-- > > > 2 files changed, 13 insertions(+), 2 deletions(-) > > > > Kevin, could you apply this for 3.0 too? > > Sure, applied to the block branch. > > Out of curiosity, did you find out what had locked the devices? No, but I'll for sure allocate some time to understand kernel's OFD internals, find the holder application, then hopefully, propose something that can fix lslocks. Thanks, Fam > > Kevin
Re: [Qemu-devel] [PATCH RFC for-3.0-rc3 0/3] qemu-img: Disable copy offloading by default
On Fri, Jul 27, 2018 at 6:29 PM Kevin Wolf wrote: > > Am 27.07.2018 um 05:33 hat Fam Zheng geschrieben: > > Kevin pointed out that both glibc and kernel provides a slow fallback of > > copy_file_range which hurts thin provisioning. This is particularly true for > > thin LVs, because host_device driver cannot get allocation info from the > > volume, and copy_file_range is called on every sectors, making the dst fully > > allocated. > > > > NFS mount points also doesn't support SEEK_DATA well, so the allocation > > information is unknown to QEMU. > > > > That leaves only iscsi:// which seems to do what we want so far, but it is a > > smaller use case. > > > > Add an option to qemu-img convert, "-C", to enable (attempting) copy > > offloading > > explicitly. And mark it incompatible with "-S" and "-c". > > Reviewed-by: Kevin Wolf > > Not sure why you made this an RFC only, but I think we absolutely need > this. People are used to using 'qemu-img convert' to compact images and > this would regress with automatic copy offloading. > > Do you think we need more discussion? I think merging this for 3.0 is the right thing do to. What worries me is the general usability of the feature. We could probably explore ideas about how we can better take advantage of copy offloading. I don't think counting on the user to make the right decision between disk efficiency (thin provisioning) and BW efficiency (copy offloading) will ever work. Even if we don't care about breaking the default '-S 4k' behavior, the lack of SEEK_DATA/SEEK_HOLE support on host NFS and block devices will make it very hard to use. Making it worse, if the network to NFS server is good enough, convert with pread64/pwrite64 with host page cache is also more efficient than copy_file_range, so we'll be slower by trying to play clever. :( Any thought? Fam
Re: [Qemu-devel] [PATCH 0/2] iotests: Fix 226 on _my_ system
On Tue, 07/24 16:47, Fam Zheng wrote: > Something has locked /dev/null on my system (I still don't know what to do > with > the annoying incapability of lslocks, or more precisely /proc/locks, on > inspecting OFD lock information), and as a result 226 cannot pass due to the > unexpected image locking error. > > Fix the test case by disabling locking, and add a doc text about using test > images. > > Fam Zheng (2): > docs: Describe using images in writing iotests > iotests: Don't lock /dev/null in 226 > > docs/devel/testing.rst | 11 +++ > tests/qemu-iotests/226 | 4 ++-- > 2 files changed, 13 insertions(+), 2 deletions(-) Kevin, could you apply this for 3.0 too? Fam
Re: [Qemu-devel] [PATCH] file-posix: Handle EINTR in preallocation=full write
On Fri, 07/27 14:53, Fam Zheng wrote: > Cc: qemu-sta...@nongnu.org > Signed-off-by: Fam Zheng Actually cc qemu-sta...@nongnu.org
Re: [Qemu-devel] [RFC PATCH] travis: Test out-of-tree builds
On Thu, 06/21 09:26, Philippe Mathieu-Daudé wrote: > Force one config to build 'out-of-tree' (object files and executables > are created in a tree outside the project source code). > > Signed-off-by: Philippe Mathieu-Daudé > --- > I noticed various out-of-tree issue in the last 2 merge windows. > > Pseudo-randomly picked a build from the matrix which covers system + user. > > .travis.yml | 8 +++- > 1 file changed, 7 insertions(+), 1 deletion(-) > > diff --git a/.travis.yml b/.travis.yml > index fabfe9ec34..02e096f6a9 100644 > --- a/.travis.yml > +++ b/.travis.yml > @@ -50,6 +50,8 @@ notifications: > on_failure: always > env: >global: > +- SRC_DIR="." > +- BUILD_DIR="." > - TEST_CMD="make check" > - MAKEFLAGS="-j3" >matrix: > @@ -69,11 +71,15 @@ before_install: >- wget -O - > http://people.linaro.org/~alex.bennee/qemu-submodule-git-seed.tar.xz | tar > -xvJ >- git submodule update --init --recursive > before_script: > - - ./configure ${CONFIG} || { cat config.log && exit 1; } > + - mkdir -p ${BUILD_DIR} && cd ${BUILD_DIR} > + - ${SRC_DIR}/configure ${CONFIG} || { cat config.log && exit 1; } > script: >- make ${MAKEFLAGS} && ${TEST_CMD} > matrix: >include: > +# Test out-of-tree builds > +- env: CONFIG="--enable-debug --enable-debug-tcg" > + BUILD_DIR="out-of-tree/build/dir" SRC_DIR="../../.." Is a three-level nesting really necessary? I think a simple './build' dir should do. > # Test with Clang for compile portability (Travis uses clang-5.0) > - env: CONFIG="--disable-system" >compiler: clang > -- > 2.18.0.rc2 > Fam
[Qemu-devel] [PATCH] tests/vm: Add vm-build-all/vm-clean-all in help text
Suggested-by: Philippe Mathieu-Daudé Signed-off-by: Fam Zheng --- tests/vm/Makefile.include | 3 +++ 1 file changed, 3 insertions(+) diff --git a/tests/vm/Makefile.include b/tests/vm/Makefile.include index 5daa2a3b73..9e19f8662f 100644 --- a/tests/vm/Makefile.include +++ b/tests/vm/Makefile.include @@ -14,6 +14,9 @@ vm-test: @echo " vm-build-freebsd- Build QEMU in FreeBSD VM" @echo " vm-build-netbsd - Build QEMU in NetBSD VM" @echo " vm-build-openbsd- Build QEMU in OpenBSD VM" + @echo "" + @echo " vm-build-all- Build QEMU in all VMs" + @echo " vm-clean-all- Clean up VM images" vm-build-all: $(addprefix vm-build-, $(IMAGES)) -- 2.17.1
Re: [Qemu-devel] [PATCH RFC 00/10] docker on non-x86 hosts
On Wed, 07/18 11:04, Alex Bennée wrote: > Hi, > > Our existing support for docker is fairly x86 centric. While docker > itself has support for multiple architectures not all architectures > are equal. For example Debian only packages the widest range of > cross-compilers in it's x86 images (although for example armhf is > available on aarch64 based images). > > As the binfmt support is getting more solid we always have that fall > back option and I've converted a number of the guests that way. > Unfortunately I've overloaded the meaning of DOCKER_PARTIAL_IMAGES but > I was loathed to throw in yet another magic variable without some > discussion first. I'm not sure if my current path will end up with a > maze of twisty if/else statements all subtly alike. Just my two cents. So far it looks reasonable. Though I guess we could give DOCKER_PARTIAL_IMAGES a better name or at least better documentation. Would a per-arch variable list look cleaner? Like DOCKER_IMAGES_GENERIC = debian DOCKER_IMAGES_X86_64 = ubuntu fedora centos7 DOCKER_PARTIAL_IMAGES_X86_64 = ... DOCKER_IMAGES_AARCH64 = ... DOCKER_PARTIAL_IMAGES_AARCH64 = ... Then the actual host arch can be used to expand rules: docker-all-images: $(DOCKER_IMAGES_GENERIC) docker-all-images: $(DOCKER_IMAGES_$(shell uname -m | tr '[a-z]' '[A-Z])) Fam > > Thoughts? > > Alex Bennée (10): > docker: rename docker-amd64 to docker-host > docker: change docker-image to docker-all-images target > docker: add a placeholder for handling non-x86 hosts > docker: don't include docker-arm64-cross on aarch64 hosts > docker: fall-back to binfmt_misc debian-mips64el-user-cross on non-x86 > docker: fall-back to binfmt_misc debian-ppc64el-user-cross on non-x86 > docker: fall-back to binfmt_misc debian-s390x-user-cross on non-x86 > docker: disable additional non-x86 images > tests: tcg skip docker images we can't build > tests/tcg: debian-mips64el-user-cross fallback > > Makefile | 4 +- > tests/docker/Makefile.include | 58 +-- > ...debian-amd64.docker => debian-host.docker} | 4 +- > .../debian-mips64el-user-cross.docker | 16 + > .../debian-ppc64el-user-cross.docker | 16 + > .../debian-s390x-user-cross.docker| 16 + > tests/tcg/Makefile.include| 3 + > tests/tcg/Makefile.probe | 2 +- > tests/tcg/mips/Makefile.include | 5 ++ > 9 files changed, 116 insertions(+), 8 deletions(-) > rename tests/docker/dockerfiles/{debian-amd64.docker => debian-host.docker} > (91%) > create mode 100644 tests/docker/dockerfiles/debian-mips64el-user-cross.docker > create mode 100644 tests/docker/dockerfiles/debian-ppc64el-user-cross.docker > create mode 100644 tests/docker/dockerfiles/debian-s390x-user-cross.docker > > -- > 2.17.1 >
Re: [Qemu-devel] [PATCH RFC 01/10] docker: rename docker-amd64 to docker-host
On Wed, 07/18 11:04, Alex Bennée wrote: > When building on non-x86 systems the base system will be correct so if > we avoid too many x86'isms in the install we can still use the image. > > Signed-off-by: Alex Bennée > --- > .../dockerfiles/{debian-amd64.docker => debian-host.docker} | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > rename tests/docker/dockerfiles/{debian-amd64.docker => debian-host.docker} > (91%) > > diff --git a/tests/docker/dockerfiles/debian-amd64.docker > b/tests/docker/dockerfiles/debian-host.docker > similarity index 91% > rename from tests/docker/dockerfiles/debian-amd64.docker > rename to tests/docker/dockerfiles/debian-host.docker > index eb13f06ed1..3605cc4658 100644 > --- a/tests/docker/dockerfiles/debian-amd64.docker > +++ b/tests/docker/dockerfiles/debian-host.docker > @@ -30,9 +30,9 @@ RUN cd /usr/src/virglrenderer && ./autogen.sh && > ./configure --with-glx --disabl > # netmap > RUN DEBIAN_FRONTEND=noninteractive eatmydata \ > apt-get install -y --no-install-recommends \ > -linux-headers-amd64 > +linux-headers-$(dpkg --print-architecture) > RUN git clone https://github.com/luigirizzo/netmap.git /usr/src/netmap > -RUN cd /usr/src/netmap/LINUX && ./configure --no-drivers --no-apps > --kernel-dir=$(ls -d /usr/src/linux-headers-*-amd64) && make install > +RUN cd /usr/src/netmap/LINUX && ./configure --no-drivers --no-apps > --kernel-dir=$(ls -d /usr/src/linux-headers-*) && make install Curious: why isn't this "ls -d linux-headers-*-$(dpkg --print-architecture)", like above? > ENV QEMU_CONFIGURE_OPTS --enable-netmap > > # gcrypt > -- > 2.17.1 >
[Qemu-devel] [PATCH] file-posix: Handle EINTR in preallocation=full write
Cc: qemu-sta...@nongnu.org Signed-off-by: Fam Zheng --- block/file-posix.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/block/file-posix.c b/block/file-posix.c index ad299beb38..928b863ced 100644 --- a/block/file-posix.c +++ b/block/file-posix.c @@ -1646,6 +1646,9 @@ static int handle_aiocb_truncate(RawPosixAIOData *aiocb) num = MIN(left, 65536); result = write(fd, buf, num); if (result < 0) { +if (errno == EINTR) { +continue; +} result = -errno; error_setg_errno(errp, -result, "Could not write zeros for preallocation"); -- 2.17.1
[Qemu-devel] [PATCH RFC for-3.0-rc3 3/3] iotests: Add test for 'qemu-img convert -C' compatibility
Signed-off-by: Fam Zheng --- tests/qemu-iotests/082 | 8 tests/qemu-iotests/082.out | 11 +++ 2 files changed, 19 insertions(+) diff --git a/tests/qemu-iotests/082 b/tests/qemu-iotests/082 index a872f771a6..3e605d52d1 100755 --- a/tests/qemu-iotests/082 +++ b/tests/qemu-iotests/082 @@ -157,6 +157,14 @@ run_qemu_img convert -o help # Try help option for a format that does not support creation run_qemu_img convert -O bochs -o help +echo +echo === convert: -C and other options === + +# Adding the help option to a command without other -o options +run_qemu_img convert -C -S 4k -O $IMGFMT "$TEST_IMG" "$TEST_IMG".target +run_qemu_img convert -C -S 8k -O $IMGFMT "$TEST_IMG" "$TEST_IMG".target +run_qemu_img convert -C -c -O $IMGFMT "$TEST_IMG" "$TEST_IMG".target + echo echo === amend: Options specified more than once === diff --git a/tests/qemu-iotests/082.out b/tests/qemu-iotests/082.out index 60ef87c276..19e9fb13ff 100644 --- a/tests/qemu-iotests/082.out +++ b/tests/qemu-iotests/082.out @@ -508,6 +508,17 @@ size Virtual disk size Testing: convert -O bochs -o help qemu-img: Format driver 'bochs' does not support image creation +=== convert: -C and other options === + +Testing: convert -C -S 4k -O qcow2 TEST_DIR/t.qcow2 TEST_DIR/t.qcow2.target +qemu-img: Cannot enable copy offloading when -S is used + +Testing: convert -C -S 8k -O qcow2 TEST_DIR/t.qcow2 TEST_DIR/t.qcow2.target +qemu-img: Cannot enable copy offloading when -S is used + +Testing: convert -C -c -O qcow2 TEST_DIR/t.qcow2 TEST_DIR/t.qcow2.target +qemu-img: Cannot enable copy offloading when -c is used + === amend: Options specified more than once === Testing: amend -f foo -f qcow2 -o lazy_refcounts=on TEST_DIR/t.qcow2 -- 2.17.1
[Qemu-devel] [PATCH RFC for-3.0-rc3 2/3] qemu-img: Add -C option for convert with copy offloading
Signed-off-by: Fam Zheng --- qemu-img-cmds.hx | 2 +- qemu-img.c | 21 + qemu-img.texi| 8 +++- 3 files changed, 25 insertions(+), 6 deletions(-) diff --git a/qemu-img-cmds.hx b/qemu-img-cmds.hx index 69758fb6e8..1526f327a5 100644 --- a/qemu-img-cmds.hx +++ b/qemu-img-cmds.hx @@ -44,7 +44,7 @@ STEXI ETEXI DEF("convert", img_convert, -"convert [--object objectdef] [--image-opts] [--target-image-opts] [-U] [-c] [-p] [-q] [-n] [-f fmt] [-t cache] [-T src_cache] [-O output_fmt] [-B backing_file] [-o options] [-l snapshot_param] [-S sparse_size] [-m num_coroutines] [-W] filename [filename2 [...]] output_filename") +"convert [--object objectdef] [--image-opts] [--target-image-opts] [-U] [-C] [-c] [-p] [-q] [-n] [-f fmt] [-t cache] [-T src_cache] [-O output_fmt] [-B backing_file] [-o options] [-l snapshot_param] [-S sparse_size] [-m num_coroutines] [-W] filename [filename2 [...]] output_filename") STEXI @item convert [--object @var{objectdef}] [--image-opts] [--target-image-opts] [-U] [-c] [-p] [-q] [-n] [-f @var{fmt}] [-t @var{cache}] [-T @var{src_cache}] [-O @var{output_fmt}] [-B @var{backing_file}] [-o @var{options}] [-l @var{snapshot_param}] [-S @var{sparse_size}] [-m @var{num_coroutines}] [-W] @var{filename} [@var{filename2} [...]] @var{output_filename} ETEXI diff --git a/qemu-img.c b/qemu-img.c index 9b7506b8ae..1acddf693c 100644 --- a/qemu-img.c +++ b/qemu-img.c @@ -2024,11 +2024,12 @@ static int img_convert(int argc, char **argv) skip_create = false, progress = false, tgt_image_opts = false; int64_t ret = -EINVAL; bool force_share = false; +bool explict_min_sparse = false; ImgConvertState s = (ImgConvertState) { /* Need at least 4k of zeros for sparse detection */ .min_sparse = 8, -.copy_range = true, +.copy_range = false, .buf_sectors= IO_BUF_SIZE / BDRV_SECTOR_SIZE, .wr_in_order= true, .num_coroutines = 8, @@ -2043,7 +2044,7 @@ static int img_convert(int argc, char **argv) {"target-image-opts", no_argument, 0, OPTION_TARGET_IMAGE_OPTS}, {0, 0, 0, 0} }; -c = getopt_long(argc, argv, ":hf:O:B:co:l:S:pt:T:qnm:WU", +c = getopt_long(argc, argv, ":hf:O:B:Cco:l:S:pt:T:qnm:WU", long_options, NULL); if (c == -1) { break; @@ -2067,9 +2068,11 @@ static int img_convert(int argc, char **argv) case 'B': out_baseimg = optarg; break; +case 'C': +s.copy_range = true; +break; case 'c': s.compressed = true; -s.copy_range = false; break; case 'o': if (!is_valid_option_list(optarg)) { @@ -2112,7 +2115,7 @@ static int img_convert(int argc, char **argv) } s.min_sparse = sval / BDRV_SECTOR_SIZE; -s.copy_range = false; +explict_min_sparse = true; break; } case 'p': @@ -2172,6 +2175,16 @@ static int img_convert(int argc, char **argv) goto fail_getopt; } +if (s.compressed && s.copy_range) { +error_report("Cannot enable copy offloading when -c is used"); +goto fail_getopt; +} + +if (explict_min_sparse && s.copy_range) { +error_report("Cannot enable copy offloading when -S is used"); +goto fail_getopt; +} + if (tgt_image_opts && !skip_create) { error_report("--target-image-opts requires use of -n flag"); goto fail_getopt; diff --git a/qemu-img.texi b/qemu-img.texi index aeb1b9e66c..3b6710a580 100644 --- a/qemu-img.texi +++ b/qemu-img.texi @@ -169,6 +169,12 @@ Number of parallel coroutines for the convert process Allow out-of-order writes to the destination. This option improves performance, but is only recommended for preallocated devices like host devices or other raw block devices. +@item -C +Try to use copy offloading to move data from source image to target. This may +improve performance if the data is remote, such as with NFS or iSCSI backends, +but will not automatically sparsify zero sectors, and may result in a fully +allocated target image depending on the host support for getting allocation +information. @end table Parameters to dd subcommand: @@ -319,7 +325,7 @@ Error on reading data @end table -@item convert [--object @var{objectdef}] [--image-opts] [--target-image-opts] [-U] [-c] [-p] [-q] [-n] [-f @var{fmt}] [-t @var{cache}] [-T @var{src_cache}] [-O @var{output_fmt}] [-B @var{backing_file}] [-o @var{options}] [-l @var{snapshot_param}] [-S @var{sparse_size}] [-m @var{num_coroutines}] [-W] @var{filename} [@var{filename2} [...]] @var{output_filename} +@item convert [--object @var{objectdef}] [--im
[Qemu-devel] [PATCH RFC for-3.0-rc3 1/3] Revert "qemu-img: Document copy offloading implications with -S and -c"
This reverts commit eb461485f4558e362fab905735b50987505bca44. Now that we introduce an explicit option, these implicit rules are not used. Signed-off-by: Fam Zheng --- qemu-img.texi | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/qemu-img.texi b/qemu-img.texi index 5853cd18d1..aeb1b9e66c 100644 --- a/qemu-img.texi +++ b/qemu-img.texi @@ -96,8 +96,7 @@ will enumerate information about backing files in a disk image chain. Refer below for further description. @item -c -indicates that target image must be compressed (qcow format only). If this -option is used, copy offloading will not be attempted. +indicates that target image must be compressed (qcow format only) @item -h with or without a command shows help and lists the supported formats @@ -116,8 +115,7 @@ in case both @var{-q} and @var{-p} options are used. indicates the consecutive number of bytes that must contain only zeros for qemu-img to create a sparse image during conversion. This value is rounded down to the nearest 512 bytes. You may use the common size suffixes like -@code{k} for kilobytes. If this option is used, copy offloading will not be -attempted. +@code{k} for kilobytes. @item -t @var{cache} specifies the cache mode that should be used with the (destination) file. See -- 2.17.1
[Qemu-devel] [PATCH RFC for-3.0-rc3 0/3] qemu-img: Disable copy offloading by default
Kevin pointed out that both glibc and kernel provides a slow fallback of copy_file_range which hurts thin provisioning. This is particularly true for thin LVs, because host_device driver cannot get allocation info from the volume, and copy_file_range is called on every sectors, making the dst fully allocated. NFS mount points also doesn't support SEEK_DATA well, so the allocation information is unknown to QEMU. That leaves only iscsi:// which seems to do what we want so far, but it is a smaller use case. Add an option to qemu-img convert, "-C", to enable (attempting) copy offloading explicitly. And mark it incompatible with "-S" and "-c". Fam Zheng (3): Revert "qemu-img: Document copy offloading implications with -S and -c" qemu-img: Add -C option for convert with copy offloading iotests: Add test for 'qemu-img convert -C' compatibility qemu-img-cmds.hx | 2 +- qemu-img.c | 21 + qemu-img.texi | 14 +- tests/qemu-iotests/082 | 8 tests/qemu-iotests/082.out | 11 +++ 5 files changed, 46 insertions(+), 10 deletions(-) -- 2.17.1
Re: [Qemu-devel] [PATCH v3 0/3] arm: Add nRF51 SoC and micro:bit machine
On Thu, 07/26 12:06, Joel Stanley wrote: > v3: Rebase on Stefan's cortex-m0 series > Based-on: 20180725085944.11856-8-stefa...@redhat.com Correction: Based-on: 20180725085944.11856-1-stefa...@redhat.com (Point to the cover letter message) Fam
Re: [Qemu-devel] [PATCH v3 for 3.0 00/18] docker fixes (and one tcg test tweak)
On Tue, Jul 24, 2018 at 4:56 PM Alex Bennée wrote: > > > Fam Zheng writes: > > > On Mon, Jul 23, 2018 at 6:03 PM Alex Bennée wrote: > >> > >> > >> Alex Bennée writes: > >> > >> > Hi, > >> > > >> > I've missed the boat for today's rc1 but I'd like to get this merged > >> > before rc2. The new docker.py change is technically new functionality > >> > but I'm counting it as a usability bug fix as it replaces a random > >> > back trace failure with a preemptive failure and message mentioning > >> > binfmt_misc configuration. This would have saved Richard a lot of head > >> > scratching as he tried to setup a powerpc-user setup to test his > >> > setcontext fix (he had a custom binfmt_misc pointing to his src tree). > >> > > >> > Finally we also drop the runcom test. It was cute that it got > >> > resurrected but it is ultimately a pointless test for something I'm > >> > sure no one actually uses. > >> > > >> > There will be a follow-up RFC series after this that cleans-up some of > >> > the rough edges when your host is not an x86_64 box but that series > >> > won't be targeting the 3.0 release. > >> > > >> > : The following patches need review > >> > : patch docker/disable debian powerpc user cross.patch > >> > : patch docker/drop QEMU_TARGET check fallback in EXECUTABLE.patch > >> > : patch docker/Update debootstrap script after Debian migrat.patch > >> > : patch docker/ignore distro versioning of debootstrap.patch > >> > : patch docker/perform basic binfmt_misc validation in docke.patch > >> > : patch tests/tcg remove runcom test.patch > >> > >> Ping? > > > > I had questions about patch 13 and 17. Otherwise looks good. > > Maybe you can drop them for now (including patch 9) and send the PULL for > > the rest. > > I can drop/replace 13 - but 17 was there for a better reporting of a > failure case Peter found. We can certainly expand it and be smarter in > future iterations. Sounds good to me. Fam > > > > I don't know if we can catch up -rc2 but I guess the testing fixes are > > fairly safe to sneak in. :) > > > > (My development machine is affected by > > an > > office power outage, and I was > > also > > busy with upgrading patchew.org. Sorry for the > > > > late reply > > !) > > > > Fam > > > -- > Alex Bennée
[Qemu-devel] [PATCH 1/2] docs: Describe using images in writing iotests
Signed-off-by: Fam Zheng --- docs/devel/testing.rst | 11 +++ 1 file changed, 11 insertions(+) diff --git a/docs/devel/testing.rst b/docs/devel/testing.rst index 5e19cd50da..8e1fa3a66e 100644 --- a/docs/devel/testing.rst +++ b/docs/devel/testing.rst @@ -255,6 +255,17 @@ comparable library support for invoking and interacting with QEMU programs. If you opt for Python, it is strongly recommended to write Python 3 compatible code. +Both Python and Bash frameworks in iotests provide helpers to manage test +images. They can be used to create and clean up images under the test +directory. If no I/O or any protocol specific feature is needed, it is often +more convenient to use the pseudo block driver, ``null-co://``, as the test +image, which doesn't require image creation or cleaning up. Avoid system-wide +devices or files whenever possible, such as ``/dev/null`` or ``/dev/zero``. +Otherwise, image locking implications have to be considered. For example, +another application on the host may have locked the file, possibly leading to a +test failure. If using such devices are explicitly desired, consider adding +``locking=off`` option to disable image locking. + Docker based tests == -- 2.17.1
[Qemu-devel] [PATCH 0/2] iotests: Fix 226 on _my_ system
Something has locked /dev/null on my system (I still don't know what to do with the annoying incapability of lslocks, or more precisely /proc/locks, on inspecting OFD lock information), and as a result 226 cannot pass due to the unexpected image locking error. Fix the test case by disabling locking, and add a doc text about using test images. Fam Zheng (2): docs: Describe using images in writing iotests iotests: Don't lock /dev/null in 226 docs/devel/testing.rst | 11 +++ tests/qemu-iotests/226 | 4 ++-- 2 files changed, 13 insertions(+), 2 deletions(-) -- 2.17.1
[Qemu-devel] [PATCH 2/2] iotests: Don't lock /dev/null in 226
On my system (Fedora 28), this script reports a 'failed to get "consistent read" lock' error. Following docs/devel/testing.rst, it's better to add locking=off here. Signed-off-by: Fam Zheng --- tests/qemu-iotests/226 | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tests/qemu-iotests/226 b/tests/qemu-iotests/226 index 211ea9888a..8ec3e612dd 100755 --- a/tests/qemu-iotests/226 +++ b/tests/qemu-iotests/226 @@ -56,10 +56,10 @@ for PROTO in "file" "host_device" "host_cdrom"; do echo echo "== Testing RO ==" $QEMU_IO -c "open -r -o driver=$PROTO,filename=$TEST_IMG" 2>&1 | _filter_testdir | _filter_imgfmt -$QEMU_IO -c "open -r -o driver=$PROTO,filename=/dev/null" 2>&1 | _filter_imgfmt +$QEMU_IO -c "open -r -o driver=$PROTO,filename=/dev/null,locking=off" 2>&1 | _filter_imgfmt echo "== Testing RW ==" $QEMU_IO -c "open -o driver=$PROTO,filename=$TEST_IMG" 2>&1 | _filter_testdir | _filter_imgfmt -$QEMU_IO -c "open -o driver=$PROTO,filename=/dev/null" 2>&1 | _filter_imgfmt +$QEMU_IO -c "open -o driver=$PROTO,filename=/dev/null,locking=off" 2>&1 | _filter_imgfmt done # success, all done -- 2.17.1
Re: [Qemu-devel] [PATCH v3 for 3.0 00/18] docker fixes (and one tcg test tweak)
On Mon, Jul 23, 2018 at 6:03 PM Alex Bennée wrote: > > > Alex Bennée writes: > > > Hi, > > > > I've missed the boat for today's rc1 but I'd like to get this merged > > before rc2. The new docker.py change is technically new functionality > > but I'm counting it as a usability bug fix as it replaces a random > > back trace failure with a preemptive failure and message mentioning > > binfmt_misc configuration. This would have saved Richard a lot of head > > scratching as he tried to setup a powerpc-user setup to test his > > setcontext fix (he had a custom binfmt_misc pointing to his src tree). > > > > Finally we also drop the runcom test. It was cute that it got > > resurrected but it is ultimately a pointless test for something I'm > > sure no one actually uses. > > > > There will be a follow-up RFC series after this that cleans-up some of > > the rough edges when your host is not an x86_64 box but that series > > won't be targeting the 3.0 release. > > > > : The following patches need review > > : patch docker/disable debian powerpc user cross.patch > > : patch docker/drop QEMU_TARGET check fallback in EXECUTABLE.patch > > : patch docker/Update debootstrap script after Debian migrat.patch > > : patch docker/ignore distro versioning of debootstrap.patch > > : patch docker/perform basic binfmt_misc validation in docke.patch > > : patch tests/tcg remove runcom test.patch > > Ping? I had questions about patch 13 and 17. Otherwise looks good. Maybe you can drop them for now (including patch 9) and send the PULL for the rest. I don't know if we can catch up -rc2 but I guess the testing fixes are fairly safe to sneak in. :) (My development machine is affected by an office power outage, and I was also busy with upgrading patchew.org. Sorry for the late reply !) Fam
Re: [Qemu-devel] [PATCH v3 for 3.0 17/18] docker: perform basic binfmt_misc validation in docker.py
On Wed, Jul 18, 2018 at 4:02 AM Alex Bennée wrote: > > Setting up binfmt_misc is outside of the scope of the docker.py script > but we can at least validate it with any given executable so we have a > more useful error message than the sed line of deboostrap failing > cryptically. > > Signed-off-by: Alex Bennée > Reported-by: Richard Henderson > --- > tests/docker/docker.py | 30 ++ > 1 file changed, 30 insertions(+) > > diff --git a/tests/docker/docker.py b/tests/docker/docker.py > index 523f4b95a2..a3f5b0c1b0 100755 > --- a/tests/docker/docker.py > +++ b/tests/docker/docker.py > @@ -112,6 +112,31 @@ def _copy_binary_with_libs(src, dest_dir): > so_path = os.path.dirname(l) > _copy_with_mkdir(l , dest_dir, so_path) > > + > +def _check_binfmt_misc(executable): > +"""Check binfmt_misc has entry for executable in the right place. > + > +The details of setting up binfmt_misc are outside the scope of > +this script but we should at least fail early with a useful > +message if it won't work.""" > + > +binary = os.path.basename(executable) > +binfmt_entry = "/proc/sys/fs/binfmt_misc/%s" % (binary) > + > +if not os.path.exists(binfmt_entry): > +print ("No binfmt_misc entry for %s" % (binary)) > +return False > + > +with open(binfmt_entry) as x: entry = x.read() > + > +qpath = "/usr/bin/%s" % (binary) Is it intended to hardcode this to /usr/bin? I thought we were more flexible than that.. Fam > +if not re.search("interpreter %s\n" % (qpath), entry): > +print ("binfmt_misc for %s does not point to %s" % (binary, qpath)) > +return False > + > +return True > + > + > def _read_qemu_dockerfile(img_name): > # special case for Debian linux-user images > if img_name.startswith("debian") and img_name.endswith("user"): > @@ -315,6 +340,11 @@ class BuildCommand(SubCommand): > # Create a docker context directory for the build > docker_dir = tempfile.mkdtemp(prefix="docker_build") > > +# Validate binfmt_misc will work > +if args.include_executable: > +if not _check_binfmt_misc(args.include_executable): > +return 1 > + > # Is there a .pre file to run in the build context? > docker_pre = os.path.splitext(args.dockerfile)[0]+".pre" > if os.path.exists(docker_pre): > -- > 2.17.1 >
Re: [Qemu-devel] [PATCH v3 for 3.0 13/18] docker: add --hint to docker.py check
On Wed, Jul 18, 2018 at 4:03 AM Alex Bennée wrote: > > When a check fails we currently just report why we failed. This is not > totally helpful to people who want to boot-strap a new image. Add a > --hint option which we can pass down to give a bit more information. > > Signed-off-by: Alex Bennée > Reviewed-by: Philippe Mathieu-Daudé > Tested-by: Philippe Mathieu-Daudé > --- > tests/docker/Makefile.include | 3 ++- > tests/docker/docker.py| 3 ++- > 2 files changed, 4 insertions(+), 2 deletions(-) > > diff --git a/tests/docker/Makefile.include b/tests/docker/Makefile.include > index ec23620153..c9e412f9d0 100644 > --- a/tests/docker/Makefile.include > +++ b/tests/docker/Makefile.include > @@ -73,7 +73,8 @@ docker-binfmt-image-debian-%: > $(DOCKER_FILES_DIR)/debian-bootstrap.docker > $(if > $(EXECUTABLE),--include-executable=$(EXECUTABLE)), \ > "BUILD","binfmt debian-$* (debootstrapped)"), > \ > $(call quiet-command, > \ > - $(DOCKER_SCRIPT) check --quiet qemu:debian-$* $<, > \ > + $(DOCKER_SCRIPT) check --quiet qemu:debian-$* $< > \ > + --hint "you will need to build $(EXECUTABLE)", > \ > "CHECK", "debian-$* exists")) > > endif > diff --git a/tests/docker/docker.py b/tests/docker/docker.py > index 2f81c6b13b..523f4b95a2 100755 > --- a/tests/docker/docker.py > +++ b/tests/docker/docker.py > @@ -475,6 +475,7 @@ class CheckCommand(SubCommand): > default="checksum", help="check type") > parser.add_argument("--olderthan", default=60, type=int, > help="number of minutes") > +parser.add_argument("--hint", default="", help="hint to user") > > def run(self, args, argv): > tag = args.tag > @@ -487,7 +488,7 @@ class CheckCommand(SubCommand): > > info = dkr.inspect_tag(tag) > if info is None: > -print("Image does not exist") > +print("Image does not exist %s" % (args.hint)) No, please. This is just hacky and makes no sense to me. The Makefile can do this $(DOCKER_SCRIPT) check ... || { echo "you will need to build $(EXECUTABLE)"; exit 1 } Fam > return 1 > > if args.checktype == "checksum": > -- > 2.17.1 >
Re: [Qemu-devel] [PATCH] block: Don't lock /dev/null and /dev/zero automatically
On Mon, Jul 23, 2018 at 10:37 PM Max Reitz wrote: > > On 2018-07-23 03:56, Fam Zheng wrote: > > On Sun, Jul 22, 2018 at 10:06 PM Max Reitz wrote: > >> > >> On 2018-07-22 04:37, Fam Zheng wrote: > >>> On Sun, Jul 22, 2018 at 5:08 AM Max Reitz wrote: > >>>> > >>>> On 2018-07-19 05:41, Fam Zheng wrote: > >>>>> On my Fedora 28, /dev/null is locked by some other process (couldn't > >>>>> inspect it due to the current lslocks limitation), so iotests 226 fails > >>>>> with some unexpected image locking errors because it uses qemu-io to > >>>>> open it. > >>>>> > >>>>> Actually it's safe to not use any lock on /dev/null or /dev/zero. > >>>>> > >>>>> Signed-off-by: Fam Zheng > >>>>> --- > >>>>> block/file-posix.c | 7 ++- > >>>>> 1 file changed, 6 insertions(+), 1 deletion(-) > >>>>> > >>>>> diff --git a/block/file-posix.c b/block/file-posix.c > >>>>> index 60af4b3d51..8bf034108a 100644 > >>>>> --- a/block/file-posix.c > >>>>> +++ b/block/file-posix.c > >>>>> @@ -503,7 +503,12 @@ static int raw_open_common(BlockDriverState *bs, > >>>>> QDict *options, > >>>>> s->use_lock = false; > >>>>> break; > >>>>> case ON_OFF_AUTO_AUTO: > >>>>> -s->use_lock = qemu_has_ofd_lock(); > >>>>> +if (!strcmp(filename, "/dev/null") || > >>>>> + !strcmp(filename, "/dev/zero")) { > >>>> > >>>> I’m not sure I like a strcmp() based on filename (though it should work > >>>> for all of the cases where someone would want to specify either of those > >>>> for a qemu block device). Isn’t there some specific major/minor number > >>>> we can use to check for these two files? Or are those Linux-specific? > >>> > >>> Yeah, I guess major/minor numbers are Linux-specific. > >>> > >>>> > >>>> I could also imagine just not locking any host_device, but since people > >>>> do use qcow2 immediately on block devices, maybe we do want to lock them. > >>> > >>> That's right. > >>> > >>>> > >>>> Finally, if really all you are trying to do is to make test code easier, > >>>> then I don’t know exactly why. Just disabling locking in 226 shouldn’t > >>>> be too hard. > >>> > >>> The tricky thing is in remembering to do that for future test cases. > >>> My machine seems to be somehow different than John's so that my 226 > >>> cannot lock /dev/null, but I'm not sure that is the case for other > >>> releases, distros or system instances. > >> > >> Usually we don’t need to use /dev/null, though, because null-co:// is > >> better suited for most purposes. This is a very specific test of host > >> devices. > >> > >> Maybe we should start a doc file with common good practices about > >> writing iotests? > > > > Yes, mentioning using pseudo devices in docs/devel/testing.rst would > > probably be a good idea. So is my understanding right that you prefer > > fixing the test case and discard this patch? If so I'll send another > > version together with the doc update. > > I personally would prefer fixing the test and not modifying the code, > yes. But I'm aware that it is a personal opinion. Sure, and you are the maintainer, so why not follow what you think. :) Fam
Re: [Qemu-devel] qemu-system_x86-64 (32-bit binary) -M q35 can be crashed on viewing youtube.
On Mon, 07/23 03:42, Andrew Randrianasulu wrote: > It was crashing and crashing, so I tried to debug it a bit ... > > > valgrind --leak-check=yes /dev/shm/qemu/x86_64-softmmu/qemu-system-x86_64 > -display > sdl,gl=on -M q35 -soundhw > hda -cdrom /home/guest/Downloads/ISO/slax-English-US-7.0.8-x86_64.iso -m > 1G -enable-kvm -d trace:e1000e* shows some errors at very end, see attached > file. > > For reproduction, wait for liveCD to finish loading, start firefox, go to > youtube.com, it will warn you about outdated browser but continue anyway, try > to click on any video > > It seems even modern KDE Neon distribution affected by same bug :/ Cc'ing Jason and Samuel/Jan who maintain net and slirp respectively. > --quote- > > 29362@1532305439.464672:e1000e_core_write Write to register 0xd0, 4 byte(s), > value: 0x10 > 29362@1532305439.464735:e1000e_irq_set_ims Setting IMS bits 0x10: > 0x154 --> 0x154 > 29362@1532305439.464791:e1000e_irq_msix_pending_clearing Clearing MSI-X > pending > bit for cause 0x10, IVAR config 0x800a0908, vector 0 > 29362@1532305439.464867:e1000e_irq_fix_icr_asserted ICR_ASSERTED bit fixed: > 0x8002 > 29362@1532305439.464916:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: > 0x8002, IMS: 0x154) > 29362@1532305439.524010:e1000e_core_write Write to register 0x3818, 4 > byte(s), > value: 0x97 > 29362@1532305439.524097:e1000e_tx_descr 0x21180e : 27000b68 5b43600 > 29362@1532305439.524169:e1000e_tx_descr 0x3bf970fa : 261005c6 300 > 29362@1532305439.524248:e1000e_tx_descr 0x1c1c8000 : af1005d8 300 > ==29362== Invalid write of size 4 > ==29362==at 0x552E58: memcpy (string3.h:53) > ==29362==by 0x552E58: m_cat (mbuf.c:143) > ==29362==by 0x54FE1E: ip_reass (ip_input.c:341) > ==29362==by 0x54FE1E: ip_input (ip_input.c:190) > ==29362==by 0x552478: slirp_input (slirp.c:874) > ==29362==by 0x53FBE7: net_slirp_receive (slirp.c:121) > ==29362==by 0x537478: nc_sendv_compat (net.c:701) > ==29362==by 0x537478: qemu_deliver_packet_iov (net.c:728) > ==29362==by 0x53A649: qemu_net_queue_deliver_iov (queue.c:179) > ==29362==by 0x53A649: qemu_net_queue_send_iov (queue.c:224) > ==29362==by 0x538901: qemu_sendv_packet_async (net.c:764) > ==29362==by 0x538925: qemu_sendv_packet (net.c:772) > ==29362==by 0x53AB71: net_hub_receive_iov (hub.c:73) > ==29362==by 0x53AB71: net_hub_port_receive_iov (hub.c:124) > ==29362==by 0x53748D: qemu_deliver_packet_iov (net.c:726) > ==29362==by 0x53A649: qemu_net_queue_deliver_iov (queue.c:179) > ==29362==by 0x53A649: qemu_net_queue_send_iov (queue.c:224) > ==29362==by 0x538901: qemu_sendv_packet_async (net.c:764) > ==29362== Address 0x114f9b60 is 0 bytes after a block of size 2,976 alloc'd > ==29362==at 0x482B29C: malloc (vg_replace_malloc.c:299) > ==29362==by 0x534D3E1: g_malloc (in /usr/lib/libglib-2.0.so.0.4600.2) > ==29362==by 0x552B53: m_inc.part.1 (mbuf.c:166) > ==29362==by 0x552EAE: m_inc (string3.h:53) > ==29362==by 0x552EAE: m_cat (mbuf.c:141) > ==29362==by 0x54FE1E: ip_reass (ip_input.c:341) > ==29362==by 0x54FE1E: ip_input (ip_input.c:190) > ==29362==by 0x552478: slirp_input (slirp.c:874) > ==29362==by 0x53FBE7: net_slirp_receive (slirp.c:121) > ==29362==by 0x537478: nc_sendv_compat (net.c:701) > ==29362==by 0x537478: qemu_deliver_packet_iov (net.c:728) > ==29362==by 0x53A649: qemu_net_queue_deliver_iov (queue.c:179) > ==29362==by 0x53A649: qemu_net_queue_send_iov (queue.c:224) > ==29362==by 0x538901: qemu_sendv_packet_async (net.c:764) > ==29362==by 0x538925: qemu_sendv_packet (net.c:772) > ==29362==by 0x53AB71: net_hub_receive_iov (hub.c:73) > ==29362==by 0x53AB71: net_hub_port_receive_iov (hub.c:124) > ==29362== > ==29362== Invalid write of size 4 > ==29362==at 0x552E5E: memcpy (string3.h:53) > ==29362==by 0x552E5E: m_cat (mbuf.c:143) > ==29362==by 0x54FE1E: ip_reass (ip_input.c:341) > ==29362==by 0x54FE1E: ip_input (ip_input.c:190) > ==29362==by 0x552478: slirp_input (slirp.c:874) > ==29362==by 0x53FBE7: net_slirp_receive (slirp.c:121) > ==29362==by 0x537478: nc_sendv_compat (net.c:701) > ==29362==by 0x537478: qemu_deliver_packet_iov (net.c:728) > ==29362==by 0x53A649: qemu_net_queue_deliver_iov (queue.c:179) > ==29362==by 0x53A649: qemu_net_queue_send_iov (queue.c:224) > ==29362==by 0x538901: qemu_sendv_packet_async (net.c:764) > ==29362==by 0x538925: qemu_sendv_packet (net.c:772) > ==29362==by 0x53AB71: net_hub_receive_iov (hub.c:73) > ==29362==by 0x53AB71: net_hub_port_receive_iov (hub.c:124) > ==29362==by 0x53748D: qemu_deliver_packet_iov (net.c:726) > ==29362==by 0x53A649: qemu_net_queue_deliver_iov (queue.c:179) > ==29362==by 0x53A649: qemu_net_queue_send_iov (queue.c:224) > ==29362==by 0x538901: qemu_sendv_packet_async (net.c:764) > ==29362== Address 0x114f9b78 is 24
Re: [Qemu-devel] [PATCH] block: Don't lock /dev/null and /dev/zero automatically
On Sun, Jul 22, 2018 at 10:06 PM Max Reitz wrote: > > On 2018-07-22 04:37, Fam Zheng wrote: > > On Sun, Jul 22, 2018 at 5:08 AM Max Reitz wrote: > >> > >> On 2018-07-19 05:41, Fam Zheng wrote: > >>> On my Fedora 28, /dev/null is locked by some other process (couldn't > >>> inspect it due to the current lslocks limitation), so iotests 226 fails > >>> with some unexpected image locking errors because it uses qemu-io to > >>> open it. > >>> > >>> Actually it's safe to not use any lock on /dev/null or /dev/zero. > >>> > >>> Signed-off-by: Fam Zheng > >>> --- > >>> block/file-posix.c | 7 ++- > >>> 1 file changed, 6 insertions(+), 1 deletion(-) > >>> > >>> diff --git a/block/file-posix.c b/block/file-posix.c > >>> index 60af4b3d51..8bf034108a 100644 > >>> --- a/block/file-posix.c > >>> +++ b/block/file-posix.c > >>> @@ -503,7 +503,12 @@ static int raw_open_common(BlockDriverState *bs, > >>> QDict *options, > >>> s->use_lock = false; > >>> break; > >>> case ON_OFF_AUTO_AUTO: > >>> -s->use_lock = qemu_has_ofd_lock(); > >>> +if (!strcmp(filename, "/dev/null") || > >>> + !strcmp(filename, "/dev/zero")) { > >> > >> I’m not sure I like a strcmp() based on filename (though it should work > >> for all of the cases where someone would want to specify either of those > >> for a qemu block device). Isn’t there some specific major/minor number > >> we can use to check for these two files? Or are those Linux-specific? > > > > Yeah, I guess major/minor numbers are Linux-specific. > > > >> > >> I could also imagine just not locking any host_device, but since people > >> do use qcow2 immediately on block devices, maybe we do want to lock them. > > > > That's right. > > > >> > >> Finally, if really all you are trying to do is to make test code easier, > >> then I don’t know exactly why. Just disabling locking in 226 shouldn’t > >> be too hard. > > > > The tricky thing is in remembering to do that for future test cases. > > My machine seems to be somehow different than John's so that my 226 > > cannot lock /dev/null, but I'm not sure that is the case for other > > releases, distros or system instances. > > Usually we don’t need to use /dev/null, though, because null-co:// is > better suited for most purposes. This is a very specific test of host > devices. > > Maybe we should start a doc file with common good practices about > writing iotests? Yes, mentioning using pseudo devices in docs/devel/testing.rst would probably be a good idea. So is my understanding right that you prefer fixing the test case and discard this patch? If so I'll send another version together with the doc update. Fam > > Max >
Re: [Qemu-devel] [PATCH] block: Don't lock /dev/null and /dev/zero automatically
On Sun, Jul 22, 2018 at 5:08 AM Max Reitz wrote: > > On 2018-07-19 05:41, Fam Zheng wrote: > > On my Fedora 28, /dev/null is locked by some other process (couldn't > > inspect it due to the current lslocks limitation), so iotests 226 fails > > with some unexpected image locking errors because it uses qemu-io to > > open it. > > > > Actually it's safe to not use any lock on /dev/null or /dev/zero. > > > > Signed-off-by: Fam Zheng > > --- > > block/file-posix.c | 7 ++- > > 1 file changed, 6 insertions(+), 1 deletion(-) > > > > diff --git a/block/file-posix.c b/block/file-posix.c > > index 60af4b3d51..8bf034108a 100644 > > --- a/block/file-posix.c > > +++ b/block/file-posix.c > > @@ -503,7 +503,12 @@ static int raw_open_common(BlockDriverState *bs, QDict > > *options, > > s->use_lock = false; > > break; > > case ON_OFF_AUTO_AUTO: > > -s->use_lock = qemu_has_ofd_lock(); > > +if (!strcmp(filename, "/dev/null") || > > + !strcmp(filename, "/dev/zero")) { > > I’m not sure I like a strcmp() based on filename (though it should work > for all of the cases where someone would want to specify either of those > for a qemu block device). Isn’t there some specific major/minor number > we can use to check for these two files? Or are those Linux-specific? Yeah, I guess major/minor numbers are Linux-specific. > > I could also imagine just not locking any host_device, but since people > do use qcow2 immediately on block devices, maybe we do want to lock them. That's right. > > Finally, if really all you are trying to do is to make test code easier, > then I don’t know exactly why. Just disabling locking in 226 shouldn’t > be too hard. The tricky thing is in remembering to do that for future test cases. My machine seems to be somehow different than John's so that my 226 cannot lock /dev/null, but I'm not sure that is the case for other releases, distros or system instances. Fam > > Max > > > +s->use_lock = false; > > +} else { > > +s->use_lock = qemu_has_ofd_lock(); > > +} > > break; > > default: > > abort(); > > > >
Re: [Qemu-devel] [Qemu-block] [PATCH] block: Don't lock /dev/null and /dev/zero automatically
On Sat, Jul 21, 2018 at 12:35 AM Eric Blake wrote: > > On 07/20/2018 03:24 AM, Fam Zheng wrote: > I'm not familiar with /dev/nullb* - is that a typo? > > $ ll /dev/nullb* > ls: cannot access '/dev/nullb*': No such file or directory You probably have figured out already but just in case: it's kernel null_blk mod. Fam
Re: [Qemu-devel] [Qemu-block] [PATCH] block: Don't lock /dev/null and /dev/zero automatically
On Thu, 07/19 13:57, John Snow wrote: > Should we instead modify the test in this case to not attempt to take a > lock on a device we know cannot meaningfully store state, or is it your > preference to attempt to maintain such a list in the raw driver itself? > > I guess we never want QEMU to try to lock things like /dev/zero, but I > don't know if there are more such pseudo-devices we should never try to > lock and if so, what common property unifies them such that we don't > have to maintain a list. They are 0 sized anyway so I only expect them to be used in test cases like the one we're fixing. So this really doesn't matter. An exceptional one would be /dev/nullb* but that is not used in real world either. I'm not trying to maintain a complete list (e.g. /dev/urandom and /dev/nullb* are missing), this patch is just trying to make writing test code easier. Fam
Re: [Qemu-devel] [PATCH] hw/block/nvme: add optional parameter num_namespaces for nvme device
On Thu, 07/19 12:33, Weiping Zhang wrote: > Add an optional paramter num_namespaces for device, and set it > to 1 by default. > > Signed-off-by: Weiping Zhang > --- > hw/block/nvme.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/hw/block/nvme.c b/hw/block/nvme.c > index 156ecf3c41..b53be4b5c0 100644 > --- a/hw/block/nvme.c > +++ b/hw/block/nvme.c > @@ -19,7 +19,7 @@ > * -drive file=,if=none,id= > * -device nvme,drive=,serial=,id=, \ > * cmb_size_mb=, \ > - * num_queues= > + * num_queues=,num_namespaces= > * > * Note cmb_size_mb denotes size of CMB in MB. CMB is assumed to be at > * offset 0 in BAR2 and supports only WDS, RDS and SQS for now. > @@ -1232,7 +1232,6 @@ static void nvme_realize(PCIDevice *pci_dev, Error > **errp) > pci_config_set_class(pci_dev->config, PCI_CLASS_STORAGE_EXPRESS); > pcie_endpoint_cap_init(>parent_obj, 0x80); > > -n->num_namespaces = 1; > n->reg_size = pow2ceil(0x1004 + 2 * (n->num_queues + 1) * 4); > n->ns_size = bs_size / (uint64_t)n->num_namespaces; > > @@ -1342,6 +1341,7 @@ static Property nvme_props[] = { > DEFINE_PROP_STRING("serial", NvmeCtrl, serial), > DEFINE_PROP_UINT32("cmb_size_mb", NvmeCtrl, cmb_size_mb, 0), > DEFINE_PROP_UINT32("num_queues", NvmeCtrl, num_queues, 64), > +DEFINE_PROP_UINT32("num_namespaces", NvmeCtrl, num_namespaces, 1), You need to verify the user provided value. At least 0x is the broadcast value and shouldn't be used, and 0 doesn't make much sense too, I guess. > DEFINE_PROP_END_OF_LIST(), > }; > > -- > 2.14.1 > > I think the data_offset calculations in nvme_rw need some changes too? Fam
[Qemu-devel] [PATCH] block: Don't lock /dev/null and /dev/zero automatically
On my Fedora 28, /dev/null is locked by some other process (couldn't inspect it due to the current lslocks limitation), so iotests 226 fails with some unexpected image locking errors because it uses qemu-io to open it. Actually it's safe to not use any lock on /dev/null or /dev/zero. Signed-off-by: Fam Zheng --- block/file-posix.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/block/file-posix.c b/block/file-posix.c index 60af4b3d51..8bf034108a 100644 --- a/block/file-posix.c +++ b/block/file-posix.c @@ -503,7 +503,12 @@ static int raw_open_common(BlockDriverState *bs, QDict *options, s->use_lock = false; break; case ON_OFF_AUTO_AUTO: -s->use_lock = qemu_has_ofd_lock(); +if (!strcmp(filename, "/dev/null") || + !strcmp(filename, "/dev/zero")) { +s->use_lock = false; +} else { +s->use_lock = qemu_has_ofd_lock(); +} break; default: abort(); -- 2.17.1
[Qemu-devel] [PATCH] file-posix: Skip effectiveless OFD lock operations
If we know we've already locked the bytes, don't do it again; similarly don't unlock a byte if we haven't locked it. This doesn't change the behavior, but fixes a corner case explained below. Libvirt had an error handling bug that an image can get its (ownership, file mode, SELinux) permissions changed (RHBZ 1584982) by mistake behind QEMU. Specifically, an image in use by Libvirt VM has: $ ls -lhZ b.img -rw-r--r--. qemu qemu system_u:object_r:svirt_image_t:s0:c600,c690 b.img Trying to attach it a second time won't work because of image locking. And after the error, it becomes: $ ls -lhZ b.img -rw-r--r--. root root system_u:object_r:virt_image_t:s0 b.img Then, we won't be able to do OFD lock operations with the existing fd. In other words, the code such as in blk_detach_dev: blk_set_perm(blk, 0, BLK_PERM_ALL, _abort); can abort() QEMU, out of environmental changes. This patch is an easy fix to this and the change is regardlessly reasonable, so do it. Signed-off-by: Fam Zheng --- block/file-posix.c | 27 +-- 1 file changed, 17 insertions(+), 10 deletions(-) diff --git a/block/file-posix.c b/block/file-posix.c index 60af4b3d51..45d44c9947 100644 --- a/block/file-posix.c +++ b/block/file-posix.c @@ -680,23 +680,28 @@ typedef enum { * file; if @unlock == true, also unlock the unneeded bytes. * @shared_perm_lock_bits is the mask of all permissions that are NOT shared. */ -static int raw_apply_lock_bytes(int fd, +static int raw_apply_lock_bytes(BDRVRawState *s, int fd, uint64_t perm_lock_bits, uint64_t shared_perm_lock_bits, bool unlock, Error **errp) { int ret; int i; +uint64_t locked_perm, locked_shared_perm; + +locked_perm = s ? s->perm : 0; +locked_shared_perm = s ? ~s->shared_perm & BLK_PERM_ALL : 0; PERM_FOREACH(i) { int off = RAW_LOCK_PERM_BASE + i; -if (perm_lock_bits & (1ULL << i)) { +uint64_t bit = (1ULL << i); +if ((perm_lock_bits & bit) && !(locked_perm & bit)) { ret = qemu_lock_fd(fd, off, 1, false); if (ret) { error_setg(errp, "Failed to lock byte %d", off); return ret; } -} else if (unlock) { +} else if (unlock && (locked_perm & bit) && !(perm_lock_bits & bit)) { ret = qemu_unlock_fd(fd, off, 1); if (ret) { error_setg(errp, "Failed to unlock byte %d", off); @@ -706,13 +711,15 @@ static int raw_apply_lock_bytes(int fd, } PERM_FOREACH(i) { int off = RAW_LOCK_SHARED_BASE + i; -if (shared_perm_lock_bits & (1ULL << i)) { +uint64_t bit = (1ULL << i); +if ((shared_perm_lock_bits & bit) && !(locked_shared_perm & bit)) { ret = qemu_lock_fd(fd, off, 1, false); if (ret) { error_setg(errp, "Failed to lock byte %d", off); return ret; } -} else if (unlock) { +} else if (unlock && (locked_shared_perm & bit) && + !(shared_perm_lock_bits & bit)) { ret = qemu_unlock_fd(fd, off, 1); if (ret) { error_setg(errp, "Failed to unlock byte %d", off); @@ -788,7 +795,7 @@ static int raw_handle_perm_lock(BlockDriverState *bs, switch (op) { case RAW_PL_PREPARE: -ret = raw_apply_lock_bytes(s->lock_fd, s->perm | new_perm, +ret = raw_apply_lock_bytes(s, s->lock_fd, s->perm | new_perm, ~s->shared_perm | ~new_shared, false, errp); if (!ret) { @@ -800,7 +807,7 @@ static int raw_handle_perm_lock(BlockDriverState *bs, op = RAW_PL_ABORT; /* fall through to unlock bytes. */ case RAW_PL_ABORT: -raw_apply_lock_bytes(s->lock_fd, s->perm, ~s->shared_perm, +raw_apply_lock_bytes(s, s->lock_fd, s->perm, ~s->shared_perm, true, _err); if (local_err) { /* Theoretically the above call only unlocks bytes and it cannot @@ -810,7 +817,7 @@ static int raw_handle_perm_lock(BlockDriverState *bs, } break; case RAW_PL_COMMIT: -raw_apply_lock_bytes(s->lock_fd, new_perm, ~new_shared, +raw_apply_lock_bytes(s, s->lock_fd, new_perm, ~new_shared, true, _err); if (local_err) { /* Theoretically the above call only unlocks bytes and it cannot @@ -2174,7 +2181,7 @@ raw_co_create(BlockdevCreateOptions *options, Error **errp) shared = BLK_PERM_ALL & ~BLK_PERM_RESIZE; /* Step one: Take locks */ -result = raw_apply_loc
Re: [Qemu-devel] [PATCH 3/5] tests/vm: Do not abuse parallelism when KVM is not available
On Mon, 07/16 23:48, Philippe Mathieu-Daudé wrote: > Signed-off-by: Philippe Mathieu-Daudé > --- > tests/vm/basevm.py | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/tests/vm/basevm.py b/tests/vm/basevm.py > index 1f4ad04c32..715e433142 100755 > --- a/tests/vm/basevm.py > +++ b/tests/vm/basevm.py > @@ -233,7 +233,7 @@ def main(vmcls): > return 1 > logging.basicConfig(level=(logging.DEBUG if args.debug > else logging.WARN)) > -vm = vmcls(debug=args.debug, vcpus=args.jobs) > +vm = vmcls(debug=args.debug, vcpus=args.jobs if kvm_available() else > 0) > if args.build_image: > if os.path.exists(args.image) and not args.force: > sys.stderr.writelines(["Image file exists: %s\n" % > args.image, > @@ -244,7 +244,7 @@ def main(vmcls): > vm.add_source_dir(args.build_qemu) > cmd = [vm.BUILD_SCRIPT.format( > configure_opts = " ".join(argv), > - jobs=args.jobs)] > + jobs=args.jobs if kvm_available() else 1)] > else: > cmd = argv > vm.boot(args.image + ",snapshot=on") > -- > 2.18.0 > Could you instead change this line parser.add_option("--jobs", type=int, default=multiprocessing.cpu_count() / 2, help="number of virtual CPUs") to use kvm_available()? With another helper function to contain the line width, probably: ..., default=get_default_jobs(), help=... and def get_default_jobs(): if kvm_available(): return ... else: return ... Fam
Re: [Qemu-devel] [PATCH v3 4/4] tests: Add centos VM testing
On Sun, 07/15 22:34, Philippe Mathieu-Daudé wrote: > Hi Fam, > > On 07/11/2018 10:28 PM, Fam Zheng wrote: > > This one does docker testing in the VM. It is intended to replace the > > native docker testing on patchew testers. > > > > Signed-off-by: Fam Zheng > > --- > > tests/vm/Makefile.include | 3 +- > > tests/vm/centos | 84 +++ > > 2 files changed, 86 insertions(+), 1 deletion(-) > > create mode 100755 tests/vm/centos > > > > diff --git a/tests/vm/Makefile.include b/tests/vm/Makefile.include > > index 5daa2a3b73..af19b7a4e6 100644 > > --- a/tests/vm/Makefile.include > > +++ b/tests/vm/Makefile.include > > @@ -2,7 +2,7 @@ > > > > .PHONY: vm-build-all > > Can we have a vm-clean-all rule too? Yes, sure! > > > > > -IMAGES := ubuntu.i386 freebsd netbsd openbsd > > +IMAGES := ubuntu.i386 freebsd netbsd openbsd centos > > IMAGE_FILES := $(patsubst %, tests/vm/%.img, $(IMAGES)) > > > > .PRECIOUS: $(IMAGE_FILES) > > @@ -14,6 +14,7 @@ vm-test: > > @echo " vm-build-freebsd- Build QEMU in FreeBSD VM" > > @echo " vm-build-netbsd - Build QEMU in NetBSD VM" > > @echo " vm-build-openbsd- Build QEMU in OpenBSD VM" > > + @echo " vm-build-centos - Build QEMU in CentOS VM, > > with Docker" > > > > vm-build-all: $(addprefix vm-build-, $(IMAGES)) > > > > diff --git a/tests/vm/centos b/tests/vm/centos > > new file mode 100755 > > index 00..afd560c564 > > --- /dev/null > > +++ b/tests/vm/centos > > @@ -0,0 +1,84 @@ > > +#!/usr/bin/env python > > +# > > +# CentOS image > > +# > > +# Copyright 2018 Red Hat Inc. > > +# > > +# Authors: > > +# Fam Zheng > > +# > > +# This code is licensed under the GPL version 2 or later. See > > +# the COPYING file in the top-level directory. > > +# > > + > > +import os > > +import sys > > +import subprocess > > +import basevm > > +import time > > + > > +class CentosVM(basevm.BaseVM): > > +name = "centos" > > +BUILD_SCRIPT = """ > > +set -e; > > +cd $(mktemp -d); > > +export SRC_ARCHIVE=/dev/vdb; > > +sudo chmod a+r $SRC_ARCHIVE; > > +tar -xf $SRC_ARCHIVE; > > +make docker-test-block@centos7 V={verbose} J={jobs}; > > +make docker-test-quick@centos7 V={verbose} J={jobs}; > > +make docker-test-mingw@fedora V={verbose} J={jobs}; > > +""" > > + > > +def _gen_cloud_init_iso(self): > > +cidir = self._tmpdir > > +mdata = open(os.path.join(cidir, "meta-data"), "w") > > +mdata.writelines(["instance-id: centos-vm-0\n", > > + "local-hostname: centos-guest\n"]) > > +mdata.close() > > +udata = open(os.path.join(cidir, "user-data"), "w") > > +udata.writelines(["#cloud-config\n", > > + "chpasswd:\n", > > + " list: |\n", > > + "root:%s\n" % self.ROOT_PASS, > > + "%s:%s\n" % (self.GUEST_USER, > > self.GUEST_PASS), > > + " expire: False\n", > > + "users:\n", > > + " - name: %s\n" % self.GUEST_USER, > > + "sudo: ALL=(ALL) NOPASSWD:ALL\n", > > + "ssh-authorized-keys:\n", > > + "- %s\n" % basevm.SSH_PUB_KEY, > > + " - name: root\n", > > + "ssh-authorized-keys:\n", > > + "- %s\n" % basevm.SSH_PUB_KEY, > > + "locale: en_US.UTF-8\n"]) > > +udata.close() > > +subprocess.check_call(["genisoimage", "-output", "cloud-init.iso", > > + "-volid", "cidata", "-joliet", "-rock", > > + "user-data", "meta-data"], > > + cwd=cidir, > > +
[Qemu-devel] [PATCH] tests: vm: Add vm-clean-all
The images are big. Add a rule to clean up easily. Suggested-by: Philippe Mathieu-Daudé Signed-off-by: Fam Zheng --- tests/vm/Makefile.include | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/tests/vm/Makefile.include b/tests/vm/Makefile.include index 5daa2a3b73..79e7866c8b 100644 --- a/tests/vm/Makefile.include +++ b/tests/vm/Makefile.include @@ -1,6 +1,6 @@ # Makefile for VM tests -.PHONY: vm-build-all +.PHONY: vm-build-all vm-clean-all IMAGES := ubuntu.i386 freebsd netbsd openbsd IMAGE_FILES := $(patsubst %, tests/vm/%.img, $(IMAGES)) @@ -17,6 +17,9 @@ vm-test: vm-build-all: $(addprefix vm-build-, $(IMAGES)) +vm-clean-all: + rm -f $(IMAGE_FILES) + tests/vm/%.img: $(SRC_PATH)/tests/vm/% \ $(SRC_PATH)/tests/vm/basevm.py \ $(SRC_PATH)/tests/vm/Makefile.include -- 2.17.1
Re: [Qemu-devel] [PATCH v3 4/4] tests: Add centos VM testing
On Thu, 07/12 12:02, Philippe Mathieu-Daudé wrote: > Hi Fam, > > On 07/11/2018 10:28 PM, Fam Zheng wrote: > > This one does docker testing in the VM. It is intended to replace the > > native docker testing on patchew testers. > > > > Signed-off-by: Fam Zheng > > --- > > tests/vm/Makefile.include | 3 +- > > tests/vm/centos | 84 +++ > > 2 files changed, 86 insertions(+), 1 deletion(-) > > create mode 100755 tests/vm/centos > > > > diff --git a/tests/vm/Makefile.include b/tests/vm/Makefile.include > > index 5daa2a3b73..af19b7a4e6 100644 > > --- a/tests/vm/Makefile.include > > +++ b/tests/vm/Makefile.include > > @@ -2,7 +2,7 @@ > > > > .PHONY: vm-build-all > > > > -IMAGES := ubuntu.i386 freebsd netbsd openbsd > > +IMAGES := ubuntu.i386 freebsd netbsd openbsd centos > > IMAGE_FILES := $(patsubst %, tests/vm/%.img, $(IMAGES)) > > > > .PRECIOUS: $(IMAGE_FILES) > > @@ -14,6 +14,7 @@ vm-test: > > @echo " vm-build-freebsd- Build QEMU in FreeBSD VM" > > @echo " vm-build-netbsd - Build QEMU in NetBSD VM" > > @echo " vm-build-openbsd- Build QEMU in OpenBSD VM" > > + @echo " vm-build-centos - Build QEMU in CentOS VM, > > with Docker" > > > > vm-build-all: $(addprefix vm-build-, $(IMAGES)) > > > > diff --git a/tests/vm/centos b/tests/vm/centos > > new file mode 100755 > > index 00..afd560c564 > > --- /dev/null > > +++ b/tests/vm/centos > > @@ -0,0 +1,84 @@ > > +#!/usr/bin/env python > > +# > > +# CentOS image > > +# > > +# Copyright 2018 Red Hat Inc. > > +# > > +# Authors: > > +# Fam Zheng > > +# > > +# This code is licensed under the GPL version 2 or later. See > > +# the COPYING file in the top-level directory. > > +# > > + > > +import os > > +import sys > > +import subprocess > > +import basevm > > +import time > > + > > +class CentosVM(basevm.BaseVM): > > +name = "centos" > > +BUILD_SCRIPT = """ > > +set -e; > > +cd $(mktemp -d); > > +export SRC_ARCHIVE=/dev/vdb; > > +sudo chmod a+r $SRC_ARCHIVE; > > +tar -xf $SRC_ARCHIVE; > > +make docker-test-block@centos7 V={verbose} J={jobs}; > > +make docker-test-quick@centos7 V={verbose} J={jobs}; > > +make docker-test-mingw@fedora V={verbose} J={jobs}; > > +""" > > + > > +def _gen_cloud_init_iso(self): > > +cidir = self._tmpdir > > +mdata = open(os.path.join(cidir, "meta-data"), "w") > > +mdata.writelines(["instance-id: centos-vm-0\n", > > + "local-hostname: centos-guest\n"]) > > +mdata.close() > > +udata = open(os.path.join(cidir, "user-data"), "w") > > +udata.writelines(["#cloud-config\n", > > + "chpasswd:\n", > > + " list: |\n", > > + "root:%s\n" % self.ROOT_PASS, > > + "%s:%s\n" % (self.GUEST_USER, > > self.GUEST_PASS), > > + " expire: False\n", > > + "users:\n", > > + " - name: %s\n" % self.GUEST_USER, > > + "sudo: ALL=(ALL) NOPASSWD:ALL\n", > > + "ssh-authorized-keys:\n", > > + "- %s\n" % basevm.SSH_PUB_KEY, > > + " - name: root\n", > > + "ssh-authorized-keys:\n", > > + "- %s\n" % basevm.SSH_PUB_KEY, > > + "locale: en_US.UTF-8\n"]) > > +udata.close() > > +subprocess.check_call(["genisoimage", "-output", "cloud-init.iso", > > + "-volid", "cidata", "-joliet", "-rock", > > + "user-data", "meta-data"], > > + cwd=cidir, > > + stdin=self._devnull, stdout=self._stdout, > > +
[Qemu-devel] [PATCH v2] nvme: Fix nvme_init error handling
It is wrong to leave this field as 1, as nvme_close() called in the error handling code in nvme_file_open() will use it and try to free s->queues again. Another problem is the cleaning ups are duplicated between the fail* labels of nvme_init() and nvme_file_open(), which calls nvme_close(). A third problem is nvme_close() misses g_free() and event_notifier_cleanup(). Fix all of them. Cc: qemu-sta...@nongnu.org Signed-off-by: Fam Zheng --- v2: Adopt the suggested fix by Kevin. --- block/nvme.c | 37 - 1 file changed, 12 insertions(+), 25 deletions(-) diff --git a/block/nvme.c b/block/nvme.c index 6f71122bf5..37805e8890 100644 --- a/block/nvme.c +++ b/block/nvme.c @@ -569,13 +569,13 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace, s->vfio = qemu_vfio_open_pci(device, errp); if (!s->vfio) { ret = -EINVAL; -goto fail; +goto out; } s->regs = qemu_vfio_pci_map_bar(s->vfio, 0, 0, NVME_BAR_SIZE, errp); if (!s->regs) { ret = -EINVAL; -goto fail; +goto out; } /* Perform initialize sequence as described in NVMe spec "7.6.1 @@ -585,7 +585,7 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace, if (!(cap & (1ULL << 37))) { error_setg(errp, "Device doesn't support NVMe command set"); ret = -EINVAL; -goto fail; +goto out; } s->page_size = MAX(4096, 1 << (12 + ((cap >> 48) & 0xF))); @@ -603,7 +603,7 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace, PRId64 " ms)", timeout_ms); ret = -ETIMEDOUT; -goto fail; +goto out; } } @@ -613,7 +613,7 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace, s->queues[0] = nvme_create_queue_pair(bs, 0, NVME_QUEUE_SIZE, errp); if (!s->queues[0]) { ret = -EINVAL; -goto fail; +goto out; } QEMU_BUILD_BUG_ON(NVME_QUEUE_SIZE & 0xF000); s->regs->aqa = cpu_to_le32((NVME_QUEUE_SIZE << 16) | NVME_QUEUE_SIZE); @@ -633,14 +633,14 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace, PRId64 " ms)", timeout_ms); ret = -ETIMEDOUT; -goto fail_queue; +goto out; } } ret = qemu_vfio_pci_init_irq(s->vfio, >irq_notifier, VFIO_PCI_MSIX_IRQ_INDEX, errp); if (ret) { -goto fail_queue; +goto out; } aio_set_event_notifier(bdrv_get_aio_context(bs), >irq_notifier, false, nvme_handle_event, nvme_poll_cb); @@ -649,30 +649,15 @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace, if (local_err) { error_propagate(errp, local_err); ret = -EIO; -goto fail_handler; +goto out; } /* Set up command queues. */ if (!nvme_add_io_queue(bs, errp)) { ret = -EIO; -goto fail_handler; } -return 0; - -fail_handler: -aio_set_event_notifier(bdrv_get_aio_context(bs), >irq_notifier, - false, NULL, NULL); -fail_queue: -nvme_free_queue_pair(bs, s->queues[0]); -fail: -g_free(s->queues); -if (s->regs) { -qemu_vfio_pci_unmap_bar(s->vfio, 0, (void *)s->regs, 0, NVME_BAR_SIZE); -} -if (s->vfio) { -qemu_vfio_close(s->vfio); -} -event_notifier_cleanup(>irq_notifier); +out: +/* Cleaning up is done in nvme_file_open() upon error. */ return ret; } @@ -739,8 +724,10 @@ static void nvme_close(BlockDriverState *bs) for (i = 0; i < s->nr_queues; ++i) { nvme_free_queue_pair(bs, s->queues[i]); } +g_free(s->queues); aio_set_event_notifier(bdrv_get_aio_context(bs), >irq_notifier, false, NULL, NULL); +event_notifier_cleanup(>irq_notifier); qemu_vfio_pci_unmap_bar(s->vfio, 0, (void *)s->regs, 0, NVME_BAR_SIZE); qemu_vfio_close(s->vfio); } -- 2.17.1
[Qemu-devel] [PATCH v3 4/4] tests: Add centos VM testing
This one does docker testing in the VM. It is intended to replace the native docker testing on patchew testers. Signed-off-by: Fam Zheng --- tests/vm/Makefile.include | 3 +- tests/vm/centos | 84 +++ 2 files changed, 86 insertions(+), 1 deletion(-) create mode 100755 tests/vm/centos diff --git a/tests/vm/Makefile.include b/tests/vm/Makefile.include index 5daa2a3b73..af19b7a4e6 100644 --- a/tests/vm/Makefile.include +++ b/tests/vm/Makefile.include @@ -2,7 +2,7 @@ .PHONY: vm-build-all -IMAGES := ubuntu.i386 freebsd netbsd openbsd +IMAGES := ubuntu.i386 freebsd netbsd openbsd centos IMAGE_FILES := $(patsubst %, tests/vm/%.img, $(IMAGES)) .PRECIOUS: $(IMAGE_FILES) @@ -14,6 +14,7 @@ vm-test: @echo " vm-build-freebsd- Build QEMU in FreeBSD VM" @echo " vm-build-netbsd - Build QEMU in NetBSD VM" @echo " vm-build-openbsd- Build QEMU in OpenBSD VM" + @echo " vm-build-centos - Build QEMU in CentOS VM, with Docker" vm-build-all: $(addprefix vm-build-, $(IMAGES)) diff --git a/tests/vm/centos b/tests/vm/centos new file mode 100755 index 00..afd560c564 --- /dev/null +++ b/tests/vm/centos @@ -0,0 +1,84 @@ +#!/usr/bin/env python +# +# CentOS image +# +# Copyright 2018 Red Hat Inc. +# +# Authors: +# Fam Zheng +# +# This code is licensed under the GPL version 2 or later. See +# the COPYING file in the top-level directory. +# + +import os +import sys +import subprocess +import basevm +import time + +class CentosVM(basevm.BaseVM): +name = "centos" +BUILD_SCRIPT = """ +set -e; +cd $(mktemp -d); +export SRC_ARCHIVE=/dev/vdb; +sudo chmod a+r $SRC_ARCHIVE; +tar -xf $SRC_ARCHIVE; +make docker-test-block@centos7 V={verbose} J={jobs}; +make docker-test-quick@centos7 V={verbose} J={jobs}; +make docker-test-mingw@fedora V={verbose} J={jobs}; +""" + +def _gen_cloud_init_iso(self): +cidir = self._tmpdir +mdata = open(os.path.join(cidir, "meta-data"), "w") +mdata.writelines(["instance-id: centos-vm-0\n", + "local-hostname: centos-guest\n"]) +mdata.close() +udata = open(os.path.join(cidir, "user-data"), "w") +udata.writelines(["#cloud-config\n", + "chpasswd:\n", + " list: |\n", + "root:%s\n" % self.ROOT_PASS, + "%s:%s\n" % (self.GUEST_USER, self.GUEST_PASS), + " expire: False\n", + "users:\n", + " - name: %s\n" % self.GUEST_USER, + "sudo: ALL=(ALL) NOPASSWD:ALL\n", + "ssh-authorized-keys:\n", + "- %s\n" % basevm.SSH_PUB_KEY, + " - name: root\n", + "ssh-authorized-keys:\n", + "- %s\n" % basevm.SSH_PUB_KEY, + "locale: en_US.UTF-8\n"]) +udata.close() +subprocess.check_call(["genisoimage", "-output", "cloud-init.iso", + "-volid", "cidata", "-joliet", "-rock", + "user-data", "meta-data"], + cwd=cidir, + stdin=self._devnull, stdout=self._stdout, + stderr=self._stdout) +return os.path.join(cidir, "cloud-init.iso") + +def build_image(self, img): +cimg = self._download_with_cache("https://cloud.centos.org/centos/7/images/CentOS-7-x86_64-GenericCloud-1802.qcow2.xz;) +img_tmp = img + ".tmp" +subprocess.check_call(["cp", "-f", cimg, img_tmp + ".xz"]) +subprocess.check_call(["xz", "-df", img_tmp + ".xz"]) +subprocess.check_call(["qemu-img", "resize", img_tmp, "50G"]) +self.boot(img_tmp, extra_args = ["-cdrom", self._gen_cloud_init_iso()]) +self.wait_ssh() +self.ssh_root_check("touch /etc/cloud/cloud-init.disabled") +self.ssh_root_check("yum update -y") +self.ssh_root_check("yum install -y docker make git") +self.ssh_root_check("systemctl enable docker") +self.ssh_root("poweroff") +self.wait() +if os.path.exists(img): +os.remove(img) +os.rename(img_tmp, img) +return 0 + +if __name__ == "__main__": +sys.exit(basevm.main(CentosVM)) -- 2.17.1
[Qemu-devel] [PATCH v3 2/4] tests/vm: Pass verbose flag into VM make commands
Our Makefile has: vm-build-%: tests/vm/%.img $(call quiet-command, \ $(SRC_PATH)/tests/vm/$* \ $(if $(V)$(DEBUG), --debug) \ $(if $(DEBUG), --interactive) \ the intention of which is to let the make command in VM have V=1 if V=1 is set. We pass this to the BUILD_SCRIPT format. Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Fam Zheng --- tests/vm/basevm.py | 3 ++- tests/vm/freebsd | 4 ++-- tests/vm/netbsd | 4 ++-- tests/vm/openbsd | 4 ++-- tests/vm/ubuntu.i386 | 4 ++-- 5 files changed, 10 insertions(+), 9 deletions(-) diff --git a/tests/vm/basevm.py b/tests/vm/basevm.py index e5d6a328d5..25361c6b7d 100755 --- a/tests/vm/basevm.py +++ b/tests/vm/basevm.py @@ -240,7 +240,8 @@ def main(vmcls): vm.add_source_dir(args.build_qemu) cmd = [vm.BUILD_SCRIPT.format( configure_opts = " ".join(argv), - jobs=args.jobs)] + jobs=args.jobs, + verbose="1" if args.debug else "")] else: cmd = argv img = args.image diff --git a/tests/vm/freebsd b/tests/vm/freebsd index 039dad8f69..b20612a6c9 100755 --- a/tests/vm/freebsd +++ b/tests/vm/freebsd @@ -23,8 +23,8 @@ class FreeBSDVM(basevm.BaseVM): cd $(mktemp -d /var/tmp/qemu-test.XX); tar -xf /dev/vtbd1; ./configure {configure_opts}; -gmake -j{jobs}; -gmake check; +gmake -j{jobs} V={verbose}; +gmake check V={verbose}; """ def build_image(self, img): diff --git a/tests/vm/netbsd b/tests/vm/netbsd index 3972d8b45c..3f9d34a208 100755 --- a/tests/vm/netbsd +++ b/tests/vm/netbsd @@ -23,8 +23,8 @@ class NetBSDVM(basevm.BaseVM): cd $(mktemp -d /var/tmp/qemu-test.XX); tar -xf /dev/rld1a; ./configure --python=python2.7 {configure_opts}; -gmake -j{jobs}; -gmake check; +gmake -j{jobs} V={verbose}; +gmake check V={verbose}; """ def build_image(self, img): diff --git a/tests/vm/openbsd b/tests/vm/openbsd index 6ae16d97fd..3285c1abde 100755 --- a/tests/vm/openbsd +++ b/tests/vm/openbsd @@ -23,9 +23,9 @@ class OpenBSDVM(basevm.BaseVM): cd $(mktemp -d /var/tmp/qemu-test.XX); tar -xf /dev/rsd1c; ./configure --cc=x86_64-unknown-openbsd6.1-gcc-4.9.4 --python=python2.7 {configure_opts}; -gmake -j{jobs}; +gmake -j{jobs} V={verbose}; # XXX: "gmake check" seems to always hang or fail -#gmake check; +#gmake check V={verbose}; """ def build_image(self, img): diff --git a/tests/vm/ubuntu.i386 b/tests/vm/ubuntu.i386 index fc319e0e6e..b4eaa0dedc 100755 --- a/tests/vm/ubuntu.i386 +++ b/tests/vm/ubuntu.i386 @@ -25,8 +25,8 @@ class UbuntuX86VM(basevm.BaseVM): sudo chmod a+r /dev/vdb; tar -xf /dev/vdb; ./configure {configure_opts}; -make -j{jobs}; -make check; +make -j{jobs} V={verbose}; +make check V={verbose}; """ def _gen_cloud_init_iso(self): -- 2.17.1
[Qemu-devel] [PATCH v3 1/4] tests: Add an option for snapshot (default: off)
Not using snapshot has the benefit of automatically persisting useful test harnesses, such as docker images and ccache database. Although it will lose some cleanness, it is imaginably useful for patchew. Signed-off-by: Fam Zheng --- tests/vm/basevm.py | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/tests/vm/basevm.py b/tests/vm/basevm.py index 3643117816..e5d6a328d5 100755 --- a/tests/vm/basevm.py +++ b/tests/vm/basevm.py @@ -216,6 +216,8 @@ def parse_args(vm_name): help="build QEMU from source in guest") parser.add_option("--interactive", "-I", action="store_true", help="Interactively run command") +parser.add_option("--snapshot", "-s", action="store_true", + help="run tests with a snapshot") parser.disable_interspersed_args() return parser.parse_args() @@ -241,7 +243,10 @@ def main(vmcls): jobs=args.jobs)] else: cmd = argv -vm.boot(args.image + ",snapshot=on") +img = args.image +if args.snapshot: +img += ",snapshot=on" +vm.boot(img) vm.wait_ssh() except Exception as e: if isinstance(e, SystemExit) and e.code == 0: -- 2.17.1
[Qemu-devel] [PATCH v3 0/4] Add a CentOS test image to run docker tests
v3: Add 'make vm-test' document. [Phil] v2: Drop archive-source.sh changes. The new test depends on the iotests nbd fix I posted today to pass. Docker testing on patchew has long suffered from 'make check' hangings. The cleanness of VM testing is the cure. Now let's add a CentOS 7 image to run the tests. It's purely ad-hoc, but hopefully still easy to understand and use for everyone. The first patch makes passing source code from host to the container in VM working, and is a nice clean up. The second patch makes caches work, to speed up repetitive runs like on patchew. The last patch adds the new image that does the job. Two out of three docker tests running on patchew.org are added to the image. I'll wait for Peter to fix the 'docker-test-quick@centos6' hanging (oob test) before adding it too. Fam Fam Zheng (4): tests: Add an option for snapshot (default: off) tests/vm: Pass verbose flag into VM make commands tests: Allow overriding archive path with SRC_ARCHIVE tests: Add centos VM testing tests/docker/Makefile.include | 7 ++- tests/vm/Makefile.include | 3 +- tests/vm/basevm.py| 10 - tests/vm/centos | 84 +++ tests/vm/freebsd | 4 +- tests/vm/netbsd | 4 +- tests/vm/openbsd | 4 +- tests/vm/ubuntu.i386 | 4 +- 8 files changed, 107 insertions(+), 13 deletions(-) create mode 100755 tests/vm/centos -- 2.17.1
[Qemu-devel] [PATCH v3 3/4] tests: Allow overriding archive path with SRC_ARCHIVE
In VM based tests, the source archive is created in host, we don't have to run archive-source.sh again, as it complicates the Makefile and scripts. Signed-off-by: Fam Zheng --- tests/docker/Makefile.include | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/tests/docker/Makefile.include b/tests/docker/Makefile.include index b2a7e761cc..f0525b91a7 100644 --- a/tests/docker/Makefile.include +++ b/tests/docker/Makefile.include @@ -27,8 +27,11 @@ DOCKER_SRC_COPY := $(BUILD_DIR)/docker-src.$(CUR_TIME) $(DOCKER_SRC_COPY): @mkdir $@ - $(call quiet-command, cd $(SRC_PATH) && scripts/archive-source.sh $@/qemu.tar, \ - "GEN", "$@/qemu.tar") + $(if $(SRC_ARCHIVE), \ + $(call quiet-command, cp "$(SRC_ARCHIVE)" $@/qemu.tar, \ + "CP", "$@/qemu.tar"), \ + $(call quiet-command, cd $(SRC_PATH) && scripts/archive-source.sh $@/qemu.tar, \ + "GEN", "$@/qemu.tar")) $(call quiet-command, cp $(SRC_PATH)/tests/docker/run $@/run, \ "COPY","RUNNER") -- 2.17.1
Re: [Qemu-devel] [PATCH] docker: Install more packages in centos7
On Wed, 07/11 14:58, Fam Zheng wrote: > This makes test-block work. > > Signed-off-by: Fam Zheng Queued, thanks.
[Qemu-devel] [PATCH v2 4/4] tests: Add centos VM testing
This one does docker testing in the VM. It is intended to replace the native docker testing on patchew testers. Signed-off-by: Fam Zheng --- tests/vm/Makefile.include | 2 +- tests/vm/centos | 84 +++ 2 files changed, 85 insertions(+), 1 deletion(-) create mode 100755 tests/vm/centos diff --git a/tests/vm/Makefile.include b/tests/vm/Makefile.include index 5daa2a3b73..e492cd73e5 100644 --- a/tests/vm/Makefile.include +++ b/tests/vm/Makefile.include @@ -2,7 +2,7 @@ .PHONY: vm-build-all -IMAGES := ubuntu.i386 freebsd netbsd openbsd +IMAGES := ubuntu.i386 freebsd netbsd openbsd centos IMAGE_FILES := $(patsubst %, tests/vm/%.img, $(IMAGES)) .PRECIOUS: $(IMAGE_FILES) diff --git a/tests/vm/centos b/tests/vm/centos new file mode 100755 index 00..afd560c564 --- /dev/null +++ b/tests/vm/centos @@ -0,0 +1,84 @@ +#!/usr/bin/env python +# +# CentOS image +# +# Copyright 2018 Red Hat Inc. +# +# Authors: +# Fam Zheng +# +# This code is licensed under the GPL version 2 or later. See +# the COPYING file in the top-level directory. +# + +import os +import sys +import subprocess +import basevm +import time + +class CentosVM(basevm.BaseVM): +name = "centos" +BUILD_SCRIPT = """ +set -e; +cd $(mktemp -d); +export SRC_ARCHIVE=/dev/vdb; +sudo chmod a+r $SRC_ARCHIVE; +tar -xf $SRC_ARCHIVE; +make docker-test-block@centos7 V={verbose} J={jobs}; +make docker-test-quick@centos7 V={verbose} J={jobs}; +make docker-test-mingw@fedora V={verbose} J={jobs}; +""" + +def _gen_cloud_init_iso(self): +cidir = self._tmpdir +mdata = open(os.path.join(cidir, "meta-data"), "w") +mdata.writelines(["instance-id: centos-vm-0\n", + "local-hostname: centos-guest\n"]) +mdata.close() +udata = open(os.path.join(cidir, "user-data"), "w") +udata.writelines(["#cloud-config\n", + "chpasswd:\n", + " list: |\n", + "root:%s\n" % self.ROOT_PASS, + "%s:%s\n" % (self.GUEST_USER, self.GUEST_PASS), + " expire: False\n", + "users:\n", + " - name: %s\n" % self.GUEST_USER, + "sudo: ALL=(ALL) NOPASSWD:ALL\n", + "ssh-authorized-keys:\n", + "- %s\n" % basevm.SSH_PUB_KEY, + " - name: root\n", + "ssh-authorized-keys:\n", + "- %s\n" % basevm.SSH_PUB_KEY, + "locale: en_US.UTF-8\n"]) +udata.close() +subprocess.check_call(["genisoimage", "-output", "cloud-init.iso", + "-volid", "cidata", "-joliet", "-rock", + "user-data", "meta-data"], + cwd=cidir, + stdin=self._devnull, stdout=self._stdout, + stderr=self._stdout) +return os.path.join(cidir, "cloud-init.iso") + +def build_image(self, img): +cimg = self._download_with_cache("https://cloud.centos.org/centos/7/images/CentOS-7-x86_64-GenericCloud-1802.qcow2.xz;) +img_tmp = img + ".tmp" +subprocess.check_call(["cp", "-f", cimg, img_tmp + ".xz"]) +subprocess.check_call(["xz", "-df", img_tmp + ".xz"]) +subprocess.check_call(["qemu-img", "resize", img_tmp, "50G"]) +self.boot(img_tmp, extra_args = ["-cdrom", self._gen_cloud_init_iso()]) +self.wait_ssh() +self.ssh_root_check("touch /etc/cloud/cloud-init.disabled") +self.ssh_root_check("yum update -y") +self.ssh_root_check("yum install -y docker make git") +self.ssh_root_check("systemctl enable docker") +self.ssh_root("poweroff") +self.wait() +if os.path.exists(img): +os.remove(img) +os.rename(img_tmp, img) +return 0 + +if __name__ == "__main__": +sys.exit(basevm.main(CentosVM)) -- 2.17.1
[Qemu-devel] [PATCH v2 2/4] tests/vm: Pass verbose flag into VM make commands
Our Makefile has: vm-build-%: tests/vm/%.img $(call quiet-command, \ $(SRC_PATH)/tests/vm/$* \ $(if $(V)$(DEBUG), --debug) \ $(if $(DEBUG), --interactive) \ the intention of which is to let the make command in VM have V=1 if V=1 is set. We pass this to the BUILD_SCRIPT format. Signed-off-by: Fam Zheng --- tests/vm/basevm.py | 3 ++- tests/vm/freebsd | 4 ++-- tests/vm/netbsd | 4 ++-- tests/vm/openbsd | 4 ++-- tests/vm/ubuntu.i386 | 4 ++-- 5 files changed, 10 insertions(+), 9 deletions(-) diff --git a/tests/vm/basevm.py b/tests/vm/basevm.py index e5d6a328d5..25361c6b7d 100755 --- a/tests/vm/basevm.py +++ b/tests/vm/basevm.py @@ -240,7 +240,8 @@ def main(vmcls): vm.add_source_dir(args.build_qemu) cmd = [vm.BUILD_SCRIPT.format( configure_opts = " ".join(argv), - jobs=args.jobs)] + jobs=args.jobs, + verbose="1" if args.debug else "")] else: cmd = argv img = args.image diff --git a/tests/vm/freebsd b/tests/vm/freebsd index 039dad8f69..b20612a6c9 100755 --- a/tests/vm/freebsd +++ b/tests/vm/freebsd @@ -23,8 +23,8 @@ class FreeBSDVM(basevm.BaseVM): cd $(mktemp -d /var/tmp/qemu-test.XX); tar -xf /dev/vtbd1; ./configure {configure_opts}; -gmake -j{jobs}; -gmake check; +gmake -j{jobs} V={verbose}; +gmake check V={verbose}; """ def build_image(self, img): diff --git a/tests/vm/netbsd b/tests/vm/netbsd index 3972d8b45c..3f9d34a208 100755 --- a/tests/vm/netbsd +++ b/tests/vm/netbsd @@ -23,8 +23,8 @@ class NetBSDVM(basevm.BaseVM): cd $(mktemp -d /var/tmp/qemu-test.XX); tar -xf /dev/rld1a; ./configure --python=python2.7 {configure_opts}; -gmake -j{jobs}; -gmake check; +gmake -j{jobs} V={verbose}; +gmake check V={verbose}; """ def build_image(self, img): diff --git a/tests/vm/openbsd b/tests/vm/openbsd index 6ae16d97fd..3285c1abde 100755 --- a/tests/vm/openbsd +++ b/tests/vm/openbsd @@ -23,9 +23,9 @@ class OpenBSDVM(basevm.BaseVM): cd $(mktemp -d /var/tmp/qemu-test.XX); tar -xf /dev/rsd1c; ./configure --cc=x86_64-unknown-openbsd6.1-gcc-4.9.4 --python=python2.7 {configure_opts}; -gmake -j{jobs}; +gmake -j{jobs} V={verbose}; # XXX: "gmake check" seems to always hang or fail -#gmake check; +#gmake check V={verbose}; """ def build_image(self, img): diff --git a/tests/vm/ubuntu.i386 b/tests/vm/ubuntu.i386 index fc319e0e6e..b4eaa0dedc 100755 --- a/tests/vm/ubuntu.i386 +++ b/tests/vm/ubuntu.i386 @@ -25,8 +25,8 @@ class UbuntuX86VM(basevm.BaseVM): sudo chmod a+r /dev/vdb; tar -xf /dev/vdb; ./configure {configure_opts}; -make -j{jobs}; -make check; +make -j{jobs} V={verbose}; +make check V={verbose}; """ def _gen_cloud_init_iso(self): -- 2.17.1
[Qemu-devel] [PATCH v2 3/4] tests: Allow overriding archive path with SRC_ARCHIVE
In VM based tests, the source archive is created in host, we don't have to run archive-source.sh again, as it complicates the Makefile and scripts. Signed-off-by: Fam Zheng --- tests/docker/Makefile.include | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/tests/docker/Makefile.include b/tests/docker/Makefile.include index b2a7e761cc..f0525b91a7 100644 --- a/tests/docker/Makefile.include +++ b/tests/docker/Makefile.include @@ -27,8 +27,11 @@ DOCKER_SRC_COPY := $(BUILD_DIR)/docker-src.$(CUR_TIME) $(DOCKER_SRC_COPY): @mkdir $@ - $(call quiet-command, cd $(SRC_PATH) && scripts/archive-source.sh $@/qemu.tar, \ - "GEN", "$@/qemu.tar") + $(if $(SRC_ARCHIVE), \ + $(call quiet-command, cp "$(SRC_ARCHIVE)" $@/qemu.tar, \ + "CP", "$@/qemu.tar"), \ + $(call quiet-command, cd $(SRC_PATH) && scripts/archive-source.sh $@/qemu.tar, \ + "GEN", "$@/qemu.tar")) $(call quiet-command, cp $(SRC_PATH)/tests/docker/run $@/run, \ "COPY","RUNNER") -- 2.17.1
[Qemu-devel] [PATCH v2 0/4] Add a CentOS test image to run docker tests
v2: Drop archive-source.sh changes. The new test depends on the iotests nbd fix I posted today to pass. Docker testing on patchew has long suffered from 'make check' hangings. The cleanness of VM testing is the cure. Now let's add a CentOS 7 image to run the tests. It's purely ad-hoc, but hopefully still easy to understand and use for everyone. The first patch makes passing source code from host to the container in VM working, and is a nice clean up. The second patch makes caches work, to speed up repetitive runs like on patchew. The last patch adds the new image that does the job. Two out of three docker tests running on patchew.org are added to the image. I'll wait for Peter to fix the 'docker-test-quick@centos6' hanging (oob test) before adding it too. Fam Fam Zheng (4): tests: Add an option for snapshot (default: off) tests/vm: Pass verbose flag into VM make commands tests: Allow overriding archive path with SRC_ARCHIVE tests: Add centos VM testing tests/docker/Makefile.include | 7 ++- tests/vm/Makefile.include | 2 +- tests/vm/basevm.py| 10 - tests/vm/centos | 84 +++ tests/vm/freebsd | 4 +- tests/vm/netbsd | 4 +- tests/vm/openbsd | 4 +- tests/vm/ubuntu.i386 | 4 +- 8 files changed, 106 insertions(+), 13 deletions(-) create mode 100755 tests/vm/centos -- 2.17.1
[Qemu-devel] [PATCH v2 1/4] tests: Add an option for snapshot (default: off)
Not using snapshot has the benefit of automatically persisting useful test harnesses, such as docker images and ccache database. Although it will lose some cleanness, it is imaginably useful for patchew. Signed-off-by: Fam Zheng --- tests/vm/basevm.py | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/tests/vm/basevm.py b/tests/vm/basevm.py index 3643117816..e5d6a328d5 100755 --- a/tests/vm/basevm.py +++ b/tests/vm/basevm.py @@ -216,6 +216,8 @@ def parse_args(vm_name): help="build QEMU from source in guest") parser.add_option("--interactive", "-I", action="store_true", help="Interactively run command") +parser.add_option("--snapshot", "-s", action="store_true", + help="run tests with a snapshot") parser.disable_interspersed_args() return parser.parse_args() @@ -241,7 +243,10 @@ def main(vmcls): jobs=args.jobs)] else: cmd = argv -vm.boot(args.image + ",snapshot=on") +img = args.image +if args.snapshot: +img += ",snapshot=on" +vm.boot(img) vm.wait_ssh() except Exception as e: if isinstance(e, SystemExit) and e.code == 0: -- 2.17.1
Re: [Qemu-devel] [PATCH] docker: Update debootstrap script after Debian migration from Alioth to Salsa
On Tue, 07/03 12:35, Philippe Mathieu-Daudé wrote: > This silents the following warning: > > Cloning into './debootstrap.git'... > warning: redirecting to > https://salsa.debian.org/installer-team/debootstrap.git/ > > See https://lists.debian.org/debian-devel-announce/2018/01/msg4.html > > Signed-off-by: Philippe Mathieu-Daudé > --- > tests/docker/dockerfiles/debian-bootstrap.pre | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/tests/docker/dockerfiles/debian-bootstrap.pre > b/tests/docker/dockerfiles/debian-bootstrap.pre > index 7c76dce663..44b18db7a2 100755 > --- a/tests/docker/dockerfiles/debian-bootstrap.pre > +++ b/tests/docker/dockerfiles/debian-bootstrap.pre > @@ -53,7 +53,7 @@ if [ -z $DEBOOTSTRAP_DIR ]; then > NEED_DEBOOTSTRAP=true > fi > if $NEED_DEBOOTSTRAP; then > -DEBOOTSTRAP_SOURCE=https://anonscm.debian.org/git/d-i/debootstrap.git > + > DEBOOTSTRAP_SOURCE=https://salsa.debian.org/installer-team/debootstrap.git > git clone ${DEBOOTSTRAP_SOURCE} ./debootstrap.git > export DEBOOTSTRAP_DIR=./debootstrap.git > DEBOOTSTRAP=./debootstrap.git/debootstrap > -- > 2.18.0 > Queued, thanks. Fam
[Qemu-devel] [PATCH] iotests: 153: Fix dead code
This step was left behind my mistake. As suggested by the echoed text, the intention was to test two devices with the same image, with different options. The behavior should be the same as two QEMU processes. Complete it. Signed-off-by: Fam Zheng --- tests/qemu-iotests/153 | 2 ++ tests/qemu-iotests/153.out | 25 + 2 files changed, 27 insertions(+) diff --git a/tests/qemu-iotests/153 b/tests/qemu-iotests/153 index 673813cd39..0daeb1b005 100755 --- a/tests/qemu-iotests/153 +++ b/tests/qemu-iotests/153 @@ -162,6 +162,7 @@ for opts1 in "" "read-only=on" "read-only=on,force-share=on"; do _cleanup_qemu done +test_opts="read-only=off read-only=on read-only=on,force-share=on" for opt1 in $test_opts; do for opt2 in $test_opts; do echo @@ -170,6 +171,7 @@ for opt1 in $test_opts; do done done +echo echo "== Creating ${TEST_IMG}.[abc] ==" | _filter_testdir ( $QEMU_IMG create -f qcow2 "${TEST_IMG}.a" -b "${TEST_IMG}" diff --git a/tests/qemu-iotests/153.out b/tests/qemu-iotests/153.out index 3492ba7a29..93eaf10486 100644 --- a/tests/qemu-iotests/153.out +++ b/tests/qemu-iotests/153.out @@ -369,6 +369,31 @@ _qemu_img_wrapper bench -U -w -c 1 TEST_DIR/t.qcow2 qemu-img: Could not open 'TEST_DIR/t.qcow2': force-share=on can only be used with read-only images Round done + +== Two devices with the same image (read-only=off - read-only=off) == +QEMU_PROG: -drive if=none,file=TEST_DIR/t.qcow2,read-only=off: Failed to get "write" lock +Is another process using the image? + +== Two devices with the same image (read-only=off - read-only=on) == +QEMU_PROG: -drive if=none,file=TEST_DIR/t.qcow2,read-only=on: Failed to get shared "write" lock +Is another process using the image? + +== Two devices with the same image (read-only=off - read-only=on,force-share=on) == + +== Two devices with the same image (read-only=on - read-only=off) == +QEMU_PROG: -drive if=none,file=TEST_DIR/t.qcow2,read-only=off: Failed to get "write" lock +Is another process using the image? + +== Two devices with the same image (read-only=on - read-only=on) == + +== Two devices with the same image (read-only=on - read-only=on,force-share=on) == + +== Two devices with the same image (read-only=on,force-share=on - read-only=off) == + +== Two devices with the same image (read-only=on,force-share=on - read-only=on) == + +== Two devices with the same image (read-only=on,force-share=on - read-only=on,force-share=on) == + == Creating TEST_DIR/t.qcow2.[abc] == Formatting 'TEST_DIR/t.IMGFMT.a', fmt=IMGFMT size=33554432 backing_file=TEST_DIR/t.IMGFMT Formatting 'TEST_DIR/t.IMGFMT.b', fmt=IMGFMT size=33554432 backing_file=TEST_DIR/t.IMGFMT -- 2.17.1
[Qemu-devel] [PATCH] docker: Install more packages in centos7
This makes test-block work. Signed-off-by: Fam Zheng --- tests/docker/dockerfiles/centos7.docker | 3 +++ 1 file changed, 3 insertions(+) diff --git a/tests/docker/dockerfiles/centos7.docker b/tests/docker/dockerfiles/centos7.docker index 575de29a0a..83462b7205 100644 --- a/tests/docker/dockerfiles/centos7.docker +++ b/tests/docker/dockerfiles/centos7.docker @@ -3,6 +3,7 @@ RUN yum install -y epel-release centos-release-xen RUN yum -y update ENV PACKAGES \ bison \ +bzip2 \ bzip2-devel \ ccache \ csnappy-devel \ @@ -12,10 +13,12 @@ ENV PACKAGES \ gettext \ git \ glib2-devel \ +libaio-devel \ libepoxy-devel \ libfdt-devel \ librdmacm-devel \ lzo-devel \ +nettle-devel \ make \ mesa-libEGL-devel \ mesa-libgbm-devel \ -- 2.17.1
[Qemu-devel] [PATCH v2] iotests: nbd: Stop qemu-nbd before remaking image
197 is one example where _make_test_img is used twice without stopping the NBD server in between. An error will occur like this: @@ -26,9 +26,13 @@ === Partial final cluster === +qemu-img: TEST_DIR/t.IMGFMT: Failed to get "resize" lock +Is another process using the image? Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1024 +Failed to find an available port: Address already in use read 1024/1024 bytes at offset 0 Patch _make_test_img to stop the old qemu-nbd before starting a new one, which fixes this problem, and similarly 215. Signed-off-by: Fam Zheng --- tests/qemu-iotests/common.rc | 21 +++-- 1 file changed, 15 insertions(+), 6 deletions(-) diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc index d054cb97fc..44bee16a5e 100644 --- a/tests/qemu-iotests/common.rc +++ b/tests/qemu-iotests/common.rc @@ -195,6 +195,16 @@ _use_sample_img() fi } +_stop_nbd_server() +{ +if [ -f "${QEMU_TEST_DIR}/qemu-nbd.pid" ]; then +local QEMU_NBD_PID +read QEMU_NBD_PID < "${QEMU_TEST_DIR}/qemu-nbd.pid" +kill ${QEMU_NBD_PID} +rm -f "${QEMU_TEST_DIR}/qemu-nbd.pid" +fi +} + _make_test_img() { # extra qemu-img options can be added by tests @@ -234,6 +244,10 @@ _make_test_img() extra_img_options="-o $optstr $extra_img_options" fi +if [ $IMGPROTO = "nbd" ]; then +_stop_nbd_server +fi + # XXX(hch): have global image options? ( if [ $use_backing = 1 ]; then @@ -274,12 +288,7 @@ _cleanup_test_img() case "$IMGPROTO" in nbd) -if [ -f "${QEMU_TEST_DIR}/qemu-nbd.pid" ]; then -local QEMU_NBD_PID -read QEMU_NBD_PID < "${QEMU_TEST_DIR}/qemu-nbd.pid" -kill ${QEMU_NBD_PID} -rm -f "${QEMU_TEST_DIR}/qemu-nbd.pid" -fi +_stop_nbd_server rm -f "$TEST_IMG_FILE" ;; vxhs) -- 2.17.1
Re: [Qemu-devel] [PATCH 2/4] tests/vm: Add a dependency to qemu-img
On Tue, 07/03 12:56, Philippe Mathieu-Daudé wrote: > Hi Fam, > > On 07/02/2018 12:19 PM, Philippe Mathieu-Daudé wrote: > > On 07/02/2018 04:18 AM, Fam Zheng wrote: > >> On Thu, 06/28 12:35, Philippe Mathieu-Daudé wrote: > >>> Before the first use, the VM image are resized with qemu-img. > >>> > >>> Add a dependency to the qemu-img tool to fix: > >>> > >>> $ make vm-build-ubuntu.i386 > >>> Traceback (most recent call last): > >>> File "source/qemu/tests/vm/basevm.py", line 236, in main > >>> return vm.build_image(args.image) > >>> File "tests/vm/ubuntu.i386", line 67, in build_image > >>> subprocess.check_call(["qemu-img", "resize", img_tmp, "50G"]) > >>> OSError: [Errno 2] No such file or directory > >>> tests/vm/Makefile.include:23: recipe for target > >>> 'tests/vm/ubuntu.i386.img' failed > >>> make: *** [tests/vm/ubuntu.i386.img] Error 2 > >>> > >>> Signed-off-by: Philippe Mathieu-Daudé > >>> --- > >>> tests/vm/Makefile.include | 3 +++ > >>> 1 file changed, 3 insertions(+) > >>> > >>> diff --git a/tests/vm/Makefile.include b/tests/vm/Makefile.include > >>> index 5daa2a3b73..e5e8146267 100644 > >>> --- a/tests/vm/Makefile.include > >>> +++ b/tests/vm/Makefile.include > >>> @@ -17,10 +17,13 @@ vm-test: > >>> > >>> vm-build-all: $(addprefix vm-build-, $(IMAGES)) > >>> > >>> +$(IMAGE_FILES): qemu-img$(EXESUF) > >> > >> No, please don't do this. Let's keep on supporting vm tests from a clean > >> source > >> tree. Similar to qemu_bin, this can be handled in basevm.py. > > > > OK. > > The ubuntu.i386 VM is the only one using the qemu-img tool in his > build_image() method implementation. Is it OK to add this check there? > The basevm.build_image() is only an abstract method. Maybe you can add a BaseVM.qemu_img() method that looks up for available qemu-img on the system or build dir. Fam > > >>> + > >>> tests/vm/%.img: $(SRC_PATH)/tests/vm/% \ > >>> $(SRC_PATH)/tests/vm/basevm.py \ > >>> $(SRC_PATH)/tests/vm/Makefile.include > >>> $(call quiet-command, \ > >>> + PATH=$(PATH):. \ > >>> $< \ > >>> $(if $(V)$(DEBUG), --debug) \ > >>> --image "$@" \ > >>> -- > >>> 2.18.0 > >>> > >>
Re: [Qemu-devel] [PATCH 0/2] virtio-scsi: fix hotplug ->reset() vs event race
On Tue, 07/10 16:50, Stefan Hajnoczi wrote: > The virtio-scsi command virtqueues run during hotplug. This creates the > possibility of race conditions since the guest can submit commands while the > monitor is performing hotplug. > > See Patch 2 for a fix for the ->reset() vs event race condition that Zhengui > Li > encountered. > > Zhengui Li: Sorry it took me so long to look into this. Please let me know if > this fixes the issue you are seeing. Thanks! > > Stefan Hajnoczi (2): > qdev: add HotplugHandler->post_plug() callback > virtio-scsi: fix hotplug ->reset() vs event race > > include/hw/hotplug.h | 12 > hw/core/hotplug.c | 11 +++ > hw/core/qdev.c| 7 +++ > hw/scsi/virtio-scsi.c | 11 ++- > 4 files changed, 40 insertions(+), 1 deletion(-) > > -- > 2.17.1 Reviewed-by: Fam Zheng
Re: [Qemu-devel] [PATCH for 3.0 10/10] docker: add expansion for docker-test-FOO to Makefile.include
On Tue, 07/10 22:04, Alex Bennée wrote: > > Philippe Mathieu-Daudé writes: > > > On 07/09/2018 12:21 PM, Alex Bennée wrote: > >> This allows us to run a particular test on all docker images. For > >> example: > >> > >> make docker-test-unit > >> > >> Will run the unit tests on every supported image. > >> > >> Signed-off-by: Alex Bennée > >> --- > >> tests/docker/Makefile.include | 2 ++ > >> 1 file changed, 2 insertions(+) > >> > >> diff --git a/tests/docker/Makefile.include b/tests/docker/Makefile.include > >> index fe63aacf69..765b2c36f2 100644 > >> --- a/tests/docker/Makefile.include > >> +++ b/tests/docker/Makefile.include > >> @@ -152,6 +152,7 @@ $(foreach i,$(filter-out > >> $(DOCKER_PARTIAL_IMAGES),$(DOCKER_IMAGES) $(DOCKER_DEPR > >>) \ > >>$(foreach t,$(DOCKER_TESTS), \ > >>$(eval docker-test: docker-$t@$i) \ > >> + $(eval docker-$t: docker-$t@$i) \ > >>) \ > >> ) > >> > >> @@ -162,6 +163,7 @@ docker: > >>@echo > >>@echo 'docker: Print this help.' > >>@echo 'docker-test: Run all image/test combinations.' > > > > Can we rename "docker-test" -> "docker-tests" or is it too late? > > Probably, I don't know how many people were actively using it. Fam? I don't know either. Maybe because I'm not native English speaker, but why bother? "Docker test" sounds like a generic "cover all" test, whereas "docker tests" sounds like a set of many independent tests based on Docker. Both make sense to me, though I agree the latter describes it better. On the other hand, maybe we should create a selection instead of "all combinations" and call it docker-test, to make it a bit more useful. Then we can rename the current docker-test to "docker-all" or "docker-test-all". Fam > > > > >> + @echo 'docker-TEST: Run TEST on all image combinations.' > >>@echo 'docker-clean:Kill and remove residual docker testing > >> containers.' > >>@echo 'docker-TEST@IMAGE: Run "TEST" in container "IMAGE".' > >>@echo ' Note: "TEST" is one of the listed test > >> name,' > >> > > > -- > Alex Bennée
[Qemu-devel] [PATCH] qemu-img: Document copy offloading implications with -S and -c
Explicitly enabling zero detection or compression suppresses copy offloading during convert. Document it. Suggested-by: Kevin Wolf Signed-off-by: Fam Zheng --- qemu-img.texi | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/qemu-img.texi b/qemu-img.texi index aeb1b9e66c..5853cd18d1 100644 --- a/qemu-img.texi +++ b/qemu-img.texi @@ -96,7 +96,8 @@ will enumerate information about backing files in a disk image chain. Refer below for further description. @item -c -indicates that target image must be compressed (qcow format only) +indicates that target image must be compressed (qcow format only). If this +option is used, copy offloading will not be attempted. @item -h with or without a command shows help and lists the supported formats @@ -115,7 +116,8 @@ in case both @var{-q} and @var{-p} options are used. indicates the consecutive number of bytes that must contain only zeros for qemu-img to create a sparse image during conversion. This value is rounded down to the nearest 512 bytes. You may use the common size suffixes like -@code{k} for kilobytes. +@code{k} for kilobytes. If this option is used, copy offloading will not be +attempted. @item -t @var{cache} specifies the cache mode that should be used with the (destination) file. See -- 2.17.1
[Qemu-devel] [PATCH] iotests: 222: Don't run with luks
Luks needs special parameters to operate the image. Since this test is focusing on image fleecing, skip skip that format. Signed-off-by: Fam Zheng --- tests/qemu-iotests/222 | 2 ++ 1 file changed, 2 insertions(+) diff --git a/tests/qemu-iotests/222 b/tests/qemu-iotests/222 index ff3bfc1470..0ead56d574 100644 --- a/tests/qemu-iotests/222 +++ b/tests/qemu-iotests/222 @@ -25,6 +25,8 @@ import iotests from iotests import log, qemu_img, qemu_io, qemu_io_silent iotests.verify_platform(['linux']) +iotests.verify_image_format(supported_fmts=['qcow2', 'qcow', 'qed', 'vmdk', +'vhdx', 'raw']) patterns = [("0x5d", "0", "64k"), ("0xd5", "1M","64k"), -- 2.17.1
Re: [Qemu-devel] [PATCH for 3.0 01/10] tests/.gitignore: don't ignore docker tests
On Tue, 07/10 08:54, Alex Bennée wrote: > > Fam Zheng writes: > > > On Mon, 07/09 16:21, Alex Bennée wrote: > >> This was being a little over enthusiastic hiding files. > > > > What is "this" that hides test-* and calls for this patch? > > I thought that was implicit from the first line, this is the .gitignore. Indeed you're right. Maybe that line should be more specific, but I have no strong opinion. Fam > > > > >> > >> Signed-off-by: Alex Bennée > >> --- > >> tests/.gitignore | 1 + > >> 1 file changed, 1 insertion(+) > >> > >> diff --git a/tests/.gitignore b/tests/.gitignore > >> index 08e2df1ce1..72c18aaab0 100644 > >> --- a/tests/.gitignore > >> +++ b/tests/.gitignore > >> @@ -9,6 +9,7 @@ qht-bench > >> rcutorture > >> test-* > >> !test-*.c > >> +!docker/test-* > >> test-qapi-commands.[ch] > >> test-qapi-events.[ch] > >> test-qapi-types.[ch] > >> -- > >> 2.17.1 > >> > > > -- > Alex Bennée
Re: [Qemu-devel] [PATCH for 3.0 00/10] various docker fixes
On Mon, 07/09 16:21, Alex Bennée wrote: > Hi, > > The addition of the cross compilers for check-tcg broke the ability to > run "make docker-test". In truth this was probably already broken as > it attempts to run every single test with every single docker image > which isn't something that gets done on a regular basis. > > Anyway the patches clean up the ability to do that in a sane way > although it still takes a long time to run the full test set. > > To help with running a better subset I've expanded the individual > tests so you can now run a line like: > > make docker-test-build TARGET_LIST=aarch64-softmmu J=30 > > To make sure your favourite architecture still builds everywhere. > > There is now also a docker-test-unit which just runs the unit tests > although we have to do a little re-factoring to make sure we don't > attempt to run "make check" steps when the docker image isn't capable > of it. This is also needed to make sure the other test-FOO build tests > don't choke on the check step. > > Finally there is a minor tweak for .gitignore and a fix for docker.py > throwing backtraces when we attempt to calculate SID_AGE. A cleaner > re-factoring can be left for a future release. Thanks for the clean ups! Looks good in general. The only thing I notice is the test-unit script. Fam
Re: [Qemu-devel] [PATCH for 3.0 09/10] docker: add test-unit runner
On Mon, 07/09 16:21, Alex Bennée wrote: > This test doesn't even build QEMU, it just runs all the unit tests. > Intended to make checking unit tests on all docker images easier. > > Signed-off-by: Alex Bennée > --- > tests/docker/test-unit | 19 +++ > 1 file changed, 19 insertions(+) > create mode 100755 tests/docker/test-unit > > diff --git a/tests/docker/test-unit b/tests/docker/test-unit > new file mode 100755 > index 00..be0d90d748 > --- /dev/null > +++ b/tests/docker/test-unit > @@ -0,0 +1,19 @@ > +#!/bin/bash > +# > +# Build and run the unit tests > +# > +# Copyright (c) 2018 Linaro Ltd. > +# > +# Authors: > +# Alex Bennée > +# > +# This work is licensed under the terms of the GNU GPL, version 2 > +# or (at your option) any later version. See the COPYING file in > +# the top-level directory. > + > +. common.rc > + > +cd "$BUILD_DIR" > + > +configure_qemu This reads a bit unusual and counter-intuitive: how could configure_qemu succeed when the env cannot actually build it? Does configure_qemu fail but the side effects needed to run build and run unit tests are done anyway? Either way, please add a comment explaining what is happening here. Also, should we test the exit code of configure_qemu? > +check_qemu check-unit > -- > 2.17.1 >
Re: [Qemu-devel] [PATCH for 3.0 05/10] docker: move make check into check_qemu helper
On Mon, 07/09 16:21, Alex Bennée wrote: > Not all docker images can run the check step. Let's everything into a "move everything"? > common helper so we don't need to replicate checks in the future. > > Signed-off-by: Alex Bennée > --- > tests/docker/common.rc | 10 ++ > tests/docker/test-clang | 2 +- > tests/docker/test-debug | 2 +- > tests/docker/test-full | 2 +- > tests/docker/test-quick | 2 +- > 5 files changed, 14 insertions(+), 4 deletions(-) > > diff --git a/tests/docker/common.rc b/tests/docker/common.rc > index ba1f942328..cfc620d554 100755 > --- a/tests/docker/common.rc > +++ b/tests/docker/common.rc > @@ -40,6 +40,16 @@ build_qemu() > make $MAKEFLAGS > } > > +check_qemu() > +{ > +if test -z "$@"; then > +TEST="check" > +else > +TEST="$@" > +fi > +make $MAKEFLAGS $TEST > +} > + > test_fail() > { > echo "$@" > diff --git a/tests/docker/test-clang b/tests/docker/test-clang > index e90a793178..324e341cea 100755 > --- a/tests/docker/test-clang > +++ b/tests/docker/test-clang > @@ -23,5 +23,5 @@ OPTS="--cxx=clang++ --cc=clang --host-cc=clang" > #OPTS="$OPTS --extra-cflags=-fsanitize=undefined \ > #--extra-cflags=-fno-sanitize=float-divide-by-zero" > build_qemu $OPTS > -make $MAKEFLAGS check > +check_qemu > install_qemu > diff --git a/tests/docker/test-debug b/tests/docker/test-debug > index d3f9f70d01..137f4f2ddc 100755 > --- a/tests/docker/test-debug > +++ b/tests/docker/test-debug > @@ -22,5 +22,5 @@ OPTS="--cxx=clang++ --cc=clang --host-cc=clang" > OPTS="--enable-debug --enable-sanitizers $OPTS" > > build_qemu $OPTS > -make $MAKEFLAGS V=1 check > +check_qemu check V=1 > install_qemu > diff --git a/tests/docker/test-full b/tests/docker/test-full > index b4e42d25d7..aadc0f00a2 100755 > --- a/tests/docker/test-full > +++ b/tests/docker/test-full > @@ -15,4 +15,4 @@ > > cd "$BUILD_DIR" > > -build_qemu && make check $MAKEFLAGS && install_qemu > +build_qemu && check_qemu && install_qemu > diff --git a/tests/docker/test-quick b/tests/docker/test-quick > index 3b7bce6105..eee59c55fb 100755 > --- a/tests/docker/test-quick > +++ b/tests/docker/test-quick > @@ -18,5 +18,5 @@ cd "$BUILD_DIR" > DEF_TARGET_LIST="x86_64-softmmu,aarch64-softmmu" > TARGET_LIST=${TARGET_LIST:-$DEF_TARGET_LIST} \ > build_qemu > -make check $MAKEFLAGS > +check_qemu > install_qemu > -- > 2.17.1 >
Re: [Qemu-devel] [PATCH for 3.0 01/10] tests/.gitignore: don't ignore docker tests
On Mon, 07/09 16:21, Alex Bennée wrote: > This was being a little over enthusiastic hiding files. What is "this" that hides test-* and calls for this patch? > > Signed-off-by: Alex Bennée > --- > tests/.gitignore | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/tests/.gitignore b/tests/.gitignore > index 08e2df1ce1..72c18aaab0 100644 > --- a/tests/.gitignore > +++ b/tests/.gitignore > @@ -9,6 +9,7 @@ qht-bench > rcutorture > test-* > !test-*.c > +!docker/test-* > test-qapi-commands.[ch] > test-qapi-events.[ch] > test-qapi-types.[ch] > -- > 2.17.1 >
[Qemu-devel] [PATCH v3 10/10] block: Use common write req handling in truncate
Truncation is the last to convert from open coded req handling to reusing helpers. This time the permission check in prepare has to adapt to the new caller: it checks a different permission bit, and doesn't trigger the before write notifier. Also, truncation should always trigger a bs->total_sectors update and in turn call parent resize_cb. Update the condition in finish helper, too. It's intended to do a duplicated bs->read_only check before calling bdrv_co_write_req_prepare() so that we can be more informative with the error message, as bdrv_co_write_req_prepare() doesn't have Error parameter. Signed-off-by: Fam Zheng --- block/io.c | 57 ++ 1 file changed, 36 insertions(+), 21 deletions(-) diff --git a/block/io.c b/block/io.c index e3e2d5286d..abcb8b3de6 100644 --- a/block/io.c +++ b/block/io.c @@ -1582,14 +1582,24 @@ bdrv_co_write_req_prepare(BdrvChild *child, int64_t offset, uint64_t bytes, is_request_serialising_and_aligned(req)); assert(req->overlap_offset <= offset); assert(offset + bytes <= req->overlap_offset + req->overlap_bytes); - -if (flags & BDRV_REQ_WRITE_UNCHANGED) { -assert(child->perm & (BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE)); -} else { -assert(child->perm & BLK_PERM_WRITE); -} assert(end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE); -return notifier_with_return_list_notify(>before_write_notifiers, req); + +switch (req->type) { +case BDRV_TRACKED_WRITE: +case BDRV_TRACKED_DISCARD: +if (flags & BDRV_REQ_WRITE_UNCHANGED) { +assert(child->perm & (BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE)); +} else { +assert(child->perm & BLK_PERM_WRITE); +} +return notifier_with_return_list_notify(>before_write_notifiers, +req); +case BDRV_TRACKED_TRUNCATE: +assert(child->perm & BLK_PERM_RESIZE); +return 0; +default: +abort(); +} } static inline void coroutine_fn @@ -1608,8 +1618,9 @@ bdrv_co_write_req_finish(BdrvChild *child, int64_t offset, uint64_t bytes, * beyond EOF cannot expand the image anyway. * */ if (ret == 0 && -end_sector > bs->total_sectors && -req->type != BDRV_TRACKED_DISCARD) { +(req->type == BDRV_TRACKED_TRUNCATE || + end_sector > bs->total_sectors) && +req->type != BDRV_TRACKED_DISCARD) { bs->total_sectors = end_sector; bdrv_parent_cb_resize(bs); bdrv_dirty_bitmap_truncate(bs, end_sector << BDRV_SECTOR_BITS); @@ -3088,7 +3099,6 @@ int coroutine_fn bdrv_co_truncate(BdrvChild *child, int64_t offset, int64_t old_size, new_bytes; int ret; -assert(child->perm & BLK_PERM_RESIZE); /* if bs->drv == NULL, bs is closed, so there's nothing to do here */ if (!drv) { @@ -3121,7 +3131,18 @@ int coroutine_fn bdrv_co_truncate(BdrvChild *child, int64_t offset, * concurrently or they might be overwritten by preallocation. */ if (new_bytes) { mark_request_serialising(, 1); -wait_serialising_requests(); +} +if (bs->read_only) { +error_setg(errp, "Image is read-only"); +ret = -EACCES; +goto out; +} +ret = bdrv_co_write_req_prepare(child, offset - new_bytes, new_bytes, , +0); +if (ret < 0) { +error_setg_errno(errp, -ret, + "Failed to prepare request for truncation"); +goto out; } if (!drv->bdrv_co_truncate) { @@ -3133,13 +3154,6 @@ int coroutine_fn bdrv_co_truncate(BdrvChild *child, int64_t offset, ret = -ENOTSUP; goto out; } -if (bs->read_only) { -error_setg(errp, "Image is read-only"); -ret = -EACCES; -goto out; -} - -assert(!(bs->open_flags & BDRV_O_INACTIVE)); ret = drv->bdrv_co_truncate(bs, offset, prealloc, errp); if (ret < 0) { @@ -3151,9 +3165,10 @@ int coroutine_fn bdrv_co_truncate(BdrvChild *child, int64_t offset, } else { offset = bs->total_sectors * BDRV_SECTOR_SIZE; } -bdrv_dirty_bitmap_truncate(bs, offset); -bdrv_parent_cb_resize(bs); -atomic_inc(>write_gen); +/* It's possible that truncation succeeded but refresh_total_sectors + * failed, but the latter doesn't affect how we should finish the request. + * Pass 0 as the last parameter so that dirty bitmaps etc. are handled. */ +bdrv_co_write_req_finish(child, offset - new_bytes, new_bytes, , 0); out: tracked_request_end(); -- 2.17.1
[Qemu-devel] [PATCH v3 08/10] block: Use common req handling in copy offloading
This brings the request handling logic inline with write and discard, fixing write_gen, resize_cb, dirty bitmaps and image size refreshing. The last of these issues broke iotest case 222, which is now fixed. Signed-off-by: Fam Zheng --- block/io.c | 22 +- 1 file changed, 9 insertions(+), 13 deletions(-) diff --git a/block/io.c b/block/io.c index cc426bab2e..9687da1ce9 100644 --- a/block/io.c +++ b/block/io.c @@ -3007,20 +3007,16 @@ static int coroutine_fn bdrv_co_copy_range_internal( bdrv_inc_in_flight(dst->bs); tracked_request_begin(, dst->bs, dst_offset, bytes, BDRV_TRACKED_WRITE); - -/* BDRV_REQ_NO_SERIALISING is only for read operation */ -assert(!(write_flags & BDRV_REQ_NO_SERIALISING)); -if (write_flags & BDRV_REQ_SERIALISING) { -mark_request_serialising(, bdrv_get_cluster_size(dst->bs)); +ret = bdrv_co_write_req_prepare(dst, dst_offset, bytes, , +write_flags); +if (!ret) { +ret = dst->bs->drv->bdrv_co_copy_range_to(dst->bs, + src, src_offset, + dst, dst_offset, + bytes, + read_flags, write_flags); } -wait_serialising_requests(); - -ret = dst->bs->drv->bdrv_co_copy_range_to(dst->bs, - src, src_offset, - dst, dst_offset, - bytes, - read_flags, write_flags); - +bdrv_co_write_req_finish(dst, dst_offset, bytes, , ret); tracked_request_end(); bdrv_dec_in_flight(dst->bs); } -- 2.17.1
[Qemu-devel] [PATCH v3 07/10] block: Use common req handling for discard
Reuse the new bdrv_co_write_req_prepare/finish helpers. The variation here is that discard requests don't affect bs->wr_highest_offset, and it cannot extend the image. Signed-off-by: Fam Zheng --- block/io.c | 32 ++-- 1 file changed, 22 insertions(+), 10 deletions(-) diff --git a/block/io.c b/block/io.c index 10a475302a..cc426bab2e 100644 --- a/block/io.c +++ b/block/io.c @@ -1601,15 +1601,31 @@ bdrv_co_write_req_finish(BdrvChild *child, int64_t offset, uint64_t bytes, atomic_inc(>write_gen); -stat64_max(>wr_highest_offset, offset + bytes); - +/* Discard cannot extend the image, but in error handling cases, such as + * when reverting a qcow2 cluster allocation, the discarded range can pass + * the end of image file, so we cannot assert about BDRV_TRACKED_DISCARD + * here. Instead, just skip it, since semantically a discard request + * beyond EOF cannot expand the image anyway. + * */ if (ret == 0 && -end_sector > bs->total_sectors) { +end_sector > bs->total_sectors && +req->type != BDRV_TRACKED_DISCARD) { bs->total_sectors = end_sector; bdrv_parent_cb_resize(bs); bdrv_dirty_bitmap_truncate(bs, end_sector << BDRV_SECTOR_BITS); } -bdrv_set_dirty(bs, offset, bytes); +if (req->bytes) { +switch (req->type) { +case BDRV_TRACKED_WRITE: +stat64_max(>wr_highest_offset, offset + bytes); +/* fall through, to set dirty bits */ +case BDRV_TRACKED_DISCARD: +bdrv_set_dirty(bs, offset, bytes); +break; +default: +break; +} +} } /* @@ -2670,10 +2686,7 @@ int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, int64_t offset, int bytes) ret = bdrv_check_byte_request(bs, offset, bytes); if (ret < 0) { return ret; -} else if (bs->read_only) { -return -EPERM; } -assert(!(bs->open_flags & BDRV_O_INACTIVE)); /* Do nothing if disabled. */ if (!(bs->open_flags & BDRV_O_UNMAP)) { @@ -2697,7 +2710,7 @@ int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, int64_t offset, int bytes) bdrv_inc_in_flight(bs); tracked_request_begin(, bs, offset, bytes, BDRV_TRACKED_DISCARD); -ret = notifier_with_return_list_notify(>before_write_notifiers, ); +ret = bdrv_co_write_req_prepare(child, offset, bytes, , 0); if (ret < 0) { goto out; } @@ -2763,8 +2776,7 @@ int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, int64_t offset, int bytes) } ret = 0; out: -atomic_inc(>write_gen); -bdrv_set_dirty(bs, req.offset, req.bytes); +bdrv_co_write_req_finish(child, req.offset, req.bytes, , ret); tracked_request_end(); bdrv_dec_in_flight(bs); return ret; -- 2.17.1
[Qemu-devel] [PATCH v3 09/10] block: Fix bdrv_co_truncate overlap check
If we are growing the image and potentially using preallocation for the new area, we need to make sure that no write requests are made to the "preallocated" area which is [@old_size, @offset), not [@offset, offset * 2 - @old_size). Signed-off-by: Fam Zheng Reviewed-by: Eric Blake --- block/io.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/block/io.c b/block/io.c index 9687da1ce9..e3e2d5286d 100644 --- a/block/io.c +++ b/block/io.c @@ -3113,7 +3113,8 @@ int coroutine_fn bdrv_co_truncate(BdrvChild *child, int64_t offset, } bdrv_inc_in_flight(bs); -tracked_request_begin(, bs, offset, new_bytes, BDRV_TRACKED_TRUNCATE); +tracked_request_begin(, bs, offset - new_bytes, new_bytes, + BDRV_TRACKED_TRUNCATE); /* If we are growing the image and potentially using preallocation for the * new area, we need to make sure that no write requests are made to it -- 2.17.1
[Qemu-devel] [PATCH v3 05/10] block: Extract common write req handling
As a mechanical refactoring patch, this is the first step towards unified and more correct write code paths. This is helpful because multiple BlockDriverState fields need to be updated after modifying image data, and it's hard to maintain in multiple places such as copy offload, discard and truncate. Suggested-by: Kevin Wolf Signed-off-by: Fam Zheng --- block/io.c | 91 ++ 1 file changed, 57 insertions(+), 34 deletions(-) diff --git a/block/io.c b/block/io.c index c56e329bb5..960e1492d0 100644 --- a/block/io.c +++ b/block/io.c @@ -1553,6 +1553,61 @@ fail: return ret; } +static inline int coroutine_fn +bdrv_co_write_req_prepare(BdrvChild *child, int64_t offset, uint64_t bytes, + BdrvTrackedRequest *req, int flags) +{ +BlockDriverState *bs = child->bs; +bool waited; +int64_t end_sector = DIV_ROUND_UP(offset + bytes, BDRV_SECTOR_SIZE); + +if (bs->read_only) { +return -EPERM; +} + +/* BDRV_REQ_NO_SERIALISING is only for read operation */ +assert(!(flags & BDRV_REQ_NO_SERIALISING)); +assert(!(bs->open_flags & BDRV_O_INACTIVE)); +assert((bs->open_flags & BDRV_O_NO_IO) == 0); +assert(!(flags & ~BDRV_REQ_MASK)); + +if (flags & BDRV_REQ_SERIALISING) { +mark_request_serialising(req, bdrv_get_cluster_size(bs)); +} + +waited = wait_serialising_requests(req); + +assert(!waited || !req->serialising || + is_request_serialising_and_aligned(req)); +assert(req->overlap_offset <= offset); +assert(offset + bytes <= req->overlap_offset + req->overlap_bytes); + +if (flags & BDRV_REQ_WRITE_UNCHANGED) { +assert(child->perm & (BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE)); +} else { +assert(child->perm & BLK_PERM_WRITE); +} +assert(end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE); +return notifier_with_return_list_notify(>before_write_notifiers, req); +} + +static inline void coroutine_fn +bdrv_co_write_req_finish(BdrvChild *child, int64_t offset, uint64_t bytes, + BdrvTrackedRequest *req, int ret) +{ +int64_t end_sector = DIV_ROUND_UP(offset + bytes, BDRV_SECTOR_SIZE); +BlockDriverState *bs = child->bs; + +atomic_inc(>write_gen); +bdrv_set_dirty(bs, offset, bytes); + +stat64_max(>wr_highest_offset, offset + bytes); + +if (ret == 0) { +bs->total_sectors = MAX(bs->total_sectors, end_sector); +} +} + /* * Forwards an already correctly aligned write request to the BlockDriver, * after possibly fragmenting it. @@ -1563,10 +1618,8 @@ static int coroutine_fn bdrv_aligned_pwritev(BdrvChild *child, { BlockDriverState *bs = child->bs; BlockDriver *drv = bs->drv; -bool waited; int ret; -int64_t end_sector = DIV_ROUND_UP(offset + bytes, BDRV_SECTOR_SIZE); uint64_t bytes_remaining = bytes; int max_transfer; @@ -1582,31 +1635,10 @@ static int coroutine_fn bdrv_aligned_pwritev(BdrvChild *child, assert((offset & (align - 1)) == 0); assert((bytes & (align - 1)) == 0); assert(!qiov || bytes == qiov->size); -assert((bs->open_flags & BDRV_O_NO_IO) == 0); -assert(!(flags & ~BDRV_REQ_MASK)); max_transfer = QEMU_ALIGN_DOWN(MIN_NON_ZERO(bs->bl.max_transfer, INT_MAX), align); -/* BDRV_REQ_NO_SERIALISING is only for read operation */ -assert(!(flags & BDRV_REQ_NO_SERIALISING)); - -if (flags & BDRV_REQ_SERIALISING) { -mark_request_serialising(req, bdrv_get_cluster_size(bs)); -} - -waited = wait_serialising_requests(req); -assert(!waited || !req->serialising || - is_request_serialising_and_aligned(req)); -assert(req->overlap_offset <= offset); -assert(offset + bytes <= req->overlap_offset + req->overlap_bytes); -if (flags & BDRV_REQ_WRITE_UNCHANGED) { -assert(child->perm & (BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE)); -} else { -assert(child->perm & BLK_PERM_WRITE); -} -assert(end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE); - -ret = notifier_with_return_list_notify(>before_write_notifiers, req); +ret = bdrv_co_write_req_prepare(child, offset, bytes, req, flags); if (!ret && bs->detect_zeroes != BLOCKDEV_DETECT_ZEROES_OPTIONS_OFF && !(flags & BDRV_REQ_ZERO_WRITE) && drv->bdrv_co_pwrite_zeroes && @@ -1655,15 +1687,10 @@ static int coroutine_fn bdrv_aligned_pwritev(BdrvChild *child, } bdrv_debug_event(bs, BLKDBG_PWRITEV_DONE); -atomic_inc(>write_gen); -bdrv_set_dirty(bs, offset, bytes); - -stat64_max(>wr_highest_offset, offset + bytes); - if (ret >= 0) { -bs->tota
[Qemu-devel] [PATCH v3 03/10] block: Use BdrvChild to discard
Other I/O functions are already using a BdrvChild pointer in the API, so make discard do the same. It makes it possible to initiate the same permission checks before doing I/O, and much easier to share the helper functions for this, which will be added and used by write, truncate and copy range paths. Signed-off-by: Fam Zheng --- block/blkdebug.c | 2 +- block/blklogwrites.c | 2 +- block/blkreplay.c | 2 +- block/block-backend.c | 2 +- block/copy-on-read.c | 2 +- block/io.c | 18 +- block/mirror.c | 2 +- block/qcow2-refcount.c | 2 +- block/raw-format.c | 2 +- block/throttle.c | 2 +- include/block/block.h | 4 ++-- 11 files changed, 20 insertions(+), 20 deletions(-) diff --git a/block/blkdebug.c b/block/blkdebug.c index 526af2a808..0457bf5b66 100644 --- a/block/blkdebug.c +++ b/block/blkdebug.c @@ -625,7 +625,7 @@ static int coroutine_fn blkdebug_co_pdiscard(BlockDriverState *bs, return err; } -return bdrv_co_pdiscard(bs->file->bs, offset, bytes); +return bdrv_co_pdiscard(bs->file, offset, bytes); } static int coroutine_fn blkdebug_co_block_status(BlockDriverState *bs, diff --git a/block/blklogwrites.c b/block/blklogwrites.c index 63bf6b34a9..34e98e4ff5 100644 --- a/block/blklogwrites.c +++ b/block/blklogwrites.c @@ -483,7 +483,7 @@ static int coroutine_fn blk_log_writes_co_do_file_flush(BlkLogWritesFileReq *fr) static int coroutine_fn blk_log_writes_co_do_file_pdiscard(BlkLogWritesFileReq *fr) { -return bdrv_co_pdiscard(fr->bs->file->bs, fr->offset, fr->bytes); +return bdrv_co_pdiscard(fr->bs->file, fr->offset, fr->bytes); } static int coroutine_fn diff --git a/block/blkreplay.c b/block/blkreplay.c index b016dbeee7..766150ade6 100755 --- a/block/blkreplay.c +++ b/block/blkreplay.c @@ -113,7 +113,7 @@ static int coroutine_fn blkreplay_co_pdiscard(BlockDriverState *bs, int64_t offset, int bytes) { uint64_t reqid = blkreplay_next_id(); -int ret = bdrv_co_pdiscard(bs->file->bs, offset, bytes); +int ret = bdrv_co_pdiscard(bs->file, offset, bytes); block_request_create(reqid, bs, qemu_coroutine_self()); qemu_coroutine_yield(); diff --git a/block/block-backend.c b/block/block-backend.c index ac8c3e0b1c..fa120630be 100644 --- a/block/block-backend.c +++ b/block/block-backend.c @@ -1559,7 +1559,7 @@ int blk_co_pdiscard(BlockBackend *blk, int64_t offset, int bytes) return ret; } -return bdrv_co_pdiscard(blk_bs(blk), offset, bytes); +return bdrv_co_pdiscard(blk->root, offset, bytes); } int blk_co_flush(BlockBackend *blk) diff --git a/block/copy-on-read.c b/block/copy-on-read.c index 1dcdaeed69..a19164f9eb 100644 --- a/block/copy-on-read.c +++ b/block/copy-on-read.c @@ -116,7 +116,7 @@ static int coroutine_fn cor_co_pwrite_zeroes(BlockDriverState *bs, static int coroutine_fn cor_co_pdiscard(BlockDriverState *bs, int64_t offset, int bytes) { -return bdrv_co_pdiscard(bs->file->bs, offset, bytes); +return bdrv_co_pdiscard(bs->file, offset, bytes); } diff --git a/block/io.c b/block/io.c index 600060c936..77f35b1013 100644 --- a/block/io.c +++ b/block/io.c @@ -2611,7 +2611,7 @@ int bdrv_flush(BlockDriverState *bs) } typedef struct DiscardCo { -BlockDriverState *bs; +BdrvChild *child; int64_t offset; int bytes; int ret; @@ -2620,17 +2620,17 @@ static void coroutine_fn bdrv_pdiscard_co_entry(void *opaque) { DiscardCo *rwco = opaque; -rwco->ret = bdrv_co_pdiscard(rwco->bs, rwco->offset, rwco->bytes); +rwco->ret = bdrv_co_pdiscard(rwco->child, rwco->offset, rwco->bytes); } -int coroutine_fn bdrv_co_pdiscard(BlockDriverState *bs, int64_t offset, - int bytes) +int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, int64_t offset, int bytes) { BdrvTrackedRequest req; int max_pdiscard, ret; int head, tail, align; +BlockDriverState *bs = child->bs; -if (!bs->drv) { +if (!bs || !bs->drv) { return -ENOMEDIUM; } @@ -2741,11 +2741,11 @@ out: return ret; } -int bdrv_pdiscard(BlockDriverState *bs, int64_t offset, int bytes) +int bdrv_pdiscard(BdrvChild *child, int64_t offset, int bytes) { Coroutine *co; DiscardCo rwco = { -.bs = bs, +.child = child, .offset = offset, .bytes = bytes, .ret = NOT_DONE, @@ -2756,8 +2756,8 @@ int bdrv_pdiscard(BlockDriverState *bs, int64_t offset, int bytes) bdrv_pdiscard_co_entry(); } else { co = qemu_coroutine_create(bdrv_pdiscard_co_entry, ); -bdrv_coroutine_enter(bs, co); -BDRV_POLL_WHILE(bs, rwco.ret == NOT_DONE); +bdrv_coroutine_enter(child->bs, co); +BDRV_POLL_WHILE(child->bs, rwco.ret
[Qemu-devel] [PATCH v3 04/10] block: Use uint64_t for BdrvTrackedRequest byte fields
This matches the types used for bytes in the rest parts of block layer. In the case of bdrv_co_truncate, new_bytes can be the image size which probably doesn't fit in a 32 bit int. Signed-off-by: Fam Zheng --- block/io.c| 8 +--- include/block/block_int.h | 4 ++-- 2 files changed, 7 insertions(+), 5 deletions(-) diff --git a/block/io.c b/block/io.c index 77f35b1013..c56e329bb5 100644 --- a/block/io.c +++ b/block/io.c @@ -587,9 +587,11 @@ static void tracked_request_end(BdrvTrackedRequest *req) static void tracked_request_begin(BdrvTrackedRequest *req, BlockDriverState *bs, int64_t offset, - unsigned int bytes, + uint64_t bytes, enum BdrvTrackedRequestType type) { +assert(bytes <= INT64_MAX && offset <= INT64_MAX - bytes); + *req = (BdrvTrackedRequest){ .bs = bs, .offset = offset, @@ -611,7 +613,7 @@ static void tracked_request_begin(BdrvTrackedRequest *req, static void mark_request_serialising(BdrvTrackedRequest *req, uint64_t align) { int64_t overlap_offset = req->offset & ~(align - 1); -unsigned int overlap_bytes = ROUND_UP(req->offset + req->bytes, align) +uint64_t overlap_bytes = ROUND_UP(req->offset + req->bytes, align) - overlap_offset; if (!req->serialising) { @@ -667,7 +669,7 @@ static int bdrv_get_cluster_size(BlockDriverState *bs) } static bool tracked_request_overlaps(BdrvTrackedRequest *req, - int64_t offset, unsigned int bytes) + int64_t offset, uint64_t bytes) { /* */ if (offset >= req->overlap_offset + req->overlap_bytes) { diff --git a/include/block/block_int.h b/include/block/block_int.h index 119b6e4ea5..90a33e4b0e 100644 --- a/include/block/block_int.h +++ b/include/block/block_int.h @@ -69,12 +69,12 @@ enum BdrvTrackedRequestType { typedef struct BdrvTrackedRequest { BlockDriverState *bs; int64_t offset; -unsigned int bytes; +uint64_t bytes; enum BdrvTrackedRequestType type; bool serialising; int64_t overlap_offset; -unsigned int overlap_bytes; +uint64_t overlap_bytes; QLIST_ENTRY(BdrvTrackedRequest) list; Coroutine *co; /* owner, used for deadlock detection */ -- 2.17.1
[Qemu-devel] [PATCH v3 06/10] block: Fix handling of image enlarging write
Two problems exist when a write request that enlarges the image (i.e. write beyond EOF) finishes: 1) parent is not notified about size change; 2) dirty bitmap is not resized although we try to set the dirty bits; Fix them just like how bdrv_co_truncate works. Reported-by: Kevin Wolf Signed-off-by: Fam Zheng --- block/io.c | 10 +++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/block/io.c b/block/io.c index 960e1492d0..10a475302a 100644 --- a/block/io.c +++ b/block/io.c @@ -40,6 +40,7 @@ static AioWait drain_all_aio_wait; +static void bdrv_parent_cb_resize(BlockDriverState *bs); static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs, int64_t offset, int bytes, BdrvRequestFlags flags); @@ -1599,13 +1600,16 @@ bdrv_co_write_req_finish(BdrvChild *child, int64_t offset, uint64_t bytes, BlockDriverState *bs = child->bs; atomic_inc(>write_gen); -bdrv_set_dirty(bs, offset, bytes); stat64_max(>wr_highest_offset, offset + bytes); -if (ret == 0) { -bs->total_sectors = MAX(bs->total_sectors, end_sector); +if (ret == 0 && +end_sector > bs->total_sectors) { +bs->total_sectors = end_sector; +bdrv_parent_cb_resize(bs); +bdrv_dirty_bitmap_truncate(bs, end_sector << BDRV_SECTOR_BITS); } +bdrv_set_dirty(bs, offset, bytes); } /* -- 2.17.1
[Qemu-devel] [PATCH v3 02/10] block: Add copy offloading trace points
A few trace points that can help reveal what is happening in a copy offloading I/O path. Signed-off-by: Fam Zheng --- block/file-posix.c | 2 ++ block/io.c | 4 block/iscsi.c | 3 +++ block/trace-events | 6 ++ 4 files changed, 15 insertions(+) diff --git a/block/file-posix.c b/block/file-posix.c index 1185c7c5cc..09f6b938f6 100644 --- a/block/file-posix.c +++ b/block/file-posix.c @@ -1488,6 +1488,8 @@ static ssize_t handle_aiocb_copy_range(RawPosixAIOData *aiocb) ssize_t ret = copy_file_range(aiocb->aio_fildes, _off, aiocb->aio_fd2, _off, bytes, 0); +trace_file_copy_file_range(aiocb->bs, aiocb->aio_fildes, in_off, + aiocb->aio_fd2, out_off, bytes, 0, ret); if (ret == 0) { /* No progress (e.g. when beyond EOF), let the caller fall back to * buffer I/O. */ diff --git a/block/io.c b/block/io.c index 102cb0e1ba..600060c936 100644 --- a/block/io.c +++ b/block/io.c @@ -2997,6 +2997,8 @@ int coroutine_fn bdrv_co_copy_range_from(BdrvChild *src, uint64_t src_offset, BdrvRequestFlags read_flags, BdrvRequestFlags write_flags) { +trace_bdrv_co_copy_range_from(src, src_offset, dst, dst_offset, bytes, + read_flags, write_flags); return bdrv_co_copy_range_internal(src, src_offset, dst, dst_offset, bytes, read_flags, write_flags, true); } @@ -3011,6 +3013,8 @@ int coroutine_fn bdrv_co_copy_range_to(BdrvChild *src, uint64_t src_offset, BdrvRequestFlags read_flags, BdrvRequestFlags write_flags) { +trace_bdrv_co_copy_range_to(src, src_offset, dst, dst_offset, bytes, +read_flags, write_flags); return bdrv_co_copy_range_internal(src, src_offset, dst, dst_offset, bytes, read_flags, write_flags, false); } diff --git a/block/iscsi.c b/block/iscsi.c index 38b917a1e5..bb69faf34a 100644 --- a/block/iscsi.c +++ b/block/iscsi.c @@ -44,6 +44,7 @@ #include "qapi/qmp/qstring.h" #include "crypto/secret.h" #include "scsi/utils.h" +#include "trace.h" /* Conflict between scsi/utils.h and libiscsi! :( */ #define SCSI_XFER_NONE ISCSI_XFER_NONE @@ -2399,6 +2400,8 @@ retry: } out_unlock: + +trace_iscsi_xcopy(src_lun, src_offset, dst_lun, dst_offset, bytes, r); g_free(iscsi_task.task); qemu_mutex_unlock(_lun->mutex); g_free(iscsi_task.err_str); diff --git a/block/trace-events b/block/trace-events index 854d5525af..3e8c47bb24 100644 --- a/block/trace-events +++ b/block/trace-events @@ -15,6 +15,8 @@ bdrv_co_preadv(void *bs, int64_t offset, int64_t nbytes, unsigned int flags) "bs bdrv_co_pwritev(void *bs, int64_t offset, int64_t nbytes, unsigned int flags) "bs %p offset %"PRId64" nbytes %"PRId64" flags 0x%x" bdrv_co_pwrite_zeroes(void *bs, int64_t offset, int count, int flags) "bs %p offset %"PRId64" count %d flags 0x%x" bdrv_co_do_copy_on_readv(void *bs, int64_t offset, unsigned int bytes, int64_t cluster_offset, int64_t cluster_bytes) "bs %p offset %"PRId64" bytes %u cluster_offset %"PRId64" cluster_bytes %"PRId64 +bdrv_co_copy_range_from(void *src, uint64_t src_offset, void *dst, uint64_t dst_offset, uint64_t bytes, int read_flags, int write_flags) "src %p offset %"PRIu64" dst %p offset %"PRIu64" bytes %"PRIu64" rw flags 0x%x 0x%x" +bdrv_co_copy_range_to(void *src, uint64_t src_offset, void *dst, uint64_t dst_offset, uint64_t bytes, int read_flags, int write_flags) "src %p offset %"PRIu64" dst %p offset %"PRIu64" bytes %"PRIu64" rw flags 0x%x 0x%x" # block/stream.c stream_one_iteration(void *s, int64_t offset, uint64_t bytes, int is_allocated) "s %p offset %" PRId64 " bytes %" PRIu64 " is_allocated %d" @@ -57,6 +59,7 @@ qmp_block_stream(void *bs, void *job) "bs %p job %p" # block/file-posix.c file_paio_submit_co(int64_t offset, int count, int type) "offset %"PRId64" count %d type %d" file_paio_submit(void *acb, void *opaque, int64_t offset, int count, int type) "acb %p opaque %p offset %"PRId64" count %d type %d" +file_copy_file_range(void *bs, int src, int64_t src_off, int dst, int64_t dst_off, int64_t bytes, int flags, int64_t ret) "bs %p src_fd %d offset %"PRIu64" dst_fd %d offset %"PRIu64" bytes %"PRIu64" flags %d ret %"PRId64 # block/qcow2.c qcow2_writev_start_req(void *co, int64_t offset, int bytes) "co %p offset 0x%&
[Qemu-devel] [PATCH v3 01/10] block: Prefix file driver trace points with "file_"
With in one module, trace points usually have a common prefix named after the module name. paio_submit and paio_submit_co are the only two trace points so far in the two file protocol drivers. As we are adding more, having a common prefix here is better so that trace points can be enabled with a glob. Rename them. Suggested-by: Kevin Wolf Signed-off-by: Fam Zheng --- block/file-posix.c | 2 +- block/file-win32.c | 2 +- block/trace-events | 4 ++-- 3 files changed, 4 insertions(+), 4 deletions(-) diff --git a/block/file-posix.c b/block/file-posix.c index 4fec8cb53c..1185c7c5cc 100644 --- a/block/file-posix.c +++ b/block/file-posix.c @@ -1743,7 +1743,7 @@ static int paio_submit_co_full(BlockDriverState *bs, int fd, assert(qiov->size == bytes); } -trace_paio_submit_co(offset, bytes, type); +trace_file_paio_submit_co(offset, bytes, type); pool = aio_get_thread_pool(bdrv_get_aio_context(bs)); return thread_pool_submit_co(pool, aio_worker, acb); } diff --git a/block/file-win32.c b/block/file-win32.c index 0411fe80fd..f1e2187f3b 100644 --- a/block/file-win32.c +++ b/block/file-win32.c @@ -162,7 +162,7 @@ static BlockAIOCB *paio_submit(BlockDriverState *bs, HANDLE hfile, acb->aio_nbytes = count; acb->aio_offset = offset; -trace_paio_submit(acb, opaque, offset, count, type); +trace_file_paio_submit(acb, opaque, offset, count, type); pool = aio_get_thread_pool(bdrv_get_aio_context(bs)); return thread_pool_submit_aio(pool, aio_worker, acb, cb, opaque); } diff --git a/block/trace-events b/block/trace-events index c35287b48a..854d5525af 100644 --- a/block/trace-events +++ b/block/trace-events @@ -55,8 +55,8 @@ qmp_block_stream(void *bs, void *job) "bs %p job %p" # block/file-win32.c # block/file-posix.c -paio_submit_co(int64_t offset, int count, int type) "offset %"PRId64" count %d type %d" -paio_submit(void *acb, void *opaque, int64_t offset, int count, int type) "acb %p opaque %p offset %"PRId64" count %d type %d" +file_paio_submit_co(int64_t offset, int count, int type) "offset %"PRId64" count %d type %d" +file_paio_submit(void *acb, void *opaque, int64_t offset, int count, int type) "acb %p opaque %p offset %"PRId64" count %d type %d" # block/qcow2.c qcow2_writev_start_req(void *co, int64_t offset, int bytes) "co %p offset 0x%" PRIx64 " bytes %d" -- 2.17.1
[Qemu-devel] [PATCH v3 00/10] block: Fix dst reading after tail copy offloading
Based-on: 20180709163719.87107-1-vsement...@virtuozzo.com v3: - Rebase onto Vladmir's: [PATCH v5 0/4] fix image fleecing on top of master, which has blklogwrites to be converted to BdrvChild for the bdrv_co_pdiscard() parameter. [Kevin] - Add file_ prefix to file protocol trace points. [Kevin] - Assert that BdrvTrackedRequest bytes and offset don't overflow INT64_MAX. [Kevin] - Don't misuse req->offset/req->bytes in bdrv_co_write_req_prepare/finish. [Kevin] - Fix stat64_max. [Kevin] - Keep lines within 80 columns. [Kevin] - Grammar fix in commit message and rev-by. [Eric] Qcow2 allocates new clusters after the end of the file. If it is the destinaton of copy offloading, we must adjust dst->bs->total_sectors. Otherwise, further reads will drop to the "beyond EOF" code path and return zeroes, which problem is caught by iotests 222. Follow the logic in the normal write code and update bs->total_sectors after I/O is done. While at it, add a few convenient trace points to aid future debug experiences in the topic. Fam Zheng (10): block: Prefix file driver trace points with "file_" block: Add copy offloading trace points block: Use BdrvChild to discard block: Use uint64_t for BdrvTrackedRequest byte fields block: Extract common write req handling block: Fix handling of image enlarging write block: Use common req handling for discard block: Use common req handling in copy offloading block: Fix bdrv_co_truncate overlap check block: Use common write req handling in truncate block/blkdebug.c | 2 +- block/blklogwrites.c | 2 +- block/blkreplay.c | 2 +- block/block-backend.c | 2 +- block/copy-on-read.c | 2 +- block/file-posix.c| 4 +- block/file-win32.c| 2 +- block/io.c| 213 -- block/iscsi.c | 3 + block/mirror.c| 2 +- block/qcow2-refcount.c| 2 +- block/raw-format.c| 2 +- block/throttle.c | 2 +- block/trace-events| 10 +- include/block/block.h | 4 +- include/block/block_int.h | 4 +- 16 files changed, 163 insertions(+), 95 deletions(-) -- 2.17.1
Re: [Qemu-devel] [PATCH v2 8/9] block: Fix bdrv_co_truncate overlap check
On Fri, 07/06 17:09, Eric Blake wrote: > On 07/05/2018 02:37 AM, Fam Zheng wrote: > > If we are growing the image and potentially using preallocation for the > > new area, we need to make sure that no write requests are made to the > > "preallocated" area which [@old_size, @offset), not [@offset, offset * 2 > > s/which/which is/ > > > - @old_size). > > > > Signed-off-by: Fam Zheng > > --- > > block/io.c | 3 ++- > > 1 file changed, 2 insertions(+), 1 deletion(-) > > Reviewed-by: Eric Blake > > > > > diff --git a/block/io.c b/block/io.c > > index d07849fa96..ed18eb0ca3 100644 > > --- a/block/io.c > > +++ b/block/io.c > > @@ -3070,7 +3070,8 @@ int coroutine_fn bdrv_co_truncate(BdrvChild *child, > > int64_t offset, > > } > > bdrv_inc_in_flight(bs); > > -tracked_request_begin(, bs, offset, new_bytes, > > BDRV_TRACKED_TRUNCATE); > > +tracked_request_begin(, bs, offset - new_bytes, new_bytes, > > + BDRV_TRACKED_TRUNCATE); > > Is it any more legible to do s/offset - new_bytes/old_size/, since those are > equivalent? No they are not. offset - new_bytes is either old_size (if expanding), or smaller than old_size (if shrinking). Fam > > > /* If we are growing the image and potentially using preallocation > > for the > >* new area, we need to make sure that no write requests are made to > > it > > > > -- > Eric Blake, Principal Software Engineer > Red Hat, Inc. +1-919-301-3266 > Virtualization: qemu.org | libvirt.org
Re: [Qemu-devel] [PATCH v5 4/4] block/backup: fix fleecing scheme: use serialized writes
job->target, start, nbytes, > -is_write_notifier ? BDRV_REQ_NO_SERIALISING : 0, > 0); > +read_flags, write_flags); > if (ret < 0) { > trace_backup_do_cow_copy_range_fail(job, start, ret); > hbitmap_set(job->copy_bitmap, start / job->cluster_size, > @@ -701,6 +706,9 @@ BlockJob *backup_job_create(const char *job_id, > BlockDriverState *bs, > sync_bitmap : NULL; > job->compress = compress; > > +/* Detect image-fleecing (and similar) schemes */ > +job->serialize_target_writes = bdrv_chain_contains(target, bs); > + > /* If there is no backing file on the target, we cannot rely on COW if > our > * backup cluster size is smaller than the target cluster size. Even for > * targets with a backing file, try to avoid COW if possible. */ I always thought mirror is by far the most complicated job, but now backup is catching up quickly! Reviewed-by: Fam Zheng
Re: [Qemu-devel] [PATCH v5 3/4] block: add BDRV_REQ_SERIALISING flag
ad_flags & BDRV_REQ_SERIALISING)); > if (!(read_flags & BDRV_REQ_NO_SERIALISING)) { > wait_serialising_requests(); > } > @@ -2948,6 +2969,9 @@ static int coroutine_fn bdrv_co_copy_range_internal( > > /* BDRV_REQ_NO_SERIALISING is only for read operation */ > assert(!(write_flags & BDRV_REQ_NO_SERIALISING)); > +if (write_flags & BDRV_REQ_SERIALISING) { > +mark_request_serialising(, bdrv_get_cluster_size(dst->bs)); > +} > wait_serialising_requests(); > > ret = dst->bs->drv->bdrv_co_copy_range_to(dst->bs, > -- > 2.11.1 > Reviewed-by: Fam Zheng Fam
Re: [Qemu-devel] [PATCH v5 2/4] block: split flags in copy_range
On Mon, 07/09 19:37, Vladimir Sementsov-Ogievskiy wrote: > Pass read flags and write flags separately. This is needed to handle > coming BDRV_REQ_NO_SERIALISING clearly in following patches. > > Signed-off-by: Vladimir Sementsov-Ogievskiy Reviewed-by: Fam Zheng
Re: [Qemu-devel] [PATCH v5 1/4] block/io: fix copy_range
On Mon, 07/09 19:37, Vladimir Sementsov-Ogievskiy wrote: > Here two things are fixed: > > 1. Architecture > > On each recursion step, we go to the child of src or dst, only for one > of them. So, it's wrong to create tracked requests for both on each > step. It leads to tracked requests duplication. > > 2. Wait for serializing requests on write path independently of >BDRV_REQ_NO_SERIALISING > > Before commit 9ded4a01149 "backup: Use copy offloading", > BDRV_REQ_NO_SERIALISING was used for only one case: read in > copy-on-write operation during backup. Also, the flag was handled only > on read path (in bdrv_co_preadv and bdrv_aligned_preadv). > > After 9ded4a01149, flag is used for not waiting serializing operations > on backup target (in same case of copy-on-write operation). This > behavior change is unsubstantiated and potentially dangerous, let's > drop it and add additional asserts and documentation. > > Signed-off-by: Vladimir Sementsov-Ogievskiy Reviewed-by: Fam Zheng
Re: [Qemu-devel] [PATCH v4 1/4] block/io: fix copy_range
On Mon, 07/09 17:38, Vladimir Sementsov-Ogievskiy wrote: > 09.07.2018 16:17, Fam Zheng wrote: > > On Mon, 07/09 12:43, Vladimir Sementsov-Ogievskiy wrote: > > > 09.07.2018 04:15, Fam Zheng wrote: > > > > On Fri, 07/06 21:30, Vladimir Sementsov-Ogievskiy wrote: > > > > > Here two things are fixed: > > > > > > > > > > 1. Architecture > > > > > > > > > > On each recursion step, we go to the child of src or dst, only for one > > > > > of them. So, it's wrong to create tracked requests for both on each > > > > > step. It leads to tracked requests duplication. > > > > > > > > > > 2. Wait for serializing requests on write path independently of > > > > > BDRV_REQ_NO_SERIALISING > > > > > > > > > > Before commit 9ded4a01149 "backup: Use copy offloading", > > > > > BDRV_REQ_NO_SERIALISING was used for only one case: read in > > > > > copy-on-write operation during backup. Also, the flag was handled only > > > > > on read path (in bdrv_co_preadv and bdrv_aligned_preadv). > > > > > > > > > > After 9ded4a01149, flag is used for not waiting serializing operations > > > > > on backup target (in same case of copy-on-write operation). This > > > > > behavior change is unsubstantiated and potentially dangerous, let's > > > > > drop it and add additional asserts and documentation. > > > > > > > > > > Signed-off-by: Vladimir Sementsov-Ogievskiy > > > > > --- > > > > >include/block/block.h | 13 +++ > > > > >block/io.c| 103 > > > > > +++--- > > > > >2 files changed, 78 insertions(+), 38 deletions(-) > > > > > > > > > > diff --git a/include/block/block.h b/include/block/block.h > > > > > index e5c7759a0c..a06a4d27de 100644 > > > > > --- a/include/block/block.h > > > > > +++ b/include/block/block.h > > > > > @@ -50,6 +50,19 @@ typedef enum { > > > > > * opened with BDRV_O_UNMAP. > > > > > */ > > > > >BDRV_REQ_MAY_UNMAP = 0x4, > > > > > + > > > > > +/* The BDRV_REQ_NO_SERIALISING means that we don't want to > > > > > + * wait_serialising_requests(), when reading. > > > > > + * > > > > > + * This flag is used for backup copy on write operation, when we > > > > > need to > > > > > + * read old data before write (write notifier triggered). It is > > > > > ok, due to > > > > > + * we already waited for serializing requests in initiative > > > > > write (see > > > > > + * bdrv_aligned_pwritev), and it is necessary for the case when > > > > > initiative > > > > > + * write is serializing itself (we'll dead lock waiting it). > > > > > + * > > > > > + * The described case is the only usage for the flag for now, > > > > > so, it is > > > > > + * supported only for read operation and restricted for write. > > > > > + */ > > > > >BDRV_REQ_NO_SERIALISING = 0x8, > > > > >BDRV_REQ_FUA= 0x10, > > > > >BDRV_REQ_WRITE_COMPRESSED = 0x20, > > > > > diff --git a/block/io.c b/block/io.c > > > > > index 1a2272fad3..621b21c455 100644 > > > > > --- a/block/io.c > > > > > +++ b/block/io.c > > > > > @@ -1572,6 +1572,8 @@ static int coroutine_fn > > > > > bdrv_aligned_pwritev(BdrvChild *child, > > > > >max_transfer = > > > > > QEMU_ALIGN_DOWN(MIN_NON_ZERO(bs->bl.max_transfer, INT_MAX), > > > > > align); > > > > > +/* BDRV_REQ_NO_SERIALISING is only for read operation */ > > > > > +assert(!(flags & BDRV_REQ_NO_SERIALISING)); > > > > >waited = wait_serialising_requests(req); > > > > >assert(!waited || !req->serialising); > > > > >assert(req->overlap_offset <= offset); > > > > > @@ -2888,15 +2890,19 @@ void bdrv_unregister_buf(BlockDriverState > > > > > *bs, void *host) > > > > >} > > > > >} >
Re: [Qemu-devel] [PATCH v4 1/4] block/io: fix copy_range
On Mon, 07/09 12:43, Vladimir Sementsov-Ogievskiy wrote: > 09.07.2018 04:15, Fam Zheng wrote: > > On Fri, 07/06 21:30, Vladimir Sementsov-Ogievskiy wrote: > > > Here two things are fixed: > > > > > > 1. Architecture > > > > > > On each recursion step, we go to the child of src or dst, only for one > > > of them. So, it's wrong to create tracked requests for both on each > > > step. It leads to tracked requests duplication. > > > > > > 2. Wait for serializing requests on write path independently of > > > BDRV_REQ_NO_SERIALISING > > > > > > Before commit 9ded4a01149 "backup: Use copy offloading", > > > BDRV_REQ_NO_SERIALISING was used for only one case: read in > > > copy-on-write operation during backup. Also, the flag was handled only > > > on read path (in bdrv_co_preadv and bdrv_aligned_preadv). > > > > > > After 9ded4a01149, flag is used for not waiting serializing operations > > > on backup target (in same case of copy-on-write operation). This > > > behavior change is unsubstantiated and potentially dangerous, let's > > > drop it and add additional asserts and documentation. > > > > > > Signed-off-by: Vladimir Sementsov-Ogievskiy > > > --- > > > include/block/block.h | 13 +++ > > > block/io.c| 103 > > > +++--- > > > 2 files changed, 78 insertions(+), 38 deletions(-) > > > > > > diff --git a/include/block/block.h b/include/block/block.h > > > index e5c7759a0c..a06a4d27de 100644 > > > --- a/include/block/block.h > > > +++ b/include/block/block.h > > > @@ -50,6 +50,19 @@ typedef enum { > > >* opened with BDRV_O_UNMAP. > > >*/ > > > BDRV_REQ_MAY_UNMAP = 0x4, > > > + > > > +/* The BDRV_REQ_NO_SERIALISING means that we don't want to > > > + * wait_serialising_requests(), when reading. > > > + * > > > + * This flag is used for backup copy on write operation, when we > > > need to > > > + * read old data before write (write notifier triggered). It is ok, > > > due to > > > + * we already waited for serializing requests in initiative write > > > (see > > > + * bdrv_aligned_pwritev), and it is necessary for the case when > > > initiative > > > + * write is serializing itself (we'll dead lock waiting it). > > > + * > > > + * The described case is the only usage for the flag for now, so, it > > > is > > > + * supported only for read operation and restricted for write. > > > + */ > > > BDRV_REQ_NO_SERIALISING = 0x8, > > > BDRV_REQ_FUA= 0x10, > > > BDRV_REQ_WRITE_COMPRESSED = 0x20, > > > diff --git a/block/io.c b/block/io.c > > > index 1a2272fad3..621b21c455 100644 > > > --- a/block/io.c > > > +++ b/block/io.c > > > @@ -1572,6 +1572,8 @@ static int coroutine_fn > > > bdrv_aligned_pwritev(BdrvChild *child, > > > max_transfer = QEMU_ALIGN_DOWN(MIN_NON_ZERO(bs->bl.max_transfer, > > > INT_MAX), > > > align); > > > +/* BDRV_REQ_NO_SERIALISING is only for read operation */ > > > +assert(!(flags & BDRV_REQ_NO_SERIALISING)); > > > waited = wait_serialising_requests(req); > > > assert(!waited || !req->serialising); > > > assert(req->overlap_offset <= offset); > > > @@ -2888,15 +2890,19 @@ void bdrv_unregister_buf(BlockDriverState *bs, > > > void *host) > > > } > > > } > > > -static int coroutine_fn bdrv_co_copy_range_internal(BdrvChild *src, > > > -uint64_t src_offset, > > > -BdrvChild *dst, > > > -uint64_t dst_offset, > > > -uint64_t bytes, > > > -BdrvRequestFlags > > > flags, > > > -bool recurse_src) > > > +/* Common part of bdrv_co_copy_range_from and bdrv_co_copy_range_to. > > > + * > > > + * Return -errno on failure, > > > + *0 if successfully handled by bdrv_co_pwrite_zeroes > >
Re: [Qemu-devel] [PATCH v2 9/9] block: Use common write req handling in truncate
On Fri, 07/06 17:12, Eric Blake wrote: > On 07/05/2018 02:37 AM, Fam Zheng wrote: > > Truncation is the last to convert from open coded req handling to > > reusing helpers. This time the permission check in prepare has to adapt > > to the new caller: it checks a different permission bit, and don't > > did you mean "won't" or "doesn't"? Yeah, "doesn't'. Thanks! Fam
Re: [Qemu-devel] [PATCH v4 1/4] block/io: fix copy_range
On Fri, 07/06 21:30, Vladimir Sementsov-Ogievskiy wrote: > Here two things are fixed: > > 1. Architecture > > On each recursion step, we go to the child of src or dst, only for one > of them. So, it's wrong to create tracked requests for both on each > step. It leads to tracked requests duplication. > > 2. Wait for serializing requests on write path independently of >BDRV_REQ_NO_SERIALISING > > Before commit 9ded4a01149 "backup: Use copy offloading", > BDRV_REQ_NO_SERIALISING was used for only one case: read in > copy-on-write operation during backup. Also, the flag was handled only > on read path (in bdrv_co_preadv and bdrv_aligned_preadv). > > After 9ded4a01149, flag is used for not waiting serializing operations > on backup target (in same case of copy-on-write operation). This > behavior change is unsubstantiated and potentially dangerous, let's > drop it and add additional asserts and documentation. > > Signed-off-by: Vladimir Sementsov-Ogievskiy > --- > include/block/block.h | 13 +++ > block/io.c| 103 > +++--- > 2 files changed, 78 insertions(+), 38 deletions(-) > > diff --git a/include/block/block.h b/include/block/block.h > index e5c7759a0c..a06a4d27de 100644 > --- a/include/block/block.h > +++ b/include/block/block.h > @@ -50,6 +50,19 @@ typedef enum { > * opened with BDRV_O_UNMAP. > */ > BDRV_REQ_MAY_UNMAP = 0x4, > + > +/* The BDRV_REQ_NO_SERIALISING means that we don't want to > + * wait_serialising_requests(), when reading. > + * > + * This flag is used for backup copy on write operation, when we need to > + * read old data before write (write notifier triggered). It is ok, due > to > + * we already waited for serializing requests in initiative write (see > + * bdrv_aligned_pwritev), and it is necessary for the case when > initiative > + * write is serializing itself (we'll dead lock waiting it). > + * > + * The described case is the only usage for the flag for now, so, it is > + * supported only for read operation and restricted for write. > + */ > BDRV_REQ_NO_SERIALISING = 0x8, > BDRV_REQ_FUA= 0x10, > BDRV_REQ_WRITE_COMPRESSED = 0x20, > diff --git a/block/io.c b/block/io.c > index 1a2272fad3..621b21c455 100644 > --- a/block/io.c > +++ b/block/io.c > @@ -1572,6 +1572,8 @@ static int coroutine_fn bdrv_aligned_pwritev(BdrvChild > *child, > max_transfer = QEMU_ALIGN_DOWN(MIN_NON_ZERO(bs->bl.max_transfer, > INT_MAX), > align); > > +/* BDRV_REQ_NO_SERIALISING is only for read operation */ > +assert(!(flags & BDRV_REQ_NO_SERIALISING)); > waited = wait_serialising_requests(req); > assert(!waited || !req->serialising); > assert(req->overlap_offset <= offset); > @@ -2888,15 +2890,19 @@ void bdrv_unregister_buf(BlockDriverState *bs, void > *host) > } > } > > -static int coroutine_fn bdrv_co_copy_range_internal(BdrvChild *src, > -uint64_t src_offset, > -BdrvChild *dst, > -uint64_t dst_offset, > -uint64_t bytes, > -BdrvRequestFlags flags, > -bool recurse_src) > +/* Common part of bdrv_co_copy_range_from and bdrv_co_copy_range_to. > + * > + * Return -errno on failure, > + *0 if successfully handled by bdrv_co_pwrite_zeroes > + *1 to continue copy_range operation > + */ > +static int coroutine_fn bdrv_co_copy_range_check(BdrvChild *src, > + uint64_t src_offset, > + BdrvChild *dst, > + uint64_t dst_offset, > + uint64_t bytes, > + BdrvRequestFlags flags) > { > -BdrvTrackedRequest src_req, dst_req; > int ret; > > if (!dst || !dst->bs) { > @@ -2923,33 +2929,8 @@ static int coroutine_fn > bdrv_co_copy_range_internal(BdrvChild *src, > || src->bs->encrypted || dst->bs->encrypted) { > return -ENOTSUP; > } > -bdrv_inc_in_flight(src->bs); > -bdrv_inc_in_flight(dst->bs); > -tracked_request_begin(_req, src->bs, src_offset, > - bytes, BDRV_TRACKED_READ); > -tracked_request_begin(_req, dst->bs, dst_offset, > - bytes, BDRV_TRACKED_WRITE); > > -if (!(flags & BDRV_REQ_NO_SERIALISING)) { > -wait_serialising_requests(_req); > -wait_serialising_requests(_req); > -} > -if (recurse_src) { > -ret = src->bs->drv->bdrv_co_copy_range_from(src->bs, > -
Re: [Qemu-devel] [PATCH v3 0/4] fix image fleecing
On Thu, 07/05 10:46, Vladimir Sementsov-Ogievskiy wrote: > Hi all. > > This fixes image fleecing scheme for 3.0, details are in 04 patch. Looks like this breaks 'test-replication': http://patchew.org/QEMU/20180705074638.770905-1-vsement...@virtuozzo.com/ test-replication: /stor/work/qemu/block/io.c:725: wait_serialising_requests: Assertion `qemu_coroutine_self() != req->co' failed. GTester: last random seed: R02Se49967625f19c5be9918a2503fb2fff5 test-replication: /stor/work/qemu/block/io.c:725: wait_serialising_requests: Assertion `qemu_coroutine_self() != req->co' failed. GTESTER tests/test-qht-par GTester: last random seed: R02S8fad2bb2daddc51eb51623bb7c86cb67 test-replication: /stor/work/qemu/block/io.c:725: wait_serialising_requests: Assertion `qemu_coroutine_self() != req->co' failed. GTester: last random seed: R02S024718e9b1e2b1facafe254c43e1430b Fam
Re: [Qemu-devel] [PATCH v2 6/9] block: Use common req handling for discard
On Thu, 07/05 14:44, Kevin Wolf wrote: > Am 05.07.2018 um 09:36 hat Fam Zheng geschrieben: > > Reuse the new bdrv_co_write_req_prepare/finish helpers. The variation > > here is that discard requests don't affect bs->wr_highest_offset. > > > > Signed-off-by: Fam Zheng > > --- > > block/io.c | 21 ++--- > > 1 file changed, 14 insertions(+), 7 deletions(-) > > > > diff --git a/block/io.c b/block/io.c > > index f06978dda0..912fcb962a 100644 > > --- a/block/io.c > > +++ b/block/io.c > > @@ -1582,7 +1582,18 @@ bdrv_co_write_req_finish(BdrvChild *child, > > BdrvTrackedRequest *req, int ret) > > bdrv_parent_cb_resize(bs); > > bdrv_dirty_bitmap_truncate(bs, end_sector << BDRV_SECTOR_BITS); > > } > > -bdrv_set_dirty(bs, req->offset, req->bytes); > > +if (req->bytes) { > > +switch (req->type) { > > +case BDRV_TRACKED_WRITE: > > +stat64_max(>wr_highest_offset, req->offset + req->bytes); > > You forgot to remove this line above, so now this one is redundant and > we still execute it always. > > Apart from that, why shouldn't discard be included in > bs->wr_highest_offset? It's an access to an area in the image that must > be present, so it indicates a larger file size, doesn't it? I'm not sure. wr_highest_offset is used for management to allocate disk space. A discard request is on the contrary for releasing disk space. Since guest is allowed to discard unallocated sectors even though it should be nop in the backend, such an operation shouldn't cause a user visible change in @wr_highest_offset in QMP. Fam > > > +/* fall through, to set dirty bits */ > > +case BDRV_TRACKED_DISCARD: > > +bdrv_set_dirty(bs, req->offset, req->bytes); > > +break; > > +default: > > +break; > > +} > > +} > > } > > Kevin
Re: [Qemu-devel] [PATCH v2 1/9] block: Add copy offloading trace points
On Thu, 07/05 13:08, Kevin Wolf wrote: > Am 05.07.2018 um 09:36 hat Fam Zheng geschrieben: > > A few trace points that can help reveal what is happening in a copy > > offloading I/O path. > > > > Signed-off-by: Fam Zheng > > --- > > block/file-posix.c | 2 ++ > > block/io.c | 2 ++ > > block/iscsi.c | 3 +++ > > block/trace-events | 6 ++ > > 4 files changed, 13 insertions(+) > > > > diff --git a/block/file-posix.c b/block/file-posix.c > > index 829ee538d8..d3b1609410 100644 > > --- a/block/file-posix.c > > +++ b/block/file-posix.c > > @@ -1488,6 +1488,8 @@ static ssize_t > > handle_aiocb_copy_range(RawPosixAIOData *aiocb) > > ssize_t ret = copy_file_range(aiocb->aio_fildes, _off, > >aiocb->aio_fd2, _off, > >bytes, 0); > > +trace_copy_file_range(aiocb->bs, aiocb->aio_fildes, in_off, > > + aiocb->aio_fd2, out_off, bytes, 0, ret); > > I think it's preferable to have a common prefix for all trace points in > a driver, so they can be enabled with a glob. > > paio_* is the existing one for thread pool based file-posix trace > points. Not sure if we like it or want to replace it with something > else. Yes, I think it makes sense to add a "file_" prefix. Fam
Re: [Qemu-devel] [PATCH v4] module: Use QEMU_MODULE_DIR as a search path
On Wed, 07/04 14:10, ryang wrote: > The current paths for modules are CONFIG_QEMU_MODDIR and paths relative > to the executable. Qemu and its modules can be installed and executed in > paths that are different from these search paths. This change allows > a search path to be specified by environment variable. > > An example usage for this is postmarketOS[1]. This is a build environment > for Alpine Linux. It sets up Alpine Linux in a chroot environment. > Alpine's Qemu packages are installed in the chroot. The Alpine Linux Qemu > package is used to test compiled Alpine Linux system images. This way there > isn't a reliance on the which ever version of Qemu the host system / distro > provides. > > postmarketOS executes Qemu on host system outside of the chroot > The Qemu module search path needs to point to the location of the > chroot relative to the host system. > > e.g. > The root of the Alpine Linux chroot is: > ~/.local/var/pmbootstrap/chroot_native/ > > Alpine's Qemu is installed at > ~/.local/var/pmbootstrap/chroot_native/usr/bin/ > > The Qemu module search path needs to be: > QEMU_MODULE_DIR=~/.local/var/pmbootstrap/chroot_native/usr/lib/qemu/ > > [1] https://postmarketos.org/ > > Signed-off-by: ryang > --- > > v2: > - fix checkpatch errors > v3: > - initialize n_dirs variable > v4: > - change env variable suffix from _PATH to _DIR > - renamed search_path to search_dir and make it const > > util/module.c | 22 ++ > 1 file changed, 14 insertions(+), 8 deletions(-) > > diff --git a/util/module.c b/util/module.c > index c909737..1259dd3 100644 > --- a/util/module.c > +++ b/util/module.c > @@ -162,9 +162,10 @@ void module_load_one(const char *prefix, const char > *lib_name) > #ifdef CONFIG_MODULES > char *fname = NULL; > char *exec_dir; > -char *dirs[3]; > +const char *search_dir; > +char *dirs[4]; > char *module_name; > -int i = 0; > +int i = 0, n_dirs = 0; > int ret; > static GHashTable *loaded_modules; > > @@ -186,14 +187,19 @@ void module_load_one(const char *prefix, const char > *lib_name) > g_hash_table_insert(loaded_modules, module_name, module_name); > > exec_dir = qemu_get_exec_dir(); > -dirs[i++] = g_strdup_printf("%s", CONFIG_QEMU_MODDIR); > -dirs[i++] = g_strdup_printf("%s/..", exec_dir ? : ""); > -dirs[i++] = g_strdup_printf("%s", exec_dir ? : ""); > -assert(i == ARRAY_SIZE(dirs)); > +search_dir = getenv("QEMU_MODULE_DIR"); > +if (search_dir != NULL) { > +dirs[n_dirs++] = g_strdup_printf("%s", search_dir); > +} > +dirs[n_dirs++] = g_strdup_printf("%s", CONFIG_QEMU_MODDIR); > +dirs[n_dirs++] = g_strdup_printf("%s/..", exec_dir ? : ""); > +dirs[n_dirs++] = g_strdup_printf("%s", exec_dir ? : ""); > +assert(n_dirs <= ARRAY_SIZE(dirs)); > + > g_free(exec_dir); > exec_dir = NULL; > > -for (i = 0; i < ARRAY_SIZE(dirs); i++) { > +for (i = 0; i < n_dirs; i++) { > fname = g_strdup_printf("%s/%s%s", > dirs[i], module_name, HOST_DSOSUF); > ret = module_load_file(fname); > @@ -205,7 +211,7 @@ void module_load_one(const char *prefix, const char > *lib_name) > } > } > > -for (i = 0; i < ARRAY_SIZE(dirs); i++) { > +for (i = 0; i < n_dirs; i++) { > g_free(dirs[i]); > } > > -- > 2.7.4 > Reviewed-by: Fam Zheng
[Qemu-devel] [PATCH v2 9/9] block: Use common write req handling in truncate
Truncation is the last to convert from open coded req handling to reusing helpers. This time the permission check in prepare has to adapt to the new caller: it checks a different permission bit, and don't trigger the before write notifier. Also, truncation should always trigger a bs->total_sectors update and in turn call parent resize_cb. Update the condition in finish helper, too. It's intended to do a duplicated bs->read_only check before calling bdrv_co_write_req_prepare() so that we can be more informative with the error message, as bdrv_co_write_req_prepare() doesn't have Error parameter. Signed-off-by: Fam Zheng --- block/io.c | 54 +- 1 file changed, 33 insertions(+), 21 deletions(-) diff --git a/block/io.c b/block/io.c index ed18eb0ca3..53f2bf4103 100644 --- a/block/io.c +++ b/block/io.c @@ -1556,13 +1556,22 @@ bdrv_co_write_req_prepare(BdrvChild *child, BdrvTrackedRequest *req, int flags) assert(!waited || !req->serialising); assert(req->overlap_offset <= req->offset); assert(req->offset + req->bytes <= req->overlap_offset + req->overlap_bytes); -if (flags & BDRV_REQ_WRITE_UNCHANGED) { -assert(child->perm & (BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE)); -} else { -assert(child->perm & BLK_PERM_WRITE); -} assert(end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE); -return notifier_with_return_list_notify(>before_write_notifiers, req); +switch (req->type) { +case BDRV_TRACKED_WRITE: +case BDRV_TRACKED_DISCARD: +if (flags & BDRV_REQ_WRITE_UNCHANGED) { +assert(child->perm & (BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE)); +} else { +assert(child->perm & BLK_PERM_WRITE); +} +return notifier_with_return_list_notify(>before_write_notifiers, req); +case BDRV_TRACKED_TRUNCATE: +assert(child->perm & BLK_PERM_RESIZE); +return 0; +default: +abort(); +} } static inline void coroutine_fn @@ -1575,9 +1584,10 @@ bdrv_co_write_req_finish(BdrvChild *child, BdrvTrackedRequest *req, int ret) stat64_max(>wr_highest_offset, req->offset + req->bytes); -if (req->type != BDRV_TRACKED_DISCARD && -ret == 0 && -end_sector > bs->total_sectors) { +if (ret == 0 && +(req->type == BDRV_TRACKED_TRUNCATE || + (req->type != BDRV_TRACKED_DISCARD && + end_sector > bs->total_sectors))) { bs->total_sectors = end_sector; bdrv_parent_cb_resize(bs); bdrv_dirty_bitmap_truncate(bs, end_sector << BDRV_SECTOR_BITS); @@ -3045,7 +3055,6 @@ int coroutine_fn bdrv_co_truncate(BdrvChild *child, int64_t offset, int64_t old_size, new_bytes; int ret; -assert(child->perm & BLK_PERM_RESIZE); /* if bs->drv == NULL, bs is closed, so there's nothing to do here */ if (!drv) { @@ -3078,7 +3087,16 @@ int coroutine_fn bdrv_co_truncate(BdrvChild *child, int64_t offset, * concurrently or they might be overwritten by preallocation. */ if (new_bytes) { mark_request_serialising(, 1); -wait_serialising_requests(); +} +if (bs->read_only) { +error_setg(errp, "Image is read-only"); +ret = -EACCES; +goto out; +} +ret = bdrv_co_write_req_prepare(child, , 0); +if (ret < 0) { +error_setg_errno(errp, -ret, "Failed to prepare request for truncation"); +goto out; } if (!drv->bdrv_co_truncate) { @@ -3090,13 +3108,6 @@ int coroutine_fn bdrv_co_truncate(BdrvChild *child, int64_t offset, ret = -ENOTSUP; goto out; } -if (bs->read_only) { -error_setg(errp, "Image is read-only"); -ret = -EACCES; -goto out; -} - -assert(!(bs->open_flags & BDRV_O_INACTIVE)); ret = drv->bdrv_co_truncate(bs, offset, prealloc, errp); if (ret < 0) { @@ -3108,9 +3119,10 @@ int coroutine_fn bdrv_co_truncate(BdrvChild *child, int64_t offset, } else { offset = bs->total_sectors * BDRV_SECTOR_SIZE; } -bdrv_dirty_bitmap_truncate(bs, offset); -bdrv_parent_cb_resize(bs); -atomic_inc(>write_gen); +/* It's possible that truncation succeeded but refresh_total_sectors + * failed, but the latter doesn't affect how we should finish the request. + * Pass 0 as the last parameter so that dirty bitmaps etc. are handled. */ +bdrv_co_write_req_finish(child, , 0); out: tracked_request_end(); -- 2.17.1
[Qemu-devel] [PATCH v2 5/9] block: Fix handling of image enlarging write
Two problems exist when a write request that enlarges the image (i.e. write beyond EOF) finishes: 1) parent is not notified about size change; 2) dirty bitmap is not resized although we try to set the dirty bits; Fix them just like how bdrv_co_truncate works. Reported-by: Kevin Wolf Signed-off-by: Fam Zheng --- block/io.c | 11 --- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/block/io.c b/block/io.c index 03d9eb0a65..f06978dda0 100644 --- a/block/io.c +++ b/block/io.c @@ -40,6 +40,7 @@ static AioWait drain_all_aio_wait; +static void bdrv_parent_cb_resize(BlockDriverState *bs); static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs, int64_t offset, int bytes, BdrvRequestFlags flags); @@ -1571,13 +1572,17 @@ bdrv_co_write_req_finish(BdrvChild *child, BdrvTrackedRequest *req, int ret) BlockDriverState *bs = child->bs; atomic_inc(>write_gen); -bdrv_set_dirty(bs, req->offset, req->bytes); stat64_max(>wr_highest_offset, req->offset + req->bytes); -if (ret == 0) { -bs->total_sectors = MAX(bs->total_sectors, end_sector); +if (req->type != BDRV_TRACKED_DISCARD && +ret == 0 && +end_sector > bs->total_sectors) { +bs->total_sectors = end_sector; +bdrv_parent_cb_resize(bs); +bdrv_dirty_bitmap_truncate(bs, end_sector << BDRV_SECTOR_BITS); } +bdrv_set_dirty(bs, req->offset, req->bytes); } /* -- 2.17.1
[Qemu-devel] [PATCH v2 6/9] block: Use common req handling for discard
Reuse the new bdrv_co_write_req_prepare/finish helpers. The variation here is that discard requests don't affect bs->wr_highest_offset. Signed-off-by: Fam Zheng --- block/io.c | 21 ++--- 1 file changed, 14 insertions(+), 7 deletions(-) diff --git a/block/io.c b/block/io.c index f06978dda0..912fcb962a 100644 --- a/block/io.c +++ b/block/io.c @@ -1582,7 +1582,18 @@ bdrv_co_write_req_finish(BdrvChild *child, BdrvTrackedRequest *req, int ret) bdrv_parent_cb_resize(bs); bdrv_dirty_bitmap_truncate(bs, end_sector << BDRV_SECTOR_BITS); } -bdrv_set_dirty(bs, req->offset, req->bytes); +if (req->bytes) { +switch (req->type) { +case BDRV_TRACKED_WRITE: +stat64_max(>wr_highest_offset, req->offset + req->bytes); +/* fall through, to set dirty bits */ +case BDRV_TRACKED_DISCARD: +bdrv_set_dirty(bs, req->offset, req->bytes); +break; +default: +break; +} +} } /* @@ -2643,10 +2654,7 @@ int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, int64_t offset, int bytes) ret = bdrv_check_byte_request(bs, offset, bytes); if (ret < 0) { return ret; -} else if (bs->read_only) { -return -EPERM; } -assert(!(bs->open_flags & BDRV_O_INACTIVE)); /* Do nothing if disabled. */ if (!(bs->open_flags & BDRV_O_UNMAP)) { @@ -2670,7 +2678,7 @@ int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, int64_t offset, int bytes) bdrv_inc_in_flight(bs); tracked_request_begin(, bs, offset, bytes, BDRV_TRACKED_DISCARD); -ret = notifier_with_return_list_notify(>before_write_notifiers, ); +ret = bdrv_co_write_req_prepare(child, , 0); if (ret < 0) { goto out; } @@ -2736,8 +2744,7 @@ int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, int64_t offset, int bytes) } ret = 0; out: -atomic_inc(>write_gen); -bdrv_set_dirty(bs, req.offset, req.bytes); +bdrv_co_write_req_finish(child, , ret); tracked_request_end(); bdrv_dec_in_flight(bs); return ret; -- 2.17.1
[Qemu-devel] [PATCH v2 8/9] block: Fix bdrv_co_truncate overlap check
If we are growing the image and potentially using preallocation for the new area, we need to make sure that no write requests are made to the "preallocated" area which [@old_size, @offset), not [@offset, offset * 2 - @old_size). Signed-off-by: Fam Zheng --- block/io.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/block/io.c b/block/io.c index d07849fa96..ed18eb0ca3 100644 --- a/block/io.c +++ b/block/io.c @@ -3070,7 +3070,8 @@ int coroutine_fn bdrv_co_truncate(BdrvChild *child, int64_t offset, } bdrv_inc_in_flight(bs); -tracked_request_begin(, bs, offset, new_bytes, BDRV_TRACKED_TRUNCATE); +tracked_request_begin(, bs, offset - new_bytes, new_bytes, + BDRV_TRACKED_TRUNCATE); /* If we are growing the image and potentially using preallocation for the * new area, we need to make sure that no write requests are made to it -- 2.17.1
[Qemu-devel] [PATCH v2 3/9] block: Use uint64_t for BdrvTrackedRequest byte fields
This matches the types used for bytes in the rest parts of block layer. In the case of bdrv_co_truncate, new_bytes can be the image size which probably doesn't fit in a 32 bit int. Signed-off-by: Fam Zheng --- block/io.c| 2 +- include/block/block_int.h | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/block/io.c b/block/io.c index 3e00667a2a..443a8584c4 100644 --- a/block/io.c +++ b/block/io.c @@ -587,7 +587,7 @@ static void tracked_request_end(BdrvTrackedRequest *req) static void tracked_request_begin(BdrvTrackedRequest *req, BlockDriverState *bs, int64_t offset, - unsigned int bytes, + uint64_t bytes, enum BdrvTrackedRequestType type) { *req = (BdrvTrackedRequest){ diff --git a/include/block/block_int.h b/include/block/block_int.h index af71b414be..66c0e50d82 100644 --- a/include/block/block_int.h +++ b/include/block/block_int.h @@ -69,12 +69,12 @@ enum BdrvTrackedRequestType { typedef struct BdrvTrackedRequest { BlockDriverState *bs; int64_t offset; -unsigned int bytes; +uint64_t bytes; enum BdrvTrackedRequestType type; bool serialising; int64_t overlap_offset; -unsigned int overlap_bytes; +uint64_t overlap_bytes; QLIST_ENTRY(BdrvTrackedRequest) list; Coroutine *co; /* owner, used for deadlock detection */ -- 2.17.1
[Qemu-devel] [PATCH v2 4/9] block: Extract common write req handling
As a mechanical refactoring patch, this is the first step towards unified and more correct write code paths. This is helpful because multiple BlockDriverState fields need to be updated after modifying image data, and it's hard to maintain in multiple places such as copy offload, discard and truncate. Suggested-by: Kevin Wolf Signed-off-by: Fam Zheng --- block/io.c | 70 ++ 1 file changed, 44 insertions(+), 26 deletions(-) diff --git a/block/io.c b/block/io.c index 443a8584c4..03d9eb0a65 100644 --- a/block/io.c +++ b/block/io.c @@ -1538,6 +1538,48 @@ fail: return ret; } +static inline int coroutine_fn +bdrv_co_write_req_prepare(BdrvChild *child, BdrvTrackedRequest *req, int flags) +{ +BlockDriverState *bs = child->bs; +bool waited; +int64_t end_sector = DIV_ROUND_UP(req->offset + req->bytes, BDRV_SECTOR_SIZE); + +if (bs->read_only) { +return -EPERM; +} +assert(!(bs->open_flags & BDRV_O_INACTIVE)); +assert((bs->open_flags & BDRV_O_NO_IO) == 0); +assert(!(flags & ~BDRV_REQ_MASK)); +waited = wait_serialising_requests(req); +assert(!waited || !req->serialising); +assert(req->overlap_offset <= req->offset); +assert(req->offset + req->bytes <= req->overlap_offset + req->overlap_bytes); +if (flags & BDRV_REQ_WRITE_UNCHANGED) { +assert(child->perm & (BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE)); +} else { +assert(child->perm & BLK_PERM_WRITE); +} +assert(end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE); +return notifier_with_return_list_notify(>before_write_notifiers, req); +} + +static inline void coroutine_fn +bdrv_co_write_req_finish(BdrvChild *child, BdrvTrackedRequest *req, int ret) +{ +int64_t end_sector = DIV_ROUND_UP(req->offset + req->bytes, BDRV_SECTOR_SIZE); +BlockDriverState *bs = child->bs; + +atomic_inc(>write_gen); +bdrv_set_dirty(bs, req->offset, req->bytes); + +stat64_max(>wr_highest_offset, req->offset + req->bytes); + +if (ret == 0) { +bs->total_sectors = MAX(bs->total_sectors, end_sector); +} +} + /* * Forwards an already correctly aligned write request to the BlockDriver, * after possibly fragmenting it. @@ -1548,10 +1590,8 @@ static int coroutine_fn bdrv_aligned_pwritev(BdrvChild *child, { BlockDriverState *bs = child->bs; BlockDriver *drv = bs->drv; -bool waited; int ret; -int64_t end_sector = DIV_ROUND_UP(offset + bytes, BDRV_SECTOR_SIZE); uint64_t bytes_remaining = bytes; int max_transfer; @@ -1567,23 +1607,10 @@ static int coroutine_fn bdrv_aligned_pwritev(BdrvChild *child, assert((offset & (align - 1)) == 0); assert((bytes & (align - 1)) == 0); assert(!qiov || bytes == qiov->size); -assert((bs->open_flags & BDRV_O_NO_IO) == 0); -assert(!(flags & ~BDRV_REQ_MASK)); max_transfer = QEMU_ALIGN_DOWN(MIN_NON_ZERO(bs->bl.max_transfer, INT_MAX), align); -waited = wait_serialising_requests(req); -assert(!waited || !req->serialising); -assert(req->overlap_offset <= offset); -assert(offset + bytes <= req->overlap_offset + req->overlap_bytes); -if (flags & BDRV_REQ_WRITE_UNCHANGED) { -assert(child->perm & (BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE)); -} else { -assert(child->perm & BLK_PERM_WRITE); -} -assert(end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE); - -ret = notifier_with_return_list_notify(>before_write_notifiers, req); +ret = bdrv_co_write_req_prepare(child, req, flags); if (!ret && bs->detect_zeroes != BLOCKDEV_DETECT_ZEROES_OPTIONS_OFF && !(flags & BDRV_REQ_ZERO_WRITE) && drv->bdrv_co_pwrite_zeroes && @@ -1632,15 +1659,10 @@ static int coroutine_fn bdrv_aligned_pwritev(BdrvChild *child, } bdrv_debug_event(bs, BLKDBG_PWRITEV_DONE); -atomic_inc(>write_gen); -bdrv_set_dirty(bs, offset, bytes); - -stat64_max(>wr_highest_offset, offset + bytes); - if (ret >= 0) { -bs->total_sectors = MAX(bs->total_sectors, end_sector); ret = 0; } +bdrv_co_write_req_finish(child, req, ret); return ret; } @@ -1755,10 +1777,6 @@ int coroutine_fn bdrv_co_pwritev(BdrvChild *child, if (!bs->drv) { return -ENOMEDIUM; } -if (bs->read_only) { -return -EPERM; -} -assert(!(bs->open_flags & BDRV_O_INACTIVE)); ret = bdrv_check_byte_request(bs, offset, bytes); if (ret < 0) { -- 2.17.1
[Qemu-devel] [PATCH v2 7/9] block: Use common req handling in copy offloading
This brings the request handling logic inline with write and discard, fixing write_gen, resize_cb, dirty bitmaps and image size refreshing. The last of these issues broke iotest case 222, which is now fixed. Signed-off-by: Fam Zheng --- block/io.c | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/block/io.c b/block/io.c index 912fcb962a..d07849fa96 100644 --- a/block/io.c +++ b/block/io.c @@ -2962,7 +2962,11 @@ static int coroutine_fn bdrv_co_copy_range_internal(BdrvChild *src, if (!(flags & BDRV_REQ_NO_SERIALISING)) { wait_serialising_requests(_req); -wait_serialising_requests(_req); +} + +ret = bdrv_co_write_req_prepare(dst, _req, flags); +if (ret) { +goto out; } if (recurse_src) { ret = src->bs->drv->bdrv_co_copy_range_from(src->bs, @@ -2975,6 +2979,8 @@ static int coroutine_fn bdrv_co_copy_range_internal(BdrvChild *src, dst, dst_offset, bytes, flags); } +out: +bdrv_co_write_req_finish(dst, _req, ret); tracked_request_end(_req); tracked_request_end(_req); bdrv_dec_in_flight(src->bs); -- 2.17.1
[Qemu-devel] [PATCH v2 2/9] block: Use BdrvChild to discard
Other I/O functions are already using a BdrvChild pointer in the API, so make discard do the same. It makes it possible to initiate the same permission checks before doing I/O, and much easier to share the helper functions for this, which will be added and used by write, truncate and copy range paths. Signed-off-by: Fam Zheng --- block/blkdebug.c | 2 +- block/blkreplay.c | 2 +- block/block-backend.c | 2 +- block/copy-on-read.c | 2 +- block/io.c | 18 +- block/mirror.c | 2 +- block/qcow2-refcount.c | 2 +- block/raw-format.c | 2 +- block/throttle.c | 2 +- include/block/block.h | 4 ++-- 10 files changed, 19 insertions(+), 19 deletions(-) diff --git a/block/blkdebug.c b/block/blkdebug.c index 526af2a808..0457bf5b66 100644 --- a/block/blkdebug.c +++ b/block/blkdebug.c @@ -625,7 +625,7 @@ static int coroutine_fn blkdebug_co_pdiscard(BlockDriverState *bs, return err; } -return bdrv_co_pdiscard(bs->file->bs, offset, bytes); +return bdrv_co_pdiscard(bs->file, offset, bytes); } static int coroutine_fn blkdebug_co_block_status(BlockDriverState *bs, diff --git a/block/blkreplay.c b/block/blkreplay.c index b016dbeee7..766150ade6 100755 --- a/block/blkreplay.c +++ b/block/blkreplay.c @@ -113,7 +113,7 @@ static int coroutine_fn blkreplay_co_pdiscard(BlockDriverState *bs, int64_t offset, int bytes) { uint64_t reqid = blkreplay_next_id(); -int ret = bdrv_co_pdiscard(bs->file->bs, offset, bytes); +int ret = bdrv_co_pdiscard(bs->file, offset, bytes); block_request_create(reqid, bs, qemu_coroutine_self()); qemu_coroutine_yield(); diff --git a/block/block-backend.c b/block/block-backend.c index 6b75bca317..dd037b40a2 100644 --- a/block/block-backend.c +++ b/block/block-backend.c @@ -1559,7 +1559,7 @@ int blk_co_pdiscard(BlockBackend *blk, int64_t offset, int bytes) return ret; } -return bdrv_co_pdiscard(blk_bs(blk), offset, bytes); +return bdrv_co_pdiscard(blk->root, offset, bytes); } int blk_co_flush(BlockBackend *blk) diff --git a/block/copy-on-read.c b/block/copy-on-read.c index 1dcdaeed69..a19164f9eb 100644 --- a/block/copy-on-read.c +++ b/block/copy-on-read.c @@ -116,7 +116,7 @@ static int coroutine_fn cor_co_pwrite_zeroes(BlockDriverState *bs, static int coroutine_fn cor_co_pdiscard(BlockDriverState *bs, int64_t offset, int bytes) { -return bdrv_co_pdiscard(bs->file->bs, offset, bytes); +return bdrv_co_pdiscard(bs->file, offset, bytes); } diff --git a/block/io.c b/block/io.c index 8cc5ee661d..3e00667a2a 100644 --- a/block/io.c +++ b/block/io.c @@ -2590,7 +2590,7 @@ int bdrv_flush(BlockDriverState *bs) } typedef struct DiscardCo { -BlockDriverState *bs; +BdrvChild *child; int64_t offset; int bytes; int ret; @@ -2599,17 +2599,17 @@ static void coroutine_fn bdrv_pdiscard_co_entry(void *opaque) { DiscardCo *rwco = opaque; -rwco->ret = bdrv_co_pdiscard(rwco->bs, rwco->offset, rwco->bytes); +rwco->ret = bdrv_co_pdiscard(rwco->child, rwco->offset, rwco->bytes); } -int coroutine_fn bdrv_co_pdiscard(BlockDriverState *bs, int64_t offset, - int bytes) +int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, int64_t offset, int bytes) { BdrvTrackedRequest req; int max_pdiscard, ret; int head, tail, align; +BlockDriverState *bs = child->bs; -if (!bs->drv) { +if (!bs || !bs->drv) { return -ENOMEDIUM; } @@ -2720,11 +2720,11 @@ out: return ret; } -int bdrv_pdiscard(BlockDriverState *bs, int64_t offset, int bytes) +int bdrv_pdiscard(BdrvChild *child, int64_t offset, int bytes) { Coroutine *co; DiscardCo rwco = { -.bs = bs, +.child = child, .offset = offset, .bytes = bytes, .ret = NOT_DONE, @@ -2735,8 +2735,8 @@ int bdrv_pdiscard(BlockDriverState *bs, int64_t offset, int bytes) bdrv_pdiscard_co_entry(); } else { co = qemu_coroutine_create(bdrv_pdiscard_co_entry, ); -bdrv_coroutine_enter(bs, co); -BDRV_POLL_WHILE(bs, rwco.ret == NOT_DONE); +bdrv_coroutine_enter(child->bs, co); +BDRV_POLL_WHILE(child->bs, rwco.ret == NOT_DONE); } return rwco.ret; diff --git a/block/mirror.c b/block/mirror.c index 61bd9f3cf1..b48c3f8cf5 100644 --- a/block/mirror.c +++ b/block/mirror.c @@ -1333,7 +1333,7 @@ static int coroutine_fn bdrv_mirror_top_do_write(BlockDriverState *bs, break; case MIRROR_METHOD_DISCARD: -ret = bdrv_co_pdiscard(bs->backing->bs, offset, bytes); +ret = bdrv_co_pdiscard(bs->backing, offset, bytes); break; default: diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c index 18c729aa27..4e1589a
[Qemu-devel] [PATCH v2 1/9] block: Add copy offloading trace points
A few trace points that can help reveal what is happening in a copy offloading I/O path. Signed-off-by: Fam Zheng --- block/file-posix.c | 2 ++ block/io.c | 2 ++ block/iscsi.c | 3 +++ block/trace-events | 6 ++ 4 files changed, 13 insertions(+) diff --git a/block/file-posix.c b/block/file-posix.c index 829ee538d8..d3b1609410 100644 --- a/block/file-posix.c +++ b/block/file-posix.c @@ -1488,6 +1488,8 @@ static ssize_t handle_aiocb_copy_range(RawPosixAIOData *aiocb) ssize_t ret = copy_file_range(aiocb->aio_fildes, _off, aiocb->aio_fd2, _off, bytes, 0); +trace_copy_file_range(aiocb->bs, aiocb->aio_fildes, in_off, + aiocb->aio_fd2, out_off, bytes, 0, ret); if (ret == 0) { /* No progress (e.g. when beyond EOF), let the caller fall back to * buffer I/O. */ diff --git a/block/io.c b/block/io.c index 1a2272fad3..8cc5ee661d 100644 --- a/block/io.c +++ b/block/io.c @@ -2960,6 +2960,7 @@ int coroutine_fn bdrv_co_copy_range_from(BdrvChild *src, uint64_t src_offset, BdrvChild *dst, uint64_t dst_offset, uint64_t bytes, BdrvRequestFlags flags) { +trace_bdrv_co_copy_range_from(src, src_offset, dst, dst_offset, bytes, flags); return bdrv_co_copy_range_internal(src, src_offset, dst, dst_offset, bytes, flags, true); } @@ -2972,6 +2973,7 @@ int coroutine_fn bdrv_co_copy_range_to(BdrvChild *src, uint64_t src_offset, BdrvChild *dst, uint64_t dst_offset, uint64_t bytes, BdrvRequestFlags flags) { +trace_bdrv_co_copy_range_to(src, src_offset, dst, dst_offset, bytes, flags); return bdrv_co_copy_range_internal(src, src_offset, dst, dst_offset, bytes, flags, false); } diff --git a/block/iscsi.c b/block/iscsi.c index ead2bd5aa7..118555d051 100644 --- a/block/iscsi.c +++ b/block/iscsi.c @@ -44,6 +44,7 @@ #include "qapi/qmp/qstring.h" #include "crypto/secret.h" #include "scsi/utils.h" +#include "trace.h" /* Conflict between scsi/utils.h and libiscsi! :( */ #define SCSI_XFER_NONE ISCSI_XFER_NONE @@ -2396,6 +2397,8 @@ retry: } out_unlock: + +trace_iscsi_xcopy(src_lun, src_offset, dst_lun, dst_offset, bytes, r); g_free(iscsi_task.task); qemu_mutex_unlock(_lun->mutex); g_free(iscsi_task.err_str); diff --git a/block/trace-events b/block/trace-events index c35287b48a..1a25a997f2 100644 --- a/block/trace-events +++ b/block/trace-events @@ -15,6 +15,8 @@ bdrv_co_preadv(void *bs, int64_t offset, int64_t nbytes, unsigned int flags) "bs bdrv_co_pwritev(void *bs, int64_t offset, int64_t nbytes, unsigned int flags) "bs %p offset %"PRId64" nbytes %"PRId64" flags 0x%x" bdrv_co_pwrite_zeroes(void *bs, int64_t offset, int count, int flags) "bs %p offset %"PRId64" count %d flags 0x%x" bdrv_co_do_copy_on_readv(void *bs, int64_t offset, unsigned int bytes, int64_t cluster_offset, int64_t cluster_bytes) "bs %p offset %"PRId64" bytes %u cluster_offset %"PRId64" cluster_bytes %"PRId64 +bdrv_co_copy_range_from(void *src, uint64_t src_offset, void *dst, uint64_t dst_offset, uint64_t bytes, int flags) "src %p offset %"PRIu64" dst %p offset %"PRIu64" bytes %"PRIu64" flags 0x%x" +bdrv_co_copy_range_to(void *src, uint64_t src_offset, void *dst, uint64_t dst_offset, uint64_t bytes, int flags) "src %p offset %"PRIu64" dst %p offset %"PRIu64" bytes %"PRIu64" flags 0x%x" # block/stream.c stream_one_iteration(void *s, int64_t offset, uint64_t bytes, int is_allocated) "s %p offset %" PRId64 " bytes %" PRIu64 " is_allocated %d" @@ -57,6 +59,7 @@ qmp_block_stream(void *bs, void *job) "bs %p job %p" # block/file-posix.c paio_submit_co(int64_t offset, int count, int type) "offset %"PRId64" count %d type %d" paio_submit(void *acb, void *opaque, int64_t offset, int count, int type) "acb %p opaque %p offset %"PRId64" count %d type %d" +copy_file_range(void *bs, int src, int64_t src_off, int dst, int64_t dst_off, int64_t bytes, int flags, int64_t ret) "bs %p src_fd %d offset %"PRIu64" dst_fd %d offset %"PRIu64" bytes %"PRIu64" flags %d ret %"PRId64 # block/qcow2.c qcow2_writev_start_req(void *co, int64_t offset, int bytes) "co %p offset 0x%" PRIx64 " bytes %d" @@ -150,3 +153,6 @@ nvme_free_req_queue_wait(void *q) "q %p" nvme_cmd_map_qiov(void *s, void *cmd, void *req, void *qiov, int entries) &quo
[Qemu-devel] [PATCH v2 0/9] block: Fix dst reading after tail copy offloading
Qcow2 allocates new clusters after the end of the file. If it is the destinaton of copy offloading, we must adjust dst->bs->total_sectors. Otherwise, further reads will drop to the "beyond EOF" code path and return zeroes, which problem is caught by iotests 222. Follow the logic in the normal write code and update bs->total_sectors after I/O is done. While at it, add a few convenient trace points to aid future debug experiences in the topic. Fam Zheng (9): block: Add copy offloading trace points block: Use BdrvChild to discard block: Use uint64_t for BdrvTrackedRequest byte fields block: Extract common write req handling block: Fix handling of image enlarging write block: Use common req handling for discard block: Use common req handling in copy offloading block: Fix bdrv_co_truncate overlap check block: Use common write req handling in truncate block/blkdebug.c | 2 +- block/blkreplay.c | 2 +- block/block-backend.c | 2 +- block/copy-on-read.c | 2 +- block/file-posix.c| 2 + block/io.c| 163 +- block/iscsi.c | 3 + block/mirror.c| 2 +- block/qcow2-refcount.c| 2 +- block/raw-format.c| 2 +- block/throttle.c | 2 +- block/trace-events| 6 ++ include/block/block.h | 4 +- include/block/block_int.h | 4 +- 14 files changed, 130 insertions(+), 68 deletions(-) -- 2.17.1
Re: [Qemu-devel] [PATCH 1/2] block: Fix dst total_sectors after copy offloading
On Wed, 07/04 09:44, Kevin Wolf wrote: > Am 04.07.2018 um 08:13 hat Fam Zheng geschrieben: > > This was noticed by the new image fleecing tests case 222. The issue is > > apparent and we should just do the same right things as in > > bdrv_aligned_pwritev. > > > > Reported-by: John Snow > > Signed-off-by: Fam Zheng > > > --- a/block/io.c > > +++ b/block/io.c > > @@ -2945,6 +2945,10 @@ static int coroutine_fn > > bdrv_co_copy_range_internal(BdrvChild *src, > >dst, dst_offset, > >bytes, flags); > > } > > +if (ret == 0) { > > +int64_t end_sector = DIV_ROUND_UP(dst_offset + bytes, > > BDRV_SECTOR_SIZE); > > +dst->bs->total_sectors = MAX(dst->bs->total_sectors, end_sector); > > +} > > I think it's time to extract this stuff to a common function. I was > already aware that a write request that extends the image isn't behaving > exactly the same as bdrv_co_truncate(), and this one is bound to diverge > from the other two instances as well. > > This is what bdrv_aligned_pwritev() does after the write: > > atomic_inc(>write_gen); > bdrv_set_dirty(bs, offset, bytes); > > stat64_max(>wr_highest_offset, offset + bytes); > > if (ret >= 0) { > bs->total_sectors = MAX(bs->total_sectors, end_sector); > ret = 0; > } > > Before the write, it also calls bs->before_write_notifiers. > > This is what bdrv_co_truncate() does after truncating the image: > > ret = refresh_total_sectors(bs, offset >> BDRV_SECTOR_BITS); > if (ret < 0) { > error_setg_errno(errp, -ret, "Could not refresh total sector count"); > } else { > offset = bs->total_sectors * BDRV_SECTOR_SIZE; > } > bdrv_dirty_bitmap_truncate(bs, offset); > bdrv_parent_cb_resize(bs); > atomic_inc(>write_gen); > > This means that we probably have at least the following bugs in > bdrv_co_copy_range_internal(): > > 1. bs->write_gen is not increased, a following flush is ignored > 2. Dirty bitmaps are not dirtied > 3. Dirty bitmaps are not resized when extending the image > 4. bs->wr_highest_offset is not adjusted correctly > 5. bs->total_sectors is not resized (the bug this patch fixes) > 6. The parent node isn't notified about an image size change > > Of these, 3. and 6. also seem to be bugs in bdrv_aligned_pwritev(). > Indeed, great insight. I'll work on v2. Fam