Re: [Qemu-devel] [PATCH] qemu-img.c: Add examples section

2018-08-02 Thread Fam Zheng
On Thu, 08/02 20:50, John Arbuckle wrote:
> Add an examples section to the help output.
> 
> Signed-off-by: John Arbuckle 
> ---
>  qemu-img.c | 11 +++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/qemu-img.c b/qemu-img.c
> index 1acddf693c..f77c82695d 100644
> --- a/qemu-img.c
> +++ b/qemu-img.c
> @@ -199,6 +199,17 @@ static void QEMU_NORETURN help(void)
>  
>  printf("%s\nSupported formats:", help_msg);
>  bdrv_iterate_format(format_print, NULL);
> +
> +printf("\n\nExamples:\n\n"
> +   "Create: qemu-img create -f qcow2 image.qcow2 10G\n\n"
> +   "Info: qemu-img info image.qcow2\n\n"
> +   "Resize: qemu-img resize image.qcow2 20G\n\n"
> +   "Convert: qemu-img convert -f raw -O qcow2 image.img 
> image.qcow2\n\n"
> +   "Check: qemu-img check image.qcow2\n\n"
> +   "Map: qemu-img map -f qcow2 --output=human image.qcow2\n\n"
> +   "Rebase: qemu-img rebase -b new_backing_file.qcow2 image.qcow2"
> +   );

The text looks good but maybe it's better to condense the section by using only
one \n between command lines instead of two?

Fam

> +
>  printf("\n\n" QEMU_HELP_BOTTOM "\n");
>  exit(EXIT_SUCCESS);
>  }
> -- 
> 2.14.3 (Apple Git-98)
> 
> 



Re: [Qemu-devel] 'make vm-build-freebsd' don't work if KVM isn't enabled

2018-08-01 Thread Fam Zheng
On Mon, 07/30 16:20, Peter Maydell wrote:
> On 30 July 2018 at 14:40, Peter Maydell  wrote:
> > On 30 July 2018 at 14:36, Peter Maydell  wrote:
> >> On 30 July 2018 at 14:23, Peter Maydell  wrote:
> >>> The tests in tests/vm/ seem to make some attempt to cope with the
> >>> host system not allowing the user to use KVM, but it doesn't quite
> >>> work.
> >>
> >> Also, passing V=1 to the outer make command like:
> >>  make vm-build-freebsd J=16 V=1
> >> isn't propagated through to the VM: test output looks like
> >>   GTESTER check-qtest-aarch64
> >>   GTESTER check-qtest-alpha
> >> instead of the proper verbose output with a line per subtest.
> >
> > Nor is the J=16 option passed through to "make check":
> 
> Patches for these things on the way, once I've completed a test run.

All good catches! Looking forward to the fixes. Thanks.

Fam



Re: [Qemu-devel] [PATCH for-3.1] s390x: remove 's390-squash-mcss' option

2018-08-01 Thread Fam Zheng
On Wed, Aug 1, 2018 at 9:18 PM Cornelia Huck  wrote:
>
> On Wed, 1 Aug 2018 15:11:23 +0200
> Cornelia Huck  wrote:
>
> > On Wed, 1 Aug 2018 14:54:51 +0200
> > Cornelia Huck  wrote:
> >
> > > On Wed, 1 Aug 2018 18:21:27 +0800
> > > Fam Zheng  wrote:
> > >
> > > > On Tue, 07/24 11:24, Cornelia Huck wrote:
> > > > > diff --git a/hw/s390x/css-bridge.c b/hw/s390x/css-bridge.c
> > > > > index a02d708239..1bd6c8b458 100644
> > > > > --- a/hw/s390x/css-bridge.c
> > > > > +++ b/hw/s390x/css-bridge.c
> > > > > @@ -106,7 +106,6 @@ VirtualCssBus *virtual_css_bus_init(void)
> > > > >  /* Create bus on bridge device */
> > > > >  bus = qbus_create(TYPE_VIRTUAL_CSS_BUS, dev, "virtual-css");
> > > > >  cbus = VIRTUAL_CSS_BUS(bus);
> > > >
> > > > Not used now?
> > > >
> > > > Fam
> > >
> > > Indeed, we can ditch the cbus variable.
> >
> > ...or not :) We still need it for the return value, which is processed
> > in ccw_init(). We could change the return code of the function to
> > BusState, but I'm not sure it is worth the hassle.
>
> ...but we can indeed get rid of the cbus and qbus variables in
> s390_ccw_realize().

I got this from a patchew.org test (make docker-test-clang@ubuntu):

/tmp/qemu-test/src/hw/s390x/3270-ccw.c:103:20: error: unused variable
'cbus' [-Werror,-Wunused-variable]
VirtualCssBus *cbus = VIRTUAL_CSS_BUS(qbus);
   ^
1 error generated.
/tmp/qemu-test/src/rules.mak:69: recipe for target 'hw/s390x/3270-ccw.o' failed



Re: [Qemu-devel] [PATCH for-3.1] s390x: remove 's390-squash-mcss' option

2018-08-01 Thread Fam Zheng
On Tue, 07/24 11:24, Cornelia Huck wrote:
> diff --git a/hw/s390x/css-bridge.c b/hw/s390x/css-bridge.c
> index a02d708239..1bd6c8b458 100644
> --- a/hw/s390x/css-bridge.c
> +++ b/hw/s390x/css-bridge.c
> @@ -106,7 +106,6 @@ VirtualCssBus *virtual_css_bus_init(void)
>  /* Create bus on bridge device */
>  bus = qbus_create(TYPE_VIRTUAL_CSS_BUS, dev, "virtual-css");
>  cbus = VIRTUAL_CSS_BUS(bus);

Not used now?

Fam



Re: [Qemu-devel] [PATCH] scsi: mptsas: Mark as storage device

2018-07-31 Thread Fam Zheng
On Tue, 07/31 15:28, Guenter Roeck wrote:
> mptsas1068 is currently listed as uncategorized device.
> Mark it as storage device.
> 
> Signed-off-by: Guenter Roeck 
> ---
>  hw/scsi/mptsas.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/hw/scsi/mptsas.c b/hw/scsi/mptsas.c
> index 4176e87..929404f 100644
> --- a/hw/scsi/mptsas.c
> +++ b/hw/scsi/mptsas.c
> @@ -1431,6 +1431,7 @@ static void mptsas1068_class_init(ObjectClass *oc, void 
> *data)
>  dc->reset = mptsas_reset;
>  dc->vmsd = _mptsas;
>  dc->desc = "LSI SAS 1068";
> +set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
>  }
>  
>  static const TypeInfo mptsas_info = {
> -- 
> 2.7.4
> 

Reviewed-by: Fam Zheng 




Re: [Qemu-devel] [PATCH 0/2] iotests: Fix 226 on _my_ system

2018-07-27 Thread Fam Zheng
On Fri, Jul 27, 2018 at 6:04 PM Kevin Wolf  wrote:
>
> Am 27.07.2018 um 11:24 hat Fam Zheng geschrieben:
> > On Tue, 07/24 16:47, Fam Zheng wrote:
> > > Something has locked /dev/null on my system (I still don't know what to 
> > > do with
> > > the annoying incapability of lslocks, or more precisely /proc/locks, on
> > > inspecting OFD lock information), and as a result 226 cannot pass due to 
> > > the
> > > unexpected image locking error.
> > >
> > > Fix the test case by disabling locking, and add a doc text about using 
> > > test
> > > images.
> > >
> > > Fam Zheng (2):
> > >   docs: Describe using images in writing iotests
> > >   iotests: Don't lock /dev/null in 226
> > >
> > >  docs/devel/testing.rst | 11 +++
> > >  tests/qemu-iotests/226 |  4 ++--
> > >  2 files changed, 13 insertions(+), 2 deletions(-)
> >
> > Kevin, could you apply this for 3.0 too?
>
> Sure, applied to the block branch.
>
> Out of curiosity, did you find out what had locked the devices?

No, but I'll for sure allocate some time to understand kernel's OFD
internals, find the holder application, then hopefully, propose
something that can fix lslocks.

Thanks,
Fam

>
> Kevin



Re: [Qemu-devel] [PATCH RFC for-3.0-rc3 0/3] qemu-img: Disable copy offloading by default

2018-07-27 Thread Fam Zheng
On Fri, Jul 27, 2018 at 6:29 PM Kevin Wolf  wrote:
>
> Am 27.07.2018 um 05:33 hat Fam Zheng geschrieben:
> > Kevin pointed out that both glibc and kernel provides a slow fallback of
> > copy_file_range which hurts thin provisioning. This is particularly true for
> > thin LVs, because host_device driver cannot get allocation info from the
> > volume, and copy_file_range is called on every sectors, making the dst fully
> > allocated.
> >
> > NFS mount points also doesn't support SEEK_DATA well, so the allocation
> > information is unknown to QEMU.
> >
> > That leaves only iscsi:// which seems to do what we want so far, but it is a
> > smaller use case.
> >
> > Add an option to qemu-img convert, "-C", to enable (attempting) copy 
> > offloading
> > explicitly. And mark it incompatible with "-S" and "-c".
>
> Reviewed-by: Kevin Wolf 
>
> Not sure why you made this an RFC only, but I think we absolutely need
> this. People are used to using 'qemu-img convert' to compact images and
> this would regress with automatic copy offloading.
>
> Do you think we need more discussion?

I think merging this for 3.0 is the right thing do to.

What worries me is the general usability of the feature. We could
probably explore ideas about how we can better take advantage of copy
offloading. I don't think counting on the user to make the right
decision between disk efficiency (thin provisioning) and BW efficiency
(copy offloading) will ever work. Even if we don't care about breaking
the default '-S 4k' behavior, the lack of SEEK_DATA/SEEK_HOLE support
on host NFS and block devices will make it very hard to use. Making it
worse, if the network to NFS server is good enough, convert with
pread64/pwrite64 with host page cache is also more efficient than
copy_file_range, so we'll be slower by trying to play clever. :(

Any thought?

Fam



Re: [Qemu-devel] [PATCH 0/2] iotests: Fix 226 on _my_ system

2018-07-27 Thread Fam Zheng
On Tue, 07/24 16:47, Fam Zheng wrote:
> Something has locked /dev/null on my system (I still don't know what to do 
> with
> the annoying incapability of lslocks, or more precisely /proc/locks, on
> inspecting OFD lock information), and as a result 226 cannot pass due to the
> unexpected image locking error.
> 
> Fix the test case by disabling locking, and add a doc text about using test
> images.
> 
> Fam Zheng (2):
>   docs: Describe using images in writing iotests
>   iotests: Don't lock /dev/null in 226
> 
>  docs/devel/testing.rst | 11 +++
>  tests/qemu-iotests/226 |  4 ++--
>  2 files changed, 13 insertions(+), 2 deletions(-)

Kevin, could you apply this for 3.0 too?

Fam



Re: [Qemu-devel] [PATCH] file-posix: Handle EINTR in preallocation=full write

2018-07-27 Thread Fam Zheng
On Fri, 07/27 14:53, Fam Zheng wrote:
> Cc: qemu-sta...@nongnu.org
> Signed-off-by: Fam Zheng 

Actually cc qemu-sta...@nongnu.org



Re: [Qemu-devel] [RFC PATCH] travis: Test out-of-tree builds

2018-07-27 Thread Fam Zheng
On Thu, 06/21 09:26, Philippe Mathieu-Daudé wrote:
> Force one config to build 'out-of-tree' (object files and executables
> are created in a tree outside the project source code).
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
> I noticed various out-of-tree issue in the last 2 merge windows.
> 
> Pseudo-randomly picked a build from the matrix which covers system + user.
> 
>  .travis.yml | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/.travis.yml b/.travis.yml
> index fabfe9ec34..02e096f6a9 100644
> --- a/.travis.yml
> +++ b/.travis.yml
> @@ -50,6 +50,8 @@ notifications:
>  on_failure: always
>  env:
>global:
> +- SRC_DIR="."
> +- BUILD_DIR="."
>  - TEST_CMD="make check"
>  - MAKEFLAGS="-j3"
>matrix:
> @@ -69,11 +71,15 @@ before_install:
>- wget -O - 
> http://people.linaro.org/~alex.bennee/qemu-submodule-git-seed.tar.xz | tar 
> -xvJ
>- git submodule update --init --recursive
>  before_script:
> -  - ./configure ${CONFIG} || { cat config.log && exit 1; }
> +  - mkdir -p ${BUILD_DIR} && cd ${BUILD_DIR}
> +  - ${SRC_DIR}/configure ${CONFIG} || { cat config.log && exit 1; }
>  script:
>- make ${MAKEFLAGS} && ${TEST_CMD}
>  matrix:
>include:
> +# Test out-of-tree builds
> +- env: CONFIG="--enable-debug --enable-debug-tcg"
> +   BUILD_DIR="out-of-tree/build/dir" SRC_DIR="../../.."

Is a three-level nesting really necessary? I think a simple './build' dir should
do.

>  # Test with Clang for compile portability (Travis uses clang-5.0)
>  - env: CONFIG="--disable-system"
>compiler: clang
> -- 
> 2.18.0.rc2
> 

Fam



[Qemu-devel] [PATCH] tests/vm: Add vm-build-all/vm-clean-all in help text

2018-07-27 Thread Fam Zheng
Suggested-by: Philippe Mathieu-Daudé 
Signed-off-by: Fam Zheng 
---
 tests/vm/Makefile.include | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tests/vm/Makefile.include b/tests/vm/Makefile.include
index 5daa2a3b73..9e19f8662f 100644
--- a/tests/vm/Makefile.include
+++ b/tests/vm/Makefile.include
@@ -14,6 +14,9 @@ vm-test:
@echo "  vm-build-freebsd- Build QEMU in FreeBSD VM"
@echo "  vm-build-netbsd - Build QEMU in NetBSD VM"
@echo "  vm-build-openbsd- Build QEMU in OpenBSD VM"
+   @echo ""
+   @echo "  vm-build-all- Build QEMU in all VMs"
+   @echo "  vm-clean-all- Clean up VM images"
 
 vm-build-all: $(addprefix vm-build-, $(IMAGES))
 
-- 
2.17.1




Re: [Qemu-devel] [PATCH RFC 00/10] docker on non-x86 hosts

2018-07-27 Thread Fam Zheng
On Wed, 07/18 11:04, Alex Bennée wrote:
> Hi,
> 
> Our existing support for docker is fairly x86 centric. While docker
> itself has support for multiple architectures not all architectures
> are equal. For example Debian only packages the widest range of
> cross-compilers in it's x86 images (although for example armhf is
> available on aarch64 based images).
> 
> As the binfmt support is getting more solid we always have that fall
> back option and I've converted a number of the guests that way.
> Unfortunately I've overloaded the meaning of DOCKER_PARTIAL_IMAGES but
> I was loathed to throw in yet another magic variable without some
> discussion first. I'm not sure if my current path will end up with a
> maze of twisty if/else statements all subtly alike.

Just my two cents.  So far it looks reasonable. Though I guess we could give
DOCKER_PARTIAL_IMAGES a better name or at least better documentation.

Would a per-arch variable list look cleaner? Like

DOCKER_IMAGES_GENERIC = debian
DOCKER_IMAGES_X86_64 = ubuntu fedora centos7
DOCKER_PARTIAL_IMAGES_X86_64 = ...
DOCKER_IMAGES_AARCH64 = ...
DOCKER_PARTIAL_IMAGES_AARCH64 = ...

Then the actual host arch can be used to expand rules:

docker-all-images: $(DOCKER_IMAGES_GENERIC)
docker-all-images: $(DOCKER_IMAGES_$(shell uname -m | tr '[a-z]' '[A-Z]))

Fam

> 
> Thoughts?
> 
> Alex Bennée (10):
>   docker: rename docker-amd64 to docker-host
>   docker: change docker-image to docker-all-images target
>   docker: add a placeholder for handling non-x86 hosts
>   docker: don't include docker-arm64-cross on aarch64 hosts
>   docker: fall-back to binfmt_misc debian-mips64el-user-cross on non-x86
>   docker: fall-back to binfmt_misc debian-ppc64el-user-cross on non-x86
>   docker: fall-back to binfmt_misc debian-s390x-user-cross on non-x86
>   docker: disable additional non-x86 images
>   tests: tcg skip docker images we can't build
>   tests/tcg: debian-mips64el-user-cross fallback
> 
>  Makefile  |  4 +-
>  tests/docker/Makefile.include | 58 +--
>  ...debian-amd64.docker => debian-host.docker} |  4 +-
>  .../debian-mips64el-user-cross.docker | 16 +
>  .../debian-ppc64el-user-cross.docker  | 16 +
>  .../debian-s390x-user-cross.docker| 16 +
>  tests/tcg/Makefile.include|  3 +
>  tests/tcg/Makefile.probe  |  2 +-
>  tests/tcg/mips/Makefile.include   |  5 ++
>  9 files changed, 116 insertions(+), 8 deletions(-)
>  rename tests/docker/dockerfiles/{debian-amd64.docker => debian-host.docker} 
> (91%)
>  create mode 100644 tests/docker/dockerfiles/debian-mips64el-user-cross.docker
>  create mode 100644 tests/docker/dockerfiles/debian-ppc64el-user-cross.docker
>  create mode 100644 tests/docker/dockerfiles/debian-s390x-user-cross.docker
> 
> -- 
> 2.17.1
> 



Re: [Qemu-devel] [PATCH RFC 01/10] docker: rename docker-amd64 to docker-host

2018-07-27 Thread Fam Zheng
On Wed, 07/18 11:04, Alex Bennée wrote:
> When building on non-x86 systems the base system will be correct so if
> we avoid too many x86'isms in the install we can still use the image.
> 
> Signed-off-by: Alex Bennée 
> ---
>  .../dockerfiles/{debian-amd64.docker => debian-host.docker}   | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>  rename tests/docker/dockerfiles/{debian-amd64.docker => debian-host.docker} 
> (91%)
> 
> diff --git a/tests/docker/dockerfiles/debian-amd64.docker 
> b/tests/docker/dockerfiles/debian-host.docker
> similarity index 91%
> rename from tests/docker/dockerfiles/debian-amd64.docker
> rename to tests/docker/dockerfiles/debian-host.docker
> index eb13f06ed1..3605cc4658 100644
> --- a/tests/docker/dockerfiles/debian-amd64.docker
> +++ b/tests/docker/dockerfiles/debian-host.docker
> @@ -30,9 +30,9 @@ RUN cd /usr/src/virglrenderer && ./autogen.sh && 
> ./configure --with-glx --disabl
>  # netmap
>  RUN DEBIAN_FRONTEND=noninteractive eatmydata \
>  apt-get install -y --no-install-recommends \
> -linux-headers-amd64
> +linux-headers-$(dpkg --print-architecture)
>  RUN git clone https://github.com/luigirizzo/netmap.git /usr/src/netmap
> -RUN cd /usr/src/netmap/LINUX && ./configure --no-drivers --no-apps 
> --kernel-dir=$(ls -d /usr/src/linux-headers-*-amd64) && make install
> +RUN cd /usr/src/netmap/LINUX && ./configure --no-drivers --no-apps 
> --kernel-dir=$(ls -d /usr/src/linux-headers-*) && make install

Curious: why isn't this "ls -d linux-headers-*-$(dpkg --print-architecture)", 
like above?

>  ENV QEMU_CONFIGURE_OPTS --enable-netmap
>  
>  # gcrypt
> -- 
> 2.17.1
> 



[Qemu-devel] [PATCH] file-posix: Handle EINTR in preallocation=full write

2018-07-27 Thread Fam Zheng
Cc: qemu-sta...@nongnu.org
Signed-off-by: Fam Zheng 
---
 block/file-posix.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/block/file-posix.c b/block/file-posix.c
index ad299beb38..928b863ced 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -1646,6 +1646,9 @@ static int handle_aiocb_truncate(RawPosixAIOData *aiocb)
 num = MIN(left, 65536);
 result = write(fd, buf, num);
 if (result < 0) {
+if (errno == EINTR) {
+continue;
+}
 result = -errno;
 error_setg_errno(errp, -result,
  "Could not write zeros for preallocation");
-- 
2.17.1




[Qemu-devel] [PATCH RFC for-3.0-rc3 3/3] iotests: Add test for 'qemu-img convert -C' compatibility

2018-07-26 Thread Fam Zheng
Signed-off-by: Fam Zheng 
---
 tests/qemu-iotests/082 |  8 
 tests/qemu-iotests/082.out | 11 +++
 2 files changed, 19 insertions(+)

diff --git a/tests/qemu-iotests/082 b/tests/qemu-iotests/082
index a872f771a6..3e605d52d1 100755
--- a/tests/qemu-iotests/082
+++ b/tests/qemu-iotests/082
@@ -157,6 +157,14 @@ run_qemu_img convert -o help
 # Try help option for a format that does not support creation
 run_qemu_img convert -O bochs -o help
 
+echo
+echo === convert: -C and other options ===
+
+# Adding the help option to a command without other -o options
+run_qemu_img convert -C -S 4k -O $IMGFMT "$TEST_IMG" "$TEST_IMG".target
+run_qemu_img convert -C -S 8k -O $IMGFMT "$TEST_IMG" "$TEST_IMG".target
+run_qemu_img convert -C -c -O $IMGFMT "$TEST_IMG" "$TEST_IMG".target
+
 echo
 echo === amend: Options specified more than once ===
 
diff --git a/tests/qemu-iotests/082.out b/tests/qemu-iotests/082.out
index 60ef87c276..19e9fb13ff 100644
--- a/tests/qemu-iotests/082.out
+++ b/tests/qemu-iotests/082.out
@@ -508,6 +508,17 @@ size Virtual disk size
 Testing: convert -O bochs -o help
 qemu-img: Format driver 'bochs' does not support image creation
 
+=== convert: -C and other options ===
+
+Testing: convert -C -S 4k -O qcow2 TEST_DIR/t.qcow2 TEST_DIR/t.qcow2.target
+qemu-img: Cannot enable copy offloading when -S is used
+
+Testing: convert -C -S 8k -O qcow2 TEST_DIR/t.qcow2 TEST_DIR/t.qcow2.target
+qemu-img: Cannot enable copy offloading when -S is used
+
+Testing: convert -C -c -O qcow2 TEST_DIR/t.qcow2 TEST_DIR/t.qcow2.target
+qemu-img: Cannot enable copy offloading when -c is used
+
 === amend: Options specified more than once ===
 
 Testing: amend -f foo -f qcow2 -o lazy_refcounts=on TEST_DIR/t.qcow2
-- 
2.17.1




[Qemu-devel] [PATCH RFC for-3.0-rc3 2/3] qemu-img: Add -C option for convert with copy offloading

2018-07-26 Thread Fam Zheng
Signed-off-by: Fam Zheng 
---
 qemu-img-cmds.hx |  2 +-
 qemu-img.c   | 21 +
 qemu-img.texi|  8 +++-
 3 files changed, 25 insertions(+), 6 deletions(-)

diff --git a/qemu-img-cmds.hx b/qemu-img-cmds.hx
index 69758fb6e8..1526f327a5 100644
--- a/qemu-img-cmds.hx
+++ b/qemu-img-cmds.hx
@@ -44,7 +44,7 @@ STEXI
 ETEXI
 
 DEF("convert", img_convert,
-"convert [--object objectdef] [--image-opts] [--target-image-opts] [-U] 
[-c] [-p] [-q] [-n] [-f fmt] [-t cache] [-T src_cache] [-O output_fmt] [-B 
backing_file] [-o options] [-l snapshot_param] [-S sparse_size] [-m 
num_coroutines] [-W] filename [filename2 [...]] output_filename")
+"convert [--object objectdef] [--image-opts] [--target-image-opts] [-U] 
[-C] [-c] [-p] [-q] [-n] [-f fmt] [-t cache] [-T src_cache] [-O output_fmt] [-B 
backing_file] [-o options] [-l snapshot_param] [-S sparse_size] [-m 
num_coroutines] [-W] filename [filename2 [...]] output_filename")
 STEXI
 @item convert [--object @var{objectdef}] [--image-opts] [--target-image-opts] 
[-U] [-c] [-p] [-q] [-n] [-f @var{fmt}] [-t @var{cache}] [-T @var{src_cache}] 
[-O @var{output_fmt}] [-B @var{backing_file}] [-o @var{options}] [-l 
@var{snapshot_param}] [-S @var{sparse_size}] [-m @var{num_coroutines}] [-W] 
@var{filename} [@var{filename2} [...]] @var{output_filename}
 ETEXI
diff --git a/qemu-img.c b/qemu-img.c
index 9b7506b8ae..1acddf693c 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -2024,11 +2024,12 @@ static int img_convert(int argc, char **argv)
  skip_create = false, progress = false, tgt_image_opts = false;
 int64_t ret = -EINVAL;
 bool force_share = false;
+bool explict_min_sparse = false;
 
 ImgConvertState s = (ImgConvertState) {
 /* Need at least 4k of zeros for sparse detection */
 .min_sparse = 8,
-.copy_range = true,
+.copy_range = false,
 .buf_sectors= IO_BUF_SIZE / BDRV_SECTOR_SIZE,
 .wr_in_order= true,
 .num_coroutines = 8,
@@ -2043,7 +2044,7 @@ static int img_convert(int argc, char **argv)
 {"target-image-opts", no_argument, 0, OPTION_TARGET_IMAGE_OPTS},
 {0, 0, 0, 0}
 };
-c = getopt_long(argc, argv, ":hf:O:B:co:l:S:pt:T:qnm:WU",
+c = getopt_long(argc, argv, ":hf:O:B:Cco:l:S:pt:T:qnm:WU",
 long_options, NULL);
 if (c == -1) {
 break;
@@ -2067,9 +2068,11 @@ static int img_convert(int argc, char **argv)
 case 'B':
 out_baseimg = optarg;
 break;
+case 'C':
+s.copy_range = true;
+break;
 case 'c':
 s.compressed = true;
-s.copy_range = false;
 break;
 case 'o':
 if (!is_valid_option_list(optarg)) {
@@ -2112,7 +2115,7 @@ static int img_convert(int argc, char **argv)
 }
 
 s.min_sparse = sval / BDRV_SECTOR_SIZE;
-s.copy_range = false;
+explict_min_sparse = true;
 break;
 }
 case 'p':
@@ -2172,6 +2175,16 @@ static int img_convert(int argc, char **argv)
 goto fail_getopt;
 }
 
+if (s.compressed && s.copy_range) {
+error_report("Cannot enable copy offloading when -c is used");
+goto fail_getopt;
+}
+
+if (explict_min_sparse && s.copy_range) {
+error_report("Cannot enable copy offloading when -S is used");
+goto fail_getopt;
+}
+
 if (tgt_image_opts && !skip_create) {
 error_report("--target-image-opts requires use of -n flag");
 goto fail_getopt;
diff --git a/qemu-img.texi b/qemu-img.texi
index aeb1b9e66c..3b6710a580 100644
--- a/qemu-img.texi
+++ b/qemu-img.texi
@@ -169,6 +169,12 @@ Number of parallel coroutines for the convert process
 Allow out-of-order writes to the destination. This option improves performance,
 but is only recommended for preallocated devices like host devices or other
 raw block devices.
+@item -C
+Try to use copy offloading to move data from source image to target. This may
+improve performance if the data is remote, such as with NFS or iSCSI backends,
+but will not automatically sparsify zero sectors, and may result in a fully
+allocated target image depending on the host support for getting allocation
+information.
 @end table
 
 Parameters to dd subcommand:
@@ -319,7 +325,7 @@ Error on reading data
 
 @end table
 
-@item convert [--object @var{objectdef}] [--image-opts] [--target-image-opts] 
[-U] [-c] [-p] [-q] [-n] [-f @var{fmt}] [-t @var{cache}] [-T @var{src_cache}] 
[-O @var{output_fmt}] [-B @var{backing_file}] [-o @var{options}] [-l 
@var{snapshot_param}] [-S @var{sparse_size}] [-m @var{num_coroutines}] [-W] 
@var{filename} [@var{filename2} [...]] @var{output_filename}
+@item convert [--object @var{objectdef}] [--im

[Qemu-devel] [PATCH RFC for-3.0-rc3 1/3] Revert "qemu-img: Document copy offloading implications with -S and -c"

2018-07-26 Thread Fam Zheng
This reverts commit eb461485f4558e362fab905735b50987505bca44.

Now that we introduce an explicit option, these implicit rules are not
used.

Signed-off-by: Fam Zheng 
---
 qemu-img.texi | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/qemu-img.texi b/qemu-img.texi
index 5853cd18d1..aeb1b9e66c 100644
--- a/qemu-img.texi
+++ b/qemu-img.texi
@@ -96,8 +96,7 @@ will enumerate information about backing files in a disk 
image chain. Refer
 below for further description.
 
 @item -c
-indicates that target image must be compressed (qcow format only). If this
-option is used, copy offloading will not be attempted.
+indicates that target image must be compressed (qcow format only)
 
 @item -h
 with or without a command shows help and lists the supported formats
@@ -116,8 +115,7 @@ in case both @var{-q} and @var{-p} options are used.
 indicates the consecutive number of bytes that must contain only zeros
 for qemu-img to create a sparse image during conversion. This value is rounded
 down to the nearest 512 bytes. You may use the common size suffixes like
-@code{k} for kilobytes. If this option is used, copy offloading will not be
-attempted.
+@code{k} for kilobytes.
 
 @item -t @var{cache}
 specifies the cache mode that should be used with the (destination) file. See
-- 
2.17.1




[Qemu-devel] [PATCH RFC for-3.0-rc3 0/3] qemu-img: Disable copy offloading by default

2018-07-26 Thread Fam Zheng
Kevin pointed out that both glibc and kernel provides a slow fallback of
copy_file_range which hurts thin provisioning. This is particularly true for
thin LVs, because host_device driver cannot get allocation info from the
volume, and copy_file_range is called on every sectors, making the dst fully
allocated.

NFS mount points also doesn't support SEEK_DATA well, so the allocation
information is unknown to QEMU.

That leaves only iscsi:// which seems to do what we want so far, but it is a
smaller use case.

Add an option to qemu-img convert, "-C", to enable (attempting) copy offloading
explicitly. And mark it incompatible with "-S" and "-c".

Fam Zheng (3):
  Revert "qemu-img: Document copy offloading implications with -S and
-c"
  qemu-img: Add -C option for convert with copy offloading
  iotests: Add test for 'qemu-img convert -C' compatibility

 qemu-img-cmds.hx   |  2 +-
 qemu-img.c | 21 +
 qemu-img.texi  | 14 +-
 tests/qemu-iotests/082 |  8 
 tests/qemu-iotests/082.out | 11 +++
 5 files changed, 46 insertions(+), 10 deletions(-)

-- 
2.17.1




Re: [Qemu-devel] [PATCH v3 0/3] arm: Add nRF51 SoC and micro:bit machine

2018-07-25 Thread Fam Zheng
On Thu, 07/26 12:06, Joel Stanley wrote:
> v3: Rebase on Stefan's cortex-m0 series
> Based-on: 20180725085944.11856-8-stefa...@redhat.com

Correction:

Based-on: 20180725085944.11856-1-stefa...@redhat.com
(Point to the cover letter message)

Fam



Re: [Qemu-devel] [PATCH v3 for 3.0 00/18] docker fixes (and one tcg test tweak)

2018-07-24 Thread Fam Zheng
On Tue, Jul 24, 2018 at 4:56 PM Alex Bennée  wrote:
>
>
> Fam Zheng  writes:
>
> > On Mon, Jul 23, 2018 at 6:03 PM Alex Bennée  wrote:
> >>
> >>
> >> Alex Bennée  writes:
> >>
> >> > Hi,
> >> >
> >> > I've missed the boat for today's rc1 but I'd like to get this merged
> >> > before rc2. The new docker.py change is technically new functionality
> >> > but I'm counting it as a usability bug fix as it replaces a random
> >> > back trace failure with a preemptive failure and message mentioning
> >> > binfmt_misc configuration. This would have saved Richard a lot of head
> >> > scratching as he tried to setup a powerpc-user setup to test his
> >> > setcontext fix (he had a custom binfmt_misc pointing to his src tree).
> >> >
> >> > Finally we also drop the runcom test. It was cute that it got
> >> > resurrected but it is ultimately a pointless test for something I'm
> >> > sure no one actually uses.
> >> >
> >> > There will be a follow-up RFC series after this that cleans-up some of
> >> > the rough edges when your host is not an x86_64 box but that series
> >> > won't be targeting the 3.0 release.
> >> >
> >> > : The following patches need review
> >> > : patch docker/disable debian powerpc user cross.patch
> >> > : patch docker/drop QEMU_TARGET check fallback in EXECUTABLE.patch
> >> > : patch docker/Update debootstrap script after Debian migrat.patch
> >> > : patch docker/ignore distro versioning of debootstrap.patch
> >> > : patch docker/perform basic binfmt_misc validation in docke.patch
> >> > : patch tests/tcg remove runcom test.patch
> >>
> >> Ping?
> >
> > I had questions about patch 13 and 17. Otherwise looks good.
> > Maybe you can drop them for now (including patch 9) and send the PULL for
> > the rest.
>
> I can drop/replace 13 - but 17 was there for a better reporting of a
> failure case Peter found. We can certainly expand it and be smarter in
> future iterations.


Sounds good to me.

Fam

>
>
> > I don't know if we can catch up -rc2 but I guess the testing fixes are
> > fairly safe to sneak in. :)
> >
> > (My development machine is affected by
> > an
> >  office power outage, and I was
> > also
> >  busy with upgrading patchew.org. Sorry for the
> >
> > late reply
> > !)
> >
> > Fam
>
>
> --
> Alex Bennée



[Qemu-devel] [PATCH 1/2] docs: Describe using images in writing iotests

2018-07-24 Thread Fam Zheng
Signed-off-by: Fam Zheng 
---
 docs/devel/testing.rst | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/docs/devel/testing.rst b/docs/devel/testing.rst
index 5e19cd50da..8e1fa3a66e 100644
--- a/docs/devel/testing.rst
+++ b/docs/devel/testing.rst
@@ -255,6 +255,17 @@ comparable library support for invoking and interacting 
with QEMU programs. If
 you opt for Python, it is strongly recommended to write Python 3 compatible
 code.
 
+Both Python and Bash frameworks in iotests provide helpers to manage test
+images. They can be used to create and clean up images under the test
+directory. If no I/O or any protocol specific feature is needed, it is often
+more convenient to use the pseudo block driver, ``null-co://``, as the test
+image, which doesn't require image creation or cleaning up. Avoid system-wide
+devices or files whenever possible, such as ``/dev/null`` or ``/dev/zero``.
+Otherwise, image locking implications have to be considered.  For example,
+another application on the host may have locked the file, possibly leading to a
+test failure.  If using such devices are explicitly desired, consider adding
+``locking=off`` option to disable image locking.
+
 Docker based tests
 ==
 
-- 
2.17.1




[Qemu-devel] [PATCH 0/2] iotests: Fix 226 on _my_ system

2018-07-24 Thread Fam Zheng
Something has locked /dev/null on my system (I still don't know what to do with
the annoying incapability of lslocks, or more precisely /proc/locks, on
inspecting OFD lock information), and as a result 226 cannot pass due to the
unexpected image locking error.

Fix the test case by disabling locking, and add a doc text about using test
images.

Fam Zheng (2):
  docs: Describe using images in writing iotests
  iotests: Don't lock /dev/null in 226

 docs/devel/testing.rst | 11 +++
 tests/qemu-iotests/226 |  4 ++--
 2 files changed, 13 insertions(+), 2 deletions(-)

-- 
2.17.1




[Qemu-devel] [PATCH 2/2] iotests: Don't lock /dev/null in 226

2018-07-24 Thread Fam Zheng
On my system (Fedora 28), this script reports a 'failed to get
"consistent read" lock' error. Following docs/devel/testing.rst, it's
better to add locking=off here.

Signed-off-by: Fam Zheng 
---
 tests/qemu-iotests/226 | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/226 b/tests/qemu-iotests/226
index 211ea9888a..8ec3e612dd 100755
--- a/tests/qemu-iotests/226
+++ b/tests/qemu-iotests/226
@@ -56,10 +56,10 @@ for PROTO in "file" "host_device" "host_cdrom"; do
 echo
 echo "== Testing RO =="
 $QEMU_IO -c "open -r -o driver=$PROTO,filename=$TEST_IMG" 2>&1 | 
_filter_testdir | _filter_imgfmt
-$QEMU_IO -c "open -r -o driver=$PROTO,filename=/dev/null" 2>&1 | 
_filter_imgfmt
+$QEMU_IO -c "open -r -o driver=$PROTO,filename=/dev/null,locking=off" 2>&1 
| _filter_imgfmt
 echo "== Testing RW =="
 $QEMU_IO -c "open -o driver=$PROTO,filename=$TEST_IMG" 2>&1 | 
_filter_testdir | _filter_imgfmt
-$QEMU_IO -c "open -o driver=$PROTO,filename=/dev/null" 2>&1 | 
_filter_imgfmt
+$QEMU_IO -c "open -o driver=$PROTO,filename=/dev/null,locking=off" 2>&1 | 
_filter_imgfmt
 done
 
 # success, all done
-- 
2.17.1




Re: [Qemu-devel] [PATCH v3 for 3.0 00/18] docker fixes (and one tcg test tweak)

2018-07-24 Thread Fam Zheng
On Mon, Jul 23, 2018 at 6:03 PM Alex Bennée  wrote:
>
>
> Alex Bennée  writes:
>
> > Hi,
> >
> > I've missed the boat for today's rc1 but I'd like to get this merged
> > before rc2. The new docker.py change is technically new functionality
> > but I'm counting it as a usability bug fix as it replaces a random
> > back trace failure with a preemptive failure and message mentioning
> > binfmt_misc configuration. This would have saved Richard a lot of head
> > scratching as he tried to setup a powerpc-user setup to test his
> > setcontext fix (he had a custom binfmt_misc pointing to his src tree).
> >
> > Finally we also drop the runcom test. It was cute that it got
> > resurrected but it is ultimately a pointless test for something I'm
> > sure no one actually uses.
> >
> > There will be a follow-up RFC series after this that cleans-up some of
> > the rough edges when your host is not an x86_64 box but that series
> > won't be targeting the 3.0 release.
> >
> > : The following patches need review
> > : patch docker/disable debian powerpc user cross.patch
> > : patch docker/drop QEMU_TARGET check fallback in EXECUTABLE.patch
> > : patch docker/Update debootstrap script after Debian migrat.patch
> > : patch docker/ignore distro versioning of debootstrap.patch
> > : patch docker/perform basic binfmt_misc validation in docke.patch
> > : patch tests/tcg remove runcom test.patch
>
> Ping?

I had questions about patch 13 and 17. Otherwise looks good.
​ Maybe you can drop them for now (including patch 9) and send the PULL for
the rest.​
I don't know if we can catch up -rc2 but I guess the testing fixes are
fairly safe to sneak in. :)

(My development machine is affected by
an
 office power outage, and I was
​ also​
 busy with upgrading patchew.org. Sorry for the
​ ​
late reply
!)

Fam


Re: [Qemu-devel] [PATCH v3 for 3.0 17/18] docker: perform basic binfmt_misc validation in docker.py

2018-07-24 Thread Fam Zheng
On Wed, Jul 18, 2018 at 4:02 AM Alex Bennée  wrote:
>
> Setting up binfmt_misc is outside of the scope of the docker.py script
> but we can at least validate it with any given executable so we have a
> more useful error message than the sed line of deboostrap failing
> cryptically.
>
> Signed-off-by: Alex Bennée 
> Reported-by: Richard Henderson 
> ---
>  tests/docker/docker.py | 30 ++
>  1 file changed, 30 insertions(+)
>
> diff --git a/tests/docker/docker.py b/tests/docker/docker.py
> index 523f4b95a2..a3f5b0c1b0 100755
> --- a/tests/docker/docker.py
> +++ b/tests/docker/docker.py
> @@ -112,6 +112,31 @@ def _copy_binary_with_libs(src, dest_dir):
>  so_path = os.path.dirname(l)
>  _copy_with_mkdir(l , dest_dir, so_path)
>
> +
> +def _check_binfmt_misc(executable):
> +"""Check binfmt_misc has entry for executable in the right place.
> +
> +The details of setting up binfmt_misc are outside the scope of
> +this script but we should at least fail early with a useful
> +message if it won't work."""
> +
> +binary = os.path.basename(executable)
> +binfmt_entry = "/proc/sys/fs/binfmt_misc/%s" % (binary)
> +
> +if not os.path.exists(binfmt_entry):
> +print ("No binfmt_misc entry for %s" % (binary))
> +return False
> +
> +with open(binfmt_entry) as x: entry = x.read()
> +
> +qpath = "/usr/bin/%s" % (binary)

Is it intended to hardcode this to /usr/bin? I thought we were more
flexible than that..

Fam

> +if not re.search("interpreter %s\n" % (qpath), entry):
> +print ("binfmt_misc for %s does not point to %s" % (binary, qpath))
> +return False
> +
> +return True
> +
> +
>  def _read_qemu_dockerfile(img_name):
>  # special case for Debian linux-user images
>  if img_name.startswith("debian") and img_name.endswith("user"):
> @@ -315,6 +340,11 @@ class BuildCommand(SubCommand):
>  # Create a docker context directory for the build
>  docker_dir = tempfile.mkdtemp(prefix="docker_build")
>
> +# Validate binfmt_misc will work
> +if args.include_executable:
> +if not _check_binfmt_misc(args.include_executable):
> +return 1
> +
>  # Is there a .pre file to run in the build context?
>  docker_pre = os.path.splitext(args.dockerfile)[0]+".pre"
>  if os.path.exists(docker_pre):
> --
> 2.17.1
>



Re: [Qemu-devel] [PATCH v3 for 3.0 13/18] docker: add --hint to docker.py check

2018-07-24 Thread Fam Zheng
On Wed, Jul 18, 2018 at 4:03 AM Alex Bennée  wrote:
>
> When a check fails we currently just report why we failed. This is not
> totally helpful to people who want to boot-strap a new image. Add a
> --hint option which we can pass down to give a bit more information.
>
> Signed-off-by: Alex Bennée 
> Reviewed-by: Philippe Mathieu-Daudé 
> Tested-by: Philippe Mathieu-Daudé 
> ---
>  tests/docker/Makefile.include | 3 ++-
>  tests/docker/docker.py| 3 ++-
>  2 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/tests/docker/Makefile.include b/tests/docker/Makefile.include
> index ec23620153..c9e412f9d0 100644
> --- a/tests/docker/Makefile.include
> +++ b/tests/docker/Makefile.include
> @@ -73,7 +73,8 @@ docker-binfmt-image-debian-%: 
> $(DOCKER_FILES_DIR)/debian-bootstrap.docker
> $(if 
> $(EXECUTABLE),--include-executable=$(EXECUTABLE)), \
> "BUILD","binfmt debian-$* (debootstrapped)"), 
>   \
> $(call quiet-command, 
>   \
> -   $(DOCKER_SCRIPT) check --quiet qemu:debian-$* $<, 
>   \
> +   $(DOCKER_SCRIPT) check --quiet qemu:debian-$* $<  
>   \
> +   --hint "you will need to build $(EXECUTABLE)",
>   \
> "CHECK", "debian-$* exists"))
>
>  endif
> diff --git a/tests/docker/docker.py b/tests/docker/docker.py
> index 2f81c6b13b..523f4b95a2 100755
> --- a/tests/docker/docker.py
> +++ b/tests/docker/docker.py
> @@ -475,6 +475,7 @@ class CheckCommand(SubCommand):
>  default="checksum", help="check type")
>  parser.add_argument("--olderthan", default=60, type=int,
>  help="number of minutes")
> +parser.add_argument("--hint", default="", help="hint to user")
>
>  def run(self, args, argv):
>  tag = args.tag
> @@ -487,7 +488,7 @@ class CheckCommand(SubCommand):
>
>  info = dkr.inspect_tag(tag)
>  if info is None:
> -print("Image does not exist")
> +print("Image does not exist %s" % (args.hint))

No, please. This is just hacky and makes no sense to me. The Makefile
can do this

  $(DOCKER_SCRIPT) check ... || { echo "you will need to build
$(EXECUTABLE)"; exit 1 }

Fam

>  return 1
>
>  if args.checktype == "checksum":
> --
> 2.17.1
>



Re: [Qemu-devel] [PATCH] block: Don't lock /dev/null and /dev/zero automatically

2018-07-23 Thread Fam Zheng
On Mon, Jul 23, 2018 at 10:37 PM Max Reitz  wrote:
>
> On 2018-07-23 03:56, Fam Zheng wrote:
> > On Sun, Jul 22, 2018 at 10:06 PM Max Reitz  wrote:
> >>
> >> On 2018-07-22 04:37, Fam Zheng wrote:
> >>> On Sun, Jul 22, 2018 at 5:08 AM Max Reitz  wrote:
> >>>>
> >>>> On 2018-07-19 05:41, Fam Zheng wrote:
> >>>>> On my Fedora 28, /dev/null is locked by some other process (couldn't
> >>>>> inspect it due to the current lslocks limitation), so iotests 226 fails
> >>>>> with some unexpected image locking errors because it uses qemu-io to
> >>>>> open it.
> >>>>>
> >>>>> Actually it's safe to not use any lock on /dev/null or /dev/zero.
> >>>>>
> >>>>> Signed-off-by: Fam Zheng 
> >>>>> ---
> >>>>>  block/file-posix.c | 7 ++-
> >>>>>  1 file changed, 6 insertions(+), 1 deletion(-)
> >>>>>
> >>>>> diff --git a/block/file-posix.c b/block/file-posix.c
> >>>>> index 60af4b3d51..8bf034108a 100644
> >>>>> --- a/block/file-posix.c
> >>>>> +++ b/block/file-posix.c
> >>>>> @@ -503,7 +503,12 @@ static int raw_open_common(BlockDriverState *bs, 
> >>>>> QDict *options,
> >>>>>  s->use_lock = false;
> >>>>>  break;
> >>>>>  case ON_OFF_AUTO_AUTO:
> >>>>> -s->use_lock = qemu_has_ofd_lock();
> >>>>> +if (!strcmp(filename, "/dev/null") ||
> >>>>> +   !strcmp(filename, "/dev/zero")) {
> >>>>
> >>>> I’m not sure I like a strcmp() based on filename (though it should work
> >>>> for all of the cases where someone would want to specify either of those
> >>>> for a qemu block device).  Isn’t there some specific major/minor number
> >>>> we can use to check for these two files?  Or are those Linux-specific?
> >>>
> >>> Yeah, I guess major/minor numbers are Linux-specific.
> >>>
> >>>>
> >>>> I could also imagine just not locking any host_device, but since people
> >>>> do use qcow2 immediately on block devices, maybe we do want to lock them.
> >>>
> >>> That's right.
> >>>
> >>>>
> >>>> Finally, if really all you are trying to do is to make test code easier,
> >>>> then I don’t know exactly why.  Just disabling locking in 226 shouldn’t
> >>>> be too hard.
> >>>
> >>> The tricky thing is in remembering to do that for future test cases.
> >>> My machine seems to be somehow different than John's so that my 226
> >>> cannot lock /dev/null, but I'm not sure that is the case for other
> >>> releases, distros or system instances.
> >>
> >> Usually we don’t need to use /dev/null, though, because null-co:// is
> >> better suited for most purposes.  This is a very specific test of host
> >> devices.
> >>
> >> Maybe we should start a doc file with common good practices about
> >> writing iotests?
> >
> > Yes, mentioning using pseudo devices in docs/devel/testing.rst would
> > probably be a good idea. So is my understanding right that you prefer
> > fixing the test case and discard this patch? If so I'll send another
> > version together with the doc update.
>
> I personally would prefer fixing the test and not modifying the code,
> yes.  But I'm aware that it is a personal opinion.

Sure, and you are the maintainer, so why not follow what you think. :)

Fam



Re: [Qemu-devel] qemu-system_x86-64 (32-bit binary) -M q35 can be crashed on viewing youtube.

2018-07-22 Thread Fam Zheng
On Mon, 07/23 03:42, Andrew Randrianasulu wrote:
> It was crashing and crashing, so I tried to debug it a bit ... 
> 
> 
> valgrind --leak-check=yes /dev/shm/qemu/x86_64-softmmu/qemu-system-x86_64  
> -display 
> sdl,gl=on  -M q35 -soundhw 
> hda -cdrom /home/guest/Downloads/ISO/slax-English-US-7.0.8-x86_64.iso -m 
> 1G -enable-kvm -d trace:e1000e*   shows some errors at very end, see attached 
> file.
> 
> For reproduction, wait for liveCD to finish loading, start firefox, go to 
> youtube.com, it will warn you about outdated browser but continue anyway, try 
> to click on any video 
> 
> It seems even modern KDE Neon distribution affected by same bug :/

Cc'ing Jason and Samuel/Jan who maintain net and slirp respectively.

> --quote-
> 
> 29362@1532305439.464672:e1000e_core_write Write to register 0xd0, 4 byte(s), 
> value: 0x10
> 29362@1532305439.464735:e1000e_irq_set_ims Setting IMS bits 0x10: 
> 0x154 --> 0x154
> 29362@1532305439.464791:e1000e_irq_msix_pending_clearing Clearing MSI-X 
> pending 
> bit for cause 0x10, IVAR config 0x800a0908, vector 0
> 29362@1532305439.464867:e1000e_irq_fix_icr_asserted ICR_ASSERTED bit fixed: 
> 0x8002
> 29362@1532305439.464916:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 
> 0x8002, IMS: 0x154)
> 29362@1532305439.524010:e1000e_core_write Write to register 0x3818, 4 
> byte(s), 
> value: 0x97
> 29362@1532305439.524097:e1000e_tx_descr 0x21180e : 27000b68 5b43600
> 29362@1532305439.524169:e1000e_tx_descr 0x3bf970fa : 261005c6 300
> 29362@1532305439.524248:e1000e_tx_descr 0x1c1c8000 : af1005d8 300
> ==29362== Invalid write of size 4
> ==29362==at 0x552E58: memcpy (string3.h:53)
> ==29362==by 0x552E58: m_cat (mbuf.c:143)
> ==29362==by 0x54FE1E: ip_reass (ip_input.c:341)
> ==29362==by 0x54FE1E: ip_input (ip_input.c:190)
> ==29362==by 0x552478: slirp_input (slirp.c:874)
> ==29362==by 0x53FBE7: net_slirp_receive (slirp.c:121)
> ==29362==by 0x537478: nc_sendv_compat (net.c:701)
> ==29362==by 0x537478: qemu_deliver_packet_iov (net.c:728)
> ==29362==by 0x53A649: qemu_net_queue_deliver_iov (queue.c:179)
> ==29362==by 0x53A649: qemu_net_queue_send_iov (queue.c:224)
> ==29362==by 0x538901: qemu_sendv_packet_async (net.c:764)
> ==29362==by 0x538925: qemu_sendv_packet (net.c:772)
> ==29362==by 0x53AB71: net_hub_receive_iov (hub.c:73)
> ==29362==by 0x53AB71: net_hub_port_receive_iov (hub.c:124)
> ==29362==by 0x53748D: qemu_deliver_packet_iov (net.c:726)
> ==29362==by 0x53A649: qemu_net_queue_deliver_iov (queue.c:179)
> ==29362==by 0x53A649: qemu_net_queue_send_iov (queue.c:224)
> ==29362==by 0x538901: qemu_sendv_packet_async (net.c:764)
> ==29362==  Address 0x114f9b60 is 0 bytes after a block of size 2,976 alloc'd
> ==29362==at 0x482B29C: malloc (vg_replace_malloc.c:299)
> ==29362==by 0x534D3E1: g_malloc (in /usr/lib/libglib-2.0.so.0.4600.2)
> ==29362==by 0x552B53: m_inc.part.1 (mbuf.c:166)
> ==29362==by 0x552EAE: m_inc (string3.h:53)
> ==29362==by 0x552EAE: m_cat (mbuf.c:141)
> ==29362==by 0x54FE1E: ip_reass (ip_input.c:341)
> ==29362==by 0x54FE1E: ip_input (ip_input.c:190)
> ==29362==by 0x552478: slirp_input (slirp.c:874)
> ==29362==by 0x53FBE7: net_slirp_receive (slirp.c:121)
> ==29362==by 0x537478: nc_sendv_compat (net.c:701)
> ==29362==by 0x537478: qemu_deliver_packet_iov (net.c:728)
> ==29362==by 0x53A649: qemu_net_queue_deliver_iov (queue.c:179)
> ==29362==by 0x53A649: qemu_net_queue_send_iov (queue.c:224)
> ==29362==by 0x538901: qemu_sendv_packet_async (net.c:764)
> ==29362==by 0x538925: qemu_sendv_packet (net.c:772)
> ==29362==by 0x53AB71: net_hub_receive_iov (hub.c:73)
> ==29362==by 0x53AB71: net_hub_port_receive_iov (hub.c:124)
> ==29362==
> ==29362== Invalid write of size 4
> ==29362==at 0x552E5E: memcpy (string3.h:53)
> ==29362==by 0x552E5E: m_cat (mbuf.c:143)
> ==29362==by 0x54FE1E: ip_reass (ip_input.c:341)
> ==29362==by 0x54FE1E: ip_input (ip_input.c:190)
> ==29362==by 0x552478: slirp_input (slirp.c:874)
> ==29362==by 0x53FBE7: net_slirp_receive (slirp.c:121)
> ==29362==by 0x537478: nc_sendv_compat (net.c:701)
> ==29362==by 0x537478: qemu_deliver_packet_iov (net.c:728)
> ==29362==by 0x53A649: qemu_net_queue_deliver_iov (queue.c:179)
> ==29362==by 0x53A649: qemu_net_queue_send_iov (queue.c:224)
> ==29362==by 0x538901: qemu_sendv_packet_async (net.c:764)
> ==29362==by 0x538925: qemu_sendv_packet (net.c:772)
> ==29362==by 0x53AB71: net_hub_receive_iov (hub.c:73)
> ==29362==by 0x53AB71: net_hub_port_receive_iov (hub.c:124)
> ==29362==by 0x53748D: qemu_deliver_packet_iov (net.c:726)
> ==29362==by 0x53A649: qemu_net_queue_deliver_iov (queue.c:179)
> ==29362==by 0x53A649: qemu_net_queue_send_iov (queue.c:224)
> ==29362==by 0x538901: qemu_sendv_packet_async (net.c:764)
> ==29362==  Address 0x114f9b78 is 24 

Re: [Qemu-devel] [PATCH] block: Don't lock /dev/null and /dev/zero automatically

2018-07-22 Thread Fam Zheng
On Sun, Jul 22, 2018 at 10:06 PM Max Reitz  wrote:
>
> On 2018-07-22 04:37, Fam Zheng wrote:
> > On Sun, Jul 22, 2018 at 5:08 AM Max Reitz  wrote:
> >>
> >> On 2018-07-19 05:41, Fam Zheng wrote:
> >>> On my Fedora 28, /dev/null is locked by some other process (couldn't
> >>> inspect it due to the current lslocks limitation), so iotests 226 fails
> >>> with some unexpected image locking errors because it uses qemu-io to
> >>> open it.
> >>>
> >>> Actually it's safe to not use any lock on /dev/null or /dev/zero.
> >>>
> >>> Signed-off-by: Fam Zheng 
> >>> ---
> >>>  block/file-posix.c | 7 ++-
> >>>  1 file changed, 6 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/block/file-posix.c b/block/file-posix.c
> >>> index 60af4b3d51..8bf034108a 100644
> >>> --- a/block/file-posix.c
> >>> +++ b/block/file-posix.c
> >>> @@ -503,7 +503,12 @@ static int raw_open_common(BlockDriverState *bs, 
> >>> QDict *options,
> >>>  s->use_lock = false;
> >>>  break;
> >>>  case ON_OFF_AUTO_AUTO:
> >>> -s->use_lock = qemu_has_ofd_lock();
> >>> +if (!strcmp(filename, "/dev/null") ||
> >>> +   !strcmp(filename, "/dev/zero")) {
> >>
> >> I’m not sure I like a strcmp() based on filename (though it should work
> >> for all of the cases where someone would want to specify either of those
> >> for a qemu block device).  Isn’t there some specific major/minor number
> >> we can use to check for these two files?  Or are those Linux-specific?
> >
> > Yeah, I guess major/minor numbers are Linux-specific.
> >
> >>
> >> I could also imagine just not locking any host_device, but since people
> >> do use qcow2 immediately on block devices, maybe we do want to lock them.
> >
> > That's right.
> >
> >>
> >> Finally, if really all you are trying to do is to make test code easier,
> >> then I don’t know exactly why.  Just disabling locking in 226 shouldn’t
> >> be too hard.
> >
> > The tricky thing is in remembering to do that for future test cases.
> > My machine seems to be somehow different than John's so that my 226
> > cannot lock /dev/null, but I'm not sure that is the case for other
> > releases, distros or system instances.
>
> Usually we don’t need to use /dev/null, though, because null-co:// is
> better suited for most purposes.  This is a very specific test of host
> devices.
>
> Maybe we should start a doc file with common good practices about
> writing iotests?

Yes, mentioning using pseudo devices in docs/devel/testing.rst would
probably be a good idea. So is my understanding right that you prefer
fixing the test case and discard this patch? If so I'll send another
version together with the doc update.

Fam

>
> Max
>



Re: [Qemu-devel] [PATCH] block: Don't lock /dev/null and /dev/zero automatically

2018-07-21 Thread Fam Zheng
On Sun, Jul 22, 2018 at 5:08 AM Max Reitz  wrote:
>
> On 2018-07-19 05:41, Fam Zheng wrote:
> > On my Fedora 28, /dev/null is locked by some other process (couldn't
> > inspect it due to the current lslocks limitation), so iotests 226 fails
> > with some unexpected image locking errors because it uses qemu-io to
> > open it.
> >
> > Actually it's safe to not use any lock on /dev/null or /dev/zero.
> >
> > Signed-off-by: Fam Zheng 
> > ---
> >  block/file-posix.c | 7 ++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/block/file-posix.c b/block/file-posix.c
> > index 60af4b3d51..8bf034108a 100644
> > --- a/block/file-posix.c
> > +++ b/block/file-posix.c
> > @@ -503,7 +503,12 @@ static int raw_open_common(BlockDriverState *bs, QDict 
> > *options,
> >  s->use_lock = false;
> >  break;
> >  case ON_OFF_AUTO_AUTO:
> > -s->use_lock = qemu_has_ofd_lock();
> > +if (!strcmp(filename, "/dev/null") ||
> > +   !strcmp(filename, "/dev/zero")) {
>
> I’m not sure I like a strcmp() based on filename (though it should work
> for all of the cases where someone would want to specify either of those
> for a qemu block device).  Isn’t there some specific major/minor number
> we can use to check for these two files?  Or are those Linux-specific?

Yeah, I guess major/minor numbers are Linux-specific.

>
> I could also imagine just not locking any host_device, but since people
> do use qcow2 immediately on block devices, maybe we do want to lock them.

That's right.

>
> Finally, if really all you are trying to do is to make test code easier,
> then I don’t know exactly why.  Just disabling locking in 226 shouldn’t
> be too hard.

The tricky thing is in remembering to do that for future test cases.
My machine seems to be somehow different than John's so that my 226
cannot lock /dev/null, but I'm not sure that is the case for other
releases, distros or system instances.

Fam

>
> Max
>
> > +s->use_lock = false;
> > +} else {
> > +s->use_lock = qemu_has_ofd_lock();
> > +}
> >  break;
> >  default:
> >  abort();
> >
>
>



Re: [Qemu-devel] [Qemu-block] [PATCH] block: Don't lock /dev/null and /dev/zero automatically

2018-07-21 Thread Fam Zheng
On Sat, Jul 21, 2018 at 12:35 AM Eric Blake  wrote:
>
> On 07/20/2018 03:24 AM, Fam Zheng wrote:
> I'm not familiar with /dev/nullb* - is that a typo?
>
> $ ll /dev/nullb*
> ls: cannot access '/dev/nullb*': No such file or directory

You probably have figured out already but just in case: it's kernel
null_blk mod.

Fam



Re: [Qemu-devel] [Qemu-block] [PATCH] block: Don't lock /dev/null and /dev/zero automatically

2018-07-20 Thread Fam Zheng
On Thu, 07/19 13:57, John Snow wrote:
> Should we instead modify the test in this case to not attempt to take a
> lock on a device we know cannot meaningfully store state, or is it your
> preference to attempt to maintain such a list in the raw driver itself?
> 
> I guess we never want QEMU to try to lock things like /dev/zero, but I
> don't know if there are more such pseudo-devices we should never try to
> lock and if so, what common property unifies them such that we don't
> have to maintain a list.

They are 0 sized anyway so I only expect them to be used in test cases like the
one we're fixing. So this really doesn't matter.  An exceptional one would be
/dev/nullb* but that is not used in real world either.

I'm not trying to maintain a complete list (e.g. /dev/urandom and /dev/nullb*
are missing), this patch is just trying to make writing test code easier.

Fam



Re: [Qemu-devel] [PATCH] hw/block/nvme: add optional parameter num_namespaces for nvme device

2018-07-19 Thread Fam Zheng
On Thu, 07/19 12:33, Weiping Zhang wrote:
> Add an optional paramter num_namespaces for device, and set it
> to 1 by default.
> 
> Signed-off-by: Weiping Zhang 
> ---
>  hw/block/nvme.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index 156ecf3c41..b53be4b5c0 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -19,7 +19,7 @@
>   *  -drive file=,if=none,id=
>   *  -device nvme,drive=,serial=,id=, \
>   *  cmb_size_mb=, \
> - *  num_queues=
> + *  num_queues=,num_namespaces=
>   *
>   * Note cmb_size_mb denotes size of CMB in MB. CMB is assumed to be at
>   * offset 0 in BAR2 and supports only WDS, RDS and SQS for now.
> @@ -1232,7 +1232,6 @@ static void nvme_realize(PCIDevice *pci_dev, Error 
> **errp)
>  pci_config_set_class(pci_dev->config, PCI_CLASS_STORAGE_EXPRESS);
>  pcie_endpoint_cap_init(>parent_obj, 0x80);
>  
> -n->num_namespaces = 1;
>  n->reg_size = pow2ceil(0x1004 + 2 * (n->num_queues + 1) * 4);
>  n->ns_size = bs_size / (uint64_t)n->num_namespaces;
>  
> @@ -1342,6 +1341,7 @@ static Property nvme_props[] = {
>  DEFINE_PROP_STRING("serial", NvmeCtrl, serial),
>  DEFINE_PROP_UINT32("cmb_size_mb", NvmeCtrl, cmb_size_mb, 0),
>  DEFINE_PROP_UINT32("num_queues", NvmeCtrl, num_queues, 64),
> +DEFINE_PROP_UINT32("num_namespaces", NvmeCtrl, num_namespaces, 1),

You need to verify the user provided value. At least 0x is the broadcast
value and shouldn't be used, and 0 doesn't make much sense too, I guess.

>  DEFINE_PROP_END_OF_LIST(),
>  };
>  
> -- 
> 2.14.1
> 
> 

I think the data_offset calculations in nvme_rw need some changes too?

Fam



[Qemu-devel] [PATCH] block: Don't lock /dev/null and /dev/zero automatically

2018-07-18 Thread Fam Zheng
On my Fedora 28, /dev/null is locked by some other process (couldn't
inspect it due to the current lslocks limitation), so iotests 226 fails
with some unexpected image locking errors because it uses qemu-io to
open it.

Actually it's safe to not use any lock on /dev/null or /dev/zero.

Signed-off-by: Fam Zheng 
---
 block/file-posix.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index 60af4b3d51..8bf034108a 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -503,7 +503,12 @@ static int raw_open_common(BlockDriverState *bs, QDict 
*options,
 s->use_lock = false;
 break;
 case ON_OFF_AUTO_AUTO:
-s->use_lock = qemu_has_ofd_lock();
+if (!strcmp(filename, "/dev/null") ||
+   !strcmp(filename, "/dev/zero")) {
+s->use_lock = false;
+} else {
+s->use_lock = qemu_has_ofd_lock();
+}
 break;
 default:
 abort();
-- 
2.17.1




[Qemu-devel] [PATCH] file-posix: Skip effectiveless OFD lock operations

2018-07-18 Thread Fam Zheng
If we know we've already locked the bytes, don't do it again; similarly
don't unlock a byte if we haven't locked it. This doesn't change the
behavior, but fixes a corner case explained below.

Libvirt had an error handling bug that an image can get its (ownership,
file mode, SELinux) permissions changed (RHBZ 1584982) by mistake behind
QEMU. Specifically, an image in use by Libvirt VM has:

$ ls -lhZ b.img
-rw-r--r--. qemu qemu system_u:object_r:svirt_image_t:s0:c600,c690 b.img

Trying to attach it a second time won't work because of image locking.
And after the error, it becomes:

$ ls -lhZ b.img
-rw-r--r--. root root system_u:object_r:virt_image_t:s0 b.img

Then, we won't be able to do OFD lock operations with the existing fd.
In other words, the code such as in blk_detach_dev:

blk_set_perm(blk, 0, BLK_PERM_ALL, _abort);

can abort() QEMU, out of environmental changes.

This patch is an easy fix to this and the change is regardlessly
reasonable, so do it.

Signed-off-by: Fam Zheng 
---
 block/file-posix.c | 27 +--
 1 file changed, 17 insertions(+), 10 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index 60af4b3d51..45d44c9947 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -680,23 +680,28 @@ typedef enum {
  * file; if @unlock == true, also unlock the unneeded bytes.
  * @shared_perm_lock_bits is the mask of all permissions that are NOT shared.
  */
-static int raw_apply_lock_bytes(int fd,
+static int raw_apply_lock_bytes(BDRVRawState *s, int fd,
 uint64_t perm_lock_bits,
 uint64_t shared_perm_lock_bits,
 bool unlock, Error **errp)
 {
 int ret;
 int i;
+uint64_t locked_perm, locked_shared_perm;
+
+locked_perm = s ? s->perm : 0;
+locked_shared_perm = s ? ~s->shared_perm & BLK_PERM_ALL : 0;
 
 PERM_FOREACH(i) {
 int off = RAW_LOCK_PERM_BASE + i;
-if (perm_lock_bits & (1ULL << i)) {
+uint64_t bit = (1ULL << i);
+if ((perm_lock_bits & bit) && !(locked_perm & bit)) {
 ret = qemu_lock_fd(fd, off, 1, false);
 if (ret) {
 error_setg(errp, "Failed to lock byte %d", off);
 return ret;
 }
-} else if (unlock) {
+} else if (unlock && (locked_perm & bit) && !(perm_lock_bits & bit)) {
 ret = qemu_unlock_fd(fd, off, 1);
 if (ret) {
 error_setg(errp, "Failed to unlock byte %d", off);
@@ -706,13 +711,15 @@ static int raw_apply_lock_bytes(int fd,
 }
 PERM_FOREACH(i) {
 int off = RAW_LOCK_SHARED_BASE + i;
-if (shared_perm_lock_bits & (1ULL << i)) {
+uint64_t bit = (1ULL << i);
+if ((shared_perm_lock_bits & bit) && !(locked_shared_perm & bit)) {
 ret = qemu_lock_fd(fd, off, 1, false);
 if (ret) {
 error_setg(errp, "Failed to lock byte %d", off);
 return ret;
 }
-} else if (unlock) {
+} else if (unlock && (locked_shared_perm & bit) &&
+   !(shared_perm_lock_bits & bit)) {
 ret = qemu_unlock_fd(fd, off, 1);
 if (ret) {
 error_setg(errp, "Failed to unlock byte %d", off);
@@ -788,7 +795,7 @@ static int raw_handle_perm_lock(BlockDriverState *bs,
 
 switch (op) {
 case RAW_PL_PREPARE:
-ret = raw_apply_lock_bytes(s->lock_fd, s->perm | new_perm,
+ret = raw_apply_lock_bytes(s, s->lock_fd, s->perm | new_perm,
~s->shared_perm | ~new_shared,
false, errp);
 if (!ret) {
@@ -800,7 +807,7 @@ static int raw_handle_perm_lock(BlockDriverState *bs,
 op = RAW_PL_ABORT;
 /* fall through to unlock bytes. */
 case RAW_PL_ABORT:
-raw_apply_lock_bytes(s->lock_fd, s->perm, ~s->shared_perm,
+raw_apply_lock_bytes(s, s->lock_fd, s->perm, ~s->shared_perm,
  true, _err);
 if (local_err) {
 /* Theoretically the above call only unlocks bytes and it cannot
@@ -810,7 +817,7 @@ static int raw_handle_perm_lock(BlockDriverState *bs,
 }
 break;
 case RAW_PL_COMMIT:
-raw_apply_lock_bytes(s->lock_fd, new_perm, ~new_shared,
+raw_apply_lock_bytes(s, s->lock_fd, new_perm, ~new_shared,
  true, _err);
 if (local_err) {
 /* Theoretically the above call only unlocks bytes and it cannot
@@ -2174,7 +2181,7 @@ raw_co_create(BlockdevCreateOptions *options, Error 
**errp)
 shared = BLK_PERM_ALL & ~BLK_PERM_RESIZE;
 
 /* Step one: Take locks */
-result = raw_apply_loc

Re: [Qemu-devel] [PATCH 3/5] tests/vm: Do not abuse parallelism when KVM is not available

2018-07-17 Thread Fam Zheng
On Mon, 07/16 23:48, Philippe Mathieu-Daudé wrote:
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  tests/vm/basevm.py | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/tests/vm/basevm.py b/tests/vm/basevm.py
> index 1f4ad04c32..715e433142 100755
> --- a/tests/vm/basevm.py
> +++ b/tests/vm/basevm.py
> @@ -233,7 +233,7 @@ def main(vmcls):
>  return 1
>  logging.basicConfig(level=(logging.DEBUG if args.debug
> else logging.WARN))
> -vm = vmcls(debug=args.debug, vcpus=args.jobs)
> +vm = vmcls(debug=args.debug, vcpus=args.jobs if kvm_available() else 
> 0)
>  if args.build_image:
>  if os.path.exists(args.image) and not args.force:
>  sys.stderr.writelines(["Image file exists: %s\n" % 
> args.image,
> @@ -244,7 +244,7 @@ def main(vmcls):
>  vm.add_source_dir(args.build_qemu)
>  cmd = [vm.BUILD_SCRIPT.format(
> configure_opts = " ".join(argv),
> -   jobs=args.jobs)]
> +   jobs=args.jobs if kvm_available() else 1)]
>  else:
>  cmd = argv
>  vm.boot(args.image + ",snapshot=on")
> -- 
> 2.18.0
> 

Could you instead change this line

parser.add_option("--jobs", type=int, default=multiprocessing.cpu_count() / 
2,
  help="number of virtual CPUs")

to use kvm_available()? With another helper function to contain the line width,
probably:

..., default=get_default_jobs(), help=...

and

def get_default_jobs():
if kvm_available():
return ...
else:
return ...

Fam



Re: [Qemu-devel] [PATCH v3 4/4] tests: Add centos VM testing

2018-07-15 Thread Fam Zheng
On Sun, 07/15 22:34, Philippe Mathieu-Daudé wrote:
> Hi Fam,
> 
> On 07/11/2018 10:28 PM, Fam Zheng wrote:
> > This one does docker testing in the VM. It is intended to replace the
> > native docker testing on patchew testers.
> > 
> > Signed-off-by: Fam Zheng 
> > ---
> >  tests/vm/Makefile.include |  3 +-
> >  tests/vm/centos   | 84 +++
> >  2 files changed, 86 insertions(+), 1 deletion(-)
> >  create mode 100755 tests/vm/centos
> > 
> > diff --git a/tests/vm/Makefile.include b/tests/vm/Makefile.include
> > index 5daa2a3b73..af19b7a4e6 100644
> > --- a/tests/vm/Makefile.include
> > +++ b/tests/vm/Makefile.include
> > @@ -2,7 +2,7 @@
> >  
> >  .PHONY: vm-build-all
> 
> Can we have a vm-clean-all rule too?

Yes, sure!

> 
> >  
> > -IMAGES := ubuntu.i386 freebsd netbsd openbsd
> > +IMAGES := ubuntu.i386 freebsd netbsd openbsd centos
> >  IMAGE_FILES := $(patsubst %, tests/vm/%.img, $(IMAGES))
> >  
> >  .PRECIOUS: $(IMAGE_FILES)
> > @@ -14,6 +14,7 @@ vm-test:
> > @echo "  vm-build-freebsd- Build QEMU in FreeBSD VM"
> > @echo "  vm-build-netbsd - Build QEMU in NetBSD VM"
> > @echo "  vm-build-openbsd- Build QEMU in OpenBSD VM"
> > +   @echo "  vm-build-centos - Build QEMU in CentOS VM, 
> > with Docker"
> >  
> >  vm-build-all: $(addprefix vm-build-, $(IMAGES))
> >  
> > diff --git a/tests/vm/centos b/tests/vm/centos
> > new file mode 100755
> > index 00..afd560c564
> > --- /dev/null
> > +++ b/tests/vm/centos
> > @@ -0,0 +1,84 @@
> > +#!/usr/bin/env python
> > +#
> > +# CentOS image
> > +#
> > +# Copyright 2018 Red Hat Inc.
> > +#
> > +# Authors:
> > +#  Fam Zheng 
> > +#
> > +# This code is licensed under the GPL version 2 or later.  See
> > +# the COPYING file in the top-level directory.
> > +#
> > +
> > +import os
> > +import sys
> > +import subprocess
> > +import basevm
> > +import time
> > +
> > +class CentosVM(basevm.BaseVM):
> > +name = "centos"
> > +BUILD_SCRIPT = """
> > +set -e;
> > +cd $(mktemp -d);
> > +export SRC_ARCHIVE=/dev/vdb;
> > +sudo chmod a+r $SRC_ARCHIVE;
> > +tar -xf $SRC_ARCHIVE;
> > +make docker-test-block@centos7 V={verbose} J={jobs};
> > +make docker-test-quick@centos7 V={verbose} J={jobs};
> > +make docker-test-mingw@fedora V={verbose} J={jobs};
> > +"""
> > +
> > +def _gen_cloud_init_iso(self):
> > +cidir = self._tmpdir
> > +mdata = open(os.path.join(cidir, "meta-data"), "w")
> > +mdata.writelines(["instance-id: centos-vm-0\n",
> > +  "local-hostname: centos-guest\n"])
> > +mdata.close()
> > +udata = open(os.path.join(cidir, "user-data"), "w")
> > +udata.writelines(["#cloud-config\n",
> > +  "chpasswd:\n",
> > +  "  list: |\n",
> > +  "root:%s\n" % self.ROOT_PASS,
> > +  "%s:%s\n" % (self.GUEST_USER, 
> > self.GUEST_PASS),
> > +  "  expire: False\n",
> > +  "users:\n",
> > +  "  - name: %s\n" % self.GUEST_USER,
> > +  "sudo: ALL=(ALL) NOPASSWD:ALL\n",
> > +  "ssh-authorized-keys:\n",
> > +  "- %s\n" % basevm.SSH_PUB_KEY,
> > +  "  - name: root\n",
> > +  "ssh-authorized-keys:\n",
> > +  "- %s\n" % basevm.SSH_PUB_KEY,
> > +  "locale: en_US.UTF-8\n"])
> > +udata.close()
> > +subprocess.check_call(["genisoimage", "-output", "cloud-init.iso",
> > +   "-volid", "cidata", "-joliet", "-rock",
> > +   "user-data", "meta-data"],
> > +   cwd=cidir,
> > + 

[Qemu-devel] [PATCH] tests: vm: Add vm-clean-all

2018-07-15 Thread Fam Zheng
The images are big. Add a rule to clean up easily.

Suggested-by: Philippe Mathieu-Daudé 
Signed-off-by: Fam Zheng 
---
 tests/vm/Makefile.include | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/tests/vm/Makefile.include b/tests/vm/Makefile.include
index 5daa2a3b73..79e7866c8b 100644
--- a/tests/vm/Makefile.include
+++ b/tests/vm/Makefile.include
@@ -1,6 +1,6 @@
 # Makefile for VM tests
 
-.PHONY: vm-build-all
+.PHONY: vm-build-all vm-clean-all
 
 IMAGES := ubuntu.i386 freebsd netbsd openbsd
 IMAGE_FILES := $(patsubst %, tests/vm/%.img, $(IMAGES))
@@ -17,6 +17,9 @@ vm-test:
 
 vm-build-all: $(addprefix vm-build-, $(IMAGES))
 
+vm-clean-all:
+   rm -f $(IMAGE_FILES)
+
 tests/vm/%.img: $(SRC_PATH)/tests/vm/% \
$(SRC_PATH)/tests/vm/basevm.py \
$(SRC_PATH)/tests/vm/Makefile.include
-- 
2.17.1




Re: [Qemu-devel] [PATCH v3 4/4] tests: Add centos VM testing

2018-07-13 Thread Fam Zheng
On Thu, 07/12 12:02, Philippe Mathieu-Daudé wrote:
> Hi Fam,
> 
> On 07/11/2018 10:28 PM, Fam Zheng wrote:
> > This one does docker testing in the VM. It is intended to replace the
> > native docker testing on patchew testers.
> > 
> > Signed-off-by: Fam Zheng 
> > ---
> >  tests/vm/Makefile.include |  3 +-
> >  tests/vm/centos   | 84 +++
> >  2 files changed, 86 insertions(+), 1 deletion(-)
> >  create mode 100755 tests/vm/centos
> > 
> > diff --git a/tests/vm/Makefile.include b/tests/vm/Makefile.include
> > index 5daa2a3b73..af19b7a4e6 100644
> > --- a/tests/vm/Makefile.include
> > +++ b/tests/vm/Makefile.include
> > @@ -2,7 +2,7 @@
> >  
> >  .PHONY: vm-build-all
> >  
> > -IMAGES := ubuntu.i386 freebsd netbsd openbsd
> > +IMAGES := ubuntu.i386 freebsd netbsd openbsd centos
> >  IMAGE_FILES := $(patsubst %, tests/vm/%.img, $(IMAGES))
> >  
> >  .PRECIOUS: $(IMAGE_FILES)
> > @@ -14,6 +14,7 @@ vm-test:
> > @echo "  vm-build-freebsd- Build QEMU in FreeBSD VM"
> > @echo "  vm-build-netbsd - Build QEMU in NetBSD VM"
> > @echo "  vm-build-openbsd- Build QEMU in OpenBSD VM"
> > +   @echo "  vm-build-centos - Build QEMU in CentOS VM, 
> > with Docker"
> >  
> >  vm-build-all: $(addprefix vm-build-, $(IMAGES))
> >  
> > diff --git a/tests/vm/centos b/tests/vm/centos
> > new file mode 100755
> > index 00..afd560c564
> > --- /dev/null
> > +++ b/tests/vm/centos
> > @@ -0,0 +1,84 @@
> > +#!/usr/bin/env python
> > +#
> > +# CentOS image
> > +#
> > +# Copyright 2018 Red Hat Inc.
> > +#
> > +# Authors:
> > +#  Fam Zheng 
> > +#
> > +# This code is licensed under the GPL version 2 or later.  See
> > +# the COPYING file in the top-level directory.
> > +#
> > +
> > +import os
> > +import sys
> > +import subprocess
> > +import basevm
> > +import time
> > +
> > +class CentosVM(basevm.BaseVM):
> > +name = "centos"
> > +BUILD_SCRIPT = """
> > +set -e;
> > +cd $(mktemp -d);
> > +export SRC_ARCHIVE=/dev/vdb;
> > +sudo chmod a+r $SRC_ARCHIVE;
> > +tar -xf $SRC_ARCHIVE;
> > +make docker-test-block@centos7 V={verbose} J={jobs};
> > +make docker-test-quick@centos7 V={verbose} J={jobs};
> > +make docker-test-mingw@fedora V={verbose} J={jobs};
> > +"""
> > +
> > +def _gen_cloud_init_iso(self):
> > +cidir = self._tmpdir
> > +mdata = open(os.path.join(cidir, "meta-data"), "w")
> > +mdata.writelines(["instance-id: centos-vm-0\n",
> > +  "local-hostname: centos-guest\n"])
> > +mdata.close()
> > +udata = open(os.path.join(cidir, "user-data"), "w")
> > +udata.writelines(["#cloud-config\n",
> > +  "chpasswd:\n",
> > +  "  list: |\n",
> > +  "root:%s\n" % self.ROOT_PASS,
> > +  "%s:%s\n" % (self.GUEST_USER, 
> > self.GUEST_PASS),
> > +  "  expire: False\n",
> > +  "users:\n",
> > +  "  - name: %s\n" % self.GUEST_USER,
> > +  "sudo: ALL=(ALL) NOPASSWD:ALL\n",
> > +  "ssh-authorized-keys:\n",
> > +  "- %s\n" % basevm.SSH_PUB_KEY,
> > +  "  - name: root\n",
> > +  "ssh-authorized-keys:\n",
> > +  "- %s\n" % basevm.SSH_PUB_KEY,
> > +  "locale: en_US.UTF-8\n"])
> > +udata.close()
> > +subprocess.check_call(["genisoimage", "-output", "cloud-init.iso",
> > +   "-volid", "cidata", "-joliet", "-rock",
> > +   "user-data", "meta-data"],
> > +   cwd=cidir,
> > +   stdin=self._devnull, stdout=self._stdout,
> > +  

[Qemu-devel] [PATCH v2] nvme: Fix nvme_init error handling

2018-07-11 Thread Fam Zheng
It is wrong to leave this field as 1, as nvme_close() called in the
error handling code in nvme_file_open() will use it and try to free
s->queues again.

Another problem is the cleaning ups are duplicated between the fail*
labels of nvme_init() and nvme_file_open(), which calls nvme_close().

A third problem is nvme_close() misses g_free() and
event_notifier_cleanup().

Fix all of them.

Cc: qemu-sta...@nongnu.org
Signed-off-by: Fam Zheng 

---

v2: Adopt the suggested fix by Kevin.
---
 block/nvme.c | 37 -
 1 file changed, 12 insertions(+), 25 deletions(-)

diff --git a/block/nvme.c b/block/nvme.c
index 6f71122bf5..37805e8890 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -569,13 +569,13 @@ static int nvme_init(BlockDriverState *bs, const char 
*device, int namespace,
 s->vfio = qemu_vfio_open_pci(device, errp);
 if (!s->vfio) {
 ret = -EINVAL;
-goto fail;
+goto out;
 }
 
 s->regs = qemu_vfio_pci_map_bar(s->vfio, 0, 0, NVME_BAR_SIZE, errp);
 if (!s->regs) {
 ret = -EINVAL;
-goto fail;
+goto out;
 }
 
 /* Perform initialize sequence as described in NVMe spec "7.6.1
@@ -585,7 +585,7 @@ static int nvme_init(BlockDriverState *bs, const char 
*device, int namespace,
 if (!(cap & (1ULL << 37))) {
 error_setg(errp, "Device doesn't support NVMe command set");
 ret = -EINVAL;
-goto fail;
+goto out;
 }
 
 s->page_size = MAX(4096, 1 << (12 + ((cap >> 48) & 0xF)));
@@ -603,7 +603,7 @@ static int nvme_init(BlockDriverState *bs, const char 
*device, int namespace,
  PRId64 " ms)",
timeout_ms);
 ret = -ETIMEDOUT;
-goto fail;
+goto out;
 }
 }
 
@@ -613,7 +613,7 @@ static int nvme_init(BlockDriverState *bs, const char 
*device, int namespace,
 s->queues[0] = nvme_create_queue_pair(bs, 0, NVME_QUEUE_SIZE, errp);
 if (!s->queues[0]) {
 ret = -EINVAL;
-goto fail;
+goto out;
 }
 QEMU_BUILD_BUG_ON(NVME_QUEUE_SIZE & 0xF000);
 s->regs->aqa = cpu_to_le32((NVME_QUEUE_SIZE << 16) | NVME_QUEUE_SIZE);
@@ -633,14 +633,14 @@ static int nvme_init(BlockDriverState *bs, const char 
*device, int namespace,
  PRId64 " ms)",
timeout_ms);
 ret = -ETIMEDOUT;
-goto fail_queue;
+goto out;
 }
 }
 
 ret = qemu_vfio_pci_init_irq(s->vfio, >irq_notifier,
  VFIO_PCI_MSIX_IRQ_INDEX, errp);
 if (ret) {
-goto fail_queue;
+goto out;
 }
 aio_set_event_notifier(bdrv_get_aio_context(bs), >irq_notifier,
false, nvme_handle_event, nvme_poll_cb);
@@ -649,30 +649,15 @@ static int nvme_init(BlockDriverState *bs, const char 
*device, int namespace,
 if (local_err) {
 error_propagate(errp, local_err);
 ret = -EIO;
-goto fail_handler;
+goto out;
 }
 
 /* Set up command queues. */
 if (!nvme_add_io_queue(bs, errp)) {
 ret = -EIO;
-goto fail_handler;
 }
-return 0;
-
-fail_handler:
-aio_set_event_notifier(bdrv_get_aio_context(bs), >irq_notifier,
-   false, NULL, NULL);
-fail_queue:
-nvme_free_queue_pair(bs, s->queues[0]);
-fail:
-g_free(s->queues);
-if (s->regs) {
-qemu_vfio_pci_unmap_bar(s->vfio, 0, (void *)s->regs, 0, NVME_BAR_SIZE);
-}
-if (s->vfio) {
-qemu_vfio_close(s->vfio);
-}
-event_notifier_cleanup(>irq_notifier);
+out:
+/* Cleaning up is done in nvme_file_open() upon error. */
 return ret;
 }
 
@@ -739,8 +724,10 @@ static void nvme_close(BlockDriverState *bs)
 for (i = 0; i < s->nr_queues; ++i) {
 nvme_free_queue_pair(bs, s->queues[i]);
 }
+g_free(s->queues);
 aio_set_event_notifier(bdrv_get_aio_context(bs), >irq_notifier,
false, NULL, NULL);
+event_notifier_cleanup(>irq_notifier);
 qemu_vfio_pci_unmap_bar(s->vfio, 0, (void *)s->regs, 0, NVME_BAR_SIZE);
 qemu_vfio_close(s->vfio);
 }
-- 
2.17.1




[Qemu-devel] [PATCH v3 4/4] tests: Add centos VM testing

2018-07-11 Thread Fam Zheng
This one does docker testing in the VM. It is intended to replace the
native docker testing on patchew testers.

Signed-off-by: Fam Zheng 
---
 tests/vm/Makefile.include |  3 +-
 tests/vm/centos   | 84 +++
 2 files changed, 86 insertions(+), 1 deletion(-)
 create mode 100755 tests/vm/centos

diff --git a/tests/vm/Makefile.include b/tests/vm/Makefile.include
index 5daa2a3b73..af19b7a4e6 100644
--- a/tests/vm/Makefile.include
+++ b/tests/vm/Makefile.include
@@ -2,7 +2,7 @@
 
 .PHONY: vm-build-all
 
-IMAGES := ubuntu.i386 freebsd netbsd openbsd
+IMAGES := ubuntu.i386 freebsd netbsd openbsd centos
 IMAGE_FILES := $(patsubst %, tests/vm/%.img, $(IMAGES))
 
 .PRECIOUS: $(IMAGE_FILES)
@@ -14,6 +14,7 @@ vm-test:
@echo "  vm-build-freebsd- Build QEMU in FreeBSD VM"
@echo "  vm-build-netbsd - Build QEMU in NetBSD VM"
@echo "  vm-build-openbsd- Build QEMU in OpenBSD VM"
+   @echo "  vm-build-centos - Build QEMU in CentOS VM, 
with Docker"
 
 vm-build-all: $(addprefix vm-build-, $(IMAGES))
 
diff --git a/tests/vm/centos b/tests/vm/centos
new file mode 100755
index 00..afd560c564
--- /dev/null
+++ b/tests/vm/centos
@@ -0,0 +1,84 @@
+#!/usr/bin/env python
+#
+# CentOS image
+#
+# Copyright 2018 Red Hat Inc.
+#
+# Authors:
+#  Fam Zheng 
+#
+# This code is licensed under the GPL version 2 or later.  See
+# the COPYING file in the top-level directory.
+#
+
+import os
+import sys
+import subprocess
+import basevm
+import time
+
+class CentosVM(basevm.BaseVM):
+name = "centos"
+BUILD_SCRIPT = """
+set -e;
+cd $(mktemp -d);
+export SRC_ARCHIVE=/dev/vdb;
+sudo chmod a+r $SRC_ARCHIVE;
+tar -xf $SRC_ARCHIVE;
+make docker-test-block@centos7 V={verbose} J={jobs};
+make docker-test-quick@centos7 V={verbose} J={jobs};
+make docker-test-mingw@fedora V={verbose} J={jobs};
+"""
+
+def _gen_cloud_init_iso(self):
+cidir = self._tmpdir
+mdata = open(os.path.join(cidir, "meta-data"), "w")
+mdata.writelines(["instance-id: centos-vm-0\n",
+  "local-hostname: centos-guest\n"])
+mdata.close()
+udata = open(os.path.join(cidir, "user-data"), "w")
+udata.writelines(["#cloud-config\n",
+  "chpasswd:\n",
+  "  list: |\n",
+  "root:%s\n" % self.ROOT_PASS,
+  "%s:%s\n" % (self.GUEST_USER, self.GUEST_PASS),
+  "  expire: False\n",
+  "users:\n",
+  "  - name: %s\n" % self.GUEST_USER,
+  "sudo: ALL=(ALL) NOPASSWD:ALL\n",
+  "ssh-authorized-keys:\n",
+  "- %s\n" % basevm.SSH_PUB_KEY,
+  "  - name: root\n",
+  "ssh-authorized-keys:\n",
+  "- %s\n" % basevm.SSH_PUB_KEY,
+  "locale: en_US.UTF-8\n"])
+udata.close()
+subprocess.check_call(["genisoimage", "-output", "cloud-init.iso",
+   "-volid", "cidata", "-joliet", "-rock",
+   "user-data", "meta-data"],
+   cwd=cidir,
+   stdin=self._devnull, stdout=self._stdout,
+   stderr=self._stdout)
+return os.path.join(cidir, "cloud-init.iso")
+
+def build_image(self, img):
+cimg = 
self._download_with_cache("https://cloud.centos.org/centos/7/images/CentOS-7-x86_64-GenericCloud-1802.qcow2.xz;)
+img_tmp = img + ".tmp"
+subprocess.check_call(["cp", "-f", cimg, img_tmp + ".xz"])
+subprocess.check_call(["xz", "-df", img_tmp + ".xz"])
+subprocess.check_call(["qemu-img", "resize", img_tmp, "50G"])
+self.boot(img_tmp, extra_args = ["-cdrom", self._gen_cloud_init_iso()])
+self.wait_ssh()
+self.ssh_root_check("touch /etc/cloud/cloud-init.disabled")
+self.ssh_root_check("yum update -y")
+self.ssh_root_check("yum install -y docker make git")
+self.ssh_root_check("systemctl enable docker")
+self.ssh_root("poweroff")
+self.wait()
+if os.path.exists(img):
+os.remove(img)
+os.rename(img_tmp, img)
+return 0
+
+if __name__ == "__main__":
+sys.exit(basevm.main(CentosVM))
-- 
2.17.1




[Qemu-devel] [PATCH v3 2/4] tests/vm: Pass verbose flag into VM make commands

2018-07-11 Thread Fam Zheng
Our Makefile has:

vm-build-%: tests/vm/%.img
$(call quiet-command, \
$(SRC_PATH)/tests/vm/$* \
$(if $(V)$(DEBUG), --debug) \
$(if $(DEBUG), --interactive) \

the intention of which is to let the make command in VM have V=1 if
V=1 is set. We pass this to the BUILD_SCRIPT format.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Fam Zheng 
---
 tests/vm/basevm.py   | 3 ++-
 tests/vm/freebsd | 4 ++--
 tests/vm/netbsd  | 4 ++--
 tests/vm/openbsd | 4 ++--
 tests/vm/ubuntu.i386 | 4 ++--
 5 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/tests/vm/basevm.py b/tests/vm/basevm.py
index e5d6a328d5..25361c6b7d 100755
--- a/tests/vm/basevm.py
+++ b/tests/vm/basevm.py
@@ -240,7 +240,8 @@ def main(vmcls):
 vm.add_source_dir(args.build_qemu)
 cmd = [vm.BUILD_SCRIPT.format(
configure_opts = " ".join(argv),
-   jobs=args.jobs)]
+   jobs=args.jobs,
+   verbose="1" if args.debug else "")]
 else:
 cmd = argv
 img = args.image
diff --git a/tests/vm/freebsd b/tests/vm/freebsd
index 039dad8f69..b20612a6c9 100755
--- a/tests/vm/freebsd
+++ b/tests/vm/freebsd
@@ -23,8 +23,8 @@ class FreeBSDVM(basevm.BaseVM):
 cd $(mktemp -d /var/tmp/qemu-test.XX);
 tar -xf /dev/vtbd1;
 ./configure {configure_opts};
-gmake -j{jobs};
-gmake check;
+gmake -j{jobs} V={verbose};
+gmake check V={verbose};
 """
 
 def build_image(self, img):
diff --git a/tests/vm/netbsd b/tests/vm/netbsd
index 3972d8b45c..3f9d34a208 100755
--- a/tests/vm/netbsd
+++ b/tests/vm/netbsd
@@ -23,8 +23,8 @@ class NetBSDVM(basevm.BaseVM):
 cd $(mktemp -d /var/tmp/qemu-test.XX);
 tar -xf /dev/rld1a;
 ./configure --python=python2.7 {configure_opts};
-gmake -j{jobs};
-gmake check;
+gmake -j{jobs} V={verbose};
+gmake check V={verbose};
 """
 
 def build_image(self, img):
diff --git a/tests/vm/openbsd b/tests/vm/openbsd
index 6ae16d97fd..3285c1abde 100755
--- a/tests/vm/openbsd
+++ b/tests/vm/openbsd
@@ -23,9 +23,9 @@ class OpenBSDVM(basevm.BaseVM):
 cd $(mktemp -d /var/tmp/qemu-test.XX);
 tar -xf /dev/rsd1c;
 ./configure --cc=x86_64-unknown-openbsd6.1-gcc-4.9.4 
--python=python2.7 {configure_opts};
-gmake -j{jobs};
+gmake -j{jobs} V={verbose};
 # XXX: "gmake check" seems to always hang or fail
-#gmake check;
+#gmake check V={verbose};
 """
 
 def build_image(self, img):
diff --git a/tests/vm/ubuntu.i386 b/tests/vm/ubuntu.i386
index fc319e0e6e..b4eaa0dedc 100755
--- a/tests/vm/ubuntu.i386
+++ b/tests/vm/ubuntu.i386
@@ -25,8 +25,8 @@ class UbuntuX86VM(basevm.BaseVM):
 sudo chmod a+r /dev/vdb;
 tar -xf /dev/vdb;
 ./configure {configure_opts};
-make -j{jobs};
-make check;
+make -j{jobs} V={verbose};
+make check V={verbose};
 """
 
 def _gen_cloud_init_iso(self):
-- 
2.17.1




[Qemu-devel] [PATCH v3 1/4] tests: Add an option for snapshot (default: off)

2018-07-11 Thread Fam Zheng
Not using snapshot has the benefit of automatically persisting useful
test harnesses, such as docker images and ccache database. Although it
will lose some cleanness, it is imaginably useful for patchew.

Signed-off-by: Fam Zheng 
---
 tests/vm/basevm.py | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/tests/vm/basevm.py b/tests/vm/basevm.py
index 3643117816..e5d6a328d5 100755
--- a/tests/vm/basevm.py
+++ b/tests/vm/basevm.py
@@ -216,6 +216,8 @@ def parse_args(vm_name):
   help="build QEMU from source in guest")
 parser.add_option("--interactive", "-I", action="store_true",
   help="Interactively run command")
+parser.add_option("--snapshot", "-s", action="store_true",
+  help="run tests with a snapshot")
 parser.disable_interspersed_args()
 return parser.parse_args()
 
@@ -241,7 +243,10 @@ def main(vmcls):
jobs=args.jobs)]
 else:
 cmd = argv
-vm.boot(args.image + ",snapshot=on")
+img = args.image
+if args.snapshot:
+img += ",snapshot=on"
+vm.boot(img)
 vm.wait_ssh()
 except Exception as e:
 if isinstance(e, SystemExit) and e.code == 0:
-- 
2.17.1




[Qemu-devel] [PATCH v3 0/4] Add a CentOS test image to run docker tests

2018-07-11 Thread Fam Zheng
v3: Add 'make vm-test' document. [Phil]

v2: Drop archive-source.sh changes.
The new test depends on the iotests nbd fix I posted today to pass.

Docker testing on patchew has long suffered from 'make check' hangings. The
cleanness of VM testing is the cure. Now let's add a CentOS 7 image to run the
tests.  It's purely ad-hoc, but hopefully still easy to understand and use for
everyone.

The first patch makes passing source code from host to the container in VM
working, and is a nice clean up.

The second patch makes caches work, to speed up repetitive runs like on
patchew.

The last patch adds the new image that does the job. Two out of three docker
tests running on patchew.org are added to the image. I'll wait for Peter to fix
the 'docker-test-quick@centos6' hanging (oob test) before adding it too.

Fam

Fam Zheng (4):
  tests: Add an option for snapshot (default: off)
  tests/vm: Pass verbose flag into VM make commands
  tests: Allow overriding archive path with SRC_ARCHIVE
  tests: Add centos VM testing

 tests/docker/Makefile.include |  7 ++-
 tests/vm/Makefile.include |  3 +-
 tests/vm/basevm.py| 10 -
 tests/vm/centos   | 84 +++
 tests/vm/freebsd  |  4 +-
 tests/vm/netbsd   |  4 +-
 tests/vm/openbsd  |  4 +-
 tests/vm/ubuntu.i386  |  4 +-
 8 files changed, 107 insertions(+), 13 deletions(-)
 create mode 100755 tests/vm/centos

-- 
2.17.1




[Qemu-devel] [PATCH v3 3/4] tests: Allow overriding archive path with SRC_ARCHIVE

2018-07-11 Thread Fam Zheng
In VM based tests, the source archive is created in host, we don't have
to run archive-source.sh again, as it complicates the Makefile and
scripts.

Signed-off-by: Fam Zheng 
---
 tests/docker/Makefile.include | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/tests/docker/Makefile.include b/tests/docker/Makefile.include
index b2a7e761cc..f0525b91a7 100644
--- a/tests/docker/Makefile.include
+++ b/tests/docker/Makefile.include
@@ -27,8 +27,11 @@ DOCKER_SRC_COPY := $(BUILD_DIR)/docker-src.$(CUR_TIME)
 
 $(DOCKER_SRC_COPY):
@mkdir $@
-   $(call quiet-command, cd $(SRC_PATH) && scripts/archive-source.sh 
$@/qemu.tar, \
-   "GEN", "$@/qemu.tar")
+   $(if $(SRC_ARCHIVE), \
+   $(call quiet-command, cp "$(SRC_ARCHIVE)" $@/qemu.tar, \
+   "CP", "$@/qemu.tar"), \
+   $(call quiet-command, cd $(SRC_PATH) && 
scripts/archive-source.sh $@/qemu.tar, \
+   "GEN", "$@/qemu.tar"))
$(call quiet-command, cp $(SRC_PATH)/tests/docker/run $@/run, \
"COPY","RUNNER")
 
-- 
2.17.1




Re: [Qemu-devel] [PATCH] docker: Install more packages in centos7

2018-07-11 Thread Fam Zheng
On Wed, 07/11 14:58, Fam Zheng wrote:
> This makes test-block work.
> 
> Signed-off-by: Fam Zheng 

Queued, thanks.




[Qemu-devel] [PATCH v2 4/4] tests: Add centos VM testing

2018-07-11 Thread Fam Zheng
This one does docker testing in the VM. It is intended to replace the
native docker testing on patchew testers.

Signed-off-by: Fam Zheng 
---
 tests/vm/Makefile.include |  2 +-
 tests/vm/centos   | 84 +++
 2 files changed, 85 insertions(+), 1 deletion(-)
 create mode 100755 tests/vm/centos

diff --git a/tests/vm/Makefile.include b/tests/vm/Makefile.include
index 5daa2a3b73..e492cd73e5 100644
--- a/tests/vm/Makefile.include
+++ b/tests/vm/Makefile.include
@@ -2,7 +2,7 @@
 
 .PHONY: vm-build-all
 
-IMAGES := ubuntu.i386 freebsd netbsd openbsd
+IMAGES := ubuntu.i386 freebsd netbsd openbsd centos
 IMAGE_FILES := $(patsubst %, tests/vm/%.img, $(IMAGES))
 
 .PRECIOUS: $(IMAGE_FILES)
diff --git a/tests/vm/centos b/tests/vm/centos
new file mode 100755
index 00..afd560c564
--- /dev/null
+++ b/tests/vm/centos
@@ -0,0 +1,84 @@
+#!/usr/bin/env python
+#
+# CentOS image
+#
+# Copyright 2018 Red Hat Inc.
+#
+# Authors:
+#  Fam Zheng 
+#
+# This code is licensed under the GPL version 2 or later.  See
+# the COPYING file in the top-level directory.
+#
+
+import os
+import sys
+import subprocess
+import basevm
+import time
+
+class CentosVM(basevm.BaseVM):
+name = "centos"
+BUILD_SCRIPT = """
+set -e;
+cd $(mktemp -d);
+export SRC_ARCHIVE=/dev/vdb;
+sudo chmod a+r $SRC_ARCHIVE;
+tar -xf $SRC_ARCHIVE;
+make docker-test-block@centos7 V={verbose} J={jobs};
+make docker-test-quick@centos7 V={verbose} J={jobs};
+make docker-test-mingw@fedora V={verbose} J={jobs};
+"""
+
+def _gen_cloud_init_iso(self):
+cidir = self._tmpdir
+mdata = open(os.path.join(cidir, "meta-data"), "w")
+mdata.writelines(["instance-id: centos-vm-0\n",
+  "local-hostname: centos-guest\n"])
+mdata.close()
+udata = open(os.path.join(cidir, "user-data"), "w")
+udata.writelines(["#cloud-config\n",
+  "chpasswd:\n",
+  "  list: |\n",
+  "root:%s\n" % self.ROOT_PASS,
+  "%s:%s\n" % (self.GUEST_USER, self.GUEST_PASS),
+  "  expire: False\n",
+  "users:\n",
+  "  - name: %s\n" % self.GUEST_USER,
+  "sudo: ALL=(ALL) NOPASSWD:ALL\n",
+  "ssh-authorized-keys:\n",
+  "- %s\n" % basevm.SSH_PUB_KEY,
+  "  - name: root\n",
+  "ssh-authorized-keys:\n",
+  "- %s\n" % basevm.SSH_PUB_KEY,
+  "locale: en_US.UTF-8\n"])
+udata.close()
+subprocess.check_call(["genisoimage", "-output", "cloud-init.iso",
+   "-volid", "cidata", "-joliet", "-rock",
+   "user-data", "meta-data"],
+   cwd=cidir,
+   stdin=self._devnull, stdout=self._stdout,
+   stderr=self._stdout)
+return os.path.join(cidir, "cloud-init.iso")
+
+def build_image(self, img):
+cimg = 
self._download_with_cache("https://cloud.centos.org/centos/7/images/CentOS-7-x86_64-GenericCloud-1802.qcow2.xz;)
+img_tmp = img + ".tmp"
+subprocess.check_call(["cp", "-f", cimg, img_tmp + ".xz"])
+subprocess.check_call(["xz", "-df", img_tmp + ".xz"])
+subprocess.check_call(["qemu-img", "resize", img_tmp, "50G"])
+self.boot(img_tmp, extra_args = ["-cdrom", self._gen_cloud_init_iso()])
+self.wait_ssh()
+self.ssh_root_check("touch /etc/cloud/cloud-init.disabled")
+self.ssh_root_check("yum update -y")
+self.ssh_root_check("yum install -y docker make git")
+self.ssh_root_check("systemctl enable docker")
+self.ssh_root("poweroff")
+self.wait()
+if os.path.exists(img):
+os.remove(img)
+os.rename(img_tmp, img)
+return 0
+
+if __name__ == "__main__":
+sys.exit(basevm.main(CentosVM))
-- 
2.17.1




[Qemu-devel] [PATCH v2 2/4] tests/vm: Pass verbose flag into VM make commands

2018-07-11 Thread Fam Zheng
Our Makefile has:

vm-build-%: tests/vm/%.img
$(call quiet-command, \
$(SRC_PATH)/tests/vm/$* \
$(if $(V)$(DEBUG), --debug) \
$(if $(DEBUG), --interactive) \

the intention of which is to let the make command in VM have V=1 if
V=1 is set. We pass this to the BUILD_SCRIPT format.

Signed-off-by: Fam Zheng 
---
 tests/vm/basevm.py   | 3 ++-
 tests/vm/freebsd | 4 ++--
 tests/vm/netbsd  | 4 ++--
 tests/vm/openbsd | 4 ++--
 tests/vm/ubuntu.i386 | 4 ++--
 5 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/tests/vm/basevm.py b/tests/vm/basevm.py
index e5d6a328d5..25361c6b7d 100755
--- a/tests/vm/basevm.py
+++ b/tests/vm/basevm.py
@@ -240,7 +240,8 @@ def main(vmcls):
 vm.add_source_dir(args.build_qemu)
 cmd = [vm.BUILD_SCRIPT.format(
configure_opts = " ".join(argv),
-   jobs=args.jobs)]
+   jobs=args.jobs,
+   verbose="1" if args.debug else "")]
 else:
 cmd = argv
 img = args.image
diff --git a/tests/vm/freebsd b/tests/vm/freebsd
index 039dad8f69..b20612a6c9 100755
--- a/tests/vm/freebsd
+++ b/tests/vm/freebsd
@@ -23,8 +23,8 @@ class FreeBSDVM(basevm.BaseVM):
 cd $(mktemp -d /var/tmp/qemu-test.XX);
 tar -xf /dev/vtbd1;
 ./configure {configure_opts};
-gmake -j{jobs};
-gmake check;
+gmake -j{jobs} V={verbose};
+gmake check V={verbose};
 """
 
 def build_image(self, img):
diff --git a/tests/vm/netbsd b/tests/vm/netbsd
index 3972d8b45c..3f9d34a208 100755
--- a/tests/vm/netbsd
+++ b/tests/vm/netbsd
@@ -23,8 +23,8 @@ class NetBSDVM(basevm.BaseVM):
 cd $(mktemp -d /var/tmp/qemu-test.XX);
 tar -xf /dev/rld1a;
 ./configure --python=python2.7 {configure_opts};
-gmake -j{jobs};
-gmake check;
+gmake -j{jobs} V={verbose};
+gmake check V={verbose};
 """
 
 def build_image(self, img):
diff --git a/tests/vm/openbsd b/tests/vm/openbsd
index 6ae16d97fd..3285c1abde 100755
--- a/tests/vm/openbsd
+++ b/tests/vm/openbsd
@@ -23,9 +23,9 @@ class OpenBSDVM(basevm.BaseVM):
 cd $(mktemp -d /var/tmp/qemu-test.XX);
 tar -xf /dev/rsd1c;
 ./configure --cc=x86_64-unknown-openbsd6.1-gcc-4.9.4 
--python=python2.7 {configure_opts};
-gmake -j{jobs};
+gmake -j{jobs} V={verbose};
 # XXX: "gmake check" seems to always hang or fail
-#gmake check;
+#gmake check V={verbose};
 """
 
 def build_image(self, img):
diff --git a/tests/vm/ubuntu.i386 b/tests/vm/ubuntu.i386
index fc319e0e6e..b4eaa0dedc 100755
--- a/tests/vm/ubuntu.i386
+++ b/tests/vm/ubuntu.i386
@@ -25,8 +25,8 @@ class UbuntuX86VM(basevm.BaseVM):
 sudo chmod a+r /dev/vdb;
 tar -xf /dev/vdb;
 ./configure {configure_opts};
-make -j{jobs};
-make check;
+make -j{jobs} V={verbose};
+make check V={verbose};
 """
 
 def _gen_cloud_init_iso(self):
-- 
2.17.1




[Qemu-devel] [PATCH v2 3/4] tests: Allow overriding archive path with SRC_ARCHIVE

2018-07-11 Thread Fam Zheng
In VM based tests, the source archive is created in host, we don't have
to run archive-source.sh again, as it complicates the Makefile and
scripts.

Signed-off-by: Fam Zheng 
---
 tests/docker/Makefile.include | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/tests/docker/Makefile.include b/tests/docker/Makefile.include
index b2a7e761cc..f0525b91a7 100644
--- a/tests/docker/Makefile.include
+++ b/tests/docker/Makefile.include
@@ -27,8 +27,11 @@ DOCKER_SRC_COPY := $(BUILD_DIR)/docker-src.$(CUR_TIME)
 
 $(DOCKER_SRC_COPY):
@mkdir $@
-   $(call quiet-command, cd $(SRC_PATH) && scripts/archive-source.sh 
$@/qemu.tar, \
-   "GEN", "$@/qemu.tar")
+   $(if $(SRC_ARCHIVE), \
+   $(call quiet-command, cp "$(SRC_ARCHIVE)" $@/qemu.tar, \
+   "CP", "$@/qemu.tar"), \
+   $(call quiet-command, cd $(SRC_PATH) && 
scripts/archive-source.sh $@/qemu.tar, \
+   "GEN", "$@/qemu.tar"))
$(call quiet-command, cp $(SRC_PATH)/tests/docker/run $@/run, \
"COPY","RUNNER")
 
-- 
2.17.1




[Qemu-devel] [PATCH v2 0/4] Add a CentOS test image to run docker tests

2018-07-11 Thread Fam Zheng
v2: Drop archive-source.sh changes.
The new test depends on the iotests nbd fix I posted today to pass.

Docker testing on patchew has long suffered from 'make check' hangings. The
cleanness of VM testing is the cure. Now let's add a CentOS 7 image to run the
tests.  It's purely ad-hoc, but hopefully still easy to understand and use for
everyone.

The first patch makes passing source code from host to the container in VM
working, and is a nice clean up.

The second patch makes caches work, to speed up repetitive runs like on
patchew.

The last patch adds the new image that does the job. Two out of three docker
tests running on patchew.org are added to the image. I'll wait for Peter to fix
the 'docker-test-quick@centos6' hanging (oob test) before adding it too.

Fam

Fam Zheng (4):
  tests: Add an option for snapshot (default: off)
  tests/vm: Pass verbose flag into VM make commands
  tests: Allow overriding archive path with SRC_ARCHIVE
  tests: Add centos VM testing

 tests/docker/Makefile.include |  7 ++-
 tests/vm/Makefile.include |  2 +-
 tests/vm/basevm.py| 10 -
 tests/vm/centos   | 84 +++
 tests/vm/freebsd  |  4 +-
 tests/vm/netbsd   |  4 +-
 tests/vm/openbsd  |  4 +-
 tests/vm/ubuntu.i386  |  4 +-
 8 files changed, 106 insertions(+), 13 deletions(-)
 create mode 100755 tests/vm/centos

-- 
2.17.1




[Qemu-devel] [PATCH v2 1/4] tests: Add an option for snapshot (default: off)

2018-07-11 Thread Fam Zheng
Not using snapshot has the benefit of automatically persisting useful
test harnesses, such as docker images and ccache database. Although it
will lose some cleanness, it is imaginably useful for patchew.

Signed-off-by: Fam Zheng 
---
 tests/vm/basevm.py | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/tests/vm/basevm.py b/tests/vm/basevm.py
index 3643117816..e5d6a328d5 100755
--- a/tests/vm/basevm.py
+++ b/tests/vm/basevm.py
@@ -216,6 +216,8 @@ def parse_args(vm_name):
   help="build QEMU from source in guest")
 parser.add_option("--interactive", "-I", action="store_true",
   help="Interactively run command")
+parser.add_option("--snapshot", "-s", action="store_true",
+  help="run tests with a snapshot")
 parser.disable_interspersed_args()
 return parser.parse_args()
 
@@ -241,7 +243,10 @@ def main(vmcls):
jobs=args.jobs)]
 else:
 cmd = argv
-vm.boot(args.image + ",snapshot=on")
+img = args.image
+if args.snapshot:
+img += ",snapshot=on"
+vm.boot(img)
 vm.wait_ssh()
 except Exception as e:
 if isinstance(e, SystemExit) and e.code == 0:
-- 
2.17.1




Re: [Qemu-devel] [PATCH] docker: Update debootstrap script after Debian migration from Alioth to Salsa

2018-07-11 Thread Fam Zheng
On Tue, 07/03 12:35, Philippe Mathieu-Daudé wrote:
> This silents the following warning:
> 
>   Cloning into './debootstrap.git'...
>   warning: redirecting to 
> https://salsa.debian.org/installer-team/debootstrap.git/
> 
> See https://lists.debian.org/debian-devel-announce/2018/01/msg4.html
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  tests/docker/dockerfiles/debian-bootstrap.pre | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tests/docker/dockerfiles/debian-bootstrap.pre 
> b/tests/docker/dockerfiles/debian-bootstrap.pre
> index 7c76dce663..44b18db7a2 100755
> --- a/tests/docker/dockerfiles/debian-bootstrap.pre
> +++ b/tests/docker/dockerfiles/debian-bootstrap.pre
> @@ -53,7 +53,7 @@ if [ -z $DEBOOTSTRAP_DIR ]; then
>  NEED_DEBOOTSTRAP=true
>  fi
>  if $NEED_DEBOOTSTRAP; then
> -DEBOOTSTRAP_SOURCE=https://anonscm.debian.org/git/d-i/debootstrap.git
> +
> DEBOOTSTRAP_SOURCE=https://salsa.debian.org/installer-team/debootstrap.git
>  git clone ${DEBOOTSTRAP_SOURCE} ./debootstrap.git
>  export DEBOOTSTRAP_DIR=./debootstrap.git
>  DEBOOTSTRAP=./debootstrap.git/debootstrap
> -- 
> 2.18.0
> 

Queued, thanks.

Fam



[Qemu-devel] [PATCH] iotests: 153: Fix dead code

2018-07-11 Thread Fam Zheng
This step was left behind my mistake. As suggested by the echoed text,
the intention was to test two devices with the same image, with
different options. The behavior should be the same as two QEMU
processes. Complete it.

Signed-off-by: Fam Zheng 
---
 tests/qemu-iotests/153 |  2 ++
 tests/qemu-iotests/153.out | 25 +
 2 files changed, 27 insertions(+)

diff --git a/tests/qemu-iotests/153 b/tests/qemu-iotests/153
index 673813cd39..0daeb1b005 100755
--- a/tests/qemu-iotests/153
+++ b/tests/qemu-iotests/153
@@ -162,6 +162,7 @@ for opts1 in "" "read-only=on" 
"read-only=on,force-share=on"; do
 _cleanup_qemu
 done
 
+test_opts="read-only=off read-only=on read-only=on,force-share=on"
 for opt1 in $test_opts; do
 for opt2 in $test_opts; do
 echo
@@ -170,6 +171,7 @@ for opt1 in $test_opts; do
 done
 done
 
+echo
 echo "== Creating ${TEST_IMG}.[abc] ==" | _filter_testdir
 (
 $QEMU_IMG create -f qcow2 "${TEST_IMG}.a" -b "${TEST_IMG}"
diff --git a/tests/qemu-iotests/153.out b/tests/qemu-iotests/153.out
index 3492ba7a29..93eaf10486 100644
--- a/tests/qemu-iotests/153.out
+++ b/tests/qemu-iotests/153.out
@@ -369,6 +369,31 @@ _qemu_img_wrapper bench -U -w -c 1 TEST_DIR/t.qcow2
 qemu-img: Could not open 'TEST_DIR/t.qcow2': force-share=on can only be used 
with read-only images
 
 Round done
+
+== Two devices with the same image (read-only=off - read-only=off) ==
+QEMU_PROG: -drive if=none,file=TEST_DIR/t.qcow2,read-only=off: Failed to get 
"write" lock
+Is another process using the image?
+
+== Two devices with the same image (read-only=off - read-only=on) ==
+QEMU_PROG: -drive if=none,file=TEST_DIR/t.qcow2,read-only=on: Failed to get 
shared "write" lock
+Is another process using the image?
+
+== Two devices with the same image (read-only=off - 
read-only=on,force-share=on) ==
+
+== Two devices with the same image (read-only=on - read-only=off) ==
+QEMU_PROG: -drive if=none,file=TEST_DIR/t.qcow2,read-only=off: Failed to get 
"write" lock
+Is another process using the image?
+
+== Two devices with the same image (read-only=on - read-only=on) ==
+
+== Two devices with the same image (read-only=on - 
read-only=on,force-share=on) ==
+
+== Two devices with the same image (read-only=on,force-share=on - 
read-only=off) ==
+
+== Two devices with the same image (read-only=on,force-share=on - 
read-only=on) ==
+
+== Two devices with the same image (read-only=on,force-share=on - 
read-only=on,force-share=on) ==
+
 == Creating TEST_DIR/t.qcow2.[abc] ==
 Formatting 'TEST_DIR/t.IMGFMT.a', fmt=IMGFMT size=33554432 
backing_file=TEST_DIR/t.IMGFMT
 Formatting 'TEST_DIR/t.IMGFMT.b', fmt=IMGFMT size=33554432 
backing_file=TEST_DIR/t.IMGFMT
-- 
2.17.1




[Qemu-devel] [PATCH] docker: Install more packages in centos7

2018-07-11 Thread Fam Zheng
This makes test-block work.

Signed-off-by: Fam Zheng 
---
 tests/docker/dockerfiles/centos7.docker | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tests/docker/dockerfiles/centos7.docker 
b/tests/docker/dockerfiles/centos7.docker
index 575de29a0a..83462b7205 100644
--- a/tests/docker/dockerfiles/centos7.docker
+++ b/tests/docker/dockerfiles/centos7.docker
@@ -3,6 +3,7 @@ RUN yum install -y epel-release centos-release-xen
 RUN yum -y update
 ENV PACKAGES \
 bison \
+bzip2 \
 bzip2-devel \
 ccache \
 csnappy-devel \
@@ -12,10 +13,12 @@ ENV PACKAGES \
 gettext \
 git \
 glib2-devel \
+libaio-devel \
 libepoxy-devel \
 libfdt-devel \
 librdmacm-devel \
 lzo-devel \
+nettle-devel \
 make \
 mesa-libEGL-devel \
 mesa-libgbm-devel \
-- 
2.17.1




[Qemu-devel] [PATCH v2] iotests: nbd: Stop qemu-nbd before remaking image

2018-07-11 Thread Fam Zheng
197 is one example where _make_test_img is used twice without stopping
the NBD server in between. An error will occur like this:

@@ -26,9 +26,13 @@

 === Partial final cluster ===

+qemu-img: TEST_DIR/t.IMGFMT: Failed to get "resize" lock
+Is another process using the image?
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1024
+Failed to find an available port: Address already in use
 read 1024/1024 bytes at offset 0

Patch _make_test_img to stop the old qemu-nbd before starting a new one,
which fixes this problem, and similarly 215.

Signed-off-by: Fam Zheng 
---
 tests/qemu-iotests/common.rc | 21 +++--
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc
index d054cb97fc..44bee16a5e 100644
--- a/tests/qemu-iotests/common.rc
+++ b/tests/qemu-iotests/common.rc
@@ -195,6 +195,16 @@ _use_sample_img()
 fi
 }
 
+_stop_nbd_server()
+{
+if [ -f "${QEMU_TEST_DIR}/qemu-nbd.pid" ]; then
+local QEMU_NBD_PID
+read QEMU_NBD_PID < "${QEMU_TEST_DIR}/qemu-nbd.pid"
+kill ${QEMU_NBD_PID}
+rm -f "${QEMU_TEST_DIR}/qemu-nbd.pid"
+fi
+}
+
 _make_test_img()
 {
 # extra qemu-img options can be added by tests
@@ -234,6 +244,10 @@ _make_test_img()
 extra_img_options="-o $optstr $extra_img_options"
 fi
 
+if [ $IMGPROTO = "nbd" ]; then
+_stop_nbd_server
+fi
+
 # XXX(hch): have global image options?
 (
  if [ $use_backing = 1 ]; then
@@ -274,12 +288,7 @@ _cleanup_test_img()
 case "$IMGPROTO" in
 
 nbd)
-if [ -f "${QEMU_TEST_DIR}/qemu-nbd.pid" ]; then
-local QEMU_NBD_PID
-read QEMU_NBD_PID < "${QEMU_TEST_DIR}/qemu-nbd.pid"
-kill ${QEMU_NBD_PID}
-rm -f "${QEMU_TEST_DIR}/qemu-nbd.pid"
-fi
+_stop_nbd_server
 rm -f "$TEST_IMG_FILE"
 ;;
 vxhs)
-- 
2.17.1




Re: [Qemu-devel] [PATCH 2/4] tests/vm: Add a dependency to qemu-img

2018-07-10 Thread Fam Zheng
On Tue, 07/03 12:56, Philippe Mathieu-Daudé wrote:
> Hi Fam,
> 
> On 07/02/2018 12:19 PM, Philippe Mathieu-Daudé wrote:
> > On 07/02/2018 04:18 AM, Fam Zheng wrote:
> >> On Thu, 06/28 12:35, Philippe Mathieu-Daudé wrote:
> >>> Before the first use, the VM image are resized with qemu-img.
> >>>
> >>> Add a dependency to the qemu-img tool to fix:
> >>>
> >>>   $ make vm-build-ubuntu.i386
> >>>   Traceback (most recent call last):
> >>> File "source/qemu/tests/vm/basevm.py", line 236, in main
> >>>   return vm.build_image(args.image)
> >>> File "tests/vm/ubuntu.i386", line 67, in build_image
> >>>   subprocess.check_call(["qemu-img", "resize", img_tmp, "50G"])
> >>>   OSError: [Errno 2] No such file or directory
> >>>   tests/vm/Makefile.include:23: recipe for target 
> >>> 'tests/vm/ubuntu.i386.img' failed
> >>>   make: *** [tests/vm/ubuntu.i386.img] Error 2
> >>>
> >>> Signed-off-by: Philippe Mathieu-Daudé 
> >>> ---
> >>>  tests/vm/Makefile.include | 3 +++
> >>>  1 file changed, 3 insertions(+)
> >>>
> >>> diff --git a/tests/vm/Makefile.include b/tests/vm/Makefile.include
> >>> index 5daa2a3b73..e5e8146267 100644
> >>> --- a/tests/vm/Makefile.include
> >>> +++ b/tests/vm/Makefile.include
> >>> @@ -17,10 +17,13 @@ vm-test:
> >>>  
> >>>  vm-build-all: $(addprefix vm-build-, $(IMAGES))
> >>>  
> >>> +$(IMAGE_FILES): qemu-img$(EXESUF)
> >>
> >> No, please don't do this. Let's keep on supporting vm tests from a clean 
> >> source
> >> tree. Similar to qemu_bin, this can be handled in basevm.py.
> > 
> > OK.
> 
> The ubuntu.i386 VM is the only one using the qemu-img tool in his
> build_image() method implementation. Is it OK to add this check there?
> The basevm.build_image() is only an abstract method.

Maybe you can add a BaseVM.qemu_img() method that looks up for available
qemu-img on the system or build dir.

Fam

> 
> >>> +
> >>>  tests/vm/%.img: $(SRC_PATH)/tests/vm/% \
> >>>   $(SRC_PATH)/tests/vm/basevm.py \
> >>>   $(SRC_PATH)/tests/vm/Makefile.include
> >>>   $(call quiet-command, \
> >>> + PATH=$(PATH):. \
> >>>   $< \
> >>>   $(if $(V)$(DEBUG), --debug) \
> >>>   --image "$@" \
> >>> -- 
> >>> 2.18.0
> >>>
> >>



Re: [Qemu-devel] [PATCH 0/2] virtio-scsi: fix hotplug ->reset() vs event race

2018-07-10 Thread Fam Zheng
On Tue, 07/10 16:50, Stefan Hajnoczi wrote:
> The virtio-scsi command virtqueues run during hotplug.  This creates the
> possibility of race conditions since the guest can submit commands while the
> monitor is performing hotplug.
> 
> See Patch 2 for a fix for the ->reset() vs event race condition that Zhengui 
> Li
> encountered.
> 
> Zhengui Li: Sorry it took me so long to look into this.  Please let me know if
> this fixes the issue you are seeing.  Thanks!
> 
> Stefan Hajnoczi (2):
>   qdev: add HotplugHandler->post_plug() callback
>   virtio-scsi: fix hotplug ->reset() vs event race
> 
>  include/hw/hotplug.h  | 12 
>  hw/core/hotplug.c | 11 +++
>  hw/core/qdev.c|  7 +++
>  hw/scsi/virtio-scsi.c | 11 ++-
>  4 files changed, 40 insertions(+), 1 deletion(-)
> 
> -- 
> 2.17.1

Reviewed-by: Fam Zheng 




Re: [Qemu-devel] [PATCH for 3.0 10/10] docker: add expansion for docker-test-FOO to Makefile.include

2018-07-10 Thread Fam Zheng
On Tue, 07/10 22:04, Alex Bennée wrote:
> 
> Philippe Mathieu-Daudé  writes:
> 
> > On 07/09/2018 12:21 PM, Alex Bennée wrote:
> >> This allows us to run a particular test on all docker images. For
> >> example:
> >>
> >>   make docker-test-unit
> >>
> >> Will run the unit tests on every supported image.
> >>
> >> Signed-off-by: Alex Bennée 
> >> ---
> >>  tests/docker/Makefile.include | 2 ++
> >>  1 file changed, 2 insertions(+)
> >>
> >> diff --git a/tests/docker/Makefile.include b/tests/docker/Makefile.include
> >> index fe63aacf69..765b2c36f2 100644
> >> --- a/tests/docker/Makefile.include
> >> +++ b/tests/docker/Makefile.include
> >> @@ -152,6 +152,7 @@ $(foreach i,$(filter-out 
> >> $(DOCKER_PARTIAL_IMAGES),$(DOCKER_IMAGES) $(DOCKER_DEPR
> >>) \
> >>$(foreach t,$(DOCKER_TESTS), \
> >>$(eval docker-test: docker-$t@$i) \
> >> +  $(eval docker-$t: docker-$t@$i) \
> >>) \
> >>  )
> >>
> >> @@ -162,6 +163,7 @@ docker:
> >>@echo
> >>@echo 'docker:  Print this help.'
> >>@echo 'docker-test: Run all image/test combinations.'
> >
> > Can we rename "docker-test" -> "docker-tests" or is it too late?
> 
> Probably, I don't know how many people were actively using it. Fam?

I don't know either. Maybe because I'm not native English speaker, but why
bother? "Docker test" sounds like a generic "cover all" test, whereas "docker
tests" sounds like a set of many independent tests based on Docker. Both make
sense to me, though I agree the latter describes it better.

On the other hand, maybe we should create a selection instead of "all
combinations" and call it docker-test, to make it a bit more useful. Then we can
rename the current docker-test to "docker-all" or "docker-test-all".

Fam

> 
> >
> >> +  @echo 'docker-TEST: Run TEST on all image combinations.'
> >>@echo 'docker-clean:Kill and remove residual docker testing 
> >> containers.'
> >>@echo 'docker-TEST@IMAGE:   Run "TEST" in container "IMAGE".'
> >>@echo ' Note: "TEST" is one of the listed test 
> >> name,'
> >>
> 
> 
> --
> Alex Bennée



[Qemu-devel] [PATCH] qemu-img: Document copy offloading implications with -S and -c

2018-07-10 Thread Fam Zheng
Explicitly enabling zero detection or compression suppresses copy
offloading during convert. Document it.

Suggested-by: Kevin Wolf 
Signed-off-by: Fam Zheng 
---
 qemu-img.texi | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/qemu-img.texi b/qemu-img.texi
index aeb1b9e66c..5853cd18d1 100644
--- a/qemu-img.texi
+++ b/qemu-img.texi
@@ -96,7 +96,8 @@ will enumerate information about backing files in a disk 
image chain. Refer
 below for further description.
 
 @item -c
-indicates that target image must be compressed (qcow format only)
+indicates that target image must be compressed (qcow format only). If this
+option is used, copy offloading will not be attempted.
 
 @item -h
 with or without a command shows help and lists the supported formats
@@ -115,7 +116,8 @@ in case both @var{-q} and @var{-p} options are used.
 indicates the consecutive number of bytes that must contain only zeros
 for qemu-img to create a sparse image during conversion. This value is rounded
 down to the nearest 512 bytes. You may use the common size suffixes like
-@code{k} for kilobytes.
+@code{k} for kilobytes. If this option is used, copy offloading will not be
+attempted.
 
 @item -t @var{cache}
 specifies the cache mode that should be used with the (destination) file. See
-- 
2.17.1




[Qemu-devel] [PATCH] iotests: 222: Don't run with luks

2018-07-10 Thread Fam Zheng
Luks needs special parameters to operate the image. Since this test is
focusing on image fleecing, skip skip that format.

Signed-off-by: Fam Zheng 
---
 tests/qemu-iotests/222 | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tests/qemu-iotests/222 b/tests/qemu-iotests/222
index ff3bfc1470..0ead56d574 100644
--- a/tests/qemu-iotests/222
+++ b/tests/qemu-iotests/222
@@ -25,6 +25,8 @@ import iotests
 from iotests import log, qemu_img, qemu_io, qemu_io_silent
 
 iotests.verify_platform(['linux'])
+iotests.verify_image_format(supported_fmts=['qcow2', 'qcow', 'qed', 'vmdk',
+'vhdx', 'raw'])
 
 patterns = [("0x5d", "0", "64k"),
 ("0xd5", "1M","64k"),
-- 
2.17.1




Re: [Qemu-devel] [PATCH for 3.0 01/10] tests/.gitignore: don't ignore docker tests

2018-07-10 Thread Fam Zheng
On Tue, 07/10 08:54, Alex Bennée wrote:
> 
> Fam Zheng  writes:
> 
> > On Mon, 07/09 16:21, Alex Bennée wrote:
> >> This was being a little over enthusiastic hiding files.
> >
> > What is "this" that hides test-* and calls for this patch?
> 
> I thought that was implicit from the first line, this is the .gitignore.

Indeed you're right. Maybe that line should be more specific, but I have no
strong opinion.

Fam

> 
> >
> >>
> >> Signed-off-by: Alex Bennée 
> >> ---
> >>  tests/.gitignore | 1 +
> >>  1 file changed, 1 insertion(+)
> >>
> >> diff --git a/tests/.gitignore b/tests/.gitignore
> >> index 08e2df1ce1..72c18aaab0 100644
> >> --- a/tests/.gitignore
> >> +++ b/tests/.gitignore
> >> @@ -9,6 +9,7 @@ qht-bench
> >>  rcutorture
> >>  test-*
> >>  !test-*.c
> >> +!docker/test-*
> >>  test-qapi-commands.[ch]
> >>  test-qapi-events.[ch]
> >>  test-qapi-types.[ch]
> >> --
> >> 2.17.1
> >>
> 
> 
> --
> Alex Bennée



Re: [Qemu-devel] [PATCH for 3.0 00/10] various docker fixes

2018-07-10 Thread Fam Zheng
On Mon, 07/09 16:21, Alex Bennée wrote:
> Hi,
> 
> The addition of the cross compilers for check-tcg broke the ability to
> run "make docker-test".  In truth this was probably already broken as
> it attempts to run every single test with every single docker image
> which isn't something that gets done on a regular basis.
> 
> Anyway the patches clean up the ability to do that in a sane way
> although it still takes a long time to run the full test set.
> 
> To help with running a better subset I've expanded the individual
> tests so you can now run a line like:
> 
>   make docker-test-build TARGET_LIST=aarch64-softmmu J=30
> 
> To make sure your favourite architecture still builds everywhere.
> 
> There is now also a docker-test-unit which just runs the unit tests
> although we have to do a little re-factoring to make sure we don't
> attempt to run "make check" steps when the docker image isn't capable
> of it. This is also needed to make sure the other test-FOO build tests
> don't choke on the check step.
> 
> Finally there is a minor tweak for .gitignore and a fix for docker.py
> throwing backtraces when we attempt to calculate SID_AGE. A cleaner
> re-factoring can be left for a future release.

Thanks for the clean ups!

Looks good in general. The only thing I notice is the test-unit script.

Fam



Re: [Qemu-devel] [PATCH for 3.0 09/10] docker: add test-unit runner

2018-07-10 Thread Fam Zheng
On Mon, 07/09 16:21, Alex Bennée wrote:
> This test doesn't even build QEMU, it just runs all the unit tests.
> Intended to make checking unit tests on all docker images easier.
> 
> Signed-off-by: Alex Bennée 
> ---
>  tests/docker/test-unit | 19 +++
>  1 file changed, 19 insertions(+)
>  create mode 100755 tests/docker/test-unit
> 
> diff --git a/tests/docker/test-unit b/tests/docker/test-unit
> new file mode 100755
> index 00..be0d90d748
> --- /dev/null
> +++ b/tests/docker/test-unit
> @@ -0,0 +1,19 @@
> +#!/bin/bash
> +#
> +# Build and run the unit tests
> +#
> +# Copyright (c) 2018 Linaro Ltd.
> +#
> +# Authors:
> +#  Alex Bennée 
> +#
> +# This work is licensed under the terms of the GNU GPL, version 2
> +# or (at your option) any later version. See the COPYING file in
> +# the top-level directory.
> +
> +. common.rc
> +
> +cd "$BUILD_DIR"
> +
> +configure_qemu

This reads a bit unusual and counter-intuitive: how could configure_qemu succeed
when the env cannot actually build it? Does configure_qemu fail but the side
effects needed to run build and run unit tests are done anyway?  Either way,
please add a comment explaining what is happening here.

Also, should we test the exit code of configure_qemu?

> +check_qemu check-unit
> -- 
> 2.17.1
> 



Re: [Qemu-devel] [PATCH for 3.0 05/10] docker: move make check into check_qemu helper

2018-07-10 Thread Fam Zheng
On Mon, 07/09 16:21, Alex Bennée wrote:
> Not all docker images can run the check step. Let's everything into a

"move everything"?

> common helper so we don't need to replicate checks in the future.
> 
> Signed-off-by: Alex Bennée 
> ---
>  tests/docker/common.rc  | 10 ++
>  tests/docker/test-clang |  2 +-
>  tests/docker/test-debug |  2 +-
>  tests/docker/test-full  |  2 +-
>  tests/docker/test-quick |  2 +-
>  5 files changed, 14 insertions(+), 4 deletions(-)
> 
> diff --git a/tests/docker/common.rc b/tests/docker/common.rc
> index ba1f942328..cfc620d554 100755
> --- a/tests/docker/common.rc
> +++ b/tests/docker/common.rc
> @@ -40,6 +40,16 @@ build_qemu()
>  make $MAKEFLAGS
>  }
>  
> +check_qemu()
> +{
> +if test -z "$@"; then
> +TEST="check"
> +else
> +TEST="$@"
> +fi
> +make $MAKEFLAGS $TEST
> +}
> +
>  test_fail()
>  {
>  echo "$@"
> diff --git a/tests/docker/test-clang b/tests/docker/test-clang
> index e90a793178..324e341cea 100755
> --- a/tests/docker/test-clang
> +++ b/tests/docker/test-clang
> @@ -23,5 +23,5 @@ OPTS="--cxx=clang++ --cc=clang --host-cc=clang"
>  #OPTS="$OPTS --extra-cflags=-fsanitize=undefined \
>  #--extra-cflags=-fno-sanitize=float-divide-by-zero"
>  build_qemu $OPTS
> -make $MAKEFLAGS check
> +check_qemu
>  install_qemu
> diff --git a/tests/docker/test-debug b/tests/docker/test-debug
> index d3f9f70d01..137f4f2ddc 100755
> --- a/tests/docker/test-debug
> +++ b/tests/docker/test-debug
> @@ -22,5 +22,5 @@ OPTS="--cxx=clang++ --cc=clang --host-cc=clang"
>  OPTS="--enable-debug --enable-sanitizers $OPTS"
>  
>  build_qemu $OPTS
> -make $MAKEFLAGS V=1 check
> +check_qemu check V=1
>  install_qemu
> diff --git a/tests/docker/test-full b/tests/docker/test-full
> index b4e42d25d7..aadc0f00a2 100755
> --- a/tests/docker/test-full
> +++ b/tests/docker/test-full
> @@ -15,4 +15,4 @@
>  
>  cd "$BUILD_DIR"
>  
> -build_qemu && make check $MAKEFLAGS && install_qemu
> +build_qemu && check_qemu && install_qemu
> diff --git a/tests/docker/test-quick b/tests/docker/test-quick
> index 3b7bce6105..eee59c55fb 100755
> --- a/tests/docker/test-quick
> +++ b/tests/docker/test-quick
> @@ -18,5 +18,5 @@ cd "$BUILD_DIR"
>  DEF_TARGET_LIST="x86_64-softmmu,aarch64-softmmu"
>  TARGET_LIST=${TARGET_LIST:-$DEF_TARGET_LIST} \
>  build_qemu
> -make check $MAKEFLAGS
> +check_qemu
>  install_qemu
> -- 
> 2.17.1
> 



Re: [Qemu-devel] [PATCH for 3.0 01/10] tests/.gitignore: don't ignore docker tests

2018-07-10 Thread Fam Zheng
On Mon, 07/09 16:21, Alex Bennée wrote:
> This was being a little over enthusiastic hiding files.

What is "this" that hides test-* and calls for this patch?

> 
> Signed-off-by: Alex Bennée 
> ---
>  tests/.gitignore | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/tests/.gitignore b/tests/.gitignore
> index 08e2df1ce1..72c18aaab0 100644
> --- a/tests/.gitignore
> +++ b/tests/.gitignore
> @@ -9,6 +9,7 @@ qht-bench
>  rcutorture
>  test-*
>  !test-*.c
> +!docker/test-*
>  test-qapi-commands.[ch]
>  test-qapi-events.[ch]
>  test-qapi-types.[ch]
> -- 
> 2.17.1
> 



[Qemu-devel] [PATCH v3 10/10] block: Use common write req handling in truncate

2018-07-10 Thread Fam Zheng
Truncation is the last to convert from open coded req handling to
reusing helpers. This time the permission check in prepare has to adapt
to the new caller: it checks a different permission bit, and doesn't
trigger the before write notifier.

Also, truncation should always trigger a bs->total_sectors update and in
turn call parent resize_cb. Update the condition in finish helper, too.

It's intended to do a duplicated bs->read_only check before calling
bdrv_co_write_req_prepare() so that we can be more informative with the
error message, as bdrv_co_write_req_prepare() doesn't have Error
parameter.

Signed-off-by: Fam Zheng 
---
 block/io.c | 57 ++
 1 file changed, 36 insertions(+), 21 deletions(-)

diff --git a/block/io.c b/block/io.c
index e3e2d5286d..abcb8b3de6 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1582,14 +1582,24 @@ bdrv_co_write_req_prepare(BdrvChild *child, int64_t 
offset, uint64_t bytes,
is_request_serialising_and_aligned(req));
 assert(req->overlap_offset <= offset);
 assert(offset + bytes <= req->overlap_offset + req->overlap_bytes);
-
-if (flags & BDRV_REQ_WRITE_UNCHANGED) {
-assert(child->perm & (BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE));
-} else {
-assert(child->perm & BLK_PERM_WRITE);
-}
 assert(end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE);
-return notifier_with_return_list_notify(>before_write_notifiers, req);
+
+switch (req->type) {
+case BDRV_TRACKED_WRITE:
+case BDRV_TRACKED_DISCARD:
+if (flags & BDRV_REQ_WRITE_UNCHANGED) {
+assert(child->perm & (BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE));
+} else {
+assert(child->perm & BLK_PERM_WRITE);
+}
+return notifier_with_return_list_notify(>before_write_notifiers,
+req);
+case BDRV_TRACKED_TRUNCATE:
+assert(child->perm & BLK_PERM_RESIZE);
+return 0;
+default:
+abort();
+}
 }
 
 static inline void coroutine_fn
@@ -1608,8 +1618,9 @@ bdrv_co_write_req_finish(BdrvChild *child, int64_t 
offset, uint64_t bytes,
  * beyond EOF cannot expand the image anyway.
  * */
 if (ret == 0 &&
-end_sector > bs->total_sectors &&
-req->type != BDRV_TRACKED_DISCARD)  {
+(req->type == BDRV_TRACKED_TRUNCATE ||
+ end_sector > bs->total_sectors) &&
+req->type != BDRV_TRACKED_DISCARD) {
 bs->total_sectors = end_sector;
 bdrv_parent_cb_resize(bs);
 bdrv_dirty_bitmap_truncate(bs, end_sector << BDRV_SECTOR_BITS);
@@ -3088,7 +3099,6 @@ int coroutine_fn bdrv_co_truncate(BdrvChild *child, 
int64_t offset,
 int64_t old_size, new_bytes;
 int ret;
 
-assert(child->perm & BLK_PERM_RESIZE);
 
 /* if bs->drv == NULL, bs is closed, so there's nothing to do here */
 if (!drv) {
@@ -3121,7 +3131,18 @@ int coroutine_fn bdrv_co_truncate(BdrvChild *child, 
int64_t offset,
  * concurrently or they might be overwritten by preallocation. */
 if (new_bytes) {
 mark_request_serialising(, 1);
-wait_serialising_requests();
+}
+if (bs->read_only) {
+error_setg(errp, "Image is read-only");
+ret = -EACCES;
+goto out;
+}
+ret = bdrv_co_write_req_prepare(child, offset - new_bytes, new_bytes, ,
+0);
+if (ret < 0) {
+error_setg_errno(errp, -ret,
+ "Failed to prepare request for truncation");
+goto out;
 }
 
 if (!drv->bdrv_co_truncate) {
@@ -3133,13 +3154,6 @@ int coroutine_fn bdrv_co_truncate(BdrvChild *child, 
int64_t offset,
 ret = -ENOTSUP;
 goto out;
 }
-if (bs->read_only) {
-error_setg(errp, "Image is read-only");
-ret = -EACCES;
-goto out;
-}
-
-assert(!(bs->open_flags & BDRV_O_INACTIVE));
 
 ret = drv->bdrv_co_truncate(bs, offset, prealloc, errp);
 if (ret < 0) {
@@ -3151,9 +3165,10 @@ int coroutine_fn bdrv_co_truncate(BdrvChild *child, 
int64_t offset,
 } else {
 offset = bs->total_sectors * BDRV_SECTOR_SIZE;
 }
-bdrv_dirty_bitmap_truncate(bs, offset);
-bdrv_parent_cb_resize(bs);
-atomic_inc(>write_gen);
+/* It's possible that truncation succeeded but refresh_total_sectors
+ * failed, but the latter doesn't affect how we should finish the request.
+ * Pass 0 as the last parameter so that dirty bitmaps etc. are handled. */
+bdrv_co_write_req_finish(child, offset - new_bytes, new_bytes, , 0);
 
 out:
 tracked_request_end();
-- 
2.17.1




[Qemu-devel] [PATCH v3 08/10] block: Use common req handling in copy offloading

2018-07-10 Thread Fam Zheng
This brings the request handling logic inline with write and discard,
fixing write_gen, resize_cb, dirty bitmaps and image size refreshing.
The last of these issues broke iotest case 222, which is now fixed.

Signed-off-by: Fam Zheng 
---
 block/io.c | 22 +-
 1 file changed, 9 insertions(+), 13 deletions(-)

diff --git a/block/io.c b/block/io.c
index cc426bab2e..9687da1ce9 100644
--- a/block/io.c
+++ b/block/io.c
@@ -3007,20 +3007,16 @@ static int coroutine_fn bdrv_co_copy_range_internal(
 bdrv_inc_in_flight(dst->bs);
 tracked_request_begin(, dst->bs, dst_offset, bytes,
   BDRV_TRACKED_WRITE);
-
-/* BDRV_REQ_NO_SERIALISING is only for read operation */
-assert(!(write_flags & BDRV_REQ_NO_SERIALISING));
-if (write_flags & BDRV_REQ_SERIALISING) {
-mark_request_serialising(, bdrv_get_cluster_size(dst->bs));
+ret = bdrv_co_write_req_prepare(dst, dst_offset, bytes, ,
+write_flags);
+if (!ret) {
+ret = dst->bs->drv->bdrv_co_copy_range_to(dst->bs,
+  src, src_offset,
+  dst, dst_offset,
+  bytes,
+  read_flags, write_flags);
 }
-wait_serialising_requests();
-
-ret = dst->bs->drv->bdrv_co_copy_range_to(dst->bs,
-  src, src_offset,
-  dst, dst_offset,
-  bytes,
-  read_flags, write_flags);
-
+bdrv_co_write_req_finish(dst, dst_offset, bytes, , ret);
 tracked_request_end();
 bdrv_dec_in_flight(dst->bs);
 }
-- 
2.17.1




[Qemu-devel] [PATCH v3 07/10] block: Use common req handling for discard

2018-07-10 Thread Fam Zheng
Reuse the new bdrv_co_write_req_prepare/finish helpers. The variation
here is that discard requests don't affect bs->wr_highest_offset, and it
cannot extend the image.

Signed-off-by: Fam Zheng 
---
 block/io.c | 32 ++--
 1 file changed, 22 insertions(+), 10 deletions(-)

diff --git a/block/io.c b/block/io.c
index 10a475302a..cc426bab2e 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1601,15 +1601,31 @@ bdrv_co_write_req_finish(BdrvChild *child, int64_t 
offset, uint64_t bytes,
 
 atomic_inc(>write_gen);
 
-stat64_max(>wr_highest_offset, offset + bytes);
-
+/* Discard cannot extend the image, but in error handling cases, such as
+ * when reverting a qcow2 cluster allocation, the discarded range can pass
+ * the end of image file, so we cannot assert about BDRV_TRACKED_DISCARD
+ * here. Instead, just skip it, since semantically a discard request
+ * beyond EOF cannot expand the image anyway.
+ * */
 if (ret == 0 &&
-end_sector > bs->total_sectors) {
+end_sector > bs->total_sectors &&
+req->type != BDRV_TRACKED_DISCARD)  {
 bs->total_sectors = end_sector;
 bdrv_parent_cb_resize(bs);
 bdrv_dirty_bitmap_truncate(bs, end_sector << BDRV_SECTOR_BITS);
 }
-bdrv_set_dirty(bs, offset, bytes);
+if (req->bytes) {
+switch (req->type) {
+case BDRV_TRACKED_WRITE:
+stat64_max(>wr_highest_offset, offset + bytes);
+/* fall through, to set dirty bits */
+case BDRV_TRACKED_DISCARD:
+bdrv_set_dirty(bs, offset, bytes);
+break;
+default:
+break;
+}
+}
 }
 
 /*
@@ -2670,10 +2686,7 @@ int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, 
int64_t offset, int bytes)
 ret = bdrv_check_byte_request(bs, offset, bytes);
 if (ret < 0) {
 return ret;
-} else if (bs->read_only) {
-return -EPERM;
 }
-assert(!(bs->open_flags & BDRV_O_INACTIVE));
 
 /* Do nothing if disabled.  */
 if (!(bs->open_flags & BDRV_O_UNMAP)) {
@@ -2697,7 +2710,7 @@ int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, 
int64_t offset, int bytes)
 bdrv_inc_in_flight(bs);
 tracked_request_begin(, bs, offset, bytes, BDRV_TRACKED_DISCARD);
 
-ret = notifier_with_return_list_notify(>before_write_notifiers, );
+ret = bdrv_co_write_req_prepare(child, offset, bytes, , 0);
 if (ret < 0) {
 goto out;
 }
@@ -2763,8 +2776,7 @@ int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, 
int64_t offset, int bytes)
 }
 ret = 0;
 out:
-atomic_inc(>write_gen);
-bdrv_set_dirty(bs, req.offset, req.bytes);
+bdrv_co_write_req_finish(child, req.offset, req.bytes, , ret);
 tracked_request_end();
 bdrv_dec_in_flight(bs);
 return ret;
-- 
2.17.1




[Qemu-devel] [PATCH v3 09/10] block: Fix bdrv_co_truncate overlap check

2018-07-10 Thread Fam Zheng
If we are growing the image and potentially using preallocation for the
new area, we need to make sure that no write requests are made to the
"preallocated" area which is [@old_size, @offset), not
[@offset, offset * 2 - @old_size).

Signed-off-by: Fam Zheng 
Reviewed-by: Eric Blake 
---
 block/io.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/block/io.c b/block/io.c
index 9687da1ce9..e3e2d5286d 100644
--- a/block/io.c
+++ b/block/io.c
@@ -3113,7 +3113,8 @@ int coroutine_fn bdrv_co_truncate(BdrvChild *child, 
int64_t offset,
 }
 
 bdrv_inc_in_flight(bs);
-tracked_request_begin(, bs, offset, new_bytes, BDRV_TRACKED_TRUNCATE);
+tracked_request_begin(, bs, offset - new_bytes, new_bytes,
+  BDRV_TRACKED_TRUNCATE);
 
 /* If we are growing the image and potentially using preallocation for the
  * new area, we need to make sure that no write requests are made to it
-- 
2.17.1




[Qemu-devel] [PATCH v3 05/10] block: Extract common write req handling

2018-07-10 Thread Fam Zheng
As a mechanical refactoring patch, this is the first step towards
unified and more correct write code paths. This is helpful because
multiple BlockDriverState fields need to be updated after modifying
image data, and it's hard to maintain in multiple places such as copy
offload, discard and truncate.

Suggested-by: Kevin Wolf 
Signed-off-by: Fam Zheng 
---
 block/io.c | 91 ++
 1 file changed, 57 insertions(+), 34 deletions(-)

diff --git a/block/io.c b/block/io.c
index c56e329bb5..960e1492d0 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1553,6 +1553,61 @@ fail:
 return ret;
 }
 
+static inline int coroutine_fn
+bdrv_co_write_req_prepare(BdrvChild *child, int64_t offset, uint64_t bytes,
+  BdrvTrackedRequest *req, int flags)
+{
+BlockDriverState *bs = child->bs;
+bool waited;
+int64_t end_sector = DIV_ROUND_UP(offset + bytes, BDRV_SECTOR_SIZE);
+
+if (bs->read_only) {
+return -EPERM;
+}
+
+/* BDRV_REQ_NO_SERIALISING is only for read operation */
+assert(!(flags & BDRV_REQ_NO_SERIALISING));
+assert(!(bs->open_flags & BDRV_O_INACTIVE));
+assert((bs->open_flags & BDRV_O_NO_IO) == 0);
+assert(!(flags & ~BDRV_REQ_MASK));
+
+if (flags & BDRV_REQ_SERIALISING) {
+mark_request_serialising(req, bdrv_get_cluster_size(bs));
+}
+
+waited = wait_serialising_requests(req);
+
+assert(!waited || !req->serialising ||
+   is_request_serialising_and_aligned(req));
+assert(req->overlap_offset <= offset);
+assert(offset + bytes <= req->overlap_offset + req->overlap_bytes);
+
+if (flags & BDRV_REQ_WRITE_UNCHANGED) {
+assert(child->perm & (BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE));
+} else {
+assert(child->perm & BLK_PERM_WRITE);
+}
+assert(end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE);
+return notifier_with_return_list_notify(>before_write_notifiers, req);
+}
+
+static inline void coroutine_fn
+bdrv_co_write_req_finish(BdrvChild *child, int64_t offset, uint64_t bytes,
+ BdrvTrackedRequest *req, int ret)
+{
+int64_t end_sector = DIV_ROUND_UP(offset + bytes, BDRV_SECTOR_SIZE);
+BlockDriverState *bs = child->bs;
+
+atomic_inc(>write_gen);
+bdrv_set_dirty(bs, offset, bytes);
+
+stat64_max(>wr_highest_offset, offset + bytes);
+
+if (ret == 0) {
+bs->total_sectors = MAX(bs->total_sectors, end_sector);
+}
+}
+
 /*
  * Forwards an already correctly aligned write request to the BlockDriver,
  * after possibly fragmenting it.
@@ -1563,10 +1618,8 @@ static int coroutine_fn bdrv_aligned_pwritev(BdrvChild 
*child,
 {
 BlockDriverState *bs = child->bs;
 BlockDriver *drv = bs->drv;
-bool waited;
 int ret;
 
-int64_t end_sector = DIV_ROUND_UP(offset + bytes, BDRV_SECTOR_SIZE);
 uint64_t bytes_remaining = bytes;
 int max_transfer;
 
@@ -1582,31 +1635,10 @@ static int coroutine_fn bdrv_aligned_pwritev(BdrvChild 
*child,
 assert((offset & (align - 1)) == 0);
 assert((bytes & (align - 1)) == 0);
 assert(!qiov || bytes == qiov->size);
-assert((bs->open_flags & BDRV_O_NO_IO) == 0);
-assert(!(flags & ~BDRV_REQ_MASK));
 max_transfer = QEMU_ALIGN_DOWN(MIN_NON_ZERO(bs->bl.max_transfer, INT_MAX),
align);
 
-/* BDRV_REQ_NO_SERIALISING is only for read operation */
-assert(!(flags & BDRV_REQ_NO_SERIALISING));
-
-if (flags & BDRV_REQ_SERIALISING) {
-mark_request_serialising(req, bdrv_get_cluster_size(bs));
-}
-
-waited = wait_serialising_requests(req);
-assert(!waited || !req->serialising ||
-   is_request_serialising_and_aligned(req));
-assert(req->overlap_offset <= offset);
-assert(offset + bytes <= req->overlap_offset + req->overlap_bytes);
-if (flags & BDRV_REQ_WRITE_UNCHANGED) {
-assert(child->perm & (BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE));
-} else {
-assert(child->perm & BLK_PERM_WRITE);
-}
-assert(end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE);
-
-ret = notifier_with_return_list_notify(>before_write_notifiers, req);
+ret = bdrv_co_write_req_prepare(child, offset, bytes, req, flags);
 
 if (!ret && bs->detect_zeroes != BLOCKDEV_DETECT_ZEROES_OPTIONS_OFF &&
 !(flags & BDRV_REQ_ZERO_WRITE) && drv->bdrv_co_pwrite_zeroes &&
@@ -1655,15 +1687,10 @@ static int coroutine_fn bdrv_aligned_pwritev(BdrvChild 
*child,
 }
 bdrv_debug_event(bs, BLKDBG_PWRITEV_DONE);
 
-atomic_inc(>write_gen);
-bdrv_set_dirty(bs, offset, bytes);
-
-stat64_max(>wr_highest_offset, offset + bytes);
-
 if (ret >= 0) {
-bs->tota

[Qemu-devel] [PATCH v3 03/10] block: Use BdrvChild to discard

2018-07-10 Thread Fam Zheng
Other I/O functions are already using a BdrvChild pointer in the API, so
make discard do the same. It makes it possible to initiate the same
permission checks before doing I/O, and much easier to share the
helper functions for this, which will be added and used by write,
truncate and copy range paths.

Signed-off-by: Fam Zheng 
---
 block/blkdebug.c   |  2 +-
 block/blklogwrites.c   |  2 +-
 block/blkreplay.c  |  2 +-
 block/block-backend.c  |  2 +-
 block/copy-on-read.c   |  2 +-
 block/io.c | 18 +-
 block/mirror.c |  2 +-
 block/qcow2-refcount.c |  2 +-
 block/raw-format.c |  2 +-
 block/throttle.c   |  2 +-
 include/block/block.h  |  4 ++--
 11 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/block/blkdebug.c b/block/blkdebug.c
index 526af2a808..0457bf5b66 100644
--- a/block/blkdebug.c
+++ b/block/blkdebug.c
@@ -625,7 +625,7 @@ static int coroutine_fn 
blkdebug_co_pdiscard(BlockDriverState *bs,
 return err;
 }
 
-return bdrv_co_pdiscard(bs->file->bs, offset, bytes);
+return bdrv_co_pdiscard(bs->file, offset, bytes);
 }
 
 static int coroutine_fn blkdebug_co_block_status(BlockDriverState *bs,
diff --git a/block/blklogwrites.c b/block/blklogwrites.c
index 63bf6b34a9..34e98e4ff5 100644
--- a/block/blklogwrites.c
+++ b/block/blklogwrites.c
@@ -483,7 +483,7 @@ static int coroutine_fn 
blk_log_writes_co_do_file_flush(BlkLogWritesFileReq *fr)
 static int coroutine_fn
 blk_log_writes_co_do_file_pdiscard(BlkLogWritesFileReq *fr)
 {
-return bdrv_co_pdiscard(fr->bs->file->bs, fr->offset, fr->bytes);
+return bdrv_co_pdiscard(fr->bs->file, fr->offset, fr->bytes);
 }
 
 static int coroutine_fn
diff --git a/block/blkreplay.c b/block/blkreplay.c
index b016dbeee7..766150ade6 100755
--- a/block/blkreplay.c
+++ b/block/blkreplay.c
@@ -113,7 +113,7 @@ static int coroutine_fn 
blkreplay_co_pdiscard(BlockDriverState *bs,
   int64_t offset, int bytes)
 {
 uint64_t reqid = blkreplay_next_id();
-int ret = bdrv_co_pdiscard(bs->file->bs, offset, bytes);
+int ret = bdrv_co_pdiscard(bs->file, offset, bytes);
 block_request_create(reqid, bs, qemu_coroutine_self());
 qemu_coroutine_yield();
 
diff --git a/block/block-backend.c b/block/block-backend.c
index ac8c3e0b1c..fa120630be 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1559,7 +1559,7 @@ int blk_co_pdiscard(BlockBackend *blk, int64_t offset, 
int bytes)
 return ret;
 }
 
-return bdrv_co_pdiscard(blk_bs(blk), offset, bytes);
+return bdrv_co_pdiscard(blk->root, offset, bytes);
 }
 
 int blk_co_flush(BlockBackend *blk)
diff --git a/block/copy-on-read.c b/block/copy-on-read.c
index 1dcdaeed69..a19164f9eb 100644
--- a/block/copy-on-read.c
+++ b/block/copy-on-read.c
@@ -116,7 +116,7 @@ static int coroutine_fn 
cor_co_pwrite_zeroes(BlockDriverState *bs,
 static int coroutine_fn cor_co_pdiscard(BlockDriverState *bs,
 int64_t offset, int bytes)
 {
-return bdrv_co_pdiscard(bs->file->bs, offset, bytes);
+return bdrv_co_pdiscard(bs->file, offset, bytes);
 }
 
 
diff --git a/block/io.c b/block/io.c
index 600060c936..77f35b1013 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2611,7 +2611,7 @@ int bdrv_flush(BlockDriverState *bs)
 }
 
 typedef struct DiscardCo {
-BlockDriverState *bs;
+BdrvChild *child;
 int64_t offset;
 int bytes;
 int ret;
@@ -2620,17 +2620,17 @@ static void coroutine_fn bdrv_pdiscard_co_entry(void 
*opaque)
 {
 DiscardCo *rwco = opaque;
 
-rwco->ret = bdrv_co_pdiscard(rwco->bs, rwco->offset, rwco->bytes);
+rwco->ret = bdrv_co_pdiscard(rwco->child, rwco->offset, rwco->bytes);
 }
 
-int coroutine_fn bdrv_co_pdiscard(BlockDriverState *bs, int64_t offset,
-  int bytes)
+int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, int64_t offset, int bytes)
 {
 BdrvTrackedRequest req;
 int max_pdiscard, ret;
 int head, tail, align;
+BlockDriverState *bs = child->bs;
 
-if (!bs->drv) {
+if (!bs || !bs->drv) {
 return -ENOMEDIUM;
 }
 
@@ -2741,11 +2741,11 @@ out:
 return ret;
 }
 
-int bdrv_pdiscard(BlockDriverState *bs, int64_t offset, int bytes)
+int bdrv_pdiscard(BdrvChild *child, int64_t offset, int bytes)
 {
 Coroutine *co;
 DiscardCo rwco = {
-.bs = bs,
+.child = child,
 .offset = offset,
 .bytes = bytes,
 .ret = NOT_DONE,
@@ -2756,8 +2756,8 @@ int bdrv_pdiscard(BlockDriverState *bs, int64_t offset, 
int bytes)
 bdrv_pdiscard_co_entry();
 } else {
 co = qemu_coroutine_create(bdrv_pdiscard_co_entry, );
-bdrv_coroutine_enter(bs, co);
-BDRV_POLL_WHILE(bs, rwco.ret == NOT_DONE);
+bdrv_coroutine_enter(child->bs, co);
+BDRV_POLL_WHILE(child->bs, rwco.ret

[Qemu-devel] [PATCH v3 04/10] block: Use uint64_t for BdrvTrackedRequest byte fields

2018-07-10 Thread Fam Zheng
This matches the types used for bytes in the rest parts of block layer.
In the case of bdrv_co_truncate, new_bytes can be the image size which
probably doesn't fit in a 32 bit int.

Signed-off-by: Fam Zheng 
---
 block/io.c| 8 +---
 include/block/block_int.h | 4 ++--
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/block/io.c b/block/io.c
index 77f35b1013..c56e329bb5 100644
--- a/block/io.c
+++ b/block/io.c
@@ -587,9 +587,11 @@ static void tracked_request_end(BdrvTrackedRequest *req)
 static void tracked_request_begin(BdrvTrackedRequest *req,
   BlockDriverState *bs,
   int64_t offset,
-  unsigned int bytes,
+  uint64_t bytes,
   enum BdrvTrackedRequestType type)
 {
+assert(bytes <= INT64_MAX && offset <= INT64_MAX - bytes);
+
 *req = (BdrvTrackedRequest){
 .bs = bs,
 .offset = offset,
@@ -611,7 +613,7 @@ static void tracked_request_begin(BdrvTrackedRequest *req,
 static void mark_request_serialising(BdrvTrackedRequest *req, uint64_t align)
 {
 int64_t overlap_offset = req->offset & ~(align - 1);
-unsigned int overlap_bytes = ROUND_UP(req->offset + req->bytes, align)
+uint64_t overlap_bytes = ROUND_UP(req->offset + req->bytes, align)
- overlap_offset;
 
 if (!req->serialising) {
@@ -667,7 +669,7 @@ static int bdrv_get_cluster_size(BlockDriverState *bs)
 }
 
 static bool tracked_request_overlaps(BdrvTrackedRequest *req,
- int64_t offset, unsigned int bytes)
+ int64_t offset, uint64_t bytes)
 {
 /*    */
 if (offset >= req->overlap_offset + req->overlap_bytes) {
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 119b6e4ea5..90a33e4b0e 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -69,12 +69,12 @@ enum BdrvTrackedRequestType {
 typedef struct BdrvTrackedRequest {
 BlockDriverState *bs;
 int64_t offset;
-unsigned int bytes;
+uint64_t bytes;
 enum BdrvTrackedRequestType type;
 
 bool serialising;
 int64_t overlap_offset;
-unsigned int overlap_bytes;
+uint64_t overlap_bytes;
 
 QLIST_ENTRY(BdrvTrackedRequest) list;
 Coroutine *co; /* owner, used for deadlock detection */
-- 
2.17.1




[Qemu-devel] [PATCH v3 06/10] block: Fix handling of image enlarging write

2018-07-10 Thread Fam Zheng
Two problems exist when a write request that enlarges the image (i.e.
write beyond EOF) finishes:

1) parent is not notified about size change;
2) dirty bitmap is not resized although we try to set the dirty bits;

Fix them just like how bdrv_co_truncate works.

Reported-by: Kevin Wolf 
Signed-off-by: Fam Zheng 
---
 block/io.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/block/io.c b/block/io.c
index 960e1492d0..10a475302a 100644
--- a/block/io.c
+++ b/block/io.c
@@ -40,6 +40,7 @@
 
 static AioWait drain_all_aio_wait;
 
+static void bdrv_parent_cb_resize(BlockDriverState *bs);
 static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
 int64_t offset, int bytes, BdrvRequestFlags flags);
 
@@ -1599,13 +1600,16 @@ bdrv_co_write_req_finish(BdrvChild *child, int64_t 
offset, uint64_t bytes,
 BlockDriverState *bs = child->bs;
 
 atomic_inc(>write_gen);
-bdrv_set_dirty(bs, offset, bytes);
 
 stat64_max(>wr_highest_offset, offset + bytes);
 
-if (ret == 0) {
-bs->total_sectors = MAX(bs->total_sectors, end_sector);
+if (ret == 0 &&
+end_sector > bs->total_sectors) {
+bs->total_sectors = end_sector;
+bdrv_parent_cb_resize(bs);
+bdrv_dirty_bitmap_truncate(bs, end_sector << BDRV_SECTOR_BITS);
 }
+bdrv_set_dirty(bs, offset, bytes);
 }
 
 /*
-- 
2.17.1




[Qemu-devel] [PATCH v3 02/10] block: Add copy offloading trace points

2018-07-10 Thread Fam Zheng
A few trace points that can help reveal what is happening in a copy
offloading I/O path.

Signed-off-by: Fam Zheng 
---
 block/file-posix.c | 2 ++
 block/io.c | 4 
 block/iscsi.c  | 3 +++
 block/trace-events | 6 ++
 4 files changed, 15 insertions(+)

diff --git a/block/file-posix.c b/block/file-posix.c
index 1185c7c5cc..09f6b938f6 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -1488,6 +1488,8 @@ static ssize_t handle_aiocb_copy_range(RawPosixAIOData 
*aiocb)
 ssize_t ret = copy_file_range(aiocb->aio_fildes, _off,
   aiocb->aio_fd2, _off,
   bytes, 0);
+trace_file_copy_file_range(aiocb->bs, aiocb->aio_fildes, in_off,
+   aiocb->aio_fd2, out_off, bytes, 0, ret);
 if (ret == 0) {
 /* No progress (e.g. when beyond EOF), let the caller fall back to
  * buffer I/O. */
diff --git a/block/io.c b/block/io.c
index 102cb0e1ba..600060c936 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2997,6 +2997,8 @@ int coroutine_fn bdrv_co_copy_range_from(BdrvChild *src, 
uint64_t src_offset,
  BdrvRequestFlags read_flags,
  BdrvRequestFlags write_flags)
 {
+trace_bdrv_co_copy_range_from(src, src_offset, dst, dst_offset, bytes,
+  read_flags, write_flags);
 return bdrv_co_copy_range_internal(src, src_offset, dst, dst_offset,
bytes, read_flags, write_flags, true);
 }
@@ -3011,6 +3013,8 @@ int coroutine_fn bdrv_co_copy_range_to(BdrvChild *src, 
uint64_t src_offset,
BdrvRequestFlags read_flags,
BdrvRequestFlags write_flags)
 {
+trace_bdrv_co_copy_range_to(src, src_offset, dst, dst_offset, bytes,
+read_flags, write_flags);
 return bdrv_co_copy_range_internal(src, src_offset, dst, dst_offset,
bytes, read_flags, write_flags, false);
 }
diff --git a/block/iscsi.c b/block/iscsi.c
index 38b917a1e5..bb69faf34a 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -44,6 +44,7 @@
 #include "qapi/qmp/qstring.h"
 #include "crypto/secret.h"
 #include "scsi/utils.h"
+#include "trace.h"
 
 /* Conflict between scsi/utils.h and libiscsi! :( */
 #define SCSI_XFER_NONE ISCSI_XFER_NONE
@@ -2399,6 +2400,8 @@ retry:
 }
 
 out_unlock:
+
+trace_iscsi_xcopy(src_lun, src_offset, dst_lun, dst_offset, bytes, r);
 g_free(iscsi_task.task);
 qemu_mutex_unlock(_lun->mutex);
 g_free(iscsi_task.err_str);
diff --git a/block/trace-events b/block/trace-events
index 854d5525af..3e8c47bb24 100644
--- a/block/trace-events
+++ b/block/trace-events
@@ -15,6 +15,8 @@ bdrv_co_preadv(void *bs, int64_t offset, int64_t nbytes, 
unsigned int flags) "bs
 bdrv_co_pwritev(void *bs, int64_t offset, int64_t nbytes, unsigned int flags) 
"bs %p offset %"PRId64" nbytes %"PRId64" flags 0x%x"
 bdrv_co_pwrite_zeroes(void *bs, int64_t offset, int count, int flags) "bs %p 
offset %"PRId64" count %d flags 0x%x"
 bdrv_co_do_copy_on_readv(void *bs, int64_t offset, unsigned int bytes, int64_t 
cluster_offset, int64_t cluster_bytes) "bs %p offset %"PRId64" bytes %u 
cluster_offset %"PRId64" cluster_bytes %"PRId64
+bdrv_co_copy_range_from(void *src, uint64_t src_offset, void *dst, uint64_t 
dst_offset, uint64_t bytes, int read_flags, int write_flags) "src %p offset 
%"PRIu64" dst %p offset %"PRIu64" bytes %"PRIu64" rw flags 0x%x 0x%x"
+bdrv_co_copy_range_to(void *src, uint64_t src_offset, void *dst, uint64_t 
dst_offset, uint64_t bytes, int read_flags, int write_flags) "src %p offset 
%"PRIu64" dst %p offset %"PRIu64" bytes %"PRIu64" rw flags 0x%x 0x%x"
 
 # block/stream.c
 stream_one_iteration(void *s, int64_t offset, uint64_t bytes, int 
is_allocated) "s %p offset %" PRId64 " bytes %" PRIu64 " is_allocated %d"
@@ -57,6 +59,7 @@ qmp_block_stream(void *bs, void *job) "bs %p job %p"
 # block/file-posix.c
 file_paio_submit_co(int64_t offset, int count, int type) "offset %"PRId64" 
count %d type %d"
 file_paio_submit(void *acb, void *opaque, int64_t offset, int count, int type) 
"acb %p opaque %p offset %"PRId64" count %d type %d"
+file_copy_file_range(void *bs, int src, int64_t src_off, int dst, int64_t 
dst_off, int64_t bytes, int flags, int64_t ret) "bs %p src_fd %d offset 
%"PRIu64" dst_fd %d offset %"PRIu64" bytes %"PRIu64" flags %d ret %"PRId64
 
 # block/qcow2.c
 qcow2_writev_start_req(void *co, int64_t offset, int bytes) "co %p offset 0x%&

[Qemu-devel] [PATCH v3 01/10] block: Prefix file driver trace points with "file_"

2018-07-10 Thread Fam Zheng
With in one module, trace points usually have a common prefix named
after the module name. paio_submit and paio_submit_co are the only two
trace points so far in the two file protocol drivers. As we are adding
more, having a common prefix here is better so that trace points can be
enabled with a glob. Rename them.

Suggested-by: Kevin Wolf 
Signed-off-by: Fam Zheng 
---
 block/file-posix.c | 2 +-
 block/file-win32.c | 2 +-
 block/trace-events | 4 ++--
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index 4fec8cb53c..1185c7c5cc 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -1743,7 +1743,7 @@ static int paio_submit_co_full(BlockDriverState *bs, int 
fd,
 assert(qiov->size == bytes);
 }
 
-trace_paio_submit_co(offset, bytes, type);
+trace_file_paio_submit_co(offset, bytes, type);
 pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
 return thread_pool_submit_co(pool, aio_worker, acb);
 }
diff --git a/block/file-win32.c b/block/file-win32.c
index 0411fe80fd..f1e2187f3b 100644
--- a/block/file-win32.c
+++ b/block/file-win32.c
@@ -162,7 +162,7 @@ static BlockAIOCB *paio_submit(BlockDriverState *bs, HANDLE 
hfile,
 acb->aio_nbytes = count;
 acb->aio_offset = offset;
 
-trace_paio_submit(acb, opaque, offset, count, type);
+trace_file_paio_submit(acb, opaque, offset, count, type);
 pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
 return thread_pool_submit_aio(pool, aio_worker, acb, cb, opaque);
 }
diff --git a/block/trace-events b/block/trace-events
index c35287b48a..854d5525af 100644
--- a/block/trace-events
+++ b/block/trace-events
@@ -55,8 +55,8 @@ qmp_block_stream(void *bs, void *job) "bs %p job %p"
 
 # block/file-win32.c
 # block/file-posix.c
-paio_submit_co(int64_t offset, int count, int type) "offset %"PRId64" count %d 
type %d"
-paio_submit(void *acb, void *opaque, int64_t offset, int count, int type) "acb 
%p opaque %p offset %"PRId64" count %d type %d"
+file_paio_submit_co(int64_t offset, int count, int type) "offset %"PRId64" 
count %d type %d"
+file_paio_submit(void *acb, void *opaque, int64_t offset, int count, int type) 
"acb %p opaque %p offset %"PRId64" count %d type %d"
 
 # block/qcow2.c
 qcow2_writev_start_req(void *co, int64_t offset, int bytes) "co %p offset 0x%" 
PRIx64 " bytes %d"
-- 
2.17.1




[Qemu-devel] [PATCH v3 00/10] block: Fix dst reading after tail copy offloading

2018-07-10 Thread Fam Zheng
Based-on: 20180709163719.87107-1-vsement...@virtuozzo.com

v3: - Rebase onto Vladmir's:
  [PATCH v5 0/4] fix image fleecing
  on top of master, which has blklogwrites to be converted to BdrvChild for
  the bdrv_co_pdiscard() parameter.  [Kevin]
- Add file_ prefix to file protocol trace points. [Kevin]
- Assert that BdrvTrackedRequest bytes and offset don't overflow INT64_MAX.
  [Kevin]
- Don't misuse req->offset/req->bytes in bdrv_co_write_req_prepare/finish.
  [Kevin]
- Fix stat64_max. [Kevin]
- Keep lines within 80 columns. [Kevin]
- Grammar fix in commit message and rev-by. [Eric]

Qcow2 allocates new clusters after the end of the file. If it is the destinaton
of copy offloading, we must adjust dst->bs->total_sectors. Otherwise, further
reads will drop to the "beyond EOF" code path and return zeroes, which problem
is caught by iotests 222.

Follow the logic in the normal write code and update bs->total_sectors after
I/O is done.

While at it, add a few convenient trace points to aid future debug experiences
in the topic.

Fam Zheng (10):
  block: Prefix file driver trace points with "file_"
  block: Add copy offloading trace points
  block: Use BdrvChild to discard
  block: Use uint64_t for BdrvTrackedRequest byte fields
  block: Extract common write req handling
  block: Fix handling of image enlarging write
  block: Use common req handling for discard
  block: Use common req handling in copy offloading
  block: Fix bdrv_co_truncate overlap check
  block: Use common write req handling in truncate

 block/blkdebug.c  |   2 +-
 block/blklogwrites.c  |   2 +-
 block/blkreplay.c |   2 +-
 block/block-backend.c |   2 +-
 block/copy-on-read.c  |   2 +-
 block/file-posix.c|   4 +-
 block/file-win32.c|   2 +-
 block/io.c| 213 --
 block/iscsi.c |   3 +
 block/mirror.c|   2 +-
 block/qcow2-refcount.c|   2 +-
 block/raw-format.c|   2 +-
 block/throttle.c  |   2 +-
 block/trace-events|  10 +-
 include/block/block.h |   4 +-
 include/block/block_int.h |   4 +-
 16 files changed, 163 insertions(+), 95 deletions(-)

-- 
2.17.1




Re: [Qemu-devel] [PATCH v2 8/9] block: Fix bdrv_co_truncate overlap check

2018-07-09 Thread Fam Zheng
On Fri, 07/06 17:09, Eric Blake wrote:
> On 07/05/2018 02:37 AM, Fam Zheng wrote:
> > If we are growing the image and potentially using preallocation for the
> > new area, we need to make sure that no write requests are made to the
> > "preallocated" area which [@old_size, @offset), not [@offset, offset * 2
> 
> s/which/which is/
> 
> > - @old_size).
> > 
> > Signed-off-by: Fam Zheng 
> > ---
> >   block/io.c | 3 ++-
> >   1 file changed, 2 insertions(+), 1 deletion(-)
> 
> Reviewed-by: Eric Blake 
> 
> > 
> > diff --git a/block/io.c b/block/io.c
> > index d07849fa96..ed18eb0ca3 100644
> > --- a/block/io.c
> > +++ b/block/io.c
> > @@ -3070,7 +3070,8 @@ int coroutine_fn bdrv_co_truncate(BdrvChild *child, 
> > int64_t offset,
> >   }
> >   bdrv_inc_in_flight(bs);
> > -tracked_request_begin(, bs, offset, new_bytes, 
> > BDRV_TRACKED_TRUNCATE);
> > +tracked_request_begin(, bs, offset - new_bytes, new_bytes,
> > +  BDRV_TRACKED_TRUNCATE);
> 
> Is it any more legible to do s/offset - new_bytes/old_size/, since those are
> equivalent?

No they are not. offset - new_bytes is either old_size (if expanding), or
smaller than old_size (if shrinking).

Fam

> 
> >   /* If we are growing the image and potentially using preallocation 
> > for the
> >* new area, we need to make sure that no write requests are made to 
> > it
> > 
> 
> -- 
> Eric Blake, Principal Software Engineer
> Red Hat, Inc.   +1-919-301-3266
> Virtualization:  qemu.org | libvirt.org



Re: [Qemu-devel] [PATCH v5 4/4] block/backup: fix fleecing scheme: use serialized writes

2018-07-09 Thread Fam Zheng
 job->target, start, nbytes,
> -is_write_notifier ? BDRV_REQ_NO_SERIALISING : 0, 
> 0);
> +read_flags, write_flags);
>  if (ret < 0) {
>  trace_backup_do_cow_copy_range_fail(job, start, ret);
>  hbitmap_set(job->copy_bitmap, start / job->cluster_size,
> @@ -701,6 +706,9 @@ BlockJob *backup_job_create(const char *job_id, 
> BlockDriverState *bs,
> sync_bitmap : NULL;
>  job->compress = compress;
>  
> +/* Detect image-fleecing (and similar) schemes */
> +job->serialize_target_writes = bdrv_chain_contains(target, bs);
> +
>  /* If there is no backing file on the target, we cannot rely on COW if 
> our
>   * backup cluster size is smaller than the target cluster size. Even for
>   * targets with a backing file, try to avoid COW if possible. */

I always thought mirror is by far the most complicated job, but now backup is
catching up quickly!

Reviewed-by: Fam Zheng 




Re: [Qemu-devel] [PATCH v5 3/4] block: add BDRV_REQ_SERIALISING flag

2018-07-09 Thread Fam Zheng
ad_flags & BDRV_REQ_SERIALISING));
>  if (!(read_flags & BDRV_REQ_NO_SERIALISING)) {
>  wait_serialising_requests();
>  }
> @@ -2948,6 +2969,9 @@ static int coroutine_fn bdrv_co_copy_range_internal(
>  
>  /* BDRV_REQ_NO_SERIALISING is only for read operation */
>  assert(!(write_flags & BDRV_REQ_NO_SERIALISING));
> +if (write_flags & BDRV_REQ_SERIALISING) {
> +mark_request_serialising(, bdrv_get_cluster_size(dst->bs));
> +}
>  wait_serialising_requests();
>  
>  ret = dst->bs->drv->bdrv_co_copy_range_to(dst->bs,
> -- 
> 2.11.1
> 

Reviewed-by: Fam Zheng 

Fam



Re: [Qemu-devel] [PATCH v5 2/4] block: split flags in copy_range

2018-07-09 Thread Fam Zheng
On Mon, 07/09 19:37, Vladimir Sementsov-Ogievskiy wrote:
> Pass read flags and write flags separately. This is needed to handle
> coming BDRV_REQ_NO_SERIALISING clearly in following patches.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 

Reviewed-by: Fam Zheng 




Re: [Qemu-devel] [PATCH v5 1/4] block/io: fix copy_range

2018-07-09 Thread Fam Zheng
On Mon, 07/09 19:37, Vladimir Sementsov-Ogievskiy wrote:
> Here two things are fixed:
> 
> 1. Architecture
> 
> On each recursion step, we go to the child of src or dst, only for one
> of them. So, it's wrong to create tracked requests for both on each
> step. It leads to tracked requests duplication.
> 
> 2. Wait for serializing requests on write path independently of
>BDRV_REQ_NO_SERIALISING
> 
> Before commit 9ded4a01149 "backup: Use copy offloading",
> BDRV_REQ_NO_SERIALISING was used for only one case: read in
> copy-on-write operation during backup. Also, the flag was handled only
> on read path (in bdrv_co_preadv and bdrv_aligned_preadv).
> 
> After 9ded4a01149, flag is used for not waiting serializing operations
> on backup target (in same case of copy-on-write operation). This
> behavior change is unsubstantiated and potentially dangerous, let's
> drop it and add additional asserts and documentation.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 

Reviewed-by: Fam Zheng 




Re: [Qemu-devel] [PATCH v4 1/4] block/io: fix copy_range

2018-07-09 Thread Fam Zheng
On Mon, 07/09 17:38, Vladimir Sementsov-Ogievskiy wrote:
> 09.07.2018 16:17, Fam Zheng wrote:
> > On Mon, 07/09 12:43, Vladimir Sementsov-Ogievskiy wrote:
> > > 09.07.2018 04:15, Fam Zheng wrote:
> > > > On Fri, 07/06 21:30, Vladimir Sementsov-Ogievskiy wrote:
> > > > > Here two things are fixed:
> > > > > 
> > > > > 1. Architecture
> > > > > 
> > > > > On each recursion step, we go to the child of src or dst, only for one
> > > > > of them. So, it's wrong to create tracked requests for both on each
> > > > > step. It leads to tracked requests duplication.
> > > > > 
> > > > > 2. Wait for serializing requests on write path independently of
> > > > >  BDRV_REQ_NO_SERIALISING
> > > > > 
> > > > > Before commit 9ded4a01149 "backup: Use copy offloading",
> > > > > BDRV_REQ_NO_SERIALISING was used for only one case: read in
> > > > > copy-on-write operation during backup. Also, the flag was handled only
> > > > > on read path (in bdrv_co_preadv and bdrv_aligned_preadv).
> > > > > 
> > > > > After 9ded4a01149, flag is used for not waiting serializing operations
> > > > > on backup target (in same case of copy-on-write operation). This
> > > > > behavior change is unsubstantiated and potentially dangerous, let's
> > > > > drop it and add additional asserts and documentation.
> > > > > 
> > > > > Signed-off-by: Vladimir Sementsov-Ogievskiy 
> > > > > ---
> > > > >include/block/block.h |  13 +++
> > > > >block/io.c| 103 
> > > > > +++---
> > > > >2 files changed, 78 insertions(+), 38 deletions(-)
> > > > > 
> > > > > diff --git a/include/block/block.h b/include/block/block.h
> > > > > index e5c7759a0c..a06a4d27de 100644
> > > > > --- a/include/block/block.h
> > > > > +++ b/include/block/block.h
> > > > > @@ -50,6 +50,19 @@ typedef enum {
> > > > > * opened with BDRV_O_UNMAP.
> > > > > */
> > > > >BDRV_REQ_MAY_UNMAP  = 0x4,
> > > > > +
> > > > > +/* The BDRV_REQ_NO_SERIALISING means that we don't want to
> > > > > + * wait_serialising_requests(), when reading.
> > > > > + *
> > > > > + * This flag is used for backup copy on write operation, when we 
> > > > > need to
> > > > > + * read old data before write (write notifier triggered). It is 
> > > > > ok, due to
> > > > > + * we already waited for serializing requests in initiative 
> > > > > write (see
> > > > > + * bdrv_aligned_pwritev), and it is necessary for the case when 
> > > > > initiative
> > > > > + * write is serializing itself (we'll dead lock waiting it).
> > > > > + *
> > > > > + * The described case is the only usage for the flag for now, 
> > > > > so, it is
> > > > > + * supported only for read operation and restricted for write.
> > > > > + */
> > > > >BDRV_REQ_NO_SERIALISING = 0x8,
> > > > >BDRV_REQ_FUA= 0x10,
> > > > >BDRV_REQ_WRITE_COMPRESSED   = 0x20,
> > > > > diff --git a/block/io.c b/block/io.c
> > > > > index 1a2272fad3..621b21c455 100644
> > > > > --- a/block/io.c
> > > > > +++ b/block/io.c
> > > > > @@ -1572,6 +1572,8 @@ static int coroutine_fn 
> > > > > bdrv_aligned_pwritev(BdrvChild *child,
> > > > >max_transfer = 
> > > > > QEMU_ALIGN_DOWN(MIN_NON_ZERO(bs->bl.max_transfer, INT_MAX),
> > > > >   align);
> > > > > +/* BDRV_REQ_NO_SERIALISING is only for read operation */
> > > > > +assert(!(flags & BDRV_REQ_NO_SERIALISING));
> > > > >waited = wait_serialising_requests(req);
> > > > >assert(!waited || !req->serialising);
> > > > >assert(req->overlap_offset <= offset);
> > > > > @@ -2888,15 +2890,19 @@ void bdrv_unregister_buf(BlockDriverState 
> > > > > *bs, void *host)
> > > > >}
> > > > >}
>

Re: [Qemu-devel] [PATCH v4 1/4] block/io: fix copy_range

2018-07-09 Thread Fam Zheng
On Mon, 07/09 12:43, Vladimir Sementsov-Ogievskiy wrote:
> 09.07.2018 04:15, Fam Zheng wrote:
> > On Fri, 07/06 21:30, Vladimir Sementsov-Ogievskiy wrote:
> > > Here two things are fixed:
> > > 
> > > 1. Architecture
> > > 
> > > On each recursion step, we go to the child of src or dst, only for one
> > > of them. So, it's wrong to create tracked requests for both on each
> > > step. It leads to tracked requests duplication.
> > > 
> > > 2. Wait for serializing requests on write path independently of
> > > BDRV_REQ_NO_SERIALISING
> > > 
> > > Before commit 9ded4a01149 "backup: Use copy offloading",
> > > BDRV_REQ_NO_SERIALISING was used for only one case: read in
> > > copy-on-write operation during backup. Also, the flag was handled only
> > > on read path (in bdrv_co_preadv and bdrv_aligned_preadv).
> > > 
> > > After 9ded4a01149, flag is used for not waiting serializing operations
> > > on backup target (in same case of copy-on-write operation). This
> > > behavior change is unsubstantiated and potentially dangerous, let's
> > > drop it and add additional asserts and documentation.
> > > 
> > > Signed-off-by: Vladimir Sementsov-Ogievskiy 
> > > ---
> > >   include/block/block.h |  13 +++
> > >   block/io.c| 103 
> > > +++---
> > >   2 files changed, 78 insertions(+), 38 deletions(-)
> > > 
> > > diff --git a/include/block/block.h b/include/block/block.h
> > > index e5c7759a0c..a06a4d27de 100644
> > > --- a/include/block/block.h
> > > +++ b/include/block/block.h
> > > @@ -50,6 +50,19 @@ typedef enum {
> > >* opened with BDRV_O_UNMAP.
> > >*/
> > >   BDRV_REQ_MAY_UNMAP  = 0x4,
> > > +
> > > +/* The BDRV_REQ_NO_SERIALISING means that we don't want to
> > > + * wait_serialising_requests(), when reading.
> > > + *
> > > + * This flag is used for backup copy on write operation, when we 
> > > need to
> > > + * read old data before write (write notifier triggered). It is ok, 
> > > due to
> > > + * we already waited for serializing requests in initiative write 
> > > (see
> > > + * bdrv_aligned_pwritev), and it is necessary for the case when 
> > > initiative
> > > + * write is serializing itself (we'll dead lock waiting it).
> > > + *
> > > + * The described case is the only usage for the flag for now, so, it 
> > > is
> > > + * supported only for read operation and restricted for write.
> > > + */
> > >   BDRV_REQ_NO_SERIALISING = 0x8,
> > >   BDRV_REQ_FUA= 0x10,
> > >   BDRV_REQ_WRITE_COMPRESSED   = 0x20,
> > > diff --git a/block/io.c b/block/io.c
> > > index 1a2272fad3..621b21c455 100644
> > > --- a/block/io.c
> > > +++ b/block/io.c
> > > @@ -1572,6 +1572,8 @@ static int coroutine_fn 
> > > bdrv_aligned_pwritev(BdrvChild *child,
> > >   max_transfer = QEMU_ALIGN_DOWN(MIN_NON_ZERO(bs->bl.max_transfer, 
> > > INT_MAX),
> > >  align);
> > > +/* BDRV_REQ_NO_SERIALISING is only for read operation */
> > > +assert(!(flags & BDRV_REQ_NO_SERIALISING));
> > >   waited = wait_serialising_requests(req);
> > >   assert(!waited || !req->serialising);
> > >   assert(req->overlap_offset <= offset);
> > > @@ -2888,15 +2890,19 @@ void bdrv_unregister_buf(BlockDriverState *bs, 
> > > void *host)
> > >   }
> > >   }
> > > -static int coroutine_fn bdrv_co_copy_range_internal(BdrvChild *src,
> > > -uint64_t src_offset,
> > > -BdrvChild *dst,
> > > -uint64_t dst_offset,
> > > -uint64_t bytes,
> > > -BdrvRequestFlags 
> > > flags,
> > > -bool recurse_src)
> > > +/* Common part of bdrv_co_copy_range_from and bdrv_co_copy_range_to.
> > > + *
> > > + * Return -errno on failure,
> > > + *0 if successfully handled by bdrv_co_pwrite_zeroes
> >

Re: [Qemu-devel] [PATCH v2 9/9] block: Use common write req handling in truncate

2018-07-08 Thread Fam Zheng
On Fri, 07/06 17:12, Eric Blake wrote:
> On 07/05/2018 02:37 AM, Fam Zheng wrote:
> > Truncation is the last to convert from open coded req handling to
> > reusing helpers. This time the permission check in prepare has to adapt
> > to the new caller: it checks a different permission bit, and don't
> 
> did you mean "won't" or "doesn't"?

Yeah, "doesn't'. Thanks!

Fam



Re: [Qemu-devel] [PATCH v4 1/4] block/io: fix copy_range

2018-07-08 Thread Fam Zheng
On Fri, 07/06 21:30, Vladimir Sementsov-Ogievskiy wrote:
> Here two things are fixed:
> 
> 1. Architecture
> 
> On each recursion step, we go to the child of src or dst, only for one
> of them. So, it's wrong to create tracked requests for both on each
> step. It leads to tracked requests duplication.
> 
> 2. Wait for serializing requests on write path independently of
>BDRV_REQ_NO_SERIALISING
> 
> Before commit 9ded4a01149 "backup: Use copy offloading",
> BDRV_REQ_NO_SERIALISING was used for only one case: read in
> copy-on-write operation during backup. Also, the flag was handled only
> on read path (in bdrv_co_preadv and bdrv_aligned_preadv).
> 
> After 9ded4a01149, flag is used for not waiting serializing operations
> on backup target (in same case of copy-on-write operation). This
> behavior change is unsubstantiated and potentially dangerous, let's
> drop it and add additional asserts and documentation.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  include/block/block.h |  13 +++
>  block/io.c| 103 
> +++---
>  2 files changed, 78 insertions(+), 38 deletions(-)
> 
> diff --git a/include/block/block.h b/include/block/block.h
> index e5c7759a0c..a06a4d27de 100644
> --- a/include/block/block.h
> +++ b/include/block/block.h
> @@ -50,6 +50,19 @@ typedef enum {
>   * opened with BDRV_O_UNMAP.
>   */
>  BDRV_REQ_MAY_UNMAP  = 0x4,
> +
> +/* The BDRV_REQ_NO_SERIALISING means that we don't want to
> + * wait_serialising_requests(), when reading.
> + *
> + * This flag is used for backup copy on write operation, when we need to
> + * read old data before write (write notifier triggered). It is ok, due 
> to
> + * we already waited for serializing requests in initiative write (see
> + * bdrv_aligned_pwritev), and it is necessary for the case when 
> initiative
> + * write is serializing itself (we'll dead lock waiting it).
> + *
> + * The described case is the only usage for the flag for now, so, it is
> + * supported only for read operation and restricted for write.
> + */
>  BDRV_REQ_NO_SERIALISING = 0x8,
>  BDRV_REQ_FUA= 0x10,
>  BDRV_REQ_WRITE_COMPRESSED   = 0x20,
> diff --git a/block/io.c b/block/io.c
> index 1a2272fad3..621b21c455 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -1572,6 +1572,8 @@ static int coroutine_fn bdrv_aligned_pwritev(BdrvChild 
> *child,
>  max_transfer = QEMU_ALIGN_DOWN(MIN_NON_ZERO(bs->bl.max_transfer, 
> INT_MAX),
> align);
>  
> +/* BDRV_REQ_NO_SERIALISING is only for read operation */
> +assert(!(flags & BDRV_REQ_NO_SERIALISING));
>  waited = wait_serialising_requests(req);
>  assert(!waited || !req->serialising);
>  assert(req->overlap_offset <= offset);
> @@ -2888,15 +2890,19 @@ void bdrv_unregister_buf(BlockDriverState *bs, void 
> *host)
>  }
>  }
>  
> -static int coroutine_fn bdrv_co_copy_range_internal(BdrvChild *src,
> -uint64_t src_offset,
> -BdrvChild *dst,
> -uint64_t dst_offset,
> -uint64_t bytes,
> -BdrvRequestFlags flags,
> -bool recurse_src)
> +/* Common part of bdrv_co_copy_range_from and bdrv_co_copy_range_to.
> + *
> + * Return -errno on failure,
> + *0 if successfully handled by bdrv_co_pwrite_zeroes
> + *1 to continue copy_range operation
> + */
> +static int coroutine_fn bdrv_co_copy_range_check(BdrvChild *src,
> + uint64_t src_offset,
> + BdrvChild *dst,
> + uint64_t dst_offset,
> + uint64_t bytes,
> + BdrvRequestFlags flags)
>  {
> -BdrvTrackedRequest src_req, dst_req;
>  int ret;
>  
>  if (!dst || !dst->bs) {
> @@ -2923,33 +2929,8 @@ static int coroutine_fn 
> bdrv_co_copy_range_internal(BdrvChild *src,
>  || src->bs->encrypted || dst->bs->encrypted) {
>  return -ENOTSUP;
>  }
> -bdrv_inc_in_flight(src->bs);
> -bdrv_inc_in_flight(dst->bs);
> -tracked_request_begin(_req, src->bs, src_offset,
> -  bytes, BDRV_TRACKED_READ);
> -tracked_request_begin(_req, dst->bs, dst_offset,
> -  bytes, BDRV_TRACKED_WRITE);
>  
> -if (!(flags & BDRV_REQ_NO_SERIALISING)) {
> -wait_serialising_requests(_req);
> -wait_serialising_requests(_req);
> -}
> -if (recurse_src) {
> -ret = src->bs->drv->bdrv_co_copy_range_from(src->bs,
> - 

Re: [Qemu-devel] [PATCH v3 0/4] fix image fleecing

2018-07-06 Thread Fam Zheng
On Thu, 07/05 10:46, Vladimir Sementsov-Ogievskiy wrote:
> Hi all.
> 
> This fixes image fleecing scheme for 3.0, details are in 04 patch.

Looks like this breaks 'test-replication':

http://patchew.org/QEMU/20180705074638.770905-1-vsement...@virtuozzo.com/

test-replication: /stor/work/qemu/block/io.c:725: wait_serialising_requests: 
Assertion `qemu_coroutine_self() != req->co' failed.
GTester: last random seed: R02Se49967625f19c5be9918a2503fb2fff5
test-replication: /stor/work/qemu/block/io.c:725: wait_serialising_requests: 
Assertion `qemu_coroutine_self() != req->co' failed.
  GTESTER tests/test-qht-par
GTester: last random seed: R02S8fad2bb2daddc51eb51623bb7c86cb67
test-replication: /stor/work/qemu/block/io.c:725: wait_serialising_requests: 
Assertion `qemu_coroutine_self() != req->co' failed.
GTester: last random seed: R02S024718e9b1e2b1facafe254c43e1430b

Fam



Re: [Qemu-devel] [PATCH v2 6/9] block: Use common req handling for discard

2018-07-06 Thread Fam Zheng
On Thu, 07/05 14:44, Kevin Wolf wrote:
> Am 05.07.2018 um 09:36 hat Fam Zheng geschrieben:
> > Reuse the new bdrv_co_write_req_prepare/finish helpers. The variation
> > here is that discard requests don't affect bs->wr_highest_offset.
> > 
> > Signed-off-by: Fam Zheng 
> > ---
> >  block/io.c | 21 ++---
> >  1 file changed, 14 insertions(+), 7 deletions(-)
> > 
> > diff --git a/block/io.c b/block/io.c
> > index f06978dda0..912fcb962a 100644
> > --- a/block/io.c
> > +++ b/block/io.c
> > @@ -1582,7 +1582,18 @@ bdrv_co_write_req_finish(BdrvChild *child, 
> > BdrvTrackedRequest *req, int ret)
> >  bdrv_parent_cb_resize(bs);
> >  bdrv_dirty_bitmap_truncate(bs, end_sector << BDRV_SECTOR_BITS);
> >  }
> > -bdrv_set_dirty(bs, req->offset, req->bytes);
> > +if (req->bytes) {
> > +switch (req->type) {
> > +case BDRV_TRACKED_WRITE:
> > +stat64_max(>wr_highest_offset, req->offset + req->bytes);
> 
> You forgot to remove this line above, so now this one is redundant and
> we still execute it always.
> 
> Apart from that, why shouldn't discard be included in
> bs->wr_highest_offset? It's an access to an area in the image that must
> be present, so it indicates a larger file size, doesn't it?

I'm not sure. wr_highest_offset is used for management to allocate disk space. A
discard request is on the contrary for releasing disk space. Since guest is
allowed to discard unallocated sectors even though it should be nop in the
backend, such an operation shouldn't cause a user visible change in
@wr_highest_offset in QMP.

Fam

> 
> > +/* fall through, to set dirty bits */
> > +case BDRV_TRACKED_DISCARD:
> > +bdrv_set_dirty(bs, req->offset, req->bytes);
> > +break;
> > +default:
> > +break;
> > +}
> > +}
> >  }
> 
> Kevin



Re: [Qemu-devel] [PATCH v2 1/9] block: Add copy offloading trace points

2018-07-06 Thread Fam Zheng
On Thu, 07/05 13:08, Kevin Wolf wrote:
> Am 05.07.2018 um 09:36 hat Fam Zheng geschrieben:
> > A few trace points that can help reveal what is happening in a copy
> > offloading I/O path.
> > 
> > Signed-off-by: Fam Zheng 
> > ---
> >  block/file-posix.c | 2 ++
> >  block/io.c | 2 ++
> >  block/iscsi.c  | 3 +++
> >  block/trace-events | 6 ++
> >  4 files changed, 13 insertions(+)
> > 
> > diff --git a/block/file-posix.c b/block/file-posix.c
> > index 829ee538d8..d3b1609410 100644
> > --- a/block/file-posix.c
> > +++ b/block/file-posix.c
> > @@ -1488,6 +1488,8 @@ static ssize_t 
> > handle_aiocb_copy_range(RawPosixAIOData *aiocb)
> >  ssize_t ret = copy_file_range(aiocb->aio_fildes, _off,
> >aiocb->aio_fd2, _off,
> >bytes, 0);
> > +trace_copy_file_range(aiocb->bs, aiocb->aio_fildes, in_off,
> > +  aiocb->aio_fd2, out_off, bytes, 0, ret);
> 
> I think it's preferable to have a common prefix for all trace points in
> a driver, so they can be enabled with a glob.
> 
> paio_* is the existing one for thread pool based file-posix trace
> points. Not sure if we like it or want to replace it with something
> else.

Yes, I think it makes sense to add a "file_" prefix.

Fam



Re: [Qemu-devel] [PATCH v4] module: Use QEMU_MODULE_DIR as a search path

2018-07-05 Thread Fam Zheng
On Wed, 07/04 14:10, ryang wrote:
> The current paths for modules are CONFIG_QEMU_MODDIR and paths relative
> to the executable. Qemu and its modules can be installed and executed in
> paths that are different from these search paths. This change allows
> a search path to be specified by environment variable.
> 
> An example usage for this is postmarketOS[1]. This is a build environment
> for Alpine Linux. It sets up Alpine Linux in a chroot environment.
> Alpine's Qemu packages are installed in the chroot. The Alpine Linux Qemu
> package is used to test compiled Alpine Linux system images. This way there
> isn't a reliance on the which ever version of Qemu the host system / distro
> provides.
> 
> postmarketOS executes Qemu on host system outside of the chroot
> The Qemu module search path needs to point to the location of the
> chroot relative to the host system.
> 
> e.g.
> The root of the Alpine Linux chroot is:
> ~/.local/var/pmbootstrap/chroot_native/
> 
> Alpine's Qemu is installed at
> ~/.local/var/pmbootstrap/chroot_native/usr/bin/
> 
> The Qemu module search path needs to be:
> QEMU_MODULE_DIR=~/.local/var/pmbootstrap/chroot_native/usr/lib/qemu/
> 
> [1] https://postmarketos.org/
> 
> Signed-off-by: ryang 
> ---
> 
> v2:
>  - fix checkpatch errors
> v3:
>  - initialize n_dirs variable
> v4:
>  - change env variable suffix from _PATH to _DIR
>  - renamed search_path to search_dir and make it const
> 
>  util/module.c | 22 ++
>  1 file changed, 14 insertions(+), 8 deletions(-)
> 
> diff --git a/util/module.c b/util/module.c
> index c909737..1259dd3 100644
> --- a/util/module.c
> +++ b/util/module.c
> @@ -162,9 +162,10 @@ void module_load_one(const char *prefix, const char 
> *lib_name)
>  #ifdef CONFIG_MODULES
>  char *fname = NULL;
>  char *exec_dir;
> -char *dirs[3];
> +const char *search_dir;
> +char *dirs[4];
>  char *module_name;
> -int i = 0;
> +int i = 0, n_dirs = 0;
>  int ret;
>  static GHashTable *loaded_modules;
>  
> @@ -186,14 +187,19 @@ void module_load_one(const char *prefix, const char 
> *lib_name)
>  g_hash_table_insert(loaded_modules, module_name, module_name);
>  
>  exec_dir = qemu_get_exec_dir();
> -dirs[i++] = g_strdup_printf("%s", CONFIG_QEMU_MODDIR);
> -dirs[i++] = g_strdup_printf("%s/..", exec_dir ? : "");
> -dirs[i++] = g_strdup_printf("%s", exec_dir ? : "");
> -assert(i == ARRAY_SIZE(dirs));
> +search_dir = getenv("QEMU_MODULE_DIR");
> +if (search_dir != NULL) {
> +dirs[n_dirs++] = g_strdup_printf("%s", search_dir);
> +}
> +dirs[n_dirs++] = g_strdup_printf("%s", CONFIG_QEMU_MODDIR);
> +dirs[n_dirs++] = g_strdup_printf("%s/..", exec_dir ? : "");
> +dirs[n_dirs++] = g_strdup_printf("%s", exec_dir ? : "");
> +assert(n_dirs <= ARRAY_SIZE(dirs));
> +
>  g_free(exec_dir);
>  exec_dir = NULL;
>  
> -for (i = 0; i < ARRAY_SIZE(dirs); i++) {
> +for (i = 0; i < n_dirs; i++) {
>  fname = g_strdup_printf("%s/%s%s",
>  dirs[i], module_name, HOST_DSOSUF);
>  ret = module_load_file(fname);
> @@ -205,7 +211,7 @@ void module_load_one(const char *prefix, const char 
> *lib_name)
>  }
>  }
>  
> -for (i = 0; i < ARRAY_SIZE(dirs); i++) {
> +for (i = 0; i < n_dirs; i++) {
>  g_free(dirs[i]);
>  }
>  
> -- 
> 2.7.4
> 

Reviewed-by: Fam Zheng 




[Qemu-devel] [PATCH v2 9/9] block: Use common write req handling in truncate

2018-07-05 Thread Fam Zheng
Truncation is the last to convert from open coded req handling to
reusing helpers. This time the permission check in prepare has to adapt
to the new caller: it checks a different permission bit, and don't
trigger the before write notifier.

Also, truncation should always trigger a bs->total_sectors update and in
turn call parent resize_cb. Update the condition in finish helper, too.

It's intended to do a duplicated bs->read_only check before calling
bdrv_co_write_req_prepare() so that we can be more informative with the
error message, as bdrv_co_write_req_prepare() doesn't have Error
parameter.

Signed-off-by: Fam Zheng 
---
 block/io.c | 54 +-
 1 file changed, 33 insertions(+), 21 deletions(-)

diff --git a/block/io.c b/block/io.c
index ed18eb0ca3..53f2bf4103 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1556,13 +1556,22 @@ bdrv_co_write_req_prepare(BdrvChild *child, 
BdrvTrackedRequest *req, int flags)
 assert(!waited || !req->serialising);
 assert(req->overlap_offset <= req->offset);
 assert(req->offset + req->bytes <= req->overlap_offset + 
req->overlap_bytes);
-if (flags & BDRV_REQ_WRITE_UNCHANGED) {
-assert(child->perm & (BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE));
-} else {
-assert(child->perm & BLK_PERM_WRITE);
-}
 assert(end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE);
-return notifier_with_return_list_notify(>before_write_notifiers, req);
+switch (req->type) {
+case BDRV_TRACKED_WRITE:
+case BDRV_TRACKED_DISCARD:
+if (flags & BDRV_REQ_WRITE_UNCHANGED) {
+assert(child->perm & (BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE));
+} else {
+assert(child->perm & BLK_PERM_WRITE);
+}
+return notifier_with_return_list_notify(>before_write_notifiers, 
req);
+case BDRV_TRACKED_TRUNCATE:
+assert(child->perm & BLK_PERM_RESIZE);
+return 0;
+default:
+abort();
+}
 }
 
 static inline void coroutine_fn
@@ -1575,9 +1584,10 @@ bdrv_co_write_req_finish(BdrvChild *child, 
BdrvTrackedRequest *req, int ret)
 
 stat64_max(>wr_highest_offset, req->offset + req->bytes);
 
-if (req->type != BDRV_TRACKED_DISCARD &&
-ret == 0 &&
-end_sector > bs->total_sectors) {
+if (ret == 0 &&
+(req->type == BDRV_TRACKED_TRUNCATE ||
+ (req->type != BDRV_TRACKED_DISCARD &&
+  end_sector > bs->total_sectors))) {
 bs->total_sectors = end_sector;
 bdrv_parent_cb_resize(bs);
 bdrv_dirty_bitmap_truncate(bs, end_sector << BDRV_SECTOR_BITS);
@@ -3045,7 +3055,6 @@ int coroutine_fn bdrv_co_truncate(BdrvChild *child, 
int64_t offset,
 int64_t old_size, new_bytes;
 int ret;
 
-assert(child->perm & BLK_PERM_RESIZE);
 
 /* if bs->drv == NULL, bs is closed, so there's nothing to do here */
 if (!drv) {
@@ -3078,7 +3087,16 @@ int coroutine_fn bdrv_co_truncate(BdrvChild *child, 
int64_t offset,
  * concurrently or they might be overwritten by preallocation. */
 if (new_bytes) {
 mark_request_serialising(, 1);
-wait_serialising_requests();
+}
+if (bs->read_only) {
+error_setg(errp, "Image is read-only");
+ret = -EACCES;
+goto out;
+}
+ret = bdrv_co_write_req_prepare(child, , 0);
+if (ret < 0) {
+error_setg_errno(errp, -ret, "Failed to prepare request for 
truncation");
+goto out;
 }
 
 if (!drv->bdrv_co_truncate) {
@@ -3090,13 +3108,6 @@ int coroutine_fn bdrv_co_truncate(BdrvChild *child, 
int64_t offset,
 ret = -ENOTSUP;
 goto out;
 }
-if (bs->read_only) {
-error_setg(errp, "Image is read-only");
-ret = -EACCES;
-goto out;
-}
-
-assert(!(bs->open_flags & BDRV_O_INACTIVE));
 
 ret = drv->bdrv_co_truncate(bs, offset, prealloc, errp);
 if (ret < 0) {
@@ -3108,9 +3119,10 @@ int coroutine_fn bdrv_co_truncate(BdrvChild *child, 
int64_t offset,
 } else {
 offset = bs->total_sectors * BDRV_SECTOR_SIZE;
 }
-bdrv_dirty_bitmap_truncate(bs, offset);
-bdrv_parent_cb_resize(bs);
-atomic_inc(>write_gen);
+/* It's possible that truncation succeeded but refresh_total_sectors
+ * failed, but the latter doesn't affect how we should finish the request.
+ * Pass 0 as the last parameter so that dirty bitmaps etc. are handled. */
+bdrv_co_write_req_finish(child, , 0);
 
 out:
 tracked_request_end();
-- 
2.17.1




[Qemu-devel] [PATCH v2 5/9] block: Fix handling of image enlarging write

2018-07-05 Thread Fam Zheng
Two problems exist when a write request that enlarges the image (i.e.
write beyond EOF) finishes:

1) parent is not notified about size change;
2) dirty bitmap is not resized although we try to set the dirty bits;

Fix them just like how bdrv_co_truncate works.

Reported-by: Kevin Wolf 
Signed-off-by: Fam Zheng 
---
 block/io.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/block/io.c b/block/io.c
index 03d9eb0a65..f06978dda0 100644
--- a/block/io.c
+++ b/block/io.c
@@ -40,6 +40,7 @@
 
 static AioWait drain_all_aio_wait;
 
+static void bdrv_parent_cb_resize(BlockDriverState *bs);
 static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
 int64_t offset, int bytes, BdrvRequestFlags flags);
 
@@ -1571,13 +1572,17 @@ bdrv_co_write_req_finish(BdrvChild *child, 
BdrvTrackedRequest *req, int ret)
 BlockDriverState *bs = child->bs;
 
 atomic_inc(>write_gen);
-bdrv_set_dirty(bs, req->offset, req->bytes);
 
 stat64_max(>wr_highest_offset, req->offset + req->bytes);
 
-if (ret == 0) {
-bs->total_sectors = MAX(bs->total_sectors, end_sector);
+if (req->type != BDRV_TRACKED_DISCARD &&
+ret == 0 &&
+end_sector > bs->total_sectors) {
+bs->total_sectors = end_sector;
+bdrv_parent_cb_resize(bs);
+bdrv_dirty_bitmap_truncate(bs, end_sector << BDRV_SECTOR_BITS);
 }
+bdrv_set_dirty(bs, req->offset, req->bytes);
 }
 
 /*
-- 
2.17.1




[Qemu-devel] [PATCH v2 6/9] block: Use common req handling for discard

2018-07-05 Thread Fam Zheng
Reuse the new bdrv_co_write_req_prepare/finish helpers. The variation
here is that discard requests don't affect bs->wr_highest_offset.

Signed-off-by: Fam Zheng 
---
 block/io.c | 21 ++---
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/block/io.c b/block/io.c
index f06978dda0..912fcb962a 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1582,7 +1582,18 @@ bdrv_co_write_req_finish(BdrvChild *child, 
BdrvTrackedRequest *req, int ret)
 bdrv_parent_cb_resize(bs);
 bdrv_dirty_bitmap_truncate(bs, end_sector << BDRV_SECTOR_BITS);
 }
-bdrv_set_dirty(bs, req->offset, req->bytes);
+if (req->bytes) {
+switch (req->type) {
+case BDRV_TRACKED_WRITE:
+stat64_max(>wr_highest_offset, req->offset + req->bytes);
+/* fall through, to set dirty bits */
+case BDRV_TRACKED_DISCARD:
+bdrv_set_dirty(bs, req->offset, req->bytes);
+break;
+default:
+break;
+}
+}
 }
 
 /*
@@ -2643,10 +2654,7 @@ int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, 
int64_t offset, int bytes)
 ret = bdrv_check_byte_request(bs, offset, bytes);
 if (ret < 0) {
 return ret;
-} else if (bs->read_only) {
-return -EPERM;
 }
-assert(!(bs->open_flags & BDRV_O_INACTIVE));
 
 /* Do nothing if disabled.  */
 if (!(bs->open_flags & BDRV_O_UNMAP)) {
@@ -2670,7 +2678,7 @@ int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, 
int64_t offset, int bytes)
 bdrv_inc_in_flight(bs);
 tracked_request_begin(, bs, offset, bytes, BDRV_TRACKED_DISCARD);
 
-ret = notifier_with_return_list_notify(>before_write_notifiers, );
+ret = bdrv_co_write_req_prepare(child, , 0);
 if (ret < 0) {
 goto out;
 }
@@ -2736,8 +2744,7 @@ int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, 
int64_t offset, int bytes)
 }
 ret = 0;
 out:
-atomic_inc(>write_gen);
-bdrv_set_dirty(bs, req.offset, req.bytes);
+bdrv_co_write_req_finish(child, , ret);
 tracked_request_end();
 bdrv_dec_in_flight(bs);
 return ret;
-- 
2.17.1




[Qemu-devel] [PATCH v2 8/9] block: Fix bdrv_co_truncate overlap check

2018-07-05 Thread Fam Zheng
If we are growing the image and potentially using preallocation for the
new area, we need to make sure that no write requests are made to the
"preallocated" area which [@old_size, @offset), not [@offset, offset * 2
- @old_size).

Signed-off-by: Fam Zheng 
---
 block/io.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/block/io.c b/block/io.c
index d07849fa96..ed18eb0ca3 100644
--- a/block/io.c
+++ b/block/io.c
@@ -3070,7 +3070,8 @@ int coroutine_fn bdrv_co_truncate(BdrvChild *child, 
int64_t offset,
 }
 
 bdrv_inc_in_flight(bs);
-tracked_request_begin(, bs, offset, new_bytes, BDRV_TRACKED_TRUNCATE);
+tracked_request_begin(, bs, offset - new_bytes, new_bytes,
+  BDRV_TRACKED_TRUNCATE);
 
 /* If we are growing the image and potentially using preallocation for the
  * new area, we need to make sure that no write requests are made to it
-- 
2.17.1




[Qemu-devel] [PATCH v2 3/9] block: Use uint64_t for BdrvTrackedRequest byte fields

2018-07-05 Thread Fam Zheng
This matches the types used for bytes in the rest parts of block layer.
In the case of bdrv_co_truncate, new_bytes can be the image size which
probably doesn't fit in a 32 bit int.

Signed-off-by: Fam Zheng 
---
 block/io.c| 2 +-
 include/block/block_int.h | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/block/io.c b/block/io.c
index 3e00667a2a..443a8584c4 100644
--- a/block/io.c
+++ b/block/io.c
@@ -587,7 +587,7 @@ static void tracked_request_end(BdrvTrackedRequest *req)
 static void tracked_request_begin(BdrvTrackedRequest *req,
   BlockDriverState *bs,
   int64_t offset,
-  unsigned int bytes,
+  uint64_t bytes,
   enum BdrvTrackedRequestType type)
 {
 *req = (BdrvTrackedRequest){
diff --git a/include/block/block_int.h b/include/block/block_int.h
index af71b414be..66c0e50d82 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -69,12 +69,12 @@ enum BdrvTrackedRequestType {
 typedef struct BdrvTrackedRequest {
 BlockDriverState *bs;
 int64_t offset;
-unsigned int bytes;
+uint64_t bytes;
 enum BdrvTrackedRequestType type;
 
 bool serialising;
 int64_t overlap_offset;
-unsigned int overlap_bytes;
+uint64_t overlap_bytes;
 
 QLIST_ENTRY(BdrvTrackedRequest) list;
 Coroutine *co; /* owner, used for deadlock detection */
-- 
2.17.1




[Qemu-devel] [PATCH v2 4/9] block: Extract common write req handling

2018-07-05 Thread Fam Zheng
As a mechanical refactoring patch, this is the first step towards
unified and more correct write code paths. This is helpful because
multiple BlockDriverState fields need to be updated after modifying
image data, and it's hard to maintain in multiple places such as copy
offload, discard and truncate.

Suggested-by: Kevin Wolf 
Signed-off-by: Fam Zheng 
---
 block/io.c | 70 ++
 1 file changed, 44 insertions(+), 26 deletions(-)

diff --git a/block/io.c b/block/io.c
index 443a8584c4..03d9eb0a65 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1538,6 +1538,48 @@ fail:
 return ret;
 }
 
+static inline int coroutine_fn
+bdrv_co_write_req_prepare(BdrvChild *child, BdrvTrackedRequest *req, int flags)
+{
+BlockDriverState *bs = child->bs;
+bool waited;
+int64_t end_sector = DIV_ROUND_UP(req->offset + req->bytes, 
BDRV_SECTOR_SIZE);
+
+if (bs->read_only) {
+return -EPERM;
+}
+assert(!(bs->open_flags & BDRV_O_INACTIVE));
+assert((bs->open_flags & BDRV_O_NO_IO) == 0);
+assert(!(flags & ~BDRV_REQ_MASK));
+waited = wait_serialising_requests(req);
+assert(!waited || !req->serialising);
+assert(req->overlap_offset <= req->offset);
+assert(req->offset + req->bytes <= req->overlap_offset + 
req->overlap_bytes);
+if (flags & BDRV_REQ_WRITE_UNCHANGED) {
+assert(child->perm & (BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE));
+} else {
+assert(child->perm & BLK_PERM_WRITE);
+}
+assert(end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE);
+return notifier_with_return_list_notify(>before_write_notifiers, req);
+}
+
+static inline void coroutine_fn
+bdrv_co_write_req_finish(BdrvChild *child, BdrvTrackedRequest *req, int ret)
+{
+int64_t end_sector = DIV_ROUND_UP(req->offset + req->bytes, 
BDRV_SECTOR_SIZE);
+BlockDriverState *bs = child->bs;
+
+atomic_inc(>write_gen);
+bdrv_set_dirty(bs, req->offset, req->bytes);
+
+stat64_max(>wr_highest_offset, req->offset + req->bytes);
+
+if (ret == 0) {
+bs->total_sectors = MAX(bs->total_sectors, end_sector);
+}
+}
+
 /*
  * Forwards an already correctly aligned write request to the BlockDriver,
  * after possibly fragmenting it.
@@ -1548,10 +1590,8 @@ static int coroutine_fn bdrv_aligned_pwritev(BdrvChild 
*child,
 {
 BlockDriverState *bs = child->bs;
 BlockDriver *drv = bs->drv;
-bool waited;
 int ret;
 
-int64_t end_sector = DIV_ROUND_UP(offset + bytes, BDRV_SECTOR_SIZE);
 uint64_t bytes_remaining = bytes;
 int max_transfer;
 
@@ -1567,23 +1607,10 @@ static int coroutine_fn bdrv_aligned_pwritev(BdrvChild 
*child,
 assert((offset & (align - 1)) == 0);
 assert((bytes & (align - 1)) == 0);
 assert(!qiov || bytes == qiov->size);
-assert((bs->open_flags & BDRV_O_NO_IO) == 0);
-assert(!(flags & ~BDRV_REQ_MASK));
 max_transfer = QEMU_ALIGN_DOWN(MIN_NON_ZERO(bs->bl.max_transfer, INT_MAX),
align);
 
-waited = wait_serialising_requests(req);
-assert(!waited || !req->serialising);
-assert(req->overlap_offset <= offset);
-assert(offset + bytes <= req->overlap_offset + req->overlap_bytes);
-if (flags & BDRV_REQ_WRITE_UNCHANGED) {
-assert(child->perm & (BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE));
-} else {
-assert(child->perm & BLK_PERM_WRITE);
-}
-assert(end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE);
-
-ret = notifier_with_return_list_notify(>before_write_notifiers, req);
+ret = bdrv_co_write_req_prepare(child, req, flags);
 
 if (!ret && bs->detect_zeroes != BLOCKDEV_DETECT_ZEROES_OPTIONS_OFF &&
 !(flags & BDRV_REQ_ZERO_WRITE) && drv->bdrv_co_pwrite_zeroes &&
@@ -1632,15 +1659,10 @@ static int coroutine_fn bdrv_aligned_pwritev(BdrvChild 
*child,
 }
 bdrv_debug_event(bs, BLKDBG_PWRITEV_DONE);
 
-atomic_inc(>write_gen);
-bdrv_set_dirty(bs, offset, bytes);
-
-stat64_max(>wr_highest_offset, offset + bytes);
-
 if (ret >= 0) {
-bs->total_sectors = MAX(bs->total_sectors, end_sector);
 ret = 0;
 }
+bdrv_co_write_req_finish(child, req, ret);
 
 return ret;
 }
@@ -1755,10 +1777,6 @@ int coroutine_fn bdrv_co_pwritev(BdrvChild *child,
 if (!bs->drv) {
 return -ENOMEDIUM;
 }
-if (bs->read_only) {
-return -EPERM;
-}
-assert(!(bs->open_flags & BDRV_O_INACTIVE));
 
 ret = bdrv_check_byte_request(bs, offset, bytes);
 if (ret < 0) {
-- 
2.17.1




[Qemu-devel] [PATCH v2 7/9] block: Use common req handling in copy offloading

2018-07-05 Thread Fam Zheng
This brings the request handling logic inline with write and discard,
fixing write_gen, resize_cb, dirty bitmaps and image size refreshing.
The last of these issues broke iotest case 222, which is now fixed.

Signed-off-by: Fam Zheng 
---
 block/io.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/block/io.c b/block/io.c
index 912fcb962a..d07849fa96 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2962,7 +2962,11 @@ static int coroutine_fn 
bdrv_co_copy_range_internal(BdrvChild *src,
 
 if (!(flags & BDRV_REQ_NO_SERIALISING)) {
 wait_serialising_requests(_req);
-wait_serialising_requests(_req);
+}
+
+ret = bdrv_co_write_req_prepare(dst, _req, flags);
+if (ret) {
+goto out;
 }
 if (recurse_src) {
 ret = src->bs->drv->bdrv_co_copy_range_from(src->bs,
@@ -2975,6 +2979,8 @@ static int coroutine_fn 
bdrv_co_copy_range_internal(BdrvChild *src,
   dst, dst_offset,
   bytes, flags);
 }
+out:
+bdrv_co_write_req_finish(dst, _req, ret);
 tracked_request_end(_req);
 tracked_request_end(_req);
 bdrv_dec_in_flight(src->bs);
-- 
2.17.1




[Qemu-devel] [PATCH v2 2/9] block: Use BdrvChild to discard

2018-07-05 Thread Fam Zheng
Other I/O functions are already using a BdrvChild pointer in the API, so
make discard do the same. It makes it possible to initiate the same
permission checks before doing I/O, and much easier to share the
helper functions for this, which will be added and used by write,
truncate and copy range paths.

Signed-off-by: Fam Zheng 
---
 block/blkdebug.c   |  2 +-
 block/blkreplay.c  |  2 +-
 block/block-backend.c  |  2 +-
 block/copy-on-read.c   |  2 +-
 block/io.c | 18 +-
 block/mirror.c |  2 +-
 block/qcow2-refcount.c |  2 +-
 block/raw-format.c |  2 +-
 block/throttle.c   |  2 +-
 include/block/block.h  |  4 ++--
 10 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/block/blkdebug.c b/block/blkdebug.c
index 526af2a808..0457bf5b66 100644
--- a/block/blkdebug.c
+++ b/block/blkdebug.c
@@ -625,7 +625,7 @@ static int coroutine_fn 
blkdebug_co_pdiscard(BlockDriverState *bs,
 return err;
 }
 
-return bdrv_co_pdiscard(bs->file->bs, offset, bytes);
+return bdrv_co_pdiscard(bs->file, offset, bytes);
 }
 
 static int coroutine_fn blkdebug_co_block_status(BlockDriverState *bs,
diff --git a/block/blkreplay.c b/block/blkreplay.c
index b016dbeee7..766150ade6 100755
--- a/block/blkreplay.c
+++ b/block/blkreplay.c
@@ -113,7 +113,7 @@ static int coroutine_fn 
blkreplay_co_pdiscard(BlockDriverState *bs,
   int64_t offset, int bytes)
 {
 uint64_t reqid = blkreplay_next_id();
-int ret = bdrv_co_pdiscard(bs->file->bs, offset, bytes);
+int ret = bdrv_co_pdiscard(bs->file, offset, bytes);
 block_request_create(reqid, bs, qemu_coroutine_self());
 qemu_coroutine_yield();
 
diff --git a/block/block-backend.c b/block/block-backend.c
index 6b75bca317..dd037b40a2 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1559,7 +1559,7 @@ int blk_co_pdiscard(BlockBackend *blk, int64_t offset, 
int bytes)
 return ret;
 }
 
-return bdrv_co_pdiscard(blk_bs(blk), offset, bytes);
+return bdrv_co_pdiscard(blk->root, offset, bytes);
 }
 
 int blk_co_flush(BlockBackend *blk)
diff --git a/block/copy-on-read.c b/block/copy-on-read.c
index 1dcdaeed69..a19164f9eb 100644
--- a/block/copy-on-read.c
+++ b/block/copy-on-read.c
@@ -116,7 +116,7 @@ static int coroutine_fn 
cor_co_pwrite_zeroes(BlockDriverState *bs,
 static int coroutine_fn cor_co_pdiscard(BlockDriverState *bs,
 int64_t offset, int bytes)
 {
-return bdrv_co_pdiscard(bs->file->bs, offset, bytes);
+return bdrv_co_pdiscard(bs->file, offset, bytes);
 }
 
 
diff --git a/block/io.c b/block/io.c
index 8cc5ee661d..3e00667a2a 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2590,7 +2590,7 @@ int bdrv_flush(BlockDriverState *bs)
 }
 
 typedef struct DiscardCo {
-BlockDriverState *bs;
+BdrvChild *child;
 int64_t offset;
 int bytes;
 int ret;
@@ -2599,17 +2599,17 @@ static void coroutine_fn bdrv_pdiscard_co_entry(void 
*opaque)
 {
 DiscardCo *rwco = opaque;
 
-rwco->ret = bdrv_co_pdiscard(rwco->bs, rwco->offset, rwco->bytes);
+rwco->ret = bdrv_co_pdiscard(rwco->child, rwco->offset, rwco->bytes);
 }
 
-int coroutine_fn bdrv_co_pdiscard(BlockDriverState *bs, int64_t offset,
-  int bytes)
+int coroutine_fn bdrv_co_pdiscard(BdrvChild *child, int64_t offset, int bytes)
 {
 BdrvTrackedRequest req;
 int max_pdiscard, ret;
 int head, tail, align;
+BlockDriverState *bs = child->bs;
 
-if (!bs->drv) {
+if (!bs || !bs->drv) {
 return -ENOMEDIUM;
 }
 
@@ -2720,11 +2720,11 @@ out:
 return ret;
 }
 
-int bdrv_pdiscard(BlockDriverState *bs, int64_t offset, int bytes)
+int bdrv_pdiscard(BdrvChild *child, int64_t offset, int bytes)
 {
 Coroutine *co;
 DiscardCo rwco = {
-.bs = bs,
+.child = child,
 .offset = offset,
 .bytes = bytes,
 .ret = NOT_DONE,
@@ -2735,8 +2735,8 @@ int bdrv_pdiscard(BlockDriverState *bs, int64_t offset, 
int bytes)
 bdrv_pdiscard_co_entry();
 } else {
 co = qemu_coroutine_create(bdrv_pdiscard_co_entry, );
-bdrv_coroutine_enter(bs, co);
-BDRV_POLL_WHILE(bs, rwco.ret == NOT_DONE);
+bdrv_coroutine_enter(child->bs, co);
+BDRV_POLL_WHILE(child->bs, rwco.ret == NOT_DONE);
 }
 
 return rwco.ret;
diff --git a/block/mirror.c b/block/mirror.c
index 61bd9f3cf1..b48c3f8cf5 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -1333,7 +1333,7 @@ static int coroutine_fn 
bdrv_mirror_top_do_write(BlockDriverState *bs,
 break;
 
 case MIRROR_METHOD_DISCARD:
-ret = bdrv_co_pdiscard(bs->backing->bs, offset, bytes);
+ret = bdrv_co_pdiscard(bs->backing, offset, bytes);
 break;
 
 default:
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 18c729aa27..4e1589a

[Qemu-devel] [PATCH v2 1/9] block: Add copy offloading trace points

2018-07-05 Thread Fam Zheng
A few trace points that can help reveal what is happening in a copy
offloading I/O path.

Signed-off-by: Fam Zheng 
---
 block/file-posix.c | 2 ++
 block/io.c | 2 ++
 block/iscsi.c  | 3 +++
 block/trace-events | 6 ++
 4 files changed, 13 insertions(+)

diff --git a/block/file-posix.c b/block/file-posix.c
index 829ee538d8..d3b1609410 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -1488,6 +1488,8 @@ static ssize_t handle_aiocb_copy_range(RawPosixAIOData 
*aiocb)
 ssize_t ret = copy_file_range(aiocb->aio_fildes, _off,
   aiocb->aio_fd2, _off,
   bytes, 0);
+trace_copy_file_range(aiocb->bs, aiocb->aio_fildes, in_off,
+  aiocb->aio_fd2, out_off, bytes, 0, ret);
 if (ret == 0) {
 /* No progress (e.g. when beyond EOF), let the caller fall back to
  * buffer I/O. */
diff --git a/block/io.c b/block/io.c
index 1a2272fad3..8cc5ee661d 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2960,6 +2960,7 @@ int coroutine_fn bdrv_co_copy_range_from(BdrvChild *src, 
uint64_t src_offset,
  BdrvChild *dst, uint64_t dst_offset,
  uint64_t bytes, BdrvRequestFlags 
flags)
 {
+trace_bdrv_co_copy_range_from(src, src_offset, dst, dst_offset, bytes, 
flags);
 return bdrv_co_copy_range_internal(src, src_offset, dst, dst_offset,
bytes, flags, true);
 }
@@ -2972,6 +2973,7 @@ int coroutine_fn bdrv_co_copy_range_to(BdrvChild *src, 
uint64_t src_offset,
BdrvChild *dst, uint64_t dst_offset,
uint64_t bytes, BdrvRequestFlags flags)
 {
+trace_bdrv_co_copy_range_to(src, src_offset, dst, dst_offset, bytes, 
flags);
 return bdrv_co_copy_range_internal(src, src_offset, dst, dst_offset,
bytes, flags, false);
 }
diff --git a/block/iscsi.c b/block/iscsi.c
index ead2bd5aa7..118555d051 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -44,6 +44,7 @@
 #include "qapi/qmp/qstring.h"
 #include "crypto/secret.h"
 #include "scsi/utils.h"
+#include "trace.h"
 
 /* Conflict between scsi/utils.h and libiscsi! :( */
 #define SCSI_XFER_NONE ISCSI_XFER_NONE
@@ -2396,6 +2397,8 @@ retry:
 }
 
 out_unlock:
+
+trace_iscsi_xcopy(src_lun, src_offset, dst_lun, dst_offset, bytes, r);
 g_free(iscsi_task.task);
 qemu_mutex_unlock(_lun->mutex);
 g_free(iscsi_task.err_str);
diff --git a/block/trace-events b/block/trace-events
index c35287b48a..1a25a997f2 100644
--- a/block/trace-events
+++ b/block/trace-events
@@ -15,6 +15,8 @@ bdrv_co_preadv(void *bs, int64_t offset, int64_t nbytes, 
unsigned int flags) "bs
 bdrv_co_pwritev(void *bs, int64_t offset, int64_t nbytes, unsigned int flags) 
"bs %p offset %"PRId64" nbytes %"PRId64" flags 0x%x"
 bdrv_co_pwrite_zeroes(void *bs, int64_t offset, int count, int flags) "bs %p 
offset %"PRId64" count %d flags 0x%x"
 bdrv_co_do_copy_on_readv(void *bs, int64_t offset, unsigned int bytes, int64_t 
cluster_offset, int64_t cluster_bytes) "bs %p offset %"PRId64" bytes %u 
cluster_offset %"PRId64" cluster_bytes %"PRId64
+bdrv_co_copy_range_from(void *src, uint64_t src_offset, void *dst, uint64_t 
dst_offset, uint64_t bytes, int flags) "src %p offset %"PRIu64" dst %p offset 
%"PRIu64" bytes %"PRIu64" flags 0x%x"
+bdrv_co_copy_range_to(void *src, uint64_t src_offset, void *dst, uint64_t 
dst_offset, uint64_t bytes, int flags) "src %p offset %"PRIu64" dst %p offset 
%"PRIu64" bytes %"PRIu64" flags 0x%x"
 
 # block/stream.c
 stream_one_iteration(void *s, int64_t offset, uint64_t bytes, int 
is_allocated) "s %p offset %" PRId64 " bytes %" PRIu64 " is_allocated %d"
@@ -57,6 +59,7 @@ qmp_block_stream(void *bs, void *job) "bs %p job %p"
 # block/file-posix.c
 paio_submit_co(int64_t offset, int count, int type) "offset %"PRId64" count %d 
type %d"
 paio_submit(void *acb, void *opaque, int64_t offset, int count, int type) "acb 
%p opaque %p offset %"PRId64" count %d type %d"
+copy_file_range(void *bs, int src, int64_t src_off, int dst, int64_t dst_off, 
int64_t bytes, int flags, int64_t ret) "bs %p src_fd %d offset %"PRIu64" dst_fd 
%d offset %"PRIu64" bytes %"PRIu64" flags %d ret %"PRId64
 
 # block/qcow2.c
 qcow2_writev_start_req(void *co, int64_t offset, int bytes) "co %p offset 0x%" 
PRIx64 " bytes %d"
@@ -150,3 +153,6 @@ nvme_free_req_queue_wait(void *q) "q %p"
 nvme_cmd_map_qiov(void *s, void *cmd, void *req, void *qiov, int entries) &quo

[Qemu-devel] [PATCH v2 0/9] block: Fix dst reading after tail copy offloading

2018-07-05 Thread Fam Zheng
Qcow2 allocates new clusters after the end of the file. If it is the destinaton
of copy offloading, we must adjust dst->bs->total_sectors. Otherwise, further
reads will drop to the "beyond EOF" code path and return zeroes, which problem
is caught by iotests 222.

Follow the logic in the normal write code and update bs->total_sectors after
I/O is done.

While at it, add a few convenient trace points to aid future debug experiences
in the topic.

Fam Zheng (9):
  block: Add copy offloading trace points
  block: Use BdrvChild to discard
  block: Use uint64_t for BdrvTrackedRequest byte fields
  block: Extract common write req handling
  block: Fix handling of image enlarging write
  block: Use common req handling for discard
  block: Use common req handling in copy offloading
  block: Fix bdrv_co_truncate overlap check
  block: Use common write req handling in truncate

 block/blkdebug.c  |   2 +-
 block/blkreplay.c |   2 +-
 block/block-backend.c |   2 +-
 block/copy-on-read.c  |   2 +-
 block/file-posix.c|   2 +
 block/io.c| 163 +-
 block/iscsi.c |   3 +
 block/mirror.c|   2 +-
 block/qcow2-refcount.c|   2 +-
 block/raw-format.c|   2 +-
 block/throttle.c  |   2 +-
 block/trace-events|   6 ++
 include/block/block.h |   4 +-
 include/block/block_int.h |   4 +-
 14 files changed, 130 insertions(+), 68 deletions(-)

-- 
2.17.1




Re: [Qemu-devel] [PATCH 1/2] block: Fix dst total_sectors after copy offloading

2018-07-04 Thread Fam Zheng
On Wed, 07/04 09:44, Kevin Wolf wrote:
> Am 04.07.2018 um 08:13 hat Fam Zheng geschrieben:
> > This was noticed by the new image fleecing tests case 222. The issue is
> > apparent and we should just do the same right things as in
> > bdrv_aligned_pwritev.
> > 
> > Reported-by: John Snow 
> > Signed-off-by: Fam Zheng 
> 
> > --- a/block/io.c
> > +++ b/block/io.c
> > @@ -2945,6 +2945,10 @@ static int coroutine_fn 
> > bdrv_co_copy_range_internal(BdrvChild *src,
> >dst, dst_offset,
> >bytes, flags);
> >  }
> > +if (ret == 0) {
> > +int64_t end_sector = DIV_ROUND_UP(dst_offset + bytes, 
> > BDRV_SECTOR_SIZE);
> > +dst->bs->total_sectors = MAX(dst->bs->total_sectors, end_sector);
> > +}
> 
> I think it's time to extract this stuff to a common function. I was
> already aware that a write request that extends the image isn't behaving
> exactly the same as bdrv_co_truncate(), and this one is bound to diverge
> from the other two instances as well.
> 
> This is what bdrv_aligned_pwritev() does after the write:
> 
> atomic_inc(>write_gen);
> bdrv_set_dirty(bs, offset, bytes);
> 
> stat64_max(>wr_highest_offset, offset + bytes);
> 
> if (ret >= 0) {
> bs->total_sectors = MAX(bs->total_sectors, end_sector);
> ret = 0;
> }
> 
> Before the write, it also calls bs->before_write_notifiers.
> 
> This is what bdrv_co_truncate() does after truncating the image:
> 
> ret = refresh_total_sectors(bs, offset >> BDRV_SECTOR_BITS);
> if (ret < 0) {
> error_setg_errno(errp, -ret, "Could not refresh total sector count");
> } else {
> offset = bs->total_sectors * BDRV_SECTOR_SIZE;
> }
> bdrv_dirty_bitmap_truncate(bs, offset);
> bdrv_parent_cb_resize(bs);
> atomic_inc(>write_gen);
> 
> This means that we probably have at least the following bugs in
> bdrv_co_copy_range_internal():
> 
> 1. bs->write_gen is not increased, a following flush is ignored
> 2. Dirty bitmaps are not dirtied
> 3. Dirty bitmaps are not resized when extending the image
> 4. bs->wr_highest_offset is not adjusted correctly
> 5. bs->total_sectors is not resized (the bug this patch fixes)
> 6. The parent node isn't notified about an image size change
> 
> Of these, 3. and 6. also seem to be bugs in bdrv_aligned_pwritev().
> 

Indeed, great insight. I'll work on v2.

Fam



<    1   2   3   4   5   6   7   8   9   10   >