Re: [Qemu-block] [Qemu-devel] [PATCH] qcow2: Support BDRV_REQ_MAY_UNMAP

2016-10-11 Thread Fam Zheng
On Wed, 09/28 15:04, Fam Zheng wrote:
> Handling this is similar to what is done to the L2 entry in the case of
> compressed clusters.

Kevin, Max, is there anything else I need to do before this patch can be
applied?

Fam



Re: [Qemu-block] [Questions] NBD issue or CoMutex->holder issue?

2016-10-11 Thread Changlong Xie

On 10/11/2016 06:47 PM, Paolo Bonzini wrote:

>the free_sema->queue head, so set free_sema->holder as
>relevant coroutine.

NBD is using the CoMutex in a way that wasn't anticipated.  The simplest
fix is to change it to CoQueue, which is like a condition variable.
Instead of locking if in_flight >= MAX_NBD_REQUESTS - 1, wait on the
queue while in_flight == MAX_NBD_REQUESTS.  Instead of unlocking, use
qemu_co_queue_next to wake up one request.
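
A rough sketch of that scheme, purely for illustration (the struct and field
names NbdClientSession, free_sema, in_flight and MAX_NBD_REQUESTS are taken
from the discussion above, not from any final patch, and free_sema is assumed
to have already been changed from CoMutex to CoQueue):

#include "qemu/osdep.h"
#include "block/nbd-client.h"   /* assumed context for the names used here */

static void coroutine_fn nbd_co_request_start(NbdClientSession *client)
{
    /* Wait on the CoQueue instead of taking a CoMutex. */
    while (client->in_flight == MAX_NBD_REQUESTS) {
        qemu_co_queue_wait(&client->free_sema);
    }
    client->in_flight++;
}

static void coroutine_fn nbd_co_request_end(NbdClientSession *client)
{
    client->in_flight--;
    qemu_co_queue_next(&client->free_sema);   /* wake up one waiter */
}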



Thanks for your explanation! Will send out a patch later.


Thanks
-Xie


Thanks for the report!

Paolo


>For example, if there are N (N=26 and MAX_NBD_REQUESTS=16) NBD write
>requests, we'll invoke nbd_client_co_pwritev 26 times.
>time request No   Actions
>1 1   in_flight=1, Coroutine=C1
>2 2   in_flight=2, Coroutine=C2






Re: [Qemu-block] [Qemu-devel] [PATCH v2 2/2] block/replication: Clarify 'top-id' parameter usage

2016-10-11 Thread Changlong Xie

On 10/11/2016 10:54 PM, Eric Blake wrote:

The replication driver only supports the 'top-id' parameter for the
secondary side; it must not be supplied for the primary side.


Will apply in the next version.

Thanks
-Xie





Re: [Qemu-block] [Qemu-devel] [PATCH v2 1/2] block/replication: prefect the logic to acquire 'top_id'

2016-10-11 Thread Changlong Xie

On 10/11/2016 10:52 PM, Eric Blake wrote:

On 10/11/2016 05:46 AM, Changlong Xie wrote:

Only g_strdup(top_id) if 'top_id' is not NULL, although there
is no memory leak here

Signed-off-by: Changlong Xie 
---
  block/replication.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/replication.c b/block/replication.c
index 3bd1cf1..5b432d9 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -104,11 +104,11 @@ static int replication_open(BlockDriverState *bs, QDict 
*options,
  } else if (!strcmp(mode, "secondary")) {
  s->mode = REPLICATION_MODE_SECONDARY;
  top_id = qemu_opt_get(opts, REPLICATION_TOP_ID);
-s->top_id = g_strdup(top_id);


g_strdup(NULL) is safe; it returns NULL in that case.


Yes, that's why I said 'there is no memory leak here' in the commit
message.





-if (!s->top_id) {
+if (!top_id) {
  error_setg(&local_err, "Missing the option top-id");
  goto fail;
  }
+s->top_id = g_strdup(top_id);


I see no point to this patch, rather than churn.


It just reduces one execution path. Maybe I'm too academic :)
Will remove it in the next series.

Thanks
-Xie











Re: [Qemu-block] [Qemu-devel] [PATCH 0/4] Allow blockdev-add for SSH

2016-10-11 Thread no-reply
Hi,

Your series failed automatic build test. Please find the testing commands and
their output below. If you have docker installed, you can probably reproduce it
locally.

Message-id: 1476171437-11830-1-git-send-email-ashijeetacha...@gmail.com
Subject: [Qemu-devel] [PATCH 0/4] Allow blockdev-add for SSH
Type: series

=== TEST SCRIPT BEGIN ===
#!/bin/bash
set -e
git submodule update --init dtc
# Let docker tests dump environment info
export SHOW_ENV=1
make J=8 docker-test-quick@centos6
make J=8 docker-test-mingw@fedora
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
ada148c qapi: allow blockdev-add for ssh
c15d50b block/ssh: Use InetSocketAddress options
370773b block/ssh: Add InetSocketAddress and accept it
7b6cfe0 block/ssh: Add ssh_has_filename_options_conflict()

=== OUTPUT BEGIN ===
Submodule 'dtc' (git://git.qemu-project.org/dtc.git) registered for path 'dtc'
Cloning into 'dtc'...
Submodule path 'dtc': checked out '65cc4d2748a2c2e6f27f1cf39e07a5dbabd80ebf'
  BUILD   centos6
  ARCHIVE qemu.tgz
  ARCHIVE dtc.tgz
  COPYRUNNER
  RUN test-quick in centos6
Packages installed:
SDL-devel-1.2.14-7.el6_7.1.x86_64
ccache-3.1.6-2.el6.x86_64
epel-release-6-8.noarch
gcc-4.4.7-17.el6.x86_64
git-1.7.1-4.el6_7.1.x86_64
glib2-devel-2.28.8-5.el6.x86_64
libfdt-devel-1.4.0-1.el6.x86_64
make-3.81-23.el6.x86_64
package g++ is not installed
pixman-devel-0.32.8-1.el6.x86_64
tar-1.23-15.el6_8.x86_64
zlib-devel-1.2.3-29.el6.x86_64

Environment variables:
PACKAGES=libfdt-devel ccache tar git make gcc g++ zlib-devel 
glib2-devel SDL-devel pixman-devel epel-release
HOSTNAME=15ba99d379b9
TERM=xterm
MAKEFLAGS= -j8
HISTSIZE=1000
J=8
USER=root
CCACHE_DIR=/var/tmp/ccache
EXTRA_CONFIGURE_OPTS=
V=
SHOW_ENV=1
MAIL=/var/spool/mail/root
PATH=/usr/lib/ccache:/usr/lib64/ccache:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PWD=/
LANG=en_US.UTF-8
TARGET_LIST=
HISTCONTROL=ignoredups
SHLVL=1
HOME=/root
TEST_DIR=/tmp/qemu-test
LOGNAME=root
LESSOPEN=||/usr/bin/lesspipe.sh %s
FEATURES= dtc
DEBUG=
G_BROKEN_FILENAMES=1
CCACHE_HASHDIR=
_=/usr/bin/env

Configure options:
--enable-werror --target-list=x86_64-softmmu,aarch64-softmmu 
--prefix=/var/tmp/qemu-build/install
No C++ compiler available; disabling C++ specific optional code
Install prefix    /var/tmp/qemu-build/install
BIOS directory    /var/tmp/qemu-build/install/share/qemu
binary directory  /var/tmp/qemu-build/install/bin
library directory /var/tmp/qemu-build/install/lib
module directory  /var/tmp/qemu-build/install/lib/qemu
libexec directory /var/tmp/qemu-build/install/libexec
include directory /var/tmp/qemu-build/install/include
config directory  /var/tmp/qemu-build/install/etc
local state directory   /var/tmp/qemu-build/install/var
Manual directory  /var/tmp/qemu-build/install/share/man
ELF interp prefix /usr/gnemul/qemu-%M
Source path   /tmp/qemu-test/src
C compiler        cc
Host C compiler   cc
C++ compiler  
Objective-C compiler cc
ARFLAGS   rv
CFLAGS-O2 -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -g 
QEMU_CFLAGS   -I/usr/include/pixman-1  -pthread -I/usr/include/glib-2.0 
-I/usr/lib64/glib-2.0/include   -fPIE -DPIE -m64 -D_GNU_SOURCE 
-D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes 
-Wredundant-decls -Wall -Wundef -Wwrite-strings -Wmissing-prototypes 
-fno-strict-aliasing -fno-common -fwrapv  -Wendif-labels -Wmissing-include-dirs 
-Wempty-body -Wnested-externs -Wformat-security -Wformat-y2k -Winit-self 
-Wignored-qualifiers -Wold-style-declaration -Wold-style-definition 
-Wtype-limits -fstack-protector-all
LDFLAGS   -Wl,--warn-common -Wl,-z,relro -Wl,-z,now -pie -m64 -g 
make  make
install   install
python            python -B
smbd  /usr/sbin/smbd
module support    no
host CPU  x86_64
host big endian   no
target list   x86_64-softmmu aarch64-softmmu
tcg debug enabled no
gprof enabled no
sparse enabled    no
strip binaries    yes
profiler  no
static build  no
pixman            system
SDL support   yes (1.2.14)
GTK support   no 
GTK GL support    no
VTE support   no 
TLS priority  NORMAL
GNUTLS support    no
GNUTLS rnd        no
libgcrypt no
libgcrypt kdf no
nettle            no
nettle kdf        no
libtasn1  no
curses support    no
virgl support no
curl support  no
mingw32 support   no
Audio drivers oss
Block whitelist (rw) 
Block whitelist (ro) 
VirtFS support    no
VNC support   yes
VNC SASL support  no
VNC JPEG support  no
VNC PNG support   no
xen support   no
brlapi support    no
bluez  support    no
Documentation no
PIE   yes
vde support   no
netmap support    no
Linux AIO support no
ATTR/XATTR support yes
Install blobs yes
KVM support   yes
RDMA support  no
TCG interpreter   no
fdt support   yes
preadv support    yes
fdatasync yes
madvise   yes
posix_madvise yes

Re: [Qemu-block] [Qemu-devel] [PATCH v5] build: Work around SIZE_MAX bug in OSX headers

2016-10-11 Thread Programmingkid

On Oct 11, 2016, at 2:12 PM, Eric Blake wrote:

> On 10/11/2016 01:03 PM, Programmingkid wrote:
> 
>>> +/* Mac OSX has a <stdint.h> bug that incorrectly defines SIZE_MAX with
>>> + * the wrong type. Our replacement isn't usable in preprocessor
>>> + * expressions, but it is sufficient for our needs. */
>>> +#if defined(HAVE_BROKEN_SIZE_MAX) && HAVE_BROKEN_SIZE_MAX
>>> +#undef SIZE_MAX
>>> +#define SIZE_MAX ((size_t)-1)
>>> +#endif
>>> +
> 
>> I have applied your patch to the most recent git commit 
>> (627eae7d729277c84f8e0ac07a8caab39c92c38d) on Mac  OS 10.6.8. QEMU built 
>> without any problems using gcc 4.9.
> 
> Did you also tweak the code to make sure there was an instance of
> printf("%zu", SIZE_MAX) (or similar)? It's not enough that it compiles
> without complaint (although that helps), but also that the
> compiler-warning-on-printf goes away (which we currently don't have any
> in the tree, because we've been writing '"%zu", (size_t)SIZE_MAX' to
> work around the broken headers).

I saw no warnings when I added your printf code. The output was 
18446744073709551615.
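
For reference, the check amounts to a printf along these lines somewhere after
"qemu/osdep.h" (a minimal sketch; the placement in the tree is arbitrary):

#include "qemu/osdep.h"   /* pulls in <stdint.h> plus the SIZE_MAX workaround */
#include <stdio.h>

int main(void)
{
    /* With SIZE_MAX replaced by ((size_t)-1), the argument type matches %zu,
     * so -Wformat stays quiet; an LP64 host prints 18446744073709551615. */
    printf("%zu\n", SIZE_MAX);
    return 0;
}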


Re: [Qemu-block] [Qemu-devel] [PATCH v5] build: Work around SIZE_MAX bug in OSX headers

2016-10-11 Thread Peter Maydell
On 11 October 2016 at 19:12, Eric Blake  wrote:
> On 10/11/2016 01:03 PM, Programmingkid wrote:
>
>>> +/* Mac OSX has a <stdint.h> bug that incorrectly defines SIZE_MAX with
>>> + * the wrong type. Our replacement isn't usable in preprocessor
>>> + * expressions, but it is sufficient for our needs. */
>>> +#if defined(HAVE_BROKEN_SIZE_MAX) && HAVE_BROKEN_SIZE_MAX
>>> +#undef SIZE_MAX
>>> +#define SIZE_MAX ((size_t)-1)
>>> +#endif
>>> +
>
>> I have applied your patch to the most recent git commit 
>> (627eae7d729277c84f8e0ac07a8caab39c92c38d) on Mac  OS 10.6.8. QEMU built 
>> without any problems using gcc 4.9.
>
> Did you also tweak the code to make sure there was an instance of
> printf("%zu", SIZE_MAX) (or similar)? It's not enough that it compiles
> without complaint (although that helps), but also that the
> compiler-warning-on-printf goes away (which we currently don't have any
> in the tree, because we've been writing '"%zu", (size_t)SIZE_MAX' to
> work around the broken headers).

I have made that check, and tested that the patch causes the
resulting build failure to go away.

I'll apply this to master...

thanks
-- PMM



Re: [Qemu-block] [Qemu-devel] [PATCH v5] build: Work around SIZE_MAX bug in OSX headers

2016-10-11 Thread Eric Blake
On 10/11/2016 01:03 PM, Programmingkid wrote:

>> +/* Mac OSX has a <stdint.h> bug that incorrectly defines SIZE_MAX with
>> + * the wrong type. Our replacement isn't usable in preprocessor
>> + * expressions, but it is sufficient for our needs. */
>> +#if defined(HAVE_BROKEN_SIZE_MAX) && HAVE_BROKEN_SIZE_MAX
>> +#undef SIZE_MAX
>> +#define SIZE_MAX ((size_t)-1)
>> +#endif
>> +

> I have applied your patch to the most recent git commit 
> (627eae7d729277c84f8e0ac07a8caab39c92c38d) on Mac  OS 10.6.8. QEMU built 
> without any problems using gcc 4.9.

Did you also tweak the code to make sure there was an instance of
printf("%zu", SIZE_MAX) (or similar)? It's not enough that it compiles
without complaint (although that helps), but also that the
compiler-warning-on-printf goes away (which we currently don't have any
in the tree, because we've been writing '"%zu", (size_t)SIZE_MAX' to
work around the broken headers).

> 
> Reviewed-by: John Arbuckle 
> 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org





[Qemu-block] [Qemu-devel] [PATCH v5] build: Work around SIZE_MAX bug in OSX headers

2016-10-11 Thread Programmingkid

On Oct 11, 2016, at 12:00 PM, qemu-devel-requ...@nongnu.org wrote:

> C99 requires SIZE_MAX to be declared with the same type as the
> integral promotion of size_t, but OSX mistakenly defines it as
> an 'unsigned long long' expression even though size_t is only
> 'unsigned long'.  Rather than futzing around with whether size_t
> is 32- or 64-bits wide (which would be needed if we cared about
> using SIZE_T in a #if expression), just hard-code it with a cast.
> This is not a strict C99-compliant definition, because it doesn't
> work in the preprocessor, but if we later need that, the build
> will break on Mac to inform us to improve our replacement at that
> time.
> 
> See also https://patchwork.ozlabs.org/patch/542327/ for an
> instance where the wrong type trips us up if we don't fix it
> for good in osdep.h.
> 
> Some versions of glibc make a similar mistake with SSIZE_MAX; the
> goal is that the approach of this patch could be copied to work
> around that problem if it ever becomes important to us.
> 
> Signed-off-by: Eric Blake 
> 
> ---
> v1 was here:
> https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg02520.html
> 
> The topic recently came up again, and I noticed this patch sitting
> on one of my older branches, so I've taken another shot at it.
> https://lists.gnu.org/archive/html/qemu-devel/2016-10/msg00950.html
> 
> v2: rewrite into a configure check (not sure if directly adding a
> -D to QEMU_CFLAGS is the best, so advice welcome)
> 
> v3: Use config-host.mak rather than direct -D [Peter]
> 
> v4: placate -Wunused builds
> 
> v5: use a simpler cast, rather than arithmetic promotion [Markus]
> 
> I lack easy access to a Mac box, so this is untested as to whether
> it actually solves the issue...
> ---
> include/qemu/osdep.h |  8 
> configure| 16 
> 2 files changed, 24 insertions(+)
> 
> diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
> index 9e9fa61..c65dad7 100644
> --- a/include/qemu/osdep.h
> +++ b/include/qemu/osdep.h
> @@ -141,6 +141,14 @@ extern int daemon(int, int);
> # error Unknown pointer size
> #endif
> 
> +/* Mac OSX has a <stdint.h> bug that incorrectly defines SIZE_MAX with
> + * the wrong type. Our replacement isn't usable in preprocessor
> + * expressions, but it is sufficient for our needs. */
> +#if defined(HAVE_BROKEN_SIZE_MAX) && HAVE_BROKEN_SIZE_MAX
> +#undef SIZE_MAX
> +#define SIZE_MAX ((size_t)-1)
> +#endif
> +
> #ifndef MIN
> #define MIN(a, b) (((a) < (b)) ? (a) : (b))
> #endif
> diff --git a/configure b/configure
> index 5751d8e..dd9e679 100755
> --- a/configure
> +++ b/configure
> @@ -1725,6 +1725,19 @@ if test "$cocoa" = "yes"; then
> sdl=no
> fi
> 
> +# Some versions of Mac OS X incorrectly define SIZE_MAX
> +cat > $TMPC << EOF
> +#include <stdint.h>
> +#include <stdio.h>
> +int main(int argc, char *argv[]) {
> +return printf("%zu", SIZE_MAX);
> +}
> +EOF
> +have_broken_size_max=no
> +if ! compile_object -Werror ; then
> +have_broken_size_max=yes
> +fi
> +
> ##
> # L2TPV3 probe
> 
> @@ -5245,6 +5258,9 @@ fi
> if test "$have_ifaddrs_h" = "yes" ; then
> echo "HAVE_IFADDRS_H=y" >> $config_host_mak
> fi
> +if test "$have_broken_size_max" = "yes" ; then
> +echo "HAVE_BROKEN_SIZE_MAX=y" >> $config_host_mak
> +fi
> 
> # Work around a system header bug with some kernel/XFS header
> # versions where they both try to define 'struct fsxattr':
> -- 
> 2.7.4

I have applied your patch to the most recent git commit 
(627eae7d729277c84f8e0ac07a8caab39c92c38d) on Mac  OS 10.6.8. QEMU built 
without any problems using gcc 4.9.

Reviewed-by: John Arbuckle 


Re: [Qemu-block] [Qemu-devel] [PATCH v5] build: Work around SIZE_MAX bug in OSX headers

2016-10-11 Thread Markus Armbruster
Eric Blake  writes:

> C99 requires SIZE_MAX to be declared with the same type as the
> integral promotion of size_t, but OSX mistakenly defines it as
> an 'unsigned long long' expression even though size_t is only
> 'unsigned long'.  Rather than futzing around with whether size_t
> is 32- or 64-bits wide (which would be needed if we cared about
> using SIZE_T in a #if expression), just hard-code it with a cast.
> This is not a strict C99-compliant definition, because it doesn't
> work in the preprocessor, but if we later need that, the build
> will break on Mac to inform us to improve our replacement at that
> time.
>
> See also https://patchwork.ozlabs.org/patch/542327/ for an
> instance where the wrong type trips us up if we don't fix it
> for good in osdep.h.
>
> Some versions of glibc make a similar mistake with SSIZE_MAX; the
> goal is that the approach of this patch could be copied to work
> around that problem if it ever becomes important to us.
>
> Signed-off-by: Eric Blake 

Reviewed-by: Markus Armbruster 



Re: [Qemu-block] [Qemu-devel] [PATCH 2/2] macio: switch over to new byte-aligned DMA helpers

2016-10-11 Thread John Snow



On 10/09/2016 12:43 PM, Mark Cave-Ayland wrote:

Now that the DMA helpers are byte-aligned they can be called directly from
the macio routines rather than emulating byte-aligned accesses via multiple
block-level accesses.



_cool_


Signed-off-by: Mark Cave-Ayland 
---
 hw/ide/macio.c |  213 
 1 file changed, 28 insertions(+), 185 deletions(-)

diff --git a/hw/ide/macio.c b/hw/ide/macio.c
index 76f97c2..9742c00 100644
--- a/hw/ide/macio.c
+++ b/hw/ide/macio.c
@@ -52,187 +52,6 @@ static const int debug_macio = 0;

 #define MACIO_PAGE_SIZE 4096

-/*
- * Unaligned DMA read/write access functions required for OS X/Darwin which
- * don't perform DMA transactions on sector boundaries. These functions are
- * modelled on bdrv_co_preadv()/bdrv_co_pwritev() and so should be easy to
- * remove if the unaligned block APIs are ever exposed.
- */
-
-static void pmac_dma_read(BlockBackend *blk,
-  int64_t offset, unsigned int bytes,
-  void (*cb)(void *opaque, int ret), void *opaque)
-{
-DBDMA_io *io = opaque;
-MACIOIDEState *m = io->opaque;
-IDEState *s = idebus_active_if(&m->bus);
-dma_addr_t dma_addr;
-int64_t sector_num;
-int nsector;
-uint64_t align = BDRV_SECTOR_SIZE;
-size_t head_bytes, tail_bytes;
-
-qemu_iovec_destroy(&io->iov);
-qemu_iovec_init(&io->iov, io->len / MACIO_PAGE_SIZE + 1);
-
-sector_num = (offset >> 9);
-nsector = (io->len >> 9);
-
-MACIO_DPRINTF("--- DMA read transfer (0x%" HWADDR_PRIx ",0x%x): "
-  "sector_num: %" PRId64 ", nsector: %d\n", io->addr, io->len,
-  sector_num, nsector);
-
-dma_addr = io->addr;
-io->dir = DMA_DIRECTION_FROM_DEVICE;
-io->dma_len = io->len;
-io->dma_mem = dma_memory_map(&address_space_memory, dma_addr, &io->dma_len,
- io->dir);
-
-if (offset & (align - 1)) {
-head_bytes = offset & (align - 1);
-
-MACIO_DPRINTF("--- DMA unaligned head: sector %" PRId64 ", "
-  "discarding %zu bytes\n", sector_num, head_bytes);
-
-qemu_iovec_add(&io->iov, &io->head_remainder, head_bytes);
-
-bytes += offset & (align - 1);
-offset = offset & ~(align - 1);
-}
-
-qemu_iovec_add(&io->iov, io->dma_mem, io->len);
-
-if ((offset + bytes) & (align - 1)) {
-tail_bytes = (offset + bytes) & (align - 1);
-
-MACIO_DPRINTF("--- DMA unaligned tail: sector %" PRId64 ", "
-  "discarding bytes %zu\n", sector_num, tail_bytes);
-
-qemu_iovec_add(&io->iov, &io->tail_remainder, align - tail_bytes);
-bytes = ROUND_UP(bytes, align);
-}
-
-s->io_buffer_size -= io->len;
-s->io_buffer_index += io->len;
-
-io->len = 0;
-
-MACIO_DPRINTF("--- Block read transfer - sector_num: %" PRIx64 "  "
-  "nsector: %x\n", (offset >> 9), (bytes >> 9));
-
-s->bus->dma->aiocb = blk_aio_preadv(blk, offset, &io->iov, 0, cb, io);
-}
-
-static void pmac_dma_write(BlockBackend *blk,
- int64_t offset, int bytes,
- void (*cb)(void *opaque, int ret), void *opaque)
-{
-DBDMA_io *io = opaque;
-MACIOIDEState *m = io->opaque;
-IDEState *s = idebus_active_if(&m->bus);
-dma_addr_t dma_addr;
-int64_t sector_num;
-int nsector;
-uint64_t align = BDRV_SECTOR_SIZE;
-size_t head_bytes, tail_bytes;
-bool unaligned_head = false, unaligned_tail = false;
-
-qemu_iovec_destroy(&io->iov);
-qemu_iovec_init(&io->iov, io->len / MACIO_PAGE_SIZE + 1);
-
-sector_num = (offset >> 9);
-nsector = (io->len >> 9);
-
-MACIO_DPRINTF("--- DMA write transfer (0x%" HWADDR_PRIx ",0x%x): "
-  "sector_num: %" PRId64 ", nsector: %d\n", io->addr, io->len,
-  sector_num, nsector);
-
-dma_addr = io->addr;
-io->dir = DMA_DIRECTION_TO_DEVICE;
-io->dma_len = io->len;
-io->dma_mem = dma_memory_map(&address_space_memory, dma_addr, &io->dma_len,
- io->dir);
-
-if (offset & (align - 1)) {
-head_bytes = offset & (align - 1);
-sector_num = ((offset & ~(align - 1)) >> 9);
-
-MACIO_DPRINTF("--- DMA unaligned head: pre-reading head sector %"
-  PRId64 "\n", sector_num);
-
-blk_pread(s->blk, (sector_num << 9), &io->head_remainder, align);
-
-qemu_iovec_add(&io->iov, &io->head_remainder, head_bytes);
-qemu_iovec_add(&io->iov, io->dma_mem, io->len);
-
-bytes += offset & (align - 1);
-offset = offset & ~(align - 1);
-
-unaligned_head = true;
-}
-
-if ((offset + bytes) & (align - 1)) {
-tail_bytes = (offset + bytes) & (align - 1);
-sector_num = (((offset + bytes) & ~(align - 1)) >> 9);
-
-MACIO_DPRINTF("--- DMA unaligned tail: pre-reading tail sector %"
-  PRId64 "\n", sector_num);
-
-blk_pread(s->blk, (sector_num 

Re: [Qemu-block] [Qemu-devel] [PATCH v10 09/16] block: Add QMP support for streaming to an intermediate layer

2016-10-11 Thread Markus Armbruster
Eric Blake  writes:

> On 10/11/2016 09:57 AM, Kevin Wolf wrote:
>> Should we introduce a new, clean blockdev-stream command that fixes this
>> and matches the common name pattern? Of course, block-stream vs.
>> blockdev-stream could be a bit confusing, too...
>> 
>
> A new command is easy to introspect (query-commands), lets us get rid of
> cruft, and makes it obvious that we support everything from the get-go.
> I'm favoring that option, even if it leads to slightly confusing names
> of the deprecated vs. the new command.

Let's take a step back and consider extending old commands vs. adding
new commands.

A new command is trivial to detect in introspection.

Command extensions are not as trivial to detect in introspection.  Many
extensions are exposed in query-qmp-schema, but not all.

Back when QMP was young, Anthony argued for always adding new commands,
never extend existing ones.  I opposed it, because it would lead to a
confusing mess of related commands, unreadable or incomplete
documentation, and abysmal test coverage.

However, the other extreme is also unwise: we shouldn't shoehorn new
functionality into existing commands just because we can.  We should ask
ourselves questions like these:

* Is the extended command still a sane interface?  If writing clear
  documentation for it is hard, it perhaps isn't.  Pay special attention
  to failure modes.  Overloaded arguments are prone to confusing errors.

* How will the command's users use the extension?  If it requires new
  code paths, a new command may be more convenient for them.



Re: [Qemu-block] [Qemu-devel] [PATCH 0/3] block: new bdrv_drain implementation

2016-10-11 Thread no-reply
Hi,

Your series failed automatic build test. Please find the testing commands and
their output below. If you have docker installed, you can probably reproduce it
locally.

Subject: [Qemu-devel] [PATCH 0/3] block: new bdrv_drain implementation
Message-id: 1475857193-28735-1-git-send-email-pbonz...@redhat.com
Type: series

=== TEST SCRIPT BEGIN ===
#!/bin/bash
set -e
git submodule update --init dtc
# Let docker tests dump environment info
export SHOW_ENV=1
make J=8 docker-test-quick@centos6
make J=8 docker-test-mingw@fedora
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
783bbdc qed: Implement .bdrv_drain
5532f33 block: change drain to look only at one child at a time
36c2426 block: add BDS field to count in-flight requests

=== OUTPUT BEGIN ===
Submodule 'dtc' (git://git.qemu-project.org/dtc.git) registered for path 'dtc'
Cloning into 'dtc'...
Submodule path 'dtc': checked out '65cc4d2748a2c2e6f27f1cf39e07a5dbabd80ebf'
  BUILD   centos6
  ARCHIVE qemu.tgz
  ARCHIVE dtc.tgz
  COPYRUNNER
  RUN test-quick in centos6
Packages installed:
SDL-devel-1.2.14-7.el6_7.1.x86_64
ccache-3.1.6-2.el6.x86_64
epel-release-6-8.noarch
gcc-4.4.7-17.el6.x86_64
git-1.7.1-4.el6_7.1.x86_64
glib2-devel-2.28.8-5.el6.x86_64
libfdt-devel-1.4.0-1.el6.x86_64
make-3.81-23.el6.x86_64
package g++ is not installed
pixman-devel-0.32.8-1.el6.x86_64
tar-1.23-15.el6_8.x86_64
zlib-devel-1.2.3-29.el6.x86_64

Environment variables:
PACKAGES=libfdt-devel ccache tar git make gcc g++ zlib-devel 
glib2-devel SDL-devel pixman-devel epel-release
HOSTNAME=c061aa2804f1
TERM=xterm
MAKEFLAGS= -j8
HISTSIZE=1000
J=8
USER=root
CCACHE_DIR=/var/tmp/ccache
EXTRA_CONFIGURE_OPTS=
V=
SHOW_ENV=1
MAIL=/var/spool/mail/root
PATH=/usr/lib/ccache:/usr/lib64/ccache:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PWD=/
LANG=en_US.UTF-8
TARGET_LIST=
HISTCONTROL=ignoredups
SHLVL=1
HOME=/root
TEST_DIR=/tmp/qemu-test
LOGNAME=root
LESSOPEN=||/usr/bin/lesspipe.sh %s
FEATURES= dtc
DEBUG=
G_BROKEN_FILENAMES=1
CCACHE_HASHDIR=
_=/usr/bin/env

Configure options:
--enable-werror --target-list=x86_64-softmmu,aarch64-softmmu 
--prefix=/var/tmp/qemu-build/install
No C++ compiler available; disabling C++ specific optional code
Install prefix    /var/tmp/qemu-build/install
BIOS directory    /var/tmp/qemu-build/install/share/qemu
binary directory  /var/tmp/qemu-build/install/bin
library directory /var/tmp/qemu-build/install/lib
module directory  /var/tmp/qemu-build/install/lib/qemu
libexec directory /var/tmp/qemu-build/install/libexec
include directory /var/tmp/qemu-build/install/include
config directory  /var/tmp/qemu-build/install/etc
local state directory   /var/tmp/qemu-build/install/var
Manual directory  /var/tmp/qemu-build/install/share/man
ELF interp prefix /usr/gnemul/qemu-%M
Source path   /tmp/qemu-test/src
C compiler        cc
Host C compiler   cc
C++ compiler  
Objective-C compiler cc
ARFLAGS   rv
CFLAGS-O2 -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -g 
QEMU_CFLAGS   -I/usr/include/pixman-1  -pthread -I/usr/include/glib-2.0 
-I/usr/lib64/glib-2.0/include   -fPIE -DPIE -m64 -D_GNU_SOURCE 
-D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes 
-Wredundant-decls -Wall -Wundef -Wwrite-strings -Wmissing-prototypes 
-fno-strict-aliasing -fno-common -fwrapv  -Wendif-labels -Wmissing-include-dirs 
-Wempty-body -Wnested-externs -Wformat-security -Wformat-y2k -Winit-self 
-Wignored-qualifiers -Wold-style-declaration -Wold-style-definition 
-Wtype-limits -fstack-protector-all
LDFLAGS   -Wl,--warn-common -Wl,-z,relro -Wl,-z,now -pie -m64 -g 
make  make
install   install
python            python -B
smbd  /usr/sbin/smbd
module support    no
host CPU  x86_64
host big endian   no
target list   x86_64-softmmu aarch64-softmmu
tcg debug enabled no
gprof enabled no
sparse enabled    no
strip binaries    yes
profiler  no
static build  no
pixman            system
SDL support   yes (1.2.14)
GTK support   no 
GTK GL support    no
VTE support   no 
TLS priority  NORMAL
GNUTLS support    no
GNUTLS rnd        no
libgcrypt no
libgcrypt kdf no
nettle            no
nettle kdf        no
libtasn1  no
curses support    no
virgl support no
curl support  no
mingw32 support   no
Audio drivers oss
Block whitelist (rw) 
Block whitelist (ro) 
VirtFS support    no
VNC support   yes
VNC SASL support  no
VNC JPEG support  no
VNC PNG support   no
xen support   no
brlapi support    no
bluez  support    no
Documentation no
PIE   yes
vde support   no
netmap support    no
Linux AIO support no
ATTR/XATTR support yes
Install blobs yes
KVM support   yes
RDMA support  no
TCG interpreter   no
fdt support   yes
preadv support    yes
fdatasync yes
madvise   yes
posix_madvise yes
libcap-ng support no
vhost-net support yes

Re: [Qemu-block] [PATCH 1/3] block: add BDS field to count in-flight requests

2016-10-11 Thread Paolo Bonzini


On 11/10/2016 17:25, Kevin Wolf wrote:
> Am 11.10.2016 um 16:09 hat Paolo Bonzini geschrieben:
>> Is what you call "a BDS-level bdrv_isolate_begin/end" the same as my
>> "bdrv_drained_begin(bs) and bdrv_drained_end(bs), which quiesce all root
>> BdrvChildren reachable from bs"?  Anyway I think we agree.
> 
> Ah, is your bdrv_drained_begin() working in child-to-parent direction,
> too?

Yes.

>> I mentioned block jobs, block migration and the NBD server.  They do use
>> a BB (as you said, they wouldn't compile after the BdrvChild
>> conversion), but they don't have their own *separate* BB as far as I can
>> tell, with separate throttling state etc.  How far are we from that?
> 
> They have separate BBs, we're already there.

Hey, you're right!  I don't know how I missed that!

> bdrv_drain() should only be called from bdrv_close(), yes, and I would
> agree with inactivating an image as the first step before we close it.
> 
> However, the .bdrv_drained_begin/end callbacks in BlockDriver are still
> called in the context of isolation and that's different from
> inactivating.

Agreed, .bdrv_drained_begin/end is separate from inactivation.  That's
just "bdrv_drain" (which we agreed has to be called from bdrv_close only).

> I think my point was that you don't have to count requests at the BB
> level if you know that there are no requests pending on the BB level
> that haven't reached the BDS level yet.

I need to count requests at the BB level because the blk_aio_*
operations have a separate bottom half that is invoked if either 1) they
never reach BDS (because of an error); or 2) the bdrv_co_* call
completes without yielding.  The count must be >0 when blk_aio_*
returns, or bdrv_drain (and thus blk_drain) won't loop.  Because
bdrv_drain and blk_drain are conflated, the counter must be the BDS one.
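
A stripped-down sketch of that requirement (the in_flight field and the helper
names below are assumptions following the patch, not the code itself):

#include "qemu/osdep.h"
#include "block/block_int.h"

/* The counter is bumped before blk_aio_* returns, so a drain loop started
 * right afterwards still sees the request even if only its bottom half is
 * left to run. */
void request_begin_sketch(BlockDriverState *bs)
{
    bs->in_flight++;
}

void request_end_sketch(BlockDriverState *bs)
{
    bs->in_flight--;
}

void bdrv_drain_sketch(BlockDriverState *bs)
{
    while (bs->in_flight > 0) {
        aio_poll(bdrv_get_aio_context(bs), true);   /* let completions run */
    }
}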

In turn, the BDS counter is needed because of the lack of isolated
sections.  The right design would be for blk_isolate_begin to call
blk_drain on *other* BlockBackends reachable in a child-to-parent visit.
 Instead, until that is implemented, we have bdrv_drained_begin that
emulates that through the same-named callback, followed by a
parent-to-child bdrv_drain that is almost always unnecessary.

> If you move the restarting of
> throttled requests to blk_drain(), shouldn't that be all you need to do
> on the BB level for now?
>
> Hm, or actually, doesn't the BdrvChild callback still do this work after
> your patch?

Yes, it does.  The only difference is that the callback is not needed
anymore for bdrv_requests_pending to answer true.  With my patch,
bdrv_requests_pending always answers true if there are throttled
requests.  Without my patch, it relies on blk_root_drained_begin's call
to throttle_group_restart_blk.

The long-term view is that there would be a difference between blk_drain
and the BdrvChild callback.  blk_drain waits for throttled requests
(using the same algorithm as bdrv_drain and a BB-level counter only).
Instead, the BdrvChild callback can merely arrange to be ignored by the
throttle group.

(By the way, I need to repost this series anyway, but let's finish the
discussion first to understand what you'd like to have in 2.8).

Paolo

> Maybe I'm failing to understand something important about
> the patch...
> 
>> All in all I don't think this should be done as a single patch series
>> and certainly wouldn't be ready in time for 2.8.  These two patches
>> (plus the blockjob_drain one that I posted) are needed for me to get rid
>> of RFifoLock and fix bdrv_drain_all deadlocks.  I'd really love to do
>> that in 2.8, and the patches for that are ready to be posted (as RFC I
>> guess).
> 
> Agreed, I don't want to make the full thing a requirement.
> 
> Kevin
> 
> 



Re: [Qemu-block] [Qemu-devel] [PATCH v10 09/16] block: Add QMP support for streaming to an intermediate layer

2016-10-11 Thread Markus Armbruster
Kevin Wolf  writes:

> Am 11.10.2016 um 16:30 hat Alberto Garcia geschrieben:
[...]
>> 3) QEMU could advertise that feature to the client. This is probably
>> simpler than trying to figure it out from the API. I guess that's the
>> idea of 'qmp_capabilities'?
>
> I think that was the idea, though it was never used. If we had used it,
> I'm not sure how long the capabilities list would be today. :-)

QMP capabilities are for changes in the QMP protocol, not for changes in
commands, events and types.  The protocol has been good enough so far,
thus no capabilities.



Re: [Qemu-block] [Qemu-devel] [PATCH v14 07/21] qapi: permit scalar type conversions in QObjectInputVisitor

2016-10-11 Thread Markus Armbruster
"Daniel P. Berrange"  writes:

> Currently the QObjectInputVisitor assumes that all scalar
> values are directly represented as the final types declared
> by the thing being visited. ie it assumes an 'int' is using

i.e.

> QInt, and a 'bool' is using QBool, etc.  This is good when
> QObjectInputVisitor is fed a QObject that came from a JSON
> document on the QMP monitor, as it will strictly validate
> correctness.
>
> To allow QObjectInputVisitor to be reused for visiting
> a QObject originating from QemuOpts, an alternative mode
> is needed where all the scalars types are represented as
> QString and converted on the fly to the final desired
> type.
>
> Reviewed-by: Kevin Wolf 
> Reviewed-by: Marc-André Lureau 
> Signed-off-by: Daniel P. Berrange 
> ---
>  include/qapi/qobject-input-visitor.h |  32 +-
>  qapi/qobject-input-visitor.c | 132 
>  tests/test-qobject-input-visitor.c   | 194 
> ++-
>  3 files changed, 350 insertions(+), 8 deletions(-)

Quite some code.  Here's how I'm persuading myself to like it.

We need to parse two and a half different languages into QAPI objects:
JSON, command line key=value,... (QemuOpts), and just value (same syntax
as in QemuOpts).

Two language differences stand out:

1. JSON is recursive, key=value,... is flat (but see below), and

2. JSON values have syntactically obvious JSON types, key=value values
   are just strings.

To parse JSON into a QAPI object, we first parse it into a QObject, then
use the QObject input visitor to convert the QObject into a QAPI object.

To parse key=value, we first parse it into a QemuOpts, then use
the options visitor to convert the QemuOpts into a QAPI object.  We can
also use the string input visitor to convert just an individual key's
value.  Or any string for that matter.

Note that we always use the QemuOpts *string* value, never the integer
or boolean value QemuOpts additionally provides when the key has been
specified by a QemuOptDesc with a non-string type.  Such QemuOptDesc are
best avoided when we parse the string again with a visitor.  For when it
isn't avoided, the visitors' parsers need to match the QemuOpts parser.
Yes, this is a mess.

The flatness of key=value,... has become overly limiting.  We've grown
workarounds for lists in both the options and the string input visitor.
Independent ones *sigh*.  More recently, the block layer's dotted key
convention has provided a way to overcome flatness completely.

We could implement this convention in the options visitor.  Instead,
you're picking a different route: you reuse existing
qemu_opts_to_qdict() to go from QemuOpts to flat qdict, create
qdict_crumple() to unflatten using dotted key convention, then reuse the
existing QObject input visitor to go to QAPI object.  This decomposes
the task into smaller ones, some of them solved already, at a tolerable
cost in efficiency.
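
In code, the route looks roughly like this (error handling omitted; the exact
signatures of qdict_crumple() and the visitor constructor in this series are
assumptions on my part, and Foo stands in for an arbitrary QAPI type):

QDict *flat = qemu_opts_to_qdict(opts, NULL);        /* QemuOpts -> flat QDict */
QObject *nested = qdict_crumple(flat, &error_abort); /* "a.b=1" -> {"a": {"b": "1"}} */
Visitor *v = qobject_input_visitor_new(nested, true);
Foo *foo = NULL;
visit_type_Foo(v, NULL, &foo, &error_abort);         /* strings converted on the fly */
visit_free(v);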

Except the QObject input visitor won't do as is, because of language
difference 2.  This patch makes it cope.

The need to do this negates much of the code reuse advantage.  Doesn't
look so hot by itself.  However, you then do more work and replace the
options visitor, with an obvious path to replacing the string input
visitor as well.

> diff --git a/include/qapi/qobject-input-visitor.h 
> b/include/qapi/qobject-input-visitor.h
> index cde328d..5022297 100644
> --- a/include/qapi/qobject-input-visitor.h
> +++ b/include/qapi/qobject-input-visitor.h
> @@ -19,12 +19,36 @@
>  
>  typedef struct QObjectInputVisitor QObjectInputVisitor;
>  
> -/*
> - * Return a new input visitor that converts a QObject to a QAPI object.
> +/**
> + * Create a new input visitor that converts @obj to a QAPI object.
> + *
> + * Any scalar values in the @obj input data structure should be in the
> + * required type already. i.e. if visiting a bool, the value should
> + * already be a QBool instance.
>   *
> - * Set @strict to reject a parse that doesn't consume all keys of a
> - * dictionary; otherwise excess input is ignored.
> + * If @strict is set to true, then an error will be reported if any
> + * dict keys are not consumed during visitation. If @strict is false
> + * then extra dict keys are silently ignored.

Aside: I want to get rid of non-strict, badly.  Quoting commit 240f64b
"qapi: Use strict QMP input visitor in more places":

The only remaining uses of non-strict input visits are:

- QMP 'qom-set' (which eventually executes
object_property_set_qobject()) - mark it as something to revisit
in the future (I didn't want to spend any more time on this patch
auditing if we have any QOM dictionary properties that might be
impacted, and couldn't easily prove whether this code path is
shared with anything else).

- test-qmp-input-visitor: explicit tests of non-strict mode. If
we later get rid of users that don't need 

Re: [Qemu-block] [Qemu-devel] [PATCH v10 09/16] block: Add QMP support for streaming to an intermediate layer

2016-10-11 Thread Eric Blake
On 10/11/2016 09:57 AM, Kevin Wolf wrote:
> Should we introduce a new, clean blockdev-stream command that fixes this
> and matches the common name pattern? Of course, block-stream vs.
> blockdev-stream could be a bit confusing, too...
> 

A new command is easy to introspect (query-commands), lets us get rid of
cruft, and makes it obvious that we support everything from the get-go.
I'm favoring that option, even if it leads to slightly confusing names
of the deprecated vs. the new command.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org





Re: [Qemu-block] [Qemu-devel] [PATCH 1/2] dma-helpers: explicitly pass alignment into dma-helpers

2016-10-11 Thread John Snow



On 10/10/2016 03:23 PM, Mark Cave-Ayland wrote:

On 10/10/16 17:34, Eric Blake wrote:


On 10/09/2016 11:43 AM, Mark Cave-Ayland wrote:

The hard-coded default alignment is BDRV_SECTOR_SIZE, however this is not
necessarily the case for all platforms. Use this as the default alignment for
all current callers.

Signed-off-by: Mark Cave-Ayland 
---



@@ -160,8 +161,8 @@ static void dma_blk_cb(void *opaque, int ret)
 return;
 }

-if (dbs->iov.size & ~BDRV_SECTOR_MASK) {
-qemu_iovec_discard_back(&dbs->iov, dbs->iov.size & ~BDRV_SECTOR_MASK);
+if (dbs->iov.size & (dbs->align - 1)) {
+qemu_iovec_discard_back(&dbs->iov, dbs->iov.size & (dbs->align - 1));


Would it be any smarter to use osdep.h's QEMU_IS_ALIGNED(dbs->iov.size,
dbs->align) and QEMU_ALIGN_DOWN(dbs->iov.size, dbs->align)?
Semantically it is the same, but the macros make it obvious what the
bit-twiddling is doing.
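
For reference, the macro-based form would read roughly as follows (an
illustrative sketch, not part of the patch):

if (!QEMU_IS_ALIGNED(dbs->iov.size, dbs->align)) {
    /* discard the unaligned tail: total size minus the aligned-down size */
    qemu_iovec_discard_back(&dbs->iov,
                            dbs->iov.size -
                            QEMU_ALIGN_DOWN(dbs->iov.size, dbs->align));
}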

Unless you think that needs a tweak,
Reviewed-by: Eric Blake 


I can't say I feel too strongly about it since there are plenty of other
examples of this style in the codebase, so I'm happy to go with whatever
John/Paolo are most happy with.


ATB,

Mark.



I can't pretend I am consistent, but when in doubt use the macro. Not 
worth a respin IMO, but I think this falls out of my jurisdiction :)


Acked-by: John Snow 



[Qemu-block] [PATCH v5] build: Work around SIZE_MAX bug in OSX headers

2016-10-11 Thread Eric Blake
C99 requires SIZE_MAX to be declared with the same type as the
integral promotion of size_t, but OSX mistakenly defines it as
an 'unsigned long long' expression even though size_t is only
'unsigned long'.  Rather than futzing around with whether size_t
is 32- or 64-bits wide (which would be needed if we cared about
using SIZE_T in a #if expression), just hard-code it with a cast.
This is not a strict C99-compliant definition, because it doesn't
work in the preprocessor, but if we later need that, the build
will break on Mac to inform us to improve our replacement at that
time.

See also https://patchwork.ozlabs.org/patch/542327/ for an
instance where the wrong type trips us up if we don't fix it
for good in osdep.h.

Some versions of glibc make a similar mistake with SSIZE_MAX; the
goal is that the approach of this patch could be copied to work
around that problem if it ever becomes important to us.

Signed-off-by: Eric Blake 

---
v1 was here:
https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg02520.html

The topic recently came up again, and I noticed this patch sitting
on one of my older branches, so I've taken another shot at it.
https://lists.gnu.org/archive/html/qemu-devel/2016-10/msg00950.html

v2: rewrite into a configure check (not sure if directly adding a
-D to QEMU_CFLAGS is the best, so advice welcome)

v3: Use config-host.mak rather than direct -D [Peter]

v4: placate -Wunused builds

v5: use a simpler cast, rather than arithmetic promotion [Markus]

I lack easy access to a Mac box, so this is untested as to whether
it actually solves the issue...
---
 include/qemu/osdep.h |  8 
 configure| 16 
 2 files changed, 24 insertions(+)

diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
index 9e9fa61..c65dad7 100644
--- a/include/qemu/osdep.h
+++ b/include/qemu/osdep.h
@@ -141,6 +141,14 @@ extern int daemon(int, int);
 # error Unknown pointer size
 #endif

+/* Mac OSX has a <stdint.h> bug that incorrectly defines SIZE_MAX with
+ * the wrong type. Our replacement isn't usable in preprocessor
+ * expressions, but it is sufficient for our needs. */
+#if defined(HAVE_BROKEN_SIZE_MAX) && HAVE_BROKEN_SIZE_MAX
+#undef SIZE_MAX
+#define SIZE_MAX ((size_t)-1)
+#endif
+
 #ifndef MIN
 #define MIN(a, b) (((a) < (b)) ? (a) : (b))
 #endif
diff --git a/configure b/configure
index 5751d8e..dd9e679 100755
--- a/configure
+++ b/configure
@@ -1725,6 +1725,19 @@ if test "$cocoa" = "yes"; then
 sdl=no
 fi

+# Some versions of Mac OS X incorrectly define SIZE_MAX
+cat > $TMPC << EOF
+#include <stdint.h>
+#include <stdio.h>
+int main(int argc, char *argv[]) {
+return printf("%zu", SIZE_MAX);
+}
+EOF
+have_broken_size_max=no
+if ! compile_object -Werror ; then
+have_broken_size_max=yes
+fi
+
 ##
 # L2TPV3 probe

@@ -5245,6 +5258,9 @@ fi
 if test "$have_ifaddrs_h" = "yes" ; then
 echo "HAVE_IFADDRS_H=y" >> $config_host_mak
 fi
+if test "$have_broken_size_max" = "yes" ; then
+echo "HAVE_BROKEN_SIZE_MAX=y" >> $config_host_mak
+fi

 # Work around a system header bug with some kernel/XFS header
 # versions where they both try to define 'struct fsxattr':
-- 
2.7.4




[Qemu-block] [PATCH v6 08.5/15] fixup! qstring: Add qstring_wrap_str()

2016-10-11 Thread Eric Blake
Signed-off-by: Eric Blake 

---
squash this in to address Marc-Andre's finding on a harmless
extra g_free.
---
 block.c | 9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/block.c b/block.c
index 16d5981..6f00fd4 100644
--- a/block.c
+++ b/block.c
@@ -1605,7 +1605,7 @@ static BlockDriverState 
*bdrv_append_temp_snapshot(BlockDriverState *bs,
 char *tmp_filename = g_malloc0(PATH_MAX + 1);
 int64_t total_size;
 QemuOpts *opts = NULL;
-BlockDriverState *bs_snapshot;
+BlockDriverState *bs_snapshot = NULL;
 int ret;

 /* if snapshot, we create a temporary backing file and open it
@@ -1658,13 +1658,10 @@ static BlockDriverState 
*bdrv_append_temp_snapshot(BlockDriverState *bs,
 bdrv_ref(bs_snapshot);
 bdrv_append(bs_snapshot, bs);

+out:
+QDECREF(snapshot_options);
 g_free(tmp_filename);
 return bs_snapshot;
-
-out:
-QDECREF(snapshot_options);
-g_free(tmp_filename);
-return NULL;
 }

 /*
-- 
2.7.4




Re: [Qemu-block] [PATCH 1/3] block: add BDS field to count in-flight requests

2016-10-11 Thread Kevin Wolf
Am 11.10.2016 um 16:09 hat Paolo Bonzini geschrieben:
> > Anyway, let's see which of the existing bdrv_drained_begin/end users
> > would use this (please correct):
> > 
> > * Block jobs use during completion
> >
> > * The QMP transaction commands to start block jobs drain as well, but
> >   they don't have a BlockBackend yet. Those would call a BDS-level
> >   bdrv_isolate_begin/end then, right?
> > 
> > * Quorum wants to isolate itself from new parent requests while adding a
> >   new child, that's another user for bdrv_isolate_begin/end
> > 
> > * bdrv_snapshot_delete(), probably the same BDS-level thing
> 
> Is what you call "a BDS-level bdrv_isolate_begin/end" the same as my
> "bdrv_drained_begin(bs) and bdrv_drained_end(bs), which quiesce all root
> BdrvChildren reachable from bs"?  Anyway I think we agree.

Ah, is your bdrv_drained_begin() working in child-to-parent direction,
too? Then I misunderstood and it's the same, yes.

> >> blk_drain and blk_co_drain can simply count in-flight requests, exactly
> >> as done in patch 1.  Sure, I'm adding it at the wrong layer because not
> >> every user of the block layer has already gotten its own personal BB.
> > 
> > Really? Where is it still missing? I was pretty sure I had converted all
> > users.
> 
> I mentioned block jobs, block migration and the NBD server.  They do use
> a BB (as you said, they wouldn't compile after the BdrvChild
> conversion), but they don't have their own *separate* BB as far as I can
> tell, with separate throttling state etc.  How far are we from that?

They have separate BBs, we're already there.

> > As for .bdrv_inactivate/invalidate, I agree that it should ideally
> > imply .bdrv_drained_begin/end, though in practice we don't assert
> > cleared BDRV_O_INACTIVE on reads, so I think it might not hold true.
> > 
> > I'm not so sure if .bdrv_drained_begin/end should imply inactivation,
> > though.
> 
> Again, bad naming hurts...  See above where you said that bdrv_drain
> should be called from bdrv_close only, and not be recursive.  And indeed
> in qcow2_close you see:
> 
> if (!(s->flags & BDRV_O_INACTIVE)) {
> qcow2_inactivate(bs);
> }
> 
> so this in bdrv_close:
> 
> bdrv_flush(bs);
> bdrv_drain(bs); /* in case flush left pending I/O */
> 
> should be replaced by a non-recursive bdrv_inactivate, and bdrv_flush
> should be the default implementation of bdrv_inactivate.  The QED code
> added in patch 3 can also become its .bdrv_inactivate callback.  In
> addition, stopping the VM could also do a "full flush" without setting
> BDRV_O_INACTIVE instead of using bdrv_flush_all.

bdrv_drain() should only be called from bdrv_close(), yes, and I would
agree with inactivating an image as the first step before we close it.

However, the .bdrv_drained_begin/end callbacks in BlockDriver are still
called in the context of isolation and that's different from
inactivating.

> > Why don't we start with adding a proper blk_drain so that we don't have
> > to introduce the ugly intermediate state? This would mostly mean
> > splitting the throttling drain into one version called by blk_drain that
> > restarts all requests while ignoring the throttling, and changing the
> > BdrvChild callback to only stop sending them. Once you're there, I think
> > that gets you rid of the BDS request count manipulations from the BB
> > layer because the BDS layer already counts all requests.
> 
> I guess that would be doable.  However I don't think it's so easy.  I
> also don't think it's very interesting (that's probably where we disagree).
> 
> First of all, the current implementation of bdrv_drain is nasty,
> especially the double aio_poll and the recursive bdrv_requests_pending
> are hard to follow.  My aim for these two patches was to have an
> implementation of bdrv_drain that is more easily understandable, so that
> you can _then_ move things to the correct layer.  So even if I
> implemented a series to add a proper blk_drain, I would start with these
> two patches and then move things up.
> 
> Second, there's the issue of bdrv_drained_begin/end, which is the main
> reason why I placed the request count at the BDS level.  BlockBackend
> isolation is a requirement before you can count requests exclusively at
> the BB level.  At which point, implementing isolated sections properly
> (including migration from aio_disable/enable_external to two new
> BlockDevOps, and giving NBD+jobs+migration a separate BB) is more
> interesting than cobbling together the minimum that's enough to
> eliminate bs->in_flight.

I think my point was that you don't have to count requests at the BB
level if you know that there are no requests pending on the BB level
that haven't reached the BDS level yet. If you move the restarting of
throttled requests to blk_drain(), shouldn't that be all you need to do
on the BB level for now?

Of course, having the full thing would be nice, but I don't think it's a
requirement.

Hm, or actually, doesn't the BdrvChild 

Re: [Qemu-block] [Qemu-devel] [PATCH v5 06/14] nbd: Send message along with server NBD_REP_ERR errors

2016-10-11 Thread Eric Blake
On 07/19/2016 12:15 AM, Fam Zheng wrote:
> On Mon, 07/18 22:07, Eric Blake wrote:
>>  nbd/server.c | 78 
>> +---
>>  1 file changed, 59 insertions(+), 19 deletions(-)
>>
>> diff --git a/nbd/server.c b/nbd/server.c
>> index c8716f1..ad31c4a 100644
>> --- a/nbd/server.c
>> +++ b/nbd/server.c
>> @@ -235,6 +235,38 @@ static int nbd_negotiate_send_rep(QIOChannel *ioc, 
>> uint32_t type, uint32_t opt)
>>  return nbd_negotiate_send_rep_len(ioc, type, opt, 0);
>>  }
>>
>> +/* Send an error reply.
>> + * Return -errno to kill connection, 0 to continue negotiation. */
>> +static int GCC_FMT_ATTR(4, 5)
>> +nbd_negotiate_send_rep_err(QIOChannel *ioc, uint32_t type,
> 
> Isn't the function name supposed to be placed at col 0?

I blame emacs for that one.  I'm (finally) back to working on this
series, and will have a v6 posted soon addressing comments here and
elsewhere.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org





Re: [Qemu-block] [Qemu-devel] [PATCH v6 08/15] qstring: Add qstring_wrap_str()

2016-10-11 Thread Eric Blake
On 10/11/2016 10:04 AM, Eric Blake wrote:
> On 10/11/2016 06:08 AM, Marc-André Lureau wrote:
> 
>>> +++ b/block.c
>>> @@ -1640,7 +1640,8 @@ static BlockDriverState
>>> *bdrv_append_temp_snapshot(BlockDriverState *bs,
>>>  qdict_put(snapshot_options, "file.driver",
>>>qstring_from_str("file"));
>>>  qdict_put(snapshot_options, "file.filename",
>>> -  qstring_from_str(tmp_filename));
>>> +  qstring_wrap_str(tmp_filename));
>>> +tmp_filename = NULL;
>>>  qdict_put(snapshot_options, "driver",
>>>qstring_from_str("qcow2"));
>>>
>>>
>> You could also remove g_free(tmp_filename) from the normal return path
>> (this may please static analyzers).
> 
> No. g_free(NULL) is safe, but we can also reach the 'out' label with
> tmp_filename still malloc'd prior to the place where we transfer it
> here, so the g_free() in the cleanup label is still required.  The
> assignment to NULL here prevents a double free.  The patch is correct as-is.

Spoke too soon.  I see what you're saying - the normal return path now
has a dead g_free(NULL).  It won't cause any grief to the static
analyzers, but it is a useless no-op function call, so I can indeed trim
it (the one before the label, not the one after).

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org





Re: [Qemu-block] [Qemu-devel] [PATCH v6 08/15] qstring: Add qstring_wrap_str()

2016-10-11 Thread Eric Blake
On 10/11/2016 06:08 AM, Marc-André Lureau wrote:

>> +++ b/block.c
>> @@ -1640,7 +1640,8 @@ static BlockDriverState
>> *bdrv_append_temp_snapshot(BlockDriverState *bs,
>>  qdict_put(snapshot_options, "file.driver",
>>qstring_from_str("file"));
>>  qdict_put(snapshot_options, "file.filename",
>> -  qstring_from_str(tmp_filename));
>> +  qstring_wrap_str(tmp_filename));
>> +tmp_filename = NULL;
>>  qdict_put(snapshot_options, "driver",
>>qstring_from_str("qcow2"));
>>
>>
> You could also remove g_free(tmp_filename) from the normal return path
> (this may please static analyzers).

No. g_free(NULL) is safe, but we can also reach the 'out' label with
tmp_filename still malloc'd prior to the place where we transfer it
here, so the g_free() in the cleanup label is still required.  The
assignment to NULL here prevents a double free.  The patch is correct as-is.


-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org





Re: [Qemu-block] [Qemu-devel] [PATCH v2 2/2] block/replication: Clarify 'top-id' parameter usage

2016-10-11 Thread Eric Blake
On 10/11/2016 05:46 AM, Changlong Xie wrote:
> Replication driver only support 'top-id' parameter in secondary side,
> and it must not be supplied in primary side

Grammar suggestion:

The replication driver only supports the 'top-id' parameter for the
secondary side; it must not be supplied for the primary side.

> 
> Signed-off-by: Changlong Xie 
> ---
>  block/replication.c  | 5 +
>  qapi/block-core.json | 3 ++-
>  2 files changed, 7 insertions(+), 1 deletion(-)
> 

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org





Re: [Qemu-block] [Qemu-devel] [PATCH v2 1/2] block/replication: prefect the logic to acquire 'top_id'

2016-10-11 Thread Eric Blake
On 10/11/2016 05:46 AM, Changlong Xie wrote:
> Only g_strdup(top_id) if 'top_id' is not NULL, although there
> is no memory leak here
> 
> Signed-off-by: Changlong Xie 
> ---
>  block/replication.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/block/replication.c b/block/replication.c
> index 3bd1cf1..5b432d9 100644
> --- a/block/replication.c
> +++ b/block/replication.c
> @@ -104,11 +104,11 @@ static int replication_open(BlockDriverState *bs, QDict 
> *options,
>  } else if (!strcmp(mode, "secondary")) {
>  s->mode = REPLICATION_MODE_SECONDARY;
>  top_id = qemu_opt_get(opts, REPLICATION_TOP_ID);
> -s->top_id = g_strdup(top_id);

g_strdup(NULL) is safe; it returns NULL in that case.

> -if (!s->top_id) {
> +if (!top_id) {
>  error_setg(&local_err, "Missing the option top-id");
>  goto fail;
>  }
> +s->top_id = g_strdup(top_id);

I see no point to this patch, rather than churn.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org





Re: [Qemu-block] [Qemu-devel] [PATCH v10 09/16] block: Add QMP support for streaming to an intermediate layer

2016-10-11 Thread Alberto Garcia
On Mon 10 Oct 2016 09:09:25 PM CEST, Eric Blake wrote:
>>  # @job-id: #optional identifier for the newly-created block job. If
>>  #  omitted, the device name will be used. (Since 2.7)
>>  #
>> -# @device: the device name or node-name of a root node
>> +# @device: the device or node name of the top image
>>  #
>>  # @base:   #optional the common backing file name
>>  #
>> -# @backing-file: #optional The backing file string to write into the active
>> -#  layer. This filename is not validated.
>> +# @backing-file: #optional The backing file string to write into the top
>> +#  image. This filename is not validated.
>
> No change to the actual introspection. What's the easiest way for
> libvirt to learn if the new style command will work, short of actually
> trying it to see if it succeeds or fails?  (That may be sufficient,
> but the error message quality can often be improved in libvirt if it
> is known up front if qemu is new enough rather than taking the 'try
> and see' approach and getting stuck with qemu's error message)

I don't think there's any straightforward way. Some ideas:

1) Try to start a block-stream job that does nothing. When base ==
backing_bs(device) that's a no-op, but it fails if device is not the
root node and intermediate block streaming is not supported.

2) We could add an extra parameter. For example 'base-node', which would
be the same as 'base' but would take a node name instead of a file name.

3) QEMU could advertise that feature to the client. This is probably
simpler than trying to figure it out from the API. I guess that's the
idea of 'qmp_capabilities'?

Berto



Re: [Qemu-block] [Qemu-devel] [PATCH v4] build: Work around SIZE_MAX bug in OSX headers

2016-10-11 Thread Eric Blake
On 10/11/2016 02:18 AM, Markus Armbruster wrote:

>> +#define SIZE_MAX ((sizeof(char)) * -1)
> 
> All right, let's see how this works.
> 

> 
> Cute, but what's wrong with straightforward
> 
> #define SIZE_MAX ((size_t)-1)

I was trying to make the macro usable even in situations where 'size_t'
itself is undefined (because sufficient headers were not pre-included).
But you're right: <stdint.h> pulls in <stddef.h>, and therefore size_t
is always available.  And since neither my solution nor your shorter
direct cast is usable in a preprocessor expression, neither has that
advantage in its favor (a solution that works in ALL scenarios required
by C99, at the expense of more text, would have an advantage).  So I
guess going with the shorter direct cast is just fine.
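
For illustration, a hedged side-by-side of the two forms discussed (the _V4
and _V5 names are hypothetical, used only for this comparison):

#include <stdio.h>

#define SIZE_MAX_V4 ((sizeof(char)) * -1)   /* v4: promotion via sizeof */
#define SIZE_MAX_V5 ((size_t)-1)            /* v5: straightforward cast */

int main(void)
{
    /* Both expressions have type size_t, so "%zu" is clean; neither form is
     * usable in a preprocessor #if, which is the trade-off noted above. */
    printf("%zu %zu\n", SIZE_MAX_V4, SIZE_MAX_V5);
    return 0;
}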

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org





Re: [Qemu-block] [PATCH 1/3] block: add BDS field to count in-flight requests

2016-10-11 Thread Paolo Bonzini
First of all, a correction:

>> The exception is that there are places
>> where we don't have a BlockBackend and thus call
>> bdrv_drain/bdrv_co_drain instead of blk_drain/blk_co_drain

Nevermind---it's just that there is no blk_drained_begin/end yet.

On 11/10/2016 13:00, Kevin Wolf wrote:
> Actually, does bdrv_set_aio_context really need bdrv_drain, or should it
> be using a begin/end pair, too?

Hmm, yes, it should use a pair.
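
Roughly like this (untested sketch; the subtle part is that
bdrv_drained_end() must re-enable external file descriptors on the context
that was actually disabled, so that bookkeeping has to follow the node
across the move):

void bdrv_set_aio_context(BlockDriverState *bs, AioContext *new_context)
{
    bdrv_drained_begin(bs);       /* quiesce users and flush in-flight I/O */

    bdrv_detach_aio_context(bs);

    /* This runs in the old AioContext, so acquire the new one in case it
     * belongs to a different thread. */
    aio_context_acquire(new_context);
    bdrv_attach_aio_context(bs, new_context);

    bdrv_drained_end(bs);         /* see the caveat above */
    aio_context_release(new_context);
}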

> I was wondering if we need separate .bdrv_isolate_begin/end callbacks,
> but I think on this level there is actually no difference: Both tell the
> block driver to stop sending requests.

Yup, the difference is in whether you exempt one BlockBackend.  If you
do, that BB is isolated.  If you don't, the BDS is quiesced (or drained).

> Anyway, let's see which of the existing bdrv_drained_begin/end users
> would use this (please correct):
> 
> * Block jobs use during completion
>
> * The QMP transaction commands to start block jobs drain as well, but
>   they don't have a BlockBackend yet. Those would call a BDS-level
>   bdrv_isolate_begin/end then, right?
> 
> * Quorum wants to isolate itself from new parent requests while adding a
>   new child, that's another user for bdrv_isolate_begin/end
> 
> * bdrv_snapshot_delete(), probably the save BDS-level thing

Is what you call "a BDS-level bdrv_isolate_begin/end" the same as my
"bdrv_drained_begin(bs) and bdrv_drained_end(bs), which quiesce all root
BdrvChildren reachable from bs"?  Anyway I think we agree.

> Okay, I think we agree on the theory.
> 
> Now I'm curious how this relates to the layering violation in this
> specific patch.

It doesn't, but knowing that we want to go to the same place helps.
Especially since you said I was permanently preventing getting there. :)

>> blk_drain and blk_co_drain can simply count in-flight requests, exactly
>> as done in patch 1.  Sure, I'm adding it at the wrong layer because not
>> every user of the block layer has already gotten its own personal BB.
> 
> Really? Where is it still missing? I was pretty sure I had converted all
> users.

I mentioned block jobs, block migration and the NBD server.  They do use
a BB (as you said, they wouldn't compile after the BdrvChild
conversion), but they don't have their own *separate* BB as far as I can
tell, with separate throttling state etc.  How far are we from that?

> As for .bdrv_inactivate/invalidate, I agree that it should ideally
> imply .bdrv_drained_begin/end, though in practice we don't assert
> cleared BDRV_O_INACTIVE on reads, so I think it might not hold true.
> 
> I'm not so sure if .bdrv_drained_begin/end should imply inactivation,
> though.

Again, bad naming hurts...  See above where you said that bdrv_drain
should be called from bdrv_close only, and not be recursive.  And indeed
in qcow2_close you see:

if (!(s->flags & BDRV_O_INACTIVE)) {
qcow2_inactivate(bs);
}

so this in bdrv_close:

bdrv_flush(bs);
bdrv_drain(bs); /* in case flush left pending I/O */

should be replaced by a non-recursive bdrv_inactivate, and bdrv_flush
should be the default implementation of bdrv_inactivate.  The QED code
added in patch 3 can also become its .bdrv_inactivate callback.  In
addition, stopping the VM could also do a "full flush" without setting
BDRV_O_INACTIVE instead of using bdrv_flush_all.
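
Concretely, something like this rough sketch (the function name is made up
so it doesn't clash with today's migration-only bdrv_inactivate(); the
.bdrv_inactivate driver callback itself already exists):

static int bdrv_do_inactivate(BlockDriverState *bs)
{
    if (bs->drv && bs->drv->bdrv_inactivate) {
        /* e.g. qcow2, and QED once the code from patch 3 is hooked up */
        return bs->drv->bdrv_inactivate(bs);
    }

    /* default implementation: just get the data out */
    return bdrv_flush(bs);
}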

> Why don't we start with adding a proper blk_drain so that we don't have
> to introduce the ugly intermediate state? This would mostly mean
> splitting the throttling drain into one version called by blk_drain that
> restarts all requests while ignoring the throttling, and changing the
> BdrvChild callback to only stop sending them. Once you're there, I think
> that gets you rid of the BDS request count manipulations from the BB
> layer because the BDS layer already counts all requests.

I guess that would be doable.  However I don't think it's so easy.  I
also don't think it's very interesting (that's probably where we disagree).

First of all, the current implementation of bdrv_drain is nasty,
especially the double aio_poll and the recursive bdrv_requests_pending
are hard to follow.  My aim for these two patches was to have an
implementation of bdrv_drain that is more easily understandable, so that
you can _then_ move things to the correct layer.  So even if I
implemented a series to add a proper blk_drain, I would start with these
two patches and then move things up.
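
To make that concrete, the drain I'm aiming for with these patches is
roughly this (simplified sketch, leaving out throttling and the
drained-section bookkeeping discussed above):

void bdrv_inc_in_flight(BlockDriverState *bs)
{
    atomic_inc(&bs->in_flight);
}

void bdrv_dec_in_flight(BlockDriverState *bs)
{
    atomic_dec(&bs->in_flight);
}

void bdrv_drain(BlockDriverState *bs)
{
    bdrv_drain_recurse(bs);       /* let drivers flush their internal queues */

    while (atomic_read(&bs->in_flight) > 0) {
        aio_poll(bdrv_get_aio_context(bs), true);
    }
}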

Second, there's the issue of bdrv_drained_begin/end, which is the main
reason why I placed the request count at the BDS level.  BlockBackend
isolation is a requirement before you can count requests exclusively at
the BB level.  At which point, implementing isolated sections properly
(including migration from aio_disable/enable_external to two new
BlockDevOps, and giving NBD+jobs+migration a separate BB) is more
interesting than cobbling together the minimum that's enough to
eliminate bs->in_flight.

All in all I don't think this should be done as a single patch 

[Qemu-block] [PATCH v5 1/1] block: improve error handling in raw_open

2016-10-11 Thread Halil Pasic
Make raw_open for POSIX more consistent in handling errors by setting
the error object also when qemu_open fails. The error object was
generally set in case of errors, but I guess this case was overlooked.
Do the same for win32.

Signed-off-by: Halil Pasic 
Reviewed-by: Sascha Silbe 
Tested-by: Marc Hartmayer  (POSIX only)

---

Stumbled upon this (POSIX) while testing VMs with too many SCSI disks with
respect to my nofile limit. When open hits the nofile limit while trying
to hotplug yet another SCSI disk via libvirt, we end up without an adequate
error message (one stating there are too many open files). Sadly this patch
is not sufficient to fix this problem, because drive_new (/qemu/blockdev.c)
handles errors using error_report_err, which is documented as not to be
used in QMP context.

The win32 part was not tested, and the sole reason I touched it is
to not introduce unnecessary divergence.

v4 -> v5:
* fix qemu-iotests by adding the filename to the message
v3 -> v4:
* rebased on current master
v2 -> v3:
* first save errno then error_setg_errno
v1 -> v2:
* fixed win32 by the correct error_setg_*
* use the original errno consequently
---
 block/raw-posix.c | 1 +
 block/raw-win32.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index 166e9d1..f481e57 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -443,6 +443,7 @@ static int raw_open_common(BlockDriverState *bs, QDict 
*options,
 fd = qemu_open(filename, s->open_flags, 0644);
 if (fd < 0) {
 ret = -errno;
+error_setg_errno(errp, errno, "Could not open '%s'", filename);
 if (ret == -EROFS) {
 ret = -EACCES;
 }
diff --git a/block/raw-win32.c b/block/raw-win32.c
index 734bb10..800fabd 100644
--- a/block/raw-win32.c
+++ b/block/raw-win32.c
@@ -373,6 +373,7 @@ static int raw_open(BlockDriverState *bs, QDict *options, 
int flags,
 if (s->hfile == INVALID_HANDLE_VALUE) {
 int err = GetLastError();
 
+error_setg_win32(errp, err, "Could not open '%s'", filename);
 if (err == ERROR_ACCESS_DENIED) {
 ret = -EACCES;
 } else {
-- 
2.8.4




Re: [Qemu-block] [PATCH v10 03/16] block: Use block_job_add_bdrv() in mirror_start_job()

2016-10-11 Thread Kevin Wolf
On 11.10.2016 at 15:46, Alberto Garcia wrote:
> On Mon 10 Oct 2016 06:03:41 PM CEST, Kevin Wolf wrote:
> 
> >> Use block_job_add_bdrv() instead of blocking all operations in
> >> mirror_start_job() and unblocking them in mirror_exit().
> >> 
> >> Signed-off-by: Alberto Garcia 
> >
> > Compared to the old code, this unblocks BLOCK_OP_TYPE_DATAPLANE, i.e.
> > you can now run a dataplane device on a BDS used as the mirror target.
> >
> > This means that the target could require a different AioContext than
> > the source, which we can't support. So it seems unlikely to me that we
> > can lift this restriction.
> 
> Thanks, I'll fix it.
> 
> What happens if you run a dataplane on the source, though? That's
> currently allowed as far as I'm aware. Wouldn't that have a similar
> effect?

The block job takes care to put the target into the same dataplane
AioContext then. The job doesn't really care whether it works in the
main thread or a separate I/O thread, it just requires that it's a
single context, which is currently defined by the source.

Kevin



Re: [Qemu-block] [PATCH v10 03/16] block: Use block_job_add_bdrv() in mirror_start_job()

2016-10-11 Thread Alberto Garcia
On Mon 10 Oct 2016 06:03:41 PM CEST, Kevin Wolf wrote:

>> Use block_job_add_bdrv() instead of blocking all operations in
>> mirror_start_job() and unblocking them in mirror_exit().
>> 
>> Signed-off-by: Alberto Garcia 
>
> Compared to the old code, this unblocks BLOCK_OP_TYPE_DATAPLANE, i.e.
> you can now run a dataplane device on a BDS used as the mirror target.
>
> This means that the target could require a different AioContext than
> the source, which we can't support. So it seems unlikely to me that we
> can lift this restriction.

Thanks, I'll fix it.

What happens if you run a dataplane on the source, though? That's
currently allowed as far as I'm aware. Wouldn't that have a similar
effect?

Berto



Re: [Qemu-block] [Qemu-devel] [PATCH v2 02/11] blockjob: centralize QMP event emissions

2016-10-11 Thread Kashyap Chamarthy
On Mon, Oct 10, 2016 at 02:28:52PM -0500, Eric Blake wrote:
> On 10/10/2016 01:36 PM, John Snow wrote:

[...]

> > I'll be honest that I don't know; this is related to Replication which I
> > know reasonably little about overall. It got added in the 2.8 timeframe,
> > so the behavior it currently exhibits is not a good or meaningful
> > reference for maintaining compatibility.
> > 
> > We can /change/ the behavior before releases with no love lost.
> 
> And if Replication is the only way to trigger internal use of jobs, then
> we aren't breaking libvirt (which doesn't know how to drive replication
> yet) by changing anything on that front.

Exactly.

> > Or, what if you just didn't get events for internal jobs? Are events for
> > un-managed jobs useful in any sense beyond understanding the stateful
> > availability of the drive to participate in a new job?
> 
> If libvirt isn't driving replication, then it's a moot point. And even
> though replication in libvirt is not supported yet, I suspect that down
> the road when support is added, the easiest thing for libvirt will be to
> state that replication and libvirt-controlled block jobs are mutually
> exclusive; there are probably enough lurking dragons that if your system
> MUST be highly reliable via replication, you probably don't want to be
> confusing things by changing the backing file depth manually with
> streams, pulls, or other manual actions at the same time as replication
> is managing the system, because how can you guarantee that both primary
> and secondary see the same manual actions at all the right times?

Very nice argument for making them mutually exclusive, from a libvirt
POV.

> At any rate, not seeing internal-only jobs is probably the most
> reasonable, even if it means an internal-only job can block the attempt
> to do a manual job.

FWIW, I agree, if only as a user / observer of these events during
debugging.

[...]

-- 
/kashyap



[Qemu-block] [PATCH v3] block: Remove "options" indirection from blockdev-add

2016-10-11 Thread Kevin Wolf
Now that QAPI supports boxed types, we can have unions at the top level
of a command, so let's put our real options directly there for
blockdev-add instead of having a single "options" dict that contains the
real arguments.

blockdev-add is still experimental and we already made substantial
changes to the API recently, so we're free to make changes like this
one, too.

Signed-off-by: Kevin Wolf 
Reviewed-by: Eric Blake 
Reviewed-by: Markus Armbruster 
---
 docs/qmp-commands.txt  |  84 ---
 qapi/block-core.json   |   4 +-
 tests/qemu-iotests/041 |  11 +++--
 tests/qemu-iotests/067 |  12 +++--
 tests/qemu-iotests/071 | 118 +
 tests/qemu-iotests/081 |  52 ++
 tests/qemu-iotests/085 |   9 ++--
 tests/qemu-iotests/087 |  76 +--
 tests/qemu-iotests/117 |  12 ++---
 tests/qemu-iotests/118 |  42 +-
 tests/qemu-iotests/124 |  20 -
 tests/qemu-iotests/139 |  10 ++---
 tests/qemu-iotests/141 |  13 +++---
 tests/qemu-iotests/155 |  10 ++---
 14 files changed, 214 insertions(+), 259 deletions(-)

diff --git a/docs/qmp-commands.txt b/docs/qmp-commands.txt
index 7f652e0..cb99dcc 100644
--- a/docs/qmp-commands.txt
+++ b/docs/qmp-commands.txt
@@ -1090,11 +1090,11 @@ Arguments:
 Example:
 
 -> { "execute": "blockdev-add",
-"arguments": { "options": { "driver": "qcow2",
-"node-name": "node1534",
-"file": { "driver": "file",
-  "filename": "hd1.qcow2" 
},
-"backing": "" } } }
+"arguments": { "driver": "qcow2",
+   "node-name": "node1534",
+   "file": { "driver": "file",
+ "filename": "hd1.qcow2" },
+   "backing": "" } }
 
 <- { "return": {} }
 
@@ -3123,41 +3123,37 @@ This command is still a work in progress.  It doesn't 
support all
 block drivers among other things.  Stay away from it unless you want
 to help with its development.
 
-Arguments:
-
-- "options": block driver options
+For the arguments, see the QAPI schema documentation of BlockdevOptions.
 
 Example (1):
 
 -> { "execute": "blockdev-add",
-"arguments": { "options" : { "driver": "qcow2",
- "file": { "driver": "file",
-   "filename": "test.qcow2" } } } }
+"arguments": { "driver": "qcow2",
+   "file": { "driver": "file",
+ "filename": "test.qcow2" } } }
 <- { "return": {} }
 
 Example (2):
 
 -> { "execute": "blockdev-add",
  "arguments": {
- "options": {
-   "driver": "qcow2",
-   "node-name": "my_disk",
-   "discard": "unmap",
-   "cache": {
-   "direct": true,
-   "writeback": true
-   },
-   "file": {
-   "driver": "file",
-   "filename": "/tmp/test.qcow2"
-   },
-   "backing": {
-   "driver": "raw",
-   "file": {
-   "driver": "file",
-   "filename": "/dev/fdset/4"
-   }
-   }
+ "driver": "qcow2",
+ "node-name": "my_disk",
+ "discard": "unmap",
+ "cache": {
+ "direct": true,
+ "writeback": true
+ },
+ "file": {
+ "driver": "file",
+ "filename": "/tmp/test.qcow2"
+ },
+ "backing": {
+ "driver": "raw",
+ "file": {
+ "driver": "file",
+ "filename": "/dev/fdset/4"
+ }
  }
}
  }
@@ -3184,13 +3180,11 @@ Example:
 
 -> { "execute": "blockdev-add",
  "arguments": {
- "options": {
- "driver": "qcow2",
- "node-name": "node0",
- "file": {
- "driver": "file",
- "filename": "test.qcow2"
- }
+ "driver": "qcow2",
+ "node-name": "node0",
+ "file": {
+ "driver": "file",
+ "filename": "test.qcow2"
  }
  }
}
@@ -3335,10 +3329,10 @@ Arguments:
 Example:
 
 -> { "execute": "blockdev-add",
- "arguments": { "options": { "node-name": "node0",
- "driver": "raw",
- "file": { "driver": "file",
-   "filename": "fedora.iso" } } } }
+ "arguments": { { "node-name": "node0",
+  "driver": "raw",
+  "file": { "driver": "file",
+"filename": "fedora.iso" } } }
 
 <- { "return": {} }
 
@@ -3376,10 +3370,10 

Re: [Qemu-block] [PATCH 06/22] qcow2: add dirty bitmaps extension

2016-10-11 Thread Vladimir Sementsov-Ogievskiy

On 01.10.2016 17:46, Max Reitz wrote:

On 30.09.2016 12:53, Vladimir Sementsov-Ogievskiy wrote:

Add dirty bitmap extension as specified in docs/specs/qcow2.txt.
For now, just mirror extension header into Qcow2 state and check
constraints.

For now, disable image resize if it has bitmaps. It will be fixed later.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  block/qcow2.c | 83 +++
  block/qcow2.h |  4 +++
  2 files changed, 87 insertions(+)

diff --git a/block/qcow2.c b/block/qcow2.c
index c079aa8..08c4ef9 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c

[...]


@@ -162,6 +164,62 @@ static int qcow2_read_extensions(BlockDriverState *bs, 
uint64_t start_offset,
  }
  break;
  
+case QCOW2_EXT_MAGIC_DIRTY_BITMAPS:

+ret = bdrv_pread(bs->file, offset, &bitmaps_ext, ext.len);

Overflows with ext.len > sizeof(bitmaps_ext).

(ext.len < sizeof(bitmaps_ext) is also wrong, but less dramatically so.)


+if (ret < 0) {
+error_setg_errno(errp, -ret, "ERROR: bitmaps_ext: "
+ "Could not read ext header");
+return ret;
+}
+
+if (bitmaps_ext.reserved32 != 0) {
+error_setg_errno(errp, -ret, "ERROR: bitmaps_ext: "
+ "Reserved field is not zero.");

Please drop the full stop at the end.


what do you mean? goto fail: here? or not stop at all, just print the error?




+return -EINVAL;
+}
+
+be32_to_cpus(&bitmaps_ext.nb_bitmaps);
+be64_to_cpus(&bitmaps_ext.bitmap_directory_size);
+be64_to_cpus(&bitmaps_ext.bitmap_directory_offset);
+
+if (bitmaps_ext.nb_bitmaps > QCOW_MAX_DIRTY_BITMAPS) {
+error_setg(errp, "ERROR: bitmaps_ext: "
+ "too many dirty bitmaps");
+return -EINVAL;
+}
+
+if (bitmaps_ext.nb_bitmaps == 0) {
+error_setg(errp, "ERROR: bitmaps_ext: "
+ "found bitmaps extension with zero bitmaps");
+return -EINVAL;
+}
+
+if (bitmaps_ext.bitmap_directory_offset & (s->cluster_size - 1)) {
+error_setg(errp, "ERROR: bitmaps_ext: "
+ "wrong dirty bitmap directory offset");

s/wrong/invalid/


+return -EINVAL;
+}
+
+if (bitmaps_ext.bitmap_directory_size >
+QCOW_MAX_DIRTY_BITMAP_DIRECTORY_SIZE) {
+error_setg(errp, "ERROR: bitmaps_ext: "
+ "too large dirty bitmap directory");
+return -EINVAL;
+}
+
+s->nb_bitmaps = bitmaps_ext.nb_bitmaps;
+s->bitmap_directory_offset =
+bitmaps_ext.bitmap_directory_offset;
+s->bitmap_directory_size =
+bitmaps_ext.bitmap_directory_size;
+
+#ifdef DEBUG_EXT
+printf("Qcow2: Got dirty bitmaps extension:"
+   " offset=%" PRIu64 " nb_bitmaps=%" PRIu32 "\n",
+   s->bitmap_directory_offset, s->nb_bitmaps);
+#endif
+break;
+
  default:
  /* unknown magic - save it in case we need to rewrite the header 
*/
  {

[...]


@@ -2509,6 +2585,13 @@ static int qcow2_truncate(BlockDriverState *bs, int64_t 
offset)
  return -ENOTSUP;
  }
  
+/* cannot proceed if image has bitmaps */

+if (s->nb_bitmaps) {
+/* FIXME */

I'd call it a TODO, but that's probably a matter of taste.


+error_report("Can't resize an image which has bitmaps");
+return -ENOTSUP;
+}
+
  /* shrinking is currently not supported */
  if (offset < bs->total_sectors * 512) {
  error_report("qcow2 doesn't support shrinking images yet");

Max




--
Best regards,
Vladimir




Re: [Qemu-block] [PATCH 05/22] qcow2-bitmap: structs and consts

2016-10-11 Thread Vladimir Sementsov-Ogievskiy

On 01.10.2016 17:34, Max Reitz wrote:

On 30.09.2016 12:53, Vladimir Sementsov-Ogievskiy wrote:

Create block/qcow2-bitmap.c
Add data structures and constraints accordingly to docs/specs/qcow2.txt

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  block/Makefile.objs  |  2 +-
  block/qcow2-bitmap.c | 47 +++
  block/qcow2.h| 29 +
  3 files changed, 77 insertions(+), 1 deletion(-)
  create mode 100644 block/qcow2-bitmap.c

diff --git a/block/Makefile.objs b/block/Makefile.objs
index fa4d8b8..0f661bb 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -1,5 +1,5 @@
  block-obj-y += raw_bsd.o qcow.o vdi.o vmdk.o cloop.o bochs.o vpc.o vvfat.o 
dmg.o
-block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o 
qcow2-cache.o
+block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o 
qcow2-cache.o qcow2-bitmap.o
  block-obj-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o
  block-obj-y += qed-check.o
  block-obj-$(CONFIG_VHDX) += vhdx.o vhdx-endian.o vhdx-log.o
diff --git a/block/qcow2-bitmap.c b/block/qcow2-bitmap.c
new file mode 100644
index 000..cd18b07
--- /dev/null
+++ b/block/qcow2-bitmap.c
@@ -0,0 +1,47 @@
+/*
+ * Bitmaps for the QCOW version 2 format
+ *
+ * Copyright (c) 2014-2016 Vladimir Sementsov-Ogievskiy
+ *
+ * This file is derived from qcow2-snapshot.c, original copyright:
+ * Copyright (c) 2004-2006 Fabrice Bellard
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to 
deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+/* NOTICE: BME here means Bitmaps Extension and used as a namespace for
+ * _internal_ constants. Please do not use this _internal_ abbreviation for
+ * other needs and/or outside of this file. */
+
+/* Bitmap directory entry constraints */
+#define BME_MAX_TABLE_SIZE 0x800000
+#define BME_MAX_PHYS_SIZE 0x20000000 /* 512 mb */

I suppose BME_MAX_TABLE_SIZE (8M) is greater than BME_MAX_PHYS_SIZE (512
MB) divided by the cluster size (>= 512; 512 MB / cluster_size <= 1 MB)
because fully zero or one clusters do not require any physical space?

Makes some sense, but I can see that this might make give some trouble
when trying to serialize overly large bitmaps. But I guess that comes
later in this series, so I'll wait for that point.

Another thing is that 512 MB is rather big. It gets worse: The bitmap
may only require 512 MB on disk, but with a maximum table size of 8 MB,
it can require up to 8M * cluster_size in memory (with just 64 MB of
disk space!) by using the "read as all zeroes" or "read as all ones"
flags. With the default cluster size of 64 kB, this would be 512 GB in
RAM. That sounds bad to me.

Well, it is probably fine as long as the bitmap is not auto-loaded...
But we do have a flag for exactly that. So it seems to me that a
manipulated image can easily consume huge amounts of RAM on the host.

So I think we also need some sane limitation on the in-RAM size of a
bitmap (which is BME_MAX_TABLE_SIZE * cluster_size, as far as I
understand). The question of course is, what is sane? For a server
system with no image manipulation possible from the outside, 1 GB may be
completely fine. But imagine you download some qcow2 image to your
laptop. Then, 1 GB may not be fine, actually.

Maybe it would make sense to use a runtime-adjustable limit here?


Actually BME_MAX_PHYS_SIZE is this limit:
in check_constraints we have

uint64_t phys_bitmap_bytes =
(uint64_t)h->bitmap_table_size * s->cluster_size;

...

(phys_bitmap_bytes > BME_MAX_PHYS_SIZE) ||




+#define BME_MAX_GRANULARITY_BITS 31
+#define BME_MIN_GRANULARITY_BITS 9
+#define BME_MAX_NAME_SIZE 1023
+
+/* Bitmap directory entry flags */
+#define BME_RESERVED_FLAGS 0x
+
+/* bits [1, 8] U [56, 63] are reserved */
+#define BME_TABLE_ENTRY_RESERVED_MASK 0xff000000000001fe

ull suffix is missing.


+
+typedef enum BitmapType {
+BT_DIRTY_TRACKING_BITMAP = 1
+} BitmapType;

Re: [Qemu-block] [Qemu-devel] [PATCH 0/2] qemu-img: change img_open() and opening method in dd

2016-10-11 Thread no-reply
Hi,

Your series failed automatic build test. Please find the testing commands and
their output below. If you have docker installed, you can probably reproduce it
locally.

Subject: [Qemu-devel] [PATCH 0/2] qemu-img: change img_open() and opening 
method in dd
Message-id: 20161007153617.28704-1-fullma...@gmail.com
Type: series

=== TEST SCRIPT BEGIN ===
#!/bin/bash
set -e
git submodule update --init dtc
# Let docker tests dump environment info
export SHOW_ENV=1
make J=8 docker-test-quick@centos6
make J=8 docker-test-mingw@fedora
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
83110f3 qemu-img: change opening method for the output in dd
7e616f7 qemu-img: make img_open() able to surpress file opening errors

=== OUTPUT BEGIN ===
Submodule 'dtc' (git://git.qemu-project.org/dtc.git) registered for path 'dtc'
Cloning into 'dtc'...
Submodule path 'dtc': checked out '65cc4d2748a2c2e6f27f1cf39e07a5dbabd80ebf'
  BUILD   centos6
  ARCHIVE qemu.tgz
  ARCHIVE dtc.tgz
  COPYRUNNER
  RUN test-quick in centos6
Packages installed:
SDL-devel-1.2.14-7.el6_7.1.x86_64
ccache-3.1.6-2.el6.x86_64
epel-release-6-8.noarch
gcc-4.4.7-17.el6.x86_64
git-1.7.1-4.el6_7.1.x86_64
glib2-devel-2.28.8-5.el6.x86_64
libfdt-devel-1.4.0-1.el6.x86_64
make-3.81-23.el6.x86_64
package g++ is not installed
pixman-devel-0.32.8-1.el6.x86_64
tar-1.23-15.el6_8.x86_64
zlib-devel-1.2.3-29.el6.x86_64

Environment variables:
PACKAGES=libfdt-devel ccache tar git make gcc g++ zlib-devel 
glib2-devel SDL-devel pixman-devel epel-release
HOSTNAME=0df14e0c14b8
TERM=xterm
MAKEFLAGS= -j8
HISTSIZE=1000
J=8
USER=root
CCACHE_DIR=/var/tmp/ccache
EXTRA_CONFIGURE_OPTS=
V=
SHOW_ENV=1
MAIL=/var/spool/mail/root
PATH=/usr/lib/ccache:/usr/lib64/ccache:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PWD=/
LANG=en_US.UTF-8
TARGET_LIST=
HISTCONTROL=ignoredups
SHLVL=1
HOME=/root
TEST_DIR=/tmp/qemu-test
LOGNAME=root
LESSOPEN=||/usr/bin/lesspipe.sh %s
FEATURES= dtc
DEBUG=
G_BROKEN_FILENAMES=1
CCACHE_HASHDIR=
_=/usr/bin/env

Configure options:
--enable-werror --target-list=x86_64-softmmu,aarch64-softmmu 
--prefix=/var/tmp/qemu-build/install
No C++ compiler available; disabling C++ specific optional code
Install prefix/var/tmp/qemu-build/install
BIOS directory/var/tmp/qemu-build/install/share/qemu
binary directory  /var/tmp/qemu-build/install/bin
library directory /var/tmp/qemu-build/install/lib
module directory  /var/tmp/qemu-build/install/lib/qemu
libexec directory /var/tmp/qemu-build/install/libexec
include directory /var/tmp/qemu-build/install/include
config directory  /var/tmp/qemu-build/install/etc
local state directory   /var/tmp/qemu-build/install/var
Manual directory  /var/tmp/qemu-build/install/share/man
ELF interp prefix /usr/gnemul/qemu-%M
Source path   /tmp/qemu-test/src
C compilercc
Host C compiler   cc
C++ compiler  
Objective-C compiler cc
ARFLAGS   rv
CFLAGS-O2 -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -g 
QEMU_CFLAGS   -I/usr/include/pixman-1-pthread -I/usr/include/glib-2.0 
-I/usr/lib64/glib-2.0/include   -fPIE -DPIE -m64 -D_GNU_SOURCE 
-D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes 
-Wredundant-decls -Wall -Wundef -Wwrite-strings -Wmissing-prototypes 
-fno-strict-aliasing -fno-common -fwrapv  -Wendif-labels -Wmissing-include-dirs 
-Wempty-body -Wnested-externs -Wformat-security -Wformat-y2k -Winit-self 
-Wignored-qualifiers -Wold-style-declaration -Wold-style-definition 
-Wtype-limits -fstack-protector-all
LDFLAGS   -Wl,--warn-common -Wl,-z,relro -Wl,-z,now -pie -m64 -g 
make  make
install   install
pythonpython -B
smbd  /usr/sbin/smbd
module supportno
host CPU  x86_64
host big endian   no
target list   x86_64-softmmu aarch64-softmmu
tcg debug enabled no
gprof enabled no
sparse enabledno
strip binariesyes
profiler  no
static build  no
pixmansystem
SDL support   yes (1.2.14)
GTK support   no 
GTK GL supportno
VTE support   no 
TLS priority  NORMAL
GNUTLS supportno
GNUTLS rndno
libgcrypt no
libgcrypt kdf no
nettleno 
nettle kdfno
libtasn1  no
curses supportno
virgl support no
curl support  no
mingw32 support   no
Audio drivers oss
Block whitelist (rw) 
Block whitelist (ro) 
VirtFS supportno
VNC support   yes
VNC SASL support  no
VNC JPEG support  no
VNC PNG support   no
xen support   no
brlapi supportno
bluez  supportno
Documentation no
PIE   yes
vde support   no
netmap supportno
Linux AIO support no
ATTR/XATTR support yes
Install blobs yes
KVM support   yes
RDMA support  no
TCG interpreter   no
fdt support   yes
preadv supportyes
fdatasync yes
madvise   yes
posix_madvise yes
libcap-ng support no
vhost-net support yes
vhost-scsi support yes

Re: [Qemu-block] [Qemu-devel] [PATCH v6 08/15] qstring: Add qstring_wrap_str()

2016-10-11 Thread Marc-André Lureau
Hi

On Mon, Oct 10, 2016 at 5:25 PM Eric Blake  wrote:

> Several spots in the code malloc a string, then wrap it in a
> QString, which has to duplicate the input.  Adding a new
> constructor with transfer semantics makes this pattern more
> efficient, comparable to the just-added transfer semantics to
> go from QString back to raw string.  Use the new
> qstring_wrap_str() where it makes sense.
>
> The new test still passes under valgrind.
>
> Signed-off-by: Eric Blake 
>
> ---
> v6: don't auto-convert NULL into ""
> [no v5 due to series split]
> v4: new patch
> ---
>  include/qapi/qmp/qstring.h |  1 +
>  block.c|  3 ++-
>  block/archipelago.c|  6 ++
>  blockdev.c |  3 +--
>  qobject/qstring.c  | 20 
>  tests/check-qstring.c  | 15 +++
>  6 files changed, 41 insertions(+), 7 deletions(-)
>
> diff --git a/include/qapi/qmp/qstring.h b/include/qapi/qmp/qstring.h
> index 2d55c87..97cf776 100644
> --- a/include/qapi/qmp/qstring.h
> +++ b/include/qapi/qmp/qstring.h
> @@ -25,6 +25,7 @@ typedef struct QString {
>  QString *qstring_new(void);
>  QString *qstring_from_str(const char *str);
>  QString *qstring_from_substr(const char *str, int start, int end);
> +QString *qstring_wrap_str(char *str);
>  size_t qstring_get_length(const QString *qstring);
>  const char *qstring_get_str(const QString *qstring);
>  char *qstring_consume_str(QString *qstring);
> diff --git a/block.c b/block.c
> index bb1f1ec..8a2876f 100644
> --- a/block.c
> +++ b/block.c
> @@ -1640,7 +1640,8 @@ static BlockDriverState
> *bdrv_append_temp_snapshot(BlockDriverState *bs,
>  qdict_put(snapshot_options, "file.driver",
>qstring_from_str("file"));
>  qdict_put(snapshot_options, "file.filename",
> -  qstring_from_str(tmp_filename));
> +  qstring_wrap_str(tmp_filename));
> +tmp_filename = NULL;
>  qdict_put(snapshot_options, "driver",
>qstring_from_str("qcow2"));
>
>
You could also remove g_free(tmp_filename) from the normal return path
(this may please static analyzers).


diff --git a/block/archipelago.c b/block/archipelago.c
> index 37b8aca..ac047e4 100644
> --- a/block/archipelago.c
> +++ b/block/archipelago.c
> @@ -426,13 +426,11 @@ static void archipelago_parse_filename(const char
> *filename, QDict *options,
>  parse_filename_opts(filename, errp, , _name, ,
> );
>
>  if (volume) {
> -qdict_put(options, ARCHIPELAGO_OPT_VOLUME,
> qstring_from_str(volume));
> -g_free(volume);
> +qdict_put(options, ARCHIPELAGO_OPT_VOLUME,
> qstring_wrap_str(volume));
>  }
>  if (segment_name) {
>  qdict_put(options, ARCHIPELAGO_OPT_SEGMENT,
> -  qstring_from_str(segment_name));
> -g_free(segment_name);
> +  qstring_wrap_str(segment_name));
>  }
>  if (mport != NoPort) {
>  qdict_put(options, ARCHIPELAGO_OPT_MPORT, qint_from_int(mport));
> diff --git a/blockdev.c b/blockdev.c
> index 07ec733..35cd905 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -1026,8 +1026,7 @@ DriveInfo *drive_new(QemuOpts *all_opts,
> BlockInterfaceType block_default_type)
>  new_id = g_strdup_printf("%s%s%i", if_name[type],
>   mediastr, unit_id);
>  }
> -qdict_put(bs_opts, "id", qstring_from_str(new_id));
> -g_free(new_id);
> +qdict_put(bs_opts, "id", qstring_wrap_str(new_id));
>  }
>
>  /* Add virtio block device */
> diff --git a/qobject/qstring.c b/qobject/qstring.c
> index 7a438e9..a64373a 100644
> --- a/qobject/qstring.c
> +++ b/qobject/qstring.c
> @@ -66,6 +66,26 @@ QString *qstring_from_str(const char *str)
>  return qstring_from_substr(str, 0, strlen(str) - 1);
>  }
>
> +/**
> + * qstring_wrap_str(): Create a new QString around a malloc'd C string
> + *
> + * Returns a strong reference, and caller must not use @str any more.
> + * @str must not be NULL.
> + */
> +QString *qstring_wrap_str(char *str)
> +{
> +QString *qstring;
> +
> +qstring = g_malloc(sizeof(*qstring));
> +qobject_init(QOBJECT(qstring), QTYPE_QSTRING);
> +
> +assert(str);
> +qstring->string = str;
> +qstring->capacity = qstring->length = strlen(str);
> +
> +return qstring;
> +}
> +
>  static void capacity_increase(QString *qstring, size_t len)
>  {
>  if (qstring->capacity < (qstring->length + len)) {
> diff --git a/tests/check-qstring.c b/tests/check-qstring.c
> index 11823c2..0c427c7 100644
> --- a/tests/check-qstring.c
> +++ b/tests/check-qstring.c
> @@ -34,6 +34,20 @@ static void qstring_from_str_test(void)
>  QDECREF(qstring);
>  }
>
> +static void qstring_wrap_str_test(void)
> +{
> +QString *qstring;
> +char *str = g_strdup("QEMU");
> +
> +qstring = qstring_wrap_str(str);
> +g_assert(qstring != NULL);
> +g_assert(qstring->base.refcnt == 1);
> +

Re: [Qemu-block] [PATCH v10 01/16] block: Pause all jobs during bdrv_reopen_multiple()

2016-10-11 Thread Kevin Wolf
On 11.10.2016 at 11:54, Paolo Bonzini wrote:
> On 11/10/2016 11:39, Kevin Wolf wrote:
> > On 10.10.2016 at 18:41, Paolo Bonzini wrote:
> >> On 10/10/2016 17:37, Kevin Wolf wrote:
>  +while ((job = block_job_next(job))) {
>  +AioContext *aio_context = blk_get_aio_context(job->blk);
>  +
>  +aio_context_acquire(aio_context);
>  +block_job_pause(job);
>  +aio_context_release(aio_context);
>  +}
>  +
>   bdrv_drain_all();
> >>>
> >>> We already have a bdrv_drain_all() here, which does the same thing (and
> >>> more) internally, except that it resumes all jobs before it returns.
> >>> Maybe what we should do is split bdrv_drain_all() in a begin/end pair,
> >>> too.
> >>
> >> Hey, haven't I just suggested the same? :)  I swear I hadn't read this
> >> before.
> >>
> >>> The other point I'm wondering now is whether bdrv_drain_all() should
> >>> have the aio_disable/enable_external() pair that bdrv_drain() has.
> >>
> >> bdrv_drain_all need not have it, but its start/end replacement should.
> > 
> > Doesn't need it because it holds the AioContext lock?
> 
> No, because as soon as bdrv_drain_all exits, external file descriptors
> can be triggered again so I don't think the aio_disable/enable_external
> actually provides any protection.

Right, what I was thinking of was more like new requests coming in
while bdrv_drain_all() is still running. If that happened, the result
after bdrv_drain_all() wouldn't be different, but if external file
descriptors are still active and the guest keeps sending requests, it
could take forever until it returns.

But that part is addressed by the AioContext lock, I think.

> bdrv_drain_all should really only be used in cases where you already
> have some kind of "hold" on external file descriptors, like bdrv_close
> uses bdrv_drain():
> 
> bdrv_drained_begin(bs); /* complete I/O */
> bdrv_flush(bs);
> bdrv_drain(bs); /* in case flush left pending I/O */
> 
> That said, because the simplest implementation of bdrv_drain_all() does
> 
> bdrv_drained_all_start();
> bdrv_drained_all_end();
> 
> just like bdrv_drain() does it, it's not a very interesting point.
> bdrv_drain_all will probably disable/enable external file descriptors,
> it just doesn't need that.

Yes, of course.

Kevin



Re: [Qemu-block] [PATCH 1/3] block: add BDS field to count in-flight requests

2016-10-11 Thread Kevin Wolf
On 10.10.2016 at 18:37, Paolo Bonzini wrote:
> On 10/10/2016 12:36, Kevin Wolf wrote:
> > 
> > At the BlockBackend level (or really anywhere in the graph), we have two
> > different kinds of drain requests that we just fail to implement
> > differently today:
> > 
> > 1. blk_drain(), i.e. a request to actually drain all requests pending in
> >this BlockBackend. This is what we implement today, even though only
> >indirectly: We call bdrv_drain() on the root BDS and it calls back
> >into what is case 2.
> > 
> > 2. BdrvChildRole.drained_begin/end(), i.e. stop sending requests to a
> >child node. The efficient way to do this is not what we're doing
> >today (sending all throttled requests and waiting for their
> >completion), but just stopping to process throttled requests.
> > 
> > Your approach would permanently prevent us from doing this right and
> > force us to do the full drain even for the second case.
> 
> You're entirely correct that these patches are a layering violation.
> However, I think you're wrong that they are a step backwards, because
> they are merely exposing pre-existing gaps in the BDS/BB separation.  In
> fact, I think that these patches are a very good first step towards
> getting the design correct, and I suspect that we agree on the design
> since we have talked about it in the past.
> 
> Now, the good news is that we have an API that makes sense, and we're
> using it mostly correctly.  (The exception is that there are places
> where we don't have a BlockBackend and thus call
> bdrv_drain/bdrv_co_drain instead of blk_drain/blk_co_drain).  Yes,
> bdrv_drained_begin and bdrv_drained_end may need tweaks to work with
> multiple parents, but wherever we use one of the APIs in the family
> we're using the right one.
> 
> The bad news is that while the APIs are good, they are implemented in
> the wrong way---all of them, though some less than others.  In
> particular, blk_drain and bdrv_drain and
> bdrv_drained_begin/bdrv_drained_end represent three completely different
> concepts but we conflate the implementations.   But again, knowing and
> exposing the problem is the first step of solving it, and IMO these
> patches if anything make the problem easier to solve.
> 
> 
> So, let's first of all look at what the three operations should mean.
> 
> blk_drain should mean "wait for completion of all requests sent from
> *this* BlockBackend".  Typically it would be used by the owner of the
> BlockBackend to get its own *data structures* in a known-good state.  It
> should _not_ imply any wait for metadata writes to complete, and it
> should not do anything to requests sent from other BlockBackends.  Three
> random notes: 1) this should obviously include throttled operations,
> just like it does now; 2) there should be a blk_co_drain too; 3) if the
> implementation can't be contained entirely within BlockBackend, we're
> doing it wrong.
> 
> bdrv_drain should only be needed in bdrv_set_aio_context and bdrv_close,
> everyone else should drain its own BlockBackend or use a section.  Its
> role is to ensure the BDS's children are completely quiescent.
> Quiescence of the parents should be a precondition:
> bdrv_set_aio_context, should use bdrv_drained_begin/bdrv_drained_end for
> this purpose; while in the case of bdrv_close there should be no parents.

You're making some good points here. I never gave much thought to the fact
that bdrv_drain and bdrv_drained_begin/end are really for different
purposes.

Actually, does bdrv_set_aio_context really need bdrv_drain, or should it
be using a begin/end pair, too?

bdrv_close() is an interesting case because it has both a section and
then inside that another bdrv_drain() call. Maybe that's the only
constellation in which bdrv_drain() without a section even really makes
sense because then the parents are already quiesced. If that's the only
user of it, though, maybe we should inline it and remove the public
defintion of bdrv_drain(), which is almost always wrong.

> bdrv_{co_,}drained_begin and bdrv_{co_,}drained_end should split into two:
> 
> - blk_{co_,}isolate_begin(blk) and blk_{co_,}isolate_end(blk).  These
> operations visit the BdrvChild graph in the child-to-parent direction,
> starting at blk->root->bs, and tell each root BdrvChild (other than blk)
> to quiesce itself.  Quiescence means ensuring that no new I/O operations
> are triggered.  This in turn has both an external aspect (a Notifier or
> other callback invoked by blk_root_drained_begin/end) and an internal
> aspect (throttling).

It wouldn't only tell the BlockBackends to quiesce themselves, but also
all nodes on the path so that they don't issue internal requests.

Anyway, let's see which of the existing bdrv_drained_begin/end users
would use this (please correct):

* Block jobs use during completion

* The QMP transaction commands to start block jobs drain as well, but
  they don't have a BlockBackend yet. Those would call a BDS-level
  

Re: [Qemu-block] [Questions] NBD issue or CoMutex->holder issue?

2016-10-11 Thread Paolo Bonzini


On 11/10/2016 12:35, Changlong Xie wrote:
> For the nbd client, if the request number is larger than MAX_NBD_REQUESTS(16), we
> will queue the remaining requests into free_sema->queue.
> When the nbd client receives one reply, it will unlock free_sema, then pop
> the free_sema->queue head and set free_sema->holder to the
> relevant coroutine.

NBD is using the CoMutex in a way that wasn't anticipated.  The simplest
fix is to change it to CoQueue, which is like a condition variable.
Instead of locking if in_flight >= MAX_NBD_REQUESTS - 1, wait on the
queue while in_flight == MAX_NBD_REQUESTS.  Instead of unlocking, use
qemu_co_queue_next to wake up one request.
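
Untested sketch of what I mean for block/nbd-client.c (s->free_sema becomes
a CoQueue initialized with qemu_co_queue_init() instead of a CoMutex; field
names as in the current code, the rest is illustrative):

static void nbd_coroutine_start(NbdClientSession *s,
                                struct nbd_request *request)
{
    /* Poor man's semaphore: wait until a request slot is free. */
    while (s->in_flight == MAX_NBD_REQUESTS) {
        qemu_co_queue_wait(&s->free_sema);
    }
    s->in_flight++;

    /* ... handle assignment stays as it is today ... */
}

static void nbd_coroutine_end(NbdClientSession *s,
                              struct nbd_request *request)
{
    int i = HANDLE_TO_INDEX(s, request->handle);

    s->recv_coroutine[i] = NULL;
    s->in_flight--;

    /* Wake up exactly one queued request, if any is waiting. */
    qemu_co_queue_next(&s->free_sema);
}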

Thanks for the report!

Paolo

> For example if there are N(N=26 and MAX_NBD_REQUESTS=16) nbd write
> requests, so we'll invoke nbd_client_co_pwritev 26 times.
> time request No   Actions
> 1 1   in_flight=1, Coroutine=C1
> 2 2   in_flight=2, Coroutine=C2
> ...   ...
> 1515  in_flight=15, Coroutine=C15
> 1616  in_flight=16, Coroutine=C16,
> free_sema->holder=C16, mutex->locked=true
> 1717  in_flight=16, Coroutine=C17, queue C17 into
> free_sema->queue
> 1818  in_flight=16, Coroutine=C18, queue C18 into
> free_sema->queue
> ...   ...
> 26N   in_flight=16, Coroutine=C26, queue C26 into
> free_sema->queue
> 
> Once the nbd client receives request No.16's reply, we will re-enter request
> C16. It's ok, because it's equal to 'free_sema->holder'.
> time request No   Actions
> 2716  in_flight=15, Coroutine=C16,
> free_sema->holder=C16, mutex->locked=false
> 
> Then nbd_coroutine_end invokes qemu_co_mutex_unlock, which will pop
> coroutines from free_sema->queue's head and enter C17. Now
> free_sema->holder is C17.
> time request No   Actions
> 2817  in_flight=16, Coroutine=C17,
> free_sema->holder=C17, mutex->locked=true
> 
> In the above scenario, we have only received request No.16's reply. So as time
> goes on, the nbd client will mostly receive replies for requests
> 1 to 15 rather than for request 17, which owns C17. In this case, we will
> encounter the assertion "`mutex->holder == self' failed" in nbd_coroutine_end.
> For example, if the nbd client receives request No.15's reply:
> time request No  Actions
> 29   15(most case)   in_flight=15, Coroutine=C15,
> free_sema->holder=C17, mutex->locked = false
> 
> qemu-system-x86_64: util/qemu-coroutine-lock.c:148:
> qemu_co_mutex_unlock: Assertion `mutex->holder == self' failed.
> 
> This is introduced by Kevin's patch
> commit 0e438cdc932a785de72166af4641aafa103a6670
> Author: Kevin Wolf 
> Date:   Thu Aug 11 17:45:06 2016 +0200
> 
> coroutine: Let CoMutex remember who holds it
> 
> In cases of deadlocks, knowing who holds a given CoMutex is really
> helpful for debugging. Keeping the information around doesn't cost much
> and allows us to add another assertion to keep the code correct, so
> let's just add it.
> 
> Signed-off-by: Kevin Wolf 
> Reviewed-by: Paolo Bonzini 
> Reviewed-by: Stefan Hajnoczi 
> 
> Any ideas? Is it a nbd bug or should we revert commit 0e438cdc?
> 
> Thanks
> -Xie
> 
> 



[Qemu-block] [PATCH v2 0/2] block/replication fixes

2016-10-11 Thread Changlong Xie
V2:
1. fix typo

Changlong Xie (2):
  block/replication: perfect the logic to acquire 'top_id'
  block/replication: Clarify 'top-id' parameter usage

 block/replication.c  | 9 +++--
 qapi/block-core.json | 3 ++-
 2 files changed, 9 insertions(+), 3 deletions(-)

-- 
1.9.3






[Qemu-block] [PATCH v2 1/2] block/replication: perfect the logic to acquire 'top_id'

2016-10-11 Thread Changlong Xie
Only call g_strdup(top_id) if 'top_id' is not NULL, although there
is no memory leak here either way.

Signed-off-by: Changlong Xie 
---
 block/replication.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/replication.c b/block/replication.c
index 3bd1cf1..5b432d9 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -104,11 +104,11 @@ static int replication_open(BlockDriverState *bs, QDict 
*options,
 } else if (!strcmp(mode, "secondary")) {
 s->mode = REPLICATION_MODE_SECONDARY;
 top_id = qemu_opt_get(opts, REPLICATION_TOP_ID);
-s->top_id = g_strdup(top_id);
-if (!s->top_id) {
+if (!top_id) {
 error_setg(&local_err, "Missing the option top-id");
 goto fail;
 }
+s->top_id = g_strdup(top_id);
 } else {
 error_setg(&local_err,
"The option mode's value should be primary or secondary");
-- 
1.9.3






Re: [Qemu-block] [PATCH v10 01/16] block: Pause all jobs during bdrv_reopen_multiple()

2016-10-11 Thread Paolo Bonzini


On 11/10/2016 11:39, Kevin Wolf wrote:
> On 10.10.2016 at 18:41, Paolo Bonzini wrote:
>> On 10/10/2016 17:37, Kevin Wolf wrote:
 +while ((job = block_job_next(job))) {
 +AioContext *aio_context = blk_get_aio_context(job->blk);
 +
 +aio_context_acquire(aio_context);
 +block_job_pause(job);
 +aio_context_release(aio_context);
 +}
 +
  bdrv_drain_all();
>>>
>>> We already have a bdrv_drain_all() here, which does the same thing (and
>>> more) internally, except that it resumes all jobs before it returns.
>>> Maybe what we should do is split bdrv_drain_all() in a begin/end pair,
>>> too.
>>
>> Hey, haven't I just suggested the same? :)  I swear I hadn't read this
>> before.
>>
>>> The other point I'm wondering now is whether bdrv_drain_all() should
>>> have the aio_disable/enable_external() pair that bdrv_drain() has.
>>
>> bdrv_drain_all need not have it, but its start/end replacement should.
> 
> Doesn't need it because it holds the AioContext lock?

No, because as soon as bdrv_drain_all exits, external file descriptors
can be triggered again so I don't think the aio_disable/enable_external
actually provides any protection.

bdrv_drain_all should really only be used in cases where you already
have some kind of "hold" on external file descriptors, like bdrv_close
uses bdrv_drain():

bdrv_drained_begin(bs); /* complete I/O */
bdrv_flush(bs);
bdrv_drain(bs); /* in case flush left pending I/O */

That said, because the simplest implementation of bdrv_drain_all() does

bdrv_drained_all_start();
bdrv_drained_all_end();

just like bdrv_drain() does it, it's not a very interesting point.
bdrv_drain_all will probably disable/enable external file descriptors,
it just doesn't need that.

Paolo



Re: [Qemu-block] [Qemu-devel] [PATCH v2 02/11] blockjob: centralize QMP event emissions

2016-10-11 Thread Markus Armbruster
John Snow  writes:

> On 10/05/2016 09:43 AM, Kevin Wolf wrote:
[...]
>> Here we have an additional caller in block/replication.c and qemu-img,
>> so the parameters must stay. For qemu-img, nothing changes. For
>> replication, the block job events are added as a side effect.
>>
>> Not sure if we want to emit such events for an internal block job, but
>> if we do want the change, it should be explicit.
>>
>
> Hmm, do we want to make it so some jobs are invisible and others are
> not? Because as it stands right now, neither case is strictly true. We
> only emit cancelled/completed events if it was started via QMP,
> however we do emit events for error and ready regardless of who
> started the job.
>
> That didn't seem particularly consistent to me; either all events
> should be controlled by the job layer itself or none of them should
> be.
>
> I opted for "all."
>
> For "internal" jobs that did not previously emit any events, is it not
> true that these jobs still appear in the block job list and are
> effectively public regardless? I'd argue that these messages may be of
> value for management utilities who are still blocked by these jobs
> whether or not they are 'internal' or not.
>
> I'll push for keeping it mandatory and explicit. If it becomes a
> problem, we can always add a 'silent' job property that silences ALL
> qmp events, including all completion, error, and ready notices.
>
> I've CC'd Wen Congyang and Eric Blake to talk me down if they wish.

Having read the thread so far, I have two high-level remarks:

1. We should expose a job externally either completely (all queries show
it, all events get sent, any non-query command works normally as far as
it makes sense) or not at all.

2. Exposing internal jobs risks making them ABI.  Implementation details
need to be kept out of ABI.  So the question is whether the job is truly
internal detail, or a bona fide part of the external interface.



Re: [Qemu-block] [PATCH v10 01/16] block: Pause all jobs during bdrv_reopen_multiple()

2016-10-11 Thread Kevin Wolf
On 10.10.2016 at 18:41, Paolo Bonzini wrote:
> On 10/10/2016 17:37, Kevin Wolf wrote:
> > > +while ((job = block_job_next(job))) {
> > > +AioContext *aio_context = blk_get_aio_context(job->blk);
> > > +
> > > +aio_context_acquire(aio_context);
> > > +block_job_pause(job);
> > > +aio_context_release(aio_context);
> > > +}
> > > +
> > >  bdrv_drain_all();
> >
> > We already have a bdrv_drain_all() here, which does the same thing (and
> > more) internally, except that it resumes all jobs before it returns.
> > Maybe what we should do is split bdrv_drain_all() in a begin/end pair,
> > too.
> 
> Hey, haven't I just suggested the same? :)  I swear I hadn't read this
> before.
> 
> > The other point I'm wondering now is whether bdrv_drain_all() should
> > have the aio_disable/enable_external() pair that bdrv_drain() has.
> 
> bdrv_drain_all need not have it, but its start/end replacement should.

Doesn't need it because it holds the AioContext lock?

Kevin



Re: [Qemu-block] [Qemu-devel] [PATCH v2 10/11] blockjob: refactor backup_start as backup_job_create

2016-10-11 Thread Kevin Wolf
On 11.10.2016 at 00:51, John Snow wrote:
> >>Sadly for me, I realized this patch has a potential problem. When we
> >>were adding the bitmap operations, it became clear that the
> >>atomicity point was during .prepare, not .commit.
> >>
> >>e.g. the bitmap is cleared or created during prepare, and backup_run
> >>installs its Write Notifier at that point in time, too.
> >
> >Strictly speaking that's wrong then.
> >
> 
> I agree, though I do remember this coming up during the bitmap
> review process that the current point-in-time spot is during prepare
> at the moment.
> 
> I do think that while it's at least a consistent model (The model
> where we do in fact commit during .prepare(), and simply undo or
> revert during .abort(), and only clean or remove undo-cache in
> .commit()) it certainly violates the principle of least surprise and
> is a little rude...

As long as we can reliably undo things in .abort (i.e. use operations
that can't fail) and keep the locks and the device drained, we should be
okay in terms of atomicity.

I think it's still nicer if we can enable things only in .commit, but
sometimes we have to use operations that could fail, so we have to do
them in .prepare.

The exact split between .prepare/.commit/.abort isn't visible on the
external interfaces as long as it's done correctly, so it doesn't
necessarily have to be the same for all commands.

> >The write notifier doesn't really hurt because it is never triggered
> >between prepare and commit (we're holding the lock) and it can just be
> >removed again.
> >
> >Clearing the bitmap is a bug because the caller could expect that the
> >bitmap is in its original state if the transaction fails. I doubt this
> >is a problem in practice, but we should fix it anyway.
> 
> We make a backup to undo the process if it fails. I only mention it
> to emphasize that the atomic point appears to be during prepare. In
> practice we hold the locks for the whole process, but... I think
> Paolo may be actively trying to change that.

Well, the whole .prepare/.commit or .prepare/.abort sequence is supposed
to be atomic, so it's really the same thing. Changing this would break
the transactional behaviour, so that's not possible anyway.

> >By the way, why did we allow to add a 'bitmap' option for DriveBackup
> >without adding it to BlockdevBackup at the same time?
> 
> I don't remember. I'm not sure anyone ever audited it to convince
> themselves it was a useful or safe thing to do. I believe at the
> time I was pushing for bitmaps in DriveBackup, Fam was still
> authoring the BlockdevBackup interface.

Hm, maybe that's why. I checked the commit dates of both (and there
BlockdevBackup was earlier), but I didn't check the development history.

Should we add it now or is it a bad idea?

> >>By changing BlockJobs to only run on commit, we've severed the
> >>atomicity point such that some actions will take effect during
> >>prepare, and others at commit.
> >>
> >>I still think it's the correct thing to do to delay the BlockJobs
> >>until the commit phase, so I will start auditing the code to see how
> >>hard it is to shift the atomicity point to commit instead. If it's
> >>possible to do that, I think from the POV of the managing
> >>application, having the atomicity point be
> >>
> >>Feel free to chime in with suggestions and counterpoints until then.
> >
> >I agree that jobs have to be started only at commit. There may be other
> >things that are currently happening in prepare that really should be
> >moved as well, but unless moving one thing but not the other doesn't
> >break anything that was working, we can fix one thing at a time.
> >
> >Kevin
> >
> 
> Alright, let's give this a whirl.
> 
> We have 8 transaction actions:
> 
> drive_backup
> blockdev_backup
> block_dirty_bitmap_add
> block_dirty_bitmap_clear
> abort
> blockdev_snapshot
> blockdev_snapshot_sync
> blockdev_snapshot_internal_sync
> 
> Drive and Blockdev backup are already modified to behave
> point-in-time at time of .commit() by changing them to only begin
> running once the commit phase occurs.
> 
> Bitmap add and clear are trivial to rework; clear just moves the
> call to clear in commit, with possibly some action taken to prevent
> the bitmap from becoming used by some other process in the meantime.
> Add is easy to rework too, we can create it during prepare but reset
> it back to zero during commit if necessary.
> 
> Abort needs no changes.
> 
> blockdev_snapshot[_sync] actually appears to already be doing the
> right thing, by only installing the new top layer during commit,
> which makes this action inconsistent by current semantics, but
> requires no changes to move to the desired new semantics.

This doesn't sound too bad.
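
For the clear action, I imagine the reshuffle could look roughly like this
(untested sketch reusing the existing BlkActionState plumbing in blockdev.c;
.abort then becomes a no-op because .prepare no longer modifies anything):

static void block_dirty_bitmap_clear_prepare(BlkActionState *common,
                                             Error **errp)
{
    /* Look up the bitmap and do the frozen/enabled checks exactly as
     * today, but do *not* call bdrv_clear_dirty_bitmap() here. */
}

static void block_dirty_bitmap_clear_commit(BlkActionState *common)
{
    BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
                                             common, common);

    /* Point of no return: clear for real, no undo copy needed. */
    bdrv_clear_dirty_bitmap(state->bitmap, NULL);
}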

> That leaves only the internal snapshot to worry about, which does
> admittedly look like quite the yak to shave. It's a bit out of scope
> for me, but Kevin, do you think this is possible?
> 
> Looks like implementations are qcow2, rbd, and sheepdog. I 

Re: [Qemu-block] [Qemu-devel] [PATCH v2 10/11] blockjob: refactor backup_start as backup_job_create

2016-10-11 Thread Paolo Bonzini


On 11/10/2016 00:51, John Snow wrote:
>> Clearing the bitmap is a bug because the caller could expect that the
>> bitmap is in its original state if the transaction fails. I doubt this
>> is a problem in practice, but we should fix it anyway.
> 
> We make a backup to undo the process if it fails. I only mention it to
> emphasize that the atomic point appears to be during prepare. In
> practice we hold the locks for the whole process, but... I think Paolo
> may be actively trying to change that.

Even now, atomicity must be ensured with bdrv_drained_begin/end.  The
AioContext lock does not promise atomicity.

Paolo



Re: [Qemu-block] [PATCH v10 03/16] block: Use block_job_add_bdrv() in mirror_start_job()

2016-10-11 Thread Paolo Bonzini


On 10/10/2016 18:03, Kevin Wolf wrote:
>> > Use block_job_add_bdrv() instead of blocking all operations in
>> > mirror_start_job() and unblocking them in mirror_exit().
>> > 
>> > Signed-off-by: Alberto Garcia 
> Compared to the old code, this unblocks BLOCK_OP_TYPE_DATAPLANE, i.e.
> you can now run a dataplane device on a BDS used as the mirror target.
> 
> This means that the target could require a different AioContext than the
> source, which we can't support. So it seems unlikely to me that we can
> lift this restriction.

Indeed, this is in the pipeline, but still quite far away!

Paolo



Re: [Qemu-block] [Qemu-devel] [PATCH v2] block: Remove "options" indirection from blockdev-add

2016-10-11 Thread Markus Armbruster
Kevin Wolf  writes:

> Now that QAPI supports boxed types, we can have unions at the top level
> of a command, so let's put our real options directly there for
> blockdev-add instead of having a single "options" dict that contains the
> real arguments.
>
> blockdev-add is still experimental and we already made substantial
> changes to the API recently, so we're free to make changes like this
> one, too.
>
> Signed-off-by: Kevin Wolf 

Glad to see the QAPI work enable a nicer interface.

With docs/qmp-commands.txt touched up as per Max's review:
Reviewed-by: Markus Armbruster 



[Qemu-block] [PATCH 4/4] qapi: allow blockdev-add for ssh

2016-10-11 Thread Ashijeet Acharya
Introduce a new object 'BlockdevOptionsSsh' in qapi/block-core.json to
support blockdev-add for the SSH network protocol driver. Use only 'struct
InetSocketAddress' since SSH only supports connections over TCP.

Signed-off-by: Ashijeet Acharya 
---
 qapi/block-core.json | 24 ++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 9d797b8..2e8a390 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1716,7 +1716,8 @@
 'dmg', 'file', 'ftp', 'ftps', 'gluster', 'host_cdrom',
 'host_device', 'http', 'https', 'luks', 'null-aio', 'null-co',
 'parallels', 'qcow', 'qcow2', 'qed', 'quorum', 'raw',
-   'replication', 'tftp', 'vdi', 'vhdx', 'vmdk', 'vpc', 'vvfat' ] }
+'replication', 'ssh', 'tftp', 'vdi', 'vhdx', 'vmdk', 'vpc',
+'vvfat' ] }
 
 ##
 # @BlockdevOptionsFile
@@ -1953,6 +1954,25 @@
 '*vport': 'int',
 '*segment': 'str' } }
 
+##
+# @BlockdevOptionsSsh
+#
+# @server:  host address and port number
+#
+# @path:path to the image on the host
+#
+# @user:user as which to connect
+#
+# @host_key_check: #optional defines how and what to check the host key against
+#
+# Since: 2.8
+##
+{ 'struct': 'BlockdevOptionsSsh',
+  'data': { 'server': 'InetSocketAddress',
+'path': 'str',
+'user': 'str',
+'*host_key_check': 'str' } }
+
 
 ##
 # @BlkdebugEvent
@@ -2281,7 +2301,7 @@
 # TODO rbd: Wait for structured options
   'replication':'BlockdevOptionsReplication',
 # TODO sheepdog: Wait for structured options
-# TODO ssh: Should take InetSocketAddress for 'host'?
+  'ssh':'BlockdevOptionsSsh',
   'tftp':   'BlockdevOptionsCurl',
   'vdi':'BlockdevOptionsGenericFormat',
   'vhdx':   'BlockdevOptionsGenericFormat',
-- 
2.6.2




[Qemu-block] [PATCH 2/4] block/ssh: Add InetSocketAddress and accept it

2016-10-11 Thread Ashijeet Acharya
Add InetSocketAddress compatibility to the SSH driver.

Add a new option "server" to the SSH block driver, which then accepts
an InetSocketAddress.

"host" and "port" are supported as legacy options and are mapped to
their InetSocketAddress representation.

Signed-off-by: Ashijeet Acharya 
---
 block/ssh.c | 87 ++---
 1 file changed, 78 insertions(+), 9 deletions(-)
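
As a rough illustration of the mapping (hypothetical command-line
spellings, values made up; the exact plumbing depends on the rest of the
series):

  legacy:      -drive driver=ssh,host=example.com,port=22,path=/images/disk.img,if=none,id=d0
  structured:  -drive driver=ssh,server.host=example.com,server.port=22,path=/images/disk.img,if=none,id=d0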

diff --git a/block/ssh.c b/block/ssh.c
index 75cb7bc..702871a 100644
--- a/block/ssh.c
+++ b/block/ssh.c
@@ -32,8 +32,11 @@
 #include "qemu/error-report.h"
 #include "qemu/sockets.h"
 #include "qemu/uri.h"
+#include "qapi-visit.h"
 #include "qapi/qmp/qint.h"
 #include "qapi/qmp/qstring.h"
+#include "qapi/qmp-input-visitor.h"
+#include "qapi/qmp-output-visitor.h"
 
 /* DEBUG_SSH=1 enables the DPRINTF (debugging printf) statements in
  * this block driver code.
@@ -74,6 +77,8 @@ typedef struct BDRVSSHState {
  */
 LIBSSH2_SFTP_ATTRIBUTES attrs;
 
+InetSocketAddress *inet;
+
 /* Used to warn if 'flush' is not supported. */
 char *hostport;
 bool unsafe_flush_warning;
@@ -263,7 +268,9 @@ static bool ssh_has_filename_options_conflict(QDict *options, Error **errp)
 !strcmp(qe->key, "port") ||
 !strcmp(qe->key, "path") ||
 !strcmp(qe->key, "user") ||
-!strcmp(qe->key, "host_key_check"))
+!strcmp(qe->key, "host_key_check") ||
+!strcmp(qe->key, "server") ||
+!strncmp(qe->key, "server.", 7))
 {
 error_setg(errp, "Option '%s' cannot be used with a file name",
qe->key);
@@ -555,13 +562,71 @@ static QemuOptsList ssh_runtime_opts = {
 },
 };
 
+static bool ssh_process_legacy_socket_options(QDict *output_opts,
+  QemuOpts *legacy_opts,
+  Error **errp)
+{
+const char *host = qemu_opt_get(legacy_opts, "host");
+const char *port = qemu_opt_get(legacy_opts, "port");
+
+if (!host && port) {
+error_setg(errp, "port may not be used without host");
+return false;
+}
+
+if (!host) {
+error_setg(errp, "No hostname was specified");
+return false;
+} else {
+qdict_put(output_opts, "server.host", qstring_from_str(host));
+qdict_put(output_opts, "server.port",
+  qstring_from_str(port ?: stringify(22)));
+}
+
+return true;
+}
+
+static InetSocketAddress *ssh_config(BDRVSSHState *s, QDict *options,
+ Error **errp)
+{
+InetSocketAddress *inet = NULL;
+QDict *addr = NULL;
+QObject *crumpled_addr = NULL;
+Visitor *iv = NULL;
+Error *local_error = NULL;
+
+qdict_extract_subqdict(options, &addr, "server.");
+if (!qdict_size(addr)) {
+error_setg(errp, "SSH server address missing");
+goto out;
+}
+
+crumpled_addr = qdict_crumple(addr, true, errp);
+if (!crumpled_addr) {
+goto out;
+}
+
+iv = qmp_input_visitor_new(crumpled_addr, true);
+visit_type_InetSocketAddress(iv, NULL, &inet, &local_error);
+if (local_error) {
+error_propagate(errp, local_error);
+goto out;
+}
+
+out:
+QDECREF(addr);
+qobject_decref(crumpled_addr);
+visit_free(iv);
+return inet;
+}
+
 static int connect_to_ssh(BDRVSSHState *s, QDict *options,
   int ssh_flags, int creat_mode, Error **errp)
 {
 int r, ret;
 QemuOpts *opts = NULL;
 Error *local_err = NULL;
-const char *host, *user, *path, *host_key_check;
+const char *user, *path, *host_key_check;
 int port;
 
opts = qemu_opts_create(&ssh_runtime_opts, NULL, 0, &error_abort);
@@ -572,15 +637,11 @@ static int connect_to_ssh(BDRVSSHState *s, QDict *options,
 goto err;
 }
 
-host = qemu_opt_get(opts, "host");
-if (!host) {
+if (!ssh_process_legacy_socket_options(options, opts, errp)) {
 ret = -EINVAL;
-error_setg(errp, "No hostname was specified");
 goto err;
 }
 
-port = qemu_opt_get_number(opts, "port", 22);
-
 path = qemu_opt_get(opts, "path");
 if (!path) {
 ret = -EINVAL;
@@ -603,9 +664,16 @@ static int connect_to_ssh(BDRVSSHState *s, QDict *options,
 host_key_check = "yes";
 }
 
+/* Pop the config into our state object; exit if invalid */
+s->inet = ssh_config(s, options, errp);
+if (!s->inet) {
+goto err;
+}
+
 /* Construct the host:port name for inet_connect. */
 g_free(s->hostport);
-s->hostport = g_strdup_printf("%s:%d", host, port);
+port = atoi(s->inet->port);
+s->hostport = g_strdup_printf("%s:%d", s->inet->host, port);
 
 /* Open the socket and connect. */
 s->sock = inet_connect(s->hostport, errp);
@@ -634,7 +702,8 @@ static int connect_to_ssh(BDRVSSHState *s, QDict *options,
 }
 
 /* Check the remote host's key against 

[Qemu-block] [PATCH 1/4] block/ssh: Add ssh_has_filename_options_conflict()

2016-10-11 Thread Ashijeet Acharya
We have 5 options, plus the "server" option added in the next patch,
that conflict with specifying an SSH filename. We need to iterate over
all the options to check whether any key has a "server." prefix.

This iteration will make it easy to add the new "server" option.

Signed-off-by: Ashijeet Acharya 
---
 block/ssh.c | 27 +--
 1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/block/ssh.c b/block/ssh.c
index 5ce12b6..75cb7bc 100644
--- a/block/ssh.c
+++ b/block/ssh.c
@@ -254,15 +254,30 @@ static int parse_uri(const char *filename, QDict *options, Error **errp)
 return -EINVAL;
 }
 
+static bool ssh_has_filename_options_conflict(QDict *options, Error **errp)
+{
+const QDictEntry *qe;
+
+for (qe = qdict_first(options); qe; qe = qdict_next(options, qe)) {
+if (!strcmp(qe->key, "host") ||
+!strcmp(qe->key, "port") ||
+!strcmp(qe->key, "path") ||
+!strcmp(qe->key, "user") ||
+!strcmp(qe->key, "host_key_check"))
+{
+error_setg(errp, "Option '%s' cannot be used with a file name",
+   qe->key);
+return true;
+}
+}
+
+return false;
+}
+
 static void ssh_parse_filename(const char *filename, QDict *options,
Error **errp)
 {
-if (qdict_haskey(options, "user") ||
-qdict_haskey(options, "host") ||
-qdict_haskey(options, "port") ||
-qdict_haskey(options, "path") ||
-qdict_haskey(options, "host_key_check")) {
-error_setg(errp, "user, host, port, path, host_key_check cannot be used at the same time as a file option");
+if (ssh_has_filename_options_conflict(options, errp)) {
 return;
 }
 
-- 
2.6.2




[Qemu-block] [PATCH 0/4] Allow blockdev-add for SSH

2016-10-11 Thread Ashijeet Acharya
This series adds blockdev-add support for SSH block driver.

Patch 1 prepares the code for the addition of a new option prefix,
which is "server.". This is accomplished by adding a
ssh_has_filename_options_conflict() function which iterates over the
various options and checks for conflicts.

Patch 2 first adds InetSocketAddress compatibility to the SSH block
driver and then makes it accept an InetSocketAddress under the
"server" option. The old options "host" and "port" are supported as
legacy options and translated to the corresponding InetSocketAddress
representation.

Patch 3 drops the usage of "host" and "port" outside of the
ssh_has_filename_options_conflict() and
ssh_process_legacy_socket_options() functions in order to make them
purely legacy options.

Patch 4 enables blockdev-add support for the SSH block driver by
making the SSH driver option available.


*** This series depends on the following patch: ***
"qdict: implement a qdict_crumple method for un-flattening a dict"
from Daniel's "QAPI/QOM work for non-scalar object properties"
series.

Ashijeet Acharya (4):
  block/ssh: Add ssh_has_filename_options_conflict()
  block/ssh: Add InetSocketAddress and accept it
  block/ssh: Use InetSocketAddress options
  qapi: allow blockdev-add for ssh

 block/ssh.c  | 121 +++
 qapi/block-core.json |  24 +-
 2 files changed, 125 insertions(+), 20 deletions(-)

--
2.6.2




Re: [Qemu-block] [Qemu-devel] [PATCH v4] build: Work around SIZE_MAX bug in OSX headers

2016-10-11 Thread Markus Armbruster
Eric Blake  writes:

> C99 requires SIZE_MAX to be declared with the same type as the
> integral promotion of size_t, but OSX mistakenly defines it as
> an 'unsigned long long' expression even though size_t is only
> 'unsigned long'.  Rather than futzing around with whether size_t
> is 32- or 64-bits wide (which would be needed if we cared about
> using SIZE_MAX in a #if expression), let the compiler get the right
> type for us by virtue of integer promotion - if we later need it

By virtue of usual arithmetic conversions, actually.

> during the preprocessor, the build will break on Mac until we
> improve our replacement.
>
> See also https://patchwork.ozlabs.org/patch/542327/ for an
> instance where the wrong type trips us up if we don't fix it
> for good in osdep.h.
>
> Some versions of glibc make a similar mistake with SSIZE_MAX; the
> goal is that the approach of this patch could be copied to work
> around that problem if it ever becomes important to us.
>
> Signed-off-by: Eric Blake 
>
> ---
> v1 was here:
> https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg02520.html
>
> The topic recently came up again, and I noticed this patch sitting
> on one of my older branches, so I've taken another shot at it.
> https://lists.gnu.org/archive/html/qemu-devel/2016-10/msg00950.html
>
> v2: rewrite into a configure check (not sure if directly adding a
> -D to QEMU_CFLAGS is the best, so advice welcome)
>
> v3: Use config-host.mak rather than direct -D
>
> v4: placate -Wunused builds
>
> I lack easy access to a Mac box, so this is untested as to whether
> it actually solves the issue...
> ---
>  include/qemu/osdep.h |  8 
>  configure| 16 
>  2 files changed, 24 insertions(+)
>
> diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
> index 9e9fa61..e06ac47 100644
> --- a/include/qemu/osdep.h
> +++ b/include/qemu/osdep.h
> @@ -141,6 +141,14 @@ extern int daemon(int, int);
>  # error Unknown pointer size
>  #endif
>
> +/* Mac OSX has a <stdint.h> bug that incorrectly defines SIZE_MAX with
> + * the wrong type. Our replacement isn't usable in preprocessor
> + * expressions, but it is sufficient for our needs. */
> +#if defined(HAVE_BROKEN_SIZE_MAX) && HAVE_BROKEN_SIZE_MAX
> +#undef SIZE_MAX
> +#define SIZE_MAX ((sizeof(char)) * -1)

All right, let's see how this works.

* -1 is of type int (C99 §6.4.4.1).

* sizeof(char) is a clever way to say (size_t)1.

* Integer promotion (§6.3.1.1) is a no-op for int, i.e. the -1 operand
  stays int.

* Integer promotion is a no-op for size_t, since size_t is surely at
  least as wide as unsigned int.  The sizeof(char) operand stays size_t.

* The type of the multiplication is determined by the usual arithmetic
  conversions (§6.3.1.8).  Since size_t's rank is surely greater than or
  equal to int's rank, the int operand is converted to size_t, and the
  result is size_t.

* The value of the expression is therefore (size_t)-1 * (size_t)1,
  yielding (size_t)-1.

Cute, but what's wrong with straightforward

#define SIZE_MAX ((size_t)-1)

?
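
A standalone test (not part of the patch) showing that both spellings
evaluate to the same size_t value:

#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

int main(void)
{
    size_t a = (sizeof(char)) * -1;   /* usual arithmetic conversions -> (size_t)-1 */
    size_t b = (size_t)-1;            /* the straightforward spelling */
    printf("%d %zu\n", a == b, a);    /* "1 18446744073709551615" on LP64 */
    return 0;
}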

> +#endif
> +
>  #ifndef MIN
>  #define MIN(a, b) (((a) < (b)) ? (a) : (b))
>  #endif
> diff --git a/configure b/configure
> index 5751d8e..dd9e679 100755
> --- a/configure
> +++ b/configure
> @@ -1725,6 +1725,19 @@ if test "$cocoa" = "yes"; then
>  sdl=no
>  fi
>
> +# Some versions of Mac OS X incorrectly define SIZE_MAX
> +cat > $TMPC << EOF
> +#include <stdint.h>
> +#include <stdio.h>
> +int main(int argc, char *argv[]) {
> +return printf("%zu", SIZE_MAX);
> +}
> +EOF
> +have_broken_size_max=no
> +if ! compile_object -Werror ; then
> +have_broken_size_max=yes
> +fi
> +
>  ##
>  # L2TPV3 probe
>
> @@ -5245,6 +5258,9 @@ fi
>  if test "$have_ifaddrs_h" = "yes" ; then
>  echo "HAVE_IFADDRS_H=y" >> $config_host_mak
>  fi
> +if test "$have_broken_size_max" = "yes" ; then
> +echo "HAVE_BROKEN_SIZE_MAX=y" >> $config_host_mak
> +fi
>
>  # Work around a system header bug with some kernel/XFS header
>  # versions where they both try to define 'struct fsxattr':



Re: [Qemu-block] [Qemu-devel] [PATCH v1 2/2] block/replication: Clarify 'top-id' parameter usage

2016-10-11 Thread Fam Zheng
On Tue, 10/11 13:39, Changlong Xie wrote:
> Replication driver only support 'top-id' parameter in secondary side,
> and it must not be supplied in primary side
> 
> Signed-off-by: Changlong Xie 
> ---
>  block/replication.c  | 5 +
>  qapi/block-core.json | 3 ++-
>  2 files changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/block/replication.c b/block/replication.c
> index 5b432d9..1e8284b 100644
> --- a/block/replication.c
> +++ b/block/replication.c
> @@ -101,6 +101,11 @@ static int replication_open(BlockDriverState *bs, QDict *options,
>  
>  if (!strcmp(mode, "primary")) {
>  s->mode = REPLICATION_MODE_PRIMARY;
> +top_id = qemu_opt_get(opts, REPLICATION_TOP_ID);
> +if (top_id) {
> +error_setg(&local_err, "The primary side do not support option top-id");

s/do not/does not/

> +goto fail;
> +}
>  } else if (!strcmp(mode, "secondary")) {
>  s->mode = REPLICATION_MODE_SECONDARY;
>  top_id = qemu_opt_get(opts, REPLICATION_TOP_ID);
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index 4badb97..ec92df4 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -2197,7 +2197,8 @@
>  # @mode: the replication mode
>  #
>  # @top-id: #optional In secondary mode, node name or device ID of the root
> -#  node who owns the replication node chain. Ignored in primary mode.
> +#  node who owns the replication node chain. Must not be given in
> +#  primary mode.
>  #
>  # Since: 2.8
>  ##
> -- 
> 1.9.3
> 
> 
> 
>