[Qemu-block] [PATCH] qemu-img: Allow unmap backing image for zeroed clusters
We already specified BDRV_O_UNMAP when opening images in 'qemu-img commit', but didn't turn on the "unmap" in the active commit job. This patch fixes that so that zeroed clusters in top image can be discarded which is desired in the virt-sparsify use case, where a temporary overlay is created and fstrim'ed before commiting back, to free space in the original image. The block-commit keeps the existing behavior. Signed-off-by: Fam Zheng--- block/mirror.c| 5 +++-- block/replication.c | 2 +- blockdev.c| 3 ++- include/block/block_int.h | 4 +++- qemu-img.c| 2 +- 5 files changed, 10 insertions(+), 6 deletions(-) diff --git a/block/mirror.c b/block/mirror.c index f9d1fec..5892cf6 100644 --- a/block/mirror.c +++ b/block/mirror.c @@ -1002,7 +1002,8 @@ void commit_active_start(const char *job_id, BlockDriverState *bs, BlockdevOnError on_error, BlockCompletionFunc *cb, void *opaque, Error **errp, - bool auto_complete) + bool auto_complete, + bool unmap) { int64_t length, base_length; int orig_base_flags; @@ -1042,7 +1043,7 @@ void commit_active_start(const char *job_id, BlockDriverState *bs, mirror_start_job(job_id, bs, base, NULL, speed, 0, 0, MIRROR_LEAVE_BACKING_CHAIN, - on_error, on_error, false, cb, opaque, _err, + on_error, on_error, unmap, cb, opaque, _err, _active_job_driver, false, base, auto_complete); if (local_err) { error_propagate(errp, local_err); diff --git a/block/replication.c b/block/replication.c index 3bd1cf1..b8921a6 100644 --- a/block/replication.c +++ b/block/replication.c @@ -624,7 +624,7 @@ static void replication_stop(ReplicationState *rs, bool failover, Error **errp) commit_active_start("replication-commit", s->active_disk->bs, s->secondary_disk->bs, 0, BLOCKDEV_ON_ERROR_REPORT, replication_done, -bs, errp, true); +bs, errp, true, false); break; default: aio_context_release(aio_context); diff --git a/blockdev.c b/blockdev.c index 29c6561..8f9125a 100644 --- a/blockdev.c +++ b/blockdev.c @@ -3136,7 +3136,8 @@ void qmp_block_commit(bool has_job_id, const char *job_id, const char *device, goto out; } commit_active_start(has_job_id ? job_id : NULL, bs, base_bs, speed, -on_error, block_job_cb, bs, _err, false); +on_error, block_job_cb, bs, _err, false, +false); } else { commit_start(has_job_id ? job_id : NULL, bs, base_bs, top_bs, speed, on_error, block_job_cb, bs, diff --git a/include/block/block_int.h b/include/block/block_int.h index ef3c047..9b1e5d8 100644 --- a/include/block/block_int.h +++ b/include/block/block_int.h @@ -694,13 +694,15 @@ void commit_start(const char *job_id, BlockDriverState *bs, * @opaque: Opaque pointer value passed to @cb. * @errp: Error object. * @auto_complete: Auto complete the job. + * @unmap: Unmap zero clusters on base. * */ void commit_active_start(const char *job_id, BlockDriverState *bs, BlockDriverState *base, int64_t speed, BlockdevOnError on_error, BlockCompletionFunc *cb, - void *opaque, Error **errp, bool auto_complete); + void *opaque, Error **errp, bool auto_complete, + bool unmap); /* * mirror_start: * @job_id: The id of the newly-created job, or %NULL to use the diff --git a/qemu-img.c b/qemu-img.c index ceffefe..f2397e6 100644 --- a/qemu-img.c +++ b/qemu-img.c @@ -929,7 +929,7 @@ static int img_commit(int argc, char **argv) }; commit_active_start("commit", bs, base_bs, 0, BLOCKDEV_ON_ERROR_REPORT, -common_block_job_cb, , _err, false); +common_block_job_cb, , _err, false, true); if (local_err) { goto done; } -- 2.7.4
Re: [Qemu-block] VMDK file unable to open in Qemu but ESXi
On Mon, 09/26 16:30, Peter Lieven wrote: > > So it looks like this file is not compliant to the specfication. Is it > > recognized and imported by VMware? > > Yes, VMware ESXi imports it flawlessly. The only possibility is that the "footer" is not in the end of the image. There must be some other sector begins with bytes '4b 44 4d 56' besides the first. Theoretically we can make QEMU accept this case by "scanning" the whole image to look for the footer, but I'm not sure it's worth it. Fam
Re: [Qemu-block] [Nbd] [PATCH] proto: add 'shift' extension.
On Mon, Sep 26, 2016 at 03:21:46PM -0500, Eric Blake wrote: > I'd much rather support a single flag that says to zero the entire disk > than to introduce stateful variable-amount shifting. That's almost exactly the opposite of what I said :) Now, I don't feel very strong either way, but what matters to me is: - NBD is a simple, easy to understand protocol; that is a feature, and so it should remain that way. - Every time we add another option, flag, or command, we make the protocol slightly more complex, which is counter to that goal. - Adding a command with a single use case (i.e., a "wipe the whole device" command) seems like it would not see much use, except perhaps in the use case that Virtuozzo is thinking about. In other words, it makes things slightly more complex for little benefit. I thought a negotiated shift size could be creatively used for other things beyond just "wipe the whole disk" commands, and that it might be elegant in that way. On the other hand, I recognize that adding state in that manner also complicates the protocol in that an observer which sees only part of the traffic may not understand what's going on anymore. So let's just say that an NBD_CMD_FLAG_SHIFT would: - Left-shift the size by 16 bits; no more, no less - 2^32-1 is too large a granularity for this to be useful beyond "wipe whole disk" commands; 2^16-1 (65535) seems like a more useful granularity. - This allows for a maximum number of 2^48-1 bytes (one byte shy of 256 tebibytes) to be affected by a single command, which seems sufficient for the given purpose. - If someone really wants to wipe 2^64-1 bytes (i.e., 16 exbibytes), they are probably using the wrong tools. - Be only valid for commands that don't send or expect data to be sent out over the wire. - currently TRIM and WRITE_ZEROES, but not READ or WRITE. Thoughts? -- < ron> I mean, the main *practical* problem with C++, is there's like a dozen people in the world who think they really understand all of its rules, and pretty much all of them are just lying to themselves too. -- #debian-devel, OFTC, 2016-02-12
Re: [Qemu-block] [Qemu-devel] [PATCH v3 1/2] block: sync bdrv_co_get_block_status_above() with bdrv_is_allocated_above()
On Mon, Sep 26, 2016 at 05:04:21PM +0200, Kevin Wolf wrote: > Am 19.09.2016 um 22:39 hat Eric Blake geschrieben: [...] > > I typically write this as: > > > > L2 <- L1 <- L0 > > > > (read "L2 backs L1, which in turn backs L0") with the active on the > > right. So I understand the confusion in Fam's question where you were > > using the opposite direction. > > And I tend to use this one: > > base <- sn1 <- sn2 <- top > > "sn*" isn't any better than "L*", but having at least one of "base" and > "top" (or "active") in there disambiguates the roles of the nodes. Not to quibble over terminology too much, but now that I'm writing a doc that I want to submit upstream, I began with your (Kevin's) notation. Then, I thought: Hmm, "sn1" could also be referred to as 'base', and "sn2" as 'top' when using `block-commit`' (and `block-stream`, once it starts supporting intermediate streaming?). And, moreover, as Eric (correctly) warns elsewhere about file-names vs. points-in-time: the guest state when 'sn1' was created is contained in 'base', so one could argue that 'sn1' ("snapshot 1") is a misnomer, and is technically 'overlay1'. So, I used the below notation until recently, including 'active' (with the rationale Kevin mentioned): base <- overlay1 <- overlay2 <- active Then, someone asked: "In the above chain, are you pointing to 'overlay2' as active, or is 'active' a separate image unto itself"? "Sigh, so it is still prone to misunderstanding", I thought. Given that, for now, though slightly more verbose and space-occupying, I settled on the below (occasionally doing s/base/orig/, to avoid the "overlay1 could be referred to as 'base' in some cases" problem): Live QEMU | v base <- overlay1 <- overlay2 <- overlay3 FWIW, the above also avoids the problem of a file called 'active' being described as: "previously-active, but not anymore, because its contents are merged into its backing file" in the event of a 'block-commit'. -- /kashyap
Re: [Qemu-block] [PATCH] proto: add 'shift' extension.
On 09/26/2016 07:51 AM, Paolo Bonzini wrote: >> +- bit 3, `NBD_CMD_FLAG_SHIFT`; This flag is valid for `NBD_CMD_TRIM` and >> + `NBD_CMD_WRITE_ZEROES`. If this flag is set the server shifts request >> + *length* and *offset* left by N bits, where N is defined by >> `NBD_OPT_SHIFT` >> + option or is assumed to be 16 bits by default if `NBD_OPT_SHIFT` option is >> + not specified. If after shift `(offset + length)` exceeds disk size, >> length >> + should be reduced to `( - offset)`. However, `(offset + >> length)` >> + must not exceed disk size by more than `(1 << N) - 1`. >> >> Request types >> >> > > This is very ad hoc. Can we instead have a block size common to all > commands? Block devices in practice have one, in fact that's why > they're called block devices... The INFO extensions are supposed to be a way for the server to communicate block sizes to the guest (note, it is one-way communication; the guest does NOT get to pick an arbitrary size, but relies on the server reporting it) - but I don't know if that sizing information is useful enough for the task at hand. As it was, the INFO extension had a proposal idea a while back about advertising a different size for TRIM and WRITE_ZEROES than what is preferred for WRITE and READ, so having a single size that works for shifted commands that operate on blocks instead of bytes may not even be feasible if there is no single block size to settle on. But having a universal command flag that says "this command requests offset and length in units of blocks instead of bytes", where blocks is an up-front sizing settled on during handshake via INFO extension, and NOT something that the client can change at-will during a live session, may be useful. Or it may be the source of arithmetic overflow exploits in poor implementations, or of denial-of-service with used with a READ or WRITE to send more than 2G of data in a single command. In other words, I don't yet see a compelling use case for being able to request shifted sizes. -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-block] [PATCH] proto: add 'shift' extension.
On 09/26/2016 07:46 AM, Vladimir Sementsov-Ogievskiy wrote: > This extension allows big requests for TRIM and WRITE_ZEROES through > special 'shift' parameter, which means that offset and length should be > shifted left by several bits. > > This is needed for efficient clearing large regions of the disk (up to > the whole disk). > > Signed-off-by: Vladimir Sementsov-Ogievskiy> --- > > Here mentioned WRITE_ZEROES command which is only an experemental > extension for now. > > To dicuss: > +- `NBD_OPT_SHIFT` (10) > + > +Defines shift used to calculate request offset and length if > +`NBD_CMD_FLAG_SHIFT` is set. > + > +Data: > + > +- 32 bits, shift (unsigned); Must not be larger than 32. Uggh. You're making the protocol stateful (the server has to remember what the client has previously requested via NBD_CMD_FLAG_SHIFT, rather than having ALL information it needs immediately available in the current NBD_CMD_WRITE_ZEROES). I'd much rather support a single flag that says to zero the entire disk than to introduce stateful variable-amount shifting. -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-block] [Qemu-devel] [PATCH] MAINTAINERS: Add some more headers to the IDE section
On 09/26/2016 05:39 AM, Thomas Huth wrote: On 26.09.2016 10:22, Kevin Wolf wrote: Am 23.09.2016 um 18:42 hat John Snow geschrieben: On 09/23/2016 12:09 PM, Thomas Huth wrote: The folder include/hw/ide/ belongs to the IDE section. Signed-off-by: Thomas Huth--- MAINTAINERS | 1 + 1 file changed, 1 insertion(+) diff --git a/MAINTAINERS b/MAINTAINERS index d8a0cfc..acf6d6c 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -791,6 +791,7 @@ M: John Snow L: qemu-block@nongnu.org S: Supported F: include/hw/ide.h +F: include/hw/ide/ F: hw/ide/ F: hw/block/block.c F: hw/block/cdrom.c Ah, yeah. These got missed when they were moved over. Thanks. Reviewed-by: John Snow Who is supposed to merge this if you only give an R-b? I've CC:ed this patch to qemu-trivial, so I hope it will get picked up there if John does not want to apply this directly. Thomas Thanks, applied to my IDE tree: https://github.com/jnsnow/qemu/commits/ide https://github.com/jnsnow/qemu.git --js
Re: [Qemu-block] [Qemu-devel] [PATCH 0/1] ahci: fix ncq aiocb-related segfault
On 09/26/2016 12:10 PM, Stefan Hajnoczi wrote: On Thu, Sep 22, 2016 at 04:10:39PM -0400, John Snow wrote: Fix ncq_cb to prevent a segfault on sys_reset. John Snow (1): ahci: clear aiocb in ncq_cb hw/ide/ahci.c | 1 + 1 file changed, 1 insertion(+) -- 2.7.4 Maybe worth adding as a clarification: The issue is when bdrv_aio_cancel() is called after ncq_cb() was already invoked. The aiocb will be a dangling pointer. Done. Reviewed-by: Stefan HajnocziThanks, applied to my IDE tree: https://github.com/jnsnow/qemu/commits/ide https://github.com/jnsnow/qemu.git --js
Re: [Qemu-block] [PATCH v2 0/7] block: Make more blockdev-add options work
On 23.09.2016 16:32, Kevin Wolf wrote: > This series moves a few more options from flags to the appropriate place. This > doesn't only result in cleaner code, but also allows using these options in > nested node definitions. > > After this series, bds_tree_init() is only a thin wrapper around bdrv_open() > which sets the right defaults for cache modes and the BDRV_O_INACTIVE flag if > necessary. > > Depends on my block branch, specifically Berto's 'read-only' patches. > > v2: > - Patch 3 ('block/qapi: Move 'aio' option to file driver') > Use qapi_enum_parse() instead of parsing manually [Eric] > > - Patch 4 ('block: Parse 'detect-zeroes' in bdrv_open_common()') > Fixed typo in commit message [Eric] > Reuse local detect_zeroes variable instead of querying QemuOpts [Eric] > > Kevin Wolf (7): > block: Drop aio/cache consistency check from qmp_blockdev_add() > block/qapi: Use separate options type for curl driver > block/qapi: Move 'aio' option to file driver > block: Parse 'detect-zeroes' in bdrv_open_common() > block: Use 'detect-zeroes' option for 'blockdev-change-medium' > block: Move 'discard' option to bdrv_open_common() > block: Remove qemu_root_bds_opts > > block.c| 50 ++- > block/block-backend.c | 9 ++-- > block/raw-posix.c | 44 ++--- > block/raw-win32.c | 56 +++-- > blockdev.c | 110 > +++-- > include/block/block.h | 1 + > include/sysemu/block-backend.h | 2 +- > qapi/block-core.json | 31 > tests/qemu-iotests/087 | 4 +- > tests/qemu-iotests/087.out | 2 +- > 10 files changed, 164 insertions(+), 145 deletions(-) Reviewed-by: Max ReitzIt's a bit funny now how blockdev_init() and extract_common_blockdev_options() are actually used by neither blockdev-add nor -blockdev, but only by drive_new() (which happily advertises blockdev_init() to be shared with blockev-add, though). Max signature.asc Description: OpenPGP digital signature
Re: [Qemu-block] [PATCH v2 3/7] block/qapi: Move 'aio' option to file driver
On 23.09.2016 16:32, Kevin Wolf wrote: > The option whether or not to use a native AIO interface really isn't a > generic option for all drivers, but only applies to the native file > protocols. This patch moves the option in blockdev-add to the > appropriate places (raw-posix and raw-win32). > > We still have to keep the flag BDRV_O_NATIVE_AIO for compatibility > because so far the AIO option was usually specified on the wrong layer > (the top-level format driver, which didn't even look at it) and then > inherited by the protocol driver (where it was actually used). We can't > forbid this use except in new interfaces. > > Signed-off-by: Kevin Wolf> --- > block/raw-posix.c | 44 --- > block/raw-win32.c | 56 > +- > qapi/block-core.json | 6 +++--- > tests/qemu-iotests/087 | 4 ++-- > 4 files changed, 83 insertions(+), 27 deletions(-) [...] > diff --git a/block/raw-win32.c b/block/raw-win32.c > index 56f45fe..734bb10 100644 > --- a/block/raw-win32.c > +++ b/block/raw-win32.c [...] > +static bool get_aio_option(QemuOpts *opts, int flags, Error **errp) > +{ > +BlockdevAioOptions aio, aio_default; > + > +aio_default = (flags & BDRV_O_NATIVE_AIO) ? BLOCKDEV_AIO_OPTIONS_NATIVE > + : BLOCKDEV_AIO_OPTIONS_THREADS; > +aio = qapi_enum_parse(BlockdevAioOptions_lookup, qemu_opt_get(opts, > "aio"), > + BLOCKDEV_AIO_OPTIONS__MAX, aio_default, errp); > + > +switch (aio) { > +case BLOCKDEV_AIO_OPTIONS_NATIVE: > +return true; > +case BLOCKDEV_AIO_OPTIONS_THREADS: > +return false; > +default: > +error_setg(errp, "Invalid AIO option"); Any reason for catching this case here but not in raw-posix? (Not that it really matters, though.) Max signature.asc Description: OpenPGP digital signature
Re: [Qemu-block] [PATCH v3 0/3] Add -blockdev command line option
On 26.09.2016 17:27, Kevin Wolf wrote: > This series adds an option that is directly mapped to the blockdev-add QMP > command. It works more or less like -drive, except that it doesn't create a > BlockBackend (creating just a BDS without a BB is impossible with -drive) and > doesn't support legacy options. > > Depends on Dan's "[PATCH v13 0/6] QAPI/QOM work for non-scalar object > properties". > > v3: > - Some more documentation improvements [Max] > > v2: > - Fix uninitialised use of options variable [Eric] > - Use qapi_free_BlockdevOptions() instead of dealloc visitor [Eric] > - Improved documentation to mention dotted syntax, explain use of > -blockdev with -device, include examples and default values [Eric] > > Kevin Wolf (3): > block: Add '-blockdev' command line option > doc: Document generic -blockdev options > doc: Document driver-specific -blockdev options > > blockdev.c | 12 +++ > include/sysemu/sysemu.h | 1 + > qemu-options.hx | 227 > +--- > vl.c| 53 +++ > 4 files changed, 264 insertions(+), 29 deletions(-) Thanks! Reviewed-by: Max Reitzsignature.asc Description: OpenPGP digital signature
Re: [Qemu-block] [PATCH 0/1] ahci: fix ncq aiocb-related segfault
On Thu, Sep 22, 2016 at 04:10:39PM -0400, John Snow wrote: > Fix ncq_cb to prevent a segfault on sys_reset. > > John Snow (1): > ahci: clear aiocb in ncq_cb > > hw/ide/ahci.c | 1 + > 1 file changed, 1 insertion(+) > > -- > 2.7.4 Maybe worth adding as a clarification: The issue is when bdrv_aio_cancel() is called after ncq_cb() was already invoked. The aiocb will be a dangling pointer. Reviewed-by: Stefan Hajnoczisignature.asc Description: PGP signature
Re: [Qemu-block] [Qemu-devel] [PATCH v3 0/3] Add -blockdev command line option
Hi, Your series failed automatic build test. Please find the testing commands and their output below. If you have docker installed, you can probably reproduce it locally. Type: series Message-id: 1474903676-5680-1-git-send-email-kw...@redhat.com Subject: [Qemu-devel] [PATCH v3 0/3] Add -blockdev command line option === TEST SCRIPT BEGIN === #!/bin/bash set -e git submodule update --init dtc # Let docker tests dump environment info export SHOW_ENV=1 make J=8 docker-test-quick@centos6 make J=8 docker-test-mingw@fedora === TEST SCRIPT END === Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384 From https://github.com/patchew-project/qemu - [tag update] patchew/1474895173-27278-1-git-send-email-kw...@redhat.com -> patchew/1474895173-27278-1-git-send-email-kw...@redhat.com * [new tag] patchew/1474903676-5680-1-git-send-email-kw...@redhat.com -> patchew/1474903676-5680-1-git-send-email-kw...@redhat.com Switched to a new branch 'test' 66335c8 doc: Document driver-specific -blockdev options 840d6f1 doc: Document generic -blockdev options eb554d3 block: Add '-blockdev' command line option === OUTPUT BEGIN === Submodule 'dtc' (git://git.qemu-project.org/dtc.git) registered for path 'dtc' Cloning into 'dtc'... Submodule path 'dtc': checked out '65cc4d2748a2c2e6f27f1cf39e07a5dbabd80ebf' BUILD centos6 ARCHIVE qemu.tgz ARCHIVE dtc.tgz COPY RUNNER RUN test-quick in centos6 Packages installed: SDL-devel-1.2.14-7.el6_7.1.x86_64 ccache-3.1.6-2.el6.x86_64 epel-release-6-8.noarch gcc-4.4.7-17.el6.x86_64 git-1.7.1-4.el6_7.1.x86_64 glib2-devel-2.28.8-5.el6.x86_64 libfdt-devel-1.4.0-1.el6.x86_64 make-3.81-23.el6.x86_64 package g++ is not installed pixman-devel-0.32.8-1.el6.x86_64 tar-1.23-15.el6_8.x86_64 zlib-devel-1.2.3-29.el6.x86_64 Environment variables: PACKAGES=libfdt-devel ccache tar git make gcc g++ zlib-devel glib2-devel SDL-devel pixman-devel epel-release HOSTNAME=8ed38ebbff06 TERM=xterm MAKEFLAGS= -j8 HISTSIZE=1000 J=8 USER=root CCACHE_DIR=/var/tmp/ccache EXTRA_CONFIGURE_OPTS= V= SHOW_ENV=1 MAIL=/var/spool/mail/root PATH=/usr/lib/ccache:/usr/lib64/ccache:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin PWD=/ LANG=en_US.UTF-8 TARGET_LIST= HISTCONTROL=ignoredups SHLVL=1 HOME=/root TEST_DIR=/tmp/qemu-test LOGNAME=root LESSOPEN=||/usr/bin/lesspipe.sh %s FEATURES= dtc DEBUG= G_BROKEN_FILENAMES=1 CCACHE_HASHDIR= _=/usr/bin/env Configure options: --enable-werror --target-list=x86_64-softmmu,aarch64-softmmu --prefix=/tmp/qemu-test/src/tests/docker/install No C++ compiler available; disabling C++ specific optional code Install prefix/tmp/qemu-test/src/tests/docker/install BIOS directory/tmp/qemu-test/src/tests/docker/install/share/qemu binary directory /tmp/qemu-test/src/tests/docker/install/bin library directory /tmp/qemu-test/src/tests/docker/install/lib module directory /tmp/qemu-test/src/tests/docker/install/lib/qemu libexec directory /tmp/qemu-test/src/tests/docker/install/libexec include directory /tmp/qemu-test/src/tests/docker/install/include config directory /tmp/qemu-test/src/tests/docker/install/etc local state directory /tmp/qemu-test/src/tests/docker/install/var Manual directory /tmp/qemu-test/src/tests/docker/install/share/man ELF interp prefix /usr/gnemul/qemu-%M Source path /tmp/qemu-test/src C compilercc Host C compiler cc C++ compiler Objective-C compiler cc ARFLAGS rv CFLAGS-O2 -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -pthread -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include -g QEMU_CFLAGS -I/usr/include/pixman-1-fPIE -DPIE -m64 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes -Wredundant-decls -Wall -Wundef -Wwrite-strings -Wmissing-prototypes -fno-strict-aliasing -fno-common -fwrapv -Wendif-labels -Wmissing-include-dirs -Wempty-body -Wnested-externs -Wformat-security -Wformat-y2k -Winit-self -Wignored-qualifiers -Wold-style-declaration -Wold-style-definition -Wtype-limits -fstack-protector-all LDFLAGS -Wl,--warn-common -Wl,-z,relro -Wl,-z,now -pie -m64 -g make make install install pythonpython -B smbd /usr/sbin/smbd module supportno host CPU x86_64 host big endian no target list x86_64-softmmu aarch64-softmmu tcg debug enabled no gprof enabled no sparse enabledno strip binariesyes profiler no static build no pixmansystem SDL support yes (1.2.14) GTK support no GTK GL supportno VTE support no TLS priority NORMAL GNUTLS supportno GNUTLS rndno libgcrypt no libgcrypt kdf no nettleno nettle kdfno libtasn1 no curses supportno virgl support no curl support no mingw32 support no Audio drivers oss Block whitelist (rw) Block whitelist (ro) VirtFS supportno VNC support yes VNC SASL support no VNC JPEG support no VNC PNG
Re: [Qemu-block] [PATCH] block: Fix error path in qmp_blockdev_change_medium()
On 26.09.2016 15:06, Kevin Wolf wrote: > Commit 00949bab incorrectly changed one instance of into errp while > touching the line. Change it back. > > Signed-off-by: Kevin Wolf> --- > blockdev.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) Reviewed-by: Max Reitz signature.asc Description: OpenPGP digital signature
[Qemu-block] [PATCH v3 3/3] doc: Document driver-specific -blockdev options
This documents the driver-specific options for the raw, qcow2 and file block drivers for the man page. For everything else, we refer to the QAPI documentation. Signed-off-by: Kevin Wolf--- qemu-options.hx | 115 +++- 1 file changed, 114 insertions(+), 1 deletion(-) diff --git a/qemu-options.hx b/qemu-options.hx index b85aae4..872e97c 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -522,7 +522,18 @@ STEXI @item -blockdev @var{option}[,@var{option}[,@var{option}[,...]]] @findex -blockdev -Define a new block driver node. +Define a new block driver node. Some of the options apply to all block drivers, +other options are only accepted for a specific block driver. See below for a +list of generic options and options for the most common block drivers. + +Options that expect a reference to another node (e.g. @code{file}) can be +given in two ways. Either you specify the node name of an already existing node +(file=@var{node-name}), or you define a new node inline, adding options +for the referenced node after a dot (file.filename=@var{path},file.aio=native). + +A block driver node created with @option{-blockdev} can be used for a guest +device by specifying its node name for the @code{drive} property in a +@option{-device} argument that defines a block device. @table @option @item Valid options for any block driver node: @@ -562,6 +573,108 @@ zero write commands. You may even choose "unmap" if @var{discard} is set to "unmap" to allow a zero write to be converted to an @code{unmap} operation. @end table +@item Driver-specific options for @code{file} + +This is the protocol-level block driver for accessing regular files. + +@table @code +@item filename +The path to the image file in the local filesystem +@item aio +Specifies the AIO backend (threads/native, default: threads) +@end table +Example: +@example +-blockdev driver=file,node-name=disk,filename=disk.img +@end example + +@item Driver-specific options for @code{raw} + +This is the image format block driver for raw images. It is usually +stacked on top of a protocol level block driver such as @code{file}. + +@table @code +@item file +Reference to or definition of the data source block driver node +(e.g. a @code{file} driver node) +@end table +Example 1: +@example +-blockdev driver=file,node-name=disk_file,filename=disk.img +-blockdev driver=raw,node-name=disk,file=disk_file +@end example +Example 2: +@example +-blockdev driver=raw,node-name=disk,file.driver=file,file.filename=disk.img +@end example + +@item Driver-specific options for @code{qcow2} + +This is the image format block driver for qcow2 images. It is usually +stacked on top of a protocol level block driver such as @code{file}. + +@table @code +@item file +Reference to or definition of the data source block driver node +(e.g. a @code{file} driver node) + +@item backing +Reference to or definition of the backing file block device (default is taken +from the image file). It is allowed to pass an empty string here in order to +disable the default backing file. + +@item lazy-refcounts +Whether to enable the lazy refcounts feature (on/off; default is taken from the +image file) + +@item cache-size +The maximum total size of the L2 table and refcount block caches in bytes +(default: 1048576 bytes or 8 clusters, whichever is larger) + +@item l2-cache-size +The maximum size of the L2 table cache in bytes +(default: 4/5 of the total cache size) + +@item refcount-cache-size +The maximum size of the refcount block cache in bytes +(default: 1/5 of the total cache size) + +@item cache-clean-interval +Clean unused entries in the L2 and refcount caches. The interval is in seconds. +The default value is 0 and it disables this feature. + +@item pass-discard-request +Whether discard requests to the qcow2 device should be forwarded to the data +source (on/off; default: on if discard=unmap is specified, off otherwise) + +@item pass-discard-snapshot +Whether discard requests for the data source should be issued when a snapshot +operation (e.g. deleting a snapshot) frees clusters in the qcow2 file (on/off; +default: on) + +@item pass-discard-other +Whether discard requests for the data source should be issued on other +occasions where a cluster gets freed (on/off; default: off) + +@item overlap-check +Which overlap checks to perform for writes to the image +(none/constant/cached/all; default: cached). For details or finer +granularity control refer to the QAPI documentation of @code{blockdev-add}. +@end table + +Example 1: +@example +-blockdev driver=file,node-name=my_file,filename=/tmp/disk.qcow2 +-blockdev driver=qcow2,node-name=hda,file=my_file,overlap-check=none,cache-size=16777216 +@end example +Example 2: +@example +-blockdev driver=qcow2,node-name=disk,file.driver=http,file.filename=http://example.com/image.qcow2 +@end example + +@item Driver-specific options for other drivers +Please refer to the QAPI documentation of
[Qemu-block] [PATCH v3 2/3] doc: Document generic -blockdev options
This adds documentation for the -blockdev options that apply to all nodes independent of the block driver used. All options that are shared by -blockdev and -drive are now explained in the section for -blockdev. The documentation of -drive mentions that all -blockdev options are accepted as well. Signed-off-by: Kevin WolfReviewed-by: Eric Blake --- qemu-options.hx | 102 1 file changed, 73 insertions(+), 29 deletions(-) diff --git a/qemu-options.hx b/qemu-options.hx index 5d13898..b85aae4 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -523,6 +523,47 @@ STEXI @findex -blockdev Define a new block driver node. + +@table @option +@item Valid options for any block driver node: + +@table @code +@item driver +Specifies the block driver to use for the given node. +@item node-name +This defines the name of the block driver node by which it will be referenced +later. The name must be unique, i.e. it must not match the name of a different +block driver node, or (if you use @option{-drive} as well) the ID of a drive. + +If no node name is specified, it is automatically generated. The generated node +name is not intended to be predictable and changes between QEMU invocations. +For the top level, an explicit node name must be specified. +@item read-only +Open the node read-only. Guest write attempts will fail. +@item cache.direct +The host page cache can be avoided with @option{cache.direct=on}. This will +attempt to do disk IO directly to the guest's memory. QEMU may still perform an +internal copy of the data. +@item cache.no-flush +In case you don't care about data integrity over host failures, you can use +@option{cache.no-flush=on}. This option tells QEMU that it never needs to write +any data to the disk but can instead keep things in cache. If anything goes +wrong, like your host losing power, the disk storage getting disconnected +accidentally, etc. your image will most probably be rendered unusable. +@item discard=@var{discard} +@var{discard} is one of "ignore" (or "off") or "unmap" (or "on") and controls +whether @code{discard} (also known as @code{trim} or @code{unmap}) requests are +ignored or passed to the filesystem. Some machine types may not support +discard requests. +@item detect-zeroes=@var{detect-zeroes} +@var{detect-zeroes} is "off", "on" or "unmap" and enables the automatic +conversion of plain zero writes by the OS to driver specific optimized +zero write commands. You may even choose "unmap" if @var{discard} is set +to "unmap" to allow a zero write to be converted to an @code{unmap} operation. +@end table + +@end table + ETEXI DEF("drive", HAS_ARG, QEMU_OPTION_drive, @@ -544,7 +585,12 @@ STEXI @item -drive @var{option}[,@var{option}[,@var{option}[,...]]] @findex -drive -Define a new drive. Valid options are: +Define a new drive. This includes creating a block driver node (the backend) as +well as a guest device, and is mostly a shortcut for defining the corresponding +@option{-blockdev} and @option{-device} options. + +@option{-drive} accepts all options that are accepted by @option{-blockdev}. In +addition, it knows the following options: @table @option @item file=@var{file} @@ -571,11 +617,31 @@ These options have the same definition as they have in @option{-hdachs}. @var{snapshot} is "on" or "off" and controls snapshot mode for the given drive (see @option{-snapshot}). @item cache=@var{cache} -@var{cache} is "none", "writeback", "unsafe", "directsync" or "writethrough" and controls how the host cache is used to access block data. +@var{cache} is "none", "writeback", "unsafe", "directsync" or "writethrough" +and controls how the host cache is used to access block data. This is a +shortcut that sets the @option{cache.direct} and @option{cache.no-flush} +options (as in @option{-blockdev}), and additionally @option{cache.writeback}, +which provides a default for the @option{write-cache} option of block guest +devices (as in @option{-device}). The modes correspond to the following +settings: + +@c Our texi2pod.pl script doesn't support @multitable, so fall back to using +@c plain ASCII art (well, UTF-8 art really). This looks okay both in the manpage +@c and the HTML output. +@example +@ │ cache.writeback cache.direct cache.no-flush +─┼─ +writeback│ onoffoff +none │ onon off +writethrough │ off offoff +directsync │ off on off +unsafe │ onoffon +@end example + +The default mode is @option{cache=writeback}. + @item aio=@var{aio} @var{aio} is "threads", or "native" and selects between pthread based disk I/O and native Linux AIO. -@item discard=@var{discard} -@var{discard} is one of "ignore" (or "off") or "unmap" (or "on") and controls
Re: [Qemu-block] [Qemu-devel] [PATCH v3 1/2] block: sync bdrv_co_get_block_status_above() with bdrv_is_allocated_above()
Am 19.09.2016 um 22:39 hat Eric Blake geschrieben: > On 09/18/2016 11:37 PM, Denis V. Lunev wrote: > > On 09/19/2016 04:21 AM, Fam Zheng wrote: > >> On Thu, 09/15 19:34, Denis V. Lunev wrote: > >>> They should work very similar, covering same areas if backing store is > >>> shorter than the image. This change is necessary for the followup patch > >>> switching to bdrv_get_block_status_above() in mirror to avoid assert > >>> in check_block. > >>> > >>> This change should be made very carefully. Let us assume that we have > >>> top image and 2 backing stores L0->L1->L2. > >> Stupid question: which one is top and which are backing? > > L0 is top, L2 is at bottom. > > I typically write this as: > > L2 <- L1 <- L0 > > (read "L2 backs L1, which in turn backs L0") with the active on the > right. So I understand the confusion in Fam's question where you were > using the opposite direction. And I tend to use this one: base <- sn1 <- sn2 <- top "sn*" isn't any better than "L*", but having at least one of "base" and "top" (or "active") in there disambiguates the roles of the nodes. Kevin pgp8pg5gbJmqB.pgp Description: PGP signature
Re: [Qemu-block] [PATCH] proto: add 'shift' extension.
On 26/09/2016 15:53, Vladimir Sementsov-Ogievskiy wrote: > On 26.09.2016 15:51, Paolo Bonzini wrote: >> This is very ad hoc. Can we instead have a block size common to all >> commands? Block devices in practice have one, in fact that's why >> they're called block devices... > > Block size can be too small to clear the whole disk in one request (i.e. > (2**31 * block_size) is too small..) Considering that NBD supports multiple outstanding requests, is it a big deal to require one request per terabyte of storage? Paolo
Re: [Qemu-block] [PATCH] proto: add 'shift' extension.
> On 26 Sep 2016, at 14:54, Paolo Bonziniwrote: > > > Considering that NBD supports multiple outstanding requests, is it a big > deal to require one request per terabyte of storage? +1 Also I don't think we particularly want to make clearing the entire disk easy to do by mistake! This whole 'clear the whole disk in one command' seems like a dire case of premature optimisation. -- Alex Bligh
Re: [Qemu-block] [PATCH] proto: add 'shift' extension.
> On 26 Sep 2016, at 13:46, Vladimir Sementsov-Ogievskiy >wrote: > > Option reply types > > These values are used in the "reply type" field, sent by the server > @@ -872,7 +883,13 @@ valid may depend on negotiation during the handshake > phase. > > [extension](https://github.com/yoe/nbd/blob/extension-write-zeroes/doc/proto.md). > - bit 2, `NBD_CMD_FLAG_DF`; defined by the experimental `STRUCTURED_REPLY` > > [extension](https://github.com/yoe/nbd/blob/extension-structured-reply/doc/proto.md). > - > +- bit 3, `NBD_CMD_FLAG_SHIFT`; This flag is valid for `NBD_CMD_TRIM` and > + `NBD_CMD_WRITE_ZEROES`. If this flag is set the server shifts request > + *length* and *offset* left by N bits, where N is defined by `NBD_OPT_SHIFT` > + option or is assumed to be 16 bits by default if `NBD_OPT_SHIFT` option is > + not specified. If after shift `(offset + length)` exceeds disk size, length > + should be reduced to `( - offset)`. However, `(offset + length)` > + must not exceed disk size by more than `(1 << N) - 1`. Is there a good reason the shift size can't be either the minimum block size or otherwise negotiated using NBD_OPT_INFO? -- Alex Bligh
Re: [Qemu-block] VMDK file unable to open in Qemu but ESXi
On Mon, 09/26 13:52, Peter Lieven wrote: > Am 26.09.2016 um 13:34 schrieb Fam Zheng: > > On Mon, 09/26 12:56, Peter Lieven wrote: > > > Hi, > > > > > > > > > i have a VMDK file for a Cisco Prime software appliance that I cannot > > > open in Qemu, Virtualbox or any other tool except > > > > > > for ESXi. Qemu exits with an invalid footer message. At the expected > > > position and also at the end of the file seems to be > > > > > > no footer. The VMDK has a -flex in the filename? Is this anything new > > > that we can't handle in Qemu? > > Looking at the code path it's possible that the image is not compliant to > > the > > spec, or using a newer version number that QEMU doesn't support. Could you > > hexdump the first and the last sectors for me to check? > > Sure, here we go > > lieven@lieven-pc:~/git/qemu$ ls -la /work/PI-VA-3.1.0.0.132-flex-disk1.vmdk > -rw--- 1 lieven lieven 3792931840 Apr 11 05:44 > /work/PI-VA-3.1.0.0.132-flex-disk1.vmdk > > lieven@lieven-pc:~/git/qemu$ head -c4096 > /work/PI-VA-3.1.0.0.132-flex-disk1.vmdk | hexdump -C > 4b 44 4d 56 03 00 00 00 01 00 03 00 00 00 9f 24 |KDMV...$| > 0010 00 00 00 00 80 00 00 00 00 00 00 00 01 00 00 00 || > 0020 00 00 00 00 02 00 00 00 00 00 00 00 00 02 00 00 || > 0030 00 00 00 00 00 00 00 00 ff ff ff ff ff ff ff ff || ^^^ This offset means a footer is used. In vmdk_50_technote.pdf: > The footer should be the last block of the disk and immediately followed by > the end‐of‐stream marker so that they together occupy the last two sectors of > the disk. which is the format of streamOptimized format, matching what the embedded descriptor says: > 0040 80 00 00 00 00 00 00 00 00 0a 20 0d 0a 01 00 00 |.. .| > 0050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 || > * > 0200 23 20 44 69 73 6b 20 44 65 73 63 72 69 70 74 6f |# Disk Descripto| > 0210 72 46 69 6c 65 0a 76 65 72 73 69 6f 6e 3d 31 0a |rFile.version=1.| > 0220 43 49 44 3d 36 65 32 37 64 38 31 65 0a 70 61 72 |CID=6e27d81e.par| > 0230 65 6e 74 43 49 44 3d 66 66 66 66 66 66 66 66 0a |entCID=.| > 0240 63 72 65 61 74 65 54 79 70 65 3d 22 73 74 72 65 |createType="stre| > 0250 61 6d 4f 70 74 69 6d 69 7a 65 64 22 0a 0a 23 20 |amOptimized"..# | ^^^ > 0d00 00 00 00 00 ed 08 71 00 00 00 00 00 00 00 00 00 |..q.| > 0d10 00 00 00 00 f3 08 71 00 00 00 00 00 00 00 00 00 |..q.| > 0d20 00 00 00 00 f9 08 71 00 00 00 00 00 00 00 00 00 |..q.| > 0d30 00 00 00 00 ff 08 71 00 00 00 00 00 00 00 00 00 |..q.| > 0d40 00 00 00 00 05 09 71 00 00 00 00 00 00 00 00 00 |..q.| > 0d50 00 00 00 00 0b 09 71 00 00 00 00 00 00 00 00 00 |..q.| > 0d60 00 00 00 00 11 09 71 00 00 00 00 00 00 00 00 00 |..q.| > 0d70 00 00 00 00 17 09 71 00 00 00 00 00 00 00 00 00 |..q.| > 0d80 00 00 00 00 1d 09 71 00 00 00 00 00 00 00 00 00 |..q.| > 0d90 00 00 00 00 23 09 71 00 00 00 00 00 00 00 00 00 |#.q.| > 0da0 00 00 00 00 29 09 71 00 00 00 00 00 00 00 00 00 |).q.| > 0db0 00 00 00 00 2f 09 71 00 00 00 00 00 00 00 00 00 |/.q.| > 0dc0 00 00 00 00 35 09 71 00 00 00 00 00 00 00 00 00 |5.q.| > 0dd0 00 00 00 00 3b 09 71 00 00 00 00 00 00 00 00 00 |;.q.| > 0de0 00 00 00 00 41 09 71 00 00 00 00 00 00 00 00 00 |A.q.| > 0df0 00 00 00 00 47 09 71 00 00 00 00 00 00 00 00 00 |G.q.| > 0e00 00 00 00 00 4d 09 71 00 00 00 00 00 00 00 00 00 |M.q.| But the last sector is not the expected EOS marker. > 0e10 00 00 00 00 53 09 71 00 00 00 00 00 00 00 00 00 |S.q.| > 0e20 00 00 00 00 59 09 71 00 00 00 00 00 00 00 00 00 |Y.q.| > 0e30 00 00 00 00 5f 09 71 00 00 00 00 00 00 00 00 00 |_.q.| > 0e40 00 00 00 00 65 09 71 00 00 00 00 00 00 00 00 00 |e.q.| > 0e50 00 00 00 00 6b 09 71 00 00 00 00 00 00 00 00 00 |k.q.| > 0e60 00 00 00 00 71 09 71 00 00 00 00 00 00 00 00 00 |q.q.| > 0e70 00 00 00 00 77 09 71 00 00 00 00 00 00 00 00 00 |w.q.| > 0e80 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 || > * > 1000 > > Peter > So it looks like this file is not compliant to the specfication. Is it recognized and imported by VMware? Fam
Re: [Qemu-block] [PATCH v5 7/9] block: don't make snapshots for filters
Am 26.09.2016 um 11:51 hat Pavel Dovgalyuk geschrieben: > > From: Kevin Wolf [mailto:kw...@redhat.com] > > Am 26.09.2016 um 10:08 hat Pavel Dovgalyuk geschrieben: > > > This patch disables snapshotting for block driver filters. > > > It is needed, because snapshots should be created > > > in underlying disk images, not in filters itself. > > > > > > Signed-off-by: Pavel Dovgalyuk> > > > But that's exactly what the existing code implements? If a driver > > doesn't provide .bdrv_snapshot_goto, the request is redirected to > > bs->file. > > > > > block/snapshot.c |3 +++ > > > 1 file changed, 3 insertions(+) > > > > > > diff --git a/block/snapshot.c b/block/snapshot.c > > > index bf5c2ca..8998b8b 100644 > > > --- a/block/snapshot.c > > > +++ b/block/snapshot.c > > > @@ -184,6 +184,9 @@ int bdrv_snapshot_goto(BlockDriverState *bs, > > > if (!drv) { > > > return -ENOMEDIUM; > > > } > > > +if (drv->is_filter) { > > > +return 0; > > > +} > > > > This, on the other hand, doesn't redirect the request, but silently > > ignores it. That is, loading the snapshot will apparently succeed, but > > it wouldn't actually load anything and the disk would stay in its > > current state. > > In my use case bdrv_all_goto_snapshot iterates all block drivers, including > filters and disk images. Therefore skipping goto for images is ok. Hm, this can happy today indeed. Originally, we only called bdrv_goto_snapshot() for all _top level_ BDSes, and this is still what you normally get. However, if you explicitly create a BDS (e.g. with its own -drive option), it is considered a top level BDS without actually being top level for the guest, and therefore the snapshotting function is called for it. Of course, this is highly inefficient because the goto_snapshot request is passed by the filter driver and then called another time for the lower node, effectively loading the snapshot a second time. On the other hand if you use a single -drive option to create both the qcow2 BDS and the blkreplay filter, we do need to pass down the goto_snapshot request because it won't be called for the qcow2 layer otherwise. I'm not completely sure yet what the right behaviour would be here. Kevin
[Qemu-block] [PATCH] block: Fix error path in qmp_blockdev_change_medium()
Commit 00949bab incorrectly changed one instance of into errp while touching the line. Change it back. Signed-off-by: Kevin Wolf--- blockdev.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/blockdev.c b/blockdev.c index 29c6561..62d0dd0 100644 --- a/blockdev.c +++ b/blockdev.c @@ -2614,7 +2614,7 @@ void qmp_blockdev_change_medium(bool has_device, const char *device, error_free(err); err = NULL; -qmp_x_blockdev_remove_medium(has_device, device, has_id, id, errp); +qmp_x_blockdev_remove_medium(has_device, device, has_id, id, ); if (err) { error_propagate(errp, err); goto fail; -- 1.8.3.1
Re: [Qemu-block] [PATCH] proto: add 'shift' extension.
On 26/09/2016 14:46, Vladimir Sementsov-Ogievskiy wrote: > This extension allows big requests for TRIM and WRITE_ZEROES through > special 'shift' parameter, which means that offset and length should be > shifted left by several bits. > > This is needed for efficient clearing large regions of the disk (up to > the whole disk). > > Signed-off-by: Vladimir Sementsov-Ogievskiy> --- > > Here mentioned WRITE_ZEROES command which is only an experemental > extension for now. > > To dicuss: > NBD_OPT_SHIFT Data. It can be reduced to 8 bits actually... Are some >reserved bits needed here? > > doc/proto.md | 19 ++- > 1 file changed, 18 insertions(+), 1 deletion(-) > > diff --git a/doc/proto.md b/doc/proto.md > index 2de3a6a..6fd1b16 100644 > --- a/doc/proto.md > +++ b/doc/proto.md > @@ -682,6 +682,8 @@ The field has the following format: >experimental `WRITE_ZEROES` > [extension](https://github.com/yoe/nbd/blob/extension-write-zeroes/doc/proto.md). > - bit 7, `NBD_FLAG_SEND_DF`: defined by the experimental `STRUCTURED_REPLY` > > [extension](https://github.com/yoe/nbd/blob/extension-structured-reply/doc/proto.md). > +- bit 8, `NBD_FLAG_SEND_SHIFT` : exposes support for `NBD_CMD_FLAG_SHIFT` and > + `NBD_OPT_SHIFT` > > Clients SHOULD ignore unknown flags. > > @@ -765,6 +767,15 @@ of the newstyle negotiation. > > Defined by the experimental `INFO` > [extension](https://github.com/yoe/nbd/blob/extension-info/doc/proto.md). > > +- `NBD_OPT_SHIFT` (10) > + > +Defines shift used to calculate request offset and length if > +`NBD_CMD_FLAG_SHIFT` is set. > + > +Data: > + > +- 32 bits, shift (unsigned); Must not be larger than 32. > + > Option reply types > > These values are used in the "reply type" field, sent by the server > @@ -872,7 +883,13 @@ valid may depend on negotiation during the handshake > phase. > > [extension](https://github.com/yoe/nbd/blob/extension-write-zeroes/doc/proto.md). > - bit 2, `NBD_CMD_FLAG_DF`; defined by the experimental `STRUCTURED_REPLY` > > [extension](https://github.com/yoe/nbd/blob/extension-structured-reply/doc/proto.md). > - > +- bit 3, `NBD_CMD_FLAG_SHIFT`; This flag is valid for `NBD_CMD_TRIM` and > + `NBD_CMD_WRITE_ZEROES`. If this flag is set the server shifts request > + *length* and *offset* left by N bits, where N is defined by `NBD_OPT_SHIFT` > + option or is assumed to be 16 bits by default if `NBD_OPT_SHIFT` option is > + not specified. If after shift `(offset + length)` exceeds disk size, length > + should be reduced to `( - offset)`. However, `(offset + length)` > + must not exceed disk size by more than `(1 << N) - 1`. > > Request types > > This is very ad hoc. Can we instead have a block size common to all commands? Block devices in practice have one, in fact that's why they're called block devices... Paolo
Re: [Qemu-block] write_zeroes/trim on the whole disk
On 24/09/2016 14:27, Vladimir Sementsov-Ogievskiy wrote: > On 24.09.2016 15:06, Vladimir Sementsov-Ogievskiy wrote: >> On 24.09.2016 00:21, Wouter Verhelst wrote: >>> On Fri, Sep 23, 2016 at 02:00:06PM -0500, Eric Blake wrote: My preference would be a new flag to the existing commands, with explicit documentation that 0 offset and 0 length must be used with that flag, when requesting a full-device wipe. >>> Alternatively, what about a flag that says "if you use this flag, the >>> size should be left-shifted by X bits before processing"? That allows >>> you to do TRIM or WRITE_ZEROES on much larger chunks, without being >>> limited to "whole disk" commands. We should probably make it an illegal >>> flag for any command that actually sends data over the wire, though. >> >> Note: if disk size is not aligned to X we will have to send request >> larger than the disk size to clear the whole disk. > > Also, in this case, which realization of bdrv interface in qemu would be > most appropriate? Similar flag (in this case X must be defined in some > very transparent way, as a constant of 64k for example), or flag > BDRV_REQ_WHOLE_DISK, or separate .bdrv_zero_all and .bdrv_discard_all ? This makes nice sense. It also matches the SANITIZE command from the SCSI command set. Paolo
[Qemu-block] [PATCH] proto: add 'shift' extension.
This extension allows big requests for TRIM and WRITE_ZEROES through special 'shift' parameter, which means that offset and length should be shifted left by several bits. This is needed for efficient clearing large regions of the disk (up to the whole disk). Signed-off-by: Vladimir Sementsov-Ogievskiy--- Here mentioned WRITE_ZEROES command which is only an experemental extension for now. To dicuss: NBD_OPT_SHIFT Data. It can be reduced to 8 bits actually... Are some reserved bits needed here? doc/proto.md | 19 ++- 1 file changed, 18 insertions(+), 1 deletion(-) diff --git a/doc/proto.md b/doc/proto.md index 2de3a6a..6fd1b16 100644 --- a/doc/proto.md +++ b/doc/proto.md @@ -682,6 +682,8 @@ The field has the following format: experimental `WRITE_ZEROES` [extension](https://github.com/yoe/nbd/blob/extension-write-zeroes/doc/proto.md). - bit 7, `NBD_FLAG_SEND_DF`: defined by the experimental `STRUCTURED_REPLY` [extension](https://github.com/yoe/nbd/blob/extension-structured-reply/doc/proto.md). +- bit 8, `NBD_FLAG_SEND_SHIFT` : exposes support for `NBD_CMD_FLAG_SHIFT` and + `NBD_OPT_SHIFT` Clients SHOULD ignore unknown flags. @@ -765,6 +767,15 @@ of the newstyle negotiation. Defined by the experimental `INFO` [extension](https://github.com/yoe/nbd/blob/extension-info/doc/proto.md). +- `NBD_OPT_SHIFT` (10) + +Defines shift used to calculate request offset and length if +`NBD_CMD_FLAG_SHIFT` is set. + +Data: + +- 32 bits, shift (unsigned); Must not be larger than 32. + Option reply types These values are used in the "reply type" field, sent by the server @@ -872,7 +883,13 @@ valid may depend on negotiation during the handshake phase. [extension](https://github.com/yoe/nbd/blob/extension-write-zeroes/doc/proto.md). - bit 2, `NBD_CMD_FLAG_DF`; defined by the experimental `STRUCTURED_REPLY` [extension](https://github.com/yoe/nbd/blob/extension-structured-reply/doc/proto.md). - +- bit 3, `NBD_CMD_FLAG_SHIFT`; This flag is valid for `NBD_CMD_TRIM` and + `NBD_CMD_WRITE_ZEROES`. If this flag is set the server shifts request + *length* and *offset* left by N bits, where N is defined by `NBD_OPT_SHIFT` + option or is assumed to be 16 bits by default if `NBD_OPT_SHIFT` option is + not specified. If after shift `(offset + length)` exceeds disk size, length + should be reduced to `( - offset)`. However, `(offset + length)` + must not exceed disk size by more than `(1 << N) - 1`. Request types -- 1.8.3.1
Re: [Qemu-block] VMDK file unable to open in Qemu but ESXi
On Mon, 09/26 12:56, Peter Lieven wrote: > Hi, > > > i have a VMDK file for a Cisco Prime software appliance that I cannot open in > Qemu, Virtualbox or any other tool except > > for ESXi. Qemu exits with an invalid footer message. At the expected position > and also at the end of the file seems to be > > no footer. The VMDK has a -flex in the filename? Is this anything new that we > can't handle in Qemu? Looking at the code path it's possible that the image is not compliant to the spec, or using a newer version number that QEMU doesn't support. Could you hexdump the first and the last sectors for me to check? Fam > > > This here is the relevant part of the OVF description: > > > > ovf:size="3792931840" /> > > > > Virtual disk information > ovf:diskId="vmdisk1" ovf:fileRef="file1" > ovf:format="http://www.vmware.com/interfaces/specifications/vmdk.html#streamOptimized; > ovf:populatedSize="8525510083"/> > ovf:diskId="vmdisk-expp" > ovf:format="http://www.vmware.com/interfaces/specifications/vmdk.html#streamOptimized; > ovf:populatedSize="8525510083"/> > ovf:diskId="vmdisk-std" > ovf:format="http://www.vmware.com/interfaces/specifications/vmdk.html#streamOptimized; > ovf:populatedSize="8525510083"/> > ovf:diskId="vmdisk-pro" > ovf:format="http://www.vmware.com/interfaces/specifications/vmdk.html#streamOptimized; > ovf:populatedSize="8525510083"/> > > > > Thanks, > > Peter >
Re: [Qemu-block] VMDK file unable to open in Qemu but ESXi
Am 26.09.2016 um 13:34 schrieb Fam Zheng: On Mon, 09/26 12:56, Peter Lieven wrote: Hi, i have a VMDK file for a Cisco Prime software appliance that I cannot open in Qemu, Virtualbox or any other tool except for ESXi. Qemu exits with an invalid footer message. At the expected position and also at the end of the file seems to be no footer. The VMDK has a -flex in the filename? Is this anything new that we can't handle in Qemu? Looking at the code path it's possible that the image is not compliant to the spec, or using a newer version number that QEMU doesn't support. Could you hexdump the first and the last sectors for me to check? Sure, here we go lieven@lieven-pc:~/git/qemu$ ls -la /work/PI-VA-3.1.0.0.132-flex-disk1.vmdk -rw--- 1 lieven lieven 3792931840 Apr 11 05:44 /work/PI-VA-3.1.0.0.132-flex-disk1.vmdk lieven@lieven-pc:~/git/qemu$ head -c4096 /work/PI-VA-3.1.0.0.132-flex-disk1.vmdk | hexdump -C 4b 44 4d 56 03 00 00 00 01 00 03 00 00 00 9f 24 |KDMV...$| 0010 00 00 00 00 80 00 00 00 00 00 00 00 01 00 00 00 || 0020 00 00 00 00 02 00 00 00 00 00 00 00 00 02 00 00 || 0030 00 00 00 00 00 00 00 00 ff ff ff ff ff ff ff ff || 0040 80 00 00 00 00 00 00 00 00 0a 20 0d 0a 01 00 00 |.. .| 0050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 || * 0200 23 20 44 69 73 6b 20 44 65 73 63 72 69 70 74 6f |# Disk Descripto| 0210 72 46 69 6c 65 0a 76 65 72 73 69 6f 6e 3d 31 0a |rFile.version=1.| 0220 43 49 44 3d 36 65 32 37 64 38 31 65 0a 70 61 72 |CID=6e27d81e.par| 0230 65 6e 74 43 49 44 3d 66 66 66 66 66 66 66 66 0a |entCID=.| 0240 63 72 65 61 74 65 54 79 70 65 3d 22 73 74 72 65 |createType="stre| 0250 61 6d 4f 70 74 69 6d 69 7a 65 64 22 0a 0a 23 20 |amOptimized"..# | 0260 45 78 74 65 6e 74 20 64 65 73 63 72 69 70 74 69 |Extent descripti| 0270 6f 6e 0a 52 44 4f 4e 4c 59 20 36 31 34 34 30 30 |on.RDONLY 614400| 0280 30 30 30 20 53 50 41 52 53 45 20 22 50 49 2d 56 |000 SPARSE "PI-V| 0290 41 2d 33 2e 31 2e 30 2e 30 2e 31 33 32 2d 66 6c |A-3.1.0.0.132-fl| 02a0 65 78 2d 64 69 73 6b 31 2e 76 6d 64 6b 22 0a 0a |ex-disk1.vmdk"..| 02b0 23 20 54 68 65 20 64 69 73 6b 20 44 61 74 61 20 |# The disk Data | 02c0 42 61 73 65 20 0a 23 44 44 42 0a 0a 64 64 62 2e |Base .#DDB..ddb.| 02d0 76 69 72 74 75 61 6c 48 57 56 65 72 73 69 6f 6e |virtualHWVersion| 02e0 20 3d 20 22 34 22 0a 64 64 62 2e 61 64 61 70 74 | = "4".ddb.adapt| 02f0 65 72 54 79 70 65 3d 22 69 64 65 22 0a 64 64 62 |erType="ide".ddb| 0300 2e 67 65 6f 6d 65 74 72 79 2e 63 79 6c 69 6e 64 |.geometry.cylind| 0310 65 72 73 3d 22 31 36 33 38 33 22 0a 64 64 62 2e |ers="16383".ddb.| 0320 67 65 6f 6d 65 74 72 79 2e 68 65 61 64 73 3d 22 |geometry.heads="| 0330 31 36 22 0a 64 64 62 2e 67 65 6f 6d 65 74 72 79 |16".ddb.geometry| 0340 2e 73 65 63 74 6f 72 73 3d 22 36 33 22 0a 64 64 |.sectors="63".dd| 0350 62 2e 75 75 69 64 2e 69 6d 61 67 65 3d 22 32 30 |b.uuid.image="20| 0360 66 63 31 34 64 34 2d 32 62 64 36 2d 34 38 36 32 |fc14d4-2bd6-4862| 0370 2d 39 30 36 61 2d 61 63 64 64 63 35 30 65 37 66 |-906a-acddc50e7f| 0380 31 65 22 0a 64 64 62 2e 75 75 69 64 2e 70 61 72 |1e".ddb.uuid.par| 0390 65 6e 74 3d 22 30 30 30 30 30 30 30 30 2d 30 30 |ent="-00| 03a0 30 30 2d 30 30 30 30 2d 30 30 30 30 2d 30 30 30 |00---000| 03b0 30 30 30 30 30 30 30 30 30 22 0a 64 64 62 2e 75 |0".ddb.u| 03c0 75 69 64 2e 6d 6f 64 69 66 69 63 61 74 69 6f 6e |uid.modification| 03d0 3d 22 30 30 30 30 30 30 30 30 2d 30 30 30 30 2d |="--| 03e0 30 30 30 30 2d 30 30 30 30 2d 30 30 30 30 30 30 |--00| 03f0 30 30 30 30 30 30 22 0a 64 64 62 2e 75 75 69 64 |00".ddb.uuid| 0400 2e 70 61 72 65 6e 74 6d 6f 64 69 66 69 63 61 74 |.parentmodificat| 0410 69 6f 6e 3d 22 30 30 30 30 30 30 30 30 2d 30 30 |ion="-00| 0420 30 30 2d 30 30 30 30 2d 30 30 30 30 2d 30 30 30 |00---000| 0430 30 30 30 30 30 30 30 30 30 22 0a 64 64 62 2e 63 |0".ddb.c| 0440 6f 6d 6d 65 6e 74 3d 22 22 0a 00 00 00 00 00 00 |omment=""...| 0450 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 || * 1000 lieven@lieven-pc:~/git/qemu$ tail -c4096 /work/PI-VA-3.1.0.0.132-flex-disk1.vmdk | hexdump -C 00 00 00 00 00 00 00 00 e8 14 3e 00 00 00 00 00 |..>.| 0010 00 00 00 00 00 00 00 00 ee 14 3e 00 00 00 00 00 |..>.| 0020 00 00 00 00 00 00 00 00 f4 14 3e 00 00 00 00 00 |..>.| 0030 00 00 00 00 00 00 00 00 fa 14 3e 00 00 00 00 00 |..>.| 0040 00 00 00 00 00 00 00 00 00 15 3e 00 00 00 00 00 |..>.| 0050 00 00 00 00 00 00 00 00 06 15 3e 00 00 00 00 00 |..>.| 0060 00 00 00 00 00 00 00 00 0c 15 3e 00 00
Re: [Qemu-block] [PULL 17/33] block: Accept device model name for x-blockdev-remove-medium
On 22/09/2016 18:29, Kevin Wolf wrote: > -qmp_x_blockdev_remove_medium(device, ); > +qmp_x_blockdev_remove_medium(true, device, false, NULL, errp); > if (err) { > error_propagate(errp, err); > goto fail; Bug: changed to errp, so err is always NULL. Paolo
[Qemu-block] VMDK file unable to open in Qemu but ESXi
Hi, i have a VMDK file for a Cisco Prime software appliance that I cannot open in Qemu, Virtualbox or any other tool except for ESXi. Qemu exits with an invalid footer message. At the expected position and also at the end of the file seems to be no footer. The VMDK has a -flex in the filename? Is this anything new that we can't handle in Qemu? This here is the relevant part of the OVF description: Virtual disk information http://www.vmware.com/interfaces/specifications/vmdk.html#streamOptimized; ovf:populatedSize="8525510083"/> http://www.vmware.com/interfaces/specifications/vmdk.html#streamOptimized; ovf:populatedSize="8525510083"/> http://www.vmware.com/interfaces/specifications/vmdk.html#streamOptimized; ovf:populatedSize="8525510083"/> http://www.vmware.com/interfaces/specifications/vmdk.html#streamOptimized; ovf:populatedSize="8525510083"/> Thanks, Peter
Re: [Qemu-block] [PATCH v2 3/7] block/qapi: Move 'aio' option to file driver
Am 23.09.2016 um 16:40 hat Eric Blake geschrieben: > On 09/23/2016 09:32 AM, Kevin Wolf wrote: > > The option whether or not to use a native AIO interface really isn't a > > generic option for all drivers, but only applies to the native file > > protocols. This patch moves the option in blockdev-add to the > > appropriate places (raw-posix and raw-win32). > > > > We still have to keep the flag BDRV_O_NATIVE_AIO for compatibility > > because so far the AIO option was usually specified on the wrong layer > > (the top-level format driver, which didn't even look at it) and then > > inherited by the protocol driver (where it was actually used). We can't > > forbid this use except in new interfaces. > > > > Signed-off-by: Kevin Wolf> > --- > > block/raw-posix.c | 44 --- > > block/raw-win32.c | 56 > > +- > > qapi/block-core.json | 6 +++--- > > tests/qemu-iotests/087 | 4 ++-- > > 4 files changed, 83 insertions(+), 27 deletions(-) > > > > > +++ b/qapi/block-core.json > > @@ -1724,11 +1724,13 @@ > > # Driver specific block device options for the file backend. > > # > > # @filename:path to the image file > > +# @aio: #optional AIO backend (default: threads) > > Missed this last time, but probably worth a '(since 2.8)' marker. I'm not sure how useful this is when the whole blockdev-add command is still experimental and we're going to break it incompatibly by removing the "options" layer. But we have the annotation elsewhere, so I'll add it. Maybe the patch that breaks compatibility should remove the annotation everywhere again. Kevin pgpHOlvyEFhZJ.pgp Description: PGP signature
Re: [Qemu-block] [Qemu-devel] [PATCH] MAINTAINERS: Add some more headers to the IDE section
On 26.09.2016 10:22, Kevin Wolf wrote: > Am 23.09.2016 um 18:42 hat John Snow geschrieben: >> On 09/23/2016 12:09 PM, Thomas Huth wrote: >>> The folder include/hw/ide/ belongs to the IDE section. >>> >>> Signed-off-by: Thomas Huth>>> --- >>> MAINTAINERS | 1 + >>> 1 file changed, 1 insertion(+) >>> >>> diff --git a/MAINTAINERS b/MAINTAINERS >>> index d8a0cfc..acf6d6c 100644 >>> --- a/MAINTAINERS >>> +++ b/MAINTAINERS >>> @@ -791,6 +791,7 @@ M: John Snow >>> L: qemu-block@nongnu.org >>> S: Supported >>> F: include/hw/ide.h >>> +F: include/hw/ide/ >>> F: hw/ide/ >>> F: hw/block/block.c >>> F: hw/block/cdrom.c >>> >> >> Ah, yeah. These got missed when they were moved over. Thanks. >> >> Reviewed-by: John Snow > > Who is supposed to merge this if you only give an R-b? I've CC:ed this patch to qemu-trivial, so I hope it will get picked up there if John does not want to apply this directly. Thomas signature.asc Description: OpenPGP digital signature
Re: [Qemu-block] write_zeroes/trim on the whole disk
Am 24.09.2016 um 14:27 hat Vladimir Sementsov-Ogievskiy geschrieben: > On 24.09.2016 15:06, Vladimir Sementsov-Ogievskiy wrote: > >On 24.09.2016 00:21, Wouter Verhelst wrote: > >>On Fri, Sep 23, 2016 at 02:00:06PM -0500, Eric Blake wrote: > >>>My preference would be a new flag to the existing commands, with > >>>explicit documentation that 0 offset and 0 length must be used > >>>with that > >>>flag, when requesting a full-device wipe. > >>Alternatively, what about a flag that says "if you use this flag, the > >>size should be left-shifted by X bits before processing"? That allows > >>you to do TRIM or WRITE_ZEROES on much larger chunks, without being > >>limited to "whole disk" commands. We should probably make it an illegal > >>flag for any command that actually sends data over the wire, though. > >> > > > > > >Note: if disk size is not aligned to X we will have to send > >request larger than the disk size to clear the whole disk. > > > > Also, in this case, which realization of bdrv interface in qemu > would be most appropriate? Similar flag (in this case X must be > defined in some very transparent way, as a constant of 64k for > example), or flag BDRV_REQ_WHOLE_DISK, or separate .bdrv_zero_all > and .bdrv_discard_all ? Maybe the best would be to extend the existing discard/write_zeroes functions to take a 64 bit byte count and then NBD can internally check whether a request clears the whole disk. Kevin
Re: [Qemu-block] [Qemu-devel] [PATCH v5 0/3] block: allow flush on devices with open tray
Am 23.09.2016 um 21:38 hat John Snow geschrieben: > On 09/23/2016 11:35 AM, Max Reitz wrote: > >On 23.09.2016 03:45, John Snow wrote: > >> block/block-backend.c | 22 -- > >> block/io.c | 25 + > >> cpus.c | 4 ++-- > >> hw/i386/xen/xen_platform.c | 2 -- > >> hw/ide/piix.c | 4 > >> include/block/block.h | 1 + > >> include/sysemu/block-backend.h | 1 - > >> 7 files changed, 32 insertions(+), 27 deletions(-) > > > >Reviewed-by: Max Reitz> > Since Fam acked this, I suppose it's for Kevin's tree? If nobody else wants the patches, I'll take them. Thanks, applied to the block branch. Kevin
Re: [Qemu-block] [Qemu-devel] [PATCH] MAINTAINERS: Add some more headers to the IDE section
Am 23.09.2016 um 18:42 hat John Snow geschrieben: > On 09/23/2016 12:09 PM, Thomas Huth wrote: > >The folder include/hw/ide/ belongs to the IDE section. > > > >Signed-off-by: Thomas Huth> >--- > > MAINTAINERS | 1 + > > 1 file changed, 1 insertion(+) > > > >diff --git a/MAINTAINERS b/MAINTAINERS > >index d8a0cfc..acf6d6c 100644 > >--- a/MAINTAINERS > >+++ b/MAINTAINERS > >@@ -791,6 +791,7 @@ M: John Snow > > L: qemu-block@nongnu.org > > S: Supported > > F: include/hw/ide.h > >+F: include/hw/ide/ > > F: hw/ide/ > > F: hw/block/block.c > > F: hw/block/cdrom.c > > > > Ah, yeah. These got missed when they were moved over. Thanks. > > Reviewed-by: John Snow Who is supposed to merge this if you only give an R-b? And didn't we want to move the internal header files back to hw/ instead of the global include directory? Kevin