Re: [RFC 0/3] block/file-posix: Work around XFS bug
Patchew URL: https://patchew.org/QEMU/20191025095849.25283-1-mre...@redhat.com/ Hi, This series failed the docker-quick@centos7 build test. Please find the testing commands and their output below. If you have Docker installed, you can probably reproduce it locally. === TEST SCRIPT BEGIN === #!/bin/bash make docker-image-centos7 V=1 NETWORK=1 time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1 === TEST SCRIPT END === CC block/sheepdog.o CC block/accounting.o /tmp/qemu-test/src/block/file-posix.c: In function 'raw_open_common': /tmp/qemu-test/src/block/file-posix.c:671:5: error: implicit declaration of function 'platform_test_xfs_fd' [-Werror=implicit-function-declaration] if (platform_test_xfs_fd(s->fd)) { ^ /tmp/qemu-test/src/block/file-posix.c:671:5: error: nested extern declaration of 'platform_test_xfs_fd' [-Werror=nested-externs] cc1: all warnings being treated as errors make: *** [block/file-posix.o] Error 1 make: *** Waiting for unfinished jobs Traceback (most recent call last): File "./tests/docker/docker.py", line 662, in --- raise CalledProcessError(retcode, cmd) subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--label', 'com.qemu.instance.uuid=c74d47e9bfb24107b6e94885fa8a2151', '-u', '1003', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=1', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew2/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-ytazf4e4/src/docker-src.2019-10-25-20.11.53.7609:/var/tmp/qemu:z,ro', 'qemu:centos7', '/var/tmp/qemu/run', 'test-quick']' returned non-zero exit status 2. filter=--filter=label=com.qemu.instance.uuid=c74d47e9bfb24107b6e94885fa8a2151 make[1]: *** [docker-run] Error 1 make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-ytazf4e4/src' make: *** [docker-run-test-quick@centos7] Error 2 real2m32.235s user0m8.092s The full log is available at http://patchew.org/logs/20191025095849.25283-1-mre...@redhat.com/testing.docker-quick@centos7/?type=message. --- Email generated automatically by Patchew [https://patchew.org/]. Please send your feedback to patchew-de...@redhat.com
Re: [PATCH 0/4] SCSI COMPARE_AND_WRITE support
Patchew URL: https://patchew.org/QEMU/1571996163-27688-1-git-send-email-baiyao...@cmss.chinamobile.com/ Hi, This series seems to have some coding style problems. See output below for more information: Subject: [PATCH 0/4] SCSI COMPARE_AND_WRITE support Type: series Message-id: 1571996163-27688-1-git-send-email-baiyao...@cmss.chinamobile.com === TEST SCRIPT BEGIN === #!/bin/bash git rev-parse base > /dev/null || exit 0 git config --local diff.renamelimit 0 git config --local diff.renames True git config --local diff.algorithm histogram ./scripts/checkpatch.pl --mailback base.. === TEST SCRIPT END === Switched to a new branch 'test' 9dec556 scsi-disk: add FUA support for COMPARE_AND_WRITE 8ee0b25 hw/scsi: add SCSI COMPARE_AND_WRITE support 6c2bd51 block/rbd: implement bdrv_aio_compare_and_write interface f2cafca block: add SCSI COMPARE_AND_WRITE support === OUTPUT BEGIN === 1/4 Checking commit f2cafca98400 (block: add SCSI COMPARE_AND_WRITE support) 2/4 Checking commit 6c2bd51ede14 (block/rbd: implement bdrv_aio_compare_and_write interface) ERROR: braces {} are necessary for all arms of this statement #59: FILE: block/rbd.c:808: +if (LIBRBD_HAVE_COMPARE_AND_WRITE) [...] ERROR: line over 90 characters #87: FILE: block/rbd.c:1015: +r = rbd_aio_compare_and_write(s->image, off, size/2, rcb->buf, (rcb->buf + size/2), c, 0, 0); ERROR: spaces required around that '/' (ctx:VxV) #87: FILE: block/rbd.c:1015: +r = rbd_aio_compare_and_write(s->image, off, size/2, rcb->buf, (rcb->buf + size/2), c, 0, 0); ^ ERROR: spaces required around that '/' (ctx:VxV) #87: FILE: block/rbd.c:1015: +r = rbd_aio_compare_and_write(s->image, off, size/2, rcb->buf, (rcb->buf + size/2), c, 0, 0); ^ WARNING: line over 80 characters #98: FILE: block/rbd.c:1082: + uint64_t offset, uint64_t bytes, total: 4 errors, 1 warnings, 90 lines checked Patch 2/4 has style problems, please review. If any of these errors are false positives report them to the maintainer, see CHECKPATCH in MAINTAINERS. 3/4 Checking commit 8ee0b25d3d83 (hw/scsi: add SCSI COMPARE_AND_WRITE support) WARNING: added, moved or deleted file(s), does MAINTAINERS need updating? #219: new file mode 100644 total: 0 errors, 1 warnings, 190 lines checked Patch 3/4 has style problems, please review. If any of these errors are false positives report them to the maintainer, see CHECKPATCH in MAINTAINERS. 4/4 Checking commit 9dec556b752c (scsi-disk: add FUA support for COMPARE_AND_WRITE) === OUTPUT END === Test command exited with code: 1 The full log is available at http://patchew.org/logs/1571996163-27688-1-git-send-email-baiyao...@cmss.chinamobile.com/testing.checkpatch/?type=message. --- Email generated automatically by Patchew [https://patchew.org/]. Please send your feedback to patchew-de...@redhat.com
[PULL 1/2] virtio-blk: Add blk_drain() to virtio_blk_device_unrealize()
From: Julia Suvorova QEMU does not wait for completed I/O requests, assuming that the guest driver will reset the device before calling unrealize(). This does not happen on Windows, and QEMU crashes in virtio_notify(), getting the result of a completed I/O request on hot-unplugged device. Signed-off-by: Julia Suvorova Message-Id: <20191018142856.31870-1-jus...@redhat.com> Signed-off-by: Stefan Hajnoczi --- hw/block/virtio-blk.c | 1 + 1 file changed, 1 insertion(+) diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c index ed2ddebd2b..14e9f85b8b 100644 --- a/hw/block/virtio-blk.c +++ b/hw/block/virtio-blk.c @@ -1207,6 +1207,7 @@ static void virtio_blk_device_unrealize(DeviceState *dev, Error **errp) VirtIODevice *vdev = VIRTIO_DEVICE(dev); VirtIOBlock *s = VIRTIO_BLK(dev); +blk_drain(s->blk); virtio_blk_data_plane_destroy(s->dataplane); s->dataplane = NULL; qemu_del_vm_change_state_handler(s->change); -- 2.21.0
[PULL 0/2] Block patches
The following changes since commit 58560ad254fbda71d4daa6622d71683190070ee2: Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-4.2-20191024' into staging (2019-10-24 16:22:58 +0100) are available in the Git repository at: https://github.com/stefanha/qemu.git tags/block-pull-request for you to fetch changes up to d154ef37ff885918fa3e512fd7a8e42870291667: yield_until_fd_readable: make it work with any AioContect (2019-10-25 14:38:29 +0200) Pull request Dietmar Maurer (1): yield_until_fd_readable: make it work with any AioContect Julia Suvorova (1): virtio-blk: Add blk_drain() to virtio_blk_device_unrealize() hw/block/virtio-blk.c| 1 + util/qemu-coroutine-io.c | 7 +-- 2 files changed, 6 insertions(+), 2 deletions(-) -- 2.21.0
[PULL 2/2] yield_until_fd_readable: make it work with any AioContect
From: Dietmar Maurer Simply use qemu_get_current_aio_context(). Signed-off-by: Dietmar Maurer Message-Id: <20191024045610.9071-1-diet...@proxmox.com> Signed-off-by: Stefan Hajnoczi --- util/qemu-coroutine-io.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/util/qemu-coroutine-io.c b/util/qemu-coroutine-io.c index 44a8969a69..5b80bb416f 100644 --- a/util/qemu-coroutine-io.c +++ b/util/qemu-coroutine-io.c @@ -67,6 +67,7 @@ qemu_co_send_recv(int sockfd, void *buf, size_t bytes, bool do_send) } typedef struct { +AioContext *ctx; Coroutine *co; int fd; } FDYieldUntilData; @@ -74,7 +75,7 @@ typedef struct { static void fd_coroutine_enter(void *opaque) { FDYieldUntilData *data = opaque; -qemu_set_fd_handler(data->fd, NULL, NULL, NULL); +aio_set_fd_handler(data->ctx, data->fd, false, NULL, NULL, NULL, NULL); qemu_coroutine_enter(data->co); } @@ -83,8 +84,10 @@ void coroutine_fn yield_until_fd_readable(int fd) FDYieldUntilData data; assert(qemu_in_coroutine()); +data.ctx = qemu_get_current_aio_context(); data.co = qemu_coroutine_self(); data.fd = fd; -qemu_set_fd_handler(fd, fd_coroutine_enter, NULL, &data); +aio_set_fd_handler( +data.ctx, fd, false, fd_coroutine_enter, NULL, NULL, &data); qemu_coroutine_yield(); } -- 2.21.0
Re: [PATCH v2 02/15] qapi/block-core: add option for io_uring
Stefan Hajnoczi writes: > From: Aarushi Mehta > > Only enumerates option for devices that support it. I'm not sure I get this sentence. > Since QAPI schema > supports io_uring, which is the actual name of the Linux API, it is > preferred over io-uring. I guess this one means something like "Since io_uring is the actual name of the Linux API, we use it as enum value even though the QAPI schema conventions would prefer io-uring." > Signed-off-by: Aarushi Mehta > Signed-off-by: Stefan Hajnoczi > --- > qapi/block-core.json | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/qapi/block-core.json b/qapi/block-core.json > index b274aef713..3196f40178 100644 > --- a/qapi/block-core.json > +++ b/qapi/block-core.json > @@ -2851,11 +2851,13 @@ > # > # @threads: Use qemu's thread pool > # @native: Use native AIO backend (only Linux and Windows) > +# @io_uring:Use linux io_uring (since 4.2) > # > # Since: 2.9 > ## > { 'enum': 'BlockdevAioOptions', > - 'data': [ 'threads', 'native' ] } > + 'data': [ 'threads', 'native', > +{ 'name': 'io_uring', 'if': 'defined(CONFIG_LINUX_IO_URING)' } ] > } > > ## > # @BlockdevCacheOptions: I encourage you to polish the commit message some. Acked-by: Markus Armbruster
[PATCH v7 0/4] colo: Add support for continuous replication
Hello Everyone, These Patches add support for continuous replication to colo. This means that after the Primary fails and the Secondary did a failover, the Secondary can then become Primary and resume replication to a new Secondary. Regards, Lukas Straub v7: - clarify meaning of ip's in documentation and note that active and hidden images just need to be created once - reverted removal of bdrv_is_root_node(top_bs) in replication and adjusted the top-id= parameter in documentation acordingly v6: - documented the position= and insert= options - renamed replication test - clarified documentation by using different ip's for primary and secondary - added Reviewed-by tags v5: - change syntax for the position= parameter - fix spelling mistake v4: - fix checkpatch.pl warnings v3: - add test for replication changes - check if the filter to be inserted before/behind belongs to the same interface - fix the error message for the position= parameter - rename term "after" -> "behind" and variable "insert_before" -> "insert_before_flag" - document the quorum node on the secondary side - simplify quorum parameters in documentation - remove trailing spaces in documentation - clarify the testing procedure in documentation v2: - fix email formating - fix checkpatch.pl warnings - fix patchew error - clearer commit messages Lukas Straub (4): block/replication.c: Ignore requests after failover tests/test-replication.c: Add test for for secondary node continuing replication net/filter.c: Add Options to insert filters anywhere in the filter list colo: Update Documentation for continuous replication block/replication.c| 35 +- docs/COLO-FT.txt | 224 +++-- docs/block-replication.txt | 28 +++-- include/net/filter.h | 2 + net/filter.c | 92 ++- qemu-options.hx| 31 - tests/test-replication.c | 52 + 7 files changed, 389 insertions(+), 75 deletions(-) -- 2.20.1
[PATCH v7 0/4] colo: Add support for continuous replication
Hello Everyone, These Patches add support for continuous replication to colo. This means that after the Primary fails and the Secondary did a failover, the Secondary can then become Primary and resume replication to a new Secondary. Regards, Lukas Straub v7: - clarify meaning of ip's in documentation and note that active and hidden images just need to be created once - reverted removal of bdrv_is_root_node(top_bs) in replication and adjusted the top-id= parameter in documentation acordingly v6: - documented the position= and insert= options - renamed replication test - clarified documentation by using different ip's for primary and secondary - added Reviewed-by tags v5: - change syntax for the position= parameter - fix spelling mistake v4: - fix checkpatch.pl warnings v3: - add test for replication changes - check if the filter to be inserted before/behind belongs to the same interface - fix the error message for the position= parameter - rename term "after" -> "behind" and variable "insert_before" -> "insert_before_flag" - document the quorum node on the secondary side - simplify quorum parameters in documentation - remove trailing spaces in documentation - clarify the testing procedure in documentation v2: - fix email formating - fix checkpatch.pl warnings - fix patchew error - clearer commit messages Lukas Straub (4): block/replication.c: Ignore requests after failover tests/test-replication.c: Add test for for secondary node continuing replication net/filter.c: Add Options to insert filters anywhere in the filter list colo: Update Documentation for continuous replication block/replication.c| 35 +- docs/COLO-FT.txt | 224 +++-- docs/block-replication.txt | 28 +++-- include/net/filter.h | 2 + net/filter.c | 92 ++- qemu-options.hx| 31 - tests/test-replication.c | 52 + 7 files changed, 389 insertions(+), 75 deletions(-) -- 2.20.1
[PATCH v2 10/15] block/io_uring: adds userspace completion polling
From: Aarushi Mehta Signed-off-by: Aarushi Mehta Signed-off-by: Stefan Hajnoczi --- block/io_uring.c | 17 - 1 file changed, 16 insertions(+), 1 deletion(-) diff --git a/block/io_uring.c b/block/io_uring.c index a5c0d16220..56892fd1ab 100644 --- a/block/io_uring.c +++ b/block/io_uring.c @@ -274,6 +274,21 @@ static void qemu_luring_completion_cb(void *opaque) luring_process_completions_and_submit(s); } +static bool qemu_luring_poll_cb(void *opaque) +{ +LuringState *s = opaque; +struct io_uring_cqe *cqes; + +if (io_uring_peek_cqe(&s->ring, &cqes) == 0) { +if (cqes) { +luring_process_completions_and_submit(s); +return true; +} +} + +return false; +} + static void ioq_init(LuringQueue *io_q) { QSIMPLEQ_INIT(&io_q->submit_queue); @@ -387,7 +402,7 @@ void luring_attach_aio_context(LuringState *s, AioContext *new_context) s->aio_context = new_context; s->completion_bh = aio_bh_new(new_context, qemu_luring_completion_bh, s); aio_set_fd_handler(s->aio_context, s->ring.ring_fd, false, - qemu_luring_completion_cb, NULL, NULL, s); + qemu_luring_completion_cb, NULL, qemu_luring_poll_cb, s); } LuringState *luring_init(Error **errp) -- 2.21.0
[PATCH v2 06/15] util/async: add aio interfaces for io_uring
From: Aarushi Mehta Signed-off-by: Aarushi Mehta Signed-off-by: Stefan Hajnoczi --- util/async.c | 36 1 file changed, 36 insertions(+) diff --git a/util/async.c b/util/async.c index ca83e32c7f..c9880e1812 100644 --- a/util/async.c +++ b/util/async.c @@ -276,6 +276,14 @@ aio_ctx_finalize(GSource *source) } #endif +#ifdef CONFIG_LINUX_IO_URING +if (ctx->linux_io_uring) { +luring_detach_aio_context(ctx->linux_io_uring, ctx); +luring_cleanup(ctx->linux_io_uring); +ctx->linux_io_uring = NULL; +} +#endif + assert(QSLIST_EMPTY(&ctx->scheduled_coroutines)); qemu_bh_delete(ctx->co_schedule_bh); @@ -340,6 +348,29 @@ LinuxAioState *aio_get_linux_aio(AioContext *ctx) } #endif +#ifdef CONFIG_LINUX_IO_URING +LuringState *aio_setup_linux_io_uring(AioContext *ctx, Error **errp) +{ +if (ctx->linux_io_uring) { +return ctx->linux_io_uring; +} + +ctx->linux_io_uring = luring_init(errp); +if (!ctx->linux_io_uring) { +return NULL; +} + +luring_attach_aio_context(ctx->linux_io_uring, ctx); +return ctx->linux_io_uring; +} + +LuringState *aio_get_linux_io_uring(AioContext *ctx) +{ +assert(ctx->linux_io_uring); +return ctx->linux_io_uring; +} +#endif + void aio_notify(AioContext *ctx) { /* Write e.g. bh->scheduled before reading ctx->notify_me. Pairs @@ -435,6 +466,11 @@ AioContext *aio_context_new(Error **errp) #ifdef CONFIG_LINUX_AIO ctx->linux_aio = NULL; #endif + +#ifdef CONFIG_LINUX_IO_URING +ctx->linux_io_uring = NULL; +#endif + ctx->thread_pool = NULL; qemu_rec_mutex_init(&ctx->lock); timerlistgroup_init(&ctx->tlg, aio_timerlist_notify, ctx); -- 2.21.0
[PATCH v2 07/15] blockdev: adds bdrv_parse_aio to use io_uring
From: Aarushi Mehta Signed-off-by: Aarushi Mehta Signed-off-by: Stefan Hajnoczi --- include/block/block.h | 1 + block.c | 22 ++ blockdev.c| 12 3 files changed, 27 insertions(+), 8 deletions(-) diff --git a/include/block/block.h b/include/block/block.h index bdb48dcd1b..bf36fa9863 100644 --- a/include/block/block.h +++ b/include/block/block.h @@ -300,6 +300,7 @@ void bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top, void bdrv_replace_node(BlockDriverState *from, BlockDriverState *to, Error **errp); +int bdrv_parse_aio(const char *mode, int *flags); int bdrv_parse_cache_mode(const char *mode, int *flags, bool *writethrough); int bdrv_parse_discard_flags(const char *mode, int *flags); BdrvChild *bdrv_open_child(const char *filename, diff --git a/block.c b/block.c index dad5a3d8e0..290ab7d2bf 100644 --- a/block.c +++ b/block.c @@ -845,6 +845,28 @@ static BlockdevDetectZeroesOptions bdrv_parse_detect_zeroes(QemuOpts *opts, return detect_zeroes; } +/** + * Set open flags for aio engine + * + * Return 0 on success, -1 if the engine specified is invalid + */ +int bdrv_parse_aio(const char *mode, int *flags) +{ +if (!strcmp(mode, "threads")) { +/* do nothing, default */ +} else if (!strcmp(mode, "native")) { +*flags |= BDRV_O_NATIVE_AIO; +#ifdef CONFIG_LINUX_IO_URING +} else if (!strcmp(mode, "io_uring")) { +*flags |= BDRV_O_IO_URING; +#endif +} else { +return -1; +} + +return 0; +} + /** * Set open flags for a given discard mode * diff --git a/blockdev.c b/blockdev.c index 03c7cd7651..6ceef376a6 100644 --- a/blockdev.c +++ b/blockdev.c @@ -385,13 +385,9 @@ static void extract_common_blockdev_options(QemuOpts *opts, int *bdrv_flags, } if ((aio = qemu_opt_get(opts, "aio")) != NULL) { -if (!strcmp(aio, "native")) { -*bdrv_flags |= BDRV_O_NATIVE_AIO; -} else if (!strcmp(aio, "threads")) { -/* this is the default */ -} else { - error_setg(errp, "invalid aio option"); - return; +if (bdrv_parse_aio(aio, bdrv_flags) < 0) { +error_setg(errp, "invalid aio option"); +return; } } } @@ -4642,7 +4638,7 @@ QemuOptsList qemu_common_drive_opts = { },{ .name = "aio", .type = QEMU_OPT_STRING, -.help = "host AIO implementation (threads, native)", +.help = "host AIO implementation (threads, native, io_uring)", },{ .name = BDRV_OPT_CACHE_WB, .type = QEMU_OPT_BOOL, -- 2.21.0
[PATCH v2 15/15] tests/qemu-iotests: use AIOMODE with various tests
From: Aarushi Mehta Signed-off-by: Aarushi Mehta Signed-off-by: Stefan Hajnoczi --- v11: * Drop line-wrapping changes that accidentally broke qemu-iotests --- tests/qemu-iotests/028 | 2 +- tests/qemu-iotests/058 | 2 +- tests/qemu-iotests/089 | 4 ++-- tests/qemu-iotests/091 | 4 ++-- tests/qemu-iotests/109 | 2 +- tests/qemu-iotests/147 | 5 +++-- tests/qemu-iotests/181 | 8 tests/qemu-iotests/183 | 4 ++-- tests/qemu-iotests/185 | 10 +- tests/qemu-iotests/200 | 2 +- tests/qemu-iotests/201 | 8 11 files changed, 26 insertions(+), 25 deletions(-) diff --git a/tests/qemu-iotests/028 b/tests/qemu-iotests/028 index bba1ee59ae..d4d972c884 100755 --- a/tests/qemu-iotests/028 +++ b/tests/qemu-iotests/028 @@ -108,7 +108,7 @@ echo block-backup echo qemu_comm_method="monitor" -_launch_qemu -drive file="${TEST_IMG}",cache=${CACHEMODE},id=disk +_launch_qemu -drive file="${TEST_IMG}",cache=${CACHEMODE},aio=${AIOMODE},id=disk h=$QEMU_HANDLE if [ "${VALGRIND_QEMU}" == "y" ]; then QEMU_COMM_TIMEOUT=7 diff --git a/tests/qemu-iotests/058 b/tests/qemu-iotests/058 index 8c3212a72f..38d1ed90c0 100755 --- a/tests/qemu-iotests/058 +++ b/tests/qemu-iotests/058 @@ -64,7 +64,7 @@ nbd_snapshot_img="nbd:unix:$nbd_unix_socket" converted_image=$TEST_IMG.converted # Use -f raw instead of -f $IMGFMT for the NBD connection -QEMU_IO_NBD="$QEMU_IO -f raw --cache=$CACHEMODE" +QEMU_IO_NBD="$QEMU_IO -f raw --cache=$CACHEMODE --aio=$AIOMODE" echo echo "== preparing image ==" diff --git a/tests/qemu-iotests/089 b/tests/qemu-iotests/089 index ad029f1f09..059ad75e28 100755 --- a/tests/qemu-iotests/089 +++ b/tests/qemu-iotests/089 @@ -64,7 +64,7 @@ $QEMU_IO -c 'write -P 42 0 512' -c 'write -P 23 512 512' \ $QEMU_IMG convert -f raw -O $IMGFMT "$TEST_IMG.base" "$TEST_IMG" -$QEMU_IO_PROG --cache $CACHEMODE \ +$QEMU_IO_PROG --cache $CACHEMODE --aio $AIOMODE \ -c 'read -P 42 0 512' -c 'read -P 23 512 512' \ -c 'read -P 66 1024 512' "json:{ \"driver\": \"$IMGFMT\", @@ -111,7 +111,7 @@ $QEMU_IO -c 'write -P 42 0x38000 512' "$TEST_IMG" | _filter_qemu_io # The "image.filename" part tests whether "a": { "b": "c" } and "a.b": "c" do # the same (which they should). -$QEMU_IO_PROG --cache $CACHEMODE \ +$QEMU_IO_PROG --cache $CACHEMODE --aio $AIOMODE \ -c 'read -P 42 0x38000 512' "json:{ \"driver\": \"$IMGFMT\", \"file\": { diff --git a/tests/qemu-iotests/091 b/tests/qemu-iotests/091 index f4b44659ae..3a29469324 100755 --- a/tests/qemu-iotests/091 +++ b/tests/qemu-iotests/091 @@ -60,13 +60,13 @@ echo === Starting QEMU VM1 === echo qemu_comm_method="monitor" -_launch_qemu -drive file="${TEST_IMG}",cache=${CACHEMODE},id=disk +_launch_qemu -drive file="${TEST_IMG}",cache=${CACHEMODE},aio=${AIOMODE},id=disk h1=$QEMU_HANDLE echo echo === Starting QEMU VM2 === echo -_launch_qemu -drive file="${TEST_IMG}",cache=${CACHEMODE},id=disk \ +_launch_qemu -drive file="${TEST_IMG}",cache=${CACHEMODE},aio=${AIOMODE},id=disk \ -incoming "exec: cat '${MIG_FIFO}'" h2=$QEMU_HANDLE diff --git a/tests/qemu-iotests/109 b/tests/qemu-iotests/109 index 9897ceb6cd..3a3dcf94eb 100755 --- a/tests/qemu-iotests/109 +++ b/tests/qemu-iotests/109 @@ -52,7 +52,7 @@ run_qemu() local qmp_format="$3" local qmp_event="$4" -_launch_qemu -drive file="${source_img}",format=raw,cache=${CACHEMODE},id=src +_launch_qemu -drive file="${source_img}",format=raw,cache=${CACHEMODE},aio=${AIOMODE},id=src _send_qemu_cmd $QEMU_HANDLE "{ 'execute': 'qmp_capabilities' }" "return" _send_qemu_cmd $QEMU_HANDLE \ diff --git a/tests/qemu-iotests/147 b/tests/qemu-iotests/147 index ab8480b9a4..6bc50ff5d9 100755 --- a/tests/qemu-iotests/147 +++ b/tests/qemu-iotests/147 @@ -24,7 +24,7 @@ import socket import stat import time import iotests -from iotests import cachemode, imgfmt, qemu_img, qemu_nbd, qemu_nbd_early_pipe +from iotests import cachemode, aiomode, imgfmt, qemu_img, qemu_nbd, qemu_nbd_early_pipe NBD_PORT_START = 32768 NBD_PORT_END= NBD_PORT_START + 1024 @@ -134,7 +134,8 @@ class BuiltinNBD(NBDBlockdevAddBase): self.server.add_drive_raw('if=none,id=nbd-export,' + 'file=%s,' % test_img + 'format=%s,' % imgfmt + - 'cache=%s' % cachemode) + 'cache=%s' % cachemode + + 'aio=%s' % aiomode) self.server.launch() def tearDown(self): diff --git a/tests/qemu-iotests/181 b/tests/qemu-iotests/181 index e317e63422..5b7de72f18 100755 --- a/tests/qemu-iotests/181 +++ b/tests/qemu-iotests/181 @@ -58,20 +58,20 @@ qemu_comm_method="monitor" if [ "$IMGOPTSSYNTAX" = "true" ]; then _launch_qemu \ --drive "${TEST_IMG}",cache=${CACHEMODE},id=disk +-drive "${TEST_IMG}",cache=${CACHEMODE},aio=$AIOMODE,id=disk else _launch_qe
[PATCH v2 05/15] stubs: add stubs for io_uring interface
From: Aarushi Mehta Signed-off-by: Aarushi Mehta Signed-off-by: Stefan Hajnoczi --- MAINTAINERS | 1 + stubs/Makefile.objs | 1 + stubs/io_uring.c| 32 3 files changed, 34 insertions(+) create mode 100644 stubs/io_uring.c diff --git a/MAINTAINERS b/MAINTAINERS index 31b090033c..c5cf2466b6 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -2551,6 +2551,7 @@ M: Stefan Hajnoczi L: qemu-block@nongnu.org S: Maintained F: block/io_uring.c +F: stubs/io_uring.c qcow2 M: Kevin Wolf diff --git a/stubs/Makefile.objs b/stubs/Makefile.objs index 4a50e95ec3..56c177c507 100644 --- a/stubs/Makefile.objs +++ b/stubs/Makefile.objs @@ -13,6 +13,7 @@ stub-obj-y += iothread.o stub-obj-y += iothread-lock.o stub-obj-y += is-daemonized.o stub-obj-$(CONFIG_LINUX_AIO) += linux-aio.o +stub-obj-$(CONFIG_LINUX_IO_URING) += io_uring.o stub-obj-y += machine-init-done.o stub-obj-y += migr-blocker.o stub-obj-y += change-state-handler.o diff --git a/stubs/io_uring.c b/stubs/io_uring.c new file mode 100644 index 00..622d1e4648 --- /dev/null +++ b/stubs/io_uring.c @@ -0,0 +1,32 @@ +/* + * Linux io_uring support. + * + * Copyright (C) 2009 IBM, Corp. + * Copyright (C) 2009 Red Hat, Inc. + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ +#include "qemu/osdep.h" +#include "block/aio.h" +#include "block/raw-aio.h" + +void luring_detach_aio_context(LuringState *s, AioContext *old_context) +{ +abort(); +} + +void luring_attach_aio_context(LuringState *s, AioContext *new_context) +{ +abort(); +} + +LuringState *luring_init(Error **errp) +{ +abort(); +} + +void luring_cleanup(LuringState *s) +{ +abort(); +} -- 2.21.0
[PATCH v2 12/15] qemu-img: adds option to use aio engine for benchmarking
From: Aarushi Mehta Signed-off-by: Aarushi Mehta Signed-off-by: Stefan Hajnoczi --- v10: * Add missing space to qemu-img command-line documentation --- qemu-img.c | 11 ++- qemu-img-cmds.hx | 4 ++-- qemu-img.texi| 5 - 3 files changed, 16 insertions(+), 4 deletions(-) diff --git a/qemu-img.c b/qemu-img.c index 8b03ef8171..3e696a4d8e 100644 --- a/qemu-img.c +++ b/qemu-img.c @@ -4210,7 +4210,8 @@ static int img_bench(int argc, char **argv) {"force-share", no_argument, 0, 'U'}, {0, 0, 0, 0} }; -c = getopt_long(argc, argv, ":hc:d:f:no:qs:S:t:wU", long_options, NULL); +c = getopt_long(argc, argv, ":hc:d:f:ni:o:qs:S:t:wU", long_options, +NULL); if (c == -1) { break; } @@ -4253,6 +4254,14 @@ static int img_bench(int argc, char **argv) case 'n': flags |= BDRV_O_NATIVE_AIO; break; +case 'i': +ret = bdrv_parse_aio(optarg, &flags); +if (ret < 0) { +error_report("Invalid aio option: %s", optarg); +ret = -1; +goto out; +} +break; case 'o': { offset = cvtnum(optarg); diff --git a/qemu-img-cmds.hx b/qemu-img-cmds.hx index 1c93e6d185..77b5a8dda8 100644 --- a/qemu-img-cmds.hx +++ b/qemu-img-cmds.hx @@ -20,9 +20,9 @@ STEXI ETEXI DEF("bench", img_bench, -"bench [-c count] [-d depth] [-f fmt] [--flush-interval=flush_interval] [-n] [--no-drain] [-o offset] [--pattern=pattern] [-q] [-s buffer_size] [-S step_size] [-t cache] [-w] [-U] filename") +"bench [-c count] [-d depth] [-f fmt] [--flush-interval=flush_interval] [-n] [--no-drain] [-o offset] [--pattern=pattern] [-q] [-s buffer_size] [-S step_size] [-t cache] [-i aio] [-w] [-U] filename") STEXI -@item bench [-c @var{count}] [-d @var{depth}] [-f @var{fmt}] [--flush-interval=@var{flush_interval}] [-n] [--no-drain] [-o @var{offset}] [--pattern=@var{pattern}] [-q] [-s @var{buffer_size}] [-S @var{step_size}] [-t @var{cache}] [-w] [-U] @var{filename} +@item bench [-c @var{count}] [-d @var{depth}] [-f @var{fmt}] [--flush-interval=@var{flush_interval}] [-n] [--no-drain] [-o @var{offset}] [--pattern=@var{pattern}] [-q] [-s @var{buffer_size}] [-S @var{step_size}] [-t @var{cache}] [-i @var{aio}] [-w] [-U] @var{filename} ETEXI DEF("check", img_check, diff --git a/qemu-img.texi b/qemu-img.texi index b5156d6316..20136fcb94 100644 --- a/qemu-img.texi +++ b/qemu-img.texi @@ -206,7 +206,7 @@ Command description: Amends the image format specific @var{options} for the image file @var{filename}. Not all file formats support this operation. -@item bench [-c @var{count}] [-d @var{depth}] [-f @var{fmt}] [--flush-interval=@var{flush_interval}] [-n] [--no-drain] [-o @var{offset}] [--pattern=@var{pattern}] [-q] [-s @var{buffer_size}] [-S @var{step_size}] [-t @var{cache}] [-w] [-U] @var{filename} +@item bench [-c @var{count}] [-d @var{depth}] [-f @var{fmt}] [--flush-interval=@var{flush_interval}] [-n] [-i @var{aio}] [--no-drain] [-o @var{offset}] [--pattern=@var{pattern}] [-q] [-s @var{buffer_size}] [-S @var{step_size}] [-t @var{cache}] [-w] [-U] @var{filename} Run a simple sequential I/O benchmark on the specified image. If @code{-w} is specified, a write test is performed, otherwise a read test is performed. @@ -227,6 +227,9 @@ If @code{-n} is specified, the native AIO backend is used if possible. On Linux, this option only works if @code{-t none} or @code{-t directsync} is specified as well. +If @code{-i} is specified, aio option can be used to specify different AIO +backends: @var{threads}, @var{native} or @var{io_uring}. + For write tests, by default a buffer filled with zeros is written. This can be overridden with a pattern byte specified by @var{pattern}. -- 2.21.0
[PATCH v2 11/15] qemu-io: adds option to use aio engine
From: Aarushi Mehta Signed-off-by: Aarushi Mehta Signed-off-by: Stefan Hajnoczi --- qemu-io.c | 25 + 1 file changed, 21 insertions(+), 4 deletions(-) diff --git a/qemu-io.c b/qemu-io.c index 91e3276592..3adc5a7d0d 100644 --- a/qemu-io.c +++ b/qemu-io.c @@ -130,7 +130,8 @@ static void open_help(void) " -C, -- use copy-on-read\n" " -n, -- disable host cache, short for -t none\n" " -U, -- force shared permissions\n" -" -k, -- use kernel AIO implementation (on Linux only)\n" +" -k, -- use kernel AIO implementation (Linux only, prefer use of -i)\n" +" -i, -- use AIO mode (threads, native or io_uring)\n" " -t, -- use the given cache mode for the image\n" " -d, -- use the given discard mode for the image\n" " -o, -- options to be given to the block driver" @@ -172,7 +173,7 @@ static int open_f(BlockBackend *blk, int argc, char **argv) QDict *opts; bool force_share = false; -while ((c = getopt(argc, argv, "snCro:kt:d:U")) != -1) { +while ((c = getopt(argc, argv, "snCro:ki:t:d:U")) != -1) { switch (c) { case 's': flags |= BDRV_O_SNAPSHOT; @@ -204,6 +205,13 @@ static int open_f(BlockBackend *blk, int argc, char **argv) return -EINVAL; } break; +case 'i': +if (bdrv_parse_aio(optarg, &flags) < 0) { +error_report("Invalid aio option: %s", optarg); +qemu_opts_reset(&empty_opts); +return -EINVAL; +} +break; case 'o': if (imageOpts) { printf("--image-opts and 'open -o' are mutually exclusive\n"); @@ -291,7 +299,9 @@ static void usage(const char *name) " -n, --nocachedisable host cache, short for -t none\n" " -C, --copy-on-read enable copy-on-read\n" " -m, --misalign misalign allocations for O_DIRECT\n" -" -k, --native-aio use kernel AIO implementation (on Linux only)\n" +" -k, --native-aio use kernel AIO implementation\n" +" (Linux only, prefer use of -i)\n" +" -i, --aio=MODE use AIO mode (threads, native or io_uring)\n" " -t, --cache=MODE use the given cache mode for the image\n" " -d, --discard=MODE use the given discard mode for the image\n" " -T, --trace [[enable=]][,events=][,file=]\n" @@ -496,7 +506,7 @@ static QemuOptsList file_opts = { int main(int argc, char **argv) { int readonly = 0; -const char *sopt = "hVc:d:f:rsnCmkt:T:U"; +const char *sopt = "hVc:d:f:rsnCmki:t:T:U"; const struct option lopt[] = { { "help", no_argument, NULL, 'h' }, { "version", no_argument, NULL, 'V' }, @@ -508,6 +518,7 @@ int main(int argc, char **argv) { "copy-on-read", no_argument, NULL, 'C' }, { "misalign", no_argument, NULL, 'm' }, { "native-aio", no_argument, NULL, 'k' }, +{ "aio", required_argument, NULL, 'i' }, { "discard", required_argument, NULL, 'd' }, { "cache", required_argument, NULL, 't' }, { "trace", required_argument, NULL, 'T' }, @@ -575,6 +586,12 @@ int main(int argc, char **argv) case 'k': flags |= BDRV_O_NATIVE_AIO; break; +case 'i': +if (bdrv_parse_aio(optarg, &flags) < 0) { +error_report("Invalid aio option: %s", optarg); +exit(1); +} +break; case 't': if (bdrv_parse_cache_mode(optarg, &flags, &writethrough) < 0) { error_report("Invalid cache option: %s", optarg); -- 2.21.0
[PATCH v2 03/15] block/block: add BDRV flag for io_uring
From: Aarushi Mehta Signed-off-by: Aarushi Mehta Reviewed-by: Maxim Levitsky Signed-off-by: Stefan Hajnoczi --- include/block/block.h | 1 + 1 file changed, 1 insertion(+) diff --git a/include/block/block.h b/include/block/block.h index 89606bd9f8..bdb48dcd1b 100644 --- a/include/block/block.h +++ b/include/block/block.h @@ -126,6 +126,7 @@ typedef struct HDGeometry { ignoring the format layer */ #define BDRV_O_NO_IO 0x1 /* don't initialize for I/O */ #define BDRV_O_AUTO_RDONLY 0x2 /* degrade to read-only if opening read-write fails */ +#define BDRV_O_IO_URING0x4 /* use io_uring instead of the thread pool */ #define BDRV_O_CACHE_MASK (BDRV_O_NOCACHE | BDRV_O_NO_FLUSH) -- 2.21.0
[PATCH v2 02/15] qapi/block-core: add option for io_uring
From: Aarushi Mehta Only enumerates option for devices that support it. Since QAPI schema supports io_uring, which is the actual name of the Linux API, it is preferred over io-uring. Signed-off-by: Aarushi Mehta Signed-off-by: Stefan Hajnoczi --- qapi/block-core.json | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/qapi/block-core.json b/qapi/block-core.json index b274aef713..3196f40178 100644 --- a/qapi/block-core.json +++ b/qapi/block-core.json @@ -2851,11 +2851,13 @@ # # @threads: Use qemu's thread pool # @native: Use native AIO backend (only Linux and Windows) +# @io_uring:Use linux io_uring (since 4.2) # # Since: 2.9 ## { 'enum': 'BlockdevAioOptions', - 'data': [ 'threads', 'native' ] } + 'data': [ 'threads', 'native', +{ 'name': 'io_uring', 'if': 'defined(CONFIG_LINUX_IO_URING)' } ] } ## # @BlockdevCacheOptions: -- 2.21.0
[PATCH v2 14/15] tests/qemu-iotests: enable testing with aio options
From: Aarushi Mehta Signed-off-by: Aarushi Mehta Signed-off-by: Stefan Hajnoczi --- tests/qemu-iotests/check | 15 ++- tests/qemu-iotests/common.rc | 14 ++ tests/qemu-iotests/iotests.py | 12 ++-- 3 files changed, 38 insertions(+), 3 deletions(-) diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check index 588c453a94..150fb71cf9 100755 --- a/tests/qemu-iotests/check +++ b/tests/qemu-iotests/check @@ -132,6 +132,7 @@ sortme=false expunge=true have_test_arg=false cachemode=false +aiomode=false tmp="${TEST_DIR}"/$$ rm -f $tmp.list $tmp.tmp $tmp.sed @@ -141,6 +142,7 @@ export IMGFMT_GENERIC=true export IMGPROTO=file export IMGOPTS="" export CACHEMODE="writeback" +export AIOMODE="threads" export QEMU_IO_OPTIONS="" export QEMU_IO_OPTIONS_NO_FMT="" export CACHEMODE_IS_DEFAULT=true @@ -225,6 +227,11 @@ s/ .*//p CACHEMODE_IS_DEFAULT=false cachemode=false continue +elif $aiomode +then +AIOMODE="$r" +aiomode=false +continue fi xpand=true @@ -269,6 +276,7 @@ other options -n show me, do not run tests -o options -o options to pass to qemu-img create/convert -c mode cache mode +-i mode AIO mode -makecheck pretty print output for make check testlist options @@ -433,10 +441,13 @@ testlist options cachemode=true xpand=false ;; +-i) +aiomode=true +xpand=false +;; -T)# deprecated timestamp option xpand=false ;; - -v) verbose=true xpand=false @@ -515,6 +526,8 @@ done # Set qemu-io cache mode with $CACHEMODE we have QEMU_IO_OPTIONS="$QEMU_IO_OPTIONS --cache $CACHEMODE" +# Set qemu-io aio mode with $AIOMODE we have +QEMU_IO_OPTIONS="$QEMU_IO_OPTIONS --aio $AIOMODE" QEMU_IO_OPTIONS_NO_FMT="$QEMU_IO_OPTIONS" if [ "$IMGOPTSSYNTAX" != "true" ]; then diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc index 12b4751848..13f934f79b 100644 --- a/tests/qemu-iotests/common.rc +++ b/tests/qemu-iotests/common.rc @@ -576,6 +576,20 @@ _default_cache_mode() return fi } +_supported_aio_modes() +{ +for mode; do +if [ "$mode" = "$AIOMODE" ]; then +return +fi +done +_notrun "not suitable for aio mode: $AIOMODE" +} +_default_aio_mode() +{ +AIOMODE="$1" +QEMU_IO="$QEMU_IO --aio $1" +} _unsupported_imgopts() { diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py index 693fde155a..0d4ef6b3e6 100644 --- a/tests/qemu-iotests/iotests.py +++ b/tests/qemu-iotests/iotests.py @@ -59,6 +59,7 @@ imgproto = os.environ.get('IMGPROTO', 'file') test_dir = os.environ.get('TEST_DIR') output_dir = os.environ.get('OUTPUT_DIR', '.') cachemode = os.environ.get('CACHEMODE') +aiomode = os.environ.get('AIOMODE') qemu_default_machine = os.environ.get('QEMU_DEFAULT_MACHINE') socket_scm_helper = os.environ.get('SOCKET_SCM_HELPER', 'socket_scm_helper') @@ -483,6 +484,7 @@ class VM(qtest.QEMUQtestMachine): options.append('file=%s' % path) options.append('format=%s' % format) options.append('cache=%s' % cachemode) +options.append('aio=%s' % aiomode) if opts: options.append(opts) @@ -888,6 +890,10 @@ def verify_cache_mode(supported_cache_modes=[]): if supported_cache_modes and (cachemode not in supported_cache_modes): notrun('not suitable for this cache mode: %s' % cachemode) +def verify_aio_mode(supported_aio_modes=[]): +if supported_aio_modes and (aiomode not in supported_aio_modes): +notrun('not suitable for this aio mode: %s' % aiomode) + def supports_quorum(): return 'quorum' in qemu_img_pipe('--help') @@ -945,8 +951,9 @@ def execute_unittest(output, verbosity, debug): def execute_test(test_function=None, supported_fmts=[], supported_oses=['linux'], - supported_cache_modes=[], unsupported_fmts=[], - supported_protocols=[], unsupported_protocols=[]): + supported_cache_modes=[], supported_aio_modes={}, + unsupported_fmts=[], supported_protocols=[], + unsupported_protocols=[]): """Run either unittest or script-style tests.""" # We are using TEST_DIR and QEMU_DEFAULT_MACHINE as proxies to @@ -963,6 +970,7 @@ def execute_test(test_function=None, verify_protocol(supported_protocols, unsupported_protocols) verify_platform(supported_oses) verify_cache_mode(supported_cache_modes) +verify_aio_mode(supported_aio_modes) if debug: output = sys.stdout -- 2.21.0
[PATCH v2 13/15] qemu-nbd: adds option for aio engines
From: Aarushi Mehta Signed-off-by: Aarushi Mehta Acked-by: Eric Blake Signed-off-by: Stefan Hajnoczi --- qemu-nbd.c| 12 qemu-nbd.texi | 4 ++-- 2 files changed, 6 insertions(+), 10 deletions(-) diff --git a/qemu-nbd.c b/qemu-nbd.c index caacf0ed73..1761b3e70a 100644 --- a/qemu-nbd.c +++ b/qemu-nbd.c @@ -135,7 +135,7 @@ static void usage(const char *name) "'[ID_OR_NAME]'\n" " -n, --nocache disable host cache\n" " --cache=MODE set cache mode (none, writeback, ...)\n" -" --aio=MODEset AIO mode (native or threads)\n" +" --aio=MODEset AIO mode (native, io_uring or threads)\n" " --discard=MODEset discard mode (ignore, unmap)\n" " --detect-zeroes=MODE set detect-zeroes mode (off, on, unmap)\n" " --image-opts treat FILE as a full set of image options\n" @@ -726,13 +726,9 @@ int main(int argc, char **argv) exit(EXIT_FAILURE); } seen_aio = true; -if (!strcmp(optarg, "native")) { -flags |= BDRV_O_NATIVE_AIO; -} else if (!strcmp(optarg, "threads")) { -/* this is the default */ -} else { - error_report("invalid aio mode `%s'", optarg); - exit(EXIT_FAILURE); +if (bdrv_parse_aio(optarg, &flags) < 0) { +error_report("Invalid aio mode '%s'", optarg); +exit(EXIT_FAILURE); } break; case QEMU_NBD_OPT_DISCARD: diff --git a/qemu-nbd.texi b/qemu-nbd.texi index 7f55657722..3ee3e4bdee 100644 --- a/qemu-nbd.texi +++ b/qemu-nbd.texi @@ -77,8 +77,8 @@ as an read-only device, @var{snapshot_param} format is The cache mode to be used with the file. See the documentation of the emulator's @code{-drive cache=...} option for allowed values. @item --aio=@var{aio} -Set the asynchronous I/O mode between @samp{threads} (the default) -and @samp{native} (Linux only). +Set the asynchronous I/O mode between @samp{threads} (the default), +@samp{native} (Linux only) and @samp{io_uring} (Linux 5.1+). @item --discard=@var{discard} Control whether @dfn{discard} (also known as @dfn{trim} or @dfn{unmap}) requests are ignored or passed to the filesystem. @var{discard} is one of -- 2.21.0
[PATCH v2 08/15] block/file-posix.c: extend to use io_uring
From: Aarushi Mehta Signed-off-by: Aarushi Mehta Reviewed-by: Maxim Levitsky Signed-off-by: Stefan Hajnoczi --- block/file-posix.c | 99 -- 1 file changed, 79 insertions(+), 20 deletions(-) diff --git a/block/file-posix.c b/block/file-posix.c index 695fcf740d..14c80b8cb3 100644 --- a/block/file-posix.c +++ b/block/file-posix.c @@ -156,6 +156,7 @@ typedef struct BDRVRawState { bool has_write_zeroes:1; bool discard_zeroes:1; bool use_linux_aio:1; +bool use_linux_io_uring:1; bool page_cache_inconsistent:1; bool has_fallocate; bool needs_alignment; @@ -444,7 +445,7 @@ static QemuOptsList raw_runtime_opts = { { .name = "aio", .type = QEMU_OPT_STRING, -.help = "host AIO implementation (threads, native)", +.help = "host AIO implementation (threads, native, io_uring)", }, { .name = "locking", @@ -503,9 +504,15 @@ static int raw_open_common(BlockDriverState *bs, QDict *options, goto fail; } -aio_default = (bdrv_flags & BDRV_O_NATIVE_AIO) - ? BLOCKDEV_AIO_OPTIONS_NATIVE - : BLOCKDEV_AIO_OPTIONS_THREADS; +if (bdrv_flags & BDRV_O_NATIVE_AIO) { +aio_default = BLOCKDEV_AIO_OPTIONS_NATIVE; +#ifdef CONFIG_LINUX_IO_URING +} else if (bdrv_flags & BDRV_O_IO_URING) { +aio_default = BLOCKDEV_AIO_OPTIONS_IO_URING; +#endif +} else { +aio_default = BLOCKDEV_AIO_OPTIONS_THREADS; +} aio = qapi_enum_parse(&BlockdevAioOptions_lookup, qemu_opt_get(opts, "aio"), aio_default, &local_err); @@ -514,7 +521,11 @@ static int raw_open_common(BlockDriverState *bs, QDict *options, ret = -EINVAL; goto fail; } + s->use_linux_aio = (aio == BLOCKDEV_AIO_OPTIONS_NATIVE); +#ifdef CONFIG_LINUX_IO_URING +s->use_linux_io_uring = (aio == BLOCKDEV_AIO_OPTIONS_IO_URING); +#endif locking = qapi_enum_parse(&OnOffAuto_lookup, qemu_opt_get(opts, "locking"), @@ -578,7 +589,7 @@ static int raw_open_common(BlockDriverState *bs, QDict *options, s->shared_perm = BLK_PERM_ALL; #ifdef CONFIG_LINUX_AIO - /* Currently Linux does AIO only for files opened with O_DIRECT */ +/* Currently Linux does AIO only for files opened with O_DIRECT */ if (s->use_linux_aio) { if (!(s->open_flags & O_DIRECT)) { error_setg(errp, "aio=native was specified, but it requires " @@ -600,6 +611,22 @@ static int raw_open_common(BlockDriverState *bs, QDict *options, } #endif /* !defined(CONFIG_LINUX_AIO) */ +#ifdef CONFIG_LINUX_IO_URING +if (s->use_linux_io_uring) { +if (!aio_setup_linux_io_uring(bdrv_get_aio_context(bs), errp)) { +error_prepend(errp, "Unable to use io_uring: "); +goto fail; +} +} +#else +if (s->use_linux_io_uring) { +error_setg(errp, "aio=io_uring was specified, but is not supported " + "in this build."); +ret = -EINVAL; +goto fail; +} +#endif /* !defined(CONFIG_LINUX_IO_URING) */ + s->has_discard = true; s->has_write_zeroes = true; if ((bs->open_flags & BDRV_O_NOCACHE) != 0) { @@ -1876,21 +1903,25 @@ static int coroutine_fn raw_co_prw(BlockDriverState *bs, uint64_t offset, return -EIO; /* - * Check if the underlying device requires requests to be aligned, - * and if the request we are trying to submit is aligned or not. - * If this is the case tell the low-level driver that it needs - * to copy the buffer. + * When using O_DIRECT, the request must be aligned to be able to use + * either libaio or io_uring interface. If not fail back to regular thread + * pool read/write code which emulates this for us if we + * set QEMU_AIO_MISALIGNED. */ -if (s->needs_alignment) { -if (!bdrv_qiov_is_aligned(bs, qiov)) { -type |= QEMU_AIO_MISALIGNED; +if (s->needs_alignment && !bdrv_qiov_is_aligned(bs, qiov)) { +type |= QEMU_AIO_MISALIGNED; +#ifdef CONFIG_LINUX_IO_URING +} else if (s->use_linux_io_uring) { +LuringState *aio = aio_get_linux_io_uring(bdrv_get_aio_context(bs)); +assert(qiov->size == bytes); +return luring_co_submit(bs, aio, s->fd, offset, qiov, type); +#endif #ifdef CONFIG_LINUX_AIO -} else if (s->use_linux_aio) { -LinuxAioState *aio = aio_get_linux_aio(bdrv_get_aio_context(bs)); -assert(qiov->size == bytes); -return laio_co_submit(bs, aio, s->fd, offset, qiov, type); +} else if (s->use_linux_aio) { +LinuxAioState *aio = aio_get_linux_aio(bdrv_get_aio_context(bs)); +assert(qiov->size == bytes); +return laio_co_submit(bs, aio, s->fd, offset, qiov, type); #endif -} } acb = (RawPosixAIOData) { @@ -1926,24 +1957,
[PATCH v2 04/15] block/io_uring: implements interfaces for io_uring
From: Aarushi Mehta Aborts when sqe fails to be set as sqes cannot be returned to the ring. Adds slow path for short reads for older kernels Signed-off-by: Aarushi Mehta Signed-off-by: Stefan Hajnoczi --- v10: * Update MAINTAINERS file [Julia] * Rename MAX_EVENTS to MAX_ENTRIES [Julia] * Define ioq_submit() before callers so the prototype isn't necessary [Julia] * Declare variables at the beginning of the block in luring_init() [Julia] --- MAINTAINERS | 8 + block/Makefile.objs | 3 + include/block/aio.h | 16 +- include/block/raw-aio.h | 12 ++ block/io_uring.c| 403 5 files changed, 441 insertions(+), 1 deletion(-) create mode 100644 block/io_uring.c diff --git a/MAINTAINERS b/MAINTAINERS index ed41d7d1b6..31b090033c 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -2544,6 +2544,14 @@ F: block/file-posix.c F: block/file-win32.c F: block/win32-aio.c +Linux io_uring +M: Aarushi Mehta +M: Julia Suvorova +M: Stefan Hajnoczi +L: qemu-block@nongnu.org +S: Maintained +F: block/io_uring.c + qcow2 M: Kevin Wolf M: Max Reitz diff --git a/block/Makefile.objs b/block/Makefile.objs index e394fe0b6c..035abb9c5c 100644 --- a/block/Makefile.objs +++ b/block/Makefile.objs @@ -18,6 +18,7 @@ block-obj-y += block-backend.o snapshot.o qapi.o block-obj-$(CONFIG_WIN32) += file-win32.o win32-aio.o block-obj-$(CONFIG_POSIX) += file-posix.o block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o +block-obj-$(CONFIG_LINUX_IO_URING) += io_uring.o block-obj-y += null.o mirror.o commit.o io.o create.o block-obj-y += throttle-groups.o block-obj-$(CONFIG_LINUX) += nvme.o @@ -65,5 +66,7 @@ block-obj-$(if $(CONFIG_LZFSE),m,n) += dmg-lzfse.o dmg-lzfse.o-libs := $(LZFSE_LIBS) qcow.o-libs:= -lz linux-aio.o-libs := -laio +io_uring.o-cflags := $(LINUX_IO_URING_CFLAGS) +io_uring.o-libs:= $(LINUX_IO_URING_LIBS) parallels.o-cflags := $(LIBXML2_CFLAGS) parallels.o-libs := $(LIBXML2_LIBS) diff --git a/include/block/aio.h b/include/block/aio.h index 6b0d52f732..7ba9bd7874 100644 --- a/include/block/aio.h +++ b/include/block/aio.h @@ -49,6 +49,7 @@ typedef void IOHandler(void *opaque); struct Coroutine; struct ThreadPool; struct LinuxAioState; +struct LuringState; struct AioContext { GSource source; @@ -117,11 +118,19 @@ struct AioContext { struct ThreadPool *thread_pool; #ifdef CONFIG_LINUX_AIO -/* State for native Linux AIO. Uses aio_context_acquire/release for +/* + * State for native Linux AIO. Uses aio_context_acquire/release for * locking. */ struct LinuxAioState *linux_aio; #endif +#ifdef CONFIG_LINUX_IO_URING +/* + * State for Linux io_uring. Uses aio_context_acquire/release for + * locking. + */ +struct LuringState *linux_io_uring; +#endif /* TimerLists for calling timers - one per clock type. Has its own * locking. @@ -386,6 +395,11 @@ struct LinuxAioState *aio_setup_linux_aio(AioContext *ctx, Error **errp); /* Return the LinuxAioState bound to this AioContext */ struct LinuxAioState *aio_get_linux_aio(AioContext *ctx); +/* Setup the LuringState bound to this AioContext */ +struct LuringState *aio_setup_linux_io_uring(AioContext *ctx, Error **errp); + +/* Return the LuringState bound to this AioContext */ +struct LuringState *aio_get_linux_io_uring(AioContext *ctx); /** * aio_timer_new_with_attrs: * @ctx: the aio context diff --git a/include/block/raw-aio.h b/include/block/raw-aio.h index 4629f24d08..251b10d273 100644 --- a/include/block/raw-aio.h +++ b/include/block/raw-aio.h @@ -57,6 +57,18 @@ void laio_attach_aio_context(LinuxAioState *s, AioContext *new_context); void laio_io_plug(BlockDriverState *bs, LinuxAioState *s); void laio_io_unplug(BlockDriverState *bs, LinuxAioState *s); #endif +/* io_uring.c - Linux io_uring implementation */ +#ifdef CONFIG_LINUX_IO_URING +typedef struct LuringState LuringState; +LuringState *luring_init(Error **errp); +void luring_cleanup(LuringState *s); +int coroutine_fn luring_co_submit(BlockDriverState *bs, LuringState *s, int fd, +uint64_t offset, QEMUIOVector *qiov, int type); +void luring_detach_aio_context(LuringState *s, AioContext *old_context); +void luring_attach_aio_context(LuringState *s, AioContext *new_context); +void luring_io_plug(BlockDriverState *bs, LuringState *s); +void luring_io_unplug(BlockDriverState *bs, LuringState *s); +#endif #ifdef _WIN32 typedef struct QEMUWin32AIOState QEMUWin32AIOState; diff --git a/block/io_uring.c b/block/io_uring.c new file mode 100644 index 00..307c4c5823 --- /dev/null +++ b/block/io_uring.c @@ -0,0 +1,403 @@ +/* + * Linux io_uring support. + * + * Copyright (C) 2009 IBM, Corp. + * Copyright (C) 2009 Red Hat, Inc. + * Copyright (C) 2019 Aarushi Mehta + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ +#include "
[PATCH v2 01/15] configure: permit use of io_uring
From: Aarushi Mehta Signed-off-by: Aarushi Mehta Reviewed-by: Maxim Levitsky Signed-off-by: Stefan Hajnoczi --- configure | 27 +++ 1 file changed, 27 insertions(+) diff --git a/configure b/configure index 145fcabbb3..745429c925 100755 --- a/configure +++ b/configure @@ -370,6 +370,7 @@ xen="" xen_ctrl_version="" xen_pci_passthrough="" linux_aio="" +linux_io_uring="" cap_ng="" attr="" libattr="" @@ -1250,6 +1251,10 @@ for opt do ;; --enable-linux-aio) linux_aio="yes" ;; + --disable-linux-io-uring) linux_io_uring="no" + ;; + --enable-linux-io-uring) linux_io_uring="yes" + ;; --disable-attr) attr="no" ;; --enable-attr) attr="yes" @@ -1760,6 +1765,7 @@ disabled with --disable-FEATURE, default is enabled if available: vde support for vde network netmap support for netmap network linux-aio Linux AIO support + linux-io-uring Linux io_uring support cap-ng libcap-ng support attrattr and xattr support vhost-net vhost-net kernel acceleration support @@ -3950,6 +3956,21 @@ EOF linux_aio=no fi fi +## +# linux-io-uring probe + +if test "$linux_io_uring" != "no" ; then + if $pkg_config liburing; then +linux_io_uring_cflags=$($pkg_config --cflags liburing) +linux_io_uring_libs=$($pkg_config --libs liburing) +linux_io_uring=yes + else +if test "$linux_io_uring" = "yes" ; then + feature_not_found "linux io_uring" "Install liburing devel" +fi +linux_io_uring=no + fi +fi ## # TPM emulation is only on POSIX @@ -6356,6 +6377,7 @@ echo "PIE $pie" echo "vde support $vde" echo "netmap support$netmap" echo "Linux AIO support $linux_aio" +echo "Linux io_uring support $linux_io_uring" echo "ATTR/XATTR support $attr" echo "Install blobs $blobs" echo "KVM support $kvm" @@ -6844,6 +6866,11 @@ fi if test "$linux_aio" = "yes" ; then echo "CONFIG_LINUX_AIO=y" >> $config_host_mak fi +if test "$linux_io_uring" = "yes" ; then + echo "CONFIG_LINUX_IO_URING=y" >> $config_host_mak + echo "LINUX_IO_URING_CFLAGS=$linux_io_uring_cflags" >> $config_host_mak + echo "LINUX_IO_URING_LIBS=$linux_io_uring_libs" >> $config_host_mak +fi if test "$attr" = "yes" ; then echo "CONFIG_ATTR=y" >> $config_host_mak fi -- 2.21.0
[PATCH v2 09/15] block: add trace events for io_uring
From: Aarushi Mehta Signed-off-by: Aarushi Mehta Signed-off-by: Stefan Hajnoczi --- block/io_uring.c | 21 ++--- block/trace-events | 12 2 files changed, 30 insertions(+), 3 deletions(-) diff --git a/block/io_uring.c b/block/io_uring.c index 307c4c5823..a5c0d16220 100644 --- a/block/io_uring.c +++ b/block/io_uring.c @@ -17,6 +17,7 @@ #include "block/raw-aio.h" #include "qemu/coroutine.h" #include "qapi/error.h" +#include "trace.h" /* io_uring ring size */ #define MAX_ENTRIES 128 @@ -85,6 +86,8 @@ static void luring_resubmit_short_read(LuringState *s, LuringAIOCB *luringcb, QEMUIOVector *resubmit_qiov; size_t remaining; +trace_luring_resubmit_short_read(s, luringcb, nread); + /* Update read position */ luringcb->total_read = nread; remaining = luringcb->qiov->size - luringcb->total_read; @@ -156,6 +159,7 @@ static void luring_process_completions(LuringState *s) /* Change counters one-by-one because we can be nested. */ s->io_q.in_flight--; +trace_luring_process_completion(s, luringcb, ret); /* total_read is non-zero only for resubmitted read requests */ total_bytes = ret + luringcb->total_read; @@ -224,6 +228,7 @@ static int ioq_submit(LuringState *s) QSIMPLEQ_REMOVE_HEAD(&s->io_q.submit_queue, next); } ret = io_uring_submit(&s->ring); +trace_luring_io_uring_submit(s, ret); /* Prevent infinite loop if submission is refused */ if (ret <= 0) { if (ret == -EAGAIN) { @@ -280,12 +285,15 @@ static void ioq_init(LuringQueue *io_q) void luring_io_plug(BlockDriverState *bs, LuringState *s) { +trace_luring_io_plug(s); s->io_q.plugged++; } void luring_io_unplug(BlockDriverState *bs, LuringState *s) { assert(s->io_q.plugged); +trace_luring_io_unplug(s, s->io_q.blocked, s->io_q.plugged, + s->io_q.in_queue, s->io_q.in_flight); if (--s->io_q.plugged == 0 && !s->io_q.blocked && s->io_q.in_queue > 0) { ioq_submit(s); @@ -306,6 +314,7 @@ void luring_io_unplug(BlockDriverState *bs, LuringState *s) static int luring_do_submit(int fd, LuringAIOCB *luringcb, LuringState *s, uint64_t offset, int type) { +int ret; struct io_uring_sqe *sqes = &luringcb->sqeq; switch (type) { @@ -329,11 +338,14 @@ static int luring_do_submit(int fd, LuringAIOCB *luringcb, LuringState *s, QSIMPLEQ_INSERT_TAIL(&s->io_q.submit_queue, luringcb, next); s->io_q.in_queue++; - +trace_luring_do_submit(s, s->io_q.blocked, s->io_q.plugged, + s->io_q.in_queue, s->io_q.in_flight); if (!s->io_q.blocked && (!s->io_q.plugged || s->io_q.in_flight + s->io_q.in_queue >= MAX_ENTRIES)) { -return ioq_submit(s); +ret = ioq_submit(s); +trace_luring_do_submit_done(s, ret); +return ret; } return 0; } @@ -348,8 +360,10 @@ int coroutine_fn luring_co_submit(BlockDriverState *bs, LuringState *s, int fd, .qiov = qiov, .is_read= (type == QEMU_AIO_READ), }; - +trace_luring_co_submit(bs, s, &luringcb, fd, offset, qiov ? qiov->size : 0, + type); ret = luring_do_submit(fd, &luringcb, s, offset, type); + if (ret < 0) { return ret; } @@ -400,4 +414,5 @@ void luring_cleanup(LuringState *s) { io_uring_queue_exit(&s->ring); g_free(s); +trace_luring_cleanup_state(s); } diff --git a/block/trace-events b/block/trace-events index b8d70f5242..c8e1647e90 100644 --- a/block/trace-events +++ b/block/trace-events @@ -63,6 +63,18 @@ qmp_block_stream(void *bs) "bs %p" file_paio_submit(void *acb, void *opaque, int64_t offset, int count, int type) "acb %p opaque %p offset %"PRId64" count %d type %d" file_copy_file_range(void *bs, int src, int64_t src_off, int dst, int64_t dst_off, int64_t bytes, int flags, int64_t ret) "bs %p src_fd %d offset %"PRIu64" dst_fd %d offset %"PRIu64" bytes %"PRIu64" flags %d ret %"PRId64 +#io_uring.c +luring_init_state(void *s, size_t size) "s %p size %zu" +luring_cleanup_state(void *s) "%p freed" +luring_io_plug(void *s) "LuringState %p plug" +luring_io_unplug(void *s, int blocked, int plugged, int queued, int inflight) "LuringState %p blocked %d plugged %d queued %d inflight %d" +luring_do_submit(void *s, int blocked, int plugged, int queued, int inflight) "LuringState %p blocked %d plugged %d queued %d inflight %d" +luring_do_submit_done(void *s, int ret) "LuringState %p submitted to kernel %d" +luring_co_submit(void *bs, void *s, void *luringcb, int fd, uint64_t offset, size_t nbytes, int type) "bs %p s %p luringcb %p fd %d offset %" PRId64 " nbytes %zd type %d" +luring_process_completion(void *s, void *aiocb, int ret) "LuringState %p luringcb %p ret %d" +luring_io_uring_submit(void *s, int ret) "LuringState %p ret %d" +luring_resubmit_
[PATCH v2 00/15] io_uring: add Linux io_uring AIO engine
v11: * Drop fd registration because it breaks QEMU's file locking and will need to be resolved in a separate patch series * Drop line-wrapping changes that accidentally broke several qemu-iotests v10: * Dropped kernel submission queue polling, it requires root and has additional limitations. It should be benchmarked and considered for inclusion later, maybe even together with kernel side changes. * Add io_uring_register_files() return value to trace_luring_fd_register() * Fix indentation in luring_fd_unregister() * Set s->fd_reg.fd_array to NULL after g_free() to avoid dangling pointers * Simplify fd registration code * Add luring_fd_unregister() and call it from file-posix.c to prevent fd leaks * Add trace_luring_fd_unregister() trace event * Add missing space to qemu-img command-line documentation * Update MAINTAINERS file [Julia] * Rename MAX_EVENTS to MAX_ENTRIES [Julia] * Define ioq_submit() before callers so the prototype isn't necessary [Julia] * Declare variables at the beginning of the block in luring_init() [Julia] This patch series is based on Aarushi Mehta's v9 patch series written for Google Summer of Code 2019: https://lists.gnu.org/archive/html/qemu-devel/2019-08/msg00179.html It adds a new AIO engine that uses the new Linux io_uring API. This is the successor to Linux AIO with a number of improvements: 1. Both O_DIRECT and buffered I/O work 2. fdatasync(2) is supported (no need for a separate thread pool!) 3. True async behavior so the syscall doesn't block (Linux AIO got there to some degree...) 4. Advanced performance optimizations are available (file registration, memory buffer registration, completion polling, submission polling). Since Aarushi has been busy, I have taken up this patch series. Booting a guest works with -drive aio=io_uring and -drive aio=io_uring,cache=none with a raw file on XFS. I currently recommend using -drive aio=io_uring only with host block devices (like NVMe devices). As of Linux v5.4-rc1 I still hit kernel bugs when using image files on ext4 or XFS. Aarushi Mehta (15): configure: permit use of io_uring qapi/block-core: add option for io_uring block/block: add BDRV flag for io_uring block/io_uring: implements interfaces for io_uring stubs: add stubs for io_uring interface util/async: add aio interfaces for io_uring blockdev: adds bdrv_parse_aio to use io_uring block/file-posix.c: extend to use io_uring block: add trace events for io_uring block/io_uring: adds userspace completion polling qemu-io: adds option to use aio engine qemu-img: adds option to use aio engine for benchmarking qemu-nbd: adds option for aio engines tests/qemu-iotests: enable testing with aio options tests/qemu-iotests: use AIOMODE with various tests MAINTAINERS | 9 + qapi/block-core.json | 4 +- configure | 27 +++ block/Makefile.objs | 3 + stubs/Makefile.objs | 1 + include/block/aio.h | 16 +- include/block/block.h | 2 + include/block/raw-aio.h | 12 + block.c | 22 ++ block/file-posix.c| 99 ++-- block/io_uring.c | 433 ++ blockdev.c| 12 +- qemu-img.c| 11 +- qemu-io.c | 25 +- qemu-nbd.c| 12 +- stubs/io_uring.c | 32 +++ util/async.c | 36 +++ block/trace-events| 12 + qemu-img-cmds.hx | 4 +- qemu-img.texi | 5 +- qemu-nbd.texi | 4 +- tests/qemu-iotests/028| 2 +- tests/qemu-iotests/058| 2 +- tests/qemu-iotests/089| 4 +- tests/qemu-iotests/091| 4 +- tests/qemu-iotests/109| 2 +- tests/qemu-iotests/147| 5 +- tests/qemu-iotests/181| 8 +- tests/qemu-iotests/183| 4 +- tests/qemu-iotests/185| 10 +- tests/qemu-iotests/200| 2 +- tests/qemu-iotests/201| 8 +- tests/qemu-iotests/check | 15 +- tests/qemu-iotests/common.rc | 14 ++ tests/qemu-iotests/iotests.py | 12 +- 35 files changed, 797 insertions(+), 76 deletions(-) create mode 100644 block/io_uring.c create mode 100644 stubs/io_uring.c -- 2.21.0
Re: [PATCH] qemu-iotests: restrict 264 to qcow2 only
On 10/25/19 9:50 AM, Vladimir Sementsov-Ogievskiy wrote: 264 is unprepared to run with different formats, for example luks needs handling keys, cloop doesn't support image creation, vpc creates image larger than requested (which breaks "Backup completed: 5242880" in test output). The test is here to check nbd-reconnect feature and we actually don't need it for all formats. Let's restrict it to qcow2 only. Reported-by: Max Reitz Signed-off-by: Vladimir Sementsov-Ogievskiy --- tests/qemu-iotests/264 | 2 ++ 1 file changed, 2 insertions(+) Reviewed-by: Eric Blake I will queue through my NBD tree; it may miss soft freeze (due to my travel schedule for KVM Forum), but is definite 4.2 material, so it will be in by rc1. diff --git a/tests/qemu-iotests/264 b/tests/qemu-iotests/264 index c8cd97ae2b..131366422b 100755 --- a/tests/qemu-iotests/264 +++ b/tests/qemu-iotests/264 @@ -24,6 +24,8 @@ import iotests from iotests import qemu_img_create, qemu_io_silent_check, file_path, \ qemu_nbd_popen, log +iotests.verify_image_format(supported_fmts=['qcow2']) + disk_a, disk_b, nbd_sock = file_path('disk_a', 'disk_b', 'nbd-sock') nbd_uri = 'nbd+unix:///?socket=' + nbd_sock size = 5 * 1024 * 1024 -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
Re: Early 4.1.1 release for image corruption fixes?
On Fri, Oct 25, 2019 at 04:29:15PM +0200, Kevin Wolf wrote: > Hi, > > the QEMU 4.1.0 release has two problems that can easily cause image > corruption with any qcow2 images (no special configuration needed to > trigger the bugs): > > 1. A locking bug in the qcow2 code. I just sent a pull request that >includes the fix for this. The important patch there is: > >'qcow2: Fix corruption bug in qcow2_detect_metadata_preallocation()' > > 2. A kernel bug in the XFS driver that became visible by new I/O >patterns the qcow2 implementation started to use in 4.1. Max is >currently working on a workaround for this. > > The Planning/4.1 wiki page tells me that a 4.1.1 release is planned for > end of November, which isn't too far, but I was wondering if want to > have a stable release even earlier, right after the fixes for both > problems are in. IMHO data corruption for our primary image format is an excellent reason to do a stable release at the soonest viable opportunity. Regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
Re: [RFC 0/3] block/file-posix: Work around XFS bug
On Fri, 25 Oct 2019 at 15:21, Max Reitz wrote: > > On 25.10.19 16:17, Peter Maydell wrote: > > On Fri, 25 Oct 2019 at 15:16, Max Reitz wrote: > >> I’ve created an RH BZ here: > >> > >> https://bugzilla.redhat.com/show_bug.cgi?id=1765547 > > > > Currently "You are not authorized to access bug #1765547." for > > anonymous browsers, just fyi. > > Err. Oops. That wasn’t my intention. I hate that misfeature. > > Thanks for telling me, I fixed it. :-) Yeah, that's a really well written bug report, you definitely don't want to hide it away :-) Thanks for making it public. -- PMM
[PATCH] qemu-iotests: restrict 264 to qcow2 only
264 is unprepared to run with different formats, for example luks needs handling keys, cloop doesn't support image creation, vpc creates image larger than requested (which breaks "Backup completed: 5242880" in test output). The test is here to check nbd-reconnect feature and we actually don't need it for all formats. Let's restrict it to qcow2 only. Reported-by: Max Reitz Signed-off-by: Vladimir Sementsov-Ogievskiy --- tests/qemu-iotests/264 | 2 ++ 1 file changed, 2 insertions(+) diff --git a/tests/qemu-iotests/264 b/tests/qemu-iotests/264 index c8cd97ae2b..131366422b 100755 --- a/tests/qemu-iotests/264 +++ b/tests/qemu-iotests/264 @@ -24,6 +24,8 @@ import iotests from iotests import qemu_img_create, qemu_io_silent_check, file_path, \ qemu_nbd_popen, log +iotests.verify_image_format(supported_fmts=['qcow2']) + disk_a, disk_b, nbd_sock = file_path('disk_a', 'disk_b', 'nbd-sock') nbd_uri = 'nbd+unix:///?socket=' + nbd_sock size = 5 * 1024 * 1024 -- 2.21.0
Re: [PULL 3/3] iotests: test nbd reconnect
25.10.2019 17:28, Max Reitz wrote: > On 23.10.19 04:01, Eric Blake wrote: >> From: Vladimir Sementsov-Ogievskiy >> >> Add test, which starts backup to nbd target and restarts nbd server >> during backup. >> >> Signed-off-by: Vladimir Sementsov-Ogievskiy >> Message-Id: <20191009084158.15614-4-vsement...@virtuozzo.com> >> Reviewed-by: Eric Blake >> Signed-off-by: Eric Blake >> --- >> tests/qemu-iotests/264| 95 +++ >> tests/qemu-iotests/264.out| 13 + >> tests/qemu-iotests/group | 1 + >> tests/qemu-iotests/iotests.py | 11 >> 4 files changed, 120 insertions(+) >> create mode 100755 tests/qemu-iotests/264 >> create mode 100644 tests/qemu-iotests/264.out > > I suppose this test should limit the supported image formats. For me, > it fails for at least LUKS, cloop, and vpc. (Due to different reasons > for each format.) > Sorry for this. I should finally remember to (almost) always restrict my tests to qcow2. Patch sent. -- Best regards, Vladimir
Re: Early 4.1.1 release for image corruption fixes?
On 25.10.19 16:29, Kevin Wolf wrote: > Hi, > > the QEMU 4.1.0 release has two problems that can easily cause image > corruption with any qcow2 images (no special configuration needed to > trigger the bugs): Three, actually. There’s also b2c6f23f4a9f6d8f1b648705cd46d3713b78d6a2 which fixed 50ba5b2d994853b38fed10e0841b119da0f8b8e5 that was part of 4.1.0, too. > 1. A locking bug in the qcow2 code. I just sent a pull request that >includes the fix for this. The important patch there is: > >'qcow2: Fix corruption bug in qcow2_detect_metadata_preallocation()' > > 2. A kernel bug in the XFS driver that became visible by new I/O >patterns the qcow2 implementation started to use in 4.1. Max is >currently working on a workaround for this. You most likely won’t see the kernel bug on 4.1.0, because 50ba5b2d99 is basically the same bug, but in qemu... (I don’t know if it has the same technical reason, but it causes the same symptoms.) Max > The Planning/4.1 wiki page tells me that a 4.1.1 release is planned for > end of November, which isn't too far, but I was wondering if want to > have a stable release even earlier, right after the fixes for both > problems are in. > > Peter, if we haven't made a decision until then, maybe discussing this > is an agenda item for QEMU Summit, too. > > Kevin > signature.asc Description: OpenPGP digital signature
Re: [PULL 0/7] Block layer patches
On Fri, 25 Oct 2019 at 14:46, Kevin Wolf wrote: > > The following changes since commit 7bc8f9734213b76e76631a483be13d6737c2adbc: > > Merge remote-tracking branch > 'remotes/pmaydell/tags/pull-target-arm-20191025' into staging (2019-10-25 > 13:12:16 +0100) > > are available in the Git repository at: > > git://repo.or.cz/qemu/kevin.git tags/for-upstream > > for you to fetch changes up to 5e9785505210e2477e590e61b1ab100d0ec22b01: > > qcow2: Fix corruption bug in qcow2_detect_metadata_preallocation() > (2019-10-25 15:18:55 +0200) > > > Block layer patches: > > - qcow2: Fix data corruption bug that is triggered in partial cluster > allocation with default options > - qapi: add support for blkreplay driver > - doc: Describe missing generic -blockdev options > - iotests: Fix 118 when run as root > - Minor code cleanups > > Applied, thanks. Please update the changelog at https://wiki.qemu.org/ChangeLog/4.2 for any user-visible changes. -- PMM
Re: [PATCH v14 1/9] esp: add pseudo-DMA as used by Macintosh
On 10/25/19 4:01 PM, Paolo Bonzini wrote: On 22/10/19 13:17, Laurent Vivier wrote: +if (s->dma_memory_read) { +s->dma_memory_read(s->dma_opaque, &s->cmdbuf[s->cmdlen], len); +} else { +set_pdma(s, CMD, s->cmdlen, len); +s->pdma_cb = do_dma_pdma_cb; +esp_raise_drq(s); +return; +} +trace_esp_handle_ti_cmd(s->cmdlen); +s->ti_size = 0; +s->cmdlen = 0; +s->do_cmd = 0; +do_cmd(s, s->cmdbuf); return; Can you explain these lines after s->dma_memory_read? I suppose they are related to -} -if (s->do_cmd) { +} else if (s->do_cmd) { If so, it would be nice to make those a separate patch. Otherwise seems okay. Third reviewer asking, so it seems worthwhile.
Re: [PATCH v14 1/9] esp: add pseudo-DMA as used by Macintosh
Le 25/10/2019 à 16:54, Philippe Mathieu-Daudé a écrit : > On 10/25/19 4:01 PM, Paolo Bonzini wrote: >> On 22/10/19 13:17, Laurent Vivier wrote: >>> + if (s->dma_memory_read) { >>> + s->dma_memory_read(s->dma_opaque, &s->cmdbuf[s->cmdlen], >>> len); >>> + } else { >>> + set_pdma(s, CMD, s->cmdlen, len); >>> + s->pdma_cb = do_dma_pdma_cb; >>> + esp_raise_drq(s); >>> + return; >>> + } >>> + trace_esp_handle_ti_cmd(s->cmdlen); >>> + s->ti_size = 0; >>> + s->cmdlen = 0; >>> + s->do_cmd = 0; >>> + do_cmd(s, s->cmdbuf); >>> return; >> >> Can you explain these lines after s->dma_memory_read? I suppose they >> are related to >> >>> - } >>> - if (s->do_cmd) { >>> + } else if (s->do_cmd) { >> >> If so, it would be nice to make those a separate patch. Otherwise seems >> okay. > > Third reviewer asking, so it seems worthwhile. It seems, yes :) I'm going to try to make a separate patch. Thank you to all of you. Laurent
Re: [PULL 3/3] iotests: test nbd reconnect
On 23.10.19 04:01, Eric Blake wrote: > From: Vladimir Sementsov-Ogievskiy > > Add test, which starts backup to nbd target and restarts nbd server > during backup. > > Signed-off-by: Vladimir Sementsov-Ogievskiy > Message-Id: <20191009084158.15614-4-vsement...@virtuozzo.com> > Reviewed-by: Eric Blake > Signed-off-by: Eric Blake > --- > tests/qemu-iotests/264| 95 +++ > tests/qemu-iotests/264.out| 13 + > tests/qemu-iotests/group | 1 + > tests/qemu-iotests/iotests.py | 11 > 4 files changed, 120 insertions(+) > create mode 100755 tests/qemu-iotests/264 > create mode 100644 tests/qemu-iotests/264.out I suppose this test should limit the supported image formats. For me, it fails for at least LUKS, cloop, and vpc. (Due to different reasons for each format.) Max signature.asc Description: OpenPGP digital signature
Re: [RFC 0/3] block/file-posix: Work around XFS bug
Am 25.10.2019 um 16:19 hat Max Reitz geschrieben: > On 25.10.19 15:56, Vladimir Sementsov-Ogievskiy wrote: > > 25.10.2019 16:40, Vladimir Sementsov-Ogievskiy wrote: > >> 25.10.2019 12:58, Max Reitz wrote: > >>> Hi, > >>> > >>> It seems to me that there is a bug in Linux’s XFS kernel driver, as > >>> I’ve explained here: > >>> > >>> https://lists.nongnu.org/archive/html/qemu-block/2019-10/msg01429.html > >>> > >>> In combination with our commit c8bb23cbdbe32f, this may lead to guest > >>> data corruption when using qcow2 images on XFS with aio=native. > >>> > >>> We can’t wait until the XFS kernel driver is fixed, we should work > >>> around the problem ourselves. > >>> > >>> This is an RFC for two reasons: > >>> (1) I don’t know whether this is the right way to address the issue, > >>> (2) Ideally, we should detect whether the XFS kernel driver is fixed and > >>> if so stop applying the workaround. > >>> I don’t know how we would go about this, so this series doesn’t do > >>> it. (Hence it’s an RFC.) > >>> (3) Perhaps it’s a bit of a layering violation to let the file-posix > >>> driver access and modify a BdrvTrackedRequest object. > >>> > >>> As for how we can address the issue, I see three ways: > >>> (1) The one presented in this series: On XFS with aio=native, we extend > >>> tracked requests for post-EOF fallocate() calls (i.e., write-zero > >>> operations) to reach until infinity (INT64_MAX in practice), mark > >>> them serializing and wait for other conflicting requests. > >>> > >>> Advantages: > >>> + Limits the impact to very specific cases > >>> (And that means it wouldn’t hurt too much to keep this workaround > >>> even when the XFS driver has been fixed) > >>> + Works around the bug where it happens, namely in file-posix > >>> > >>> Disadvantages: > >>> - A bit complex > >>> - A bit of a layering violation (should file-posix have access to > >>> tracked requests?) > >>> > >>> (2) Always skip qcow2’s handle_alloc_space() on XFS. The XFS bug only > >>> becomes visible due to that function: I don’t think qcow2 writes > >>> zeroes in any other I/O path, and raw images are fixed in size so > >>> post-EOF writes won’t happen. > >>> > >>> Advantages: > >>> + Maybe simpler, depending on how difficult it is to handle the > >>> layering violation > >>> + Also fixes the performance problem of handle_alloc_space() being > >>> slow on ppc64+XFS. > >>> > >>> Disadvantages: > >>> - Huge layering violation because qcow2 would need to know whether > >>> the image is stored on XFS or not. > >>> - We’d definitely want to skip this workaround when the XFS driver > >>> has been fixed, so we need some method to find out whether it has > >>> > >>> (3) Drop handle_alloc_space(), i.e. revert c8bb23cbdbe32f. > >>> To my knowledge I’m the only one who has provided any benchmarks for > >>> this commit, and even then I was a bit skeptical because it performs > >>> well in some cases and bad in others. I concluded that it’s > >>> probably worth it because the “some cases” are more likely to occur. > >>> > >>> Now we have this problem of corruption here (granted due to a bug in > >>> the XFS driver), and another report of massively degraded > >>> performance on ppc64 > >>> (https://bugzilla.redhat.com/show_bug.cgi?id=1745823 – sorry, a > >>> private BZ; I hate that :-/ The report is about 40 % worse > >>> performance for an in-guest fio write benchmark.) > >>> > >>> So I have to ask the question about what the justification for > >>> keeping c8bb23cbdbe32f is. How much does performance increase with > >>> it actually? (On non-(ppc64+XFS) machines, obviously) > >>> > >>> Advantages: > >>> + Trivial > >>> + No layering violations > >>> + We wouldn’t need to keep track of whether the kernel bug has been > >>> fixed or not > >>> + Fixes the ppc64+XFS performance problem > >>> > >>> Disadvantages: > >>> - Reverts cluster allocation performance to pre-c8bb23cbdbe32f > >>> levels, whatever that means > >>> > >>> So this is the main reason this is an RFC: What should we do? Is (1) > >>> really the best choice? > >>> > >>> > >>> In any case, I’ve ran the test case I showed in > >>> https://lists.nongnu.org/archive/html/qemu-block/2019-10/msg01282.html > >>> more than ten times with this series applied and the installation > >>> succeeded every time. (Without this series, it fails like every other > >>> time.) > >>> > >>> > >> > >> Hi! > >> > >> First, great thanks for your investigation! > >> > >> We need c8bb23cbdbe3 patch, because we use 1M clusters, and zeroing 1M is > >> significant > >> in time. > >> > >> I've tested a bit: > >> > >> test: > >> for img in /ssd/test.img /test.img; do for cl in 64K 1M; do for step in 4K > >> 64K 1M; do ./qemu-img creat
Re: [PATCH v2 0/2] qcow2: Fix image corruption bug in 4.1
Hello Kevin, On Thu, Oct 24, 2019 at 04:26:56PM +0200, Kevin Wolf wrote: > Kevin Wolf (2): > coroutine: Add qemu_co_mutex_assert_locked() > qcow2: Fix corruption bug in qcow2_detect_metadata_preallocation() Tested-by: Michael Weiser with offending 69f47505e and today's master (58560ad254fbda71d4daa6622d71683190070ee2). Corruption does not happen with series applied. Assertion tiggers as expected if lock is not taken. FWIW: Reviewed-by: Michael Weiser -- Thanks, Michael
Early 4.1.1 release for image corruption fixes?
Hi, the QEMU 4.1.0 release has two problems that can easily cause image corruption with any qcow2 images (no special configuration needed to trigger the bugs): 1. A locking bug in the qcow2 code. I just sent a pull request that includes the fix for this. The important patch there is: 'qcow2: Fix corruption bug in qcow2_detect_metadata_preallocation()' 2. A kernel bug in the XFS driver that became visible by new I/O patterns the qcow2 implementation started to use in 4.1. Max is currently working on a workaround for this. The Planning/4.1 wiki page tells me that a 4.1.1 release is planned for end of November, which isn't too far, but I was wondering if want to have a stable release even earlier, right after the fixes for both problems are in. Peter, if we haven't made a decision until then, maybe discussing this is an agenda item for QEMU Summit, too. Kevin
Re: [RFC 0/3] block/file-posix: Work around XFS bug
25.10.2019 17:19, Max Reitz wrote: > On 25.10.19 15:56, Vladimir Sementsov-Ogievskiy wrote: >> 25.10.2019 16:40, Vladimir Sementsov-Ogievskiy wrote: >>> 25.10.2019 12:58, Max Reitz wrote: Hi, It seems to me that there is a bug in Linux’s XFS kernel driver, as I’ve explained here: https://lists.nongnu.org/archive/html/qemu-block/2019-10/msg01429.html In combination with our commit c8bb23cbdbe32f, this may lead to guest data corruption when using qcow2 images on XFS with aio=native. We can’t wait until the XFS kernel driver is fixed, we should work around the problem ourselves. This is an RFC for two reasons: (1) I don’t know whether this is the right way to address the issue, (2) Ideally, we should detect whether the XFS kernel driver is fixed and if so stop applying the workaround. I don’t know how we would go about this, so this series doesn’t do it. (Hence it’s an RFC.) (3) Perhaps it’s a bit of a layering violation to let the file-posix driver access and modify a BdrvTrackedRequest object. As for how we can address the issue, I see three ways: (1) The one presented in this series: On XFS with aio=native, we extend tracked requests for post-EOF fallocate() calls (i.e., write-zero operations) to reach until infinity (INT64_MAX in practice), mark them serializing and wait for other conflicting requests. Advantages: + Limits the impact to very specific cases (And that means it wouldn’t hurt too much to keep this workaround even when the XFS driver has been fixed) + Works around the bug where it happens, namely in file-posix Disadvantages: - A bit complex - A bit of a layering violation (should file-posix have access to tracked requests?) (2) Always skip qcow2’s handle_alloc_space() on XFS. The XFS bug only becomes visible due to that function: I don’t think qcow2 writes zeroes in any other I/O path, and raw images are fixed in size so post-EOF writes won’t happen. Advantages: + Maybe simpler, depending on how difficult it is to handle the layering violation + Also fixes the performance problem of handle_alloc_space() being slow on ppc64+XFS. Disadvantages: - Huge layering violation because qcow2 would need to know whether the image is stored on XFS or not. - We’d definitely want to skip this workaround when the XFS driver has been fixed, so we need some method to find out whether it has (3) Drop handle_alloc_space(), i.e. revert c8bb23cbdbe32f. To my knowledge I’m the only one who has provided any benchmarks for this commit, and even then I was a bit skeptical because it performs well in some cases and bad in others. I concluded that it’s probably worth it because the “some cases” are more likely to occur. Now we have this problem of corruption here (granted due to a bug in the XFS driver), and another report of massively degraded performance on ppc64 (https://bugzilla.redhat.com/show_bug.cgi?id=1745823 – sorry, a private BZ; I hate that :-/ The report is about 40 % worse performance for an in-guest fio write benchmark.) So I have to ask the question about what the justification for keeping c8bb23cbdbe32f is. How much does performance increase with it actually? (On non-(ppc64+XFS) machines, obviously) Advantages: + Trivial + No layering violations + We wouldn’t need to keep track of whether the kernel bug has been fixed or not + Fixes the ppc64+XFS performance problem Disadvantages: - Reverts cluster allocation performance to pre-c8bb23cbdbe32f levels, whatever that means So this is the main reason this is an RFC: What should we do? Is (1) really the best choice? In any case, I’ve ran the test case I showed in https://lists.nongnu.org/archive/html/qemu-block/2019-10/msg01282.html more than ten times with this series applied and the installation succeeded every time. (Without this series, it fails like every other time.) >>> >>> Hi! >>> >>> First, great thanks for your investigation! >>> >>> We need c8bb23cbdbe3 patch, because we use 1M clusters, and zeroing 1M is >>> significant >>> in time. >>> >>> I've tested a bit: >>> >>> test: >>> for img in /ssd/test.img /test.img; do for cl in 64K 1M; do for step in 4K >>> 64K 1M; do ./qemu-img create -f qcow2 -o cluster_size=$cl $img 15G > >>> /dev/null; printf '%-15s%-7s
Re: [RFC 0/3] block/file-posix: Work around XFS bug
On 25.10.19 16:17, Peter Maydell wrote: > On Fri, 25 Oct 2019 at 15:16, Max Reitz wrote: >> >> On 25.10.19 15:46, Peter Maydell wrote: >>> On Fri, 25 Oct 2019 at 11:09, Max Reitz wrote: Hi, It seems to me that there is a bug in Linux’s XFS kernel driver, as I’ve explained here: https://lists.nongnu.org/archive/html/qemu-block/2019-10/msg01429.html In combination with our commit c8bb23cbdbe32f, this may lead to guest data corruption when using qcow2 images on XFS with aio=native. >>> >>> Have we reported that upstream to the xfs folks? >> >> I’ve created an RH BZ here: >> >> https://bugzilla.redhat.com/show_bug.cgi?id=1765547 > > Currently "You are not authorized to access bug #1765547." for > anonymous browsers, just fyi. Err. Oops. That wasn’t my intention. I hate that misfeature. Thanks for telling me, I fixed it. :-) Max signature.asc Description: OpenPGP digital signature
Re: [RFC 0/3] block/file-posix: Work around XFS bug
On 25.10.19 15:56, Vladimir Sementsov-Ogievskiy wrote: > 25.10.2019 16:40, Vladimir Sementsov-Ogievskiy wrote: >> 25.10.2019 12:58, Max Reitz wrote: >>> Hi, >>> >>> It seems to me that there is a bug in Linux’s XFS kernel driver, as >>> I’ve explained here: >>> >>> https://lists.nongnu.org/archive/html/qemu-block/2019-10/msg01429.html >>> >>> In combination with our commit c8bb23cbdbe32f, this may lead to guest >>> data corruption when using qcow2 images on XFS with aio=native. >>> >>> We can’t wait until the XFS kernel driver is fixed, we should work >>> around the problem ourselves. >>> >>> This is an RFC for two reasons: >>> (1) I don’t know whether this is the right way to address the issue, >>> (2) Ideally, we should detect whether the XFS kernel driver is fixed and >>> if so stop applying the workaround. >>> I don’t know how we would go about this, so this series doesn’t do >>> it. (Hence it’s an RFC.) >>> (3) Perhaps it’s a bit of a layering violation to let the file-posix >>> driver access and modify a BdrvTrackedRequest object. >>> >>> As for how we can address the issue, I see three ways: >>> (1) The one presented in this series: On XFS with aio=native, we extend >>> tracked requests for post-EOF fallocate() calls (i.e., write-zero >>> operations) to reach until infinity (INT64_MAX in practice), mark >>> them serializing and wait for other conflicting requests. >>> >>> Advantages: >>> + Limits the impact to very specific cases >>> (And that means it wouldn’t hurt too much to keep this workaround >>> even when the XFS driver has been fixed) >>> + Works around the bug where it happens, namely in file-posix >>> >>> Disadvantages: >>> - A bit complex >>> - A bit of a layering violation (should file-posix have access to >>> tracked requests?) >>> >>> (2) Always skip qcow2’s handle_alloc_space() on XFS. The XFS bug only >>> becomes visible due to that function: I don’t think qcow2 writes >>> zeroes in any other I/O path, and raw images are fixed in size so >>> post-EOF writes won’t happen. >>> >>> Advantages: >>> + Maybe simpler, depending on how difficult it is to handle the >>> layering violation >>> + Also fixes the performance problem of handle_alloc_space() being >>> slow on ppc64+XFS. >>> >>> Disadvantages: >>> - Huge layering violation because qcow2 would need to know whether >>> the image is stored on XFS or not. >>> - We’d definitely want to skip this workaround when the XFS driver >>> has been fixed, so we need some method to find out whether it has >>> >>> (3) Drop handle_alloc_space(), i.e. revert c8bb23cbdbe32f. >>> To my knowledge I’m the only one who has provided any benchmarks for >>> this commit, and even then I was a bit skeptical because it performs >>> well in some cases and bad in others. I concluded that it’s >>> probably worth it because the “some cases” are more likely to occur. >>> >>> Now we have this problem of corruption here (granted due to a bug in >>> the XFS driver), and another report of massively degraded >>> performance on ppc64 >>> (https://bugzilla.redhat.com/show_bug.cgi?id=1745823 – sorry, a >>> private BZ; I hate that :-/ The report is about 40 % worse >>> performance for an in-guest fio write benchmark.) >>> >>> So I have to ask the question about what the justification for >>> keeping c8bb23cbdbe32f is. How much does performance increase with >>> it actually? (On non-(ppc64+XFS) machines, obviously) >>> >>> Advantages: >>> + Trivial >>> + No layering violations >>> + We wouldn’t need to keep track of whether the kernel bug has been >>> fixed or not >>> + Fixes the ppc64+XFS performance problem >>> >>> Disadvantages: >>> - Reverts cluster allocation performance to pre-c8bb23cbdbe32f >>> levels, whatever that means >>> >>> So this is the main reason this is an RFC: What should we do? Is (1) >>> really the best choice? >>> >>> >>> In any case, I’ve ran the test case I showed in >>> https://lists.nongnu.org/archive/html/qemu-block/2019-10/msg01282.html >>> more than ten times with this series applied and the installation >>> succeeded every time. (Without this series, it fails like every other >>> time.) >>> >>> >> >> Hi! >> >> First, great thanks for your investigation! >> >> We need c8bb23cbdbe3 patch, because we use 1M clusters, and zeroing 1M is >> significant >> in time. >> >> I've tested a bit: >> >> test: >> for img in /ssd/test.img /test.img; do for cl in 64K 1M; do for step in 4K >> 64K 1M; do ./qemu-img create -f qcow2 -o cluster_size=$cl $img 15G > >> /dev/null; printf '%-15s%-7s%-10s : ' $img cl=$cl step=$step; ./qemu-img >> bench -c $((15 * 1024)) -n -s 4K -S $step -t none -w $img | tail -1 | awk >> '{print $4}'; done; done; done >> >> on master: >> >> /ssd/test.img cl=6
Re: [PATCH 2/3] qcow2: Assert that qcow2_cache_get() callers hold s->lock
Hi Kevin, On Wed, Oct 23, 2019 at 05:37:49PM +0200, Kevin Wolf wrote: > > qcow2_cache_do_get() requires that s->lock is locked because it can > > yield between picking a cache entry and actually taking ownership of it > > by setting offset and increasing the reference count. > > > > Add an assertion to make sure the caller really holds the lock. The > > function can be called outside of coroutine context, where bdrv_pread > > and flushes become synchronous operations. The lock cannot and need not > > be taken in this case. > I'm still running tests to see if any other code paths trigger the > assertion, but image creation calls this without the lock held (which is > harmless because nobody else knows about the image so there won't be > concurrent requests). The following patch is needed additionally to make > image creation work with the new assertion. I can confirm that with all four patches corruption does no longer occur as of commit 69f47505ee66afaa513305de0c1895a224e52c45. Removing only 3/3 (qcow2: Fix corruption bug in qcow2_detect_metadata_preallocation()) the assertion triggers after a few seconds, leaving behind a few leaked clusters but no errors in the image. (qemu) qemu-system-x86_64:qemu/include/qemu/coroutine.h:175: qemu_co_mutex_assert_locked: Assertion `mutex->locked && mutex->holder == qemu_coroutine_self()' failed. Aborted (core dumped) $ qemu-img check qtest.qcow2 Leaked cluster 169257 refcount=3 reference=2 Leaked cluster 172001 refcount=1 reference=0 Leaked cluster 172002 refcount=1 reference=0 Leaked cluster 172003 refcount=1 reference=0 Leaked cluster 172004 refcount=1 reference=0 Leaked cluster 172005 refcount=1 reference=0 Leaked cluster 172006 refcount=1 reference=0 Leaked cluster 172007 refcount=1 reference=0 Leaked cluster 172008 refcount=1 reference=0 Leaked cluster 172009 refcount=1 reference=0 Leaked cluster 172010 refcount=1 reference=0 Leaked cluster 172011 refcount=1 reference=0 Leaked cluster 172012 refcount=1 reference=0 13 leaked clusters were found on the image. This means waste of disk space, but no harm to data. 255525/327680 = 77.98% allocated, 3.22% fragmented, 0.00% compressed clusters Image end offset: 17106403328 I was going to test with master as well but got overtaken by v2. Will move on to test v2 now. :) Series: Tested-by: Michael Weiser No biggie but if there's a chance could you switch my address to the above? -- Thanks, Michael
Re: [RFC 0/3] block/file-posix: Work around XFS bug
On Fri, 25 Oct 2019 at 15:16, Max Reitz wrote: > > On 25.10.19 15:46, Peter Maydell wrote: > > On Fri, 25 Oct 2019 at 11:09, Max Reitz wrote: > >> > >> Hi, > >> > >> It seems to me that there is a bug in Linux’s XFS kernel driver, as > >> I’ve explained here: > >> > >> https://lists.nongnu.org/archive/html/qemu-block/2019-10/msg01429.html > >> > >> In combination with our commit c8bb23cbdbe32f, this may lead to guest > >> data corruption when using qcow2 images on XFS with aio=native. > > > > Have we reported that upstream to the xfs folks? > > I’ve created an RH BZ here: > > https://bugzilla.redhat.com/show_bug.cgi?id=1765547 Currently "You are not authorized to access bug #1765547." for anonymous browsers, just fyi. thanks -- PMM
Re: [RFC 0/3] block/file-posix: Work around XFS bug
On 25.10.19 15:46, Peter Maydell wrote: > On Fri, 25 Oct 2019 at 11:09, Max Reitz wrote: >> >> Hi, >> >> It seems to me that there is a bug in Linux’s XFS kernel driver, as >> I’ve explained here: >> >> https://lists.nongnu.org/archive/html/qemu-block/2019-10/msg01429.html >> >> In combination with our commit c8bb23cbdbe32f, this may lead to guest >> data corruption when using qcow2 images on XFS with aio=native. > > Have we reported that upstream to the xfs folks? I’ve created an RH BZ here: https://bugzilla.redhat.com/show_bug.cgi?id=1765547 So at least some XFS folks are aware of it (and I trust them to inform the others as necessary :-)). (Eric Sandeen has been part of the discussion for the last couple of days already.) Max signature.asc Description: OpenPGP digital signature
Re: [PULL 00/19] Trivial branch patches
On Fri, 25 Oct 2019 at 09:35, Laurent Vivier wrote: > > The following changes since commit 58560ad254fbda71d4daa6622d71683190070ee2: > > Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-4.2-20191024' > into staging (2019-10-24 16:22:58 +0100) > > are available in the Git repository at: > > git://github.com/vivier/qemu.git tags/trivial-branch-pull-request > > for you to fetch changes up to fabb862f76f093cdd1610571de9ba714dc1c: > > hw/rtc/aspeed_rtc: Remove unused includes (2019-10-24 20:35:28 +0200) > > > Fix typos and docs, trivial changes and RTC devices split > > > Applied, thanks. Please update the changelog at https://wiki.qemu.org/ChangeLog/4.2 for any user-visible changes. -- PMM
Re: [PATCH v14 1/9] esp: add pseudo-DMA as used by Macintosh
On 22/10/19 13:17, Laurent Vivier wrote: > +if (s->dma_memory_read) { > +s->dma_memory_read(s->dma_opaque, &s->cmdbuf[s->cmdlen], len); > +} else { > +set_pdma(s, CMD, s->cmdlen, len); > +s->pdma_cb = do_dma_pdma_cb; > +esp_raise_drq(s); > +return; > +} > +trace_esp_handle_ti_cmd(s->cmdlen); > +s->ti_size = 0; > +s->cmdlen = 0; > +s->do_cmd = 0; > +do_cmd(s, s->cmdbuf); > return; Can you explain these lines after s->dma_memory_read? I suppose they are related to > -} > -if (s->do_cmd) { > +} else if (s->do_cmd) { If so, it would be nice to make those a separate patch. Otherwise seems okay. Paolo
Re: [RFC 0/3] block/file-posix: Work around XFS bug
25.10.2019 16:40, Vladimir Sementsov-Ogievskiy wrote: > 25.10.2019 12:58, Max Reitz wrote: >> Hi, >> >> It seems to me that there is a bug in Linux’s XFS kernel driver, as >> I’ve explained here: >> >> https://lists.nongnu.org/archive/html/qemu-block/2019-10/msg01429.html >> >> In combination with our commit c8bb23cbdbe32f, this may lead to guest >> data corruption when using qcow2 images on XFS with aio=native. >> >> We can’t wait until the XFS kernel driver is fixed, we should work >> around the problem ourselves. >> >> This is an RFC for two reasons: >> (1) I don’t know whether this is the right way to address the issue, >> (2) Ideally, we should detect whether the XFS kernel driver is fixed and >> if so stop applying the workaround. >> I don’t know how we would go about this, so this series doesn’t do >> it. (Hence it’s an RFC.) >> (3) Perhaps it’s a bit of a layering violation to let the file-posix >> driver access and modify a BdrvTrackedRequest object. >> >> As for how we can address the issue, I see three ways: >> (1) The one presented in this series: On XFS with aio=native, we extend >> tracked requests for post-EOF fallocate() calls (i.e., write-zero >> operations) to reach until infinity (INT64_MAX in practice), mark >> them serializing and wait for other conflicting requests. >> >> Advantages: >> + Limits the impact to very specific cases >> (And that means it wouldn’t hurt too much to keep this workaround >> even when the XFS driver has been fixed) >> + Works around the bug where it happens, namely in file-posix >> >> Disadvantages: >> - A bit complex >> - A bit of a layering violation (should file-posix have access to >> tracked requests?) >> >> (2) Always skip qcow2’s handle_alloc_space() on XFS. The XFS bug only >> becomes visible due to that function: I don’t think qcow2 writes >> zeroes in any other I/O path, and raw images are fixed in size so >> post-EOF writes won’t happen. >> >> Advantages: >> + Maybe simpler, depending on how difficult it is to handle the >> layering violation >> + Also fixes the performance problem of handle_alloc_space() being >> slow on ppc64+XFS. >> >> Disadvantages: >> - Huge layering violation because qcow2 would need to know whether >> the image is stored on XFS or not. >> - We’d definitely want to skip this workaround when the XFS driver >> has been fixed, so we need some method to find out whether it has >> >> (3) Drop handle_alloc_space(), i.e. revert c8bb23cbdbe32f. >> To my knowledge I’m the only one who has provided any benchmarks for >> this commit, and even then I was a bit skeptical because it performs >> well in some cases and bad in others. I concluded that it’s >> probably worth it because the “some cases” are more likely to occur. >> >> Now we have this problem of corruption here (granted due to a bug in >> the XFS driver), and another report of massively degraded >> performance on ppc64 >> (https://bugzilla.redhat.com/show_bug.cgi?id=1745823 – sorry, a >> private BZ; I hate that :-/ The report is about 40 % worse >> performance for an in-guest fio write benchmark.) >> >> So I have to ask the question about what the justification for >> keeping c8bb23cbdbe32f is. How much does performance increase with >> it actually? (On non-(ppc64+XFS) machines, obviously) >> >> Advantages: >> + Trivial >> + No layering violations >> + We wouldn’t need to keep track of whether the kernel bug has been >> fixed or not >> + Fixes the ppc64+XFS performance problem >> >> Disadvantages: >> - Reverts cluster allocation performance to pre-c8bb23cbdbe32f >> levels, whatever that means >> >> So this is the main reason this is an RFC: What should we do? Is (1) >> really the best choice? >> >> >> In any case, I’ve ran the test case I showed in >> https://lists.nongnu.org/archive/html/qemu-block/2019-10/msg01282.html >> more than ten times with this series applied and the installation >> succeeded every time. (Without this series, it fails like every other >> time.) >> >> > > Hi! > > First, great thanks for your investigation! > > We need c8bb23cbdbe3 patch, because we use 1M clusters, and zeroing 1M is > significant > in time. > > I've tested a bit: > > test: > for img in /ssd/test.img /test.img; do for cl in 64K 1M; do for step in 4K > 64K 1M; do ./qemu-img create -f qcow2 -o cluster_size=$cl $img 15G > > /dev/null; printf '%-15s%-7s%-10s : ' $img cl=$cl step=$step; ./qemu-img > bench -c $((15 * 1024)) -n -s 4K -S $step -t none -w $img | tail -1 | awk > '{print $4}'; done; done; done > > on master: > > /ssd/test.img cl=64K step=4K : 0.291 > /ssd/test.img cl=64K step=64K : 0.813 > /ssd/test.img cl=64K step=1M : 2.799 > /ssd/test.img cl=1M step=4K : 0.217 > /ssd/test.img
Re: [RFC 0/3] block/file-posix: Work around XFS bug
On Fri, 25 Oct 2019 at 11:09, Max Reitz wrote: > > Hi, > > It seems to me that there is a bug in Linux’s XFS kernel driver, as > I’ve explained here: > > https://lists.nongnu.org/archive/html/qemu-block/2019-10/msg01429.html > > In combination with our commit c8bb23cbdbe32f, this may lead to guest > data corruption when using qcow2 images on XFS with aio=native. Have we reported that upstream to the xfs folks? thanks -- PMM
[PULL 6/7] coroutine: Add qemu_co_mutex_assert_locked()
Some functions require that the caller holds a certain CoMutex for them to operate correctly. Add a function so that they can assert the lock is really held. Cc: qemu-sta...@nongnu.org Signed-off-by: Kevin Wolf Tested-by: Michael Weiser Reviewed-by: Michael Weiser Reviewed-by: Vladimir Sementsov-Ogievskiy Reviewed-by: Denis V. Lunev Reviewed-by: Max Reitz --- include/qemu/coroutine.h | 15 +++ 1 file changed, 15 insertions(+) diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h index 8d55663062..dfd261c5b1 100644 --- a/include/qemu/coroutine.h +++ b/include/qemu/coroutine.h @@ -167,6 +167,21 @@ void coroutine_fn qemu_co_mutex_lock(CoMutex *mutex); */ void coroutine_fn qemu_co_mutex_unlock(CoMutex *mutex); +/** + * Assert that the current coroutine holds @mutex. + */ +static inline coroutine_fn void qemu_co_mutex_assert_locked(CoMutex *mutex) +{ +/* + * mutex->holder doesn't need any synchronisation if the assertion holds + * true because the mutex protects it. If it doesn't hold true, we still + * don't mind if another thread takes or releases mutex behind our back, + * because the condition will be false no matter whether we read NULL or + * the pointer for any other coroutine. + */ +assert(atomic_read(&mutex->locked) && + mutex->holder == qemu_coroutine_self()); +} /** * CoQueues are a mechanism to queue coroutines in order to continue executing -- 2.20.1
[PULL 1/7] qapi: add support for blkreplay driver
From: Pavel Dovgalyuk This patch adds support for blkreplay driver to the blockdev options. Now blkreplay can be used with -blockdev command line option in the following format: -blockdev driver=blkreplay,image=file-node-name,node-name=replay-node-name This option makes possible implementation of the better command line support for record/replay invocations. Signed-off-by: Pavel Dovgalyuk Signed-off-by: Kevin Wolf --- qapi/block-core.json | 18 -- 1 file changed, 16 insertions(+), 2 deletions(-) diff --git a/qapi/block-core.json b/qapi/block-core.json index b274aef713..aa97ee2641 100644 --- a/qapi/block-core.json +++ b/qapi/block-core.json @@ -2883,12 +2883,13 @@ # @nvme: Since 2.12 # @copy-on-read: Since 3.0 # @blklogwrites: Since 3.0 +# @blkreplay: Since 4.2 # # Since: 2.9 ## { 'enum': 'BlockdevDriver', - 'data': [ 'blkdebug', 'blklogwrites', 'blkverify', 'bochs', 'cloop', -'copy-on-read', 'dmg', 'file', 'ftp', 'ftps', 'gluster', + 'data': [ 'blkdebug', 'blklogwrites', 'blkreplay', 'blkverify', 'bochs', +'cloop', 'copy-on-read', 'dmg', 'file', 'ftp', 'ftps', 'gluster', 'host_cdrom', 'host_device', 'http', 'https', 'iscsi', 'luks', 'nbd', 'nfs', 'null-aio', 'null-co', 'nvme', 'parallels', 'qcow', 'qcow2', 'qed', 'quorum', 'raw', 'rbd', @@ -3501,6 +3502,18 @@ 'data': { 'test': 'BlockdevRef', 'raw': 'BlockdevRef' } } +## +# @BlockdevOptionsBlkreplay: +# +# Driver specific block device options for blkreplay. +# +# @image: disk image which should be controlled with blkreplay +# +# Since: 4.2 +## +{ 'struct': 'BlockdevOptionsBlkreplay', + 'data': { 'image': 'BlockdevRef' } } + ## # @QuorumReadPattern: # @@ -4028,6 +4041,7 @@ 'blkdebug': 'BlockdevOptionsBlkdebug', 'blklogwrites':'BlockdevOptionsBlklogwrites', 'blkverify': 'BlockdevOptionsBlkverify', + 'blkreplay': 'BlockdevOptionsBlkreplay', 'bochs': 'BlockdevOptionsGenericFormat', 'cloop': 'BlockdevOptionsGenericFormat', 'copy-on-read':'BlockdevOptionsGenericFormat', -- 2.20.1
[PULL 2/7] iotests: Skip read-only cases in 118 when run as root
Some tests in 118 use chmod to remove write permissions from the file and assume that the image can indeed not be opened read-write afterwards. This doesn't work when the test is run as root, because root can still open the file as writable even when the permission bit isn't set. Introduce a @skip_if_root decorator and use it in 118 to skip the tests in question when the script is run as root. Signed-off-by: Kevin Wolf Reviewed-by: Philippe Mathieu-Daudé --- tests/qemu-iotests/118| 3 +++ tests/qemu-iotests/iotests.py | 10 ++ 2 files changed, 13 insertions(+) diff --git a/tests/qemu-iotests/118 b/tests/qemu-iotests/118 index ea0b326ae0..e20080e9a6 100755 --- a/tests/qemu-iotests/118 +++ b/tests/qemu-iotests/118 @@ -446,6 +446,7 @@ class TestChangeReadOnly(ChangeBaseClass): self.assert_qmp(result, 'return[0]/inserted/ro', True) self.assert_qmp(result, 'return[0]/inserted/image/filename', new_img) +@iotests.skip_if_user_is_root def test_rw_ro_retain(self): os.chmod(new_img, 0o444) self.vm.add_drive(old_img, 'media=disk', 'none') @@ -530,6 +531,7 @@ class TestChangeReadOnly(ChangeBaseClass): self.assert_qmp(result, 'return[0]/inserted/ro', True) self.assert_qmp(result, 'return[0]/inserted/image/filename', new_img) +@iotests.skip_if_user_is_root def test_make_ro_rw(self): os.chmod(new_img, 0o444) self.vm.add_drive(old_img, 'media=disk', 'none') @@ -571,6 +573,7 @@ class TestChangeReadOnly(ChangeBaseClass): self.assert_qmp(result, 'return[0]/inserted/ro', True) self.assert_qmp(result, 'return[0]/inserted/image/filename', new_img) +@iotests.skip_if_user_is_root def test_make_ro_rw_by_retain(self): os.chmod(new_img, 0o444) self.vm.add_drive(old_img, 'media=disk', 'none') diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py index 693fde155a..709def4d5d 100644 --- a/tests/qemu-iotests/iotests.py +++ b/tests/qemu-iotests/iotests.py @@ -931,6 +931,16 @@ def skip_if_unsupported(required_formats=[], read_only=False): return func_wrapper return skip_test_decorator +def skip_if_user_is_root(func): +'''Skip Test Decorator + Runs the test only without root permissions''' +def func_wrapper(*args, **kwargs): +if os.getuid() == 0: +case_notrun('{}: cannot be run as root'.format(args[0])) +else: +return func(*args, **kwargs) +return func_wrapper + def execute_unittest(output, verbosity, debug): runner = unittest.TextTestRunner(stream=output, descriptions=True, verbosity=verbosity) -- 2.20.1
[PULL 0/7] Block layer patches
The following changes since commit 7bc8f9734213b76e76631a483be13d6737c2adbc: Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20191025' into staging (2019-10-25 13:12:16 +0100) are available in the Git repository at: git://repo.or.cz/qemu/kevin.git tags/for-upstream for you to fetch changes up to 5e9785505210e2477e590e61b1ab100d0ec22b01: qcow2: Fix corruption bug in qcow2_detect_metadata_preallocation() (2019-10-25 15:18:55 +0200) Block layer patches: - qcow2: Fix data corruption bug that is triggered in partial cluster allocation with default options - qapi: add support for blkreplay driver - doc: Describe missing generic -blockdev options - iotests: Fix 118 when run as root - Minor code cleanups Kevin Wolf (5): iotests: Skip read-only cases in 118 when run as root blockdev: Use error_report() in hmp_commit() doc: Describe missing generic -blockdev options coroutine: Add qemu_co_mutex_assert_locked() qcow2: Fix corruption bug in qcow2_detect_metadata_preallocation() Pavel Dovgaluk (1): qapi: add support for blkreplay driver Vladimir Sementsov-Ogievskiy (1): block/backup: drop dead code from backup_job_create qapi/block-core.json | 18 -- include/qemu/coroutine.h | 15 +++ block/backup.c| 5 + block/qcow2-refcount.c| 2 ++ block/qcow2.c | 3 ++- blockdev.c| 7 +++ qemu-options.hx | 22 +- tests/qemu-iotests/118| 3 +++ tests/qemu-iotests/iotests.py | 10 ++ 9 files changed, 73 insertions(+), 12 deletions(-)
[PULL 7/7] qcow2: Fix corruption bug in qcow2_detect_metadata_preallocation()
qcow2_detect_metadata_preallocation() calls qcow2_get_refcount() which requires s->lock to be taken to protect its accesses to the refcount table and refcount blocks. However, nothing in this code path actually took the lock. This could cause the same cache entry to be used by two requests at the same time, for different tables at different offsets, resulting in image corruption. As it would be preferable to base the detection on consistent data (even though it's just heuristics), let's take the lock not only around the qcow2_get_refcount() calls, but around the whole function. This patch takes the lock in qcow2_co_block_status() earlier and asserts in qcow2_detect_metadata_preallocation() that we hold the lock. Fixes: 69f47505ee66afaa513305de0c1895a224e52c45 Cc: qemu-sta...@nongnu.org Reported-by: Michael Weiser Signed-off-by: Kevin Wolf Tested-by: Michael Weiser Reviewed-by: Michael Weiser Reviewed-by: Vladimir Sementsov-Ogievskiy Reviewed-by: Max Reitz --- block/qcow2-refcount.c | 2 ++ block/qcow2.c | 3 ++- 2 files changed, 4 insertions(+), 1 deletion(-) diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c index ef965d7895..0d64bf5a5e 100644 --- a/block/qcow2-refcount.c +++ b/block/qcow2-refcount.c @@ -3455,6 +3455,8 @@ int qcow2_detect_metadata_preallocation(BlockDriverState *bs) int64_t i, end_cluster, cluster_count = 0, threshold; int64_t file_length, real_allocation, real_clusters; +qemu_co_mutex_assert_locked(&s->lock); + file_length = bdrv_getlength(bs->file->bs); if (file_length < 0) { return file_length; diff --git a/block/qcow2.c b/block/qcow2.c index 8b05933565..0bc69e6996 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -1916,6 +1916,8 @@ static int coroutine_fn qcow2_co_block_status(BlockDriverState *bs, unsigned int bytes; int status = 0; +qemu_co_mutex_lock(&s->lock); + if (!s->metadata_preallocation_checked) { ret = qcow2_detect_metadata_preallocation(bs); s->metadata_preallocation = (ret == 1); @@ -1923,7 +1925,6 @@ static int coroutine_fn qcow2_co_block_status(BlockDriverState *bs, } bytes = MIN(INT_MAX, count); -qemu_co_mutex_lock(&s->lock); ret = qcow2_get_cluster_offset(bs, offset, &bytes, &cluster_offset); qemu_co_mutex_unlock(&s->lock); if (ret < 0) { -- 2.20.1
[PULL 5/7] doc: Describe missing generic -blockdev options
We added more generic options after introducing -blockdev and forgot to update the documentation (man page and --help output) accordingly. Do that now. Signed-off-by: Kevin Wolf Reviewed-by: Peter Maydell --- qemu-options.hx | 22 +- 1 file changed, 21 insertions(+), 1 deletion(-) diff --git a/qemu-options.hx b/qemu-options.hx index 996b6fba74..19709f973d 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -864,7 +864,8 @@ ETEXI DEF("blockdev", HAS_ARG, QEMU_OPTION_blockdev, "-blockdev [driver=]driver[,node-name=N][,discard=ignore|unmap]\n" " [,cache.direct=on|off][,cache.no-flush=on|off]\n" -" [,read-only=on|off][,detect-zeroes=on|off|unmap]\n" +" [,read-only=on|off][,auto-read-only=on|off]\n" +" [,force-share=on|off][,detect-zeroes=on|off|unmap]\n" " [,driver specific parameters...]\n" "configure a block backend\n", QEMU_ARCH_ALL) STEXI @@ -900,6 +901,25 @@ name is not intended to be predictable and changes between QEMU invocations. For the top level, an explicit node name must be specified. @item read-only Open the node read-only. Guest write attempts will fail. + +Note that some block drivers support only read-only access, either generally or +in certain configurations. In this case, the default value +@option{read-only=off} does not work and the option must be specified +explicitly. +@item auto-read-only +If @option{auto-read-only=on} is set, QEMU may fall back to read-only usage +even when @option{read-only=off} is requested, or even switch between modes as +needed, e.g. depending on whether the image file is writable or whether a +writing user is attached to the node. +@item force-share +Override the image locking system of QEMU by forcing the node to utilize +weaker shared access for permissions where it would normally request exclusive +access. When there is the potential for multiple instances to have the same +file open (whether this invocation of QEMU is the first or the second +instance), both instances must permit shared access for the second instance to +succeed at opening the file. + +Enabling @option{force-share=on} requires @option{read-only=on}. @item cache.direct The host page cache can be avoided with @option{cache.direct=on}. This will attempt to do disk IO directly to the guest's memory. QEMU may still perform an -- 2.20.1
[PULL 3/7] blockdev: Use error_report() in hmp_commit()
Instead of using monitor_printf() to report errors, hmp_commit() should use error_report() like other places do. Signed-off-by: Kevin Wolf Reviewed-by: Eric Blake Reviewed-by: Philippe Mathieu-Daudé --- blockdev.c | 7 +++ 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/blockdev.c b/blockdev.c index 03c7cd7651..ba491e3ef5 100644 --- a/blockdev.c +++ b/blockdev.c @@ -1088,11 +1088,11 @@ void hmp_commit(Monitor *mon, const QDict *qdict) blk = blk_by_name(device); if (!blk) { -monitor_printf(mon, "Device '%s' not found\n", device); +error_report("Device '%s' not found", device); return; } if (!blk_is_available(blk)) { -monitor_printf(mon, "Device '%s' has no medium\n", device); +error_report("Device '%s' has no medium", device); return; } @@ -1105,8 +1105,7 @@ void hmp_commit(Monitor *mon, const QDict *qdict) aio_context_release(aio_context); } if (ret < 0) { -monitor_printf(mon, "'commit' error for '%s': %s\n", device, - strerror(-ret)); +error_report("'commit' error for '%s': %s", device, strerror(-ret)); } } -- 2.20.1
Re: [RFC 0/3] block/file-posix: Work around XFS bug
25.10.2019 12:58, Max Reitz wrote: > Hi, > > It seems to me that there is a bug in Linux’s XFS kernel driver, as > I’ve explained here: > > https://lists.nongnu.org/archive/html/qemu-block/2019-10/msg01429.html > > In combination with our commit c8bb23cbdbe32f, this may lead to guest > data corruption when using qcow2 images on XFS with aio=native. > > We can’t wait until the XFS kernel driver is fixed, we should work > around the problem ourselves. > > This is an RFC for two reasons: > (1) I don’t know whether this is the right way to address the issue, > (2) Ideally, we should detect whether the XFS kernel driver is fixed and > if so stop applying the workaround. > I don’t know how we would go about this, so this series doesn’t do > it. (Hence it’s an RFC.) > (3) Perhaps it’s a bit of a layering violation to let the file-posix > driver access and modify a BdrvTrackedRequest object. > > As for how we can address the issue, I see three ways: > (1) The one presented in this series: On XFS with aio=native, we extend > tracked requests for post-EOF fallocate() calls (i.e., write-zero > operations) to reach until infinity (INT64_MAX in practice), mark > them serializing and wait for other conflicting requests. > > Advantages: > + Limits the impact to very specific cases >(And that means it wouldn’t hurt too much to keep this workaround >even when the XFS driver has been fixed) > + Works around the bug where it happens, namely in file-posix > > Disadvantages: > - A bit complex > - A bit of a layering violation (should file-posix have access to >tracked requests?) > > (2) Always skip qcow2’s handle_alloc_space() on XFS. The XFS bug only > becomes visible due to that function: I don’t think qcow2 writes > zeroes in any other I/O path, and raw images are fixed in size so > post-EOF writes won’t happen. > > Advantages: > + Maybe simpler, depending on how difficult it is to handle the >layering violation > + Also fixes the performance problem of handle_alloc_space() being >slow on ppc64+XFS. > > Disadvantages: > - Huge layering violation because qcow2 would need to know whether >the image is stored on XFS or not. > - We’d definitely want to skip this workaround when the XFS driver >has been fixed, so we need some method to find out whether it has > > (3) Drop handle_alloc_space(), i.e. revert c8bb23cbdbe32f. > To my knowledge I’m the only one who has provided any benchmarks for > this commit, and even then I was a bit skeptical because it performs > well in some cases and bad in others. I concluded that it’s > probably worth it because the “some cases” are more likely to occur. > > Now we have this problem of corruption here (granted due to a bug in > the XFS driver), and another report of massively degraded > performance on ppc64 > (https://bugzilla.redhat.com/show_bug.cgi?id=1745823 – sorry, a > private BZ; I hate that :-/ The report is about 40 % worse > performance for an in-guest fio write benchmark.) > > So I have to ask the question about what the justification for > keeping c8bb23cbdbe32f is. How much does performance increase with > it actually? (On non-(ppc64+XFS) machines, obviously) > > Advantages: > + Trivial > + No layering violations > + We wouldn’t need to keep track of whether the kernel bug has been >fixed or not > + Fixes the ppc64+XFS performance problem > > Disadvantages: > - Reverts cluster allocation performance to pre-c8bb23cbdbe32f >levels, whatever that means > > So this is the main reason this is an RFC: What should we do? Is (1) > really the best choice? > > > In any case, I’ve ran the test case I showed in > https://lists.nongnu.org/archive/html/qemu-block/2019-10/msg01282.html > more than ten times with this series applied and the installation > succeeded every time. (Without this series, it fails like every other > time.) > > Hi! First, great thanks for your investigation! We need c8bb23cbdbe3 patch, because we use 1M clusters, and zeroing 1M is significant in time. I've tested a bit: test: for img in /ssd/test.img /test.img; do for cl in 64K 1M; do for step in 4K 64K 1M; do ./qemu-img create -f qcow2 -o cluster_size=$cl $img 15G > /dev/null; printf '%-15s%-7s%-10s : ' $img cl=$cl step=$step; ./qemu-img bench -c $((15 * 1024)) -n -s 4K -S $step -t none -w $img | tail -1 | awk '{print $4}'; done; done; done on master: /ssd/test.img cl=64K step=4K: 0.291 /ssd/test.img cl=64K step=64K : 0.813 /ssd/test.img cl=64K step=1M: 2.799 /ssd/test.img cl=1M step=4K: 0.217 /ssd/test.img cl=1M step=64K : 0.332 /ssd/test.img cl=1M step=1M: 0.685 /test.img cl=64K step=4K: 1.751 /test.img cl=64K step=64K : 14.811 /test.img cl=64K step=1M
[PULL 4/7] block/backup: drop dead code from backup_job_create
From: Vladimir Sementsov-Ogievskiy After commit 00e30f05de1d195, there is no more "goto error" points after job creation, so after "error:" @job is always NULL and we don't need roll-back job creation. Reported-by: Coverity (CID 1406402) Signed-off-by: Vladimir Sementsov-Ogievskiy Acked-by: Stefano Garzarella Signed-off-by: Kevin Wolf --- block/backup.c | 5 + 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/block/backup.c b/block/backup.c index dddcf77f53..cf62b1a38c 100644 --- a/block/backup.c +++ b/block/backup.c @@ -474,10 +474,7 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs, if (sync_bitmap) { bdrv_reclaim_dirty_bitmap(sync_bitmap, NULL); } -if (job) { -backup_clean(&job->common.job); -job_early_fail(&job->common.job); -} else if (backup_top) { +if (backup_top) { bdrv_backup_top_drop(backup_top); } -- 2.20.1
Re: [PATCH v2 2/2] iotests: add 269 to check maximum of bitmaps in qcow2
On 14.10.19 13:51, Vladimir Sementsov-Ogievskiy wrote: > Check that it's impossible to create more persistent bitmaps than qcow2 > supports. > > Signed-off-by: Vladimir Sementsov-Ogievskiy > --- > tests/qemu-iotests/269 | 47 ++ > tests/qemu-iotests/269.out | 3 +++ > tests/qemu-iotests/group | 1 + > 3 files changed, 51 insertions(+) > create mode 100755 tests/qemu-iotests/269 > create mode 100644 tests/qemu-iotests/269.out Is there no way to make this test any faster, e.g. by creating like 65534 bitmaps with dd and a binary blob? (Similarly to what I do in “iotests: Test qcow2's snapshot table handling”) This is such an edge case, but running the test took 3:46 min before patch 1 (which I already find much too long), and 8:13 min afterwards (on my machine). (To be honest, if we take this test as-is, I’m probably just never going to run it.) Max signature.asc Description: OpenPGP digital signature
Re: [PATCH v2 1/2] qcow2-bitmaps: fix qcow2_can_store_new_dirty_bitmap
25.10.2019 15:50, Max Reitz wrote: > On 14.10.19 13:51, Vladimir Sementsov-Ogievskiy wrote: >> qcow2_can_store_new_dirty_bitmap works wrong, as it considers only >> bitmaps already stored in the qcow2 image and ignores persistent >> BdrvDirtyBitmap objects. >> >> So, let's instead count persistent BdrvDirtyBitmaps. We load all qcow2 >> bitmaps on open, so there should not be any bitmap in the image for >> which we don't have BdrvDirtyBitmaps version. If it is - it's a kind of >> corruption, and no reason to check for corruptions here (open() and >> close() are better places for it). >> >> Signed-off-by: Vladimir Sementsov-Ogievskiy >> --- >> block/qcow2-bitmap.c | 41 ++--- >> 1 file changed, 18 insertions(+), 23 deletions(-) > > Am I right in interpreting qcow2_load_dirty_bitmaps() in such a way that > every persistent dirty bitmap will cause a runtime dirty bitmap to be > created? Yes, we load all the bitmaps. Every bitmap stored in qcow2 image is loaded and corresponding BdrvDirtyBitmap is created. If bitmap has IN_USE flag in the image, created BdrvDirtyBitmap is marked inconsistent, but it is still there. If bitmap doesn't have AUTO flag, it becomes disabled bitmap. > > If so: > > Reviewed-by: Max Reitz > -- Best regards, Vladimir
Re: [PATCH v2 1/2] qcow2-bitmaps: fix qcow2_can_store_new_dirty_bitmap
On 14.10.19 13:51, Vladimir Sementsov-Ogievskiy wrote: > qcow2_can_store_new_dirty_bitmap works wrong, as it considers only > bitmaps already stored in the qcow2 image and ignores persistent > BdrvDirtyBitmap objects. > > So, let's instead count persistent BdrvDirtyBitmaps. We load all qcow2 > bitmaps on open, so there should not be any bitmap in the image for > which we don't have BdrvDirtyBitmaps version. If it is - it's a kind of > corruption, and no reason to check for corruptions here (open() and > close() are better places for it). > > Signed-off-by: Vladimir Sementsov-Ogievskiy > --- > block/qcow2-bitmap.c | 41 ++--- > 1 file changed, 18 insertions(+), 23 deletions(-) Am I right in interpreting qcow2_load_dirty_bitmaps() in such a way that every persistent dirty bitmap will cause a runtime dirty bitmap to be created? If so: Reviewed-by: Max Reitz signature.asc Description: OpenPGP digital signature
Re: [PATCH 2/3] qcow2: Assert that qcow2_cache_get() callers hold s->lock
Am 25.10.2019 um 12:35 hat Michael Weiser geschrieben: > I was going to test with master as well but got overtaken by v2. Will > move on to test v2 now. :) > > Series: > Tested-by: Michael Weiser Thanks for testing! The fix itself is unchanged in v2, so I assume the result will be the same, but testing it explicitly can't hurt. I'm going to send a pull request today, but if you're quick, I can wait for your results. > No biggie but if there's a chance could you switch my address to the > above? No problem, I've updated the address in the Reported-by tag. Kevin
Re: [PATCH v3 0/6] block-copy: memory limit
25.10.2019 15:31, Max Reitz wrote: > On 22.10.19 13:17, Vladimir Sementsov-Ogievskiy wrote: >> I'm going to bring block-status driven, async copying process to >> block-copy, to make it fast. The first step is to limit memory usage of >> backup, here is it. >> >> v3: >> 03: add Max's r-b >> 04: fix commit message and include guards, add Max's r-b >> 05-06: add Max's r-b >> >> v2: [mostly by Max's comments] >> Now based on master (Thank you Max!) >> 01: add Max's r-b >> 02: add Max's r-b >> 03: - refactor block_copy_do_copy goto/return >> - add small comment to block_copy_do_copy >> 04: - a lot of renaming and wording fixes >> - refactor to use "available" instead of "taken" >> - refactor co_get_from_shres >> 05: rebase on 04 changes >> 06: drop extra things from max_transfer calculation >> >> Vladimir Sementsov-Ogievskiy (6): >>block/block-copy: allocate buffer in block_copy_with_bounce_buffer >>block/block-copy: limit copy_range_size to 16 MiB >>block/block-copy: refactor copying >>util: introduce SharedResource >>block/block-copy: add memory limit >>block/block-copy: increase buffered copy request >> >> include/block/block-copy.h| 5 +- >> include/qemu/co-shared-resource.h | 71 >> block/block-copy.c| 182 +++--- >> util/qemu-co-shared-resource.c| 76 + >> block/trace-events| 6 +- >> util/Makefile.objs| 1 + >> 6 files changed, 249 insertions(+), 92 deletions(-) >> create mode 100644 include/qemu/co-shared-resource.h >> create mode 100644 util/qemu-co-shared-resource.c > > Thanks, applied to my block branch: > > https://git.xanclic.moe/XanClic/qemu/commits/branch/block > > Max > Thank you Max! -- Best regards, Vladimir
Re: [PATCH v3 0/6] block-copy: memory limit
On 22.10.19 13:17, Vladimir Sementsov-Ogievskiy wrote: > I'm going to bring block-status driven, async copying process to > block-copy, to make it fast. The first step is to limit memory usage of > backup, here is it. > > v3: > 03: add Max's r-b > 04: fix commit message and include guards, add Max's r-b > 05-06: add Max's r-b > > v2: [mostly by Max's comments] > Now based on master (Thank you Max!) > 01: add Max's r-b > 02: add Max's r-b > 03: - refactor block_copy_do_copy goto/return > - add small comment to block_copy_do_copy > 04: - a lot of renaming and wording fixes > - refactor to use "available" instead of "taken" > - refactor co_get_from_shres > 05: rebase on 04 changes > 06: drop extra things from max_transfer calculation > > Vladimir Sementsov-Ogievskiy (6): > block/block-copy: allocate buffer in block_copy_with_bounce_buffer > block/block-copy: limit copy_range_size to 16 MiB > block/block-copy: refactor copying > util: introduce SharedResource > block/block-copy: add memory limit > block/block-copy: increase buffered copy request > > include/block/block-copy.h| 5 +- > include/qemu/co-shared-resource.h | 71 > block/block-copy.c| 182 +++--- > util/qemu-co-shared-resource.c| 76 + > block/trace-events| 6 +- > util/Makefile.objs| 1 + > 6 files changed, 249 insertions(+), 92 deletions(-) > create mode 100644 include/qemu/co-shared-resource.h > create mode 100644 util/qemu-co-shared-resource.c Thanks, applied to my block branch: https://git.xanclic.moe/XanClic/qemu/commits/branch/block Max signature.asc Description: OpenPGP digital signature
Re: [PATCH v7 0/8] Packed virtqueue for virtio
On Fri, Oct 25, 2019 at 10:35:19AM +0200, Eugenio Pérez wrote: > Hi: > > This is an updated version of packed virtqueue support based on Wei and > Jason's > V6, mainly solving the clang leak detector error CI gave. Looks good, I will queue this up. It would be nice to add libqos based tests on top, based on Stefan's work. > Please review. > > Changes from V6: > - Commit reorder: Squash bugfix and sepparate big changes into smaller > commits. > > Changes from V5: > - Fix qemu's CI asan error. > - Move/copy rcu comments. > - Merge duplicated vdev->broken check between split and packet version. > > Eugenio Pérez (2): > virtio: Free blk virqueues at unrealize() > virtio: Free rnd virqueue at unrealize() > > Jason Wang (4): > virtio: basic packed virtqueue support > virtio: event suppression support for packed ring > vhost_net: enable packed ring support > virtio: add property to enable packed virtqueue > > Wei Xu (2): > virtio: basic structure for packed ring > virtio: device/driver area size calculation refactor for split ring > > hw/block/virtio-blk.c |7 +- > hw/char/virtio-serial-bus.c |2 +- > hw/net/vhost_net.c |2 + > hw/scsi/virtio-scsi.c |3 +- > hw/virtio/virtio-rng.c |1 + > hw/virtio/virtio.c | 1154 > ++- > include/hw/virtio/virtio.h | 14 +- > 7 files changed, 1045 insertions(+), 138 deletions(-) > > -- > 2.16.5
Re: [PATCH v4 00/16] libqos: add VIRTIO PCI 1.0 support
On Wed, Oct 23, 2019 at 11:04:09AM +0100, Stefan Hajnoczi wrote: > v4: > * Introduce bool d->features_negotiated so that tests can negotiate a >0 feature bit set in Legacy mode [Thomas] > * Make the FEATURES_OK code change in qvirtio_set_driver_ok() clearer and >mention it in the commit description [Thomas] > * Fix indentation in qvring_init() [Thomas] > v3: > * Now implements VIRTIO 1.0 fully (vring, device initialization). >This required several new patches to address the following issues: >1. Tests that do not negotiate features (non-compliant!) >2. Tests that access configuration space before feature negotiation > (non-compliant!) >3. Tests that set up virtqueues before feature negotiation (non-compliant!) >4. vring accesses require byte-swapping when VIRTIO 1.0 is used with a > big-endian guest because the qtest_readX/writeX() API automatically > converts to guest-endian >5. VIRTIO 1.0 has an additional FEATURES_OK step during Device > Initialization > * Change uint8_t bar_idx to int [Thomas] > * Document qpci_find_capability() [Thomas] > * Every commit tested with arm, ppc64, and x86_64 targets using: >git rebase -i origin/master -x 'make tests/qos-test && >QTEST_QEMU_BINARY=ppc64-softmmu/qemu-system-ppc64 tests/qos-test && >QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64 tests/qos-test' >QTEST_QEMU_BINARY=arm-softmmu/qemu-system-arm tests/qos-test' > v2: > * Fix checkpatch.pl issues, except MAINTAINERS file warning. libqos already >has maintainers and the new virtio-pci-modern.[ch] files don't need extra >entries since they are already covered by the existing libqos/ entry. > > New VIRTIO devices are Non-Transitional. This means they only expose the > VIRTIO 1.0 interface, not the Legacy interface. > > The libqos only supports Legacy and Transitional devices (in Legacy mode). > This patch series adds VIRTIO 1.0 support so that tests can run against > Non-Transitional devices too. Very nice, thanks! I'll queue this up in my tree. > The virtio-fs device is Non-Transitional, so this is a prerequisite for > virtio-fs qos tests. > > Stefan Hajnoczi (16): > tests/virtio-blk-test: read config space after feature negotiation > libqos: read QVIRTIO_MMIO_VERSION register > libqos: extend feature bits to 64-bit > virtio-scsi-test: add missing feature negotiation > tests/virtio-blk-test: set up virtqueue after feature negotiation > libqos: add missing virtio-9p feature negotiation > libqos: enforce Device Initialization order > libqos: implement VIRTIO 1.0 FEATURES_OK step > libqos: access VIRTIO 1.0 vring in little-endian > libqos: add iteration support to qpci_find_capability() > libqos: pass full QVirtQueue to set_queue_address() > libqos: add MSI-X callbacks to QVirtioPCIDevice > libqos: expose common virtqueue setup/cleanup functions > libqos: make the virtio-pci BAR index configurable > libqos: extract Legacy virtio-pci.c code > libqos: add VIRTIO PCI 1.0 support > > tests/Makefile.include | 1 + > tests/libqos/pci.h | 2 +- > tests/libqos/virtio-mmio.h | 1 + > tests/libqos/virtio-pci-modern.h | 17 ++ > tests/libqos/virtio-pci.h| 34 ++- > tests/libqos/virtio.h| 19 +- > tests/libqos/pci.c | 30 ++- > tests/libqos/virtio-9p.c | 6 + > tests/libqos/virtio-mmio.c | 38 ++- > tests/libqos/virtio-net.c| 6 +- > tests/libqos/virtio-pci-modern.c | 443 +++ > tests/libqos/virtio-pci.c| 105 +--- > tests/libqos/virtio.c| 160 --- > tests/virtio-blk-test.c | 66 +++-- > tests/virtio-scsi-test.c | 8 + > 15 files changed, 802 insertions(+), 134 deletions(-) > create mode 100644 tests/libqos/virtio-pci-modern.h > create mode 100644 tests/libqos/virtio-pci-modern.c > > -- > 2.21.0
Re: qemu crashing when attaching an ISO file to a virtio-scsi CD-ROM device through libvirt
I managed to upgrade to qemu 4.1 on a test KVM host and I can confirm I can't repro the issue in this version. Great news that is fixed in 4.1. Thanks everyone for your inputs and the fast replies. Kind regards, Fernando On vie, oct 25, 2019 at 12:28 PM, Fernando Casas Schössow wrote: Thanks for the reply Kevin. I will do my best to upgrade to 4.1, test again and report back if this is fixed or not in that version. Hopefully it is. Fernando On vie, oct 25, 2019 at 12:07 PM, Kevin Wolf wrote: Am 23.10.2019 um 19:28 hat Fernando Casas Schössow geschrieben: Hi John, Thanks for looking into this. I can quickly repro the problem with qemu 4.0 binary with debugging symbols enabled as I have it available already. Doing the same with qemu 4.1 or development head may be too much hassle but if it's really the only way I can give it try. We had a lot of iothread related fixes in 4.1, so this would really be the only way to tell if it's a bug that still exists. I suspect that it's already fixed (and to be more precise, I assume that commit d0ee0204f fixed it). Kevin
Re: [PATCH v2 0/2] qcow2: Fix image corruption bug in 4.1
On 24.10.19 16:26, Kevin Wolf wrote: > This series fixes an image corruption bug that was introduced in commit > 69f47505e ('block: avoid recursive block_status call if possible'), > first contained in the QEMU 4.1.0 release. > > This bug was reported by Michael Weiser on Launchpad: > https://bugs.launchpad.net/qemu/+bug/1846427 > > v2: > > - Dropped the assertion in qcow2_cache_do_get() for now. Making sure > that it actually holds true for all callers requires more work and > getting the corruption fix in quickly is important. > > - Use atomic_read() and add comment to qemu_co_mutex_assert_locked() > implementation [Denis] > > Kevin Wolf (2): > coroutine: Add qemu_co_mutex_assert_locked() > qcow2: Fix corruption bug in qcow2_detect_metadata_preallocation() > > include/qemu/coroutine.h | 15 +++ > block/qcow2-refcount.c | 2 ++ > block/qcow2.c| 3 ++- > 3 files changed, 19 insertions(+), 1 deletion(-) Reviewed-by: Max Reitz signature.asc Description: OpenPGP digital signature
Re: [RFC 2/3] block/file-posix: Detect XFS with CONFIG_FALLOCATE
Am 25.10.2019 um 11:58 hat Max Reitz geschrieben: > We will need this for the next patch. > > Signed-off-by: Max Reitz > --- > block/file-posix.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/block/file-posix.c b/block/file-posix.c > index 695fcf740d..5cd54c8bff 100644 > --- a/block/file-posix.c > +++ b/block/file-posix.c > @@ -149,7 +149,7 @@ typedef struct BDRVRawState { > int perm_change_flags; > BDRVReopenState *reopen_state; > > -#ifdef CONFIG_XFS > +#if defined(CONFIG_XFS) || defined(CONFIG_FALLOCATE) > bool is_xfs:1; > #endif > bool has_discard:1; > @@ -667,7 +667,7 @@ static int raw_open_common(BlockDriverState *bs, QDict > *options, > } > #endif > > -#ifdef CONFIG_XFS > +#if defined(CONFIG_XFS) || defined(CONFIG_FALLOCATE) > if (platform_test_xfs_fd(s->fd)) { platform_test_xfs_fd() is defined in a header file from xfsprogs. I don't think you can call that without CONFIG_XFS, it would break the build. > s->is_xfs = true; > } Kevin
Re: [RFC 2/3] block/file-posix: Detect XFS with CONFIG_FALLOCATE
On 25.10.19 12:35, Kevin Wolf wrote: > Am 25.10.2019 um 12:22 hat Max Reitz geschrieben: >> On 25.10.19 12:19, Kevin Wolf wrote: >>> Am 25.10.2019 um 11:58 hat Max Reitz geschrieben: We will need this for the next patch. Signed-off-by: Max Reitz --- block/file-posix.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/block/file-posix.c b/block/file-posix.c index 695fcf740d..5cd54c8bff 100644 --- a/block/file-posix.c +++ b/block/file-posix.c @@ -149,7 +149,7 @@ typedef struct BDRVRawState { int perm_change_flags; BDRVReopenState *reopen_state; -#ifdef CONFIG_XFS +#if defined(CONFIG_XFS) || defined(CONFIG_FALLOCATE) bool is_xfs:1; #endif bool has_discard:1; @@ -667,7 +667,7 @@ static int raw_open_common(BlockDriverState *bs, QDict *options, } #endif -#ifdef CONFIG_XFS +#if defined(CONFIG_XFS) || defined(CONFIG_FALLOCATE) if (platform_test_xfs_fd(s->fd)) { >>> >>> platform_test_xfs_fd() is defined in a header file from xfsprogs. I >>> don't think you can call that without CONFIG_XFS, it would break the >>> build. >> >> Aw. >> >> OK. Should we then just assume is_xfs to be true (for the next patch) >> with !defined(CONFIG_XFS) && defined(CONFIG_FALLOCATE)? > > It's a small inline function that basically just calls statfs() and then > checks f_type. Yes, this is where my error came from. I asked cscope, which happily referred me to the inline implementation and so I didn’t bother looking at the filename and just assumed it’d be some code that belongs to qemu. > I think we can have a small function to implement this in the file-posix > code. Don't copy it from the header because of licensing (LGPL, while > file-posix is MIT); refer to the statfs() manpage instead and write it > yourself. OK. Max signature.asc Description: OpenPGP digital signature
Re: qemu crashing when attaching an ISO file to a virtio-scsi CD-ROM device through libvirt
Am 23.10.2019 um 19:28 hat Fernando Casas Schössow geschrieben: > Hi John, > > Thanks for looking into this. I can quickly repro the problem with > qemu 4.0 binary with debugging symbols enabled as I have it available > already. Doing the same with qemu 4.1 or development head may be too > much hassle but if it's really the only way I can give it try. We had a lot of iothread related fixes in 4.1, so this would really be the only way to tell if it's a bug that still exists. I suspect that it's already fixed (and to be more precise, I assume that commit d0ee0204f fixed it). Kevin
Re: [RFC 2/3] block/file-posix: Detect XFS with CONFIG_FALLOCATE
Am 25.10.2019 um 12:22 hat Max Reitz geschrieben: > On 25.10.19 12:19, Kevin Wolf wrote: > > Am 25.10.2019 um 11:58 hat Max Reitz geschrieben: > >> We will need this for the next patch. > >> > >> Signed-off-by: Max Reitz > >> --- > >> block/file-posix.c | 4 ++-- > >> 1 file changed, 2 insertions(+), 2 deletions(-) > >> > >> diff --git a/block/file-posix.c b/block/file-posix.c > >> index 695fcf740d..5cd54c8bff 100644 > >> --- a/block/file-posix.c > >> +++ b/block/file-posix.c > >> @@ -149,7 +149,7 @@ typedef struct BDRVRawState { > >> int perm_change_flags; > >> BDRVReopenState *reopen_state; > >> > >> -#ifdef CONFIG_XFS > >> +#if defined(CONFIG_XFS) || defined(CONFIG_FALLOCATE) > >> bool is_xfs:1; > >> #endif > >> bool has_discard:1; > >> @@ -667,7 +667,7 @@ static int raw_open_common(BlockDriverState *bs, QDict > >> *options, > >> } > >> #endif > >> > >> -#ifdef CONFIG_XFS > >> +#if defined(CONFIG_XFS) || defined(CONFIG_FALLOCATE) > >> if (platform_test_xfs_fd(s->fd)) { > > > > platform_test_xfs_fd() is defined in a header file from xfsprogs. I > > don't think you can call that without CONFIG_XFS, it would break the > > build. > > Aw. > > OK. Should we then just assume is_xfs to be true (for the next patch) > with !defined(CONFIG_XFS) && defined(CONFIG_FALLOCATE)? It's a small inline function that basically just calls statfs() and then checks f_type. I think we can have a small function to implement this in the file-posix code. Don't copy it from the header because of licensing (LGPL, while file-posix is MIT); refer to the statfs() manpage instead and write it yourself. Kevin signature.asc Description: PGP signature
Re: qemu crashing when attaching an ISO file to a virtio-scsi CD-ROM device through libvirt
Thanks for the reply Kevin. I will do my best to upgrade to 4.1, test again and report back if this is fixed or not in that version. Hopefully it is. Fernando On vie, oct 25, 2019 at 12:07 PM, Kevin Wolf wrote: Am 23.10.2019 um 19:28 hat Fernando Casas Schössow geschrieben: Hi John, Thanks for looking into this. I can quickly repro the problem with qemu 4.0 binary with debugging symbols enabled as I have it available already. Doing the same with qemu 4.1 or development head may be too much hassle but if it's really the only way I can give it try. We had a lot of iothread related fixes in 4.1, so this would really be the only way to tell if it's a bug that still exists. I suspect that it's already fixed (and to be more precise, I assume that commit d0ee0204f fixed it). Kevin
Re: [RFC 2/3] block/file-posix: Detect XFS with CONFIG_FALLOCATE
On 25.10.19 12:19, Kevin Wolf wrote: > Am 25.10.2019 um 11:58 hat Max Reitz geschrieben: >> We will need this for the next patch. >> >> Signed-off-by: Max Reitz >> --- >> block/file-posix.c | 4 ++-- >> 1 file changed, 2 insertions(+), 2 deletions(-) >> >> diff --git a/block/file-posix.c b/block/file-posix.c >> index 695fcf740d..5cd54c8bff 100644 >> --- a/block/file-posix.c >> +++ b/block/file-posix.c >> @@ -149,7 +149,7 @@ typedef struct BDRVRawState { >> int perm_change_flags; >> BDRVReopenState *reopen_state; >> >> -#ifdef CONFIG_XFS >> +#if defined(CONFIG_XFS) || defined(CONFIG_FALLOCATE) >> bool is_xfs:1; >> #endif >> bool has_discard:1; >> @@ -667,7 +667,7 @@ static int raw_open_common(BlockDriverState *bs, QDict >> *options, >> } >> #endif >> >> -#ifdef CONFIG_XFS >> +#if defined(CONFIG_XFS) || defined(CONFIG_FALLOCATE) >> if (platform_test_xfs_fd(s->fd)) { > > platform_test_xfs_fd() is defined in a header file from xfsprogs. I > don't think you can call that without CONFIG_XFS, it would break the > build. Aw. OK. Should we then just assume is_xfs to be true (for the next patch) with !defined(CONFIG_XFS) && defined(CONFIG_FALLOCATE)? Max signature.asc Description: OpenPGP digital signature
[RFC 2/3] block/file-posix: Detect XFS with CONFIG_FALLOCATE
We will need this for the next patch. Signed-off-by: Max Reitz --- block/file-posix.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/block/file-posix.c b/block/file-posix.c index 695fcf740d..5cd54c8bff 100644 --- a/block/file-posix.c +++ b/block/file-posix.c @@ -149,7 +149,7 @@ typedef struct BDRVRawState { int perm_change_flags; BDRVReopenState *reopen_state; -#ifdef CONFIG_XFS +#if defined(CONFIG_XFS) || defined(CONFIG_FALLOCATE) bool is_xfs:1; #endif bool has_discard:1; @@ -667,7 +667,7 @@ static int raw_open_common(BlockDriverState *bs, QDict *options, } #endif -#ifdef CONFIG_XFS +#if defined(CONFIG_XFS) || defined(CONFIG_FALLOCATE) if (platform_test_xfs_fd(s->fd)) { s->is_xfs = true; } -- 2.21.0
[RFC 3/3] block/file-posix: Let post-EOF fallocate serialize
The XFS kernel driver has a bug that may cause data corruption for qcow2 images as of qemu commit c8bb23cbdbe32f. We can work around it by treating post-EOF fallocates as serializing up until infinity (INT64_MAX in practice). Signed-off-by: Max Reitz --- block/file-posix.c | 42 ++ 1 file changed, 42 insertions(+) diff --git a/block/file-posix.c b/block/file-posix.c index 5cd54c8bff..1f5a01df70 100644 --- a/block/file-posix.c +++ b/block/file-posix.c @@ -2713,6 +2713,48 @@ raw_do_pwrite_zeroes(BlockDriverState *bs, int64_t offset, int bytes, RawPosixAIOData acb; ThreadPoolFunc *handler; +#ifdef CONFIG_FALLOCATE +if (s->is_xfs && s->use_linux_aio && +offset + bytes > bs->total_sectors * BDRV_SECTOR_SIZE) +{ +BdrvTrackedRequest *req; +uint64_t end; + +/* + * The Linux XFS driver has a bug where it will discard writes + * submitted through the AIO interface if they happen beyond a + * concurrently running fallocate() that increases the file + * length (i.e., both the write and the fallocate() happen + * beyond the EOF). + * + * To work around it, we look for the tracked request for this + * zero write, extend it until INT64_MAX (effectively + * infinity), and mark it as serializing. + * + * TODO: Detect whether this has been fixed in the XFS driver. + */ + +QLIST_FOREACH(req, &bs->tracked_requests, list) { +if (req->co == qemu_coroutine_self() && +req->type == BDRV_TRACKED_WRITE) +{ +break; +} +} + +assert(req); +assert(req->offset <= offset); +assert(req->offset + req->bytes >= offset + bytes); + +end = INT64_MAX & -(uint64_t)bs->bl.request_alignment; +req->bytes = end - req->offset; +req->overlap_bytes = req->bytes; + +bdrv_mark_request_serialising(req, bs->bl.request_alignment); +bdrv_wait_serialising_requests(req); +} +#endif + acb = (RawPosixAIOData) { .bs = bs, .aio_fildes = s->fd, -- 2.21.0
[RFC 0/3] block/file-posix: Work around XFS bug
Hi, It seems to me that there is a bug in Linux’s XFS kernel driver, as I’ve explained here: https://lists.nongnu.org/archive/html/qemu-block/2019-10/msg01429.html In combination with our commit c8bb23cbdbe32f, this may lead to guest data corruption when using qcow2 images on XFS with aio=native. We can’t wait until the XFS kernel driver is fixed, we should work around the problem ourselves. This is an RFC for two reasons: (1) I don’t know whether this is the right way to address the issue, (2) Ideally, we should detect whether the XFS kernel driver is fixed and if so stop applying the workaround. I don’t know how we would go about this, so this series doesn’t do it. (Hence it’s an RFC.) (3) Perhaps it’s a bit of a layering violation to let the file-posix driver access and modify a BdrvTrackedRequest object. As for how we can address the issue, I see three ways: (1) The one presented in this series: On XFS with aio=native, we extend tracked requests for post-EOF fallocate() calls (i.e., write-zero operations) to reach until infinity (INT64_MAX in practice), mark them serializing and wait for other conflicting requests. Advantages: + Limits the impact to very specific cases (And that means it wouldn’t hurt too much to keep this workaround even when the XFS driver has been fixed) + Works around the bug where it happens, namely in file-posix Disadvantages: - A bit complex - A bit of a layering violation (should file-posix have access to tracked requests?) (2) Always skip qcow2’s handle_alloc_space() on XFS. The XFS bug only becomes visible due to that function: I don’t think qcow2 writes zeroes in any other I/O path, and raw images are fixed in size so post-EOF writes won’t happen. Advantages: + Maybe simpler, depending on how difficult it is to handle the layering violation + Also fixes the performance problem of handle_alloc_space() being slow on ppc64+XFS. Disadvantages: - Huge layering violation because qcow2 would need to know whether the image is stored on XFS or not. - We’d definitely want to skip this workaround when the XFS driver has been fixed, so we need some method to find out whether it has (3) Drop handle_alloc_space(), i.e. revert c8bb23cbdbe32f. To my knowledge I’m the only one who has provided any benchmarks for this commit, and even then I was a bit skeptical because it performs well in some cases and bad in others. I concluded that it’s probably worth it because the “some cases” are more likely to occur. Now we have this problem of corruption here (granted due to a bug in the XFS driver), and another report of massively degraded performance on ppc64 (https://bugzilla.redhat.com/show_bug.cgi?id=1745823 – sorry, a private BZ; I hate that :-/ The report is about 40 % worse performance for an in-guest fio write benchmark.) So I have to ask the question about what the justification for keeping c8bb23cbdbe32f is. How much does performance increase with it actually? (On non-(ppc64+XFS) machines, obviously) Advantages: + Trivial + No layering violations + We wouldn’t need to keep track of whether the kernel bug has been fixed or not + Fixes the ppc64+XFS performance problem Disadvantages: - Reverts cluster allocation performance to pre-c8bb23cbdbe32f levels, whatever that means So this is the main reason this is an RFC: What should we do? Is (1) really the best choice? In any case, I’ve ran the test case I showed in https://lists.nongnu.org/archive/html/qemu-block/2019-10/msg01282.html more than ten times with this series applied and the installation succeeded every time. (Without this series, it fails like every other time.) Max Reitz (3): block: Make wait/mark serialising requests public block/file-posix: Detect XFS with CONFIG_FALLOCATE block/file-posix: Let post-EOF fallocate serialize include/block/block_int.h | 3 +++ block/file-posix.c| 46 +-- block/io.c| 24 ++-- 3 files changed, 59 insertions(+), 14 deletions(-) -- 2.21.0
[RFC 1/3] block: Make wait/mark serialising requests public
Make both bdrv_mark_request_serialising() and bdrv_wait_serialising_requests() public so they can be used from block drivers. Signed-off-by: Max Reitz --- include/block/block_int.h | 3 +++ block/io.c| 24 2 files changed, 15 insertions(+), 12 deletions(-) diff --git a/include/block/block_int.h b/include/block/block_int.h index ca4ccac4c1..c85733293d 100644 --- a/include/block/block_int.h +++ b/include/block/block_int.h @@ -984,6 +984,9 @@ extern unsigned int bdrv_drain_all_count; void bdrv_apply_subtree_drain(BdrvChild *child, BlockDriverState *new_parent); void bdrv_unapply_subtree_drain(BdrvChild *child, BlockDriverState *old_parent); +bool coroutine_fn bdrv_wait_serialising_requests(BdrvTrackedRequest *self); +void bdrv_mark_request_serialising(BdrvTrackedRequest *req, uint64_t align); + int get_tmp_filename(char *filename, int size); BlockDriver *bdrv_probe_all(const uint8_t *buf, int buf_size, const char *filename); diff --git a/block/io.c b/block/io.c index f0b86c1d19..a65cc7fb61 100644 --- a/block/io.c +++ b/block/io.c @@ -715,7 +715,7 @@ static void tracked_request_begin(BdrvTrackedRequest *req, qemu_co_mutex_unlock(&bs->reqs_lock); } -static void mark_request_serialising(BdrvTrackedRequest *req, uint64_t align) +void bdrv_mark_request_serialising(BdrvTrackedRequest *req, uint64_t align) { int64_t overlap_offset = req->offset & ~(align - 1); uint64_t overlap_bytes = ROUND_UP(req->offset + req->bytes, align) @@ -805,7 +805,7 @@ void bdrv_dec_in_flight(BlockDriverState *bs) bdrv_wakeup(bs); } -static bool coroutine_fn wait_serialising_requests(BdrvTrackedRequest *self) +bool coroutine_fn bdrv_wait_serialising_requests(BdrvTrackedRequest *self) { BlockDriverState *bs = self->bs; BdrvTrackedRequest *req; @@ -1437,14 +1437,14 @@ static int coroutine_fn bdrv_aligned_preadv(BdrvChild *child, * with each other for the same cluster. For example, in copy-on-read * it ensures that the CoR read and write operations are atomic and * guest writes cannot interleave between them. */ -mark_request_serialising(req, bdrv_get_cluster_size(bs)); +bdrv_mark_request_serialising(req, bdrv_get_cluster_size(bs)); } /* BDRV_REQ_SERIALISING is only for write operation */ assert(!(flags & BDRV_REQ_SERIALISING)); if (!(flags & BDRV_REQ_NO_SERIALISING)) { -wait_serialising_requests(req); +bdrv_wait_serialising_requests(req); } if (flags & BDRV_REQ_COPY_ON_READ) { @@ -1841,10 +1841,10 @@ bdrv_co_write_req_prepare(BdrvChild *child, int64_t offset, uint64_t bytes, assert(!(flags & ~BDRV_REQ_MASK)); if (flags & BDRV_REQ_SERIALISING) { -mark_request_serialising(req, bdrv_get_cluster_size(bs)); +bdrv_mark_request_serialising(req, bdrv_get_cluster_size(bs)); } -waited = wait_serialising_requests(req); +waited = bdrv_wait_serialising_requests(req); assert(!waited || !req->serialising || is_request_serialising_and_aligned(req)); @@ -2008,8 +2008,8 @@ static int coroutine_fn bdrv_co_do_zero_pwritev(BdrvChild *child, padding = bdrv_init_padding(bs, offset, bytes, &pad); if (padding) { -mark_request_serialising(req, align); -wait_serialising_requests(req); +bdrv_mark_request_serialising(req, align); +bdrv_wait_serialising_requests(req); bdrv_padding_rmw_read(child, req, &pad, true); @@ -2111,8 +2111,8 @@ int coroutine_fn bdrv_co_pwritev_part(BdrvChild *child, } if (bdrv_pad_request(bs, &qiov, &qiov_offset, &offset, &bytes, &pad)) { -mark_request_serialising(&req, align); -wait_serialising_requests(&req); +bdrv_mark_request_serialising(&req, align); +bdrv_wait_serialising_requests(&req); bdrv_padding_rmw_read(child, &req, &pad, false); } @@ -3205,7 +3205,7 @@ static int coroutine_fn bdrv_co_copy_range_internal( /* BDRV_REQ_SERIALISING is only for write operation */ assert(!(read_flags & BDRV_REQ_SERIALISING)); if (!(read_flags & BDRV_REQ_NO_SERIALISING)) { -wait_serialising_requests(&req); +bdrv_wait_serialising_requests(&req); } ret = src->bs->drv->bdrv_co_copy_range_from(src->bs, @@ -3332,7 +3332,7 @@ int coroutine_fn bdrv_co_truncate(BdrvChild *child, int64_t offset, * new area, we need to make sure that no write requests are made to it * concurrently or they might be overwritten by preallocation. */ if (new_bytes) { -mark_request_serialising(&req, 1); +bdrv_mark_request_serialising(&req, 1); } if (bs->read_only) { error_setg(errp, "Image is read-only"); -- 2.21.0
[bugfix ping] Re: [PATCH v2 0/2] fix qcow2_can_store_new_dirty_bitmap
Hi! Don't we forget it? Here is a bug-fix, I think we want it for 4.2. 14.10.2019 14:51, Vladimir Sementsov-Ogievskiy wrote: > Hi all! > > Here is a fix for persistent bitmaps managing: we must check existent > but not yet stored bitmaps for qcow2-related constraints, like maximum > number of bitmaps in qcow2 image. > > v2: > > 01: change assertion to error-return at function start > Be free to add > Reported-by: aihua liang > Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1712636 > if it's appropriate > 02: new test > Ohh, it takes about 4 minutes. Be free to drop it, as I doubt that > it worth to have. The case is simple, we may live without a > test. > > Vladimir Sementsov-Ogievskiy (2): >qcow2-bitmaps: fix qcow2_can_store_new_dirty_bitmap >iotests: add 269 to check maximum of bitmaps in qcow2 > > block/qcow2-bitmap.c | 41 +++-- > tests/qemu-iotests/269 | 47 ++ > tests/qemu-iotests/269.out | 3 +++ > tests/qemu-iotests/group | 1 + > 4 files changed, 69 insertions(+), 23 deletions(-) > create mode 100755 tests/qemu-iotests/269 > create mode 100644 tests/qemu-iotests/269.out > -- Best regards, Vladimir
Re: [PATCH 3/4] hw/scsi: add SCSI COMPARE_AND_WRITE support
On Fri, Oct 25, 2019 at 05:36:01PM +0800, Yaowei Bai wrote: > > diff --git a/include/tcmu/tcmu.h b/include/tcmu/tcmu.h > new file mode 100644 > index 000..656a545 > --- /dev/null > +++ b/include/tcmu/tcmu.h > @@ -0,0 +1,14 @@ > +#ifndef QEMU_TCMU_H > +#define QEMU_TCMU_H > + > +#include "qemu-common.h" > + > +typedef struct TCMUExport TCMUExport; > +extern QemuOptsList qemu_tcmu_export_opts; > + > +void qemu_tcmu_stop(void); > +void qemu_tcmu_start(const char *subtype, Error **errp); > +TCMUExport *tcmu_export_new(BlockBackend *blk, bool writable, Error **errp); > +int export_init_func(void *opaque, QemuOpts *all_opts, Error **errp); > + > +#endif Please ignore this odd part. > -- > 1.8.3.1 >
[PATCH 3/4] hw/scsi: add SCSI COMPARE_AND_WRITE support
This patch emulates COMPARE_AND_WRITE command with the BDRV_REQ_COMPARE_AND_WRITE flag introduced by last patch. It matches the SBC-4 standard except the FUA bit support, it'll be finished in the next patch. Note that cmd->xfer is set 2 * the number got by scsi_data_cdb_xfer so we could touch the least code. Signed-off-by: Yaowei Bai --- hw/scsi/emulation.c | 1 + hw/scsi/scsi-bus.c | 4 +++ hw/scsi/scsi-disk.c | 88 + hw/scsi/trace-events| 1 + include/hw/scsi/emulation.h | 3 ++ include/scsi/utils.h| 2 ++ include/tcmu/tcmu.h | 14 scsi/utils.c| 5 +++ 8 files changed, 118 insertions(+) create mode 100644 include/tcmu/tcmu.h diff --git a/hw/scsi/emulation.c b/hw/scsi/emulation.c index 06d62f3..1f53c4a 100644 --- a/hw/scsi/emulation.c +++ b/hw/scsi/emulation.c @@ -9,6 +9,7 @@ int scsi_emulate_block_limits(uint8_t *outbuf, const SCSIBlockLimits *bl) memset(outbuf, 0, 0x3c); outbuf[0] = bl->wsnz; /* wsnz */ +outbuf[1] = MAX_COMPARE_AND_WRITE_LENGTH; if (bl->max_io_sectors) { /* optimal transfer length granularity. This field and the optimal diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c index bccb7cc..15aab8a 100644 --- a/hw/scsi/scsi-bus.c +++ b/hw/scsi/scsi-bus.c @@ -987,6 +987,9 @@ static int scsi_req_xfer(SCSICommand *cmd, SCSIDevice *dev, uint8_t *buf) case WRITE_VERIFY_16: cmd->xfer *= dev->blocksize; break; +case COMPARE_AND_WRITE: +cmd->xfer *= 2 * dev->blocksize; +break; case READ_6: case READ_REVERSE: /* length 0 means 256 blocks */ @@ -1206,6 +1209,7 @@ static void scsi_cmd_xfer_mode(SCSICommand *cmd) case PERSISTENT_RESERVE_OUT: case MAINTENANCE_OUT: case SET_WINDOW: +case COMPARE_AND_WRITE: case SCAN: /* SCAN conflicts with START_STOP. START_STOP has cmd->xfer set to 0 for * non-scanner devices, so we only get here for SCAN and not for START_STOP. diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c index 68b1675..4bff862 100644 --- a/hw/scsi/scsi-disk.c +++ b/hw/scsi/scsi-disk.c @@ -477,6 +477,9 @@ static bool scsi_handle_rw_error(SCSIDiskReq *r, int error, bool acct_failed) case ENOSPC: scsi_check_condition(r, SENSE_CODE(SPACE_ALLOC_FAILED)); break; +case EILSEQ: +scsi_check_condition(r, SENSE_CODE(MISCOMPARE_DURING_VERIFY)); +break; default: scsi_check_condition(r, SENSE_CODE(IO_ERROR)); break; @@ -1824,6 +1827,84 @@ static void scsi_disk_emulate_write_same(SCSIDiskReq *r, uint8_t *inbuf) scsi_write_same_complete, data); } +typedef struct CompareAndWriteCBData { +SCSIDiskReq *r; +int64_t sector; +int nb_sectors; +QEMUIOVector qiov; +struct iovec iov; +} CompareAndWriteCBData; + +static void scsi_compare_and_write_complete(void *opaque, int ret) +{ +CompareAndWriteCBData *data = opaque; +SCSIDiskReq *r = data->r; +SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, r->req.dev); + +assert(r->req.aiocb != NULL); +r->req.aiocb = NULL; +aio_context_acquire(blk_get_aio_context(s->qdev.conf.blk)); +if (scsi_disk_req_check_error(r, ret, true)) { +goto done; +} + +block_acct_done(blk_get_stats(s->qdev.conf.blk), &r->acct); +scsi_req_complete(&r->req, GOOD); + +done: +scsi_req_unref(&r->req); +qemu_vfree(data->iov.iov_base); +g_free(data); +aio_context_release(blk_get_aio_context(s->qdev.conf.blk)); +} + +static void scsi_disk_emulate_compare_and_write(SCSIDiskReq *r, uint8_t *inbuf) +{ +SCSIRequest *req = &r->req; +SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, req->dev); +uint32_t nb_sectors = scsi_data_cdb_xfer(r->req.cmd.buf); +CompareAndWriteCBData *data; +uint8_t *buf; +int i; + +if (nb_sectors > MAX_COMPARE_AND_WRITE_LENGTH) { +scsi_check_condition(r, SENSE_CODE(INVALID_FIELD)); +return; +} + +if (blk_is_read_only(s->qdev.conf.blk)) { +scsi_check_condition(r, SENSE_CODE(WRITE_PROTECTED)); +return; +} + +if (r->req.cmd.lba > s->qdev.max_lba || +!check_lba_range(s, r->req.cmd.lba, nb_sectors)) { +scsi_check_condition(r, SENSE_CODE(LBA_OUT_OF_RANGE)); +return; +} + +data = g_new0(CompareAndWriteCBData, 1); +data->r = r; +data->sector = r->req.cmd.lba * (s->qdev.blocksize / 512); +data->nb_sectors = r->req.cmd.xfer * (s->qdev.blocksize / 512); +data->iov.iov_len = data->nb_sectors; +data->iov.iov_base = buf = blk_blockalign(s->qdev.conf.blk, + data->iov.iov_len); +qemu_iovec_init_external(&data->qiov, &data->iov, 1); + +for (i = 0; i < data->iov.iov_len; i += s->qdev.blocksize) { +memcpy(&buf[i], &inbuf[i]
[PATCH 0/4] SCSI COMPARE_AND_WRITE support
Recently ceph/librbd added several interfaces to handle SCSI commands like COMPARE_AND_WRITE directly. However they were only be used in special scenarios, i.e. ISCSI. That involves more software components which makes the IO path longer and could bring more potential issues. Actually we're maintaining several ceph clusters with ISCSI protocal being used which, i have to say, is not an easy job. So supporting COMPARE_AND_WRITE in scsi-disk would be a reasonable solution and easy to implement with the help of the SCSI handlers in librbd, which could leave alone the ISCSI stuff and make the IO path shorter. This patchset implements it by reusing the blk_aio_pwritev interface and introducing a new BDRV_REQ_COMPARE_AND_WRITE element into BdrvRequestFlags to indicate a COMPARE_AND_WRITE request, rather than adding a new interface into block-backend.c. The FUA support is implemented in the blk_aio_pwritev's callback function similar to its emulation in scsi_write_do_fua function, maybe through BDRV_REQ_FUA is another doable way. This patchset is tested with the method of sg_compare_and_write.txt from sg3_utils. Link below: https://github.com/hreinecke/sg3_utils/blob/master/examples/sg_compare_and_write.txt Yaowei Bai (4): block: add SCSI COMPARE_AND_WRITE support block/rbd: implement bdrv_aio_compare_and_write interface hw/scsi: add SCSI COMPARE_AND_WRITE support scsi-disk: add FUA support for COMPARE_AND_WRITE block/io.c | 20 + block/raw-format.c | 3 +- block/rbd.c | 41 ++- hw/scsi/emulation.c | 1 + hw/scsi/scsi-bus.c | 4 ++ hw/scsi/scsi-disk.c | 98 + hw/scsi/trace-events| 1 + include/block/block.h | 5 ++- include/block/block_int.h | 3 ++ include/hw/scsi/emulation.h | 3 ++ include/scsi/utils.h| 2 + include/tcmu/tcmu.h | 14 +++ scsi/utils.c| 5 +++ 13 files changed, 195 insertions(+), 5 deletions(-) create mode 100644 include/tcmu/tcmu.h -- 1.8.3.1
[PATCH 4/4] scsi-disk: add FUA support for COMPARE_AND_WRITE
It is implemented in the blk_aio_pwritev's callback function in a way similar to its emulation in scsi_write_do_fua function Signed-off-by: Yaowei Bai --- hw/scsi/scsi-disk.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c index 4bff862..ef9c257 100644 --- a/hw/scsi/scsi-disk.c +++ b/hw/scsi/scsi-disk.c @@ -228,6 +228,7 @@ static bool scsi_is_cmd_fua(SCSICommand *cmd) case WRITE_10: case WRITE_12: case WRITE_16: +case COMPARE_AND_WRITE: return (cmd->buf[1] & 8) != 0; case VERIFY_10: @@ -1849,10 +1850,17 @@ static void scsi_compare_and_write_complete(void *opaque, int ret) } block_acct_done(blk_get_stats(s->qdev.conf.blk), &r->acct); +if (r->need_fua_emulation) { +block_acct_start(blk_get_stats(s->qdev.conf.blk), &r->acct, 0, + BLOCK_ACCT_FLUSH); +r->req.aiocb = blk_aio_flush(s->qdev.conf.blk, scsi_aio_complete, r); +goto free; +} scsi_req_complete(&r->req, GOOD); done: scsi_req_unref(&r->req); +free: qemu_vfree(data->iov.iov_base); g_free(data); aio_context_release(blk_get_aio_context(s->qdev.conf.blk)); @@ -1953,6 +1961,7 @@ static int32_t scsi_disk_emulate_command(SCSIRequest *req, uint8_t *buf) { SCSIDiskReq *r = DO_UPCAST(SCSIDiskReq, req, req); SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, req->dev); +SCSIDiskClass *sdc = (SCSIDiskClass *) object_get_class(OBJECT(s)); uint64_t nb_sectors; uint8_t *outbuf; int buflen; @@ -2208,6 +2217,7 @@ static int32_t scsi_disk_emulate_command(SCSIRequest *req, uint8_t *buf) return 0; } assert(!r->req.aiocb); +r->need_fua_emulation = sdc->need_fua_emulation(&r->req.cmd); r->iov.iov_len = MIN(r->buflen, req->cmd.xfer); if (r->iov.iov_len == 0) { scsi_req_complete(&r->req, GOOD); -- 1.8.3.1
[PATCH 2/4] block/rbd: implement bdrv_aio_compare_and_write interface
This patch adds librbd's SCSI COMPARE_AND_WRITE command interface support with bdrv_aio_compare_and_write function pointer. Note currently when a miscompare happens a mismatch offset of 0 is always reported rather than the actual mismatch offset. This should not be a big issue contemporarily and will be fixed accordingly in the future. Signed-off-by: Yaowei Bai --- block/raw-format.c | 3 ++- block/rbd.c| 41 +++-- 2 files changed, 41 insertions(+), 3 deletions(-) diff --git a/block/raw-format.c b/block/raw-format.c index 42c28cc..3d8f201 100644 --- a/block/raw-format.c +++ b/block/raw-format.c @@ -438,7 +438,8 @@ static int raw_open(BlockDriverState *bs, QDict *options, int flags, bs->sg = bs->file->bs->sg; bs->supported_write_flags = BDRV_REQ_WRITE_UNCHANGED | -(BDRV_REQ_FUA & bs->file->bs->supported_write_flags); +(BDRV_REQ_FUA & bs->file->bs->supported_write_flags) | +(BDRV_REQ_COMPARE_AND_WRITE & bs->file->bs->supported_write_flags); bs->supported_zero_flags = BDRV_REQ_WRITE_UNCHANGED | ((BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP | BDRV_REQ_NO_FALLBACK) & bs->file->bs->supported_zero_flags); diff --git a/block/rbd.c b/block/rbd.c index c71e45d..7065343 100644 --- a/block/rbd.c +++ b/block/rbd.c @@ -73,11 +73,18 @@ #define LIBRBD_USE_IOVEC 0 #endif +#ifdef LIBRBD_SUPPORTS_COMPARE_AND_WRITE +#define LIBRBD_HAVE_COMPARE_AND_WRITE 1 +#else +#define LIBRBD_HAVE_COMPARE_AND_WRITE 0 +#endif + typedef enum { RBD_AIO_READ, RBD_AIO_WRITE, RBD_AIO_DISCARD, -RBD_AIO_FLUSH +RBD_AIO_FLUSH, +RBD_AIO_COMPARE_AND_WRITE } RBDAIOCmd; typedef struct RBDAIOCB { @@ -798,6 +805,9 @@ static int qemu_rbd_open(BlockDriverState *bs, QDict *options, int flags, } } +if (LIBRBD_HAVE_COMPARE_AND_WRITE) +bs->supported_write_flags = BDRV_REQ_COMPARE_AND_WRITE; + r = 0; goto out; @@ -933,7 +943,15 @@ static BlockAIOCB *rbd_start_aio(BlockDriverState *bs, rcb = g_new(RADOSCB, 1); -if (!LIBRBD_USE_IOVEC) { +if (cmd == RBD_AIO_COMPARE_AND_WRITE) { +acb->bounce = qemu_try_blockalign(bs, qiov->size); +if (acb->bounce == NULL) { +goto failed; +} + +qemu_iovec_to_buf(acb->qiov, 0, acb->bounce, qiov->size); +rcb->buf = acb->bounce; +} else if (!LIBRBD_USE_IOVEC) { if (cmd == RBD_AIO_DISCARD || cmd == RBD_AIO_FLUSH) { acb->bounce = NULL; } else { @@ -993,6 +1011,9 @@ static BlockAIOCB *rbd_start_aio(BlockDriverState *bs, case RBD_AIO_FLUSH: r = rbd_aio_flush_wrapper(s->image, c); break; +case RBD_AIO_COMPARE_AND_WRITE: +r = rbd_aio_compare_and_write(s->image, off, size/2, rcb->buf, (rcb->buf + size/2), c, 0, 0); +break; default: r = -EINVAL; } @@ -1056,6 +1077,18 @@ static int qemu_rbd_co_flush(BlockDriverState *bs) } #endif +#ifdef LIBRBD_SUPPORTS_COMPARE_AND_WRITE +static BlockAIOCB *qemu_rbd_aio_compare_and_write(BlockDriverState *bs, + uint64_t offset, uint64_t bytes, + QEMUIOVector *qiov, int flags, + BlockCompletionFunc *cb, + void *opaque) +{ +return rbd_start_aio(bs, offset, qiov, bytes, cb, opaque, + RBD_AIO_COMPARE_AND_WRITE); +} +#endif + static int qemu_rbd_getinfo(BlockDriverState *bs, BlockDriverInfo *bdi) { BDRVRBDState *s = bs->opaque; @@ -1309,6 +1342,10 @@ static BlockDriver bdrv_rbd = { .bdrv_aio_pdiscard = qemu_rbd_aio_pdiscard, #endif +#ifdef LIBRBD_SUPPORTS_COMPARE_AND_WRITE +.bdrv_aio_compare_and_write = qemu_rbd_aio_compare_and_write, +#endif + .bdrv_snapshot_create = qemu_rbd_snap_create, .bdrv_snapshot_delete = qemu_rbd_snap_remove, .bdrv_snapshot_list = qemu_rbd_snap_list, -- 1.8.3.1
[PATCH 1/4] block: add SCSI COMPARE_AND_WRITE support
Some storages(i.e. librbd) already have interfaces to handle some SCSI commands directly. This patch adds COMPARE_AND_WRITE command support through the write path in the block io layer by introducing a new element BDRV_REQ_COMPARE_AND_WRITE into BdrvRequestFlags which indicates a COMPARE_AND_WRITE request. In this way we could easily extend to other SCSI commands support like WRITE_SAME in the future. Signed-off-by: Yaowei Bai --- block/io.c| 20 include/block/block.h | 5 +++-- include/block/block_int.h | 3 +++ 3 files changed, 26 insertions(+), 2 deletions(-) diff --git a/block/io.c b/block/io.c index f0b86c1..667b5ea 100644 --- a/block/io.c +++ b/block/io.c @@ -1168,6 +1168,26 @@ static int coroutine_fn bdrv_driver_pwritev(BlockDriverState *bs, goto emulate_flags; } +if (drv->bdrv_aio_compare_and_write && + (flags & BDRV_REQ_COMPARE_AND_WRITE)) { +BlockAIOCB *acb; +CoroutineIOCompletion co = { +.coroutine = qemu_coroutine_self(), +}; + +acb = drv->bdrv_aio_compare_and_write(bs, offset, bytes, qiov, +flags & bs->supported_write_flags, +bdrv_co_io_em_complete, &co); +flags &= ~bs->supported_write_flags; +if (acb == NULL) { +ret = -EIO; +} else { +qemu_coroutine_yield(); +ret = co.ret; +} +goto emulate_flags; +} + if (drv->bdrv_aio_pwritev) { BlockAIOCB *acb; CoroutineIOCompletion co = { diff --git a/include/block/block.h b/include/block/block.h index 89606bd..7159e03 100644 --- a/include/block/block.h +++ b/include/block/block.h @@ -92,9 +92,10 @@ typedef enum { * on read request and means that caller doesn't really need data to be * written to qiov parameter which may be NULL. */ -BDRV_REQ_PREFETCH = 0x200, +BDRV_REQ_PREFETCH = 0x200, +BDRV_REQ_COMPARE_AND_WRITE = 0x400, /* Mask of valid flags */ -BDRV_REQ_MASK = 0x3ff, +BDRV_REQ_MASK = 0x7ff, } BdrvRequestFlags; typedef struct BlockSizes { diff --git a/include/block/block_int.h b/include/block/block_int.h index ca4ccac..a039b68 100644 --- a/include/block/block_int.h +++ b/include/block/block_int.h @@ -189,6 +189,9 @@ struct BlockDriver { BlockAIOCB *(*bdrv_aio_pdiscard)(BlockDriverState *bs, int64_t offset, int bytes, BlockCompletionFunc *cb, void *opaque); +BlockAIOCB *(*bdrv_aio_compare_and_write)(BlockDriverState *bs, +uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags, +BlockCompletionFunc *cb, void *opaque); int coroutine_fn (*bdrv_co_readv)(BlockDriverState *bs, int64_t sector_num, int nb_sectors, QEMUIOVector *qiov); -- 1.8.3.1
[PULL 19/19] hw/rtc/aspeed_rtc: Remove unused includes
From: Philippe Mathieu-Daudé The system include is already provided by "osdep.h" (the scripts/clean-includes file clean such headers). Commit 64552b6be47 suggests we don't need to include "hw/irq.h": Move the qemu_irq and qemu_irq_handler typedefs from hw/irq.h to qemu/typedefs.h, and then include hw/irq.h only where it's still needed. Reviewed-by: Cédric Le Goater Reviewed-by: Alistair Francis Signed-off-by: Philippe Mathieu-Daudé Acked-by: Peter Maydell Message-Id: <20191003230404.19384-15-phi...@redhat.com> Signed-off-by: Laurent Vivier --- include/hw/rtc/aspeed_rtc.h | 3 --- 1 file changed, 3 deletions(-) diff --git a/include/hw/rtc/aspeed_rtc.h b/include/hw/rtc/aspeed_rtc.h index 3fde854ad99c..b94a7102688a 100644 --- a/include/hw/rtc/aspeed_rtc.h +++ b/include/hw/rtc/aspeed_rtc.h @@ -8,9 +8,6 @@ #ifndef HW_RTC_ASPEED_RTC_H #define HW_RTC_ASPEED_RTC_H -#include - -#include "hw/irq.h" #include "hw/sysbus.h" typedef struct AspeedRtcState { -- 2.21.0
[PULL 16/19] hw: Move Aspeed RTC from hw/timer/ to hw/rtc/ subdirectory
From: Philippe Mathieu-Daudé Move RTC devices under the hw/rtc/ subdirectory. Reviewed-by: Cédric Le Goater Reviewed-by: Alistair Francis Signed-off-by: Philippe Mathieu-Daudé Acked-by: Peter Maydell Message-Id: <20191003230404.19384-12-phi...@redhat.com> Signed-off-by: Laurent Vivier --- hw/rtc/Makefile.objs | 1 + hw/{timer => rtc}/aspeed_rtc.c | 2 +- hw/rtc/trace-events| 4 hw/timer/Makefile.objs | 2 +- hw/timer/trace-events | 4 include/hw/arm/aspeed_soc.h| 2 +- include/hw/{timer => rtc}/aspeed_rtc.h | 6 +++--- 7 files changed, 11 insertions(+), 10 deletions(-) rename hw/{timer => rtc}/aspeed_rtc.c (99%) rename include/hw/{timer => rtc}/aspeed_rtc.h (84%) diff --git a/hw/rtc/Makefile.objs b/hw/rtc/Makefile.objs index 3d4763fc269e..8dc9fcd3a983 100644 --- a/hw/rtc/Makefile.objs +++ b/hw/rtc/Makefile.objs @@ -10,3 +10,4 @@ common-obj-$(CONFIG_XLNX_ZYNQMP) += xlnx-zynqmp-rtc.o common-obj-$(CONFIG_EXYNOS4) += exynos4210_rtc.o obj-$(CONFIG_MC146818RTC) += mc146818rtc.o common-obj-$(CONFIG_SUN4V_RTC) += sun4v-rtc.o +common-obj-$(CONFIG_ASPEED_SOC) += aspeed_rtc.o diff --git a/hw/timer/aspeed_rtc.c b/hw/rtc/aspeed_rtc.c similarity index 99% rename from hw/timer/aspeed_rtc.c rename to hw/rtc/aspeed_rtc.c index 531301735334..3ca1183558b7 100644 --- a/hw/timer/aspeed_rtc.c +++ b/hw/rtc/aspeed_rtc.c @@ -8,7 +8,7 @@ #include "qemu/osdep.h" #include "qemu-common.h" -#include "hw/timer/aspeed_rtc.h" +#include "hw/rtc/aspeed_rtc.h" #include "migration/vmstate.h" #include "qemu/log.h" #include "qemu/timer.h" diff --git a/hw/rtc/trace-events b/hw/rtc/trace-events index 7f1945ad4cc6..d6749f4616a0 100644 --- a/hw/rtc/trace-events +++ b/hw/rtc/trace-events @@ -13,3 +13,7 @@ pl031_read(uint32_t addr, uint32_t value) "addr 0x%08x value 0x%08x" pl031_write(uint32_t addr, uint32_t value) "addr 0x%08x value 0x%08x" pl031_alarm_raised(void) "alarm raised" pl031_set_alarm(uint32_t ticks) "alarm set for %u ticks" + +# aspeed-rtc.c +aspeed_rtc_read(uint64_t addr, uint64_t value) "addr 0x%02" PRIx64 " value 0x%08" PRIx64 +aspeed_rtc_write(uint64_t addr, uint64_t value) "addr 0x%02" PRIx64 " value 0x%08" PRIx64 diff --git a/hw/timer/Makefile.objs b/hw/timer/Makefile.objs index 33191d74cb98..83091770df3a 100644 --- a/hw/timer/Makefile.objs +++ b/hw/timer/Makefile.objs @@ -29,7 +29,7 @@ common-obj-$(CONFIG_MIPS_CPS) += mips_gictimer.o common-obj-$(CONFIG_ALLWINNER_A10_PIT) += allwinner-a10-pit.o common-obj-$(CONFIG_STM32F2XX_TIMER) += stm32f2xx_timer.o -common-obj-$(CONFIG_ASPEED_SOC) += aspeed_timer.o aspeed_rtc.o +common-obj-$(CONFIG_ASPEED_SOC) += aspeed_timer.o common-obj-$(CONFIG_CMSDK_APB_TIMER) += cmsdk-apb-timer.o common-obj-$(CONFIG_CMSDK_APB_DUALTIMER) += cmsdk-apb-dualtimer.o diff --git a/hw/timer/trace-events b/hw/timer/trace-events index 1459d07237b9..e18b87fc96dd 100644 --- a/hw/timer/trace-events +++ b/hw/timer/trace-events @@ -66,10 +66,6 @@ cmsdk_apb_dualtimer_read(uint64_t offset, uint64_t data, unsigned size) "CMSDK A cmsdk_apb_dualtimer_write(uint64_t offset, uint64_t data, unsigned size) "CMSDK APB dualtimer write: offset 0x%" PRIx64 " data 0x%" PRIx64 " size %u" cmsdk_apb_dualtimer_reset(void) "CMSDK APB dualtimer: reset" -# hw/timer/aspeed-rtc.c -aspeed_rtc_read(uint64_t addr, uint64_t value) "addr 0x%02" PRIx64 " value 0x%08" PRIx64 -aspeed_rtc_write(uint64_t addr, uint64_t value) "addr 0x%02" PRIx64 " value 0x%08" PRIx64 - # nrf51_timer.c nrf51_timer_read(uint64_t addr, uint32_t value, unsigned size) "read addr 0x%" PRIx64 " data 0x%" PRIx32 " size %u" nrf51_timer_write(uint64_t addr, uint32_t value, unsigned size) "write addr 0x%" PRIx64 " data 0x%" PRIx32 " size %u" diff --git a/include/hw/arm/aspeed_soc.h b/include/hw/arm/aspeed_soc.h index cccb684a19bb..495c08be1b84 100644 --- a/include/hw/arm/aspeed_soc.h +++ b/include/hw/arm/aspeed_soc.h @@ -18,7 +18,7 @@ #include "hw/misc/aspeed_sdmc.h" #include "hw/misc/aspeed_xdma.h" #include "hw/timer/aspeed_timer.h" -#include "hw/timer/aspeed_rtc.h" +#include "hw/rtc/aspeed_rtc.h" #include "hw/i2c/aspeed_i2c.h" #include "hw/ssi/aspeed_smc.h" #include "hw/watchdog/wdt_aspeed.h" diff --git a/include/hw/timer/aspeed_rtc.h b/include/hw/rtc/aspeed_rtc.h similarity index 84% rename from include/hw/timer/aspeed_rtc.h rename to include/hw/rtc/aspeed_rtc.h index 15ba42912b7f..3fde854ad99c 100644 --- a/include/hw/timer/aspeed_rtc.h +++ b/include/hw/rtc/aspeed_rtc.h @@ -5,8 +5,8 @@ * Copyright 2019 IBM Corp * SPDX-License-Identifier: GPL-2.0-or-later */ -#ifndef ASPEED_RTC_H -#define ASPEED_RTC_H +#ifndef HW_RTC_ASPEED_RTC_H +#define HW_RTC_ASPEED_RTC_H #include @@ -27,4 +27,4 @@ typedef struct AspeedRtcState { #define TYPE_ASPEED_RTC "aspeed.rtc" #define ASPEED_RTC(obj) OBJECT_CHECK(AspeedRtcState, (obj), TYPE_ASPEED_RTC) -#endif /* ASPEED_RTC_H */ +#endif /* HW_RTC_ASPEED_RTC_H */ -- 2.21.0
[PATCH v7 8/8] virtio: add property to enable packed virtqueue
From: Jason Wang Signed-off-by: Jason Wang Signed-off-by: Eugenio Pérez Reviewed-by: Jens Freimann --- include/hw/virtio/virtio.h | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h index d123d5b181..40ddeafadb 100644 --- a/include/hw/virtio/virtio.h +++ b/include/hw/virtio/virtio.h @@ -285,7 +285,9 @@ typedef struct VirtIORNGConf VirtIORNGConf; DEFINE_PROP_BIT64("any_layout", _state, _field, \ VIRTIO_F_ANY_LAYOUT, true), \ DEFINE_PROP_BIT64("iommu_platform", _state, _field, \ - VIRTIO_F_IOMMU_PLATFORM, false) + VIRTIO_F_IOMMU_PLATFORM, false), \ +DEFINE_PROP_BIT64("packed", _state, _field, \ + VIRTIO_F_RING_PACKED, false) hwaddr virtio_queue_get_desc_addr(VirtIODevice *vdev, int n); bool virtio_queue_enabled(VirtIODevice *vdev, int n); -- 2.16.5
[PATCH v7 7/8] vhost_net: enable packed ring support
From: Jason Wang Signed-off-by: Jason Wang Signed-off-by: Eugenio Pérez Reviewed-by: Jens Freimann --- hw/net/vhost_net.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c index e975700f95..6b82803fa7 100644 --- a/hw/net/vhost_net.c +++ b/hw/net/vhost_net.c @@ -49,6 +49,7 @@ static const int kernel_feature_bits[] = { VIRTIO_F_VERSION_1, VIRTIO_NET_F_MTU, VIRTIO_F_IOMMU_PLATFORM, +VIRTIO_F_RING_PACKED, VHOST_INVALID_FEATURE_BIT }; @@ -74,6 +75,7 @@ static const int user_feature_bits[] = { VIRTIO_NET_F_MRG_RXBUF, VIRTIO_NET_F_MTU, VIRTIO_F_IOMMU_PLATFORM, +VIRTIO_F_RING_PACKED, /* This bit implies RARP isn't sent by QEMU out of band */ VIRTIO_NET_F_GUEST_ANNOUNCE, -- 2.16.5
[PULL 14/19] hw: Move Xilinx ZynqMP RTC from hw/timer/ to hw/rtc/ subdirectory
From: Philippe Mathieu-Daudé Move RTC devices under the hw/rtc/ subdirectory. Remove Alistair outdated email address (see commit c22e580c2ad). Reviewed-by: Alistair Francis Signed-off-by: Philippe Mathieu-Daudé Acked-by: Peter Maydell Message-Id: <20191003230404.19384-10-phi...@redhat.com> Signed-off-by: Laurent Vivier --- hw/rtc/Makefile.objs| 1 + hw/rtc/trace-events | 3 +++ hw/{timer => rtc}/xlnx-zynqmp-rtc.c | 2 +- hw/timer/Makefile.objs | 1 - hw/timer/trace-events | 3 --- include/hw/arm/xlnx-zynqmp.h| 2 +- include/hw/{timer => rtc}/xlnx-zynqmp-rtc.h | 6 +++--- 7 files changed, 9 insertions(+), 9 deletions(-) rename hw/{timer => rtc}/xlnx-zynqmp-rtc.c (99%) rename include/hw/{timer => rtc}/xlnx-zynqmp-rtc.h (95%) diff --git a/hw/rtc/Makefile.objs b/hw/rtc/Makefile.objs index b195863291d1..543a550a0f11 100644 --- a/hw/rtc/Makefile.objs +++ b/hw/rtc/Makefile.objs @@ -6,5 +6,6 @@ common-obj-$(CONFIG_M48T59) += m48t59-isa.o endif common-obj-$(CONFIG_PL031) += pl031.o common-obj-$(CONFIG_TWL92230) += twl92230.o +common-obj-$(CONFIG_XLNX_ZYNQMP) += xlnx-zynqmp-rtc.o obj-$(CONFIG_MC146818RTC) += mc146818rtc.o common-obj-$(CONFIG_SUN4V_RTC) += sun4v-rtc.o diff --git a/hw/rtc/trace-events b/hw/rtc/trace-events index ac9e0e0fba32..7f1945ad4cc6 100644 --- a/hw/rtc/trace-events +++ b/hw/rtc/trace-events @@ -4,6 +4,9 @@ sun4v_rtc_read(uint64_t addr, uint64_t value) "read: addr 0x%" PRIx64 " value 0x%" PRIx64 sun4v_rtc_write(uint64_t addr, uint64_t value) "write: addr 0x%" PRIx64 " value 0x%" PRIx64 +# xlnx-zynqmp-rtc.c +xlnx_zynqmp_rtc_gettime(int year, int month, int day, int hour, int min, int sec) "Get time from host: %d-%d-%d %2d:%02d:%02d" + # pl031.c pl031_irq_state(int level) "irq state %d" pl031_read(uint32_t addr, uint32_t value) "addr 0x%08x value 0x%08x" diff --git a/hw/timer/xlnx-zynqmp-rtc.c b/hw/rtc/xlnx-zynqmp-rtc.c similarity index 99% rename from hw/timer/xlnx-zynqmp-rtc.c rename to hw/rtc/xlnx-zynqmp-rtc.c index 5692db98c2db..f9f09b72965a 100644 --- a/hw/timer/xlnx-zynqmp-rtc.c +++ b/hw/rtc/xlnx-zynqmp-rtc.c @@ -36,7 +36,7 @@ #include "qemu/cutils.h" #include "sysemu/sysemu.h" #include "trace.h" -#include "hw/timer/xlnx-zynqmp-rtc.h" +#include "hw/rtc/xlnx-zynqmp-rtc.h" #include "migration/vmstate.h" #ifndef XLNX_ZYNQMP_RTC_ERR_DEBUG diff --git a/hw/timer/Makefile.objs b/hw/timer/Makefile.objs index 70b61b69c7a4..294465ef47ad 100644 --- a/hw/timer/Makefile.objs +++ b/hw/timer/Makefile.objs @@ -14,7 +14,6 @@ common-obj-$(CONFIG_IMX) += imx_epit.o common-obj-$(CONFIG_IMX) += imx_gpt.o common-obj-$(CONFIG_LM32) += lm32_timer.o common-obj-$(CONFIG_MILKYMIST) += milkymist-sysctl.o -common-obj-$(CONFIG_XLNX_ZYNQMP) += xlnx-zynqmp-rtc.o common-obj-$(CONFIG_NRF51_SOC) += nrf51_timer.o common-obj-$(CONFIG_ALTERA_TIMER) += altera_timer.o diff --git a/hw/timer/trace-events b/hw/timer/trace-events index ce34b967db9f..1459d07237b9 100644 --- a/hw/timer/trace-events +++ b/hw/timer/trace-events @@ -70,9 +70,6 @@ cmsdk_apb_dualtimer_reset(void) "CMSDK APB dualtimer: reset" aspeed_rtc_read(uint64_t addr, uint64_t value) "addr 0x%02" PRIx64 " value 0x%08" PRIx64 aspeed_rtc_write(uint64_t addr, uint64_t value) "addr 0x%02" PRIx64 " value 0x%08" PRIx64 -# xlnx-zynqmp-rtc.c -xlnx_zynqmp_rtc_gettime(int year, int month, int day, int hour, int min, int sec) "Get time from host: %d-%d-%d %2d:%02d:%02d" - # nrf51_timer.c nrf51_timer_read(uint64_t addr, uint32_t value, unsigned size) "read addr 0x%" PRIx64 " data 0x%" PRIx32 " size %u" nrf51_timer_write(uint64_t addr, uint32_t value, unsigned size) "write addr 0x%" PRIx64 " data 0x%" PRIx32 " size %u" diff --git a/include/hw/arm/xlnx-zynqmp.h b/include/hw/arm/xlnx-zynqmp.h index d7483c3b4285..53076fa29a75 100644 --- a/include/hw/arm/xlnx-zynqmp.h +++ b/include/hw/arm/xlnx-zynqmp.h @@ -29,7 +29,7 @@ #include "hw/dma/xlnx-zdma.h" #include "hw/display/xlnx_dp.h" #include "hw/intc/xlnx-zynqmp-ipi.h" -#include "hw/timer/xlnx-zynqmp-rtc.h" +#include "hw/rtc/xlnx-zynqmp-rtc.h" #include "hw/cpu/cluster.h" #include "target/arm/cpu.h" diff --git a/include/hw/timer/xlnx-zynqmp-rtc.h b/include/hw/rtc/xlnx-zynqmp-rtc.h similarity index 95% rename from include/hw/timer/xlnx-zynqmp-rtc.h rename to include/hw/rtc/xlnx-zynqmp-rtc.h index 97e32322ed70..6fa1cb2f43f8 100644 --- a/include/hw/timer/xlnx-zynqmp-rtc.h +++ b/include/hw/rtc/xlnx-zynqmp-rtc.h @@ -3,7 +3,7 @@ * * Copyright (c) 2017 Xilinx Inc. * - * Written-by: Alistair Francis + * Written-by: Alistair Francis * * Permission is hereby granted, free of charge, to any person obtaining a copy * of this software and associated documentation files (the "Software"), to deal @@ -24,8 +24,8 @@ * THE SOFTWARE. */ -#ifndef HW_TIMER_XLNX_ZYNQMP_RTC_H -#define HW_TIMER_XLNX_ZYNQMP_RTC_H +#ifndef HW_RTC_XLNX_ZYNQMP_H +#define HW_RTC_XLNX_ZYNQMP_H
[PULL 18/19] hw/rtc/xlnx-zynqmp-rtc: Remove unused "ptimer.h" include
From: Philippe Mathieu-Daudé The "hw/ptimer.h" header is not used, remove it. Reviewed-by: Alistair Francis Signed-off-by: Philippe Mathieu-Daudé Acked-by: Peter Maydell Message-Id: <20191003230404.19384-14-phi...@redhat.com> Signed-off-by: Laurent Vivier --- hw/rtc/xlnx-zynqmp-rtc.c | 1 - 1 file changed, 1 deletion(-) diff --git a/hw/rtc/xlnx-zynqmp-rtc.c b/hw/rtc/xlnx-zynqmp-rtc.c index f9f09b72965a..2bcd14d7795d 100644 --- a/hw/rtc/xlnx-zynqmp-rtc.c +++ b/hw/rtc/xlnx-zynqmp-rtc.c @@ -32,7 +32,6 @@ #include "qemu/log.h" #include "qemu/module.h" #include "hw/irq.h" -#include "hw/ptimer.h" #include "qemu/cutils.h" #include "sysemu/sysemu.h" #include "trace.h" -- 2.21.0
[PULL 12/19] hw: Move TWL92230 device from hw/timer/ to hw/rtc/ subdirectory
From: Philippe Mathieu-Daudé The TWL92230 is an "energy management device" companion with a RTC. Since we mostly model the RTC, move it under the hw/rtc/ subdirectory. Reviewed-by: Alistair Francis Signed-off-by: Philippe Mathieu-Daudé Acked-by: Peter Maydell Message-Id: <20191003230404.19384-8-phi...@redhat.com> Signed-off-by: Laurent Vivier --- MAINTAINERS | 2 +- hw/rtc/Kconfig | 4 hw/rtc/Makefile.objs | 1 + hw/{timer => rtc}/twl92230.c | 0 hw/timer/Kconfig | 4 hw/timer/Makefile.objs | 1 - 6 files changed, 6 insertions(+), 6 deletions(-) rename hw/{timer => rtc}/twl92230.c (100%) diff --git a/MAINTAINERS b/MAINTAINERS index a7de5e2d1253..0dc72d37bdc8 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -663,7 +663,7 @@ F: hw/display/blizzard.c F: hw/input/lm832x.c F: hw/input/tsc2005.c F: hw/misc/cbus.c -F: hw/timer/twl92230.c +F: hw/rtc/twl92230.c F: include/hw/display/blizzard.h F: include/hw/input/tsc2xxx.h F: include/hw/misc/cbus.h diff --git a/hw/rtc/Kconfig b/hw/rtc/Kconfig index cc7fead764f4..dff9d60946af 100644 --- a/hw/rtc/Kconfig +++ b/hw/rtc/Kconfig @@ -8,6 +8,10 @@ config M48T59 config PL031 bool +config TWL92230 +bool +depends on I2C + config MC146818RTC bool diff --git a/hw/rtc/Makefile.objs b/hw/rtc/Makefile.objs index 4621b37bc2f6..810a38ee7b3d 100644 --- a/hw/rtc/Makefile.objs +++ b/hw/rtc/Makefile.objs @@ -4,5 +4,6 @@ ifeq ($(CONFIG_ISA_BUS),y) common-obj-$(CONFIG_M48T59) += m48t59-isa.o endif common-obj-$(CONFIG_PL031) += pl031.o +common-obj-$(CONFIG_TWL92230) += twl92230.o obj-$(CONFIG_MC146818RTC) += mc146818rtc.o common-obj-$(CONFIG_SUN4V_RTC) += sun4v-rtc.o diff --git a/hw/timer/twl92230.c b/hw/rtc/twl92230.c similarity index 100% rename from hw/timer/twl92230.c rename to hw/rtc/twl92230.c diff --git a/hw/timer/Kconfig b/hw/timer/Kconfig index b04c928136c2..9357875f285d 100644 --- a/hw/timer/Kconfig +++ b/hw/timer/Kconfig @@ -20,10 +20,6 @@ config HPET config I8254 bool -config TWL92230 -bool -depends on I2C - config ALTERA_TIMER bool select PTIMER diff --git a/hw/timer/Makefile.objs b/hw/timer/Makefile.objs index 034bd30255c0..23be70b71d32 100644 --- a/hw/timer/Makefile.objs +++ b/hw/timer/Makefile.objs @@ -7,7 +7,6 @@ common-obj-$(CONFIG_DS1338) += ds1338.o common-obj-$(CONFIG_HPET) += hpet.o common-obj-$(CONFIG_I8254) += i8254_common.o i8254.o common-obj-$(CONFIG_PUV3) += puv3_ost.o -common-obj-$(CONFIG_TWL92230) += twl92230.o common-obj-$(CONFIG_XILINX) += xilinx_timer.o common-obj-$(CONFIG_SLAVIO) += slavio_timer.o common-obj-$(CONFIG_ETRAXFS) += etraxfs_timer.o -- 2.21.0
[PULL 11/19] hw: Move sun4v hypervisor RTC from hw/timer/ to hw/rtc/ subdirectory
From: Philippe Mathieu-Daudé Move RTC devices under the hw/rtc/ subdirectory. Reviewed-by: Alistair Francis Signed-off-by: Philippe Mathieu-Daudé Reviewed-by: Artyom Tarasenko Message-Id: <20191003230404.19384-7-phi...@redhat.com> Signed-off-by: Laurent Vivier --- MAINTAINERS | 4 ++-- hw/rtc/Kconfig| 3 +++ hw/rtc/Makefile.objs | 1 + hw/{timer => rtc}/sun4v-rtc.c | 2 +- hw/rtc/trace-events | 4 hw/sparc64/niagara.c | 2 +- hw/timer/Kconfig | 3 --- hw/timer/Makefile.objs| 1 - hw/timer/trace-events | 4 include/hw/rtc/sun4v-rtc.h| 19 +++ include/hw/timer/sun4v-rtc.h | 1 - 11 files changed, 31 insertions(+), 13 deletions(-) rename hw/{timer => rtc}/sun4v-rtc.c (98%) create mode 100644 include/hw/rtc/sun4v-rtc.h delete mode 100644 include/hw/timer/sun4v-rtc.h diff --git a/MAINTAINERS b/MAINTAINERS index ba0d1906aae9..a7de5e2d1253 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1163,8 +1163,8 @@ Sun4v M: Artyom Tarasenko S: Maintained F: hw/sparc64/niagara.c -F: hw/timer/sun4v-rtc.c -F: include/hw/timer/sun4v-rtc.h +F: hw/rtc/sun4v-rtc.c +F: include/hw/rtc/sun4v-rtc.h Leon3 M: Fabien Chouteau diff --git a/hw/rtc/Kconfig b/hw/rtc/Kconfig index 434b20b2b1bd..cc7fead764f4 100644 --- a/hw/rtc/Kconfig +++ b/hw/rtc/Kconfig @@ -10,3 +10,6 @@ config PL031 config MC146818RTC bool + +config SUN4V_RTC +bool diff --git a/hw/rtc/Makefile.objs b/hw/rtc/Makefile.objs index 89e8e48c6409..4621b37bc2f6 100644 --- a/hw/rtc/Makefile.objs +++ b/hw/rtc/Makefile.objs @@ -5,3 +5,4 @@ common-obj-$(CONFIG_M48T59) += m48t59-isa.o endif common-obj-$(CONFIG_PL031) += pl031.o obj-$(CONFIG_MC146818RTC) += mc146818rtc.o +common-obj-$(CONFIG_SUN4V_RTC) += sun4v-rtc.o diff --git a/hw/timer/sun4v-rtc.c b/hw/rtc/sun4v-rtc.c similarity index 98% rename from hw/timer/sun4v-rtc.c rename to hw/rtc/sun4v-rtc.c index 54272a822fdf..ada01b57748b 100644 --- a/hw/timer/sun4v-rtc.c +++ b/hw/rtc/sun4v-rtc.c @@ -13,7 +13,7 @@ #include "hw/sysbus.h" #include "qemu/module.h" #include "qemu/timer.h" -#include "hw/timer/sun4v-rtc.h" +#include "hw/rtc/sun4v-rtc.h" #include "trace.h" diff --git a/hw/rtc/trace-events b/hw/rtc/trace-events index 54c94ac557b2..ac9e0e0fba32 100644 --- a/hw/rtc/trace-events +++ b/hw/rtc/trace-events @@ -1,5 +1,9 @@ # See docs/devel/tracing.txt for syntax documentation. +# sun4v-rtc.c +sun4v_rtc_read(uint64_t addr, uint64_t value) "read: addr 0x%" PRIx64 " value 0x%" PRIx64 +sun4v_rtc_write(uint64_t addr, uint64_t value) "write: addr 0x%" PRIx64 " value 0x%" PRIx64 + # pl031.c pl031_irq_state(int level) "irq state %d" pl031_read(uint32_t addr, uint32_t value) "addr 0x%08x value 0x%08x" diff --git a/hw/sparc64/niagara.c b/hw/sparc64/niagara.c index 5987693659ed..5eb2d097b904 100644 --- a/hw/sparc64/niagara.c +++ b/hw/sparc64/niagara.c @@ -30,7 +30,7 @@ #include "hw/misc/unimp.h" #include "hw/loader.h" #include "hw/sparc/sparc64.h" -#include "hw/timer/sun4v-rtc.h" +#include "hw/rtc/sun4v-rtc.h" #include "exec/address-spaces.h" #include "sysemu/block-backend.h" #include "qemu/error-report.h" diff --git a/hw/timer/Kconfig b/hw/timer/Kconfig index a6b668b2559b..b04c928136c2 100644 --- a/hw/timer/Kconfig +++ b/hw/timer/Kconfig @@ -35,9 +35,6 @@ config ALLWINNER_A10_PIT config STM32F2XX_TIMER bool -config SUN4V_RTC -bool - config CMSDK_APB_TIMER bool select PTIMER diff --git a/hw/timer/Makefile.objs b/hw/timer/Makefile.objs index 2fb12162a623..034bd30255c0 100644 --- a/hw/timer/Makefile.objs +++ b/hw/timer/Makefile.objs @@ -35,7 +35,6 @@ common-obj-$(CONFIG_ALLWINNER_A10_PIT) += allwinner-a10-pit.o common-obj-$(CONFIG_STM32F2XX_TIMER) += stm32f2xx_timer.o common-obj-$(CONFIG_ASPEED_SOC) += aspeed_timer.o aspeed_rtc.o -common-obj-$(CONFIG_SUN4V_RTC) += sun4v-rtc.o common-obj-$(CONFIG_CMSDK_APB_TIMER) += cmsdk-apb-timer.o common-obj-$(CONFIG_CMSDK_APB_DUALTIMER) += cmsdk-apb-dualtimer.o common-obj-$(CONFIG_MSF2) += mss-timer.o diff --git a/hw/timer/trace-events b/hw/timer/trace-events index 6936fe8573e9..ce34b967db9f 100644 --- a/hw/timer/trace-events +++ b/hw/timer/trace-events @@ -70,10 +70,6 @@ cmsdk_apb_dualtimer_reset(void) "CMSDK APB dualtimer: reset" aspeed_rtc_read(uint64_t addr, uint64_t value) "addr 0x%02" PRIx64 " value 0x%08" PRIx64 aspeed_rtc_write(uint64_t addr, uint64_t value) "addr 0x%02" PRIx64 " value 0x%08" PRIx64 -# sun4v-rtc.c -sun4v_rtc_read(uint64_t addr, uint64_t value) "read: addr 0x%" PRIx64 " value 0x%" PRIx64 -sun4v_rtc_write(uint64_t addr, uint64_t value) "write: addr 0x%" PRIx64 " value 0x%" PRIx64 - # xlnx-zynqmp-rtc.c xlnx_zynqmp_rtc_gettime(int year, int month, int day, int hour, int min, int sec) "Get time from host: %d-%d-%d %2d:%02d:%02d" diff --git a/include/hw/rtc/sun4v-rtc.h b/include/hw/rtc/sun4v-rtc.h new file mode 100644 index ..fd868f6ed2fa --- /dev/null +++
[PULL 17/19] hw/rtc/mc146818: Include mc146818rtc_regs.h a bit less
From: Philippe Mathieu-Daudé Only 2 source files require the "mc146818rtc_regs.h" header. Instead of having it processed 12 times, by all objects using "mc146818rtc.h", include it directly where used. Reviewed-by: Alistair Francis Signed-off-by: Philippe Mathieu-Daudé Message-Id: <20191003230404.19384-13-phi...@redhat.com> Signed-off-by: Laurent Vivier --- hw/rtc/mc146818rtc.c | 1 + hw/timer/hpet.c | 1 + include/hw/rtc/mc146818rtc.h | 1 - 3 files changed, 2 insertions(+), 1 deletion(-) diff --git a/hw/rtc/mc146818rtc.c b/hw/rtc/mc146818rtc.c index ced15f764fc1..9d4ed54f65e2 100644 --- a/hw/rtc/mc146818rtc.c +++ b/hw/rtc/mc146818rtc.c @@ -35,6 +35,7 @@ #include "sysemu/reset.h" #include "sysemu/runstate.h" #include "hw/rtc/mc146818rtc.h" +#include "hw/rtc/mc146818rtc_regs.h" #include "migration/vmstate.h" #include "qapi/error.h" #include "qapi/qapi-commands-misc-target.h" diff --git a/hw/timer/hpet.c b/hw/timer/hpet.c index 02bf8a8ce8fc..9f17aaa278e2 100644 --- a/hw/timer/hpet.c +++ b/hw/timer/hpet.c @@ -34,6 +34,7 @@ #include "hw/timer/hpet.h" #include "hw/sysbus.h" #include "hw/rtc/mc146818rtc.h" +#include "hw/rtc/mc146818rtc_regs.h" #include "migration/vmstate.h" #include "hw/timer/i8254.h" diff --git a/include/hw/rtc/mc146818rtc.h b/include/hw/rtc/mc146818rtc.h index 2e9331637a6d..7fa59d4279c5 100644 --- a/include/hw/rtc/mc146818rtc.h +++ b/include/hw/rtc/mc146818rtc.h @@ -10,7 +10,6 @@ #define HW_RTC_MC146818RTC_H #include "hw/isa/isa.h" -#include "hw/rtc/mc146818rtc_regs.h" #define TYPE_MC146818_RTC "mc146818rtc" -- 2.21.0
[PATCH v7 5/8] virtio: basic packed virtqueue support
From: Jason Wang This patch implements basic support for the packed virtqueue. Compare the split virtqueue which has three rings, packed virtqueue only have one which is supposed to have better cache utilization and more hardware friendly. Please refer virtio specification for more information. Signed-off-by: Wei Xu Signed-off-by: Jason Wang Signed-off-by: Eugenio Pérez --- hw/block/virtio-blk.c | 2 +- hw/char/virtio-serial-bus.c | 2 +- hw/scsi/virtio-scsi.c | 3 +- hw/virtio/virtio.c | 901 include/hw/virtio/virtio.h | 10 +- 5 files changed, 836 insertions(+), 82 deletions(-) diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c index ba846fe9dc..7dbdeaaab9 100644 --- a/hw/block/virtio-blk.c +++ b/hw/block/virtio-blk.c @@ -1052,7 +1052,7 @@ static void virtio_blk_save_device(VirtIODevice *vdev, QEMUFile *f) qemu_put_be32(f, virtio_get_queue_index(req->vq)); } -qemu_put_virtqueue_element(f, &req->elem); +qemu_put_virtqueue_element(vdev, f, &req->elem); req = req->next; } qemu_put_sbyte(f, 0); diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c index 4e0ed829ae..33259042a9 100644 --- a/hw/char/virtio-serial-bus.c +++ b/hw/char/virtio-serial-bus.c @@ -708,7 +708,7 @@ static void virtio_serial_save_device(VirtIODevice *vdev, QEMUFile *f) if (elem_popped) { qemu_put_be32s(f, &port->iov_idx); qemu_put_be64s(f, &port->iov_offset); -qemu_put_virtqueue_element(f, port->elem); +qemu_put_virtqueue_element(vdev, f, port->elem); } } } diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c index ee52aa7d17..e8b2b64d09 100644 --- a/hw/scsi/virtio-scsi.c +++ b/hw/scsi/virtio-scsi.c @@ -190,11 +190,12 @@ static void virtio_scsi_save_request(QEMUFile *f, SCSIRequest *sreq) { VirtIOSCSIReq *req = sreq->hba_private; VirtIOSCSICommon *vs = VIRTIO_SCSI_COMMON(req->dev); +VirtIODevice *vdev = VIRTIO_DEVICE(req->dev); uint32_t n = virtio_get_queue_index(req->vq) - 2; assert(n < vs->conf.num_queues); qemu_put_be32s(f, &n); -qemu_put_virtqueue_element(f, &req->elem); +qemu_put_virtqueue_element(vdev, f, &req->elem); } static void *virtio_scsi_load_request(QEMUFile *f, SCSIRequest *sreq) diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c index 74cc10fad9..a694b4ab50 100644 --- a/hw/virtio/virtio.c +++ b/hw/virtio/virtio.c @@ -96,6 +96,7 @@ typedef struct VRingPackedDescEvent { struct VirtQueue { VRing vring; +VirtQueueElement *used_elems; /* Next head to pop */ uint16_t last_avail_idx; @@ -160,6 +161,7 @@ static void virtio_init_region_cache(VirtIODevice *vdev, int n) VRingMemoryRegionCaches *new = NULL; hwaddr addr, size; int64_t len; +bool packed; addr = vq->vring.desc; @@ -168,8 +170,10 @@ static void virtio_init_region_cache(VirtIODevice *vdev, int n) } new = g_new0(VRingMemoryRegionCaches, 1); size = virtio_queue_get_desc_size(vdev, n); +packed = virtio_vdev_has_feature(vq->vdev, VIRTIO_F_RING_PACKED) ? + true : false; len = address_space_cache_init(&new->desc, vdev->dma_as, - addr, size, false); + addr, size, packed); if (len < size) { virtio_error(vdev, "Cannot map desc"); goto err_desc; @@ -225,8 +229,8 @@ void virtio_queue_update_rings(VirtIODevice *vdev, int n) } /* Called within rcu_read_lock(). */ -static void vring_desc_read(VirtIODevice *vdev, VRingDesc *desc, -MemoryRegionCache *cache, int i) +static void vring_split_desc_read(VirtIODevice *vdev, VRingDesc *desc, + MemoryRegionCache *cache, int i) { address_space_read_cached(cache, i * sizeof(VRingDesc), desc, sizeof(VRingDesc)); @@ -236,6 +240,7 @@ static void vring_desc_read(VirtIODevice *vdev, VRingDesc *desc, virtio_tswap16s(vdev, &desc->next); } +/* Called within rcu_read_lock(). */ static VRingMemoryRegionCaches *vring_get_region_caches(struct VirtQueue *vq) { VRingMemoryRegionCaches *caches = atomic_rcu_read(&vq->vring.caches); @@ -370,6 +375,95 @@ int virtio_queue_ready(VirtQueue *vq) return vq->vring.avail != 0; } +static void vring_packed_desc_read_flags(VirtIODevice *vdev, + uint16_t *flags, + MemoryRegionCache *cache, + int i) +{ +address_space_read_cached(cache, + i * sizeof(VRingPackedDesc) + + offsetof(VRingPackedDesc, flags), + flags, sizeof(*flags)); +virtio_tswap16s(vdev, flags); +} + +static void vring_packed_desc_read(Virt
[PULL 09/19] hw: Move M48T59 device from hw/timer/ to hw/rtc/ subdirectory
From: Philippe Mathieu-Daudé The M48T59 is a Real Time Clock, not a timer. Move it under the hw/rtc/ subdirectory. Reviewed-by: Alistair Francis Signed-off-by: Philippe Mathieu-Daudé Message-Id: <20191003230404.19384-5-phi...@redhat.com> Signed-off-by: Laurent Vivier --- MAINTAINERS | 4 +- hw/ppc/ppc405_boards.c | 2 +- hw/ppc/prep.c | 2 +- hw/rtc/Kconfig | 3 ++ hw/rtc/Makefile.objs| 4 ++ hw/{timer => rtc}/m48t59-internal.h | 0 hw/{timer => rtc}/m48t59-isa.c | 4 +- hw/{timer => rtc}/m48t59.c | 2 +- hw/sparc/sun4m.c| 2 +- hw/sparc64/sun4u.c | 2 +- hw/timer/Kconfig| 3 -- hw/timer/Makefile.objs | 4 -- include/hw/rtc/m48t59.h | 57 + include/hw/timer/m48t59.h | 32 14 files changed, 73 insertions(+), 48 deletions(-) rename hw/{timer => rtc}/m48t59-internal.h (100%) rename hw/{timer => rtc}/m48t59-isa.c (98%) rename hw/{timer => rtc}/m48t59.c (99%) create mode 100644 include/hw/rtc/m48t59.h delete mode 100644 include/hw/timer/m48t59.h diff --git a/MAINTAINERS b/MAINTAINERS index 7eba146444ae..4e65f062f29c 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1064,9 +1064,9 @@ F: hw/pci-host/prep.[hc] F: hw/isa/i82378.c F: hw/isa/pc87312.c F: hw/dma/i82374.c -F: hw/timer/m48t59-isa.c +F: hw/rtc/m48t59-isa.c F: include/hw/isa/pc87312.h -F: include/hw/timer/m48t59.h +F: include/hw/rtc/m48t59.h F: pc-bios/ppc_rom.bin sPAPR diff --git a/hw/ppc/ppc405_boards.c b/hw/ppc/ppc405_boards.c index 388cae0b4370..1f721feed6a4 100644 --- a/hw/ppc/ppc405_boards.c +++ b/hw/ppc/ppc405_boards.c @@ -29,7 +29,7 @@ #include "cpu.h" #include "hw/ppc/ppc.h" #include "ppc405.h" -#include "hw/timer/m48t59.h" +#include "hw/rtc/m48t59.h" #include "hw/block/flash.h" #include "sysemu/sysemu.h" #include "sysemu/qtest.h" diff --git a/hw/ppc/prep.c b/hw/ppc/prep.c index 3a51536e1a39..862345c2ac5f 100644 --- a/hw/ppc/prep.c +++ b/hw/ppc/prep.c @@ -25,7 +25,7 @@ #include "qemu/osdep.h" #include "cpu.h" -#include "hw/timer/m48t59.h" +#include "hw/rtc/m48t59.h" #include "hw/char/serial.h" #include "hw/block/fdc.h" #include "net/net.h" diff --git a/hw/rtc/Kconfig b/hw/rtc/Kconfig index 7ffd702268ad..159c2335171a 100644 --- a/hw/rtc/Kconfig +++ b/hw/rtc/Kconfig @@ -1,3 +1,6 @@ +config M48T59 +bool + config PL031 bool diff --git a/hw/rtc/Makefile.objs b/hw/rtc/Makefile.objs index 3cac0d5a637b..c87f81405e9d 100644 --- a/hw/rtc/Makefile.objs +++ b/hw/rtc/Makefile.objs @@ -1,2 +1,6 @@ +common-obj-$(CONFIG_M48T59) += m48t59.o +ifeq ($(CONFIG_ISA_BUS),y) +common-obj-$(CONFIG_M48T59) += m48t59-isa.o +endif common-obj-$(CONFIG_PL031) += pl031.o obj-$(CONFIG_MC146818RTC) += mc146818rtc.o diff --git a/hw/timer/m48t59-internal.h b/hw/rtc/m48t59-internal.h similarity index 100% rename from hw/timer/m48t59-internal.h rename to hw/rtc/m48t59-internal.h diff --git a/hw/timer/m48t59-isa.c b/hw/rtc/m48t59-isa.c similarity index 98% rename from hw/timer/m48t59-isa.c rename to hw/rtc/m48t59-isa.c index 5e5432abfdfa..7fde854c0f8f 100644 --- a/hw/timer/m48t59-isa.c +++ b/hw/rtc/m48t59-isa.c @@ -1,5 +1,5 @@ /* - * QEMU M48T59 and M48T08 NVRAM emulation (ISA bus interface + * QEMU M48T59 and M48T08 NVRAM emulation (ISA bus interface) * * Copyright (c) 2003-2005, 2007 Jocelyn Mayer * Copyright (c) 2013 Hervé Poussineau @@ -26,7 +26,7 @@ #include "qemu/osdep.h" #include "hw/isa/isa.h" #include "hw/qdev-properties.h" -#include "hw/timer/m48t59.h" +#include "hw/rtc/m48t59.h" #include "m48t59-internal.h" #include "qemu/module.h" diff --git a/hw/timer/m48t59.c b/hw/rtc/m48t59.c similarity index 99% rename from hw/timer/m48t59.c rename to hw/rtc/m48t59.c index a9fc2f981a56..fc592b9fb126 100644 --- a/hw/timer/m48t59.c +++ b/hw/rtc/m48t59.c @@ -27,7 +27,7 @@ #include "qemu-common.h" #include "hw/irq.h" #include "hw/qdev-properties.h" -#include "hw/timer/m48t59.h" +#include "hw/rtc/m48t59.h" #include "qemu/timer.h" #include "sysemu/runstate.h" #include "sysemu/sysemu.h" diff --git a/hw/sparc/sun4m.c b/hw/sparc/sun4m.c index 6c5a17a02055..2aaa5bf1ae2e 100644 --- a/hw/sparc/sun4m.c +++ b/hw/sparc/sun4m.c @@ -31,7 +31,7 @@ #include "qemu/error-report.h" #include "qemu/timer.h" #include "hw/sparc/sun4m_iommu.h" -#include "hw/timer/m48t59.h" +#include "hw/rtc/m48t59.h" #include "migration/vmstate.h" #include "hw/sparc/sparc32_dma.h" #include "hw/block/fdc.h" diff --git a/hw/sparc64/sun4u.c b/hw/sparc64/sun4u.c index 1ded2a4c9ab3..955082773b06 100644 --- a/hw/sparc64/sun4u.c +++ b/hw/sparc64/sun4u.c @@ -36,7 +36,7 @@ #include "hw/pci-host/sabre.h" #include "hw/char/serial.h" #include "hw/char/parallel.h" -#include "hw/timer/m48t59.h" +#include "hw/rtc/m48t59.h" #include "migration/vmstate.h" #include "hw/input/i8042.h" #include "hw/block/fdc.h" diff --
[PULL 15/19] hw: Move Exynos4210 RTC from hw/timer/ to hw/rtc/ subdirectory
From: Philippe Mathieu-Daudé Move RTC devices under the hw/rtc/ subdirectory. Reviewed-by: Alistair Francis Signed-off-by: Philippe Mathieu-Daudé Acked-by: Peter Maydell Message-Id: <20191003230404.19384-11-phi...@redhat.com> Signed-off-by: Laurent Vivier --- hw/rtc/Makefile.objs | 1 + hw/{timer => rtc}/exynos4210_rtc.c | 0 hw/timer/Makefile.objs | 1 - 3 files changed, 1 insertion(+), 1 deletion(-) rename hw/{timer => rtc}/exynos4210_rtc.c (100%) diff --git a/hw/rtc/Makefile.objs b/hw/rtc/Makefile.objs index 543a550a0f11..3d4763fc269e 100644 --- a/hw/rtc/Makefile.objs +++ b/hw/rtc/Makefile.objs @@ -7,5 +7,6 @@ endif common-obj-$(CONFIG_PL031) += pl031.o common-obj-$(CONFIG_TWL92230) += twl92230.o common-obj-$(CONFIG_XLNX_ZYNQMP) += xlnx-zynqmp-rtc.o +common-obj-$(CONFIG_EXYNOS4) += exynos4210_rtc.o obj-$(CONFIG_MC146818RTC) += mc146818rtc.o common-obj-$(CONFIG_SUN4V_RTC) += sun4v-rtc.o diff --git a/hw/timer/exynos4210_rtc.c b/hw/rtc/exynos4210_rtc.c similarity index 100% rename from hw/timer/exynos4210_rtc.c rename to hw/rtc/exynos4210_rtc.c diff --git a/hw/timer/Makefile.objs b/hw/timer/Makefile.objs index 294465ef47ad..33191d74cb98 100644 --- a/hw/timer/Makefile.objs +++ b/hw/timer/Makefile.objs @@ -19,7 +19,6 @@ common-obj-$(CONFIG_NRF51_SOC) += nrf51_timer.o common-obj-$(CONFIG_ALTERA_TIMER) += altera_timer.o common-obj-$(CONFIG_EXYNOS4) += exynos4210_mct.o common-obj-$(CONFIG_EXYNOS4) += exynos4210_pwm.o -common-obj-$(CONFIG_EXYNOS4) += exynos4210_rtc.o common-obj-$(CONFIG_OMAP) += omap_gptimer.o common-obj-$(CONFIG_OMAP) += omap_synctimer.o common-obj-$(CONFIG_PXA2XX) += pxa2xx_timer.o -- 2.21.0
[PULL 05/19] qemu-timer: reuse MIN macro in qemu_timeout_ns_to_ms
From: Frediano Ziglio Signed-off-by: Frediano Ziglio Reviewed-by: Laurent Vivier Message-Id: <20191023122652.2999-3-fzig...@redhat.com> [lv: removed the two useless casts] Reviewed-by: Eric Blake Signed-off-by: Laurent Vivier --- util/qemu-timer.c | 6 +- 1 file changed, 1 insertion(+), 5 deletions(-) diff --git a/util/qemu-timer.c b/util/qemu-timer.c index d428fec56772..ef52d28d3753 100644 --- a/util/qemu-timer.c +++ b/util/qemu-timer.c @@ -322,11 +322,7 @@ int qemu_timeout_ns_to_ms(int64_t ns) ms = DIV_ROUND_UP(ns, SCALE_MS); /* To avoid overflow problems, limit this to 2^31, i.e. approx 25 days */ -if (ms > (int64_t) INT32_MAX) { -ms = INT32_MAX; -} - -return (int) ms; +return MIN(ms, INT32_MAX); } -- 2.21.0
[PULL 10/19] hw: Move M41T80 device from hw/timer/ to hw/rtc/ subdirectory
From: Philippe Mathieu-Daudé The M41T80 is a Real Time Clock, not a timer. Move it under the hw/rtc/ subdirectory. Reviewed-by: Alistair Francis Signed-off-by: Philippe Mathieu-Daudé Message-Id: <20191003230404.19384-6-phi...@redhat.com> Signed-off-by: Laurent Vivier --- MAINTAINERS| 2 +- hw/rtc/Kconfig | 4 hw/rtc/Makefile.objs | 1 + hw/{timer => rtc}/m41t80.c | 0 hw/timer/Kconfig | 4 hw/timer/Makefile.objs | 1 - 6 files changed, 6 insertions(+), 6 deletions(-) rename hw/{timer => rtc}/m41t80.c (100%) diff --git a/MAINTAINERS b/MAINTAINERS index 4e65f062f29c..ba0d1906aae9 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -,7 +,7 @@ F: hw/ppc/sam460ex.c F: hw/ppc/ppc440_pcix.c F: hw/display/sm501* F: hw/ide/sii3112.c -F: hw/timer/m41t80.c +F: hw/rtc/m41t80.c F: pc-bios/canyonlands.dt[sb] F: pc-bios/u-boot-sam460ex-20100605.bin F: roms/u-boot-sam460ex diff --git a/hw/rtc/Kconfig b/hw/rtc/Kconfig index 159c2335171a..434b20b2b1bd 100644 --- a/hw/rtc/Kconfig +++ b/hw/rtc/Kconfig @@ -1,3 +1,7 @@ +config M41T80 +bool +depends on I2C + config M48T59 bool diff --git a/hw/rtc/Makefile.objs b/hw/rtc/Makefile.objs index c87f81405e9d..89e8e48c6409 100644 --- a/hw/rtc/Makefile.objs +++ b/hw/rtc/Makefile.objs @@ -1,3 +1,4 @@ +common-obj-$(CONFIG_M41T80) += m41t80.o common-obj-$(CONFIG_M48T59) += m48t59.o ifeq ($(CONFIG_ISA_BUS),y) common-obj-$(CONFIG_M48T59) += m48t59-isa.o diff --git a/hw/timer/m41t80.c b/hw/rtc/m41t80.c similarity index 100% rename from hw/timer/m41t80.c rename to hw/rtc/m41t80.c diff --git a/hw/timer/Kconfig b/hw/timer/Kconfig index a57e9b59fca8..a6b668b2559b 100644 --- a/hw/timer/Kconfig +++ b/hw/timer/Kconfig @@ -20,10 +20,6 @@ config HPET config I8254 bool -config M41T80 -bool -depends on I2C - config TWL92230 bool depends on I2C diff --git a/hw/timer/Makefile.objs b/hw/timer/Makefile.objs index fe2d1fbc4040..2fb12162a623 100644 --- a/hw/timer/Makefile.objs +++ b/hw/timer/Makefile.objs @@ -6,7 +6,6 @@ common-obj-$(CONFIG_CADENCE) += cadence_ttc.o common-obj-$(CONFIG_DS1338) += ds1338.o common-obj-$(CONFIG_HPET) += hpet.o common-obj-$(CONFIG_I8254) += i8254_common.o i8254.o -common-obj-$(CONFIG_M41T80) += m41t80.o common-obj-$(CONFIG_PUV3) += puv3_ost.o common-obj-$(CONFIG_TWL92230) += twl92230.o common-obj-$(CONFIG_XILINX) += xilinx_timer.o -- 2.21.0
[PATCH v7 4/8] virtio: Free rnd virqueue at unrealize()
The function virtio_del_queue was not called at unrealize() callback. This was detected due to add an allocated element on the vq introduce in future commits (used_elems) and running address sanitizer memory leak detector. Signed-off-by: Eugenio Pérez --- hw/virtio/virtio-rng.c | 1 + 1 file changed, 1 insertion(+) diff --git a/hw/virtio/virtio-rng.c b/hw/virtio/virtio-rng.c index e93bed020f..b498a20332 100644 --- a/hw/virtio/virtio-rng.c +++ b/hw/virtio/virtio-rng.c @@ -238,6 +238,7 @@ static void virtio_rng_device_unrealize(DeviceState *dev, Error **errp) qemu_del_vm_change_state_handler(vrng->vmstate); timer_del(vrng->rate_limit_timer); timer_free(vrng->rate_limit_timer); +virtio_del_queue(vdev, 0); virtio_cleanup(vdev); } -- 2.16.5
[PULL 08/19] hw: Move MC146818 device from hw/timer/ to hw/rtc/ subdirectory
From: Philippe Mathieu-Daudé The MC146818 is a Real Time Clock, not a timer. Move it under the hw/rtc/ subdirectory. Use copyright statement from 80cabfad163 for "hw/rtc/mc146818rtc.h". Reviewed-by: Alistair Francis Acked-by: David Gibson Signed-off-by: Philippe Mathieu-Daudé Message-Id: <20191003230404.19384-4-phi...@redhat.com> Signed-off-by: Laurent Vivier --- MAINTAINERS | 4 ++-- hw/alpha/dp264.c | 2 +- hw/hppa/machine.c| 2 +- hw/i386/acpi-build.c | 2 +- hw/i386/pc.c | 2 +- hw/i386/pc_q35.c | 2 +- hw/mips/mips_fulong2e.c | 2 +- hw/mips/mips_jazz.c | 2 +- hw/mips/mips_malta.c | 2 +- hw/mips/mips_r4k.c | 2 +- hw/ppc/pnv.c | 2 +- hw/ppc/prep.c| 2 +- hw/rtc/Kconfig | 3 +++ hw/rtc/Makefile.objs | 1 + hw/{timer => rtc}/mc146818rtc.c | 2 +- hw/timer/Kconfig | 3 --- hw/timer/Makefile.objs | 2 -- hw/timer/hpet.c | 2 +- include/hw/{timer => rtc}/mc146818rtc.h | 14 +++--- include/hw/{timer => rtc}/mc146818rtc_regs.h | 5 +++-- tests/rtc-test.c | 2 +- 21 files changed, 34 insertions(+), 26 deletions(-) rename hw/{timer => rtc}/mc146818rtc.c (99%) rename include/hw/{timer => rtc}/mc146818rtc.h (58%) rename include/hw/{timer => rtc}/mc146818rtc_regs.h (96%) diff --git a/MAINTAINERS b/MAINTAINERS index 2e13ba46282d..7eba146444ae 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1261,7 +1261,7 @@ F: hw/misc/debugexit.c F: hw/misc/pc-testdev.c F: hw/timer/hpet* F: hw/timer/i8254* -F: hw/timer/mc146818rtc* +F: hw/rtc/mc146818rtc* F: hw/watchdog/wdt_ib700.c F: hw/watchdog/wdt_i6300esb.c F: include/hw/display/vga.h @@ -1273,7 +1273,7 @@ F: include/hw/isa/i8259_internal.h F: include/hw/isa/superio.h F: include/hw/timer/hpet.h F: include/hw/timer/i8254* -F: include/hw/timer/mc146818rtc* +F: include/hw/rtc/mc146818rtc* Machine core M: Eduardo Habkost diff --git a/hw/alpha/dp264.c b/hw/alpha/dp264.c index 51feee855812..51b3cf7a6128 100644 --- a/hw/alpha/dp264.c +++ b/hw/alpha/dp264.c @@ -14,7 +14,7 @@ #include "alpha_sys.h" #include "qemu/error-report.h" #include "sysemu/sysemu.h" -#include "hw/timer/mc146818rtc.h" +#include "hw/rtc/mc146818rtc.h" #include "hw/ide.h" #include "hw/timer/i8254.h" #include "hw/isa/superio.h" diff --git a/hw/hppa/machine.c b/hw/hppa/machine.c index 953d454f4879..b30aba6d5439 100644 --- a/hw/hppa/machine.c +++ b/hw/hppa/machine.c @@ -12,7 +12,7 @@ #include "qemu/error-report.h" #include "sysemu/reset.h" #include "sysemu/sysemu.h" -#include "hw/timer/mc146818rtc.h" +#include "hw/rtc/mc146818rtc.h" #include "hw/ide.h" #include "hw/timer/i8254.h" #include "hw/char/serial.h" diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index 1d077a7cb772..d9435ba0b310 100644 --- a/hw/i386/acpi-build.c +++ b/hw/i386/acpi-build.c @@ -45,7 +45,7 @@ #include "hw/acpi/vmgenid.h" #include "hw/boards.h" #include "sysemu/tpm_backend.h" -#include "hw/timer/mc146818rtc_regs.h" +#include "hw/rtc/mc146818rtc_regs.h" #include "migration/vmstate.h" #include "hw/mem/memory-device.h" #include "sysemu/numa.h" diff --git a/hw/i386/pc.c b/hw/i386/pc.c index 4b1904237ec6..51b72439b440 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -42,7 +42,7 @@ #include "elf.h" #include "migration/vmstate.h" #include "multiboot.h" -#include "hw/timer/mc146818rtc.h" +#include "hw/rtc/mc146818rtc.h" #include "hw/dma/i8257.h" #include "hw/timer/i8254.h" #include "hw/input/i8042.h" diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c index 8fad20f3146a..748fc2ee15bf 100644 --- a/hw/i386/pc_q35.c +++ b/hw/i386/pc_q35.c @@ -33,7 +33,7 @@ #include "hw/loader.h" #include "sysemu/arch_init.h" #include "hw/i2c/smbus_eeprom.h" -#include "hw/timer/mc146818rtc.h" +#include "hw/rtc/mc146818rtc.h" #include "hw/xen/xen.h" #include "sysemu/kvm.h" #include "kvm_i386.h" diff --git a/hw/mips/mips_fulong2e.c b/hw/mips/mips_fulong2e.c index cf537dd7e631..03a27e176796 100644 --- a/hw/mips/mips_fulong2e.c +++ b/hw/mips/mips_fulong2e.c @@ -39,7 +39,7 @@ #include "hw/ide.h" #include "elf.h" #include "hw/isa/vt82c686.h" -#include "hw/timer/mc146818rtc.h" +#include "hw/rtc/mc146818rtc.h" #include "hw/timer/i8254.h" #include "exec/address-spaces.h" #include "sysemu/qtest.h" diff --git a/hw/mips/mips_jazz.c b/hw/mips/mips_jazz.c index 8d010a0b6e10..d978bb64a0f5 100644 --- a/hw/mips/mips_jazz.c +++ b/hw/mips/mips_jazz.c @@ -39,7 +39,7 @@ #include "hw/scsi/esp.h" #include "hw/mips/bios.h" #include "hw/loader.h" -#include "hw/timer/mc146818rtc.h" +#include "hw/rtc/m
[PULL 13/19] hw: Move DS1338 device from hw/timer/ to hw/rtc/ subdirectory
From: Philippe Mathieu-Daudé The DS1338 is a Real Time Clock, not a timer. Move it under the hw/rtc/ subdirectory. Reviewed-by: Alistair Francis Signed-off-by: Philippe Mathieu-Daudé Acked-by: Peter Maydell Message-Id: <20191003230404.19384-9-phi...@redhat.com> Signed-off-by: Laurent Vivier --- hw/rtc/Kconfig | 4 hw/rtc/Makefile.objs | 1 + hw/{timer => rtc}/ds1338.c | 0 hw/timer/Kconfig | 4 hw/timer/Makefile.objs | 1 - 5 files changed, 5 insertions(+), 5 deletions(-) rename hw/{timer => rtc}/ds1338.c (100%) diff --git a/hw/rtc/Kconfig b/hw/rtc/Kconfig index dff9d60946af..45daa8d655c9 100644 --- a/hw/rtc/Kconfig +++ b/hw/rtc/Kconfig @@ -1,3 +1,7 @@ +config DS1338 +bool +depends on I2C + config M41T80 bool depends on I2C diff --git a/hw/rtc/Makefile.objs b/hw/rtc/Makefile.objs index 810a38ee7b3d..b195863291d1 100644 --- a/hw/rtc/Makefile.objs +++ b/hw/rtc/Makefile.objs @@ -1,3 +1,4 @@ +common-obj-$(CONFIG_DS1338) += ds1338.o common-obj-$(CONFIG_M41T80) += m41t80.o common-obj-$(CONFIG_M48T59) += m48t59.o ifeq ($(CONFIG_ISA_BUS),y) diff --git a/hw/timer/ds1338.c b/hw/rtc/ds1338.c similarity index 100% rename from hw/timer/ds1338.c rename to hw/rtc/ds1338.c diff --git a/hw/timer/Kconfig b/hw/timer/Kconfig index 9357875f285d..a990f9fe35fc 100644 --- a/hw/timer/Kconfig +++ b/hw/timer/Kconfig @@ -9,10 +9,6 @@ config ARM_MPTIMER config A9_GTIMER bool -config DS1338 -bool -depends on I2C - config HPET bool default y if PC diff --git a/hw/timer/Makefile.objs b/hw/timer/Makefile.objs index 23be70b71d32..70b61b69c7a4 100644 --- a/hw/timer/Makefile.objs +++ b/hw/timer/Makefile.objs @@ -3,7 +3,6 @@ common-obj-$(CONFIG_ARM_MPTIMER) += arm_mptimer.o common-obj-$(CONFIG_ARM_V7M) += armv7m_systick.o common-obj-$(CONFIG_A9_GTIMER) += a9gtimer.o common-obj-$(CONFIG_CADENCE) += cadence_ttc.o -common-obj-$(CONFIG_DS1338) += ds1338.o common-obj-$(CONFIG_HPET) += hpet.o common-obj-$(CONFIG_I8254) += i8254_common.o i8254.o common-obj-$(CONFIG_PUV3) += puv3_ost.o -- 2.21.0