Re: [Qemu-block] [PATCH 2/3] xen-disk: add support for multi-page shared rings
On Tue, 20 Jun 2017, Paul Durrant wrote:
> The blkif protocol has had provision for negotiation of multi-page shared
> rings for some time now and many guest OSes have support in their frontend
> drivers.
>
> This patch makes the necessary modifications to xen-disk to support a shared
> ring up to order 4 (i.e. 16 pages).
>
> Signed-off-by: Paul Durrant

Thanks for the patch!

> ---
> Cc: Stefano Stabellini
> Cc: Anthony Perard
> Cc: Kevin Wolf
> Cc: Max Reitz
> ---
>  hw/block/xen_disk.c | 141
>  1 file changed, 110 insertions(+), 31 deletions(-)
>
> diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
> index 9b06e3aa81..a9942d32db 100644
> --- a/hw/block/xen_disk.c
> +++ b/hw/block/xen_disk.c
> @@ -36,8 +36,6 @@
>
>  static int batch_maps = 0;
>
> -static int max_requests = 32;
> -
>  /* ----- */
>
>  #define BLOCK_SIZE 512
> @@ -84,6 +82,8 @@ struct ioreq {
>      BlockAcctCookie     acct;
>  };
>
> +#define MAX_RING_PAGE_ORDER 4
> +
>  struct XenBlkDev {
>      struct XenDevice    xendev;  /* must be first */
>      char                *params;
> @@ -94,7 +94,8 @@ struct XenBlkDev {
>      bool                directiosafe;
>      const char          *fileproto;
>      const char          *filename;
> -    int                 ring_ref;
> +    unsigned int        ring_ref[1 << MAX_RING_PAGE_ORDER];
> +    unsigned int        nr_ring_ref;
>      void                *sring;
>      int64_t             file_blk;
>      int64_t             file_size;
> @@ -110,6 +111,7 @@ struct XenBlkDev {
>      int                 requests_total;
>      int                 requests_inflight;
>      int                 requests_finished;
> +    unsigned int        max_requests;
>
>      /* Persistent grants extension */
>      gboolean            feature_discard;
> @@ -199,7 +201,7 @@ static struct ioreq *ioreq_start(struct XenBlkDev *blkdev)
>      struct ioreq *ioreq = NULL;
>
>      if (QLIST_EMPTY(&blkdev->freelist)) {
> -        if (blkdev->requests_total >= max_requests) {
> +        if (blkdev->requests_total >= blkdev->max_requests) {
>              goto out;
>          }
>          /* allocate new struct */
> @@ -905,7 +907,7 @@ static void blk_handle_requests(struct XenBlkDev *blkdev)
>          ioreq_runio_qemu_aio(ioreq);
>      }
>
> -    if (blkdev->more_work && blkdev->requests_inflight < max_requests) {
> +    if (blkdev->more_work && blkdev->requests_inflight < blkdev->max_requests) {
>          qemu_bh_schedule(blkdev->bh);
>      }
>  }
> @@ -918,15 +920,6 @@ static void blk_bh(void *opaque)
>      blk_handle_requests(blkdev);
>  }
>
> -/*
> - * We need to account for the grant allocations requiring contiguous
> - * chunks; the worst case number would be
> - * max_req * max_seg + (max_req - 1) * (max_seg - 1) + 1,
> - * but in order to keep things simple just use
> - * 2 * max_req * max_seg.
> - */
> -#define MAX_GRANTS(max_req, max_seg) (2 * (max_req) * (max_seg))
> -
>  static void blk_alloc(struct XenDevice *xendev)
>  {
>      struct XenBlkDev *blkdev = container_of(xendev, struct XenBlkDev, xendev);
> @@ -938,11 +931,6 @@ static void blk_alloc(struct XenDevice *xendev)
>      if (xen_mode != XEN_EMULATE) {
>          batch_maps = 1;
>      }
> -    if (xengnttab_set_max_grants(xendev->gnttabdev,
> -        MAX_GRANTS(max_requests, BLKIF_MAX_SEGMENTS_PER_REQUEST)) < 0) {
> -        xen_pv_printf(xendev, 0, "xengnttab_set_max_grants failed: %s\n",
> -                      strerror(errno));
> -    }
>  }
>
>  static void blk_parse_discard(struct XenBlkDev *blkdev)
> @@ -1037,6 +1025,9 @@ static int blk_init(struct XenDevice *xendev)
>                            !blkdev->feature_grant_copy);
>      xenstore_write_be_int(&blkdev->xendev, "info", info);
>
> +    xenstore_write_be_int(&blkdev->xendev, "max-ring-page-order",
> +                          MAX_RING_PAGE_ORDER);
> +
>      blk_parse_discard(blkdev);
>
>      g_free(directiosafe);
> @@ -1058,12 +1049,25 @@ out_error:
>      return -1;
>  }
>
> +/*
> + * We need to account for the grant allocations requiring contiguous
> + * chunks; the worst case number would be
> + * max_req * max_seg + (max_req - 1) * (max_seg - 1) + 1,
> + * but in order to keep things simple just use
> + * 2 * max_req * max_seg.
> + */
> +#define MAX_GRANTS(max_req, max_seg) (2 * (max_req) * (max_seg))
> +
>  static int blk_connect(struct XenDevice *xendev)
>  {
>      struct XenBlkDev *blkdev = container_of(xendev, struct XenBlkDev, xendev);
>      int pers, index, qflags;
>      bool readonly = true;
>      bool writethrough = true;
> +    int order, ring_ref;
> +    unsigned int ring_size, max_grants;
> +    unsigned int i;
> +    uint32_t *domids;
>
>      /* read-only ? */
>      if (blkdev->directiosafe) {
Re: [Qemu-block] [PATCH 1/3] xen-disk: only advertize feature-persistent if grant copy is not available
On Tue, 20 Jun 2017, Paul Durrant wrote:
> If grant copy is available then it will always be used in preference to
> persistent maps. In this case feature-persistent should not be advertized
> to the frontend, otherwise it may needlessly copy data into persistently
> granted buffers.
>
> Signed-off-by: Paul Durrant

CC'ing Roger.

It is true that using feature-persistent together with grant copies is a
very bad idea. But this change establishes an explicit preference of
feature_grant_copy over feature-persistent in the xen_disk backend. It is
not obvious to me that it should be the case.

Why is feature_grant_copy (without feature-persistent) better than
feature-persistent (without feature_grant_copy)? Shouldn't we simply avoid
grant copies to copy data to persistent grants?

> ---
> Cc: Stefano Stabellini
> Cc: Anthony Perard
> Cc: Kevin Wolf
> Cc: Max Reitz
> ---
>  hw/block/xen_disk.c | 15
>  1 file changed, 8 insertions(+), 7 deletions(-)
>
> diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
> index 3a22805fbc..9b06e3aa81 100644
> --- a/hw/block/xen_disk.c
> +++ b/hw/block/xen_disk.c
> @@ -1023,11 +1023,18 @@ static int blk_init(struct XenDevice *xendev)
>
>      blkdev->file_blk = BLOCK_SIZE;
>
> +    blkdev->feature_grant_copy =
> +        (xengnttab_grant_copy(blkdev->xendev.gnttabdev, 0, NULL) == 0);
> +
> +    xen_pv_printf(&blkdev->xendev, 3, "grant copy operation %s\n",
> +                  blkdev->feature_grant_copy ? "enabled" : "disabled");
> +
>      /* fill info
>       * blk_connect supplies sector-size and sectors
>       */
>      xenstore_write_be_int(&blkdev->xendev, "feature-flush-cache", 1);
> -    xenstore_write_be_int(&blkdev->xendev, "feature-persistent", 1);
> +    xenstore_write_be_int(&blkdev->xendev, "feature-persistent",
> +                          !blkdev->feature_grant_copy);
>      xenstore_write_be_int(&blkdev->xendev, "info", info);
>
>      blk_parse_discard(blkdev);
> @@ -1202,12 +1209,6 @@ static int blk_connect(struct XenDevice *xendev)
>
>      xen_be_bind_evtchn(&blkdev->xendev);
>
> -    blkdev->feature_grant_copy =
> -        (xengnttab_grant_copy(blkdev->xendev.gnttabdev, 0, NULL) == 0);
> -
> -    xen_pv_printf(&blkdev->xendev, 3, "grant copy operation %s\n",
> -                  blkdev->feature_grant_copy ? "enabled" : "disabled");
> -
>      xen_pv_printf(&blkdev->xendev, 1, "ok: proto %s, ring-ref %d, "
>                    "remote port %d, local port %d\n",
>                    blkdev->xendev.protocol, blkdev->ring_ref,
> --
> 2.11.0
Re: [Qemu-block] [PATCH 3/3] xen-disk: use an IOThread per instance
> -----Original Message-----
> From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo Bonzini
> Sent: 20 June 2017 17:08
> To: Paul Durrant; xen-de...@lists.xenproject.org; qemu-de...@nongnu.org; qemu-block@nongnu.org
> Cc: Anthony Perard; Kevin Wolf; Stefano Stabellini; Max Reitz
> Subject: Re: [PATCH 3/3] xen-disk: use an IOThread per instance
>
> On 20/06/2017 15:47, Paul Durrant wrote:
> > This patch allocates an IOThread object for each xen_disk instance and
> > sets the AIO context appropriately on connect. This allows processing
> > of I/O to proceed in parallel.
> >
> > The patch also adds tracepoints into xen_disk to make it possible to
> > follow the state transitions of an instance in the log.
> >
> > Signed-off-by: Paul Durrant
>
> The QEMU block layer is not yet thread safe, but code running in
> IOThreads still has to take the AioContext lock. You need to call
> aio_context_acquire/release in blk_bh and qemu_aio_complete.
>

Ok, thanks. I'll update the patch and re-test.
Cheers, Paul > Paolo > > > --- > > Cc: Stefano Stabellini > > Cc: Anthony Perard > > Cc: Kevin Wolf > > Cc: Max Reitz > > --- > > hw/block/trace-events | 7 +++ > > hw/block/xen_disk.c | 44 > +++- > > 2 files changed, 50 insertions(+), 1 deletion(-) > > > > diff --git a/hw/block/trace-events b/hw/block/trace-events > > index 65e83dc258..608b24ba66 100644 > > --- a/hw/block/trace-events > > +++ b/hw/block/trace-events > > @@ -10,3 +10,10 @@ virtio_blk_submit_multireq(void *mrb, int start, int > num_reqs, uint64_t offset, > > # hw/block/hd-geometry.c > > hd_geometry_lchs_guess(void *blk, int cyls, int heads, int secs) "blk %p > LCHS %d %d %d" > > hd_geometry_guess(void *blk, uint32_t cyls, uint32_t heads, uint32_t > secs, int trans) "blk %p CHS %u %u %u trans %d" > > + > > +# hw/block/xen_disk.c > > +xen_disk_alloc(char *name) "%s" > > +xen_disk_init(char *name) "%s" > > +xen_disk_connect(char *name) "%s" > > +xen_disk_disconnect(char *name) "%s" > > +xen_disk_free(char *name) "%s" > > diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c > > index a9942d32db..ec1085c802 100644 > > --- a/hw/block/xen_disk.c > > +++ b/hw/block/xen_disk.c > > @@ -27,10 +27,13 @@ > > #include "hw/xen/xen_backend.h" > > #include "xen_blkif.h" > > #include "sysemu/blockdev.h" > > +#include "sysemu/iothread.h" > > #include "sysemu/block-backend.h" > > #include "qapi/error.h" > > #include "qapi/qmp/qdict.h" > > #include "qapi/qmp/qstring.h" > > +#include "qom/object_interfaces.h" > > +#include "trace.h" > > > > /* - */ > > > > @@ -128,6 +131,9 @@ struct XenBlkDev { > > DriveInfo *dinfo; > > BlockBackend*blk; > > QEMUBH *bh; > > + > > +IOThread*iothread; > > +AioContext *ctx; > > }; > > > > /* - */ > > @@ -923,11 +929,31 @@ static void blk_bh(void *opaque) > > static void blk_alloc(struct XenDevice *xendev) > > { > > struct XenBlkDev *blkdev = container_of(xendev, struct XenBlkDev, > xendev); > > +Object *obj; > > +char *name; > > +Error *err = NULL; > > + > > 
+trace_xen_disk_alloc(xendev->name); > > > > QLIST_INIT(>inflight); > > QLIST_INIT(>finished); > > QLIST_INIT(>freelist); > > -blkdev->bh = qemu_bh_new(blk_bh, blkdev); > > + > > +obj = object_new(TYPE_IOTHREAD); > > +name = g_strdup_printf("iothread-%s", xendev->name); > > + > > +object_property_add_child(object_get_objects_root(), name, obj, > ); > > +assert(!err); > > + > > +g_free(name); > > + > > +user_creatable_complete(obj, ); > > +assert(!err); > > + > > +blkdev->iothread = (IOThread *)object_dynamic_cast(obj, > TYPE_IOTHREAD); > > +blkdev->ctx = iothread_get_aio_context(blkdev->iothread); > > +blkdev->bh = aio_bh_new(blkdev->ctx, blk_bh, blkdev); > > + > > if (xen_mode != XEN_EMULATE) { > > batch_maps = 1; > > } > > @@ -954,6 +980,8 @@ static int blk_init(struct XenDevice *xendev) > > int info = 0; > > char *directiosafe = NULL; > > > > +trace_xen_disk_init(xendev->name); > > + > > /* read xenstore entries */ > > if (blkdev->params == NULL) { > > char *h = NULL; > > @@ -1069,6 +1097,8 @@ static int blk_connect(struct XenDevice *xendev) > > unsigned int i; > > uint32_t *domids; > > > > +trace_xen_disk_connect(xendev->name); > > + > > /* read-only ? */ > > if (blkdev->directiosafe) { > > qflags = BDRV_O_NOCACHE | BDRV_O_NATIVE_AIO; > > @@ -1285,6 +1315,8 @@ static int blk_connect(struct XenDevice *xendev) >
Re: [Qemu-block] [PATCH 3/3] xen-disk: use an IOThread per instance
On 20/06/2017 15:47, Paul Durrant wrote:
> This patch allocates an IOThread object for each xen_disk instance and
> sets the AIO context appropriately on connect. This allows processing
> of I/O to proceed in parallel.
>
> The patch also adds tracepoints into xen_disk to make it possible to
> follow the state transitions of an instance in the log.
>
> Signed-off-by: Paul Durrant

The QEMU block layer is not yet thread safe, but code running in
IOThreads still has to take the AioContext lock. You need to call
aio_context_acquire/release in blk_bh and qemu_aio_complete.

Paolo

> ---
> Cc: Stefano Stabellini
> Cc: Anthony Perard
> Cc: Kevin Wolf
> Cc: Max Reitz
> ---
>  hw/block/trace-events | 7 +++
>  hw/block/xen_disk.c   | 44 +++-
>  2 files changed, 50 insertions(+), 1 deletion(-)
>
> diff --git a/hw/block/trace-events b/hw/block/trace-events
> index 65e83dc258..608b24ba66 100644
> --- a/hw/block/trace-events
> +++ b/hw/block/trace-events
> @@ -10,3 +10,10 @@ virtio_blk_submit_multireq(void *mrb, int start, int num_reqs, uint64_t offset,
>  # hw/block/hd-geometry.c
>  hd_geometry_lchs_guess(void *blk, int cyls, int heads, int secs) "blk %p LCHS %d %d %d"
>  hd_geometry_guess(void *blk, uint32_t cyls, uint32_t heads, uint32_t secs, int trans) "blk %p CHS %u %u %u trans %d"
> +
> +# hw/block/xen_disk.c
> +xen_disk_alloc(char *name) "%s"
> +xen_disk_init(char *name) "%s"
> +xen_disk_connect(char *name) "%s"
> +xen_disk_disconnect(char *name) "%s"
> +xen_disk_free(char *name) "%s"
> diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
> index a9942d32db..ec1085c802 100644
> --- a/hw/block/xen_disk.c
> +++ b/hw/block/xen_disk.c
> @@ -27,10 +27,13 @@
>  #include "hw/xen/xen_backend.h"
>  #include "xen_blkif.h"
>  #include "sysemu/blockdev.h"
> +#include "sysemu/iothread.h"
>  #include "sysemu/block-backend.h"
>  #include "qapi/error.h"
>  #include "qapi/qmp/qdict.h"
>  #include "qapi/qmp/qstring.h"
> +#include "qom/object_interfaces.h"
> +#include "trace.h"
>
>  /* ----- */
>
> @@
-128,6 +131,9 @@ struct XenBlkDev { > DriveInfo *dinfo; > BlockBackend*blk; > QEMUBH *bh; > + > +IOThread*iothread; > +AioContext *ctx; > }; > > /* - */ > @@ -923,11 +929,31 @@ static void blk_bh(void *opaque) > static void blk_alloc(struct XenDevice *xendev) > { > struct XenBlkDev *blkdev = container_of(xendev, struct XenBlkDev, > xendev); > +Object *obj; > +char *name; > +Error *err = NULL; > + > +trace_xen_disk_alloc(xendev->name); > > QLIST_INIT(>inflight); > QLIST_INIT(>finished); > QLIST_INIT(>freelist); > -blkdev->bh = qemu_bh_new(blk_bh, blkdev); > + > +obj = object_new(TYPE_IOTHREAD); > +name = g_strdup_printf("iothread-%s", xendev->name); > + > +object_property_add_child(object_get_objects_root(), name, obj, ); > +assert(!err); > + > +g_free(name); > + > +user_creatable_complete(obj, ); > +assert(!err); > + > +blkdev->iothread = (IOThread *)object_dynamic_cast(obj, TYPE_IOTHREAD); > +blkdev->ctx = iothread_get_aio_context(blkdev->iothread); > +blkdev->bh = aio_bh_new(blkdev->ctx, blk_bh, blkdev); > + > if (xen_mode != XEN_EMULATE) { > batch_maps = 1; > } > @@ -954,6 +980,8 @@ static int blk_init(struct XenDevice *xendev) > int info = 0; > char *directiosafe = NULL; > > +trace_xen_disk_init(xendev->name); > + > /* read xenstore entries */ > if (blkdev->params == NULL) { > char *h = NULL; > @@ -1069,6 +1097,8 @@ static int blk_connect(struct XenDevice *xendev) > unsigned int i; > uint32_t *domids; > > +trace_xen_disk_connect(xendev->name); > + > /* read-only ? 
*/ > if (blkdev->directiosafe) { > qflags = BDRV_O_NOCACHE | BDRV_O_NATIVE_AIO; > @@ -1285,6 +1315,8 @@ static int blk_connect(struct XenDevice *xendev) > blkdev->persistent_gnt_count = 0; > } > > +blk_set_aio_context(blkdev->blk, blkdev->ctx); > + > xen_be_bind_evtchn(>xendev); > > xen_pv_printf(>xendev, 1, "ok: proto %s, nr-ring-ref %u, " > @@ -1298,13 +1330,20 @@ static void blk_disconnect(struct XenDevice *xendev) > { > struct XenBlkDev *blkdev = container_of(xendev, struct XenBlkDev, > xendev); > > +trace_xen_disk_disconnect(xendev->name); > + > +aio_context_acquire(blkdev->ctx); > + > if (blkdev->blk) { > +blk_set_aio_context(blkdev->blk, qemu_get_aio_context()); > blk_detach_dev(blkdev->blk, blkdev); > blk_unref(blkdev->blk); > blkdev->blk = NULL; > } > xen_pv_unbind_evtchn(>xendev); > > +
Re: [Qemu-block] [PATCH] qcow2: Use offset_into_cluster() and offset_to_l2_index()
Am 20.06.2017 um 15:01 hat Alberto Garcia geschrieben:
> We already have functions for doing these calculations, so let's use
> them instead of doing everything by hand. This makes the code a bit
> more readable.
>
> Signed-off-by: Alberto Garcia

Thanks, applied to the block branch.

Kevin
Re: [Qemu-block] [PATCH v4 0/7] Reduce the number of I/O ops when doing COW
Am 19.06.2017 um 15:40 hat Alberto Garcia geschrieben: > Hi all, > > here's a patch series that rewrites the copy-on-write code in the > qcow2 driver to reduce the number of I/O operations. > > This is version v4, please refer to the original e-mail for a complete > description: > > https://lists.gnu.org/archive/html/qemu-block/2017-05/msg00882.html Thanks, applied to the block branch. Kevin
Re: [Qemu-block] [PATCH v9 07/20] block: deprecate "encryption=on" in favor of "encrypt.format=aes"
On Tue 20 Jun 2017 02:02:06 PM CEST, Daniel P. Berrange wrote:
>> > +if (encryptfmt) {
>> > +    buf = qemu_opt_get_del(opts, BLOCK_OPT_ENCRYPT);
>> > +    if (buf != NULL) {
>> > +        g_free(buf);
>>
>> If you use qemu_opt_get() instead then you don't need "buf" at all,
>> do you?
>
> IIRC, we needed to delete the option from opts, otherwise something
> will later complain that there are opts that are not consumed.

I don't see how, since once you reached this point it means that you
cannot create the image because the options are already wrong.

Anyway, there's a more important problem with this patch that I just
noticed: in this scenario you're freeing 'buf' twice, first in the snip
I quoted above, and then at the end of qcow2_create(). qcow_create()
from qcow.c is safe, but perhaps it's clearer if you put the declaration
of the 'buf' variable inside the 'if (encryptfmt)' block. That's
assuming that you really need to use qemu_opt_get_del() there.

With the double-free bug fixed, I don't see any difference when using
qemu_opt_get_del() vs qemu_opt_get().

Berto
[Qemu-block] [PATCH 1/3] xen-disk: only advertize feature-persistent if grant copy is not available
If grant copy is available then it will always be used in preference to persistent maps. In this case feature-persistent should not be advertized to the frontend, otherwise it may needlessly copy data into persistently granted buffers. Signed-off-by: Paul Durrant--- Cc: Stefano Stabellini Cc: Anthony Perard Cc: Kevin Wolf Cc: Max Reitz --- hw/block/xen_disk.c | 15 --- 1 file changed, 8 insertions(+), 7 deletions(-) diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c index 3a22805fbc..9b06e3aa81 100644 --- a/hw/block/xen_disk.c +++ b/hw/block/xen_disk.c @@ -1023,11 +1023,18 @@ static int blk_init(struct XenDevice *xendev) blkdev->file_blk = BLOCK_SIZE; +blkdev->feature_grant_copy = +(xengnttab_grant_copy(blkdev->xendev.gnttabdev, 0, NULL) == 0); + +xen_pv_printf(>xendev, 3, "grant copy operation %s\n", + blkdev->feature_grant_copy ? "enabled" : "disabled"); + /* fill info * blk_connect supplies sector-size and sectors */ xenstore_write_be_int(>xendev, "feature-flush-cache", 1); -xenstore_write_be_int(>xendev, "feature-persistent", 1); +xenstore_write_be_int(>xendev, "feature-persistent", + !blkdev->feature_grant_copy); xenstore_write_be_int(>xendev, "info", info); blk_parse_discard(blkdev); @@ -1202,12 +1209,6 @@ static int blk_connect(struct XenDevice *xendev) xen_be_bind_evtchn(>xendev); -blkdev->feature_grant_copy = -(xengnttab_grant_copy(blkdev->xendev.gnttabdev, 0, NULL) == 0); - -xen_pv_printf(>xendev, 3, "grant copy operation %s\n", - blkdev->feature_grant_copy ? "enabled" : "disabled"); - xen_pv_printf(>xendev, 1, "ok: proto %s, ring-ref %d, " "remote port %d, local port %d\n", blkdev->xendev.protocol, blkdev->ring_ref, -- 2.11.0
[Qemu-block] [PATCH 0/3] xen-disk: performance improvements
Paul Durrant (3): xen-disk: only advertize feature-persistent if grant copy is not available xen-disk: add support for multi-page shared rings xen-disk: use an IOThread per instance hw/block/trace-events | 7 ++ hw/block/xen_disk.c | 200 -- 2 files changed, 168 insertions(+), 39 deletions(-) -- 2.11.0
[Qemu-block] [PATCH 2/3] xen-disk: add support for multi-page shared rings
The blkif protocol has had provision for negotiation of multi-page shared rings for some time now and many guest OS have support in their frontend drivers. This patch makes the necessary modifications to xen-disk support a shared ring up to order 4 (i.e. 16 pages). Signed-off-by: Paul Durrant--- Cc: Stefano Stabellini Cc: Anthony Perard Cc: Kevin Wolf Cc: Max Reitz --- hw/block/xen_disk.c | 141 1 file changed, 110 insertions(+), 31 deletions(-) diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c index 9b06e3aa81..a9942d32db 100644 --- a/hw/block/xen_disk.c +++ b/hw/block/xen_disk.c @@ -36,8 +36,6 @@ static int batch_maps = 0; -static int max_requests = 32; - /* - */ #define BLOCK_SIZE 512 @@ -84,6 +82,8 @@ struct ioreq { BlockAcctCookie acct; }; +#define MAX_RING_PAGE_ORDER 4 + struct XenBlkDev { struct XenDevicexendev; /* must be first */ char*params; @@ -94,7 +94,8 @@ struct XenBlkDev { booldirectiosafe; const char *fileproto; const char *filename; -int ring_ref; +unsigned intring_ref[1 << MAX_RING_PAGE_ORDER]; +unsigned intnr_ring_ref; void*sring; int64_t file_blk; int64_t file_size; @@ -110,6 +111,7 @@ struct XenBlkDev { int requests_total; int requests_inflight; int requests_finished; +unsigned intmax_requests; /* Persistent grants extension */ gbooleanfeature_discard; @@ -199,7 +201,7 @@ static struct ioreq *ioreq_start(struct XenBlkDev *blkdev) struct ioreq *ioreq = NULL; if (QLIST_EMPTY(>freelist)) { -if (blkdev->requests_total >= max_requests) { +if (blkdev->requests_total >= blkdev->max_requests) { goto out; } /* allocate new struct */ @@ -905,7 +907,7 @@ static void blk_handle_requests(struct XenBlkDev *blkdev) ioreq_runio_qemu_aio(ioreq); } -if (blkdev->more_work && blkdev->requests_inflight < max_requests) { +if (blkdev->more_work && blkdev->requests_inflight < blkdev->max_requests) { qemu_bh_schedule(blkdev->bh); } } @@ -918,15 +920,6 @@ static void blk_bh(void *opaque) blk_handle_requests(blkdev); } -/* - * We need to account for the grant 
allocations requiring contiguous - * chunks; the worst case number would be - * max_req * max_seg + (max_req - 1) * (max_seg - 1) + 1, - * but in order to keep things simple just use - * 2 * max_req * max_seg. - */ -#define MAX_GRANTS(max_req, max_seg) (2 * (max_req) * (max_seg)) - static void blk_alloc(struct XenDevice *xendev) { struct XenBlkDev *blkdev = container_of(xendev, struct XenBlkDev, xendev); @@ -938,11 +931,6 @@ static void blk_alloc(struct XenDevice *xendev) if (xen_mode != XEN_EMULATE) { batch_maps = 1; } -if (xengnttab_set_max_grants(xendev->gnttabdev, -MAX_GRANTS(max_requests, BLKIF_MAX_SEGMENTS_PER_REQUEST)) < 0) { -xen_pv_printf(xendev, 0, "xengnttab_set_max_grants failed: %s\n", - strerror(errno)); -} } static void blk_parse_discard(struct XenBlkDev *blkdev) @@ -1037,6 +1025,9 @@ static int blk_init(struct XenDevice *xendev) !blkdev->feature_grant_copy); xenstore_write_be_int(>xendev, "info", info); +xenstore_write_be_int(>xendev, "max-ring-page-order", + MAX_RING_PAGE_ORDER); + blk_parse_discard(blkdev); g_free(directiosafe); @@ -1058,12 +1049,25 @@ out_error: return -1; } +/* + * We need to account for the grant allocations requiring contiguous + * chunks; the worst case number would be + * max_req * max_seg + (max_req - 1) * (max_seg - 1) + 1, + * but in order to keep things simple just use + * 2 * max_req * max_seg. + */ +#define MAX_GRANTS(max_req, max_seg) (2 * (max_req) * (max_seg)) + static int blk_connect(struct XenDevice *xendev) { struct XenBlkDev *blkdev = container_of(xendev, struct XenBlkDev, xendev); int pers, index, qflags; bool readonly = true; bool writethrough = true; +int order, ring_ref; +unsigned int ring_size, max_grants; +unsigned int i; +uint32_t *domids; /* read-only ? 
*/ if (blkdev->directiosafe) { @@ -1138,9 +1142,39 @@ static int blk_connect(struct XenDevice *xendev) xenstore_write_be_int64(>xendev, "sectors", blkdev->file_size / blkdev->file_blk); -if (xenstore_read_fe_int(>xendev, "ring-ref", >ring_ref) == -1) { +if (xenstore_read_fe_int(>xendev, "ring-page-order", + ) ==
[Qemu-block] [PATCH 3/3] xen-disk: use an IOThread per instance
This patch allocates an IOThread object for each xen_disk instance and sets the AIO context appropriately on connect. This allows processing of I/O to proceed in parallel. The patch also adds tracepoints into xen_disk to make it possible to follow the state transtions of an instance in the log. Signed-off-by: Paul Durrant--- Cc: Stefano Stabellini Cc: Anthony Perard Cc: Kevin Wolf Cc: Max Reitz --- hw/block/trace-events | 7 +++ hw/block/xen_disk.c | 44 +++- 2 files changed, 50 insertions(+), 1 deletion(-) diff --git a/hw/block/trace-events b/hw/block/trace-events index 65e83dc258..608b24ba66 100644 --- a/hw/block/trace-events +++ b/hw/block/trace-events @@ -10,3 +10,10 @@ virtio_blk_submit_multireq(void *mrb, int start, int num_reqs, uint64_t offset, # hw/block/hd-geometry.c hd_geometry_lchs_guess(void *blk, int cyls, int heads, int secs) "blk %p LCHS %d %d %d" hd_geometry_guess(void *blk, uint32_t cyls, uint32_t heads, uint32_t secs, int trans) "blk %p CHS %u %u %u trans %d" + +# hw/block/xen_disk.c +xen_disk_alloc(char *name) "%s" +xen_disk_init(char *name) "%s" +xen_disk_connect(char *name) "%s" +xen_disk_disconnect(char *name) "%s" +xen_disk_free(char *name) "%s" diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c index a9942d32db..ec1085c802 100644 --- a/hw/block/xen_disk.c +++ b/hw/block/xen_disk.c @@ -27,10 +27,13 @@ #include "hw/xen/xen_backend.h" #include "xen_blkif.h" #include "sysemu/blockdev.h" +#include "sysemu/iothread.h" #include "sysemu/block-backend.h" #include "qapi/error.h" #include "qapi/qmp/qdict.h" #include "qapi/qmp/qstring.h" +#include "qom/object_interfaces.h" +#include "trace.h" /* - */ @@ -128,6 +131,9 @@ struct XenBlkDev { DriveInfo *dinfo; BlockBackend*blk; QEMUBH *bh; + +IOThread*iothread; +AioContext *ctx; }; /* - */ @@ -923,11 +929,31 @@ static void blk_bh(void *opaque) static void blk_alloc(struct XenDevice *xendev) { struct XenBlkDev *blkdev = container_of(xendev, struct XenBlkDev, xendev); +Object *obj; +char *name; +Error *err 
= NULL; + +trace_xen_disk_alloc(xendev->name); QLIST_INIT(>inflight); QLIST_INIT(>finished); QLIST_INIT(>freelist); -blkdev->bh = qemu_bh_new(blk_bh, blkdev); + +obj = object_new(TYPE_IOTHREAD); +name = g_strdup_printf("iothread-%s", xendev->name); + +object_property_add_child(object_get_objects_root(), name, obj, ); +assert(!err); + +g_free(name); + +user_creatable_complete(obj, ); +assert(!err); + +blkdev->iothread = (IOThread *)object_dynamic_cast(obj, TYPE_IOTHREAD); +blkdev->ctx = iothread_get_aio_context(blkdev->iothread); +blkdev->bh = aio_bh_new(blkdev->ctx, blk_bh, blkdev); + if (xen_mode != XEN_EMULATE) { batch_maps = 1; } @@ -954,6 +980,8 @@ static int blk_init(struct XenDevice *xendev) int info = 0; char *directiosafe = NULL; +trace_xen_disk_init(xendev->name); + /* read xenstore entries */ if (blkdev->params == NULL) { char *h = NULL; @@ -1069,6 +1097,8 @@ static int blk_connect(struct XenDevice *xendev) unsigned int i; uint32_t *domids; +trace_xen_disk_connect(xendev->name); + /* read-only ? 
*/ if (blkdev->directiosafe) { qflags = BDRV_O_NOCACHE | BDRV_O_NATIVE_AIO; @@ -1285,6 +1315,8 @@ static int blk_connect(struct XenDevice *xendev) blkdev->persistent_gnt_count = 0; } +blk_set_aio_context(blkdev->blk, blkdev->ctx); + xen_be_bind_evtchn(>xendev); xen_pv_printf(>xendev, 1, "ok: proto %s, nr-ring-ref %u, " @@ -1298,13 +1330,20 @@ static void blk_disconnect(struct XenDevice *xendev) { struct XenBlkDev *blkdev = container_of(xendev, struct XenBlkDev, xendev); +trace_xen_disk_disconnect(xendev->name); + +aio_context_acquire(blkdev->ctx); + if (blkdev->blk) { +blk_set_aio_context(blkdev->blk, qemu_get_aio_context()); blk_detach_dev(blkdev->blk, blkdev); blk_unref(blkdev->blk); blkdev->blk = NULL; } xen_pv_unbind_evtchn(>xendev); +aio_context_release(blkdev->ctx); + if (blkdev->sring) { xengnttab_unmap(blkdev->xendev.gnttabdev, blkdev->sring, blkdev->nr_ring_ref); @@ -1338,6 +1377,8 @@ static int blk_free(struct XenDevice *xendev) struct XenBlkDev *blkdev = container_of(xendev, struct XenBlkDev, xendev); struct ioreq *ioreq; +trace_xen_disk_free(xendev->name); + if (blkdev->blk || blkdev->sring) { blk_disconnect(xendev); } @@ -1355,6 +1396,7 @@ static int blk_free(struct XenDevice *xendev)
[Qemu-block] [PATCH] qcow2: Use offset_into_cluster() and offset_to_l2_index()
We already have functions for doing these calculations, so let's use
them instead of doing everything by hand. This makes the code a bit
more readable.

Signed-off-by: Alberto Garcia
---
 block/qcow2-cluster.c | 4 ++--
 block/qcow2.c         | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index d779ea19cf..a2c612433d 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -548,7 +548,7 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
 
     /* find the cluster offset for the given disk offset */
 
-    l2_index = (offset >> s->cluster_bits) & (s->l2_size - 1);
+    l2_index = offset_to_l2_index(s, offset);
 
     *cluster_offset = be64_to_cpu(l2_table[l2_index]);
 
     nb_clusters = size_to_clusters(s, bytes_needed);
@@ -685,7 +685,7 @@ static int get_cluster_table(BlockDriverState *bs, uint64_t offset,
 
     /* find the cluster offset for the given disk offset */
 
-    l2_index = (offset >> s->cluster_bits) & (s->l2_size - 1);
+    l2_index = offset_to_l2_index(s, offset);
 
     *new_l2_table = l2_table;
     *new_l2_index = l2_index;
diff --git a/block/qcow2.c b/block/qcow2.c
index b3ba5daa93..adb604fc81 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -356,7 +356,7 @@ static int validate_table_offset(BlockDriverState *bs, uint64_t offset,
     }
 
     /* Tables must be cluster aligned */
-    if (offset & (s->cluster_size - 1)) {
+    if (offset_into_cluster(s, offset) != 0) {
         return -EINVAL;
     }
-- 
2.11.0
Re: [Qemu-block] [PATCH v9 07/20] block: deprecate "encryption=on" in favor of "encrypt.format=aes"
On Tue, Jun 20, 2017 at 01:44:50PM +0200, Alberto Garcia wrote:
> On Mon 19 Jun 2017 07:34:42 PM CEST, Daniel P. Berrange wrote:
> > Historically the qcow & qcow2 image formats supported a property
> > "encryption=on" to enable their built-in AES encryption. We'll
> > soon be supporting LUKS for qcow2, so need a more general purpose
> > way to enable encryption, with a choice of formats.
> >
> > This introduces an "encrypt.format" option, which will later be
> > joined by a number of other "encrypt.XXX" options. The use of
> > a "encrypt." prefix instead of "encrypt-" is done to facilitate
> > mapping to a nested QAPI schema at a later date.
> >
> > e.g. the preferred syntax is now
> >
> >   qemu-img create -f qcow2 -o encrypt.format=aes demo.qcow2
> >
> > Signed-off-by: Daniel P. Berrange
>
> > +if (encryptfmt) {
> > +    buf = qemu_opt_get_del(opts, BLOCK_OPT_ENCRYPT);
> > +    if (buf != NULL) {
> > +        g_free(buf);
>
> If you use qemu_opt_get() instead then you don't need "buf" at all, do
> you?

IIRC, we needed to delete the option from opts, otherwise something
will later complain that there are opts that are not consumed.

> > +if (encryptfmt) {
> > +    buf = qemu_opt_get_del(opts, BLOCK_OPT_ENCRYPT);
> > +    if (buf != NULL) {
> > +        g_free(buf);
>
> Same here.
>
> Everything else looks fine.
>
> Reviewed-by: Alberto Garcia

Regards,
Daniel
--
|: https://berrange.com       -o-  https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org        -o-          https://fstop138.berrange.com :|
|: https://entangle-photo.org -o-  https://www.instagram.com/dberrange :|
Re: [Qemu-block] [PATCH v9 07/20] block: deprecate "encryption=on" in favor of "encrypt.format=aes"
On Mon 19 Jun 2017 07:34:42 PM CEST, Daniel P. Berrange wrote:
> Historically the qcow & qcow2 image formats supported a property
> "encryption=on" to enable their built-in AES encryption. We'll
> soon be supporting LUKS for qcow2, so need a more general purpose
> way to enable encryption, with a choice of formats.
>
> This introduces an "encrypt.format" option, which will later be
> joined by a number of other "encrypt.XXX" options. The use of
> a "encrypt." prefix instead of "encrypt-" is done to facilitate
> mapping to a nested QAPI schema at a later date.
>
> e.g. the preferred syntax is now
>
>   qemu-img create -f qcow2 -o encrypt.format=aes demo.qcow2
>
> Signed-off-by: Daniel P. Berrange

> +if (encryptfmt) {
> +    buf = qemu_opt_get_del(opts, BLOCK_OPT_ENCRYPT);
> +    if (buf != NULL) {
> +        g_free(buf);

If you use qemu_opt_get() instead then you don't need "buf" at all, do
you?

> +if (encryptfmt) {
> +    buf = qemu_opt_get_del(opts, BLOCK_OPT_ENCRYPT);
> +    if (buf != NULL) {
> +        g_free(buf);

Same here.

Everything else looks fine.

Reviewed-by: Alberto Garcia

Berto
Re: [Qemu-block] [PULL 0/2] Block patches
On 14 June 2017 at 22:55, Jeff Cody wrote:
> The following changes since commit 3f0602927b120a480b35dcf58cf6f95435b3ae91:
>
>   Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20170613'
>   into staging (2017-06-13 15:49:07 +0100)
>
> are available in the git repository at:
>
>   git://github.com/codyprime/qemu-kvm-jtc.git tags/block-pull-request
>
> for you to fetch changes up to 5c3ad1a6a8fa041c57403dbe1fc5927eec0be66b:
>
>   block/iscsi: enable filename option and parsing (2017-06-14 17:39:46 -0400)
>
> ----------------------------------------------------------------
> Block patches
> ----------------------------------------------------------------
>
> Jeff Cody (2):
>   block/rbd: enable filename option and parsing
>   block/iscsi: enable filename option and parsing
>
>  block/iscsi.c | 22 +-
>  block/rbd.c   | 22 +-
>  2 files changed, 42 insertions(+), 2 deletions(-)

Applied, thanks.

-- PMM
Re: [Qemu-block] [PATCH] nvme: Add support for Read Data and Write Data in CMBs.
Am 19.06.2017 um 23:24 hat Keith Busch geschrieben:
> On Tue, Jun 13, 2017 at 04:08:35AM -0600, sba...@raithlin.com wrote:
> > From: Stephen Bates
> >
> > Add the ability for the NVMe model to support both the RDS and WDS
> > modes in the Controller Memory Buffer.
> >
> > Although not currently supported in the upstreamed Linux kernel, a fork
> > with support exists [1] and user-space test programs that build on
> > this also exist [2].
> >
> > Useful for testing CMB functionality in preparation for real CMB
> > enabled NVMe devices (coming soon).
> >
> > [1] https://github.com/sbates130272/linux-p2pmem
> > [2] https://github.com/sbates130272/p2pmem-test
> >
> > Signed-off-by: Stephen Bates
> > Reviewed-by: Logan Gunthorpe
>
> This looks good to me.
>
> Reviewed-by: Keith Busch

Thanks, applied to the block branch.

Kevin