Re: [Qemu-devel] [PATCH] xen_disk: add discard support
On Tue, 4 Feb 2014, Olaf Hering wrote: On Tue, Feb 04, Olaf Hering wrote: On Tue, Feb 04, Kevin Wolf wrote: Now you call bdrv_acct_done() in the callback without having a matching bdrv_acct_start(). You need to make it conditional in the callback. Stefano, Is ioreq_runio_qemu_aio symetric in this regard anyway? In case of BLKIF_OP_WRITE|BLKIF_OP_FLUSH_DISKCACHE and no ioreq-req.nr_segments then qemu_aio_complete is called anyway. Will qemu_aio_complete get down to the bdrv_acct_done call at all in this case? Right, you spotted a bug there: if !ioreq-req.nr_segments, qemu_aio_complete is called without bdrv_acct_start being called first. What I have in mind is something like the (not compile tested) change below. diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c index e74efc7..99d36b8 100644 --- a/hw/block/xen_disk.c +++ b/hw/block/xen_disk.c @@ -486,7 +486,16 @@ static void qemu_aio_complete(void *opaque, int ret) ioreq-status = ioreq-aio_errors ? BLKIF_RSP_ERROR : BLKIF_RSP_OKAY; ioreq_unmap(ioreq); ioreq_finish(ioreq); -bdrv_acct_done(ioreq-blkdev-bs, ioreq-acct); +switch (ioreq-req.operation) { +case BLKIF_OP_DISCARD: +break; +case BLKIF_OP_WRITE: +case BLKIF_OP_FLUSH_DISKCACHE: What about BLKIF_OP_READ? +if (!ioreq-req.nr_segments) { +break; +} +bdrv_acct_done(ioreq-blkdev-bs, ioreq-acct); +} qemu_bh_schedule(ioreq-blkdev-bh); }
Re: [Qemu-devel] [PATCH] xen_disk: add discard support
On Tue, 4 Feb 2014, Olaf Hering wrote: On Mon, Feb 03, Kevin Wolf wrote: Am 30.01.2014 um 16:02 hat Olaf Hering geschrieben: +++ b/hw/block/xen_disk.c +case BLKIF_OP_DISCARD: +{ +struct blkif_request_discard *discard_req = (void *)ioreq-req; +bdrv_acct_start(blkdev-bs, ioreq-acct, +discard_req-nr_sectors * BLOCK_SIZE, BDRV_ACCT_WRITE); Neither SCSI nor IDE account for discards. I think we should keep the behaviour consistent across devices. Stefano,did you already put this change into your queue? I will resubmit the patch with the change below. Go ahead and resubmit. Thanks. Olaf diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c index e74efc7..69ecc98 100644 --- a/hw/block/xen_disk.c +++ b/hw/block/xen_disk.c @@ -527,8 +527,6 @@ static int ioreq_runio_qemu_aio(struct ioreq *ioreq) case BLKIF_OP_DISCARD: { struct blkif_request_discard *discard_req = (void *)ioreq-req; -bdrv_acct_start(blkdev-bs, ioreq-acct, -discard_req-nr_sectors * BLOCK_SIZE, BDRV_ACCT_WRITE); ioreq-aio_inflight++; bdrv_aio_discard(blkdev-bs, discard_req-sector_number, discard_req-nr_sectors,
Re: [Qemu-devel] [PATCH] xen_disk: add discard support
On Tue, Feb 04, Olaf Hering wrote: On Tue, Feb 04, Kevin Wolf wrote: Now you call bdrv_acct_done() in the callback without having a matching bdrv_acct_start(). You need to make it conditional in the callback. Stefano, Is ioreq_runio_qemu_aio symetric in this regard anyway? In case of BLKIF_OP_WRITE|BLKIF_OP_FLUSH_DISKCACHE and no ioreq-req.nr_segments then qemu_aio_complete is called anyway. Will qemu_aio_complete get down to the bdrv_acct_done call at all in this case? What I have in mind is something like the (not compile tested) change below. Olaf diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c index e74efc7..99d36b8 100644 --- a/hw/block/xen_disk.c +++ b/hw/block/xen_disk.c @@ -486,7 +486,16 @@ static void qemu_aio_complete(void *opaque, int ret) ioreq-status = ioreq-aio_errors ? BLKIF_RSP_ERROR : BLKIF_RSP_OKAY; ioreq_unmap(ioreq); ioreq_finish(ioreq); -bdrv_acct_done(ioreq-blkdev-bs, ioreq-acct); +switch (ioreq-req.operation) { +case BLKIF_OP_DISCARD: +break; +case BLKIF_OP_WRITE: +case BLKIF_OP_FLUSH_DISKCACHE: +if (!ioreq-req.nr_segments) { +break; +} +bdrv_acct_done(ioreq-blkdev-bs, ioreq-acct); +} qemu_bh_schedule(ioreq-blkdev-bh); }
Re: [Qemu-devel] [PATCH] xen_disk: add discard support
On Mon, Feb 03, Kevin Wolf wrote: Am 30.01.2014 um 16:02 hat Olaf Hering geschrieben: +++ b/hw/block/xen_disk.c +case BLKIF_OP_DISCARD: +{ +struct blkif_request_discard *discard_req = (void *)ioreq-req; +bdrv_acct_start(blkdev-bs, ioreq-acct, +discard_req-nr_sectors * BLOCK_SIZE, BDRV_ACCT_WRITE); Neither SCSI nor IDE account for discards. I think we should keep the behaviour consistent across devices. Stefano,did you already put this change into your queue? I will resubmit the patch with the change below. Olaf diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c index e74efc7..69ecc98 100644 --- a/hw/block/xen_disk.c +++ b/hw/block/xen_disk.c @@ -527,8 +527,6 @@ static int ioreq_runio_qemu_aio(struct ioreq *ioreq) case BLKIF_OP_DISCARD: { struct blkif_request_discard *discard_req = (void *)ioreq-req; -bdrv_acct_start(blkdev-bs, ioreq-acct, -discard_req-nr_sectors * BLOCK_SIZE, BDRV_ACCT_WRITE); ioreq-aio_inflight++; bdrv_aio_discard(blkdev-bs, discard_req-sector_number, discard_req-nr_sectors,
Re: [Qemu-devel] [PATCH] xen_disk: add discard support
On Tue, Feb 04, Kevin Wolf wrote: Now you call bdrv_acct_done() in the callback without having a matching bdrv_acct_start(). You need to make it conditional in the callback. I see. Stefano, Is ioreq_runio_qemu_aio symetric in this regard anyway? In case of BLKIF_OP_WRITE|BLKIF_OP_FLUSH_DISKCACHE and no ioreq-req.nr_segments then qemu_aio_complete is called anyway. Will qemu_aio_complete get down to the bdrv_acct_done call at all in this case? Olaf
Re: [Qemu-devel] [PATCH] xen_disk: add discard support
Am 04.02.2014 um 16:47 hat Olaf Hering geschrieben: On Mon, Feb 03, Kevin Wolf wrote: Am 30.01.2014 um 16:02 hat Olaf Hering geschrieben: +++ b/hw/block/xen_disk.c +case BLKIF_OP_DISCARD: +{ +struct blkif_request_discard *discard_req = (void *)ioreq-req; +bdrv_acct_start(blkdev-bs, ioreq-acct, +discard_req-nr_sectors * BLOCK_SIZE, BDRV_ACCT_WRITE); Neither SCSI nor IDE account for discards. I think we should keep the behaviour consistent across devices. Stefano,did you already put this change into your queue? I will resubmit the patch with the change below. Olaf diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c index e74efc7..69ecc98 100644 --- a/hw/block/xen_disk.c +++ b/hw/block/xen_disk.c @@ -527,8 +527,6 @@ static int ioreq_runio_qemu_aio(struct ioreq *ioreq) case BLKIF_OP_DISCARD: { struct blkif_request_discard *discard_req = (void *)ioreq-req; -bdrv_acct_start(blkdev-bs, ioreq-acct, -discard_req-nr_sectors * BLOCK_SIZE, BDRV_ACCT_WRITE); ioreq-aio_inflight++; bdrv_aio_discard(blkdev-bs, discard_req-sector_number, discard_req-nr_sectors, Now you call bdrv_acct_done() in the callback without having a matching bdrv_acct_start(). You need to make it conditional in the callback. Kevin
Re: [Qemu-devel] [PATCH] xen_disk: add discard support
Am 30.01.2014 um 16:02 hat Olaf Hering geschrieben: Implement discard support for xen_disk. It makes use of the existing discard code in qemu. The discard support is enabled unconditionally. The tool stack may provide a property discard-enable in the backend node to optionally disable discard support. This is helpful in case the backing file was intentionally created non-sparse to avoid fragmentation. Signed-off-by: Olaf Hering o...@aepfle.de --- The counter part for libxl is here, will go into xen-4.5: http://lists.xenproject.org/archives/html/xen-devel/2014-01/msg02632.html checkpatch comaplains about tabs in hw/block/xen_blkif.h. The whole file is full of tabs, like the whole source tree... If desired I will send a follow up patch to wipe tabs in hw/block/xen_blkif.h hw/block/xen_blkif.h | 12 hw/block/xen_disk.c | 34 ++ 2 files changed, 46 insertions(+) diff --git a/hw/block/xen_blkif.h b/hw/block/xen_blkif.h index c0f4136..711b692 100644 --- a/hw/block/xen_blkif.h +++ b/hw/block/xen_blkif.h @@ -79,6 +79,12 @@ static inline void blkif_get_x86_32_req(blkif_request_t *dst, blkif_x86_32_reque dst-handle = src-handle; dst-id = src-id; dst-sector_number = src-sector_number; + if (src-operation == BLKIF_OP_DISCARD) { + struct blkif_request_discard *s = (void *)src; + struct blkif_request_discard *d = (void *)dst; + d-nr_sectors = s-nr_sectors; + return; + } if (n src-nr_segments) n = src-nr_segments; for (i = 0; i n; i++) @@ -94,6 +100,12 @@ static inline void blkif_get_x86_64_req(blkif_request_t *dst, blkif_x86_64_reque dst-handle = src-handle; dst-id = src-id; dst-sector_number = src-sector_number; + if (src-operation == BLKIF_OP_DISCARD) { + struct blkif_request_discard *s = (void *)src; + struct blkif_request_discard *d = (void *)dst; + d-nr_sectors = s-nr_sectors; + return; + } if (n src-nr_segments) n = src-nr_segments; for (i = 0; i n; i++) diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c index 098f6c6..e74efc7 100644 --- a/hw/block/xen_disk.c +++ b/hw/block/xen_disk.c @@ -114,6 +114,7 @@ struct XenBlkDev { int requests_finished; /* Persistent grants extension */ +gbooleanfeature_discard; gbooleanfeature_persistent; GTree *persistent_gnts; unsigned intpersistent_gnt_count; @@ -253,6 +254,8 @@ static int ioreq_parse(struct ioreq *ioreq) case BLKIF_OP_WRITE: ioreq-prot = PROT_READ; /* from memory */ break; +case BLKIF_OP_DISCARD: +return 0; default: xen_be_printf(blkdev-xendev, 0, error: unknown operation (%d)\n, ioreq-req.operation); @@ -521,6 +524,17 @@ static int ioreq_runio_qemu_aio(struct ioreq *ioreq) ioreq-v, ioreq-v.size / BLOCK_SIZE, qemu_aio_complete, ioreq); break; +case BLKIF_OP_DISCARD: +{ +struct blkif_request_discard *discard_req = (void *)ioreq-req; +bdrv_acct_start(blkdev-bs, ioreq-acct, +discard_req-nr_sectors * BLOCK_SIZE, BDRV_ACCT_WRITE); Neither SCSI nor IDE account for discards. I think we should keep the behaviour consistent across devices. If we do want to introduce accounting for discards, I'm not sure whether counting them as writes or giving them their own category makes more sense. +ioreq-aio_inflight++; +bdrv_aio_discard(blkdev-bs, +discard_req-sector_number, discard_req-nr_sectors, +qemu_aio_complete, ioreq); +break; +} default: /* unknown operation (shouldn't happen -- parse catches this) */ goto err; Kevin
Re: [Qemu-devel] [PATCH] xen_disk: add discard support
Il 03/02/2014 16:49, Kevin Wolf ha scritto: Neither SCSI nor IDE account for discards. I think we should keep the behaviour consistent across devices. If we do want to introduce accounting for discards, I'm not sure whether counting them as writes or giving them their own category makes more sense. I think giving them their own category is better. Paolo
Re: [Qemu-devel] [PATCH] xen_disk: add discard support
Am 03.02.2014 um 17:03 hat Olaf Hering geschrieben: On Mon, Feb 03, Kevin Wolf wrote: Am 30.01.2014 um 16:02 hat Olaf Hering geschrieben: +case BLKIF_OP_DISCARD: +{ +struct blkif_request_discard *discard_req = (void *)ioreq-req; +bdrv_acct_start(blkdev-bs, ioreq-acct, +discard_req-nr_sectors * BLOCK_SIZE, BDRV_ACCT_WRITE); Neither SCSI nor IDE account for discards. I think we should keep the behaviour consistent across devices. If we do want to introduce accounting for discards, I'm not sure whether counting them as writes or giving them their own category makes more sense. This line was just copied. I have to look how virtio does it, maybe I copied it from there. No problem with removing it from my side. virtio-blk doesn't support discard at all. I guess you just copied it from the write a few lines above (and you need it if you don't want to change the callback, because that has a bdrv_acct_end() call). But I think in the end a discard is also a write, isnt it? Well... Not really, but perhaps close enough. I can think of arguments for either way. All that I'm really interested in is that all devices apply the same logic for accounting discards, so we can keep a consistent meaning of the statistics. If we want to account for them as writes here, we need to change IDE and SCSI to do the same; and if we leave IDE and SCSI unchanged, we can't account for discards here. Kevin
Re: [Qemu-devel] [PATCH] xen_disk: add discard support
On Mon, Feb 03, Kevin Wolf wrote: Am 30.01.2014 um 16:02 hat Olaf Hering geschrieben: +case BLKIF_OP_DISCARD: +{ +struct blkif_request_discard *discard_req = (void *)ioreq-req; +bdrv_acct_start(blkdev-bs, ioreq-acct, +discard_req-nr_sectors * BLOCK_SIZE, BDRV_ACCT_WRITE); Neither SCSI nor IDE account for discards. I think we should keep the behaviour consistent across devices. If we do want to introduce accounting for discards, I'm not sure whether counting them as writes or giving them their own category makes more sense. This line was just copied. I have to look how virtio does it, maybe I copied it from there. No problem with removing it from my side. But I think in the end a discard is also a write, isnt it? Olaf
Re: [Qemu-devel] [PATCH] xen_disk: add discard support
On Thu, 30 Jan 2014, Olaf Hering wrote: Implement discard support for xen_disk. It makes use of the existing discard code in qemu. The discard support is enabled unconditionally. The tool stack may provide a property discard-enable in the backend node to optionally disable discard support. This is helpful in case the backing file was intentionally created non-sparse to avoid fragmentation. Signed-off-by: Olaf Hering o...@aepfle.de --- The counter part for libxl is here, will go into xen-4.5: http://lists.xenproject.org/archives/html/xen-devel/2014-01/msg02632.html checkpatch comaplains about tabs in hw/block/xen_blkif.h. The whole file is full of tabs, like the whole source tree... If desired I will send a follow up patch to wipe tabs in hw/block/xen_blkif.h No worries. I'll add this patch to my queue. hw/block/xen_blkif.h | 12 hw/block/xen_disk.c | 34 ++ 2 files changed, 46 insertions(+) diff --git a/hw/block/xen_blkif.h b/hw/block/xen_blkif.h index c0f4136..711b692 100644 --- a/hw/block/xen_blkif.h +++ b/hw/block/xen_blkif.h @@ -79,6 +79,12 @@ static inline void blkif_get_x86_32_req(blkif_request_t *dst, blkif_x86_32_reque dst-handle = src-handle; dst-id = src-id; dst-sector_number = src-sector_number; + if (src-operation == BLKIF_OP_DISCARD) { + struct blkif_request_discard *s = (void *)src; + struct blkif_request_discard *d = (void *)dst; + d-nr_sectors = s-nr_sectors; + return; + } if (n src-nr_segments) n = src-nr_segments; for (i = 0; i n; i++) @@ -94,6 +100,12 @@ static inline void blkif_get_x86_64_req(blkif_request_t *dst, blkif_x86_64_reque dst-handle = src-handle; dst-id = src-id; dst-sector_number = src-sector_number; + if (src-operation == BLKIF_OP_DISCARD) { + struct blkif_request_discard *s = (void *)src; + struct blkif_request_discard *d = (void *)dst; + d-nr_sectors = s-nr_sectors; + return; + } if (n src-nr_segments) n = src-nr_segments; for (i = 0; i n; i++) diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c index 098f6c6..e74efc7 100644 --- a/hw/block/xen_disk.c +++ b/hw/block/xen_disk.c @@ -114,6 +114,7 @@ struct XenBlkDev { int requests_finished; /* Persistent grants extension */ +gbooleanfeature_discard; gbooleanfeature_persistent; GTree *persistent_gnts; unsigned intpersistent_gnt_count; @@ -253,6 +254,8 @@ static int ioreq_parse(struct ioreq *ioreq) case BLKIF_OP_WRITE: ioreq-prot = PROT_READ; /* from memory */ break; +case BLKIF_OP_DISCARD: +return 0; default: xen_be_printf(blkdev-xendev, 0, error: unknown operation (%d)\n, ioreq-req.operation); @@ -521,6 +524,17 @@ static int ioreq_runio_qemu_aio(struct ioreq *ioreq) ioreq-v, ioreq-v.size / BLOCK_SIZE, qemu_aio_complete, ioreq); break; +case BLKIF_OP_DISCARD: +{ +struct blkif_request_discard *discard_req = (void *)ioreq-req; +bdrv_acct_start(blkdev-bs, ioreq-acct, +discard_req-nr_sectors * BLOCK_SIZE, BDRV_ACCT_WRITE); +ioreq-aio_inflight++; +bdrv_aio_discard(blkdev-bs, +discard_req-sector_number, discard_req-nr_sectors, +qemu_aio_complete, ioreq); +break; +} default: /* unknown operation (shouldn't happen -- parse catches this) */ goto err; @@ -699,6 +713,21 @@ static void blk_alloc(struct XenDevice *xendev) } } +static void blk_parse_discard(struct XenBlkDev *blkdev) +{ +int enable; + +blkdev-feature_discard = true; + +if (xenstore_read_be_int(blkdev-xendev, discard-enable, enable) == 0) { +blkdev-feature_discard = !!enable; +} + +if (blkdev-feature_discard) { +xenstore_write_be_int(blkdev-xendev, feature-discard, 1); +} +} + static int blk_init(struct XenDevice *xendev) { struct XenBlkDev *blkdev = container_of(xendev, struct XenBlkDev, xendev); @@ -766,6 +795,8 @@ static int blk_init(struct XenDevice *xendev) xenstore_write_be_int(blkdev-xendev, feature-persistent, 1); xenstore_write_be_int(blkdev-xendev, info, info); +blk_parse_discard(blkdev); + g_free(directiosafe); return 0; @@ -801,6 +832,9 @@ static int blk_connect(struct XenDevice *xendev) qflags |= BDRV_O_RDWR; readonly = false; } +if (blkdev-feature_discard) { +qflags |= BDRV_O_UNMAP; +} /* init qemu block driver */ index = (blkdev-xendev.dev - 202 * 256) / 16;
[Qemu-devel] [PATCH] xen_disk: add discard support
Implement discard support for xen_disk. It makes use of the existing discard code in qemu. The discard support is enabled unconditionally. The tool stack may provide a property discard-enable in the backend node to optionally disable discard support. This is helpful in case the backing file was intentionally created non-sparse to avoid fragmentation. Signed-off-by: Olaf Hering o...@aepfle.de --- The counter part for libxl is here, will go into xen-4.5: http://lists.xenproject.org/archives/html/xen-devel/2014-01/msg02632.html checkpatch comaplains about tabs in hw/block/xen_blkif.h. The whole file is full of tabs, like the whole source tree... If desired I will send a follow up patch to wipe tabs in hw/block/xen_blkif.h hw/block/xen_blkif.h | 12 hw/block/xen_disk.c | 34 ++ 2 files changed, 46 insertions(+) diff --git a/hw/block/xen_blkif.h b/hw/block/xen_blkif.h index c0f4136..711b692 100644 --- a/hw/block/xen_blkif.h +++ b/hw/block/xen_blkif.h @@ -79,6 +79,12 @@ static inline void blkif_get_x86_32_req(blkif_request_t *dst, blkif_x86_32_reque dst-handle = src-handle; dst-id = src-id; dst-sector_number = src-sector_number; + if (src-operation == BLKIF_OP_DISCARD) { + struct blkif_request_discard *s = (void *)src; + struct blkif_request_discard *d = (void *)dst; + d-nr_sectors = s-nr_sectors; + return; + } if (n src-nr_segments) n = src-nr_segments; for (i = 0; i n; i++) @@ -94,6 +100,12 @@ static inline void blkif_get_x86_64_req(blkif_request_t *dst, blkif_x86_64_reque dst-handle = src-handle; dst-id = src-id; dst-sector_number = src-sector_number; + if (src-operation == BLKIF_OP_DISCARD) { + struct blkif_request_discard *s = (void *)src; + struct blkif_request_discard *d = (void *)dst; + d-nr_sectors = s-nr_sectors; + return; + } if (n src-nr_segments) n = src-nr_segments; for (i = 0; i n; i++) diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c index 098f6c6..e74efc7 100644 --- a/hw/block/xen_disk.c +++ b/hw/block/xen_disk.c @@ -114,6 +114,7 @@ struct XenBlkDev { int requests_finished; /* Persistent grants extension */ +gbooleanfeature_discard; gbooleanfeature_persistent; GTree *persistent_gnts; unsigned intpersistent_gnt_count; @@ -253,6 +254,8 @@ static int ioreq_parse(struct ioreq *ioreq) case BLKIF_OP_WRITE: ioreq-prot = PROT_READ; /* from memory */ break; +case BLKIF_OP_DISCARD: +return 0; default: xen_be_printf(blkdev-xendev, 0, error: unknown operation (%d)\n, ioreq-req.operation); @@ -521,6 +524,17 @@ static int ioreq_runio_qemu_aio(struct ioreq *ioreq) ioreq-v, ioreq-v.size / BLOCK_SIZE, qemu_aio_complete, ioreq); break; +case BLKIF_OP_DISCARD: +{ +struct blkif_request_discard *discard_req = (void *)ioreq-req; +bdrv_acct_start(blkdev-bs, ioreq-acct, +discard_req-nr_sectors * BLOCK_SIZE, BDRV_ACCT_WRITE); +ioreq-aio_inflight++; +bdrv_aio_discard(blkdev-bs, +discard_req-sector_number, discard_req-nr_sectors, +qemu_aio_complete, ioreq); +break; +} default: /* unknown operation (shouldn't happen -- parse catches this) */ goto err; @@ -699,6 +713,21 @@ static void blk_alloc(struct XenDevice *xendev) } } +static void blk_parse_discard(struct XenBlkDev *blkdev) +{ +int enable; + +blkdev-feature_discard = true; + +if (xenstore_read_be_int(blkdev-xendev, discard-enable, enable) == 0) { +blkdev-feature_discard = !!enable; +} + +if (blkdev-feature_discard) { +xenstore_write_be_int(blkdev-xendev, feature-discard, 1); +} +} + static int blk_init(struct XenDevice *xendev) { struct XenBlkDev *blkdev = container_of(xendev, struct XenBlkDev, xendev); @@ -766,6 +795,8 @@ static int blk_init(struct XenDevice *xendev) xenstore_write_be_int(blkdev-xendev, feature-persistent, 1); xenstore_write_be_int(blkdev-xendev, info, info); +blk_parse_discard(blkdev); + g_free(directiosafe); return 0; @@ -801,6 +832,9 @@ static int blk_connect(struct XenDevice *xendev) qflags |= BDRV_O_RDWR; readonly = false; } +if (blkdev-feature_discard) { +qflags |= BDRV_O_UNMAP; +} /* init qemu block driver */ index = (blkdev-xendev.dev - 202 * 256) / 16;