rbd: add write zeroes support

Peter Lieven Mon, 21 Jun 2021 01:49:30 -0700

Am 18.06.21 um 12:34 schrieb Ilya Dryomov:

On Fri, Jun 18, 2021 at 11:00 AM Peter Lieven <p...@kamp.de> wrote:

Am 16.06.21 um 14:34 schrieb Ilya Dryomov:

On Wed, May 19, 2021 at 4:28 PM Peter Lieven <p...@kamp.de> wrote:

Signed-off-by: Peter Lieven <p...@kamp.de>
---
  block/rbd.c | 37 ++++++++++++++++++++++++++++++++++++-
  1 file changed, 36 insertions(+), 1 deletion(-)


diff --git a/block/rbd.c b/block/rbd.c
index 0d8612a988..ee13f08a74 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -63,7 +63,8 @@ typedef enum {
      RBD_AIO_READ,
      RBD_AIO_WRITE,
      RBD_AIO_DISCARD,
-    RBD_AIO_FLUSH
+    RBD_AIO_FLUSH,
+    RBD_AIO_WRITE_ZEROES
  } RBDAIOCmd;

  typedef struct BDRVRBDState {
@@ -705,6 +706,10 @@ static int qemu_rbd_open(BlockDriverState *bs, QDict 
*options, int flags,
          }
      }

+#ifdef LIBRBD_SUPPORTS_WRITE_ZEROES
+    bs->supported_zero_flags = BDRV_REQ_MAY_UNMAP;

I wonder if we should also set BDRV_REQ_NO_FALLBACK here since librbd
does not really have a notion of non-efficient explicit zeroing.


This is only true if thick provisioning is supported which is in Octopus 
onwards, right?

Since Pacific, I think.

So it would only be correct to set this if thick provisioning is supported 
otherwise we could

fail with ENOTSUP and then qemu emulates the zeroing with plain writes.

I actually had a question about that.  Why are you returning ENOTSUP
in case BDRV_REQ_MAY_UNMAP is not specified and that can't be fulfilled
because librbd is too old for RBD_WRITE_ZEROES_FLAG_THICK_PROVISION?

My understanding has always been that BDRV_REQ_MAY_UNMAP is just
a hint.  Deallocating if BDRV_REQ_MAY_UNMAP is specified is not nice
but should be perfectly acceptable.  It is certainly better than
returning ENOTSUP, particularly if ENOTSUP causes Qemu to do plain
zeroing.



I think this was introduced to support different provisioning modes. If 
BDRV_REQ_MAY_UNMAP is not set

the caller of bdrv_write_zeroes expects that the driver does thick 
provisioning. If the driver cannot handle that (efficiently)

qemu does a plain zero write.


I am still not fully understanding the meaning of the BDRV_REQ_NO_FALLBACK 
flag. The original commit states that it was introduced for qemu-img to 
efficiently

zero out the target and avoid the slow fallback. When I last worked on qemu-img 
convert I remember that there was a call to zero out the target if 
bdrv_has_zero_init

is not 1. It seems hat meanwhile a target_is_zero cmdline switch for qemu-img 
convert was added to let the user assert that a preexisting target is zero.

Maybe someone can help here if it would be right to set BDRV_REQ_NO_FALLBACK 
for rbd in either of the 2 cases (thick provisioning is support or not)?


Thanks

Peter

Re: [PATCH V3 5/6] block/rbd: add write zeroes support

Reply via email to