Re: [PATCH 2/2] block/rbd: workaround for ceph issue #53784
> Am 12.01.2022 um 22:57 schrieb Ilya Dryomov : > > On Wed, Jan 12, 2022 at 9:37 PM Peter Lieven wrote: >> >>> Am 12.01.22 um 20:51 schrieb Ilya Dryomov: >>> On Wed, Jan 12, 2022 at 1:32 PM Peter Lieven wrote: Am 12.01.22 um 13:22 schrieb Ilya Dryomov: > On Wed, Jan 12, 2022 at 12:55 PM Peter Lieven wrote: >> Am 12.01.22 um 10:59 schrieb Ilya Dryomov: >>> On Mon, Jan 10, 2022 at 12:43 PM Peter Lieven wrote: librbd had a bug until early 2022 that affected all versions of ceph that supported fast-diff. This bug results in reporting of incorrect offsets if the offset parameter to rbd_diff_iterate2 is not object aligned. Work around this bug by rounding down the offset to object boundaries. Fixes: https://tracker.ceph.com/issues/53784 >>> I don't think the Fixes tag is appropriate here. Linking librbd >>> ticket is fine but this patch doesn't really fix anything. >> Okay, I will change that to See: > It's already linked in the source code, up to you if you also want to > link it in the description. > Cc: qemu-sta...@nongnu.org Signed-off-by: Peter Lieven --- block/rbd.c | 17 - 1 file changed, 16 insertions(+), 1 deletion(-) diff --git a/block/rbd.c b/block/rbd.c index 5e9dc91d81..260cb9f4b4 100644 --- a/block/rbd.c +++ b/block/rbd.c @@ -1333,6 +1333,7 @@ static int coroutine_fn qemu_rbd_co_block_status(BlockDriverState *bs, int status, r; RBDDiffIterateReq req = { .offs = offset }; uint64_t features, flags; +int64_t head; assert(offset + bytes <= s->image_size); @@ -1360,6 +1361,19 @@ static int coroutine_fn qemu_rbd_co_block_status(BlockDriverState *bs, return status; } +/* + * librbd had a bug until early 2022 that affected all versions of ceph that + * supported fast-diff. This bug results in reporting of incorrect offsets + * if the offset parameter to rbd_diff_iterate2 is not object aligned. + * Work around this bug by rounding down the offset to object boundaries. + * + * See: https://tracker.ceph.com/issues/53784 + */ +head = offset & (s->object_size - 1); +offset -= head; +req.offs -= head; +bytes += head; >>> So it looks like the intention is to have more or less a permanent >>> workaround since all librbd versions are affected, right? For that, >>> I think we would need to also reject custom striping patterns and >>> clones. For the above to be reliable, the image has to be standalone >>> and have a default striping pattern (stripe_unit == object_size && >>> stripe_count == 1). Otherwise, behave as if fast-diff is disabled or >>> invalid. >> Do you have a fealing how many users use a different striping pattern >> than default? > Very few. > >> What about EC backed pools? > In this context EC pools behave exactly the same as replicated pools. > >> Do you have another idea how we can detect if the librbd version is >> broken? > No. Initially I wanted to just fix these bugs in librbd, relying on > the assumption that setups with a new QEMU should also have a fairly > new librbd. But after looking at various distros and realizing the > extent of rbd_diff_iterate2() issues, I think a long-term workaround > in QEMU makes sense. A configure-time check for known good versions > of librbd can be added later if someone feels like it. Okay, I will add a check on image open if the striping fits or we have a clone. Can you explain how I would best check for it? >>> ... if _we_ are a clone. Use rbd_get_parent_info(), it returns 0 if an >>> image has a parent and -ENOENT if an image is standalone. Keep in mind >>> that the image can be flattened (i.e. made standalone) at any moment so >>> this should be checked on each invocation. >>> >>> To get striping parameters, use rbd_get_stripe_unit() and >>> rbd_get_stripe_count() after checking for RBD_FEATURE_STRIPINGV2 >>> feature. This can be done just on image open, but since features >>> are queried on each invocation anyway and very few users are going >>> to have a custom striping pattern (and therefore have this feature >>> enabled), I'd do it here as well. >>> >>> Thanks, >>> >>>Ilya >> >> The flattening is not a problem because if we detect a clone we disable >> get_block_status >> >> for the session. If it gets flattened later it doesn't hurt we have just no >> allocation info. >> >> So at least these checks could be done at image open. > > I
Re: [PATCH 2/2] block/rbd: workaround for ceph issue #53784
On Wed, Jan 12, 2022 at 9:37 PM Peter Lieven wrote: > > Am 12.01.22 um 20:51 schrieb Ilya Dryomov: > > On Wed, Jan 12, 2022 at 1:32 PM Peter Lieven wrote: > >> Am 12.01.22 um 13:22 schrieb Ilya Dryomov: > >>> On Wed, Jan 12, 2022 at 12:55 PM Peter Lieven wrote: > Am 12.01.22 um 10:59 schrieb Ilya Dryomov: > > On Mon, Jan 10, 2022 at 12:43 PM Peter Lieven wrote: > >> librbd had a bug until early 2022 that affected all versions of ceph > >> that > >> supported fast-diff. This bug results in reporting of incorrect offsets > >> if the offset parameter to rbd_diff_iterate2 is not object aligned. > >> Work around this bug by rounding down the offset to object boundaries. > >> > >> Fixes: https://tracker.ceph.com/issues/53784 > > I don't think the Fixes tag is appropriate here. Linking librbd > > ticket is fine but this patch doesn't really fix anything. > Okay, I will change that to See: > >>> It's already linked in the source code, up to you if you also want to > >>> link it in the description. > >>> > >> Cc: qemu-sta...@nongnu.org > >> Signed-off-by: Peter Lieven > >> --- > >> block/rbd.c | 17 - > >> 1 file changed, 16 insertions(+), 1 deletion(-) > >> > >> diff --git a/block/rbd.c b/block/rbd.c > >> index 5e9dc91d81..260cb9f4b4 100644 > >> --- a/block/rbd.c > >> +++ b/block/rbd.c > >> @@ -1333,6 +1333,7 @@ static int coroutine_fn > >> qemu_rbd_co_block_status(BlockDriverState *bs, > >> int status, r; > >> RBDDiffIterateReq req = { .offs = offset }; > >> uint64_t features, flags; > >> +int64_t head; > >> > >> assert(offset + bytes <= s->image_size); > >> > >> @@ -1360,6 +1361,19 @@ static int coroutine_fn > >> qemu_rbd_co_block_status(BlockDriverState *bs, > >> return status; > >> } > >> > >> +/* > >> + * librbd had a bug until early 2022 that affected all versions > >> of ceph that > >> + * supported fast-diff. This bug results in reporting of > >> incorrect offsets > >> + * if the offset parameter to rbd_diff_iterate2 is not object > >> aligned. > >> + * Work around this bug by rounding down the offset to object > >> boundaries. > >> + * > >> + * See: https://tracker.ceph.com/issues/53784 > >> + */ > >> +head = offset & (s->object_size - 1); > >> +offset -= head; > >> +req.offs -= head; > >> +bytes += head; > > So it looks like the intention is to have more or less a permanent > > workaround since all librbd versions are affected, right? For that, > > I think we would need to also reject custom striping patterns and > > clones. For the above to be reliable, the image has to be standalone > > and have a default striping pattern (stripe_unit == object_size && > > stripe_count == 1). Otherwise, behave as if fast-diff is disabled or > > invalid. > Do you have a fealing how many users use a different striping pattern > than default? > >>> Very few. > >>> > What about EC backed pools? > >>> In this context EC pools behave exactly the same as replicated pools. > >>> > Do you have another idea how we can detect if the librbd version is > broken? > >>> No. Initially I wanted to just fix these bugs in librbd, relying on > >>> the assumption that setups with a new QEMU should also have a fairly > >>> new librbd. But after looking at various distros and realizing the > >>> extent of rbd_diff_iterate2() issues, I think a long-term workaround > >>> in QEMU makes sense. A configure-time check for known good versions > >>> of librbd can be added later if someone feels like it. > >> > >> Okay, I will add a check on image open if the striping fits or > >> > >> we have a clone. Can you explain how I would best check for it? > > ... if _we_ are a clone. Use rbd_get_parent_info(), it returns 0 if an > > image has a parent and -ENOENT if an image is standalone. Keep in mind > > that the image can be flattened (i.e. made standalone) at any moment so > > this should be checked on each invocation. > > > > To get striping parameters, use rbd_get_stripe_unit() and > > rbd_get_stripe_count() after checking for RBD_FEATURE_STRIPINGV2 > > feature. This can be done just on image open, but since features > > are queried on each invocation anyway and very few users are going > > to have a custom striping pattern (and therefore have this feature > > enabled), I'd do it here as well. > > > > Thanks, > > > > Ilya > > The flattening is not a problem because if we detect a clone we disable > get_block_status > > for the session. If it gets flattened later it doesn't hurt we have just no > allocation info. > > So at least these checks could be done at image open. I see. Doing both on image open and handling clones this way is fine too. > > >
Re: [PATCH 2/2] block/rbd: workaround for ceph issue #53784
Am 12.01.22 um 20:51 schrieb Ilya Dryomov: > On Wed, Jan 12, 2022 at 1:32 PM Peter Lieven wrote: >> Am 12.01.22 um 13:22 schrieb Ilya Dryomov: >>> On Wed, Jan 12, 2022 at 12:55 PM Peter Lieven wrote: Am 12.01.22 um 10:59 schrieb Ilya Dryomov: > On Mon, Jan 10, 2022 at 12:43 PM Peter Lieven wrote: >> librbd had a bug until early 2022 that affected all versions of ceph that >> supported fast-diff. This bug results in reporting of incorrect offsets >> if the offset parameter to rbd_diff_iterate2 is not object aligned. >> Work around this bug by rounding down the offset to object boundaries. >> >> Fixes: https://tracker.ceph.com/issues/53784 > I don't think the Fixes tag is appropriate here. Linking librbd > ticket is fine but this patch doesn't really fix anything. Okay, I will change that to See: >>> It's already linked in the source code, up to you if you also want to >>> link it in the description. >>> >> Cc: qemu-sta...@nongnu.org >> Signed-off-by: Peter Lieven >> --- >> block/rbd.c | 17 - >> 1 file changed, 16 insertions(+), 1 deletion(-) >> >> diff --git a/block/rbd.c b/block/rbd.c >> index 5e9dc91d81..260cb9f4b4 100644 >> --- a/block/rbd.c >> +++ b/block/rbd.c >> @@ -1333,6 +1333,7 @@ static int coroutine_fn >> qemu_rbd_co_block_status(BlockDriverState *bs, >> int status, r; >> RBDDiffIterateReq req = { .offs = offset }; >> uint64_t features, flags; >> +int64_t head; >> >> assert(offset + bytes <= s->image_size); >> >> @@ -1360,6 +1361,19 @@ static int coroutine_fn >> qemu_rbd_co_block_status(BlockDriverState *bs, >> return status; >> } >> >> +/* >> + * librbd had a bug until early 2022 that affected all versions of >> ceph that >> + * supported fast-diff. This bug results in reporting of incorrect >> offsets >> + * if the offset parameter to rbd_diff_iterate2 is not object >> aligned. >> + * Work around this bug by rounding down the offset to object >> boundaries. >> + * >> + * See: https://tracker.ceph.com/issues/53784 >> + */ >> +head = offset & (s->object_size - 1); >> +offset -= head; >> +req.offs -= head; >> +bytes += head; > So it looks like the intention is to have more or less a permanent > workaround since all librbd versions are affected, right? For that, > I think we would need to also reject custom striping patterns and > clones. For the above to be reliable, the image has to be standalone > and have a default striping pattern (stripe_unit == object_size && > stripe_count == 1). Otherwise, behave as if fast-diff is disabled or > invalid. Do you have a fealing how many users use a different striping pattern than default? >>> Very few. >>> What about EC backed pools? >>> In this context EC pools behave exactly the same as replicated pools. >>> Do you have another idea how we can detect if the librbd version is broken? >>> No. Initially I wanted to just fix these bugs in librbd, relying on >>> the assumption that setups with a new QEMU should also have a fairly >>> new librbd. But after looking at various distros and realizing the >>> extent of rbd_diff_iterate2() issues, I think a long-term workaround >>> in QEMU makes sense. A configure-time check for known good versions >>> of librbd can be added later if someone feels like it. >> >> Okay, I will add a check on image open if the striping fits or >> >> we have a clone. Can you explain how I would best check for it? > ... if _we_ are a clone. Use rbd_get_parent_info(), it returns 0 if an > image has a parent and -ENOENT if an image is standalone. Keep in mind > that the image can be flattened (i.e. made standalone) at any moment so > this should be checked on each invocation. > > To get striping parameters, use rbd_get_stripe_unit() and > rbd_get_stripe_count() after checking for RBD_FEATURE_STRIPINGV2 > feature. This can be done just on image open, but since features > are queried on each invocation anyway and very few users are going > to have a custom striping pattern (and therefore have this feature > enabled), I'd do it here as well. > > Thanks, > > Ilya The flattening is not a problem because if we detect a clone we disable get_block_status for the session. If it gets flattened later it doesn't hurt we have just no allocation info. So at least these checks could be done at image open. To be honest, all these checks and workaround stuff doesn't feel right. We try to fix something that is broken from scratch and make the qemu block driver quite ugly doing so. I would personally go and drop get_block_status support for all existing librbd versions and add a flag to librbd that reports that rbd_diff_iterate2 is not broken and check for th
Re: [PATCH 2/2] block/rbd: workaround for ceph issue #53784
On Wed, Jan 12, 2022 at 12:55 PM Peter Lieven wrote: > > Am 12.01.22 um 10:59 schrieb Ilya Dryomov: > > On Mon, Jan 10, 2022 at 12:43 PM Peter Lieven wrote: > >> librbd had a bug until early 2022 that affected all versions of ceph that > >> supported fast-diff. This bug results in reporting of incorrect offsets > >> if the offset parameter to rbd_diff_iterate2 is not object aligned. > >> Work around this bug by rounding down the offset to object boundaries. > >> > >> Fixes: https://tracker.ceph.com/issues/53784 > > I don't think the Fixes tag is appropriate here. Linking librbd > > ticket is fine but this patch doesn't really fix anything. > > > Okay, I will change that to See: It's already linked in the source code, up to you if you also want to link it in the description. > > > > > >> Cc: qemu-sta...@nongnu.org > >> Signed-off-by: Peter Lieven > >> --- > >> block/rbd.c | 17 - > >> 1 file changed, 16 insertions(+), 1 deletion(-) > >> > >> diff --git a/block/rbd.c b/block/rbd.c > >> index 5e9dc91d81..260cb9f4b4 100644 > >> --- a/block/rbd.c > >> +++ b/block/rbd.c > >> @@ -1333,6 +1333,7 @@ static int coroutine_fn > >> qemu_rbd_co_block_status(BlockDriverState *bs, > >> int status, r; > >> RBDDiffIterateReq req = { .offs = offset }; > >> uint64_t features, flags; > >> +int64_t head; > >> > >> assert(offset + bytes <= s->image_size); > >> > >> @@ -1360,6 +1361,19 @@ static int coroutine_fn > >> qemu_rbd_co_block_status(BlockDriverState *bs, > >> return status; > >> } > >> > >> +/* > >> + * librbd had a bug until early 2022 that affected all versions of > >> ceph that > >> + * supported fast-diff. This bug results in reporting of incorrect > >> offsets > >> + * if the offset parameter to rbd_diff_iterate2 is not object aligned. > >> + * Work around this bug by rounding down the offset to object > >> boundaries. > >> + * > >> + * See: https://tracker.ceph.com/issues/53784 > >> + */ > >> +head = offset & (s->object_size - 1); > >> +offset -= head; > >> +req.offs -= head; > >> +bytes += head; > > So it looks like the intention is to have more or less a permanent > > workaround since all librbd versions are affected, right? For that, > > I think we would need to also reject custom striping patterns and > > clones. For the above to be reliable, the image has to be standalone > > and have a default striping pattern (stripe_unit == object_size && > > stripe_count == 1). Otherwise, behave as if fast-diff is disabled or > > invalid. > > > Do you have a fealing how many users use a different striping pattern than > default? Very few. > > What about EC backed pools? In this context EC pools behave exactly the same as replicated pools. > > Do you have another idea how we can detect if the librbd version is broken? No. Initially I wanted to just fix these bugs in librbd, relying on the assumption that setups with a new QEMU should also have a fairly new librbd. But after looking at various distros and realizing the extent of rbd_diff_iterate2() issues, I think a long-term workaround in QEMU makes sense. A configure-time check for known good versions of librbd can be added later if someone feels like it. Thanks, Ilya
Re: [PATCH 2/2] block/rbd: workaround for ceph issue #53784
Am 12.01.22 um 10:59 schrieb Ilya Dryomov: > On Mon, Jan 10, 2022 at 12:43 PM Peter Lieven wrote: >> librbd had a bug until early 2022 that affected all versions of ceph that >> supported fast-diff. This bug results in reporting of incorrect offsets >> if the offset parameter to rbd_diff_iterate2 is not object aligned. >> Work around this bug by rounding down the offset to object boundaries. >> >> Fixes: https://tracker.ceph.com/issues/53784 > I don't think the Fixes tag is appropriate here. Linking librbd > ticket is fine but this patch doesn't really fix anything. Okay, I will change that to See: > >> Cc: qemu-sta...@nongnu.org >> Signed-off-by: Peter Lieven >> --- >> block/rbd.c | 17 - >> 1 file changed, 16 insertions(+), 1 deletion(-) >> >> diff --git a/block/rbd.c b/block/rbd.c >> index 5e9dc91d81..260cb9f4b4 100644 >> --- a/block/rbd.c >> +++ b/block/rbd.c >> @@ -1333,6 +1333,7 @@ static int coroutine_fn >> qemu_rbd_co_block_status(BlockDriverState *bs, >> int status, r; >> RBDDiffIterateReq req = { .offs = offset }; >> uint64_t features, flags; >> +int64_t head; >> >> assert(offset + bytes <= s->image_size); >> >> @@ -1360,6 +1361,19 @@ static int coroutine_fn >> qemu_rbd_co_block_status(BlockDriverState *bs, >> return status; >> } >> >> +/* >> + * librbd had a bug until early 2022 that affected all versions of ceph >> that >> + * supported fast-diff. This bug results in reporting of incorrect >> offsets >> + * if the offset parameter to rbd_diff_iterate2 is not object aligned. >> + * Work around this bug by rounding down the offset to object >> boundaries. >> + * >> + * See: https://tracker.ceph.com/issues/53784 >> + */ >> +head = offset & (s->object_size - 1); >> +offset -= head; >> +req.offs -= head; >> +bytes += head; > So it looks like the intention is to have more or less a permanent > workaround since all librbd versions are affected, right? For that, > I think we would need to also reject custom striping patterns and > clones. For the above to be reliable, the image has to be standalone > and have a default striping pattern (stripe_unit == object_size && > stripe_count == 1). Otherwise, behave as if fast-diff is disabled or > invalid. Do you have a fealing how many users use a different striping pattern than default? What about EC backed pools? Do you have another idea how we can detect if the librbd version is broken? > >> + > Nit: I'd replace { .offs = offset } initialization at the top with {} > and assign to req.offs here, right before calling rbd_diff_iterate2(). > >> r = rbd_diff_iterate2(s->image, NULL, offset, bytes, true, true, >>qemu_rbd_diff_iterate_cb, &req); >> if (r < 0 && r != QEMU_RBD_EXIT_DIFF_ITERATE2) { >> @@ -1379,7 +1393,8 @@ static int coroutine_fn >> qemu_rbd_co_block_status(BlockDriverState *bs, >> status = BDRV_BLOCK_ZERO | BDRV_BLOCK_OFFSET_VALID; >> } >> >> -*pnum = req.bytes; >> +assert(req.bytes > head); > I'd expand the workaround comment with an explanation of why it's OK > to round down the offset -- because rbd_diff_iterate2() is called with > whole_object=true. If that wasn't the case, on top of inconsistent > results for different offsets within an object, this assert could be > triggered. Sure, you are right. I had this in mind. This also does not change complexity since we stay with the offset in the same object. I will mention both. Peter
Re: [PATCH 2/2] block/rbd: workaround for ceph issue #53784
On Mon, Jan 10, 2022 at 12:43 PM Peter Lieven wrote: > > librbd had a bug until early 2022 that affected all versions of ceph that > supported fast-diff. This bug results in reporting of incorrect offsets > if the offset parameter to rbd_diff_iterate2 is not object aligned. > Work around this bug by rounding down the offset to object boundaries. > > Fixes: https://tracker.ceph.com/issues/53784 I don't think the Fixes tag is appropriate here. Linking librbd ticket is fine but this patch doesn't really fix anything. > Cc: qemu-sta...@nongnu.org > Signed-off-by: Peter Lieven > --- > block/rbd.c | 17 - > 1 file changed, 16 insertions(+), 1 deletion(-) > > diff --git a/block/rbd.c b/block/rbd.c > index 5e9dc91d81..260cb9f4b4 100644 > --- a/block/rbd.c > +++ b/block/rbd.c > @@ -1333,6 +1333,7 @@ static int coroutine_fn > qemu_rbd_co_block_status(BlockDriverState *bs, > int status, r; > RBDDiffIterateReq req = { .offs = offset }; > uint64_t features, flags; > +int64_t head; > > assert(offset + bytes <= s->image_size); > > @@ -1360,6 +1361,19 @@ static int coroutine_fn > qemu_rbd_co_block_status(BlockDriverState *bs, > return status; > } > > +/* > + * librbd had a bug until early 2022 that affected all versions of ceph > that > + * supported fast-diff. This bug results in reporting of incorrect > offsets > + * if the offset parameter to rbd_diff_iterate2 is not object aligned. > + * Work around this bug by rounding down the offset to object boundaries. > + * > + * See: https://tracker.ceph.com/issues/53784 > + */ > +head = offset & (s->object_size - 1); > +offset -= head; > +req.offs -= head; > +bytes += head; So it looks like the intention is to have more or less a permanent workaround since all librbd versions are affected, right? For that, I think we would need to also reject custom striping patterns and clones. For the above to be reliable, the image has to be standalone and have a default striping pattern (stripe_unit == object_size && stripe_count == 1). Otherwise, behave as if fast-diff is disabled or invalid. > + Nit: I'd replace { .offs = offset } initialization at the top with {} and assign to req.offs here, right before calling rbd_diff_iterate2(). > r = rbd_diff_iterate2(s->image, NULL, offset, bytes, true, true, >qemu_rbd_diff_iterate_cb, &req); > if (r < 0 && r != QEMU_RBD_EXIT_DIFF_ITERATE2) { > @@ -1379,7 +1393,8 @@ static int coroutine_fn > qemu_rbd_co_block_status(BlockDriverState *bs, > status = BDRV_BLOCK_ZERO | BDRV_BLOCK_OFFSET_VALID; > } > > -*pnum = req.bytes; > +assert(req.bytes > head); I'd expand the workaround comment with an explanation of why it's OK to round down the offset -- because rbd_diff_iterate2() is called with whole_object=true. If that wasn't the case, on top of inconsistent results for different offsets within an object, this assert could be triggered. Thanks, Ilya
Re: [PATCH 2/2] block/rbd: workaround for ceph issue #53784
Hi Peter, On Tue, Jan 11, 2022 at 10:10:16AM +0100, Peter Lieven wrote: Hi Stefano, thanks for the feedback. Please note that you also need the other patch or you will sooner or later run into another assertion as soon as rbd snapshots are involved. Yep, I tested with the entire series applied. Anyway, thanks for clarifying that. Regarding the workaround I need confirmation from Ilya that it covers all cases. I do not know if it works if striping or EC is configured on the pool. Sure :-) Thanks, Stefano
Re: [PATCH 2/2] block/rbd: workaround for ceph issue #53784
Am 10.01.22 um 15:18 schrieb Stefano Garzarella: > On Mon, Jan 10, 2022 at 12:41:54PM +0100, Peter Lieven wrote: >> librbd had a bug until early 2022 that affected all versions of ceph that >> supported fast-diff. This bug results in reporting of incorrect offsets >> if the offset parameter to rbd_diff_iterate2 is not object aligned. >> Work around this bug by rounding down the offset to object boundaries. >> >> Fixes: https://tracker.ceph.com/issues/53784 >> Cc: qemu-sta...@nongnu.org >> Signed-off-by: Peter Lieven >> --- >> block/rbd.c | 17 - >> 1 file changed, 16 insertions(+), 1 deletion(-) >> >> diff --git a/block/rbd.c b/block/rbd.c >> index 5e9dc91d81..260cb9f4b4 100644 >> --- a/block/rbd.c >> +++ b/block/rbd.c >> @@ -1333,6 +1333,7 @@ static int coroutine_fn >> qemu_rbd_co_block_status(BlockDriverState *bs, >> int status, r; >> RBDDiffIterateReq req = { .offs = offset }; >> uint64_t features, flags; >> + int64_t head; >> >> assert(offset + bytes <= s->image_size); >> >> @@ -1360,6 +1361,19 @@ static int coroutine_fn >> qemu_rbd_co_block_status(BlockDriverState *bs, >> return status; >> } >> >> + /* >> + * librbd had a bug until early 2022 that affected all versions of ceph >> that >> + * supported fast-diff. This bug results in reporting of incorrect >> offsets >> + * if the offset parameter to rbd_diff_iterate2 is not object aligned. >> + * Work around this bug by rounding down the offset to object >> boundaries. >> + * >> + * See: https://tracker.ceph.com/issues/53784 >> + */ >> + head = offset & (s->object_size - 1); >> + offset -= head; >> + req.offs -= head; >> + bytes += head; >> + >> r = rbd_diff_iterate2(s->image, NULL, offset, bytes, true, true, >> qemu_rbd_diff_iterate_cb, &req); >> if (r < 0 && r != QEMU_RBD_EXIT_DIFF_ITERATE2) { >> @@ -1379,7 +1393,8 @@ static int coroutine_fn >> qemu_rbd_co_block_status(BlockDriverState *bs, >> status = BDRV_BLOCK_ZERO | BDRV_BLOCK_OFFSET_VALID; >> } >> >> - *pnum = req.bytes; >> + assert(req.bytes > head); >> + *pnum = req.bytes - head; >> return status; >> } > > Thanks for the workaround! > > I just tested this patch for the issue reported in this BZ [1] and the test > now works correctly! > > Tested-by: Stefano Garzarella > > [1] https://bugzilla.redhat.com/show_bug.cgi?id=2034791 > Hi Stefano, thanks for the feedback. Please note that you also need the other patch or you will sooner or later run into another assertion as soon as rbd snapshots are involved. Regarding the workaround I need confirmation from Ilya that it covers all cases. I do not know if it works if striping or EC is configured on the pool. Best, Peter
Re: [PATCH 2/2] block/rbd: workaround for ceph issue #53784
On Mon, Jan 10, 2022 at 12:41:54PM +0100, Peter Lieven wrote: librbd had a bug until early 2022 that affected all versions of ceph that supported fast-diff. This bug results in reporting of incorrect offsets if the offset parameter to rbd_diff_iterate2 is not object aligned. Work around this bug by rounding down the offset to object boundaries. Fixes: https://tracker.ceph.com/issues/53784 Cc: qemu-sta...@nongnu.org Signed-off-by: Peter Lieven --- block/rbd.c | 17 - 1 file changed, 16 insertions(+), 1 deletion(-) diff --git a/block/rbd.c b/block/rbd.c index 5e9dc91d81..260cb9f4b4 100644 --- a/block/rbd.c +++ b/block/rbd.c @@ -1333,6 +1333,7 @@ static int coroutine_fn qemu_rbd_co_block_status(BlockDriverState *bs, int status, r; RBDDiffIterateReq req = { .offs = offset }; uint64_t features, flags; +int64_t head; assert(offset + bytes <= s->image_size); @@ -1360,6 +1361,19 @@ static int coroutine_fn qemu_rbd_co_block_status(BlockDriverState *bs, return status; } +/* + * librbd had a bug until early 2022 that affected all versions of ceph that + * supported fast-diff. This bug results in reporting of incorrect offsets + * if the offset parameter to rbd_diff_iterate2 is not object aligned. + * Work around this bug by rounding down the offset to object boundaries. + * + * See: https://tracker.ceph.com/issues/53784 + */ +head = offset & (s->object_size - 1); +offset -= head; +req.offs -= head; +bytes += head; + r = rbd_diff_iterate2(s->image, NULL, offset, bytes, true, true, qemu_rbd_diff_iterate_cb, &req); if (r < 0 && r != QEMU_RBD_EXIT_DIFF_ITERATE2) { @@ -1379,7 +1393,8 @@ static int coroutine_fn qemu_rbd_co_block_status(BlockDriverState *bs, status = BDRV_BLOCK_ZERO | BDRV_BLOCK_OFFSET_VALID; } -*pnum = req.bytes; +assert(req.bytes > head); +*pnum = req.bytes - head; return status; } Thanks for the workaround! I just tested this patch for the issue reported in this BZ [1] and the test now works correctly! Tested-by: Stefano Garzarella [1] https://bugzilla.redhat.com/show_bug.cgi?id=2034791