Re: [PATCH 2/2] block/rbd: workaround for ceph issue #53784

2022-01-12 Thread Ilya Dryomov
On Wed, Jan 12, 2022 at 12:55 PM Peter Lieven  wrote:
>
> Am 12.01.22 um 10:59 schrieb Ilya Dryomov:
> > On Mon, Jan 10, 2022 at 12:43 PM Peter Lieven  wrote:
> >> librbd had a bug until early 2022 that affected all versions of ceph that
> >> supported fast-diff. This bug results in reporting of incorrect offsets
> >> if the offset parameter to rbd_diff_iterate2 is not object aligned.
> >> Work around this bug by rounding down the offset to object boundaries.
> >>
> >> Fixes: https://tracker.ceph.com/issues/53784
> > I don't think the Fixes tag is appropriate here.  Linking librbd
> > ticket is fine but this patch doesn't really fix anything.
>
>
> Okay, I will change that to See:

It's already linked in the source code, up to you if you also want to
link it in the description.

>
>
> >
> >> Cc: qemu-sta...@nongnu.org
> >> Signed-off-by: Peter Lieven 
> >> ---
> >>  block/rbd.c | 17 -
> >>  1 file changed, 16 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/block/rbd.c b/block/rbd.c
> >> index 5e9dc91d81..260cb9f4b4 100644
> >> --- a/block/rbd.c
> >> +++ b/block/rbd.c
> >> @@ -1333,6 +1333,7 @@ static int coroutine_fn 
> >> qemu_rbd_co_block_status(BlockDriverState *bs,
> >>  int status, r;
> >>  RBDDiffIterateReq req = { .offs = offset };
> >>  uint64_t features, flags;
> >> +int64_t head;
> >>
> >>  assert(offset + bytes <= s->image_size);
> >>
> >> @@ -1360,6 +1361,19 @@ static int coroutine_fn 
> >> qemu_rbd_co_block_status(BlockDriverState *bs,
> >>  return status;
> >>  }
> >>
> >> +/*
> >> + * librbd had a bug until early 2022 that affected all versions of 
> >> ceph that
> >> + * supported fast-diff. This bug results in reporting of incorrect 
> >> offsets
> >> + * if the offset parameter to rbd_diff_iterate2 is not object aligned.
> >> + * Work around this bug by rounding down the offset to object 
> >> boundaries.
> >> + *
> >> + * See: https://tracker.ceph.com/issues/53784
> >> + */
> >> +head = offset & (s->object_size - 1);
> >> +offset -= head;
> >> +req.offs -= head;
> >> +bytes += head;
> > So it looks like the intention is to have more or less a permanent
> > workaround since all librbd versions are affected, right?  For that,
> > I think we would need to also reject custom striping patterns and
> > clones.  For the above to be reliable, the image has to be standalone
> > and have a default striping pattern (stripe_unit == object_size &&
> > stripe_count == 1).  Otherwise, behave as if fast-diff is disabled or
> > invalid.
>
>
> Do you have a fealing how many users use a different striping pattern than 
> default?

Very few.

>
> What about EC backed pools?

In this context EC pools behave exactly the same as replicated pools.

>
> Do you have another idea how we can detect if the librbd version is broken?

No.  Initially I wanted to just fix these bugs in librbd, relying on
the assumption that setups with a new QEMU should also have a fairly
new librbd.  But after looking at various distros and realizing the
extent of rbd_diff_iterate2() issues, I think a long-term workaround
in QEMU makes sense.  A configure-time check for known good versions
of librbd can be added later if someone feels like it.

Thanks,

Ilya



Re: [PATCH 2/2] block/rbd: workaround for ceph issue #53784

2022-01-12 Thread Peter Lieven
Am 12.01.22 um 10:59 schrieb Ilya Dryomov:
> On Mon, Jan 10, 2022 at 12:43 PM Peter Lieven  wrote:
>> librbd had a bug until early 2022 that affected all versions of ceph that
>> supported fast-diff. This bug results in reporting of incorrect offsets
>> if the offset parameter to rbd_diff_iterate2 is not object aligned.
>> Work around this bug by rounding down the offset to object boundaries.
>>
>> Fixes: https://tracker.ceph.com/issues/53784
> I don't think the Fixes tag is appropriate here.  Linking librbd
> ticket is fine but this patch doesn't really fix anything.


Okay, I will change that to See:


>
>> Cc: qemu-sta...@nongnu.org
>> Signed-off-by: Peter Lieven 
>> ---
>>  block/rbd.c | 17 -
>>  1 file changed, 16 insertions(+), 1 deletion(-)
>>
>> diff --git a/block/rbd.c b/block/rbd.c
>> index 5e9dc91d81..260cb9f4b4 100644
>> --- a/block/rbd.c
>> +++ b/block/rbd.c
>> @@ -1333,6 +1333,7 @@ static int coroutine_fn 
>> qemu_rbd_co_block_status(BlockDriverState *bs,
>>  int status, r;
>>  RBDDiffIterateReq req = { .offs = offset };
>>  uint64_t features, flags;
>> +int64_t head;
>>
>>  assert(offset + bytes <= s->image_size);
>>
>> @@ -1360,6 +1361,19 @@ static int coroutine_fn 
>> qemu_rbd_co_block_status(BlockDriverState *bs,
>>  return status;
>>  }
>>
>> +/*
>> + * librbd had a bug until early 2022 that affected all versions of ceph 
>> that
>> + * supported fast-diff. This bug results in reporting of incorrect 
>> offsets
>> + * if the offset parameter to rbd_diff_iterate2 is not object aligned.
>> + * Work around this bug by rounding down the offset to object 
>> boundaries.
>> + *
>> + * See: https://tracker.ceph.com/issues/53784
>> + */
>> +head = offset & (s->object_size - 1);
>> +offset -= head;
>> +req.offs -= head;
>> +bytes += head;
> So it looks like the intention is to have more or less a permanent
> workaround since all librbd versions are affected, right?  For that,
> I think we would need to also reject custom striping patterns and
> clones.  For the above to be reliable, the image has to be standalone
> and have a default striping pattern (stripe_unit == object_size &&
> stripe_count == 1).  Otherwise, behave as if fast-diff is disabled or
> invalid.


Do you have a fealing how many users use a different striping pattern than 
default?

What about EC backed pools?

Do you have another idea how we can detect if the librbd version is broken?


>
>> +
> Nit: I'd replace { .offs = offset } initialization at the top with {}
> and assign to req.offs here, right before calling rbd_diff_iterate2().
>
>>  r = rbd_diff_iterate2(s->image, NULL, offset, bytes, true, true,
>>qemu_rbd_diff_iterate_cb, );
>>  if (r < 0 && r != QEMU_RBD_EXIT_DIFF_ITERATE2) {
>> @@ -1379,7 +1393,8 @@ static int coroutine_fn 
>> qemu_rbd_co_block_status(BlockDriverState *bs,
>>  status = BDRV_BLOCK_ZERO | BDRV_BLOCK_OFFSET_VALID;
>>  }
>>
>> -*pnum = req.bytes;
>> +assert(req.bytes > head);
> I'd expand the workaround comment with an explanation of why it's OK
> to round down the offset -- because rbd_diff_iterate2() is called with
> whole_object=true.  If that wasn't the case, on top of inconsistent
> results for different offsets within an object, this assert could be
> triggered.

Sure, you are right. I had this in mind. This also does not change complexity

since we stay with the offset in the same object. I will mention both.


Peter






Re: [PATCH 2/2] block/rbd: workaround for ceph issue #53784

2022-01-12 Thread Ilya Dryomov
On Mon, Jan 10, 2022 at 12:43 PM Peter Lieven  wrote:
>
> librbd had a bug until early 2022 that affected all versions of ceph that
> supported fast-diff. This bug results in reporting of incorrect offsets
> if the offset parameter to rbd_diff_iterate2 is not object aligned.
> Work around this bug by rounding down the offset to object boundaries.
>
> Fixes: https://tracker.ceph.com/issues/53784

I don't think the Fixes tag is appropriate here.  Linking librbd
ticket is fine but this patch doesn't really fix anything.

> Cc: qemu-sta...@nongnu.org
> Signed-off-by: Peter Lieven 
> ---
>  block/rbd.c | 17 -
>  1 file changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/block/rbd.c b/block/rbd.c
> index 5e9dc91d81..260cb9f4b4 100644
> --- a/block/rbd.c
> +++ b/block/rbd.c
> @@ -1333,6 +1333,7 @@ static int coroutine_fn 
> qemu_rbd_co_block_status(BlockDriverState *bs,
>  int status, r;
>  RBDDiffIterateReq req = { .offs = offset };
>  uint64_t features, flags;
> +int64_t head;
>
>  assert(offset + bytes <= s->image_size);
>
> @@ -1360,6 +1361,19 @@ static int coroutine_fn 
> qemu_rbd_co_block_status(BlockDriverState *bs,
>  return status;
>  }
>
> +/*
> + * librbd had a bug until early 2022 that affected all versions of ceph 
> that
> + * supported fast-diff. This bug results in reporting of incorrect 
> offsets
> + * if the offset parameter to rbd_diff_iterate2 is not object aligned.
> + * Work around this bug by rounding down the offset to object boundaries.
> + *
> + * See: https://tracker.ceph.com/issues/53784
> + */
> +head = offset & (s->object_size - 1);
> +offset -= head;
> +req.offs -= head;
> +bytes += head;

So it looks like the intention is to have more or less a permanent
workaround since all librbd versions are affected, right?  For that,
I think we would need to also reject custom striping patterns and
clones.  For the above to be reliable, the image has to be standalone
and have a default striping pattern (stripe_unit == object_size &&
stripe_count == 1).  Otherwise, behave as if fast-diff is disabled or
invalid.

> +

Nit: I'd replace { .offs = offset } initialization at the top with {}
and assign to req.offs here, right before calling rbd_diff_iterate2().

>  r = rbd_diff_iterate2(s->image, NULL, offset, bytes, true, true,
>qemu_rbd_diff_iterate_cb, );
>  if (r < 0 && r != QEMU_RBD_EXIT_DIFF_ITERATE2) {
> @@ -1379,7 +1393,8 @@ static int coroutine_fn 
> qemu_rbd_co_block_status(BlockDriverState *bs,
>  status = BDRV_BLOCK_ZERO | BDRV_BLOCK_OFFSET_VALID;
>  }
>
> -*pnum = req.bytes;
> +assert(req.bytes > head);

I'd expand the workaround comment with an explanation of why it's OK
to round down the offset -- because rbd_diff_iterate2() is called with
whole_object=true.  If that wasn't the case, on top of inconsistent
results for different offsets within an object, this assert could be
triggered.

Thanks,

Ilya



Re: [PATCH 2/2] block/rbd: workaround for ceph issue #53784

2022-01-11 Thread Stefano Garzarella

Hi Peter,

On Tue, Jan 11, 2022 at 10:10:16AM +0100, Peter Lieven wrote:

Hi Stefano,


thanks for the feedback. Please note that you also need the other patch 
or you will sooner or later run into another assertion as soon as rbd 
snapshots are involved.


Yep, I tested with the entire series applied.
Anyway, thanks for clarifying that.



Regarding the workaround I need confirmation from Ilya that it covers 
all cases. I do not know if it works if striping or EC is configured on 
the pool.


Sure :-)

Thanks,
Stefano




Re: [PATCH 2/2] block/rbd: workaround for ceph issue #53784

2022-01-11 Thread Peter Lieven
Am 10.01.22 um 15:18 schrieb Stefano Garzarella:
> On Mon, Jan 10, 2022 at 12:41:54PM +0100, Peter Lieven wrote:
>> librbd had a bug until early 2022 that affected all versions of ceph that
>> supported fast-diff. This bug results in reporting of incorrect offsets
>> if the offset parameter to rbd_diff_iterate2 is not object aligned.
>> Work around this bug by rounding down the offset to object boundaries.
>>
>> Fixes: https://tracker.ceph.com/issues/53784
>> Cc: qemu-sta...@nongnu.org
>> Signed-off-by: Peter Lieven 
>> ---
>> block/rbd.c | 17 -
>> 1 file changed, 16 insertions(+), 1 deletion(-)
>>
>> diff --git a/block/rbd.c b/block/rbd.c
>> index 5e9dc91d81..260cb9f4b4 100644
>> --- a/block/rbd.c
>> +++ b/block/rbd.c
>> @@ -1333,6 +1333,7 @@ static int coroutine_fn 
>> qemu_rbd_co_block_status(BlockDriverState *bs,
>>     int status, r;
>>     RBDDiffIterateReq req = { .offs = offset };
>>     uint64_t features, flags;
>> +    int64_t head;
>>
>>     assert(offset + bytes <= s->image_size);
>>
>> @@ -1360,6 +1361,19 @@ static int coroutine_fn 
>> qemu_rbd_co_block_status(BlockDriverState *bs,
>>     return status;
>>     }
>>
>> +    /*
>> + * librbd had a bug until early 2022 that affected all versions of ceph 
>> that
>> + * supported fast-diff. This bug results in reporting of incorrect 
>> offsets
>> + * if the offset parameter to rbd_diff_iterate2 is not object aligned.
>> + * Work around this bug by rounding down the offset to object 
>> boundaries.
>> + *
>> + * See: https://tracker.ceph.com/issues/53784
>> + */
>> +    head = offset & (s->object_size - 1);
>> +    offset -= head;
>> +    req.offs -= head;
>> +    bytes += head;
>> +
>>     r = rbd_diff_iterate2(s->image, NULL, offset, bytes, true, true,
>>   qemu_rbd_diff_iterate_cb, );
>>     if (r < 0 && r != QEMU_RBD_EXIT_DIFF_ITERATE2) {
>> @@ -1379,7 +1393,8 @@ static int coroutine_fn 
>> qemu_rbd_co_block_status(BlockDriverState *bs,
>>     status = BDRV_BLOCK_ZERO | BDRV_BLOCK_OFFSET_VALID;
>>     }
>>
>> -    *pnum = req.bytes;
>> +    assert(req.bytes > head);
>> +    *pnum = req.bytes - head;
>>     return status;
>> }
>
> Thanks for the workaround!
>
> I just tested this patch for the issue reported in this BZ [1] and the test 
> now works correctly!
>
> Tested-by: Stefano Garzarella 
>
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=2034791
>


Hi Stefano,


thanks for the feedback. Please note that you also need the other patch or you 
will sooner or later run into another assertion as soon as rbd snapshots are 
involved.


Regarding the workaround I need confirmation from Ilya that it covers all 
cases. I do not know if it works if striping or EC is configured on the pool.


Best,

Peter






Re: [PATCH 2/2] block/rbd: workaround for ceph issue #53784

2022-01-10 Thread Stefano Garzarella

On Mon, Jan 10, 2022 at 12:41:54PM +0100, Peter Lieven wrote:

librbd had a bug until early 2022 that affected all versions of ceph that
supported fast-diff. This bug results in reporting of incorrect offsets
if the offset parameter to rbd_diff_iterate2 is not object aligned.
Work around this bug by rounding down the offset to object boundaries.

Fixes: https://tracker.ceph.com/issues/53784
Cc: qemu-sta...@nongnu.org
Signed-off-by: Peter Lieven 
---
block/rbd.c | 17 -
1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/block/rbd.c b/block/rbd.c
index 5e9dc91d81..260cb9f4b4 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -1333,6 +1333,7 @@ static int coroutine_fn 
qemu_rbd_co_block_status(BlockDriverState *bs,
int status, r;
RBDDiffIterateReq req = { .offs = offset };
uint64_t features, flags;
+int64_t head;

assert(offset + bytes <= s->image_size);

@@ -1360,6 +1361,19 @@ static int coroutine_fn 
qemu_rbd_co_block_status(BlockDriverState *bs,
return status;
}

+/*
+ * librbd had a bug until early 2022 that affected all versions of ceph 
that
+ * supported fast-diff. This bug results in reporting of incorrect offsets
+ * if the offset parameter to rbd_diff_iterate2 is not object aligned.
+ * Work around this bug by rounding down the offset to object boundaries.
+ *
+ * See: https://tracker.ceph.com/issues/53784
+ */
+head = offset & (s->object_size - 1);
+offset -= head;
+req.offs -= head;
+bytes += head;
+
r = rbd_diff_iterate2(s->image, NULL, offset, bytes, true, true,
  qemu_rbd_diff_iterate_cb, );
if (r < 0 && r != QEMU_RBD_EXIT_DIFF_ITERATE2) {
@@ -1379,7 +1393,8 @@ static int coroutine_fn 
qemu_rbd_co_block_status(BlockDriverState *bs,
status = BDRV_BLOCK_ZERO | BDRV_BLOCK_OFFSET_VALID;
}

-*pnum = req.bytes;
+assert(req.bytes > head);
+*pnum = req.bytes - head;
return status;
}


Thanks for the workaround!

I just tested this patch for the issue reported in this BZ [1] and the 
test now works correctly!


Tested-by: Stefano Garzarella 

[1] https://bugzilla.redhat.com/show_bug.cgi?id=2034791




[PATCH 2/2] block/rbd: workaround for ceph issue #53784

2022-01-10 Thread Peter Lieven
librbd had a bug until early 2022 that affected all versions of ceph that
supported fast-diff. This bug results in reporting of incorrect offsets
if the offset parameter to rbd_diff_iterate2 is not object aligned.
Work around this bug by rounding down the offset to object boundaries.

Fixes: https://tracker.ceph.com/issues/53784
Cc: qemu-sta...@nongnu.org
Signed-off-by: Peter Lieven 
---
 block/rbd.c | 17 -
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/block/rbd.c b/block/rbd.c
index 5e9dc91d81..260cb9f4b4 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -1333,6 +1333,7 @@ static int coroutine_fn 
qemu_rbd_co_block_status(BlockDriverState *bs,
 int status, r;
 RBDDiffIterateReq req = { .offs = offset };
 uint64_t features, flags;
+int64_t head;
 
 assert(offset + bytes <= s->image_size);
 
@@ -1360,6 +1361,19 @@ static int coroutine_fn 
qemu_rbd_co_block_status(BlockDriverState *bs,
 return status;
 }
 
+/*
+ * librbd had a bug until early 2022 that affected all versions of ceph 
that
+ * supported fast-diff. This bug results in reporting of incorrect offsets
+ * if the offset parameter to rbd_diff_iterate2 is not object aligned.
+ * Work around this bug by rounding down the offset to object boundaries.
+ *
+ * See: https://tracker.ceph.com/issues/53784
+ */
+head = offset & (s->object_size - 1);
+offset -= head;
+req.offs -= head;
+bytes += head;
+
 r = rbd_diff_iterate2(s->image, NULL, offset, bytes, true, true,
   qemu_rbd_diff_iterate_cb, );
 if (r < 0 && r != QEMU_RBD_EXIT_DIFF_ITERATE2) {
@@ -1379,7 +1393,8 @@ static int coroutine_fn 
qemu_rbd_co_block_status(BlockDriverState *bs,
 status = BDRV_BLOCK_ZERO | BDRV_BLOCK_OFFSET_VALID;
 }
 
-*pnum = req.bytes;
+assert(req.bytes > head);
+*pnum = req.bytes - head;
 return status;
 }
 
-- 
2.25.1