On 03/25/2014 07:34 AM, Ilya Dryomov wrote:
>> On 03/25/2014 04:04 AM, Ilya Dryomov wrote:
>>> On Tue, Mar 25, 2014 at 10:39 AM, Olivier Bonvalet <[email protected]>
>>> wrote:
>>>> Hi,
>>>>
>>>> what can/should I do to help fix that problem ?
>>>>
>>>> for now, the RBD kernel client hangs on:
>>>> Assertion failure in rbd_img_obj_callback() at line 2131:
>>>> rbd_assert(which >= img_request->next_completion);
>>
>> If you can build your own kernel as Ilya says I'd like to
>> see the values of which and img_request->next_completion
>> here.
>
> Looks like which was 1, which means that next_completion had to be 2 or
> greater. I miss Solaris crash dumps ...
>
> On a different note, why are we asserting next_completion outside of
> a spinlock which is supposed to protect next_completion?
That's a very good point (which could be easily remedied by moving
the assertion down a couple lines). The image object request (#1)
in this case will have been marked done at this point; it's possible
that request #2 (or later) was concurrently getting handled by the
for_each_obj_request_from() loop below in that same function, but
may not have updated next_completion yet.
So that *could* explain the tripped assertion. The assertion
should be moved in any case; checking next_completion outside the
lock is a bug.
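
To illustrate what moving the assertion under the lock means, here is a
rough userspace sketch, not the actual rbd.c code. It assumes a pthread
mutex standing in for the kernel spinlock and plain assert() standing in
for rbd_assert(). The struct, its fields other than next_completion, and
both function names are made up for illustration. Unlike the real driver,
the sketch also marks the request done inside the lock, to keep the model
small.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define MODEL_MAX_OBJ_REQUESTS 8

/* Hypothetical stand-in for the image request; not the kernel struct. */
struct model_img_request {
	pthread_mutex_t completion_lock;   /* protects next_completion and done[] */
	uint32_t next_completion;          /* lowest index not yet completed */
	uint32_t obj_request_count;        /* fixed after init */
	unsigned char done[MODEL_MAX_OBJ_REQUESTS];
};

static void model_img_request_init(struct model_img_request *img,
				   uint32_t count)
{
	uint32_t i;

	assert(count <= MODEL_MAX_OBJ_REQUESTS);
	pthread_mutex_init(&img->completion_lock, NULL);
	img->next_completion = 0;
	img->obj_request_count = count;
	for (i = 0; i < MODEL_MAX_OBJ_REQUESTS; i++)
		img->done[i] = 0;
}

/*
 * Handle completion of object request number `which`.
 * Returns the value of next_completion after this completion is handled.
 */
static uint32_t model_obj_callback(struct model_img_request *img,
				   uint32_t which)
{
	uint32_t next;

	/* obj_request_count never changes after init, so no lock is needed. */
	assert(which < img->obj_request_count);

	pthread_mutex_lock(&img->completion_lock);

	/* The check now reads next_completion while holding its lock. */
	assert(which >= img->next_completion);

	img->done[which] = 1;
	if (which == img->next_completion) {
		/* Advance past every consecutive request that is already done. */
		while (img->next_completion < img->obj_request_count &&
		       img->done[img->next_completion])
			img->next_completion++;
	}
	next = img->next_completion;

	pthread_mutex_unlock(&img->completion_lock);
	return next;
}
```

With three requests completing in the order 1, 0, 2, the first call
leaves next_completion at 0, the second advances it to 2, and the third
advances it to 3. The comparison can no longer observe a value that
another thread is in the middle of updating.
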
That being said, it doesn't explain the other assertion:
rbd_assert(img_request != NULL);
So there's at least one other thing going on.
-Alex
> Thanks,
>
> Ilya
>