On Tue, Mar 25, 2014 at 2:51 PM, Alex Elder <[email protected]> wrote:
> On 03/25/2014 07:34 AM, Ilya Dryomov wrote:
>>> On 03/25/2014 04:04 AM, Ilya Dryomov wrote:
>>>> On Tue, Mar 25, 2014 at 10:39 AM, Olivier Bonvalet <[email protected]> 
>>>> wrote:
>>>>> Hi,
>>>>>
>>>>> what can/should I do to help fix that problem ?
>>>>>
>>>>> for now, RBD kernel client hang on :
>>>>>         Assertion failure in rbd_img_obj_callback() at line 2131:
>>>>>            rbd_assert(which >= img_request->next_completion);
>>>
>>> If you can build your own kernel as Ilya says I'd like to
>>> see the values of which and img_request->next_completion
>>> here.
>>
>> Looks like which was 1, which means that next_completion had to be 2 or
>> greater.  I miss solaris crash dumps ...
>>
>> On a different note, why are we asserting next_completion outside of
>> a spinlock which is supposed to protect next_completion?
>
> That's a very good point (which could be easily remedied by moving
> the assertion down a couple lines).  The image object request (#1)
> in this case will have been marked done at this point; it's possible
> that request #2 (or later) was concurrently getting handled by the
> for_each_obj_request_from() loop below in that same function, but
> may not have updated next_completion yet.
>
> So that *could* explain the tripped assertion.  The assertion
> should be moved in any case, it's a bug.
>
> That being said, it doesn't explain the other assertion:
>         rbd_assert(img_request != NULL);
> So there's at least one other thing going on.

Yeah, exactly my thoughts.

Thanks,

                Ilya
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to