On 03/25/2014 03:21 PM, Olivier Bonvalet wrote:
> On Tuesday, 25 March 2014 at 22:18 +0200, Ilya Dryomov wrote:
>> On Tue, Mar 25, 2014 at 9:03 PM, Alex Elder <[email protected]> wrote:
>>> On 03/25/2014 01:53 PM, Olivier Bonvalet wrote:
>>>> On Tuesday, 25 March 2014 at 12:43 -0500, Alex Elder wrote:
>>>>> Please try applying this, on top of the previous patch.
>>>>> If you can then reproduce the problem we'll have a bunch
>>>>> of new information about the particular request that's
>>>>> leading to the failure. That might tell us what more we
>>>>> can do to find the root cause. Thank you.
>>>>>
>>>>> -Alex
>>>>>
>>>>> PS I hope my mailer doesn't botch the long lines. It might.
>>>>>
>>>>
>>>> With this patch, execution continues: there is no kernel panic after
>>>> the debugging output. Is that intended?
>>>
>>>
>>> I guess it should panic. I'm glad you mentioned this.
>>
>> Just in case, if you haven't done it already: stick rbd_assert(0);
>> after the last printk in that if statement, so it looks like this:
>>
>> if (which != img_request->next_completion) {
>>         printk("%s: bad image object request information:\n", __func__);
>>         printk("obj_request %p\n", obj_request);
>>         printk(" ->object_name <%s>\n", obj_request->object_name);
>>         ...
>>
>>         printk("img_request %p\n", img_request);
>>         printk(" ->snap 0x%016llx\n", img_request->snap_id);
>>         ...
>>         printk(" ->result %d\n", img_request->result);
>>
>>         rbd_assert(0);
>> }
>>
>> Thanks,
>>
>> Ilya
>>
>
> Without the rbd_assert(0), I had this hang:
>
>
> Mar 25 21:17:58 murmillia kernel: [ 2205.255933] rbd_img_obj_callback: bad image object request information:
> Mar 25 21:17:58 murmillia kernel: [ 2205.255938] obj_request ffff88025a2b3c48
> Mar 25 21:17:58 murmillia kernel: [ 2205.255940] ->object_name <rb.0.1536881.238e1f29.000000000439>
> Mar 25 21:17:58 murmillia kernel: [ 2205.255941] ->offset 0
> Mar 25 21:17:58 murmillia kernel: [ 2205.255943] ->length 28672
> Mar 25 21:17:58 murmillia kernel: [ 2205.255944] ->type 0x1
BIO request
> Mar 25 21:17:58 murmillia kernel: [ 2205.255945] ->flags 0x3
IMG_DATA, KNOWN
> Mar 25 21:17:58 murmillia kernel: [ 2205.255946] ->which 1
Second object in the request
> Mar 25 21:17:58 murmillia kernel: [ 2205.255948] ->xferred 28672
> Mar 25 21:17:58 murmillia kernel: [ 2205.255949] ->result 0
> Mar 25 21:17:58 murmillia kernel: [ 2205.255950] img_request ffff8802536c4a60
> Mar 25 21:17:58 murmillia kernel: [ 2205.255952] ->snap 0xffff880257f85ec0
> Mar 25 21:17:58 murmillia kernel: [ 2205.255953] ->offset 4534026240
> Mar 25 21:17:58 murmillia kernel: [ 2205.255954] ->length 45056
> Mar 25 21:17:58 murmillia kernel: [ 2205.255955] ->flags 0x1
> Mar 25 21:17:58 murmillia kernel: [ 2205.255957] ->obj_request_count 1
!!! obj_request_count says there is only one object request, yet this
one has which == 1, i.e. it is the second (?)
So obj_request_count might be getting computed incorrectly.
-Alex
> Mar 25 21:17:58 murmillia kernel: [ 2205.255958] ->next_completion 2
> Mar 25 21:17:58 murmillia kernel: [ 2205.255959] ->xferred 45056
> Mar 25 21:17:58 murmillia kernel: [ 2205.255960] ->result 0
> Mar 25 21:17:58 murmillia kernel: [ 2205.255962]
> Mar 25 21:17:58 murmillia kernel: [ 2205.255962] Assertion failure in rbd_img_obj_callback() at line 2162:
> Mar 25 21:17:58 murmillia kernel: [ 2205.255962]
> Mar 25 21:17:58 murmillia kernel: [ 2205.255962] rbd_assert(which < img_request->obj_request_count);
> Mar 25 21:17:58 murmillia kernel: [ 2205.255962]
> Mar 25 21:17:58 murmillia kernel: [ 2205.256141] ------------[ cut here ]------------
> Mar 25 21:17:58 murmillia kernel: [ 2205.256178] kernel BUG at drivers/block/rbd.c:2162!
>
>
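On the obj_request_count question: the logged offsets suggest the image
request really does cover two objects. Below is a small userspace sketch
of that arithmetic. It assumes 4 MiB objects (order 22); the log does not
state the object size, so treat that as my assumption. The names
OBJ_ORDER, object_index(), objects_spanned(), bytes_in_first_object() and
which_in_range() are made up for illustration and are not rbd functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 MiB objects (order 22). The log does not show the order. */
#define OBJ_ORDER 22
#define OBJ_SIZE  ((uint64_t)1 << OBJ_ORDER)

/* Index of the object that contains the given image byte offset. */
static uint64_t object_index(uint64_t img_offset)
{
	return img_offset >> OBJ_ORDER;
}

/* Number of objects touched by the image extent [img_offset, img_offset + length). */
static uint64_t objects_spanned(uint64_t img_offset, uint64_t length)
{
	if (length == 0)
		return 0;
	return object_index(img_offset + length - 1) - object_index(img_offset) + 1;
}

/* Bytes of the extent that fall inside the first object it touches. */
static uint64_t bytes_in_first_object(uint64_t img_offset, uint64_t length)
{
	uint64_t room = OBJ_SIZE - (img_offset & (OBJ_SIZE - 1));

	return length < room ? length : room;
}

/* Same condition as the failing assertion: which < obj_request_count. */
static bool which_in_range(uint32_t which, uint32_t obj_request_count)
{
	return which < obj_request_count;
}
```

With the logged values (offset 4534026240, length 45056) this gives two
objects: 16384 bytes at the tail of object 1080 (0x438) and 28672 bytes at
offset 0 of object 1081 (0x439). That lines up with the object name suffix
...000000000439 and with the object request's offset 0 / length 28672. If
the 4 MiB assumption holds, obj_request_count should have been 2, and
which == 1 would have passed the assertion; with the logged count of 1 it
cannot.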