On Mon, Apr 13, 2015 at 10:18 PM, Shawn Edwards <[email protected]> wrote:
> Here's a vmcore, along with log files from Xen's crash dump utility.
>
> https://drive.google.com/file/d/0Bz8b7ZiWX00AeHRhMjNvdVNLdDQ/view?usp=sharing
>
> Let me know if we can help more.
>
> On Fri, Apr 10, 2015 at 1:04 PM Ilya Dryomov <[email protected]> wrote:
>>
>> On Fri, Apr 10, 2015 at 8:03 PM, Shawn Edwards <[email protected]> wrote:
>> > I took the rbd and ceph drivers out of the patched kernel above and merged
>> > them into Xen's kernel.  Works as well as the old one; still crashes.  But
>> > now I get logs.  From the Xen logs:
>> >
>> > [   1128.217561]    ERR:
>> > Assertion failure in rbd_img_obj_callback() at line 2363:
>> >
>> > rbd_assert(more ^ (which == img_request->obj_request_count));
>>
>> Ah, that's a long-standing bug which we know wasn't properly fixed -
>> a tight race in the rbd completion callback.  It looks like it doesn't take
>> long for you to reproduce it.  Can you try grabbing a vmcore so that
>> it can be inspected with the crash utility?

On closer inspection, that looks like a simple error handling bug.  The
out-of-memory splat before the assert sets ->result to -ENOMEM, and the logic
in rbd_img_obj_callback() just fails to handle it.  I'll fix it later this week.
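
To illustrate how an error result could trip an assert of that shape, here is
a toy model in C.  It is a sketch, not the kernel code: the struct, the
toy_* function names, and the assumption that a failed object request (e.g.
-ENOMEM) reports 0 transferred bytes are all hypothetical.  The only thing
taken from the log is the invariant itself, more ^ (which == obj_request_count).

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: names and layout are hypothetical, not the kernel's. */
struct toy_img_request {
    size_t obj_request_count;  /* object requests in this image request */
    size_t bytes_left;         /* bytes of the block request not yet completed */
};

/* Complete nr_bytes of the request; return true if more bytes remain. */
static bool toy_end_request(struct toy_img_request *img, size_t nr_bytes)
{
    img->bytes_left -= (nr_bytes < img->bytes_left) ? nr_bytes : img->bytes_left;
    return img->bytes_left > 0;
}

/*
 * Walk all object requests in order and report whether the invariant
 *     more ^ (which == obj_request_count)
 * holds afterwards.  xferred[i] is the byte count reported by object i.
 * Assumption: a failed object (e.g. -ENOMEM) reports 0 bytes.
 */
static bool toy_invariant_holds(size_t count, size_t obj_len,
                                const size_t *xferred)
{
    struct toy_img_request img = { count, count * obj_len };
    bool more = true;
    size_t which = 0;

    while (which < count) {
        more = toy_end_request(&img, xferred[which]);
        which++;
    }
    /* After the last object, "more" is expected to be false. */
    return more ^ (which == img.obj_request_count);
}
```

In this model, two 4096-byte objects that both report 4096 bytes satisfy the
invariant.  If the second one fails and reports 0 bytes, "more" is still true
after the last object and the invariant is violated.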

Thanks,

                Ilya
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
