On Oct 30, 2010, at 12:26 AM, Sage Weil <[email protected]> wrote:
> On Fri, 29 Oct 2010, Leander Yu wrote:
>> Hi Sage,
>>
>> On Fri, Oct 29, 2010 at 1:08 PM, Sage Weil <[email protected]> wrote:
>>> On Fri, 29 Oct 2010, Henry C Chang wrote:
>>>> Hi,
>>>>
>>>> getattr on mds hanged again.
>>>>
>>>> I have already reverted d91f2438d881514e4a923fd786dbd94b764a9440.
>>>> Although the probability is significant lowered down, it still has the
>>>> chance to hang on getattr.
>>>>
>>>> Attached are the logs of mds and the hanging client. :(
>>>>
>>>> I'm using ceph-client-standalone master-backport branch on 2.6.32 kernel.
>>>
>>> It looks like ceph_check_caps is hung somehow:
>>>
>>> ceph:ceph: handle_caps from mds0
>>> ceph:ceph: mds0 seq 99 cap seq 28
>>> ceph:ceph: op revoke ino 10000000bd1.fffffffffffffffe inode
>>> ffff8800a6251d88
>>> ceph:ceph: handle_cap_grant inode ffff8800a6251d88 cap ffff8800a635b780
>>> mds0 seq 28 pAsLsXsFr
>>> ceph:ceph: size 4294967296 max_size 8594128896, i_size 4294967296
>>> ceph:ceph: try_nonblocking_invalidate ffff8800a6251d88 success
>>> ceph:ceph: __ceph_caps_issued ffff8800a6251d88 cap ffff8800a635b780
>>> issued pAsLsXsFscr
>>> ceph:ceph: __ceph_caps_issued ffff8800a6251d88 cap ffff8800a635b780
>>> issued pAsLsXsFscr
>>> ceph:ceph: ffff8800a6251d88 mode 0100644 uid.gid 0.0
>>> ceph:ceph: my wanted = pAsxXsxFsxcrwb, used = pFcr, dirty -
>>> ceph:ceph: revocation: pAsLsXsFscr -> pAsLsXsFr (revoking Fsc)
>>> ceph:ceph: __ceph_caps_issued ffff8800a6251d88 cap ffff8800a635b780
>>> issued pAsLsXsFr
>>> ceph:ceph: check_caps ffff8800a6251d88 file_want pAsxXsxFsxcrwb used pFcr
>>> dirty - flushing - issued pAsLsXsFr revoking Fsc retain pAsxLsxXsxFsxcrwbl
>>> AUTHONLY NODELAY
>>> ceph:ceph: mds0 revoking Fsc
>>> ceph:ceph: mdsc put_session ffff8800b41c6000 3 -> 2
>>> ceph:ceph: mdsc con_put ffff8800b41c6000 (2)
>>> ceph:ceph: aio_read ffff8800a6251d88 10000000bd1.fffffffffffffffe
>>> dropping cap refs on Fcr = 512
>>> ceph:ceph: put_cap_refs ffff8800a6251d88 had Fcr last
>>> ceph:ceph: __ceph_caps_issued ffff8800a6251d88 cap ffff8800a635b780
>>> issued pAsLsXsFr
>>> ceph:ceph: check_caps ffff8800a6251d88 file_want pAsxXsxFsxcrwb used pFc
>>> dirty - flushing - issued pAsLsXsFr revoking Fsc retain pAsxLsxXsxFsxcrwbl
>>> ceph:ceph: check_caps trying to invalidate on ffff8800a6251d88
>>> ceph:ceph: try_nonblocking_invalidate ffff8800a6251d88 failed
>>> ceph:ceph: check_caps queuing invalidate
>>>
>>> --> this means queue_invalidate = 1, and check_caps will call
>>> ceph_queue_invalidate on exit, which will always print something...
>>>
>>> ceph:ceph: __ceph_caps_issued ffff8800a6251d88 cap ffff8800a635b780 issued
>>> pAsLsXsFr
>>> ceph:ceph: check_caps ffff8800a6251d88 file_want pAsxXsxFsxcrwb used pFc
>>> dirty - flushing - issued pAsLsXsFr revoking Fsc retain pAsxLsxXsxFsxcrwbl
>>> ceph:ceph: mds0 revoking Fsc
>>> ceph:ceph: __cap_delay_cancel ffff8800a6251d88
>>>
>>> ...but that never happens. Probably the CPU got blocked somewhere? Can
>>> you see what the system is doing at this point? sysrq-t, or check the
>>> process list for ceph-msgr and cat it's stack (/proc/$pid/stack)? The
>>> task should be blocked in ceph_check_caps() somewhere...
>>
>> it didn't go to ceph_queue_invalidate because delayed and is_delayed = 0.
>>
>> if (delayed && is_delayed)
>> force_requeue = 1; /* __send_cap delayed release; requeue
>> */
>> if (!delayed && !is_delayed)
>> __cap_delay_cancel(mdsc, ci);
>> else if (!is_delayed || force_requeue)
>> __cap_delay_requeue(mdsc, ci);
>>
>> spin_unlock(&inode->i_lock);
>>
>> if (queue_invalidate)
>> ceph_queue_invalidate(inode);
>>
>> so it goes to __cap_delay_cancel().
>>
>> Should I said, the correct behavior is to go to ceph_queue_invalidate(inode)?
>
> I don't think delayed && is_delayed are related. (Those deal with whether
> we come back and re-check caps a little latter.) The invalidate should be
> queued immediately: at this point queue_invalidate == 1, and the function
> only has one exit point, so one way or another ceph_queue_invalidate()
> should get called. Since it's not, we must be getting stuck somewhere...
>
> Some combination of sysrq and strategic printk's should narrow down where
> things go wrong.
>
> Thanks!
> sage
>From the kernel debug log we found it went to __cap_delay_cancel()
then didn't go to ceph_queue_invalidate(inode). add one more checking
condition as following workaround it
if (!delayed && !is_delayed && !queue_invalidate)
__cap_delay_cancel(mdsc, ci);
But you are right, no matter what, it should goes to
ceph_queue_invalidate() after return from __cap_delay_cancel().
We will add some more printk to see where it got stuck. The most
suspicious part is spin_lock(&mdsc->cap_delay_lock) inside
__cap_delay_cancel()
>
>
>>
>>
>>>
>>> (BTW, if you're building your own kernel, one thing that I've found
>>> helpful is enabling the CONFIG_PRINTK_TIME option in .config, and updating
>>> kernel/printk.c to also include current->pid in the line prefix. That
>>> helps sort out what tasks are doing what when. But if you're stuck on
>>> 2.6.32 for some reason that probably not the case!)
>>>
>>> Thanks!
>>> sage
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>> the body of a message to [email protected]
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to [email protected]
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>
>>
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html