On 06/11/2013 02:17 PM, Sage Weil wrote:
> On Mon, 10 Jun 2013, Sage Weil wrote:
>> Hi Yan-
>>
>> On Tue, 4 Jun 2013, Yan, Zheng wrote:
>>
>>> From: "Yan, Zheng" <[email protected]>
>>>
>>> Signed-off-by: Yan, Zheng <[email protected]>
>>> ---
>>> fs/ceph/caps.c | 6 ++++++
>>> 1 file changed, 6 insertions(+)
>>>
>>> diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
>>> index 790f88b..458a66e 100644
>>> --- a/fs/ceph/caps.c
>>> +++ b/fs/ceph/caps.c
>>> @@ -1982,8 +1982,14 @@ static void kick_flushing_inode_caps(struct ceph_mds_client *mdsc,
>>> cap = ci->i_auth_cap;
>>> dout("kick_flushing_inode_caps %p flushing %s flush_seq %lld\n", inode,
>>> ceph_cap_string(ci->i_flushing_caps), ci->i_cap_flush_seq);
>>> +
>>> __ceph_flush_snaps(ci, &session, 1);
>>
>> This function does funny things to the local session pointer... did you
>> consider this when using it below? It can change to the auth cap mds if
>> it is different than the value passed in...
>
I didn't realize that. But even taking it into consideration, I still don't
understand how the list gets corrupted. Did you use snapshots? How many
active MDSs were there?
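
To make sure I understand the concern, here is a minimal userspace sketch of
it. The struct and function names are hypothetical stand-ins, not the real
fs/ceph code, and the "switch to the auth session" rule is only my reading of
the description above. It shows how a helper that takes a pointer to the
caller's session pointer can leave the caller holding a different session.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; not the kernel structures. */
struct session {
    int mds;                        /* which MDS this session talks to */
};

struct inode_info {
    struct session *auth_session;   /* session of the auth cap's MDS */
};

/*
 * Assumed behaviour, modelled on the comment above: if the caller's
 * session is not the auth session, the helper replaces *psession with
 * the auth session.
 */
void flush_snaps_sketch(struct inode_info *ci, struct session **psession)
{
    if (*psession != ci->auth_session)
        *psession = ci->auth_session;
}

/* Returns the mds number of the session the caller holds after the call. */
int session_after_flush(struct inode_info *ci, struct session *session)
{
    flush_snaps_sketch(ci, &session);
    /* From here on, 'session' may no longer be the one that was passed in. */
    return session->mds;
}
```

So any use of the local session pointer after the call has to assume it may
now point at the auth cap's session.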
> I wonder if we screwed something up here, but I just got a crash inside
> remove_session_caps() that might be explained by a corrupt list. I don't
> think I've seen this before..
Was it BUG_ON(session->s_nr_caps > 0) or
BUG_ON(!list_empty(&session->s_cap_flushing))?
And why did the kclient receive a CEPH_SESSION_CLOSE message?
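
For reference, a small userspace sketch of the two checks I am asking about.
Everything here is a hypothetical model: the list helpers only imitate the
kernel's circular doubly linked list, and the order of the checks is my
assumption. It shows that moving a flushing item onto a session's list is
enough to make the "flushing list is empty" check fail for that session.

```c
#include <stddef.h>

/* Minimal circular doubly linked list, imitating the kernel's list_head. */
struct list_node {
    struct list_node *prev, *next;
};

void list_init(struct list_node *n)
{
    n->prev = n->next = n;
}

int list_is_empty(const struct list_node *head)
{
    return head->next == head;
}

/* Unlink n from whatever list it is on and leave it self-linked. */
void list_unlink(struct list_node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    list_init(n);
}

/* Move n to the tail of the list headed by 'head'. */
void list_move_to_tail(struct list_node *n, struct list_node *head)
{
    list_unlink(n);
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* Hypothetical model of the per-session state the two checks look at. */
struct session_sketch {
    int nr_caps;
    struct list_node cap_flushing;
};

/* Returns 0 if both checks pass, 1 if caps remain, 2 if flushing items remain. */
int failed_teardown_check(const struct session_sketch *s)
{
    if (s->nr_caps > 0)
        return 1;
    if (!list_is_empty(&s->cap_flushing))
        return 2;
    return 0;
}
```

If an item ends up on the flushing list of a session other than the one that
is later cleaned up for it, the second check is the one I would expect to
trip.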
Regards
Yan, Zheng
>
> 0xffff880214aabf20 753 2 1 3 R 0xffff880214aac3a8 *kworker/3:2
> ffff880224a33ae8 0000000000000018 ffffffffa0814d63 ffff880224f85800
> ffff88020b277790 ffff880224f85800 ffff88020c04e800 ffff880224a33c08
> ffffffffa081a1cf ffffffffffffffff ffff880224a33fd8 ffffffffffffffff
> Call Trace:
> [<ffffffffa0814d63>] ? remove_session_caps+0x33/0x140 [ceph]
> [<ffffffffa081a1cf>] ? dispatch+0x7ff/0x1740 [ceph]
> [<ffffffff81510b66>] ? kernel_recvmsg+0x46/0x60
> [<ffffffffa07c4e38>] ? ceph_tcp_recvmsg+0x48/0x60 [libceph]
> [<ffffffff810a317d>] ? trace_hardirqs_on+0xd/0x10
> [<ffffffffa07c81f8>] ? con_work+0x1948/0x2d50 [libceph]
> [<ffffffff81080c93>] ? idle_balance+0x133/0x180
> [<ffffffff81071c58>] ? finish_task_switch+0x48/0x110
> [<ffffffff81071c58>] ? finish_task_switch+0x48/0x110
> [<ffffffff8105f44f>] ? process_one_work+0x16f/0x540
> [<ffffffff8105f4ba>] ? process_one_work+0x1da/0x540
> [<ffffffff8105f44f>] ? process_one_work+0x16f/0x540
> [<ffffffff8106069c>] ? worker_thread+0x11c/0x370
> [<ffffffff81060580>] ? manage_workers.isra.20+0x2e0/0x2e0
> [<ffffffff8106735a>] ? kthread+0xea/0xf0
>
>
>
>>
>>> +
>>> if (ci->i_flushing_caps) {
>>> + spin_lock(&mdsc->cap_dirty_lock);
>>> + list_move_tail(&ci->i_flushing_item, &session->s_cap_flushing);
>>> + spin_unlock(&mdsc->cap_dirty_lock);
>>> +
>>> delayed = __send_cap(mdsc, cap, CEPH_CAP_OP_FLUSH,
>>> __ceph_caps_used(ci),
>>> __ceph_caps_wanted(ci),
>>> --
>>> 1.8.1.4
>>>
>>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to [email protected]
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>
>>