Re: [Lustre-discuss] [bug?] mdc_enter_request() problems

Andreas Dilger Mon, 08 Aug 2011 11:07:27 -0700

On 2011-08-08, at 10:03 AM, chas williams - CONTRACTOR wrote:
> we have seen a few crashes that look like:
> 
> [250696.381575] RIP: 0010:[<ffffffffa0a1f9e4>]  [<ffffffffa0a1f9e4>] mdc_exit_request+0x74/0xb0 [mdc]
> ...
> [250696.381575] Call Trace:
> [250696.381575]  [<ffffffffa0a25042>] mdc_intent_getattr_async_interpret+0x82/0x500 [mdc]
> [250696.381575]  [<ffffffffa089efd0>] ptlrpc_check_set+0x200/0x1690 [ptlrpc]
> [250696.381575]  [<ffffffffa08d3140>] ptlrpcd_check+0x110/0x250 [ptlrpc]
> 
> and i sort of gather the problem arises from mdc_enter_request().
> it allocates an mdc_cache_waiter on the stack and inserts it into the
> wait list and then returns.
> 
>       int mdc_enter_request(struct client_obd *cli)
>       ...
>               struct mdc_cache_waiter mcw;
>       ...
>                       list_add_tail(&mcw.mcw_entry, &cli->cl_cache_waiters);
>                       init_waitqueue_head(&mcw.mcw_waitq);
> 
> later mdc_exit_request() finds this mcw by iterating the list.
> seeing as mcw was allocated on the stack, i dont think you can do this.
> mcw might have been reused by the time mdc_exit_request() gets around
> to removing it.
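
For reference, a wait entry on the stack is only safe if the function that links it does not return until the entry has been unlinked again. Below is a minimal user-space sketch of that invariant using pthreads. It is not the Lustre code: every name in it (struct waiter, struct client, enter_request, exit_request, run_demo) is hypothetical, and it assumes the slot is handed directly to the oldest waiter.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names come from Lustre. */
struct waiter {
    struct waiter *next, *prev;
    int linked;                   /* 1 while the entry is on the wait list */
    pthread_cond_t cond;
};

struct client {
    pthread_mutex_t lock;
    struct waiter head;           /* sentinel of a circular wait list */
    int in_flight, max_in_flight;
};

static void client_init(struct client *c, int max)
{
    pthread_mutex_init(&c->lock, NULL);
    c->head.next = c->head.prev = &c->head;
    c->in_flight = 0;
    c->max_in_flight = max;
}

/* Take a request slot, blocking while none is free. */
static void enter_request(struct client *c)
{
    struct waiter w;              /* lives in this stack frame only */

    pthread_mutex_lock(&c->lock);
    if (c->in_flight < c->max_in_flight) {
        c->in_flight++;
        pthread_mutex_unlock(&c->lock);
        return;
    }
    /* Initialise the entry completely before publishing it on the list. */
    pthread_cond_init(&w.cond, NULL);
    w.linked = 1;
    w.prev = c->head.prev;
    w.next = &c->head;
    c->head.prev->next = &w;
    c->head.prev = &w;
    /* Key invariant: do not leave this frame while w is still linked. */
    while (w.linked)
        pthread_cond_wait(&w.cond, &c->lock);
    pthread_mutex_unlock(&c->lock);
    pthread_cond_destroy(&w.cond);
}

/* Release a slot, handing it to the oldest waiter if there is one. */
static void exit_request(struct client *c)
{
    pthread_mutex_lock(&c->lock);
    if (c->head.next != &c->head) {
        struct waiter *w = c->head.next;
        w->prev->next = w->next;  /* unlink while the owner is still blocked */
        w->next->prev = w->prev;
        w->linked = 0;            /* slot passes on; in_flight is unchanged */
        pthread_cond_signal(&w->cond);
    } else {
        c->in_flight--;
    }
    pthread_mutex_unlock(&c->lock);
}

static void *second_caller(void *arg)
{
    enter_request(arg);           /* blocks until the slot is handed over */
    return NULL;
}

/* Returns the final in_flight count, or -1 if the thread cannot start. */
static int run_demo(void)
{
    struct client c;
    pthread_t t;
    int queued = 0;

    client_init(&c, 1);
    enter_request(&c);            /* takes the only slot */
    if (pthread_create(&t, NULL, second_caller, &c) != 0)
        return -1;
    while (!queued) {             /* wait until the second caller is queued */
        pthread_mutex_lock(&c.lock);
        queued = (c.head.next != &c.head);
        pthread_mutex_unlock(&c.lock);
        if (!queued)
            sched_yield();
    }
    exit_request(&c);             /* unlinks the waiter and wakes it */
    pthread_join(t, NULL);
    assert(c.head.next == &c.head);
    assert(c.in_flight == 1);
    exit_request(&c);             /* second caller's slot is released */
    return c.in_flight;
}
```

Calling run_demo() from a main() built with -pthread exercises one hand-over. If the wait in enter_request could end early (a timeout or a signal, say), that path would also have to unlink the entry under the lock before returning, otherwise the list would be left pointing into a dead stack frame.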


What version of Lustre is this?

Cheers, Andreas
--
Andreas Dilger 
Principal Engineer
Whamcloud, Inc.



