Re: [Lustre-discuss] [bug?] mdc_enter_request() problems

2011-08-10 Thread Johann Lombardi
On Tue, Aug 09, 2011 at 10:29:43AM -0600, Kevin Van Maren wrote:
> That code is unchanged in 1.8.6.

The two relevant patches for 1.8 are the following:
http://review.whamcloud.com/#change,457
http://review.whamcloud.com/#change,506

Both patches are included in 1.8.6-wc1 and waiting for landing approval on 
Oracle's side (see bugzilla 24508).

Cheers,
Johann
-- 
Johann Lombardi
Whamcloud, Inc.
www.whamcloud.com
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] [bug?] mdc_enter_request() problems

2011-08-09 Thread Kevin Van Maren
chas williams - CONTRACTOR wrote:
> On Mon, 08 Aug 2011 12:03:25 -0400
> chas williams - CONTRACTOR c...@cmf.nrl.navy.mil wrote:
>
> > later mdc_exit_request() finds this mcw by iterating the list.
> > seeing as mcw was allocated on the stack, i dont think you can do this.
> > mcw might have been reused by the time mdc_exit_request() gets around
> > to removing it.
>
> nevermind. i see this has been fixed in later releases apparently (i
> was looking at 1.8.5). if l_wait_event() returns early (like
> from being interrupted) mdc_enter_request() does the cleanup itself now.

That code is unchanged in 1.8.6.

Kevin



Re: [Lustre-discuss] [bug?] mdc_enter_request() problems

2011-08-09 Thread chas williams - CONTRACTOR
On Tue, 09 Aug 2011 10:29:43 -0600
Kevin Van Maren kevin.van.ma...@oracle.com wrote:

> chas williams - CONTRACTOR wrote:
> > nevermind. i see this has been fixed in later releases apparently (i
> > was looking at 1.8.5). if l_wait_event() returns early (like
> > from being interrupted) mdc_enter_request() does the cleanup itself now.
>
> That code is unchanged in 1.8.6.

it appears to have been fixed in the 2.x releases.  i think this is the
relevant change http://review.whamcloud.com/#change,506


Re: [Lustre-discuss] [bug?] mdc_enter_request() problems

2011-08-08 Thread Andreas Dilger
On 2011-08-08, at 10:03 AM, chas williams - CONTRACTOR wrote:
> we have seen a few crashes that look like:
>
> [250696.381575] RIP: 0010:[a0a1f9e4]  [a0a1f9e4] mdc_exit_request+0x74/0xb0 [mdc]
> ...
> [250696.381575] Call Trace:
> [250696.381575]  [a0a25042] mdc_intent_getattr_async_interpret+0x82/0x500 [mdc]
> [250696.381575]  [a089efd0] ptlrpc_check_set+0x200/0x1690 [ptlrpc]
> [250696.381575]  [a08d3140] ptlrpcd_check+0x110/0x250 [ptlrpc]
>
> and i sort of gather the problem arises from mdc_enter_request().
> it allocates an mdc_cache_waiter on the stack and inserts it into the
> wait list and then returns.
>
>   int mdc_enter_request(struct client_obd *cli)
>   ...
>   struct mdc_cache_waiter mcw;
>   ...
>   list_add_tail(&mcw.mcw_entry, &cli->cl_cache_waiters);
>   init_waitqueue_head(&mcw.mcw_waitq);
>
> later mdc_exit_request() finds this mcw by iterating the list.
> seeing as mcw was allocated on the stack, i dont think you can do this.
> mcw might have been reused by the time mdc_exit_request() gets around
> to removing it.

What version of Lustre is this?

Cheers, Andreas
--
Andreas Dilger 
Principal Engineer
Whamcloud, Inc.





Re: [Lustre-discuss] [bug?] mdc_enter_request() problems

2011-08-08 Thread chas williams - CONTRACTOR
On Mon, 08 Aug 2011 12:03:25 -0400
chas williams - CONTRACTOR c...@cmf.nrl.navy.mil wrote:

> later mdc_exit_request() finds this mcw by iterating the list.
> seeing as mcw was allocated on the stack, i dont think you can do this.
> mcw might have been reused by the time mdc_exit_request() gets around
> to removing it.

nevermind. i see this has been fixed in later releases apparently (i
was looking at 1.8.5). if l_wait_event() returns early (like
from being interrupted) mdc_enter_request() does the cleanup itself now.
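
For illustration only, here is a minimal user-space sketch of the pattern being discussed: a waiter that lives on the caller's stack is safe on a shared list only if it is unlinked before the function returns, including when the wait ends early. This is not the Lustre source. The names client, waiter, enter_request, wait_fn, wait_interrupted and wait_woken are hypothetical stand-ins, the list helpers are simplified copies of the kernel's list_head idiom, and all locking is omitted.

```c
#include <errno.h>
#include <stdbool.h>

/* Simplified doubly linked list in the style of the kernel's list_head. */
struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h->prev = h; }
static bool list_empty(const struct list_head *h) { return h->next == h; }
static void list_add_tail(struct list_head *n, struct list_head *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}
static void list_del_init(struct list_head *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    list_init(n);               /* node now points at itself: "unlinked" */
}

/* Hypothetical stand-ins for client_obd and mdc_cache_waiter. */
struct client { struct list_head waiters; int in_flight, max_in_flight; };
struct waiter { struct list_head entry; };

/* Models the wait: returns 0 once woken, nonzero if the wait ended early. */
typedef int (*wait_fn)(struct client *, struct waiter *);

static int enter_request(struct client *cli, wait_fn wait)
{
    struct waiter w;            /* valid only while this frame is alive */
    int rc;

    if (cli->in_flight < cli->max_in_flight) {
        cli->in_flight++;       /* free slot, no need to wait */
        return 0;
    }
    list_add_tail(&w.entry, &cli->waiters);
    rc = wait(cli, &w);
    /* Early return and nobody unlinked us: do it before the frame dies. */
    if (rc != 0 && !list_empty(&w.entry))
        list_del_init(&w.entry);
    return rc;
}

/* Wait that is interrupted before any request completes. */
static int wait_interrupted(struct client *cli, struct waiter *w)
{
    (void)cli;
    (void)w;
    return -EINTR;
}

/* Wait that is woken by the exit path, which unlinks the waiter and
 * hands over the slot it just freed (in_flight is unchanged overall). */
static int wait_woken(struct client *cli, struct waiter *w)
{
    (void)cli;
    list_del_init(&w->entry);
    return 0;
}
```

In this sketch, calling enter_request() with wait_interrupted on a full client returns -EINTR and leaves the waiters list empty, so no later list walk can reach the dead stack frame. Without the list_del_init() on the early-return path, the list would still point into that frame, which is the hazard described in the original report.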