[Lustre-discuss] New lustre message

2008-08-21 Thread Brock Palen
I don't know if this is a bad thing. I was stress testing our new
Lustre install and managed to get a client kicked out, with the
following message on the OST that kicked it out:

Lustre: 6584:0:(ldlm_lib.c:760:target_handle_connect()) nobackup- 
OST: refuse reconnection from 749b3c01-4ac0- 
[EMAIL PROTECTED]@tcp to 0x0102f7cdc000; still  
busy with 6 active RPCs


Was this just a result of hammering the filesystem really hard?  Both
OSSes became CPU bound, so I would not be surprised if it was just too
much.  Pointers to any other common causes of this message (I never saw
it with our old setup) would be great.

Thanks,
New install is working great, nice product.


Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
[EMAIL PROTECTED]
(734)936-1985



___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] New lustre message

2008-08-21 Thread Brian J. Murrell
On Thu, 2008-08-21 at 22:23 -0400, Brock Palen wrote:
> I don't know if this is a bad thing. I was stress testing our new
> Lustre install and managed to get a client kicked out, with the
> following message on the OST that kicked it out:

To be clear, the message below is not a client being evicted but rather a
client trying to reconnect after it has been evicted.

> Lustre: 6584:0:(ldlm_lib.c:760:target_handle_connect()) nobackup-
> OST: refuse reconnection from 749b3c01-4ac0-
> [EMAIL PROTECTED]@tcp to 0x0102f7cdc000; still
> busy with 6 active RPCs

The OSS is refusing to allow the client to reconnect, however, because it
is still trying to finish the transactions the client had in progress
when it was evicted.
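
For anyone who wants to watch for this in their logs, here is a minimal sketch that pulls the target name and the active RPC count out of a message of this form. The sample string is the message above rejoined onto one line (an assumption about how it looks unwrapped); the function name, field names, and regex are illustrative only and are not part of Lustre.

```python
import re

# Sample message from this thread, rejoined from its wrapped lines (assumed form).
SAMPLE = (
    "Lustre: 6584:0:(ldlm_lib.c:760:target_handle_connect()) nobackup-OST: "
    "refuse reconnection from 749b3c01-4ac0-[EMAIL PROTECTED]@tcp "
    "to 0x0102f7cdc000; still busy with 6 active RPCs"
)

# Hypothetical pattern written against the one message seen here.
REFUSE_RE = re.compile(
    r"\((?P<source>[\w.]+):(?P<line>\d+):(?P<func>\w+)\(\)\)\s+"
    r"(?P<target>\S+): refuse reconnection from (?P<client>.+?) "
    r"to (?P<addr>0x[0-9a-fA-F]+); "
    r"still busy with (?P<active>\d+) active RPCs"
)


def parse_refused_reconnect(message):
    """Return a dict describing a 'refuse reconnection' message, or None."""
    match = REFUSE_RE.search(message)
    if match is None:
        return None
    return {
        "target": match.group("target"),
        "client": match.group("client"),
        # Hex value printed after "to"; its meaning is not stated in the thread.
        "addr": match.group("addr"),
        "active_rpcs": int(match.group("active")),
    }


if __name__ == "__main__":
    print(parse_refused_reconnect(SAMPLE))
```

A count that keeps showing up for the same client while the servers are saturated would fit the explanation above, where the server is still working through that client's earlier requests.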

> Was this just a result of hammering the filesystem really hard?

Could be, if the load was atypical and you have tuned your obd_timeout
for a more typical load.  Typically, until AT (adaptive timeouts) is in
full swing, you need to tune for your worst-case scenario.
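
As a starting point for checking what a node is currently using, here is a small sketch that reads the timeout value from procfs. It assumes the obd_timeout setting is exposed at /proc/sys/lustre/timeout, which is not stated in this thread and may differ by Lustre version; the function name is made up for illustration.

```python
from pathlib import Path

# Assumed location of the obd_timeout value; verify on your own Lustre version.
DEFAULT_TIMEOUT_PATH = "/proc/sys/lustre/timeout"


def read_obd_timeout(path=DEFAULT_TIMEOUT_PATH):
    """Return the timeout in seconds, or None if it cannot be read or parsed."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    value = read_obd_timeout()
    if value is None:
        print("timeout not readable (is Lustre loaded on this node?)")
    else:
        print(f"obd_timeout is {value} seconds")
```

Comparing this value across the clients and servers before and after a stress run is one way to see whether the setting matches the worst-case load rather than the typical one.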

b.





Re: [Lustre-discuss] New lustre message

2008-08-21 Thread Brock Palen
On Aug 21, 2008, at 11:17 PM, Brian J. Murrell wrote:
> On Thu, 2008-08-21 at 22:23 -0400, Brock Palen wrote:
>> I don't know if this is a bad thing. I was stress testing our new
>> Lustre install and managed to get a client kicked out, with the
>> following message on the OST that kicked it out:
>
> To be clear, the message below is not a client being evicted but rather a
> client trying to reconnect after it has been evicted.

Thanks, yes. This message appeared after the eviction notice.


>> Lustre: 6584:0:(ldlm_lib.c:760:target_handle_connect()) nobackup-
>> OST: refuse reconnection from 749b3c01-4ac0-
>> [EMAIL PROTECTED]@tcp to 0x0102f7cdc000; still
>> busy with 6 active RPCs

> The OSS is refusing to allow the client to reconnect, however, because it
> is still trying to finish the transactions the client had in progress
> when it was evicted.

Good to know that it's just for 'that' client.


>> Was this just a result of hammering the filesystem really hard?

> Could be, if the load was atypical and you have tuned your obd_timeout
> for a more typical load.  Typically, until AT (adaptive timeouts) is in
> full swing, you need to tune for your worst-case scenario.
>
> b.
