On Apr 18, 2007, at 10:40 AM, Scott Atchley wrote:

Hi all,

We are testing a small cluster with 1 MDS, 2 OSS, and 5 clients. When all clients are writing to independent directories, all is well. When one client tries to list the contents of a directory that another client is creating/deleting files in, Lustre will hang and /var/log/messages shows a lot of "printk suppressed" messages.

Is this normal behavior or can we do something to minimize it (besides not having two clients work in the same directory)?

Scott

This may or may not be related: four of the clients can list a directory, but the fifth client cannot. On the fifth client, dmesg shows:

Lustre: MDC_nas-0-0.local_mds1_MNT_client-0000010037e37800: Connection restored to service mds1 using nid [EMAIL PROTECTED]
Lustre: Skipped 1 previous similar message
LustreError: 9634:0:(mdc_request.c:684:mdc_close()) Unexpected: can't find mdc_open_data, but the close succeeded. Please tell CFS.
LustreError: 23673:0:(client.c:576:ptlrpc_check_status()) @@@ type == PTL_RPC_MSG_ERR, err == -107 [EMAIL PROTECTED] x201696/t0 o400->[EMAIL PROTECTED]:12 lens 64/64 ref 1 fl Rpc:RN/0/0 rc 0/-107
LustreError: MDC_nas-0-0.local_mds1_MNT_client-0000010037e37800: Connection to service mds1 via nid [EMAIL PROTECTED] was lost; in progress operations using this service will wait for recovery to complete.
LustreError: This client was evicted by mds1; in progress operations using this service will fail.
LustreError: 9645:0:(client.c:548:ptlrpc_check_reply()) @@@ ABORTED: [EMAIL PROTECTED] x201693/t0 o37->[EMAIL PROTECTED]:12 lens 240/240 ref 1 fl Rpc:E/0/0 rc 0/0
LustreError: 9645:0:(dir.c:329:ll_readdir()) error reading dir 480862/408283751 page 0: rc -5
LustreError: 9645:0:(dir.c:329:ll_readdir()) Skipped 89 previous similar messages
LustreError: 9645:0:(client.c:511:ptlrpc_import_delay_req()) @@@ IMP_INVALID [EMAIL PROTECTED] x201700/t0 o37->[EMAIL PROTECTED] m_UUID:12 lens 240/240 ref 1 fl Rpc:/0/0 rc 0/0
LustreError: 9645:0:(client.c:511:ptlrpc_import_delay_req()) Skipped 88 previous similar messages
Lustre: MDC_nas-0-0.local_mds1_MNT_client-0000010037e37800: Connection restored to service mds1 using nid [EMAIL PROTECTED]
LustreError: 9645:0:(mdc_request.c:684:mdc_close()) Unexpected: can't find mdc_open_data, but the close succeeded. Please tell CFS.
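
The negative values in the log above are Linux errno codes: -107 is ENOTCONN and -5 is EIO. As a rough triage aid, here is a minimal Python sketch that tallies them from dmesg-style text. The function name, the regex, and the two-entry errno table are illustrative assumptions, not part of any Lustre tooling.

```python
import re

# Linux errno names for the two codes seen in the log above. Hardcoded
# (an assumption of this sketch) so it does not depend on the local platform.
LINUX_ERRNO = {5: "EIO", 107: "ENOTCONN"}

# Matches "rc -5", "err == -107", and the second field of "rc 0/-107".
RC_PATTERN = re.compile(r"(?:\brc |err == |rc \d+/)-(\d+)")


def summarize_return_codes(log_text):
    """Count negative return codes found in Lustre-style log text."""
    counts = {}
    for match in RC_PATTERN.finditer(log_text):
        code = int(match.group(1))
        key = (code, LINUX_ERRNO.get(code, "unknown"))
        counts[key] = counts.get(key, 0) + 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: dmesg | python this_script.py
    for (code, name), n in sorted(summarize_return_codes(sys.stdin.read()).items()):
        print(f"-{code} ({name}): {n} occurrence(s)")
```

Note that "Skipped N previous similar messages" lines are not expanded, so the counts understate how often an error actually occurred.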

I like the "Please tell CFS." note. :-)

Any suggestions?

Scott
