On Apr 18, 2007, at 10:40 AM, Scott Atchley wrote:
Hi all,
We are testing a small cluster with 1 MDS, 2 OSS, and 5 clients.
When all clients are writing to independent directories, all is well.
When one client tries to list the contents of a directory that
another client is creating/deleting files in, Lustre will hang and
/var/log/messages shows a lot of "printk suppressed" messages.
Is this normal behavior or can we do something to minimize it
(besides not having two clients work in the same directory)?
Scott
This may or may not be related: four of the clients can list a
directory, but the fifth client cannot. On the fifth client, dmesg
shows:
Lustre: MDC_nas-0-0.local_mds1_MNT_client-0000010037e37800:
Connection restored to service mds1 using nid [EMAIL PROTECTED]
Lustre: Skipped 1 previous similar message
LustreError: 9634:0:(mdc_request.c:684:mdc_close()) Unexpected: can't
find mdc_open_data, but the close succeeded. Please tell CFS.
LustreError: 23673:0:(client.c:576:ptlrpc_check_status()) @@@ type ==
PTL_RPC_MSG_ERR, err == -107 [EMAIL PROTECTED] x201696/t0
o400->[EMAIL PROTECTED]:12 lens 64/64 ref 1 fl Rpc:RN/0/0 rc 0/-107
LustreError: MDC_nas-0-0.local_mds1_MNT_client-0000010037e37800:
Connection to service mds1 via nid [EMAIL PROTECTED] was lost; in
progress operations using this service will wait for recovery to
complete.
LustreError: This client was evicted by mds1; in progress operations
using this service will fail.
LustreError: 9645:0:(client.c:548:ptlrpc_check_reply()) @@@ ABORTED:
[EMAIL PROTECTED] x201693/t0 o37->[EMAIL PROTECTED]:12 lens
240/240 ref 1 fl Rpc:E/0/0 rc 0/0
LustreError: 9645:0:(dir.c:329:ll_readdir()) error reading dir
480862/408283751 page 0: rc -5
LustreError: 9645:0:(dir.c:329:ll_readdir()) Skipped 89 previous
similar messages
LustreError: 9645:0:(client.c:511:ptlrpc_import_delay_req()) @@@
IMP_INVALID [EMAIL PROTECTED] x201700/t0 o37->[EMAIL PROTECTED]
m_UUID:12 lens 240/240 ref 1 fl Rpc:/0/0 rc 0/0
LustreError: 9645:0:(client.c:511:ptlrpc_import_delay_req()) Skipped
88 previous similar messages
Lustre: MDC_nas-0-0.local_mds1_MNT_client-0000010037e37800:
Connection restored to service mds1 using nid [EMAIL PROTECTED]
LustreError: 9645:0:(mdc_request.c:684:mdc_close()) Unexpected: can't
find mdc_open_data, but the close succeeded. Please tell CFS.
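For anyone reading the log above: the negative values (err == -107, rc -5) look like negated Linux errno codes, which is the usual convention in kernel code. A minimal sketch for decoding them, assuming the standard Linux errno numbering; the table and the decode_rc name are illustrative only and not part of Lustre:

```python
# Linux errno numbers for the two codes that appear in the dmesg output.
# Assumption: standard Linux numbering; other platforms may differ.
LINUX_ERRNO = {
    5: ("EIO", "Input/output error"),
    107: ("ENOTCONN", "Transport endpoint is not connected"),
}


def decode_rc(rc):
    """Return (name, description) for a negative kernel-style return code."""
    return LINUX_ERRNO.get(-rc, ("UNKNOWN", "not in this table"))


for rc in (-107, -5):
    name, text = decode_rc(rc)
    print(f"rc {rc}: {name} ({text})")
```

Read this way, -107 (ENOTCONN) would be consistent with the lost connection and eviction reported by mds1, and -5 (EIO) with the ll_readdir() failures that follow it.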
I like the "Please tell CFS." note. :-)
Any suggestions?
Scott