Alexey Lyashkov wrote:
Grant,

Lustre: Changing connection for data-OST0007-osc-0000010420740400 to
[EMAIL PROTECTED]/[EMAIL PROTECTED]
Lustre: Skipped 47 previous similar messages
This is telling you that the client has given up on your OST0007 and is trying to reconnect.
LustreError: 3251:0:(client.c:574:ptlrpc_check_status()) @@@ type ==
PTL_RPC_MSG_ERR, err == -16 [EMAIL PROTECTED] x117468083/t0
o8->[EMAIL PROTECTED]@tcp:28 lens 304/328 ref 1 fl
Rpc:R/0/0 rc 0/-16
The second request gets EBUSY (-16) because the first one has not finished.

Lustre: 9018:0:(ldlm_lib.c:709:target_handle_connect()) data-OST0007:
refuse reconnection from
[EMAIL PROTECTED]@tcp to
0xffff8101045e3000/2
The new connection is refused because the old one has not finished.
OST0007 is itself alive, and remembers the old connection / requests from the client and won't let the client reconnect until those requests are serviced.
So there's a difference, and servername2 has something in disk wait.

This is the root cause of the problem: once that 'something' finishes, the
client unfreezes.
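
One way to look for that 'something' on the OSS is to list processes stuck in uninterruptible sleep (state D). This is only a rough sketch: the helper name d_state is made up for illustration, and it assumes the procps `ps -eo pid,stat,wchan:32,comm` column layout, with the process state in the second column.

```shell
#!/bin/sh
# Hypothetical helper: keep only processes whose state starts with D
# (uninterruptible sleep, usually waiting on disk I/O).
# Reads `ps -eo pid,stat,wchan:32,comm` output on stdin and skips
# the header line.
d_state() {
    awk 'NR > 1 && $2 ~ /^D/'
}
```

Run it as `ps -eo pid,stat,wchan:32,comm | d_state` on the server. OST I/O threads that stay in this list for a long time would point at the disk or RAID underneath.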

The thread doing the IO is stuck, implying there's something wrong with your disk/raid for OST0007.
You can deactivate the OSC's for that OST using lctl on the clients:
lctl dl | grep OST0007-osc-
lctl device XX deactivate
This will prevent the clients from trying to use OST0007 until you resolve the problem. Doing the same on the MDT node will prevent new files from being created on OST0007.
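
The two lctl steps above could be combined roughly as follows. This is a sketch under assumptions, not a tested procedure: the function names are made up, it assumes `lctl dl` prints the device number in the first column and the device name in the fourth, and it uses the `lctl --device N deactivate` form from the Lustre manual. It only prints the commands, so you can check the list before running anything.

```shell
#!/bin/sh
# Hypothetical helper: read `lctl dl` output on stdin and print the
# device numbers of OSC devices whose name contains "<OST>-osc-".
# Assumes column 1 is the device number and column 4 the device name.
osc_devices() {
    awk -v pat="$1-osc-" 'index($4, pat) > 0 { print $1 }'
}

# Dry run: print one deactivate command per matching device.
# Nothing is executed here.
deactivate_cmds() {
    osc_devices "$1" | while read -r dev; do
        echo "lctl --device $dev deactivate"
    done
}
```

Use it as `lctl dl | deactivate_cmds OST0007`, and pipe the result to `sh` once the output looks right. The device name may be formatted differently on the MDT node, so check the `lctl dl` output there before relying on the same pattern.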
