I'm using Lustre 1.8.2. Everything seemed to be working quite nicely until I enabled user quotas.
I am able to mount the file system on the client, but whenever I cd into it, ls it, or try anything else on it, it hangs. Then when I type "lfs df -h", the MDS server no longer appears in the list.

192.168.0.2 is the MDS/MGS server
192.168.0.3 is the OST server (/dev/hdc is the oss)
192.168.0.6 is the patchless client

Thanks for the help all
-Dusty

This shows up in /var/log/messages on the client (sorry, the time is wrong on this machine)
------------------------------------------------------------------------
Feb 13 18:15:49 mainframe2 kernel: Lustre: mgc192.168....@tcp: Reactivating import
Feb 13 18:15:49 mainframe2 kernel: Lustre: Client cluster-client has started
Feb 13 18:16:21 mainframe2 kernel: Lustre: 6386:0:(client.c:1434:ptlrpc_expire_one_request()) @@@ Request x1327583245569516 sent from cluster-MDT0000-mdc-ffff810009154c00 to NID 192.168....@tcp 7s ago has timed out (7s prior to deadline).
Feb 13 18:16:21 mainframe2 kernel: r...@ffff810036502c00 x1327583245569516/t0 o101->[email protected]@tcp:12/10 lens 544/1064 e 0 to 1 dl 1266106581 ref 1 fl Rpc:/0/0 rc 0/0
Feb 13 18:16:21 mainframe2 kernel: Lustre: cluster-MDT0000-mdc-ffff810009154c00: Connection to service cluster-MDT0000 via nid 192.168....@tcp was lost; in progress operations using this service will wait for recovery to complete.
Feb 13 18:16:27 mainframe2 kernel: LustreError: 6386:0:(mdc_locks.c:625:mdc_enqueue()) ldlm_cli_enqueue: -4
------------------------------------------------------------------------

This shows up in /var/log/messages on the MDS server
------------------------------------------------------------------------
Feb 13 23:43:07 MDS kernel: Lustre: MGS: haven't heard from client d9029b94-c905-383b-b046-df9c7d7be59d (at 0...@lo) in 248 seconds. I think it's dead, and I am evicting it.
Feb 13 23:53:08 MDS kernel: LustreError: 4121:0:(ldlm_lib.c:1848:target_send_reply_msg()) @@@ processing error (-43) r...@f6553600 x1327583245569136/t0 o36->8b82793a-0c0a-06d5-220b-4e2bc0e85...@net_0x20000c0a80006_uuid:0/0 lens 424/360 e 0 to 0 dl 1266126794 ref 1 fl Interpret:/0/0 rc 0/0
Feb 14 00:03:15 MDS kernel: LustreError: 2581:0:(ldlm_lib.c:1848:target_send_reply_msg()) @@@ processing error (-43) r...@f5fb4800 x1327583245569252/t0 o36->8b82793a-0c0a-06d5-220b-4e2bc0e85...@net_0x20000c0a80006_uuid:0/0 lens 424/360 e 0 to 0 dl 1266127401 ref 1 fl Interpret:/0/0 rc 0/0
Feb 14 00:04:49 MDS kernel: LustreError: 11-0: an error occurred while communicating with 192.168....@tcp. The ost_statfs operation failed with -107
Feb 14 00:04:49 MDS kernel: Lustre: cluster-OST0000-osc: Connection to service cluster-OST0000 via nid 192.168....@tcp was lost; in progress operations using this service will wait for recovery to complete.
Feb 14 00:04:49 MDS kernel: LustreError: 167-0: This client was evicted by cluster-OST0000; in progress operations using this service will fail.
Feb 14 00:04:49 MDS kernel: Lustre: 4352:0:(quota_master.c:1711:mds_quota_recovery()) Only 0/1 OSTs are active, abort quota recovery
Feb 14 00:04:49 MDS kernel: Lustre: cluster-OST0000-osc: Connection restored to service cluster-OST0000 using nid 192.168....@tcp.
Feb 14 00:04:49 MDS kernel: Lustre: MDS cluster-MDT0000: cluster-OST0000_UUID now active, resetting orphans
Feb 14 00:04:56 MDS kernel: Lustre: 4121:0:(ldlm_lib.c:540:target_handle_reconnect()) cluster-MDT0000: 8b82793a-0c0a-06d5-220b-4e2bc0e85cdf reconnecting
Feb 14 00:04:56 MDS kernel: Lustre: 4121:0:(ldlm_lib.c:837:target_handle_connect()) cluster-MDT0000: refuse reconnection from [email protected]@tcp to 0xc9b5f600; still busy with 1 active RPCs
Feb 14 00:05:10 MDS kernel: Lustre: 4121:0:(ldlm_lib.c:540:target_handle_reconnect()) cluster-MDT0000: 8b82793a-0c0a-06d5-220b-4e2bc0e85cdf reconnecting
Feb 14 00:05:10 MDS kernel: Lustre: 4121:0:(ldlm_lib.c:540:target_handle_reconnect()) Skipped 1 previous similar message
Feb 14 00:05:10 MDS kernel: Lustre: 4121:0:(ldlm_lib.c:837:target_handle_connect()) cluster-MDT0000: refuse reconnection from [email protected]@tcp to 0xc9b5f600; still busy with 1 active RPCs
Feb 14 00:05:10 MDS kernel: Lustre: 4121:0:(ldlm_lib.c:837:target_handle_connect()) Skipped 1 previous similar message
Feb 14 00:12:11 MDS kernel: Lustre: cluster-MDT0000: haven't heard from client 8b82793a-0c0a-06d5-220b-4e2bc0e85cdf (at 192.168....@tcp) in 258 seconds. I think it's dead, and I am evicting it.
Feb 14 00:15:37 MDS kernel: LustreError: 11-0: an error occurred while communicating with 192.168....@tcp. The ost_quotactl operation failed with -107
Feb 14 00:15:37 MDS kernel: Lustre: cluster-OST0000-osc: Connection to service cluster-OST0000 via nid 192.168....@tcp was lost; in progress operations using this service will wait for recovery to complete.
Feb 14 00:15:37 MDS kernel: LustreError: 4357:0:(quota_ctl.c:379:client_quota_ctl()) ptlrpc_queue_wait failed, rc: -107
Feb 14 00:15:37 MDS kernel: LustreError: 167-0: This client was evicted by cluster-OST0000; in progress operations using this service will fail.
Feb 14 00:15:37 MDS kernel: Lustre: 4358:0:(quota_master.c:1711:mds_quota_recovery()) Only 0/1 OSTs are active, abort quota recovery
Feb 14 00:15:37 MDS kernel: Lustre: cluster-OST0000-osc: Connection restored to service cluster-OST0000 using nid 192.168....@tcp.
Feb 14 00:15:37 MDS kernel: Lustre: MDS cluster-MDT0000: cluster-OST0000_UUID now active, resetting orphans
------------------------------------------------------------------------

--
The graduate with a Science degree asks, "Why does it work?"
The graduate with an Engineering degree asks, "How does it work?"
The graduate with an Accounting degree asks, "How much will it cost?"
The graduate with an Arts degree asks, "Do you want fries with that?"
_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss
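
For anyone reading the logs above: the negative numbers (-4, -43, -107) look like negated Linux errno values, which is how kernel code usually reports errors. Below is a minimal sketch for decoding them, not a Lustre tool. The table is hard-coded with the standard Linux values for just these three codes, and the names LINUX_ERRNO, CODE_RE and decode_line are made up for illustration. The regex only covers the four phrasings that occur in these particular log lines.

```python
import re

# Standard Linux errno values for the codes seen in the logs above.
# Hard-coded (assumption: Linux numbering) so the result does not depend
# on the platform this script runs on.
LINUX_ERRNO = {
    4: ("EINTR", "Interrupted system call"),
    43: ("EIDRM", "Identifier removed"),
    107: ("ENOTCONN", "Transport endpoint is not connected"),
}

# Hypothetical pattern covering only the phrasings in the logs above:
# "failed with -107", "rc: -107", "error (-43)", "ldlm_cli_enqueue: -4".
CODE_RE = re.compile(r"(?:failed with |rc: |error \(|ldlm_cli_enqueue: )-(\d+)")


def decode_line(line):
    """Return a list of (code, name, description) for each error code in a log line."""
    results = []
    for match in CODE_RE.finditer(line):
        number = int(match.group(1))
        name, text = LINUX_ERRNO.get(number, ("UNKNOWN", "not in table"))
        results.append((-number, name, text))
    return results


if __name__ == "__main__":
    samples = [
        "LustreError: 6386:0:(mdc_locks.c:625:mdc_enqueue()) ldlm_cli_enqueue: -4",
        "LustreError: 4121:0:(ldlm_lib.c:1848:target_send_reply_msg()) @@@ processing error (-43)",
        "The ost_statfs operation failed with -107",
        "LustreError: 4357:0:(quota_ctl.c:379:client_quota_ctl()) ptlrpc_queue_wait failed, rc: -107",
    ]
    for sample in samples:
        for code, name, text in decode_line(sample):
            print(f"{code:>5}  {name:<9} {text}")
```

Read this way, the -107 (ENOTCONN) on ost_statfs and ost_quotactl would be consistent with the "This client was evicted by cluster-OST0000" messages next to them, i.e. the MDS had lost its connection to the OST at the time quota recovery ran.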
