Since my last post, I realized that the fstab had no entry for mounting the OST. It's not clear to me how this was working before, because I don't recall ever seeing an OST listed when running "mount". Anyway, continuing on...
On oss0 I ran:

  mount -t lustre /dev/sda5 /mnt/ost0

Logs on oss0 say:

Jan 11 14:33:33 compute-oss-0-0 kernel: LustreError: 4458:0:(ldlm_lib.c:806:target_handle_connect()) mylustre-OST0003: denying connection for new client 10.1....@tcp (mylustre-mdtlov_UUID): 3 clients in recovery for 300s
Jan 11 14:33:33 compute-oss-0-0 kernel: LustreError: 4458:0:(ldlm_lib.c:1536:target_send_reply_msg()) @@@ processing error (-16) r...@ffff81021e84da00 x72/t0 o8-><?>@<?>:0/0 lens 240/144 e 0 to 0 dl 1294785313 ref 1 fl Interpret:/0/0 rc -16/0
Jan 11 14:33:58 compute-oss-0-0 kernel: LustreError: 4459:0:(ldlm_lib.c:806:target_handle_connect()) mylustre-OST0003: denying connection for new client 10.1....@tcp (mylustre-mdtlov_UUID): 3 clients in recovery for 275s
...
Jan 11 14:37:18 compute-oss-0-0 kernel: LustreError: 4468:0:(ldlm_lib.c:806:target_handle_connect()) mylustre-OST0003: denying connection for new client 10.1....@tcp (mylustre-mdtlov_UUID): 3 clients in recovery for 74s
Jan 11 14:37:18 compute-oss-0-0 kernel: LustreError: 4468:0:(ldlm_lib.c:806:target_handle_connect()) Skipped 1 previous similar message
Jan 11 14:37:18 compute-oss-0-0 kernel: LustreError: 4468:0:(ldlm_lib.c:1536:target_send_reply_msg()) @@@ processing error (-16) r...@ffff81021e310800 x91/t0 o8-><?>@<?>:0/0 lens 240/144 e 0 to 0 dl 1294785538 ref 1 fl Interpret:/0/0 rc -16/0
Jan 11 14:37:18 compute-oss-0-0 kernel: LustreError: 4468:0:(ldlm_lib.c:1536:target_send_reply_msg()) Skipped 1 previous similar message
Jan 11 14:38:33 compute-oss-0-0 kernel: LustreError: 0:0:(ldlm_lib.c:1105:target_recovery_expired()) mylustre-OST0003: recovery timed out, aborting
Jan 11 14:38:33 compute-oss-0-0 kernel: LustreError: 4471:0:(genops.c:1024:class_disconnect_stale_exports()) mylustre-OST0003: disconnecting 3 stale clients

Logs on the head server:

Jan 11 14:33:33 jupiter kernel: LustreError: 11-0: an error occurred while communicating with 10.1....@tcp. The ost_connect operation failed with -16
Jan 11 14:34:23 jupiter last message repeated 2 times
Jan 11 14:35:38 jupiter last message repeated 3 times
Jan 11 14:37:18 jupiter last message repeated 3 times
Jan 11 14:37:18 jupiter kernel: LustreError: Skipped 1 previous similar message

Any help is much appreciated.

-Brendon

_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss
