In replying to myself here: I did manage to get my disk mounted to perform a new benchmark test. I could not figure out a way around the "mount.lustre: mount /dev/sdg1 at /srv/lustre/mds/crew5-MDT0000 failed: Address already in use; The target service's index is already in use. (/dev/sdg1)" error. Even rebooting both the OSS and the MDS did not help.
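For reference, the configuration behind an "index is already in use" complaint can usually be inspected without modifying anything: tunefs.lustre can print a target's stored parameters, including its fsname and service index. A minimal read-only sketch, assuming the device name from this thread (verify the flag against the manual for your Lustre version before relying on it):

```shell
# Read-only dump of the Lustre target's stored configuration
# (fsname, index, mount options); makes no changes to the device.
tunefs.lustre --print /dev/sdg1

# The "Index:" and fsname lines in the output show which service
# index this target was formatted with -- the one the MGS already
# has registered when the mount fails.
```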
So, since this was just a test of hardware with a different stripe-size setting on an LSI 8888ELP RAID card (128 kB in place of the default 64 kB), I re-created both the OST and the MDT using a new, unique fsname and all of the same hardware. This worked like a charm. Someday I will have to figure out what to do with the "cast-off" MDT names, which I apparently may no longer use... Any comments, observations, or suggestions appreciated.

Later,
megan

On Aug 14, 12:55 pm, "Ms. Megan Larko" <[EMAIL PROTECTED]> wrote:
> Hello,
>
> As a part of my continuing benchmarking of Lustre to ascertain what may be
> best suited for our needs here, I have re-created, at the LSI 8888ELP
> card level, some of my arrays from my earlier benchmark posts. The
> card is now presenting /dev/sdf (998999 MB, 128 kB stripe size) and
> /dev/sdg (6992995 MB, 128 kB stripe size) to my OSS. Both sdf and sdg
> formatted fine with Lustre and mounted without issue on the OSS.
> Recycling the MGS MDTs seems to have been the problem. When I tried to
> mount the MDT on the MGS after mounting the new OSTs, the mounts
> completed without error, but the bonnie benchmark test, run as before,
> would hang every time.
>
> Sample of errors in the MGS file /var/log/messages:
>
> Aug 13 12:39:30 mds1 kernel: Lustre: crew5-OST0001-osc: Connection to service crew5-OST0001 via nid [EMAIL PROTECTED] was lost; in progress operations using this service will wait for recovery to complete.
> Aug 13 12:39:30 mds1 kernel: LustreError: 167-0: This client was evicted by crew5-OST0001; in progress operations using this service will fail.
> Aug 13 12:39:30 mds1 kernel: Lustre: crew5-OST0001-osc: Connection restored to service crew5-OST0001 using nid [EMAIL PROTECTED]
> Aug 13 12:39:30 mds1 kernel: Lustre: MDS crew5-MDT0000: crew5-OST0001_UUID now active, resetting orphans
> Aug 13 12:42:42 mds1 kernel: Lustre: 3406:0:(ldlm_lib.c:519:target_handle_reconnect()) crew5-MDT0000: 50b043bb-0e8c-7a5b-b0fe-6bdb67d21e0b reconnecting
> Aug 13 12:42:42 mds1 kernel: Lustre: 3406:0:(ldlm_lib.c:519:target_handle_reconnect()) Skipped 24 previous similar messages
> Aug 13 12:42:42 mds1 kernel: Lustre: 3406:0:(ldlm_lib.c:747:target_handle_connect()) crew5-MDT0000: refuse reconnection from [EMAIL PROTECTED]@o2ib to 0xffff81006994d000; still busy with 2 active RPCs
> Aug 13 12:42:42 mds1 kernel: Lustre: 3406:0:(ldlm_lib.c:747:target_handle_connect()) Skipped 24 previous similar messages
> Aug 13 12:42:42 mds1 kernel: LustreError: 3406:0:(ldlm_lib.c:1442:target_send_reply_msg()) @@@ processing error (-16) [EMAIL PROTECTED] x600107/t0 o38->[EMAIL PROTECTED]:-1 lens 304/200 ref 0 fl Interpret:/0/0 rc -16/0
> Aug 13 12:42:42 mds1 kernel: LustreError: 3406:0:(ldlm_lib.c:1442:target_send_reply_msg()) Skipped 24 previous similar messages
> Aug 13 12:43:40 mds1 kernel: LustreError: 11-0: an error occurred while communicating with [EMAIL PROTECTED] The ost_connect operation failed with -19
> Aug 13 12:43:40 mds1 kernel: LustreError: Skipped 7 previous similar messages
> Aug 13 12:47:50 mds1 kernel: Lustre: crew5-OST0001-osc: Connection to service crew5-OST0001 via nid [EMAIL PROTECTED] was lost; in progress operations using this service will wait for recovery to complete.
>
> Sample of errors in the OSS file /var/log/messages:
>
> Aug 13 12:39:30 oss4 kernel: Lustre: crew5-OST0001: received MDS connection from [EMAIL PROTECTED]
> Aug 13 12:43:57 oss4 kernel: Lustre: crew5-OST0001: haven't heard from client crew5-mdtlov_UUID (at [EMAIL PROTECTED]) in 267 seconds. I think it's dead, and I am evicting it.
> Aug 13 12:46:27 oss4 kernel: LustreError: 137-5: UUID 'crew5-OST0000_UUID' is not available for connect (no target)
> Aug 13 12:46:27 oss4 kernel: LustreError: Skipped 51 previous similar messages
> Aug 13 12:46:27 oss4 kernel: LustreError: 4151:0:(ldlm_lib.c:1442:target_send_reply_msg()) @@@ processing error (-19) [EMAIL PROTECTED] x600171/t0 o8-><?>@<?>:-1 lens 240/0 ref 0 fl Interpret:/0/0 rc -19/0
> Aug 13 12:46:27 oss4 kernel: LustreError: 4151:0:(ldlm_lib.c:1442:target_send_reply_msg()) Skipped 52 previous similar messages
> Aug 13 12:47:50 oss4 kernel: Lustre: crew5-OST0001: received MDS connection from [EMAIL PROTECTED]
>
> In lctl, all pings were successful. Additionally, files on Lustre disks
> on our live system using the same MGS were all fine; no errors in the
> logfile.
>
> I thought that maybe changing the disk stripe size and reformatting the
> OST without reformatting the MDT was the problem. So I unmounted the OST
> and the MDT and reformatted the MDT on the MGS. All okay. The OST
> remounted without error. The MDT on the MGS will not remount:
>
> [EMAIL PROTECTED] ~]# mount -t lustre /dev/sdg1 /srv/lustre/mds/crew5-MDT0000
> mount.lustre: mount /dev/sdg1 at /srv/lustre/mds/crew5-MDT0000 failed: Address already in use
> The target service's index is already in use. (/dev/sdg1)
>
> Again, the live systems on the MGS are still fine. A web search for
> the error suggested I try "tunefs.lustre --reformat --index=0
> --writeconf=/dev/sdg1", but I was unable to find a syntax for that
> command that would run for me (I tried various index= values and added
> /dev/sdg1 at the end of the line, but it failed each time and only
> reprinted the help without indicating what part of what I typed was
> unparseable).
>
> My current thought is that a stripe size of 128 kB from the LSI 8888ELP
> card is not testable on my Lustre 1.6.4 system.
> This does not seem to be an accurate statement from what I have read of
> Lustre, but it seems to be what is occurring on my systems. I will test
> one more time back at the 64 kB stripe size.
>
> megan

_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss
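One observation on the failing command quoted above: --writeconf is a bare flag, not an option that takes "=/dev/sdg1"; the device is a positional argument, which would explain tunefs.lustre rejecting every variant and reprinting its help. A hedged sketch of the sequence commonly suggested on Lustre 1.6 for regenerating the configuration logs so an old target index can be reused. The fsname and the MDT device are taken from this thread; the OST partition name and the combined MGS/MDT layout are assumptions, and all targets must be unmounted first. Check the Lustre manual for your version before running any of this:

```shell
# With the filesystem stopped everywhere, regenerate the config logs.
# Note: --writeconf takes no value; the device is positional.
tunefs.lustre --writeconf /dev/sdg1   # on the MDS, for the MDT
tunefs.lustre --writeconf /dev/sdf1   # on the OSS, for each OST
                                      # (/dev/sdf1 is an assumed name)

# Alternatively, when reformatting from scratch, pin the index so a
# new one is not allocated (assumes a combined MGS/MDT on this node):
mkfs.lustre --reformat --fsname=crew5 --mgs --mdt --index=0 /dev/sdg1

# Then remount the MDT first, followed by the OSTs:
mount -t lustre /dev/sdg1 /srv/lustre/mds/crew5-MDT0000
```

After a writeconf, targets re-register with the MGS on their next mount, which is why the MDT should be mounted before the OSTs.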
