Oh right, that makes sense. If I were you, I'd back up the MDT first and then try one of two things:

1) format a small loopback device with the parameters you want the MDT to have, then replace the CONFIGS directory on your MDT with the CONFIGS directory from the loopback device, or

2) use a hex editor to modify the UUID directly.
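For option 1, a rough sketch of what that might look like (untested, and the loop file size, loop device, and mount points are just placeholders for whatever fits your setup):

  # build a scratch MDT image carrying the parameters you want
  dd if=/dev/zero of=/tmp/mdt.img bs=1M count=256
  losetup /dev/loop1 /tmp/mdt.img
  mkfs.lustre --mgs --mdt --fsname=p1 /dev/loop1

  # mount both as ldiskfs and swap CONFIGS in, keeping a backup
  mkdir -p /mnt/loop /mnt/mdtfs
  mount -t ldiskfs /dev/loop1 /mnt/loop
  mount -t ldiskfs /dev/md2 /mnt/mdtfs
  cp -a /mnt/mdtfs/CONFIGS /mnt/mdtfs/CONFIGS.bak
  cp -a /mnt/loop/CONFIGS/. /mnt/mdtfs/CONFIGS/
  umount /mnt/loop /mnt/mdtfs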
Then use tunefs.lustre --print to make sure it all looks good before mounting it. One thing I wonder about, though: are the OSTs on the same page with the fsname? That is, are they expecting to be part of the p1 filesystem?

HTH,
Kit
--
Kit Westneat
System Administrator, eSys
[email protected]
212-992-7647

On Sun, Mar 18, 2012 at 2:40 AM, Dr Stuart Midgley <[email protected]> wrote:
> No, we have tried that.
>
> This file system started life about 6 years ago as Lustre 1.4 and has
> continually been upgraded… hence the whacky UUID. Trying to rename the FS
> doesn't work. It doesn't change the UUID that the MGS tells clients to
> mount.
>
> --
> Dr Stuart Midgley
> [email protected]
>
> On 18/03/2012, at 2:24 PM, Kit Westneat wrote:
>
> > You should be able to reset the UUID by doing another writeconf with the
> > --fsname flag. After the writeconf, you'll have to writeconf all the OSTs
> > too.
> >
> > It worked on my very simple test at least:
> > [root@mds1 tmp]# tunefs.lustre --writeconf --fsname=test1 /dev/loop0
> > checking for existing Lustre data: found CONFIGS/mountdata
> > Reading CONFIGS/mountdata
> >
> > Read previous values:
> > Target:     t1-MDT0000
> > Index:      0
> > Lustre FS:  t1
> > Mount type: ldiskfs
> > Flags:      0x5
> >             (MDT MGS )
> > Persistent mount opts: iopen_nopriv,user_xattr,errors=remount-ro
> > Parameters: mdt.group_upcall=/usr/sbin/l_getgroups
> >
> > Permanent disk data:
> > Target:     test1-MDT0000
> > Index:      0
> > Lustre FS:  test1
> > Mount type: ldiskfs
> > Flags:      0x105
> >             (MDT MGS writeconf )
> > Persistent mount opts: iopen_nopriv,user_xattr,errors=remount-ro
> > Parameters: mdt.group_upcall=/usr/sbin/l_getgroups
> >
> > Writing CONFIGS/mountdata
> >
> > HTH,
> > Kit
> > --
> > Kit Westneat
> > System Administrator, eSys
> > [email protected]
> > 212-992-7647
> >
> > On Sun, Mar 18, 2012 at 1:20 AM, Stu Midgley <[email protected]> wrote:
> > ok, from what I can tell, the root of the problem is
> >
> > [root@mds001 CONFIGS]# hexdump -C p1-MDT0000 | grep -C 2 mds
> > 00002450  0b 00 00 00 04 00 00 00  12 00 00 00 00 00 00 00  |................|
> > 00002460  70 31 2d 4d 44 54 30 30  30 30 00 00 00 00 00 00  |p1-MDT0000......|
> > 00002470  6d 64 73 00 00 00 00 00  70 72 6f 64 5f 6d 64 73  |mds.....prod_mds|
> > 00002480  5f 30 30 31 5f 55 55 49  44 00 00 00 00 00 00 00  |_001_UUID.......|
> > 00002490  78 00 00 00 07 00 00 00  88 00 00 00 08 00 00 00  |x...............|
> > --
> > 000024c0  00 00 00 00 04 00 00 00  0b 00 00 00 12 00 00 00  |................|
> > 000024d0  02 00 00 00 0b 00 00 00  70 31 2d 4d 44 54 30 30  |........p1-MDT00|
> > 000024e0  30 30 00 00 00 00 00 00  70 72 6f 64 5f 6d 64 73  |00......prod_mds|
> > 000024f0  5f 30 30 31 5f 55 55 49  44 00 00 00 00 00 00 00  |_001_UUID.......|
> > 00002500  30 00 00 00 00 00 00 00  70 31 2d 4d 44 54 30 30  |0.......p1-MDT00|
> >
> > [root@mds001 CONFIGS]# hexdump -C /mnt/md2/CONFIGS/p1-MDT0000 | grep -C 2 mds
> > 00002450  0b 00 00 00 04 00 00 00  10 00 00 00 00 00 00 00  |................|
> > 00002460  70 31 2d 4d 44 54 30 30  30 30 00 00 00 00 00 00  |p1-MDT0000......|
> > 00002470  6d 64 73 00 00 00 00 00  70 31 2d 4d 44 54 30 30  |mds.....p1-MDT00|
> > 00002480  30 30 5f 55 55 49 44 00  70 00 00 00 07 00 00 00  |00_UUID.p.......|
> > 00002490  80 00 00 00 08 00 00 00  00 00 62 10 ff ff ff ff  |..........b.....|
> >
> > now if only I can get the UUID to be removed or reset...
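Going by those dumps, a sketch of the hex-edit route (option 2) might look like the following. This is purely illustrative, run against an ldiskfs-mounted copy of the llog: the records are length-prefixed binary, and the 4-byte field at 0x2458 reads 12 00 00 00 (0x12 = 18 = strlen("prod_mds_001_UUID")+1) in the bad log versus 10 00 00 00 in the good one, so it looks like a string-length field that would need patching too, and every occurrence of the old UUID would need the same treatment. The offsets below are taken straight from the hexdump above and would differ on another system.

  # always work on a copy
  cp p1-MDT0000 p1-MDT0000.bak

  # overwrite the old UUID at 0x2478 with the new one,
  # NUL-padded to the same 24-byte field width
  printf 'p1-MDT0000_UUID\0\0\0\0\0\0\0\0\0' |
    dd of=p1-MDT0000 bs=1 seek=$((0x2478)) conv=notrunc

  # fix the apparent length field (0x12 -> 0x10)
  printf '\x10' | dd of=p1-MDT0000 bs=1 seek=$((0x2458)) conv=notrunc

  # repeat for the second occurrence of the string at 0x24e8
  # (with its own length field), then re-check the result
  hexdump -C p1-MDT0000 | grep -C 2 mds

Getting a record length wrong could make things worse rather than better, hence the backup first.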
> > On Sun, Mar 18, 2012 at 1:05 PM, Dr Stuart Midgley <[email protected]> wrote:
> > > hmmm… that didn't work
> > >
> > > # tunefs.lustre --force --fsname=p1 /dev/md2
> > > checking for existing Lustre data: found CONFIGS/mountdata
> > > Reading CONFIGS/mountdata
> > >
> > > Read previous values:
> > > Target:     p1-MDT0000
> > > Index:      0
> > > UUID:       prod_mds_001_UUID
> > > Lustre FS:  p1
> > > Mount type: ldiskfs
> > > Flags:      0x405
> > >             (MDT MGS )
> > > Persistent mount opts: errors=remount-ro,iopen_nopriv,user_xattr
> > > Parameters:
> > >
> > > tunefs.lustre: unrecognized option `--force'
> > > tunefs.lustre: exiting with 22 (Invalid argument)
> > >
> > > --
> > > Dr Stuart Midgley
> > > [email protected]
> > >
> > > On 18/03/2012, at 12:17 AM, Nathan Rutman wrote:
> > >
> > >> Take them all down again, use tunefs.lustre --force --fsname.
> > >>
> > >> On Mar 17, 2012, at 2:10 AM, "Stu Midgley" <[email protected]> wrote:
> > >>
> > >>> Afternoon
> > >>>
> > >>> We have a rather severe problem with our lustre file system. We had a
> > >>> full config log, and the advice was to rewrite it with a new one. So
> > >>> we unmounted our lustre file system from all clients, unmounted all
> > >>> the OSTs, and then unmounted the MDS. I then did
> > >>>
> > >>> mds:
> > >>> tunefs.lustre --writeconf --erase-params /dev/md2
> > >>>
> > >>> oss:
> > >>> tunefs.lustre --writeconf --erase-params --mgsnode=mds001 /dev/md2
> > >>>
> > >>> After the tunefs.lustre on the mds I saw
> > >>>
> > >>> Mar 17 14:33:02 mds001 kernel: Lustre: MGS MGS started
> > >>> Mar 17 14:33:02 mds001 kernel: Lustre: MGC172.16.0.251@tcp: Reactivating import
> > >>> Mar 17 14:33:02 mds001 kernel: Lustre: MGS: Logs for fs p1 were
> > >>> removed by user request. All servers must be restarted in order to
> > >>> regenerate the logs.
> > >>> Mar 17 14:33:02 mds001 kernel: Lustre: Enabling user_xattr
> > >>> Mar 17 14:33:02 mds001 kernel: Lustre: p1-MDT0000: new disk, initializing
> > >>> Mar 17 14:33:02 mds001 kernel: Lustre: p1-MDT0000: Now serving
> > >>> p1-MDT0000 on /dev/md2 with recovery enabled
> > >>>
> > >>> which scared me a little...
> > >>>
> > >>> The MDS and the OSSes mount happily BUT I can't mount the file system
> > >>> on my clients... on the MDS I see
> > >>>
> > >>> Mar 17 16:42:11 mds001 kernel: LustreError: 137-5: UUID
> > >>> 'prod_mds_001_UUID' is not available for connect (no target)
> > >>>
> > >>> On the client I see
> > >>>
> > >>> Mar 17 16:00:06 host kernel: LustreError: 11-0: an error occurred
> > >>> while communicating with 172.16.0.251@tcp. The mds_connect operation
> > >>> failed with -19
> > >>>
> > >>> Now, it appears the writeconf renamed the UUID of the MDS from
> > >>> prod_mds_001_UUID to p1-MDT0000_UUID, but I can't work out how to get
> > >>> it back...
> > >>>
> > >>> For example, I tried
> > >>>
> > >>> # tunefs.lustre --mgs --mdt --fsname=p1 /dev/md2
> > >>> checking for existing Lustre data: found CONFIGS/mountdata
> > >>> Reading CONFIGS/mountdata
> > >>>
> > >>> Read previous values:
> > >>> Target:     p1-MDT0000
> > >>> Index:      0
> > >>> UUID:       prod_mds_001_UUID
> > >>> Lustre FS:  p1
> > >>> Mount type: ldiskfs
> > >>> Flags:      0x405
> > >>>             (MDT MGS )
> > >>> Persistent mount opts: errors=remount-ro,iopen_nopriv,user_xattr
> > >>> Parameters:
> > >>>
> > >>> tunefs.lustre: cannot change the name of a registered target
> > >>> tunefs.lustre: exiting with 1 (Operation not permitted)
> > >>>
> > >>> I'm now stuck not being able to mount a 1PB file system... which
> > >>> isn't good :(
> > >>>
> > >>> --
> > >>> Dr Stuart Midgley
> > >>> [email protected]
> >
> > --
> > Dr Stuart Midgley
> > [email protected]
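Whichever route ends up fixing the MDT, it seems worth double-checking that the MDT and every OST agree before letting clients back on. A sketch of that verification, reusing the device names and commands from this thread:

  # on the MDS: confirm target name, fsname, and flags look sane
  tunefs.lustre --print /dev/md2

  # confirm the old UUID string is gone from the config llog
  mount -t ldiskfs /dev/md2 /mnt/mdtfs
  hexdump -C /mnt/mdtfs/CONFIGS/p1-MDT0000 | grep prod_mds \
    || echo "old UUID gone"
  umount /mnt/mdtfs

  # on each OSS: check the OSTs still expect to be part of p1
  tunefs.lustre --print /dev/md2 | grep -E 'Target|Lustre FS'

  # then mount in order: MDT first, then the OSTs, clients last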
_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss
