I have a simple Lustre setup: 1 MGS, 1 MDS (2 MDTs), 2 OSS (2 OSTs each), and 1 client node to run some I/O load. I was testing what happens if one of the OSS nodes dies (with no impact to the data on its OSTs). To recover from the failed OSS, I created a new instance and attached the 2 OSTs from the failed node. Since I am reusing the existing OSTs from the failed node and their indices remain the same, I first tried mounting them directly, like below:
mount -t lustre /dev/oracleoci/oraclevdb /mnt/oss-2-ost-1
mount -t lustre /dev/oracleoci/oraclevdc /mnt/oss-2-ost-2

Since I tried many different times, I also tried the following.

Ran mkfs.lustre on the OSTs:

mkfs.lustre --fsname=lustrefs --index=2 --ost --mgsnode=10.0.6.2@tcp1 /dev/oracleoci/oraclevdb
mkfs.lustre --fsname=lustrefs --index=3 --ost --mgsnode=10.0.6.2@tcp1 /dev/oracleoci/oraclevdc
mount -t lustre /dev/oracleoci/oraclevdb /mnt/oss-2-ost-1
mount -t lustre /dev/oracleoci/oraclevdc /mnt/oss-2-ost-2

Ran mkfs.lustre on the OSTs with --reformat --replace:

mkfs.lustre --fsname=lustrefs --reformat --replace --index=2 --ost --mgsnode=10.0.6.2@tcp1 /dev/oracleoci/oraclevdb
mount -t lustre /dev/oracleoci/oraclevdb /mnt/oss-2-ost-1
mkfs.lustre --fsname=lustrefs --reformat --replace --index=3 --ost --mgsnode=10.0.6.2@tcp1 /dev/oracleoci/oraclevdc
mount -t lustre /dev/oracleoci/oraclevdc /mnt/oss-2-ost-2

Questions:

1. After the OSS node was replaced, the client mount was still hung, and I had to reboot the client node for the mount to work again. Is there some config I need to set so that it auto-recovers?

2. On the client node, I see the 2 OSTs showing as INACTIVE. How do I make them active again? I read on forums to run "lctl --device <device_name> recover" or "lctl --device <device_name> activate", and I ran that on the MDS and the client, but the OSTs still show INACTIVE. It was confusing what to pass as <device_name> and where to find the correct name.
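For context on question 2, this is the shape of the invocation I was attempting. This is only a sketch: the device number and names are taken from the lctl dl output further down in this mail, and I am not certain which device is the right one to target:

```shell
# On the client: find the osc device backing the inactive OST
lctl dl | grep OST0002
#   5 UP osc lustrefs-OST0002-osc-ffff89259ae86000 6c141ed7-bffe-3d1b-a094-11fbdaab9ac5 4

# lctl --device accepts either the device number or the device name
# from the lctl dl listing:
lctl --device 5 activate
lctl --device lustrefs-OST0002-osc-ffff89259ae86000 activate

# On the MDS, the matching device appears to be the osp named ...-osc-MDT0000:
lctl --device lustrefs-OST0002-osc-MDT0000 activate
```

I also tried inspecting the reused OST volumes without modifying them (tunefs.lustre --dryrun prints the on-disk parameters), to confirm the fsname and index survived the reattach.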
[root@client-1 ~]# lfs osts
OBDS:
0: lustrefs-OST0000_UUID ACTIVE
1: lustrefs-OST0001_UUID ACTIVE
2: lustrefs-OST0002_UUID INACTIVE
3: lustrefs-OST0003_UUID INACTIVE

[root@client-1 ~]# lctl dl
  0 UP mgc MGC10.0.6.2@tcp1 0e4fae60-66e5-963d-1aea-59b80f9fd77b 4
  1 UP lov lustrefs-clilov-ffff89259ae86000 6c141ed7-bffe-3d1b-a094-11fbdaab9ac5 3
  2 UP lmv lustrefs-clilmv-ffff89259ae86000 6c141ed7-bffe-3d1b-a094-11fbdaab9ac5 4
  3 UP mdc lustrefs-MDT0000-mdc-ffff89259ae86000 6c141ed7-bffe-3d1b-a094-11fbdaab9ac5 4
  4 UP mdc lustrefs-MDT0001-mdc-ffff89259ae86000 6c141ed7-bffe-3d1b-a094-11fbdaab9ac5 4
  5 UP osc lustrefs-OST0002-osc-ffff89259ae86000 6c141ed7-bffe-3d1b-a094-11fbdaab9ac5 4
  6 UP osc lustrefs-OST0003-osc-ffff89259ae86000 6c141ed7-bffe-3d1b-a094-11fbdaab9ac5 4
  7 UP osc lustrefs-OST0000-osc-ffff89259ae86000 6c141ed7-bffe-3d1b-a094-11fbdaab9ac5 4
  8 UP osc lustrefs-OST0001-osc-ffff89259ae86000 6c141ed7-bffe-3d1b-a094-11fbdaab9ac5 4
[root@client-1 ~]#

MDS node:

$ sudo lctl dl
  0 UP osd-ldiskfs lustrefs-MDT0001-osd lustrefs-MDT0001-osd_UUID 10
  1 UP osd-ldiskfs lustrefs-MDT0000-osd lustrefs-MDT0000-osd_UUID 11
  2 UP mgc MGC10.0.6.2@tcp1 acc3160e-9975-9262-89e1-8dc66812ac94 4
  3 UP mds MDS MDS_uuid 2
  4 UP lod lustrefs-MDT0000-mdtlov lustrefs-MDT0000-mdtlov_UUID 3
  5 UP mdt lustrefs-MDT0000 lustrefs-MDT0000_UUID 18
  6 UP mdd lustrefs-MDD0000 lustrefs-MDD0000_UUID 3
  7 UP qmt lustrefs-QMT0000 lustrefs-QMT0000_UUID 3
  8 UP osp lustrefs-MDT0001-osp-MDT0000 lustrefs-MDT0000-mdtlov_UUID 4
  9 UP osp lustrefs-OST0002-osc-MDT0000 lustrefs-MDT0000-mdtlov_UUID 4
 10 UP osp lustrefs-OST0003-osc-MDT0000 lustrefs-MDT0000-mdtlov_UUID 4
 11 UP osp lustrefs-OST0000-osc-MDT0000 lustrefs-MDT0000-mdtlov_UUID 4
 12 UP osp lustrefs-OST0001-osc-MDT0000 lustrefs-MDT0000-mdtlov_UUID 4
 13 UP lwp lustrefs-MDT0000-lwp-MDT0000 lustrefs-MDT0000-lwp-MDT0000_UUID 4
 14 UP lod lustrefs-MDT0001-mdtlov lustrefs-MDT0001-mdtlov_UUID 3
 15 UP mdt lustrefs-MDT0001 lustrefs-MDT0001_UUID 14
 16 UP mdd lustrefs-MDD0001 lustrefs-MDD0001_UUID 3
 17 UP osp lustrefs-MDT0000-osp-MDT0001 lustrefs-MDT0001-mdtlov_UUID 4
 18 UP osp lustrefs-OST0002-osc-MDT0001 lustrefs-MDT0001-mdtlov_UUID 4
 19 UP osp lustrefs-OST0003-osc-MDT0001 lustrefs-MDT0001-mdtlov_UUID 4
 20 UP osp lustrefs-OST0000-osc-MDT0001 lustrefs-MDT0001-mdtlov_UUID 4
 21 UP osp lustrefs-OST0001-osc-MDT0001 lustrefs-MDT0001-mdtlov_UUID 4
 22 UP lwp lustrefs-MDT0000-lwp-MDT0001 lustrefs-MDT0000-lwp-MDT0001_UUID 4

Thanks,

Pinkesh Valdria
Oracle Cloud Infrastructure
+65-8932-3639 (m) - Singapore
+1-425-205-7834 (m) - USA
https://blogs.oracle.com/author/pinkesh-valdria
_______________________________________________
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org