Re: [lustre-discuss] missing option mgsnode
Thanks.  Here is what I get:

[root@holylfs02oss06 ~]# tunefs.lustre --dryrun /dev/mapper/mpathd
Failed to initialize ZFS library: 256
checking for existing Lustre data: found
Read previous values:
Target:      holylfs2-OST001f
Index:       unassigned
Lustre FS:
Mount type:  ldiskfs
Flags:       0x50
             (needs_index update )
Persistent mount opts:
Parameters:

tunefs.lustre FATAL: must set target type: MDT,OST,MGS
tunefs.lustre: exiting with 22 (Invalid argument)

-Paul Edmon-

On 7/22/2022 10:37 AM, Thomas Roth via lustre-discuss wrote:
> You could look at what the device believes it's formatted with by
>
>   tunefs.lustre --dryrun /dev/mapper/mpathd
>
> When I do that here, I get something like
>
>   checking for existing Lustre data: found
>   Read previous values:
>   Target:      idril-OST000e
>   Index:       14
>   Lustre FS:   idril
>   Mount type:  zfs
>   Flags:       0x2
>                (OST )
>   Persistent mount opts:
>   Parameters: mgsnode=10.20.6.64@o2ib4:10.20.6.69@o2ib4
>   ...
>
> Tells you about 'mount type' and 'mgsnode'.
>
> Regards
> Thomas
>
> On 20/07/2022 19.48, Paul Edmon via lustre-discuss wrote:
>> We have a filesystem running Lustre 2.10.4 in HA mode using IML.  One of
>> our OSTs had some disk failures, and after reconstruction of the RAID
>> set it won't remount but gives:
>>
>> [root@holylfs02oss06 ~]# mount -t lustre /dev/mapper/mpathd /mnt/holylfs2-OST001f
>> Failed to initialize ZFS library: 256
>> mount.lustre: missing option mgsnode=
>>
>> The weird thing is that we didn't build this with ZFS; the devices are
>> all ldiskfs.  We suspect some of the data on the disk is corrupt, but we
>> were wondering if anyone has seen this error before and whether there is
>> a solution.
>>
>> -Paul Edmon-

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
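For reference, that --dryrun output suggests the on-disk target configuration (target type, index, fsname, mgsnode) has been lost, rather than the backend type having somehow changed. Once e2fsck can bring the device back to a consistent state, rewriting the configuration with tunefs.lustre along the following lines may work. This is only a sketch: the index (31 = 0x1f, inferred from the OST name) and the MGS NIDs are placeholders that must match the values the target was originally formatted with.

  tunefs.lustre --ost --fsname=holylfs2 --index=31 \
      --mgsnode=<primary-mgs-nid> --mgsnode=<failover-mgs-nid> \
      /dev/mapper/mpathd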
Re: [lustre-discuss] missing option mgsnode
You could look at what the device believes it's formatted with by

  tunefs.lustre --dryrun /dev/mapper/mpathd

When I do that here, I get something like

  checking for existing Lustre data: found
  Read previous values:
  Target:      idril-OST000e
  Index:       14
  Lustre FS:   idril
  Mount type:  zfs
  Flags:       0x2
               (OST )
  Persistent mount opts:
  Parameters: mgsnode=10.20.6.64@o2ib4:10.20.6.69@o2ib4
  ...

Tells you about 'mount type' and 'mgsnode'.

Regards
Thomas

On 20/07/2022 19.48, Paul Edmon via lustre-discuss wrote:
> We have a filesystem running Lustre 2.10.4 in HA mode using IML.  One of
> our OSTs had some disk failures, and after reconstruction of the RAID
> set it won't remount but gives:
>
> [root@holylfs02oss06 ~]# mount -t lustre /dev/mapper/mpathd /mnt/holylfs2-OST001f
> Failed to initialize ZFS library: 256
> mount.lustre: missing option mgsnode=
>
> The weird thing is that we didn't build this with ZFS; the devices are
> all ldiskfs.  We suspect some of the data on the disk is corrupt, but we
> were wondering if anyone has seen this error before and whether there is
> a solution.
>
> -Paul Edmon-

--
Thomas Roth
Department: Informationstechnologie
Location: SB3 2.291

GSI Helmholtzzentrum für Schwerionenforschung GmbH
Planckstraße 1, 64291 Darmstadt, Germany, www.gsi.de

Commercial Register / Handelsregister: Amtsgericht Darmstadt, HRB 1528
Managing Directors / Geschäftsführung:
Professor Dr. Paolo Giubellino, Dr. Ulrich Breuer, Jörg Blaurock
Chairman of the Supervisory Board / Vorsitzender des GSI-Aufsichtsrats:
State Secretary / Staatssekretär Dr. Volkmar Dietz

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
Re: [lustre-discuss] missing option mgsnode
The individual LUN looks good, but the controller is showing amber, which
is confusing us.  However, other LUNs going through that controller are
mounting fine.

-Paul Edmon-

On 7/20/2022 3:08 PM, Colin Faber wrote:
> raid check?
>
> On Wed, Jul 20, 2022, 12:41 PM Paul Edmon wrote:
>> [root@holylfs02oss06 ~]# mount -t ldiskfs /dev/mapper/mpathd /mnt/holylfs2-OST001f
>> mount: wrong fs type, bad option, bad superblock on /dev/mapper/mpathd,
>>        missing codepage or helper program, or other error
>>
>>        In some cases useful info is found in syslog - try
>>        dmesg | tail or so.
>>
>> e2fsck did not look good:
>>
>> [root@holylfs02oss06 ~]# less OST001f.out
>> ext2fs_check_desc: Corrupt group descriptor: bad block for block bitmap
>> e2fsck: Group descriptors look bad... trying backup blocks...
>> MMP interval is 10 seconds and total wait time is 42 seconds. Please wait...
>> Superblock needs_recovery flag is clear, but journal has data.
>> Recovery flag not set in backup superblock, so running journal anyway.
>> Clear journal? no
>>
>> Block bitmap for group 8128 is not in group. (block 3518518062363072290)
>> Relocate? no
>>
>> Inode bitmap for group 8128 is not in group. (block 12235298632209565410)
>> Relocate? no
>>
>> Inode table for group 8128 is not in group. (block 17751685088477790304)
>> WARNING: SEVERE DATA LOSS POSSIBLE.
>> Relocate? no
>>
>> Block bitmap for group 8129 is not in group. (block 2193744380193356980)
>> Relocate? no
>>
>> Inode bitmap for group 8129 is not in group. (block 4102707059848926418)
>> Relocate? no
>>
>> It continues at length like that.
>>
>> -Paul Edmon-
>>
>> On 7/20/2022 2:31 PM, Colin Faber wrote:
>>> Can you mount the target directly with -t ldiskfs ?
>>>
>>> Also what does e2fsck report?
>>>
>>> On Wed, Jul 20, 2022, 11:48 AM Paul Edmon via lustre-discuss wrote:
>>>> We have a filesystem running Lustre 2.10.4 in HA mode using IML.  One
>>>> of our OSTs had some disk failures, and after reconstruction of the
>>>> RAID set it won't remount but gives:
>>>>
>>>> [root@holylfs02oss06 ~]# mount -t lustre /dev/mapper/mpathd /mnt/holylfs2-OST001f
>>>> Failed to initialize ZFS library: 256
>>>> mount.lustre: missing option mgsnode=
>>>>
>>>> The weird thing is that we didn't build this with ZFS; the devices are
>>>> all ldiskfs.  We suspect some of the data on the disk is corrupt, but
>>>> we were wondering if anyone has seen this error before and whether
>>>> there is a solution.
>>>>
>>>> -Paul Edmon-

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
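One way to sanity-check the paths behind that LUN (a sketch; mpathd is the multipath alias used elsewhere in the thread) is:

  multipath -ll mpathd

Every path should report something like 'active ready running'; a failed or faulty path would point at the controller rather than the filesystem.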
Re: [lustre-discuss] missing option mgsnode
raid check?

On Wed, Jul 20, 2022, 12:41 PM Paul Edmon wrote:
> [root@holylfs02oss06 ~]# mount -t ldiskfs /dev/mapper/mpathd /mnt/holylfs2-OST001f
> mount: wrong fs type, bad option, bad superblock on /dev/mapper/mpathd,
>        missing codepage or helper program, or other error
>
>        In some cases useful info is found in syslog - try
>        dmesg | tail or so.
>
> e2fsck did not look good:
>
> [root@holylfs02oss06 ~]# less OST001f.out
> ext2fs_check_desc: Corrupt group descriptor: bad block for block bitmap
> e2fsck: Group descriptors look bad... trying backup blocks...
> MMP interval is 10 seconds and total wait time is 42 seconds. Please wait...
> Superblock needs_recovery flag is clear, but journal has data.
> Recovery flag not set in backup superblock, so running journal anyway.
> Clear journal? no
>
> Block bitmap for group 8128 is not in group. (block 3518518062363072290)
> Relocate? no
>
> Inode bitmap for group 8128 is not in group. (block 12235298632209565410)
> Relocate? no
>
> Inode table for group 8128 is not in group. (block 17751685088477790304)
> WARNING: SEVERE DATA LOSS POSSIBLE.
> Relocate? no
>
> Block bitmap for group 8129 is not in group. (block 2193744380193356980)
> Relocate? no
>
> Inode bitmap for group 8129 is not in group. (block 4102707059848926418)
> Relocate? no
>
> It continues at length like that.
>
> -Paul Edmon-
>
> On 7/20/2022 2:31 PM, Colin Faber wrote:
>> Can you mount the target directly with -t ldiskfs ?
>>
>> Also what does e2fsck report?
>>
>> On Wed, Jul 20, 2022, 11:48 AM Paul Edmon via lustre-discuss <
>> lustre-discuss@lists.lustre.org> wrote:
>>> We have a filesystem running Lustre 2.10.4 in HA mode using IML.  One
>>> of our OSTs had some disk failures, and after reconstruction of the
>>> RAID set it won't remount but gives:
>>>
>>> [root@holylfs02oss06 ~]# mount -t lustre /dev/mapper/mpathd /mnt/holylfs2-OST001f
>>> Failed to initialize ZFS library: 256
>>> mount.lustre: missing option mgsnode=
>>>
>>> The weird thing is that we didn't build this with ZFS; the devices are
>>> all ldiskfs.  We suspect some of the data on the disk is corrupt, but
>>> we were wondering if anyone has seen this error before and whether
>>> there is a solution.
>>>
>>> -Paul Edmon-

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
Re: [lustre-discuss] missing option mgsnode
[root@holylfs02oss06 ~]# mount -t ldiskfs /dev/mapper/mpathd /mnt/holylfs2-OST001f
mount: wrong fs type, bad option, bad superblock on /dev/mapper/mpathd,
       missing codepage or helper program, or other error

       In some cases useful info is found in syslog - try
       dmesg | tail or so.

e2fsck did not look good:

[root@holylfs02oss06 ~]# less OST001f.out
ext2fs_check_desc: Corrupt group descriptor: bad block for block bitmap
e2fsck: Group descriptors look bad... trying backup blocks...
MMP interval is 10 seconds and total wait time is 42 seconds. Please wait...
Superblock needs_recovery flag is clear, but journal has data.
Recovery flag not set in backup superblock, so running journal anyway.
Clear journal? no

Block bitmap for group 8128 is not in group. (block 3518518062363072290)
Relocate? no

Inode bitmap for group 8128 is not in group. (block 12235298632209565410)
Relocate? no

Inode table for group 8128 is not in group. (block 17751685088477790304)
WARNING: SEVERE DATA LOSS POSSIBLE.
Relocate? no

Block bitmap for group 8129 is not in group. (block 2193744380193356980)
Relocate? no

Inode bitmap for group 8129 is not in group. (block 4102707059848926418)
Relocate? no

It continues at length like that.

-Paul Edmon-

On 7/20/2022 2:31 PM, Colin Faber wrote:
> Can you mount the target directly with -t ldiskfs ?
>
> Also what does e2fsck report?
>
> On Wed, Jul 20, 2022, 11:48 AM Paul Edmon via lustre-discuss wrote:
>> We have a filesystem running Lustre 2.10.4 in HA mode using IML.  One of
>> our OSTs had some disk failures, and after reconstruction of the RAID
>> set it won't remount but gives:
>>
>> [root@holylfs02oss06 ~]# mount -t lustre /dev/mapper/mpathd /mnt/holylfs2-OST001f
>> Failed to initialize ZFS library: 256
>> mount.lustre: missing option mgsnode=
>>
>> The weird thing is that we didn't build this with ZFS; the devices are
>> all ldiskfs.  We suspect some of the data on the disk is corrupt, but we
>> were wondering if anyone has seen this error before and whether there is
>> a solution.
>>
>> -Paul Edmon-

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
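For what it's worth, the prompts all answered 'no' look like a read-only assessment pass. A typical pattern is to capture that first and only attempt a repair from a backup superblock once the underlying RAID is known to be good. This is a sketch; the -b 32768 value assumes the usual 4 KiB ldiskfs block size and is not taken from the thread:

  e2fsck -fn /dev/mapper/mpathd > OST001f.out 2>&1   # read-only assessment
  e2fsck -fy -b 32768 /dev/mapper/mpathd             # repair using a backup superblock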
Re: [lustre-discuss] missing option mgsnode
Can you mount the target directly with -t ldiskfs ?

Also what does e2fsck report?

On Wed, Jul 20, 2022, 11:48 AM Paul Edmon via lustre-discuss <
lustre-discuss@lists.lustre.org> wrote:
> We have a filesystem running Lustre 2.10.4 in HA mode using IML.  One of
> our OSTs had some disk failures, and after reconstruction of the RAID
> set it won't remount but gives:
>
> [root@holylfs02oss06 ~]# mount -t lustre /dev/mapper/mpathd /mnt/holylfs2-OST001f
> Failed to initialize ZFS library: 256
> mount.lustre: missing option mgsnode=
>
> The weird thing is that we didn't build this with ZFS; the devices are
> all ldiskfs.  We suspect some of the data on the disk is corrupt, but we
> were wondering if anyone has seen this error before and whether there is
> a solution.
>
> -Paul Edmon-

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
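A read-only ldiskfs mount is the safest form of that test (a sketch, reusing the device and mount point from the thread):

  mount -t ldiskfs -o ro /dev/mapper/mpathd /mnt/holylfs2-OST001f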
[lustre-discuss] missing option mgsnode
We have a filesystem running Lustre 2.10.4 in HA mode using IML.  One of
our OSTs had some disk failures, and after reconstruction of the RAID set
it won't remount but gives:

[root@holylfs02oss06 ~]# mount -t lustre /dev/mapper/mpathd /mnt/holylfs2-OST001f
Failed to initialize ZFS library: 256
mount.lustre: missing option mgsnode=

The weird thing is that we didn't build this with ZFS; the devices are all
ldiskfs.  We suspect some of the data on the disk is corrupt, but we were
wondering if anyone has seen this error before and whether there is a
solution.

-Paul Edmon-

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
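Another way to see what mount.lustre is actually reading from the device, without mounting it, is to dump the Lustre configuration file kept on every ldiskfs target. This is a sketch using debugfs in catastrophic (read-only) mode; the /tmp output path is arbitrary:

  debugfs -c -R 'dump /CONFIGS/mountdata /tmp/mountdata' /dev/mapper/mpathd

If that file turns out to be unreadable or garbled, it would go a long way toward explaining both the bogus ZFS message and the missing mgsnode.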