Re: [lustre-discuss] missing option mgsnode

2022-07-22  Paul Edmon via lustre-discuss

Thanks.  Here is what I get:

[root@holylfs02oss06 ~]# tunefs.lustre --dryrun /dev/mapper/mpathd
Failed to initialize ZFS library: 256
checking for existing Lustre data: found

   Read previous values:
Target: holylfs2-OST001f
Index:  unassigned
Lustre FS:
Mount type: ldiskfs
Flags:  0x50
  (needs_index update )
Persistent mount opts:
Parameters:


tunefs.lustre FATAL: must set target type: MDT,OST,MGS
tunefs.lustre: exiting with 22 (Invalid argument)

-Paul Edmon-
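
If e2fsck can first make the underlying ldiskfs filesystem consistent again, label settings lost like this can usually be rewritten with tunefs.lustre. A minimal sketch, assuming the values implied by the target name holylfs2-OST001f; the MGS NID is a placeholder, since the real one does not appear in this thread:

  tunefs.lustre --ost --fsname=holylfs2 --index=31 \
      --mgsnode=<mgs-nid> /dev/mapper/mpathd
  # --ost matches the "must set target type" error; OST001f = index 31
  # <mgs-nid> is a placeholder for the real MGS NID, e.g. from a healthy OST's label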

On 7/22/2022 10:37 AM, Thomas Roth via lustre-discuss wrote:

You could look at what the device believes it's formatted with by running

> tunefs.lustre --dryrun /dev/mapper/mpathd

When I do that here, I get something like

checking for existing Lustre data: found

   Read previous values:
Target: idril-OST000e
Index:  14
Lustre FS:  idril
Mount type: zfs
Flags:  0x2
  (OST )
Persistent mount opts:
Parameters: mgsnode=10.20.6.64@o2ib4:10.20.6.69@o2ib4
...


This tells you the 'mount type' and the 'mgsnode'.


Regards
Thomas


On 20/07/2022 19.48, Paul Edmon via lustre-discuss wrote:
We have a filesystem running Lustre 2.10.4 in HA mode using IML. One of
our OSTs had some disk failures, and after reconstruction of the RAID
set it won't remount; instead it gives:


[root@holylfs02oss06 ~]# mount -t lustre /dev/mapper/mpathd 
/mnt/holylfs2-OST001f

Failed to initialize ZFS library: 256
mount.lustre: missing option mgsnode=

The weird thing is that we didn't build this with ZFS; the devices are
all ldiskfs. We suspect some of the data on the disk is corrupt, but we
were wondering if anyone has seen this error before and whether there
is a solution.


-Paul Edmon-



Re: [lustre-discuss] missing option mgsnode

2022-07-22  Thomas Roth via lustre-discuss

You could look at what the device believes it's formatted with by running

> tunefs.lustre --dryrun /dev/mapper/mpathd

When I do that here, I get something like

checking for existing Lustre data: found

   Read previous values:
Target: idril-OST000e
Index:  14
Lustre FS:  idril
Mount type: zfs
Flags:  0x2
  (OST )
Persistent mount opts:
Parameters: mgsnode=10.20.6.64@o2ib4:10.20.6.69@o2ib4
...


This tells you the 'mount type' and the 'mgsnode'.


Regards
Thomas
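
Since the damaged label in Paul's case has lost its fsname and parameters, the expected values can be recovered by running the same dry run against a healthy OST on the same OSS. A sketch; the sibling device name is hypothetical:

  tunefs.lustre --dryrun /dev/mapper/mpathc   # hypothetical known-good sibling OST

The 'Parameters:' line of a healthy target supplies the mgsnode value the broken one should carry.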


On 20/07/2022 19.48, Paul Edmon via lustre-discuss wrote:
We have a filesystem running Lustre 2.10.4 in HA mode using IML. One of our
OSTs had some disk failures, and after reconstruction of the RAID set it
won't remount; instead it gives:


[root@holylfs02oss06 ~]# mount -t lustre /dev/mapper/mpathd 
/mnt/holylfs2-OST001f
Failed to initialize ZFS library: 256
mount.lustre: missing option mgsnode=

The weird thing is that we didn't build this with ZFS; the devices are all
ldiskfs. We suspect some of the data on the disk is corrupt, but we were
wondering if anyone has seen this error before and whether there is a
solution.


-Paul Edmon-



--

Thomas Roth
Department: Informationstechnologie
Location: SB3 2.291



GSI Helmholtzzentrum für Schwerionenforschung GmbH
Planckstraße 1, 64291 Darmstadt, Germany, www.gsi.de

Commercial Register / Handelsregister: Amtsgericht Darmstadt, HRB 1528
Managing Directors / Geschäftsführung:
Professor Dr. Paolo Giubellino, Dr. Ulrich Breuer, Jörg Blaurock
Chairman of the Supervisory Board / Vorsitzender des GSI-Aufsichtsrats:
State Secretary / Staatssekretär Dr. Volkmar Dietz



Re: [lustre-discuss] missing option mgsnode

2022-07-20  Paul Edmon via lustre-discuss
The individual LUN looks good, but the controller is showing amber, which
is confusing us. However, other LUNs going through that controller are
mounting fine.


-Paul Edmon-
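
If the amber controller light is the open question, the path state behind the multipath device and any recent kernel I/O errors can be inspected from the OSS itself. A sketch, assuming dm-multipath is in use (as the mpathd name suggests):

  multipath -ll mpathd    # per-path and path-group status for this device
  dmesg | tail -n 50      # recent kernel messages, e.g. SCSI or RAID errors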

On 7/20/2022 3:08 PM, Colin Faber wrote:

raid check?

On Wed, Jul 20, 2022, 12:41 PM Paul Edmon wrote:

[root@holylfs02oss06 ~]# mount -t ldiskfs /dev/mapper/mpathd
/mnt/holylfs2-OST001f
mount: wrong fs type, bad option, bad superblock on
/dev/mapper/mpathd,
   missing codepage or helper program, or other error

   In some cases useful info is found in syslog - try
   dmesg | tail or so.

e2fsck did not look good:

[root@holylfs02oss06 ~]# less OST001f.out
ext2fs_check_desc: Corrupt group descriptor: bad block for block
bitmap
e2fsck: Group descriptors look bad... trying backup blocks...
MMP interval is 10 seconds and total wait time is 42 seconds.
Please wait...
Superblock needs_recovery flag is clear, but journal has data.
Recovery flag not set in backup superblock, so running journal anyway.
Clear journal? no

Block bitmap for group 8128 is not in group.  (block
3518518062363072290)
Relocate? no

Inode bitmap for group 8128 is not in group.  (block
12235298632209565410)
Relocate? no

Inode table for group 8128 is not in group.  (block
17751685088477790304)
WARNING: SEVERE DATA LOSS POSSIBLE.
Relocate? no

Block bitmap for group 8129 is not in group.  (block
2193744380193356980)
Relocate? no

Inode bitmap for group 8129 is not in group.  (block
4102707059848926418)
Relocate? no

It continues at length like that.

-Paul Edmon-

On 7/20/2022 2:31 PM, Colin Faber wrote:

Can you mount the target directly with -t ldiskfs ?

Also what does e2fsck report?

On Wed, Jul 20, 2022, 11:48 AM Paul Edmon via lustre-discuss wrote:

We have a filesystem running Lustre 2.10.4 in HA mode using IML.
One of our OSTs had some disk failures, and after reconstruction
of the RAID set it won't remount; instead it gives:

[root@holylfs02oss06 ~]# mount -t lustre /dev/mapper/mpathd
/mnt/holylfs2-OST001f
Failed to initialize ZFS library: 256
mount.lustre: missing option mgsnode=

The weird thing is that we didn't build this with ZFS; the
devices are all ldiskfs. We suspect some of the data on the
disk is corrupt, but we were wondering if anyone has seen this
error before and whether there is a solution.

-Paul Edmon-



Re: [lustre-discuss] missing option mgsnode

2022-07-20  Colin Faber via lustre-discuss
raid check?

On Wed, Jul 20, 2022, 12:41 PM Paul Edmon wrote:

> [root@holylfs02oss06 ~]# mount -t ldiskfs /dev/mapper/mpathd
> /mnt/holylfs2-OST001f
> mount: wrong fs type, bad option, bad superblock on /dev/mapper/mpathd,
>missing codepage or helper program, or other error
>
>In some cases useful info is found in syslog - try
>dmesg | tail or so.
>
> e2fsck did not look good:
>
> [root@holylfs02oss06 ~]# less OST001f.out
> ext2fs_check_desc: Corrupt group descriptor: bad block for block bitmap
> e2fsck: Group descriptors look bad... trying backup blocks...
> MMP interval is 10 seconds and total wait time is 42 seconds. Please
> wait...
> Superblock needs_recovery flag is clear, but journal has data.
> Recovery flag not set in backup superblock, so running journal anyway.
> Clear journal? no
>
> Block bitmap for group 8128 is not in group.  (block 3518518062363072290)
> Relocate? no
>
> Inode bitmap for group 8128 is not in group.  (block 12235298632209565410)
> Relocate? no
>
> Inode table for group 8128 is not in group.  (block 17751685088477790304)
> WARNING: SEVERE DATA LOSS POSSIBLE.
> Relocate? no
>
> Block bitmap for group 8129 is not in group.  (block 2193744380193356980)
> Relocate? no
>
> Inode bitmap for group 8129 is not in group.  (block 4102707059848926418)
> Relocate? no
>
> It continues at length like that.
>
> -Paul Edmon-
> On 7/20/2022 2:31 PM, Colin Faber wrote:
>
> Can you mount the target directly with -t ldiskfs ?
>
> Also what does e2fsck report?
>
> On Wed, Jul 20, 2022, 11:48 AM Paul Edmon via lustre-discuss <lustre-discuss@lists.lustre.org> wrote:
>
>> We have a filesystem running Lustre 2.10.4 in HA mode using IML. One of
>> our OSTs had some disk failures, and after reconstruction of the RAID set
>> it won't remount; instead it gives:
>>
>> [root@holylfs02oss06 ~]# mount -t lustre /dev/mapper/mpathd
>> /mnt/holylfs2-OST001f
>> Failed to initialize ZFS library: 256
>> mount.lustre: missing option mgsnode=
>>
>> The weird thing is that we didn't build this with ZFS; the devices are
>> all ldiskfs. We suspect some of the data on the disk is corrupt, but we
>> were wondering if anyone has seen this error before and whether there is
>> a solution.
>>
>> -Paul Edmon-
>>


Re: [lustre-discuss] missing option mgsnode

2022-07-20  Paul Edmon via lustre-discuss
[root@holylfs02oss06 ~]# mount -t ldiskfs /dev/mapper/mpathd 
/mnt/holylfs2-OST001f

mount: wrong fs type, bad option, bad superblock on /dev/mapper/mpathd,
   missing codepage or helper program, or other error

   In some cases useful info is found in syslog - try
   dmesg | tail or so.

e2fsck did not look good:

[root@holylfs02oss06 ~]# less OST001f.out
ext2fs_check_desc: Corrupt group descriptor: bad block for block bitmap
e2fsck: Group descriptors look bad... trying backup blocks...
MMP interval is 10 seconds and total wait time is 42 seconds. Please wait...
Superblock needs_recovery flag is clear, but journal has data.
Recovery flag not set in backup superblock, so running journal anyway.
Clear journal? no

Block bitmap for group 8128 is not in group.  (block 3518518062363072290)
Relocate? no

Inode bitmap for group 8128 is not in group.  (block 12235298632209565410)
Relocate? no

Inode table for group 8128 is not in group.  (block 17751685088477790304)
WARNING: SEVERE DATA LOSS POSSIBLE.
Relocate? no

Block bitmap for group 8129 is not in group.  (block 2193744380193356980)
Relocate? no

Inode bitmap for group 8129 is not in group.  (block 4102707059848926418)
Relocate? no

It continues at length like that.

-Paul Edmon-
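
Given that e2fsck already had to fall back to backup group descriptors, a cautious next step is a forced, no-change check against a backup superblock before attempting any repair. A sketch; 32768 is the usual first backup location for a 4k-block ldiskfs/ext4 filesystem, which is an assumption here:

  dumpe2fs /dev/mapper/mpathd | grep -i superblock   # list primary/backup superblock locations
  e2fsck -fn -b 32768 /dev/mapper/mpathd             # read-only check using a backup superblock

Only once that output looks sane is it worth re-running without -n to let e2fsck actually repair.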

On 7/20/2022 2:31 PM, Colin Faber wrote:

Can you mount the target directly with -t ldiskfs ?

Also what does e2fsck report?

On Wed, Jul 20, 2022, 11:48 AM Paul Edmon via lustre-discuss wrote:


We have a filesystem running Lustre 2.10.4 in HA mode using IML.
One of our OSTs had some disk failures, and after reconstruction
of the RAID set it won't remount; instead it gives:

[root@holylfs02oss06 ~]# mount -t lustre /dev/mapper/mpathd
/mnt/holylfs2-OST001f
Failed to initialize ZFS library: 256
mount.lustre: missing option mgsnode=

The weird thing is that we didn't build this with ZFS; the
devices are all ldiskfs. We suspect some of the data on the
disk is corrupt, but we were wondering if anyone has seen this
error before and whether there is a solution.

-Paul Edmon-



Re: [lustre-discuss] missing option mgsnode

2022-07-20  Colin Faber via lustre-discuss
Can you mount the target directly with -t ldiskfs ?

Also what does e2fsck report?
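
For a target in this state, a read-only attempt avoids modifying a possibly damaged filesystem any further. A sketch, with a scratch mount point assumed:

  mkdir -p /mnt/test
  mount -t ldiskfs -o ro /dev/mapper/mpathd /mnt/test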

On Wed, Jul 20, 2022, 11:48 AM Paul Edmon via lustre-discuss <lustre-discuss@lists.lustre.org> wrote:

> We have a filesystem running Lustre 2.10.4 in HA mode using IML. One of
> our OSTs had some disk failures, and after reconstruction of the RAID set
> it won't remount; instead it gives:
>
> [root@holylfs02oss06 ~]# mount -t lustre /dev/mapper/mpathd
> /mnt/holylfs2-OST001f
> Failed to initialize ZFS library: 256
> mount.lustre: missing option mgsnode=
>
> The weird thing is that we didn't build this with ZFS; the devices are
> all ldiskfs. We suspect some of the data on the disk is corrupt, but we
> were wondering if anyone has seen this error before and whether there is
> a solution.
>
> -Paul Edmon-
>


[lustre-discuss] missing option mgsnode

2022-07-20  Paul Edmon via lustre-discuss
We have a filesystem running Lustre 2.10.4 in HA mode using IML. One of
our OSTs had some disk failures, and after reconstruction of the RAID
set it won't remount; instead it gives:


[root@holylfs02oss06 ~]# mount -t lustre /dev/mapper/mpathd 
/mnt/holylfs2-OST001f

Failed to initialize ZFS library: 256
mount.lustre: missing option mgsnode=

The weird thing is that we didn't build this with ZFS; the devices are
all ldiskfs. We suspect some of the data on the disk is corrupt, but we
were wondering if anyone has seen this error before and whether there is
a solution.


-Paul Edmon-
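
The "Failed to initialize ZFS library: 256" line by itself need not mean the target is ZFS: Lustre userspace tools built with optional ZFS support print it whenever they cannot load libzfs, including on ldiskfs-only systems. Two quick ways to confirm what is actually installed (a sketch; exact package names vary by build):

  ldconfig -p | grep libzfs    # is any ZFS userland library present?
  rpm -qa | grep -i lustre     # which Lustre backend packages are installed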
