This is more like a bug report and should be filed in Jira. That said, there is no guarantee that someone will be able to work on it in a timely manner.
On Jan 24, 2024, at 09:47, Backer via lustre-discuss <[email protected]> wrote:

Just pushing this back to the top of the inbox :) Or is there any other distribution list that is more appropriate for this type of question? I am also trying the devel mailing list.

On Sun, 21 Jan 2024 at 18:34, Backer <[email protected]> wrote:

Just to clarify: OSS-2 is completely powered off (a hard power-off, without any graceful shutdown) before I start working on OSS-3.

On Sun, 21 Jan 2024 at 12:12, Backer <[email protected]> wrote:

Hi All,

I am seeing odd behavior with tunefs.lustre. After changing the failover node and trying to mount an OST, I get the following error:

    The target service's index is already in use. (/dev/sdd)

After the above error, and after performing --writeconf once, I can repeat these steps (see below) any number of times on any OSS without --writeconf. This is an effort to mount an OST on a new OSS. I reproduced the behavior consistently after simplifying the steps (see below). I was wondering if anyone could help me understand this?

[root@OSS-2 opc]# lctl list_nids
10.99.101.18@tcp1
[root@OSS-2 opc]#
[root@OSS-2 opc]# mkfs.lustre --reformat --ost --fsname="testfs" --index="64" --mgsnode "10.99.101.6@tcp1" --mgsnode "10.99.101.7@tcp1" --servicenode "10.99.101.18@tcp1" "/dev/sdd"

   Permanent disk data:
Target:     testfs:OST0040
Index:      64
Lustre FS:  testfs
Mount type: ldiskfs
Flags:      0x1062
              (OST first_time update no_primnode )
Persistent mount opts: ,errors=remount-ro
Parameters: mgsnode=10.99.101.6@tcp1:10.99.101.7@tcp1 failover.node=10.99.101.18@tcp1

device size = 51200MB
formatting backing filesystem ldiskfs on /dev/sdd
        target name   testfs:OST0040
        kilobytes     52428800
        options       -J size=1024 -I 512 -i 69905 -q -O extents,uninit_bg,mmp,dir_nlink,quota,project,huge_file,^fast_commit,flex_bg -G 256 -E resize="4290772992",lazy_journal_init="0",lazy_itable_init="0" -F
mkfs_cmd = mke2fs -j -b 4096 -L testfs:OST0040 -J size=1024 -I 512 -i 69905 -q -O extents,uninit_bg,mmp,dir_nlink,quota,project,huge_file,^fast_commit,flex_bg -G 256 -E resize="4290772992",lazy_journal_init="0",lazy_itable_init="0" -F /dev/sdd 52428800k
Writing CONFIGS/mountdata

[root@OSS-2 opc]# tunefs.lustre --dryrun /dev/sdd
checking for existing Lustre data: found

   Read previous values:
Target:     testfs-OST0040
Index:      64
Lustre FS:  testfs
Mount type: ldiskfs
Flags:      0x1062
              (OST first_time update no_primnode )
Persistent mount opts: ,errors=remount-ro
Parameters: mgsnode=10.99.101.6@tcp1:10.99.101.7@tcp1 failover.node=10.99.101.18@tcp1

   Permanent disk data:
Target:     testfs:OST0040
Index:      64
Lustre FS:  testfs
Mount type: ldiskfs
Flags:      0x1062
              (OST first_time update no_primnode )
Persistent mount opts: ,errors=remount-ro
Parameters: mgsnode=10.99.101.6@tcp1:10.99.101.7@tcp1 failover.node=10.99.101.18@tcp1

exiting before disk write.
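(Side note: since the end goal is an OST that can mount on more than one OSS, one way to sidestep rewriting failover.node on every move is to register both OSS NIDs as service nodes at format time; mkfs.lustre accepts repeated --servicenode options. A minimal sketch, assuming both NIDs are known up front and using the NIDs from this thread:)

    # Format the OST with both OSS-2 and OSS-3 as service nodes, so it can
    # be mounted on either server without editing failover.node later.
    mkfs.lustre --reformat --ost --fsname=testfs --index=64 \
        --mgsnode=10.99.101.6@tcp1 --mgsnode=10.99.101.7@tcp1 \
        --servicenode=10.99.101.18@tcp1 \
        --servicenode=10.99.101.19@tcp1 \
        /dev/sdd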
[root@OSS-2 opc]#
[root@OSS-2 opc]# tunefs.lustre --erase-param failover.node --servicenode 10.99.101.18@tcp1 /dev/sdd
checking for existing Lustre data: found

   Read previous values:
Target:     testfs-OST0040
Index:      64
Lustre FS:  testfs
Mount type: ldiskfs
Flags:      0x1062
              (OST first_time update no_primnode )
Persistent mount opts: ,errors=remount-ro
Parameters: mgsnode=10.99.101.6@tcp1:10.99.101.7@tcp1 failover.node=10.99.101.18@tcp1

   Permanent disk data:
Target:     testfs:OST0040
Index:      64
Lustre FS:  testfs
Mount type: ldiskfs
Flags:      0x1062
              (OST first_time update no_primnode )
Persistent mount opts: ,errors=remount-ro
Parameters: mgsnode=10.99.101.6@tcp1:10.99.101.7@tcp1 failover.node=10.99.101.18@tcp1

Writing CONFIGS/mountdata

[root@OSS-2 opc]# mkdir /testfs-OST0040
[root@OSS-2 opc]# mount -t lustre /dev/sdd /testfs-OST0040
mount.lustre: increased '/sys/devices/platform/host5/session3/target5:0:0/5:0:0:1/block/sdd/queue/max_sectors_kb' from 1024 to 16384
[root@OSS-2 opc]#
[root@OSS-2 opc]# tunefs.lustre --dryrun /dev/sdd
checking for existing Lustre data: found

   Read previous values:
Target:     testfs-OST0040
Index:      64
Lustre FS:  testfs
Mount type: ldiskfs
Flags:      0x1002
              (OST no_primnode )
Persistent mount opts: ,errors=remount-ro
Parameters: mgsnode=10.99.101.6@tcp1:10.99.101.7@tcp1 failover.node=10.99.101.18@tcp1

   Permanent disk data:
Target:     testfs-OST0040
Index:      64
Lustre FS:  testfs
Mount type: ldiskfs
Flags:      0x1002
              (OST no_primnode )
Persistent mount opts: ,errors=remount-ro
Parameters: mgsnode=10.99.101.6@tcp1:10.99.101.7@tcp1 failover.node=10.99.101.18@tcp1

exiting before disk write.
[root@OSS-2 opc]#

Going over to OSS-3 and trying to mount the OST:

[root@OSS-3 opc]# lctl list_nids
10.99.101.19@tcp1
[root@OSS-3 opc]#

The parameters look the same as on OSS-2:

[root@OSS-3 opc]# tunefs.lustre --dryrun /dev/sdd
checking for existing Lustre data: found

   Read previous values:
Target:     testfs-OST0040
Index:      64
Lustre FS:  testfs
Mount type: ldiskfs
Flags:      0x1002
              (OST no_primnode )
Persistent mount opts: ,errors=remount-ro
Parameters: mgsnode=10.99.101.6@tcp1:10.99.101.7@tcp1 failover.node=10.99.101.18@tcp1

   Permanent disk data:
Target:     testfs-OST0040
Index:      64
Lustre FS:  testfs
Mount type: ldiskfs
Flags:      0x1002
              (OST no_primnode )
Persistent mount opts: ,errors=remount-ro
Parameters: mgsnode=10.99.101.6@tcp1:10.99.101.7@tcp1 failover.node=10.99.101.18@tcp1

exiting before disk write.
[root@OSS-3 opc]#

Changing the failover node to the current node:

[root@OSS-3 opc]# tunefs.lustre --erase-param failover.node --servicenode 10.99.101.19@tcp1 /dev/sdd
checking for existing Lustre data: found

   Read previous values:
Target:     testfs-OST0040
Index:      64
Lustre FS:  testfs
Mount type: ldiskfs
Flags:      0x1002
              (OST no_primnode )
Persistent mount opts: ,errors=remount-ro
Parameters: mgsnode=10.99.101.6@tcp1:10.99.101.7@tcp1 failover.node=10.99.101.18@tcp1

   Permanent disk data:
Target:     testfs-OST0040
Index:      64
Lustre FS:  testfs
Mount type: ldiskfs
Flags:      0x1042
              (OST update no_primnode )
Persistent mount opts: ,errors=remount-ro
Parameters: mgsnode=10.99.101.6@tcp1:10.99.101.7@tcp1 failover.node=10.99.101.19@tcp1

<waits here for MMP timeout (multi-mount protection)>

After the write completes, for some reason this OST is marked with the 'first_time' flag (0x1062) in the next command.
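(If it helps anyone reproduce this: the flags live in the CONFIGS/mountdata file on the OST, so they can be inspected offline between steps. A minimal sketch, assuming an ldiskfs backend with the e2fsprogs tools installed; /tmp/mountdata is just an arbitrary scratch path:)

    # Copy the on-disk Lustre config out of the unmounted OST (read-only)
    debugfs -c -R "dump /CONFIGS/mountdata /tmp/mountdata" /dev/sdd
    # Compare hexdumps taken before/after each tunefs.lustre run to see
    # exactly which step flips the first_time bit in the flags word
    hexdump -C /tmp/mountdata | head
    # Confirm the wait above really is MMP by checking the MMP settings
    dumpe2fs -h /dev/sdd | grep -i mmp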
[root@OSS-3 opc]# tunefs.lustre --dryrun /dev/sdd
checking for existing Lustre data: found

   Read previous values:
Target:     testfs-OST0040
Index:      64
Lustre FS:  testfs
Mount type: ldiskfs
Flags:      0x1062
              (OST first_time update no_primnode )
Persistent mount opts: ,errors=remount-ro
Parameters: mgsnode=10.99.101.6@tcp1:10.99.101.7@tcp1 failover.node=10.99.101.19@tcp1

   Permanent disk data:
Target:     testfs:OST0040
Index:      64
Lustre FS:  testfs
Mount type: ldiskfs
Flags:      0x1062
              (OST first_time update no_primnode )
Persistent mount opts: ,errors=remount-ro
Parameters: mgsnode=10.99.101.6@tcp1:10.99.101.7@tcp1 failover.node=10.99.101.19@tcp1

exiting before disk write.
[root@OSS-3 opc]#

The mount doesn't work here because the OST is marked as first_time, but this is not its first mount: it was already mounted on OSS-2, and the MGS knows about it.

[root@OSS-3 opc]# mkdir /testfs-OST0040
[root@OSS-3 opc]# mount -t lustre /dev/sdd /testfs-OST0040
mount.lustre: mount /dev/sdd at /testfs-OST0040 failed: Address already in use
The target service's index is already in use. (/dev/sdd)
[root@OSS-3 opc]#

From here, if I run tunefs.lustre with --writeconf, it works. Once this is done, repeating the above experiment any number of times on any server works as expected without using --writeconf. (FYI: --writeconf is documented as a dangerous command.)
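For reference, the recovery that worked above is the standard writeconf procedure from the Lustre manual. A rough sketch, assuming the whole file system can be taken offline; apart from /dev/sdd, the device path is a placeholder:

    # 1. Unmount all clients, then all OSTs, then the MDT(s)
    # 2. Regenerate the configuration logs on every target:
    tunefs.lustre --writeconf /dev/<mdt_device>   # on the MDS (MGS first if separate)
    tunefs.lustre --writeconf /dev/sdd            # on each OSS, for each OST
    # 3. Remount in order: MGS/MDT first, then the OSTs, then clients.
    #    The MGS rewrites the config logs as each target registers again.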
Cheers, Andreas
--
Andreas Dilger
Lustre Principal Architect
Whamcloud

_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
