[lustre-discuss] Migrating to new OSTs

2024-06-18 Thread Sid Young via lustre-discuss
G'Day all,

I'm in the process of scoping up more HPC storage on newer hardware and I
was looking to deploy a bunch of new OSTs (JBOD with ZFS) and then phase
out the older OSTs on the older hardware.

Is there a comprehensive guide to doing this? I've found many different
ways to migrate files to a new OST, but I also need steps on adding the new
OSTs to the MDS (I have /home and /lustre as 2 pools).
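
For what it's worth, the rough sequence I have pieced together so far for the
"add new OSTs" half is below (pool names, index and NIDs are placeholders for
my setup, so please correct me if any step is wrong); draining the old OSTs
with lfs_migrate would then be a separate follow-up step:

# on the new OSS: format a ZFS-backed OST and register it with the MGS
mkfs.lustre --ost --backfstype=zfs --fsname=home --index=4 \
    --mgsnode=<mgs-nid> newhome0/ost4 raidz2 d0 d1 d2 d3 d4 d5

# mount it so it joins the filesystem
mkdir -p /mnt/home-ost4
mount -t lustre newhome0/ost4 /mnt/home-ost4

# on the MDS: stop new objects being allocated on an old OST before draining it
lctl set_param osp.home-OST0000-osc-MDT0000.max_create_count=0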



Sid Young
Translational Research Institute
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] tunefs.lustre safe way to get config

2023-02-23 Thread Sid Young via lustre-discuss
G'Day all,

I need to review the IPs assigned during the initial mkfs.lustre on ten
ZFS-based OSTs and two ZFS-backed MDTs.

The ZFS disks are:
osthome0/ost0, osthome1/ost1, osthome2/ost2, osthome3/ost3,
ostlustre0/ost0, ostlustre1/ost1, ostlustre2/ost2,
ostlustre3/ost3, ostlustre4/ost4, ostlustre5/ost5
and
mdsthome/home
mdtlustre/lustre.

A few questions:

Is it safe to use tunefs.lustre on the running system to read back the
parameters only, or do I have to shut everything down and read from the
unmounted filesystems?

Are these the correct commands to use for the MDTs?

tunefs.lustre --print mdthome/home
tunefs.lustre --print mdtlustre/lustre
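
As a cross-check, since mkfs.lustre also stores its parameters as ZFS dataset
properties on ZFS targets, I assume I could read them back live with something
like (dataset name as per the commands above):

# read-only: list the lustre:* properties on the MDT dataset
zfs get all mdthome/home | grep lustre: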


Sid Young
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] Lustre crash and now lockup on ls -la /lustre

2023-02-22 Thread Sid Young via lustre-discuss
Hi all,

I've been running Lustre 2.12.6 (clients are 2.12.7) on HP gear for
nearly 2 years and had an odd crash requiring a reboot of all nodes. I have
Lustre /home and /lustre file systems, and I've been able to remount them on
the clients after restarting the MGS/MDT and OSS nodes, but on any client,
when I do an ls -la on the /lustre file system, it locks solid. The /home
filesystem appears to be OK for the directories and sub-directories I tested.

I am very rusty on Lustre now, but I logged into another node and ran the
following:

[root@n04 ~]# lfs check osts
home-OST-osc-9f3b26547800 active.
home-OST0001-osc-9f3b26547800 active.
home-OST0002-osc-9f3b26547800 active.
home-OST0003-osc-9f3b26547800 active.
lustre-OST-osc-9efd1e392800 active.
lustre-OST0001-osc-9efd1e392800 active.
lustre-OST0002-osc-9efd1e392800 active.
lustre-OST0003-osc-9efd1e392800 active.
lustre-OST0004-osc-9efd1e392800 active.
lustre-OST0005-osc-9efd1e392800 active.
[root@n04 ~]# lfs check mds
home-MDT-mdc-9f3b26547800 active.
lustre-MDT-mdc-9efd1e392800 active.
[root@n04 ~]# lfs check servers
home-OST-osc-9f3b26547800 active.
home-OST0001-osc-9f3b26547800 active.
home-OST0002-osc-9f3b26547800 active.
home-OST0003-osc-9f3b26547800 active.
lustre-OST-osc-9efd1e392800 active.
lustre-OST0001-osc-9efd1e392800 active.
lustre-OST0002-osc-9efd1e392800 active.
lustre-OST0003-osc-9efd1e392800 active.
lustre-OST0004-osc-9efd1e392800 active.
lustre-OST0005-osc-9efd1e392800 active.
home-MDT-mdc-9f3b26547800 active.
lustre-MDT-mdc-9efd1e392800 active.
[root@n04 ~]#

[root@n04 ~]# lfs df -h
UUID   bytesUsed   Available Use% Mounted on
home-MDT_UUID       4.2T  217.5G    4.0T   6% /home[MDT:0]
home-OST_UUID      47.6T   42.5T    5.1T  90% /home[OST:0]
home-OST0001_UUID  47.6T   44.6T    2.9T  94% /home[OST:1]
home-OST0002_UUID  47.6T   41.9T    5.7T  88% /home[OST:2]
home-OST0003_UUID  47.6T   42.2T    5.4T  89% /home[OST:3]

filesystem_summary:   190.4T  171.2T   19.1T  90% /home

UUID   bytesUsed   Available Use% Mounted on
lustre-MDT_UUID       5.0T   53.8G    4.9T   2% /lustre[MDT:0]
lustre-OST_UUID      47.6T   42.3T    5.3T  89% /lustre[OST:0]
lustre-OST0001_UUID  47.6T   41.8T    5.8T  88% /lustre[OST:1]
lustre-OST0002_UUID  47.6T   41.3T    6.3T  87% /lustre[OST:2]
lustre-OST0003_UUID  47.6T   42.3T    5.3T  89% /lustre[OST:3]
lustre-OST0004_UUID  47.6T   43.7T    3.9T  92% /lustre[OST:4]
lustre-OST0005_UUID  47.6T   40.1T    7.4T  85% /lustre[OST:5]

filesystem_summary:   285.5T  251.5T   34.0T  89% /lustre

[root@n04 ~]#

Is it worth remounting everything and hoping crash recovery works, or are
there specific checks I can make?
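
Before I open it back up, one check I'm considering is an online LFSCK run
from the MDS; from my reading the rough invocation is as follows (I'd
double-check the options against the manual first, and the device name below
assumes the default MDT0000 index):

# on the MDS: start a full consistency check (scrub + layout + namespace)
lctl lfsck_start -M lustre-MDT0000 -t all
# watch progress
lctl get_param mdd.lustre-MDT0000.lfsck_layout
lctl get_param mdd.lustre-MDT0000.lfsck_namespace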



Sid Young
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] upgrade 2.12.6 to 2.12.7 - no lnet after reboot - SOLVED

2021-11-10 Thread Sid Young via lustre-discuss
I've managed to solve this. After checking a few nodes in the cluster, I
discovered this particular node must have had a partial update, resulting in
a mismatch between the kernel version (locked at the base release) and some
of the kernel support packages, which were at a slightly later release; this
caused DKMS not to generate the required modules.

Normally I disable kernel updates in YUM so everything stays at the same
release version, and I just update packages until I'm ready for a major
update cycle.
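
For reference, the pinning itself is nothing fancy, just a yum exclude that I
drop in until I'm ready for a full update cycle (a sketch, not the exact file
from this node):

# added to the [main] section of /etc/yum.conf to hold kernel packages
# at the installed release
exclude=kernel*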

bad node:

# yum list installed | grep kernel
abrt-addon-kerneloops.x86_64   2.1.11-60.el7.centos     @anaconda
kernel.x86_64                  3.10.0-1160.el7          @anaconda
kernel-debug-devel.x86_64      3.10.0-1160.15.2.el7     @updates
kernel-devel.x86_64            3.10.0-1160.15.2.el7     @updates
kernel-headers.x86_64          3.10.0-1160.15.2.el7     @updates
kernel-tools.x86_64            3.10.0-1160.15.2.el7     @updates
kernel-tools-libs.x86_64       3.10.0-1160.15.2.el7     @updates
#

Working node:
# yum list installed | grep kernel
abrt-addon-kerneloops.x86_64   2.1.11-60.el7.centos     @anaconda
kernel.x86_64                  3.10.0-1160.el7          @anaconda
kernel-debug-devel.x86_64      3.10.0-1160.31.1.el7     @updates
kernel-devel.x86_64            3.10.0-1160.el7          @/kernel-devel-3.10.0-1160.el7.x86_64
kernel-headers.x86_64          3.10.0-1160.el7          @anaconda
kernel-tools.x86_64            3.10.0-1160.el7          @anaconda
kernel-tools-libs.x86_64       3.10.0-1160.el7          @anaconda
#

After I removed the extraneous release packages and the Lustre packages, I
then updated the kernel, re-installed the kernel-headers and kernel-devel
packages, and installed the (minimal) Lustre client:

# yum list installed|grep lustre
kmod-lustre-client.x86_64      2.12.7-1.el7    @/kmod-lustre-client-2.12.7-1.el7.x86_64
lustre-client.x86_64           2.12.7-1.el7    @/lustre-client-2.12.7-1.el7.x86_64
lustre-client-dkms.noarch      2.12.7-1.el7    @/lustre-client-dkms-2.12.7-1.el7.noarch
#

And all good - everything mounts and works first go as expected :)
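
For anyone hitting the same thing, the quick sanity check I now run after
installing the client packages is roughly:

# confirm DKMS built the modules for the running kernel
dkms status | grep -i lustre
# confirm modprobe can actually find them
modinfo lnet | grep -E '^(filename|version)'
modinfo lustre | grep -E '^(filename|version)'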



Sid Young
Translational Research Institute
Brisbane



> -- Forwarded message ------
> From: Sid Young 
> To: lustre-discuss 
> Cc:
> Bcc:
> Date: Mon, 8 Nov 2021 11:15:59 +1000
> Subject: [lustre-discuss] upgrade 2.12.6 to 2.12.7 - no lnet after reboot?
> I was running 2.12.6 on a HP DL385 running standard Centos 7.9
> (3.10.0-1160.el7.x86_64) for around 6 months and decided to plan and start
> an upgrade cycle to 2.12.7, so I downloaded and installed the 2.12.7 centos
> release from whamcloud using the 7.9.2009 release RPMS
>
> # cat /etc/centos-release
> CentOS Linux release 7.9.2009 (Core)
>
> I have tried on the a node and I now have the following error after I
> rebooted:
>
> # modprobe -v lnet
> modprobe: FATAL: Module lnet not found.
>
> I suspect its not built against the kernel as there are 3 releases showing
> and no errors during the yum install process:
>
> # ls -la  /usr/lib/modules
> drwxr-xr-x.  3 root root 4096 Mar 18  2021 3.10.0-1160.2.1.el7.x86_64
> drwxr-xr-x   3 root root 4096 Nov  8 10:32 3.10.0-1160.25.1.el7.x86_64
> drwxr-xr-x.  7 root root 4096 Nov  8 11:02 3.10.0-1160.el7.x86_64
> #
>
> Anyone upgraded this way? Any obvious gottas I've missed?
>
> Sid Young
> Translational Research Institute
> Brisbane
>
>
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] upgrade 2.12.6 to 2.12.7 - no lnet after reboot?

2021-11-07 Thread Sid Young via lustre-discuss
I was running 2.12.6 on an HP DL385 running standard CentOS 7.9
(3.10.0-1160.el7.x86_64) for around 6 months and decided to plan and start
an upgrade cycle to 2.12.7, so I downloaded and installed the 2.12.7 CentOS
release from Whamcloud using the 7.9.2009 release RPMs.

# cat /etc/centos-release
CentOS Linux release 7.9.2009 (Core)

I have tried it on one node and I now have the following error after
rebooting:

# modprobe -v lnet
modprobe: FATAL: Module lnet not found.

I suspect it's not built against the kernel, as there are 3 releases showing
and no errors during the yum install process:

# ls -la  /usr/lib/modules
drwxr-xr-x.  3 root root 4096 Mar 18  2021 3.10.0-1160.2.1.el7.x86_64
drwxr-xr-x   3 root root 4096 Nov  8 10:32 3.10.0-1160.25.1.el7.x86_64
drwxr-xr-x.  7 root root 4096 Nov  8 11:02 3.10.0-1160.el7.x86_64
#
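
The next thing I plan to check is whether the modules were built at all for
the running kernel, something like:

uname -r
# any lnet/lustre modules present under the running kernel's module tree?
find /lib/modules/$(uname -r) \( -name 'lnet.ko*' -o -name 'lustre.ko*' \)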

Anyone upgraded this way? Any obvious gotchas I've missed?

Sid Young
Translational Research Institute
Brisbane
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] OST "D" status - only 1 OSS mounting

2021-11-01 Thread Sid Young via lustre-discuss
Thanks Andreas. The ZFS pools became degraded, so I have cold restarted the
storage and the OSTs, and everything has come back up after about 5 minutes
of crash recovery.

I've also worked out how to use lfs_migrate and am emptying the full OST.
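
In case it helps anyone else, the drain itself is roughly this, run from a
client (adjust the OST UUID for your own full OST):

# move every file with objects on the full OST onto the remaining OSTs
lfs find /home --ost home-OST0001_UUID | lfs_migrate -y
# watch the space come back
lfs df -h /home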

Is there a tool that can specifically check an MDT and its associated OSTs?




Sid Young


On Mon, Nov 1, 2021 at 2:11 PM Andreas Dilger  wrote:

> The "D" status means the OST is marked in "Degraded" mode, see the
> lfs-df(1) man page.  The "lfs check osts" is only checking the client
> connection to the OSTs, but whether the MDS creates objects on those OSTs
> really depends on how the MDS is feeling about them.
>
> On Oct 31, 2021, at 19:28, Sid Young via lustre-discuss <
> lustre-discuss@lists.lustre.org> wrote:
>
> Hi all,
>
> I have a really odd issue, only 1 OST appears to mount despite there being
> 4 OSTS available and ACTIVE.
>
> [root@hpc-login-01 home]# lfs df -h
> UUID   bytesUsed   Available Use% Mounted on
> home-MDT_UUID   4.2T   40.2G4.1T   1% /home[MDT:0]
> home-OST_UUID  47.6T   37.8T9.8T  80% /home[OST:0]
> home-OST0001_UUID  47.6T   47.2T  413.4G 100% /home[OST:1]
> D
> home-OST0002_UUID  47.6T   35.7T   11.9T  75% /home[OST:2]
> home-OST0003_UUID  47.6T   39.4T8.2T  83% /home[OST:3]
>
> filesystem_summary:   190.4T  160.0T   30.3T  85% /home
>
> [root@hpc-login-01 home]# lfs check osts
> home-OST-osc-a10c8f483800 active.
> home-OST0001-osc-a10c8f483800 active.
> home-OST0002-osc-a10c8f483800 active.
> home-OST0003-osc-a10c8f483800 active.
>
> Should be 191TB... only shows 1 OST..
>
> 10.140.93.42@o2ib:/home *48T *  48T  414G 100% /home
>
> Where should I look?
>
>
> Cheers, Andreas
> --
> Andreas Dilger
> Lustre Principal Architect
> Whamcloud
>
>
>
>
>
>
>
>
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] OST "D" status - only 1 OSS mounting

2021-10-31 Thread Sid Young via lustre-discuss
Hi all,

I have a really odd issue: only 1 OST appears to mount despite there being
4 OSTs available and ACTIVE.

[root@hpc-login-01 home]# lfs df -h
UUID   bytesUsed   Available Use% Mounted on
home-MDT_UUID       4.2T   40.2G    4.1T   1% /home[MDT:0]
home-OST_UUID      47.6T   37.8T    9.8T  80% /home[OST:0]
home-OST0001_UUID  47.6T   47.2T  413.4G 100% /home[OST:1] D
home-OST0002_UUID  47.6T   35.7T   11.9T  75% /home[OST:2]
home-OST0003_UUID  47.6T   39.4T    8.2T  83% /home[OST:3]

filesystem_summary:   190.4T  160.0T   30.3T  85% /home

[root@hpc-login-01 home]# lfs check osts
home-OST-osc-a10c8f483800 active.
home-OST0001-osc-a10c8f483800 active.
home-OST0002-osc-a10c8f483800 active.
home-OST0003-osc-a10c8f483800 active.

Should be 191TB... only shows 1 OST:

10.140.93.42@o2ib:/home   48T   48T  414G 100% /home

Where should I look?
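
One place I was thinking of looking is the OSS that serves OST0001, to see
whether the target is flagged degraded and whether the backing pool is
healthy, roughly:

# on the OSS serving home-OST0001 (both commands are read-only)
lctl get_param obdfilter.home-OST0001.degraded
zpool status -x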





Sid Young
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] df shows wrong size of lustre file system (on all nodes).

2021-10-18 Thread Sid Young via lustre-discuss
I have some stability in my Lustre installation after many days of testing;
however, df -h now reports the /home filesystem incorrectly.

After mounting the /home I get:
[root@n04 ~]# df -h
10.140.90.42@tcp:/lustre  286T   59T  228T  21% /lustre
10.140.90.42@tcp:/home    191T  153T   38T  81% /home

doing it again straight after, I get:

[root@n04 ~]# df -h
10.140.90.42@tcp:/lustre  286T   59T  228T  21% /lustre
10.140.90.42@tcp:/home 48T   40T  7.8T  84% /home

The 4 OSTs report as active and present:

[root@n04 ~]# lfs df

UUID   1K-blocksUsed   Available Use% Mounted on
home-MDT_UUID        4473805696    41784064  4432019584   1% /home[MDT:0]
home-OST_UUID       51097753600 40560842752 10536908800  80% /home[OST:0]
home-OST0001_UUID   51097896960 42786978816  8310916096  84% /home[OST:1]
home-OST0002_UUID   51097687040 38293322752 12804362240  75% /home[OST:2]
home-OST0003_UUID   51097765888 42293640192  8804123648  83% /home[OST:3]

filesystem_summary:  204391103488 163934784512 40456310784  81% /home

[root@n04 ~]#
[root@n04 ~]# lfs osts
OBDS:
0: lustre-OST_UUID ACTIVE
1: lustre-OST0001_UUID ACTIVE
2: lustre-OST0002_UUID ACTIVE
3: lustre-OST0003_UUID ACTIVE
4: lustre-OST0004_UUID ACTIVE
5: lustre-OST0005_UUID ACTIVE
OBDS:
0: home-OST_UUID ACTIVE
1: home-OST0001_UUID ACTIVE
2: home-OST0002_UUID ACTIVE
3: home-OST0003_UUID ACTIVE
[root@n04 ~]#

Anyone seen this before? Reboots and remounts do not appear to change the
value. The zpool is reporting as ONLINE and a scrub returns 0 errors.
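
In case it narrows things down, the two numbers I'm comparing on an affected
client are the per-OST view and what the llite layer thinks the totals are
(read-only, so safe to run anywhere):

lfs df -h /home
lctl get_param llite.home-*.kbytestotal llite.home-*.kbytesfree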

Sid Young
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] Best ways to backup a Lustre file system?

2021-10-16 Thread Sid Young via lustre-discuss
G'Day all,

Apart from rsync'ing all the data on a mounted Lustre filesystem to another
server, what backup systems are people using to back up Lustre?
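
For context, what I've sketched out so far for our ZFS-backed targets is
plain snapshot-and-send from the servers (names below are placeholders, and I
realise a truly consistent multi-target backup needs more care, e.g. the lctl
snapshot machinery for ZFS-based Lustre):

# on each server: snapshot the target dataset
zfs snapshot ostpool0/ost0@backup-20211016
# stream it to a backup host/pool
zfs send ostpool0/ost0@backup-20211016 | ssh backuphost zfs receive -F backuppool/ost0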


Sid Young
M: 0458 396300
W: https://off-grid-engineering.com
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] /home remounted and running for 6 hours

2021-10-13 Thread Sid Young via lustre-discuss
Well, my saga with /home locking up was only partially resolved today. I
rebooted the MDS and remounted the MGS, the lustre MDT and the home MDT, and
after a while it all came good. I then rebooted each compute node and we were
operational for about 6 hours until it all locked up again: /lustre worked
fine but /home just locked solid. I'm suspecting corruption but I don't know
how to fix it...

I have found that once I restart the MDS I can do a remount of home and all
the D state processes come good and we are up and running.

Is there a tool that can specifically check an individual MDT / OST etc?



Sid Young
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] Lustre /home lockup - more info

2021-10-11 Thread Sid Young via lustre-discuss
I tried remounting the /home Lustre file system to /mnt in read-only mode,
and when I try to ls the directory it locks up, but I can escape it. However,
when I do a df command I get a completely wrong size (it should be around
192TB):

10.140.93.42@o2ib:/home   6.0P  4.8P  1.3P  80% /mnt

The zfs scrub is still working and all disks physically report as OK in the
iLO of the two OSS servers...

When the scrub finishes later today I will unmount and remount the 4 OSTs
and see if the remount changes the status... updates in about 8 hours.

Sid Young

On Tue, Oct 12, 2021 at 8:18 AM Sid Young  wrote:

>
>>2. Tools to check a lustre (Sid Young)
>>4. Re: Tools to check a lustre (Dennis Nelson)
>>
>>
>> My key issue is why /home locks solid when you try to use it but /lustre
> is OK . The backend is ZFS used to manage the disks presented from the HP
> D8000 JBOD
> I'm at a loss after 6 months of 100% operation why this is suddenly
> occurring. If I do repeated "dd" tasks on lustre it works fine, start one
> on /home and it locks solid.
>
> I have started a ZFS scrub on two of the zfs pools. at 47T each it will
> take most of today to resolve, but that should rule out the actual storage
> (which is showing "NORMAL/ONLINE" and no errors.
>
> I'm seeing a lot of these in /var/log/messages
> kernel: LustreError: 6578:0:(events.c:200:client_bulk_callback()) event
> type 1, status -5, desc 89cdf3b9dc00
> A google search returned this:
> https://wiki.lustre.org/Lustre_Resiliency:_Understanding_Lustre_Message_Loss_and_Tuning_for_Resiliency
>
> Could it be a network issue? - the nodes are running the
> Centos7.9 drivers... the Mellanox one did not seam to make any difference
> when I originally tried it 6 months ago.
>
> Any help appreciated :)
>
> Sid
>
>
>>
>> -- Forwarded message --
>> From: Sid Young 
>> To: lustre-discuss 
>> Cc:
>> Bcc:
>> Date: Mon, 11 Oct 2021 16:07:56 +1000
>> Subject: [lustre-discuss] Tools to check a lustre
>>
>> I'm having trouble diagnosing where the problem lies in  my Lustre
>> installation, clients are 2.12.6 and I have a /home and /lustre
>> filesystems using Lustre.
>>
>> /home has 4 OSTs and /lustre is made up of 6 OSTs. lfs df shows all OSTs
>> as ACTIVE.
>>
>> The /lustre file system appears fine, I can *ls *into every directory.
>>
>> When people log into the login node, it appears to lockup. I have shut
>> down everything and remounted the OSTs and MDTs etc in order with no
>> errors reporting but I'm getting the lockup issue soon after a few people
>> log in.
>> The backend network is 100G Ethernet using ConnectX5 cards and the OS is
>> Cento 7.9, everything was installed as RPMs and updates are disabled in
>> yum.conf
>>
>> Two questions to start with:
>> Is there a command line tool to check each OST individually?
>> Apart from /var/log/messages, is there a lustre specific log I can
>> monitor on the login node to see errors when I hit /home...
>>
>>
>>
>> Sid Young
>>
>>
>>
>>
>>
>>
>>
>> -- Forwarded message --
>> From: Dennis Nelson 
>> To: Sid Young 
>>
>> Date: Mon, 11 Oct 2021 12:20:25 +
>> Subject: Re: [lustre-discuss] Tools to check a lustre
>> Have you tried lfs check servers on the login node?
>>
>
> Yes - one of the first things I did and this is what it always reports:
>
> ]# lfs check servers
> home-OST-osc-89adb7e5e000 active.
> home-OST0001-osc-89adb7e5e000 active.
> home-OST0002-osc-89adb7e5e000 active.
> home-OST0003-osc-89adb7e5e000 active.
> lustre-OST-osc-89cdd14a2000 active.
> lustre-OST0001-osc-89cdd14a2000 active.
> lustre-OST0002-osc-89cdd14a2000 active.
> lustre-OST0003-osc-89cdd14a2000 active.
> lustre-OST0004-osc-89cdd14a2000 active.
> lustre-OST0005-osc-89cdd14a2000 active.
> home-MDT-mdc-89adb7e5e000 active.
> lustre-MDT-mdc-89cdd14a2000 active.
> [root@tri-minihub-01 ~]#
>
>
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] Lustre /home lockup - how to check

2021-10-11 Thread Sid Young via lustre-discuss
>
>
>2. Tools to check a lustre (Sid Young)
>4. Re: Tools to check a lustre (Dennis Nelson)
>
>
My key issue is why /home locks solid when you try to use it but /lustre
is OK. The backend is ZFS, used to manage the disks presented from the HP
D8000 JBOD.
I'm at a loss, after 6 months of 100% operation, why this is suddenly
occurring. If I do repeated "dd" tasks on /lustre it works fine; start one
on /home and it locks solid.

I have started a ZFS scrub on two of the zfs pools; at 47T each it will
take most of today to complete, but that should rule out the actual storage
(which is showing NORMAL/ONLINE and no errors).

I'm seeing a lot of these in /var/log/messages
kernel: LustreError: 6578:0:(events.c:200:client_bulk_callback()) event
type 1, status -5, desc 89cdf3b9dc00
A google search returned this:
https://wiki.lustre.org/Lustre_Resiliency:_Understanding_Lustre_Message_Loss_and_Tuning_for_Resiliency

Could it be a network issue? The nodes are running the stock CentOS 7.9
drivers... the Mellanox one did not seem to make any difference when I
originally tried it 6 months ago.
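
Since status -5 looks like an -EIO on the bulk transfer, the read-only checks
I'm planning on both a client and the OSS are roughly (interface name as per
our 100G cards):

lnetctl net show -v
lnetctl stats show
# NIC-level drop/error counters
ethtool -S ens2f0 | grep -iE 'drop|discard|err'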

Any help appreciated :)

Sid


>
> -- Forwarded message --
> From: Sid Young 
> To: lustre-discuss 
> Cc:
> Bcc:
> Date: Mon, 11 Oct 2021 16:07:56 +1000
> Subject: [lustre-discuss] Tools to check a lustre
>
> I'm having trouble diagnosing where the problem lies in  my Lustre
> installation, clients are 2.12.6 and I have a /home and /lustre
> filesystems using Lustre.
>
> /home has 4 OSTs and /lustre is made up of 6 OSTs. lfs df shows all OSTs
> as ACTIVE.
>
> The /lustre file system appears fine, I can *ls *into every directory.
>
> When people log into the login node, it appears to lockup. I have shut
> down everything and remounted the OSTs and MDTs etc in order with no
> errors reporting but I'm getting the lockup issue soon after a few people
> log in.
> The backend network is 100G Ethernet using ConnectX5 cards and the OS is
> Cento 7.9, everything was installed as RPMs and updates are disabled in
> yum.conf
>
> Two questions to start with:
> Is there a command line tool to check each OST individually?
> Apart from /var/log/messages, is there a lustre specific log I can monitor
> on the login node to see errors when I hit /home...
>
>
>
> Sid Young
>
>
>
>
>
>
>
> -- Forwarded message --
> From: Dennis Nelson 
> To: Sid Young 
>
> Date: Mon, 11 Oct 2021 12:20:25 +
> Subject: Re: [lustre-discuss] Tools to check a lustre
> Have you tried lfs check servers on the login node?
>

Yes - one of the first things I did and this is what it always reports:

]# lfs check servers
home-OST-osc-89adb7e5e000 active.
home-OST0001-osc-89adb7e5e000 active.
home-OST0002-osc-89adb7e5e000 active.
home-OST0003-osc-89adb7e5e000 active.
lustre-OST-osc-89cdd14a2000 active.
lustre-OST0001-osc-89cdd14a2000 active.
lustre-OST0002-osc-89cdd14a2000 active.
lustre-OST0003-osc-89cdd14a2000 active.
lustre-OST0004-osc-89cdd14a2000 active.
lustre-OST0005-osc-89cdd14a2000 active.
home-MDT-mdc-89adb7e5e000 active.
lustre-MDT-mdc-89cdd14a2000 active.
[root@tri-minihub-01 ~]#
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] Tools to check a lustre

2021-10-11 Thread Sid Young via lustre-discuss
I'm having trouble diagnosing where the problem lies in my Lustre
installation. Clients are 2.12.6, and I have /home and /lustre filesystems
using Lustre.

/home has 4 OSTs and /lustre is made up of 6 OSTs. lfs df shows all OSTs as
ACTIVE.

The /lustre file system appears fine; I can ls into every directory.

When people log into the login node, it appears to lock up. I have shut down
everything and remounted the OSTs and MDTs etc. in order with no errors
reported, but I'm getting the lockup issue soon after a few people log in.
The backend network is 100G Ethernet using ConnectX5 cards and the OS is
CentOS 7.9; everything was installed as RPMs and updates are disabled in
yum.conf.

Two questions to start with:
Is there a command line tool to check each OST individually?
Apart from /var/log/messages, is there a Lustre-specific log I can monitor
on the login node to see errors when I hit /home?
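
For my own notes, the closest thing I've found to a Lustre-specific log on a
client is the kernel debug buffer, which can be dumped on demand, roughly:

# optionally widen the debug mask first
lctl set_param debug=+neterror
# dump the ring buffer to a file for inspection
lctl dk /tmp/lustre-debug.$(hostname).log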



Sid Young
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] eviction timeout

2021-10-10 Thread Sid Young via lustre-discuss
I'm seeing a lot of these messages:

Oct 11 11:12:09 hpc-mds-02 kernel: Lustre: lustre-MDT: Denying
connection for new client b6df7eda-8ae1-617c-6ff1-406d1ffb6006 (at
10.140.90.82@tcp), waiting for 6 known clients (0 recovered, 0 in progress,
and 0 evicted) to recover in 2:42

It seems to be a 3-minute timeout. Is it possible to shorten this, and even
to not log this message?
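
From what I can tell the countdown is the MDT recovery timer rather than an
eviction timeout as such; I haven't tuned it yet, but the current values look
readable with something like this (exact parameter names to be confirmed
against the manual):

lctl get_param mdt.lustre-MDT0000.recovery_time_soft
lctl get_param mdt.lustre-MDT0000.recovery_time_hard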

Sid Young
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] Missing OST's from 1 node only

2021-10-07 Thread Sid Young via lustre-discuss
G'Day all,

I have an odd situation where 1 compute node mounts /home and /lustre but
only half the OSTs are present, while all the other nodes are fine. Not
sure where to start on this one.
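
The first checks I intend to run from the bad node are LNet reachability to
each server and the state of the client-side imports, roughly (NIDs are
placeholders):

# can the bad node still reach every OSS/MDS over LNet?
lctl ping <oss-nid>
# what state are the imports in for the OSTs that are missing?
lctl get_param osc.home-OST0000*.import | grep -E 'state|current_connection'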

Good node:
[root@n02 ~]# lfs df
UUID   1K-blocksUsed   Available Use% Mounted on
home-MDT_UUID        4473970688    30695424  4443273216   1% /home[MDT:0]
home-OST_UUID       51097721856 39839794176 11257662464  78% /home[OST:0]
home-OST0001_UUID   51097897984 40967138304 10130627584  81% /home[OST:1]
home-OST0002_UUID   51097705472 37731089408 13366449152  74% /home[OST:2]
home-OST0003_UUID   51097773056 41447411712  9650104320  82% /home[OST:3]

filesystem_summary:  204391098368 159985433600 44404843520  79% /home

UUID   1K-blocksUsed   Available Use% Mounted on
lustre-MDT_UUID      5368816128    28246656  5340567424   1% /lustre[MDT:0]
lustre-OST_UUID  51098352640 10144093184 40954257408  20% /lustre[OST:0]
lustre-OST0001_UUID  51098497024  9584398336 41514096640  19% /lustre[OST:1]
lustre-OST0002_UUID  51098414080 11683002368 39415409664  23% /lustre[OST:2]
lustre-OST0003_UUID  51098514432 10475310080 40623202304  21% /lustre[OST:3]
lustre-OST0004_UUID  51098506240 11505326080 39593178112  23% /lustre[OST:4]
lustre-OST0005_UUID  51098429440  9272059904 41826367488  19% /lustre[OST:5]

filesystem_summary:  306590713856 62664189952 243926511616  21% /lustre

[root@n02 ~]#



The bad node:

[root@n04 ~]# lfs df
UUID   1K-blocksUsed   Available Use% Mounted on
home-MDT_UUID        4473970688    30726400  4443242240   1% /home[MDT:0]
home-OST0002_UUID   51097703424 37732352000 13363446784  74% /home[OST:2]
home-OST0003_UUID   51097778176 41449634816  9646617600  82% /home[OST:3]

filesystem_summary:  102195481600 79181986816 23010064384  78% /home

UUID   1K-blocksUsed   Available Use% Mounted on
lustre-MDT_UUID      5368816128    28246656  5340567424   1% /lustre[MDT:0]
lustre-OST0003_UUID  51098514432 10475310080 40623202304  21% /lustre[OST:3]
lustre-OST0004_UUID  51098511360 11505326080 39593183232  23% /lustre[OST:4]
lustre-OST0005_UUID  51098429440  9272059904 41826367488  19% /lustre[OST:5]

filesystem_summary:  153295455232 31252696064 122042753024  21% /lustre

[root@n04 ~]#



Sid Young
Translational Research Institute
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] Converting MGS to ZFS - HA Config Question

2021-05-27 Thread Sid Young via lustre-discuss
Hi,

I am in the process of converting my pre-production cluster to use ZFS, and
I have a question regarding HA config parameters. The storage node has 24
disks; I've sliced off two disks in HBA mode to act as a 960G mirror. The
command is:

# mkfs.lustre --reformat --mgs  --failnode 10.140.93.41@o2ib
--backfstype=zfs mgspool/mgt mirror d3710M0 d3710M1

This runs successfully and I get the output below. However, I want to make
sure the second MDS node can be failed over to as well using Pacemaker. So,
if the server I am on now is 10.140.93.42 and the other MDS is 10.140.93.41,
do I need to specify the host it's on now (.42) anywhere in the config? I
tried the servicenode parameter, but it refuses to accept servicenode and
failnode in the same command:

   Permanent disk data:
Target: MGS
Index:  unassigned
Lustre FS:
Mount type: zfs
Flags:  0x64
  (MGS first_time update )
Persistent mount opts:
Parameters: failover.node=10.140.93.41@o2ib
mkfs_cmd = zpool create -f -O canmount=off mgspool mirror d3710M0 d3710M1
mkfs_cmd = zfs create -o canmount=off  mgspool/mgt
  xattr=sa
  dnodesize=auto
Writing mgspool/mgt properties
  lustre:failover.node=10.140.93.41@o2ib
  lustre:version=1
  lustre:flags=100
  lustre:index=65535
  lustre:svname=MGS
[root@hpc-mds-02]#

]# zfs list
NAME  USED  AVAIL  REFER  MOUNTPOINT
mgspool   468K   860G96K  /mgspool
mgspool/mgt96K   860G96K  /mgspool/mgt
[root@hpc-mds-02 by-id]# zpool status
  pool: mgspool
 state: ONLINE
  scan: none requested
config:

NAME STATE READ WRITE CKSUM
mgspool  ONLINE   0 0 0
  mirror-0   ONLINE   0 0 0
d3710M0  ONLINE   0 0 0
d3710M1  ONLINE   0 0 0

errors: No known data errors
[root@hpc-mds-02#
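
In case it helps the next person, my reading of the man page is that the two
options are mutually exclusive, and that the HA-pair case is expressed by
repeating --servicenode once per server, so the command I'm planning to test
is (same pool and NIDs as above, untested as yet):

mkfs.lustre --reformat --mgs --backfstype=zfs \
    --servicenode=10.140.93.41@o2ib --servicenode=10.140.93.42@o2ib \
    mgspool/mgt mirror d3710M0 d3710M1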



Sid Young
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] lustre-discuss Digest, Vol 181, Issue 22

2021-04-29 Thread Sid Young via lustre-discuss
Three things:

1. Can you send your /etc/lnet.conf file?
2. Can you also send /etc/modprobe.d/lnet.conf?
3. Does a systemctl restart lnet produce an error?


Sid

On Fri, Apr 30, 2021 at 6:27 AM 
wrote:

> Send lustre-discuss mailing list submissions to
> lustre-discuss@lists.lustre.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
> or, via email, send a message with subject or body 'help' to
> lustre-discuss-requ...@lists.lustre.org
>
> You can reach the person managing the list at
> lustre-discuss-ow...@lists.lustre.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of lustre-discuss digest..."
> Today's Topics:
>
>1. Lustre client LNET problem from a novice (Yau Hing Tuen, Bill)
>
>
>
> -- Forwarded message --
> From: "Yau Hing Tuen, Bill" 
> To: lustre-discuss@lists.lustre.org
> Cc:
> Bcc:
> Date: Thu, 29 Apr 2021 15:23:51 +0800
> Subject: [lustre-discuss] Lustre client LNET problem from a novice
> Dear All,
>
>  Need some advice on the following situation: one of my servers
> (Lustre client only) could no longer connect to the Lustre server.
> Suspecting some problem on the LNET configuration, but I am too new to
> Lustre and does not have more clue on how to troubleshoot it.
>
> Kernel version: Linux 5.4.0-65-generic #73-Ubuntu SMP Mon Jan 18
> 17:25:17 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
> Lustre version: 2.14.0 (pulled from git)
> Lustre debs built with GCC 9.3.0 on the server.
>
> Modprobe not cleanly complete as static lnet configuration does not work:
> # modprobe -v lustre
> insmod /lib/modules/5.4.0-65-generic/updates/kernel/net/libcfs.ko
> insmod /lib/modules/5.4.0-65-generic/updates/kernel/net/lnet.ko
> networks="o2ib0(ibp225s0f0)"
> insmod /lib/modules/5.4.0-65-generic/updates/kernel/fs/obdclass.ko
> insmod /lib/modules/5.4.0-65-generic/updates/kernel/fs/ptlrpc.ko
> modprobe: ERROR: could not insert 'lustre': Network is down
>
>  So resort to try dynamic lnet configuration:
>
> # lctl net up
> LNET configure error 100: Network is down
>
> # lnetctl net show
> net:
>  - net type: lo
>local NI(s):
>  - nid: 0@lo
>status: up
>
> # lnetctl net add --net o2ib0 --if ibp225s0f0"
> add:
>  - net:
>errno: -100
>descr: "cannot add network: Network is down"
>
> Having these error messages in dmesg after the above "lnetctl net
> add" command
> [265979.237735] LNet: 3893180:0:(config.c:1564:lnet_inet_enumerate())
> lnet: Ignoring interface enxeeeb676d0232: it's down
> [265979.237738] LNet: 3893180:0:(config.c:1564:lnet_inet_enumerate())
> Skipped 9 previous similar messages
> [265979.238395] LNetError:
> 3893180:0:(o2iblnd.c:2655:kiblnd_hdev_get_attr()) Invalid mr size:
> 0x100
> [265979.267372] LNetError:
> 3893180:0:(o2iblnd.c:2869:kiblnd_dev_failover()) Can't get device
> attributes: -22
> [265979.298129] LNetError: 3893180:0:(o2iblnd.c:3353:kiblnd_startup())
> ko2iblnd: Can't initialize device: rc = -22
> [265980.353643] LNetError: 105-4: Error -100 starting up LNI o2ib
>
> Initial Diagnosis:
> # ip link show ibp225s0f0
> 41: ibp225s0f0:  mtu 2044 qdisc mq
> state UP mode DEFAULT group default qlen 256
>  link/infiniband
> 00:00:11:08:fe:80:00:00:00:00:00:00:0c:42:a1:03:00:79:99:1c brd
> 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
>
> # ip address show ibp225s0f0
> 41: ibp225s0f0:  mtu 2044 qdisc mq
> state UP group default qlen 256
>  link/infiniband
> 00:00:11:08:fe:80:00:00:00:00:00:00:0c:42:a1:03:00:79:99:1c brd
> 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
>  inet 10.10.10.3/16 brd 10.10.255.255 scope global ibp225s0f0
> valid_lft forever preferred_lft forever
>  inet6 fe80::e42:a103:79:991c/64 scope link
> valid_lft forever preferred_lft forever
>
> # ifconfig ibp225s0f0
> ibp225s0f0: flags=4163  mtu 2044
>  inet 10.10.10.3  netmask 255.255.0.0  broadcast 10.10.255.255
>  inet6 fe80::e42:a103:79:991c  prefixlen 64  scopeid 0x20
>  unspec 00-00-11-08-FE-80-00-00-00-00-00-00-00-00-00-00
> txqueuelen 256  (UNSPEC)
>  RX packets 14363998  bytes 1440476592 (1.4 GB)
>  RX errors 0  dropped 0  overruns 0  frame 0
>  TX packets 88  bytes 6648 (6.6 KB)
>  TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
>
> # lsmod | grep ib
> ko2iblnd  233472  0
> lnet  552960  3 ko2iblnd,obdclass
> libcfs487424  3 lnet,ko2iblnd,obdclass
> ib_umad28672  0
> ib_ipoib  110592  0
> rdma_cm61440  2 ko2iblnd,rdma_ucm
> ib_cm  57344  2 rdma_cm,ib_ipoib
> mlx5_ib   307200  0
> mlx_compat 65536  1 ko2iblnd
> ib_uverbs 126976  2 rdma_ucm,mlx5_ib
> ib_core   311296  9
> rdma_cm,ib_ipoib,ko2iblnd,iw_cm,ib

Re: [lustre-discuss] lustre-discuss Digest, Vol 180, Issue 23

2021-03-23 Thread Sid Young via lustre-discuss
>
> LNET on the failover node will be operational as its a separate service,
> you can check it as shown below and do a "lnetctl net show":
>

[root@hpc-mds-02 ~]# systemctl status lnet
● lnet.service - lnet management
   Loaded: loaded (/usr/lib/systemd/system/lnet.service; disabled; vendor
preset: disabled)
   Active: active (exited) since Mon 2021-03-08 15:19:07 AEST; 2 weeks 1
days ago
  Process: 25742 ExecStart=/usr/sbin/lnetctl import /etc/lnet.conf
(code=exited, status=0/SUCCESS)
  Process: 25738 ExecStart=/usr/sbin/lnetctl lnet configure (code=exited,
status=0/SUCCESS)
  Process: 25736 ExecStart=/sbin/modprobe lnet (code=exited,
status=0/SUCCESS)
 Main PID: 25742 (code=exited, status=0/SUCCESS)
   CGroup: /system.slice/lnet.service

Mar 08 15:19:07 hpc-mds-02 systemd[1]: Starting lnet management...
Mar 08 15:19:07 hpc-mds-02 systemd[1]: Started lnet management.
[root@hpc-mds-02 ~]#

It's only the disk management that is down on the failover node.

Sid

>
>
> Imho, LNET to a failover node _must_ fail, because LNET should not be
> up on the failover node, right?
>
> If I started LNET there, and some client does not get an answer
> quickly enough from the acting MDS, it
> would try the failover, LNET yes but Lustre no - that doesn't sound
> right.
>
>
> Regards,
> Thomas
>
> --
> 
> Thomas Roth
> Department: Informationstechnologie
> Location: SB3 2.291
> Phone: +49-6159-71 1453  Fax: +49-6159-71 2986
>
>
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] LVM support

2021-03-08 Thread Sid Young via lustre-discuss
G'Day all,

I put this question to another member of the list privately, but I thought
it would be good to ask the whole list: given that ZFS is supported for
managing a large pool of disks in a typical OSS node, could LVM be used with
a striped LV configuration?

Since LVM is rock solid and most likely already managing the file system of
each node in your cluster, what impact does an LVM LV have as an OST?
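
To make the question concrete, the sort of layout I have in mind is just a
striped LV across several JBOD disks, formatted as an ldiskfs OST (sizes and
names are illustrative only):

# stripe across 4 PVs in the OSS volume group (256 KiB stripe size)
lvcreate -i 4 -I 256 -L 40T -n ost0 vg_oss
# format the LV as an OST
mkfs.lustre --ost --fsname=lustre --index=0 --mgsnode=<mgs-nid> \
    --backfstype=ldiskfs /dev/vg_oss/ost0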


Sid Young
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] Performance over 100G ethernet

2021-03-08 Thread Sid Young via lustre-discuss
G'Day all,

What sort of transfer speeds should I expect to see from a client writing
a 1G file into the Lustre storage via RoCE on a 100G ConnectX5 (write block
size is 1M)?

I have done virtually no tuning yet, and the MTU is showing as "active 1024".

If you have any links to share with some performance benchmarks and config
examples that would be much appreciated.
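
In the meantime I was going to baseline the network itself with lnet_selftest
before blaming Lustre, roughly the stock recipe from the manual (NIDs are
placeholders for one client and one OSS):

modprobe lnet_selftest
export LST_SESSION=$$
lst new_session rw_test
lst add_group clients <client-nid>
lst add_group servers <oss-nid>
lst add_batch bulk_rw
lst add_test --batch bulk_rw --from clients --to servers brw write size=1M
lst run bulk_rw
lst stat clients servers    # Ctrl-C after it has printed a few samples
lst end_session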

Thanks


Sid Young
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] Solved - OSS Crash

2021-03-03 Thread Sid Young via lustre-discuss
A big thanks to Karsten - I've downgraded the kernels on two OSS nodes and
one of the MDS nodes to 3.10.0-1160.2.1.el7.x86_64 and placed the others in
standby. Everything has run overnight with 50,000 continuous
reads/writes/deletes per cycle and bulk deletes in a shell script running
continuously, and this morning it's all still up and running :)

Thanks everyone for your suggestions.

Next challenge RoCE over 100G ConnectX5 cards :)


Sid Young
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] OSS node crash/high CPU latency when deleting 100's of empty test files

2021-03-02 Thread Sid Young via lustre-discuss
Thx Karsten, looks like I found it at the same time you posted... I
will have a go at re-imaging with 1160.6.1 (the build updates to
1160.15.2) and re-testing.

Do you know if 2.14 will be released for CentOS 7.9?


Sid


Hi Sid,

if you are using a CentOS 7.9 kernel newer than
3.10.0-1160.6.1.el7.x86_64 then check out LU-14341 as these kernel
versions cause a timer related regression:
https://jira.whamcloud.com/browse/LU-14341

We learnt this the hard way during the last couple of days and
downgraded to kernel-3.10.0-1160.2.1.el7.x86_64 (which is the
officially supported kernel version of lustre 2.12.6). We use ZFS.
YMMV.

--
Karsten Weiss






Sid Young
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] OSS crashes - could be LU-14341

2021-03-02 Thread Sid Young via lustre-discuss
G'Day all,

Is 2.12.6 supported on CentOS 7.9?

After more investigation, I believe this is the issue I am seeing:
https://jira.whamcloud.com/browse/LU-14341

If there is a patch release built for 7.9, I am really happy to test it, as
it's easy to reproduce and crash the OSSs.


Sid Young
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] OSS Nodes crashing (and an MDS crash as well)

2021-03-02 Thread Sid Young via lustre-discuss
G'Day all,

As I reported in a previous email, my OSS nodes crash soon after initiating
a file creation script using "dd" in a loop and then trying to delete all
the files at once.
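
For completeness, the reproducer is nothing more sophisticated than this
(file count and sizes quoted from memory):

for i in $(seq 1 500); do
    dd if=/dev/zero of=/lustre/testfile-$i.text bs=1M count=100
done
rm -f /lustre/testfile-*.text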

At first I thought it was related to the Mellanox 100G cards, but after
rebuilding everything using just the 10G network I still get the crashes. I
have a crash dump file from the MDS, which crashed during the creates, and
the OSS crashed when I did the deletes.

This leads me to think Lustre 2.12.6 running on CentOS 7.9 has a subtle bug
somewhere.

I'm not sure how to progress this; should I attempt to try 2.13?
https://downloads.whamcloud.com/public/lustre/lustre-2.13.0/el7/patchless-ldiskfs-server/RPMS/x86_64/

Or build a fresh instance on a clean build of the OS?

Thoughts?


Sid Young
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] OSS node crash/high CPU latency when deleting 100's of empty test files

2021-03-01 Thread Sid Young via lustre-discuss
G'Day all,

I've been doing some file create/delete testing on our new Lustre storage,
which results in the OSS nodes crashing and rebooting due to high latency
issues.

I can reproduce it by running "dd" commands on the /lustre file system in a
for loop and then doing an rm -f testfile-*.text at the end.
This results in console errors on our DL385 OSS nodes (running CentOS 7.9)
which basically show a stack of mlx5_core and bnxt_en error messages (mlx5
being the Mellanox driver for the 100G ConnectX5 cards), followed by a stack
of:
  "NMI watchdog: BUG: soft lockup - CPU#N stuck for XXs"
where the CPU number is around 4 different ones and XX is typically
20-24 seconds... then the boxes reboot!

Before I log a support ticket with HPE, I'm going to try disabling the 100G
cards and see if it's repeatable via the 10G interfaces on the motherboards.
But before I do that, does anyone use the Mellanox ConnectX5 cards on their
Lustre storage nodes with Ethernet only, and if so, which driver are you
using and on which OS?

Thanks in advance!

Sid Young
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] servicenode /failnode

2021-02-25 Thread Sid Young via lustre-discuss
G'Day all,

I'm rebuilding my Lustre cluster again, and in doing so I am trying to
understand the role of the --servicenode option when creating an OST. There
is an example in the doco like this:

[root@rh7z-oss1 system]# mkfs.lustre --ost \
>   --fsname demo \
>   --index 0 \
>   --mgsnode 192.168.227.11@tcp1 \
>   --mgsnode 192.168.227.12@tcp1 \
>   --servicenode 192.168.227.21@tcp1 \
>   --servicenode 192.168.227.22@tcp1 \
>   /dev/dm-3

But it's not clear what the service node actually is.

Am I correct in saying the service nodes are the IPs (NIDs) of the two OSS
servers that can manage this particular OST (the HA pair)?



Sid Young
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] lustre-discuss Digest, Vol 179, Issue 20

2021-02-23 Thread Sid Young via lustre-discuss
Thanks for the replies. The nodes have multiple interfaces (four on the
compute nodes and 6 on the storage nodes); ens2f0 is the 100G Mellanox
ConnectX5 card in slot 2, and they are all running 2.12.6 using the RPMs
from the Lustre site.

I will remove one of the network definition files and add the lnetctl
--backup config to /etc/lnet.conf. I did try an export and noticed
it barfs on some of the parameters, but I did not try the --backup option,
so it gives me a few options to experiment with for minimising the config -
just a bit of trial and error.

I gather then that the lustre.conf file is not needed, just
/etc/modprobe.d/lnet.conf and /etc/lnet.conf.


Sid Young

>
> -- Forwarded message --
> From: "Degremont, Aurelien" 
> To: Sid Young , lustre-discuss <
> lustre-discuss@lists.lustre.org>
> Cc:
> Bcc:
> Date: Tue, 23 Feb 2021 08:47:27 +
> Subject: Re: [lustre-discuss] need to always manually add network after
> reboot
>
> Hello
>
>
>
> If I understand correctly, you're telling that you have 2 configuration
> files:
>
>
>
> /etc/modprobe.d/lnet.conf
>
> options lnet networks=tcp
>
>
>
> [root@hpc-oss-03 ~]# cat /etc/modprobe.d/lustre.conf
> options lnet networks="tcp(ens2f0)"
> options lnet ip2nets="tcp(ens2f0) 10.140.93.*
>
>
>
> That means you are declaring twice the "networks" option for "lnet" kernel
> module. I don't know how 'modprobe' will behave regarding that.
>
> If you have a very simple configuration, where your nodes only have one
> Ethernet interface "ens2f0", you only need the following lines, from the 3
> above:
>
>
>
> options lnet networks="tcp(ens2f0)"
>
>
>
> If this interface is the only Ethernet interface on your host, you don't
> even need a network specific setup. By default, when loading Lustre, in the
> absence of a network configuration, Lustre will automatically setup the
> only ethernet interface to use it for "tcp".
>
>
>
> Aurélien
>
>
>
>
>
> From: lustre-discuss on behalf of
> Sid Young via lustre-discuss
> Reply-To: Sid Young
> Date: Tuesday, 23 February 2021 at 06:59
> To: lustre-discuss
> Subject: [EXTERNAL] [lustre-discuss] need to always manually add network
> after reboot
>
>
>
>
>
>
>
> G'Day all,
>
> I'm finding that when I reboot any node in our new HPC, I need to keep
> manually adding the network using lnetctl net add --net tcp --if ens2f0
>
> Then I can do an lnetctl net show and see the tcp part active...
>
>
>
> I have options in  /etc/modprobe.d/lnet.conf
>
> options lnet networks=tcp
>
>
>
> and
>
>
>
> [root@hpc-oss-03 ~]# cat /etc/modprobe.d/lustre.conf
> options lnet networks="tcp(ens2f0)"
> options lnet ip2nets="tcp(ens2f0) 10.140.93.*
>
>
>
> I've read the doco and tried to understand the correct parameters for a
> simple Lustre config so this is what I worked out is needed... but I
> suspect its still wrong.
>
>
>
> Any help appreciated :)
>
>
>
>
>
>
>
> Sid Young
>
>
>
>
>
> -- Forwarded message --
> From: Angelos Ching 
> To: lustre-discuss@lists.lustre.org
> Cc:
> Bcc:
> Date: Tue, 23 Feb 2021 18:06:02 +0800
> Subject: Re: [lustre-discuss] need to always manually add network after
> reboot
>
> Hi Sid,
>
> Notice that you are using lnetctl net add to add the lnet network, which
> means you should be using a recent version of Lustre that depends on
> /etc/lnet.conf for boot time lnet configuration.
>
> You can save the current lnet configuration using command: lnetctl export
> --backup > /etc/lnet.conf (make a backup of the original file first if
> required)
>
> On next boot, lnet.service will load your lnet configuration from the file.
>
> Or you can manually build lnet.conf as lnetctl seems to have occasion
> problems with some of the fields exported by "lnetctl export --backup"
>
> Attaching my simple lnet.conf for your reference:
>
> # cat /etc/lnet.conf
> ip2nets:
>   - net-spec: o2ib
> ip-range:
>   0: 10.2.8.*
>   - net-spec: tcp
> ip-range:
>   0: 10.5.9.*
> route:
> - net: o2ib
>   gateway: 10.5.9.25@tcp
>   hop: -1
>   priority: 0
> - net: o2ib
>   gateway: 10.5.9.24@tcp
>   hop: -1
>   priority: 0
> global:
> numa_range: 0
> max_intf: 200
> discovery: 1
> drop_asym_route: 0
>
> Best regards,
> Angelos
>
>
>
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] need to always manually add network after reboot

2021-02-22 Thread Sid Young via lustre-discuss
G'Day all,
I'm finding that when I reboot any node in our new HPC, I need to keep
manually adding the network using lnetctl net add --net tcp --if ens2f0.
Then I can do an lnetctl net show and see the tcp part active...

I have options in  /etc/modprobe.d/lnet.conf
options lnet networks=tcp

and

[root@hpc-oss-03 ~]# cat /etc/modprobe.d/lustre.conf
options lnet networks="tcp(ens2f0)"
options lnet ip2nets="tcp(ens2f0) 10.140.93.*

I've read the doco and tried to understand the correct parameters for a
simple Lustre config, so this is what I worked out is needed... but I
suspect it's still wrong.
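
What I'm about to try next is capturing the runtime config once, after the
manual lnetctl net add, and letting lnet.service replay it at boot (this may
well not be the proper fix, so corrections welcome):

# snapshot the currently working LNet config
lnetctl export --backup > /etc/lnet.conf
# have the lnet service import it on boot
systemctl enable lnet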

Any help appreciated :)



Sid Young
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] MDS using D3710 DAS - partially Solved

2021-02-18 Thread Sid Young
After some investigation it looks like a timeout issue in the smartpqi
kernel module is causing the disks to be removed soon after they are
initially added, based on what is reported in "dmesg".

This issue first occurred in RHEL/CentOS 7.4 and should have been resolved
by CentOS 7.7. I've emailed the maintainer of the module and he's come
back to me with an offer to create a test driver to see if increasing the
timeout fixes the issue. There is an existing patch, but its version is
older than the one shipped in CentOS 7.9.

On the bright side, I've built and rebuilt the Lustre MDS and OSS config
several times as I optimise the installation while running under Pacemaker,
and I have been able to mount /lustre and /home on the compute nodes, so
this new system is 50% of the way there :)


Sid Young


> Today's Topics:
>
>    1. Re: MDS using D3710 DAS (Sid Young)
>2. Re: MDS using D3710 DAS (Christopher Mountford)
>
>
>
> ------ Forwarded message --
> From: Sid Young 
> To: Christopher Mountford ,
> lustre-discuss@lists.lustre.org
> Cc:
> Bcc:
> Date: Mon, 15 Feb 2021 08:42:43 +1000
> Subject: Re: [lustre-discuss] MDS using D3710 DAS
> Hi Christopher,
>
> Just some background, all servers are DL385's all servers are running the
> same image of Centos 7.9, The MDS HA pair have a SAS connected D3710 and
> the dual OSS HA pair have a D8000 each with 45 disks in each of them.
>
> The D3710 (which has 24x 960G SSD's) seams a bit hit and miss at
> presenting two LV's, I had setup a /lustre and /home which I was going to
> use ldiskfs rather than zfs however I am finding that the disks MAY present
> to both servers after some reboots but usually the first server to reboot
> see's the LV presented and the other only see's its local internal disks
> only, so the array appears to only present the LV's to one host most of the
> time.
>
> With the 4 OSS servers. i see the same issue, sometimes the LV's present
> and sometimes they don't.
>
> I was planning on setting up the OST's as ldiskfs as well, but I could
> also go zfs, my test bed system and my current HPC uses ldsikfs.
>
> Correct me if I am wrong, but disks should present to both servers all the
> time and using PCS I should be able to mount up a /lustre and /home one the
> first server while the disks present on the second server but no software
> is mounting them so there should be no issues?
>
>
> Sid Young
>
> On Fri, Feb 12, 2021 at 7:27 PM Christopher Mountford <
> cj...@leicester.ac.uk> wrote:
>
>> Hi Sid,
>>
>> We've a similar hardware configuration - 2 MDS pairs and 1 OSS pair which
>> each consist of 2 DL360 connected to a single D3700. However we are using
>> Lustre on ZFS with each array split into 2 or 4 zpools (depending on the
>> usage) and haven't seen any problems of this sort. Are you using ldiskfs?
>>
>> - Chris
>>
>>
>> On Fri, Feb 12, 2021 at 03:14:58PM +1000, Sid Young wrote:
>> >G'day all,
>> >Is anyone using a HPe D3710 with two HPeDL380/385 servers in a MDS HA
>> >Configuration? If so, is your D3710 presenting LV's to both servers
>> at
>> >the same time AND are you using PCS with the Lustre PCS Resources?
>> >I've just received new kit and cannot get disk to present to the MDS
>> >servers at the same time. :(
>> >Sid Young
>>
>> > ___
>> > lustre-discuss mailing list
>> > lustre-discuss@lists.lustre.org
>> > http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>>
>>
>
>
> -- Forwarded message --
> From: Christopher Mountford 
> To: Sid Young 
> Cc: Christopher Mountford ,
> lustre-discuss@lists.lustre.org
> Bcc:
> Date: Mon, 15 Feb 2021 10:44:10 +0000
> Subject: Re: [lustre-discuss] MDS using D3710 DAS
>
> Hi Sid.
>
> We use the D3700s (and our D8000s) as JBODS with zfs providing the
> redundancy - do you have some kind of hardware RAID? If so, are your raid
> controller the array corntrollers or on the HBAs? Off the top of my head,
> if the latter, there might be an issue with multiple HBAs trying to
> assemble the same RAID array?
>
> - Chris.
>
> On Mon, Feb 15, 2021 at 08:42:43AM +1000, Sid Young wrote:
> >Hi Christopher,
> >Just some background, all servers are DL385's all servers are running
> >the same image of Centos 7.9, The MDS HA pair have a SAS connected
> >D3710 and the dual OSS HA pair have a D8000 each with 45 disks in each
> >of them.
> >The D3710 (which ha

[lustre-discuss] lfs check now working

2021-02-18 Thread Sid Young
After some experiments and recreating the two filesystems, I now have lfs
check mds etc. working from the HPC clients :) Sorry to waste bandwidth.

Sid
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] cant check MDS?

2021-02-18 Thread Sid Young
G'Day all,

I'm slowly working through various issues and thought I would run a check
on the MDS node(s) after mounting a "lustre" fs and a "home" fs... but I get
an odd error:

/dev/sdd  3.7T  5.6M  3.4T   1% /mdt-lustre
/dev/sdc  2.6T  5.5M  2.4T   1% /mdt-home
[root@hpc-mds-02 ~]# lfs check mds
lfs check: cannot find mounted Lustre filesystem: No such device
[root@hpc-mds-02 ~]#


What am I doing wrong?


Sid Young
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] MGS IP in a HA cluster

2021-02-18 Thread Sid Young
Thanks for the clarification. :)


Sid Young
M: 0458 396300
W: https://off-grid-engineering.com
W: ( <https://z900collector.wordpress.com/restoration/>personal)
https://sidyoung.com/


On Thu, Feb 18, 2021 at 4:35 PM Indivar Nair 
wrote:

> Hi Sid,
>
> 1.
> -- You don't need a Cluster/Virtual IP for Lustre. Only the MGT, MDT and
> OST volumes need to be failed over.
> When these volumes are failed over to the other server, all the
> components of the Lustre file system are informed about this failover, and
> they will then continue accessing these volumes using the IP of the
> failover server.
>
> 2.
> -- No. MGS IP should also be in the same network(s) as MDS and OSS.
>
> If you are using IML for installation and management of Lustre, then this
> should be on a different network (for example, your 10G network (or a 1G
> network)).
>
> Regards,
>
>
> Indivar Nair
>
> On Thu, Feb 18, 2021 at 10:29 AM Sid Young  wrote:
>
>> G'day all,
>>
>> I'm trying to get my head around configuring a new Lustre 2.12.6 cluster
>> on Centos 7.9, in particular the correct IP(s) for the MGS.
>>
>> In a pacemaker based MDS cluster, when I define the IP for the HA, is
>> that the same IP used when referencing the MGS, or is the MGS IP only
>> specified by using the IP of both the MDS servers (assume dual MDS HA
>> cluster here)?
>>
>> And, if I have a 100G ethernet network (for RoCE) for Lustre usage and a
>> 10G network for server access is the MGS IP based around the 100G network
>> or my 10G network?
>>
>> Any help appreciated :)
>>
>> Sid Young
>>
>> ___
>> lustre-discuss mailing list
>> lustre-discuss@lists.lustre.org
>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>>
>
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] MGS IP in a HA cluster

2021-02-17 Thread Sid Young
G'day all,

I'm trying to get my head around configuring a new Lustre 2.12.6 cluster on
CentOS 7.9, in particular the correct IP(s) for the MGS.

In a Pacemaker-based MDS cluster, when I define the IP for the HA, is that
the same IP used when referencing the MGS, or is the MGS IP only specified
by using the IPs of both the MDS servers (assume a dual-MDS HA cluster here)?

And, if I have a 100G Ethernet network (for RoCE) for Lustre usage and a
10G network for server access, is the MGS IP based around the 100G network
or my 10G network?

Any help appreciated :)

Sid Young
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] MDS using D3710 DAS

2021-02-14 Thread Sid Young
Hi Christopher,

Just some background: all servers are DL385s and all are running the
same image of CentOS 7.9. The MDS HA pair have a SAS-connected D3710, and
the dual OSS HA pair have a D8000 each, with 45 disks in each of them.

The D3710 (which has 24x 960G SSDs) seems a bit hit and miss at presenting
the two LVs. I had set up a /lustre and a /home which I was going to use with
ldiskfs rather than zfs; however, I am finding that the disks MAY present to
both servers after some reboots, but usually the first server to reboot sees
the LV presented and the other only sees its local internal disks, so the
array appears to present the LVs to only one host most of the time.

With the 4 OSS servers I see the same issue: sometimes the LVs present
and sometimes they don't.

I was planning on setting up the OSTs as ldiskfs as well, but I could also
go zfs; my test bed system and my current HPC use ldiskfs.

Correct me if I am wrong, but the disks should present to both servers all
the time, and using PCS I should be able to mount /lustre and /home on the
first server while the disks also present on the second server; since no
software is mounting them there, there should be no issues?


Sid Young

On Fri, Feb 12, 2021 at 7:27 PM Christopher Mountford 
wrote:

> Hi Sid,
>
> We've a similar hardware configuration - 2 MDS pairs and 1 OSS pair which
> each consist of 2 DL360 connected to a single D3700. However we are using
> Lustre on ZFS with each array split into 2 or 4 zpools (depending on the
> usage) and haven't seen any problems of this sort. Are you using ldiskfs?
>
> - Chris
>
>
> On Fri, Feb 12, 2021 at 03:14:58PM +1000, Sid Young wrote:
> >G'day all,
> >Is anyone using a HPe D3710 with two HPeDL380/385 servers in a MDS HA
> >Configuration? If so, is your D3710 presenting LV's to both servers at
> >the same time AND are you using PCS with the Lustre PCS Resources?
> >    I've just received new kit and cannot get disk to present to the MDS
> >servers at the same time. :(
> >Sid Young
>
> > ___
> > lustre-discuss mailing list
> > lustre-discuss@lists.lustre.org
> > http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>
>
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] MDS using D3710 DAS

2021-02-11 Thread Sid Young
G'day all,

Is anyone using an HPE D3710 with two HPE DL380/385 servers in an MDS HA
configuration? If so, is your D3710 presenting LVs to both servers at the
same time, AND are you using PCS with the Lustre PCS resource agents?

I've just received new kit and cannot get the disks to present to the MDS
servers at the same time. :(


Sid Young
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] Metrics Gathering into ELK stack

2020-12-09 Thread Sid Young
G'Day all,

I am about to commission a new HPC over the holiday break, and as part of
the planning I am looking at metrics gathering for the Lustre cluster, most
likely into an Elastic/Kibana stack.

Are there any reliable/solid Lustre-specific metrics tools that can push
data to ELK,
OR
that can generate JSON strings of metrics I can push into more
bespoke monitoring solutions?

I am more interested in I/O metrics from the Lustre side of things, as I can
already gather disk/CPU/memory metrics with Metricbeat as needed in the
legacy HPC.
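
To give an idea of the raw data I'm after, the per-target I/O counters are
already exposed as text via lctl, so worst case I can scrape them myself and
reshape them into JSON (all read-only):

# per-OST counters on an OSS
lctl get_param obdfilter.*.stats
lctl get_param obdfilter.*.brw_stats
# client-side I/O counters on a compute node
lctl get_param llite.*.stats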


Sid Young
W: https://off-grid-engineering.com
W: ( <https://z900collector.wordpress.com/restoration/>personal)
https://sidyoung.com/
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] Lustre via 100G Ethernet or Infiniband

2020-09-14 Thread Sid Young
With the growth of 100G Ethernet, is it better to connect a Lustre file
server via EDR 100G InfiniBand or 100G Ethernet for a 32-node HPC cluster
running a typical life sciences/genomics workload?
Thoughts anyone?


Sid Young
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] RA's found

2020-07-13 Thread Sid Young
Please ignore my last email; I discovered I had the resource agent RPM but
had not installed it.



Sid Young
W: https://off-grid-engineering.com
W: ( <https://z900collector.wordpress.com/restoration/>personal)
https://sidyoung.com/
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] Pacemaker resource Agents

2020-07-13 Thread Sid Young
I've been trying to locate the Lustre-specific Pacemaker resource agents,
but I've had no luck on GitHub where they were meant to be hosted; maybe I
am looking at the wrong project?

Has anyone recently implemented an HA Lustre cluster using Pacemaker, and did
you use Lustre-specific RAs?

Thanks in advance!


Sid Young
W: https://off-grid-engineering.com
W: ( <https://z900collector.wordpress.com/restoration/>personal)
https://sidyoung.com/
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] SOLVED - new client locks up on ls /lustre

2020-07-09 Thread Sid Young
SOLVED - Rebuilt the MDT and OST disks, changed /etc/fstab to have rw flag
set explicitly and rebooted everything. Clients now mount and OSTs come up
as active when I run "lfs check servers".


Sid Young
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] new install client locks up on ls /lustre

2020-07-08 Thread Sid Young
Hi all,

I'm new-ish to Lustre and I've just created a Lustre 2.12.5 cluster using
the RPMs from Whamcloud for CentOS 7.8, with 1 MDT/MGS and 1 OSS with 3
OSTs (20GB each).
Everything is formatted as ldiskfs and it's running on a VMware platform as
a test bed using tcp.
The MDT mounts OK, the OSTs mount, and on my client I can mount the /lustre
mount point (58GB) and I can ping everything via LNet. However, as soon
as I try to do an ls -l /lustre or any kind of I/O, the client locks solid
until I reboot it.

I've tried to work out how to run basic diagnostics, to no avail, so I am
stumped as to why I don't see a directory listing for what should be an
empty 60G disk.

On the MDS I ran this:
[root@lustre-mds tests]# lctl  dl
  0 UP osd-ldiskfs lustre-MDT-osd lustre-MDT-osd_UUID 10
  1 UP mgs MGS MGS 8
  2 UP mgc MGC10.140.95.118@tcp acdb253b-b7a8-a949-0bf2-eaa17dc8dca4 4
  3 UP mds MDS MDS_uuid 2
  4 UP lod lustre-MDT-mdtlov lustre-MDT-mdtlov_UUID 3
  5 UP mdt lustre-MDT lustre-MDT_UUID 12
  6 UP mdd lustre-MDD lustre-MDD_UUID 3
  7 UP qmt lustre-QMT lustre-QMT_UUID 3
  8 UP lwp lustre-MDT-lwp-MDT lustre-MDT-lwp-MDT_UUID 4
  9 UP osp lustre-OST-osc-MDT lustre-MDT-mdtlov_UUID 4
 10 UP osp lustre-OST0001-osc-MDT lustre-MDT-mdtlov_UUID 4
 11 UP osp lustre-OST0002-osc-MDT lustre-MDT-mdtlov_UUID 4
[root@lustre-mds tests]#

So it looks like everything is running; even dmesg on the client reports:

[7.998649] Lustre: Lustre: Build Version: 2.12.5
[8.016113] LNet: Added LNI 10.140.95.65@tcp [8/256/0/180]
[8.016214] LNet: Accept secure, port 988
[   10.992285] Lustre: Mounted lustre-client


Any pointers on where to look? /var/log/messages shows no errors.


Sid Young
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org