Re: [lustre-discuss] kernel threads for rpcs in flight

2024-05-02 Thread Andreas Dilger via lustre-discuss
On May 2, 2024, at 18:10, Anna Fuchs <anna.fu...@uni-hamburg.de> wrote: The number of ptlrpc threads per CPT is set by the "ptlrpcd_partner_group_size" module parameter, and defaults to 2 threads per CPT, IIRC. I don't think that clients dynamically start/stop ptlrpcd threads at
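
A minimal sketch of adjusting such a module parameter, assuming ptlrpcd_partner_group_size belongs to the ptlrpc module (not stated in the snippet above):

    # /etc/modprobe.d/lustre.conf -- applied the next time the modules load
    options ptlrpc ptlrpcd_partner_group_size=2

    # check the current value on a running client
    cat /sys/module/ptlrpc/parameters/ptlrpcd_partner_group_size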

Re: [lustre-discuss] kernel threads for rpcs in flight

2024-04-30 Thread Andreas Dilger via lustre-discuss
On Apr 29, 2024, at 02:36, Anna Fuchs <anna.fu...@uni-hamburg.de> wrote: Hi Andreas. Thank you very much, that helps a lot. Sorry for the confusion, I primarily meant the client. The servers rarely have to compete with anything else for CPU resources I guess. The mechanism to start new

Re: [lustre-discuss] [EXTERNAL] [BULK] Files created in append mode don't obey directory default stripe count

2024-04-29 Thread Andreas Dilger via lustre-discuss
Simon is exactly correct. This is expected behavior for files opened with O_APPEND, at least until LU-12738 is implemented. Since O_APPEND writes are (by definition) entirely serialized, having multiple stripes on such files is mostly useless and just adds overhead. Feel free to read

Re: [lustre-discuss] kernel threads for rpcs in flight

2024-04-28 Thread Andreas Dilger via lustre-discuss
On Apr 28, 2024, at 16:54, Anna Fuchs via lustre-discuss <lustre-discuss@lists.lustre.org> wrote: The setting max_rpcs_in_flight affects, among other things, how many threads can be spawned simultaneously for processing the RPCs, right? The {osc,mdc}.*.max_rpcs_in_flight are actually
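
For reference, these tunables are read and set per target with lctl on the client; the value 16 below is only an illustration:

    lctl get_param osc.*.max_rpcs_in_flight mdc.*.max_rpcs_in_flight
    lctl set_param osc.*.max_rpcs_in_flight=16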

Re: [lustre-discuss] ko2iblnd.conf

2024-04-12 Thread Andreas Dilger via lustre-discuss
The ko2iblnd-opa settings are only used if you have Intel OPA instead of Mellanox cards (depends on the ko2iblnd-probe script). You should still have ko2iblnd line in the server config that is used for MLX cards in order to set the values to match on both sides. As for the actual settings,

Re: [lustre-discuss] ko2iblnd.conf

2024-04-11 Thread Andreas Dilger via lustre-discuss
On Apr 11, 2024, at 09:56, Daniel Szkola via lustre-discuss <lustre-discuss@lists.lustre.org> wrote: Hello all, I recently discovered some mismatches in our /etc/modprobe.d/ko2iblnd.conf files between our clients and servers. Is it now recommended to keep the defaults on this module

Re: [lustre-discuss] Could not read from remote repository

2024-04-09 Thread Andreas Dilger via lustre-discuss
On Apr 9, 2024, at 04:16, Jannek Squar via lustre-discuss <lustre-discuss@lists.lustre.org> wrote: Hey, I tried to clone the source code via `git clone git://git.whamcloud.com/fs/lustre-release.git` but got an error: """ fatal: Could not read from remote repository. Please make sure

Re: [lustre-discuss] Building Lustre against Mellanox OFED

2024-03-16 Thread Andreas Dilger via lustre-discuss
On Mar 15, 2024, at 09:18, Paul Edmon via lustre-discuss <lustre-discuss@lists.lustre.org> wrote: I'm working on building Lustre 2.15.4 against recent versions of Mellanox OFED. I built OFED against the specific kernel and then install mlnx-ofa_kernel-modules for that specific kernel.
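
A sketch of the usual way to point the Lustre build at MOFED; /usr/src/ofa_kernel/default is the typical MLNX_OFED install path and may differ on other setups:

    ./configure --with-linux=/usr/src/kernels/$(uname -r) \
                --with-o2ib=/usr/src/ofa_kernel/default
    make rpms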

Re: [lustre-discuss] The confusion for mds hardware requirement

2024-03-11 Thread Andreas Dilger via lustre-discuss
All of the numbers in this example are estimates/approximations to give an idea about the amount of memory that the MDS may need under normal operating circumstances. However, the MDS will also continue to function with more or less memory. The actual amount of memory in use will change very

Re: [lustre-discuss] The confusion for mds hardware requirement

2024-03-10 Thread Andreas Dilger via lustre-discuss
These numbers are just estimates, you can use values more suitable to your workload. Similarly, 32-core clients may be on the low side these days. NVIDIA DGX nodes have 256 cores, though you may not have 1024 of them. The net answer is that having 64GB+ of RAM is inexpensive these days and

Re: [lustre-discuss] Issues draining OSTs for decommissioning

2024-03-07 Thread Andreas Dilger via lustre-discuss
It's almost certainly just internal files. You could mount the OST as ldiskfs and run "ls -lR" to check. Cheers, Andreas > On Mar 6, 2024, at 22:23, Scott Wood via lustre-discuss > wrote: > > Hi folks, > > Time to empty some OSTs to shut down some old arrays. I've been following > the docs from
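
A sketch of that check with a hypothetical device name; object data lives under the O/ directory, so anything outside it is internal bookkeeping:

    mount -t ldiskfs -o ro /dev/sdX /mnt/ost0
    ls -lR /mnt/ost0/O | less
    umount /mnt/ost0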

Re: [lustre-discuss] lustre-client-dkms-2.15.4 is still checking for python2

2024-02-06 Thread Andreas Dilger via lustre-discuss
I've cherry-picked patch https://review.whamcloud.com/53947 "LU-15655 contrib: update branch_comm to python3" to b2_15 to avoid this issue in the future. This script is for developers and does not affect functionality of the filesystem at all.

Re: [lustre-discuss] ldiskfs / mdt size limits

2024-02-03 Thread Andreas Dilger via lustre-discuss
Thomas, You are exactly correct that large MDTs can be useful for DoM if you have HDD OSTs. The benefit is relatively small if you have NVMe OSTs. If the MDT is larger than 16TB it must be formatted with the extents feature to address block numbers over 2^32. Unfortunately, this is _slightly_

Re: [lustre-discuss] Lustre github mirror out of sync

2024-01-26 Thread Andreas Dilger via lustre-discuss
No particular reason. I normally sync the github tree manually after Oleg lands patches to master, but forgot to do it the last couple of times. It's been updated now. Thanks for pointing it out. On Jan 26, 2024, at 00:55, Tommi Tervo wrote: > > Is sync between

Re: [lustre-discuss] Odd behavior with tunefs.lustre and device index

2024-01-24 Thread Andreas Dilger via lustre-discuss
This is more like a bug report and should be filed in Jira. That said, no guarantee that someone would be able to work on this in a timely manner. On Jan 24, 2024, at 09:47, Backer via lustre-discuss <lustre-discuss@lists.lustre.org> wrote: Just pushing it on to the top of inbox :) Or

Re: [lustre-discuss] OST still has inodes and size after deleting all files

2024-01-19 Thread Andreas Dilger via lustre-discuss
On Jan 19, 2024, at 13:48, Pavlo Khmel via lustre-discuss <lustre-discuss@lists.lustre.org> wrote: Hi, I'm trying to remove 4 OSTs. # lfs osts OBDS: 0: cluster-OST0000_UUID ACTIVE 1: cluster-OST0001_UUID ACTIVE 2: cluster-OST0002_UUID ACTIVE 3: cluster-OST0003_UUID ACTIVE . . . I

Re: [lustre-discuss] lustre-client-dkms-2.15.4 is still checking for python2

2024-01-19 Thread Andreas Dilger via lustre-discuss
It looks like there may be a couple of test tools that are referencing python2, but it definitely isn't needed for normal operation. Are you using the lustre-client binary or the lustre-client-dkms? Only one is needed. For the short term it would be possible to override this dependency, but

Re: [lustre-discuss] Lustre errors asking for help

2024-01-17 Thread Andreas Dilger via lustre-discuss
Roman, have you tried running e2fsck on the underlying device ("-fn" to start)? It is usually best to run with the latest version of e2fsprogs as it has most fixes. It is definitely strange that all OSTs are reporting errors at the same time, which makes me wonder how the underlying hardware

Re: [lustre-discuss] LNet Multi-Rail config - with BODY!

2024-01-16 Thread Andreas Dilger via lustre-discuss
Hello Gwen, I'm not a networking expert, but it seems entirely possible that the MR discovery in 2.12.9 isn't doing as well as what is in 2.15.3 (or 2.15.4 for that matter). It would make more sense to have both nodes running the same (newer) version before digging too deeply into this. We

Re: [lustre-discuss] Mixing ZFS and LDISKFS

2024-01-12 Thread Andreas Dilger via lustre-discuss
All of the OSTs and MDTs are "independently managed" (have their own connection state between each client and target) so this should be possible, though I don't know of sites that are doing this. Possibly this makes sense to put NVMe flash OSTs on ldiskfs, and HDD OSTs on ZFS, and then put

Re: [lustre-discuss] Recommendation on number of OSTs

2024-01-12 Thread Andreas Dilger via lustre-discuss
I would recommend *not* to use too many OSTs as this causes fragmentation of the free space, and excess overhead in managing the connections. Today, single OSTs can be up to 500TiB in size (or larger, though not necessarily optimal for performance). Depending on your cluster size and total

Re: [lustre-discuss] Mixing ZFS and LDISKFS

2024-01-12 Thread Andreas Dilger via lustre-discuss
Yes, some systems use ldiskfs for the MDT (for performance) and ZFS for the OSTs (for low-cost RAID). The IOPS performance of ZFS is low vs. ldiskfs, but the streaming bandwidth is fine. Cheers, Andreas > On Jan 12, 2024, at 08:40, Backer via lustre-discuss > wrote: > >  > Hi, > > Could

Re: [lustre-discuss] Symbols not found in newly built lustre?

2024-01-11 Thread Andreas Dilger via lustre-discuss
On Jan 10, 2024, at 02:37, Jan Andersen <j...@comind.io> wrote: I am running Rocky 8.9 (uname -r: 4.18.0-513.9.1.el8_9.x86_64) and have, apparently successfully, built the lustre rpms: [root@mds lustre-release]# ll *2.15.4-1.el8.x86_64.rpm -rw-r--r--. 1 root root 4640828 Jan 10 09:19

Re: [lustre-discuss] 2.15.4 o2iblnd on RoCEv2?

2024-01-10 Thread Andreas Dilger via lustre-discuss
Granted that I'm not an LNet expert, but "errno: -1 descr: cannot parse net '<255:65535>' " doesn't immediately lead me to the same conclusion as if "unknown interface 'ib0' " were printed for the error message. Also "errno: -1" is "-EPERM = Operation not permitted", and doesn't give the same

Re: [lustre-discuss] 2.15.4 o2iblnd on RoCEv2?

2024-01-10 Thread Andreas Dilger via lustre-discuss
It would seem that the error message could be improved in this case? Could you file an LU ticket for that with the reproducer below, and ideally along with a patch? Cheers, Andreas > On Jan 10, 2024, at 11:37, Jeff Johnson > wrote: > > Man am I an idiot. Been up all night too many nights

Re: [lustre-discuss] Extending Lustre file system

2024-01-08 Thread Andreas Dilger via lustre-discuss
I would recommend *against* mounting all 175 OSTs at the same time. There are (or at least were*) some issues with the MGS registration RPCs timing out when too many config changes happen at once. Your "mount and wait 2 sec" is more robust and doesn't take very much time (a few minutes) vs.
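
That sequential approach is easy to script; device names and mount points below are purely illustrative:

    for dev in /dev/mapper/ost*; do
        mount -t lustre "$dev" "/mnt/lustre-targets/$(basename "$dev")"
        sleep 2
    done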

Re: [lustre-discuss] Extending Lustre file system

2024-01-08 Thread Andreas Dilger via lustre-discuss
The need to rebalance depends on how full the existing OSTs are. My recommendation if you know that the data will continue to grow is to add new OSTs when the existing ones are at 60-70% full, and add them in larger groups rather than one at a time. Cheers, Andreas > On Jan 8, 2024, at

Re: [lustre-discuss] Building lustre on rocky 8.8 fails?

2024-01-06 Thread Andreas Dilger via lustre-discuss
Why not download the matching kernel and Lustre RPMs together? I would recommend RHEL8 servers as the most stable, RHEL9 hasn't been run for very long as a Lustre server. On Jan 5, 2024, at 02:41, Jan Andersen <j...@comind.io> wrote: Hi Xinliang and Andreas, Thanks for helping with

Re: [lustre-discuss] Error: GPG check FAILED when trying to install e2fsprogs

2024-01-03 Thread Andreas Dilger via lustre-discuss
Sorry, those packages are not signed, you'll just have to install them without a signature. Cheers, Andreas > On Jan 3, 2024, at 09:10, Jan Andersen wrote: > > I have finally managed to build the lustre rpms, but when I try to install > them with: > > dnf install ./*.rpm > > I get a list
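
In practice that means disabling the GPG check for the install transaction, e.g.:

    dnf install --nogpgcheck ./*.rpm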

Re: [lustre-discuss] Building lustre on rocky 8.8 fails?

2024-01-02 Thread Andreas Dilger via lustre-discuss
Try 2.15.4, as it may fix the EL8.8 build issue. Cheers, Andreas > On Jan 2, 2024, at 07:30, Jan Andersen wrote: > > I have installed Rocky 8.8 on a new server (Dell PowerEdge R640): > > [root@mds 4.18.0-513.9.1.el8_9.x86_64]# cat /etc/*release* > Rocky Linux release 8.8 (Green Obsidian) >

Re: [lustre-discuss] Lustre server still try to recover the lnet reply to the depreciated clients

2023-12-08 Thread Andreas Dilger via lustre-discuss
If you are evicting a client by NID, then use the "nid:" keyword: lctl set_param mdt.*.evict_client=nid:10.68.178.25@tcp Otherwise it is expecting the input to be in the form of a client UUID (to allow evicting a single export from a client mounting the filesystem multiple times). That

Re: [lustre-discuss] Error messages (ex: not available for connect from 0@lo) on server boot with Lustre 2.15.3 and 2.15.4-RC1

2023-12-07 Thread Andreas Dilger via lustre-discuss
Aurelien, there have been a number of questions about this message. > Lustre: lustrevm-OST0001: deleting orphan objects from 0x0:227 to 0x0:513 This is not marked LustreError, so it is just an advisory message. This can sometimes be useful for debugging issues related to MDT->OST connections.

Re: [lustre-discuss] Lustre caching and NUMA nodes

2023-12-05 Thread Andreas Dilger via lustre-discuss
On Dec 4, 2023, at 15:06, John Bauer <bau...@iodoctors.com> wrote: I have an OSC caching question. I am running a dd process which writes an 8GB file. The file is on lustre, striped 8x1M. This is run on a system that has 2 NUMA nodes (cpu sockets). All the data is apparently stored

Re: [lustre-discuss] Debian 11: configure fails

2023-12-04 Thread Andreas Dilger via lustre-discuss
Which version of Lustre are you trying to build? On Dec 4, 2023, at 05:48, Jan Andersen <j...@comind.io> wrote: My system: root@debian11:~/lustre-release# uname -r 5.10.0-26-amd64 Lustre: git clone git://git.whamcloud.com/fs/lustre-release.git I'm building the client with:

Re: [lustre-discuss] Error messages (ex: not available for connect from 0@lo) on server boot with Lustre 2.15.3 and 2.15.4-RC1

2023-12-04 Thread Andreas Dilger via lustre-discuss
It wasn't clear from your mail which message(s) you are concerned about. These look like normal mount message(s) to me. The "error" is pretty normal, it just means there were multiple services starting at once and one wasn't yet ready for the other. LustreError: 137-5:

Re: [lustre-discuss] OST is not mounting

2023-11-07 Thread Andreas Dilger via lustre-discuss
The OST went read-only because that is what happens when the block device disappears underneath it. That is a behavior of ext4 and other local filesystems as well. If you look in the console logs you would see SCSI errors and the filesystem being remounted read-only. To have reliability in

[lustre-discuss] Possible change to "lfs find -size" default units?

2023-11-05 Thread Andreas Dilger via lustre-discuss
I've recently realized that "lfs find -size N" defaults to looking for files of N *bytes* by default, unlike regular find(1) that is assuming 512-byte blocks by default if no units are given. I'm wondering if it would be disruptive to users if the default unit for -size was changed to 512-byte
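
To make the difference concrete (values are arbitrary):

    lfs find /mnt/lustre -size +100    # today: files larger than 100 bytes
    find /mnt/lustre -size +100        # GNU find: larger than 100 512-byte blocks
    lfs find /mnt/lustre -size +100M   # explicit units avoid the ambiguity either way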

Re: [lustre-discuss] Lustre-Manual on lfsck - non-existing entries?

2023-10-31 Thread Andreas Dilger via lustre-discuss
On Oct 31, 2023, at 13:12, Thomas Roth via lustre-discuss wrote: > > Hi all, > > after starting an `lctl lfsck_start -A -C -o` and the oi_scrub having > completed, I would check the layout scan as described in the Lustre manual, > "36.4.3.3. LFSCK status of layout via procfs", by > > >
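
The layout LFSCK status that manual section describes is normally read with lctl rather than via the /proc path directly; a sketch, with the parameter name taken from the manual:

    lctl get_param -n mdd.*.lfsck_layout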

Re: [lustre-discuss] very slow mounts with OSS node down and peer discovery enabled

2023-10-26 Thread Andreas Dilger via lustre-discuss
I can't comment on the LNet peer discovery part, but I would definitely not recommend leaving the lnet_transaction_timeout that low for normal usage. This can cause messages to be dropped while the server is processing them and introduce failures needlessly. Cheers, Andreas > On Oct 26,

Re: [lustre-discuss] re-registration of MDTs and OSTs

2023-10-24 Thread Andreas Dilger via lustre-discuss
On Oct 18, 2023, at 13:04, Peter Grandi via lustre-discuss <lustre-discuss@lists.lustre.org> wrote: So I have been upgrading my one and only MDT to a larger ZFS pool, by the classic route of creating a new pool, new MDT, and then 'zfs send'/'zfs receive' for the copy over (BTW for those

Re: [lustre-discuss] setting quotas from within a container

2023-10-21 Thread Andreas Dilger via lustre-discuss
Hi Lisa, The first question to ask is which Lustre version you are using? Second, are you using subdirectory mounts or other UID/GID mapping for the container? That could happen at both the Lustre level or by the kernel itself. If you aren't sure, you could try creating a new file as root

Re: [lustre-discuss] mount not possible: "no server support"

2023-10-19 Thread Andreas Dilger via lustre-discuss
On Oct 19, 2023, at 19:58, Benedikt Alexander Braunger via lustre-discuss <lustre-discuss@lists.Lustre.org> wrote: Hi Lustrers, I'm currently struggling with an unmountable Lustre filesystem. The client only says "no server support", no further logs on client or server. I first thought

Re: [lustre-discuss] backup restore docs not quite accurate?

2023-10-18 Thread Andreas Dilger via lustre-discuss
Removing the OI files is for ldiskfs backup/restore (eg. after tar/untar) when the inode numbers are changed. That is not needed for ZFS send/recv because the inode numbers stay the same after such an operation. If that isn't clear in the manual it should be fixed. Cheers, Andreas > On Oct

Re: [lustre-discuss] OSS on compute node

2023-10-13 Thread Andreas Dilger via lustre-discuss
On Oct 13, 2023, at 20:58, Fedele Stabile <fedele.stab...@fis.unical.it> wrote: Hello everyone, We are in progress to integrate Lustre on our little HPC Cluster and we would like to know if it is possible to use the same node in a cluster to act as an OSS with disks and to also use it

Re: [lustre-discuss] Ongoing issues with quota

2023-10-10 Thread Andreas Dilger via lustre-discuss
There is a $ROOT/.lustre/lost+found that you could check. What does "lfs df -i" report for the used inode count? Maybe it is RBH that is reporting the wrong count? The other alternative would be to mount the MDT filesystem directly as type ZFS and see what df -i and find report? Cheers,

Re: [lustre-discuss] Ongoing issues with quota

2023-10-09 Thread Andreas Dilger via lustre-discuss
The quota accounting is controlled by the backing filesystem of the OSTs and MDTs. For ldiskfs/ext4 you could run e2fsck to re-count all of the inode and block usage. For ZFS you would have to ask on the ZFS list to see if there is some way to re-count the quota usage. The "inode" quota is

Re: [lustre-discuss] OST went back in time: no(?) hardware issue

2023-10-04 Thread Andreas Dilger via lustre-discuss
On Oct 3, 2023, at 16:22, Thomas Roth via lustre-discuss wrote: > > Hi all, > > in our Lustre 2.12.5 system, we have "OST went back in time" after OST > hardware replacement: > - hardware had reached EOL > - we set `max_create_count=0` for these OSTs, searched for and migrated off > the

Re: [lustre-discuss] Failing build of lustre client on Debian 12

2023-10-04 Thread Andreas Dilger via lustre-discuss
On Oct 4, 2023, at 16:26, Jan Andersen <j...@comind.io> wrote: Hi, I've just successfully built the lustre 2.15.3 client on Debian 11 and need to do the same on Debian 12; however, configure fails with: checking if Linux kernel was built with CONFIG_FHANDLE in or as module... no
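
The configure check can be reproduced by hand against the running kernel's config to confirm whether CONFIG_FHANDLE is really missing:

    grep CONFIG_FHANDLE /boot/config-$(uname -r)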

Re: [lustre-discuss] Cannot mount MDT after upgrading from Lustre 2.12.6 to 2.15.3

2023-10-01 Thread Andreas Dilger via lustre-discuss
On Oct 1, 2023, at 00:36, Tung-Han Hsieh via lustre-discuss wrote: > I should apologize for replying late. Here I would like to clarify why in my > opinion the Lustre ldiskfs code is not self-contained. > > In the past, to compile lustre with ldiskfs, we needed to patch Linux kernel > using

Re: [lustre-discuss] Adding lustre clients into the Debian

2023-10-01 Thread Andreas Dilger via lustre-discuss
On Oct 1, 2023, at 05:54, Arman Khalatyan via lustre-discuss wrote: > > Hello everyone, > > We are in the process of integrating the Lustre client into Debian. Are there > any legal concerns or significant obstacles to this? We're curious why it > hasn't been included in the official Debian

Re: [lustre-discuss] Adding lustre clients into the Debian

2023-10-01 Thread Andreas Dilger via lustre-discuss
On Oct 1, 2023, at 05:54, Arman Khalatyan via lustre-discuss <lustre-discuss@lists.lustre.org> wrote: Hello everyone, We are in the process of integrating the Lustre client into Debian. Are there any legal concerns or significant obstacles to this? We're curious why it hasn't been

Re: [lustre-discuss] Cannot mount MDT after upgrading from Lustre 2.12.6 to 2.15.3

2023-09-28 Thread Andreas Dilger via lustre-discuss
On Sep 26, 2023, at 13:44, Audet, Martin via lustre-discuss <lustre-discuss@lists.lustre.org> wrote: Hello all, I would appreciate if the community would give more attention to this issue because upgrading from 2.12.x to 2.15.x, two LTS versions, is something that we can expect many

Re: [lustre-discuss] No port 988?

2023-09-26 Thread Andreas Dilger via lustre-discuss
On Sep 26, 2023, at 06:12, Jan Andersen <j...@comind.io> wrote: Hi, I've built and installed lustre on two VirtualBoxes running Rocky 8.8 and formatted one as the MGS/MDS and the other as OSS, following a presentation from Oak Ridge National Laboratory: "Creating a Lustre Test System

Re: [lustre-discuss] [BULK] Re: [EXTERNAL] Re: Data recovery with lost MDT data

2023-09-25 Thread Andreas Dilger via lustre-discuss
Probably using "stat" on each file is slow, since this is getting the file size from each OST object. You could try the "xstat" utility in the lustre-tests RPM (or build it directly) as it will only query the MDS for the requested attributes (owner at minimum). Then you could split into

Re: [lustre-discuss] [EXTERNAL EMAIL] Re: Lustre 2.15.3: patching the kernel fails

2023-09-22 Thread Andreas Dilger via lustre-discuss
On Sep 22, 2023, at 01:45, Jan Andersen <j...@comind.io> wrote: Hi Andreas, Thank you for your insightful reply. I didn't know Rocky; I see there's a version 9 as well - is ver 8 better, since it is more mature? There is an el9.2 ldiskfs series that would likely also apply to the

Re: [lustre-discuss] [EXTERNAL] Re: Data recovery with lost MDT data

2023-09-22 Thread Andreas Dilger via lustre-discuss
On Sep 21, 2023, at 16:06, Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.] <darby.vicke...@nasa.gov> wrote: I knew an lfsck would identify the orphaned objects. That’s great that it will move those objects to an area we can triage. With ownership still intact (and I assume time

Re: [lustre-discuss] Data recovery with lost MDT data

2023-09-21 Thread Andreas Dilger via lustre-discuss
In the absence of backups, you could try LFSCK to link all of the orphan OST objects into .lustre/lost+found (see lctl-lfsck_start.8 man page for details). The data is still in the objects, and they should have UID/GID/PRJID assigned (if used) but they have no filenames. It would be up to you
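
A rough sketch of that orphan-recovery run; the fsname "testfs" and the mount point are placeholders, and lctl-lfsck_start.8 has the full option list:

    mds# lctl lfsck_start -M testfs-MDT0000 -t layout -o -A
    client# ls -l /mnt/lustre/.lustre/lost+found/MDT0000/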

Re: [lustre-discuss] File size discrepancy on lustre

2023-09-15 Thread Andreas Dilger via lustre-discuss
Are you using any file mirroring (FLR, "lfs mirror extend") on the files, perhaps before the "lfs getstripe" was run? On Sep 15, 2023, at 08:12, Kurt Strosahl via lustre-discuss <lustre-discuss@lists.lustre.org> wrote: Good Morning, We have encountered a very odd issue. Where

Re: [lustre-discuss] Getting started with Lustre on RHEL 8.8

2023-09-12 Thread Andreas Dilger via lustre-discuss
On Sep 12, 2023, at 22:31, Cyberxstudio cxs <cyberxstudio.cl...@gmail.com> wrote: Hi I get this error while installing lustre and other packages [root@localhost ~]# yum --nogpgcheck --enablerepo=lustre-server install \ > kmod-lustre-osd-ldiskfs \ > lustre-dkms \ >

Re: [lustre-discuss] Getting started with Lustre on RHEL 8.8

2023-09-12 Thread Andreas Dilger via lustre-discuss
Hello, The preferred path to set up Lustre depends on what you are planning to do with it? If for regular usage it is easiest to start with RPMs built for the distro from

Re: [lustre-discuss] questions about group locks / LDLM_FL_NO_TIMEOUT flag

2023-08-30 Thread Andreas Dilger via lustre-discuss
You can't directly dump the holders of a particular lock, but it is possible to dump the list of FIDs that each client has open. mds# lctl get_param mdt.*.exports.*.open_files | egrep "=|FID" | grep -B1 FID That should list all client NIDs that have FID open. It shouldn't be possible for

Re: [lustre-discuss] question about rename operation ?

2023-08-16 Thread Andreas Dilger via lustre-discuss
Any directory renames where it is not just a simple name change (ie. parent directory is not the same for both source and target) the MDS thread doing the rename will take the LDLM "big filesystem lock" (BFL), which is a specific FID for global rename serialization. This ensures that there is

Re: [lustre-discuss] getting without inodes

2023-08-11 Thread Andreas Dilger via lustre-discuss
The t0 filesystem OSTs are formatted for an average file size of 70TB / 300M inodes = 240KB/inode. The t1 filesystem OSTs are formatted for an average file size of 500TB / 65M inodes = 7.7MB/inode. So not only are the t1 OSTs larger, but they have fewer inodes (by a factor of 32x). This must

Re: [lustre-discuss] Pool_New Naming Error

2023-08-08 Thread Andreas Dilger via lustre-discuss
On Aug 8, 2023, at 18:41, Baucum, Rashun via lustre-discuss <lustre-discuss@lists.lustre.org> wrote: Hello, I am running into an issue when attempting to setup pooling. The commands are being run on a server that hosts the MDS and MGS: # lctl dl 0 UP osd-ldiskfs lfs1-MDT0000-osd

Re: [lustre-discuss] how does lustre handle node failure

2023-07-22 Thread Andreas Dilger via lustre-discuss
Shawn, Lustre handles the largest filesystems in the world, hundreds of PB in size, so there are definitely Lustre filesystems with hundreds of servers. In large storage clusters the servers failover in pairs or quads, since the storage is typically not on a single global SAN for all nodes to

Re: [lustre-discuss] File system global quota

2023-07-20 Thread Andreas Dilger via lustre-discuss
Probably the closest that could be achieved like this would be to set the ldiskfs reserved space on the OSTs like: tune2fs -m 10 /dev/sdX That sets the root reserved space to 10% of the filesystem, and non-root users wouldn't be able to allocate blocks once the filesystem hits 90% full. This

Re: [lustre-discuss] Old Lustre Filesystem migrate to newer servers

2023-07-19 Thread Andreas Dilger via lustre-discuss
Wow, Lustre 1.6 is really old, released in 2009. Even Lustre 2.6 would be pretty old, released in 2014. While there haven't been a *lot* of on-disk format changes over the years, there was a fairly significant change in Lustre 2.0 that would probably make upgrading the filesystem directly to

Re: [lustre-discuss] New client mounts fail after deactivating OSTs

2023-07-18 Thread Andreas Dilger via lustre-discuss
Brian, Please file a ticket in LUDOC with details of how the manual should be updated. Ideally, including a patch. :-) Cheers, Andreas On Jul 11, 2023, at 15:39, Brad Merchant wrote:  We recreated the issue in a test cluster and it was definitely the llog_cancel steps that caused the

Re: [lustre-discuss] Use of lazystatfs

2023-07-05 Thread Andreas Dilger via lustre-discuss
On Jul 5, 2023, at 07:14, Mike Mosley via lustre-discuss <lustre-discuss@lists.lustre.org> wrote: Hello everyone, We have drained some of our OSS/OSTs and plan to deactivate them soon. The process ahead leads us to a couple of questions that we hope somebody can advise us on.

Re: [lustre-discuss] Rocky 9.2/lustre 2.15.3 client questions

2023-06-23 Thread Andreas Dilger via lustre-discuss
Applying the LU-16626 patch locally should fix the issue, and has no risk since it is only fixing a build issue that affects an obscure diagnostic tool. That said, I've cherry-picked that patch back to b2_15, so it should be included into 2.15.4. https://review.whamcloud.com/51426 Cheers,

Re: [lustre-discuss] CentOS Stream 8/9 support?

2023-06-22 Thread Andreas Dilger via lustre-discuss
On Jun 22, 2023, at 06:58, Will Furnass via lustre-discuss <lustre-discuss@lists.lustre.org> wrote: Hi, I imagine that many here might have seen RedHat's announcement yesterday about ceasing to provide sources for EL8 and EL9 to those who aren't paying customers (see [1] - CentOS 7

Re: [lustre-discuss] No space left on device MDT DoM but not full nor run out of inodes

2023-06-22 Thread Andreas Dilger via lustre-discuss
There is a bug in the grant accounting that leaks under certain operations (maybe O_DIRECT?). It is resolved by unmounting and remounting the clients, and/or upgrading. There was a thread about it on lustre-discuss a couple of years ago. Cheers, Andreas On Jun 20, 2023, at 09:32, Jon

Re: [lustre-discuss] Data stored in OST

2023-05-22 Thread Andreas Dilger via lustre-discuss
Yes, the OSTs must provide internal redundancy - RAID-6 typically. There is File Level Redundancy (FLR = mirroring) possible in Lustre file layouts, but it is "unmanaged", so users or other system-level tools are required to resync FLR files if they are written after mirroring. Cheers,
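
For completeness, the unmanaged FLR workflow looks roughly like this; re-syncing after writes is left to the user or an external tool, and the file path is a placeholder:

    lfs mirror extend -N /mnt/lustre/file     # add a second mirror copy
    lfs mirror resync /mnt/lustre/file        # re-synchronize stale mirrors after writes
    lfs mirror verify /mnt/lustre/file        # optional consistency check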

Re: [lustre-discuss] mlx5 errors on oss

2023-05-18 Thread Andreas Dilger via lustre-discuss
I can't comment on the specific network issue, but in general it is far better to use the MOFED drivers than the in-kernel ones. Cheers, Andreas > On May 18, 2023, at 09:08, Nehring, Shane R [LAS] via lustre-discuss > wrote: > > Hello all, > > We recently added infiniband to our cluster

Re: [lustre-discuss] [EXTERNAL] Re: storing Lustre jobid in file xattrs: seeking feedback

2023-05-15 Thread Andreas Dilger via lustre-discuss
edding the jobname itself, perhaps just a least significant 7 character sha-1 hash of the jobname. Small chance of collision, easy to decode/cross reference to jobid when needed. Just a thought. --Jeff On Fri, May 12, 2023 at 3:08 PM Andreas Dilger via lustre-discuss

Re: [lustre-discuss] storing Lustre jobid in file xattrs: seeking feedback

2023-05-12 Thread Andreas Dilger via lustre-discuss
Hi Thomas, thanks for working on this functionality and raising this question. As you know, I'm inclined toward the user.job xattr, but I think it is never a good idea to unilaterally make policy decisions in the kernel that cannot be changed. As such, it probably makes sense to have a tunable

Re: [lustre-discuss] Missing Files in /proc/fs/lustre after Upgrading to Lustre 2.15.X

2023-05-04 Thread Andreas Dilger via lustre-discuss
On May 4, 2023, at 16:43, Jane Liu via lustre-discuss <lustre-discuss@lists.lustre.org> wrote: Hi, We previously had a monitoring tool in Lustre 2.12.X that relied on files located under /proc/fs/lustre for gathering metrics. However, after upgrading our system to version 2.15.2, we

Re: [lustre-discuss] question mark when listing file after the upgrade

2023-05-03 Thread Andreas Dilger via lustre-discuss
This looks like https://jira.whamcloud.com/browse/LU-16655 causing problems after the upgrade from 2.12.x to 2.15.[012] breaking the Object Index files. A patch for this has already been landed to b2_15 and will be included in 2.15.3. If you've hit this issue, then you need to backup/delete the

Re: [lustre-discuss] Recovering MDT failure

2023-04-28 Thread Andreas Dilger via lustre-discuss
On Apr 27, 2023, at 02:12, Ramiro Alba Queipo <ramiro.a...@upc.edu> wrote: Hi everybody, I have lustre 2.15.0 using Oracle on servers and Ubuntu 20.04 at clients. I have one MDT on a raid 1 SSD and the two disks have failed, so all the data is apparently lost. - Is there a remote

Re: [lustre-discuss] [EXTERNAL] Mounting lustre on block device

2023-04-05 Thread Andreas Dilger via lustre-discuss
On Mar 16, 2023, at 17:02, Jeff Johnson <jeff.john...@aeoncomputing.com> wrote: If you *really* want a block device on a client that resides in Lustre you *could* create a file in Lustre and then make that file a loopback device with losetup. Of course, your mileage will vary *a lot*
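
A minimal sketch of that loopback approach with placeholder sizes and paths; the performance caveats quoted above still apply:

    truncate -s 10G /mnt/lustre/blockfile
    LOOPDEV=$(losetup -f --show /mnt/lustre/blockfile)
    mkfs.xfs "$LOOPDEV"                 # or any other local filesystem
    mount "$LOOPDEV" /mnt/loopfs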

Re: [lustre-discuss] Joining files

2023-03-30 Thread Andreas Dilger via lustre-discuss
Based on your use case, I don't think file join will be a suitable solution. There is a limit on the number of files that can be joined (about 2000) and this would make for an unusual file format (something like a tar file, but would need special tools to access). It would also be very

Re: [lustre-discuss] Joining files

2023-03-29 Thread Andreas Dilger via lustre-discuss
Patrick, once upon a time there was "file join" functionality in Lustre that was ancient and complex, and was finally removed in 2009. There are still a few remnants of this like "MDS_OPEN_JOIN_FILE" and "LOV_MAGIC_JOIN_V1" defined, but unused. That functionality long predated composite file

Re: [lustre-discuss] About Lustre small files performace(8k) improve

2023-03-27 Thread Andreas Dilger via lustre-discuss
Are your performance tests on NFS or on native Lustre clients? Native Lustre clients will likely be faster, and with many clients they can create files in parallel, even in the same directory. With a single NFS server they will be limited by the VFS locking for a single directory. Are you

Re: [lustre-discuss] DNE v3 and directory inode changing

2023-03-24 Thread Andreas Dilger via lustre-discuss
On Mar 24, 2023, at 13:20, Bertschinger, Thomas Andrew Hjorth <bertschin...@lanl.gov> wrote: Thanks, this is helpful. We certainly don't need the auto-split feature and were just experimenting with it, so this should be fine for us. And we have been satisfied with the round robin

Re: [lustre-discuss] DNE v3 and directory inode changing

2023-03-23 Thread Andreas Dilger via lustre-discuss
The DNE auto-split functionality is disabled by default and not fully completed (e.g. preserve inode numbers) because it had issues with significant performance impact/latency while splitting a directory that was currently in use (which is exactly when you would want to use it), so I wouldn't

Re: [lustre-discuss] Lustre project quotas and project IDs

2023-03-22 Thread Andreas Dilger via lustre-discuss
Of course my preference would be a contribution to improving the name-projid mapping in the "lfs project" command under LU-13335 so this would also help other Lustre users manage their project IDs. One proposal I had in LU-13335 that I would welcome feedback on was if a name or projid did not
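
For reference, today's numeric workflow that a name-to-projid mapping would sit on top of looks roughly like this (the ID and paths are examples only):

    lfs project -p 1001 -s -r /mnt/lustre/projects/alpha   # assign ID 1001, inheritable, recursive
    lfs project -d /mnt/lustre/projects/alpha              # display the assigned ID
    lfs setquota -p 1001 -B 10T /mnt/lustre                # block hard limit for that project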

Re: [lustre-discuss] Repeated ZFS panics on MDT

2023-03-17 Thread Andreas Dilger via lustre-discuss
It's been a while since I've worked with ZFS servers, but one old chestnut that caused problems with ZFS 0.7 on the MDTs was the variable dnode size feature. I believe there was a tunable, something like "dnodesize=auto" that caused problems, and this could be changed to "dnodesize=1024" or

Re: [lustre-discuss] Lustre project quotas and project IDs

2023-03-16 Thread Andreas Dilger via lustre-discuss
On Mar 16, 2023, at 04:50, Passerini Marco <marco.passer...@cscs.ch> wrote: By trial and error, I found that, when using project quotas, the maximum ID available is 4294967294. Is this correct? Yes, the "-1" ID is reserved for error conditions. If I assign quota to a lot of project

Re: [lustre-discuss] Node Failure in Lustre

2023-03-15 Thread Andreas Dilger via lustre-discuss
No, because the remote-attached SSDs are part of the ZFS pool and any drive failures a t that level are the responsibility of ZFS in that case to manage the failed drives (eg. with RAID) and for you to have system monitors in place to detect this case and alert you to the drive failures. This

Re: [lustre-discuss] Slow Lustre traffic failover issue

2023-03-10 Thread Andreas Dilger via lustre-discuss
On Mar 4, 2023, at 02:50, 覃江龙 via lustre-discuss wrote: > > Dear Developer, > I hope this message finds you well. I am currently working with a Lustre file > system installed on two nodes, with a mounted client and NFS connection to > the Lustre client directory. When I generated traffic into

Re: [lustre-discuss] Renaming or Moving directories on Lustre?

2023-02-27 Thread Andreas Dilger via lustre-discuss
On Feb 27, 2023, at 11:57, Grigory Shamov <grigory.sha...@umanitoba.ca> wrote: Hi All, What happens if a directory on Lustre FS gets moved with a regular CentOS7 mv command, within the same filesystem? On CentOS 7, using mv from the distro, like this, as root: mv /project/TEMP/user

Re: [lustre-discuss] Question about lustre deduplication?

2023-02-27 Thread Andreas Dilger via lustre-discuss
On Feb 27, 2023, at 05:59, yuehui gan via lustre-discuss <lustre-discuss@lists.lustre.org> wrote: hello all does lustre have deduplication? is this in the development plan? thanks Lustre on ZFS has deduplication at the ZFS level. There is no deduplication across OSTs. Cheers,

Re: [lustre-discuss] Access times for file (file heat)

2023-02-25 Thread Andreas Dilger via lustre-discuss
Anna, for monitoring server storage access, the client-side file heat is not actually very useful, because (a) it is spread across all of the clients, and (b) it shows the client-side access patterns (which may be largely from cache) and not the actual server storage access which is what is

Re: [lustre-discuss] lfs setstripe with stripe_count=0

2023-02-24 Thread Andreas Dilger via lustre-discuss
On Feb 21, 2023, at 10:26, John Bauer <bau...@iodoctors.com> wrote: Something doesn't make sense to me when using lfs setstripe when specifying 0 for the stripe_count. This first command works as expected. The pool is the one specified, 2_hdd, and the -c 0 results in a stripe_count

Re: [lustre-discuss] Access times for file (file heat)

2023-02-18 Thread Andreas Dilger via lustre-discuss
Anna, there was a client-side file heat mechanism added a few years ago, but I don't know if it is fully functional today. lctl get_param llite.*.*heat* llite.myth-979380fc1800.file_heat=1 llite.myth-979380fc1800.heat_decay_percentage=80 llite.myth-979380fc1800.heat_period_second=60

Re: [lustre-discuss] Full List of Required Open Lustre Ports?

2023-02-02 Thread Andreas Dilger via lustre-discuss
Ellis, the addition of dynamic conns_per_peer for TCP connections is relatively new. There would be no performance "regression" against earlier Lustre releases (which always had the equivalent of conns_per_peer=1), just not additional performance gains for high-speed Ethernet interfaces. The

Re: [lustre-discuss] Mistake while removing an OST

2023-02-02 Thread Andreas Dilger via lustre-discuss
You should follow the documented process, that's why it is documented. All targets need to be unmounted to make it work properly. On Feb 2, 2023, at 01:08, BALVERS Martin <martin.balv...@danone.com> wrote: Hi Andreas, Thank you for answering. Can I just run the ‘tunefs.lustre

Re: [lustre-discuss] Mistake while removing an OST

2023-02-01 Thread Andreas Dilger via lustre-discuss
You should just be able to run the "writeconf" process to regenerate the config logs. The removed OST will not re-register with the MGS, but all of the other servers will, so it should be fine. Cheers, Andreas On Feb 1, 2023, at 03:48, BALVERS Martin via lustre-discuss wrote:  Hi, I have
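
A rough outline of the writeconf procedure (see the manual for the complete steps); device paths are placeholders, and all clients and targets must be unmounted first:

    mds# tunefs.lustre --writeconf /dev/mdt_device      # MGS/MDT first
    oss# tunefs.lustre --writeconf /dev/ost_device      # then every remaining OST
    # remount in the same order: MGS/MDT, then OSTs, then clients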

Re: [lustre-discuss] Monitoring Lustre IOPS on OSTs

2023-01-24 Thread Andreas Dilger via lustre-discuss
Yes, each RPC will increment these stats counters by one. Traditional "IOPS" are measured with 4KB read or write, but in this case the IO sizes are variable. Also, the client may aggregate multiple disjoint writes into a single RPC. This can be seen in the osd-ldiskfs.*.brw_stats as
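
For example, the aggregated RPC sizes mentioned above can be inspected on the OSS with:

    oss# lctl get_param osd-ldiskfs.*.brw_stats
    oss# lctl get_param obdfilter.*.stats | grep -E 'read|write'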

Re: [lustre-discuss] User find out OST configuration

2023-01-23 Thread Andreas Dilger via lustre-discuss
On Jan 23, 2023, at 10:01, Anna Fuchs <anna.fu...@uni-hamburg.de> wrote: Thanks! Is it planned to introduce some metric propagation to the user? For advanced users who are benchmarking stuff on remote systems it remains unclear which performance to expect if they can not access