Re: [lustre-discuss] [EXTERNAL] [BULK] Files created in append mode don't obey directory default stripe count

2024-04-29 Thread Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.] via lustre-discuss
options can be used to specify special default striping for files created with O_APPEND. On Mon, Apr 29, 2024 at 11:21 AM Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.] via lustre-discuss <lustre-discuss@lists.lustre.org> wrote: Wow, I would say that is definitely not
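
For reference, newer Lustre releases expose MDS-side tunables that control the striping of files created with O_APPEND; the parameter names below are assumed from memory, so verify them against the documentation for your release:

  # assumed tunables; check "lctl get_param mdd.*.append*" on your MDS first
  $ lctl set_param mdd.*.append_stripe_count=1   # force 1-stripe layouts for O_APPEND creates
  $ lctl set_param mdd.*.append_pool=flash       # or direct O_APPEND creates to a named OST pool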

Re: [lustre-discuss] [EXTERNAL] [BULK] Files created in append mode don't obey directory default stripe count

2024-04-29 Thread Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.] via lustre-discuss
Wow, I would say that is definitely not expected. I can recreate this on both of our LFSs. One is community Lustre 2.14, the other is a DDN EXAScaler. Shown below is our community Lustre, but we also have a 3-segment PFL on our EXAScaler and the behavior is the same there. $ echo > aaa $
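
A minimal reproduction of the reported behavior on a directory with a default stripe count might look like this (paths and stripe counts are illustrative):

  $ lfs setstripe -c 4 /mnt/lustre/testdir        # directory default: 4 stripes
  $ echo hi >  /mnt/lustre/testdir/aaa            # normal create
  $ echo hi >> /mnt/lustre/testdir/bbb            # create with O_APPEND
  $ lfs getstripe -c /mnt/lustre/testdir/aaa      # expected: 4
  $ lfs getstripe -c /mnt/lustre/testdir/bbb      # reported: 1, ignoring the directory default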

Re: [lustre-discuss] [EXTERNAL] [BULK] MDS hardware - NVME?

2024-01-08 Thread Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.] via lustre-discuss
you have Hardware RAID running on the JBOD and SAS HBA on the servers, but for a total software solution I’m unaware of how that will work effectively. Thank you. On 5 Jan 2024, at 14:07, Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.] via lustre-discuss wrote: We are in the process

Re: [lustre-discuss] [EXTERNAL] [BULK] MDS hardware - NVME?

2024-01-05 Thread Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.] via lustre-discuss
We are in the process of retiring two long-standing LFSs (about 8 years old), which we built and managed ourselves. Both use ZFS and have the MDTs on SSDs in a JBOD that require the kind of software-based management you describe, in our case ZFS pools built on multipath devices. The MDT in
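
As a sketch of the software-managed MDT layout being described, with hypothetical multipath device and pool names:

  # mirrored ZFS pool for the MDT built on dm-multipath devices (names illustrative)
  $ zpool create -o multihost=on mdt0pool \
      mirror /dev/mapper/mpatha /dev/mapper/mpathb \
      mirror /dev/mapper/mpathc /dev/mapper/mpathd
  $ mkfs.lustre --mdt --backfstype=zfs --fsname=lfs1 --index=0 \
      --mgsnode=mgs@tcp mdt0pool/mdt0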

Re: [lustre-discuss] [EXTERNAL] [BULK] Re: Ongoing issues with quota

2023-10-10 Thread Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.] via lustre-discuss
> I don’t have a .lustre directory at the filesystem root. It's there, but doesn't show up even with 'ls -a'. If you cd into it or ls it, it's there. Lustre magic. :) -Original Message- From: lustre-discuss <lustre-discuss-boun...@lists.lustre.org> on behalf of Daniel Szkola
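
The hidden .lustre directory can be confirmed by naming it explicitly (mount point is illustrative):

  $ ls -a /mnt/lustre            # .lustre is not listed
  $ ls -d /mnt/lustre/.lustre    # but it resolves if you name it
  $ ls /mnt/lustre/.lustre/fid   # and contains the fid directory used for FID lookups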

Re: [lustre-discuss] [BULK] Re: [EXTERNAL] Re: Data recovery with lost MDT data

2023-09-27 Thread Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.] via lustre-discuss
in perl and I think the output is buffered so I’m waiting to see the first output.) Other suggestions welcome if you have ideas how to move these files into subdirectories more efficiently. From: lustre-discuss <lustre-discuss-boun...@lists.lustre.org> on behalf of "Vicke
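
One unbuffered, shell-only alternative for batching files into subdirectories, offered as a sketch (batch size and naming are made up):

  # move files from the current directory into sub1, sub2, ... with 10000 files each
  $ i=0; n=0
  $ for f in *; do
      [ $((i % 10000)) -eq 0 ] && { n=$((n+1)); mkdir -p "sub$n"; }
      mv -- "$f" "sub$n/"
      i=$((i+1))
    done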

Re: [lustre-discuss] [BULK] Re: [EXTERNAL] Re: Data recovery with lost MDT data

2023-09-26 Thread Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.] via lustre-discuss
the first output.) Other suggestions welcome if you have ideas how to move these files into subdirectories more efficiently. From: lustre-discuss on behalf of "Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.] via lustre-discuss" Reply-To: "Vicker, Darby J. (JSC-EG111)[Ja

Re: [lustre-discuss] [BULK] Re: [EXTERNAL] Re: Data recovery with lost MDT data

2023-09-25 Thread Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.] via lustre-discuss
Suggestions welcome if you have ideas how to move these files into subdirectories more efficiently. From: lustre-discuss on behalf of "Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.] via lustre-discuss" Reply-To: "Vicker, Darby J. (JSC-EG111)[Jacobs Technology

Re: [lustre-discuss] [BULK] Re: [EXTERNAL] Re: Data recovery with lost MDT data

2023-09-25 Thread Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.] via lustre-discuss
lustre-discuss on behalf of "Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.] via lustre-discuss" Reply-To: "Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.]" Date: Friday, September 22, 2023 at 2:49 PM To: Andreas Dilger Cc: "lustre-discuss@lists.lustre.org"

Re: [lustre-discuss] [EXTERNAL] [BULK] mds and mst are lost

2023-09-25 Thread Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.] via lustre-discuss
Sorry to hear this. We are dealing with MDT data loss right now as well and it's no fun. Please look at my posts to the list from last week for some more information about what we are doing to recover. Our situation is not as bad as yours, we only lost part of our MDT data (the last 3 months

Re: [lustre-discuss] [EXTERNAL] Re: Data recovery with lost MDT data

2023-09-22 Thread Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.] via lustre-discuss
"user.job" xattr on every object, exactly to help identify the provenance of files after the fact (regardless of whether there is corruption), but it only just landed to master and will be in 2.16. That is cold comfort, but would help in the future. Cheers, Andreas On Sep 20, 2023

Re: [lustre-discuss] [EXTERNAL] Re: Data recovery with lost MDT data

2023-09-21 Thread Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.] via lustre-discuss
and will be in 2.16. That is cold comfort, but would help in the future. Cheers, Andreas On Sep 20, 2023, at 15:34, Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.] via lustre-discuss wrote: Hello, We have recently accidentally deleted some of our MDT data. I think it's gone for good but looki
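
The "user.job" xattr mentioned above could presumably be read from a client with standard tools once 2.16 is available; the path below is illustrative:

  # inspect the job-provenance xattr described in the thread (2.16+, per the message)
  $ getfattr -n user.job /mnt/lustre/path/to/file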

[lustre-discuss] Data recovery with lost MDT data

2023-09-20 Thread Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.] via lustre-discuss
Hello, We have recently accidentally deleted some of our MDT data. I think it's gone for good but looking for advice to see if there is any way to recover. Thoughts appreciated. We run two LFSs on the same set of hardware. We didn’t set out to do this, but it kind of evolved. The original

Re: [lustre-discuss] [EXTERNAL] [BULK] ldiskfs patch rejected Rocky 8.6

2023-07-21 Thread Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.] via lustre-discuss
We have been using ZFS on our LFS for about the last 7 years. Back then, we were using ZFS 0.7 and lustre 2.10 and there was a significant decrease in metadata performance compared to ldiskfs. Most of our workflows at the time didn’t need a lot of metadata performance and so we were OK with

Re: [lustre-discuss] [EXTERNAL] Re: Joining files

2023-03-30 Thread Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.] via lustre-discuss
> Instead, my recommendation would be to use an ext4 filesystem image to hold > the many small files (during create, if from a single client, or aggregated > after they are created). Later, this filesystem image could be mounted > read-only on multiple clients for access. Also, the whole image
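
A rough sketch of the filesystem-image approach being recommended, with illustrative sizes and paths:

  # build an ext4 image that holds the many small files, then publish it read-only
  $ truncate -s 10G /mnt/lustre/smallfiles.img
  $ mkfs.ext4 -F /mnt/lustre/smallfiles.img
  $ mkdir -p /mnt/img && mount -o loop /mnt/lustre/smallfiles.img /mnt/img
  # ... populate /mnt/img, then umount /mnt/img ...
  # on each client that needs read access:
  $ mount -o loop,ro /mnt/lustre/smallfiles.img /mnt/img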

Re: [lustre-discuss] Help with recovery of data

2022-06-28 Thread Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.] via lustre-discuss
An update. We were able to recover our filesystem (minus the two days between when the ZFS swap occurred and when we detected it and shut down the filesystem). Simply promoting the cloned ZFS volume (which was really our primary volume) and cleaning up the snapshot and clone got us back to
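
The recovery step described amounts to something like the following, with hypothetical pool and dataset names:

  # make the clone (which held the real data) independent of its origin, then clean up
  $ zfs promote mdtpool/mdt0-clone
  $ zfs destroy mdtpool/mdt0-old          # the filesystem that is no longer needed
  $ zfs destroy mdtpool/mdt0-clone@swap   # the leftover snapshot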

Re: [lustre-discuss] [EXTERNAL] Re: Help with recovery of data

2022-06-22 Thread Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.] via lustre-discuss
Thanks Andreas – I appreciate the info. I am dd’ing the MDT block device (both of them – more details below) to separate storage now. I’ve written this up on the ZFS mailing list. https://zfsonlinux.topicbox.com/groups/zfs-discuss/Tcb8a3ef663db0031/need-help-with-data-recovery-if-possible
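
The device-level backup mentioned is along these lines (device and destination paths are made up):

  # raw copy of the MDT block device to separate storage before attempting any repair
  $ dd if=/dev/mapper/mdt0 of=/backup/mdt0.img bs=1M conv=noerror,sync status=progress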

Re: [lustre-discuss] Help with recovery of data

2022-06-22 Thread Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.] via lustre-discuss
A quick follow up. I thought an lfsck would only clean up (i.e. remove orphaned MDT and OST objects) but it appears this might have a good shot at repairing the file system – specifically, recreating the MDT objects with the --create-mdtobj option. We have started this command:
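
The command being described is presumably of this form; option spelling can differ between releases, so check "lctl lfsck_start --help" first (filesystem name is illustrative):

  # layout LFSCK that also recreates lost MDT objects from surviving OST objects
  $ lctl lfsck_start -M lfs1-MDT0000 -t layout -o --create-mdtobj
  # monitor progress
  $ lctl get_param -n mdd.lfs1-MDT0000.lfsck_layout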

[lustre-discuss] Help with recovery of data

2022-06-21 Thread Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.] via lustre-discuss
Hi everyone, We ran into a problem with our Lustre filesystem this weekend and could use a sanity check and/or advice on recovery. We are running on CentOS 7.9, ZFS 2.1.4 and Lustre 2.14. We are using ZFS OSTs and an ldiskfs MDT (for better MDT performance). For various reasons, the

Re: [lustre-discuss] ASSERTION( obj->oo_with_projid ) failed

2021-10-04 Thread Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.] via lustre-discuss
the best way to reverse any ill effects the "lfs setquota -p 1" command may have caused? * Should there be some protection in the lustre source for this? -Original Message- From: lustre-discuss on behalf of "Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.] vi

[lustre-discuss] ASSERTION( obj->oo_with_projid ) failed

2021-09-30 Thread Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.] via lustre-discuss
Hello everyone, We've run into a pretty nasty LBUG that took our LFS down. We're not exactly sure of the cause and could use some help. It's pretty much identical to this: https://jira.whamcloud.com/browse/LU-13189 One of our OSSs started crashing repeatedly last night. We are configured with

Re: [lustre-discuss] Disabling multi-rail dynamic discovery

2021-09-14 Thread Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.] via lustre-discuss
like this: options ko2iblnd peer_credits=128 peer_credits_hiw=64 credits=1024 ntx=2048 map_on_demand=256 fmr_pool_size=2048 fmr_flush_trigger=512 fmr_cache=1 conns_per_peer=4 Hope it helps. Rick On 9/13/21 1:53 PM, Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.] via lustre-discuss wrot
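
Those module options would normally live in a modprobe configuration file on each node, e.g.:

  # /etc/modprobe.d/ko2iblnd.conf (values copied from the post above)
  options ko2iblnd peer_credits=128 peer_credits_hiw=64 credits=1024 ntx=2048 map_on_demand=256 fmr_pool_size=2048 fmr_flush_trigger=512 fmr_cache=1 conns_per_peer=4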

Re: [lustre-discuss] [EXTERNAL] Re: Disabling multi-rail dynamic discovery

2021-09-14 Thread Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.] via lustre-discuss
map_on_demand=256 fmr_pool_size=2048 fmr_flush_trigger=512 fmr_cache=1 conns_per_peer=4 Hope it helps. Rick On 9/13/21 1:53 PM, Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.] via lustre-discuss wrote: Hello, I would like to know how to turn off auto discovery of peers on a client. This seems

Re: [lustre-discuss] Disabling multi-rail dynamic discovery

2021-09-13 Thread Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.] via lustre-discuss
fmr_pool_size=2048 fmr_flush_trigger=512 fmr_cache=1 conns_per_peer=4 Hope it helps. Rick On 9/13/21 1:53 PM, Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.] via lustre-discuss wrote: Hello, I would like to know how to turn off auto discovery of peers on a client. This seems like it should be str

[lustre-discuss] Disabling multi-rail dynamic discovery

2021-09-13 Thread Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.] via lustre-discuss
Hello, I would like to know how to turn off auto discovery of peers on a client. This seems like it should be straightforward but we can't get it to work. Please fill me in on what I'm missing. We recently upgraded our servers to 2.14. Our servers are multi-homed (1 tcp network and 2
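
For reference, LNet peer discovery can usually be disabled on a client either at module-load time or at runtime; the parameter and command names below are worth double-checking against the 2.14 documentation:

  # module option, e.g. in /etc/modprobe.d/lnet.conf (takes effect when lnet loads)
  options lnet lnet_peer_discovery_disabled=1

  # or at runtime via lnetctl
  $ lnetctl set discovery 0
  $ lnetctl global show    # confirm "discovery: 0"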

Re: [lustre-discuss] Benchmarking Lustre, reduce caching

2021-05-19 Thread Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.] via lustre-discuss
I'd recommend using the io500 benchmark. This runs both bandwidth and metadata tests and has checks to prevent caching from tainting the results (like forcing it to run for a specified amount of time). I've found it useful for benchmarking all of our file systems (lustre, NFS, local,
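
A minimal io500 run looks roughly like the following; the exact build and configuration steps are in the io500 README, so treat this as a sketch:

  $ git clone https://github.com/IO500/io500.git && cd io500
  $ ./prepare.sh                                # fetches and builds ior/mdtest
  $ mpirun -np 16 ./io500 config-minimal.ini    # bandwidth + metadata phases, with timed runs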

Re: [lustre-discuss] [EXTERNAL] Elegant way to dump quota/usage database?

2021-02-11 Thread Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.] via lustre-discuss
FWIW, I've had the same need and I do exactly the brute-force iteration you speak of for our LFSs to log user usage vs. time. For our NFS server, we use ZFS and there is a nice one-liner to give that info in ZFS. I agree, it would be nice if there was a similar one-liner for lustre.
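
The brute-force iteration amounts to something like the loop below, while ZFS answers the same question in one command (mount point, dataset, and user list are illustrative):

  # Lustre: walk every user and log usage
  $ for u in $(getent passwd | cut -d: -f1); do lfs quota -q -u "$u" /mnt/lustre; done

  # ZFS: per-user usage in one shot
  $ zfs userspace tank/home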