Re: [lustre-discuss] new lustre setup, questions about the mgt and mdt.

2018-10-23 Thread Harr, Cameron
I'd second what Daniel said. Each of our MDS nodes has one zpool with one MDT, except the first MDS node, which also has an MGS dataset on the pool. The nodes are set up in failover pairs where each node can see its partner's zpool and import it if necessary (with MMP protection turned on). On
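A sketch of that layout, with hypothetical pool/device/fsname values (multihost=on is the ZFS 0.7+ property that enables MMP):

    # On the first MDS node: one pool holding both the MGS and MDT datasets
    zpool create -o multihost=on mdt0pool mirror /dev/mapper/d0 /dev/mapper/d1
    mkfs.lustre --mgs --backfstype=zfs mdt0pool/mgs
    mkfs.lustre --mdt --index=0 --fsname=lfs1 --mgsnode=<mgs NID> \
        --backfstype=zfs mdt0pool/mdt0
    # On failover, the partner node imports the pool; MMP blocks a
    # simultaneous import from both nodes:
    zpool import mdt0pool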

Re: [lustre-discuss] migrating MDS to different infrastructure

2018-11-05 Thread Harr, Cameron
You may already know this, but you'll probably want to use the -R option as well, to replicate the Lustre attributes to the new dataset. On 10/29/2018 08:33 AM, Mohr Jr, Richard Frank (Rick Mohr) wrote: > On Oct 29, 2018, at 1:12 AM, Riccardo Veraldi wrote: >> it is time for me to move
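A minimal sketch of that replication, with hypothetical pool/host names:

    # Snapshot the old MDT dataset, then send it with -R so the
    # dataset properties (including the Lustre attributes) come along:
    zfs snapshot oldpool/mdt0@migrate
    zfs send -R oldpool/mdt0@migrate | ssh new-mds zfs recv newpool/mdt0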

Re: [lustre-discuss] Command line tool to monitor Lustre I/O ?

2018-12-20 Thread Harr, Cameron
I use ltop heavily: https://github.com/LLNL/lmt On 12/20/18 9:15 AM, Alexander I Kulyavtsev wrote: 1) cerebro + ltop still work. 2) telegraf + influxdb (collector, time-series DB). Telegraf has input plugins for lustre ("lustre2"), zfs, and many others. Grafana to plot live data from
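For the telegraf option, a minimal telegraf.conf fragment (hostname and database name are placeholders):

    [[inputs.lustre2]]
      # defaults read the usual /proc/fs/lustre locations on an OSS/MDS

    [[outputs.influxdb]]
      urls = ["http://influxhost:8086"]
      database = "lustre"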

Re: [lustre-discuss] Full OST, any way of avoiding it without hanging?

2019-01-07 Thread Harr, Cameron
In my brief attempts to use lfs migrate, I found performance pretty slow (it's serial). I also got some ASSERTs which should be fixed in 2.10 per LU-8807; note that I was using 2.8. On a more trivial level, I found the -v|--verbose option to the command doesn't work. On 01/07/2019 12:26 PM,
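One common workaround for the serial behavior is to fan migrations out with xargs; a sketch with a hypothetical mount point and OST UUID:

    # Drain a full OST by migrating its files, 8 at a time:
    lfs find /mnt/lustre --ost lfs1-OST0007_UUID -type f -print0 |
        xargs -0 -n 1 -P 8 lfs migrate -c 1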

Re: [lustre-discuss] Lustre chown and create operations failing for certain UIDs

2019-01-10 Thread Harr, Cameron
Russell, Your symptoms are a little different from what I see when the MDS node's passwd file is incomplete, but did you verify the affected user has a proper /etc/passwd entry on the MDS node(s)? On 1/10/19 12:14 PM, Russell Dekema wrote: > We've got a Lustre system running
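A quick way to check, run on each MDS node (affected_user is a placeholder):

    # Empty output means the MDS can't resolve the user, which
    # breaks chown/create for that UID:
    getent passwd affected_user
    id affected_user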

Re: [lustre-discuss] LFS Quota

2019-01-11 Thread Harr, Cameron
When you're over the soft limit, you *should* see an '*' in the listing, as well as the time left in the grace period. We've had mixed success with that actually working, however. Cameron On 1/9/19 5:21 AM, Moreno Diego (ID SIS) wrote: Hi ANS, About the soft limits and not receiving any warning or
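Roughly what a marked listing looks like, with made-up numbers:

    $ lfs quota -u someuser /mnt/lustre
    Disk quotas for usr someuser (uid 1234):
         Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
        /mnt/lustre 520000*  500000  600000  6d23h    1000       0       0       -
    # '*' marks the exceeded soft limit; grace shows the time left
    # before the soft limit starts acting as a hard block.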

Re: [lustre-discuss] LNET Conf Advise and Rearchitecting

2019-04-08 Thread Harr, Cameron
Paul, We still largely use static routing as we migrate from 2.5 and 2.8 to 2.10. We basically cross-mount all our production file systems across the various compute clusters and have routing clusters to route Lustre traffic from IB or OPA to Ethernet between buildings. Each building has its
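A static-routing sketch for /etc/modprobe.d/lustre.conf, with hypothetical networks and NIDs:

    # Compute nodes on IB, reaching a tcp fabric in another building
    # through router nodes 10.10.1.[1-4]:
    options lnet networks="o2ib0(ib0)" routes="tcp0 1 10.10.1.[1-4]@o2ib0"
    # On the router nodes themselves:
    options lnet networks="o2ib0(ib0),tcp0(eth0)" forwarding="enabled"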

[lustre-discuss] Setting infinite grace period with soft quotas

2019-04-11 Thread Harr, Cameron
We're exploring an idea where we keep soft quotas enabled so that users are notified (via in-house scripts) when they're nearing their hard quotas, but users don't like that the soft quota becomes a hard block after the grace period. I can also understand their rationale, that they should be
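For reference, the grace periods and limits are tunable with lfs setquota; hypothetical values:

    # Block/inode grace periods for all users, in seconds:
    lfs setquota -t -u --block-grace 604800 --inode-grace 604800 /mnt/lustre
    # Per-user soft (-b) and hard (-B) block limits:
    lfs setquota -u someuser -b 500G -B 600G /mnt/lustre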

Re: [lustre-discuss] Setting infinite grace period with soft quotas

2019-05-09 Thread Harr, Cameron
about 9M years, so it should probably be long enough? It might make sense to map "-1" internally to "(1 << 48) - 1" to make this easier. > On May 8, 2019, at 17:18, Harr, Cameron wrote: >> I had tested it first and couldn't find a way to do so, so I

Re: [lustre-discuss] Setting infinite grace period with soft quotas

2019-05-08 Thread Harr, Cameron
I had tested it first and couldn't find a way to do so, so I was curious whether there was some undocumented way. I'm proceeding with, "No, there's not a way." On 5/6/19 12:52 PM, Andreas Dilger wrote: > On Apr 11, 2019, at 11:02, Harr, Cameron wrote: >> We're exploring an idea

Re: [lustre-discuss] ZFS and multipathing for OSTs

2019-04-26 Thread Harr, Cameron
We use a simple multipath config and then have our vdev_id.conf set up like the following:

    multipath  yes
    # Intent of channel names:
    #   First letter {L,U} indicates lower or upper enclosure
    # PCI_ID     HBA PORT    CHANNEL NAME
    channel      05:00.0     1           L
    channel
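With that mapping in place, disks get stable aliases encoding enclosure and slot, and pools can be built from them; a sketch with hypothetical names:

    ls /dev/disk/by-vdev/
    # L0 L1 ... U0 U1 ...
    zpool create ost0pool raidz2 L0 L1 L2 L3 L4 L5 L6 L7 L8 L9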

[lustre-discuss] 'lfs check' not runnable in 2.10 by non-root users

2019-04-26 Thread Harr, Cameron
There was a thread a couple weeks back about users no longer being able to run 'lfs check *' in 2.10 clients, but there was no resolution to it. (http://lists.lustre.org/pipermail/lustre-discuss-lustre.org/2019-April/016386.html) This is becoming an issue at our site as well now. Is this
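For anyone reproducing it, the command in question (runs fine for unprivileged users on 2.8 clients, fails on 2.10 unless run as root):

    $ lfs check servers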

Re: [lustre-discuss] Is it a good practice to use big OST?

2019-10-15 Thread Harr, Cameron
We run one OST per OSS and each OST is ~580TB. Lustre 2.8 or 2.10, ZFS 0.7. On 10/8/19 10:50 AM, Carlson, Timothy S wrote: I’ve been running 100->200TB OSTs making up small petabyte file systems for the last 4 or 5 years with no pain. Lustre 2.5.x through current generation. Plenty of ZFS