I had the same problem. You can collect this information on a per-OST
basis, and do the summation yourself. If you're collecting other metrics
already, it's easy to do. We use Telegraf and Grafana to harvest metrics.
On each OSS, under /proc/fs/lustre/osd-*/*-OST*/quota_slave you'll find
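The per-OST summation described above can be sketched as a small helper. This is a hypothetical sketch: the "kbytes: N" layout of the accounting files is an assumption, so check the actual files under quota_slave on your own OSSes before relying on it.

```shell
# sum_kbytes: hypothetical helper that totals every "kbytes: N" value in
# the files given to it. On a real OSS you might run:
#   sum_kbytes /proc/fs/lustre/osd-*/*-OST*/quota_slave/acct_user
sum_kbytes() {
    # -F'kbytes:' splits each line at the marker; $2 then starts with the
    # number, and "+ 0" lets awk discard any trailing text (e.g. " }").
    awk -F'kbytes:' 'NF > 1 { total += $2 + 0 } END { print total + 0 }' "$@"
}
```

Run against the files from all OSTs at once, the result is the filesystem-wide total for that metric, ready to ship to Telegraf.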
We appear to be tripping over the same issues reported recently by Tung-Han
Hsieh and Simon Guilbault, namely that cur_grant_bytes is being reduced to
a very small value and causing abysmal performance.
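A quick way to spot that symptom is to scan the client's grant counters. The sketch below is an illustration, not a fix: it parses name=value lines of the form produced by `lctl get_param osc.*.cur_grant_bytes`, and the 1 MiB default threshold is an arbitrary illustrative value.

```shell
# low_grants: print the name of every counter whose value is below the
# threshold (default 1 MiB; override with LOW_GRANT). Reads name=value
# lines from the files given or from stdin, so it can be exercised
# without a Lustre mount. On a real client:
#   lctl get_param osc.*.cur_grant_bytes | low_grants
low_grants() {
    awk -F'=' -v t="${LOW_GRANT:-1048576}" \
        'NF == 2 && $2 + 0 < t { print $1 }' "$@"
}
```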
For example, OSTs 0, 1, and 4 are having poor performance on this client
running Lustre
> <https://review.whamcloud.com/#/c/39569/>
>
> I will kill off the other one (it’s ugly and probably not worth it).
>
> Regards,
>
> Shaun
>
> *From:* "Kevin M. Hildebrand"
> *Date:* Thursday, August 6, 2020 at 12:35 P
being, but that's not an
ideal solution.
Thanks!
Kevin
On Tue, Aug 4, 2020 at 3:53 PM Tancheff, Shaun
wrote:
> This issue: https://jira.whamcloud.com/browse/LU-13742 is affecting RHEL 8.2.
>
> *From:* lustre-discuss on
> behalf of "Kevin M. Hildebrand"
Hi, I just updated a RedHat machine from RHEL8.1 to RHEL8.2, and Lustre
from 2.12.4 to 2.12.5. I've built the lustre client from source, using
Mellanox OFED 5.0-2.1.8.0.
After the update, I can no longer mount my Lustre filesystem. I'm getting
the following error:
# mount /lustre
[ 1482.866631]
> Today's Topics:
>
> 1. Re: Lustre 2.12.3 client can't mount filesystem (Weiss, Karsten)
> 2. Re: Lustr
, recv_wr: 254, send_sge: 2, recv_sge: 1
Thanks,
Kevin
On Wed, Feb 12, 2020 at 3:50 PM Andreas Dilger
wrote:
> Can you please try 2.12.4, it was just released yesterday and has a number
> of fixes.
>
> On Feb 12, 2020, at 13:36, Kevin M. Hildebrand wrote:
>
> I just updated
I just updated some of my clients to RHEL 7.7, Lustre 2.12.3, MOFED 4.7.
Server version is 2.10.8.
I'm now getting errors mounting the filesystem on the client. In fact, I
can't even do an 'lctl ping' to any of the servers without getting an I/O
error.
Debug logs show this message when I
/fs/lustre/osd-zfs/yourfsname-MDT/quota_slave.
>
> I’m not in front of a keyboard, I’m cooking breakfast but I’ll follow up
> with the exact files. You can cat them and maybe find what you’re looking
> for.
>
> —Jeff
>
> On Thu, Sep 5, 2019 at 05:07 Kevin M. Hildebrand wrote
Is there any way to dump the Lustre quota data in its entirety, rather than
having to call 'lfs quota' individually for each user, group, and project?
I'm currently doing this on a regular basis so we can keep graphs of how
users and groups behave over time, but it's problematic for two reasons:
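Lacking a bulk dump, that per-identity polling ends up as a loop like the following sketch. The quota command is indirected through `$QUOTA_CMD` purely so the loop can be exercised without a Lustre mount; `/lustre` and the user list in the usage comment are placeholders.

```shell
# poll_quota: run a quota query for each named user against one mount
# point, prefixing every output line with a timestamp and the user name
# so the results can be graphed over time.
poll_quota() {
    mnt=$1; shift
    for u in "$@"; do
        # Normally: lfs quota -q -u "$u" "$mnt"
        ${QUOTA_CMD:-lfs quota -q -u} "$u" "$mnt" |
            sed "s/^/$(date +%s) $u /"
    done
}
# Real usage might iterate every account, e.g.:
#   poll_quota /lustre $(getent passwd | cut -d: -f1)
```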
of the server I'm trying to reach first, the lctl
ping succeeds immediately. So if I ping all of the MDSes and OSSes, the
filesystem will mount immediately.
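That warm-up can be scripted; the sketch below only automates the workaround, it does not explain the underlying stall. The ping command is indirected through `$PING_CMD` so the loop itself can be tested, and the NIDs in the usage comment are placeholders.

```shell
# warm_up: lctl-ping each server NID before mounting; returns nonzero
# if any NID failed to answer.
warm_up() {
    failed=0
    for nid in "$@"; do
        ${PING_CMD:-lctl ping} "$nid" >/dev/null 2>&1 ||
            { echo "no reply from $nid" >&2; failed=1; }
    done
    return "$failed"
}
# Example: warm_up 10.1.0.1@o2ib 10.1.0.2@o2ib && mount /lustre
```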
Does this sound familiar to anyone?
Thanks,
Kevin
On Thu, Jan 10, 2019 at 4:23 PM Kevin M. Hildebrand wrote:
> I've got a RHEL6 Lus
ck Mohr
> Senior HPC System Administrator
> National Institute for Computational Sciences
> http://www.nics.tennessee.edu
>
> > On Jan 10, 2019, at 4:23 PM, Kevin M. Hildebrand wrote:
> >
> > I've got a RHEL6 Lustre installation where the servers are running
> 2.8.
I've got a RHEL6 Lustre installation where the servers are running 2.8.0,
that I'd prefer not to upgrade.
We've been running 2.8.0 on RHEL6 clients as well and everything's been
working fine. However, I just updated the Linux release on the RHEL6
clients to 6.10, and Lustre 2.8.0 will no longer
on this. I gather we need 2.11 for SL7.5
> bob
> On 5/11/2018 10:15 AM, Kevin M. Hildebrand wrote:
> > I'm not sure if this is a supported behavior or not, but I'm currently
> > unable to mount a filesystem from my 2.8 server on my 2.10.3 client. Or
> > more specifically
I'm not sure if this is a supported behavior or not, but I'm currently
unable to mount a filesystem from my 2.8 server on my 2.10.3 client. Or
more specifically, the mount succeeds, but I'm unable to access any data
from the mount point.
The client is running RedHat 7.5
all an older LNet version
> on your routers to match your client/server.
>
> You may need to build your own RPMs for your new kernel, but can use
> --disable-server for configure to simplify things.
>
> Cheers, Andreas
>
> On Oct 31, 2017, at 04:45, Kevin M. Hildebrand
017, at 8:47 AM, Kevin M. Hildebrand <ke...@umd.edu> wrote:
> >
> > All of the hosts (client, server, router) have the following in
> ko2iblnd.conf:
> >
> > alias ko2iblnd-opa ko2iblnd
> > options ko2iblnd-opa peer_credits=128 peer_credits_hiw=64 credits
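For reference, a ko2iblnd.conf of this shape is usually just the alias line plus one options line. The fragment below repeats the quoted settings and completes them with commonly published OPA-style values; it is illustrative, not the poster's actual file, and the numbers should be tuned for your fabric:

alias ko2iblnd-opa ko2iblnd
options ko2iblnd-opa peer_credits=128 peer_credits_hiw=64 credits=1024 concurrent_sends=256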
do the numbers count to 180 instead of
60, which is the frequency they're being sent?
Thanks,
Kevin
On Mon, Oct 30, 2017 at 8:47 AM, Kevin M. Hildebrand <ke...@umd.edu> wrote:
> Hello, I'm trying to set up some new Lustre routers between a set of
> Infiniband connected Lustre se
Hello, I'm trying to set up some new Lustre routers between a set of
Infiniband connected Lustre servers and a few hosts connected to an
external 100G Ethernet network. The problem I'm having is that the
routers work just fine for a minute or two, and then shortly thereafter
they're marked as
We recently updated to Lustre 2.8 on our cluster, and have started seeing
some unusual load issues.
Last night our MDS load climbed to well over 100, and client performance
dropped to almost zero.
Initially this appeared to be related to a number of jobs that were doing
large numbers of
> needed to find the solution when an incident happens :-)
>
> Cheers,
>
> Marcin
>
> *From:* lustre-discuss [mailto:lustre-discuss-boun...@lists.lustre.org] *On
> Behalf Of* Kevin M. Hildebrand
> *Sent:* Monday, August 01, 2016 1:06 PM
> *To:* lustre
Our Lustre filesystem is currently set up to use the o2ib interface only-
all of the servers have
options lnet networks=o2ib0(ib0)
We've just added a Mellanox IB-to-Ethernet gateway and would like to be
able to have clients on the Ethernet side also mount Lustre. The gateway
extends the same
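The usual approach for a setup like this is to list both networks in the servers' lnet module options so they answer on both NIDs. A sketch, with illustrative interface names:

options lnet networks=o2ib0(ib0),tcp0(eth0)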