G'day all,
Is anyone using an HPE D3710 with two HPE DL380/385 servers in an MDS HA
configuration? If so, is your D3710 presenting LVs to both servers at the
same time AND are you using PCS with the Lustre PCS Resources?
I've just received new kit and cannot get the disks to present to the MDS
servers a
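In case it helps the comparison: the PCS side I'm aiming for is just the
stock ocf:lustre:Lustre resource agent, something like the sketch below
(resource name, device path and mountpoint are placeholders for my kit, not
a working config):
  # one MDT managed by the Lustre resource agent, able to fail over
  # between the two DL380/385 nodes
  pcs resource create lustre-MDT0000 ocf:lustre:Lustre \
      target=/dev/mapper/mdt0 mountpoint=/mnt/mdt0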
I had the same problem. You can collect this information on a per-OST
basis, and do the summation yourself. If you're collecting other metrics
already, it's easy to do. We use telegraf and grafana to harvest metrics.
On each OSS, under /proc/fs/lustre/osd-*/*-OST*/quota_slave you'll find
several
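A minimal sketch of the summation we do, assuming the acct_user file layout
from my 2.12 systems (the uid and paths are examples; the exact format may
differ between releases):
  uid=1000
  # sum one uid's kbytes across every OST on this OSS
  awk -v uid=$uid '/ id:/ {id=$NF}
       /kbytes/ {gsub(/[,}]/,""); if (id==uid) sum+=$NF}
       END {print sum+0}' /proc/fs/lustre/osd-*/*-OST*/quota_slave/acct_user
Run it on every OSS and add the per-OSS results; telegraf's exec input does
the collection for us.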
Hi Chris and Cory,
I remember looking at configuring Multi-Rail when 2.12 came out for this
very reason, but stopped when it looked like it was round-robin only. Is there a
way to trick the LNet Health system into seeing one interface as "sick but
not dead"?
Also, when is 2.14 coming out :)
For w
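For reference, the health knobs I'd been experimenting with are the 2.12
lnetctl ones below (values are purely illustrative, not recommendations):
  lnetctl set health_sensitivity 100  # how much each failure degrades an NI
  lnetctl set retry_count 3           # resends before a message is failed
  lnetctl set transaction_timeout 10  # seconds before a send is timed out
  lnetctl global show                 # confirm the current settings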
Hi Colin,
I've done checks of the performance/error counters, and used the
in-OS-repo version of ibdiagnet. Apart from a couple of nodes with known
failing cables/HCAs (not involved in the LNet connection problems), the
fabric was healthy. It did pick up that the IPoIB partition was still at
20 Gbit/s from wh
FYI, Multi-Rail in 2.12 will round-robin traffic between both @tcp and @o2ib
networks (assuming peers are reachable on both). If @o2ib flakes out then
traffic should shift entirely to @tcp, but there isn’t a way to specify that
traffic go to @tcp only when there’s a problem with @o2ib. You need
Hi, Nate.
You asked, “can LNET be easily configured to go over the @tcp connection when
the @o2ib flakes out?”
Yes, you can use LNet Multi-Rail for it and that _is_ covered in the “fine
manual”, chapter 16 ☺
https://doc.lustre.org/lustre_manual.xhtml#lnetmr
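The short version of what that chapter walks you through, as a sketch
(interface names and NIDs here are examples):
  # give the node an NI on each fabric; MR peers can then use both
  lnetctl net add --net o2ib0 --if ib0
  lnetctl net add --net tcp0 --if eth0
  # or describe a multi-rail peer statically
  lnetctl peer add --prim_nid 10.0.0.2@o2ib0 --nid 10.1.0.2@tcp0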
-Cory
FWIW, I've had the same need, and I do exactly the brute-force iteration you
speak of on our Lustre filesystems to log user usage over time. For our NFS
server we use ZFS, and there is a nice one-liner that gives that info in ZFS.
I agree, it would be nice if there were a similar one-liner for Lustre.
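The loop is nothing fancy; roughly this (the mount point and the user-list
source are site-specific):
  # iterate every known user; -q keeps the output terse for parsing
  for u in $(getent passwd | cut -d: -f1); do
      echo -n "$u "
      lfs quota -q -u "$u" /mnt/lustre
  done
The ZFS one-liner for comparison: zfs userspace tank/home (dataset name is a
placeholder).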
-O
Hey all,
I would like to be able to dump the usage tracking and quota
information for my Lustre filesystems. I am currently running Lustre 2.12.
lfs quota -u $user $filesystem
works well enough for a single user, but I have been looking for a way
to get that information for all users of the
Good Morning,
One of the OSTs in a Lustre filesystem I manage is showing higher usage than
the others. I attempted to stop writes by setting max_create_count to zero
and then moving data off it, but that doesn't seem to be working.
> lfs df | grep OST:15
lustre19-OST000f_UUID 71145018368 62653382656 849
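For completeness, a sketch of what I did (commands reconstructed from memory;
the mount point is a placeholder, lustre19-OST000f is the OST above):
  # on the MDS: stop new object creation on that OST
  lctl set_param osp.lustre19-OST000f*.max_create_count=0
  # on a client: migrate files with objects on that OST elsewhere
  lfs find /mnt/lustre19 --ost lustre19-OST000f_UUID -type f | lfs_migrate -y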