Hello all,
As is tradition, resident "off the beaten path" guy Christian here! I've
been trying to track down some odd eviction behavior, and whilst
conducting a network survey I noticed an odd development: a steadily
increasing number of drops reported by the LNet stats "drop_count"
statistic.
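For reference, this is roughly how I've been watching the counter
(assuming the lnetctl utility shipped with this Lustre version; the
grep pattern is just for convenience):

  # dump the cumulative LNet statistics; drop_count is among the counters
  lnetctl stats show
  # poll every 5 seconds and watch just the drop counters climb
  watch -n 5 'lnetctl stats show | grep -E "drop_(count|length)"'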
Howdy all!
Long and short is: when multiple clients each write a long (100+ GB)
contiguous stream to a file (a different file per client) over an SMB
export of the Lustre client mount, all but one stream locks up, and the
clients writing the stalled streams momentarily lose their connections
to the MDS/OSS.
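For what it's worth, the workload on each SMB client boils down to
something like this (paths and sizes here are made up for
illustration):

  # each client streams one large contiguous file to its own target
  # over the SMB mount of the Lustre client
  dd if=/dev/zero of=/mnt/smb/clientN.bin bs=1M count=102400 status=progress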
Oof! That's not a good situation to be in. Unfortunately, I've hit the
dual-import situation before as well, and as far as I know, once two
nodes import a pool at the same time you're more or less hosed.
When it happened to me, I tried using zdb to read all the recent TXGs
to try to back up as much as possible (within reason, of course).
Alternatively, any pointers on how to manually end the recovery window
would be appreciated.
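In case it helps anyone searching later, this is the sort of thing I
mean (device, pool, and target names are placeholders, and the
recovery-mode import can discard recent transactions, so treat it as a
last resort):

  # inspect the labels and recent uberblocks/TXGs on a pool device
  zdb -ul /dev/sdX
  # attempt a recovery-mode import that rolls back the last few TXGs
  zpool import -F tank
  # on the Lustre side, abort the recovery window on a given target
  lctl --device lustre-MDT0000 abort_recovery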
Cheers, and thanks for your attention,
Christian Kuntz
transfer with "calculating" in the transfer size bar. When it's
> done "calculating" I get an additional prompt about "
> exist in the destination" that lets me overwrite/skip. Sure enough, the
> destination folder has been created with 817 0-byte files, all with the
> approp
he
appropriate names. The lowest number of 3K files I've see it happen with
thus far is 112.
Cheers,
Christian Kuntz
Hi Hugo,
The autoconf script has some detection that should be able to pick up
the SPL information for ZFS 0.8+ source dirs, so you may be able to
scrub it out of your configuration and let the scripting handle it (you
can always double-check that it got things right by reading the
configure logs).
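As a rough sketch of what I mean (the path is a placeholder; for ZFS
0.8+ the SPL sources are bundled in the ZFS tree, so no separate
--with-spl should be needed):

  # point configure at the ZFS source tree and let the detection
  # work out the SPL side on its own
  ./configure --with-zfs=/path/to/zfs
  # afterwards, sanity-check what the detection actually picked up
  grep -i spl config.log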
Can you forward any configuration errors you encounter?
Hello all,
Requisite preamble: This is debian 10.7 with lustre 2.13.0 (compiled by
yours truly).
We've been observing some odd behavior recently with o2ib NIDs.
Everything is connected over the same switch (cards and switch are all
Mellanox), and each machine has a single network card connected in
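For context, each node's LNet side is configured along these lines (the
interface name is a placeholder for whatever the Mellanox port
enumerates as):

  # /etc/modprobe.d/lustre.conf: map the o2ib net onto the IB interface
  options lnet networks="o2ib0(ib0)"
  # confirm the NIDs LNet actually brought up
  lctl list_nids
  lnetctl net show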
Hello,
I hope I'm posting in the right place. I'm currently working to compile
Lustre 2.14.0-RC2 on Debian 10.7 with ZFS 2.0.2 for the OSDs. If
there's anything I can do to help with the testing effort or to help
make Lustre's Debian support more robust, please let me know! I hope
I'm not too
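In case it's useful to anyone else attempting the same, the build boils
down to roughly this (the ZFS source path is a placeholder; treat this
as a sketch rather than a recipe):

  # configure against the ZFS 2.0.2 source tree, then build .debs
  ./configure --with-zfs=/usr/src/zfs-2.0.2
  make debs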
Hello all,
I've been trying to test the "failout" failover mode, but instead of
getting the "connection lost, in progress operations using this service
will fail" message and a failure, I receive the "in progress operations
will wait for recovery to complete" message and the operation hangs
forever. I'm
Hello all,
I'm currently running 2.13.0 on Debian Buster with ZFS OSDs. My current
setup is a simple cluster with all the components on the same node.
Though the OST is marked as "failout", operations still hang
indefinitely when they should fail after a timeout.
Predictably, I get the
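For completeness, this is how the mode was set, in case I've
fat-fingered something (device and target names are placeholders):

  # set failout at format time...
  mkfs.lustre --ost --param="failover.mode=failout" ...
  # ...or flip it on an existing target with tunefs
  tunefs.lustre --param failover.mode=failout <pool>/<ost-dataset>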
Hello,
I've been running into a strange issue where my writes are blazingly fast
(5.5 GB/s) over RoCE with Mellanox MCX516A-CCAT cards all running together
over o2ib, but read performance tanks to roughly 100 MB/s. During mixed
read/write situations, write performance also plummets to sub-100 MB/s.
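To isolate whether the asymmetry lives at the LNet layer or above it,
the plan is an lnet_selftest run along these lines (NIDs are
placeholders; the lnet_selftest module must be loaded on every node
involved):

  modprobe lnet_selftest
  export LST_SESSION=$$
  lst new_session read_test
  lst add_group servers 10.0.0.1@o2ib
  lst add_group clients 10.0.0.2@o2ib
  lst add_batch bulk_read
  lst add_test --batch bulk_read --from clients --to servers brw read size=1M
  lst run bulk_read
  lst stat clients servers
  lst end_session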