On the server node(s):

options ko2iblnd-opa peer_credits=32 peer_credits_hiw=16 credits=1024 concurrent_sends=64 ntx=2048 map_on_demand=256 fmr_pool_size=2048 fmr_flush_trigger=512 fmr_cache=1 conns_per_peer=4

On clients:

options ko2iblnd peer_credits=128 peer_credits_hiw=64 credits=1024 concurrent_sends=256 ntx=2048 map_on_demand=32 fmr_pool_size=2048 fmr_flush_trigger=512 fmr_cache=1 conns_per_peer=4

My concern isn’t so much the mismatch, since I know that’s an issue, but rather 
what numbers we should settle on with a recent Lustre build. I also notice the 
server config uses the ko2iblnd-opa alias; since the server actually loads 
ko2iblnd, does that mean those options are ignored and the defaults are used?
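(A sketch, not from the thread: one way to answer the alias question is to read the parameters back from the loaded module via sysfs. If the ko2iblnd-opa options line was never matched at load time, these will show the defaults or the plain ko2iblnd line instead:)

```shell
# Show the values the loaded ko2iblnd module is actually using.
# If the ko2iblnd-opa alias was not matched when the module loaded,
# these reflect the defaults (or the plain "options ko2iblnd" line),
# not the ko2iblnd-opa line.
grep -H . /sys/module/ko2iblnd/parameters/peer_credits \
          /sys/module/ko2iblnd/parameters/concurrent_sends \
          /sys/module/ko2iblnd/parameters/map_on_demand
```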

What made me look was we were seeing lots of:
LNetError: 2961324:0:(o2iblnd_cb.c:2612:kiblnd_passive_connect()) Can't accept 
conn from xxx.xxx.xxx.xxx@o2ib2, queue depth too large:  42 (<=32 wanted)
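(My reading of that error, offered as an assumption rather than fact: the client derives its requested connection queue depth from its peer_credits, and the server rejects connections whose queue depth exceeds its own peer_credits, which is why mismatched values between peers produce this rejection. A sketch of a consistent configuration, with illustrative values only, not a recommendation:)

```shell
# /etc/modprobe.d/ko2iblnd.conf -- use the same values on clients and
# servers so the queue depth the client requests (derived from its
# peer_credits) stays within what the server will accept.
options ko2iblnd peer_credits=32 peer_credits_hiw=16 concurrent_sends=64
```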

—
Dan Szkola
FNAL


> On Apr 11, 2024, at 12:36 PM, Andreas Dilger <[email protected]> wrote:
> 
> On Apr 11, 2024, at 09:56, Daniel Szkola via lustre-discuss 
> <[email protected]> wrote:
>> 
>> Hello all,
>> 
>> I recently discovered some mismatches in our /etc/modprobe.d/ko2iblnd.conf 
>> files between our clients and servers.
>> 
>> Is it now recommended to keep the defaults on this module and run without a 
>> config file or are there recommended numbers for lustre-2.15.X?
>> 
>> The only thing I’ve seen that provides any guidance is the Lustre wiki and 
>> an HP/Cray doc:
>> 
>> https://www.hpe.com/psnow/resources/ebooks/a00113867en_us_v2/Lustre_Server_Recommended_Tuning_Parameters_4.x.html
>> 
>> Anyone have any sage advice on what the ko2iblnd.conf (and possibly 
>> ko2iblnd-opa.conf and hfi1.conf as well) should contain on modern systems?
> 
> It would be useful to know what specific settings are mismatched.  Definitely 
> some of them need to be consistent between peers, others depend on your 
> system.
> 
> Cheers, Andreas
> --
> Andreas Dilger
> Lustre Principal Architect
> Whamcloud
> 

_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org