Jesse,

Thank you very much for your answer. I tried your recommendation of setting map_on_demand=0, but it did not help. In any case, since this is only a way to test the Lustre installation, I can manage by using ksocklnd over Ethernet, which does work.
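
For reference, the two setups discussed in this thread could be written as modprobe options roughly as follows. This is only a sketch under assumptions: the file name /etc/modprobe.d/lustre.conf and the Ethernet interface name eth0 are placeholders, and as far as I know map_on_demand is a parameter of the ko2iblnd module rather than of lnet itself.

```
# /etc/modprobe.d/lustre.conf (assumed path; adjust to your system)

# Variant 1: o2ib over InfiniBand with map_on_demand disabled.
# This did not help with the InfiniHost III Lx card.
options lnet networks=o2ib0(ib0)
options ko2iblnd map_on_demand=0

# Variant 2: ksocklnd over Ethernet. eth0 is a placeholder interface name.
# Only one "options lnet networks=..." line should be active at a time.
# options lnet networks=tcp0(eth0)
```

After changing the file, the lnet module has to be reloaded before running "lctl network up" again.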
Best regards

On Fri, 7 Feb 2025 at 19:24, Jesse Stroik <[email protected]> wrote:

> Hi Ramiro,
>
> The invalid MR size looks like you're running into a limit with your cards
> setting up the RDMA (o2ib) LND when bringing up the network. There may be
> adjustments or workarounds for it possibly including setting
> map_on_demand=0 as an argument to the lnet module there.
>
> And since you are using older IB hardware on a newer OS, just a heads up:
> we recently ran into an issue with connectx-3 IB cards after upgrading our
> operating systems where we found RDMA communication to be unreliable
> possibly because they often would exceed the amount of connection queue
> pairs they could create. For us, the workaround was to use the ksocklnd
> instead of o2iblnd. If you have trouble getting the o2ib lustre network
> driver to work with this older hardware due to RDMA problems, that could be
> a workaround although it may not be feasible to implement depending on your
> networking setup.
>
> Best,
> Jesse
>
>
> ________________________________________
> From: lustre-discuss <[email protected]> on behalf
> of Ramiro Alba Queipo <[email protected]>
> Sent: Thursday, February 6, 2025 3:34 AM
> To: [email protected]
> Subject: [lustre-discuss] Lnet not going up with InfiniHost III Lx HCA card
>
>
> Hi all,
>
> I am testing Ubuntu 24.04 (6.8.0-52-generic) client with Lustre 2.16.1
> over Infiniband and using an old Mellanox DDR card (InfiniHost III Lx HCA).
>
> - # ip -br a
>
> options lnet networks=o2ib0(ib0)
>
> - # modprobe lnet
> - # lctl network up
>
> LNET configure error 100: Network is down
>
> - # tail -10 /var/log/kernel.log
>
> LNetError: 5071:0:(o2iblnd.c:2866:kiblnd_hdev_get_attr()) Invalid mr
> size: 0xffffffffffffffff
> LNetError: 5071:0:(o2iblnd.c:3103:kiblnd_dev_failover()) Can't get
> device attributes: -22
> LNetError: 5071:0:(o2iblnd.c:3831:kiblnd_startup()) ko2iblnd: Can't
> initialize device: rc = -22
> LNetError: Error -100 starting up LNI o2ib
>
> Lustre 2.15.0 and Ubuntu 20.04 (kernel 5.4.0-198-generic) is working fine
> with the same hardware
>
> Can anyone give me some advice or idea to make it work?
>
> Thanks in advance
> Best regards
>
> --
> Ramiro Alba
>
> Centre Tecnològic de Tranferència de Calor
> http://www.cttc.upc.edu
>
> Escola Tècnica Superior d'Enginyeries
> Industrial i Aeronàutica de Terrassa
> Colom 11, E-08222, Terrassa, Barcelona, Spain
> Tel: (+34) 93 739 8928
>

--
Ramiro Alba

Centre Tecnològic de Tranferència de Calor
http://www.cttc.upc.edu

Escola Tècnica Superior d'Enginyeries
Industrial i Aeronàutica de Terrassa
Colom 11, E-08222, Terrassa, Barcelona, Spain
Tel: (+34) 93 739 8928
_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
