Hi.
I have a Rocky Linux 9.4 client running Lustre 2.15.5-1.el9 with the
following /etc/lnet.conf:
[root@stg-login-0 rocky]# cat /etc/lnet.conf
net:
    - net: tcp1
      interfaces:
          0: eth0
Running systemctl start lnet hangs forever, and the syslog shows only:
Sep 13 15:31:35 stg-login-0 systemd[1]: Starting lnet management...
It is actually the following command that hangs:
[root@stg-login-0 rocky]# /usr/sbin/lnetctl import /etc/lnet.conf
i.e. the module load and the lnet configure step both work OK.
However, it looks like LNet autoconfigured an interface on tcp (not tcp1):
[root@stg-login-0 rocky]# lnetctl net show
net:
    - net type: lo
      local NI(s):
        - nid: 0@lo
          status: up
    - net type: tcp
      local NI(s):
        - nid: 10.179.2.45@tcp
          status: up
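To make the mismatch explicit, here is a minimal sketch that compares the net names declared in the config file with the ones LNet reports as configured. The function names conf_nets and live_nets are my own, not anything shipped with Lustre, and the patterns assume the YAML layout shown in this message ("- net: NAME" in the file, "- net type: NAME" in the lnetctl net show output).

```shell
# Hypothetical helpers (not part of Lustre): plain text filters only.

# Print the net names declared in an lnet.conf-style YAML file ($1).
# Assumes each network is introduced by a "- net: NAME" list item.
conf_nets() {
    sed -n 's/^[[:space:]]*-[[:space:]]*net:[[:space:]]*//p' "$1" | sort -u
}

# Print the net names found in `lnetctl net show` output read from stdin,
# skipping the loopback net "lo".
# Assumes each network is introduced by a "- net type: NAME" list item.
live_nets() {
    sed -n 's/^[[:space:]]*-[[:space:]]*net type:[[:space:]]*//p' \
        | grep -v '^lo$' \
        | sort -u
}
```

With the file and output above, conf_nets /etc/lnet.conf prints tcp1, while lnetctl net show | live_nets prints tcp.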
So:
1. How can I debug this hang, please?
2. Do the client and server NIDs need to be in the same IPv4 subnet? I have
a client NID of 10.179.2.45@tcp1 and a server NID of 10.167.128.1@tcp1,
with IP routing between them, and ICMP ping works in both directions. Is
that OK?
Many thanks for any help!
http://stackhpc.com/
Please note I work Tuesday to Friday.