Hi all,
We tried to bring up the InfiniBand network on one of the compute nodes using
the lctl command, but got the following error:
# /sbin/modprobe lnet
# /usr/sbin/lctl network up
LNET configure error 100: Network is down
The OS is RHEL AS 4 Update 3, the kernel version is
2.6.9-42.0.10.EL_lustre-1.4.10.1custom, and the log output is:
… hald[4534]: Timed out waiting for hotplug event 522. Rebasing to 547
… /usr/sbin/gmond[4369]: mcast_thread() error multicasting
… last message repeated 7 times
… kernel: ERROR : VVERBS : vv_hca_open:(vverbs_base.c):Didn't find provider that support device with a name: InfiniHost0
… kernel: LustreError: 12355:0:(viblnd.c:1800:kibnal_startup()) Can't open HCA InfiniHost0: -256
… kernel: LustreError: Error -100 starting up LNI vib
… /usr/sbin/gmond[4369]: mcast_thread() error multicasting
Output of the vstat command:
1 HCA found:
hca_id=InfiniHost_III_Ex0
pci_location={BUS=0x20,DEV/FUNC=0x00}
vendor_id=0x02C9
vendor_part_id=0x6282
hw_ver=0xA0
fw_ver=5.1.400
PSID=MT_0140000001
num_phys_ports=2
port=1
port_state=PORT_ACTIVE
sm_lid=0x0002
port_lid=0x0141
port_lmc=0x00
max_mtu=2048
port=2
port_state=PORT_DOWN
sm_lid=0x0000
port_lid=0x0000
port_lmc=0x00
max_mtu=2048
From the error, we can see that it is looking for an HCA with hca_id
InfiniHost0, but the hca_id we actually have is InfiniHost_III_Ex0.
Where can we configure it to look for hca_id InfiniHost_III_Ex0 instead?
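To make the name mismatch easy to check on other nodes, here is a minimal sketch in Python that pulls the hca_id and port states out of vstat-style output and compares them with the name from the kernel log. It assumes vstat prints plain key=value lines as shown above. The helper name parse_vstat and the trimmed SAMPLE text are only for illustration and are not part of any Lustre or vstat tooling.

```python
def parse_vstat(text):
    """Return a list of HCAs, each a dict with 'hca_id' and 'ports'.

    'ports' maps a port number to its port_state string.
    Assumes key=value lines in the order vstat printed them above.
    """
    hcas = []
    port = None
    for raw in text.splitlines():
        line = raw.strip()
        if "=" not in line:
            continue  # skip headers such as "1 HCA found:"
        key, value = line.split("=", 1)
        if key == "hca_id":
            hcas.append({"hca_id": value, "ports": {}})
            port = None
        elif key == "port" and hcas:
            port = int(value)
        elif key == "port_state" and hcas and port is not None:
            hcas[-1]["ports"][port] = value
    return hcas


# Trimmed copy of the vstat output from this message (illustrative only).
SAMPLE = """\
1 HCA found:
hca_id=InfiniHost_III_Ex0
pci_location={BUS=0x20,DEV/FUNC=0x00}
num_phys_ports=2
port=1
port_state=PORT_ACTIVE
port=2
port_state=PORT_DOWN
"""

if __name__ == "__main__":
    wanted = "InfiniHost0"  # HCA name taken from the kernel log above
    for hca in parse_vstat(SAMPLE):
        verdict = "matches" if hca["hca_id"] == wanted else "does not match"
        print(hca["hca_id"], verdict, wanted, hca["ports"])
```

On a real node the text could come from running vstat and feeding its output to parse_vstat instead of SAMPLE.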
Any help would be greatly appreciated.
--
Regards,
Changer