I can successfully ping other servers over IB using their IPoIB IP addresses. With OFED 1.1.1 installed, loading lnet ("modprobe lnet") or trying to mount a Lustre device over o2ib generates complaints about the symbol versions of IB-related routines.

What versions of the OFED driver (1.0, 1.1, or 1.1.1) are compatible with Lustre 1.5.95?

Thanks for the advice.
Tim

/etc/modprobe.conf:

alias eth0 tg3
alias eth1 tg3
alias scsi_hostadapter mptbase
alias scsi_hostadapter1 mptscsih
alias usb-controller ohci-hcd
options lnet networks=tcp,o2ib # specify both ethernet and ib networks for Lustre.
alias ib0 ib_ipoib
alias ib1 ib_ipoib
alias net-pf-27 ib_sdp

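As a quick check of which LNET networks a modprobe.conf like the one above requests, the `options lnet networks=...` line can be read programmatically. The following is only a minimal sketch in Python: the function name `lnet_networks` is hypothetical, and it assumes the simple unquoted form shown above (it does not handle quoted values containing spaces, or ip2nets).

```python
def lnet_networks(modprobe_conf_text):
    """Return the network names from the last `options lnet networks=...` line.

    Hypothetical helper: handles only the simple unquoted form, e.g.
    `options lnet networks=tcp,o2ib`.
    """
    networks = []
    for raw_line in modprobe_conf_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        fields = line.split()
        # We only care about lines of the form: options lnet key=value ...
        if len(fields) < 3 or fields[0] != "options" or fields[1] != "lnet":
            continue
        for option in fields[2:]:
            if option.startswith("networks="):
                value = option[len("networks="):]
                networks = [name for name in value.split(",") if name]
    return networks


sample = """\
alias eth0 tg3
options lnet networks=tcp,o2ib # specify both ethernet and ib networks for Lustre.
alias ib0 ib_ipoib
"""
print(lnet_networks(sample))  # prints ['tcp', 'o2ib']
```

With the configuration above this reports both tcp and o2ib, so a failure to load the o2ib LND affects only one of the two requested networks.
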
Sample of messages:

Feb 6 14:34:21 FedoraCore121 root: =========start lnet and debug
Feb 6 14:34:27 FedoraCore121 kernel: Lustre: 2306:0:(module.c:382:init_libcfs_module()) maximum lustre stack 8192
Feb 6 14:34:46 FedoraCore121 kernel: Lustre: Added LNI [EMAIL PROTECTED] [8/256]
Feb 6 14:34:46 FedoraCore121 kernel: ko2iblnd: disagrees about version of symbol ib_create_cq
Feb 6 14:34:46 FedoraCore121 kernel: ko2iblnd: Unknown symbol ib_create_cq
Feb 6 14:34:46 FedoraCore121 kernel: ko2iblnd: disagrees about version of symbol rdma_resolve_addr
Feb 6 14:34:46 FedoraCore121 kernel: ko2iblnd: Unknown symbol rdma_resolve_addr
Feb 6 14:34:46 FedoraCore121 kernel: ko2iblnd: disagrees about version of symbol ib_dereg_mr
Feb 6 14:34:46 FedoraCore121 kernel: ko2iblnd: Unknown symbol ib_dereg_mr
Feb 6 14:34:46 FedoraCore121 kernel: ko2iblnd: disagrees about version of symbol rdma_reject
Feb 6 14:34:46 FedoraCore121 kernel: ko2iblnd: Unknown symbol rdma_reject
Feb 6 14:34:46 FedoraCore121 kernel: ko2iblnd: disagrees about version of symbol rdma_disconnect
Feb 6 14:34:46 FedoraCore121 kernel: ko2iblnd: Unknown symbol rdma_disconnect
Feb 6 14:34:46 FedoraCore121 kernel: ko2iblnd: disagrees about version of symbol rdma_resolve_route
Feb 6 14:34:46 FedoraCore121 kernel: ko2iblnd: Unknown symbol rdma_resolve_route
Feb 6 14:34:46 FedoraCore121 kernel: ko2iblnd: disagrees about version of symbol rdma_bind_addr
Feb 6 14:34:46 FedoraCore121 modprobe: FATAL: Error inserting ko2iblnd (/lib/modules/2.6.9-42.EL_lustre.1.5.95smp/kernel/net/lustre/ko2iblnd.ko): Unknown symbol in module, or unknown parameter (see dmesg)
Feb 6 14:34:46 FedoraCore121 kernel: ko2iblnd: Unknown symbol rdma_bind_addr
Feb 6 14:34:46 FedoraCore121 kernel: ko2iblnd: disagrees about version of symbol rdma_create_qp
Feb 6 14:34:46 FedoraCore121 kernel: ko2iblnd: Unknown symbol rdma_create_qp
Feb 6 14:34:46 FedoraCore121 kernel: ko2iblnd: disagrees about version of symbol ib_destroy_cq
Feb 6 14:34:46 FedoraCore121 kernel: ko2iblnd: Unknown symbol ib_destroy_cq
Feb 6 14:34:46 FedoraCore121 kernel: ko2iblnd: disagrees about version of symbol rdma_create_id
Feb 6 14:34:46 FedoraCore121 kernel: ko2iblnd: Unknown symbol rdma_create_id
Feb 6 14:34:46 FedoraCore121 kernel: ko2iblnd: disagrees about version of symbol rdma_listen
Feb 6 14:34:46 FedoraCore121 kernel: ko2iblnd: Unknown symbol rdma_listen
Feb 6 14:34:46 FedoraCore121 kernel: ko2iblnd: disagrees about version of symbol rdma_destroy_qp
Feb 6 14:34:46 FedoraCore121 kernel: ko2iblnd: Unknown symbol rdma_destroy_qp
Feb 6 14:34:46 FedoraCore121 kernel: ko2iblnd: disagrees about version of symbol ib_get_dma_mr
Feb 6 14:34:46 FedoraCore121 kernel: ko2iblnd: Unknown symbol ib_get_dma_mr
Feb 6 14:34:46 FedoraCore121 kernel: ko2iblnd: disagrees about version of symbol ib_alloc_pd
Feb 6 14:34:46 FedoraCore121 kernel: ko2iblnd: Unknown symbol ib_alloc_pd
Feb 6 14:34:46 FedoraCore121 kernel: ko2iblnd: disagrees about version of symbol rdma_connect
Feb 6 14:34:46 FedoraCore121 kernel: ko2iblnd: Unknown symbol rdma_connect
Feb 6 14:34:46 FedoraCore121 kernel: ko2iblnd: disagrees about version of symbol ib_modify_qp
Feb 6 14:34:46 FedoraCore121 kernel: ko2iblnd: Unknown symbol ib_modify_qp
Feb 6 14:34:46 FedoraCore121 kernel: ko2iblnd: disagrees about version of symbol rdma_destroy_id
Feb 6 14:34:46 FedoraCore121 kernel: ko2iblnd: Unknown symbol rdma_destroy_id
Feb 6 14:34:46 FedoraCore121 kernel: ko2iblnd: disagrees about version of symbol rdma_accept
Feb 6 14:34:46 FedoraCore121 kernel: ko2iblnd: Unknown symbol rdma_accept
Feb 6 14:34:46 FedoraCore121 kernel: ko2iblnd: disagrees about version of symbol ib_dealloc_pd
Feb 6 14:34:46 FedoraCore121 kernel: ko2iblnd: Unknown symbol ib_dealloc_pd
Feb 6 14:34:47 FedoraCore121 kernel: Lustre: Removed LNI [EMAIL PROTECTED]
Feb 6 14:35:01 FedoraCore121 sendmail[2268]: sql_select option missing
Feb 6 14:35:01 FedoraCore121 sendmail[2268]: auxpropfunc error no mechanism available

________________________________
From: Eric Barton [mailto:[EMAIL PROTECTED]
Sent: Monday, February 05, 2007 10:42 AM
To: Snider, Tim; [email protected]
Subject: RE: [Lustre-discuss] [Lustre-devel] Using Infiniband with 1.5.95

Is that OFED 1.1?

Does /etc/modprobe.conf contain...

options lnet networks=o2ib

...or the equivalent using ip2nets? If this isn't clear, please see the lustre manual for an explanation of network setup.

Can you bring up lustre networking on the mgs and a client node...

modprobe lnet; lctl net up

...and then check /proc/sys/lnet/nis? It should list the local NIDs (e.g....

<ipoib IP address>@o2ib
[EMAIL PROTECTED]

...). If that looks OK, run an lnet ping from the client to the MGS...

lctl ping [EMAIL PROTECTED]

Please note that by default, network error messages are logged internally, but are not printed to the console or /var/log/messages, so it may help to "echo + neterror > /proc/sys/lnet/printk" to enable verbose network messages while you are debugging connectivity.

Cheers,
Eric

________________________________
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Snider, Tim
Sent: 05 February 2007 2:40 PM
To: [email protected]
Subject: [Lustre-discuss] [Lustre-devel] Using Infiniband with 1.5.95

We're trying to set up a Lustre configuration using InfiniBand IPoIB with 1.5.95. openib 1.1 (formerly openib gen 2) is installed. We can successfully ping between the mdt/mgs and ost servers using the ipoib address. Lustre fs creation is "apparently" successful. Mounting the lustre device fails.

1. Does 1.5.95 work properly with ipoib?
2. What is the proper form of mgsnode specification, should o2ib or openib be used?
2.a Should we specify the ipoib address or the adapter/port #?

The ost command line we're trying is:

mkfs.lustre --fsname=testfs [EMAIL PROTECTED] /dev/sdb1

Thanks,

Timothy Snider
Storage Architect
Strategic Planning, Technology and Architecture
LSI Logic Corporation
3718 North Rock Road
Wichita, KS 67226
(316) 636-8736
[EMAIL PROTECTED]

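The kernel messages quoted earlier in this thread come in pairs ("disagrees about version of symbol X" followed by "Unknown symbol X"). That message generally means the module was built against different versions of the exporting modules than the ones currently loaded. To see at a glance which symbols are affected, a log like that can be reduced to one list per module. This is a rough sketch in Python, assuming the syslog format shown in the sample; the function name `mismatched_symbols` is hypothetical.

```python
import re

# Matches e.g. "ko2iblnd: disagrees about version of symbol ib_create_cq"
MISMATCH = re.compile(r"(\S+): disagrees about version of symbol (\S+)")


def mismatched_symbols(log_text):
    """Map each module name to the sorted symbols it reported a mismatch for."""
    found = {}
    for module, symbol in MISMATCH.findall(log_text):
        found.setdefault(module, set()).add(symbol)
    return {module: sorted(symbols) for module, symbols in found.items()}


sample = (
    "Feb 6 14:34:46 FedoraCore121 kernel: ko2iblnd: disagrees about version of symbol ib_create_cq\n"
    "Feb 6 14:34:46 FedoraCore121 kernel: ko2iblnd: Unknown symbol ib_create_cq\n"
    "Feb 6 14:34:46 FedoraCore121 kernel: ko2iblnd: disagrees about version of symbol rdma_resolve_addr\n"
    "Feb 6 14:34:46 FedoraCore121 kernel: ko2iblnd: Unknown symbol rdma_resolve_addr\n"
)
for module, symbols in mismatched_symbols(sample).items():
    print(module, symbols)  # prints: ko2iblnd ['ib_create_cq', 'rdma_resolve_addr']
```

In the full sample every reported symbol is an ib_* or rdma_* routine, which is consistent with the problem lying between ko2iblnd and the installed IB stack rather than in lnet itself.
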
_______________________________________________
Lustre-discuss mailing list
[email protected]
https://mail.clusterfs.com/mailman/listinfo/lustre-discuss
