Ok - Can you provide more insight?  I'm using the same distro, kernel,
and Lustre RPMs on all the servers. Why would modules load on one
server but not the others?
And on a more practical point, what target do I build?
make
make install
make rpms?
Thx
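
One way to compare the servers is to pull the list of symbols that ko2iblnd rejects out of the log on each machine and diff the results. A minimal sketch, assuming syslog-style lines like the ones quoted below; the function name is made up for illustration:

```shell
#!/bin/sh
# mismatched_symbols (hypothetical helper): read syslog-style text on stdin
# and print each symbol named in a "disagrees about version of symbol"
# message, once, in sorted order.
mismatched_symbols() {
    sed -n 's/.*disagrees about version of symbol \([A-Za-z0-9_]*\).*/\1/p' | sort -u
}
```

Something like `mismatched_symbols < /var/log/messages` on each server gives a list that can be diffed. Comparing `modinfo -F vermagic` on the installed .ko files against `uname -r` may also show whether the modules really match the running kernel.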

-----Original Message-----
From: Nathaniel Rutman [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, February 06, 2007 4:25 PM
To: Snider, Tim
Cc: Eric Barton; [email protected]
Subject: Re: [Lustre-discuss] [Lustre-devel] Using Infiniband with
1.5.95

This is strictly a compile issue -- Lustre won't work over o2ib until
the ko2iblnd module can load successfully.
The default header path the o2iblnd uses is $LINUX/drivers/infiniband -
you need to make sure Lustre is compiled against the o2ib/OFED headers
that your kernel modules actually use.  The ./configure flag for Lustre
is:
  --with-o2ib=path        build o2iblnd against path
HTH
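
A rough sketch of what that configure step might look like. Both paths are placeholders (assumptions, not values from this thread), and --with-linux is the usual Lustre configure option for pointing at the kernel tree. The helper only prints the command so it can be reviewed before anything is run:

```shell
#!/bin/sh
# Sketch only: both defaults below are hypothetical placeholders.
LINUX_SRC=${LINUX_SRC:-/usr/src/linux}   # assumed: source tree of the running kernel
OFED_SRC=${OFED_SRC:-/usr/src/ofed}      # assumed: headers the loaded ib_* modules were built from

# build_cmd (hypothetical helper): print the configure line for the given
# kernel tree ($1) and o2ib/OFED header path ($2) without executing it.
build_cmd() {
    printf './configure --with-linux=%s --with-o2ib=%s\n' "$1" "$2"
}

build_cmd "$LINUX_SRC" "$OFED_SRC"
```

After configure, the build target is whichever of the ones asked about above fits how the modules get deployed to the servers.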
 

Snider, Tim wrote:
> Ok - more details. ipoib itself is working on all servers. There are
> ipoib ping utilities that run successfully between all the servers in
> the fabric.
> I was able to successfully mount on the mdt/mgs after installing
> Lustre modules by hand using the force option.
> Mounting the OST device still fails. ptlrpc refuses to load manually,
> even with the force option. All kernel / lustre versions are identical
> between the servers.
>  
> What am I missing?
>  
> uname -a
>         Linux FedoraCore120 2.6.9-42.EL_lustre.1.5.95smp #1 SMP Thu Sep 28 06:36:13 MDT 2006 i686 i686 i386 GNU/Linux
> [EMAIL PROTECTED] mnt]# modprobe -vf ptlrpc
>         insmod /lib/modules/2.6.9-42.EL_lustre.1.5.95smp/kernel/fs/lustre/ptlrpc.ko
>         FATAL: Error inserting ptlrpc (/lib/modules/2.6.9-42.EL_lustre.1.5.95smp/kernel/fs/lustre/ptlrpc.ko): Input/output error
> /var/log/messages
>     Feb  6 17:03:20 FedoraCore120 kernel: ptlrpc: no version magic, 
> tainting kernel.
>     Feb  6 17:03:20 FedoraCore120 kernel: Lustre: Added LNI 
> [EMAIL PROTECTED] <mailto:[EMAIL PROTECTED]> [8/256]
>     Feb  6 17:03:20 FedoraCore120 kernel: ko2iblnd: disagrees about 
> version of symbol ib_create_cq
>     Feb  6 17:03:20 FedoraCore120 kernel: ko2iblnd: Unknown symbol 
> ib_create_cq
>     Feb  6 17:03:20 FedoraCore120 kernel: ko2iblnd: disagrees about 
> version of symbol rdma_resolve_addr
>     Feb  6 17:03:20 FedoraCore120 kernel: ko2iblnd: Unknown symbol 
> rdma_resolve_addr
>     Feb  6 17:03:20 FedoraCore120 kernel: ko2iblnd: disagrees about 
> version of symbol ib_dereg_mr
>     Feb  6 17:03:20 FedoraCore120 kernel: ko2iblnd: Unknown symbol 
> ib_dereg_mr
>     Feb  6 17:03:20 FedoraCore120 kernel: ko2iblnd: disagrees about 
> version of symbol rdma_reject
>     Feb  6 17:03:20 FedoraCore120 kernel: ko2iblnd: Unknown symbol 
> rdma_reject
>     Feb  6 17:03:20 FedoraCore120 kernel: ko2iblnd: disagrees about 
> version of symbol rdma_disconnect
>     Feb  6 17:03:20 FedoraCore120 kernel: ko2iblnd: Unknown symbol 
> rdma_disconnect
>     Feb  6 17:03:20 FedoraCore120 kernel: ko2iblnd: disagrees about 
> version of symbol rdma_resolve_route
>     Feb  6 17:03:20 FedoraCore120 modprobe: FATAL: Error inserting
> ko2iblnd
> (/lib/modules/2.6.9-42.EL_lustre.1.5.95smp/kernel/net/lustre/ko2iblnd.ko):
> Unknown symbol in module, or unknown parameter (see dmesg)
>     Feb  6 17:03:20 FedoraCore120 kernel: ko2iblnd: Unknown symbol 
> rdma_resolve_route
>     Feb  6 17:03:20 FedoraCore120 kernel: ko2iblnd: disagrees about 
> version of symbol rdma_bind_addr
>     Feb  6 17:03:20 FedoraCore120 kernel: ko2iblnd: Unknown symbol 
> rdma_bind_addr
>     Feb  6 17:03:20 FedoraCore120 kernel: ko2iblnd: disagrees about 
> version of symbol rdma_create_qp
>                 <<<similar messages are displayed for awhile same as
> before>>>
>     Feb  6 17:03:21 FedoraCore120 kernel: ko2iblnd: disagrees about 
> version of symbol ib_dealloc_pd
>     Feb  6 17:03:21 FedoraCore120 kernel: ko2iblnd: Unknown symbol 
> ib_dealloc_pd
>     Feb  6 17:03:21 FedoraCore120 kernel: LustreError: 
> 4753:0:(api-ni.c:1002:lnet_startup_lndnis()) Can't load LND o2ib, 
> module ko2iblnd, rc=256
>     Feb  6 17:03:21 FedoraCore120 kernel: Lustre: Removed LNI 
> [EMAIL PROTECTED] <mailto:[EMAIL PROTECTED]>
>     Feb  6 17:03:21 FedoraCore120 kernel: LustreError: 
> 4753:0:(events.c:581:ptlrpc_init_portals()) network initialisation 
> failed
>  
>  
>
> ------------------------------------------------------------------------
> *From:* [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] *On Behalf Of *Snider, 
> Tim
> *Sent:* Tuesday, February 06, 2007 10:19 AM
> *To:* Eric Barton; [email protected]
> *Subject:* RE: [Lustre-discuss] [Lustre-devel] Using Infiniband with
> 1.5.95
>
> I can successfully ping other servers thru ib using ipoib ip addresses.
> When loading lnet or trying to mount a lustre device over o2ib with OFED
> 1.1.1, modprobe lnet generates complaints about symbol versions of
> ib-related routines.
> What versions of the OFED driver (1.0, 1.1, or 1.1.1) are compatible 
> with Lustre 1.5.95?
>  
> Thanks for the advice.
> Tim
>  
> /etc/modprobe.conf
>  alias eth0 tg3
>  alias eth1 tg3
>  alias scsi_hostadapter mptbase
>  alias scsi_hostadapter1 mptscsih
>  alias usb-controller ohci-hcd
>  options lnet networks=tcp,o2ib    # specify both ethernet and ib 
> networks for Lustre.
>  alias ib0 ib_ipoib
>  alias ib1 ib_ipoib
>  alias net-pf-27 ib_sdp
>
> Sample of messages:
>    Feb  6 14:34:21 FedoraCore121 root: =========start lnet and debug
>    Feb  6 14:34:27 FedoraCore121 kernel: Lustre: 
> 2306:0:(module.c:382:init_libcfs_module()) maximum lustre stack 8192
>    Feb  6 14:34:46 FedoraCore121 kernel: Lustre: Added LNI 
> [EMAIL PROTECTED] <mailto:[EMAIL PROTECTED]> [8/256]
>    Feb  6 14:34:46 FedoraCore121 kernel: ko2iblnd: disagrees about 
> version of symbol ib_create_cq
>    Feb  6 14:34:46 FedoraCore121 kernel: ko2iblnd: Unknown symbol 
> ib_create_cq
>    Feb  6 14:34:46 FedoraCore121 kernel: ko2iblnd: disagrees about 
> version of symbol rdma_resolve_addr
>    Feb  6 14:34:46 FedoraCore121 kernel: ko2iblnd: Unknown symbol 
> rdma_resolve_addr
>    Feb  6 14:34:46 FedoraCore121 kernel: ko2iblnd: disagrees about 
> version of symbol ib_dereg_mr
>    Feb  6 14:34:46 FedoraCore121 kernel: ko2iblnd: Unknown symbol 
> ib_dereg_mr
>    Feb  6 14:34:46 FedoraCore121 kernel: ko2iblnd: disagrees about 
> version of symbol rdma_reject
>    Feb  6 14:34:46 FedoraCore121 kernel: ko2iblnd: Unknown symbol 
> rdma_reject
>    Feb  6 14:34:46 FedoraCore121 kernel: ko2iblnd: disagrees about 
> version of symbol rdma_disconnect
>    Feb  6 14:34:46 FedoraCore121 kernel: ko2iblnd: Unknown symbol 
> rdma_disconnect
>    Feb  6 14:34:46 FedoraCore121 kernel: ko2iblnd: disagrees about 
> version of symbol rdma_resolve_route
>    Feb  6 14:34:46 FedoraCore121 kernel: ko2iblnd: Unknown symbol 
> rdma_resolve_route
>    Feb  6 14:34:46 FedoraCore121 kernel: ko2iblnd: disagrees about 
> version of symbol rdma_bind_addr
>    Feb  6 14:34:46 FedoraCore121 modprobe: FATAL: Error inserting
> ko2iblnd
> (/lib/modules/2.6.9-42.EL_lustre.1.5.95smp/kernel/net/lustre/ko2iblnd.ko):
> Unknown symbol in module, or unknown parameter (see dmesg)
>    Feb  6 14:34:46 FedoraCore121 kernel: ko2iblnd: Unknown symbol 
> rdma_bind_addr
>    Feb  6 14:34:46 FedoraCore121 kernel: ko2iblnd: disagrees about 
> version of symbol rdma_create_qp
>    Feb  6 14:34:46 FedoraCore121 kernel: ko2iblnd: Unknown symbol 
> rdma_create_qp
>    Feb  6 14:34:46 FedoraCore121 kernel: ko2iblnd: disagrees about 
> version of symbol ib_destroy_cq
>    Feb  6 14:34:46 FedoraCore121 kernel: ko2iblnd: Unknown symbol 
> ib_destroy_cq
>    Feb  6 14:34:46 FedoraCore121 kernel: ko2iblnd: disagrees about 
> version of symbol rdma_create_id
>    Feb  6 14:34:46 FedoraCore121 kernel: ko2iblnd: Unknown symbol 
> rdma_create_id
>    Feb  6 14:34:46 FedoraCore121 kernel: ko2iblnd: disagrees about 
> version of symbol rdma_listen
>    Feb  6 14:34:46 FedoraCore121 kernel: ko2iblnd: Unknown symbol 
> rdma_listen
>    Feb  6 14:34:46 FedoraCore121 kernel: ko2iblnd: disagrees about 
> version of symbol rdma_destroy_qp
>    Feb  6 14:34:46 FedoraCore121 kernel: ko2iblnd: Unknown symbol 
> rdma_destroy_qp
>    Feb  6 14:34:46 FedoraCore121 kernel: ko2iblnd: disagrees about 
> version of symbol ib_get_dma_mr
>    Feb  6 14:34:46 FedoraCore121 kernel: ko2iblnd: Unknown symbol 
> ib_get_dma_mr
>    Feb  6 14:34:46 FedoraCore121 kernel: ko2iblnd: disagrees about 
> version of symbol ib_alloc_pd
>    Feb  6 14:34:46 FedoraCore121 kernel: ko2iblnd: Unknown symbol 
> ib_alloc_pd
>    Feb  6 14:34:46 FedoraCore121 kernel: ko2iblnd: disagrees about 
> version of symbol rdma_connect
>    Feb  6 14:34:46 FedoraCore121 kernel: ko2iblnd: Unknown symbol 
> rdma_connect
>    Feb  6 14:34:46 FedoraCore121 kernel: ko2iblnd: disagrees about 
> version of symbol ib_modify_qp
>    Feb  6 14:34:46 FedoraCore121 kernel: ko2iblnd: Unknown symbol 
> ib_modify_qp
>    Feb  6 14:34:46 FedoraCore121 kernel: ko2iblnd: disagrees about 
> version of symbol rdma_destroy_id
>    Feb  6 14:34:46 FedoraCore121 kernel: ko2iblnd: Unknown symbol 
> rdma_destroy_id
>    Feb  6 14:34:46 FedoraCore121 kernel: ko2iblnd: disagrees about 
> version of symbol rdma_accept
>    Feb  6 14:34:46 FedoraCore121 kernel: ko2iblnd: Unknown symbol 
> rdma_accept
>    Feb  6 14:34:46 FedoraCore121 kernel: ko2iblnd: disagrees about 
> version of symbol ib_dealloc_pd
>    Feb  6 14:34:46 FedoraCore121 kernel: ko2iblnd: Unknown symbol 
> ib_dealloc_pd
>    Feb  6 14:34:47 FedoraCore121 kernel: Lustre: Removed LNI 
> [EMAIL PROTECTED] <mailto:[EMAIL PROTECTED]>
>    Feb  6 14:35:01 FedoraCore121 sendmail[2268]: sql_select option
> missing
>    Feb  6 14:35:01 FedoraCore121 sendmail[2268]: auxpropfunc error no 
> mechanism available
>  
>
> ------------------------------------------------------------------------
> *From:* Eric Barton [mailto:[EMAIL PROTECTED]
> *Sent:* Monday, February 05, 2007 10:42 AM
> *To:* Snider, Tim; [email protected]
> *Subject:* RE: [Lustre-discuss] [Lustre-devel] Using Infiniband with
> 1.5.95
>
> Is that OFED 1.1?  Does /etc/modprobe.conf contain...
>  
> options lnet networks=o2ib
>  
> ...or the equivalent using ip2nets?   If this isn't clear, please see 
> the lustre manual for an explanation of network setup. 
>  
> Can you bring up lustre networking on the mgs and a client node...
>  
> modprobe lnet; lctl net up
>  
> ...and then check /proc/sys/lnet/nis? It should list the local NIDs 
> (e.g....
>  
> <ipoib IP address>@o2ib
> [EMAIL PROTECTED] <mailto:[EMAIL PROTECTED]>
>  
> ...).  If that looks OK, run an lnet ping from the client to the MGS...
>  
> lctl ping [EMAIL PROTECTED] <mailto:[EMAIL PROTECTED]>
>  
> Please note that by default, network error messages are logged 
> internally, but are not printed to the console or /var/log/messages, 
> so it may help to "echo + neterror > /proc/sys/lnet/printk" to enable 
> verbose network messages while you are debugging connectivity.
>
>     Cheers,
>                        Eric
>
>
> ------------------------------------------------------------------------
>     *From:* [EMAIL PROTECTED]
>     [mailto:[EMAIL PROTECTED] *On Behalf Of
>     *Snider, Tim
>     *Sent:* 05 February 2007 2:40 PM
>     *To:* [email protected]
>     *Subject:* [Lustre-discuss] [Lustre-devel] Using Infiniband with
>     1.5.95
>
>     We're trying to set up a Lustre configuration using infiniband
>     ipoib with 1.5.95. openib 1.1 (formerly openib gen 2) is
>     installed. We can successfully ping between the mdt/mgs and ost
>     servers using the ipoib address. Lustre fs creation is
>     "apparently" successful. Mounting the lustre device fails.
>     1.    Does 1.5.95 work properly with ipoib?
>     2.    What is the proper form of mgsnode specification: should
>     o2ib or openib be used?
>     2.a        Should we specify the ipoib address or the adapter/port #?
>      
>     The ost command line we're trying is:
>          mkfs.lustre --fsname=testfs [EMAIL PROTECTED]
>     <mailto:[EMAIL PROTECTED]> /dev/sdb1
>      
>     Thanks,
>     Timothy Snider
>     Storage Architect
>     Strategic Planning, Technology and Architecture
>
>     LSI Logic Corporation
>     3718 North Rock Road
>     Wichita, KS 67226
>     (316) 636-8736
>     [EMAIL PROTECTED] <mailto:[EMAIL PROTECTED]>_
>
>      
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> Lustre-discuss mailing list
> [email protected]
> https://mail.clusterfs.com/mailman/listinfo/lustre-discuss
>   

