On Mon, 25 Sep 2006, Or Gerlitz wrote:

> Jack Morgenstein wrote:
> > Did you recompile Lustre following the installation of ofed-1.1?
> > I'm not familiar with the Lustre installation procedure (i.e., if it
> > gets compiled on the current host). If yes, you probably merely need
> > to uninstall and reinstall Lustre o2ib.
>
> OK, can we state clearly what's the user needs to do with modules
> directly dependent on ofed symbols (eg Lustre's o2ib, NFSoRDMA, RDS and
> hopefully more to come).
>
> Is it recompile / uninstall / install ???

The issue is about the installation of Lustre 1.5.95 o2ib with OFED-1.1rc6
for SLES10. ofed-1.1-rc6 compiles nicely. The ib kernel modules all reside
under /lib/modules/2.6.16.21-0.8-smp/kernel/drivers/infiniband/ and do match
the ones compiled by ofed. I have tried these steps several times.

n32:~ # lsmod | grep ib
libcfs                103060  1 lnet
ib_ucm                 19332  0
ib_addr                10756  1 rdma_cm
ib_cm                  31968  2 ib_ucm,rdma_cm
ib_ipoib               48144  0
ib_sa                  16652  3 rdma_cm,ib_cm,ib_ipoib
ib_uverbs              38312  2 rdma_ucm,ib_ucm
ib_umad                17968  0
ib_mthca              116240  0
ib_mad                 36116  4 ib_cm,ib_sa,ib_umad,ib_mthca
ib_core                49024  9 ib_ucm,rdma_cm,ib_cm,ib_ipoib,ib_sa,ib_uverbs,ib_umad,ib_mthca,ib_mad

I compiled lustre for the above kernel and ofed installation. I get the
following when doing a 'lctl network up' in lustre. I have modversions set
to on in the kernel. If I set it to 'n' then I get a null pointer exception
and the module crashes.

ko2iblnd: disagrees about version of symbol ib_create_cq
ko2iblnd: Unknown symbol ib_create_cq
ko2iblnd: disagrees about version of symbol ib_dereg_mr
ko2iblnd: Unknown symbol ib_dereg_mr
ko2iblnd: disagrees about version of symbol ib_destroy_cq
ko2iblnd: Unknown symbol ib_destroy_cq
ko2iblnd: disagrees about version of symbol ib_get_dma_mr
ko2iblnd: Unknown symbol ib_get_dma_mr
ko2iblnd: disagrees about version of symbol ib_alloc_pd
ko2iblnd: Unknown symbol ib_alloc_pd
ko2iblnd: disagrees about version of symbol ib_modify_qp
ko2iblnd: Unknown symbol ib_modify_qp
ko2iblnd: disagrees about version of symbol ib_dealloc_pd
ko2iblnd: Unknown symbol ib_dealloc_pd
LustreError: 5725:0:(api-ni.c:1002:lnet_startup_lndnis()) Can't load LND o2ib, module ko2iblnd, rc=256

I have tried with ofed-1.1-rc5 and experience the same issue.

Thierry.

> Or.
>
> > On Sunday 24 September 2006 12:57, Thierry Delaitre wrote:
> >> I get the following when loading lustre o2ib module. I'm using ofed-1.1
> >> rc6 on sles10 and i'm sure the ib modules are the ones recompiled for the
> >> kernel i'm using and lustre too. I don't understand why i get the
> >> following as i only have one version of the ib modules ?
> >
> _______________________________________________
> openib-general mailing list
> [email protected]
> http://openib.org/mailman/listinfo/openib-general
>
> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
>

----------------------------------------
Dr Thierry DELAITRE
Systems and Services Manager, CSCS
University of Westminster
115 New Cavendish Street, London W1W 6UW
Tel: 020 7911 5000 ext: 3586
Fax: 020 7911 5089
Mobile short dial code 1788
http://www.cscs.wmin.ac.uk/~delaitt
----------------------------------------
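
One way to narrow down the "disagrees about version of symbol" messages is to compare the symbol CRCs that ko2iblnd was built against with the CRCs exported by the OFED build. With modversions enabled, the loader refuses a module whenever those two values differ for a symbol. The following is only a rough sketch: the function names are made up for illustration, and it assumes both inputs are plain text with one "<crc> <symbol> ..." entry per line, which is the layout I would expect from `modprobe --dump-modversions` on the module file and from a Module.symvers file.

```python
def parse_crcs(text):
    """Map symbol name -> CRC for lines shaped like '<crc> <symbol> ...'.

    Lines that do not start with a hex value are ignored.
    """
    crcs = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0].startswith("0x"):
            crcs[fields[1]] = int(fields[0], 16)
    return crcs


def find_mismatches(needed_text, exported_text):
    """Return {symbol: (crc_module_expects, crc_exported)} for differing CRCs.

    needed_text:   CRCs recorded in the dependent module (hypothetically the
                   output of 'modprobe --dump-modversions' on ko2iblnd.ko).
    exported_text: CRCs exported by the provider (hypothetically the
                   Module.symvers written by the OFED kernel build).
    Symbols missing from exported_text are skipped, since they are
    provided elsewhere (for example by the core kernel).
    """
    needed = parse_crcs(needed_text)
    exported = parse_crcs(exported_text)
    return {
        symbol: (crc, exported[symbol])
        for symbol, crc in needed.items()
        if symbol in exported and exported[symbol] != crc
    }
```

If symbols such as ib_create_cq or ib_alloc_pd show up in the result, that would suggest Lustre was compiled against a different set of InfiniBand symbol versions (for instance the ones shipped with the distribution kernel source) than the ones OFED installed, even though only one set of ib modules is loaded.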
