Megan,
You will have to rebuild Lustre from source. You will also need the
Mellanox IB driver source installed so that the Lustre build process
can pull the necessary bits from it.
The issue you are seeing is exactly what you think it is. The WC builds
use the RHEL in-kernel IB driver. I have even had issues with MDS/OSS
boxes running RHEL in-kernel IB and clients running Mellanox or OFED IB
drivers. Even though IB is a standard, you really need to have
everything, from core to edge, talking the same driver.
I recently did nearly the same config you have; RHEL6.2 x86_64, MLX
OFED, Lustre 2.1.3.
You could opt to run your Mellanox IB HCA using the RHEL in-kernel IB
drivers and not have to recompile anything.
--Jeff
On 11/20/12 1:20 PM, Ms. Megan Larko wrote:
Hello to Everyone!
I have a question to which I think I know the answer, but I am seeking
confirmation (re-assurance?).
I have built a RHEL 6.2 system with lustre-2.1.2. I am using the
rpms from the Whamcloud site for linux kernel
2.6.32-220.17.1.el6_lustre.x86_64 along with the version-matching
lustre, lustre-modules, lustre-ldiskfs, and kernel-devel. I also
have from the Whamcloud site
kernel-ib-1.8.5-2.6.32-220.17.1.el6_lustre.x86_64 and the related
kernel-ib-devel for same.
The lustre file system works properly for TCP.
I would like to use InfiniBand. The system has a new Mellanox card
for which mlxn1 firmware and drivers were installed. After this was
done (I cannot speak to before) the IB network will come up on boot
and copy and ping in a traditional network fashion.
Hard Part: I would like to run the lustre file system on the IB (ib0).
I re-created the lustre network to use /etc/modprobe.d/lustre.conf
pointing to o2ib in place of tcp0. I rebuilt the mgs/mdt and all
osts to use the IB network (the mgs/mds --failnode=[new_IB_addr] and
the osts point to mgs on IB net). When I modprobe lustre to start
the system I receive error messages stating that there are
Input/Output errors on lustre modules fld.ko, fid.ko, mdc.ko, osc.ko,
and lov.ko. The lustre.ko cannot be started. A look in
/var/log/messages reveals many "Unknown symbol" and "disagrees about
version of symbol" messages from the ko2iblnd module.
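To see at a glance which symbols are affected, the kernel messages can be grouped by module. The following is a minimal Python sketch, assuming the log lines use the kernel's usual "<module>: Unknown symbol <name>" and "<module>: disagrees about version of symbol <name>" wording. The function name summarize_symbol_errors is hypothetical, and the syslog timestamp/host prefix is simply ignored.

```python
import re
from collections import defaultdict

# Matches "<module>: Unknown symbol <name>" and
# "<module>: disagrees about version of symbol <name>" anywhere in a line.
SYMBOL_MSG = re.compile(
    r"(\w+): (disagrees about version of symbol|Unknown symbol) (\w+)"
)


def summarize_symbol_errors(log_text):
    """Group symbol-related module load errors by module and message kind.

    Returns {module: {message_kind: [sorted symbol names]}}.
    """
    summary = defaultdict(lambda: defaultdict(set))
    for line in log_text.splitlines():
        match = SYMBOL_MSG.search(line)
        if match:
            module, kind, symbol = match.groups()
            summary[module][kind].add(symbol)
    # Convert the nested sets to plain dicts of sorted lists for stable output.
    return {
        module: {kind: sorted(symbols) for kind, symbols in kinds.items()}
        for module, kinds in summary.items()
    }
```

Passing it the text of /var/log/messages (for example via open("/var/log/messages").read()) should return an entry for ko2iblnd listing the IB symbols it could not resolve, if the messages follow the assumed format.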
A modprobe --dump-modversions /path/to/kernel/ko2iblnd.ko shows it
pointing to the Module.symvers of the lustre kernel.
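The same comparison can be made mechanically by checking the CRCs that ko2iblnd.ko expects against those listed in a given Module.symvers. This is only a sketch under stated assumptions: both inputs are assumed to be whitespace-separated with the CRC in the first field and the symbol name in the second, which is how modprobe --dump-modversions output and Module.symvers are commonly laid out. The function names parse_crcs and find_mismatches are hypothetical.

```python
def parse_crcs(text):
    """Map symbol name -> CRC (int) from lines shaped like '<crc> <symbol> ...'.

    Assumption: the CRC is a hex number in field 1 and the symbol is field 2.
    Lines that do not fit that shape are skipped.
    """
    crcs = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        try:
            crcs[fields[1]] = int(fields[0], 16)
        except ValueError:
            continue
    return crcs


def find_mismatches(module_dump, symvers):
    """Compare the CRCs a module expects with those a Module.symvers lists.

    Returns (not_listed, disagreeing): symbols the module wants that the
    symvers text does not list at all, and symbols listed with another CRC.
    """
    wanted = parse_crcs(module_dump)
    provided = parse_crcs(symvers)
    not_listed = sorted(s for s in wanted if s not in provided)
    disagreeing = sorted(
        s for s in wanted if s in provided and wanted[s] != provided[s]
    )
    return not_listed, disagreeing
```

To use it, save the modprobe --dump-modversions output for ko2iblnd.ko to a file, read that file and /usr/src/ofa_kernel/Module.symvers as text, and pass both to find_mismatches. Symbols in the disagreeing list are the likely source of the "disagrees about version" messages. A symbol in not_listed is not necessarily a problem, since it may be exported by the core kernel rather than by the OFED modules.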
Am I correct in thinking that because of the specific Mellanox IB
hardware I have (with its own /usr/src/ofa_kernel/Module.symvers
file), that I have to build Lustre-2.1.2 from tarball to use the
configure --with-o2ib=/usr/src/ofa_kernel mandating that this
system use the ofa_kernel-1.8.5 modules and not the OFED 1.8.5 from
the kernel-ib rpms to which Lustre defaults in the Linux kernel?
Is a rebuild of lustre from source mandatory or is there a way in
which I may point to the appropriate symbols needed by the
ko2iblnd.ko?
Enjoy the Thanksgiving holiday for those U.S. readers. To everyone
else in the world, have a great weekend!
Megan Larko
Hewlett-Packard
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss
--
Jeff Johnson
Co-Founder
Aeon Computing
jeff.john...@aeoncomputing.com
www.aeoncomputing.com
t: 858-412-3810 x101 f: 858-412-3845
m: 619-204-9061
/* New Address */
4170 Morena Boulevard, Suite D - San Diego, CA 92117