Re: [lustre-discuss] Lustre and OFED
To find the RDMA devices you can use:

ibv_devices

You can bench your RDMA connection using qperf:

yum install qperf -y

on the client machine:

qperf

on the server or OST machine:

qperf clienthostname ud_lat ud_bw

On Mon, Jul 31, 2017 at 4:08 PM, Ben Evans <bev...@cray.com> wrote:
> From: "E.S. Rosenberg" <esr+lus...@mail.hebrew.edu>
> Date: Sunday, July 30, 2017 at 7:53 AM
>
>> Now you've made me unsure whether or not my Lustre install is
>> using RDMA, how should I be able to tell (we are definitely using
>> IPoIB/o2ib)?
>
> If you are mounting Lustre with a string that looks like
> 192.168.1.10@o2ib,192.168.0.11@o2ib:/lustre then you're using OFED and RDMA.

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
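As a small illustration of the first step, the sketch below pulls the device names out of `ibv_devices`-style output. It assumes the usual layout (a header row, a dashed separator row, then one row per device); the helper name `rdma_device_names` and the sample GUID are made up so that it runs without IB hardware.

```shell
#!/bin/sh
# Sketch: list RDMA device names from `ibv_devices`-style output on stdin.
# Assumed layout: header row, dashed separator row, then "name guid" rows.
# rdma_device_names is a hypothetical helper, not part of libibverbs.
rdma_device_names() {
    awk 'NR > 2 && NF >= 2 { print $1 }'
}

# Sample output (made-up GUID) so the sketch runs without IB hardware.
sample_ibv_devices() {
    cat <<'EOF'
    device                 node GUID
    ------              ----------------
    mlx4_0              0002c90300000001
EOF
}

sample_ibv_devices | rdma_device_names

# On a real host with the libibverbs utilities installed:
# ibv_devices | rdma_device_names
```

An empty result on a real host would suggest that no RDMA-capable device is visible to the verbs stack, in which case the qperf RDMA tests above are unlikely to run.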
Re: [lustre-discuss] Lustre and OFED
From: "E.S. Rosenberg" <esr+lus...@mail.hebrew.edu>
Date: Sunday, July 30, 2017 at 7:53 AM
To: Ben Evans <bev...@cray.com>
Cc: Harald van Pee <p...@hiskp.uni-bonn.de>, "lustre-discuss@lists.lustre.org" <lustre-discuss@lists.lustre.org>
Subject: Re: [lustre-discuss] Lustre and OFED

> On Fri, Jul 28, 2017 at 6:22 PM, Ben Evans <bev...@cray.com> wrote:
>> You'd configure the lustre LNET to use it like any other ethernet device.
>> The downside of this is that it's slower due to a lack of RDMA and other
>> features that IB has. I'm not sure if there's a real upside to it.
>
> Now you've made me unsure whether or not my Lustre install is using RDMA,
> how should I be able to tell (we are definitely using IPoIB/o2ib)?

If you are mounting Lustre with a string that looks like
192.168.1.10@o2ib,192.168.0.11@o2ib:/lustre then you're using OFED and RDMA.
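The check described above (look at the network type after the `@` in the mount string) could be sketched roughly as follows. `mount_spec_nets` is a hypothetical helper written for this illustration, not a Lustre tool; it only parses the string and does not query LNet.

```shell
#!/bin/sh
# Sketch: print the LNet network type(s) named in a Lustre mount spec such as
# "192.168.1.10@o2ib,192.168.0.11@o2ib:/lustre" (one per line, de-duplicated).
# mount_spec_nets is a hypothetical helper, not a Lustre tool.
mount_spec_nets() {
    # Drop the ":/fsname" tail, split NIDs on "," or ":", keep text after "@".
    printf '%s\n' "${1%%:/*}" | tr ',:' '\n\n' | sed -n 's/^.*@//p' | sort -u
}

mount_spec_nets '192.168.1.10@o2ib,192.168.0.11@o2ib:/lustre'   # prints o2ib
mount_spec_nets '10.0.0.5@tcp:/lustre'                          # prints tcp
```

A result of `o2ib` points at the o2ib LND (RDMA), while `tcp` points at the socket LND, even if the underlying interface is IPoIB. On a live node, `lctl list_nids` should show the NIDs that LNet actually configured.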
Re: [lustre-discuss] Lustre and OFED
On Fri, Jul 28, 2017 at 6:22 PM, Ben Evans <bev...@cray.com> wrote:
> On 7/28/17, 11:12 AM, "Harald van Pee" <p...@hiskp.uni-bonn.de> wrote:
> >On Friday 28 July 2017 15:48:12 Ben Evans wrote:
> >> Eli, just to clarify are you talking about using the in-kernel OFED vs. a
> >> vendor (Mellanox) OFED, or
> >
> >In our case we are using the OFED of the debian distribution used.

I am using the IB support that ships with the Debian/CentOS/mainline kernel and
did not install any OFED/Mellanox OFED package. As far as I can tell RDMA does
work (using the various test tools suggested here:
https://community.mellanox.com/docs/DOC-2086).

> >> are you talking about using the ConnectX-3
> >> hardware in IPoIB mode and just using it as a faster Ethernet?
> >
> >Is that possible? How would one do this?
>
> You'd configure the lustre LNET to use it like any other ethernet device.
> The downside of this is that it's slower due to a lack of RDMA and other
> features that IB has. I'm not sure if there's a real upside to it.

Now you've made me unsure whether or not my Lustre install is using RDMA. How
can I tell (we are definitely using IPoIB/o2ib)?

Thanks,
Eli
Re: [lustre-discuss] Lustre and OFED
On 7/28/17, 11:12 AM, "Harald van Pee" <p...@hiskp.uni-bonn.de> wrote:
>On Friday 28 July 2017 15:48:12 Ben Evans wrote:
>> Eli, just to clarify are you talking about using the in-kernel OFED vs. a
>> vendor (Mellanox) OFED, or
>
>In our case we are using the OFED of the debian distribution used.
>
>> are you talking about using the ConnectX-3
>> hardware in IPoIB mode and just using it as a faster Ethernet?
>
>Is that possible? How would one do this?

You'd configure the lustre LNET to use it like any other ethernet device. The
downside of this is that it's slower due to a lack of RDMA and other features
that IB has. I'm not sure if there's a real upside to it.
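To make the difference concrete, here is a minimal sketch of the two LNet module option lines being contrasted. The helper `lnet_options_line` and the interface name `ib0` are assumptions for illustration; the printed line is the kind of entry that would typically go into a modprobe configuration file for the `lnet` module.

```shell
#!/bin/sh
# Sketch: print the modprobe "options lnet" line for one LNet network.
# lnet_options_line is a hypothetical helper; the interface name ib0 is
# an assumption.
lnet_options_line() {
    lnd=$1      # LND type: "tcp" (socket LND over IPoIB) or "o2ib" (RDMA verbs)
    iface=$2    # IPoIB interface name
    printf 'options lnet networks=%s0(%s)\n' "$lnd" "$iface"
}

lnet_options_line tcp ib0     # IPoIB treated as plain Ethernet, no RDMA
lnet_options_line o2ib ib0    # native InfiniBand with RDMA
```

With the `tcp` form the clients would then mount using `@tcp` NIDs, with the `o2ib` form using `@o2ib` NIDs.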
Re: [lustre-discuss] Lustre and OFED
Hello

On Friday 28 July 2017 15:48:12 Ben Evans wrote:
> Eli, just to clarify are you talking about using the in-kernel OFED vs. a
> vendor (Mellanox) OFED, or

In our case we are using the OFED that ships with our Debian distribution.

> are you talking about using the ConnectX-3
> hardware in IPoIB mode and just using it as a faster Ethernet?

Is that possible? How would one do this?

Harald
Re: [lustre-discuss] Lustre and OFED
Eli, just to clarify: are you talking about using the in-kernel OFED vs. a vendor (Mellanox) OFED, or are you talking about using the ConnectX-3 hardware in IPoIB mode and just using it as a faster Ethernet?

-Ben Evans
Re: [lustre-discuss] Lustre and OFED
Hi Eli,

we are running Lustre without OFED on Debian clients and servers.

With Lustre 2.4.0 on clients and servers: no problems at all for years.
With Lustre 2.5.3 on servers and 2.6.92: no problems, at least for months.
With Lustre 2.5.3 on servers and 2.7 on clients: always IB connection loss.
Here I'm wondering if a more recent OFED version could help?

We are mostly interested in a rock-solid Lustre version. Lustre 2.6 is fast enough for us but has a memory leak caused by cache usage; Lustre 2.7 was perfect for us in tests with a small number of machines, but fails completely for the full cluster and/or certain tasks.

Best
Harald

On Thursday, 27 July 2017 22:55:33 CEST E.S. Rosenberg wrote:
> How 'needed' is OFED for Lustre? In the LUG talks it is mentioned every
> once in a while and that got me thinking a bit.
>
> What things are gained by installing OFED? Performance? Accurate traffic
> reports?
Re: [lustre-discuss] Lustre and OFED
Jeff (and Grigory - offlist),

Thanks for your fast replies!

On Fri, Jul 28, 2017 at 12:09 AM, Jeff Johnson <jeff.john...@aeoncomputing.com> wrote:
> Eli,
>
> The biggest driver is usually the drivers. Newer Mellanox hardware not yet
> supported, or supported well, by kernel IB. Way back in the days of old
> there were some interoperability issues where everything (clients and
> servers) needed to be the same drivers and libraries but much of that was
> cleaned up. There could be situations where OFED is needed on the server
> side to support something under the Lustre layer like OST or MDT block
> devices via iSER, SRP, NVMeF, etc.
>
> There may be other reasons but those are off the top of my head.

So currently everything seems to be working just fine without OFED. My only complaint is that the normal Linux interface counters don't report traffic properly, which means I have to write my own perfquery wrappers for tools like Zabbix.

I may try adding OFED if I have time at some point, but I hope by then to at least have moved our servers to CentOS 7.3 + Lustre 2.9/10.

Has anyone ever run benchmarks of vanilla vs. OFED?

Thanks again,
Eli
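A perfquery wrapper of the kind mentioned above might look roughly like this. It is only a sketch: `perfquery_bytes` is a hypothetical helper, the sample counter values are made up, and it assumes the usual `Name:....value` output layout. The InfiniBand data counters count in units of four octets, hence the multiplication by 4.

```shell
#!/bin/sh
# Sketch: convert perfquery-style port data counters (stdin) into bytes.
# perfquery_bytes is a hypothetical helper; counter layout "Name:....value"
# is assumed. PortXmitData/PortRcvData count 4-octet units, so multiply by 4.
perfquery_bytes() {
    awk -F'[:.]+' '
        $1 == "PortXmitData" { tx = $2 * 4 }
        $1 == "PortRcvData"  { rx = $2 * 4 }
        END { printf "tx_bytes=%.0f rx_bytes=%.0f\n", tx, rx }
    '
}

# Made-up sample so the sketch runs without IB hardware.
sample_perfquery() {
    cat <<'EOF'
# Port counters: Lid 1 port 1
PortXmitData:....................1000
PortRcvData:.....................2500
PortXmitPkts:....................10
EOF
}

sample_perfquery | perfquery_bytes

# On a real host with infiniband-diags installed, something like:
# perfquery -x | perfquery_bytes
```

A monitoring agent would normally sample these values periodically and compute rates from the differences; the non-extended counters are only 32 bits wide and saturate quickly on FDR links, so the extended (`-x`) counters are the safer source.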
Re: [lustre-discuss] Lustre and OFED
Eli,

The biggest driver is usually the drivers. Newer Mellanox hardware may not yet be supported, or supported well, by kernel IB. Way back in the days of old there were some interoperability issues where everything (clients and servers) needed to be on the same drivers and libraries, but much of that was cleaned up. There could be situations where OFED is needed on the server side to support something under the Lustre layer, like OST or MDT block devices via iSER, SRP, NVMeF, etc.

There may be other reasons but those are off the top of my head.

--Jeff

On Thu, Jul 27, 2017 at 4:55 PM, E.S. Rosenberg wrote:
> How 'needed' is OFED for Lustre? In the LUG talks it is mentioned every
> once in a while and that got me thinking a bit.
>
> What things are gained by installing OFED? Performance? Accurate traffic
> reports?

--
Jeff Johnson
Co-Founder
Aeon Computing

jeff.john...@aeoncomputing.com
www.aeoncomputing.com
t: 858-412-3810 x1001 f: 858-412-3845
m: 619-204-9061

4170 Morena Boulevard, Suite D - San Diego, CA 92117

High-Performance Computing / Lustre Filesystems / Scale-out Storage
[lustre-discuss] Lustre and OFED
Hi all,

How 'needed' is OFED for Lustre? In the LUG talks it is mentioned every once in a while and that got me thinking a bit.

What things are gained by installing OFED? Performance? Accurate traffic reports?

Currently I am using a Lustre system without OFED, but our IB hardware is from the FDR generation, so not bleeding edge, and probably doesn't need OFED because of that.

Thanks,
Eli

Tech specs:
Servers: CentOS 6.8 + Lustre 2.8 (kernel from Lustre RPMs)
Clients: Debian + kernel 4.2 + Lustre 2.8
IB: ConnectX-3 FDR
Re: [Lustre-discuss] Lustre 1.6.4.3 + OFED 1.2.5.5 + RHEL 4u4 AS
It doesn't appear as if you've come up with a solution to this problem. I too have run into the same set of issues as you appear to have here, but I think I have resolved them. I am running SLES 10 SP1 but this should also work for you.

First, patch lnet/klnds/o2iblnd/o2iblnd.h. You'll need to add the following line somewhere near the top:

#include <linux/pci.h>

Next you'll need to patch lnet/klnds/o2iblnd/o2iblnd.c. ib_create_cq takes 6 args in OFED-1.2.5.5 but o2iblnd.c only passes 5 in the call. The new line should look like this:

cq = ib_create_cq(cmid->device, kiblnd_cq_completion, kiblnd_cq_event, conn, IBLND_CQ_ENTRIES(), 0);

After applying the two changes mentioned above I have been able to build; 'make rpms' finishes, though it is a tad chatty with warning messages.

On May 15, 12:37 pm, Malcolm Cowe [EMAIL PROTECTED] wrote:
> Hi Folks,
>
> Having some trouble with building Lustre 1.6.4.3 with OFED 1.2.5.5 on a RHEL
> 4u4 AS server. Could somebody please help me to understand where I've gone
> wrong?
[Lustre-discuss] Lustre 1.6.4.3 + OFED 1.2.5.5 + RHEL 4u4 AS
Hi Folks,

Having some trouble with building Lustre 1.6.4.3 with OFED 1.2.5.5 on a RHEL 4u4 AS server. Could somebody please help me to understand where I've gone wrong? Here's what I have done so far:

1. Install RHEL 4u4 AS (full installation).

2. Download the Lustre RPMs from sun.com:

   e2fsprogs-1.40.4.cfs1-0redhat.x86_64.rpm
   e2fsprogs-devel-1.40.4.cfs1-0redhat.x86_64.rpm
   kernel-lustre-smp-2.6.9-67.0.4.EL_lustre.1.6.4.3.x86_64.rpm
   kernel-lustre-source-2.6.9-67.0.4.EL_lustre.1.6.4.3.x86_64.rpm
   lustre-1.6.4.3-2.6.9_67.0.4.EL_lustre.1.6.4.3smp.x86_64.rpm
   lustre-modules-1.6.4.3-2.6.9_67.0.4.EL_lustre.1.6.4.3smp.x86_64.rpm
   lustre-source-1.6.4.3-2.6.9_67.0.4.EL_lustre.1.6.4.3smp.x86_64.rpm
   plus: lustre-1.6.4.3.tar.gz

3. Install kernel-lustre-smp and kernel-lustre-source rpms.
   - Change grub to boot from lustre patched kernel by default.
   - Reboot.

4. Download OFED distribution from openib.org: OFED-1.2.5.5.tgz

5. Extract OFED distribution.

6. Install OFED:

   cd OFED-1.2.5.5/
   ./install.sh
     2) Install OFED Software
     3) All packages (all of Basic, HPC)
     [accept defaults for everything, configure IPoIB IP address].

7. Reboot.

8. Modify Module.symvers, removing all references to Infiniband modules supplied with the kernel distribution. N.B. Could not find this file in the lustre kernel source tree, only in the -obj tree.

   vi /usr/src/linux-2.6.9-67.0.4.EL_lustre.1.6.4.3-obj/x86_64/smp/Module.symvers

9. Run /usr/share/doc/ofed-docs-1.2.5.5/create_Module.symvers.sh and append the resulting file to the existing Module.symvers file:

   /usr/share/doc/ofed-docs-1.2.5.5/create_Module.symvers.sh
   cat Module.symvers >> /usr/src/linux-2.6.9-67.0.4.EL_lustre.1.6.4.3-obj/x86_64/smp/Module.symvers

10. Change into the lustre kernel source and edit the Makefile. Change custom suffix to smp in the variable EXTRAVERSION.

11. Change into the lustre kernel source and run the setup commands:

   cd /usr/src/linux-2.6.9-67.0.4.EL_lustre.1.6.4.3
   [linux]$ cp /boot/config-`uname -r` .config
   [linux]$ make oldconfig || make menuconfig
   # For 2.6 kernels
   [linux]$ make include/asm
   [linux]$ make include/linux/version.h
   [linux]$ make SUBDIRS=scripts

12. Extract the lustre source distribution (using lustre-1.6.4.3.tar.gz rather than the RPM):

   tar zxf lustre-1.6.4.3.tar.gz

13. Run the configure script:

   cd lustre-1.6.4.3/
   ./configure --with-linux=/usr/src/linux-2.6.9-67.0.4.EL_lustre.1.6.4.3 --with-o2ib=/usr/src/ofa_kernel

   N.B. Cannot include --with-linux-obj= option as the configure script exits with an error and a recommendation to run make config in the linux src tree:

   checking for /usr/src/linux-2.6.9-67.0.4.EL_lustre.1.6.4.3-obj/include/linux/autoconf.h... no
   configure: error: Run make config in /usr/src/linux-2.6.9-67.0.4.EL_lustre.1.6.4.3.

If I do this, the build fails very early on:

[EMAIL PROTECTED] lustre-1.6.4.3]# make
test -d CVS || exit 0; \
list=; for mod in $list; do \
perl ./build/kabi -v archive $HOME/nonfree $mod || exit $?; \
done
make all-recursive
make[1]: Entering directory `/root/HPC/build/lustre-1.6.4.3'
Making all in ldiskfs
make[2]: Entering directory `/root/HPC/build/lustre-1.6.4.3/ldiskfs'
test -d CVS || exit 0; \
list=; for mod in $list; do \
perl ./build/kabi -v archive $HOME/nonfree $mod || exit $?; \
done
make all-recursive
make[3]: Entering directory `/root/HPC/build/lustre-1.6.4.3/ldiskfs'
Making all in .
make[4]: Entering directory `/root/HPC/build/lustre-1.6.4.3/ldiskfs'
for dir in ldiskfs ; do \
make sources -C $dir || exit $? ; \
done
make[5]: Entering directory `/root/HPC/build/lustre-1.6.4.3/ldiskfs/ldiskfs'
rm -rf linux-stage linux sources
mkdir -p linux-stage/fs/ext3 linux-stage/include/linux
cp /usr/src/linux-2.6.9-67.0.4.EL_lustre.1.6.4.3/fs/ext3/acl.c /usr/src/linux-2.6.9-67.0.4.EL_lustre.1.6.4.3/fs/ext3/balloc.c /usr/src/linux-2.6.9-67.0.4.EL_lustre.1.6.4.3/fs/ext3/bitmap.c /usr/src/linux-2.6.9-67.0.4.EL_lustre.1.6.4.3/fs/ext3/dir.c /usr/src/linux-2.6.9-67.0.4.EL_lustre.1.6.4.3/fs/ext3/file.c /usr/src/linux-2.6.9-67.0.4.EL_lustre.1.6.4.3/fs/ext3/fsync.c /usr/src/linux-2.6.9-67.0.4.EL_lustre.1.6.4.3/fs/ext3/hash.c /usr/src/linux-2.6.9-67.0.4.EL_lustre.1.6.4.3/fs/ext3/ialloc.c /usr/src/linux-2.6.9-67.0.4.EL_lustre.1.6.4.3/fs/ext3/inode.c /usr/src/linux-2.6.9-67.0.4.EL_lustre.1.6.4.3/fs/ext3/ioctl.c /usr/src/linux-2.6.9-67.0.4.EL_lustre.1.6.4.3/fs/ext3/namei.c /usr/src/linux-2.6.9-67.0.4.EL_lustre.1.6.4.3/fs/ext3/resize.c /usr/src/linux-2.6.9-67.0.4.EL_lustre.1.6.4.3/fs/ext3/super.c /usr/src/linux-2.6.9-67.0.4.EL_lustre.1.6.4.3/fs/ext3/symlink.c /usr/src/linux-2.6.9-67.0.4.EL_lustre.1.6.4.3/fs/ext3/xattr.c /usr/src/linux-2.6.9-67.0.4.EL_lustre.1.6.4.3/fs/ext3/xattr_security.c /usr/src/linux-2.6.9-67.0.4.EL_lustre.1.6.4.3/fs/ext3/xattr_trusted.c /usr/src/linux-2.6.9-67.0.4.EL_lustre.1.6.4.3/fs/ext3/xattr_user.c
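For readers following steps 8 and 9 of the message above, the manual edit of Module.symvers could perhaps be scripted along these lines. This is a sketch under assumptions: `merge_symvers` is a hypothetical helper, the demo data is made up, and it assumes the kernel's own InfiniBand symbols can be recognised by `drivers/infiniband` in the module path column.

```shell
#!/bin/sh
# Sketch of steps 8 and 9: drop the kernel's own InfiniBand symbols from a
# Module.symvers file, then append the OFED-generated entries.
# merge_symvers and all file names here are hypothetical.
merge_symvers() {
    kernel_symvers=$1   # Module.symvers from the kernel -obj tree
    ofed_symvers=$2     # output of create_Module.symvers.sh
    # Keep every line that does not come from drivers/infiniband ...
    grep -v 'drivers/infiniband' "$kernel_symvers"
    # ... then add the OFED-provided symbols.
    cat "$ofed_symvers"
}

# Demo on made-up data; real CRCs and paths will differ.
tmp=$(mktemp -d)
printf '0x11111111\tprintk\tvmlinux\n0x22222222\tib_create_cq\tdrivers/infiniband/core/ib_core\n' > "$tmp/kernel"
printf '0x33333333\tib_create_cq\tdrivers/infiniband/core/ib_core\n' > "$tmp/ofed"
merge_symvers "$tmp/kernel" "$tmp/ofed" > "$tmp/merged"
cat "$tmp/merged"
rm -rf "$tmp"
```

Writing the result to a new file and reviewing it before replacing the original Module.symvers keeps the step reversible, which an in-place `vi` edit does not.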
Re: [Lustre-discuss] Lustre 1.6.4.3 + OFED 1.2.5.5 + RHEL 4u4 AS
A search of bugzilla yields bug 15315 which identifies bug 15030 as well. Please read through those two bugs.

b.

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss
Re: [Lustre-discuss] Lustre 1.4.6.3 + OFED 1.2.5.5: Error loading ko2iblnd
> From: [EMAIL PROTECTED] On Behalf Of Kumaran Rajaram
> Sent: Tuesday, May 06, 2008 1:12 PM
>
> Hardware Config: x86_64
> Software Config: SLES10.1, 2.6.16.46-0.12-lustre (Stock SP1 Kernel +
> Lustre patches), OFEDv1.2.5.5, Lustre-1.6.4.3
>
> Status:
> Lustre + TCP builds and loads fine
> Lustre + o2ib builds but ko2iblnd does not load :-(
>
> Applied the Bugzilla patch 12276 to get Lustre compiled with OFEDv1.2.5.5.
> Configured as follows (see config.out attached):
>
> ./configure --with-linux=/usr/src/linux --with-o2ib=/usr/src/ofa_kernel-1.2.5.5

Kums,

Try '--with-o2ib=/usr/src/ofa_kernel' (leaving out the '-1.2.5.5'). This worked for me in different circumstances (Lustre 1.4.12, RHEL 4 kernel 2.6.9-67.0.4).

David

--
David Kewley
Dell Infrastructure Consulting Services
[Lustre-discuss] Lustre 1.4.6.3 + OFED 1.2.5.5: Error loading ko2iblnd
Hi, My cluster configuration is as follows: Hardware Config: x86_64 Software Config: SLES10.1, 2.6.16.46-0.12-lustre (Stock SP1 Kernel + Lustre patches), OFEDv1.2.5.5, Lustre-1.6.4.3 Status: Lustre + TCP builds and loads fine Lustre + o2ib builds but ko2iblnd does not load :-( Applied the Bugzilla patch 12276 to get Lustre compiled with OFEDv1.2.5.5. Configured as follows (see config.out attached): ./configure --with-linux=/usr/src/linux --with-o2ib=/usr/src/ofa_kernel-1.2.5.5 Get the following warnings when building the RPMs: - WARNING: rdma_accept [/usr/src/packages/BUILD/lustre-1.6.4.3/lnet/klnds/o2iblnd/ko2iblnd.ko] undefined! WARNING: rdma_destroy_id [/usr/src/packages/BUILD/lustre-1.6.4.3/lnet/klnds/o2iblnd/ko2iblnd.ko] undefined! WARNING: rdma_connect [/usr/src/packages/BUILD/lustre-1.6.4.3/lnet/klnds/o2iblnd/ko2iblnd.ko] undefined! - I get the following error when I try to load ko2iblnd.ko (modprobe). - May 6 16:42:47 storagehost kernel: ko2iblnd: disagrees about version of symbol ib_create_cq May 6 16:42:47 storagehost kernel: ko2iblnd: Unknown symbol ib_create_cq May 6 16:42:47 storagehost kernel: ko2iblnd: disagrees about version of symbol ib_dereg_mr May 6 16:42:47 storagehost kernel: ko2iblnd: Unknown symbol ib_dereg_mr May 6 16:42:47 storagehost kernel: ko2iblnd: disagrees about version of symbol ib_destroy_cq May 6 16:42:47 storagehost kernel: ko2iblnd: Unknown symbol ib_destroy_cq May 6 16:42:47 storagehost kernel: ko2iblnd: disagrees about version of symbol ib_get_dma_mr In addition to the IB source, made the following symbolic links i) /usr/src/linux/drivers/infiniband to /usr/src/ofa_kernel-1.2.5.5 ii) /usr/src/linux/include/rdma to /usr/src/ofa_kernel-1.2.5.5/include/rdma iii) /usr/include/infiniband/verbs.h is from OFED-1.2.5.5 storagehost[3] root~$ modinfo ib_core filename: /lib/modules/2.6.16.46-0.12-lustre/updates/kernel/drivers/infiniband/core/ib_core.ko license:Dual BSD/GPL description:core kernel InfiniBand API author: Roland Dreier 
srcversion: 4429863EA75C0750E651039
depends:
vermagic: 2.6.16.46-0.12-lustre SMP gcc-4.1

storagehost[5] root/lib/modules/2.6.16.46-0.12-lustre/updates/kernel/drivers/infiniband/core$ nm ib_core.ko | grep ib_create_cq
66d8cf93 A __crc_ib_create_cq
55140081 A __crc_ib_create_cq_mod
0c61 T ib_create_cq
0cb7 T ib_create_cq_mod
00a0 r __kcrctab_ib_create_cq
0098 r __kcrctab_ib_create_cq_mod
0161 r __kstrtab_ib_create_cq
0150 r __kstrtab_ib_create_cq_mod
0140 r __ksymtab_ib_create_cq
0130 r __ksymtab_ib_create_cq_mod

Any ideas as to what I may be doing wrong that keeps ko2iblnd.ko from loading properly?

Thanks in advance,
-Kums

checking build system type... x86_64-unknown-linux-gnu
checking host system type... x86_64-unknown-linux-gnu
checking target system type... x86_64-unknown-linux-gnu
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for gawk... gawk
checking whether make sets $(MAKE)... yes
checking for gcc... gcc
checking for C compiler default output file name... a.out
checking whether the C compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables...
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ANSI C... none needed
checking for style of include used by make... GNU
checking dependency style of gcc... gcc3
checking how to run the C preprocessor... gcc -E
checking for egrep... grep -E
checking for ANSI C header files... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking whether to build Cray XT3 features... no
checking whether to build BGL features... no
checking for ranlib... ranlib
checking for buggy compiler... no known problems
checking for unsigned long long... yes
checking size of unsigned long long... 8
--- size SIZEOF --- size SIZEOF 8
checking whether __i386__ is declared... no
checking if gcc accepts -m64... yes
checking whether to posix osd... no
checking whether to build docs... no
checking whether to build utilities... yes
checking whether to install init scripts... no
checking whether to build Lustre tests... yes
checking whether to build Lustre server support... yes
checking whether to build Lustre client support... yes
./configure: line 4461: LC_CONFIG_SPLIT: command not found
./configure: line 4462: LC_CONFIG_LDISKFS: command not found
checking whether to enable