Hi Thomas,

Here's one thing to check (if you're trying to replace a tcp network with an IB one on an existing lustre filesystem):
With the lustre mounts unmounted, run:

tunefs.lustre --dryrun <DEV_PATH> | grep Parameters

Check that parameters like 'mgsnode=IP' end in @o2ib and not @tcp. If any still end in @tcp, erase and rewrite them.

Cheers,
Adam

Erik Froese wrote:
> Thomas,
>
> If you see an ib0 device and it has a valid IP, lnet should pick it up with
> options lnet networks="o2ib0(ib0)"
>
> What errors are you seeing?
>
> Erik
>
> On Tue, Jun 22, 2010 at 1:14 PM, Thomas Roth <[email protected]> wrote:
>
>> Hello Erik,
>>
>> thanks for your advice, esp. on routing - I'll study that carefully once
>> I get that far.
>> For now, I was just trying the minimal first steps to get lnet via IB:
>> - It's all happening on the MGS/MDS, but neither mgs nor mdt yet
>> mounted, just 'modprobe lnet; lctl network up; lctl list_nids'
>> - I tried to use IB exclusively.
>> - options lnet networks="o2ib0(ib0)" doesn't work either (nor
>> variations thereof)
>>
>> Regards,
>> Thomas
>>
>> On 22.06.2010 18:40, Erik Froese wrote:
>>
>>> Hey Thomas,
>>>
>>> Are you trying to connect to Lustre via IB and ethernet? If so, your
>>> modprobe config should look like this:
>>> options lnet networks="o2ib0(ib0),tcp0(eth0)"
>>>
>>> If you're IB only, use:
>>> options lnet networks="o2ib0(ib0)"
>>>
>>> If your MDS and OSS servers are on separate networks you'll need to
>>> do something different.
>>> Let's say the MDS and OSSs are on o2ib0/tcp0 and the clients are on
>>> o2ib1/tcp1. You'll need a router server with separate addresses on
>>> o2ib0 and o2ib1.
>>>
>>> Also, it's important to note that o2ib0 and o2ib1 should be different IP
>>> address spaces.
>>>
>>> On the clients.
>>> # I live on o2ib1
>>> options lnet networks="o2ib1(ib0),tcp1(eth0)"
>>> # To get to o2ib0 go through ip.add.of.rou...@o2ib1
>>> options lnet routes="o2ib0 ip.add.of.rou...@o2ib1"
>>>
>>> On the servers
>>> # I live on o2ib0
>>> options lnet networks="o2ib0(ib0),tcp0(eth0)"
>>> # To get to o2ib1 go through ip.add.of.rou...@o2ib0
>>> options lnet routes="o2ib1 ip.add.of.rou...@o2ib0"
>>>
>>> ip.add.of.rou...@o2ib0 and ip.add.of.rou...@o2ib1 are different IPs
>>> on distinct networks.
>>>
>>> lctl list_nids will show you the lustre nids of the node you're logged
>>> into only.
>>> lctl route_list will show you the lustre routers and the networks that
>>> they bridge.
>>>
>>> I hope this was helpful.
>>>
>>> Erik
>>>
>>> On Tue, Jun 22, 2010 at 10:19 AM, Thomas Roth <[email protected]> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I'm getting my feet wet in the infiniband lake and of course I run into
>>>> some problems.
>>>> It would seem I got the compilation part of sles11 kernel 2.6.27 +
>>>> Lustre 1.8.3 + ofed 1.4.2 right, because it allows me to see and use the
>>>> infiniband fabric, and because ko2iblnd loads without any complaints.
>>>>
>>>> In /etc/modprobe.d/lustre (this is a Debian system, hence this subdir of
>>>> modprobe-configs), I have
>>>>
>>>>> options ip2nets="o2ib0 192.168.0.[1-5]"
>>>>>
>>>> I load lnet and do 'lctl network up', but then 'lctl list_nids' will
>>>> invariably give me only
>>>>
>>>>> 192.168....@tcp
>>>>>
>>>> no matter how I twist the modprobe-config (ip2nets="o2ib",
>>>> network="o2ib", network="o2ib(ib0)", etc.)
>>>>
>>>> This is true as long as I have ib0 configured with the IP 192.168.0.1.
>>>> Once I unconfigure it, I get, quite expectedly,
>>>> LNET configure error 100: Network is down
>>>>
>>>> So I can either configure ipoib and bring up the network, but using tcp,
>>>> or I don't configure ib0 and then cannot start the network -? ;-{} I
>>>> think I'm rather missing something here.
>>>> Any clues?
>>>>
>>>> Cheers,
>>>> Thomas
>>>> _______________________________________________
>>>> Lustre-discuss mailing list
>>>> [email protected]
>>>> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>>>>
>>>>
>> --
>> --------------------------------------------------------------------
>> Thomas Roth
>> Department: Information Technology
>> Location: SB3 1.262
>> Phone: +49-6159-71 1453 Fax: +49-6159-71 2986
>>
>> GSI Helmholtzzentrum für Schwerionenforschung GmbH
>> Planckstraße 1
>> 64291 Darmstadt
>> www.gsi.de
>>
>> Limited liability company (Gesellschaft mit beschränkter Haftung)
>> Registered office: Darmstadt
>> Commercial register: Amtsgericht Darmstadt, HRB 1528
>>
>> Management: Professor Dr. Dr. h.c. Horst Stöcker,
>> Christiane Neumann, Dr. Hartmut Eickhoff
>>
>> Chair of the Supervisory Board: Dr. Beatrix Vierkorn-Rudolph
>> Deputy: Ministerialdirigent Dr. Rolf Bernhardt
>>
>
--
Adam Munro
System Administrator | SHARCNET | http://www.sharcnet.ca
Compute Canada | http://www.computecanada.org
519-888-4567 x36453
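
A minimal sketch of the check described at the top of this message (looking through the tunefs.lustre --dryrun output for parameters that still end in @tcp). It is not part of Lustre: the helper name find_tcp_params is made up for illustration, and it assumes the parameters appear as space-separated key=value tokens on a line containing "Parameters", which is what the grep in the message relies on. If that line appears more than once in the output, matching tokens are printed more than once.

```shell
# Sketch only: find_tcp_params is a hypothetical helper, not a Lustre tool.
# It reads `tunefs.lustre --dryrun` output on stdin and prints each
# space-separated token on a "Parameters" line whose NID ends in @tcp
# (optionally followed by a network number, e.g. @tcp0).
# Assumption: parameters are space-separated key=value tokens.
# Exit status follows the last grep: 0 if at least one @tcp token was
# printed, non-zero if none was found.
find_tcp_params() {
    grep 'Parameters' | tr ' ' '\n' | grep -E '@tcp[0-9]*([,:]|$)'
}

# Usage (DEV_PATH is a placeholder variable for the target device):
#   tunefs.lustre --dryrun "$DEV_PATH" | find_tcp_params
```

Anything this prints is a parameter that would need to be erased and rewritten with an @o2ib NID; no output means no @tcp NIDs were found on the Parameters line.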
