Hi Diego,

Do you have any other module parameters set for lnet and the LND?
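
One way to collect them is sketched below. It assumes the modules are already loaded and expose their parameters under /sys/module, and that ko2iblnd is the LND in use (the networks here are o2ib). The function name show_module_params and the SYSFS_MODULES variable are hypothetical, introduced only for this example.

```shell
#!/bin/sh
# Hypothetical helper: print every readable parameter of the named kernel
# modules as "module.parameter=value".
# SYSFS_MODULES is an assumption of this sketch; it is overridable so the
# function can be tried against a scratch directory.
SYSFS_MODULES="${SYSFS_MODULES:-/sys/module}"

show_module_params() {
    for mod in "$@"; do
        dir="$SYSFS_MODULES/$mod/parameters"
        if [ ! -d "$dir" ]; then
            # Module not loaded, or it exports no parameters.
            echo "$mod: not loaded or no parameters" >&2
            continue
        fi
        for f in "$dir"/*; do
            [ -r "$f" ] || continue
            printf '%s.%s=%s\n' "$mod" "$(basename "$f")" "$(cat "$f")"
        done
    done
}
```

Running `show_module_params lnet ko2iblnd` on a server and on a client would give two listings that can be compared side by side.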

Regards
Liang 


On Mar 22, 2011, at 9:26 PM, Diego Moreno wrote:

> Hi,
> 
> We are seeing this problem right now with our Lustre 2.0 installation. We 
> tried the proposed solutions, but none of them worked.
> 
> We have 2 QDR IB cards on 4 servers and we have to run "lctl ping" from 
> each server to every client before the clients can connect to the servers. 
> We don't have the ib_mthca module loaded because we don't have DDR cards, 
> and configuring ip2nets made no difference.
> 
> Our ip2nets configuration ([7-10] interfaces are in servers, the others 
> are in clients):
> o2ib0(ib0) 10.50.0.[7-10] ; o2ib1(ib1) 10.50.1.[7-10] ; o2ib0(ib0) 
> 10.50.*.* ; o2ib1(ib0) 10.50.*.*
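
For reference, ip2nets is normally passed to the lnet module as a module option. The sketch below only builds that option line from a list of rules. The helper name ip2nets_option is hypothetical, and where the line is stored (for example a file under /etc/modprobe.d) is an assumption, since the message does not say where the setting lives.

```shell
#!/bin/sh
# Hypothetical helper: join ip2nets rules with "; " and print them as an
# "options lnet" line suitable for a modprobe configuration file.
ip2nets_option() {
    rules=""
    for rule in "$@"; do
        if [ -z "$rules" ]; then
            rules="$rule"
        else
            rules="$rules; $rule"
        fi
    done
    printf 'options lnet ip2nets="%s"\n' "$rules"
}
```

Each rule is passed as one quoted argument, for example `ip2nets_option 'o2ib0(ib0) 10.50.0.[7-10]' 'o2ib1(ib1) 10.50.1.[7-10]'`. The lnet module has to be reloaded before a changed option takes effect.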
> 
> So the only way of having clients connected to servers is doing 
> something like this on every server:
> 
> for i in $CLIENT_IB_LIST ; do
> lctl ping $i@o2ib0
> lctl ping $i@o2ib1
> done
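
The same loop can be written so that it reports which client NIDs did not answer. This is only a sketch: it assumes "lctl ping" exits with a non-zero status when the peer does not reply, and the names ping_clients, LCTL and LNET_NETS are hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper around the workaround loop: ping every client on each
# LNet network and list the NIDs that did not answer.
LCTL="${LCTL:-lctl}"                    # overridable for a dry run
LNET_NETS="${LNET_NETS:-o2ib0 o2ib1}"   # networks taken from the loop above

ping_clients() {
    failed=0
    for ip in "$@"; do
        for net in $LNET_NETS; do
            # Assumption: a failed ping gives a non-zero exit status.
            if ! $LCTL ping "$ip@$net" >/dev/null 2>&1; then
                echo "no reply from $ip@$net" >&2
                failed=$((failed + 1))
            fi
        done
    done
    # Succeed only if every NID answered.
    [ "$failed" -eq 0 ]
}
```

It would be called as `ping_clients $CLIENT_IB_LIST` on each server, with the list left unquoted so that it splits into one address per argument.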
> 
> Before "lctl ping" we get messages like this one:
> 
> Lustre: 50389:0:(lib-move.c:1028:lnet_post_send_locked()) Dropping 
> message for 12345-10.50.1.7@o2ib1: peer not alive
> 
> After "lctl ping" everything works right.
> 
> Maybe I'm missing something or this is a known bug in lustre 2.0...
> 
> 
> On 16/03/2011 22:13, Andreas Dilger wrote:
>> On 2011-03-16, at 3:04 PM, Mike Hanby wrote:
>>> Thanks, I forgot to include the card info:
>>> 
>>> The servers each have a single IB card: dual port MT26528 QDR
>>> o2ib0(ib0) on each server is attached to the QLogic switch (with three 
>>> attached M3601Q switches 48 attached blades)
>>> o2ib1(ib1) on each server is attached to a stack of two M3601Q switches 
>>> with 24 attached blades
>>> 
>>> The blades connected to o2ib0 each have an MT26428 QDR IB card
>>> The blades connected to o2ib1 each have an MT25418 DDR IB card
>> 
>> You may also want to check out the ip2nets option for specifying the Lustre 
>> networks.  It is made to handle configuration issues like this where the 
>> interface name is not constant across client/server nodes.
>> 
>>> 
>>> -----Original Message-----
>>> From: [email protected] 
>>> [mailto:[email protected]] On Behalf Of Nirmal Seenu
>>> Sent: Wednesday, March 16, 2011 2:10 PM
>>> To: [email protected]
>>> Subject: Re: [Lustre-discuss] Lustre over o2ib issue
>>> 
>>> If you are using DDR and QDR, or any 2 different cards, in the same 
>>> machine, there is no guarantee that the same IB cards always get 
>>> assigned to ib0 and ib1.
>>> 
>>> To fix that problem you need to comment out the following 3 lines in 
>>> /etc/init.d/openibd:
>>> 
>>>     #for i in `grep "^driver: " /etc/sysconfig/hwconf | sed -e 's/driver: //' | grep -w "ib_mthca\\\|ib_ipath\\\|mlx4_core\\\|cxgb3\\\|iw_nes"`; do
>>>     #    load_modules $i
>>>     #done
>>> 
>>> and include the following lines instead (we wanted the DDR card to be ib0 
>>> and the QDR card to be ib1):
>>>     load_modules ib_mthca
>>>     /bin/sleep 10
>>>     load_modules mlx4_core
>>> 
>>> and you will need to restart openibd once again (we included it in 
>>> rc.local) to make sure that the same IB cards are assigned to the devices 
>>> ib0 and ib1.
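
Whether the ordering took effect can be checked after the restart by looking at which kernel driver backs each interface. The sketch below assumes the usual sysfs layout, where /sys/class/net/<interface>/device/driver is a symlink to the driver directory. The function name ib_driver and the SYSFS_NET variable are hypothetical.

```shell
#!/bin/sh
# Hypothetical check: print the kernel driver behind a network interface,
# e.g. ib_mthca for the DDR card or mlx4_core for the QDR card.
# SYSFS_NET is an assumption of this sketch, overridable for testing.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

ib_driver() {
    link="$SYSFS_NET/$1/device/driver"
    # Fail if the interface does not exist or has no bound driver.
    [ -e "$link" ] || return 1
    basename "$(readlink -f "$link")"
}
```

With the load order described above, `ib_driver ib0` would be expected to print ib_mthca and `ib_driver ib1` to print mlx4_core.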
>>> 
>>> Nirmal
>>> _______________________________________________
>>> Lustre-discuss mailing list
>>> [email protected]
>>> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>> 
>> 
>> Cheers, Andreas
>> --
>> Andreas Dilger
>> Principal Engineer
>> Whamcloud, Inc.
>> 
>> 
>> 
