On Thu, Feb 4, 2010 at 1:00 PM, Ira Weiny <[email protected]> wrote:
> On Thu, 4 Feb 2010 09:19:39 -0500
> Hal Rosenstock <[email protected]> wrote:
>
>> On Tue, Feb 2, 2010 at 7:45 PM, Ira Weiny <[email protected]> wrote:
>> > Sasha,
>> >
>> > Following up on our thread regarding having multiple outstanding SMPs in
>> > libibnetdisc.
>> >
>> > These 2 patches implement that and add a function to set the maximum number
>> > of outstanding SMPs the library will use.
>> >
>> > I left the default at 4.  On a large cluster there seems to be some
>> > variance when using 8 or 12: sometimes I get a speedup over 4 and other
>> > times I don't see any.  I think it has to do with the traffic on the
>> > fabric at any particular time.
>> >
>> > For example here are some runs I just did on Hyperion.
>> >
>> > 14:31:55 > /usr/sbin/ibqueryerrors -s RcvErrors,SymbolErrors,RcvSwRelayErrors,XmtWait -r --data
>> > Suppressing: RcvErrors SymbolErrors RcvSwRelayErrors XmtWait
>> > Errors for 0x66a00d90006fb "SW19"
>> >   GUID 0x66a00d90006fb port 9: [VL15Dropped == 3] [XmtData == 14562048] [RcvData == 14563872] [XmtPkts == 202255] [RcvPkts == 202276]
>> >       Link info:    139   9[  ] ==( 4X 5.0 Gbps Active/  LinkUp)==>  0x0002c9030001d736    864    1[  ] "hyperion1" ( )
>> >
>> > 14:32:02 > time ./ibnetdiscover -o 8 --node-name-map /etc/opensm/ib-node-name-map -g > new
>> >
>> > real    0m2.210s
>> > user    0m1.251s
>> > sys     0m0.869s
>> >
>> > 14:40:36 > time ./ibnetdiscover -o 4 --node-name-map /etc/opensm/ib-node-name-map -g > new
>> >
>> > real    0m3.385s
>> > user    0m1.888s
>> > sys     0m1.448s
>> >
>> > 14:40:46 > time ./ibnetdiscover -o 4 --node-name-map /etc/opensm/ib-node-name-map -g > new
>> >
>> > real    0m2.211s
>> > user    0m1.165s
>> > sys     0m0.951s
>> >
>> > 14:40:51 > time ./ibnetdiscover -o 8 --node-name-map /etc/opensm/ib-node-name-map -g > new
>> >
>> > real    0m2.249s
>> > user    0m1.244s
>> > sys     0m0.936s
>> >
>> > 14:40:59 > time ./ibnetdiscover -o 4 --node-name-map /etc/opensm/ib-node-name-map -g > new
>> >
>> > real    0m2.170s
>> > user    0m1.160s
>> > sys     0m0.933s
>> >
>> > 14:41:10 > /usr/sbin/ibqueryerrors -s RcvErrors,SymbolErrors,RcvSwRelayErrors,XmtWait -r --data
>> > Suppressing: RcvErrors SymbolErrors RcvSwRelayErrors XmtWait
>> > Errors for 0x66a00d90006fb "SW19"
>> >   GUID 0x66a00d90006fb port 9: [VL15Dropped == 3] [XmtData == 25187379] [RcvData == 25196688] [XmtPkts == 349861] [RcvPkts == 349954]
>> >       Link info:    139   9[  ] ==( 4X 5.0 Gbps Active/  LinkUp)==>  0x0002c9030001d736    864    1[  ] "hyperion1" ( )
>> >
>> > Note that there were no additional VL15Dropped packets on the fabric.  I 
>> > think 4 seems to be a good compromise.  I have not tested when there are 
>> > errors on the fabric.  (Right now things seem to be good!)
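
The two ibqueryerrors snapshots quoted above can be compared counter by counter. The following is a minimal Python sketch, not part of the posted patches: `parse_counters` is a hypothetical helper, and the two strings are copied from the port 9 lines of the snapshots. Note that the deltas cover everything that crossed that port between 14:31:55 and 14:41:10, not only the ibnetdiscover runs.

```python
import re

def parse_counters(line):
    """Extract '[Name == value]' pairs from an ibqueryerrors port line."""
    return {name: int(value)
            for name, value in re.findall(r"\[(\w+) == (\d+)\]", line)}

# Counter text copied from the two snapshots of SW19 port 9.
before = parse_counters(
    "[VL15Dropped == 3] [XmtData == 14562048] [RcvData == 14563872] "
    "[XmtPkts == 202255] [RcvPkts == 202276]")
after = parse_counters(
    "[VL15Dropped == 3] [XmtData == 25187379] [RcvData == 25196688] "
    "[XmtPkts == 349861] [RcvPkts == 349954]")

# Per-counter change between the two snapshots.
delta = {name: after[name] - before[name] for name in before}

for name, change in delta.items():
    print(f"{name}: +{change}")
```

A VL15Dropped delta of 0 is what the note above refers to: no additional management packets were dropped on this port during the test runs.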
>>
>> Is this just with the SM doing light sweeping ?
>
> Yes.

That's not a lot of SMP stress from the SM side. SMP consumers are SM,
diags, and the unsolicited traps.

>
>>
>> Is there a speedup with 4 rather than 2 ?
>
> There is a bit of a speedup (~0.5 to 1.0 sec).  But my main reason for wanting
> to go to 4 is that if there are issues on the fabric (unresponsive nodes,
> etc.), 4 will give us better parallelism to get around them.  I have not had
> the chance to test this condition with the new algorithm, but the original
> ibnetdiscover would slow way down when there were nodes with unresponsive
> SMAs.  With only 2 outstanding SMPs we would not get much of a speedup in that
> case.  This was my main motivation for improving the library in this way.
>
> Also, I think you are correct that we should increase OpenSM's default from 4
> to 8, for the same reason as above.  Some of our clusters have worked better
> with 8 when we are having issues, but right now we are still running with 4.
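
The parallelism argument above can be illustrated with a toy model. This is a sketch in Python, not libibnetdisc's algorithm: `sweep_time` is a hypothetical helper, and the 1 ms reply cost and 3000 ms give-up cost are made-up numbers chosen only to show the shape of the effect. The model also ignores fabric congestion and dropped SMPs, which is the cost a larger window may incur.

```python
import heapq

def sweep_time(costs, window):
    """Wall time (ms) to complete every request in `costs`, issued in order,
    with at most `window` requests outstanding at once."""
    if window < 1:
        raise ValueError("window must be >= 1")
    in_flight = []  # min-heap of completion times of outstanding requests
    now = 0
    for cost in costs:
        if len(in_flight) == window:
            # Window is full: wait until the earliest outstanding request ends.
            now = heapq.heappop(in_flight)
        heapq.heappush(in_flight, now + cost)
    return max(in_flight, default=0)

# Hypothetical fabric: 4 unresponsive SMAs that each cost 3000 ms before the
# requester gives up, followed by 8 healthy nodes that answer in 1 ms.
costs = [3000] * 4 + [1] * 8

for window in (1, 2, 4, 8):
    print(f"window={window}: {sweep_time(costs, window)} ms")
```

With these made-up costs the model gives 12008 ms for a window of 1, 6004 ms for 2, and 3002 ms for 4: each slot stuck on an unresponsive node blocks everything queued behind it, so a wider window lets healthy nodes be queried while the timeouts run.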

I'm concerned about increasing ibnetdiscover's default to 4 rather than 2.
I've seen SMPs being dropped on a number of clusters even with the current
lower defaults.

-- Hal

> Ira
>
>>
>> -- Hal
>>
>> >
>> > The first patch converts the algorithm and the second adds the 
>> > ibnd_set_max_smps_on_wire call.
>> >
>> > Let me know what you think.  Because the algorithm changed so much, testing
>> > this is a bit difficult: the order of node discovery is different.
>> > However, I have done some extensive diffing of the output of
>> > ibnetdiscover and things look good.
>> >
>> > Ira
>> >
>> > --
>> > Ira Weiny
>> > Math Programmer/Computer Scientist
>> > Lawrence Livermore National Lab
>> > 925-423-8008
>> > [email protected]
>> > --
>> > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
>> > the body of a message to [email protected]
>> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> >
>