On Thu, 4 Feb 2010 16:13:25 -0800
Ira Weiny <wei...@llnl.gov> wrote:

> On Thu, 4 Feb 2010 15:01:32 -0500
> Hal Rosenstock <hal.rosenst...@gmail.com> wrote:
>
> > On Thu, Feb 4, 2010 at 1:00 PM, Ira Weiny <wei...@llnl.gov> wrote:
> > > On Thu, 4 Feb 2010 09:19:39 -0500
> > > Hal Rosenstock <hal.rosenst...@gmail.com> wrote:
> > >
> > >> On Tue, Feb 2, 2010 at 7:45 PM, Ira Weiny <wei...@llnl.gov> wrote:
> > >> > Sasha,
> > >> >
> > [snip]
> > >>
> > >> Is there a speedup with 4 rather than 2 ?
> > >
> > > There is a bit of a speed up (~0.5 to 1.0 sec). But my main reason to
> > > want to go to 4 is that if there are issues on the fabric, unresponsive
> > > nodes etc.; 4 will give us better parallelism to get around these
> > > issues. I have not had the chance to test this condition with the new
> > > algorithm but the original ibnetdiscover would slow way down when there
> > > are nodes which have unresponsive SMA's. If there are only 2
> > > outstanding this will not give us much speed up.
> > > This was the main motivation I had for improving the library in this way.

Ok, I found a fabric with just 2 nodes which were unresponsive... A quick
test shows...

Original ibnetdiscover:

18:12:29 > time ./ibnetdiscover > foo
ibwarn: [26993] mad_rpc: _do_madrpc failed; dport (DR path slid 0; dlid 0; 0,1,24,11,9)
src/ibnetdisc.c:457; Query remote node (DR path slid 0; dlid 0; 0,1,24,11,9) failed, skipping port
ibwarn: [26993] mad_rpc: _do_madrpc failed; dport (DR path slid 0; dlid 0; 0,1,24,24,18,7,6)
src/ibnetdisc.c:457; Query remote node (DR path slid 0; dlid 0; 0,1,24,24,18,7,6) failed, skipping port

real    0m9.073s
user    0m0.137s
sys     0m0.172s

18:12:43 > time ./ibnetdiscover > foo
ibwarn: [31111] mad_rpc: _do_madrpc failed; dport (DR path slid 0; dlid 0; 0,1,24,11,9)
src/ibnetdisc.c:457; Query remote node (DR path slid 0; dlid 0; 0,1,24,11,9) failed, skipping port
ibwarn: [31111] mad_rpc: _do_madrpc failed; dport (DR path slid 0; dlid 0; 0,1,24,24,18,7,6)
src/ibnetdisc.c:457; Query remote node (DR path slid 0; dlid 0; 0,1,24,24,18,7,6) failed, skipping port

real    0m9.103s
user    0m0.046s
sys     0m0.046s

*New* ibnetdiscover with different outstanding SMP's.

18:12:14 > time ./ibnetdiscover -o 2 > foo
src/query_smp.c:185; umad (DR path slid 0; dlid 0; 0,1,13,11,9 Attr 0x11:0) bad status 110; Connection timed out
src/query_smp.c:185; umad (DR path slid 0; dlid 0; 0,1,13,13,7,7,6 Attr 0x11:0) bad status 110; Connection timed out

real    0m9.746s
user    0m6.559s
sys     0m3.156s

18:13:00 > time ./ibnetdiscover -o 4 > foo
src/query_smp.c:185; umad (DR path slid 0; dlid 0; 0,1,13,11,9 Attr 0x11:0) bad status 110; Connection timed out
src/query_smp.c:185; umad (DR path slid 0; dlid 0; 0,1,13,13,7,7,6 Attr 0x11:0) bad status 110; Connection timed out

real    0m4.668s
user    0m3.043s
sys     0m1.601s

18:13:10 > time ./ibnetdiscover -o 8 > foo
src/query_smp.c:185; umad (DR path slid 0; dlid 0; 0,1,13,11,9 Attr 0x11:0) bad status 110; Connection timed out
src/query_smp.c:185; umad (DR path slid 0; dlid 0; 0,1,13,13,7,7,6 Attr 0x11:0) bad status 110; Connection timed out

real    0m4.360s
user    0m2.891s
sys     0m1.451s

Note that 2 does not give much speed up, where 4 does. Obviously this could
have to do with the fact there were 2 nodes which were bad (so if you had
100's of nodes unresponsive a higher value might be worth using) but as a
default compromise I think 4 is good.

Ira

> > >
> > > Also, I think you are correct that we should increase OpenSM's default
> > > from 4 to 8. For the same reason as above. Some of our clusters have
> > > worked better with 8 when we are having issues. But right now we are
> > > still running with 4.
> >
> > I'm concerned about just increasing ibnetdiscover to 4 rather than 2.
> > I've seen a number of clusters with SMP dropping with the current
> > lower defaults.
>
> So OpenSM is seeing dropped packets? With 4 SMP's on the wire? I do see some
> VL15Dropped errors (maybe 2-3 a day) but I did not think that would be an
> issue. What kind of rate are you seeing?
>
> The other question is; do people regularly run the tools which are using
> libibnetdisc (ibqueryerrors, iblinkinfo, ibnetdiscover)? We do.
> If others are not then I would say this change would have less impact as they
> would want the diags to have some priority for debugging. The other option is
> to change the patch to be a default of 2 and allow user to change it depending
> on what they are trying to do. If you think that is best I will change the
> patch.
>
> Ira
>
> >
> > -- Hal
> >
> > > Ira
> > >
> > >>
> > >> -- Hal
> > >>
> > >> >
> > >> > The first patch converts the algorithm and the second adds the
> > >> > ibnd_set_max_smps_on_wire call.
> > >> >
> > >> > Let me know what you think. Because the algorithm changed so much
> > >> > testing this is a bit difficult because the order of the node
> > >> > discovery is different. However, I have done some extensive diffing
> > >> > of the output of ibnetdiscover and things look good.
> > >> >
> > >> > Ira
> > >> >
> > >> > --
> > >> > Ira Weiny
> > >> > Math Programmer/Computer Scientist
> > >> > Lawrence Livermore National Lab
> > >> > 925-423-8008
> > >> > wei...@llnl.gov
> > >> > --
> > >> > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> > >> > the body of a message to majord...@vger.kernel.org
> > >> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > >> >
> > >> --
> > >> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> > >> the body of a message to majord...@vger.kernel.org
> > >> More majordomo info at http://vger.kernel.org/majordomo-info.html
> > >>
> > >
> > >
> > > --
> > > Ira Weiny
> > > Math Programmer/Computer Scientist
> > > Lawrence Livermore National Lab
> > > 925-423-8008
> > > wei...@llnl.gov
> > >
> >
>
>
> --
> Ira Weiny
> Math Programmer/Computer Scientist
> Lawrence Livermore National Lab
> 925-423-8008
> wei...@llnl.gov

--
Ira Weiny
Math Programmer/Computer Scientist
Lawrence Livermore National Lab
925-423-8008
wei...@llnl.gov
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
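
For anyone who wants to reason about why the number of outstanding SMPs matters when a couple of nodes time out, a toy model might help. This is only a sketch and not how libibnetdisc schedules its queries: it assumes queries are issued in a fixed order with no dependencies between them, and the function name, the query counts, and the 2 second timeout are all made up for illustration. It is not expected to reproduce the timings in this thread.

```python
import heapq


def discovery_time(durations, max_outstanding):
    """Finish time when queries are issued in order with a bounded window.

    Toy model (assumption): each query occupies one slot until it completes
    or times out, and the next query is issued as soon as any slot frees up.
    """
    if max_outstanding < 1:
        raise ValueError("max_outstanding must be at least 1")
    # Min-heap holding the time at which each slot becomes free.
    slots = [0.0] * max_outstanding
    finish = 0.0
    for duration in durations:
        start = heapq.heappop(slots)  # earliest free slot
        end = start + duration
        heapq.heappush(slots, end)
        finish = max(finish, end)
    return finish


if __name__ == "__main__":
    # Hypothetical workload: 2 unresponsive nodes that each burn a 2 s
    # timeout, followed by 400 queries that each complete in 10 ms.
    workload = [2.0, 2.0] + [0.01] * 400
    for window in (2, 4, 8):
        print(window, round(discovery_time(workload, window), 2))
```

In this model a window of 2 lets the two timeouts hold every slot, so nothing else progresses until they expire, while a window of 4 or more leaves slots free for the responsive nodes. Where the slow queries fall in the order matters too, so treat the numbers as qualitative only.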