On Thu, Feb 4, 2010 at 9:18 PM, Ira Weiny <[email protected]> wrote:
> On Thu, 4 Feb 2010 16:13:25 -0800
> Ira Weiny <[email protected]> wrote:
>
>> On Thu, 4 Feb 2010 15:01:32 -0500
>> Hal Rosenstock <[email protected]> wrote:
>>
>> > On Thu, Feb 4, 2010 at 1:00 PM, Ira Weiny <[email protected]> wrote:
>> > > On Thu, 4 Feb 2010 09:19:39 -0500
>> > > Hal Rosenstock <[email protected]> wrote:
>> > >
>> > >> On Tue, Feb 2, 2010 at 7:45 PM, Ira Weiny <[email protected]> wrote:
>> > >> > Sasha,
>> > >> >
>>
>> [snip]
>
> [snip]
>
>> > >>
>> > >> Is there a speedup with 4 rather than 2 ?
>> > >
>> > > There is a bit of a speed up (~0.5 to 1.0 sec). But my main reason to
>> > > want to go to 4 is that if there are issues on the fabric, unresponsive
>> > > nodes etc.; 4 will give us better parallelism to get around these
>> > > issues. I have not had the chance to test this condition with the new
>> > > algorithm but the original ibnetdiscover would slow way down when there
>> > > are nodes which have unresponsive SMA's. If there are only 2
>> > > outstanding this will not give us much speed up. This was the main
>> > > motivation I had for improving the library in this way.
>
> Ok, I found a fabric with just 2 nodes which were unresponsive... A quick
> test shows...
>
> Original ibnetdiscover:
>
> 18:12:29 > time ./ibnetdiscover > foo
> ibwarn: [26993] mad_rpc: _do_madrpc failed; dport (DR path slid 0; dlid 0; 0,1,24,11,9)
> src/ibnetdisc.c:457; Query remote node (DR path slid 0; dlid 0; 0,1,24,11,9) failed, skipping port
> ibwarn: [26993] mad_rpc: _do_madrpc failed; dport (DR path slid 0; dlid 0; 0,1,24,24,18,7,6)
> src/ibnetdisc.c:457; Query remote node (DR path slid 0; dlid 0; 0,1,24,24,18,7,6) failed, skipping port
>
> real    0m9.073s
> user    0m0.137s
> sys     0m0.172s
>
> 18:12:43 > time ./ibnetdiscover > foo
> ibwarn: [31111] mad_rpc: _do_madrpc failed; dport (DR path slid 0; dlid 0; 0,1,24,11,9)
> src/ibnetdisc.c:457; Query remote node (DR path slid 0; dlid 0; 0,1,24,11,9) failed, skipping port
> ibwarn: [31111] mad_rpc: _do_madrpc failed; dport (DR path slid 0; dlid 0; 0,1,24,24,18,7,6)
> src/ibnetdisc.c:457; Query remote node (DR path slid 0; dlid 0; 0,1,24,24,18,7,6) failed, skipping port
>
> real    0m9.103s
> user    0m0.046s
> sys     0m0.046s
>
>
> *New* ibnetdiscover with different outstanding SMP's.
>
> 18:12:14 > time ./ibnetdiscover -o 2 > foo
> src/query_smp.c:185; umad (DR path slid 0; dlid 0; 0,1,13,11,9 Attr 0x11:0) bad status 110; Connection timed out
> src/query_smp.c:185; umad (DR path slid 0; dlid 0; 0,1,13,13,7,7,6 Attr 0x11:0) bad status 110; Connection timed out
>
> real    0m9.746s
> user    0m6.559s
> sys     0m3.156s
>
> 18:13:00 > time ./ibnetdiscover -o 4 > foo
> src/query_smp.c:185; umad (DR path slid 0; dlid 0; 0,1,13,11,9 Attr 0x11:0) bad status 110; Connection timed out
> src/query_smp.c:185; umad (DR path slid 0; dlid 0; 0,1,13,13,7,7,6 Attr 0x11:0) bad status 110; Connection timed out
>
> real    0m4.668s
> user    0m3.043s
> sys     0m1.601s
>
> 18:13:10 > time ./ibnetdiscover -o 8 > foo
> src/query_smp.c:185; umad (DR path slid 0; dlid 0; 0,1,13,11,9 Attr 0x11:0) bad status 110; Connection timed out
> src/query_smp.c:185; umad (DR path slid 0; dlid 0; 0,1,13,13,7,7,6 Attr 0x11:0) bad status 110; Connection timed out
>
> real    0m4.360s
> user    0m2.891s
> sys     0m1.451s
>
>
> Note that 2 does not give much speed up, where 4 does. Obviously this could
> have to do with the fact there were 2 nodes which were bad (so if you had
> 100's of nodes unresponsive a higher value might be worth using)
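One rough way to picture how the number of outstanding SMPs interacts with
unresponsive nodes is the toy model below. It is only a sketch and not how
libibnetdisc is implemented: the function name discovery_time, the 1 ms and
4 s latencies, and the 4000-query fabric are all made up for illustration,
and it makes no attempt to reproduce the timings quoted above. Each query
occupies one of `window` slots until it either answers or times out, and the
next query starts as soon as any slot frees up.

```python
import heapq


def discovery_time(latencies, window):
    """Wall-clock time to finish all queries with at most `window` in flight.

    `latencies` holds the time each query takes (a fast reply, or a full
    timeout for an unresponsive node), in the order the queries are issued.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    # Min-heap of the times at which each in-flight slot becomes free.
    slots = [0.0] * window
    heapq.heapify(slots)
    finish = 0.0
    for latency in latencies:
        start = heapq.heappop(slots)  # earliest slot to free up
        end = start + latency
        finish = max(finish, end)
        heapq.heappush(slots, end)
    return finish


# Hypothetical fabric: 4000 queries answered in 1 ms each, plus 2 queries
# to unresponsive nodes that each block a slot for a 4 s timeout.
FAST, TIMEOUT = 0.001, 4.0
latencies = [FAST] * 100 + [TIMEOUT] * 2 + [FAST] * 3900

if __name__ == "__main__":
    for window in (1, 2, 4, 8):
        total = discovery_time(latencies, window)
        print(f"window={window}: {total:.3f} s")
```

In this model, once the window is larger than the number of unresponsive
nodes, the remaining slots keep working while the timeouts run, so the total
is bounded by roughly one timeout and widening the window further gains
little.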
It depends on whether the number of unresponsive nodes is the same as or
higher than the number of outstanding/parallel SMPs. In a sense, the number
of outstanding SMPs is a measure of how many unresponsive nodes one is
willing to tolerate before slowing down/waiting for timeouts. In some
environments, unresponsive nodes are a normal case.

-- Hal

> but as a default compromise I think 4 is good.
>
> Ira
>
>> > >
>> > > Also, I think you are correct that we should increase OpenSM's default
>> > > from 4 to 8. For the same reason as above. Some of our clusters have
>> > > worked better with 8 when we are having issues. But right now we are
>> > > still running with 4.
>> >
>> > I'm concerned about just increasing ibnetdiscover to 4 rather than 2.
>> > I've seen a number of clusters with SMP dropping with the current
>> > lower defaults.
>>
>> So OpenSM is seeing dropped packets? With 4 SMP's on the wire? I do see
>> some VL15Dropped errors (maybe 2-3 a day) but I did not think that would
>> be an issue. What kind of rate are you seeing?
>>
>> The other question is; do people regularly run the tools which are using
>> libibnetdisc (ibqueryerrors, iblinkinfo, ibnetdiscover)? We do. If others
>> are not then I would say this change would have less impact as they would
>> want the diags to have some priority for debugging. The other option is to
>> change the patch to be a default of 2 and allow user to change it
>> depending on what they are trying to do. If you think that is best I will
>> change the patch.
>>
>> Ira
>>
>> >
>> > -- Hal
>> >
>> > > Ira
>> > >
>> > >>
>> > >> -- Hal
>> > >>
>> > >> >
>> > >> > The first patch converts the algorithm and the second adds the
>> > >> > ibnd_set_max_smps_on_wire call.
>> > >> >
>> > >> > Let me know what you think. Because the algorithm changed so much
>> > >> > testing this is a bit difficult because the order of the node
>> > >> > discovery is different.
>> > >> > However, I have done some extensive diffing of the output of
>> > >> > ibnetdiscover and things look good.
>> > >> >
>> > >> > Ira
>> > >> >
>> > >> > --
>> > >> > Ira Weiny
>> > >> > Math Programmer/Computer Scientist
>> > >> > Lawrence Livermore National Lab
>> > >> > 925-423-8008
>> > >> > [email protected]
>> > >> > --
>> > >> > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
>> > >> > the body of a message to [email protected]
>> > >> > More majordomo info at http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
