Hey Sasha,

I am finally getting back to this...  Sorry.

On Wed, 13 Jan 2010 15:11:44 -0500
Hal Rosenstock <hal.rosenst...@gmail.com> wrote:

> Hi Sasha,
> 
> On Tue, Jan 12, 2010 at 4:31 AM, Sasha Khapyorsky <sas...@voltaire.com> wrote:
> > Hi Hal,
> >
> > On 08:56 Mon 11 Jan     , Hal Rosenstock wrote:
> >> >
> >> > diff --git a/tests/subnet_discover.c b/tests/subnet_discover.c
> >> > index 7f8a85c..42e7aee 100644
> >> > --- a/tests/subnet_discover.c
> >> > +++ b/tests/subnet_discover.c
> >> > @@ -40,6 +40,7 @@ static struct node *node_array[32 * 1024];
> >> >  static unsigned node_count = 0;
> >> >  static unsigned trid_cnt = 0;
> >> >  static unsigned outstanding = 0;
> >> > +static unsigned max_outstanding = 8;
> >>
> >> Any reason why this default is different from the one which OpenSM
> >> uses ? Seems to me it should be the same (or less).
> >
> > In my tests I found that '8' is more optimal number (the tool works
> > faster and without drops) than '4' used in OpenSM.
> >
> > Of course it would be helpful to run this over bigger cluster than
> > what I have to see that the results are consistent.

Here is some test data on a real cluster.

09:49:10 > ibhosts | wc -l
1158

09:49:28 > ibswitches | wc -l
281

09:44:45 > time ./subnet_discover -n 1 > /dev/null

real    0m1.414s
user    0m0.309s
sys     0m0.244s

09:44:55 > time ./subnet_discover -n 2 > /dev/null

real    0m1.025s
user    0m0.284s
sys     0m0.201s

09:45:00 > time ./subnet_discover -n 4 > /dev/null

real    0m0.644s
user    0m0.268s
sys     0m0.228s

09:45:04 > time ./subnet_discover -n 8 > /dev/null

real    0m0.550s
user    0m0.253s
sys     0m0.184s

09:45:08 > time ./subnet_discover -n 12 > /dev/null

real    0m0.524s
user    0m0.207s
sys     0m0.201s

09:45:14 > time ./subnet_discover -n 16 > /dev/null

real    0m0.432s
user    0m0.248s
sys     0m0.144s

09:45:18 > time ./subnet_discover -n 32 > /dev/null

real    0m0.484s
user    0m0.260s
sys     0m0.150s


09:45:57 > time ibnetdiscover  > /dev/null

real    0m3.180s
user    0m0.068s
sys     0m0.672s


What I find most interesting is that your test utility runs more than 2x faster
(1.414s vs. 3.180s) even with only 1 outstanding MAD.  :-/  ibnetdiscover
(libibnetdisc) does do a lot more with the data, but I would not have expected
such a difference.

As a comparison, I ran iblinkinfo; it would seem that there is something in the
library which takes a lot more time.

09:51:59 > time iblinkinfo > /dev/null

real    0m3.159s
user    0m0.063s
sys     0m0.526s


For further comparison I rebuilt the parallel version of libibnetdisc.

12:39:02 > time ./ibnetdiscover > /dev/null

real    0m2.552s
user    0m0.295s
sys     0m0.863s

This is with 8 threads (i.e., 8 outstanding SMPs).

It would appear that your algorithm is superior.  I will look at converting
libibnetdisc, test it, and submit a patch.  I still don't know why there is so
much difference when using only 1 outstanding MAD, though.  :-/
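
For anyone following along, my understanding is that the win comes from keeping
a window of SMPs on the wire from a single thread, issuing a new query each
time a response is reaped, rather than dedicating a thread per outstanding
request.  Roughly like this (a sketch only; send_query()/wait_response() are
stand-ins I made up, not the real MAD calls):

#define QUEUE_DEPTH 1024

static unsigned max_outstanding = 8;
static unsigned outstanding = 0;

/* ring of nodes still waiting to be queried */
static int queue[QUEUE_DEPTH];
static unsigned q_head = 0, q_tail = 0;

static void queue_node(int node)
{
    queue[q_tail++ % QUEUE_DEPTH] = node;
}

/* stand-in: post one SMP without blocking */
static void send_query(int node)
{
    (void)node;
    outstanding++;
}

/* stand-in: block until one reply arrives; the real code would
 * parse it here and queue_node() any newly discovered nodes */
static void wait_response(void)
{
    outstanding--;
}

int main(void)
{
    queue_node(0);    /* start from the local node */

    while (q_head != q_tail || outstanding) {
        /* fill the window up to max_outstanding requests */
        while (outstanding < max_outstanding && q_head != q_tail)
            send_query(queue[q_head++ % QUEUE_DEPTH]);

        /* reap one reply, freeing a slot in the window */
        if (outstanding)
            wait_response();
    }
    return 0;
}

The single thread stays busy parsing one reply while the others are still on
the wire, so raising max_outstanding deepens the pipeline instead of adding
context switches.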

> 
> This is exactly my concern. Not only cluster size but use cases
> including concurrent diag discover and SM operation where SMPs are
> heavily in use.
> 
> There already have been a number of reports of dropped SMPs on this
> list with the current diags and this change will only make things
> worse IMO.

This is a problem.  I have seen this issue on large systems which are having
trouble: OpenSM is trying to discover and route, we are running diags trying to
figure out what is going on, and there is hardware going up and down (bad
switches, or nodes which are booting/rebooting).

I plan to go forward with this, but having an option for the number of
outstanding MADs is a good idea.  I don't have an opinion on what the default
should be.
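
Something like this for the knob, keeping the -n spelling from the test runs
above (again just a sketch; only max_outstanding comes from the actual patch,
the rest is made up for illustration):

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

static unsigned max_outstanding = 8;    /* default still under discussion */

int main(int argc, char **argv)
{
    int ch;

    /* -n <count>: cap the number of SMPs kept in flight at once */
    while ((ch = getopt(argc, argv, "n:")) != -1) {
        switch (ch) {
        case 'n':
            max_outstanding = strtoul(optarg, NULL, 0);
            if (!max_outstanding)
                max_outstanding = 1;    /* 0 would stall discovery */
            break;
        default:
            fprintf(stderr, "usage: %s [-n max_outstanding]\n",
                    argv[0]);
            return 2;
        }
    }

    printf("using %u outstanding MADs\n", max_outstanding);
    return 0;
}

That way sites sharing the fabric with a busy SM can dial it down to 1 or 2
without a rebuild.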

> 
> Also, the OpenSM default should be at least as large as the diags for this.

I agree.  OpenSM should have some priority in this matter.

Ira

> 
> -- Hal
> 
> > Sasha
> >


-- 
Ira Weiny
Math Programmer/Computer Scientist
Lawrence Livermore National Lab
925-423-8008
wei...@llnl.gov
