Re: Fwd: [ofa-general] Performance evaluation of Opensm

Nifty Tom Mitchell Tue, 28 Jul 2009 21:30:18 -0700

On Tue, Jul 07, 2009 at 04:28:55PM +0530, Devesh Sharma wrote:
> Thanks, Yevgeny, for your valuable input. This will surely help with my work.
> 
> On Tue, Jul 7, 2009 at 2:59 PM, Yevgeny
> Kliteynik<[email protected]> wrote:
> > Hi Devesh,
> >
> > It's kind of hard to talk about "performance of OpenSM".
> > Subnet Manager has different phases and modes of operation,
> > each of them is a completely separate issue:
> >
> > - Fabric discovery
> > - Fabric ports/nodes configuration
> > - Unicast routing calculation
> > - Unicast routing configuration on fabric switches
> > - Multicast routing calculation
> > - Multicast routing configuration on fabric switches
> > - SA queries processing
> > - Memory consumption
> > - Different routing algorithms consume different time and memory
> > - QoS
> > - etc, etc, etc
> >
> > Most of the above can be measured only on a real cluster.
...
> But how can these be measured? Is there any compile-time flag available
> in the code?


You can edit the code and add logging or time stamps, but I am
not sure that you should bother.  Even if there were a compile-time
flag, what would you compare the results to?
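
If you do want time stamps, the idea is just to bracket each phase with a
monotonic clock reading. A minimal sketch follows. OpenSM itself is C, so
this Python only illustrates the bookkeeping; the phase name and the
fake_discovery() stand-in are hypothetical and are not OpenSM symbols.

```python
import time
from contextlib import contextmanager

# Accumulated seconds per phase. Phase names are made up for illustration.
timings = {}

@contextmanager
def timed(phase):
    """Add the wall-clock time spent inside the with-block to timings[phase]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[phase] = timings.get(phase, 0.0) + time.perf_counter() - start

def fake_discovery():
    # Stand-in for real work; a real run would talk to the fabric here.
    time.sleep(0.01)

with timed("fabric_discovery"):
    fake_discovery()

for phase, seconds in timings.items():
    print(f"{phase}: {seconds * 1e3:.2f} ms")
```

Numbers collected this way are wall-clock times on one machine and one
fabric, so they still need a baseline before they mean anything.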

  *) N.B. once a fabric is set up you can kill the subnet manager
        and the fabric will stay as it is and continue to operate.
        The implication is that the subnet manager is not
        in a critical performance path for normal operation.  It
        is, however, in a "correctness + reliability" path, which clearly
        sets the agenda for the authors.

  *) Much of the time you might measure on a cluster depends
        on the interaction of the subnet manager with other
        parts of the fabric, i.e. each node and switch in the cluster
        plays a key part in the process (dt(SM)+dt(SMA)=something).
        This makes it hard to extract the performance of the subnet
        manager alone.

  *) Sweeping the fabric to discover changes (new or absent devices) can often
        be lazy.  There is a configuration flag to tune this.  On massive
        fabrics the time to rescan will grow with the size of the fabric.  A lazy
        scan is normal.  Trap notice SMP processing mitigates the effect of
        any lazy tuning.

  *) Processor, cache type and size, memory performance, I/O path to IB hardware,
        TLB size and management all come into play.  Small test fabric
        results will have some edges, lumps and bumps in any curves,
        which makes extrapolating to "interesting" fabric sizes difficult.
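
To make the dt(SM)+dt(SMA) point concrete, here is a toy model. It assumes
the subnet manager handles devices strictly one after another, which is a
simplification, and every number in it is invented for illustration rather
than measured.

```python
def measured_ms(dt_sm_ms, dt_sma_ms_per_device, devices):
    """Wall-clock time of one phase, assuming fully serialized requests."""
    return dt_sm_ms + dt_sma_ms_per_device * devices

# Same hypothetical subnet manager cost, two hypothetical agent response times.
fast_agents = measured_ms(dt_sm_ms=500, dt_sma_ms_per_device=1, devices=1000)
slow_agents = measured_ms(dt_sm_ms=500, dt_sma_ms_per_device=4, devices=1000)

print(fast_agents, slow_agents)  # 1500 4500
```

The subnet manager's own share is identical in both runs, yet the totals
differ by a factor of three. A single end-to-end measurement cannot tell
you which term you are looking at.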

Some work on this has been done:
        
http://nowlab.cse.ohio-state.edu/publications/conf-papers/2005/vishnu-fastos05.pdf


-- 
        T o m  M i t c h e l l 
        Found me a new hat, now what?

_______________________________________________
general mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
