On May 24, 2010, at 2:02 PM, Michael E. Thomadakis wrote:

> | > 1) high-resolution timers: how do I specify the HRT linux timers in the
> | >     --with-timer=TYPE
> | >  line of ./configure ?
> |
> | You shouldn't need to do anything; the "linux" timer component of Open MPI
> | should get automatically selected.  You should be able to see this in the
> | stdout of Open MPI's "configure", and/or if you run ompi_info | grep timer
> | -- there should only be one entry: linux.
> 
> If nothing is mentioned, will it select the 'linux' timers by default?

Yes.

> Or do I have to specify it on the configure line:
> 
>         --with-timer=linux ?

Nope.  The philosophy of Open MPI is that whenever possible, we try to choose a 
sensible default.  It never hurts to double check, but we try to do the Right 
Thing whenever it's possible to automatically choose it (within reason, of 
course).

You can also check the output of ompi_info -- ompi_info tells you lots of 
things about your Open MPI installation.

> I actually spent some time looking around in the source trying to see which
> actual timer is the base. Is this a high-resolution timer such as the POSIX
> timers (timer_gettime or clock_nanosleep, etc.) or the Intel processor's TSC?
> 
> I am just trying to stay away from gettimeofday()

Understood.

Ugh; I just poked into the code -- it's complicated how we resolve the timer 
functions.  It looks like we put in the infrastructure for high-resolution 
timers, but at least on Linux, we don't use it (the code falls back to 
gettimeofday).  It looks like we're only using the high-resolution timers on 
AIX (!) and Solaris.

Patches would be greatly appreciated; I'd be happy to walk someone through what 
to do. 

> | > 2) I have installed BLCR V0.8.2, but when I try to build OMPI and point it
> | > to the full installation, it complains that it cannot find it. Note that I
> | > built BLCR with GCC but am building OMPI with the Intel compilers (V11.1).
> |
> | Can you be more specific here?
> 
> I pointed to the installation path for BLCR but configure complained that it
> couldn't find it. If BLCR is only needed for checkpoint / restart then we can
> live without it. Is BLCR needed for suspend/resume of MPI jobs?

You mean suspend with ctrl-Z?  If so, correct -- BLCR is *only* used for 
checkpoint/restart.  Ctrl-Z just uses the SIGTSTP functionality.

> | > 4) How could I select the high-speed transport, say DAPL or OFED IB verbs?
> | > Is there any preference as to the specific high-speed transport over QDR IB?
> |
> | openib is the preferred Open MPI plugin (the name is somewhat outdated, but
> | it's modern OpenFabrics verbs -- see
> | http://www.open-mpi.org/faq/?category=openfabrics#why-openib-name).
> 
> Does this mean that the DAPL API is not supported at all or that OMPI works
> better with OFED verbs?

Sorry for not being clear.

Our DAPL plugin is only supported on Solaris.  It *probably* works on Linux 
(the API should be the same), but we don't test it on Linux at all.  The OMPI 
configure script deactivates the udapl plugin on Linux by default.

On Linux, DAPL is a (thin) layer over verbs, anyway, so there isn't much point 
in using it.  On Linux, Open MPI uses the verbs (openib) plugin.  FWIW, DAPL is 
an abstraction layer that intentionally hides lower-layer things.  The verbs 
API is much more complex than DAPL and exposes a lot more information, which 
OMPI uses.

We strongly recommend that you use the verbs (openib) plugin on Linux.

> Just as feedback from one of the many HPC centers, for us it is most
> important to have
> 
> a) a light-weight, efficient MPI stack which makes the underlying IB h/w
> capabilities available and
> 
> b) one that can smoothly cooperate with a batch scheduler / resource manager
> so that a mixture of jobs gets a decent allocation of the cluster resources.

Cool; good to know.  We try to make these things very workable in Open MPI -- 
it's been a goal from day 1 to integrate with job schedulers, etc.  And without 
high performance, we wouldn't have much to talk about.

Please be sure to let us know of questions / problems / etc.  I admit that 
we're sometimes a little slow to answer on the users list, but we do the best 
we can.  So don't hesitate to bump us if we don't reply.

Thanks!

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/

