I believe compile-time is preferable as there is a non-zero time impact of 
enabling this code. It's really more for developers to improve scalability - if 
a user is actually interested, I think it isn't that hard for them to configure 
it.
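
To make the trade-off concrete, here is a minimal sketch in C. All names in it (SKETCH_ENABLE_TIMING, SKETCH_EVENT_RUNTIME, SKETCH_EVENT_COMPILED, sketch_record, and so on) are made up for illustration and are not the OPAL macros. It only shows the general point: a run-time switch leaves a test in every call site, while a compile-time switch removes the call site entirely, including evaluation of its arguments.

```c
/* Hypothetical names throughout; this is not the OPAL timing code. */

int sketch_events_recorded = 0;  /* how many events were stored */
int sketch_args_evaluated = 0;   /* how often an argument expression ran */
int sketch_timing_enabled = 0;   /* run-time switch (think: an MCA param) */

/* Stand-in for "store one timing event". */
void sketch_record(const char *what, int value)
{
    (void)what;
    (void)value;
    sketch_events_recorded++;
}

/* An argument with a visible side effect, so evaluation can be observed. */
int sketch_expensive_arg(void)
{
    sketch_args_evaluated++;
    return 42;
}

/* Run-time selection: the test stays in the binary even when disabled. */
#define SKETCH_EVENT_RUNTIME(what, value)        \
    do {                                         \
        if (sketch_timing_enabled) {             \
            sketch_record((what), (value));      \
        }                                        \
    } while (0)

/* Compile-time selection: without SKETCH_ENABLE_TIMING nothing is emitted,
 * so the arguments are never evaluated either. */
#ifdef SKETCH_ENABLE_TIMING
#define SKETCH_EVENT_COMPILED(what, value) sketch_record((what), (value))
#else
#define SKETCH_EVENT_COMPILED(what, value) do { } while (0)
#endif

/* Exercise both variants once with the run-time switch turned off. */
void sketch_run_disabled(void)
{
    sketch_timing_enabled = 0;
    SKETCH_EVENT_RUNTIME("runtime", sketch_expensive_arg());
    SKETCH_EVENT_COMPILED("compiled", sketch_expensive_arg());
}
```

Built without -DSKETCH_ENABLE_TIMING, the SKETCH_EVENT_COMPILED call sites produce no code at all, whereas every SKETCH_EVENT_RUNTIME call site still tests the flag each time it is reached.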


On Sep 18, 2014, at 7:16 AM, Artem Polyakov <artpo...@gmail.com> wrote:

> Jeff, thank you for the feedback! All of the mentioned issues are clear and I 
> will fix them shortly.
> 
> One important thing that needs additional discussion is compile-time vs 
> runtime selection. Ralph, what do you think about that? Several of the issues 
> depend on that decision.
> 
> 2014-09-18 20:09 GMT+07:00 Jeff Squyres (jsquyres) <jsquy...@cisco.com>:
> I have a few comments:
> 
> - This looks nice.  Thanks for the contribution.
> 
> - I notice that the ORTE timing stuff is now a compile-time decision, not a 
> run-time decision.  Do we care that we've now taken away the ability for 
> users to do timings in a production build?
> 
> - "clksync" -- can we use "clocksync"?  It's only 2 letters.  We tend to use 
> real words in the OMPI code base; unnecessary abbreviation should be avoided. 
> 
> - r32738 introduced several files into the code base that have no 
> copyrights, and do not have the standard OMPI copyright header block.  Please 
> fix.
> 
> - There's no documentation on how to use mpisync, mpirun_prof, or 
> ompi_timing_post, even though they're installed when you --enable-timing.  
> What are these 3 executables?  Can we get man pages?
> 
> I posted their descriptions in the first e-mail. Sure, I can prepare man pages 
> for them.
>  
> 
> - What's the purpose of the MCA param orte_rml_base_timing?  A *quick* look 
> through the code seems to indicate that it is ignored.
> 
> - What's the purpose of the MCA params opal_clksync_file, opal_timing_file, 
> and opal_timing_overhead?  E.g., what is a "clksync" file, what is it for, 
> and what is its format?  Does the user have to provide one?  If so, how do 
> you get one?  Or is it an output file?  ...etc.  The brief descriptions given 
> in the MCA help strings don't really provide enough information for someone 
> who has no idea what the timing stuff is.  Also, can those 3 params have a 
> common prefix?  I.e., it's not obvious that opal_clksync_file is related to 
> opal_timing_* at all. 
> 
> - A *quick* look at ompi/tools/mpisync shows that a bunch of that code came 
> from an external project.  Is the license compatible with OMPI's license?  
> What do we need to do to conform to their license?
> 
> - opal/util/timings.h is protected by OPAL_SYS_TIMING_H -- shouldn't it be 
> OPAL_UTIL_TIMINGS_H?
> 
> - There's commented-out code in opal/util/timings.h.
> 
> - There's no doxygen-style documentation in opal/util/timings.h to tell 
> developers how to use it.
> 
> - There are "TODO" comments in opal/util/timings.c; should those be fixed?
> 
> - opal_config.h should be the first include in opal/util/timings.c.
> 
> - If timing support is not to be compiled in, then opal/util/timings.c should 
> not be compiled via the Makefile.am (rather than being entirely #if'ed out).
> 
> It looks like this work is about 95% complete.  Finishing the remaining 5% 
> would make it great and genuinely useful to the rest of the code base.
> 
> Thanks!
> 
> 
> 
> On Sep 16, 2014, at 10:20 AM, Artem Polyakov <artpo...@gmail.com> wrote:
> 
> > Hello,
> >
> > I would like to introduce the OMPI timing framework that was included into 
> > the trunk yesterday (r32738). The code is new, so if you hit any bugs, just 
> > let me know.
> >
> > The framework consists of a set of macros and routines for internal OMPI 
> > usage, plus the standalone tool mpisync and a few additional scripts: 
> > mpirun_prof and ompi_timing_post. The set of features is very basic and I am 
> > open to discussion of new things that are desirable there.
> >
> > To enable compilation of the framework, configure OMPI with the 
> > --enable-timing option. If the option is passed to ./configure, the 
> > standalone tools and scripts will be installed into <prefix>/bin.
> >
> > The timing code is located in OPAL (opal/utils/timing.[ch]). There is a set 
> > of macros that should be used so that all mentions of the timing code are 
> > preprocessed out when it wasn't requested with --enable-timing:
> > OPAL_TIMING_DECLARE(t) - declare timing handler structure with name "t".
> > OPAL_TIMING_DECLARE_EXT(x, t) - external declaration of a timing handler 
> > "t".
> > OPAL_TIMING_INIT(t) - initialize timing handler "t".
> > OPAL_TIMING_EVENT(x) - printf-like event declaration similar to OPAL_OUTPUT.
> > The information about the event is quickly inserted into a linked 
> > list. The maximum length of an event description is limited by 
> > OPAL_TIMING_DESCR_MAX.
> > Memory is malloc'ed in buckets (OPAL_TIMING_BUFSIZE at once) and the 
> > overhead (time to malloc and prepare the bucket) is accounted for in the 
> > corresponding list element. It can be excluded from the timing results 
> > (controlled by the OMPI_MCA_opal_timing_overhead parameter).
> > OPAL_TIMING_REPORT(enable, t, prefix) - prepare and print out timing 
> > information. If OMPI_MCA_opal_timing_file is specified, the output goes 
> > to that file. Otherwise the output is directed through opal_output, with 
> > each line prefixed with "prefix" to ease grep'ing. "enable" is a 
> > boolean/integer variable that is used for runtime selection of what should 
> > be reported.
> > OPAL_TIMING_RELEASE(t) - the counterpart for OPAL_TIMING_INIT.
> >
> > There are several examples in the OMPI code. Here is another simple example:
> >     OPAL_TIMING_DECLARE(tm);
> >     OPAL_TIMING_INIT(&tm);
> >     ...
> >     OPAL_TIMING_EVENT((&tm,"Begin of timing: %s", 
> >                        ORTE_NAME_PRINT(&(peer->name)) ));
> >     ...
> >     OPAL_TIMING_EVENT((&tm,"Next timing event with condition x = %d", x ));
> >     ...
> >     OPAL_TIMING_EVENT((&tm,"Finish"));
> >     OPAL_TIMING_REPORT(enable_var, &tm,"MPI Init");
> >     OPAL_TIMING_RELEASE(&tm);
> >
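
The doubled parentheses in OPAL_TIMING_EVENT((...)) look like the usual C idiom for handing a whole printf-style argument list to a macro that takes a single parameter, so that the entire call can disappear when timing is compiled out. Below is a minimal sketch of that idiom. The names (SKETCH_TIMING_EVENT, SKETCH_DISABLE_TIMING, sketch_add_event, sketch_timing_t) and the fixed-size storage are assumptions for illustration; this is not the actual OPAL implementation, which is described above as using a linked list.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical limits and types, chosen only for this sketch. */
#define SKETCH_DESCR_MAX  64
#define SKETCH_MAX_EVENTS 16

typedef struct {
    char descr[SKETCH_MAX_EVENTS][SKETCH_DESCR_MAX]; /* formatted descriptions */
    int count;                                        /* events stored so far */
} sketch_timing_t;

/* Reset the handler to "no events recorded". */
void sketch_timing_init(sketch_timing_t *t)
{
    memset(t, 0, sizeof(*t));
}

/* printf-like event recording; descriptions longer than
 * SKETCH_DESCR_MAX - 1 characters are truncated by vsnprintf. */
void sketch_add_event(sketch_timing_t *t, const char *fmt, ...)
{
    va_list ap;

    if (t->count >= SKETCH_MAX_EVENTS) {
        return; /* fixed-size store: extra events are dropped */
    }
    va_start(ap, fmt);
    vsnprintf(t->descr[t->count], SKETCH_DESCR_MAX, fmt, ap);
    va_end(ap);
    t->count++;
}

/* The macro takes ONE parameter: the parenthesized argument list.
 * Enabled:  SKETCH_TIMING_EVENT((&tm, "x = %d", x))
 *           expands to sketch_add_event (&tm, "x = %d", x)
 * Disabled: the whole call, arguments included, is discarded. */
#ifndef SKETCH_DISABLE_TIMING
#define SKETCH_TIMING_EVENT(x) sketch_add_event x
#else
#define SKETCH_TIMING_EVENT(x) do { } while (0)
#endif
```

A caller would declare a sketch_timing_t, call sketch_timing_init(&tm), and then write SKETCH_TIMING_EVENT((&tm, "Next event, x = %d", x)); exactly as in the quoted example.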
> >
> > The output from all OMPI processes (mpirun, orteds, user processes) is 
> > merged together. NTP provides a precision between 1 millisecond and 100 
> > microseconds, which may not be sufficient to order events globally.
> > To help developers extract the most realistic picture of what is going on, 
> > additional time synchronisation can be performed before profiling. The 
> > mpisync program should be run with one user process per node to produce a 
> > file with each node's time offset relative to the HNP. If the cluster runs 
> > over Gigabit Ethernet the precision will be 30-50 microseconds; with 
> > InfiniBand, 4 microseconds. mpisync produces an output file that can be 
> > read and used by the timing framework (OMPI_MCA_opal_clksync_file parameter). 
> > The bad news is that this synchronisation is not enough because clock skew 
> > differs between nodes. Additional periodic synchronisation is needed. This 
> > is planned for the near future (Ralph and I are discussing possible ways 
> > now).
> >
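
As an illustration of how per-node offsets can be used to put events on a common time base, here is a small sketch in C. The file format written by mpisync is not described in this thread, so nothing is parsed here. The struct layout, the function name, and the sign convention (offset = node clock minus HNP clock, in seconds) are all assumptions made for the example.

```c
#include <stddef.h>

/* Hypothetical in-memory form of one line of offset data. */
typedef struct {
    int node;       /* made-up node index */
    double offset;  /* ASSUMED: node clock minus reference (HNP) clock, seconds */
} sketch_offset_t;

/* Hypothetical timing event as recorded on some node. */
typedef struct {
    int node;         /* node the event was recorded on */
    double local_ts;  /* timestamp from that node's local clock, seconds */
} sketch_event_t;

/* Convert an event's local timestamp to the reference clock.
 * Returns 0 on success, -1 if no offset is known for the event's node. */
int sketch_to_reference_time(const sketch_offset_t *offsets, size_t n_offsets,
                             const sketch_event_t *ev, double *out)
{
    for (size_t i = 0; i < n_offsets; i++) {
        if (offsets[i].node == ev->node) {
            /* Under the assumed sign convention, subtracting the offset
             * maps the local reading onto the reference clock. */
            *out = ev->local_ts - offsets[i].offset;
            return 0;
        }
    }
    return -1;
}
```

A single constant offset per node only corrects the difference measured at synchronisation time; it does not account for the clock skew mentioned above, which is why periodic re-synchronisation is still needed.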
> > The mpirun_prof and ompi_timing_post scripts may be used to automate clock 
> > synchronisation in the following manner:
> > export OMPI_MCA_ompi_timing=true
> > export OMPI_MCA_orte_oob_timing=true
> > export OMPI_MCA_orte_rml_timing=true
> > export OMPI_MCA_opal_timing_file=timing.out
> > mpirun_prof <ompi-params> ./mpiprog
> > ompi_timing_post timing.out
> >
> > ompi_timing_post will simply sort the events and make all times relative 
> > to the first one.
> >
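
The two steps described for ompi_timing_post (sort the events, then make every time relative to the first) can be sketched in C as follows. This is not the script itself; the struct and function names are hypothetical, and the events are assumed to be already loaded into an array rather than read from timing.out.

```c
#include <stdlib.h>

/* Hypothetical in-memory form of one merged timing record. */
typedef struct {
    double ts;          /* absolute timestamp, seconds */
    const char *descr;  /* event description */
} sketch_entry_t;

/* qsort comparator: ascending by timestamp. */
static int sketch_cmp_ts(const void *a, const void *b)
{
    double ta = ((const sketch_entry_t *)a)->ts;
    double tb = ((const sketch_entry_t *)b)->ts;
    return (ta > tb) - (ta < tb);
}

/* Sort entries by time, then rebase all timestamps on the earliest one,
 * so the first event ends up at 0. */
void sketch_post_process(sketch_entry_t *entries, size_t n)
{
    if (n == 0) {
        return;
    }
    qsort(entries, n, sizeof(entries[0]), sketch_cmp_ts);

    double first = entries[0].ts;
    for (size_t i = 0; i < n; i++) {
        entries[i].ts -= first;
    }
}
```

Rebasing on the earliest event keeps the numbers small and makes the merged output from different processes easier to compare by eye.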
> > --
> > С Уважением, Поляков Артем Юрьевич
> > Best regards, Artem Y. Polyakov
> > _______________________________________________
> > devel mailing list
> > de...@open-mpi.org
> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> > Link to this post: 
> > http://www.open-mpi.org/community/lists/devel/2014/09/15837.php
> 
> 
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to: 
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2014/09/15869.php
> 
> 
> 
> -- 
> С Уважением, Поляков Артем Юрьевич
> Best regards, Artem Y. Polyakov
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2014/09/15870.php
