Re: [OMPI devel] tm-less tm module

Paul Hargrove Mon, 25 Jan 2016 16:27:41 -0500 (EST)

As one who builds other people's software frequently, I have my own
opinions here.

Above all else, there is no one "right" answer, but consistency
within a product is best.
So (within reason) the same things that work to configure module A and B
should work with C and D as well.
To use an analogy from (human) languages, I dislike "irregular verbs".

The proposal to report (at run time) the existence of TM support on the
system (but lacking in ORTE) doesn't "feel" consistent with existing
practice.
In GASNet we *do* report at runtime if a high-speed network is present and
you are not using it.
For instance we warn if the headers were missing at configure time but we
can see the /dev entry at runtime.
However, we do that uniformly across all the networks and have done this
for years.
So, it is a *consistent* practice in that project.

"Keep It Simple, Stupid" is also an important principle.
So, I agree with those who think the proposal to catch this at runtime is
an unnecessary complication.

I think improving the FAQ is a good idea.

I can, however, think of one thing that might help with the "I thought I
had configured X" problem Jeff mentions.
What about a summary output at the end of configure or make?

Right now I sometimes use something like the following:
  $ grep 'bindings\.\.\. yes' configure.out
  $ grep -e 'component .* can compile\.\.\. yes' configure.log
This lets me see what is going to be built.
Outputting something like this at the end of configure might encourage admins
to check for their feature X before typing "make".
The existing configury goop can easily be modified to keep a list of
configured components and language bindings.
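
A minimal sketch of the kind of summary described above, assuming the
output of configure was captured to a file such as configure.out. The
function name summarize_configure is hypothetical, and the patterns are
simply the two grep expressions shown above; the exact wording of
configure's messages may differ between Open MPI versions.

```shell
#!/bin/sh
# Hypothetical helper (not part of Open MPI): print the lines of a
# captured configure output that report enabled language bindings and
# components that can be compiled.
# Usage: summarize_configure configure.out
summarize_configure() {
    out="$1"
    if [ ! -r "$out" ]; then
        echo "summarize_configure: cannot read '$out'" >&2
        return 1
    fi
    echo "== Language bindings that will be built =="
    grep 'bindings\.\.\. yes' "$out" || echo "  (none found)"
    echo "== Components that can compile =="
    grep -e 'component .* can compile\.\.\. yes' "$out" || echo "  (none found)"
    return 0
}
```

Run as "summarize_configure configure.out" right after configure finishes,
before typing "make".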

However, another alternative is probably easier to implement:
The last step of "make install" could print a message like:
  NOTICE: Your installation is complete.
  NOTICE: You can run ompi_info to verify that all expected components and
  language bindings have been built.
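
As a hedged sketch of the check that such a notice asks admins to perform:
the helper below reads ompi_info output on standard input and reports
whether a given component is listed. The function name has_component is
hypothetical, and it assumes ompi_info lists components on lines of the
form "MCA <framework>: <component> (...)", which may vary across Open MPI
releases.

```shell
#!/bin/sh
# Hypothetical helper (not part of Open MPI): succeed if the ompi_info
# output on stdin lists the given framework/component pair, for example
# "plm" and "tm".
# Assumes lines of the form "MCA plm: tm (MCA v..., API v..., ...)".
has_component() {
    framework="$1"
    component="$2"
    grep -q "MCA ${framework}: ${component} "
}
```

For example:
  $ ompi_info | has_component plm tm || echo "TM launcher support was not built"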

-Paul

On Mon, Jan 25, 2016 at 11:13 AM, Jeff Squyres (jsquyres) <
jsquy...@cisco.com> wrote:

> Haters gotta hate.  ;-)
>
> Kidding aside, ok, you make valid points.  So -- no tm "addition".  We
> just have to rely on people using functionality like "--with-tm" in the
> configure line to force/ensure that tm (or whatever feature) will actually
> get built.
>
>
> > On Jan 25, 2016, at 1:31 PM, Ralph Castain <r...@open-mpi.org> wrote:
> >
> > I think we would be opening a real can of worms with this idea. There
> are environments, for example, that use PBSPro for one part of the system
> (e.g., IO nodes), but something else for the compute section.
> >
> > Personally, I'd rather follow Howard's suggestion.
> >
> > On Mon, Jan 25, 2016 at 10:21 AM, Nathan Hjelm <hje...@lanl.gov> wrote:
> > On Mon, Jan 25, 2016 at 05:55:20PM +0000, Jeff Squyres (jsquyres) wrote:
> > > Hmm.  I'm of split mind here.
> > >
> > > I can see what Howard is saying here -- adding complexity is usually a
> bad thing.
> > >
> > > But we have gotten these problem reports multiple times over the
> years: someone *thinking* that they have built with launcher support X
> (e.g., TM, LSF), but then figuring out later that things aren't running as
> expected, and after a bunch of work, figure out that it's because they
> didn't build with support X.
> > >
> > > Gilles' idea actually sounds interesting -- if the tm module detects
> some of the sentinel PBS/TM env variables, emit a show_help() if we don't
> have full TM support compiled in.  This would actually save some users a
> bunch of time and frustration.
> > >
> > > --> Keep in mind that the SLURM launcher is different, because it's
> all CLI-based (not API-based) and therefore we always build it (because we
> don't have to find headers and libraries).
> > >
> > > FWIW, we do have precedent of having extra MCA params for users to
> turn off warnings that they don't want to see.
> > >
> > > I guess the question here is: is there a valid use case for running in
> PBS/Torque and *not* wanting to use the TM launcher?
> >
> > One case comes to mind. In the case of Cray systems that unfortunately
> > run Moab/Torque we can launch using either alps or torque (Howard correct
> > me if I am wrong). When Sam and I originally wrote the XE support we
> > used alps instead of torque. I am not entirely sure what we do now.
> >
> > -Nathan
> >
> > _______________________________________________
> > devel mailing list
> > de...@open-mpi.org
> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> > Link to this post:
> http://www.open-mpi.org/community/lists/devel/2016/01/18509.php
> >
> > _______________________________________________
> > devel mailing list
> > de...@open-mpi.org
> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> > Link to this post:
> http://www.open-mpi.org/community/lists/devel/2016/01/18510.php
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2016/01/18511.php
>

-- 
Paul H. Hargrove                          phhargr...@lbl.gov
Computer Languages & Systems Software (CLaSS) Group
Computer Science Department               Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
