I guess what I was aiming at was something similar to what we are all converging upon. People don't really care about the details of which mapper components were built, etc. What they really need to know is (a) what resource manager support was built and (b) which fabrics are supported.
So a very simple, short output indicating "support for SLURM and IB was found and built" is probably adequate for resolving the original concerns in this email thread.

On Mon, Jan 25, 2016 at 3:22 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:

> Jeff,
>
> Excellent point about the --with-foo behavior.
> If an admin knows what component name to grep for then they should
> "--with-foo" that component.
> With language bindings the spelling is "--enable-mpi-foo", but the
> principle is the same.
> Adding new places to apply grep is entirely superfluous if use of those
> configure options is applied consistently/correctly.
>
> Even so, if folks feel (as Nathan or Howard seem to) that a configure
> summary is useful, then I can't see any *harm* in adding it.
> Since once the build is complete ompi_info can tell one essentially
> everything about the build, I don't think Jeff's "slippery slope"/"eye
> chart" concern is a real problem - the summary (if any) would remain very
> high level (such as a list of configured components and language bindings).
>
> If at the end of this line of discussion no new summary output is to be
> generated, then I stand by my original proposal of having "make install"
> print a suggestion that admins run ompi_info to double-check what they have
> built/installed.
> That helps the admin who doesn't know the name of the component for
> passing --with-foo, but might recognize it when they see it (e.g.
> "ofi"-vs-"libfabric", "verbs"-vs-"ibv", or "pbs"-vs-"tm").
>
> -Paul
>
> On Mon, Jan 25, 2016 at 2:37 PM, Jeff Squyres (jsquyres) <
> jsquy...@cisco.com> wrote:
>
>> I'd like to point out an offhand comment that I made earlier that seems
>> to have gotten lost -- let me cite the README, because it states it much
>> better than I did earlier in this thread:
>>
>> -----
>> Note that for many of Open MPI's --with-<foo> options, Open MPI will,
>> by default, search for header files and/or libraries for <foo>.
>> If the relevant files are found, Open MPI will build support for <foo>;
>> if they are not found, Open MPI will skip building support for <foo>.
>> However, if you specify --with-<foo> on the configure command line and
>> Open MPI is unable to find relevant support for <foo>, configure will
>> assume that it was unable to provide a feature that was specifically
>> requested and will abort so that a human can resolve the issue.
>> -----
>>
>> Hence, if the user had specified --with-tm (even without a path), and
>> Open MPI's configure was not able to find TM support, configure would have
>> aborted.
>>
>> This --with-<foo> support is uniform across all of its options. Hence,
>> if you want to guarantee that you have support for a specific feature, you
>> should use --with-<foo>.
>>
>> I'm almost certain that we decided on this behavior back near the
>> beginning of the Open MPI project because of conversations exactly like
>> this (and me/others citing that writing something out at the end of
>> configure might end up being a losing battle)...
>>
>> > On Jan 25, 2016, at 5:30 PM, Howard Pritchard <hpprit...@gmail.com> wrote:
>> >
>> > Hi Folks,
>> >
>> > I like Paul's suggestion for configury summary output a lot. It would
>> > have helped me when I was trying to deal with an oddball
>> > one-off install of the moab/torque software on one of the non-standard
>> > front ends at LANL. The libfabric configury has
>> > such a summary output at the end of configure and it makes it much
>> > simpler (for a much smaller project) to check that
>> > you're getting what you expected.
>> >
>> > I still say updating the FAQ with something more precise is in order.
>> > I'll work on that.
>> >
>> > Howard
>> >
>> > 2016-01-25 15:20 GMT-07:00 Paul Hargrove <phhargr...@lbl.gov>:
>> > Ralph,
>> >
>> > As a practical matter most users probably aren't going to know what to
>> > do with anything that scrolls off their screen.
>> > So I think dumping the ompi_info output as-is would be just "noise" to
>> > many folks.
>> > That is one reason I didn't just suggest doing exactly that
>> > (cross-compilation being another).
>> >
>> > However, a suitably summarized output might be just the right thing.
>> > Perhaps something compact along the lines of
>> >   MCA foo: component1 component2 component3
>> >   MCA foobar: componentA componentB
>> >   ...
>> >   Bindings: C C++ Java Fortran(mpif.h 'use mpi')
>> >
>> > If this information could be generated at the end of configure, rather
>> > than "make install", it could save folks some time spent compiling
>> > incorrectly configured builds.
>> >
>> > Another thing one might independently want to consider is having
>> > configure warn when the required libs are present for a component but the
>> > "can compile" test fails.
>> > This would, for instance, catch the situation when the "libfoo"
>> > package is installed but the "libfoo-dev" package is not.
>> > This approach, however, may require non-trivial changes to how all the
>> > configure probes are performed since I don't believe this is something
>> > autoconf has existing support for (the AC_CHECK_LIB macro is effectively a
>> > check for the "libfoo-dev" package only).
>> >
>> > Just my $0.02USD, of course.
>> >
>> > -Paul
>> >
>> > On Mon, Jan 25, 2016 at 1:46 PM, Ralph Castain <r...@open-mpi.org> wrote:
>> > That makes sense, Paul - what if we output effectively the ompi_info
>> > summary of what was built at the end of the make install procedure? Then
>> > you would have immediate feedback on the result.
>> >
>> > On Mon, Jan 25, 2016 at 1:27 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>> > As one who builds other people's software frequently, I have my own
>> > opinions here.
>> >
>> > Above all else: there is no one "right" answer, but
>> > consistency within a product is best.
>> > So (within reason) the same things that work to configure module A and
>> > B should work with C and D as well.
>> > To use an analogy from (human) languages, I dislike "irregular verbs".
>> >
>> > The proposal to report (at run time) the existence of TM support on the
>> > system (but lacking in ORTE) doesn't "feel" consistent with existing
>> > practice.
>> > In GASNet we *do* report at runtime if a high-speed network is present
>> > and you are not using it.
>> > For instance we warn if the headers were missing at configure time but
>> > we can see the /dev entry at runtime.
>> > However, we do that uniformly across all the networks and have done
>> > this for years.
>> > So, it is a *consistent* practice in that project.
>> >
>> > Keep It Simple Stupid is also an important one.
>> > So, I agree with those who think the proposal to catch this at runtime
>> > is an unnecessary complication.
>> >
>> > I think improving the FAQ is a good idea.
>> >
>> > I can, however, think of one thing that might help the "I thought
>> > I had configured X" problem Jeff mentions.
>> > What about a summary output at the end of configure or make?
>> >
>> > Right now I sometimes use something like the following:
>> >   $ grep 'bindings\.\.\. yes' configure.out
>> >   $ grep -e 'component .* can compile\.\.\. yes' configure.log
>> > This lets me see what is going to be built.
>> > Outputting something like this at the end of configure might encourage
>> > admins to check for their feature X before typing "make".
>> > The existing configury goop can easily be modified to keep a list of
>> > configured components and language bindings.
>> >
>> > However, another alternative is probably easier to implement:
>> > The last step of "make install" could print a message like
>> >   NOTICE: Your installation is complete.
>> >   NOTICE: You can run ompi_info to verify that all expected components
>> >   and language bindings have been built.
>> >
>> > -Paul
>> >
>> > On Mon, Jan 25, 2016 at 11:13 AM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:
>> > Haters gotta hate. ;-)
>> >
>> > Kidding aside, ok, you make valid points. So -- no tm "addition". We
>> > just have to rely on people using functionality like "--with-tm" in the
>> > configure line to force/ensure that tm (or whatever feature) will actually
>> > get built.
>> >
>> > > On Jan 25, 2016, at 1:31 PM, Ralph Castain <r...@open-mpi.org> wrote:
>> > >
>> > > I think we would be opening a real can of worms with this idea. There
>> > > are environments, for example, that use PBSPro for one part of the system
>> > > (e.g., IO nodes), but something else for the compute section.
>> > >
>> > > Personally, I'd rather follow Howard's suggestion.
>> > >
>> > > On Mon, Jan 25, 2016 at 10:21 AM, Nathan Hjelm <hje...@lanl.gov> wrote:
>> > > On Mon, Jan 25, 2016 at 05:55:20PM +0000, Jeff Squyres (jsquyres) wrote:
>> > > > Hmm. I'm of a split mind here.
>> > > >
>> > > > I can see what Howard is saying here -- adding complexity is
>> > > > usually a bad thing.
>> > > >
>> > > > But we have gotten these problem reports multiple times over the
>> > > > years: someone *thinking* that they have built with launcher support X
>> > > > (e.g., TM, LSF), but then figuring out later that things aren't running as
>> > > > expected, and after a bunch of work, figuring out that it's because they
>> > > > didn't build with support X.
>> > > >
>> > > > Gilles' idea actually sounds interesting -- if the tm module detects
>> > > > some of the sentinel PBS/TM env variables, emit a show_help() if we don't
>> > > > have full TM support compiled in. This would actually save some users a
>> > > > bunch of time and frustration.
>> > > >
>> > > > --> Keep in mind that the SLURM launcher is different, because it's
>> > > > all CLI-based (not API-based) and therefore we always build it (because we
>> > > > don't have to find headers and libraries).
>> > > >
>> > > > FWIW, we do have precedent of having extra MCA params for users to
>> > > > turn off warnings that they don't want to see.
>> > > >
>> > > > I guess the question here is: is there a valid use case for running
>> > > > in PBS/Torque and *not* wanting to use the TM launcher?
>> > >
>> > > One case comes to mind. In the case of Cray systems that unfortunately
>> > > run Moab/Torque we can launch using either alps or torque (Howard, correct
>> > > me if I am wrong). When Sam and I originally wrote the XE support we
>> > > used alps instead of torque. I am not entirely sure what we do now.
>> > >
>> > > -Nathan
>> > >
>> > > _______________________________________________
>> > > devel mailing list
>> > > de...@open-mpi.org
>> > > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> > > Link to this post:
>> > > http://www.open-mpi.org/community/lists/devel/2016/01/18509.php
>> >
>> > --
>> > Jeff Squyres
>> > jsquy...@cisco.com
>> > For corporate legal information go to:
>> > http://www.cisco.com/web/about/doing_business/legal/cri/
>> >
>> > --
>> > Paul H. Hargrove phhargr...@lbl.gov
>> > Computer Languages & Systems Software (CLaSS) Group
>> > Computer Science Department Tel: +1-510-495-2352
>> > Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
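
For what it's worth, the compact per-framework summary Paul proposes in the quoted text can be approximated today by post-processing saved configure output. Below is a minimal sketch, not anything Open MPI ships. It assumes configure's output was saved to a file and that it contains lines of the form "checking if MCA component <framework>:<component> can compile... yes" (the same lines Paul's grep matches). The function name summarize_components is hypothetical, and the exact line format is my assumption.

```shell
#!/bin/sh
# Sketch: condense saved configure output into one line per MCA framework,
# in the "MCA foo: component1 component2" style suggested in the thread.
#
# ASSUMPTION: buildable components appear in the output as
#   checking if MCA component <framework>:<component> can compile... yes
# summarize_components is a hypothetical helper, not an Open MPI tool.
summarize_components() {
    # Read configure output on stdin; keep only "can compile... yes" lines
    # and reduce each one to "<framework> <component>".
    sed -n 's/.*component \([^: ]*\):\([^ ]*\) can compile\.\.\. yes.*/\1 \2/p' |
    # Sort so that all components of a framework are adjacent.
    LC_ALL=C sort -u |
    # Fold consecutive lines with the same framework into a single line.
    awk '
        $1 != fw { if (fw != "") print line; fw = $1; line = "MCA " fw ":" }
        { line = line " " $2 }
        END { if (fw != "") print line }
    '
}
```

Usage would be something like "summarize_components < configure.out". This only reports what configure decided it could build; ompi_info remains the authority on what was actually installed, and --with-<foo> is still the way to make a missing feature a hard error.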