My concern with the runtime solution is that I fear we will suffer death by a thousand cuts as we try to navigate our way around all the odd configurations that exist out there. What I don't want is to get into a constant game of whack-a-mole, where we are trying to emit the warning only when we should, and always when we should.
Just seems to me like we are begging for a long-running search for the perfect solution.

On Mon, Jan 25, 2016 at 2:37 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:

> I'd like to point out an offhand comment that I made earlier that seems to have gotten lost -- let me cite the README, because it says it much better than I did earlier in this thread:
>
> -----
> Note that for many of Open MPI's --with-<foo> options, Open MPI will,
> by default, search for header files and/or libraries for <foo>. If
> the relevant files are found, Open MPI will build support for <foo>;
> if they are not found, Open MPI will skip building support for <foo>.
> However, if you specify --with-<foo> on the configure command line and
> Open MPI is unable to find relevant support for <foo>, configure will
> assume that it was unable to provide a feature that was specifically
> requested and will abort so that a human can resolve the issue.
> -----
>
> Hence, if the user had specified --with-tm (even without a path), and Open MPI's configure was not able to find TM support, configure would have aborted.
>
> This --with-<foo> support is uniform across all of its options. Hence, if you want to guarantee that you have support for a specific feature, you should use --with-<foo>.
>
> I'm almost certain that we decided on this behavior back near the beginning of the Open MPI project because of conversations exactly like this (and me/others citing that writing something out at the end of configure might end up being a losing battle)...
>
> > On Jan 25, 2016, at 5:30 PM, Howard Pritchard <hpprit...@gmail.com> wrote:
> >
> > Hi Folks,
> >
> > I like Paul's suggestion for configury summary output a lot. It would have helped me when I was trying to deal with an oddball one-off install of the moab/torque software on one of the non-standard front ends at LANL.
> > The libfabric configury has such a summary output at the end of configure and it makes it much simpler (for a much smaller project) to check that you're getting what you expected.
> >
> > I still say updating the FAQ with something more precise is in order. I'll work on that.
> >
> > Howard
> >
> > 2016-01-25 15:20 GMT-07:00 Paul Hargrove <phhargr...@lbl.gov>:
> > Ralph,
> >
> > As a practical matter most users probably aren't going to know what to do with anything that scrolls off their screen.
> > So I think dumping the ompi_info output as-is would be just "noise" to many folks.
> > That is one reason I didn't just suggest doing exactly that (cross-compilation being another).
> >
> > However, a suitably summarized output might be just the right thing.
> > Perhaps something compact along the lines of
> >   MCA foo: component1 component2 component2
> >   MCA foobar: componentA componentB
> >   ...
> >   Bindings: C C++ Java Fortran(mpif.h 'use mpi')
> >
> > If this information could be generated at the end of configure, rather than "make install", it could save folks some time spent compiling incorrectly configured builds.
> >
> > Another thing one might independently want to consider is having configure warn when the required libs are present for a component but the "can compile" test fails.
> > This would, for instance, catch the situation when the "libfoo" package is installed but the "libfoo-dev" package is not.
> > This approach, however, may require non-trivial changes to how all the configure probes are performed since I don't believe this is something autoconf has existing support for (the AC_CHECK_LIB macro is effectively a check for the "libfoo-dev" package only).
> >
> > Just my $0.02USD, of course.
> >
> > -Paul
> >
> > On Mon, Jan 25, 2016 at 1:46 PM, Ralph Castain <r...@open-mpi.org> wrote:
> > That makes sense, Paul - what if we output effectively the ompi_info summary of what was built at the end of the make install procedure? Then you would have immediate feedback on the result.
> >
> > On Mon, Jan 25, 2016 at 1:27 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
> > As one who builds other people's software frequently, I have my own opinions here.
> >
> > Above all else is that there is no one "right" answer, but that consistency within a product is best.
> > So (within reason) the same things that work to configure module A and B should work with C and D as well.
> > To use an analogy from (human) languages, I dislike "irregular verbs".
> >
> > The proposal to report (at run time) the existence of TM support on the system (but lacking in ORTE) doesn't "feel" consistent with existing practice.
> > In GASNet we *do* report at runtime if a high-speed network is present and you are not using it.
> > For instance we warn if the headers were missing at configure time but we can see the /dev entry at runtime.
> > However, we do that uniformly across all the networks and have done this for years.
> > So, it is a *consistent* practice in that project.
> >
> > Keep It Simple Stupid is also an important one.
> > So, I agree with those who think the proposal to catch this at runtime is an unnecessary complication.
> >
> > I think improving the FAQ is a good idea.
> >
> > I can, however, think of one thing that might help the "I thought I had configured X" problem Jeff mentions.
> > What about a summary output at the end of configure or make?
> >
> > Right now I sometimes use something like the following:
> >   $ grep 'bindings\.\.\. yes' configure.out
> >   $ grep -e 'component .* can compile\.\.\. yes' configure.log
> > This lets me see what is going to be built.
> > Outputting something like this at the end of configure might encourage admins to check for their feature X before typing "make".
> > The existing configury goop can easily be modified to keep a list of configured components and language bindings.
> >
> > However, another alternative is probably easier to implement:
> > The last step of "make install" could print a message like
> >   NOTICE: Your installation is complete.
> >   NOTICE: You can run ompi_info to verify that all expected components and language bindings have been built.
> >
> > -Paul
> >
> > On Mon, Jan 25, 2016 at 11:13 AM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:
> > Haters gotta hate. ;-)
> >
> > Kidding aside, ok, you make valid points. So -- no tm "addition". We just have to rely on people using functionality like "--with-tm" in the configure line to force/ensure that tm (or whatever feature) will actually get built.
> >
> > > On Jan 25, 2016, at 1:31 PM, Ralph Castain <r...@open-mpi.org> wrote:
> > >
> > > I think we would be opening a real can of worms with this idea. There are environments, for example, that use PBSPro for one part of the system (e.g., IO nodes), but something else for the compute section.
> > >
> > > Personally, I'd rather follow Howard's suggestion.
> > >
> > > On Mon, Jan 25, 2016 at 10:21 AM, Nathan Hjelm <hje...@lanl.gov> wrote:
> > > On Mon, Jan 25, 2016 at 05:55:20PM +0000, Jeff Squyres (jsquyres) wrote:
> > > > Hmm. I'm of split mind here.
> > > >
> > > > I can see what Howard is saying here -- adding complexity is usually a bad thing.
> > > >
> > > > But we have gotten these problem reports multiple times over the years: someone *thinking* that they have built with launcher support X (e.g., TM, LSF), but then figuring out later that things aren't running as expected, and after a bunch of work, figuring out that it's because they didn't build with support X.
> > > >
> > > > Gilles' idea actually sounds interesting -- if the tm module detects some of the sentinel PBS/TM env variables, emit a show_help() if we don't have full TM support compiled in. This would actually save some users a bunch of time and frustration.
> > > >
> > > > --> Keep in mind that the SLURM launcher is different, because it's all CLI-based (not API-based) and therefore we always build it (because we don't have to find headers and libraries).
> > > >
> > > > FWIW, we do have precedent of having extra MCA params for users to turn off warnings that they don't want to see.
> > > >
> > > > I guess the question here is: is there a valid use case for running in PBS/Torque and *not* wanting to use the TM launcher?
> > >
> > > One case comes to mind. In the case of Cray systems that unfortunately run Moab/Torque we can launch using either alps or torque (Howard correct me if I am wrong). When Sam and I originally wrote the XE support we used alps instead of torque. I am not entirely sure what we do now.
> > >
> > > -Nathan
> > >
> > > _______________________________________________
> > > devel mailing list
> > > de...@open-mpi.org
> > > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> > > Link to this post: http://www.open-mpi.org/community/lists/devel/2016/01/18509.php
> >
> > --
> > Jeff Squyres
> > jsquy...@cisco.com
> > For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
> >
> > --
> > Paul H. Hargrove phhargr...@lbl.gov
> > Computer Languages & Systems Software (CLaSS) Group
> > Computer Science Department Tel: +1-510-495-2352
> > Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
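
For anyone who wants to try the compact per-framework summary Paul describes in the quoted thread, here is a minimal sketch of the idea in Python. It is not Open MPI's configury. It assumes configure output lines shaped like "checking if MCA component plm:tm can compile... yes", which is a guess based on the grep patterns quoted above, and the names `summarize_components`, `format_summary` and `SAMPLE` are hypothetical.

```python
import re
from collections import defaultdict

# Assumed line shape (inferred from the grep patterns in the thread, not
# taken from Open MPI's configure script):
#   "checking if MCA component plm:tm can compile... yes"
COMPONENT_RE = re.compile(
    r"component\s+(\w+):(\w+)\s+can compile\.\.\. (yes|no)"
)


def summarize_components(configure_output):
    """Group components that can compile by framework (hypothetical helper)."""
    built = defaultdict(list)
    for line in configure_output.splitlines():
        match = COMPONENT_RE.search(line)
        if match and match.group(3) == "yes":
            framework, component = match.group(1), match.group(2)
            built[framework].append(component)
    return dict(built)


def format_summary(built):
    """Render one 'MCA <framework>: <components>' line per framework."""
    return "\n".join(
        "MCA {}: {}".format(framework, " ".join(components))
        for framework, components in sorted(built.items())
    )


# Made-up sample input for illustration only.
SAMPLE = """\
checking if MCA component plm:tm can compile... no
checking if MCA component plm:slurm can compile... yes
checking if MCA component plm:rsh can compile... yes
checking if MCA component ras:slurm can compile... yes
"""

if __name__ == "__main__":
    print(format_summary(summarize_components(SAMPLE)))
```

In the made-up sample, plm:tm is reported as unable to compile, so it is missing from the plm line. Spotting that before typing "make" is the cue to rerun configure with --with-tm, so that configure aborts instead of silently skipping the component.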