Re: [OMPI users] non-shared fs, executable in different directories

2016-11-28 Thread Gilles Gouaillardet
Jason, two other lesser-known wrappers are available: mpirun --mca orte_launch_agent and --mca orte_fork_agent a.out. Instead of "exec orted", mpirun will "exec a.out". If I understand correctly the issue you are trying to solve, you might simply mpirun from /tmp (assuming /tmp is availab
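A minimal sketch of the fork-agent approach, assuming a hypothetical wrapper installed at the same path on every node and an illustrative LOCAL_SANDBOX_DIR variable pointing at each node's private copy of the executable:

    #!/bin/sh
    # hypothetical fork agent: orted prepends this script to the application
    # command, so "$@" here is "./a.out" plus its arguments
    cd "${LOCAL_SANDBOX_DIR:?}" && exec "$@"

    # mpirun then wraps every task launch with the agent instead of a bare exec:
    mpirun -np 4 --mca orte_fork_agent /usr/local/bin/fork-wrapper.sh ./a.out

    # or, per the second suggestion, simply launch from a directory that exists
    # on every node:
    cd /tmp && mpirun -np 4 /tmp/a.out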

Re: [OMPI users] [Open MPI Announce] Follow-up to Open MPI SC'16 BOF

2016-11-28 Thread Jeff Squyres (jsquyres)
If you have an opinion on the v2.2.x-vs-v3.x question, please submit your vote by COB this upcoming Friday, 2 Dec, 2016: https://www.open-mpi.org/sc16/ Thanks! > On Nov 22, 2016, at 4:32 PM, Pritchard Jr., Howard wrote: > > Hello Folks, > > This is a followup to the question posed at th

Re: [OMPI users] non-shared fs, executable in different directories

2016-11-28 Thread Jeff Squyres (jsquyres)
On Nov 28, 2016, at 1:04 PM, Jason Patton wrote: > > Passing --wdir to mpirun does not solve this particular case, I > believe. HTCondor sets up each worker slot with a uniquely named > sandbox, e.g. a 2-process job might have the user's executable copied > to /var/lib/condor/execute/dir_11955 on

Re: [OMPI users] Issues building Open MPI 2.0.1 with PGI 16.10 on macOS

2016-11-28 Thread Jeff Hammond
An attached config.log that contains the details of the following failures is the best way to make forward progress here. That none of the system headers are detected suggests a rather serious compiler problem that may not have anything to do with headers. checking for sys/types.h... no checking for
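A quick way to reproduce configure's header probe outside the Open MPI tree, assuming pgcc is the compiler under test (file name is illustrative):

    printf '#include <sys/types.h>\nint main(void) { return 0; }\n' > conftest.c
    pgcc -c conftest.c && echo "sys/types.h found" || echo "compiler cannot see system headers"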

Re: [OMPI users] non-shared fs, executable in different directories

2016-11-28 Thread Jason Patton
Passing --wdir to mpirun does not solve this particular case, I believe. HTCondor sets up each worker slot with a uniquely named sandbox, e.g. a 2-process job might have the user's executable copied to /var/lib/condor/execute/dir_11955 on one machine and /var/lib/condor/execute/dir_3484 on another
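Illustrative layout of the situation described above, using the directory names from the message; no single --wdir (or argv[0] path) is valid on both nodes:

    node1: /var/lib/condor/execute/dir_11955/a.out   # rank 0's copy
    node2: /var/lib/condor/execute/dir_3484/a.out    # rank 1's copy

    # a single working directory cannot match both sandboxes:
    mpirun -np 2 --wdir /var/lib/condor/execute/dir_11955 ./a.out   # wrong dir on node2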

Re: [OMPI users] Issues building Open MPI 2.0.1 with PGI 16.10 on macOS

2016-11-28 Thread Bennet Fauber
You could try an explicit $ export CFLAGS="-I/usr/include" prior to running ./configure and see if that has any effect. If it still throws the error, you can examine the full compile line that make prints when it tries to compile the source file to see whether the explicit include made it. If i
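A sketch of that suggestion, assuming a silent-rules automake build where V=1 makes make echo the full compile lines (log file name is illustrative):

    export CFLAGS="-I/usr/include"
    ./configure CC=pgcc
    make V=1 2>&1 | tee build.log        # full compile lines end up in build.log
    grep -- '-I/usr/include' build.log   # confirm the explicit include was passed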

Re: [OMPI users] Issues building Open MPI 2.0.1 with PGI 16.10 on macOS

2016-11-28 Thread Matt Thompson
Hmm. Well, I definitely have /usr/include/stdint.h as I previously was trying to work with clang as the compiler stack. And as near as I can tell, Open MPI's configure is seeing /usr/include as oldincludedir, but maybe that's not how it finds it? If I check my configure output: =
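One way to see what configure actually recorded, assuming config.log is still in the build directory (the grep patterns are just examples):

    grep -n 'oldincludedir' config.log   # where configure thinks legacy headers live
    grep -n 'sys/types.h' config.log     # the failing test program and compiler error
    grep -n 'stdint.h' config.log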

Re: [OMPI users] non-shared fs, executable in different directories

2016-11-28 Thread Jeff Squyres (jsquyres)
On Nov 28, 2016, at 12:16 PM, Jason Patton wrote: > > We do assume that Open MPI is installed in the same location on all > execute nodes, and we set that by passing --prefix $OPEN_MPI_DIR to > mpirun. The ssh wrapper script still tells ssh to execute the PATH, > LD_LIBRARY_PATH, etc. definitions

[OMPI users] Signal propagation in 2.0.1

2016-11-28 Thread Noel Rycroft
I'm seeing different behaviour between Open MPI 1.8.4 and 2.0.1 with regard to signal propagation. With version 1.8.4, mpirun seems to propagate SIGTERM to the tasks it starts, which enables the tasks to handle SIGTERM. In version 2.0.1, mpirun does not seem to propagate SIGTERM and instead I suspe
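A shell-level check of the reported difference, assuming a hypothetical test program sigterm_test that installs a SIGTERM handler and prints when it fires:

    mpirun -np 2 ./sigterm_test &     # sigterm_test traps SIGTERM and logs it
    MPIRUN_PID=$!
    sleep 5
    kill -TERM "$MPIRUN_PID"          # 1.8.4 reportedly forwards this to the tasks;
    wait "$MPIRUN_PID"                # 2.0.1 reportedly does not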

Re: [OMPI users] non-shared fs, executable in different directories

2016-11-28 Thread Jason Patton
We do assume that Open MPI is installed in the same location on all execute nodes, and we set that by passing --prefix $OPEN_MPI_DIR to mpirun. The ssh wrapper script still tells ssh to execute the PATH, LD_LIBRARY_PATH, etc. definitions that mpirun feeds it. However, the location of the mpicc-comp
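For reference, the --prefix form mentioned above; host names and process count are illustrative:

    mpirun --prefix "$OPEN_MPI_DIR" -np 2 -host node1,node2 ./a.out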

Re: [OMPI users] non-shared fs, executable in different directories

2016-11-28 Thread Jeff Squyres (jsquyres)
I'm not sure I understand your solution -- it sounds like you are overriding $HOME for each process...? If so, that's playing with fire. Is there a reason you can't set PATH / LD_LIBRARY_PATH in your ssh wrapper script to point to the Open MPI installation that you want to use on each node? To
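A minimal sketch of the kind of ssh wrapper being suggested; the install paths are illustrative, and the wrapper would be handed to mpirun (e.g. via the plm_rsh_agent MCA parameter):

    #!/bin/sh
    # prepend the remote Open MPI install to the environment of the command
    # that mpirun wants to run on the far side (typically orted)
    host="$1"; shift
    exec /usr/bin/ssh "$host" "PATH=/opt/openmpi/bin:\$PATH LD_LIBRARY_PATH=/opt/openmpi/lib:\$LD_LIBRARY_PATH $*"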

Re: [OMPI users] Issues building Open MPI 2.0.1 with PGI 16.10 on macOS

2016-11-28 Thread Bennet Fauber
I think PGI uses installed GCC components for some parts of standard C (at least for some things on Linux, it does; and I imagine it is similar for Mac). If you look at the post at http://www.pgroup.com/userforum/viewtopic.php?t=5147&sid=17f3afa2cd0eec05b0f4e54a60f50479 The problem seems to have

Re: [OMPI users] How to yield CPU more when not computing (was curious behavior during wait for broadcast: 100% cpu)

2016-11-28 Thread Jeff Hammond
> > > > Note that MPI implementations may be interested in taking advantage of > > https://software.intel.com/en-us/blogs/2016/10/06/intel-xeon-phi-product-family-x200-knl-user-mode-ring-3-monitor-and-mwait. > > Is that really useful if it's KNL-specific and MSR-based, with a setup > that implem

Re: [OMPI users] Issues building Open MPI 2.0.1 with PGI 16.10 on macOS

2016-11-28 Thread Jeff Hammond
The following is the code that fails. The comments indicate the likely source of the error. Please see http://www.pgroup.com/userforum/viewtopic.php?t=5147&sid=17f3afa2cd0eec05b0f4e54a60f50479 and other entries on https://www.google.com/search?q=pgi+stdint.h. You may want to debug libevent by it
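A quick isolation test for the stdint.h problem, independent of libevent (file name is illustrative):

    printf '#include <stdint.h>\nint main(void) { uint64_t x = 42; return (int)(x - 42); }\n' > stdint_probe.c
    pgcc -c stdint_probe.c && echo "pgcc parses stdint.h" || echo "pgcc chokes on stdint.h"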

[OMPI users] Issues building Open MPI 2.0.1 with PGI 16.10 on macOS

2016-11-28 Thread Matt Thompson
All, I recently tried building Open MPI 2.0.1 with the new Community Edition of PGI on macOS. My first mistake was that I was configuring with a configure line I'd cribbed from Linux that had -fPIC. Apparently -fPIC was removed from the macOS build. Okay, I can remove that, and I configured with: ./con
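A configure invocation along the lines described, with -fPIC dropped; the compiler names and install prefix here are illustrative, since the original command line is truncated above:

    ./configure CC=pgcc CXX=pgc++ FC=pgfortran \
        --prefix="$HOME/installed/openmpi-2.0.1-pgi-16.10"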

Re: [OMPI users] ScaLapack tester fails with 2.0.1, works with 1.10.4; Intel Omni-Path

2016-11-28 Thread Christof Koehler
Hello everybody, just to bring this to some conclusion for other people who might eventually find this thread: I tried several times to submit a message to the scalapack/lapack forum. However, regardless of what I do, my message gets flagged as spam and is denied. A strongly abbreviated message got

Re: [OMPI users] malloc related crash inside openmpi

2016-11-28 Thread Jeff Squyres (jsquyres)
> On Nov 25, 2016, at 11:20 AM, Noam Bernstein wrote: > > Looks like this openmpi 2 crash was a matter of not using the correctly > linked executable on all nodes. Now that it’s straightened out, I think it’s > all working, and apparently even fixed my malloc related crash, so perhaps > the