Re: [OMPI users] Installation of openmpi-1.10.7 fails

2018-01-18 Thread Jeff Squyres (jsquyres)
l k points > > There are 1728 k points in the input file and Quantum Espresso, by default, > can read up to 4 k points. > > This error did not occur with openmpi-1.8.1. > > So I will just continue to use openmpi-1.8.1 as it does not crash. > > Thanks, > >

Re: [OMPI users] Installation of openmpi-1.10.7 fails

2018-01-11 Thread Jeff Squyres (jsquyres)
: > > Hi Jeff, > > I looked for the 3.0.1 version but I only found the 3.0.0 version available > for download. So I thought it may take a while for the 3.0.1 to become > available. Or did I miss something? > > Thanks, > > Vahid > >> On Jan 11, 2

Re: [OMPI users] Installation of openmpi-1.10.7 fails

2018-01-11 Thread Jeff Squyres (jsquyres)
ugly) workaround for the compilation issue is to >> configure --with-ucx=/usr ... >> That being said, you should really upgrade to a supported version of Open >> MPI as previously suggested >> >> Cheers, >> >> Gilles >> >> On Saturday, January 6, 2018

Re: [OMPI users] Setting mpirun default parameters in a file

2018-01-10 Thread Jeff Squyres (jsquyres)
See https://www.open-mpi.org/faq/?category=tuning#setting-mca-params for a little more info on how to set MCA params. In terms of physical vs. logical -- are you talking about hyperthreading? If so, Open MPI uses the number of *cores* (by default), because that's what "most" HPC users want (I
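
For reference, an MCA parameter can be set in three ways; a minimal sketch, using btl_tcp_if_include purely as an example parameter:

    # 1. Per-run, on the mpirun command line:
    mpirun --mca btl_tcp_if_include eth0 -np 4 ./a.out

    # 2. As an environment variable (prefix the parameter name with OMPI_MCA_):
    export OMPI_MCA_btl_tcp_if_include=eth0

    # 3. In a file read at startup, e.g. $HOME/.openmpi/mca-params.conf
    #    or <prefix>/etc/openmpi-mca-params.conf:
    btl_tcp_if_include = eth0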

Re: [OMPI users] Installation of openmpi-1.10.7 fails

2018-01-05 Thread Jeff Squyres (jsquyres)
en. > > So I am hoping to avoid the 2.x.x series and use the 1.10.7 version suggested > by the EPW developers. However, it appears that this is not possible. > > Vahid > >> On Jan 5, 2018, at 5:06 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> >> wrote: >&

Re: [OMPI users] Installation of openmpi-1.10.7 fails

2018-01-05 Thread Jeff Squyres (jsquyres)
I forget what the underlying issue was, but this issue just came up and was recently fixed: https://github.com/open-mpi/ompi/issues/4345 However, the v1.10 series is fairly ancient -- the fix was not applied to that series. The fix was applied to the v2.1.x series, and a snapshot tarball

Re: [OMPI users] OMPI 3.0.0 crashing at mpi_init on OS X using Fortran

2017-12-12 Thread Jeff Squyres (jsquyres)
I am unable to reproduce your error with Open MPI v3.0.0 on the latest stable MacOS High Sierra. Given that you're failing in MPI_INIT, it feels like the application shouldn't matter. But regardless, can you test with the trivial Fortran test programs in the examples/ directory in the Open

[OMPI users] IMB-MPI1 hangs after 30 minutes with Open MPI 3.0.0 (was: Openmpi 1.10.4 crashes with 1024 processes)

2017-11-30 Thread Jeff Squyres (jsquyres)
> Regards, Götz > > On Thu, Nov 30, 2017 at 4:24 PM, Jeff Squyres (jsquyres) > <jsquy...@cisco.com> wrote: >> Can you upgrade to 1.10.7? That's the last release in the v1.10 series, and >> has all the latest bug fixes. >> >>> On Nov 30, 2017, at 9:5

Re: [OMPI users] Openmpi 1.10.4 crashes with 1024 processes

2017-11-30 Thread Jeff Squyres (jsquyres)
Can you upgrade to 1.10.7? That's the last release in the v1.10 series, and has all the latest bug fixes. > On Nov 30, 2017, at 9:53 AM, Götz Waschk wrote: > > Hi everyone, > > I have managed to solve the first part of this problem. It was caused > by the quota on

Re: [OMPI users] mpifort cannot find libgfortran.so at the correct path

2017-11-29 Thread Jeff Squyres (jsquyres)
Chemical and Biomolecular Engineering > Advanced Computing Center for Research and Education (ACCRE) > Vanderbilt University - Hill Center 201 > (615)-875-9137 > www.accre.vanderbilt.edu > > On 2017-11-29 16:07:04-06:00 Jeff Squyres (jsquyres) wrote: > > On Nov 29, 2017, at 4

Re: [OMPI users] mpifort cannot find libgfortran.so at the correct path

2017-11-29 Thread Jeff Squyres (jsquyres)
On Nov 29, 2017, at 4:51 PM, Vanzo, Davide wrote: > > Although tempting, changing the version of OpenMPI would mean a significant > amount of changes in our software stack. Understood. FWIW: the only differences between 1.10.3 and 1.10.7 were bug fixes (including,

Re: [OMPI users] mpifort cannot find libgfortran.so at the correct path

2017-11-29 Thread Jeff Squyres (jsquyres)
FWIW, adding -L/usr/lib or -L/usr/lib64 is generally considered Bad, because it may usurp the default linker path order. I note that you're using Open MPI 1.10.3 -- if you're unwilling/unable to upgrade to Open MPI 3.0.x, could you upgrade to Open MPI 1.10.7? We may well have fixed the issue

Re: [OMPI users] Bug in 2.1.2 configure script

2017-11-27 Thread Jeff Squyres (jsquyres)
Just to follow up for the web: https://github.com/open-mpi/ompi/pull/4538 > On Nov 24, 2017, at 7:51 AM, gil...@rist.or.jp wrote: > > Thanks Fabrizio ! > > this has been fixed from v3.0.x, but has never been back-ported into the > v2.x branch. > > i will issue a PR to fix this > > >

Re: [OMPI users] Fwd: Request to debug the code(edited)

2017-11-18 Thread Jeff Squyres (jsquyres)
Nitu -- We actually try hard not to do students' homework for them on this list. We are more than willing to *help*, but please don't just send your program to us and say "fix it for me." Remember that we are volunteers on this list; people are inspired to help others when it is obvious that

Re: [OMPI users] usNIC BTL unrecognized payload type 255 when running under SLURM srun nut not mpiexec/mpirun

2017-11-10 Thread Jeff Squyres (jsquyres)
On Nov 9, 2017, at 6:51 PM, Forai,Petar wrote: > > We’re observing output such as the following when running non-trivial MPI > software through SLURM’s srun > > [cn-11:52778] unrecognized payload type 255 > [cn-11:52778] base = 0x9ce2c0, proto = 0x9ce2c0, hdr = 0x9ce300

[OMPI users] Open MPI SC'17 Birds of a Feather (BOF)

2017-11-02 Thread Jeff Squyres (jsquyres)
Who's going to SC'17? We are! Come see the Open MPI State of the Union BOF on Wednesday, November 15, 2017, at 5:15pm: http://sc17.supercomputing.org/presentation/?id=bof115=sess328 We'll discuss where Open MPI is, and where it's going. We'll also have presentations from a few of our

Re: [OMPI users] MCA version error

2017-10-16 Thread Jeff Squyres (jsquyres)
_mmap.so > It gives me: openmpi-1.10.0-10.el7.x86_64 > > So, I think I'm having some sort of path issue. Is that right ? The program I > wish to run will go fine with MCA 2.0. > > On Fri, Oct 13, 2017 at 3:06 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> > wrote: >

Re: [OMPI users] Fwd: MCA version error

2017-10-13 Thread Jeff Squyres (jsquyres)
>From the output you supplied, it looks like you are running Open MPI v2.1.x. Did you install Open MPI v2.1.x from source, and install it into the same directory that you had previously installed Open MPI v2.0.x? If so, the warnings you are seeing (not errors) are likely the fact that there

Re: [OMPI users] MCA version error

2017-10-13 Thread Jeff Squyres (jsquyres)
I think you're mixing a few versions there: - You mention installing Open MPI v1.10 - But you say that running ompi_info shows MCA v2.1 (which probably means Open MPI v2.1) - And you say that running your code with MCA v2.0 (which probably means Open MPI v2.0) works You might want to

Re: [OMPI users] RoCE device performance with large message size

2017-10-10 Thread Jeff Squyres (jsquyres)
Probably want to check to make sure that lossless ethernet is enabled everywhere (that's a common problem I've seen); otherwise, you end up in timeouts and retransmissions. Check with your vendor on how to do layer-0 diagnostics, etc. Also, if this is a new vendor, they should probably try

Re: [OMPI users] OpenMPI 3.0.0, compilation using Intel icc 11.1 on Linux, error when compiling pmix_mmap

2017-10-02 Thread Jeff Squyres (jsquyres)
file and the errors. I’m just pointing out that > the referenced file cannot possibly contain a pointer to > opal/threads/condition.h. There is no include in that file that can pull in > that path. > > >> On Oct 2, 2017, at 11:39 AM, Jeff Squyres (jsquyres) <jsquy...@cis

Re: [OMPI users] OpenMPI 3.0.0, compilation using Intel icc 11.1 on Linux, error when compiling pmix_mmap

2017-10-02 Thread Jeff Squyres (jsquyres)
Ralph -- I think he cited a typo in his email. The actual file he is referring to is - $ find . -name pmix_mmap.c ./opal/mca/pmix/pmix2x/pmix/src/sm/pmix_mmap.c - From his log file, there appear to be two problems: - sm/pmix_mmap.c(66): warning #266: function "posix_fallocate"

Re: [OMPI users] OpenMPI v3.0 on Cygwin

2017-09-27 Thread Jeff Squyres (jsquyres)
On Sep 27, 2017, at 3:21 PM, Llelan D. wrote: > >> After I finish on 2.1.2 I will look on 3.0. > Thank you for your response. I am looking forward to a Cygwin release. > If you could send me some guidelines as to the preferred manner of doing this > as was done with

Re: [OMPI users] Fwd: Make All error regarding either "Conflicting" or "Previous Declaration" among others

2017-09-27 Thread Jeff Squyres (jsquyres)
Check out this thread on the users archive: https://www.mail-archive.com/users@lists.open-mpi.org/msg31602.html including Marco's reply (Marco is the Cygwin Open MPI package maintainer). > On Sep 27, 2017, at 1:21 AM, Aragorn Inocencio > wrote: > > Good

Re: [OMPI users] Fwd: Make All error regarding either "Conflicting" or "Previous Declaration" among others

2017-09-21 Thread Jeff Squyres (jsquyres)
> On Sep 21, 2017, at 11:26 AM, Aragorn Inocencio > wrote: > > Hi, sorry about the mixup earlier. But I have recently tried installing > openmpi 3.0.0 using the instructions I found in the Reef3D manual (attached > below), so > > ./configure CC=gcc CXX=g++

Re: [OMPI users] Question concerning compatibility of languages used with building OpenMPI and languages OpenMPI uses to build MPI binaries.

2017-09-21 Thread Jeff Squyres (jsquyres)
Don't forget that there's a lot more to "binary portability" between MPI implementations than just the ABI (wire protocols, run-time interfaces, ...etc.). This is the main (set of) reasons that ABI standardization of the MPI specification never really took off -- so much would need to be

Re: [OMPI users] OpenMPI installation issue or mpi4py compatibility problem

2017-09-21 Thread Jeff Squyres (jsquyres)
A few things: 0. Rather than go a few more rounds of "how was Open MPI configured", can you send all the information listed here: https://www.open-mpi.org/community/help/ That will tell us a lot about exactly how your Open MPI was configured, installed, etc. 1. Your mpirun error is

Re: [OMPI users] Question concerning compatibility of languages used with building OpenMPI and languages OpenMPI uses to build MPI binaries.

2017-09-18 Thread Jeff Squyres (jsquyres)
FWIW, we always encourage you to use the same compiler to build Open MPI and your application. Compatibility between gcc and Intel *usually* works for C and C++, but a) doesn't work for Fortran, and b) there have been bugs in the past where C/C++ compatibility broke in corner cases. My $0.02:

Re: [OMPI users] mpif90 unable to find ibverbs

2017-09-14 Thread Jeff Squyres (jsquyres)
Let me throw in one more item: I don't know what versions of Open MPI are available in those Rocks Rolls, but Open MPI v3.0.0 was released yesterday. You will be much better served with a modern version of Open MPI (vs. v1.4, the last release of which was in 2012). > On Sep 14, 2017, at

Re: [OMPI users] mpif90 unable to find ibverbs

2017-09-13 Thread Jeff Squyres (jsquyres)
Beware: static linking is not for the meek. Is there a reason you need to link statically? Be sure to read this FAQ item: https://www.open-mpi.org/faq/?category=mpi-apps#static-ofa-mpi-apps (note that that FAQ item was written a long time ago; it cites the "mthca" Mellanox verbs driver; the

Re: [OMPI users] OpenMPI 1.10.5 oversubscribing cores

2017-09-08 Thread Jeff Squyres (jsquyres)
Tom -- If you're going to upgrade, can you upgrade to the latest Open MPI (2.1.1)? I.e., unless you have a reason for wanting to stay back at an already-old version, you might as well upgrade to the latest latest latest to give you the longest shelf life. I mention this because we are

Re: [OMPI users] mpi_f08 interfaces in man3 pages?

2017-08-11 Thread Jeff Squyres (jsquyres)
On Aug 10, 2017, at 2:18 PM, Matt Thompson wrote: > > I know from a while back when I was scanning git to find some other thing, I > saw a kind user (Gilles Gouaillardet?) added the F08 interfaces into the man > pages. As I am lazy, 'man mpi_send' would be nicer than me

Re: [OMPI users] Groups and Communicators

2017-08-02 Thread Jeff Squyres (jsquyres)
7 at 10:15 AM, Diego Avesani <diego.aves...@gmail.com> > wrote: > Dear Jeff, Dear all, > > thanks, I will try immediately. > > thanks again > > > > Diego > > > On 2 August 2017 at 14:01, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote: &

Re: [OMPI users] MPI_Finalize?

2017-08-02 Thread Jeff Squyres (jsquyres)
MPI_FINALIZE is required in all MPI applications, sorry. :-\ https://www.open-mpi.org/doc/v2.1/man3/MPI_Finalize.3.php If you're getting a segv in MPI_FINALIZE, it likely means that there's something else wrong with the application, and it's just not showing up until the end. Check and
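
A minimal C sketch of the required MPI_Init/MPI_Finalize pairing (names illustrative, not taken from the thread):

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);       /* must precede any other MPI call */

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        printf("Hello from rank %d\n", rank);

        MPI_Finalize();               /* required before the program exits */
        return 0;
    }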

Re: [OMPI users] Groups and Communicators

2017-08-02 Thread Jeff Squyres (jsquyres)
are? Thanks again Diego On 1 August 2017 at 16:18, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote: On Aug 1, 2017, at 5:56 AM, Diego Avesani <diego.aves...@gmail.com> wrote: > > If I do this: > >

Re: [OMPI users] Groups and Communicators

2017-08-01 Thread Jeff Squyres (jsquyres)
On Aug 1, 2017, at 5:56 AM, Diego Avesani wrote: > > If I do this: > > CALL MPI_SCATTER(PP, npart, MPI_DOUBLE, PPL, 10,MPI_DOUBLE, 0, MASTER_COMM, > iErr) > > I get an error. This because some CPU does not belong to MATER_COMM. The > alternative should be: > >

Re: [OMPI users] MPI_ABORT, indirect execution of executables by mpirun, Open MPI 2.1.1

2017-06-16 Thread Jeff Squyres (jsquyres)
Ted -- Sorry for jumping in late. Here's my $0.02... In the runtime, we can do 4 things: 1. Kill just the process that we forked. 2. Kill just the process(es) that call back and identify themselves as MPI processes (we don't track this right now, but we could add that functionality). 3. Union

Re: [OMPI users] Double free or corruption with OpenMPI 2.0

2017-06-13 Thread Jeff Squyres (jsquyres)
On Jun 13, 2017, at 8:22 AM, ashwin .D wrote: > > Also when I try to build and run a make check I get these errors - Am I clear > to proceed or is my installation broken ? This is on Ubuntu 16.04 LTS. > > == >Open MPI

Re: [OMPI users] Hello world Runtime error: Primary job terminated normally, but 1 process returned a non-zero exit code.

2017-05-22 Thread Jeff Squyres (jsquyres)
What Gilles suggested is probably the right answer. There's a Linux executable named "test" already (e.g., /usr/bin/test) that is not an MPI application. When you didn't specify a path, mpirun probably found and ran that one instead. > On May 22, 2017, at 9:58 AM, Gilles Gouaillardet >
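
A hedged illustration of the workaround, keeping "test" as the binary name only for the example:

    mpicc -o test test.c
    mpirun -np 2 ./test     # the leading "./" avoids picking up /usr/bin/test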

Re: [OMPI users] Checkpoint with blcr

2017-05-19 Thread Jeff Squyres (jsquyres)
Open MPI v2.1.x does not support checkpoint restart; it was unmaintained and getting stale, so it was removed. Looks like we forgot to remove the cr MPI extension from the v2.1.x release series when we removed the rest of the checkpoint restart support. Sorry for the confusion. > On May

Re: [OMPI users] [Open MPI Announce] Open MPI v2.1.1 released

2017-05-10 Thread Jeff Squyres (jsquyres)
On May 10, 2017, at 4:50 PM, Joseph Schuchart wrote: > > > - Fix memory allocated by MPI_WIN_ALLOCATE_SHARED to > > be 64 byte aligned. > > The alignment has been fixed to 64 *bit* or 8 byte, just in case someone is > relying on it or stumbling across that. Verified using

Re: [OMPI users] MPI the correct solution?

2017-05-08 Thread Jeff Squyres (jsquyres)
FWIW, here's a screencast on "What is MPI?": https://www.open-mpi.org/video/?category=general Slides are available there, too, if you just want to breeze through them. > On May 8, 2017, at 5:25 PM, David Niklas wrote: > > Hello, > I originally ported this question at LQ,

Re: [OMPI users] Strange OpenMPI errors showing up in Caffe rc5 build

2017-05-08 Thread Jeff Squyres (jsquyres)
On May 6, 2017, at 3:28 AM, Lane, William wrote: > > The strange thing is OpenMPI isn't mentioned anywhere as being a dependency > for Caffe! I haven't read anything that suggests OpenMPI is supported in > Caffe either. This is why I figure it must be a dependency of

Re: [OMPI users] OpenMPI 2.1.0 + PGI 17.3 = asm test failures

2017-05-01 Thread Jeff Squyres (jsquyres)
What I should have said was: NVIDIA -- can someone check to see if this is a PGI compiler error? > On Apr 29, 2017, at 7:37 AM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> > wrote: > > IBM: can someone check to see if this is a compiler error? > > >> On Apr

Re: [OMPI users] OpenMPI 2.1.0 + PGI 17.3 = asm test failures

2017-05-01 Thread Jeff Squyres (jsquyres)
Er... right. Duh. > On May 1, 2017, at 11:21 AM, Prentice Bisbal <pbis...@pppl.gov> wrote: > > Jeff, > > Why IBM? This problem is caused by the PGI compilers, so shouldn't this be > directed towards NVidia, which now owns PGI? > > Prentice > > On 04/29/2

Re: [OMPI users] OpenMPI 2.1.0 + PGI 17.3 = asm test failures

2017-04-29 Thread Jeff Squyres (jsquyres)
IBM: can someone check to see if this is a compiler error? > On Apr 28, 2017, at 5:09 PM, Prentice Bisbal wrote: > > Update: removing the -fast switch caused this error to go away. > > Prentice > > On 04/27/2017 06:00 PM, Prentice Bisbal wrote: >> I'm building Open MPI

Re: [OMPI users] Runtime error with OpenMPI via InfiniBand - [btl_openib_proc.c:157] ompi_modex_recv failed for peer

2017-04-19 Thread Jeff Squyres (jsquyres)
Dong -- I do not see an obvious cause for the error. Are you able to run trivial hello world / ring kinds of MPI jobs? Is the problem localized to a specific set of nodes in the cluster? > On Apr 14, 2017, at 4:30 PM, Dong Young Yoon wrote: > > Hi everyone, > > I am a

Re: [OMPI users] Performance degradation of OpenMPI 1.10.2 when oversubscribed?

2017-03-27 Thread Jeff Squyres (jsquyres)
On Mar 27, 2017, at 11:00 AM, r...@open-mpi.org wrote: > > I’m confused - mpi_yield_when_idle=1 is precisely the “oversubscribed” > setting. So why would you expect different results? A few additional points to Ralph's question: 1. Recall that sched_yield() has effectively become a no-op in

Re: [OMPI users] Performance degradation of OpenMPI 1.10.2 when oversubscribed?

2017-03-25 Thread Jeff Squyres (jsquyres)
On Mar 25, 2017, at 3:04 AM, Ben Menadue wrote: > > I’m not sure about this. It was my understanding that HyperThreading is > implemented as a second set of e.g. registers that share execution units. > There’s no division of the resources between the hardware threads,

Re: [OMPI users] Performance degradation of OpenMPI 1.10.2 when oversubscribed?

2017-03-24 Thread Jeff Squyres (jsquyres)
On Mar 24, 2017, at 6:10 PM, Reuti wrote: > >> - Disabling HT in the BIOS means that the one hardware thread left in each >> core will get all the cores resources (buffers, queues, processor units, >> etc.). >> - Enabling HT in the BIOS means that each of the 2

Re: [OMPI users] Performance degradation of OpenMPI 1.10.2 when oversubscribed?

2017-03-24 Thread Jeff Squyres (jsquyres)
Performance goes out the window if you oversubscribe your machines (i.e., run more MPI processes than cores). The effect of oversubscription is non-deterministic. (for the next few paragraphs, assume that HT is disabled in the BIOS -- i.e., that there's only 1 hardware thread on each core)

Re: [OMPI users] Communicating MPI processes running in Docker containers in the same host by means of shared memory?

2017-03-24 Thread Jeff Squyres (jsquyres)
On Mar 24, 2017, at 6:41 AM, Jordi Guitart wrote: > > Docker containers have different IP addresses, indeed, so now we know why it > does not work. I think that this could be a nice feature for OpenMPI, so I'll > probably issue a request for it ;-) Cool. I don't think

Re: [OMPI users] Communicating MPI processes running in Docker containers in the same host by means of shared memory?

2017-03-24 Thread Jeff Squyres (jsquyres)
If the Docker containers have different IP addresses, Open MPI will think that they are different "nodes" (or "hosts" or "servers" or whatever your favorite word is), and therefore will assume that they processes in these different containers are unable to share memory. Meaning: no work has

Re: [OMPI users] a question about MPI dynamic process manage

2017-03-24 Thread Jeff Squyres (jsquyres)
(keeping the user's list in the CC) > On Mar 24, 2017, at 4:05 AM, gzzh...@buaa.edu.cn wrote: > > hi jeff: > I tried to call MPI_Comm_spawn("./child", MPI_ARGV_NULL, 1, > MPI_INFO_NULL, root, MPI_COMM_WORLD, , ) > in order every MPI process in MPI_COMM_WORLD can spawn one child process.

Re: [OMPI users] more migrating to MPI_F08

2017-03-23 Thread Jeff Squyres (jsquyres)
On Mar 23, 2017, at 3:20 PM, Tom Rosmond wrote: > > I had stared at those lines many times and it didn't register that (count) > was explicitly specifying only 1-D is allowed. Pretty cryptic. I wonder how > many other fortran programmers will be bit by this? My

Re: [OMPI users] more migrating to MPI_F08

2017-03-23 Thread Jeff Squyres (jsquyres)
Actually, MPI-3.1 p90:37-45 explicitly says that the array_of_blocklengths and array_of_displacements arrays must be both 1D and of length count. If my Fortran memory serves me correctly, I think you can pass in an array subsection if your blocklengths/displacements are part of a larger array

Re: [OMPI users] openmpi installation error

2017-03-23 Thread Jeff Squyres (jsquyres)
That's a pretty weird error. We don't require any specific version of perl that I'm aware of. Are you sure that it's Open MPI's installer that is kicking out the error? Can you send all the information listed here: https://www.open-mpi.org/community/help/ > On Mar 23, 2017, at 1:39 PM,

Re: [OMPI users] a question about MPI dynamic process manage

2017-03-23 Thread Jeff Squyres (jsquyres)
It's likely a lot more efficient to MPI_COMM_SPAWN *all* of your children at once, and then subdivide up the resulting newcomm communicator as desired. It is *possible* to have a series MPI_COMM_SPAWN calls that spawn a single child process, and then later join all of those children into a
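
A minimal C sketch of spawning all children in one call; the child executable name and count are illustrative:

    #include <mpi.h>

    int main(int argc, char **argv)
    {
        MPI_Comm children;      /* inter-communicator to all spawned children */
        int nchildren = 4;      /* spawn them all in a single call */

        MPI_Init(&argc, &argv);
        MPI_Comm_spawn("./child", MPI_ARGV_NULL, nchildren, MPI_INFO_NULL,
                       0 /* root */, MPI_COMM_WORLD, &children,
                       MPI_ERRCODES_IGNORE);

        /* subdivide later as needed, e.g. via MPI_Intercomm_merge() followed
           by MPI_Comm_split() */

        MPI_Finalize();
        return 0;
    }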

Re: [OMPI users] Erors and segmentation faults when installing openmpi-2.1

2017-03-23 Thread Jeff Squyres (jsquyres)
Note that Open MPI and MPICH are different implementations of the MPI specification. If you are mixing an Open MPI tarball install with an MPICH apt install, things will likely go downhill from there. You need to ensure to use Open MPI *or* MPICH. > On Mar 23, 2017, at 5:38 AM, Dimitrova,

Re: [OMPI users] openib/mpi_alloc_mem pathology

2017-03-16 Thread Jeff Squyres (jsquyres)
c.rwth-aachen.de> wrote: > > Jeff, I confirm: your patch did it. > > (tried on 1.10.6 - do not even need to rebuild the cp2k.popt , just load > another Open MPI version compiled with Jeff's patch) > > ( On Intel Omni-Path the same speed as with --mca btl ^tcp,openib ) > >

Re: [OMPI users] openib/mpi_alloc_mem pathology

2017-03-16 Thread Jeff Squyres (jsquyres)
On Mar 16, 2017, at 10:37 AM, Jingchao Zhang wrote: > > One of my earlier replies includes the backtraces of cp2k.popt process and > the problem points to MPI_ALLOC_MEM/MPI_FREE_MEM. > https://mail-archive.com/users@lists.open-mpi.org/msg30587.html Yep -- saw it. That -- paired

Re: [OMPI users] openib/mpi_alloc_mem pathology

2017-03-15 Thread Jeff Squyres (jsquyres)
On Mar 15, 2017, at 8:25 PM, Jeff Hammond wrote: > > I couldn't find the docs on mpool_hints, but shouldn't there be a way to > disable registration via MPI_Info rather than patching the source? Yes; that's what I was thinking, but wanted to get the data point first.

Re: [OMPI users] openib/mpi_alloc_mem pathology

2017-03-15 Thread Jeff Squyres (jsquyres)
It looks like there were 3 separate threads on this CP2K issue, but I think we developers got sidetracked because there was a bunch of talk in the other threads about PSM, non-IB(verbs) networks, etc. So: the real issue is an app is experiencing a lot of slowdown when calling

Re: [OMPI users] coredump about MPI

2017-03-02 Thread Jeff Squyres (jsquyres)
A few suggestions: 1. Look for the core files in directories where you might not expect: - your $HOME (particularly if your $HOME is not a networked filesystem) - in /cores - in the pwd where the executable was launched on that machine 2. If multiple processes will be writing core files
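
A few shell commands that usually cover suggestion 1; the paths are common defaults, not guaranteed on every system:

    ulimit -c unlimited                 # allow core dumps in the launching shell
    ls $HOME/core*                      # non-networked home directories
    ls /cores                           # macOS default location
    ls ./core*                          # pwd where the executable was launched
    cat /proc/sys/kernel/core_pattern   # Linux core-file naming/location pattern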

Re: [hwloc-users] Hwloc command not working

2017-03-02 Thread Jeff Squyres (jsquyres)
Jeyaraj -- I think what we need is a bit more specific information in order to help you. Everyone's system is setup differently; we don't know how yours is setup. For example: - What version of hwloc did you install? - Where did you get the RPM for hwloc? - How exactly are you testing? - You

Re: [OMPI users] Error "Bad parameter" in mpirun

2017-02-16 Thread Jeff Squyres (jsquyres)
Are you running on a Mac, perchance? If so, see question 8: https://www.open-mpi.org/faq/?category=osx#startup-errors-with-open-mpi-2.0.x > On Feb 16, 2017, at 5:10 AM, Alessandra Bonazzi > wrote: > > Goodmorning, > I’m a beginner and I’m trying to run a

Re: [OMPI users] numaif.h present but not usable with openmpi-master-201702080209-bc2890e on Linux

2017-02-15 Thread Jeff Squyres (jsquyres)
> On Feb 15, 2017, at 11:34 AM, Siegmar Gross > wrote: > >> Did adding these flags to CPPFLAGS/CXXCPPFLAGS also solve the cuda.h issues? > > Yes, but it would be great if "configure" would add the flags > automatically when "--with-cuda=..." is available.

Re: [OMPI users] numaif.h present but not usable with openmpi-master-201702080209-bc2890e on Linux

2017-02-15 Thread Jeff Squyres (jsquyres)
uch for your help once more > > Siegmar > > > Am 15.02.2017 um 14:42 schrieb Jeff Squyres (jsquyres): >> Siegmar -- >> >> Sorry for the delay in replying. >> >> You should actually put -I flags in CPPFLAGS and CXXCPPFLAGS, not CFLAGS and >&

Re: [OMPI users] configure test doesn't find cuda.h and valgrind.h for openmpi-master-201702150209-404fe32

2017-02-15 Thread Jeff Squyres (jsquyres)
Siegmar -- Thanks for the reminder; sorry for not replying to your initial email earlier! I just replied about the valgrind.h issue -- check out https://www.mail-archive.com/users@lists.open-mpi.org/msg30631.html. I'm not quite sure what is going on with cuda.h, though -- I've asked Sylvain

Re: [OMPI users] numaif.h present but not usable with openmpi-master-201702080209-bc2890e on Linux

2017-02-15 Thread Jeff Squyres (jsquyres)
Siegmar -- Sorry for the delay in replying. You should actually put -I flags in CPPFLAGS and CXXCPPFLAGS, not CFLAGS and CXXFLAGS. The difference is: 1. CFLAGS is given to the C compiler when compiling 2. CPPFLAFS is given to the C compiler when compiling and to the C preprocessor when
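
A sketch of the resulting configure invocation; the CUDA include path is only an example of where an -I flag might point:

    ./configure CPPFLAGS=-I/usr/local/cuda/include \
                CXXCPPFLAGS=-I/usr/local/cuda/include \
                --with-cuda=/usr/local/cuda ...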

Re: [OMPI users] Error during installation

2017-02-14 Thread Jeff Squyres (jsquyres)
x=/usr/local' Cheers, Gilles On 2/14/2017 12:53 AM, Alessandra Bonazzi wrote: 1 Open MPI version: 2.0.2 2 The config.log file: see attachment 3 Output from when you ran "./configure" to configure Open MPI: see attachment (config_output) Thank you 3 Il giorno 13 feb 2017, alle o

Re: [OMPI users] Error during installation

2017-02-13 Thread Jeff Squyres (jsquyres)
Can you send all the information listed here: https://www.open-mpi.org/community/help/ > On Feb 13, 2017, at 3:57 AM, Alessandra Bonazzi > wrote: > > Goodmorning, > I'm facing a problem during the installation of Open MPI. > The error appears on terminal at

Re: [OMPI users] How to get rid of OpenMPI warning: unable to find any relevant network interfaces

2017-02-09 Thread Jeff Squyres (jsquyres)
Susan -- Try setting --mca btl_base_warn_component_unused 0 That should make the warning go away (shame on us for not putting that in the warning message itself -- doh!). If that works for you, you can put "btl_base_warn_component_unused = 0" in $prefix/etc/openmpi-mca-params.conf (i.e.,
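
Concretely, both forms look like this; the second makes the setting permanent for that installation:

    mpirun --mca btl_base_warn_component_unused 0 -np 4 ./a.out

    # or in <prefix>/etc/openmpi-mca-params.conf:
    btl_base_warn_component_unused = 0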

Re: [OMPI users] Is gridengine integration broken in openmpi 2.0.2?

2017-02-09 Thread Jeff Squyres (jsquyres)
Yes, we can get it fixed. Ralph is unavailable this week; I don't know offhand what he meant by his prior remarks. It's possible that https://github.com/open-mpi/ompi/commit/71ec5cfb436977ea9ad409ba634d27e6addf6fae; can you try changing the "!=" on line to be "=="? I.e., from if

Re: [OMPI users] Open MPI Java Error

2017-02-08 Thread Jeff Squyres (jsquyres)
On Feb 8, 2017, at 12:54 PM, Mota, Thyago wrote: > > This error happens just by calling mpirun Did you read / can you comment on the setups described in the FAQ items? E.g., did you take care to not install a new version of Open MPI over a prior version? Did you take

Re: [OMPI users] Open MPI Java Error

2017-02-08 Thread Jeff Squyres (jsquyres)
This may or may not be a Java-specific issue. Are you able to run any Open MPI jobs at all? Check out these FAQ items: https://www.open-mpi.org/faq/?category=building#install-overwrite https://www.open-mpi.org/faq/?category=running#adding-ompi-to-path

Re: [OMPI users] numaif.h present but not usable with openmpi-master-201702080209-bc2890e on Linux

2017-02-08 Thread Jeff Squyres (jsquyres)
Siegmar -- This might be normal and expected (i.e., that the numaif.h that comes with your gcc may not be suitable for use with the Sun CC compiler). Is there any functionality difference between your two builds (with and without numaif.h support)? The config.log from the cc build should

Re: [OMPI users] openmpi single node jobs using btl openib

2017-02-07 Thread Jeff Squyres (jsquyres)
Can you try upgrading to Open MPI v2.0.2? We just released that last week with a bunch of bug fixes. > On Feb 7, 2017, at 3:07 PM, Jingchao Zhang wrote: > > Hi Tobias, > > Thanks for the reply. I tried both "export OMPI_MCA_mpi_leave_pinned=0" and > "mpirun -mca

Re: [OMPI users] problem with opal_list_remove_item for openmpi-v2.x-201702010255-8b16747 on Linux

2017-02-03 Thread Jeff Squyres (jsquyres)
I've filed this as https://github.com/open-mpi/ompi/issues/2920. Ralph is just heading out for about a week or so; it may not get fixed until he comes back. > On Feb 3, 2017, at 2:03 AM, Siegmar Gross > wrote: > > Hi, > > I have installed

Re: [OMPI users] OpenMPI not running any job on Mac OS X 10.12

2017-02-02 Thread Jeff Squyres (jsquyres)
Michel -- Also, did you install Open MPI v2.0.2 over a prior version of Open MPI (i.e., with the same prefix value to configure)? That would almost certainly cause a problem. > On Feb 2, 2017, at 7:56 AM, Howard Pritchard wrote: > > Hi Michel > > It's somewhat unusual

Re: [OMPI users] [Open MPI Announce] Follow-up to Open MPI SC'16 BOF

2016-12-05 Thread Jeff Squyres (jsquyres)
loper community will work on v3.0.0 >> after the v2.1.x series. >> >> >> >>> On Nov 28, 2016, at 5:18 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> >>> wrote: >>> >>> If you have an opinion on the v2.2.x-vs-v3.x question, p

Re: [OMPI users] [Open MPI Announce] Follow-up to Open MPI SC'16 BOF

2016-12-05 Thread Jeff Squyres (jsquyres)
Thanks to all who provided their opinion. Based on the results, the Open MPI developer community will work on v3.0.0 after the v2.1.x series. On Nov 28, 2016, at 5:18 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com>

Re: [OMPI users] [Open MPI Announce] Follow-up to Open MPI SC'16 BOF

2016-11-28 Thread Jeff Squyres (jsquyres)
If you have an opinion on the v2.2.x-vs-v3.x question, please submit your vote by COB this upcoming Friday, 2 Dec, 2016: https://www.open-mpi.org/sc16/ Thanks! > On Nov 22, 2016, at 4:32 PM, Pritchard Jr., Howard wrote: > > Hello Folks, > > This is a followup to the

Re: [OMPI users] non-shared fs, executable in different directories

2016-11-28 Thread Jeff Squyres (jsquyres)
On Nov 28, 2016, at 1:04 PM, Jason Patton wrote: > > Passing --wdir to mpirun does not solve this particular case, I > believe. HTCondor sets up each worker slot with a uniquely named > sandbox, e.g. a 2-process job might have the user's executable copied > to

Re: [OMPI users] non-shared fs, executable in different directories

2016-11-28 Thread Jeff Squyres (jsquyres)
t exist, or it otherwise fails to chdir there, it should fail / kill your job (on the rationale that you explicitly asked for something that Open MPI couldn't do, vs. the implicit chdir to the working dir of mpirun). > On Mon, Nov 28, 2016 at 10:57 AM, Jeff Squyres (jsquyres) > <jsquy...@cis

Re: [OMPI users] non-shared fs, executable in different directories

2016-11-28 Thread Jeff Squyres (jsquyres)
I'm not sure I understand your solution -- it sounds like you are overriding $HOME for each process...? If so, that's playing with fire. Is there a reason you can't set PATH / LD_LIBRARY_PATH in your ssh wrapper script to point to the Open MPI installation that you want to use on each node?

Re: [OMPI users] malloc related crash inside openmpi

2016-11-28 Thread Jeff Squyres (jsquyres)
> On Nov 25, 2016, at 11:20 AM, Noam Bernstein > wrote: > > Looks like this openmpi 2 crash was a matter of not using the correctly > linked executable on all nodes. Now that it’s straightened out, I think it’s > all working, and apparently even fixed my malloc

Re: [OMPI users] openmpi-2.0.1

2016-11-18 Thread Jeff Squyres (jsquyres)
On Nov 17, 2016, at 3:43 PM, Gilles Gouaillardet wrote: > > if it still does not work, you can > cd ompi/tools > make V=1 > > and post the output Let me add to that: if that doesn't work, please send all the information listed here:

Re: [OMPI users] Using custom version of gfortran in mpifort

2016-11-18 Thread Jeff Squyres (jsquyres)
> On Nov 18, 2016, at 2:54 AM, Mahmood Naderan wrote: > > The mpifort wrapper uses the default gfortran compiler on the system. How can > I give it another version of gfortran which has been installed in another > folder? The best way is to specify the compiler(s) that
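
A hedged sketch of that approach, assuming the alternate gfortran lives under /opt/gcc-x.y (path and prefix are illustrative):

    ./configure CC=gcc CXX=g++ FC=/opt/gcc-x.y/bin/gfortran --prefix=/opt/openmpi
    make -j 8 && make install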

Re: [OMPI users] Open MPI State of the Union BOF at SC'16 next week

2016-11-16 Thread Jeff Squyres (jsquyres)
to > the list when the rest of us can get them? > > -Sean > > -- > Sean Ahern > Computational Engineering International > 919-363-0883 > > On Tue, Nov 15, 2016 at 10:53 AM, Jeff Squyres (jsquyres) > <jsquy...@cisco.com> wrote: > On Nov 10, 2016

Re: [OMPI users] Open MPI State of the Union BOF at SC'16 next week

2016-11-15 Thread Jeff Squyres (jsquyres)
On Nov 10, 2016, at 9:31 AM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote: > > The slides will definitely be available afterwards. We'll see if we can make > some flavor of recording available as well. After poking around a bit, it looks like the SC rules prohibit us

Re: [OMPI users] An old code compatibility

2016-11-14 Thread Jeff Squyres (jsquyres)
On Nov 14, 2016, at 12:19 PM, Mahmood Naderan wrote: > > The output is not meaningful for me. > If I add --showme option, the output is http://pastebin.com/FX1ks8iW --showme does not compile anything. It just shows you what underlying command line *would* be invoked (if

Re: [OMPI users] An old code compatibility

2016-11-14 Thread Jeff Squyres (jsquyres)
Remember that mpifort is just a wrapper over your underlying fortran compiler. If your fortran compiler can't compile your Fortran code, then neither can mpifort. You can use the "--showme" option to mpifort to show you the command line that it is invoking under the coverts. E.g.:
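
For example (filenames illustrative; none of these commands compiles anything):

    mpifort --showme foo.f90 -o foo    # full underlying compile/link command line
    mpifort --showme:compile           # just the compile flags
    mpifort --showme:link              # just the link flags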

Re: [OMPI users] Open MPI State of the Union BOF at SC'16 next week

2016-11-10 Thread Jeff Squyres (jsquyres)
; week? > > -Sean > > -- > Sean Ahern > Computational Engineering International > 919-363-0883 > > On Thu, Nov 10, 2016 at 11:00 AM, Jeff Squyres (jsquyres) > <jsquy...@cisco.com> wrote: > Be sure to come see us at "Open MPI State of the Union X"

[OMPI users] Open MPI State of the Union BOF at SC'16 next week

2016-11-10 Thread Jeff Squyres (jsquyres)
Be sure to come see us at "Open MPI State of the Union X" BOF (yes, that's right, we've been giving these BOFs for ***10 years***!) next week in Salt Lake City, UT at the SC'16 trade show: http://sc16.supercomputing.org/presentation/?id=bof103=sess322 This year, the BOF is at 5:30pm on

Re: [OMPI users] error on dlopen

2016-11-04 Thread Jeff Squyres (jsquyres)
On Nov 4, 2016, at 12:14 PM, Mahmood Naderan wrote: > > >​If there's a reason you did --enable-static --disable-shared​ > Basically, I want to prevent dynamic library problems (ldd) on a distributed > environment. What problems are you referring to? > $ mpifort --showme

Re: [OMPI users] error on dlopen

2016-11-04 Thread Jeff Squyres (jsquyres)
> On Nov 4, 2016, at 7:07 AM, Mahmood Naderan wrote: > > > You might have to remove -ldl from the scalapack makefile > I removed that before... I will try one more time > > Actually, using --disable-dlopen fixed the error. To clarify: 1. Using --enable-static causes all
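
A sketch of the corresponding configure line (the prefix is illustrative):

    ./configure --prefix=/opt/openmpi-static \
                --enable-static --disable-shared --disable-dlopen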

Re: [OMPI users] mpirun --map-by-node

2016-11-04 Thread Jeff Squyres (jsquyres)
In your case, using slots or --npernode or --map-by node will result in the same distribution of processes because you're only launching 1 process per node (a.k.a. "1ppn"). They have more pronounced differences when you're launching more than 1ppn. Let's take a step back: you should know that
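
A few illustrative invocations (8 processes assumed):

    mpirun --map-by node -np 8 ./a.out   # round-robin across nodes
    mpirun --npernode 1 ./a.out          # exactly 1 process per node
    mpirun -np 8 ./a.out                 # default: fill a node's slots first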

Re: [OMPI users] Disabling MCA component

2016-11-03 Thread Jeff Squyres (jsquyres)
In https://www.mail-archive.com/users@lists.open-mpi.org/msg30229.html, I referred to the --enable-mca-no-build configure option. From "./configure --help": --enable-mca-no-build=LIST Comma-separated list of <type>-<component> pairs that will not be built.
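
An illustrative use of the option; the component names here are examples only:

    ./configure --enable-mca-no-build=btl-openib,plm-tm ...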

Re: [OMPI users] OpenMPI + InfiniBand

2016-11-01 Thread Jeff Squyres (jsquyres)
I actually just filed a Github issue to ask this exact question: https://github.com/open-mpi/ompi/issues/2326 > On Nov 1, 2016, at 9:49 AM, Sergei Hrushev wrote: > > > I haven't worked with InfiniBand for years, but I do believe that yes: you > need IPoIB enabled on
