Re: [OMPI users] Help diagnosing MPI+OpenMP application segmentation fault only when run with --bind-to none

2022-04-26 Thread Angel de Vicente via users
Hello, thanks for your help and suggestions. At the end it was no issue with OpenMPI or with any other system stuff, but rather a single line in our code. I thought I was doing the tests with the -fbounds-check option, but it turns out I was not, arrrghh!! At some point I was writing outside one

Re: [OMPI users] Help diagnosing MPI+OpenMP application segmentation fault only when run with --bind-to none

2022-04-23 Thread Protze, Joachim via users
From: users on behalf of Angel de Vicente via users Sent: Friday, April 22, 2022 10:31:38 PM To: Keller, Rainer Cc: Angel de Vicente ; Open MPI Users Subject: Re: [OMPI users] Help diagnosing MPI+OpenMP application segmentation fault only when run with --bind

Re: [OMPI users] help with M1 chip macOS openMPI installation

2022-04-22 Thread George Bosilca via users
> Jeff Squyres > jsquy...@cisco.com > > > From: users on behalf of Cici Feng via > users > Sent: Friday, April 22, 2022 5:30 AM > To: Open MPI Users > Cc: Cici Feng > Subject: Re: [OMPI users] help with M1 chip macOS openMPI install

Re: [OMPI users] Help diagnosing MPI+OpenMP application segmentation fault only when run with --bind-to none

2022-04-22 Thread Angel de Vicente via users
Hello, "Keller, Rainer" writes: > You’re using MPI_Probe() with Threads; that’s not safe. > Please consider using MPI_Mprobe() together with MPI_Mrecv(). many thanks for the suggestion. I will try with the M variants, though I was under the impression that mpi_probe() was OK as far as one made

Re: [OMPI users] Help diagnosing MPI+OpenMP application segmentation fault only when run with --bind-to none

2022-04-22 Thread Angel de Vicente via users
Hello Jeff, "Jeff Squyres (jsquyres)" writes: > With THREAD_FUNNELED, it means that there can only be one thread in > MPI at a time -- and it needs to be the same thread as the one that > called MPI_INIT_THREAD. > > Is that the case in your app? the master rank (i.e. 0) never creates threads,

Re: [OMPI users] Help diagnosing MPI+OpenMP application segmentation fault only when run with --bind-to none

2022-04-22 Thread Keller, Rainer via users
Dear Angel, You’re using MPI_Probe() with Threads; that’s not safe. Please consider using MPI_Mprobe() together with MPI_Mrecv(). However, you mention running with only one Thread — setting OMP_NUM_THREADS=1, assuming you didn’t set using omp_set_num_threads() again, or use num_threads()

Re: [OMPI users] help with M1 chip macOS openMPI installation

2022-04-22 Thread Jeff Squyres (jsquyres) via users
22 5:30 AM To: Open MPI Users Cc: Cici Feng Subject: Re: [OMPI users] help with M1 chip macOS openMPI installation Hi George, Thanks so much with the tips and I have installed Rosetta in order for my computer to run the Intel software. However, the same error appears as I tried to mak

Re: [OMPI users] Help diagnosing MPI+OpenMP application segmentation fault only when run with --bind-to none

2022-04-22 Thread Jeff Squyres (jsquyres) via users
From: users on behalf of Angel de Vicente via users Sent: Friday, April 22, 2022 10:54 AM To: Gilles Gouaillardet via users Cc: Angel de Vicente Subject: Re: [OMPI users] Help diagnosing MPI+OpenMP application segmentation fault only when run with --bind-to none Thanks Gilles, Gilles

Re: [OMPI users] Help diagnosing MPI+OpenMP application segmentation fault only when run with --bind-to none

2022-04-22 Thread Angel de Vicente via users
Thanks Gilles, Gilles Gouaillardet via users writes: > You can first double check you > MPI_Init_thread(..., MPI_THREAD_MULTIPLE, ...) my code uses "mpi_thread_funneled" and OpenMPI was compiled with MPI_THREAD_MULTIPLE support: , | ompi_info | grep -i thread | Thread support:

Re: [OMPI users] Help diagnosing MPI+OpenMP application segmentation fault only when run with --bind-to none

2022-04-22 Thread Gilles Gouaillardet via users
You can first double check you MPI_Init_thread(..., MPI_THREAD_MULTIPLE, ...) And the provided level is MPI_THREAD_MULTIPLE as you requested. Cheers, Gilles On Fri, Apr 22, 2022, 21:45 Angel de Vicente via users < users@lists.open-mpi.org> wrote: > Hello, > > I'm running out of ideas, and

Re: [OMPI users] help with M1 chip macOS openMPI installation

2022-04-22 Thread Cici Feng via users
>> A little more color on Gilles' answer: I believe that we had some Open >> MPI community members work on adding M1 support to Open MPI, but Gilles is >> absolutely correct: the underlying compiler has to support the M1, or you >> won't get anywhere. >> >> --

Re: [OMPI users] help with M1 chip macOS openMPI installation

2022-04-21 Thread George Bosilca via users
To: Open MPI Users > Cc: Cici Feng > Subject: Re: [OMPI users] help with M1 chip macOS openMPI installation > > Gilles, > > Thank you so much for the quick response! > openMPI installed by brew is compiled on gcc and gfortran using the > original compilers by Apple. N

Re: [OMPI users] help with M1 chip macOS openMPI installation

2022-04-21 Thread Jeff Squyres (jsquyres) via users
From: users on behalf of Cici Feng via users Sent: Thursday, April 21, 2022 6:11 AM To: Open MPI Users Cc: Cici Feng Subject: Re: [OMPI users] help with M1 chip macOS openMPI installation Gilles, Thank you so much for the quick response! openMPI installed by brew

Re: [OMPI users] help with M1 chip macOS openMPI installation

2022-04-21 Thread Cici Feng via users
Gilles, Thank you so much for the quick response! openMPI installed by brew is compiled on gcc and gfortran using the original compilers by Apple. Now I haven't figured out how to use this gcc openMPI for the inversion software :( Given by your answer, I think I'll pause for now with the M1-intel

Re: [OMPI users] help with M1 chip macOS openMPI installation

2022-04-21 Thread Gilles Gouaillardet via users
Cici, I do not think the Intel C compiler is able to generate native code for the M1 (aarch64). The best case scenario is it would generate code for x86_64 and then Rosetta would be used to translate it to aarch64 code, and this is a very downgraded solution. So if you really want to stick to

Re: [OMPI users] [Help] Must orted exit after all spawned proecesses exit

2021-05-19 Thread Ralph Castain via users
To answer your specific questions: The backend daemons (orted) will not exit until all locally spawned procs exit. This is not configurable - for one thing, OMPI procs will suicide if they see the daemon depart, so it makes no sense to have the daemon fail if a proc terminates. The logic

Re: [OMPI users] [Help] Must orted exit after all spawned proecesses exit

2021-05-17 Thread Jeff Squyres (jsquyres) via users
FYI: general Open MPI questions are better sent to the user's mailing list. Up through the v4.1.x series, the "orted" is a general helper process that Open MPI uses on the back-end. It will not quit until all of its children have died. Open MPI's run time is designed with the intent that some

Re: [OMPI users] help

2020-12-14 Thread Lesiano 16 via users
Thanks for the answer On Mon, Dec 14, 2020 at 4:20 PM Jeff Squyres (jsquyres) wrote: > On Dec 12, 2020, at 4:58 AM, Lesiano 16 via users < > users@lists.open-mpi.org> wrote: > > > > My question is, can I assume that when skipping the beginning of the > file that MPI will fill up with zeros? Or

Re: [OMPI users] help

2020-12-14 Thread Jeff Squyres (jsquyres) via users
On Dec 12, 2020, at 4:58 AM, Lesiano 16 via users wrote: > > My question is, can I assume that when skipping the beginning of the file > that MPI will fill up with zeros? Or is it implementation dependent? > > I have read the standard, but I could not found anything meaningful expected >

Re: [OMPI users] Help with One-Sided Communication: Works in Intel MPI, Fails in Open MPI

2020-02-25 Thread Adam Simpson via users
From: Matt Thompson Sent: Tuesday, February 25, 2020 5:54 AM To: Adam Simpson Cc: Open MPI Users Subject: Re: [OMPI users] Help with One-Sided Communication: Works in Intel MPI, Fails in Open MPI External email: Use caution opening links or attachments Adam, A couple questions. First, is seccomp

Re: [OMPI users] Help with One-Sided Communication: Works in Intel MPI, Fails in Open MPI

2020-02-25 Thread Matt Thompson via users
--- > *From:* users on behalf of Matt > Thompson via users > *Sent:* Monday, February 24, 2020 5:15 PM > *To:* Open MPI Users > *Cc:* Matt Thompson > *Subject:* Re: [OMPI users] Help with One-Sided Communication: Works in > Intel MPI, Fails in Open MPI >

Re: [OMPI users] Help with One-Sided Communication: Works in Intel MPI, Fails in Open MPI

2020-02-24 Thread Adam Simpson via users
with "sysctl -w kernel.yama.ptrace_scope=0". Adam From: users on behalf of Matt Thompson via users Sent: Monday, February 24, 2020 5:15 PM To: Open MPI Users Cc: Matt Thompson Subject: Re: [OMPI users] Help with One-Sided Communication: Works in Int

Re: [OMPI users] Help with One-Sided Communication: Works in Intel MPI, Fails in Open MPI

2020-02-24 Thread Matt Thompson via users
Nathan, The reproducer would be that code that's on the Intel website. That is what I was running. You could pull my image if you like but...since you are the genius: [root@adac3ce0cf32 ~]# mpirun --mca btl_vader_single_copy_mechanism none -np 2 ./a.out Rank 0 running on adac3ce0cf32 Rank 1

Re: [OMPI users] Help with One-Sided Communication: Works in Intel MPI, Fails in Open MPI

2020-02-24 Thread Matt Thompson via users
On Mon, Feb 24, 2020 at 4:57 PM Gabriel, Edgar wrote: > I am not an expert for the one-sided code in Open MPI, I wanted to comment > briefly on the potential MPI -IO related item. As far as I can see, the > error message > > > > “Read -1, expected 48, errno = 1” > > does not stem from MPI I/O,

Re: [OMPI users] Help with One-Sided Communication: Works in Intel MPI, Fails in Open MPI

2020-02-24 Thread Nathan Hjelm via users
The error is from btl/vader. CMA is not functioning as expected. It might work if you set btl_vader_single_copy_mechanism=none Performance will suffer though. It would be worth understanding with process_readv is failing. Can you send a simple reproducer? -Nathan > On Feb 24, 2020, at 2:59

Re: [OMPI users] Help with One-Sided Communication: Works in Intel MPI, Fails in Open MPI

2020-02-24 Thread Gabriel, Edgar via users
I am not an expert for the one-sided code in Open MPI, I wanted to comment briefly on the potential MPI -IO related item. As far as I can see, the error message “Read -1, expected 48, errno = 1” does not stem from MPI I/O, at least not from the ompio library. What file system did you use for

Re: [OMPI users] HELP: openmpi is not using the specified infiniband interface !!

2020-01-14 Thread George Bosilca via users
According to the error message you are using MPICH not Open MPI. George. On Tue, Jan 14, 2020 at 5:53 PM SOPORTE MODEMAT via users < users@lists.open-mpi.org> wrote: > Hello everyone. > > > > I would like somebody help me to figure out how can I make that the > openmpi use the infiniband

Re: [OMPI users] HELP: openmpi is not using the specified infiniband interface !!

2020-01-14 Thread Gilles Gouaillardet via users
Soporte, The error message is from MPICH! If you intend to use Open MPI, fix your environment first Cheers, Gilles Sent from my iPod > On Jan 15, 2020, at 7:53, SOPORTE MODEMAT via users > wrote: > > Hello everyone. > > I would like somebody help me to figure out how can I make that

Re: [OMPI users] Help Getting Started with Open MPI and PMIx and UCX

2019-01-23 Thread Matt Thompson
MAC > > > > *From:* users [mailto:users-boun...@lists.open-mpi.org] *On Behalf Of *Matt > Thompson > *Sent:* Tuesday, January 22, 2019 6:04 AM > *To:* Open MPI Users > *Subject:* Re: [OMPI users] Help Getting Started with Open MPI and PMIx > and UCX > > > > Wel

Re: [OMPI users] Help Getting Started with Open MPI and PMIx and UCX

2019-01-22 Thread Cabral, Matias A
To: Open MPI Users Subject: Re: [OMPI users] Help Getting Started with Open MPI and PMIx and UCX Well, By turning off UCX compilation per Howard, things get a bit better in that something happens! It's not a good something, as it seems to die with an infiniband error. As this is an Omnipath system

Re: [OMPI users] Help Getting Started with Open MPI and PMIx and UCX

2019-01-22 Thread Matt Thompson
Well, By turning off UCX compilation per Howard, things get a bit better in that something happens! It's not a good something, as it seems to die with an infiniband error. As this is an Omnipath system, is OpenMPI perhaps seeing libverbs somewhere and compiling it in? To wit: (1006)(master) $

Re: [OMPI users] Help Getting Started with Open MPI and PMIx and UCX

2019-01-20 Thread Howard Pritchard
Hi Matt Definitely do not include the ucx option for an omnipath cluster. Actually if you accidentally installed ucx in its default location use on the system Switch to this config option --with-ucx=no Otherwise you will hit https://github.com/openucx/ucx/issues/750 Howard Gilles

Re: [OMPI users] Help Getting Started with Open MPI and PMIx and UCX

2019-01-19 Thread Gilles Gouaillardet
Matt, There are two ways of using PMIx - if you use mpirun, then the MPI app (e.g. the PMIx client) will talk to mpirun and orted daemons (e.g. the PMIx server) - if you use SLURM srun, then the MPI app will directly talk to the PMIx server provided by SLURM. (note you might have to srun

Re: [OMPI users] Help Getting Started with Open MPI and PMIx and UCX

2019-01-18 Thread Cabral, Matias A
(vader?). So, if the job is not starting this seems to be a runtime issue rather than transport…. Pmix? slurm? Thanks _MAC From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf Of Matt Thompson Sent: Friday, January 18, 2019 10:27 AM To: Open MPI Users Subject: Re: [OMPI users] Help

Re: [OMPI users] Help Getting Started with Open MPI and PMIx and UCX

2019-01-18 Thread Matt Thompson
On Fri, Jan 18, 2019 at 1:13 PM Jeff Squyres (jsquyres) via users < users@lists.open-mpi.org> wrote: > On Jan 18, 2019, at 12:43 PM, Matt Thompson wrote: > > > > With some help, I managed to build an Open MPI 4.0.0 with: > > We can discuss each of these params to let you know what they are. > >

Re: [OMPI users] Help Getting Started with Open MPI and PMIx and UCX

2019-01-18 Thread Jeff Squyres (jsquyres) via users
On Jan 18, 2019, at 12:43 PM, Matt Thompson wrote: > > With some help, I managed to build an Open MPI 4.0.0 with: We can discuss each of these params to let you know what they are. > ./configure --disable-wrapper-rpath --disable-wrapper-runpath Did you have a reason for disabling these?

Re: [OMPI users] Help Getting Started with Open MPI and PMIx and UCX

2019-01-18 Thread Matt Thompson
All, With some help, I managed to build an Open MPI 4.0.0 with: ./configure --disable-wrapper-rpath --disable-wrapper-runpath --with-psm2 --with-slurm --enable-mpi1-compatibility --with-ucx --with-pmix=/usr/nlocal/pmix/2.1 --with-libevent=/usr CC=icc CXX=icpc FC=ifort The MPI 1 is because I

Re: [OMPI users] help installing openmpi 3.0 in ubuntu 16.04

2018-03-16 Thread Jeff Squyres (jsquyres)
(Sending this to the users list, not to just the owner of the users list) It looks like you might have installed Open MPI correctly. But you have to give some command line options to mpirun to tell it what to do -- you're basically getting an error saying "you didn't tell me what to do, so I

Re: [OMPI users] Help debugging invalid read

2018-02-19 Thread Florian Lindner
Ok, I think I have found the problem During std::vector::push_back or emplace_back a realloc happens and thus memory locations that I gave to MPI_Isend become invalid. My loop now reads: std::vector eventSendBuf(eventsSize); // Buffer to hold the MPI_EventData object for (int i = 0; i <

Re: [OMPI users] Help with binding processes correctly in Hybrid code (Openmpi +openmp)

2017-11-14 Thread Gilles Gouaillardet
Hi, per https://www2.cisl.ucar.edu/resources/computational-systems/cheyenne/running-jobs/pbs-pro-job-script-examples, you can try #PBS -l select=2:ncpus=16:mpiprocs=2:ompthreads=8 Cheers, Gilles On Tue, Nov 14, 2017 at 4:32 PM, Anil K. Dasanna wrote: > Hello
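The resource line suggested above can be checked with a little arithmetic. The following is a minimal sketch, assuming the usual PBS Pro reading of the chunk fields (`select` chunks, each with `ncpus` cores, `mpiprocs` MPI ranks, and `ompthreads` OpenMP threads per rank); the helper `parse_select` is hypothetical and not part of PBS or Open MPI.

```python
def parse_select(spec):
    """Parse a 'select=N:key=value:...' string into a dict of ints."""
    chunks, *resources = spec.split(":")
    layout = {"select": int(chunks.split("=", 1)[1])}
    for item in resources:
        key, value = item.split("=", 1)
        layout[key] = int(value)
    return layout


layout = parse_select("select=2:ncpus=16:mpiprocs=2:ompthreads=8")

# Total MPI ranks across all chunks.
total_ranks = layout["select"] * layout["mpiprocs"]
# Cores needed in each chunk if every thread gets its own core.
cores_per_chunk = layout["mpiprocs"] * layout["ompthreads"]

if cores_per_chunk > layout["ncpus"]:
    raise ValueError("chunk would be oversubscribed")

print(f"{total_ranks} ranks x {layout['ompthreads']} threads, "
      f"{cores_per_chunk}/{layout['ncpus']} cores per chunk")
```

Under these assumptions the request gives 4 ranks in total, each with 8 threads, filling all 16 cores of each chunk.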

Re: [OMPI users] Help

2017-04-27 Thread Gus Correa
On 04/27/2017 06:21 AM, Corina Jeni Tudorache wrote: Hello, I am trying to install Open MPI on Centos and I got stuck. I have installed an GNU compiler and after that I run the command: _yum install openmpi-devel.x86_64. _But when I run command mpi selector –- list I receive this error “mpi:

Re: [OMPI users] Help

2017-04-27 Thread gilles
PM To: Open MPI Users <users@lists.open-mpi.org> Subject: Re: [OMPI users] Help by the way, are you running CentOS 5 ? it seems mpi-selector is no more available from CentOS 6 Cheers, Gilles - Original Message - Yes, I

Re: [OMPI users] Help

2017-04-27 Thread Corina Jeni Tudorache
Users <users@lists.open-mpi.org> Subject: Re: [OMPI users] Help by the way, are you running CentOS 5 ? it seems mpi-selector is no more available from CentOS 6 Cheers, Gilles - Original Message - Yes, I write it wrong the previous e-mail, but actually it does not work. Gives the

Re: [OMPI users] Help

2017-04-27 Thread Corina Jeni Tudorache
org> Subject: Re: [OMPI users] Help Well, i cannot make sense of this error message. if the command is mpi-selector, the error message could be mpi-selector: command not found but this is not the error message you reported what does rpm -ql mpi-selector reports ? Cheers, Gilles - Or

Re: [OMPI users] Help

2017-04-27 Thread gilles
From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf Of gil...@rist.or.jp Sent: Thursday, April 27, 2017 11:34 AM To: Open MPI Users <users@lists.open-mpi.org> Subject: Re: [OMPI users] Help Hi, that looks like a typo, the command is

Re: [OMPI users] Help

2017-04-27 Thread gilles
<users@lists.open-mpi.org> Subject: Re: [OMPI users] Help Hi, that looks like a typo, the command is mpi-selector --list Cheers, Gilles - Original Message - Hello, I am trying to instal

Re: [OMPI users] Help

2017-04-27 Thread Corina Jeni Tudorache
pen-mpi.org> Subject: Re: [OMPI users] Help Hi, that looks like a typo, the command is mpi-selector --list Cheers, Gilles - Original Message - Hello, I am trying to install Open MPI on Centos and I got stuck. I have installed an GNU compiler and after that I run the comman

Re: [OMPI users] Help

2017-04-27 Thread gilles
Hi, that looks like a typo, the command is mpi-selector --list Cheers, Gilles - Original Message - Hello, I am trying to install Open MPI on Centos and I got stuck. I have installed an GNU compiler and after that I run the command: yum install openmpi-devel.x86_64.

Re: [OMPI users] Help with Open MPI 2.1.0 and PGI 16.10: Configure and C++

2017-03-24 Thread Matt Thompson
Gilles, The library I have having issues linking is ESMF and it is a C++/Fortran application. From http://www.earthsystemmodeling.org/esmf_releases/non_public/ESMF_7_0_0/ESMF_usrdoc/node9.html#SECTION00092000 : The following compilers and utilities *are required* for compiling,

Re: [OMPI users] Help with Open MPI 2.1.0 and PGI 16.10: Configure and C++

2017-03-23 Thread Gilles Gouaillardet
Matt, a C++ compiler is required to configure Open MPI. That being said, C++ compiler is only used if you build the C++ bindings (That were removed from MPI-3) And unless you plan to use the mpic++ wrapper (with or without the C++ bindings), a valid C++ compiler is not required at all. /*

Re: [OMPI users] Help with Open MPI 2.1.0 and PGI 16.10: Configure and C++

2017-03-23 Thread Reuti
Hi, Am 22.03.2017 um 20:12 schrieb Matt Thompson: > […] > > Ah. PGI 16.9+ now use pgc++ to do C++ compiling, not pgcpp. So, I hacked > configure so that references to pgCC (nonexistent on macOS) are gone and all > pgcpp became pgc++, but: This is not unique to macOS. pgCC used STLPort STL

Re: [OMPI users] Help - Client / server - app hangs in connect/accept by the second or next client that wants to connect to server

2016-10-21 Thread Gilles Gouaillardet
Matus, This has very likely been fixed by https://github.com/open-mpi/ompi/pull/2259 Can you download the patch at https://github.com/open-mpi/ompi/pull/2259.patch and apply it manually on v1.10 ? Cheers, Gilles On Monday, August 29, 2016, M. D. wrote: > > Hi, > >

Re: [OMPI users] Help - Client / server - app hangs in connect/accept by the second or next client that wants to connect to server

2016-07-19 Thread Gilles Gouaillardet
my bad for the confusion, I misread you and miswrote my reply. I will investigate this again. strictly speaking, the clients can only start after the server first write the port info to a file. if you start the client right after the server start, they might use incorrect/outdated info and

Re: [OMPI users] Help - Client / server - app hangs in connect/accept by the second or next client that wants to connect to server

2016-07-19 Thread M. D.
Yes I understand it, but I think, this is exactly that situation you are talking about. In my opinion, the test is doing exactly what you said - when a new player is willing to join, other players must invoke MPI_Comm_accept(). All *other* players must invoke MPI_Comm_accept(). Only the last

Re: [OMPI users] Help - Client / server - app hangs in connect/accept by the second or next client that wants to connect to server

2016-07-19 Thread Gilles Gouaillardet
here is what the client is doing printf("CLIENT: after merging, new comm: size=%d rank=%d\n", size, rank) ; for (i = rank ; i < num_clients ; i++) { /* client performs a collective accept */ CHK(MPI_Comm_accept(server_port_name, MPI_INFO_NULL, 0, intracomm, )) ;

Re: [OMPI users] Help - Client / server - app hangs in connect/accept by the second or next client that wants to connect to server

2016-07-19 Thread M. D.
2016-07-19 10:06 GMT+02:00 Gilles Gouaillardet : > MPI_Comm_accept must be called by all the tasks of the local communicator. > Yes, that's how I understand it. In the source code of the test, all the tasks call MPI_Comm_accept - server and also relevant clients. > so if you

Re: [OMPI users] Help - Client / server - app hangs in connect/accept by the second or next client that wants to connect to server

2016-07-19 Thread Gilles Gouaillardet
MPI_Comm_accept must be called by all the tasks of the local communicator. so if you 1) mpirun -np 1 ./singleton_client_server 2 1 2) mpirun -np 1 ./singleton_client_server 2 0 3) mpirun -np 1 ./singleton_client_server 2 0 then 3) starts after 2) has exited, so on 1), intracomm is made of 1)

Re: [OMPI users] Help - Client / server - app hangs in connect/accept by the second or next client that wants to connect to server

2016-07-19 Thread M. D.
Hi, thank you for your interest in this topic. So, I normally run the test as follows: Firstly, I run "server" (second parameter is 1): *mpirun -np 1 ./singleton_client_server number_of_clients 1* Secondly, I run corresponding number of "clients" via following command: *mpirun -np 1

Re: [OMPI users] Help - Client / server - app hangs in connect/accept by the second or next client that wants to connect to server

2016-07-19 Thread Gilles Gouaillardet
How do you run the test ? you should have the same number of clients in each mpirun instance, the following simple shell starts the test as i think it is supposed to note the test itself is arguable since MPI_Comm_disconnect() is never invoked (and you will observe some related

Re: [OMPI users] Help on Windows

2016-02-23 Thread Walt Brainerd
Thank you, Gilles! It's amazing to get such help. It seems to work when I unplugged the ethernet and have the wireless on, but I will check it out further (including the firewall situation) to pin it down. time mpirun -np 4 ./a Hello from 1 out of 4 images. Hello from

Re: [OMPI users] Help on Windows

2016-02-23 Thread Gilles Gouaillardet
Walt, generally speaking, that kind of things happen when you are using a wireless network and/or a firewall. so i recommend you first try to disconnect all your networks and see how things get improved Cheers, Gilles On 2/24/2016 5:08 AM, Walt Brainerd wrote: I am running up-to-date

Re: [OMPI users] Help with OpenMPI and Univa Grid Engine

2016-02-09 Thread Rahul Pisharody
Hello Ralph, Dave, Thank you for your suggestions. Let me check on the nfs mounts. The problem is I am not the grid administrator. I'm working with the grid administrator to get it resolved. If I had my way, I would be probably using Sun Grid. Thank you Dave for pointing out something that I

Re: [OMPI users] Help with OpenMPI and Univa Grid Engine

2016-02-09 Thread Dave Love
Rahul Pisharody writes: > Hello all, > > I'm trying to get a simple program (print the hostname of the executing > machine) compiled with openmpi run across multiple machines on Univa Grid > Engine. > > This particular configuration has many of the ports blocked. My run

Re: [OMPI users] Help with OpenMPI and Univa Grid Engine

2016-02-08 Thread Ralph Castain
Is your OMPI installed on an NFS partition? If so, is it in the same mount point on all nodes? Most likely problem is that the required libraries were not found on the remote node > On Feb 8, 2016, at 10:45 AM, Rahul Pisharody wrote: > > Hello all, > > I'm trying to

Re: [OMPI users] Help with Binding in 1.8.8: Use only second socket

2015-12-21 Thread Saliya Ekanayake
I tried the following with OpenMPI 1.8.1 and 1.10.1. The both worked. In my case a node has 2 sockets like yours, but each socket has 12 cores and lstopo showed core numbers for the second socket are from 12 to 23. * mpirun --report-bindings --bind-to core --cpu-set 12,13,14,15,16,17,18,19 -np 8
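The core list passed to `--cpu-set` in the message above can be generated instead of typed by hand. This is a minimal sketch, assuming cores are numbered contiguously per socket (as lstopo reported on that machine; other machines may interleave core numbers across sockets). The helper name `socket_cpu_set` is hypothetical and not part of Open MPI.

```python
def socket_cpu_set(socket_index, cores_per_socket, nranks):
    """Return a comma-separated core list for `nranks` ranks on one socket.

    Assumes contiguous numbering: socket s owns cores
    [s * cores_per_socket, (s + 1) * cores_per_socket).
    """
    if nranks > cores_per_socket:
        raise ValueError("more ranks than cores on the socket")
    first = socket_index * cores_per_socket
    return ",".join(str(core) for core in range(first, first + nranks))


# Second socket (index 1) of a node with 12 cores per socket, 8 ranks.
cpu_set = socket_cpu_set(socket_index=1, cores_per_socket=12, nranks=8)
print(f"--bind-to core --cpu-set {cpu_set} -np 8")
```

Checking the real numbering with lstopo first is still advisable, since the contiguous layout is only an assumption here.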

Re: [OMPI users] Help with Binding in 1.8.8: Use only second socket

2015-12-21 Thread Matt Thompson
Ralph, Huh. That isn't in the Open MPI 1.8.8 mpirun man page. It is in Open MPI 1.10, so I'm guessing someone noticed it wasn't there. Explains why I didn't try it out. I'm assuming this option is respected on all nodes? Note: a SmarterManThanI™ here at Goddard thought up this: #!/bin/bash

Re: [OMPI users] Help with Binding in 1.8.8: Use only second socket

2015-12-21 Thread Ralph Castain
Try adding --cpu-set a,b,c,… where the a,b,c… are the core id’s of your second socket. I’m working on a cleaner option as this has come up before. > On Dec 21, 2015, at 5:29 AM, Matt Thompson > wrote: > > Dear Open MPI Gurus, > > I'm currently

Re: [OMPI users] help understand unhelpful ORTE error message

2015-11-30 Thread Jeff Squyres (jsquyres)
On Nov 24, 2015, at 9:31 AM, Dave Love wrote: > >> btw, we already use the force, thanks to the ob1 pml and the yoda spml > > I think that's assuming familiarity with something which leaves out some > people... FWIW, I agree: we use unhelpful names for components in

Re: [OMPI users] help understand unhelpful ORTE error message

2015-11-24 Thread Dave Love
Gilles Gouaillardet writes: > Currently, ompi create a file in the temporary directory and then mmap it. > an obvious requirement is the temporary directory must have enough free > space for that file. > (this might be an issue on some disk less nodes) > > > a

Re: [OMPI users] help understand unhelpful ORTE error message

2015-11-20 Thread Gilles Gouaillardet
Currently, ompi create a file in the temporary directory and then mmap it. an obvious requirement is the temporary directory must have enough free space for that file. (this might be an issue on some disk less nodes) a simple alternative could be to try /tmp, and if there is not enough space,

Re: [OMPI users] help understand unhelpful ORTE error message

2015-11-20 Thread Dave Love
Jeff Hammond writes: >> Doesn't mpich have the option to use sysv memory? You may want to try that >> >> > MPICH? Look, I may have earned my way onto Santa's naughty list more than > a few times, but at least I have the decency not to post MPICH questions to > the

Re: [OMPI users] help understand unhelpful ORTE error message

2015-11-20 Thread Dave Love
[There must be someone better to answer this, but since I've seen it:] Jeff Hammond writes: > I have no idea what this is trying to tell me. Help? > > jhammond@nid00024:~/MPI/qoit/collectives> mpirun -n 2 ./driver.x 64 > [nid00024:00482] [[46168,0],0] ORTE_ERROR_LOG:

Re: [OMPI users] help understand unhelpful ORTE error message

2015-11-19 Thread Jeff Hammond
On Thu, Nov 19, 2015 at 4:11 PM, Howard Pritchard wrote: > Hi Jeff H. > > Why don't you just try configuring with > > ./configure --prefix=my_favorite_install_dir > --with-libfabric=install_dir_for_libfabric > make -j 8 install > > and see what happens? > > That was the

Re: [OMPI users] help understand unhelpful ORTE error message

2015-11-19 Thread Howard Pritchard
Hi Jeff, I finally got an allocation on cori - its one busy machine. Anyway, using the ompi i'd built on edison with the above recommended configure options I was able to run using either srun or mpirun on cori provided that in the later case I used mpirun -np X -N Y --mca plm slurm

Re: [OMPI users] help understand unhelpful ORTE error message

2015-11-19 Thread Jeff Hammond
> > > How did you configure for Cori? You need to be using the slurm plm > component for that system. I know this sounds like gibberish. > > ../configure --with-libfabric=$HOME/OFI/install-ofi-gcc-gni-cori \ --enable-mca-static=mtl-ofi \

Re: [OMPI users] help understand unhelpful ORTE error message

2015-11-19 Thread Howard
Hi Jeff How did you configure for Cori? You need to be using the slurm plm component for that system. I know this sounds like gibberish. There should be a with-slurm configure option to pick up this component. Doesn't mpich have the option to use sysv memory? You may want to try that Oh

Re: [OMPI users] help understand unhelpful ORTE error message

2015-11-19 Thread Martin Siegert
Hi Jeff, On Thu 19.11.2015 09:44:20 Jeff Hammond wrote: > I have no idea what this is trying to tell me. Help? > > jhammond@nid00024:~/MPI/qoit/collectives> mpirun -n 2 ./driver.x 64 > [nid00024:00482] [[46168,0],0] ORTE_ERROR_LOG: Not found in file >

Re: [OMPI users] Help with Specific Binding

2015-09-13 Thread Ralph Castain
Checkout the man page “OMPI_Affinity_str” for an MPI extension that might help > On Sep 13, 2015, at 7:28 AM, Saliya Ekanayake wrote: > > Thank you, I'll try this. Also, is there a way to know which core a process > is bound to within the program other than executing

Re: [OMPI users] Help with Specific Binding

2015-09-13 Thread Gilles Gouaillardet
on linux, you can look at /proc/self/status and search allowed_cpus_list or you can use the sched_getaffinity system call note that in some (hopefully rare)cases, this will return different results than hwloc On Sunday, September 13, 2015, Saliya Ekanayake wrote: > Thank
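Both approaches mentioned above can be tried from inside a process. The sketch below is Linux-only and uses Python purely for illustration; in C the equivalent call is `sched_getaffinity(2)`. It assumes the field in `/proc/self/status` is named `Cpus_allowed_list`, which is the name on current Linux kernels. The helper names are hypothetical.

```python
import os


def parse_cpu_list(text):
    """Expand a Linux CPU list such as '0-3,8' into [0, 1, 2, 3, 8]."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus


def allowed_cpus_from_proc(path="/proc/self/status"):
    """Read the Cpus_allowed_list field; return None if it is unavailable."""
    try:
        with open(path) as status:
            for line in status:
                if line.startswith("Cpus_allowed_list:"):
                    return parse_cpu_list(line.split(":", 1)[1])
    except OSError:
        pass
    return None


if hasattr(os, "sched_getaffinity"):  # only defined on some platforms, e.g. Linux
    print("sched_getaffinity:", sorted(os.sched_getaffinity(0)))
print("/proc/self/status:", allowed_cpus_from_proc())
```

Launched under `mpirun --bind-to core`, each rank would be expected to print the core or cores it is bound to; as the message notes, the result may occasionally differ from what hwloc reports.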

Re: [OMPI users] Help with Specific Binding

2015-09-13 Thread Saliya Ekanayake
Thank you, I'll try this. Also, is there a way to know which core a process is bound to within the program other than executing something like taskset from program? On Sun, Sep 13, 2015 at 10:05 AM, Ralph Castain wrote: > Actually, the error was correct - it was me that was

Re: [OMPI users] Help with Specific Binding

2015-09-13 Thread Ralph Castain
Actually, the error was correct - it was me that was incorrect. The correct set of options would be: --map-by ppr:12_node --bind-to core --cpu-set=0,2,4,… Sorry about the confusion > On Sep 13, 2015, at 2:43 AM, Ralph Castain wrote: > > The rankfile will certainly do it, but

Re: [OMPI users] Help with Specific Binding

2015-09-13 Thread Ralph Castain
The rankfile will certainly do it, but that error is a bug and I’ll have to fix it. > On Sep 13, 2015, at 1:10 AM, Saliya Ekanayake wrote: > > I could get it working by manually generating a rankfile all the ranks and > not using any --map-by options. > > I'll try the

Re: [OMPI users] Help with Specific Binding

2015-09-13 Thread Saliya Ekanayake
I could get it working by manually generating a rankfile all the ranks and not using any --map-by options. I'll try the --map-by core as well On Sun, Sep 13, 2015 at 3:59 AM, Tobias Kloeffel wrote: > Hi, > use: --map-by core > > regards, > Tobias > > > On 09/13/2015

Re: [OMPI users] Help with Specific Binding

2015-09-13 Thread Tobias Kloeffel
Hi, use: --map-by core regards, Tobias On 09/13/2015 09:41 AM, Saliya Ekanayake wrote: I tried, --map-by ppr:12:node --slot-list 0,2,4,6,8,10,12,14,16,18,20,22 --bind-to core -np 12 but it complains, "Conflicting directives for binding policy are causing the policy to be redefined:

Re: [OMPI users] Help with Specific Binding

2015-09-13 Thread Saliya Ekanayake
I tried, --map-by ppr:12:node --slot-list 0,2,4,6,8,10,12,14,16,18,20,22 --bind-to core -np 12 but it complains, "Conflicting directives for binding policy are causing the policy to be redefined: New policy: socket Prior policy: CORE Please check that only one policy is defined. " On

Re: [OMPI users] Help with Specific Binding

2015-09-13 Thread Ralph Castain
Try something like this instead: --map-by ppr:12:node --bind-to core --slot-list=0,2,4,6,8,… You’ll have to play a bit with the core numbers in the slot-list to get the numbering right as I don’t know how your machine numbers them, and I can’t guarantee it will work - but it’s worth a shot. If it

Re: [OMPI users] Help : Slowness with OpenMPI (1.8.1) and Numpy

2015-06-12 Thread Ralph Castain
Is this a threaded code? If so, you should add --bind-to none to your 1.8 series command line. > On Jun 12, 2015, at 7:58 AM, kishor sharma wrote: > > Hi There, > > > > I am facing slowness running numpy code using mpirun with openmpi 1.8.1 > version. > > > > With

Re: [OMPI users] help in execution mpi

2015-04-23 Thread Ralph Castain
Use “orte_rsh_agent = rsh” instead. > On Apr 23, 2015, at 10:48 AM, rebona...@upf.br wrote: > > Hi all > > I installed MPI (version 1.6.5) on Ubuntu 14.04. I teach parallel > programming in an undergraduate course. > I want to use rsh instead of ssh (the default). > I change the file

Re: [OMPI users] Help on getting CMA works

2015-02-24 Thread Nathan Hjelm
I don't know the reasoning for requiring --with-cma to enable CMA but I am looking at auto-detecting CMA instead of requiring Open MPI to be configured with --with-cma. This will likely go into the 1.9 release series and not 1.8. -Nathan On Thu, Feb 19, 2015 at 09:31:43PM -0500, Eric

Re: [OMPI users] Help on getting CMA works

2015-02-19 Thread Eric Chamberland
Maybe it is a stupid question, but... why is it not tested and enabled by default at configure time, since it is part of the kernel? Eric On 02/19/2015 03:53 PM, Nathan Hjelm wrote: Great! I will add an MCA variable to force CMA and also enable it if 1) no yama and 2) no PR_SET_PTRACER. You

Re: [OMPI users] Help on getting CMA works

2015-02-19 Thread Nathan Hjelm
Aurélien, I should also point out your fix has already been applied to the 1.8 branch and will be included in 1.8.5. -Nathan On Thu, Feb 19, 2015 at 02:57:38PM -0700, Nathan Hjelm wrote: > > Hmm, wait. Yes. Your change went in after 1.8.4 and has the same > effect. If yama isn't installed it

Re: [OMPI users] Help on getting CMA works

2015-02-19 Thread Nathan Hjelm
Hmm, wait. Yes. Your change went in after 1.8.4 and has the same effect. If yama isn't installed it is safe to assume that the ptrace scope is effectively 0. So, your patch does fix the issue. -Nathan On Thu, Feb 19, 2015 at 02:53:47PM -0700, Nathan Hjelm wrote: > > I don't think that will fix

Re: [OMPI users] Help on getting CMA works

2015-02-19 Thread Nathan Hjelm
I don't think that will fix this issue. In this case yama is not installed and it appears PR_SET_PTRACER is not available. This forces vader to assume that CMA can not be used when that isn't always the case. I think it might be safe to assume that CMA is unrestricted here. -Nathan On Thu, Feb

Re: [OMPI users] Help on getting CMA works

2015-02-19 Thread Aurélien Bouteiller
Nathan, I think I already pushed a patch for this particular issue last month. I do not know if it has been back ported to release yet. See here: https://github.com/open-mpi/ompi/commit/ee3b0903164898750137d3b71a8f067e16521102 Aurelien -- ~~~ Aurélien Bouteiller, Ph.D. ~~~

Re: [OMPI users] Help on getting CMA works

2015-02-19 Thread Eric Chamberland
On 02/19/2015 03:53 PM, Nathan Hjelm wrote: Great! I will add an MCA variable to force CMA and also enable it if 1) no yama and 2) no PR_SET_PTRACER. cool, thanks again! You might also look at using xpmem. You can find a version that supports 3.x @ https://github.com/hjelmn/xpmem . It is a

Re: [OMPI users] Help on getting CMA works

2015-02-19 Thread Nathan Hjelm
Great! I will add an MCA variable to force CMA and also enable it if 1) no yama and 2) no PR_SET_PTRACER. You might also look at using xpmem. You can find a version that supports 3.x @ https://github.com/hjelmn/xpmem . It is a kernel module + userspace library that can be used by vader as a

Re: [OMPI users] Help on getting CMA works

2015-02-19 Thread Eric Chamberland
On 02/19/2015 02:58 PM, Nathan Hjelm wrote: On Thu, Feb 19, 2015 at 12:16:49PM -0500, Eric Chamberland wrote: On 02/19/2015 11:56 AM, Nathan Hjelm wrote: If you have yama installed you can try: Nope, I do not have it installed... is it absolutely necessary? (and would it change something

Re: [OMPI users] Help on getting CMA works

2015-02-19 Thread Nathan Hjelm
On Thu, Feb 19, 2015 at 12:16:49PM -0500, Eric Chamberland wrote: > > On 02/19/2015 11:56 AM, Nathan Hjelm wrote: > > > >If you have yama installed you can try: > > Nope, I do not have it installed... is it absolutely necessary? (and would > it change something when it fails when I am root?) >

Re: [OMPI users] Help on getting CMA works

2015-02-19 Thread Eric Chamberland
On 02/19/2015 11:56 AM, Nathan Hjelm wrote: If you have yama installed you can try: Nope, I do not have it installed... is it absolutely necessary? (and would it change something when it fails when I am root?) Other question: In addition to "--with-cma" configure flag, do we have to pass

Re: [OMPI users] Help on getting CMA works

2015-02-19 Thread Nathan Hjelm
If you have yama installed you can try: echo 1 > /proc/sys/kernel/yama/ptrace_scope as root. -Nathan On Thu, Feb 19, 2015 at 11:06:09AM -0500, Eric Chamberland wrote: > By the way, > > I have tried two other things: > > #1- I launched it as root: > > mpiexec --mca
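
Before changing the setting, it can help to see whether Yama is present at all and what scope is currently in effect. The sketch below reads /proc/sys/kernel/yama/ptrace_scope and returns None when the file does not exist, which corresponds to the "no yama" case discussed in this thread. The function name read_ptrace_scope is made up for this example; it only inspects the value and says nothing about how Open MPI itself decides whether CMA is usable.

```python
from pathlib import Path

YAMA_PTRACE_SCOPE = Path("/proc/sys/kernel/yama/ptrace_scope")


def read_ptrace_scope(path=YAMA_PTRACE_SCOPE):
    """Return the Yama ptrace_scope value as an int, or None if unavailable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # Missing file (Yama not built in) or unexpected content.
        return None


if __name__ == "__main__":
    scope = read_ptrace_scope()
    if scope is None:
        print("Yama ptrace_scope not found; Yama appears not to be enabled")
    else:
        print("Yama ptrace_scope =", scope)
```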
