Re: [OMPI users] problem with cancelling Send-Request

2019-10-02 Thread Jeff Hammond via users
Don’t try to cancel sends. https://github.com/mpi-forum/mpi-issues/issues/27 has some useful info. Jeff On Wed, Oct 2, 2019 at 7:17 AM Christian Von Kutzleben via users < users@lists.open-mpi.org> wrote: > Hi, > > I’m currently evaluating to use openmpi (4.0.1) in our application. > > We are

Re: [OMPI users] problem with cancelling Send-Request

2019-10-02 Thread Emyr James via users
Regulation C/ Dr. Aiguader, 88 Edif. PRBB 08003 Barcelona, Spain Phone Ext: #1098 From: users on behalf of Christian Von Kutzleben via users Sent: 02 October 2019 16:14:24 To: users@lists.open-mpi.org Cc: Christian Von Kutzleben Subject: [OMPI users] problem

[OMPI users] problem with cancelling Send-Request

2019-10-02 Thread Christian Von Kutzleben via users
Hi, I’m currently evaluating openmpi (4.0.1) for use in our application. We are using a construct like this for some cleanup functionality, to cancel some Send requests: if (*req != MPI_REQUEST_NULL) { MPI_Cancel(req); MPI_Wait(req, MPI_STATUS_IGNORE); assert(*req == MPI_REQUEST_NULL); }

[OMPI users] Problem with OpenMPI v 4.0.1 with UCX on IB network with hca_id mlx4_0

2019-04-25 Thread Bertini, Denis Dr. via users
Hi, I tried to install OpenMPI v 4.0.1 on our Debian cluster using an Infiniband network with the following dev_info: hca_id: mlx4_0 1) I first tried to install OpenMPI without the UCX framework and it runs perfectly as before; I just need to add >export OMPI_MCA_btl_openib_allow_ib=1 to remove the warning

Re: [OMPI users] Problem running with UCX/oshmem on single node?

2018-05-14 Thread Michael Di Domenico
On Wed, May 9, 2018 at 9:45 PM, Howard Pritchard wrote: > > You either need to go and buy a connectx4/5 HCA from mellanox (and maybe a > switch), and install that > on your system, or else install xpmem (https://github.com/hjelmn/xpmem). > Note there is a bug right now > in

Re: [OMPI users] problem

2018-05-10 Thread dpchoudh
What Jeff is suggesting is probably valgrind. However, in my experience, which is much less than most OpenMPI developers, a simple code inspection often is adequate. Here are the steps: 1. If you don't already have it, build a debug version of your code. If you are using gcc, you'd use a -g to

Re: [OMPI users] problem

2018-05-10 Thread Ankita m
ok...Thank you so much sir On Wed, May 9, 2018 at 11:13 PM, Jeff Squyres (jsquyres) wrote: > It looks like you're getting a segv when calling MPI_Comm_rank(). > > This is quite unusual -- MPI_Comm_rank() is just a local lookup / return > of an integer. If MPI_Comm_rank()

Re: [OMPI users] Problem running with UCX/oshmem on single node?

2018-05-09 Thread Howard Pritchard
Hi Craig, You are experiencing problems because you don't have a transport installed that UCX can use for oshmem. You either need to go and buy a connectx4/5 HCA from mellanox (and maybe a switch), and install that on your system, or else install xpmem (https://github.com/hjelmn/xpmem). Note

[OMPI users] Problem running with UCX/oshmem on single node?

2018-05-09 Thread Craig Reese
I'm trying to play with oshmem on a single node (just to have a way to do some simple experimentation and playing around) and having spectacular problems: CentOS 6.9 (gcc 4.4.7); built and installed ucx 1.3.0; built and installed openmpi-3.1.0. [cfreese]$ cat oshmem.c #include int

Re: [OMPI users] problem

2018-05-09 Thread Jeff Squyres (jsquyres)
It looks like you're getting a segv when calling MPI_Comm_rank(). This is quite unusual -- MPI_Comm_rank() is just a local lookup / return of an integer. If MPI_Comm_rank() is seg faulting, it usually indicates that there's some other kind of memory error in the application, and this seg fault

Re: [OMPI users] problem

2018-05-09 Thread Ankita m
Yes. Because previously I was using Intel MPI; at that time the program was running perfectly. Now when I use Open MPI it shows these error files... Though I am not quite sure, I just thought that if the issue is with Open MPI then I could get some help here. On Wed, May 9, 2018 at 6:47 PM, Gilles

Re: [OMPI users] problem

2018-05-09 Thread Gilles Gouaillardet
Ankita, Do you have any reason to suspect the root cause of the crash is Open MPI ? Cheers, Gilles On Wednesday, May 9, 2018, Ankita m wrote: > MPI "Hello World" program is also working > > please see this error file attached below. its of a different program > > On

Re: [OMPI users] problem

2018-05-09 Thread Ankita m
MPI "Hello World" program is also working please see this error file attached below. its of a different program On Wed, May 9, 2018 at 4:10 PM, John Hearns via users < users@lists.open-mpi.org> wrote: > Ankita, looks like your program is not launching correctly. > I would try the following: >

Re: [OMPI users] problem

2018-05-09 Thread John Hearns via users
Ankita, it looks like your program is not launching correctly. I would try the following: define two hosts in a machinefile. Use mpirun -np 2 machinefile date, i.e. can you use mpirun just to run the command 'date'? Secondly, compile and try to run an MPI 'Hello World' program. On 9 May 2018 at

[OMPI users] problem

2018-05-09 Thread Ankita m
I am using ompi-3.1.0 in my program and the compiler is mpicc. It's a parallel program which uses multiple nodes with 16 cores in each node, but it's not working and generates an error file. I have attached the error file below. Can anyone please tell what the issue actually is?

Re: [OMPI users] problem related ORTE

2018-04-06 Thread Jeff Squyres (jsquyres)
Can you please send all the information listed here: https://www.open-mpi.org/community/help/ Thanks! > On Apr 6, 2018, at 8:27 AM, Ankita m wrote: > > Hello Sir/Madam > > I am Ankita Maity, a PhD scholar from Mechanical Dept., IIT Roorkee, India > > I am

[OMPI users] problem related ORTE

2018-04-06 Thread Ankita m
Hello Sir/Madam I am Ankita Maity, a PhD scholar from Mechanical Dept., IIT Roorkee, India I am facing a problem while submitting a parallel program to the HPC cluster available in our dept. I have attached the error file its showing during the time of run. Can You please help me with the

[OMPI users] Problem with Mellanox device selection

2017-12-18 Thread Götz Waschk
Hi everyone, I have a cluster of 32 nodes with Infiniband, four of them additionally have a 10G Mellanox Ethernet card for faster I/O. If my job based on openmpi 1.10.6 ends up on one of these nodes, it will crash: No OpenFabrics connection schemes reported that they were able to be used on a

Re: [OMPI users] Problem related to openmpi cart create command

2017-12-03 Thread Gilles Gouaillardet
Hi, There is not enough information to help. Can you build a minimal example that evidences the issue and states how many MPI tasks are needed to evidence this issue ? Cheers, Gilles On Sun, Dec 3, 2017 at 6:00 PM, Muhammad Umar wrote: > Hello, hope everyone is fine. >

[OMPI users] Problem related to openmpi cart create command

2017-12-03 Thread Muhammad Umar
Hello, hope everyone is fine. I have been given an openmpi code by a senior which includes mpi_cart_create. I have been trying to run the program. The program compiles correctly but on execution gives an error on the built-in function mpi_cart_create. The operating system I am using is Ubuntu 64

Re: [OMPI users] Problem with MPI jobs terminating when using OMPI 3.0.x

2017-10-31 Thread Andy Riebs
As always, thanks for your help Ralph! Cutting over to PMIx 1.2.4 solved the problem for me. (Slurm wasn't happy building with PMIx v2.) And yes, I had ssh access to node04. (And Gilles, thanks for your note, as well.) Andy On 10/27/2017 04:31 PM, r...@open-mpi.org wrote: Two questions:

Re: [OMPI users] Problem with MPI jobs terminating when using OMPI 3.0.x

2017-10-29 Thread Gilles Gouaillardet
Andy, The crash occurs in the orted daemon and not in the mpi_hello MPI app, so you will not see anything useful in gdb. you can use the attached launch agent script in order to get a stack trace of orted. your mpirun command line should be updated like this mpirun --mca

Re: [OMPI users] Problem with MPI jobs terminating when using OMPI 3.0.x

2017-10-27 Thread r...@open-mpi.org
Two questions: 1. are you running this on node04? Or do you have ssh access to node04? 2. I note you are building this against an old version of PMIx for some reason. Does it work okay if you build it with the embedded PMIx (which is 2.0)? Does it work okay if you use PMIx v1.2.4, the latest

[OMPI users] Problem with MPI jobs terminating when using OMPI 3.0.x

2017-10-27 Thread Andy Riebs
We have built a version of Open MPI 3.0.x that works with Slurm (our primary use case), but it fails when executed without Slurm. If I srun an MPI "hello world" program, it works just fine. Likewise, if I salloc a couple of nodes and use mpirun from there, life is good. But if I just try to

Re: [OMPI users] Problem with MPI_FILE_WRITE_AT

2017-09-15 Thread Edgar Gabriel
thank you for the report and the code, I will look into this. What file system is that occurring on? Until I find the problem, note that you could switch to back to the previous parallel I/O implementation (romio) by providing that as a parameter to your mpirun command, e.g. mpirun --mca io

[OMPI users] Problem with MPI_FILE_WRITE_AT

2017-09-15 Thread McGrattan, Kevin B. Dr. (Fed)
I am using MPI_FILE_WRITE_AT to print out the timings of subroutines in a big Fortran code. I have noticed since upgrading to Open MPI 2.1.1 that sometimes the file to be written is corrupted. Each MPI process is supposed to write out a character string that is 159 characters in length, plus a

Re: [OMPI users] problem with "--host" with openmpi-v3.x-201705250239-d5200ea

2017-05-31 Thread Gilles Gouaillardet
Thanks Siegmar, i was finally able to reproduce it. the error is triggered by the VM topology, and i was able to reproduce it by manually removing the "NUMA" objects from the topology. as a workaround, you can mpirun --map-by socket ... i will follow-up on the devel ML with Ralph.

Re: [OMPI users] problem with "--host" with openmpi-v3.x-201705250239-d5200ea

2017-05-31 Thread Siegmar Gross
Hi Gilles, Am 31.05.2017 um 08:38 schrieb Gilles Gouaillardet: Siegmar, the "big ORTE update" is a bunch of backports from master to v3.x btw, does the same error occurs with master ? Yes, it does, but the error occurs only if I use a real machine with my virtual machine "exin". I get the

Re: [OMPI users] problem with "--host" with openmpi-v3.x-201705250239-d5200ea

2017-05-31 Thread Gilles Gouaillardet
Siegmar, the "big ORTE update" is a bunch of backports from master to v3.x btw, does the same error occurs with master ? i noted mpirun simply does ssh exin orted ... can you double check the right orted (e.g. /usr/local/openmpi-3.0.0_64_cc/bin/orted) or you can try to mpirun --mca

Re: [OMPI users] problem with "--host" with openmpi-v3.x-201705250239-d5200ea

2017-05-31 Thread Siegmar Gross
Hi Gilles, I configured Open MPI with the following command. ../openmpi-v3.x-201705250239-d5200ea/configure \ --prefix=/usr/local/openmpi-3.0.0_64_cc \ --libdir=/usr/local/openmpi-3.0.0_64_cc/lib64 \ --with-jdk-bindir=/usr/local/jdk1.8.0_66/bin \

Re: [OMPI users] problem with "--host" with openmpi-v3.x-201705250239-d5200ea

2017-05-30 Thread r...@open-mpi.org
Until the fixes pending in the big ORTE update PR are committed, I suggest not wasting time chasing this down. I tested the “patched” version of the 3.x branch, and it works just fine. > On May 30, 2017, at 7:43 PM, Gilles Gouaillardet wrote: > > Ralph, > > > the issue

Re: [OMPI users] problem with "--host" with openmpi-v3.x-201705250239-d5200ea

2017-05-30 Thread Gilles Gouaillardet
Ralph, the issue Siegmar initially reported was loki hello_1 111 mpiexec -np 3 --host loki:2,exin hello_1_mpi per what you wrote, this should be equivalent to loki hello_1 111 mpiexec -np 3 --host loki:2,exin:1 hello_1_mpi and this is what i initially wanted to double check (but i made a

Re: [OMPI users] problem with "--host" with openmpi-v3.x-201705250239-d5200ea

2017-05-30 Thread r...@open-mpi.org
This behavior is as-expected. When you specify "-host foo,bar”, you have told us to assign one slot to each of those nodes. Thus, running 3 procs exceeds the number of slots you assigned. You can tell it to set the #slots to the #cores it discovers on the node by using “-host foo:*,bar:*” I

Re: [OMPI users] problem with "--host" with openmpi-v3.x-201705250239-d5200ea

2017-05-30 Thread gilles
Hi Siegmar, my bad, there was a typo in my reply. i really meant > > what if you ? > > mpiexec --host loki:2,exin:1 -np 3 hello_1_mpi but you also tried that and it did not help. i could not find anything in your logs that suggest mpiexec tries to start 5 MPI tasks, did i miss something ? i

Re: [OMPI users] problem with "--host" with openmpi-v3.x-201705250239-d5200ea

2017-05-30 Thread Siegmar Gross
Hi Gilles, what if you ? mpiexec --host loki:1,exin:1 -np 3 hello_1_mpi I need as many slots as processes so that I use "-np 2". "mpiexec --host loki,exin -np 2 hello_1_mpi" works as well. The command breaks, if I use at least "-np 3" and distribute the processes across at least two machines.

Re: [OMPI users] problem with "--host" with openmpi-v3.x-201705250239-d5200ea

2017-05-30 Thread gilles
Hi Siegmar, what if you ? mpiexec --host loki:1,exin:1 -np 3 hello_1_mpi are loki and exin different ? (os, sockets, core) Cheers, Gilles - Original Message - > Hi, > > I have installed openmpi-v3.x-201705250239-d5200ea on my "SUSE Linux > Enterprise Server 12.2 (x86_64)" with Sun C

[OMPI users] problem with "--host" with openmpi-v3.x-201705250239-d5200ea

2017-05-30 Thread Siegmar Gross
Hi, I have installed openmpi-v3.x-201705250239-d5200ea on my "SUSE Linux Enterprise Server 12.2 (x86_64)" with Sun C 5.14 and gcc-7.1.0. Depending on the machine that I use to start my processes, I have a problem with "--host" for versions "v3.x" and "master", while everything works as expected

Re: [OMPI users] Problem with MPI_Comm_spawn using openmpi 2.0.x + sbatch

2017-02-21 Thread Jing Gong
Hi, The email is intended to follow the thread about "Problem with MPI_Comm_spawn using openmpi 2.0.x + sbatch". https://mail-archive.com/users@lists.open-mpi.org/msg30650.html We have installed the latest version v2.0.2 on the cluster that

Re: [OMPI users] Problem with MPI_Comm_spawn using openmpi 2.0.x + sbatch

2017-02-15 Thread Anastasia Kruchinina
Ok, thanks for your answers! I was not aware that it is a known issue. I guess I will just try to find a machine with OpenMPI/2.0.2 and try there. On 16 February 2017 at 00:01, r...@open-mpi.org wrote: > Yes, 2.0.1 has a spawn issue. We believe that 2.0.2 is okay if you want

Re: [OMPI users] Problem with MPI_Comm_spawn using openmpi 2.0.x + sbatch

2017-02-15 Thread r...@open-mpi.org
Yes, 2.0.1 has a spawn issue. We believe that 2.0.2 is okay if you want to give it a try Sent from my iPad > On Feb 15, 2017, at 1:14 PM, Jason Maldonis wrote: > > Just to throw this out there -- to me, that doesn't seem to be just a problem > with SLURM. I'm guessing the

Re: [OMPI users] Problem with MPI_Comm_spawn using openmpi 2.0.x + sbatch

2017-02-15 Thread Jason Maldonis
Just to throw this out there -- to me, that doesn't seem to be just a problem with SLURM. I'm guessing the exact same error would be thrown interactively (unless I didn't read the above messages carefully enough). I had a lot of problems running spawned jobs on 2.0.x a few months ago, so I

Re: [OMPI users] Problem with MPI_Comm_spawn using openmpi 2.0.x + sbatch

2017-02-15 Thread Anastasia Kruchinina
Hi! I am doing like this: sbatch -N 2 -n 5 ./job.sh where job.sh is: #!/bin/bash -l module load openmpi/2.0.1-icc mpirun -np 1 ./manager 4 On 15 February 2017 at 17:58, r...@open-mpi.org wrote: > The cmd line looks fine - when you do your “sbatch” request, what is

Re: [OMPI users] Problem with MPI_Comm_spawn using openmpi 2.0.x + sbatch

2017-02-15 Thread r...@open-mpi.org
The cmd line looks fine - when you do your “sbatch” request, what is in the shell script you give it? Or are you saying you just “sbatch” the mpirun cmd directly? > On Feb 15, 2017, at 8:07 AM, Anastasia Kruchinina > wrote: > > Hi, > > I am running like this:

Re: [OMPI users] Problem with MPI_Comm_spawn using openmpi 2.0.x + sbatch

2017-02-15 Thread Anastasia Kruchinina
Hi, I am running like this: mpirun -np 1 ./manager Should I do it differently? I also thought that all sbatch does is create an allocation and then run my script in it. But it seems it is not since I am getting these results... I would like to upgrade to OpenMPI, but no clusters near me have

Re: [OMPI users] Problem with MPI_Comm_spawn using openmpi 2.0.x + sbatch

2017-02-15 Thread Howard Pritchard
Hi Anastasia, Definitely check the mpirun when in batch environment but you may also want to upgrade to Open MPI 2.0.2. Howard r...@open-mpi.org schrieb am Mi. 15. Feb. 2017 um 07:49: > Nothing immediate comes to mind - all sbatch does is create an allocation > and then run

Re: [OMPI users] Problem with MPI_Comm_spawn using openmpi 2.0.x + sbatch

2017-02-15 Thread r...@open-mpi.org
Nothing immediate comes to mind - all sbatch does is create an allocation and then run your script in it. Perhaps your script is using a different “mpirun” command than when you type it interactively? > On Feb 14, 2017, at 5:11 AM, Anastasia Kruchinina > wrote: >

[OMPI users] Problem with MPI_Comm_spawn using openmpi 2.0.x + sbatch

2017-02-14 Thread Anastasia Kruchinina
Hi, I am trying to use MPI_Comm_spawn function in my code. I am having trouble with openmpi 2.0.x + sbatch (batch system Slurm). My test program is located here: http://user.it.uu.se/~anakr367/files/MPI_test/ When I am running my code I am getting an error: OPAL ERROR: Timeout in file

Re: [OMPI users] problem with opal_list_remove_item for openmpi-v2.x-201702010255-8b16747 on Linux

2017-02-03 Thread Jeff Squyres (jsquyres)
I've filed this as https://github.com/open-mpi/ompi/issues/2920. Ralph is just heading out for about a week or so; it may not get fixed until he comes back. > On Feb 3, 2017, at 2:03 AM, Siegmar Gross > wrote: > > Hi, > > I have installed

[OMPI users] problem with opal_list_remove_item for openmpi-v2.x-201702010255-8b16747 on Linux

2017-02-02 Thread Siegmar Gross
Hi, I have installed openmpi-v2.x-201702010255-8b16747 on my "SUSE Linux Enterprise Server 12.2 (x86_64)" with Sun C 5.14 and gcc-6.3.0. Unfortunately, I get a warning from "opal_list_remove_item" about a missing item when I run one of my programs. loki spawn 115 mpiexec -np 1 --host

Re: [OMPI users] Problem with double shared library

2016-10-28 Thread Sean Ahern
Gilles, You described the problem exactly. I think we were able to nail down a solution to this one through judicious use of the -rpath $MPI_DIR/lib linker flag, allowing the runtime linker to properly find OpenMPI symbols at runtime. We're operational. Thanks for your help. -Sean -- Sean Ahern

Re: [OMPI users] Problem building OpenMPI with CUDA 8.0

2016-10-24 Thread Gilles Gouaillardet
org] On Behalf Of Justin Luitjens Sent: Tuesday, October 18, 2016 9:53 AM To: users@lists.open-mpi.org Subject: [OMPI users] Problem building OpenMPI with CUDA 8.0 I have the release version of CUDA 8.0 installed and am trying to build OpenMPI. Here is my configure and build line: ./conf

Re: [OMPI users] Problem building OpenMPI with CUDA 8.0

2016-10-24 Thread Brice Goglin
t at the following path: >>> ${CUDA_HOME}/lib64/stubs >>> For 8.0 I’d suggest updating the configure/make scripts to look >>> for nvml there and link in the stubs. This way the build is not >>> dependent on the driver being installed and only the toolkit. >>

Re: [OMPI users] Problem building OpenMPI with CUDA 8.0

2016-10-23 Thread Gilles Gouaillardet
way the build is not dependent on the driver being installed and only the toolkit. Thanks, Justin From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf Of Justin Luitjens Sent: Tuesday, October 18, 2016 9:53 AM To: users@lists.open-mpi.org Subject: [OMPI users] Problem building OpenMPI

Re: [OMPI users] Problem building OpenMPI with CUDA 8.0

2016-10-19 Thread Jeff Squyres (jsquyres)
, October 18, 2016 9:53 AM > To: users@lists.open-mpi.org > Subject: [OMPI users] Problem building OpenMPI with CUDA 8.0 > > I have the release version of CUDA 8.0 installed and am trying to build > OpenMPI. > > Here is my configure and build line: > > ./config

Re: [OMPI users] Problem building OpenMPI with CUDA 8.0

2016-10-18 Thread Justin Luitjens
, October 18, 2016 9:53 AM To: users@lists.open-mpi.org Subject: [OMPI users] Problem building OpenMPI with CUDA 8.0 I have the release version of CUDA 8.0 installed and am trying to build OpenMPI. Here is my configure and build line: ./configure --prefix=$PREFIXPATH --with-cuda=$CUDA_HOME --with-tm

[OMPI users] Problem building OpenMPI with CUDA 8.0

2016-10-18 Thread Justin Luitjens
I have the release version of CUDA 8.0 installed and am trying to build OpenMPI. Here is my configure and build line: ./configure --prefix=$PREFIXPATH --with-cuda=$CUDA_HOME --with-tm= --with-openib= && make && sudo make install Where CUDA_HOME points to the cuda install path. When I run the

Re: [OMPI users] Problem with double shared library

2016-10-17 Thread Gilles Gouaillardet
Sean, if i understand correctly, your built a libtransport_mpi.so library that depends on Open MPI, and your main program dlopen libtransport_mpi.so. in this case, and at least for the time being, you need to use RTLD_GLOBAL in your dlopen flags. Cheers, Gilles On 10/18/2016 4:53

[OMPI users] Problem with double shared library

2016-10-17 Thread Sean Ahern
Folks, For our code, we have a communication layer that abstracts the code that does the actual transfer of data. We call these "transports", and we link them as shared libraries. We have created an MPI transport that compiles/links against OpenMPI 2.0.1 using the compiler wrappers. When I

Re: [OMPI users] Problem running an MPI program through the PBS manager

2016-09-26 Thread Mahmood Naderan
OK thank you very much. It is now running... Regards, Mahmood On Mon, Sep 26, 2016 at 2:04 PM, Gilles Gouaillardet < gilles.gouaillar...@gmail.com> wrote: > Mahmood, > > The node is defined in the PBS config, however it is not part of the > allocation (e.g. job) so it cannot be used, and

Re: [OMPI users] Problem running an MPI program through the PBS manager

2016-09-26 Thread Gilles Gouaillardet
Mahmood, The node is defined in the PBS config, however it is not part of the allocation (e.g. job) so it cannot be used, and hence the error message. In your PBS script, you do not need -np nor -host parameters to your mpirun command. Open MPI mpirun will automatically detect it is launched

[OMPI users] Problem running an MPI program through the PBS manager

2016-09-26 Thread Mahmood Naderan
Hi, When I run an MPI command through the terminal the program runs fine on the compute node specified in hosts.txt. However, when I put that command in a PBS script, it says that the compute node is not defined in the job manager's list. However, that node is actually defined in the job

Re: [OMPI users] Problem with specifying wrapper compiler mpifort

2016-09-12 Thread Mahmood Naderan
OK. Running "module unload rocks-openmpi" and putting that in ~/.bashrc will remove /opt/openmpi/lib from LD_LIBRARY_PATH. Thanks Gilles for your help. Regards, Mahmood On Mon, Sep 12, 2016 at 1:25 PM, Mahmood Naderan wrote: > It seems that it is part of rocks-openmpi.

Re: [OMPI users] Problem with specifying wrapper compiler mpifort

2016-09-12 Thread Mahmood Naderan
It seems that it is part of rocks-openmpi. I will find out how to remove it and will come back. Regards, Mahmood On Mon, Sep 12, 2016 at 1:06 PM, Gilles Gouaillardet wrote: > Mahmood, > > you need to manually remove /opt/openmpi/lib from your LD_LIBRARY_PATH > (or have

Re: [OMPI users] Problem with specifying wrapper compiler mpifort

2016-09-12 Thread Gilles Gouaillardet
Mahmood, you need to manually remove /opt/openmpi/lib from your LD_LIBRARY_PATH (or have your sysadmin do it if this is somehow done automatically) the point of configuring with --enable-mpirun-prefix-by-default is you do *not* need to add /export/apps/siesta/openmpi-1.8.8/lib in your

Re: [OMPI users] Problem with specifying wrapper compiler mpifort

2016-09-12 Thread Mahmood Naderan
Is the following output OK? ... Making install in util make[2]: Entering directory `/export/apps/siesta/openmpi-1.8.8/test/util' make[3]: Entering directory `/export/apps/siesta/openmpi-1.8.8/test/util' make[3]: Nothing to be done for `install-exec-am'. make[3]: Nothing to be done for

Re: [OMPI users] Problem with specifying wrapper compiler mpifort

2016-09-12 Thread Gilles Gouaillardet
Mahmood, I was suggesting you (re)configure (i assume you did it) the Open MPI 1.8.8 installed in /export/apps/siesta/openmpi-1.8.8 with --enable-mpirun-prefix-by-default Cheers, Gilles On 9/12/2016 4:51 PM, Mahmood Naderan wrote: >​ --enable-mpirun-prefix-by-default​ What is that? Does

Re: [OMPI users] Problem with specifying wrapper compiler mpifort

2016-09-12 Thread Gilles Gouaillardet
Basically, it means libs will be linked with -Wl,-rpath,/export/apps/siesta/openmpi-1.8.8/lib so if you run a.out with an empty $LD_LIBRARY_PATH, then it will look for the MPI libraries in /export/apps/siesta/openmpi-1.8.8/lib Cheers, Gilles On 9/12/2016 4:50 PM, Mahmood Naderan wrote:

Re: [OMPI users] Problem with specifying wrapper compiler mpifort

2016-09-12 Thread Mahmood Naderan
>​ --enable-mpirun-prefix-by-default​ What is that? Does that mean "configure 1.8.8 with the default one installed on the system"? Then that is not good I think because # /opt/openmpi/bin/ompi_info Package: Open MPI root@centos-6-3.localdomain Distribution

Re: [OMPI users] Problem with specifying wrapper compiler mpifort

2016-09-12 Thread Gilles Gouaillardet
That sounds good to me ! just to make it crystal clear ... assuming you configure'd your Open MPI 1.8.8 with --enable-mpirun-prefix-by-default (and if you did not, i do encourage you to do so), then all you need is to remove /opt/openmpi/lib from your LD_LIBRARY_PATH (e.g. you do *not*

Re: [OMPI users] Problem with specifying wrapper compiler mpifort

2016-09-12 Thread Mahmood Naderan
​>(i'd like to make sure you are not using IntelMPI libmpi.so.1 with Open MPI libmpi_mpifh.so.2, that can happen if Intel MPI >appears first in your LD_LIBRARY_PATH) # echo $LD_LIBRARY_PATH /opt/gridengine/lib/linux-x64:/opt/openmpi/lib # ls /opt/openmpi/lib libmpi.a libompitrace.a

Re: [OMPI users] Problem with specifying wrapper compiler mpifort

2016-09-12 Thread Gilles Gouaillardet
Hi, this is the relevant part of your config.log configure:1594: checking whether the Fortran compiler works configure:1600: ./a.out ./a.out: symbol lookup error: /export/apps/siesta/openmpi-1.8.8/lib/libmpi_mpifh.so.2: undefined symbol: mpi_fortran_weights_empty configure:1603: $? = 127

[OMPI users] Problem with specifying wrapper compiler mpifort

2016-09-12 Thread Mahmood Naderan
Hi, Following the suggestion by Gilles Gouaillardet ( https://mail-archive.com/users@lists.open-mpi.org/msg29688.html), I ran a configure command for a program like this ​# ../Src/configure FC=/export/apps/siesta/openmpi-1.8.8/bin/mpifort --with-blas=libopenblas.a --with-lapack=liblapack.a

Re: [OMPI users] problem with exceptions in Java interface

2016-08-29 Thread Graham, Nathaniel Richard
behalf of Gilles Gouaillardet <gilles.gouaillar...@gmail.com> Sent: Monday, August 29, 2016 6:16 AM To: Open MPI Users Subject: Re: [OMPI users] problem with exceptions in Java interface Hi Siegmar, I will review PR 1698 and wait some more feedback from the developers, they might have differen

Re: [OMPI users] problem with exceptions in Java interface

2016-08-29 Thread Gilles Gouaillardet
Hi Siegmar, I will review PR 1698 and wait some more feedback from the developers, they might have different views than mine. assuming PR 1698 does what you expect, it does not catch all user errors. for example, if you MPI_Send a buffer that is too short, the exception might be thrown at any

Re: [OMPI users] problem with exceptions in Java interface

2016-08-29 Thread Siegmar Gross
Hi Gilles, isn't it possible to pass all exceptions from the Java interface to the calling method? I can live with the current handling of exceptions as well, although some exceptions can be handled within my program and some will break my program even if I want to handle exceptions myself. I

Re: [OMPI users] problem with exceptions in Java interface

2016-08-29 Thread Gilles Gouaillardet
Siegmar and all, i am puzzled with this error. on one hand, it is caused by an invalid buffer (e.g. buffer size is 1, but user suggests size is 2) so i am fine with current behavior (e.g. java.lang.ArrayIndexOutOfBoundsException is thrown) /* if that was a C program, it would very likely

[OMPI users] problem with exceptions in Java interface

2016-08-29 Thread Siegmar Gross
Hi, I have installed v1.10.3-31-g35ba6a1, openmpi-v2.0.0-233-gb5f0a4f, and openmpi-dev-4691-g277c319 on my "SUSE Linux Enterprise Server 12 (x86_64)" with Sun C 5.14 beta and gcc-6.1.0. In May I had reported a problem with Java exceptions (PR 1698) which had been solved in June (PR 1803).

Re: [OMPI users] Problem when installing Rmpi package in HPC cluster

2016-07-11 Thread Bennet Fauber
We have found that virtually all Rmpi jobs need to be started with $ mpirun -np 1 R CMD BATCH This is, as I understand it, because the first R will initialize the MPI environment and then when you create the cluster, it wants to be able to start the rest of the processes. When you

Re: [OMPI users] Problem when installing Rmpi package in HPC cluster

2016-07-11 Thread Gilles Gouaillardet
Note this is just a workaround, this simply disables the mxm mtl (e.g. Mellanox optimized infiniband driver). basically, there are two ways to run a single task mpi program (a.out) - mpirun -np 1 ./a.out (this is the "standard" way) - ./a.out (aka singleton mode) the logs you posted do not

Re: [OMPI users] Problem when installing Rmpi package in HPC cluster

2016-07-11 Thread pan yang
Dear Gilles, I tried export OMPI_MCA_pml=ob1, and it worked! Thank you very much for your brilliant suggestion. By the way, I don't really understand what you mean by '*can you also extract the command that launches the test?*'... Cheers, Pan

Re: [OMPI users] Problem when installing Rmpi package in HPC cluster

2016-07-11 Thread Gilles Gouaillardet
That could be specific to mtl/mxm. Could you export OMPI_MCA_pml=ob1 and try again? Can you also extract the command that launches the test? I am curious whether this is via mpirun or as a singleton. Cheers, Gilles On Monday, July 11, 2016, pan yang wrote: > Dear

[OMPI users] Problem when installing Rmpi package in HPC cluster

2016-07-11 Thread pan yang
Dear OpenMPI community, I faced this problem while installing Rmpi: > install.packages('Rmpi',repos='http://cran.r-project.org ',configure.args=c( + '--with-Rmpi-include=/usr/mpi/gcc/openmpi-1.8.2/include/', + '--with-Rmpi-libpath=/usr/mpi/gcc/openmpi-1.8.2/lib64/', +

Re: [OMPI users] problem with exceptions in Java interface

2016-05-24 Thread Howard Pritchard
Hi Siegmar, Sorry for the delay, I seem to have missed this one. It looks like there's an error in the way the native methods process Java exceptions. The code correctly builds up an exception message for cases where the MPI C routine returns non-success, but not if the problem occurred in one of

Re: [OMPI users] problem about mpirun on two nodes

2016-05-23 Thread Jeff Squyres (jsquyres)
> Sent: Mon, May 23, 2016 9:13 am > Subject: Re: [OMPI users] problem about mpirun on two nodes > > On May 21, 2016, at 11:31 PM, dour...@aol.com wrote: >> >> I encountered a problem about mpirun and SSH when using OMPI 1.10.0 compiled >

Re: [OMPI users] problem about mpirun on two nodes

2016-05-23 Thread douraku
Subject: Re: [OMPI users] problem about mpirun on two nodes On May 21, 2016, at 11:31 PM, dour...@aol.com wrote: > > I encountered a problem about mpirun and SSH when using OMPI 1.10.0 compiled > with gcc, running on centos7.2. > When I execute mpirun on my 2 node cluster, I ge

Re: [OMPI users] problem about mpirun on two nodes

2016-05-23 Thread Jeff Squyres (jsquyres)
On May 21, 2016, at 11:31 PM, dour...@aol.com wrote: > > I encountered a problem about mpirun and SSH when using OMPI 1.10.0 compiled > with gcc, running on centos7.2. > When I execute mpirun on my 2 node cluster, I get the following errors pasted > below. > > [douraku@master home]$ mpirun -np

[OMPI users] problem with slot-list and openmpi-v2.x-dev-1441-g402abf9

2016-05-23 Thread Siegmar Gross
Hi, I installed openmpi-v2.x-dev-1441-g402abf9 on my "SUSE Linux Enterprise Server 12 (x86_64)" with Sun C 5.14 and gcc-6.1.0. Unfortunately I get a timeout error for "--slot-list". It's the same behaviour for both compilers. loki spawn 143 mpiexec -np 1 --host loki,loki,loki,nfs1,nfs1

[OMPI users] problem about mpirun on two nodes

2016-05-22 Thread douraku
Hi all I encountered a problem about mpirun and SSH when using OMPI 1.10.0 compiled with gcc, running on centos7.2. When I execute mpirun on my 2 node cluster, I get the following errors pasted below. [douraku@master home]$ mpirun -np 12 a.out Permission denied
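A common cause of "Permission denied" when mpirun starts remote daemons over ssh is that password-less ssh is not set up between the nodes. A hedged sketch of the usual key-based fix follows (node1 is a placeholder hostname; this is the generic remedy, not necessarily what resolved this particular thread):

```shell
# Generate a key pair once (skip if ~/.ssh/id_rsa already exists),
# then install the public key on every compute node.
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
ssh-copy-id douraku@node1        # repeat for each node in the cluster
ssh douraku@node1 true           # must now succeed without a password prompt
```

Once the last command returns silently with no prompt, mpirun can launch its remote orted daemons over ssh without interaction.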

[OMPI users] problem with exceptions in Java interface

2016-05-20 Thread Siegmar Gross
Hi, I tried MPI.ERRORS_RETURN in a small Java program with Open MPI 1.10.2 and master. I get the expected behaviour, if I use a wrong value for the root process in "bcast". Unfortunately I get an MPI or Java error message if I try to broadcast more data than available. Is this intended or is it

Re: [OMPI users] problem with ld for Sun C 5.14 beta and openmpi-dev-4010-g6c9d65c

2016-05-10 Thread Gilles Gouaillardet
Siegmar, this issue was previously reported at http://www.open-mpi.org/community/lists/devel/2016/05/18923.php i just pushed the patch Cheers, Gilles On 5/10/2016 2:27 PM, Siegmar Gross wrote: Hi, I tried to install openmpi-dev-4010-g6c9d65c on my "SUSE Linux Enterprise Server 12

[OMPI users] problem with ld for Sun C 5.14 beta and openmpi-dev-4010-g6c9d65c

2016-05-10 Thread Siegmar Gross
Hi, I tried to install openmpi-dev-4010-g6c9d65c on my "SUSE Linux Enterprise Server 12 (x86_64)" with Sun C 5.14 beta. Unfortunately "make" breaks with the following error. make[2]: Entering directory '/export2/src/openmpi-master/openmpi-dev-4010-g6c9d6 GENERATE mpi-ignore-tkr-sizeof.h

Re: [OMPI users] problem with Sun C 5.14 beta

2016-05-07 Thread Siegmar Gross
Hi Gilles, thank you very much for your help. Now C and C++ are link compatible. Kind regards Siegmar On 05/07/16 12:15, Gilles Gouaillardet wrote: Siegmar, per the config.log, you need to update your CXXFLAGS="-m64 -library=stlport4 -std=sun03" or just CXXFLAGS="-m64" Cheers, Gilles

Re: [OMPI users] problem with Sun C 5.14 beta

2016-05-07 Thread Gilles Gouaillardet
Siegmar, per the config.log, you need to update your CXXFLAGS="-m64 -library=stlport4 -std=sun03" or just CXXFLAGS="-m64" Cheers, Gilles On Saturday, May 7, 2016, Siegmar Gross < siegmar.gr...@informatik.hs-fulda.de> wrote: > Hi, > > today I tried to install openmpi-v1.10.2-176-g9d45e07 on my
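Gilles's suggestion amounts to re-running configure with the simplified flags; a minimal sketch (the install prefix is a placeholder, and any other flags from the original configure line would be kept as-is):

```shell
# Drop -library=stlport4 -std=sun03 from CXXFLAGS so the Sun C and
# C++ objects stay link-compatible, as suggested above.
./configure CXXFLAGS="-m64" --prefix=$HOME/openmpi-1.10.2
make && make install
```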

[OMPI users] problem with Sun C 5.14 beta

2016-05-07 Thread Siegmar Gross
Hi, today I tried to install openmpi-v1.10.2-176-g9d45e07 on my "SUSE Linux Enterprise Server 12 (x86_64)" with Sun C 5.14 beta. Unfortunately "configure" breaks, because it thinks that C and C++ are link incompatible. I used the following configure command.

[OMPI users] problem compiling Java programs with openmpi-v1.10.2-176-g9d45e07

2016-05-07 Thread Siegmar Gross
Hi, yesterday I installed openmpi-v1.10.2-176-g9d45e07 on my "SUSE Linux Enterprise Server 12 (x86_64)" with Sun C 5.13 and gcc-5.3.0. Unfortunately I have a problem compiling Java programs. loki java 124 ompi_info | grep -e "OPAL repo revision" -e "C compiler absolute" OPAL repo

Re: [OMPI users] Problem with 'orted: command not found'

2016-05-03 Thread Maciek Lewiński
Thank you! I have set up my env paths at the end of the script and thanks to you I just noticed that at the beginning of the bashrc script there's a simple IF that returns when it's opened as non-interactive. I moved my exports above it and it finally works. Again, thank you very much. 2016-05-03
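The fix described above can be sketched as follows. Distro-default ~/.bashrc files often return early for non-interactive shells, so PATH exports placed below that guard never reach ssh-launched commands such as orted; the exports must come before the guard. /opt/openmpi is a placeholder prefix, and the fragment is written to a scratch file rather than a real ~/.bashrc:

```shell
# Sketch of the .bashrc layout from this thread.
cat > bashrc_sketch <<'EOF'
# -- Open MPI exports first, visible to non-interactive ssh sessions --
export PATH=/opt/openmpi/bin:$PATH
export LD_LIBRARY_PATH=/opt/openmpi/lib:$LD_LIBRARY_PATH

# -- typical distro guard: stop here unless the shell is interactive --
case $- in
    *i*) ;;
    *) return ;;
esac
EOF

# Simulate what a non-interactive ssh login sees:
bash -c '. ./bashrc_sketch; echo "$PATH"'
```

Because the exports precede the early return, the simulated non-interactive shell prints a PATH that starts with /opt/openmpi/bin, which is exactly what orted needs when launched via ssh.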

Re: [OMPI users] Problem with 'orted: command not found'

2016-05-02 Thread Gilles Gouaillardet
If Open MPI is installed at the same path on every node, the easiest option is to re-configure with --enable-mpirun-prefix-by-default. Another option is to use `which mpirun` instead of mpirun, and yet another option is to mpirun --prefix=$USER/.openmpi Cheers, Gilles On Tuesday, May 3, 2016,
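The three options above, written out as commands (the install prefix /opt/openmpi and the process count are placeholders):

```shell
# Option 1: bake the install prefix into mpirun at build time.
./configure --enable-mpirun-prefix-by-default --prefix=/opt/openmpi

# Option 2: launch via the absolute path, so the remote orted
# daemons are started from the matching install tree.
`which mpirun` -np 4 ./a.out

# Option 3: pass the prefix explicitly at run time.
mpirun --prefix=/opt/openmpi -np 4 ./a.out
```

All three achieve the same thing: they let the remote nodes find orted without relying on the user's shell startup files to set PATH.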

Re: [OMPI users] Problem with 'orted: command not found'

2016-05-02 Thread Jeff Squyres (jsquyres)
Make sure you check that these paths are set for *non-interactive* logins. > On May 2, 2016, at 6:14 PM, Maciek Lewiński wrote: > > I already had correct paths in .bashrc: > > export >
