[OMPI users] OMPI CUDA IPC synchronisation/fail-silent problem

2014-08-26 Thread Christoph Winter
Hey all, to test the performance of my application I duplicated the call to the function that will issue the computation on two GPUs 5 times. During the 4th and 5th run of the algorithm, however, the algorithm yields different results (9 instead of 20): # datatype: double # datapoints: 2 #

[OMPI users] openmpi-1.8.1 Unable to compile on CentOS6.5

2014-08-26 Thread Syed Ahsan Ali
I have problems compiling openmpi-1.8.1 on a Linux machine. Kindly see the logs attached. configure.bz2 Description: BZip2 compressed data

Re: [OMPI users] OMPI CUDA IPC synchronisation/fail-silent problem

2014-08-26 Thread Rolf vandeVaart
Hi Christoph: I will try and reproduce this issue and will let you know what I find. There may be an issue with CUDA IPC support with certain traffic patterns. Rolf From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Christoph Winter Sent: Tuesday, August 26, 2014 2:46 AM To:

Re: [OMPI users] long initialization

2014-08-26 Thread Timur Ismagilov
Hello! Here are my timing results: $ time mpirun -n 1 ./hello_c Hello, world, I am 0 of 1, (Open MPI v1.9a1, package: Open MPI semenov@compiler-2 Distribution, ident: 1.9a1r32570, repo rev: r32570, Aug 21, 2014 (nightly snapshot tarball), 146) real 1m3.985s user 0m0.031s sys 0m0.083s Fri, 22 Aug

Re: [OMPI users] openmpi-1.8.1 Unable to compile on CentOS6.5

2014-08-26 Thread Ralph Castain
Looks like there is something wrong with your gfortran install: *** Fortran compiler checking for gfortran... gfortran checking whether we are using the GNU Fortran compiler... yes checking whether gfortran accepts -g... yes checking whether ln -s works... yes checking if Fortran compiler

Re: [OMPI users] openmpi-1.8.1 Unable to compile on CentOS6.5

2014-08-26 Thread Jeff Squyres (jsquyres)
Just to elaborate: as the error message implies, this error message was put there specifically to ensure that the Fortran compiler works before continuing any further. If the Fortran compiler is busted, configure exits with this help message. You can either fix your Fortran compiler, or use

[OMPI users] OpenMPI Remote Execution Problem (Application does not start)

2014-08-26 Thread Benjamin Giehle
Hello, I have a problem with running my MPI application on a remote machine. If I start the application via ssh everything works just fine, but if I use mpirun the application won't start. If I start the application on the local machine with mpirun it works too. ssh myhost ./myapp <- works

Re: [OMPI users] OpenMPI Remote Execution Problem (Application does not start)

2014-08-26 Thread Ralph Castain
Add --enable-debug to your configure, and then re-run the --host test and add "--leave-session-attached -mca plm_base_verbose 5 -mca oob_base_verbose 5" and let's see what's going on. On Aug 26, 2014, at 7:31 AM, Benjamin Giehle wrote: > Hello, > > i have a problem with

Re: [OMPI users] A daemon on node cl231 failed to start as expected

2014-08-26 Thread Pengcheng Wang
Hi Reuti, Thanks a lot for your help. The 'Openmp' PE in our clusters has the allocation rule 'pe_slots'. But I guess I can only use limited slots for my job under this PE... The command 'qacct -j jobID' gives the information below. It turns out the job might exceed its memory allocation. After

Re: [OMPI users] openmpi-1.8.1 Unable to compile on CentOS6.5

2014-08-26 Thread Syed Ahsan Ali
Hi Jeff and Ralph, I could have figured out the issue myself, but the problem was that I could not find the exact error line in config.log that you identified. The shared library libquadmath is present in the lib64 directory, so adding that path to the environment removed the error. Thank you guys for

Re: [OMPI users] long initialization

2014-08-26 Thread Ralph Castain
I think something may be messed up with your installation. I went ahead and tested this on a Slurm 2.5.4 cluster, and got the following: $ time mpirun -np 1 --host bend001 ./hello Hello, World, I am 0 of 1 [0 local peers]: get_cpubind: 0 bitmap 0,12 real 0m0.086s user 0m0.039s sys