Re: [OMPI users] Segfault with MPI + Cuda on multiple nodes

2014-08-19 Thread Maxime Boissonneault
a very friendly way to handle that error. -Original Message- From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Maxime Boissonneault Sent: Tuesday, August 19, 2014 10:39 AM To: Open MPI Users Subject: Re: [OMPI users] Segfault with MPI + Cuda on multiple nodes Hi, I believe I found

Re: [OMPI users] Segfault with MPI + Cuda on multiple nodes

2014-08-19 Thread Rolf vandeVaart
Open MPI Users
>Subject: Re: [OMPI users] Segfault with MPI + Cuda on multiple nodes
>
>Hi,
>I believe I found what the problem was. My script set the
>CUDA_VISIBLE_DEVICES based on the content of $PBS_GPUFILE. Since the
>GPUs were listed twice in the $PBS_GPUFILE because of the two n
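
As an illustration of the cause named in the message above, here is a minimal sketch of building a CUDA_VISIBLE_DEVICES value from GPU-file entries while dropping duplicates. The snippet is cut off, so how the duplicate entries actually broke the run is not shown here, and this is not the poster's script. The function name, the "<hostname>-gpu<index>" entry format, and the second host name gpu-k20-08 are assumptions made for the example; only gpu-k20-07 appears in the thread.

```python
import re


def visible_devices_from_gpufile(lines, hostname):
    """Build a CUDA_VISIBLE_DEVICES string for one host, without duplicates.

    Assumption: each entry looks like "<hostname>-gpu<index>".
    """
    seen = []
    for line in lines:
        entry = line.strip()
        match = re.fullmatch(r"(.+)-gpu(\d+)", entry)
        # Skip blank or unrecognized lines and entries for other hosts.
        if match is None or match.group(1) != hostname:
            continue
        index = int(match.group(2))
        # Keep each GPU index only once, even if the file lists it twice.
        if index not in seen:
            seen.append(index)
    return ",".join(str(i) for i in sorted(seen))


# Hypothetical file content in which every GPU of one host appears twice.
sample = [
    "gpu-k20-07-gpu0",
    "gpu-k20-07-gpu1",
    "gpu-k20-07-gpu0",
    "gpu-k20-07-gpu1",
    "gpu-k20-08-gpu0",
]

value = visible_devices_from_gpufile(sample, "gpu-k20-07")
print(value)
```

In a job script, the resulting string would be exported as CUDA_VISIBLE_DEVICES before launching the MPI program. Reading the real file and checking its actual format are left out of the sketch.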

Re: [OMPI users] Segfault with MPI + Cuda on multiple nodes

2014-08-19 Thread Maxime Boissonneault
-Original Message- From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Maxime Boissonneault Sent: Tuesday, August 19, 2014 8:55 AM To: Open MPI Users Subject: Re: [OMPI users] Segfault with MPI + Cuda on multiple nodes Hi, I recompiled OMPI 1.8.1 without Cuda and with debug, but it did

Re: [OMPI users] Segfault with MPI + Cuda on multiple nodes

2014-08-19 Thread Rolf vandeVaart
2014 8:55 AM
>To: Open MPI Users
>Subject: Re: [OMPI users] Segfault with MPI + Cuda on multiple nodes
>
>Hi,
>I recompiled OMPI 1.8.1 without Cuda and with debug, but it did not give me
>much more information.
>[mboisson@gpu-k20-07 simple_cuda_mpi]$ ompi_info | grep debug
>

Re: [OMPI users] Segfault with MPI + Cuda on multiple nodes

2014-08-19 Thread Maxime Boissonneault
: Monday, August 18, 2014 4:23 PM To: Open MPI Users Subject: [OMPI users] Segfault with MPI + Cuda on multiple nodes Hi, Since my previous thread (Segmentation fault in OpenMPI 1.8.1) kinda derailed into two problems, one of which has been addressed, I figured I would start a new, more precise

Re: [OMPI users] Segfault with MPI + Cuda on multiple nodes

2014-08-19 Thread Maxime Boissonneault
: [OMPI users] Segfault with MPI + Cuda on multiple nodes
Same thing :
[mboisson@gpu-k20-07 simple_cuda_mpi]$ export MALLOC_CHECK_=1
[mboisson@gpu-k20-07 simple_cuda_mpi]$ mpiexec -np 2 --map-by ppr:1:node cudampi_simple
malloc: using debugging hooks
malloc: using debugging hooks
[gpu-k20-07:47628

Re: [OMPI users] Segfault with MPI + Cuda on multiple nodes

2014-08-19 Thread Alex A. Granovsky
MPI Users Subject: [OMPI users] Segfault with MPI + Cuda on multiple nodes Hi, Since my previous thread (Segmentation fault in OpenMPI 1.8.1) kinda derailed into two problems, one of which has been addressed, I figured I would start a new, more precise and simple one. I reduced the code

Re: [OMPI users] Segfault with MPI + Cuda on multiple nodes

2014-08-19 Thread Alex A. Granovsky
Users Subject: [OMPI users] Segfault with MPI + Cuda on multiple nodes Hi, Since my previous thread (Segmentation fault in OpenMPI 1.8.1) kinda derailed into two problems, one of which has been addressed, I figured I would start a new, more precise and simple one. I reduced the code

Re: [OMPI users] Segfault with MPI + Cuda on multiple nodes

2014-08-18 Thread Maxime Boissonneault
? -Original Message- From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Maxime Boissonneault Sent: Monday, August 18, 2014 4:23 PM To: Open MPI Users Subject: [OMPI users] Segfault with MPI + Cuda on multiple nodes Hi, Since my previous thread (Segmentation fault in OpenMPI 1.8.1

Re: [OMPI users] Segfault with MPI + Cuda on multiple nodes

2014-08-18 Thread Maxime Boissonneault
users] Segfault with MPI + Cuda on multiple nodes Hi, Since my previous thread (Segmentation fault in OpenMPI 1.8.1) kinda derailed into two problems, one of which has been addressed, I figured I would start a new, more precise and simple one. I reduced the code to the minimum that would reproduce

Re: [OMPI users] Segfault with MPI + Cuda on multiple nodes

2014-08-18 Thread Alex A. Granovsky
Try the following: export MALLOC_CHECK_=1 and then run it again Kind regards, Alex Granovsky -Original Message- From: Maxime Boissonneault Sent: Tuesday, August 19, 2014 12:23 AM To: Open MPI Users Subject: [OMPI users] Segfault with MPI + Cuda on multiple nodes Hi, Since my

Re: [OMPI users] Segfault with MPI + Cuda on multiple nodes

2014-08-18 Thread Rolf vandeVaart
Maxime
>Boissonneault
>Sent: Monday, August 18, 2014 4:23 PM
>To: Open MPI Users
>Subject: [OMPI users] Segfault with MPI + Cuda on multiple nodes
>
>Hi,
>Since my previous thread (Segmentation fault in OpenMPI 1.8.1) kinda
>derailed into two problems, one of which has been addre

[OMPI users] Segfault with MPI + Cuda on multiple nodes

2014-08-18 Thread Maxime Boissonneault
Hi, Since my previous thread (Segmentation fault in OpenMPI 1.8.1) kinda derailed into two problems, one of which has been addressed, I figured I would start a new, more precise and simple one. I reduced the code to the minimum that would reproduce the bug. I have pasted it here: