Re: [OMPI users] segfault during MPI_Isend when transmitting GPU arrays between multiple GPUs

2015-03-27 Thread Rolf vandeVaart
Hi Lev: I am not sure what is happening here, but there are a few things we can do to try and narrow things down. 1. If you run with --mca btl_smcuda_use_cuda_ipc 0, then I assume this error will go away? 2. Do you know, when you see this error, whether it happens on the first pass through your
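(For reference, such MCA parameters are typically passed on the mpirun command line, e.g. mpirun --mca btl_smcuda_use_cuda_ipc 0 -np 2 python app.py, where the script name and process count are placeholders, not values from this thread.)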

[OMPI users] segfault during MPI_Isend when transmitting GPU arrays between multiple GPUs

2015-03-27 Thread Lev Givon
I'm using PyCUDA 2014.1 and mpi4py (git commit 3746586, uploaded today) built against OpenMPI 1.8.4 with CUDA support activated to asynchronously send GPU arrays between multiple Tesla GPUs (Fermi generation). Each MPI process is associated with a single GPU; the process has a run loop that starts
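(For context, a minimal sketch of the non-blocking send/receive pattern being described, not the poster's code: host numpy buffers stand in for the PyCUDA GPU arrays, which a CUDA-aware Open MPI build would accept directly; sizes, tags, and ranks are placeholders.)

    from mpi4py import MPI
    import numpy as np

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()

    # Stand-in for the per-GPU array; with CUDA-aware Open MPI the device
    # buffer itself would be handed to Isend/Irecv instead of a host copy.
    buf = np.arange(1024, dtype=np.float64)

    if rank == 0:
        req = comm.Isend([buf, MPI.DOUBLE], dest=1, tag=0)
        req.Wait()
    elif rank == 1:
        out = np.empty_like(buf)
        comm.Irecv([out, MPI.DOUBLE], source=0, tag=0).Wait()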

Re: [OMPI users] [EXTERNAL] Re: Errors on POWER8 Ubuntu 14.04u2

2015-03-27 Thread Hammond, Simon David (-EXP)
Thanks guys, I have tried two configure lines: (1) ./configure --prefix=/home/projects/power8/openmpi/1.8.4/gnu/4.8.2/cuda/none --enable-mpi-thread-multiple CC=/usr/bin/gcc CXX=/usr/bin/g++ FC=/usr/bin/gfortran (2) ./configure --prefix=/home/projects/power8/openmpi/1.8.4/gnu/4.8.2/cuda/none
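(A common way to check what such a build actually enabled is to inspect the installed ompi_info output, e.g. ompi_info | grep -i thread to confirm MPI_THREAD_MULTIPLE support; the exact wording of that line may vary between releases.)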

Re: [OMPI users] Errors on POWER8 Ubuntu 14.04u2

2015-03-27 Thread Jeff Squyres (jsquyres)
It might be helpful to send all the information listed here: http://www.open-mpi.org/community/help/ > On Mar 26, 2015, at 10:55 PM, Ralph Castain wrote: > > Could you please send us your configure line? > >> On Mar 26, 2015, at 4:47 PM, Hammond, Simon David (-EXP)

Re: [hwloc-users] lstopo on Kaveri

2015-03-27 Thread Brice Goglin
Hello, That's an interesting question: even if the GPU is physically located inside the die, it is exposed as a "virtual" PCI device (vendor number 1002 and model number 130f); that's how we detect it, and that's how the driver configures it. Many components of the CPU die are configured
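(As a side note, those IDs can be cross-checked outside hwloc on Linux, where sysfs exposes the same vendor/device numbers; a minimal sketch, assuming the standard /sys/bus/pci layout:)

    import glob, os

    # List PCI devices whose vendor ID is 0x1002 (AMD/ATI), printing the bus
    # address and the vendor/device pair that lstopo shows as "PCI 1002:130f".
    for dev in sorted(glob.glob("/sys/bus/pci/devices/*")):
        with open(os.path.join(dev, "vendor")) as f:
            vendor = f.read().strip()
        with open(os.path.join(dev, "device")) as f:
            device = f.read().strip()
        if vendor == "0x1002":
            print(os.path.basename(dev), vendor, device)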

[hwloc-users] lstopo on Kaveri

2015-03-27 Thread Samy CHBINOU
Hello, I run lstopo on my APU A10-7850K (4 CPUs + 8 GPUs); they are detected (see included picture), but the 8 GPUs are detected on the PCI bus, while they are on the same die as the CPUs and directly share parts of the RAM. ... I don't understand the meaning of the numbers PCI 1002:130f