Re: [OMPI users] Segfault with MPI + Cuda on multiple nodes

2014-08-18 Thread Rolf vandeVaart
Just to help reduce the scope of the problem, can you retest with a non-CUDA-aware Open MPI 1.8.1? And if possible, use --enable-debug in the configure line to help with the stack trace? >-Original Message- >From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Maxime
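A typical configure line for such a test build (install prefix and make parallelism are only examples; note the absence of --with-cuda) might be:
  ./configure --prefix=$HOME/openmpi-1.8.1-nocuda --enable-debug
  make -j8 && make install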

Re: [OMPI users] Segfault with MPI + Cuda on multiple nodes

2014-08-19 Thread Rolf vandeVaart
gpu-k20-08:46045] *** End of error message *** >-- >mpiexec noticed that process rank 1 with PID 46045 on node gpu-k20-08 >exited on signal 11 (Segmentation fault). >---

Re: [OMPI users] Segfault with MPI + Cuda on multiple nodes

2014-08-19 Thread Rolf vandeVaart
odes, I had >CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7,0,1,2,3,4,5,6,7 > >instead of >CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 > >Sorry for the false bug and thanks for directing me toward the solution. > >Maxime > > >Le 2014-08-19 09:15, Rolf vandeVaart a écrit : >>

Re: [OMPI users] OMPI CUDA IPC synchronisation/fail-silent problem

2014-08-26 Thread Rolf vandeVaart
Hi Christoph: I will try and reproduce this issue and will let you know what I find. There may be an issue with CUDA IPC support with certain traffic patterns. Rolf From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Christoph Winter Sent: Tuesday, August 26, 2014 2:46 AM To:

[OMPI users] CUDA-aware Users

2014-10-09 Thread Rolf vandeVaart
If you are utilizing the CUDA-aware support in Open MPI, can you send me an email with some information about the application and the cluster you are on. I will consolidate information. Thanks, Rolf (rvandeva...@nvidia.com)

Re: [OMPI users] CuEventCreate Failed...

2014-10-19 Thread Rolf vandeVaart
The error 304 corresponds to CUDA_ERROR_OPERATING_SYSTEM which means an OS call failed. However, I am not sure how that relates to the call that is getting the error. Also, the last error you report is from MVAPICH2-GDR, not from Open MPI. I guess then I have a few questions. 1. Can

Re: [OMPI users] CuEventCreate Failed...

2014-10-20 Thread Rolf vandeVaart
. Also, our defaults for openmpi-mca-params.conf are: mtl=^mxm btl=^usnic,tcp btl_openib_flags=1 service nv_peer_mem status nv_peer_mem module is loaded. Kindest Regards, - Steven Eliuk, From: Rolf vandeVaart <rvandeva...@nvidia.com<mailto:rvandeva...@nvidia.com>> Reply-To: Op

Re: [OMPI users] Randomly long (100ms vs 7000+ms) fulfillment of MPI_Ibcast

2014-11-06 Thread Rolf vandeVaart
The CUDA person is now responding. I will try and reproduce. I looked through the zip file but did not see the mpirun command. Can this be reproduced with -np 4 running across four nodes? Also, in your original message you wrote "Likewise, it doesn't matter if I enable CUDA support or not.

Re: [OMPI users] Segmentation fault when using CUDA Aware feature

2015-01-12 Thread Rolf vandeVaart
That is strange, not sure why that is happening. I will try to reproduce with your program on my system. Also, perhaps you could rerun with --mca mpi_common_cuda_verbose 100 and send me that output. Thanks From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Xun Gong Sent: Sunday,

Re: [OMPI users] Segmentation fault when using CUDA Aware feature

2015-01-12 Thread Rolf vandeVaart
I think I found a bug in your program with how you were allocating the GPU buffers. I will send you a version offlist with the fix. Also, there is no need to rerun with the flags I had mentioned below. Rolf From: Rolf vandeVaart Sent: Monday, January 12, 2015 9:38 AM To: us...@open-mpi.org

Re: [OMPI users] GPUDirect with OpenMPI

2015-02-11 Thread Rolf vandeVaart
Let me try to reproduce this. This should not have anything to do with GPU Direct RDMA. However, to eliminate it, you could run with: --mca btl_openib_want_cuda_gdr 0. Rolf From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Aulwes, Rob Sent: Wednesday, February 11, 2015 2:17 PM To:
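A full command line for that experiment, with a placeholder rank count and binary, would look roughly like:
  mpirun -np 2 --mca btl_openib_want_cuda_gdr 0 ./a.out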

Re: [OMPI users] GPUDirect with OpenMPI

2015-03-03 Thread Rolf vandeVaart
retry with a pre-release version of Open MPI 1.8.5 that is available here and confirm it fixes your issue. Any of the ones listed on that page should be fine. http://www.open-mpi.org/nightly/v1.8/ Thanks, Rolf From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Rolf vandeVaart Sent

Re: [OMPI users] issue with openmpi + CUDA

2015-03-26 Thread Rolf vandeVaart
Hi Jason: The issue is that Open MPI is (presumably) a 64 bit application and it is trying to load up a 64-bit libcuda.so.1 but not finding one. Making the link as you did will not fix the problem (as you saw). In all my installations, I also have a 64-bit driver installed in

Re: [OMPI users] segfault during MPI_Isend when transmitting GPU arrays between multiple GPUs

2015-03-27 Thread Rolf vandeVaart
Hi Lev: I am not sure what is happening here but there are a few things we can do to try and narrow things down. 1. If you run with --mca btl_smcuda_use_cuda_ipc 0 then I assume this error will go away? 2. Do you know if when you see this error it happens on the first pass through your
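As a sketch of step 1 (rank count and binary are placeholders):
  mpirun -np 2 --mca btl_smcuda_use_cuda_ipc 0 ./a.out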

Re: [OMPI users] segfault during MPI_Isend when transmitting GPU arrays between multiple GPUs

2015-03-30 Thread Rolf vandeVaart
>-Original Message- >From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Lev Givon >Sent: Sunday, March 29, 2015 10:11 PM >To: Open MPI Users >Subject: Re: [OMPI users] segfault during MPI_Isend when transmitting GPU >arrays between multiple GPUs > >Recei

Re: [OMPI users] segfault during MPI_Isend when transmitting GPU arrays between multiple GPUs

2015-03-30 Thread Rolf vandeVaart
>-Original Message- >From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Rolf >vandeVaart >Sent: Monday, March 30, 2015 9:37 AM >To: Open MPI Users >Subject: Re: [OMPI users] segfault during MPI_Isend when transmitting GPU >arrays between multiple GPUs >

Re: [OMPI users] Different HCA from different OpenMP threads (same rank using MPI_THREAD_MULTIPLE)

2015-04-06 Thread Rolf vandeVaart
It is my belief that you cannot do this at least with the openib BTL. The IB card to be used for communication is selected during the MPI_Init() phase based on where the CPU process is bound to. You can see some of this selection by using the --mca btl_base_verbose 1 flag. There is a bunch

Re: [OMPI users] getting OpenMPI 1.8.4 w/ CUDA to look for absolute path to libcuda.so.1

2015-04-29 Thread Rolf vandeVaart
Hi Lev: Any chance you can try Open MPI 1.8.5rc3 and see if you see the same behavior? That code has changed a bit from the 1.8.4 series and I am curious if you will still see the same issue. http://www.open-mpi.org/software/ompi/v1.8/downloads/openmpi-1.8.5rc3.tar.gz Thanks, Rolf

Re: [OMPI users] cuIpcOpenMemHandle failure when using OpenMPI 1.8.5 with CUDA 7.0 and Multi-Process Service

2015-05-19 Thread Rolf vandeVaart
I am not sure why you are seeing this. One thing that is clear is that you have found a bug in the error reporting. The error message is a little garbled and I see a bug in what we are reporting. I will fix that. If possible, could you try running with --mca btl_smcuda_use_cuda_ipc 0. My

Re: [OMPI users] cuIpcOpenMemHandle failure when using OpenMPI 1.8.5 with CUDA 7.0 and Multi-Process Service

2015-05-20 Thread Rolf vandeVaart
-Original Message- >From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Lev Givon >Sent: Tuesday, May 19, 2015 10:25 PM >To: Open MPI Users >Subject: Re: [OMPI users] cuIpcOpenMemHandle failure when using >OpenMPI 1.8.5 with CUDA 7.0 and Multi-Process Service > &

Re: [OMPI users] cuIpcOpenMemHandle failure when using OpenMPI 1.8.5 with CUDA 7.0 and Multi-Process Service

2015-05-21 Thread Rolf vandeVaart
ti-Process Service > >Received from Lev Givon on Thu, May 21, 2015 at 11:32:33AM EDT: >> Received from Rolf vandeVaart on Wed, May 20, 2015 at 07:48:15AM EDT: >> >> (snip) >> >> > I see that you mentioned you are starting 4 MPS daemons. Are

Re: [OMPI users] Problems running linpack benchmark on old Sunfire opteron nodes

2015-05-26 Thread Rolf vandeVaart
I think we bumped up a default value in Open MPI 1.8.5. To go back to the old 64Mbyte value try running with: --mca mpool_sm_min_size 67108864 Rolf From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Aurélien Bouteiller Sent: Tuesday, May 26, 2015 10:10 AM To: Open MPI Users Subject:

Re: [OMPI users] CUDA-aware MPI_Reduce problem in Openmpi 1.8.5

2015-06-17 Thread Rolf vandeVaart
-aware MPI_Reduce problem in Openmpi 1.8.5 Hi Rolf, Thank you very much for clarifying the problem. Is there any plan to support GPU RDMA for reduction in the future? On Jun 17, 2015, at 1:38 PM, Rolf vandeVaart <rvandeva...@nvidia.com<mailto:rvandeva...@nvidia.com>> wro

Re: [OMPI users] 1.8.6 w/ CUDA 7.0 & GDR Huge Memory Leak

2015-06-30 Thread Rolf vandeVaart
how you observed the behavior. Does the code need to run for a while to see this? Any suggestions on how I could reproduce this? Thanks, Rolf From: Steven Eliuk [mailto:s.el...@samsung.com] Sent: Tuesday, June 30, 2015 6:05 PM To: Rolf vandeVaart Cc: Open MPI Users Subject: 1.8.6 w/ CUDA 7.0

Re: [OMPI users] 1.8.6 w/ CUDA 7.0 & GDR Huge Memory Leak

2015-07-01 Thread Rolf vandeVaart
Hi Stefan (and Steven who reported this earlier with CUDA-aware program) I have managed to observe the leak when running LAMMPS as well. Note that this has nothing to do with CUDA-aware features. I am going to move this discussion to the Open MPI developer’s list to dig deeper into this

Re: [OMPI users] 1.8.6 w/ CUDA 7.0 & GDR Huge Memory Leak

2015-07-06 Thread Rolf vandeVaart
Just an FYI that this issue has been found and fixed and will be available in the next release. https://github.com/open-mpi/ompi-release/pull/357 Rolf From: Rolf vandeVaart Sent: Wednesday, July 01, 2015 4:47 PM To: us...@open-mpi.org Subject: RE: [OMPI users] 1.8.6 w/ CUDA 7.0 & GDR

Re: [OMPI users] openmpi 1.8.7 build error with cuda support using pgi compiler 15.4

2015-08-04 Thread Rolf vandeVaart
Hi Shahzeb: I believe another colleague of mine may have helped you with this issue (I was not around last week). However, to help me better understand the issue you are seeing, could you send me your config.log file from when you did the configuration? You can just send to

Re: [OMPI users] CUDA Buffers: Enforce asynchronous memcpy's

2015-08-11 Thread Rolf vandeVaart
I talked with Jeremia off list and we figured out what was going on. There is the ability to use the cuMemcpyAsync/cuStreamSynchronize rather than the cuMemcpy but it was never made the default for Open MPI 1.8 series. So, to get that behavior you need the following: --mca

Re: [OMPI users] CUDA Buffers: Enforce asynchronous memcpy's

2015-08-12 Thread Rolf vandeVaart
com> www.ibm.com<http://www.ibm.com> - Original message - From: Rolf vandeVaart <rvandeva...@nvidia.com<mailto:rvandeva...@nvidia.com>> Sent by: "users" <users-boun...@open-mpi.org<mailto:users-boun...@open-mpi.org>> To: Open MPI Users <

Re: [OMPI users] cuda aware mpi

2015-08-21 Thread Rolf vandeVaart
No, it is not. You have to use pml ob1 which will pull in the smcuda and openib BTLs which have CUDA-aware built into them. Rolf From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Subhra Mazumdar Sent: Friday, August 21, 2015 12:18 AM To: Open MPI Users Subject: [OMPI users] cuda
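For example, assuming a placeholder binary and rank count:
  mpirun -np 2 --mca pml ob1 ./your_cuda_app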

Re: [OMPI users] Wrong distance calculations in multi-rail setup?

2015-08-28 Thread Rolf vandeVaart
I am not sure why the distances are being computed as you are seeing. I do not have a dual rail card system to reproduce with. However, short term, I think you could get what you want by running like the following. The first argument tells the selection logic to ignore locality, so both cards

Re: [OMPI users] Wrong distance calculations in multi-rail setup?

2015-08-28 Thread Rolf vandeVaart
/ where I can >look, I could help to find the issue. > >Thanks a lot! > >Marcin > > >On 08/28/2015 05:28 PM, Rolf vandeVaart wrote: >> I am not sure why the distances are being computed as you are seeing. I do >not have a dual rail card system to reproduce with

Re: [OMPI users] tracking down what's causing a cuIpcOpenMemHandle error emitted by OpenMPI

2015-09-03 Thread Rolf vandeVaart
Lev: Can you run with --mca mpi_common_cuda_verbose 100 --mca mpool_rgpusm_verbose 100 and send me (rvandeva...@nvidia.com) the output of that. Thanks, Rolf >-Original Message- >From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Lev Givon >Sent: Wednesday, September 02, 2015
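A sketch of that run, with a placeholder binary and the output captured to a file:
  mpirun -np 2 --mca mpi_common_cuda_verbose 100 --mca mpool_rgpusm_verbose 100 ./a.out 2>&1 | tee cuda-verbose.log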

Re: [OMPI users] How does MPI_Allreduce work?

2015-09-25 Thread Rolf vandeVaart
Hello Yang: It is not clear to me if you are asking about a CUDA-aware build of Open MPI where you do the MPI_Allreduce() on the GPU buffer or if you are handling staging the GPU buffer into host memory and then calling the MPI_Allreduce(). Either way, they are somewhat similar. With CUDA-aware, the

Re: [OMPI users] How does MPI_Allreduce work?

2015-09-25 Thread Rolf vandeVaart
> >Sent by Apple Mail > >Yang ZHANG > >PhD candidate > >Networking and Wide-Area Systems Group >Computer Science Department >New York University > >715 Broadway Room 705 >New York, NY 10003 > >> On Sep 25, 2015, at 11:07 AM, Rolf vandeVaart <rva

Re: [OMPI users] Compiling openmpi 1.6.4 without CUDA

2013-05-20 Thread Rolf vandeVaart
I can speak to part of your issue. There are no CUDA-aware features in the 1.6 series of Open MPI. Therefore, the various configure flags you tried would not affect Open MPI itself. Those configure flags are relevant with the 1.7 series and later, but as the FAQ says, the CUDA-aware feature

Re: [OMPI users] Application hangs on mpi_waitall

2013-06-27 Thread Rolf vandeVaart
Ed, how large are the messages that you are sending and receiving? Rolf From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Ed Blosch Sent: Thursday, June 27, 2013 9:01 AM To: us...@open-mpi.org Subject: Re: [OMPI users] Application hangs on mpi_waitall It ran a

Re: [OMPI users] Support for CUDA and GPU-direct with OpenMPI 1.6.5 an 1.7.2

2013-07-08 Thread Rolf vandeVaart
With respect to the CUDA-aware support, Ralph is correct. The ability to send and receive GPU buffers is in the Open MPI 1.7 series. And incremental improvements will be added to the Open MPI 1.7 series. CUDA 5.0 is supported. From: users-boun...@open-mpi.org

Re: [OMPI users] Trouble configuring 1.7.2 for Cuda 5.0.35

2013-08-14 Thread Rolf vandeVaart
It is looking for the libcuda.so file, not the libcudart.so file. So, maybe --with-libdir=/usr/lib64 You need to be on a machine with the CUDA driver installed. What was your configure command? http://www.open-mpi.org/faq/?category=building#build-cuda Rolf >-Original Message-
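Putting that suggestion together with the standard CUDA flag from the linked FAQ (both paths are only examples for a typical 64-bit install), the configure line would look like:
  ./configure --with-cuda=/usr/local/cuda --with-libdir=/usr/lib64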

Re: [OMPI users] Trouble configuring 1.7.2 for Cuda 5.0.35

2013-08-14 Thread Rolf vandeVaart
3 2:59 PM >To: Open MPI Users >Cc: Rolf vandeVaart >Subject: Re: [OMPI users] Trouble configuring 1.7.2 for Cuda 5.0.35 > >Thank you for the quick reply Rolf, > I personally don't know the Cuda libraries. I was hoping there had been a >name change. I am on a Cray XT-7.

[OMPI users] CUDA-aware usage

2013-10-01 Thread Rolf vandeVaart
We have done some work over the last year or two to add some CUDA-aware support into the Open MPI library. Details on building and using the feature are here. http://www.open-mpi.org/faq/?category=building#build-cuda http://www.open-mpi.org/faq/?category=running#mpi-cuda-support I am looking

Re: [OMPI users] Build Failing for OpenMPI 1.7.2 and CUDA 5.5.11

2013-10-07 Thread Rolf vandeVaart
That might be a bug. While I am checking, you could try configuring with this additional flag: --enable-mca-no-build=pml-bfo Rolf >-Original Message- >From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Hammond, >Simon David (-EXP) >Sent: Monday, October 07, 2013 3:30 PM >To:
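As a sketch (the CUDA path is an example), the workaround configure line would be:
  ./configure --with-cuda=/usr/local/cuda --enable-mca-no-build=pml-bfo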

Re: [OMPI users] [EXTERNAL] Re: Build Failing for OpenMPI 1.7.2 and CUDA 5.5.11

2013-10-07 Thread Rolf vandeVaart
>Laboratories, NM, USA > > > > > > >On 10/7/13 1:47 PM, "Rolf vandeVaart" <rvandeva...@nvidia.com> wrote: > >>That might be a bug. While I am checking, you could try configuring with >>this additional flag: >> >>--enable-mca-no-bu

Re: [OMPI users] OpenMPI-1.7.3 - cuda support

2013-10-30 Thread Rolf vandeVaart
Let me try this out and see what happens for me. But yes, please go ahead and send me the complete backtrace. Rolf From: users [mailto:users-boun...@open-mpi.org] On Behalf Of KESTENER Pierre Sent: Wednesday, October 30, 2013 11:34 AM To: us...@open-mpi.org Cc: KESTENER Pierre Subject: [OMPI

Re: [OMPI users] OpenMPI-1.7.3 - cuda support

2013-10-30 Thread Rolf vandeVaart
The CUDA-aware support is only available when running with the verbs interface to Infiniband. It does not work with the PSM interface which is being used in your installation. To verify this, you need to disable the usage of PSM. This can be done in a variety of ways, but try running like
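One way to force the verbs path instead of PSM (rank count and binary are placeholders):
  mpirun -np 2 --mca pml ob1 ./a.out
Another option is to exclude the PSM MTL explicitly with --mca mtl ^psm.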

Re: [OMPI users] Bug in MPI_REDUCE in CUDA-aware MPI

2013-12-02 Thread Rolf vandeVaart
Thanks for the report. CUDA-aware Open MPI does not currently support doing reduction operations on GPU memory. Is this a feature you would be interested in? Rolf >-Original Message- >From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Peter Zaspel >Sent: Friday, November 29,

Re: [OMPI users] Bug in MPI_REDUCE in CUDA-aware MPI

2013-12-02 Thread Rolf vandeVaart
asoning for this? Is there some documentation, >which MPI calls are CUDA-aware and which not? > >Best regards > >Peter > > > >On 12/02/2013 02:18 PM, Rolf vandeVaart wrote: >> Thanks for the report. CUDA-aware Open MPI does not currently support >do

Re: [OMPI users] Cuda Aware MPI Problem

2013-12-13 Thread Rolf vandeVaart
Yes, this was a bug with Open MPI 1.7.3. I could not reproduce it, but it was definitely an issue in certain configurations. Here was the fix. https://svn.open-mpi.org/trac/ompi/changeset/29762 We fixed it in Open MPI 1.7.4 and the trunk version, so as you have seen, they do not have the

Re: [OMPI users] Configure issue with/without HWLOC when PGI used and CUDA support enabled

2014-02-14 Thread Rolf vandeVaart
I assume your first issue is happening because you configured hwloc with cuda support which creates a dependency on libcudart.so. Not sure why that would mess up Open MPI. Can you send me how you configured hwloc? I am not sure I understand the second issue. Open MPI puts everything in lib

Re: [OMPI users] 1.7.5rc1, error "COLL-ML ml_discover_hierarchy exited with error."

2014-03-03 Thread Rolf vandeVaart
Can you try running with --mca coll ^ml and see if things work? Rolf >-Original Message- >From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Filippo Spiga >Sent: Monday, March 03, 2014 7:14 PM >To: Open MPI Users >Subject: [OMPI users] 1.7.5rc1, error "COLL-ML
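For example, with a placeholder rank count and binary:
  mpirun -np 4 --mca coll ^ml ./a.out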

Re: [OMPI users] 1.7.5rc1, error "COLL-ML ml_discover_hierarchy exited with error."

2014-03-03 Thread Rolf vandeVaart
.12 >1048576 765.65 > > >Can you clarify exactly where the problem come from? > >Regards, >Filippo > > >On Mar 4, 2014, at 12:17 AM, Rolf vandeVaart <rvandeva...@nvidia.com> >wrote: >> Can you try running with --mca coll ^ml and see if

Re: [OMPI users] Advices for parameter tuning for CUDA-aware MPI

2014-05-27 Thread Rolf vandeVaart
Answers inline... >-Original Message- >From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Maxime >Boissonneault >Sent: Friday, May 23, 2014 4:31 PM >To: Open MPI Users >Subject: [OMPI users] Advices for parameter tuning for CUDA-aware MPI > >Hi, >I am currently configuring a GPU

Re: [OMPI users] Advices for parameter tuning for CUDA-aware MPI

2014-05-27 Thread Rolf vandeVaart
>-Original Message- >From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Maxime >Boissonneault >Sent: Tuesday, May 27, 2014 4:07 PM >To: Open MPI Users >Subject: Re: [OMPI users] Advices for parameter tuning for CUDA-aware MPI > >Answers inline too. >>> 2) Is the absence of

Re: [OMPI users] deprecated cuptiActivityEnqueueBuffer

2014-06-16 Thread Rolf vandeVaart
Do you need the VampirTrace support in your build? If not, you could add this to configure. --disable-vt >-Original Message- >From: users [mailto:users-boun...@open-mpi.org] On Behalf Of >jcabe...@computacion.cs.cinvestav.mx >Sent: Monday, June 16, 2014 1:40 PM >To: us...@open-mpi.org

Re: [OMPI users] Help with multirail configuration

2014-07-21 Thread Rolf vandeVaart
With Open MPI 1.8.1, the library will use the NIC that is "closest" to the CPU. There was a bug in earlier versions of Open MPI 1.8 so that did not happen. You can see this by running with some verbosity using the "btl_base_verbose" flag. For example, this is what I observed on a two node
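A sketch of such a run on two hypothetical nodes (the verbosity level is just an example):
  mpirun -np 2 -host node1,node2 --mca btl_base_verbose 100 ./a.out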

Re: [OMPI users] Test Program works on 1, 2 or 3 nodes. Hangs on 4 or more nodes.

2010-09-21 Thread Rolf vandeVaart
Ethan: Can you run just "hostname" successfully? In other words, a non-MPI program. If that does not work, then we know the problem is in the runtime. If it does work, then there is something with the way the MPI library is setting up its connections. Is there more than one interface on

Re: [OMPI users] [Rocks-Discuss] compiling Openmpi on solaris studio express

2010-11-29 Thread Rolf vandeVaart
This problem looks a lot like a thread from earlier today. Can you look at this ticket and see if it helps? It has a workaround documented in it. https://svn.open-mpi.org/trac/ompi/ticket/2632 Rolf On 11/29/10 16:13, Prentice Bisbal wrote: No, it looks like ld is being called with the

Re: [OMPI users] [Rocks-Discuss] compiling Openmpi on solaris studio express

2010-11-29 Thread Rolf vandeVaart
what is in the ticket. Rolf On 11/29/10 16:26, Nehemiah Dacres wrote: that looks about right. So the suggestion: ./configure LDFLAGS="-notpath ... ... ..." -notpath should be replaced by whatever the proper flag should be, in my case -L ? On Mon, Nov 29, 2010 at 3:1

Re: [OMPI users] One-sided datatype errors

2010-12-14 Thread Rolf vandeVaart
Hi James: I can reproduce the problem on a single node with Open MPI 1.5 and the trunk. I have submitted a ticket with the information. https://svn.open-mpi.org/trac/ompi/ticket/2656 Rolf On 12/13/10 18:44, James Dinan wrote: Hi, I'm getting strange behavior using datatypes in a one-sided

Re: [OMPI users] anybody tried OMPI with gpudirect?

2011-02-28 Thread Rolf vandeVaart
Hi Brice: Yes, I have tried OMPI 1.5 with gpudirect and it worked for me. You definitely need the patch or you will see the behavior just as you described, a hang. One thing you could try is disabling the large message RDMA in OMPI and see if that works. That can be done by adjusting the

Re: [OMPI users] anybody tried OMPI with gpudirect?

2011-02-28 Thread Rolf vandeVaart
] anybody tried OMPI with gpudirect? Le 28/02/2011 17:30, Rolf vandeVaart a écrit : > Hi Brice: > Yes, I have tried OMPI 1.5 with gpudirect and it worked for me. You > definitely need the patch or you will see the behavior just as you described, > a hang. One thing you could try

Re: [OMPI users] anybody tried OMPI with gpudirect?

2011-02-28 Thread Rolf vandeVaart
-Original Message- From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Brice Goglin Sent: Monday, February 28, 2011 2:14 PM To: Open MPI Users Subject: Re: [OMPI users] anybody tried OMPI with gpudirect? Le 28/02/2011 19:49, Rolf vandeVaart a écrit

Re: [OMPI users] Program hangs when using OpenMPI and CUDA

2011-06-06 Thread Rolf vandeVaart
Hi Fengguang: That is odd that you see the problem even when running with the openib flags set as Brice indicated. Just to be extra sure there are no typo errors in your flag settings, maybe you can verify with the ompi_info command like this? ompi_info -mca btl_openib_flags 304 -param btl

Re: [OMPI users] MPI hangs on multiple nodes

2011-09-20 Thread Rolf vandeVaart
>> 1: After a reboot of two nodes I ran again, and the inter-node freeze didn't >happen until the third iteration. I take that to mean that the basic >communication works, but that something is saturating. Is there some notion >of buffer size somewhere in the MPI system that could explain this? >

Re: [OMPI users] gpudirect p2p?

2011-10-14 Thread Rolf vandeVaart
>-Original Message- >From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] >On Behalf Of Chris Cooper >Sent: Friday, October 14, 2011 1:28 AM >To: us...@open-mpi.org >Subject: [OMPI users] gpudirect p2p? > >Hi, > >Are the recent peer to peer capabilities of cuda leveraged by

Re: [OMPI users] configure with cuda

2011-10-27 Thread Rolf vandeVaart
Actually, that is not quite right. From the FAQ: "This feature currently only exists in the trunk version of the Open MPI library." You need to download and use the trunk version for this to work. http://www.open-mpi.org/nightly/trunk/ Rolf From: users-boun...@open-mpi.org

Re: [OMPI users] How "CUDA Init prior to MPI_Init" co-exists with unique GPU for each MPI process?

2011-12-14 Thread Rolf vandeVaart
, December 14, 2011 10:47 AM To: Open MPI Users Cc: Rolf vandeVaart Subject: Re: [OMPI users] How "CUDA Init prior to MPI_Init" co-exists with unique GPU for each MPI process? Hi, Processes are not spawned by MPI_Init. They are spawned before by some applications between your mpirun cal

Re: [OMPI users] Problem running an mpi applicatio​n on nodes with more than one interface

2012-02-17 Thread Rolf vandeVaart
Open MPI cannot handle having two interfaces on a node on the same subnet. I believe it has to do with our matching code when we try to match up a connection. The result is a hang as you observe. I also believe it is not good practice to have two interfaces on the same subnet. If you put them
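Until the interfaces are moved to separate subnets, one possible workaround (interface name and binary are placeholders) is to restrict Open MPI to a single interface:
  mpirun -np 2 --mca btl_tcp_if_include eth0 ./a.out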

Re: [OMPI users] Open MPI 1.4.5 and CUDA support

2012-04-17 Thread Rolf vandeVaart
Yes, they are supported in the sense that they can work together. However, if you want to have the ability to send/receive GPU buffers directly via MPI calls, then I recommend you get CUDA 4.1 and use the Open MPI trunk. http://www.open-mpi.org/faq/?category=building#build-cuda Rolf From:

Re: [OMPI users] MPI and CUDA

2012-04-24 Thread Rolf vandeVaart
I am not sure about everything that is going wrong, but there are at least two issues I found. First, you are skipping the first line that you read from integers.txt. Maybe something like this instead. while(fgets(line, sizeof line, fp)!= NULL){ sscanf(line,"%d",[k]); sum = sum +

Re: [OMPI users] MPI over tcp

2012-05-03 Thread Rolf vandeVaart
I tried your program on a single node and it worked fine. Yes, TCP message passing in Open MPI has been working well for some time. I have a few suggestions. 1. Can you run something like hostname successfully (mpirun -np 10 -hostfile yourhostfile hostname) 2. If that works, then you can also

Re: [OMPI users] MPI over tcp

2012-05-04 Thread Rolf vandeVaart
>-Original Message- >From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] >On Behalf Of Don Armstrong >Sent: Thursday, May 03, 2012 5:43 PM >To: us...@open-mpi.org >Subject: Re: [OMPI users] MPI over tcp > >On Thu, 03 May 2012, Rolf vandeVaar

Re: [OMPI users] GPU and CPU timing - OpenMPI and Thrust

2012-05-08 Thread Rolf vandeVaart
You should be running with one GPU per MPI process. If I understand correctly, you have a 3 node cluster and each node has a GPU so you should run with np=3. Maybe you can try that and see if your numbers come out better. From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On

Re: [OMPI users] NVCC mpi.h: error: attribute "__deprecated__" does not take arguments

2012-06-18 Thread Rolf vandeVaart
Hi Dmitry: Let me look into this. Rolf From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Dmitry N. Mikushin Sent: Monday, June 18, 2012 10:56 AM To: Open MPI Users Cc: Олег Рябков Subject: Re: [OMPI users] NVCC mpi.h: error: attribute "__deprecated__" does not

Re: [OMPI users] NVCC mpi.h: error: attribute "__deprecated__" does not take arguments

2012-06-18 Thread Rolf vandeVaart
en-mpi.org] On Behalf Of Rolf vandeVaart Sent: Monday, June 18, 2012 11:00 AM To: Open MPI Users Cc: Олег Рябков Subject: Re: [OMPI users] NVCC mpi.h: error: attribute "__deprecated__" does not take arguments Hi Dmitry: Let me look into this. Rolf From: users-boun...@open-mpi.org [mailto:users-

Re: [OMPI users] gpudirect p2p (again)?

2012-07-09 Thread Rolf vandeVaart
Yes, this feature is in Open MPI 1.7. It is implemented in the "smcuda" btl. If you configure as outlined in the FAQ, then things should just work. The smcuda btl will be selected and P2P will be used between GPUs on the same node. This is only utilized on transfers of buffers that are

Re: [OMPI users] bug in CUDA support for dual-processor systems?

2012-07-31 Thread Rolf vandeVaart
The current implementation does assume that the GPUs are on the same IOH and therefore can use the IPC features of the CUDA library for communication. One of the initial motivations for this was that to be able to detect whether GPUs can talk to one another, the CUDA library has to be

Re: [OMPI users] CUDA in v1.7? (was: Compilation of OpenMPI 1.5.4 & 1.6.X fail for PGI compiler...)

2012-08-09 Thread Rolf vandeVaart
>-Original Message- >From: Jeff Squyres [mailto:jsquy...@cisco.com] >Sent: Thursday, August 09, 2012 9:45 AM >To: Open MPI Users >Cc: Rolf vandeVaart >Subject: CUDA in v1.7? (was: Compilation of OpenMPI 1.5.4 & 1.6.X fail for PGI >compiler...) > >On Aug 9,

Re: [OMPI users] RDMA GPUDirect CUDA...

2012-08-14 Thread Rolf vandeVaart
To answer the original questions, Open MPI will look at taking advantage of the RDMA CUDA when it is available. Obviously, work needs to be done to figure out the best way to integrate into the library. Much like there are a variety of protocols under the hood to support host transfer of data

Re: [OMPI users] ompi-clean on single executable

2012-10-24 Thread Rolf vandeVaart
And just to give a little context, ompi-clean was created initially to "clean" up a node, not for cleaning up a specific job. It was for the case where MPI jobs would leave some files behind or leave some processes running. (I do not believe this happens much at all anymore.) But, as was

Re: [OMPI users] mpi_leave_pinned is dangerous

2012-11-08 Thread Rolf vandeVaart
Not sure. I will look into this. And thank you for the feedback Jens! Rolf >-Original Message- >From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] >On Behalf Of Jeff Squyres >Sent: Thursday, November 08, 2012 8:49 AM >To: Open MPI Users >Subject: Re: [OMPI users]

Re: [OMPI users] status of cuda across multiple IO hubs?

2013-03-11 Thread Rolf vandeVaart
Yes, unfortunately, that issue is still unfixed. I just created the ticket and included a possible workaround. https://svn.open-mpi.org/trac/ompi/ticket/3531 >-Original Message- >From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] >On Behalf Of Russell Power >Sent:

Re: [OMPI users] selectively bind MPI to one HCA out of available ones

2009-07-15 Thread Rolf Vandevaart
As Lenny said, you should use the if_include parameter. Specifically, it would look like this depending on which ones you want to select. -mca btl_openib_if_include mthca0 or -mca btl_openib_if_include mthca1 Rolf On 07/15/09 09:33, nee...@crlindia.com wrote: Thanks Ralph, i

Re: [OMPI users] Problem launching jobs in SGE (with loose integration), OpenMPI 1.3.3

2009-07-23 Thread Rolf Vandevaart
I think what you are looking for is this: --mca plm_rsh_disable_qrsh 1 This means we will disable the use of qrsh and use rsh or ssh instead. The --mca pls ^sge does not work anymore for two reasons. First, the "pls" framework was renamed "plm". Secondly, the gridengine plm was folded

Re: [OMPI users] problem w sge 6.2 & openmpi

2009-08-05 Thread Rolf Vandevaart
I assume it is working with np=8 because the 8 processes are getting launched on the same node as mpirun and therefore there is no call to qrsh to start up any remote processes. When you go beyond 8, mpirun calls qrsh to start up processes on some of the remote nodes. I would suggest first

Re: [OMPI users] pipes system limit

2009-08-07 Thread Rolf Vandevaart
This message is telling you that you have run out of file descriptors. I am surprised that the -mca parameter setting did not fix the problem. Can you run limit or ulimit on your shell and send the information? I typically set my limit to 65536 assuming the system allows it. burl-16 58
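For reference, the descriptor limit can be checked and raised from the shell (subject to what the system allows):
  ulimit -n          # sh/bash: show the current open-file limit
  ulimit -n 65536    # sh/bash: raise it for this shell
or, under csh/tcsh: limit descriptors 65536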

Re: [OMPI users] an MPI process using about 12 file descriptors per neighbour processes - isn't it a bit too much?

2009-08-14 Thread Rolf Vandevaart
Hi Paul: I tried the running the same way as you did and I saw the same thing. I was using ClusterTools 8.2 (Open MPI 1.3.3r21324) and running on Solaris. I looked at the mpirun process and it was definitely consuming approximately 12 file descriptors per a.out process. burl-ct-v440-0 59

Re: [OMPI users] Bad MPI_Bcast behaviour when running over openib

2009-09-11 Thread Rolf Vandevaart
Hi, how exactly do you run this to get this error? I tried and it worked for me. burl-ct-x2200-16 50 =>mpirun -mca btl_openib_warn_default_gid_prefix 0 -mca btl self,sm,openib -np 2 -host burl-ct-x2200-16,burl-ct-x2200-17 -mca btl_openib_ib_timeout 16 a.out I am 0 at 1252670691 I am 1 at

[OMPI users] Leftover session directories [was sm btl choices]

2010-03-01 Thread Rolf Vandevaart
On 03/01/10 11:51, Ralph Castain wrote: On Mar 1, 2010, at 8:41 AM, David Turner wrote: On 3/1/10 1:51 AM, Ralph Castain wrote: Which version of OMPI are you using? We know that the 1.2 series was unreliable about removing the session directories, but 1.3 and above appear to be quite good

Re: [OMPI users] [openib] segfault when using openib btl

2010-07-13 Thread Rolf vandeVaart
Hi Eloi: To select the different bcast algorithms, you need to add an extra mca parameter that tells the library to use dynamic selection. --mca coll_tuned_use_dynamic_rules 1 One way to make sure you are typing this in correctly is to use it with ompi_info. Do the following: ompi_info -mca
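As a sketch, first verify the parameters are spelled correctly with ompi_info and then pick a bcast algorithm at run time (the algorithm number is only an example value):
  ompi_info -mca coll_tuned_use_dynamic_rules 1 --param coll tuned | grep bcast
  mpirun -np 4 --mca coll_tuned_use_dynamic_rules 1 --mca coll_tuned_bcast_algorithm 1 ./a.out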

Re: [OMPI users] how this is possible?

2007-10-02 Thread Rolf Vandevaart
My guess is you must have a mismatched MPI_Bcast somewhere in the code. Presumably, there is a call to MPI_Bcast on the head node that broadcasts something larger than 1 MPI_INT and does not have the matching call on the worker nodes. Then, when the MPI_Bcast on the worker nodes is called,

Re: [OMPI users] large number of processes

2007-12-03 Thread Rolf vandeVaart
Hi: I managed to run a 256 process job on a single node. I ran a simple test in which all processes send a message to all others. This was using Sun's Binary Distribution of Open MPI on Solaris which is based on r16572 of the 1.2 branch. The machine had 8 cores. burl-ct-v40z-0 49

Re: [OMPI users] how to select a specific network

2008-01-11 Thread Rolf Vandevaart
Hello: Have you actually tried this and got it to work? It did not work for me. burl-ct-v440-0 50 =>mpirun -host burl-ct-v440-0,burl-ct-v440-1 -np 1 -mca btl self,sm,tcp -mca btl_tcp_if_include ce0 connectivity_c : -np 1 -mca btl self,sm,tcp -mca btl_tcp_if_include ce0 connectivity_c

Re: [OMPI users] problems with hostfile when doing MPMD

2008-04-10 Thread Rolf Vandevaart
This worked for me although I am not sure how extensive our 32/64 interoperability support is. I tested on Solaris using the TCP interconnect and a 1.2.5 version of Open MPI. Also, we configure with the --enable-heterogeneous flag which may make a difference here. Also this did not work

Re: [OMPI users] vprotocol pessimist

2008-07-15 Thread Rolf vandeVaart
And if you want to stop seeing it in the short term, you have at least two choices I know of. At configure time, add this to your configure line. --enable-mca-no-build=vprotocol This will prevent that component from being built, and will eliminate the message. If it is in there, you can

Re: [OMPI users] How to cease the process triggered by OPENMPI

2008-07-28 Thread Rolf Vandevaart
One other option which should kill of processes and cleanup is the orte-clean command. In your case, you could do the following: mpirun -hostfile ~/hostfile --pernode orte-clean There is a man page for it also. Rolf Brock Palen wrote: You would be much better off to not use nohup, and

Re: [OMPI users] problem with alltoall with ppn=8

2008-08-18 Thread Rolf Vandevaart
Ashley Pittman wrote: On Sat, 2008-08-16 at 08:03 -0400, Jeff Squyres wrote: - large all to all operations are very stressful on the network, even if you have very low latency / high bandwidth networking such as DDR IB - if you only have 1 IB HCA in a machine with 8 cores, the problem

Re: [OMPI users] Continuous poll/select using btl sm (svn 1.4a1r18899)

2008-08-29 Thread Rolf Vandevaart
I have submitted a ticket on this issue. https://svn.open-mpi.org/trac/ompi/ticket/1468 Rolf On 08/18/08 18:27, Mostyn Lewis wrote: George, I'm glad you changed the scheduling and my program seems to work. Thank you. However, to stress it a bit more I changed #define NUM_ITERS 1000 to

Re: [OMPI users] Problems with compilig of OpenMPI 1.2.7

2008-08-29 Thread Rolf Vandevaart
Hi Paul: I can comment on why you are seeing the mpicxx problem, but I am not sure what to do about it. In the file mpicxx.cc there is a declaration near the bottom that looks like this. const int LOCK_SHARED = MPI_LOCK_SHARED; The preprocessor is going through that file and replacing

Re: [OMPI users] Problems with compilig of OpenMPI 1.2.7

2008-08-29 Thread Rolf Vandevaart
mpicxx.cc issue. Rolf On 08/29/08 13:48, Rolf Vandevaart wrote: Hi Paul: I can comment on why you are seeing the mpicxx problem, but I am not sure what to do about it. In the file mpicxx.cc there is a declaration near the bottom that looks like this. const int LOCK_SHARED = MPI_LOCK_SHARED
