date:20140819

Re: [hwloc-users] setting memory bindings

2014-08-19 Thread Aulwes, Rob

I'll give this a try. Thanks Brice! From: Brice Goglin > Reply-To: Hardware locality user list > List-Post: hwloc-users@lists.open-mpi.org Date: Tue, 19 Aug 2014 19:26:17 +0200 To:

Re: [OMPI users] Running a hybrid MPI+openMP program

2014-08-19 Thread Reuti

Hi, On 19.08.2014 at 19:06, Oscar Mojica wrote: > I discovered what the error was. I forgot to include the '-fopenmp' flag when I > compiled the objects in the Makefile, so the program worked but it didn't > divide the job into threads. Now the program is working and I can use up to 15 > cores per

Re: [hwloc-users] setting memory bindings

2014-08-19 Thread Brice Goglin

You have to pass HWLOC_MEMBIND_STRICT if you want an error code when the policy isn't supported. Assuming you get the nodeset of your current binding with get_area_membind_nodeset() in bindset, you can do something like this (untested): hwloc_bitmap_t bindset, totalset, newset; int i; /* get

Re: [hwloc-users] setting memory bindings

2014-08-19 Thread Aulwes, Rob

ok, in the meantime, is there a way to manually 'replicate'? That is, if I allocate a node, I would like to find out which NUMA domain it resides in, and then allocate replicates to other domains. Are there example codes that show how to use the bitmaps for this? I've been unsuccessful in

Re: [OMPI users] Running a hybrid MPI+openMP program

2014-08-19 Thread Oscar Mojica

Reuti, I discovered what the error was. I forgot to include the '-fopenmp' flag when I compiled the objects in the Makefile, so the program worked but it didn't divide the job into threads. Now the program is working and I can use up to 15 cores per machine in the queue one.q. Anyway, I would like to try
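The symptom described above (the program runs, but never splits into threads) is what happens when OpenMP pragmas are silently ignored at compile time. A minimal sketch of a self-check, assuming a C code base and a GCC-style compiler; the function names `openmp_enabled` and `threads_in_parallel_region` are hypothetical and not from the thread:

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Returns 1 if this translation unit was compiled with OpenMP enabled
 * (the compiler defines _OPENMP in that case), 0 otherwise. */
int openmp_enabled(void) {
#ifdef _OPENMP
    return 1;
#else
    return 0;
#endif
}

/* Counts how many threads actually execute a parallel region.
 * Without OpenMP the pragmas are ignored, the block runs once on the
 * calling thread, and the result is 1. */
int threads_in_parallel_region(void) {
    int count = 0;
#pragma omp parallel
    {
#pragma omp atomic
        count++;
    }
    assert(count >= 1);
    return count;
}
```

Calling `threads_in_parallel_region()` early in the program and printing the result gives a quick check that each object file was built with OpenMP. With GCC, `-fopenmp` is needed both when compiling the objects and when linking.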

Re: [hwloc-users] setting memory bindings

2014-08-19 Thread Aulwes, Rob

Nope, no error. Is there a way to find out what policies are supported? I would like to try 'replicate'. From: Brice Goglin > Reply-To: Hardware locality user list > List-Post:

Re: [hwloc-users] setting memory bindings

2014-08-19 Thread Brice Goglin

On 19/08/2014 18:38, Aulwes, Rob wrote: > Hi, > > I'm trying to write a custom C++ allocator that wraps hwloc calls. > I've tried using various hwloc_alloc* functions to set the memory > bindings, but when I call hwloc_get_area_membind_nodeset to verify, I > don't get the same policy I passed

[hwloc-users] setting memory bindings

2014-08-19 Thread Aulwes, Rob

Hi, I'm trying to write a custom C++ allocator that wraps hwloc calls. I've tried using various hwloc_alloc* functions to set the memory bindings, but when I call hwloc_get_area_membind_nodeset to verify, I don't get the same policy I passed to alloc. Are there example codes that show how to

Re: [OMPI users] Segfault with MPI + Cuda on multiple nodes

2014-08-19 Thread Maxime Boissonneault

I am also filing a bug at Adaptive Computing since, while I do set CUDA_VISIBLE_DEVICES myself, the default value set by Torque in that case is also wrong. Maxime On 2014-08-19 10:47, Rolf vandeVaart wrote: Glad it was solved. I will submit a bug at NVIDIA as that does not seem like a

Re: [OMPI users] Segfault with MPI + Cuda on multiple nodes

2014-08-19 Thread Rolf vandeVaart

Glad it was solved. I will submit a bug at NVIDIA as that does not seem like a very friendly way to handle that error. >-Original Message- >From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Maxime >Boissonneault >Sent: Tuesday, August 19, 2014 10:39 AM >To: Open MPI Users

Re: [OMPI users] Segfault with MPI + Cuda on multiple nodes

2014-08-19 Thread Maxime Boissonneault

Hi, I believe I found what the problem was. My script set the CUDA_VISIBLE_DEVICES based on the content of $PBS_GPUFILE. Since the GPUs were listed twice in the $PBS_GPUFILE because of the two nodes, I had CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7,0,1,2,3,4,5,6,7 instead of
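The duplicated device list described above can be collapsed by keeping only the first occurrence of each entry. A minimal sketch in plain C, assuming a comma-separated list; `dedup_device_list` is a hypothetical helper and not the script used in the thread. A real job script might instead filter `$PBS_GPUFILE` by host name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the comma-separated list `in` into `out`,
 * keeping only the first occurrence of each entry and skipping empty
 * entries. Returns 0 on success, -1 if `out` (of size `outsz`) is too small. */
int dedup_device_list(const char *in, char *out, size_t outsz) {
    assert(in != NULL && out != NULL);
    if (outsz == 0) return -1;

    size_t used = 0;
    out[0] = '\0';

    const char *tok = in;
    while (*tok != '\0') {
        size_t len = strcspn(tok, ",");

        /* Scan what has been written so far for an identical entry. */
        int seen = 0;
        const char *p = out;
        while (*p != '\0') {
            size_t plen = strcspn(p, ",");
            if (plen == len && strncmp(p, tok, len) == 0) {
                seen = 1;
                break;
            }
            p += plen;
            if (*p == ',') p++;
        }

        if (!seen && len > 0) {
            size_t need = len + (used > 0 ? 1 : 0);   /* entry plus separator */
            if (used + need + 1 > outsz) return -1;   /* +1 for the terminator */
            if (used > 0) out[used++] = ',';
            memcpy(out + used, tok, len);
            used += len;
            out[used] = '\0';
        }

        tok += len;
        if (*tok == ',') tok++;
    }
    return 0;
}
```

Given the value quoted above, `0,1,2,3,4,5,6,7,0,1,2,3,4,5,6,7`, this helper writes `0,1,2,3,4,5,6,7` to `out`.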

Re: [OMPI users] Segfault with MPI + Cuda on multiple nodes

2014-08-19 Thread Rolf vandeVaart

Hi: This problem does not appear to have anything to do with MPI. We are getting a SEGV during the initial call into the CUDA driver. Can you log on to gpu-k20-08, compile your simple program without MPI, and run it there? Also, maybe run dmesg on gpu-k20-08 and see if there is anything in

Re: [OMPI users] Segfault with MPI + Cuda on multiple nodes

2014-08-19 Thread Maxime Boissonneault

Hi, I recompiled OMPI 1.8.1 without Cuda and with debug, but it did not give me much more information. [mboisson@gpu-k20-07 simple_cuda_mpi]$ ompi_info | grep debug Prefix: /software-gpu/mpi/openmpi/1.8.1-debug_gcc4.8_nocuda Internal debug support: yes Memory debugging

Re: [OMPI users] Segfault with MPI + Cuda on multiple nodes

2014-08-19 Thread Maxime Boissonneault

Indeed, there were those two problems. I took the code from here and simplified it. http://cudamusing.blogspot.ca/2011/08/cuda-mpi-and-infiniband.html However, even with the modified code here http://pastebin.com/ax6g10GZ The symptoms are still the same. Maxime On 2014-08-19 07:59, Alex A.

Re: [OMPI users] Segfault with MPI + Cuda on multiple nodes

2014-08-19 Thread Alex A. Granovsky

Also, you need to check the return code from cudaMalloc before calling cudaFree - the pointer may be invalid as you did not initialize CUDA properly. Alex -Original Message- From: Maxime Boissonneault Sent: Tuesday, August 19, 2014 2:19 AM To: Open MPI Users Subject: Re: [OMPI users]

Re: [OMPI users] Segfault with MPI + Cuda on multiple nodes

2014-08-19 Thread Alex A. Granovsky

Hello, I think your CUDA program may be incorrect. Add a proper cudaSetDevice call at the beginning and check it again. Kind regards, Alex Granovsky -Original Message- From: Maxime Boissonneault Sent: Tuesday, August 19, 2014 2:19 AM To: Open MPI Users Subject: Re: [OMPI users]

[hwloc-users] I'd like to add you to my professional network on LinkedIn

2014-08-19 Thread Yury Vorobyov

Hi Hardware, I'd like to add you to my professional network on LinkedIn. - Yury Accept:

Re: [OMPI users] No log_num_mtt in Ubuntu 14.04

2014-08-19 Thread Mike Dubman

So, it seems you have an old OFED without this parameter. Can you install the latest Mellanox OFED, or check which community OFED has it? On Tue, Aug 19, 2014 at 9:34 AM, Rio Yokota wrote: > Here is what "modinfo mlx4_core" gives > > filename: > >

Re: [OMPI users] No log_num_mtt in Ubuntu 14.04

2014-08-19 Thread Rio Yokota

Here is what "modinfo mlx4_core" gives filename: /lib/modules/3.13.0-34-generic/kernel/drivers/net/ethernet/mellanox/mlx4/mlx4_core.ko version: 2.2-1 license: Dual BSD/GPL description: Mellanox ConnectX HCA low-level driver author: Roland Dreier srcversion:


19 matches
