bindings had been similar to Boost MPI, they would
probably have been adopted more widely and might still be alive.
--
Maxime Boissonneault
Computational analyst - Calcul Québec, Université Laval
President - Research support coordination committee, Calcul Québec
On 2015-04-13 09:54, Ralph Castain wrote:
On Apr 13, 2015, at 6:52 AM, Maxime Boissonneault
<maxime.boissonnea...@calculquebec.ca> wrote:
Just out of curiosity... how will OpenMPI start processes under different
accounts? Through SSH while specifying different user names?
___
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post:
http://www.open-mpi.org/community/lists/users/2015/04/26690.php
I figured it out. It seems that CPP isn't the right variable to set to
pgprepro.
Thanks,
Maxime
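For context, a hedged sketch of the kind of configure invocation involved here. The compiler driver names are the standard PGI ones, but the prefix path and job count are illustrative, not taken from the thread:

```
# Point configure at the PGI drivers and let it probe the preprocessor
# itself, rather than forcing CPP to pgprepro.
./configure CC=pgcc CXX=pgCC FC=pgfortran F77=pgfortran \
    --prefix=$HOME/openmpi-1.8.3-pgi149
make -j 8 all install
```

Open MPI's configure detects the correct preprocessor from CC; overriding CPP separately is what tripped things up here.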
On 2014-10-03 10:39, Maxime Boissonneault wrote:
Hi,
I am trying to compile OpenMPI 1.8.3 with PGI 14.9 and I am getting
severe errors here:
1956 PGC-S-0039-Use of undeclared variable
Attached is the output of my configure and make lines.
Thanks,
--
Maxime Boissonneault
Computational analyst - Calcul Québec, Université Laval
Ph. D. in physics
config-make.log.tar.gz
Description: GNU Zip compressed data
Hi,
Just an idea here. Do you use cpusets within Torque? Did you request
enough cores from Torque?
Maxime Boissonneault
On 2014-09-23 13:53, Brock Palen wrote:
I found a fun head scratcher: with openmpi 1.8.2 built with TM
support against torque 5, on hetero core layouts I get the fun
- but we aren't really maintaining the 1.6 series any
more. You might try updating to 1.6.5 and see if it remains there
On Aug 29, 2014, at 9:12 AM, Maxime Boissonneault
<maxime.boissonnea...@calculquebec.ca> wrote:
It looks like
-npersocket 1
cannot be used alone. If I do
mpiexec -npernode 2 -npersocket 1 ls -la
then I get no error message.
Is this expected behavior?
Maxime
On 2014-08-29 11:53, Maxime Boissonneault wrote:
Hi,
I am having a weird error with OpenMPI 1.6.3. I run a non-MPI command
. Remember that MPI ranks begin
with 0, not 1.
Please correct the cmd line and try again.
How can I debug that?
Thanks,
Hi,
Would you say that software compiled using OpenMPI 1.8.1 needs to be
recompiled using OpenMPI 1.8.2rc4 to work properly?
Maxime
a
very friendly way to handle that error.
-Original Message-
From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Maxime
Boissonneault
Sent: Tuesday, August 19, 2014 10:39 AM
To: Open MPI Users
Subject: Re: [OMPI users] Segfault with MPI + Cuda on multiple nodes
Hi,
I believe I found
-Original Message-
From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Maxime
Boissonneault
Sent: Tuesday, August 19, 2014 8:55 AM
To: Open MPI Users
Subject: Re: [OMPI users] Segfault with MPI + Cuda on multiple nodes
Hi,
I recompiled OMPI 1.8.1 without Cuda and with debug, but it did
to help reduce the scope of the problem, can you retest with a non
CUDA-aware Open MPI 1.8.1? And if possible, use --enable-debug in the
configure line to help with the stack trace?
-Original Message-
From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Maxime
Boissonneault
Sent
A. Granovsky wrote:
Also, you need to check the return code from cudaMalloc before calling
cudaFree -
the pointer may be invalid if you did not initialize CUDA properly.
Alex
-Original Message- From: Maxime Boissonneault
Sent: Tuesday, August 19, 2014 2:19 AM
To: Open MPI Users
Subject: Re
?
-Original Message-
From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Maxime
Boissonneault
Sent: Monday, August 18, 2014 4:23 PM
To: Open MPI Users
Subject: [OMPI users] Segfault with MPI + Cuda on multiple nodes
Hi,
Since my previous thread (Segmentation fault in OpenMPI 1.8.1
the other node)
Maxime
On 2014-08-18 16:52, Alex A. Granovsky wrote:
Try the following:
export MALLOC_CHECK_=1
and then run it again
Kind regards,
Alex Granovsky
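Alex's tip relies on glibc's built-in heap consistency checking. A minimal sketch of how it is typically applied (the application name below is a placeholder, not from the thread):

```shell
# MALLOC_CHECK_ tells glibc's allocator to verify heap metadata on each
# malloc/free: 1 prints a diagnostic on corruption, 2 calls abort(),
# 3 does both.
export MALLOC_CHECK_=3
# Then re-run the failing job in the same environment, e.g.:
# mpiexec -np 8 ./my_app
```

Note that for ranks launched on remote nodes, Open MPI may need the variable exported explicitly (as with -x LD_LIBRARY_PATH elsewhere in this thread), e.g. mpiexec -x MALLOC_CHECK_; with it set, heap corruption aborts at the corrupting call rather than at some later, unrelated one.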
-Original Message- From: Maxime Boissonneault
Sent: Tuesday, August 19, 2014 12:23 AM
To: Open MPI Users
Subject: [OMPI
would ask here if somebody has a clue of what might be going on. I have
yet to be able to file a bug report on NVidia's website for Cuda.
Thanks,
that need to get
fixed. We haven't had many cases where it's been an issue, but a couple like
this have cropped up - enough that I need to set aside some time to fix it.
My apologies for the problem.
On Aug 18, 2014, at 10:31 AM, Maxime Boissonneault
<maxime.boissonnea...@calculquebec.ca>
would be the solution.
On Aug 18, 2014, at 10:04 AM, Maxime Boissonneault
<maxime.boissonnea...@calculquebec.ca> wrote:
Here it is.
Maxime
On 2014-08-18 12:59, Ralph Castain wrote:
Ah...now that showed the problem. To pinpoint it better, please add
-mca oob_base_verbose 10
and I think we'll have it
On Aug 18, 2014, at 9:54 AM, Maxime Boissonneault
<maxime.boissonnea...@calculquebec.ca>
one node, yes?
Try adding the following:
-mca odls_base_verbose 5 -mca state_base_verbose 5 -mca errmgr_base_verbose 5
Lots of garbage, but it should tell us what is going on.
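Taken together, the verbosity knobs suggested across this thread can be stacked on one command line; a sketch, with ring_c standing in for whatever program reproduces the hang:

```
mpirun -np 4 \
    -mca plm_base_verbose 10 \
    -mca odls_base_verbose 5 \
    -mca state_base_verbose 5 \
    -mca errmgr_base_verbose 5 \
    -mca oob_base_verbose 10 \
    ring_c
```

Each -mca *_base_verbose flag raises the logging level of one framework (launch, local daemon spawn, state machine, error manager, out-of-band messaging), so the last lines printed narrow down which stage fails.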
On Aug 18, 2014, at 9:36 AM, Maxime Boissonneault
<maxime.boissonnea...@calculquebec.ca> wrote:
Here it is
On 2014-08-18 12:30, Joshua Ladd wrote:
mpirun -np 4 --mca plm_base_verbose 10
[mboisson@helios-login1 examples]$ mpirun -np 4 --mca plm_base_verbose
10 ring_c
[helios-login1:27853] mca: base: components_register: registering plm
components
[helios-login1:27853] mca: base:
echo $?
65
Maxime
On 2014-08-16 06:22, Jeff Squyres (jsquyres) wrote:
Just out of curiosity, I saw that one of the segv stack traces involved the
cuda stack.
Can you try a build without CUDA and see if that resolves the problem?
On Aug 15, 2014, at 6:47 PM, Maxime Boissonneault
<maxime.boissonnea...@calculquebec.ca> wrote:
Hi Jeff,
On 2014-08-15 17:50, Jeff Squyres (jsquyres) wrote:
On Aug 15, 2014, at 5:39 PM, Maxime Boissonneault
<maxime.boissonnea...@calculquebec.ca> wrote:
Correct.
Can it be because torque (pbs_mom) is not running on the head node and
mpiexec attempts to contact it?
Not fo
login
node if I understood you correctly.
Josh
On Fri, Aug 15, 2014 at 5:20 PM, Maxime Boissonneault
<maxime.boissonnea...@calculquebec.ca> wrote:
Here are the requested files.
In the archive, you will find the output of con
On Thu, Aug 14, 2014 at 3:14 PM, Maxime Boissonneault
<maxime.boissonnea...@calculquebec.ca> wrote:
Yes,
Everything has been built with GCC 4.8.x, although x might
will recompile it from scratch and provide all the information
requested on the help webpage.
Cheers,
Maxime
On 2014-08-15 11:58, Maxime Boissonneault wrote:
Hi Josh,
The ring_c example does not work on our login node:
[mboisson@helios-login1 examples]$ mpiexec -np 10 ring_c
[mboisson
15:16, Joshua Ladd wrote:
Can you try to run the example code "ring_c" across nodes?
Josh
e this is coming from the OpenIB
BTL, would be good to check this.
Do you know what the MPI thread level is set to when used with
the Charm++ runtime? Is it MPI_THREAD_MULTIPLE? The OpenIB BTL is
not thread safe.
Josh
On Thu, Aug 14, 2014 at 2:17 PM, Maxime Bois
is coming from the OpenIB BTL,
would be good to check this.
Do you know what the MPI thread level is set to when used with the
Charm++ runtime? Is it MPI_THREAD_MULTIPLE? The OpenIB BTL is not
thread safe.
Josh
On Thu, Aug 14, 2014 at 2:17 PM, Maxime Boissonneault
<maxi
Hi,
I ran gromacs successfully with OpenMPI 1.8.1 and Cuda 6.0.37 on a
single node, with 8 ranks and multiple OpenMP threads.
Maxime
On 2014-08-14 14:15, Joshua Ladd wrote:
Hi, Maxime
Just curious, are you able to run a vanilla MPI program? Can you try
one of the example programs
/software/ompi/v1.8/
On Aug 14, 2014, at 8:39 AM, Maxime Boissonneault
<maxime.boissonnea...@calculquebec.ca> wrote:
Hi,
I compiled Charm++ 6.6.0rc3 using
./build charm++ mpi-linux-x86_64 smp --with-production
When compiling the simple example
mpi-linux-x86_64-smp/tests
Note that if I do the same build with OpenMPI 1.6.5, it works flawlessly.
Maxime
What is weird is that this same command works for other users, on the
same node.
Anyone know what might be going on here?
Thanks,
outweigh the changes
that need to be made to get them.
My 2 cents,
Maxime Boissonneault
My two cents of opinion
Gus Correa
On 08/05/2014 12:54 PM, Ralph Castain wrote:
Check the repo - hasn't been touched in a very long time
On Aug 5, 2014, at 9:42 AM, Fabricio Cannini <fcann...@gma
2:fischega] $ /usr/sbin/ibstat
CA 'mlx4_0'
CA type: MT26428
Command line (path and LD_LIBRARY_PATH are set correctly):
mpirun -x LD_LIBRARY_PATH -mca btl openib,sm,self -mca
btl_openib_verbose 1 -np 31 $CTF_EXEC
From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Maxime
Boissonneault
What are your threading options for OpenMPI (when it was built)?
I have seen the OpenIB BTL completely lock up before when some level of
threading is enabled.
Maxime Boissonneault
On 2014-06-24 18:18, Fischer, Greg A. wrote:
Hello openmpi-users,
A few weeks ago, I posted to the list about
Hi,
I've been following this thread because it may be relevant to our setup.
Is there a drawback to having orte_hetero_nodes=1 as a default MCA
parameter? Is there a reason why the most generic case is not assumed?
Maxime Boissonneault
On 2014-06-20 13:48, Ralph Castain wrote:
Put
Mellanox OFED. We use the Linux RDMA from
CentOS 6.5. However, should that completely disable GDR within a single
node? I.e., does GDR _have_ to go through IB? I would assume that our
lack of Mellanox OFED would result in no GDR inter-node, but GDR intra-node.
Thanks
pi/openmpi/1.8.1_gcc4.8_cuda6.0.37/share/openmpi/mca-coll-ml.config",
data source: default, level: 9 dev/all, type: string)
MCA io: informational
"io_romio_complete_configure_params" (current value:
"--with-file-system=nfs+lustre FROM_OMPI=yes
CC='/software6/compilers/gcc/4.8/bin/gcc -std=gnu99' CFLAGS='-O3
-DNDEBUG -finline-functions -fno-strict-aliasing -pthread' CPPFLAGS='
-I/software-gpu/src/openmpi-1.8.1/opal/mca/hwloc/hwloc172/hwloc/include
-I/software-gpu/src/openmpi-1.8.1/opal/mca/event/libevent2021/libevent
-I/software-gpu/src/openmpi-1.8.1/opal/mca/event/libevent2021/libevent/include'
FFLAGS='' LDFLAGS=' ' --enable-shared --enable-static
--with-file-system=nfs+lustre
--prefix=/software-gpu/mpi/openmpi/1.8.1_gcc4.8_cuda6.0.37
--disable-aio", data source: default, level: 9 dev/all, type: string)
[login-gpu01.calculquebec.ca:11486] mca: base: close: unloading component Q
On 2014-05-15 18:27, Jeff Squyres (jsquyres) wrote:
On May 15, 2014, at 6:14 PM, Fabricio Cannini wrote:
Alright, but now I'm curious as to why you decided against it.
Could you please elaborate on it a bit?
OMPI has a long, deep history with the GNU Autotools. It's a
Please allow me to chip in my $0.02 and suggest not reinventing the
wheel, but instead considering migrating the build system to cmake:
http://www.cmake.org/
I agree that menu-wise, CMake does a pretty good job with ccmake, and is
much, much easier to write than the autoconf/automake/m4 stuff
of
things to build, so any work toward that scheme might not be lost.
-- bennet
On Thu, May 15, 2014 at 7:41 AM, Maxime Boissonneault
<maxime.boissonnea...@calculquebec.ca> wrote:
On 2014-05-15 06:29, Jeff Squyres (jsquyres) wrote:
I think Ralph's email summed it up pretty well
e. Just in case
you're taking ha'penny's worth from the groundlings. I think I would
prefer not to have capability included that we won't use.
-- bennet
On Wed, May 14, 2014 at 7:43 PM, Maxime Boissonneault
<maxime.boissonnea...@calculquebec.ca> wrote:
For the scheduler issue, I would b
asically *always* build them. So we do.
In general, OMPI builds support for everything that it can find on the rationale that a) we can't
know ahead of time exactly what people want, and b) most people want to just "./configure
&& make -j 32 install" and be done with it -- so bu
is that it is compiling support for many schedulers while I'm
rather convinced that very few sites actually use multiple schedulers at
the same time.
Maxime
On 2014-05-14 16:51, Gus Correa wrote:
On 05/14/2014 04:25 PM, Maxime Boissonneault wrote:
Hi,
I was compiling OpenMPI 1.8.1 today and I
Hi,
I was compiling OpenMPI 1.8.1 today and I noticed that pretty much every
single scheduler has its support enabled by default at configure (except
the one I need, which is Torque). Is there a reason for that? Why not
have a single scheduler enabled and require specifying it at configure
I heard that c/r support in OpenMPI was being dropped after version
1.6.x. Is this not still the case?
Maxime Boissonneault
On 2014-02-27 13:09, George Bosilca wrote:
Both were supported at some point. I'm not sure if either is still in a
workable state in the trunk today. However
Hi,
Do you have thread multiple enabled in your OpenMPI installation?
Maxime Boissonneault
On 2013-12-16 17:40, Noam Bernstein wrote:
Has anyone tried to use openmpi 1.7.3 with the latest CentOS kernel
(well, nearly latest: 2.6.32-431.el6.x86_64), and especially with infiniband?
I'm
ke.out, sample code and output
etc.
Thanks,
Jeff
quite low.
The fact that mvapich2 does not show this behavior points to a
problem with the openib btl within openmpi, and not with our setup.
Can anyone try to reproduce this on a different machine?
Thanks,
Maxime Boissonneault
On 2013-02-15 14:29, Maxime Boissonneault wrote:
Hi agai
0 followed by MPI_Isend to rank 0
In this case also, rank n's MPI_Isend executes quasi-instantaneously,
and rank 0's MPI_Recv only returns a few minutes later.
Thanks,
Maxime Boissonneault
On 2013-01-29 21:02, Ralph Castain wrote:
On Jan 28, 2013, at 10:53 AM, Maxime Boissonneault
<maxime.boissonnea...@calculquebec.ca> wrote:
While our filesystem and management nodes are on UPS, our compute
nodes are not. With
u run some parallel
benchmarks on your cluster?
George.
PS: You can find some MPI I/O benchmarks at
http://www.mcs.anl.gov/~thakur/pio-benchmarks.html
On Mon, Jan 28, 2013 at 2:04 PM, Ralph Castain <r...@open-mpi.org> wrote:
On Jan 28, 2013, at 10:53 AM, Maxime Boissonneault
<maxime.b
ince the last checkpoint.
HTH
Ralph
On Jan 28, 2013, at 7:47 AM, Maxime Boissonneault
<maxime.boissonnea...@calculquebec.ca> wrote:
Hello,
I am doing checkpointing tests (with BLCR) with an MPI application compiled
with OpenMPI 1.6.3, and I am seeing behaviors that are quite strange.
would reach tens of thousands and would completely overload our
lustre filesystem. Moreover, with 15MB/s per node, the checkpointing
process would take hours.
How can I improve on that? Is there an MCA setting that I am missing?
Thanks,