[OMPI users] MPI_Comm_accept() / MPI_Comm_connect() fail between two different machines

2015-07-13 Thread Audet, Martin
Hi OMPI_Developers, It seems that I am unable to establish an MPI communication between two independently started MPI programs using the simplest client/server call sequence I can imagine (see the two attached files) when the client and server processes are started on different machines. Note
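
For readers without the attachments, here is a minimal sketch (in C++, using the MPI C API) of the kind of accept/connect sequence being described; the command-line handling is made up for illustration and error checking is omitted:

  #include <mpi.h>
  #include <cstdio>
  #include <cstring>

  int main(int argc, char **argv)
  {
      MPI_Init(&argc, &argv);
      MPI_Comm inter;
      char port[MPI_MAX_PORT_NAME];

      if (argc == 1) {                                 /* server: run with no argument */
          MPI_Open_port(MPI_INFO_NULL, port);          /* obtain a system-chosen port name */
          std::printf("port name: %s\n", port);        /* give this string to the client */
          MPI_Comm_accept(port, MPI_INFO_NULL, 0, MPI_COMM_WORLD, &inter);
          MPI_Close_port(port);
      } else {                                         /* client: pass the port name as argv[1] */
          std::strncpy(port, argv[1], MPI_MAX_PORT_NAME - 1);
          port[MPI_MAX_PORT_NAME - 1] = '\0';
          MPI_Comm_connect(port, MPI_INFO_NULL, 0, MPI_COMM_WORLD, &inter);
      }

      MPI_Comm_disconnect(&inter);                     /* tear down the intercommunicator */
      MPI_Finalize();
      return 0;
  }

The failure discussed in this thread shows up at the MPI_Comm_accept()/MPI_Comm_connect() pair when the two jobs are launched on different machines.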

Re: [OMPI users] MPI_Comm_accept() / MPI_Comm_connect() fail between two different machines

2015-07-13 Thread Audet, Martin
, 2015 5:29 PM To: Open MPI Users Subject: Re: [OMPI users] MPI_Comm_accept() / MPI_Comm_connect() fail between two different machines Try running it with "--mca oob_base_verbose 100" on both client and server - it will tell us why the connection was refused. > On Jul 13, 2015, at 2:14

Re: [OMPI users] MPI_Comm_accept() / MPI_Comm_connect() fail between two different machines

2015-07-14 Thread Audet, Martin
between two different machines I see the problem - it's a race condition, actually. I'll try to provide a patch for you to test, if you don't mind. > On Jul 13, 2015, at 3:03 PM, Audet, Martin <martin.au...@cnrc-nrc.gc.ca> > wrote: > > Thanks Ralph for this quick respons

Re: [OMPI users] MPI_Comm_accept() / MPI_Comm_connect() fail between two different machines

2015-07-14 Thread Audet, Martin
Yes, this patch applied over Open MPI 1.8.6 solves my problem. Attached are the new output files for the server and the client when started with "--mca oob_base_verbose 100". Will this patch be included in 1.8.7? Thanks again, Martin Audet From: users

[OMPI users] Unable to connect to a server using MX MTL with TCP

2010-06-04 Thread Audet, Martin
Hi OpenMPI_Users and OpenMPI_Developers, I'm unable to connect a client application using MPI_Comm_connect() to a server job (the server job calls MPI_Open_port() before calling MPI_Comm_accept()) when the server job uses the MX MTL (although it works without problems when the server uses MX

[OMPI users] RE : Unable to connect to a server using MX MTL with TCP

2010-06-04 Thread Audet, Martin
Sorry, I forgot the attachments... Martin From: users-boun...@open-mpi.org [users-boun...@open-mpi.org] on behalf of Audet, Martin [martin.au...@imi.cnrc-nrc.gc.ca] Sent: June 4, 2010 19:18 To: us...@open-mpi.org Subject: [OMPI users] Unable

Re: [OMPI users] Unable to connect to a server using MX MTL with TCP

2010-06-09 Thread Audet, Martin
Thanks to both Scott and Jeff! Next time I have a problem, I will check the README file first (Doh!). Also, we might mitigate the problem by connecting the workstation to the Myrinet switch. Martin -Original Message- From: users-boun...@open-mpi.org

[OMPI users] MPI_Comm_connect() fails.

2008-03-11 Thread Audet, Martin
Hi, I'm experimenting with the MPI-2 functions for supporting the client/server model in MPI (i.e. server and client are independently created MPI jobs establishing an intercommunicator between them at run time; see section 5.4 "Establishing Communication" of the MPI-2 standard document) and
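
The same chapter of the standard also defines a name-publishing variant of this call sequence. A hedged sketch (the service name "mpi_demo_server" is purely illustrative, and with Open MPI both jobs generally need access to a common naming service for the lookup to succeed):

  #include <mpi.h>

  /* Server side: publish the port under a well-known service name. */
  void server_side(MPI_Comm *inter)
  {
      char port[MPI_MAX_PORT_NAME];
      MPI_Open_port(MPI_INFO_NULL, port);
      MPI_Publish_name("mpi_demo_server", MPI_INFO_NULL, port);
      MPI_Comm_accept(port, MPI_INFO_NULL, 0, MPI_COMM_WORLD, inter);
      MPI_Unpublish_name("mpi_demo_server", MPI_INFO_NULL, port);
      MPI_Close_port(port);
  }

  /* Client side: resolve the service name instead of copying the port string by hand. */
  void client_side(MPI_Comm *inter)
  {
      char port[MPI_MAX_PORT_NAME];
      MPI_Lookup_name("mpi_demo_server", MPI_INFO_NULL, port);
      MPI_Comm_connect(port, MPI_INFO_NULL, 0, MPI_COMM_WORLD, inter);
  }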

[OMPI users] RE : MPI_Comm_connect() fails

2008-03-12 Thread Audet, Martin
Hi again, Thanks Pak for the link and for suggesting starting an "orted" daemon; by doing so, my client and server jobs were able to establish an intercommunicator between them. However, I modified my programs to perform an MPI_Allgather() of a single "int" over the new intercommunicator to test
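
A hedged sketch of such a test (not the modified program itself): on an intercommunicator, MPI_Allgather() makes each group collect the contributions of the other group:

  #include <mpi.h>
  #include <vector>
  #include <cstdio>

  void allgather_one_int(MPI_Comm inter)              /* "inter" comes from accept/connect */
  {
      int my_rank, remote_size;
      MPI_Comm_rank(inter, &my_rank);
      MPI_Comm_remote_size(inter, &remote_size);      /* size of the *other* group */

      int mine = my_rank;                             /* one int per local process */
      std::vector<int> from_remote(remote_size);      /* one int per remote process */

      MPI_Allgather(&mine, 1, MPI_INT,
                    from_remote.data(), 1, MPI_INT, inter);

      for (int i = 0; i < remote_size; ++i)
          std::printf("remote rank %d contributed %d\n", i, from_remote[i]);
  }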

[OMPI users] RE : MPI_Comm_connect() fails

2008-03-13 Thread Audet, Martin
re, valgrind correctly detects the uninitialized data transfer. On Mar 12, 2008, at 3:58 PM, Audet, Martin wrote: > Hi again, > > Thanks Pak for the link and for suggesting starting an "orted" daemon, > by doing so my client and server jobs were able to establish a

[OMPI users] RE : users Digest, Vol 841, Issue 3

2008-03-13 Thread Audet, Martin
Hi Georges, Thanks for your patch, but I'm not sure I got it correctly. The patch I got modifies a few arguments passed to isend()/irecv()/recv() in coll_basic_allgather.c. Here is the patch I applied: Index: ompi/mca/coll/basic/coll_basic_allgather.c

Re: [OMPI users] RE : MPI_Comm_connect() fails

2008-03-17 Thread Audet, Martin
for example the server is started on 3 processes and the client on 2 processes. Feel free to try for yourself the two small client and server programs I posted in my first message. Thanks, Martin Subject: [OMPI users] RE : users Digest, Vol 841, Issue 3 From: Audet, Martin (Martin.Audet_at_[hidden]) List-Post

[OMPI users] RE : RE : MPI_Comm_connect() fails

2008-03-17 Thread Audet, Martin
ses in the two >> groups passes for me -- and did fail with the previous version. >> >> >> Thanks >> Edgar >> >> >> Audet, Martin wrote: >>> Hi Jeff, >>> >>> As I said in my last message (see below) the patch (or at least

[OMPI users] Problem with MPI_Scatter() on inter-communicator...

2008-04-04 Thread Audet, Martin
Hi, I don't know if it is my sample code or if it is a problem with MPI_Scatter() on an inter-communicator (maybe similar to the problem we found with MPI_Allgather() on an inter-communicator a few weeks ago), but a simple program I wrote freezes during its second iteration of a loop doing an
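
For reference, a hedged sketch of what a scatter over an inter-communicator looks like (the rank and value choices are illustrative, not taken from the posted program): in the root's group the root passes MPI_ROOT, its peers pass MPI_PROC_NULL, and processes of the remote group pass the root's rank:

  #include <mpi.h>
  #include <vector>

  void scatter_over_intercomm(MPI_Comm inter, bool in_root_group)
  {
      if (in_root_group) {
          int rank, remote_size;
          MPI_Comm_rank(inter, &rank);
          MPI_Comm_remote_size(inter, &remote_size);
          if (rank == 0) {
              /* The actual root: provide one int per process of the remote group. */
              std::vector<int> send(remote_size);
              for (int i = 0; i < remote_size; ++i) send[i] = 100 + i;
              MPI_Scatter(send.data(), 1, MPI_INT, NULL, 0, MPI_INT, MPI_ROOT, inter);
          } else {
              /* Other processes of the root's group do not participate. */
              MPI_Scatter(NULL, 0, MPI_INT, NULL, 0, MPI_INT, MPI_PROC_NULL, inter);
          }
      } else {
          /* Receiving group: "root" is the rank of the sender in the remote group. */
          int value;
          MPI_Scatter(NULL, 0, MPI_INT, &value, 1, MPI_INT, 0, inter);
      }
  }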

Re: [OMPI users] Memory question and possible bug in 64bit addressing under Leopard!

2008-04-25 Thread Audet, Martin
This has nothing to do with the segmentation fault you got, but in addition to Brian's comment, I would suggest not forgetting that with ISO C++ (the C++98 standard and the upcoming C++0x) a constant expression known at compile time is needed for the dimensions of local arrays. In other words, a
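
A small illustration of that C++ rule (assuming the original code used a runtime value as an array dimension, which a GCC extension happens to accept):

  #include <vector>

  void example(int n)
  {
      // int a[n];               // valid C99, but not ISO C++ (accepted only as a GCC extension)
      const int m = 16;
      int b[m];                  // fine: m is a compile-time constant expression
      std::vector<double> c(n);  // the usual C++ replacement for a runtime-sized local array
      (void)b; (void)c;
  }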

Re: [OMPI users] Memchecker report on v1.3b2 (includes potential bug reports)

2008-11-19 Thread Audet, Martin
4) Well, this sounds reasonable, but according to the MPI-1 standard (see page 40 for non-blocking send/recv, and a more detailed explanation on page 30): "A nonblocking send call indicates that the system may start copying data out of the send buffer. The sender
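
As a hedged illustration of the rule being quoted (not code from the thread): between MPI_Isend() and the completion of the request, the send buffer belongs to the library and must be left alone:

  #include <mpi.h>

  void send_without_touching_buffer(int *buf, int count, int dest, MPI_Comm comm)
  {
      MPI_Request req;
      MPI_Isend(buf, count, MPI_INT, dest, /* tag = */ 0, comm, &req);
      /* ... unrelated work may happen here, but buf[0..count-1] must not be modified ... */
      MPI_Wait(&req, MPI_STATUS_IGNORE);   /* only after completion may buf be reused */
  }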

[OMPI users] Avoiding the memory registration costs by having memory always registered, is it possible with Linux ?

2016-06-16 Thread Audet, Martin
Hi, After reading a little of the FAQ on the methods used by Open MPI to deal with memory registration (or pinning) with InfiniBand adapters, it seems that we could avoid all the overhead and complexity of memory registration/deregistration, registration cache access and update, memory management

Re: [OMPI users] Avoiding the memory registration costs by having memory always registered, is it possible with Linux ?

2016-06-20 Thread Audet, Martin
Thanks Jeff for your answer. It is sad that the approach I mentioned of having all memory registered for user processes on cluster nodes didn't become more popular. I still believe that such an approach would shorten the executed code path in MPI libraries, reduce message latency, increase the

Re: [OMPI users] Avoiding the memory registration costs by having memory always registered, is it possible with Linux ?

2016-06-27 Thread Audet, Martin
Thanks Jeff and Alex for your answers and comments. mlockall(), especially with the MCL_FUTURE argument, is indeed interesting. Thanks Jeff for your clarification of what memory registration really means (e.g. locking and telling the network stack the virtual-to-physical mapping). Also
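
A minimal sketch of the mlockall() idea discussed here (this is pinning only; it does not by itself register memory with the HCA, and it usually requires CAP_IPC_LOCK or a generous "ulimit -l"):

  #include <sys/mman.h>
  #include <cerrno>
  #include <cstdio>
  #include <cstring>

  int main()
  {
      /* MCL_FUTURE asks the kernel to also lock pages from future allocations. */
      if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
          std::fprintf(stderr, "mlockall failed: %s\n", std::strerror(errno));
          return 1;
      }
      std::puts("all current and future pages are locked in RAM");
      return 0;
  }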

[O-MPI users] MPI_Offset and C++ interface

2005-11-25 Thread Audet, Martin
Hi, I just compiled my library with version 1.0 of Open MPI and I had two problems. First, the MPI_Offset datatype is defined as a preprocessor macro as follows in mpi.h: /* Type of MPI_Offset */ #define MPI_Offset long long This generates a syntax error when MPI_Offset is used in C++ for
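
One way the macro form can break C++ code, shown with illustrative names (hedged; the exact failing construct in the original library isn't quoted in the snippet):

  #define OffsetAsMacro long long              /* what the mpi.h 1.0 definition effectively does */
  typedef long long OffsetAsTypedef;           /* what a real type alias would look like */

  void demo(int n)
  {
      // OffsetAsMacro o1 = OffsetAsMacro(n);  // expands to "long long(n)": invalid C++ syntax
      OffsetAsTypedef o2 = OffsetAsTypedef(n); // fine: a typedef name works in a functional cast
      (void)o2;
  }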

[O-MPI users] const_cast<>(), Alltoallw() and Spawn_multiple()

2005-12-15 Thread Audet, Martin
Hi, I just tried Open MPI 1.0.1 and this time I had far fewer warnings related to the C++ API than I had with version 1.0.0 (I compile with g++ -Wall). I nonetheless looked at the C++ headers and found that those warnings were still related to the use of C-style casts. Some of them were simply

[OMPI users] Incorrect behavior for attributes attached to MPI_COMM_SELF.

2006-04-10 Thread Audet, Martin
Hi, It looks like there is a problem in Open MPI 1.0.2 with how the callback functions of MPI_COMM_SELF attributes are handled by MPI_Finalize(). The following C program registers a callback function associated with the MPI_COMM_SELF communicator to be called during the first steps of MPI_Finalize().
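
A hedged reconstruction of that kind of test program (the original attachment isn't reproduced in the snippet): a delete callback attached to MPI_COMM_SELF, which MPI-2 requires MPI_Finalize() to invoke before anything else is torn down:

  #include <mpi.h>
  #include <cstdio>

  static int on_self_delete(MPI_Comm, int, void *, void *)
  {
      std::printf("delete callback invoked at the beginning of MPI_Finalize()\n");
      return MPI_SUCCESS;
  }

  int main(int argc, char **argv)
  {
      MPI_Init(&argc, &argv);

      int keyval;
      MPI_Comm_create_keyval(MPI_COMM_NULL_COPY_FN, on_self_delete, &keyval, NULL);
      MPI_Comm_set_attr(MPI_COMM_SELF, keyval, NULL);   /* the attribute value itself is unused */

      MPI_Finalize();   /* should trigger on_self_delete() as its first step */
      return 0;
  }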

[OMPI users] configure script not happy with OpenPBS

2006-10-19 Thread Audet, Martin
Hi, When I tried to install Open MPI on the front node of a cluster using the OpenPBS batch system (i.e. the --with-tm=/usr/open-pbs argument to configure), it didn't work and I got the error message: --- MCA component pls:tm (m4 configuration macro) checking for MCA component pls:tm compile mode...

[OMPI users] mpicc adds a nonexistent directory to the include path.

2007-01-24 Thread Audet, Martin
Hi, I sometimes use Open MPI, and it looks like the mpicc wrapper gives gcc a nonexistent directory with the -I option. If I ask mpicc how it calls gcc, it prints the following: [audet@linux15 libdfem]$ mpicc -show gcc -I/usr/local/openmpi-1.1.2/include

[OMPI users] Suggestion: adding OMPI_ versions macros in mpi.h

2007-02-01 Thread Audet, Martin
Hi, I would like to suggest adding macros indicating the version of the Open MPI library to the C/C++ header file mpi.h, analogous to the parameter constants in the Fortran header file: parameter (OMPI_MAJOR_VERSION=1) parameter (OMPI_MINOR_VERSION=1) parameter
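
On the C/C++ side, the suggestion would amount to something like this in mpi.h (a hedged sketch; the macro names simply mirror the Fortran parameters quoted above):

  #define OMPI_MAJOR_VERSION 1
  #define OMPI_MINOR_VERSION 1

  /* User code could then adapt to the installed library at compile time: */
  #if OMPI_MAJOR_VERSION > 1 || (OMPI_MAJOR_VERSION == 1 && OMPI_MINOR_VERSION >= 1)
  /* rely on a feature known to exist in Open MPI >= 1.1 */
  #endif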

[OMPI users] An equivalent to btl_openib_include_if when MXM over Infiniband ?

2016-08-12 Thread Audet, Martin
Hi OMPI_Users && OMPI_Developers, Is there an equivalent to the MCA parameter btl_openib_include_if when using MXM over InfiniBand (e.g. either (pml=cm mtl=mxm) or (pml=yalla))? I ask this question because I'm working on a cluster where LXC containers are used on compute nodes (with SR-IOV I

Re: [OMPI users] An equivalent to btl_openib_include_if when MXM over Infiniband ?

2016-08-16 Thread Audet, Martin
Hi Josh, Thanks for your reply. I did try setting MXM_RDMA_PORTS=mlx4_0:1 for all my MPI processes and it did improve performance, but the performance I obtain still isn't completely satisfying. When I use the IMB 4.1 pingpong and sendrecv benchmarks between two nodes I get, using Open MPI 1.10.3:

Re: [OMPI users] An equivalent to btl_openib_include_if when MXM over Infiniband ?

2016-08-19 Thread Audet, Martin
Hi Devendar, Thank you for your answer. Setting MXM_TLS=rc,shm,self does improve the speed of MXM (both latency and bandwidth). Without MXM_TLS: comm, lat_min pingpong (us), bw_max pingpong (MB/s), bw_max sendrecv (MB/s)

Re: [OMPI users] An equivalent to btl_openib_include_if when MXM over Infiniband ?

2016-08-22 Thread Audet, Martin
Hi Devendar, Thanks again for your answer. I searched a little and found that UD stands for "Unreliable Datagram" while RC stands for the "Reliable Connected" transport mechanism. I found another, called DC for "Dynamically Connected", which is not supported on our HCA. Do you know what is