Hi all,

I am working on parallelizing a data clustering algorithm, following
the MPMD pattern of MPI (i.e. 1 master process and p slave processes
in the same communicator). It is an iterative algorithm in which two
loops inside each iteration are parallelized separately.

The first loop is parallelized by partitioning the input data of size
N into (almost) equal parts among the p slaves. Each slave produces a
contiguous chunk of about (p * N/p) double values as the result of its
local processing. This local chunk from each slave is collected on the
master process, where it is merged with the chunks from the other
slaves.
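Schematically, the slave side looks like the following sketch
(simplified, not my exact code; local_n and compute_local_chunk() are
placeholders for my actual chunk size and clustering routine):

#include <mpi.h>
#include <stdlib.h>

extern void compute_local_chunk(double *out, int n); /* my clustering code */

/* Slave side (ranks 1..p): compute the local chunk, then send it to
   the master (rank 0). */
void slave_step(int local_n)
{
    double *chunk = malloc(local_n * sizeof(double));
    compute_local_chunk(chunk, local_n);   /* ~75 s of local work */
    MPI_Send(chunk, local_n, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
    free(chunk);
}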
If a blocking call (MPI_Send/MPI_Recv) is placed in a loop on the
master so that it receives the chunks from the slaves one by one in
rank order, then each slave takes about 75 seconds for its local
computation (measured with MPI_Wtime()) and about 1.5 seconds to
transfer its chunk to the master. But because the transfers happen in
order, by the time the last slave is done, the total time has grown to
75 seconds of computation plus about 50 seconds of communication.
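On the master, this ordered receive loop (strategy 0 in the list
below) is essentially the following (again a simplified sketch;
chunk_n stands in for the per-slave count):

#include <mpi.h>
#include <stddef.h>

/* Strategy 0: the master (rank 0) receives the chunks one by one, in
   rank order, each into its slot of the merged result buffer. */
void master_gather_ordered(double *merged, int chunk_n, int nprocs)
{
    for (int src = 1; src < nprocs; src++)
        MPI_Recv(merged + (size_t)(src - 1) * chunk_n, chunk_n,
                 MPI_DOUBLE, src, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
}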
These timings are for a cluster of 31 machines, with a single process
running on each machine. All machines are connected directly through a
private Gigabit Ethernet switch. For the parallelization to be
effective, the overall execution time needs to come below 80 seconds.

I have tried the following strategies to solve this problem:
0. Ordered transfer, as explained above.
1. Collecting the data through MPI_Gatherv, assuming that internally
it transfers the chunks in parallel (first sketch after this list).
2. Creating p threads on the master using OpenMP and calling MPI_Recv
(or MPI_Irecv followed by MPI_Wait) from each thread, with the data
from each slave received into a separate buffer (second sketch after
this list). My installation supports MPI_THREAD_MULTIPLE.
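For completeness, here is roughly what strategies 1 and 2 look like
(simplified sketches, not my exact code; counts, displacements, and
buffer names are illustrative). Strategy 1, as a single collective:

#include <mpi.h>

/* Strategy 1: replace the p point-to-point receives with one
   MPI_Gatherv.  Every rank calls this; counts[]/displs[] describe
   each rank's contribution on the root, with counts[0] = 0 so the
   master contributes nothing. */
void gather_collective(const double *local, int local_n, double *merged,
                       const int *counts, const int *displs)
{
    MPI_Gatherv(local, local_n, MPI_DOUBLE,
                merged, counts, displs, MPI_DOUBLE,
                0, MPI_COMM_WORLD);
}

And strategy 2, with one receiving thread per slave on the master (MPI
initialized via MPI_Init_thread() with MPI_THREAD_MULTIPLE):

#include <mpi.h>

/* Strategy 2: p OpenMP threads on the master, each blocking in its
   own MPI_Recv; bufs[i] is a separate buffer for slave rank i+1. */
void master_gather_threaded(double **bufs, int chunk_n, int nprocs)
{
    #pragma omp parallel for num_threads(nprocs - 1)
    for (int src = 1; src < nprocs; src++)
        MPI_Recv(bufs[src - 1], chunk_n, MPI_DOUBLE,
                 src, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
}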

The problem is that strategies 1 and 2 take almost the same time as
strategy 0.
*Is there a way I can receive the data in parallel and substantially
decrease the overall execution time?*

Hoping to get your help soon. Sorry for the long question.

Regards,
Saiyedul Islam

PS: Specifications of the cluster: GCC 5.10, Open MPI 2.0.1, CentOS
6.5 (as part of a Rocks cluster).