From: Eugene Loh <eugene....@sun.com>
To: Open MPI Users <us...@open-mpi.org>
Sent: Thursday, 6 November, 2008 18:08:26
Subject: Re: [OMPI users] Progress of the asynchronous messages

 vladimir marjanovic wrote: 
I am a new user of Open MPI; I've used MPICH before.

There is a performance bug in the following scenario:

proc_B:  MPI_Isend(..., proc_A, ..., &request);
         count = 0;
         do {
           sleep(1);
           MPI_Test(&request, &flag, &status);
           count++;
         } while (!flag);

proc_A:  MPI_Recv(..., proc_B, ...);

For a message size of 8 MB, proc_B calls MPI_Test 88 times, which means
the point-to-point communication takes about 88 seconds.
Btw, bandwidth isn't the problem (the interconnect is InfiniBand).

Obviously, there is a problem with the progress of asynchronous
messages.

How can I avoid this problem?

I'm no expert, but I think the problem is that the send is being
"progressed" (advanced) only during MPI calls and MPI_Test doesn't
progress/advance the message very aggressively.  The message is
probably being decomposed into chunks and MPI_Test will advance the
message at most one chunk at a time.  So:

1) You could decrease the time between MPI_Test calls.
2) You could block (e.g., with MPI_Wait).

It's a tough tradeoff to make.  That's bad news... but do you want OMPI
to be making the tough choices here for you?  Let's say the sending
process sends a chunk and it takes a little while for the receiver to
process data and make room for you to send some more.  During that
waiting time, should the sender return control to the user application,
or stay blocked inside of MPI_Test?

Anyhow, I believe that's the issue here.
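
To put rough numbers on that guess, here is a toy model in C. It is only
a sketch: it assumes each poll (standing in for MPI_Test) advances the
send by at most one chunk, which is my reading of the behavior and not
Open MPI's actual protocol. The names polls_needed and transfer_seconds
and the chunk size used below are made up for illustration.

```c
#include <stddef.h>

/* Toy model only: assumes each poll (standing in for MPI_Test) pushes
 * at most one chunk of the message. Not Open MPI's actual protocol. */
size_t polls_needed(size_t message_bytes, size_t chunk_bytes)
{
    size_t polls = 0;
    size_t remaining = message_bytes;

    if (chunk_bytes == 0)
        return 0;               /* degenerate input: the model is undefined here */

    while (remaining > 0) {
        size_t sent = remaining < chunk_bytes ? remaining : chunk_bytes;
        remaining -= sent;      /* one chunk advanced by this poll */
        polls++;
    }
    return polls;
}

/* Wall-clock cost when the sender waits poll_interval_s before each poll. */
double transfer_seconds(size_t message_bytes, size_t chunk_bytes,
                        double poll_interval_s)
{
    return (double)polls_needed(message_bytes, chunk_bytes) * poll_interval_s;
}
```

Taking 8 MB as 8388608 bytes and a hypothetical chunk of 95326 bytes,
polls_needed gives 88, so a 1-second sleep between polls costs 88
seconds in this model, while a 1-millisecond interval would cost about
0.088 seconds. In other words, under this assumption the polling
interval, not the network, sets the transfer time.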

In order to overlap communication and computation, I don't want to use MPI_Wait.
The message is surely being decomposed into chunks, and the chunk size is
probably defined by an environment variable.
Do you know how I can control the chunk size?
Thanks

Vladimir



      
