On Nov 2, 2010, at 6:21 AM, Jerome Reybert wrote:

> Each host_comm communicator groups tasks by machine. I ran this version,
> but performance is worse than the current version (each task performing its
> own Lapack function). I have several questions:

>  - in my implementation, is MPI_Bcast aware that it should use shared memory
> communication? Does the data go through the network? It seems to be the case,
> considering the first results.

It should use shared memory by default.
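
If you want to sanity-check that intra-node traffic really stays in shared 
memory, one quick single-node test is to restrict the BTLs to shared memory 
and self, something like:

  mpirun --mca btl sm,self -np 4 ./your_app

(./your_app is just a placeholder.)  If anything tried to go over the network 
in that run, it would fail to find a usable transport rather than silently 
falling back.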

>  - are there any other methods to group tasks by machine, with OpenMPI being
> aware that it is grouping tasks by shared memory?

The MPI API does not expose this kind of functionality, but there's at least 1 
proposal in front of the MPI Forum to do this kind of thing.

As Ashley mentioned, you might want to do this MPI_Comm_split once and then 
just use that communicator from then on.  The code snippet you sent leaks the 
host_comm, for example.
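
To illustrate, here's a minimal sketch (not your actual code -- I'm assuming a 
hostname-hash color, which is a common trick, and the setup/teardown function 
names are just placeholders): build host_comm once at startup, reuse it for 
every Bcast/Barrier, and free it exactly once at shutdown:

  #include <mpi.h>

  static MPI_Comm host_comm = MPI_COMM_NULL;

  /* Derive a per-host color from the processor name.  Note: a hash
     collision would incorrectly merge two hosts -- fine for a sketch,
     but a robust version should compare the actual names. */
  static int hostname_color(void)
  {
      char name[MPI_MAX_PROCESSOR_NAME];
      int len, i;
      unsigned int h = 5381;

      MPI_Get_processor_name(name, &len);
      for (i = 0; i < len; ++i) {
          h = h * 33 + (unsigned char) name[i];
      }
      return (int) (h & 0x7fffffff);
  }

  void setup_host_comm(void)
  {
      int world_rank;
      MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
      /* Split once; all processes on the same host get the same color. */
      MPI_Comm_split(MPI_COMM_WORLD, hostname_color(), world_rank,
                     &host_comm);
  }

  void teardown_host_comm(void)
  {
      /* Free exactly once, before MPI_Finalize -- no leak. */
      if (MPI_COMM_NULL != host_comm) {
          MPI_Comm_free(&host_comm);
      }
  }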

>  - is it possible to assign a policy (in this case, a shared memory policy) to
> a Bcast or a Barrier call?

Not really, no.

>  - do you have any better idea for this problem? :)

Ashley probably hit the nail on the head.  The short version is that OMPI 
aggressively polls for progress.  Forcing the degraded mode will help (because 
it'll yield), but it won't solve the problem: OMPI will still be aggressively 
polling -- it'll just yield every time it polls.
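
(For reference, degraded mode can be forced with the mpi_yield_when_idle MCA 
parameter, e.g. "mpirun --mca mpi_yield_when_idle 1 ...", but per the above 
that only makes the polling yield; it doesn't make the polling stop.)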

We've had many discussions about this topic, but have never really addressed it 
-- the need for low latency has been greater than the need for 
blocking/not-consuming-CPU progress.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/

