Jeff Squyres wrote:
WHAT: Changes to MPI layer modex API
WHY: To be mo' betta scalable
WHERE: ompi/mpi/runtime/ompi_module_exchange.* and everywhere that
calls ompi_modex_send() and/or ompi_modex_recv()
TIMEOUT: COB Fri 4 Apr 2008
DESCRIPTION:
[...snip...]
* int ompi_modex_node_send(...): send modex data that is relevant
for all processes in this job on this node. It is intended that only
one process in a job on a node will call this function. If more than
one process in a job on a node calls _node_send(), then only one will
"win" (meaning that the data sent by the others will be overwritten).
* int ompi_modex_node_recv(...): receive modex data that is relevant
for a whole peer node; receive the ["winning"] blob sent by
_node_send() from the source node. We haven't yet decided what the
node index will be; it may be (ompi_proc_t*) (i.e., _node_recv() would
figure out what node the (ompi_proc_t*) resides on and then give you
the data).
The above sounds like there could be race conditions if more than one
process on a node is calling ompi_modex_node_send(). That is, when
ompi_modex_node_recv() completes, are you really going to be assured
that none of the processes is still in the middle of an
ompi_modex_node_send()? I assume there must be some sort of gate that
lets you make sure no one is in the middle of overwriting your data.
--td