On Mar 12, 2006, at 12:32 PM, Christian Leber wrote:
I would like to write a module (I think that's the right terminology)
for OpenMPI to a new network interface.
To be completely pedantic (but we all know what you mean, so it
doesn't really matter), you would be writing a component. A module
is an instance of that component (with all its associated state and
all that). Other than helping you understand some of the variable names
in the code, it really doesn't matter.
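As a rough illustration of the component/module distinction, here is a minimal C sketch. Every name in it (sim_component_t, sim_module_t, sim_component_create_module) is hypothetical and is not taken from the Open MPI source; it only shows one component handing out several modules, each with its own state.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical names throughout: this is NOT the Open MPI MCA API. */

/* A module: one instance, carrying its own per-instance state. */
typedef struct sim_module {
    int node_id;      /* id of this node inside the simulated network */
    int network_size; /* number of nodes in the simulated network */
} sim_module_t;

/* The component: exists once and knows how to create modules. */
typedef struct sim_component {
    const char *name;
    int modules_created;
} sim_component_t;

/* Create a new module from the component. Returns NULL on bad
 * arguments or allocation failure. */
sim_module_t *sim_component_create_module(sim_component_t *component,
                                          int node_id, int network_size)
{
    assert(component != NULL);
    if (node_id < 0 || node_id >= network_size) {
        return NULL;
    }
    sim_module_t *module = malloc(sizeof(*module));
    if (module == NULL) {
        return NULL;
    }
    module->node_id = node_id;
    module->network_size = network_size;
    component->modules_created++;
    return module;
}

/* Release a module created by sim_component_create_module(). */
void sim_module_destroy(sim_module_t *module)
{
    free(module);
}
```

Calling sim_component_create_module() twice on the same component yields two independent modules, which is the relationship described above.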
The network interface (actually a simulator) can do the following
simple things:
-return the size of the network
-return the id of the node inside the network
-send non-blocking to a node
-probe for messages
-recv blocking
So it's a pretty simple interface; the connection to the simulator
works over normal TCP/IP sockets.
So how should I start doing this?
Or is there something existing I could modify?
The answer is "it depends" ;).
At an absolute minimum, you will have to write a BTL component for
MPI message transport. The BTL code is in <top srcdir>/ompi/mca/btl/,
with the interface specified in <top srcdir>/ompi/mca/btl/btl.h. I
believe there is an example BTL you can use as a starting
point, but if not, there are a number of already implemented BTLs
that you can look at. I'd explicitly recommend not looking at the
Portals BTL, as it is layered over a communication architecture that
is the polar opposite of yours: it has RDMA put/get but no send/recv.
You do not need to implement the btl_put and btl_get functions;
the upper layers will just not use RDMA operations if the network
does not support them.
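The idea of optional RDMA entry points can be sketched as follows. This is a simplified, hypothetical function table, not the real structure from btl.h: the names transport_t, sim_transport_init, and upper_layer_transfer are assumptions made for illustration, and the real BTL signatures differ. It only shows an upper layer falling back to send when put is not provided.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types: loosely modelled on the idea of a transport
 * function table, NOT copied from Open MPI's btl.h. */
typedef struct transport transport_t;
struct transport {
    int (*send)(transport_t *t, int peer, const void *buf, size_t len);
    int (*put)(transport_t *t, int peer, const void *buf, size_t len); /* NULL: no RDMA */
    int (*get)(transport_t *t, int peer, void *buf, size_t len);       /* NULL: no RDMA */
    size_t bytes_sent; /* bookkeeping for this sketch only */
};

/* Stand-in for a non-blocking send to the simulator. A real transport
 * would write to the simulator's TCP socket here. */
static int sim_send(transport_t *t, int peer, const void *buf, size_t len)
{
    (void)peer;
    (void)buf;
    t->bytes_sent += len;
    return 0;
}

/* A send/recv-only transport: the RDMA entry points are left NULL. */
void sim_transport_init(transport_t *t)
{
    t->send = sim_send;
    t->put = NULL;
    t->get = NULL;
    t->bytes_sent = 0;
}

typedef enum { PATH_SEND = 0, PATH_RDMA_PUT = 1 } transfer_path_t;

/* Hypothetical upper layer: use put when the transport offers it,
 * otherwise fall back to plain send. Reports which path was taken. */
transfer_path_t upper_layer_transfer(transport_t *t, int peer,
                                     const void *buf, size_t len,
                                     int *status)
{
    assert(t != NULL && t->send != NULL && status != NULL);
    if (t->put != NULL) {
        *status = t->put(t, peer, buf, len);
        return PATH_RDMA_PUT;
    }
    *status = t->send(t, peer, buf, len);
    return PATH_SEND;
}
```

With a transport initialised by sim_transport_init(), every transfer takes the send path, so a send/recv-only network needs no put or get implementation in this model.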
You're going to need a Subversion checkout of Open MPI in order to
add a new component, and therefore all the tools that building from a
checkout requires. The
complete list of instructions is available at:
http://www.open-mpi.org/svn/obtaining.php
I would also recommend grabbing a copy of Doxygen so that you can
read our interface documentation in a format more comprehensible than
source code comments.
It's possible that all you will need to do is write a BTL - I'm not
exactly sure. The second question is how much "other network" your
simulator supports - can two processes in your simulator talk over
TCP? How do you intend to start processes in your simulator? You
will likely need to write some components for our run-time layer if
you can't use rsh/fork/etc. to start processes or the started
processes can't make normal TCP connections for out-of-band
communication. If you include more information on how you intend
this part to work, I can offer some more advice on the lowest effort
way of making our run-time environment suit your needs.
Hope this helps,
Brian
--
Brian Barrett
Open MPI developer
http://www.open-mpi.org/