Re: [OMPI devel] IPv6 support in OpenMPI?

2006-03-31 Thread Ralph Castain




Actually, we have some sensor network folks that are interested in
using OpenRTE for their applications. Their platforms can be small
microprocessors, many with custom mini-operating systems. Almost none
support IPv6 nor have any knowledge of that protocol.

Ralph


Christian Kauhaus wrote:

> Ralph Castain:
> > From the run-time perspective, whatever you do *must* support
> > heterogeneous networks of computers that may or may not support IPv6,
> > and may or may not support IPv6-mapped IPv4 addresses. In other words,
> > the solution must support systems including computers that only know
> > IPv4.
>
> It is clear to me that we cannot entirely rely on the availability of
> IPv6-mapped IPv4 addresses. Getting the IPv6-enabled sources to compile
> on systems without sockaddr_in6 or getaddrinfo() is sort of nasty, but
> should be possible. For example, we could extend the configure script to
> test for this and provide some drop-in replacements in case these
> structures/calls are missing on the system.
>
> Are there some reference platforms which are really old but still need
> to be supported? The information on slide 101 of
> http://www.open-rte.org/documentation/march-2006-orte/March-2006-ORTE.pdf
> is fairly generic...
>
>   Christian





Re: [OMPI devel] IPv6 support in OpenMPI?

2006-03-31 Thread Brooks Davis
On Fri, Mar 31, 2006 at 06:53:05PM +0200, Christian Kauhaus wrote:
> Adrian Knoth :
> >(I really prefer the v6-mapped-v4 solution with a single
> > socket, thus eliminating this problem)
> 
> One little problem here is that it is possible to disable the
> IPv6-mapped IPv4 addresses at least under Linux and some BSD variants.
> For Linux, have a look at net.ipv6.bindv6only.  Some authors even
> recommend doing so for security reasons (for example, Murphy &
> Malone in IPv6 Network Administration, O'Reilly 2005).

More specifically, KAME-derived (BSD) stacks disable them by default, so
it might be best to assume mapped addresses don't work, since you'll
probably have to support that case anyway.  The other nice thing about a
two-socket model is that it makes it easier for a network that is
dual-stack and preparing to transition to pure v6 to disable v4 in order
to verify that v6 is actually working and performing correctly.
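
For illustration only, a minimal sketch of the two-socket model discussed
above (not Open MPI code; function and variable names are invented): open
one listening socket per address family returned by getaddrinfo() with
AF_UNSPEC.

#include <sys/socket.h>
#include <netdb.h>
#include <string.h>
#include <unistd.h>

/* Returns the number of listening sockets stored in fds[], at most max. */
static int open_listen_sockets(const char *port, int *fds, int max)
{
    struct addrinfo hints, *res, *ai;
    int n = 0;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family   = AF_UNSPEC;     /* both IPv4 and IPv6 */
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags    = AI_PASSIVE;    /* wildcard addresses for bind() */

    if (getaddrinfo(NULL, port, &hints, &res) != 0) return 0;

    for (ai = res; ai != NULL && n < max; ai = ai->ai_next) {
        int fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0) continue;          /* family not supported: skip it */
        if (bind(fd, ai->ai_addr, ai->ai_addrlen) < 0 ||
            listen(fd, SOMAXCONN) < 0) {
            close(fd);                 /* e.g. v4 wildcard already covered
                                        * by a mapped-v4 v6 socket */
            continue;
        }
        fds[n++] = fd;
    }
    freeaddrinfo(res);
    return n;
}

On stacks where the v6 wildcard socket also accepts mapped v4 connections,
the second bind() may simply fail and is skipped, so the same code works
whether or not bindv6only is set.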

-- Brooks
 
-- 
Any statement of the form "X is the one, true Y" is FALSE.
PGP fingerprint 655D 519C 26A7 82E7 2529  9BF0 5D8E 8BE9 F238 1AD4




Re: [OMPI devel] IPv6 support in OpenMPI?

2006-03-31 Thread Ralph Castain




Hi folks

Sorry to be coming late to the discussion - I'm traveling, so my
comments will likely arrive with long delays.

There is only one contribution I would like to make. You are welcome to
do whatever you like (subject to the usual approval procedure) in the
MPI layer (the btl's, for example) - my comments only apply to the
OpenRTE layer (specifically, the oob tcp component).

From the run-time perspective, whatever you do *must* support
heterogeneous networks of computers that may or may not support IPv6,
and may or may not support IPv6-mapped IPv4 addresses. In other words,
the solution must support systems including computers that only know
IPv4.

I know this may make things more difficult, but OpenRTE has additional
requirements on it for other applications as well. We cannot lock
ourselves into IPv6-capable systems.

Thanks
Ralph


Christian Kauhaus wrote:

> Adrian Knoth:
> > (I really prefer the v6-mapped-v4 solution with a single
> >  socket, thus eliminating this problem)
>
> One little problem here is that it is possible to disable the
> IPv6-mapped IPv4 addresses at least under Linux and some BSD variants.
> For Linux, have a look at net.ipv6.bindv6only.  Some authors even
> recommend doing so for security reasons (for example, Murphy &
> Malone in IPv6 Network Administration, O'Reilly 2005).
>
> So the approach that maximizes the environments where it works out of
> the box is this: call getaddrinfo() with PF_UNSPEC and open a socket for
> each IP version it returns (usually this means two sockets on
> IPv6-enabled hosts, but this may change in the future... who knows?)
>
> If the connection handling code already makes use of one big select
> loop, this should not be *too* hard...
>
>   Christian





Re: [OMPI devel] IPv6 support in OpenMPI?

2006-03-31 Thread Christian Kauhaus
Bogdan Costescu:
> - are all computers that should participate in a job configured
> similarly (only IPv6 or both IPv4 and IPv6) ? If not all are, then
> should some part of the computers communicate over one protocol and
> the rest over the other ? I think that this split communication would

This should really be possible. If we write the connection handling code
correctly, the Internet Protocol version should not matter. Many other
daemons are coded exactly this way. The basic algorithm looks like this:

/* retrieve the list of addresses bound to the given target host */
getaddrinfo(..., &addr_list);

for (addr_res = addr_list; addr_res != NULL; addr_res = addr_res->ai_next) {
  /* initialize a socket of the correct address family */
  fd = socket(addr_res->ai_family, ...);

  if (try_to_connect(fd)) break;
}
freeaddrinfo(addr_list);

So the resolver already does the complicated work for us, since it
returns all addresses associated with a given target (hostname or
IP-address notation) in order of decreasing preference.

>- a related point is whether the 2 protocols should really be regarded 
>as 2 different communication channels. OpenMPI is able to use several 
>communication channels between 2 processes/MPI ranks at the same time, 
>so should the same physical interface be split between the 2 logical 
>protocols for communication between the same pair of computers ?

This one is sort of complicated. From OMPI's point of view, there are
several interfaces on a host, and each interface has access to some
fraction of the total bandwidth. Now we also have two different
protocols on each interface.

Possible scenarios:

- We add the IP version to the OMPI interface name. So instead of eth0
  and eth1 we would get eth0, eth0.v6, eth1, eth1.v6. Using this approach
  one could quite easily state her preferences using the btl command
  line arguments. Of course, the latency/bandwidth code would need to be
  re-worked, since now all traffic on an IPv6 interface would take
  available bandwidth away from the corresponding IPv4 interface.

- We do not add the IP version to the interface name, but perform
  protocol selection automatically based on resolver results. In this
  case the modification to the interface selection algorithm would
  probably be a minor one. Drawback: we cannot control the IP version
  beyond the resolver configuration, which is probably out of reach for
  the user. Since IPv6 imposes a slightly higher protocol overhead,
  users might want to use IPv4 in the local network, but cannot do
  anything if the automatic selection gets it wrong.

- We introduce another parameter, which allows an IP version selection 
  both globally and on a per-interface basis. Something like:
  IPv4-only / prefer IPv4 / auto (resolver) / prefer IPv6 / IPv6-only

The third approach would possibly be the cleanest one.
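
To make the third approach a bit more concrete, here is a hedged sketch of
how such a selection parameter could map onto the resolver hints. The enum,
values and function name are invented for illustration; they are not
existing Open MPI identifiers or MCA parameters.

#include <sys/socket.h>

enum ip_version_pref {
    IP_PREF_V4_ONLY,
    IP_PREF_PREFER_V4,
    IP_PREF_AUTO,        /* leave the ordering to the resolver */
    IP_PREF_PREFER_V6,
    IP_PREF_V6_ONLY
};

/* Translate the preference into an address-family hint for getaddrinfo().
 * The two "prefer" cases still resolve both families and would only change
 * the order in which the returned addresses are tried. */
static int pref_to_family(enum ip_version_pref pref)
{
    switch (pref) {
    case IP_PREF_V4_ONLY: return AF_INET;
    case IP_PREF_V6_ONLY: return AF_INET6;
    default:              return AF_UNSPEC;
    }
}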

>of the computers. For example, if the remote computer has IPv6 
>configured but the sshd is restricted to bind to IPv4, then a ssh 
>connection to this computer using the IPv6 address (which would be 
>specified in the hostfile) will fail, while OpenMPI processes [...]

In my experience, this is not a problem. We currently have some IPv6 test
networks running, and one of our clusters also does IPv6 on its internal
ethernet. Hosts which are generally not IPv6-ready get no IPv6 address
in the DNS / hosts file. This prevents any contact using IPv6, since
their address is not known. Hosts which have some IPv6 support get a
double entry in the DNS or hosts file. Since it is standard behaviour
for every IPv6 app to try all known addresses for the target host until
one succeeds, we are also able to connect to an IPv6-enabled host
where the target daemon does not listen on an IPv6 interface. For
example, we ran for several weeks without an IPv6-enabled rsh (which is
used to handle MPI job startup on the cluster) without any problems.

>IMHO, some discussion of them should occur before the actual coding...

I agree. So here we go :-)

  Christian

-- 
Dipl.-Inf. Christian Kauhaus   <><
Lehrstuhl fuer Rechnerarchitektur und -kommunikation 
Institut fuer Informatik * Ernst-Abbe-Platz 1-2 * D-07743 Jena
Tel: +49 3641 9 46376  *  Fax: +49 3641 9 46372   *  Raum 3217


Re: [OMPI devel] IPv6 support in OpenMPI?

2006-03-31 Thread Ralf Wildenhues
* Adrian Knoth wrote on Fri, Mar 31, 2006 at 05:33:29PM CEST:
> 
> If there is really a platform without sockaddr_in6
*snip*

> As far as I know: All BSDs have v6, Linux has, HPUX, AIX, Solaris,
> Windows (XP for sure, 2000 experimental, 9X/ME don't).

As determined by a cheap
  find /usr/include -type f | xargs grep sockaddr_in6
here is some more information:

Have:
AIX 4.3.3
HP-UX 11.11
IRIX 6.5
Solaris 8
Tru64 UNIX 5.1

Have not:
HP-UX 11.00
Solaris 7
Tru64 UNIX 4.0D

Cheers,
Ralf


Re: [OMPI devel] IPv6 support in OpenMPI?

2006-03-31 Thread Brian Barrett

On Mar 31, 2006, at 10:33 AM, Adrian Knoth wrote:


> On Fri, Mar 31, 2006 at 05:21:42PM +0200, Ralf Wildenhues wrote:
>
> > > Perhaps it's a good idea to port any internal structure to
> > > IPv6, as it is able to represent the whole v4 namespace.
> > > One can always determine whether it is a real v6 or only
> > > a mapped v4 address (the common ::ffff: prefix)
> >
> > I'm far from knowledgeable in this networking area, but I have a
> > maybe-naive question here: Won't you have to assume in this case that
> > the host operating system has IPv6 support, so that the corresponding
> > data structures are defined?
>
> This is true. I don't know of any modern OS without IPv6 support,
> even Windows provides these structures ;)
>
> If there is really a platform without sockaddr_in6, this should
> be caught by configure (reverting to v4-only code, a little
> tricky, yes).
>
> As far as I know: All BSDs have v6, Linux has, HPUX, AIX, Solaris,
> Windows (XP for sure, 2000 experimental, 9X/ME don't).


Do you know which versions of these operating systems?  We have to  
support some fairly old platforms, so it would be good to at least  
know what we are getting into...  I think we actually do run on a  
couple without IPv6 support, but I could be wrong there.


Brian


--
  Brian Barrett
  Open MPI developer
  http://www.open-mpi.org/




Re: [OMPI devel] IPv6 support in OpenMPI?

2006-03-31 Thread Adrian Knoth
On Fri, Mar 31, 2006 at 05:21:42PM +0200, Ralf Wildenhues wrote:

> > Perhaps it's a good idea to port any internal structure to
> > IPv6, as it is able to represent the whole v4 namespace.
> > One can always determine whether it is a real v6 or only
> > a mapped v4 address (the common ::ffff: prefix)
> I'm far from knowledgeable in this networking area, but I have a
> maybe-naive question here: Won't you have to assume in this case that
> the host operating system has IPv6 support, so that the corresponding
> data structures are defined?

This is true. I don't know of any modern OS without IPv6 support,
even Windows provides these structures ;)

If there is really a platform without sockaddr_in6, this should
be caught by configure (reverting to v4-only code, a little
tricky, yes).
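
As a rough sketch only of what such a configure-driven fallback could look
like in C: this assumes configure defines HAVE_STRUCT_SOCKADDR_IN6 (e.g.
via an AC_CHECK_TYPES test against <netinet/in.h>); the typedef and macro
names are invented for illustration and are not existing Open MPI names.

#include <sys/socket.h>
#include <netinet/in.h>

#ifdef HAVE_STRUCT_SOCKADDR_IN6
typedef struct sockaddr_in6 ompi_ip_addr_t;   /* v6 (and mapped v4) */
#define OMPI_IP_AF AF_INET6
#else
typedef struct sockaddr_in  ompi_ip_addr_t;   /* v4-only fallback */
#define OMPI_IP_AF AF_INET
#endif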

As far as I know: All BSDs have v6, Linux has, HPUX, AIX, Solaris,
Windows (XP for sure, 2000 experimental, 9X/ME don't).


-- 
mail: a...@thur.de  http://adi.thur.de  PGP: v2-key via keyserver

  Bad: Your husband likes to wear women's clothes.
  Panic: He looks better in them than you do.


Re: [OMPI devel] IPv6 support in OpenMPI?

2006-03-31 Thread Ralf Wildenhues
* Adrian Knoth wrote on Fri, Mar 31, 2006 at 04:59:42PM CEST:
> 
> Perhaps it's a good idea to port any internal structure to
> IPv6, as it is able to represent the whole v4 namespace.
> One can always determine whether it is a real v6 or only
> a mapped v4 address (the common ::ffff: prefix)

I'm far from knowledgeable in this networking area, but I have a
maybe-naive question here: Won't you have to assume in this case that
the host operating system has IPv6 support, so that the corresponding
data structures are defined?  Does OpenMPI aim at supporting systems
that lack this support?

Cheers,
Ralf


Re: [OMPI devel] IPv6 support in OpenMPI?

2006-03-31 Thread Brian Barrett

On Mar 31, 2006, at 10:15 AM, Adrian Knoth wrote:

> On Fri, Mar 31, 2006 at 09:36:31AM -0500, Jeff Squyres (jsquyres) wrote:
>
> > I have no personal experience with IPv6, but one thought that strikes
> > me is that the components might be able to figure out what to do by
> > looking at/parsing either the hostnames or the results that come back
> > from resolving the hostname...?
>
> Yes. You can ask the resolver for v4, v6 or any of them.
> The libc functions are standardized and handle both.
> The socket family, too. You just have to specify whether
> to use AF_INET or AF_INET6. That's all.
>
> Due to the new lookup functions, DNS lookups now return
> a linked list of dynamically allocated memory containing
> the results for possibly multi-homed hosts. The common way
> is to iterate over this list, try every given address
> and manually free the memory afterwards.
>
> The whole process in its naive implementation is straightforward.
>
> Are we getting into trouble with listen()/accept()? If we use
> v6-mapped-v4 (::ffff:a.b.c.d/96), we only have one socket
> to bind to and to listen on. But if we create two separate
> sockets, are they non-blocking to each other? In other words:
> does OMPI already handle more than one listen socket?
>
> Would this be a problem in case of a btl/tcp6 component?
>
> (I really prefer the v6-mapped-v4 solution with a single
>  socket, thus eliminating this problem)


We use an event library and select() to determine when things are
pending on sockets.  However, I have to say that I would prefer to have
one tcp btl / oob component and have it do the right things
internally.  The space difference for storing an IPv6 vs. IPv4 address
isn't that huge, and maintaining all the extra code will be a
nightmare.  At least, that's been my experience with similar things
in the past.  Just my $0.02, of course.
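
For what it's worth, handling more than one listen socket in a single
select() loop is straightforward. A minimal sketch follows (this is not
the actual Open MPI event code; handle_new_connection() and the other
names are placeholders):

#include <sys/select.h>
#include <sys/socket.h>

void handle_new_connection(int conn);   /* assumed hand-off, defined elsewhere */

static void accept_loop(const int *listen_fds, int nfds_count)
{
    for (;;) {
        fd_set readfds;
        int i, maxfd = -1;

        FD_ZERO(&readfds);
        for (i = 0; i < nfds_count; i++) {
            FD_SET(listen_fds[i], &readfds);
            if (listen_fds[i] > maxfd) maxfd = listen_fds[i];
        }

        /* wait until any of the listening sockets is readable */
        if (select(maxfd + 1, &readfds, NULL, NULL, NULL) < 0) continue;

        for (i = 0; i < nfds_count; i++) {
            if (FD_ISSET(listen_fds[i], &readfds)) {
                int conn = accept(listen_fds[i], NULL, NULL);
                if (conn >= 0) {
                    /* hand the new connection to the normal per-peer code */
                    handle_new_connection(conn);
                }
            }
        }
    }
}

In the real code one would presumably just register the extra descriptor
with the existing event library rather than hand-roll a loop like this.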


The other question is what to do on platforms without IPv6 support.
I'm pretty sure we're going to run into them, so it would be good to
plan along those lines.


Brian


Re: [OMPI devel] IPv6 support in OpenMPI?

2006-03-31 Thread Adrian Knoth
On Fri, Mar 31, 2006 at 09:36:31AM -0500, Jeff Squyres (jsquyres) wrote:

> I have no personal experience with IPv6, but one thought that strikes me
> is that the components might be able to figure out what to do by looking
> at/parsing either the hostnames or the results that come back from
> resolving the hostname...?

Yes. You can ask the resolver for v4, v6 or any of them.
The libc functions are standardized and handle both.
The socket family, too. You just have to specify whether
to use AF_INET or AF_INET6. That's all.

Due to the new lookup functions, DNS lookups now return
a linked list of dynamically allocated memory containing
the results for possibly multi-homed hosts. The common way
is to iterate over this list, try every given address
and manually free the memory afterwards.

The whole process in its naive implementation is straightforward.

Are we getting into trouble with listen()/accept()? If we use
v6-mapped-v4 (::ffff:a.b.c.d/96), we only have one socket
to bind to and to listen on. But if we create two separate
sockets, are they non-blocking to each other? In other words:
does OMPI already handle more than one listen socket?

Would this be a problem in case of a btl/tcp6 component?

(I really prefer the v6-mapped-v4 solution with a single
 socket, thus eliminating this problem)



-- 
mail: a...@thur.de  http://adi.thur.de  PGP: v2-key via keyserver

Advertisement for a shooting club:
"Learn to shoot with us and hit upon good friends!"


Re: [OMPI devel] IPv6 support in OpenMPI?

2006-03-31 Thread Adrian Knoth
On Fri, Mar 31, 2006 at 09:07:39AM -0500, Brian Barrett wrote:

> > I have a first quick and dirty patch, replacing AF_INET by AF_INET6,
> > the sockaddr_in structs and so on.
> Is there a way to do this to better support both IPv4 and IPv6?

I think so, too. There are probably two different ways to achieve
this: either provide two components "tcp" and "tcp6", or use
v6-mapped-v4 addresses. The first would surely result in a lot
of shared code, but I think this won't be a problem. If it is
possible to have two components (and thereby several modules)
for communication, this might be a solution.

The other way, v6-mapped-v4, is how normal userland daemons
are usually implemented. The application only listens on
v6 sockets; v4 addresses are mapped to ::ffff:a.b.c.d/96,
where a.b.c.d is the normal 32-bit v4 address:

Mar 31 13:58:26 ltw pop3-login: Login: x [::ffff:84.184.164.40]

Perhaps it's a good idea to port any internal structure to
IPv6, as it is able to represent the whole v4 namespace.
One can always determine whether it is a real v6 or only
a mapped v4 address (the common ::ffff: prefix)
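
A minimal sketch of this single-socket, v6-mapped-v4 approach, assuming
the platform provides the IPV6_V6ONLY socket option and the
IN6_IS_ADDR_V4MAPPED macro (function names are invented for illustration,
this is not Open MPI code):

#include <sys/socket.h>
#include <netinet/in.h>

/* Returns non-zero if the accepted peer is really an IPv4 client behind
 * the ::ffff: prefix. */
static int is_mapped_v4_peer(int connfd)
{
    struct sockaddr_in6 peer;
    socklen_t len = sizeof(peer);

    if (getpeername(connfd, (struct sockaddr *)&peer, &len) < 0) return 0;
    return peer.sin6_family == AF_INET6 &&
           IN6_IS_ADDR_V4MAPPED(&peer.sin6_addr);
}

/* When creating the listening socket, explicitly allow mapped v4
 * connections; this is exactly what a bindv6only / IPV6_V6ONLY default
 * of 1 (as on KAME-derived stacks) would otherwise prevent. */
static void allow_mapped_v4(int listenfd)
{
    int off = 0;
    setsockopt(listenfd, IPPROTO_IPV6, IPV6_V6ONLY, &off, sizeof(off));
}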


> mca_btl_tcp_proc_insert(), which is what I think you're referring to  
> by the net1/net2 code, that's intended to be used to try to get all  
> the multi-nic scenarios wired up in the most advantageous way  
> possible.  So we look at the combination IPv4 addr and netmask and  
> prefer to connect two endpoints in the same subnet.

Ok, this is how I understood the code. The current implementation
does a bitwise AND on uint32 values; for IPv6 this will be 128 bits.

I don't know of any predefined type of this size, so we have
to find a different solution. Though the final decision will
always be boolean ("Are we on the same network?" Yes/No), we
have to represent the correct answer.

There is only one comparison between net1 and net2, so the
decision is a local one and we don't really need the
netmasks.

> I'm not sure how IPv6 deals with netmasks and routing, but I'm
> assuming there would be something similar.

Pretty much the same. Netmasks are now called "prefixlen":
integers between 0 (like /0) and 128 (the analogue of /32 in IPv4).
The typical on-link prefixlen is /64; there's usually no
smaller (e.g. /112) prefixlen, though it might exist.

Routing aggregation is done by enlarging the prefix.
A typical one is /48, which means 2^16 networks with 2^64
hosts each.

In other words: the LAN prefixlen will be 64 in most cases.
Larger ones (e.g. /48) are only for routing.

I apologize for calling the numerically smaller value of 48
the larger prefix compared to 64. This just refers to the network
size, as the /64 is the smaller network.
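
For the 128-bit comparison mentioned above, a byte-wise compare of the two
addresses under the prefix length avoids the need for a 128-bit integer
type. A hedged sketch (the function name is invented, this is not the
btl_tcp_proc.c code):

#include <netinet/in.h>
#include <stdbool.h>

static bool same_v6_network(const struct in6_addr *a,
                            const struct in6_addr *b,
                            unsigned prefixlen)       /* 0 .. 128 */
{
    unsigned full_bytes = prefixlen / 8;
    unsigned rest_bits  = prefixlen % 8;
    unsigned i;

    /* compare the whole bytes covered by the prefix */
    for (i = 0; i < full_bytes; i++) {
        if (a->s6_addr[i] != b->s6_addr[i]) return false;
    }
    /* compare the remaining partial byte, if any */
    if (rest_bits != 0) {
        unsigned char mask = (unsigned char)(0xff << (8 - rest_bits));
        if ((a->s6_addr[full_bytes] & mask) !=
            (b->s6_addr[full_bytes] & mask)) return false;
    }
    return true;
}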


> > I don't know if this patched tcp-component can handle
> > IPv6 connections, I've never tested it. I think it
> > even breaks IPv4 functionality; we should make clear
> > how IPv4 and IPv6 may work in parallel (or may not, if
> > one considers IPv4 deprecated ;)
>  From a practical standpoint, Open MPI has to support both IPv4 and  
> IPv6 for the foreseeable future. 

I think so, too. We're dual stacked.

> We currently try to wire up one connection per "IP device", so it
> seems like we should be able to find some way to automatically
> switch between IPv6 or IPv4 based on what we determine is available
> on that host, right?

That's right. The orte-oob seems to be the right place for
this decision, assuming that ompi/mca/btl/tcp can handle
both, or that two different components provide the desired
functionality.

Implementing this dual-stack behaviour isn't that hard; almost
every userland tool does it this way: try v6, and if it
fails, use v4. The user can usually force the code to use
either v4 or v6. This shouldn't be too hard in the case of
v6-mapped-v4. The only thing to take care of is RFC 1918 networks.

adi@drcomp:~$ telnet ::ffff:127.0.0.1 25

(works fine)

To automatically select the right protocol, it might be good
to prefer IPv4 (smaller headers->less overhead). The user
can still force the use of IPv6 via DNS (assigning special
IPv6-only hostnames)


-- 
mail: a...@thur.de  http://adi.thur.de  PGP: v2-key via keyserver

Better a peeping Tom in the garden than no electricity at all!


Re: [OMPI devel] IPv6 support in OpenMPI?

2006-03-31 Thread Brian Barrett

On Mar 31, 2006, at 8:40 AM, Adrian Knoth wrote:


> On Fri, Mar 31, 2006 at 10:44:11AM +0200, Christian Kauhaus wrote:
>
> > Hello *,
>
> Hi.
>
> > University of Jena (Germany). Our work group is digging into how to
> > connect several clusters on a campus.
>
> I think I'm also a member of this workgroup, though I am not
> working at University of Jena, but studying there.
>
> > First, we are interested in integrating IPv6 support into the tcp btl.
> > Does anyone know if there is someone already working on this?
>
> I have a first quick and dirty patch, replacing AF_INET by AF_INET6,
> the sockaddr_in structs and so on.

Is there a way to do this to better support both IPv4 and IPv6?  It
looks like you had to change an awful lot of interface declarations,
making the code IPv6-only...

> I think it is broken: the calculation of net1 and net2 in
> btl_tcp_proc.c isn't really ported, and to be honest, I don't
> understand the details, i.e. do I have to port name lookups,
> are there high-level structures relying on IPv4 structs,
> and so on.

The port name information will all be in the modex share that I
talked about in the previous e-mail, so it's just a matter of looking
it up in the endpoint information.  As for the code in
mca_btl_tcp_proc_insert(), which is what I think you're referring to
by the net1/net2 code, that's intended to be used to try to get all
the multi-NIC scenarios wired up in the most advantageous way
possible.  So we look at the combination of IPv4 address and netmask
and prefer to connect two endpoints in the same subnet.  We also try
not to connect public and private addresses, as that rarely works the
way people intend.

As an example, say we have two hosts, both with two NICs:

  host1: 129.79.200.1/255.255.0.0, 129.72.100.1/255.255.0.0
  host2: 129.79.200.2/255.255.0.0, 129.72.100.2/255.255.0.0

When host1 is trying to wire up connections to host2, it's going to
figure out how to wire up the btl instance for the 79.200 address and
the 72.100 address separately.  For the 79.200.1 address, we're going
to see we have two addresses we can connect to - 129.79.200.2 and
129.72.100.2.  By looking at netmasks and addresses, we can make the
guess that the 79.200.2 address is on the "same" network and the
72.100.2 address is on a "different" network.  I'm not sure how IPv6
deals with netmasks and routing, but I'm assuming there would be
something similar.
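
For reference, the v4-style "same subnet" test described above boils down
to something like the following sketch (illustrative only, not the actual
mca_btl_tcp_proc_insert() code; the function name is invented):

#include <stdint.h>
#include <stdbool.h>

static bool same_v4_network(uint32_t addr1, uint32_t addr2, uint32_t netmask)
{
    /* e.g. 129.79.200.1 and 129.79.200.2 with mask 255.255.0.0
     * both reduce to 129.79.0.0, so they are preferred as a pair. */
    return (addr1 & netmask) == (addr2 & netmask);
}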



> At least it compiles ;) (let's ship it)
>
> I don't know if this patched tcp-component can handle
> IPv6 connections, I've never tested it. I think it
> even breaks IPv4 functionality; we should make clear
> how IPv4 and IPv6 may work in parallel (or may not, if
> one considers IPv4 deprecated ;)
>
> You can retrieve the patch here:
>
>    http://cluster.inf-ra.uni-jena.de/~adi/ompi.ipv6.v1.patch
>
> I'd also appreciate any suggestions, hints or even success stories ;)


From a practical standpoint, Open MPI has to support both IPv4 and  
IPv6 for the foreseeable future.  We currently try to wire up one  
connection per "IP device", so it seems like we should be able to  
find some way to automatically switch between IPv6 or IPv4 based on  
what we determine is available on that host, right?  I'll admit it  
has been a year or so since I've looked at this, so I could be  
completely off base there.


Brian


--
  Brian Barrett
  Open MPI developer
  http://www.open-mpi.org/




Re: [OMPI devel] IPv6 support in OpenMPI?

2006-03-31 Thread Brian Barrett

On Mar 31, 2006, at 3:44 AM, Christian Kauhaus wrote:


> first I'd like to introduce myself. I'm Christian Kauhaus and I am
> currently working at the Department of Computer Architecture at the
> University of Jena (Germany). Our work group is digging into how to
> connect several clusters on a campus.
>
> As part of our research, we'd like to evaluate the use of IPv6 for
> multi-cluster coupling. Therefore, we need to run OpenMPI over
> TCP/IPv6.
>
> Last year during EuroPVM/MPI I already had a short chat with Jeff
> Squyres about this, but now we actually do have the time to work on
> this.


Great!  We currently only have IPv4 support, but IPv6 has always been  
on the wishlist.  Most of the developers in the States don't have  
access to IPv6 networks, so it hasn't been a concern / need that  
we've had time to address at this point.  It would be great if  
someone else could take a stab at it.


> First, we are interested in integrating IPv6 support into the tcp btl.
> Does anyone know if there is someone already working on this? If yes,
> we would be glad to cooperate. If not, we would start on it ourselves,
> although we would need some help from the OpenMPI developer community
> regarding OpenMPI / ORTE internals.


As far as I'm aware, there is no one working on IPv6 support for Open  
MPI.  We would welcome anyone willing to work on the support :).  And  
we'll be as responsive as possible to requests for help / advice -  
this list is the best forum for that type of discussion.


Are your hosts configured for both IPv4 and IPv6 traffic (or are they  
IPv6 only)?  I ask because that will determine what your first step  
is.  There are two TCP communication channels in Open MPI -- the tcp  
oob component, used by the run-time layer for out-of-band  
communication and the tcp btl component, used by the MPI layer for  
MPI traffic.  Without a working tcp oob component, it's pretty close  
to impossible to start the tcp btl, so if you only have IPv6 on your  
machines, that will dictate starting with the tcp oob component.   
Otherwise, you could start with either component (although both will  
eventually need to be updated).


The oob tcp component (code is in orte/mca/oob/tcp/) is fairly
straightforward, especially if all you need to deal with is
connection setup.  There are really two pieces to be aware of - in
oob_tcp.c there is some code dealing with uri strings - this is used
by the upper layers to ask the oob component for its contact address
(as a uri string) and to give the oob component a uri string and
associate it with an orte_process_name.  The peer connection code is
in a combination of oob_tcp_peer.[h,c] and oob_tcp_addr.[h,c].  I'm
sure you will have to modify oob_tcp_addr.[h,c], and I think you'll
probably have to modify oob_tcp_peer.[c,h] as well.


I should digress for a second...  Every process in Open MPI has an
orte_process_name.  This value will be unique between processes that  
can connect to each other.  When I want to send an out of band  
message to a remote host, I send to that orte_process_name and the  
communication layer figures out how to get the message over there.   
So if the upper layers associate an orte_process_name with a uri  
string, you'll use that information to contact that  
orte_process_name, should you ever need to send data that way.


The tcp btl is mostly the same type of thing.  The main difference is
how peers are set up.  Instead of a char string to share endpoint
connections, we have what we call the "modex".  This is basically a
one-time-write, many-time-read global data store.  So the tcp btl
puts a fixed-size structure into the modex data (behind the scenes,
this is stored in our gpr data store), and each process in the
universe can get that data by looking it up against its process name
(actually, in this case, it's a datastructure called the ompi_proc,
which is an orte_process_name, plus data needed for each MPI
process).  So we'd need to extend that datastructure out a little bit
to be able to support either IPv4 or IPv6 addresses.  From there, it
would be the usual set of changes to connection setup.
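
As a rough illustration only (this is not the real btl tcp modex layout),
an extended fixed-size entry carrying either address family could look
something like this; the struct and field names are invented:

#include <netinet/in.h>
#include <stdint.h>

struct tcp_modex_addr {
    uint8_t  addr_family;          /* AF_INET or AF_INET6 */
    uint8_t  addr_len;             /* 4 or 16 */
    uint16_t addr_port;            /* listening port, network byte order */
    union {
        struct in_addr  v4;
        struct in6_addr v6;
    } addr;                        /* fixed size: always sizeof(in6_addr) */
};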


This was a fairly simple overview - I'd recommend starting with the  
tcp oob component and asking when you have questions about what you  
see.  You don't need to run Open MPI jobs to test the tcp oob  
component - you can just use orterun to launch normal old unix  
commands.  Something with a bit of stdio output will give a  
reasonable first test of the oob.  I usually do something like:


  orterun -np 2 -host host_a,host_b ls -l $HOME

as I have enough files in my home directory that a page or two of  
standard I/O should be returned.


Hope this helps,

Brian

--
  Brian Barrett
  Open MPI developer
  http://www.open-mpi.org/




Re: [OMPI devel] IPv6 support in OpenMPI?

2006-03-31 Thread Adrian Knoth
On Fri, Mar 31, 2006 at 10:44:11AM +0200, Christian Kauhaus wrote:

> Hello *,

Hi.

> University of Jena (Germany). Our work group is digging into how to
> connect several clusters on a campus. 

I think I'm also a member of this workgroup, though I am not
working at University of Jena, but studying there.

> First we are interested to integrate IPv6 support into the tcp btl.
> Does anyone know if there is someone already working on this?

I have a first quick and dirty patch, replacing AF_INET by AF_INET6,
the sockaddr_in structs and so on.

I think it is broken: the calculation of net1 and net2 in
btl_tcp_proc.c isn't really ported, and to be honest, I don't
understand the details, i.e. do I have to port name lookups,
are there high-level structures relying on IPv4 structs,
and so on.

At least it compiles ;) (let's ship it)

I don't know if this patched tcp-component can handle
IPv6 connections, I've never tested it. I think it
even breaks IPv4 functionality; we should make clear
how IPv4 and IPv6 may work in parallel (or may not, if
one considers IPv4 deprecated ;)

You can retrieve the patch here:

   http://cluster.inf-ra.uni-jena.de/~adi/ompi.ipv6.v1.patch

I'd also appreciate any suggestions, hints or even success stories ;)



-- 
mail: a...@thur.de  http://adi.thur.de  PGP: v2-key via keyserver

Bill Gates's Motto: "If you can't make it good, make it look good!"


[OMPI devel] IPv6 support in OpenMPI?

2006-03-31 Thread Christian Kauhaus
Hello *,

first I'd like to introduce myself. I'm Christian Kauhaus and I am
currently working at the Department of Computer Architecture at the
University of Jena (Germany). Our work group is digging into how to
connect several clusters on a campus. 

As part of our research, we'd like to evaluate the use of IPv6 for
multi-cluster coupling. Therefore, we need to run OpenMPI over TCP/IPv6.
Last year during EuroPVM/MPI I already had a short chat with Jeff
Squyres about this, but now we actually do have the time to work on
this.

First, we are interested in integrating IPv6 support into the tcp btl.
Does anyone know if there is someone already working on this? If yes,
we would be glad to cooperate. If not, we would start on it ourselves,
although we would need some help from the OpenMPI developer community
regarding OpenMPI / ORTE internals.

So I would really appreciate any pointers, hints or contacts to share.

TIA

  Christian

-- 
Dipl.-Inf. Christian Kauhaus   <><
Lehrstuhl fuer Rechnerarchitektur und -kommunikation 
Institut fuer Informatik * Ernst-Abbe-Platz 1-2 * D-07743 Jena
Tel: +49 3641 9 46376  *  Fax: +49 3641 9 46372   *  Raum 3217