[OMPI devel] IPv6 support in OpenMPI?

2006-03-31 Thread Christian Kauhaus
Hello *,

first I'd like to introduce myself. I'm Christian Kauhaus and I am
currently working at the Department of Computer Architecture at the
University of Jena (Germany). Our work group is digging into how to
connect several clusters on a campus. 

As part of our research, we'd like to evaluate the use of IPv6 for
multi-cluster coupling. Therefore, we need to run OpenMPI over TCP/IPv6.
Last year during EuroPVM/MPI I already had a short chat with Jeff
Squyres about this, but now we actually do have the time to work on
this.

First, we are interested in integrating IPv6 support into the tcp btl.
Does anyone know whether someone is already working on this? If yes, we
would be glad to cooperate. If not, we would start on it ourselves,
although we would need some help from the OpenMPI developer community
regarding OpenMPI / ORTE internals. 

So I would really appreciate any pointers, hints or contacts to share.

TIA

  Christian

-- 
Dipl.-Inf. Christian Kauhaus   <><
Lehrstuhl fuer Rechnerarchitektur und -kommunikation 
Institut fuer Informatik * Ernst-Abbe-Platz 1-2 * D-07743 Jena
Tel: +49 3641 9 46376  *  Fax: +49 3641 9 46372   *  Raum 3217


Re: [OMPI devel] IPv6 support in OpenMPI?

2006-03-31 Thread Christian Kauhaus
Bogdan Costescu <bogdan.coste...@iwr.uni-heidelberg.de>:
>- are all computers that should participate in a job configured 
>similarly (only IPv6 or both IPv4 and IPv6) ? If not all are, then 
>should some part of the computers communicate over one protocol and 
>the rest over the other ? I think that this split communication would 

This should really be possible. If we implement the connection handling
code correctly, the Internet Protocol version should not matter; many
other daemons are written exactly this way. The basic algorithm looks
like this:

/* retrieve the list of addresses bound to the given target host */
getaddrinfo(..., &addr_list);

for (addr_res = addr_list; addr_res != NULL; addr_res = addr_res->ai_next) {
  /* create a socket of the matching address family */
  fd = socket(addr_res->ai_family, ...);

  if (try_to_connect(fd)) break;
}

So the resolver already does the complicated work for us: it returns all
addresses associated with a given target (hostname or address literal)
in order of decreasing preference.
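The loop above can be fleshed out into a small, self-contained sketch.
The helper names (connect_to_host, the loopback listener in main) are
ours for illustration, not taken from the OMPI code base; the point is
that getaddrinfo() hands back candidates in preference order and we
simply try each family until one connect() succeeds:

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netdb.h>
#include <netinet/in.h>

/* Try every address the resolver returns, regardless of family. */
static int connect_to_host(const char *host, const char *port)
{
    struct addrinfo hints, *list, *res;
    int fd = -1;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;       /* accept both IPv4 and IPv6 */
    hints.ai_socktype = SOCK_STREAM;

    if (getaddrinfo(host, port, &hints, &list) != 0)
        return -1;

    for (res = list; res != NULL; res = res->ai_next) {
        fd = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
        if (fd < 0)
            continue;
        if (connect(fd, res->ai_addr, res->ai_addrlen) == 0)
            break;                     /* success with this family */
        close(fd);
        fd = -1;
    }
    freeaddrinfo(list);
    return fd;
}

int main(void)
{
    /* Set up an IPv4 loopback listener so the connect loop has a
     * target; if "localhost" resolves to ::1 first, that attempt
     * fails and the loop falls back to 127.0.0.1. */
    int lsock = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in sin;
    socklen_t len = sizeof(sin);

    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    sin.sin_port = 0;                  /* let the kernel pick a port */
    bind(lsock, (struct sockaddr *)&sin, sizeof(sin));
    listen(lsock, 1);
    getsockname(lsock, (struct sockaddr *)&sin, &len);

    char port[16];
    snprintf(port, sizeof(port), "%d", ntohs(sin.sin_port));

    int fd = connect_to_host("localhost", port);
    printf(fd >= 0 ? "connected\n" : "failed\n");
    if (fd >= 0) close(fd);
    close(lsock);
    return fd >= 0 ? 0 : 1;
}
```

Note that the caller never mentions an address family; the protocol
choice falls out of the resolver's answer.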

>- a related point is whether the 2 protocols should really be regarded 
>as 2 different communication channels. OpenMPI is able to use several 
>communication channels between 2 processes/MPI ranks at the same time, 
>so should the same physical interface be split between the 2 logical 
>protocols for communication between the same pair of computers ?

This one is somewhat complicated. From OMPI's point of view, there are
several interfaces on a host, and each interface has access to some
fraction of the total bandwidth. Now we additionally have two different
protocols on each interface. 

Possible scenarios:

- We add the IP version to the OMPI interface name. So instead of eth0
  and eth1 we would get eth0, eth0.v6, eth1, eth1.v6. With this approach
  one could quite easily state her preferences using the btl command
  line arguments. Of course, the latency/bandwidth code would need to be
  reworked, since all traffic on an IPv6 interface would now take
  available bandwidth away from the corresponding IPv4 interface.

- We do not add the IP version to the interface name, but perform
  protocol selection automatically based on resolver results. In this
  case the modification to the interface selection algorithm would
  probably be a minor one. Drawback: we cannot control the IP version
  beyond the resolver configuration, which is probably out of the
  user's reach. Since IPv6 imposes a slightly higher protocol overhead,
  users might want to use IPv4 on the local network, but could not do
  anything if the automatic selection chooses wrongly.

- We introduce another parameter, which allows an IP version selection 
  both globally and on a per-interface basis. Something like:
  IPv4-only / prefer IPv4 / auto (resolver) / prefer IPv6 / IPv6-only

The third approach would possibly be the cleanest one.

>of the computers. For example, if the remote computer has IPv6 
>configured but the sshd is restricted to bind to IPv4, then a ssh 
>connection to this computer using the IPv6 address (which would be 
>specified in the hostfile) will fail, while OpenMPI processes [...]

In my experience, this is not a problem. We currently have some IPv6
test networks running, and one of our clusters also does IPv6 on its
internal Ethernet. Hosts that are generally not IPv6-ready get no IPv6
address in the DNS / hosts file; this prevents any contact over IPv6,
since their address is simply not known. Hosts that have some IPv6
support get a double entry in the DNS or hosts file. Since it is
standard behaviour for every IPv6 application to try all known
addresses for the target host until one succeeds, we are also able to
connect to an IPv6-enabled host whose target daemon does not listen on
an IPv6 interface. For example, we ran for several weeks with a
non-IPv6-enabled rsh (which handles MPI job startup on the cluster)
without any problems. 

>IMHO, some discussion of them should occur before the actual coding...

I agree. So here we go :-)

  Christian



Re: [OMPI devel] Building ompi occasionally touches the source files

2006-07-18 Thread Christian Kauhaus
Adrian Knoth <a...@drcomp.erfurt.thur.de>:
>b) fails to complete (see attachment), the errors are all
>   related to lex.

What are the flex versions used on these systems? On Debian stable it is
flex 2.5.31 and on my Gentoo box it is flex 2.5.33, both giving correct
builds. 

I'm using the same VPATH setup as Adrian, and during the build process
opal/util/show_help_lex.c is *neither* touched nor modified. It is
simply compiled to build/ARCH/opal/util/show_help_lex.lo, as it is
supposed to be.

-Christian



Re: [OMPI devel] [IPv6] new component oob/tcp6

2006-09-06 Thread Christian Kauhaus
Bogdan Costescu <bogdan.coste...@iwr.uni-heidelberg.de>:
>I don't know why you think that this (talking to different nodes via 
>different channels) is unusual - I think that it's quite probable, 
>especially in a heterogenous environment.

I think the first goal should be to get IPv6 working at all -- and this
is much easier if we restrict ourselves to the case where all systems
participating in one(!) job are reachable via a single protocol version,
either IPv4 or IPv6. 

I'm not quite sure that we need to run a *single* job across a network
containing both systems that are not reachable via IPv4 and systems
that are not reachable via IPv6. If there is a practical need for this,
we will probably tackle it in the future. Note that the current plan
does not restrict the use of OpenMPI in heterogeneous IPv4/IPv6
environments; we simply will not support mixed IPv4/IPv6 operation in a
single job right now. 

Our current plan is to look into the hostfile and check whether it contains 

(1a) just IPv4 addresses
(1b) IPv4 addresses and hostnames for which 'A' queries can be resolved
(2a) just IPv6 addresses
(2b) IPv6 addresses and hostnames for which 'AAAA' queries can be resolved.

In case 1 we initially use an IPv4 transport and in case 2 we initially
use an IPv6 transport for the oob. If neither case 1 nor case 2 applies,
we abort. 

I hope that all can agree that this is a good starting point. 

Regards
  Christian



[OMPI devel] MPI::File::Create_errhandler() missing?

2006-10-02 Thread Christian Kauhaus
Hello!

After doing some application coding using the C++ bindings, I tried to
create a custom MPI::File error handler but failed: 

| mpiiowriter.cc: In member function `virtual void MPIIOWriter::initialized()':
| mpiiowriter.cc:29: error: `Create_errhandler' is not a member of `MPI::File'
| mpiiowriter.cc:29: warning: unused variable 'errhandler'

After some grepping through the sources, it seems that
MPI::File::Create_errhandler() is not implemented, although the MPI-2.0
standard requires it for MPI-IO. Should I file a ticket?

Christian



Re: [OMPI devel] ORTE scalability issues

2007-04-17 Thread Christian Kauhaus
Ralph H Castain <r...@lanl.gov>:
>even though the HNP isn't actually part of the MPI job itself, or the
>processes are opening duplicate OOB sockets back to the HNP. I am not
>certain which (or either) of these is the root cause, however - it needs
>further investigation to identify the source of the extra sockets.

If you are using the IPv6-ready code: in that case we need to create two
sockets for each OOB/TCP connection, one using AF_INET and one using
AF_INET6. IIRC, we close the superfluous socket as soon as the
connection attempt on either one succeeds. Adrian, correct me if I'm
wrong. :-) Unfortunately, there's no easy way around this.

Christian
