[OMPI devel] IPv6 support in OpenMPI?
Hello *,

first I'd like to introduce myself. I'm Christian Kauhaus, and I am currently working at the Department of Computer Architecture at the University of Jena (Germany). Our work group is investigating how to connect several clusters on a campus. As part of our research, we would like to evaluate the use of IPv6 for multi-cluster coupling, and for that we need to run OpenMPI over TCP/IPv6.

Last year during EuroPVM/MPI I had a short chat with Jeff Squyres about this, and now we actually have the time to work on it. First, we are interested in integrating IPv6 support into the tcp btl. Does anyone know whether someone is already working on this? If so, we would be glad to cooperate. If not, we would start on it ourselves, although we would need some help from the OpenMPI developer community regarding OpenMPI/ORTE internals.

I would really appreciate any pointers, hints, or contacts.

TIA

Christian
--
Dipl.-Inf. Christian Kauhaus <><
Lehrstuhl fuer Rechnerarchitektur und -kommunikation
Institut fuer Informatik * Ernst-Abbe-Platz 1-2 * D-07743 Jena
Tel: +49 3641 9 46376 * Fax: +49 3641 9 46372 * Raum 3217
Re: [OMPI devel] IPv6 support in OpenMPI?
Bogdan Costescu <bogdan.coste...@iwr.uni-heidelberg.de>:
>- are all computers that should participate in a job configured
>similarly (only IPv6 or both IPv4 and IPv6)? If not all are, then
>should some part of the computers communicate over one protocol and
>the rest over the other? I think that this split communication would

This should really be possible. If we get the connection handling code right, the Internet Protocol version should not matter; many other daemons are written exactly this way. The basic algorithm looks like this:

  /* retrieve the list of addresses bound to the given target host */
  getaddrinfo(..., &addr_list);
  for (addr_res in addr_list) {
      /* initialize a socket of the correct address family */
      fd = socket(addr_res->ai_family, ...);
      if (try_to_connect(fd))
          break;
  }

So the resolver already does the complicated work for us, since it returns all addresses associated with a given target (hostname or IP address notation) in order of decreasing preference.

>- a related point is whether the 2 protocols should really be regarded
>as 2 different communication channels. OpenMPI is able to use several
>communication channels between 2 processes/MPI ranks at the same time,
>so should the same physical interface be split between the 2 logical
>protocols for communication between the same pair of computers?

This one is somewhat complicated. From OMPI's point of view, there are several interfaces on a host, and each interface has access to some fraction of the total bandwidth. Now we also have two different protocols on each interface. Possible scenarios:

- We add the IP version to the OMPI interface name. Instead of eth0 and eth1 we would get eth0, eth0.v6, eth1, and eth1.v6. With this approach one could quite easily state one's preferences using the btl command line arguments. Of course, the latency/bandwidth code would need to be reworked, since all traffic on an IPv6 interface would take available bandwidth away from the corresponding IPv4 interface.
- We do not add the IP version to the interface name, but perform protocol selection automatically based on the resolver results. In this case the modification to the interface selection algorithm would probably be a minor one. Drawback: we cannot control the IP version beyond the resolver configuration, which is probably out of reach for the user. Since IPv6 imposes a slightly higher protocol overhead, users might want to use IPv4 on the local network, but could not do anything if the automatic selection gets it wrong.

- We introduce another parameter which allows IP version selection both globally and on a per-interface basis. Something like: IPv4-only / prefer IPv4 / auto (resolver) / prefer IPv6 / IPv6-only.

The third approach would possibly be the cleanest one.

>of the computers. For example, if the remote computer has IPv6
>configured but the sshd is restricted to bind to IPv4, then a ssh
>connection to this computer using the IPv6 address (which would be
>specified in the hostfile) will fail, while OpenMPI processes [...]

In my experience, this is not a problem. We currently have some IPv6 test networks running, and one of our clusters runs IPv6 on its internal Ethernet. Hosts which are generally not IPv6-ready get no IPv6 address in the DNS/hosts file; this prevents any contact over IPv6, since their address is simply not known. Hosts which have some IPv6 support get a double entry in the DNS or hosts file. Since it is standard behaviour for every IPv6 application to try all known addresses for the target host until one succeeds, we are also able to connect to an IPv6-enabled host where the target daemon does not listen on an IPv6 interface. For example, we ran for several weeks without an IPv6-enabled rsh (which is used to handle MPI job startup on the cluster) without any problems.

>IMHO, some discussion of them should occur before the actual coding...

I agree. So here we go :-)

Christian
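The resolver-driven connect loop sketched earlier in this message can be fleshed out into a compilable form. This is a minimal sketch assuming plain POSIX sockets; `connect_any` is an illustrative helper name, not anything from the OMPI tree:

```c
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/* Try every address getaddrinfo() returns for host:port, in the
 * preference order chosen by the resolver, until one connects.
 * Returns a connected fd, or -1 if all attempts fail. */
static int connect_any(const char *host, const char *port)
{
    struct addrinfo hints, *addr_list, *ai;
    int fd = -1;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;      /* accept both IPv4 and IPv6 */
    hints.ai_socktype = SOCK_STREAM;

    if (getaddrinfo(host, port, &hints, &addr_list) != 0)
        return -1;

    for (ai = addr_list; ai != NULL; ai = ai->ai_next) {
        /* a socket of the correct address family for this candidate */
        fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0)
            continue;
        if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0)
            break;                    /* success: keep this fd */
        close(fd);
        fd = -1;
    }
    freeaddrinfo(addr_list);
    return fd;
}
```

The point is exactly the one made above: the caller never mentions an address family, so the same code path serves IPv4-only, IPv6-only, and dual-stack peers.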
Re: [OMPI devel] Building ompi occasionally touches the source files
Adrian Knoth <a...@drcomp.erfurt.thur.de>:
>b) fails to complete (see attachment), the errors are all
> related to lex.

What are the flex versions used on these systems? On Debian stable it is flex 2.5.31 and on my Gentoo box it is flex 2.5.33; both give correct builds. I'm using the same VPATH setup as Adrian, and during the build process opal/util/show_help_lex.c is *neither* touched nor modified. It is just compiled to build/ARCH/opal/util/show_help_lex.lo, as it is supposed to be.

-Christian
Re: [OMPI devel] [IPv6] new component oob/tcp6
Bogdan Costescu <bogdan.coste...@iwr.uni-heidelberg.de>:
>I don't know why you think that this (talking to different nodes via
>different channels) is unusual - I think that it's quite probable,
>especially in a heterogenous environment.

I think the first goal should be to get IPv6 working at all -- and this is much easier if we restrict ourselves to the case where all systems participating in one(!) job are reachable via a single protocol version, either IPv4 or IPv6. I'm not quite sure we need to run a *single* job across a network containing both systems that are not reachable via IPv4 and systems that are not reachable via IPv6. If there is a practical need for this, we will probably tackle it in the future. Note that the current plan does not restrict the use of OpenMPI in heterogeneous IPv4/IPv6 environments; we just will not support mixed IPv4/IPv6 operation within a single job right now.

Our current plan is to look into the hostfile and see if there are
(1a) just IPv4 addresses,
(1b) IPv4 addresses and hostnames for which 'A' queries can be resolved,
(2a) just IPv6 addresses, or
(2b) IPv6 addresses and hostnames for which 'AAAA' queries can be resolved.

In case 1 we initially use an IPv4 transport for the oob, and in case 2 an IPv6 transport. If neither case 1 nor case 2 applies, we abort. I hope all can agree that this is a good starting point.

Regards

Christian
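The hostfile inspection described above first has to distinguish IPv4 literals, IPv6 literals, and plain hostnames (which then still need an 'A' or 'AAAA' lookup). A minimal sketch of that first step, assuming POSIX inet_pton(); `classify` is an illustrative helper, not the actual OMPI code:

```c
#include <arpa/inet.h>
#include <netinet/in.h>

enum addr_kind { KIND_IPV4, KIND_IPV6, KIND_HOSTNAME };

/* Classify one hostfile entry: an IPv4 literal, an IPv6 literal,
 * or a hostname whose protocol version must be decided later by
 * the record type ('A' vs 'AAAA') the resolver returns for it. */
static enum addr_kind classify(const char *entry)
{
    struct in_addr  v4;
    struct in6_addr v6;

    if (inet_pton(AF_INET, entry, &v4) == 1)
        return KIND_IPV4;
    if (inet_pton(AF_INET6, entry, &v6) == 1)
        return KIND_IPV6;
    return KIND_HOSTNAME;
}
```

With every entry classified this way, deciding between case 1 and case 2 (or aborting) is a simple scan over the results.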
[OMPI devel] MPI::File::Create_errhandler() missing?
Hello!

After doing some application coding using the C++ bindings, I tried to create a custom MPI::File errorhandler but failed:

| mpiiowriter.cc: In member function `virtual void MPIIOWriter::initialized()':
| mpiiowriter.cc:29: error: `Create_errhandler' is not a member of `MPI::File'
| mpiiowriter.cc:29: warning: unused variable 'errhandler'

After some source grepping, it seems that MPI::File::Create_errhandler() is not implemented, although the MPI-2.0 standard requires it for MPI-IO. Should I file a ticket?

Christian
Re: [OMPI devel] ORTE scalability issues
Ralph H Castain <r...@lanl.gov>:
>even though the HNP isn't actually part of the MPI job itself, or the
>processes are opening duplicate OOB sockets back to the HNP. I am not
>certain which (or either) of these is the root cause, however - it needs
>further investigation to identify the source of the extra sockets.

If you are using the IPv6-ready code: in this case we need to create two sockets for each OOB/TCP connection, one using AF_INET and one using AF_INET6. IIRC, we close the superfluous socket once the connection attempt on either one succeeds. Adrian, correct me if I'm wrong. :-)

Unfortunately, there is no easy way around this.

Christian
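The two-socket scheme described above can be illustrated in isolation: one AF_INET and one AF_INET6 socket are opened per connection attempt, and once either attempt succeeds the loser is closed. A minimal sketch assuming POSIX sockets; `open_pair` and `keep_winner` are illustrative names, not the actual oob/tcp code:

```c
#include <unistd.h>
#include <sys/socket.h>

/* Open one candidate socket per address family for a single
 * OOB/TCP connection attempt. Returns 0 on success, -1 if either
 * family is unavailable on this host. */
static int open_pair(int fds[2])
{
    fds[0] = socket(AF_INET,  SOCK_STREAM, 0);
    fds[1] = socket(AF_INET6, SOCK_STREAM, 0);
    return (fds[0] >= 0 && fds[1] >= 0) ? 0 : -1;
}

/* Once the attempt on `winner` (index 0 or 1) succeeds, close the
 * superfluous socket of the other family. */
static void keep_winner(int fds[2], int winner)
{
    close(fds[1 - winner]);
    fds[1 - winner] = -1;
}
```

This also shows why the extra descriptors are unavoidable: between open_pair() and keep_winner() each pending connection briefly holds two fds, which is what an fd-count snapshot taken during startup would observe.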