On Fri, 31 Mar 2006, Brian Barrett wrote:
> Are your hosts configured for both IPv4 and IPv6 traffic (or are they IPv6 only)?
This is a big question and one that basically stopped me from adding IPv6 support to LAM/MPI some 3 years ago. There are several things that have to be considered:
- are all computers that should participate in a job configured similarly (only IPv6, or both IPv4 and IPv6)? If not, should some of the computers communicate over one protocol and the rest over the other? I think that this split communication would be easier now with OpenMPI than it was with LAM/MPI (which supported pretty much only one communication channel at a time), but it means that the routing decisions might get tougher.
- a related point is whether the two protocols should really be regarded as two different communication channels. OpenMPI is able to use several communication channels between two processes/MPI ranks at the same time, so should the same physical interface be split between the two logical protocols for communication between the same pair of computers?
- related to the previous point, if both IPv4 and IPv6 are available between two computers, which one should be used?
- the IP addresses are also used for starting up the daemons on remote computers. I can't remember all the details clearly, but I think that the OpenSSH sshd would discover the IPv6 interface and would only allow that one for incoming connections (and would use the IPv6 address as part of the authentication process). This leads to stronger ties between the addresses specified in the hostfile and the configuration of the computers. For example, if the remote computer has IPv6 configured but the sshd is restricted to bind to IPv4, then an ssh connection to this computer using the IPv6 address (which would be specified in the hostfile) will fail, while OpenMPI processes (daemons and children) would not have any problem in using the IPv6 address. So both the IPv4 and the IPv6 address of the same computer might need to be known, maybe via DNS or maybe via some user-provided mapping.
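To make the last two points a bit more concrete, here is a rough Python sketch of what a user-provided mapping and a selection policy could look like. Nothing in it is taken from LAM/MPI or OpenMPI: the names HOST_ADDRESSES, SSH_FAMILIES, launch_address and mpi_address are made up for illustration, the addresses come from the reserved documentation ranges, and "prefer IPv6 when both ends have it" is only one possible answer to the "which one should be used" question.

```python
import ipaddress

def group_by_family(addresses):
    """Split textual IP addresses into {4: [...], 6: [...]}."""
    families = {4: [], 6: []}
    for text in addresses:
        addr = ipaddress.ip_address(text)  # raises ValueError on bad input
        families[addr.version].append(str(addr))
    return families

def choose_address(addresses, allowed=(6, 4)):
    """Return the first address whose family is in `allowed`, tried in order."""
    families = group_by_family(addresses)
    for version in allowed:
        if families[version]:
            return families[version][0]
    return None

# Hypothetical user-provided mapping: every address known for each host.
HOST_ADDRESSES = {
    "node01": ["192.0.2.10", "2001:db8::10"],
    "node02": ["192.0.2.11"],
}

# Hypothetical per-host restriction: sshd on node01 listens on IPv4 only.
SSH_FAMILIES = {"node01": (4,)}

def launch_address(host):
    """Address to use for starting the daemon on `host` via ssh."""
    allowed = SSH_FAMILIES.get(host, (6, 4))
    return choose_address(HOST_ADDRESSES[host], allowed)

def mpi_address(host, peer):
    """Address of `peer` that `host` can reach, preferring IPv6 (assumed policy)."""
    mine = group_by_family(HOST_ADDRESSES[host])
    theirs = group_by_family(HOST_ADDRESSES[peer])
    for version in (6, 4):
        if mine[version] and theirs[version]:
            return theirs[version][0]
    return None  # no address family in common
```

With this mapping, launch_address("node01") picks the IPv4 address because of the assumed sshd restriction, and mpi_address("node02", "node01") falls back to IPv4 because node02 has no IPv6 address at all. A real implementation would also have to decide what happens when no common family exists.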
That's all that I remember now from my IPv6 endeavour with LAM/MPI. IMHO, these points should be discussed before the actual coding starts...
-- 
Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: bogdan.coste...@iwr.uni-heidelberg.de