Re: [OMPI devel] [IPv6] new component oob/tcp6
On Thursday 07 September 2006 18:42, George Bosilca wrote:

> I still wonder why we need any configuration "magic". We don't want to be the only one around supporting IPv4 OR IPv6. Supporting both of them simultaneously can be interesting, and it does not require huge changes. In fact, we have a problem only at the connection step, everything else will be identical.
>
> In fact, as we're talking about the TCP layer, we might want to finish the discussion we had a while ago, about merging the OOB and the BTL in one component. They do have very similar functions, and right now we have to maintain 2 components. I think it's more than time to do the merge, and move the resulting component or whatever down in the OPAL layer.
>
> I even volunteer for that. Next week I will be away, so I will come back with a design for the phone conference on ... well beginning of October.

That sounds like the most reasonable solution to me.

At the moment the TCP BTL would have a problem when an Open MPI job is spawned across multiple cells and at least two of those cells use the same private IP address range. In this scenario a process in one cell could think that a process in the other cell is reachable. That's not really an IPv6-specific problem, but if we are thinking about moving the BTL down to the OPAL layer, we should take care of it. I'm not sure whether other BTLs have similar problems (e.g., two InfiniBand cells connected via TCP).

Thanks,
Sven

> george.
>
> On Sep 7, 2006, at 12:22 PM, Ralph H Castain wrote:
>
> > Jeff and I talked about this for awhile this morning, and we both agree (yes, I did change my mind after we discussed all the ramifications). It appears that we should be able to consolidate the code into a single component with the right configuration system "magic" - and that would definitely be preferable.
> >
> > My primary concern originally was with the lack of knowledge and documentation on the configuration system. I know that I don't know enough about that system to make everything work in a single component. The component method would have allowed you to remain ignorant of that system. However, with Jeff's willingness to help in that regard, the approach he recommends would be easier for everyone.
> >
> > Hope that doesn't cause too much of a problem.
> > Ralph
> >
> > On 9/7/06 9:46 AM, "Jeff Squyres" wrote:
> >
> >> On 9/1/06 12:21 PM, "Adrian Knoth" wrote:
> >>
> >>> On Fri, Sep 01, 2006 at 07:01:25AM -0600, Ralph Castain wrote:
> >>>
> >>>>> Do you agree to go on with two oob components, tcp and tcp6?
> >>>> Yes, I think that's the right approach
> >>>
> >>> It's a deal. ;)
> >>
> >> Actually, I would disagree here (sorry for jumping in late! :-( ).
> >>
> >> Given the amount of code duplication, it seems like a big shame to make a separate component that is almost identical.
> >>
> >> Can we just have one component that handles both ipv4 and ipv6? Appropriate #if's can be added (I'm willing to help with the configure.m4 mojo -- the stuff to tell OMPI whether ipv4 and/or ipv6 stuff can be found and to set the #define's appropriately).
> >>
> >> More specifically -- I can help with component / configure / build system issues. I'll defer on the whole how-to-wire-them-up issue for the moment (I've got some other fires burning that must be tended to :-\ ).
> >>
> >> My $0.02: OOB is the first target to get working -- once you can orterun non-MPI apps properly across ipv6 and/or ipv4 nodes, then move on to the MPI layer and take the same approach there (e.g., one TCP btl with configure.m4 mojo, etc.).
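Sven's concern about cells that share a private IPv4 range can be illustrated with a small sketch. It only shows why an address match alone is not proof of reachability; the notion of a numeric "cell id" and the function `plausibly_reachable` are assumptions made for this example and are not part of the TCP BTL.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>

/* True if `text` is an IPv4 literal inside one of the RFC 1918 private
 * ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16). */
static bool is_rfc1918(const char *text)
{
    assert(text != NULL);

    struct in_addr addr;
    if (inet_pton(AF_INET, text, &addr) != 1) {
        return false;                             /* not an IPv4 literal */
    }
    uint32_t ip = ntohl(addr.s_addr);
    return (ip & 0xFF000000u) == 0x0A000000u      /* 10.0.0.0/8     */
        || (ip & 0xFFF00000u) == 0xAC100000u      /* 172.16.0.0/12  */
        || (ip & 0xFFFF0000u) == 0xC0A80000u;     /* 192.168.0.0/16 */
}

/* Hypothetical policy: a private address is only trusted when both
 * processes live in the same cell. Cell ids are an assumption of this
 * sketch; the thread does not say how cells would be identified. */
static bool plausibly_reachable(const char *peer_ip, int my_cell, int peer_cell)
{
    if (is_rfc1918(peer_ip) && my_cell != peer_cell) {
        return false;   /* same range, different cell: do not assume a route */
    }
    return true;
}
```

With this policy, `plausibly_reachable("192.168.0.5", 1, 2)` is false even though the local netmask might match, which is the situation Sven describes.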
Re: [OMPI devel] [IPv6] new component oob/tcp6
On 9/7/06 6:15 PM, "Jeff Squyres" wrote:

> All you have to do to get this define is #include "ompi_config.h", which all of the files should be doing already, anyway.

Oops! Ralph pointed out to me that this is not quite correct. In the OOB TCP component, you should *NOT* include "ompi_config.h" -- you should include "orte_config.h" (and it should already be included, anyway).

Short version:
--------------

Both orte_config.h and ompi_config.h should have the appropriate #define's in place. I goofed in my original patch; see the new version (attached), where the macro has been renamed to OPAL_WANT_IPV6 (vs. OMPI_WANT_IPV6). The new patch wholly replaces the prior patch.

Longer version:
---------------

The stack has 3 layers:

- opal: Open Portable Access Layer
- orte: Open Run Time Environment
- ompi: Open Message Passing Interface

These are strictly layered on each other, so ORTE, for example, has zero knowledge of OMPI. We used to have one big tree and only informal layering of these 3 sections of code, but now they're actually 3 different trees. Hence, the code division is strict and absolute (e.g., by default, we make 3 libraries: libopal, liborte, libompi). Abstraction violations are swiftly punished by the linker.

However, there are still a bunch of top-level names that are "OMPI" (mainly stemming from configure), even though they're intended for OPAL and/or ORTE. But that's no reason to continue the bad names. The macro that I added yesterday was OMPI_WANT_IPV6, but I really should have named it OPAL_WANT_IPV6 (since OPAL is where most of the portability machinery is supposed to go). I've amended my previous patch -- see attached.

So surround your code with:

    #if OPAL_WANT_IPV6
    ...
    #endif

I can't commit this to the trunk until tonight -- we have an informal policy in the project to not make changes to configure and friends during the Euro/US work day so as not to force developers to re-run autogen.sh during the day.
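As a rough illustration of how the always-defined macro is meant to be used, here is a minimal sketch. The helper `fill_wildcard` is hypothetical, and the fallback `#define` exists only so the fragment compiles on its own; in the tree the value would come from the generated config header ("orte_config.h" for the OOB TCP component, as described above).

```c
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/* Assumption for this sketch only: normally set to 0 or 1 by configure. */
#ifndef OPAL_WANT_IPV6
#define OPAL_WANT_IPV6 1
#endif

/* Hypothetical helper: fill `out` with the wildcard address of `family`.
 * Returns 0 on success, -1 if the family is not supported in this build. */
static int fill_wildcard(int family, struct sockaddr_storage *out, socklen_t *len)
{
    assert(out != NULL && len != NULL);
    memset(out, 0, sizeof(*out));

    switch (family) {
    case AF_INET: {
        struct sockaddr_in *in4 = (struct sockaddr_in *)out;
        in4->sin_family = AF_INET;
        in4->sin_addr.s_addr = htonl(INADDR_ANY);
        *len = sizeof(*in4);
        return 0;
    }
#if OPAL_WANT_IPV6              /* #if, not #ifdef: the macro is always defined */
    case AF_INET6: {
        struct sockaddr_in6 *in6 = (struct sockaddr_in6 *)out;
        in6->sin6_family = AF_INET6;
        in6->sin6_addr = in6addr_any;
        *len = sizeof(*in6);
        return 0;
    }
#endif
    default:
        return -1;
    }
}
```

Because the macro is 0 rather than undefined when IPv6 is unavailable, a build without IPv6 simply falls through to the `default` branch for AF_INET6.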
--
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems

Attachment: ipv6-2.patch (binary data)
Re: [OMPI devel] [IPv6] new component oob/tcp6
It occurred to me last night that this solves the homogeneous case, but still leaves us with the problem of hetero systems. What we really need to know is not only "what do I support", but "what does the recipient support".

Then it hit me that we may already have the solution for that problem in the OOB, though we don't use it currently. If you check the OOB code, you will find that we store the OOB contact info on the registry during startup, and in return we obtain ALL of the OOB contact info for our peers. In that code, we allow for multiple contact points to be passed for each peer process - including what protocol is to be used for each contact point.

In other words, if we have an IPv6 socket, that information gets passed to our peers (including the fact that it is an IPv6 address). Ditto if we have an IPv4 socket. And we are covered even if we have both types.

What is missing in the code is the selection of which contact point to use to communicate a given message, and the decision logic that uses the "right" addressing protocol as specified for that recipient (current code assumes only one is given).

So I think we can actually build a lot of the hetero support into the existing OOB component. We just may need to add a little to take full advantage of what is already there. For example, on a send, we may just need to use the proper call that matches the specified protocol. The "if" statement approach should be adequate for that level of separation.

Ralph

On 9/7/06 4:15 PM, "Jeff Squyres" wrote:

> On 9/7/06 1:51 PM, "Adrian Knoth" wrote:
>
>>> (I'm willing to help with the configure.m4 mojo -- the
>>
>> That's good. Just check for struct sockaddr_in6 and add -DIPV6 to the CFLAGS. This flag is currently needed by opal/util/if.* and orte/mca/oob/tcp/*, so one might limit it to the two corresponding makefiles.
>>
>> We can also set/define IPV6 in something_config.h.
>> It'd also be a good idea to have a --disable-ipv6 configure flag.
>
> Done. See the attached patch (apply it, then re-run autogen.sh and configure). It does three things:
>
> 1. Check if --disable-ipv6 was passed to configure.
> 2. Check to see if struct sockaddr_in6 exists.
> 3. Sets a macro OMPI_WANT_IPV6 to either 0 or 1 (i.e., it's always defined and is therefore suitable for #if, not #ifdef):
>    - Set to 1 if --disable-ipv6 was not passed to configure *AND* struct sockaddr_in6 exists
>    - Set to 0 otherwise
>
> So surround your code with:
>
> #if OMPI_WANT_IPV6
> ...ipv6 stuff...
> #endif
>
> All you have to do to get this define is #include "ompi_config.h", which all of the files should be doing already, anyway.
>
> Let me know if this works for you.
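The missing selection step Ralph describes, choosing one of a peer's several contact points according to what both sides support, might look roughly like the sketch below. The URI schemes, the capability flags, and the "prefer IPv6 when available" policy are all assumptions for illustration; the actual OOB contact-info format is not shown in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flags describing what the local process can speak. */
enum { LOCAL_HAS_IPV4 = 1 << 0, LOCAL_HAS_IPV6 = 1 << 1 };

static bool has_prefix(const char *s, const char *prefix)
{
    return strncmp(s, prefix, strlen(prefix)) == 0;
}

/* Pick a contact point for one peer from the list it published.
 * An IPv6 contact wins if we support IPv6; otherwise the first IPv4
 * contact is used. Returns NULL if nothing matches. */
static const char *select_contact(const char *const *uris, size_t n, int local_caps)
{
    assert(uris != NULL || n == 0);

    const char *first_v4 = NULL;
    for (size_t i = 0; i < n; i++) {
        if ((local_caps & LOCAL_HAS_IPV6) && has_prefix(uris[i], "tcp6://")) {
            return uris[i];
        }
        if (first_v4 == NULL && (local_caps & LOCAL_HAS_IPV4)
            && has_prefix(uris[i], "tcp://")) {
            first_v4 = uris[i];
        }
    }
    return first_v4;
}
```

The point of the sketch is only that the decision is made per recipient, so a head node can use IPv6 for one peer and IPv4 for another within the same job.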
Re: [OMPI devel] [IPv6] new component oob/tcp6
On Thu, Sep 07, 2006 at 07:51:28PM +0200, Adrian Knoth wrote:

> No problem, just two hours ago, Christian and I decided to drop the idea of oob/tcp6 and go on with only one oob-tcp-component. It shouldn't be that hard and I'll try it tonight or tomorrow.

Looks quite promising:

    adi@ipc654:~/ompi/trunk/test$ (orterun -np 2 -host amun,ipc654 netstat -tpln) 2> /dev/null | grep orte
    tcp     0   0 0.0.0.0:44012   0.0.0.0:*   LISTEN   1332/orted
    tcp     0   0 0.0.0.0:42706   0.0.0.0:*   LISTEN   1329/orterun
    tcp     0   0 0.0.0.0:36376   0.0.0.0:*   LISTEN   27961/orted
    tcp6    0   0 :::56783        :::*        LISTEN   27961/orted
    tcp6    0   0 :::34615        :::*        LISTEN   1329/orterun
    tcp6    0   0 :::39837        :::*        LISTEN   1332/orted

This is one component with two listening sockets.

The main work isn't done yet: mca_oob_tcp_peer_start_connect. I've extended it a little bit:

    static int mca_oob_tcp_peer_start_connect(mca_oob_tcp_peer_t* peer, uint16_t af_family);

where af_family is one of {AF_INET, AF_INET6}. I start with AF_INET, and within mca_oob_tcp_peer_start_connect I call the function again with AF_INET6 (one level of recursion) to try the other address family.

This approach (coded last week when I still had a single component) is bad (long timeouts before trying AF_INET6) and probably wrong: for the accepting sockets, I've added

    opal_event_t tcp6_send_event;
    opal_event_t tcp6_recv_event;

and perhaps something like this is necessary for peers, too (I don't know yet; I'll have a look at it tomorrow).

So long

--
mail: a...@thur.de  http://adi.thur.de  PGP: v2-key via keyserver

Women either understand nothing at all or get everything wrong
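One common alternative to the recursive retry is to let getaddrinfo() return candidates for every address family and walk that list in a loop. The sketch below shows only that iteration; `connect_any` is a hypothetical name, and the blocking connect() used here does not fix the timeout problem Adrian mentions, since the OOB code is event-driven and would need non-blocking connects.

```c
#include <assert.h>
#include <netdb.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Try every address getaddrinfo() returns for host:port, regardless of
 * family, and return the first connected socket, or -1 if none worked. */
static int connect_any(const char *host, const char *port)
{
    assert(host != NULL && port != NULL);

    struct addrinfo hints;
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;       /* both AF_INET and AF_INET6 */
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_NUMERICSERV;   /* port must be a number */

    struct addrinfo *res = NULL;
    if (getaddrinfo(host, port, &hints, &res) != 0) {
        return -1;
    }

    int fd = -1;
    for (struct addrinfo *ai = res; ai != NULL; ai = ai->ai_next) {
        fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0) {
            continue;                  /* family not usable on this host */
        }
        if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0) {
            break;                     /* connected; keep this descriptor */
        }
        close(fd);
        fd = -1;
    }
    freeaddrinfo(res);
    return fd;
}
```

A caller would use it as `int fd = connect_any("hostA", "5000");` and check for -1. The order of candidates is decided by the resolver, so no address family is hard-coded as the first attempt.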
Re: [OMPI devel] [IPv6] new component oob/tcp6
On 9/7/06 1:51 PM, "Adrian Knoth" wrote:

>> (I'm willing to help with the configure.m4 mojo -- the
>
> That's good. Just check for struct sockaddr_in6 and add -DIPV6 to the CFLAGS. This flag is currently needed by opal/util/if.* and orte/mca/oob/tcp/*, so one might limit it to the two corresponding makefiles.
>
> We can also set/define IPV6 in something_config.h.
> It'd also be a good idea to have a --disable-ipv6 configure flag.

Done. See the attached patch (apply it, then re-run autogen.sh and configure). It does three things:

1. Check if --disable-ipv6 was passed to configure.
2. Check to see if struct sockaddr_in6 exists.
3. Sets a macro OMPI_WANT_IPV6 to either 0 or 1 (i.e., it's always defined and is therefore suitable for #if, not #ifdef):
   - Set to 1 if --disable-ipv6 was not passed to configure *AND* struct sockaddr_in6 exists
   - Set to 0 otherwise

So surround your code with:

    #if OMPI_WANT_IPV6
    ...ipv6 stuff...
    #endif

All you have to do to get this define is #include "ompi_config.h", which all of the files should be doing already, anyway.

Let me know if this works for you.

--
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems

Attachment: ipv6.patch (binary data)
Re: [OMPI devel] [IPv6] new component oob/tcp6
On Thu, Sep 07, 2006 at 11:46:28AM -0400, Jeff Squyres wrote:

> > On Fri, Sep 01, 2006 at 07:01:25AM -0600, Ralph Castain wrote:
> >
> >>> Do you agree to go on with two oob components, tcp and tcp6?
> >> Yes, I think that's the right approach
> >
> > It's a deal. ;)
> Actually, I would disagree here (sorry for jumping in late! :-( ).

No problem, just two hours ago, Christian and I decided to drop the idea of oob/tcp6 and go on with only one oob-tcp-component. It shouldn't be that hard and I'll try it tonight or tomorrow.

> Can we just have one component that handles both ipv4 and ipv6?

Yes. At least that's what I'm trying to code ;)

> Appropriate #if's can be added

They are already present.

> (I'm willing to help with the configure.m4 mojo -- the

That's good. Just check for struct sockaddr_in6 and add -DIPV6 to the CFLAGS. This flag is currently needed by opal/util/if.* and orte/mca/oob/tcp/*, so one might limit it to the two corresponding makefiles.

We can also set/define IPV6 in something_config.h.
It'd also be a good idea to have a --disable-ipv6 configure flag.

--
mail: a...@thur.de  http://adi.thur.de  PGP: v2-key via keyserver

The nose is the little man's oil rig
Re: [OMPI devel] [IPv6] new component oob/tcp6
On 9/7/06 12:42 PM, "George Bosilca" wrote:

> I still wonder why we need any configuration "magic". We don't want to be the only one around supporting IPv4 OR IPv6. Supporting both of them simultaneously can be interesting, and it does not require huge changes. In fact, we have a problem only at the connection step, everything else will be identical.

The only configuration magic I'm talking about is adding relevant tests into configure.m4 to test for the presence of IPv6 types/functions. If they're not there, then we need to #if out the relevant code in the components.

> In fact, as we're talking about the TCP layer, we might want to finish the discussion we had a while ago, about merging the OOB and the BTL in one component. They do have very similar functions, and right now we have to maintain 2 components. I think it's more than time to do the merge, and move the resulting component or whatever down in the OPAL layer.
>
> I even volunteer for that. Next week I will be away, so I will come back with a design for the phone conference on ... well beginning of October.

Sounds good to me -- I'd be interested to see a design for such a beast. There are a lot of implications, but we can talk them over when you show the design. :-)

--
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems
Re: [OMPI devel] [IPv6] new component oob/tcp6
That would be great with me! And much appreciated. A design would really help.

On 9/7/06 10:42 AM, "George Bosilca" wrote:

> I still wonder why we need any configuration "magic". We don't want to be the only one around supporting IPv4 OR IPv6. Supporting both of them simultaneously can be interesting, and it does not require huge changes. In fact, we have a problem only at the connection step, everything else will be identical.
>
> In fact, as we're talking about the TCP layer, we might want to finish the discussion we had a while ago, about merging the OOB and the BTL in one component. They do have very similar functions, and right now we have to maintain 2 components. I think it's more than time to do the merge, and move the resulting component or whatever down in the OPAL layer.
>
> I even volunteer for that. Next week I will be away, so I will come back with a design for the phone conference on ... well beginning of October.
>
> george.
>
> On Sep 7, 2006, at 12:22 PM, Ralph H Castain wrote:
>
>> Jeff and I talked about this for awhile this morning, and we both agree (yes, I did change my mind after we discussed all the ramifications). It appears that we should be able to consolidate the code into a single component with the right configuration system "magic" - and that would definitely be preferable.
>>
>> My primary concern originally was with the lack of knowledge and documentation on the configuration system. I know that I don't know enough about that system to make everything work in a single component. The component method would have allowed you to remain ignorant of that system. However, with Jeff's willingness to help in that regard, the approach he recommends would be easier for everyone.
>>
>> Hope that doesn't cause too much of a problem.
>> Ralph
>>
>> On 9/7/06 9:46 AM, "Jeff Squyres" wrote:
>>
>>> On 9/1/06 12:21 PM, "Adrian Knoth" wrote:
>>>
>>>> On Fri, Sep 01, 2006 at 07:01:25AM -0600, Ralph Castain wrote:
>>>>
>>>>>> Do you agree to go on with two oob components, tcp and tcp6?
>>>>> Yes, I think that's the right approach
>>>>
>>>> It's a deal. ;)
>>>
>>> Actually, I would disagree here (sorry for jumping in late! :-( ).
>>>
>>> Given the amount of code duplication, it seems like a big shame to make a separate component that is almost identical.
>>>
>>> Can we just have one component that handles both ipv4 and ipv6? Appropriate #if's can be added (I'm willing to help with the configure.m4 mojo -- the stuff to tell OMPI whether ipv4 and/or ipv6 stuff can be found and to set the #define's appropriately).
>>>
>>> More specifically -- I can help with component / configure / build system issues. I'll defer on the whole how-to-wire-them-up issue for the moment (I've got some other fires burning that must be tended to :-\ ).
>>>
>>> My $0.02: OOB is the first target to get working -- once you can orterun non-MPI apps properly across ipv6 and/or ipv4 nodes, then move on to the MPI layer and take the same approach there (e.g., one TCP btl with configure.m4 mojo, etc.).
>
> "Half of what I say is meaningless; but I say it so that the other half may reach you"
> Kahlil Gibran
Re: [OMPI devel] [IPv6] new component oob/tcp6
I still wonder why we need any configuration "magic". We don't want to be the only one around supporting IPv4 OR IPv6. Supporting both of them simultaneously can be interesting, and it does not require huge changes. In fact, we have a problem only at the connection step, everything else will be identical.

In fact, as we're talking about the TCP layer, we might want to finish the discussion we had a while ago, about merging the OOB and the BTL in one component. They do have very similar functions, and right now we have to maintain 2 components. I think it's more than time to do the merge, and move the resulting component or whatever down in the OPAL layer.

I even volunteer for that. Next week I will be away, so I will come back with a design for the phone conference on ... well beginning of October.

george.

On Sep 7, 2006, at 12:22 PM, Ralph H Castain wrote:

> Jeff and I talked about this for awhile this morning, and we both agree (yes, I did change my mind after we discussed all the ramifications). It appears that we should be able to consolidate the code into a single component with the right configuration system "magic" - and that would definitely be preferable.
>
> My primary concern originally was with the lack of knowledge and documentation on the configuration system. I know that I don't know enough about that system to make everything work in a single component. The component method would have allowed you to remain ignorant of that system. However, with Jeff's willingness to help in that regard, the approach he recommends would be easier for everyone.
>
> Hope that doesn't cause too much of a problem.
> Ralph
>
> On 9/7/06 9:46 AM, "Jeff Squyres" wrote:
>
>> On 9/1/06 12:21 PM, "Adrian Knoth" wrote:
>>
>>> On Fri, Sep 01, 2006 at 07:01:25AM -0600, Ralph Castain wrote:
>>>
>>>>> Do you agree to go on with two oob components, tcp and tcp6?
>>>> Yes, I think that's the right approach
>>>
>>> It's a deal. ;)
>>
>> Actually, I would disagree here (sorry for jumping in late! :-( ).
>>
>> Given the amount of code duplication, it seems like a big shame to make a separate component that is almost identical.
>>
>> Can we just have one component that handles both ipv4 and ipv6? Appropriate #if's can be added (I'm willing to help with the configure.m4 mojo -- the stuff to tell OMPI whether ipv4 and/or ipv6 stuff can be found and to set the #define's appropriately).
>>
>> More specifically -- I can help with component / configure / build system issues. I'll defer on the whole how-to-wire-them-up issue for the moment (I've got some other fires burning that must be tended to :-\ ).
>>
>> My $0.02: OOB is the first target to get working -- once you can orterun non-MPI apps properly across ipv6 and/or ipv4 nodes, then move on to the MPI layer and take the same approach there (e.g., one TCP btl with configure.m4 mojo, etc.).

"Half of what I say is meaningless; but I say it so that the other half may reach you"
Kahlil Gibran
Re: [OMPI devel] [IPv6] new component oob/tcp6
Jeff and I talked about this for awhile this morning, and we both agree (yes, I did change my mind after we discussed all the ramifications). It appears that we should be able to consolidate the code into a single component with the right configuration system "magic" - and that would definitely be preferable.

My primary concern originally was with the lack of knowledge and documentation on the configuration system. I know that I don't know enough about that system to make everything work in a single component. The component method would have allowed you to remain ignorant of that system. However, with Jeff's willingness to help in that regard, the approach he recommends would be easier for everyone.

Hope that doesn't cause too much of a problem.
Ralph

On 9/7/06 9:46 AM, "Jeff Squyres" wrote:

> On 9/1/06 12:21 PM, "Adrian Knoth" wrote:
>
>> On Fri, Sep 01, 2006 at 07:01:25AM -0600, Ralph Castain wrote:
>>
>>>> Do you agree to go on with two oob components, tcp and tcp6?
>>> Yes, I think that's the right approach
>>
>> It's a deal. ;)
>
> Actually, I would disagree here (sorry for jumping in late! :-( ).
>
> Given the amount of code duplication, it seems like a big shame to make a separate component that is almost identical.
>
> Can we just have one component that handles both ipv4 and ipv6? Appropriate #if's can be added (I'm willing to help with the configure.m4 mojo -- the stuff to tell OMPI whether ipv4 and/or ipv6 stuff can be found and to set the #define's appropriately).
>
> More specifically -- I can help with component / configure / build system issues. I'll defer on the whole how-to-wire-them-up issue for the moment (I've got some other fires burning that must be tended to :-\ ).
>
> My $0.02: OOB is the first target to get working -- once you can orterun non-MPI apps properly across ipv6 and/or ipv4 nodes, then move on to the MPI layer and take the same approach there (e.g., one TCP btl with configure.m4 mojo, etc.).
Re: [OMPI devel] [IPv6] new component oob/tcp6
On 9/1/06 12:21 PM, "Adrian Knoth" wrote:

> On Fri, Sep 01, 2006 at 07:01:25AM -0600, Ralph Castain wrote:
>
>>> Do you agree to go on with two oob components, tcp and tcp6?
>> Yes, I think that's the right approach
>
> It's a deal. ;)

Actually, I would disagree here (sorry for jumping in late! :-( ).

Given the amount of code duplication, it seems like a big shame to make a separate component that is almost identical.

Can we just have one component that handles both ipv4 and ipv6? Appropriate #if's can be added (I'm willing to help with the configure.m4 mojo -- the stuff to tell OMPI whether ipv4 and/or ipv6 stuff can be found and to set the #define's appropriately).

More specifically -- I can help with component / configure / build system issues. I'll defer on the whole how-to-wire-them-up issue for the moment (I've got some other fires burning that must be tended to :-\ ).

My $0.02: OOB is the first target to get working -- once you can orterun non-MPI apps properly across ipv6 and/or ipv4 nodes, then move on to the MPI layer and take the same approach there (e.g., one TCP btl with configure.m4 mojo, etc.).

--
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems
Re: [OMPI devel] [IPv6] new component oob/tcp6
On 9/6/06 9:44 AM, "Christian Kauhaus" wrote:

> Bogdan Costescu :
>> I don't know why you think that this (talking to different nodes via different channels) is unusual - I think that it's quite probable, especially in a heterogeneous environment.
>
> I think the first goal should be to get IPv6 working -- and this is much easier when we restrict ourselves to the case where all systems participating in one(!) job are reachable via a single protocol version, either IPv4 or IPv6.
>
> I'm not quite sure if we need to run a *single* job across a network with both systems that are not reachable via IPv4 and systems that are not reachable via IPv6. If there is a practical need for this, we will probably tackle this in the future. Note that the current plan does not restrict the use of OpenMPI in heterogeneous IPv4/IPv6 environments, but we will not support mixed IPv4/IPv6 operation in a single job right now.
>
> Our current plan is to look into the hostfile and see if there are
>
> (1a) just IPv4 addresses
> (1b) IPv4 addresses and hostnames for which 'A' queries can be resolved
> (2a) just IPv6 addresses
> (2b) IPv6 addresses and hostnames for which '' queries can be resolved.
>
> In case 1 we initially use an IPv4 transport and in case 2 we initially use an IPv6 transport for the oob. If neither case 1 nor case 2 applies, we abort.

Actually, that could cause us considerable problems. Only a subset of OpenRTE and OpenMPI users actually have hostfiles - the majority do not. Hence, if we base the IPv6 operation on what is in a hostfile, we will be in trouble.

I believe we are going to have to use the "select" mechanism of the OOB and/or the RML frameworks to let us know which protocol to use when talking to a specific host. I also believe you cannot assume that this choice will be consistent for all processes involved in a job. For example, the head node process must talk to the external network, which may well be IPv6. However, the nodes *inside* the cluster may well be IPv4 since they could likely be sitting behind a NAT. The HNP still needs to talk to those nodes as well as the external network.

I don't believe that letting both modes co-exist is all that much harder a problem to solve. We have similar situations elsewhere in the code base and have found that the framework mechanism works very well in this situation. I need to answer Adrian's note anyway and will describe there how to handle multiple component operations.

> I hope that all can agree that this is a good starting point.
>
> Regards
> Christian
Re: [OMPI devel] [IPv6] new component oob/tcp6
On Wed, Sep 06, 2006 at 05:44:23PM +0200, Christian Kauhaus wrote:

> Our current plan is to look into the hostfile and see if there are
>
> (1a) just IPv4 addresses
> (1b) IPv4 addresses and hostnames for which 'A' queries can be resolved
> (2a) just IPv6 addresses
> (2b) IPv6 addresses and hostnames for which '' queries can be resolved.

Speaking of which: today I extended rds/hostfile/ to accept IPv6 addresses. I can now specify IPv6 addresses, but the result is an IPv4 (yes, I-P-v-four) connection. Obviously, I'll have to investigate ;)

(just to let you know I'm working on it)

--
mail: a...@thur.de  http://adi.thur.de  PGP: v2-key via keyserver

Who needs a mouse when you have a keyboard? (Sebastian Linser)
Re: [OMPI devel] [IPv6] new component oob/tcp6
Bogdan Costescu :
> I don't know why you think that this (talking to different nodes via different channels) is unusual - I think that it's quite probable, especially in a heterogeneous environment.

I think the first goal should be to get IPv6 working -- and this is much easier when we restrict ourselves to the case where all systems participating in one(!) job are reachable via a single protocol version, either IPv4 or IPv6.

I'm not quite sure if we need to run a *single* job across a network with both systems that are not reachable via IPv4 and systems that are not reachable via IPv6. If there is a practical need for this, we will probably tackle this in the future. Note that the current plan does not restrict the use of OpenMPI in heterogeneous IPv4/IPv6 environments, but we will not support mixed IPv4/IPv6 operation in a single job right now.

Our current plan is to look into the hostfile and see if there are

(1a) just IPv4 addresses
(1b) IPv4 addresses and hostnames for which 'A' queries can be resolved
(2a) just IPv6 addresses
(2b) IPv6 addresses and hostnames for which '' queries can be resolved.

In case 1 we initially use an IPv4 transport and in case 2 we initially use an IPv6 transport for the oob. If neither case 1 nor case 2 applies, we abort.

I hope that all can agree that this is a good starting point.

Regards
Christian

--
Dipl.-Inf. Christian Kauhaus <><
Lehrstuhl fuer Rechnerarchitektur und -kommunikation
Institut fuer Informatik * Ernst-Abbe-Platz 1-2 * D-07743 Jena
Tel: +49 3641 9 46376 * Fax: +49 3641 9 46372 * Raum 3217
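The hostfile check in Christian's plan could be sketched as follows. This covers only the literal-address cases (1a) and (2a); the DNS lookups for hostnames in (1b) and (2b) are deliberately left out, so a file with hostnames only comes back as "undecided". The enum and function names are hypothetical and not taken from rds/hostfile.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>

enum transport {
    TRANSPORT_UNDECIDED,   /* no address literals seen (hostnames only) */
    TRANSPORT_IPV4,        /* case 1: only IPv4 literals */
    TRANSPORT_IPV6,        /* case 2: only IPv6 literals */
    TRANSPORT_ABORT        /* IPv4 and IPv6 literals are mixed */
};

/* AF_INET or AF_INET6 for an address literal, AF_UNSPEC for anything else. */
static int literal_family(const char *entry)
{
    unsigned char buf[sizeof(struct in6_addr)];
    if (inet_pton(AF_INET, entry, buf) == 1) {
        return AF_INET;
    }
    if (inet_pton(AF_INET6, entry, buf) == 1) {
        return AF_INET6;
    }
    return AF_UNSPEC;
}

/* Classify a list of hostfile entries according to the plan above. */
static enum transport choose_transport(const char *const *entries, size_t n)
{
    assert(entries != NULL || n == 0);

    size_t v4 = 0, v6 = 0;
    for (size_t i = 0; i < n; i++) {
        switch (literal_family(entries[i])) {
        case AF_INET:  v4++; break;
        case AF_INET6: v6++; break;
        default:       break;   /* hostname: DNS check omitted in this sketch */
        }
    }
    if (v4 > 0 && v6 > 0) {
        return TRANSPORT_ABORT;
    }
    if (v4 > 0) {
        return TRANSPORT_IPV4;
    }
    if (v6 > 0) {
        return TRANSPORT_IPV6;
    }
    return TRANSPORT_UNDECIDED;
}
```

For example, the entries `192.168.1.10` and `node07` yield TRANSPORT_IPV4, while `10.0.0.1` together with `2001:db8::2` yields TRANSPORT_ABORT.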
Re: [OMPI devel] [IPv6] new component oob/tcp6
Actually, I was a part of that thread - see my comments beginning with http://www.open-mpi.org/community/lists/devel/2006/03/0797.php.

Perhaps I communicated poorly here. The issue in the prior thread was that few systems nowadays don't offer at least some level of IPv6 compatibility, even if nothing more than mapping IPv6 addresses to IPv4. My point in that thread was that some types of systems (e.g., embedded systems) don't - they have no ability to interact with IPv6 at all - but that these are not commonly found in the high performance world (the focus of OpenMPI).

Although I expect hetero operations to be fairly common, I don't expect to see too many high performance systems that have no library support at all for IPv6.

Hope that clarifies my comment. The intent is to fully support both types of systems anyway, so I'll concede that the point (i.e., how unusual the situation might be) is somewhat moot.

On 9/6/06 8:13 AM, "Bogdan Costescu" wrote:

> On Fri, 1 Sep 2006, Ralph Castain wrote:
>
>> The only use case I am really concerned about is that of a Head Node Process (HNP) that needs to talk to both IPv6 and IPv4 systems. I admit this will be unusual,
>
> This and other aspects were discussed or at least mentioned in a thread starting at:
>
> http://www.open-mpi.org/community/lists/devel/2006/03/0781.php
>
> I don't know why you think that this (talking to different nodes via different channels) is unusual - I think that it's quite probable, especially in a heterogeneous environment.
>
> However, if the present discussion is only about a proof of concept version, then I'd say that anything to show IPv6 functionality would be acceptable.
Re: [OMPI devel] [IPv6] new component oob/tcp6
On Fri, 1 Sep 2006, Ralph Castain wrote:

> The only use case I am really concerned about is that of a Head Node Process (HNP) that needs to talk to both IPv6 and IPv4 systems. I admit this will be unusual,

This and other aspects were discussed or at least mentioned in a thread starting at:

http://www.open-mpi.org/community/lists/devel/2006/03/0781.php

I don't know why you think that this (talking to different nodes via different channels) is unusual - I think that it's quite probable, especially in a heterogeneous environment.

However, if the present discussion is only about a proof of concept version, then I'd say that anything to show IPv6 functionality would be acceptable.

--
Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: bogdan.coste...@iwr.uni-heidelberg.de
Re: [OMPI devel] [IPv6] new component oob/tcp6
On Fri, Sep 01, 2006 at 07:01:25AM -0600, Ralph Castain wrote:

> > Do you agree to go on with two oob components, tcp and tcp6?
> Yes, I think that's the right approach

It's a deal. ;)

> I think this can be supported nicely in the framework system. All we
> have to do is set the IPv6 component's priority higher than IPv4.

Do you mean that priority?:

   MCA oob: parameter "oob_tcp6_priority" (current value: "0")

> We then can deal with the "try IPv6 first" by traversing the component
> list in priority order. As an example, see the RAS framework.

Where is it done? It's outside the mca/oob directory, right? My knowledge about orte is currently more or less limited to this subdirectory ;)

> it. In this case, we need both OOB components active, and we need a routing
> table that tells us which one to use to talk to various processes. I suspect
> the routing table belongs in the RML framework. If you look at the PLS
> framework, you'll see where we "front" the select function to give you the
> ability to specify a preferred selection. We might have to do the same thing
> with the OOB to allow the RML to say "send this buffer using this specific
> OOB component", while still allowing it to say "send this buffer using the
> *best* component".

Sounds good (but I don't have to do it on my own, do I?).

Right now it looks like this:

   orterun -np 2 -host hostA,hostB some_command

uses IPv4 and it is still working.

   orterun -mca oob ^tcp hostA,hostB some_command

hangs. The HNP correctly generated the tcp6://-URIs, but I guess the remote node tries to connect with its oob/tcp module (which cannot handle IPv6 anymore).

So I chmod 0 the mca_oob_tcp.so to prevent its loading, thus resulting in a working IPv6 connection.

(for now, I don't know why this happens (the hang), but at least the oob/tcp6 component is working at all)

> I suspect that backend processes (i.e., non-HNP processes) really will
> only use one or the other.

The question also arises for the btl/tcp component: if all nodes should be able to communicate with each other, they must use the same address family.

Thanks for your help.

--
mail: a...@thur.de http://adi.thur.de PGP: v2-key via keyserver

Person1: Great. Tomorrow at 9 I have to give a presentation. Ugh!
Person2: Tomorrow at 9 I'll be holding a coffee cup.
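The address-family constraint raised at the end of the message above can be sketched as follows. This is only an illustration in Python, not Open MPI code: the `nodes` table, the host names and the function `common_families` are hypothetical, the addresses come from documentation ranges, and preferring IPv6 when every node has it is an assumption borrowed from the "try IPv6 first" idea in this thread.

```python
import ipaddress

def common_families(node_addresses):
    """Return the IP versions (4 and/or 6) for which every node has an address."""
    common = {4, 6}
    for addrs in node_addresses.values():
        versions = {ipaddress.ip_address(a).version for a in addrs}
        common &= versions  # keep only families this node also offers
    return common

# Hypothetical job: hostC has no IPv6 address.
nodes = {
    "hostA": ["192.0.2.1", "2001:db8::1"],
    "hostB": ["192.0.2.2", "2001:db8::2"],
    "hostC": ["192.0.2.3"],
}
usable = common_families(nodes)
# Assumed policy: prefer IPv6 if all nodes have it, otherwise fall back to IPv4.
family = 6 if 6 in usable else (4 if 4 in usable else None)
print(usable, family)
```

If the intersection is empty, no single address family lets every node reach every other node, which is the case the btl/tcp component would have to reject or route around.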
Re: [OMPI devel] [IPv6] new component oob/tcp6
On 9/1/06 6:17 AM, "Adrian Knoth" wrote:

> Hi,
>
> yesterday I felt impelled to create a new ORTE oob component: tcp6.
>
> I was able to either compile the library with IPv4 or IPv6 support,
> but not with both (so to say: two different ompi installations or
> at least two different DSO versions).
>
> As far as I can see, many functions use mca_oob_tcp_component.tcp_listen_sd.
> Unfortunately, as I am not allowed to use v4mapped addresses (not supported
> by the Windows IPv6 stack, disabled by default on *BSD), this socket
> is either AF_INET or AF_INET6, but not both (both means AF_INET6 *and*
> accepting v4mapped addresses).
>
> Do you agree to go on with two oob components, tcp and tcp6?
> There is a lot of duplicated code, but we might refactor this
> when everything else will be done.

Yes, I think that's the right approach - see bottom for more comments, though.

> On the other hand, this whole procedure might be totally useless:
> two nodes may exchange IPv4-URIs via IPv6 containing identical
> RFC1918 networks. One would prefer IPv4 due to less overhead,
> but with IPv6, these v4-addresses might be at different locations
> anywhere in the world.
>
> In other words: IPv6 must be tried first or mixing with IPv4
> cannot be reliable. In this case, a lot of code may be removed
> and we'll end up with either two installations/DSOs (as mentioned
> above) or with runtime detection of af_family (i.e. look for
> global IPv6 addresses and iff found, disable IPv4 completely)

I think this can be supported nicely in the framework system. All we have to do is set the IPv6 component's priority higher than IPv4. We then can deal with the "try IPv6 first" by traversing the component list in priority order. As an example, see the RAS framework.

> What do you think - which way is best? Use cases?

The only use case I am really concerned about is that of a Head Node Process (HNP) that needs to talk to both IPv6 and IPv4 systems. I admit this will be unusual, but I would hate to pursue a design that inherently can't support it. In this case, we need both OOB components active, and we need a routing table that tells us which one to use to talk to various processes. I suspect the routing table belongs in the RML framework. If you look at the PLS framework, you'll see where we "front" the select function to give you the ability to specify a preferred selection. We might have to do the same thing with the OOB to allow the RML to say "send this buffer using this specific OOB component", while still allowing it to say "send this buffer using the *best* component".

I suspect that backend processes (i.e., non-HNP processes) really will only use one or the other. Of course, someone might set up a bizarre cluster or grid that has a mix of IPv6 and IPv4 systems, but I doubt it. So I'm not as concerned there.

You know, we never did much of a communications layer design for OpenRTE. What may really be required here is to take a step back and do just that - define the relative roles of the RML and OOB a little more clearly, decide what would drive us to add components to either framework, etc. Does that sound like a good idea? Otherwise, I fear we will have another major overhaul (like we are doing right now for the launch frameworks) in our future.

Ralph
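A rough sketch of the two ideas in this message: trying components in priority order (IPv6 ahead of IPv4), and a routing table that lets the caller either name a specific OOB component or ask for the best one. It is written in Python for brevity and does not reflect how the RML, OOB, RAS or PLS frameworks are actually implemented; the class `Component`, the function `select_component`, the priority values and the process names are all invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Component:
    name: str             # "tcp6" or "tcp" (component names taken from the thread)
    priority: int         # higher is tried first (values here are made up)
    reachable: frozenset  # peers this component can talk to (hypothetical)

def select_component(components, peer, preferred=None):
    """Pick the component used to reach `peer`.

    With `preferred` set, only that component is considered ("send this
    buffer using this specific OOB component"); otherwise the highest-priority
    component that can reach the peer wins ("use the *best* component").
    """
    ordered = sorted(components, key=lambda c: c.priority, reverse=True)
    for comp in ordered:
        if preferred is not None and comp.name != preferred:
            continue
        if peer in comp.reachable:
            return comp.name
    return None  # no suitable component can reach this peer

components = [
    Component("tcp", 10, frozenset({"proc1", "proc2"})),
    Component("tcp6", 20, frozenset({"proc2", "proc3"})),
]
# Routing table as an HNP might build it: peer -> component chosen in priority order.
routing_table = {p: select_component(components, p) for p in ("proc1", "proc2", "proc3")}
print(routing_table)
```

Here a peer reachable over both families is assigned to the higher-priority component, while a peer reachable over only one family still gets a route, which is the mixed HNP case described above.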