Re: [OMPI devel] [PATCH] openib: clean-up connect to allow for new cm's
On Thu, Dec 13, 2007 at 10:49:45AM +0200, Pavel Shamis (Pasha) wrote: >> Because we want to support mixed setups and create XRC between nodes that >> support it and RC between all other nodes. >> > Ok, sounds reasonable for me. Just need make sure that the parameters name > will be user friendly. > Some thing like --mca enable-xrc that will cause to XOOB priority be > highest (and not something like --mca xoob 10 :-)) > You can be sure that mca parameter name will be as cryptic as possible. -- Gleb.
Re: [OMPI devel] [PATCH] openib: clean-up connect to allow for new cm's
Gleb Natapov wrote: On Wed, Dec 12, 2007 at 04:08:31PM +0200, Pavel Shamis (Pasha) wrote: Gleb Natapov wrote: On Wed, Dec 12, 2007 at 03:37:26PM +0200, Pavel Shamis (Pasha) wrote: Gleb Natapov wrote: On Tue, Dec 11, 2007 at 08:16:07PM -0500, Jeff Squyres wrote: Isn't there a better way somehow? Perhaps we should have "select" call *all* the functions and accept back a priority. The one with the highest priority then wins. This is quite similar to much of the other selection logic in OMPI. Sidenote: Keep in mind that there are some changes coming to select CPCs on a per-endpoint basis (I can't look up the trac ticket right now...). This makes things a little complicated -- do we need btl_openib_cpc_include and btl_openib_cpc_exclude MCA params to include/exclude CPCs (because you might need more than one CPC in a single job)? That wouldn't be hard to do. But then what to do about if someone sets to use some XRC QPs and selects to use OOB or RDMA CM? How do we catch this and print an error? It doesn't seem right to put the "if num_xrc_qps>0" check in every CPC. What happens if you try to make an XRC QP when not using xoob? Where is the error detected and what kind of error message do we print? In my opinion "X" notation for QP specification should be removed. I didn't want this to prevent XRC merging so I haven't raced this point. It is enough to have two types of QPs "P" - SW credit management "S" - HW credit management. How will you decide witch QP type to use ? (SRQ or XRC) If both sides support XOOB and priority of XOOB is higher then all other CPC then create XRC, otherwise use regular RC. If some body have connectX hca but he want to use SRQ and not XRC ? This will be the default. (prio of OOB will be bigger than of XOOB), but if uses will want to use XRC it will increase XOOB priority by specifying MCA parameter. I guess anyway we will be need some additional parameter that will allow enable/disable XRC, correct ? (So why just not leave the X qp type ?) Because we want to support mixed setups and create XRC between nodes that support it and RC between all other nodes. Ok, sounds reasonable for me. Just need make sure that the parameters name will be user friendly. Some thing like --mca enable-xrc that will cause to XOOB priority be highest (and not something like --mca xoob 10 :-)) Pasha
Re: [OMPI devel] [PATCH] openib: clean-up connect to allow for new cm's
On Wed, Dec 12, 2007 at 01:35:33PM -0500, Jeff Squyres wrote: > I agree with Gleb's idea. More below. > > On Dec 12, 2007, at 12:24 PM, Jon Mason wrote: > > > Ok, glad I got this conversation started :) > > > > So, we need a slight redesign to determine the cm method (unless > > forced > > via commandline arg). This can be determined by calling all the > > individual open routines, and having them return a priority based on > > their ability to function. For example, the xoob open function will > > check the mca_btl_openib_component.num_xrc_qps for a non-zero value > > and > > return the priority based on that. > > > > Of course, if forced then it will only call that specific open > > function > > and throw any relevant errors as necessary. > > > Close, but I'd do it slightly differently: > > - open() is *only* used for creating MCA params. It's a bad name, but > it's unfortunately the precedent throughout the rest of the OMPI code > base. :-\ (it has roots in the ompi_info command -- ompi_info has to > be able to get a full list of all MCA params regardless of what > hardware is available on the current system) > > - during the openib component startup, we should add a query() > function that does what you describe. I.e., we query() each endpoint > and it either returns a valid priority or "I don't want to be used > with this endpoint." > > - there should be a priority MCA param for every CPC. Perhaps the CPC > base can handle this...? I'm not sure; it may need to be down in each > CPC. > > - the list of CPCs that want to run with each endpoint are ordered by > priority (ties will be arbitrarily, but deterministically, broken -- > alphabetical?) and sent around in the modex. > > - when a new connection comes up, the intersection of the CPC lists > for the near and far endpoints is computed and the highest priority > CPC is used to make the connection. Since everyone has the same data, > both sides will make the same decision. > > - CPC init may have to change a bit -- more than one CPC may be used > for a given endpoint because both the local module and the remote > module are involved in making the decision of which CPC is used. > > After this first cut is done, we should probably also add > btl_openib_cpc_include and btl_openib_cpc_exclude as I described in a > prior mail (just like *_if_include and *_if_exclude in several BTLs) > to include/exclude sets of CPCs at run-time. > > > If this sounds sane, then let me know and I'll start coding it up. > > > This has actually been on my to-do list for too long; if you have the > cycles to do this now, it would be great... Since I need to have it done before I can do my rdma_cm bits, I'll add this to my queue and get started immediately. > > I'll make you a bargain: if you do the stuff above, I'll add in the > configure/build mojo for selectively compiling the XOOB CPC or not > (depending on whether the underlying system has XRC library support or > not). Cool? > > Let's go off on a /tmp-public branch for this so we don't hose the > trunk... I just made /tmp-public/openib-cpc. > > -- > Jeff Squyres > Cisco Systems > ___ > devel mailing list > de...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/devel
Re: [OMPI devel] [PATCH] openib: clean-up connect to allow for new cm's
I agree with Gleb's idea. More below. On Dec 12, 2007, at 12:24 PM, Jon Mason wrote: Ok, glad I got this conversation started :) So, we need a slight redesign to determine the cm method (unless forced via commandline arg). This can be determined by calling all the individual open routines, and having them return a priority based on their ability to function. For example, the xoob open function will check the mca_btl_openib_component.num_xrc_qps for a non-zero value and return the priority based on that. Of course, if forced then it will only call that specific open function and throw any relevant errors as necessary. Close, but I'd do it slightly differently: - open() is *only* used for creating MCA params. It's a bad name, but it's unfortunately the precedent throughout the rest of the OMPI code base. :-\ (it has roots in the ompi_info command -- ompi_info has to be able to get a full list of all MCA params regardless of what hardware is available on the current system) - during the openib component startup, we should add a query() function that does what you describe. I.e., we query() each endpoint and it either returns a valid priority or "I don't want to be used with this endpoint." - there should be a priority MCA param for every CPC. Perhaps the CPC base can handle this...? I'm not sure; it may need to be down in each CPC. - the list of CPCs that want to run with each endpoint are ordered by priority (ties will be arbitrarily, but deterministically, broken -- alphabetical?) and sent around in the modex. - when a new connection comes up, the intersection of the CPC lists for the near and far endpoints is computed and the highest priority CPC is used to make the connection. Since everyone has the same data, both sides will make the same decision. - CPC init may have to change a bit -- more than one CPC may be used for a given endpoint because both the local module and the remote module are involved in making the decision of which CPC is used. After this first cut is done, we should probably also add btl_openib_cpc_include and btl_openib_cpc_exclude as I described in a prior mail (just like *_if_include and *_if_exclude in several BTLs) to include/exclude sets of CPCs at run-time. If this sounds sane, then let me know and I'll start coding it up. This has actually been on my to-do list for too long; if you have the cycles to do this now, it would be great... I'll make you a bargain: if you do the stuff above, I'll add in the configure/build mojo for selectively compiling the XOOB CPC or not (depending on whether the underlying system has XRC library support or not). Cool? Let's go off on a /tmp-public branch for this so we don't hose the trunk... I just made /tmp-public/openib-cpc. -- Jeff Squyres Cisco Systems
Re: [OMPI devel] [PATCH] openib: clean-up connect to allow for new cm's
Ok, glad I got this conversation started :) So, we need a slight redesign to determine the cm method (unless forced via commandline arg). This can be determined by calling all the individual open routines, and having them return a priority based on their ability to function. For example, the xoob open function will check the mca_btl_openib_component.num_xrc_qps for a non-zero value and return the priority based on that. Of course, if forced then it will only call that specific open function and throw any relevant errors as necessary. If this sounds sane, then let me know and I'll start coding it up. Thanks, Jon
Re: [OMPI devel] [PATCH] openib: clean-up connect to allow for new cm's
On Wed, Dec 12, 2007 at 04:08:31PM +0200, Pavel Shamis (Pasha) wrote: > Gleb Natapov wrote: >> On Wed, Dec 12, 2007 at 03:37:26PM +0200, Pavel Shamis (Pasha) wrote: >> >>> Gleb Natapov wrote: >>> On Tue, Dec 11, 2007 at 08:16:07PM -0500, Jeff Squyres wrote: > Isn't there a better way somehow? Perhaps we should have "select" > call *all* the functions and accept back a priority. The one with the > highest priority then wins. This is quite similar to much of the > other selection logic in OMPI. > > Sidenote: Keep in mind that there are some changes coming to select > CPCs on a per-endpoint basis (I can't look up the trac ticket right > now...). This makes things a little complicated -- do we need > btl_openib_cpc_include and btl_openib_cpc_exclude MCA params to > include/exclude CPCs (because you might need more than one CPC in a > single job)? That wouldn't be hard to do. > > But then what to do about if someone sets to use some XRC QPs and > selects to use OOB or RDMA CM? How do we catch this and print an > error? It doesn't seem right to put the "if num_xrc_qps>0" check in > every CPC. What happens if you try to make an XRC QP when not using > xoob? Where is the error detected and what kind of error message do > we print? > > In my opinion "X" notation for QP specification should be removed. I didn't want this to prevent XRC merging so I haven't raced this point. It is enough to have two types of QPs "P" - SW credit management "S" - HW credit management. >>> How will you decide witch QP type to use ? (SRQ or XRC) >>> >>> >> If both sides support XOOB and priority of XOOB is higher then all other >> CPC >> then create XRC, otherwise use regular RC. >> > If some body have connectX hca but he want to use SRQ and not XRC ? This will be the default. (prio of OOB will be bigger than of XOOB), but if uses will want to use XRC it will increase XOOB priority by specifying MCA parameter. > I guess anyway we will be need some additional parameter that will allow > enable/disable XRC, correct ? (So why just not leave the X qp type ?) Because we want to support mixed setups and create XRC between nodes that support it and RC between all other nodes. -- Gleb.
Re: [OMPI devel] [PATCH] openib: clean-up connect to allow for new cm's
Gleb Natapov wrote: On Wed, Dec 12, 2007 at 03:37:26PM +0200, Pavel Shamis (Pasha) wrote: Gleb Natapov wrote: On Tue, Dec 11, 2007 at 08:16:07PM -0500, Jeff Squyres wrote: Isn't there a better way somehow? Perhaps we should have "select" call *all* the functions and accept back a priority. The one with the highest priority then wins. This is quite similar to much of the other selection logic in OMPI. Sidenote: Keep in mind that there are some changes coming to select CPCs on a per-endpoint basis (I can't look up the trac ticket right now...). This makes things a little complicated -- do we need btl_openib_cpc_include and btl_openib_cpc_exclude MCA params to include/exclude CPCs (because you might need more than one CPC in a single job)? That wouldn't be hard to do. But then what to do about if someone sets to use some XRC QPs and selects to use OOB or RDMA CM? How do we catch this and print an error? It doesn't seem right to put the "if num_xrc_qps>0" check in every CPC. What happens if you try to make an XRC QP when not using xoob? Where is the error detected and what kind of error message do we print? In my opinion "X" notation for QP specification should be removed. I didn't want this to prevent XRC merging so I haven't raced this point. It is enough to have two types of QPs "P" - SW credit management "S" - HW credit management. How will you decide witch QP type to use ? (SRQ or XRC) If both sides support XOOB and priority of XOOB is higher then all other CPC then create XRC, otherwise use regular RC. If some body have connectX hca but he want to use SRQ and not XRC ? I guess anyway we will be need some additional parameter that will allow enable/disable XRC, correct ? (So why just not leave the X qp type ?)
Re: [OMPI devel] [PATCH] openib: clean-up connect to allow for new cm's
On Wed, Dec 12, 2007 at 03:37:26PM +0200, Pavel Shamis (Pasha) wrote: > Gleb Natapov wrote: > > On Tue, Dec 11, 2007 at 08:16:07PM -0500, Jeff Squyres wrote: > > > >> Isn't there a better way somehow? Perhaps we should have "select" > >> call *all* the functions and accept back a priority. The one with the > >> highest priority then wins. This is quite similar to much of the > >> other selection logic in OMPI. > >> > >> Sidenote: Keep in mind that there are some changes coming to select > >> CPCs on a per-endpoint basis (I can't look up the trac ticket right > >> now...). This makes things a little complicated -- do we need > >> btl_openib_cpc_include and btl_openib_cpc_exclude MCA params to > >> include/exclude CPCs (because you might need more than one CPC in a > >> single job)? That wouldn't be hard to do. > >> > >> But then what to do about if someone sets to use some XRC QPs and > >> selects to use OOB or RDMA CM? How do we catch this and print an > >> error? It doesn't seem right to put the "if num_xrc_qps>0" check in > >> every CPC. What happens if you try to make an XRC QP when not using > >> xoob? Where is the error detected and what kind of error message do > >> we print? > >> > >> > > In my opinion "X" notation for QP specification should be removed. I > > didn't want this to prevent XRC merging so I haven't raced this point. > > It is enough to have two types of QPs "P" - SW credit management "S" - > > HW credit management. > How will you decide witch QP type to use ? (SRQ or XRC) > If both sides support XOOB and priority of XOOB is higher then all other CPC then create XRC, otherwise use regular RC. -- Gleb.
Re: [OMPI devel] [PATCH] openib: clean-up connect to allow for new cm's
Gleb Natapov wrote: On Tue, Dec 11, 2007 at 08:16:07PM -0500, Jeff Squyres wrote: Isn't there a better way somehow? Perhaps we should have "select" call *all* the functions and accept back a priority. The one with the highest priority then wins. This is quite similar to much of the other selection logic in OMPI. Sidenote: Keep in mind that there are some changes coming to select CPCs on a per-endpoint basis (I can't look up the trac ticket right now...). This makes things a little complicated -- do we need btl_openib_cpc_include and btl_openib_cpc_exclude MCA params to include/exclude CPCs (because you might need more than one CPC in a single job)? That wouldn't be hard to do. But then what to do about if someone sets to use some XRC QPs and selects to use OOB or RDMA CM? How do we catch this and print an error? It doesn't seem right to put the "if num_xrc_qps>0" check in every CPC. What happens if you try to make an XRC QP when not using xoob? Where is the error detected and what kind of error message do we print? In my opinion "X" notation for QP specification should be removed. I didn't want this to prevent XRC merging so I haven't raced this point. It is enough to have two types of QPs "P" - SW credit management "S" - HW credit management. How will you decide witch QP type to use ? (SRQ or XRC) I think connection management should work like this: Each BTL knows what type of CPC it can use and it should share this info during modex stage. During connection establishment modex info is used to figure out the list of CPCs that both endpoints support and one with highest prio is selected. ok for me. -- Gleb. ___ devel mailing list de...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/devel
Re: [OMPI devel] [PATCH] openib: clean-up connect to allow for new cm's
On Tue, Dec 11, 2007 at 08:16:07PM -0500, Jeff Squyres wrote: > Isn't there a better way somehow? Perhaps we should have "select" > call *all* the functions and accept back a priority. The one with the > highest priority then wins. This is quite similar to much of the > other selection logic in OMPI. > > Sidenote: Keep in mind that there are some changes coming to select > CPCs on a per-endpoint basis (I can't look up the trac ticket right > now...). This makes things a little complicated -- do we need > btl_openib_cpc_include and btl_openib_cpc_exclude MCA params to > include/exclude CPCs (because you might need more than one CPC in a > single job)? That wouldn't be hard to do. > > But then what to do about if someone sets to use some XRC QPs and > selects to use OOB or RDMA CM? How do we catch this and print an > error? It doesn't seem right to put the "if num_xrc_qps>0" check in > every CPC. What happens if you try to make an XRC QP when not using > xoob? Where is the error detected and what kind of error message do > we print? > In my opinion "X" notation for QP specification should be removed. I didn't want this to prevent XRC merging so I haven't raced this point. It is enough to have two types of QPs "P" - SW credit management "S" - HW credit management. I think connection management should work like this: Each BTL knows what type of CPC it can use and it should share this info during modex stage. During connection establishment modex info is used to figure out the list of CPCs that both endpoints support and one with highest prio is selected. -- Gleb.
Re: [OMPI devel] [PATCH] openib: clean-up connect to allow for new cm's
On Dec 12, 2007, at 5:13 AM, Pavel Shamis (Pasha) wrote: Hmm. I don't think that we want to put knowledge of XRC in the OOB CPC (and vice versa). That seems like an abstraction violation. I didn't like that XRC knowledge was put in the connect base either, but I was too busy to argue with it. :-) Isn't there a better way somehow? Perhaps we should have "select" call *all* the functions and accept back a priority. The one with the highest priority then wins. This is quite similar to much of the other selection logic in OMPI. Sidenote: Keep in mind that there are some changes coming to select CPCs on a per-endpoint basis (I can't look up the trac ticket right now...). This makes things a little complicated -- do we need btl_openib_cpc_include and btl_openib_cpc_exclude MCA params to include/exclude CPCs (because you might need more than one CPC in a single job)? That wouldn't be hard to do. But then what to do about if someone sets to use some XRC QPs and selects to use OOB or RDMA CM? Error message will be reported , that for using XRC you _must_ select xoob. I understand that that is what it does today; I was asking my somewhat- rhetorical question with the above text in mind (that we remove the abstraction violations -- remove knowledge of XRC from the OOB CPC, etc.). How do we catch this and print an error? It doesn't seem right to put the "if num_xrc_qps>0" check in every CPC. What happens if you try to make an XRC QP when not using xoob? Where is the error detected and what kind of error message do we print? I would like to remind 2 things: 1. XRC little bit change QP logic. We creates one XRC qp for send and one for recv. As result it require absolutely different oob mechanism. 2. Current implementation doesn't allow to run with XRC + RC (or srq) and I don't think that we need such mixed configuration support at all. So as results the the XRC may work only with XOOB. If you will try to run it with oob error message will be reported. As well if you will try to run !(XRC) with XOOB error message will be reported too. The check is located in ompi_btl_openib_connect_base_open. I understand all of that. I think the question is if there is a way to de-centralize these checks such that the XOOB CPC can be the one that figures this stuff out (for example) rather than having to put this in the base. The original code in the function used oob as default connection method. I changed it to check in which mode we are running (xrc enabled/disabled) and make xoob default connection manager for xrc mode and oob make default for not xrc mode. Right -- this is problematic for adding IBCM and RDMA CM; that's Jon's point. I not sure that oob cpc is the best place for the logic. also I don't think that the "select + priority" solution will resolve the dependences problem: XRC enabled -> xoob XRC disabled -> oob , cm. We may push the logic outside of cpc and pass to ompi_btl_openib_connect_base_open() witch connection manger we want to use. I guess that the change also will be usefull for future "CPCs on a per-endpoint basis" changes. From an abstraction point of view, it would be nice to get all this CPC-specific information out of the base and into the CPCs that they belong to. Also, I'm not sure why the #if/#else is there for xoob (i.e., having empty/printf functions there when XRC support is compiled out) -- if xoob was disabled during compilation, then it simply should not be compiled and therefore not be there at all at run-time. If a user selects the xoob CPC, then we should print a message from the base that that CPC doesn't exist in the installation. Correspondingly, we can make an info MCA param in the btl openib that shows which CPCs are available (we already have this information -- it's easy enough to put this in an info MCA param). Sounds reasonable for me. Pasha. On Dec 11, 2007, at 6:59 PM, Jon Mason wrote: Currently, alternate CMs cannot be called because ompi_btl_openib_connect_base_open forces a choice of either oob or xoob (and goes into an erroneous error path if you pick something else). This patch reorganizes ompi_btl_openib_connect_base_open so that new functions can easily be added. New Open functions were added to oob and xoob for the error handling. I tested calling oob, xoob, and rdma_cm. oob happily allows connections to be established and throws no errors. xoob fails because ompi does not have it compiled in (and I have no connectx cards). rdma_cm calls the empty hooks and exits without connecting (thus throwing non-connection errors). All expected behavior. Since this patch fixes the existing behavior, and is not necessarily tied to my implementing of rdma_cm, I think it is acceptable to go in now. Thanks, Jon Index: ompi/mca/btl/openib/connect/btl_openib_connect_base.c === --- ompi/mca/btl/openib/connect/btl_openib_connect_ba
Re: [OMPI devel] [PATCH] openib: clean-up connect to allow for new cm's
Jeff Squyres wrote: Hmm. I don't think that we want to put knowledge of XRC in the OOB CPC (and vice versa). That seems like an abstraction violation. I didn't like that XRC knowledge was put in the connect base either, but I was too busy to argue with it. :-) Isn't there a better way somehow? Perhaps we should have "select" call *all* the functions and accept back a priority. The one with the highest priority then wins. This is quite similar to much of the other selection logic in OMPI. Sidenote: Keep in mind that there are some changes coming to select CPCs on a per-endpoint basis (I can't look up the trac ticket right now...). This makes things a little complicated -- do we need btl_openib_cpc_include and btl_openib_cpc_exclude MCA params to include/exclude CPCs (because you might need more than one CPC in a single job)? That wouldn't be hard to do. But then what to do about if someone sets to use some XRC QPs and selects to use OOB or RDMA CM? Error message will be reported , that for using XRC you _must_ select xoob. How do we catch this and print an error? It doesn't seem right to put the "if num_xrc_qps>0" check in every CPC. What happens if you try to make an XRC QP when not using xoob? Where is the error detected and what kind of error message do we print? I would like to remind 2 things: 1. XRC little bit change QP logic. We creates one XRC qp for send and one for recv. As result it require absolutely different oob mechanism. 2. Current implementation doesn't allow to run with XRC + RC (or srq) and I don't think that we need such mixed configuration support at all. So as results the the XRC may work only with XOOB. If you will try to run it with oob error message will be reported. As well if you will try to run !(XRC) with XOOB error message will be reported too. The check is located in ompi_btl_openib_connect_base_open. The original code in the function used oob as default connection method. I changed it to check in which mode we are running (xrc enabled/disabled) and make xoob default connection manager for xrc mode and oob make default for not xrc mode. I not sure that oob cpc is the best place for the logic. also I don't think that the "select + priority" solution will resolve the dependences problem: XRC enabled -> xoob XRC disabled -> oob , cm. We may push the logic outside of cpc and pass to ompi_btl_openib_connect_base_open() witch connection manger we want to use. I guess that the change also will be usefull for future "CPCs on a per-endpoint basis" changes. Also, I'm not sure why the #if/#else is there for xoob (i.e., having empty/printf functions there when XRC support is compiled out) -- if xoob was disabled during compilation, then it simply should not be compiled and therefore not be there at all at run-time. If a user selects the xoob CPC, then we should print a message from the base that that CPC doesn't exist in the installation. Correspondingly, we can make an info MCA param in the btl openib that shows which CPCs are available (we already have this information -- it's easy enough to put this in an info MCA param). Sounds reasonable for me. Pasha. On Dec 11, 2007, at 6:59 PM, Jon Mason wrote: Currently, alternate CMs cannot be called because ompi_btl_openib_connect_base_open forces a choice of either oob or xoob (and goes into an erroneous error path if you pick something else). This patch reorganizes ompi_btl_openib_connect_base_open so that new functions can easily be added. New Open functions were added to oob and xoob for the error handling. I tested calling oob, xoob, and rdma_cm. oob happily allows connections to be established and throws no errors. xoob fails because ompi does not have it compiled in (and I have no connectx cards). rdma_cm calls the empty hooks and exits without connecting (thus throwing non-connection errors). All expected behavior. Since this patch fixes the existing behavior, and is not necessarily tied to my implementing of rdma_cm, I think it is acceptable to go in now. Thanks, Jon Index: ompi/mca/btl/openib/connect/btl_openib_connect_base.c === --- ompi/mca/btl/openib/connect/btl_openib_connect_base.c (revision 16937) +++ ompi/mca/btl/openib/connect/btl_openib_connect_base.c (working copy) @@ -50,8 +50,8 @@ */ int ompi_btl_openib_connect_base_open(void) { -int i; -char **temp, *a, *b; +char **temp, *a, *b, *defval; +int i, ret = OMPI_ERROR; /* Make an MCA parameter to select which connect module to use */ temp = NULL; @@ -66,40 +66,23 @@ /* For XRC qps we must to use XOOB connection manager */ if (mca_btl_openib_component.num_xrc_qps > 0) { - mca_base_param_reg_string(&mca_btl_openib_component.super.btl_version, -"connect", -b, false, false, -"xoob", ¶m); -
Re: [OMPI devel] [PATCH] openib: clean-up connect to allow for new cm's
Hmm. I don't think that we want to put knowledge of XRC in the OOB CPC (and vice versa). That seems like an abstraction violation. I didn't like that XRC knowledge was put in the connect base either, but I was too busy to argue with it. :-) Isn't there a better way somehow? Perhaps we should have "select" call *all* the functions and accept back a priority. The one with the highest priority then wins. This is quite similar to much of the other selection logic in OMPI. Sidenote: Keep in mind that there are some changes coming to select CPCs on a per-endpoint basis (I can't look up the trac ticket right now...). This makes things a little complicated -- do we need btl_openib_cpc_include and btl_openib_cpc_exclude MCA params to include/exclude CPCs (because you might need more than one CPC in a single job)? That wouldn't be hard to do. But then what to do about if someone sets to use some XRC QPs and selects to use OOB or RDMA CM? How do we catch this and print an error? It doesn't seem right to put the "if num_xrc_qps>0" check in every CPC. What happens if you try to make an XRC QP when not using xoob? Where is the error detected and what kind of error message do we print? Also, I'm not sure why the #if/#else is there for xoob (i.e., having empty/printf functions there when XRC support is compiled out) -- if xoob was disabled during compilation, then it simply should not be compiled and therefore not be there at all at run-time. If a user selects the xoob CPC, then we should print a message from the base that that CPC doesn't exist in the installation. Correspondingly, we can make an info MCA param in the btl openib that shows which CPCs are available (we already have this information -- it's easy enough to put this in an info MCA param). On Dec 11, 2007, at 6:59 PM, Jon Mason wrote: Currently, alternate CMs cannot be called because ompi_btl_openib_connect_base_open forces a choice of either oob or xoob (and goes into an erroneous error path if you pick something else). This patch reorganizes ompi_btl_openib_connect_base_open so that new functions can easily be added. New Open functions were added to oob and xoob for the error handling. I tested calling oob, xoob, and rdma_cm. oob happily allows connections to be established and throws no errors. xoob fails because ompi does not have it compiled in (and I have no connectx cards). rdma_cm calls the empty hooks and exits without connecting (thus throwing non-connection errors). All expected behavior. Since this patch fixes the existing behavior, and is not necessarily tied to my implementing of rdma_cm, I think it is acceptable to go in now. Thanks, Jon Index: ompi/mca/btl/openib/connect/btl_openib_connect_base.c === --- ompi/mca/btl/openib/connect/btl_openib_connect_base.c (revision 16937) +++ ompi/mca/btl/openib/connect/btl_openib_connect_base.c (working copy) @@ -50,8 +50,8 @@ */ int ompi_btl_openib_connect_base_open(void) { -int i; -char **temp, *a, *b; +char **temp, *a, *b, *defval; +int i, ret = OMPI_ERROR; /* Make an MCA parameter to select which connect module to use */ temp = NULL; @@ -66,40 +66,23 @@ /* For XRC qps we must to use XOOB connection manager */ if (mca_btl_openib_component.num_xrc_qps > 0) { - mca_base_param_reg_string(&mca_btl_openib_component.super.btl_version, -"connect", -b, false, false, -"xoob", ¶m); -if (0 != strcmp("xoob", param)) { -opal_show_help("help-mpi-btl-openib.txt", -"XRC with wrong OOB", true, -orte_system_info.nodename, -mca_btl_openib_component.num_xrc_qps); -return OMPI_ERROR; -} + defval = "xoob"; } else { /* For all others we should use OOB */ - mca_base_param_reg_string(&mca_btl_openib_component.super.btl_version, -"connect", -b, false, false, -"oob", ¶m); -if (0 != strcmp("oob", param)) { -opal_show_help("help-mpi-btl-openib.txt", -"SRQ or PP with wrong OOB", true, -orte_system_info.nodename, -mca_btl_openib_component.num_srq_qps, -mca_btl_openib_component.num_pp_qps); -return OMPI_ERROR; -} + defval = "oob"; } + mca_base_param_reg_string(&mca_btl_openib_component.super.btl_version, + "connect", b, false, false, defval, ¶m); + /* Call the open function on all the connect modules */ for (i = 0; NULL != all[i]; ++i) { -if (NULL != all[i]->bcf_open) { -all[i]->bcf_open(); +if (0 == strcmp(all[i]->bcf_name, param)) { +ret = all[i]->bcf_open(); + break; } } -return OMPI_S
[OMPI devel] [PATCH] openib: clean-up connect to allow for new cm's
Currently, alternate CMs cannot be called because ompi_btl_openib_connect_base_open forces a choice of either oob or xoob (and goes into an erroneous error path if you pick something else). This patch reorganizes ompi_btl_openib_connect_base_open so that new functions can easily be added. New Open functions were added to oob and xoob for the error handling. I tested calling oob, xoob, and rdma_cm. oob happily allows connections to be established and throws no errors. xoob fails because ompi does not have it compiled in (and I have no connectx cards). rdma_cm calls the empty hooks and exits without connecting (thus throwing non-connection errors). All expected behavior. Since this patch fixes the existing behavior, and is not necessarily tied to my implementing of rdma_cm, I think it is acceptable to go in now. Thanks, Jon Index: ompi/mca/btl/openib/connect/btl_openib_connect_base.c === --- ompi/mca/btl/openib/connect/btl_openib_connect_base.c (revision 16937) +++ ompi/mca/btl/openib/connect/btl_openib_connect_base.c (working copy) @@ -50,8 +50,8 @@ */ int ompi_btl_openib_connect_base_open(void) { -int i; -char **temp, *a, *b; +char **temp, *a, *b, *defval; +int i, ret = OMPI_ERROR; /* Make an MCA parameter to select which connect module to use */ temp = NULL; @@ -66,40 +66,23 @@ /* For XRC qps we must to use XOOB connection manager */ if (mca_btl_openib_component.num_xrc_qps > 0) { -mca_base_param_reg_string(&mca_btl_openib_component.super.btl_version, -"connect", -b, false, false, -"xoob", ¶m); -if (0 != strcmp("xoob", param)) { -opal_show_help("help-mpi-btl-openib.txt", -"XRC with wrong OOB", true, -orte_system_info.nodename, -mca_btl_openib_component.num_xrc_qps); -return OMPI_ERROR; -} + defval = "xoob"; } else { /* For all others we should use OOB */ -mca_base_param_reg_string(&mca_btl_openib_component.super.btl_version, -"connect", -b, false, false, -"oob", ¶m); -if (0 != strcmp("oob", param)) { -opal_show_help("help-mpi-btl-openib.txt", -"SRQ or PP with wrong OOB", true, -orte_system_info.nodename, -mca_btl_openib_component.num_srq_qps, -mca_btl_openib_component.num_pp_qps); -return OMPI_ERROR; -} + defval = "oob"; } +mca_base_param_reg_string(&mca_btl_openib_component.super.btl_version, + "connect", b, false, false, defval, ¶m); + /* Call the open function on all the connect modules */ for (i = 0; NULL != all[i]; ++i) { -if (NULL != all[i]->bcf_open) { -all[i]->bcf_open(); +if (0 == strcmp(all[i]->bcf_name, param)) { +ret = all[i]->bcf_open(); + break; } } -return OMPI_SUCCESS; +return ret; } Index: ompi/mca/btl/openib/connect/btl_openib_connect_ibcm.c === --- ompi/mca/btl/openib/connect/btl_openib_connect_ibcm.c (revision 16937) +++ ompi/mca/btl/openib/connect/btl_openib_connect_ibcm.c (working copy) @@ -28,11 +28,7 @@ static int ibcm_open(void) { -mca_base_param_reg_int(&mca_btl_openib_component.super.btl_version, - "btl_openib_connect_ibcm_foo", - "A dummy help message", false, false, - 17, NULL); - +printf("ibcm open\n"); return OMPI_SUCCESS; } Index: ompi/mca/btl/openib/connect/btl_openib_connect_oob.c === --- ompi/mca/btl/openib/connect/btl_openib_connect_oob.c(revision 16937) +++ ompi/mca/btl/openib/connect/btl_openib_connect_oob.c(working copy) @@ -22,6 +22,8 @@ #include "ompi_config.h" +#include "opal/util/show_help.h" + #include "orte/mca/ns/base/base.h" #include "orte/mca/oob/base/base.h" #include "orte/mca/rml/rml.h" @@ -39,6 +41,7 @@ ENDPOINT_CONNECT_ACK } connect_message_type_t; +static int oob_open(void); static int oob_init(void); static int oob_start_connect(mca_btl_base_endpoint_t *e); static int oob_finalize(void); @@ -67,8 +70,8 @@ */ ompi_btl_openib_connect_base_funcs_t ompi_btl_openib_connect_oob = { "oob", -/* No need for "open */ -NULL, +/* Open */ +oob_open, /* Init */ oob_init, /* Connect */ @@ -78,6 +81,23 @@ }; /* + * Open function. + */ +static int oob_open(void) +{ +if (mca_btl_openib_component.num_xrc_qps > 0) { +opal_show_help("help-mpi-btl-openib.txt", +"SRQ or PP with wrong OOB", true, +orte_system_info.nodename, +