Re: [OMPI devel] [PATCH] openib: clean-up connect to allow for new cm's

2007-12-13 Thread Gleb Natapov
On Thu, Dec 13, 2007 at 10:49:45AM +0200, Pavel Shamis (Pasha) wrote:
>> Because we want to support mixed setups and create XRC between nodes that
>> support it and RC between all other nodes.
>>   
> Ok, sounds reasonable for me. Just need make sure that the parameters name 
> will be user friendly.
> Some thing like --mca enable-xrc that will cause to XOOB priority be 
> highest (and not something like --mca xoob 10 :-))
>
You can be sure that mca parameter name will be as cryptic as possible.

--
Gleb.


Re: [OMPI devel] [PATCH] openib: clean-up connect to allow for new cm's

2007-12-13 Thread Pavel Shamis (Pasha)

Gleb Natapov wrote:

On Wed, Dec 12, 2007 at 04:08:31PM +0200, Pavel Shamis (Pasha) wrote:
  

Gleb Natapov wrote:


On Wed, Dec 12, 2007 at 03:37:26PM +0200, Pavel Shamis (Pasha) wrote:
  
  

Gleb Natapov wrote:



On Tue, Dec 11, 2007 at 08:16:07PM -0500, Jeff Squyres wrote:

  
Isn't there a better way somehow?  Perhaps we should have "select"  
call *all* the functions and accept back a priority.  The one with the  
highest priority then wins.  This is quite similar to much of the  
other selection logic in OMPI.


Sidenote: Keep in mind that there are some changes coming to select  
CPCs on a per-endpoint basis (I can't look up the trac ticket right  
now...).  This makes things a little complicated -- do we need  
btl_openib_cpc_include and btl_openib_cpc_exclude MCA params to  
include/exclude CPCs (because you might need more than one CPC in a  
single job)?  That wouldn't be hard to do.


But then what to do about if someone sets to use some XRC QPs and  
selects to use OOB or RDMA CM?  How do we catch this and print an  
error?  It doesn't seem right to put the "if num_xrc_qps>0" check in  
every CPC.  What happens if you try to make an XRC QP when not using  
xoob?  Where is the error detected and what kind of error message do  
we print?





In my opinion "X" notation for QP specification should be removed. I
didn't want this to prevent XRC merging so I haven't raced this point.
It is enough to have two types of QPs "P" - SW credit management "S" -
HW credit management.   
  

How will you decide witch QP type to use ? (SRQ or XRC)



If both sides support XOOB and priority of XOOB is higher then all other 
CPC

then create XRC, otherwise use regular RC.
  
  

If some body have connectX hca but  he want to use SRQ and not XRC ?


This will be the default. (prio of OOB will be bigger than of XOOB), but
if uses will want to use XRC it will increase XOOB priority by
specifying MCA parameter.

  
I guess anyway we will be need some additional parameter that will allow 
enable/disable XRC, correct ? (So why just not leave the X qp type ?)


Because we want to support mixed setups and create XRC between nodes that
support it and RC between all other nodes.
  
Ok, sounds reasonable for me. 
Just need make sure that the parameters name will be user friendly.
Some thing like --mca enable-xrc that will cause to XOOB priority be 
highest (and not something like --mca xoob 10 :-))


Pasha


Re: [OMPI devel] [PATCH] openib: clean-up connect to allow for new cm's

2007-12-12 Thread Jon Mason
On Wed, Dec 12, 2007 at 01:35:33PM -0500, Jeff Squyres wrote:
> I agree with Gleb's idea.  More below.
> 
> On Dec 12, 2007, at 12:24 PM, Jon Mason wrote:
> 
> > Ok, glad I got this conversation started :)
> >
> > So, we need a slight redesign to determine the cm method (unless  
> > forced
> > via commandline arg).  This can be determined by calling all the
> > individual open routines, and having them return a priority based on
> > their ability to function.  For example, the xoob open function will
> > check the mca_btl_openib_component.num_xrc_qps for a non-zero value  
> > and
> > return the priority based on that.
> >
> > Of course, if forced then it will only call that specific open  
> > function
> > and throw any relevant errors as necessary.
> 
> 
> Close, but I'd do it slightly differently:
> 
> - open() is *only* used for creating MCA params.  It's a bad name, but  
> it's unfortunately the precedent throughout the rest of the OMPI code  
> base.  :-\ (it has roots in the ompi_info command -- ompi_info has to  
> be able to get a full list of all MCA params regardless of what  
> hardware is available on the current system)
> 
> - during the openib component startup, we should add a query()  
> function that does what you describe.  I.e., we query() each endpoint  
> and it either returns a valid priority or "I don't want to be used  
> with this endpoint."
> 
> - there should be a priority MCA param for every CPC.  Perhaps the CPC  
> base can handle this...?  I'm not sure; it may need to be down in each  
> CPC.
> 
> - the list of CPCs that want to run with each endpoint are ordered by  
> priority (ties will be arbitrarily, but deterministically, broken --  
> alphabetical?) and sent around in the modex.
> 
> - when a new connection comes up, the intersection of the CPC lists  
> for the near and far endpoints is computed and the highest priority  
> CPC is used to make the connection.  Since everyone has the same data,  
> both sides will make the same decision.
> 
> - CPC init may have to change a bit -- more than one CPC may be used  
> for a given endpoint because both the local module and the remote  
> module are involved in making the decision of which CPC is used.
> 
> After this first cut is done, we should probably also add  
> btl_openib_cpc_include and btl_openib_cpc_exclude as I described in a  
> prior mail (just like *_if_include and *_if_exclude in several BTLs)  
> to include/exclude sets of CPCs at run-time.
> 
> > If this sounds sane, then let me know and I'll start coding it up.
> 
> 
> This has actually been on my to-do list for too long; if you have the  
> cycles to do this now, it would be great...

Since I need to have it done before I can do my rdma_cm bits, I'll add
this to my queue and get started immediately.

> 
> I'll make you a bargain: if you do the stuff above, I'll add in the  
> configure/build mojo for selectively compiling the XOOB CPC or not  
> (depending on whether the underlying system has XRC library support or  
> not).  Cool?
> 
> Let's go off on a /tmp-public branch for this so we don't hose the  
> trunk...  I just made /tmp-public/openib-cpc.
> 
> -- 
> Jeff Squyres
> Cisco Systems
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel


Re: [OMPI devel] [PATCH] openib: clean-up connect to allow for new cm's

2007-12-12 Thread Jeff Squyres

I agree with Gleb's idea.  More below.

On Dec 12, 2007, at 12:24 PM, Jon Mason wrote:


Ok, glad I got this conversation started :)

So, we need a slight redesign to determine the cm method (unless  
forced

via commandline arg).  This can be determined by calling all the
individual open routines, and having them return a priority based on
their ability to function.  For example, the xoob open function will
check the mca_btl_openib_component.num_xrc_qps for a non-zero value  
and

return the priority based on that.

Of course, if forced then it will only call that specific open  
function

and throw any relevant errors as necessary.



Close, but I'd do it slightly differently:

- open() is *only* used for creating MCA params.  It's a bad name, but  
it's unfortunately the precedent throughout the rest of the OMPI code  
base.  :-\ (it has roots in the ompi_info command -- ompi_info has to  
be able to get a full list of all MCA params regardless of what  
hardware is available on the current system)


- during the openib component startup, we should add a query()  
function that does what you describe.  I.e., we query() each endpoint  
and it either returns a valid priority or "I don't want to be used  
with this endpoint."


- there should be a priority MCA param for every CPC.  Perhaps the CPC  
base can handle this...?  I'm not sure; it may need to be down in each  
CPC.


- the list of CPCs that want to run with each endpoint are ordered by  
priority (ties will be arbitrarily, but deterministically, broken --  
alphabetical?) and sent around in the modex.


- when a new connection comes up, the intersection of the CPC lists  
for the near and far endpoints is computed and the highest priority  
CPC is used to make the connection.  Since everyone has the same data,  
both sides will make the same decision.


- CPC init may have to change a bit -- more than one CPC may be used  
for a given endpoint because both the local module and the remote  
module are involved in making the decision of which CPC is used.


After this first cut is done, we should probably also add  
btl_openib_cpc_include and btl_openib_cpc_exclude as I described in a  
prior mail (just like *_if_include and *_if_exclude in several BTLs)  
to include/exclude sets of CPCs at run-time.



If this sounds sane, then let me know and I'll start coding it up.



This has actually been on my to-do list for too long; if you have the  
cycles to do this now, it would be great...


I'll make you a bargain: if you do the stuff above, I'll add in the  
configure/build mojo for selectively compiling the XOOB CPC or not  
(depending on whether the underlying system has XRC library support or  
not).  Cool?


Let's go off on a /tmp-public branch for this so we don't hose the  
trunk...  I just made /tmp-public/openib-cpc.


--
Jeff Squyres
Cisco Systems


Re: [OMPI devel] [PATCH] openib: clean-up connect to allow for new cm's

2007-12-12 Thread Jon Mason
Ok, glad I got this conversation started :)

So, we need a slight redesign to determine the cm method (unless forced
via commandline arg).  This can be determined by calling all the
individual open routines, and having them return a priority based on
their ability to function.  For example, the xoob open function will
check the mca_btl_openib_component.num_xrc_qps for a non-zero value and
return the priority based on that.

Of course, if forced then it will only call that specific open function
and throw any relevant errors as necessary.

If this sounds sane, then let me know and I'll start coding it up.

Thanks,
Jon


Re: [OMPI devel] [PATCH] openib: clean-up connect to allow for new cm's

2007-12-12 Thread Gleb Natapov
On Wed, Dec 12, 2007 at 04:08:31PM +0200, Pavel Shamis (Pasha) wrote:
> Gleb Natapov wrote:
>> On Wed, Dec 12, 2007 at 03:37:26PM +0200, Pavel Shamis (Pasha) wrote:
>>   
>>> Gleb Natapov wrote:
>>> 
 On Tue, Dec 11, 2007 at 08:16:07PM -0500, Jeff Squyres wrote:
 
> Isn't there a better way somehow?  Perhaps we should have "select"  
> call *all* the functions and accept back a priority.  The one with the  
> highest priority then wins.  This is quite similar to much of the  
> other selection logic in OMPI.
>
> Sidenote: Keep in mind that there are some changes coming to select  
> CPCs on a per-endpoint basis (I can't look up the trac ticket right  
> now...).  This makes things a little complicated -- do we need  
> btl_openib_cpc_include and btl_openib_cpc_exclude MCA params to  
> include/exclude CPCs (because you might need more than one CPC in a  
> single job)?  That wouldn't be hard to do.
>
> But then what to do about if someone sets to use some XRC QPs and  
> selects to use OOB or RDMA CM?  How do we catch this and print an  
> error?  It doesn't seem right to put the "if num_xrc_qps>0" check in  
> every CPC.  What happens if you try to make an XRC QP when not using  
> xoob?  Where is the error detected and what kind of error message do  
> we print?
>
> 
 In my opinion "X" notation for QP specification should be removed. I
 didn't want this to prevent XRC merging so I haven't raced this point.
 It is enough to have two types of QPs "P" - SW credit management "S" -
 HW credit management.   
>>> How will you decide witch QP type to use ? (SRQ or XRC)
>>>
>>> 
>> If both sides support XOOB and priority of XOOB is higher then all other 
>> CPC
>> then create XRC, otherwise use regular RC.
>>   
> If some body have connectX hca but  he want to use SRQ and not XRC ?
This will be the default. (prio of OOB will be bigger than of XOOB), but
if uses will want to use XRC it will increase XOOB priority by
specifying MCA parameter.

> I guess anyway we will be need some additional parameter that will allow 
> enable/disable XRC, correct ? (So why just not leave the X qp type ?)
Because we want to support mixed setups and create XRC between nodes that
support it and RC between all other nodes.

--
Gleb.


Re: [OMPI devel] [PATCH] openib: clean-up connect to allow for new cm's

2007-12-12 Thread Pavel Shamis (Pasha)

Gleb Natapov wrote:

On Wed, Dec 12, 2007 at 03:37:26PM +0200, Pavel Shamis (Pasha) wrote:
  

Gleb Natapov wrote:


On Tue, Dec 11, 2007 at 08:16:07PM -0500, Jeff Squyres wrote:
  
  
Isn't there a better way somehow?  Perhaps we should have "select"  
call *all* the functions and accept back a priority.  The one with the  
highest priority then wins.  This is quite similar to much of the  
other selection logic in OMPI.


Sidenote: Keep in mind that there are some changes coming to select  
CPCs on a per-endpoint basis (I can't look up the trac ticket right  
now...).  This makes things a little complicated -- do we need  
btl_openib_cpc_include and btl_openib_cpc_exclude MCA params to  
include/exclude CPCs (because you might need more than one CPC in a  
single job)?  That wouldn't be hard to do.


But then what to do about if someone sets to use some XRC QPs and  
selects to use OOB or RDMA CM?  How do we catch this and print an  
error?  It doesn't seem right to put the "if num_xrc_qps>0" check in  
every CPC.  What happens if you try to make an XRC QP when not using  
xoob?  Where is the error detected and what kind of error message do  
we print?





In my opinion "X" notation for QP specification should be removed. I
didn't want this to prevent XRC merging so I haven't raced this point.
It is enough to have two types of QPs "P" - SW credit management "S" -
HW credit management. 
  

How will you decide witch QP type to use ? (SRQ or XRC)



If both sides support XOOB and priority of XOOB is higher then all other CPC
then create XRC, otherwise use regular RC.
  

If some body have connectX hca but  he want to use SRQ and not XRC ?
I guess anyway we will be need some additional parameter that will allow 
enable/disable XRC, correct ? (So why just not leave the X qp type ?)




Re: [OMPI devel] [PATCH] openib: clean-up connect to allow for new cm's

2007-12-12 Thread Gleb Natapov
On Wed, Dec 12, 2007 at 03:37:26PM +0200, Pavel Shamis (Pasha) wrote:
> Gleb Natapov wrote:
> > On Tue, Dec 11, 2007 at 08:16:07PM -0500, Jeff Squyres wrote:
> >   
> >> Isn't there a better way somehow?  Perhaps we should have "select"  
> >> call *all* the functions and accept back a priority.  The one with the  
> >> highest priority then wins.  This is quite similar to much of the  
> >> other selection logic in OMPI.
> >>
> >> Sidenote: Keep in mind that there are some changes coming to select  
> >> CPCs on a per-endpoint basis (I can't look up the trac ticket right  
> >> now...).  This makes things a little complicated -- do we need  
> >> btl_openib_cpc_include and btl_openib_cpc_exclude MCA params to  
> >> include/exclude CPCs (because you might need more than one CPC in a  
> >> single job)?  That wouldn't be hard to do.
> >>
> >> But then what to do about if someone sets to use some XRC QPs and  
> >> selects to use OOB or RDMA CM?  How do we catch this and print an  
> >> error?  It doesn't seem right to put the "if num_xrc_qps>0" check in  
> >> every CPC.  What happens if you try to make an XRC QP when not using  
> >> xoob?  Where is the error detected and what kind of error message do  
> >> we print?
> >>
> >> 
> > In my opinion "X" notation for QP specification should be removed. I
> > didn't want this to prevent XRC merging so I haven't raced this point.
> > It is enough to have two types of QPs "P" - SW credit management "S" -
> > HW credit management. 
> How will you decide witch QP type to use ? (SRQ or XRC)
> 
If both sides support XOOB and priority of XOOB is higher then all other CPC
then create XRC, otherwise use regular RC.

--
Gleb.


Re: [OMPI devel] [PATCH] openib: clean-up connect to allow for new cm's

2007-12-12 Thread Pavel Shamis (Pasha)

Gleb Natapov wrote:

On Tue, Dec 11, 2007 at 08:16:07PM -0500, Jeff Squyres wrote:
  
Isn't there a better way somehow?  Perhaps we should have "select"  
call *all* the functions and accept back a priority.  The one with the  
highest priority then wins.  This is quite similar to much of the  
other selection logic in OMPI.


Sidenote: Keep in mind that there are some changes coming to select  
CPCs on a per-endpoint basis (I can't look up the trac ticket right  
now...).  This makes things a little complicated -- do we need  
btl_openib_cpc_include and btl_openib_cpc_exclude MCA params to  
include/exclude CPCs (because you might need more than one CPC in a  
single job)?  That wouldn't be hard to do.


But then what to do about if someone sets to use some XRC QPs and  
selects to use OOB or RDMA CM?  How do we catch this and print an  
error?  It doesn't seem right to put the "if num_xrc_qps>0" check in  
every CPC.  What happens if you try to make an XRC QP when not using  
xoob?  Where is the error detected and what kind of error message do  
we print?




In my opinion "X" notation for QP specification should be removed. I
didn't want this to prevent XRC merging so I haven't raced this point.
It is enough to have two types of QPs "P" - SW credit management "S" -
HW credit management. 

How will you decide witch QP type to use ? (SRQ or XRC)


I think connection management should work like
this: Each BTL knows what type of CPC it can use and it should share
this info during modex stage. During connection establishment modex info
is used to figure out the list of CPCs that both endpoints support and one
with highest prio is selected.
  

ok for me.

--
Gleb.
___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel

  




Re: [OMPI devel] [PATCH] openib: clean-up connect to allow for new cm's

2007-12-12 Thread Gleb Natapov
On Tue, Dec 11, 2007 at 08:16:07PM -0500, Jeff Squyres wrote:
> Isn't there a better way somehow?  Perhaps we should have "select"  
> call *all* the functions and accept back a priority.  The one with the  
> highest priority then wins.  This is quite similar to much of the  
> other selection logic in OMPI.
> 
> Sidenote: Keep in mind that there are some changes coming to select  
> CPCs on a per-endpoint basis (I can't look up the trac ticket right  
> now...).  This makes things a little complicated -- do we need  
> btl_openib_cpc_include and btl_openib_cpc_exclude MCA params to  
> include/exclude CPCs (because you might need more than one CPC in a  
> single job)?  That wouldn't be hard to do.
> 
> But then what to do about if someone sets to use some XRC QPs and  
> selects to use OOB or RDMA CM?  How do we catch this and print an  
> error?  It doesn't seem right to put the "if num_xrc_qps>0" check in  
> every CPC.  What happens if you try to make an XRC QP when not using  
> xoob?  Where is the error detected and what kind of error message do  
> we print?
> 
In my opinion "X" notation for QP specification should be removed. I
didn't want this to prevent XRC merging so I haven't raced this point.
It is enough to have two types of QPs "P" - SW credit management "S" -
HW credit management. I think connection management should work like
this: Each BTL knows what type of CPC it can use and it should share
this info during modex stage. During connection establishment modex info
is used to figure out the list of CPCs that both endpoints support and one
with highest prio is selected.

--
Gleb.


Re: [OMPI devel] [PATCH] openib: clean-up connect to allow for new cm's

2007-12-12 Thread Jeff Squyres

On Dec 12, 2007, at 5:13 AM, Pavel Shamis (Pasha) wrote:


Hmm.  I don't think that we want to put knowledge of XRC in the OOB
CPC (and vice versa).  That seems like an abstraction violation.

I didn't like that XRC knowledge was put in the connect base either,
but I was too busy to argue with it.  :-)

Isn't there a better way somehow?  Perhaps we should have "select"
call *all* the functions and accept back a priority.  The one with  
the

highest priority then wins.  This is quite similar to much of the
other selection logic in OMPI.

Sidenote: Keep in mind that there are some changes coming to select
CPCs on a per-endpoint basis (I can't look up the trac ticket right
now...).  This makes things a little complicated -- do we need
btl_openib_cpc_include and btl_openib_cpc_exclude MCA params to
include/exclude CPCs (because you might need more than one CPC in a
single job)?  That wouldn't be hard to do.

But then what to do about if someone sets to use some XRC QPs and
selects to use OOB or RDMA CM?
Error message will be reported , that for using XRC you _must_  
select xoob.


I understand that that is what it does today; I was asking my somewhat- 
rhetorical question with the above text in mind (that we remove the  
abstraction violations -- remove knowledge of XRC from the OOB CPC,  
etc.).



How do we catch this and print an
error?  It doesn't seem right to put the "if num_xrc_qps>0" check in
every CPC.  What happens if you try to make an XRC QP when not using
xoob?



Where is the error detected and what kind of error message do
we print?


I would like to remind 2 things:
1. XRC little bit change QP logic. We creates one XRC qp for send and
one for recv. As result
it require absolutely different oob mechanism.
2. Current implementation doesn't allow to run with XRC  + RC (or srq)
and I don't think that we need such mixed configuration
support at all.

So as results the the XRC may work only with XOOB. If you will try to
run it with oob error message will be reported.
As well if you will try to run !(XRC) with XOOB error message will be
reported too.

The check is located in ompi_btl_openib_connect_base_open.


I understand all of that.  I think the question is if there is a way  
to de-centralize these checks such that the XOOB CPC can be the one  
that figures this stuff out (for example) rather than having to put  
this in the base.


The original code in the function used oob as default connection  
method.

I changed it to check
in which mode we are running (xrc enabled/disabled) and make xoob
default connection manager for xrc mode
and oob make default for not xrc mode.


Right -- this is problematic for adding IBCM and RDMA CM; that's Jon's  
point.



I  not sure that oob cpc is the best place for the logic.
also I don't think that the "select + priority" solution will resolve
the dependences problem:
XRC enabled -> xoob
XRC disabled -> oob , cm.

We may push the logic outside of cpc  and pass to
ompi_btl_openib_connect_base_open()
witch connection manger we want to use. I guess that the change also
will be usefull for future "CPCs on a per-endpoint basis" changes.


From an abstraction point of view, it would be nice to get all this  
CPC-specific information out of the base and into the CPCs that they  
belong to.



Also, I'm not sure why the #if/#else is there for xoob (i.e., having
empty/printf functions there when XRC support is compiled out) -- if
xoob was disabled during compilation, then it simply should not be
compiled and therefore not be there at all at run-time.  If a user
selects the xoob CPC, then we should print a message from the base
that that CPC doesn't exist in the installation.  Correspondingly, we
can make an info MCA param in the btl openib that shows which CPCs  
are
available (we already have this information -- it's easy enough to  
put

this in an info MCA param).


Sounds reasonable for me.

Pasha.


On Dec 11, 2007, at 6:59 PM, Jon Mason wrote:



Currently, alternate CMs cannot be called because
ompi_btl_openib_connect_base_open forces a choice of either oob or
xoob
(and goes into an erroneous error path if you pick something else).
This patch reorganizes ompi_btl_openib_connect_base_open so that new
functions can easily be added.  New Open functions were added to oob
and xoob for the error handling.

I tested calling oob, xoob, and rdma_cm.  oob happily allows
connections
to be established and throws no errors.  xoob fails because ompi  
does
not have it compiled in (and I have no connectx cards).  rdma_cm  
calls

the empty hooks and exits without connecting (thus throwing
non-connection errors).  All expected behavior.

Since this patch fixes the existing behavior, and is not necessarily
tied to my implementing of rdma_cm, I think it is acceptable to go  
in

now.

Thanks,
Jon

Index: ompi/mca/btl/openib/connect/btl_openib_connect_base.c
===
--- ompi/mca/btl/openib/connect/btl_openib_connect_ba

Re: [OMPI devel] [PATCH] openib: clean-up connect to allow for new cm's

2007-12-12 Thread Pavel Shamis (Pasha)

Jeff Squyres wrote:
Hmm.  I don't think that we want to put knowledge of XRC in the OOB  
CPC (and vice versa).  That seems like an abstraction violation.


I didn't like that XRC knowledge was put in the connect base either,  
but I was too busy to argue with it.  :-)


Isn't there a better way somehow?  Perhaps we should have "select"  
call *all* the functions and accept back a priority.  The one with the  
highest priority then wins.  This is quite similar to much of the  
other selection logic in OMPI.


Sidenote: Keep in mind that there are some changes coming to select  
CPCs on a per-endpoint basis (I can't look up the trac ticket right  
now...).  This makes things a little complicated -- do we need  
btl_openib_cpc_include and btl_openib_cpc_exclude MCA params to  
include/exclude CPCs (because you might need more than one CPC in a  
single job)?  That wouldn't be hard to do.


But then what to do about if someone sets to use some XRC QPs and  
selects to use OOB or RDMA CM?  

Error message will be reported , that for using XRC you _must_ select xoob.
How do we catch this and print an  
error?  It doesn't seem right to put the "if num_xrc_qps>0" check in  
every CPC.  What happens if you try to make an XRC QP when not using  
xoob?  


Where is the error detected and what kind of error message do  
we print?
  

I would like to remind 2 things:
1. XRC little bit change QP logic. We creates one XRC qp for send and 
one for recv. As result

it require absolutely different oob mechanism.
2. Current implementation doesn't allow to run with XRC  + RC (or srq) 
and I don't think that we need such mixed configuration

support at all.

So as results the the XRC may work only with XOOB. If you will try to 
run it with oob error message will be reported.
As well if you will try to run !(XRC) with XOOB error message will be 
reported too.


The check is located in ompi_btl_openib_connect_base_open.

The original code in the function used oob as default connection method. 
I changed it to check
in which mode we are running (xrc enabled/disabled) and make xoob 
default connection manager for xrc mode

and oob make default for not xrc mode.

I  not sure that oob cpc is the best place for the logic.
also I don't think that the "select + priority" solution will resolve 
the dependences problem:

XRC enabled -> xoob
XRC disabled -> oob , cm.

We may push the logic outside of cpc  and pass to 
ompi_btl_openib_connect_base_open()
witch connection manger we want to use. I guess that the change also 
will be usefull for future "CPCs on a per-endpoint basis" changes.


Also, I'm not sure why the #if/#else is there for xoob (i.e., having  
empty/printf functions there when XRC support is compiled out) -- if  
xoob was disabled during compilation, then it simply should not be  
compiled and therefore not be there at all at run-time.  If a user  
selects the xoob CPC, then we should print a message from the base  
that that CPC doesn't exist in the installation.  Correspondingly, we  
can make an info MCA param in the btl openib that shows which CPCs are  
available (we already have this information -- it's easy enough to put  
this in an info MCA param).
  

Sounds reasonable for me.

Pasha.


On Dec 11, 2007, at 6:59 PM, Jon Mason wrote:

  

Currently, alternate CMs cannot be called because
ompi_btl_openib_connect_base_open forces a choice of either oob or  
xoob

(and goes into an erroneous error path if you pick something else).
This patch reorganizes ompi_btl_openib_connect_base_open so that new
functions can easily be added.  New Open functions were added to oob
and xoob for the error handling.

I tested calling oob, xoob, and rdma_cm.  oob happily allows  
connections

to be established and throws no errors.  xoob fails because ompi does
not have it compiled in (and I have no connectx cards).  rdma_cm calls
the empty hooks and exits without connecting (thus throwing
non-connection errors).  All expected behavior.

Since this patch fixes the existing behavior, and is not necessarily
tied to my implementing of rdma_cm, I think it is acceptable to go in
now.

Thanks,
Jon

Index: ompi/mca/btl/openib/connect/btl_openib_connect_base.c
===
--- ompi/mca/btl/openib/connect/btl_openib_connect_base.c	(revision  
16937)
+++ ompi/mca/btl/openib/connect/btl_openib_connect_base.c	(working  
copy)

@@ -50,8 +50,8 @@
 */
int ompi_btl_openib_connect_base_open(void)
{
-int i;
-char **temp, *a, *b;
+char **temp, *a, *b, *defval;
+int i, ret = OMPI_ERROR;

/* Make an MCA parameter to select which connect module to use */
temp = NULL;
@@ -66,40 +66,23 @@

/* For XRC qps we must to use XOOB connection manager */
if (mca_btl_openib_component.num_xrc_qps > 0) {
- 
mca_base_param_reg_string(&mca_btl_openib_component.super.btl_version,

-"connect",
-b, false, false,
-"xoob", ¶m);
-   

Re: [OMPI devel] [PATCH] openib: clean-up connect to allow for new cm's

2007-12-11 Thread Jeff Squyres
Hmm.  I don't think that we want to put knowledge of XRC in the OOB  
CPC (and vice versa).  That seems like an abstraction violation.


I didn't like that XRC knowledge was put in the connect base either,  
but I was too busy to argue with it.  :-)


Isn't there a better way somehow?  Perhaps we should have "select"  
call *all* the functions and accept back a priority.  The one with the  
highest priority then wins.  This is quite similar to much of the  
other selection logic in OMPI.


Sidenote: Keep in mind that there are some changes coming to select  
CPCs on a per-endpoint basis (I can't look up the trac ticket right  
now...).  This makes things a little complicated -- do we need  
btl_openib_cpc_include and btl_openib_cpc_exclude MCA params to  
include/exclude CPCs (because you might need more than one CPC in a  
single job)?  That wouldn't be hard to do.


But then what to do about if someone sets to use some XRC QPs and  
selects to use OOB or RDMA CM?  How do we catch this and print an  
error?  It doesn't seem right to put the "if num_xrc_qps>0" check in  
every CPC.  What happens if you try to make an XRC QP when not using  
xoob?  Where is the error detected and what kind of error message do  
we print?


Also, I'm not sure why the #if/#else is there for xoob (i.e., having  
empty/printf functions there when XRC support is compiled out) -- if  
xoob was disabled during compilation, then it simply should not be  
compiled and therefore not be there at all at run-time.  If a user  
selects the xoob CPC, then we should print a message from the base  
that that CPC doesn't exist in the installation.  Correspondingly, we  
can make an info MCA param in the btl openib that shows which CPCs are  
available (we already have this information -- it's easy enough to put  
this in an info MCA param).



On Dec 11, 2007, at 6:59 PM, Jon Mason wrote:


Currently, alternate CMs cannot be called because
ompi_btl_openib_connect_base_open forces a choice of either oob or  
xoob

(and goes into an erroneous error path if you pick something else).
This patch reorganizes ompi_btl_openib_connect_base_open so that new
functions can easily be added.  New Open functions were added to oob
and xoob for the error handling.

I tested calling oob, xoob, and rdma_cm.  oob happily allows  
connections

to be established and throws no errors.  xoob fails because ompi does
not have it compiled in (and I have no connectx cards).  rdma_cm calls
the empty hooks and exits without connecting (thus throwing
non-connection errors).  All expected behavior.

Since this patch fixes the existing behavior, and is not necessarily
tied to my implementing of rdma_cm, I think it is acceptable to go in
now.

Thanks,
Jon

Index: ompi/mca/btl/openib/connect/btl_openib_connect_base.c
===
--- ompi/mca/btl/openib/connect/btl_openib_connect_base.c	(revision  
16937)
+++ ompi/mca/btl/openib/connect/btl_openib_connect_base.c	(working  
copy)

@@ -50,8 +50,8 @@
 */
int ompi_btl_openib_connect_base_open(void)
{
-int i;
-char **temp, *a, *b;
+char **temp, *a, *b, *defval;
+int i, ret = OMPI_ERROR;

/* Make an MCA parameter to select which connect module to use */
temp = NULL;
@@ -66,40 +66,23 @@

/* For XRC qps we must to use XOOB connection manager */
if (mca_btl_openib_component.num_xrc_qps > 0) {
- 
mca_base_param_reg_string(&mca_btl_openib_component.super.btl_version,

-"connect",
-b, false, false,
-"xoob", ¶m);
-if (0 != strcmp("xoob", param)) {
-opal_show_help("help-mpi-btl-openib.txt",
-"XRC with wrong OOB", true,
-orte_system_info.nodename,
-mca_btl_openib_component.num_xrc_qps);
-return OMPI_ERROR;
-}
+   defval = "xoob";
} else { /* For all others we should use OOB */
- 
mca_base_param_reg_string(&mca_btl_openib_component.super.btl_version,

-"connect",
-b, false, false,
-"oob", ¶m);
-if (0 != strcmp("oob", param)) {
-opal_show_help("help-mpi-btl-openib.txt",
-"SRQ or PP with wrong OOB", true,
-orte_system_info.nodename,
-mca_btl_openib_component.num_srq_qps,
-mca_btl_openib_component.num_pp_qps);
-return OMPI_ERROR;
-}
+   defval = "oob";
}

+ 
mca_base_param_reg_string(&mca_btl_openib_component.super.btl_version,

+ "connect", b, false, false, defval, ¶m);
+
/* Call the open function on all the connect modules */
for (i = 0; NULL != all[i]; ++i) {
-if (NULL != all[i]->bcf_open) {
-all[i]->bcf_open();
+if (0 == strcmp(all[i]->bcf_name, param)) {
+ret = all[i]->bcf_open();
+   break;
}
}

-return OMPI_S

[OMPI devel] [PATCH] openib: clean-up connect to allow for new cm's

2007-12-11 Thread Jon Mason
Currently, alternate CMs cannot be called because
ompi_btl_openib_connect_base_open forces a choice of either oob or xoob
(and goes into an erroneous error path if you pick something else).
This patch reorganizes ompi_btl_openib_connect_base_open so that new
functions can easily be added.  New Open functions were added to oob
and xoob for the error handling.

I tested calling oob, xoob, and rdma_cm.  oob happily allows connections
to be established and throws no errors.  xoob fails because ompi does
not have it compiled in (and I have no connectx cards).  rdma_cm calls
the empty hooks and exits without connecting (thus throwing
non-connection errors).  All expected behavior.

Since this patch fixes the existing behavior, and is not necessarily
tied to my implementing of rdma_cm, I think it is acceptable to go in
now.  

Thanks,
Jon

Index: ompi/mca/btl/openib/connect/btl_openib_connect_base.c
===
--- ompi/mca/btl/openib/connect/btl_openib_connect_base.c   (revision 16937)
+++ ompi/mca/btl/openib/connect/btl_openib_connect_base.c   (working copy)
@@ -50,8 +50,8 @@
  */
 int ompi_btl_openib_connect_base_open(void)
 {
-int i;
-char **temp, *a, *b;
+char **temp, *a, *b, *defval;
+int i, ret = OMPI_ERROR;

 /* Make an MCA parameter to select which connect module to use */
 temp = NULL;
@@ -66,40 +66,23 @@

 /* For XRC qps we must to use XOOB connection manager */
 if (mca_btl_openib_component.num_xrc_qps > 0) {
-mca_base_param_reg_string(&mca_btl_openib_component.super.btl_version,
-"connect",
-b, false, false,
-"xoob", ¶m);
-if (0 != strcmp("xoob", param)) {
-opal_show_help("help-mpi-btl-openib.txt",
-"XRC with wrong OOB", true,
-orte_system_info.nodename,
-mca_btl_openib_component.num_xrc_qps);
-return OMPI_ERROR;
-}
+   defval = "xoob";
 } else { /* For all others we should use OOB */
-mca_base_param_reg_string(&mca_btl_openib_component.super.btl_version,
-"connect",
-b, false, false,
-"oob", ¶m);
-if (0 != strcmp("oob", param)) {
-opal_show_help("help-mpi-btl-openib.txt",
-"SRQ or PP with wrong OOB", true,
-orte_system_info.nodename,
-mca_btl_openib_component.num_srq_qps,
-mca_btl_openib_component.num_pp_qps);
-return OMPI_ERROR;
-}
+   defval = "oob";
 }

+mca_base_param_reg_string(&mca_btl_openib_component.super.btl_version,
+ "connect", b, false, false, defval, ¶m);
+
 /* Call the open function on all the connect modules */
 for (i = 0; NULL != all[i]; ++i) {
-if (NULL != all[i]->bcf_open) {
-all[i]->bcf_open();
+if (0 == strcmp(all[i]->bcf_name, param)) {
+ret = all[i]->bcf_open();
+   break;
 }
 }

-return OMPI_SUCCESS;
+return ret;
 }


Index: ompi/mca/btl/openib/connect/btl_openib_connect_ibcm.c
===
--- ompi/mca/btl/openib/connect/btl_openib_connect_ibcm.c   (revision 16937)
+++ ompi/mca/btl/openib/connect/btl_openib_connect_ibcm.c   (working copy)
@@ -28,11 +28,7 @@

 static int ibcm_open(void)
 {
-mca_base_param_reg_int(&mca_btl_openib_component.super.btl_version,
-   "btl_openib_connect_ibcm_foo",
-   "A dummy help message", false, false,
-   17, NULL);
-
+printf("ibcm open\n");
 return OMPI_SUCCESS;
 }

Index: ompi/mca/btl/openib/connect/btl_openib_connect_oob.c
===
--- ompi/mca/btl/openib/connect/btl_openib_connect_oob.c(revision 16937)
+++ ompi/mca/btl/openib/connect/btl_openib_connect_oob.c(working copy)
@@ -22,6 +22,8 @@

 #include "ompi_config.h"

+#include "opal/util/show_help.h"
+
 #include "orte/mca/ns/base/base.h"
 #include "orte/mca/oob/base/base.h"
 #include "orte/mca/rml/rml.h"
@@ -39,6 +41,7 @@
 ENDPOINT_CONNECT_ACK
 } connect_message_type_t;

+static int oob_open(void);
 static int oob_init(void);
 static int oob_start_connect(mca_btl_base_endpoint_t *e);
 static int oob_finalize(void);
@@ -67,8 +70,8 @@
  */
 ompi_btl_openib_connect_base_funcs_t ompi_btl_openib_connect_oob = {
 "oob",
-/* No need for "open */
-NULL,
+/* Open */
+oob_open,
 /* Init */
 oob_init,
 /* Connect */
@@ -78,6 +81,23 @@
 };

 /*
+ * Open function.
+ */
+static int oob_open(void)
+{
+if (mca_btl_openib_component.num_xrc_qps > 0) {
+opal_show_help("help-mpi-btl-openib.txt",
+"SRQ or PP with wrong OOB", true,
+orte_system_info.nodename,
+