Re: [ofa-general] [RFC] the never ending search for SA scalability

2007-08-15 Thread Hal Rosenstock
On 8/9/07, Sean Hefty <[EMAIL PROTECTED]> wrote:
> I'd like to propose the following change as a simple solution for handling SA
> scalability problems:
>
> Modify the ib_sa module to support an SA LID that's separate from the SM LID.
>
> This concept is supported by the spec through SA redirection; however, I
> propose that we also allow the SA LID to be set manually by an administrator.
> Additional details are below.
>
> ---
>
> The SA LID can be set to a local or remote LID - it doesn't matter to the
> kernel.  All SA MADs (PR queries, MC joins, event registration, etc.) would be
> sent to that destination for processing.
> Initially, I envision a user space library capable of responding to PR
> queries, but it could be expanded to respond to other types of requests.  How
> the library responds to requests (forwarding them to the SM/SA, using lookup
> tables, etc.) is outside the scope of the proposal.

Other than PRs, which SA requests are planned to be handled without
forwarding them on to the "real" SM/SA? Is it just PRs? Even PRs in QoS
mode will be a challenge and will likely be forwarded.

-- Hal
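
A minimal sketch of the split being asked about, assuming the user space responder answers only plain PR queries itself and forwards everything else (including QoS-aware PR queries) to the master SM/SA. The `dispatch` function, the attribute names as strings, and the `qos` flag are hypothetical; the proposal does not define any of them.

```python
# Attributes the local responder is assumed to answer itself.
# The proposal only mentions PR queries as the initial candidate.
LOCAL_ATTRIBUTES = {"PathRecord"}


def dispatch(attribute, request, answer_locally, forward_to_master, qos=False):
    """Answer plain PR queries locally; forward everything else.

    `answer_locally` and `forward_to_master` are callables supplied by the
    caller. QoS-aware requests are always forwarded in this sketch.
    """
    if attribute in LOCAL_ATTRIBUTES and not qos:
        return answer_locally(request)
    return forward_to_master(request)
```

With this shape, extending local handling to another request type is a change to `LOCAL_ATTRIBUTES` plus a local handler, while the default for anything unknown remains forwarding.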

> - Sean
___
general mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


Re: [ofa-general] [RFC] the never ending search for SA scalability

2007-08-14 Thread Or Gerlitz

Roland Dreier wrote:

> > My initial hope is that we can get away with replicating the data, and
> > using the existing SA protocol to keep it in sync.  For the example
> > you gave, SA LID X could forward the set to the master SA before
> > recording it locally.  SA LID Y would check with the master SA for the
> > data if it weren't found local.
>
> In any case, I don't see what having a mechanism for manual SA
> redirect buys you.  I'm probably missing the point, but obviously
> you're not planning on having sysadmins manually set the SA LID on
> each node, which implies having some automatic agent that can set the
> SA LID.  And given the existence of that agent, I don't see the
> difficulty in putting that agent in the real SA and using the existing
> SA redirect protocol to handle setting the SA LID.


Sean,

Knowing how far you have come in trying to solve the order-n-squared
load on the SA caused by all-to-all path queries at MPI job startup, I
really like your idea. It allows what you have called the "local SA" to
be implemented as a user space service which, --if-- installed and
running, offloads the SA in that scenario.

From the many aspects discussed in the previous threads, it --really--
makes a difference whether the "local SA" resides in the kernel or in
user space, and my take is to put it in user space.


Roland,

What this mechanism buys Sean is a solution to the PR problem.

Indeed, sysadmins would have to install/enable the user space rpm that
does the job if they want this feature. Other than that, I don't see
any manual work. A possible design I see here is that, for each port on
which it does PR offload, the "user space SA" would use the SM LID as
the actual SA LID: the place with which it replicates/invalidates/etc.
PR info and to which it "proxies" non-PR queries.

Indeed, with the end in mind, this package may or may not become the
basis for a real distributed SA for those many-K-port IB clusters.
However, assuming it's possible to implement the concept with only a
small change in the kernel ib_sa (i.e. conditioned on a module param,
etc.), I would not block this approach at this stage, and would let
Sean suggest a concrete design.


Or.
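
The order-n-squared load mentioned above can be put in numbers with a short calculation. This is a sketch under stated assumptions: one endpoint per rank, every rank resolves a path to every other rank at startup, and nothing is cached or shared between ranks. The function name is hypothetical.

```python
def all_to_all_path_queries(ranks: int) -> int:
    """Number of PR queries if each rank queries a path to every other rank."""
    if ranks < 0:
        raise ValueError("ranks must be non-negative")
    return ranks * (ranks - 1)


if __name__ == "__main__":
    for n in (64, 1024, 4096):
        print(n, all_to_all_path_queries(n))
```

Under these assumptions a 1024-rank job issues 1024 * 1023 = 1,047,552 queries to a single SA, which is the load a local responder is meant to absorb.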



Re: [ofa-general] [RFC] the never ending search for SA scalability

2007-08-13 Thread Roland Dreier
 > > Don't you need a distributed SA for your idea to work?  Otherwise what
 > > happens if node A sets a service record at SA LID X, and then node B
 > > sends a query for that service record at SA LID Y?
 > 
 > My initial hope is that we can get away with replicating the data, and
 > using the existing SA protocol to keep it in sync.  For the example
 > you gave, SA LID X could forward the set to the master SA before
 > recording it locally.  SA LID Y would check with the master SA for the
 > data if it weren't found local.

You would need a protocol to invalidate any locally cached service
records when a service record is deleted.  And similarly for
everything else that needs to be kept coherent.  That sounds like what
I would call a distributed SA.

In any case, I don't see what having a mechanism for manual SA
redirect buys you.  I'm probably missing the point, but obviously
you're not planning on having sysadmins manually set the SA LID on
each node, which implies having some automatic agent that can set the
SA LID.  And given the existence of that agent, I don't see the
difficulty in putting that agent in the real SA and using the existing
SA redirect protocol to handle setting the SA LID.

 - R.
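
The coherence concern can be made concrete with a toy model. This is only an illustration, not a protocol design: the `Master` and `Replica` classes and the `notify` flag are hypothetical, and a real SA would carry this over MADs rather than method calls.

```python
class Master:
    """Stand-in for the master SA holding the authoritative records."""

    def __init__(self):
        self.records = {}
        self.replicas = []

    def delete(self, key, notify=True):
        self.records.pop(key, None)
        if notify:
            # Without this step, replicas keep serving the deleted record.
            for replica in self.replicas:
                replica.invalidate(key)


class Replica:
    """Stand-in for a secondary SA that caches records it has looked up."""

    def __init__(self, master):
        self.master = master
        self.cache = {}
        master.replicas.append(self)

    def get(self, key):
        if key not in self.cache and key in self.master.records:
            self.cache[key] = self.master.records[key]
        return self.cache.get(key)

    def invalidate(self, key):
        self.cache.pop(key, None)
```

Deleting with `notify=False` leaves the replica answering from a stale cache entry; deleting with `notify=True` removes it. The invalidation traffic is the extra protocol being described.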


Re: [ofa-general] [RFC] the never ending search for SA scalability

2007-08-13 Thread Hal Rosenstock
On 8/13/07, Sean Hefty <[EMAIL PROTECTED]> wrote:
> > In terms of the redirection fields (for the local subnet), right ?
>
> Yes - any field in ClassPortInfo named Redirect*.  I don't see that it
> needs to be limited to the local subnet.
>
> > I was referring to the implications on the partition configuration in
> > terms of the ports for the ib_sa module.
>
> I don't know that an implementation needs to handle different PKeys or
> GRH fields right away, but I don't believe that general SA redirection
> prohibits this.  (I really only care about the LID at the moment.)

Even in this case, it requires that the port(s) running the ib_sa
module be full members of the default partition, whereas before
they might have been limited members.
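
One reading of this constraint, sketched in code: the high-order bit of a P_Key marks full membership, and two limited members of the same partition cannot talk to each other. If the SA is redirected to a port that is not guaranteed to be a full member, a limited-member client may no longer be able to reach it. The helper names below are hypothetical.

```python
FULL_MEMBER_BIT = 0x8000  # high-order bit of the 16-bit P_Key
BASE_MASK = 0x7FFF        # low 15 bits identify the partition


def is_full_member(pkey: int) -> bool:
    return bool(pkey & FULL_MEMBER_BIT)


def can_communicate(pkey_a: int, pkey_b: int) -> bool:
    """Same partition, and at least one side is a full member."""
    same_partition = (pkey_a & BASE_MASK) == (pkey_b & BASE_MASK)
    return same_partition and (is_full_member(pkey_a) or is_full_member(pkey_b))
```

For the default partition, `can_communicate(0x7FFF, 0xFFFF)` holds (limited client, full-member SM port), while `can_communicate(0x7FFF, 0x7FFF)` does not (limited client, limited-member redirect target).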

> - Sean
>


Re: [ofa-general] [RFC] the never ending search for SA scalability

2007-08-13 Thread Sean Hefty

> In terms of the redirection fields (for the local subnet), right ?

Yes - any field in ClassPortInfo named Redirect*.  I don't see that it
needs to be limited to the local subnet.

> I was referring to the implications on the partition configuration in
> terms of the ports for the ib_sa module.

I don't know that an implementation needs to handle different PKeys or
GRH fields right away, but I don't believe that general SA redirection
prohibits this.  (I really only care about the LID at the moment.)


- Sean
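
For reference, a small sketch of what "any field named Redirect*" covers. The field names follow the ClassPortInfo attribute as it is commonly listed and should be checked against the IBA specification; the dictionary representation and the `redirect_fields` helper are hypothetical.

```python
# Redirect* members of ClassPortInfo (names to be verified against the spec).
REDIRECT_FIELD_NAMES = (
    "RedirectGID",
    "RedirectTC",
    "RedirectSL",
    "RedirectFL",
    "RedirectLID",
    "RedirectP_Key",
    "RedirectQP",
    "RedirectQ_Key",
)


def redirect_fields(class_port_info: dict) -> dict:
    """Return only the redirection-related entries of a ClassPortInfo dict."""
    return {
        name: value
        for name, value in class_port_info.items()
        if name in REDIRECT_FIELD_NAMES
    }
```

An implementation that only cares about the LID would read `RedirectLID` and ignore the rest; handling PKeys or GRH fields later means honoring `RedirectP_Key` and the GID/TC/FL entries as well.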


Re: [ofa-general] [RFC] the never ending search for SA scalability

2007-08-13 Thread Hal Rosenstock
On 8/13/07, Sean Hefty <[EMAIL PROTECTED]> wrote:
> > There are implications on the "deployment" model for this based on the PKey.
>
> The intent is to match ClassPortInfo fields.

In terms of the redirection fields (for the local subnet), right ?

I was referring to the implications on the partition configuration in
terms of the ports for the ib_sa module.

> - Sean
>


Re: [ofa-general] [RFC] the never ending search for SA scalability

2007-08-13 Thread Hal Rosenstock
On 8/13/07, Sean Hefty <[EMAIL PROTECTED]> wrote:
> > In that mode, I suppose it also requires an admin to reset it when the
> > node for the ib_sa module fails.
>
> Not necessarily.  See C13-43.1.1:
>
> When a request for a particular MADHeader:MgmtClass has been redirected
> to another location, that location shall continue to service requests
> for the MADHeader:MgmtClass until either the location becomes inoperable
> for some reason or the requests are redirected again away from that
> location.
>
> Separately from admin or SA controlled redirection, the ib_sa module
> will need to determine when to fail back to the master SM/SA.  I would
> do this based on X number of retries/timeouts.

Yes, that was what I was getting at.

>
> - Sean
>


Re: [ofa-general] [RFC] the never ending search for SA scalability

2007-08-13 Thread Sean Hefty

> Don't you need a distributed SA for your idea to work?  Otherwise what
> happens if node A sets a service record at SA LID X, and then node B
> sends a query for that service record at SA LID Y?

My initial hope is that we can get away with replicating the data, and
using the existing SA protocol to keep it in sync.  For the example you
gave, SA LID X could forward the set to the master SA before recording
it locally.  SA LID Y would check with the master SA for the data if it
weren't found local.


- Sean
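
The X/Y example above amounts to write-through on set and read-through on a miss. A minimal sketch, assuming in-process objects stand in for the SAs; the `MasterSA` and `LocalSA` classes are hypothetical, and a real implementation would exchange SA MADs instead of calling methods.

```python
class MasterSA:
    """Stand-in for the master SA: the authoritative record store."""

    def __init__(self):
        self.records = {}

    def set(self, key, value):
        self.records[key] = value

    def get(self, key):
        return self.records.get(key)


class LocalSA:
    """Stand-in for a secondary SA at some other LID."""

    def __init__(self, master):
        self.master = master
        self.cache = {}

    def set(self, key, value):
        # Forward the set to the master first, then record it locally.
        self.master.set(key, value)
        self.cache[key] = value

    def get(self, key):
        # Serve from the local copy; on a miss, check with the master.
        if key in self.cache:
            return self.cache[key]
        value = self.master.get(key)
        if value is not None:
            self.cache[key] = value
        return value
```

Here a record set through one `LocalSA` is visible through another because the second one falls through to the master on its first lookup. The sketch does not handle deletes or updates to records that are already cached.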


Re: [ofa-general] [RFC] the never ending search for SA scalability

2007-08-13 Thread Sean Hefty

> In that mode, I suppose it also requires an admin to reset it when the
> node for the ib_sa module fails.

Not necessarily.  See C13-43.1.1:

When a request for a particular MADHeader:MgmtClass has been redirected
to another location, that location shall continue to service requests
for the MADHeader:MgmtClass until either the location becomes inoperable
for some reason or the requests are redirected again away from that
location.

Separately from admin or SA controlled redirection, the ib_sa module
will need to determine when to fail back to the master SM/SA.  I would
do this based on X number of retries/timeouts.


- Sean
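
A sketch of the fail-back rule described above, assuming "X number of retries/timeouts" means X consecutive timeouts against the redirected LID, with any response resetting the count. The `SaTarget` class and its default threshold of 3 are hypothetical.

```python
class SaTarget:
    """Tracks where SA MADs are sent and fails back after repeated timeouts."""

    def __init__(self, sm_lid, max_timeouts=3):
        self.sm_lid = sm_lid
        self.redirect_lid = None  # None means "use the master SM/SA"
        self.max_timeouts = max_timeouts
        self._timeouts = 0

    @property
    def current_lid(self):
        return self.sm_lid if self.redirect_lid is None else self.redirect_lid

    def redirect(self, lid):
        """Admin- or SA-controlled redirection to a new SA LID."""
        self.redirect_lid = lid
        self._timeouts = 0

    def on_response(self):
        """Any successful response clears the consecutive-timeout count."""
        self._timeouts = 0

    def on_timeout(self):
        """Count a timeout; fail back to the master once the limit is hit."""
        if self.redirect_lid is None:
            return
        self._timeouts += 1
        if self._timeouts >= self.max_timeouts:
            self.redirect_lid = None
            self._timeouts = 0
```

Counting consecutive rather than total timeouts is a design choice made here so that an occasional lost MAD on a healthy redirect target does not eventually trigger a fail-back.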


Re: [ofa-general] [RFC] the never ending search for SA scalability

2007-08-13 Thread Sean Hefty

> There are implications on the "deployment" model for this based on the PKey.

The intent is to match ClassPortInfo fields.

- Sean


Re: [ofa-general] [RFC] the never ending search for SA scalability

2007-08-13 Thread Roland Dreier
 > > My first reaction to this is to wonder why we wouldn't just use
 > > redirection as already specified in the IB architecture?  It seems
 > > that having something that has to be set manually and that breaks all
 > > communication if it is set wrong is a bad idea.
 > 
 > This requires changes to the SAs to send the redirect message in
 > response to a query initiated by the host.  I was trying to avoid
 > creating an actual distributed SA.

Don't you need a distributed SA for your idea to work?  Otherwise what
happens if node A sets a service record at SA LID X, and then node B
sends a query for that service record at SA LID Y?

 - R.


Re: [ofa-general] [RFC] the never ending search for SA scalability

2007-08-13 Thread Sean Hefty

> My first reaction to this is to wonder why we wouldn't just use
> redirection as already specified in the IB architecture?  It seems
> that having something that has to be set manually and that breaks all
> communication if it is set wrong is a bad idea.

This requires changes to the SAs to send the redirect message in
response to a query initiated by the host.  I was trying to avoid
creating an actual distributed SA.


- Sean


Re: [ofa-general] [RFC] the never ending search for SA scalability

2007-08-13 Thread Roland Dreier
 > >This concept is supported by the spec through SA redirection; however, I
 > >propose that we also allow the SA LID to be set manually by an 
 > >administrator.

 > Is the following an acceptable approach for this?
 > 
 > Add a class device file to the ib_sa that allows setting all SA redirection
 > parameters (SL, LID, PKey, QP - with processing similar to SRP's add_target).
 > 
 > /sys/class/infiniband_sa/sa-mthca-0/redirect_sa

My first reaction to this is to wonder why we wouldn't just use
redirection as already specified in the IB architecture?  It seems
that having something that has to be set manually and that breaks all
communication if it is set wrong is a bad idea.

 - R.


Re: [ofa-general] [RFC] the never ending search for SA scalability

2007-08-12 Thread Hal Rosenstock
On 8/9/07, Sean Hefty <[EMAIL PROTECTED]> wrote:
> I'd like to propose the following change as a simple solution for handling SA
> scalability problems:
>
> Modify the ib_sa module to support an SA LID that's separate from the SM LID.
>
> This concept is supported by the spec through SA redirection; however, I
> propose that we also allow the SA LID to be set manually by an administrator.

In that mode, I suppose it also requires an admin to reset it when the
node for the ib_sa module fails.

> Additional details are below.
>
> ---
>
> The SA LID can be set to a local or remote LID - it doesn't matter to the
> kernel.  All SA MADs (PR queries, MC joins, event registration, etc.) would be
> sent to that destination for processing.
>
> Initially, I envision a user space library capable of responding to PR
> queries, but it could be expanded to respond to other types of requests.  How
> the library responds to requests (forwarding them to the SM/SA, using lookup
> tables, etc.) is outside the scope of the proposal.

Currently outside the scope of the proposal is how a failure of that
SA cache node is handled and how nodes learn the SA redirection
information (other than it being admin'd currently).


> - Sean


Re: [ofa-general] [RFC] the never ending search for SA scalability

2007-08-12 Thread Hal Rosenstock
On 8/10/07, Sean Hefty <[EMAIL PROTECTED]> wrote:
> >I'd like to propose the following change as a simple solution for handling SA
> >scalability problems:
> >
> >Modify the ib_sa module to support an SA LID that's separate from the SM LID.
> >
> >This concept is supported by the spec through SA redirection; however, I
> >propose that we also allow the SA LID to be set manually by an administrator.
>
> Roland,
>
> Is the following an acceptable approach for this?
>
> Add a class device file to the ib_sa that allows setting all SA redirection
> parameters (SL, LID, PKey, QP

There are implications on the "deployment" model for this based on the PKey.

> - with processing similar to SRP's add_target).
>
> /sys/class/infiniband_sa/sa-mthca-0/redirect_sa
>
> Alternatively, I can use a module parameter to redirect only the SA LID.
>
> - Sean


RE: [ofa-general] [RFC] the never ending search for SA scalability

2007-08-10 Thread Sean Hefty
>I'd like to propose the following change as a simple solution for handling SA
>scalability problems:
>
>Modify the ib_sa module to support an SA LID that's separate from the SM LID.
>
>This concept is supported by the spec through SA redirection; however, I
>propose that we also allow the SA LID to be set manually by an administrator.

Roland,

Is the following an acceptable approach for this?

Add a class device file to the ib_sa that allows setting all SA redirection
parameters (SL, LID, PKey, QP - with processing similar to SRP's add_target).

/sys/class/infiniband_sa/sa-mthca-0/redirect_sa

Alternatively, I can use a module parameter to redirect only the SA LID.

- Sean
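
A sketch of how a string written to such a file might be validated, assuming a comma-separated key=value syntax in the spirit of SRP's add_target. The key names (`sl`, `lid`, `pkey`, `qp`), the requirement that all four be present, and the `parse_redirect` helper are hypothetical; the thread lists the parameters but does not define a format.

```python
from dataclasses import dataclass

# Hypothetical option names for the four parameters named in the proposal.
REQUIRED_KEYS = ("sl", "lid", "pkey", "qp")


@dataclass(frozen=True)
class SaRedirect:
    sl: int
    lid: int
    pkey: int
    qp: int


def parse_redirect(text: str) -> SaRedirect:
    """Parse 'key=value,key=value,...' and reject anything malformed."""
    fields = {}
    for item in text.strip().split(","):
        key, sep, value = item.partition("=")
        key = key.strip().lower()
        if not sep or key not in REQUIRED_KEYS:
            raise ValueError(f"unrecognized option: {item!r}")
        if key in fields:
            raise ValueError(f"duplicate option: {key}")
        fields[key] = int(value, 0)  # base 0 accepts decimal and 0x-prefixed hex
    missing = [k for k in REQUIRED_KEYS if k not in fields]
    if missing:
        raise ValueError("missing options: " + ", ".join(missing))
    if not 0 <= fields["sl"] <= 0xF:
        raise ValueError("SL must fit in 4 bits")
    if not 0x0001 <= fields["lid"] <= 0xBFFF:
        raise ValueError("LID must be a unicast LID")
    if not 0 <= fields["pkey"] <= 0xFFFF:
        raise ValueError("PKey must fit in 16 bits")
    if not 0 <= fields["qp"] <= 0xFFFFFF:
        raise ValueError("QP number must fit in 24 bits")
    return SaRedirect(**fields)
```

Rejecting a bad value at write time addresses part of the concern about a wrong setting breaking all communication, although it cannot tell whether the LID written actually hosts an SA.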