Re: [ofa-general] Re: [Query] ib add path record cache

Devesh Sharma Fri, 25 May 2007 06:53:24 -0700

On 24 May 2007 11:30:24 -0400, Hal Rosenstock <[EMAIL PROTECTED]> wrote:

On Thu, 2007-05-24 at 08:22, Devesh Sharma wrote:
> On 23 May 2007 10:35:13 -0400, Hal Rosenstock <[EMAIL PROTECTED]> wrote:
> > On Wed, 2007-05-23 at 10:27, Devesh Sharma wrote:
> > > On 21 May 2007 13:52:11 -0400, Hal Rosenstock <[EMAIL PROTECTED]> wrote:
> > > > On Mon, 2007-05-21 at 01:58, Devesh Sharma wrote:
> > > > > On 18 May 2007 06:21:05 -0400, Hal Rosenstock <[EMAIL PROTECTED]> wrote:
> > > > > > On Thu, 2007-05-17 at 08:28, Devesh Sharma wrote:
> > > > > > > On 17 May 2007 06:42:16 -0400, Hal Rosenstock <[EMAIL PROTECTED]> wrote:
> > > > > > > > On Thu, 2007-05-17 at 01:21, Devesh Sharma wrote:
> > > > > > > > > On 5/17/07, Sean Hefty <[EMAIL PROTECTED]> wrote:
> > > > > > > > > > > But initially this will generate a packet for each path, while sys
> > > > > > > > > > > admin knows that path is there and he can hard-code the entries for
> > > > > > > > > > > it. Other thing is that why Admin will care about creating such record
> > > > > > > > > > > while SA is itself taking care, right?
> > > > > > > > > >
> > > > > > > > > > In your original message you asked about adding 'dummy entries' to the
> > > > > > > > > > cache.  I agree that pre-loading the cache can be useful.  What I still
> > > > > > > > > > am not understanding is the reasoning for adding 'dummy entries'.  By
> > > > > > > > > > 'dummy entries', I've been assuming that these are invalid path records,
> > > > > > > > > > but maybe that's not what you meant.
> > > > > > > > > Ok if "dummy entries" word as such has created confusion then I am
> > > > > > > > > sorry for that, But with that I mean that, those are valid path
> > > > > > > > > records which Administrator knows in advance and while loading the
> > > > > > > > > module,
> > > > > > > >
> > > > > > > > How does the admin know they are valid ?
> > > > > > > Depending on the initial application runs, some trusted PRs can be generated.
> > > > > >
> > > > > > What do initial application runs have to do with this ?
> > > > > My understanding is that, once the cluster is UP, and if between Node
> > > > > A and Node B there is only one path,
> > > >
> > > > So this is a feature for such one path subnets. I wonder what percentage
> > > > of deployed subnets fits this case.
> > > You never know, It may be used for debugging also.
> >
> > I still don't have a good feel for how common/generally useful this will
> > really be.
> >
> > > > > then, SA query always going to return same values in PR.
> > > >
> > > > If subnet topology is changed, these PRs might change. There are other
> > > > cases where they change too.
> > > Not sure about it...some suggestion?
> > > >
> > > > >  On this basis Initial application runs will generate PRs,
> > > >
> > > > That's what confused me before (Applications don't generate PRs but
> > > > rather request them.) but I think I see what you mean now.
> > > Ok
> > > >
> > > > > these PRs can be saved in some file, and can be loaded
> > > > > when cache_module comes in.
> > > > > >
> > > > > > > >Are they somehow preconfigured at the SM ?
> > > > > > > I am not sure about SM has any such provision?
> > > > > >
> > > > > > Not that I'm aware of.
> > > > > Ok, So, currently no such support is there in SM?
> > > >
> > > > I can speak definitively for OpenSM and there is no such support. As to
> > > > the vendor SMs, I don't think so but don't know for absolute certainty.
> > > > Someone can correct me if I'm wrong but I wouldn't assume no response
> > > > means correctness as some may not be listening nor want to respond as to
> > > > "value added" vendor specific features.
> > > What is the issue if OpenSM provides this?
> >
> > I'm not following you. What does/should OpenSM provide ? OpenIB works in
> > configurations with other SMs.
> I am talking about pre-configuring PRs in OpenSM DB.


How does that help ? Why would PRs need to be preconfigured at the SM ?
Do you mean preconfigure the routing tables (and generate the PRs from
that) ? What problem is being solved on the SM side ?

I just asked out of curiosity, nothing special. :)


> > > > > > > Also not sure about the
> > > > > > > role of SM in path resolving. I mean once node has initiated SA query,
> > > > > > > whether SM has some database to reply SA or On the fly destination
> > > > > > > node is contacted to get asked path record?
> > > > > >
> > > > > > SMs can either calculate the SA PRs on the fly based on the routing
> > > > > > algorithm in use and some other things or put them in a local database.
> > > > > > This is up to that SM.
> > > > > Ok
> > > > > >
> > > > > > Destination node is not contacted in the SA PR query process.
> > > > > >
> > > > > > > >Doesn't each SM have its own policy for generating valid PRs ?
> > > > > > > Ultimately path record is in Path_Record object format, and SA cache
> > > > > > > is going to store in a fixed manner, How generation policy matters?
> > > > > >
> > > > > > What if the local policy loaded does not agree with what the SM would
> > > > > > generate for a particular PR ? One then gets a local error which will
> > > > > > need to be tracked down. Not so easy IMO.
> > > > > SM policies in a subnet to generate PRs, changes dynamically? at run time?
> > > >
> > > > The policy doesn't change dynamically but the data to be returned in the
> > > > SA PR response might.
> > > >
> > > > > if Not then depending on the local SM policy static PR can be
> > > > > generated to load initially.
> > > >
> > > > Just as one question related to this, how would link failures be handled
> > > > ? There are others.
> > > Its just a matter of avoiding initial PR query packets by loading the
> > > cache with static PRs.....Later on cache module will function in
> > > normal fashion. I expect, initially every thing will come up in a
> > > trusted cluster.
> >
> > So you're saying the cache would still react to GIDs out and in service,
> > right ?
> I am not about what GIDs in out service....

Why not ?

Actually it was a typing mistake....I am trying to say that I am not
sure about what GID out and in service is.


> but what I mean to say is,
> Once sa_cache is programmed with some static PRs....it will avoid
> initial cache_update step and after first time out normal
> update_cache() will be initiated using SA MADs.

How would the client know what PRs to request when that timeout first
occurs ? There's no get all except these semantics. If it is all PRs,
what does that save ?

I think my statement has confused you again, sorry, my fault. By "and
after first time out normal update_cache() will be initiated using SA
MADs" I mean that after the first timeout only the requested PR
will be resolved, not all of them.
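
A rough sketch of the behaviour I have in mind, written in Python rather than kernel C just to show the logic. All names here (LazyPathCache, preload, resolve) are hypothetical and are not taken from the sa_cache module; resolve stands in for a single SA path record query.

```python
import time


class LazyPathCache:
    """Toy model: entries expire after `timeout` seconds and are
    re-resolved one destination at a time, never in bulk."""

    def __init__(self, resolve, timeout, clock=time.monotonic):
        self._resolve = resolve    # stand-in for one SA path record query
        self._timeout = timeout
        self._clock = clock
        self._entries = {}         # dgid -> (path_record, expiry_time)

    def preload(self, records):
        """Load static records without sending any query."""
        expiry = self._clock() + self._timeout
        for dgid, pr in records.items():
            self._entries[dgid] = (pr, expiry)

    def get(self, dgid):
        entry = self._entries.get(dgid)
        now = self._clock()
        if entry is not None and now < entry[1]:
            return entry[0]        # fresh hit, no SA traffic
        # Expired or missing: query this one destination only.
        pr = self._resolve(dgid)
        self._entries[dgid] = (pr, now + self._timeout)
        return pr
```

With this model a preloaded entry costs no query until it expires, and even then only the destination that is actually requested is re-resolved.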


> > If the cache is loaded from a file, does it bypass querying the SA
> > initially for PRs ?
> Yes It will, and hence reduce the initial SA traffic generated on a
> big cluster...just imagine, the cluster is quite big and every node is
> trying to build its cache initially. It will create a large burst of SA
> packets.
> >If that is the case, then the file is required to be
> > the full set of PRs for this node otherwise there would be incomplete
> > connectivity.
> Yes, correct. Generating these PRs is the next issue which I want to
> discuss. Maybe this can be done by Admin on every node using the
> read() entry point provided by char_dev interface of sa_cache module.
> The read entry point will simply extract PRs from the cache itself.
>
> Incomplete connectivity will last until the first PR is requested for that
> destination, because if it's a cache miss, the application is anyhow going
> to initiate an ib_sa_get_path_rec() and the resolved PR will be added to the
> cache for future reference.

OK then this becomes an on demand model for those destinations (at least
initially).

By "on demand" do you mean a normal cluster without a cache? If yes,
then it will be an on-demand PR resolve model for those incomplete paths.
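
To make the file-based preload and the fallback on a miss concrete, here is a small Python sketch. The one-record-per-line text format, the PathRec fields chosen, and the function names are all assumptions for illustration only; the real char_dev interface and record layout are still to be decided.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PathRec:
    dgid: str
    dlid: int
    sl: int


def dump_records(records):
    """Serialize records one per line, as a read() on the device might return them."""
    return "".join(f"{r.dgid} {r.dlid} {r.sl}\n" for r in records)


def load_records(text):
    """Parse the dump format back into a dgid -> PathRec table."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue               # skip blank lines and comments
        dgid, dlid, sl = line.split()
        table[dgid] = PathRec(dgid, int(dlid), int(sl))
    return table


def get_path(table, dgid, sa_query):
    """Return a cached record; on a miss, fall back to an SA query
    and remember the result for later lookups."""
    rec = table.get(dgid)
    if rec is None:
        rec = sa_query(dgid)
        table[dgid] = rec
    return rec
```

Destinations present in the file are served without any SA traffic; anything missing from the file is resolved on demand the first time it is asked for and is cached afterwards.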


-- Hal

> > -- Hal
> >
> > > > > > > CMIIW. Also I am assuming a homogeneous cluster where certain
> > > > > > > parameters can be assumed to be same always.
> > > > > >
> > > > > > and always in agreement with what the SM would return ? For example,
> > > > > yes
> > > > > > what happens when a link goes down and the end node is no longer
> > > > > > reachable ?
> > > > > If node is not reachable then, after first timeout of sa_cache, that
> > > > > entry will be removed from cache.
> > > >
> > > > OK; that's another aspect to add into this feature. I don't think that
> > > > is currently done. I think there would need to be an API added to do
> > > > this.
> > > Yes, this has been discussed with Sean, we can add one char_dev
> > > interface to the existing  sa_cache module implementation, Write entry
> > > point will generate a SA_PR_response packet and this packet will be
> > > passed to update_cache() function.
> > >
> > > Also we need to remove the initial schedule_update() call in the
> > > add_one() function.
> > > One user command is also required to read from user file and write
> > > onto this device.
> > > >
> > > > -- Hal
> > > >
> > > > > > > >are these from a live SM and just loaded "out of band" to
> > > > > > > bypass/preclude the SA PR >mechanism ?
> > > > > > > may be
> > > > > >
> > > > > > Even if they are, there is still the changes in the subnet issue.
> > > > > >
> > > > > > -- Hal
> > > > > >
> > > > > > > > -- Hal
> > > > > > > >
> > > > > > > > >  Admin is loading this info in the cache with user command.
> > > > > > > > > >
> > > > > > > > > > > Another point I want to know is,
> > > > > > > > > > > When local_sa_cache module will be inserted? After SM comes up or
> > > > > > > > > > > Before SM comes up?
> > > > > > > > > >
> > > > > > > > > > It can occur either way.  There is no restriction.  The cache responds
> > > > > > > > > > to port up and GID in/out of service events to update itself.
> > > > > > > > > Do you mean cache module will start building cache only after Port is UP?
> > > > > > > > > >
> > > > > > > > > > > If it's inserted before SM is coming up (I am assuming SM is running on
> > > > > > > > > > > some node not on switch) then First Forced schedule_update() is
> > > > > > > > > > > wasted, and for the first application presence of cache is
> > > > > > > > > > > meaningless. Why not to keep cache effective right from the start?
> > > > > > > > > >
> > > > > > > > > > Pre-loading the cache with path records doesn't guarantee that those
> > > > > > > > > > paths are usable.  If the SM has not come up, then the path records will
> > > > > > > > > > be unusable until the SM configures the subnet, plus there's no
> > > > > > > > > > guarantee that the remote endpoints specified by the paths are running.
> > > > > > > > > You mean there is no guarantee that even if SM is UP and we have some
> > > > > > > > > hard coded entries of path record corresponding to some node X, we are
> > > > > > > > > not sure that node X has actually come up or not?  In that case
> > > > > > > > > actually that path resolving should fail if node has not come up, but
> > > > > > > > > with the hard coding still path will be resolved?
> > > > > > > > > >
> > > > > > > > > > The main benefit I see to pre-loading the cache is to avoid SA storms
> > > > > > > > > > when booting a large cluster.
> > > > > > > > > that's true. Also cache will get valid entries only if network is
> > > > > > > > > configured by SM otherwise every node SA will, possibly, drop SA
> > > > > > > > > packets.
> > > > > > > > > >
> > > > > > > > > > - Sean
> > > > > > > > > >
> > > > > > > > > _______________________________________________
> > > > > > > > > general mailing list
> > > > > > > > > [email protected]
> > > > > > > > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
> > > > > > > > >
> > > > > > > > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
> > > > > > > >
> > > > > > > >
> > > > > >
> > > > > >
> > > >
> > > >
> >
> >
