On 21 May 2007 13:52:11 -0400, Hal Rosenstock <[EMAIL PROTECTED]> wrote:
On Mon, 2007-05-21 at 01:58, Devesh Sharma wrote:
> On 18 May 2007 06:21:05 -0400, Hal Rosenstock <[EMAIL PROTECTED]> wrote:
> > On Thu, 2007-05-17 at 08:28, Devesh Sharma wrote:
> > > On 17 May 2007 06:42:16 -0400, Hal Rosenstock <[EMAIL PROTECTED]> wrote:
> > > > On Thu, 2007-05-17 at 01:21, Devesh Sharma wrote:
> > > > > On 5/17/07, Sean Hefty <[EMAIL PROTECTED]> wrote:
> > > > > > > But initially this will generate a packet for each path, while the
> > > > > > > sysadmin knows the path is there and can hard-code the entries for
> > > > > > > it. The other thing is: why would the admin care about creating
> > > > > > > such a record while the SA is itself taking care of it, right?
> > > > > >
> > > > > > In your original message you asked about adding 'dummy entries' to the
> > > > > > cache. I agree that pre-loading the cache can be useful. What I still
> > > > > > am not understanding is the reasoning for adding 'dummy entries'. By
> > > > > > 'dummy entries', I've been assuming that these are invalid path
> > > > > > records, but maybe that's not what you meant.
> > > > > OK, if the phrase "dummy entries" as such has created confusion, then
> > > > > I am sorry for that. But by that I mean that these are valid path
> > > > > records which the administrator knows in advance and, while loading
> > > > > the module,
> > > >
> > > > How does the admin know they are valid ?
> > > Depending on the initial application runs, some trusted PRs can be
> > > generated.
> >
> > What do initial application runs have to do with this ?
> My understanding is that once the cluster is up, if between node
> A and node B there is only one path,
So this is a feature for such single-path subnets. I wonder what
percentage of deployed subnets fits this case.
You never know; it may be used for debugging also.
> then an SA query is always going to return the same values in the PR.
If subnet topology is changed, these PRs might change. There are other
cases where they change too.
Not sure about that... any suggestions?
> On this basis, initial application runs will generate PRs,
That's what confused me before (applications don't generate PRs but
rather request them), but I think I see what you mean now.
Ok
> these PRs can be saved in some file, and can be loaded
> when the cache_module comes in.
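The save/restore idea sketched above might look like the following in
user space. This is only an illustration under stated assumptions: the
one-line-per-path file format, the `static_pr` struct, and the function
name are all hypothetical, not anything the existing local_sa cache code
defines.

```c
#include <stdio.h>

/* Hypothetical saved-PR record: one line per path in the form
 * "<dgid> <slid> <dlid> <sl> <mtu> <rate>".  This format is an
 * assumption for illustration, not part of the local_sa module. */
struct static_pr {
    char dgid[64];              /* destination GID, text form */
    unsigned slid, dlid;        /* source/destination LIDs */
    unsigned sl, mtu, rate;     /* service level, MTU, rate codes */
};

/* Parse one line of the saved-PR file; returns 0 on success,
 * -1 if the line does not contain all six fields. */
int parse_static_pr(const char *line, struct static_pr *pr)
{
    int n = sscanf(line, "%63s %u %u %u %u %u",
                   pr->dgid, &pr->slid, &pr->dlid,
                   &pr->sl, &pr->mtu, &pr->rate);
    return n == 6 ? 0 : -1;
}
```

A loader run at module insertion time could parse each line this way and
hand the result to whatever preload interface the cache module ends up
exposing.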
> >
> > > > Are they somehow preconfigured at the SM?
> > > I am not sure whether the SM has any such provision.
> >
> > Not that I'm aware of.
> OK, so currently there is no such support in the SM?
I can speak definitively for OpenSM, and there is no such support. As to
the vendor SMs, I don't think so, but I don't know for absolute
certainty. Someone can correct me if I'm wrong, but I wouldn't assume no
response means confirmation, as some may not be listening or may not
want to respond about "value added" vendor-specific features.
What is the issue if OpenSM provides this?
> > > Also, I am not sure about the
> > > role of the SM in path resolution. I mean, once a node has initiated an
> > > SA query, does the SM have some database with which the SA replies, or
> > > is the destination node contacted on the fly to get the requested path
> > > record?
> >
> > SMs can either calculate the SA PRs on the fly based on the routing
> > algorithm in use and some other things or put them in a local database.
> > This is up to that SM.
> Ok
> >
> > Destination node is not contacted in the SA PR query process.
> >
> > > > Doesn't each SM have its own policy for generating valid PRs?
> > > Ultimately, a path record is in Path_Record object format, and the SA
> > > cache is going to store it in a fixed manner; how does the generation
> > > policy matter?
> >
What if the local policy loaded does not agree with what the SM would
generate for a particular PR? One then gets a local error which will
need to be tracked down. Not so easy IMO.
> Do SM policies in a subnet for generating PRs change dynamically, at run time?
The policy doesn't change dynamically but the data to be returned in the
SA PR response might.
> If not, then depending on the local SM policy, static PRs can be
> generated for the initial load.
Just as one question related to this, how would link failures be
handled? There are others.
It's just a matter of avoiding the initial PR query packets by loading
the cache with static PRs. Later on, the cache module will function in
its normal fashion. I expect that, initially, everything will come up in
a trusted cluster.
> > > CMIIW. Also, I am assuming a homogeneous cluster where certain
> > > parameters can be assumed to always be the same.
> >
> > and always in agreement with what the SM would return? For example,
> Yes.
> > what happens when a link goes down and the end node is no longer
> > reachable?
> If the node is not reachable, then after the first timeout of the
> sa_cache, that entry will be removed from the cache.
OK; that's another aspect to add into this feature. I don't think that
is currently done. I think there would need to be an API added to do
this.
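The timeout-based removal described above could be as simple as stamping
each cached entry on load and evicting it once its lifetime has passed.
A minimal user-space sketch follows; the `cache_entry` struct, the
`pr_cache_expired` helper, and the fixed lifetime are all assumed names
for illustration, not part of any existing API.

```c
#include <time.h>

/* Hypothetical cached entry carrying the time it was loaded. */
struct cache_entry {
    unsigned dlid;      /* destination LID this PR resolves to */
    time_t loaded_at;   /* when the entry entered the cache */
};

/* Return nonzero if the entry has outlived its allowed lifetime and
 * should be dropped from the cache (the "first timeout" above). */
int pr_cache_expired(const struct cache_entry *e, time_t now,
                     time_t lifetime_sec)
{
    return (now - e->loaded_at) >= lifetime_sec;
}
```

An API to force this eviction from outside the module, as suggested
above, would then just walk the cache calling a check like this one.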
Yes, this has been discussed with Sean. We can add a char_dev
interface to the existing sa_cache module implementation; the write
entry point will generate an SA_PR_response packet, and this packet will
be passed to the update_cache() function.
Also, we need to remove the initial schedule_update() call in the
add_one() function.
One user command is also required to read from a user file and write
onto this device.
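That user command could be little more than a copy loop. A sketch under
stated assumptions: the function name, the one-record-per-line framing,
and the idea of passing the device path as a parameter are all
hypothetical, since no such char_dev interface exists yet.

```c
#include <stdio.h>

/* Copy a saved-PR file, one record per line, to the cache device.
 * On the kernel side, each record written would become one
 * SA_PR_response handed to update_cache().  The device path is a
 * parameter so the same code can be exercised against a plain file.
 * Returns the number of records written, or -1 on open failure. */
int load_pr_file(const char *src_path, const char *dev_path)
{
    FILE *src = fopen(src_path, "r");
    FILE *dev = fopen(dev_path, "w");
    char line[256];
    int count = 0;

    if (!src || !dev) {
        if (src) fclose(src);
        if (dev) fclose(dev);
        return -1;
    }
    while (fgets(line, sizeof(line), src)) {
        fputs(line, dev);       /* one record per line */
        count++;
    }
    fclose(src);
    fclose(dev);
    return count;
}
```

Run at boot, before applications start, this would preload the cache and
avoid the initial SA query storm discussed earlier in the thread.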
-- Hal
> > > > Are these from a live SM and just loaded "out of band" to
> > > > bypass/preclude the SA PR mechanism?
> > > Maybe.
> >
> > Even if they are, there is still the issue of changes in the subnet.
> >
> > -- Hal
> >
> > > > -- Hal
> > > >
> > > > > the admin loads this info into the cache with a user command.
> > > > > >
> > > > > > > Another point I want to know is:
> > > > > > > when will the local_sa_cache module be inserted? After the SM
> > > > > > > comes up, or before the SM comes up?
> > > > > >
> > > > > > It can occur either way. There is no restriction. The cache responds
> > > > > > to port up and GID in/out of service events to update itself.
> > > > > Do you mean the cache module will start building the cache only
> > > > > after the port is up?
> > > > > >
> > > > > > > If it is inserted before the SM comes up (I am assuming the SM is
> > > > > > > running on some node, not on a switch), then the first forced
> > > > > > > schedule_update() is wasted, and for the first application the
> > > > > > > presence of the cache is meaningless. Why not keep the cache
> > > > > > > effective right from the start?
> > > > > >
> > > > > > Pre-loading the cache with path records doesn't guarantee that those
> > > > > > paths are usable. If the SM has not come up, then the path records
> > > > > > will be unusable until the SM configures the subnet, plus there's no
> > > > > > guarantee that the remote endpoints specified by the paths are
> > > > > > running.
> > > > > You mean that even if the SM is up and we have some hard-coded path
> > > > > record entries corresponding to some node X, there is no guarantee
> > > > > that node X has actually come up? In that case, path resolution
> > > > > should fail if the node has not come up, but with the hard-coded
> > > > > entries the path will still be resolved?
> > > > > >
> > > > > > The main benefit I see to pre-loading the cache is to avoid SA
> > > > > > storms when booting a large cluster.
> > > > > That's true. Also, the cache will get valid entries only if the
> > > > > network has been configured by the SM; otherwise the SA will
> > > > > possibly drop every node's SA packets.
> > > > > >
> > > > > > - Sean
> > > > > >
> > > >
> > > >
> >
> >
_______________________________________________
general mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general