Re: [ofa-general] Re: [Query] ib add path record cache

Hal Rosenstock Wed, 23 May 2007 07:40:50 -0700

On Wed, 2007-05-23 at 10:27, Devesh Sharma wrote:
> On 21 May 2007 13:52:11 -0400, Hal Rosenstock <[EMAIL PROTECTED]> wrote:
> > On Mon, 2007-05-21 at 01:58, Devesh Sharma wrote:
> > > On 18 May 2007 06:21:05 -0400, Hal Rosenstock <[EMAIL PROTECTED]> wrote:
> > > > On Thu, 2007-05-17 at 08:28, Devesh Sharma wrote:
> > > > > On 17 May 2007 06:42:16 -0400, Hal Rosenstock <[EMAIL PROTECTED]> wrote:
> > > > > > On Thu, 2007-05-17 at 01:21, Devesh Sharma wrote:
> > > > > > > On 5/17/07, Sean Hefty <[EMAIL PROTECTED]> wrote:
> > > > > > > > > But initially this will generate a packet for each path, while the
> > > > > > > > > sysadmin knows that the path is there and can hard-code the entries
> > > > > > > > > for it. The other thing is, why would the admin care about creating
> > > > > > > > > such a record when the SA itself is taking care of it, right?
> > > > > > > >
> > > > > > > > In your original message you asked about adding 'dummy entries' to the
> > > > > > > > cache.  I agree that pre-loading the cache can be useful.  What I still
> > > > > > > > am not understanding is the reasoning for adding 'dummy entries'.  By
> > > > > > > > 'dummy entries', I've been assuming that these are invalid path records,
> > > > > > > > but maybe that's not what you meant.
> > > > > > > Ok, if the term "dummy entries" has created confusion then I am
> > > > > > > sorry for that. By that I mean valid path records which the
> > > > > > > administrator knows in advance, and while loading the
> > > > > > > module,
> > > > > >
> > > > > > How does the admin know they are valid ?
> > > > > Depending on the initial application runs, some trusted PRs can be generated.
> > > >
> > > > What do initial application runs have to do with this ?
> > > My understanding is that, once the cluster is UP, and if between Node
> > > A and Node B there is only one path,
> >
> > So this is a feature for such one path subnets. I wonder what percentage
> > of deployed subnets fits this case.
> You never know. It may be used for debugging also.


I still don't have a good feel for how common/generally useful this will
really be.

> > > then, SA query always going to return same values in PR.
> >
> > If subnet topology is changed, these PRs might change. There are other
> > cases where they change too.
> Not sure about it...some suggestion?
> >
> > >  On this basis Initial application runs will generate PRs,
> >
> > That's what confused me before (Applications don't generate PRs but
> > rather request them.) but I think I see what you mean now.
> Ok
> >
> > > these PRs can be saved in some file, and can be loaded
> > > when cache_module comes in.
> > > >
> > > > > >Are they somehow preconfigured at the SM ?
> > > > > I am not sure whether the SM has any such provision.
> > > >
> > > > Not that I'm aware of.
> > > Ok, So, currently no such support is there in SM?
> >
> > I can speak definitively for OpenSM and there is no such support. As to
> > the vendor SMs, I don't think so but don't know for absolute certainty.
> > Someone can correct me if I'm wrong but I wouldn't assume no response
> > means correctness as some may not be listening nor want to respond as to
> > "value added" vendor specific features.
> What is the issue if OpenSM provides this?

I'm not following you. What does/should OpenSM provide ? OpenIB works in
configurations with other SMs.

> >
> > > > > Also not sure about the
> > > > > role of the SM in path resolution. I mean, once a node has initiated an
> > > > > SA query, does the SM have some database from which the SA replies, or is
> > > > > the destination node contacted on the fly to get the requested path record?
> > > >
> > > > SMs can either calculate the SA PRs on the fly based on the routing
> > > > algorithm in use and some other things or put them in a local database.
> > > > This is up to that SM.
> > > Ok
> > > >
> > > > Destination node is not contacted in the SA PR query process.
> > > >
> > > > > >Doesn't each SM have its own policy for generating valid PRs ?
> > > > > Ultimately path record is in Path_Record object format, and SA cache
> > > > > is going to store in a fixed manner, How generation policy matters?
> > > >
> > > > What if the local policy loaded does not agree with what the SM would
> > > > generate for a particular PR ? One then gets a local error which will
> > > > need to be tracked down. Not so easy IMO.
> > > Do the SM policies for generating PRs in a subnet change dynamically, at run time?
> >
> > The policy doesn't change dynamically but the data to be returned in the
> > SA PR response might.
> >
> > > if Not then depending on the local SM policy static PR can be
> > > generated to load initially.
> >
> > Just as one question related to this, how would link failures be handled
> > ? There are others.
> It's just a matter of avoiding the initial PR query packets by loading the
> cache with static PRs. Later on the cache module will function in the
> normal fashion. I expect that initially everything will come up in a
> trusted cluster.

So you're saying the cache would still react to GIDs out and in service,
right ?

If the cache is loaded from a file, does it bypass querying the SA
initially for PRs ? If that is the case, then the file is required to be
the full set of PRs for this node otherwise there would be incomplete
connectivity.
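
To make the concern concrete, here is a minimal user-space sketch in Python.
It is only a model of the idea, not sa_cache code: the class name
PreloadedPathCache, the query_sa callback, and the dict-based record format
are all hypothetical and chosen purely for illustration.

```python
from typing import Callable, Dict, Iterable, Optional

# Hypothetical stand-in for an SA PathRecord; only two fields are modeled.
PathRecord = Dict[str, str]
SaQuery = Callable[[str], Optional[PathRecord]]


class PreloadedPathCache:
    """Toy model of a path record cache that is pre-loaded from a file."""

    def __init__(self, preloaded: Iterable[PathRecord],
                 query_sa: Optional[SaQuery] = None) -> None:
        # Index the pre-loaded records by destination GID.
        self._records = {pr["dgid"]: pr for pr in preloaded}
        # query_sa is None when the SA is bypassed entirely.
        self._query_sa = query_sa

    def resolve(self, dgid: str) -> Optional[PathRecord]:
        """Return a path record for dgid, or None if it cannot be resolved."""
        pr = self._records.get(dgid)
        if pr is None and self._query_sa is not None:
            # Cache miss: fall back to the SA and remember the answer.
            pr = self._query_sa(dgid)
            if pr is not None:
                self._records[dgid] = pr
        return pr


if __name__ == "__main__":
    file_prs = [{"dgid": "fe80::1", "dlid": "4"}]           # partial file
    sa_view = {"fe80::1": {"dgid": "fe80::1", "dlid": "4"},
               "fe80::2": {"dgid": "fe80::2", "dlid": "7"}}  # what the SA knows

    bypass = PreloadedPathCache(file_prs)
    print(bypass.resolve("fe80::2"))      # no record: incomplete connectivity

    fallback = PreloadedPathCache(file_prs, query_sa=sa_view.get)
    print(fallback.resolve("fe80::2"))    # resolved through the SA fallback
```

In this model, a file that omits a destination leaves that destination
unresolvable unless the cache can still fall back to the SA on a miss.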

-- Hal

> > > > > CMIIW. Also I am assuming a homogeneous cluster where certain
> > > > > parameters can be assumed to be same always.
> > > >
> > > > and always in agreement with what the SM would return ? For example,
> > > yes
> > > > what happens when a link goes down and the end node is no longer
> > > > reachable ?
> > > If node is not reachable then, after first timeout of sa_cache, that
> > > entry will be removed from cache.
> >
> > OK; that's another aspect to add into this feature. I don't think that
> > is currently done. I think there would need to be an API added to do
> > this.
> Yes, this has been discussed with Sean. We can add one char_dev
> interface to the existing sa_cache module implementation. The write entry
> point will generate an SA_PR_response packet, and this packet will be
> passed to the update_cache() function.
> 
> Also we need to remove the initial schedule_update() call in the
> add_one() function.
> One user command is also required to read from user file and write
> onto this device.
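
The write path proposed above can be modeled roughly as follows. This is a
user-space Python sketch under stated assumptions, not the kernel module:
update_cache() is the only name taken from the thread, while SaCacheModel,
the write()/lookup() methods, and the one-record-per-line "dgid dlid" text
format are hypothetical.

```python
from typing import Dict, List, Optional

PathRecord = Dict[str, object]  # hypothetical stand-in for a path record


class SaCacheModel:
    """Toy model of the proposed flow: device write -> response -> update_cache()."""

    def __init__(self) -> None:
        self._entries: Dict[str, PathRecord] = {}

    def update_cache(self, response: List[PathRecord]) -> None:
        """Store every path record carried by one modeled SA PR response."""
        for pr in response:
            self._entries[str(pr["dgid"])] = pr

    def write(self, text: str) -> int:
        """Model of the char device write entry point.

        Parses 'dgid dlid' lines (an assumed format), builds a modeled
        SA PR response, and hands it to update_cache().
        Returns the number of records loaded.
        """
        response: List[PathRecord] = []
        for line in text.splitlines():
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blank lines and comments
            dgid, dlid = line.split()
            response.append({"dgid": dgid, "dlid": int(dlid)})
        self.update_cache(response)
        return len(response)

    def lookup(self, dgid: str) -> Optional[PathRecord]:
        """Return the cached record for dgid, or None on a miss."""
        return self._entries.get(dgid)


if __name__ == "__main__":
    cache = SaCacheModel()
    # Stands in for the user command that reads a file and writes to the device.
    loaded = cache.write("# static PRs\nfe80::1 4\nfe80::2 7\n")
    print(loaded, cache.lookup("fe80::2"))
```

Routing the loaded records through the same update_cache() path as a real
SA response means that, in this model, pre-loaded and queried records are
stored identically.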
> >
> > -- Hal
> >
> > > > > > are these from a live SM and just loaded "out of band" to
> > > > > > bypass/preclude the SA PR mechanism ?
> > > > > may be
> > > >
> > > > Even if they are, there is still the changes in the subnet issue.
> > > >
> > > > -- Hal
> > > >
> > > > > > -- Hal
> > > > > >
> > > > > > >  Admin is loading this info in the cache with user command.
> > > > > > > >
> > > > > > > > > Another point I want to know is,
> > > > > > > > > when will the local_sa_cache module be inserted? After the SM comes
> > > > > > > > > up or before the SM comes up?
> > > > > > > >
> > > > > > > > It can occur either way.  There is no restriction.  The cache responds
> > > > > > > > to port up and GID in/out of service events to update itself.
> > > > > > > Do you mean the cache module will start building the cache only after the port is UP?
> > > > > > > >
> > > > > > > > > If it's inserted before the SM comes up (I am assuming the SM is
> > > > > > > > > running on some node, not on a switch) then the first forced
> > > > > > > > > schedule_update() is wasted, and for the first application the
> > > > > > > > > presence of the cache is meaningless. Why not keep the cache
> > > > > > > > > effective right from the start?
> > > > > > > >
> > > > > > > > Pre-loading the cache with path records doesn't guarantee that those
> > > > > > > > paths are usable.  If the SM has not come up, then the path records will
> > > > > > > > be unusable until the SM configures the subnet, plus there's no
> > > > > > > > guarantee that the remote endpoints specified by the paths are running.
> > > > > > > You mean that even if the SM is UP and we have some hard-coded
> > > > > > > path record entries for some node X, there is no guarantee that
> > > > > > > node X has actually come up?  In that case path resolution should
> > > > > > > actually fail if the node has not come up, but with the hard
> > > > > > > coding the path will still be resolved?
> > > > > > > >
> > > > > > > > The main benefit I see to pre-loading the cache is to avoid SA storms
> > > > > > > > when booting a large cluster.
> > > > > > > that's true. Also cache will get valid entries only if network is
> > > > > > > configured by SM otherwise every node SA will, possibly, drop SA
> > > > > > > packets.
> > > > > > > >
> > > > > > > > - Sean
> > > > > > > >
> > > > > > > _______________________________________________
> > > > > > > general mailing list
> > > > > > > [email protected]
> > > > > > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
> > > > > > >
> > > > > > > To unsubscribe, please visit 
> > > > > > > http://openib.org/mailman/listinfo/openib-general
> > > > > >
> > > > > >
> > > >
> > > >
> >
> >

