On 23 May 2007 10:35:13 -0400, Hal Rosenstock <[EMAIL PROTECTED]> wrote:
On Wed, 2007-05-23 at 10:27, Devesh Sharma wrote:
> On 21 May 2007 13:52:11 -0400, Hal Rosenstock <[EMAIL PROTECTED]> wrote:
> > On Mon, 2007-05-21 at 01:58, Devesh Sharma wrote:
> > > On 18 May 2007 06:21:05 -0400, Hal Rosenstock <[EMAIL PROTECTED]> wrote:
> > > > On Thu, 2007-05-17 at 08:28, Devesh Sharma wrote:
> > > > > > On 17 May 2007 06:42:16 -0400, Hal Rosenstock <[EMAIL PROTECTED]> wrote:
> > > > > > On Thu, 2007-05-17 at 01:21, Devesh Sharma wrote:
> > > > > > > On 5/17/07, Sean Hefty <[EMAIL PROTECTED]> wrote:
> > > > > > > > > > But initially this will generate a packet for each path,
> > > > > > > > > > while sys admin knows that path is there and he can
> > > > > > > > > > hard-code the entries for it. Other thing is that why Admin
> > > > > > > > > > will care about creating such record while SA is itself
> > > > > > > > > > taking care, right?
> > > > > > > >
> > > > > > > > > In your original message you asked about adding 'dummy entries'
> > > > > > > > > to the cache. I agree that pre-loading the cache can be useful.
> > > > > > > > > What I still am not understanding is the reasoning for adding
> > > > > > > > > 'dummy entries'. By 'dummy entries', I've been assuming that
> > > > > > > > > these are invalid path records, but maybe that's not what you
> > > > > > > > > meant.
> > > > > > > > OK, if the term "dummy entries" has created confusion then I am
> > > > > > > > sorry for that. What I mean is that these are valid path
> > > > > > > > records which the administrator knows in advance, and while
> > > > > > > > loading the module,
> > > > > >
> > > > > > How does the admin know they are valid ?
> > > > > Depending on the initial application runs, some trusted PRs can be
> > > > > generated.
> > > >
> > > > What do initial application runs have to do with this ?
> > > My understanding is that, once the cluster is UP, and if between Node
> > > A and Node B there is only one path,
> >
> > So this is a feature for such one path subnets. I wonder what percentage
> > of deployed subnets fits this case.
> You never know, It may be used for debugging also.
I still don't have a good feel for how common/generally useful this will
really be.
> > > then, SA query always going to return same values in PR.
> >
> > If subnet topology is changed, these PRs might change. There are other
> > cases where they change too.
> Not sure about it...some suggestion?
> >
> > > On this basis Initial application runs will generate PRs,
> >
> > That's what confused me before (Applications don't generate PRs but
> > rather request them.) but I think I see what you mean now.
> Ok
> >
> > > these PRs can be saved in some file, and can be loaded
> > > when cache_module comes in.
> > > >
> > > > > >Are they somehow preconfigured at the SM ?
> > > > > I am not sure whether the SM has any such provision.
> > > >
> > > > Not that I'm aware of.
> > > Ok, So, currently no such support is there in SM?
> >
> > I can speak definitively for OpenSM and there is no such support. As to
> > the vendor SMs, I don't think so but don't know for absolute certainty.
> > Someone can correct me if I'm wrong but I wouldn't assume no response
> > means correctness as some may not be listening nor want to respond as to
> > "value added" vendor specific features.
> What is the issue if OpenSM provides this?
I'm not following you. What does/should OpenSM provide ? OpenIB works in
configurations with other SMs.
I am talking about pre-configuring PRs in OpenSM DB.
> >
> > > > > Also not sure about the
> > > > > role of SM in path resolving. I mean, once a node has initiated an SA
> > > > > query, does the SM have some database to answer it from, or is the
> > > > > destination node contacted on the fly to get the requested path record?
> > > >
> > > > SMs can either calculate the SA PRs on the fly based on the routing
> > > > algorithm in use and some other things or put them in a local database.
> > > > This is up to that SM.
> > > Ok
> > > >
> > > > Destination node is not contacted in the SA PR query process.
> > > >
> > > > > >Doesn't each SM have its own policy for generating valid PRs ?
> > > > > Ultimately path record is in Path_Record object format, and SA cache
> > > > > is going to store in a fixed manner, How generation policy matters?
> > > >
> > > > What if the local policy loaded does not agree with what the SM would
> > > > generate for a particular PR ? One then gets a local error which will
> > > > need to be tracked down. Not so easy IMO.
> > > Do SM policies for generating PRs in a subnet change dynamically, at run time?
> >
> > The policy doesn't change dynamically but the data to be returned in the
> > SA PR response might.
> >
> > > If not, then depending on the local SM policy, static PRs can be
> > > generated to load initially.
> >
> > Just as one question related to this, how would link failures be handled
> > ? There are others.
> It's just a matter of avoiding the initial PR query packets by loading the
> cache with static PRs. Later on the cache module will function in the
> normal fashion. I expect that initially everything will come up in a
> trusted cluster.
So you're saying the cache would still react to GIDs out and in service,
right ?
I am not sure about GIDs going in and out of service, but what I mean to say is:
once sa_cache is programmed with some static PRs, it will skip the
initial cache update step, and after the first timeout the normal
update_cache() will be initiated using SA MADs.
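
To make the start-up behaviour described above concrete, here is a minimal user-space sketch in Python. It is not the sa_cache kernel code: the PathCache class, its tick() method and the query_sa callable are hypothetical names invented for illustration, and the path records are opaque values. It only models the idea that a preloaded cache defers its first refresh until the first timeout, while an empty cache refreshes immediately.

```python
import time


class PathCache:
    """Toy model of a path-record cache keyed by destination (hypothetical)."""

    def __init__(self, query_sa, timeout_s, preload=None, now=time.monotonic):
        self._query_sa = query_sa    # hypothetical callable: () -> {dest: path_record}
        self._timeout_s = timeout_s
        self._now = now              # injectable clock, which makes testing easy
        self.entries = dict(preload or {})
        if self.entries:
            # Preloaded with static PRs: skip the initial update and
            # wait for the first timeout before asking the SA.
            self._next_update = self._now() + timeout_s
        else:
            # Nothing preloaded: the first refresh is due immediately.
            self._next_update = self._now()

    def tick(self):
        """Refresh from the SA if the timeout has expired; return True if it ran."""
        if self._now() < self._next_update:
            return False
        # A refresh replaces the whole table, so static entries that the SA
        # no longer reports disappear at this point.
        self.entries = dict(self._query_sa())
        self._next_update = self._now() + self._timeout_s
        return True
```

In this sketch the static entries are served unchanged until the first timeout, after which whatever the SA returns takes over.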
If the cache is loaded from a file, does it bypass querying the SA
initially for PRs ?
Yes, it will, and hence it reduces the initial SA traffic generated on a
big cluster. Just imagine: the cluster is quite big and every node is
trying to build its cache initially. That creates a large burst of SA
packets.
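
As a rough back-of-the-envelope illustration of that burst, the sketch below counts start-up queries under a deliberately simple assumption that is not taken from the sa_cache code: every node issues one SA query per destination it has no preloaded record for. The function name and parameters are hypothetical, and a real SA exchange may batch records differently.

```python
def initial_sa_queries(num_nodes, preloaded_per_node=0):
    """Worst-case count of start-up PR queries, assuming one query per
    (source, destination) pair that is not already preloaded."""
    if num_nodes < 0 or preloaded_per_node < 0:
        raise ValueError("counts must be non-negative")
    # Each node needs a path to every other node, minus what was preloaded.
    missing_per_node = max(num_nodes - 1 - preloaded_per_node, 0)
    return num_nodes * missing_per_node
```

Under that assumption the count grows roughly with the square of the cluster size, which is why preloading matters mostly on large clusters.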
If that is the case, then the file is required to be
the full set of PRs for this node otherwise there would be incomplete
connectivity.
Yes, correct. Generating these PRs is the next issue which I want to
discuss. Maybe this can be done by the admin on every node using the
read() entry point provided by the char_dev interface of the sa_cache module.
The read entry point will simply extract the PRs from the cache itself.
Connectivity will be incomplete only until the first PR is requested for that
destination, because on a cache miss the application is anyhow going
to initiate an ib_sa_get_path_rec() and the resolved PR will be added to the
cache for future reference.
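
The dump, reload and cache-miss flow described above could look roughly like the following Python sketch. Everything here is an assumption made for illustration: the JSON file format, the function names dump_prs, load_static_prs and resolve, and the query_sa callable that stands in for the SA path-record query. The real module would do this through its char_dev read and write entry points.

```python
import json


def dump_prs(cache, path):
    """Save the current cache contents to a file (hypothetical JSON format)."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(cache, fh)


def load_static_prs(path):
    """Load a previously dumped destination -> path-record mapping."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def resolve(cache, dest, query_sa):
    """Return the PR for dest; on a miss, ask the SA and remember the answer."""
    if dest in cache:
        return cache[dest]
    pr = query_sa(dest)    # stand-in for the SA path-record query on a miss
    cache[dest] = pr       # keep it for future lookups
    return pr
```

With this shape, a file that covers only part of the fabric costs one SA query per missing destination, on first use, rather than breaking connectivity.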
-- Hal
> > > > > CMIIW. Also I am assuming a homogeneous cluster where certain
> > > > > parameters can be assumed to be same always.
> > > >
> > > > and always in agreement with what the SM would return ? For example,
> > > yes
> > > > what happens when a link goes down and the end node is no longer
> > > > reachable ?
> > > If node is not reachable then, after first timeout of sa_cache, that
> > > entry will be removed from cache.
> >
> > OK; that's another aspect to add into this feature. I don't think that
> > is currently done. I think there would need to be an API added to do
> > this.
> Yes, this has been discussed with Sean, we can add one char_dev
> interface to the existing sa_cache module implementation, Write entry
> point will generate a SA_PR_response packet and this packet will be
> passed to update_cache() function.
>
> Also we need to remove the initial schedule_update() call in the
> add_one() function.
> One user command is also required to read from user file and write
> onto this device.
> >
> > -- Hal
> >
> > > > > >are these from a live SM and just loaded "out of band" to
> > > > > >bypass/preclude the SA PR mechanism ?
> > > > > maybe
> > > >
> > > > Even if they are, there is still the changes in the subnet issue.
> > > >
> > > > -- Hal
> > > >
> > > > > > -- Hal
> > > > > >
> > > > > > > Admin is loading this info in the cache with user command.
> > > > > > > >
> > > > > > > > > Another point I want to know is,
> > > > > > > > > > When local_sa_cache module will be inserted? After SM comes up or
> > > > > > > > > > Before SM comes up?
> > > > > > > >
> > > > > > > > > It can occur either way. There is no restriction. The cache
> > > > > > > > > responds to port up and GID in/out of service events to update
> > > > > > > > > itself.
> > > > > > > > Do you mean cache module will start building cache only after
> > > > > > > > Port is UP?
> > > > > > > >
> > > > > > > > > > If it's inserted before SM is coming up (I am assuming SM is
> > > > > > > > > > running on some node not on switch) then the first forced
> > > > > > > > > > schedule_update() is wasted, and for the first application
> > > > > > > > > > presence of cache is meaningless. Why not to keep cache
> > > > > > > > > > effective right from the start?
> > > > > > > >
> > > > > > > > > Pre-loading the cache with path records doesn't guarantee that
> > > > > > > > > those paths are usable. If the SM has not come up, then the path
> > > > > > > > > records will be unusable until the SM configures the subnet, plus
> > > > > > > > > there's no guarantee that the remote endpoints specified by the
> > > > > > > > > paths are running.
> > > > > > > > You mean there is no guarantee that even if SM is UP and we have
> > > > > > > > some hard coded entries of path record corresponding to some node
> > > > > > > > X, we are not sure that node X has actually come up or not? In
> > > > > > > > that case actually that path resolving should fail if node has not
> > > > > > > > come up, but with the hard coding still path will be resolved?
> > > > > > > >
> > > > > > > > > The main benefit I see to pre-loading the cache is to avoid SA
> > > > > > > > > storms when booting a large cluster.
> > > > > > > that's true. Also cache will get valid entries only if network is
> > > > > > > configured by SM otherwise every node SA will, possibly, drop SA
> > > > > > > packets.
> > > > > > > >
> > > > > > > > - Sean
> > > > > > > >
> > > > > > > _______________________________________________
> > > > > > > general mailing list
> > > > > > > [email protected]
> > > > > > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
> > > > > > >
> > > > > > > > To unsubscribe, please visit
> > > > > > > > http://openib.org/mailman/listinfo/openib-general
> > > > > >
> > > > > >
> > > >
> > > >
> >
> >