On Wed, 2007-05-23 at 10:27, Devesh Sharma wrote:
> On 21 May 2007 13:52:11 -0400, Hal Rosenstock <[EMAIL PROTECTED]> wrote:
> > On Mon, 2007-05-21 at 01:58, Devesh Sharma wrote:
> > > On 18 May 2007 06:21:05 -0400, Hal Rosenstock <[EMAIL PROTECTED]> wrote:
> > > > On Thu, 2007-05-17 at 08:28, Devesh Sharma wrote:
> > > > > On 17 May 2007 06:42:16 -0400, Hal Rosenstock <[EMAIL PROTECTED]> wrote:
> > > > > > On Thu, 2007-05-17 at 01:21, Devesh Sharma wrote:
> > > > > > > On 5/17/07, Sean Hefty <[EMAIL PROTECTED]> wrote:
> > > > > > > > > But initially this will generate a packet for each path,
> > > > > > > > > while sys admin knows that path is there and he can
> > > > > > > > > hard-code the entries for it. Other thing is that why
> > > > > > > > > Admin will care about creating such record while SA is
> > > > > > > > > itself taking care, right?
> > > > > > > >
> > > > > > > > In your original message you asked about adding 'dummy
> > > > > > > > entries' to the cache. I agree that pre-loading the cache
> > > > > > > > can be useful. What I still am not understanding is the
> > > > > > > > reasoning for adding 'dummy entries'. By 'dummy entries',
> > > > > > > > I've been assuming that these are invalid path records,
> > > > > > > > but maybe that's not what you meant.
> > > > > > > Ok if "dummy entries" word as such has created confusion then I am
> > > > > > > sorry for that, But with that I mean that, those are valid path
> > > > > > > records which Administrator knows in advance and while loading the
> > > > > > > module,
> > > > > >
> > > > > > How does the admin know they are valid ?
> > > > > Depending on the initial application runs, some trusted PRs can be
> > > > > generated.
> > > >
> > > > What do initial application runs have to do with this ?
> > > My understanding is that, once the cluster is UP, and if between Node
> > > A and Node B there is only one path,
> >
> > So this is a feature for such one path subnets. I wonder what percentage
> > of deployed subnets fits this case.
> You never know, It may be used for debugging also.

I still don't have a good feel for how common/generally useful this will
really be.

> > > then, SA query always going to return same values in PR.
> >
> > If subnet topology is changed, these PRs might change. There are other
> > cases where they change too.
> Not sure about it...some suggestion?
> >
> > > On this basis Initial application runs will generate PRs,
> >
> > That's what confused me before (Applications don't generate PRs but
> > rather request them.) but I think I see what you mean now.
> Ok
> >
> > > these PRs can be saved in some file, and can be loaded
> > > when cache_module comes in.
> > > >
> > > > > >Are they somehow preconfigured at the SM ?
> > > > > I am not sure about SM has any such provision?
> > > >
> > > > Not that I'm aware of.
> > > Ok, So, currently no such support is there in SM?
> >
> > I can speak definitively for OpenSM and there is no such support. As to
> > the vendor SMs, I don't think so but don't know for absolute certainty.
> > Someone can correct me if I'm wrong but I wouldn't assume no response
> > means correctness as some may not be listening nor want to respond as to
> > "value added" vendor specific features.
> What is the issue if OpenSM provides this?

I'm not following you. What does/should OpenSM provide ? OpenIB works in
configurations with other SMs.

> >
> > > > > Also not sure about the
> > > > > role of SM in path resolving. I mean once node has initiated SA query,
> > > > > whether SM has some database to reply SA or On the fly destination
> > > > > node is contacted to get asked path record?
> > > >
> > > > SMs can either calculate the SA PRs on the fly based on the routing
> > > > algorithm in use and some other things or put them in a local database.
> > > > This is up to that SM.
> > > Ok
> > > >
> > > > Destination node is not contacted in the SA PR query process.
> > > >
> > > > > >Doesn't each SM have its own policy for generating valid PRs ?
> > > > > Ultimately path record is in Path_Record object format, and SA cache
> > > > > is going to store in a fixed manner, How generation policy matters?
> > > >
> > > > What if the local policy loaded does not agree with what the SM would
> > > > generate for a particular PR ? One then gets a local error which will
> > > > need to be tracked down. Not so easy IMO.
> > > SM policies in a subnet to generate PRs, changes dynamically? at run time?
> >
> > The policy doesn't change dynamically but the data to be returned in the
> > SA PR response might.
> >
> > > if Not then depending on the local SM policy static PR can be
> > > generated to load initially.
> >
> > Just as one question related to this, how would link failures be handled
> > ? There are others.
> Its just a matter of avoiding initial PR query packets by loading the
> cache with static PRs.....Later on cache module will function in
> normal fashion. I expect, initially every thing will come up in a
> trusted cluster.

So you're saying the cache would still react to GIDs out and in service,
right ? If the cache is loaded from a file, does it bypass querying the
SA initially for PRs ? If that is the case, then the file is required to
be the full set of PRs for this node otherwise there would be incomplete
connectivity.

-- Hal

> > > > > CMIIW. Also I am assuming a homogeneous cluster where certain
> > > > > parameters can be assumed to be same always.
> > > >
> > > > and always in agreement with what the SM would return ? For example,
> > > yes
> > > > what happens when a link goes down and the end node is no longer
> > > > reachable ?
> > > If node is not reachable then, after first timeout of sa_cache, that
> > > entry will be removed from cache.
> >
> > OK; that's another aspect to add into this feature. I don't think that
> > is currently done. I think there would need to be an API added to do
> > this.
> Yes, this has been discussed with Sean, we can add one char_dev
> interface to the existing sa_cache module implementation, Write entry
> point will generate a SA_PR_response packet and this packet will be
> passed to update_cache() function.
>
> Also we need to remove the initial schedule_update() call in the
> add_one() function.
> One user command is also required to read from user file and write
> onto this device.
> >
> > -- Hal
> >
> > > > > >are these from a live SM and just loaded "out of band" to
> > > > > >bypass/preclude the SA PR mechanism ?
> > > > > may be
> > > >
> > > > Even if they are, there is still the changes in the subnet issue.
> > > >
> > > > -- Hal
> > > >
> > > > > > -- Hal
> > > > > >
> > > > > > > Admin is loading this info in the cache with user command.
> > > > > > > >
> > > > > > > > > Another point I want to know is,
> > > > > > > > > When local_sa_cache module will be inserted? After SM comes up or
> > > > > > > > > Before SM comes up?
> > > > > > > >
> > > > > > > > It can occur either way. There is no restriction. The cache responds
> > > > > > > > to port up and GID in/out of service events to update itself.
> > > > > > > Do you mean cache module will start building cache only after
> > > > > > > Port is UP?
> > > > > > > >
> > > > > > > > > If Its inserted before SM is coming up (I am assuming SM is running on
> > > > > > > > > some node not on switch) then First Forced schedule_update() is
> > > > > > > > > wasted, and for the first application presence of cache is
> > > > > > > > > meaningless. Why not to keep cache effective right from the start?
> > > > > > > >
> > > > > > > > Pre-loading the cache with path records doesn't guarantee that those
> > > > > > > > paths are usable.
> > > > > > > > If the SM has not come up, then the path records will
> > > > > > > > be unusable until the SM configures the subnet, plus there's no
> > > > > > > > guarantee that the remote endpoints specified by the paths are
> > > > > > > > running.
> > > > > > > You mean there is no guarantee that even if SM is UP and we have some
> > > > > > > hard coded entries of path record corresponding to some node X, we are
> > > > > > > not sure that node X has actually come up or not? In that case
> > > > > > > actually that path resolving should fail if node has not come up, but
> > > > > > > with the hard coding still path will be resolved?
> > > > > > > >
> > > > > > > > The main benefit I see to pre-loading the cache is to avoid SA storms
> > > > > > > > when booting a large cluster.
> > > > > > > that's true. Also cache will get valid entries only if network is
> > > > > > > configured by SM otherwise every node SA will, possibly, drop SA
> > > > > > > packets.
> > > > > > > >
> > > > > > > > - Sean

_______________________________________________
general mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
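
For readers following the thread, the behaviour being discussed (load static PRs from a file at start-up, fall back to an SA query on a miss, and drop an entry after its first timeout) can be modelled in user space roughly as below. This is only a sketch under stated assumptions: `PathRecord`, `PathCache`, `preload`, `lookup`, `report_timeout` and the `query_sa` callback are hypothetical names invented for illustration, the record carries just three made-up fields, and none of it reflects the actual sa_cache module, its update_cache() function, or the proposed char_dev interface.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Optional, Tuple


@dataclass(frozen=True)
class PathRecord:
    """Toy stand-in for a path record; real PRs carry many more fields."""
    sgid: str
    dgid: str
    dlid: int


Key = Tuple[str, str]


class PathCache:
    """Hypothetical model of a local cache that can be pre-loaded with static PRs."""

    def __init__(self, query_sa: Callable[[str, str], Optional[PathRecord]]):
        # query_sa is an assumed callback standing in for a real SA PR query.
        self._query_sa = query_sa
        self._entries: Dict[Key, PathRecord] = {}
        self.sa_queries = 0  # counts how often the SA had to be asked

    def preload(self, records: Iterable[PathRecord]) -> None:
        """Install admin-supplied records without contacting the SA."""
        for rec in records:
            self._entries[(rec.sgid, rec.dgid)] = rec

    def lookup(self, sgid: str, dgid: str) -> Optional[PathRecord]:
        """Return a cached record, querying the SA only on a miss."""
        key = (sgid, dgid)
        rec = self._entries.get(key)
        if rec is None:
            self.sa_queries += 1
            rec = self._query_sa(sgid, dgid)
            if rec is not None:
                self._entries[key] = rec
        return rec

    def report_timeout(self, sgid: str, dgid: str) -> None:
        """Drop an entry after its first timeout, so the next lookup asks the SA."""
        self._entries.pop((sgid, dgid), None)
```

With a cache built this way, a pre-loaded path is answered without any SA traffic, and an incomplete file only costs an SA query for the missing pair rather than leaving the node without connectivity. It does not address Hal's other concern, that a pre-loaded record may disagree with what the SM would currently return.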
