So, my main concern is with the role of kernel caching and especially with
how control is exported to user space.

The only control currently exported by the local SA is a module parameter that allows a user to force a refresh of the entire cache. I do not want to extend this until we can get at least some basic PR caching functionality merged.

I want something small that we can build on, and the local_sa patch is already 1300 lines of code, with another 1000 lines to support InformInfo registration.

Clearly the kernel needs a fast lookup cache for things like ipoib and
others. I don't think a kernel module needs or wants a full-on
distributed SA.

We're talking about PR caching only at this point, with possible extensions to support QoS. Other SA information is not cached or needed.

For all-to-all connections, current code does something like the following:

1. Resolves IP addresses to DGIDs using ARP. This results in IPoIB querying the SA and caching 1 PR per DGID.
2. Apps query the SA for PRs, with 1 PR query per DGID (see the sketch after this list). Eventually we'll get back the same set of PRs that IPoIB already had cached.
3. Establish the connections. The IB CM stores the PR information with each connection in order to set the QP attributes properly.
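
Roughly what step 2 looks like with the in-kernel SA query API. This is a hedged sketch only: I'm assuming the ib_sa_path_rec_get() interface and comp-mask names from <rdma/ib_sa.h>, 'client' would have to be registered with ib_sa_register_client(), and real code would keep the query handles around so it can cancel them.

#include <linux/string.h>
#include <rdma/ib_verbs.h>
#include <rdma/ib_sa.h>

static void pr_done(int status, struct ib_sa_path_rec *resp, void *context)
{
	if (!status)
		memcpy(context, resp, sizeof(struct ib_sa_path_rec));
}

static int query_all_paths(struct ib_sa_client *client, struct ib_device *dev,
			   u8 port, union ib_gid *sgid, union ib_gid *dgids,
			   struct ib_sa_path_rec *results, int n)
{
	struct ib_sa_path_rec rec;
	struct ib_sa_query *query;
	int i, ret;

	for (i = 0; i < n; i++) {	/* N separate round trips to the SA */
		memset(&rec, 0, sizeof rec);
		rec.sgid      = *sgid;
		rec.dgid      = dgids[i];
		rec.numb_path = 1;
		ret = ib_sa_path_rec_get(client, dev, port, &rec,
					 IB_SA_PATH_REC_SGID |
					 IB_SA_PATH_REC_DGID |
					 IB_SA_PATH_REC_NUMB_PATH,
					 1000 /* ms */, GFP_KERNEL,
					 pr_done, &results[i], &query);
		if (ret < 0)
			return ret;
	}
	return 0;
}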

We end up with redundant queries and the PR being cached in multiple places. One optimization is to replace the N PR queries with a single, more efficient GetTable query. A second optimization is to centralize the PR caching. The local SA does the first, and starts us down the road of the second.
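
For reference, that single GetTable query is just a standard SA MAD. A hedged sketch of the request header follows; the class, method, and attribute constants are the standard ones from <rdma/ib_mad.h> and <rdma/ib_sa.h>, but matching on SGID alone (so the SA returns every path originating at the local port) is my assumption about how the bulk load would work.

#include <linux/string.h>
#include <rdma/ib_mad.h>
#include <rdma/ib_sa.h>

static void build_gettable_pr_hdr(struct ib_mad_hdr *hdr)
{
	memset(hdr, 0, sizeof *hdr);
	hdr->base_version  = IB_MGMT_BASE_VERSION;
	hdr->mgmt_class    = IB_MGMT_CLASS_SUBN_ADM;		/* SA class (0x03) */
	hdr->class_version = 2;					/* SA class version */
	hdr->method        = IB_SA_METHOD_GET_TABLE;		/* 0x12 */
	hdr->attr_id       = cpu_to_be16(IB_SA_ATTR_PATH_REC);	/* 0x0035 */
	/* The SA header's component mask would select SGID only, and the
	 * data portion would carry a path record with sgid set to the
	 * local port GID.  The RMPP-segmented GetTableResp then returns
	 * all matching PRs in one exchange instead of N queries. */
}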

I personally think a simple, small in-kernel fast lookup cache, merged
with the ipoib cache and with a netlink interface to userspace to
add/delete/flush entries, is a very good solution that will remain
useful in the future. netlink would also carry cache-miss queries to
userspace. In the absence of a daemon the kernel could query on its own
but cache very conservatively. A userspace version of the very
aggressive cache you have now could also be created right away.
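
To make that concrete, the generic netlink command set such an interface implies might look something like the following. The family name, commands, and attributes are all invented for illustration; they would be registered with genl_register_family(), and nothing like this exists in the tree today.

#define IB_PR_CACHE_GENL_NAME		"ib_pr_cache"	/* hypothetical family */
#define IB_PR_CACHE_GENL_VERSION	1

enum ib_pr_cache_cmd {
	IB_PR_CACHE_CMD_ADD,	/* userspace daemon loads one path record */
	IB_PR_CACHE_CMD_DEL,	/* invalidate a single entry */
	IB_PR_CACHE_CMD_FLUSH,	/* drop the entire cache */
	IB_PR_CACHE_CMD_MISS,	/* kernel -> userspace: lookup missed, resolve it */
};

enum ib_pr_cache_attr {
	IB_PR_CACHE_ATTR_UNSPEC,
	IB_PR_CACHE_ATTR_SGID,		/* 16-byte source GID */
	IB_PR_CACHE_ATTR_DGID,		/* 16-byte destination GID */
	IB_PR_CACHE_ATTR_PATH_REC,	/* path record in SA wire format */
	__IB_PR_CACHE_ATTR_MAX,
};
#define IB_PR_CACHE_ATTR_MAX (__IB_PR_CACHE_ATTR_MAX - 1)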

I believe that the PR caching should be done outside of IPoIB. Other paths may exist that IPoIB does not use.

This is because I firmly do not believe in caching as a solution to the
scalability problems. They must be solved with some level of replication
and distribution of the SA data and algorithms.

PR caching *is* replication of the SA data. The local SA works with all existing SAs. It is not tied to one vendor, nor does it require changes to the SAs. Sure, we can define vendor-specific protocols to assist with or optimize synchronization, but I don't believe that's necessary for an initial submission. (In fact, I think it's undesirable at this point, since it would require changes to the SA.)

Maybe you could summarise how the user/kernel interface works?  The
last I saw was something based on MADs that looked very inefficient
compared with netlink.

I suggested a MAD interface to the local SA as being the most extensible. It allows interacting with the cache from a local or remote node in a very IB-native fashion. The local SA is reached over QP1, and any new protocols can re-use the existing SA MAD format.

For example, the cache could be loaded using a 'SetTable PR' MAD. It doesn't matter if the MAD is sent from a local user space daemon, some distributed SA agent, or the master SA. Paths can be invalidated by sending 'Delete PR' MADs.
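
A hedged sketch of what those two MADs might look like, parallel to the GetTable header above: the SA class, the Delete method, and the PathRecord attribute ID are the standard definitions, but 'SetTable' is not a defined SA method, so its value below is only a placeholder for the proposed extension.

#include <linux/string.h>
#include <rdma/ib_mad.h>
#include <rdma/ib_sa.h>

#define LOCAL_SA_METHOD_SET_TABLE 0x20	/* placeholder, not in the IBA spec */

/* 'SetTable PR': load the cache with an (RMPP-segmented) table of PRs.
 * It doesn't matter who sends it - a local daemon, a distributed SA
 * agent, or the master SA. */
static void build_set_table_pr(struct ib_mad_hdr *hdr)
{
	memset(hdr, 0, sizeof *hdr);
	hdr->base_version  = IB_MGMT_BASE_VERSION;
	hdr->mgmt_class    = IB_MGMT_CLASS_SUBN_ADM;	/* existing SA MAD format */
	hdr->class_version = 2;
	hdr->method        = LOCAL_SA_METHOD_SET_TABLE;
	hdr->attr_id       = cpu_to_be16(IB_SA_ATTR_PATH_REC);
}

/* 'Delete PR': invalidate cached paths selected by the component mask. */
static void build_delete_pr(struct ib_mad_hdr *hdr)
{
	memset(hdr, 0, sizeof *hdr);
	hdr->base_version  = IB_MGMT_BASE_VERSION;
	hdr->mgmt_class    = IB_MGMT_CLASS_SUBN_ADM;
	hdr->class_version = 2;
	hdr->method        = IB_SA_METHOD_DELETE;	/* standard SA Delete */
	hdr->attr_id       = cpu_to_be16(IB_SA_ATTR_PATH_REC);
}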

It may also be possible to extend such an interface for QoS purposes.

- Sean