> Using a local SA cache we were able to establish all-to-all connections between
> 1024 processes (about 1 million connections) in about 3 seconds. Without the
> cache, connection time took about a minute, and required a substantial amount of
> tuning of timeout values to achieve this.

Sean,

I think there are some good ideas in this patch:
- limiting the number of outstanding SA MADs
- batching multiple path queries in a single table request

I think these are likely to help many workloads, not just MPI all-to-all.

I am very time-constrained currently due to the work on OFED 1.2, so I am
responding to the design more than the code itself.

Questions:

1. What happens on e.g. a heterogeneous network? It seems that the path to a
   specific GID might change (e.g. its MTU) without the GID going in or out of
   service. How would this be handled?

2. What will happen when there are a number of changes in the network? Would the
   SA not need to send a huge number of notices now? Should we be concerned?

3. Comments indicate that the main win from the patch is with all-to-all
   startup times on large MPI clusters. If that is so, and assuming a small
   number of MPI jobs is running on each node, isn't it true that the main win
   is not from *caching* as such (since all paths are requested at the
   beginning and never used after this), but rather from limiting the number of
   outstanding MADs to the SA and from batching multiple path queries in a
   single request? Could that be the case?

4. Why do we need yet another API and yet another module to speed up just
   RDMA/CM path record queries? We now get 2 ways to do this (with and without
   the cache). Shouldn't there be just one?

5. How will the user guess the correct value for the paths_per_dest tunable,
   besides disabling the cache? I notice it is currently set to a value of
   0x7F. Where does this value come from?

> I've only updated the rdma_cm to use the cache, but similar changes could be
> made to SRP and ipoib (which implements its own path record cache).
>
> I would like to get feedback on both the notice and local_sa patches for
> inclusion in 2.6.22 or 2.6.23 (if 2.6.22 is not possible).

Since OFED includes a significantly different version of this code (without
notices), and this is the first time the notices code makes an appearance, I
think that targeting .23, and considering alternative options such as the
above, would be more prudent.

-- 
MST
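
To make the point in question 3 concrete, here is a minimal Python sketch of
limiting outstanding requests and batching lookups, with no caching at all. It
is a toy model under stated assumptions, not the patch's code:
`ThrottledBatcher`, `resolve_all`, `max_outstanding` and `batch_size` are all
hypothetical names, and a "response" is modelled simply as retiring the oldest
in-flight batch.

```python
from collections import deque


class ThrottledBatcher:
    """Toy model: queue path lookups, batch them, cap in-flight requests."""

    def __init__(self, max_outstanding, batch_size):
        self.max_outstanding = max_outstanding  # hypothetical tunable
        self.batch_size = batch_size            # hypothetical tunable
        self.pending = deque()   # destinations not yet sent
        self.in_flight = []      # batches sent, awaiting a response
        self.sent_requests = 0

    def submit(self, dest):
        self.pending.append(dest)

    def pump(self):
        """Send as many batches as the in-flight cap allows."""
        while self.pending and len(self.in_flight) < self.max_outstanding:
            n = min(self.batch_size, len(self.pending))
            batch = [self.pending.popleft() for _ in range(n)]
            self.in_flight.append(batch)
            self.sent_requests += 1

    def complete_one(self):
        """Model one response arriving; return the destinations it resolved."""
        batch = self.in_flight.pop(0)
        self.pump()  # a free slot may allow the next batch to go out
        return batch


def resolve_all(dests, max_outstanding, batch_size):
    """Resolve every destination; report request count and peak in-flight."""
    q = ThrottledBatcher(max_outstanding, batch_size)
    for d in dests:
        q.submit(d)
    q.pump()
    resolved = []
    peak = len(q.in_flight)
    while q.in_flight:
        resolved.extend(q.complete_one())
        peak = max(peak, len(q.in_flight))
    return resolved, q.sent_requests, peak
```

For example, `resolve_all(range(1023), max_outstanding=4, batch_size=64)`
resolves 1023 destinations with 16 requests and never has more than 4 of them
in flight. Nothing here is reused after the first lookup, which is the sense in
which the win might not come from caching as such.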
