Sean Hefty wrote:
> Administrators can enable or disable the cache. I don't believe that individual applications should be able to override the administrator, nor do I think we gain anything by having per-application settings. This is similar to exposing to applications whether they want to use cached ARP information every time they connect.

Applications --can-- delete the network stack's neighbour entry before taking a given action.

For example, I think it would be correct for IB block and file I/O ULPs (iSER, SRP, Lustre, rNFS, etc.) to request non-cached PRs: their connection model is not all-to-all but n-to-m (n clients to m servers, with m << n), the connections are long-lived (hours, days, weeks or more), and a connection failure caused by PR caching does not seem acceptable.

> I believe a better solution is for everyone to use cached records, if they exist, with a feedback mechanism from the CM that removes paths on a connection failure or path migration event.

That's an interesting point. What's the conceptual difference between a CM connection failure caused by a "wrong" PR and the failure of a --unicast-- ARP probe initiated by the network stack? CM feedback to the local SA seems like the correct approach to me; however, I don't see the equivalent for UD communication.
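
To make the feedback idea concrete, here is a minimal sketch of a PR cache that drops an entry when the CM reports a connection failure or path migration. It is only a toy model: the class, the method names, the event strings and the fake SA query are all hypothetical and do not correspond to any existing kernel or userspace API.

```python
class PathRecordCache:
    """Toy path record (PR) cache keyed by (source GID, destination GID)."""

    def __init__(self, query_sa):
        self._query_sa = query_sa  # hypothetical callable standing in for an SA query
        self._records = {}
        self.sa_queries = 0        # number of queries that actually reached the SA

    def resolve(self, sgid, dgid):
        """Return the cached PR if present, otherwise query the SA and cache it."""
        key = (sgid, dgid)
        if key not in self._records:
            self.sa_queries += 1
            self._records[key] = self._query_sa(sgid, dgid)
        return self._records[key]

    def on_cm_event(self, sgid, dgid, event):
        """CM feedback: forget the path on a failure or migration event."""
        if event in ("connect_failed", "path_migrated"):  # hypothetical event names
            self._records.pop((sgid, dgid), None)


def fake_sa_query(sgid, dgid):
    # Stand-in for a real SA path record query; the returned fields are made up.
    return {"sgid": sgid, "dgid": dgid, "dlid": 7}


cache = PathRecordCache(fake_sa_query)
cache.resolve("gid-a", "gid-b")                        # miss: goes to the SA
cache.resolve("gid-a", "gid-b")                        # hit: served from the cache
cache.on_cm_event("gid-a", "gid-b", "connect_failed")  # feedback removes the path
cache.resolve("gid-a", "gid-b")                        # miss again: SA is re-queried
print(cache.sa_queries)
```

Note that the sketch only has a hook for CM events; a UD sender has no comparable event to feed back, which is the gap mentioned above.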

> With all-to-all connections over the rdma cm, the first thing that needs to be done is resolve the remote addresses to GIDs. This causes an ARP storm, followed by an SA storm caused by IPoIB, followed by a second SA storm caused by the rdma cm. For scalability, we need to remove both of these SA storms, not just the second. We don't see the first SA storm today because IPoIB caches PRs. Let's not add it. Restricting caching to the rdma cm, but removing it from IPoIB, leaves us with the same issues that we have today.

Again, the typical I/O client-server scheme is n-to-m where m is small (1 or 2, say, up to a few tens). The PRs are needed by IPoIB only at the --passive-- side, which sends the unicast ARP reply. So when n=1024 and m=4, the SA would need to serve 4096 PRs, which is about one fourth of the queries/second rate you reported in earlier threads on the matter.
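
The arithmetic behind that figure can be sketched as follows. The n-to-m count is simply n * m, one PR per (server, client) pair resolved on the passive side. The all-to-all figure is included only for comparison and assumes one PR per ordered pair of nodes, which is my own simplification and not a number taken from this thread; the function names are made up for illustration.

```python
def pr_queries_client_server(n_clients, m_servers):
    """PRs served when each of m servers resolves a path to each of n clients."""
    return n_clients * m_servers


def pr_queries_all_to_all(n_nodes):
    """PRs served if every node resolves a path to every other node (assumed model)."""
    return n_nodes * (n_nodes - 1)


print(pr_queries_client_server(1024, 4))  # 4096
print(pr_queries_all_to_all(1024))        # 1047552
```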

Or.
