Sean Hefty wrote:
What I have in mind is that IPoIB must not use cached IB path info.

If the IB stack has path caching in the default flow of requesting a path record, it should provide an API (e.g. a flag to the function through which one does the path query) to request a non-cached path.

Argh! This was the original design. I believe the current design is a better approach. The ULP shouldn't care whether the PR is cached or not - only that it's usable.

Linux has a quite sophisticated mechanism to maintain / cache / probe / invalidate / update the network stack L2 neighbour info.

Stating that it is correct by design for IPoIB to use cached IB L2 info, even when the neighbour cache state machine has decided to update/delete a neighbour, is moving too fast, I think. Some discussion is needed here.

My basic thought is that for IPoIB it's better to never use a cached path than to always use a cached path. But! Maybe there's a middle way here, let's think. This is what I was referring to when saying "almost always".

For example, in the Voltaire gen1 stack we had an ib arp module which was used by both IPoIB and native IB ULPs (SDP, iSER, Lustre, etc). This module managed some sort of path cache, where IPoIB always asked for a non-cached path and the other ULPs were willing to get a cached path.
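To make the split concrete, here is a minimal sketch of a path cache with a per-caller "non-cached" flag. It is only an illustration of the idea, not the Voltaire module or any existing IB stack API: PathRecord, PathCache, get_path, allow_cached and query_sa are all hypothetical names, and the SA query is a stand-in callable.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class PathRecord:
    """Hypothetical, heavily reduced path record (real PRs carry many more fields)."""
    dgid: str
    dlid: int


class PathCache:
    """Hypothetical path cache; query_sa stands in for a real SA path query."""

    def __init__(self, query_sa: Callable[[str], PathRecord]) -> None:
        self._query_sa = query_sa
        self._cache: Dict[str, PathRecord] = {}
        self.sa_queries = 0  # counts how often the SA was actually asked

    def get_path(self, dgid: str, allow_cached: bool = True) -> PathRecord:
        # Callers that accept cached data (SDP, iSER, ... in the example above)
        # are served from the cache when an entry exists.
        if allow_cached and dgid in self._cache:
            return self._cache[dgid]
        # Callers that pass allow_cached=False (IPoIB in the example above)
        # always go to the SA; the fresh answer also refreshes the cache.
        self.sa_queries += 1
        record = self._query_sa(dgid)
        self._cache[dgid] = record
        return record
```

With this shape, a ULP that tolerates stale data calls get_path(dgid), while IPoIB would call get_path(dgid, allow_cached=False) whenever the neighbour cache decides to re-resolve an entry.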

The design I was thinking of suggesting for IPoIB is to almost always use this API, since this policy makes the implementation consistent with the decisions made by the network stack neighbour cache.

This defeats one of the benefits of caching, which is using a single GetTable query versus literally hundreds or thousands of Get queries. Consider that constant all-to-all communication using IPoIB between 1024 ports, with a 15 minute ARP table timeout, would hit the SA with close to 600 queries per second.
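A rough check of that figure, as a sketch: the assumption here (not stated in the mail) is one Get query per unordered pair of ports per ARP timeout period. The function name and its parameters are made up for the illustration.

```python
def sa_queries_per_second(ports: int, arp_timeout_s: float,
                          per_unordered_pair: bool = True) -> float:
    """Estimate the steady-state SA query rate for all-to-all IPoIB traffic.

    Assumption: every path is re-queried exactly once per ARP timeout period.
    """
    paths = ports * (ports - 1)  # directed (source, destination) pairs
    if per_unordered_pair:
        paths //= 2              # one query covers both directions of a pair
    return paths / arp_timeout_s


rate = sa_queries_per_second(ports=1024, arp_timeout_s=15 * 60)
print(f"{rate:.0f} queries/s")  # about 582, i.e. "close to 600"
```

If each direction needed its own query, the same assumptions would give roughly twice that, about 1164 queries per second.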

If the cache is there to serve all-to-all MPI jobs, then in practice, with IB, people who want MPI performance (specifically latency) would --not-- be using IPoIB for their MPI jobs, since they want kernel AND net-stack bypass. So it does make sense to use a non-cached path in IPoIB, if we agree that design-wise it's the correct approach.

While I agree that there's the potential for a problem, given that IPoIB has always cached PRs and no one has reported problems, I think we're overstating the likelihood of issues occurring in practice. Even the SA caches the path data -- getting a PR from the SA doesn't provide any additional guarantees.

I am not with you... I would expect an SA implementation to invalidate / recompute the relevant data structures associated with each change in the fabric, and to get a trap for each change.

Or.


_______________________________________________
general mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
