Re: [ofa-general] Re: IPoIB path caching

Or Gerlitz Tue, 24 Jul 2007 02:40:22 -0700

Sean Hefty wrote:

What I have in mind is that IPoIB must not use cached IB path info.

If the IB stack has path caching which is in the default flow of requesting a path record, it should provide an API (e.g. a flag to the function through which one does a path query) to request a non-cached path.

Argh! This was the original design. I believe the current design is a better approach. The ULP shouldn't care whether the PR is cached or not - only that it's usable.

Linux has a quite sophisticated mechanism to maintain / cache / probe / invalidate / update the network stack L2 neighbour info.

Stating that, although the neighbour cache state machine decided to update/delete a neighbour, it is just correct by design for IPoIB to use cached IB L2 info is moving too fast, I think; some discussion is needed here.

My basic thought is that for IPoIB it's better to never use a cached path than to always use a cached path. But! Maybe there's a way in the middle here, let's think. This is what I was referring to when saying "almost always".

For example, in the Voltaire gen1 stack we had an ib arp module which was used by both IPoIB and native IB ULPs (SDP, iSER, Lustre, etc). This module managed some sort of path cache, where IPoIB was always asking for a non-cached path and other ULPs were willing to get a cached path.
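
As a rough illustration of the arrangement described above, here is a minimal Python sketch of a path cache whose lookup takes a flag that lets the caller bypass the cache. Every name in it (PathCache, get_path, allow_cached, fake_sa) is hypothetical and not taken from the gen1 module or any real IB stack, and the SA query is only a stub.

```python
from typing import Callable, Dict, Tuple

PathKey = Tuple[str, str]  # (source GID, destination GID); hypothetical key


class PathCache:
    """Toy path-record cache; not a real IB stack API."""

    def __init__(self, query_sa: Callable[[str, str], dict]):
        self._query_sa = query_sa  # stub standing in for an SA path query
        self._cache: Dict[PathKey, dict] = {}

    def get_path(self, sgid: str, dgid: str, allow_cached: bool = True) -> dict:
        key = (sgid, dgid)
        if allow_cached and key in self._cache:
            return self._cache[key]
        # Cache miss, or the caller insisted on a fresh record.
        record = self._query_sa(sgid, dgid)
        self._cache[key] = record
        return record


sa_queries = []  # records every query that reaches the stub SA


def fake_sa(sgid: str, dgid: str) -> dict:
    sa_queries.append((sgid, dgid))
    return {"sgid": sgid, "dgid": dgid, "serial": len(sa_queries)}


cache = PathCache(fake_sa)
cache.get_path("A", "B")                               # first lookup goes to the SA
cache.get_path("A", "B")                               # cache-tolerant ULP: served from cache
fresh = cache.get_path("A", "B", allow_cached=False)   # IPoIB-style caller: always queries
```

In this sketch a forced query also refreshes the cached entry, so callers that accept cached paths benefit from the queries made by callers that do not; that is a design choice of the sketch, not something stated about the gen1 module.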

The design I was thinking to suggest for IPoIB is to almost always use this API since this policy makes the implementation consistent with the decisions made by the network stack neighbour cache.

This defeats one of the benefits of caching, which is using a single GetTable query, versus literally hundreds or thousands of Get queries. Consider that constant all-to-all communication using IPoIB between 1024 ports, with a 15 minute ARP table timeout, would hit the SA with close to 600 queries per second.
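
The figure of close to 600 queries per second can be reproduced with a little arithmetic. The sketch below assumes (this is an interpretation, not something the message states) that each unordered pair of ports needs one Get query per ARP timeout period; the variable names are illustrative only.

```python
ports = 1024
arp_timeout_s = 15 * 60  # 15 minute ARP table timeout, in seconds

# Assumption: one SA Get per unordered pair of ports per timeout period.
pairs = ports * (ports - 1) // 2
queries_per_second = pairs / arp_timeout_s

# Alternative assumption: each direction of a pair is queried separately.
directed_queries_per_second = ports * (ports - 1) / arp_timeout_s

print(round(queries_per_second))           # 582
print(round(directed_queries_per_second))  # 1164
```

Under the per-pair assumption the load is about 582 queries per second, which matches the "close to 600" estimate; if each direction were resolved separately it would be roughly double that.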

If the cache is meant to serve all-to-all MPI jobs, then in practice, to get MPI performance (specifically latency) with IB, people would --not-- be using IPoIB for their MPI jobs, since they want kernel AND net-stack bypass. So it does make sense to use a non-cached path in IPoIB if we agree that design-wise it's the correct approach.

While I agree that there's the potential for a problem, given that IPoIB has always cached PRs and no one has reported problems, I think we're overstating the likelihood of issues occurring in practice. Even the SA caches the path data -- getting a PR from the SA doesn't provide any additional guarantees.

I am not with you... I would expect an SA implementation to invalidate / recompute the relevant data structures associated with each change in the fabric and get a trap for each change.
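
To make that expectation concrete, here is a minimal Python sketch of an SA-side cache that drops the affected entries whenever it is told about a fabric change. The class and method names (ToySA, on_fabric_change) and the version counter are hypothetical; no real SA implementation or trap format is being described.

```python
class ToySA:
    """Toy SA that caches computed paths and invalidates them on change."""

    def __init__(self, compute_path):
        self._compute_path = compute_path  # stand-in for real path computation
        self._paths = {}

    def get_path(self, src, dst):
        key = (src, dst)
        if key not in self._paths:
            self._paths[key] = self._compute_path(src, dst)
        return self._paths[key]

    def on_fabric_change(self, port):
        """Called when a change (trap) for `port` arrives: drop every path touching it."""
        self._paths = {k: v for k, v in self._paths.items() if port not in k}


fabric_version = {"n": 0}  # bumped whenever the simulated fabric changes


def compute(src, dst):
    return (src, dst, fabric_version["n"])


sa = ToySA(compute)
before = sa.get_path("A", "B")
other = sa.get_path("C", "D")

fabric_version["n"] += 1   # something about port B changes...
sa.on_fabric_change("B")   # ...and the SA is notified

after = sa.get_path("A", "B")       # recomputed against the new fabric state
untouched = sa.get_path("C", "D")   # unrelated path stays cached
```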


Or.


