> Aren't we creating a monster here??? If this is an SA replica which
> should work for scale from day one, let's call it that and see how to
> get there.
The cache update window is configurable. What we don't know is how often
the SA would be queried to establish connections without a local cache
present. Based on information from SilverStorm, the cache should work
well in practice. What I think we'd really like is a userspace cache
hierarchy / distributed SA; however, the time to develop those does not
fit any of the Path Forward schedules. A mechanism where clients could
ask whether there have been any updates would also work, but I didn't
see a way to do that without modifying the SA node.

> + neither MVAPICH nor OpenMPI are using path query

The national labs want all path records for their routing algorithms. I
believe the problems here were API issues that made connecting
difficult; as a result, most applications just hard-coded everything.

> + OpenMPI is opening its connections "per demand", that is, only if
> rank I attempts to send a message to rank J does I connect to J
>
> + even MPIs that connect all-to-all in an N-rank job would do only
> n(n-1)/2 path queries, so the aggregated load on the SA is half of
> what the all-to-all caching scheme is generating

It would be better to issue a single query for all path records and
discard those that are not needed than to issue separate path record
queries. This is what the cache does. The difference is roughly 1000
queries versus 500,000 queries, and the total number of MADs generated
by the SA is still lower when a single query returns all path records.

- Sean
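P.S. A rough sketch of where those numbers come from, assuming a job on
the order of 1000 ranks with one rank per node (the figure that the 1000
vs. 500,000 comparison implies; NODES below is just that illustrative
constant):

#include <stdio.h>

#define NODES 1000UL

int main(void)
{
        /* One path query per node pair: what an all-to-all MPI job
         * would issue without a local cache. */
        unsigned long per_pair = NODES * (NODES - 1) / 2;

        /* One query per node returning all path records: the query
         * count with the proposed cache. */
        unsigned long cached = NODES;

        printf("per-pair path queries: %lu\n", per_pair); /* 499,500 */
        printf("one query per node:    %lu\n", cached);   /* 1,000 */
        return 0;
}

For 1000 nodes that works out to 499,500 individual path queries against
1000 bulk queries, even though each bulk query returns more records per
response.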
