I have a question related to the mixture of HCA bandwidths in the fabric. For an upper layer like MVAPICH, "negotiating" a rate so that all the ports are "involved" can be quite expensive, especially if the code falls in the critical path. Can any additional support be provided by the underlying SA interface so that the upper protocol layers can do the job in minimum time? This kind of support could be used not only for MPI but for other stacks as well.

Thanks,
Amith

On Wed, 22 Feb 2006, Fabian Tillier wrote:

> On 2/22/06, Greg Lindahl <[EMAIL PROTECTED]> wrote:
> >
> > On Tue, Feb 21, 2006 at 11:40:53PM -0800, Fabian Tillier wrote:
> >
> > > You'd have to make the group 1X. Note that the group being 1X doesn't
> > > limit unicast traffic to 1X rates, since the rate for unicast traffic
> > > would be set based on the rate reported in the path records for the
> > > various endpoints.
> > >
> > > So 4X SDR and 4X DDR nodes would have to set their inter-packet delay
> > > for the broadcast group to end up with a 1X packet injection rate.
> >
> > So, basically, MVAPICH doesn't have code that does either the group
> > creation properly when there is a mixture of HCA bandwidths, or limit
> > the packet injection rate. And IPoIB could violate this rule depending
> > on how user programs use it, e.g. if I did a lot of broadcasting, I
> > could easily exceed 1X's bandwidth.
> >
> > So this is more than just a "fix OpenSM" issue. It's more of a "fix
> > the spec" issue, if I'm understanding it correctly.
>
> No, the spec is fine. This is a "fix the SW" issue. If OpenSM
> rejected join requests of nodes for which the MC group is unrealizable
> (that is, some setting of the requestor conflict with the existing
> group, such as the rate), such nodes would not be able to join the
> broadcast group and thus not have IPoIB connectivity.
>
> When the SA responds to the MC join request, the response includes the
> rate. The recipient of the response should create an address vector
> for the MC group that takes the rate into account, which would cause
> the hardware to honor the injection rate such as to not flood the
> group. I haven't looked at MVAPICH, so I can't tell you if what it
> does is correct. IPoIB does seem to do the right thing, though.
>
> - Fab

_______________________________________________
openib-general mailing list
[email protected]
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
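
To illustrate the inter-packet delay point from the quoted reply, here is a minimal sketch in C of how a sender might work out the delay needed to stay within a multicast group's rate. The rate codes are the values used in SA records (2 = 2.5 Gb/s, 3 = 10 Gb/s, 6 = 20 Gb/s, and so on). The helper names and the "rate ratio minus one" delay model are assumptions made for illustration; this is not code from MVAPICH, IPoIB, or OpenSM. In practice the rate returned in the join response would be placed in the address vector and the hardware would apply the throttling.

```c
#include <assert.h>

/* Rate codes as encoded in SA path and multicast member records. */
enum rate_code {
    RATE_2_5_GBPS = 2,  /* 1X SDR */
    RATE_10_GBPS  = 3,  /* 4X SDR */
    RATE_30_GBPS  = 4,  /* 12X SDR */
    RATE_5_GBPS   = 5,  /* 1X DDR */
    RATE_20_GBPS  = 6   /* 4X DDR */
};

/* Express a rate code as a multiple of 2.5 Gb/s; returns 0 for codes
 * this sketch does not handle. */
static int rate_to_mult(int code)
{
    switch (code) {
    case RATE_2_5_GBPS: return 1;
    case RATE_5_GBPS:   return 2;
    case RATE_10_GBPS:  return 4;
    case RATE_20_GBPS:  return 8;
    case RATE_30_GBPS:  return 12;
    default:            return 0;
    }
}

/* Hypothetical helper: inter-packet delay, in units of one packet's
 * transmit time, so that a port running at port_rate injects no faster
 * than group_rate. Assumes the "rate ratio minus one" model. */
static int ipd_for_group(int port_rate, int group_rate)
{
    int port_mult = rate_to_mult(port_rate);
    int group_mult = rate_to_mult(group_rate);

    assert(port_mult > 0 && group_mult > 0);
    if (port_mult <= group_mult)
        return 0;  /* port is no faster than the group: no throttling */
    /* Round the ratio up so the injected rate never exceeds the group's. */
    return (port_mult + group_mult - 1) / group_mult - 1;
}
```

Under these assumptions, a 4X SDR port sending to a 1X group gets a delay of 3 packet times (ipd_for_group(RATE_10_GBPS, RATE_2_5_GBPS)), and a 4X DDR port gets a delay of 7.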
