Thanks for the sanity check. Fixed in r24202.
George Bosilca wrote:
Because the endpoint's btl_max_send_size was initialized to the minimum of the
max send sizes of all BTLs in the send (respectively, RDMA) array, the loop you
pinpointed has no effect: it cannot find a value smaller than the minimum
already computed. Pre-setting it to (size_t)-1 before the loop should fix the
issue.
On Jan 3, 2011, at 17:17, Eugene Loh wrote:
I can't tell whether this is a problem; if it is, I suspect it's a small
one.
In mca_bml_r2_del_proc_btl(), a BTL is removed from the send list and from the
RDMA list.
If the BTL is removed from the send list, the endpoint's max send size is
recomputed to be the minimum of the max send sizes of the remaining BTLs. The
code looks like this, where I've removed some code to focus on the parts that
matter:
    /* remove btl from send list */
    if(mca_bml_base_btl_array_remove(&ep->btl_send, btl)) {
        /* reset max_send_size to the min of all btl's */
        for(b=0; b< mca_bml_base_btl_array_get_size(&ep->btl_send); b++) {
            bml_btl = mca_bml_base_btl_array_get_index(&ep->btl_send, b);
            ep_btl = bml_btl->btl;
            if (ep_btl->btl_max_send_size < ep->btl_max_send_size) {
                ep->btl_max_send_size = ep_btl->btl_max_send_size;
            }
        }
    }
Shouldn't that inner loop be preceded by initialization of ep->btl_max_send_size to some
very large value (ironically enough, perhaps "-1")?
Something similar happens in the same function when the BTL is removed from the RDMA
list and ep->btl_pipeline_send_length and ep->btl_send_limit are recomputed.
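For anyone reading along, here is a minimal, self-contained sketch of the reset-then-minimize pattern discussed above. It is not the code committed in r24202. The toy_btl_t and toy_endpoint_t types and the toy_reset_max_send_size() function are hypothetical stand-ins for the real BML structures, and a plain array replaces mca_bml_base_btl_array_*. The only point is that the running minimum has to be reset to the largest size_t before the loop; otherwise a stale minimum, possibly left behind by the BTL that was just removed, can never be raised.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a BTL module: only the field used here. */
typedef struct {
    size_t btl_max_send_size;
} toy_btl_t;

/* Hypothetical stand-in for a BML endpoint with a plain array of BTLs. */
typedef struct {
    const toy_btl_t *btls;              /* BTLs remaining after the removal */
    size_t           count;             /* number of remaining BTLs */
    size_t           btl_max_send_size; /* may hold a stale minimum */
} toy_endpoint_t;

/* Recompute the endpoint limit as the minimum over the remaining BTLs. */
void toy_reset_max_send_size(toy_endpoint_t *ep)
{
    assert(ep != NULL);

    /* (size_t)-1 converts to the largest size_t value, so any real
     * limit compares as smaller and replaces it. */
    ep->btl_max_send_size = (size_t)-1;

    for (size_t b = 0; b < ep->count; b++) {
        if (ep->btls[b].btl_max_send_size < ep->btl_max_send_size) {
            ep->btl_max_send_size = ep->btls[b].btl_max_send_size;
        }
    }
}
```

With two remaining BTLs whose limits are 65536 and 32768 and a stale endpoint value of 4096, the call sets the endpoint limit to 32768. Without the reset, the loop would leave it at 4096. If no BTLs remain, the value stays at (size_t)-1, so a caller would have to treat that case separately. The same pattern would presumably apply to the values recomputed on the RDMA path.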