I was playing around with some really silly fragment sizes (under 72
bytes) when I ran into some asserts in btl_openib_sendi. I traced
the assert to mca_pml_ob1_send_request_start_btl(), which
calculates the true eager_limit with the following line:

    size_t eager_limit = btl->btl_eager_limit - sizeof(mca_pml_ob1_hdr_t);

If btl_eager_limit ends up being less than sizeof(mca_pml_ob1_hdr_t),
the subtraction is done in unsigned size_t arithmetic and wraps around,
so the calculated eager_limit is a very large number, which trips an
assert later on in the stack.
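
To illustrate the wraparound outside of Open MPI, here is a minimal
standalone sketch. demo_hdr_t is a made-up 32-byte stand-in for
mca_pml_ob1_hdr_t (the real header has a different size), and both
function names are hypothetical, not part of ob1:

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for mca_pml_ob1_hdr_t; the 32-byte size is an assumption
 * chosen only for illustration. */
typedef struct { unsigned char bytes[32]; } demo_hdr_t;

/* Mirrors the quoted subtraction. size_t is unsigned, so the result
 * wraps around when btl_eager_limit < sizeof(demo_hdr_t). */
static size_t unchecked_eager_limit(size_t btl_eager_limit)
{
    return btl_eager_limit - sizeof(demo_hdr_t);
}

/* Guarded variant: reports 0 usable bytes when the header does not fit,
 * instead of a huge wrapped value. */
static size_t checked_eager_limit(size_t btl_eager_limit)
{
    return (btl_eager_limit > sizeof(demo_hdr_t))
        ? btl_eager_limit - sizeof(demo_hdr_t)
        : 0;
}
```

With a 16-byte limit, unchecked_eager_limit(16) yields SIZE_MAX - 15,
while checked_eager_limit(16) yields 0.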
It seems to me that it would be nice to insert some checks in
mca_btl_base_param_register() to make sure btl_eager_limit is greater
than sizeof(mca_pml_ob1_hdr_t). Am I missing a reason why this was not
done in the first place?
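
Roughly the kind of check I have in mind, as a sketch only.
demo_validate_eager_limit() and its min_hdr_size parameter are
hypothetical; this is not existing Open MPI code, and I am assuming the
caller can supply the header size:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, not part of Open MPI. Returns 0 if the eager
 * limit leaves room for at least one payload byte after the header,
 * and -1 (with a message on stderr) otherwise. */
static int demo_validate_eager_limit(size_t eager_limit, size_t min_hdr_size)
{
    if (eager_limit <= min_hdr_size) {
        fprintf(stderr,
                "eager limit %zu must be greater than header size %zu\n",
                eager_limit, min_hdr_size);
        return -1;
    }
    return 0;
}
```

Rejecting the value up front at registration time would turn the late
assert into an early, readable error.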
--td