On Jun 7, 2007, at 9:11 AM, George Bosilca wrote:
There is something weird with this change, and the patch reflect
it. The new argument "order" come from the PML level and might be
MCA_BTL_NO_ORDER (which is kind of global) or BTL_OPENIB_LP_QP or
BTL_OPENIB_HP_QP (which are definitively Open IB related). Do you
really intend to let the PML knows about Open IB internal constants ?
No, the PML knows only one thing about the order tag, it is either
MCA_BTL_NO_ORDER or it is something that the BTL assigns.
The PML has no idea about BTL_OPENIB_LP_QP or BTL_OPENIB_HP_QP, to
the PML it is just an order tag assigned to a fragment by the BTL.
So the semantics are that after a btl_send/put/get an order tag may
be assigned by the BTL to the descriptor,
This order tag can then be specified to subsequent calls to btl_alloc
or btl_prepare. The PML has no idea what the value means, other than
he is requesting a descriptor that will be ordered w.r.t. a
previously transmitted descriptor.
If it's the case (which seems to be true from the following snippet
if(MCA_BTL_NO_ORDER == order) {
frag->base.order = BTL_OPENIB_LP_QP;
} else {
frag->base.order = order;
}
So I am choosing some ordering to use here because the PML told me he
doesn't care, what is wrong with this?
) I expect you to revise the patch in order to propose a generic
solution or I'll trigger a vote against the patch.
This exports no knowledge of the Open IB BTL to the PML layer, the
PML doesn't know that this is a QP index, he doesn't care! The PML
simply uses this value (if it wants to) to request ordering with
subsequent fragments. We use the QP index only as a BTL optimization,
it could have been anything. So the only new knowledge that the PML
has is how to request that ordering of fragments be enforced, and the
BTL doesn't even have to provide this if it doesn't want, that is the
reason for MCA_BTL_NO_ORDER.
Please describe a use case where this is not a generic solution. Keep
in mind that MX, TCP, GM all can provide ordering guarantees if they
wish, in fact for MX you can simply always assign an order tag, say
the value is 1. MX can then guarantee ordering of all fragments sent
over the same BTL.
I vote to be backed out of the trunk as it export way to much
knowledge from the Open IB BTL into the PML layer.
The only other option that I have identified that doesn't push PML
level protocol into the BTL is to require that BTLs always guarantee
ordering of fragments sent/put/get over the same BTL.
george.
PS: With Gleb changes the problem is the same. The following
snippet reflect exactly the same behavior as the original patch.
Gleb's changes don't change the semantic guarantees that I have
described above.
frag->base.order = order;
assert(frag->base.order != BTL_OPENIB_HP_QP);
On Jun 7, 2007, at 9:49 AM, Gleb Natapov wrote:
Hi Galen,
On Sun, May 27, 2007 at 10:19:09AM -0600, Galen Shipman wrote:
With current code this is not the case. Order tag is set during a
fragment
allocation. It seems wrong according to your description. Attached
patch fixes
this. If no specific ordering tag is provided to allocation
function order of
the fragment is set to be MCA_BTL_NO_ORDER. After call to send/put/
get order
is set to whatever QP was used for communication. If order is set
before send call
it is used to choose QP.
I do set the order tag during allocation/prepare, but the defined
semantics are that the tag is only valid after send/put/get. We can
set them up any where we wish in the BTL, the PML however cannot
rely
on anything until after the send/put/get call. So really this is an
issue of semantics versus implementation. The implementation I
believe does conform to the semantics as the upper layer (PML)
doesn't use the tag value until after a call to send/put/get.
I will look over the patch however, might make more sense to delay
setting the value until the actual send/put/get call.
Have you had a chance to look over the patch?
--
Gleb.
_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel
_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel