On May 28, 2008, at 5:09 PM, Roland Dreier wrote:
I think Patrick's point is that it's not too much more expensive to do
the syscall on Linux vs just doing the cache lookup, particularly in the
context of a long message. And it means that upper layer protocols like
MPI don't have to deal with caches (and since MPI implementors hate
registration caches only slightly less than we hate MPI_CANCEL, that
will make us happy).
Stick in a separate library then?
I don't think we want the complexity in the kernel -- I personally would
argue against merging it upstream; and given that the userspace solution
is actually faster, it becomes pretty hard to justify.
If someone would like to pull a registration cache into OFED, that would
be great. But something tells me they won't want to. It's a pain, it
screws up users, and it only works about 50% of the time.
It's a support issue -- pushing it in a separate library doesn't help
anyone unless someone's willing to handle the support. I sure as heck
don't want to do the support anymore, particularly since OFED is the
*ONLY* major software stack that requires such evil hacks. MX handles
it at the lower layer. Portals is specified such that the hardware
and/or Portals library must handle it (by specifying semantics that
require registration per message). Quadrics (with tports) handles it
in a combination of the kernel and library. TCP doesn't require
pinning and/or registration.
Brian
--
Brian Barrett
Open MPI developer
http://www.open-mpi.org/