On Wed, 28 May 2008, Roland Dreier wrote:
> > - gleb asks: don't we want to avoid the system call when possible?
> > - patrick: a single syscall can be/is cheaper than a reg cache
> >   lookup in user space
>
> This doesn't really make sense -- syscall + cache lookup in kernel is
> "obviously" more expensive than cache lookup in userspace with no
> context switch (I don't see any tricks the kernel can do that make the
> cache lookup cheaper there).
>
> However the solution I proposed a long time ago (when Pete Wyckoff
> originally did his work on having the kernel track this -- and as a side
> note, it's not clear to me whether MMU notifiers really help what Pete
> did) is for userspace to provide a pointer to a flag when registering
> memory with the kernel, and then the kernel can mark the flag if the
> mapping changes -- ie keep the userspace cache but have the kernel
> manage invalidation "perfectly" without any malloc hooks.
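
For illustration only, the flag scheme described above might look
roughly like the sketch below. It is a pure simulation in Python, not
any real verbs or kernel interface: FakeKernel, InvalidationFlag,
RegistrationCache, register() and mapping_changed() are all hypothetical
names made up for this example. In a real design the flag would be a
word of user memory whose address is passed at registration time and
which the kernel writes to when the mapping changes.

```python
from dataclasses import dataclass, field


@dataclass
class InvalidationFlag:
    """Stand-in for the user-memory flag the kernel would mark (assumption)."""
    stale: bool = False


@dataclass
class Registration:
    handle: int
    flag: InvalidationFlag = field(default_factory=InvalidationFlag)


class FakeKernel:
    """Hypothetical kernel side; only counts calls and marks flags."""

    def __init__(self):
        self._next_handle = 1
        self._regs = []          # (addr, length, flag) per registration
        self.register_calls = 0  # how many "syscalls" were made

    def register(self, addr, length, flag):
        self.register_calls += 1
        handle = self._next_handle
        self._next_handle += 1
        self._regs.append((addr, length, flag))
        return handle

    def mapping_changed(self, addr, length):
        # Mark the flag of every registration overlapping the changed range.
        for r_addr, r_len, flag in self._regs:
            if r_addr < addr + length and addr < r_addr + r_len:
                flag.stale = True


class RegistrationCache:
    """Userspace cache; invalidation comes from the flag, not malloc hooks."""

    def __init__(self, kernel):
        self.kernel = kernel
        self._entries = {}

    def lookup(self, addr, length):
        entry = self._entries.get((addr, length))
        if entry is not None and not entry.flag.stale:
            return entry.handle  # fast path: no call into the kernel
        flag = InvalidationFlag()
        handle = self.kernel.register(addr, length, flag)
        self._entries[(addr, length)] = Registration(handle, flag)
        return handle


kernel = FakeKernel()
cache = RegistrationCache(kernel)
h1 = cache.lookup(0x1000, 4096)       # miss: registers with the kernel
h2 = cache.lookup(0x1000, 4096)       # hit: flag is clear, no kernel call
kernel.mapping_changed(0x1000, 4096)  # e.g. the range was unmapped
h3 = cache.lookup(0x1000, 4096)       # flag is set: re-register
```

The point of the sketch is only that the hit path checks a flag in user
memory and never enters the kernel, while a changed mapping forces a
fresh registration on the next lookup.
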
I think Patrick's point is that it's not too much more expensive to do the
syscall on Linux vs just doing the cache lookup, particularly in the
context of a long message. And it means that upper layer protocols like
MPI don't have to deal with caches (and since MPI implementors hate
registration caches only slightly less than we hate MPI_CANCEL, that will
make us happy).
Brian