Re: [Pvfs2-developers] libpvfs2 usage

Troy Benjegerdes Wed, 18 Oct 2006 13:09:38 -0700

The bigger problem is the same one seen by most applications that
use networks that require memory registration:  program semantics do
not require users to register memory but underlying hardware does,
thus something has to patch that gap.  If you reg/dereg around every
transfer, things are very slow.  Hence we go with caching in some
middle layer to fix this up.  The same is true for MPI as well.
(The Netpipe guys had a way to cause lots of damage by sending lots
of little buffers rather than one big one, I recall.)

The NetPIPE guy(s), which is me right now, do this by doing a ping-pong of, say, a 128k message, but we send the message from a different address each time. This beats up memory registration caches nicely ;) We call this NetPIPE's cache-invalidate mode. It was originally written to address stuff that ended up in CPU caches, but it works quite nicely to break other caches as well.

The NetPIPE pvfs module, when run with cache invalidate, effectively writes sequentially, but from a different buffer every time as well, and we end up seeing the same behavior, and breakage on the ehca.
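
To make the effect concrete, here is a rough toy model of why sending from a moving address hurts. It is only a sketch: RegistrationCache, run_pingpong, the 16-entry capacity and the 64-slot buffer pool are all made-up names and numbers for illustration, not anything taken from NetPIPE, PVFS or a real verbs library. It just counts how often an LRU cache keyed on (address, length) has to "register" a region.

```python
from collections import OrderedDict


class RegistrationCache:
    """Toy LRU cache of 'registered' (address, length) regions (hypothetical)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.registrations = 0    # cache misses: a region had to be registered
        self.deregistrations = 0  # evictions: a region had to be deregistered

    def acquire(self, addr, length):
        key = (addr, length)
        if key in self.entries:
            # Hit: region is already registered, just refresh its LRU position.
            self.entries.move_to_end(key)
            return
        # Miss: pay for a registration, evicting the oldest entry if full.
        self.registrations += 1
        self.entries[key] = True
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)
            self.deregistrations += 1


def run_pingpong(cache, iterations, msg_size, pool_size, invalidate):
    """Send msg_size bytes per iteration; with invalidate, move the address."""
    for i in range(iterations):
        offset = (i * msg_size) % pool_size if invalidate else 0
        cache.acquire(offset, msg_size)
    return cache.registrations


MSG = 128 * 1024   # 128k message, as in the ping-pong described above
POOL = 64 * MSG    # assumed pool size: 64 distinct send addresses

same = run_pingpong(RegistrationCache(16), 1000, MSG, POOL, invalidate=False)
moving = run_pingpong(RegistrationCache(16), 1000, MSG, POOL, invalidate=True)
print(same, moving)
```

With these assumed numbers the fixed-buffer run registers once, while the moving-buffer run misses on every send, because it cycles through more distinct addresses than the cache can hold. A real registration cache may merge or overlap regions, so treat this as an illustration of the access pattern only.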


By the way, various groups keep rediscovering this problem but there
are no real appealing fixes.  When was the last time you saw anybody
use MPI_Alloc_mem?  :)  We discovered it ourselves in the context of
PVFS back in 2003 or thereabouts, and took a stab at fixing it, but
didn't quite complete the work needed to fully integrate it.
(Wuj's Unifier framework (CCGrid04):
    http://www.osc.edu/~pw/papers/wu-unifier-ccgrid04.pdf
)

                -- Pete

The solution for a kernel hacker like me is obvious: you let the OS kernel's memory management and network driver handle the memory pinning and the interaction with the hardware. This way an application can just call the OS to register the entire application memory space, and the OS kernel can deal with keeping it all pinned down, and if it needs to unpin something, it can do so.

The catch is that it requires the hardware to support keeping an address registered, but *not* physically pinned, and to *ask nicely*, via the OS page fault handler, for the page to be pinned back down if something comes in. This seems to be an idea that RDMA hardware designers just can't wrap their heads around. I guess they are too used to dealing with OSes that never change.
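
As a sketch of what I mean by "registered but not pinned", here is a toy model. ToyNic and its methods are hypothetical names invented for this example; no real hardware or driver API is being described. Registration happens once for the whole address range, the OS is free to unpin pages, and the device takes a fault to get a page pinned again when a transfer touches it.

```python
class ToyNic:
    """Hypothetical NIC whose registrations survive the OS unpinning a page."""

    def __init__(self, num_pages):
        self.registered = set(range(num_pages))  # registered once, up front
        self.pinned = set()                      # pages currently pinned
        self.faults = 0                          # times the device asked the OS

    def os_unpin(self, page):
        # The OS reclaims a page under memory pressure; registration stays.
        self.pinned.discard(page)

    def dma(self, page):
        if page not in self.registered:
            raise ValueError("page was never registered")
        if page not in self.pinned:
            # Device "asks nicely": fault to the OS, which pins the page again.
            self.faults += 1
            self.pinned.add(page)
        return page


nic = ToyNic(num_pages=8)
for p in range(8):
    nic.dma(p)      # first touch of each page faults it in
nic.os_unpin(3)     # OS takes page 3 back
nic.dma(3)          # transfer to page 3 faults and re-pins it
nic.dma(4)          # page 4 is still pinned, no fault
print(nic.faults)
```

The point of the model is that the application never re-registers anything; the cost moves to an occasional fault on the pages the OS actually took back.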

_______________________________________________
Pvfs2-developers mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
