> An issue popped up as part of the code review for
> 
> https://github.com/ofiwg/libfabric/pull/6185
> 
> The summary is that supporting FI_HMEM (e.g. GPU memory buffers) may have a negative
> impact on the performance of fi_inject() calls.  Either the provider must disable
> inject completely, or possibly check the buffer location on each inject call.

There are comments in the above issue about possible options.

But I'm actually going to propose an alternative: we do nothing, other than 
possibly documenting that the use of fi_inject() is not recommended for 
non-system memory.

First, fi_inject() should work with device buffers with no special handling.  
fi_inject() is usually implemented using memcpy, which might be slower than a 
specialized copy routine, but the size of that impact is unknown at this point.  
Second, fi_inject() is intended for small transfers.  I don't know whether 
there's a significant need for small transfers to/from device memory, such that 
apps would naturally use fi_inject() for device buffers.  I think we need 
quantified data on both points showing a need before creating an optimized 
device inject path.
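
To make the two options concrete, here is a minimal sketch in C of an inject 
path that copies the payload into a provider-owned bounce buffer.  It is a toy 
model, not libfabric code: every toy_* name, the 64-byte limit, and the locate 
callback are assumptions made up for illustration, and the "device" copy is 
just memcpy on host memory.  The first function is the do-nothing option (plain 
memcpy, no per-call check); the second adds the per-call buffer location check 
mentioned in the quoted summary.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* All names below are hypothetical; none of this is libfabric API. */
#define TOY_INJECT_MAX 64 /* assumed inject size limit, in bytes */

enum toy_iface { TOY_IFACE_SYSTEM, TOY_IFACE_DEVICE };

struct toy_tx_entry {
    unsigned char bounce[TOY_INJECT_MAX]; /* provider-owned copy of the payload */
    size_t len;
};

/* Callback that reports where a buffer lives (assumed to be costly for real). */
typedef enum toy_iface (*toy_locate_fn)(const void *buf);

/* Trivial locator used for illustration: treats every buffer as system memory. */
static enum toy_iface toy_locate_system(const void *buf)
{
    (void)buf;
    return TOY_IFACE_SYSTEM;
}

/* Stand-in for a specialized device copy routine; here it is only memcpy. */
static void toy_device_copy(void *dst, const void *src, size_t len)
{
    memcpy(dst, src, len);
}

/* Option A: always use plain memcpy, no per-call location check. */
static int toy_inject_plain(struct toy_tx_entry *tx, const void *buf, size_t len)
{
    assert(tx != NULL && buf != NULL);
    if (len > TOY_INJECT_MAX)
        return -1; /* too large for the inject path */
    memcpy(tx->bounce, buf, len);
    tx->len = len;
    return 0; /* caller may reuse buf immediately */
}

/* Option B: query the buffer location on every call, then pick a copy routine. */
static int toy_inject_checked(struct toy_tx_entry *tx, const void *buf,
                              size_t len, toy_locate_fn locate)
{
    assert(tx != NULL && buf != NULL && locate != NULL);
    if (len > TOY_INJECT_MAX)
        return -1;
    if (locate(buf) == TOY_IFACE_DEVICE)
        toy_device_copy(tx->bounce, buf, len);
    else
        memcpy(tx->bounce, buf, len);
    tx->len = len;
    return 0;
}
```

The only difference between the two is the locate() call on every inject, paid 
even when all buffers are in system memory.  How expensive that lookup and the 
plain memcpy are for real device buffers is exactly the data that is missing.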

Removing support for device buffers from fi_inject() completely would likely 
force memory registration of those buffers.  It's highly likely that the cost 
of registration would hurt performance more than using a non-optimized memory 
copy routine.

- Sean
