RE: [PATCH v1 2/3] net/af_packet: RX/TX rte_memcpy, bulk free, prefetch

Morten Brørup Wed, 28 Jan 2026 01:50:05 -0800

> > > - Replace memcpy() with rte_memcpy() for optimized copy operations
> > There is no good reason that rte_memcpy() should be faster than memcpy().
> > There were some cases observed with virtio but my hunch is that this is
> > because the two routines are making different alignment assumptions.
> 
> ack. I will drop rte_memcpy.


The community is increasingly skeptical about using rte_memcpy() instead of 
memcpy().
I'm not sure all DPDK documentation has been updated to reflect this change; 
some of it might still recommend rte_memcpy().

So, simply replacing memcpy() with rte_memcpy() is no longer acceptable.
However, if you back up the replacement with performance data, it is more 
likely to get accepted.
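
As a rough illustration of the kind of performance data meant here, the sketch 
below times memcpy() against a second copy routine over a few packet-like sizes 
and source offsets. It is only a starting point, not a definitive benchmark. 
Everything named in it (candidate_copy, libc_copy, bench_copy, the buffer sizes, 
the COPY_BENCH_MAIN switch) is made up for this example. candidate_copy is a 
plain byte loop standing in for the routine under test so the file builds 
without DPDK; in a DPDK build you would pass a wrapper around rte_memcpy() 
instead. Note that calling through a function pointer hides any inlining the 
compiler would do for constant sizes, so numbers from the real driver path 
matter more than numbers from this harness.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L /* for clock_gettime() */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

#define MAX_LEN    2048 /* assumption: largest copy size of interest */
#define MAX_OFFSET 64   /* assumption: largest source misalignment tested */

/* Signature shared by memcpy() and rte_memcpy(). */
typedef void *(*copy_fn)(void *dst, const void *src, size_t n);

static unsigned char src_buf[MAX_LEN + MAX_OFFSET];
static unsigned char dst_buf[MAX_LEN + MAX_OFFSET];
static volatile unsigned char sink; /* keeps the copies from being optimized away */

/* Hypothetical stand-in for the routine under test (not rte_memcpy itself). */
static void *candidate_copy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    for (size_t i = 0; i < n; i++)
        d[i] = s[i];
    return dst;
}

/* Wrapper so memcpy() is called the same way as the candidate. */
static void *libc_copy(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}

/* Returns average nanoseconds per copy, or -1.0 on invalid arguments. */
static double bench_copy(copy_fn fn, size_t len, size_t offset,
                         unsigned long iters)
{
    struct timespec t0, t1;

    if (len > MAX_LEN || offset > MAX_OFFSET || iters == 0)
        return -1.0;

    memset(src_buf, 0xA5, sizeof(src_buf));
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (unsigned long i = 0; i < iters; i++) {
        fn(dst_buf, src_buf + offset, len);
        sink = dst_buf[len ? len - 1 : 0];
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double ns = (double)(t1.tv_sec - t0.tv_sec) * 1e9 +
                (double)(t1.tv_nsec - t0.tv_nsec);
    return ns / (double)iters;
}

#ifdef COPY_BENCH_MAIN
int main(void)
{
    static const size_t sizes[] = { 64, 256, 1500 };

    for (size_t i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++) {
        for (size_t off = 0; off <= 1; off++) {
            double a = bench_copy(libc_copy, sizes[i], off, 1000000UL);
            double b = bench_copy(candidate_copy, sizes[i], off, 1000000UL);

            assert(a >= 0.0 && b >= 0.0);
            printf("len=%4zu off=%zu memcpy=%.1f ns candidate=%.1f ns\n",
                   sizes[i], off, a, b);
        }
    }
    return 0;
}
#endif
```

Building with something like "cc -O2 -DCOPY_BENCH_MAIN bench.c" (bench.c being 
whatever you name the file) prints one line per size and offset. The offset 
parameter is there because of the alignment hunch quoted above: comparing 
aligned and unaligned sources is one way to see whether a difference comes 
from alignment assumptions rather than from the copy loop itself.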

> Under what scenarios is rte_memcpy preferred/beneficial?

I wish someone had an answer to that question!
The best I can come up with is:
When using an ancient compiler or C library, where memcpy() isn't properly 
optimized.

With modern compilers catching up, rte_memcpy() is becoming increasingly 
obsolete.

Here's some background information about rte_memcpy() from 2017:
https://www.intel.com/content/www/us/en/developer/articles/technical/performance-optimization-of-memcpy-in-dpdk.html

IIRC, the concept of a specialized memcpy() originates from some video 
streaming or gaming code, where huge memory areas were being copied around.

RE: [PATCH v1 2/3] net/af_packet: RX/TX rte_memcpy, bulk free, prefetch
