> -----Original Message-----
> From: users [mailto:[email protected]] On Behalf Of Dell Will
> Sent: Tuesday, March 26, 2019 9:04 AM
> To: users <[email protected]>
> Subject: [dpdk-users] Why not prefetch the second cache line of struct
> rte_mbuf for better performance ?
>
> Hello, everybody
Hi,

> I find that many codes in DPDK only prefetch the first cache line of struct
> rte_mbuf.
> The struct rte_mbuf has 2 cache lines.
> Why not prefetch the second line ?

One reason the second cache line is not always prefetched is that it is not always going to be used. For example, the packet RX routines modify only the first cache line and do not require the second to be available. Prefetching a line that is never touched just costs memory bandwidth and cache space.

> Is it hinted that the CPU (x64 or ARM) always automatically prefetch the
> next immediately followed cache line ?

Some details on the x86-64 hardware prefetchers are here; the "Adjacent Cache-Line Prefetch" is of particular interest:
https://software.intel.com/en-us/articles/optimizing-application-performance-on-intel-coret-microarchitecture-using-hardware-implemented-prefetchers

[Side note: "x64" is normally just another name for x86-64; the 64-bit Intel architecture that really is different is IA-64 (Itanium).]

> Thanks a lot !

Hope that helps, -Harry
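P.S. In case it is useful, here is a rough sketch of the trade-off in plain C. It is not DPDK code: struct demo_mbuf, PREFETCH_AHEAD and both functions are made-up names for illustration, the 64-byte cache line is an assumption, and __builtin_prefetch is the GCC/Clang builtin. The path that only reads first-line fields issues no extra prefetch, while the path that reads a second-line field prefetches that line a few packets ahead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64      /* assumed cache-line size */
#define PREFETCH_AHEAD 4   /* hypothetical look-ahead distance */

/* Hypothetical two-cache-line descriptor, loosely modelled on the idea
 * behind rte_mbuf: RX fills line 0, only some paths touch line 1. */
struct demo_mbuf {
    /* cache line 0: written on RX */
    _Alignas(CACHE_LINE) uint32_t pkt_len;
    uint16_t data_len;
    /* cache line 1: only needed by some processing paths */
    _Alignas(CACHE_LINE) uint64_t userdata;
};

static_assert(sizeof(struct demo_mbuf) == 2 * CACHE_LINE,
              "descriptor should span exactly two cache lines");
static_assert(offsetof(struct demo_mbuf, userdata) == CACHE_LINE,
              "userdata should start the second cache line");

/* Path that needs only line 0: no second-line prefetch is issued. */
static uint64_t sum_pkt_len(struct demo_mbuf **pkts, size_t n)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += pkts[i]->pkt_len;
    return sum;
}

/* Path that reads a line-1 field: prefetch line 1 of a later packet
 * (read access, keep in all cache levels) while handling this one. */
static uint64_t sum_userdata(struct demo_mbuf **pkts, size_t n)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_AHEAD < n)
            __builtin_prefetch(&pkts[i + PREFETCH_AHEAD]->userdata, 0, 3);
        sum += pkts[i]->userdata;
    }
    return sum;
}
```

If I remember correctly, DPDK's rte_mbuf.h has helpers along these lines (rte_mbuf_prefetch_part1() and rte_mbuf_prefetch_part2()), so an application that does use second-line fields can ask for that line explicitly. Whether it helps is workload dependent, so measure before and after.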
