Hi Honnappa,

> -----Original Message-----
> From: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
> Sent: Thursday, August 29, 2019 6:10 AM
> To: Gavin Hu (Arm Technology China) <gavin...@arm.com>; dev@dpdk.org
> Cc: nd <n...@arm.com>; tho...@monjalon.net; jer...@marvell.com;
> pbhagavat...@marvell.com; qi.z.zh...@intel.com;
> bruce.richard...@intel.com; sta...@dpdk.org; Honnappa Nagarahalli
> <honnappa.nagaraha...@arm.com>; nd <n...@arm.com>
> Subject: RE: [PATCH 1/2] net/i40e: desc loading is unnecessarily ordered for
> aarch64
>
> Thanks Gavin, few comments are inline
>
> > -----Original Message-----
> > From: Gavin Hu <gavin...@arm.com>
> > Sent: Tuesday, August 13, 2019 5:44 AM
> > To: dev@dpdk.org
> > Cc: nd <n...@arm.com>; tho...@monjalon.net; jer...@marvell.com;
> > pbhagavat...@marvell.com; Honnappa Nagarahalli
> > <honnappa.nagaraha...@arm.com>; qi.z.zh...@intel.com;
> > bruce.richard...@intel.com; sta...@dpdk.org
> > Subject: [PATCH 1/2] net/i40e: desc loading is unnecessarily ordered for
> > aarch64
> >
> > For x86, the descriptors needs to be loaded in order, so in between two
> > descriptors loading, there is a compiler barrier in place.
> IMO, we can skip the above as this change applies to Arm platforms. Instead,
> capture this in the code in comments to explain why the ordering of the
> loads is not required. This will help others reading the code.
As the line of code was removed, there is no suitable place to add a comment.
Instead, adding it in the commit log makes the story complete and easy to
understand.

> > [1] For aarch64, a patch [2] is in place to survive with discontinuous DD
> > bits, the barriers can be removed to take full advantage of out-of-order
> > execution.
> >
> > 50% performance gain in the RFC2544 NDR test was measured on ThunderX2.
> > 12.50% performance gain in the RFC2544 NDR test was measured on Ampere
> > eMAG80 platform.
> >
> > [1] http://inbox.dpdk.org/users/039ED4275CED7440929022BC67E7061153D71548@SHSMSX105.ccr.corp.intel.com/
> > [2] https://mails.dpdk.org/archives/stable/2017-October/003324.html
> >
> > Fixes: ae0eb310f253 ("net/i40e: implement vector PMD for ARM")
> > Cc: sta...@dpdk.org
> >
> > Signed-off-by: Gavin Hu <gavin...@arm.com>
> > Reviewed-by: Ruifeng Wang <ruifeng.w...@arm.com>
> > Reviewed-by: Steve Capper <steve.cap...@arm.com>
> > ---
> >  drivers/net/i40e/i40e_rxtx_vec_neon.c | 1 -
> >  1 file changed, 1 deletion(-)
> >
> > diff --git a/drivers/net/i40e/i40e_rxtx_vec_neon.c
> > b/drivers/net/i40e/i40e_rxtx_vec_neon.c
> > index 83572ef..5555e9b 100644
> > --- a/drivers/net/i40e/i40e_rxtx_vec_neon.c
> > +++ b/drivers/net/i40e/i40e_rxtx_vec_neon.c
> > @@ -285,7 +285,6 @@ _recv_raw_pkts_vec(struct i40e_rx_queue *rxq, struct rte_mbuf **rx_pkts,
> >  		/* Read desc statuses backwards to avoid race condition */
> >  		/* A.1 load 4 pkts desc */
> >  		descs[3] = vld1q_u64((uint64_t *)(rxdp + 3));
> > -		rte_rmb();
> >
> >  		/* B.2 copy 2 mbuf point into rx_pkts */
> >  		vst1q_u64((uint64_t *)&rx_pkts[pos], mbp1);
> > --
> > 2.7.4
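
To illustrate the "survive with discontinuous DD bits" point in the commit
log, here is a minimal scalar sketch. It is not the driver's NEON code and is
not taken from patch [2]; the struct, macro and function names (demo_desc,
DEMO_DD_BIT, DEMO_BURST, demo_count_contiguous_done) are hypothetical, and
the descriptor layout is reduced to a single status word. The assumption is
that once the receive path only accepts the leading run of descriptors whose
DD bit is set, the loads no longer need a barrier between them: a descriptor
seen as done after a not-done one is simply left for the next poll.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified descriptor: only the DD bit of `status` is used. */
struct demo_desc {
	uint64_t status;
};

#define DEMO_DD_BIT UINT64_C(1) /* assumed position of the "descriptor done" bit */
#define DEMO_BURST  4           /* descriptors examined per iteration */

/*
 * Return how many descriptors at the start of the burst are done,
 * stopping at the first one whose DD bit is clear.
 *
 * The status words are loaded last-to-first with no barrier in between,
 * so the CPU may reorder them. A gap in the DD bits is tolerated because
 * anything after the first not-done descriptor is ignored.
 */
unsigned int
demo_count_contiguous_done(const volatile struct demo_desc *ring, size_t n)
{
	uint64_t status[DEMO_BURST];
	unsigned int done = 0;
	size_t i;

	assert(n <= DEMO_BURST);

	/* Unordered loads, issued backwards as in the quoted code. */
	for (i = n; i-- > 0; )
		status[i] = ring[i].status;

	/* Accept only the leading run of completed descriptors. */
	for (i = 0; i < n; i++) {
		if (!(status[i] & DEMO_DD_BIT))
			break;
		done++;
	}

	return done;
}
```

With DD bits {1, 1, 0, 1} the function returns 2: the fourth descriptor is
seen as done but is not consumed until the third one completes.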