17/09/2018 10:11, Gavin Hu: > In __rte_ring_move_prod_head, move the __atomic_load_n up and out of > the do {} while loop as upon failure the old_head will be updated, > another load is costly and not necessary. > > This helps a little on the latency,about 1~5%. > > Test result with the patch(two cores): > SP/SC bulk enq/dequeue (size: 8): 5.64 > MP/MC bulk enq/dequeue (size: 8): 9.58 > SP/SC bulk enq/dequeue (size: 32): 1.98 > MP/MC bulk enq/dequeue (size: 32): 2.30 > > Fixes: 39368ebfc6 ("ring: introduce C11 memory model barrier option") > Cc: sta...@dpdk.org > > Signed-off-by: Gavin Hu <gavin...@arm.com> > Reviewed-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com> > Reviewed-by: Steve Capper <steve.cap...@arm.com> > Reviewed-by: Ola Liljedahl <ola.liljed...@arm.com>
We are missing reviews and acknowledgements on this series.