Tested on an Intel Haswell platform and Acked! (Provided no performance regressions have been reported so far on other platforms for the various tests.)

Regards
_Sugesh

> -----Original Message-----
> From: Bodireddy, Bhanuprakash
> Sent: Monday, December 4, 2017 8:10 PM
> To: [email protected]
> Cc: Chandran, Sugesh <[email protected]>; Bodireddy, Bhanuprakash
> <[email protected]>; Ben Pfaff <[email protected]>
> Subject: [PATCH v2] packets: Prefetch the packet metadata in cacheline1.
>
> pkt_metadata_prefetch_init() is used to prefetch the packet metadata before
> initializing the metadata in pkt_metadata_init(). This is done for every
> packet in the userspace datapath and is performance critical.
>
> Commit 99fc16c0 prefetches only cacheline0 and cacheline2, as the metadata
> part of the respective cachelines will be initialized by pkt_metadata_init().
>
> However, in the VXLAN case, when popping the vxlan header,
> netdev_vxlan_pop_header() invokes pkt_metadata_init_tnl(), which zeroes out
> the metadata part of cacheline1 that wasn't prefetched earlier and causes
> performance degradation.
>
> By prefetching cacheline1, a 9% performance improvement is observed with the
> vxlan decapsulation test case for packet sizes of 118 bytes. Performance
> variation is observed based on CFLAGS.
>
>     CFLAGS="-O2"                    CFLAGS="-O2 -msse4.2"
>     Master      4.667 Mpps          Master      4.710 Mpps
>     With Patch  5.045 Mpps          With Patch  5.097 Mpps
>
>     CFLAGS="-O2 -march=native"      CFLAGS="-Ofast -march=native"
>     Master      5.072 Mpps          Master      5.349 Mpps
>     With Patch  5.193 Mpps          With Patch  5.378 Mpps
>
> CC: Ben Pfaff <[email protected]>
> Fixes: 99fc16c0 ("Reorganize the pkt_metadata structure.")
> Signed-off-by: Bhanuprakash Bodireddy <[email protected]>
> ---
> v2->v1
>   * Include the throughput stats with different CFLAG options.
>
>  lib/packets.h | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/lib/packets.h b/lib/packets.h
> index 13ea46d..74bec5d 100644
> --- a/lib/packets.h
> +++ b/lib/packets.h
> @@ -159,7 +159,8 @@ pkt_metadata_init(struct pkt_metadata *md, odp_port_t port)
>  }
>
>  /* This function prefetches the cachelines touched by pkt_metadata_init()
> - * For performance reasons the two functions should be kept in sync. */
> + * and pkt_metadata_init_tnl(). For performance reasons the two functions
> + * should be kep in sync. */
>  static inline void
>  pkt_metadata_prefetch_init(struct pkt_metadata *md)
>  {
> @@ -167,6 +168,10 @@ pkt_metadata_prefetch_init(struct pkt_metadata *md)
>       * be initialized later in pkt_metadata_init(). */
>      OVS_PREFETCH(md->cacheline0);
>
> +    /* Prefetch cacheline1 as members of this cacheline will be zeroed out
> +     * in pkt_metadata_init_tnl(). */
> +    OVS_PREFETCH(md->cacheline1);
> +
>      /* Prefetch cachline2 as ip_dst & ipv6_dst fields will be initialized. */
>      OVS_PREFETCH(md->cacheline2);
>  }
> --
> 2.4.11

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
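
For readers following the thread, here is a minimal standalone sketch of the pattern the patch relies on: issue prefetch hints for every cache line that a later init routine will write, so the lines are more likely to be in cache when the stores happen. It is not the OVS code. The names struct demo_metadata, DEMO_PREFETCH, demo_prefetch_init(), demo_init_tnl() and demo_count_nonzero() are hypothetical, the 64-byte cache line size is an assumption, and the macro assumes a GCC or Clang compiler that provides __builtin_prefetch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: 64-byte cache lines, as on most current x86 parts. */
#define CACHE_LINE_SIZE 64

/* Hypothetical stand-in for struct pkt_metadata: three regions, each one
 * cache line long.  The real structure has named members in each line. */
struct demo_metadata {
    uint8_t cacheline0[CACHE_LINE_SIZE];
    uint8_t cacheline1[CACHE_LINE_SIZE];
    uint8_t cacheline2[CACHE_LINE_SIZE];
};

/* Hypothetical prefetch macro.  __builtin_prefetch is only a hint: it never
 * changes memory contents and never faults on the address it is given. */
#if defined(__GNUC__) || defined(__clang__)
#define DEMO_PREFETCH(addr) __builtin_prefetch(addr)
#else
#define DEMO_PREFETCH(addr) ((void) (addr))
#endif

/* Prefetch every cache line that a later init call may write.  Leaving out
 * cacheline1 here is the gap the patch closes for the tunnel path. */
static inline void
demo_prefetch_init(const struct demo_metadata *md)
{
    assert(md != NULL);
    DEMO_PREFETCH(md->cacheline0);
    DEMO_PREFETCH(md->cacheline1);
    DEMO_PREFETCH(md->cacheline2);
}

/* Stand-in for the tunnel init path: zeroes only the middle cache line. */
static inline void
demo_init_tnl(struct demo_metadata *md)
{
    assert(md != NULL);
    memset(md->cacheline1, 0, sizeof md->cacheline1);
}

/* Helper for checking the sketch: counts nonzero bytes in cacheline1. */
static inline size_t
demo_count_nonzero(const struct demo_metadata *md)
{
    size_t n = 0;
    for (size_t i = 0; i < sizeof md->cacheline1; i++) {
        n += md->cacheline1[i] != 0;
    }
    return n;
}
```

In a receive loop one would typically call demo_prefetch_init() on a packet's metadata some work ahead of demo_init_tnl(), so the hint has time to take effect. Whether that pays off depends on the platform and the compiler flags, which is consistent with the variation across CFLAGS reported in the patch.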
