> -----Original Message-----
> From: William Tu <[email protected]>
> Sent: Friday, May 29, 2020 2:19 AM
> To: Van Haaren, Harry <[email protected]>
> Cc: [email protected]; [email protected]
> Subject: Re: [ovs-dev] [PATCH v2 5/5] dpif-lookup: add avx512 gather
> implementation
>
> On Wed, May 27, 2020 at 12:21:43PM +0000, Van Haaren, Harry wrote:

<snip hashing details>

> > As a result, hashing identical data in different .c files produces
> > different hash values.
> >
> > From OVS docs (http://docs.openvswitch.org/en/latest/intro/install/general/)
> > the following enables native ISA for your build, or else just enable
> > SSE4.2 and popcount:
> > ./configure CFLAGS="-g -O2 -march=native"
> > ./configure CFLAGS="-g -O2 -march=nehalem"
>
> Hi Harry,
> Thanks for the info!
> I can make it work now, with
> ./configure CFLAGS="-g -O2 -msse4.2 -march=native"
OK - that's good - the root cause of the bug/hash-mismatch is confirmed!

> using similar setup
> ovs-ofctl add-flow br0 'actions=drop'
> ovs-appctl dpif-netdev/subtable-lookup-set avx512_gather 5
> ovs-vsctl add-port br0 tg0 -- set int tg0 type=dpdk \
>   options:dpdk-devargs=vdev:net_pcap0,rx_pcap=/root/ovs/p0.pcap,infinite_rx=1
>
> The performance seems a little worse (9.7Mpps -> 8.7Mpps).
> I wonder whether it's due to running it in VM (however I don't
> have physical machine).

Performance degradations are not expected. Let me try to understand the
performance data you posted and work through it.

Agreed that isolating the hardware and being able to verify the environment
would help in removing potential noise, but let us work with the setup you
have. Do you know what CPU you're running on?

It seems you have EMC enabled (as per OVS defaults). The stats posted show
an approx 10:1 ratio of EMC hits to DPCLS hits. This likely adds noise to
the measurements, as only about 10% of the packets hit the changes in DPCLS.

Also, in the perf top profile, dp_netdev_input__ takes more cycles than
miniflow_extract, and memcmp() is present, indicating that EMC is consuming
CPU cycles to perform its duties.

I guess our simple test case is failing to show what we're trying to
measure. As you know, EMC likes low flow counts, so most lookups are
resolved there, which explains why DPCLS is only ~2% of CPU time.

<snip>
Details of CPU profiles & PMD stats for AVX512 and Generic DPCLS removed to
trim the conversation. Very helpful to see into your system, and I'm a big
fan of perf top and friends - so this was useful to see, thanks!
(Future readers, check the mailing list "thread" view for the previous
post's details.)

> Is there any thing I should double check?

Would you mind re-testing with EMC disabled? Likely DPCLS will show up as a
much larger % in the CPU profile, and this might provide some new insights.
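
For reference, a minimal sketch of how that re-test could look. This assumes
the standard other_config:emc-insert-inv-prob knob and the
dpif-netdev/pmd-stats-* appctl commands are available in your build; the
10 second wait is an arbitrary value chosen for illustration.

```shell
# Assumption: setting the inverse insertion probability to 0 turns off
# EMC insertion, so lookups should fall through to DPCLS.
ovs-vsctl set Open_vSwitch . other_config:emc-insert-inv-prob=0

# Clear the PMD counters so the old EMC hits do not skew the new numbers.
ovs-appctl dpif-netdev/pmd-stats-clear

# Let traffic run for a while (arbitrary duration), then read the stats.
sleep 10
ovs-appctl dpif-netdev/pmd-stats-show
```

If EMC is really out of the picture, the EMC hit counter in the
pmd-stats-show output should stay at (or very near) zero, with the hits
showing up in the megaflow/DPCLS counter instead.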

Regards, -Harry

<snip context/backlog of hashing debug and resolution>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
