On Wed, May 27, 2020 at 12:21:43PM +0000, Van Haaren, Harry wrote:
> > -----Original Message-----
> > From: dev <[email protected]> On Behalf Of Van Haaren, Harry
> > Sent: Tuesday, May 26, 2020 3:52 PM
> > To: William Tu <[email protected]>
> > Cc: [email protected]; [email protected]
> > Subject: Re: [ovs-dev] [PATCH v2 5/5] dpif-lookup: add avx512 gather implementation
> 
> <snip>
> 
> > > Why is ukey related here? Does your avx512 patch make any change to ukey?
> > 
> > No, AVX512 doesn't make any ukey changes - but issues in the hashing of the
> > miniflow data blocks cause ukeys to be installed in different locations than
> > where they are looked up - hence "ukey install fail" == "issue in miniflow
> > iteration" in this context.
> 
> The ukey install fails are due to a mismatch in compile flags (with/without
> SSE 4.2), combined with the fact that the hashing in OVS changes its
> implementation depending on the availability of the SSE 4.2 ISA (and other
> defines for other architectures).
> 
> The mismatch comes from upcall code being compiled without SSE4.2 (so using
> mhash hash code) while the AVX512 lookup hash routines have SSE4.2 enabled
> (so using CRC32 hash code). As a result, hashing identical data in different
> .c files produces different hash values.
> 
> From OVS docs (http://docs.openvswitch.org/en/latest/intro/install/general/)
> the first command below enables the native ISA for your build; the second
> enables just SSE4.2 and popcount:
> ./configure CFLAGS="-g -O2 -march=native"
> ./configure CFLAGS="-g -O2 -march=nehalem"

Hi Harry,
Thanks for the info!
I can make it work now, with 
./configure CFLAGS="-g -O2 -msse4.2 -march=native"
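
To spell out for the archive why the flag matters, here is a minimal Python
sketch of the failure mode Harry describes: an entry is installed under one
hash function and looked up under another. zlib.crc32 and a 32-bit FNV-1a are
only stand-ins picked for illustration (they are not the OVS CRC32/mhash
routines), and the dict-based table, install(), lookup() and the key bytes are
hypothetical names of mine.

```python
import zlib


def hash_crc32(data: bytes) -> int:
    """Stand-in for the hash used by code built with SSE4.2 (assumption)."""
    return zlib.crc32(data) & 0xFFFFFFFF


def hash_fnv1a(data: bytes) -> int:
    """Stand-in for the fallback hash used by code built without SSE4.2."""
    h = 0x811C9DC5                      # 32-bit FNV offset basis
    for byte in data:
        h ^= byte
        h = (h * 0x01000193) & 0xFFFFFFFF   # 32-bit FNV prime
    return h


def install(table: dict, key: bytes, value, hash_fn) -> None:
    """Store value in the slot selected by hash_fn(key)."""
    table[hash_fn(key)] = (key, value)


def lookup(table: dict, key: bytes, hash_fn):
    """Return the stored value, or None if the slot is empty or mismatched."""
    entry = table.get(hash_fn(key))
    if entry is not None and entry[0] == key:
        return entry[1]
    return None


if __name__ == "__main__":
    table = {}
    key = b"example-miniflow-bytes"     # hypothetical key bytes
    install(table, key, "ukey", hash_fnv1a)
    print("same hash     :", lookup(table, key, hash_fnv1a))
    print("different hash:", lookup(table, key, hash_crc32))
```

Installing and looking up with the same function finds the entry; switching
the function for the lookup misses it even though the key bytes are identical,
which is the same shape as the ukey install failures discussed above.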

using a similar setup:
ovs-ofctl add-flow br0 'actions=drop'
ovs-appctl dpif-netdev/subtable-lookup-set avx512_gather 5
ovs-vsctl add-port br0 tg0 -- set int tg0 type=dpdk \
  options:dpdk-devargs=vdev:net_pcap0,rx_pcap=/root/ovs/p0.pcap,infinite_rx=1

With avx512_gather the performance seems a little worse than with the
generic lookup (9.7Mpps -> 8.7Mpps).
I wonder whether that is due to running in a VM (I don't have a
physical machine to compare against).
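
Since -march=native depends on what the VM exposes, one thing that can be
checked from inside the guest is the CPU flag list. A small Python sketch,
assuming x86 Linux where /proc/cpuinfo has a "flags" line; the set of flags
printed is my guess at what is relevant here, not something taken from the
patch.

```python
def cpu_flags(cpuinfo_text: str) -> set:
    """Return the flag names from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip() == "flags":
            return set(value.split())
    return set()


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            flags = cpu_flags(f.read())
    except OSError:
        flags = set()
        print("could not read /proc/cpuinfo (not Linux?)")
    # Flag names as they appear in /proc/cpuinfo on x86 Linux.
    for wanted in ("sse4_2", "popcnt", "avx512f"):
        print(f"{wanted}: {'yes' if wanted in flags else 'no'}")
```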

=== Enable AVX512 ===
Drop rate: 8.7Mpps
2020-05-29T01:03:15.740Z|00049|dpif_netdev_lookup|INFO|Subtable function 'avx512_gather' set priority to 5
  21.93%  pmd-c00/id:10  ovs-vswitchd        [.] dp_netdev_input__
  19.38%  pmd-c00/id:10  ovs-vswitchd        [.] miniflow_extract
  19.08%  pmd-c00/id:10  ovs-vswitchd        [.] eth_pcap_rx_infinite
  10.24%  pmd-c00/id:10  ovs-vswitchd        [.] miniflow_hash_5tuple
   9.63%  pmd-c00/id:10  libc-2.27.so        [.] __memcmp_avx2_movbe
   8.46%  pmd-c00/id:10  ovs-vswitchd        [.] free_dpdk_buf
   1.83%  pmd-c00/id:10  ovs-vswitchd        [.] dpcls_avx512_gather_skx_mf_4_1
   1.65%  pmd-c00/id:10  ovs-vswitchd        [.] odp_execute_actions
   1.17%  pmd-c00/id:10  ovs-vswitchd        [.] fast_path_processing
   1.12%  pmd-c00/id:10  ovs-vswitchd        [.] netdev_dpdk_rxq_recv
   0.99%  pmd-c00/id:10  ovs-vswitchd        [.] pmd_perf_end_iteration
   0.87%  pmd-c00/id:10  ovs-vswitchd        [.] dp_netdev_process_rxq_port
   0.51%  pmd-c00/id:10  ovs-vswitchd        [.] cmap_find_batch

root@instance-3:~/ovs# ovs-appctl dpif-netdev/pmd-stats-show
pmd thread numa_id 0 core_id 0:
  packets received: 167704800
  packet recirculations: 0
  avg. datapath passes per packet: 1.00
  emc hits: 152452853
  smc hits: 0
  megaflow hits: 15251600
  avg. subtable lookups per megaflow hit: 1.00
  miss with success upcall: 1
  miss with failed upcall: 346 
  avg. packets per output batch: 0.00
  idle cycles: 0 (0.00%)
  processing cycles: 38399744430 (100.00%)
  avg cycles per packet: 228.97 (38399744430/167704800)
  avg processing cycles per packet: 228.97 (38399744430/167704800)

=== Generic lookup ===
Drop rate: 9.7Mpps
2020-05-29T01:07:05.781Z|00049|dpif_netdev_lookup|INFO|Subtable function 'generic' set priority to 5

pmd thread numa_id 0 core_id 1:
  packets received: 332413344
  packet recirculations: 0
  avg. datapath passes per packet: 1.00
  emc hits: 302178098
  smc hits: 0
  megaflow hits: 30234893
  avg. subtable lookups per megaflow hit: 1.00
  miss with success upcall: 1
  miss with failed upcall: 320 
  avg. packets per output batch: 0.00
  idle cycles: 0 (0.00%)
  processing cycles: 68605925782 (100.00%)
  avg cycles per packet: 206.39 (68605925782/332413344)
  avg processing cycles per packet: 206.39 (68605925782/332413344)

  22.04%  pmd-c01/id:10  ovs-vswitchd        [.] dp_netdev_input__
  19.87%  pmd-c01/id:10  ovs-vswitchd        [.] miniflow_extract
  18.24%  pmd-c01/id:10  ovs-vswitchd        [.] eth_pcap_rx_infinite
   9.84%  pmd-c01/id:10  libc-2.27.so        [.] __memcmp_avx2_movbe
   9.76%  pmd-c01/id:10  ovs-vswitchd        [.] miniflow_hash_5tuple
   8.16%  pmd-c01/id:10  ovs-vswitchd        [.] free_dpdk_buf
   2.27%  pmd-c01/id:10  ovs-vswitchd        [.] dpcls_subtable_lookup_mf_u0w4_u1w1
   1.71%  pmd-c01/id:10  ovs-vswitchd        [.] odp_execute_actions
   1.39%  pmd-c01/id:10  ovs-vswitchd        [.] fast_path_processing
   1.10%  pmd-c01/id:10  ovs-vswitchd        [.] netdev_dpdk_rxq_recv
   0.99%  pmd-c01/id:10  ovs-vswitchd        [.] dp_netdev_process_rxq_port
   0.87%  pmd-c01/id:10  ovs-vswitchd        [.] pmd_perf_end_iteration
   0.55%  pmd-c01/id:10  ovs-vswitchd        [.] cmap_find_batch
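
To put the two runs side by side, here is a small Python sketch that only
re-derives ratios from the counters pasted above (no new measurements; the
dict layout and helper names are mine). Note the two runs were on different
PMD cores (core_id 0 vs core_id 1).

```python
# Counters copied from the two pmd-stats-show outputs above.
AVX512 = {"packets": 167704800, "emc_hits": 152452853,
          "megaflow_hits": 15251600, "cycles": 38399744430, "mpps": 8.7}
GENERIC = {"packets": 332413344, "emc_hits": 302178098,
           "megaflow_hits": 30234893, "cycles": 68605925782, "mpps": 9.7}


def cycles_per_packet(run: dict) -> float:
    """Processing cycles divided by packets received."""
    return run["cycles"] / run["packets"]


def emc_hit_fraction(run: dict) -> float:
    """Fraction of received packets resolved by the EMC."""
    return run["emc_hits"] / run["packets"]


def megaflow_fraction(run: dict) -> float:
    """Fraction of received packets that reached the dpcls (megaflow hit)."""
    return run["megaflow_hits"] / run["packets"]


if __name__ == "__main__":
    for name, run in (("avx512_gather", AVX512), ("generic", GENERIC)):
        print(f"{name}: {cycles_per_packet(run):.2f} cycles/pkt, "
              f"EMC {emc_hit_fraction(run):.1%}, "
              f"megaflow {megaflow_fraction(run):.1%}")
    extra = cycles_per_packet(AVX512) / cycles_per_packet(GENERIC) - 1
    drop = 1 - AVX512["mpps"] / GENERIC["mpps"]
    print(f"extra cycles per packet with avx512_gather: {extra:.1%}")
    print(f"drop in Mpps with avx512_gather: {drop:.1%}")
```

In both runs about 91% of packets hit the EMC and only about 9% reach the
subtable lookup, which fits with the lookup function being around 2% of the
profile in either case (1.83% vs 2.27%). The overall gap is roughly 10-11%
in both cycles per packet and Mpps.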

Is there anything I should double-check?
Regards,
William

> 
> To continue your testing William, I suggest using the above workaround -
> compile OVS and explicitly enable SSE4.2, aligning all hashing code in OVS
> to use the more performant CRC32 hashing.
> 
> I will work on a proper solution to avoid this issue in the v3 patchset.
> 
> Thanks for reporting, -Harry
> 
> > > > There is an alternative - set the "autovalidator" DPCLS implementation to
> > > > the highest priority, and it should ovs_assert() if the scalar/AVX512
> > > > implementations mismatch. Then a dump of the OVS miniflow will give
> > > > what's needed to verify root cause.
> > > >
> > > that's a cool feature.
> > > When setting
> > > ovs-appctl dpif-netdev/subtable-lookup-set autovalidator 100
> > > log shows
> > > 2020-05-21T22:28:55.964Z|77007|dpif_lookup_autovalidator(pmd-c00/id:9)|ERR|matches_good 7 != matches_test 0 for func avx512_gather
> > 
> > Brilliant - this is exactly why the autovalidator is there. It has correctly
> > flagged an issue here - I've reproduced using a pcap and your commands above.
> > I will investigate a fix and include in the v3.
> > 
> > Thanks for the details - will keep you all posted on progress. -Harry
> > _______________________________________________
> > dev mailing list
> > [email protected]
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev