On Wed, 20 May 2020 at 12:20, Van Haaren, Harry <[email protected]>
wrote:

> > -----Original Message-----
> > From: William Tu <[email protected]>
> > Sent: Wednesday, May 20, 2020 1:12 AM
> > To: Van Haaren, Harry <[email protected]>
> > Cc: [email protected]; [email protected]
> > Subject: Re: [ovs-dev] [PATCH v2 5/5] dpif-lookup: add avx512 gather
> > implementation
> >
> > On Mon, May 18, 2020 at 9:12 AM Van Haaren, Harry
> > <[email protected]> wrote:
> > >
> > > > -----Original Message-----
> > > > From: William Tu <[email protected]>
> > > > Sent: Monday, May 18, 2020 3:58 PM
> > > > To: Van Haaren, Harry <[email protected]>
> > > > Cc: [email protected]; [email protected]
> > > > Subject: Re: [ovs-dev] [PATCH v2 5/5] dpif-lookup: add avx512 gather
> > > > implementation
> > > >
> > > > On Wed, May 06, 2020 at 02:06:09PM +0100, Harry van Haaren wrote:
> > > > > This commit adds an AVX-512 dpcls lookup implementation.
> > > > > It uses the AVX-512 SIMD ISA to perform multiple miniflow
> > > > > operations in parallel.
>
> <snip lots of code/patch contents for readability>
>
> > Hi Harry,
> >
> > I managed to find a machine with avx512 in google cloud and did some
> > performance testing. I saw lower performance when enabling avx512,
>

The AVX512 instruction path lowers the clock speed well below the base
frequency [1].
Aren't you hurting overall PMD performance while improving only the lookup?

[1]
https://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/2nd-gen-xeon-scalable-spec-update.pdf
(see page 20)
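For anyone wanting to observe this on their own box, here is a hedged
sketch (not from the patchset; assumes a Linux host exposing /proc/cpuinfo)
that reports the min/max per-core clock. Sample it while the AVX512 PMD is
busy and compare against the base frequency:

```shell
# Hedged spot-check: summarise per-core clocks from /proc/cpuinfo.
# Run repeatedly (e.g. under watch) while traffic is flowing.
awk -F: '/cpu MHz/ { mhz = $2 + 0;
                     if (min == "" || mhz < min) min = mhz;
                     if (mhz > max) max = mhz }
         END { printf "min=%.0f max=%.0f MHz\n", min, max }' /proc/cpuinfo
```

A sustained min well below the nominal base clock on the PMD core would
support the frequency-throttling concern.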


> > I believe I did something wrong. Do you mind having a look:
> >
> > 1) first a compile error
> > diff --git a/lib/dpif-netdev-lookup.c b/lib/dpif-netdev-lookup.c
> > index b22a26b8c8a2..5c71096c10c5 100644
> > --- a/lib/dpif-netdev-lookup.c
> > +++ b/lib/dpif-netdev-lookup.c
> > @@ -1,5 +1,6 @@
> >
> >  #include <config.h>
> > +#include <errno.h>
> >  #include "dpif-netdev-lookup.h"
> >
> >  #include "openvswitch/vlog.h"
>
> Existing code compiles fine here - but I've added this in the v3, thanks
> for flagging.
>
>
> > 2) cpuinfo
> > flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
> > pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm
> > constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq
> > pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt
> > aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch
> > invpcid_single pti ssbd ibrs ibpb stibp fsgsbase tsc_adjust bmi1 hle
> > avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap
> > clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1
> > xsaves arat md_clear arch_capabilities
>
> The avx512f and dq/cd/bw/vl extensions indicate AVX512 is available on
> this machine, all looks good so far.
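As a quick way to pull just those extensions out of the flags line (a
sketch, not part of the patchset):

```shell
# List the distinct AVX512 extensions the CPU advertises.
grep -o 'avx512[a-z]*' /proc/cpuinfo | sort -u
```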
>
>
> >
> > 3) start ovs and set avx and traffic gen
> >  ovs-appctl dpif-netdev/subtable-lookup-set avx512_gather 5
> >  ovs-vsctl add-port br0 tg0 -- set int tg0 type=dpdk
> >
> > options:dpdk-devargs=vdev:net_pcap0,rx_pcap=/root/ovs/p0.pcap,infinite_rx=1
>
> The first command (enabling the AVX512 lookup) posts some output at INFO
> log level; can you check that it's there?
>
> 2020-05-20T09:39:09Z|00262|dpif_netdev_lookup|INFO|Subtable function
> 'avx512_gather' set priority to 4
> 2020-05-20T09:39:09Z|00006|dpif_netdev(pmd-c15/id:99)|INFO|reprobing sub
> func, 5 1
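One hedged way to check for those lines (the log path below is an
assumption based on a default build prefix; adjust to your setup):

```shell
# Pull the most recent subtable-lookup related log lines from the
# ovs-vswitchd log (path is an assumption, adjust as needed).
grep -E 'dpif_netdev_lookup|Subtable' \
    /usr/local/var/log/openvswitch/ovs-vswitchd.log | tail -4
```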
>
>
> > 4) dp flows with miniflow info
> > root@instance-3:~/ovs# ovs-appctl dpctl/dump-flows -m
> > flow-dump from pmd on cpu core: 0
> > ufid:caf11111-2e15-418c-a7d4-b4ec377593ca,
> > skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),
> > ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(tg0),packet_type(ns=0,id=0),
> > eth(src=42:01:0a:b6:00:02,dst=42:01:0a:b6:00:01),eth_type(0x0800),
> > ipv4(src=10.182.0.2/0.0.0.0,dst=76.21.95.192/0.0.0.0,proto=6/0,tos=0x10/0,ttl=64/0,frag=no),
> > tcp(src=22/0,dst=62190/0),tcp_flags(0/0),
> > packets:0, bytes:0, used:never, dp:ovs, actions:drop,
> > dp-extra-info:miniflow_bits(5,1)
> > ufid:78cc1751-3a81-4dba-900c-b3507d965bdc,
> > skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),
> > ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(tg0),packet_type(ns=0,id=0),
> > eth(src=42:01:0a:b6:00:02,dst=42:01:0a:b6:00:01),eth_type(0x0800),
> > ipv4(src=10.182.0.2/0.0.0.0,dst=169.254.169.254/0.0.0.0,proto=6/0,tos=0/0,ttl=64/0,frag=no),
> > tcp(src=51650/0,dst=80/0),tcp_flags(0/0),
> > packets:0, bytes:0, used:never, dp:ovs, actions:drop,
> > dp-extra-info:miniflow_bits(5,1)
>
> The "packets:0, bytes:0, used:never" counters seem to indicate that no
> traffic is hitting these rules at all?
>
> Output here (with traffic running for a while) shows:
> packets:621588543, bytes:37295312580, used:0.000s, dp:ovs, actions:dpdk1,
> dp-extra-info:miniflow_bits(4,1)
>
>
> > 5) pmd-stat-show
> > root@instance-3:~/ovs# ovs-appctl dpif-netdev/pmd-stats-show
> > pmd thread numa_id 0 core_id 0:
> >   packets received: 19838528
> >   packet recirculations: 0
> >   avg. datapath passes per packet: 1.00
> >   emc hits: 0
> >   smc hits: 0
> >   megaflow hits: 0
> >   avg. subtable lookups per megaflow hit: 0.00  (---> this doesn't
> > look right ....)
> >   miss with success upcall: 78
> >   miss with failed upcall: 19838418
> >   avg. packets per output batch: 2.00
> >   idle cycles: 0 (0.00%)
> >   processing cycles: 103431787838 (100.00%)
> >   avg cycles per packet: 5213.68 (103431787838/19838528)
> >   avg processing cycles per packet: 5213.68 (103431787838/19838528)
>
> Would you try the pmd-stats-show command before setting the AVX512 lookup?
> If the issue is still present, it would indicate it's not related to the
> exact lookup implementation.
>
> Running the same test on master, on the patchset using the scalar lookup,
> and on the patchset using the AVX512 lookup all provide a valid megaflow
> hits count, exactly equal to (packets_rx - 1), as the first packet goes to
> the upcall to install the rule (and hence doesn't hit the rule :)
>
> $ ./utilities/ovs-appctl dpif-netdev/pmd-stats-show
>
> ### Master
> pmd thread numa_id 0 core_id 15:
>   packets received: 503095
>   packet recirculations: 0
>   avg. datapath passes per packet: 1.00
>   emc hits: 0
>   smc hits: 0
>   megaflow hits: 503094
>   avg. subtable lookups per megaflow hit: 1.00
>
> ### Scalar Lookup
> $ ./utilities/ovs-appctl dpif-netdev/pmd-stats-show
> pmd thread numa_id 0 core_id 15:
>   packets received: 508759
>   packet recirculations: 0
>   avg. datapath passes per packet: 1.00
>   emc hits: 0
>   smc hits: 0
>   megaflow hits: 508758
>   avg. subtable lookups per megaflow hit: 1.00
>
> ### AVX512 Lookup
> ./utilities/ovs-appctl dpif-netdev/pmd-stats-show
> pmd thread numa_id 0 core_id 15:
>   packets received: 540311
>   packet recirculations: 0
>   avg. datapath passes per packet: 1.00
>   emc hits: 0
>   smc hits: 0
>   megaflow hits: 540310
>   avg. subtable lookups per megaflow hit: 1.00
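That (megaflow hits == packets received - 1) relationship can be checked
mechanically; a hedged sketch parsing the pmd-stats-show output (assumes a
single PMD thread, as in the runs above):

```shell
# Compare "megaflow hits" against "packets received" - 1 in the
# pmd-stats-show output (single-PMD case assumed).
ovs-appctl dpif-netdev/pmd-stats-show | awk '
    /packets received:/ { rx = $3 }
    /megaflow hits:/    { hits = $3 }
    END { if (hits == rx - 1) print "OK"; else print "MISMATCH" }'
```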
>
>
> > 6) gdb also looks not right..., I didn't see any avx512 instructions
> > (gdb) b avx512_lookup_impl
> > Breakpoint 2 at 0x55e92342a8df: avx512_lookup_impl. (4 locations)
> > Dump of assembler code for function dpcls_avx512_gather_skx_mf_5_1:
> > 96     const uint64_t tbl_u0 = subtable->mask.mf.map.bits[0];
> >    0x000055e92342a8df <+31>: mov    0x30(%rdi),%r8
> > 97     const uint64_t tbl_u1 = subtable->mask.mf.map.bits[1];
> >    0x000055e92342a8e3 <+35>: mov    0x38(%rdi),%r9
> > 98     ovs_assert(__builtin_popcountll(tbl_u0) == bit_count_u0);
> <snip some ASM>
>
> (gdb) disas dpcls_avx512_gather_skx_mf_5_1
> <snip preamble>
>   0x0000555556103f34 <+724>:   vmovdqu64 0x28(%rdi),%zmm2{%k1}{z}
>    0x0000555556103f3e <+734>:   vmovdqu64 0x18(%rcx),%zmm0{%k1}{z}
>    0x0000555556103f48 <+744>:   vpandd %zmm0,%zmm1,%zmm0
>    0x0000555556103f4e <+750>:   vpcmpeqq %zmm2,%zmm0,%k7{%k1}
>
> Disassembly here shows AVX512 register usage here, as expected.
>
> Note that "avx512_lookup_impl" is a static function in a .c file, so it
> is not visible outside the compilation unit. Further, it is also marked
> "ALWAYS_INLINE", so even inside the compilation unit there isn't a symbol
> with that name. I'm surprised GDB let me set a breakpoint on it.
> Disassembling it doesn't work:
> (gdb) b avx512_lookup_impl
> Breakpoint 2 at 0x5555561035af: avx512_lookup_impl. (4 locations)
> (gdb) disas avx512_lookup_impl
> No symbol "avx512_lookup_impl" in current context.
>
> The functions it is inlined into are available for disassembly, as their
> symbols do exist in the binary. (Sidenote: going to add dpcls_ to the
> _any function for consistency in naming with the others):
> dpcls_avx512_gather_skx_mf_4_0
> dpcls_avx512_gather_skx_mf_4_1
> dpcls_avx512_gather_skx_mf_5_1
> avx512_gather_any
>
> Disassembling the _any version of the avx512 lookup function here
> shows the AVX512 instructions, using ZMM registers and {k} masks.
> (gdb) disas avx512_gather_mf_any
> Dump of assembler code for function avx512_gather_mf_any:
>    0x0000555556103fb0 <+0>:     lea    0x8(%rsp),%r10
>    0x0000555556103fb5 <+5>:     and    $0xffffffffffffffc0,%rsp
>    0x0000555556103fb9 <+9>:     pushq  -0x8(%r10)
> <skipping preamble/pushes etc, to the fun AVX512 part>
>    0x00005555561040dd <+301>:   vpandd %zmm0,%zmm5,%zmm0
>    0x00005555561040e3 <+307>:   or     %rdi,%rax
>    0x00005555561040e6 <+310>:   test   %r8,%r8
>    0x00005555561040e9 <+313>:   kmovb  %eax,%k4
>    0x00005555561040ed <+317>:   vpsrlq $0x4,%zmm0,%zmm2
>    0x00005555561040f4 <+324>:   vpandd %zmm3,%zmm0,%zmm0
>    0x00005555561040fa <+330>:   vpandd %zmm2,%zmm3,%zmm2
>    0x0000555556104100 <+336>:   vpshufb %zmm0,%zmm4,%zmm0
>    0x0000555556104106 <+342>:   vpshufb %zmm2,%zmm4,%zmm2
>    0x000055555610410c <+348>:   vpaddb %zmm2,%zmm0,%zmm0
>    0x0000555556104112 <+354>:   vpsadbw %zmm7,%zmm0,%zmm0
>    0x0000555556104118 <+360>:   vpaddq %zmm1,%zmm0,%zmm0
>    0x000055555610411e <+366>:   vmovdqa64 %zmm8,%zmm1
>    0x0000555556104124 <+372>:   vpgatherqq 0x18(%r9,%zmm0,8),%zmm1{%k3}
>    0x000055555610412c <+380>:   vpandq %zmm6,%zmm1,%zmm0{%k4}{z}
>
> Would you try some of the above and see can it be reproduced?
>
> Regards, -Harry
> _______________________________________________
> dev mailing list
> [email protected]
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
>