Hi Flavio,

I repeatedly tried to reproduce the results you have but failed. In fact, I am 
now seeing better results for the hw-offload case (thanks to an optimization I 
learned from your script). 

w/o hw-offload I get a fwd rate of ~7.76 Mpps: 

pmd thread numa_id 1 core_id 1:                                          
        packets received: 475469632                                      
        packet recirculations: 0                                         
        avg. datapath passes per packet: 1.00                            
        emc hits: 475466456                                              
        megaflow hits: 3072                                              
        avg. subtable lookups per megaflow hit: 1.00                     
        miss with success upcall: 1                                      
        miss with failed upcall: 71                                      
        avg. packets per output batch: 20.75                             
        idle cycles: 0 (0.00%)                                           
        processing cycles: 135318738225 (100.00%)                        
        avg cycles per packet: 284.60 (135318738225/475469632)           
        avg processing cycles per packet: 284.60 (135318738225/475469632)

w/ hw-offload I get a fwd rate of ~12 Mpps: 

pmd thread numa_id 1 core_id 1:                                          
        packets received: 213505047                                      
        packet recirculations: 0                                         
        avg. datapath passes per packet: 1.00                            
        emc hits: 213502401                                              
        megaflow hits: 2554                                              
        avg. subtable lookups per megaflow hit: 1.00                     
        miss with success upcall: 1                                      
        miss with failed upcall: 71                                      
        avg. packets per output batch: 20.09                             
        idle cycles: 0 (0.00%)                                           
        processing cycles: 40049794229 (100.00%)                         
        avg cycles per packet: 187.58 (40049794229/213505047)            
        avg processing cycles per packet: 187.58 (40049794229/213505047)

That is around a 71% improvement, and ~100 cycles per packet of datapath 
processing were saved by the HW offload.
Moreover, this is the simplest possible configuration. With more flows we can 
assume the gain will be much higher. 
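
As a rough sanity check: since the PMD reports 0% idle cycles, the avg cycles 
per packet directly bounds the forwarding rate at the core clock. For example 
(the 2.5 GHz clock below is only an assumed value for illustration, not 
necessarily what either of our setups runs at):

    # rate ceiling = core clock / avg cycles per packet (2.5 GHz is assumed)
    awk 'BEGIN { printf "w/o offload: %.2f Mpps\n", 2.5e9 / 284.60 / 1e6 }'   # ~8.78 Mpps
    awk 'BEGIN { printf "w/  offload: %.2f Mpps\n", 2.5e9 / 187.58 / 1e6 }'   # ~13.33 Mpps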

I am wondering if a third person can check this to understand which of the two 
systems is configured badly. 
Ian/Finn - can you please give it a try? With Flavio's scripts it is really simple. 
BTW - you can use the latest DPDK 17.11 stable tree (what will become 17.11.3); 
it includes the custom fixes I gave Flavio. 
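
Just to make sure we are comparing the same thing, the offload toggle on the 
OVS side is the one below (restarting ovs-vswitchd after changing it is probably 
the safest way to make sure it takes effect):

    ovs-vsctl set Open_vSwitch . other_config:hw-offload=true   # or =false for the baseline run
    ovs-vsctl get Open_vSwitch . other_config:hw-offload        # verify what is actually set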

More comments inline below. 

Thursday, June 7, 2018 10:15 PM, Flavio Leitner:
> Subject: Re: [ovs-dev] [PATCH v9 0/7] OVS-DPDK flow offload with rte_flow
> 
> On Thu, Jun 07, 2018 at 07:24:25AM +0000, Shahaf Shuler wrote:
> > Thanks Flavio for the update,
> >
> > Wednesday, June 6, 2018 5:48 PM, Flavio Leitner:
> > > Subject: Re: [ovs-dev] [PATCH v9 0/7] OVS-DPDK flow offload with
> > > rte_flow
> > >
> > >
> > > (Looks like Intel doesn't allow shell scripts, so if you need I will
> > > put somewhere else to download)
> >
> > If you can pass me the scripts for your run it will be great, please include
> also packet gen pattern. I will try to reproduce it in house.
> > Looks like the flow is inserted correctly now.
> >
> > From my testing using single loopback rule, single core, single queue and
> single flow injected I can see ~20% improvement with the HW offload (from
> 5.2Mpps -> 6.2 Mpps).
> 
> Looking forward to get the same over here.
> 
> I added the Xena script to create the traffic and the script that brings up 
> the
> environment here:
> http://people.redhat.com/~fleitner/hwol/

Unfortunately I don't have a Xena traffic generator. I used a SW-based traffic 
generator w/ 128 IP rules in round robin to reach my results. 
Is it possible that some packets are dropped for integrity reasons 
(CRC/checksum)? 
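
To rule that out, the per-port drop/error counters should show it (the port and 
bridge names below are just placeholders for whatever your setup uses):

    ovs-vsctl get Interface dpdk0 statistics    # check the rx error/drop counters
    ovs-ofctl dump-ports br0                    # rx errs/drop per OpenFlow port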

Can you attach the "ovs-appctl dpif-netdev/pmd-stats-show" output of each case?
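
To make the numbers comparable, it is probably best to clear the PMD counters 
right before each measurement interval, e.g.:

    ovs-appctl dpif-netdev/pmd-stats-clear
    # ... run traffic for a fixed interval ...
    ovs-appctl dpif-netdev/pmd-stats-show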


> 
> Below is the packet headers I am providing to the traffic generator which will
> pad to 64 bytes, and increase the address from 10.0.0.1 to 10.0.0.128:
> 
>       1 0.000000       10.0.0.1              10.0.0.2              IPv4     34
> 
> Frame 1: 34 bytes on wire (272 bits), 34 bytes captured (272 bits)
>     Encapsulation type: Ethernet (1)
>     Arrival Time: Jun  7, 2018 16:10:56.508650000 -03
>     [Time shift for this packet: 0.000000000 seconds]
>     Epoch Time: 1528398656.508650000 seconds
>     [Time delta from previous captured frame: 0.000000000 seconds]
>     [Time delta from previous displayed frame: 0.000000000 seconds]
>     [Time since reference or first frame: 0.000000000 seconds]
>     Frame Number: 1
>     Frame Length: 34 bytes (272 bits)
>     Capture Length: 34 bytes (272 bits)
>     [Frame is marked: False]
>     [Frame is ignored: False]
>     [Protocols in frame: eth:ethertype:ip]
> Ethernet II, Src: RealtekU_c6:10:10 (52:54:00:c6:10:10), Dst: RealtekU_c6:10:20 (52:54:00:c6:10:20)
>     Destination: RealtekU_c6:10:20 (52:54:00:c6:10:20)
>         Address: RealtekU_c6:10:20 (52:54:00:c6:10:20)
>         .... ..1. .... .... .... .... = LG bit: Locally administered address (this is NOT the factory default)
>         .... ...0 .... .... .... .... = IG bit: Individual address (unicast)
>     Source: RealtekU_c6:10:10 (52:54:00:c6:10:10)
>         Address: RealtekU_c6:10:10 (52:54:00:c6:10:10)
>         .... ..1. .... .... .... .... = LG bit: Locally administered address (this is NOT the factory default)
>         .... ...0 .... .... .... .... = IG bit: Individual address (unicast)
>     Type: IPv4 (0x0800)
> Internet Protocol Version 4, Src: 10.0.0.1, Dst: 10.0.0.2
>     0100 .... = Version: 4
>     .... 0101 = Header Length: 20 bytes (5)
>     Differentiated Services Field: 0x00 (DSCP: CS0, ECN: Not-ECT)
>         0000 00.. = Differentiated Services Codepoint: Default (0)
>         .... ..00 = Explicit Congestion Notification: Not ECN-Capable Transport (0)
>     Total Length: 20
>     Identification: 0x0001 (1)
>     Flags: 0x0000
>         0... .... .... .... = Reserved bit: Not set
>         .0.. .... .... .... = Don't fragment: Not set
>         ..0. .... .... .... = More fragments: Not set
>         ...0 0000 0000 0000 = Fragment offset: 0
>     Time to live: 64
>     Protocol: IPv6 Hop-by-Hop Option (0)
>     Header checksum: 0x66e7 [validation disabled]
>     [Header checksum status: Unverified]
>     Source: 10.0.0.1
>     Destination: 10.0.0.2
> 
> >
> > >
> > > Well, I did a quick test with extra logs today:
> > >
> > > 2018-06-06T14:41:28.944Z|00120|memory|INFO|97880 kB peak resident set size after 10.3 seconds
> > > 2018-06-06T14:41:28.944Z|00121|memory|INFO|handlers:1 ports:2 revalidators:1 rules:5
> > > 2018-06-06T14:41:29.414Z|00002|dpdk(dpdk_watchdog1)|ERR|PMD: mlx5_flow.c:1901: priv_flow_create(): Flow created 0x7f8f26503840
> > > 2018-06-06T14:41:29.424Z|00122|dpdk|ERR|PMD: mlx5_flow.c:1901: priv_flow_create(): Flow created 0x7f8f26503600
> > > 2018-06-06T14:41:34.415Z|00003|dpdk(dpdk_watchdog1)|ERR|PMD: mlx5_flow.c:1901: priv_flow_create(): Flow created 0x7f8f265033c0
> > > 2018-06-06T14:41:34.435Z|00123|dpdk|ERR|PMD: mlx5_flow.c:1901: priv_flow_create(): Flow created 0x7f8f26503180
> > > 2018-06-06T14:41:39.415Z|00004|dpdk(dpdk_watchdog1)|ERR|PMD: mlx5_flow.c:1901: priv_flow_create(): Flow created 0x7f8f26505e40
> > > 2018-06-06T14:41:39.446Z|00124|dpdk|ERR|PMD: mlx5_flow.c:1901: priv_flow_create(): Flow created 0x7f8f26505c80
> > > 2018-06-06T14:41:44.416Z|00005|dpdk(dpdk_watchdog1)|ERR|PMD: mlx5_flow.c:1901: priv_flow_create(): Flow created 0x7f8f26505a40
> > > 2018-06-06T14:41:44.455Z|00125|dpdk|ERR|PMD: mlx5_flow.c:1901: priv_flow_create(): Flow created 0x7f8f26505800
> > > 2018-06-06T14:41:49.417Z|00006|dpdk(dpdk_watchdog1)|ERR|PMD: mlx5_flow.c:1901: priv_flow_create(): Flow created 0x7f8f265055c0
> > > 2018-06-06T14:41:49.465Z|00126|dpdk|ERR|PMD: mlx5_flow.c:1901: priv_flow_create(): Flow created 0x7f8f26505380
> > > 2018-06-06T14:41:54.417Z|00007|dpdk(dpdk_watchdog1)|ERR|PMD: mlx5_flow.c:1901: priv_flow_create(): Flow created 0x7f8f26505140
> > > 2018-06-06T14:41:54.476Z|00127|dpdk|ERR|PMD: mlx5_flow.c:1901: priv_flow_create(): Flow created 0x7f8f26526dc0
> > > <traffic starts>
> > > 2018-06-06T14:41:56.024Z|00001|dpif_netdev(pmd9)|INFO|fbl: queue_netdev_flow_put op=ADD
> > > 2018-06-06T14:41:56.024Z|00045|dpif_netdev(dp_netdev_flow_8)|INFO|fbl: dp_netdev_flow_offload_main op=ADD
> > > 2018-06-06T14:41:56.024Z|00046|dpif_netdev(dp_netdev_flow_8)|WARN|Mark id for ufid 90c67f4a-ebd3-482a-8191-444388fee419 was not found
> > > 2018-06-06T14:41:56.024Z|00047|netdev_dpdk(dp_netdev_flow_8)|INFO|rte flow eth pattern:
> > > 2018-06-06T14:41:56.024Z|00048|netdev_dpdk(dp_netdev_flow_8)|INFO|  Spec = null
> > > 2018-06-06T14:41:56.024Z|00049|netdev_dpdk(dp_netdev_flow_8)|INFO|  Mask = null
> > > 2018-06-06T14:41:56.024Z|00050|netdev_dpdk(dp_netdev_flow_8)|INFO|rte flow ipv4 pattern:
> > > 2018-06-06T14:41:56.024Z|00051|netdev_dpdk(dp_netdev_flow_8)|INFO|  Spec: tos=0x0, ttl=40, proto=0x0, src=10.0.0.1, dst=10.0.0.2
> > > 2018-06-06T14:41:56.024Z|00052|netdev_dpdk(dp_netdev_flow_8)|INFO|  Mask: tos=0x0, ttl=0, proto=0x0, src=0.0.0.0, dst=0.0.0.0
> > > 2018-06-06T14:41:56.039Z|00053|dpdk(dp_netdev_flow_8)|ERR|PMD: mlx5_flow.c:1901: priv_flow_create(): Flow created 0x7f8f26526c00
> > > 2018-06-06T14:41:56.039Z|00054|netdev_dpdk(dp_netdev_flow_8)|INFO|installed flow 0x7f8f26526c00 by ufid 90c67f4a-ebd3-482a-8191-444388fee419
> > > 2018-06-06T14:41:56.039Z|00055|dpif_netdev(dp_netdev_flow_8)|INFO|succeed to add netdev flow
> > >
> > > Looks like the flow is added to netdev, but I still see 100%
> > > processing packets and less than 10Mpps.
> > >
> > > --
> > > Flavio
> >
> 
> --
> Flavio
