Thanks Neale. If will fix it and recheck.

> -----Original Message-----
> From: Neale Ranns (nranns) <nra...@cisco.com>
> Sent: Thursday, May 28, 2020 1:56 AM
> To: Andrew Yourtchenko <ayour...@gmail.com>; Govindarajan Mohandoss
> <govindarajan.mohand...@arm.com>
> Cc: vpp-dev@lists.fd.io; Lijian Zhang <lijian.zh...@arm.com>; Jieqiang
> Wang <jieqiang.w...@arm.com>; Honnappa Nagarahalli
> <honnappa.nagaraha...@arm.com>; nd <n...@arm.com>
> Subject: Re: [vpp-dev] ACL plugin optimization
> 
> 
> Hi Govind,
> 
> As well as removing the prefetches, you've also removed the per packet call
> to acl_fa_find_session_with_hash(). So IIUC you've removed the per-packet
> session lookup and instead re-use the lookup of packet 0 each time. that'll
> make things quicker but it's not functionally correct.
> 
> /neale
> 
> On 27/05/2020 23:51, "vpp-dev@lists.fd.io on behalf of Andrew
> Yourtchenko" <vpp-dev@lists.fd.io on behalf of ayour...@gmail.com> wrote:
> 
>     Hi Govind,
> 
>     1) According to Jenkins, this patch permits some of the packets that
>     should be denied, hence JJB voted "-1".
> 
>     2) If you suspect merely the prefetches are the issue, just commenting
>     out the body of prefetch_session_entry() in the original code should
>     turn it into a no-op that doesn't break anything.
> 
>     Hard to say anything else given the functionality is not correct.
> 
>     In general - ensure you run "EXTENDED_TESTS=y TEST=acl* make test" as
>     a sanity check before extensive perf-tests. It's not a 100% guarantee
>     but it does catch a few naughty cases.
> 
>     Also - take a look at f1cd92d8d9, which got about 30% improvement back
>     in the day, and is the source of much of the trickiness in that node.
> 
>     --a
> 
> 
>     On 5/27/20, Govindarajan Mohandoss
> <govindarajan.mohand...@arm.com> wrote:
>     > Hi Andrew,
>     >
>     >   While profiling the ACL plugin node using perf tool in ARM Neoverse
>     > platform, Bihash related prefetches were shown as bottleneck.
>     >
>     > Performance improvement is seen in ARM N1, TX2 and Intel Skylake
> servers
>     > after removing those prefetches. Testing is done with Ingress ACL/IPv4
>     > forwarding in both SF and SL modes.
>     >
>     > As the code change is common for Ingress/Egress ACL for both IPv4 and
> IPv6,
>     > performance improvement is expected for those cases also.
>     >
>     > Following are the test results for Ingress ACL / IPv4 / 1 core / 64B @ 
> MRR
>     > in ARM N1, TX2 and Intel Skylake servers:
>     >
>     >
>     >
>     > Legend:
>     >
>     > =======
>     >
>     > N1 - ARM Neoverse
>     >
>     > TX2 - ARM Thunder X2
>     >
>     > SKX - Intel Skylake
>     >
>     > SL: % imp - Performance improvement in stateless mode
>     >
>     > SF: % imp - Performance improvement in stateful mode
>     >
>     >
>     >
>     >
>     >
>     >
>     > SKX
>     > N1
>     > TX2
>     > Num Rules
>     > Matching Rules
>     > SL: Avg % imp
>     > SF: Avg % imp
>     > SL: % imp
>     > SF: % imp
>     > SL: % imp
>     > SF: % imp
>     > 1
>     > 1
>     > 0.99
>     > 12.09
>     > 8.38
>     > 10.41
>     > 4.48
>     > 4.63
>     > 50
>     > 1 (50th)
>     > 0.79
>     > 9.63
>     > 8.76
>     > 10.06
>     > 5.32
>     > 4.63
>     > 100
>     > 1 (100th)
>     > 4.34
>     > 10.75
>     > 8.60
>     > 10.06
>     > 6.98
>     > 4.63
>     > 1000
>     > 1(1000th)
>     > 4.18
>     > 13.06
>     > 8.61
>     > 11.14
>     > 6.17
>     > 5.58
>     > 100
>     > 100
>     > 3.70
>     > 11.70
>     > 6.65
>     > 14
>     > 2.82
>     > 6.53
>     > 1000
>     > 1000
>     > 1.84
>     > 15.96
>     > 5.52
>     > 27.72
>     > 4.72
>     > 8.69
>     >
>     >
>     >
>     >
>     >
>     > Please find the patch here: https://gerrit.fd.io/r/c/vpp/+/27167
>     >
>     >
>     >
>     > I ran per patch regression on ARM Taishan server in CSIT lab. Following
> are
>     > the results for Stateless and Stateful modes:
>     >
>     > 1.  perftest-3n-tsh acl_statelessAND1cAND64b:
>     >
>     >
>     > https://jenkins.fd.io/job/vpp-csit-verify-perf-master-3n-
> tsh/23/consoleFull
>     >
>     >      In the log, I can see the comparative numbers between parent and
>     > current (my patch) for 45 test cases.
>     >
>     >      I searched for "Difference of averages relative to parent" in the 
> log -
>     >  41/45 test cases have shown around 4% improvement with the patch.
> Rest of
>     > the 4 test cases stayed neutral.
>     >
>     >
>     >
>     > 2. perftest-3n-tsh acl_statefulAND1cAND64b:
>     >
>     >     https://jenkins.fd.io/job/vpp-csit-verify-perf-master-3n-tsh/25/
>     >
>     >     Performance improvement is seen in all 36 test cases.
>     >
>     >
>     >
>     > Please provide your comments.
>     >
>     >
>     >
>     > Thanks
>     >
>     > Govind
>     >
>     >
>     >

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#16581): https://lists.fd.io/g/vpp-dev/message/16581
Mute This Topic: https://lists.fd.io/mt/74507621/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-

Reply via email to