Hmm, this output looks exactly like what I had before commit ed435504 - the IPv6 table bucket area gets corrupted by overruns from the IPv4 table arena.
And then I spotted the leak which was happening during the rehash, which I fixed in 11521387. Your (very few) freelists look like you don't have that fix either - normally you would see something like the output at http://paste.ubuntu.com/p/y43MsrdHSr/ - notice how many short elements are in the freelists.

And given your tests confirm the session cleaner optimization works as I intended, I have it now in master. So you can just pull a fresh tree from master and recheck :)

--a

> On 25 Sep 2018, at 13:10, khers <s3m2e1.6s...@gmail.com> wrote:
>
> I checked out from gerrit; I think it's using the latest master (but I will
> make another build to be sure).
> Let me explain how I reproduce this situation:
>
> while true :)
> 1- run trex
>    ./t-rex-64 --cfg cfg/trex_config.yaml -f cap2/sfr.yaml -m 7 -c 2
> 2- stop trex after ~120 seconds
> 3- wait until all sessions are deleted from 'sh acl-plugin session'
>
> As you see, I waited until all sessions were deleted, so all buckets must be
> completely free.
> I am sending you part of 'show acl-plugin sessions verbose 1' in this link.
>
>> On Tue, Sep 25, 2018 at 1:52 PM Andrew 👽 Yourtchenko <ayour...@gmail.com> wrote:
>>
>> Are you using latest master? I fixed a couple of issues in bihash last week
>> related to memory usage... if it's the latest master, the output of used vs
>> available looks weird... - so please let me know...
>>
>> As for the "general" growth - basically what happens is bihash doubles each
>> bucket size whenever there is a collision on insert, and then converts the
>> bucket into linear lookup whenever there is still a collision after that
>> growth.
>>
>> Then the only time the shrinkage/reset happens is when the bucket is
>> completely free - which, with long-living sessions with overlapping
>> lifetimes, might mean never.
>>
>> So one approach to this is to increase the number of buckets. Then they
>> will be smaller and have a higher probability of being freed.
>>
>> This is assuming there is nothing else "funny" going on.
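[Editor's note] The growth policy described above - double the bucket on an insert collision, fall back to linear search if it still collides after growing, and reset only when the bucket is completely empty - can be sketched as a toy model. This is a simplification for illustration only, not the actual vppinfra bihash code; `Bucket`, `KVP_PER_PAGE`, and the method names are invented here:

```python
KVP_PER_PAGE = 4  # slots per bucket page (illustrative; not the real vpp value)

class Bucket:
    """Toy model of one bihash bucket, per the behaviour described above."""
    def __init__(self):
        self.log2_pages = 0   # bucket holds 2**log2_pages pages
        self.linear = False   # True once degraded to linear search
        self.pages = [[]]

    def insert(self, key):
        if self.linear:
            self.pages[0].append(key)  # linear bucket: one flat list
            return
        page = hash(key) % len(self.pages)
        if len(self.pages[page]) < KVP_PER_PAGE:
            self.pages[page].append(key)
            return
        # collision on insert: double the bucket and redistribute
        self.log2_pages += 1
        old = [k for p in self.pages for k in p] + [key]
        self.pages = [[] for _ in range(2 ** self.log2_pages)]
        for k in old:
            p = hash(k) % len(self.pages)
            if len(self.pages[p]) >= KVP_PER_PAGE:
                # still colliding after the growth: convert to linear lookup
                self.linear = True
                self.pages = [old]
                return
            self.pages[p].append(k)

    def delete(self, key):
        for p in self.pages:
            if key in p:
                p.remove(key)
                break
        if not any(self.pages):
            # the ONLY shrink/reset path: the bucket must be completely free
            self.__init__()
```

In this model, as in the description above, a bucket that always holds at least one live session never takes the reset path - which is why more, smaller buckets (each more likely to drain completely at some point) help.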
>> You can do "show acl-plugin sessions verbose 1" via vppctl (it will take
>> forever to complete and needs the pager disabled, since it dumps the entire
>> bihash) to inspect the way the buckets are filled...
>>
>> --a
>>
>>> On 25 Sep 2018, at 12:12, khers <s3m2e1.6s...@gmail.com> wrote:
>>>
>>> It's amazing!!!
>>>
>>> IPv4 Session lookup hash table:
>>> Hash table ACL plugin FA IPv4 session bihash
>>>     968086 active elements 65536 active buckets
>>>     13 free lists
>>>        [len 16] 1 free elts
>>>        [len 32] 1 free elts
>>>        [len 256] 10669 free elts
>>>        [len 512] 36768 free elts
>>>        [len 1024] 4110 free elts
>>>        [len 2048] 156 free elts
>>>        [len 4096] 4 free elts
>>>     844 linear search buckets
>>>     arena: base 7fe912320000, next 2680ca780
>>>            used 10335594368 b (9856 Mbytes) of 10000000000 b (9536 Mbytes)
>>>
>>>> On Tue, Sep 25, 2018 at 1:39 PM khers <s3m2e1.6s...@gmail.com> wrote:
>>>> Yes, that's right. I think this is a completely different issue from the
>>>> patch you sent.
>>>>
>>>>> On Tue, Sep 25, 2018 at 1:35 PM Andrew 👽 Yourtchenko <ayour...@gmail.com> wrote:
>>>>> Excellent, thanks!
>>>>>
>>>>> Memory usage - you mean in the bihash arena?
>>>>>
>>>>> --a
>>>>>
>>>>>> On 25 Sep 2018, at 11:38, khers <s3m2e1.6s...@gmail.com> wrote:
>>>>>>
>>>>>> Throughput and session add/del are stable as a rock. The only danger I
>>>>>> see is the growing memory usage.
>>>>>> Look at this:
>>>>>>
>>>>>>> On Tue, Sep 25, 2018 at 11:31 AM khers <s3m2e1.6s...@gmail.com> wrote:
>>>>>>> Of course. I tested your patch; there is no slowdown with my scenario.
>>>>>>> I need more time to test other scenarios and make sure.
>>>>>>>
>>>>>>>> On Mon, Sep 24, 2018 at 3:11 PM Andrew 👽 Yourtchenko <ayour...@gmail.com> wrote:
>>>>>>>> Cool. Then it is probably indeed the session requeues that are not
>>>>>>>> yet efficient... I have been looking at optimizing that.
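[Editor's note] The arena counters in the dump above already show the inconsistency Andrew mentions ("the output of used vs available looks weird"): the reported "used" figure is larger than the arena itself. A quick check of the printed numbers, assuming the dump's "Mbytes" are mebibytes (MiB, 2^20 bytes):

```python
# Numbers copied from the "show acl-plugin sessions" dump above
used_b = 10335594368       # arena bytes reported as used
avail_b = 10000000000      # arena size it was created with

MiB = 1 << 20              # the dump's "Mbytes" appear to be MiB
used_mb = used_b // MiB    # 9856, matching "9856 Mbytes"
avail_mb = avail_b // MiB  # 9536, matching "9536 Mbytes"

# "used" exceeds the entire arena by roughly 320 MiB - an overrun,
# not normal growth, consistent with the pre-ed435504 corruption
# described at the top of the thread
overrun_b = used_b - avail_b
```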
>>>>>>>> I have a draft in the works which should have fewer session requeues -
>>>>>>>> I have just added you to it; could you give it a shot and see if it
>>>>>>>> makes things better?
>>>>>>>>
>>>>>>>> --a
>>>>>>>>
>>>>>>>>> On 24 Sep 2018, at 12:55, khers <s3m2e1.6s...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> Yes, I confirm.
>>>>>>>>>
>>>>>>>>>> On Mon, Sep 24, 2018 at 2:08 PM Andrew 👽 Yourtchenko <ayour...@gmail.com> wrote:
>>>>>>>>>> Okay, so what I think I am hearing is: the gradual slowdown is/was
>>>>>>>>>> always there, and is somewhat more pronounced in master, right?
>>>>>>>>>>
>>>>>>>>>> --a
>>>>>>>>>>
>>>>>>>>>>> On 24 Sep 2018, at 11:49, khers <s3m2e1.6s...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> I always get SIGSEGV or a 'worker thread deadlock' in 1804 with 1
>>>>>>>>>>> or more worker threads and 1 main thread,
>>>>>>>>>>> but when vpp used one cpu I had no problem. 1807 multi-core is
>>>>>>>>>>> stable - I didn't see either of those
>>>>>>>>>>> problems, but throughput is declining slowly.
>>>>>>>>>>> I ran another test with the same version as in my last email, with
>>>>>>>>>>> vpp configured to use one core, and throughput is declining more
>>>>>>>>>>> slowly than on master:
>>>>>>>>>>> second 200
>>>>>>>>>>> second 5900
>>>>>>>>>>>
>>>>>>>>>>>> On Sun, Sep 23, 2018 at 6:57 PM Andrew 👽 Yourtchenko <ayour...@gmail.com> wrote:
>>>>>>>>>>>> Interesting - but you are saying in 1804 this effect is not
>>>>>>>>>>>> observed? There were no other notable changes with regards to
>>>>>>>>>>>> session management - but maybe it is worth it to just do a git
>>>>>>>>>>>> bisect and see. Should be 4-5 iterations. Could you verify that -
>>>>>>>>>>>> if indeed this is not seen in 1804?
>>>>>>>>>>>>
>>>>>>>>>>>> --a
>>>>>>>>>>>>
>>>>>>>>>>>>> On 23 Sep 2018, at 16:42, khers <s3m2e1.6s...@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> I checked out the version before gerrit 12770 was merged to
>>>>>>>>>>>>> master.
>>>>>>>>>>>>> 2371c25fed6b2e751163df590bb9d9a93a75a0f
>>>>>>>>>>>>>
>>>>>>>>>>>>> I got SIGSEGV with 2 workers, so I repeated the test with one
>>>>>>>>>>>>> worker.
>>>>>>>>>>>>> Throughput is going down like in the latest version.
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sun, Sep 23, 2018 at 4:55 PM Andrew 👽 Yourtchenko <ayour...@gmail.com> wrote:
>>>>>>>>>>>>>> Would you be able to confirm that it changes at the point of
>>>>>>>>>>>>>> https://gerrit.fd.io/r/#/c/12770/ ?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --a
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 23 Sep 2018, at 13:31, emma sdi <s3m2e1.6s...@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Dear Community,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I have a simple configuration as follows:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> startup.conf
>>>>>>>>>>>>>>> simple_acl
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I used the Trex packet generator with the following command:
>>>>>>>>>>>>>>> ./t-rex-64 --cfg cfg/trex_config.yaml -f cap2/sfr.yaml -m 5 -c 2 -d 6000
>>>>>>>>>>>>>>> The Total-RX gradually decreases; here is the output of Trex at
>>>>>>>>>>>>>>> seconds 200 and 5900.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I did not see this problem in 18.04. I think the session_cleaner
>>>>>>>>>>>>>>> thread generates too many interrupts; do you have any idea?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Regards
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -=-=-=-=-=-=-=-=-=-=-=-
>>>>>>>>>>>>>>> Links: You receive all messages sent to this group.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> View/Reply Online (#10615): https://lists.fd.io/g/vpp-dev/message/10615
>>>>>>>>>>>>>>> Mute This Topic: https://lists.fd.io/mt/26145401/675608
>>>>>>>>>>>>>>> Group Owner: vpp-dev+ow...@lists.fd.io
>>>>>>>>>>>>>>> Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [ayour...@gmail.com]
>>>>>>>>>>>>>>> -=-=-=-=-=-=-=-=-=-=-=-