Hi Neale,

My personal opinion (since we don’t have anything written down about 
“acceptable practices in tests” - should we?):

I view it along two axes:

1) ownership. 

The owner of the component also owns the tests for that component, and is free 
to do whatever makes it easy to test it. The “smell” of CLI scraping is that 
when the CLI output changes, the scraping might break, and then the downstream 
logic breaks. But in this case, if you as the component owner decide to change 
the CLI outputs and that breaks the tests that scrape them... just fix the 
parsing of the outputs within the same commit and we move on. (Of course it 
makes cherry-picking harder, but that’s another story.)

2) observability.

If something in the tests does CLI scraping, ask: is the value being scraped 
fundamental/important enough that end users would benefit from having an API 
or a counter for it?

This dimension is independent of (1), but if I needed some CLI scraping in a 
test, it might make sense to file a little follow-up that converts the scraped 
value into a counter in the stats segment - easy to use and useful for 
everyone... If not, just leave the CLI scraping there...
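
To make the contrast concrete, here is a minimal sketch in the style of the 
“make test” framework. The test class is made up, and the stats path 
“/gtpu/tunnels-created” is a hypothetical counter (the follow-up work I mean), 
not one that exists today:

    from framework import VppTestCase

    class TestTunnelObservability(VppTestCase):
        """CLI scraping vs. stats segment counter (illustration only)"""

        def test_tunnel_count(self):
            # Option 1: scrape the CLI. Works, but silently couples the
            # test to the exact "show" output format.
            out = self.vapi.cli("show gtpu tunnel")
            scraped = len([l for l in out.splitlines() if "teid" in l])
            self.assertEqual(scraped, 1)

            # Option 2: read a counter from the stats segment. A stable
            # contract, and end users can consume it too.
            # (hypothetical path - adding the counter is the follow-up)
            created = self.statistics.get_counter("/gtpu/tunnels-created")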

--a

> On 7 May 2020, at 14:11, Neale Ranns (nranns) <nra...@cisco.com> wrote:
> 
> 
>
> Hi Andrew,
>
> They’re the first UTs I’ve seen that screen-scrape show output. I wasn’t sure 
> this was acceptable practice. But if I had a better alternative I’d have 
> suggested it already…
>
> /neale
>
> From: Andrew 👽 Yourtchenko <ayour...@gmail.com>
> Date: Wednesday 6 May 2020 at 15:45
> To: Nick Zavaritsky <nick.zavarit...@emnify.com>
> Cc: "vpp-dev@lists.fd.io" <vpp-dev@lists.fd.io>, "Neale Ranns (nranns)" 
> <nra...@cisco.com>
> Subject: Re: [vpp-dev] DPO leak in various tunnel types (gtpu, geneve, vxlan, 
> ...)
>
> Nick,
>
> Fixes are always good, especially with UTs, so thanks a lot!
>
> I took a glance at the UTs...
>
> One question and one bigger remark:
>
> 1) The UT stuff looks fairly straightforward, except the tests are IPv4-only - 
> is it enough to test only one address family? If yes, a comment inside the UT 
> would be cool...
>
> 2) The deletion of the ipv6 parameter in vpp_papi_provider.py seems like a 
> stray unrelated change that sneaked in? (That’s actually what made me -1.)
>
> Those are the only two things I could see; the UTs seem fairly straightforward.
>
> --a
> 
> 
> On 6 May 2020, at 13:55, Nick Zavaritsky <nick.zavarit...@emnify.com> wrote:
> 
> Dear VPP hackers,
>
> May I kindly ask for a code review of the proposed fix?
>
> Thanks,
> N
>
> 
> 
> On 21. Apr 2020, at 15:15, Neale Ranns (nranns) <nra...@cisco.com> wrote:
>
> Hi Nick,
>
> A +1 from me for the VPP change, thank you.
> I’m all for UTs too, but I’ll let others review the UT first before I 
> merge.
>
> /neale
>
> From: <vpp-dev@lists.fd.io> on behalf of Nick Zavaritsky 
> <nick.zavarit...@emnify.com>
> Date: Tuesday 21 April 2020 at 14:57
> To: "vpp-dev@lists.fd.io" <vpp-dev@lists.fd.io>
> Subject: [vpp-dev] DPO leak in various tunnel types (gtpu, geneve, vxlan, ...)
>
> Dear VPP hackers,
> 
> We are spawning and destroying GTPU tunnels at a high rate. Only 10K tunnels 
> ever exist simultaneously in our test.
> 
> With default settings, we observe an out-of-memory error in 
> load_balance_create after approximately 0.5M tunnel create commands. 
> Apparently, load balancers are leaking.
> 
> As far as my understanding goes, a load balancer is first created in 
> fib_entry_track, to get notifications about route changes. It is created only 
> once per unique DIP, and the refcount is correctly decremented once the last 
> subscription ceases.
> 
> The refcount is also bumped in gtpu_tunnel_restack_dpo, when next_dpo is 
> updated. Since the latter is never reset, the refcount never drops to zero.
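> 
> Roughly, paraphrasing the restack path from memory (a sketch, not the exact 
> source):
> 
>     /* sketch of gtpu_tunnel_restack_dpo(): fetch the forwarding DPO for
>        the tunnel's FIB entry and stack next_dpo on it; the stacking takes
>        a lock (refcount) on the underlying load balancer */
>     dpo_id_t dpo = DPO_INVALID;
>     fib_entry_contribute_forwarding (t->fib_entry_index, forw_type, &dpo);
>     dpo_stack_from_node (encap_node_index, &t->next_dpo, &dpo);
>     dpo_reset (&dpo); /* the temporary is released, but t->next_dpo keeps
>                          its lock until somebody resets it */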
> 
> This is straightforward to exercise in the CLI: create and immediately destroy 
> several GTPU tunnels, then compare `show dpo memory` output before and after.
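> 
> For example (CLI syntax from memory - adjust addresses and teid to taste):
> 
>     vpp# create gtpu tunnel src 10.0.0.1 dst 10.0.0.2 teid 1
>     vpp# create gtpu tunnel src 10.0.0.1 dst 10.0.0.2 teid 1 del
>     vpp# show dpo memory
> 
> After a create/delete cycle the load-balance pool should be back to its 
> previous size; with the leak, it keeps growing.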
> 
> It looks like other tunnel types, namely geneve, vxlan, vxlan-gpe and 
> vxlan-gbp, are also susceptible.
> 
> My take was to add a call to dpo_reset in the add_del_tunnel delete path. 
> Please take a look at the patch: https://gerrit.fd.io/r/c/vpp/+/26617
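> 
> The shape of the change is essentially one line in the delete path (a sketch - 
> see the patch for the exact diff):
> 
>     /* in vnet_gtpu_add_del_tunnel(), delete path: drop the lock that
>        gtpu_tunnel_restack_dpo() took via t->next_dpo */
>     dpo_reset (&t->next_dpo);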
> 
> Note: I was unable to make a test case for vxlan and vxlan-gbp, since they 
> don’t point next_dpo at a load balancer but rather at a DPO picked from a 
> bucket in the load balancer.
> 
> Best,
> N 
> 
> 