Hi Klement,

Thanks! I have now tested your patch (28980); it works and does give
some improvement. However, according to my tests, increasing
NAT_FQ_NELTS has a much bigger effect and improves performance a lot.
With the original NAT_FQ_NELTS value of 64, your patch gives some
improvement, but I still get the best performance by increasing
NAT_FQ_NELTS.
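
For reference, by "increasing NAT_FQ_NELTS" I mean simply bumping the
compile-time constant in the NAT plugin and rebuilding. The exact
header it lives in may differ between VPP versions, but the change we
tested is essentially just:

  /* handoff frame queue size; the default is 64 */
  #define NAT_FQ_NELTS 1024

nothing more clever than that.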

For example, one of the tests behaves like this:

Without patch, NAT_FQ_NELTS=64   --> 129 Gbit/s and ~600k cong. drops
With patch,    NAT_FQ_NELTS=64   --> 136 Gbit/s and ~400k cong. drops
Without patch, NAT_FQ_NELTS=1024 --> 151 Gbit/s and 0 cong. drops
With patch,    NAT_FQ_NELTS=1024 --> 151 Gbit/s and 0 cong. drops

So it still looks like increasing NAT_FQ_NELTS would be good, which
brings me back to the same questions as before:

Were there specific reasons for setting NAT_FQ_NELTS to 64?

Are there potential drawbacks or dangers in changing it to a larger
value?

I suppose everyone will agree that when a queue has a maximum length,
the choice of that maximum can be important. Is there any particular
reason to believe that 64 is enough? In our case we are using 8 NAT
threads. Suppose thread 8 is held up briefly because something takes a
little longer than usual, and in the meantime threads 1-7 each hand
off 10 frames to thread 8; that situation already requires a queue
size of at least 70, unless I have misunderstood how the handoff
mechanism works. To me, allowing a longer queue seems like a good
thing because it also covers the more difficult cases: threads are not
always equally fast, traffic spikes can hit some threads harder than
others, and so on. But maybe there are strong reasons for keeping the
queue short, reasons I don't know about; that's why I'm asking.
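
To make the arithmetic concrete, here it is as a tiny standalone C
program; the thread count and burst size are just the assumptions from
the scenario above, not measurements:

  #include <stdio.h>

  int main (void)
  {
    int n_threads        = 8;  /* NAT worker threads in our setup */
    int frames_per_burst = 10; /* frames each peer hands off while
                                  the target thread is held up */
    int queue_elts       = 64; /* current NAT_FQ_NELTS */

    /* one thread stalls; every other thread enqueues a burst for it */
    int worst_case = (n_threads - 1) * frames_per_burst; /* 7 * 10 */

    printf ("needed: %d slots, available: %d\n", worst_case, queue_elts);
    return 0;
  }

Even this mild scenario needs 70 slots, more than the 64 available.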

Best regards,
Elias


On Fri, 2020-11-13 at 15:14 +0000, Klement Sekera -X (ksekera -
PANTHEON TECH SRO at Cisco) wrote:
> Hi Elias,
> 
> I’ve already debugged this and came to the conclusion that it’s the
> infra which is the weak link. I was seeing congestion drops at mild
> load, but not at full load. The issue is that with handoff the
> workload is uneven. For simplicity’s sake, just consider thread 1
> handing off all the traffic to thread 2. For thread 1 the job is much
> easier: it just does some ip4 parsing and then hands the packet to
> thread 2, which does the heavy lifting of hash
> inserts/lookups/translation etc. A 64-element queue can hold 64
> frames; one extreme is 64 1-packet frames, totalling 64 packets, the
> other extreme is 64 255-packet frames, totalling ~16k packets. What
> happens is this: thread 1 is mostly idle, just picking a few packets
> from the NIC, and every one of these small frames creates an entry in
> the handoff queue. Thread 2 then picks one element from the handoff
> queue and deals with it before picking another one. If the queue
> holds only 3-packet or 10-packet elements, thread 2 can never really
> get into what VPP excels at - bulk processing.
> 
> Q: Why doesn’t it pick as many packets as possible from the handoff
> queue? 
> A: It’s not implemented.
> 
> I already wrote a patch for it, which made all the congestion drops I
> saw (in the above synthetic test case) disappear. The patch,
> https://gerrit.fd.io/r/c/vpp/+/28980, is sitting in gerrit.
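> 
> Very roughly, the idea is to drain several small elements from the
> handoff queue into one vector before processing them, instead of
> taking one element at a time, along these lines (sketch only, the
> helper names are made up and this is not the actual diff):
> 
>   /* "q" is the worker's handoff frame queue */
>   vlib_frame_queue_elt_t *elt;
>   u32 buffers[VLIB_FRAME_SIZE];
>   u32 n_collected = 0;
> 
>   while (n_collected < VLIB_FRAME_SIZE
>          && (elt = handoff_queue_peek (q)))        /* made-up helper */
>     {
>       if (n_collected + elt->n_vectors > VLIB_FRAME_SIZE)
>         break;                   /* next element would not fit */
>       clib_memcpy_fast (buffers + n_collected, elt->buffer_index,
>                         elt->n_vectors * sizeof (u32));
>       n_collected += elt->n_vectors;
>       handoff_queue_consume (q, elt);              /* made-up helper */
>     }
>   /* ...then run the NAT node on all n_collected packets at once */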
> 
> Would you like to give it a try and see if it helps your issue? We
> shouldn’t need big queues under mild loads anyway …
> 
> Regards,
> Klement
> 
