Hi Klement,

Thanks! I have now tested your patch (28980); it seems to work and it does give some improvement. However, according to my tests, increasing NAT_FQ_NELTS has a bigger effect and improves performance a lot. With the original NAT_FQ_NELTS value of 64, your patch gives some improvement, but I still get the best performance by increasing NAT_FQ_NELTS.
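(For clarity: the only change in my NAT_FQ_NELTS=1024 runs is the compile-time constant itself. In the tree I am building from it is a plain #define; the exact file may differ between VPP versions, so treat this as a sketch of the change rather than a patch:)

#define NAT_FQ_NELTS 1024   /* default is 64 */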
For example, one of the tests behaves like this:

  Without patch, NAT_FQ_NELTS=64   --> 129 Gbit/s and ~600k cong. drops
  With patch,    NAT_FQ_NELTS=64   --> 136 Gbit/s and ~400k cong. drops
  Without patch, NAT_FQ_NELTS=1024 --> 151 Gbit/s and 0 cong. drops
  With patch,    NAT_FQ_NELTS=1024 --> 151 Gbit/s and 0 cong. drops

So it still looks like increasing NAT_FQ_NELTS would be good, which brings me back to the same questions as before: were there specific reasons for setting NAT_FQ_NELTS to 64? Are there potential drawbacks or dangers in changing it to a larger value?

I suppose everyone will agree that when a queue has a maximum length, the choice of that maximum can be important. Is there some particular reason to believe that 64 is enough? In our case we are using 8 NAT threads. Suppose thread 8 is held up briefly because something takes a little longer than usual, and meanwhile threads 1-7 each hand off 10 frames to thread 8; that situation already requires a queue size of at least 70 (see the small arithmetic sketch after the quoted mail below), unless I have misunderstood how the handoff mechanism works.

To me, allowing a longer queue seems like a good thing: it lets us also handle the more difficult cases where threads are not always equally fast, where spikes in traffic affect some threads more than others, and so on. But maybe there are strong reasons for keeping the queue short that I don't know about, which is why I'm asking.

Best regards,
Elias


On Fri, 2020-11-13 at 15:14 +0000, Klement Sekera -X (ksekera - PANTHEON TECH SRO at Cisco) wrote:
> Hi Elias,
>
> I’ve already debugged this and came to the conclusion that it’s the
> infra which is the weak link. I was seeing congestion drops at mild
> load, but not at full load. Issue is that with handoff, there is
> uneven workload. For simplicity’s sake, just consider thread 1
> handing off all the traffic to thread 2. What happens is that for
> thread 1, the job is much easier, it just does some ip4 parsing and
> then hands packet to thread 2, which actually does the heavy lifting
> of hash inserts/lookups/translation etc. 64 element queue can hold 64
> frames, one extreme is 64 1-packet frames, totalling 64 packets,
> other extreme is 64 255-packet frames, totalling ~16k packets. What
> happens is this: thread 1 is mostly idle and just picking a few
> packets from NIC and every one of these small frames creates an entry
> in the handoff queue. Now thread 2 picks one element from the handoff
> queue and deals with it before picking another one. If the queue has
> only 3-packet or 10-packet elements, then thread 2 can never really
> get into what VPP excels in - bulk processing.
>
> Q: Why doesn’t it pick as many packets as possible from the handoff
> queue?
> A: It’s not implemented.
>
> I already wrote a patch for it, which made all congestion drops which
> I saw (in above synthetic test case) disappear. Mentioned patch
> https://gerrit.fd.io/r/c/vpp/+/28980 is sitting in gerrit.
>
> Would you like to give it a try and see if it helps your issue? We
> shouldn’t need big queues under mild loads anyway …
>
> Regards,
> Klement
>
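P.S. Here is the back-of-the-envelope arithmetic behind my 8-thread example above, as a small standalone C snippet. It only uses the numbers already mentioned in this thread (8 workers, 10 frames each, 1-255 packets per frame); it touches no VPP internals, so the names and constants are purely illustrative.

/* Back-of-the-envelope check of the handoff scenario described above.
   All numbers come from this thread; no VPP code is involved. */
#include <stdio.h>

#define NAT_FQ_NELTS_DEFAULT 64	/* current handoff queue depth under discussion */

int
main (void)
{
  int n_threads = 8;		  /* our NAT worker configuration */
  int producers = n_threads - 1;  /* threads 1-7 all handing off to thread 8 */
  int frames_each = 10;		  /* frames produced while thread 8 is briefly stalled */

  int needed = producers * frames_each;	/* 7 * 10 = 70 queue slots */
  printf ("worst-case queue occupancy: %d (limit %d) -> %s\n",
	  needed, NAT_FQ_NELTS_DEFAULT,
	  needed > NAT_FQ_NELTS_DEFAULT ? "congestion drops" : "fits");

  /* Klement's point about per-frame packet counts: a full 64-element queue
     holds anywhere from 64 packets (1-packet frames) up to ~16k packets
     (255-packet frames). */
  printf ("packets held by a full queue: %d .. %d\n",
	  NAT_FQ_NELTS_DEFAULT * 1, NAT_FQ_NELTS_DEFAULT * 255);
  return 0;
}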