Paolo Abeni <pab...@redhat.com> writes:

> On 4/16/25 6:45 PM, Sebastian Andrzej Siewior wrote:
>> On 2025-04-15 12:26:13 [-0400], Aaron Conole wrote:
>>> I'm going to reply here, but I need to bisect a bit more (though I
>>> suspect the results below are due to 11/18). When I tested with this
>>> patch there were lots of "unexplained" latency spikes during processing
>>> (note, I'm not doing PREEMPT_RT in my testing, but I guess it would
>>> smooth the spikes out at the cost of max performance).
>>>
>>> With the series:
>>> [SUM]   0.00-300.00 sec  3.28 TBytes  96.1 Gbits/sec  9417   sender
>>> [SUM]   0.00-300.00 sec  3.28 TBytes  96.1 Gbits/sec         receiver
>>>
>>> Without the series:
>>> [SUM]   0.00-300.00 sec  3.26 TBytes  95.5 Gbits/sec   149   sender
>>> [SUM]   0.00-300.00 sec  3.26 TBytes  95.5 Gbits/sec         receiver
>>>
>>> And while the 'final' numbers might look acceptable, one thing I'll note
>>> is that I saw multiple stalls, such as:
>>>
>>> [ 5]  57.00-58.00 sec   128 KBytes   903 Kbits/sec   0   4.02 MBytes
>>>
>>> But without the patch, I didn't see such stalls. My testing:
>>>
>>> 1. Install openvswitch userspace and ipcalc
>>> 2. Start userspace.
>>> 3. Set up two netns and connect them (I have a more complicated script
>>>    to set up the flows, and I can send it to you)
>>> 4. Use iperf3 to test (-P5 -t 300)
>>>
>>> As I wrote, I suspect the locking in 11 is leading to these stalls, as
>>> the data I'm sending shouldn't be hitting the frag path.
>>>
>>> Do these results seem expected to you?
>>
>> You have slightly better throughput but way more retries. I wouldn't
>> expect that. And then the stall.
>>
>> Patches 10 & 12 move per-CPU variables around and make them "static"
>> rather than allocating them at module init time. I would not expect this
>> to have a negative impact.
>> Patch #11 assigns the current thread to a variable and clears it again.
>> The remaining lockdep code disappears. The whole thing runs with BH
>> disabled, so no preemption.
>>
>> I can't explain what you observe here. Unless it is a random glitch,
>> please send the script and I will try to take a look.
>
> I also think this series should not have any visible performance impact
> on non-RT OVS tests. @Aaron: could you please double check that the
> results (both the good one on the unpatched kernel and the bad one with
> the series applied) are reproducible and not due to some glitches.
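For reference, the owner bookkeeping described above for patch #11 boils
down to a pattern like the following. This is only a rough sketch of the
idea, not the actual patch; the struct, field, and function names
(ovs_pcpu_storage, owner, ovs_pcpu_set_owner) are made up for illustration:

    #include <linux/percpu.h>
    #include <linux/sched.h>

    /* Hypothetical per-CPU scratch area; the real patches' layout may differ. */
    struct ovs_pcpu_storage {
            /* ... flow-processing scratch data ... */
            struct task_struct *owner;
    };

    /* Statically defined per-CPU storage, i.e. no allocation at module init. */
    static DEFINE_PER_CPU(struct ovs_pcpu_storage, ovs_pcpu_storage);

    /*
     * Record or clear the owning task. Callers already run with BH
     * disabled, so on a non-RT kernel there is no preemption between
     * the set and clear.
     */
    static void ovs_pcpu_set_owner(struct task_struct *task)
    {
            this_cpu_write(ovs_pcpu_storage.owner, task);
    }

    /* usage: ovs_pcpu_set_owner(current); ...do work...; ovs_pcpu_set_owner(NULL); */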
I agree; it doesn't seem like the series should have a visible impact here.
I guess a v3 is coming, so I will retry with that. I planned to ack 10/18
and 12/18 anyway; even without the lock restructure, it seems 'nicer' to
have the pcpu variables in a single location.

BTW, I am using a slightly modified version of:

https://gist.github.com/apconole/ed78c9a2e76add9942dc3d6cbcfff4ca

It sets things up similarly to an SDN deployment (although not perfectly,
since I was testing something very specific at the time), and I was just
doing netns->netns testing (so it would go through ct() calls but not
ct(nat) calls).

> @Sebastian: I think the 'owner' assignment could be optimized out at
> compile time for non-RT builds - it will likely not matter for
> performance, but I think it will be 'nicer'. Could you please update the
> patches to do that?
>
> Thanks!
>
> Paolo
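For illustration, one way to let the compiler drop the owner store on
non-RT builds, as suggested above, is to gate it behind
IS_ENABLED(CONFIG_PREEMPT_RT). Again only a sketch, reusing the
hypothetical names from the earlier fragment rather than the actual patch:

    static inline void ovs_pcpu_set_owner(struct task_struct *task)
    {
            /*
             * IS_ENABLED() is a compile-time constant, so on !PREEMPT_RT
             * configurations the branch is eliminated and the helper
             * compiles to nothing.
             */
            if (IS_ENABLED(CONFIG_PREEMPT_RT))
                    this_cpu_write(ovs_pcpu_storage.owner, task);
    }

Callers would still pass current on entry and NULL on exit; the difference
is just that non-RT builds pay no cost for the bookkeeping.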