"Nicholas Piggin" <npig...@gmail.com> writes:

> On Fri Sep 29, 2023 at 1:26 AM AEST, Aaron Conole wrote:
>> Nicholas Piggin <npig...@gmail.com> writes:
>>
>> > Dynamically allocating the sw_flow_key reduces stack usage of
>> > ovs_vport_receive from 544 bytes to 64 bytes at the cost of
>> > another GFP_ATOMIC allocation in the receive path.
>> >
>> > XXX: is this a problem with memory reserves if ovs is in a
>> > memory reclaim path, or since we have a skb allocated, is it
>> > okay to use some GFP_ATOMIC reserves?
>> >
>> > Signed-off-by: Nicholas Piggin <npig...@gmail.com>
>> > ---
>>
>> This represents a fairly large performance hit.  Just my own quick
>> testing on a system using two netns, iperf3, and simple forwarding rules
>> shows between 2.5% and 4% performance reduction on x86-64.  Note that it
>> is a simple case, and doesn't involve a more involved scenario like
>> multiple bridges, tunnels, and internal ports.  I suspect such cases
>> will see an even bigger hit.
>>
>> I don't know the impact of the other changes, but just an FYI that the
>> performance impact of this change is extremely noticeable on the x86
>> platform.
>>
>> ----
>> ip netns add left
>> ip netns add right
>>
>> ip link add eth0 type veth peer name l0
>> ip link set eth0 netns left
>> ip netns exec left ip addr add 172.31.110.1/24 dev eth0
>> ip netns exec left ip link set eth0 up
>> ip link set l0 up
>>
>> ip link add eth0 type veth peer name r0
>> ip link set eth0 netns right
>> ip netns exec right ip addr add 172.31.110.2/24 dev eth0
>> ip netns exec right ip link set eth0 up
>> ip link set r0 up
>>
>> python3 ovs-dpctl.py add-dp br0
>> python3 ovs-dpctl.py add-if br0 l0
>> python3 ovs-dpctl.py add-if br0 r0
>>
>> python3 ovs-dpctl.py add-flow \
>>   br0 'in_port(1),eth(),eth_type(0x806),arp()' 2
>>   
>> python3 ovs-dpctl.py add-flow \
>>   br0 'in_port(2),eth(),eth_type(0x806),arp()' 1
>>
>> python3 ovs-dpctl.py add-flow \
>>   br0 'in_port(1),eth(),eth_type(0x800),ipv4()' 2
>>
>> python3 ovs-dpctl.py add-flow \
>>   br0 'in_port(2),eth(),eth_type(0x800),ipv4()' 1
>>
>> ----
>>
>> Example results without this patch:
>> [root@wsfd-netdev60 ~]# ip netns exec right ./git/iperf/src/iperf3 -c 172.31.110.1
>> ...
>> [  5]   0.00-10.00  sec  46.7 GBytes  40.2 Gbits/sec    0             sender
>> [  5]   0.00-10.00  sec  46.7 GBytes  40.2 Gbits/sec                  receiver
>>
>>
>> Example results with this patch:
>> [root@wsfd-netdev60 ~]# ip netns exec right ./git/iperf/src/iperf3 -c 172.31.110.1
>> ...
>> [  5]   0.00-10.00  sec  44.9 GBytes  38.6 Gbits/sec    0             sender
>> [  5]   0.00-10.00  sec  44.9 GBytes  38.6 Gbits/sec                  receiver
>>
>> I did testing with udp at various bandwidths and this tcp testing.
>
> Thanks for the test case. It works perfectly in the end, but it took me
> days to get there because of a random conspiracy of problems I hit :(
> Sorry for the slow reply, but I have now been able to test another idea for
> this. Performance seems to be within the noise with the full series, but
> my system is only getting ~half the rate of yours so you might see more
> movement.
>
> Instead of slab it reuses the per-cpu actions key allocator here.
>
> https://github.com/torvalds/linux/commit/878f01f04ca858e445ff4b4c64351a25bb8399e3
>
> Pushed the series to kvm branch of https://github.com/npiggin/linux
>
> I can repost the series as a second RFC but will wait for thoughts on
> this approach.

Thanks - I'll take a look at it.

> Thanks,
> Nick
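
For reference, the relative drop implied by the two TCP iperf3 runs quoted
above can be checked with a few lines of Python. This is only a sketch of
the arithmetic: the helper name relative_drop is made up for illustration,
and the two rates are the sender-side Gbits/sec figures copied from the
quoted output, not new measurements.

```python
def relative_drop(baseline, patched):
    """Return the throughput reduction as a percentage of the baseline."""
    return (baseline - patched) / baseline * 100.0


# Sender-side rates in Gbits/sec, copied from the iperf3 output quoted
# in this thread (without the patch, then with the patch).
without_patch = 40.2
with_patch = 38.6

drop = relative_drop(without_patch, with_patch)
print(f"{drop:.2f}% reduction")
```

For this single pair of runs the result falls at the upper end of the
2.5% to 4% range reported for the TCP testing.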

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
