On 6 Feb 2024, at 16:01, Ilya Maximets wrote:

> On 2/6/24 15:07, Eelco Chaudron wrote:
>>
>>
>> On 6 Feb 2024, at 14:46, Ilya Maximets wrote:
>>
>>> On 2/6/24 14:30, Eelco Chaudron wrote:
>>>> The ability to override the system default for the number of handler
>>>> threads has been broken since the introduction of the per-CPU upcall
>>>> handlers.
>>>
>>> It wasn't broken, it was intentionally disabled.  I don't think we should
>>> introduce the ability to configure the number of handlers for per-CPU
>>> dispatch mode.  It only allows users to shoot themselves in the foot if
>>> they do not understand what they are doing.
>>
>> I think the design was not thought through enough and resulted in a lot of
>> corner-case issues. From the start, we should have kept allowing this
>> option to be tuned, as it was before.
>
> The tuning is much more complicated and has very different consequences
> with per-cpu dispatch compared to per-vport.  So, the results of adjusting
> this option can be very different.

>>> Many integration scripts in the wild are still blindly setting these
>>> options without even knowing that the underlying implementation changed
>>> completely.  And those setups may suffer if we suddenly decide to respect
>>> the configuration that is ignored on modern kernels for all currently
>>> supported versions of OVS.
>>
>> I feel like this is not OVS's problem. If people do not know what they are
>> doing, they should not touch it… We have a nice Dutch saying for this:
>> "Wie zijn billen brandt, moet op de blaren zitten" (roughly, "he who burns
>> his backside must sit on the blisters") :)
>
> It is OVS's problem for the same reason as above.  We changed the logic
> under the hood so much that users who expected a certain behavior in the
> past may see a very different behavior now.  We can't just change the
> meaning of the knob that easily.  See below.

Ack, so I would suggest adding a new knob.

>>
>> However, I do feel like we should have an option to configure this, as it
>> makes no sense on a 256-core system to create 256 handler threads.
>
> It actually makes perfect sense.  Since the traffic can appear on 256 cores,
> we should be able to handle it on each one of them in order to have balanced
> load.
>
> With per-vport dispatch, on a system with 10 cores and 5 handler threads,
> all 5 handler threads will share the load, because each one of them listens
> on all the ports, hence each one of them can process packets whenever it
> has time to do so.

Yes, but this had other problems :)

> With per-CPU dispatch, on the same system these 5 threads are mapped to
> specific cores, 2 cores per thread.  If all the traffic comes in on 4 cores,
> we may have 2 threads overloaded while the other 3 threads do nothing.  In
> the same scenario with per-vport dispatch we would have all 5 threads
> sharing the load.
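
To make that mapping concrete, it behaves roughly like the toy sketch below
(this assumes a simple round-robin cpu-to-handler assignment for
illustration; it is not the actual dpif-netlink code):

/* Toy illustration of per-CPU upcall dispatch: each CPU is pinned to
 * exactly one handler thread, round-robin.  Illustration only. */
#include <stdio.h>

int
main(void)
{
    const int n_cpus = 10;      /* Cores that can receive traffic. */
    const int n_handlers = 5;   /* Handler threads created.        */

    for (int cpu = 0; cpu < n_cpus; cpu++) {
        /* Upcalls generated on 'cpu' always land on this one handler,
         * no matter how busy it already is. */
        printf("cpu %d -> handler %d\n", cpu, cpu % n_handlers);
    }
    return 0;
}

With 10 cores and 5 threads each handler serves exactly 2 cores, so traffic
arriving only on the cores of 2 handlers leaves the other 3 idle.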

Yes, but with 256 handler threads we might have 250 doing nothing. Looking at
some data from a system with decent OpenFlow tables (1M+ entries, but only a
couple of tables), a single handler thread can handle about 232k upcalls per
second for 64-byte packets before we run out of socket buffer, around 212k
for 512-byte packets, and around 189k for 1500-byte packets.

If we have more than 200k upcalls per second, we have other problems,
especially as the default datapath flow limit is 200k.

So I guess the new knob becomes a tunable that trades resource consumption
against upcall handling capacity, which I think is fine and might be
desirable in certain configurations. We have other knobs in the system that
require a decent amount of knowledge.

> So, in order for the user to make a decision on the number of threads, they
> have to know on which CPUs they are most likely to get interrupts and how
> exactly OVS will map handlers to CPU numbers.  That is too much for a user
> to know, IMO.  Some of that is actually impossible to know.  And when the
> traffic pattern changes, the number of threads may need to be changed to
> game the mapping to avoid overload.
>
> Also, multiply that complexity by the CPU affinity of the ovs-vswitchd that
> changes the mapping.

Well, using an affinity mask will mess up the above anyway, as we will get
fewer threads sharing the available CPU resources, etc. So making n_threads
configurable has the same problem as changing the affinity mask, but it
might bring other benefits (see below).

> AFAIU, the only way to achieve an even load without dynamic reconfiguration
> is to have a thread per CPU core.

Yes, but only if we have no affinity mask, and at the cost of additional
resources (which we might not want).
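
As a side note, the default handler count under per-CPU dispatch follows the
number of usable cores, which in turn follows the process affinity mask.
Something along these lines shows the effect (a sketch that calls
sched_getaffinity() directly; OVS uses its own helpers, so this only
approximates the behaviour):

/* Print the number of cores the current process may run on.  Under
 * per-CPU dispatch the handler pool is sized from a count like this,
 * so shrinking the affinity mask shrinks the default thread count. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int
main(void)
{
    cpu_set_t mask;

    if (sched_getaffinity(0, sizeof mask, &mask)) {
        perror("sched_getaffinity");
        return 1;
    }

    /* With no mask this is every online core; started under e.g.
     * 'taskset -c 0-3' it drops to 4, and so would the default number
     * of handler threads. */
    printf("usable cores: %d\n", CPU_COUNT(&mask));
    return 0;
}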

>> Maybe we should add a new option, n-forced-handler-threads. This will
>> avoid the above issue with existing scripts.
>
> Users can limit the number of threads by setting CPU affinity for
> ovs-vswitchd.  If these users are knowledgeable enough to predict traffic
> patterns and CPU mappings, they can set appropriate CPU affinity masks as
> well.

Well, you cannot set the exact number of handler threads, as the count gets
rounded up to a prime number (which should not be a big problem). But the
main problem I see is that you then limit all of OVS to that set of CPUs, so
a burst of upcalls could mess with your revalidator handling (or other
threads running on that set).
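
For reference, the "round-up to prime" part is why you cannot hit an exact
thread count anyway; a toy version of that rounding looks like this (a
sketch of the idea only, not the actual dpif-netlink logic):

/* Round a requested handler count up to the next prime, as mentioned
 * above.  Toy code for illustration only. */
#include <stdbool.h>
#include <stdio.h>

static bool
is_prime(unsigned int n)
{
    if (n < 2) {
        return false;
    }
    for (unsigned int i = 2; i * i <= n; i++) {
        if (n % i == 0) {
            return false;
        }
    }
    return true;
}

static unsigned int
next_prime(unsigned int n)
{
    while (!is_prime(n)) {
        n++;
    }
    return n;
}

int
main(void)
{
    /* Under this toy rule a request for 4, 6 or 8 handlers would end up
     * as 5, 7 and 11 threads respectively. */
    for (unsigned int want = 4; want <= 8; want += 2) {
        printf("requested %u -> created %u handlers\n",
               want, next_prime(want));
    }
    return 0;
}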

> For others, changing the meaning of a knob underneath them would be kind
> of irresponsible from our side.

Ack, I will introduce a new knob (plus some changes to dpif-netlink, as the
patch has a bug).
