Hello!
On 3/2/21 4:34 PM, Douglas Fischer wrote:
This is very good news!
I know you said "This is a ballpark guess", but I confess that I was a
little scared by the proportion of extra CPU usage (30 -> 48 minutes, +60%).
This depends a lot on what kind of load we are talking about. Generally,
if you are a big route server, then 98% of CPU time is probably eaten by
complex filters. I would estimate that this may end up anywhere between
-10% and +10% due to other structural changes. The parallelization
overhead would be minimal.
However, if you are a big route reflector, then you're constantly just
recomputing the best route while accessing the same table. In that case
we may get to the +60% estimate. Long story short: the more work you do
with one route, the less relative overhead you get.
Remember that BIRD is currently extremely well optimized for
single-threaded execution and some parts still heavily depend on being
executed that way. We chose to first allow parallel execution of those
parts that parallelize well, at the cost of adding some overhead to
other parts.
The most critical part of this is route export (from tables to
protocols), which is currently done synchronously right after route
import. We decided to decouple it in the multithreaded code, which
involves having a route export queue. Hence more memory stores and
loads, more cache misses, etc.
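To give a rough idea of what such a decoupling looks like -- this is not
the actual BIRD code, just a generic sketch with made-up names
(export_item, export_queue, ...) -- the importing thread pushes each
update into a queue and the exporting thread picks it up later:

  /* Hypothetical sketch of a decoupled route export queue; made-up names,
   * not the real BIRD data structures. */
  #include <pthread.h>
  #include <stdlib.h>

  struct export_item {
    struct export_item *next;
    void *route;                  /* the route update to be exported */
  };

  struct export_queue {
    pthread_mutex_t lock;
    pthread_cond_t nonempty;
    struct export_item *head, *tail;
  };

  static void export_queue_init(struct export_queue *q)
  {
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->nonempty, NULL);
    q->head = q->tail = NULL;
  }

  /* Called by the importing thread after the route is stored in the table. */
  static void export_enqueue(struct export_queue *q, void *route)
  {
    struct export_item *it = malloc(sizeof *it);
    it->route = route;
    it->next = NULL;

    pthread_mutex_lock(&q->lock);
    if (q->tail)
      q->tail->next = it;
    else
      q->head = it;
    q->tail = it;
    pthread_cond_signal(&q->nonempty);
    pthread_mutex_unlock(&q->lock);
  }

  /* Called by the exporting thread; blocks until an update is available. */
  static void *export_dequeue(struct export_queue *q)
  {
    pthread_mutex_lock(&q->lock);
    while (!q->head)
      pthread_cond_wait(&q->nonempty, &q->lock);

    struct export_item *it = q->head;
    q->head = it->next;
    if (!q->head)
      q->tail = NULL;
    pthread_mutex_unlock(&q->lock);

    void *route = it->route;
    free(it);
    return route;
  }

Every update now goes through that extra store, load and wakeup, which
is exactly where the additional cache misses come from.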
Well … maybe the +60% is too much, reconsidering that guess. Let's hope
it's overestimated. I'd be more concerned about the memory usage. There
are some estimates of worst-case peak memory usage that can be even
+100% (for a short time). If we run into these problems in the real
world, we'd definitely have to implement algorithms to limit these
peaks, as swapping to disk is not desirable here at all. Anyway, this is
not today's problem; we first need to get to code that at least builds
and runs without spitting out one core file after another.
I also know that you said that the code is still "currently not
releasable", but I'm curious to know a little more about how this
multi-threading was handled.
Basically, one thread per receiving socket and one thread per exporting
channel, with some exceptions. One lock per protocol instance, one lock
per table. You can lock only one table and one protocol instance at a
time; the protocol goes first.
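A very rough sketch of that lock ordering -- made-up types, not the real
BIRD locking code:

  #include <pthread.h>

  struct proto  { pthread_mutex_t lock; /* ... protocol state ... */ };
  struct rtable { pthread_mutex_t lock; /* ... routing table ... */ };

  static void process_route(struct proto *p, struct rtable *t)
  {
    pthread_mutex_lock(&p->lock);    /* the protocol instance goes first */
    pthread_mutex_lock(&t->lock);    /* then (at most) one table */

    /* ... import or export the route here ... */

    pthread_mutex_unlock(&t->lock);  /* release in reverse order */
    pthread_mutex_unlock(&p->lock);
  }

Keeping the order fixed (and holding at most one of each) is what rules
out lock-ordering deadlocks between threads.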
We'll publish more documentation; it's still WIP. For now, I'm just
answering a question to say "yes, we're going multithreaded and we're
actively working on it".
Just to illustrate:
Single-core CPU usage in BGP is known to be a problem for many engines
and vendors.
One of the vendors developed a "creative" way to distribute this load
across multiple cores.
As I understood it, they made a kind of CPU affinity per BGP peer:
each peer has its own BGP process, and that process is "semi-tied" to a
core.
And they created a mechanism to redistribute these affinities from time
to time, based on the number of BGP messages per second exchanged on
each peer.
If this turns out to be a problem, we'll consider it. For now, it just
seems that the most critical part is the route itself being propagated
through BIRD -- it should stay in one thread as long as possible, and
the threads should keep their CPUs (on a well-behaved system) unless
moved for a good reason.
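Just for illustration, the per-peer affinity mechanism you describe
boils down to something like this on Linux -- a sketch only, and again,
this is not what BIRD does:

  #define _GNU_SOURCE
  #include <pthread.h>
  #include <sched.h>

  /* Hypothetical helper: tie the thread handling one BGP peer to one core.
   * A rebalancer could call this periodically based on per-peer message rates. */
  static int pin_peer_thread(pthread_t peer_thread, int core)
  {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core, &set);
    return pthread_setaffinity_np(peer_thread, sizeof(set), &set);
  }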
Maria
On Tue, Mar 2, 2021 at 10:13, Maria Matejka <[email protected]> wrote:
Hi!
On 3/1/21 1:26 PM, Marcelo Balbinot wrote:
>
> Hi, I already asked this question at some point,
> but I am curious about how it has evolved.
> About multi-thread support (multi-core CPU use):
> is this still a possibility?
Yes, it is. Be prepared that this will also raise memory usage (current
estimates are about >+10% memory) and overall CPU usage (compared to
single-threaded execution) due to the needed synchronization and buffers.
This means that if you now consume 20G of memory and 30 minutes of
single-core time to converge the main table on a rather big node, you're
going to consume, let's say, >22G of memory and 3 minutes of 16-core CPU
(summing to 48 minutes of CPU time). This is a ballpark guess, do not
take it too seriously. It may be better, it may be worse.
Anyway, there is some code (currently not releasable) that will get to a
preview release soon. We'll highly appreciate testing from any users
around. Stay tuned!
Maria
--
Douglas Fernando Fischer
Control and Automation Engineer