Hello!
On 4/19/21 7:15 PM, Sema Boyko wrote:
> Hi,
> [...]
> This happens only if I specify a lot of neighbors in the config. For
> example, the following config on the first server ("10.1.0.1") works
> fine:
>
> protocol bfd {
>   interface "eth0" {
>     min rx interval 200 ms;
>     min tx interval 1000 ms;
>     idle tx interval 1 s;
>     multiplier 5;
>   };
>   neighbor 10.1.0.2;
>   neighbor 10.1.0.3;
> }
>
> Looks like all BFD sessions are handled on a single thread. Could
> someone please confirm that BIRD isn't designed to handle a huge
> number of BFD sessions simultaneously? Or are there perhaps some
> options I can enable to handle this case in my environment?
Yes, all BFD sessions are handled on a single thread. Luckily, this
thread is separate, so it isn't blocked by long tasks in the rest of
BIRD. This will most probably change in some future version, allowing
more BFD threads, but I suppose that other parts of BIRD will first get
their own dedicated threads to make route propagation faster.
OTOH, we have also seen some hardware and kernel issues with massive
use of BFD, effectively preventing the BFD packets from reaching BIRD
at all. Did you check that with tcpdump?
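If not, a quick check (just a sketch; "eth0" is taken from your config,
and the filter assumes single-hop BFD, whose control packets go to UDP
port 3784) would be something like:

  tcpdump -ni eth0 udp port 3784

If the packets show up there at the expected rate, the bottleneck is
more likely inside BIRD itself.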
Anyway, there may also be a problem with the hash size. Could you
please rebuild BIRD with an increased third argument of HASH_INIT() in
bfd_start() in proto/bfd/bfd.c, lines 1029 and 1030? Let's say like this:
diff --git a/proto/bfd/bfd.c b/proto/bfd/bfd.c
index dac184c5..009b58a7 100644
--- a/proto/bfd/bfd.c
+++ b/proto/bfd/bfd.c
@@ -1026,8 +1026,8 @@ bfd_start(struct proto *P)
   pthread_spin_init(&p->lock, PTHREAD_PROCESS_PRIVATE);
 
   p->session_slab = sl_new(P->pool, sizeof(struct bfd_session));
-  HASH_INIT(p->session_hash_id, P->pool, 8);
-  HASH_INIT(p->session_hash_ip, P->pool, 8);
+  HASH_INIT(p->session_hash_id, P->pool, 16);
+  HASH_INIT(p->session_hash_ip, P->pool, 16);
 
   init_list(&p->iface_list);
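To try it out, the usual rebuild should do (a rough sketch;
"bfd-hash.diff" is just a placeholder name for the patch above, and the
configure options should match whatever you use for your current build):

  patch -p1 < bfd-hash.diff
  autoreconf && ./configure && make
  sudo make install

IIRC, the third argument of HASH_INIT() is the hash order, so bumping
it from 8 to 16 means starting with 2^16 buckets instead of 2^8.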
This is a wild guess, but it may help you if the BFD thread is spending
all its time looking up sessions.
Maria