Thanks for all the info.

I was able to confirm that the RPKI refresh was causing my CPU issue. When I 
remove it the CPU does settle after a few minutes (around 15-20 mins).

Looking forward to testing more of bird3!

Best regards,
François

> On 4 Apr 2023, at 12:23, Maria Matejka via Bird-users <[email protected]> 
> wrote:
> 
> Hello, Francois!
> 
> Regarding 2.0.12, you just need to wait, loading 1.2M routes from 50 peers 
> running 6x roa_check each is simply slow on one CPU. You may ignore the debug 
> log, there is typically nothing useful. Also thank you for noting that the 
> "reload burst split" message is still there; I forgot to convert it to a 
> L_TRACE log.
> 
> Also you should probably use the roa_check in a switch-case syntax, 
> effectively reducing the number of calls from 6 to 2. It's the most 
> resource-demanding thing inside your filter.
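> 
> A minimal, untested sketch of that rewrite, reusing the table and attribute 
> names from your config below. It assumes the ROA_* constants are accepted as 
> case labels in your BIRD version; each table is then looked up once per 
> route instead of three times:
> 
>     filter flag_rpki {
>        if bgp_path.len = 0 || bgp_path.last = 16276 then accept;
> 
>        # One roa_check call against the first ROA table
>        case roa_check(roa4_1, net, bgp_path.last) {
>          ROA_INVALID: roa_status1 = 1;
>          ROA_UNKNOWN: roa_status1 = 2;
>          ROA_VALID:   roa_status1 = 3;
>        }
> 
>        # One roa_check call against the second ROA table
>        case roa_check(roa4_2, net, bgp_path.last) {
>          ROA_INVALID: roa_status2 = 1;
>          ROA_UNKNOWN: roa_status2 = 2;
>          ROA_VALID:   roa_status2 = 3;
>        }
>        accept;
>     }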
> 
> Also you may want to postpone BGP startup by several seconds to let the RPKI 
> feed its table completely, avoiding otherwise necessary filter recalculation. 
> (The exact delay must be determined locally.)
> 
> 
> The "bad lock order" bug is known in 3.0-alpha0 and has no simple solution 
> and we ended up rewriting lots of other things. There is no newer version to 
> test for now.
> 
> Thanks to our Support Subscribers, we can afford testing hardware good 
> enough to test these scenarios systematically. Thus 3.0-alpha1 will be able 
> to handle 60M routes fast and safely, and we're going to release it soon.
> 
> Please consider subscribing to make it possible for us to test BIRD also for 
> your scenarios; for more information, contact us at [email protected].
> 
> Have a nice day!
> Maria
> 
> On 4/4/23 11:24, Francois Espinet wrote:
>> Hello,
>> I am currently trying bird out in a route collector scenario. We have around 
>> 50 devices all sending around 1.2M routes.
>> I initially started with bird 2.0.12, but the CPU is stuck at 100%, and the 
>> debug logs have a lot of "channel reload burst split (max_feed=-1)" messages.
>> So I wanted to try bird 3.0, but I am getting the following logs (using the 
>> -d flag), and the router crashes just after starting:
>>> bird: Started
>>> bird: Trying to lock in a bad order
>>> Aborted
>> Any idea what could be the issue there?
>> Here is my config:
>>> timeformat base         iso long;
>>> timeformat log          iso long;
>>> timeformat protocol     iso long;
>>> timeformat route        iso long;
>>> 
>>> router id X.X.X.X;
>>> hostname "route-collector";
>>> 
>>> attribute int roa_status1;
>>> attribute int roa_status2;
>>> roa4 table roa4_1;
>>> roa4 table roa4_2;
>>> roa6 table roa6_1;
>>> roa6 table roa6_2;
>>> 
>>> ipv4 table pb4;
>>> ipv6 table pb6;
>>> 
>>> filter flag_rpki {
>>>    if bgp_path.len = 0 || bgp_path.last = 16276 then accept;
>>> 
>>>    if roa_check(roa4_1, net, bgp_path.last) = ROA_INVALID then roa_status1=1;
>>>    if roa_check(roa4_1, net, bgp_path.last) = ROA_UNKNOWN then roa_status1=2;
>>>    if roa_check(roa4_1, net, bgp_path.last) = ROA_VALID then roa_status1=3;
>>> 
>>>    if roa_check(roa4_2, net, bgp_path.last) = ROA_INVALID then roa_status2=1;
>>>    if roa_check(roa4_2, net, bgp_path.last) = ROA_UNKNOWN then roa_status2=2;
>>>    if roa_check(roa4_2, net, bgp_path.last) = ROA_VALID then roa_status2=3;
>>>    accept;
>>> }
>>> 
>>> protocol bgp PB {
>>>   local X.X.X.X as 16276;
>>>   neighbor range 0.0.0.0/0 as 16276;
>>>   dynamic name "pb";
>>>   dynamic name digits 2;
>>>   ipv4 {
>>>     export filter {
>>>       reject;
>>>     };
>>>     table pb4;
>>>     import filter flag_rpki;
>>>     add paths rx;
>>>     import table yes;
>>>     next hop keep on;
>>>     rpki reload on;
>>>   };
>>>   ipv6 {
>>>     export filter {
>>>       reject;
>>>     };
>>>     table pb6;
>>>     import filter flag_rpki;
>>>     add paths rx;
>>>     import table yes;
>>>     next hop keep on;
>>>     rpki reload on;
>>>   };
>>>   strict bind on;
>>> }
>>> 
>>> protocol rpki stack1 {
>>>   roa4 { table roa4_1; };
>>>   roa6 { table roa6_1; };
>>>   remote X.X.X.Z port 323;
>>>   transport tcp;
>>>   refresh 300;
>>>   retry 300;
>>>   expire 600;
>>> }
>>> 
>>> 
>>> protocol rpki stack2 {
>>>   roa4 { table roa4_2; };
>>>   roa6 { table roa6_2; };
>>>   remote X.X.X.Y port 323;
>>>   transport tcp;
>>>   refresh 300;
>>>   retry 300;
>>>   expire 600;
>>> }
>> Best regards,
>> François

