Hello everybody!

Now that the holiday period is over, I'm returning to occasional reporting on v3.0 development. This is NOT AN ANNOUNCEMENT, just a progress report.
If you haven't read my previous reports, my last report is here in the archives:
http://trubka.network.cz/pipermail/bird-users/2022-July/016222.html

And now let's look at the progress. If you wanna see the commits, look at the thread-next branch. There is no liability and no guarantee, indeed; please don't run that code on a production machine.
* There are some backported changes to v2 branches and they'll be included in the next BIRD version, notably some changes in route attribute storage. Their performance impact should be negligible and there should be no changes in visible behavior, yet they'll make future v2 features easier to merge smoothly into v3.
* The nexthop resolving, flowspec revalidation and also ROA-induced autoreload procedures now fit better into the asynchronous nature of table change announcements. This will allow easy implementation of future improvements to reload only the affected routes. For now, we simply reload everything, but the path to partial reloads is open.
* There are still pending performance improvements on attribute storage, yet they're to be done only after BIRD based on the thread-next branch runs stably with tables, BGP, RPKI and Pipes in their own threads.
* MRT is still insecure but finally quite high on the list.
* I spent most of the past 3 months merging v2 branches and pre-multithreading commits, trying to resolve all the problems. The goal now is to keep pace with the v2 branch tip and to actively resolve merge conflicts by suggesting and choosing implementation strategies for new features in v2 branches that are compatible with asynchronicity and locking.
* With some help from Santiago, I constructed a lockless event queue, removing one of the main problems in 3.0-alpha0 -- locking on event manipulation. Now it's possible for multiple threads to safely enqueue into the same queue without locking, and also to enqueue the same event to the same queue (which is idempotent as it should be). I'll thoroughly describe this data structure in a future blogpost.
* We also decided to add another locking layer between Table and Protocol (see my blogpost[1]), called Service. In 3.0-alpha0, the table auxiliary routines (next hop update, route pruning etc.) were executed in an IO loop running at the Table level; now they are moved to the Service level, slightly reducing the time the table is locked for maintenance and also fixing some previous problems.
* The Service layer will also host some MRT and BMP routines. If we had thought about this more beforehand, we might have put BFD on this layer as well, yet I think it isn't necessary to move BFD there now.
* For now, I have BIRD in a state where the Pipes and Tables have their own threads and I'm going to spend like a week or so resetting our performance testing machinery, producing some data and hoping for good BIRD stability.
* After that, I'll continue merging 3.0-alpha0 to thread-next, hoping to get to the RPKI and BGP threads in a reasonable time. In theory, this merge should be smooth, supposing that I'll probably revert most of the commits in 3.0-alpha0, having the issues already solved in thread-next in a different way.
* I'm also trying to find some time to reset our issue tracker at gitlab.nic.cz and fill it at least with issues regarding multithreading, better also filling it with our almost infinite backlog. That should be a tool not only for the development organization but also for everybody else who is curious about how BIRD works inside. Let's test it and we'll see the outcomes.
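
To illustrate the idea behind the lockless event queue mentioned in the list above, here is a minimal sketch in C11. It is not the actual BIRD code: all names (struct event, ev_send, evq_run, count_hook) are made up for this example, and it assumes many producer threads but only one consumer thread per queue, with each event being sent to one queue only. Enqueueing is idempotent thanks to an atomic "queued" flag, and the push itself is a compare-and-swap loop on the list head.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; a sketch, not BIRD's real event code. */
struct event {
  void (*hook)(void *data);   /* what to run */
  void *data;                 /* argument passed to the hook */
  struct event *next;         /* written only by the sender that won the flag */
  atomic_bool queued;         /* true while the event sits in the queue */
};

struct event_queue {
  _Atomic(struct event *) head;   /* most recently sent event, or NULL */
};

void ev_init(struct event *e, void (*hook)(void *), void *data)
{
  e->hook = hook;
  e->data = data;
  e->next = NULL;
  atomic_init(&e->queued, false);
}

void evq_init(struct event_queue *q)
{
  atomic_init(&q->head, NULL);
}

/* Safe to call from any thread. Returns false if the event was already
 * queued, so sending the same event twice has no additional effect. */
bool ev_send(struct event_queue *q, struct event *e)
{
  assert(e->hook != NULL);

  if (atomic_exchange(&e->queued, true))
    return false;

  /* Lock-free push: retry until the head is swapped to our event. */
  struct event *old = atomic_load(&q->head);
  do
    e->next = old;
  while (!atomic_compare_exchange_weak(&q->head, &old, e));

  return true;
}

/* To be called from the single consumer thread. Runs all events queued
 * so far in the order they were sent; returns how many hooks were run. */
int evq_run(struct event_queue *q)
{
  /* Detach the whole list in one atomic step. */
  struct event *list = atomic_exchange(&q->head, NULL);

  /* The list is newest-first; reverse it to get sending order. */
  struct event *fifo = NULL;
  while (list) {
    struct event *n = list->next;
    list->next = fifo;
    fifo = list;
    list = n;
  }

  int count = 0;
  while (fifo) {
    struct event *e = fifo;
    fifo = e->next;                    /* read before the event may be re-sent */
    atomic_store(&e->queued, false);   /* from now on it may be sent again */
    e->hook(e->data);
    count++;
  }
  return count;
}

/* Example hook: counts how many times it has been run. */
void count_hook(void *data)
{
  ++*(int *) data;
}
```

In this sketch, the flag is cleared before the hook runs, so a hook (or any other thread) may send the same event again while it is being processed; it then simply runs once more in the next evq_run() pass. Sending one event to several queues at once is deliberately left out of scope here.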
So here you are with the report. I hope it's complete. Feel free to ask for clarification. I'd also like to produce some more blogposts, hopefully during the next week.
Maria

[1] https://en.blog.nic.cz/2022/02/09/bird-journey-to-threads-chapter-3-parallel-execution-and-message-passing/
