Hello everyone,

In the last 24 hours, the NuttX project has exceeded its limit of 25 daily runners for GitHub CI, as enforced by the Apache infrastructure team. Usage reached around 70 runners.
This has happened before, and we were warned by the Apache infrastructure team, which led to several improvements to our GitHub CI. Many of these improvements came from Lup, but we need more attention from other contributors to resolve this issue. You can see the discussion here:

https://github.com/apache/nuttx/issues/17914

I'm opening this mailing list thread so that we can discuss some potential solutions here.

I think we first need to clarify what exactly our testing goals are with the CI, beyond the minimum requirement of checking linting/style compliance. What do we want to catch in our CI runs on new PRs? Once this is narrowed down, we can start altering our CI system to run the bare minimum checks needed to achieve the testing we want.

Some suggestions from the issue were:

- Allow maintainers to manually select which workflows to run
- Prevent CI from running until PRs receive some approvals
- Check the Apache infra API to stop CI runs when the daily limit has been exceeded
- Use the GitHub labels on PRs to choose which parts of the CI to run
- Have one large CI configuration (say, on the simulator) which can be run for PRs that affect general code and not board-specific logic (e.g. modifications to the scheduler)
- Run a small CI run for new PRs and then run a nightly full build to check for any failures that were not caught

From what I can tell, much of our CI usage is spent on compile-testing every configuration for every board under the architecture that was modified (e.g. all ARM boards for ARM changes). I think we can start reducing CI usage by picking a representative board + configuration combo for each chip. This means a change to the RP2040 chip logic would build only one configuration in CI, and a change to the ARM Cortex-M0 logic would build one configuration per Cortex-M0 chip. I think this would drastically reduce our CI usage, although it will take a good amount of work to implement.

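As a rough illustration of the representative-board idea, the selection step could look something like the Python sketch below. This is not an existing NuttX CI script: the `select_builds` name, the contents of both tables, and the `sim:nsh` fallback for general code are all assumptions made up for the example, and the path layout (`arch/<arch>/src/<chip>/`, `boards/<arch>/<chip>/<board>/`) is assumed to match the current tree.

```python
from pathlib import PurePosixPath

# Hypothetical table: one representative "board:config" per chip.
# The entries are illustrative placeholders, not an agreed-upon list.
REPRESENTATIVE = {
    "rp2040": "raspberrypi-pico:nsh",
    "stm32f0l0g0": "nucleo-g071rb:nsh",
    "stm32": "stm32f4discovery:nsh",
}

# Hypothetical table: shared core directories and the chips that use them.
CORE_FAMILY = {
    "armv6-m": ["rp2040", "stm32f0l0g0"],
    "armv7-m": ["stm32"],
}

# Assumed fallback build for code that is not chip- or board-specific.
GENERAL_BUILD = "sim:nsh"


def select_builds(changed_files):
    """Return the sorted list of builds to run for a set of changed paths."""
    builds = set()
    for path in changed_files:
        parts = PurePosixPath(path).parts
        if not parts or parts[0] == "Documentation":
            continue  # docs-only changes need no compile test in this sketch
        if parts[0] == "arch" and len(parts) >= 4 and parts[2] == "src":
            name = parts[3]
            if name in REPRESENTATIVE:
                # Chip-specific change: one build for that chip.
                builds.add(REPRESENTATIVE[name])
            elif name in CORE_FAMILY:
                # Shared core change: one build per chip in the family.
                builds.update(REPRESENTATIVE[c] for c in CORE_FAMILY[name])
            else:
                builds.add(GENERAL_BUILD)
        elif parts[0] == "boards" and len(parts) >= 3 and parts[2] in REPRESENTATIVE:
            # Board change: build the representative for that board's chip.
            builds.add(REPRESENTATIVE[parts[2]])
        else:
            # General code such as the scheduler: simulator build only.
            builds.add(GENERAL_BUILD)
    return sorted(builds)


if __name__ == "__main__":
    print(select_builds(["arch/arm/src/rp2040/rp2040_uart.c"]))
    print(select_builds(["arch/arm/src/armv6-m/arm_doirq.c"]))
    print(select_builds(["sched/sched/sched_lock.c"]))
```

With these placeholder tables, an RP2040-only change selects a single build, a change under the shared `armv6-m` directory selects one build per listed Cortex-M0 chip, and a scheduler change selects only the simulator build. The real work would be in agreeing on and maintaining the tables.
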
Please share your thoughts about what CI _should_ be testing, and any suggestions you have on where NuttX can cut down CI usage. We need more people than just Lup working on this now, since exceeding our limits:

a) frustrates Apache infra and may lead to us losing workflow privileges
b) forces us to stop workflow runs and merges for incoming PRs until we are below the limit again

This isn't sustainable, so we need to come up with some solutions and implement them soon!

--
Matteo Golin
