lupyuen commented on issue #17914: URL: https://github.com/apache/nuttx/issues/17914#issuecomment-3850470159
Good News: Over the past week we used __24 Full-Time GitHub Runners__, which is below the ASF limit of __25 Full-Time GitHub Runners__. [(Explained here)](https://lupyuen.org/articles/ci3)

https://infra-reports.apache.org/#ghactions&hours=168&limit=15&group=name

<img width="1521" height="1202" alt="Image" src="https://github.com/user-attachments/assets/25164183-f299-4ec8-abef-f2eafd69ca2c" />

So we'll close this issue for now. As always, everyone can monitor the Live Usage here:

https://lupyuen.github.io/nuttx-metrics/github-fulltime-runners.png

<img src="https://lupyuen.github.io/nuttx-metrics/github-fulltime-runners.png" />

_Has this overuse happened before?_

We see brief spikes in usage of GitHub Runners during NuttX Releases. But Jan-Feb 2026 was the busiest sustained peak...

https://docs.google.com/spreadsheets/d/13QOAzC84eUYcB7xmPT0lo5L-VXnvY5cRvA9rmqC8utM/edit?gid=0#gid=0

<img width="1693" height="1037" alt="Image" src="https://github.com/user-attachments/assets/6a22d6af-b323-47b3-88ab-e5fe76200332" />

[(Generated by history.sh)](https://github.com/lupyuen/nuttx-metrics/blob/main/history.sh)

_Why has the GitHub Load jumped significantly since 9 Dec 2025?_

Maybe something we changed in NuttX CI? This needs more monitoring and analysis (during the quieter Lunar New Year holidays).

Lesson Learnt: __Continuous Operational Monitoring__ of NuttX CI is super important, even after revamping NuttX CI into a Distributed Build + Test system.

_What happened when we were using too many GitHub Runners?_

A few days ago, some of our CI Jobs were stuck forever with this message:

https://github.com/apache/nuttx/actions/runs/21600990965/job/62279838438

> _Job is waiting for a hosted runner to come online. <br> Job is about to start running on the hosted runner_

We were hurting Other Apache Projects too, because GitHub Runners are pooled across All Apache Projects.

_Suppose we have an idea for reducing the CI Load.
How many GitHub Runners will it actually save?_

Check the __"Total Run Time"__ in the GitHub Actions Log. A Typical CI Build will require __28 Hours__ of GitHub Runners...

https://github.com/apache/nuttx/actions/runs/21365186482/usage

<img width="1762" height="1046" alt="Screenshot 2026-01-27 at 8 13 46 AM" src="https://github.com/user-attachments/assets/a8279cd9-5b61-4aa4-9a24-8d6839165447" />

Suppose we propose to optimise the Doc Build. A Doc Build requires __1.5 minutes__ of GitHub Runners...

https://github.com/apache/nuttx/actions/runs/21378476592/usage

<img width="1523" height="948" alt="Screenshot 2026-01-27 at 8 09 59 AM" src="https://github.com/user-attachments/assets/bbce351d-e9aa-4295-ba10-4892f9518377" />

So Doc Builds take up less than 0.1% of the GitHub Runners of a Typical CI Build (1.5 minutes out of 28 hours). This means that even an optimal Doc Build probably won't reduce our GitHub Runner usage by much.

Also remember: Fixing NuttX CI is highly risky. It might break the frequent CI Builds and/or the NuttX Release Process. Someone needs to be on standby 24 x 7 to watch over the CI Fix, in case it goes haywire and needs to be rolled back ASAP.

_How should we revamp NuttX CI?_

I have no idea, though we have [plenty of data to guide us](https://lupyuen.org/articles/ci3). Please confirm later whether btashton / @simbit18 / @lupyuen are keen to take on the job. NuttX CI is super stressful and exhausting; some of us might not wish to continue the job, e.g. due to health reasons. (I have hypertension)
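As a quick sanity check on the Doc Build estimate above, here is a minimal Python sketch of the arithmetic. It assumes both figures are the "Total Run Time" values quoted in the comment (28 hours and 1.5 minutes); the function name `share_of_ci_build` is purely illustrative and not part of any NuttX tooling.

```python
# Figures quoted in the comment (taken from "Total Run Time" on the usage pages)
TYPICAL_CI_BUILD_HOURS = 28.0  # one Typical CI Build
DOC_BUILD_MINUTES = 1.5        # one Doc Build


def share_of_ci_build(job_minutes: float, ci_build_hours: float) -> float:
    """Return a job's run time as a percentage of one CI build's run time."""
    ci_build_minutes = ci_build_hours * 60
    return 100.0 * job_minutes / ci_build_minutes


# Hypothetical helper applied to the Doc Build numbers above
doc_share = share_of_ci_build(DOC_BUILD_MINUTES, TYPICAL_CI_BUILD_HOURS)
print(f"Doc Build share of a Typical CI Build: {doc_share:.3f}%")
```

The same calculation could be repeated for any other proposed optimisation: read the job's Total Run Time, divide by the run time of a Typical CI Build, and see whether the upper bound on savings is worth the risk of changing the CI.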
