I just created an Issue listing the actions: Actions list to improve NuttX
quality and reliability: https://github.com/apache/nuttx/issues/16278
I don't know if this is the right place, but at least I broke the actions
down, so it should be easier to update each one individually.

BR,

Alan

On Sun, Apr 27, 2025 at 4:09 PM Nathan Hartman <hartman.nat...@gmail.com>
wrote:

> I like all of these ideas and would like to add:
> * Static analysis can find simple mistakes that might have been
> introduced. Things like a function that forgets to return a value in
> some code path, or use of an uninitialized variable, can be caught by
> static analysis.
>
> By the way, did some recent change increase stack usage? If stacks are
> overflowing, you will get all kinds of weird behaviors. Maybe git bisect
> since a month or two ago and run some tests (like running ostest a few
> times for each commit being tested) and see what comes up?
>
> Nathan
>
> On Sun, Apr 27, 2025 at 8:33 AM Alan C. Assis <acas...@gmail.com> wrote:
> >
> > Dear NuttXers,
> >
> > In the last few weeks we have been seeing some degradation of NuttX
> > reliability, as some users have reported.
> >
> > We saw it happen yesterday during our live video: the fb command
> > behaved in some very strange ways:
> > https://www.youtube.com/watch?v=pbq3suU3g5g&t=1740s
> >
> > First it printed all the rectangles with pauses between them; then in
> > the next test it didn't work at all, and only after some time did the
> > board start.
> >
> > If you go back in the video you will notice that the "uname" command
> > also took a long time to show its results. That was not expected;
> > NuttX is really fast.
> >
> > We have already proposed creating automated tests to help improve
> > NuttX, but that alone is not enough. Some features cannot be tested
> > easily by automated tests. For example, the audio tone was broken by
> > a commit around 2020 or earlier, and we only noticed it last year
> > when someone tried to use it.
> >
> > So these are some suggestions that could help our project:
> >
> > 1) Automated Test and CI Integration (will only cover some corner cases)
> > This will help detect, for example, whether a board is not starting
> > and whether some tests (ostest, etc.) are passing.
> >
> > 2) Test Coverage Metrics
> > Integrate code coverage tools like gcov/lcov for unit tests,
> > dhrystone, coremark, etc.
> > Display and track code coverage over time to identify untested parts
> > of the kernel, drivers, and libraries.
> >
> > 3) Expand and Improve Documentation
> > Improve Documentation/ to let end users test boards easily.
> > All boards should have basic instructions explaining how to install
> > NuttX on them; currently almost no board has these basic
> > instructions, e.g.:
> > https://nuttx.apache.org/docs/latest/platforms/arm/stm32f4/boards/stm32f4discovery/index.html
> > We should enhance board-specific installation guides:
> > How to connect the board (serial, JTAG, SWD).
> > How to flash NuttX (dfu-util, OpenOCD, vendor tools, etc.).
> > How to configure a simple project (make menuconfig, selecting board
> > options).
> > Add "Getting Started" tutorials for total beginners.
> > Add troubleshooting sections for common problems.
> >
> > 4) Standardize Board Port Quality
> > Create a checklist for each board port to ensure minimum quality:
> > Does ostest pass?
> > Do basic drivers (UART, Timer, GPIO) work?
> > Is SMP tested (if applicable)?
> > Boards that don't meet the minimum criteria are marked as
> > "experimental" or "unsupported".
> >
> > 5) Better Unit Testing and Mocking
> > Expand the apps/testing suite with more unit tests.
> > Use frameworks like CMocka or extend the existing ostest usage.
> > Mock drivers and hardware to allow kernel logic testing even without
> > hardware; see the sketch just below.
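> >
> > As a rough sketch only (CMocka is just one option, and the
> > mock_uart_getc() and count_line_chars() functions below are
> > hypothetical, not an existing NuttX API), such a host-side test with
> > a mocked driver could look like:
> >
> >   #include <stdarg.h>
> >   #include <stddef.h>
> >   #include <setjmp.h>
> >   #include <stdint.h>
> >   #include <cmocka.h>
> >
> >   /* Hypothetical mock standing in for a UART driver, so the logic
> >    * under test can run on a host PC without real hardware.
> >    */
> >
> >   static int mock_uart_getc(void)
> >   {
> >     return (int)mock(); /* Returns the value queued by will_return() */
> >   }
> >
> >   /* Hypothetical logic under test: count characters up to a newline */
> >
> >   static int count_line_chars(int (*getc_fn)(void))
> >   {
> >     int n = 0;
> >     while (getc_fn() != '\n')
> >       {
> >         n++;
> >       }
> >
> >     return n;
> >   }
> >
> >   static void test_count_line_chars(void **state)
> >   {
> >     (void)state;
> >     will_return(mock_uart_getc, 'o');
> >     will_return(mock_uart_getc, 'k');
> >     will_return(mock_uart_getc, '\n');
> >     assert_int_equal(count_line_chars(mock_uart_getc), 2);
> >   }
> >
> >   int main(void)
> >   {
> >     const struct CMUnitTest tests[] =
> >     {
> >       cmocka_unit_test(test_count_line_chars),
> >     };
> >
> >     return cmocka_run_group_tests(tests, NULL, NULL);
> >   }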
> >
> > 6) Stable API Guarantees
> > Formalize API stability between releases (similar to the "stable API"
> > policy in the Linux kernel).
> > Document which APIs are considered stable and which are still
> > experimental.
> > Add a deprecation process for removing/renaming public APIs.
> >
> > 7) Regression Testing
> > Maintain a regression test suite to ensure that previously fixed bugs
> > do not come back.
> > Basically, when someone finds an issue they should create a test, to
> > be integrated into ostest, that will detect it in the future.
> > Set up automatic re-runs of regression tests in CI when code is merged.
> >
> > 8) Other Performance Benchmark Improvements
> > Create standard performance tests (see the P.S. below for a sketch):
> > Boot time benchmarks
> > Context switch time
> > Interrupt latency
> > Track performance regressions automatically in CI.
> >
> > 9) Create Documentation/Templates to be used as a reference for
> > boards and other common documentation
> >
> > Another idea that we could implement to validate that the most
> > important peripherals of every arch are working as expected: create a
> > base board (mainboard) with many important peripherals (sensors,
> > audio, Ethernet, USB) and a "cartridge" board to be connected to it.
> > We could use some existing standard like the Raspberry Pi Compute
> > Module CM4S SODIMM (https://datasheets.raspberrypi.com/cm4s/cm4s-datasheet.pdf)
> > or SparkFun MicroMod (https://www.sparkfun.com/micromod). The good
> > thing about using MicroMod is that there are already a lot of
> > microcontroller "cartridge" boards that we could use.
> >
> > Please let me know what you guys think, and we can plan the actions
> > to make it happen!
> >
> > BR,
> >
> > Alan
> >
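> > P.S. Regarding item 8, as a rough sketch only (the iteration count
> > is arbitrary, and a real test would also pin the scheduler policy and
> > priorities), a context switch micro-benchmark using standard POSIX
> > calls could look like:
> >
> >   #include <pthread.h>
> >   #include <sched.h>
> >   #include <stdint.h>
> >   #include <stdio.h>
> >   #include <time.h>
> >
> >   #define ITERATIONS 10000
> >
> >   /* Two equal-priority threads yielding to each other force a
> >    * context switch on (roughly) every sched_yield() call.
> >    */
> >
> >   static void *yielder(void *arg)
> >   {
> >     for (int i = 0; i < ITERATIONS; i++)
> >       {
> >         sched_yield();
> >       }
> >
> >     return NULL;
> >   }
> >
> >   int main(void)
> >   {
> >     struct timespec start;
> >     struct timespec end;
> >     pthread_t t1;
> >     pthread_t t2;
> >     int64_t ns;
> >
> >     clock_gettime(CLOCK_MONOTONIC, &start);
> >     pthread_create(&t1, NULL, yielder, NULL);
> >     pthread_create(&t2, NULL, yielder, NULL);
> >     pthread_join(t1, NULL);
> >     pthread_join(t2, NULL);
> >     clock_gettime(CLOCK_MONOTONIC, &end);
> >
> >     ns = (int64_t)(end.tv_sec - start.tv_sec) * 1000000000 +
> >          (end.tv_nsec - start.tv_nsec);
> >
> >     /* Each iteration yields in both threads, i.e. ~2 switches */
> >
> >     printf("avg per yield: %lld ns\n",
> >            (long long)(ns / (2 * ITERATIONS)));
> >     return 0;
> >   }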