Hi David,

Sorry for scolding you in public as well, but I think we don't need to assign blame.
So, I got the impression you were doing it to promote the PX4 test workflow as the best solution for all the NuttX issues. And although 300K drones are a lot, there are many commercial products using NuttX: many Sony audio recorders, Moto Z Snaps, thermal printers, etc. Probably we have products that exceed that number.

I think Fabio changed the buildbot link recently.

BTW, I just remembered another alternative that Sebastien and I did about 3 years ago: https://bitbucket.org/acassis/raspi-nuttx-farm/src/master/

The idea was to use low-cost Raspberry Pis as a distributed build test farm for NuttX. It worked fine! You just define a board file with the configuration you want to test and it is done.
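Just to give an idea of how little there is to it (this is only a rough sketch written from memory, not the actual scripts from that repository; the boards.txt format, the paths and the configure invocation below are illustrative), each Pi worker basically does something like:

#!/usr/bin/env python3
# Illustrative farm worker: read a board list file and try to build each
# configuration, reporting PASS/FAIL. NUTTX_DIR and the boards.txt format
# are made up here; see the real scripts in the raspi-nuttx-farm repo.

import subprocess
import sys

NUTTX_DIR = "/home/pi/nuttx"   # assumed location of the NuttX checkout

def build(board_config):
    """Configure and build one config, e.g. "stm32f4discovery/nsh"."""
    # distclean may complain on a pristine tree; ignore its result
    subprocess.run(["make", "distclean"], cwd=NUTTX_DIR,
                   capture_output=True)

    for cmd in (["./tools/configure.sh", board_config], ["make", "-j4"]):
        result = subprocess.run(cmd, cwd=NUTTX_DIR,
                                capture_output=True, text=True)
        if result.returncode != 0:
            print("FAIL", board_config, "at:", " ".join(cmd))
            print(result.stderr[-2000:])   # tail of the error output
            return False

    print("PASS", board_config)
    return True

if __name__ == "__main__":
    # boards.txt: one <board>/<config> per line
    with open(sys.argv[1]) as f:
        configs = [line.strip() for line in f if line.strip()]

    failed = [c for c in configs if not build(c)]
    sys.exit(1 if failed else 0)

The rest is basically glue to spread the board list across the Pis and collect the results.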
BR,

Alan

On 12/20/19, David Sidrane <davi...@apache.org> wrote:
> Hi Alan,
>
> Sorry if my intent was misunderstood. I am merely stating facts on where we are and how we got there. I am not assigning blame. I am not forcing anything; I am giving some examples of how we can make the project complete and better. We can use all of it, some of it, or none of it. This is a group decision.
>
> Also, please do fill us in on where we can see the SW CI & HW CI you mentioned. Do you have links? Maybe we can use them now.
>
> Again, sorry!
>
> David
>
> On 2019/12/20 11:44:23, Alan Carvalho de Assis <acas...@gmail.com> wrote:
>> Hi David,
>>
>> On 12/20/19, David Sidrane <davi...@apache.org> wrote:
>> > Hi Nathan,
>> >
>> > On 2019/12/20 02:51:56, Nathan Hartman <hartman.nat...@gmail.com> wrote:
>> >> On Thu, Dec 19, 2019 at 6:24 PM Gregory Nutt <spudan...@gmail.com> wrote:
>> >> > >> ] A bad build system change can cause serious problems for a lot of people around the world. A bad change in the core OS can destroy the good reputation of the OS.
>> >> > > Why is this the case? Users should not be using unreleased code or be encouraged to use it. If they are, one solution is to make more frequent releases.
>> >> > I don't think that the number of releases is the factor. It is time in people's hands. Subtle corruption of OS real-time behavior is not easily tested. You normally have to specially instrument the software and set up a special test environment, perhaps with a logic analyzer, to detect these errors. Errors in the core OS can persist for months and, in at least one case I am aware of, years, until someone sets up the correct instrumented test.
>> >>
>> >> And:
>> >>
>> >> On Thu, Dec 19, 2019 at 4:20 PM Justin Mclean <jus...@classsoftware.com> wrote:
>> >> > > ] A bad build system change can cause serious problems for a lot of people around the world. A bad change in the core OS can destroy the good reputation of the OS.
>> >> >
>> >> > Why is this the case? Users should not be using unreleased code or be encouraged to use it. If they are, one solution is to make more frequent releases.
>> >>
>> >> Many users are only using released code. However, whatever is in "master" eventually gets released. So if problems creep in unnoticed, downstream users will be affected. It is only delayed.
>> >>
>> >> I can personally attest that those kinds of errors are extremely difficult to detect and trace. It does require a special setup with a logic analyzer or oscilloscope, and sometimes other tools, not to mention a whole setup to produce the right stimuli and several pieces of software that may have to be written specifically for the test....
>> >>
>> >> I have been wracking my brain on and off thinking about how we could set up an automated test system to find errors related to timing etc. Unfortunately, unlike ordinary software for which you can write an automated test suite, this sort of embedded RTOS will need specialized hardware to conduct the tests. That's a subject for another thread and I don't know if now is the time, but I will post my thoughts eventually.
>> >>
>> >> Nathan
>> >
>> > From the proposal:
>> >
>> > "Community
>> >
>> > NuttX has a large, active community. Communication is via a Google group at https://groups.google.com/forum/#!forum/nuttx where there are 395 members as of this writing. Code is currently maintained at Bitbucket.org at https://bitbucket.org/nuttx/. Other communications are through Bitbucket issues and also via Slack for focused, interactive discussions."
>> >
>> > >> Many users are only using released code.
>> >
>> > Can we ask the 395 members?
>> >
>> > I can only share my experience with NuttX since I began working on the project in 2012, for multiple companies.
>> >
>> > Historically (based on my time on the project) releases were build tested - by this I mean that the configurations were updated and thus created a set of "Build Test Vectors" (BTV). As for the number of permutations: http://nuttx.org/doku.php?id=documentation:configvars (note its load time) shows 95,338 CONFIG_* hits. Yes, there are duplicates on the page, and dependencies; this is just meant to give a number of bits....
>> >
>> > The total space is very large.
>> >
>> > The BTV coverage of that space was very sparse.
>> >
>> > IIRC Greg gave the build testing task a day of time. It was repeated after errors were found. I am not aware of any other testing. Are you?
>> >
>> > There were no Release Candidate (rc), alpha, or beta tests that ran this code on real systems, and very few, if any, Run Test Vectors (RTV) - I have never seen a test report - has anyone?
>> >
>> > One way to look at this is Sporadic Integration (SI) with limited BTV and minimal RTV. Total Test Vector Coverage: TTVC = BTV + RTV. The ROI of this way of working, from a reliability perspective, was and is very small.
>> >
>> > A herculean effort on Greg's part with little return: we released code with many significant and critical errors in it. See the ReleaseNotes and the commit log.
>> >
>> > Over the years Greg referred to TRUNK (yes, it was on SVN) and master as his "own sandbox", stating it should not be considered stable or buildable. This is evident in the commit log.
>> >
>> Please stop focusing on the people (Greg) and let's talk about the workflow. We are here to discuss how we can improve the process; we are not talking about throwing away the NuttX Build System and moving to PX4.
>>
>> You are picturing something that is not true.
>>
>> We have issues, as FreeRTOS, MBED and Zephyr also have. But it is not Greg's fault, nor the Build System's.
>>
>> Please, stop! It is disgusting!
>> > I have personally never used a release from a tarball. Given the above, why would I? It is less stable than master at TC = N (https://www.electronics-tutorials.ws/rc/rc_1.html), where N is some number of days after a release. Unfortunately, based on the current practices (a very unprofessional workflow), N is also dictated by when apps and nuttx are actually building for a given target's set of BTV.
>> >
>> It is not "unprofessional"; it was what we could do based on our hardware limitations.
>>
>> > With the tools and resources that exist in our work today, quite frankly: this is unacceptable and is an embarrassment.
>> >
>> Oh my Gosh! Please don't do it.
>>
>> > I suspect this is why there is a Tizen. The modern era gets it. (Disclaimer: I am an old dog - I am learning to get it.)
>> >
>> Tizen exists because companies want to have control. It is the same reason why Red Hat and others maintain their own Linux kernel by themselves.
>>
>> > --- Disclaimer ---
>> >
>> > In the following, I am not bragging about PX4 or selling tools, I am merely trying to share our experiences for the betterment of NuttX.
>> >
>> > From what I understand, PX4 has the most instances of NuttX running on real HW in the world. Over 300K. (I welcome other users to share their numbers.)
>> >
>> > PX4's total TTVC is still limited, but much, much greater than NuttX's.
>> >
>> > We use Continuous Integration (CI) on NuttX in PX4 on every commit on PRs:
>> >
>> > C/C++ CI / build (push) Successful in 3m
>> > Compile MacOS Pending — This commit is being built
>> > Compile All Boards — This commit looks good
>> > Hardware Test — This commit looks good
>> > SITL Tests — This commit looks good
>> > SITL Tests (code coverage) — This commit looks good
>> > ci/circleci — Your tests passed on CircleCI!
>> > continuous-integration/appveyor/pr — AppVeyor build succeeded
>> > continuous-integration/jenkins/pr-head — This commit looks good
>> >
>> > We run tests on HW:
>> >
>> > http://ci.px4.io:8080/blue/organizations/jenkins/PX4_misc%2FFirmware-hardware/detail/pr-mag-str-preflt/1/pipeline
>> >
>> > I say limited because of the set of archs we use and the way we configure the OS.
>> >
>> > I believe this to be true of all users.
>> >
>> > The benefit of a community is that the sum of all TTVC finds the problems, and we fix them.
>> >
>> > Why not maximize TTVC, if it will have a huge ROI and it is free?
>> >
>> > PX4 will contribute all that we have. We just need to build a temporally consistent build. Yeah, he is on the submodule thing AGAIN :)
>> >
>> Just to make the story short: we already have solutions for SW and HW CI.
>>
>> Besides the buildbot (https://buildbot.net) that was implemented and tested by Fabio Balzano, Xiaomi also has a build test for NuttX.
>>
>> At the end of the day, it is not only Greg testing the system; we all are testing it as well.
>>
>> Don't try to push PX4 down our throats, it will not work this way. Let's keep the Apache way, it is a democracy!
>>
>> BR,
>>
>> Alan
>