Hi David,

Sorry for scolding you in public as well, but I think we don't need to assign blame.
So, I got the impression you were doing it to promote the PX4 test workflow as the best solution for all the NuttX issues. And although 300K drones are a lot, there are many commercial products using NuttX: many Sony audio recorders, Moto Z Snaps, thermal printers, etc. Probably we have products that exceed that number.

I think Fabio changed the buildbot link recently.

BTW, I just remembered another alternative that Sebastien and I did about 3 years ago: https://bitbucket.org/acassis/raspi-nuttx-farm/src/master/

The idea was to use low-cost Raspberry Pis as a distributed build test farm for NuttX. It worked fine! You just define a board file with the configuration you want to test and it is done.
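Just to give an idea of how little there is to it (this is only a rough sketch written from memory, not the actual scripts from that repository; the boards.txt format, the paths and the configure invocation below are illustrative), each Pi worker basically does something like:

#!/usr/bin/env python3
# Illustrative farm worker: read a board list file and try to build each
# configuration, reporting PASS/FAIL. NUTTX_DIR and the boards.txt format
# are made up here; see the real scripts in the raspi-nuttx-farm repo.

import subprocess
import sys

NUTTX_DIR = "/home/pi/nuttx"   # assumed location of the NuttX checkout

def build(board_config):
    """Configure and build one config, e.g. "stm32f4discovery/nsh"."""
    # distclean may complain on a pristine tree; ignore its result
    subprocess.run(["make", "distclean"], cwd=NUTTX_DIR,
                   capture_output=True)

    for cmd in (["./tools/configure.sh", board_config], ["make", "-j4"]):
        result = subprocess.run(cmd, cwd=NUTTX_DIR,
                                capture_output=True, text=True)
        if result.returncode != 0:
            print("FAIL", board_config, "at:", " ".join(cmd))
            print(result.stderr[-2000:])   # tail of the error output
            return False

    print("PASS", board_config)
    return True

if __name__ == "__main__":
    # boards.txt: one <board>/<config> per line
    with open(sys.argv[1]) as f:
        configs = [line.strip() for line in f if line.strip()]

    failed = [c for c in configs if not build(c)]
    sys.exit(1 if failed else 0)

The rest is basically glue to spread the board list across the Pis and collect the results.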
BR,

Alan

On 12/20/19, David Sidrane <davi...@apache.org> wrote:
> Hi Alan,
>
> Sorry if my intent was misunderstood. I am merely stating facts on where we are and how we got there. I am not assigning blame. I am not forcing anything; I am giving some examples of how we can make the project complete and better. We can use all of it, some of it, or none of it. This is a group decision.
>
> Also, please do fill us in on where we can see the SW CI & HW CI you mentioned. Do you have links? Maybe we can use them now.
>
> Again, sorry!
>
> David
>
> On 2019/12/20 11:44:23, Alan Carvalho de Assis <acas...@gmail.com> wrote:
>> Hi David,
>>
>> On 12/20/19, David Sidrane <davi...@apache.org> wrote:
>> > Hi Nathan,
>> >
>> > On 2019/12/20 02:51:56, Nathan Hartman <hartman.nat...@gmail.com> wrote:
>> >> On Thu, Dec 19, 2019 at 6:24 PM Gregory Nutt <spudan...@gmail.com> wrote:
>> >> > >> ] A bad build system change can cause serious problems for a lot of people around the world. A bad change in the core OS can destroy the good reputation of the OS.
>> >> > > Why is this the case? Users should not be using unreleased code or be encouraged to use it. If they are, one solution is to make more frequent releases.
>> >> > I don't think that the number of releases is the factor. It is time in people's hands. Subtle corruption of OS real-time behavior is not easily tested. You normally have to specially instrument the software and set up a special test environment, perhaps with a logic analyzer, to detect these errors. Errors in the core OS can persist for months and, in at least one case I am aware of, years, until someone sets up the correct instrumented test.
>> >>
>> >> And:
>> >>
>> >> On Thu, Dec 19, 2019 at 4:20 PM Justin Mclean <jus...@classsoftware.com> wrote:
>> >> > > ] A bad build system change can cause serious problems for a lot of people around the world. A bad change in the core OS can destroy the good reputation of the OS.
>> >> >
>> >> > Why is this the case? Users should not be using unreleased code or be encouraged to use it. If they are, one solution is to make more frequent releases.
>> >>
>> >> Many users are only using released code. However, whatever is in "master" eventually gets released. So if problems creep in unnoticed, downstream users will be affected. It is only delayed.
>> >>
>> >> I can personally attest that those kinds of errors are extremely difficult to detect and trace. It does require a special setup with a logic analyzer or oscilloscope, and sometimes other tools, not to mention a whole setup to produce the right stimuli and several pieces of software that may have to be written specifically for the test....
>> >>
>> >> I have been wracking my brain on and off thinking about how we could set up an automated test system to find errors related to timing etc. Unfortunately, unlike ordinary software for which you can write an automated test suite, this sort of embedded RTOS will need specialized hardware to conduct the tests. That's a subject for another thread and I don't know if now is the time, but I will post my thoughts eventually.
>> >>
>> >> Nathan
>> >
>> > From the proposal:
>> >
>> > "Community
>> >
>> > NuttX has a large, active community. Communication is via a Google group at https://groups.google.com/forum/#!forum/nuttx where there are 395 members as of this writing. Code is currently maintained at Bitbucket.org at https://bitbucket.org/nuttx/. Other communications are through Bitbucket issues and also via Slack for focused, interactive discussions."
>> >
>> > >> Many users are only using released code.
>> >
>> > Can we ask the 395 members?
>> >
>> > I can only share my experience with NuttX since I began working on the project in 2012, for multiple companies.
>> >
>> > Historically (based on my time on the project) releases were build tested - by this I mean that the configurations were updated and thus created a set of "Build Test Vectors" (BTV). As for the number of permutations: http://nuttx.org/doku.php?id=documentation:configvars (note its load time) shows 95,338 CONFIG_* hits. Yes, there are duplicates on the page, and dependencies; this is just meant to give a number of bits....
>> >
>> > The total space is very large.
>> >
>> > The BTV coverage of that space was very sparse.
>> >
>> > IIRC Greg gave the build testing task a day of time. It was repeated after errors were found. I am not aware of any other testing. Are you?
>> >
>> > There were no Release Candidate (rc), alpha, or beta tests that ran this code on real systems, and very few, if any, Run Test Vectors (RTV) - I have never seen a test report - has anyone?
>> >
>> > One way to look at this is Sporadic Integration (SI) with limited BTV and minimal RTV. Total Test Vector Coverage: TTVC = BTV + RTV. The ROI of this way of working, from a reliability perspective, was and is very small.
>> >
>> > A herculean effort on Greg's part with little return: we released code with many significant and critical errors in it. See the ReleaseNotes and the commit log.
>> >
>> > Over the years Greg referred to TRUNK (yes, it was on SVN) and master as his "own sandbox", stating it should not be considered stable or buildable. This is evident in the commit log.
>> >
>> Please stop focusing on the people (Greg) and let's talk about the workflow. We are here to discuss how we can improve the process; we are not talking about throwing away the NuttX Build System and moving to PX4.
>>
>> You are picturing something that is not true.
>>
>> We have issues, as FreeRTOS, MBED and Zephyr also have. But it is not Greg's fault, nor the Build System's.
>>
>> Please, stop! It is disgusting!
>> > I have personally never used a release from a tarball. Given the above, why would I? It is less stable than master at TC = N (https://www.electronics-tutorials.ws/rc/rc_1.html), where N is some number of days after a release. Unfortunately, based on the current practices (a very unprofessional workflow), N is also dictated by when apps and nuttx are actually building for a given target's set of BTV.
>> >
>> It is not "unprofessional"; it was what we could do based on our hardware limitations.
>>
>> > With the tools and resources that exist in our work today, quite frankly: this is unacceptable and is an embarrassment.
>> >
>> Oh my Gosh! Please don't do it.
>>
>> > I suspect this is why there is a Tizen. The modern era gets it. (Disclaimer: I am an old dog - I am learning to get it.)
>> >
>> Tizen exists because companies want to have control. It is the same reason why Red Hat and others maintain their own Linux kernel by themselves.
>>
>> > --- Disclaimer ---
>> >
>> > In the following, I am not bragging about PX4 or selling tools, I am merely trying to share our experiences for the betterment of NuttX.
>> >
>> > From what I understand, PX4 has the most instances of NuttX running on real HW in the world. Over 300K. (I welcome other users to share their numbers.)
>> >
>> > PX4's total TTVC is still limited, but much, much greater than NuttX's.
>> >
>> > We use Continuous Integration (CI) on NuttX in PX4 on every commit on PRs:
>> >
>> > C/C++ CI / build (push) Successful in 3m
>> > Compile MacOS Pending — This commit is being built
>> > Compile All Boards — This commit looks good
>> > Hardware Test — This commit looks good
>> > SITL Tests — This commit looks good
>> > SITL Tests (code coverage) — This commit looks good
>> > ci/circleci — Your tests passed on CircleCI!
>> > continuous-integration/appveyor/pr — AppVeyor build succeeded
>> > continuous-integration/jenkins/pr-head — This commit looks good
>> >
>> > We run tests on HW:
>> >
>> > http://ci.px4.io:8080/blue/organizations/jenkins/PX4_misc%2FFirmware-hardware/detail/pr-mag-str-preflt/1/pipeline
>> >
>> > I say limited because of the set of archs we use and the way we configure the OS.
>> >
>> > I believe this to be true of all users.
>> >
>> > The benefit of a community is that the sum of all TTVC finds the problems, and we fix them.
>> >
>> > Why not maximize TTVC, if it will have a huge ROI and it is free?
>> >
>> > PX4 will contribute all that we have. We just need to build a temporally consistent build. Yeah, he is on the submodule thing AGAIN :)
>> >
>> Just to make the story short: we already have solutions for SW and HW CI.
>>
>> Besides the buildbot (https://buildbot.net) that was implemented and tested by Fabio Balzano, Xiaomi also has a build test for NuttX.
>>
>> At the end of the day, it is not only Greg testing the system; we all are testing it as well.
>>
>> Don't try to push PX4 down our throats, it will not work this way. Let's keep the Apache way, it is a democracy!
>>
>> BR,
>>
>> Alan
>