On Tue, 06 Mar 2018 15:27:06 +0000 Stephen Houston <smhousto...@gmail.com> said:

> We have developers leaving or severely cutting back their work, and this
> includes developers who carry a large workload.  Now we see Stefan has
> lost faith and interest in QA/CI and is going to step back from that... I
> think at some point we need to agree to stop arguing the merits of getting
> better structure and better organization, agree that SOMETHING needs to
> be done, and start putting together a plan.  So, since indefini put
> together a phab ticket (https://phab.enlightenment.org/T6740), I think we
> should really work there to put together a plan to help.

not sure why you said the above... that exact ticket is entirely unrelated to
what stefan is talking about. in fact they disagree.

> On Tue, Mar 6, 2018 at 8:46 AM Carsten Haitzler <ras...@rasterman.com>
> wrote:
> 
> > On Tue, 6 Mar 2018 13:43:09 +0100 Stefan Schmidt <ste...@osg.samsung.com>
> > said:
> >
> > > Hello.
> > >
> > >
> > > On 03/06/2018 12:34 PM, Carsten Haitzler (The Rasterman) wrote:
> > > > On Tue, 6 Mar 2018 09:54:50 +0100 Stefan Schmidt <ste...@osg.samsung.com> said:
> > > >
> > > >> Hello.
> > > >>
> > > >>
> > > >> On 03/06/2018 07:44 AM, Carsten Haitzler (The Rasterman) wrote:
> > > >>> tbh platforms is an issue. windows is kind of big as setting up a
> > > >>> cross-build environment is a fair bit of work. setting up a windows vm
> > > >>> costs money to test (need licenses, or if you can scrounge one off another
> > > >>> pc you have/had). osx is worse in that you have to go buy hardware to run
> > > >>> osx to test.
> > > >>>
> > > >>> i think making it a requirement that every commit work on every platform
> > > >>> is not viable UNLESS:
> > > >>>
> > > >>> 1. a windows cross-compile env is available to developers (e.g. ssh into a
> > > >>>    vm set up for this with clear documentation and scripts/tools to do the
> > > >>>    builds. i have one set up for me at home).
> > > >>> 2. a vm with remote display that developers can use to run/test changes
> > > >>>    easily.
> > > >>> 3. an actual osx box that developers can ssh into and compile and run and
> > > >>>    test and remotely view/interact with like the windows vm.
> > > >>> 4. same for freebsd etc.
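
(on point 1 above: a rough sketch of what such a cross-build session could look
like, assuming a mingw-w64 toolchain and an autotools tree - the prefix and the
exact efl configure flags here are made up and are more involved in practice:

  # point pkg-config at the cross-built dependencies (hypothetical prefix)
  export PKG_CONFIG_PATH=/opt/ewin64/lib/pkgconfig
  ./configure --host=x86_64-w64-mingw32 --prefix=/opt/ewin64
  make -j8 && make install

the point of a shared vm is that this env, plus all the pre-built dependencies,
is set up once and documented for everyone.)
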
> > > >> We do not have this and I am pretty sure we never will (I can only hope
> > > >> the future proves me wrong). Maybe we should be more honest and state that any
> > > > then i can't see how we could make it a requirement for committing that they
> > > > build/run there.
> > >
> > > Having them run a build is very different from having full remote shell
> > > access. E.g. osx build slaves available on Travis CI do not have any option
> > > to get shell access.
> > >
> > > Detecting a problem is the first step. A build test for these *before*
> > > entering master would do this. It would still need fixing, but that is the
> > > same as we have right now. The difference really is where we detect the
> > > break and whether we have it sitting in master or not.
> >
> > ok - i wasn't thinking github's travis but more an actual osx machine plugged
> > in somewhere we can remotely manage and use. :) i know travis will be very
> > limited.
> >
> > > > when people report issues with specific builds/platforms then
> > > > address them as needed.
> > >
> > > I can report the build being broken on osx, on aarch64, with the debug
> > > profile enabled, and also make check with cxx enabled right now. Those are
> > > 4 things broken in master just from what I see today.
> >
> > arm64 - known. and this is a bit of a surprise from luajit. it's technically
> > been broken like this forever. it wasn't some change we made. somehow luajit
> > started breaking at some point. the "we only allow you to use 27 bits out of 64
> > of a pointer" thing.
> >
> > so this is not something CI would magically stop, as changes to the dependencies
> > started making things crash there. at least i have personally compiled using a
> > luajit i built myself on arm64 and it worked. this was a while back.
> >
> > cxx bindings. indeed these are a problem to the point where i just disable them
> > in my build. i am not sure if this is good or bad though. it's kind of tied in
> > with the eo and bindings work, and as things change multiple languages need to
> > adapt.
> >
> > as for osx - i have no info as to what is going on there or why. :( i don't
> > even know where the build breaks/logs etc. are. i do know
> > build.enlightenment.org and that basically seems ok.
> >
> > > >> platform we support (besides Linux on x86_64) has only been tested at some
> > > >> point in the past.
> > > > i actually cross-build for windows maybe every few weeks or so, and freebsd
> > > > maybe similarly too. i build on rpi3 (32bit) regularly enough too.
> > > >
> > > > we haven't tested on every linux distro either... so should we only claim a
> > > > few distributions? i don't think we're being dishonest really. releases do
> > > > get a lot more testing to see they work across os's. master may not get as
> > > > much until a release cycle.
> > >
> > > Yeah, and as I had the pleasure of handling these releases I can tell you
> > > that dealing with them so late in the cycle is a nightmare. Having a smoother
> > > release cycle is actually one of my main motivations to have this early
> > > detection and avoidance of breaks.
> >
> > indeed you are right - catching things early is far better. i really was more
> > about us being honest about the releases working on those platforms or not. :)
> > that's all. you are right there and i totally agree with that.
> >
> > > >>> if a platform is not EASILY accessible and able to be EASILY worked with,
> > > >>> then making this a requirement to pass a build/test on that platform is
> > > >>> silly.
> > > >>>
> > > >>> developers have to be able to do incremental builds. not a "wait 10 mins
> > > >>> for the next build cycle to happen then wait another 10 for the log to
> > > >>> see if it worked this time". that's fine for reports. it's not fine for
> > > >>> actual development.
> > > >>>
> > > >>> if this infra existed and worked well, THEN i think it might be sane to
> > > >>> start adding "it has to pass a build test". then deciding on what
> > > >>> platforms have to be supported is another step. this has to be pain-free
> > > >>> or as close as possible to that.
> > > >> Good luck finding someone to somehow set this all up and keep it working.
> > > >> Definitely not me. :-) Looking back at how badly the idea of having a
> > > >> windows vm, a mac and an arm device hooked up to Jenkins turned out, I
> > > >> simply gave up on that part.
> > > > well then i just can't see us ever making it a requirement they build across
> > > > these os's on every single commit if it can't be automated and made
> > > > available to developers to actually look into and see what is working or
> > > > not and why. :(
> > > >
> > > >>> not to mention... the test suites need to actually be reliable. i just
> > > >>> found one of the ecore_file tests was grabbing a page from sf.net ... and
> > > >>> now sf.net is refusing to serve it anymore, thus test suites keep failing.
> > > >>> tests that are fragile like this should not be some gatekeeper as to if
> > > >>> code goes in or not.
> > > >>>
> > > >>> if a test suite is to be a gatekeeper it has to be done right. that means
> > > >>> it has to work reliably on the build host. run very quickly. things like
> > > >>> testing network fetches have to not rely on anything outside of that
> > > >>> vm/box/chroot etc. etc. ... and we don't have that situation. this
> > > >>> probably needs to be fixed first and foremost. not to mention just
> > > >>> dealing with check and our tests to find what went wrong is a nightmare.
> > > >>> finding the test that goes wrong in a sea of output ... is bad.
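
(as an aside, one way such a fetch test could be made self-contained - a rough
sketch only, this is not how the ecore_file test actually works and the env var
name here is made up:

  # serve local fixtures over http inside the build env - no external hosts
  ( cd tests/fixtures && exec python3 -m http.server 8099 --bind 127.0.0.1 ) &
  TEST_DOWNLOAD_URL="http://127.0.0.1:8099/index.html" ./src/tests/ecore/ecore_suite
  kill $!   # stop the fixture server again

then sf.net changing or going away can never fail the suite.)
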
> > > >>>
> > > >>> so my take is this: first work on the steps needed to get the final
> > > >>> outcome. better test infra. easier to write tests. easier to run and find
> > > >>> just the test that failed and run it by itself easily etc. it should be as
> > > >>> simple as:
> > > >>>
> > > >>> make check
> > > >>> ...
> > > >>> FAIL: src/tests/some_test_binary
> > > >>>
> > > >>> and to test it i just copy & paste that binary/line and nothing more and i
> > > >>> get exactly that one test that failed. i don't have to set env vars, read
> > > >>> src code to find the test name and so on. ... it currently is not this
> > > >>> simple by far. ;(
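
(for the record, the closest i know of with plain check today is something like
the below - a sketch, the suite binary and test names are made up, but
CK_RUN_SUITE / CK_RUN_CASE / CK_FORK are real check env vars:

  # run just one suite/case from one test binary, without forking
  CK_FORK=no CK_RUN_SUITE="Ecore_File" CK_RUN_CASE="ecore_test_file_download" \
    ./src/tests/ecore/ecore_suite

which is exactly the "set env vars and go read the src for the names" dance i'm
complaining about above.)
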
> > > >> Yes, our tests are not as reliable as they should be.
> > > >> Yes, they would need to run in a controlled env.
> > > >> Yes, we might need to look at alternatives to libcheck.
> > > > i'm just saying these need to not randomly reject commits from devs when the
> > > > issue has nothing to do with what the dev is doing. it can't become an
> > > > automated requirement unless it's reliably correct. :(
> > > >
> > > >> But even with me agreeing to the three things above, the core question
> > > >> still stays open.
> > > >>
> > > >> Is this developer community willing to accept a working test suite as a
> > > >> gatekeeper? I don't think this is the case.
> > > > i think it's best to make it an expectation that devs run make check and
> > > > compile against efl before a push
> > >
> > > That is an expectation that has been failing in my reality for many years now.
> > > Getting people to run make check or even distcheck is a fight against
> > > windmills. Before I can normally do the first beta release from master for a
> > > new cycle I need to do onion bug fixing, peeling off one bug after another
> > > to finally get some tarballs produced.
> >
> > i know that at least i run make check a few times per week and i am doing
> > multiple builds from scratch per day including building against efl.
> >
> > > > before even considering making it the
> > > > gatekeeper. we aren't even there yet with enough tooling, let alone talking
> > > > of an automated gatekeeper. just a reliable, easy to use and complete test
> > > > suite would be a big step forward. i think it's putting the cart before the
> > > > horse to talk automated gatekeepers + jenkins ... etc. without getting
> > > > these first things right.
> > >
> > > The fun fact here is that most bogus reports actually come from other bugs
> > > that have been introduced before and are now shadowing new ones.
> > >
> > > We also have bogus reports due to our homebrewed infrastructure and Jenkins
> > > misconfigurations.
> >
> > that's not too great. :(
> >
> > > >> My personal motivation to work on QA and CI has gone down to zero over the
> > > >> years. It just feels like a Sisyphus task to look at master again and again
> > > >> to see why it is broken now. Dig through the commits, bisect them, point
> > > >> fingers and constantly poke people to get it fixed. All long after the
> > > >> problems have entered master.
> > > > what you want is gerrit. and i've seen how that works. i'm not a fan. it
> > > > ends up either:
> > > >
> > > > 1. it's ignored and patches sit in review for weeks, months or years and
> > > >    vanish or
> > > > 2. it's gamed because everything has to go through it, it's minimized to try
> > > >    and remove the impact. people just vote things up without real review etc.
> > > >
> > > > if you automated the voting to a build check instead of humans, you'd need
> > > > to find some way to do that and have a test build bot vote and do it FAST.
> > > > that means you need enough infra for a build per commit and it has to be
> > > > totally reliable. the build env and the tests. is that really possible?
> > > > then if you don't do them in strict order you end up with conflicts, and if
> > > > some are rejected from a push with multiple commits, you have dependent
> > > > commits...
> > > >
> > > > i don't see how this ends up better. it ends up worse i think.
> > > >
> > > >> I am willing to admit that the approach I used to reach my goals might have
> > > >> been flawed and simply failed. Someone else might want to pick up the
> > > >> slack on it.
> > > > i really don't think we have a lot of break issues given size and
> > > > complexity. not build breaks anyway.
> > >
> > > See the list above just for the four I see today.
> >
> > well i guess i disabled cxx bindings so i stop seeing those, but i do see
> > things building cross-compile for windows and on freebsd as well as arm32
> > regularly and don't run into build issues there.
> >
> > indeed your examples are for systems i don't have builds or build infra for.
> >
> > > >  right now if jenkins detects a build break... how does
> > > > a developer know? it can take hours before it detects something. should they
> > > > sit hitting reload on the browser for the next hour hoping one of the builds
> > > > going through contains their commit? we have a broken feedback cycle. jenkins
> > > > should mail the mailing list with "commit X broke build: log etc. link here".
> > > > if a build breaks, jenkins should go back commits until it finds a working one
> > >
> > > Oh, now we even expect automated git bisecting from it? :-)
> >
> > i was even thinking a simple "roll back 1 at a time brute force until found" :)
> > nothing as fancy as a bisect! :) assuming
> >
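(something as dumb as this would already do - a rough sketch, assuming the bot
starts at the broken head and only wants the last commit that still builds; the
depth and build command are placeholders:

  # step back one commit at a time until a build succeeds again
  for rev in $(git rev-list -n 20 HEAD); do
      git checkout --quiet "$rev"
      if make -j8 >/dev/null 2>&1; then
          echo "last good commit: $rev - the one after it broke the build"
          break
      fi
  done

a real bisect would be faster, but for a handful of commits since the last
green build, brute force is plenty.)
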
> > > >  then
> > > > report the commit that broke. at least devs would get a notification.
> > >
> > > These mails are getting sent to the e-b0rk mailing list as I have been
> > > asked to not send them to the main development list.
> >
> > i totally missed this list. i guess that's why i don't know. :) i should
> > subscribe. i imagine the lack of knowing about this might be part of the
> > problem.
> >
> > > >  unless i
> > > > sit staring at build.e.org with reloads or someone tells me, i just
> > > > have no idea a build would have broken.
> > > >
> > > > i think we've talked about this many times before... :)
> > > >
> > >
> > > We did and we both had many of the same arguments before. :-)
> > >
> > > I reached the point where I have no motivation left for doing CI/QA work,
> > > though. Thus I am going to drop it and hope someone else picks it up.
> >
> > :(
> >
> > --
> > ------------- Codito, ergo sum - "I code, therefore I am" --------------
> > Carsten Haitzler - ras...@rasterman.com
> >
> >
> >


-- 
------------- Codito, ergo sum - "I code, therefore I am" --------------
Carsten Haitzler - ras...@rasterman.com


------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
enlightenment-devel mailing list
enlightenment-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/enlightenment-devel
