Re: [E-devel] Current State and Future Direction of E/EFL

Carsten Haitzler Tue, 06 Mar 2018 06:47:19 -0800

On Tue, 6 Mar 2018 13:43:09 +0100 Stefan Schmidt <[email protected]> said:


> Hello.
> 
> 
> On 03/06/2018 12:34 PM, Carsten Haitzler (The Rasterman) wrote:
> > On Tue, 6 Mar 2018 09:54:50 +0100 Stefan Schmidt <[email protected]>
> > said:
> >
> >> Hello.
> >>
> >>
> >> On 03/06/2018 07:44 AM, Carsten Haitzler (The Rasterman) wrote:
> >>> tbh platforms is an issue. windows is kind of big as setting up a
> >>> cross-build environment is a fair bit of work. setting up a windows vm
> >>> costs money to test (need licenses, or if you can scrounge one off another
> >>> pc you have/had). osx is worse in that you have to go buy hardware to run
> >>> osx to test.
> >>>
> >>> i think making it a requirement that every commit work on every platform
> >>> is not viable UNLESS:
> >>>
> >>> 1. a windows cross-compile env is available to developers (e.g. ssh into a
> >>> vm set up for this with clear documentation and scripts/tools to do the
> >>> builds. i have one set up for me at home).
> >>> 2. a vm with remote display that developers can use to run/test changes
> >>> easily.
> >>> 3. an actual osx box that developers can ssh into and compile and run and
> >>> test and remotely view/interact with like the windows vm.
> >>> 4. same for freebsd etc.
> >> We do not have this and I am pretty sure we never will (I can only hope the
> >> future proofs me wrong). Maybe we should be more honest and state that any
> > then i can't see how we could make it a requirement for committing that they
> > build/run there. 
> 
> Having them running a build is very different from having full remote shell
> access. E.g. osx build slaves available on TravisCi do not have any option to
> get shell access.
> 
> Detecting a problem is the first step. A build test for these *before*
> entering master would do this. It would still need fixing, but that is the
> same we have right now. The difference really is where we detect the break
> and if we have it sitting in master or not.

ok - i wasn't thinking github's travis but more an actual osx machine plugged
in somewhere we can remotely manage and use. :) i know travis will be very
limited.

> > when people report issues with specific builds/platforms then
> > address them as needed.
> 
> I can report the build being broken on osx, on aarch64, with the debug
> profile enabled and also make check with cxx enabled right now. Those are 4
> things broken in master, just from what i see today.

arm64 - known. and this is a bit of a surprise from luajit. it's technically
been broken like this forever. it wasn't some change we made. somehow luajit
started breaking at some point. the "we only allow you to use 47 bits out of 64
of a pointer" thing.

so this is not something CI would magically stop as changes to the dependencies
started making things crash there. at least i have personally compiled using a
luajit compiled myself on arm64 and it worked. this was a while back.

cxx bindings. indeed these are a problem to the point where i just disable them
in my build. i am not sure if this is good or bad though. it's kind of tied
to the eo and bindings work and as things change multiple languages need to
adapt.

as for osx - i have no info as to what is going on there or why. :( i don't
know even where the build breaks/logs etc. are. i do know
build.enlightenment.org and that basically seems ok.

> >> platform we support (besides Linux on x86_64) has only been tested at some
> >> point in the past.
> > i actually cross-build for windows maybe every few weeks or so, and freebsd
> > maybe similarly too. i build on rpi3 (32bit) too regularly enough.
> >
> > we haven't tested on every linux distro either... so should we only claim a
> > few distributions? i don't think we're being dishonest really. releases do
> > get a lot more testing to see they work across os's. master may not get as
> > much until a release cycle.
> 
> Yeah, and as I had the pleasure of handling these releases I can tell you
> that dealing with them so late in the cycle is a nightmare. Having a smoother
> release cycle is actually one of my main motivations to have this early
> detection and avoidance of breaks.

indeed you are right - catching things early is far better. i really was more
about us being honest about the releases working on those platforms or not. :)
that's all. you are right there and i totally agree with that.

> >>> if a platform is not EASILY accessible and able to be EASILY worked with,
> >>> then making this a requirement to pass a build/test on that platform is
> >>> silly.
> >>>
> >>> developers have to be able to do incremental builds. not a "wait 10 mins
> >>> for the next build cycle to happen then wait another 10 for the log to
> >>> see if it worked this time". that's fine for reports. it's not fine for
> >>> actual development.
> >>>
> >>> if this infra existed and worked well, THEN i think it might be sane to
> >>> start adding "it has to pass a build test". then deciding on what
> >>> platforms have to be supported is another step. this has to be pain-free
> >>> or as close as possible to that.
> >> Good luck finding someone to set this all up and keep it working.
> >> Definitely not me. :-) If I look back at how badly the idea of having a
> >> windows vm, a mac and an arm device hooked up to Jenkins turned out, I
> >> simply gave up on that part.
> > well then i just can't see us ever making it a requirement they build across
> > these os's on every single commit if it can't be automated and made
> > available to developers to actually look into and see what is working or
> > not and why. :(
> >
> >>> not to mention... the test suites need to actually be reliable. i just
> >>> found one of the ecore_file tests was grabbing a page from sf.net ... and
> >>> now sf.net is refusing to serve it anymore thus test suites keep failing.
> >>> tests that are fragile like this should not be some gatekeeper as to if
> >>> code goes in or not.
> >>>
> >>> if a test suite is to be a gatekeeper it has to be done right. that means
> >>> it has to work reliably on the build host. run very quickly. things like
> >>> testing network fetches has to not rely on anything outside of that
> >>> vm/box/chroot etc. etc. ... and we don't have that situation. this
> >>> probably needs to be fixed first and foremost. not to mention just
> >>> dealing with check and our tests to find what went wrong is a nightmare.
> >>> finding the test that goes wrong in a sea of output ... is bad.
> >>>
> >>> so my take is this: first work on the steps needed to get the final
> >>> outcome. better test infra. easier to write tests. easier to run and find
> >>> just the test that failed and run it by itself easily etc. it should be as
> >>> simple as:
> >>>
> >>> make check
> >>> ...
> >>> FAIL: src/tests/some_test_binary
> >>>
> >>> and to test it i just copy & paste that binary/line and nothing more and i
> >>> get exactly that one test that failed. i don't have to set env vars, read
> >>> src code to find the test name and so on. ... it currently is not this
> >>> simple by far. ;(
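
a rough sketch of that kind of output, assuming each test is a standalone
command that signals failure with a non-zero exit status. run_tests and
format_failures are hypothetical helpers, not part of check or the efl
build. every FAIL line is the exact, shell-quoted command to paste to re-run
only that test.

```python
import shlex
import subprocess
import sys


def run_tests(commands):
    """Run each command (a list of argv strings); return the ones that fail."""
    failed = []
    for argv in commands:
        result = subprocess.run(
            argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        if result.returncode != 0:
            failed.append(argv)
    return failed


def format_failures(failed):
    """One line per failure; the text after 'FAIL: ' can be pasted into a shell."""
    return ["FAIL: " + shlex.join(argv) for argv in failed]


if __name__ == "__main__":
    # Two stand-in "tests": one passes, one exits with a failure status.
    tests = [
        [sys.executable, "-c", "raise SystemExit(0)"],
        [sys.executable, "-c", "raise SystemExit(3)"],
    ]
    for line in format_failures(run_tests(tests)):
        print(line)
```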
> >> Yes, our tests are not as reliable as they should be.
> >> Yes, they would need to run in a controlled env.
> >> Yes, we might need to look at alternatives to libcheck.
> > i'm just saying these need to not randomly reject commits from devs when the
> > issue has nothing to do with what the dev is doing. it can't become an
> > automated requirement unless its reliably correct. :(
> >
> >> But even with me agreeing to the three things above the core question stays
> >> still open.
> >>
> >> Is this developer community willing to accept a working test suite as a
> >> gatekeeper? I don't think this is the case.
> > i think it's best to make it an expectation that devs run make check and
> > compile against efl before a push 
> 
> That is an expectation that is failing in my reality for many years now.
> Getting people to run make check or even distcheck is a fight against
> windmills. Before I normally can do the first beta release from master for a
> new cycle I need to do onion bug fixing by peeling off one bug after another
> to finally get some tarballs produced.

i know that at least i run make check a few times per week and i am doing
multiple builds from scratch per day including building against efl.

> > before even considering making it the
> > gatekeeper. we aren't even there yet with enough tooling, let alone talking
> > of an automated gatekeeper. just a reliable, easy to use and complete test
> > suite would be a big step forward. i think it's putting the cart before the
> > horse to talk automated gatekeepers + jenkins ... etc. without getting
> > these first things right.
> 
> The fun fact here is that most bogus reports actually come from other bugs
> that have been introduced before and are now shadowing new ones.
> 
> We also have bogus reports due to our homebrew infrastructure and Jenkins
> misconfigurations.

that's not too great. :(

> >> My personal motivation to work on QA and CI has gone down to zero over the
> >> years. It just feels like a Sisyphus task to look at master again and again
> >> why it is broken now. Dig through the commits, bisect them, point fingers
> >> and constantly poke people to get it fixed. All long after the problems
> >> have entered master.
> > what you want is gerrit. and i've seen how that works. i'm not a fan. it
> > ends up either:
> >
> > 1. it's ignored and patches sit in review for weeks, months or years and
> > vanish or
> > 2. it's gamed because everything has to go through it, it's minimized to try
> > and remove the impact. people just vote things up without real review etc.
> >
> > if you automated the voting to a build check instead of humans, you'd need
> > to find some way to do that and have a test build bot vote and do it FAST.
> > that means you need enough infra for a build per commit and it has to be
> > totally reliable. the build env and the tests. is that really possible?
> > then if you don't do them in strict order you end up with conflicts, and if
> > some are rejected from a push with multiple commits, you have dependent
> > commits...
> >
> > i don't see how this ends up better. it ends up worse i think.
> >
> >> I am willing to admit that the approach I used to reach my goals might have
> >> been flawed and simply failed. Someone else might want to pick up the
> >> slack on it.
> > i really don't think we have a lot of break issues given size and
> > complexity. not build breaks anyway.
> 
> See the list above just for the four I see today.

well i guess i disabled cxx bindings so i stop seeing those, but i do see
things building cross-compile for windows and on freebsd as well as arm32
regularly and don't run into build issues there.

indeed your examples are for systems i don't have builds or build infra for.

> >  right now if jenkins detects a build break... how does
> > a developer know? it can take hours before it detects something. should they
> > sit hitting reload on the browser for the next hour hoping one of the builds
> > going contains their commit? we have a broken feedback cycle. jenkins should
> > mail the mailing list with "commit X broke build: log etc. link here". if a
> > build breaks, jenkins should go back commits until it finds a working one
> 
> Oh, now we even expect automated git bisecting from it? :-)

i was even thinking a simple "roll back 1 at a time brute force until found" :)
nothing as fancy as a bisect! :) assuming 
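
a minimal sketch of that brute-force walk-back, under stated assumptions:
commits are given newest first, and builds_ok is a hypothetical callable that
checks out and builds one commit and returns True on success. nothing here
is an existing jenkins feature.

```python
def find_breaking_commit(commits, builds_ok):
    """Walk back one commit at a time from the newest.

    commits   -- commit ids, newest first
    builds_ok -- callable(commit) -> bool (hypothetical build check)

    Returns the commit right after the most recent one that builds, or None
    if the newest commit already builds. If no commit in the list builds,
    the oldest one is returned, since the real culprit is further back.
    """
    breaking = None
    for commit in commits:
        if builds_ok(commit):
            break
        breaking = commit
    return breaking


if __name__ == "__main__":
    history = ["c5", "c4", "c3", "c2", "c1"]  # newest first
    broken = {"c5", "c4", "c3"}               # stand-in for real builds
    print(find_breaking_commit(history, lambda c: c not in broken))
```

this costs one build per commit walked back, where a bisect would need far
fewer, but it is simple and it stops at the first working commit it finds.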

> >  then
> > report the commit that broke. at least devs would get a notification.
> 
> These mails are getting sent to the e-b0rk mailing list as I have been asked
> to not send them to the main development list.

i totally missed this list. i guess that's why i don't know. :) i should
subscribe. i imagine the lack of knowing about this might be part of the
problem.

> >  unless i
> > sit staring at build.e.org with reloads or someone tells me, i just
> > have no idea a build would have broken.
> >
> > i think we've talked about this many times before... :)
> >
> 
> We did and we both had many of the same arguments before. :-)
> 
> I reached the point where I have no motivation left for doing CI/QA work,
> though. Thus I am going to drop it and hope someone else picks it up.

:( 

-- 
------------- Codito, ergo sum - "I code, therefore I am" --------------
Carsten Haitzler - [email protected]


