Comments inline.

On Saturday, August 31, 2013 1:47:40 PM UTC-7, Jonas Sicking wrote:
> Hi All,
> 
> We keep having problems where various of the platforms that our developers are using aren't working. The problems range from application launching not working, all the way to the builds not even compiling.

Note - this isn't just a developer problem. This is widespread to anyone who 
regularly has to test on B2G. For example, QA deals with this problem as well.

> This is a really bad problem and one we have to fix *now*. Long term solutions aren't interesting since this is actively preventing people from writing code and fixing blockers.
> 
> To fix this, we need to identify which platforms developers are actually relying on, and then ensure that we have tests that at least cover the basics of a running build. As much as possible these tests should run on every single checkin to any of the relevant source trees, so that when something breaks it's detected and immediately backed out. For now, the best tool we have for doing this is TBPL. Other test suites are simply not run when, for example, someone checks in a gecko change. So TBPL has to be our main test runner for automated tests.

To broaden the analysis here - I think we need to identify which platforms each 
major stakeholder relies on, and which of those are feasible to support - not 
just development. We can't rely solely on what development is actively using if 
it's not feasible to maintain that environment. For example, I know there's 
development going on that's actively making use of Unagi & Keon devices for 
device testing. However, Unagi is not a supported environment in 1.2 or later. 
It's also showing signs of expected stability problems, so it will eventually 
become an ineffective environment for device testing.

As for automated tests specifically - TBPL isn't all that we have. We've got an 
established Jenkins instance for running on-device automation that's actively 
maintained. There will definitely be failures that are only caught in Jenkins 
and not seen in TBPL, so I think both areas need to be analyzed effectively. 
Otherwise, we risk missing device-specific critical issues that can block 
active device use for each stakeholder.

As for backouts - this is something we in general need to get better at on the 
Firefox OS side for issues found in automation & manual smoketests - especially 
on the Gaia side. The process right now is a bit haphazard, occasionally 
leaving builds broken for part of a day until someone (e.g. QA, dev) chases 
down the root cause of the problem and backs out the regressing patch.
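The mechanical part of a backout is cheap; the expensive part is identifying the regressing commit. A minimal sketch of the revert step, demonstrated in a throwaway repo so it's self-contained (the repo, file, and commit messages here are hypothetical; in practice the offending sha comes from bisecting the regression window):

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "[email protected]"
git config user.name "QA"

echo "good" > app.js
git add app.js
git commit -qm "known-good state"

echo "regression" > app.js
git commit -qam "regressing patch"

# Back out the regressing patch with a new commit, preserving history.
git revert --no-edit HEAD >/dev/null
cat app.js   # -> good
```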

> A group of developers that work on various parts of the code (gaia, gecko and gonk) got together this week to try to bootstrap this conversation. This email is intended to broaden that conversation.

I'm going to put out the suggestion that you should include some of the QA team 
that's involved in smoketest management & the automation associated with it. We 
deal with these problems daily.

> A few ground rules:
> * We have very limited resources as far as getting automated tests up and running. So we should be very picky about which platforms we choose to maintain. And we should prioritize them for maximum bang for our buck.
> * We can't rely on things that are planned to be fixed a few months out. We need to get into good shape with this *now*. Discussions about what to do long term are off topic for this thread. But please do start separate threads about them.
> * We should first focus on standing up tests on mozilla-central+gaia-master. When we have that done, we should look at adding support for other branches that are still being actively developed.

One important note here: in order for this to work effectively, we need an 
established strategy to ensure that these tests are actively maintained on 
those branches - that includes reducing the intermittent failure rate.

> The platforms that we concluded are currently relevant to support are:
> 
> ***B2G-desktop out-of-process***
> Used by gecko devs that are testing and developing out-of-process APIs, i.e. most of our gecko devs.
> Need to support both optimized and debug builds.
> 
> ***B2G-desktop in-process***
> Used by 3rd party devs since our B2G-desktop OOP builds aren't working well enough. This is what the simulator product is built on. Possibly used by some gaia devs.
> The firefox developer tools should (as of recently?) be working well with this build.
> Since this is only used by non-gecko and non-gonk developers, we can probably get away without support for debug builds here.
> 
> ***Firefox + js-API-shims***
> This runs gaia inside of a standard Firefox-Desktop build. The gaia build here is somewhat special since it creates non-packaged gaia apps.
> This is used by gaia developers because it provides the best debugger support, including support for Firebug.
> Since this is only used by non-gecko and non-gonk developers, we can probably get away without support for debug builds here.
> 
> ***Hardware Nexus4***
> This is something that everyone needs, i.e. all types of developers need the ability to build for and test on hardware.
> We chose Nexus 4 because it's the closest thing we have to a reference platform right now. The real reference platform might be different, but we can adjust once the reference platform is picked.
> Need to support both optimized and debug builds.

I don't think this silver bullet strategy is a good idea. The device to use 
highly depends on what the developer is actively developing, in alignment with 
the active stability strategies around those devices. For example, for 1.2, if 
you do not need JB support in your on-device testing, I'd recommend Buri as a 
good development platform for on-device testing, as 1) it is a supported device 
for 1.2 and 2) smoketests-wise, it's relatively usable outside of known gfx 
issues. Nexus 4 at the moment has stability problems in comparison to Buri (see 
the associated daily smoketest reports), so I don't think it's a good general 
solution right now anyway.

A different example is developers that deal with hardware dependencies as part 
of their work (e.g. camera, hardware decoding). Their work might require 
several different devices in order to effectively test their solutions.

> ***ARM Emulator JB+ICS***
> We need this mostly for automated testing. Automated testing on hardware is *very* challenging, and while it's something we're working on improving, the emulator builds are what enable the best testing *now*.

Agreed, automated testing on hardware is challenging in both development and 
maintainability, but do note that we have a solution actively maintained by the 
WebQA team. There will be cases where automation provides better value on 
device even for development, especially when testing hardware-dependent 
features. In those cases, I think it's a good idea to talk with Zac about this.

> Debug builds might run so slow here that it's not really worth testing. Though maybe doing some very basic tests with debug builds would be a good idea.
> 
> So how do we test these?
> 
> ***B2G-desktop in-process/out-of-process***
> Testing-wise the B2G desktop in-process and out-of-process platforms are basically the same. We already have a few testsuites (mochitest, reftest, gaia UI?) running for in-process B2G-desktop, though they do not appear to be enabled on the TBPL front page. The gaia integration test suite should be coming online soon.
> 
> We need to ensure that UI events are getting properly tested. That means that we need to dispatch the same events as when a developer loads one of these builds and starts interacting with them using a mouse. We need to ensure that that is what the marionette test tool does.
> 
> Apparently we have some addon that is loaded into these builds which changes the mouse events that gecko dispatches into touch events that gaia will understand. It's unclear if this addon is always added to these builds or if that only happens under some circumstances. We need to ensure that this addon is always loaded. And we need to ensure that when marionette does testing, the events go through this addon the same way as when a developer is running a build.
> 
> How well are the automated gaia UI tests covering things like having the basics up and running? Mochitest and reftest mostly test gecko, so they don't actually ensure that gaia isn't completely busted.

That's a good question for Zac and Stephen.

> The main thing preventing out-of-process tests from moving forward is that the out-of-process builds simply aren't working well enough. GFX and UI events have traditionally been the blockers here. We should immediately start fixing this so as to get the out-of-process builds tested and onto TBPL.
> 
> Given the GFX issues, it's important that we have some way of testing that the right pixels are actually ending up on screen. I'm not sure how well the reftests are testing this?

We should plan on discussing the GFX-specific stability problem as a breakout 
session at the work week. QA is really concerned about whether we have the 
right tests here to catch critical gfx issues early.

> *** Hardware ***
> Automated testing here is unfortunately very challenging. The hardware tends to crap out every few times that we flash it, and someone needs to manually pull out the battery to un-crap it. We also don't yet have the ability to test things like telephony/bluetooth or other radio hardware, possibly with the exception of WiFi.
> 
> So all in all, we have very low bang for the buck here. I suggest that for now we only automatically test building for hardware, but not actually do automated testing on hardware. Eventually we should improve here, but that's too far out to be in scope for this thread.

You should talk more with Zac about this to see what direction we should take 
for on-device automation. I don't think we should lose sight of it - the Gaia 
UI automation has found really bad regressions that only happen on device, so 
there's definitely value here.

> *** ARM Emulator JB+ICS ***
> We already have reftests, mochitests, crashtests and WebAPI tests up and running here for ICS. So we need to stand up the same tests for JB.
> We also need to expand the set of mochitests (and maybe reftests/crashtests?) that we are running. The current set is horribly small.
> 
> Another problem here is that debug builds are just too damn slow. So we're currently only running optimized builds. This is a big problem because it means that debug builds often don't work on hardware. What can we do to improve this?
> 
> Two things that could help are:
> * Create optimized builds that are compiled with DEBUG enabled. That way we still catch assertions and we still ensure that all the gecko debug code compiles and runs. We're just not catching compiler issues. Though those are rare, and since these aren't builds that ship to end users, that's probably not a big problem.
> * We can create a smaller set of tests to run than the full set that we are running for normal optimized builds.
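For the first suggestion, a minimal mozconfig sketch using the standard Gecko configure flags (whether the emulator build scripts honor a plain mozconfig unchanged is an assumption):

```shell
# Hypothetical mozconfig for an optimized build with DEBUG enabled:
# --enable-debug turns on assertions and #ifdef DEBUG code paths, while
# --enable-optimize keeps the build fast enough for emulator test runs.
ac_add_options --enable-optimize
ac_add_options --enable-debug
```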
> ***Firefox + js-API-shims***
> We didn't end up talking about testing these builds that much. There is some hope that we can get developer tools for B2G desktop builds into good enough shape soon enough that spending time on creating test suites for these builds won't be needed.
> 
> What do people think? If this sounds good we need to get bugs filed for anything that's lacking bugs and find assignees. Getting this stuff in order should be top priority after leo blockers, as the lack of good testing is slowing us down *a lot*.
> 
> / Jonas
> 
> PS. I'm heading out on vacation without internet connection until Sep 9th. But hopefully lots of conversation can happen without me.
_______________________________________________
dev-b2g mailing list
[email protected]
https://lists.mozilla.org/listinfo/dev-b2g
