Hi all,
There have been a couple of threads related to test automation in B2G,
asking why we haven't caught some especially egregious regressions; the
kind that basically "break the phone".
To answer that, I'd like to describe how our on-device automation
currently works, and what we're doing to expand it so we can more
effectively address these concerns.
We currently have a smallish number of real devices, managed by WebQA,
hooked up to on-device automation. They run a bank of tests against a
number of branches several times a day. The devices are time-consuming
to manage, since they occasionally get wedged during flashing,
rebooting, or other operations, and require manual intervention to fix.
For this reason, and because it's been very difficult to obtain
significant numbers of devices for automation, we haven't been able to
run any tests frequently enough to provide per-commit coverage.
When tests fail, WebQA engages in a fairly time-consuming process of
investigation and bisection. In the case of the homescreen breakage
(caused by https://bugzilla.mozilla.org/show_bug.cgi?id=957086), our
on-device tests did fail, and the team was investigating those
failures, which is a necessary step before we can file specific,
actionable bugs.
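(For the curious: the core of that bisection is just a binary search
over builds. Here's a minimal sketch in Python, assuming a hypothetical
run_smoke_test() helper that flashes a device with a given build and
reports pass/fail; the real process is manual and much slower, since
every step means reflashing a phone.)

    def bisect_regression(builds, run_smoke_test):
        """Find the first bad build in a list ordered oldest to newest.

        builds[0] must be known-good and builds[-1] known-bad.
        """
        good, bad = 0, len(builds) - 1
        while bad - good > 1:
            mid = (good + bad) // 2
            if run_smoke_test(builds[mid]):
                good = mid  # still passing: the regression landed later
            else:
                bad = mid   # already failing: regression is here or earlier
        return builds[bad]  # the first build where the smoke test fails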
Clearly, what we really want is to be able to run at least a small set
of tests per-commit, so that when things break, we don't need to spend
lots of time investigating: we will already know which commit caused
the problem, and can promptly back it out or otherwise address it.
That's exactly what we are planning for Q3, thanks to the Flame device.
Jonathan Hylands has developed a power harness for the Flame that
allows us to remotely restart phones, which addresses some of the
device management concerns. The A*Team, WebQA, and jhylands are working
together to get 30 Flames in automation, and to reduce their management
costs. This is enough to allow us to run a small set of functional and
performance tests per-commit, which should be enough to catch most
"break the phone" problems.
Another issue we've had with device testing is test result visibility:
currently, results are only available on a Jenkins instance that
requires VPN access, which is awkward for anyone not closely involved
in maintaining and running the tests.
Next quarter, we will be improving this as well. Jonathan Eads on the
A*Team is deploying Treeherder, a successor to TBPL. Unlike TBPL,
Treeherder is not tightly coupled to buildbot, and can display test
results from arbitrary data sources.
As our bank of 30 Flames becomes available, we will start publishing
on-device test results to Treeherder, in the same UI that will be used
to display the per-commit tests being run in buildbot. This will give
people a "one-stop shop" for seeing test results for B2G, regardless of
whether they're run on devices or in VMs managed by buildbot.
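Since Treeherder just ingests structured result data, a device harness
only needs to build a small job blob and submit it per run. As a rough
illustration (the endpoint and payload shape below are placeholders,
not Treeherder's actual submission API):

    import json
    import urllib2

    # Placeholder endpoint and payload shape -- illustrative only, not
    # Treeherder's real submission API.
    TREEHERDER_URL = "https://treeherder.example.com/api/jobs/"

    def submit_result(revision, job_name, result, log_url):
        payload = {
            "revision": revision,  # the B2G commit the device was flashed with
            "job_name": job_name,  # e.g. "flame-smoketest"
            "result": result,      # e.g. "success", "testfailed", "busted"
            "log_url": log_url,    # where the full device log lives
        }
        req = urllib2.Request(TREEHERDER_URL, json.dumps(payload),
                              {"Content-Type": "application/json"})
        urllib2.urlopen(req)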
Together, these two pieces will let us manage some on-device tests much
as we currently handle desktop and emulator tests in TBPL: especially
bad commits should break tests, the breakage should be visible in
Treeherder, and the sheriffs will back out the offending commits.
We won't have enough device capacity to run all device tests
per-commit, at least at first, so we'll have to carefully select a
small set of tests that guard against the worst kinds of breakage.
Whether we can scale beyond 30 devices will depend on how stable they
prove and what their management costs turn out to be, which we'll be
evaluating over the next few months.
Regards,
Jonathan