I always wondered whether this could be done with some statistical analysis of data from the crash-stats database, for example calculating the mean uptime grouped by build/device.

But it'd be far from what's in CI; most devices in the wild seem to be in the 1.2 - 1.4 range.
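
Something along these lines is what I have in mind (a rough pandas sketch; the CSV export, file name, and column names are assumptions, not the actual crash-stats schema):

# Sketch: mean uptime per build/device from a crash-stats export.
# Assumes a hypothetical CSV dump with columns: build, device, uptime
# (uptime in seconds at the moment of the crash).
import pandas as pd

# Hypothetical export, e.g. pulled from the crash-stats web UI.
crashes = pd.read_csv("crash_reports.csv")

# Mean uptime before crash, grouped by build and device.
uptime = (
    crashes.groupby(["build", "device"])["uptime"]
    .agg(["mean", "count"])
    .sort_values("mean")
)

# Builds/devices with the lowest mean uptime are the least stable.
print(uptime.head(20))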

On 06/06/14 15:11, Jonathan Griffin wrote:
Yes, one area I could see this being helpful is stability / endurance tests. Those tests aim to exercise the phone in ways that can stress it and induce crashes. Crowdsourcing this testing might be helpful because it would allow us to run in a variety of environments we can't/won't reproduce in the lab, and we could instrument the tests in ways that wouldn't be destructive to user data in most cases.

We don't have the infrastructure to support this yet, but it's something we could look at building out.

Jonathan


On 6/6/14 9:33 AM, Armen Zambrano G. wrote:
It is possible to make it happen.

The main questions I have are:
* What specifically do we gain? What are we trying to fix?
* What tests would make sense? (jgriffin mentioned endurance tests)
* What do we do with the results?

I don't want to dampen anyone's energy; however, these questions need
proper answers.

cheers,
Armen

On 14-06-05 07:36 PM, Gareth Aye wrote:
I have actually thought about this before! Most test suites need to start from a very "clean" slate with specific versions of the os and apps. Many test suites also make modifications to the phone state. Unless we backed up
users' data, reinstalled the operating system, ran the tests, and then
restored the users' data, I think it wouldn't work out. That sounds like a
lot of effort and the potential to run into a lot of privacy/security
issues. Not to say it's not worthwhile (perhaps it'd be more
environmentally friendly to use our community's phones sometimes
instead of purchasing specialized hardware for our tests), but we should
definitely weigh the costs and benefits since there might be a lot of
complexity/engineering under the hood.
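
To make that concrete, the cycle would look roughly like this (a sketch only; the flash script, profile path, and test command are illustrative assumptions, not our real tooling):

# Sketch of the backup -> reflash -> test -> restore cycle described
# above. The flash script, data path, and test command are assumptions
# for illustration; the real backup/restore step is exactly where the
# privacy/security issues would live.
import subprocess

USER_DATA = "/data/b2g"            # assumed location of user profile data
BACKUP_DIR = "./user-data-backup"

def run(*cmd):
    subprocess.check_call(list(cmd))

# 1. Back up the user's data off the phone.
run("adb", "pull", USER_DATA, BACKUP_DIR)

# 2. Flash a known-clean build (hypothetical flash script).
run("./flash.sh")

# 3. Run the test suite against the clean image (illustrative command).
run("gaiatest", "--testvars", "testvars.json", "tests/")

# 4. Put the user's data back and reboot.
run("adb", "push", BACKUP_DIR, USER_DATA)
run("adb", "reboot")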


On Thu, Jun 5, 2014 at 5:20 PM, Natalia Martinez-Winter <[email protected]> wrote:
Stop me if I'm totally out of scope... would it make sense in the future to have a (peer-to-peer?) solution to run tests on devices across the world (especially from the community, with tests running at night when the devices are not being used)?


On Tue Jun 3 20:23:39 2014, Jonathan Griffin wrote:

Hi all,

There have been a couple of threads related to test automation in B2G,
asking why we haven't caught some especially egregious regressions;
the kind that basically "break the phone".

To answer that, I'd like to describe how our on-device automation
currently works, and what we're doing to expand it so we can more
effectively address these concerns.

We currently have a smallish number of real devices, managed by WebQA,
hooked up to on-device automation. They run a bank of tests against a
number of branches several times a day. The devices are
time-consuming to manage, since they occasionally get wedged during
flashing, rebooting, or other operations, and require manual
intervention to fix. For this reason, and because it's been very
difficult to obtain significant numbers of devices for automation, we
haven't been able to run any tests frequently enough to provide
per-commit coverage.

When tests fail, WebQA engages in a fairly time-consuming process of
investigation and bisection. In the case of the homescreen breakage
(caused by https://bugzilla.mozilla.org/show_bug.cgi?id=957086), our
on-device tests did break, and the team was in the process of
investigating these failures, which is necessary in order to file
specific, actionable bugs.
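
(For reference, the bisection itself is just a binary search over the
build history; something like the sketch below, where flash_and_test is
a placeholder for flashing a candidate build and re-running the failing
test.)

# Sketch of bisecting a regression range over an ordered list of builds.

def flash_and_test(build_id):
    """Placeholder: flash `build_id` onto the device, run the failing
    test, and return True if the test passes."""
    raise NotImplementedError

def bisect(builds):
    """`builds` is ordered oldest -> newest; assumes builds[0] is good
    and builds[-1] is bad. Returns the first bad build."""
    lo, hi = 0, len(builds) - 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if flash_and_test(builds[mid]):
            lo = mid   # still good; regression landed later
        else:
            hi = mid   # already bad; regression is here or earlier
    return builds[hi]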

Clearly, what we really want is to be able to run at least a small set
of tests per-commit, so that when things break, we don't need to spend
lots of time investigating; we will already know which commit caused
the problem, and can back it out or otherwise address it promptly.

That's exactly what we are planning for Q3, thanks to the Flame
device. Jonathan Hylands has developed a power harness for this that
allows us to remotely restart the phone, which addresses some of the
device management concerns. The A*Team, WebQA, and jhylands are
working together to get 30 Flames in automation, and to reduce their
management costs. This is enough to allow us to run a small set of
functional and performance tests per-commit, which should be enough to
catch most "break the phone" problems.

Another issue we've had with device testing is test result visibility;
currently, test results are available on Jenkins, for which you need
VPN access. This is awkward for people not closely involved in
maintaining and running the tests.

Next quarter, we will be improving this as well. Jonathan Eads on the
A*Team is currently in the process of deploying Treeherder, a
successor to TBPL. Unlike TBPL, Treeherder is not tightly coupled
with buildbot, and is capable of displaying test results from
arbitrary data sources. As our bank of 30 Flames becomes available, we
will start publishing on-device test results to Treeherder, in the
same UI that will be used to display the per-commit tests being run in
buildbot. This will give people a "one-stop shop" for seeing test
results for B2G, regardless of whether they're run on devices or in
VMs managed by buildbot.
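
(As a sketch of what publishing might look like: since Treeherder
ingests results from arbitrary sources, a device run could post a small
job blob to its API. The endpoint, payload fields, and auth below are
illustrative assumptions, not Treeherder's actual schema.)

# Sketch: pushing a device test result into Treeherder. The URL and
# payload fields are illustrative assumptions only; the real submission
# API defines its own job schema and authentication.
import json
import uuid
import requests

TREEHERDER_URL = "https://treeherder.example.org/api/jobs/"  # hypothetical

job = {
    "job_guid": str(uuid.uuid4()),
    "revision": "abc123def456",        # the gecko/gaia commit tested
    "job_name": "Flame device smoketests",
    "result": "testfailed",            # or "success"
    "machine": "flame-07",
    "log_url": "https://example.org/logs/flame-07/run-42.txt",
}

resp = requests.post(
    TREEHERDER_URL,
    data=json.dumps(job),
    headers={"Content-Type": "application/json"},
)
resp.raise_for_status()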

Both of these pieces together will give us the ability to manage some
on-device tests in a manner similar to the way we currently handle
desktop and emulator tests in TBPL: especially bad commits should
break tests, the breakage should be visible in Treeherder, and the
sheriffs will back out the offending commits.

We won't have enough device capacity to run all device tests
per-commit, at least at first. We'll have to carefully select a small
set of tests that guard against the worst kinds of breakage. Whether
we can scale beyond 30 devices will depend on how stable the devices
are and what their management costs are, which is something we'll be
looking at over the next few months.

Regards,

Jonathan


--
Natalia Martinez-Winter

Channel Marketing Manager
Firefox OS
+33 6 88811959
[email protected]
@martinezwinter
