Re: PSA: Python min version bumped to 3.6 for building gecko

2020-06-10 Thread Chris AtLee
The pyenv[1] project is a great way to manage multiple versions of Python
on your system. I've found it easier than compiling Python directly from
source.
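
For example, something like "pyenv install 3.6.10" followed by "pyenv local
3.6.10" inside your mozilla-central checkout (the exact version number is
just an example) should give you a suitable python3.6 on PATH for that
directory.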

Cheers,
Chris

[1] https://github.com/pyenv/pyenv

On Wed, 10 Jun 2020 at 16:52, Kartikaya Gupta  wrote:

> For those of you who like me are still running Ubuntu 16.04 LTS: the
> minimum version of python required to build gecko got bumped from 3.5
> to 3.6. As Ubuntu 16.04 doesn't offer python3.6 out of the box, you
> may need to build it from source to get going again. See
> https://bugzilla.mozilla.org/show_bug.cgi?id=1644845#c10 for steps
> that worked for me.
>
> Cheers,
> kats
> ___
> dev-platform mailing list
> dev-platform@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-platform
>
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Proposal to adjust testing to run on PGO builds only and not test on OPT builds

2019-01-03 Thread Chris AtLee
Thank you Joel for writing up this proposal!

Are you also proposing that we stop the linux64-opt and win64-opt builds as
well, except for leaving them as an available option on try? If we're not
testing them on integration or release branches, there doesn't seem to be
much purpose in doing the builds.

On Thu, 3 Jan 2019 at 11:20, jmaher  wrote:

> I would like to propose that we do not run tests on linux64-opt,
> windows7-opt, and windows10-opt.
>
> Why am I proposing this:
> 1) Test regressions found on trunk are mostly on debug, and in fewer cases
> on PGO.  There are no unique regressions found in the last 6 months (all
> the data I looked at) that are exclusive to OPT builds.
> 2) On mozilla-beta, mozilla-release, and ESR, we only build/test PGO
> builds; we do not run tests on plain OPT builds.
> 3) This will reduce the jobs we run by about 16%, which in turn reduces
> CPU time, money spent, turnaround time, intermittents, and the complexity
> of the taskgraph.
> 4) PGO builds are very similar to OPT builds, but we add flags to generate
> profile data and make small adjustments to the build scripts behind the
> MOZ_PGO flag in-tree; then we launch the browser, collect data, and repack
> our binaries for faster performance.
> 5) We ship PGO builds, not OPT builds
>
> What are the risks associated with this?
> 1) try server build times will increase as we will be testing on PGO
> instead of OPT
> 2) we could miss a regression that only shows up on OPT, but since we only
> ship PGO, and we do not build OPT once we leave central, this is a very
> low risk.
>
> I would like to hear any concerns you might have on this or other areas
> which I have overlooked.  Assuming there are no risks which block this, I
> would like to have a decision by January 11th, and make the adjustments on
> January 28th when Firefox 67 is on trunk.
> ___
> dev-platform mailing list
> dev-platform@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-platform
>
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Launch of Phabricator and Lando for mozilla-central

2018-06-06 Thread Chris AtLee
This is really great news; I'm excited to start using it!

Automated landing from code review is such a game changer for
productivity and security.

Congrats to everyone involved.

Cheers,
Chris

On Wed, 6 Jun 2018 at 11:01, Mark Côté  wrote:
>
> The Engineering Workflow team is happy to announce the release of Phabricator 
> and Lando for general use. Going forward, Phabricator will be the primary 
> code-review tool for modifications to the mozilla-central repository, 
> replacing both MozReview and Splinter. Lando is an all-new automatic-landing 
> system that works with Phabricator. This represents about a year of work 
> integrating Phabricator with our systems and building out Lando. Phabricator 
> has been in use by a few teams since last year, and Lando has been used by 
> the Engineering Workflow team for several weeks and lately has successfully 
> landed a few changesets to mozilla-central.
>
> Phabricator is a suite of applications, but we are primarily using the 
> code-review tool, called Differential, which will be taking the place of 
> MozReview and Splinter. Bug tracking will continue to be done with Bugzilla, 
> which is integrated with Phabricator. You will log into Phabricator via 
> Bugzilla. We will soon begin sunsetting MozReview, and Splinter will be made 
> read-only (or replaced with another patch viewer). An upcoming post will 
> outline the plans for the deprecation, archival, and decommission of 
> MozReview, with Splinter to follow.
>
> I also want to thank Phacility, the company behind Phabricator, who provided 
> both excellent support and work on Phabricator itself to meet our 
> requirements in an exceptionally helpful and responsive way.
>
> User documentation on Phabricator catered specifically to Mozillians can be 
> found at https://moz-conduit.readthedocs.io/en/latest/phabricator-user.html. 
> It is also linked from within Phabricator, in the left-hand menu on the home 
> page.
>
> User documentation on Lando can be found at 
> https://moz-conduit.readthedocs.io/en/latest/lando-user.html.
>
> MDN documentation is currently being updated.
>
> At the moment, Phabricator can support confidential revisions when they are 
> associated with a confidential bug, that is, a bug with one or more security 
> groups applied. Lando, however, cannot currently land these revisions. This 
> is a limitation we plan to fix in Q3. You can follow 
> https://bugzilla.mozilla.org/show_bug.cgi?id=1443704 for developments. See 
> http://moz-conduit.readthedocs.io/en/latest/phabricator-user.html#landing-patches
>  for our recommendations on landing patches in Phabricator without Lando.
>
> Similarly, there are two other features which are not part of initial launch 
> but will follow in subsequent releases:
> * Stacked revisions. If you have a stack of revisions, that is, two or more 
> revisions with parent-child relationships, Lando cannot land them all at 
> once.  You will need to individually land them. This is filed as 
> https://bugzilla.mozilla.org/show_bug.cgi?id=1457525.
> * Try support. Users will have to push to the Try server manually until this 
> is implemented. See https://bugzilla.mozilla.org/show_bug.cgi?id=1466275.
>
> Finally, we realize there are a few oddities with the UI that we will also be 
> fixing in parallel with the new features. See 
> https://bugzilla.mozilla.org/show_bug.cgi?id=1466120.
>
> The documentation lists several ways of getting in touch with the Engineering 
> Workflow team, but #phabricator and #lando on IRC are good starting points.
>
> ___
> dev-platform mailing list
> dev-platform@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-platform
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Removing tinderbox-builds from archive.mozilla.org

2018-05-31 Thread Chris AtLee
On Tue, 29 May 2018 at 14:21, L. David Baron  wrote:
>
> On Monday 2018-05-28 15:52 -0400, Chris AtLee wrote:
> > Here's a bit of a strawman proposal...What if we keep the
> > {mozilla-central,mozilla-inbound,autoland}-{linux,linux64,macosx64,win32,win64}{,-pgo}/
> > directories in tinderbox-builds for now, and delete all the others? Does
> > that cover the majority of the use cases for wanting to access these old
> > builds?
> >
> > I'm guessing the historical builds for old esr branches aren't useful now.
> > Nor are the mozilla-aurora, mozilla-beta, mozilla-release, or b2g-inbound
> > builds.
>
> This seems reasonable to me, with the one caveat that I think
> b2g-inbound belongs in the other bucket.  It was essentially used as
> another peer to mozilla-inbound and autoland, and while many of the
> changes landed there were b2g-only, many of them weren't, and may
> have caused regressions that affect products that we still maintain.

Ok, we can do that.

For mobile, I haven't heard anybody express a desire to keep around
old CI builds in
https://archive.mozilla.org/pub/mobile/tinderbox-builds/, so I'm
planning to have those deleted in July.
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Removing tinderbox-builds from archive.mozilla.org

2018-05-28 Thread Chris AtLee
On Sun, 20 May 2018 at 19:40, Karl Tomlinson  wrote:

> On Fri, 18 May 2018 13:13:04 -0400, Chris AtLee wrote:

> > IMO, it's not reasonable to keep CI builds around forever, so the
> > question is then how long to keep them? 1 year doesn't quite cover a
> > full ESR cycle; would 18 months be sufficient for most cases?
> >
> > Alternatively, we could investigate having different expiration policies
> > for different types of artifacts. My assumption is that the Firefox
> > binaries for the opt builds are the most useful over the long term, and
> > that other build configurations and artifacts are less useful. How
> > accurate is that assumption?

> Having a subset of builds around for longer would be more useful
> to me than having all builds available for a shorter period.

> The nightly builds often include large numbers of changesets,
> sometimes collected over several days, and so it becomes hard to
> identify which code change modified a particular behavior.

> I always use opt builds for regression testing, and so your
> assumption is consistent with my experience.

> I assume there are more pgo builds than nightly builds, but fewer
> than all opt builds.  If so, then having a long expiration policy
> on pgo builds could be a helpful way to reduce storage costs but
> maintain the most valuable builds.

Here's a bit of a strawman proposal...What if we keep the
{mozilla-central,mozilla-inbound,autoland}-{linux,linux64,macosx64,win32,win64}{,-pgo}/
directories in tinderbox-builds for now, and delete all the others? Does
that cover the majority of the use cases for wanting to access these old
builds?

I'm guessing the historical builds for old esr branches aren't useful now.
Nor are the mozilla-aurora, mozilla-beta, mozilla-release, or b2g-inbound
builds.
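
For concreteness, here's a tiny sketch (illustrative only; the actual cleanup
would be done by whatever tooling manages the bucket) of how that keep-list
expands:

    from itertools import product

    branches = ["mozilla-central", "mozilla-inbound", "autoland"]
    platforms = ["linux", "linux64", "macosx64", "win32", "win64"]
    variants = ["", "-pgo"]

    # Directory names under tinderbox-builds/ that would be kept;
    # everything else would be eligible for deletion.
    keep = {"{}-{}{}".format(b, p, v)
            for b, p, v in product(branches, platforms, variants)}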
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Removing tinderbox-builds from archive.mozilla.org

2018-05-18 Thread Chris AtLee
The discussion about what to do about these particular buildbot builds has
naturally shifted into a discussion about what kind of retention policy is
appropriate for CI builds.

I believe that right now we keep all CI build artifacts for 1 year. Nightly
and release builds are kept forever. There's certainly an advantage to
keeping the CI builds, as they assist in bisecting regressions. However,
they become less useful over time.

IMO, it's not reasonable to keep CI builds around forever, so the question
is then how long to keep them? 1 year doesn't quite cover a full ESR cycle;
would 18 months be sufficient for most cases?

Alternatively, we could investigate having different expiration policies
for different types of artifacts. My assumption is that the Firefox binaries
for the opt builds are the most useful over the long term, and that other
build configurations and artifacts are less useful. How accurate is that
assumption?

Archiving these artifacts into Glacier would cut the cost of storing them
significantly, but also make them much harder to access. It can take 3-5
hours to retrieve objects from Glacier, and we would need to implement some
API or process to request access to archived objects.
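
For illustration, a restore request for a Glacier-archived S3 object might
look roughly like this (the bucket and key are made up, and this assumes
boto3; it isn't a description of our actual tooling):

    import boto3

    s3 = boto3.client("s3")
    # Ask S3 to temporarily restore an archived object. Retrieval is
    # asynchronous and can take hours, so callers have to poll or come back.
    s3.restore_object(
        Bucket="example-archive-bucket",
        Key="pub/firefox/tinderbox-builds/mozilla-central-linux64/example.tar.bz2",
        RestoreRequest={"Days": 7, "GlacierJobParameters": {"Tier": "Standard"}},
    )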


On Thu, 17 May 2018 at 10:33, Mike Kaply  wrote:

> Can we move the builds temporarily and see if it affects workflows over a
> few months and if not, then remove them?
>
> Mike
>
> On Thu, May 17, 2018 at 9:22 AM, Tom Ritter  wrote:
>
> > I agree with ekr in general, but I would also be curious to discover
> > what failures we would experience in practice and how we could
> > overcome them.
> >
> > I think many of the issues experienced with local builds are
> > preventable by doing a TC-like build; just build in a docker container
> > (for Linux/Mac) and auto-build any toolchains needed. (Which would be
> > part of bisect in the cloud automatically.) I've been doing this
> > locally lately and it is not a friendly process right now though.
> >
> > Of course on Windows it's an entirely different story. But one more
> > reason to pursue clang-cl builds on Linux ;)
> >
> > -tom
> >
> >
> > On Tue, May 15, 2018 at 12:53 PM, Randell Jesup 
> > wrote:
> > >>On 5/11/18 7:06 PM, Gregory Szorc wrote:
> > >>> Artifact retention and expiration boils down to a
> > >>> trade-off between the cost of storage and the convenience of
> accessing
> > >>> something immediately (as opposed to waiting several dozen minutes to
> > >>> populate the cache).
> > >>
> > >>Just to be clear, when doing a bisect, one _can_ just deal with local
> > >>builds.  But the point is that then it takes tens of minutes per build
> as
> > >>you point out.  So a bisect task that might otherwise take 10-15
> minutes
> > >>total (1 minute per downloaded build) ends up taking hours...
> > >
> > > Also (as others have pointed out) going too far back (often not that
> > > far) may run you into tool differences that break re-building old revs.
> > > Hopefully you don't get variable behavior, just a failure-to-build at
> > > some point.  I'm not sure how much Rust has made this worse.
> > >
> > > --
> > > Randell Jesup, Mozilla Corp
> > > remove "news" for personal email
> > > ___
> > > dev-platform mailing list
> > > dev-platform@lists.mozilla.org
> > > https://lists.mozilla.org/listinfo/dev-platform
> > ___
> > dev-platform mailing list
> > dev-platform@lists.mozilla.org
> > https://lists.mozilla.org/listinfo/dev-platform
> >
> ___
> dev-platform mailing list
> dev-platform@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-platform
>
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


“approval required” for changes affecting CI infrastructure

2017-11-03 Thread Chris AtLee
To ensure a successful Firefox 57 release, teams responsible for Firefox CI
& release infrastructure have adopted an “approval required” policy for
changes that could impact Firefox development or release. This includes
systems like buildbot, Taskcluster services, puppet, hg, product delivery,
and in-tree changes that could impact task scheduling.

If you have a change you’d like to land that impacts one of the above
systems, or that you think could impact the infrastructure, please let the
firefox-ci@ list know. Most changes are fine to land; we just want to be
aware of what’s changing in the overall system in the lead-up to 57.

If you don’t hear a response within 24 hours, you can assume that your
proposal is fine and proceed to land it.

Thanks in advance,
Chris
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Firefox Nightly - now with twice as many builds a day!

2017-08-31 Thread Chris AtLee
Bug 1349227[1] landed a few days ago, which means we are now doing
"nightly" builds twice a day at 1000 and 2200 UTC.

The purpose of doing multiple nightlies is to get fixes out to users in
Europe, Africa and Asia sooner.

We have some concerns about the possible impact on the build infrastructure,
so for now we're keeping an eye on load. We may need to revert if this causes
too much backlog.

In the meantime, enjoy a more up-to-date Nightly more often!

Please comment on the bug if there are other issues with doing multiple
nightlies a day.

Cheers,
Chris


[1] https://bugzilla.mozilla.org/show_bug.cgi?id=1349227
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Nightly updates disabled for bug 1364059

2017-05-11 Thread Chris AtLee
Updates are enabled again for all platforms. Not all locales have finished
yet; they will receive updates once the repacks finish.

On 11 May 2017 at 09:30, Chris AtLee <cat...@mozilla.com> wrote:

> We've disabled updates for a bad crash:
> https://bugzilla.mozilla.org/show_bug.cgi?id=1364059
>
> We're working on backing out the offending patches and will re-spin
> nightly builds shortly.
>
> Cheers,
> Chris
>
>
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Nightly updates disabled for bug 1364059

2017-05-11 Thread Chris AtLee
We've disabled updates for a bad crash:
https://bugzilla.mozilla.org/show_bug.cgi?id=1364059

We're working on backing out the offending patches and will re-spin nightly
builds shortly.

Cheers,
Chris
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Reminder - TCW tomorrow May 6th from 0500-1200 PT

2017-05-05 Thread Chris AtLee
As indicated on our status page:
https://status.mozilla.org/incidents/cpnkqqb6b5kh

We will be closing trees tomorrow from 0500-1200 PT.

Tracking bug is https://bugzilla.mozilla.org/show_bug.cgi?id=1355897

Thank you for your patience.
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Reproducible builds

2016-07-19 Thread Chris AtLee
Regarding timestamps in tarballs, using tar's --mtime option to force
timestamps to MOZ_BUILD_DATE (or a derivative thereof) could work.
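
As a rough sketch of that idea (the archive name, paths, and the
MOZ_BUILD_DATE value below are placeholders, not our packaging code),
normalizing member timestamps in Python could look like:

    import tarfile
    import time

    # Placeholder build ID in the usual YYYYMMDDHHMMSS form; a real build
    # would read this from the MOZ_BUILD_DATE environment variable.
    MOZ_BUILD_DATE = "20160719030203"
    epoch = int(time.mktime(time.strptime(MOZ_BUILD_DATE, "%Y%m%d%H%M%S")))

    def clamp_mtime(tarinfo):
        # Force every member's mtime to the build date so repeated packaging
        # runs produce bit-identical archives.
        tarinfo.mtime = epoch
        return tarinfo

    with tarfile.open("firefox.tar.bz2", "w:bz2") as archive:
        archive.add("dist/firefox", arcname="firefox", filter=clamp_mtime)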

On 19 July 2016 at 04:11, Kurt Roeckx  wrote:

> On 2016-07-18 20:56, Gregory Szorc wrote:
>
>>
>> Then of course there is build signing, which takes a private key
>> and cryptographically signs builds/installers. With these in play, there
>> is
>> no way for anybody not Mozilla to do a bit-for-bit reproduction of most
>> (all?) of the Firefox distributions at
>> https://www.mozilla.org/en-US/firefox/all/.
>>
>
> There is at least a section about this here:
> https://reproducible-builds.org/docs/embedded-signatures/
>
>
> Kurt
>
>
> ___
> dev-platform mailing list
> dev-platform@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-platform
>
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Win8 tests disabled by default on Try

2016-05-20 Thread Chris AtLee
We've been having a lot of problems with capacity for our Windows test
pools, with Windows 8 being particularly bad.

Today we disabled running Windows 8 64-bit tests by default on Try. If you
really do need Windows 8 tests for your try pushes, you can add try syntax
like this to enable them: "try: -b o -p win64 -u mochitests[Windows 8]"

Treeherder's "Add new jobs" feature is also a great way to select
additional tests for your try push.

Please be mindful of our limited hardware capacity when choosing which
tests you need. We do publish a report of top pushers to try:
https://secure.pub.build.mozilla.org/builddata/reports/reportor/daily/highscores/highscores.html

We have been working on migrating as many tests as possible to AWS. So far
we have migrated many Windows 7 test suites over, but none of the Windows 8
suites have been migrated yet. Our plan is to focus instead on providing
Windows 10 testing in AWS, and then disable the Win8 tests once those are
ready.

Cheers,
Chris
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Windows 7 tests in AWS

2016-05-13 Thread Chris AtLee
I'm very happy to let you know that we've recently started running some of
our Windows 7 tests in AWS. Currently we're running these suites in Amazon
for all branches of gecko 49 and higher:
* Web platform tests + reftests
* gtest
* cppunit
* jittest
* jsreftest
* crashtest

Since these are now working in AWS, we can scale up the number of machines
with load. This should mean a big improvement in getting test results back
for Windows 7!

Work is being tracked in
https://bugzilla.mozilla.org/show_bug.cgi?id=1271355.

If you find any issues, please reach out in #releng, or file a bug and link
it to the one above.

Thanks in particular to jmaher and Q for helping to get this work done.

Cheers,
Chris
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: To bump mochitest's timeout from 45 seconds to 90 seconds

2016-02-09 Thread Chris AtLee
On 9 February 2016 at 14:51, Marco Bonardo  wrote:

> On Tue, Feb 9, 2016 at 6:54 PM, Ryan VanderMeulen 
> wrote:
>
> > I'd have a much easier time accepting that argument if my experience
> > didn't tell me that nearly every single "Test took longer than expected"
> or
> > "Test timed out" intermittent ends with a RequestLongerTimeout as the fix
>
>
> this sounds equivalent to saying "Since we don't have enough resources (or
> a plan) to investigate why some tests take so long, let's give up"... But
> then maybe we should have that explicit discussion, rather than assuming
> it's a truth.
> Since we are focused on quality I don't think it's acceptable to say we are
> fine if a test takes an unexpected amount of time to run. The fact those
> bugs end up being resolved by bumping the timeout without any kind of
> investigation (and it happens, I know) is worrisome.
>

I agree. However, this has traditionally been a very difficult area for
Release Engineering and Engineering Productivity to make progress in.

Who can we work with to understand these timing characteristics in more
depth?
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Using the Taskcluster index to find builds

2015-12-02 Thread Chris AtLee
In this case, latest is just latest from wherever. I agree that l10n
nightlies should be under 'nightly' as well.

On Wed, Dec 2, 2015 at 3:04 PM, Axel Hecht <l...@mozilla.com> wrote:

> On 12/1/15 3:48 PM, Chris AtLee wrote:
>
>> Localized builds should be at e.g.
>> gecko.v2.mozilla-central.latest.firefox-l10n.win32-opt
>>
>> And yes, once we've got the naming structure nailed down, wget-en-US
>> should
>> change to use the index.
>>
>
> I would expect l10n nightlies to be under nightly?
>
> How does one distinguish nightlies from non-nightlies under
> mozilla-central.latest? Assuming that nightlies might end up there on
> occasion?
>
> Axel
>
>
>> On Tue, Dec 1, 2015 at 5:22 AM, Axel Hecht <l...@mozilla.com> wrote:
>>
>> I haven't found localized builds and their assets by glancing at things.
>>> Are those to come?
>>>
>>> Also, I suspect we should rewrite wget-en-US? Or add an alternative
>>> that's
>>> index-bound?
>>>
>>> Axel
>>>
>>> On 11/30/15 9:43 PM, Chris AtLee wrote:
>>>
>>> The RelEng, Cloud Services and Taskcluster teams have been doing a lot of
>>>> work behind the scenes over the past few months to migrate the backend
>>>> storage for builds from the old "FTP" host to S3. While we've tried to
>>>> make
>>>> this as seamless as possible, the new system is not a 100% drop-in
>>>> replacement for the old system, resulting in some confusion about where
>>>> to
>>>> find certain types of builds.
>>>>
>>>> At the same time, we've been working on publishing builds to the
>>>> Taskcluster Index [1]. This service provides a way to find a build given
>>>> various different attributes, such as its revision or date it was built.
>>>> Our plan is to make the index be the primary mechanism for discovering
>>>> build artifacts. As part of the ongoing buildbot to Taskcluster
>>>> migration
>>>> project, builds happening on Taskcluster will no longer upload to
>>>> https://archive.mozilla.org (aka https://ftp.mozilla.org). Once we shut
>>>> off
>>>> platforms in buildbot, the index will be the only mechanism for
>>>> discovering
>>>> new builds.
>>>>
>>>> I posted to planet Mozilla last week [2] with some more examples and
>>>> details. Please explore the index, and ask questions about how to find
>>>> what
>>>> you're looking for!
>>>>
>>>> Cheers,
>>>> Chris
>>>>
>>>> [1] http://docs.taskcluster.net/services/index/
>>>> [2]
>>>> http://atlee.ca/blog/posts/firefox-builds-on-the-taskcluster-index.html
>>>>
>>>>
>>>> ___
>>> dev-platform mailing list
>>> dev-platform@lists.mozilla.org
>>> https://lists.mozilla.org/listinfo/dev-platform
>>>
>>>
> ___
> dev-platform mailing list
> dev-platform@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-platform
>
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Faster Windows builds everywhere!

2015-12-01 Thread Chris AtLee
A few weeks ago I posted about switching our Windows builds on Try over to
EC2, resulting in a 30 minute speed improvement.

Last week we made the same change to the rest of the Windows build
infrastructure. All our Windows builds are now running in AWS. We're seeing
good performance gains there too. On mozilla-inbound, we've reduced opt
build times by at least 45 minutes, and shaved nearly two hours (!!) off
our PGO build times.

Big thanks again to Rob Thijssen (:grenade), Mark Cornmesser (:markco) and
the rest of our Release Engineering and Operations team for getting this
done. Please send your kudos and thanks to them on #releng, or in person at
Orlando next week!

Cheers,
Chris
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Faster Windows builds everywhere!

2015-12-01 Thread Chris AtLee
Right now we've got debug OSX builds in the cloud on Try in parallel with
the regular builds. There's a bunch more work to be done there to be able
to switch over, but we're definitely making progress.

All Windows / OSX unit tests are currently done on our own infra. Q Fortier
is working on getting Windows unittests stood up in AWS, and the results
are very promising. I'm not sure when we'll be able to switch over yet
though. There are no obvious solutions for OSX test infrastructure other
than maintaining our own racks of minis.

Perf tests on all platforms will stay on hardware for now. Some people have
done experiments on EC2 to see how talos performs, but I don't think we
know enough about the impact of this to decide if we can move these off
bare metal or not.

We also run some tests for mobile on panda boards, but those are going away
eventually.

On Tue, Dec 1, 2015 at 4:52 PM, Justin Dolske <dol...@mozilla.com> wrote:

> On 12/1/15 12:41 PM, Chris AtLee wrote:
>
> Last week we made the same change to the rest of the Windows build
>> infrastructure. All our Windows builds are now running in AWS. We're
>> seeing
>> good performance gains there too. On mozilla-inbound, we've reduced opt
>> build times by at least 45 minutes, and nearly two hours (!!) off of our
>> PGO build times.
>>
>
> Nice!
>
> What builds/tests have _not_ moved to the cloud? AIUI the two biggies are
> OS X (can't move because OS licensing), and perf tests... How close are we
> to transitioning everything else off MoCo metal?
>
> Justin
>
> ___
> dev-platform mailing list
> dev-platform@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-platform
>
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Faster Windows builds everywhere!

2015-12-01 Thread Chris AtLee
On Tue, Dec 1, 2015 at 5:27 PM, Gregory Szorc <g...@mozilla.com> wrote:

>
> On Tue, Dec 1, 2015 at 2:21 PM, Chris AtLee <cat...@mozilla.com> wrote:
>
>> Right now we've got debug OSX builds in the cloud on Try in parallel with
>> the regular builds. There's a bunch more work to be done there to be able
>> to switch over, but we're definitely making progress.
>>
>> All Windows / OSX unit tests are currently done on our own infra. Q
>> Fortier
>> is working on getting Windows unittests stood up in AWS, and the results
>> are very promising. I'm not sure when we'll be able to switch over yet
>> though. There are no obvious solutions for OSX test infrastructure other
>> than maintaining our own racks of minis.
>>
>> Perf tests on all platforms will stay on hardware for now. Some people
>> have
>> done experiments on EC2 to see how talos performs, but I don't think we
>> know enough about the impact of this to decide if we can move these off
>> bare metal or not.
>>
>
> Amazon now supports dedicated instances, which means you fully control
> what runs on the machine and other random tenants aren't fighting you for
> CPU and I/O. Assuming the performance variance from other tenants is what
> was preventing us from moving Talos to AWS, that blocker may no longer
> exist.
>

I think you probably want Dedicated Hosts rather than Dedicated Instances;
otherwise multiple of your own workloads could end up on the same physical
box. https://aws.amazon.com/ec2/dedicated-hosts/

I have two main concerns with dedicated infra on AWS:
* You're still running under a hypervisor of some kind; will it introduce
too much noise into the results? Seems like a worthwhile experiment!

* Dedicated host pricing is quite a bit more expensive than what we're
paying now for test infrastructure.
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Using the Taskcluster index to find builds

2015-12-01 Thread Chris AtLee
One approach we've taken when considering changes to the routes used is to
play in the 'garbage.' prefix. You can see the results of earlier
experiments there:

https://tools.taskcluster.net/index/#garbage/garbage

Regarding your proposal, I find the word 'nightly' overloaded, and it needs
more context to make sense. There are lots of 'nightly' builds, but only
one 'nightly' channel (for Firefox anyway). So
"gecko.v2.firefox.win64-opt.nightly.latest" isn't clear to me at first
glance.

I'm curious what others think though.

On Tue, Dec 1, 2015 at 11:46 AM, Julien Wajsberg <jwajsb...@mozilla.com>
wrote:

> hi,
>
> Because we have an index, it's now very easy to add new routes. I think
> it would be a lot more user-friendly to have an index that starts with
> the product name ("firefox" for example).
>
> For example: "gecko.v2.firefox.win64-opt.nightly.latest" instead of
> "gecko.v2.mozilla-central.nightly.latest.firefox.win64-opt
> <
> https://tools.taskcluster.net/index/artifacts/#gecko.v2.mozilla-central.nightly.latest.firefox/gecko.v2.mozilla-central.nightly.latest.firefox.win64-opt
> >".
> Going from the most general to the most specific.
> Using an index that starts with "mozilla-central" is really technical
> and does not make things easy to find :/ Even if it can be good for
> automated tools.
>
> Fortunately now it's not either one or the other, we can have both. But
> before filing a bug, I'd like to know if the general population thinks
> it's a good idea.
>
> --
> Julien
>
> Le 30/11/2015 21:43, Chris AtLee a écrit :
> > The RelEng, Cloud Services and Taskcluster teams have been doing a lot of
> > work behind the scenes over the past few months to migrate the backend
> > storage for builds from the old "FTP" host to S3. While we've tried to
> make
> > this as seamless as possible, the new system is not a 100% drop-in
> > replacement for the old system, resulting in some confusion about where
> to
> > find certain types of builds.
> >
> > At the same time, we've been working on publishing builds to the
> > Taskcluster Index [1]. This service provides a way to find a build given
> > various different attributes, such as its revision or date it was built.
> > Our plan is to make the index be the primary mechanism for discovering
> > build artifacts. As part of the ongoing buildbot to Taskcluster migration
> > project, builds happening on Taskcluster will no longer upload to
> > https://archive.mozilla.org (aka https://ftp.mozilla.org). Once we shut
> off
> > platforms in buildbot, the index will be the only mechanism for
> discovering
> > new builds.
> >
> > I posted to planet Mozilla last week [2] with some more examples and
> > details. Please explore the index, and ask questions about how to find
> what
> > you're looking for!
> >
> > Cheers,
> > Chris
> >
> > [1] http://docs.taskcluster.net/services/index/
> > [2]
> http://atlee.ca/blog/posts/firefox-builds-on-the-taskcluster-index.html
> > ___
> > dev-platform mailing list
> > dev-platform@lists.mozilla.org
> > https://lists.mozilla.org/listinfo/dev-platform
>
>
> ___
> dev-platform mailing list
> dev-platform@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-platform
>
>
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Using the Taskcluster index to find builds

2015-12-01 Thread Chris AtLee
Localized builds should be at e.g.
gecko.v2.mozilla-central.latest.firefox-l10n.win32-opt

And yes, once we've got the naming structure nailed down, wget-en-US should
change to use the index.

On Tue, Dec 1, 2015 at 5:22 AM, Axel Hecht <l...@mozilla.com> wrote:

> I haven't found localized builds and their assets by glancing at things.
> Are those to come?
>
> Also, I suspect we should rewrite wget-en-US? Or add an alternative that's
> index-bound?
>
> Axel
>
> On 11/30/15 9:43 PM, Chris AtLee wrote:
>
>> The RelEng, Cloud Services and Taskcluster teams have been doing a lot of
>> work behind the scenes over the past few months to migrate the backend
>> storage for builds from the old "FTP" host to S3. While we've tried to
>> make
>> this as seamless as possible, the new system is not a 100% drop-in
>> replacement for the old system, resulting in some confusion about where to
>> find certain types of builds.
>>
>> At the same time, we've been working on publishing builds to the
>> Taskcluster Index [1]. This service provides a way to find a build given
>> various different attributes, such as its revision or date it was built.
>> Our plan is to make the index be the primary mechanism for discovering
>> build artifacts. As part of the ongoing buildbot to Taskcluster migration
>> project, builds happening on Taskcluster will no longer upload to
>> https://archive.mozilla.org (aka https://ftp.mozilla.org). Once we shut
>> off
>> platforms in buildbot, the index will be the only mechanism for
>> discovering
>> new builds.
>>
>> I posted to planet Mozilla last week [2] with some more examples and
>> details. Please explore the index, and ask questions about how to find
>> what
>> you're looking for!
>>
>> Cheers,
>> Chris
>>
>> [1] http://docs.taskcluster.net/services/index/
>> [2]
>> http://atlee.ca/blog/posts/firefox-builds-on-the-taskcluster-index.html
>>
>>
> ___
> dev-platform mailing list
> dev-platform@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-platform
>
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Using the Taskcluster index to find builds

2015-12-01 Thread Chris AtLee
The expiration is currently set to one year, but we can (and should!)
change that for nightlies. That work is being tracked in
https://bugzilla.mozilla.org/show_bug.cgi?id=1145300

On Mon, Nov 30, 2015 at 7:00 PM, Ryan VanderMeulen <rya...@gmail.com> wrote:

> On 11/30/2015 3:43 PM, Chris AtLee wrote:
>
>> The RelEng, Cloud Services and Taskcluster teams have been doing a lot of
>> work behind the scenes over the past few months to migrate the backend
>> storage for builds from the old "FTP" host to S3. While we've tried to
>> make
>> this as seamless as possible, the new system is not a 100% drop-in
>> replacement for the old system, resulting in some confusion about where to
>> find certain types of builds.
>>
>> At the same time, we've been working on publishing builds to the
>> Taskcluster Index [1]. This service provides a way to find a build given
>> various different attributes, such as its revision or date it was built.
>> Our plan is to make the index be the primary mechanism for discovering
>> build artifacts. As part of the ongoing buildbot to Taskcluster migration
>> project, builds happening on Taskcluster will no longer upload to
>> https://archive.mozilla.org (aka https://ftp.mozilla.org). Once we shut
>> off
>> platforms in buildbot, the index will be the only mechanism for
>> discovering
>> new builds.
>>
>> I posted to planet Mozilla last week [2] with some more examples and
>> details. Please explore the index, and ask questions about how to find
>> what
>> you're looking for!
>>
>> Cheers,
>> Chris
>>
>> [1] http://docs.taskcluster.net/services/index/
>> [2]
>> http://atlee.ca/blog/posts/firefox-builds-on-the-taskcluster-index.html
>>
>> If I understand correctly, Taskcluster builds are only archived for one
> year, whereas we have nightly archives going back 10+ years now. What are
> our options for long-term archiving in this setup?
>
> ___
> dev-platform mailing list
> dev-platform@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-platform
>
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Using the Taskcluster index to find builds

2015-12-01 Thread Chris AtLee
You're right in that we can't change the expiration after the fact, but we
can copy all of the artifacts to new tasks with the longer expiration.

On Tue, Dec 1, 2015 at 9:53 AM, Ryan VanderMeulen <rya...@gmail.com> wrote:

> What does that mean for jobs that have already run? My understanding is
> that we can't change the expiration after the fact for them? Though I guess
> that it's not an issue as long as we fix bug 1145300 prior to shutting off
> publishing to archive.m.o?
>
> I just want to avoid any gaps in nightly build coverage as the archived
> builds are critical for regression hunting.
>
> On 12/1/2015 9:49 AM, Chris AtLee wrote:
>
>> The expiration is currently set to one year, but we can (and should!)
>> change that for nightlies. That work is being tracked in
>> https://bugzilla.mozilla.org/show_bug.cgi?id=1145300
>>
>> On Mon, Nov 30, 2015 at 7:00 PM, Ryan VanderMeulen <rya...@gmail.com>
>> wrote:
>>
>> On 11/30/2015 3:43 PM, Chris AtLee wrote:
>>>
>>> The RelEng, Cloud Services and Taskcluster teams have been doing a lot of
>>>> work behind the scenes over the past few months to migrate the backend
>>>> storage for builds from the old "FTP" host to S3. While we've tried to
>>>> make
>>>> this as seamless as possible, the new system is not a 100% drop-in
>>>> replacement for the old system, resulting in some confusion about where
>>>> to
>>>> find certain types of builds.
>>>>
>>>> At the same time, we've been working on publishing builds to the
>>>> Taskcluster Index [1]. This service provides a way to find a build given
>>>> various different attributes, such as its revision or date it was built.
>>>> Our plan is to make the index be the primary mechanism for discovering
>>>> build artifacts. As part of the ongoing buildbot to Taskcluster
>>>> migration
>>>> project, builds happening on Taskcluster will no longer upload to
>>>> https://archive.mozilla.org (aka https://ftp.mozilla.org). Once we shut
>>>> off
>>>> platforms in buildbot, the index will be the only mechanism for
>>>> discovering
>>>> new builds.
>>>>
>>>> I posted to planet Mozilla last week [2] with some more examples and
>>>> details. Please explore the index, and ask questions about how to find
>>>> what
>>>> you're looking for!
>>>>
>>>> Cheers,
>>>> Chris
>>>>
>>>> [1] http://docs.taskcluster.net/services/index/
>>>> [2]
>>>> http://atlee.ca/blog/posts/firefox-builds-on-the-taskcluster-index.html
>>>>
>>>> If I understand correctly, Taskcluster builds are only archived for one
>>>>
>>> year, whereas we have nightly archives going back 10+ years now. What are
>>> our options for long-term archiving in this setup?
>>>
>>> ___
>>> dev-platform mailing list
>>> dev-platform@lists.mozilla.org
>>> https://lists.mozilla.org/listinfo/dev-platform
>>>
>>>
> ___
> dev-platform mailing list
> dev-platform@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-platform
>
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Using the Taskcluster index to find builds

2015-11-30 Thread Chris AtLee
The RelEng, Cloud Services and Taskcluster teams have been doing a lot of
work behind the scenes over the past few months to migrate the backend
storage for builds from the old "FTP" host to S3. While we've tried to make
this as seamless as possible, the new system is not a 100% drop-in
replacement for the old system, resulting in some confusion about where to
find certain types of builds.

At the same time, we've been working on publishing builds to the
Taskcluster Index [1]. This service provides a way to find a build given
various different attributes, such as its revision or date it was built.
Our plan is to make the index be the primary mechanism for discovering
build artifacts. As part of the ongoing buildbot to Taskcluster migration
project, builds happening on Taskcluster will no longer upload to
https://archive.mozilla.org (aka https://ftp.mozilla.org). Once we shut off
platforms in buildbot, the index will be the only mechanism for discovering
new builds.
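
As a quick sketch of what a lookup could look like (this assumes the index's
task endpoint at index.taskcluster.net/v1/task/<namespace>; the namespace is
just an example, see the docs linked below for the authoritative API):

    import json
    import urllib.request

    namespace = "gecko.v2.mozilla-central.latest.firefox.linux64-opt"
    url = "https://index.taskcluster.net/v1/task/" + namespace

    # The index returns the most recently indexed task for this route.
    with urllib.request.urlopen(url) as response:
        task = json.loads(response.read().decode("utf-8"))

    print(task["taskId"])
    # The artifacts for that taskId can then be fetched from the queue service.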

I posted to planet Mozilla last week [2] with some more examples and
details. Please explore the index, and ask questions about how to find what
you're looking for!

Cheers,
Chris

[1] http://docs.taskcluster.net/services/index/
[2] http://atlee.ca/blog/posts/firefox-builds-on-the-taskcluster-index.html
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Faster Windows builds on Try

2015-11-19 Thread Chris AtLee
Over the past months we've been working on migrating our Windows builds
from the legacy hardware machines into Amazon.

I'm very happy to announce that we've wrapped up the initial work here, and
all our Windows builds on Try are now happening in Amazon.

The biggest win from this is that our Windows builds are now nearly 30
minutes faster than they used to be. As of today, Windows builds on try
generally take around 50 minutes to complete, down from 1h20 before. Our
next step is to migrate the non-try builds onto Amazon as well.

Big thanks to Rob, Mark, and the rest of our Release Engineering and
Operations team for making this possible!

Cheers,
Chris

https://bugzilla.mozilla.org/show_bug.cgi?id=1199267
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Now measuring Firefox size per-commit. What else should we be tracking?

2015-11-09 Thread Chris AtLee
On Mon, Nov 9, 2015 at 6:39 PM, William Lachance 
wrote:

> On 2015-11-06 5:56 PM, Mark Finkle wrote:
>
>> I also think measuring build times, and other build related stats, would
>> be
>> useful. I'd like to see Mozilla capturing those stats for developer builds
>> though. I'm less interested in build times for the automation. That data
>> is
>> already looked at by the automation team.
>>
>
> Chris Manchester has volunteered to look into submitting build times in
> automation to perfherder here:
>
> https://bugzilla.mozilla.org/show_bug.cgi?id=1222549
>
> I actually do think that perfherder has some advantages over the existing
> grafana system, in that it has the capability of being sheriffed easily
> down to the per-commit level by anyone (not just releng/ateam).
>
> We'll see I guess! The proof will be in bugs filed and fixed when
> regressions occur. I really think developer build times are strongly
> correlated with build times in automation, so my hope is that there will be
> a trickle-down effect if this system proves useful.
>

Yes, I agree this is a better approach. Having the build system produce
logs or artifacts that can be ingested by Perfherder is a more flexible
model than the one we're currently using.
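
As a rough illustration of that model (the field names here are assumptions
for the sketch, not the authoritative Perfherder schema), a build step could
emit a metric as a single log line:

    import json

    payload = {
        "framework": {"name": "build_metrics"},
        "suites": [
            # One measurement per suite; subtests can hold finer-grained numbers.
            {"name": "installer size", "value": 45678901, "subtests": []},
        ],
    }
    # Perfherder picks this up by scanning the job log for this prefix.
    print("PERFHERDER_DATA:", json.dumps(payload))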
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Now measuring Firefox size per-commit. What else should we be tracking?

2015-11-04 Thread Chris AtLee
This is really great, thanks for adding support for this!

I'd like to see the size of the complete updates measured as well, in
addition to the installer sizes.

Do we have alerts for these set up yet?

Cheers,
Chris

On Wed, Nov 4, 2015 at 10:55 AM, William Lachance 
wrote:

> Hey, so as described here:
>
> http://wrla.ch/blog/2015/11/perfherder-onward/
>
> ... I recently added tracking for Firefox installer size inside
> Perfherder. This should let us track how bloated (or not) Firefox is on our
> various supported platforms, on a per-commit basis:
>
>
> https://treeherder.mozilla.org/perf.html#/graphs?series=[mozilla-inbound,4eb0cde5431ee9aeb5eb14512ddb3da6d4702cf0,1]&series=[mozilla-inbound,80cac7ef44b76864458627c574af1a18a425f338,1]&series=[mozilla-inbound,0060252bdfb7632df5877b7594b4d16f1b5ca4c9,1]
>
> As I mentioned in the blog post, it's now *very* easy (maybe too easy?
> heh) to submit "performance" (read: quantitative) data for any job
> reporting to treeherder by outputting a line called "PERFHERDER_DATA" to
> the log.
>
> Is there anything we could be tracking as part of our build or test jobs
> that we should be? Build times are one thing that immediately comes to
> mind. Is there anything else?
>
> In order to be a good candidate for measurement in this kind of system, a
> metric should be:
>
> 1. Relatively deterministic.
> 2. Something people actually care about and are willing to act on, on a
> per-commit basis. If you're only going to look at it once a quarter or so,
> it doesn't need to be in Perfherder.
>
> Anyway, just thought I'd open the floor to brainstorming. I'd prefer to
> add stuff incrementally, to make sure Perfherder can handle the load, but
> I'd love to hear all your ideas.
>
> Will
> ___
> dev-platform mailing list
> dev-platform@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-platform
>
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Partial updates temporarily disabled for Nightly and Dev-Edition

2015-11-03 Thread Chris AtLee
Partial updates should be functional again now. Sorry for the inconvenience!

On Thu, Oct 29, 2015 at 4:49 PM, Chris AtLee <cat...@mozilla.com> wrote:

> We've temporarily disabled generation of partial updates for Nightly and
> Dev-Edition (Aurora) versions of Firefox.
>
> Given that Dev-Edition updates are currently frozen as part of our uplift
> process, the main impact of this is on Nightly users.
>
> We hope to have partial update generation re-enabled in the next few days.
>
> Sorry for the inconvenience.
>
> Chris
>
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Partial updates temporarily disabled for Nightly and Dev-Edition

2015-10-29 Thread Chris AtLee
We've temporarily disabled generation of partial updates for Nightly and
Dev-Edition (Aurora) versions of Firefox.

Given that Dev-Edition updates are currently frozen as part of our uplift
process, the main impact of this is on Nightly users.

We hope to have partial update generation re-enabled in the next few days.

Sorry for the inconvenience.

Chris
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Per-test chaos mode now available, use it to help win the war on orange!

2015-06-04 Thread Chris AtLee
Very interesting, thank you!

Would there be a way to add an environment variable or harness flag to run
all tests in chaos mode?

On Thu, Jun 4, 2015 at 5:31 PM, Chris Peterson cpeter...@mozilla.com
wrote:

 On 6/4/15 11:32 AM, kgu...@mozilla.com wrote:

 I just landed bug 1164218 on inbound, which adds the ability to run
 individual mochitests and reftests in chaos mode. (For those unfamiliar
 with chaos mode, it's a feature added by roc a while back that makes
 already-random things more random; see [1] or bug 955888 for details).

 The idea with making it available per-test is that new tests should be
 written and tested locally/on try with chaos mode enabled, to flush out
 possible intermittent failures faster. Ideally we should also land them
 with chaos mode enabled. At this time we're still not certain if this will
 provide a lot of value (i.e. if chaos-mode-triggered failures are
 representative of real bugs) so it's not mandatory to make your tests run
 in chaos mode, but please do let me know if you try enabling it on your
 test and are either successful or not. We need to collect more data on the
 usefulness of this to see where we should take it. If it does turn out to
 be valuable, my hope is that we can start making pre-existing tests
 chaos-mode enabled as well, and eventually reduce the intermittent failure
 rate.


 Will chaos mode enabled tests run on Try and release branches?

 We don't know if chaos mode test failures are representative of real bugs,
 but could chaos mode hide bugs that only reveal themselves when users run
 without chaos mode?



  See [2] for an example of how to enable chaos mode in your tests.
 Basically you can add chaos-mode to the reftest.list file for reftests, or
 call SimpleTest.testInChaosMode() for mochitests.

 If you do run into intermittent failures, the best way to debug them is
 usually to grab a recording of the failure using rr [3] and then debug the
 recording to see what was going on. This only works on Linux (and has some
 hardware requirements as well) but it's a really great tool to have.

 Cheers,
 kats

 [1] http://robert.ocallahan.org/2014/03/introducing-chaos-mode.html
 [2] https://hg.mozilla.org/integration/mozilla-inbound/rev/89ac61464a45
 [3] http://rr-project.org/ or https://github.com/mozilla/rr/


 ___
 dev-platform mailing list
 dev-platform@lists.mozilla.org
 https://lists.mozilla.org/listinfo/dev-platform

___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: It is now possible to apply arbitrary tags to tests/manifests and run all tests with a given tag

2015-05-04 Thread Chris AtLee
Sounds great! I've filed
https://bugzilla.mozilla.org/show_bug.cgi?id=1161282 for this.

According to
https://secure.pub.build.mozilla.org/builddata/reports/reportor/daily/highscores/highscores.html,
we still have a ton of people using '-p all -u all' on try.

On Mon, May 4, 2015 at 5:12 PM, Gregory Szorc g...@mozilla.com wrote:

 Wait - you're telling me that it is now possible to limit try pushes but
 not just jobs but tests within jobs?! Stop the presses: this is huge! If
 used by the masses, this could drastically reduce try turnaround times and
 decrease automation load and costs.

 Could we encourage use of --tag by having the automation scheduler
 up-weight jobs that opt in to reduced load?

 On Thu, Apr 30, 2015 at 4:21 PM, Christopher Manchester 
 chmanches...@gmail.com wrote:

  You can now add --tag arguments to try syntax and they will get passed
 to
  test harnesses in your try push. Details of the implementation are in bug
  978846, but if you're interested in passing other arguments from try
 syntax
  to a test harness, this can be done by adding those arguments to
  testing/config/mozharness/try_arguments.py. Note this is still rather
  coarse in the sense that arguments are forwarded without regard for
 whether
  a harness supports a particular argument, but I can imagine it being
 useful
  in a number of cases (for instance, when testing the feature with
 xpcshell
  and --tag devtools, I was able to get feedback in about ten minutes
  whether things were working rather than waiting for every xpcshell test
 to
  run).
 
  Chris
 
  On Thu, Apr 2, 2015 at 2:22 PM, Andrew Halberstadt 
  ahalberst...@mozilla.com
   wrote:
 
   Minor update. It was pointed out that other list-like manifestparser
   attributes (like head and support-files) are whitespace delimited
 instead
   of comma delimited. To be consistent I switched tags to whitespace
   delimitation as well.
  
   E.g both these forms are ok:
  
   [test_foo.html]
   tags = foo bar baz
  
   [test_bar.html]
   tags =
   foo
   bar
   baz
  
   -Andrew
  
  
   On 31/03/15 12:30 PM, Andrew Halberstadt wrote:
  
   As of bug 987360, you can now run all tests with a given tag for
   mochitest (and variants), xpcshell and marionette based harnesses.
 Tags
   can be applied to either individual tests, or the DEFAULT section in
   manifests. Tests can have multiple tags, in which case they should be
   comma delimited. To run all tests with a given tag, pass in --tag tag
   name to the mach command.
  
   For example, let's say we want to group all mochitest-plain tests
   related to canvas together. First we'd add a 'canvas' tag to the
 DEFAULT
   section in
  
  
 
 https://dxr.mozilla.org/mozilla-central/source/dom/canvas/test/mochitest.ini
  
  
   [DEFAULT]
   tags = canvas
  
   We notice there is also a canvas related test under dom/media, namely:
  
  
 
 https://dxr.mozilla.org/mozilla-central/source/dom/media/test/mochitest.ini#541
  
  
   Let's pretend it is already tagged with the 'media' tag, but that's
 ok,
   we can add a second tag no problem:
  
   [test_video_to_canvas.html]
   tags = media,canvas
  
   Repeat above for any other tests or manifests scattered in the tree
 that
   are related to canvas. Now we can run all mochitest-plain tests with:
  
   ./mach mochitest-plain --tag canvas
  
   You can also run the union of two tags by specifying --tag more than
   once (though the intersection of two tags is not supported):
  
   ./mach mochitest-plain --tag canvas --tag media
  
   So far the xpcshell (./mach xpcshell-test --tag name) and marionette
   (./mach marionette-test --tag name) commands are also supported.
 Reftest
   is not supported as it has its own special manifest format.
  
   Applying tags to tests will not affect automation or other people's
   tags. So each organization or team should feel free to use tags in
   whatever creative ways they see fit. Eventually, we'll start using
 tags
   as a foundation for some more advanced features and analysis. For
   example, we may implement a way to run all tests with a given tag
 across
   multiple different suites.
  
   If you have any questions or things aren't working, please let me
 know!
  
   Cheers,
   Andrew
  
  
   ___
   dev-platform mailing list
   dev-platform@lists.mozilla.org
   https://lists.mozilla.org/listinfo/dev-platform
  
  ___
  dev-platform mailing list
  dev-platform@lists.mozilla.org
  https://lists.mozilla.org/listinfo/dev-platform
 
 ___
 dev-platform mailing list
 dev-platform@lists.mozilla.org
 https://lists.mozilla.org/listinfo/dev-platform

___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-10-01 Thread Chris AtLee

On 17:26, Tue, 23 Sep, Kyle Huey wrote:

On Tue, Aug 26, 2014 at 8:23 AM, Chris AtLee cat...@mozilla.com wrote:

Just a short note to say that this experiment is now live on
mozilla-inbound.

___
dev-tree-management mailing list
dev-tree-managem...@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tree-management



What was the outcome?


Thanks for the reminder.

The outcome of this experiment was inconclusive.

On the one hand, we know we didn't make anything worse. The skipping 
behaved as expected, and wasn't a burden on sheriffs. We didn't make 
wait times any worse.


On the other hand, it appears as though we improved wait times for the 
target platforms, but the signal there isn't clear due to other 
variables changing (e.g. overall load wasn't directly comparable between 
the two time windows).


We've left the skipping behaviour enabled for the moment, and are 
considering some tweaks to the amount of skipping that happens, and 
which branches/platforms it's enabled for.


Cheers,
Chris


___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-26 Thread Chris AtLee
Just a short note to say that this experiment is now live on 
mozilla-inbound.


___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-21 Thread Chris AtLee

On 17:37, Wed, 20 Aug, Jonas Sicking wrote:

On Wed, Aug 20, 2014 at 4:24 PM, Jeff Gilbert jgilb...@mozilla.com wrote:

I have been asked in the past if we really need to run WebGL tests on Android, 
if they have coverage on Desktop platforms.
And then again later, why B2G if we have Android.

There seems to be enough belief in test-once-run-everywhere that I feel the 
need to *firmly* establish that this is not acceptable, at least for the code I 
work with.
I'm happy I'm not alone in this.


I'm a firm believer that we ultimately need to run basically all
combinations of tests and platforms before allowing code to reach
mozilla-central. There's lots of platform specific code paths, and
it's hard to track which tests trigger them, and which don't.


I think we can agree on this. However, not running all tests on all 
platforms per push on mozilla-inbound (or other branch) doesn't mean 
that they won't be run on mozilla-central, or even on mozilla-inbound 
prior to merging.


I'm a firm believer that running all tests for all platforms for all 
pushes is a waste of our infrastructure and human resources.


I think the gap we need to figure out how to fill is between getting 
per-push efficiency and full test coverage prior to merging.



It would however be really cool if we were able to pull data on which
tests tend to fail in a way that affects all platforms, and which ones
tend to fail on one platform only. If we combine this with the ability
of having tbpl (or treeherder) fill in the blanks whenever a test
fails, it seems like we could run many of our tests on only one
platform for most checkins to mozilla-inbound.


There are dozens of really interesting approaches we could take here.
Skipping every nth debug test run is one of the simplest, and I hope we 
can learn a lot from the experiment.
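
To make that concrete, here is a purely hypothetical sketch of what 
"skip every nth debug run" boils down to; this is not the actual 
scheduler code, and the names and interval are invented for 
illustration:

    # Hypothetical sketch of "run debug tests only on every Nth push".
    # SKIP_INTERVAL, ALWAYS_RUN and the push counter are illustrative only.
    SKIP_INTERVAL = 2               # run debug tests on every 2nd push
    ALWAYS_RUN = {"linux64-debug"}  # keep full per-push coverage here

    def should_run_debug_tests(platform, push_count):
        """Return True if debug tests should be scheduled for this push."""
        if platform in ALWAYS_RUN:
            return True
        return push_count % SKIP_INTERVAL == 0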


Cheers,
Chris


___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Experiment with running debug tests less often on mozilla-inbound the week of August 25

2014-08-20 Thread Chris AtLee

On 18:25, Tue, 19 Aug, Ehsan Akhgari wrote:

On 2014-08-19, 5:49 PM, Jonathan Griffin wrote:

On 8/19/2014 2:41 PM, Ehsan Akhgari wrote:

On 2014-08-19, 3:57 PM, Jeff Gilbert wrote:

I would actually say that debug tests are more important for
continuous integration than opt tests. At least in code I deal with,
we have a ton of asserts to guarantee behavior, and we really want
test coverage with these via CI. If a test passes on debug, it should
almost certainly pass on opt, just faster. The opposite is not true.

"They take a long time and then break" is part of what I believe
caused us not to bother with debug testing on much of Android and
B2G, which we still haven't completely fixed. It should be
unacceptable to ship without CI on debug tests, but here we are
anyway. (This is finally nearly fixed, though there is still some
work to do.)

I'm not saying running debug tests less often is on the same scale of
bad, but I would like to express my concerns about heading in that
direction.


I second this.  I'm curious to know why you picked debug tests for
this experiment.  Would it not make more sense to run opt tests on
desktop platforms on every other run?


Just based on the fact that they take longer and thus running them less
frequently would have a larger impact.  If there's a broad consensus
that debug runs are more valuable, we could switch to running opt tests
less frequently instead.


Yep, the debug tests indeed take more time, mostly because they run 
more checks.  :-)  The checks in opt builds are not exactly a subset 
of the ones in debug builds, but they are close.  Based on that, I 
think running opt tests on every other push is the more conservative 
option, and I support it more.  That being said, for this one-week 
limited trial, given that the sheriffs will help backfill the skipped 
tests, I don't care very strongly about this, as long as it doesn't 
set the precedent that we can ignore debug tests!


I'd like to highlight that we're still planning on running debug 
linux64 tests for every build. This is based on the assumption that 
debug-specific failures are generally cross-platform failures as well.


Does this help alleviate some concern? Or is that assumption just plain 
wrong?


Cheers,
Chris


___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Always brace your ifs

2014-02-24 Thread Chris AtLee

On 17:37, Sat, 22 Feb, L. David Baron wrote:

On Saturday 2014-02-22 15:57 -0800, Gregory Szorc wrote:

On Feb 22, 2014, at 8:18, Kyle Huey m...@kylehuey.com wrote:
 If you needed another reason to follow the style guide:
 https://www.imperialviolet.org/2014/02/22/applebug.html


Code coverage would have caught this as well.

The time investment needed for 100% line and branch coverage is debatable, 
but you can't deny that code coverage has its place, especially for 
high-importance code such as crypto.

AFAIK, our automation currently does not collect code coverage from any test 
suite. Should that change?


There was some automation running code coverage reports (with gcov,
I think) for at least reftests + mochitests for an extended period
of time; I found it useful for improving style system test coverage
while it was running.

I'm not sure how strong our commitment is to keeping such tests
running, though; I frequently have to defend the tests against
people who want to disable them because they take a long time (i.e.,
a long time for a single test file, which sometimes leads the tests
to approach the per-file timeouts on slow VMs) or because they
happen to exhibit the latest JIT crash frequently because they run a
lot of code.  I'm worried we're moving to a model where tests need
to have active defenders to keep them running (even though that
isn't how features on the Web platform work), because we blame the
old test rather than the new regression.


Tests need owners just like any other piece of code.

One of the big problems with the previous code coverage reports was that 
they were failing to run, and nobody was stepping up to fix them. When 
we're resource-constrained, it's a big waste of resources to run things 
that are broken and that nobody is working on fixing.


Slow tests are a slightly different issue. If you're adding a test that 
takes 60s to run, you're saying you think it's important enough to make 
all other developers wait another minute for their test runs to complete 
locally. You're saying it's worthwhile to spend an extra minute per push 
per platform, to delay all future landings and merges by an extra minute 
per push. At ~249 pushes per day and at least 18 test platforms, 
you're adding at least 74 hours of additional machine time per day.
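
(Spelling that estimate out with just the figures above: 60 s x 249 
pushes/day x 18 platforms = 268,920 s/day, or roughly 74.7 machine-hours 
per day.)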


I know the cost of *not* testing code is even higher than this! It's 
even more expensive and painful to track down regressions after the 
fact. This doesn't mean we can't put more effort into writing efficient 
tests.


Cheers,
Chris


___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Non-unified builds now running periodically on all trees

2014-01-15 Thread Chris AtLee
Starting today [1], you'll see a new symbol on TBPL: Bn. These are builds 
running with unified sources disabled. We're now running these 
periodically on 64-bit linux (opt and debug) on all trees on the same 
cadence as the PGO builds.


The purpose of these builds is to catch build problems that are masked 
by unifying the source files. By doing regular builds with unified 
sources disabled we'll have a smaller regression window to help pinpoint 
the changes which broke the non-unified configuration.


Once we shake out all the issues with the linux64 non-unified builds, 
we'll look at enabling other platforms.


Testing non-unified builds on Try is simple - just ensure your mozconfig 
has 'ac_add_options --disable-unified-compilation' in it.
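
For example, a minimal mozconfig for such a push might look like this (the 
base mozconfig included here is illustrative; keep whatever you normally 
build with and just add the last line):

    # Example mozconfig for a non-unified build; the include is illustrative.
    . "$topsrcdir/browser/config/mozconfigs/linux64/debug"
    ac_add_options --disable-unified-compilation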


For more details, please see bug 942167 [2].

Cheers,
Chris

[1] Once the new tbpl code is deployed - 
https://bugzilla.mozilla.org/show_bug.cgi?id=960173
[2] https://bugzilla.mozilla.org/show_bug.cgi?id=942167


___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Thinking about the merge with unified build

2013-12-03 Thread Chris AtLee

On 18:23, Mon, 02 Dec, Ehsan Akhgari wrote:

As for identifying broken non-unified builds, can we configure one of
our mozilla-inbound platforms to be non-unified (like 32-bit Linux Debug)?


I think the answer to that question depends on how soon bug 942167 can 
be fixed.  Chris, any ideas?


We're trying to figure out the best way to implement it. It'll be a week 
or so at least.


___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Pushes to Backouts on Mozilla Inbound

2013-11-05 Thread Chris AtLee

On 15:10, Tue, 05 Nov, James Graham wrote:

On 05/11/13 14:57, Kyle Huey wrote:

On Tue, Nov 5, 2013 at 10:44 PM, David Burns dbu...@mozilla.com wrote:


We appear to be doing 1 backout for every 15 pushes on rough average[4].
This number, I am sure you can all agree, is far too high, especially if
we think about the figures that John O'Duinn suggests[5] for the cost of
running and testing each push. With the offending patch + backout we are
using 508 computing hours to make essentially no changes to the tree, and
then we use another 254 computing hours for the fixed reland. Note that
the 508 hours don't include retriggers done by the Sheriffs to see if a
failure is intermittent or not.

This is a lot of wasted effort when we should be striving to get patches
to stick the first time. Let's see if we can try to bring this figure
down to 1 in 30 patches getting backed out.



What is your proposal for doing that?  What are the costs involved?  It
isn't very useful to say "X is bad, let's not do X" without looking at
what it costs to not do X.

To give one hypothetical example, if it requires just two additional full
try pushes to avoid one backout, we haven't actually saved any computing
time.


So, as far as I can tell, the heart of the problem is that the 
end-to-end time for the build+test infrastructure is unworkably slow. 
I understand that waiting half a dozen hours — a significant fraction 
of a work day — for a try run is considered normal. This has a huge 
knock-on effect e.g. it requires people to context switch away from 
one problem whilst they wait, and context switch back into it once 
they have the results. Presumably it also encourages landing changes 
without proper testing, which increases the backout rate. It seems 
that this will cost a great deal not just in terms of compute hours 
(which are easy to measure) but also in terms of developer 
productivity (which is harder to measure, but could be even more 
significant).


What data do we currently have about why the wait time is so long? If 
this data doesn't exist, can we start to collect it? Are there easy 
wins to be had, or do we need to think about restructuring the way 
that we do builds and/or testing to achieve greater throughput?


We're publishing data in several places about total run time for jobs.

For overall build metrics, you can try 
http://brasstacks.mozilla.com/gofaster/


For specific revisions you can query self-serve, e.g.
https://secure.pub.build.mozilla.org/buildapi/self-serve/try/rev/5ff9d60c6803, 
or in json

https://secure.pub.build.mozilla.org/buildapi/self-serve/try/rev/5ff9d60c6803?format=json

For historical data, you can look at all our archived build data 
here: http://builddata.pub.build.mozilla.org/buildjson/


Average times for builds/tests on m-c are published here:
https://secure.pub.build.mozilla.org/builddata/reports/reportor/daily/branch_times/output.txt

end-to-end times for try are here:
https://secure.pub.build.mozilla.org/builddata/reports/reportor/daily/end2end_try/end2end.html

I hope this helps!

Cheers,
Chris


___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Shutting off leak tests?

2013-07-15 Thread Chris AtLee

Hi!

Leak tests on OSX have been failing intermittently for nearly a year 
now[1]. As yet, we don't have any ideas why they're failing, and nobody 
is working on fixing them.


Would anybody be very sad if we shut them off? Are these tests providing 
useful information any more?


If they are still important to run, can we get some help fixing them?

Cheers,
Chris

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=774844


___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Proposal for an inbound2 branch

2013-04-30 Thread Chris AtLee

On 02:54, Tue, 30 Apr, Justin Lebar wrote:

Is there sanity to this proposal or am I still crazy?


If we had a lot more project branches, wouldn't that increase the load
on infra dramatically, because we'd have less coalescing?


Yes, it would decrease coalescing. I wonder how many tree closures and 
backouts we'd have though?


It seems like a tree used by a smaller, more focused group of people 
could cope better with leaving some orange on the tree for short periods 
of time. Instead of backing out suspect revisions and closing the tree 
to wait for the results of the backout to come back, could the tree 
remain open to landings while the test failures are being investigated? 
I think this is easier to coordinate with a smaller group of people, and 
with a slower check-in cadence.


Cheers,
Chris


___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Some data on mozilla-inbound

2013-04-26 Thread Chris AtLee

On 14:29, Fri, 26 Apr, Gregory Szorc wrote:

On 4/26/2013 2:06 PM, Kartikaya Gupta wrote:

On 13-04-26 11:37 , Phil Ringnalda wrote:

 Unfortunately, engineering is totally indifferent to
things like having doubled the cycle time for Win debug browser-chrome
since last November.



Is there a bug filed for this? I just cranked some of the build.json
files through some scripts and got the average time (in seconds) for
all the jobs run on the
mozilla-central_xp-debug_test-mochitest-browser-chrome builders, and
there is in fact a significant increase since November. This makes me
think that we need a resource usage regression alarm of some sort too.

builds-2012-11-01.js: 4063
builds-2012-11-15.js: 4785
builds-2012-12-01.js: 5311
builds-2012-12-15.js: 5563
builds-2013-01-01.js: 6326
builds-2013-01-15.js: 5706
builds-2013-02-01.js: 5823
builds-2013-02-15.js: 6103
builds-2013-03-01.js: 5642
builds-2013-03-15.js: 5187
builds-2013-04-01.js: 5643
builds-2013-04-15.js: 6207


Well, wall time will [likely] increase as we write new tests. I'm
guessing (OK, really hoping) the number of mochitest files has increased
in rough proportion to the wall time? Also, aren't we executing some
tests on virtual machines now? On any virtual machine (and especially on
EC2), you don't know what else is happening on the physical machine, so
CPU and I/O steal are expected to cause variations and slowness in
execution time.


Those tests are still on exactly the same hardware. philor points out in 
https://bugzilla.mozilla.org/show_bug.cgi?id=864085#c0 that the 
time increase is disproportionate for win7. It would be interesting to 
look at all the other suites too.


Perhaps a regular report of how much our wall-clock times for builds and 
for each test suite have changed week-over-week would be useful?
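
A hypothetical sketch of such a report, driven off the archived 
builds-*.js files mentioned earlier in this thread (the top-level 
"builds" key and the "starttime"/"endtime" fields are assumptions about 
the archive format -- verify against a real file first):

    import glob
    import json

    # Average job duration per archive file, as a crude week-over-week view.
    for path in sorted(glob.glob("builds-*.js")):
        with open(path) as f:
            builds = json.load(f).get("builds", [])
        durations = [b["endtime"] - b["starttime"] for b in builds
                     if b.get("endtime") and b.get("starttime")]
        if durations:
            print(path, round(sum(durations) / len(durations)), "s average job time")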


That aside, how do we cope with an ever-increasing runtime requirement 
of tests? Keep adding more chunks?


___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: Some data on mozilla-inbound

2013-04-23 Thread Chris AtLee

On 16:34, Tue, 23 Apr, Gervase Markham wrote:

On 23/04/13 10:17, Ed Morley wrote:

Given that local machine time scales linearly with the rate at which we
hire devs (unlike our automation capacity), I think we need to work out
why (some) people aren't doing things like compiling locally and running
their team's directory of tests before pushing. I would hazard a guess
that if we improved incremental build times and created mach commands to
simplify the edit-compile-test loop, then we could cut out many of these
obvious inbound bustage cases.


That would be the carrot. The stick would be finding some way of finding
out whether a changeset was pushed to try before it was pushed to m-i.
If a developer failed to push to try and then broke m-i, we could (in a
pre-commit hook) refuse to let them commit to m-i in future unless
they'd already pushed to try. For a week, on first offence, a month on
subsequent offences :-)

This, of course, is predicated on being able to detect in real time
whether a changeset being pushed to m-i has previously been pushed to try.


We've considered enforcing this using some cryptographic token. After 
you push to try and get good results, the system gives you a token you 
need to include in your commit to m-i.


Alternatively, you could indicate the try revision you pushed, and we 
could look up the results and refuse the commit based on your 
build/test results on try, or if your commit to m-i is too different 
from the push to try.
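
Purely as a hypothetical sketch of that second idea, expressed as a 
server-side Mercurial hook: the "Try: <rev>" commit-message convention 
and the results check below are invented for illustration, and nothing 
like this exists today.

    import re

    TRY_RE = re.compile(r"\bTry:\s*([0-9a-f]{12,40})\b")

    def results_look_green(try_rev):
        # Would query self-serve (as discussed above) and inspect job results.
        raise NotImplementedError

    def pretxnchangegroup_hook(ui, repo, node, **kwargs):
        """Reject the push unless the first incoming changeset's message
        names a green try run (a real hook would check every changeset)."""
        message = repo[node].description()
        match = TRY_RE.search(message)
        if not match:
            ui.warn("no 'Try: <revision>' annotation found; push to try first\n")
            return True   # a truthy return aborts the transaction
        if not results_look_green(match.group(1)):
            ui.warn("referenced try run is not green\n")
            return True
        return False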


Cheers,
Chris


___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: New backout policy for Ts regressions on mozilla-inbound

2012-10-19 Thread Chris AtLee
On 18/10/12 06:44 PM, Justin Lebar wrote:
 Do we still have the bug where a test that finishes first, but is from a
 later cset (say a later cset IMPROVES Ts by 4% or more) would make us
 think we regressed it on an earlier cset if that earlier talos run
 finishes later?

 Such that we set graph points by the time the test finished, not the
 time of the push, etc.
 
 https://bugzilla.mozilla.org/show_bug.cgi?id=688534

That applies to the rendering of the graphs on graphs.m.o only. The
regression detection uses the push time to order the results.
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform


Re: try: -p all considered harmful?

2012-10-01 Thread Chris AtLee

On 30/09/12 03:43 AM, Justin Lebar wrote:

We're all trying to build the best system we can here. We've been publishing
as much raw data as we can, as well as reports like wait time data for ages.
We're not trying to hide this stuff away.


I understand.  My point is just that the data we currently have isn't
what we actually want to measure.  Wait times for individual parts of
a try push don't tell the whole story.  If Linux-64 wait times go
down, what fraction of people get their full try results faster?
(That is, how often is Linux-64 on the critical path for a try push?)
I honestly don't know.


[snip]

I hope we all agree that by this metric, we're currently failing.  The
current infrastructure does not meet demand.  (Indeed, demand is
actually higher than the jobs we're currently running, because we
would very much like to disable coalescing on m-i, but we can't do
that for lack of capacity.)  All I'm saying is that we currently don't
have the right public data to determine, after X amount of time has
passed, whether we've made any progress in this respect.


I just want to highlight that the data _is_ available publicly through 
several different mechanisms.


Raw build data going back to October 2009 is available here:
http://builddata.pub.build.mozilla.org/buildjson/

In addition, per-push information is available via self-serve, e.g.
https://secure.pub.build.mozilla.org/buildapi/self-serve/try/rev/ae6f597c4a09
https://secure.pub.build.mozilla.org/buildapi/self-serve/try/rev/ae6f597c4a09?format=json

The Try High Scores data is generated by pulling hg pushlog, and then 
looking up each push in self-serve. I'm not relying on private data.


You shouldn't feel blocked by RelEng in getting the data that you want. 
I'd most likely use these same APIs to look at end-to-end time.
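
As a rough sketch of what that end-to-end calculation could look like 
(the json-pushes layout and the self-serve "endtime" field are 
assumptions -- check both services' actual output before trusting the 
numbers):

    import json
    from urllib.request import urlopen

    def fetch_json(url):
        return json.load(urlopen(url))

    # Assumed json-pushes shape: {push_id: {"date": epoch, "changesets": [...]}}
    pushes = fetch_json("https://hg.mozilla.org/try/json-pushes")

    for push in pushes.values():
        rev = push["changesets"][-1][:12]
        jobs = fetch_json("https://secure.pub.build.mozilla.org/buildapi/"
                          f"self-serve/try/rev/{rev}?format=json")
        ends = [j["endtime"] for j in jobs if j.get("endtime")]
        if ends:
            print(rev, round((max(ends) - push["date"]) / 3600, 2),
                  "hours end to end")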


Cheers,
Chris
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform