Re: To bump mochitest's timeout from 45 seconds to 90 seconds

2016-02-09 Thread Gijs Kruitbosch
I concur with Ryan here, and I'd add that in my experience 90% or more 
of these timeouts (the ones that really are timeouts because the test is 
long, rather than just brokenness in the test that leaves it hanging 
until the timeout) happen on debug/ASan builds, where "perf regressions" 
isn't really a meaningful concept to analyze for, given the debug and 
ASan overhead.
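
A minimal sketch of how a test could opt into extra time on just those 
builds (requestLongerTimeout is the real mochitest API; treating 
AppConstants.DEBUG and AppConstants.ASAN as the available build flags is 
an assumption on my part):

// browser-chrome mochitest sketch: scale the per-test timeout only on
// the build types where long runs are expected.
Cu.import("resource://gre/modules/AppConstants.jsm");

if (AppConstants.DEBUG || AppConstants.ASAN) {
  // Multiplies this test's default timeout by the given factor.
  requestLongerTimeout(2);
}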


~ Gijs

On 09/02/2016 17:54, Ryan VanderMeulen wrote:

I'd have a much easier time accepting that argument if my experience
didn't tell me that nearly every single "Test took longer than expected"
or "Test timed out" intermittent ends with a RequestLongerTimeout as the
fix.

-Ryan

On 2/9/2016 12:50 PM, Haik Aftandilian wrote:

On Tue, Feb 9, 2016 at 2:47 AM, Marco Bonardo wrote:


Based on that, bumping the timeout may have 2 downsides, long term:
- slower tests for everyone
- sooner or later 90 seconds won't be enough again. Are we going to bump
to 180 then?



Essentially restating Marco's concern, increasing timeouts has the side
effect that performance regressions go unnoticed; i.e., a new bug that
causes a test to take longer, but still pass, is not detected. With the
original lower timeouts, the test would fail with a timeout. So a little
bit of the value of the tests is lost, and it's difficult to address
later.

Haik






Re: To bump mochitest's timeout from 45 seconds to 90 seconds

2016-02-09 Thread Marco Bonardo
On Tue, Feb 9, 2016 at 8:09 PM, Gijs Kruitbosch wrote:

> I concur with Ryan here, and I'd add that in my experience 90% or more of
> these timeouts (the ones that really are timeouts because the test is long,
> rather than just brokenness in the test that leaves it hanging until the
> timeout) happen on debug/ASan builds


The current timeout was already set up when we had debug tests, so it
already accounts for that. At most we should reduce the timeout on opt, if
that matters.

I think it would be fine to have much bigger timeouts on ASan or other
builds that are not our primary target. Debug and opt are close enough to
what we ship and what devs use every day that performance matters there too.


Re: Gecko/Firefox stats and diagrams wanted

2016-02-09 Thread Nathan Froyd
On Tue, Feb 9, 2016 at 12:31 PM, Nicholas Alexander wrote:

> I also wanted to try to find some diagrams to show how Firefox and Gecko
>> work/their architecture, from a high level perspective (not too insane a
>> level of detail, but reasonable).
>>
>
> Nathan Froyd worked up a very high-level slide deck for his onboarding
> sessions; they're amazing.  I'm not sure how public those slides are, so
> I've CCed him and he may choose to link to those.  I would really love to
> see these worked up into a document rather than a presentation.
>

The presentation is public:

https://docs.google.com/presentation/d/1ZHUkNzZK2TrF5_4MWd_lqEq7Ph5B6CDbNsizIkBxbnQ/edit?usp=sharing

I've tried to include links into wikis and whatnot where possible.  We have:

https://wiki.mozilla.org/Gecko:Overview

which includes jumping-off points for exploration of major subsystems, as
well.

If folks have suggestions of diagrams, links, etc. that should go in, I'd
love to hear about them.

Thanks,
-Nathan


Re: To bump mochitest's timeout from 45 seconds to 90 seconds

2016-02-09 Thread Marco Bonardo
On Tue, Feb 9, 2016 at 6:54 PM, Ryan VanderMeulen  wrote:

> I'd have a much easier time accepting that argument if my experience
> didn't tell me that nearly every single "Test took longer than expected" or
> "Test timed out" intermittent ends with a RequestLongerTimeout as the fix


This sounds equivalent to saying "since we don't have enough resources (or
a plan) to investigate why some tests take so long, let's give up"... but
then maybe we should have that discussion explicitly, rather than assuming
it's a given.
Since we are focused on quality, I don't think it's acceptable to say we
are fine with a test taking an unexpected amount of time to run. The fact
that those bugs end up being resolved by bumping the timeout without any
kind of investigation (and it happens, I know) is worrisome.


Re: To bump mochitest's timeout from 45 seconds to 90 seconds

2016-02-09 Thread Marco Bonardo
90 seconds for a simple test sounds like a lot of time and a huge bump from
the current situation (45).
The risk is people will start writing much bigger tests instead of
splitting them into smaller and more manageable tests. Plus, when a test
depends on a long timeout in the product, developers are used to figuring
out ways to reduce those (through hidden prefs or such) so that the test
can finish sooner and not time out.
Based on that, bumping the timeout may have 2 downsides, long term:
- slower tests for everyone
- sooner or later 90 seconds won't be enough again. Are we going to bump to
180 then?

I think that's the main reason the default timeout was set to a low value,
while still allowing the multipliers as a special case for tests that
really require more time, because there's no other way out.

Is Docker doubling the time for every test? From the bug it looks like it
may add 20-30% of overhead, so why are we not bumping the timeout by 30%
(let's say to 60s) and investigating the original cause (the test from the
bug that takes 80s to run) to figure out whether something can be done to
make it finish sooner?
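
To make the arithmetic explicit (a quick sketch; treating 30% as the
worst-case overhead is my assumption):

// 45s default plus the ~20-30% Docker overhead mentioned in the bug:
const currentDefault = 45;                          // seconds
const dockerOverhead = 0.3;                         // upper end of the estimate
console.log(currentDefault * (1 + dockerOverhead)); // 58.5 -> round to ~60s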

-m


On Mon, Feb 8, 2016 at 11:51 PM, Armen Zambrano G. wrote:

> Hello,
> In order to help us have less timeouts when running mochitests under
> docker, we've decided to double mochitests' gTimeoutSeconds and reduce
> large multipliers in half.
>
> Here's the patch if you're curious:
>
> https://bugzilla.mozilla.org/page.cgi?id=splinter.html&bug=1246152&attachment=8717111
>
> If you have any comments or concerns please raise them in the bug.
>
> regards,
> Armen
>
> --
> Zambrano Gasparnian, Armen
> Automation & Tools Engineer
> http://armenzg.blogspot.ca


Re: To bump mochitest's timeout from 45 seconds to 90 seconds

2016-02-09 Thread Armen Zambrano G.
I will try 60 seconds and see how it goes.
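
For anyone comparing the options, a back-of-envelope sketch (assuming, on
my part, that multipliers stay untouched in the 60-second trial):

// A test that today runs with requestLongerTimeout(4) under the 45s default:
const oldCap = 45 * 4;           // 180s effective cap today
const doubledPlan = 90 * 2;      // 180s -- doubling the default while halving
                                 // large multipliers keeps the cap unchanged
const sixtySecondTrial = 60 * 4; // 240s if the multiplier is left as-is
console.log(oldCap, doubledPlan, sixtySecondTrial);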

On 16-02-09 05:47 AM, Marco Bonardo wrote:
> 90 seconds for a simple test sounds like a lot of time and a huge bump from
> the current situation (45).
> The risk is people will start writing much bigger tests instead of
> splitting them into smaller and more manageable tests. Plus, when a test
> depends on a long timeout in the product, developers are used to figuring
> out ways to reduce those (through hidden prefs or such) so that the test
> can finish sooner and not time out.
> Based on that, bumping the timeout may have 2 downsides, long term:
> - slower tests for everyone
> - sooner or later 90 seconds won't be enough again. Are we going to bump to
> 180 then?
> 
> I think that's the main reason the default timeout was set to a low value,
> while still allowing the multipliers as a special case for tests that
> really require more time, because there's no other way out.
> 
> Is Docker doubling the time for every test? From the bug it looks like it
> may add 20-30% of overhead, so why are we not bumping the timeout by 30%
> (let's say to 60s) and investigating the original cause (the test from the
> bug that takes 80s to run) to figure out whether something can be done to
> make it finish sooner?
> 
> -m
> 
> 
> On Mon, Feb 8, 2016 at 11:51 PM, Armen Zambrano G. wrote:
> 
>> Hello,
>> In order to help us have less timeouts when running mochitests under
>> docker, we've decided to double mochitests' gTimeoutSeconds and reduce
>> large multipliers in half.
>>
>> Here's the patch if you're curious:
>>
>> https://bugzilla.mozilla.org/page.cgi?id=splinter.html&bug=1246152&attachment=8717111
>>
>> If you have any comments or concerns please raise them in the bug.
>>
>> regards,
>> Armen
>>
>> --
>> Zambrano Gasparnian, Armen
>> Automation & Tools Engineer
>> http://armenzg.blogspot.ca


-- 
Zambrano Gasparnian, Armen
Automation & Tools Engineer
http://armenzg.blogspot.ca


Re: Reftests moving to structured logging

2016-02-09 Thread Andrew Halberstadt

This is now live on central.

On 04/02/16 01:28 PM, Andrew Halberstadt wrote:

Reftest is the last major test harness still not using structured logs,
but that should change by the end of the week. See bug 1034290 [1] for
more details.

I've tried my best to make sure things like reftest-analyzer,
leak/assertion checks, crash detection, etc. all continue to work. But
due to the sad lack of tests for the harnesses themselves, it's possible
that I missed something. So if you see anything not working like it
should, please file a bug blocking bug 1034290 [1] and CC me.

What does this change mean for reftest? In the short term, nothing
should be different save that reftests will start working with tools
that depend on structured logging (e.g. ActiveData, auto-starring, etc.).
In the medium term, it means we'll be able to tweak the log format
without breaking anything (once consumers that are still parsing the
formatted log get updated). In the long term, structured logging will be
a foundation upon which new data-driven tools will be built.

Let me know if you have any questions or concerns,
-Andrew

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=1034290
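
To illustrate the difference (field names follow the mozlog conventions,
but this exact record is made up):

// One raw structured-log entry -- machine-readable as-is:
const entry = {
  action: "test_end",
  time: 1455033042123,                        // ms since epoch
  thread: "main",
  source: "reftest",
  test: "layout/reftests/example/sample.html",
  status: "PASS",
};
console.log(JSON.stringify(entry));
// ...versus a formatted line that consumers would otherwise have to regex
// apart, roughly: "REFTEST TEST-PASS | layout/reftests/example/sample.html |"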




Re: MozReview/Autoland in degraded state

2016-02-09 Thread Daniel Minor
Try integration is now restored.

Autoland to inbound will be available pending some further testing.

On Fri, Feb 5, 2016 at 5:34 PM, Gregory Szorc  wrote:

> r+ carry forward/"status" column is now working again.
>
> Autoland / Try integration is still offline.
>
> On Fri, Feb 5, 2016 at 12:13 PM, Mark Côté  wrote:
>
> > And a little longer than planned, but we're back.  All users, regardless
> > of level, can once again push code to MozReview.
> >
> > As noted, Autoland and r+ carry forward/"status" column will remain
> > disabled a little while longer, as there are some unrelated issues to
> > sort out.  We'll report back here when they're back, hopefully Monday.
> >
> > Mark
> >
> >
> > On 2016-02-05 1:42 PM, Mark Côté wrote:
> > > We will be deploying a fix for the ssh-level restrictions to MozReview
> > > shortly, around 2:30 pm EST/11:30 am PST.  MozReview will be down for
> > > about 10 minutes if all goes smoothly.  We'll be able to rollback not
> > > long after that if there are unresolvable issues.  You can follow along
> > > in #mozreview.
> > >
> > > Other fixes to LDAP and Autoland will follow Mondayish.
> > >
> > > Thank you for your patience.
> > >
> > > Mark
> > >
> > >
> > > On 2016-02-03 3:02 AM, Gregory Szorc wrote:
> > >> MozReview and Autoland are currently in a degraded state:
> > >>
> > >> * HTTP pushes are disabled
> > >> * SSH pushes require LDAP SCM Level 3 access
> > >> * Autoland is disabled
> > >> * r+ carry forward has been disabled / the overall "status" column in
> > the
> > >> commits list may not turn green
> > >>
> > >> The last bullet point is particularly troubling, as we had to disable
> > >> something that wasn't designed to be disabled. There may be some weird
> > >> fallout with review flag / state as a result.
> > >>
> > >> Bug 1244835 tracks restoring push access. Bug 1245412 tracks the other
> > >> issues.
> > >>
> > >> Please understand that additional technical details cannot be provided
> > at
> > >> this time.
> > >>
> > >> We apologize for the inconvenience and hope to have full service
> > restored
> > >> as soon as possible.
> > >>
> > >
> >


Gecko/Firefox stats and diagrams wanted

2016-02-09 Thread Chris Mills
Hi all,

I’m writing a presentation about browsers, standards implementation, and 
cross-browser coding to give at some universities. As a part of it, I wanted to 
present some stats about Firefox/Gecko to show how many people on average 
commit to it (say, every month, every year?), how many people work on 
localising the content strings, how many people work on platform/UI features, 
etc.

I also wanted to try to find some diagrams to show how Firefox and Gecko 
work/their architecture, from a high level perspective (not too insane a level 
of detail, but reasonable).

Has anyone got anything like these, or ideas on how I can get such information?

If so, I’d love to hear from you.

thanks,

Chris Mills
 Senior tech writer || Mozilla
developer.mozilla.org || MDN
 cmi...@mozilla.com || @chrisdavidmills



Memory Usage on Perfherder & Memory Reduction

2016-02-09 Thread Mark Finkle
Hi All,

Recently Geoff Brown landed an AWSY-like system [1] for tracking memory
usage on Perfherder. This is awesome. It's one of my pinned tabs.

I was happy to see two recent "drops" in memory usage:

1. A ~3% drop in "Resident Memory Tabs closed [+30s]", likely due to Bug
990916 which expires displayports
https://treeherder.mozilla.org/perf.html#/graphs?series=[mozilla-inbound,f9cdadf297fd409c043e8114ed0fa656334e7fad,1]=1454516622714.927,1454583882842.8733,181623181.32925725,250028978.43070653

2. A ~2% drop across all memory tracking sometime on Feb8. Hard to pick a
changeset, but the drop happened when inbound was merged to fx-team.
https://treeherder.mozilla.org/perf.html#/graphs?series=%5Bmozilla-inbound,f9cdadf297fd409c043e8114ed0fa656334e7fad,1%5D

Great to see drops in memory usage!

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=1233220


TCW Soft Close: Tree Closing Maintenance Window, Sat February 13 2016, 06:00-10:00a PST

2016-02-09 Thread Hal Wine
FYI. We do not expect any significant impact to platform operations.

"Soft Close" means we'll leave the trees open, but devs who push:
 - can expect issues
 - are personally responsible for managing their jobs (retries, etc.)

-- Forwarded message --
From: 
Date: Tue, Feb 9, 2016 at 11:20 AM
Subject: [Planned] Scheduled Tree Closing Maintenance Window, Sat February
13 2016, 06:00-10:00a PST
To: all-moco-m...@mozilla.com


Issue Status:  Upcoming
Short Summary: IT will be performing the following work during the Feb
13, 2016 TCW:

1232033 - Delete trunking of releng vlans to switch1.r601-1.ops.scl3.mozilla.net
1239378 - Upgrade java on production Elasticsearch cluster
1240821 - Upstream EPEL mirror needs to be corrected and resynch'd




Mozilla IT Maintenance Notification:
--

Issue Status:  Upcoming
Bug IDs:   1239400
Start Date:2016-02-13
Start Time:06:00 PST
Site:  All
Services:  Tree Closure
Impact of Work:Minimal disruption to Mozilla sites and services is
expected.
Elasticsearch availability for SUMO will be impacted during change 1239378.

If you have any questions or concerns please address them
to...@mozilla.com or visit #moc in IRC

Also, visit whistlepig.mozilla.org for all notifications.
--
m...@mozilla.com - m...@mozilla.com


Re: To bump mochitest's timeout from 45 seconds to 90 seconds

2016-02-09 Thread Chris AtLee
On 9 February 2016 at 14:51, Marco Bonardo  wrote:

> On Tue, Feb 9, 2016 at 6:54 PM, Ryan VanderMeulen wrote:
>
> > I'd have a much easier time accepting that argument if my experience
> > didn't tell me that nearly every single "Test took longer than expected"
> or
> > "Test timed out" intermittent ends with a RequestLongerTimeout as the fix
>
>
> This sounds equivalent to saying "since we don't have enough resources (or
> a plan) to investigate why some tests take so long, let's give up"... but
> then maybe we should have that discussion explicitly, rather than assuming
> it's a given.
> Since we are focused on quality, I don't think it's acceptable to say we
> are fine with a test taking an unexpected amount of time to run. The fact
> that those bugs end up being resolved by bumping the timeout without any
> kind of investigation (and it happens, I know) is worrisome.
>

I agree. However, this has traditionally been a very difficult area for
Release Engineering and Engineering Productivity to make progress in.

Who can we work with to understand these timing characteristics in more
depth?
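
One cheap starting point, now that the harnesses emit structured logs:
pull per-test wall times out of a raw log (one JSON object per line) and
rank the slowest. A rough Node sketch -- the log filename is hypothetical,
and I'm assuming the test_start/test_end entries carry "test" and "time"
fields:

const fs = require("fs");

const starts = new Map();
const durations = [];
for (const line of fs.readFileSync("mochitest_raw.log", "utf8").split("\n")) {
  if (!line.trim()) continue;
  const entry = JSON.parse(line);
  if (entry.action === "test_start") {
    starts.set(entry.test, entry.time);
  } else if (entry.action === "test_end" && starts.has(entry.test)) {
    durations.push([entry.test, entry.time - starts.get(entry.test)]);
  }
}
durations.sort((a, b) => b[1] - a[1]);
for (const [test, ms] of durations.slice(0, 20)) {
  console.log((ms / 1000).toFixed(1) + "s  " + test);
}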


Re: To bump mochitest's timeout from 45 seconds to 90 seconds

2016-02-09 Thread Haik Aftandilian
On Tue, Feb 9, 2016 at 2:47 AM, Marco Bonardo  wrote:

> Based on that, bumping the timeout may have 2 downsides, long term:
> - slower tests for everyone
> - sooner or later 90 seconds won't be enough again. Are we going to bump to
> 180 then?
>

Essentially restating Marco's concern, increasing timeouts has the side
effect that performance regressions go unnoticed; i.e., a new bug that
causes a test to take longer, but still pass, is not detected. With the
original lower timeouts, the test would fail with a timeout. So a little
bit of the value of the tests is lost, and it's difficult to address later.
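
One way to keep a bit of that signal even with a larger hard timeout would
be a softer per-test budget that logs loudly when crossed -- a sketch only,
not an existing harness feature (the 10s budget is an arbitrary example):

add_task(function* test_with_soft_budget() {
  const budgetMs = 10000; // hypothetical budget, well under the hard timeout
  const start = Date.now();
  // ... yield the actual test work here ...
  const elapsed = Date.now() - start;
  if (elapsed > budgetMs) {
    info("test exceeded its soft budget: " + elapsed + "ms > " + budgetMs + "ms");
  }
  ok(true, "test body finished under the hard timeout");
});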

Haik


>
> I think that's the main reason the default timeout was set to a low value,
> while still allowing the multipliers as a special case for tests that
> really require more time, because there's no other way out.
>
> Is Docker doubling the time for every test? From the bug it looks like it
> may add 20-30% of overhead, so why are we not bumping the timeout by 30%
> (let's say to 60s) and investigating the original cause (the test from the
> bug that takes 80s to run) to figure out whether something can be done to
> make it finish sooner?
>
> -m
>
>
> On Mon, Feb 8, 2016 at 11:51 PM, Armen Zambrano G. wrote:
>
> > Hello,
> > In order to help us have less timeouts when running mochitests under
> > docker, we've decided to double mochitests' gTimeoutSeconds and reduce
> > large multipliers in half.
> >
> > Here's the patch if you're curious:
> >
> >
> > https://bugzilla.mozilla.org/page.cgi?id=splinter.html&bug=1246152&attachment=8717111
> >
> > If you have any comments or concerns please raise them in the bug.
> >
> > regards,
> > Armen
> >
> > --
> > Zambrano Gasparnian, Armen
> > Automation & Tools Engineer
> > http://armenzg.blogspot.ca


Re: Gecko/Firefox stats and diagrams wanted

2016-02-09 Thread Nicholas Alexander
+Kyle, +Nathan

On Tue, Feb 9, 2016 at 9:00 AM, Chris Mills  wrote:

> Hi all,
>
> I’m writing a presentation about browsers, standards implementation, and
> cross-browser coding to give at some universities. As a part of it, I
> wanted to present some stats about Firefox/Gecko to show how many people on
> average commit to it (say, every month, every year?), how many people work
> on localising the content strings, how many people work on platform/UI
> features, etc.
>

Kyle Lahnakoski has done some work in this area -- he set up a neat
contributor dashboard.  Perhaps Kyle has more data about paid activity
too.  I'm CCing him to see if he can say more.  I imagine Mike Hoye has
much to say here.


> I also wanted to try to find some diagrams to show how Firefox and Gecko
> work/their architecture, from a high level perspective (not too insane a
> level of detail, but reasonable).
>

Nathan Froyd worked up a very high-level slide deck for his onboarding
sessions; they're amazing.  I'm not sure how public those slides are, so
I've CCed him and he may choose to link to those.  I would really love to
see these worked up into a document rather than a presentation.

Thanks for doing this work!
Nick


Re: To bump mochitest's timeout from 45 seconds to 90 seconds

2016-02-09 Thread Ryan VanderMeulen
I'd have a much easier time accepting that argument if my experience 
didn't tell me that nearly every single "Test took longer than expected" 
or "Test timed out" intermittent ends with a RequestLongerTimeout as the 
fix.
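
For anyone unfamiliar, that fix is literally one line at the top of the
affected test (real mochitest API; the factor is whatever makes the
intermittent go away):

requestLongerTimeout(2); // doubles this test's share of the harness timeout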


-Ryan

On 2/9/2016 12:50 PM, Haik Aftandilian wrote:

On Tue, Feb 9, 2016 at 2:47 AM, Marco Bonardo  wrote:


Based on that, bumping the timeout may have 2 downsides, long term:
- slower tests for everyone
- sooner or later 90 seconds won't be enough again. Are we going to bump to
180 then?



Essentially restating Marco's concern, increasing timeouts has the side
effect that performance regressions go unnoticed; i.e., a new bug that
causes a test to take longer, but still pass, is not detected. With the
original lower timeouts, the test would fail with a timeout. So a little
bit of the value of the tests is lost, and it's difficult to address later.

Haik

