Re: Updating the policy for Talos performance regression in 2015

2015-03-27 Thread Lawrence Mandel
Old thread but now that we're 3 months into 2015, has this new policy been
effective at getting perf regressions fixed or at least deliberately
accepted?

Lawrence

On Fri, Dec 19, 2014 at 1:33 PM, jma...@mozilla.com wrote:

 Great questions folks.

 :bsmedberg has answered the questions quite well, let me elaborate:
 Before a bug can be marked as resolved:fixed we need to verify the
 regression is actually fixed.  In many cases we will fix a large portion of
 the regression and accept the small remainder.

 We do keep track of all the bugs filed per version (firefox 36 example:
 https://bugzilla.mozilla.org/show_bug.cgi?id=1084461)

 these get looked at more specifically during each uplift.

 I will update the verbiage next week to call out how these will be followed
 up and posted to:
 https://www.mozilla.org/hacking/regression-policy.html

 Do speak up if this should be posted elsewhere or linked from a specific
 location.
 ___
 dev-platform mailing list
 dev-platform@lists.mozilla.org
 https://lists.mozilla.org/listinfo/dev-platform



Re: Updating the policy for Talos performance regression in 2015

2015-03-27 Thread Joel Maher
As one of the primary people sheriffing alerts, I have found that we get
decisions made much faster as a result of this policy.  I would be
interested to hear if others have differing opinions as I could be seeing
this with tunnel vision.

-Joel


On Fri, Mar 27, 2015 at 4:34 PM, Lawrence Mandel lman...@mozilla.com
wrote:

 Old thread but now that we're 3 months into 2015, has this new policy been
 effective at getting perf regressions fixed or at least deliberately
 accepted?

 Lawrence

 On Fri, Dec 19, 2014 at 1:33 PM, jma...@mozilla.com wrote:

 Great questions folks.

 :bsmedberg has answered the questions quite well, let me elaborate:
 Before a bug can be marked as resolved:fixed we need to verify the
 regression is actually fixed.  In many cases we will fix a large portion of
 the regression and accept the small remainder.

 We do keep track of all the bugs filed per version (firefox 36 example:
 https://bugzilla.mozilla.org/show_bug.cgi?id=1084461)

 these get looked at more specifically during each uplift.

 I will update the verbiage next week to call out how these will be
 followed up and posted to:
 https://www.mozilla.org/hacking/regression-policy.html

 Do speak up if this should be posted elsewhere or linked from a specific
 location.





Re: Updating the policy for Talos performance regression in 2015

2014-12-19 Thread Ehsan Akhgari

This looks good overall.  Two questions though:

On 2014-12-18 6:47 AM, jmaher wrote:

Mozilla - 2015 Talos performance regression policy

Over the last year and a half the Talos tests have been rewritten to be more 
useful and meaningful.  This means we need to take them seriously and cannot 
just ignore real issues when we don't have time.  This does not mean we need to 
fix or back out every changeset that caused a regression.

Starting in 2015, when a regression is identified as related to a specific 
changeset, the patch author will be asked for information via the needinfo flag.  
We expect a response and reasonable dialog within 72 hours (3 business days) of 
requesting information.  If no response is given we will back out the patch(es) 
in question and the patch author can investigate when they have time and reland.
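
To make the response window concrete, here is a minimal sketch of how the
deadline could be computed. It assumes the "3 business days" reading of the
policy (weekends skipped, holidays ignored). The function name
response_deadline is hypothetical and is not part of any Mozilla tooling.

```python
from datetime import datetime, timedelta


def response_deadline(requested_at: datetime, business_days: int = 3) -> datetime:
    """Return the time by which a needinfo response is expected.

    Assumption: Saturday and Sunday do not count, and holidays are ignored.
    """
    deadline = requested_at
    remaining = business_days
    while remaining > 0:
        deadline += timedelta(days=1)
        if deadline.weekday() < 5:  # Monday is 0, Friday is 4
            remaining -= 1
    return deadline
```

Under these assumptions, a needinfo requested on Friday 2015-01-02 at 09:00
would be due on Wednesday 2015-01-07 at 09:00.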

Some requirements before requesting needinfo:
* On integration branches (higher volume), a talos sheriff will have verified 
the root cause within 1 week of the patch landing
* a patch or set of patches from a bug must be identified as the root cause.  
This can take place through retriggers on the tree or in the case of many 
patches landing at once this would take place through a push to try backing out 
the suspected patch(es)
* links in the bug to document the regression (and any related 
regressions/improvements)
* if we are confident this is the root cause and it meets a 3% regression 
threshold, then the needinfo request will mention that this policy will be 
enforced
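
The requirements above can be read as a simple checklist. The sketch below is
one possible encoding, not the sheriffs' actual tooling: the RegressionReport
record and its field names are hypothetical, and only the 3% threshold and the
1 week window come from the policy text.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

REGRESSION_THRESHOLD_PCT = 3.0  # "meets a 3% regression threshold"
VERIFY_WINDOW_DAYS = 7          # "within 1 week of the patch landing"


@dataclass
class RegressionReport:
    """Hypothetical record of one Talos alert tied to a bug."""
    landed_on: date
    root_cause_verified_on: Optional[date]  # None if not yet verified
    regression_pct: float
    has_documentation_links: bool


def policy_enforced(report: RegressionReport) -> bool:
    """Return True if the needinfo request should mention enforcement."""
    if report.root_cause_verified_on is None:
        return False
    days_to_verify = (report.root_cause_verified_on - report.landed_on).days
    return (
        0 <= days_to_verify <= VERIFY_WINDOW_DAYS
        and report.has_documentation_links
        and report.regression_pct >= REGRESSION_THRESHOLD_PCT
    )
```

A report that misses any one of these checks would still be worth a needinfo
request, but under this reading the request would not cite the backout policy.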

Acceptable outcomes:
* A promise to attempt a fix at the bug is agreed upon, the bug is assigned to 
someone and put in a queue.


How do we ensure that the follow-up bug actually does get fixed and it 
fixes the regression completely?



* The bug contains enough details and evidence to support accepting this 
regression, and we mark it as wontfix
* It is agreed that this should be backed out


Do we plan to have a different approach towards more severe regressions? 
 For example, if a patch regresses startup time by 50%, would we still 
accept evidence to support that the regression should be accepted, or 
would we tolerate it in the tree for a few weeks before it gets fixed?



Other scenarios:
* A bug related to the alert is not filed within 1 week of the patch landing.  
This removes the urgency and required action.
* We only caught a regression at uplift time.  The root cause may not be 
easy to determine in this case; this will be documented, and the identified 
patch authors will use their judgement to fix the bug
* Regression is unrelated to code (say, a PGO issue) - this should be 
documented in the bug and closed as wontfix.
* When we uplift to Aurora or Beta, all regressions filed before the uplift 
that show up on the upstream branch will have a needinfo flag set and require 
action to be taken.


Please take a moment to look over this and outline any concerns you might have.

Thanks,
Joel





Re: Updating the policy for Talos performance regression in 2015

2014-12-19 Thread Benjamin Smedberg


On 12/19/2014 10:05 AM, Ehsan Akhgari wrote:




Acceptable outcomes:
* A promise to attempt a fix at the bug is agreed upon, the bug is 
assigned to someone and put in a queue.


How do we ensure that the follow-up bug actually does get fixed and it 
fixes the regression completely?


Avi/Vladan will be tracking these and nagging as appropriate.




* The bug contains enough details and evidence to support 
accepting this regression, and we mark it as wontfix

* It is agreed that this should be backed out


Do we plan to have a different approach towards more severe 
regressions?  For example, if a patch regresses startup time by 50%, 
would we still accept evidence to support that the regression should 
be accepted, or would we tolerate it in the tree for a few weeks 
before it gets fixed?


I don't think this can be answered in advance. If we're in this 
situation, it will be because we're making some huge cost/benefit 
tradeoff and we have high confidence that the regression can be fixed or 
that it's worth the corresponding benefit. Product managers would likely 
be involved in making the final decision based on technical 
recommendations from the engineers and the performance team.


--BDS



Re: Updating the policy for Talos performance regression in 2015

2014-12-19 Thread jmaher
Great questions folks.

:bsmedberg has answered the questions quite well, let me elaborate:
Before a bug can be marked as resolved:fixed we need to verify the regression 
is actually fixed.  In many cases we will fix a large portion of the regression 
and accept the small remainder.

We do keep track of all the bugs filed per version (firefox 36 example: 
https://bugzilla.mozilla.org/show_bug.cgi?id=1084461)

these get looked at more specifically during each uplift.

I will update the verbiage next week to call out how these will be followed up 
and posted to:
https://www.mozilla.org/hacking/regression-policy.html

Do speak up if this should be posted elsewhere or linked from a specific 
location.