Re: Updating the policy for Talos performance regression in 2015
Old thread, but now that we're 3 months into 2015, has this new policy been effective at getting perf regressions fixed or at least deliberately accepted?

Lawrence

On Fri, Dec 19, 2014 at 1:33 PM, jma...@mozilla.com wrote:
> Great questions, folks. :bsmedberg has answered the questions quite well; let me elaborate:
>
> Before a bug can be marked as resolved:fixed, we need to verify the regression is actually fixed. In many cases we will fix a large portion of the regression and accept the small remainder.
>
> We do keep track of all the bugs filed per version (Firefox 36 example: https://bugzilla.mozilla.org/show_bug.cgi?id=1084461); these get looked at more specifically during each uplift.
>
> I will update the verbiage next week to call out how these will be followed up and posted to:
> https://www.mozilla.org/hacking/regression-policy.html
>
> Do speak up if this should be posted elsewhere or linked from a specific location.

___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform
Re: Updating the policy for Talos performance regression in 2015
As one of the primary people sheriffing alerts, I have found that decisions get made much faster as a result of this policy. I would be interested to hear whether others have differing opinions, as I could be seeing this with tunnel vision.

-Joel

On Fri, Mar 27, 2015 at 4:34 PM, Lawrence Mandel lman...@mozilla.com wrote:
> Old thread, but now that we're 3 months into 2015, has this new policy been effective at getting perf regressions fixed or at least deliberately accepted?
>
> Lawrence
>
> On Fri, Dec 19, 2014 at 1:33 PM, jma...@mozilla.com wrote:
>> Great questions, folks. :bsmedberg has answered the questions quite well; let me elaborate:
>>
>> Before a bug can be marked as resolved:fixed, we need to verify the regression is actually fixed. In many cases we will fix a large portion of the regression and accept the small remainder.
>>
>> We do keep track of all the bugs filed per version (Firefox 36 example: https://bugzilla.mozilla.org/show_bug.cgi?id=1084461); these get looked at more specifically during each uplift.
>>
>> I will update the verbiage next week to call out how these will be followed up and posted to:
>> https://www.mozilla.org/hacking/regression-policy.html
>>
>> Do speak up if this should be posted elsewhere or linked from a specific location.
Re: Updating the policy for Talos performance regression in 2015
This looks good overall. Two questions though:

On 2014-12-18 6:47 AM, jmaher wrote:
> Mozilla - 2015 Talos performance regression policy
>
> Over the last year and a half the Talos tests have been rewritten to be more useful and meaningful. This means we need to take them seriously and cannot just ignore real issues when we don't have time. This does not mean we need to fix or backout every changeset that caused a regression.
>
> Starting in 2015, when a regression is identified to be related to a specific changeset, the patch author will be asked for information via the needinfo flag. We expect a response and reasonable dialog within 72 hours (3 business days) of requesting information. If no response is given we will backout the patch(es) in question and the patch author can investigate when they have time and reland.
>
> Some requirements before requesting needinfo:
> * On integration branches (higher volume), a talos sheriff will have verified the root cause within 1 week of the patch landing
> * a patch or set of patches from a bug must be identified as the root cause. This can take place through retriggers on the tree or, in the case of many patches landing at once, through a push to try backing out the suspected patch(es)
> * links in the bug to document the regression (and any related regressions/improvements)
> * if we are confident this is the root cause and it meets a 3% regression threshold, then the needinfo request will mention that this policy will be enforced
>
> Acceptable outcomes:
> * A promise to attempt a fix at the bug is agreed upon, the bug is assigned to someone and put in a queue.

How do we ensure that the follow-up bug actually does get fixed and it fixes the regression completely?

> * The bug will contain enough details and evidence to support accepting this regression, we will mark it as wontfix
> * It is agreed that this should be backed out

Do we plan to have a different approach towards more severe regressions? For example, if a patch regresses startup time by 50%, would we still accept evidence to support that the regression should be accepted, or would we tolerate it in the tree for a few weeks before it gets fixed?

> Other scenarios:
> * A bug related to the alert is not filed within 1 week of the patch landing. This removes the urgency and required action.
> * We only caught a regression at uplift time. There is a chance this isn't easily determined; this will be documented, and identified patch authors will use their judgement to fix the bug
> * Regression is unrelated to code (say pgo issue) - this should be documented in the bug and closed as wontfix.
> * When we uplift to Aurora or Beta, all regressions filed before the uplift that show up on the upstream branch will have a needinfo flag set and require action to be taken.
>
> Please take a moment to look over this and outline any concerns you might have.
>
> Thanks,
> Joel
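For readers who want the decision flow of the quoted policy in one place, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not part of any Talos or Bugzilla tooling: the names `Regression`, `Action`, and `next_action` are hypothetical, the 72-hour window is treated as wall-clock time rather than business days, and the thresholds are simply the figures from the policy text (3%, 1 week, 72 hours).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional

# Figures taken from the policy text; how they are measured is an assumption.
REGRESSION_THRESHOLD_PCT = 3.0         # "meets a 3% regression threshold"
VERIFY_WINDOW = timedelta(weeks=1)     # root cause verified within 1 week of landing
RESPONSE_WINDOW = timedelta(hours=72)  # response expected within 72 hours (wall clock)


class Action(Enum):
    NO_ENFORCEMENT = "no enforcement"      # policy is not (or not yet) enforced
    REQUEST_NEEDINFO = "request needinfo"  # ask the patch author for information
    AWAIT_RESPONSE = "await response"      # still inside the response window
    DISCUSS_OUTCOME = "discuss outcome"    # fix promised, wontfix, or agreed backout
    BACK_OUT = "back out"                  # no response inside the window


@dataclass
class Regression:
    """Hypothetical record of one regression; not a Bugzilla or Talos type."""
    landed: datetime
    regression_pct: float
    root_cause_verified: Optional[datetime] = None
    needinfo_requested: Optional[datetime] = None
    author_responded: bool = False


def next_action(reg: Regression, now: datetime) -> Action:
    """Return the next step the policy suggests for this regression."""
    verified = reg.root_cause_verified
    # Enforcement requires a root cause verified within a week of landing.
    if verified is None or verified - reg.landed > VERIFY_WINDOW:
        return Action.NO_ENFORCEMENT
    # Regressions below the threshold do not trigger enforcement.
    if reg.regression_pct < REGRESSION_THRESHOLD_PCT:
        return Action.NO_ENFORCEMENT
    if reg.needinfo_requested is None:
        return Action.REQUEST_NEEDINFO
    if reg.author_responded:
        return Action.DISCUSS_OUTCOME
    if now - reg.needinfo_requested > RESPONSE_WINDOW:
        return Action.BACK_OUT
    return Action.AWAIT_RESPONSE


# Example: a 4.5% regression, verified and flagged two days after landing,
# with no author response four days after the needinfo request.
landed = datetime(2015, 1, 5, 12, 0)
example = Regression(
    landed=landed,
    regression_pct=4.5,
    root_cause_verified=landed + timedelta(days=2),
    needinfo_requested=landed + timedelta(days=2),
)
print(next_action(example, landed + timedelta(days=6)).name)
```

The sketch deliberately leaves the acceptable outcomes (fix promised, wontfix, agreed backout) as a single `DISCUSS_OUTCOME` step, since the policy leaves that choice to the dialog between the author and the sheriffs.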
Re: Updating the policy for Talos performance regression in 2015
On 12/19/2014 10:05 AM, Ehsan Akhgari wrote:
>> Acceptable outcomes:
>> * A promise to attempt a fix at the bug is agreed upon, the bug is assigned to someone and put in a queue.
>
> How do we ensure that the follow-up bug actually does get fixed and it fixes the regression completely?

Avi/Vladan will be tracking these and nagging as appropriate.

>> * The bug will contain enough details and evidence to support accepting this regression, we will mark it as wontfix
>> * It is agreed that this should be backed out
>
> Do we plan to have a different approach towards more severe regressions? For example, if a patch regresses startup time by 50%, would we still accept evidence to support that the regression should be accepted, or would we tolerate it in the tree for a few weeks before it gets fixed?

I don't think this can be answered in advance. If we're in this situation, it will be because we're making some huge cost/benefit tradeoff and we have high confidence that the regression can be fixed or that it's worth the corresponding benefit. Product managers would likely be involved in making the final decision based on technical recommendations from the engineers and the performance team.

--BDS
Re: Updating the policy for Talos performance regression in 2015
Great questions, folks. :bsmedberg has answered the questions quite well; let me elaborate:

Before a bug can be marked as resolved:fixed, we need to verify the regression is actually fixed. In many cases we will fix a large portion of the regression and accept the small remainder.

We do keep track of all the bugs filed per version (Firefox 36 example: https://bugzilla.mozilla.org/show_bug.cgi?id=1084461); these get looked at more specifically during each uplift.

I will update the verbiage next week to call out how these will be followed up and posted to:
https://www.mozilla.org/hacking/regression-policy.html

Do speak up if this should be posted elsewhere or linked from a specific location.