Re: Changes to performance testing?

Richard Eisenberg Sun, 21 Feb 2021 12:32:37 -0800


> On Feb 21, 2021, at 11:24 AM, Ben Gamari <b...@well-typed.com> wrote:
> 
> To mitigate this I would suggest that we allow performance test failures
> in marge-bot pipelines. A slightly weaker variant of this idea would
> instead only allow performance *improvements*. I suspect the latter
> would get most of the benefit, while eliminating the possibility that a
> large regression goes unnoticed.


The value in making a performance improvement a test failure is that patch 
authors are informed of what they have done and can check that it matches 
expectations. This need can reasonably be satisfied without stopping merging. 
That is, if Marge can accept performance improvements, while (say) posting to 
each MR involved that it may have contributed to a performance improvement, 
then I think we've done our job here.

On the other hand, a performance degradation is a bug, just like, say, an error 
message regression. Even if it's a combination of commits that cause the 
problem (an actual possibility even for error message regressions), it's still 
a bug that we need to either fix or accept (balanced out by other 
improvements). The pain of debugging this scenario might be mitigated if there 
were a collation of the performance wibbles for each individual commit. This 
information is, in general, available: each commit passed CI on its own, and so 
it should be possible to create a little report with its rows being perf tests 
and its columns being commits or MR #s; each cell in the table would be a 
percentage regression. If we're lucky, the regression Marge sees will be the 
sum(*) of the entries in one of the rows -- this means that we have a simple 
agglomeration of performance degradation. If we're less lucky, the whole will 
not equal the sum of the parts, and some of the patches interfere. In either 
case, the table would suggest a likely place to look next.

(*) I suppose if we're recording percentages, it wouldn't necessarily be the 
actual sum, because percentages are a bit funny. But you get my meaning.
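
A minimal sketch of the table idea and of the footnote about percentages, in 
Python. The test names, MR numbers, and figures are invented for illustration, 
and `combined_percent` and `report` are hypothetical helpers, not part of GHC's 
test driver or of marge-bot. It assumes each cell holds the percentage change a 
commit caused on its own; such changes compound as factors rather than adding.

```python
from math import prod

# Hypothetical per-commit changes in percent, taken from each commit's own CI run.
# Rows are perf tests, columns are MRs. All names and numbers are made up.
per_commit = {
    "T1234": {"!101": 1.0, "!102": 2.0, "!103": 0.0},
    "T5678": {"!101": 0.0, "!102": -0.5, "!103": 0.5},
}


def combined_percent(changes):
    """Compound individual percentage changes into one overall percentage."""
    factor = prod(1 + c / 100 for c in changes)
    return (factor - 1) * 100


def report(per_commit, observed, tolerance=0.1):
    """Compare the change seen on the merged batch with the per-commit prediction.

    Returns one (test, predicted, observed, interferes) tuple per perf test.
    `tolerance` is an arbitrary threshold in percentage points (an assumption).
    """
    rows = []
    for test, changes in per_commit.items():
        predicted = combined_percent(changes.values())
        seen = observed[test]
        rows.append((test, predicted, seen, abs(seen - predicted) > tolerance))
    return rows
```

For small changes the compounded figure is close to the plain sum (1% and 2% 
give 3.02%). A row whose observed change is far from the prediction suggests 
that some of the patches interfere.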

Pulling this all together:
* I'm against the initial proposal of allowing all performance failures by 
Marge. This will allow bugs to accumulate (in my opinion).
* I'm in favor of allowing performance improvements to be accepted by Marge.
* To mitigate the information loss from Marge accepting performance 
improvements, it would be great if Marge could alert MR authors that a 
cumulative performance improvement took place.
* To mitigate the annoyance of finding a performance regression in a 
merge commit that does not appear in any component commit, it would be great if 
there were a tool to collect performance numbers from a set of commits and 
present them in a table for further analysis.
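
The policy in the first three points could be sketched roughly as follows, in 
Python. This is an illustration only: `marge_decision` and its `threshold` are 
hypothetical, not marge-bot's real interface, and it assumes every metric is 
"lower is better", so a negative percentage is an improvement.

```python
def marge_decision(changes, threshold=1.0):
    """Decide what to do with a batch, given its perf changes.

    `changes` maps a perf test name to the percentage change seen on the
    merged batch. `threshold` is an assumed acceptance window in percent.
    """
    regressions = {t: c for t, c in changes.items() if c > threshold}
    improvements = {t: c for t, c in changes.items() if c < -threshold}
    merge = not regressions  # a regression still blocks the batch
    return {
        "merge": merge,
        # On a successful merge, tell each MR in the batch about improvements.
        "notify_authors": merge and bool(improvements),
        "regressions": regressions,
        "improvements": improvements,
    }
```

With this sketch, a batch that only improves is merged and its authors are 
notified, while any regression beyond the window still stops the merge.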

These "mitigations" might take work. If nobody is available to do this work, 
I'm in favor of simply allowing the performance improvements, maybe also 
filing a ticket about these potential improvements to the process.

Richard

_______________________________________________
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
