> Simon, you are on Windows, correct? If so, the result you posted is using
> the "x86_64-linux-deb9" environment as a baseline when it should be using
> "x86_64", which is much closer to the result you're getting.
WSL actually (Windows Subsystem for Linux).

Simon

From: David Eichmann <dav...@well-typed.com>
Sent: 02 July 2020 16:25
To: Simon Peyton Jones <simo...@microsoft.com>; ghc-devs@haskell.org
Subject: Re: Perf notes


Hello,

So CI metrics are being pushed again, good. The immediate issue was perf test
T9803. I've looked at the metrics across all CI test environments and the last
300 commits on master, and the metric looks stable. I've attached the output
of: `python3 ./testsuite/driver/perf_notes.py --chart T9203.html --ci
--test-name T9203 origin/master~300..origin/master`.

Simon, you are on Windows, correct? If so, the result you posted is using the
"x86_64-linux-deb9" environment as a baseline when it should be using
"x86_64", which is much closer to the result you're getting.

> So it's just an implementation detail whether the numbers you save are gotten 
> from one run, or another identical one.

For the most part yes, but since we usually rebase and batch commits with
Marge bot, we really are creating a new commit, so it makes some sense to
rerun CI.

> The CI log tells you the comparison between the preceding commit and this one

I'd love for this to be the behavior too. The problem is that, most of the
time, we don't have the metrics for the parent commit, and generating them is
expensive. We could automatically check out and build the previous commit and
run the perf tests, but that doesn't seem like good design. That's why we've
resorted to searching for the "most recent" local or CI metrics to establish
an approximate baseline. Requiring developers to always run perf tests locally
on the previous commit would be extremely annoying. Another option is to just
disable perf tests by default.
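
To make that concrete, here is a minimal sketch (my own illustration, not the
actual perf_notes.py code) of what the "most recent baseline" search amounts
to: walk back through the commit's ancestry and use the first commit for which
some metric, local or fetched from CI, has been recorded.

    # A minimal sketch, not the actual perf_notes.py logic: walk back through
    # first-parent ancestry and take the first commit with a recorded metric
    # as the baseline. `recorded` stands in for local / CI-fetched git notes.
    import subprocess

    def find_baseline(recorded, test_name, commit="HEAD", max_depth=300):
        """Return (sha, value) for the nearest ancestor with a metric, or None."""
        for depth in range(1, max_depth + 1):
            proc = subprocess.run(["git", "rev-parse", f"{commit}~{depth}"],
                                  capture_output=True, text=True)
            if proc.returncode != 0:   # ran out of history
                return None
            sha = proc.stdout.strip()
            value = recorded.get((sha, test_name))
            if value is not None:
                return sha, value
        return None  # no baseline found; the result can only be reported, not checked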

It's not satisfying, but it's hard to think of a better solution. I think the
current implementation is still more convenient than before, when baselines
were just hard-coded into the test files: if you wanted to investigate a
particular commit, you had to establish a baseline manually. Perhaps the
problem with the new system is that it's a bit too "magical" and it's unclear
how to interpret the results. Perhaps that can be remedied with better output
from the test runner.

> Would it be possible to output a table (in the log) like we get from 
> nofib-analyse

Absolutely. This should be fairly easy to implement. I've created #18417.
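
As a rough illustration only (hypothetical, not necessarily what #18417 will
end up looking like), such a summary could be printed from the baseline and
current values the driver already has; the test names and numbers below are
made up.

    # Illustration only: print a one-line-per-test summary given
    # (test, metric, baseline, current) tuples.
    def print_perf_table(results):
        header = (f"{'Test':<12} {'Metric':<18} "
                  f"{'Baseline':>14} {'Current':>14} {'Change':>8}")
        print(header)
        print("-" * len(header))
        for test, metric, baseline, current in results:
            change = (current - baseline) / baseline * 100.0
            print(f"{test:<12} {metric:<18} "
                  f"{baseline:>14,} {current:>14,} {change:>+7.1f}%")

    # Hypothetical values, purely for illustration:
    print_perf_table([
        ("T9203",  "bytes allocated", 102_400_000, 104_900_000),
        ("T12234", "bytes allocated",  80_000_000,  79_200_000),
    ])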

- David E


On 6/29/20 3:08 PM, Simon Peyton Jones wrote:

> Re the doubling of bytes-allocated on T9803, that's a good point. Due to the
> recent change in RSA keys, CI is currently failing to upload metrics
> (e.g. [1])! I'll fix that, then see if I can track down where / if the
> metric has really regressed in master.
Thanks


> Yes we run CI on MRs, but once merged into master, CI is run again. It's
> only those metrics from CI on master (post-merge) that are ultimately
> uploaded / used as a baseline.
OK.  But they are guaranteed to be 100.0% identical to the ones discovered by 
CI, aren't they?   So it's just an implementation detail whether the numbers 
you save are gotten from one run, or another identical one.

I'm still lost about when I can rely on the perf output of CI and when I
can't. I'm really hoping for a simple answer like:

  * The CI log tells you the comparison between the preceding commit and this
    one.

No ifs, no buts. Simple!

Incidentally, would it be possible to output a table (in the log) like we get
from nofib-analyse? It looks like this:


            Program           Size    Allocs   Runtime   Elapsed  TotalMem
    --------------------------------------------------------------------------------
              boyer          -0.3%     +5.4%     +0.7%     +1.0%      0.0%
           cichelli          -0.3%     +5.9%     -9.9%     -9.5%      0.0%
          compress2          -0.4%     +9.6%     +7.2%     +6.4%      0.0%
        constraints          -0.3%     +0.2%     -3.0%     -3.4%      0.0%
       cryptarithm2          -0.3%     -3.9%     -2.2%     -2.4%      0.0%
             gamteb          -0.4%     +2.5%     +2.8%     +2.8%      0.0%
               life          -0.3%     -2.2%     -4.7%     -4.9%      0.0%
               lift          -0.3%     -0.3%     -0.8%     -0.5%      0.0%
             linear          -0.3%     -0.1%     -4.1%     -4.5%      0.0%
               mate          -0.2%     +1.4%     -2.2%     -1.9%    -14.3%
             parser          -0.3%     -2.1%     -5.4%     -4.6%      0.0%
             puzzle          -0.3%     +2.1%     -6.6%     -6.3%      0.0%
             simple          -0.4%     +2.8%     -3.4%     -3.3%     -2.2%
            veritas          -0.1%     +0.7%     -0.6%     -1.1%      0.0%
       wheel-sieve2          -0.3%    -19.2%    -24.9%    -24.5%    -42.9%
    --------------------------------------------------------------------------------
                Min          -0.4%    -19.2%    -24.9%    -24.5%    -42.9%
                Max          +0.1%     +9.6%     +7.2%     +6.4%    +33.3%
     Geometric Mean          -0.3%     -0.0%     -3.0%     -2.9%     -0.3%

Instantly comprehensible, one line per benchmark. I find I spend quite a lot
of time searching manually in the log and building a table (or excerpts
thereof) looking like this.

I don't have an opinion about the columns; I just want a table with one line
per benchmark and a number of columns.
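
For what it's worth, the summary rows at the bottom (Min, Max, Geometric Mean)
follow the usual nofib-analyse convention of taking the geometric mean of the
per-benchmark ratios rather than averaging the percentages. A small sketch of
that calculation (my illustration, not nofib-analyse's own code):

    import math

    def summary_rows(ratios):
        # ratios are current/baseline per benchmark, e.g. 1.054 for +5.4%
        changes = [(r - 1.0) * 100.0 for r in ratios]
        geo = math.exp(sum(math.log(r) for r in ratios) / len(ratios))
        return {"Min": min(changes),
                "Max": max(changes),
                "Geometric Mean": (geo - 1.0) * 100.0}

    # e.g. allocations changing by +5.4% and -19.2% on two benchmarks:
    print(summary_rows([1.054, 0.808]))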

Thanks

Simon


From: David Eichmann <dav...@well-typed.com>
Sent: 27 June 2020 20:39
To: Simon Peyton Jones <simo...@microsoft.com>; ghc-devs@haskell.org
Subject: Re: Perf notes


> I thought that wasn't possible.  Isn't that what CI is *for*?

Yes we run CI on MRs, but once merged into master, CI is run again. It's only
those metrics from CI on master (post-merge) that are ultimately uploaded /
used as a baseline.

Re the doubling of bytes-allocated on T9803, that's a good point. Due to the
recent change in RSA keys, CI is currently failing to upload metrics
(e.g. [1])! I'll fix that, then see if I can track down where / if the metric
has really regressed in master.



[1] "fatal: Could not read from remote repository."  
https://gitlab.haskell.org/ghc/ghc/-/jobs/378487<https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgitlab.haskell.org%2Fghc%2Fghc%2F-%2Fjobs%2F378487&data=02%7C01%7Csimonpj%40microsoft.com%7Cc4f75c38623e40b4983608d81e9c118a%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637293004078959683&sdata=1obE6D2oHoZxYizjzOAsxqOcW%2B3PcY6kqfckNgKFFQg%3D&reserved=0>

--

David Eichmann, Haskell Consultant

Well-Typed LLP, http://www.well-typed.com



Registered in England & Wales, OC335890

118 Wymering Mansions, Wymering Road, London W9 2NF, England

_______________________________________________
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
