Re: [swift-dev] Measuring MEAN Performance (was: Questions about Swift-CI)

2017-06-12 Thread Pavol Vaskovic via swift-dev
As the next two paragraphs after the part you quoted go on explaining, I'm hoping that with this approach we could adaptively sample the benchmark until we get stable population, but starting from lower iteration count. If the Python implementation bears this out, the proper solution would be to
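A rough sketch of what adaptive sampling from a low starting count could look like. The stopping rule here is an assumption made for illustration, not the criterion from the thread: sampling stops once doubling the sample count moves the running median by no more than a relative tolerance. The names `sample_until_stable`, `measure`, `start`, `max_samples` and `tolerance` are all hypothetical.

```python
import statistics


def sample_until_stable(measure, start=3, max_samples=100, tolerance=0.01):
    """Collect timing samples until the running median settles.

    Assumption (not from the thread): the population counts as "stable"
    when doubling the number of samples changes the median by no more
    than `tolerance`, relative to the previous median.
    `measure` is a hypothetical callable returning one positive timing.
    """
    samples = [measure() for _ in range(start)]
    while len(samples) < max_samples:
        before = statistics.median(samples)
        # Double the sample count, but never exceed the cap.
        extra = min(len(samples), max_samples - len(samples))
        samples.extend(measure() for _ in range(extra))
        after = statistics.median(samples)
        if abs(after - before) <= tolerance * before:
            break
    return samples
```

With a perfectly steady `measure` this stops after the first doubling (6 samples for `start=3`); a noisy start costs extra rounds, up to `max_samples`.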

Re: [swift-dev] Measuring MEAN Performance (was: Questions about Swift-CI)

2017-06-12 Thread Andrew Trick via swift-dev
> On Jun 12, 2017, at 5:29 PM, Michael Gottesman wrote: > >> I don't know what that is. > > Check it out: https://en.wikipedia.org/wiki/Mann–Whitney_U_test > . It is a > non-parametric test that two sets of
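For readers unfamiliar with the test linked above, here is a minimal pure-Python sketch of the Mann-Whitney U statistic. It assumes the normal approximation for the two-sided p-value and applies no tie correction to the variance, so it is only indicative for small or heavily tied samples. The function name `mann_whitney_u` is made up for this example and is not part of any Swift benchmark tooling.

```python
import math


def mann_whitney_u(xs, ys):
    """Return (U, two-sided p-value) for two independent samples.

    Sketch only: uses the normal approximation and ignores the tie
    correction in the variance.
    """
    n1, n2 = len(xs), len(ys)
    # Pool both samples, remembering which group each value came from.
    pooled = sorted((v, g) for g, sample in enumerate((xs, ys)) for v in sample)
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values share the average of the 1-based ranks i+1 .. j.
        avg = (i + 1 + j) / 2.0
        for k in range(i, j):
            ranks[k] = avg
        i = j
    r1 = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u1 = r1 - n1 * (n1 + 1) / 2.0
    u = min(u1, n1 * n2 - u1)
    mu = n1 * n2 / 2.0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mu) / sigma
    p = math.erfc(abs(z) / math.sqrt(2))
    return u, p
```

Fully separated samples give U = 0 and a small p-value, while identical samples give U = n1*n2/2 and p = 1.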

Re: [swift-dev] Measuring MEAN Performance (was: Questions about Swift-CI)

2017-06-12 Thread Andrew Trick via swift-dev
> On Jun 12, 2017, at 5:55 PM, Pavol Vaskovic wrote: > > > > On Tue, Jun 13, 2017 at 2:31 AM, Michael Gottesman > wrote: >> I don't think we can get more consistent test results just from re-running >> tests that were

Re: [swift-dev] Measuring MEAN Performance (was: Questions about Swift-CI)

2017-06-12 Thread Andrew Trick via swift-dev
> On Jun 12, 2017, at 4:45 PM, Pavol Vaskovic wrote: > > I have sketched an algorithm for getting more consistent test results, so far > it's in Numbers. I have run the whole test suite for 100 samples and observed > the varying distribution of test results. The first result is

Re: [swift-dev] Measuring MEAN Performance (was: Questions about Swift-CI)

2017-06-12 Thread Andrew Trick via swift-dev
> On Jun 12, 2017, at 4:45 PM, Pavol Vaskovic wrote: > > We really have two problems: > 1. spurious results > 2. the turnaround time for the entire benchmark suite > > > I don't think we can get more consistent test results just from re-running > tests that were detected as

Re: [swift-dev] Measuring MEAN Performance (was: Questions about Swift-CI)

2017-06-12 Thread Pavol Vaskovic via swift-dev
On Tue, Jun 13, 2017 at 2:31 AM, Michael Gottesman wrote: > > I don't think we can get more consistent test results just from re-running > tests that were detected as changes in the first pass, as described in > SR-4669 , because that

Re: [swift-dev] Measuring MEAN Performance (was: Questions about Swift-CI)

2017-06-12 Thread Pavol Vaskovic via swift-dev
Hi Andrew, I wrote the draft of this e-mail a few weeks ago, and the following sentence is not true: > > It's emitted when the new MIN falls inside the (MIN..MAX) range of the OLD > baseline. It is not checked the other way around. > See below... On Tue, Jun 13, 2017 at 1:45 AM, Pavol Vaskovic

Re: [swift-dev] Measuring MEAN Performance (was: Questions about Swift-CI)

2017-06-12 Thread Michael Gottesman via swift-dev
> On Jun 12, 2017, at 4:45 PM, Pavol Vaskovic via swift-dev > wrote: > > Hi Andrew, > > On Mon, Jun 12, 2017 at 11:55 PM, Andrew Trick > wrote: >> To partially address this issue (I'm guessing) the last SPEEDUP column >>

Re: [swift-dev] Measuring MEAN Performance (was: Questions about Swift-CI)

2017-06-12 Thread Michael Gottesman via swift-dev
> On Jun 12, 2017, at 4:54 PM, Pavol Vaskovic wrote: > > > > On Mon, Jun 12, 2017 at 11:55 PM, Michael Gottesman > wrote: > > The current design assumes that in such cases, the workload will be increased > so that is not an

Re: [swift-dev] Measuring MEAN Performance (was: Questions about Swift-CI)

2017-06-12 Thread Pavol Vaskovic via swift-dev
On Mon, Jun 12, 2017 at 11:55 PM, Michael Gottesman wrote: > > The current design assumes that in such cases, the workload will be > increased so that is not an issue. > I understand. But clearly some part of our process is failing, because there are multiple benchmarks in

[swift-dev] Spanish Translation

2017-06-12 Thread Luis Leos via swift-dev
I'm available to translate if needed. Thanks! -Luis Leos

Re: [swift-dev] Measuring MEAN Performance (was: Questions about Swift-CI)

2017-06-12 Thread Pavol Vaskovic via swift-dev
Hi Andrew, On Mon, Jun 12, 2017 at 11:55 PM, Andrew Trick wrote: > To partially address this issue (I'm guessing) the last SPEEDUP column > sometimes features a mysterious question mark in brackets. It's emitted when > the new MIN falls inside the (MIN..MAX) range of the OLD

Re: [swift-dev] Measuring MEAN Performance (was: Questions about Swift-CI)

2017-06-12 Thread Andrew Trick via swift-dev
> On Jun 12, 2017, at 2:55 PM, Michael Gottesman wrote: > >> Current approach to detecting performance changes is fragile for tests that >> have very low absolute runtime, as they are easily over the 5% >> improvement/regression threshold when the test machine gets a
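To illustrate why a fixed 5% threshold is fragile for tests with very low absolute runtime, here is a small sketch. The helper `classify` and the microsecond figures are hypothetical; only the 5% threshold comes from the quoted message.

```python
def classify(old_us, new_us, threshold=0.05):
    """Label a runtime change using a relative threshold (5% by default).

    Hypothetical helper for illustration; not the benchmark driver's code.
    """
    change = (new_us - old_us) / old_us
    if change > threshold:
        return "regression"
    if change < -threshold:
        return "improvement"
    return "unchanged"


# The same 2 microsecond disturbance on a short and on a long test:
print(classify(20, 22))      # 10% relative change on a 20 us test
print(classify(2000, 2002))  # 0.1% relative change on a 2000 us test
```

The short test is flagged as a regression even though the absolute noise is identical, which is the fragility described above.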

Re: [swift-dev] Measuring MEAN Performance (was: Questions about Swift-CI)

2017-06-12 Thread Andrew Trick via swift-dev
> On Jun 12, 2017, at 1:45 PM, Pavol Vaskovic wrote: > > On Tue, May 16, 2017 at 9:10 PM, Dave Abrahams via swift-dev > > wrote: > > on Thu May 11 2017, Pavol Vaskovic wrote: > > I have run Benchmark_O with --num-iters=100 on my

Re: [swift-dev] Measuring MEAN Performance (was: Questions about Swift-CI)

2017-06-12 Thread Michael Gottesman via swift-dev
> On Jun 12, 2017, at 1:45 PM, Pavol Vaskovic via swift-dev > wrote: > > On Tue, May 16, 2017 at 9:10 PM, Dave Abrahams via swift-dev > > wrote: > > on Thu May 11 2017, Pavol Vaskovic wrote: > > I have run Benchmark_O

[swift-dev] Measuring MEAN Performance (was: Questions about Swift-CI)

2017-06-12 Thread Pavol Vaskovic via swift-dev
On Tue, May 16, 2017 at 9:10 PM, Dave Abrahams via swift-dev < swift-dev@swift.org> wrote: > > on Thu May 11 2017, Pavol Vaskovic wrote: > > I have run Benchmark_O with --num-iters=100 on my machine for the >> whole performance test suite, to get a feeling for the distribution of >>