FYI, I've added a wiki page describing how to write a new perf. test: https://trac.webkit.org/wiki/Writing%20Performance%20Tests
On Fri, Jan 20, 2012 at 11:20 AM, Ojan Vafai <[email protected]> wrote:

> On Thu, Jan 19, 2012 at 3:20 PM, Ryosuke Niwa <[email protected]> wrote:
>
>> I didn't merge it into run-webkit-tests because performance tests don't
>> pass/fail but instead give us values that fluctuate over time. While
>> Chromium takes the approach of hard-coding the range of acceptable values,
>> such an approach has a high maintenance cost and is prone to problems such
>> as having to widen the range periodically as the score slowly degrades
>> over time. Also, as you can see on the Chromium perf
>> bots <http://build.chromium.org/p/chromium.perf/console>,
>> the test results tend to fluctuate a lot, so hard-coding a tight range of
>> acceptable values is tricky.
>
> While this isn't perfect, I still think it's worth doing.

I'm afraid the maintenance cost here will be too high. The values necessarily
depend on each bot, so we would need <number of tests>×<number of bots>
expectations, and I don't think people are enthusiastic about maintaining
values like that over time (even I don't want to do it myself).

> Turning the bot red when a performance test fails badly is helpful for
> finding and reverting regressions quickly, which in turn helps identify
> smaller regressions more easily (large regressions mask smaller ones).

I agree. Maybe we can compute the historical average and standard deviation
and turn bots red if a new value doesn't fall within <some value between 1
and 2> standard deviations of the average.

> In either case, we have to get the bots running the tests and work on
> getting reliable data first.

After http://trac.webkit.org/changeset/106211, values for most tests have
become very stable; they tend to vary within a 5% range.

- Ryosuke
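[A minimal sketch of the standard-deviation check Ryosuke describes above.
This is not part of any WebKit tooling; the function name, the 1.5-sigma
threshold, and the sample timings are all hypothetical, chosen only to
illustrate the idea of a per-bot, per-test history-based check.]

    import statistics

    # Hypothetical threshold; Ryosuke suggests somewhere between 1 and 2.
    SIGMA_THRESHOLD = 1.5

    def is_regression(history, new_value, threshold=SIGMA_THRESHOLD):
        """Flag new_value if it falls outside the historical mean by
        more than `threshold` standard deviations."""
        mean = statistics.mean(history)
        stdev = statistics.stdev(history)
        return abs(new_value - mean) > threshold * stdev

    # Hypothetical per-bot history of past results for one test (ms).
    history = [102.0, 98.5, 101.2, 99.8, 100.4]
    print(is_regression(history, 112.0))  # True: well outside 1.5 sigma
    print(is_regression(history, 100.9))  # False: normal fluctuation

[The appeal over hard-coded ranges is that the acceptable band is derived
from each bot's own data, so it needs no per-bot maintenance and widens or
narrows automatically as the measurements do.]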

