On Thu, Mar 1, 2012 at 6:41 PM, Jesus Sanchez-Palencia <[email protected]> wrote:
> A Qt WebKit1 performance bot was added last week, sorry for the late
> announcement.
>
> If I'm not mistaken, currently run-perf-tests works with DRT only, but
> what if we would like to make it work with WTR as well so we could
> also have WebKit2 performance bots running? I'm not aware of the
> infrastructure provided by webkitpy (Drivers, etc.), so I'm not sure
> about the amount of work needed...

To get WKTR running the performance tests, a '-2' switch must be added to
PerfTestRunner, and some refactoring is required in WKTR itself to properly
handle the '--no-timeout' switch when given. I've got a diff of these changes
lying around that I can turn into a patch if there isn't one yet; just point
me to a bug (or let's create one).

Best,
Zan

> Cheers,
> jesus
>
> On Tue, Jan 31, 2012 at 8:16 PM, Ryosuke Niwa <[email protected]> wrote:
> > FYI, I've added a wiki page describing how to write a new perf.
> > test: https://trac.webkit.org/wiki/Writing%20Performance%20Tests
> >
> > On Fri, Jan 20, 2012 at 11:20 AM, Ojan Vafai <[email protected]> wrote:
> >> On Thu, Jan 19, 2012 at 3:20 PM, Ryosuke Niwa <[email protected]> wrote:
> >>> I didn't merge it into run-webkit-tests because performance tests don't
> >>> pass/fail but instead give us some values that fluctuate over time.
> >>> While Chromium takes the approach of hard-coding the range of acceptable
> >>> values, such an approach has a high maintenance cost and is prone to
> >>> problems such as having to increase the range periodically as the score
> >>> slowly degrades over time. Also, as you can see on the Chromium perf
> >>> bots, the test results tend to fluctuate a lot, so hard-coding a tight
> >>> range of acceptable values is tricky.
> >>
> >> While this isn't perfect, I still think it's worth doing.
> >
> > I'm afraid that the maintenance cost here will be too high. Values will
> > necessarily depend on each bot, so we'll need <number of tests> × <number
> > of bots> expectations, and I don't think people are enthusiastic about
> > maintaining values like that over time (even I don't want to do that
> > myself).
> >
> >> Turning the bot red when a performance test fails badly is helpful for
> >> finding and reverting regressions quickly, which in turn helps identify
> >> smaller regressions more easily (large regressions mask smaller ones).
> >
> > I agree. Maybe we can obtain the historical average and standard deviation
> > and turn bots red if the value doesn't fall within <some value between
> > 1 and 2> standard deviations.
> >
> >> In either case, we have to get the bots running the tests and work on
> >> getting reliable data first.
> >
> > After http://trac.webkit.org/changeset/106211, values for most tests have
> > gotten very stable. They tend to vary within a 5% range.
> >
> > - Ryosuke
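For illustration, a minimal sketch of the standard-deviation check Ryosuke
suggests above, assuming per-test historical results are available as plain
numbers; the function name, the sample data, and the threshold k are made-up
placeholders, not anything that exists in the build infrastructure:

    import math

    def is_regression(historical_values, new_value, k=1.5):
        """Return True if new_value lies more than k standard deviations
        away from the historical mean (k somewhere between 1 and 2)."""
        n = len(historical_values)
        if n < 2:
            return False  # not enough history to judge
        mean = sum(historical_values) / float(n)
        variance = sum((v - mean) ** 2 for v in historical_values) / (n - 1)
        return abs(new_value - mean) > k * math.sqrt(variance)

    # Example: results that vary within roughly 5%, then a clear outlier.
    history = [100.2, 98.7, 101.5, 99.8, 100.9, 97.6, 102.1]
    print(is_regression(history, 100.4))   # False -> bot stays green
    print(is_regression(history, 110.0))   # True  -> bot turns red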
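And a rough sketch of the kind of switch Zan mentions adding for WKTR,
assuming PerfTestRunner parses its options with optparse like other webkitpy
scripts do; the option and attribute names here are assumptions, not the
actual webkitpy change:

    import optparse

    parser = optparse.OptionParser()
    # Hypothetical switch: pick WebKitTestRunner (WebKit2) instead of DumpRenderTree.
    parser.add_option('-2', '--webkit-test-runner', action='store_true',
                      dest='webkit_test_runner', default=False,
                      help='Run the tests with WebKitTestRunner instead of DumpRenderTree')
    options, args = parser.parse_args(['-2'])
    driver = 'WebKitTestRunner' if options.webkit_test_runner else 'DumpRenderTree'
    print(driver)  # WebKitTestRunner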
_______________________________________________
webkit-dev mailing list
[email protected]
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev

