Hi,

Three major problems with existing WebKit performance tests are:

   1. There are too many tests to run
   2. Some tests have variances that are too high to be of any use
   3. Some tests are too specific to be of any general use


To address them, I’m going to segregate the tests into 3 tiers:

   1. Reliable tests that should be run on bots and locally when testing
   patches.
   2. Supplemental tests that can optionally be run.
   3. Skipped tests.

In addition, I’m going to add a fourth category between 1 and 2 for new
tests that have just been added, since deciding whether a test is reliable
is hard until we have some data.
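
As a rough sketch of how collected data could feed that decision: the snippet below looks at the relative spread (standard deviation divided by the mean) of a test's recorded results. The function names, the "probation" label for the new category, and the thresholds (10 runs, 5% spread) are all assumptions made for illustration. None of them are existing WebKit tooling or agreed-upon criteria.

```python
from statistics import mean, stdev


def relative_spread(samples):
    """Sample standard deviation divided by the mean (coefficient of variation)."""
    return stdev(samples) / mean(samples)


def suggest_tier(samples, min_runs=10, max_spread=0.05):
    """Suggest a tier for a test from its recorded results.

    min_runs and max_spread are hypothetical thresholds, not project policy.
    """
    if len(samples) < min_runs:
        # Not enough data yet: keep the test in the new in-between category.
        return "probation"
    if relative_spread(samples) <= max_spread:
        return "tier 1"
    return "tier 2"


stable = [100, 101, 99, 100, 102, 98, 100, 101, 99, 100]
noisy = [100, 150, 60, 120, 80, 140, 70, 130, 90, 110]

print(suggest_tier(stable[:3]))  # too few runs to judge
print(suggest_tier(stable))
print(suggest_tier(noisy))
```

A real criterion would probably also need to look at results across several bots and builds; this only shows the shape of the check.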

This fourth category is very important. While perf.webkit.org has the
ability to aggregate results for each suite (e.g. for the entire DOM) by
arbitrary functions (e.g. arithmetic means, geometric means), letting
everyone add arbitrary tests to any suite will undermine our ability to
monitor the results of reliable tests, because new tests add noise and
skew the means.  In essence, we need a way to determine whether new
tests can be added to tier 1 “test suites”.
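
To make the skew concrete, here is a small made-up example. The test names and numbers are invented, and the aggregation is plain Python rather than anything perf.webkit.org actually runs. It shows how a single newly added test can move a suite's aggregate even though none of the existing tests changed.

```python
from statistics import geometric_mean, mean


def aggregate(results):
    """Return (arithmetic mean, geometric mean) of a suite's per-test results."""
    values = list(results.values())
    return mean(values), geometric_mean(values)


# Hypothetical suite of three existing tests with identical results.
suite = {"test-a": 100.0, "test-b": 100.0, "test-c": 100.0}

# The same suite after someone adds one new test with a very different scale.
suite_with_new = dict(suite)
suite_with_new["new-test"] = 1600.0

before_arith, before_geo = aggregate(suite)
after_arith, after_geo = aggregate(suite_with_new)

print(f"before: arithmetic={before_arith:.1f} geometric={before_geo:.1f}")
print(f"after:  arithmetic={after_arith:.1f} geometric={after_geo:.1f}")
```

With these numbers the arithmetic mean goes from 100 to 475 and the geometric mean from 100 to 200, so a regression or improvement in the original three tests becomes much harder to spot in the aggregate.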

- R. Niwa
_______________________________________________
webkit-dev mailing list
webkit-dev@lists.webkit.org
https://lists.webkit.org/mailman/listinfo/webkit-dev
