Hi blink-dev,
I would like to let you know that blink-engprod has added fuzzy matching
support for non-WPT tests. Both non-WPT reftests and pixel tests can now
use the same fuzzy matching meta tags as WPT tests. It also shows two image
diff stats, the maximum color channel difference and the total number of
differing pixels, in results.html
<https://test-results.appspot.com/data/layout_results/linux-rel/1073794/blink_web_tests%20%28with%20patch%29/layout-test-results/results.html>.
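To make the two stats concrete, here is a minimal sketch of how they could be computed for a pair of images. It is not the actual image diff implementation; the function name `image_diff_stats` and the flat list-of-RGBA-tuples image representation are assumptions made for illustration only.

```python
from typing import List, Tuple

# Hypothetical representation: an image is a flat list of (R, G, B, A)
# pixels, each channel in the range 0-255.
Pixel = Tuple[int, int, int, int]


def image_diff_stats(actual: List[Pixel], expected: List[Pixel]) -> Tuple[int, int]:
    """Return (max_difference, total_pixels) for two same-sized images.

    max_difference: largest absolute difference in any single color channel.
    total_pixels:   number of pixels that differ in at least one channel.
    """
    if len(actual) != len(expected):
        raise ValueError("images must have the same number of pixels")

    max_difference = 0
    total_pixels = 0
    for a, e in zip(actual, expected):
        # Largest per-channel difference for this one pixel.
        channel_diff = max(abs(x - y) for x, y in zip(a, e))
        if channel_diff > 0:
            total_pixels += 1
            max_difference = max(max_difference, channel_diff)
    return max_difference, total_pixels
```

For example, two images that differ by one unit in the green channel of a single pixel would report a maximum channel difference of 1 and one differing pixel.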
With these capabilities in place, we would like to investigate whether we
can set up some general fuzzy matching rules and help Blink developers
identify flaky tests that could be fixed by adjusting their fuzzy matching
rules. Currently, quite a few web tests are flaky because of a slight image
mismatch that should have been tolerated. Suppose we set up a general
fuzzy matching rule, something like:
<meta name="fuzzy" content="0-1;0-1000">
Such a rule tells the image comparison web tests to ignore the diff and
pass the test when the color channel difference and the pixel diff both
fall within the ranges of the rule. This way we can reduce test flakiness
while still maintaining test accuracy and not missing real bugs.
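As a rough illustration of the check described above, the sketch below parses the shorthand `maxDifference;totalPixels` form used in the example rule and tests a pair of diff stats against it. The helper names `parse_fuzzy` and `fuzzy_pass` are hypothetical, only the `LO-HI;LO-HI` shorthand is handled, and this is not the code the test runner uses.

```python
from typing import Tuple

Range = Tuple[int, int]  # inclusive (low, high)


def parse_fuzzy(content: str) -> Tuple[Range, Range]:
    """Parse a shorthand rule such as "0-1;0-1000".

    The first range applies to the max color channel difference and the
    second to the total number of differing pixels.
    """
    parts = content.split(";")
    if len(parts) != 2:
        raise ValueError("expected 'LO-HI;LO-HI', got %r" % content)

    ranges = []
    for part in parts:
        low, sep, high = part.strip().partition("-")
        if not sep:
            raise ValueError("expected a 'LO-HI' range, got %r" % part)
        ranges.append((int(low), int(high)))
    return ranges[0], ranges[1]


def fuzzy_pass(max_difference: int, total_pixels: int, content: str) -> bool:
    """Return True if both diff stats fall within the rule's ranges."""
    (diff_low, diff_high), (pixels_low, pixels_high) = parse_fuzzy(content)
    return (diff_low <= max_difference <= diff_high
            and pixels_low <= total_pixels <= pixels_high)
```

Under the example rule, a diff with maxDifference=1 across 500 pixels would be tolerated, while maxDifference=2, or 1001 differing pixels, would still fail the test.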
We would like to ask you a few quick survey questions to help us make
design decisions: does it make sense to set up a universal, across-the-board
fuzzy matching tolerance rule for all Blink web tests, or should we make
the rules more specific to individual tests or test sets?
1. Is a universal fuzzy matching tolerance rule acceptable for the web
tests in your area?
a) If the answer is yes, what is the acceptable range of max color
channel difference and pixel diff for your tests?
b) If the answer is no, please share your reasons.
2. Do you prefer adjusting fuzzy matching rules at a per-test or
per-test-set level, based on the pixel difference numbers shown in
results.html?
Here is some sample data to help you choose. We recently collected data
from blink_web_tests results on the linux-test builder, showing the
distribution of color channel maxDifference and totalPixels diff for
failing/flaky blink_web_tests.
(Note: over 70% of the tests in the color channel maxDifference 0-10 range
have maxDifference=1.)
Color channel maxDifference
Range                 Failing test count
0-10                  98
11-100                31
101-200               28
201-260               111

totalPixels diff
Range                 Failing test count
0-100                 30
100-1000              57
1000-10,000           99
10,000-100,000        66
100,000-1,000,000     16
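To put the counts in proportion, the short sketch below converts each bucket into a percentage of the total. The counts are copied from the data above (both distributions sum to 268 tests); the `shares` helper is just an illustration. Note that these are two separate distributions, so they do not show how many tests fall inside both ranges of a given rule at once.

```python
# Counts copied from the sample data above.
max_difference_counts = {
    "0-10": 98,
    "11-100": 31,
    "101-200": 28,
    "201-260": 111,
}
total_pixels_counts = {
    "0-100": 30,
    "100-1000": 57,
    "1000-10,000": 99,
    "10,000-100,000": 66,
    "100,000-1,000,000": 16,
}


def shares(counts):
    """Return each bucket's share of the total, in percent (one decimal)."""
    total = sum(counts.values())
    return {bucket: round(100 * n / total, 1) for bucket, n in counts.items()}


if __name__ == "__main__":
    for name, counts in [("maxDifference", max_difference_counts),
                         ("totalPixels", total_pixels_counts)]:
        print(name, "total tests:", sum(counts.values()))
        for bucket, pct in shares(counts).items():
            print("  %-20s %5.1f%%" % (bucket, pct))
```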
Let me know if you have any questions. Looking forward to hearing from you!
Vivian
on behalf of Chrome-Blink-EngProd