How much of a problem is flakiness caused by minor pixel differences
compared to overall flakiness? I looked at the top 10 flaky tests here
<https://data.corp.google.com/sites/chrome_generic_flakiness_dashboard_datasite/top_flakes/?f=test_id:re:.*web_test.*>
and none of them were caused by minor pixel differences.

70 tests is a manageable number and it seems reasonable to add fuzzy 
matching to them.
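
As a rough illustration of what manually adding a rule to one of these tests
could look like, here is a minimal Python sketch that inserts a fuzzy meta tag
into a test's HTML source. The helper name add_fuzzy_meta, the choice to place
the tag right after the doctype, and the default content value (taken from the
rule quoted further down in this thread) are assumptions for illustration; this
is not part of any existing tooling.

```python
import re

# Matches an existing <meta name=fuzzy ...> tag, with or without quotes.
FUZZY_META_RE = re.compile(r"<meta\s+name=[\"']?fuzzy[\"']?", re.IGNORECASE)
# Matches a leading doctype declaration plus any whitespace after it.
DOCTYPE_RE = re.compile(r"\s*<!doctype[^>]*>\s*", re.IGNORECASE)


def add_fuzzy_meta(html: str, content: str = "0-1;0-1000000") -> str:
    """Return html with a fuzzy meta tag added, unless one is already present.

    Hypothetical helper: the placement (right after the doctype, or at the
    very start when there is no doctype) is an assumption of this sketch.
    """
    if FUZZY_META_RE.search(html):
        return html  # Leave tests that already carry a fuzzy rule untouched.
    tag = f'<meta name="fuzzy" content="{content}">\n'
    match = DOCTYPE_RE.match(html)
    insert_at = match.end() if match else 0
    return html[:insert_at] + tag + html[insert_at:]
```

Running it twice on the same source leaves the file unchanged the second time,
so it would be safe to apply over a list of tests that may overlap.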

On Tuesday, August 2, 2022 at 9:04:00 AM UTC-7 Xianzhu Wang wrote:

> On Mon, Aug 1, 2022 at 10:36 AM Vivian Zhi (支文文) <[email protected]> 
> wrote:
>
>> Thanks for the valuable feedback! Stephen, Xianzhu, we will see if we can
>> add a filter to results.html to grab the tests in that range.
>>
>
> The CL <https://chromium-review.googlesource.com/c/chromium/src/+/3803707>
> adding a pixel diff filter to results.html has landed. Thanks Thorben!
>
> In this example results.html
> <https://test-results.appspot.com/data/layout_results/linux-rel/1086271/blink_web_tests%20%28with%20patch%29/layout-test-results/results.html>,
> you can examine the pixel results of tests that produced pixel differences
> matching a particular fuzzy rule with the following steps:
> 1. Enter a pixel difference filter, e.g. "channel_max:1-1", in the filter
> input box;
> 2. Click the "All" button (as we show regressions only by default).
> You might want to switch to "side-by-side view" and click the image to
> examine the pixel values.
>
> With "channel_max:1-1" we can see all tests that produced pixel 
> differences that can be tolerated with a fuzzy rule like <meta name=fuzzy 
> content="0-1;0-1000000">. There are 70 such tests in the example 
> results.html. All of them look benign to me. So perhaps a universal rule 
> (for non-WPT tests) is appropriate?
>
> On the other hand, even if we have such a universal rule, we can only 
> recover 70 tests. Instead of applying the rule automatically, we can also 
> manually modify these tests to include a meta fuzzy rule.
>
>
>> On Mon, Aug 1, 2022 at 8:40 AM Xianzhu Wang <[email protected]> 
>> wrote:
>>
>>> On Mon, Aug 1, 2022 at 4:25 AM Stephen Chenney <[email protected]> 
>>> wrote:
>>>
>>>> Thanks for investigating the potential for fuzzy matching.
>>>>
>>>> Rendering Core continues to oppose a single fuzzy match rule across all 
>>>> web_tests. We have some tests where single pixel differences matter 
>>>> (related to pixel snapping, for example) and a universal fuzzy match would 
>>>> fail to identify problems with those. This came up in practice recently 
>>>> when the GPU team enabled fuzzy matching without telling us, and expected 
>>>> failing tests started passing when they shouldn't.
>>>>
>>>
>>> I think a key difference between the original fuzzy matching rule and 
>>> the rule proposed by Vivian is the ranges. With maxDifference=0-1, we 
>>> should be able to catch most visible single pixel differences. What I'm not
>>> sure about is whether a difference like rgb(1, 0, 0) vs rgb(0, 0, 0) (each
>>> component in the range of 0-255) should be treated as a failure in some 
>>> cases.
>>>
>>>> Maybe specific sub teams have directories they could apply default fuzzy
>>>> matching to. My guess is that the same directories where it will work will 
>>>> be directories with few failing tests, limiting the impact of a 
>>>> per-directory approach.
>>>>
>>>> Is there a way to reproduce the sampling below with a side-by-side 
>>>> comparison of the images? I would find it helpful to look through some of 
>>>> the cases that would pass with <meta name="fuzzy" content="0-1;0-1000">, 
>>>> for example.
>>>>
>>>
>>> A filter by actual maxDifference and totalPixels in results.html will be 
>>> helpful. I can add it when I get time.
>>>
>>>> Stephen.
>>>>
>>>> On Fri, Jul 29, 2022 at 8:20 PM 'Vivian Zhi (支文文)' via blink-dev <
>>>> [email protected]> wrote:
>>>>
>>>>> Hi blink-dev
>>>>>
>>>>> I would like to let you know that blink-engprod has added feature 
>>>>> support for non-WPT fuzzy tests. It now allows both non-WPT reftests and 
>>>>> pixel tests to use the same fuzzy matching meta tags as WPT tests. It also
>>>>> shows the max color channel difference and the total number of different
>>>>> pixels as image diff stats in results.html
>>>>> <https://test-results.appspot.com/data/layout_results/linux-rel/1073794/blink_web_tests%20%28with%20patch%29/layout-test-results/results.html>.
>>>>>
>>>>> With these capabilities in place, we would like to research further to
>>>>> see if we can set up some general fuzzy match rules and help Blink
>>>>> developers identify flaky tests that could be resolved by adjusting
>>>>> fuzzy matching rules. Currently quite a few web tests are flaky due to a
>>>>> slight image mismatch that should have been tolerated. Suppose we set up
>>>>> a general fuzzy matching rule, something like:
>>>>>
>>>>>  <meta name="fuzzy" content="0-1;0-1000">
>>>>>
>>>>> This would instruct the image comparison web tests that, if the color
>>>>> channel and pixel diffs fall within the ranges of the rule, we can ignore
>>>>> the diff and pass the test. This way we can reduce test flakiness while
>>>>> still maintaining test accuracy without missing a real bug.
>>>>>
>>>>> We want to ask you some quick survey questions to help us make design
>>>>> decisions: whether it makes sense to set up a universal across-the-board
>>>>> fuzzy match tolerance rule for all blink web tests, or whether we should
>>>>> make the rules more specific to individual tests or test sets.
>>>>>
>>>>> 1. Is a universal fuzzy match tolerance rule acceptable for the web
>>>>> tests in your area?
>>>>>
>>>>>     a) If the answer is yes, what is the acceptable range of max
>>>>> color channel and pixel diff for your tests?
>>>>>     b) If the answer is no, please share your reasons.
>>>>>
>>>>> 2. Do you prefer fuzzy matching rule adjustment at a per-test or
>>>>> per-test-set level based on the pixel difference numbers shown in
>>>>> results.html?
>>>>>
>>>>> Here is some sample data to help you choose. We recently collected data
>>>>> from blink_web_tests results on the linux-test builder: the
>>>>> distribution of color channel maxDifference and totalPixels diff for
>>>>> failing/flaky blink_web_tests.
>>>>> (Note: over 70% of the tests in the color channel maxDifference 0-10
>>>>> range have maxDifference=1.)
>>>>>
>>>>> Color channel maxDifference range    Failing test count
>>>>> 0-10                                 98
>>>>> 11-100                               31
>>>>> 101-200                              28
>>>>> 201-260                              111
>>>>>
>>>>> totalPixels diff range               Failing test count
>>>>> 0-100                                30
>>>>> 100-1000                             57
>>>>> 1000-10,000                          99
>>>>> 10,000-100,000                       66
>>>>> 100,000-1,000,000                    16
>>>>>
>>>>> Let me know if you have any questions. Looking forward to hearing from
>>>>> you!
>>>>>
>>>>>
>>>>> Vivian
>>>>> on behalf of Chrome-Blink-EngProd
>>>>>
>>>>> -- 
>>>>> You received this message because you are subscribed to the Google 
>>>>> Groups "blink-dev" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>>> an email to [email protected].
>>>>> To view this discussion on the web visit 
>>>>> https://groups.google.com/a/chromium.org/d/msgid/blink-dev/CAPCqkTs-L5u22-Xp5U_LeBdLP%3D%2BTDH1KGv8MTmtKQFRcANCZJg%40mail.gmail.com
>>>>>  
>>>>> <https://groups.google.com/a/chromium.org/d/msgid/blink-dev/CAPCqkTs-L5u22-Xp5U_LeBdLP%3D%2BTDH1KGv8MTmtKQFRcANCZJg%40mail.gmail.com?utm_medium=email&utm_source=footer>
>>>>> .
>>>>>
>>>>
>>>
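
For reference, the rule semantics described in this thread (a diff is tolerated
when both the max color channel difference and the number of differing pixels
fall within the rule's ranges) can be sketched in Python as follows. The names
FuzzyRule, parse_fuzzy_content, channel_max_difference and diff_is_tolerated
are hypothetical, and the parser only accepts the bare "low-high;low-high" form
used in this thread. This is a sketch under those assumptions, not the test
harness's actual implementation.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FuzzyRule:
    max_difference: Tuple[int, int]  # inclusive range of per-channel difference
    total_pixels: Tuple[int, int]    # inclusive range of differing pixel count


def _parse_range(text: str) -> Tuple[int, int]:
    """Parse an explicit "low-high" range; other forms are rejected."""
    low, sep, high = text.strip().partition("-")
    if not sep:
        raise ValueError(f"expected 'low-high', got {text!r}")
    lo, hi = int(low), int(high)
    if lo > hi:
        raise ValueError(f"empty range: {text!r}")
    return lo, hi


def parse_fuzzy_content(content: str) -> FuzzyRule:
    """Parse a content value such as "0-1;0-1000"."""
    parts = content.split(";")
    if len(parts) != 2:
        raise ValueError(f"expected two ranges separated by ';': {content!r}")
    return FuzzyRule(_parse_range(parts[0]), _parse_range(parts[1]))


def channel_max_difference(pixel_a, pixel_b) -> int:
    """Largest per-channel difference between two pixels (components 0-255)."""
    return max(abs(a - b) for a, b in zip(pixel_a, pixel_b))


def diff_is_tolerated(rule: FuzzyRule, channel_max: int, pixels: int) -> bool:
    """True when both measured values fall inside the rule's ranges."""
    d_lo, d_hi = rule.max_difference
    p_lo, p_hi = rule.total_pixels
    return d_lo <= channel_max <= d_hi and p_lo <= pixels <= p_hi
```

With the rule "0-1;0-1000", a single pixel that differs as rgb(1, 0, 0) vs
rgb(0, 0, 0) has a channel max difference of 1 and would be tolerated, which is
exactly the case raised earlier in the thread as a possible concern.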

