Thanks for the valuable feedback, Stephen and Xianzhu! We will see if we can add a filter in results.html to pick out the tests whose diffs fall within a given range.
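To make the idea concrete, here is a rough sketch of the kind of range check such a filter might perform. It is only an illustration, not the results.html implementation: the names `FuzzyRule`, `parse_fuzzy`, `within` and `filter_in_range` are hypothetical, the sample results are made up, and it only handles the positional "maxDifference;totalPixels" form quoted in this thread (for example "0-1;0-1000"), treating a single number as a one-value range.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Range = Tuple[int, int]  # inclusive (low, high)


@dataclass(frozen=True)
class FuzzyRule:
    """Hypothetical container for the two ranges of a fuzzy rule."""
    max_difference: Range
    total_pixels: Range


def parse_range(text: str) -> Range:
    """Parse "A-B" into (A, B); a single number "A" becomes (A, A)."""
    low, sep, high = text.strip().partition("-")
    return (int(low), int(high)) if sep else (int(low), int(low))


def parse_fuzzy(content: str) -> FuzzyRule:
    """Parse the positional form "maxDifference;totalPixels" only."""
    max_diff_text, total_pixels_text = content.split(";")
    return FuzzyRule(parse_range(max_diff_text), parse_range(total_pixels_text))


def within(rule: FuzzyRule, max_difference: int, total_pixels: int) -> bool:
    """True if both image diff stats fall inside the rule's ranges."""
    return (rule.max_difference[0] <= max_difference <= rule.max_difference[1]
            and rule.total_pixels[0] <= total_pixels <= rule.total_pixels[1])


def filter_in_range(results: Dict[str, Tuple[int, int]],
                    rule: FuzzyRule) -> List[str]:
    """Return the names of tests whose (maxDifference, totalPixels) match."""
    return [name for name, (max_diff, pixels) in results.items()
            if within(rule, max_diff, pixels)]


# Made-up stats, purely for illustration.
sample_results = {
    "a.html": (1, 500),      # small diff: inside 0-1;0-1000
    "b.html": (2, 500),      # color channel diff too large
    "c.html": (1, 20000),    # too many differing pixels
}
rule = parse_fuzzy("0-1;0-1000")
in_range = filter_in_range(sample_results, rule)
```

With a list like `in_range`, one could then review the matching tests side by side before deciding whether a given rule is safe for them.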
On Mon, Aug 1, 2022 at 8:40 AM Xianzhu Wang <[email protected]> wrote:

> On Mon, Aug 1, 2022 at 4:25 AM Stephen Chenney <[email protected]> wrote:
>
>> Thanks for investigating the potential for fuzzy matching.
>>
>> Rendering Core continues to oppose a single fuzzy match rule across all
>> web_tests. We have some tests where single pixel differences matter
>> (related to pixel snapping, for example) and a universal fuzzy match
>> would fail to identify problems with those. This came up in practice
>> recently when the GPU team enabled fuzzy matching without telling us,
>> and expected failing tests started passing when they shouldn't.
>
> I think a key difference between the original fuzzy matching rule and
> the rule proposed by Vivian is the ranges. With maxDifference=0-1, we
> should be able to catch most visible single pixel differences. What I'm
> not sure about is whether a difference like rgb(1, 0, 0) vs
> rgb(0, 0, 0) (each component in the range of 0-255) should be treated
> as a failure in some cases.
>
>> Maybe specific sub teams have directories they could apply default
>> fuzzy matching to. My guess is that the same directories where it will
>> work will be directories with few failing tests, limiting the impact
>> of a per-directory approach.
>>
>> Is there a way to reproduce the sampling below with a side-by-side
>> comparison of the images? I would find it helpful to look through some
>> of the cases that would pass with
>> <meta name="fuzzy" content="0-1;0-1000">, for example.
>
> A filter by actual maxDifference and totalPixels in results.html will
> be helpful. I can add it when I get time.
>
>> Stephen.
>>
>> On Fri, Jul 29, 2022 at 8:20 PM 'Vivian Zhi (支文文)' via blink-dev
>> <[email protected]> wrote:
>>
>>> Hi blink-dev
>>>
>>> I would like to let you know that blink-engprod has added feature
>>> support for non-WPT fuzzy tests. It now allows both non-WPT reftests
>>> and pixel tests to use the same fuzzy matching meta-tags as WPT
>>> tests. It also shows two image diff stats in results.html
>>> <https://test-results.appspot.com/data/layout_results/linux-rel/1073794/blink_web_tests%20%28with%20patch%29/layout-test-results/results.html>:
>>> the max color channel difference and the total number of different
>>> pixels.
>>>
>>> With these capabilities in place, we would like to research further
>>> to see if we can set up some general fuzzy match rules, and help
>>> Blink developers identify flaky tests that can potentially be
>>> resolved by adjusting fuzzy matching rules. Currently there are quite
>>> a few web tests that are flaky due to a slight image mismatch which
>>> should have been tolerated. Suppose we set up a general fuzzy
>>> matching rule, something like:
>>>
>>> <meta name="fuzzy" content="0-1;0-1000">
>>>
>>> This instructs the image comparison web tests that if the color
>>> channel and pixel diffs fall within the ranges of the rule, we can
>>> ignore the diff and pass the test. This way we can reduce test
>>> flakiness while still maintaining test accuracy, without missing a
>>> real bug.
>>>
>>> We want to ask you some quick survey questions to help us make design
>>> decisions: whether it makes sense to set up a universal,
>>> across-the-board fuzzy match tolerance rule for all blink web tests,
>>> or whether we should make the rules more specific to individual tests
>>> or test sets.
>>>
>>> 1. Is a universal fuzzy match tolerance rule acceptable for the web
>>>    tests in your area?
>>>
>>>    a) If the answer is yes, what is the acceptable range of max color
>>>       channel and pixel diff for your tests?
>>>    b) If the answer is no, please share your reasons.
>>>
>>> 2. Do you prefer fuzzy matching rule adjustment at a per-test or
>>>    per-test-set level, based on the pixel difference numbers shown in
>>>    results.html?
>>>
>>> Here is some sample data to help you choose. We recently collected
>>> data from blink_web_tests results on the linux-test builder: the
>>> distribution of color channel maxDifference and totalPixels diff for
>>> failing/flaky blink_web_tests. (Note: over 70% of the tests in the
>>> color channel maxDifference 0-10 range have maxDifference=1.)
>>>
>>> Color channel maxDifference range    Fail test count
>>> 0-10                                 98
>>> 11-100                               31
>>> 101-200                              28
>>> 201-260                              111
>>>
>>> totalPixels diff range               Fail test count
>>> 0-100                                30
>>> 100-1000                             57
>>> 1000-10,000                          99
>>> 10,000-100,000                       66
>>> 100,000-1,000,000                    16
>>>
>>> Let me know if you have any questions. Looking forward to hearing
>>> from you!
>>>
>>> Vivian
>>> on behalf of Chrome-Blink-EngProd
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "blink-dev" group.
>>> To unsubscribe from this group and stop receiving emails from it,
>>> send an email to [email protected].
>>> To view this discussion on the web visit
>>> https://groups.google.com/a/chromium.org/d/msgid/blink-dev/CAPCqkTs-L5u22-Xp5U_LeBdLP%3D%2BTDH1KGv8MTmtKQFRcANCZJg%40mail.gmail.com.
