On Fri, Jun 9, 2023 at 11:25 AM Stefan Zager <[email protected]> wrote:
> Resurrecting this thread, because yotha@ are launching the first step in > a project that completely > ignores the previous consensus (i.e., that we should *not* enable threaded > compositing for all > web_tests). > This is exciting to hear! Is there a good place / bug to follow progress? > > I will paraphrase the high-level arguments made previously by enne@ and > ojan@, neither of whom > are still active on chromium: > > - Enabling threaded compositing will increase test flakiness > - Previous efforts to fix the flakiness have been really difficult and > largely unsuccessful > - Buggy behavior in the blink/compositor thread boundary is best covered > by focused unit tests > - There are vanishingly few bugs in that area; maybe none? > > My reponse... > > web_tests, by their very nature, are a massive set of integration tests. > That is not the most efficient > organization of test coverage. But the beauty of web_tests is that they > catch a *ton* of bugs and > prevent many bad patches from landing. The large majority of the bugs are > in core blink areas, > but not all of them. web_tests provide effective, if inefficient, coverage > of almost the entire chromium > code base. The lack of coverage for threaded compositing is an anomaly and > a blink spot. "It's too > hard" is not, in my opinion, a convincing justification. > > Yes, turning on threaded compositing *does* make web_tests flakier. And > that is evidence of real bugs! > It's oxymoronic to acknowledge the flakiness but claim that there are few > or no bugs in the > blink/compositor thread boundary. There are definitely bugs in there! They > can be excruciatingly hard > to root-cause, but they definitely exist. A couple of examples... > > https://chromium-review.googlesource.com/c/chromium/src/+/3958364 > https://bugs.chromium.org/p/chromium/issues/detail?id=1447868 > > Once these bugs are root-caused, I certainly agree that we should prefer > unit tests for regression > testing. But they key point is that we only discovered these bugs because > they caused failures in > web_tests. > > Again: these are real bugs! > +1000 > > Because bugs in the thread boundary are usually race conditions they can > be extremely difficult to > reproduce, which means that they're very likely under-reported. But they > still degrade user experience. > Beyond crbug.com and flaky web_tests, there are also crash reports that > very likely indicate bugs in this > area. > > > On Wednesday, November 4, 2015 at 4:58:29 PM UTC-8 Ojan Vafai wrote: > >> +1 on all points. I hope this new (currently half-baked) solution will >> help improve all three of these to varying degrees. It's not panacea, but >> it should be a lot better. >> >> >> On Wed, Nov 4, 2015 at 11:29 AM Adrienne Walker <[email protected]> >> wrote: >> >>> Re: interoperability tests. Looking at the guidelines for the web >>> platform tests, I think if all of our layout tests had been written with >>> the guidelines of short and minimal in mind, we would be in a much better >>> situation than we are today. >>> >>> My biggest concerns about layout tests are: (1) they get used to test >>> very internal parts of Blink, which then get exposed via awful plumbing via >>> internals or dumping the state of the world into a text file. (2) they >>> "over test", especially in terms of pixel and text output, which adds a >>> rebaselining burden as these things churn over time and over different >>> platforms, even though that doesn't affect what the test was trying to >>> verify. (3) they rely on so much of the stack that they end up being flaky. >>> >>> I think there's plenty of room for html / javascript tests that are >>> cross-platform and test web platform features that don't fall into these >>> traps. >>> >>> 2015-11-04 9:32 GMT-08:00 Rick Byers <[email protected]>: >>> >>>> I agree with this big picture direction - i.e. unit tests are generally >>>> better than integration tests. >>>> >>>> However I HAVE seen examples in input where we've missed issues because >>>> we had only unit tests (content, cc and blink layout tests), not good >>>> integration test that included the effect of threaded input handling. One >>>> case in particular was pretty embarrassing - IIRC we broke scrolling >>>> completely in most real-world cases yet not a single test failed :-). >>>> Still, I think this is, at best, an argument for having a couple sanity >>>> integration tests (eg. as content browser tests). >>>> >>>> My bigger concern here is around interoperability testing. IMHO the >>>> only way we're going to substantially reduce developer pain around >>>> interoperability is by sharing more tests with the other rendering >>>> engines. Mozilla is much better at this >>>> <https://wiki.mozilla.org/Auto-tools/Projects/web-platform-tests> than >>>> us. This, by its nature, is impossible to do with unit tests. So when new >>>> simple web platform features are being added (especially highly >>>> deterministic JS APIs), I've recently started encouraging my team to try >>>> focusing on web-platform-tests <http://testthewebforward.org>. >>>> Obviously more complex / non-deterministic features should still prefer to >>>> focus on unit tests. But the exact balance here seems very non-obvious to >>>> me. But like with everything, we should probably just try things, evaluate >>>> how we're doing, and iterate. Any concerns? >>>> >>>> Rick >>>> >>>> >>>> On Wed, Nov 4, 2015 at 11:39 AM, Ojan Vafai <[email protected]> wrote: >>>> >>>>> Thread hijacking this to blink-dev@chromium. >>>>> >>>>> At a high level, I want to +1 that we should be moving our testing >>>>> away from high level regression tests into more deterministic tests. >>>>> There's a good (not quite finished) alternative to layout tests that I >>>>> think we should pursue (an evolution of Elliott's C++-based Sim >>>>> framework). >>>>> >>>>> I'll be adding this to the Blink infra team's long-term roadmap >>>>> shortly. Coincidentally, Elliott and I hashed out a basic idea yesterday. >>>>> At some point we'll turn that into a doc and a proposal, but mostly it >>>>> needs people to do the engineering work to make it possible. So, if you're >>>>> possibly interested in 5%-80%ing on this, ping me. >>>>> >>>>> On Tue, Nov 3, 2015 at 4:04 PM 'Adrienne Walker' via paint-dev < >>>>> [email protected]> wrote: >>>>> >>>>>> (I sent this previously as an email reply to a few people about a >>>>>> particular patch, but as this continues to come up on a monthly basis, >>>>>> here >>>>>> is a slightly edited copy of this rant as a reference of my biased >>>>>> position >>>>>> on this topic.) >>>>>> >>>>>> We should *not* make layout tests use threaded compositing. >>>>>> >>>>>> >>>>>> We shouldn't do it because it's a rat hole. jamesr and I both tried >>>>>> to make this happen on separate occasions. We spent a lot of time and >>>>>> fought with even more flakiness. I'm sure somebody could heroically >>>>>> burn a >>>>>> lot of time making this happen and succeed where we couldn't, but in my >>>>>> opinion, it's a waste of time. Threaded compositing increases >>>>>> flakiness and loses determinism in terms of when events happen, how >>>>>> repaints line up with frames, and how animation happens, especially. We >>>>>> already have a ton of problems with flakiness in layout tests, so why >>>>>> make >>>>>> this worse? >>>>>> >>>>>> If there are features that are only enabled with threaded compositing >>>>>> (compositor animations?) that somehow still need Blink layout >>>>>> tests, we should enable those features for non-threaded compositing. >>>>>> I would support removing the virtual/threaded directory if somebody >>>>>> made it possible to test those features in other ways. If there are >>>>>> tests >>>>>> that have different behavior in threaded mode vs not threaded mode, then >>>>>> they need to be deflaked or tested more narrowly via unit tests or >>>>>> esprehn's simulation testing framework. >>>>>> >>>>>> Single vs threaded compositing is generally well-tested within cc >>>>>> unit tests. There should be very little difference at the Blink level, >>>>>> and >>>>>> I would question the need to test it there. We have ironed out a small >>>>>> number of message ordering differences over time in cc. We should >>>>>> consider >>>>>> these differences bugs and add cc unit tests for them to guarantee the >>>>>> same >>>>>> ordering of messages. In the long term, I think the Blimp/Mandoline work >>>>>> has the potential to merge single/thread proxy with an abstracted >>>>>> glue layer between them. >>>>>> >>>>>> So, sure, we should test what we ship, but on the other hand every >>>>>> test should not be an integration test. I have have seen exactly zero >>>>>> bugs >>>>>> ever that would have been caught if a layout test had been using threaded >>>>>> compositing instead of single threaded compositing. Really, zero. >>>>>> >>>>>> Personally, I think our problems as a team lie more on the "tests are >>>>>> flaky and hard to reason about because they're integration tests" end of >>>>>> the spectrum and far less on the "we are shipping huge hidden regressions >>>>>> because of holes in our testing". >>>>>> >>>>> >>>>> On Tue, Nov 3, 2015 at 5:17 PM Adrienne Walker <[email protected]> >>>>> wrote: >>>>> >>>>>> 2015-11-03 16:30 GMT-08:00 Alexey <[email protected]>: >>>>>> >>>>>>> Can we run all of the virtual/threaded tests in SingleThreadProxy >>>>>>> <https://code.google.com/p/chromium/codesearch#chromium/src/cc/trees/single_thread_proxy.cc&l=37&gs=cpp:cc::class-SingleThreadProxy::SingleThreadProxy(cc::LayerTreeHost%2520*,%2520cc::LayerTreeHostSingleThreadClient%2520*,%2520scoped_refptr%253Cbase::SingleThreadTaskRunner%253E,%2520scoped_ptr%253Ccc::BeginFrameSource,%2520base::DefaultDeleter%253Ccc::BeginFrameSource%253E%2520%253E)@chromium/../../cc/trees/single_thread_proxy.cc%257Cdef&gsn=SingleThreadProxy&ct=xref_usages> >>>>>>> mode? >>>>>>> >>>>>> >>>>>> Currently, no! >>>>>> >>>>>> There are some things (e.g. composited animations) that have behavior >>>>>> differences in code depending on threaded vs single thread. It seems to >>>>>> me >>>>>> that these are wrinkles that could be ironed out. It seems like there's >>>>>> no >>>>>> reason we couldn't create and run composited animations in >>>>>> single-threaded >>>>>> mode. >>>>>> >>>>>> It looks like other things have been added there over time (e.g. idle >>>>>> callbacks), that I don't understand as well. Naively, it seems like >>>>>> fundamentally nothing should depend on threaded compositing. >>>>>> >>>>> >>>> >>> -- > You received this message because you are subscribed to the Google Groups > "blink-dev" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to [email protected]. > To view this discussion on the web visit > https://groups.google.com/a/chromium.org/d/msgid/blink-dev/4d8dd468-b3c4-4f65-8ac7-1b4ca194e08an%40chromium.org > <https://groups.google.com/a/chromium.org/d/msgid/blink-dev/4d8dd468-b3c4-4f65-8ac7-1b4ca194e08an%40chromium.org?utm_medium=email&utm_source=footer> > . > -- You received this message because you are subscribed to the Google Groups "blink-dev" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/blink-dev/CA%2BapAgH17Koq-tO_khHzoGboAUbxxi7D65n%3DH73MxdTSJ5-PEA%40mail.gmail.com.
