On Fri, Jun 9, 2023 at 11:38 AM Ken Rockot <[email protected]> wrote:
> > On Fri, Jun 9, 2023 at 11:25 AM Stefan Zager <[email protected]> wrote: > >> Resurrecting this thread, because yotha@ are launching the first step in >> a project that completely >> ignores the previous consensus (i.e., that we should *not* enable >> threaded compositing for all >> web_tests). >> > > This is exciting to hear! Is there a good place / bug to follow progress? > Main tracking bug <http://crbug.com/770028>. > > >> >> I will paraphrase the high-level arguments made previously by enne@ and >> ojan@, neither of whom >> are still active on chromium: >> >> - Enabling threaded compositing will increase test flakiness >> - Previous efforts to fix the flakiness have been really difficult and >> largely unsuccessful >> - Buggy behavior in the blink/compositor thread boundary is best covered >> by focused unit tests >> - There are vanishingly few bugs in that area; maybe none? >> >> My reponse... >> >> web_tests, by their very nature, are a massive set of integration tests. >> That is not the most efficient >> organization of test coverage. But the beauty of web_tests is that they >> catch a *ton* of bugs and >> prevent many bad patches from landing. The large majority of the bugs are >> in core blink areas, >> but not all of them. web_tests provide effective, if inefficient, >> coverage of almost the entire chromium >> code base. The lack of coverage for threaded compositing is an anomaly >> and a blink spot. "It's too >> hard" is not, in my opinion, a convincing justification. >> >> Yes, turning on threaded compositing *does* make web_tests flakier. And >> that is evidence of real bugs! >> It's oxymoronic to acknowledge the flakiness but claim that there are few >> or no bugs in the >> blink/compositor thread boundary. There are definitely bugs in there! >> They can be excruciatingly hard >> to root-cause, but they definitely exist. A couple of examples... >> >> https://chromium-review.googlesource.com/c/chromium/src/+/3958364 >> https://bugs.chromium.org/p/chromium/issues/detail?id=1447868 >> >> Once these bugs are root-caused, I certainly agree that we should prefer >> unit tests for regression >> testing. But they key point is that we only discovered these bugs because >> they caused failures in >> web_tests. >> >> Again: these are real bugs! >> > > +1000 > > >> >> Because bugs in the thread boundary are usually race conditions they can >> be extremely difficult to >> reproduce, which means that they're very likely under-reported. But they >> still degrade user experience. >> Beyond crbug.com and flaky web_tests, there are also crash reports that >> very likely indicate bugs in this >> area. >> >> >> On Wednesday, November 4, 2015 at 4:58:29 PM UTC-8 Ojan Vafai wrote: >> >>> +1 on all points. I hope this new (currently half-baked) solution will >>> help improve all three of these to varying degrees. It's not panacea, but >>> it should be a lot better. >>> >>> >>> On Wed, Nov 4, 2015 at 11:29 AM Adrienne Walker <[email protected]> >>> wrote: >>> >>>> Re: interoperability tests. Looking at the guidelines for the web >>>> platform tests, I think if all of our layout tests had been written with >>>> the guidelines of short and minimal in mind, we would be in a much better >>>> situation than we are today. >>>> >>>> My biggest concerns about layout tests are: (1) they get used to test >>>> very internal parts of Blink, which then get exposed via awful plumbing via >>>> internals or dumping the state of the world into a text file. (2) they >>>> "over test", especially in terms of pixel and text output, which adds a >>>> rebaselining burden as these things churn over time and over different >>>> platforms, even though that doesn't affect what the test was trying to >>>> verify. (3) they rely on so much of the stack that they end up being >>>> flaky. >>>> >>>> I think there's plenty of room for html / javascript tests that are >>>> cross-platform and test web platform features that don't fall into these >>>> traps. >>>> >>>> 2015-11-04 9:32 GMT-08:00 Rick Byers <[email protected]>: >>>> >>>>> I agree with this big picture direction - i.e. unit tests are >>>>> generally better than integration tests. >>>>> >>>>> However I HAVE seen examples in input where we've missed issues >>>>> because we had only unit tests (content, cc and blink layout tests), not >>>>> good integration test that included the effect of threaded input handling. >>>>> One case in particular was pretty embarrassing - IIRC we broke scrolling >>>>> completely in most real-world cases yet not a single test failed :-). >>>>> Still, I think this is, at best, an argument for having a couple sanity >>>>> integration tests (eg. as content browser tests). >>>>> >>>>> My bigger concern here is around interoperability testing. IMHO the >>>>> only way we're going to substantially reduce developer pain around >>>>> interoperability is by sharing more tests with the other rendering >>>>> engines. Mozilla is much better at this >>>>> <https://wiki.mozilla.org/Auto-tools/Projects/web-platform-tests> >>>>> than us. This, by its nature, is impossible to do with unit tests. So >>>>> when new simple web platform features are being added (especially highly >>>>> deterministic JS APIs), I've recently started encouraging my team to try >>>>> focusing on web-platform-tests <http://testthewebforward.org>. >>>>> Obviously more complex / non-deterministic features should still prefer to >>>>> focus on unit tests. But the exact balance here seems very non-obvious to >>>>> me. But like with everything, we should probably just try things, >>>>> evaluate >>>>> how we're doing, and iterate. Any concerns? >>>>> >>>>> Rick >>>>> >>>>> >>>>> On Wed, Nov 4, 2015 at 11:39 AM, Ojan Vafai <[email protected]> wrote: >>>>> >>>>>> Thread hijacking this to blink-dev@chromium. >>>>>> >>>>>> At a high level, I want to +1 that we should be moving our testing >>>>>> away from high level regression tests into more deterministic tests. >>>>>> There's a good (not quite finished) alternative to layout tests that I >>>>>> think we should pursue (an evolution of Elliott's C++-based Sim >>>>>> framework). >>>>>> >>>>>> I'll be adding this to the Blink infra team's long-term roadmap >>>>>> shortly. Coincidentally, Elliott and I hashed out a basic idea yesterday. >>>>>> At some point we'll turn that into a doc and a proposal, but mostly it >>>>>> needs people to do the engineering work to make it possible. So, if >>>>>> you're >>>>>> possibly interested in 5%-80%ing on this, ping me. >>>>>> >>>>>> On Tue, Nov 3, 2015 at 4:04 PM 'Adrienne Walker' via paint-dev < >>>>>> [email protected]> wrote: >>>>>> >>>>>>> (I sent this previously as an email reply to a few people about a >>>>>>> particular patch, but as this continues to come up on a monthly basis, >>>>>>> here >>>>>>> is a slightly edited copy of this rant as a reference of my biased >>>>>>> position >>>>>>> on this topic.) >>>>>>> >>>>>>> We should *not* make layout tests use threaded compositing. >>>>>>> >>>>>>> >>>>>>> We shouldn't do it because it's a rat hole. jamesr and I both tried >>>>>>> to make this happen on separate occasions. We spent a lot of time and >>>>>>> fought with even more flakiness. I'm sure somebody could heroically >>>>>>> burn a >>>>>>> lot of time making this happen and succeed where we couldn't, but in my >>>>>>> opinion, it's a waste of time. Threaded compositing increases >>>>>>> flakiness and loses determinism in terms of when events happen, how >>>>>>> repaints line up with frames, and how animation happens, especially. We >>>>>>> already have a ton of problems with flakiness in layout tests, so why >>>>>>> make >>>>>>> this worse? >>>>>>> >>>>>>> If there are features that are only enabled with threaded >>>>>>> compositing (compositor animations?) that somehow still need Blink >>>>>>> layout tests, we should enable those features for non-threaded >>>>>>> compositing. I would support removing the virtual/threaded directory >>>>>>> if somebody made it possible to test those features in other ways. If >>>>>>> there are tests that have different behavior in threaded mode vs not >>>>>>> threaded mode, then they need to be deflaked or tested more narrowly via >>>>>>> unit tests or esprehn's simulation testing framework. >>>>>>> >>>>>>> Single vs threaded compositing is generally well-tested within cc >>>>>>> unit tests. There should be very little difference at the Blink level, >>>>>>> and >>>>>>> I would question the need to test it there. We have ironed out a small >>>>>>> number of message ordering differences over time in cc. We should >>>>>>> consider >>>>>>> these differences bugs and add cc unit tests for them to guarantee the >>>>>>> same >>>>>>> ordering of messages. In the long term, I think the Blimp/Mandoline >>>>>>> work >>>>>>> has the potential to merge single/thread proxy with an abstracted >>>>>>> glue layer between them. >>>>>>> >>>>>>> So, sure, we should test what we ship, but on the other hand every >>>>>>> test should not be an integration test. I have have seen exactly zero >>>>>>> bugs >>>>>>> ever that would have been caught if a layout test had been using >>>>>>> threaded >>>>>>> compositing instead of single threaded compositing. Really, zero. >>>>>>> >>>>>>> Personally, I think our problems as a team lie more on the "tests >>>>>>> are flaky and hard to reason about because they're integration tests" >>>>>>> end >>>>>>> of the spectrum and far less on the "we are shipping huge hidden >>>>>>> regressions because of holes in our testing". >>>>>>> >>>>>> >>>>>> On Tue, Nov 3, 2015 at 5:17 PM Adrienne Walker <[email protected]> >>>>>> wrote: >>>>>> >>>>>>> 2015-11-03 16:30 GMT-08:00 Alexey <[email protected]>: >>>>>>> >>>>>>>> Can we run all of the virtual/threaded tests in SingleThreadProxy >>>>>>>> <https://code.google.com/p/chromium/codesearch#chromium/src/cc/trees/single_thread_proxy.cc&l=37&gs=cpp:cc::class-SingleThreadProxy::SingleThreadProxy(cc::LayerTreeHost%2520*,%2520cc::LayerTreeHostSingleThreadClient%2520*,%2520scoped_refptr%253Cbase::SingleThreadTaskRunner%253E,%2520scoped_ptr%253Ccc::BeginFrameSource,%2520base::DefaultDeleter%253Ccc::BeginFrameSource%253E%2520%253E)@chromium/../../cc/trees/single_thread_proxy.cc%257Cdef&gsn=SingleThreadProxy&ct=xref_usages> >>>>>>>> mode? >>>>>>>> >>>>>>> >>>>>>> Currently, no! >>>>>>> >>>>>>> There are some things (e.g. composited animations) that have >>>>>>> behavior differences in code depending on threaded vs single thread. It >>>>>>> seems to me that these are wrinkles that could be ironed out. It seems >>>>>>> like there's no reason we couldn't create and run composited animations >>>>>>> in >>>>>>> single-threaded mode. >>>>>>> >>>>>>> It looks like other things have been added there over time (e.g. >>>>>>> idle callbacks), that I don't understand as well. Naively, it seems >>>>>>> like >>>>>>> fundamentally nothing should depend on threaded compositing. >>>>>>> >>>>>> >>>>> >>>> -- >> You received this message because you are subscribed to the Google Groups >> "blink-dev" group. >> To unsubscribe from this group and stop receiving emails from it, send an >> email to [email protected]. >> To view this discussion on the web visit >> https://groups.google.com/a/chromium.org/d/msgid/blink-dev/4d8dd468-b3c4-4f65-8ac7-1b4ca194e08an%40chromium.org >> <https://groups.google.com/a/chromium.org/d/msgid/blink-dev/4d8dd468-b3c4-4f65-8ac7-1b4ca194e08an%40chromium.org?utm_medium=email&utm_source=footer> >> . >> > -- You received this message because you are subscribed to the Google Groups "blink-dev" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/blink-dev/CAHOQ7J-GWQnOx-v%2BFft8chTimcXkvJ6E2Lg%3DQtAJG6G8eaM-3A%40mail.gmail.com.
