Resurrecting this thread, because yotha@ are launching the first step in a project that completely ignores the previous consensus (i.e., that we should *not* enable threaded compositing for all web_tests).
I will paraphrase the high-level arguments made previously by enne@ and ojan@, neither of whom are still active on chromium: - Enabling threaded compositing will increase test flakiness - Previous efforts to fix the flakiness have been really difficult and largely unsuccessful - Buggy behavior in the blink/compositor thread boundary is best covered by focused unit tests - There are vanishingly few bugs in that area; maybe none? My reponse... web_tests, by their very nature, are a massive set of integration tests. That is not the most efficient organization of test coverage. But the beauty of web_tests is that they catch a *ton* of bugs and prevent many bad patches from landing. The large majority of the bugs are in core blink areas, but not all of them. web_tests provide effective, if inefficient, coverage of almost the entire chromium code base. The lack of coverage for threaded compositing is an anomaly and a blink spot. "It's too hard" is not, in my opinion, a convincing justification. Yes, turning on threaded compositing *does* make web_tests flakier. And that is evidence of real bugs! It's oxymoronic to acknowledge the flakiness but claim that there are few or no bugs in the blink/compositor thread boundary. There are definitely bugs in there! They can be excruciatingly hard to root-cause, but they definitely exist. A couple of examples... https://chromium-review.googlesource.com/c/chromium/src/+/3958364 https://bugs.chromium.org/p/chromium/issues/detail?id=1447868 Once these bugs are root-caused, I certainly agree that we should prefer unit tests for regression testing. But they key point is that we only discovered these bugs because they caused failures in web_tests. Again: these are real bugs! Because bugs in the thread boundary are usually race conditions they can be extremely difficult to reproduce, which means that they're very likely under-reported. But they still degrade user experience. Beyond crbug.com and flaky web_tests, there are also crash reports that very likely indicate bugs in this area. On Wednesday, November 4, 2015 at 4:58:29 PM UTC-8 Ojan Vafai wrote: > +1 on all points. I hope this new (currently half-baked) solution will > help improve all three of these to varying degrees. It's not panacea, but > it should be a lot better. > > > On Wed, Nov 4, 2015 at 11:29 AM Adrienne Walker <[email protected]> wrote: > >> Re: interoperability tests. Looking at the guidelines for the web >> platform tests, I think if all of our layout tests had been written with >> the guidelines of short and minimal in mind, we would be in a much better >> situation than we are today. >> >> My biggest concerns about layout tests are: (1) they get used to test >> very internal parts of Blink, which then get exposed via awful plumbing via >> internals or dumping the state of the world into a text file. (2) they >> "over test", especially in terms of pixel and text output, which adds a >> rebaselining burden as these things churn over time and over different >> platforms, even though that doesn't affect what the test was trying to >> verify. (3) they rely on so much of the stack that they end up being flaky. >> >> I think there's plenty of room for html / javascript tests that are >> cross-platform and test web platform features that don't fall into these >> traps. >> >> 2015-11-04 9:32 GMT-08:00 Rick Byers <[email protected]>: >> >>> I agree with this big picture direction - i.e. unit tests are generally >>> better than integration tests. >>> >>> However I HAVE seen examples in input where we've missed issues because >>> we had only unit tests (content, cc and blink layout tests), not good >>> integration test that included the effect of threaded input handling. One >>> case in particular was pretty embarrassing - IIRC we broke scrolling >>> completely in most real-world cases yet not a single test failed :-). >>> Still, I think this is, at best, an argument for having a couple sanity >>> integration tests (eg. as content browser tests). >>> >>> My bigger concern here is around interoperability testing. IMHO the >>> only way we're going to substantially reduce developer pain around >>> interoperability is by sharing more tests with the other rendering >>> engines. Mozilla is much better at this >>> <https://wiki.mozilla.org/Auto-tools/Projects/web-platform-tests> than >>> us. This, by its nature, is impossible to do with unit tests. So when new >>> simple web platform features are being added (especially highly >>> deterministic JS APIs), I've recently started encouraging my team to try >>> focusing on web-platform-tests <http://testthewebforward.org>. >>> Obviously more complex / non-deterministic features should still prefer to >>> focus on unit tests. But the exact balance here seems very non-obvious to >>> me. But like with everything, we should probably just try things, evaluate >>> how we're doing, and iterate. Any concerns? >>> >>> Rick >>> >>> >>> On Wed, Nov 4, 2015 at 11:39 AM, Ojan Vafai <[email protected]> wrote: >>> >>>> Thread hijacking this to blink-dev@chromium. >>>> >>>> At a high level, I want to +1 that we should be moving our testing away >>>> from high level regression tests into more deterministic tests. There's a >>>> good (not quite finished) alternative to layout tests that I think we >>>> should pursue (an evolution of Elliott's C++-based Sim framework). >>>> >>>> I'll be adding this to the Blink infra team's long-term roadmap >>>> shortly. Coincidentally, Elliott and I hashed out a basic idea yesterday. >>>> At some point we'll turn that into a doc and a proposal, but mostly it >>>> needs people to do the engineering work to make it possible. So, if you're >>>> possibly interested in 5%-80%ing on this, ping me. >>>> >>>> On Tue, Nov 3, 2015 at 4:04 PM 'Adrienne Walker' via paint-dev < >>>> [email protected]> wrote: >>>> >>>>> (I sent this previously as an email reply to a few people about a >>>>> particular patch, but as this continues to come up on a monthly basis, >>>>> here >>>>> is a slightly edited copy of this rant as a reference of my biased >>>>> position >>>>> on this topic.) >>>>> >>>>> We should *not* make layout tests use threaded compositing. >>>>> >>>>> >>>>> We shouldn't do it because it's a rat hole. jamesr and I both tried >>>>> to make this happen on separate occasions. We spent a lot of time and >>>>> fought with even more flakiness. I'm sure somebody could heroically burn >>>>> a >>>>> lot of time making this happen and succeed where we couldn't, but in my >>>>> opinion, it's a waste of time. Threaded compositing increases >>>>> flakiness and loses determinism in terms of when events happen, how >>>>> repaints line up with frames, and how animation happens, especially. We >>>>> already have a ton of problems with flakiness in layout tests, so why >>>>> make >>>>> this worse? >>>>> >>>>> If there are features that are only enabled with threaded compositing >>>>> (compositor animations?) that somehow still need Blink layout tests, >>>>> we should enable those features for non-threaded compositing. I >>>>> would support removing the virtual/threaded directory if somebody >>>>> made it possible to test those features in other ways. If there are >>>>> tests >>>>> that have different behavior in threaded mode vs not threaded mode, then >>>>> they need to be deflaked or tested more narrowly via unit tests or >>>>> esprehn's simulation testing framework. >>>>> >>>>> Single vs threaded compositing is generally well-tested within cc >>>>> unit tests. There should be very little difference at the Blink level, >>>>> and >>>>> I would question the need to test it there. We have ironed out a small >>>>> number of message ordering differences over time in cc. We should >>>>> consider >>>>> these differences bugs and add cc unit tests for them to guarantee the >>>>> same >>>>> ordering of messages. In the long term, I think the Blimp/Mandoline work >>>>> has the potential to merge single/thread proxy with an abstracted >>>>> glue layer between them. >>>>> >>>>> So, sure, we should test what we ship, but on the other hand every >>>>> test should not be an integration test. I have have seen exactly zero >>>>> bugs >>>>> ever that would have been caught if a layout test had been using threaded >>>>> compositing instead of single threaded compositing. Really, zero. >>>>> >>>>> Personally, I think our problems as a team lie more on the "tests are >>>>> flaky and hard to reason about because they're integration tests" end of >>>>> the spectrum and far less on the "we are shipping huge hidden regressions >>>>> because of holes in our testing". >>>>> >>>> >>>> On Tue, Nov 3, 2015 at 5:17 PM Adrienne Walker <[email protected]> >>>> wrote: >>>> >>>>> 2015-11-03 16:30 GMT-08:00 Alexey <[email protected]>: >>>>> >>>>>> Can we run all of the virtual/threaded tests in SingleThreadProxy >>>>>> <https://code.google.com/p/chromium/codesearch#chromium/src/cc/trees/single_thread_proxy.cc&l=37&gs=cpp:cc::class-SingleThreadProxy::SingleThreadProxy(cc::LayerTreeHost%2520*,%2520cc::LayerTreeHostSingleThreadClient%2520*,%2520scoped_refptr%253Cbase::SingleThreadTaskRunner%253E,%2520scoped_ptr%253Ccc::BeginFrameSource,%2520base::DefaultDeleter%253Ccc::BeginFrameSource%253E%2520%253E)@chromium/../../cc/trees/single_thread_proxy.cc%257Cdef&gsn=SingleThreadProxy&ct=xref_usages> >>>>>> mode? >>>>>> >>>>> >>>>> Currently, no! >>>>> >>>>> There are some things (e.g. composited animations) that have behavior >>>>> differences in code depending on threaded vs single thread. It seems to >>>>> me >>>>> that these are wrinkles that could be ironed out. It seems like there's >>>>> no >>>>> reason we couldn't create and run composited animations in >>>>> single-threaded >>>>> mode. >>>>> >>>>> It looks like other things have been added there over time (e.g. idle >>>>> callbacks), that I don't understand as well. Naively, it seems like >>>>> fundamentally nothing should depend on threaded compositing. >>>>> >>>> >>> >> -- You received this message because you are subscribed to the Google Groups "blink-dev" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/blink-dev/4d8dd468-b3c4-4f65-8ac7-1b4ca194e08an%40chromium.org.
