Summary: We're increasing sharding for running webkit tests and it's increasing test flakiness a bit.
1. Is the tradeoff of (hopefully) temporary increased flakiness worth the speed gains? We retry these failures, so they rarely actually turn the bots red.

2. The flakiness is temporary only if we fix it. Currently, as far as I know, only Julie and I are fixing flaky webkit tests. There's a lot of low-hanging fruit here: tests with obvious race conditions. Is anyone else willing to help fix these?

Details:
As we increase parallelism in the webkit tests, we greatly reduce cycle times, but we also increase flakiness. I'm fairly convinced that, with the exception of the http tests on Windows, nearly all of the flakiness results from race conditions in the tests themselves and, occasionally, from bugs in Chrome/WebKit.

We currently shard webkit tests by directory in order to minimize flakiness. The theory is that this way we run them in roughly the same order as upstream webkit does. To minimize pain and flakiness, we are gradually sharding into smaller chunks. Initially, we sharded just the directories under LayoutTests. Now we also shard the directories under fast, svg and (on the Mac) http.

For example, sharding LayoutTests/fast made the webkit tests on the debug-2 bots >2x faster (~10 mins!). But it also exposed flakiness in ~10 tests. You can see this by looking at http://src.chromium.org/viewvc/chrome/trunk/src/webkit/tools/layout_tests/flakiness_dashboard.html#builder=Webkit%20Linux%20(dbg)(2), which used to have 2 flaky tests.

Ultimately, I think on machines with 8 hyperthreaded cores we are close to getting the tests to run in <2 minutes.

Ojan
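
For anyone who has not looked at how the sharding works, here is a rough sketch of the directory-based grouping described above. It is only an illustration and not the actual test runner code: the function name shard_by_directory, the SPLIT_DIRS tuple, and the sorted ordering within a shard are all assumptions made for this example.

```python
# Hypothetical list of top-level directories (relative to LayoutTests) whose
# immediate subdirectories each get their own shard. Assumed for illustration.
SPLIT_DIRS = ("fast", "svg")


def shard_by_directory(test_paths, split_dirs=SPLIT_DIRS):
    """Group test paths (relative to LayoutTests) into shards keyed by directory.

    Tests under a directory listed in split_dirs are grouped one level deeper
    (e.g. "fast/dom"); everything else is grouped by its top-level directory.
    """
    shards = {}
    for path in sorted(test_paths):
        parts = path.split("/")
        if parts[0] in split_dirs and len(parts) > 2:
            # e.g. fast/dom/a.html -> shard "fast/dom"
            key = "/".join(parts[:2])
        else:
            # e.g. editing/x/e.html -> shard "editing"; fast/c.html -> shard "fast"
            key = parts[0]
        shards.setdefault(key, []).append(path)
    return shards


if __name__ == "__main__":
    example = [
        "fast/dom/a.html",
        "fast/css/b.html",
        "fast/c.html",
        "svg/custom/d.svg",
        "editing/x/e.html",
        "editing/f.html",
    ]
    for name, tests in shard_by_directory(example).items():
        print(name, tests)
```

Each shard could then be handed to a separate worker. Tests inside one shard still run one after another in a fixed order, which is why smaller shards mean more parallelism but also more chances for a test to run next to different neighbors than it does upstream.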
