[chromium-dev] Re: [Chrome-team] understanding layout test flakiness

David Levin Wed, 09 Sep 2009 11:00:06 -0700

Nice write up.

Idea by Drew Wilson:
5. Make test shell dump (partial) results when there is a timeout. (This may
actually be an item under "2".)

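A rough sketch of what item 5 could look like from the harness side, in Python. The thread does not say how test shell would emit partial results, so this assumes the child process writes its output to stdout as it goes; `run_with_partial_results` and `ShellResult` are hypothetical names, not part of the real harness.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class ShellResult:
    output: str      # whatever the child wrote before it exited or was killed
    timed_out: bool  # True if the timeout expired and the child was killed


def run_with_partial_results(cmd, timeout_s):
    """Run cmd; on timeout, kill it and keep the output captured so far."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    try:
        out, _ = proc.communicate(timeout=timeout_s)
        return ShellResult(output=out, timed_out=False)
    except subprocess.TimeoutExpired:
        proc.kill()
        # A second communicate() drains what the child had already written.
        out, _ = proc.communicate()
        return ShellResult(output=out, timed_out=True)


if __name__ == "__main__":
    # Stand-in for a hanging test: prints one line, then sleeps past the timeout.
    hanging = [sys.executable, "-u", "-c",
               "import time; print('partial result'); time.sleep(30)"]
    result = run_with_partial_results(hanging, timeout_s=2.0)
    print(result.timed_out, repr(result.output))
```

The partial output is only useful if the child flushes as it writes, which is why the stand-in child is run unbuffered.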

On Wed, Sep 9, 2009 at 10:51 AM, Ojan Vafai <[email protected]> wrote:

> I'm including at the top concrete tasks people can take to help identify
> and reduce flakiness. Read below for more details.
>
>    1. Mark slow tests as SLOW and reduce the timeout on the bots to 2
>    seconds.
>    2. Look into the cause of the timeouts on HTTP tests, especially on
>    Mac/Windows
>    3. Look at the actual results off the bots for the non-timeout flaky
>    failures and identify the cause of the flakiness (likely the test itself).
>    4. Make test_expectations.txt match what's actually happening on the
>    bots (see the flakiness dashboard for tests with incorrect expectations).
>
> All the data I use below is from:
> http://src.chromium.org/viewvc/chrome/trunk/src/webkit/tools/layout_tests/flakiness_dashboard.html
>
> On Tue, Sep 8, 2009 at 5:52 PM, David Levin <[email protected]> wrote:
>
>> I agree that the chromium buildbot seems to have more flakiness on layout
>> tests than webkit buildbots.
>
>
> While there is definitely more flakiness, I'm not sure how much more. I
> think the Chromium bots are primarily more flaky on the HTTP tests. What
> flakiness there is gets less noticed on the webkit buildbots since they
> don't close the tree.
>
>
>> Here are two things that may help us to understand this:
>> 1. It would be nice to save crash logs from OSX into the zip file. For
>> example, this run
>>
>> http://build.chromium.org/buildbot/waterfall/builders/Webkit%20Mac10.5%20(dbg)(2)/builds/3323/steps/webkit_tests/logs/stdio
>> had a crash and likely generated a crash log at
>> ~/Library/Logs/CrashReporter/TestShell*.crash which would help point to a
>> culprit.
>>
>
> +1 This would be very useful. That said, it won't do much to decrease
> flakiness. Very few of the flaky tests are flaky crashers. They're
> almost entirely flaky timeouts or failures, even on debug builders.
>
>> 2. If we suspect that tests may pass if given more time, then increase the
>> timeout and see if more tests pass but exceed this old timeout (log
>> something when this happens so we can validate that it is working).
>>
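A minimal sketch of the check in item 2, assuming per-test results are available as (name, passed, seconds) records after a run with the raised timeout. `RunRecord`, `passes_needing_extra_time`, the test names, and the numbers in the example are all hypothetical; none of them come from the real harness or the bots.

```python
from dataclasses import dataclass


@dataclass
class RunRecord:
    name: str
    passed: bool
    seconds: float


def passes_needing_extra_time(runs, old_timeout_s):
    """Return the runs that passed but took longer than the old timeout."""
    return [r for r in runs if r.passed and r.seconds > old_timeout_s]


if __name__ == "__main__":
    # Made-up records, only to show the shape of the log output.
    runs = [
        RunRecord("fast-example.html", True, 0.4),
        RunRecord("slow-example.html", True, 25.0),
        RunRecord("hung-example.html", False, 60.0),
    ]
    for r in passes_needing_extra_time(runs, old_timeout_s=20.0):
        print(f"passed only with extra time: {r.name} ({r.seconds:.1f}s)")
```

If this list stays empty across many runs, the extra time is not what the timing-out tests need.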
>
> -1 The test dashboard prints out the amount of time a test takes to run
> if it takes >1 second. I don't think the timing out tests would pass if we
> just gave them more time. Specifically, there are tests that always time out
> and there are flaky timeout tests. The flaky timeout tests, when they do
> pass, consistently take less than 10 seconds to run, and most of them take
> less than 1 second.
>
> Increasing the test timeout also *considerably* increases how long it takes
> for the bots to cycle. In fact, I think we should be *decreasing* it to
> something like 2 seconds. This would actually shave minutes off of the
> current bot cycle times.
>
> We have ~100 tests that are slow, many of which time out at 20 seconds. We
> should mark all the slow but passing tests as SLOW in the test expectations
> file. This will give them more time to run than the other tests. Then we
> should bring the timeout down to something like 2 seconds. This will make
> the bots run a lot faster and distinguish between the tests that time out
> and those that just take a long time to pass.
>
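The SLOW proposal above can be sketched as a per-test timeout lookup. The 2 second default is the value suggested in the thread. The thread only says SLOW tests get "more time", so the 20 second value here is an assumption borrowed from the current timeout, and the function names and test paths are hypothetical.

```python
DEFAULT_TIMEOUT_S = 2.0  # proposed default from the thread
SLOW_TIMEOUT_S = 20.0    # assumption: the thread does not say how much extra time SLOW gets


def timeout_for(test_name, slow_tests):
    """Pick the timeout for a test based on whether it is marked SLOW."""
    return SLOW_TIMEOUT_S if test_name in slow_tests else DEFAULT_TIMEOUT_S


def classify(test_name, passed, seconds, slow_tests):
    """Label a finished run as TIMEOUT, PASS, or FAIL under the per-test limit."""
    if seconds > timeout_for(test_name, slow_tests):
        return "TIMEOUT"
    return "PASS" if passed else "FAIL"


if __name__ == "__main__":
    # Hypothetical test paths, for illustration only.
    slow = {"http/tests/slow-example.html"}
    print(classify("http/tests/slow-example.html", True, 12.0, slow))
    print(classify("fast/quick-example.html", True, 12.0, slow))
```

With this split, a test that needs 12 seconds and is marked SLOW still passes, while an unmarked test that runs that long is reported as a timeout instead of holding up the bot.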
>
>> On Tue, Sep 8, 2009 at 5:41 PM, Dirk Pranke <[email protected]> wrote:
>>
>>> From what I've poked around at, many of the LayoutTest flaky failures
>>> are timeout-related.
>>
>>
> While more than half of the flaky tests on Windows and Mac are timeouts,
> many of the rest are crashes or failures. You can see this pretty clearly on
> the layout test dashboard. I'll note that on Linux, a very small percentage
> of the flakiness is timeouts. Almost all of these timeouts on Windows/Mac
> are HTTP tests. There are likely one or two causes for all the flakiness
> with the HTTP tests.
>
>>> There's something in the test harness and web
>>> server configurations that causes tests to be unpredictably slower. I
>>> don't think Apple has this problem, and I think that's because they
>>> use the built-in Apache instance in OS X,
>>
>>
> We switched away from Apache to lighttpd because of flakiness Apache was
> causing on Cygwin (Cygwin and Apache don't play well together). Maybe it
> makes sense to use lighttpd on Windows and Apache on Mac? I think we should
> identify the cause of the flakiness on Windows. Fixing that might fix the
> flakiness on Mac as well and we wouldn't need to support two HTTP servers.
>
>
>>> and also because they have a
>>> very different model for test execution (how we run tests in
>>> parallel).
>>
>>
> Running tests in parallel did seem to make things a bit more flaky, but not
> much. I haven't verified this, but I think it probably just magnified
> existing flakiness by putting higher load on the machine. Linux, the least
> flaky bot, is the only bot that has 4 cores instead of just 2, which means
> it runs using more TestShell instances in parallel.
>

