Hello,
For us too, as many of you pointed out before me, the main issue in maintaining the QtWebKit buildbot is not build breakage, because those are trivial fixes most of the time, but tracking down tests that passed the day before and fail after the night. Because of the timezone differences, most of the work on the Mac port is done while we here in Europe are asleep, so a green bot will most certainly be red in the morning, and a non-green bot gets out of sight very quickly.

However, I find the idea of the try-bots great, because it is a step toward a future where try-bots also run the layout tests. We are monitoring changes and their effect on our buildbot, tracking down incremental failures and flaky tests that cause false alarms and undermine trust in our bot.

Another huge problem is, as Dimitri pointed out too, that in most cases we are missing a certain DRT feature needed to run a test correctly. If we spend more effort improving (and, by the way, fixing) DRT and the test infrastructure than improving and fixing the port itself, then there might be a problem with the layout-testing approach on multiple ports.

BR:
Andras (bbandix)

Dimitri Glazkov wrote:
On Mon, Nov 16, 2009 at 5:56 AM, Xan Lopez <x...@gnome.org> wrote:
On Mon, Nov 16, 2009 at 3:33 PM, Gustavo Noronha Silva <g...@gnome.org> wrote:
On Mon, 2009-11-16 at 05:24 -0800, Adam Barth wrote:
Eric and I are working on a bot that might help this situation.
Essentially, the bot will try out patches on Qt and GTK and add a
comment to the bug if the patch regresses the build.  Our plan is to
start with compiling, but we'd eventually like to run the tests as
well.
That sounds like an awesome idea. Thanks for working on it. Build
breakages themselves have become less of an issue for us in recent times
- people are definitely more aware of our bots, and are acting on fixing
them when they break.

I think such an automated approach to running the build, and tests for
upcoming patches will surely help with giving this a second step
forward.
This is nice to see, but as Gustavo says build breakage is not too
much an issue at the moment for us: the build does not break very
often, and when it does people usually take the time to figure out
what happened and either fix it themselves or poke us about it.
That being said, this could be improved in any number of ways and I'm
happy to see it getting ever better.

What is effectively a black hole with respect to our time is test
breakage, though. We get new tests failing very regularly (either
through new tests or through new code making old tests fail), and once
the bots are red people tend to pay even less attention to them, so
they spiral out of control in a positive feedback loop until we have
tests failing in the double digits in a matter of days (or hours!). Of
course in an ideal world we'd have a team big enough to always have at
least one person looking at this and fixing the problems the moment
they arise, but unfortunately this is not the case.

This is a huge issue with the Chromium port as well. We spend quite a
bit of effort tracking down failing tests, only to discover that the
failure is due to one-port baselines or new functionality added to
DRT. I wonder if the approach we have today in regards to tests is not
sustainable with multiple vibrant ports, each spending way too much
time catching up.

:DG<
_______________________________________________
webkit-dev mailing list
webkit-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev
