Hi,
I read the entire thread and it's clear that the frustration is
palpable. Let's face it: the system is broken, and trying to fix it by
forcing rules and processes on people won't work. The objective of the
automated test system is to prevent the product from becoming too
unstable during the course of development and to promote a smooth,
regular, incremental development flow. If the system creates a backlog
of changes while we wait for the tests to go green, it is acting
against its own stated objective, which is really dysfunctional. It's
like a traffic light system that creates traffic jams instead of
keeping the traffic flowing...
Several things strike me as core to the issue:
- The functional test system is itself unstable: we need to fix issues
such as dependencies between tests, so that we avoid avalanche effects
and the point of failure shifting from run to run (see the first sketch
after this list).
- The functional test system is network dependent: this leads to
intermittent failures that render the current policy ineffective,
casting suspicion on commits that have nothing to do with the source of
the bug, so we end up getting the wrong people working on the wrong
issue (at worst) or getting people to ignore the nagging mail (at
best). If we want a strict "green/red" policy, we need tests that are
100% context free and deterministically reproducible on the same box.
We should have a way to test Chandler that does not depend on network
activity, e.g. have a Cosmo instance running on the test machine and
used for the sharing tests (I'm hand waving heavily here as to whether
this is at all possible), stub the network API (see the second sketch
after this list), etc.
- Tests for network activity: we should of course test that our
network functionality does work, but that should be a separate set of
tests with its own policy (TBD); the third sketch after this list shows
one way they could be split out.
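To make the first point concrete, here is a minimal sketch of the kind
of per-test isolation I mean. The class and method names are made up
(this is not our actual test code): each test builds its own fixture in
setUp instead of relying on state left behind by an earlier test, so
one failure cannot knock over the tests that follow it.

    import unittest

    class FakeRepository(object):
        """Stand-in for whatever shared state the real tests lean on."""
        def __init__(self):
            self.items = []
        def add(self, title):
            self.items.append(title)
            return title

    class EventTest(unittest.TestCase):
        # Each test gets its own repository in setUp, so a failure in
        # one test cannot leave state behind that breaks the next one.
        def setUp(self):
            self.repo = FakeRepository()

        def testStartsEmpty(self):
            self.assertEqual(self.repo.items, [])

        def testAddOne(self):
            self.assertEqual(self.repo.add("standup"), "standup")
            self.assertEqual(len(self.repo.items), 1)

    if __name__ == "__main__":
        unittest.main()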
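On stubbing the network API, here is a rough sketch of what I have in
mind; StubShareServer and publish() are invented for illustration, not
our real sharing code. The idea is that the code under test only needs
an object with put()/get(), so the functional test hands it an
in-memory stub and stays fully deterministic on one box, while the real
Cosmo-backed implementation would only be exercised by the separate
network suite.

    import unittest

    class StubShareServer(object):
        """In-memory stand-in for a Cosmo/WebDAV server; no network."""
        def __init__(self):
            self.collections = {}
        def put(self, name, data):
            self.collections[name] = data
        def get(self, name):
            return self.collections.get(name)

    def publish(server, name, data):
        # Hypothetical sharing routine under test: it only needs
        # something that answers put()/get(), whether that is a stub
        # or a real server.
        server.put(name, data)
        return server.get(name) == data

    class SharingTest(unittest.TestCase):
        def testPublishRoundTrip(self):
            server = StubShareServer()
            self.assertTrue(publish(server, "work-calendar", "ics-data"))

    if __name__ == "__main__":
        unittest.main()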
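And for splitting out the network tests, one way to do it is to gate
them behind an environment variable, so the default (check-in) run
stays deterministic and the network suite only runs under its own
policy. CHANDLER_NETWORK_TESTS is an invented name, and this assumes a
unittest with skip support; a separate suite file would work just as
well.

    import os
    import unittest

    RUN_NETWORK_TESTS = os.environ.get("CHANDLER_NETWORK_TESTS") == "1"

    class CosmoSharingTest(unittest.TestCase):
        # Network-dependent: skipped unless the network-test run asks
        # for it, so the default tinderbox cycle never touches the net.
        @unittest.skipUnless(RUN_NETWORK_TESTS,
                             "network tests disabled by default")
        def testPublishToRealCosmo(self):
            self.assertTrue(True)  # would talk to a real Cosmo here

    class LocalSharingTest(unittest.TestCase):
        # Deterministic; always runs, safe to gate check-ins on.
        def testPublishAgainstStub(self):
            self.assertTrue(True)

    if __name__ == "__main__":
        unittest.main()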
I propose that we hold a meeting on this, because I feel we need a
solution soon and an email discussion will just take too long.
Would Thursday afternoon work for most?
Cheers,
- Philippe
Heikki Toivonen wrote:
Andi Vajda wrote:
Even by 'strictly' following the rules, when a failure is intermittent,
you easily get into the situation of a bunch of check-ins having
happened since the possibly bad one. I think Bryan's alternative is an
improvement.
I was just thinking about 100% reproducible cases, or close to 100%.
It can be really hard to figure out which checkin caused a rare
intermittent bug. Reasonably reliable intermittent bugs should be dealt
with like 100% reproducible cases. The rare cases we have dealt with by
filing bugs and proceeding otherwise normally.
I don't think it would be a good idea to turn off intermittent tests.
First, when they succeed, they still provide information that new
code hasn't made those tests fail 100% of the time. And it is pretty easy
to check the new logs to see if it is a known intermittent failure.
If you really want to go the way of disabling all intermittent tests
then I am afraid that we'll have to turn off the whole functional test
suite right now, because there are at least two intermittent bugs that
manifest as test timeout and crash.
I have a sort of related question regarding test failures. Should we
stop further tests as soon as we see the first failure? This would
shorten Tinderbox cycle time when there was a problem. What we currently
do is that we run all unit tests, and if those passed, we run all
functional tests (and if those passed, perf boxes run all perf tests).