Hi all,

I have some thoughts about our page tests and Windmill tests, and I'd like to propose an experiment related to them. This is inspired by my own thinking about how best to test web UI, and by Rob's recent cost/benefit arguments for optional reviews.
Here's my assessment of the problem: the burden we carry for our current page tests and Windmill tests does not match the benefit we get from them.

The burden:

* Much longer test run times (locally, the bugs module's tests drop from 45 minutes to 20 without them)
* Fragile tests that block landings
* Fragile infrastructure (see the issues with Windmill tests under load)
* Confusion over how best to test UI (page tests vs. integration tests vs. browser unit tests)

All of these slow down development substantially.

The benefit:

* Catches web UI regressions

Perhaps there's more to say in terms of benefit (acceptance tests, happy-path testing, etc.), but in practice I think the only benefit is that we catch regressions in our UI. In my experience, that has been more true of Windmill than of page tests. I can't think of a single page test failure I've had that wasn't a bad test (i.e. one relying on HTML formatting). I'm sure these tests have caught real regressions at some point (surely they have! :-)), but that hasn't been my experience.

Here's my argument: since we're moving to continuous rollouts and having to focus more on daily QA, any regression these tests would catch should be caught in QA. If QA finds problems, we roll back the rev and try again. If a regression slips past QA, we roll back the rev as soon as we catch it. We can rely on ourselves and on beta users to catch these regressions, and by not carrying the burden of the tests we get quicker cycles and landings.

I'm not suggesting manual testing is better. In a perfect world we would have a fast test suite and get the benefit of both, but for now the test bloat is causing more harm than the benefit it brings. I think we could trust our QA process and our users until we have a test suite fast enough to make web UI tests worth worrying about again. All IMHO, of course. :-)

I'd like to propose an experiment to see if this holds true. I propose that for 3 months (or until the Thunderdome) we:

1.) disable page tests and Windmill tests
2.) leave Windmill around for running yuitests and API integration tests
3.) add unit test coverage for browser classes any time we touch UI in this period (if the page test was acting as a unit test for the view)
4.) track closely any regression that slips through, and the impact of that regression

A note about #2, #3, and #4:

For #2, I think we should still run the JS unit tests automatically, and in my experience Windmill is not fragile for yui tests. Migrating to jstestdriver, if we decide post-experiment to abandon Windmill completely, should be a separate issue, IMHO. Francis noted yesterday that the API tests for the js LP.client have no other coverage, so we should leave those around as well. Again, I don't think these change much or are subject to the same fragility.

My assumption about #3 is that we have tested some bits in page tests that have no other test coverage. For example, view methods that should be tested in browser unit tests but are only covered by a page test. I'm proposing we disable these tests but leave them around, and that we add unit test coverage where appropriate when touching UI (see the sketch below). I am not suggesting that browser unit tests attempt to replicate story tests. I think it would be good to run the page tests locally when working on UI, too, to see if something needs unit test coverage, but that can't really be enforced by the experiment.

As for #4, we don't currently track regressions, so I propose that we make use of a regression tag going forward, for *any* regression bug. This will also help with Robert's experiment. And as we fix these bugs, we tag them with ui-test-experiment if a page or Windmill test would have caught the regression.
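To make #3 a bit more concrete, here's a rough sketch of the kind of browser unit test I have in mind: exercising a view attribute directly rather than rendering and scraping a whole page. The specifics below (the DatabaseFunctionalLayer, the create_initialized_view helper, the "+index" view and its label attribute, the bug title) are just illustrative assumptions, not a prescription; the real test would use whatever fits the view being touched.

    # A minimal sketch of a browser unit test replacing a page test, for
    # illustration only.  The layer, helpers, view name, and the ``label``
    # attribute are assumptions, not existing coverage in our tree.
    from canonical.testing import DatabaseFunctionalLayer

    from lp.testing import TestCaseWithFactory
    from lp.testing.views import create_initialized_view


    class TestBugTaskViewLabel(TestCaseWithFactory):
        """Exercise a view attribute directly instead of via a page test."""

        layer = DatabaseFunctionalLayer

        def test_label_includes_bug_title(self):
            # Build just enough data for the view, then assert on the one
            # behaviour we care about.  No browser, no full page render.
            bug = self.factory.makeBug(title="Printing fails on Tuesdays")
            view = create_initialized_view(bug.default_bugtask, name="+index")
            self.assertIn("Printing fails on Tuesdays", view.label)

The point is that a test like this runs in seconds and pins down the one behaviour the page test was really guarding, without the fragility of driving a browser or matching rendered HTML.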
After the experiment completes, we should make an assessment. Did the decreased test run times affect cycle time? Did the simpler UI testing strategy affect cycle time? Did we introduce regressions that the tests would have caught? Was the impact of those regressions serious or minor? And then: should we continue with this, try something new, or re-enable the tests and return to what we had before?

What do you all think?

Cheers,
deryck

--
Deryck Hodge
https://launchpad.net/~deryck
http://www.devurandom.org/

