Sure. I was floating the idea first before doing any work, but I'll just grab an existing test run and hack it up for comparison ...
-- Dirk

On Fri, Oct 23, 2009 at 3:51 PM, Ojan Vafai <[email protected]> wrote:
> Can you give example outputs for the common cases? It would be easier to
> discuss those.
>
> On Fri, Oct 23, 2009 at 3:43 PM, Dirk Pranke <[email protected]> wrote:
>>
>> If you've never run run_webkit_tests to run the layout test
>> regression, or don't care about it, you can stop reading ...
>>
>> If you have run it, and you're like me, you've probably wondered a lot
>> about the output ... questions like:
>>
>> 1) what do the numbers printed at the beginning of the test mean?
>> 2) what do all of these "test failed" messages mean, and are they bad?
>> 3) what do the numbers printed at the end of the test mean?
>> 4) why are the numbers at the end different from the numbers at the
>>    beginning?
>> 5) did my regression run cleanly, or not?
>>
>> You may have also wondered a couple of other things:
>> 6) What do we expect this test to do?
>> 7) Where is the baseline for this test?
>> 8) What is the baseline search path for this test?
>>
>> Having just spent a week trying (again) to reconcile the numbers I'm
>> getting on the LTTF dashboard with what we print out in the test, I'm
>> thinking about drastically revising the output from the script,
>> roughly as follows:
>>
>> * print the information needed to reproduce the test and look at the
>>   results
>> * print the expected results in summary form (roughly the expanded
>>   version of the first table in the dashboard - # of tests by
>>   (wontfix/fix/defer x pass/fail/flaky))
>> * don't print out failure text to the screen during the run
>> * print out any *unexpected* results at the end (like we do today)
>>
>> The goal would be that if all of your tests pass, you get less than a
>> small screenful of output from running the tests.
>>
>> In addition, we would record a full log of (test, expectation, result)
>> to the results directory (and this would also be available onscreen
>> with --verbose).
>>
>> Lastly, I'll add a flag to re-run the tests that just failed, so it's
>> easy to tell whether the failures were flaky.
>>
>> Then I'll rip out as much of the set logic in test_expectations.py as
>> we can possibly get away with, so that no one has to spend the week I
>> just did again. I'll probably replace it with much of the logic I use
>> to generate the dashboard, which is much more flexible in terms of
>> extracting different types of queries and numbers.
>>
>> I think the net result will be the same level of information that we
>> get today, just in much more meaningful form.
>>
>> Thoughts? Comments? Is anyone particularly wedded to the existing
>> output, or worried about losing a particular piece of info?
>>
>> -- Dirk
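
To make the proposed summary concrete, here is a minimal sketch in Python of how a (test, expectation, result) log could be rolled up into the per-category counts and the list of unexpected results described above. The function names, the simplified expectation/result strings, and the sample data are illustrative assumptions, not the actual run_webkit_tests or test_expectations.py code:

# summarize_results.py -- a rough sketch only; the expectation and result
# strings are simplified placeholders, not the real expectations syntax.
import collections

def summarize(entries):
    """Count tests by (modifier, outcome), roughly the expanded version
    of the first dashboard table mentioned in the proposal."""
    counts = collections.Counter()
    for test, expectation, result in entries:
        if "WONTFIX" in expectation:
            modifier = "wontfix"
        elif "DEFER" in expectation:
            modifier = "defer"
        else:
            modifier = "fix"
        outcome = {"PASS": "pass", "FLAKY": "flaky"}.get(result, "fail")
        counts[(modifier, outcome)] += 1
    return counts

def unexpected(entries):
    """Tests whose result is not listed in their expectation; these are
    the ones to print at the end and to feed to a re-run-failures flag."""
    return [test for test, expectation, result in entries
            if result not in expectation.split()]

if __name__ == "__main__":
    # Stand-in for the full (test, expectation, result) log that would be
    # written to the results directory (and shown with --verbose).
    log = [
        ("fast/css/a.html", "PASS", "PASS"),
        ("fast/css/b.html", "DEFER : FAIL", "FAIL"),
        ("fast/forms/c.html", "WONTFIX : FAIL", "PASS"),
        ("fast/dom/d.html", "PASS", "FAIL"),
    ]
    for (modifier, outcome), count in sorted(summarize(log).items()):
        print("%-8s %-6s %d" % (modifier, outcome, count))
    print("unexpected results:", unexpected(log))

This kind of plain counting and filtering over the logged tuples is roughly the sort of query logic that could stand in for the set arithmetic in test_expectations.py when producing dashboard-style numbers.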
