Sure. I was floating the idea first before doing any work, but I'll
just grab an existing test run and hack it up for comparison ...

-- Dirk

On Fri, Oct 23, 2009 at 3:51 PM, Ojan Vafai <[email protected]> wrote:
> Can you give example outputs for the common cases? It would be easier to
> discuss those.
>
> On Fri, Oct 23, 2009 at 3:43 PM, Dirk Pranke <[email protected]> wrote:
>>
>> If you've never run run_webkit_tests to run the layout test regression
>> suite, or don't care about it, you can stop reading ...
>>
>> If you have run it, and you're like me, you've probably wondered a lot
>> about the output ... questions like:
>>
>> 1) what do the numbers printed at the beginning of the test mean?
>> 2) what do all of these "test failed" messages mean, and are they bad?
>> 3) what do the numbers printed at the end of the test mean?
>> 4) why are the numbers at the end different from the numbers at the
>> beginning?
>> 5) did my regression run cleanly, or not?
>>
>> You may have also wondered a couple of other things:
>> 6) What do we expect this test to do?
>> 7) Where is the baseline for this test?
>> 8) What is the baseline search path for this test?
>>
>> Having just spent a week trying (again) to reconcile the numbers I'm
>> getting on the LTTF dashboard with what we print out in the test, I'm
>> thinking about drastically revising the output from the script,
>> roughly as follows:
>>
>> * print the information needed to reproduce the test and look at the
>> results
>> * print the expected results in summary form (roughly the expanded
>> version of the first table in the dashboard: # of tests by
>> (wontfix/fix/defer) x (pass/fail/flaky))
>> * don't print out failure text to the screen during the run
>> * print out any *unexpected* results at the end (like we do today)
>>
>> The goal would be that if all of your tests pass, you get less than a
>> small screenful of output from running the tests.
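>>
>> Just to make the summary idea concrete -- this is only a rough sketch,
>> not real code, and the function and variable names below are
>> placeholders -- it boils down to tallying tests by (category, outcome)
>> and printing one line per category:
>>
>>   def print_expectation_summary(expected):
>>       # expected: iterable of (test_name, category, outcome) tuples,
>>       # e.g. ('fast/js/foo.html', 'FIX', 'FAIL')
>>       counts = {}
>>       for _, category, outcome in expected:
>>           counts[(category, outcome)] = counts.get((category, outcome), 0) + 1
>>       for category in ('WONTFIX', 'FIX', 'DEFER'):
>>           row = ['%s=%d' % (outcome, counts.get((category, outcome), 0))
>>                  for outcome in ('PASS', 'FAIL', 'FLAKY')]
>>           print('%-8s %s' % (category, '  '.join(row)))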
>>
>> In addition, we would record a full log of (test, expectation, result)
>> to the results directory (and this would also be available onscreen
>> with --verbose).
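>>
>> As a rough sketch (the file name and line format here are placeholders,
>> not a decision), the log could be as simple as one line per test
>> written alongside the other results:
>>
>>   import os
>>
>>   def write_results_log(results_dir, results):
>>       # results: iterable of (test_name, expectation, actual_result)
>>       log_path = os.path.join(results_dir, 'results.txt')  # placeholder
>>       with open(log_path, 'w') as log:
>>           for test_name, expectation, actual in results:
>>               log.write('%s expected=%s actual=%s\n' %
>>                         (test_name, expectation, actual))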
>>
>> Lastly, I'll add a flag to re-run the tests that just failed, so it's
>> easy to check whether the failures were flaky.
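>>
>> Roughly (the helper names below are placeholders, not the real
>> interface), the retry pass comes down to:
>>
>>   def retry_failures(failed_tests, run_single_test):
>>       # run_single_test(test) is assumed to return True when it passes.
>>       still_failing = [t for t in failed_tests if not run_single_test(t)]
>>       flaky = [t for t in failed_tests if t not in still_failing]
>>       return still_failing, flaky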
>>
>> Then I'll rip out as much of the set logic in test_expectations.py as
>> we can possibly get away with, so that no one else has to spend a week
>> on it the way I just did. I'll probably replace it with the logic I use
>> to generate the dashboard, which is much more flexible for extracting
>> different kinds of queries and numbers.
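>>
>> The rough idea (again only a sketch, not the dashboard code itself, and
>> the field names are placeholders) is to keep the expectations as plain
>> records and answer questions with simple filters instead of
>> precomputed sets:
>>
>>   def query(expectations, **criteria):
>>       # expectations: list of dicts, e.g.
>>       #   {'test': 'fast/js/foo.html', 'modifier': 'DEFER', 'outcome': 'FAIL'}
>>       return [e for e in expectations
>>               if all(e.get(k) == v for k, v in criteria.items())]
>>
>>   # e.g. number of FIX tests expected to be flaky:
>>   # len(query(expectations, modifier='FIX', outcome='FLAKY'))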
>>
>> I think the net result will be the same level of information that we
>> get today, just in much more meaningful form.
>>
>> Thoughts? Comments? Is anyone particularly wedded to the existing
>> output, or worried about losing a particular piece of info?
>>
>> -- Dirk
>
>
