[chromium-dev] Re: revising the output from run_webkit_tests

2009-11-03 Thread Dirk Pranke

For anyone who wants to follow along on this: I've filed
http://code.google.com/p/chromium/issues/detail?id=26659 to track it.

-- Dirk

On Sat, Oct 24, 2009 at 5:01 PM, Dirk Pranke dpra...@chromium.org wrote:
 Sure. I was floating the idea first before doing any work, but I'll
 just grab an existing test run and hack it up for comparison ...

 -- Dirk


[chromium-dev] Re: revising the output from run_webkit_tests

2009-10-24 Thread Dirk Pranke

Sure. I was floating the idea first before doing any work, but I'll
just grab an existing test run and hack it up for comparison ...

-- Dirk

On Fri, Oct 23, 2009 at 3:51 PM, Ojan Vafai o...@chromium.org wrote:
 Can you give example outputs for the common cases? It would be easier to
 discuss those.


[chromium-dev] Re: revising the output from run_webkit_tests

2009-10-23 Thread Ojan Vafai

Can you give example outputs for the common cases? It would be easier to
discuss those.

On Fri, Oct 23, 2009 at 3:43 PM, Dirk Pranke dpra...@chromium.org wrote:

 If you've never run run_webkit_tests to run the layout test
 regression suite, or don't care about it, you can stop reading ...

 If you have run it, and you're like me, you've probably wondered a lot
 about the output ... questions like:

 1) what do the numbers printed at the beginning of the test mean?
 2) what do all of these "test failed" messages mean, and are they bad?
 3) what do the numbers printed at the end of the test mean?
 4) why are the numbers at the end different from the numbers at the
 beginning?
 5) did my regression run cleanly, or not?

 You may have also wondered a couple of other things:
 6) What do we expect this test to do?
 7) Where is the baseline for this test?
 8) What is the baseline search path for this test?

 Having just spent a week trying (again) to reconcile the numbers I'm
 getting on the LTTF dashboard with what we print out in the test, I'm
 thinking about drastically revising the output from the script,
 roughly as follows:

 * print the information needed to reproduce the test and look at the
 results
 * print the expected results in summary form (roughly the expanded
 version of the first table in the dashboard: # of tests broken down
 by (wontfix/fix/defer) x (pass/fail/flaky); a rough sketch follows
 this list)
 * don't print out failure text to the screen during the run
 * print out any *unexpected* results at the end (like we do today)
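
A rough sketch of the kind of tally that could produce such a summary
(the modifier/outcome names and the table layout are illustrative
assumptions, not the actual run_webkit_tests code):

# Hypothetical sketch of the proposed summary: count tests by
# (modifier, outcome) and print one compact table instead of
# per-test failure text.
from collections import defaultdict

MODIFIERS = ('WONTFIX', 'FIX', 'DEFER')
OUTCOMES = ('PASS', 'FAIL', 'FLAKY')

def summarize(results):
    # results: iterable of (test_name, modifier, outcome) tuples.
    counts = defaultdict(int)
    for _name, modifier, outcome in results:
        counts[(modifier, outcome)] += 1
    print('%-8s' % '' + ''.join('%8s' % o for o in OUTCOMES))
    for m in MODIFIERS:
        print('%-8s' % m +
              ''.join('%8d' % counts[(m, o)] for o in OUTCOMES))

summarize([('fast/css/a.html', 'FIX', 'FAIL'),
           ('fast/dom/b.html', 'WONTFIX', 'PASS'),
           ('editing/c.html', 'DEFER', 'FLAKY')])

On a clean run, a table like this plus the reproduction info above
would be roughly all the screen output needed.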

 The goal would be that if all of your tests pass, you get at most a
 small screenful of output from running the tests.

 In addition, we would record a full log of (test, expectation, result)
 entries to the results directory (and this would also be available
 onscreen with --verbose).
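
As an illustration of how simple that log could be (the file name
full_results.txt and this helper are hypothetical, one line per test):

# Hypothetical sketch: append one "test expectation result" line per
# test to a plain-text log in the results directory, and echo it to
# the console only when --verbose was passed.
import os

def log_result(results_dir, test, expectation, result, verbose=False):
    line = '%s %s %s' % (test, expectation, result)
    with open(os.path.join(results_dir, 'full_results.txt'), 'a') as f:
        f.write(line + '\n')
    if verbose:
        print(line)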

 Lastly, I'll add a flag to re-run the tests that just failed, so it's
 easy to check whether the failures were flaky.
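
A minimal sketch of that retry pass, assuming a run_one(test) callable
that returns True on pass (the flag name and the wiring into the
runner are omitted; this is not the real implementation):

# Hypothetical sketch: run everything once, re-run only the failures,
# and report which of them passed on the second try (i.e. look flaky).
def run_with_retry(tests, run_one):
    failures = [t for t in tests if not run_one(t)]
    flaky = [t for t in failures if run_one(t)]
    persistent = [t for t in failures if t not in flaky]
    return flaky, persistent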

 Then I'll rip out as much of the set logic in test_expectations.py as
 we can possibly get away with, so that no one has to spend the week I
 just did again. I'll probably replace it with much of the logic I use
 to generate the dashboard, which is much more flexible in terms of
 extracting different types of queries and numbers.
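
The flexibility being described amounts to keeping one flat record per
test and answering questions by filtering; a small sketch under that
assumption (not the actual dashboard or test_expectations.py code):

# Hypothetical sketch of query-style bookkeeping: one dict per test,
# and counts are just filters over the list, instead of many
# precomputed sets.
def count(records, **criteria):
    return sum(1 for r in records
               if all(r.get(k) == v for k, v in criteria.items()))

records = [
    {'test': 'fast/css/a.html', 'modifier': 'FIX',
     'expected': 'FAIL', 'actual': 'FAIL'},
    {'test': 'fast/dom/b.html', 'modifier': 'WONTFIX',
     'expected': 'PASS', 'actual': 'TIMEOUT'},
]
print(count(records, modifier='FIX', actual='FAIL'))  # 1
print(count(records, expected='PASS'))                # 1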

 I think the net result will be the same level of information that we
 get today, just in much more meaningful form.

 Thoughts? Comments? Is anyone particularly wedded to the existing
 output, or worried about losing a particular piece of info?

 -- Dirk



[chromium-dev] Re: revising the output from run_webkit_tests

2009-10-23 Thread Nicolas Sylvain

On Fri, Oct 23, 2009 at 3:43 PM, Dirk Pranke dpra...@chromium.org wrote:


 Lastly, I'll add a flag to re-run the tests that just failed, so it's
 easy to check whether the failures were flaky.

This would be nice for the buildbots. We would also need to add a new
"Unexpected Flaky Tests" section (failed, then passed) to the results.

Nicolas
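
One way to read that suggestion, purely as an illustrative sketch (the
bucket names are assumptions): a test that fails on the first pass but
passes on the retry lands in its own bucket, separate from tests that
fail both times.

# Hypothetical sketch of the bucketing: classify a test from its
# first-run and retry outcomes.
def bucket(first_run_failed, retry_passed):
    if not first_run_failed:
        return 'passed'
    return 'unexpected flaky' if retry_passed else 'unexpected failure'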


