Re: Comment about BAIL_OUT
Greg Sabino Mullane wrote:
> [...]
> [1] I've never had a need for random tests myself. The only reason I break mine apart is to isolate testing various sub-systems, but I almost always end up having some dependencies put into an early 00 file. I also tend to have a final 99 cleanup file. While I could in theory have each file be independent, in practice it's a lot of duplicated code and a lot of time overhead, so it's either the 00-99 or (as I sometimes have done) one giant testing file.

Ah, then I suppose you have never considered the need to run a huge test suite in parallel on a multi-CPU box, where tests really are run at random (in relation to each other). 99 may complete before 00 has finished. This is a big problem for the test suite of Perl itself.

David
Re: Comment about BAIL_OUT
Ovid wrote:
> However, if you use the '-s' switch to shuffle your tests and bailout is not first, then some tests will run until the BAIL_OUT is hit. This seems to violate the principle that tests should be able to run in any order without dependencies.

It doesn't violate the principle since the tests are not dependent on BAIL_OUT happening; it's just a convenience. The tests should still run fine in any order, it'll just be a lot noisier.
Desired test output?
From http://www.perlmonks.org/?node_id=593087:

I'm working on TAPx::Parser and trying very hard to make my TAPx::Harness output as similar to Test::Harness output as is feasible. I am doing this primarily because if the basic output is too different from what folks are used to, the strangeness might discourage adoption. However, petdance has said the following:

> Please don't try to make it match Test::Harness. The Test::Harness format is clunky and yucky. I've been meaning to redo it for some time. Maybe you can come up with something beautiful that I can steal instead!

Well, the fact remains, I can't, and that's where I need your help, Monks! Consider the following summary from a failed test run:

    Failed Test  Stat Wstat Total Fail  List of Failed
    ---------------------------------------------------
    t/bar.t         4  1024    13    4  2 6-8
    t/foo.t         1   256    10    1  5
     (1 subtest UNEXPECTEDLY SUCCEEDED).
    Failed 2/3 test scripts. 5/33 subtests failed.
    Files=3, Tests=33, 0 wallclock secs ( 0.10 cusr + 0.01 csys = 0.11 CPU)
    Failed 2/3 test programs. 5/33 subtests failed.

I agree with petdance that this is rather clunky, but I'd love to hear your feedback on how you'd really like to see that information. That's what I'm working on right now and I'd love to get something that's more useful for folks.

Keep in mind that TAPx::Parser collects far more information than Test::Harness, so if there's more stuff you'd like to see, that's fine, too. I can break it down on a test-by-test basis and show failed tests, unexpectedly succeeding tests, skipped tests, and so on. However, this could be a lot of clutter. What would be a good, easy-to-read format for this? Feel free to get creative and post examples.

Cheers,
Ovid

--
Buy the book -- http://www.oreilly.com/catalog/perlhks/
Perl and CGI -- http://users.easystreet.com/ovid/cgi_course/
Re: Desired test output?
# from Ovid
# on Friday 05 January 2007 01:50 am:

> TAPx::Parser collects far more information than Test::Harness, so if there's more stuff you'd like to see, that's fine, too.

You could dump it all into some kind of data (yaml?) file, then execute $ENV{TAP_RESULTS_VIEWER} or something? TAP_RESULTS_VIEWER could then dump some ascii art, run/signal a gui program, do an html convert + browser launch, etc.

Console output at 80 chars wide is inherently limited. Even starting some ncurses program or emacs mode would be more powerful for those who are stuck developing on a tty (and of course, said program could merely dump the report du jour on the terminal (or stock tickers if things are going particularly badly)).

--Eric
--
Everything goes wrong all at once.
--Quantized Revision of Murphy's Law
---
http://scratchcomputing.com
---
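
The viewer idea above could look roughly like this. It is only a minimal sketch under several assumptions: the results are already available as a plain Perl data structure (the layout used here is made up), JSON via the core JSON::PP module stands in for the YAML file that was floated, and dump_results() and show_results() are hypothetical names rather than part of TAPx::Parser. TAP_RESULTS_VIEWER is the environment variable proposed in the message.

```perl
use strict;
use warnings;
use JSON::PP ();
use File::Temp qw(tempfile);

# Write a results structure to a temporary JSON file and return its path.
# The structure itself is an assumption; a real harness would supply it.
sub dump_results {
    my ($results) = @_;
    my ( $fh, $path ) = tempfile(
        'tap-results-XXXXXX',
        SUFFIX => '.json',
        TMPDIR => 1,
        UNLINK => 0,    # keep the file so the viewer can read it
    );
    print {$fh} JSON::PP->new->canonical->pretty->encode($results);
    close $fh or die "Cannot close $path: $!";
    return $path;
}

# Dump the results, then hand the file to the viewer named in the
# environment, if any. Without a viewer the file is simply left behind.
sub show_results {
    my ($results) = @_;
    my $path = dump_results($results);
    if ( my $viewer = $ENV{TAP_RESULTS_VIEWER} ) {
        system( $viewer, $path ) == 0
            or warn "Viewer '$viewer' exited with status $?\n";
    }
    return $path;
}
```

Setting TAP_RESULTS_VIEWER to a pager, a script that converts to HTML, or a GUI launcher would then change the presentation without touching the harness.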
Re: Comment about BAIL_OUT
Michael G Schwern wrote:
> Such a bother. ... You can even get clever and pack the setup/teardown calls into loading the module so you have even less code per script. Now each test runs independently and cleans itself up.

True, but at the expense of having to run the startup and cleanup code each time, which in most of my particular cases gets very expensive. It also violates the principle of DRY. :) It would be nice if there was something like t/_BEGIN_.t and t/_END_.t that would always run before and after any set of tests (even shuffled ones!) Sure, there are hacks and workarounds, but something builtin would be nice.

Ovid wrote:
> However, if you use the '-s' switch to shuffle your tests and bailout is not first, then some tests will run until the BAIL_OUT is hit. This seems to violate the principle that tests should be able to run in any order without dependencies.

Michael G Schwern replied:
> It doesn't violate the principle since the tests are not dependent on BAIL_OUT happening, it's just a convenience. The tests should still run fine in any order, it'll just be a lot noisier.

I think Ovid means it violates it in the sense that BAIL_OUT typically stops all subsequent tests, which implies some sort of ordering. I've certainly used it that way before, in the manner of:

    01example.t - a simple test
    02example.t - another simple test
    03example.t - a complex test which requires Foo::Bar, BAIL_OUT if not found
    04example.t - requires Foo::Bar
    05example.t - requires Foo::Bar

I want a failure in 3 to stop 4 and 5 from running. For that matter, I want a failure in 3 or 4 or 5 to prevent any of the other two from running. But 1 and 2 can run regardless of failures in 3, 4, and 5. The ordering is a convenience to doing so, but ideally there would be some way to interact with the testing program and do the right thing, so that instead of BAILing out at 3, it bails out of the current test, sets a flag, and then 4 and 5 can check for the flag and skip if it is not set.
--
Greg Sabino Mullane
End Point Corporation
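
One way to approximate the flag idea with what exists today is a marker file shared between test scripts. The following is a rough sketch, not a harness feature: the file path, the FOO_BAR_FLAG environment variable and the three helper names are all hypothetical. It reads "the flag" as "the prerequisite was found", so later scripts skip when the flag is absent. It still depends on 03 running before 04 and 05, so it does not solve the shuffled or parallel case.

```perl
use strict;
use warnings;

# Hypothetical location of the marker file; nothing in Test::Harness or
# TAPx::Parser knows about it.
my $FLAG = $ENV{FOO_BAR_FLAG} || 't/.foo_bar_ok';

# Create the marker file to record that the prerequisite is available.
sub set_flag {
    my ($path) = @_;
    open my $fh, '>', $path or die "Cannot write $path: $!";
    close $fh or die "Cannot close $path: $!";
    return 1;
}

# True if an earlier test script recorded the prerequisite.
sub flag_is_set {
    my ($path) = @_;
    return -e $path ? 1 : 0;
}

# Remove the marker, e.g. from a final cleanup script.
sub clear_flag {
    my ($path) = @_;
    unlink $path if -e $path;
    return 1;
}

# In 03example.t, instead of BAIL_OUT:
#     set_flag($FLAG) if eval { require Foo::Bar; 1 };
#
# In 04example.t and 05example.t:
#     use Test::More;
#     plan skip_all => 'Foo::Bar not available' unless flag_is_set($FLAG);
```

With this arrangement 01 and 02 never look at the flag and always run, while 04 and 05 report themselves as skipped rather than failing noisily.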
Re: Comment about BAIL_OUT
Greg Sabino Mullane wrote:
> Michael G Schwern wrote:
> > Such a bother. ... You can even get clever and pack the setup/teardown calls into loading the module so you have even less code per script. Now each test runs independently and cleans itself up.
>
> True, but at the expense of having to run the startup and cleanup code each time, which in most of my particular cases gets very expensive. It also violates the principle of DRY. :) It would be nice if there was something like t/_BEGIN_.t and t/_END_.t that would always run before and after any set of tests (even shuffled ones!) Sure, there are hacks and workarounds, but something builtin would be nice.

You can already control the order that tests run in. Why do you need more than that?

> I want a failure in 3 to stop 4 and 5 from running. For that matter, I want a failure in 3 or 4 or 5 to prevent any of the other two from running. But 1 and 2 can run regardless of failures in 3, 4, and 5. The ordering is a convenience to doing so, but ideally there would be some way to interact with the testing program and do the right thing, so that instead of BAILing out at 3, it bails out of the current test, sets a flag, and then 4 and 5 can check for the flag and skip if it is not set.

I think you're expecting too much from the default test harness. If your project has special needs then write your own test harness. It should be dead simple now with TAPx::Parser.

--
Michael Peters
Developer
Plus Three, LP
Thoughts about test harness summary
Pursuant to some discussion with BrowserUK at http://perlmonks.org/?node_id=593087, I'm looking at this and seeing some problems.

    Failed Test  Stat Wstat Total Fail  List of Failed
    ---------------------------------------------------
    t/bar.t         4  1024    13    4  2 6-8
    t/foo.t         1   256    10    1  5
     (2 subtests UNEXPECTEDLY SUCCEEDED).
    Failed 2/3 test scripts. 5/33 subtests failed.
    Files=3, Tests=33, 0 wallclock secs ( 0.10 cusr + 0.01 csys = 0.11 CPU)
    Failed 2/3 test programs. 5/33 subtests failed.

How about this instead?

    Failed Test | Total | Fail | List of Failed | TODO Passed
    ------------+-------+------+----------------+------------
    t/bar.t     |    13 |    4 | 2, 6-8         | 3-4
    ------------+-------+------+----------------+------------
    t/foo.t     |    10 |    1 | 5              |

    Time: 0 wallclock secs ( 0.10 cusr + 0.01 csys = 0.11 CPU)
    Files=3. Failed 2/3 test programs. 5/33 subtests failed.

That will be annoying to put together, but it really seems a lot cleaner, with no duplicate info (yes, I've already thought about the List of Failed and TODO Passed spanning more than one line.)

Cheers,
Ovid
Re: Thoughts about test harness summary
On Fri, Jan 05, 2007 at 11:11:25AM -0800, Ovid wrote:
> Pursuant to some discussion with BrowserUK at http://perlmonks.org/?node_id=593087, I'm looking at this and seeing some problems.
>
>     Failed Test  Stat Wstat Total Fail  List of Failed
>     ---------------------------------------------------
>     t/bar.t         4  1024    13    4  2 6-8
>     t/foo.t         1   256    10    1  5
>      (2 subtests UNEXPECTEDLY SUCCEEDED).
>     Failed 2/3 test scripts. 5/33 subtests failed.
>     Files=3, Tests=33, 0 wallclock secs ( 0.10 cusr + 0.01 csys = 0.11 CPU)
>     Failed 2/3 test programs. 5/33 subtests failed.
>
> How about this instead?
>
>     Failed Test | Total | Fail | List of Failed | TODO Passed
>     ------------+-------+------+----------------+------------
>     t/bar.t     |    13 |    4 | 2, 6-8         | 3-4
>     ------------+-------+------+----------------+------------
>     t/foo.t     |    10 |    1 | 5              |

I'd like Wstat, even if I don't have Stat. I like to know if tests coredumped. I may be in a minority here, but being able to optionally switch to that output is useful.

I like the prominence of TODO passed.

I'm not sure if I like the lines making the table. I guess it's a bit of a bikeshed, but having horizontal lines between each will increase the amount of vertical space needed to convey the same information, which will mean fewer failures will be needed to exceed my screen's height.

Nicholas Clark
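
For readers wondering what Wstat carries that Stat does not: on Unix-like systems the wait status packs the exit code, the terminating signal and a core-dump bit into one integer, the same layout Perl documents for $?. The helper below is a small illustrative sketch; decode_wstat() is a hypothetical name, not something provided by Test::Harness or TAPx::Parser.

```perl
use strict;
use warnings;

# Split a wait status (as in Perl's $?) into its three parts.
sub decode_wstat {
    my ($wstat) = @_;
    return {
        exit   => $wstat >> 8,                  # exit code of the test program
        signal => $wstat & 127,                 # signal that killed it, if any
        core   => ( $wstat & 128 ) ? 1 : 0,     # true if a core was dumped
    };
}

# Values taken from the summary quoted above.
for my $wstat ( 1024, 256 ) {
    my $status = decode_wstat($wstat);
    printf "Wstat %4d: exit=%d signal=%d core=%d\n",
        $wstat, @{$status}{qw(exit signal core)};
}
```

The Wstat values 1024 and 256 in the quoted summary decode to exit codes 4 and 1, matching the Stat column, which is why Stat alone adds little. A non-zero signal or core field is the information that would otherwise be lost.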
Re: Thoughts about test harness summary
On Jan 5, 2007, at 1:28 PM, Nicholas Clark wrote:
>     Failed Test | Total | Fail | List of Failed | TODO Passed
>     ------------+-------+------+----------------+------------
>     t/bar.t     |    13 |    4 | 2, 6-8         | 3-4
>     ------------+-------+------+----------------+------------
>     t/foo.t     |    10 |    1 | 5              |

The vertical lines are just noise. What Tufte calls chartjunk.

--
Andy Lester => [EMAIL PROTECTED] => www.petdance.com => AIM:petdance
Re: Thoughts about test harness summary
On 1/5/07, Andy Lester <[EMAIL PROTECTED]> wrote:
> On Jan 5, 2007, at 1:28 PM, Nicholas Clark wrote:
> >     Failed Test | Total | Fail | List of Failed | TODO Passed
> >     ------------+-------+------+----------------+------------
> >     t/bar.t     |    13 |    4 | 2, 6-8         | 3-4
> >     ------------+-------+------+----------------+------------
> >     t/foo.t     |    10 |    1 | 5              |
>
> The vertical lines are just noise. What Tufte calls chartjunk.

Moreover, it looks really horrid with non-monospaced fonts.

David
Re: Thoughts about test harness summary
--- Nicholas Clark <[EMAIL PROTECTED]> wrote:
> I'd like Wstat, even if I don't have Stat. I like to know if tests coredumped. I may be in a minority here, but being able to optionally switch to that output is useful.

Fair enough.

> I like the prominence of TODO passed

Cool. I hated wedging it in there like that, but I couldn't think of another nice format.

> I'm not sure if I like the lines making the table. I guess it's a bit of a bikeshed, but having horizontal lines between each will increase the amount of vertical space needed to convey the same information, which will mean fewer failures will be needed to exceed my screen's height.

So with optional Wstat, you're thinking something like this?

    Failed Test | Wstat | Total | Fail | List of Failed | TODO Passed
    ------------+-------+-------+------+----------------+------------
    t/bar.t     |  1024 |    13 |    4 | 2, 6-8         | 3-4
    t/foo.t     |   256 |    10 |    1 | 5              |

And Andy Lester wrote:
> The vertical lines are just noise. What Tufte calls chartjunk.

That's what I thought at first, too, but look at this:

    Failed Test  Total  Fail  List of Failed          TODO Passed
    ------------+------+-----+-----------------------+------------
    t/bar.t         13     9  2, 6-8, 13, 17, 33-35   3-4
    t/foo.t         10    10  5, 19, 27, 37-38, 117   9-11

In trying to pack in the 'TODO passed' information, I'm running out of room. That's becoming annoying to read. (It would be even worse without those plus signs:

    Failed Test  Total  Fail  List of Failed          TODO Passed
    --------------------------------------------------------------
    t/bar.t         13     9  2, 6-8, 13, 17, 33-35   3-4
    t/foo.t         10    10  5, 19, 27, 37-38, 117   9-11

)

Did you mean that you didn't like the 'horizontal' lines?

--
Cheers,
Ovid
Re: Thoughts about test harness summary
--- David Golden <[EMAIL PROTECTED]> wrote:
> On 1/5/07, Andy Lester <[EMAIL PROTECTED]> wrote:
> > On Jan 5, 2007, at 1:28 PM, Nicholas Clark wrote:
> > >     Failed Test | Total | Fail | List of Failed | TODO Passed
> > >     ------------+-------+------+----------------+------------
> > >     t/bar.t     |    13 |    4 | 2, 6-8         | 3-4
> > >     ------------+-------+------+----------------+------------
> > >     t/foo.t     |    10 |    1 | 5              |
> >
> > The vertical lines are just noise. What Tufte calls chartjunk.
>
> Moreover, it looks really horrid with non-monospaced fonts.

You use non-monospaced fonts in your terminal? :)

Cheers,
Ovid
Re: Thoughts about test harness summary
On Jan 5, 2007, at 1:46 PM, Ovid wrote:
>     Failed Test  Total  Fail  List of Failed          TODO Passed
>     --------------------------------------------------------------
>     t/bar.t         13     9  2, 6-8, 13, 17, 33-35   3-4
>     t/foo.t         10    10  5, 19, 27, 37-38, 117   9-11
>
> Did you mean that you didn't like the 'horizontal' lines?

Both.

How about

    t/bar.t    9/13  Failed: 2, 6-8, 13, 17, 33-35
                     TODO passed: 3-4
    t/foo.t   10/10  Failed: 5, 19, 27, 37-38, 117
                     9-11

That should degrade nicely in non-monospaced.

--
Andy Lester => [EMAIL PROTECTED] => www.petdance.com => AIM:petdance
TAP::Tests
Some of the limitations of TAPx::Parser are due to how Test::Builder works. One thing which isn't making it into 'runtests' is the -Q switch. I have a -q which doesn't print test failures while tests are running, but as you can see, one of my 'stress tests' caused a problem:

    TAPx-Parser $ /usr/bin/perl -Ilib bin/runtests -qm tbad/
    tbad/060-aggregator..ok
    tbad/badtests........FAILED tests 1, 2, 4, 5, 7, 8, 10, 11, 13, 14, 16, 17, 19, 20, 22, 23, 25, 26, 28, 29, 31, 32, 34, 35, 37, 38, 40, 41, 43, 44, 46, 47, 49, 50, 52, 53, 55, 56, 58, 59, 61, 62, 64, 65, 67, 68, 70, 71, 73, 74, 76, 77, 79, 80, 82, 83, 85, 86, 88, 89, 91, 92, 94, 95, 97, 98, 100
            Failed 67/100 tests, 33.00% okay
    tbad/ddd.............ok
    tbad/longtestfilenameFAILED tests 3, 10
            Failed 2/10 tests, 80.00% okay (less 1 skipped test: 7 okay, 70.00%)
    <snip>

There are still problems with the current TAP producer which still shoves things into STDERR, even though I'm using an experimental '-m' switch which merges STDERR and STDOUT. I probably need to figure out the IPC::Open3 solution which has been proposed, but for the time being, I've given up on the thought of having tests completely silent except for a summary.

TAP::Tests would not only allow us to have a clean way of getting to TAP 2.0, but it would also allow us to include more standard test functions and clean up bits of the current testing framework that we're not happy with:

    use TAP::Tests tests => 3;

    ok $foo, 'foo is ok';
    if ( some_condition ) {
        skip 1;
    }
    else {
        is $this, $that, 'this == that';
    }
    throws_ok { some_func() }
        'Exception::Hissy::Fit',
        'some_func() should throw a hissy fit';

And so on. If anyone has far more tuits than they really know what to do with, now you know :)

Cheers,
Ovid
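
For reference, capturing a test program's STDOUT and STDERR on one handle with IPC::Open3 could be sketched as below. This is not the code behind the '-m' switch, and the shape of the proposal mentioned above is not known here; run_merged() is a hypothetical helper. It relies on the documented IPC::Open3 behaviour that a false error handle sends the child's STDERR to the same handle as its STDOUT.

```perl
use strict;
use warnings;
use IPC::Open3 qw(open3);

# Run a command and return its exit code followed by every line it wrote
# to STDOUT or STDERR, merged on a single handle.
sub run_merged {
    my (@cmd) = @_;

    # Passing undef as the error handle merges STDERR into STDOUT.
    my $pid = open3( my $in, my $out, undef, @cmd );
    close $in;    # the child gets no input from us

    my @lines = <$out>;
    close $out;
    waitpid $pid, 0;

    return ( $? >> 8, @lines );
}
```

Called as run_merged( $^X, 't/foo.t' ), this returns diagnostics and test lines together, so a quiet mode can discard or keep them in one place. Because the child buffers the two streams independently, the relative order of diagnostics and test lines is not guaranteed.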
Re: TAP::Tests
--- Ovid <[EMAIL PROTECTED]> wrote:
> Some of the limitations of TAPx::Parser are due to how Test::Builder works. One thing which isn't making it into 'runtests' is the -Q switch. I have a -q which doesn't print test failures while tests are running, but as you can see, one of my 'stress tests' caused a problem:

Side note: I've figured out how to shoehorn the -Q switch on there. TAP::Tests would still be very useful though.

Cheers,
Ovid
Re: Thoughts about test harness summary
On 1/5/07, Ovid <[EMAIL PROTECTED]> wrote:
> > Moreover, it looks really horrid with non-monospaced fonts.
>
> You use non-monospaced fonts in your terminal? :)

That's gmail for you.

David
Re: TAP::Tests
Ovid wrote:
> Some of the limitations of TAPx::Parser are due to how Test::Builder works. One thing which isn't making it into 'runtests' is the -Q switch. I have a -q which doesn't print test failures while tests are running, but as you can see, one of my 'stress tests' caused a problem:
>
>     TAPx-Parser $ /usr/bin/perl -Ilib bin/runtests -qm tbad/
>     tbad/060-aggregator..ok
>     tbad/badtests........FAILED tests 1, 2, 4, 5, 7, 8, 10, 11, 13, 14, 16, 17, 19, 20, 22, 23, 25, 26, 28, 29, 31, 32, 34, 35, 37, 38, 40, 41, 43, 44, 46, 47, 49, 50, 52, 53, 55, 56, 58, 59, 61, 62, 64, 65, 67, 68, 70, 71, 73, 74, 76, 77, 79, 80, 82, 83, 85, 86, 88, 89, 91, 92, 94, 95, 97, 98, 100
>             Failed 67/100 tests, 33.00% okay
>     tbad/ddd.............ok
>     tbad/longtestfilenameFAILED tests 3, 10
>             Failed 2/10 tests, 80.00% okay (less 1 skipped test: 7 okay, 70.00%)
>     <snip>

Ok, I'm blind. I don't see the problem.
Re: TAP::Tests
--- Michael G Schwern <[EMAIL PROTECTED]> wrote:
> >     TAPx-Parser $ /usr/bin/perl -Ilib bin/runtests -qm tbad/
> >     tbad/060-aggregator..ok
> >     tbad/badtests........FAILED tests 1, 2, 4, 5, 7, 8, 10, 11, 13, 14, 16, 17, 19, 20, 22, 23, 25, 26, 28, 29, 31, 32, 34, 35, 37, 38, 40, 41, 43, 44, 46, 47, 49, 50, 52, 53, 55, 56, 58, 59, 61, 62, 64, 65, 67, 68, 70, 71, 73, 74, 76, 77, 79, 80, 82, 83, 85, 86, 88, 89, 91, 92, 94, 95, 97, 98, 100
> >             Failed 67/100 tests, 33.00% okay
> >     tbad/ddd.............ok
> >     tbad/longtestfilenameFAILED tests 3, 10
> >             Failed 2/10 tests, 80.00% okay (less 1 skipped test: 7 okay, 70.00%)
> >     <snip>
>
> Ok, I'm blind. I don't see the problem.

Sorry, I wasn't clear because I went ahead and trimmed the test summary output which duplicates the above list of failed tests.

While I've managed to fix the problem, see that long list of failed tests? That information is printed after every test program terminates and then it's printed again in the summary. My -q and -Q options are ways of suppressing the printing of extra information.

-q just suppresses the printing of test failures as they occur. This is useful if you just want to keep out clutter. Also, the list of FAILED tests after each test program is just duplicated in the summary and that's one objection I have already encountered when I've inquired about redoing the test output. My current test output is now like this:

    t/last_minute...ok
    t/head_end......ok
    t/head_fail.....Failed 1/4 tests
    t/inc_taint.....Failed 1/1 tests
    t/no_nums.......Failed 1/5 tests
    <summary snipped again because right now it's very ugly>

-Q suppresses *any* test output except the summary, so even the above doesn't show up. However, I was having a problem because of how Test::Builder defaulted to sending diagnostic information to STDERR. I've corrected this now by automatically enabling the experimental --merge feature which merges STDERR and STDOUT. Note that --merge is off by default but -q, -Q, and --failures (only show test failures when in verbose mode) require --merge to be enabled.

Incidentally, the -Q and -q features have the nice benefit of tremendously speeding up some test suites. When you need the detail, just leave 'em off.

Counter-arguments welcome.

Cheers,
Ovid
Re: TAP::Tests
Ovid wrote:
> --- Michael G Schwern <[EMAIL PROTECTED]> wrote:
> > >     TAPx-Parser $ /usr/bin/perl -Ilib bin/runtests -qm tbad/
> > >     tbad/060-aggregator..ok
> > >     tbad/badtests........FAILED tests 1, 2, 4, 5, 7, 8, 10, 11, 13, 14, 16, 17, 19, 20, 22, 23, 25, 26, 28, 29, 31, 32, 34, 35, 37, 38, 40, 41, 43, 44, 46, 47, 49, 50, 52, 53, 55, 56, 58, 59, 61, 62, 64, 65, 67, 68, 70, 71, 73, 74, 76, 77, 79, 80, 82, 83, 85, 86, 88, 89, 91, 92, 94, 95, 97, 98, 100
> > >             Failed 67/100 tests, 33.00% okay
> > >     tbad/ddd.............ok
> > >     tbad/longtestfilenameFAILED tests 3, 10
> > >             Failed 2/10 tests, 80.00% okay (less 1 skipped test: 7 okay, 70.00%)
> > >     <snip>
> >
> > Ok, I'm blind. I don't see the problem.
>
> *snip*
>
> Counter-arguments welcome.

That list of FAILED tests does not come from Test::Builder. I'm still missing something.
Re: TAP::Tests
--- Michael G Schwern <[EMAIL PROTECTED]> wrote:
> That list of FAILED tests does not come from Test::Builder. I'm still missing something.

You are correct. I had bollixed my tests (it turns out that running tests which run tests and then drive the results through the test harness I'm testing is getting me rather confused).

Cheers,
Ovid
First try at sample test output.
OK, here's a first pass at sample test output with my new test harness. Note that the -q option is enabled to suppress a very long test output. Let me know what you think. I do realize that the indentation on the failure results might still cause problems with non-monospaced displays, but I can't think of any other way of handling long failure output. I think it might still look ok, but I'm not sure.

Cheers,
Ovid

    TAPx-Parser $ /usr/bin/perl -Ilib bin/runtests -q tbad/
    tbad/060-aggregator..ok
    tbad/badtests........Failed 67/100 tests
    tbad/ddd.............ok
    tbad/longtestfilenameFailed 2/10 tests
            (less 1 skipped test: 7 okay)
            (1 test unexpectedly succeeded)

    Test Summary Report
    tbad/badtests.t (Wstat: 17152 Tests: 100 Failed: 67)
      Failed tests:  1-2, 4-5, 7-8, 10-11, 13-14, 16-17, 19-20
                     22-23, 25-26, 28-29, 31-32, 34-35, 37-38
                     40-41, 43-44, 46-47, 49-50, 52-53, 55-56
                     58-59, 61-62, 64-65, 67-68, 70-71, 73-74
                     76-77, 79-80, 82-83, 85-86, 88-89, 91-92
                     94-95, 97-98, 100
    tbad/longtestfilename.t (Wstat: 0 Tests: 10 Failed: 2)
      Failed tests:  3, 10
      TODO passed:   9
      Tests skipped: 7
    Files=4, Tests=174, 1 wallclock secs ( 0.38 cusr + 0.10 csys = 0.48 CPU)
Re: Thoughts about test harness summary
On Fri, Jan 05, 2007 at 01:50:54PM -0600, Andy Lester wrote:
> On Jan 5, 2007, at 1:46 PM, Ovid wrote:
> >     Failed Test  Total  Fail  List of Failed          TODO Passed
> >     --------------------------------------------------------------
> >     t/bar.t         13     9  2, 6-8, 13, 17, 33-35   3-4
> >     t/foo.t         10    10  5, 19, 27, 37-38, 117   9-11
> >
> > Did you mean that you didn't like the 'horizontal' lines?
>
> Both.
>
> How about
>
>     t/bar.t    9/13  Failed: 2, 6-8, 13, 17, 33-35
>                      TODO passed: 3-4
>     t/foo.t   10/10  Failed: 5, 19, 27, 37-38, 117
>                      9-11

FWIW, I like the tabular output. How about something like:

    Test     Total  Failed  List of Failed          TODO Passed
    --------+------+-------+-----------------------+------------
    t/bar.t     13       9  2, 6-8, 13, 17, 33-35   3-4
    t/foo.t     10      10  5, 19, 27, 37-38, 117   9-11

Possibly including wstat, just for Nicholas ;-)

You'd have to be careful if there is a very large number of failed subtests (I can never remember what we decided those would be called) but, thinking about it, if more than a few of them have failed, I really don't care to see them all anyway. Just the first handful (perhaps the number that would fit in the column) is enough. If I really need to see them all I can get at that information some other way. The same applies for passed TODOs.

> That should degrade nicely in non-monospaced.

And to be honest, I really don't see that as a big problem. How often do people need to look at results like this where they cannot control the font? If the worst comes to the worst, you can always copy and paste them into an editor.

Just lobbing a few peanuts ...

And if you squint at the table hard enough it starts to resemble a bikeshed. When you've all decided on something I didn't want, I'll spend ages creating my own output format and making it just the right shade of lilac. And rounding the corners.

Actually, I won't, 'cos my tests never fail :-P

--
Paul Johnson - [EMAIL PROTECTED]
http://www.pjcj.net
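
The "just the first handful that fits in the column" idea is easy to prototype. The sketch below collapses test numbers into ranges such as 6-8 and then trims the list to a given width. Both function names are hypothetical, the width is whatever the table column allows, and the input is assumed to contain no duplicate test numbers. Nothing here is taken from TAPx::Parser or Test::Harness.

```perl
use strict;
use warnings;

# Collapse a list of test numbers into ranges: 2, 6, 7, 8 becomes
# ('2', '6-8'). Assumes each test number appears at most once.
sub to_ranges {
    my @numbers = sort { $a <=> $b } @_;
    my @ranges;
    while (@numbers) {
        my $start = my $end = shift @numbers;
        $end = shift @numbers while @numbers && $numbers[0] == $end + 1;
        push @ranges, $start == $end ? $start : "$start-$end";
    }
    return @ranges;
}

# Join ranges with ', ' and keep only as many as fit in $width columns,
# marking any truncation with a trailing '...'.
sub fit_column {
    my ( $width, @ranges ) = @_;
    my $text = join ', ', @ranges;
    return $text if length $text <= $width;

    my @kept;
    for my $range (@ranges) {
        my $candidate = join ', ', @kept, $range, '...';
        last if length $candidate > $width;
        push @kept, $range;
    }
    return join ', ', @kept, '...';
}

print fit_column( 22, to_ranges( 2, 6, 7, 8, 13, 17, 33, 34, 35 ) ), "\n";
print fit_column( 12, to_ranges( 2, 6, 7, 8, 13, 17, 33, 34, 35 ) ), "\n";
```

The same pair of functions would serve for the TODO Passed column, with the full list left to a verbose mode or a results file.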