Can anyone else comment on this? I understand exactly where Shlomi is coming from, and I think he's wrong, but there is certainly a difference of opinion regarding 'correct' behavior.
> I find it unlikely that my code differs in interpretation from
> Test::Harness in this regard. After all, my code is derived from
> Test::Harness, and I don't recall changing the tests inside
> t/test-harness.t. Furthermore, the results specification for the test
> for 'bignum' is the same in both places.

I can understand that. In this case, I'm not sure where Test::Harness
goes wrong, but it's wrong. After conversation with Schwern, it was
determined that any tests run beyond the 'planned' tests are treated as
'not ok'. I think Test::Harness is wrong in this regard, and that's
possibly part of the reason why the prove output is so strange.

> > $ prove badplan
> > badplan...FAILED tests 3-4
> >
> > Failed 2/2 tests, 0.00% okay
> > Failed Test  Stat  Wstat  Total  Fail  List of Failed
> > ------------------------------------------------------------
> > badplan                       2     2  3-4
> > Failed 1/1 test scripts. -2/2 subtests failed.
> > Files=1, Tests=2,  0 wallclock secs ( 0.01 cusr +  0.01 csys =  0.02 CPU)
> > Failed 1/1 test programs. -2/2 subtests failed.

First it says that 2/2 tests failed and that 0.00% were OK; later it
says that -2/2 failed. Both 2/2 and -2/2 are wrong, since 2/4 tests
'failed': 2 tests were planned and 4 were run. We have no way of
knowing whether the plan is wrong or whether the final two tests were
run in error. Only a human looking at the code can determine this.
Thus, the 'prove' output is incorrect (and contradictory and
nonsensical). (A sketch of a script like 'badplan' appears at the end
of this message.)

(Note: this isn't meant to be a slam on Schwern or Andy Lester. They've
both done great work with a code base which has evolved into a corner.)

> > $ runtests badplan
> > badplan.... All 2 subtests passed
> >
> > Test Summary Report
> > -------------------
> > badplan (Wstat: 0 Tests: 4 Failed: 2)
> >   Failed tests:  3-4
> >   Parse errors: Bad plan.  You planned 2 tests but ran 4.
> > Files=1, Tests=4,  0 wallclock secs ( 0.00 cusr +  0.00 csys =  0.00 CPU)
> >
> > That's much clearer.

As pointed out, the 'runtests' output is clearer, non-contradictory,
documented, and tells the programmer what happened, unlike 'runprove'
or 'prove'. This is one case where I think that following the exact
behavior of T::H and 'prove' is an error.

> It does, but I still don't know how to emulate the Test::Harness
> behaviour using the new TAPx::Parser API.

Admittedly, I don't know your design goals, but I don't think you
should emulate broken behavior. You've done enough work on Test::Run
that I suspect fixing confusing output would not be a bad thing. If you
really want to emulate the behavior, check 'has_todo' and 'actual_ok'
in TAPx::Parser::Result::Test. Those might help (there's a rough sketch
at the end of this message).

> And furthermore, I'm not sure I'll be able to rely on the new API
> until a stable version of TAPx::Parser 0.5x is released,

That's fair. We're working on getting a stable release (there are
enough changes that it might bump up to 0.6).

Thanks again for pointing this out. In writing more tests for it, I
found a subtle bug in my implementation. It won't fix your problem and
might even make things worse: I was comparing $parser->tests_planned to
$result->test_num -- which can be wrong if tests are misnumbered --
instead of to $parser->tests_run.

Cheers,
Ovid

--
Buy the book  -- http://www.oreilly.com/catalog/perlhks/
Perl and CGI  -- http://users.easystreet.com/ovid/cgi_course/
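
For concreteness, here's a minimal sketch of what a script like
'badplan' could look like. This is my reconstruction from the output
above, not the actual file -- any script that plans 2 tests but runs 4
will reproduce the problem:

    #!/usr/bin/perl
    use strict;
    use warnings;

    # The plan promises two tests...
    print "1..2\n";
    print "ok 1\n";
    print "ok 2\n";

    # ...but two more are run beyond the plan.
    print "ok 3\n";
    print "ok 4\n";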
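
And here's a rough, untested sketch of checking 'has_todo' and
'actual_ok' on each test result. The 'tap' constructor argument and the
next()/is_test() calls are how I recall the current API; if 0.5x shifts
underneath you, treat this as pseudocode:

    use strict;
    use warnings;
    use TAPx::Parser;

    my $tap = <<'END_TAP';
    1..2
    ok 1 first test
    not ok 2 second test # TODO not written yet
    END_TAP

    my $parser = TAPx::Parser->new( { tap => $tap } );
    while ( my $result = $parser->next ) {
        next unless $result->is_test;

        # actual_ok() is the raw 'ok'/'not ok' status of the line,
        # before any TODO semantics are applied.
        printf "test %d: %s%s\n",
            $result->test_num,
            $result->actual_ok ? 'ok' : 'not ok',
            $result->has_todo  ? ' (todo)' : '';
    }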
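
Finally, the bug I mentioned, in code form. The broken version compared
the plan to the number of the last test seen, which lies when tests are
misnumbered; the fix is to compare it to the count of tests actually
run. This assumes $parser has already consumed the whole stream:

    # Wrong: trusts the test's own numbering.
    #   $parser->tests_planned == $result->test_num
    #
    # Better: count what actually ran.
    if ( $parser->tests_planned != $parser->tests_run ) {
        warn sprintf "Bad plan.  You planned %d tests but ran %d.\n",
            $parser->tests_planned, $parser->tests_run;
    }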