Can anyone else comment on this? I understand exactly where Shlomi is coming from, and I think he's wrong, but there is certainly a difference of opinion regarding 'correct' behavior.
> I find it unlikely that my code differs in interpretation from
> Test::Harness in this regard. After all, my code is derived from
> Test::Harness, and I don't recall changing the tests inside
> t/test-harness.t. Furthermore, the results specification for the test
> for 'bignum' is the same in both places.

I can understand that. In this case, I'm not sure where Test::Harness
goes wrong, but it's wrong. After conversation with Schwern, it was
determined that any tests run beyond the 'planned' tests are treated as
'not ok'. I think Test::Harness is wrong in this regard, and that's
possibly part of the reason why the prove output is so strange.

> > $ prove badplan
> > badplan...FAILED tests 3-4
> >
> > Failed 2/2 tests, 0.00% okay
> > Failed Test  Stat  Wstat  Total  Fail  List of Failed
> > ------------------------------------------------------------
> > badplan                       2     2  3-4
> > Failed 1/1 test scripts. -2/2 subtests failed.
> > Files=1, Tests=2,  0 wallclock secs ( 0.01 cusr +  0.01 csys =  0.02 CPU)
> > Failed 1/1 test programs. -2/2 subtests failed.

First it says that 2/2 tests failed and that 0.00% were OK; later it
says that -2/2 failed. Both 2/2 and -2/2 are wrong, since 2/4 tests
'failed': 2 tests were planned and 4 were run. We have no way of
knowing whether the plan is wrong or whether the final two tests were
run in error. Only a human looking at the code can determine this.
Thus, the 'prove' output is incorrect (and contradictory and
nonsensical). (A sketch of a script like 'badplan' appears at the end
of this message.)

(Note: this isn't meant to be a slam on Schwern or Andy Lester. They've
both done great work with a code base which has evolved into a corner.)

> > $ runtests badplan
> > badplan.... All 2 subtests passed
> >
> > Test Summary Report
> > -------------------
> > badplan (Wstat: 0 Tests: 4 Failed: 2)
> >   Failed tests:  3-4
> >   Parse errors: Bad plan.  You planned 2 tests but ran 4.
> > Files=1, Tests=4,  0 wallclock secs ( 0.00 cusr +  0.00 csys =  0.00 CPU)
> >
> > That's much clearer.

As pointed out, the 'runtests' output is clearer, non-contradictory,
documented, and tells the programmer what happened, unlike 'runprove'
or 'prove'. This is one case where I think that following the exact
behavior of T::H and 'prove' is an error.

> It does, but I still don't know how to emulate the Test::Harness
> behaviour using the new TAPx::Parser API.

Admittedly, I don't know your design goals, but I don't think you
should emulate broken behavior. You've done enough work on Test::Run
that I suspect fixing confusing output would not be a bad thing. If you
really want to emulate the behavior, check 'has_todo' and 'actual_ok'
in TAPx::Parser::Result::Test. Those might help (there's a rough sketch
at the end of this message).

> And furthermore, I'm not sure I'll be able to rely on the new API
> until a stable version of TAPx::Parser 0.5x is released,

That's fair. We're working on getting a stable release (there are
enough changes that it might bump up to 0.6).

Thanks again for pointing this out. In writing more tests for it, I
found a subtle bug in my implementation. It won't fix your problem and
might even make things worse: I was comparing $parser->tests_planned to
$result->test_num -- which can be wrong if tests are misnumbered --
instead of to $parser->tests_run.

Cheers,
Ovid

--
Buy the book  -- http://www.oreilly.com/catalog/perlhks/
Perl and CGI  -- http://users.easystreet.com/ovid/cgi_course/
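
For concreteness, here's a minimal sketch of what a script like
'badplan' could look like. This is my reconstruction from the output
above, not the actual file -- any script that plans 2 tests but runs 4
will reproduce the problem:

    #!/usr/bin/perl
    use strict;
    use warnings;

    # The plan promises two tests...
    print "1..2\n";
    print "ok 1\n";
    print "ok 2\n";

    # ...but two more are run beyond the plan.
    print "ok 3\n";
    print "ok 4\n";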
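
And here's a rough, untested sketch of checking 'has_todo' and
'actual_ok' on each test result. The 'tap' constructor argument and the
next()/is_test() calls are how I recall the current API; if 0.5x shifts
underneath you, treat this as pseudocode:

    use strict;
    use warnings;
    use TAPx::Parser;

    my $tap = <<'END_TAP';
    1..2
    ok 1 first test
    not ok 2 second test # TODO not written yet
    END_TAP

    my $parser = TAPx::Parser->new( { tap => $tap } );
    while ( my $result = $parser->next ) {
        next unless $result->is_test;

        # actual_ok() is the raw 'ok'/'not ok' status of the line,
        # before any TODO semantics are applied.
        printf "test %d: %s%s\n",
            $result->test_num,
            $result->actual_ok ? 'ok' : 'not ok',
            $result->has_todo  ? ' (todo)' : '';
    }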
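
Finally, the bug I mentioned, in code form. The broken version compared
the plan to the number of the last test seen, which lies when tests are
misnumbered; the fix is to compare it to the count of tests actually
run. This assumes $parser has already consumed the whole stream:

    # Wrong: trusts the test's own numbering.
    #   $parser->tests_planned == $result->test_num
    #
    # Better: count what actually ran.
    if ( $parser->tests_planned != $parser->tests_run ) {
        warn sprintf "Bad plan.  You planned %d tests but ran %d.\n",
            $parser->tests_planned, $parser->tests_run;
    }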