Test::Fork (was Re: New Test::More features?)

2007-11-30 Thread Michael G Schwern
Eric Wilhelm wrote:
 # from Michael G Schwern
 # on Thursday 29 November 2007 19:00:
 
 Otherwise, what's important to people?
 
 Could it be made fork-safe?
 
   http://search.cpan.org/src/TJENNESS/File-Temp-0.19/t/fork.t
 
 Possibly that involves blocking, or IPC with delayed output, or a 
 plan-per-fork thing.

The trick is, how do you do it?

IPC is right out, unportable.  I'm pretty sure there's no way to coordinate
the test counter between the two processes.  There are TAP proposals to
eliminate the need for coordination but I don't want to get into that right now.

The usual way to do this is to turn off test numbers, fork, and then turn them
back on when the fork is done, incrementing the test counter by the number of
tests the forked child ran.  That requires a bunch of Test::Builder
method-level muckery and is non-obvious.
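Roughly, the muckery looks something like this (a sketch only, with the
child's test count assumed to be known up front):

    use Test::More tests => 3;
    use Test::Builder;

    my $tb = Test::Builder->new;
    my $child_tests = 2;            # how many tests the child will run

    $tb->use_numbers(0);            # stop numbering so parent and child don't collide
    my $pid = fork;
    die "fork failed: $!" unless defined $pid;

    if( $pid == 0 ) {               # the child runs its share of the tests
        pass("child test 1");
        pass("child test 2");
        $tb->no_ending(1);          # don't run the end-of-test checks in the child
        exit 0;
    }

    waitpid($pid, 0);               # parent waits for the child to finish

    # account for the child's tests, then turn numbering back on
    $tb->current_test( $tb->current_test + $child_tests );
    $tb->use_numbers(1);

    pass("parent test");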

An easier Test::More level interface would be nice, but what would that
interface be?  It needs an "I'm about to fork for this many tests" function
and an "I'm done forking" function.  It would be easier if Test::More did the
forking for you, but that's a restriction I don't want to impose.

Or maybe we just write Test::Fork like so:

use Test::Fork;

fork_ok(sub {
is 23, 42;  # this is the code in the fork
});

and it does all the necessary jiggery pokery.  Knowing when the fork is
complete to turn numbers back on is troublesome.  I guess some sort of signal
handler will deal with that?


PS  I note there is Test::MultiFork but it seems to go well beyond what we're
talking about.

-- 
...they shared one last kiss that left a bitter yet sweet taste in her
mouth--kind of like throwing up after eating a junior mint.
-- Dishonorable Mention, 2005 Bulwer-Lytton Fiction Contest
   by Tami Farmer


Re: New Test::More features?

2007-11-30 Thread Michael G Schwern
Andy Armstrong wrote:
 On 30 Nov 2007, at 03:00, Michael G Schwern wrote:
 Otherwise, what's important to people?  I know there's a lot of
 suggestions about increasing the flexibility of planning.  Also the oft
 requested "I'm done running tests" sentinel for a safer no_plan.  Most of
 the time I'm just wibbling over interface details, getting the names just
 right.  (What!  Argue over tiny interface issues?  The Deuce you say!)
 
 The ability to emit TAP 13 c/w structured diagnostics would be hot.

Two open, but non-showstopper, issues related to that.

1)  What do we do with the regular STDERR diagnostics?  Ideally we'd have some
way to detect that the harness is going to read our YAML diagnostic and
generate its own user-readable version.  AFAIK no such thing exists.

At the moment, they will just always get emitted.  I don't have any other good
solution in mind other than accepting an environment variable (tied to a
Test::Builder method) to switch it off.  It's really a decision the harness
has to convey to the tests.
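One possible shape for that switch, as a sketch (the environment variable
name here is made up; no_diag() is the existing Test::Builder knob that
silences diag() output):

    use Test::Builder;

    my $tb = Test::Builder->new;
    $tb->no_diag(1) if $ENV{TEST_NO_STDERR_DIAG};   # hypothetical env var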


2)  What do we do if we don't have a YAML emitter?

At the moment, we just won't emit diagnostics.  Possibilities include shipping
with a copy of YAML::Tiny or writing our own dumbed-down YAML generator based
on the is_deeply() code, as it already knows how to walk a data structure.
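A dumbed-down generator of that sort might only be a screenful of code.
Here's a rough sketch (it only handles plain hashes, arrays and scalars,
with no quoting or undef handling):

    sub yamlish {
        my($thing, $indent) = @_;
        $indent ||= 0;
        my $pad = "  " x $indent;

        if( ref $thing eq 'HASH' ) {
            return join "", map {
                my $val = $thing->{$_};
                ref $val ? "$pad$_:\n" . yamlish($val, $indent + 1)
                         : "$pad$_: $val\n";
            } sort keys %$thing;
        }
        elsif( ref $thing eq 'ARRAY' ) {
            return join "", map {
                my $val = $_;
                ref $val ? "$pad-\n" . yamlish($val, $indent + 1)
                         : "$pad- $val\n";
            } @$thing;
        }

        return "$pad$thing\n";
    }

    # print yamlish({ failed => [2, 11] });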


 Then
 we can reopen the debate about the namespace within diagnostic blocks
 and lose another four weeks of our respective lives :)

I'm going to pretend I don't know anything about diagnostic namespaces, which
is easy, cause I don't.  LALALALALALA


-- 
I have a date with some giant cartoon robots and booze.


[ANNOUNCE] Test::Fork 0.01_01

2007-11-30 Thread Michael G Schwern
As threatened, here's Test::Fork for easier writing of forked tests.
http://pobox.com/~schwern/src/Test-Fork-0.01_01.tar.gz

   use Test::More tests => 4;
   use Test::Fork;

   fork_ok(2, sub{
       pass("Child");
       pass("Child again");
   });

   pass("Parent");

I'm probably doing the reaping of children wrong.  Someone more familiar with
forking might make some suggestions please.

Also, it's not currently checking whether the system is capable of forking.

And I realize I got the number of tests in the synopsis wrong, it should be 4.

Finally, it might be interesting to use an attribute to declare the number of
tests in the fork, like Test::Class.


-- 
There will be snacks.



Re: New Test::More features?

2007-11-30 Thread Michael G Schwern
Michael G Schwern wrote:
 Otherwise, what's important to people?

Here's something that's important to me.  I'd like to make it easier for
people to patch my modules.  A bunch of people already have write access to my
repository, and I've taken care to ensure that most all the outstanding items
are in RT.

Ideally what I'd like is a simple way for anyone to say "check out the branch
for ticket #19389" (creating it if it doesn't already exist).  Then they can
work on it and communicate back when they feel it's done and ready for review
and integration.  Ideally each change would be sent back as a comment on the
ticket.

I'll bet trac does this.


-- 
Life is like a sewer - what you get out of it depends on what you put into it.
- Tom Lehrer


Re: shouldn't UNEXPECTEDLY SUCCEEDED mean failure?

2007-12-02 Thread Michael G Schwern
Fergal Daly wrote:
 One of the supposed benefits of using TODO is that you will notice
 when the external module has been fixed. That's reasonable but I don't
 see a need to inflict the confusion of unexpectedly passing tests on
 all your users to achieve this.

Maybe we should just change the wording and presentation so we're not
inflicting so much.

Part of the problem is it screams "OMG!  UNEXPECTEDLY SUCCEEDED!" and the user
goes "whoa, all caps" and doesn't know what to do.  It's the most screamingest
part of Test::Harness 2.

Fortunately, Test::Harness 3 toned it down and made it easier to identify them.

Test Summary Report
---
/Users/schwern/tmp/todo.t (Wstat: 0 Tests: 2 Failed: 0)
  TODO passed:   1-2

TAP::Parser also has a todo_passed() test summary method so you can
potentially customize the behavior of passing TODO tests at your end.
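For example, a sketch of fishing the unexpected passes out yourself with
TAP::Parser (the test file name is made up):

    use TAP::Parser;

    my $parser = TAP::Parser->new({ source => "t/todo.t" });
    1 while defined $parser->next;              # drain the TAP stream

    if( my @passes = $parser->todo_passed ) {
        print "TODO tests that actually passed: @passes\n";
    }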


I agree with Eric, these tests are extra credit.  Unexpectedly working
better != failure except in the most convoluted situations.  Their
intention is to act as an alternative to commenting out a test which you can't
fix right now.  An executable TODO list that tells you when you're done, so
you don't forget.

It should not halt installation; nothing's wrong as far as the user's
concerned.  However, it does mean "investigate", and it would be nice if this
information got back to the author.  It would be nice if CPAN::Reporter
reported passing TODO tests... somehow.


 Another downside of using TODO like this is that when the external
 module is fixed, you have to release a new version of your module with
 the TODOs removed. These tests will start failing for anyone who
 upgrades your module but not the broken one, but in reality nothing has
 changed for that user,

As long as you're releasing a new version, why would you not upgrade your
module's dependency to use the version that works?


-- 
I am somewhat preoccupied telling the laws of physics to shut up and sit down.
-- Vaarsuvius, Order of the Stick
   http://www.giantitp.com/comics/oots0107.html


Re: shouldn't UNEXPECTEDLY SUCCEEDED mean failure?

2007-12-02 Thread Michael G Schwern
Fergal Daly wrote:
 As long as you're releasing a new version, why would you not upgrade your
 module's dependency to use the version that works?
 
 Your module either is or isn't usable with version X of Foo.

 If it is usable then you would not change your dependency before or
 after the bug in version X is fixed (maybe I have a good reason not to
 upgrade Foo and you wouldn't want your module to refuse to install if
 it is actually usable).
 
 If it isn't usable then marking your tests as TODO was the wrong thing
 to do in the first place, you should have bailed out due to
 incompatibility with version X and not bothered to run any tests at
 all. I think Extutils::MM does not have any way to specify complex
 version dependencies but with Module::Build you could say

ETOOBINARY

Modules do not have a binary state of working or not working.  They're
composed of piles of (often too many) features.  Code can be shippable without
every single thing working.

The TODO test is useful when the working version *does not yet* exist.  If
it's a minor feature or bug then rather than hold up the whole release waiting
for someone else to fix their shit, you can mark it TODO and release.  This is
the author's decision to go ahead and release with a known bug.  We do it all
the time, just not necessarily with a formal TODO test.


 I am basically against the practice of using TODO to cope with
 external breakage. Not taking unexpected passes seriously encourages
 this practice. Apart from there being other ways to handle external
 breakage that seem easier, using TODO is actually dangerous as it can
 cause false passes in 2 ways. Say version X of Foo has a non-serious
 bug so you release version Y of Bar with some tests marked TODO. Then
 we risk

Maybe we're arguing two different situations.  Yours seems to be when there is
a broken version of a dependency, but a known working version exists.  In this
case, you're right, it's better resolved with a rich dependency system.

My case is when a working version of the dependency does not exist, or the
last working version is so old it's more trouble than it's worth.  In this
case the author decides the bug is not critical, can't be worked around and
doesn't want to wait for a fix in the dependency.  The decision is whether or
not to release with a known bug.  After that, wrapping it in a TODO test is
just an alternative to commenting it out.

Compare with the more common alternative for shipping with a known bug which
is to simply not have a test at all.


 1 Version X+1 of Foo is even worse and will cause Bar to eat your dog.
 Sadly for your dog, the test that might have warned him has been
 marked TODO.

If they release Bar with a known bug against Foo X where your dog's fur is
merely a bit ruffled, then that's ok.  If version X+1 of Foo causes Bar to eat
your dog then why didn't their tests catch that?  Was there not a "dog not
eaten" test?  If not, then that's just an incomplete test suite; the TODO test
has nothing to do with that.

The "dog not eaten" test wouldn't have been part of the TODO test; that part
worked fine when the author released, and they'd have gotten the "TODO passed"
message and known to move it out of the TODO block.

Or maybe they're just a cat person.

Point is, there are multiple points where good testing practice has to break
down for this situation to occur.  The use of a TODO test is orthogonal.


 2 You're using version X-1 of Foo, everything is sweet, your dog can
 relax. You upgrade to version Y+1 of Bar which has a newly introduced
 dog-eating bug. This bug goes undetected because the tests are marked
 TODO. So long Fido.

That's the author's (poor) decision to release with a known critical dog
eating bug.  The fact that it's in a TODO test is incidental.


 I still have not seen an example of using TODO in this manner that
 isn't better handled in a different way.

 As before, I am not advocating changing the current Test::* behaviour
 to fail on unexpected passes as that would just be a mess. It's just
 that whenever this is discussed it ends up with people advocating what
 I consider wrong and dangerous uses of TODO and so I am pointing this
 out again,

Most of the cases above boil down to "the author decided to release with a
known critical bug" or "the tests didn't check for a possible critical bug".

You're right in that marking something TODO is not an excuse to release with a
known critical bug, but I don't think anyone's arguing that.


-- 
I do have a cause though. It's obscenity. I'm for it.
- Tom Lehrer


Re: shouldn't UNEXPECTEDLY SUCCEEDED mean failure?

2007-12-03 Thread Michael G Schwern
So I read two primary statements here.

1)  Anything unexpected is suspicious.  This includes unexpected success.

2)  Anything unexpected should be reported back to the author.

The first is controversial, and leads to the conclusion that TODO passes
should fail.

The second is not controversial, but it erroneously leads to the conclusion
that TODO passes should fail.  That's the only mechanism we currently have for
telling the user "hey, something weird happened.  Pay attention!"  It's also
how we normally report stuff back to the author.  Also, there are only two
easily identifiable states for a test: pass and fail.

So what we need is a "pass with caveats" or, as Eric pointed out, some way for
the harness to communicate its results in a machine-parsable way.  The very
beginnings of such a hack were put in for CPAN::Reporter in the "Result:" line
that is output at the end of the test run.  Ideally you'd have the harness
spitting out its full conclusions... somehow... without cluttering up the
human readable output.  But maybe "Result: TODO_PASS" is enough.


-- 
Stabbing you in the face for your own good.


Why not run a test without a plan?

2007-12-03 Thread Michael G Schwern
use Test::More;
pass();
plan tests => 2;
pass();

Why shouldn't this work?  Currently you get a "You tried to run a test without
a plan" error, but what is it really protecting the test author from?

Historically, there was a clear technical reason.  It used to be that the plan
had to come first in the TAP output, so a plan had to come before any tests
were run.  Simple.

But that technical restriction no longer holds true.  The plan can come at the
end, primarily used for no_plan.  If a test is run before the plan is
declared, simply delay the plan output until the end.

Removing this restriction eliminates some unnecessarily difficult planning
problems, especially the annoying "the plan is calculated, but I have to run
some tests at BEGIN time" case.  Like this common mistake:

use Test::More;

if( $something ) {
    plan tests => 1;
}
else {
    plan skip_all => "Because";
}

BEGIN { use_ok 'Some::Module' }

It also allows you to run some tests before you determine the plan, though I
can't think of a particular use for this.

It also makes it technically possible to allow the test to change its plan
mid-stream, though the consequences and interface for that do require some
thought.

Since the technical restriction is gone, and I see no particular benefit to it
being there, and it eliminates some tricky plan counting situations, I don't
see why it shouldn't be removed.


PS  To be clear, a plan is still eventually needed before the test exits.


-- 
The interface should be as clean as newly fallen snow and its behavior
as explicit as Japanese eel porn.



Re: Why not run a test without a plan?

2007-12-03 Thread Michael G Schwern
David Golden wrote:
 Michael G Schwern [EMAIL PROTECTED] wrote:
 It also makes it technically possible to allow the test to change it's plan
 mid-stream, though the consequences and interface for that do require some
 thought.
 
 With some sugar, that could actually be quite handy for something like
 test blocks.  E.g.:
 
 {
   plan add => 2;
   ok( 1, "wibble" );
   ok( 1, "wobble" );
 }

Yep, something like that.  There will likely be a "change the plan" method in
Test::Builder at minimum.  Right now changing the expected number of tests is
tied to printing out the header.


-- 
Schwern What we learned was if you get confused, grab someone and swing
  them around a few times
-- Life's lessons from square dancing


Re: Why not run a test without a plan?

2007-12-03 Thread Michael G Schwern
Eric Wilhelm wrote:
 # from David Golden
 # on Monday 03 December 2007 19:55:
 
 With some sugar, that could actually be quite handy for something like
 test blocks.  E.g.:

 {
   plan add => 2;
   ok( 1, "wibble" );
   ok( 1, "wobble" );
 }
 
 or maybe make the block a sub
 
 block {
   subplan 2;
   ok(1, "wibble");
   ok(1, "wobble");
 };

I guess the unspoken benefit is block() can check the sub-plan when the
subroutine ref is done?

I'm always wary of using subs-as-blocks for general testing as they
transparently mess up the call stack, which will affect testing anything that
plays with caller() (such as Carp).  Then you have to introduce more
complexity, like Sub::Uplevel, to mask that.


-- 
'All anyone gets in a mirror is themselves,' she said. 'But what you
gets in a good gumbo is everything.'
-- Witches Abroad by Terry Prachett


Re: Why not run a test without a plan?

2007-12-04 Thread Michael G Schwern
A. Pagaltzis wrote:
 Yes, so this should be allowed:
 
 pass();
 plan 'no_plan';
 pass();
 
 Whereas this should not:
 
 pass();
 plan tests => 2;
 pass();

Umm, why not?  That's exactly what I was proposing and it would result in...

ok 1
ok 2
1..2


 Consider also:
 
 pass();
 plan skip_all => 'Surprise!';
 pass();

Good point.  That wouldn't work, there's no way to express skip_all once a
test has been issued.  There are ways Test::More could cheat to make it work,
but that goes against its intent to be as explicit as possible.  Running a
test and then stating that you're going to skip all tests is ambiguous.

It does splash some cold water on eliminating the common mistake of running a
use_ok() before deciding if you can or cannot run the tests.


 It also makes it technically possible to allow the test to
 change it's plan mid-stream
 
 Without some hypothetical future version of TAP this is only
 possible if you have run tests before declaring a plan at all,
 because otherwise the plan will already have been output as the
 first line of the TAP stream.

Just needs a way to declare that you're going to add to the plan up front.


 Since the technical restriction is gone, and I see no
 particular benefit to it being there, and it eliminates some
 tricky plan counting situations, I don't see why it shouldn't
 be removed.
 
 Because declaring a plan after running tests is effectively a
 no_plan and the programmer should be aware that that’s what they
 did. It’s fine if that’s their conscious choice; just make sure
 it was.

No, it's critically different from a no_plan in that the number of tests to be
run is still fixed by the programmer.  For example...

pass();
plan tests => 3;
pass();

Would produce...

ok 1
ok 2
1..3

Which would be a failure, just as if the plan was at the top.


-- 
I do have a cause though. It's obscenity. I'm for it.
- Tom Lehrer


Re: Why not run a test without a plan?

2007-12-04 Thread Michael G Schwern
Smylers wrote:
 It also makes it technically possible to allow the test to change
 it's plan mid-stream
 Without some hypothetical future version of TAP this is only possible
 if you have run tests before declaring a plan at all, because
 otherwise the plan will already have been output as the first line of
 the TAP stream.
 
 Wasn't there general agreement only a week or so ago to now allow plans
 to be specified at the end rather than the start?  I was presuming that
 Schwern's suggestions were in the light of this other change.

No, that was a much more involved thing which involves nested plans and
multiple plans and such.  This simply takes advantage of the existing ability
to put the plan at the end instead of at the front.

1..2
ok 1
ok 2

and

ok 1
ok 2
1..2

are equivalent test output.  This is how no_plan works and it's been around
since 2001.

Nothing new happened to allow this change, I just never really gave it thought
before.


-- 
I have a date with some giant cartoon robots and booze.


Re: Why not run a test without a plan?

2007-12-04 Thread Michael G Schwern
A. Pagaltzis wrote:
 That would work. Of course once you have that, you don’t need to
 allow assertions to run without a plan, since one can always say
 
 use Test::More tests => variable => 0;
 pass();
 plan add_tests => 2;
 pass();
 
 instead of
 
 use Test::More;
 pass();
 plan tests => 2;
 pass();
 
 which would still be an error. That way a mistake in a test
 script won’t lead to Test::More silently converting an up-front
 plan declarations into trailing ones.

Which brings us back to the original question:  why should that be an error?


-- 
There will be snacks.


Re: Why not run a test without a plan?

2007-12-04 Thread Michael G Schwern
Geoffrey Young wrote:
 
 Andy Armstrong wrote:
 On 4 Dec 2007, at 15:22, Geoffrey Young wrote:
 it would be nice if this were enforced on the TAP-digestion side and not
 from the TAP-emitter side - the coupling of TAP rules within the
 TAP-emitter is what lead to my trouble in the first place.

 A valid plan - at the beginning or the end - is required by Test::Harness.
 
 yup, I get that.  but that has nothing to do with the Test::More errors
 that started the thread - I ought to be able to use is() functionality
 to emit into whatever stream I want and not have it complain about
 missing plans, especially when Test::Harness will catch malformed TAP
 and complain anyway... if I decide to send it to Test::Harness, which I
 may not.

You can turn off all the ending checks with Test::Builder->no_ending(1) and
the header being printed with no_header(1).  That's what we came up with back
then.
http://www.nntp.perl.org/group/perl.qa/2006/07/msg6212.html
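For example, something like this (just a sketch) emits bare ok lines with no
plan and no end-of-run complaints:

    use Test::More 'no_plan';

    Test::Builder->new->no_header(1);   # never print the 1..N plan line
    Test::Builder->new->no_ending(1);   # skip the end-of-run checks entirely

    is( 1 + 1, 2, "a bare TAP line, not destined for Test::Harness" );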


-- 
Hating the web since 1994.


Re: UNKNOWN despite only failing tests -- how come?

2007-12-04 Thread Michael G Schwern
Andreas J. Koenig wrote:
 Bug in CPAN::Reporter and/or Test::Harness and/or CPAN.pm?
 
   http://www.nntp.perl.org/group/perl.cpan.testers/796974
   http://www.nntp.perl.org/group/perl.cpan.testers/825449
 
 All tests fail but Test::Harness reports NOTESTS and CPAN::Reporter
 concludes UNKNOWN and CPAN.pm then installs it.

Test::Harness bug where it concludes NOTESTS if it sees no test output, as
is the case when every test dies.  I'll see about fixing it.


-- 
On error resume stupid


Re: shouldn't UNEXPECTEDLY SUCCEEDED mean failure?

2007-12-04 Thread Michael G Schwern
This whole discussion has come unhinged a bit from reality; maybe you can give
some concrete examples of the problems you're talking about?  You obviously
have some specific breakdowns in mind.


Fergal Daly wrote:
 Modules do not have a binary state of working or not working.  They're
 composed of piles of (often too many) features.  Code can be shippable
 without every single thing working.
 
 You're right, I was being binary, but you were being unary. There are 3 cases,
 
 1 the breakage was not so important, so you don't bail no matter what
 version you find.
 2 it's fuzzy, maybe it's OK to use Foo version X but once Foo version
 X+1 has been released you want to force people to use it
 3 the breakage is serious, you always want to bail if you find Foo
 version X (and so you definitely don't switch the tests to TODO).

 You claimed 2 is always the case.  I claimed that 1 and 3 occur.

If I did, that wasn't my intent.  I only talked about #2 because it's the only
one that results in the user seeing passing TODO tests, which is what we were
talking about.


 I'm
 happy to say admit that 2 can also occur. The point remains, you would
 not necessarily change your modules requirements as a reaction to X+1
 being released. You might, or you might change it beforehand if it
 really matters or you might not change it at all.

And I might dip my head in whipped cream and go give a random stranger a foot
bath.  You seem to have covered all possibilities, good and bad.  I'm not sure
to what end.

The final choice, incrementing the dependency version to one that does not yet
exist, boils down to "it won't work".  It's also ill-advised to anticipate
that version X+1 will fix a given bug, as on more than one occasion an
anticipated bug has not been fixed in the next version.

Anyhow, to get back to the point, it boils down to an author's decision how to
deal with a known bug.  TODO tests are orthogonal.


 Maybe we're arguing two different situations.  Yours seems to be when there
 is a broken version of a dependency, but a known working version exists.  In
 this case, you're right, it's better resolved with a rich dependency system.
 
 I think maybe we are.
 
 You're talking about where someone writes a TODO for a feature that
 has never worked. That's legit, although I still think there's
 something odd about it as you personally have nothing to do. I agree
 it's not dangerous.

Sure you do, you have to watch for when the dependency fixes its bug.  But
that's boring and rote, what computers are for!  So you write a TODO test to
automate the process.  [1]
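In practice that automation is just a TODO block.  A sketch (the module,
function, and ticket number here are all made up):

    use Test::More tests => 1;
    use Foo;    # hypothetical dependency with a known bug

    TODO: {
        local $TODO = "Foo::frobnicate() is broken upstream, see their RT #12345";

        is( Foo::frobnicate(2), 4, "frobnicate works again" );
    }

When the upstream fix ships, the harness reports the TODO test as
unexpectedly passing and you know to move it out of the block.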

In a large project, sometimes things get implemented when you implement other
things.  This is generally more applicable to bugs, but sometimes to minor
features.

Then there are folks who embrace the whole test first thing and write out lots
and lots of tests beforehand.  Maybe you decide not to implement them all
before shipping.  Rather than delete or comment out those tests, just wrap
them in TODO blocks.  Then you don't have to do any fiddling with the tests
before and after release, something which leads to an annoying shear between
the code the author uses and the code users use.

There is also the "I don't think feature X works in Y environment" problem.
For example, say you have something that depends on symlinks.  You could hard
code in your test to skip if on Windows or some such, but that's often too
broad.  Maybe they'll add them in a later version, or with a different
filesystem (it's happened on VMS) or with some fancy 3rd party hack.  It's
nice to get that information back.


 I'm talking about people converting tests that were working just fine
 to be TODO tests because the latest version of Foo (an external
 module) has a new bug. While Foo is broken, they don't want lots of
 bug reports from CPAN testers that they can't do anything about.
 
 This use of TODO allows you to silence the alarm and also gives you a
 way to spot when the alarm condition has passed. It's convenient for
 developers but it's 2 fingers to users who can now get false passes
 from the test suites,

It still boils down to what known bugs the author is willing to release with.
 Once the author has decided they don't want to hear about a broken
dependency,  and that the breakage isn't important, the damage is done.  The
TODO test is orthogonal.

Again, consider the alternative which is to comment the test out.  Then you
have NO information.

So I think the problem you're concerned with is poor release decisions.  TODO
tests are just a tool being employed therein.


[1] Don't get too hung up on names; things only get one even though they can do
lots of things.  I'm sure you've written lots of Perl programs that didn't do
much extracting or reporting.


-- 
Reality is that which, when you stop believing in it, doesn't go away.
-- Phillip K. Dick


Re: Why not run a test without a plan?

2007-12-04 Thread Michael G Schwern
Geoffrey Young wrote:
 I guess what I thought you were getting at was a natural decoupling of
 comparison functions with the planning without all the hackery involved
 to get that sepraration working now.  so I was suggesting that the
 decoupling go further than just no_plan, and that yeah, rock on, great
 idea.  'tis all :)

I see what you're getting at.  I don't think I'm going that far, though I'm
willing to help somehow with the "I want to paste a bunch of subtest
processes together" problem.

One of the original issues Test::More was designed to deal with was the
problem of running individual tests without having to parse the test output
(by eye or by computer) to get an accurate result.

That's why it changes the exit code on failure and why it has the ending
diagnostic message if there's a failure.  While prove and TAP::Parser help, I
still like running tests by hand to get complete control.


-- 
If at first you don't succeed--you fail.
-- Portal demo


Re: Why not run a test without a plan?

2007-12-04 Thread Michael G Schwern
Eric Wilhelm wrote:
 A. Pagaltzis wrote:
 ...
 which would still be an error. That way a mistake in a test
 script won’t lead to Test::More silently converting an up-front
 plan declarations into trailing ones.
 Which brings us back to the original question:  why should that be an
 error?
 
 It's a matter of stricture?  If the developer intended the plan to be 
 before-hand, they'll be expecting an error to enforce that practice.

Why do they care if the plan is output at the beginning or end?  How does this
stricture improve the quality of the test?  What mistake does it prevent?  If
we were to propose the "no test without a plan" stricture today, what would
the arguments in favor be?

I'm not worried about the shock of violating existing programmers' calcified
minor expectations.  They'll live.

About the only thing I can think of is consistency.  skip_all must still come
first, so that this sort of thing will sometimes work, sometimes not,
depending on $^O.

use Test::More;

if( $^O eq 'BrokenOS' ) {
    plan skip_all => 'Your shit is broke';
}
else {
    plan tests => 42;
}

BEGIN { use_ok 'Some::Module' }

This might be solved by the oft-requested skip_rest.

use Test::More tests => 42;

skip_rest("Your shit is broke") if $^O eq 'BrokenOS';

BEGIN { use_ok 'Some::Module' }

Hmm, it's also shorter.  This even allows something like this:

use Test::More tests => 42;

BEGIN {
    use_ok 'Optional::Module' ||
        skip_rest('Optional::Module not available');
}


 The current planning functions impose strictures.  (Yes, it happens to 
 be due to an old implementation detail which no longer governs -- but 
 that doesn't change the fact that behavior expectations have already 
 been set.)  Taking away the error essentially means that you've changed 
 the API.  Imagine if strict.pm suddenly stopped being strict about 
 symbolic refs.

 That is, users should somehow be able to rely on the same strictures 
 that they've had in the past (hopefully by-default.)  So, either this 
 new non-strict plan scheme should be declared in the import() params or 
 be not named plan().

I see where you're going, but I think this is going too far wrt backwards
compatibility.

The strict analogy is spurious because that would be changing a fundamental
part of strict.  Whereas this is an incidental part of Test::More.  Scale does
matter.

Furthermore, it's not going to cause any passing tests to fail, or any
legitimately failing tests (ie. due to a real bug, not Test::More stricture)
to pass.

The only breakage I can think of are all highly convoluted and improbable,
where you've somehow written a test that checks that this specific feature
works.  But the only one who should be doing that is Test::More's own tests.
Or some highly paranoid module dependent on that specific feature, in which case
congratulations!  Your test did its job!

I'm not worried.


 I'm still wishing for the plan to make it to a given statement model 
 (e.g. done().)

Honestly all that's really holding that up is a good name for the plan style
and "I'm done testing" terminator.  Nothing has really leapt out at me yet.
Maybe something as straightforward as...

plan 'until_done';



done_testing;


-- 
Ahh email, my old friend.  Do you know that revenge is a dish that is best
served cold?  And it is very cold on the Internet!


Re: shouldn't UNEXPECTEDLY SUCCEEDED mean failure?

2007-12-04 Thread Michael G Schwern
I'm going to sum up this reply, because it got long but kept on the same themes.

*  TODO tests provide you with information about what tests the author decided
to ignore.
**  Commented out tests provide you with NO information.
**  Most TODO tests would have otherwise been commented out.

*  How you interpret that information is up to you.
**  Most folks don't care, so the default is to be quiet.

*  The decision for what is success and what is failure lies with the author
**  There's nothing we can do to stop that.
**  But TODO tests allow you to reinterpret the author's desires.

*  TAP::Harness (aka Test::Harness 3) has fairly easy ways to control how
   TODO tests are interpreted.
**  It could be made easier, especially WRT controlling make test
**  CPAN::Reporter could be made aware of TODO passes.


Fergal Daly wrote:
 On 05/12/2007, Michael G Schwern [EMAIL PROTECTED] wrote:
 This whole discussion has come unhinged a bit from reality; maybe you can
 give some concrete examples of the problems you're talking about?  You
 obviously have some specific breakdowns in mind.
 
 I don't, I'm arguing against what has been put forward as good
 practice when there are other better practices that are approximately
 as easy and don't have the same downsides.
 
 In fairness though these bad practices were far more strongly
 advocated in the previous thread on this topic than in this one.

I don't know what thread that was, or if I was involved, so maybe I'm not the
best person to be arguing with.


 The final choice, incrementing the dependency version to one that does not
 yet exist, boils down to "it won't work".  It's also ill-advised to
 anticipate that version X+1 will fix a given bug, as on more than one
 occasion an anticipated bug has not been fixed in the next version.
 
 As I said earlier though, in Module::Build you have the option of
 saying version  X and then when it's finally fixed, you can say !X
 (and !X+1 if that didn't fix it).

Yep, rich dependencies are helpful.


 There is also the "I don't think feature X works in Y environment" problem.
 For example, say you have something that depends on symlinks.  You could hard
 code in your test to skip if on Windows or some such, but that's often too
 broad.  Maybe they'll add them in a later version, or with a different
 filesystem (it's happened on VMS) or with some fancy 3rd party hack.  It's
 nice to get that information back.
 
 How do you get this information back? Unexpected passes are not
 reported to you. If you want to be informed about things like this a
 TODO is not a very good way to do it.

The TODO test is precisely the way to do it, it provides all the information
needed.  We just don't have the infrastructure to report it back.

As discussed before, what's needed is a higher resolution than just pass and
fail for the complete test run.  That's the "Result: PASS/TODO" discussed
earlier.  Things like CPAN::Reporter could then send that information back to
the author.  It's a fairly trivial change for Test::Harness.

The important thing is that reporting back is no longer locked to "fail".


 I'm talking about people converting tests that were working just fine
 to be TODO tests because the latest version of Foo (an external
 module) has a new bug. While Foo is broken, they don't want lots of
 bug reports from CPAN testers that they can't do anything about.

 This use of TODO allows you to silence the alarm and also gives you a
 way to spot when the alarm condition has passed. It's convenient for
 developers but it's 2 fingers to users who can now get false passes
 from the test suites,
 It still boils down to what known bugs the author is willing to release with.
  Once the author has decided they don't want to hear about a broken
 dependency,  and that the breakage isn't important, the damage is done.  The
 TODO test is orthogonal.

 Again, consider the alternative which is to comment the test out.  Then you
 have NO information.
 
 Who's you?

You == user.


 If you==user then a failing TODO test and commented out test are
 indistinguishable unless you go digging in the code or TAP stream.

As they say, "works as designed."  The author decided the failures aren't
important.  Don't like it?  Take it up with the author.  Most folks don't care
about that information, they just want the thing installed.

You (meaning Fergal Daly) can dig them out with some Test::Harness hackery,
and maybe that should be easier if you really care about it.  The important
thing is the information is there, encoded in the tests, and you can get at it
programmatically.

The alternative is to comment the failing test out in which case you have *no*
information and those who are interested cannot get it out.


 A passing TODO is just confusing.

That's a function of how it's displayed.  "UNEXPECTEDLY SUCCEEDED", I agree,
was confusing.  No question.  TH 3's display is more muted and no more
confusing than a skip test.  There is also the very clear "All tests
successful"

Re: TODO - MAYBE tests?

2007-12-05 Thread Michael G Schwern
Eric Wilhelm wrote:
 Since we're on the subject of CPAN::Reporter, TAP::Harness, Test::More, 
 and TODO wrt failure vs. no-noise vs. report-back vs. await-dependency 
 and the binaryism of failure and etc...
 
 Perhaps a general sort of MAYBE namespace in TAP would be a nice 
 addition.

Is this a joke?  I hope it's a joke.


-- 
There will be snacks.


Re: Why not run a test without a plan?

2007-12-05 Thread Michael G Schwern
A. Pagaltzis wrote:
 * Michael G Schwern [EMAIL PROTECTED] [2007-12-05 04:30]:
 Why do they care if the plan is output at the beginning or end?
 How does this stricture improve the quality of the test?
 
 It improves the resulting TAP stream, if not the test itself.

What's improved about the plan coming at the front as opposed to at the end?
Give me something concrete, not just "it's better".  I'm going to keep
drilling through the BS until I either hit bottom or punch through.

About all that's different when the plan is at the end is the TAP reader
doesn't know how many tests are coming until the end of the test.  Then it
can't display the expected number of tests while the test is running.
Unfortunate, but hardly a showstopper.


 But maybe it’s not necessary to impose this stricture by default,
 and instead of asking to be allowed to supply a plan later, as I
 proposed, people should instead have to ask for the stricture:

Again I ask, why make them ask?


 Honestly all that's really holding that up is a good name for
 the plan style and I'm done testing terminator.  Nothing has
 really lept out at me yet. Maybe something as straight forward
 as...

  plan 'until_done';

  

  done_testing;
 
 plan 'until_completion';
 # ...
 plan 'completed';

I don't want to saddle plan() with yet another feature.  It will most
definitely be its own function.


-- 
The interface should be as clean as newly fallen snow and its behavior
as explicit as Japanese eel porn.


Re: TODO - MAYBE tests?

2007-12-05 Thread Michael G Schwern
Eric Wilhelm wrote:
 # from Michael G Schwern
 # on Wednesday 05 December 2007 05:47:
 
 Perhaps a general sort of MAYBE namespace in TAP would be a nice
 addition.
 Is this a joke?  I hope it's a joke.
 
 Do I look like I'm joking?  :-|

   As it is, we're talking about detecting/reporting a 3rd thing, which 
   only increases the resolution by 50%.  If there are really $n, perhaps 
   just jump straight to $n and skip that 4th, 5th, 6th, ... process?
 
 You don't have to call it MAYBE -- is that what makes it hard to take 
 seriously?

Yes.  It makes my trick "ambiguity in testing is bad" knee act up.  I'll go
tie my leg down and reread the proposal.


-- 
I am somewhat preoccupied telling the laws of physics to shut up and sit down.
-- Vaarsuvius, Order of the Stick
   http://www.giantitp.com/comics/oots0107.html


Re: Why not run a test without a plan?

2007-12-05 Thread Michael G Schwern
Eric Wilhelm wrote:
  Give me something concrete, not just it's better.  I'm going to
 keep drilling through the BS until I either hit bottom or punch
 through.
 
 It allows you to apply the policy "all tests have a plan" at the test
 level.  Yes, policy often sounds like BS.

 By historical accident Test::More has always applied (albeit not in a 
 super-formal way) that policy by default.

This BS... err, policy... would still be possible.  It's a social policy
anyway, not a technical one.  It's been possible to run a test without a plan
for a long time.  Whatever they have in place to deal with no_plan can deal
with this.


 About all that's different when the plan is at the end is the TAP
 reader doesn't know how many tests are coming until the end of the
 test.  Then it can't display the expected number of tests while the
 test is running.
 
 Yes.  That leads a shop to implement the policy all tests must plan.
 
 If you don't want to support that policy-application, fine.  It can be 
 solved in other ways -- maybe they're cleaner.  A switch in the harness 
 doesn't seem to be it, but maybe a Test::MustPlan (complete with 
 syntactic-sugar for the annoying BEGIN thing.)

I'll put in some Test::Builder->must_have_plan flag to allow the current
behavior to be switched back on rather than wholly deleting it.  It's all
encapsulated in a method anyway.


-- 
Schwern What we learned was if you get confused, grab someone and swing
  them around a few times
-- Life's lessons from square dancing


Re: Why not run a test without a plan?

2007-12-05 Thread Michael G Schwern
A. Pagaltzis wrote:
 * Michael G Schwern [EMAIL PROTECTED] [2007-12-05 15:00]:
 I'm going to keep drilling through the BS until I either hit
 bottom or punch through.
 
 Yeah, we’re all spouting bullshit. Gee, some tone you’re setting.

Sorry, I forgot the :)

That I'm pushing so hard to get something concrete out of you means I think
you've got something useful to say.  That I'm not getting it is frustrating.
Seems I've finally got it, thank you.


 About all that's different when the plan is at the end is the
 TAP reader doesn't know how many tests are coming until the end
 of the test. Then it can't display the expected number of tests
 while the test is running.
 
 Not only can’t it do anything display-wise, but the harness also
 can’t do anything else that requires knowing the projected plan
 up front. It can’t abort the test as soon as the first extra test
 runs. If the test dies, the harness doesn’t know how many tests
 were pending. A system whose job is to continuously run lots of
 tests in parallel can’t do nearly as much useful asynchronous
 reporting.

 Unfortunate, but hardly a showstopper.
 
 Whether or not it’s a showstopper is for the harness author
 to judge and not for you. It’s not hard to imagine cases where
 better streamability is important, even if they’re not garden-
 variety `./Build test` scenarios. We’re championing TAP as a
 solution for a wide variety of scenarios, right?

Those are all things I hadn't thought of.  Since it doesn't affect the ultimate
quality of the tests, just some inconveniences in reporting, I'm not worried.

no_plan already has all these issues and the sky remains firmly fixed in the
heavens.  Header-at-end TAP is still streamable.  You don't have to read the
whole document before you can get information.  It doesn't close off any
testing situations, and it makes quite a few more much simpler.

To make it clear, Test::Builder will still put the plan at the front when it can.
Also, to make it clear, this is all possible right now with TAP.  This is a
Test::Builder imposed restriction.


 But streamability isn’t important in that most common use case,
 so it probably shouldn’t be the default, which is why I opined
 that maybe Test::More should be strict on request but not by
 default.

Sorry, I must have missed that.  Your example code up to this point looked
like it required the user to declare up front that they were going to put a
plan later.

Having the author declare in the test that they'd like Test::More to be strict
with the plan seems near useless.  If you're going to declare that you have to
declare a plan, why not just declare the plan?  It's like preparing to
prepare.  I can think of a few weird cases where it might be handy, but it's
not worth the extra complication.


-- 
Life is like a sewer - what you get out of it depends on what you put into it.
- Tom Lehrer


Re: shouldn't UNEXPECTEDLY SUCCEEDED mean failure?

2007-12-05 Thread Michael G Schwern
Fergal Daly wrote:
 The importance of the test has not changed. Only the worth of the
 failure report has changed.
 
 This could be solved by having another classification of test, the
 not my fault test used as follows
 
 BLAME: {
   $foo_broken = test_Foo(); # might just be a version check or
                             # might be a feature check
   local $BLAME = "Foo is broken, see RT #12345" if $foo_broken;
 
   ok(Foo::thing());
 }
 
 The module would install just fine in the presence of a working Foo,
 the module would fail to install in the presence of a broken Foo but
 no report should be sent to the author.
 
 This gives both safety for users and convenience for developers. This
 is what I meant by smarter tools.

I hope you don't mind if I cut out the rest of the increasingly head-butting
argument and jump straight to this interesting bit.

As much as my brain screams "DO NOT WANT!!!" [1] because it smacks of
"expected failure", it might be just what we're looking for.  This allows the
author to program in "I know this is broken, don't bug me about it" without
completely silencing the test.

However, I think it will be very open to abuse.  I'm also not sure how this
will be different from simply having the option of making failing TODO tests
fail for the user but not report back to the author.

It still boils down to trusting the author.


[1] http://www.mgroves.com/images/do_not_want_star_wars.jpg


-- 
There will be snacks.


Customizing Test::Builder (was Re: TAP::Builder)

2007-12-05 Thread Michael G Schwern
Ovid wrote:
 Side note:  those features I really want control over in
 Test::Harness
 are the plan() and ok() methods.  There's no clean way for me to do
 that.  Just look at the constructor:

   my $Test = Test::Builder->new;
   sub new {
       my($class) = shift;
       $Test ||= $class->create;
       return $Test;
   }

 The class name is hard-coded in there.
 
 Note to self:  don't post while hung over from the London Perl
 Workshop.  What I just said is rather confusing.  What I *need* is a
 way to easily replace Test::Builder with an appropriate subclass.  I
 think I can replace the builder() method in Test::Builder::Module:
 
   sub builder {
       return Test::Builder->new;
   }
 
 But it would still be nice to have something a bit more subtle than a
 sledgehammer:
 
   sub Test::Builder::Module::builder { ... }

On the surface, this could be solved with a simple way to replace the
Test::Builder class with your own.  However, I am always hesitant to do that
because of the inevitable clash.

Test::Builder is a singleton; this way custom testing modules can make their
changes in concert and with global effect.  Now what happens if Test::Foo and
Test::Bar are used together in the same program?  And Test::Foo decides it
wants to replace the singleton with Test::Builder::Foo and Test::Bar decides
it wants to do Test::Builder::Bar.  One has to win and one has to lose.  This
is not the Test::Builder way.

Somehow, multiple modules have to be able to override Test::Builder behaviors.
 Yes, there are certain features which inherently clash, but that's at the
higher feature level rather than at the code level.

Something I'm considering to resolve this is the Class::C3 method plugin
mechanism, or something like it.  I've used it while working on the Class::DBI
compatibility wrapper around DBIx::Class and I'm very impressed with the
amount of flexibility it allows.  It also allows you to slice up functionality
by feature and combine them together.

Traits and mixins offer similar functionality.  They're also a fair sight
easier to implement, considering the dependency issues Test::Builder will have
to contend with.  mixin.pm is small enough that I can just ship a copy with
Test::Builder.

Finally, there's the idea of splitting up Test::Builder into an aggregate of
many objects.  The aggregate itself would not be a singleton, allowing local
customization, but some of its parts (such as the part responsible for the
counter) would.  I believe this is how chromatic did the Perl 6 version which
I haven't gotten around to studying.

So... it's complicated.


-- 
Robrt:   People can't win
Schwern: No, but they can riot after the game.


Re: Parsing TAP into TAP

2007-12-10 Thread Michael G Schwern
Ovid wrote:
 Test results currently look something like this:
 
   t/foo.t. ok
   t/bar.t. ok
   t/baz.t. 23/?
   #   Failed test at t/baz.t line 9
   # Looks like you failed 2 tests out of 23
   t/baz.t. Dubious, test ...
 
 Why do we do this instead of outputting TAP (using YAML diagnostics)?
 
   ok 1 - t/foo.t
   ok 2 - t/bar.t
   not ok 3 - t/baz.t
 ---
 failed:
   - 2
   - 11
 ...
 
 And we could even add diagnostics for the non-failing tests.  This
 could be an alternate output, but now instead of external tools having
 to try and parse our ad-hoc Test::Harness output, we could have an
 alternate machine read-able output that those tools could use.  Now if
 only we had a useful way to read that output ...

+1

-- 
Stabbing you in the face for your own good.


Re: What's the point of a SIGNATURE test?

2007-12-14 Thread Michael G Schwern
Adrian Howard wrote:
 
 On 11 Dec 2007, at 05:12, Michael G Schwern wrote:
 
 Adam Kennedy posed me a stumper on #toolchain tonight.  In short, having a
 test which checks your signature doesn't appear to be an actual deterrent
 to tampering.  The man-in-the-middle can just delete the test, or just the
 SIGNATURE file since it's not required.  So why ship a signature test?

 The only thing I can think of is to ensure the author that the signature
 they're about to ship is valid, but that's not something that needs to
 be shipped.
 [snip]
 
 It is something that needs to be shipped if you have the "CPAN is the
 definitive version of a module.  Somebody can fork from it" attitude.

 It certainly doesn't have to run though...

I'm really not a fan of shipping tests that don't get run.

To be clear, I'd likely just delete it entirely and either A) trust that
MakeMaker/Module::Build will do the right thing, which it always has for me,
or B) add a "cpansign verify" to my normal release script.

Both avoid pooping a common author-only check all over the place.
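Option B is a one-liner in a release script; a sketch in Perl:

    # refuse to release if the SIGNATURE doesn't verify
    system("cpansign", "verify") == 0
        or die "SIGNATURE failed to verify, aborting the release\n";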


-- 
Robrt:   People can't win
Schwern: No, but they can riot after the game.


Re: What's the point of a SIGNATURE test?

2007-12-14 Thread Michael G Schwern
Andreas J. Koenig wrote:
 On Mon, 10 Dec 2007 21:12:51 -0800, Michael G Schwern [EMAIL PROTECTED] said:
 
Adam Kennedy posed me a stumper on #toolchain tonight.  In short, having a
test which checks your signature doesn't appear to be an actual deterrent
to tampering.  The man-in-the-middle can just delete the test, or just the
SIGNATURE file since it's not required.  So why ship a signature test?
 
 Asking the wrong question. None of our testsuites is there to protect
 against spoof or attacks. That's simply not the goal. Same thing for
 00-signature.t

We would seem to be agreeing.  If the goal of the test suite is not to protect
against spoofing, and if it doesn't accomplish that anyway, why put a
signature check in there?


The only thing I can think of is to ensure the author that the signature
they're about to ship is valid, but that's not something that needs to be
shipped.
 
 Has the world changed over night? Are we now questioning tests instead
 of encouraging them? Do now suddenly authors have to justify their
 testing efforts?

 I don't mind if we set up a few rules what tests should and should not
 do, but then this topic needs to be put into perspective.
 
It appears that a combination of a CHECKSUMS check against another CPAN
mirror and a SIGNATURE check by a utility external to the code being checked
is effective, and that's what the CPAN shell does.  The CHECKSUMS check makes
sure the distribution hasn't been tampered with.  Checking against a CPAN
mirror other than the one you downloaded the distribution from checks that
the mirror has not been compromised.  Checking the SIGNATURE ensures that the
module is from who you think it's from.
 
 Yupp. And testing the signature in a test is better than not testing
 it because a bug in a signature or in crypto software is as alarming
 as a bug in perl or a module.

I believe this to be outside the scope of a given module's tests.  It's not
the responsibility of every CPAN module to make sure that your crypto software
is working.  Or perl.  Or the C compiler.  Or make.  That's the job of the
toolchain modules which more directly use them (CPAN, Module::Signature,
MakeMaker, Module::Build, etc...). [1]

At some point you have to trust that the tools work, you can't test the whole
universe.  You simply don't have the time.

That brings me to the central reason why we've started to examine tests for
removal.  There's a certain cost/benefit ratio to be considered.  What's the
cost of implementing and maintaining a test, what's the benefit and does the
benefit justify the cost?  What's the opportunity cost, could you be doing
something more useful with that time and effort?  Finally, what's the cost in
terms of test suite confidence?  How many false negatives are your users
willing to endure before they lose confidence?

The fixed cost of a test is in writing it.  This includes both writing the
test itself and possibly altering the code being tested to make it testable.
It's a fixed cost because you do it once and then you're done.

The recurring costs include diagnosing failures.  The user loses time due to
a halted installation.  They contact the author, who has to diagnose the
failure and communicate the results back to the user.  If the test found
a bug, then the cost has a benefit and it's worthwhile.  But if the test
failed because it's a bad test, or because of something out of the author's
control and/or the user doesn't care about, then there's little or no benefit.

Then there's the cost of confidence.  Tests are only useful if someone pays
attention to them.  A failed test should be a clear indication of an actual
problem.  This is why "expected failures" (and their related "expected
warnings") are so insidious.  False failures erode the mental link between
test failure and bug.  Get enough of them, and it doesn't take much, and
people start to ignore any failure.  This is one of the most dangerous social
problems for a test suite.

A test that results in a lot of false negatives has a high recurring cost and
no benefit.

Finally there's the question of opportunity cost.  Instead of writing and
maintaining a faulty test, what else could you have been doing with that time?
 Could you have been doing something with an even higher benefit?  If so, you
should do it instead.


Let's look at the example of Test::More.  The last release has 120 passes and
just 4 failures.
http://cpantesters.perl.org/show/Test-Simple.html#Test-Simple-0.74

What are those four failures?  Three are due to a threading bug in certain
vendor patched versions of perl, one is due to the broken signature test.

Look at the previous gamma release, 0.72.  256 passes, 9 failures.
5 due to the threading bug, 4 from the signature test.

0.71:  73 passes, 2 failures.  1 signature, 1 threads

0.70:  221 passes, 12 failures.  3 signature, 9 threads

And so on.  That's nine months with nothing but false negatives

Re: Milton Keynes PM coding collaboration

2007-12-14 Thread Michael G Schwern
Edwardson, Tony wrote:
 Anyone written any CPAN modules for which the testing coverage needs to be
 improved ?
 
 Want someone else to sort this out for you ?

...

 Any takers ?

http://search.cpan.org/dist/ExtUtils-MakeMaker

Repository here:
http://svn.schwern.org/svn/CPAN/ExtUtils-MakeMaker

Getting a valid coverage measurement is tricky since so much happens in
sub-processes.  And then there's all the platform specific code.  But don't
worry, there's plenty to do. :)


-- 
Stabbing you in the face so you don't have to.


Re: What's the point of a SIGNATURE test?

2007-12-15 Thread Michael G Schwern
Andreas J. Koenig wrote:
 On Fri, 14 Dec 2007 15:49:32 -0800, Michael G Schwern [EMAIL PROTECTED] said:
We would seem to be agreeing.  If the goal of the test suite is not to
protect against spoofing, and if it doesn't accomplish that anyway, why put a
signature check in there?
 
 Of course we are agreeing 99%. But I'm citing the Michael Schwern
 saying that is dearer to me than the above paragraph: "tests are there
 to find bugs".

I say lots of apparently contradictory things.  The trick is knowing when one
rule wins out over the other.

Something to keep in mind is that I'm talking about one very specific test.
Don't let this discussion get tangled up in the author tests brouhaha that
often brews up around here.


[...] But if the test failed because it's a bad test,
 
 Clearly a strawman's argument. It's impossible to contradict you on
 this. Thou shalt not write bad tests. Period.

That was supposed to come out more like "if the test failed because of a
mistake in the test suite".  You know the sort of thing.  Like when you write:

like $error, qr/your shit is broke at $0 line \d+\.\n/;

and it blows up on Windows because you forgot about the backslashes in Windows
path names.  The test failure indicates a bug in the test, not the code.
Thus, the failure has a cost and no benefit.


The
signature test is not actually indicating a failure in Test::More, so 
 it's of
no benefit to me or the users, and the bug has already been reported to
Module::Signature.
 
 See above. Once the bug is reported there is no justification to keep
 the test around. In this case I prefer a skip over a removal because
 the test apparently once was useful.

But skipped tests don't get run so it's effectively deleted, except a
permanently skipped test sits around cluttering things up.  Smells like
commenting out code that maybe someday you might want to use again in the
future.  Just adds clutter.

If I want to bring a test (or code) back from the dead that's what version
control is for.


The threading test is indicating a perl bug that's very difficult to 
 detect
[2], only seems to exist in vendor patched perls, I can't do anything 
 about
and is unlikely to effect anyone since there's so few threads users.  It's
already been reported to the various vendors but it'll clear up as soon as
they stop mixing bleadperl patches into 5.8.
 
In short, I'm paying for somebody else's known bugs.  I get nothing.
Test::More gets nothing.  The tools get nothing.  Cost with no benefit.  
 So
why am I incurring these costs?  Maybe the individual users find out their
tools are broken, but it's not my job to tell them that.
 
 During smoking CPAN I often find bugs in one module revealed by a test
 in another one... Only because David Golden tests so hard his tests were
 well suited to reveal a bug in Test::Harness. I'm glad he doesn't ask
 if it is his job or not. Just a few RT headlines of the past year:...
 Catalyst::Plugin::Authorization::Roles found a bug in C::P::Authentication.
 DBI 1.601 broke Exception::Class::DBI. HTML-TreeBuilder-XPath 0.09
 broke Web::Scraper. Test::Distribution 1.29 broke Lingua::Stem.
 Math-BigInt-FastCalc broke Convert::ASN1. Test::Harness 3.0 broke POE.
 DBM-Deep-1.0006 broke IPC::PubSub. DateTime-Locale-0.35 broke
 Strptime. Data::Alias 1.06 breaks Devel::EvalContext. Class::Accessor
 breaks Class::Accessor::Class. DBIx-DBSchema-0.33 breaks Jifty::DBI.
 File::chdir 0.08 breaks Module::Depends 0.12. Lingua::Stem::It 0.02
 breaks the Lingua::Stem testsuite. SVN-Notify-0.26 breaks
 SVN::Notify::Config (and others). Heap 0.80 breaks Graph. DBI-1.53's
 NUM_OF_FIELDS change breaks DBD-Sybase 1.07. Getopt-Long 2.36 breaks
 Verilog::Language. And so on.

I agree that's all very useful.  Interlocking dependent test suites ferret out
bugs the original authors wouldn't find.  However, there is a very important
difference between the list above and Test::More's signature test.

On a quick scan, all of those modules have direct dependencies.  DBD::Sybase
uses DBI, Lingua::Stem uses Test::Distribution, etc... so it's natural that
their tests would test their dependencies.  If a dependency breaks, they
break.  I'm sure most of the authors above did not set out with the intention
to test their dependencies, it's all inherent in the testing of their own code.

Test::More doesn't actually use Module::Signature, so why is it testing it?

It would be like if DBI decided to add a test to make sure MakeMaker can read
the MANIFEST.  Sure it's useful to know that part of the toolchain works and
that the MANIFEST can be read, but why is that in DBI?  One can argue that DBI
depends on the good functioning of ExtUtils::Manifest to install, so it should
test it.  Ok, then what about all the other things DBI depends on to install?
 Should it test that MakeMaker can make a valid Makefile?  Should it test that
tar and gzip work?  Should I check that CPAN.pm can properly...

Re: What's the point of a SIGNATURE test?

2007-12-16 Thread Michael G Schwern
Andreas J. Koenig wrote:
 On Sat, 15 Dec 2007 01:34:37 -0800, Michael G Schwern [EMAIL 
 PROTECTED] said:
 
   See above. Once the bug is reported there is no justification to keep
   the test around. In this case I prefer a skip over a removal because
   the test apparently once was useful.
 
But skipped tests don't get run so it's effectively deleted, except a
permanently skipped test sits around cluttering things up.  Smells like
commenting out code that maybe someday you might want to use again in the
future.  Just adds clutter.
 
If I want to bring a test (or code) back from the dead that's what version
control is for.
 
 I think I did indicate I was talking about a $VERSION-dependent skip.
 
 Let me reiterate.
 
 A test reveals a bug in module A, version N. The bug now is known and
 filed to RT. No need to run it again and again. Skip it ***if version
 N of module A is installed***. Apparently the test was useful to
 detect a malfunctioning of module A. Do not throw it away until you
 have verified that the test has found a better home. If it has found a
 better home for sure, I do not care if you delete it.
 
 Otherwise it is vital to keep the test because it has proved to be
 useful. It is unacceptable to run the test on the broken version
 over and over again. A $VERSION check should be sufficient from that
 point in time on.
 
 What if everybody on CPAN deletes tests just because a related bug has
 been fixed? Nobody would notice if the bug were reintroduced.
 
 Nuff said?

Now I understand, I thought you meant an unconditional skip.
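
For the record, that kind of version-dependent skip is cheap to write.  A
minimal sketch with Test::More (Module::A, its version and frobnicate() are
made-up names, standing in for the real module and bug):

    use Test::More tests => 1;
    use Module::A;

    SKIP: {
        # 1.23 is the hypothetical release with the known, already-reported bug.
        skip "Module::A 1.23 has a known, reported bug", 1
            if Module::A->VERSION == 1.23;

        is( Module::A->frobnicate, 42, "frobnicate() works" );
    }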


-- 
If at first you don't succeed--you fail.
-- Portal demo


Re: [ANNOUNCE] TAP::Harness::Archive 0.03

2007-12-16 Thread Michael G Schwern
nadim khemir wrote:
 On Saturday 15 December 2007 20.53.30 Michael Peters wrote:
 The uploaded file

 TAP-Harness-Archive-0.03.tar.gz
 ...
 
 Nice. Now, what do we do with it?

You RTFM.

http://search.cpan.org/perldoc/TAP::Harness::Archive


-- 
If at first you don't succeed--you fail.
-- Portal demo


Re: Fwd: Auto: Your message 'FAIL IO-AIO-2.51 i386-freebsd-thread-multi 6.2-release' has NOT been received

2007-12-18 Thread Michael G Schwern
chromatic wrote:
 On Tuesday 18 December 2007 17:27:24 Andy Armstrong wrote:
 
 Someone (MLEHMANN) doesn't like smoking... That was a test report
 generated by CPAN::Reporter.

 It hadn't previously occurred to me that test reports might cause
 offence...
 
 Didn't you get a whole slew of them a while back where the problem was
 that the reporter hadn't properly configured Windows to build modules?  How 
 about the one where the reporter had configured CPAN never to follow 
 dependencies?

That said, looking through IO::AIO's failures they seem reasonably legit to
me.  It has trouble on BSD, and some other systems, a useful thing to know.
IO::AIO lacks any special INSTALL instructions or special notes about BSD in
general.  There are a couple notes generated by the Makefile.PL about FreeBSD
and threading and Linux and malloc issues, but that will whiz by and likely be
completely missed.  So even a human installer would not know what to do.

Anyhow, what's clear is there is a problem with IO::AIO.  It hasn't been
addressed properly by the author.  While it's frustrating to get a constant
stream of "your shit is broke", his shit is indeed broke.  This is a clear
case of CPAN Testers technology working as expected and tickling a social 
problem.

It is particularly near to my heart as Test::More has a similar problem with
thread tests and I'm not sure what to do about it.  There was the suggested
"author does not care" marker for tests which might fail but are only for the
information of the installer -- the author already knows about them, don't
report the failure.

As for the social problem, the BSD testers could try to help out with whatever
the problem is.  On Marc's side he could ask for help instead of asking
everyone to turn off the immensely useful automated testing.  It could also
use an INSTALL doc and have the Makefile.PL warnings be more prominent with
perhaps a pause, beep or a well-behaved "Do you wish to continue? [No]".


-- 
I am somewhat preoccupied telling the laws of physics to shut up and sit down.
-- Vaarsuvius, Order of the Stick
   http://www.giantitp.com/comics/oots0107.html


Re: Auto: Your message 'FAIL IO-AIO-2.51 i386-freebsd-thread-multi 6.2-release' has NOT been received

2007-12-19 Thread Michael G Schwern
Andy Armstrong wrote:
 On 19 Dec 2007, at 03:13, Michael G Schwern wrote:
 Anyhow, what's clear is there is a problem with IO::AIO.  It hasn't been
 addressed properly by the author.  While it's frustrating to get a
 constant
 stream of "your shit is broke", his shit is indeed broke.  This is a
 clear
 case of CPAN Testers technology working as expected and tickling a
 social problem.
 
 I'm locked in correspondence with Marc now.
 
 His view: cpan-testers are incompetent, ego tripping, quasi-religious
 nuisances.
 My view: approx your view.
 
 Obviously that's my (probably extremely unprofessional) impression of
 his views. He did mention religion and ego though :)

CPAN Testers does mug his modules pretty badly, just look at all that red.
http://cpantesters.perl.org/author/MLEHMANN.html

He does an awful lot of XS which is always going to be problematic.


-- 
Hating the web since 1994.


Re: Auto: Your message 'FAIL IO-AIO-2.51 i386-freebsd-thread-multi 6.2-release' has NOT been received

2007-12-20 Thread Michael G Schwern
Michael Peters wrote:
 David Golden wrote:
 On Dec 20, 2007 1:19 PM, Dave Rolsky [EMAIL PROTECTED] wrote:
 It's generally
 pretty rare that the failure report includes enough information for me to
 do anything about it, so without an engaged party on the other end, it
 really is just noise.
 With CPAN::Reporter, I've been trying to add additional context
 (within reason) to assist with problem diagnosis.  What kind of
 information would improve the reports?  (Not to say that this obviates
 the need for a responsive tester, but every little bit helps.)
 
 I for one would like the full TAP output of the tests. Not just what gets 
 sent
 to STDOUT by default. What would be ideal (and it's something that RJBS has
 poked me about before) would be to receive a TAP Archive (prove --archive) 
 that
 could get attached to the email. Of course this needs to be opt-in 
 (META.yml?).
 Then it would be pretty easy to setup an email account that is monitored by 
 some
 tool that would extract the archive and upload it to a Smolder install.

Altering how the tests run just for CPAN Testers:  Bad.
Trying to get authors to put in special code just for CPAN Testers:  Good Luck

This could be accomplished silently with some environment variables.
TAP_PARSER_ARCHIVE_DIR=/path/to/somewhere

There is the problem of getting TAP::Parser to recognize that the archive
feature is available and that gets back to the open TAP::Builder problem.


Re: Auto: Your message 'FAIL IO-AIO-2.51 i386-freebsd-thread-multi 6.2-release' has NOT been received

2007-12-20 Thread Michael G Schwern
Andy Armstrong wrote:
 On 21 Dec 2007, at 00:11, Michael G Schwern wrote:
 This could be accomplished silently with some environment variables.
 TAP_PARSER_ARCHIVE_DIR=/path/to/somewhere
 
 It's called PERL_TEST_HARNESS_DUMP_TAP and it already exists.

Well there you go.

Though might I suggest that...

A)  This be documented in Test::Harness, I note it's only in
TAP::Harness.
B)  The TAP::Harness version be changed to PERL_TAP_HARNESS_DUMP_TAP.  Don't
want Test::Harness features leaking into TAP::Harness.
C)  The Test::Harness version be changed to HARNESS_DUMP_TAP (to match all
the other environment variables)

All that HARNESS_DUMP_TAP would do is twiddle the appropriate TAP::Harness bit.

I also note that setting the environment variable appears to be the *only* way
to get this behavior.  It would be handy if there was a normal parameter to
set it.
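
For example (assuming a TAP::Harness based prove), something like this should
dump each test's raw TAP into the named directory:

    $ PERL_TEST_HARNESS_DUMP_TAP=tap-output/ prove -l t/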

If I can sort out my SVK breakage I'll get on it.


 There is the problem of getting TAP::Parser to recognize that the archive
 feature is available and that gets back to the open TAP::Builder problem.
 
 I don't understand...

Sorry, I thought an optional plugin/subclass was necessary to get TAP::Harness
to save TAP.  Didn't realize it's built in.


-- 
If at first you don't succeed--you fail.
-- Portal demo


Re: Auto: Your message 'FAIL IO-AIO-2.51 i386-freebsd-thread-multi 6.2-release' has NOT been received

2007-12-22 Thread Michael G Schwern
chromatic wrote:
 On Saturday 22 December 2007 16:48:29 Michael G Schwern wrote:
The "I installed to a directory with a space in the path" is an example of
 CPAN Testers working as expected.  It found and highlighted an annoying bug
 that the rest of us either ignore or work around.
 
 CPAN Testers reporting failures in every module they test and not stopping to 
 ask "Hey, is it possible that not everything else in the world is broken?" is 
 *not* an example of CPAN Testers working as expected.

 Environments where it's impossible even to *build* Perl modules are 
 unsuitable 
 for smoketest reporting, as they don't provide any useful information and 
 they make true failures much more difficult to see and believe.

One of the drawbacks of extensive automation is that special cases which
require human intervention cannot be handled and can often go from minor
annoyances to pandemics.  The proper response is to either fix the automation
to eliminate the necessity of human interaction or, if it's a bug, fix the bug.
   A little work now and the system will run even more efficiently than before.

These sorts of systemic failures might not be providing *new* information, but
I wouldn't say it's not useful.  One of the greatest problems facing
collecting bug reports (or, in fact, any survey technique) is getting honest,
unfiltered feedback.  Humans have a tendency to filter out negative feedback,
especially if it's perceived to be a known problem with a narrow focus.  CPAN
testers is giving us honest, unfiltered feedback. [1]  We get to see all
the problems and the breadth of them.  This makes it difficult to ignore long
standing problems, like configuration level dependencies or non-Perl
dependencies or failures on BSD or CPANPLUS fighting with Module::Build or all
the other things WE know how to work around but others don't and make the end
user experience annoying.  Spaces in filenames are just the next problem to fix.

Even in cases where Perl is broke, it's nice to know how it got into that
state and if we can do anything about it.

The price we pay is a little more email in the inbox. [2]  I'm willing to pay
that.


[1] Within the range of the set of people doing the testing, of course.

[2] Rather than each individual emailing the author how about CPAN Testers
sends out a daily/weekly digest?  Still push and mandatory, it's important
that authors see this information, but at least it's just one email and it can
skip a lot of the boilerplate text and provide a nice, tight summary.


-- 
I have a date with some giant cartoon robots and booze.


Re: Auto: Your message 'FAIL IO-AIO-2.51 i386-freebsd-thread-multi 6.2-release' has NOT been received

2007-12-22 Thread Michael G Schwern
chromatic wrote:
 I just went through a sampling of fail reports for my stuff.  There was one 
 legitimate packaging bug, and a couple of legitimate errors due to updates to 
 Perl.  About 35% of the other reports are these.
 
 I love the Illegal seek error message:
 
   http://www.nntp.perl.org/group/perl.cpan.testers/2007/09/msg602208.html

As three different reporters across five different operating systems and three
versions of perl reported similar test failures, I believe the illegal seek is
suspicious but incidental.


 Pod::Man is broken.  Think about that for a while:
 
   http://www.nntp.perl.org/group/perl.cpan.testers/2006/01/msg286775.html

A temporary failure almost two years ago.  But the six other Acme::UNIVERSAL
failures appear to be legit.


 No information; useless:
 
   http://www.nntp.perl.org/group/perl.cpan.testers/2005/07/msg221656.html
   http://www.nntp.perl.org/group/perl.cpan.testers/2006/02/msg290834.html
   http://www.nntp.perl.org/group/perl.cpan.testers/2005/07/msg223400.html
   http://www.nntp.perl.org/group/perl.cpan.testers/2005/07/msg223401.html
   http://www.nntp.perl.org/group/perl.cpan.testers/2005/07/msg223402.html
   http://www.nntp.perl.org/group/perl.cpan.testers/2005/07/msg222475.html
   http://www.nntp.perl.org/group/perl.cpan.testers/2005/06/msg216573.html

All the "no information" reports appear to be prior to the fixes for
CPANPLUS not reporting Module::Build test results.  A bug in the system that
was, I believe, fixed about two years ago.  You'll note the newest
of this bunch is two years old.

That leaves us with... I count one failure due to an individual CPAN tester's
setup being broken with another that amounts to a warning.  The rest are
system-wide bugs.

That CPANPLUS didn't report Module::Build test results is a highly annoying
bug to be sure (in fact, the whole CPANPLUS v Module::Build war is a tragedy),
but a bug that was fixed.  There's nothing we can do about previous mistakes,
only future ones.  No use dwelling on the past, let's see how CPAN Testers is
treating your current releases...

* P5NCI, 11 Dec 2007... looks like a valid pile of XS compatibility issues
with similar failures coming from several different testers.  Valid failures.

* UNIVERSAL::isa, 24 Nov 2007... looks like they caught a compatibility issue
with 5.5.5 and 5.6.2 across multiple testers.  Valid failures, all.  And it's
an alpha release, isn't it nice to have people testing your alphas?  Previous
stable release in Feb 2006 has one failure out of 400 tests.  Looks like a
valid failure, possibly due to CGI.pm not being available and the test
checking for availability but not skipping the dependent test. [2]

* Test::MockObject, 29 Jun 2007... 100% passing with 201 tests.  Previous
version from October 2006 has only one failure with 153 passes, possibly due
to a bleadperl issue.  Maybe annoying to you, but useful for bleadperl
development. [1]

* Text::WikiFormat, 29 Jun 2007... 100% passing with 71 tests.  Previous
version had 1 failure out of 43 tests.  Again, possible bleadperl issue.

* SUPER, 04 Apr 2007... one failure, a possible bleadperl bug.

Depending on which way you score the bleadperl failures for your latest five
distribution releases you've had zero or two false negatives out of, let's add
it up... 51 + 50 + 201 + 71 + 46 == 419.  So either a 0 or 0.5% false failure
rate.  That's pretty damn good.


[1] It can be argued that bleadperl testers should probably not email authors,
and maybe they aren't, I can't tell from these archives, but at least the work
is useful.  CPAN::Reporter could change the default configuration if it
detects a development perl.

[2] And before you say "how do you install perl without CGI.pm?!" it can be
done with a stripped down Debian system via their perl-base package and
possibly other perl distributions.


-- 
We do what we must because we can.
For the good of all of us,
Except the ones who are dead.
-- Jonathan Coulton, Still Alive


Re: Auto: Your message 'FAIL IO-AIO-2.51 i386-freebsd-thread-multi 6.2-release' has NOT been received

2007-12-23 Thread Michael G Schwern
David Golden wrote:
 On Dec 23, 2007 2:37 AM, Michael G Schwern [EMAIL PROTECTED] wrote:
 [1] It can be argued that bleadperl testers should probably not email 
 authors,
 and maybe they aren't I can't tell from these archives, but at least the work
 is useful.  CPAN::Reporter could change the default configuration if it
 detects a development perl.
 
 That's quite reasonable -- submit to CPAN Testers to help p5p check
 bleadperl against CPAN but don't annoy authors if it fails.  What's
 the best way to detect a development perl reliably?  I don't think
 it's just odd major numbers, as 5.9.5 switched to 5.10.0 well before
 the actual release candidates were out.  Maybe
 $Config{perl_patchlevel}?  That seems to have vanished from the final
 release.

That's ok, it doesn't need to be foolproof.  Odd numbered versions (starting
at 7) is a good start and will cut out most of the bleadperl noise.

The "5.even as devel" period is very short.  CPAN authors should be made aware
of how their code works with release candidates.   That's a period when
problems are likely to be for real.
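
A rough sketch of that heuristic, using nothing but Config (not foolproof, as
noted; perl_patchlevel only shows up on perls built from a patched source tree):

    use Config;

    # Odd-numbered 5.x releases (5.7.x, 5.9.x, 5.11.x...) are development perls.
    my $is_devel_perl = ($Config{PERL_VERSION} % 2)
                     || $Config{perl_patchlevel};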


-- 
I do have a cause though. It's obscenity. I'm for it.
- Tom Lehrer


Re: Auto: Your message 'FAIL IO-AIO-2.51 i386-freebsd-thread-multi 6.2-release' has NOT been received

2007-12-23 Thread Michael G Schwern
Michael G Schwern wrote:
 David Golden wrote:
 On Dec 23, 2007 2:37 AM, Michael G Schwern [EMAIL PROTECTED] wrote:
 [1] It can be argued that bleadperl testers should probably not email 
 authors,
 and maybe they aren't I can't tell from these archives, but at least the 
 work
 is useful.  CPAN::Reporter could change the default configuration if it
 detects a development perl.
 That's quite reasonable -- submit to CPAN Testers to help p5p check
 bleadperl against CPAN but don't annoy authors if it fails.  What's
 the best way to detect a development perl reliably?  I don't think
 it's just odd major numbers, as 5.9.5 switched to 5.10.0 well before
 the actual release candidates were out.  Maybe
 $Config{perl_patchlevel}?  That seems to have vanished from the final
 release.
 
 That's ok, it doesn't need to be foolproof.  Odd numbered versions (starting
 at 7) is a good start and will cut out most of the bleadperl noise.
 
 The 5.even as devel period is very short.  CPAN authors should be made aware
 of how their code works with release candidates.   That's a period when
 problems are likely to be for real.

Thinking on this a little more, there is the issue of folks like me who share
a single CPAN configuration file across multiple Perl installations.  I don't
know how common that is to have a stable and devel perl running off the same
CPAN config and if it's worth adding in a special case in the configuration
for "what do you want to do with development perls" to override the existing
config.


-- 
If at first you don't succeed--you fail.
-- Portal demo


Re: MakeMaker warning

2007-12-26 Thread Michael G Schwern
Gabor Szabo wrote:
 might be slightly unrelated to QA... sorry

In the future, MakeMaker issues go to [EMAIL PROTECTED]


 After installing   JOSHUA/Net-Telnet-Cisco-1.10.tar.gz
 if I run perl Makefile.PL  on an unrelated Makefile.PL that
 requires 'Net::Telnet::Cisco'  = '1.10'
 I get a warning
 
  Argument "1.3.1" isn't numeric in numeric lt (<) at
 /opt/perl510/lib/5.10.0/ExtUtils/MakeMaker.pm line 414.
 
 I did not get this warning before installing Net::Telnet::Cisco
 
 Strange thing is that when I ack for 1.3.1 the only place I found it is
 
 /opt/perl510/lib/site_perl/5.10.0/Sys/HostIP.pm
 7:$VERSION = '1.3.1';
 
 So my diagnostics pointing at Net::Telnet::Cisco might be incorrect.
 
 So which module is at fault here?

The warning is absolutely correct, "1.3.1" isn't numeric.  Nor is it a version
object as it probably should be.  So that much of Sys::HostIP is wrong.

As to why it's happening in your apparently unrelated Makefile.PL, I can't
say.  Double check that somewhere down the line you're not listing Sys::HostIP
as a prereq.
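
For what it's worth, the warning is just what you get when a non-numeric string
lands in a numeric comparison, which is roughly what the prereq check does (a
simplified illustration, not MakeMaker's actual code):

    use warnings;

    my $required  = 1.10;
    my $installed = '1.3.1';    # Sys::HostIP's $VERSION

    # warns: Argument "1.3.1" isn't numeric in numeric lt (<) at ...
    print "prerequisite too old\n" if $installed < $required;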


-- 
Look at me talking when there's science to do.
When I look out there it makes me glad I'm not you.
I've experiments to be run.
There is research to be done
On the people who are still alive.
-- Jonathan Coulton, Still Alive


Re: MakeMaker warning

2007-12-29 Thread Michael G Schwern
Gabor Szabo wrote:
 Is there a place with definition of what a VERSION value can be ?

Anything which compares sanely as a number plus the X.YY_ZZ alpha convention
(which MM converts to a number).  I guess that's never stated explicitly.  I'd
welcome a section on $VERSION in ExtUtils::MakeMaker::Tutorial, VERSION_FROM
isn't really the right place for it.
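
To make that concrete, a few examples (my shorthand, not official documentation):

    $VERSION = 1.23;        # fine: a plain number
    $VERSION = '1.23';      # fine: numifies cleanly
    $VERSION = '1.23_01';   # fine: the X.YY_ZZ alpha convention
    $VERSION = '1.2.3';     # trouble: doesn't compare sanely as a number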


 I looked at the docs of ExtUtils::MakeMaker.
 It mentions version numbers under VERSION_FROM but only as examples
 it has examples like 1.2.3 though not in a string format like '1.2.3'.

1.2.3 is an ill-fated version string (not to be confused with version objects)
introduced in 5.6.  I don't think they actually work anyway... no they don't.
 I'll remove them.


 CPANTS thinks it is a correct version number:
 http://cpants.perl.org/dist/kwalitee/Sys-HostIP
 
 Maybe the way CPANTS check isn't correct but I think there is some mismatch.

It's using CPAN::DistnameInfo which is looking at formatting details.  All
MakeMaker cares about is capabilities.  So there's likely to be some mismatch.

It's also not clear that CPAN::DistnameInfo cares about vetting the $VERSION
so much as simply extracting it.

A simple test for capabilities would be this:

$version =~ s/(\d+)\.(\d+)_(\d+)/$1.$2$3/;  # turn X.YY_ZZ into X.YYZZ
{
    local $SIG{__WARN__} = sub { die "Bad version: @_" };
    () = $version >= 0;
}
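
Wrapped up as a function, the idea looks something like this (a sketch, not
anything MakeMaker actually ships):

    use warnings;

    sub version_looks_sane {
        my $version = shift;
        $version =~ s/(\d+)\.(\d+)_(\d+)/$1.$2$3/;  # turn X.YY_ZZ into X.YYZZ

        local $SIG{__WARN__} = sub { die "Bad version: @_" };
        return eval { () = ($version >= 0); 1 } ? 1 : 0;
    }

    print version_looks_sane('1.23')    ? "ok\n" : "bad\n";   # ok
    print version_looks_sane('1.23_01') ? "ok\n" : "bad\n";   # ok
    print version_looks_sane('1.3.1')   ? "ok\n" : "bad\n";   # bad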


-- 
Stabbing you in the face so you don't have to.


Re: demining

2008-01-03 Thread Michael G Schwern
Eric Wilhelm wrote:
 # from Aristotle Pagaltzis
 # on Wednesday 02 January 2008 16:47:
 
 looking for (and diffusing) mines
 That sounds like a novel approach! Or do you mean “defusing”? :-)
 
 Yeah :-D  Diffuse is probably what they do when you find them the less 
 careful way!
 
 I guess the tank+flail mechanism is still in use, so that would 
 adequately describe that process.  Apparently it makes a huge mess 
 though.

Yes.  Mine goes off, chain snaps and goes wheee into some poor
soldier's face.  They're only used in modern day to clear little anti-personnel
mines because they're relatively cheap and fast and don't require an armored
vehicle.

These days tactical (ie. small scale, under fire) mine clearing uses either a
tank mounted plow
http://en.wikipedia.org/wiki/Image:951219-O-9805M-005.jpg

Or a big, heavy roller mounted on springs in front of a tank to set off mines.
http://en.wikipedia.org/wiki/Image:M60-panther-mcgovern-base.jpg

Or you fire a rocket with an explosive filled hose attached across the
minefield and set it off detonating any mines in its path.

All of which make big messes.

My personal favorite... rats!
http://www.apopo.org/


-- 
...they shared one last kiss that left a bitter yet sweet taste in her
mouth--kind of like throwing up after eating a junior mint.
-- Dishonorable Mention, 2005 Bulwer-Lytton Fiction Contest
   by Tami Farmer


Re: Testing print failures

2008-01-05 Thread Michael G Schwern
nadim khemir wrote:
 print 'hi' or carp q{can't print!} ;

I'm not even going to wade into the layers of neurosis demonstrated in this
post, but if you want to throw an error use croak().


-- 
...they shared one last kiss that left a bitter yet sweet taste in her
mouth--kind of like throwing up after eating a junior mint.
-- Dishonorable Mention, 2005 Bulwer-Lytton Fiction Contest
   by Tami Farmer


Re: Dev version numbers, warnings, XS and MakeMaker dont play nicely together.

2008-01-06 Thread Michael G Schwern
demerphq wrote:
 So we are told the way to mark a module as development is to use an
 underbar in the version number:
 
 $VERSION= 1.23_01;
 
 but this will produce warnings if you assert a required version
 number, as the version isn't numeric.

We talked about this recently on [EMAIL PROTECTED]  Specifically how much
the convention sucks and replacing it with META.yml info.
http://www.nntp.perl.org/group/perl.module.build/2007/12/msg1151.html


-- 
Life is like a sewer - what you get out of it depends on what you put into it.
- Tom Lehrer


Re: Fixed Test::Builders regexp detection code.

2008-01-06 Thread Michael G Schwern
demerphq wrote:
 Just a heads up that I patched the core version of Test::Builder to
 use more reliable and robust methods for detecting regexps in test
 cases. This makes them robust to changes in the internals and also
 prevents Test::Builder from getting confused if someone uses blessed
 qr//'s.

Thanks.

For future reference, patches to dual core modules should please go upstream
to the CPAN version's bug tracker.

The bug tracker for Test::Builder is here.
http://rt.cpan.org/NoAuth/Bugs.html?Dist=Test-Simple

or here
[EMAIL PROTECTED]


-- 
Stabbing you in the face so you don't have to.


Re: Call for Attention: Perl QA Hackathon in Oslo

2008-01-08 Thread Michael G Schwern
Salve J Nilsen wrote:
 Oslo.pm is planning a Perl QA Workshop/Hackathon in Oslo, Saturday
 April 4th to Monday April 7th, 2008.

FWIW, if I can get sponsored, I'm going with bells on.


 Just to make things more interesting, IEEE will have a conference on
 software testing nearby (in Lillehammer, Norway), just a few days after
 the workshop/hackathon.
 
   http://www.cs.colostate.edu/icst2008/

And if I do get over there I'm totally crashing this.


-- 
ROCKS FALL! EVERYONE DIES!
http://www.somethingpositive.net/sp05032002.shtml


Dude, where's my diagnostics? (Re: Halting on first test failure)

2008-01-11 Thread Michael G Schwern
Ovid wrote:
 I've posted a trimmed down version of the custom 'Test::More' we use
 here:
 
   http://use.perl.org/~Ovid/journal/35363
 
 I can't recall who was asking about this, but you can now do this:
 
   use Our::Test::More 'no_plan', 'fail';
 
 If 'fail' is included in the import list, the test program will die
 immediately after the first failure.  VERY HANDY at times.

I've experimented with this idea in the past to use Test::Builder to replace
home rolled die on failure assert() style test suites.  Unfortunately
there's a major problem:

$ perl -wle 'use OurMore "fail", "no_plan";  is 23, 42'
not ok 1
#   Failed test at /usr/local/perl/5.8.8/lib/Test/More.pm line 329.
Test failed.  Halting at OurMore.pm line 44.
1..1

Dude, where's my diagnostics?

In Test::Builder, the diagnostics are printed *after* the test fails.  So
dying on ok() will kill those very important diagnostics.  Sure, you don't
have to read a big list of garbage but now you don't have anything to read at 
all!

Since the diagnostics are printed by a calling function outside of
Test::Builder's control (even if you cheated and wrapped all of Test::More
there's all the Test modules on CPAN, too) I'd considered "die on failure"
impossible. [1]  The diagnostics are far more important.


Now, getting into opinion, I really, really hate "die on failure".  I had to use
a system that implemented it for a year (Ovid knows just what I'm talking
about) and I'd rather scroll up through an occasional burst of errors and
warnings than ever not be able to fully diagnose a bug because a test bailed
out before it was done giving me all the information I needed to fix it.  For
example, let's look at the ExtUtils::MakeMaker tests for generating a PPD file.

ok( open(PPD, 'Big-Dummy.ppd'), '  .ppd file generated' );
my $ppd_html;
{ local $/; $ppd_html = <PPD> }
close PPD;
like( $ppd_html, qr{^<SOFTPKG NAME="Big-Dummy" VERSION="0,01,0,0">}m,
   '  <SOFTPKG>' );
like( $ppd_html, qr{^\s*<TITLE>Big-Dummy</TITLE>}m,      '  <TITLE>'   );
like( $ppd_html, qr{^\s*<ABSTRACT>Try "our" hot dog's</ABSTRACT>}m,
   '  <ABSTRACT>');
like( $ppd_html,
  qr{^\s*<AUTHOR>Michael G Schwern &lt;[EMAIL PROTECTED]&gt;</AUTHOR>}m,
   '  <AUTHOR>'  );
like( $ppd_html, qr{^\s*<IMPLEMENTATION>}m,      '  <IMPLEMENTATION>');
like( $ppd_html, qr{^\s*<DEPENDENCY NAME="strict" VERSION="0,0,0,0" />}m,
   '  <DEPENDENCY>' );
like( $ppd_html, qr{^\s*<OS NAME="$Config{osname}" />}m,
   '  <OS>'  );
my $archname = $Config{archname};
$archname .= "-". substr($Config{version},0,3) if $] >= 5.008;
like( $ppd_html, qr{^\s*<ARCHITECTURE NAME="$archname" />}m,
   '  <ARCHITECTURE>');
like( $ppd_html, qr{^\s*<CODEBASE HREF="" />}m,  '  <CODEBASE>');
like( $ppd_html, qr{^\s*</IMPLEMENTATION>}m,     '  </IMPLEMENTATION>');
like( $ppd_html, qr{^\s*</SOFTPKG>}m,            '  </SOFTPKG>');

Let's say the first like() fails.  So you go into the PPD code and fix that.
Rerun the test.  Oh, the second like failed.  Go into the PPD code and fix
that.  Oh, the fifth like failed.  Go into the PPD code and fix that...

Might it be faster and useful to see all the related failures at once?

And then sometimes tests are combinatorial.  A failure of A means one thing
but A + B means another entirely.

Again, let's look at the MakeMaker test to see if files got installed.

ok( -e $files{'dummy.pm'}, '  Dummy.pm installed' );
ok( -e $files{'liar.pm'},  '  Liar.pm installed'  );
ok( -e $files{'program'},  '  program installed'  );
ok( -e $files{'.packlist'},'  packlist created'   );
ok( -e $files{'perllocal.pod'},'  perllocal.pod created' );

If the first test fails, what does that mean?  Well, it could mean...

A)  Only Dummy.pm failed to get installed and it's a special case.
B)  None of the .pm files got installed, but everything else installed ok.
C)  None of the .pm files or the programs got installed, but the
generated files are ok
D)  Nothing got installed and the whole thing is broken.

Each of these things suggests different debugging tactics.  But with a "die on
failure" system they all look exactly the same.


Oooh, and if you're the sort of person that likes to use the debugger it's
jolly great fun to have the test suite just KILL THE PROGRAM when you want to
diagnose a post-failure problem.


There are two usual rebuttals.  The first is "well, just turn off
die-on-failure and rerun the test".  Ovid's system is at least capable of
being turned off, many hard code failure == die.  Unfortunately Ovid's is at
the file level; it should be at the user level since the "do I or do I not
want to see the gobbledygook" decision is more a user preference.

But we all know the problems with the "just rerun the tests" approach...

Re: Dude, where's my diagnostics? (Re: Halting on first test failure)

2008-01-12 Thread Michael G Schwern
Ovid wrote:
 I'll go fix that diagnostic thing now.  Unfortunately, I think I'll
 have to violate encapsulation :(

If you know how to fix it let me know, because other than enumerating each
testing module you might use and lex-wrapping all the functions they export,
I'm not sure how to do it.  Test::Builder could cheat and register each module
as they load Test::Builder, but that relies on their using Test::Builder and
not requiring it.  Or builder() or new() could do the registration, but
there's no guarantee that they'll be called in the right package... however,
it is very likely.

One possibility involves taking advantage of $Level, so at least Test::Builder
knows which is the test function the user called, and then, somehow, inserting
the code necessary to cause failure when that function exits.  I don't know
how you insert code to run when a function that's already being executed exits.

This is why I altered the recommended calling conventions for Test::Builder to
call ->builder at the beginning of each function rather than just use one
global.  Then at least I can use the builder object's DESTROY method to
indicate the end of a test to trigger this stuff.
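
One way to picture it is a little per-call guard object whose DESTROY fires
when the calling function exits (hypothetical names, nothing like this exists
in Test::Builder today):

    package Test::Builder::CallGuard;   # hypothetical

    sub new     { my($class, $on_end) = @_;  return bless { on_end => $on_end }, $class }
    sub DESTROY { $_[0]->{on_end}->() }

    package main;

    # A test function grabs a guard at its top; when the function returns,
    # however it returns, DESTROY fires and that's the safe point to halt
    # on failure.
    sub my_test_function {
        my $guard = Test::Builder::CallGuard->new(
            sub { print "user-called test function finished\n" }
        );
        # ... the real ok()/diag() work would go here ...
    }

    my_test_function();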

There is, of course, a way to eliminate the problem at the source.  Since the
issue is the spewing test output and then having to scroll up to find the
original point of failure, perhaps the solution is not to truncate the output
but to use something better than just raw terminal output.  If only there was
something that could... I don't know... read the TAP and error messages and
produce a nicer output.  Some sort of TAP parser... :P


-- 
Schwern What we learned was if you get confused, grab someone and swing
  them around a few times
-- Life's lessons from square dancing


Re: Dude, where's my diagnostics? (Re: Halting on first test failure)

2008-01-12 Thread Michael G Schwern
The whole idea of halting on first failure was introduced to me by some XUnit
folks.  Their rationale was not to avoid spewing output, they had no such
problem since it's all done via a GUI, but that once one failure has happened
the failing code might hose the environment and all following results are now
considered contaminated.  This might make sense in a laboratory, but it seems
a bit like overkill for day-to-day software testing, throwing out perfectly
fine data.  As any field scientist knows, there's no such thing as
uncontaminated data.

The idea that you can diagnose everything from the first failure reminded me
of a gag about tech support that goes something like this:
http://www.netfunny.com/rhf/jokes/97/Oct/techsupport.html

TECH: Ridge Hall computer assistant; may I help you?

CUST: Yes, well, I'm having trouble with WordPerfect.

TECH: What sort of trouble?

CUST: Well, I was just typing along, and all of a sudden the words went
away.

TECH: Went away?

CUST: They disappeared.

TECH: Hmm. So what does your screen look like now?

CUST: Nothing.

TECH: Nothing?

CUST: It's blank; it won't accept anything when I type.

TECH: Are you still in WordPerfect, or did you get out?

CUST: How do I tell?

TECH: Can you see the C prompt on the screen?

CUST: What's a sea-prompt?

TECH: Never mind. Can you move the cursor around on the screen?

CUST: There isn't any cursor: I told you, it won't accept anything I
type.

TECH: Does your monitor have a power indicator?

CUST: What's a monitor?

TECH: It's the thing with the screen on it that looks like a TV. Does it
have a little light that tells you when it's on?

CUST: I don't know.

TECH: Well, then look on the back of the monitor and find where the power
cord goes into it. Can you see that?

CUST: ...Yes, I think so.

TECH: Great! Follow the cord to the plug, and tell me if it's plugged into
the wall.

CUST: ...Yes, it is.

TECH: When you were behind the monitor, did you notice that there were two
cables plugged into the back of it, not just one?

CUST: No.

TECH: Well, there are. I need you to look back there again and find the
other cable.

CUST: ...Okay, here it is.

TECH: Follow it for me, and tell me if it's plugged securely into the back
of your computer.

CUST: I can't reach.

TECH: Uh huh. Well, can you see if it is?

CUST: No.

TECH: Even if you maybe put your knee on something and lean way over?

CUST: Oh, it's not because I don't have the right angle-it's because it's
dark.

TECH: Dark?

CUST: Yes-the office light is off, and the only light I have is coming in
from the window.

TECH: Well, turn on the office light then.

CUST: I can't.

TECH: No? Why not?

CUST: Because there's a power outage.

TECH: A power... a power outage? Aha! Okay, we've got it licked now.  Do
you still have the boxes and manuals and packing stuff your computer came
in?

CUST: Well, yes, I keep them in the closet.

TECH: Good! Go get them, and unplug your system and pack it up just like
it was when you got it. Then take it back to the store you bought it from.

CUST: Really? Is it that bad?

TECH: Yes, I'm afraid it is.

CUST: Well, all right then, I suppose. What do I tell them?

TECH: Tell them you're too stupid to own a computer.


-- 
94. Crucifixes do not ward off officers, and I should not test that.
-- The 213 Things Skippy Is No Longer Allowed To Do In The U.S. Army
   http://skippyslist.com/?page_id=3


Re: Dude, where's my diagnostics? (Re: Halting on first test failure)

2008-01-12 Thread Michael G Schwern
Ovid wrote:
 --- Michael G Schwern [EMAIL PROTECTED] wrote:
 
 The whole idea of halting on first failure was introduced to me by
 some XUnit
 folks ... As any field scientist knows, there's no such thing as
 uncontaminated data.
 
 As any tester knows, a one size fits all suit often doesn't fit.  Let
 people decide for themselves when a particular method of testing is
 appropriate.  I hate you must halt testing on a failure as much as I
 hate you must not halt testing on failure.  It's not XOR.

When it comes to failure, I like to err on the side of more information.


 There's a certain irony that beginning testers are often told to fix
 the *first* error *first* and subsequent errors go away.  I'm not
 saying this is a silver bullet to solve testing, but sometimes it's
 very useful.  

That's the general idea for dealing with syntax errors, too.

The trick is, you don't know ahead of time whether the information from the
follow on failures will prove to be useful.  Can't tell until you see it.  So
don't freak out over all the subsequent failures, fix the first thing and
re-run is a decent plan, but you can't just ignore them either.


 I am feeling a bit stupid because I can't figure out your conclusion. 
 Humor me.  At times it sounds like you're telling people not to do this
 and at times it sounds like you're telling people it's hard to do with
 Test::Builder :)

Yes, I'm saying both.  I don't like it AND it's appears impossible to do right
with TB.  Though I do still ponder how to make it work anyway.


PS  Couldn't you have the TAP harness kill the test process on first failure?

-- 
24. Must not tell any officer that I am smarter than they are, especially
if it’s true.
-- The 213 Things Skippy Is No Longer Allowed To Do In The U.S. Army
   http://skippyslist.com/?page_id=3


Re: Dude, where's my diagnostics? (Re: Halting on first test failure)

2008-01-12 Thread Michael G Schwern
Aristotle Pagaltzis wrote:
 * Michael G Schwern [EMAIL PROTECTED] [2008-01-12 12:00]:
 Ovid wrote:
 I'll go fix that diagnostic thing now. Unfortunately, I
 think I'll have to violate encapsulation :(
 If you know how to fix it let me know, because other than
 enumerating each testing module you might use and lex-wrapping
 all the functions they export, I'm not sure how to do it.
 
 Set a flag that T::B should quit when the next test result is
 about to be recorded?

I guess it works, but it leaves you dead halfway through another test function
which is weird.


 One possibility involves taking advantage of $Level, so at
 least Test::Builder knows which is the test function the user
 called, and then, somehow, inserting the code necessary to
 cause failure when that function exits. I don't know how you
 insert code to run when a function that's already being
 executed exits.
 
 Load the debugger and set a breakpoint?

Oh, good one.  If the debugger wasn't so damned full of bugs that might just
work as a general solution.


-- 
Robrt:   People can't win
Schwern: No, but they can riot after the game.


Re: Preserving diagnostics when dieing on test failure

2008-01-12 Thread Michael G Schwern
Ovid wrote:
 So we can preserve diagnostics, but we need help in cleaning up those
 damned line numbers.  Hook::LexWrap didn't have the magic I thought it
 would.

ok() is now inside a wrapper so you're one level further down than it thinks.
 Just add one to $Level and then take it back off again afterwards.

  wrap 'Test::Builder::ok',
    pre  => sub {
      $_[0]->{XXX_test_failed} = 0;
      $Test::Builder::Level++;
    },
    post => sub {
      $Test::Builder::Level--;
      $_[0]->{XXX_test_failed} = ![ $_[0]->summary ]->[-1];
    };


 Below is how I did it.  See the 'import' method.  There's a lot more
 work to be done to get fine-grained control, but the line numbers are
 the important bit.

Not everything prints more diagnostics, like ok() itself.

$ perl -wle 'use OurMore "fail", "no_plan";  ok(0);  ok(1);  ok(0);  ok(1)'
not ok 1
#   Failed test at -e line 1.
ok 2
not ok 3
#   Failed test at -e line 1.
ok 4
1..4
# Looks like you failed 2 tests of 4.

But you can probably special case that and fail().

The bigger problem is what happens if a function calls diag() more than once,
like Test::Exception.

$ perl -wle 'use OurMore "no_plan";  throws_ok { die; } qr/foo/;  pass()'
not ok 1 - threw Regexp ((?-xism:foo))
#   Failed test 'threw Regexp ((?-xism:foo))'
#   at -e line 1.
# expecting: Regexp ((?-xism:foo))
# found: Died at -e line 1.
ok 2
1..2
# Looks like you failed 1 test of 2.

$ perl -wle 'use OurMore "fail", "no_plan";  throws_ok { die; } qr/foo/;
pass()'
not ok 1 - threw Regexp ((?-xism:foo))
#   Failed test 'threw Regexp ((?-xism:foo))'
#   at -e line 1.
# expecting: Regexp ((?-xism:foo))
Test failed.  Halting at OurMore.pm line 55.
1..1
# Looks like you failed 1 test of 1.
# Looks like your test died just after 1.

(Note the lack of "found")


-- 
94. Crucifixes do not ward off officers, and I should not test that.
-- The 213 Things Skippy Is No Longer Allowed To Do In The U.S. Army
   http://skippyslist.com/?page_id=3


The spewing problem.

2008-01-12 Thread Michael G Schwern
Paul Johnson wrote:
 This is something that I too have asked for in the past.  I've even
 hacked up my own stuff to do it, though obviously not as elegantly as
 you or Geoff.  Here's my use case.
 
 I have a bunch of tests that generally pass.  I hack something
 fundamental and run my tests.  Loads of them fail.  Diagnostics spew
 over my screen.  Urgh, I say.  Now I could scroll back through them.

When faced with a tough problem it's often useful to go back and check that it's
actually the problem and not a solution posing as a problem.

"Make Test::Builder die on failure" is a solution, and it's not a particularly
good one.  It's hard to implement in Test::Builder and there's all the loss of
information issues I've been yelping about.

The problem I'm hearing over and over again is "Test::Builder is spewing crap
all over my screen and obscuring the first, real failure".  So now that the
problem is clearly stated, how do we solve it without making all that spew
(which can be useful) totally unavailable?


-- 
39. Not allowed to ask for the day off due to religious purposes, on the
basis that the world is going to end, more than once.
-- The 213 Things Skippy Is No Longer Allowed To Do In The U.S. Army
   http://skippyslist.com/?page_id=3


Re: Test::Builder statistics

2008-01-12 Thread Michael G Schwern
Ovid wrote:
 My first attempt at determining the most popular testing modules left
 out Test.pm.  Whoops!  I've fixed that.
 
 Out of almost 60,000 test programs, it turns out Test.pm is used 8,937
 times.  Now that I have a file which lists how many times each test
 module is used, I can start examining my extracted CPAN to determine
 what percentage of modules actually use the Test::Builder framework.

FWIW there's a Test::Builder based emulator for Test.pm called Test::Legacy.
I'm sure you can shim your stuff to load Test::Legacy when Test.pm is asked for.


-- 
184. When operating a military vehicle I may *not* attempt something
 “I saw in a cartoon”.
-- The 213 Things Skippy Is No Longer Allowed To Do In The U.S. Army
   http://skippyslist.com/?page_id=3


Re: The spewing problem.

2008-01-12 Thread Michael G Schwern
Matisse Enzer wrote:
 I just want to be able to run a test suite with a switch that makes the
 entire test run stop after the first failure is reported.

Ok, it's nice to want things, but why do you want it?


-- 
100. Claymore mines are not filled with yummy candy, and it is wrong
 to tell new soldiers that they are.
-- The 213 Things Skippy Is No Longer Allowed To Do In The U.S. Army
   http://skippyslist.com/?page_id=3


Re: The spewing problem.

2008-01-13 Thread Michael G Schwern
Matisse Enzer wrote:
 
 On Jan 12, 2008, at 10:24 PM, Michael G Schwern wrote:
 
 Matisse Enzer wrote:
 I just want to be able to run a test suite with a switch that makes the
 entire test run stop after the first failure is reported.

 Ok, it's nice to want things, but why do you want it?
 
 Almost entirely because when I'm developing on a code base with a large
 test suite I often want to stop the test run as fast as possible when a
 failure occurs - currently I do a control-C but would prefer if I could
 use a switch when I run the tests to have it just stop right after the
 first failure.

Ok, why do you want to stop it as fast as possible when a failure occurs?


-- 
164. There is no such thing as a were-virgin.
-- The 213 Things Skippy Is No Longer Allowed To Do In The U.S. Army
   http://skippyslist.com/?page_id=3


Re: The spewing problem.

2008-01-13 Thread Michael G Schwern
Michael Peters wrote:
 Michael G Schwern wrote:
 
 Ok, why do you want to stop it as fast as possible when a failure occurs?
 
 I have a 45 minute test suite and I want to work on the first failure as soon 
 as
 possible. I also have multiple desktops and am doing other things in another
 desktop, so I want to know as soon as the failure happens so that I can start
 working on it:
 
    make test || echo -e "\a"
 
 Would be nice if that would beep after the first failure instead of after 45
 minutes and the whole thing is done.

I keep digging away at this because I'm looking for a problem other than "I
want to see the first failure".  And that's what I'm hearing from you and from
Matisse and everyone else.  Yours is a little different, it's "I want to be
alerted on first failure".

You see how this is distinct from "halt on first failure"?  It gives me a lot
more room for different solutions that don't involve just cutting off all the
following information.


-- 
Look at me talking when there's science to do.
When I look out there it makes me glad I'm not you.
I've experiments to be run.
There is research to be done
On the people who are still alive.
-- Jonathan Coulton, Still Alive


Re: The spewing problem.

2008-01-13 Thread Michael G Schwern
Adam Kennedy wrote:
 This shouldn't be any more complicated than  -g (where g in my case
 stands for goat as in feinting goat)

Ok, I'll bite.  Why a goat and why is it feinting?


-- 
...they shared one last kiss that left a bitter yet sweet taste in her
mouth--kind of like throwing up after eating a junior mint.
-- Dishonorable Mention, 2005 Bulwer-Lytton Fiction Contest
   by Tami Farmer


Re: What should it's name be?

2008-01-14 Thread Michael G Schwern
Gabor Szabo wrote:
 I know I am a bit late to the party but what about  Test::Anything ?

Rapidly drifting towards Test::Anything::Protocol.


-- 
But there's no sense crying over every mistake.
You just keep on trying till you run out of cake.
-- Jonathan Coulton, Still Alive


Re: A New Test::Builder

2008-01-15 Thread Michael G Schwern

Ovid wrote:

Test::Harness used to be very limited.  We couldn't do a lot with it,
but when we started testing, most of us didn't do a lot with it.  As we
understood more about testing, we understood better many things we
wanted.  As a result, Schwern posted a great plan for rewriting
Test::Harness.  It worked and people are taking advantage of this.

Now we're starting to see more and more limitations with Test::Builder.
 I don't want this to come across as bashing chromatic or Schwern, the
two people who've done most of the great work in writing this and
related code.  They produced a great solution and now that we've had a
chance to use it for a while, we have a better idea of what else we
could use.  Of course, this is what most of programming is like.

Part of this is driven by the new Test::Harness and part of it is
driven by people's real-world needs.  I toss the following out not
because I think everyone will agree with it, but because I think it's a
good starting point.  Maybe someone can create TAP::Builder?

  * Make it subclassable.
  * Allowed deferred plans.
  * Allow for TAP upgrades (YAMLish, YAMLish, YAMLish!).
  * On Fail callbacks?  (I realize lots of people will squawk here)


The irony being that if you have N different backends then you can no longer 
guarantee any common behavior between the two which means all the 
Test::Builder hackery proposed here to add common functionality to all test 
modules gets harder, not easier.  On the flip side, some of it becomes possible.


There's the more important question of what *must* remain in common so that 
modules written with either system can still work together in the same test 
process.  That being the plan, the test counter and any end-of-test behaviors 
have to be coordinated which means at minimum they need an object in common to 
register whether there's already a plan and what it is, what the current test 
# is and so on.


And then there's what optionally should be coordinated.  This is stuff like 
what filehandles to output to, historical test data (am I passing?), are we 
skipping or todo'ing, are we using test numbers and so on.  Global behaviors.



--
If at first you don't succeed--you fail.
-- Portal demo


Re: BAIL_OUT and parallel tests

2008-01-17 Thread Michael G Schwern

Ovid wrote:

What should parallel tests do if a BAIL_OUT is encountered?  I think
all parallel tests currently running should be allowed to finish so
they can attempt to cleanup, but no more tests should be started.  Does
this sound reasonable?


It's not entirely clear if "bail out" means "kill the testing process" or "let
the process finish but don't start any more".  If it's the former, then just 
kill everything.  If it's the latter, let them all finish and don't start any 
more.



--
THIS I COMMAND!


Re: is_deeply and qr// content on 5.11

2008-01-18 Thread Michael G Schwern

Ian Malpass wrote:
I got a failure message from CPAN testers for Pod::Extract::URI for 
5.11.0 patch 33001 on Linux 2.6.22-3-amd64 
(x86_64-linux-thread-multi-ld)[0]


The failures were where I was testing to see if an arrayref of qr// 
patterns was the array I was expecting. is_deeply() has heretofore 
worked perfectly well.


The test line in question is:

   is_deeply( $peu->stop_uris, [ qr/foo/ ] );

And the failure is:

#   Failed test at t/new.t line 42.
# Structures begin differing at:
#  $got->[0] = (?i-xsm:foo)
# $expected->[0] = (?-xism:foo)
# Looks like you failed 1 test of 24.

So, is this 5.11 brokenness, is_deeply() brokenness, or are my tests 
naive (or wrong)?


(?i-xsm:foo) is equivalent to qr/foo/i.  (?-xism:foo) is qr/foo/.  They are 
not equivalent.


So it's possible that 5.11 fixed/broke something, but it's inside stop_uris(). 
 There have been issues with how is_deeply() compares regexes in the past 
(the ordering of the xism operators, for example) but in this case it appears 
to be working.
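
A quick way to see the distinction for yourself, outside of stop_uris() (just
an illustration, nothing 5.11-specific):

    use Test::More tests => 2;

    # A compiled regex stringifies with its flags, so /i and no-/i differ.
    is_deeply( [ qr/foo/  ], [ qr/foo/ ], 'same flags compare equal' );
    is_deeply( [ qr/foo/i ], [ qr/foo/ ], 'different flags do not' );  # fails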



--
On error resume stupid


Re: #50432: Oslo Perl QA Hackaton Grant Application

2008-02-01 Thread Michael G Schwern

Ovid wrote:

Crap.  Can we just forget I sent that to Perl QA instead of the Grant
Committee?


/me puts on his sunglasses.
/me pulls out a black device.

Now if everyone on perl-qa will please look this way...

*FLASH!*


--
You are wicked and wrong to have broken inside and peeked at the
implementation and then relied upon it.
-- tchrist in [EMAIL PROTECTED]


Re: expanding the cpan script, and Module

2008-02-11 Thread Michael G Schwern

Andrew Hampe wrote:

The Basic CPAN concern: --bail_on_fail flag (2008.02.10 )

Problem description:
when a cpan session is looking for more than one distribution/module
there needs to be a way to 'flag' that the session must fail and stop
if there is an error loading any distribution, or a sub component 
required module.


To be clear, you mean like if you put in:

   $ cpan Foo::Bar Bar::Baz

and Foo::Bar fails to install you want it to stop and not continue on to 
Bar::Baz?



Is there anyone working on such a flag?

Would a Patch Be Acceptable?


I believe you want to send this along to the CPAN bug tracker.  perl-qa is for 
quality assurance (testing) issues.

http://rt.cpan.org/NoAuth/Bugs.html?Dist=CPAN

But it's trivial to do yourself.

use CPAN;

for my $name (@ARGV) {
    my $mod = CPAN::Shell->expand("Module", $name);

    unless( $mod ) {
        warn "Unknown module $name.  Aborting.\n";
        last;
    }

    next if $mod->uptodate;  # already up to date

    unless( CPAN::Shell->install($mod) ) {
        warn "Installing $name failed.  Aborting.\n";
        last;
    }
}


--
101. I am not allowed to mount a bayonet on a crew-served weapon.
-- The 213 Things Skippy Is No Longer Allowed To Do In The U.S. Army
   http://skippyslist.com/list/


Re: New assertions in Ruby

2008-02-12 Thread Michael G Schwern

chromatic wrote:

On Tuesday 12 February 2008 10:55:21 chromatic wrote:


On Tuesday 12 February 2008 10:06:14 Eric Wilhelm wrote:

How will you print the assertion code without a source filter?

Show Source on Exception is fairly easy:

http://www.oreillynet.com/onlamp/blog/2007/10/adding_show_source_to_perl_exc.html

Making that work with anonymous functions is trickier, but doable.


Of course, in the case of Test::Exception, the assertion functions already 
have the code reference that might throw an exception, in which case the 
problem isn't even tricky.


Data::Dump::Streamer can decompile a code reference, complete with attached 
lexicals.  But as has been pointed out by Yuval, the real trick is to show the 
value of all variables used in the block.


It's an interesting idea, but at the end of the article is the Fine Print 
about how assert {2.0} doesn't quite work:


---
When an assertion passes, Ruby only evaluates it once. However, when an 
assertion fails, the module RubyNodeReflector will re-evaluate each element in 
your block. (You knew there was a “gotcha”, right?;) This effect will hammer 
your side-effects, and will disable boolean short-circuiting. So once again 
sloppy developer tests help inspire us to write clean and decoupled code!



What he puts down to sloppiness I call flexibility.  I wouldn't want to 
restrict all tests to only those without side-effects, too limiting.  You all 
know I'm a stickler for allowing any possible test.


Consider a simple test of post increment.

i = 0;  # deliberately set wrong to cause a failure
assert { 1 == i++ }

When that fails, because it re-evaluates each element in the code block, you 
will see something like this:


assert{ 1 == (i++) } -- false - should pass
i   == 1

That's just my supposition; I can't get assert2 to run.  Ruby can't find the 
installed gem. :(



--
Defender of Lexical Encapsulation


Re: Diagnostics

2008-02-12 Thread Michael G Schwern

David Landgren wrote:

I wish you'd s/Got/Actual/ or Received. Got must die.


Why's that?


--
Hating the web since 1994.


Re: New assertions in Ruby

2008-02-13 Thread Michael G Schwern

Adrian Howard wrote:

which isn't _too_ shabby, but doesn't help much with things like:

ok_if { Foo->new->answer == 42 };

or

ok_if { $Some_dynamic_var == 42 };

So I don't really think it's worth pursuing.


Well, if we follow the logic of the assert2 author, you're just being SLOPPY 
using methods with side effects in a test.  All tests should be functional 
doncha know?



--
29. The Irish MPs are not after “Me frosted lucky charms”.
-- The 213 Things Skippy Is No Longer Allowed To Do In The U.S. Army
   http://skippyslist.com/list/


Wide character support for Test::More

2008-02-23 Thread Michael G Schwern
I just merged together a number of tickets having to do with Test::More not 
liking wide characters.


use 5.008;
use strict;
use warnings;
use Test::More tests => 1;

my $uni = "\x{11e}";

ok( $uni eq $uni, "Testing $uni" );

__END__
1..1
Wide character in print at lib/Test/Builder.pm line 1252.
ok 1 - Testing Ğ


I know almost nothing about Unicode.  How do I make this Just Work?  Is it 
safe to just set binmode to always be ':utf8' if perl >= 5.8?
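
For reference, the workaround people apply from the test script side looks
roughly like this (a sketch using Test::Builder's output(), failure_output()
and todo_output() accessors, not the fix under discussion):

use Test::More tests => 1;

# Band-aid: put a :utf8 layer on the handles Test::Builder actually
# prints to, so wide characters don't trigger the warning.
my $builder = Test::More->builder;
binmode $builder->output,         ':utf8';
binmode $builder->failure_output, ':utf8';
binmode $builder->todo_output,    ':utf8';

my $uni = "\x{11e}";
ok( $uni eq $uni, "Testing $uni" );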



--
...they shared one last kiss that left a bitter yet sweet taste in her
mouth--kind of like throwing up after eating a junior mint.
-- Dishonorable Mention, 2005 Bulwer-Lytton Fiction Contest
   by Tami Farmer



Re: Wide character support for Test::More

2008-02-24 Thread Michael G Schwern

Aristotle Pagaltzis wrote:

use 5.008;
use strict;
use warnings;

  use open ':std', ':locale';

use Test::More tests => 1;

my $uni = "\x{11e}";

ok( $uni eq $uni, "Testing $uni" );

__END__
1..1
Wide character in print at lib/Test/Builder.pm line 1252.

  ^^ after the above patch, gone


There's the rub, it doesn't go away.

Test::Builder dups STDERR and STDOUT so you can mess with them to your 
heart's content and still get testing done.  File I/O disciplines don't appear 
to be copied across dups.  That's what everyone was complaining about: they 
had to manually apply layers to Test::Builder's own handles.


It appears I have to manually copy the layers across, ok.

sub _copy_io_layers {
    my($self, $src, $dest) = @_;

    $self->_try(sub {
        require PerlIO;
        my @layers = PerlIO::get_layers($src);

        binmode $dest, join(" ", map ":$_", @layers) if @layers;
    });
}

That does it.  Thank you for playing software confessional. :)


--
The past has a vote, but not a veto.
-- Mordecai M. Kaplan


Re: Is there even a C compiler?

2008-02-25 Thread Michael G Schwern

Andy Armstrong wrote:
Is there a generally approved way for an XS module to test for the 
existence of a C compiler before attempting to build?


MakeMaker uses ExtUtils::CBuilder->have_compiler() in its tests.  It's worked 
well with no complaints.  It's an additional testing dependency, but it's a 
useful one and Module::Build will eventually suck it in anyway.
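
Roughly, the guard looks like this (a sketch, not MakeMaker's actual test code):

use Test::More;
use ExtUtils::CBuilder;

# Skip the whole test file when there's no working C compiler around.
if( ExtUtils::CBuilder->new( quiet => 1 )->have_compiler ) {
    plan tests => 1;
}
else {
    plan skip_all => "no C compiler available";
}

ok( 1, "compiler found, carry on with the compile tests" );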



--
THIS I COMMAND!


Re: Is there even a C compiler?

2008-02-26 Thread Michael G Schwern

Yitzchak Scott-Thoennes wrote:

On Mon, Feb 25, 2008 at 03:59:37PM -0800, Michael G Schwern wrote:
MakeMaker uses ExtUtils::CBuilder->have_compiler() in its tests.  It's 
worked well with no complaints.  It's an additional testing dependency, 
but it's a useful one and Module::Build will eventually suck it in 
anyway.


Not sure what you mean by that, but M::B recommends, not requires,
ExtUtils::CBuilder, and has for a long time.


I mean that CBuilder isn't exactly a big hassle to have as a dependency and 
other things need it anyway.



--
Robrt:   People can't win
Schwern: No, but they can riot after the game.


More information in NAs (was Re: CPANTesters considered harmful)

2008-03-03 Thread Michael G Schwern

demerphq wrote:

On 03/03/2008, David Golden [EMAIL PROTECTED] wrote:

On Mon, Mar 3, 2008 at 6:57 AM, demerphq [EMAIL PROTECTED] wrote:
   IMO if an NA result comes in without email contact details and without
   an explanation for the NA then the result should not be aggregated
   against the module.


The email contact details are there, just suppressed by the NNTP web
 gateway to avoid email harvesting by spambots.  If you have a real
 NNTP client, you'll see the email.  Also, see Google Groups (though
 you have to solve a captcha to reveal the email):

 
http://groups.google.com/group/perl.cpan.testers/browse_thread/thread/f67ccb5a66aed2e/ffa37628e76a42e5?lnk=gstq=NA+ExtUtils-Install#ffa37628e76a42e5


This information would be useful to display on CpanTesters itself. The
point is I saw NA's that were inexplicable to me, and found no further
useful information.


It would be nice if NA's included the reason for it being an NA, that being 
the full Makefile/Build.PL output just like if it failed.  I don't see any 
harm in that and it would help identify accidental NAs.


Also it would be nice if an NA came with a soothing explanation for the 
author.  More than one rookie CPAN author has asked me "Oh god, what's this NA 
thing mean?  How do I get rid of it?!"


While I'm on the subject, a link to an author FAQ about CPAN Testers in the 
mail would be handy.



--
52. Not allowed to yell “Take that Cobra” at the rifle range.
-- The 213 Things Skippy Is No Longer Allowed To Do In The U.S. Army
   http://skippyslist.com/list/


Re: CPANTesters considered harmful

2008-03-03 Thread Michael G Schwern

Nicholas Clark wrote:

On Mon, Mar 03, 2008 at 02:19:23PM +, Smylers wrote:

demerphq writes:


It turned out the problem is that when the tests are run as root it seems to
be not possible to create a directory that is not writeable by root.

I think that can be reduced to: It isn't possible to create a directory
that is not writeable by root.  The whole point of root is that as the
super-user it can do anything!


I'm not really sure how to tackle this better than simply skipping the
tests as root, which is what the most recent release does.

That's plausible.  It could also temporarily drop privileges to be some
other user for running that test, but I don't know how you'd work out
which user to do it as.


My guess would be nobody if that user exists, else give up.

But I agree that skipping is better, because the tests run as non-root
already prove that the module's functionality worked. Adding a lot
of complex logic to the test to swap user when running as root would
actually make the test as much a test of the user ID swapping code,
and introduce code that isn't usually tested, and generally introduce
fragility and cause false positive failures.


FWIW I do this a lot:

chmod 0444, 'some_file';

SKIP: {
    skip("cannot write readonly files", 1) if -w 'some_file';

    ...
}

The important thing is that I'm not checking if I'm root but directly checking 
if the necessary condition exists, in this case an unwritable file.


You could attempt to downgrade permissions; switching to "nobody" is as good a 
guess as anything else, but realize it might affect the ability to read files, 
access directories and load modules for the rest of the test.  Not everyone 
sets o+rx.
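
For completeness, here's roughly what that downgrade could look like (a sketch
only, with all the caveats above; it assumes a "nobody" user exists):

use Test::More tests => 1;

chmod 0444, 'some_file';    # same placeholder file as above

SKIP: {
    skip "only interesting when running as root", 1 unless $> == 0;

    my($nobody_uid, $nobody_gid) = (getpwnam 'nobody')[2, 3];
    skip "no 'nobody' user to drop privileges to", 1
        unless defined $nobody_uid;

    # Drop the effective group first, then the effective user; local()
    # restores both to root when the block ends.
    local $) = "$nobody_gid $nobody_gid";
    local $> = $nobody_uid;

    ok( !-w 'some_file', "read-only file really is unwritable now" );
}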



--
3. Not allowed to threaten anyone with black magic.
-- The 213 Things Skippy Is No Longer Allowed To Do In The U.S. Army
   http://skippyslist.com/list/


Re: More information in NAs (was Re: CPANTesters considered harmful)

2008-03-03 Thread Michael G Schwern

David Golden wrote:

On Mon, Mar 3, 2008 at 11:45 AM, Michael G Schwern [EMAIL PROTECTED] wrote:

 It would be nice if NA's included the reason for it being an NA, that being
 the full Makefile/Build.PL output just like if it failed.  I don't see any
 harm in that and it would help identify accidental NAs.


There is only supposed to be one reason for NA -- Perl or platform not
supported.  Anything else is a bug in the reporting software.

See http://cpantest.grango.org/wiki/Reports

For reference, CPAN::Reporter detects NA in any of three ways:

* explicit check for unsatisfied 'perl' in 'requires' section of prerequisites
* parsing for "OS Unsupported" or "No support for OS"
* parsing for error messages from code like "use 5.008" or from "our"
being used in $VERSION strings prior to 5.005


It's that last one that concerns me; it's a bit heuristic-y and I've seen 
things be declared NA that should have alerted the author to a backwards 
compat problem.




CPAN::Reporter also includes PL output for tests that fail or are NA
in the PL stage.  CPANPLUS (which Chris uses) does not -- or at least
not by default as far as I know.


Ahh, I see.


--
Reality is that which, when you stop believing in it, doesn't go away.
-- Phillip K. Dick


Test-Simple 0.77 fixage

2008-03-03 Thread Michael G Schwern

I'm coining a new term, fixage, like breakage.

Fixage is when software fixes a bug and reveals bugs in dependent software.

Test-Simple 0.77 (which includes Test::More) fixed a long standing bug by 
removing the annoying global $SIG{__DIE__} handler to trap test death.  It 
would swallow the real exit code of a test.


This code used to pass:

use Test::More tests => 1;
pass();
exit 1;

Whereas now it will properly exit with 1, which is a failure, and the 
appropriate "Looks like your test died" message.


So far there's only been one revealed failure, that's in POE, but I figured 
I'd let folks know just in case.



--
On error resume stupid



Re: More information in NAs (was Re: CPANTesters considered harmful)

2008-03-03 Thread Michael G Schwern

David Golden wrote:

On Mon, Mar 3, 2008 at 2:04 PM, Michael G Schwern [EMAIL PROTECTED] wrote:

  * parsing for error messages from code like "use 5.008" or from "our"
  being used in $VERSION strings prior to 5.005

 It's that last one that concerns me, it's a bit heuristicy and I've been
 things be declared NA that should have alerted the author to a backwards
 compat problem.


Back before you declared 5.005 to be dead, Slaven Rezic created a lot
of chaos with FAIL reports from 5.005_05 when Makefile.PL or Build.PL
didn't have "use 5.006" and then ExtUtils::MakeMaker or Module::Build
tried to eval an "our $VERSION" line from a .pm file.


Don't get me wrong, I think the heuristics are fine.  That was in reference to 
why I like to see the details of an NA.



--
Ahh email, my old friend.  Do you know that revenge is a dish that is best
served cold?  And it is very cold on the Internet!


Re: Test-Simple 0.77 fixage

2008-03-03 Thread Michael G Schwern

chromatic wrote:

On Monday 03 March 2008 11:20:54 Michael G Schwern wrote:


Fixage is when software fixes a bug and reveals bugs in dependent
software.

Test-Simple 0.77 (which includes Test::More) fixed a long standing bug by
removing the annoying global $SIG{__DIE__} handler to trap test death.


Having imposed fixage on the world myself, let me recommend that you run 
*away* from villagers with pitchforks and torches rather than trying to 
reason with them.  They don't want to fix the bugs in their code.


Shame they brought a pitchfork to a gun fight.
http://schwern.org/~schwern/img/me/19_brian_mike_gary_shooting.jpg


--
164. There is no such thing as a were-virgin.
-- The 213 Things Skippy Is No Longer Allowed To Do In The U.S. Army
   http://skippyslist.com/list/


IEEE Testing Conference in Lillehammer

2008-03-03 Thread Michael G Schwern
The First International Conference on Software Testing, Verification and 
Validation is happening in Lillehammer, Norway April 9 - 11.  The conference 
proper is April 10th and 11th, just after Go Open and the Oslo QA Hackathon.


I think this is a good chance for Perl QA to crash the IEEE and get Perl 
talking to the rest of the software industry.  I'm sure we could show them a 
thing or two and vice-versa.


The cost of the conference is $1000, a bit much for a lark.  I don't know how 
much of a hallway track they will have; I don't think an IEEE conference will 
be quite as free-wheeling as OSCON.  So I would like to try to convince them 
to give us complimentary passes on the weight of our broad QA experience and, 
hey, we just flew in from all around the world.  The more of us intending to 
go, the better chance we have.


So please announce your intention on the wiki:
http://perl-qa.hexten.net/wiki/index.php/Oslo_QA_Hackathon_2008_:_Travel#IEEE_Testing_Conference_in_Lillehammer


--
'All anyone gets in a mirror is themselves,' she said. 'But what you
gets in a good gumbo is everything.'
-- Witches Abroad by Terry Pratchett


Re: TAP::Harness / CPAN problem

2008-03-08 Thread Michael G Schwern

Chris Dolan wrote:

On Mar 8, 2008, at 11:59 AM, Andy Armstrong wrote:


On 8 Mar 2008, at 17:54, Chris Dolan wrote:


  Perl 5.8.6 (Apple's dist for OSX 10.4)
  Test::Harness 3.10
  TAP::Harness 0.54
  TAP::Parser 0.54
  CPAN 1.9205
  CPANPLUS 0.82


Yeah, you have a mixture of Test::Harness and TAP::Parser installed. 
You need to delete those old versions of TAP::Harness and TAP::Parser. 
TAP::* and Test::Harness should be the same versions.


Arg, not again!  PREFIX vs. --install_base vs. vendorlib vs. sitelib vs. 
Apple vs. Fink


Either use INSTALL_BASE with --install_base or PREFIX with --prefix but don't 
mix them.



--
On error resume stupid


Re: [tap-l] SKIP_ALL tests should not get hidden

2008-03-10 Thread Michael G Schwern

Andy Armstrong wrote:

On 20 Nov 2007, at 23:39, Michael G Schwern wrote:

Do we like that?


Test::Harness 2 put it on its own line mostly to avoid wrapping off the right 
side of the screen.  I still lean in that direction.


Hmm. I'm kind of hooked on the new behaviour now. It puts a summary 
column right where I can find it.


Maybe this is one of those 80 columns vs 120 columns things, but let's compare.

Here's the Test::More suite with TH 2.64.

t/00test_harness_check..ok
t/bad_plan..ok
t/bail_out..ok
t/BEGIN_require_ok..ok
t/BEGIN_use_ok..ok
t/bufferok
t/Builder...ok
t/carp..ok
t/circular_data.ok
t/cmp_okok
t/createok
t/curr_test.ok
t/details...ok
t/diag..ok
t/dont_overwrite_die_handlerok
t/eq_setok
t/exit..ok
t/extra.ok
t/extra_one.ok
t/fail-like.ok
t/fail-more.ok
t/fail..ok
t/fail_one..ok
t/filehandles...ok
t/fork..ok
t/harness_activeok
t/has_plan..ok
t/has_plan2.ok
t/importok
t/is_deeply_dne_bug.ok
t/is_deeply_failok
t/is_deeply_with_threadsskipped
all skipped: many perls have broken threads.  Enable with 
AUTHOR_TESTING.
t/is_fh.ok
t/maybe_regex...ok
3/16 skipped: various reasons
t/missing...ok
t/More..ok
t/no_diag...ok
t/no_ending.ok
t/no_header.ok
t/no_plan...ok
t/ok_objok
t/outputok
t/overload..ok
t/overload_threads..ok
1/5 skipped: various reasons
t/plan..ok
t/plan_bad..ok
t/plan_is_noplanok
t/plan_no_plan..ok
1/6 skipped: various reasons
t/plan_shouldnt_import..ok
t/plan_skip_all.skipped
all skipped: Just testing plan  skip_all
t/pod-coverage..ok
t/pod...ok
t/require_okok
t/reset.ok
t/simpleok
t/skip..ok
8/17 skipped: various reasons
t/skipall...ok
t/straysskipped
all skipped: not completed
t/tbm_doesnt_set_exported_took
t/tbt_01basic...ok
t/tbt_02fhrestore...ok
t/tbt_03die.ok
t/tbt_04line_numok
t/tbt_05faildiagok
t/tbt_06errormess...ok
t/tbt_07argsok
t/thread_taint..ok
t/threads...ok
t/todo..ok
t/try...ok
t/undef.ok
t/use_okok
t/useingok
t/utf8..ok
2/5 skipped: various reasons
All tests successful, 3 tests and 15 subtests skipped.


And here it is with 3.10.

t/00test_harness_checkok
t/bad_planok
t/bail_outok
t/BEGIN_require_okok
t/BEGIN_use_okok
t/buffer..ok
t/Builder.ok
t/carpok
t/circular_data...ok
t/cmp_ok..ok
t/create..ok
t/curr_test...ok
t/details.ok
t/diagok
t/dont_overwrite_die_handler..ok
t/eq_set..ok
t/exitok
t/extra...ok
t/extra_one...ok
t/fail-like...ok
t/fail-more...ok
t/failok
t/fail_oneok
t/filehandles.ok
t/forkok
t/harness_active..ok
t/has_planok
t/has_plan2...ok
t/import..ok
t/is_deeply_dne_bug...ok
t/is_deeply_fail..ok
t/is_deeply_with_threads..skipped: many perls have broken threads. 
Enable with AUTHOR_TESTING.

t/is_fh...ok
t/maybe_regex.ok
t/missing.ok
t/Moreok
t/no_diag.ok
t/no_ending...ok
t/no_header

Re: ExtUtils::FakeMaker 0.001 uploaded

2008-03-13 Thread Michael G Schwern

Ricardo SIGNES wrote:

That's all!  I hope someone else finds it useful.


FWIW we were just talking about the issue of generating modules from static 
files (rather than the other way around like we do now) at PDX.pm.



--
Look at me talking when there's science to do.
When I look out there it makes me glad I'm not you.
I've experiments to be run.
There is research to be done
On the people who are still alive.
-- Jonathan Coulton, Still Alive


Re: Why should package declaration match filename?

2008-03-14 Thread Michael G Schwern

Matisse Enzer wrote:
I'm discussing some potential refactorings at $work and wanted to give an 
articulate explanation of the benefits of having package declarations 
match file names, so that:


   # file is Foo/bar.pm
   package Foo::Bar;


That was probably a typo, but I hope you mean Foo/Bar.pm.  Getting the cases 
wrong will bite you even on case insensitive filesystems.  Here's a classic:


$ perl -wle 'use Strict;  $foo = 42;  print $foo'
42

Look ma, I'm using strict!  No, it's not.  "use Strict" is the equivalent of 
C<< require Strict;  Strict->import if Strict->can("import") >>.  This will 
load Strict.pm, and since the filesystem is case insensitive it will work, and 
then call Strict->import.  Class methods are not case insensitive, so the 
import won't be found and nothing will happen.



One reason is so that when you see a package statement, you know what 
the corresponding use statement would be, and when you see a use 
statement, you know what the corresponding package is, and have a good 
clue about the path of the file you are importing.


What are other good reasons to have package declarations match file paths?


Eric already covered the import() issue.

There's also the principle of least surprise.  How do you load class Foo::Bar? 
 "use Foo::Bar".  How do you get the docs?  "perldoc Foo::Bar".  Where do I 
find the code for Foo::Bar?  In Foo/Bar.pm.  How do you inherit from it?  "use 
base qw(Foo::Bar)".  Simple, no investigations or local knowledge necessary.


As a counter example, consider Tie::StdHandle.  It makes a tied handle work 
like a regular file handle.  Very handy.  How do you use it?  C<< use base 
qw(Tie::StdHandle) >> right?  Wrong, it's in Tie/Handle.pm, of course. [1]  So 
now you have to invoke it manually.


BEGIN {
    require Tie::Handle;
    our @ISA = qw(Tie::StdHandle);
}

As is often the case with irregular code, Tie::StdHandle has no documentation.


[1] 5.10.0 finally moved it to its own file.

--
Robrt:   People can't win
Schwern: No, but they can riot after the game.


Re: Why should package declaration match filename?

2008-03-15 Thread Michael G Schwern

Dave Rolsky wrote:
There's a lot of value in following the existing best practices of the 
Perl community as a whole. For one thing, it means you can hire people 
with Perl experience and they can bring that experience to bear on 
your application.


If you insist on reinventing every wheel, you've basically created your 
own in-house Perl-like dialect. It _looks_ sorta like Perl, but it's not 
Perl.


I agree.  At my last full-time corporate job in particular, they did almost 
everything in-house, duplicating a lot of CPAN functionality.  Their code was 
often well done, but it meant all my years of Perl and CPAN experience were 
dampened because I had to learn their way of doing everything.  Including, 
ironically, testing.



--
Hating the web since 1994.


Patent for Software Package Verification nothing to worry about

2008-03-20 Thread Michael G Schwern
About two years ago several people came upon this patent granted to Sun, 
EP1170667 - Software Package Verification

http://gauss.ffii.org/PatentView/EP1170667

Its US equivalent is 7080357
http://www.google.com/patents?vid=USPAT7080357

There was some concern this might conflict with TAP and Test::Harness since on 
the surface it looked awfully generic.  Executive summary:  It's not.  Don't 
worry.


I finally got an opportunity to sit down with a patent lawyer familiar with 
software, John Anderton of patentforge.com.  His reading was that it is very 
specific to a particular process and contains very few weasel words that 
might try to expand on it.  The process mostly has to do with prioritizing 
tests and identifying those which are active.  Furthermore, the patent 
specifies a very specific set of things which it's testing [0056] without 
any wording like "this list is not exhaustive".


Furthermore, the patent was rejected several times by the patent office and 
was extensively rewritten each time.


His opinion was that it's likely a purely defensive patent on the part of Sun 
and that Sun has a very good track record with regard to not abusing patents.


Finally, should it turn out that it is in conflict with Test::Harness the 
patent claim only goes back to 2000 while Test::Harness, under various names, 
goes back to 1988.  Busting the patent would not require a court case but a 
simple appeal to the patent office.


One of the things he explained was that in order to infringe on a patent your 
device must encompass *every* one of the qualities but only one of the claims.


As far as defending against future patent claims of this nature, one way to 
deal with it is to publish techniques as dated documents.  This can take 
many forms; shipping code is one.  For the purposes of proving prior art to a 
patent examiner, a simple human-readable document describing the technique is 
best.  A white paper, for example.  Just write it and post it somewhere, on a 
mailing list or even just a web page; the Internet Archive will pick 
it up.  Documentation about techniques is good for all, patents or no.


More formal mechanisms include putting in your own patent claim.  It doesn't 
have to be accepted, but then it will be on file as prior art.  This, 
unfortunately, is expensive.  There is a method where you can file your 
invention publicly but not patent it; however, that costs on the order of 
$500-$1000.



--
...they shared one last kiss that left a bitter yet sweet taste in her
mouth--kind of like throwing up after eating a junior mint.
-- Dishonorable Mention, 2005 Bulwer-Lytton Fiction Contest
   by Tami Farmer



Hackathon logistics

2008-03-25 Thread Michael G Schwern
A few logistical items that I'd like to make sure are being taken care of for 
the hackathon.  The idea is to work this out now to maximize our on-site 
hacking time.


I don't know what the status of this is, but here's what I can think of off 
the top of my head.



*) Access for wirelessless laptops?

Somebody always shows up from 1996 with a laptop that doesn't have wireless. 
Either a wired hub is made available or USB wireless widgets.


* If your laptop doesn't have wireless, please fix that.  Almost everything 
can talk to USB wireless.  If for some reason you absolutely cannot go 
wireless, please bring some long cables and a hub.


* If you have a spare USB wireless thing, please bring it.


*) Hackathon repository?

Well-established projects have their own repos, but it's always handy to have 
a repository for any new projects started at the hackathon and for everyone to 
already have access.


* Do we have one?  Andy, can we use hex-ten?
* A checkin notify list should also be ready.


*) Whiteboards, markers  erasers.

Lots of whiteboards for taking notes.  At least one whiteboard just for 
projects being worked on, the grid at BarCamps is an example.



*) Index cards  pens

Both for taking notes and for anyone wanting to do XP.


*) Wiki

We have the perl-qa wiki.

* Please resolve any issues you have with accessing/editing it now.


*) Mailing list

We'll just use the perl-qa list.


*) Real-time comms.

We have #perl-qa on irc.perl.org.  It might make sense to have a twitter 
account for broadcasts and cell phone messaging.



*) Caffeine and snacks

Snacks covering both the junk food and non-junk food kind.  The latter being 
anything not covered in sugar and salt. :)  Worse comes to worse, some sort of 
trail mix that's not covered in sugar and salt.



*) Food plans

A short list of convenient places to order food from.  Also easy to get to 
places that can accommodate all of us.  They must deal with varied dietary 
requirements (veggie, no-dairy?, no-wheat?, kosher? etc...).  Have at least 
pizza and chinese as everyone knows how to deal with that.


This way we can just make food appear without having to spend time on what 
everyone wants and needs.


* Please list your dietary requirements/preferences here.
http://perl-qa.hexten.net/wiki/index.php/Oslo_QA_Hackathon_2008_:_Food#Dietary_requirements

* Locals, please list places to get food from here:
http://perl-qa.hexten.net/wiki/index.php/Oslo_QA_Hackathon_2008_:_Food#Places_To_Get_Food


*) Water  juice

For drinking something that's not brown and fizzy or brown and foamy.


*) Boiling hot water

A personal request, for tea.


--
I have a date with some giant cartoon robots and booze.



Re: My Perl QA Hackathon Wishlist

2008-03-25 Thread Michael G Schwern

Gergely Brautigam wrote:

One last question then I swear I will shut up :)

Why use perl for testing? Of course all other languages are used for testing 
this and that.. What makes perl excel for testing? Obviously it has powerful 
regex and data-handling capabilities... But besides that.. Why would anyone 
want to use perl? :)


Something that hasn't been explicitly pointed out yet is that Perl uses TAP, 
the Test Anything Protocol.  In most other testing systems the test script 
also determines whether the test passes or fails and it also displays that 
fact.  This means the thing you're testing is tied to the testing system.


TAP instead follows the Unix philosophy of pipes.  A harness (usually 
Test::Harness) runs a series of programs which output TAP, a simple textual 
protocol.


1..3
ok 1
ok 2
not ok 3

That is, I'm going to run 3 tests.  The first one passed.  The second one 
passed.  The third one failed.  The harness parses this and displays the results.
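
For example, a Perl test script that emits a stream like that is nothing
more than:

use Test::More tests => 3;

# This script's only job is to print TAP on STDOUT; any TAP-speaking
# harness, in any language, can run it and interpret the results.
ok( 1,          "first test"  );
ok( 2 + 2 == 4, "second test" );
ok( 0,          "third test fails on purpose" );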


This separation means you have maximum flexibility in how you write your 
tests.  You're not locked down to one set of test functions, you can import 
piles and piles from CPAN and even write your own.  You can even get XUnit 
style test methods (see Test::Class).


With Test::Harness 3 you can define special behaviors for various test files.
The power of this is that the test scripts can be written in anything; they 
don't have to be Perl.  There are TAP libraries in several languages, and 
because the basic protocol is so simple they're simple to implement.  I've seen 
places write TAP tests in C, shell, PHP and Java in the same test suite.  Even 
running their .html files as tests by instructing the harness to run them 
through an HTML syntax validator.
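
The per-file behaviors can be wired up through TAP::Harness's exec callback.
A hand-wavy sketch (the validate-html-tap command is made up for illustration):

use TAP::Harness;

# Decide per file how to produce TAP: .t files run under perl, .html files
# get handed to a (hypothetical) TAP-emitting HTML validator.
my $harness = TAP::Harness->new({
    exec => sub {
        my($harness, $file) = @_;
        return [ $^X, $file ]                 if $file =~ /\.t$/;
        return [ 'validate-html-tap', $file ] if $file =~ /\.html$/;
        return;    # undef means "use the default behaviour"
    },
});
$harness->runtests( glob("t/*.t"), glob("t/*.html") );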


The downside is there are no pretty GUIs for TAP, but the potential exists. 
It's a simple matter of programming.  The upside is that when a TAP GUI is 
created it will work with all existing TAP tests.



--
191. Our Humvees cannot be assembled into a giant battle-robot.
-- The 213 Things Skippy Is No Longer Allowed To Do In The U.S. Army
   http://skippyslist.com/list/


Re: model-based testing

2008-03-26 Thread Michael G Schwern

[EMAIL PROTECTED] wrote:

Hi *,

are there any Perl modules for model-based testing [1]? Are there any
talks about model-based testing with Perl?

Cheers,
Renee

[1] http://en.wikipedia.org/wiki/Model-based_testing


Never heard of it.  It smells a little bit like a further extension of FIT 
testing in the sense of being able to write tests without having to write 
code.  From skimming a few articles it looks like it's closely tied to finite 
state machines.  Maybe it would be applicable to things like CGI::Application 
and other perl modules which already have clear states?


Could you give a high level idea of something you'd test with it?


--
124. Two drink limit does not mean first and last.
-- The 213 Things Skippy Is No Longer Allowed To Do In The U.S. Army
   http://skippyslist.com/list/


Server and database testing

2008-03-26 Thread Michael G Schwern
I have some work to write tests for a server that talks to a database.  This 
means creating a database and firing up a server for testing purposes, and 
then dropping the database and shutting down the server at the end.  This also 
means making sure that multiple instances of the test can run on the same 
machine by the same or multiple users.


I'm about to do a sort of brute-force approach:

* create and populate a database called projectname_$user_$pid
* find an open port
* write a config file with the port and database to use
* fire up the server with that config
* run the tests
* shutdown the server
* drop the database

Seems a bit wasteful, so I'm wondering how other people handle it.
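
For concreteness, the first two bullets come out roughly like this (names
purely illustrative):

use IO::Socket::INET;

# A per-user, per-process database name.
my $db_name = sprintf "projectname_%s_%d", scalar getpwuid($<), $$;

# Find a free TCP port by letting the OS pick one, then release it so the
# test server can bind it.  (Small race, but good enough for a test rig.)
my $listener = IO::Socket::INET->new(
    LocalAddr => '127.0.0.1',
    LocalPort => 0,
    Listen    => 1,
) or die "Can't find an open port: $!";
my $port = $listener->sockport;
$listener->close;

print "would create database $db_name and start the server on port $port\n";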


PS  This is not a CPAN thing.


--
29. The Irish MPs are not after “Me frosted lucky charms”.
-- The 213 Things Skippy Is No Longer Allowed To Do In The U.S. Army
   http://skippyslist.com/list/


Re: Is FIT fit for purpose?

2008-03-27 Thread Michael G Schwern

Ovid wrote:

Has anyone here ever successfully used FIT testing?  I was at one of the
first presentations of FITness a long time ago, but the example Ward
Cunningham gave was of a calculator.  I thought the idea was neat, but
how would I implement it?


When you say "implement it" do you mean the mechanics of it or in a "when can 
it be of use" sense?




We've considered FITness testing, but so far, the only person I've met
who claims success with it is a consultant who teaches it.  Everyone
else has claimed no experience or that it's more hassle than it's
worth.


Tony Bowden has had success with it.  It's most useful when the people with 
the knowledge of how the thing is supposed to work aren't programmers. 
Acceptance testing.  I do X, I expect Y.  This works well with well defined, 
deterministic behaviors.  Here's an employee and their salary, what taxes and 
fees do we take out of their payroll?


Or for units where the field of test data is very, very broad (yet the inputs 
and outputs are simple) and you want to employ cheap labor to test it.  In 
Tony's case, as I remember, he employed his nephew to test a contact 
information web scraper.  His nephew went to a random web page and eyeballed 
it for any contact information.  He wrote down the URL and contact info in a 
simple table.  Harness slurps in that table and compares what it found with 
what the human found.  Fast way to get a big wad of real world test data.  A 
programmer would be way more expensive and probably not do nearly as good a 
job as they'd get bored.



--
E: Would you want to maintain a 5000 line Perl program?
d: Why would you write a 5000 line program?


Re: Is FIT fit for purpose?

2008-03-28 Thread Michael G Schwern

Eric Wilhelm wrote:

On Thursday 27 March 2008 12:42:13 Eric Wilhelm wrote:

What do you need to test that your users need to drive?

Business rules.


So, what is a good example of such a business rule?  I posit that 
payroll does not count because the user could more concisely write the 
rule in a declarative form, this isn't Java, c.


I'm confused by that response.  FIT is declarative.  You give the user a 
table, they fill it in, it gets run through a routine that interprets the 
inputs and outputs.


| gross pay | fed tax | state tax | medicare | social sec | net pay |
----------------------------------------------------------------------
| 4         | 24%     | 6%        | 2%       | 5%         |         |
----------------------------------------------------------------------

Categorization is a nice example.

| URL            | Category   |
--------------------------------
| hooters.com    | Restaurant |
| whitehouse.com | Porn       |
--------------------------------

Clear inputs and outputs.

Also, no matter that you could write that out as a Perl hash or something, you 
don't want to be handing accountants a text editor and a Perl program.  Totally 
alien.  You want to give them a web page or Excel file to fill out.  Part of 
the point of FIT is that the method of inputting and evaluating the test 
results is comfortable for the user, not the programmer.




How can it be expressed in a non-tedious and yet understandable way
that makes them feel like it is a worthwhile process?

That's the guiding design question of FIT tests.


That's conveniently intuitive then.  So where do I get the guiding 
design answer?  Or at least, how do you decide when to be asking the 
above question vs when to be asking how do I set this up such that the 
business rules are 'programmed' directly by the users??


That would be highly situational, but it's the same sort of question as "when 
do I put this into a config file?"


Of course, even if the rules are specified by the users they still need to be 
tested.




At that point, you're pushing so much data into the test that it has
become tedious for the user to own (create, manually review) the
time cards, so you really only want to involve them at the
configuration. ...

Don't mistake FIT tests for unit tests.  You'll stab someone if you
do.


Okay, so how do we be sure that the business rule is fully expressed by 
the Fit input?  (That is: guarantee that there are no edge cases.)  Or, 
is this one of those complicated things where worse is better because 
we don't like better better than worse?


Same way you determine any other test.  You run coverage analysis.  You give 
some thought to the data.  If they're numbers you try 0, -1, 1, .5, 2**33, 
etc...  If it's strings try whitespace, special characters, nul bytes, SQL 
commands...
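
In the categorization example that brainstorm might turn into a little table
of nasty inputs (categorize_web_site() being the hypothetical fixture from
earlier in the thread):

use Test::More;

# Throw the usual nasty string inputs at the fixture and make sure it at
# least survives them without dying.
my @edge_urls = ( "", " ", "hooters.com\0", "'; DROP TABLE sites; --" );
plan tests => scalar @edge_urls;

for my $url (@edge_urls) {
    my $lived = eval { categorize_web_site($url); 1 };
    ok( $lived, "survives " . (length $url ? "'$url'" : "the empty string") );
}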


But FIT is not really about doing full edge testing.  It's about getting a 
broad suite of blackbox tests that match what the users actually do, as 
opposed to what the programmer thinks the users do.  The users (or the 
client) are writing the tests, and they probably know the domain better than 
you do.  Also, lacking any knowledge of the internals, they're not going to 
pussyfoot around known fragile spots.


Finally, as chromatic pointed out, FIT is *not* a replacement for unit tests. 
 It's another tool, specialized to allow the client to write their own 
acceptance tests.  It's as much about drawing the client into the development 
process, getting them involved, getting them to use the iterations, getting 
their feedback and buy-in, as it is about testing the software.



--
60. “The Giant Space Ants” are not at the top of my chain of command.
-- The 213 Things Skippy Is No Longer Allowed To Do In The U.S. Army
   http://skippyslist.com/list/


Test::Builder 2 in Oslo

2008-03-28 Thread Michael G Schwern

I put Test::Builder 2 up as a topic for the Oslo hackathon.
http://perl-qa.hexten.net/wiki/index.php/Oslo_QA_Hackathon_2008_:_Topics#Test::Builder_2


--
E: Would you want to maintain a 5000 line Perl program?
d: Why would you write a 5000 line program?



Re: Is FIT fit for purpose?

2008-03-28 Thread Michael G Schwern

Ok, let's clear this all up.

FIT is not about expressing business rules.

FIT is a tool which allows the customer to add test cases in a way they're 
comfortable with.  A programmer still has to write the logic behind those 
tests (called a Fixture), but it allows a customer to easily add more data, 
inputs and outputs.


It can have the nice side effect of getting the customer more involved with 
the project.


FIT is not about getting the customer to write the rules any more than unit 
testing is about writing the code.  It can help clarify the rules and reveal 
where the code doesn't match customer expectations.  That's why it's called an 
acceptance test, done right the customer accepts the work when the tests pass.



Eric Wilhelm wrote:

Categorization is a nice example.

| URL            | Category   |
--------------------------------
| hooters.com    | Restaurant |
| whitehouse.com | Porn       |
--------------------------------


Now I'm really confused.  That looks like tabulated data, so what code 
would it be testing?


The code which takes a URL and decides what category it's in.  Code you defined.

FIT is this...

my %tests = (
    # URL                # Category
    "hooters.com"    => "Restaurant",
    "whitehouse.com" => "Porn",
);

for my $url (keys %tests) {
    my $category = $tests{$url};

    is categorize_web_site($url), $category;
}

Except instead of %tests you have the user write up a table (Excel, HTML, CSV, 
whatever) so they can input new test inputs and expected results without 
having to touch code.  YOU write the code which interprets those inputs (the 
fixture).  You also give them a simple way to run the tests, usually just a 
submit button on a web page.
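
A hand-wavy sketch of the fixture side, assuming the customer keeps a
two-column CSV file (Text::CSV_XS and the file name are just for illustration):

use Test::More 'no_plan';
use Text::CSV_XS;

# Slurp the customer's URL/Category table and run each row through the
# same check as the hard-coded %tests version above.
open my $fh, '<', 'categories.csv' or die "Can't read categories.csv: $!";
my $csv = Text::CSV_XS->new({ binary => 1 });

$csv->getline($fh);    # throw away the header row
while( my $row = $csv->getline($fh) ) {
    my($url, $category) = @$row;
    is categorize_web_site($url), $category, "category for $url";
}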


That's it.  That's all FIT is.


--
191. Our Humvees cannot be assembled into a giant battle-robot.
-- The 213 Things Skippy Is No Longer Allowed To Do In The U.S. Army
   http://skippyslist.com/list/


MySQL + TAP == MyTAP

2008-03-29 Thread Michael G Schwern
Stumbled across this while finding an alternative to libtap for testing C (it 
has some sort of issue linking with this hairy project I'm working on). 
Apparently MySQL wrote their own TAP library for C.


From http://dev.mysql.com/doc/mysqltest/en/unit-test.html

The unit-testing facility is based on the Test Anything Protocol (TAP) which 
is mainly used when developing Perl and PHP modules. To write unit tests for 
C/C++ code, MySQL has developed a library for generating TAP output from C/C++ 
files. Each unit test is written as a separate source file that is compiled to 
produce an executable. For the unit test to be recognized as a unit test, the 
executable file has to be of the format mytest-t. For example, you can create 
a source file named mytest-t.c that compiles to produce an executable mytest-t. 
The executable will be found and run when you execute make test or make 
test-unit in the distribution top-level directory.


Here's the docs.
http://www.kindahl.net/mytap/doc/index.html


--
Defender of Lexical Encapsulation



Re: An alternate view on deferred plans

2008-03-29 Thread Michael G Schwern

Buddy Burden wrote:

Not criticizing, not claiming my method is better, just looking for
any reasons why this wouldn't work.  And, JIC there's some agreement
that it _would_ work, I've already put together a patch for Test::Most
that does it.  That is, at the top of your script, you put this:

use Test::Most 'defer_plan';

and at the bottom of your script, you put this:

all_done();

and it just DTRT.  The implementation is perhaps not as clean as I'd
like it to be, but it's not a horrific hack either.  I'm going to
forward it to Ovid through normal CPAN channels.

Thoughts?


This method is fine and has been suggested several times before.
http://rt.cpan.org/Public/Bug/Display.html?id=20959

I've just been dragging my feet on it.  I finally had occasion to make use of 
it when I had to hand roll a C TAP library (libtap tried to be nice about 
threads and wound up losing).  I hate writing plans, and C makes everything 3x 
more annoying, so I wrote a done_testing() function and just didn't bother 
requiring a plan at all.  It was all very convenient.
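
In test script terms the style reads something like this (a sketch; Test::More
doesn't actually export done_testing() yet):

use Test::More;

ok( 1,          "first thing"  );
ok( 1 + 1 == 2, "second thing" );

# No up-front plan; the closing marker emits the trailing "1..2" line and
# tells the harness we really reached the end rather than exiting early.
done_testing();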


So maybe I'll get around to implementing it.


--
Insulting our readers is part of our business model.
http://somethingpositive.net/sp07122005.shtml


Re: An alternate view on deferred plans

2008-03-30 Thread Michael G Schwern

Aristotle Pagaltzis wrote:

Note that it doesn’t quite protect you from running too few tests
either. You may botch some conditional in your test program and
end up skipping tests silently, in which case you will still
reach the `all_done()` line, and it’ll look as if all was fine.


The typical approach to setting the number of tests in anything but the most 
trivial cases is to run the tests and copy down the number.  If the 
conditional wasn't working in the first place, then the plan does nothing but 
copy that mistake.


If your test is simple enough that you can routinely count the tests by hand, 
you're unlikely to miss running a test in the first place.




What it protects you from is dying half-way through the tests
without the harness noticing. Of course, that’s by far the most
common failure mode.


I don't want to drag out the plan vs no_plan argument, but I do want to 
clear up this common misconception.


Death is noted by both Test::More and Test::Harness and has been for a long 
time.  Recent versions of Test::More close off a bug that caused death or 
non-zero exit codes to be lost in certain cases.  If you continue to 
experience that, report it.  It is a bug.


The only way you can abort the test halfway through using no_plan and get a 
success is with an exit(0).  That scenario is extremely rare, but I've 
considered adding in an exit() override to detect it.
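
Such an override would be along these lines (an illustration only, not
anything Test::More ships):

# Install a global exit() wrapper very early, so a stray exit(0) in the
# middle of a test at least leaves a trace behind before exiting.
BEGIN {
    *CORE::GLOBAL::exit = sub {
        my $status = @_ ? shift : 0;
        warn "# exit($status) called before the test finished\n";
        CORE::exit($status);
    };
}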



--
I do have a cause though. It's obscenity. I'm for it.
- Tom Lehrer


Re: TAP has no exit code

2008-03-31 Thread Michael G Schwern

Eric Wilhelm wrote:

# from Aristotle Pagaltzis
# on Sunday 30 March 2008 23:14:


Except that the test program might be running at the other end of
an HTTP connection. Or at the other end of a serial port. Or the
harness might be parsing an archived TAP stream. Or a TAP archive
generated offline in batch mode. Or…


That's a good point, but what does it have to do with plans?

  $ perl -e 'use Test::More qw(no_plan); ok(1); die;' 2>/dev/null
  ok 1
  1..1


This will look like a success.

$ perl -wle 'use Test::More "no_plan";  pass();  die;  pass()'  2>/dev/null
ok 1
1..1


This will look like a failure.

$ perl -wle 'use Test::More tests => 2;  pass();  die;  pass()'  2>/dev/null
1..2
ok 1


If you can't see the exit code of the test then the plan protects you.

There is a TAP proposal to add meta information such as the exit codes.
http://testanything.org/wiki/index.php/TAP_meta_information


--
Being faith-based doesn't trump reality.
-- Bruce Sterling


Re: An alternate view on deferred plans

2008-03-31 Thread Michael G Schwern

Eric Wilhelm wrote:

What it protects you from is dying half-way through the tests
without the harness noticing...

Death is noted by both Test::More and Test::Harness and has been for a
long time

The only way you can abort the test halfway through using no_plan and
get a success is with an exit(0).


Yes.  That's exactly the reason that I want a done() with my no_plan.

That scenario is extremely rare, 
but I've considered adding in an exit() override to detect it.


I'm not sure how that would work.  You would have to assign it to 
*CORE::GLOBAL::exit and Test::More would have to be the first module 
loaded.


That's what I'd do.


If you just replace it lexically, you've covered exactly the opposite of 
the case I'm concerned about.  I can *see* an exit() in my test file 
(and I sometimes include one when a big chunk of test is broken (yeah, 
yeah... let's not talk about that right now.))


The plan there would be to have Test::More export an exit() that exhibited no 
warning, however...



The exit() that concerns me when testing with no_plan is the WTF? way 
off somewhere else which absolutely shouldn't be there.  Is there any 
way to catch that without the done() token?


...a done testing marker renders all this irrelevant.


--
29. The Irish MPs are not after “Me frosted lucky charms”.
-- The 213 Things Skippy Is No Longer Allowed To Do In The U.S. Army
   http://skippyslist.com/list/


Re: TAP has no exit code

2008-03-31 Thread Michael G Schwern

Eric Wilhelm wrote:

# from Michael G Schwern
# on Sunday 30 March 2008 23:35:


There is a TAP proposal to add meta information such as the exit
codes. http://testanything.org/wiki/index.php/TAP_meta_information


Yay.

Can we put 'hostname' in there too?


You can put whatever you want in there, the magic of YAML!

The scheme of reserving all leading lower-case keys seems to be working well 
for META.yml; I've changed the extension rules to reflect that.  But it would 
be a worthwhile official field.  (Note, no fields are required.)



--
191. Our Humvees cannot be assembled into a giant battle-robot.
-- The 213 Things Skippy Is No Longer Allowed To Do In The U.S. Army
   http://skippyslist.com/list/

