Test::Fork (was Re: New Test::More features?)

2007-11-30 Thread Michael G Schwern
Eric Wilhelm wrote:
 # from Michael G Schwern
 # on Thursday 29 November 2007 19:00:
 
 Otherwise, what's important to people?
 
 Could it be made fork-safe?
 
   http://search.cpan.org/src/TJENNESS/File-Temp-0.19/t/fork.t
 
 Possibly that involves blocking, or IPC with delayed output, or a 
 plan-per-fork thing.

The trick is, how do you do it?

IPC is right out, unportable.  I'm pretty sure there's no way to coordinate
the test counter between the two processes.  There are TAP proposals to
eliminate the need for coordination but I don't want to get into that right now.

The usual way to do this is to turn off test numbers, fork, and then turn them
back on when the fork is done, incrementing the test counter by the number of
tests the forked child ran.  That requires a bunch of Test::Builder
method-level muckery and is non-obvious.
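Roughly, the muckery looks something like this (a sketch only, with the
child's test count assumed to be known up front):

    use Test::More tests => 3;
    use Test::Builder;

    my $tb = Test::Builder->new;
    my $child_tests = 2;            # how many tests the child will run

    $tb->use_numbers(0);            # stop numbering so parent and child don't collide
    my $pid = fork;
    die "fork failed: $!" unless defined $pid;

    if( $pid == 0 ) {               # the child runs its share of the tests
        pass("child test 1");
        pass("child test 2");
        $tb->no_ending(1);          # don't run the end-of-test checks in the child
        exit 0;
    }

    waitpid($pid, 0);               # parent waits for the child to finish

    # account for the child's tests, then turn numbering back on
    $tb->current_test( $tb->current_test + $child_tests );
    $tb->use_numbers(1);

    pass("parent test");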

An easier Test::More level interface would be nice, but what would that
interface be?  It needs an "I'm about to fork for this many tests" function
and an "I'm done forking" function.  It would be easier if Test::More did the
forking for you, but that's a restriction I don't want to impose.

Or maybe we just write Test::Fork like so:

use Test::Fork;

fork_ok(sub {
is 23, 42;  # this is the code in the fork
});

and it does all the necessary jiggery pokery.  Knowing when the fork is
complete to turn numbers back on is troublesome.  I guess some sort of signal
handler will deal with that?


PS  I note there is Test::MultiFork but it seems to go well beyond what we're
talking about.

-- 
...they shared one last kiss that left a bitter yet sweet taste in her
mouth--kind of like throwing up after eating a junior mint.
-- Dishonorable Mention, 2005 Bulwer-Lytton Fiction Contest
   by Tami Farmer


Re: New Test::More features?

2007-11-30 Thread Michael G Schwern
Andy Armstrong wrote:
 On 30 Nov 2007, at 03:00, Michael G Schwern wrote:
 Otherwise, what's important to people?  I know there's a lot of
 suggestions about increasing the flexibility of planning.  Also the oft
 requested "I'm done running tests" sentinel for a safer no_plan.  Most of
 the time I'm just wibbling over interface details, getting the names just
 right.  (What!  Argue over tiny interface issues?  The Deuce you say!)
 
 The ability to emit TAP 13 c/w structured diagnostics would be hot.

Two open, but non-showstopper, issues related to that.

1)  What do we do with the regular STDERR diagnostics?  Ideally we'd have some
way to detect that the harness is going to read our YAML diagnostic and
generate its own user-readable version.  AFAIK no such thing exists.

At the moment, they will just always get emitted.  I don't have any other good
solution in mind other than accepting an environment variable (tied to a
Test::Builder method) to switch it off.  It's really a decision the harness
has to convey to the tests.
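One possible shape for that switch, as a sketch (the environment variable
name here is made up; no_diag() is the existing Test::Builder knob that
silences diag() output):

    use Test::Builder;

    my $tb = Test::Builder->new;
    $tb->no_diag(1) if $ENV{TEST_NO_STDERR_DIAG};   # hypothetical env var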


2)  What do we do if we don't have a YAML emitter?

At the moment, we just won't emit diagnostics.  Possibilities include shipping
with a copy of YAML::Tiny or writing our own dumbed-down YAML generator based
on the is_deeply() code, as it already knows how to walk a data structure.
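A dumbed-down generator of that sort might only be a screenful of code.
Here's a rough sketch (it only handles plain hashes, arrays and scalars,
with no quoting or undef handling):

    sub yamlish {
        my($thing, $indent) = @_;
        $indent ||= 0;
        my $pad = "  " x $indent;

        if( ref $thing eq 'HASH' ) {
            return join "", map {
                my $val = $thing->{$_};
                ref $val ? "$pad$_:\n" . yamlish($val, $indent + 1)
                         : "$pad$_: $val\n";
            } sort keys %$thing;
        }
        elsif( ref $thing eq 'ARRAY' ) {
            return join "", map {
                my $val = $_;
                ref $val ? "$pad-\n" . yamlish($val, $indent + 1)
                         : "$pad- $val\n";
            } @$thing;
        }

        return "$pad$thing\n";
    }

    # print yamlish({ failed => [2, 11] });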


 Then
 we can reopen the debate about the namespace within diagnostic blocks
 and lose another four weeks of our respective lives :)

I'm going to pretend I don't know anything about diagnostic namespaces, which
is easy, cause I don't.  LALALALALALA


-- 
I have a date with some giant cartoon robots and booze.


[ANNOUNCE] Test::Fork 0.01_01

2007-11-30 Thread Michael G Schwern
As threatened, here's Test::Fork for easier writing of forked tests.
http://pobox.com/~schwern/src/Test-Fork-0.01_01.tar.gz

   use Test::More tests => 4;
   use Test::Fork;

   fork_ok(2, sub{
       pass("Child");
       pass("Child again");
   });

   pass("Parent");

I'm probably doing the reaping of children wrong.  Someone more familiar with
forking might make some suggestions please.

Also, it's not currently checking whether the system is capable of forking.

And I realize I got the number of tests in the synopsis wrong, it should be 4.

Finally, it might be interesting to use an attribute to declare the number of
tests in the fork, like Test::Class.


-- 
There will be snacks.



Re: New Test::More features?

2007-11-30 Thread Michael G Schwern
Michael G Schwern wrote:
 Otherwise, what's important to people?

Here's something that's important to me.  I'd like to make it easier for
people to patch my modules.  A bunch of people already have write access to my
repository, and I've taken care to ensure that most all the outstanding items
are in RT.

Ideally what I'd like is a simple way for anyone to say "check out the branch
for ticket #19389" (creating it if it doesn't already exist).  Then they can
work on it and communicate back when they feel it's done and ready for review
and integration.  Ideally each change would be sent back as a comment on the
ticket.

I'll bet trac does this.


-- 
Life is like a sewer - what you get out of it depends on what you put into it.
- Tom Lehrer


Re: shouldn't UNEXPECTEDLY SUCCEEDED mean failure?

2007-12-02 Thread Michael G Schwern
Fergal Daly wrote:
 One of the supposed benefits of using TODO is that you will notice
 when the external module has been fixed. That's reasonable but I don't
 see a need to inflict the confusion of unexpectedly passing tests on
 all your users to achieve this.

Maybe we should just change the wording and presentation so we're not
inflicting so much.

Part of the problem is it screams "OMG!  UNEXPECTEDLY SUCCEEDED!" and the user
goes "whoa, all caps" and doesn't know what to do.  It's the most screamingest
part of Test::Harness 2.

Fortunately, Test::Harness 3 toned it down and made it easier to identify them.

Test Summary Report
---
/Users/schwern/tmp/todo.t (Wstat: 0 Tests: 2 Failed: 0)
  TODO passed:   1-2

TAP::Parser also has a todo_passed() test summary method so you can
potentially customize the behavior of passing TODO tests at your end.
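For example, a sketch of fishing the unexpected passes out yourself with
TAP::Parser (the test file name is made up):

    use TAP::Parser;

    my $parser = TAP::Parser->new({ source => "t/todo.t" });
    1 while defined $parser->next;              # drain the TAP stream

    if( my @passes = $parser->todo_passed ) {
        print "TODO tests that actually passed: @passes\n";
    }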


I agree with Eric, these tests are extra credit.  Unexpectedly working
better != failure except in the most convoluted situations.  Their
intention is to act as an alternative to commenting out a test which you can't
fix right now.  An executable TODO list that tells you when you're done, so
you don't forget.

It should not halt installation; nothing's wrong as far as the user's
concerned.  However, it does mean "investigate", and it would be nice if this
information got back to the author.  It would be nice if CPAN::Reporter
reported passing TODO tests... somehow.


 Another downside of using TODO like this is that when the external
 module is fixed, you have to release a new version of your module with
 the TODOs removed. These tests will start failing for anyone who
 upgrades your module but not the broken one, but in reality nothing has
 changed for that user,

As long as you're releasing a new version, why would you not upgrade your
module's dependency to use the version that works?


-- 
I am somewhat preoccupied telling the laws of physics to shut up and sit down.
-- Vaarsuvius, Order of the Stick
   http://www.giantitp.com/comics/oots0107.html


Re: shouldn't UNEXPECTEDLY SUCCEEDED mean failure?

2007-12-02 Thread Michael G Schwern
Fergal Daly wrote:
 As long as you're releasing a new version, why would you not upgrade your
 module's dependency to use the version that works?
 
 Your module either is or isn't usable with version X of Foo.

 If it is usable then you would not change your dependency before or
 after the bug in version X is fixed (maybe I have a good reason not to
 upgrade Foo and you wouldn't want your module to refuse to install if
 it is actually usable).
 
 If it isn't usable then marking your tests as TODO was the wrong thing
 to do in the first place, you should have bailed out due to
 incompatibility with version X and not bothered to run any tests at
 all. I think Extutils::MM does not have any way to specify complex
 version dependencies but with Module::Build you could say

ETOOBINARY

Modules do not have a binary state of working or not working.  They're
composed of piles of (often too many) features.  Code can be shippable without
every single thing working.

The TODO test is useful when the working version *does not yet* exist.  If
it's a minor feature or bug then rather than hold up the whole release waiting
for someone else to fix their shit, you can mark it TODO and release.  This is
the author's decision to go ahead and release with a known bug.  We do it all
the time, just not necessarily with a formal TODO test.


 I am basically against the practice of using TODO to cope with
 external breakage. Not taking unexpected passes seriously encourages
 this practice. Apart from there being other ways to handle external
 breakage that seem easier, using TODO is actually dangerous as it can
 cause false passes in 2 ways. Say version X of Foo has a non-serious
 bug so you release version Y of Bar with some tests marked TODO. Then
 we risk

Maybe we're arguing two different situations.  Yours seems to be when there is
a broken version of a dependency, but a known working version exists.  In this
case, you're right, it's better resolved with a rich dependency system.

My case is when a working version of the dependency does not exist, or the
last working version is so old it's more trouble than it's worth.  In this
case the author decides the bug is not critical, can't be worked around and
doesn't want to wait for a fix in the dependency.  The decision is whether or
not to release with a known bug.  After that, wrapping it in a TODO test is
just an alternative to commenting it out.

Compare with the more common alternative for shipping with a known bug which
is to simply not have a test at all.


 1 Version X+1 of Foo is even worse and will cause Bar to eat your dog.
 Sadly for your dog, the test that might have warned him has been
 marked TODO.

If they release Bar with a known bug against Foo X where your dog's fur is
merely a bit ruffled, then that's ok.  If version X+1 of Foo causes Bar to eat
your dog then why didn't their tests catch that?  Was there not a "dog not
eaten" test?  If not, then that's just an incomplete test suite; the TODO test
has nothing to do with that.

The "dog not eaten" test wouldn't have been part of the TODO test; that part
worked fine when the author released, and they'd have gotten the "TODO passed"
message and known to move it out of the TODO block.

Or maybe they're just a cat person.

Point is, there are multiple points where good testing practice has to break
down for this situation to occur.  The use of a TODO test is orthogonal.


 2 You're using version X-1 of Foo, everything is sweet, your dog can
 relax. You upgrade to version Y+1 of Bar which has a newly introduced
 dog-eating bug. This bug goes undetected because the tests are marked
 TODO. So long Fido.

That's the author's (poor) decision to release with a known critical dog
eating bug.  The fact that it's in a TODO test is incidental.


 I still have not seen an example of using TODO in this manner that
 isn't better handled in a different way.

 As before, I am not advocating changing the current Test::* behaviour
 to fail on unexpected passes as that would just be a mess. It's just
 that whenever this is discussed it ends up with people advocating what
 I consider wrong and dangerous uses of TODO and so I am pointing this
 out again,

Most of the cases above boil down to "the author decided to release with a
known critical bug" or "the tests didn't check for a possible critical bug".

You're right in that marking something TODO is not an excuse to release with a
known critical bug, but I don't think anyone's arguing that.


-- 
I do have a cause though. It's obscenity. I'm for it.
- Tom Lehrer


Re: shouldn't UNEXPECTEDLY SUCCEEDED mean failure?

2007-12-03 Thread Michael G Schwern
So I read two primary statements here.

1)  Anything unexpected is suspicious.  This includes unexpected success.

2)  Anything unexpected should be reported back to the author.

The first is controversial, and leads to the conclusion that TODO passes
should fail.

The second is not controversial, but it erroneously leads to the conclusion
that TODO passes should fail.  That's the only mechanism we currently have for
telling the user "hey, something weird happened.  Pay attention!"  It's also
how we normally report stuff back to the author.  Also, there are only two
easily identifiable states for a test: pass and fail.

So what we need is a "pass with caveats" or, as Eric pointed out, some way for
the harness to communicate its results in a machine-parsable way.  The very
beginnings of such a hack were put in for CPAN::Reporter in the "Result:" line
that is output at the end of the test run.  Ideally you'd have the harness
spitting out its full conclusions... somehow... without cluttering up the
human readable output.  But maybe "Result: TODO_PASS" is enough.


-- 
Stabbing you in the face for your own good.


Why not run a test without a plan?

2007-12-03 Thread Michael G Schwern
use Test::More;
pass();
plan tests => 2;
pass();

Why shouldn't this work?  Currently you get a "You tried to run a test without
a plan" error, but what is it really protecting the test author from?

Historically, there was a clear technical reason.  It used to be that the plan
had to come first in the TAP output, so a plan had to come before any tests
were run.  Simple.

But that technical restriction no longer holds true.  The plan can come at the
end, primarily used for no_plan.  If a test is run before the plan is
declared, simply delay the plan output until the end.

Removing this restriction eliminates some unnecessarily difficult planning
problems, especially the annoying "the plan is calculated, but I have to run
some tests at BEGIN time" case.  Like this common mistake:

use Test::More;

if( $something ) {
    plan tests => 1;
}
else {
    plan skip_all => "Because";
}

BEGIN { use_ok 'Some::Module' }

It also allows you to run some tests before you determine the plan, though I
can't think of a particular use for this.

It also makes it technically possible to allow the test to change its plan
mid-stream, though the consequences and interface for that do require some
thought.

Since the technical restriction is gone, and I see no particular benefit to it
being there, and it eliminates some tricky plan counting situations, I don't
see why it shouldn't be removed.


PS  To be clear, a plan is still eventually needed before the test exits.


-- 
The interface should be as clean as newly fallen snow and its behavior
as explicit as Japanese eel porn.



Re: Why not run a test without a plan?

2007-12-03 Thread Michael G Schwern
David Golden wrote:
 Michael G Schwern [EMAIL PROTECTED] wrote:
 It also makes it technically possible to allow the test to change it's plan
 mid-stream, though the consequences and interface for that do require some
 thought.
 
 With some sugar, that could actually be quite handy for something like
 test blocks.  E.g.:
 
 {
   plan add => 2;
   ok( 1, "wibble" );
   ok( 1, "wobble" );
 }

Yep, something like that.  There will likely be a "change the plan" method in
Test::Builder at minimum.  Right now changing the expected number of tests is
tied to printing out the header.


-- 
Schwern What we learned was if you get confused, grab someone and swing
  them around a few times
-- Life's lessons from square dancing


Re: Why not run a test without a plan?

2007-12-03 Thread Michael G Schwern
Eric Wilhelm wrote:
 # from David Golden
 # on Monday 03 December 2007 19:55:
 
 With some sugar, that could actually be quite handy for something like
 test blocks.  E.g.:

 {
   plan add => 2;
   ok( 1, "wibble" );
   ok( 1, "wobble" );
 }
 
 or maybe make the block a sub
 
 block {
   subplan 2;
   ok(1, "wibble");
   ok(1, "wobble");
 };

I guess the unspoken benefit is block() can check the sub-plan when the
subroutine ref is done?

I'm always wary of using subs-as-blocks for general testing as they
transparently mess up the call stack, which will affect testing anything that
plays with caller() (such as Carp).  Then you have to introduce more
complexity, like Sub::Uplevel, to mask that.


-- 
'All anyone gets in a mirror is themselves,' she said. 'But what you
gets in a good gumbo is everything.'
-- Witches Abroad by Terry Prachett


Re: Why not run a test without a plan?

2007-12-04 Thread Michael G Schwern
A. Pagaltzis wrote:
 Yes, so this should be allowed:
 
 pass();
 plan 'no_plan';
 pass();
 
 Whereas this should not:
 
 pass();
 plan tests => 2;
 pass();

Umm, why not?  That's exactly what I was proposing and it would result in...

ok 1
ok 2
1..2


 Consider also:
 
 pass();
 plan skip_all => 'Surprise!';
 pass();

Good point.  That wouldn't work, there's no way to express skip_all once a
test has been issued.  There are ways Test::More could cheat to make it work,
but that goes against its intent to be as explicit as possible.  Running a
test and then stating that you're going to skip all tests is ambiguous.

It does splash some cold water on eliminating the common mistake of running a
use_ok() before deciding if you can or cannot run the tests.


 It also makes it technically possible to allow the test to
 change it's plan mid-stream
 
 Without some hypothetical future version of TAP this is only
 possible if you have run tests before declaring a plan at all,
 because otherwise the plan will already have been output as the
 first line of the TAP stream.

Just needs a way to declare that you're going to add to the plan up front.


 Since the technical restriction is gone, and I see no
 particular benefit to it being there, and it eliminates some
 tricky plan counting situations, I don't see why it shouldn't
 be removed.
 
 Because declaring a plan after running tests is effectively a
 no_plan and the programmer should be aware that that’s what they
 did. It’s fine if that’s their conscious choice; just make sure
 it was.

No, it's critically different from a no_plan in that the number of tests to be
run is still fixed by the programmer.  For example...

pass();
plan tests => 3;
pass();

Would produce...

ok 1
ok 2
1..3

Which would be a failure, just as if the plan was at the top.


-- 
I do have a cause though. It's obscenity. I'm for it.
- Tom Lehrer


Re: Why not run a test without a plan?

2007-12-04 Thread Michael G Schwern
Smylers wrote:
 It also makes it technically possible to allow the test to change
 it's plan mid-stream
 Without some hypothetical future version of TAP this is only possible
 if you have run tests before declaring a plan at all, because
 otherwise the plan will already have been output as the first line of
 the TAP stream.
 
 Wasn't there general agreement only a week or so ago to now allow plans
 to be specified at the end rather than the start?  I was presuming that
 Schwern's suggestions were in the light of this other change.

No, that was a much more involved thing which involves nested plans and
multiple plans and such.  This simply takes advantage of the existing ability
to put the plan at the end instead of at the front.

1..2
ok 1
ok 2

and

ok 1
ok 2
1..2

are equivalent test output.  This is how no_plan works and it's been around
since 2001.

Nothing new happened to allow this change, I just never really gave it thought
before.


-- 
I have a date with some giant cartoon robots and booze.


Re: Why not run a test without a plan?

2007-12-04 Thread Michael G Schwern
A. Pagaltzis wrote:
 That would work. Of course once you have that, you don’t need to
 allow assertions to run without a plan, since one can always say
 
 use Test::More tests => variable => 0;
 pass();
 plan add_tests => 2;
 pass();
 
 instead of
 
 use Test::More;
 pass();
 plan tests => 2;
 pass();
 
 which would still be an error. That way a mistake in a test
 script won’t lead to Test::More silently converting an up-front
 plan declarations into trailing ones.

Which brings us back to the original question:  why should that be an error?


-- 
There will be snacks.


Re: Why not run a test without a plan?

2007-12-04 Thread Michael G Schwern
Geoffrey Young wrote:
 
 Andy Armstrong wrote:
 On 4 Dec 2007, at 15:22, Geoffrey Young wrote:
 it would be nice if this were enforced on the TAP-digestion side and not
 from the TAP-emitter side - the coupling of TAP rules within the
 TAP-emitter is what lead to my trouble in the first place.

 A valid plan - at the beginning or the end - is required by Test::Harness.
 
 yup, I get that.  but that has nothing to do with the Test::More errors
 that started the thread - I ought to be able to use is() functionality
 to emit into whatever stream I want and not have it complain about
 missing plans, especially when Test::Harness will catch malformed TAP
 and complain anyway... if I decide to send it to Test::Harness, which I
 may not.

You can turn off all the ending checks with Test::Builder->no_ending(1) and
the header being printed with no_header(1).  That's what we came up with back
then.
http://www.nntp.perl.org/group/perl.qa/2006/07/msg6212.html
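For example, something like this (just a sketch) emits bare ok lines with no
plan and no end-of-run complaints:

    use Test::More 'no_plan';

    Test::Builder->new->no_header(1);   # never print the 1..N plan line
    Test::Builder->new->no_ending(1);   # skip the end-of-run checks entirely

    is( 1 + 1, 2, "a bare TAP line, not destined for Test::Harness" );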


-- 
Hating the web since 1994.


Re: UNKNOWN despite only failing tests -- how come?

2007-12-04 Thread Michael G Schwern
Andreas J. Koenig wrote:
 Bug in CPAN::Reporter and/or Test::Harness and/or CPAN.pm?
 
   http://www.nntp.perl.org/group/perl.cpan.testers/796974
   http://www.nntp.perl.org/group/perl.cpan.testers/825449
 
 All tests fail but Test::Harness reports NOTESTS and CPAN::Reporter
 concludes UNKNOWN and CPAN.pm then installs it.

Test::Harness bug where it concludes NOTESTS if it sees no test output, as
is the case when every test dies.  I'll see about fixing it.


-- 
On error resume stupid


Re: shouldn't UNEXPECTEDLY SUCCEEDED mean failure?

2007-12-04 Thread Michael G Schwern
This whole discussion has come unhinged a bit from reality; maybe you can give
some concrete examples of the problems you're talking about?  You obviously
have some specific breakdowns in mind.


Fergal Daly wrote:
 Modules do not have a binary state of working or not working.  They're
 composed of piles of (often too many) features.  Code can be shippable
 without every single thing working.
 
 You're right, I was being binary, but you were being unary. There are 3 cases,
 
 1 the breakage was not so important, so you don't bail no matter what
 version you find.
 2 it's fuzzy, maybe it's OK to use Foo version X but once Foo version
 X+1 has been released you want to force people to use it
 3 the breakage is serious, you always want to bail if you find Foo
 version X (and so you definitely don't switch the tests to TODO).

 You claimed 2 is always the case.  I claimed that 1 and 3 occur.

If I did, that wasn't my intent.  I only talked about #2 because it's the only
one that results in the user seeing passing TODO tests, which is what we were
talking about.


 I'm
 happy to say admit that 2 can also occur. The point remains, you would
 not necessarily change your modules requirements as a reaction to X+1
 being released. You might, or you might change it beforehand if it
 really matters or you might not change it at all.

And I might dip my head in whipped cream and go give a random stranger a foot
bath.  You seem to have covered all possibilities, good and bad.  I'm not sure
to what end.

The final choice, incrementing the dependency version to one that does not yet
exist, boils down to "it won't work".  It's also ill-advised to anticipate
that version X+1 will fix a given bug, as on more than one occasion an
anticipated bug has not been fixed in the next version.

Anyhow, to get back to the point, it boils down to an author's decision how to
deal with a known bug.  TODO tests are orthogonal.


 Maybe we're arguing two different situations.  Yours seems to be when there
 is a broken version of a dependency, but a known working version exists.  In
 this case, you're right, it's better resolved with a rich dependency system.
 
 I think maybe we are.
 
 You're talking about where someone writes a TODO for a feature that
 has never worked. That's legit, although I still think there's
 something odd about it as you personally have nothing to do. I agree
 it's not dangerous.

Sure you do, you have to watch for when the dependency fixes its bug.  But
that's boring and rote, what computers are for!  So you write a TODO test to
automate the process.  [1]
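In practice that automation is just a TODO block.  A sketch (the module,
function, and ticket number here are all made up):

    use Test::More tests => 1;
    use Foo;    # hypothetical dependency with a known bug

    TODO: {
        local $TODO = "Foo::frobnicate() is broken upstream, see their RT #12345";

        is( Foo::frobnicate(2), 4, "frobnicate works again" );
    }

When the upstream fix ships, the harness reports the TODO test as
unexpectedly passing and you know to move it out of the block.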

In a large project, sometimes things get implemented when you implement other
things.  This is generally more applicable to bugs, but sometimes to minor
features.

Then there are folks who embrace the whole test first thing and write out lots
and lots of tests beforehand.  Maybe you decide not to implement them all
before shipping.  Rather than delete or comment out those tests, just wrap
them in TODO blocks.  Then you don't have to do any fiddling with the tests
before and after release, something which leads to an annoying shear between
the code the author uses and the code users use.

There is also the "I don't think feature X works in Y environment" problem.
For example, say you have something that depends on symlinks.  You could hard
code in your test to skip if on Windows or some such, but that's often too
broad.  Maybe they'll add them in a later version, or with a different
filesystem (it's happened on VMS) or with some fancy 3rd party hack.  It's
nice to get that information back.


 I'm talking about people converting tests that were working just fine
 to be TODO tests because the latest version of Foo (an external
 module) has a new bug. While Foo is broken, they don't want lots of
 bug reports from CPAN testers that they can't do anything about.
 
 This use of TODO allows you to silence the alarm and also gives you a
 way to spot when the alarm condition has passed. It's convenient for
 developers but it's 2 fingers to users who can now get false passes
 from the test suites,

It still boils down to what known bugs the author is willing to release with.
 Once the author has decided they don't want to hear about a broken
dependency,  and that the breakage isn't important, the damage is done.  The
TODO test is orthogonal.

Again, consider the alternative which is to comment the test out.  Then you
have NO information.

So I think the problem you're concerned with is poor release decisions.  TODO
tests are just a tool being employed therein.


[1] Don't get too hung up on names; things only get one even though they can do
lots of things.  I'm sure you've written lots of Perl programs that didn't do
much extracting or reporting.


-- 
Reality is that which, when you stop believing in it, doesn't go away.
-- Phillip K. Dick


Re: Why not run a test without a plan?

2007-12-04 Thread Michael G Schwern
Geoffrey Young wrote:
 I guess what I thought you were getting at was a natural decoupling of
 comparison functions with the planning without all the hackery involved
 to get that sepraration working now.  so I was suggesting that the
 decoupling go further than just no_plan, and that yeah, rock on, great
 idea.  'tis all :)

I see what you're getting at.  I don't think I'm going that far, though I'm
willing to help somehow with the "I want to paste a bunch of subtest
processes together" problem.

One of the original issues Test::More was designed to deal with was the
problem of running individual tests without having to parse the test output
(by eye or by computer) to get an accurate result.

That's why it changes the exit code on failure and why it has the ending
diagnostic message if there's a failure.  While prove and TAP::Parser help, I
still like running tests by hand to get complete control.


-- 
If at first you don't succeed--you fail.
-- Portal demo


Re: Why not run a test without a plan?

2007-12-04 Thread Michael G Schwern
Eric Wilhelm wrote:
 A. Pagaltzis wrote:
 ...
 which would still be an error. That way a mistake in a test
 script won’t lead to Test::More silently converting an up-front
 plan declarations into trailing ones.
 Which brings us back to the original question:  why should that be an
 error?
 
 It's a matter of stricture?  If the developer intended the plan to be 
 before-hand, they'll be expecting an error to enforce that practice.

Why do they care if the plan is output at the beginning or end?  How does this
stricture improve the quality of the test?  What mistake does it prevent?  If
we were to propose the "no test without a plan" stricture today, what would
the arguments in favor be?

I'm not worried about the shock of violating existing programmers' calcified
minor expectations.  They'll live.

About the only thing I can think of is consistency.  skip_all must still come
first, so that this sort of thing will sometimes work, sometimes not,
depending on $^O.

use Test::More;

if( $^O eq 'BrokenOS' ) {
    plan skip_all => 'Your shit is broke';
}
else {
    plan tests => 42;
}

BEGIN { use_ok 'Some::Module' }

This might be solved by the oft-requested skip_rest.

use Test::More tests => 42;

skip_rest("Your shit is broke") if $^O eq 'BrokenOS';

BEGIN { use_ok 'Some::Module' }

Hmm, it's also shorter.  This even allows something like this:

use Test::More tests => 42;

BEGIN {
    use_ok 'Optional::Module' ||
        skip_rest('Optional::Module not available');
}


 The current planning functions impose strictures.  (Yes, it happens to 
 be due to an old implementation detail which no longer governs -- but 
 that doesn't change the fact that behavior expectations have already 
 been set.)  Taking away the error essentially means that you've changed 
 the API.  Imagine if strict.pm suddenly stopped being strict about 
 symbolic refs.

 That is, users should somehow be able to rely on the same strictures 
 that they've had in the past (hopefully by-default.)  So, either this 
 new non-strict plan scheme should be declared in the import() params or 
 be not named plan().

I see where you're going, but I think this is going too far wrt backwards
compatibility.

The strict analogy is spurious because that would be changing a fundamental
part of strict.  Whereas this is an incidental part of Test::More.  Scale does
matter.

Furthermore, it's not going to cause any passing tests to fail, or any
legitimately failing tests (ie. due to a real bug, not Test::More stricture)
to pass.

The only breakage I can think of are all highly convoluted and improbable,
where you've somehow written a test that checks that this specific feature
works.  But the only one who should be doing that is Test::More's own tests.
Or some highly paranoid module dependent on that specific feature, in which case
congratulations!  Your test did its job!

I'm not worried.


 I'm still wishing for the plan to make it to a given statement model 
 (e.g. done().)

Honestly all that's really holding that up is a good name for the plan style
and "I'm done testing" terminator.  Nothing has really leapt out at me yet.
Maybe something as straightforward as...

plan 'until_done';



done_testing;


-- 
Ahh email, my old friend.  Do you know that revenge is a dish that is best
served cold?  And it is very cold on the Internet!


Re: shouldn't UNEXPECTEDLY SUCCEEDED mean failure?

2007-12-04 Thread Michael G Schwern
I'm going to sum up this reply, because it got long but kept on the same themes.

*  TODO tests provide you with information about what tests the author decided
to ignore.
**  Commented out tests provide you with NO information.
**  Most TODO tests would have otherwise been commented out.

*  How you interpret that information is up to you.
**  Most folks don't care, so the default is to be quiet.

*  The decision for what is success and what is failure lies with the author
**  There's nothing we can do to stop that.
**  But TODO tests allow you to reinterpret the author's desires.

*  TAP::Harness (aka Test::Harness 3) has fairly easy ways to control how
   TODO tests are interpreted.
**  It could be made easier, especially WRT controlling make test
**  CPAN::Reporter could be made aware of TODO passes.


Fergal Daly wrote:
 On 05/12/2007, Michael G Schwern [EMAIL PROTECTED] wrote:
 This whole discussion has come unhinged a bit from reality; maybe you can
 give some concrete examples of the problems you're talking about?  You
 obviously have some specific breakdowns in mind.
 
 I don't, I'm arguing against what has been put forward as good
 practice when there are other better practices that are approximately
 as easy and don't have the same downsides.
 
 In fairness though these bad practices were far more strongly
 advocated in the previous thread on this topic than in this one.

I don't know what thread that was, or if I was involved, so maybe I'm not the
best person to be arguing with.


 The final choice, incrementing the dependency version to one that does not
 yet exist, boils down to "it won't work".  It's also ill-advised to
 anticipate that version X+1 will fix a given bug, as on more than one
 occasion an anticipated bug has not been fixed in the next version.
 
 As I said earlier though, in Module::Build you have the option of
 saying version  X and then when it's finally fixed, you can say !X
 (and !X+1 if that didn't fix it).

Yep, rich dependencies are helpful.


 There is also the "I don't think feature X works in Y environment" problem.
 For example, say you have something that depends on symlinks.  You could hard
 code in your test to skip if on Windows or some such, but that's often too
 broad.  Maybe they'll add them in a later version, or with a different
 filesystem (it's happened on VMS) or with some fancy 3rd party hack.  It's
 nice to get that information back.
 
 How do you get this information back? Unexpected passes are not
 reported to you. If you want to be informed about things like this a
 TODO is not a very good way to do it.

The TODO test is precisely the way to do it, it provides all the information
needed.  We just don't have the infrastructure to report it back.

As discussed before, what's needed is a higher resolution than just pass and
fail for the complete test run.  That's the "Result: PASS/TODO" discussed
earlier.  Things like CPAN::Reporter could then send that information back to
the author.  It's a fairly trivial change for Test::Harness.

The important thing is that reporting back is no longer locked to "fail".


 I'm talking about people converting tests that were working just fine
 to be TODO tests because the latest version of Foo (an external
 module) has a new bug. While Foo is broken, they don't want lots of
 bug reports from CPAN testers that they can't do anything about.

 This use of TODO allows you to silence the alarm and also gives you a
 way to spot when the alarm condition has passed. It's convenient for
 developers but it's 2 fingers to users who can now get false passes
 from the test suites,
 It still boils down to what known bugs the author is willing to release with.
  Once the author has decided they don't want to hear about a broken
 dependency,  and that the breakage isn't important, the damage is done.  The
 TODO test is orthogonal.

 Again, consider the alternative which is to comment the test out.  Then you
 have NO information.
 
 Who's you?

You == user.


 If you==user then a failing TODO test and commented out test are
 indistinguishable unless you go digging in the code or TAP stream.

As they say, "works as designed."  The author decided the failures aren't
important.  Don't like it?  Take it up with the author.  Most folks don't care
about that information, they just want the thing installed.

You (meaning Fergal Daly) can dig them out with some Test::Harness hackery,
and maybe that should be easier if you really care about it.  The important
thing is the information is there, encoded in the tests, and you can get at it
programmatically.

The alternative is to comment the failing test out in which case you have *no*
information and those who are interested cannot get it out.


 A passing TODO is just confusing.

That's a function of how it's displayed.  "UNEXPECTEDLY SUCCEEDED", I agree,
was confusing.  No question.  TH 3's display is more muted and no more
confusing than a skip test.  There is also the very clear "All tests
successful"

Re: TODO - MAYBE tests?

2007-12-05 Thread Michael G Schwern
Eric Wilhelm wrote:
 Since we're on the subject of CPAN::Reporter, TAP::Harness, Test::More, 
 and TODO wrt failure vs. no-noise vs. report-back vs. await-dependency 
 and the binaryism of failure and etc...
 
 Perhaps a general sort of MAYBE namespace in TAP would be a nice 
 addition.

Is this a joke?  I hope it's a joke.


-- 
There will be snacks.


Re: Why not run a test without a plan?

2007-12-05 Thread Michael G Schwern
A. Pagaltzis wrote:
 * Michael G Schwern [EMAIL PROTECTED] [2007-12-05 04:30]:
 Why do they care if the plan is output at the beginning or end?
 How does this stricture improve the quality of the test?
 
 It improves the resulting TAP stream, if not the test itself.

What's improved about the plan coming at the front as opposed to at the end?
Give me something concrete, not just "it's better".  I'm going to keep
drilling through the BS until I either hit bottom or punch through.

About all that's different when the plan is at the end is the TAP reader
doesn't know how many tests are coming until the end of the test.  Then it
can't display the expected number of tests while the test is running.
Unfortunate, but hardly a showstopper.


 But maybe it’s not necessary to impose this stricture by default,
 and instead of asking to be allowed to supply a plan later, as I
 proposed, people should instead have to ask for the stricture:

Again I ask, why make them ask?


 Honestly all that's really holding that up is a good name for
 the plan style and I'm done testing terminator.  Nothing has
 really lept out at me yet. Maybe something as straight forward
 as...

  plan 'until_done';

  

  done_testing;
 
 plan 'until_completion';
 # ...
 plan 'completed';

I don't want to saddle plan() with yet another feature.  It will most
definitely be its own function.


-- 
The interface should be as clean as newly fallen snow and its behavior
as explicit as Japanese eel porn.


Re: TODO - MAYBE tests?

2007-12-05 Thread Michael G Schwern
Eric Wilhelm wrote:
 # from Michael G Schwern
 # on Wednesday 05 December 2007 05:47:
 
 Perhaps a general sort of MAYBE namespace in TAP would be a nice
 addition.
 Is this a joke?  I hope it's a joke.
 
 Do I look like I'm joking?  :-|

   As it is, we're talking about detecting/reporting a 3rd thing, which 
   only increases the resolution by 50%.  If there are really $n, perhaps 
   just jump straight to $n and skip that 4th, 5th, 6th, ... process?
 
 You don't have to call it MAYBE -- is that what makes it hard to take 
 seriously?

Yes.  It makes my trick "ambiguity in testing is bad" knee act up.  I'll go
tie my leg down and reread the proposal.


-- 
I am somewhat preoccupied telling the laws of physics to shut up and sit down.
-- Vaarsuvius, Order of the Stick
   http://www.giantitp.com/comics/oots0107.html


Re: Why not run a test without a plan?

2007-12-05 Thread Michael G Schwern
Eric Wilhelm wrote:
  Give me something concrete, not just it's better.  I'm going to
 keep drilling through the BS until I either hit bottom or punch
 through.
 
 It allows you to apply the policy "all tests have a plan" at the test
 level.  Yes, policy often sounds like BS.

 By historical accident Test::More has always applied (albeit not in a 
 super-formal way) that policy by default.

This BS... err, policy... would still be possible.  It's a social policy
anyway, not a technical one.  It's been possible to run a test without a plan
for a long time.  Whatever they have in place to deal with no_plan can deal
with this.


 About all that's different when the plan is at the end is the TAP
 reader doesn't know how many tests are coming until the end of the
 test.  Then it can't display the expected number of tests while the
 test is running.
 
 Yes.  That leads a shop to implement the policy all tests must plan.
 
 If you don't want to support that policy-application, fine.  It can be 
 solved in other ways -- maybe they're cleaner.  A switch in the harness 
 doesn't seem to be it, but maybe a Test::MustPlan (complete with 
 syntactic-sugar for the annoying BEGIN thing.)

I'll put in some Test::Builder->must_have_plan flag to allow the current
behavior to be switched back on rather than wholly deleting it.  It's all
encapsulated in a method anyway.


-- 
Schwern What we learned was if you get confused, grab someone and swing
  them around a few times
-- Life's lessons from square dancing


Re: Why not run a test without a plan?

2007-12-05 Thread Michael G Schwern
A. Pagaltzis wrote:
 * Michael G Schwern [EMAIL PROTECTED] [2007-12-05 15:00]:
 I'm going to keep drilling through the BS until I either hit
 bottom or punch through.
 
 Yeah, we’re all spouting bullshit. Gee, some tone you’re setting.

Sorry, I forgot the :)

That I'm pushing so hard to get something concrete out of you means I think
you've got something useful to say.  That I'm not getting it is frustrating.
Seems I've finally got it, thank you.


 About all that's different when the plan is at the end is the
 TAP reader doesn't know how many tests are coming until the end
 of the test. Then it can't display the expected number of tests
 while the test is running.
 
 Not only can’t it do anything display-wise, but the harness also
 can’t do anything else that requires knowing the projected plan
 up front. It can’t abort the test as soon as the first extra test
 runs. If the test dies, the harness doesn’t know how many tests
 were pending. A system whose job is to continuously run lots of
 tests in parallel can’t do nearly as much useful asynchronous
 reporting.

 Unfortunate, but hardly a showstopper.
 
 Whether or not it’s a showstopper is for the harness author
 to judge and not for you. It’s not hard to imagine cases where
 better streamability is important, even if they’re not garden-
 variety `./Build test` scenarios. We’re championing TAP as a
 solution for a wide variety of scenarios, right?

Those are all things I hadn't thought of.  Since it doesn't affect the ultimate
quality of the tests, just some inconveniences in reporting, I'm not worried.

no_plan already has all these issues and the sky remains firmly fixed in the
heavens.  Header-at-end TAP is still streamable.  You don't have to read the
whole document before you can get information.  It doesn't close off any
testing situations, and it makes quite a few more much simpler.

To make it clear, Test::Builder will still put the plan at the front when it can.
Also, to make it clear, this is all possible right now with TAP.  This is a
Test::Builder imposed restriction.


 But streamability isn’t important in that most common use case,
 so it probably shouldn’t be the default, which is why I opined
 that maybe Test::More should be strict on request but not by
 default.

Sorry, I must have missed that.  Your example code up to this point looked
like it required the user to declare up front that they were going to put a
plan later.

Having the author declare in the test that they'd like Test::More to be strict
with the plan seems near useless.  If you're going to declare that you have to
declare a plan, why not just declare the plan?  It's like preparing to
prepare.  I can think of a few weird cases where it might be handy, but it's
not worth the extra complication.


-- 
Life is like a sewer - what you get out of it depends on what you put into it.
- Tom Lehrer


Re: shouldn't UNEXPECTEDLY SUCCEEDED mean failure?

2007-12-05 Thread Michael G Schwern
Fergal Daly wrote:
 The importance of the test has not changed. Only the worth of the
 failure report has changed.
 
 This could be solved by having another classification of test, the
 not my fault test used as follows
 
 BLAME: {
   $foo_broken = test_Foo(); # might just be a version check or
                             # might be a feature check
   local $BLAME = "Foo is broken, see RT #12345" if $foo_broken;
 
   ok(Foo::thing());
 }
 
 The module would install just fine in the presence of a working Foo,
 the module would fail to install in the presence of a broken Foo but
 no report should be sent to the author.
 
 This gives both safety for users and convenience for developers. This
 is what I meant by smarter tools.

I hope you don't mind if I cut out the rest of the increasingly head-butting
argument and jump straight to this interesting bit.

As much as my brain screams "DO NOT WANT!!!" [1] because it smacks of
"expected failure", it might be just what we're looking for.  This allows the
author to program in "I know this is broken, don't bug me about it" without
completely silencing the test.

However, I think it will be very open to abuse.  I'm also not sure how this
will be different from simply having the option of making failing TODO tests
fail for the user but not report back to the author.

It still boils down to trusting the author.


[1] http://www.mgroves.com/images/do_not_want_star_wars.jpg


-- 
There will be snacks.


Customizing Test::Builder (was Re: TAP::Builder)

2007-12-05 Thread Michael G Schwern
Ovid wrote:
 Side note:  those features I really want control over in
 Test::Harness
 are the plan() and ok() methods.  There's no clean way for me to do
 that.  Just look at the constructor:

   my $Test = Test::Builder->new;
   sub new {
       my($class) = shift;
       $Test ||= $class->create;
       return $Test;
   }

 The class name is hard-coded in there.
 
 Note to self:  don't post while hung over from the London Perl
 Workshop.  What I just said is rather confusing.  What I *need* is a
 way to easily replace Test::Builder with an appropriate subclass.  I
 think I can replace the builder() method in Test::Builder::Module:
 
   sub builder {
       return Test::Builder->new;
   }
 
 But it would still be nice to have something a bit more subtle than a
 sledgehammer:
 
   sub Test::Builder::Module::builder { ... }

On the surface, this could be solved with a simple way to replace the
Test::Builder class with your own.  However, I am always hesitant to do that
because of the inevitable clash.

Test::Builder is a singleton; this way custom testing modules can make their
changes in concert and with global effect.  Now what happens if Test::Foo and
Test::Bar are used together in the same program?  And Test::Foo decides it
wants to replace the singleton with Test::Builder::Foo and Test::Bar decides
it wants to do Test::Builder::Bar.  One has to win and one has to lose.  This
is not the Test::Builder way.

Somehow, multiple modules have to be able to override Test::Builder behaviors.
 Yes, there are certain features which inherently clash, but that's at the
higher feature level rather than at the code level.

Something I'm considering to resolve this is the Class::C3 method plugin
mechanism, or something like it.  I've used it while working on the Class::DBI
compatibility wrapper around DBIx::Class and I'm very impressed with the
amount of flexibility it allows.  It also allows you to slice up functionality
by feature and combine them together.

Traits and mixins offer similar functionality.  They're also a fair sight
easier to implement, considering the dependency issues Test::Builder will have
to contend with.  mixin.pm is small enough that I can just ship a copy with
Test::Builder.

Finally, there's the idea of splitting up Test::Builder into an aggregate of
many objects.  The aggregate itself would not be a singleton, allowing local
customization, but some of its parts (such as the part responsible for the
counter) would.  I believe this is how chromatic did the Perl 6 version which
I haven't gotten around to studying.

So... it's complicated.


-- 
Robrt:   People can't win
Schwern: No, but they can riot after the game.


Re: Parsing TAP into TAP

2007-12-10 Thread Michael G Schwern
Ovid wrote:
 Test results currently look something like this:
 
   t/foo.t. ok
   t/bar.t. ok
   t/baz.t. 23/?
   #   Failed test at t/baz.t line 9
   # Looks like you failed 2 tests out of 23
   t/baz.t. Dubious, test ...
 
 Why do we do this instead of outputting TAP (using YAML diagnostics)?
 
   ok 1 - t/foo.t
   ok 2 - t/bar.t
   not ok 3 - t/baz.t
 ---
 failed:
   - 2
   - 11
 ...
 
 And we could even add diagnostics for the non-failing tests.  This
 could be an alternate output, but now instead of external tools having
 to try and parse our ad-hoc Test::Harness output, we could have an
 alternate machine read-able output that those tools could use.  Now if
 only we had a useful way to read that output ...

+1

-- 
Stabbing you in the face for your own good.


Re: What's the point of a SIGNATURE test?

2007-12-14 Thread Michael G Schwern
Adrian Howard wrote:
 
 On 11 Dec 2007, at 05:12, Michael G Schwern wrote:
 
 Adam Kennedy posed me a stumper on #toolchain tonight.  In short, having a
 test which checks your signature doesn't appear to be an actual deterrent
 to tampering.  The man-in-the-middle can just delete the test, or just the
 SIGNATURE file since it's not required.  So why ship a signature test?

 The only thing I can think of is to ensure the author that the signature
 they're about to ship is valid, but that's not something that needs to
 be shipped.
 [snip]
 
 It is something that needs to be shipped if you have the "CPAN is the
 definitive version of a module.  Somebody can fork from it" attitude.

 It certainly doesn't have to run though...

I'm really not a fan of shipping tests that don't get run.

To be clear, I'd likely just delete it entirely and either A) trust that
MakeMaker/Module::Build will do the right thing, which it always has for me,
or B) add a "cpansign verify" to my normal release script.

Both avoid pooping a common author-only check all over the place.
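Option B is a one-liner in a release script; a sketch in Perl:

    # refuse to release if the SIGNATURE doesn't verify
    system("cpansign", "verify") == 0
        or die "SIGNATURE failed to verify, aborting the release\n";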


-- 
Robrt:   People can't win
Schwern: No, but they can riot after the game.


Re: What's the point of a SIGNATURE test?

2007-12-14 Thread Michael G Schwern
Andreas J. Koenig wrote:
 On Mon, 10 Dec 2007 21:12:51 -0800, Michael G Schwern [EMAIL PROTECTED] said:
 
Adam Kennedy posed me a stumper on #toolchain tonight.  In short, having a
test which checks your signature doesn't appear to be an actual deterrent
to tampering.  The man-in-the-middle can just delete the test, or just the
SIGNATURE file since it's not required.  So why ship a signature test?
 
 Asking the wrong question. None of our testsuites is there to protect
 against spoof or attacks. That's simply not the goal. Same thing for
 00-signature.t

We would seem to be agreeing.  If the goal of the test suite is not to protect
against spoofing, and if it doesn't accomplish that anyway, why put a
signature check in there?


The only thing I can think of is to ensure the author that the signature
they're about to ship is valid, but that's not something that needs to be
shipped.
 
 Has the world changed over night? Are we now questioning tests instead
 of encouraging them? Do now suddenly authors have to justify their
 testing efforts?

 I don't mind if we set up a few rules what tests should and should not
 do, but then this topic needs to be put into perspective.
 
It appears that a combination of a CHECKSUMS check against another CPAN
mirror and a SIGNATURE check by a utility external to the code being checked
is effective, and that's what the CPAN shell does.  The CHECKSUMS check makes
sure the distribution hasn't been tampered with.  Checking against a CPAN
mirror other than the one you downloaded the distribution from checks that
the mirror has not been compromised.  Checking the SIGNATURE ensures that the
module is from who you think it's from.
 
 Yupp. And testing the signature in a test is better than not testing
 it because a bug in a signature or in crypto software is as alarming
 as a bug in perl or a module.

I believe this to be outside the scope of a given module's tests.  It's not
the responsibility of every CPAN module to make sure that your crypto software
is working.  Or perl.  Or the C compiler.  Or make.  That's the job of the
toolchain modules which more directly use them (CPAN, Module::Signature,
MakeMaker, Module::Build, etc...). [1]

At some point you have to trust that the tools work, you can't test the whole
universe.  You simply don't have the time.

That brings me to the central reason why we've started to examine tests for
removal.  There's a certain cost/benefit ratio to be considered.  What's the
cost of implementing and maintaining a test, what's the benefit and does the
benefit justify the cost?  What's the opportunity cost, could you be doing
something more useful with that time and effort?  Finally, what's the cost in
terms of test suite confidence?  How many false negatives are your users
willing to endure before they lose confidence?

The fixed cost of a test is in writing it.  This includes both writing the
test itself and possibly altering the code being tested to make it testable.
It's a fixed cost because you do it once and then you're done.

The recurring costs include diagnosing failures.  The user loses time due to
a halted installation.  They contact the author, who has to diagnose the
failure and communicate the results back to the user.  If the test found
a bug, then the cost has a benefit and it's worthwhile.  But if the test
failed because it's a bad test, or because of something out of the author's
control and/or the user doesn't care about, then there's little or no benefit.

Then there's the cost of confidence.  Tests are only useful if someone pays
attention to them.  A failed test should be a clear indication of an actual
problem.  This is why "expected failures" (and their related "expected
warnings") are so insidious.  False failures erode the mental link between
test failure and bug.  Get enough of them, and it doesn't take much, and
people start to ignore any failure.  This is one of the most dangerous social
problems for a test suite.

A test that results in a lot of false negatives has a high recurring cost and
no benefit.

Finally there's the question of opportunity cost.  Instead of writing and
maintaining a faulty test, what else could you have been doing with that time?
 Could you have been doing something with an even higher benefit?  If so, you
should do it instead.


Let's look at the example of Test::More.  The last release has 120 passes and
just 4 failures.
http://cpantesters.perl.org/show/Test-Simple.html#Test-Simple-0.74

What are those four failures?  Three are due to a threading bug in certain
vendor patched versions of perl, one is due to the broken signature test.

Look at the previous gamma release, 0.72.  256 passes, 9 failures.
5 due to the threading bug, 4 from the signature test.

0.71:  73 passes, 2 failures.  1 signature, 1 threads

0.70:  221 passes, 12 failures.  3 signature, 9 threads

And so on.  That's nine months with nothing but false negatives

Re: Milton Keynes PM coding collaboration

2007-12-14 Thread Michael G Schwern
Edwardson, Tony wrote:
 Anyone written any CPAN modules for which the testing coverage needs to be
 improved ?
 
 Want someone else to sort this out for you ?

...

 Any takers ?

http://search.cpan.org/dist/ExtUtils-MakeMaker

Repository here:
http://svn.schwern.org/svn/CPAN/ExtUtils-MakeMaker

Getting a valid coverage measurement is tricky since so much happens in
sub-processes.  And then there's all the platform specific code.  But don't
worry, there's plenty to do. :)


-- 
Stabbing you in the face so you don't have to.


Re: What's the point of a SIGNATURE test?

2007-12-15 Thread Michael G Schwern
Andreas J. Koenig wrote:
 On Fri, 14 Dec 2007 15:49:32 -0800, Michael G Schwern [EMAIL PROTECTED] said:
We would seem to be agreeing.  If the goal of the test suite is not to
protect against spoofing, and if it doesn't accomplish that anyway, why put a
signature check in there?
 
 Of course we are agreeing 99%. But I'm citing the Michael Schwern
 saying that is dearer to me than the above paragraph: "tests are there
 to find bugs".

I say lots of apparently contradictory things.  The trick is knowing when one
rule wins out over the other.

Something to keep in mind is that I'm talking about one very specific test.
Don't let this discussion get tangled up in the author tests brouhaha that
often brews up around here.


[...] But if the test failed because it's a bad test,
 
 Clearly a strawman's argument. It's impossible to contradict you on
 this. Thou shalt not write bad tests. Period.

That was supposed to come out more like "if the test failed because of a
mistake in the test suite".  You know the sort of thing.  Like when you write:

like $error, qr/your shit is broke at $0 line \d+\.\n/;

and it blows up on Windows because you forgot about the backslashes in Windows
path names.  The test failure indicates a bug in the test, not the code.
Thus, the failure has a cost and no benefit.


The
signature test is not actually indicating a failure in Test::More, so 
 it's of
no benefit to me or the users, and the bug has already been reported to
Module::Signature.
 
 See above. Once the bug is reported there is no justification to keep
 the test around. In this case I prefer a skip over a removal because
 the test apparently once was useful.

But skipped tests don't get run so it's effectively deleted, except a
permanently skipped test sits around cluttering things up.  Smells like
commenting out code that maybe someday you might want to use again in the
future.  Just adds clutter.

If I want to bring a test (or code) back from the dead that's what version
control is for.


The threading test is indicating a perl bug that's very difficult to 
 detect
[2], only seems to exist in vendor patched perls, I can't do anything 
 about
and is unlikely to effect anyone since there's so few threads users.  It's
already been reported to the various vendors but it'll clear up as soon as
they stop mixing bleadperl patches into 5.8.
 
In short, I'm paying for somebody else's known bugs.  I get nothing.
Test::More gets nothing.  The tools get nothing.  Cost with no benefit.  
 So
why am I incurring these costs?  Maybe the individual users find out their
tools are broken, but it's not my job to tell them that.
 
 During smoking CPAN I often find bugs in one module revealed by a test
 in another one... Only because David Golden tests so hard his tests were
 well suited to reveal a bug in Test::Harness. I'm glad he doesn't ask
 if it is his job or not. Just a few RT headlines of the past year:...
 Catalyst::Plugin::Authorization::Roles found a bug in C::P::Authentication.
 DBI 1.601 broke Exception::Class::DBI. HTML-TreeBuilder-XPath 0.09
 broke Web::Scraper. Test::Distribution 1.29 broke Lingua::Stem.
 Math-BigInt-FastCalc broke Convert::ASN1. Test::Harness 3.0 broke POE.
 DBM-Deep-1.0006 broke IPC::PubSub. DateTime-Locale-0.35 broke
 Strptime. Data::Alias 1.06 breaks Devel::EvalContext. Class::Accessor
 breaks Class::Accessor::Class. DBIx-DBSchema-0.33 breaks Jifty::DBI.
 File::chdir 0.08 breaks Module::Depends 0.12. Lingua::Stem::It 0.02
 breaks the Lingua::Stem testsuite. SVN-Notify-0.26 breaks
 SVN::Notify::Config (and others). Heap 0.80 breaks Graph. DBI-1.53's
 NUM_OF_FIELDS change breaks DBD-Sybase 1.07. Getopt-Long 2.36 breaks
 Verilog::Language. And so on.

I agree that's all very useful.  Interlocking dependent test suites ferret out
bugs the original authors wouldn't find.  However, there is a very important
difference between the list above and Test::More's signature test.

On a quick scan, all of those modules have direct dependencies.  DBD::Sybase
uses DBI, Lingua::Stem uses Test::Distribution, etc... so it's natural that
their tests would test their dependencies.  If a dependency breaks, they
break.  I'm sure most of the authors above did not set out with the intention
to test their dependencies, it's all inherent in the testing of their own code.

Test::More doesn't actually use Module::Signature, so why is it testing it?

It would be like if DBI decided to add a test to make sure MakeMaker can read
the MANIFEST.  Sure it's useful to know that part of the toolchain works and
that the MANIFEST can be read, but why is that in DBI?  One can argue that DBI
depends on the good functioning of ExtUtils::Manifest to install, so it should
test it.  Ok, then what about all the other things DBI depends on to install?
 Should it test that MakeMaker can make a valid Makefile?  Should it test that
tar and gzip work?  Should I check that CPAN.pm can properly...

Re: What's the point of a SIGNATURE test?

2007-12-16 Thread Michael G Schwern
Andreas J. Koenig wrote:
 On Sat, 15 Dec 2007 01:34:37 -0800, Michael G Schwern [EMAIL 
 PROTECTED] said:
 
   See above. Once the bug is reported there is no justification to keep
   the test around. In this case I prefer a skip over a removal because
   the test apparently once was useful.
 
But skipped tests don't get run so it's effectively deleted, except a
permanently skipped test sits around cluttering things up.  Smells like
commenting out code that maybe someday you might want to use again in the
future.  Just adds clutter.
 
If I want to bring a test (or code) back from the dead that's what version
control is for.
 
 I think I did indicate I was talking about a $VERSION-dependent skip.
 
 Let me reiterate.
 
 A test reveals a bug in module A, version N. The bug now is known and
 filed to RT. No need to run it again and again. Skip it ***if version
 N of module A is installed***. Apparently the test was useful to
 detect a malfunctioning of module A. Do not throw it away until you
 have verified that the test has found a better home. If it has found a
 better home for sure, I do not care if you delete it.
 
 Otherwise it is vital to keep the test because it has proved to be
 useful. It is unacceptable to run the test on the broken version
 over and over again. A $VERSION check should be sufficient from that
 point in time on.
 
 What if everybody on CPAN deletes tests just because a related bug has
 been fixed? Nobody would notice if the bug were reintroduced.
 
 Nuff said?

Now I understand, I thought you meant an unconditional skip.
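
For the record, that kind of version-dependent skip is cheap to write.  A
minimal sketch with Test::More (Module::A, its version and frobnicate() are
made-up names, standing in for the real module and bug):

    use Test::More tests => 1;
    use Module::A;

    SKIP: {
        # 1.23 is the hypothetical release with the known, already-reported bug.
        skip "Module::A 1.23 has a known, reported bug", 1
            if Module::A->VERSION == 1.23;

        is( Module::A->frobnicate, 42, "frobnicate() works" );
    }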


-- 
If at first you don't succeed--you fail.
-- Portal demo


Re: [ANNOUNCE] TAP::Harness::Archive 0.03

2007-12-16 Thread Michael G Schwern
nadim khemir wrote:
 On Saturday 15 December 2007 20.53.30 Michael Peters wrote:
 The uploaded file

 TAP-Harness-Archive-0.03.tar.gz
 ...
 
 Nice. Now, what do we do with it?

You RTFM.

http://search.cpan.org/perldoc/TAP::Harness::Archive


-- 
If at first you don't succeed--you fail.
-- Portal demo


Re: Fwd: Auto: Your message 'FAIL IO-AIO-2.51 i386-freebsd-thread-multi 6.2-release' has NOT been received

2007-12-18 Thread Michael G Schwern
chromatic wrote:
 On Tuesday 18 December 2007 17:27:24 Andy Armstrong wrote:
 
 Someone (MLEHMANN) doesn't like smoking... That was a test report
 generated by CPAN::Reporter.

 It hadn't previously occurred to me that test reports might cause
 offence...
 
 Didn't you get a whole slew of them a while back where the problem was
 that the reporter hadn't properly configured Windows to build modules?  How 
 about the one where the reporter had configured CPAN never to follow 
 dependencies?

That said, looking through IO::AIO's failures they seem reasonably legit to
me.  It has trouble on BSD, and some other systems, a useful thing to know.
IO::AIO lacks any special INSTALL instructions or special notes about BSD in
general.  There are a couple notes generated by the Makefile.PL about FreeBSD
and threading and Linux and malloc issues, but that will whiz by and likely be
completely missed.  So even a human installer would not know what to do.

Anyhow, what's clear is there is a problem with IO::AIO.  It hasn't been
addressed properly by the author.  While it's frustrating to get a constant
stream of "your shit is broke", his shit is indeed broke.  This is a clear
case of CPAN Testers technology working as expected and tickling a social 
problem.

It is particularly near to my heart as Test::More has a similar problem with
thread tests and I'm not sure what to do about it.  There was the suggested
"author does not care" marker for tests which might fail but are only for the
information of the installer -- the author already knows about them, don't
report the failure.

As for the social problem, the BSD testers could try to help out with whatever
the problem is.  On Marc's side he could ask for help instead of asking
everyone to turn off the immensely useful automated testing.  It could also
use an INSTALL doc and have the Makefile.PL warnings be more prominent with
perhaps a pause, beep or a well-behaved "Do you wish to continue? [No]".


-- 
I am somewhat preoccupied telling the laws of physics to shut up and sit down.
-- Vaarsuvius, Order of the Stick
   http://www.giantitp.com/comics/oots0107.html


Re: Auto: Your message 'FAIL IO-AIO-2.51 i386-freebsd-thread-multi 6.2-release' has NOT been received

2007-12-19 Thread Michael G Schwern
Andy Armstrong wrote:
 On 19 Dec 2007, at 03:13, Michael G Schwern wrote:
 Anyhow, what's clear is there is a problem with IO::AIO.  It hasn't been
 addressed properly by the author.  While it's frustrating to get a
 constant
 stream of "your shit is broke", his shit is indeed broke.  This is a
 clear
 case of CPAN Testers technology working as expected and tickling a
 social problem.
 
 I'm locked in correspondence with Marc now.
 
 His view: cpan-testers are incompetent, ego tripping, quasi-religious
 nuisances.
 My view: approx your view.
 
 Obviously that's my (probably extremely unprofessional) impression of
 his views. He did mention religion and ego though :)

CPAN Testers does mug his modules pretty badly, just look at all that red.
http://cpantesters.perl.org/author/MLEHMANN.html

He does an awful lot of XS which is always going to be problematic.


-- 
Hating the web since 1994.


Re: Auto: Your message 'FAIL IO-AIO-2.51 i386-freebsd-thread-multi 6.2-release' has NOT been received

2007-12-20 Thread Michael G Schwern
Michael Peters wrote:
 David Golden wrote:
 On Dec 20, 2007 1:19 PM, Dave Rolsky [EMAIL PROTECTED] wrote:
 It's generally
 pretty rare that the failure report includes enough information for me to
 do anything about it, so without an engaged party on the other end, it
 really is just noise.
 With CPAN::Reporter, I've been trying to add additional context
 (within reason) to assist with problem diagnosis.  What kind of
 information would improve the reports?  (Not to say that this obviates
 the need for a responsive tester, but every little bit helps.)
 
 I for one would like the full TAP output of the tests. Not just what gets 
 sent
 to STDOUT by default. What would be ideal (and it's something that RJBS has
 poked me about before) would be to receive a TAP Archive (prove --archive) 
 that
 could get attached to the email. Of course this needs to be opt-in 
 (META.yml?).
 Then it would be pretty easy to setup an email account that is monitored by 
 some
 tool that would extract the archive and upload it to a Smolder install.

Altering how the tests run just for CPAN Testers:  Bad.
Trying to get authors to put in special code just for CPAN Testers:  Good Luck

This could be accomplished silently with some environment variables.
TAP_PARSER_ARCHIVE_DIR=/path/to/somewhere

There is the problem of getting TAP::Parser to recognize that the archive
feature is available and that gets back to the open TAP::Builder problem.


Re: Auto: Your message 'FAIL IO-AIO-2.51 i386-freebsd-thread-multi 6.2-release' has NOT been received

2007-12-20 Thread Michael G Schwern
Andy Armstrong wrote:
 On 21 Dec 2007, at 00:11, Michael G Schwern wrote:
 This could be accomplished silently with some environment variables.
 TAP_PARSER_ARCHIVE_DIR=/path/to/somewhere
 
 It's called PERL_TEST_HARNESS_DUMP_TAP and it already exists.

Well there you go.

Though might I suggest that...

A)  This be documented in Test::Harness, I note it's only in
TAP::Harness.
B)  The TAP::Harness version be changed to PERL_TAP_HARNESS_DUMP_TAP.  Don't
want Test::Harness features leaking into TAP::Harness.
C)  The Test::Harness version be changed to HARNESS_DUMP_TAP (to match all
the other environment variables)

All that HARNESS_DUMP_TAP would do is twiddle the appropriate TAP::Harness bit.

I also note that setting the environment variable appears to be the *only* way
to get this behavior.  It would be handy if there was a normal parameter to
set it.
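
For example (assuming a TAP::Harness based prove), something like this should
dump each test's raw TAP into the named directory:

    $ PERL_TEST_HARNESS_DUMP_TAP=tap-output/ prove -l t/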

If I can sort out my SVK breakage I'll get on it.


 There is the problem of getting TAP::Parser to recognize that the archive
 feature is available and that gets back to the open TAP::Builder problem.
 
 I don't understand...

Sorry, I thought an optional plugin/subclass was necessary to get TAP::Harness
to save TAP.  Didn't realize it's built in.


-- 
If at first you don't succeed--you fail.
-- Portal demo


Re: Auto: Your message 'FAIL IO-AIO-2.51 i386-freebsd-thread-multi 6.2-release' has NOT been received

2007-12-22 Thread Michael G Schwern
chromatic wrote:
 On Saturday 22 December 2007 16:48:29 Michael G Schwern wrote:
The "I installed to a directory with a space in the path" is an example of
 CPAN Testers working as expected.  It found and highlighted an annoying bug
 that the rest of us either ignore or work around.
 
 CPAN Testers reporting failures in every module they test and not stopping to 
 ask "Hey, is it possible that not everything else in the world is broken?" is 
 *not* an example of CPAN Testers working as expected.

 Environments where it's impossible even to *build* Perl modules are 
 unsuitable 
 for smoketest reporting, as they don't provide any useful information and 
 they make true failures much more difficult to see and believe.

One of the drawbacks of extensive automation is that special cases which
require human intervention cannot be handled and can often go from minor
annoyances to pandemics.  The proper response is to either fix the automation
to eliminate the necessity of human interaction or, if it's a bug, fix the bug.
   A little work now and the system will run even more efficiently than before.

These sorts of systemic failures might not be providing *new* information, but
I wouldn't say it's not useful.  One of the greatest problems facing
collecting bug reports (or, in fact, any survey technique) is getting honest,
unfiltered feedback.  Humans have a tendency to filter out negative feedback,
especially if it's perceived to be a known problem with a narrow focus.  CPAN
testers is giving us honest, unfiltered feedback. [1]  We get to see all
the problems and the breadth of them.  This makes it difficult to ignore long
standing problems, like configuration level dependencies or non-Perl
dependencies or failures on BSD or CPANPLUS fighting with Module::Build or all
the other things WE know how to work around but others don't and make the end
user experience annoying.  Spaces in filenames are just the next problem to fix.

Even in cases where Perl is broke, it's nice to know how it got into that
state and if we can do anything about it.

The price we pay is a little more email in the inbox. [2]  I'm willing to pay
that.


[1] Within the range of the set of people doing the testing, of course.

[2] Rather than each individual emailing the author how about CPAN Testers
sends out a daily/weekly digest?  Still push and mandatory, it's important
that authors see this information, but at least it's just one email and it can
skip a lot of the boilerplate text and provide a nice, tight summary.


-- 
I have a date with some giant cartoon robots and booze.


Re: Auto: Your message 'FAIL IO-AIO-2.51 i386-freebsd-thread-multi 6.2-release' has NOT been received

2007-12-22 Thread Michael G Schwern
chromatic wrote:
 I just went through a sampling of fail reports for my stuff.  There was one 
 legitimate packaging bug, and a couple of legitimate errors due to updates to 
 Perl.  About 35% of the other reports are these.
 
 I love the Illegal seek error message:
 
   http://www.nntp.perl.org/group/perl.cpan.testers/2007/09/msg602208.html

As three different reporters across five different operating systems and three
versions of perl reported similar test failures, I believe the illegal seek is
suspicious but incidental.


 Pod::Man is broken.  Think about that for a while:
 
   http://www.nntp.perl.org/group/perl.cpan.testers/2006/01/msg286775.html

A temporary failure almost two years ago.  But the six other Acme::UNIVERSAL
failures appear to be legit.


 No information; useless:
 
   http://www.nntp.perl.org/group/perl.cpan.testers/2005/07/msg221656.html
   http://www.nntp.perl.org/group/perl.cpan.testers/2006/02/msg290834.html
   http://www.nntp.perl.org/group/perl.cpan.testers/2005/07/msg223400.html
   http://www.nntp.perl.org/group/perl.cpan.testers/2005/07/msg223401.html
   http://www.nntp.perl.org/group/perl.cpan.testers/2005/07/msg223402.html
   http://www.nntp.perl.org/group/perl.cpan.testers/2005/07/msg222475.html
   http://www.nntp.perl.org/group/perl.cpan.testers/2005/06/msg216573.html

All the "no information" reports appear to be prior to the fixes for
CPANPLUS not reporting Module::Build test results.  A bug in the system that
was, I believe, fixed about two years ago.  You'll note the newest
of this bunch is two years old.

That leaves us with... I count one failure due to an individual CPAN tester's
setup being broken with another that amounts to a warning.  The rest are
system-wide bugs.

That CPANPLUS didn't report Module::Build test results is a highly annoying
bug to be sure (in fact, the whole CPANPLUS v Module::Build war is a tragedy),
but a bug that was fixed.  There's nothing we can do about previous mistakes,
only future ones.  No use dwelling on the past, let's see how CPAN Testers is
treating your current releases...

* P5NCI, 11 Dec 2007... looks like a valid pile of XS compatibility issues
with similar failures coming from several different testers.  Valid failures.

* UNIVERSAL::isa, 24 Nov 2007... looks like they caught a compatibility issue
with 5.5.5 and 5.6.2 across multiple testers.  Valid failures, all.  And it's
an alpha release, isn't it nice to have people testing your alphas?  Previous
stable release in Feb 2006 has one failure out of 400 tests.  Looks like a
valid failure, possibly due to CGI.pm not being available and the test
checking for availability but not skipping the dependent test. [2]

* Test::MockObject, 29 Jun 2007... 100% passing with 201 tests.  Previous
version from October 2006 has only one failure with 153 passes, possibly due
to a bleadperl issue.  Maybe annoying to you, but useful for bleadperl
development. [1]

* Text::WikiFormat, 29 Jun 2007... 100% passing with 71 tests.  Previous
version had 1 failure out of 43 tests.  Again, possible bleadperl issue.

* SUPER, 04 Apr 2007... one failure, a possible bleadperl bug.

Depending on which way you score the bleadperl failures for your latest five
distribution releases you've had zero or two false negatives out of, let's add
it up... 51 + 50 + 201 + 71 + 46 == 419.  So either a 0 or 0.5% false failure
rate.  That's pretty damn good.


[1] It can be argued that bleadperl testers should probably not email authors,
and maybe they aren't, I can't tell from these archives, but at least the work
is useful.  CPAN::Reporter could change the default configuration if it
detects a development perl.

[2] And before you say "how do you install perl without CGI.pm?!" it can be
done with a stripped down Debian system via their perl-base package and
possibly other perl distributions.


-- 
We do what we must because we can.
For the good of all of us,
Except the ones who are dead.
-- Jonathan Coulton, Still Alive


Re: Auto: Your message 'FAIL IO-AIO-2.51 i386-freebsd-thread-multi 6.2-release' has NOT been received

2007-12-23 Thread Michael G Schwern
David Golden wrote:
 On Dec 23, 2007 2:37 AM, Michael G Schwern [EMAIL PROTECTED] wrote:
 [1] It can be argued that bleadperl testers should probably not email 
 authors,
 and maybe they aren't I can't tell from these archives, but at least the work
 is useful.  CPAN::Reporter could change the default configuration if it
 detects a development perl.
 
 That's quite reasonable -- submit to CPAN Testers to help p5p check
 bleadperl against CPAN but don't annoy authors if it fails.  What's
 the best way to detect a development perl reliably?  I don't think
 it's just odd major numbers, as 5.9.5 switched to 5.10.0 well before
 the actual release candidates were out.  Maybe
 $Config{perl_patchlevel}?  That seems to have vanished from the final
 release.

That's ok, it doesn't need to be foolproof.  Odd numbered versions (starting
at 7) is a good start and will cut out most of the bleadperl noise.

The "5.even as devel" period is very short.  CPAN authors should be made aware
of how their code works with release candidates.   That's a period when
problems are likely to be for real.
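
A rough sketch of that heuristic, using nothing but Config (not foolproof, as
noted; perl_patchlevel only shows up on perls built from a patched source tree):

    use Config;

    # Odd-numbered 5.x releases (5.7.x, 5.9.x, 5.11.x...) are development perls.
    my $is_devel_perl = ($Config{PERL_VERSION} % 2)
                     || $Config{perl_patchlevel};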


-- 
I do have a cause though. It's obscenity. I'm for it.
- Tom Lehrer


Re: Auto: Your message 'FAIL IO-AIO-2.51 i386-freebsd-thread-multi 6.2-release' has NOT been received

2007-12-23 Thread Michael G Schwern
Michael G Schwern wrote:
 David Golden wrote:
 On Dec 23, 2007 2:37 AM, Michael G Schwern [EMAIL PROTECTED] wrote:
 [1] It can be argued that bleadperl testers should probably not email 
 authors,
 and maybe they aren't I can't tell from these archives, but at least the 
 work
 is useful.  CPAN::Reporter could change the default configuration if it
 detects a development perl.
 That's quite reasonable -- submit to CPAN Testers to help p5p check
 bleadperl against CPAN but don't annoy authors if it fails.  What's
 the best way to detect a development perl reliably?  I don't think
 it's just odd major numbers, as 5.9.5 switched to 5.10.0 well before
 the actual release candidates were out.  Maybe
 $Config{perl_patchlevel}?  That seems to have vanished from the final
 release.
 
 That's ok, it doesn't need to be foolproof.  Odd numbered versions (starting
 at 7) is a good start and will cut out most of the bleadperl noise.
 
 The 5.even as devel period is very short.  CPAN authors should be made aware
 of how their code works with release candidates.   That's a period when
 problems are likely to be for real.

Thinking on this a little more, there is the issue of folks like me who share
a single CPAN configuration file across multiple Perl installations.  I don't
know how common that is to have a stable and devel perl running off the same
CPAN config and if it's worth adding in a special case in the configuration
for "what do you want to do with development perls" to override the existing
config.


-- 
If at first you don't succeed--you fail.
-- Portal demo


Re: MakeMaker warning

2007-12-26 Thread Michael G Schwern
Gabor Szabo wrote:
 might be slightly unrelated to QA... sorry

In the future, MakeMaker issues go to [EMAIL PROTECTED]


 After installing   JOSHUA/Net-Telnet-Cisco-1.10.tar.gz
 if I run perl Makefile.PL  on an unrelated Makefile.PL that
 requires 'Net::Telnet::Cisco'  = '1.10'
 I get a warning
 
  Argument "1.3.1" isn't numeric in numeric lt (<) at
 /opt/perl510/lib/5.10.0/ExtUtils/MakeMaker.pm line 414.
 
 I did not get this warning before installing Net::Telnet::Cisco
 
 Strange thing is that when I ack for 1.3.1 the only place I found it is
 
 /opt/perl510/lib/site_perl/5.10.0/Sys/HostIP.pm
 7:$VERSION = '1.3.1';
 
 So my diagnostics pointing at Net::Telnet::Cisco might be incorrect.
 
 So which module is at fault here?

The warning is absolutely correct, "1.3.1" isn't numeric.  Nor is it a version
object as it probably should be.  So that much of Sys::HostIP is wrong.

As to why it's happening in your apparently unrelated Makefile.PL, I can't
say.  Double check that somewhere down the line you're not listing Sys::HostIP
as a prereq.
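
For what it's worth, the warning is just what you get when a non-numeric string
lands in a numeric comparison, which is roughly what the prereq check does (a
simplified illustration, not MakeMaker's actual code):

    use warnings;

    my $required  = 1.10;
    my $installed = '1.3.1';    # Sys::HostIP's $VERSION

    # warns: Argument "1.3.1" isn't numeric in numeric lt (<) at ...
    print "prerequisite too old\n" if $installed < $required;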


-- 
Look at me talking when there's science to do.
When I look out there it makes me glad I'm not you.
I've experiments to be run.
There is research to be done
On the people who are still alive.
-- Jonathan Coulton, Still Alive


Re: MakeMaker warning

2007-12-29 Thread Michael G Schwern
Gabor Szabo wrote:
 Is there a place with definition of what a VERSION value can be ?

Anything which compares sanely as a number plus the X.YY_ZZ alpha convention
(which MM converts to a number).  I guess that's never stated explicitly.  I'd
welcome a section on $VERSION in ExtUtils::MakeMaker::Tutorial, VERSION_FROM
isn't really the right place for it.
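
To make that concrete, a few examples (my shorthand, not official documentation):

    $VERSION = 1.23;        # fine: a plain number
    $VERSION = '1.23';      # fine: numifies cleanly
    $VERSION = '1.23_01';   # fine: the X.YY_ZZ alpha convention
    $VERSION = '1.2.3';     # trouble: doesn't compare sanely as a number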


 I looked at the docs of ExtUtils::MakeMaker.
 It mentions version numbers under VERSION_FROM but only as examples
 it has examples like 1.2.3 though not in a string format like '1.2.3'.

1.2.3 is an ill-fated version string (not to be confused with version objects)
introduced in 5.6.  I don't think they actually work anyway... no they don't.
 I'll remove them.


 CPANTS thinks it is a correct version number:
 http://cpants.perl.org/dist/kwalitee/Sys-HostIP
 
 Maybe the way CPANTS check isn't correct but I think there is some mismatch.

It's using CPAN::DistnameInfo which is looking at formatting details.  All
MakeMaker cares about is capabilities.  So there's likely to be some mismatch.

It's also not clear that CPAN::DistnameInfo cares about vetting the $VERSION
so much as simply extracting it.

A simple test for capabilities would be this:

$version =~ s/(\d+)\.(\d+)_(\d+)/$1.$2$3/;  # turn X.YY_ZZ into X.YYZZ
{
    local $SIG{__WARN__} = sub { die "Bad version: @_" };
    () = $version >= 0;
}
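
Wrapped up as a function, the idea looks something like this (a sketch, not
anything MakeMaker actually ships):

    use warnings;

    sub version_looks_sane {
        my $version = shift;
        $version =~ s/(\d+)\.(\d+)_(\d+)/$1.$2$3/;  # turn X.YY_ZZ into X.YYZZ

        local $SIG{__WARN__} = sub { die "Bad version: @_" };
        return eval { () = ($version >= 0); 1 } ? 1 : 0;
    }

    print version_looks_sane('1.23')    ? "ok\n" : "bad\n";   # ok
    print version_looks_sane('1.23_01') ? "ok\n" : "bad\n";   # ok
    print version_looks_sane('1.3.1')   ? "ok\n" : "bad\n";   # bad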


-- 
Stabbing you in the face so you don't have to.


Re: demining

2008-01-03 Thread Michael G Schwern
Eric Wilhelm wrote:
 # from Aristotle Pagaltzis
 # on Wednesday 02 January 2008 16:47:
 
 looking for (and diffusing) mines
 That sounds like a novel approach! Or do you mean “defusing”? :-)
 
 Yeah :-D  Diffuse is probably what they do when you find them the less 
 careful way!
 
 I guess the tank+flail mechanism is still in use, so that would 
 adequately describe that process.  Apparently it makes a huge mess 
 though.

Yes.  Mine goes off, chain snaps and goes wheee into some poor
soldier's face.  They're only used in modern day to clear little anti-personnel
mines because they're relatively cheap and fast and don't require an armored
vehicle.

These days tactical (ie. small scale, under fire) mine clearing uses either a
tank mounted plow
http://en.wikipedia.org/wiki/Image:951219-O-9805M-005.jpg

Or a big, heavy roller mounted on springs in front of a tank to set off mines.
http://en.wikipedia.org/wiki/Image:M60-panther-mcgovern-base.jpg

Or you fire a rocket with an explosive filled hose attached across the
minefield and set it off detonating any mines in its path.

All of which make big messes.

My personal favorite... rats!
http://www.apopo.org/


-- 
...they shared one last kiss that left a bitter yet sweet taste in her
mouth--kind of like throwing up after eating a junior mint.
-- Dishonorable Mention, 2005 Bulwer-Lytton Fiction Contest
   by Tami Farmer


Re: Testing print failures

2008-01-05 Thread Michael G Schwern
nadim khemir wrote:
 print 'hi' or carp q{can't print!} ;

I'm not even going to wade into the layers of neurosis demonstrated in this
post, but if you want to throw an error use croak().


-- 
...they shared one last kiss that left a bitter yet sweet taste in her
mouth--kind of like throwing up after eating a junior mint.
-- Dishonorable Mention, 2005 Bulwer-Lytton Fiction Contest
   by Tami Farmer


Re: Dev version numbers, warnings, XS and MakeMaker dont play nicely together.

2008-01-06 Thread Michael G Schwern
demerphq wrote:
 So we are told the way to mark a module as development is to use an
 underbar in the version number:
 
 $VERSION= 1.23_01;
 
 but this will produce warnings if you assert a required version
 number, as the version isn't numeric.

We talked about this recently on [EMAIL PROTECTED]  Specifically how much
the convention sucks and replacing it with META.yml info.
http://www.nntp.perl.org/group/perl.module.build/2007/12/msg1151.html


-- 
Life is like a sewer - what you get out of it depends on what you put into it.
- Tom Lehrer


Re: Fixed Test::Builders regexp detection code.

2008-01-06 Thread Michael G Schwern
demerphq wrote:
 Just a heads up that I patched the core version of Test::Builder to
 use more reliable and robust methods for detecting regexps in test
 cases. This makes them robust to changes in the internals and also
 prevents Test::Builder from getting confused if someone uses blessed
 qr//'s.

Thanks.

For future reference, patches to dual core modules should please go upstream
to the CPAN version's bug tracker.

The bug tracker for Test::Builder is here.
http://rt.cpan.org/NoAuth/Bugs.html?Dist=Test-Simple

or here
[EMAIL PROTECTED]


-- 
Stabbing you in the face so you don't have to.


Re: Call for Attention: Perl QA Hackathon in Oslo

2008-01-08 Thread Michael G Schwern
Salve J Nilsen wrote:
 Oslo.pm is planning a Perl QA Workshop/Hackathon in Oslo, Saturday
 April 4th to Monday April 7th, 2008.

FWIW, if I can get sponsored, I'm going with bells on.


 Just to make things more interesting, IEEE will have a conference on
 software testing nearby (in Lillehammer, Norway), just a few days after
 the workshop/hackathon.
 
   http://www.cs.colostate.edu/icst2008/

And if I do get over there I'm totally crashing this.


-- 
ROCKS FALL! EVERYONE DIES!
http://www.somethingpositive.net/sp05032002.shtml


Dude, where's my diagnostics? (Re: Halting on first test failure)

2008-01-11 Thread Michael G Schwern
Ovid wrote:
 I've posted a trimmed down version of the custom 'Test::More' we use
 here:
 
   http://use.perl.org/~Ovid/journal/35363
 
 I can't recall who was asking about this, but you can now do this:
 
   use Our::Test::More 'no_plan', 'fail';
 
 If 'fail' is included in the import list, the test program will die
 immediately after the first failure.  VERY HANDY at times.

I've experimented with this idea in the past to use Test::Builder to replace
home rolled die on failure assert() style test suites.  Unfortunately
there's a major problem:

$ perl -wle 'use OurMore "fail", "no_plan";  is 23, 42'
not ok 1
#   Failed test at /usr/local/perl/5.8.8/lib/Test/More.pm line 329.
Test failed.  Halting at OurMore.pm line 44.
1..1

Dude, where's my diagnostics?

In Test::Builder, the diagnostics are printed *after* the test fails.  So
dying on ok() will kill those very important diagnostics.  Sure, you don't
have to read a big list of garbage but now you don't have anything to read at 
all!

Since the diagnostics are printed by a calling function outside of
Test::Builder's control (even if you cheated and wrapped all of Test::More
there's all the Test modules on CPAN, too) I'd considered "die on failure"
impossible. [1]  The diagnostics are far more important.


Now, getting into opinion, I really, really hate "die on failure".  I had to use
a system that implemented it for a year (Ovid knows just what I'm talking
about) and I'd rather scroll up through an occasional burst of errors and
warnings than ever not be able to fully diagnose a bug because a test bailed
out before it was done giving me all the information I needed to fix it.  For
example, let's look at the ExtUtils::MakeMaker tests for generating a PPD file.

ok( open(PPD, 'Big-Dummy.ppd'), '  .ppd file generated' );
my $ppd_html;
{ local $/; $ppd_html = <PPD> }
close PPD;
like( $ppd_html, qr{^<SOFTPKG NAME="Big-Dummy" VERSION="0,01,0,0">}m,
   '  <SOFTPKG>' );
like( $ppd_html, qr{^\s*<TITLE>Big-Dummy</TITLE>}m,      '  <TITLE>'   );
like( $ppd_html, qr{^\s*<ABSTRACT>Try "our" hot dog's</ABSTRACT>}m,
   '  <ABSTRACT>');
like( $ppd_html,
  qr{^\s*<AUTHOR>Michael G Schwern &lt;[EMAIL PROTECTED]&gt;</AUTHOR>}m,
   '  <AUTHOR>'  );
like( $ppd_html, qr{^\s*<IMPLEMENTATION>}m,      '  <IMPLEMENTATION>');
like( $ppd_html, qr{^\s*<DEPENDENCY NAME="strict" VERSION="0,0,0,0" />}m,
   '  <DEPENDENCY>' );
like( $ppd_html, qr{^\s*<OS NAME="$Config{osname}" />}m,
   '  <OS>'  );
my $archname = $Config{archname};
$archname .= "-". substr($Config{version},0,3) if $] >= 5.008;
like( $ppd_html, qr{^\s*<ARCHITECTURE NAME="$archname" />}m,
   '  <ARCHITECTURE>');
like( $ppd_html, qr{^\s*<CODEBASE HREF="" />}m,  '  <CODEBASE>');
like( $ppd_html, qr{^\s*</IMPLEMENTATION>}m,     '  </IMPLEMENTATION>');
like( $ppd_html, qr{^\s*</SOFTPKG>}m,            '  </SOFTPKG>');

Let's say the first like() fails.  So you go into the PPD code and fix that.
Rerun the test.  Oh, the second like failed.  Go into the PPD code and fix
that.  Oh, the fifth like failed.  Go into the PPD code and fix that...

Might it be faster and useful to see all the related failures at once?

And then sometimes tests are combinatorial.  A failure of A means one thing
but A + B means another entirely.

Again, let's look at the MakeMaker test to see if files got installed.

ok( -e $files{'dummy.pm'}, '  Dummy.pm installed' );
ok( -e $files{'liar.pm'},  '  Liar.pm installed'  );
ok( -e $files{'program'},  '  program installed'  );
ok( -e $files{'.packlist'},'  packlist created'   );
ok( -e $files{'perllocal.pod'},'  perllocal.pod created' );

If the first test fails, what does that mean?  Well, it could mean...

A)  Only Dummy.pm failed to get installed and it's a special case.
B)  None of the .pm files got installed, but everything else installed ok.
C)  None of the .pm files or the programs got installed, but the
generated files are ok
D)  Nothing got installed and the whole thing is broken.

Each of these things suggests different debugging tactics.  But with a "die on
failure" system they all look exactly the same.


Oooh, and if you're the sort of person that likes to use the debugger it's
jolly great fun to have the test suite just KILL THE PROGRAM when you want to
diagnose a post-failure problem.


There are two usual rebuttals.  The first is "well, just turn off
die-on-failure and rerun the test".  Ovid's system is at least capable of
being turned off, many hard code failure == die.  Unfortunately Ovid's is at
the file level; it should be at the user level since the "do I or do I not
want to see the gobbledygook" decision is more a user preference.

But we all know the problems with the "just rerun the tests" approach...

Re: Dude, where's my diagnostics? (Re: Halting on first test failure)

2008-01-12 Thread Michael G Schwern
Ovid wrote:
 I'll go fix that diagnostic thing now.  Unfortunately, I think I'll
 have to violate encapsulation :(

If you know how to fix it let me know, because other than enumerating each
testing module you might use and lex-wrapping all the functions they export,
I'm not sure how to do it.  Test::Builder could cheat and register each module
as they load Test::Builder, but that relies on their using Test::Builder and
not requiring it.  Or builder() or new() could do the registration, but
there's no guarantee that they'll be called in the right package... however,
it is very likely.

One possibility involves taking advantage of $Level, so at least Test::Builder
knows which is the test function the user called, and then, somehow, inserting
the code necessary to cause failure when that function exits.  I don't know
how you insert code to run when a function that's already being executed exits.

This is why I altered the recommended calling conventions for Test::Builder to
call ->builder at the beginning of each function rather than just use one
global.  Then at least I can use the builder object's DESTROY method to
indicate the end of a test to trigger this stuff.
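
One way to picture it is a little per-call guard object whose DESTROY fires
when the calling function exits (hypothetical names, nothing like this exists
in Test::Builder today):

    package Test::Builder::CallGuard;   # hypothetical

    sub new     { my($class, $on_end) = @_;  return bless { on_end => $on_end }, $class }
    sub DESTROY { $_[0]->{on_end}->() }

    package main;

    # A test function grabs a guard at its top; when the function returns,
    # however it returns, DESTROY fires and that's the safe point to halt
    # on failure.
    sub my_test_function {
        my $guard = Test::Builder::CallGuard->new(
            sub { print "user-called test function finished\n" }
        );
        # ... the real ok()/diag() work would go here ...
    }

    my_test_function();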

There is, of course, a way to eliminate the problem at the source.  Since the
issue is the spewing test output and then having to scroll up to find the
original point of failure, perhaps the solution is not to truncate the output
but to use something better than just raw terminal output.  If only there was
something that could... I don't know... read the TAP and error messages and
produce a nicer output.  Some sort of TAP parser... :P


-- 
Schwern What we learned was if you get confused, grab someone and swing
  them around a few times
-- Life's lessons from square dancing


Re: Dude, where's my diagnostics? (Re: Halting on first test failure)

2008-01-12 Thread Michael G Schwern
The whole idea of halting on first failure was introduced to me by some XUnit
folks.  Their rationale was not to avoid spewing output, they had no such
problem since it's all done via a GUI, but that once one failure has happened
the failing code might hose the environment and all following results are now
considered contaminated.  This might make sense in a laboratory, but it seems
a bit like overkill for day-to-day software testing, throwing out perfectly
fine data.  As any field scientist knows, there's no such thing as
uncontaminated data.

The idea that you can diagnose everything from the first failure reminded me
of a gag about tech support that goes something like this:
http://www.netfunny.com/rhf/jokes/97/Oct/techsupport.html

TECH: Ridge Hall computer assistant; may I help you?

CUST: Yes, well, I'm having trouble with WordPerfect.

TECH: What sort of trouble?

CUST: Well, I was just typing along, and all of a sudden the words went
away.

TECH: Went away?

CUST: They disappeared.

TECH: Hmm. So what does your screen look like now?

CUST: Nothing.

TECH: Nothing?

CUST: It's blank; it won't accept anything when I type.

TECH: Are you still in WordPerfect, or did you get out?

CUST: How do I tell?

TECH: Can you see the C prompt on the screen?

CUST: What's a sea-prompt?

TECH: Never mind. Can you move the cursor around on the screen?

CUST: There isn't any cursor: I told you, it won't accept anything I
type.

TECH: Does your monitor have a power indicator?

CUST: What's a monitor?

TECH: It's the thing with the screen on it that looks like a TV. Does it
have a little light that tells you when it's on?

CUST: I don't know.

TECH: Well, then look on the back of the monitor and find where the power
cord goes into it. Can you see that?

CUST: ...Yes, I think so.

TECH: Great! Follow the cord to the plug, and tell me if it's plugged into
the wall.

CUST: ...Yes, it is.

TECH: When you were behind the monitor, did you notice that there were two
cables plugged into the back of it, not just one?

CUST: No.

TECH: Well, there are. I need you to look back there again and find the
other cable.

CUST: ...Okay, here it is.

TECH: Follow it for me, and tell me if it's plugged securely into the back
of your computer.

CUST: I can't reach.

TECH: Uh huh. Well, can you see if it is?

CUST: No.

TECH: Even if you maybe put your knee on something and lean way over?

CUST: Oh, it's not because I don't have the right angle-it's because it's
dark.

TECH: Dark?

CUST: Yes-the office light is off, and the only light I have is coming in
from the window.

TECH: Well, turn on the office light then.

CUST: I can't.

TECH: No? Why not?

CUST: Because there's a power outage.

TECH: A power... a power outage? Aha! Okay, we've got it licked now.  Do
you still have the boxes and manuals and packing stuff your computer came
in?

CUST: Well, yes, I keep them in the closet.

TECH: Good! Go get them, and unplug your system and pack it up just like
it was when you got it. Then take it back to the store you bought it from.

CUST: Really? Is it that bad?

TECH: Yes, I'm afraid it is.

CUST: Well, all right then, I suppose. What do I tell them?

TECH: Tell them you're too stupid to own a computer.


-- 
94. Crucifixes do not ward off officers, and I should not test that.
-- The 213 Things Skippy Is No Longer Allowed To Do In The U.S. Army
   http://skippyslist.com/?page_id=3


Re: Dude, where's my diagnostics? (Re: Halting on first test failure)

2008-01-12 Thread Michael G Schwern
Ovid wrote:
 --- Michael G Schwern [EMAIL PROTECTED] wrote:
 
 The whole idea of halting on first failure was introduced to me by
 some XUnit
 folks ... As any field scientist knows, there's no such thing as
 uncontaminated data.
 
 As any tester knows, a one size fits all suit often doesn't fit.  Let
 people decide for themselves when a particular method of testing is
 appropriate.  I hate you must halt testing on a failure as much as I
 hate you must not halt testing on failure.  It's not XOR.

When it comes to failure, I like to err on the side of more information.


 There's a certain irony that beginning testers are often told to fix
 the *first* error *first* and subsequent errors go away.  I'm not
 saying this is a silver bullet to solve testing, but sometimes it's
 very useful.  

That's the general idea for dealing with syntax errors, too.

The trick is, you don't know ahead of time whether the information from the
follow on failures will prove to be useful.  Can't tell until you see it.  So
don't freak out over all the subsequent failures, fix the first thing and
re-run is a decent plan, but you can't just ignore them either.


 I am feeling a bit stupid because I can't figure out your conclusion. 
 Humor me.  At times it sounds like you're telling people not to do this
 and at times it sounds like you're telling people it's hard to do with
 Test::Builder :)

Yes, I'm saying both.  I don't like it AND it's appears impossible to do right
with TB.  Though I do still ponder how to make it work anyway.


PS  Couldn't you have the TAP harness kill the test process on first failure?

-- 
24. Must not tell any officer that I am smarter than they are, especially
if it’s true.
-- The 213 Things Skippy Is No Longer Allowed To Do In The U.S. Army
   http://skippyslist.com/?page_id=3


Re: Dude, where's my diagnostics? (Re: Halting on first test failure)

2008-01-12 Thread Michael G Schwern
Aristotle Pagaltzis wrote:
 * Michael G Schwern [EMAIL PROTECTED] [2008-01-12 12:00]:
 Ovid wrote:
 I'll go fix that diagnostic thing now. Unfortunately, I
 think I'll have to violate encapsulation :(
 If you know how to fix it let me know, because other than
 enumerating each testing module you might use and lex-wrapping
 all the functions they export, I'm not sure how to do it.
 
 Set a flag that T::B should quit when the next test result is
 about to be recorded?

I guess it works, but it leaves you dead halfway through another test function
which is weird.


 One possibility involves taking advantage of $Level, so at
 least Test::Builder knows which is the test function the user
 called, and then, somehow, inserting the code necessary to
 cause failure when that function exits. I don't know how you
 insert code to run when a function that's already being
 executed exits.
 
 Load the debugger and set a breakpoint?

Oh, good one.  If the debugger wasn't so damned full of bugs that might just
work as a general solution.


-- 
Robrt:   People can't win
Schwern: No, but they can riot after the game.


Re: Preserving diagnostics when dieing on test failure

2008-01-12 Thread Michael G Schwern
Ovid wrote:
 So we can preserve diagnostics, but we need help in cleaning up those
 damned line numbers.  Hook::LexWrap didn't have the magic I thought it
 would.

ok() is now inside a wrapper so you're one level further down than it thinks.
 Just add one to $Level and then take it back off again afterwards.

  wrap 'Test::Builder::ok',
    pre  => sub {
      $_[0]->{XXX_test_failed} = 0;
      $Test::Builder::Level++;
    },
    post => sub {
      $Test::Builder::Level--;
      $_[0]->{XXX_test_failed} = ![ $_[0]->summary ]->[-1];
    };


 Below is how I did it.  See the 'import' method.  There's a lot more
 work to be done to get fine-grained control, but the line numbers are
 the important bit.

Not everything prints more diagnostics, like ok() itself.

$ perl -wle 'use OurMore "fail", "no_plan";  ok(0);  ok(1);  ok(0);  ok(1)'
not ok 1
#   Failed test at -e line 1.
ok 2
not ok 3
#   Failed test at -e line 1.
ok 4
1..4
# Looks like you failed 2 tests of 4.

But you can probably special case that and fail().

The bigger problem is what happens if a function calls diag() more than once,
like Test::Exception.

$ perl -wle 'use OurMore "no_plan";  throws_ok { die; } qr/foo/;  pass()'
not ok 1 - threw Regexp ((?-xism:foo))
#   Failed test 'threw Regexp ((?-xism:foo))'
#   at -e line 1.
# expecting: Regexp ((?-xism:foo))
# found: Died at -e line 1.
ok 2
1..2
# Looks like you failed 1 test of 2.

$ perl -wle 'use OurMore "fail", "no_plan";  throws_ok { die; } qr/foo/;
pass()'
not ok 1 - threw Regexp ((?-xism:foo))
#   Failed test 'threw Regexp ((?-xism:foo))'
#   at -e line 1.
# expecting: Regexp ((?-xism:foo))
Test failed.  Halting at OurMore.pm line 55.
1..1
# Looks like you failed 1 test of 1.
# Looks like your test died just after 1.

(Note the lack of "found")


-- 
94. Crucifixes do not ward off officers, and I should not test that.
-- The 213 Things Skippy Is No Longer Allowed To Do In The U.S. Army
   http://skippyslist.com/?page_id=3


The spewing problem.

2008-01-12 Thread Michael G Schwern
Paul Johnson wrote:
 This is something that I too have asked for in the past.  I've even
 hacked up my own stuff to do it, though obviously not as elegantly as
 you or Geoff.  Here's my use case.
 
 I have a bunch of tests that generally pass.  I hack something
 fundamental and run my tests.  Loads of them fail.  Diagnostics spew
 over my screen.  Urgh, I say.  Now I could scroll back through them.

When faced with a tough problem it's often useful to go back and check that it's
actually the problem and not a solution posing as a problem.

"Make Test::Builder die on failure" is a solution, and it's not a particularly
good one.  It's hard to implement in Test::Builder and there's all the loss of
information issues I've been yelping about.

The problem I'm hearing over and over again is "Test::Builder is spewing crap
all over my screen and obscuring the first, real failure".  So now that the
problem is clearly stated, how do we solve it without making all that spew
(which can be useful) totally unavailable?


-- 
39. Not allowed to ask for the day off due to religious purposes, on the
basis that the world is going to end, more than once.
-- The 213 Things Skippy Is No Longer Allowed To Do In The U.S. Army
   http://skippyslist.com/?page_id=3


Re: Test::Builder statistics

2008-01-12 Thread Michael G Schwern
Ovid wrote:
 My first attempt at determining the most popular testing modules left
 out Test.pm.  Whoops!  I've fixed that.
 
 Out of almost 60,000 test programs, it turns out Test.pm is used 8,937
 times.  Now that I have a file which lists how many times each test
 module is used, I can start examining my extracted CPAN to determine
 what percentage of modules actually use the Test::Builder framework.

FWIW there's a Test::Builder based emulator for Test.pm called Test::Legacy.
I'm sure you can shim your stuff to load Test::Legacy when Test.pm is asked for.


-- 
184. When operating a military vehicle I may *not* attempt something
 “I saw in a cartoon”.
-- The 213 Things Skippy Is No Longer Allowed To Do In The U.S. Army
   http://skippyslist.com/?page_id=3


Re: The spewing problem.

2008-01-12 Thread Michael G Schwern
Matisse Enzer wrote:
 I just want to be able to run a test suite with a switch that makes the
 entire test run stop after the first failure is reported.

Ok, it's nice to want things, but why do you want it?


-- 
100. Claymore mines are not filled with yummy candy, and it is wrong
 to tell new soldiers that they are.
-- The 213 Things Skippy Is No Longer Allowed To Do In The U.S. Army
   http://skippyslist.com/?page_id=3


Re: The spewing problem.

2008-01-13 Thread Michael G Schwern
Matisse Enzer wrote:
 
 On Jan 12, 2008, at 10:24 PM, Michael G Schwern wrote:
 
 Matisse Enzer wrote:
 I just want to be able to run a test suite with a switch that makes the
 entire test run stop after the first failure is reported.

 Ok, it's nice to want things, but why do you want it?
 
 Almost entirely because when I'm developing on a code base with a large
 test suite I often want to stop the test run as fast as possible when a
 failure occurs - currently I do a control-C but would prefer if I could
 use a switch when I run the tests to have it just stop right after the
 first failure.

Ok, why do you want to stop it as fast as possible when a failure occurs?


-- 
164. There is no such thing as a were-virgin.
-- The 213 Things Skippy Is No Longer Allowed To Do In The U.S. Army
   http://skippyslist.com/?page_id=3


Re: The spewing problem.

2008-01-13 Thread Michael G Schwern
Michael Peters wrote:
 Michael G Schwern wrote:
 
 Ok, why do you want to stop it as fast as possible when a failure occurs?
 
 I have a 45 minute test suite and I want to work on the first failure as soon 
 as
 possible. I also have multiple desktops and am doing other things in another
 desktop, so I want to know as soon as the failure happens so that I can start
 working on it:
 
    make test || echo -e "\a"
 
 Would be nice if that would beep after the first failure instead of after 45
 minutes and the whole thing is done.

I keep digging away at this because I'm looking for a problem other than "I
want to see the first failure".  And that's what I'm hearing from you and from
Matisse and everyone else.  Yours is a little different, it's "I want to be
alerted on first failure".

You see how this is distinct from "halt on first failure"?  It gives me a lot
more room for different solutions that don't involve just cutting off all the
following information.


-- 
Look at me talking when there's science to do.
When I look out there it makes me glad I'm not you.
I've experiments to be run.
There is research to be done
On the people who are still alive.
-- Jonathan Coulton, Still Alive


Re: The spewing problem.

2008-01-13 Thread Michael G Schwern
Adam Kennedy wrote:
 This shouldn't be any more complicated than  -g (where g in my case
 stands for goat as in feinting goat)

Ok, I'll bite.  Why a goat and why is it feinting?


-- 
...they shared one last kiss that left a bitter yet sweet taste in her
mouth--kind of like throwing up after eating a junior mint.
-- Dishonorable Mention, 2005 Bulwer-Lytton Fiction Contest
   by Tami Farmer


Re: What should it's name be?

2008-01-14 Thread Michael G Schwern
Gabor Szabo wrote:
 I know I am a bit late to the party but what about  Test::Anything ?

Rapidly drifting towards Test::Anything::Protocol.


-- 
But there's no sense crying over every mistake.
You just keep on trying till you run out of cake.
-- Jonathan Coulton, Still Alive


Re: A New Test::Builder

2008-01-15 Thread Michael G Schwern

Ovid wrote:

Test::Harness used to be very limited.  We couldn't do a lot with it,
but when we started testing, most of us didn't do a lot with it.  As we
understood more about testing, we understood better many things we
wanted.  As a result, Schwern posted a great plan for rewriting
Test::Harness.  It worked and people are taking advantage of this.

Now we're starting to see more and more limitations with Test::Builder.
 I don't want this to come across as bashing chromatic or Schwern, the
two people who've done most of the great work in writing this and
related code.  They produced a great solution and now that we've had a
chance to use it for a while, we have a better idea of what else we
could use.  Of course, this is what most of programming is like.

Part of this is driven by the new Test::Harness and part of it is
driven by people's real-world needs.  I toss the following out not
because I think everyone will agree with it, but because I think it's a
good starting point.  Maybe someone can create TAP::Builder?

  * Make it subclassable.
  * Allowed deferred plans.
  * Allow for TAP upgrades (YAMLish, YAMLish, YAMLish!).
  * On Fail callbacks?  (I realize lots of people will squawk here)


The irony being that if you have N different backends then you can no longer 
guarantee any common behavior between the two which means all the 
Test::Builder hackery proposed here to add common functionality to all test 
modules gets harder, not easier.  On the flip side, some of it becomes possible.


There's the more important question of what *must* remain in common so that 
modules written with either system can still work together in the same test 
process.  That being the plan, the test counter and any end-of-test behaviors 
have to be coordinated which means at minimum they need an object in common to 
register whether there's already a plan and what it is, what the current test 
# is and so on.


And then there's what optionally should be coordinated.  This is stuff like 
what filehandles to output to, historical test data (am I passing?), are we 
skipping or todo'ing, are we using test numbers and so on.  Global behaviors.



--
If at first you don't succeed--you fail.
-- Portal demo


Re: BAIL_OUT and parallel tests

2008-01-17 Thread Michael G Schwern

Ovid wrote:

What should parallel tests do if a BAIL_OUT is encountered?  I think
all parallel tests currently running should be allowed to finish so
they can attempt to cleanup, but no more tests should be started.  Does
this sound reasonable?


It's not entirely clear if "bail out" means "kill the testing process" or "let
the process finish but don't start any more".  If it's the former, then just 
kill everything.  If it's the latter, let them all finish and don't start any 
more.



--
THIS I COMMAND!


Re: is_deeply and qr// content on 5.11

2008-01-18 Thread Michael G Schwern

Ian Malpass wrote:
I got a failure message from CPAN testers for Pod::Extract::URI for 
5.11.0 patch 33001 on Linux 2.6.22-3-amd64 
(x86_64-linux-thread-multi-ld)[0]


The failures were where I was testing to see if an arrayref of qr// 
patterns was the array I was expecting. is_deeply() has heretofore 
worked perfectly well.


The test line in question is:

   is_deeply( $peu->stop_uris, [ qr/foo/ ] );

And the failure is:

#   Failed test at t/new.t line 42.
# Structures begin differing at:
#  $got->[0] = (?i-xsm:foo)
# $expected->[0] = (?-xism:foo)
# Looks like you failed 1 test of 24.

So, is this 5.11 brokenness, is_deeply() brokenness, or are my tests 
naive (or wrong)?


(?i-xsm:foo) is equivalent to qr/foo/i.  (?-xism:foo) is qr/foo/.  They are 
not equivalent.


So it's possible that 5.11 fixed/broke something, but it's inside stop_uris(). 
 There have been issues with how is_deeply() compares regexes in the past 
(the ordering of the xism operators, for example) but in this case it appears 
to be working.
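
A quick way to see the distinction for yourself, outside of stop_uris() (just
an illustration, nothing 5.11-specific):

    use Test::More tests => 2;

    # A compiled regex stringifies with its flags, so /i and no-/i differ.
    is_deeply( [ qr/foo/  ], [ qr/foo/ ], 'same flags compare equal' );
    is_deeply( [ qr/foo/i ], [ qr/foo/ ], 'different flags do not' );  # fails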



--
On error resume stupid


Re: #50432: Oslo Perl QA Hackaton Grant Application

2008-02-01 Thread Michael G Schwern

Ovid wrote:

Crap.  Can we just forget I sent that to Perl QA instead of the Grant
Committee?


/me puts on his sunglasses.
/me pulls out a black device.

Now if everyone on perl-qa will please look this way...

*FLASH!*


--
You are wicked and wrong to have broken inside and peeked at the
implementation and then relied upon it.
-- tchrist in [EMAIL PROTECTED]


Re: expanding the cpan script, and Module

2008-02-11 Thread Michael G Schwern

Andrew Hampe wrote:

The Basic CPAN concern: --bail_on_fail flag (2008.02.10 )

Problem description:
when a cpan session is looking for more than one distribution/module
there needs to be a way to 'flag' that the session must fail and stop
if there is an error loading any distribution, or a sub component 
required module.


To be clear, you mean like if you put in:

   $ cpan Foo::Bar Bar::Baz

and Foo::Bar fails to install you want it to stop and not continue on to 
Bar::Baz?



Is there anyone working on such a flag?

Would a Patch Be Acceptable?


I believe you want to send this along to the CPAN bug tracker.  perl-qa is for 
quality assurance (testing) issues.

http://rt.cpan.org/NoAuth/Bugs.html?Dist=CPAN

But it's trivial to do yourself.

use CPAN;

for my $name (@ARGV) {
    my $mod = CPAN::Shell->expand("Module", $name);

    unless( $mod ) {
        warn "Unknown module $name.  Aborting.\n";
        last;
    }

    next if $mod->uptodate;  # already up to date

    unless( CPAN::Shell->install($mod) ) {
        warn "Installing $name failed.  Aborting.\n";
        last;
    }
}


--
101. I am not allowed to mount a bayonet on a crew-served weapon.
-- The 213 Things Skippy Is No Longer Allowed To Do In The U.S. Army
   http://skippyslist.com/list/


Re: New assertions in Ruby

2008-02-12 Thread Michael G Schwern

chromatic wrote:

On Tuesday 12 February 2008 10:55:21 chromatic wrote:


On Tuesday 12 February 2008 10:06:14 Eric Wilhelm wrote:

How will you print the assertion code without a source filter?

Show Source on Exception is fairly easy:

http://www.oreillynet.com/onlamp/blog/2007/10/adding_show_source_to_perl_exc.html

Making that work with anonymous functions is trickier, but doable.


Of course, in the case of Test::Exception, the assertion functions already 
have the code reference that might throw an exception, in which case the 
problem isn't even tricky.


Data::Dump::Streamer can decompile a code reference, complete with attached 
lexicals.  But as has been pointed out by Yuval, the real trick is to show the 
value of all variables used in the block.


It's an interesting idea, but at the end of the article is the Fine Print 
about how assert {2.0} doesn't quite work:


---
When an assertion passes, Ruby only evaluates it once. However, when an 
assertion fails, the module RubyNodeReflector will re-evaluate each element in 
your block. (You knew there was a “gotcha”, right?;) This effect will hammer 
your side-effects, and will disable boolean short-circuiting. So once again 
sloppy developer tests help inspire us to write clean and decoupled code!



What he puts down to sloppiness I call flexibility.  I wouldn't want to 
restrict all tests to only those without side-effects, too limiting.  You all 
know I'm a stickler for allowing any possible test.


Consider a simple test of post increment.

i = 0;  # deliberately set wrong to cause a failure
assert { 1 == i++ }

When that fails, because it re-evaluates each element in the code block, you 
will see something like this:


assert{ 1 == (i++) } -- false - should pass
i   == 1

That's just my supposition; I can't get assert2 to run.  Ruby can't find the 
installed gem. :(



--
Defender of Lexical Encapsulation


Re: Diagnostics

2008-02-12 Thread Michael G Schwern

David Landgren wrote:

I wish you'd s/Got/Actual/ or Received. Got must die.


Why's that?


--
Hating the web since 1994.


Re: New assertions in Ruby

2008-02-13 Thread Michael G Schwern

Adrian Howard wrote:

which isn't _too_ shabby, but doesn't help much with things like:

ok_if { Foo->new->answer == 42 };

or

ok_if { $Some_dynamic_var == 42 };

So I don't really think it's worth pursuing.


Well, if we follow the logic of the assert2 author, you're just being SLOPPY 
using methods with side effects in a test.  All tests should be functional 
doncha know?



--
29. The Irish MPs are not after “Me frosted lucky charms”.
-- The 213 Things Skippy Is No Longer Allowed To Do In The U.S. Army
   http://skippyslist.com/list/


Wide character support for Test::More

2008-02-23 Thread Michael G Schwern
I just merged together a number of tickets having to do with Test::More not 
liking wide characters.


use 5.008;
use strict;
use warnings;
use Test::More tests => 1;

my $uni = "\x{11e}";

ok( $uni eq $uni, "Testing $uni" );

__END__
1..1
Wide character in print at lib/Test/Builder.pm line 1252.
ok 1 - Testing Ğ


I know almost nothing about Unicode.  How do I make this Just Work?  Is it 
safe to just set binmode to always be ':utf8' if perl >= 5.8?
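
For reference, the workaround people apply from the test script side looks
roughly like this (a sketch using Test::Builder's output(), failure_output()
and todo_output() accessors, not the fix under discussion):

use Test::More tests => 1;

# Band-aid: put a :utf8 layer on the handles Test::Builder actually
# prints to, so wide characters don't trigger the warning.
my $builder = Test::More->builder;
binmode $builder->output,         ':utf8';
binmode $builder->failure_output, ':utf8';
binmode $builder->todo_output,    ':utf8';

my $uni = "\x{11e}";
ok( $uni eq $uni, "Testing $uni" );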



--
...they shared one last kiss that left a bitter yet sweet taste in her
mouth--kind of like throwing up after eating a junior mint.
-- Dishonorable Mention, 2005 Bulwer-Lytton Fiction Contest
   by Tami Farmer



Re: Wide character support for Test::More

2008-02-24 Thread Michael G Schwern

Aristotle Pagaltzis wrote:

use 5.008;
use strict;
use warnings;

  use open ':std', ':locale';

use Test::More tests => 1;

my $uni = "\x{11e}";

ok( $uni eq $uni, "Testing $uni" );

__END__
1..1
Wide character in print at lib/Test/Builder.pm line 1252.

  ^^ after the above patch, gone


There's the rub, it doesn't go away.

Test::Builder dups STDERR and STDOUT so you can mess with them to your 
heart's content and still get testing done.  File I/O disciplines don't appear 
to be copied across dups.  That's what everyone was complaining about: they 
had to manually apply layers to Test::Builder's own handles.


It appears I have to manually copy the layers across, ok.

sub _copy_io_layers {
    my($self, $src, $dest) = @_;

    $self->_try(sub {
        require PerlIO;
        my @layers = PerlIO::get_layers($src);

        binmode $dest, join(" ", map ":$_", @layers) if @layers;
    });
}

That does it.  Thank you for playing software confessional. :)


--
The past has a vote, but not a veto.
-- Mordecai M. Kaplan


Re: Is there even a C compiler?

2008-02-25 Thread Michael G Schwern

Andy Armstrong wrote:
Is there a generally approved way for an XS module to test for the 
existence of a C compiler before attempting to build?


MakeMaker uses ExtUtils::CBuilder->have_compiler() in its tests.  It's worked 
well with no complaints.  It's an additional testing dependency, but it's a 
useful one and Module::Build will eventually suck it in anyway.
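
Roughly, the guard looks like this (a sketch, not MakeMaker's actual test code):

use Test::More;
use ExtUtils::CBuilder;

# Skip the whole test file when there's no working C compiler around.
if( ExtUtils::CBuilder->new( quiet => 1 )->have_compiler ) {
    plan tests => 1;
}
else {
    plan skip_all => "no C compiler available";
}

ok( 1, "compiler found, carry on with the compile tests" );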



--
THIS I COMMAND!


Re: Is there even a C compiler?

2008-02-26 Thread Michael G Schwern

Yitzchak Scott-Thoennes wrote:

On Mon, Feb 25, 2008 at 03:59:37PM -0800, Michael G Schwern wrote:
MakeMaker uses ExtUtils::CBuilder->have_compiler() in its tests.  It's 
worked well with no complaints.  It's an additional testing dependency, 
but it's a useful one and Module::Build will eventually suck it in 
anyway.


Not sure what you mean by that, but M::B recommends, not requires,
ExtUtils::CBuilder, and has for a long time.


I mean that CBuilder isn't exactly a big hassle to have as a dependency and 
other things need it anyway.



--
Robrt:   People can't win
Schwern: No, but they can riot after the game.


More information in NAs (was Re: CPANTesters considered harmful)

2008-03-03 Thread Michael G Schwern

demerphq wrote:

On 03/03/2008, David Golden [EMAIL PROTECTED] wrote:

On Mon, Mar 3, 2008 at 6:57 AM, demerphq [EMAIL PROTECTED] wrote:
   IMO if an NA result comes in without email contact details and without
   an explanation for the NA then the result should not be aggregated
   against the module.


The email contact details are there, just suppressed by the NNTP web
 gateway to avoid email harvesting by spambots.  If you have a real
 NNTP client, you'll see the email.  Also, see Google Groups (though
 you have to solve a captcha to reveal the email):

 
http://groups.google.com/group/perl.cpan.testers/browse_thread/thread/f67ccb5a66aed2e/ffa37628e76a42e5?lnk=gstq=NA+ExtUtils-Install#ffa37628e76a42e5


This information would be useful to display on CpanTesters itself. The
point is I saw NA's that were inexplicable to me, and found no further
useful information.


It would be nice if NA's included the reason for it being an NA, that being 
the full Makefile/Build.PL output just like if it failed.  I don't see any 
harm in that and it would help identify accidental NAs.


Also it would be nice if an NA came with a soothing explanation for the 
author.  More than one rookie CPAN author has asked me "Oh god, what's this NA 
thing mean?  How do I get rid of it?!"


While I'm on the subject, a link to an author FAQ about CPAN Testers in the 
mail would be handy.



--
52. Not allowed to yell “Take that Cobra” at the rifle range.
-- The 213 Things Skippy Is No Longer Allowed To Do In The U.S. Army
   http://skippyslist.com/list/


Re: CPANTesters considered harmful

2008-03-03 Thread Michael G Schwern

Nicholas Clark wrote:

On Mon, Mar 03, 2008 at 02:19:23PM +, Smylers wrote:

demerphq writes:


It turned out the problem is that when the tests are run as root it seems to
be not possible to create a directory that is not writeable by root.

I think that can be reduced to: It isn't possible to create a directory
that is not writeable by root.  The whole point of root is that as the
super-user it can do anything!


I'm not really sure how to tackle this better than simply skipping the
tests as root, which is what the most recent release does.

That's plausible.  It could also temporarily drop privileges to be some
other user for running that test, but I don't know how you'd work out
which user to do it as.


My guess would be nobody if that user exists, else give up.

But I agree that skipping is better, because the tests run as non-root
already prove that the module's functionality worked. Adding a lot
of complex logic to the test to swap user when running as root would
actually make the test as much a test of the user ID swapping code,
and introduce code that isn't usually tested, and generally introduce
fragility and cause false positive failures.


FWIW I do this a lot:

chmod 0444, 'some_file';

SKIP: {
    skip("cannot write readonly files", 1) if -w 'some_file';

    ...
}

The important thing is that I'm not checking if I'm root but directly checking 
if the necessary condition exists, in this case an unwritable file.


You could attempt to downgrade permissions; switching to "nobody" is as good a 
guess as anything else, but realize it might affect the ability to read files, 
access directories and load modules for the rest of the test.  Not everyone 
sets o+rx.
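
For completeness, here's roughly what that downgrade could look like (a sketch
only, with all the caveats above; it assumes a "nobody" user exists):

use Test::More tests => 1;

chmod 0444, 'some_file';    # same placeholder file as above

SKIP: {
    skip "only interesting when running as root", 1 unless $> == 0;

    my($nobody_uid, $nobody_gid) = (getpwnam 'nobody')[2, 3];
    skip "no 'nobody' user to drop privileges to", 1
        unless defined $nobody_uid;

    # Drop the effective group first, then the effective user; local()
    # restores both to root when the block ends.
    local $) = "$nobody_gid $nobody_gid";
    local $> = $nobody_uid;

    ok( !-w 'some_file', "read-only file really is unwritable now" );
}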



--
3. Not allowed to threaten anyone with black magic.
-- The 213 Things Skippy Is No Longer Allowed To Do In The U.S. Army
   http://skippyslist.com/list/


Re: More information in NAs (was Re: CPANTesters considered harmful)

2008-03-03 Thread Michael G Schwern

David Golden wrote:

On Mon, Mar 3, 2008 at 11:45 AM, Michael G Schwern [EMAIL PROTECTED] wrote:

 It would be nice if NA's included the reason for it being an NA, that being
 the full Makefile/Build.PL output just like if it failed.  I don't see any
 harm in that and it would help identify accidental NAs.


There is only supposed to be one reason for NA -- Perl or platform not
supported.  Anything else is a bug in the reporting software.

See http://cpantest.grango.org/wiki/Reports

For reference, CPAN::Reporter detects NA in any of three ways:

* explicit check for unsatisfied 'perl' in 'requires' section of prerequisites
* parsing for "OS Unsupported" or "No support for OS"
* parsing for error messages from code like "use 5.008" or from "our"
being used in $VERSION strings prior to 5.005


It's that last one that concerns me; it's a bit heuristic-y and I've seen 
things be declared NA that should have alerted the author to a backwards 
compat problem.




CPAN::Reporter also includes PL output for tests that fail or are NA
in the PL stage.  CPANPLUS (which Chris uses) does not -- or at least
not by default as far as I know.


Ahh, I see.


--
Reality is that which, when you stop believing in it, doesn't go away.
-- Phillip K. Dick


Test-Simple 0.77 fixage

2008-03-03 Thread Michael G Schwern

I'm coining a new term, fixage, like breakage.

Fixage is when software fixes a bug and reveals bugs in dependent software.

Test-Simple 0.77 (which includes Test::More) fixed a long standing bug by 
removing the annoying global $SIG{__DIE__} handler to trap test death.  It 
would swallow the real exit code of a test.


This code used to pass:

use Test::More tests => 1;
pass();
exit 1;

Whereas now it will properly exit with 1, which is a failure, and the 
appropriate "Looks like your test died" message.


So far there's only been one revealed failure, that's in POE, but I figured 
I'd let folks know just in case.



--
On error resume stupid



Re: More information in NAs (was Re: CPANTesters considered harmful)

2008-03-03 Thread Michael G Schwern

David Golden wrote:

On Mon, Mar 3, 2008 at 2:04 PM, Michael G Schwern [EMAIL PROTECTED] wrote:

  * parsing for error messages from code like "use 5.008" or from "our"
  being used in $VERSION strings prior to 5.005

 It's that last one that concerns me, it's a bit heuristicy and I've been
 things be declared NA that should have alerted the author to a backwards
 compat problem.


Back before you declared 5.005 to be dead, Slaven Rezic created a lot
of chaos with FAIL reports from 5.005_05 when Makefile.PL or Build.PL
didn't have "use 5.006" and then ExtUtils::MakeMaker or Module::Build
tried to eval an "our $VERSION" line from a .pm file.


Don't get me wrong, I think the heuristics are fine.  That was in reference to 
why I like to see the details of an NA.



--
Ahh email, my old friend.  Do you know that revenge is a dish that is best
served cold?  And it is very cold on the Internet!


Re: Test-Simple 0.77 fixage

2008-03-03 Thread Michael G Schwern

chromatic wrote:

On Monday 03 March 2008 11:20:54 Michael G Schwern wrote:


Fixage is when software fixes a bug and reveals bugs in dependent
software.

Test-Simple 0.77 (which includes Test::More) fixed a long standing bug by
removing the annoying global $SIG{__DIE__} handler to trap test death.


Having imposed fixage on the world myself, let me recommend that you run 
*away* from villagers with pitchforks and torches rather than trying to 
reason with them.  They don't want to fix the bugs in their code.


Shame they brought a pitchfork to a gun fight.
http://schwern.org/~schwern/img/me/19_brian_mike_gary_shooting.jpg


--
164. There is no such thing as a were-virgin.
-- The 213 Things Skippy Is No Longer Allowed To Do In The U.S. Army
   http://skippyslist.com/list/


IEEE Testing Conference in Lillehammer

2008-03-03 Thread Michael G Schwern
The First International Conference on Software Testing, Verification and 
Validation is happening in Lillehammer, Norway April 9 - 11.  The conference 
proper is April 10th and 11th, just after Go Open and the Oslo QA Hackathon.


I think this is a good chance for Perl QA to crash the IEEE and get Perl 
talking to the rest of the software industry.  I'm sure we could show them a 
thing or two and vice-versa.


The cost of the conference is $1000, a bit much for a lark.  I don't know how 
much of a hallway track they will have; I don't think an IEEE conference will 
be quite as free-wheeling as OSCON.  So I would like to try to convince them 
to give us complimentary passes on the weight of our broad QA experience and, 
hey, we just flew in from all around the world.  The more of us intending to 
go, the better chance we have.


So please announce your intention on the wiki:
http://perl-qa.hexten.net/wiki/index.php/Oslo_QA_Hackathon_2008_:_Travel#IEEE_Testing_Conference_in_Lillehammer


--
'All anyone gets in a mirror is themselves,' she said. 'But what you
gets in a good gumbo is everything.'
-- Witches Abroad by Terry Pratchett


Re: TAP::Harness / CPAN problem

2008-03-08 Thread Michael G Schwern

Chris Dolan wrote:

On Mar 8, 2008, at 11:59 AM, Andy Armstrong wrote:


On 8 Mar 2008, at 17:54, Chris Dolan wrote:


  Perl 5.8.6 (Apple's dist for OSX 10.4)
  Test::Harness 3.10
  TAP::Harness 0.54
  TAP::Parser 0.54
  CPAN 1.9205
  CPANPLUS 0.82


Yeah, you have a mixture of Test::Harness and TAP::Parser installed. 
You need to delete those old versions of TAP::Harness and TAP::Parser. 
TAP::* and Test::Harness should be the same versions.


Arg, not again!  PREFIX vs. --install_base vs. vendorlib vs. sitelib vs. 
Apple vs. Fink


Either use INSTALL_BASE with --install_base or PREFIX with --prefix but don't 
mix them.



--
On error resume stupid


Re: [tap-l] SKIP_ALL tests should not get hidden

2008-03-10 Thread Michael G Schwern

Andy Armstrong wrote:

On 20 Nov 2007, at 23:39, Michael G Schwern wrote:

Do we like that?


Test::Harness 2 put it on its own line mostly to avoid wrapping off the right 
side of the screen.  I still lean in that direction.


Hmm. I'm kind of hooked on the new behaviour now. It puts a summary 
column right where I can find it.


Maybe this is one of those 80 columns vs 120 columns things, but let's compare.

Here's the Test::More suite with TH 2.64.

t/00test_harness_check..ok
t/bad_plan..ok
t/bail_out..ok
t/BEGIN_require_ok..ok
t/BEGIN_use_ok..ok
t/bufferok
t/Builder...ok
t/carp..ok
t/circular_data.ok
t/cmp_okok
t/createok
t/curr_test.ok
t/details...ok
t/diag..ok
t/dont_overwrite_die_handlerok
t/eq_setok
t/exit..ok
t/extra.ok
t/extra_one.ok
t/fail-like.ok
t/fail-more.ok
t/fail..ok
t/fail_one..ok
t/filehandles...ok
t/fork..ok
t/harness_activeok
t/has_plan..ok
t/has_plan2.ok
t/importok
t/is_deeply_dne_bug.ok
t/is_deeply_failok
t/is_deeply_with_threadsskipped
all skipped: many perls have broken threads.  Enable with 
AUTHOR_TESTING.
t/is_fh.ok
t/maybe_regex...ok
3/16 skipped: various reasons
t/missing...ok
t/More..ok
t/no_diag...ok
t/no_ending.ok
t/no_header.ok
t/no_plan...ok
t/ok_objok
t/outputok
t/overload..ok
t/overload_threads..ok
1/5 skipped: various reasons
t/plan..ok
t/plan_bad..ok
t/plan_is_noplanok
t/plan_no_plan..ok
1/6 skipped: various reasons
t/plan_shouldnt_import..ok
t/plan_skip_all.skipped
all skipped: Just testing plan  skip_all
t/pod-coverage..ok
t/pod...ok
t/require_okok
t/reset.ok
t/simpleok
t/skip..ok
8/17 skipped: various reasons
t/skipall...ok
t/straysskipped
all skipped: not completed
t/tbm_doesnt_set_exported_took
t/tbt_01basic...ok
t/tbt_02fhrestore...ok
t/tbt_03die.ok
t/tbt_04line_numok
t/tbt_05faildiagok
t/tbt_06errormess...ok
t/tbt_07argsok
t/thread_taint..ok
t/threads...ok
t/todo..ok
t/try...ok
t/undef.ok
t/use_okok
t/useingok
t/utf8..ok
2/5 skipped: various reasons
All tests successful, 3 tests and 15 subtests skipped.


And here it is with 3.10.

t/00test_harness_checkok
t/bad_planok
t/bail_outok
t/BEGIN_require_okok
t/BEGIN_use_okok
t/buffer..ok
t/Builder.ok
t/carpok
t/circular_data...ok
t/cmp_ok..ok
t/create..ok
t/curr_test...ok
t/details.ok
t/diagok
t/dont_overwrite_die_handler..ok
t/eq_set..ok
t/exitok
t/extra...ok
t/extra_one...ok
t/fail-like...ok
t/fail-more...ok
t/failok
t/fail_oneok
t/filehandles.ok
t/forkok
t/harness_active..ok
t/has_planok
t/has_plan2...ok
t/import..ok
t/is_deeply_dne_bug...ok
t/is_deeply_fail..ok
t/is_deeply_with_threads..skipped: many perls have broken threads. 
Enable with AUTHOR_TESTING.

t/is_fh...ok
t/maybe_regex.ok
t/missing.ok
t/Moreok
t/no_diag.ok
t/no_ending...ok
t/no_header

Re: ExtUtils::FakeMaker 0.001 uploaded

2008-03-13 Thread Michael G Schwern

Ricardo SIGNES wrote:

That's all!  I hope someone else finds it useful.


FWIW we were just talking about the issue of generating modules from static 
files (rather than the other way around like we do now) at PDX.pm.



--
Look at me talking when there's science to do.
When I look out there it makes me glad I'm not you.
I've experiments to be run.
There is research to be done
On the people who are still alive.
-- Jonathan Coulton, Still Alive


Re: Why should package declaration match filename?

2008-03-14 Thread Michael G Schwern

Matisse Enzer wrote:
I'm discussing some potential refactorings at $work and wanted to give an 
articulate explanation of the benefits of having package declarations 
match file names, so that:


   # file is Foo/bar.pm
   package Foo::Bar;


That was probably a typo, but I hope you mean Foo/Bar.pm.  Getting the cases 
wrong will bite you even on case insensitive filesystems.  Here's a classic:


$ perl -wle 'use Strict;  $foo = 42;  print $foo'
42

Look ma, I'm using strict!  No, it's not.  "use Strict" is the equivalent of 
C<< require Strict;  Strict->import if Strict->can("import") >>.  This will 
load Strict.pm, and since the filesystem is case insensitive it will work, and 
then call Strict->import.  Class methods are not case insensitive, so the 
import won't be found and nothing will happen.



One reason is so that when you see a package statement, you know what 
the corresponding use statement would be, and when you see a use 
statement, you know what the corresponding package is, and have a good 
clue about the path of the file you are importing.


What are other good reasons to have package declarations match file paths?


Eric already covered the import() issue.

There's also the principle of least surprise.  How do you load class Foo::Bar? 
 "use Foo::Bar".  How do you get the docs?  "perldoc Foo::Bar".  Where do I 
find the code for Foo::Bar?  In Foo/Bar.pm.  How do you inherit from it?  "use 
base qw(Foo::Bar)".  Simple, no investigations or local knowledge necessary.


As a counter example, consider Tie::StdHandle.  It makes a tied handle work 
like a regular file handle.  Very handy.  How do you use it?  C<< use base 
qw(Tie::StdHandle) >> right?  Wrong, it's in Tie/Handle.pm, of course. [1]  So 
now you have to invoke it manually.


BEGIN {
    require Tie::Handle;
    our @ISA = qw(Tie::StdHandle);
}

As is often the case with irregular code, Tie::StdHandle has no documentation.


[1] 5.10.0 finally moved it to its own file.

--
Robrt:   People can't win
Schwern: No, but they can riot after the game.


Re: Why should package declaration match filename?

2008-03-15 Thread Michael G Schwern

Dave Rolsky wrote:
There's a lot of value in following the existing best practices of the 
Perl community as a whole. For one thing, it means you can hire people 
with Perl experience and they can bring that experience to bear on 
your application.


If you insist on reinventing every wheel, you've basically created your 
own in-house Perl-like dialect. It _looks_ sorta like Perl, but it's not 
Perl.


I agree.  At my last full-time corporate job in particular, they did almost 
everything in-house, duplicating a lot of CPAN functionality.  Their code was 
often well done, but it meant all my years of Perl and CPAN experience were 
dampened because I had to learn their way of doing everything.  Including, 
ironically, testing.



--
Hating the web since 1994.


Patent for Software Package Verification nothing to worry about

2008-03-20 Thread Michael G Schwern
About two years ago several people came upon this patent granted to Sun, 
EP1170667 - Software Package Verification

http://gauss.ffii.org/PatentView/EP1170667

Its US equivalent is 7080357
http://www.google.com/patents?vid=USPAT7080357

There was some concern this might conflict with TAP and Test::Harness since on 
the surface it looked awfully generic.  Executive summary:  It's not.  Don't 
worry.


I finally got an opportunity to sit down with a patent lawyer familiar with 
software, John Anderton of patentforge.com.  His reading was that it is very 
specific to a particular process and contains very few weasel words that 
might try to expand on it.  The process mostly has to do with prioritizing 
tests and identifying those which are active.  Furthermore, the patent 
specifies a very specific set of things which it's testing [0056] without 
any wording like "this list is not exhaustive".


Furthermore, the patent was rejected several times by the patent office and 
was extensively rewritten each time.


His opinion was that it's likely a purely defensive patent on the part of Sun 
and that Sun has a very good track record with regard to not abusing patents.


Finally, should it turn out that it is in conflict with Test::Harness the 
patent claim only goes back to 2000 while Test::Harness, under various names, 
goes back to 1988.  Busting the patent would not require a court case but a 
simple appeal to the patent office.


One of the things he explained was that in order to infringe on a patent your 
device must encompass *every* one of the qualities but only one of the claims.


As far as defending against future patent claims of this nature, one way to 
deal with it is to publish techniques as dated documents.  This can take 
many forms; shipping code is one.  For the purposes of proving prior art to a 
patent examiner, a simple human-readable document describing the technique is 
best.  A white paper, for example.  Just write it and post it somewhere, on a 
mailing list or even just a web page; the Internet Archive will pick 
it up.  Documentation about techniques is good for all, patents or no.


More formal mechanisms include putting in your own patent claim.  It doesn't 
have to be accepted, but then it will be on file as prior art.  This, 
unfortunately, is expensive.  There is a method where you can file your 
invention publicly but not patent it; however, that costs on the order of 
$500-$1000.



--
...they shared one last kiss that left a bitter yet sweet taste in her
mouth--kind of like throwing up after eating a junior mint.
-- Dishonorable Mention, 2005 Bulwer-Lytton Fiction Contest
   by Tami Farmer



Hackathon logistics

2008-03-25 Thread Michael G Schwern
A few logistical items that I'd like to make sure are being taken care of for 
the hackathon.  The idea is to work this out now to maximize our on-site 
hacking time.


I don't know what the status of this is, but here's what I can think of off 
the top of my head.



*) Access for wirelessless laptops?

Somebody always shows up from 1996 with a laptop that doesn't have wireless. 
Either a wired hub is made available or USB wireless widgets.


* If your laptop doesn't have wireless, please fix that.  Almost everything 
can talk to USB wireless.  If for some reason you absolutely cannot go 
wireless, please bring some long cables and a hub.


* If you have a spare USB wireless thing, please bring it.


*) Hackathon repository?

Well-established projects have their own repos, but it's always handy to have 
a repository for any new projects started at the hackathon and for everyone to 
already have access.


* Do we have one?  Andy, can we use hex-ten?
* A checkin notify list should also be ready.


*) Whiteboards, markers  erasers.

Lots of whiteboards for taking notes.  At least one whiteboard just for 
projects being worked on, the grid at BarCamps is an example.



*) Index cards  pens

Both for taking notes and for anyone wanting to do XP.


*) Wiki

We have the perl-qa wiki.

* Please resolve any issues you have with accessing/editing it now.


*) Mailing list

We'll just use the perl-qa list.


*) Real-time comms.

We have #perl-qa on irc.perl.org.  It might make sense to have a twitter 
account for broadcasts and cell phone messaging.



*) Caffeine and snacks

Snacks covering both the junk food and non-junk food kind.  The latter being 
anything not covered in sugar and salt. :)  Worse comes to worse, some sort of 
trail mix that's not covered in sugar and salt.



*) Food plans

A short list of convenient places to order food from.  Also easy to get to 
places that can accommodate all of us.  They must deal with varied dietary 
requirements (veggie, no-dairy?, no-wheat?, kosher? etc...).  Have at least 
pizza and chinese as everyone knows how to deal with that.


This way we can just make food appear without having to spend time on what 
everyone wants and needs.


* Please list your dietary requirements/preferences here.
http://perl-qa.hexten.net/wiki/index.php/Oslo_QA_Hackathon_2008_:_Food#Dietary_requirements

* Locals, please list places to get food from here:
http://perl-qa.hexten.net/wiki/index.php/Oslo_QA_Hackathon_2008_:_Food#Places_To_Get_Food


*) Water  juice

For drinking something that's not brown and fizzy or brown and foamy.


*) Boiling hot water

A personal request, for tea.


--
I have a date with some giant cartoon robots and booze.



Re: My Perl QA Hackathon Wishlist

2008-03-25 Thread Michael G Schwern

Gergely Brautigam wrote:

One last question then I swear I will shut up :)

Why use perl for testing? Of course all other languages are used for testing 
this and that.. What makes perl excel for testing? Obviously it has powerful 
regex and data-handling capabilities... But besides that.. Why would anyone 
want to use perl? :)


Something that hasn't been explicitly pointed out yet is that Perl uses TAP, 
the Test Anything Protocol.  In most other testing systems the test script 
also determines whether the test passes or fails and it also displays that 
fact.  This means the thing you're testing is tied to the testing system.


TAP instead follows the Unix philosophy of pipes.  A harness (usually 
Test::Harness) runs a series of programs which output TAP, a simple textual 
protocol.


1..3
ok 1
ok 2
not ok 3

That is, I'm going to run 3 tests.  The first one passed.  The second one 
passed.  The third one failed.  The harness parses this and displays the results.
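
For example, a Perl test script that emits a stream like that is nothing
more than:

use Test::More tests => 3;

# This script's only job is to print TAP on STDOUT; any TAP-speaking
# harness, in any language, can run it and interpret the results.
ok( 1,          "first test"  );
ok( 2 + 2 == 4, "second test" );
ok( 0,          "third test fails on purpose" );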


This separation means you have maximum flexibility in how you write your 
tests.  You're not locked down to one set of test functions, you can import 
piles and piles from CPAN and even write your own.  You can even get XUnit 
style test methods (see Test::Class).


With Test::Harness 3 you can define special behaviors for various test files.
The power of this is that the test scripts can be written in anything; they 
don't have to be Perl.  There are TAP libraries in several languages, and 
because the basic protocol is so simple they're simple to implement.  I've seen 
places write TAP tests in C, shell, PHP and Java in the same test suite.  Even 
running their .html files as tests by instructing the harness to run them 
through an HTML syntax validator.
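
The per-file behaviors can be wired up through TAP::Harness's exec callback.
A hand-wavy sketch (the validate-html-tap command is made up for illustration):

use TAP::Harness;

# Decide per file how to produce TAP: .t files run under perl, .html files
# get handed to a (hypothetical) TAP-emitting HTML validator.
my $harness = TAP::Harness->new({
    exec => sub {
        my($harness, $file) = @_;
        return [ $^X, $file ]                 if $file =~ /\.t$/;
        return [ 'validate-html-tap', $file ] if $file =~ /\.html$/;
        return;    # undef means "use the default behaviour"
    },
});
$harness->runtests( glob("t/*.t"), glob("t/*.html") );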


The downside is there are no pretty GUIs for TAP, but the potential exists. 
It's a simple matter of programming.  The upside is that when a TAP GUI is 
created it will work with all existing TAP tests.



--
191. Our Humvees cannot be assembled into a giant battle-robot.
-- The 213 Things Skippy Is No Longer Allowed To Do In The U.S. Army
   http://skippyslist.com/list/


Re: model-based testing

2008-03-26 Thread Michael G Schwern

[EMAIL PROTECTED] wrote:

Hi *,

are there any Perl modules for model-based testing [1]? Are there any
talks about model-based testing with Perl?

Cheers,
Renee

[1] http://en.wikipedia.org/wiki/Model-based_testing


Never heard of it.  It smells a little bit like a further extension of FIT 
testing in the sense of being able to write tests without having to write 
code.  From skimming a few articles it looks like it's closely tied to finite 
state machines.  Maybe it would be applicable to things like CGI::Application 
and other perl modules which already have clear states?


Could you give a high level idea of something you'd test with it?


--
124. Two drink limit does not mean first and last.
-- The 213 Things Skippy Is No Longer Allowed To Do In The U.S. Army
   http://skippyslist.com/list/


Server and database testing

2008-03-26 Thread Michael G Schwern
I have some work to write tests for a server that talks to a database.  This 
means creating a database and firing up a server for testing purposes, and 
then dropping the database and shutting down the server at the end.  This also 
means making sure that multiple instances of the test can run on the same 
machine by the same or multiple users.


I'm about to do a sort of brute-force approach:

* create and populate a database called projectname_$user_$pid
* find an open port
* write a config file with the port and database to use
* fire up the server with that config
* run the tests
* shutdown the server
* drop the database

Seems a bit wasteful, so I'm wondering how other people handle it.
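
For concreteness, the first two bullets come out roughly like this (names
purely illustrative):

use IO::Socket::INET;

# A per-user, per-process database name.
my $db_name = sprintf "projectname_%s_%d", scalar getpwuid($<), $$;

# Find a free TCP port by letting the OS pick one, then release it so the
# test server can bind it.  (Small race, but good enough for a test rig.)
my $listener = IO::Socket::INET->new(
    LocalAddr => '127.0.0.1',
    LocalPort => 0,
    Listen    => 1,
) or die "Can't find an open port: $!";
my $port = $listener->sockport;
$listener->close;

print "would create database $db_name and start the server on port $port\n";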


PS  This is not a CPAN thing.


--
29. The Irish MPs are not after “Me frosted lucky charms”.
-- The 213 Things Skippy Is No Longer Allowed To Do In The U.S. Army
   http://skippyslist.com/list/


Re: Is FIT fit for purpose?

2008-03-27 Thread Michael G Schwern

Ovid wrote:

Has anyone here ever successfully used FIT testing?  I was at one of the
first presentations of FITness a long time ago, but the example Ward
Cunningham gave was of a calculator.  I thought the idea was neat, but
how would I implement it?


When you say "implement it" do you mean the mechanics of it or in a "when can 
it be of use" sense?




We've considered FITness testing, but so far, the only person I've met
who claims success with it is a consultant who teaches it.  Everyone
else has claimed no experience or that it's more hassle than it's
worth.


Tony Bowden has had success with it.  It's most useful when the people with 
the knowledge of how the thing is supposed to work aren't programmers. 
Acceptance testing.  I do X, I expect Y.  This works well with well defined, 
deterministic behaviors.  Here's an employee and their salary, what taxes and 
fees do we take out of their payroll?


Or for units where the field of test data is very, very broad (yet the inputs 
and outputs are simple) and you want to employ cheap labor to test it.  In 
Tony's case, as I remember, he employed his nephew to test a contact 
information web scraper.  His nephew went to a random web page and eyeballed 
it for any contact information.  He wrote down the URL and contact info in a 
simple table.  Harness slurps in that table and compares what it found with 
what the human found.  Fast way to get a big wad of real world test data.  A 
programmer would be way more expensive and probably not do nearly as good a 
job as they'd get bored.



--
E: Would you want to maintain a 5000 line Perl program?
d: Why would you write a 5000 line program?


Re: Is FIT fit for purpose?

2008-03-28 Thread Michael G Schwern

Eric Wilhelm wrote:

On Thursday 27 March 2008 12:42:13 Eric Wilhelm wrote:

What do you need to test that your users need to drive?

Business rules.


So, what is a good example of such a business rule?  I posit that 
payroll does not count because the user could more concisely write the 
rule in a declarative form, this isn't Java, c.


I'm confused by that response.  FIT is declarative.  You give the user a 
table, they fill it in, it gets run through a routine that interprets the 
inputs and outputs.


| gross pay | fed tax | state tax | medicare | social sec | net pay |
----------------------------------------------------------------------
| 4         | 24%     | 6%        | 2%       | 5%         |         |
----------------------------------------------------------------------

Categorization is a nice example.

| URL            | Category   |
--------------------------------
| hooters.com    | Restaurant |
| whitehouse.com | Porn       |
--------------------------------

Clear inputs and outputs.

Also, no matter that you could write that out as a Perl hash or something, you 
don't want to be handing accountants a text editor and a Perl program.  Totally 
alien.  You want to give them a web page or Excel file to fill out.  Part of 
the point of FIT is that the method of inputting and evaluating the test 
results is comfortable for the user, not the programmer.




How can it be expressed in a non-tedious and yet understandable way
that makes them feel like it is a worthwhile process?

That's the guiding design question of FIT tests.


That's conveniently intuitive then.  So where do I get the guiding 
design answer?  Or at least, how do you decide when to be asking the 
above question vs when to be asking how do I set this up such that the 
business rules are 'programmed' directly by the users??


That would be highly situational, but it's the same sort of question as "when 
do I put this into a config file?"


Of course, even if the rules are specified by the users they still need to be 
tested.




At that point, you're pushing so much data into the test that it has
become tedious for the user to own (create, manually review) the
time cards, so you really only want to involve them at the
configuration. ...

Don't mistake FIT tests for unit tests.  You'll stab someone if you
do.


Okay, so how do we be sure that the business rule is fully expressed by 
the Fit input?  (That is: guarantee that there are no edge cases.)  Or, 
is this one of those complicated things where worse is better because 
we don't like better better than worse?


Same way you determine any other test.  You run coverage analysis.  You give 
some thought to the data.  If they're numbers you try 0, -1, 1, .5, 2**33, 
etc...  If it's strings try whitespace, special characters, nul bytes, SQL 
commands...
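
In the categorization example that brainstorm might turn into a little table
of nasty inputs (categorize_web_site() being the hypothetical fixture from
earlier in the thread):

use Test::More;

# Throw the usual nasty string inputs at the fixture and make sure it at
# least survives them without dying.
my @edge_urls = ( "", " ", "hooters.com\0", "'; DROP TABLE sites; --" );
plan tests => scalar @edge_urls;

for my $url (@edge_urls) {
    my $lived = eval { categorize_web_site($url); 1 };
    ok( $lived, "survives " . (length $url ? "'$url'" : "the empty string") );
}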


But FIT is not really about doing full edge testing.  It's about getting a 
broad suite of blackbox tests that match what the users actually do, as 
opposed to what the programmer thinks the users do.  The users (or the 
client) are writing the tests, and they probably know the domain better than 
you do.  Also, lacking any knowledge of the internals, they're not going to 
pussyfoot around known fragile spots.


Finally, as chromatic pointed out, FIT is *not* a replacement for unit tests. 
 It's another tool, specialized to allow the client to write their own 
acceptance tests.  It's as much about drawing the client into the development 
process, getting them involved, getting them to use the iterations, getting 
their feedback and buy-in, as it is about testing the software.



--
60. “The Giant Space Ants” are not at the top of my chain of command.
-- The 213 Things Skippy Is No Longer Allowed To Do In The U.S. Army
   http://skippyslist.com/list/


Test::Builder 2 in Oslo

2008-03-28 Thread Michael G Schwern

I put Test::Builder 2 up as a topic for the Oslo hackathon.
http://perl-qa.hexten.net/wiki/index.php/Oslo_QA_Hackathon_2008_:_Topics#Test::Builder_2


--
E: Would you want to maintain a 5000 line Perl program?
d: Why would you write a 5000 line program?



Re: Is FIT fit for purpose?

2008-03-28 Thread Michael G Schwern

Ok, let's clear this all up.

FIT is not about expressing business rules.

FIT is a tool which allows the customer to add test cases in a way they're 
comfortable with.  A programmer still has to write the logic behind those 
tests (called a Fixture), but it allows a customer to easily add more data, 
inputs and outputs.


It can have the nice side effect of getting the customer more involved with 
the project.


FIT is not about getting the customer to write the rules any more than unit 
testing is about writing the code.  It can help clarify the rules and reveal 
where the code doesn't match customer expectations.  That's why it's called an 
acceptance test, done right the customer accepts the work when the tests pass.



Eric Wilhelm wrote:

Categorization is a nice example.

| URL            | Category   |
--------------------------------
| hooters.com    | Restaurant |
| whitehouse.com | Porn       |
--------------------------------


Now I'm really confused.  That looks like tabulated data, so what code 
would it be testing?


The code which takes a URL and decides what category it's in.  Code you defined.

FIT is this...

my %tests = (
    # URL                # Category
    "hooters.com"    => "Restaurant",
    "whitehouse.com" => "Porn",
);

for my $url (keys %tests) {
    my $category = $tests{$url};

    is categorize_web_site($url), $category;
}

Except instead of %tests you have the user write up a table (Excel, HTML, CSV, 
whatever) so they can input new test inputs and expected results without 
having to touch code.  YOU write the code which interprets those inputs (the 
fixture).  You also give them a simple way to run the tests, usually just a 
submit button on a web page.
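
A hand-wavy sketch of the fixture side, assuming the customer keeps a
two-column CSV file (Text::CSV_XS and the file name are just for illustration):

use Test::More 'no_plan';
use Text::CSV_XS;

# Slurp the customer's URL/Category table and run each row through the
# same check as the hard-coded %tests version above.
open my $fh, '<', 'categories.csv' or die "Can't read categories.csv: $!";
my $csv = Text::CSV_XS->new({ binary => 1 });

$csv->getline($fh);    # throw away the header row
while( my $row = $csv->getline($fh) ) {
    my($url, $category) = @$row;
    is categorize_web_site($url), $category, "category for $url";
}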


That's it.  That's all FIT is.


--
191. Our Humvees cannot be assembled into a giant battle-robot.
-- The 213 Things Skippy Is No Longer Allowed To Do In The U.S. Army
   http://skippyslist.com/list/


MySQL + TAP == MyTAP

2008-03-29 Thread Michael G Schwern
Stumbled across this while finding an alternative to libtap for testing C (it 
has some sort of issue linking with this hairy project I'm working on). 
Apparently MySQL wrote their own TAP library for C.


From http://dev.mysql.com/doc/mysqltest/en/unit-test.html

The unit-testing facility is based on the Test Anything Protocol (TAP) which 
is mainly used when developing Perl and PHP modules. To write unit tests for 
C/C++ code, MySQL has developed a library for generating TAP output from C/C++ 
files. Each unit test is written as a separate source file that is compiled to 
produce an executable. For the unit test to be recognized as a unit test, the 
executable file has to be of the format mytest-t. For example, you can create 
a source file named mytest-t.c that compiles to produce an executable mytest-t. 
The executable will be found and run when you execute make test or make 
test-unit in the distribution top-level directory.


Here's the docs.
http://www.kindahl.net/mytap/doc/index.html


--
Defender of Lexical Encapsulation



Re: An alternate view on deferred plans

2008-03-29 Thread Michael G Schwern

Buddy Burden wrote:

Not criticizing, not claiming my method is better, just looking for
any reasons why this wouldn't work.  And, JIC there's some agreement
that it _would_ work, I've already put together a patch for Test::Most
that does it.  That is, at the top of your script, you put this:

use Test::Most 'defer_plan';

and at the bottom of your script, you put this:

all_done();

and it just DTRT.  The implementation is perhaps not as clean as I'd
like it to be, but it's not a horrific hack either.  I'm going to
forward it to Ovid through normal CPAN channels.

Thoughts?


This method is fine and has been suggested several times before.
http://rt.cpan.org/Public/Bug/Display.html?id=20959

I've just been dragging my feet on it.  I finally had occasion to make use of 
it when I had to hand roll a C TAP library (libtap tried to be nice about 
threads and wound up losing).  I hate writing plans, and C makes everything 3x 
more annoying, so I wrote a done_testing() function and just didn't bother 
requiring a plan at all.  It was all very convenient.
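
In test script terms the style reads something like this (a sketch; Test::More
doesn't actually export done_testing() yet):

use Test::More;

ok( 1,          "first thing"  );
ok( 1 + 1 == 2, "second thing" );

# No up-front plan; the closing marker emits the trailing "1..2" line and
# tells the harness we really reached the end rather than exiting early.
done_testing();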


So maybe I'll get around to implementing it.


--
Insulting our readers is part of our business model.
http://somethingpositive.net/sp07122005.shtml


Re: An alternate view on deferred plans

2008-03-30 Thread Michael G Schwern

Aristotle Pagaltzis wrote:

Note that it doesn’t quite protect you from running too few tests
either. You may botch some conditional in your test program and
end up skipping tests silently, in which case you will still
reach the `all_done()` line, and it’ll look as if all was fine.


The typical approach to setting the number of tests in anything but the most 
trivial cases is to run the tests and copy down the number.  If the 
conditional wasn't working in the first place, then the plan does nothing but 
copy that mistake.


If your test is simple enough that you can routinely count the tests by hand, 
you're unlikely to miss running a test in the first place.




What it protects you from is dying half-way through the tests
without the harness noticing. Of course, that’s by far the most
common failure mode.


I don't want to drag out the plan vs no_plan argument, but I do want to 
clear up this common misconception.


Death is noted by both Test::More and Test::Harness and has been for a long 
time.  Recent versions of Test::More close off a bug that caused death or 
non-zero exit codes to be lost in certain cases.  If you continue to 
experience that, report it.  It is a bug.


The only way you can abort the test halfway through using no_plan and get a 
success is with an exit(0).  That scenario is extremely rare, but I've 
considered adding in an exit() override to detect it.
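
Such an override would be along these lines (an illustration only, not
anything Test::More ships):

# Install a global exit() wrapper very early, so a stray exit(0) in the
# middle of a test at least leaves a trace behind before exiting.
BEGIN {
    *CORE::GLOBAL::exit = sub {
        my $status = @_ ? shift : 0;
        warn "# exit($status) called before the test finished\n";
        CORE::exit($status);
    };
}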



--
I do have a cause though. It's obscenity. I'm for it.
- Tom Lehrer


Re: TAP has no exit code

2008-03-31 Thread Michael G Schwern

Eric Wilhelm wrote:

# from Aristotle Pagaltzis
# on Sunday 30 March 2008 23:14:


Except that the test program might be running at the other end of
an HTTP connection. Or at the other end of a serial port. Or the
harness might be parsing an archived TAP stream. Or a TAP archive
generated offline in batch mode. Or…


That's a good point, but what does it have to do with plans?

  $ perl -e 'use Test::More qw(no_plan); ok(1); die;' 2>/dev/null
  ok 1
  1..1


This will look like a success.

$ perl -wle 'use Test::More "no_plan";  pass();  die;  pass()'  2>/dev/null
ok 1
1..1


This will look like a failure.

$ perl -wle 'use Test::More tests => 2;  pass();  die;  pass()'  2>/dev/null
1..2
ok 1


If you can't see the exit code of the test then the plan protects you.

There is a TAP proposal to add meta information such as the exit codes.
http://testanything.org/wiki/index.php/TAP_meta_information


--
Being faith-based doesn't trump reality.
-- Bruce Sterling


Re: An alternate view on deferred plans

2008-03-31 Thread Michael G Schwern

Eric Wilhelm wrote:

What it protects you from is dying half-way through the tests
without the harness noticing...

Death is noted by both Test::More and Test::Harness and has been for a
long time

The only way you can abort the test halfway through using no_plan and
get a success is with an exit(0).


Yes.  That's exactly the reason that I want a done() with my no_plan.

That scenario is extremely rare, 
but I've considered adding in an exit() override to detect it.


I'm not sure how that would work.  You would have to assign it to 
*CORE::GLOBAL::exit and Test::More would have to be the first module 
loaded.


That's what I'd do.


If you just replace it lexically, you've covered exactly the opposite of 
the case I'm concerned about.  I can *see* an exit() in my test file 
(and I sometimes include one when a big chunk of test is broken (yeah, 
yeah... let's not talk about that right now.))


The plan there would be to have Test::More export an exit() that exhibited no 
warning, however...



The exit() that concerns me when testing with no_plan is the WTF? way 
off somewhere else which absolutely shouldn't be there.  Is there any 
way to catch that without the done() token?


...a done testing marker renders all this irrelevant.


--
29. The Irish MPs are not after “Me frosted lucky charms”.
-- The 213 Things Skippy Is No Longer Allowed To Do In The U.S. Army
   http://skippyslist.com/list/


Re: TAP has no exit code

2008-03-31 Thread Michael G Schwern

Eric Wilhelm wrote:

# from Michael G Schwern
# on Sunday 30 March 2008 23:35:


There is a TAP proposal to add meta information such as the exit
codes. http://testanything.org/wiki/index.php/TAP_meta_information


Yay.

Can we put 'hostname' in there too?


You can put whatever you want in there, the magic of YAML!

The scheme of reserving all leading lower-case keys seems to be working well 
for META.yml; I've changed the extension rules to reflect that.  But it would 
be a worthwhile official field.  (Note, no fields are required.)



--
191. Our Humvees cannot be assembled into a giant battle-robot.
-- The 213 Things Skippy Is No Longer Allowed To Do In The U.S. Army
   http://skippyslist.com/list/

