Re: RFD: Built-in testing
On Wed, 21 Jan 2009, Damian Conway wrote: > > Maybe something in all caps. For what it's worth, :OK<> can be typed > > with one hand while the other holds down the shift key. :) > > Typical right-hander fascism! On the other hands we have :QA ... which also so happens to be an apposite abbreviation. :-) -Martin
Re: RFD: Built-in testing
On Jan 23, 8:59 pm, jswit...@gmail.com (Jason Switzer) wrote:
> That sounds useful on the surface but often turns out to be more difficult
> to do than you might think. There are many cases where tests are performed
> from within loops. Something like S09.237 may or may not be in a loop, may
> be difficult to identify in files with many tests.

There are at least two reasons to identify a test (or check): to control it from afar, and to track its results.

If the reason for wanting identity is to control it (e.g. Foo::Bar::Test.disable()), then the fact that it's in a loop isn't necessarily important: if you want to disable it, then you probably want to disable all iterations. If we do want finer-grained control, then it is probably possible to do something with resumable exceptions that are thrown each time the test is potentially skipped.

If the reason for identifying a check is to track its result, then the obvious solution is to not assume that its result is pass/fail, but is instead a pair of pass/fail counts (or pass/total -- same thing). A good testing approach is "directed-random", where the same test is run multiple times with different random seeds so as to use different test data. IMO, it is reasonable to think of a one-shot test as an aberration.
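A minimal Python sketch of the pass/total idea, since the Perl 6 syntax under discussion is still only a proposal. The names `CheckResult`, `run_directed_random` and `sort_is_idempotent` are hypothetical and exist only for this illustration; the check ID reuses the thread's "S09.237" example.

```python
import random
from dataclasses import dataclass


@dataclass
class CheckResult:
    """Outcome of one identified check: counts rather than a single pass/fail."""
    check_id: str
    passed: int = 0
    total: int = 0

    def record(self, ok: bool) -> None:
        self.total += 1
        if ok:
            self.passed += 1


def run_directed_random(check_id, check, seeds):
    """Run the same check once per seed, each time with different random data."""
    result = CheckResult(check_id)
    for seed in seeds:
        rng = random.Random(seed)  # private, reproducible generator per run
        result.record(check(rng))
    return result


def sort_is_idempotent(rng):
    """Hypothetical property: sorting already-sorted data changes nothing."""
    data = [rng.randint(-100, 100) for _ in range(20)]
    return sorted(sorted(data)) == sorted(data)


result = run_directed_random("S09.237", sort_is_idempotent, seeds=range(5))
print(f"{result.check_id}: {result.passed}/{result.total} passed")
```

Under this model a check inside a loop needs no per-iteration identity: every iteration just adds to the same pair of counts, and a failing seed can be replayed on its own.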
Re: RFD: Built-in testing
On Fri, Jan 23, 2009 at 4:08 PM, jerry gay wrote:
> On Fri, Jan 23, 2009 at 12:37, Dave Whipp wrote:
>> I could also imagine writing code that reads from an Sqlite database, and
>> imposes that info onto the test. Whatever mechanism is used, I think we need
>> a language-defined mechanism to supply a stable unique identifier for each
>> test, so that it can be individually tracked and manipulated. Perhaps "is
>> only" is the wrong way to implement the action-at-a-distance, but it does
>> seem better (IMO) than a preprocessor.
>>
> i don't understand the drive to have unique test identifiers. we don't
> have unique identifiers for every code statement, or every bit of
> documentation. why are tests so important/special/different that each
> warrants a unique id? that aside, this functionality sounds like it
> can be encapsulated in a module, if desired. as it stands, i can't see
> a reason it *has to* be made available in the core.

Unique test identifiers are helpful because you can then track the progress of a specific test across platforms or revisions.

> as a recap, the discussions larry, patrick, moritz and i (and others,
> i'm sure) had on this topic long ago led to agreement that the most
> important characteristics for a portable specification test suite
> were:
>
> ~ the tests should be organized in such a way that it makes it easy to
>   figure out what bit of spec is under scrutiny
>   (addressed by directory/filename standardization and smartlinks)
> ~ the test files mustn't be cluttered with code that implementations need
>   ignore
>   (comments are used, which are by default ignored, and can be
>   preprocessed to customize the test for each implementation)
> ~ the skip/todo markers should be as close to the relevant tests as
>   possible, so they're less likely to fall out-of-sync
>   (the markers are in comments in the test file, directly above the tests)
>
> it's my view that spec tests should be easy to maintain for developers
> of multiple implementations, and uniqueness is an overly burdensome
> constraint.

A simple algorithm (used by tcl's spec tests) is to have each named test correspond roughly to the name of the file (which in turn corresponds roughly to the name of the feature being tested), and then increment vaguely numerically, e.g.:

    dict-1.1
    dict-2.1
    dict-2.2
    dict-2.3

Then, if they have to add a test in a future revision, they can insert it between dict-2.1 and dict-2.2, call it dict-2.1-a, and still know that dict-2.2 is testing the same code, regardless of when that test was run.

Regards.

-- 
Will "Coke" Coleda
Re: RFD: Built-in testing
- Original Message
> From: jerry gay
> i don't understand the drive to have unique test identifiers. we don't
> have unique identifiers for every code statement, or every bit of
> documentation. why are tests so important/special/different that each
> warrants a unique id?

Actually, if code is well-written, we *do* sort of have unique identifiers. "Bob, you need to change &Customer::name to also show the middle initial". We don't really have anything like that in tests unless we move close to the xUnit style. TAP has no concept of this.

Unique identifiers are useful in that they can let you track changes over time (many of us use source control history to understand changes over time for code). It would be very useful to have unique identifiers to persist to a db and create graphs of one's test suite behavior ("hey, we keep failing our credit card tests. We should look into this more carefully!").

Cheers,
Ovid
-- 
Buy the book - http://www.oreilly.com/catalog/perlhks/
Tech blog - http://use.perl.org/~Ovid/journal/
Twitter - http://twitter.com/OvidPerl
Official Perl 6 Wiki - http://www.perlfoundation.org/perl6
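The "persist to a db" idea can be sketched roughly as follows, in Python with the standard `sqlite3` module. The table layout, the `record` and `failure_counts` helpers, and the sample data are all assumptions made up for this illustration; the test IDs borrow the tcl-style names mentioned earlier in the thread.

```python
import sqlite3

# In-memory database; a real tracker would presumably use a file.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE results (test_id TEXT, revision INTEGER, passed INTEGER)"
)


def record(test_id, revision, passed):
    """Store one outcome, keyed by the test's stable identifier."""
    conn.execute(
        "INSERT INTO results VALUES (?, ?, ?)", (test_id, revision, int(passed))
    )


def failure_counts():
    """Return (test_id, failures) pairs, most frequently failing first."""
    rows = conn.execute(
        "SELECT test_id, COUNT(*) FROM results WHERE passed = 0 "
        "GROUP BY test_id ORDER BY COUNT(*) DESC, test_id"
    )
    return rows.fetchall()


# Three hypothetical revisions of the same suite.
history = [
    {"dict-1.1": True, "dict-2.1": False},
    {"dict-1.1": True, "dict-2.1": False},
    {"dict-1.1": False, "dict-2.1": True},
]
for revision, outcomes in enumerate(history):
    for test_id, passed in outcomes.items():
        record(test_id, revision, passed)

print(failure_counts())
```

The query only makes sense because `test_id` means the same thing in every revision, which is the property the identifiers are meant to provide.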
Re: RFD: Built-in testing
On 2009 Jan 21, at 7:35, Carl Mäsak wrote:
> Moritz (>):
> So Larry and Patrick developed the idea of creating an adverb on the test
> operator instead:
>
>     $x == 1e5 :ok('the :ok makes this a test');
>
> I'm trying to explain to myself why I don't like this idea at all. I'm
> only partially successful. Other people seem to have no problem with

I'm having SNOBOL flashbacks. That's quite enough to put me off of it.

-- 
brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allb...@kf8nh.com
system administrator [openafs,heimdal,too many hats] allb...@ece.cmu.edu
electrical and computer engineering, carnegie mellon university KF8NH
Re: RFD: Built-in testing
On Fri, Jan 23, 2009 at 6:39 PM, Dave Whipp wrote:
> A spec-test is (or should be) different from an ad-hoc test. I want to be
> able to say "test S09.237 passes on pugs but not on Rakudo" (perhaps with a
> nicer name). Unique identifiers allow comparisons of specific tests across
> multiple implementations, and over time. It is possible to derive IDs using
> line numbers (perhaps block-relative), but that's only a good idea if the
> test suite is reasonably stable (and it requires tool support).
>

That sounds useful on the surface but often turns out to be more difficult to do than you might think. There are many cases where tests are performed from within loops. Something like S09.237 may or may not be in a loop, may be difficult to identify in files with many tests. This sort of test name could be the test message output by Test.pm's verbose output, but it then makes the verbose output virtually useless in that Test.pm could just keep records of the test numbers instead. There can also be multiple tests per single line of code, especially if provided as an adverb, such as :ok.

Test labels seem like an aspect that is highly susceptible to bit-rot due to their ever-evolving nature. Given the multitude of things that can go wrong trying to keep records, it might not be a good idea to focus on this. Rather, it might be a good idea to have the language provide a base test and a means to extend the test. This would allow for the tests previously written to transparently change the back-end testing mechanism. Here's a very crude example.

Let's say that ok() is defined by the Core (and thus the language):

    multi sub ok(Bool $test, Str $msg) {
        if $test { say "ok $msg" }
        else     { say "not ok $msg" }
    }

Then let's say I don't want the default (pseudo-)TAP test output; I could redefine what ok() does:

    multi sub ok(Bool $test) {
        say "A test has failed at some point somewhere" unless $test
    }

    ok(?($x == 4), "no good has come of this");  # calls Core's ok()
    ok(?($x == 2));                               # calls my crappy ok()

That's just an example to show that the language could provide a basic version that is extensible with various implementations and various compilers such that I don't have to write constantly unique test names (or poorly identified names) and still only have to write a test once.

-Jason "s1n" Switzer
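The "base test plus swappable back end" idea might look like this in a Python sketch. `TapReporter`, `SummaryReporter` and `set_reporter` are hypothetical names invented for the illustration, not part of any proposed Perl 6 API; the point is only that the test code calling `ok()` stays unchanged while the reporting changes.

```python
class TapReporter:
    """Default back end: collects pseudo-TAP lines."""

    def __init__(self):
        self.lines = []
        self.count = 0

    def report(self, passed, msg):
        self.count += 1
        status = "ok" if passed else "not ok"
        self.lines.append(f"{status} {self.count} - {msg}")


class SummaryReporter:
    """Alternative back end: only counts failures."""

    def __init__(self):
        self.failures = 0

    def report(self, passed, msg):
        if not passed:
            self.failures += 1


_reporter = TapReporter()


def set_reporter(reporter):
    """Swap the back end without touching any test code."""
    global _reporter
    _reporter = reporter


def ok(test, msg=""):
    """The one primitive that test authors call."""
    _reporter.report(bool(test), msg)
    return bool(test)


def run_tests():
    x = 4
    ok(x == 4, "x is four")
    ok(x == 2, "x is two")


tap = TapReporter()
set_reporter(tap)
run_tests()
print("\n".join(tap.lines))

summary = SummaryReporter()
set_reporter(summary)
run_tests()
print(f"{summary.failures} test(s) failed")
```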
Re: RFD: Built-in testing
jerry gay wrote:
> i don't understand the drive to have unique test identifiers. we don't
> have unique identifiers for every code statement, or every bit of
> documentation. why are tests so important/special/different that each
> warrants a unique id? that aside, this functionality sounds like it
> can be encapsulated in a module, if desired. as it stands, i can't see
> a reason it *has to* be made available in the core.

I have a mental model that says, for each implementation, there is a mapping that tells us which tests are runnable, non-runnable, etc. Imposing such information from without is difficult (or fuzzy) if tests aren't identifiable (giving a name to a group of tests allows a whole group to be enabled/disabled as one).

I'd point out that we do, in fact, name statements when it makes sense to do so: with nested loops, labels allow you to refer to outer loops explicitly for C<next> or C<last> statements (also C<redo>). The fact that it's possible to name something doesn't require you to do so. But the ability to name things like tests is a useful capability, in that it makes it possible to programmatically enable/disable them without touching the source code that defines them.

A spec-test is (or should be) different from an ad-hoc test. I want to be able to say "test S09.237 passes on pugs but not on Rakudo" (perhaps with a nicer name). Unique identifiers allow comparisons of specific tests across multiple implementations, and over time. It is possible to derive IDs using line numbers (perhaps block-relative), but that's only a good idea if the test suite is reasonably stable (and it requires tool support).

Actually, if the truth be known, I don't really want to say that. I much prefer to define behavior using properties, and then say "random seed 35467 generated a test that caused assertion XYZ to fail" (or "random seed 54578 generates a test that gives a different result on Rakudo vs Pugs").
Specific hand-coded directed/focused tests are usually a last resort in my line of work. I don't care if the functionality is "Core" or "Module" -- I'm not even sure that there's a distinction. I think the question is more "is it specified as part of the language, or not" -- and if it's used by the spec of the language then it seems reasonable to specify it.
Re: RFD: Built-in testing
On Fri, Jan 23, 2009 at 12:37, Dave Whipp wrote:
> I could also imagine writing code that reads from an Sqlite database, and
> imposes that info onto the test. Whatever mechanism is used, I think we need
> a language-defined mechanism to supply a stable unique identifier for each
> test, so that it can be individually tracked and manipulated. Perhaps "is
> only" is the wrong way to implement the action-at-a-distance, but it does
> seem better (IMO) than a preprocessor.
>
i don't understand the drive to have unique test identifiers. we don't have unique identifiers for every code statement, or every bit of documentation. why are tests so important/special/different that each warrants a unique id? that aside, this functionality sounds like it can be encapsulated in a module, if desired. as it stands, i can't see a reason it *has to* be made available in the core.

as a recap, the discussions larry, patrick, moritz and i (and others, i'm sure) had on this topic long ago led to agreement that the most important characteristics for a portable specification test suite were:

~ the tests should be organized in such a way that it makes it easy to
  figure out what bit of spec is under scrutiny
  (addressed by directory/filename standardization and smartlinks)
~ the test files mustn't be cluttered with code that implementations need
  ignore
  (comments are used, which are by default ignored, and can be
  preprocessed to customize the test for each implementation)
~ the skip/todo markers should be as close to the relevant tests as
  possible, so they're less likely to fall out-of-sync
  (the markers are in comments in the test file, directly above the tests)

it's my view that spec tests should be easy to maintain for developers of multiple implementations, and uniqueness is an overly burdensome constraint.

~jerry
Re: RFD: Built-in testing
Larry Wall wrote:
>> module MyTests {
>>     sub group1 {
>>         ok foo :name<test_foo>; ## Q - would a label be better?
>>     }
>> }
>> ## Elsewhere
>> MyTests.group1.test_foo is also broken;
>
> I guess I don't see offhand what you're trying to do with that.
> ...
> We must keep a clean separation between code that proves success and
> any indicator that says "don't try this yet".

That was the intent. The test (within the MyTests module) would define tests in a platform-agnostic way. The "is also" clause would be added in some other place (a platform-specific file) that says which tests are currently broken (or perhaps adds some other tag that indicates that it should be skipped for smoke testing, but not for full regressions). The point is to have a mechanism within the language (i.e. not a preprocessor) that imposes those tags from afar: a useful action-at-a-distance that is necessary to separate the test from its current status.

I could also imagine writing code that reads from an Sqlite database, and imposes that info onto the test. Whatever mechanism is used, I think we need a language-defined mechanism to supply a stable unique identifier for each test, so that it can be individually tracked and manipulated. Perhaps "is only" is the wrong way to implement the action-at-a-distance, but it does seem better (IMO) than a preprocessor.
Re: RFD: Built-in testing
On Fri, Jan 23, 2009 at 11:16:21AM -0800, Dave Whipp wrote:
> I can see that. So the alternative is to give things names and/or tags,
> so that we can attach parameters remotely.

Hmm, well, we also decided not to use any solutions that encourage putting the metadata too far away from the place it modifies. Somewhere else in the same file is perhaps okay (and I can see the use of tags in messages if the message itself isn't unique, but then why isn't the message unique?). But as soon as you have unique IDs people think they have to move the metadata out to a database, and then you're back with the same kind of always out-of-date and out-of-sync errors that we used to get with documentation before POD.

Plus you start getting back into uncertainty as to whether something external to the program is cheating, unless you can prove a positive cutoff to the fudging metadata while doing validation testing. I really like the notion that final validation of 6.0.0 involves simply running the test files without any reference to outside data.

> Such a mechanism should
> probably be more general than just tests, so I'll overload "is also" to
> impose additional traits:
>
> module MyTests {
>     sub group1 {
>         ok foo :name<test_foo>; ## Q - would a label be better?
>     }
> }
>
> MyTests.group1.test_foo is also broken;
>
> presumably this would have some form of wildcarding, or inheritance of
> the "broken" trait from outer scopes:
>
> MyTests is also broken;
>
> Not sure if that could work.

I guess I don't see offhand what you're trying to do with that. Modules are primarily about exportation, and seem like the wrong peg to be hanging test info on--assuming such metadata even wants to look like real code, which I don't think it does. The real code wants to look exactly like what it will look like when rakudo *isn't* broken anymore. Test code should rarely be in the business of asserting that something is broken.
Or to put it another way, test code that asserts failure a priori can never prove success. We must keep a clean separation between code that proves success and any indicator that says "don't try this yet". Every bit of code that is dependent on platform dependencies is, by definition, not platform independent, and we've got to keep at least the language validation tests platform independent. Larry
Re: RFD: Built-in testing
Larry Wall wrote:
> On Fri, Jan 23, 2009 at 08:01:14AM -0800, Dave Whipp wrote:
>> For example, I could conceive of a trait:
>>
>>     ok foo, :broken
>>
>> which might downgrade the error to a warning on rakudo, but not on other
>> implementations.
>
> On the surface that seems like a good idea, and pugs started out doing
> things this way, but we discovered that it's a Terrible Mistake to mix
> platform dependencies in with the notation of the actual test, ...
>
> All that being said, fudge is a preprocessor, and preprocessors are a
> form of evil, so I'd certainly be open to the actual parser doing the
> fudging during compilation if explicitly requested to do so. My main
> concern is that the fudging directives not be intermixed with the actual
> test, and that they not look like real code.

I can see that. So the alternative is to give things names and/or tags, so that we can attach parameters remotely. Such a mechanism should probably be more general than just tests, so I'll overload "is also" to impose additional traits:

    module MyTests {
        sub group1 {
            ok foo :name<test_foo>; ## Q - would a label be better?
        }
    }

    MyTests.group1.test_foo is also broken;

presumably this would have some form of wildcarding, or inheritance of the "broken" trait from outer scopes:

    MyTests is also broken;

Not sure if that could work.

Dave.
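One way to picture the "inheritance of the broken trait from outer scopes" is a lookup that walks up a dotted test name until it finds an override. This is only a Python sketch of the concept: `status_of` and the two override tables are hypothetical, and the dotted names simply mirror the `MyTests.group1.test_foo` example.

```python
def status_of(test_name, overrides):
    """Return the status imposed on test_name, inheriting from outer scopes.

    The most specific entry wins: the full name is tried first, then each
    enclosing scope in turn. Tests with no entry at all are enabled.
    """
    parts = test_name.split(".")
    while parts:
        key = ".".join(parts)
        if key in overrides:
            return overrides[key]
        parts.pop()  # drop the innermost component and try the outer scope
    return "enabled"


# Hypothetical per-implementation tables, kept apart from the tests themselves.
rakudo_overrides = {"MyTests.group1.test_foo": "broken"}
pugs_overrides = {"MyTests": "broken"}

print(status_of("MyTests.group1.test_foo", rakudo_overrides))
print(status_of("MyTests.group1.test_bar", rakudo_overrides))
print(status_of("MyTests.group2.test_baz", pugs_overrides))
```

Running with an empty override table is then the equivalent of a full validation run, since every test falls through to "enabled".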
Re: RFD: Built-in testing
On Fri, Jan 23, 2009 at 08:01:14AM -0800, Dave Whipp wrote: > For example, I could conceive of a trait: > > ok foo, :broken > > which might downgrade the error to a warning on rakudo, but not on other > implementations. On the surface that seems like a good idea, and pugs started out doing things this way, but we discovered that it's a Terrible Mistake to mix platform dependencies in with the notation of the actual test, which is why we now use the "fudge" preprocessor approach, where any platform-dependent cheating is listed on its own line and looks like a comment to other platforms. Plus it's very easy to measure whether you're passing the test or not--you just turn off all the fudging, which leaves all the annotations as mere comments. If you mix the notation in with the test, then the test harness has to explicitly ignore the notations both for other platforms and also for this platform when a complete validation is desired; whether that is being done correctly is more difficult to prove, and it opens up the test harness to potential accusations of perfidious cheating. With the current approach it's drop-dead easy to see whether or not the tests are cheating--either you're running "fudge" or you're not. All that being said, fudge is a preprocessor, and preprocessors are a form of evil, so I'd certainly be open to the actual parser doing the fudging during compilation if explicitly requested to do so. My main concern is that the fudging directives not be intermixed with the actual test, and that they not look like real code. Larry
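The preprocessor approach described above can be illustrated with a toy version in Python. This is not the real fudge tool: the `#?impl skip 'reason'` directive form, the `fudge` function and the sample test lines are assumptions chosen for the sketch. What it tries to show is the key property, namely that directives are ordinary comments unless the preprocessor is run for a matching implementation.

```python
import re

# Assumed directive form for this sketch: #?<implementation> skip '<reason>'
DIRECTIVE = re.compile(r"^#\?(\w+)\s+skip\s+'(.*)'$")


def fudge(lines, impl):
    """Replace the test line that follows a matching directive with a skip."""
    out = []
    pending = None  # reason of a directive still waiting for its test line
    for line in lines:
        match = DIRECTIVE.match(line.strip())
        if match:
            out.append(line)  # the directive itself stays, as a comment
            if match.group(1) == impl:
                pending = match.group(2)
            continue
        if pending is not None and line.strip():
            out.append(f"skip('{pending}');  # was: {line.strip()}")
            pending = None
        else:
            out.append(line)
    return out


source = [
    "#?rakudo skip 'adverbs not yet implemented'",
    "ok(1 == 1, 'first');",
    "ok(2 == 2, 'second');",
]

print("\n".join(fudge(source, "rakudo")))
```

For any other implementation, or when the preprocessor is not run at all, the file is used unchanged, which is what makes it easy to see whether a run was fudged.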
Re: RFD: Built-in testing
Timothy S. Nelson wrote: method foo() does assume { ... } method bar() does ensure { ... } Is "ensure" equivalent to the "assert" that you describe above? Yes. "does ensure" was meant to be an englishification of "postcondition"; and "does assume" is "precondition". From the perspective of formal specification, one assumes that a precondition is true, and the body of the method/sub/block must ensure that the postcondition is true (given the assumption of any preconditions). method baz() { bar; ok conserve_sum; foo; } I'd suggest that we don't even need to have "ok" here; we'd be better off just going "conserve_sum()", and assuming that, because it's a property, the "ok" will be automatically attached. I know you're not being picky about syntax at the moment, but I wanted to throw the idea into the ring. You really need some keyword there, to distinguish between the roles of "assume" and "assert". Also, it provides a construct to hang other traits on to. For example, I could conceive of a trait: ok foo, :broken which might downgrade the error to a warning on rakudo, but not on other implementations.
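The distinction between assuming and ensuring a property can be sketched with two Python decorators. Everything here is illustrative: `assume`, `ensure` and `InvalidTest` are hypothetical names, and the `Foo` class only mimics the thread's example. A failed assumption is reported as an invalid test, while a failed postcondition is an ordinary assertion failure.

```python
import functools


class InvalidTest(Exception):
    """A violated assumption: the test is wrong, not the code under test."""


def assume(prop):
    """Precondition: the caller must already have made `prop` true."""
    def decorate(method):
        @functools.wraps(method)
        def wrapper(self, *args, **kwargs):
            if not prop(self):
                raise InvalidTest(f"precondition {prop.__name__} does not hold")
            return method(self, *args, **kwargs)
        return wrapper
    return decorate


def ensure(prop):
    """Postcondition: the method body must leave `prop` true."""
    def decorate(method):
        @functools.wraps(method)
        def wrapper(self, *args, **kwargs):
            result = method(self, *args, **kwargs)
            assert prop(self), f"postcondition {prop.__name__} violated"
            return result
        return wrapper
    return decorate


def conserve_sum(obj):
    return obj.a + obj.b == 42


class Foo:
    def __init__(self, a, b):
        self.a, self.b = a, b

    @assume(conserve_sum)
    def foo(self):
        self.a -= 1
        self.b += 1

    @ensure(conserve_sum)
    def bar(self):
        self.a, self.b = 40, 2


f = Foo(1, 1)
try:
    f.foo()  # a + b is 2, so the assumption is violated
except InvalidTest as exc:
    print("invalid test:", exc)

f.bar()  # establishes the property
f.foo()  # now the assumption holds
print(f.a, f.b)
```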
Re: RFD: Built-in testing
On Wed, 21 Jan 2009, Dave Whipp wrote:
> Moritz Lenz wrote:
>> A few months ago Larry proposed to add some testing facilities to the
>> language itself, because we want to culturally encourage testing, and
>> because the test suite defines the language, so we need to specify the
>> behaviour of our testing facilities anyway.
>
> If we're going to revamp the testing primitives, then I'd like to suggest
> importing some concepts from hardware verification languages, whose entire
> purpose is to define tests. Not too much, but just a few definitions:

I love the basic ideas, but I have a few queries along the way.

> * Define a "property" as an expression whose truth is of interest
>   (properties may be named, or may be anonymous inline).
> * An "assert <property>" statement (aka "ok <property>") indicates that a
>   violation of the property is to be considered an error
> * An "assume <property>" statement indicates that a violation of the
>   property implies an incorrect test.

It seems to me, from your description, that "assert <property>" is more like:

    if(! <property>) { throw exception }

...and that assume is more like ok().

> class Foo {
>     has $.a;
>     has $.b;
>     property conserve_sum { $.a + $.b == 42 },
>         "a+b must sum to 42, but a=$.a + b=$.b == { $.a+$.b }";
>     method foo() does assume { ... }
>     method bar() does ensure { ... }

Is "ensure" equivalent to the "assert" that you describe above?

>     method baz() { bar; ok conserve_sum; foo; }

I'd suggest that we don't even need to have "ok" here; we'd be better off just going "conserve_sum()", and assuming that, because it's a property, the "ok" will be automatically attached. I know you're not being picky about syntax at the moment, but I wanted to throw the idea into the ring.

> An interesting type of property is one that tracks a series of events
> through time: a so-called "temporal" property. A simple idea might be that
> "conserve_sum" should actually mean "sum doesn't change", instead of "is
> constant 42":
>
> class Foo {
>     ...
>     coro property conserve_sum {
>         my $sum = $.a + $.b;
>         leave True;
>         ok $.a + $.b == $sum, "sum not conserved: expected $sum, actual {$.a+$.b}"
>     }
>     method foo() does maintain { --$.a; ++$.b }
> }

Vote++ :)

-
| Name: Tim Nelson | Because the Creator is,|
| E-mail: wayl...@wayland.id.au| I am |
-
BEGIN GEEK CODE BLOCK
Version 3.12
GCS d+++ s+: a- C++$ U+++$ P+++$ L+++ E- W+ N+ w--- V- PE(+) Y+>++ PGP->+++ R(+) !tv b++ DI D G+ e++> h! y-
-END GEEK CODE BLOCK-
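The coroutine-style temporal property maps fairly naturally onto a Python generator: it captures state, suspends while the method body runs, and checks on resumption. This is a sketch under assumptions, not a rendering of any agreed Perl 6 semantics; the `maintain` decorator and the `broken` method are hypothetical additions for the demonstration.

```python
import functools


def conserve_sum(obj):
    """Temporal property: a + b must be the same before and after."""
    expected = obj.a + obj.b  # captured before the method body runs
    yield True                # suspend here, like `leave True` in the coro
    actual = obj.a + obj.b    # resumed after the method body has run
    assert actual == expected, (
        f"sum not conserved: expected {expected}, actual {actual}"
    )
    yield True


def maintain(prop):
    """Wrap a method so that `prop` brackets its execution."""
    def decorate(method):
        @functools.wraps(method)
        def wrapper(self, *args, **kwargs):
            check = prop(self)
            next(check)  # run up to the first yield: record the "before" state
            result = method(self, *args, **kwargs)
            next(check)  # resume: compare the "after" state
            return result
        return wrapper
    return decorate


class Foo:
    def __init__(self, a, b):
        self.a, self.b = a, b

    @maintain(conserve_sum)
    def foo(self):
        self.a -= 1
        self.b += 1

    @maintain(conserve_sum)
    def broken(self):
        self.a -= 1  # changes the sum, so the property should fail


f = Foo(40, 2)
f.foo()
print(f.a, f.b)

try:
    f.broken()
except AssertionError as exc:
    print(exc)
```

Unlike the "is constant 42" version, nothing here depends on a particular total; only a change across the call is reported.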
Re: RFD: Built-in testing
On Thu, 22 Jan 2009, Richard Hainsworth wrote:
> 4) Testing software is different from debugging or running software.
> Running is about providing functionality to the user.
> Debugging is about getting expected behaviour and discovering why
> behaviour exhibited is not what is expected / specified.
> Testing is about demonstrating that the functionality provided is the
> functionality expected / specified under all specified conditions.

I guess I've always seen it as having even more facets. I hadn't thought about testing in this context before. My suggestion, though, is that the facets would include:

- "Useful" code (ie. the code that actually does the stuff you want)
- Runtime error handling (ie. try/catch/whatever)
- Debugging code (kind of like a log of watches[*], etc)
- Tests (we're talking about these)
- Comments # We know what these are :)
- Documentation (POD, etc)

[*] by "watches", I mean those things in a GUI where you get it to show you the contents of a variable.

If there is a name for what I've labelled '"Useful" code', then please let me know :).

Anyway, I just wanted to highlight the contrast between code (which is essentially a 1-dimensional character stream), and the 2-dimensional nature we're trying to capture, in hopes that it will give someone ideas.

In particular, I note that we have specialised syntax for each of these. While it would presumably be confusing to unify the syntax for all of them, it seems to me that they naturally break into three groups:

- "Useful" code (in a group by itself)
- Checking code
    - Runtime error handling
    - Debugging code
    - Tests
- English
    - Documentation
    - Comments

Whether we should somehow unify the syntax of either the checking group or the English group isn't something that I know the answer to, but it's a thought. Perl 6 is confusing to the beginner, and that's OK, but I figure it should be no more confusing than it has to be.

:)

-
| Name: Tim Nelson | Because the Creator is,|
| E-mail: wayl...@wayland.id.au| I am |
-
BEGIN GEEK CODE BLOCK
Version 3.12
GCS d+++ s+: a- C++$ U+++$ P+++$ L+++ E- W+ N+ w--- V- PE(+) Y+>++ PGP->+++ R(+) !tv b++ DI D G+ e++> h! y-
-END GEEK CODE BLOCK-
Re: RFD: Built-in testing
On Thu, Jan 22, 2009 at 4:51 PM, jerry gay wrote: > $x == $y >:ok({ .true ?? 'message' !! 'failure message' }) >:diag( 'tap comment', :some_tap_property) I just want to stress again that I would like to see no focus on just tap emitters. While I realize this is just an example, adverbs that apply to a specific emitter would not be my preference. Extensible emitters would allow integrators the opportunity to mix perl6 tests in with perl5 tests and xUnit tests (for easily integrated test reports). -Jason "s1n" Switzer
Re: RFD: Built-in testing
- Original Message > From: jerry gay > On Thu, Jan 22, 2009 at 09:22, Moritz Lenz wrote: > > Richard Hainsworth wrote: > > But it is interesting to think about the case where a user wants two > > different diagnostic test messages (to all the testing gurus out there: > > do you actually want such a feature?). It shouldn't be too hard to do; > > maybe just :OK('True message', 'False message')? I can't speak for others, but I only want one diagnostic message, with the option to turn it on for passing tests. Having different messages for different conditions will confuse me :) Cheers, Ovid -- Buy the book - http://www.oreilly.com/catalog/perlhks/ Tech blog- http://use.perl.org/~Ovid/journal/ Twitter - http://twitter.com/OvidPerl Official Perl 6 Wiki - http://www.perlfoundation.org/perl6
Re: RFD: Built-in testing
On Thu, Jan 22, 2009 at 09:22, Moritz Lenz wrote:
> Richard Hainsworth wrote:
> But it is interesting to think about the case where a user wants two
> different diagnostic test messages (to all the testing gurus out there:
> do you actually want such a feature?). It shouldn't be too hard to do;
> maybe just :OK('True message', 'False message')?
>
maybe

    $x == $y
        :ok('message')
        :nok('failure message')

or

    $x == $y
        :ok({ .true ?? 'message' !! 'failure message' })
        :diag( 'tap comment', :some_tap_property)

to handle success and failure messages, and set custom diagnostic info in the tap stream. that is, as long as the result of the comparison is available in $_ to the :ok adverb.

~jerry
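Both variants, a separate failure message and a message computed from the result, can be mimicked in a small Python sketch. The `ok` function and its `nok` keyword are hypothetical stand-ins for the `:ok` and `:nok` adverbs; the callable plays the role of the block that receives the comparison result in `$_`.

```python
def ok(result, message, nok=None):
    """Return a TAP-style line for `result`.

    `message` is either a string or a callable that receives the result
    and returns the text. `nok`, if given, replaces the message on failure.
    """
    if callable(message):
        text = message(result)
    elif result or nok is None:
        text = message
    else:
        text = nok
    return f"{'ok' if result else 'not ok'} - {text}"


x, y = 3, 4
print(ok(x == x, "message", nok="failure message"))
print(ok(x == y, "message", nok="failure message"))
print(ok(x == y, lambda r: "message" if r else "failure message"))
```

With a single plain string and no `nok`, the same text is used for both outcomes, which matches the simpler preference voiced elsewhere in the thread.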
Re: RFD: Built-in testing
Ovid wrote:
> One concern is where Larry asks:
>
> > I wonder how often we'd have people making the error
> > of trying to interpolate into :ok
>
> I'd be one of them. The following is a very common idiom:
>
>     for my $method (@methods) {
>         can_ok $object, $method;
>         lives_ok { $object->$method } "... and calling '$method' isn't fatal";
>     }

Single angle quotes are just like single quotes in that they don't interpolate, whereas double angle quotes are just like double quotes; they interpolate. So you can just write :ok«... and calling '$method' isn't fatal», or :ok<<...>> or :ok("...") - it's not like there were only one way to write an attribute ;-)

Surely people will make mistakes when they blindly assume things, but they'll learn it rather quickly.

(BTW I want the non-interpolating test description just as often as the interpolating one, as in :ok; but that might be because I'm testing Perl 6, not user-level applications).

> Interpolation in the test description is very important on iterative tests or
> to distinguish similar tests

That's why there's still more than one way to do it ;-)

Cheers,
Moritz
Re: RFD: Built-in testing
There are a few interesting points on which I'd like to comment.

Richard Hainsworth wrote:
> In other words, test functionality sufficient for the compiler may not
> be adequate for module testing. But other functions can be developed in
> Test modules that can be hooked into a general testing approach.

That's clear to me, and our current approach doesn't require all tests to be written as adverbs - only the most common ones. For example I think the eval_lives_ok and eval_dies_ok functions will remain, as well as a few others.

> a) a global variable $*TESTING which defaults to FALSE (or should it be
> $?TESTING ?)
>
> It could be set lexically so that specific software / modules can be
> tested without triggering tests in other used modules.

I don't think that lexical is a good choice, since it means that you can only ever turn it on from the inside, which means that all code that contains TEST blocks also has to have some logic for switching on $*TESTING - which smells like a lot of code duplication.

> b) When $*TESTING is TRUE, any TEST block is executed.
>
> c) Within a TEST block, the ternary ?? !! is defined slightly
> differently: its condition is guaranteed to return Boolean::FALSE if an
> exception or failure condition is encountered when evaluating it.
>
> Some advantages of this approach over :OK<>:
> - no new behaviour outside of a TEST block is defined, no change to
> adverbs or boolean operators.

That is an advantage, but the definition of the :OK adverbs could also somehow magically be scoped to TEST blocks.

> - any expression that leads to a boolean result (note that :OK is
> suggested to be defined only on boolean operators) can be included in
> the expression, eg., an entire block.

The same can be achieved with ? ... :OK

> - The programmer has control over both the "True" diagnostic, as in the
> :OK<> syntax, but also over the 'False' diagnostic, thus allowing a
> degree of introspection on the component of the expression, which the
> programmer has more knowledge about than the compiler.

Sadly he not only has the control, but is also obliged to cater for both cases. I'm lazy, and I don't want to type all of my messages twice.

But it is interesting to think about the case where a user wants two different diagnostic test messages (to all the testing gurus out there: do you actually want such a feature?). It shouldn't be too hard to do; maybe just :OK('True message', 'False message')?

> - Since the variables used in the boolean expression are available to
> the programmer for both diagnostics, there is no need for special magic
> to generate the failure diagnostic, which seems to be the situation with
> :OK<>.

No, the :OK solves that problem, it doesn't generate it. Also it implies again that the programmer actually has to do it himself, which goes against the principle of laziness.

> - Since it is the programmer that defines the False diagnostic, no extra
> autogenerated macros are needed.
> - The minimum that is needed for a test would be to specify a 'true'
> diagnostic and the $! error variable, eg.,
> TEST { 2 == 2 ?? say 'constants are constants' !! say $! };

And what would $! contain in this case?

(I think the real objection from me is that ?? !! is just plain ugly; but then again I might be blind here...)

> d) A TEST block is specified to react to exceptions / failures in a
> different manner than in normal blocks. Uncaught exceptions are
> discarded at the end of a block. Thus compiler / module / software
> failures do not stop the software from continuing, unless specifically
> required by the programmer to do so within the block.

That's quite a good idea.

> e) Other functions that are useful in test suites, such as plan, could
> be defined later as wrappers around "?? !!"

Or just stay as plain functions.

Cheers,
Moritz
Re: RFD: Built-in testing
Ovid wrote:
> Regarding the disadvantages:
>
>> However nothing in life is free, we pay for it with a
>> few disadvantages:
>> * We nearly double the number of built-in operators
>>   by adding an :ok multi
>
> Yes, but conceptually this will be transparent to the end user, right?
> They'll just know that they can add :ok to operators. They'll mentally have
> one extra piece of information, not twice as many.

Right.

>> * We force implementors to handle operator adverbs
>>   and named arguments very early in their progress
>>   (don't know how easy or hard that is)
>
> This might be a problem. After my (now possibly moot) rewrite of Test.pm was
> finished, my plan was to write a basic Test.pm which required as few features
> as needed but still allowed the spectests to run. Then you simply provide
> language developers a list of features they need to implement to run the test
> suite. Adding operator adverbs to the mix means a lot of rewriting of tests.
>
> Alternatively, we can say "you don't need these at first" and Test.pm is
> merely an older way of running tests. It still remains a valid alternative
> and new implementers don't need to worry about adverbs.

But if the spectests are re-written in terms of adverbs, a compiler can't use them without adverbs. If they are not re-written, there's no point in introducing the syntax.

>> * Testing of operators becomes somewhat clumsy. If you
>>   want to test infix:<==>, you won't write
>>   '2 == 2 :ok("== works")', because you test a
>>   different multi there. Instead you'd have to write
>>   something like '?(2 == 2) :ok("== works")', where
>>   :ok is an adverb on prefix:<?>.
>
> Bad:
>
>     2==2 :ok("== works");
>
> Good:
>
>     ?(2==2) :ok("== works");
>
> I don't relish explaining, over and over again, why the first is bad and the
> second is good. That being said, if this is only used for internals tests,
> is this likely going to be exposed?
This will only be a FAQ for the contributors of the official Test suite, and for people who write and test their own boolean operators. I guess we can live with that. All other people will assume that the operators already work. >> So I'd like to hear your opinions: do you think >> adverb-based testing is a good idea? If you don't like >> it, do you see any other good way to tackle the >> problems I mentioned above? > > So how would the following work? > > can_ok > lives_ok > throws_ok > isa_ok > is_deeply They would remain subs (unless somebody has a much better idea). > And so on? Sure, I can write extensions for this, but they're so common that > it seems a shame to not have them built-in, but what operator would they hook > to? > > Also, if we're going to go whole hog on this, then may I suggest a "tests" or > "test" keyword? We might have :ok embedded in our code, in which case > running multiple sections of code might have multiple sections with :ok. How > do test numbers work? When Foo.pm calls Bar.pm calls Baz.pm and they all use > :ok, we may not know how many tests we have, so these might get handled > different from something like this: > > test Unit::Customer plan 3 { > use Customer; > my Customer $cust .= new( :fname, :lname ); > $cust.fname eq 'Billy' :ok; > > # plan assumes 2 referrals > # won't work because we can't interpolate? > for $cust.referrals -> $ref_cust { > $ref_cust.referrer === $cust :ok<{$ref_cust.name} should have > correct referrer>; > } > } > > With a scheme like this, we can separate tests explicitly written by > programmers for testing and those which are embedded. If the &referrals > method has :ok in it, this shouldn't impact the overall plan, right? > > Side note: for the desugar, I'd still prefer we go with 'have/want' instead > of 'got/expected'. We've been wanting to do this with TAP for a while. It > reads well and also aligns nicely for fixed-width fonts. I'll think a bit more about these points. Cheers, Moritz
Re: RFD: Built-in testing
- Original Message > From: Moritz Lenz > > test Unit::Customer plan 3 { > > use Customer; > > my Customer $cust .= new( :fname, :lname); > > $cust.fname eq 'Billy' :ok; > > > > # plan assumes 2 referrals > > # won't work because we can't interpolate? > > for $cust.referrals -> $ref_cust { > > $ref_cust.referrer === $cust :ok<{$ref_cust.name} should have > correct referrer>; > > } > > } > I'll think a bit more about these points. I've been thinking about this and have realized that it also solves an intractable problem with Perl 5 tests: identifying tests. By promoting 'test' to a first class concept (not just adjectives), you can "name" a test. Right now, I'm trying to write App::Prove::History (http://github.com/Ovid/app--prove--history/tree/master), a bad name for code which saves the state of test runs. One incredibly thorny problem I have is that tests are identified by the name of the file. Reorganize your tests in directories or rename 'em? You've just lost your test history. However, if tests have an implicit name, developers are no longer locked into a directory hierarchy to identify their tests. This also brings us conceptually closer to the xUnit crowd. I would say for the above, if &referrals had embedded :ok tests, they could be output as warnings (if failing) or be provided via some mechanism that would let them be embedded into a TAP stream (or other test protocol) so that the information is not lost. I also wonder if 'plan' might not belong there. Not all testing protocols implement that and perhaps some developers won't want it. So long as their tests don't prematurely exit, they know they've run all of their tests. Cheers, Ovid -- Buy the book - http://www.oreilly.com/catalog/perlhks/ Tech blog- http://use.perl.org/~Ovid/journal/ Twitter - http://twitter.com/OvidPerl Official Perl 6 Wiki - http://www.perlfoundation.org/perl6
Re: RFD: Built-in testing
Moritz Lenz wrote:

    $x == 1e5 :ok('the :ok makes this is a test');

I can't help feeling that there's an end-weight problem here: the fact that it is a test is the essence of the statement.

If we're thinking of it as a library, then the MMD way of thinking might be appropriate: we know it's an equality test so there's no need to introspect it. But if we're thinking of it as a core language feature, then using macro semantics -- and introspecting the AST -- isn't necessarily a bad thing.

It also depends on the context. If we're in a file that contains just tests, then there's nothing unexpected about seeing a test. Indeed, the fact that a statement is a test is no longer important (so no end-weight issue). But if we want to see these ":ok" tests littering everyday code (i.e. as assertions) then it would be wrong to not make explicit the fact that the statement is a test.
Re: RFD: Built-in testing
Moritz Lenz wrote: A few months ago Larry proposed to add some testing facilities to the language itself, because we want to culturally encourage testing, and because the test suite defines the language, so we need to specify the behaviour of our testing facilities anyway.

If we're going to revamp the testing primitives, then I'd like to suggest importing some concepts from hardware verification languages, whose entire purpose is to define tests. Not too much, but just a few definitions:

* Define a "property" as an expression whose truth is of interest (properties may be named, or may be anonymous inline).
* An "assert " statement (aka "ok ") indicates that a violation of the property is to be considered an error.
* An "assume " statement indicates that a violation of the property implies an incorrect test.

Assumptions are very important when you write automated test generators, or need to validate that your tests are not doing something illegal. If you violate an assumption when running normal code then it's not really any different from hitting an assertion. You want as many assumptions as possible to be part of the type system (which we already do with "where" clauses).

(I'm using the phrase "assert" instead of "ok" because that's the standard terminology: in perl, "ok" is standard, so no need to rename it. But I do think that we need to qualify it with whether it's an assumption or an assertion. You can also think of an assumption as a precondition. But adding "PRE" blocks to every function tends to encourage cargo-cult DBC programming.)

The other thing I'd like to point out is that the concept of a "property" can be very general. We shouldn't assume that they just sit in the middle of procedural code. Specifically, it should be possible to define invariants on objects, which should be true at some specified point in time (however, it's not always obvious what that point in time is).

I'm thinking we might have something like:

    class Foo {
        has $.a;
        has $.b;
        property conserve_sum { $.a + $.b == 42 }, "a+b must sum to 42, but a=$.a + b=$.b == { $.a+$.b }";
        method foo() does assume { ... }
        method bar() does ensure { ... }
        method baz() { bar; ok conserve_sum; foo; }
    }

A property is just a method that returns a Bool but, if you associate the failure message with it, then it becomes simple to assert/assume it in multiple places without needing to keep repeating the message.

An interesting type of property is one that tracks a series of events through time: a so-called "temporal" property. A simple idea might be that "conserve_sum" should actually mean "sum doesn't change", instead of "is constant 42":

    class Foo {
        ...
        coro property conserve_sum {
            my $sum = $.a + $.b;
            leave True;
            ok $.a + $.b == $sum, "sum not conserved: expected $sum, actual {$.a+$.b}"
        }
        method foo() does maintain { --$.a; ++$.b }
    }

I don't think that it is necessary to be too cute with Huffmanization of testing primitives. All my examples here are just thinking aloud.
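The pairing of a check with its failure message can be approximated in ordinary code, without any new syntax. What follows is only a rough sketch in present-day Raku: the names Property, assert-ok and assume-ok are hypothetical, invented for illustration, and the "property" keyword and the "does assume" / "does ensure" traits from the message above are not modelled at all.

```raku
# Hypothetical sketch: Property, assert-ok and assume-ok are illustrative
# names, not part of any specification or library.
class Property {
    has &.check;      # returns a truthy value while the property holds
    has &.message;    # builds the failure message only when it is needed
}

class Foo {
    has $.a is rw;
    has $.b is rw;

    # The property carries its own failure message, so callers never repeat it.
    method conserve-sum() {
        Property.new(
            check   => { $!a + $!b == 42 },
            message => {
                my ($a, $b) = $!a, $!b;
                "a+b must sum to 42, but a=$a + b=$b == {$a + $b}"
            },
        )
    }
}

# A violated assertion is reported as an ordinary test failure.
sub assert-ok(Property $p) {
    return True if $p.check.();
    note "not ok - " ~ $p.message.();
    False
}

# A violated assumption means the test itself is invalid, so it dies.
sub assume-ok(Property $p) {
    die "invalid test: " ~ $p.message.() unless $p.check.();
    True
}

my $foo = Foo.new(a => 40, b => 2);
assert-ok($foo.conserve-sum);    # property holds, nothing is reported
$foo.a = 1;
assert-ok($foo.conserve-sum);    # property violated, message goes to STDERR
```

The difference between the two helpers is the point of the sketch: an assertion records a failure and lets the run continue, whereas an assumption aborts because the test is no longer meaningful.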
Re: RFD: Built-in testing
- Original Message
> From: Moritz Lenz
> So Larry and Patrick developed the idea of creating an
> adverb on the test operator instead:
>
>     $x == 1e5 :ok('the :ok makes this is a test');
>
> This is an adverb on the infix:<==> operator, and might
> desugar to something like this:
>
>     multi sub infix:<==>($left, $right, :$ok) {
>         $*TEST_BACKEND.proclaim($left == $right, $ok)
>             or $*TEST_BACKEND.diag(
>                 "Got: «$left.perl()»; Expected: «$right.perl()»");
>     }

Regarding the disadvantages:

> However nothing in life is free, we pay for it with a
> few disadvantages:
> * We nearly double the number of built-in operators
>   by adding an :ok multi

Yes, but conceptually this will be transparent to the end user, right? They'll just know that they can add :ok to operators. They'll mentally have one extra piece of information, not twice as many. Are there other consequences of this?

> * We force implementors to handle operator adverbs
>   and named arguments very early in their progress
>   (don't know how easy or hard that is)

This might be a problem. After my (now possibly moot) rewrite of Test.pm was finished, my plan was to write a basic Test.pm which required as few features as needed but still allowed the spectests to run. Then you simply provide language developers a list of features they need to implement to run the test suite. Adding operator adverbs to the mix means a lot of rewriting of tests.

Alternatively, we can say "you don't need these at first" and Test.pm is merely an older way of running tests. It still remains a valid alternative and new implementers don't need to worry about adverbs.

> * Testing of operators becomes somewhat clumsy. If you
>   want to test infix:<==>, you won't write
>   '2 == 2 :ok("== works")', because you test a
>   different multi there. Instead you'd have to write
>   something like '?(2 == 2) :ok("== works")', where
>   :ok is an adverb on prefix:<?>.

Bad:

    2==2 :ok("== works");

Good:

    ?(2==2) :ok("== works");

I don't relish explaining, over and over again, why the first is bad and the second is good. That being said, if this is only used for internals tests, is this likely going to be exposed?

> So I'd like to hear your opinions: do you think
> adverb-based testing is a good idea? If you don't like
> it, do you see any other good way to tackle the
> problems I mentioned above?

So how would the following work?

    can_ok
    lives_ok
    throws_ok
    isa_ok
    is_deeply

And so on? Sure, I can write extensions for this, but they're so common that it seems a shame to not have them built-in, but what operator would they hook to?

Also, if we're going to go whole hog on this, then may I suggest a "tests" or "test" keyword? We might have :ok embedded in our code, in which case running multiple sections of code might have multiple sections with :ok. How do test numbers work? When Foo.pm calls Bar.pm calls Baz.pm and they all use :ok, we may not know how many tests we have, so these might get handled differently from something like this:

    test Unit::Customer plan 3 {
        use Customer;
        my Customer $cust .= new( :fname, :lname );
        $cust.fname eq 'Billy' :ok;

        # plan assumes 2 referrals
        # won't work because we can't interpolate?
        for $cust.referrals -> $ref_cust {
            $ref_cust.referrer === $cust :ok<{$ref_cust.name} should have correct referrer>;
        }
    }

With a scheme like this, we can separate tests explicitly written by programmers for testing and those which are embedded. If the &referrals method has :ok in it, this shouldn't impact the overall plan, right?

Side note: for the desugar, I'd still prefer we go with 'have/want' instead of 'got/expected'. We've been wanting to do this with TAP for a while. It reads well and also aligns nicely for fixed-width fonts.
Cheers, Ovid
Re: RFD: Built-in testing
Moritz Lenz wrote: So I'd like to hear your opinions: do you think adverb-based testing is a good idea? If you don't like it, do you see any other good way to tackle the problems I mentioned above?

After reading everything in this thread to date and in order to structure my thoughts, I wrote up some assertions and a suggestion.

1) A perl6 implementation is to be certified by its ability to pass the test suite. The test functionality is implemented in the implementation (the implicit recursion has been noted in other threads). This means the behaviour of the test functionality must be specified and verifiable. Hence the assertion that perl6 is to be specified by a suite of documents and tests seems to me to imply that test functionality must be an inherent part of the language specification.

2) Test functionality for the compiler must be as simple to implement as possible, so that it can be incorporated into the implementation at an early stage.

3) Although test functionality for more complex software, eg., event-driven GUIs, could be constructed from simple specified test functionality, more complex forms will be developed to isolate the features that need testing and to provide the diagnostics. In other words, test functionality sufficient for the compiler may not be adequate for module testing. But other functions can be developed in Test modules that can be hooked into a general testing approach.

4) Testing software is different from debugging or running software. Running is about providing functionality to the user. Debugging is about getting expected behaviour and discovering why behaviour exhibited is not what is expected / specified. Testing is about demonstrating that the functionality provided is the functionality expected / specified under all specified conditions.

It seems to me that the ethos of testing could be much wider than just a part of the development stage. From a risk-management perspective, mission-critical software (especially when it is complex and large) should be tested regularly against a standard test suite, because random errors may occur in the software (eg., a power surge subtly corrupts the contents of hard disk storage), and particularly after any upgrade of hardware or ancillary software, or any other environmental change, to say nothing of changes (upgrades) in the software itself. Indeed, the inclusion of test functionality in the language and the focus on test suites as part of perl6 culture would make software written in perl6 extremely desirable in risk-sensitive companies.

Consequently, it seems to me that the following might be useful:

a) a global variable $*TESTING which defaults to FALSE (or should it be $?TESTING ?) It could be set lexically so that specific software / modules can be tested without triggering tests in other used modules.

b) When $*TESTING is TRUE, any TEST block is executed.

c) Within a TEST block, the ternary is defined slightly differently, thus for ?? !! ; is guaranteed to return Boolean::FALSE if an exception or failure condition is encountered when evaluating it.

Some advantages of this approach over :OK<>:

- no new behaviour outside of a TEST block is defined, no change to adverbs or boolean operators.
- any expression that leads to a boolean result (note that :OK is suggested to be defined only on boolean operators) can be included in the expression, eg., an entire block.
- The programmer has control over both the "True" diagnostic, as in the :OK<> syntax, but also over the 'False' diagnostic, thus allowing a degree of introspection on the component of the expression, which the programmer has more knowledge about than the compiler.
- Since the variables used in the boolean expression are available to the programmer for both diagnostics, there is no need for special magic to generate the failure diagnostic, which seems to be the situation with :OK<>.
- Since it is the programmer that defines the False diagnostic, no extra autogenerated macros are needed.
- The minimum that is needed for a test would be to specify a 'true' diagnostic and the $! error variable, eg.,

    TEST { 2 == 2 ?? say 'constants are constants' !! say $! };

d) A TEST block is specified to react to exceptions / failures in a different manner than in normal blocks. Uncaught exceptions are discarded at the end of a block. Thus compiler / module / software failures do not stop the software from continuing, unless specifically required by the programmer to do so within the block.

e) Other functions that are useful in test suites, such as plan, could be defined later as wrappers around "?? !!"

Regards, Richard
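Points a) to d) can be mimicked with plain subs, which may help when judging how much of the proposal really needs language support. This is only a sketch under stated assumptions: TEST here is an ordinary sub taking a block rather than a new block type, check is a hypothetical helper standing in for the modified "?? !!", and $*TESTING is simply a dynamic variable declared by the script.

```raku
# Hypothetical sketch: TEST, check and $*TESTING are illustrative only;
# none of them is defined by any specification.
my $*TESTING = True;

# b) The block runs only in testing mode.
# d) An exception escaping the block is discarded instead of propagating.
sub TEST(&block) {
    return unless $*TESTING;
    try block();
    Nil
}

# c) A condition that throws counts as False instead of aborting the run.
#    The caller supplies both the 'true' and the 'false' diagnostic.
sub check(&condition, Str $pass, Str $fail) {
    my $ok = (try { ?condition() }) // False;
    say $ok ?? "ok - $pass" !! "not ok - $fail";
    $ok
}

TEST {
    check { 2 == 2 },     'constants are constants', '2 is not 2';
    check { die 'boom' }, 'never printed',           'condition threw an exception';
}
```

Written this way, the cost Moritz points out is visible directly: every call has to carry two messages, one for each outcome.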
Re: RFD: Built-in testing
- Original Message
> From: jerry gay
> since the :ok adverb is modifying the operator, perl knows what kind
> of comparison is being attempted, and can automatically give smart
> diagnostics. this point was taken into consideration when the
> adverbial test syntax was originally designed. some examples of perl 6
> tests using adverbial notation:
>
>     plan *;
>     3 === "3" :ok('int constant is equivalent to string constant integer');
>     3 !~~ "3" :ok('int constant smartmatch to string constant integer');
>     my $x = 284;
>     +$x == 284 :ok('$x is 284');
>     ?$x :ok('$x is True');
>
> there will no longer be ok() and is() functions, so although is() is
> still a floor wax and a dessert topping, it has nothing to do with
> testing. the comparisons are now explicit, so the intent of the test
> isn't hidden behind a friendly-looking but difficult to debug function
> like is().

Reading through that log more carefully now. Sorry I didn't do that earlier. One concern is where Larry asks:

    I wonder how often we'd have people making the error of trying to interpolate into :ok

I'd be one of them. The following is a very common idiom:

    for my $method (@methods) {
        can_ok $object, $method;
        lives_ok { $object->$method } "... and calling '$method' isn't fatal";
    }

Interpolation in the test description is very important on iterative tests or to distinguish similar tests (sometimes it would be nice to go so far as to ban identical test descriptions).

Cheers, Ovid
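For comparison, the same idiom can be written with the sub-based Test module that ships with current Raku, where descriptions are ordinary double-quoted strings and interpolate freely. The Counter class below is made up purely so that the example has something to test; can-ok, lives-ok and plan are the module's own functions, not the adverbial syntax discussed in this thread.

```raku
use Test;

# Counter is a hypothetical class, invented only for this example.
class Counter {
    has Int $.count = 0;
    method increment() { $!count++ }
    method reset()     { $!count = 0 }
}

my $object  = Counter.new;
my @methods = <increment reset>;

# Two tests per method: one can-ok and one lives-ok.
plan 2 * @methods.elems;

for @methods -> $method {
    can-ok   $object, $method;
    # ."$method"() calls the method whose name is held in $method.
    lives-ok { $object."$method"() }, "... and calling '$method' isn't fatal";
}
```

Because the description is built inside the loop, each iteration reports under its own name, which is the property Ovid asks for.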
Re: RFD: Built-in testing
On Wed, Jan 21, 2009 at 13:44, Ovid wrote:
> - Original Message
>
>> From: Moritz Lenz
>
>> * the word 'is' is overloaded in Perl 6
>> * if we export subs is() and ok(), we clutter the
>>   namespace with subs with short names
>> * is() is rather imprecise; it doesn't say *how*
>>   things are compared.
>
>> So Larry and Patrick developed the idea of creating an
>> adverb on the test operator instead:
>>
>>     $x == 1e5 :ok('the :ok makes this is a test');
>
> This may all be irrelevant, but I'm tossing it out here in case anyone thinks
> of how it might impact things.
>
> I'm not entirely certain how I feel about this yet, but I love the core concept
> of making testing a first class feature (well, duh ... of course I would say
> that :)
>
> I'd like for this to be thought through really carefully lest we create an
> interesting idea which is hampered by its implementation. Specifically, I'm
> concerned about diagnostics. What we'd ultimately love to have in TAP is
> some way of improving diagnostics (pseudo-TAP).
>
>     is 3, 3, 'constants are constants';
>     # ok 1 - constants are constants
>     # have: 3
>     # want: 3
>
> Now we have a curious situation:
>
>     multi sub foo(Str $bar);
>     multi sub foo(Int $bar);
>
> If we're testing what we should pass to &foo:
>
>     is 3, "3", 'constants are constants';
>     # ok 1 - constants are constants
>     # have: 3
>     # want: "3"
>
> Integration tests will still do OK, but unit tests may have issues and this
> could be an expectation violation. What does it mean that the string 3 eq
> the integer 3?
>
> Worse:
>
>     my $bar = 284;
>     ok $bar, '$bar should be true';
>     # ok 1 - $bar should be true
>     # have: 284
>     # want: True
>
> That can also look a bit strange, particularly if someone is coming from a
> different language background.
>
> How would this new system handle diagnostic information? One thing which
> might mitigate this is something we've wanted in newer versions of TAP:
>
>     my $bar = 284;
>     ok $bar, '$bar should be true';
>     # ok 1 - $bar should be true
>     # test: ok $bar, '$bar should be true';
>     # have: 284
>     # want: True
>
> By letting programmers see the exact line of code for the test, the type
> information *might* not be as important. I'm unsure.
>
> One possibility is to look at the &Test::More::cmp_ok function:
>
>     $ perl -MTest::Most=no_plan -e 'cmp_ok 3, "==","2"'
>     not ok 1
>     # Failed test at -e line 1.
>     # got: 3
>     # expected: 2
>     1..1
>
> If you change "2" to "3", the test still passes, but we could force it to not
> pass unless eq is passed in as the second argument. Then we could have the
> following diagnostics:
>
>     perl6 $ perl -MTest::Most=no_plan -e 'cmp_ok 3, "eq","3"'
>     not ok 1
>     # have: 3
>     # test: eq
>     # want: "3"
>     1..1
>
> And then it's crystal clear why it failed.

since the :ok adverb is modifying the operator, perl knows what kind of comparison is being attempted, and can automatically give smart diagnostics. this point was taken into consideration when the adverbial test syntax was originally designed. some examples of perl 6 tests using adverbial notation:

    plan *;
    3 === "3" :ok('int constant is equivalent to string constant integer');
    3 !~~ "3" :ok('int constant smartmatch to string constant integer');
    my $x = 284;
    +$x == 284 :ok('$x is 284');
    ?$x :ok('$x is True');

there will no longer be ok() and is() functions, so although is() is still a floor wax and a dessert topping, it has nothing to do with testing. the comparisons are now explicit, so the intent of the test isn't hidden behind a friendly-looking but difficult to debug function like is().

~jerry
Re: RFD: Built-in testing
- Original Message
> From: Moritz Lenz
> * the word 'is' is overloaded in Perl 6
> * if we export subs is() and ok(), we clutter the
>   namespace with subs with short names
> * is() is rather imprecise; it doesn't say *how*
>   things are compared.

> So Larry and Patrick developed the idea of creating an
> adverb on the test operator instead:
>
>     $x == 1e5 :ok('the :ok makes this is a test');

This may all be irrelevant, but I'm tossing it out here in case anyone thinks of how it might impact things.

I'm not entirely certain how I feel about this yet, but I love the core concept of making testing a first class feature (well, duh ... of course I would say that :)

I'd like for this to be thought through really carefully lest we create an interesting idea which is hampered by its implementation. Specifically, I'm concerned about diagnostics. What we'd ultimately love to have in TAP is some way of improving diagnostics (pseudo-TAP).

    is 3, 3, 'constants are constants';
    # ok 1 - constants are constants
    # have: 3
    # want: 3

Now we have a curious situation:

    multi sub foo(Str $bar);
    multi sub foo(Int $bar);

If we're testing what we should pass to &foo:

    is 3, "3", 'constants are constants';
    # ok 1 - constants are constants
    # have: 3
    # want: "3"

Integration tests will still do OK, but unit tests may have issues and this could be an expectation violation. What does it mean that the string 3 eq the integer 3?

Worse:

    my $bar = 284;
    ok $bar, '$bar should be true';
    # ok 1 - $bar should be true
    # have: 284
    # want: True

That can also look a bit strange, particularly if someone is coming from a different language background.

How would this new system handle diagnostic information? One thing which might mitigate this is something we've wanted in newer versions of TAP:

    my $bar = 284;
    ok $bar, '$bar should be true';
    # ok 1 - $bar should be true
    # test: ok $bar, '$bar should be true';
    # have: 284
    # want: True

By letting programmers see the exact line of code for the test, the type information *might* not be as important. I'm unsure.

One possibility is to look at the &Test::More::cmp_ok function:

    $ perl -MTest::Most=no_plan -e 'cmp_ok 3, "==","2"'
    not ok 1
    # Failed test at -e line 1.
    # got: 3
    # expected: 2
    1..1

If you change "2" to "3", the test still passes, but we could force it to not pass unless eq is passed in as the second argument. Then we could have the following diagnostics:

    perl6 $ perl -MTest::Most=no_plan -e 'cmp_ok 3, "eq","3"'
    not ok 1
    # have: 3
    # test: eq
    # want: "3"
    1..1

And then it's crystal clear why it failed.

Cheers, Ovid
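The have/test/want diagnostics described above are easy to prototype once the comparison operator is passed explicitly. The sketch below is written in present-day Raku and is an illustration only: check-op and the %ops table are hypothetical names, and .raku is the current spelling of the .perl method used elsewhere in this thread.

```raku
# Hypothetical sketch: check-op and %ops are illustrative names only.
# The table holds just the comparison operators this sketch accepts.
my %ops =
    '=='  => &infix:<==>,
    'eq'  => &infix:<eq>,
    '===' => &infix:<===>;

sub check-op($have, Str $op, $want, Str $desc) {
    die "unsupported operator: $op" unless %ops{$op}:exists;

    my $ok     = ?( %ops{$op}($have, $want) );
    my $status = $ok ?? 'ok' !! 'not ok';
    say "$status - $desc";

    # .raku keeps the type visible: Int 3 prints as 3, Str "3" prints as "3".
    unless $ok {
        say "# have: { $have.raku }";
        say "# test: $op";
        say "# want: { $want.raku }";
    }
    $ok
}

check-op 3, '==',  "3", 'numeric comparison coerces the string';
check-op 3, '===', "3", 'identity comparison also checks the type';
```

Since the operator is named in the diagnostic and the operands keep their quoting, a failure caused by a type difference can be told apart from one caused by a value difference.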
Re: RFD: Built-in testing
On Wed, 2009-01-21 at 14:23 +, Peter Scott wrote: > On Wed, 21 Jan 2009 13:35:50 +0100, Carl Mäsak wrote: > > I'm trying to explain to myself why I don't like this idea at all. I'm > > only partially successful. Other people seem to have no problem with it, > > so I might just be wrong, or part of a very small, ignorable minority. > > :) > > I find myself echoing you. I don't have the language design skills others > are displaying here. I can only evaluate this from an educator's point of > view and say that the P5 syntax of > > is $x, 42, 'Got The Answer'; > > is just about the conceivable pinnacle of elegance for at least that form > of question. (Compare, e.g., the logorrhoea of Java tests.) I do not see > how I could tell a student with a straight face that the P6 proposal is an > improvement, at which point the conversation would devolve into a > defensive argument I do not want to have. > > I get that 'is' is already taken and we do not want the grammar to engage > in Clintonesque parsing when it encounters the token. Okay. But how do I > justify the new syntax to a student? What are they getting that makes up > for what looks like a fall in readability? I don't quite understand the problem with using the same syntax as in Perl 5, just uppercasing the verbs so they won't conflict with everyday syntactic features: OK($bool, 'Widget claimed success'); IS($x, 42, 'Widget produced the right answer'); (This is ignoring issues of placement of parens or curlies to make the Perl 6 syntax attractive and consistent with other constructs -- I'm just talking about using verb rather than adverb syntax, with our already properly Huffmanized verb names intact.) I do like the idea of having TEST {} blocks that go inactive when not in testing mode (however that is defined). But other than that, I don't understand the value of the other syntactic changes suggested, the adverb syntax in particular. Maybe I'm missing something obvious -'f
Re: RFD: Built-in testing
On Wed, 21 Jan 2009 13:35:50 +0100, Carl Mäsak wrote: > Moritz (>): >> So Larry and Patrick developed the idea of creating an adverb on the >> test operator instead: >> >>$x == 1e5 :ok('the :ok makes this is a test'); > > I'm trying to explain to myself why I don't like this idea at all. I'm > only partially successful. Other people seem to have no problem with it, > so I might just be wrong, or part of a very small, ignorable minority. > :) I find myself echoing you. I don't have the language design skills others are displaying here. I can only evaluate this from an educator's point of view and say that the P5 syntax of is $x, 42, 'Got The Answer'; is just about the conceivable pinnacle of elegance for at least that form of question. (Compare, e.g., the logorrhoea of Java tests.) I do not see how I could tell a student with a straight face that the P6 proposal is an improvement, at which point the conversation would devolve into a defensive argument I do not want to have. I get that 'is' is already taken and we do not want the grammar to engage in Clintonesque parsing when it encounters the token. Okay. But how do I justify the new syntax to a student? What are they getting that makes up for what looks like a fall in readability? -- Peter Scott http://www.perlmedic.com/ http://www.perldebugged.com/
Re: RFD: Built-in testing
Moritz (>):
> So Larry and Patrick developed the idea of creating an
> adverb on the test operator instead:
>
>     $x == 1e5 :ok('the :ok makes this is a test');

I'm trying to explain to myself why I don't like this idea at all. I'm only partially successful. Other people seem to have no problem with it, so I might just be wrong, or part of a very small, ignorable minority. :)

Nevertheless, here is my main kvetch about the new syntax proposal:

* Adverbs traditionally modify the behaviour of some construct, giving it additional information or suggesting an alternative algorithm. Well-known examples are :by on ranges, the adverbs on regexes, or the :repl option on .pick(). All of these preserve the main objective of the construct, only modifying it somewhat.

* The proposed :ok syntax changes the semantics of the comparison (or whatever) from returning a value, to committing test-related actions, probably resulting in output of some kind. The original comparison is still syntactically prominent in the statement, but it's the testing bit, whose syntax is pushed to the irrelevant far right, that does the heavy lifting.

This can all be summarized in a feeling of mine that the suggested testing :ok syntax makes a travesty of adverbs. For the above reasons, I don't find it particularly elegant or intuitive. I do think that it's possible to use adverbs to make a better testing framework, but IMHO this is not the way.

// Carl
Re: RFD: Built-in testing
(Daniel Ruoso also proposed to call the adverb :test instead of :ok, making it easier to read but a bit longer; my happiness doesn't depend on the exact name, but of course we can discuss it once we have settled on this scheme, if we do so).

My two-cents worth:

The adverb on a boolean changes the nature of the statement, so that if the statement is true we get the diagnostic message in :OK<message>, but if the statement is false we get a failure message from the compiler / software.

Given that the diagnostic appears when the test succeeds, I - like Fagyal - would prefer :OK<> to :TEST<> because this is the way I use OK, that is I expect a positive answer.

However, the nature of a test is that a program consisting of test commands continues to run even if there is a failure. This is not a problem if the boolean statements are 'standalones', meaning that the consequent flow of the program is not dependent on the test, eg.,

    $x.value == 2 :OK;
    $x.color eq 'red' :OK;
    ...

But if this is part of perl6, then it will be possible (I think) to write

    if $x.value == 2 :OK {dosomething()} else {dosomeotherthing()};

What sort of behaviour would be expected? I see several alternatives:

a) Suppose it is decided that :OK could be a part of ordinary software, then a fork in the program would occur depending on the boolean value. Hence :OK in general generates a trace commentary that is explicitly defined for the TRUE case, but is implicitly defined by the compiler for the FALSE case.

b) However, suppose it is considered best for :OK only to operate in Test contexts. That would mean a boolean test with :OK should be illegal unless it is a standalone statement, eg., the test should not be in a control construct. In this case, I would think :TEST should be the variant chosen, the reason being that it focusses attention on the test behaviour.

c) Suppose, as Damian suggested (I think), that tests should be included in normal software, but that they are ring-fenced into a separate block with a TEST {}. That way, TEST blocks would not normally run in production software. In this case, the extra semantic hint of ok expecting a positive response would be useful. Hence :OK would be the preferable variation.

Richard
Re: RFD: Built-in testing
Hi,

I assume that BDD (Behavior Driven Development) and the vocabulary that it implies is not a good choice at this stage?

    :describe("");
    $x.should be(1e5) :it("");

and that a module based on the core testing facilities can be built if someone feels like it. Well, the vocabulary that it implies is really nice anyway, if it can be of any inspiration ^^

http://www.oreillynet.com/pub/a/ruby/2007/08/09/behavior-driven-development-using-ruby-part-1.html
Re: RFD: Built-in testing
Larry observed:
> My feeling on this is that the compiler should simply hardwire this
> particular adverb so that all the tests can be autogenerated, and the
> multi system never needs to see those versions.

I strongly agree.

> We are merely hijacking the adverb syntax so that it is clear which
> operator is being modified. There is no need for the late binding of
> multi. It's just a "reserved adverb" if you will. Which probably means
> it should be something unlikely to collide with user-defined adverbs.
> Maybe something in all caps. For what it's worth, :OK<> can be typed
> with one hand while the other holds down the shift key. :)

Typical right-hander fascism!

We do indeed want to encourage testing by making it easy to write tests, but naming it :TEST<> makes it far easier to *read* tests, which seems to me a better long-term optimization.

We would probably also want a mechanism for switching tests on or off in a given compilation unit, or globally, so they can be placed in (and left in!) production code. Perhaps we could use the same mechanism for PRE{...} and POST{...} blocks as well? Which also suggests that a general TEST {...} block (which only runs if testing is enabled) might be valuable?

Damian
Re: RFD: Built-in testing
Hi, I pretty much like this idea. Very perl6ish :) - I don't think it's important whether it is called :ok, :OK or :test or :wellhowdidthatworkout. I assume people who will be testing their modules/code/etc. will be using more advanced modules for testing anyway. This is for testing the implementation against the specs, and they *will* know how it works :) - I don't think we should be concerned whether implementing :ok is difficult. Implementations at an early stage are totally broken anyway :), they won't even *parse* the tests well - they will have their own, limited tests. Later they can choose to do some magic to make :ok work... and finally implement it. - I like "ok" better than "test", as the former kind of implies a boolean "was that true?" to me. YMMV, though. - Fagzal
Re: RFD: Built-in testing
On Tue, Jan 20, 2009 at 1:08 PM, Moritz Lenz wrote: > So Larry and Patrick developed the idea of creating an > adverb on the test operator instead: > > $x == 1e5 :ok('the :ok makes this is a test'); > > This is an adverb on the infix:<==> operator, and might > desugar to something like this: > > multi sub infix:<==>($left, $right, :$ok) { > $*TEST_BACKEND.proclaim($left == $right, $ok) > or $*TEST_BACKEND.diag( > "Got: «$left.perl()»; Expected: «$right.perl»"); > } > > (Daniel Ruoso also proposed to call the adverb :test > instead of :ok, making it easier to read but a bit > longer; my happiness doesn't depend on the exact name, > but of course we can discuss it once we have settled > on this scheme, if we do so). I like this idea and with it built into the language itself, there will be much less of an excuse to skip testing. I like the adverb form, which seems more perl6 than C. Naming it something like :test is a better idea than :ok as that seems a bit more direct. There isn't much in the spec concerning namespaces, other than the default * namespace. Is there any reason why the testing framework can't go in its own namespace? > * We nearly double the number of built-in operators > by adding an :ok multi > * We force implementors to handle operator adverbs > and named arguments very early in their progress > (don't know how easy or hard that is) > * Testing of operators becomes somewhat clumsy. If you > want to test infix:<==>, you won't write > '2 == 2 :ok("== works")', because you test a > different multi there. Instead you'd have to write > something like '?(2 == 2) :ok("== works")', where > :ok is an adverb on prefix:<?>. > These are mostly disadvantages to implementors, not users of the testing framework. I'd rather the implementations struggle to implement a built-in testing functionality than users of the language struggle to use the built-in testing. 
> I'll send another mail on the subject of pluggable > testing backends in order to allow different emitters > (TAP output, storage into databases, whatever) This is a requirement for me. Having only TAP emitters may not integrate well. It would be nice if the spec, if added, would allow flexibility in this realm. I would actually like to see a flexible system that allowed me to define a new emitter, say for the cases where you want to integrate perl6 testing into an existing testing framework (think automated builds and tests). -Jason "s1n" Switzer
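To make the pluggable-emitter idea concrete, here is a minimal sketch in Python rather than Perl 6, since the Perl 6 syntax is still being discussed. The method names `proclaim` and `diag` follow the desugaring quoted above; everything else (`TapEmitter`, `RecordingEmitter`, `check_eq`) is a hypothetical name made up for this sketch, and the backend is passed explicitly instead of living in a dynamic variable such as $*TEST_BACKEND.

```python
import sys


class TapEmitter:
    """Writes TAP-style lines ("ok 1 - description") to a stream."""

    def __init__(self, out=sys.stdout):
        self.out = out
        self.count = 0

    def proclaim(self, passed, description):
        self.count += 1
        status = "ok" if passed else "not ok"
        self.out.write(f"{status} {self.count} - {description}\n")
        return passed

    def diag(self, message):
        self.out.write(f"# {message}\n")


class RecordingEmitter:
    """Keeps results in memory, e.g. for storage or another framework."""

    def __init__(self):
        self.results = []
        self.diagnostics = []

    def proclaim(self, passed, description):
        self.results.append((description, passed))
        return passed

    def diag(self, message):
        self.diagnostics.append(message)


def check_eq(backend, left, right, description):
    """Mirror of the quoted desugaring: proclaim the result, diagnose on failure."""
    passed = backend.proclaim(left == right, description)
    if not passed:
        backend.diag(f"Got: {left!r}; Expected: {right!r}")
    return passed
```

The comparison code never needs to know which emitter it is talking to, so switching from TAP output to an existing build-and-test system would only mean supplying a different object with the same two methods.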
Re: RFD: Built-in testing
On Tue, Jan 20, 2009 at 08:08:57PM +0100, Moritz Lenz wrote: : * We nearly double the number of built-in operators : by adding an :ok multi My feeling on this is that the compiler should simply hardwire this particular adverb so that all the tests can be autogenerated, and the multi system never needs to see those versions. We are merely hijacking the adverb syntax so that it is clear which operator is being modified. There is no need for the late binding of multi. It's just a "reserved adverb" if you will. Which probably means it should be something unlikely to collide with user-defined adverbs. Maybe something in all caps. For what it's worth, :OK<> can be typed with one hand while the other holds down the shift key. :) Larry
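As a loose analogy for generating the test variants automatically instead of writing one extra multi per operator, here is a minimal sketch in Python rather than Perl 6. It is not how a Perl 6 compiler would do it; the `OPERATORS` table, `make_checked`, `generate_checked_operators`, and the `report` callback are all hypothetical names for illustration.

```python
import operator

# The operators to wrap; the symbols are used only to describe the expression.
OPERATORS = {
    "==": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    ">": operator.gt,
}


def make_checked(symbol, op, report):
    """Build the testing variant of one operator from a single generic template."""

    def checked(left, right, description):
        passed = bool(op(left, right))
        report(passed, description, f"{left!r} {symbol} {right!r}")
        return passed

    return checked


def generate_checked_operators(report):
    """Generate a testing variant for every operator in the table."""
    return {
        symbol: make_checked(symbol, op, report)
        for symbol, op in OPERATORS.items()
    }
```

Adding an operator to the table is enough to get its testing variant, which is the property being argued for: the test versions are derived mechanically and never have to be declared one by one.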