On 02/18/2011 07:30 AM, Eli Barzilay wrote:
50 minutes ago, Ryan Culpepper wrote:
On 02/15/2011 07:28 AM, Eli Barzilay wrote:
And finally, there's the litmus test for existing code:

* Ryan: is something like this enough to implement the GUI layer?

Not well, I think. The Test-Result type in Noel's racktest code is
too simple and inflexible. It represents the minimal essence of
testing, but it would be awkward to extend to richer testing
systems. Here's my counter-proposal for representing the results of
tests:
[...]

I can't make sense of it, besides a vague "waaaay too heavy" feeling
for something that should be core-ishly minimalistic.

Simplicity is no good if it gets in the way of representing information that needs to be represented.

In an attempt to follow it, I did this:

   TestResult = header
                execution
                status

but your TestHeader is used only there,

Not necessarily. A testing framework that distinguishes test construction from test execution might create the header when the test is constructed. SchemeUnit used to work that way, and RackUnit is able to, although less gracefully than before.

(See also my final remark, about "test started" notifications.)

so it could be folded in:

   TestResult = name      (U String #f)
                suite     (Listof String)
                info      Dictionary
                execution
                status

TestExecution is also used only once so it can also be folded in --
but since it's just a generic dictionary, it can be dropped.

I think it's a bad idea to collapse the two dictionaries, because they represent different information. Especially since the set of keys is open-ended, it is helpful to separate information about the test from information about its execution.
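
As a rough sketch of what I mean by keeping them separate (the struct names, field names, and the 'location and 'cpu-time-ms keys are made up for illustration; they are not taken from racktest or rackunit):

```racket
#lang racket/base

;; Hypothetical structs following the outline quoted above.
;; test-header: what is known about a test before it runs.
(struct test-header (name suite info) #:transparent)
;; test-result: header, plus a dictionary about the execution, plus a status.
(struct test-result (header execution status) #:transparent)

(define hdr
  (test-header "addition"                        ; name: string or #f
               '("arith" "basic")                ; suite: list of strings
               (hash 'location "arith.rkt:12"))) ; info about the test itself

(define res
  (test-result hdr
               (hash 'cpu-time-ms 3)             ; info about this execution
               'success))

;; The two dictionaries stay distinct, so a key describing the test
;; cannot collide with a key describing its execution.
(hash-ref (test-header-info (test-result-header res)) 'location)
(hash-ref (test-result-execution res) 'cpu-time-ms)
```

With the dictionaries collapsed, a consumer would have to know for every key whether it describes the test or a particular run of it.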

   TestResult = name      (U String #f)
                suite     (Listof String)
                info      Dictionary
                status

Now, status is one of three options, the failure one has a dictionary
so it can be removed (folded into the above).

I object again to the conflation of unrelated dictionaries.

So overall, it looks like a simple struct with a name, a "suite" (kind
of, defined indirectly by a hierarchy of string lists), a generic
dictionary for "stuff", and a status.  This is all modulo some
questions/issues that are unclear to me:

* It looks like it tries to break away too many pieces into a formal
   description.  For example, it looks like overkill to have fields
   with the actual value and expected value and worse -- the
   comparison.  For example, what happens when the comparison is
   parameterized (eg, "close within dx to some number")?

Not every testing framework might use 'actual and 'expected, and even frameworks that do might not use them all the time. For example, in rackunit:

  (check-equal? 'apple 'orange)

would result in the following failure attributes:

  'actual => 'apple
  'expected => 'orange
  'comparison => 'equal?

(I think rackunit currently reports the check name, 'check-equal?, rather than the comparison name. It could work either way, or maybe it could include both.)

On the other hand, something like check-not-false might only report an actual value. And a check parameterized over a tolerance could just include the tolerance as an extra attribute.
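
To make that concrete, here is a sketch of the three cases as plain hashes. The variable names and the 'tolerance key are only an illustration of the idea, not something rackunit defines:

```racket
#lang racket/base

;; Failure attributes as open-ended dictionaries (illustrative only).

;; (check-equal? 'apple 'orange)
(define equal-failure
  (hash 'actual 'apple
        'expected 'orange
        'comparison 'equal?))

;; Something like check-not-false: only an actual value is reported.
(define not-false-failure
  (hash 'actual #f))

;; A check parameterized over a tolerance adds one extra attribute.
;; The 'tolerance key is hypothetical.
(define within-failure
  (hash 'actual 1.5
        'expected 1.0
        'comparison '=
        'tolerance 0.1))
```

None of the three needs a different struct; they differ only in which keys are present.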

* What happens when there's no specific expected value to compare?
   For example, run two pieces of code 10 times each and check
   that the average runtime of the first is below the runtime of the
   second.  This could be phrased in terms of an expected value, but in
   a superficial way, and will prevent useful information from being
   expressed (since the information would have to be reduced to two
   numbers).

You can include whatever information you want. That's why it's a dictionary, rather than a fixed set of fields. The real question is how a test result displayer will know how to interpret the fields correctly. I think a useful default is to show all attributes with keys that are interned symbols or strings. Custom attributes would only work for test result displayers that know about them.
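
A minimal sketch of that default, assuming attributes are kept in a hash (displayable-key? and displayable-attributes are hypothetical names, not part of any existing library):

```racket
#lang racket/base

;; A key is shown by default if it is a string or an interned symbol.
(define (displayable-key? k)
  (or (string? k)
      (and (symbol? k) (symbol-interned? k))))

;; Return the attributes a generic displayer would show, as
;; (key . value) pairs sorted by the printed form of the key.
(define (displayable-attributes attrs)
  (sort (for/list ([(k v) (in-hash attrs)]
                   #:when (displayable-key? k))
          (cons k v))
        string<?
        #:key (lambda (p) (format "~a" (car p)))))

;; A custom attribute keyed by an uninterned symbol is skipped by the
;; default displayer; only a displayer holding that key can find it.
(define private-key (string->uninterned-symbol "timing-samples"))

(define attrs
  (hash 'actual 12.5
        "note" "averaged over 10 runs"
        private-key '(12.1 12.9)))

(displayable-attributes attrs)
```

A displayer that knows about the custom attribute can still look it up with (hash-ref attrs private-key).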

* What if you don't want to hold on to the value?  (For example, free
   some related resource.)

Then convert it to a string and keep the string. (This is what the rackunit gui does to report custodian-managed leftovers.)
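
Something along these lines, where attribute-snapshot is a hypothetical helper and not the actual rackunit gui code:

```racket
#lang racket/base

;; Keep the printed form of a value instead of the value itself, so the
;; value (and any resource it holds) can be released after the test.
(define (attribute-snapshot v)
  (format "~s" v))

(define failure
  (hash 'actual (attribute-snapshot (vector 1 2 3))))
```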

* This solidifies the list-of-strings as a representation of the test
   hierarchy.  But perhaps there is no way to avoid this -- if it's
   made into a proper hierarchy of objects it will probably complicate
   things in a way that requires the listener to get "update" events
   that tell it how the structure changed.

I was actually going to propose something more complicated for the hierarchy, but I figured it was better to leave that for later. I'm certainly open to changing this part.

* I'm not sure about the error result.  It seems to me that this is a
   meta issue that you're dealing with when you develop the test suite,
   and as such it should be something that you'd deal with in the usual
   ways =>  throw an exception.  It's the tools that should be in charge
   of catching such an exception and dealing with it -- which means that
   - in my tester's case, it'll defer to racket as usual, meaning that
     you'd just get an error.
   - in rackunit's case you'd probably get some report listing the
     erroneous tests, instead of propagating the error.
   - and in your gui case you'd catch exceptions and show them as error
     results.

Are you saying you think a status should only be success or failure? If so, I disagree. I can see roughly how that would work, but I think it's useful to distinguish between failure and error at the reporting level.

But if you mean that a testing framework should be allowed to halt on errors instead of reporting them as error statuses, then I agree.

Also:

And that's not quite the end of it. The rackunit gui creates an
entry for a test case as soon as it starts running, so the user can
see what test case is hanging and interrupt it if they choose. That
requires additional communication between test execution and test
display.

Yes, that would be part of the protocol for the listener -- and it
makes sense to allow tests to invoke it to let it know that a test has
started.

Like maybe sending it just the test-header struct? The part that represents the information known about the test before it executes, packaged up as one value?

Although, if we're going to standardize this part it would also be nice to have a way of indicating that a suite has started, too.
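
Roughly, I'm imagining something like the following. All names here (the listener struct, its three callbacks, run-test) are invented for the sake of the sketch; nothing like this is standardized yet:

```racket
#lang racket/base

;; What is known about a test before it executes, packaged as one value.
(struct test-header (name suite info) #:transparent)

;; A listener is three callbacks: suite started, test started, test done.
(struct listener (on-suite-start on-test-start on-test-result))

;; Run one test: notify the listener before execution, so a display can
;; show which test is in progress, then again with the final status.
(define (run-test l header thunk)
  ((listener-on-test-start l) header)
  (define status
    (with-handlers ([exn:fail? (lambda (e) (list 'error (exn-message e)))])
      (if (thunk) 'success 'failure)))
  ((listener-on-test-result l) header status)
  status)

;; A listener that just records the events it receives, newest first.
(define events '())
(define (record! e) (set! events (cons e events)))

(define logging-listener
  (listener
   (lambda (suite) (record! (list 'suite-started suite)))
   (lambda (h) (record! (list 'test-started (test-header-name h))))
   (lambda (h s) (record! (list 'test-finished (test-header-name h) s)))))

((listener-on-suite-start logging-listener) '("arith"))
(run-test logging-listener
          (test-header "addition" '("arith") (hash))
          (lambda () (= (+ 1 1) 2)))
```

A gui listener would use the test-started callback to create the entry for the running test, and a framework that prefers to halt on errors could simply leave out the exception handler.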

Ryan
_________________________________________________
 For list-related administrative tasks:
 http://lists.racket-lang.org/listinfo/dev