Re: [racket-dev] [racket] tests/eli-tester feedback (Was: Racket unit testing)

Ryan Culpepper Sun, 20 Feb 2011 16:08:13 -0800

On 02/18/2011 02:12 PM, Eli Barzilay wrote:

25 minutes ago, Ryan Culpepper wrote:

On 02/18/2011 07:30 AM, Eli Barzilay wrote:

50 minutes ago, Ryan Culpepper wrote:

On 02/15/2011 07:28 AM, Eli Barzilay wrote:

And finally, there's the litmus test for existing code:


* Ryan: is something like this enough to implement the GUI layer?


Not well, I think. The Test-Result type in Noel's racktest code is
too simple and inflexible. It represents the minimal essence of
testing, but it would be awkward to extend to richer testing
systems. Here's my counter-proposal for representing the results of
tests:
[...]


I can't make sense of it, besides a vague "waaaay too heavy" feeling
for something that should be core-ishly minimalistic.


Simplicity is no good if it gets in the way of representing information
that needs to be represented.


[But the flip side is that complexity is no good if you end up with
something that doesn't fit any system, where each one is filling in
fields that it doesn't "want to".]

The representation I outlined is based on the needs of the rackunit gui,
where test execution and display are currently tightly coupled. By that
I mean that plain rackunit's notion of test results is insufficient for
the gui; I had to create my own. I also generalized the idea of test
headers based on a long-standing feature request (the ability to
designate tests as expected to fail).

In an attempt to follow it, I did this:

    TestResult = header
                 execution
                 status

but your TestHeader is used only there,


Not necessarily. A testing framework that distinguishes test
construction from test creation might create the header when the
test is constructed. SchemeUnit used to work that way, and RackUnit
is able to, although less gracefully than before.


I don't follow this -- what's the difference between "construction"
and "creation"?


Sorry, I meant "distinguishes test construction from test execution".

(See also my final remark, about "test started" notifications.)


Yes, I know that this might imply some division for a sub-struct, I'm
focusing on just the kind of information that is required.

so it could be folded in:

    TestResult = name      (U String #f)
                 suite     (Listof String)
                 info      Dictionary
                 execution
                 status

TestExecution is also used only once so it can also be folded in --
but since it's just a generic dictionary, it can be dropped.


I think it's a bad idea to collapse the two dictionaries, because
they represent different information. Especially since the set of
keys is open-ended, it is helpful to separate information about the
test from information about its execution.
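One way to picture the separation argued for here is the sketch below. It is an illustration only, not code from the thread or from rackunit: the struct name, the field names, and the keys `expected-to-fail` and `time-ms` are all hypothetical.

```racket
#lang racket/base
;; Hypothetical sketch: none of these names come from rackunit.
;; `info` holds facts about the test itself; `execution` holds facts
;; about one particular run. Both are open-ended dictionaries.
(struct test-result (name suite info execution status) #:transparent)

(define r
  (test-result "adds small numbers"          ; name: string or #f
               '("arithmetic" "addition")    ; suite: enclosing suite names
               (hash 'expected-to-fail #f)   ; info (illustrative key)
               (hash 'time-ms 12)            ; execution (illustrative key)
               'success))

;; A consumer can ask for either kind of information without guessing
;; which keys describe the test and which describe the run.
(hash-ref (test-result-info r) 'expected-to-fail)
(hash-ref (test-result-execution r) 'time-ms)
```

With a single merged dictionary, a displayer would have to know each key in advance to tell the two kinds of information apart.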


(Same here -- I did the collapse to synthesize what it is that you're
actually requiring, so I treated all dictionaries as "other stuff",
which makes them trivially collapsible...)


I don't understand this response.

* What happens when there's no specific expected value to compare?
   For example, run two pieces of code 10 times each and check
   that the average runtime of the first is below the runtime of
   the second.  This could be phrased in terms of an expected
   value, but in a superficial way, and will prevent useful
   information from being expressed (since the information would
   have to be reduced to two numbers).


You can include whatever information you want. That's why it's a
dictionary, rather than a fixed set of fields. The real question is
how a test result displayer will know how to interpret the fields
correctly.  I think a useful default is to show all attributes with
keys that are interned symbols or strings. Custom attributes would
only work for test result displayers that know about them.
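The default described above (show every attribute whose key is a string or an interned symbol) might be sketched as follows. The function names and the sample attributes are assumptions made for illustration; only `symbol-interned?`, `string->uninterned-symbol`, and `in-dict` are existing Racket procedures.

```racket
#lang racket/base
(require racket/dict)

;; A key is shown by default if it is a string or an interned symbol.
(define (default-visible-key? k)
  (or (string? k)
      (and (symbol? k) (symbol-interned? k))))

;; Keep only the attributes a generic displayer would show.
(define (visible-attributes attrs)
  (for/list ([(k v) (in-dict attrs)]
             #:when (default-visible-key? k))
    (cons k v)))

;; An uninterned symbol acts as a private key: only code that holds
;; the symbol itself can look the attribute up.
(define private-key (string->uninterned-symbol "gui-state"))

(define attrs
  (list (cons 'expected 4)
        (cons "note" "rounded")
        (cons private-key 'internal)))

(visible-attributes attrs) ; => '((expected . 4) ("note" . "rounded"))
```

A displayer that knows about `private-key` can still read that attribute; every other displayer simply skips it.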


The question is if some attributes are known enough to get a special
treatment, and then the whole dictionary thing becomes a burden of
html-like specification rather than an "everything works" advantage.
What I'd like to see, is something along the lines of:

I think HTTP is a closer analogue than HTML. HTTP has a well-defined
request line followed by just a bunch of headers (essentially, a
dictionary mapping strings to strings). The HTTP spec specifies the
meaning of some headers; other RFCs (cookies, caches/proxies) specify
the meaning of others; and web browsers and servers are free to use
others to include information that the other party may or may not find
interesting.

   Either
     String x String dictionary of field-name and field contents
     or a single string for the result

This avoids such mess as specifying when I use a string for the
printed form of some value (as you suggested in "Then convert it to a
string and keep the string") vs when it's a proper value.  It also
avoids making semi-formal fields that become de facto requirements.

* This solidifies the list-of-strings as a representation of the
   test hierarchy.  But perhaps there is no way to avoid this -- if
   it's made into a proper hierarchy of objects it will probably
   complicate things in a way that requires the listener to get
   "update" events that tell it how the structure changed.


I was actually going to propose something more complicated for the
hierarchy, but I figured it was better to leave that for later. I'm
certainly open to changing this part.


The dynamic aspect makes it look fine as is, I think.  It just seems
redundant to start describing tests accurately to have sections that
have the same name but are really separate.

* I'm not sure about the error result.  It seems to me that this is a
    meta issue that you're dealing with when you develop the test suite,
    and as such it should be something that you'd deal with in the usual
    ways =>   throw an exception.  It's the tools that should be in charge
    of catching such an exception and dealing with it -- which means that
    - in my tester's case, it'll defer to racket as usual, meaning that
      you'd just get an error.
    - in rackunit's case you'd probably get some report listing the
      erroneous tests, instead of propagating the error.
    - and in your gui case you'd catch exceptions and show them as error
      results.


Are you saying you think a status should only be success or failure?
If so, I disagree. I can see roughly how that would work, but I
think it's useful to distinguish between failure and error at the
reporting level.


It is -- but the question is whether *that* kind of reporting belongs
in the core specification of these values or not.  Making it be there
seems wrong to me in the same way that exceptions are never really
used for anything other than throwing them.  (Except perhaps a few
weird cases that I'm sure will lead to flames, say add "almost"s or
whatever.)

I'm not convinced, but I could accept having only two variants, success
and failure, and considering errors a kind of failure.
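Treating errors as a kind of failure could be modeled with struct subtyping, as in this sketch. The struct names and the `run-check` helper are hypothetical and are not taken from rackunit or tests/eli-tester.

```racket
#lang racket/base
;; Hypothetical status hierarchy: two variants at the top level,
;; with errors represented as a more specific kind of failure.
(struct status () #:transparent)
(struct success status () #:transparent)
(struct failure status (message) #:transparent)
(struct failure:error failure (exn) #:transparent)

;; Run a check thunk. A false result is a plain failure; a raised
;; exception is caught and recorded as a failure:error.
(define (run-check thunk)
  (with-handlers ([exn:fail? (lambda (e)
                               (failure:error (exn-message e) e))])
    (if (thunk)
        (success)
        (failure "check returned #f"))))

(define ok      (run-check (lambda () (= 4 (+ 2 2)))))
(define failed  (run-check (lambda () (= 5 (+ 2 2)))))
(define errored (run-check (lambda () (car '()))))

(success? ok)            ; => #t
(failure? errored)       ; => #t, an error counts as a failure
(failure:error? failed)  ; => #f, a plain failure is not an error
```

A reporter that only cares about pass or fail can test with `failure?`, while one that wants to distinguish errors can test with `failure:error?`.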

And that's not quite the end of it. The rackunit gui creates an
entry for a test case as soon as it starts running, so the user
can see what test case is hanging and interrupt it if they
choose. That requires additional communication between test
execution and test display.


Yes, that would be part of the protocol for the listener -- and it
makes sense to allow tests to invoke it to let it know that a test
has started.


Like maybe sending it just the test-header struct? The part that
represents the information known about the test before it executes,
packaged up as one value?

Although, if we're going to standardize this part it would also be
nice to have a way of indicating that a suite has started, too.
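A listener protocol with "started" notifications might look roughly like the sketch below. Everything in it is an assumption for illustration: the `listener` struct, its three callbacks, and `run-test` are hypothetical, and the header is reduced to a plain string to keep the example short.

```racket
#lang racket/base
;; Hypothetical listener: a record of three callbacks.
(struct listener (on-suite-start on-test-start on-test-finish))

;; A listener that just records the events it receives, newest first.
(define events '())
(define (note! . xs) (set! events (cons xs events)))

(define recording-listener
  (listener (lambda (suite-name) (note! 'suite-start suite-name))
            (lambda (header) (note! 'test-start header))
            (lambda (header status) (note! 'test-finish header status))))

;; Notify the listener before the test body runs, so a display can
;; show the test (and let the user interrupt it) while it is running.
(define (run-test l header thunk)
  ((listener-on-test-start l) header)
  (define status (if (thunk) 'success 'failure))
  ((listener-on-test-finish l) header status)
  status)

((listener-on-suite-start recording-listener) "arithmetic")
(run-test recording-listener "addition" (lambda () (= 4 (+ 2 2))))
(reverse events)
```

The start notification carries only what is known before execution, which is the role the thread assigns to the test header.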


Yeah -- and that's something that I liked in Noel's list of strings,
it means that you treat test suites in the same way as tests, which
IMO means that it will lead to nice uniformities in other places (like
a gui interface).

I don't think the gui ever displays a test case's name in the same line
as its enclosing test suite. So no nice uniformities for me.

I'm also concerned about ambiguity. Would '("snark") indicate a test
named "snark" outside of any test suite or an anonymous test within a
test suite named "snark"? We could either disallow anonymous test cases,
or we could say a test case name is either a string or #f. But now we're
really abusing cons. And since I want test-headers to accommodate other
information too, it seems a lot cleaner to me to make it a struct and
keep name and suite separate.
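The ambiguity can be made concrete with a small sketch. The `test-header` struct and `header->path` below are hypothetical names chosen for illustration, not an existing rackunit interface.

```racket
#lang racket/base
;; Hypothetical header: `name` is a string or #f (anonymous test),
;; `suite` is the list of enclosing suite names, `info` is a dictionary.
(struct test-header (name suite info) #:transparent)

(define top-level-snark    (test-header "snark" '() (hash)))
(define anonymous-in-snark (test-header #f '("snark") (hash)))

;; Flatten a header into the single list-of-strings representation.
(define (header->path h)
  (append (test-header-suite h)
          (if (test-header-name h)
              (list (test-header-name h))
              '())))

;; Both headers flatten to '("snark"), so the list form cannot tell
;; them apart...
(equal? (header->path top-level-snark)
        (header->path anonymous-in-snark)) ; => #t

;; ...while the struct keeps them distinct.
(equal? top-level-snark anonymous-in-snark) ; => #f
```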


Ryan