recently, perl 6 development has taken the form of a multi-method dispatch: multiple implementations are under active development. these include pugs (in haskell), v6 (in perl5), v6-Compiler (in perl6), and perl6 (on parrot). hopefully, each of these returns the same result: a working[1] perl6 implementation.
it makes sense for these implementations to share perl6 tests written from the perl6 specification. it is also important that this happen, as it will help solidify the specification, allow implementations to check results against each other, let new and corrected tests be shared easily, and provide the canonical reference for answering questions about the spec. it is equally important that only perl6 specification tests be shared; implementation-specific tests belong solely with the implementation.

few have ever created a test suite of this magnitude before--i certainly haven't, and i don't expect anyone will do it alone. the overarching question i have is: how is a test suite targeted at multiple implementations designed, developed, managed, and controlled?

i believe we can quickly address the "developed" question with the word "organically." tests will be added, changed, or removed as needed. the organization of tests in the suite will change over time, as refactoring of any large body of tests or code is inevitable. this is the simple, practical solution we usually expect when working with perl.

as for "controlled," i expect restricted access to the test repository, as a protected body of tests becomes important to more people as soon as multiple implementations are using it. providing a method of submitting and applying patches will allow anyone to propose a new test for inclusion.

for "managed," i have a few ideas. currently, the suite lives in the pugs repo. this is a fine first approximation, but i believe it will soon be time to move this suite[3]. the question is: should the tests be moved into their own repository, or into the repo of the "official" perl6 implementation (if such a beast will indeed exist), or should they live in some other place i haven't thought of yet? the options here are limited, and i believe straightforward. it should be easy to come to an agreement and take action.

i've left the "designed" question for last, because i think it's the most difficult.
the perl6 test suite has already taken steps towards targeting multiple implementations. the suite has been designed with a set of "sanity" tests, which are required to pass in order to run the perl6 "Test" module, which is itself written in perl6. this allows the remainder of the suite's test files to be written in perl6, making their representation independent of any implementation. all the perl6 implementations i've mentioned above use this test suite, and each passes some subset of the tests[2]. this is a big win, and has allowed newer implementations to grow quickly, as the course has already been laid. but unsolved problems remain.

perl has a long history of testing, and there are widely held beliefs on how to design and develop test suites: simple rules, like keeping like tests together, using good test descriptions, testing components in isolation, performing integrated tests, and marking tests as todo() or skip() when they are not working as expected. this last rule becomes somewhat problematic when designing a test suite for use by multiple perl6 implementations. it is obvious that some implementations of perl6 will not pass all tests, and will need to mark certain tests as todo() or skip() in order to run the suite successfully. this is further complicated by the fact that perl6 is designed to run on multiple platforms. a test may therefore succeed for a particular subset of implementations on a particular subset of platforms per implementation, but fail elsewhere.

so, what should todo() information look like, and where should it go? it's a tough problem, because there's a lot of pain, and it has to be shared somehow. if it's decided that we are testing the specification, we should design implementation-independent tests. from this, it follows that implementation details (like todo()) should not exist in the tests themselves. instead, todo() info would be contained in a config file, or in a test harness, or managed by some other implementation-specific method.
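to make the harness idea concrete, here is a minimal sketch (in python, with hypothetical file names and a hypothetical helper -- not any implementation's actual harness). it assumes the implementation keeps its todo() entries in its own config, keyed by test file and test *description* rather than test number, so that adding or reordering tests in the shared suite does not invalidate the entries; the harness then rewrites the raw TAP output, attaching a TODO directive to known failures:

```python
# hypothetical implementation-side harness: todo() info lives outside the
# shared spec tests, in a per-implementation mapping.
import re

# (test file, test description) -> reason; entries and paths are invented
# here purely for illustration.
TODO = {
    ("t/operators/hyper.t", "hyper multiply on nested lists"): "not yet implemented",
}

def apply_todo(test_file, tap_lines):
    """Rewrite TAP lines, marking known-failing tests as TODO."""
    out = []
    for line in tap_lines:
        m = re.match(r"(not ok \d+) - (.*)", line)
        if m and (test_file, m.group(2)) in TODO:
            reason = TODO[(test_file, m.group(2))]
            line = f"{m.group(1)} - {m.group(2)} # TODO {reason}"
        out.append(line)
    return out

tap = ["ok 1 - simple multiply",
       "not ok 2 - hyper multiply on nested lists"]
print(apply_todo("t/operators/hyper.t", tap)[1])
# → not ok 2 - hyper multiply on nested lists # TODO not yet implemented
```

keying on descriptions instead of test numbers is the design choice doing the work here: a TAP consumer already treats `# TODO` failures as expected, so the shared test files never need to know which implementation (or platform) is running them.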
this is not ideal, since changes to the shared test suite may still invalidate the todo() information kept by a particular implementation as test numbers or descriptions change. keeping todo info directly with the test means a particular implementation has this information right beside the test (as is currently best practice), but this has its downsides as well: test files become crowded with todo() statements and conditional logic for implementation- and platform-dependent todo()s, and each time an implementation is updated, the shared test suite must be modified, requiring (imo) unnecessary test suite changes.

i should point out that implementation-specific tests must exist, but they must not be in a shared perl6 test suite. how todo() info is managed in those tests is left to the implementor, and is out of scope here.

i hope you can speak of your experience with test suite design, help address the concerns i've laid out above, and add any i may have missed.

~jerry

[1] for some definition of working
[2] currently, no perl6 implementation passes all tests. i expect this to change, but i never expect all perl6 implementations to pass all tests.
[3] it doesn't make sense to keep the "official" tests in a non-official repo in the long term.