Greetings Apertiumers!

TL;DR: I'm designing a regression testing framework and hope to use it
to improve testing across all of Apertium if no one objects.

I have recently been designing a regression testing framework for
Apertium language modules and translation pairs, so that every
repository has some measure of quality assurance and that measure is
consistent across the organization.

If Apertium participates in Google Summer of Code this year and my
project is accepted, I intend to implement this system and apply it to
all repositories, converting any existing tests in the process. If I
do not do this as part of GSoC, I still plan to work on it, barring
unresolvable objections, though it may take longer.

As this change affects everyone, I would like to get community input
on it as early as possible.

The current version of my proposal can be found at

In summary, the developer provides a file containing a list of inputs.
The testing framework passes these inputs through the translation or
analysis pipeline, recording the output of each step. These outputs
are compared to the outputs of previous runs and the ideal outputs (a
particular input need not have an ideal output and it may have

Any differences that are found are presented to the developer. If
these differences are an improvement over the previous output, they
will become the new standard of comparison.

In this way, the test corpus will serve the function of a set of unit
tests which catch regressions while at the same time automatically
updating to reflect improvements.
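
As a rough illustration of the workflow described above, here is a
minimal Python sketch. It is not the proposed implementation: the
names (run_pipeline, check, accept, Comparison) are hypothetical, the
pipeline steps are stand-in functions rather than real Apertium
programs, and for simplicity it assumes at most one ideal output per
input, compared against the final step only.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Comparison:
    """Result of checking one input against the stored outputs."""
    text: str
    outputs: dict                      # step name -> output of that step
    changed_steps: list = field(default_factory=list)
    matches_ideal: Optional[bool] = None  # None: no ideal output recorded


def run_pipeline(text, steps):
    """Pass text through each (name, function) step, recording every output."""
    outputs = {}
    for name, step in steps:
        text = step(text)
        outputs[name] = text
    return outputs


def check(inputs, steps, previous, ideal):
    """Compare fresh outputs with the previous run and the ideal outputs."""
    results = []
    for text in inputs:
        outputs = run_pipeline(text, steps)
        old = previous.get(text, {})
        # A step counts as changed only if an earlier output exists for it.
        changed = [name for name, _ in steps
                   if name in old and old[name] != outputs[name]]
        final = outputs[steps[-1][0]]
        matches = (final == ideal[text]) if text in ideal else None
        results.append(Comparison(text, outputs, changed, matches))
    return results


def accept(previous, comparison):
    """Record the new outputs as the standard of comparison for this input."""
    previous[comparison.text] = comparison.outputs


if __name__ == "__main__":
    # Stand-in steps; a real pipeline would call the pair's actual programs.
    steps = [("lower", str.lower), ("exclaim", lambda s: s + "!")]
    previous = {"Hi": {"lower": "hi", "exclaim": "hi?"}}
    ideal = {"Hi": "hi!"}
    for result in check(["Hi", "Yo"], steps, previous, ideal):
        print(result.text, result.changed_steps, result.matches_ideal)
```

A developer who judges a reported difference to be an improvement
would call accept() on it, after which later runs treat the new output
as the baseline.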

I believe that all existing tests can be converted to this format
without loss of information. Doing so would also be an opportunity to
complete the transition from 2-letter codes to 3-letter codes and
ensure that all repos have good READMEs.

In addition, this could be set up with continuous integration
(probably through GitHub Actions), which would let us automatically
check that commits and PRs do not cause regressions and generate
status badges for the READMEs showing things like WER and coverage.
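
For readers unfamiliar with WER (word error rate), the usual
definition is the word-level edit distance between a reference
translation and the system output, divided by the number of words in
the reference. The sketch below assumes whitespace tokenization and
uses a hypothetical function name; how the framework would actually
compute the figure is not settled here.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = cur[j - 1] + 1   # extra word in the output
            cur.append(min(substitution, deletion, insertion))
        prev = cur
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("the cat sat", "the dog sat"))
```

Note that WER can exceed 1.0 when the output is much longer than the
reference, so a badge would need to allow for that.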

In any event, I welcome feedback on this proposal, whether by email,
on the wiki talk page, or on IRC (nick: popcorndude).


Apertium-stuff mailing list