Hi Daniel,
On Wed, Jun 20, 2018 at 10:28:43AM -0400, Daniel Corbett wrote:
> +shell -expect "used:0" {
> + echo "show table http1" |socat ${tmpdir}/h1/stats.sock -
                             ^^^^^
This is the point where we will need to organize the reg tests better.
First, socat is rarely installed by default, so we'll have to mention
that it's required. Second, socat introduces a half-second delay before
quitting, making it impractical for the quick automated tests that we
expect developers to run frequently.
The dependency on socat makes me think we could probably put all such
tests in a specific sub-directory. However, I predict that we will also
create a number of other ones which will be slower than average and which
will be unrelated to the CLI.
Maybe we could simply introduce levels :
- level 1 (the default) would contain only the immediate tests that cover
the internal state machine and HTTP compliance (the things we break the
most often by side effects when fixing a bug in the same area). Basically
we should expect to be able to run 100 tests in a second there and there
should be zero excuse for not running them before committing a patch
affecting a sensitive area.
- level 2 would cover some extra parts requiring a bit more time (eg:
CLI commands, horrible stuff involving tcploop) and would probably
be needed only when trying to ensure that a fix doesn't break
something unexpected.
- level 3 would be the painful one that we already know nobody will dare
to run. They would cover timeouts, health checks, etc. All the stuff
that takes multiple seconds per test would be there. They may occasionally
be run by a dev during lunch time, or at night by automated bots.
Then we could issue "make reg-tests" to run level 1 by default or
"make reg-tests LEVEL=<level>" for the other ones. The idea is that I would
*really* like to encourage developers to run some basic tests before sending
patches, and we all know that none of us will agree to run them if they take
long enough to distract us (i.e. if you have time to switch to reading your
mail while the tests run, you won't run them because this will create a
distraction).
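
To make the level idea a bit more concrete, here is a minimal Python sketch
of how a runner could pick the tests for a given LEVEL. Everything in it is
an assumption made for illustration: the "#LEVEL <n>" tag inside each test
file, the declared_level() and select_tests() helpers, the choice that an
untagged test is level 1, and the choice that asking for a level also runs
the cheaper levels below it. None of this exists today.

```python
import re

# Hypothetical tag a test file would carry on a line of its own, e.g. "#LEVEL 2".
LEVEL_TAG = re.compile(r"^#LEVEL\s+([123])\s*$", re.MULTILINE)


def declared_level(text):
    """Return the level declared in a test file's text, defaulting to 1."""
    match = LEVEL_TAG.search(text)
    return int(match.group(1)) if match else 1


def select_tests(tests, level=1):
    """Return the sorted names of the tests to run for the requested level.

    `tests` maps a file name to that file's text. Assumption: asking for
    level N also runs every cheaper level below it.
    """
    return sorted(
        name for name, text in tests.items() if declared_level(text) <= level
    )
```

With such a helper, "make reg-tests" would end up calling select_tests(tests)
and only get the untagged, immediate tests, while LEVEL=3 would get all of
them.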
Also an important point since I'm seeing that the first tests focus on
known bugs (and I know it's often the point of regression testing) :
we've never reintroduced a bug that was already fixed, and it's likely
the same in most other projects. The reason is that it's human nature to
learn from our mistakes. So in practice, focusing on testing that
a very specific bug doesn't exist anymore is pointless because there is
zero chance that the exact same bug will re-appear, thus too precise a
test will just waste development time without bringing value.
HOWEVER, each such bug belongs to a *class* of bugs which is very
likely to be hit many times in various ways. This means two things :
1) each test must be written in a way that tries to cover the widest
spectrum of causes for a single class ;
2) when writing such a test, if other very closely related variants
are imagined, it's worth adding all these variants at once even
if they cover bugs that have not been met yet.
In your case, the test covers large parts (ie case 1 above), but only
focuses on "used:0". I think it's worth checking what else in the output
could possibly break. Given the number of entries tracked, there are very
likely a number of values set at once which deserve to be matched because
we know what to expect there. By making the check cover them as well, we
increase the likelihood that this test will detect a regression affecting
stick tables, thus the usefulness of the test, hence the likelihood that
a developer will want to run it.
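
As a rough illustration of matching several values at once, the Python
sketch below checks a set of expected fields against captured output and
reports every mismatch instead of stopping at the first one. The
parse_fields() and check_fields() helpers are hypothetical, and the sample
line is made up for the example: only "used:0" comes from your test, the
other field names and values are placeholders and not real "show table"
output.

```python
import re

# Matches "name:value" or "name=value" tokens such as "used:0".
FIELD = re.compile(r"(\w+)\s*[:=]\s*([^\s,]+)")


def parse_fields(output):
    """Collect every name/value pair found in the output into a dict.

    If a name appears several times, the last occurrence wins.
    """
    return {name: value for name, value in FIELD.findall(output)}


def check_fields(output, expected):
    """Return a list of human-readable mismatches (empty when all match)."""
    found = parse_fields(output)
    problems = []
    for name, want in expected.items():
        got = found.get(name)
        if got != str(want):
            problems.append(f"{name}: expected {want!r}, got {got!r}")
    return problems


# Made-up sample line, not actual CLI output.
sample = "table: http1, size:1024, used:0"
print(check_fields(sample, {"used": 0, "size": 1024}))  # prints []
```

Listing all the expected fields in one place also makes it cheap to add a
new one once the request is improved to fill more of them.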
Then maybe you'll find that the request made doesn't trigger certain fields
and that we should slightly improve it to increase the coverage. This will
result in a test that still detects the bug you wanted to cover, but as part
of a class of bugs and not just this specific one. I suspect that for this
it will make sense to implement the server part to provide a valid response
in order to get more fields filled.
Just my two cents,
Willy