Re: [PATCH v2 1/2] tests: Test aborts due to missing requirements should be marked as error instead of skipped

Glenn Washburn Wed, 19 Jul 2023 16:04:56 -0700

On Wed, 19 Jul 2023 17:37:55 +0200
Julian Andres Klode <julian.kl...@canonical.com> wrote:

> On Thu, Oct 14, 2021 at 03:58:07PM -0500, Glenn Washburn wrote:
> > Many tests abort due to not being root or missing tools, for instance mkfs
> > commands for file system tests. The tests are exited with code 77, which
> > means they were skipped. A skipped test is a test that should not be run,
> > eg. a test specific to ARM64 should not be run on an x86 build. These aborts
> > are actually a hard error, code 99. That means that the test could not be
> > completed, but not because what was supposed to be tested failed, eg. in
> > these cases where a missing tool prevents the running of a test.
> 
> This change is inappropriate.

Yes, I can imagine this change is inappropriate for the _existing_ code
that you (or your distro) has built around packaging GRUB. The previous
code was actually inappropriate for the GRUB tests. The needs of the
GRUB project and the needs of distros don't always align. I don't
believe that is the problem here though, where the issue is misplaced
blame.

Let me be clear here, the purpose of GRUB tests is not so distros can
run half of them after the build to see if it passes the smell test.
The purpose of the tests are primarily to test regressions and
functionality. To that end, the less tests that are run, the less
useful the test suite. So the testing system should be implemented to
try to run as many tests as possible and clearly note where tests are
not being run. Of course, you're free to run tests how ever you want,
but that does not mean that the GRUB project should support it,
especially when its in conflict with the interests of the project.

It occurs to me that perhaps the problematic implication of the previous
behavior was not evident, so let me try to clarify for you and anyone
else. A someone concerned with testing GRUB, I want to make sure that
"make check" does as much testing as is available and I want to know
with as much ease as possible when not all available testing is being
done. The previous approach did not cut it.

When I first started running the tests, before this change was
included, I would get a lot of skipped tests. I would then need to dig
into the individual test log to see why the test was skipped. Sometimes
the test was skipped because it was supposed to be skipped, maybe that
test did not apply to that architecture. I don't care about the skipped
tests that should be skipped, and its a waste of my time to dig into
those. Now tests that were skipped because of a misconfiguration of the
testing environment, those I do care about because the test _should_
run but is not. Now suppose I saw skipped tests due to an environment
issue and assumed they were supposed to be skipped, then I would be
unknowingly not running the full test suite, which defeats the purpose
of having those tests. The interpreting the test output should be clear
to someone not familiar with the tests. And false positive fails are
better than false negatives (not catching true failures).

This isn't a theoretical issue. There is someone on this list that has
several times graciously taken time to run the test suite and show the
test results to the list. And while I very much appreciate the
intention and work expended, in none of the cases was the output that
useful because half the tests were not run properly due to an improperly
testing environment. So in my mind it was mostly a waste of effort. By
making it clear via hard error that the test is a test that should be
run for the target, but is failing due to an environment issue, there
is more information at a glance about what's being tested or not. This
separates failed tests that are issues for GRUB, failed tests that are
the testers responsibility, and tests that are not run because they
should not be. You are advocating for going back to a time where those
last two classes are merged into one. In the interests of testing GRUB,
that is inappropriate.

> We can't just fail the build in distro
> packaging just because our builders don't run as root.

Then don't do that. The implication that you _need_ to revert to
improper test behavior is erroneous. Let assume that packaging GRUB
implies running the tests because really that is a good idea and that
your builders run unprivileged, also a good idea. The best option I see,
with the limited knowledge I have of your situation and to resolve your
issue without reverting the patch, is to check the test-suite.log file
for ERRORs. This is incredibly simple. Assuming you have a line in the
your debian rules file for "make check", you could changes this to
something like:

  make check || grep '^# ERROR: 0$' test-suite.log

This should get you the previous behavior. If you actually are serious
about running the tests, I've built a system which runs the root
required tests in UML. So I've got all the tests running as they should
on a build box where I have no access to root.

I'd also like to point out that even without this change, "make check"
failed, continues to fail, and will likely fail for the foreseeable
future. As I'm sure you're aware, the functional tests have been
failing for a long time, its a known issue, and no one has stepped up
to fix them (though a few have looked into it). So unless you have
patches fixing this (patches welcome!) or you're already patching GRUB
to not run this test, then "make check" is already failing for you.

> Similarly dependencies - it's our choice which dependencies we install,
> if we don't care about stuff, it should just skip it.

I agree, up to the last 5 words. If you want to control which tests are
run, it should not be done by abusing the automake testing system.
Non-existence of an executable or package should not be a signal for
disabling tests. Instead more appropriate way would be to specify
exactly which tests you want to run via the TESTS variable[1]. It comes
to mind that it might be nice to have an EXCLUDE_TESTS make variable
for specifying tests to exclude, to be more explicit that just having a
TESTS variable with some missing tests. Alternatively, propose a patch
to automake for a variable IGNORE_HARD_ERRORS, a complement to the
exiting DISABLE_HARD_ERRORS, which causes hard errors to affect the
return code of make check.

> Every distro will have to revert this change presumably so they can
> build grub.

Distros are free to modify upstream as they see fit, and do to the tune
of over a hundred patches at times. However, your presumption that there
is only one way to fix this is incorrect. As I noted above, this change
need not incur another patch to GRUB source, if the tooling is fixed.

Glenn

[1]
https://www.gnu.org/software/automake/manual/html_node/Scripts_002dbased-Testsuites.html

_______________________________________________
Grub-devel mailing list
Grub-devel@gnu.org
https://lists.gnu.org/mailman/listinfo/grub-devel

Re: [PATCH v2 1/2] tests: Test aborts due to missing requirements should be marked as error instead of skipped

Reply via email to