On Wed, 8 Sep 2021 18:03:50 +0200 Daniel Kiper <dki...@net-space.pl> wrote:
> On Wed, Sep 08, 2021 at 01:22:20AM +0000, Glenn Washburn wrote:
> > It looks like the xfs_test test succeeds with tag grub-2.06-rc1a,
> > fails with tag grub-2.06, and succeeds with current master. Yes, as
> > expected. However, what this tells me is that no "make check" tests
> > were done before finalizing the 2.06 release. I was under the
> > impression that that was part of the release procedure. If it's not,
> > it would seem that we're not using the tests at a time when they
> > would have the most impact.
>
> Currently I run build tests for all architectures and platforms and
> Coverity for x86_64 EFI before every push and release. I know this is
> not enough, but I tried "make check" at least once and got the
> impression that the tests are not reliable. Sadly I have no time to
> dive deeper and fix this and that. If someone, you?, wants to verify
> all tests and fix broken ones I will be more than happy to start
> using it (it seems to me your "[PATCH v2 0/8] Various
> fixes/improvements for tests" patch set fixes some issues but I am
> not sure if all of them).

I suspect the problem you're running into with the tests is more a matter of setting up the expected environment than a problem with the tests themselves. In my experience, most of the tests work once everything is set up correctly. My GitLab patch shows how to do this on Ubuntu 20.04. Because kernel modules cannot be loaded on the shared GitLab runners, I was not able to run all the tests either, and there are some tests that fail. The failures that come to mind are on SPARC and in some of the functional tests; I've notified the list, but no one came forward to fix them. So what I've done is keep a list of known failing tests and ignore those when determining whether the test suite as a whole has succeeded. I'm not convinced that behavior is something we should include in the test framework itself.

So, to my mind, I have verified (nearly) all tests on most platforms, and I have fixed the ones I could.
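To make the idea concrete, here is a minimal sketch of that post-processing step. The `filter_known_failures` helper is hypothetical (it is not part of the GRUB tree or my CI patch); only the test names zfs_test and grub_cmd_sleep come from this thread, and in practice the failed-test names would be parsed from the Automake *.trs logs rather than passed by hand:

```shell
# Hypothetical post-processing: separate expected failures from regressions.
# KNOWN_FAILURES holds tests we already know fail (names from this thread).
KNOWN_FAILURES="zfs_test grub_cmd_sleep"

filter_known_failures() {
    # Arguments: names of tests that FAILed. Prints only unexpected ones.
    unexpected=""
    for t in "$@"; do
        case " $KNOWN_FAILURES " in
            *" $t "*) ;;                       # expected failure, ignore it
            *) unexpected="$unexpected $t" ;;  # a real regression
        esac
    done
    # Unquoted expansion trims the leading space.
    echo $unexpected
}

# The suite as a whole "passes" when nothing unexpected remains, e.g.:
#   [ -z "$(filter_known_failures $failed_tests)" ]
```

The point of keeping this outside the test framework is that the known-failure list is specific to one CI environment, not to GRUB itself.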
I think the biggest low-hanging fruit (for me) is to get the tests running on a system with all the needed kernel fs drivers loaded. Perhaps, to your point, the filesystem tests that fail because a kernel module isn't loaded should be marked as skipped, since the test isn't really run in that case.

Here is an example of a successfully completed pipeline:

https://gitlab.com/gwashburn/grub/-/pipelines/258561619

And here is the raw log for the x86_64-efi build and test (search for "Testsuite summary" and look at the lines above it for individual test pass/fail results):

https://storage.googleapis.com/gitlab-gprd-artifacts/0e/0d/0e0d2ccc27a1e9a121b3d4ef831c56b5fd81cc8b565a3dbc5fa09b9a58edbcb7/2021_02_19/1042905700/1137549157/job.log?response-content-type=text%2Fplain%3B%20charset%3Dutf-8&response-content-disposition=inline&GoogleAccessId=gitlab-object-storage-...@gitlab-production.iam.gserviceaccount.com&Signature=ewpRWBqWWorOXak6hFkb5kUEfefU1biAtu6xY2Rtyds3%2BduM6q0kiFIKe4A5%0A8wS%2FPbd8Al3AwF45Q22KcpYyZ87UBkryQHjispAj%2B3tBQrnnyOYSRKGNV%2Bbz%0A4iaKAdYn4onIcJM9Ro4RkJ%2FCAY2tBxWmmLoEZ%2FQzWLvbhH%2BJSmuNqtEZXfuD%0AI1tXD1ZKgoLcYCCLNz8iaMFpgB32NfR2W6sDDuFb24ZFSy0X7H02KRVAMIor%0Akhj5QXdXGqoqFFXXMM9r8ZGv34Kh4GdDbVjMmJ%2FaDhvLPgmZF3UKnhcDYJoB%0AiMj%2BbimYNoiQW0SLg4hHA11PkKNn4KPAFqhLVscsCw%3D%3D&Expires=1631130462

This is all to show that the tests are (mostly) working. There is some post-processing on the pass/fail status of some tests to ignore expected failures (e.g. zfs_test). Occasionally there are spurious test failures, I think due to the VM getting starved at times (e.g. that is why grub_cmd_sleep fails and is ignored).

Now, as far as you personally using the tests: based on the above, what do we need to make that happen? I don't know the setup that you plan on running the tests on (is it Debian-based?). Do you want a script that sets up the environment for you? Perhaps a docker or lxd setup (probably better in the long run)?
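The "mark as skipped" idea could lean on Automake's convention that a test exiting with status 77 is reported as SKIP rather than FAIL. A rough sketch follows; the `maybe_skip` helper and its list-file parameter are hypothetical (GRUB's fs tests actually go through grub-fs-tester, which this does not reproduce), and the parameter exists only so the behavior can be demonstrated without loading modules:

```shell
# maybe_skip FS [LIST]: report SKIP and return Automake's skip code (77)
# when FS is absent from the kernel's supported-filesystem list.
# LIST defaults to /proc/filesystems on a real system.
maybe_skip() {
    fs=$1
    list=${2:-/proc/filesystems}
    if ! grep -qw "$fs" "$list"; then
        echo "SKIP: $fs not supported by the running kernel"
        return 77   # Automake's harness counts exit 77 as SKIP, not FAIL
    fi
    echo "RUN: $fs"
}
```

With that, an fs test whose kernel module is unavailable would show up as a skip in the "Testsuite summary" instead of polluting the failure count.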
A word about why I haven't verified the fs tests that require kernel modules which can't be loaded on the GitLab shared runners: I could run these on my own system. However, I've had some strange experiences in the past with the fs tests not cleaning up well and causing issues with the rest of the system, so I'm not keen on running all the fs tests on my daily driver. Perhaps in a container (a virtual machine would work, but I suspect it would be really slow, because many of the tests use virtual machines themselves, so there would be VMs inside a VM). I'd be curious to know what specific issues you had with the tests, if you can remember or reproduce them.

> > It is my understanding that we have travis-ci tests that get run (at
> > some point?), however they are only build tests and so would not
> > have caught this. It was precisely this scenario that I hoped to
> > avoid by doing more thorough continuous integration, which runs the
> > extensive array of "make check" tests, when I submitted the
> > Gitlab-CI patch series (which would've caught this automatically if
> > it had been merged). To me this scenario is the poster child for
> > taking action on this front. Can we make this a priority?
>
> I think I can take a closer look at the patch set mentioned above in a
> week or two. If "make check" works as expected and can be run locally
> then we can think about automating the testing.

I think the requirement to "run locally" should be more precisely defined. Does that mean all tests for all architectures? Running them reliably on a local machine might be difficult due to the variability of configurations. Perhaps a docker or lxd container would be the best route for that. Thoughts?

Glenn

_______________________________________________
Grub-devel mailing list
Grub-devel@gnu.org
https://lists.gnu.org/mailman/listinfo/grub-devel