On Wed, 8 Sep 2021 18:03:50 +0200 Daniel Kiper <dki...@net-space.pl> wrote:
> On Wed, Sep 08, 2021 at 01:22:20AM +0000, Glenn Washburn wrote:
> > It looks like the xfs_test test succeeds with tag grub-2.06-rc1a,
> > fails with tag grub-2.06, and succeeds with current master. Yes, as
> > expected. However, what this tells me is that no "make check" tests
> > were done before finalizing the 2.06 release. I was under the
> > impression that that was part of the release procedure. If it's not,
> > it would seem that we're not using the tests at a time when they
> > would have the most impact.
>
> Currently I run build tests for all architectures and platforms and
> Coverity for x86_64 EFI before every push and release. I know this is
> not enough, but I tried "make check" at least once and got the
> impression that the tests are not reliable. Sadly I have no time to
> dive deeper and fix this and that. If someone, you?, wants to verify
> all tests and fix broken ones I will be more than happy to start
> using it (it seems to me your "[PATCH v2 0/8] Various
> fixes/improvements for tests" patch set fixes some issues but I am
> not sure if all of them).

I suspect the problem you're running into with the tests is more a matter of setting up the expected environment than a problem with the tests themselves. In my experience, most of the tests work once everything is set up correctly. My GitLab patch shows how to do this on Ubuntu 20.04. Because kernel modules cannot be loaded on the shared GitLab runners, I was not able to run all the tests either, and there are some tests that fail. The failures that come to mind are on SPARC and in some of the functional tests; I've notified the list, but no one came forward to fix them. So what I've done is keep a list of known failing tests and ignore those when determining whether the test suite as a whole has succeeded. I'm not convinced that behavior is something we should include in the test framework itself.

So, to my mind, I have verified (nearly) all tests on most platforms, and I have fixed the ones I could.
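To make the idea concrete, here is a minimal sketch of that post-processing step. The `filter_known_failures` helper is hypothetical (it is not part of the GRUB tree or my CI patch); only the test names zfs_test and grub_cmd_sleep come from this thread, and in practice the failed-test names would be parsed from the Automake *.trs logs rather than passed by hand:

```shell
# Hypothetical post-processing: separate expected failures from regressions.
# KNOWN_FAILURES holds tests we already know fail (names from this thread).
KNOWN_FAILURES="zfs_test grub_cmd_sleep"

filter_known_failures() {
    # Arguments: names of tests that FAILed. Prints only unexpected ones.
    unexpected=""
    for t in "$@"; do
        case " $KNOWN_FAILURES " in
            *" $t "*) ;;                       # expected failure, ignore it
            *) unexpected="$unexpected $t" ;;  # a real regression
        esac
    done
    # Unquoted expansion trims the leading space.
    echo $unexpected
}

# The suite as a whole "passes" when nothing unexpected remains, e.g.:
#   [ -z "$(filter_known_failures $failed_tests)" ]
```

The point of keeping this outside the test framework is that the known-failure list is specific to one CI environment, not to GRUB itself.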
I think the biggest low-hanging fruit (for me) is to get the tests running on a system with all the needed kernel fs drivers loaded. Perhaps, to your point, the filesystem tests that fail because a kernel module isn't loaded should be marked as skipped, since the test isn't really run in that case.

Here is an example of a successfully completed pipeline:

https://gitlab.com/gwashburn/grub/-/pipelines/258561619

And here is the raw log for the x86_64-efi build and test (search for "Testsuite summary" and look at the lines above it for individual test pass/fail results):

https://storage.googleapis.com/gitlab-gprd-artifacts/0e/0d/0e0d2ccc27a1e9a121b3d4ef831c56b5fd81cc8b565a3dbc5fa09b9a58edbcb7/2021_02_19/1042905700/1137549157/job.log?response-content-type=text%2Fplain%3B%20charset%3Dutf-8&response-content-disposition=inline&GoogleAccessId=gitlab-object-storage-...@gitlab-production.iam.gserviceaccount.com&Signature=ewpRWBqWWorOXak6hFkb5kUEfefU1biAtu6xY2Rtyds3%2BduM6q0kiFIKe4A5%0A8wS%2FPbd8Al3AwF45Q22KcpYyZ87UBkryQHjispAj%2B3tBQrnnyOYSRKGNV%2Bbz%0A4iaKAdYn4onIcJM9Ro4RkJ%2FCAY2tBxWmmLoEZ%2FQzWLvbhH%2BJSmuNqtEZXfuD%0AI1tXD1ZKgoLcYCCLNz8iaMFpgB32NfR2W6sDDuFb24ZFSy0X7H02KRVAMIor%0Akhj5QXdXGqoqFFXXMM9r8ZGv34Kh4GdDbVjMmJ%2FaDhvLPgmZF3UKnhcDYJoB%0AiMj%2BbimYNoiQW0SLg4hHA11PkKNn4KPAFqhLVscsCw%3D%3D&Expires=1631130462

This is all to show that the tests are (mostly) working. There is some post-processing on the pass/fail status of some tests to ignore expected failures (e.g. zfs_test). Occasionally there are spurious test failures, I think due to the VM getting starved at times (e.g. that is why grub_cmd_sleep fails and is ignored).

Now, as far as you personally using the tests: based on the above, what do we need to make that happen? I don't know the setup that you plan on running the tests on (is it Debian-based?). Do you want a script that sets up the environment for you? Perhaps a docker or lxd setup (probably better in the long run)?
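The "mark as skipped" idea could lean on Automake's convention that a test exiting with status 77 is reported as SKIP rather than FAIL. A rough sketch follows; the `maybe_skip` helper and its list-file parameter are hypothetical (GRUB's fs tests actually go through grub-fs-tester, which this does not reproduce), and the parameter exists only so the behavior can be demonstrated without loading modules:

```shell
# maybe_skip FS [LIST]: report SKIP and return Automake's skip code (77)
# when FS is absent from the kernel's supported-filesystem list.
# LIST defaults to /proc/filesystems on a real system.
maybe_skip() {
    fs=$1
    list=${2:-/proc/filesystems}
    if ! grep -qw "$fs" "$list"; then
        echo "SKIP: $fs not supported by the running kernel"
        return 77   # Automake's harness counts exit 77 as SKIP, not FAIL
    fi
    echo "RUN: $fs"
}
```

With that, an fs test whose kernel module is unavailable would show up as a skip in the "Testsuite summary" instead of polluting the failure count.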
A word about why I haven't verified the fs tests that require kernel modules which can't be loaded on the GitLab shared runners: I could run these on my own system. However, I've had some strange experiences in the past with the fs tests not cleaning up well and causing issues with the rest of the system, so I'm not keen on running all the fs tests on my daily driver. Perhaps in a container (a virtual machine would work, but I suspect it would be really slow, because many of the tests use virtual machines themselves, so there would be VMs inside a VM). I'd be curious to know what specific issues you had with the tests, if you can remember or reproduce them.

> > It is my understanding that we have travis-ci tests that get run (at
> > some point?), however they are only build tests and so would not
> > have caught this. It was precisely this scenario that I hoped to
> > avoid by doing more thorough continuous integration, which runs the
> > extensive array of "make check" tests, when I submitted the
> > Gitlab-CI patch series (which would've caught this automatically if
> > it had been merged). To me this scenario is the poster child for
> > taking action on this front. Can we make this a priority?
>
> I think I can take a closer look at the patch set mentioned above in a
> week or two. If "make check" works as expected and can be run locally
> then we can think about automating the testing.

I think the requirement to "run locally" should be more precisely defined. Does that mean all tests for all architectures? Running them reliably on a local machine might be difficult due to the variability of configurations. Perhaps a docker or lxd container would be the best route for that. Thoughts?

Glenn

_______________________________________________
Grub-devel mailing list
Grub-devel@gnu.org
https://lists.gnu.org/mailman/listinfo/grub-devel