On Thu, 2019-10-24 at 20:50 +0000, Andrew Dean via gcc wrote: > TLDR: I'd like to propose adding a dependency on a modern unit > testing framework to make it easier to write unit tests within GCC. > Before I spend much more time on it, what sort of buy-in should I > get? Are there any people in particular I should work more closely > with as I make this change? > > Terminology: Within GCC, there are two types of tests in place: unit > tests and regression tests. The unit tests have been written with a > home-grown selftest framework and run as part of the build process. > Any failures to a unit test results in no compiler being produced. > The regression tests, on the other hand, run after build, and use the > separate DejaGnu framework. In this email, I am only concerning > myself with the unit tests, and throughout the remainder of the > email, any mention of tests refers to these. > > Working on GCC, I wanted to add some new unit tests to my feature as > I went, but I noticed that there is a good deal of friction involved. > Right now, adding new unit tests requires writing the test method, > then modifying a second place in the code to call said test method, > repeating as necessary until getting all the way to either the > selftest.c file or the target hook. There is also no way to do test > setup/teardown automatically. Everything is manual. > > I'd like to propose adding a dependency on a modern open-source unit > testing framework as an enhancement to the current self test system. > I have used Catch2 (https://github.com/catchorg/Catch2, Boost > Software License 1.0) with great success in the past. I experimented > with adding it to GCC and converting a handful of tests to use > Catch2. Although I only converted a small number of tests, I didn't > see any performance impact during selftest. As a bonus, while doing > so, I actually found that one test that I had written previously > wasn't actually being run, because I had failed to manually call it. > > Some nice things that Catch2 provides are better error reporting (see > below for a comparison), ease of adding new tests (just include the > header and write a TEST_CASE(), as opposed to the manual plumbing > required right now), extension points for adding custom comparisons > (I could see this being very useful to expand on the current rtl test > macros), and the ability to run a subset of the tests without > recompiling. It is also easy to integrate Catch2 with the existing > self-test framework. > > If this path seems useful to others, I'm happy to pursue it further. > A list of work items I see are: > > 1. Convert more tests to verify the claim that build performance is > not degraded > 2. Update the docs to list Catch2 as the new recommended way to write > unit tests > 3. If all of the target self-tests are converted, then we can remove > the target test hook. Similar for the lang test hook. > > One thing that would make Catch2 an even more slam-dunk case was if > we were able to enable exceptions for the check builds. Then, running > the unit tests could report multiple failures at the same time > instead of just aborting at the first one. That said, even without > enabling exceptions, Catch2 is on par with the current selftest in > terms of terminating at the first failure. > > Another option is to use a test framework that doesn't use > exceptions, such as Google Test (https://github.com/google/googletest > , BSD 3-Clause "New" or "Revised" License). I personally think Catch2 > is more flexible, or I would lead with Google Test. For example, in > Catch2, shared setup is done in place with the tests itself, having > each subtest be a nested SECTION, where-as in GTest, you have to > write a test class that derives from ::test and overrides SetUp(). In > addition, the sections in Catch2 can be nested further, allowing > several related tests to build on each other. > > Here is some sample output for the case where all the tests are > passing: > ===================================================================== > ========== > All tests passed (25 assertions in 5 test cases) > > And here is the output when a test fails: > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > ~~~~~~~~~~ > is a Catch v2.9.2 host application. > Run with -? for options > > ------------------------------------------------------------------- > ------------ > test_set_range > ------------------------------------------------------------------- > ------------ > ../../gcc/bitmap.c:2661 > ..................................................................... > .......... > ../../gcc/bitmap.c:2668: FAILED: > REQUIRE( 6 == bitmap_count_bits (b) ) > with expansion: > 6 == 5 > > Catch will terminate because it needed to throw an exception. > The message was: Test failure requires aborting test! > terminate called without an active exception > ../../gcc/bitmap.c:2668: FAILED: > {Unknown expression after the reported line} > due to a fatal error condition: > SIGABRT - Abort (abnormal termination) signal > ===================================================================== > ========== > test cases: 2 | 1 passed | 1 failed > assertions: 5 | 3 passed | 2 failed > cc1: internal compiler error: Aborted > <long callstack> > Please submit a full bug report, > with preprocessed source if appropriate. > Please include the complete backtrace with any bug report. > See <https://gcc.gnu.org/bugs/> for instructions. > > (Note that at the moment it doesn't know the name of our application > or it would have prefixed "is a Catch..." with our app name). > > Compare that to the output of the current test framework: > ../../gcc/bitmap.c:2669: test_set_range: FAIL: ASSERT_EQ ((6), > (bitmap_count_bits (b))) > cc1: internal compiler error: in fail, at selftest.c:47 > /bin/bash ../../gcc/../move-if-change tmp-macro_list macro_list > echo timestamp > s-macro_list > <long callstack> > Please submit a full bug report, > with preprocessed source if appropriate. > Please include the complete backtrace with any bug report. > See <https://gcc.gnu.org/bugs/> for instructions.
Thanks for your email, it looks interesting. Is your code somewhere we can see it? I'm the author of gcc's selftest framework (and I use it heavily e.g. for testing the diagnostics subsystem [1]). It went through substantial changes during review. I looked over my notes; for reference, here's a summary of the history of the patches: v1: https://gcc.gnu.org/ml/gcc-patches/2015-06/msg00765.html "[PATCH 00/17] RFC: Addding a unit testing framework to gcc" Used Google Test framework, was a dummy "frontend" v2: https://gcc.gnu.org/ml/gcc-patches/2015-06/msg01224.html "[PATCH/RFC]: unittesting v2: as a plugin (was Re: [PATCH 00/17] RFC: Addding a unit testing framework to gcc)" Still Google Test, as a plugin rather than a frontend v3: https://gcc.gnu.org/ml/gcc-patches/2015-10/msg02947.html "[PATCH 00/16] Unit tests framework (v3)" Still Google Test, as a plugin built by plugin.exp within DejaGnu tests, with a custom gtest reporter Some discussion about gtest: https://gcc.gnu.org/ml/gcc-patches/2015-10/msg03215.html v4: https://gcc.gnu.org/ml/gcc-patches/2015-11/msg02379.html "[PATCH 00/15] Unittests framework v4: -fself-test" Done via "-fself-test" via compiling a dummy file in DejaGnu tests I believe it was at this point that I switched to a custom API that resembles gtest, rather than gtest itself. v5: https://gcc.gnu.org/ml/gcc-patches/2016-06/msg00082.html "[PATCH 00/21] Add -fself-test framework for fast, early unit-testing (unittests v5)" Done via "-fself-test" at each of the 3 stages of bootstrap. v6: https://gcc.gnu.org/ml/gcc-patches/2016-06/msg00210.html "[PATCH 00/16] v6 of -fself-test/unit-testing patch" Switched to "abort on first failure" Eliminated runner class, and from self-registrating tests to manual test invocation v7: https://gcc.gnu.org/ml/gcc-patches/2016-06/msg00298.html "[PATCH] Selftest framework (v7)" (one combined patch) v8: approved; committed v8 to trunk as r237144 (for gcc 7): https://gcc.gnu.org/ml/gcc-patches/2016-06/msg00410.html Notable followups: 2016-07-11: * r238209: "Support running the selftests under valgrind" * https://gcc.gnu.org/ml/gcc-patches/2016-07/msg00437.html 2017-12-11: * r255563: "Expensive selftests: torture testing for fix-it boundary conditions (PR c/82050)" * https://gcc.gnu.org/ml/gcc-patches/2017-11/msg02459.html (some tests run as a DejaGnu-built plugin) 2018-04-30: * r259782: "selftest: remove "Yoda ordering" in assertions" * https://gcc.gnu.org/ml/gcc-patches/2018-04/msg01323.html 2018-10-17: * r265240: "Run selftests for C++ as well as C" * https://gcc.gnu.org/ml/gcc-patches/2018-10/msg00802.html I think the consensus during review was that I was over-engineering things, and that each iteration from v1 to v8 made the code simpler and involved less C++ "magic", closer to C. Whether that's still the consensus I don't know. Different people within the GCC dev community have different comfort levels with C++, and my initial version (using gtest) was probably too "magic" for some. Maybe people here are more comfortable with C++ now? GCC has some rather unique requirements, in that we support a great many build configurations, some of which are rather primitive - for example, requiring just C++98 with exceptions disabled, in that we want to be able to be bootstrappable on relatively "ancient" configurations. IIRC auto-registration of tests requires that the build configuration have a sufficiently sane implementation of C++ - having globals with non-trivial ctors tends to be problematic when dealing with early implementations of C++. Personally I don't find the manual registration of tests to be a pain, but it would be nice to have more readable errors on failures. There's probably a case for more verbose test output. (generally I immediately just do "make selftest-gdb" on failures; the issue is if it suddenly starts failing on a build I don't have access to) I suspect that exceptions would be a deal-breaker; does Catch2 support -fno-exceptions? As for setup/teardown, I've been able to do that "manually" using RAII- style classes in test functions. Thanks again for your email; hope this is constructive. Dave [1] see e.g. the selftests in gcc/input.c and gcc/diagnostic-show- locus.c