On Wed, May 4, 2016 at 10:49 PM, David Malcolm <dmalc...@redhat.com> wrote: > This patch kit introduces an RTL frontend, for the purpose > of unit testing: primarly for unit testing of RTL passes, and > possibly for unit testing of .md files. > > It's very much a work-in-progress; I'm posting it now to get feedback. > I've successfully bootstrapped®rtested patches 1-3 of the kit on > x86_64-pc-linux-gnu, but patch 4 (which is the heart of the > implementation) doesn't survive bootstrap yet (dependency issues > in the Makefile). > > The rest of this post is from gcc/rtl/notes.rst from patch 4; I'm > adding a duplicate copy up-front here to make it easier to get an > overview.
Thanks for starting this project. > RTL frontend > ============ > > Purpose > ******* > > Historically GCC testing has been done by providing source files > to be built with various command-line options (via DejaGnu > directives), dumping state at pertinent places, and verifying > properties of the state via these dumps. > > A strength of this approach is that we have excellent integration > testing, as every test case exercises the toolchain as a whole, but > it has the drawback that when testing a specific pass, > we have little control of the input to that specific pass. We > provide input, and the various passes transform the state > of the internal representation:: > > INPUT -> PASS-1 -> STATE-1 -> PASS-2 -> STATE-2 -> ... > -> etc -> > -> ... -> PASS-n-1 -> STATE-n-1 -> PASS-n -> STATE-n > ^ ^ ^ > | | Output from the pass > | The pass we care about > The actual input to the pass > > so the intervening passes before "PASS-n" could make changes to the > IR that affect the input seen by our pass ("STATE-n-1" above). This > can break our test cases, sometimes in a form that's visible, > sometimes invisibly (e.g. where a test case silently stops providing > coverage). > > The aim of the RTL frontend is to provide a convenient way to test > individual passes in the backend, by loading dumps of specific RTL > state (possibly edited by hand), and then running just one specific > pass on them, so that we effectively have this:: > > INPUT -> PASS-n -> OUTPUT > > thus fixing the problem above. > > My hope is that this makes it easy to write more fine-grained and > robust test coverage for the RTL phase of GCC. However I see this > as *complementary* to the existing "integrated testing" approach: > patches should include both RTL frontend tests *and* integrated tests, > to avoid regressing the great integration testing we currently have. > > The idea is to use the existing dump format as a input format, since > presumably existing GCC developers are very familiar with the dump > format. > > One other potential benefit of this approach is to allow unit-testing > of machine descriptions - we could provide specific RTL fragments, > and have the rtl.dg testsuite directly verify that we recognize all > instructions and addressing modes that a given target ought to support. > > Structure > ********* > > The RTL frontend is similar to a regular frontend: a gcc/rtl > subdirectory within the source tree contains frontend-specific hooks. > These provide a new "rtl" frontend, which can be optionally > enabled at configuration time within --enable-languages. > > If enabled, it builds an rtl1 binary, which is invoked by the > gcc driver on files with a .rtl extension. > > The testsuite is below gcc/testsuite/rtl.dg. There's also > a "roundtrip" subdirectory below this, in which every .rtl > file is loaded and then dumped; roundtrip.exp verifies that > the dump is identical to the original file, thus ensuring that > the RTL loaders faithfully rebuild the input dump. > > Limitations > *********** > > * It's a work-in-progress. There will be bugs. > > * The existing RTL code is structured around a single function being > optimized, so, as a simplification, the RTL frontend can only handle > one function per input file. Also, the dump format currently uses > comments to separate functions:: > > ;; Function test_1 (test_1, funcdef_no=0, decl_uid=1758, cgraph_uid=0, > symbol_order=0) > > ... various pass-specific things, sometimes expressed as comments, > sometimes not > > ;; > ;; Full RTL generated for this function: > ;; > (note 1 0 6 NOTE_INSN_DELETED) > ;; etc, insns for function "test_1" go here > (insn 27 26 0 6 (use (reg/i:SI 0 ax)) > ../../src/gcc/testsuite/rtl.dg/test.c:7 -1 > (nil)) > > ;; Function test_2 (test_2, funcdef_no=1, decl_uid=1765, cgraph_uid=1, > symbol_order=1) > ... various pass-specific things, sometimes expressed as comments, > sometimes not > ;; > ;; Full RTL generated for this function: > ;; > (note 1 0 5 NOTE_INSN_DELETED) > ;; etc, insns for function "test_2" go here > (insn 59 58 0 8 (use (reg/i:SF 21 xmm0)) > ../../src/gcc/testsuite/rtl.dg/test.c:31 -1 > (nil)) > > so that there's no clear separation of the instructions between the > two functions (and no metadata e.g. function names). > > This could be fixed by adding a new clause to the dump e.g.:: > > (function "test_1" [ > (note 1 0 6 NOTE_INSN_DELETED) > ;; etc, insns for function "test_1" go here > (insn 27 26 0 6 (use (reg/i:SI 0 ax)) > ../../src/gcc/testsuite/rtl.dg/test.c:7 -1 > (nil)) > ]) > > (function "test_2" [ > (note 1 0 5 NOTE_INSN_DELETED) > ;; etc, insns for function "test_2" go here > (insn 59 58 0 8 (use (reg/i:SF 21 xmm0)) > ../../src/gcc/testsuite/rtl.dg/test.c:31 -1 > (nil)) > ]) > > or somesuch (this wouldn't be an rtx code, just something in > rtl-frontend.c). > The RTL frontend could then compile each function in turn after parsing each > one (probably the easiest way to deal with the global state in the RTL parts > of the compiler). I think that the dumps should be massaged with a -gimplefe or -rtlfe flag, parsing the raw dumps with the frontend isn't something I'd require doing. > * The RTL frontend doesn't have any knowledge of the name of the function, > of parameters, types, locals, globals, etc. It creates a single function. > The function is currently hardcoded to have this signature: > > int test_1 (int, int, int); > > since there's no syntax for specify otherwise, and we need to provide > a FUNCTION_DECL tree when building a function object (by calling > allocate_struct_function). > > * Similarly, there are no types beyond the built-in ones; all expressions > are treated as being of type int. I suspect that this approach > will be too simplistic when it comes to e.g. aliasing. To address this and the previous issue I suggest to implement the RTL FE similar to how I proposed the GIMPLE FE - piggy-back on the C FE and thus allow int __RTL foo (int a, int b) // gets you function decl and param decls { (insn ...) ... } int main() { if (foo (1) != 0) abort (); } That would also allow dg-do run testcases and not rely solely on scanning RTL dumps. > * There's no support for running more than one pass; fixing this would > require being able to run passes from a certain point onwards. I expect the GIMPLE FE work to provide the ability to do this. > * Roundtripping of recognized instructions may be an issue (i.e. those > with INSN_CODE != -1), such as the "667 {jump}" in the following:: > > (jump_insn 50 49 51 10 > (set (pc) > (label_ref:DI 59)) ../../src/test-switch.c:18 667 {jump} > (nil) -> 59) > > since the integer ID can change when the .md files are changed > (and the associated pattern name is very much target-specific). > It may be best to reset them to -1 in the input files (and delete the > operation name), giving:: > > (jump_insn 50 49 51 10 > (set (pc) > (label_ref:DI 59)) ../../src/test-switch.c:18 -1 > (nil) -> 59) > > * Currently there's no explicit CFG edge information in the dumps. > The rtl1 frontend reconstructs the edges based on jump instructions. > As I understand the distinction between cfgrtl and cfglayout modes > https://gcc.gnu.org/wiki/cfglayout_mode , this is OK for "cfgrtl" mode, > but isn't going to work for "cfglayout" mode - in the latter, > unconditional jumps are represented purely by edges in the CFG, and this > information isn't currently present in the dumps (perhaps we could add it > if it's an issue). As suggested above I wouldn't start from the current RTL dumps we have but instead emit a variant with -rtlfe dumps that contains this kind of information that is missing. Thanks, Richard. > Open Questions > ************** > > * Register numbering: consider this fragment of RTL emitted during > expansion:: > > (reg/f:DI 82 virtual-stack-vars) > > At the time of emission, register 82 is the VIRTUAL_STACK_VARS_REGNUM, > and this value is effectively hardcoded into the dump. Presumably this > is baking in assumptions about the target into the test. Also, how likely > is > this value to change? When we reload the dump, should we notice that this > is tagged with virtual-stack-vars and override the specific register > number to use the current value of VIRTUAL_STACK_VARS_REGNUM on the > target rtl1 was built for? > > TODO items > ********** > > * test with other architectures > > * roundtrip.exp: strip out comments in source when comparing roundtrip > > * example with "-g" > > * implement a fuzzer (or use AFL on the existing test cases) > > Thoughts? > > Hope this looks useful > > Dave > > David Malcolm (4): > Make argv const char ** in read_md_files etc > Move name_to_pass_map into class pass_manager > Extract deferred-location handling from jit > Initial version of RTL frontend > > gcc/Makefile.in | 1 + > gcc/cfgexpand.c | 7 +- > gcc/deferred-locations.c | 240 ++++ > gcc/deferred-locations.h | 139 +++ > gcc/emit-rtl.c | 15 +- > gcc/emit-rtl.h | 2 + > gcc/errors.c | 4 +- > gcc/errors.h | 13 + > gcc/function.c | 41 +- > gcc/function.h | 2 +- > gcc/gcc.c | 1 + > gcc/genattr-common.c | 2 +- > gcc/genattr.c | 2 +- > gcc/genattrtab.c | 2 +- > gcc/genautomata.c | 4 +- > gcc/gencodes.c | 2 +- > gcc/genconditions.c | 2 +- > gcc/genconfig.c | 2 +- > gcc/genconstants.c | 5 +- > gcc/genemit.c | 2 +- > gcc/genenums.c | 5 +- > gcc/genextract.c | 2 +- > gcc/genflags.c | 2 +- > gcc/genmddeps.c | 5 +- > gcc/genopinit.c | 2 +- > gcc/genoutput.c | 4 +- > gcc/genpeep.c | 4 +- > gcc/genpreds.c | 11 +- > gcc/genrecog.c | 2 +- > gcc/gensupport.c | 33 +- > gcc/gensupport.h | 5 +- > gcc/gentarget-def.c | 2 +- > gcc/jit/jit-common.h | 5 +- > gcc/jit/jit-playback.c | 194 +--- > gcc/jit/jit-playback.h | 73 +- > gcc/pass_manager.h | 6 + > gcc/passes.c | 34 +- > gcc/print-rtl.c | 4 +- > gcc/read-md.c | 338 ++++-- > gcc/read-md.h | 158 ++- > gcc/read-rtl.c | 670 ++++++++++- > gcc/rtl.c | 2 + > gcc/rtl.h | 4 + > gcc/rtl/Make-lang.in | 148 +++ > gcc/rtl/config-lang.in | 36 + > gcc/rtl/lang-specs.h | 25 + > gcc/rtl/lang.opt | 38 + > gcc/rtl/notes.rst | 199 ++++ > gcc/rtl/rtl-errors.c | 35 + > gcc/rtl/rtl-frontend.c | 1219 > ++++++++++++++++++++ > gcc/testsuite/lib/rtl-dg.exp | 64 + > gcc/testsuite/rtl.dg/dfinit.rtl | 90 ++ > gcc/testsuite/rtl.dg/final.rtl | 51 + > gcc/testsuite/rtl.dg/good-include.rtl | 6 + > gcc/testsuite/rtl.dg/good-includee.md | 1 + > gcc/testsuite/rtl.dg/into-cfglayout.rtl | 85 ++ > gcc/testsuite/rtl.dg/ira.rtl | 84 ++ > gcc/testsuite/rtl.dg/missing-include.rtl | 1 + > gcc/testsuite/rtl.dg/pro_and_epilogue.rtl | 38 + > gcc/testsuite/rtl.dg/roundtrip/code-labels.rtl | 2 + > gcc/testsuite/rtl.dg/roundtrip/frame-pointer.rtl | 4 + > gcc/testsuite/rtl.dg/roundtrip/insn-with-mode.rtl | 4 + > gcc/testsuite/rtl.dg/roundtrip/jump-to-label.rtl | 6 + > gcc/testsuite/rtl.dg/roundtrip/jump-to-return.rtl | 6 + > .../rtl.dg/roundtrip/jump-to-simple-return.rtl | 6 + > .../rtl.dg/roundtrip/note-insn-basic-block.rtl | 1 + > .../rtl.dg/roundtrip/note-insn-deleted.rtl | 1 + > .../rtl.dg/roundtrip/reg-with-orig-regno.rtl | 4 + > gcc/testsuite/rtl.dg/roundtrip/roundtrip.exp | 99 ++ > .../rtl.dg/roundtrip/test-loop.cleaned.rtl | 75 ++ > .../rtl.dg/roundtrip/test-switch-after-expand.rtl | 202 ++++ > gcc/testsuite/rtl.dg/rtl.exp | 43 + > gcc/testsuite/rtl.dg/test.c | 31 + > gcc/testsuite/rtl.dg/unknown-insn-uid.rtl | 3 + > gcc/testsuite/rtl.dg/unknown-rtx-code.rtl | 1 + > gcc/testsuite/rtl.dg/vregs.rtl | 80 ++ > gcc/toplev.c | 7 + > gcc/tree-dfa.c | 5 + > 78 files changed, 4229 insertions(+), 524 deletions(-) > create mode 100644 gcc/deferred-locations.c > create mode 100644 gcc/deferred-locations.h > create mode 100644 gcc/rtl/Make-lang.in > create mode 100644 gcc/rtl/config-lang.in > create mode 100644 gcc/rtl/lang-specs.h > create mode 100644 gcc/rtl/lang.opt > create mode 100644 gcc/rtl/notes.rst > create mode 100644 gcc/rtl/rtl-errors.c > create mode 100644 gcc/rtl/rtl-frontend.c > create mode 100644 gcc/testsuite/lib/rtl-dg.exp > create mode 100644 gcc/testsuite/rtl.dg/dfinit.rtl > create mode 100644 gcc/testsuite/rtl.dg/final.rtl > create mode 100644 gcc/testsuite/rtl.dg/good-include.rtl > create mode 100644 gcc/testsuite/rtl.dg/good-includee.md > create mode 100644 gcc/testsuite/rtl.dg/into-cfglayout.rtl > create mode 100644 gcc/testsuite/rtl.dg/ira.rtl > create mode 100644 gcc/testsuite/rtl.dg/missing-include.rtl > create mode 100644 gcc/testsuite/rtl.dg/pro_and_epilogue.rtl > create mode 100644 gcc/testsuite/rtl.dg/roundtrip/code-labels.rtl > create mode 100644 gcc/testsuite/rtl.dg/roundtrip/frame-pointer.rtl > create mode 100644 gcc/testsuite/rtl.dg/roundtrip/insn-with-mode.rtl > create mode 100644 gcc/testsuite/rtl.dg/roundtrip/jump-to-label.rtl > create mode 100644 gcc/testsuite/rtl.dg/roundtrip/jump-to-return.rtl > create mode 100644 gcc/testsuite/rtl.dg/roundtrip/jump-to-simple-return.rtl > create mode 100644 gcc/testsuite/rtl.dg/roundtrip/note-insn-basic-block.rtl > create mode 100644 gcc/testsuite/rtl.dg/roundtrip/note-insn-deleted.rtl > create mode 100644 gcc/testsuite/rtl.dg/roundtrip/reg-with-orig-regno.rtl > create mode 100644 gcc/testsuite/rtl.dg/roundtrip/roundtrip.exp > create mode 100644 gcc/testsuite/rtl.dg/roundtrip/test-loop.cleaned.rtl > create mode 100644 > gcc/testsuite/rtl.dg/roundtrip/test-switch-after-expand.rtl > create mode 100644 gcc/testsuite/rtl.dg/rtl.exp > create mode 100644 gcc/testsuite/rtl.dg/test.c > create mode 100644 gcc/testsuite/rtl.dg/unknown-insn-uid.rtl > create mode 100644 gcc/testsuite/rtl.dg/unknown-rtx-code.rtl > create mode 100644 gcc/testsuite/rtl.dg/vregs.rtl > > -- > 1.8.5.3 >