Re: [PATCH 000/236] Introduce rtx subclasses
On Tue, 2014-08-12 at 20:26 -0400, David Malcolm wrote: On Tue, 2014-08-12 at 14:39 -0600, Jeff Law wrote: On 08/06/14 11:19, David Malcolm wrote: The aim of the patch series is to improve the type-safety and readability of the backend by introducing subclasses of rtx (actually rtx_def) for *instructions*, and also for EXPR_LIST, INSN_LIST, SEQUENCE. That way we can document directly in the code the various places that manipulate insn chains vs other kinds of rtx node. Right. And by catching this stuff at compile time developers working on RTL bits should become at least a tiny bit more efficient. [...snip...] Performance === I tested the performance with --enable-checking=release using two large files (kdecore.cc, bigcode.c), comparing a control build to a patched build. There were no significant differences in compilation time: OK. A bit of a surprise, but then again perhaps not because we don't enable RTL checking by default. And I suspect a goodly amount of the RTL checking overhead is in the operands, not on the chain itself. As for the performance of a regular build i.e. with as_a checks *enabled*; looking at the wallclock time taken for a bootstrap and regression test, for my s390 builds (with -j3) I saw: s390 control: make time: 68 mins make check time: 122 mins total time: 190 mins s390 experiment: make time: 70 mins make check time: 126 mins total time: 196 mins showing a 3% increase, presumably due to the as_a and as_a_nullable checks. You want to be real careful benchmarking on ppc/s390 hardware as they're partitioned and what goes on in a different partition can affect your partition. Ah. This suggests my testing may need redoing. I'll try to redo this on a dedicated x86_64 box. I've now redone timed bootstrap testing on a dedicated x86_64 box (64 core Opteron with 128GB, with -j64). These were regular builds i.e. without setting --enable-checking. Unpatched build: make time: 15 mins make check time: 115 mins total time: 130 mins Patched build (with the full patchkit): make time: 16 mins make check time: 116 mins total time: 132 mins (and equal results from the regrtest) i.e. a release build shows no change in performance; a debug build shows a 3% increase in time taken to bootstrap and regression test. I believe the debug build could be sped up with further patches to eliminate the checked casts. Yea. As I've mentioned, my vision is to try and push out the casts as much as reasonably possible. (FWIW, I have been working on further optimizations to remove checked casts, but the above is without those optimizations)
Re: [PATCH 000/236] Introduce rtx subclasses
On Wed, 2014-08-13 at 20:13 -0400, David Malcolm wrote: On Wed, 2014-08-06 at 13:19 -0400, David Malcolm wrote: This is the patch series I spoke about at Cauldron in the talk A proposal for typesafe RTL; slides here: http://dmalcolm.fedorapeople.org/presentations/cauldron-2014/rtl They can also be seen at: https://dmalcolm.fedorapeople.org/gcc/patch-backups/rtx-classes/v20/ [...snip...] The patches (and my control for testing) has been on top of r211061, which being from 2014-05-29 is now two months old, but hopefully is still reviewable; naturally I'll perform bootstrap and regression testing for each patch in turn before committing. FWIW I've now rebased the patches against today's trunk (specifically r213914) and I've backed up the results to: https://dmalcolm.fedorapeople.org/gcc/patch-backups/rtx-classes/v21/ Still TODO: * rename as_a_nullable to safe_as_a * eliminate rtx_real_insn * make various scaffolding functions be inline, updating gdbinit.in as appropriate. [...snip...] Likewise; latest WIP: https://dmalcolm.fedorapeople.org/gcc/patch-backups/rtx-classes/v22/ has: * the renaming of as_a_nullable to safe_as_a * the elimination of rtx_real_insn Still TODO: * make various scaffolding functions be inline, updating gdbinit.in as
Re: [PATCH 000/236] Introduce rtx subclasses
On 08/12/14 18:26, David Malcolm wrote: Essentially zero in the work-to-date. Of the 13 new subclasses of rtx, I only make major use of about half of them; here are the frequencies (as reported by grep -w in my current working copy): class rtx_def; /* ~26000 for rtx */ class rtx_expr_list; /* 73 */ class rtx_insn_list; /* 130 */ class rtx_sequence;/* 71 */ class rtx_insn;/* ~4300 */ class rtx_real_insn; /* 17 */ class rtx_debug_insn; /* 9 */ class rtx_nonjump_insn;/* 5 */ class rtx_jump_insn; /* 9 */ class rtx_call_insn; /* 25 */ class rtx_jump_table_data; /* 41 */ class rtx_barrier; /* 25 */ class rtx_code_label;/* 176 */ class rtx_note; /* 63 */ i.e. rtx_real_insn is basically unused, as are rtx_debug_insn, rtx_nonjump_insn and rtx_jump_insn. I think there's a case for keeping the concrete subclasses (those last three), but we can drop rtx_real_insn. Let's go with that. I wouldn't be surprised if we find more cases where we want the more concrete classes in the future, so I'm not worried about their low usage at this point. And we can always re-introduce the real-insn concept in the future if we find the need. I guess I'm rather biased against deep inheritance hierarchies after working on projects with unnecessarily deep hierarchies in the past. with an rtx_real_insn you're guaranteed at least a uuBeiie. But nothing is using that in the patches as they stand, so we can simply drop the class. And that's the one property I think argues to keep the intermediate class. But with nothing using it right now, let's drop. Perhaps the class hierarchy diagram in coretypes.h should gain the above operand annotation? Sure. Feel free to add that tidbit whenever you feel. No strong opinion here. I think we added NULL_TREE/NULL_RTX. I could possibly see extending that to the overall concept of insn chain things, but I think doing one for each subclass would probably be overkill. In the absence of strong opinions, maybe we should proceed with the patches as written i.e. *without* a NULL_INSN define. OK. We can revisit if/when we make things in the chain separate from rtxs. So out of curiosity, any thoughts on what other big things are out there that need to be fixed. I'm keen to keep this stuff moving as much as possible. Arguably PATTERN() should require an rtx_insn * rather than a plain rtx, but it would be an involved patch. Yea. That's a good one. Probably a series unto itself at some point. Other followups would be to reduce the number of as_a rtx_insn * in the code; for example grepping for uncast_ shows quite a few, a pattern where I strengthen the type of a parameter as seen within the function without needing to strengthen the param itself, where: Yea. These are reasonable to attack independently. A part of me really wonders if the 6 phases above should have been submitted independently. ie, once the scaffolding was done why not go ahead and get that reviewed installed, then move on to phase2 patches. I bring it up more for future work of a similar nature. I realize that you had to do a fair amount of the later work to ensure the scaffolding was right and so that we could see what the end result would likely look like. But something feels like it could be staged better. FWIW What you're seeing is the end result of a *lot* of git rebase -i, where I was splitting, combining, and reordering patches: the phases didn't exist as fully-formed entities until I was ready to send the patches. Though I appreciate things were suboptimal here. Understood. Just looking for a way where we an move this kind of work forward without the level of pain on your side and without having to wait for a 236 series patchset before it can be contributed. Maybe we just have to accept that this kind of work is going to be difficult to stage for reviews. Dunno, just feels like we ought to be able to make it simpler for both of us and have smaller hunks staging in earlier. Jeff
Re: [PATCH 000/236] Introduce rtx subclasses
On Wed, 2014-08-06 at 13:19 -0400, David Malcolm wrote: This is the patch series I spoke about at Cauldron in the talk A proposal for typesafe RTL; slides here: http://dmalcolm.fedorapeople.org/presentations/cauldron-2014/rtl They can also be seen at: https://dmalcolm.fedorapeople.org/gcc/patch-backups/rtx-classes/v20/ [...snip...] The patches (and my control for testing) has been on top of r211061, which being from 2014-05-29 is now two months old, but hopefully is still reviewable; naturally I'll perform bootstrap and regression testing for each patch in turn before committing. FWIW I've now rebased the patches against today's trunk (specifically r213914) and I've backed up the results to: https://dmalcolm.fedorapeople.org/gcc/patch-backups/rtx-classes/v21/ Still TODO: * rename as_a_nullable to safe_as_a * eliminate rtx_real_insn * make various scaffolding functions be inline, updating gdbinit.in as appropriate. [...snip...] Dave
Re: [PATCH 000/236] Introduce rtx subclasses
On 08/06/14 11:19, David Malcolm wrote: The aim of the patch series is to improve the type-safety and readability of the backend by introducing subclasses of rtx (actually rtx_def) for *instructions*, and also for EXPR_LIST, INSN_LIST, SEQUENCE. That way we can document directly in the code the various places that manipulate insn chains vs other kinds of rtx node. Right. And by catching this stuff at compile time developers working on RTL bits should become at least a tiny bit more efficient. The class hierarchy looks like this (using indentation to show inheritance, and indicating the invariants): class rtx_def; class rtx_expr_list; /* GET_CODE (X) == EXPR_LIST */ class rtx_insn_list; /* GET_CODE (X) == INSN_LIST */ class rtx_sequence;/* GET_CODE (X) == SEQUENCE */ class rtx_insn;/* INSN_CHAIN_CODE_P (GET_CODE (X)) */ class rtx_real_insn; /* INSN_P (X) */ class rtx_debug_insn; /* DEBUG_INSN_P (X) */ class rtx_nonjump_insn;/* NONJUMP_INSN_P (X) */ class rtx_jump_insn; /* JUMP_P (X) */ class rtx_call_insn; /* CALL_P (X) */ class rtx_jump_table_data; /* JUMP_TABLE_DATA_P (X) */ class rtx_barrier; /* BARRIER_P (X) */ class rtx_code_label;/* LABEL_P (X) */ class rtx_note; /* NOTE_P (X) */ So I think the only real question here is do we want to distinguish between rtx_real_insn and the others? When I see rtx_real_insn my first thought is something that's going to turn into an instruction. But that's not true in this hierarchy due to rtx_debug_insn. So throughout the work-to-date, did you find much use for rtx_real_insn? The patch series also contains some cleanups using inline methods: * being able to get rid of this boilerplate everywhere that jump tables are handled: if (GET_CODE (PATTERN (table)) == ADDR_VEC) vec = XVEC (PATTERN (table), 0); else vec = XVEC (PATTERN (table), 1); in favor of a helper method (inlined): vec = table-get_labels (); Right. I suspect there's other cleanups like this all over the place that we could do and probably should in the interest of readability. * having a subclass for EXPR_LIST allows for replacing this kind of thing: for (x = forced_labels; x; x = XEXP (x, 1)) if (XEXP (x, 0)) set_label_offsets (XEXP (x, 0), NULL_RTX, 1); with the following, which captures that it's an EXPR_LIST, and makes it clearer that we're simply walking a singly-linked list: for (rtx_expr_list *x = forced_labels; x; x = x-next ()) if (x-element ()) set_label_offsets (x-element (), NULL_RTX, 1); And would could argue here that we really just want to utilize some standard list container. I'm not sure what we're getting by rolling our own. When we discussed this a few weeks ago, I probably should have pushed a bit harder on that. But I think we can/should table that discussion. What you've done is a clear step forward and I think should be reviewed on those merits. If someone wants to go another step forward and use standard containers, that can be dealt with independently. There are some surface details to the patches: * class names. The subclass names correspond to the lower_case name from rtl.def, with an rtx_ prefix. rtx_insn and rtx_real_insn don't correspond to concrete node kinds, and hence I had to invent the names. (In an earlier version of the patches these were rtx_base_insn and rtx_insn respectively, but the former occurred much more than the latter and so it seemed better to use the shorter spelling for the common case). I can live with the existing class names. * there's a NULL_RTX define in rtl.h. In an earlier version of the patch I added a NULL_INSN define, but in this version I simply use NULL, and I'm in two minds about whether a NULL_INSN is desirable (would we have a NULL_FOO for all of the subclasses?). I like having a strong distinction between arbitrary RTL nodes vs instructions, so maybe there's a case for NULL_INSN, but not for the subclasses? No strong opinion here. I think we added NULL_TREE/NULL_RTX. I could possibly see extending that to the overall concept of insn chain things, but I think doing one for each subclass would probably be overkill. * I added an rtx_real_insn subclass for the INSN_P predicate, adding the idea of a PATTERN, a basic_block, and a location - but I hardly use this anywhere. That said, it seems to be a real concept in the code, so I added it. So DEBUG_INSN has a pattern and that's how it got into rtx_real_insn. Hmmm. Would agree it's a real concept in the code, but as much as I hate bikeshedding on names, this is one that just doesn't fit well. If you didn't use it much, maybe it's fixable without an
Re: [PATCH 000/236] Introduce rtx subclasses
On Tue, 2014-08-12 at 14:39 -0600, Jeff Law wrote: On 08/06/14 11:19, David Malcolm wrote: The aim of the patch series is to improve the type-safety and readability of the backend by introducing subclasses of rtx (actually rtx_def) for *instructions*, and also for EXPR_LIST, INSN_LIST, SEQUENCE. That way we can document directly in the code the various places that manipulate insn chains vs other kinds of rtx node. Right. And by catching this stuff at compile time developers working on RTL bits should become at least a tiny bit more efficient. The class hierarchy looks like this (using indentation to show inheritance, and indicating the invariants): class rtx_def; class rtx_expr_list; /* GET_CODE (X) == EXPR_LIST */ class rtx_insn_list; /* GET_CODE (X) == INSN_LIST */ class rtx_sequence;/* GET_CODE (X) == SEQUENCE */ class rtx_insn;/* INSN_CHAIN_CODE_P (GET_CODE (X)) */ class rtx_real_insn; /* INSN_P (X) */ class rtx_debug_insn; /* DEBUG_INSN_P (X) */ class rtx_nonjump_insn;/* NONJUMP_INSN_P (X) */ class rtx_jump_insn; /* JUMP_P (X) */ class rtx_call_insn; /* CALL_P (X) */ class rtx_jump_table_data; /* JUMP_TABLE_DATA_P (X) */ class rtx_barrier; /* BARRIER_P (X) */ class rtx_code_label;/* LABEL_P (X) */ class rtx_note; /* NOTE_P (X) */ So I think the only real question here is do we want to distinguish between rtx_real_insn and the others? When I see rtx_real_insn my first thought is something that's going to turn into an instruction. But that's not true in this hierarchy due to rtx_debug_insn. So throughout the work-to-date, did you find much use for rtx_real_insn? Essentially zero in the work-to-date. Of the 13 new subclasses of rtx, I only make major use of about half of them; here are the frequencies (as reported by grep -w in my current working copy): class rtx_def; /* ~26000 for rtx */ class rtx_expr_list; /* 73 */ class rtx_insn_list; /* 130 */ class rtx_sequence;/* 71 */ class rtx_insn;/* ~4300 */ class rtx_real_insn; /* 17 */ class rtx_debug_insn; /* 9 */ class rtx_nonjump_insn;/* 5 */ class rtx_jump_insn; /* 9 */ class rtx_call_insn; /* 25 */ class rtx_jump_table_data; /* 41 */ class rtx_barrier; /* 25 */ class rtx_code_label;/* 176 */ class rtx_note; /* 63 */ i.e. rtx_real_insn is basically unused, as are rtx_debug_insn, rtx_nonjump_insn and rtx_jump_insn. I think there's a case for keeping the concrete subclasses (those last three), but we can drop rtx_real_insn. I did some analysis of the operands of the various subclasses: class rtx_def; class rtx_expr_list; /* ee */ class rtx_insn_list; /* ue */ class rtx_sequence;/* E */ class rtx_insn; class rtx_real_insn; /* 01234567 */ class rtx_debug_insn; /* uuBeiie */ class rtx_nonjump_insn;/* uuBeiie */ class rtx_jump_insn; /* uuBeiie0 */ class rtx_call_insn; /* uuBeiiee */ class rtx_jump_table_data; /* uuBe */ class rtx_barrier; /* uu0 */ class rtx_code_label;/* uuB00is */ class rtx_note; /* uuB0ni */ /*01234567 XEXP (insn, 0): PREV_INSN()─┘│││ XEXP (insn, 1):NEXT_INSN()─┘││ XBBDEF (insn, 2):BLOCK_FOR_INSN()─┘│ XEXP (insn, 3):PATTERN()─┘ XUINT (insn, 4): INSN_LOCATION()─┘│││ XINT (insn, 5):INSN_CODE()─┘││ XEXP (insn, 6): REG_NOTES()─┘│ XEXP(INSN, 7): CALL_INSN_FUNCTION_USAGE(INSN) ─┘ */ with an rtx_real_insn you're guaranteed at least a uuBeiie. But nothing is using that in the patches as they stand, so we can simply drop the class. Perhaps the class hierarchy diagram in coretypes.h should gain the above operand annotation? The patch series also contains some cleanups using inline methods: * being able to get rid of this boilerplate everywhere that jump tables are handled: if (GET_CODE (PATTERN (table)) == ADDR_VEC) vec = XVEC (PATTERN (table), 0); else vec = XVEC (PATTERN (table), 1); in favor of a helper method (inlined): vec = table-get_labels (); Right. I suspect there's other cleanups like this all over the place that we could do and probably should in the interest of readability. (nods) *
[PATCH 000/236] Introduce rtx subclasses
This is the patch series I spoke about at Cauldron in the talk A proposal for typesafe RTL; slides here: http://dmalcolm.fedorapeople.org/presentations/cauldron-2014/rtl They can also be seen at: https://dmalcolm.fedorapeople.org/gcc/patch-backups/rtx-classes/v20/ The aim of the patch series is to improve the type-safety and readability of the backend by introducing subclasses of rtx (actually rtx_def) for *instructions*, and also for EXPR_LIST, INSN_LIST, SEQUENCE. That way we can document directly in the code the various places that manipulate insn chains vs other kinds of rtx node. An example of a bug detected using this approach: in mn10300.c there was dead code of the form: if (GET_CODE (insn) == PARALLEL) insn = XVECEXP (insn, 0, 0); where the test should really have been on PATTERN (insn), not insn: if (GET_CODE (PATTERN (insn)) == PARALLEL) insn = XVECEXP (PATTERN (insn), 0, 0); [as discussed in https://gcc.gnu.org/ml/gcc/2014-07/msg00078.html] The class hierarchy looks like this (using indentation to show inheritance, and indicating the invariants): class rtx_def; class rtx_expr_list; /* GET_CODE (X) == EXPR_LIST */ class rtx_insn_list; /* GET_CODE (X) == INSN_LIST */ class rtx_sequence;/* GET_CODE (X) == SEQUENCE */ class rtx_insn;/* INSN_CHAIN_CODE_P (GET_CODE (X)) */ class rtx_real_insn; /* INSN_P (X) */ class rtx_debug_insn; /* DEBUG_INSN_P (X) */ class rtx_nonjump_insn;/* NONJUMP_INSN_P (X) */ class rtx_jump_insn; /* JUMP_P (X) */ class rtx_call_insn; /* CALL_P (X) */ class rtx_jump_table_data; /* JUMP_TABLE_DATA_P (X) */ class rtx_barrier; /* BARRIER_P (X) */ class rtx_code_label;/* LABEL_P (X) */ class rtx_note; /* NOTE_P (X) */ The patch series converts roughly 4300 places in the code from using rtx to the more concrete rtx_insn *, in such places as: * the core types within basic blocks * hundreds of function params, struct fields, etc. e.g. within register allocators, schedulers * insn and curr_insn within .md files (peephole, attributes, define_bypass guards) * insn_t in sel-sched-ir.h * Target hooks: updated params of 25 of them * Debug hooks: label and var_location etc The patch series also contains some cleanups using inline methods: * being able to get rid of this boilerplate everywhere that jump tables are handled: if (GET_CODE (PATTERN (table)) == ADDR_VEC) vec = XVEC (PATTERN (table), 0); else vec = XVEC (PATTERN (table), 1); in favor of a helper method (inlined): vec = table-get_labels (); * having a subclass for EXPR_LIST allows for replacing this kind of thing: for (x = forced_labels; x; x = XEXP (x, 1)) if (XEXP (x, 0)) set_label_offsets (XEXP (x, 0), NULL_RTX, 1); with the following, which captures that it's an EXPR_LIST, and makes it clearer that we're simply walking a singly-linked list: for (rtx_expr_list *x = forced_labels; x; x = x-next ()) if (x-element ()) set_label_offsets (x-element (), NULL_RTX, 1); There are some surface details to the patches: * class names. The subclass names correspond to the lower_case name from rtl.def, with an rtx_ prefix. rtx_insn and rtx_real_insn don't correspond to concrete node kinds, and hence I had to invent the names. (In an earlier version of the patches these were rtx_base_insn and rtx_insn respectively, but the former occurred much more than the latter and so it seemed better to use the shorter spelling for the common case). * there's a NULL_RTX define in rtl.h. In an earlier version of the patch I added a NULL_INSN define, but in this version I simply use NULL, and I'm in two minds about whether a NULL_INSN is desirable (would we have a NULL_FOO for all of the subclasses?). I like having a strong distinction between arbitrary RTL nodes vs instructions, so maybe there's a case for NULL_INSN, but not for the subclasses? * I added an rtx_real_insn subclass for the INSN_P predicate, adding the idea of a PATTERN, a basic_block, and a location - but I hardly use this anywhere. That said, it seems to be a real concept in the code, so I added it. * pointerness of the types. rtx is a typedef to rtx_def * i.e. there's an implicit pointer. In the discussion about using C++ classes for gimple statements: https://gcc.gnu.org/ml/gcc-patches/2014-04/msg01427.html Richi said: To followup myself here, it's because 'tree' is a typedef to a pointer and thus 'const tree' is different from 'const tree_node *'. Not sure why we re-introduced the 'mistake' of making 'tree' a pointer when we introduced 'gimple'. If we were to make 'gimple' the class type itself we can use gimple *, const gimple * and also const gimple (when a