Re: [PATCH 000/236] Introduce rtx subclasses

2014-08-18 Thread David Malcolm
On Tue, 2014-08-12 at 20:26 -0400, David Malcolm wrote:
 On Tue, 2014-08-12 at 14:39 -0600, Jeff Law wrote:
  On 08/06/14 11:19, David Malcolm wrote:
  
   The aim of the patch series is to improve the type-safety and
   readability of the backend by introducing subclasses of rtx (actually
   rtx_def) for *instructions*, and also for EXPR_LIST, INSN_LIST, SEQUENCE.
  
   That way we can document directly in the code the various places that
   manipulate insn chains vs other kinds of rtx node.
  Right.  And by catching this stuff at compile time developers working on 
  RTL bits should become at least a tiny bit more efficient.

[...snip...]

   Performance
   ===
  
   I tested the performance with --enable-checking=release using
   two large files (kdecore.cc, bigcode.c), comparing a control build
   to a patched build.
  
   There were no significant differences in compilation time:
  OK.  A bit of a surprise, but then again perhaps not because we don't 
  enable RTL checking by default.  And I suspect a goodly amount of the 
  RTL checking overhead is in the operands, not on the chain itself.
  
  
  
  
   As for the performance of a regular build i.e. with as_a
   checks *enabled*; looking at the wallclock time taken for a bootstrap and
   regression test, for my s390 builds (with -j3) I saw:
  
   s390 control:
  make time: 68 mins
  make check time: 122 mins
  total time: 190 mins
  
   s390 experiment:
  make time: 70 mins
  make check time: 126 mins
  total time: 196 mins
  
   showing a 3% increase, presumably due to the as_a and as_a_nullable
   checks.
  You want to be real careful benchmarking on ppc/s390 hardware as they're 
  partitioned and what goes on in a different partition can affect your 
  partition.
 
 Ah.  This suggests my testing may need redoing.  I'll try to redo this
 on a dedicated x86_64 box.

I've now redone timed bootstrap testing on a dedicated x86_64 box (64
core Opteron with 128GB, with -j64).  These were regular builds i.e.
without setting --enable-checking.

Unpatched build:
   make time: 15 mins
   make check time: 115 mins
   total time: 130 mins

Patched build (with the full patchkit):
   make time: 16 mins
   make check time: 116 mins
   total time: 132 mins

(and equal results from the regrtest)

   i.e. a release build shows no change in performance; a debug build shows
   a 3% increase in time taken to bootstrap and regression test.  I believe
   the debug build could be sped up with further patches to eliminate the
   checked casts.
  Yea.  As I've mentioned, my vision is to try and push out the casts as 
  much as reasonably possible.

(FWIW, I have been working on further optimizations to remove checked
casts, but the above is without those optimizations)



Re: [PATCH 000/236] Introduce rtx subclasses

2014-08-14 Thread David Malcolm
On Wed, 2014-08-13 at 20:13 -0400, David Malcolm wrote:
 On Wed, 2014-08-06 at 13:19 -0400, David Malcolm wrote:
  This is the patch series I spoke about at Cauldron in the talk
  A proposal for typesafe RTL; slides here:
  http://dmalcolm.fedorapeople.org/presentations/cauldron-2014/rtl
  
  They can also be seen at:
  https://dmalcolm.fedorapeople.org/gcc/patch-backups/rtx-classes/v20/
 
 [...snip...]
 
  The patches (and my control for testing) has been on top of r211061, which
  being from 2014-05-29 is now two months old, but hopefully is still
  reviewable; naturally I'll perform bootstrap and regression testing for
  each patch in turn before committing.
 
 FWIW I've now rebased the patches against today's trunk (specifically
 r213914) and I've backed up the results to:
 https://dmalcolm.fedorapeople.org/gcc/patch-backups/rtx-classes/v21/
 
 Still TODO:
   * rename as_a_nullable to safe_as_a
   * eliminate rtx_real_insn
   * make various scaffolding functions be inline, updating gdbinit.in as
 appropriate.
 
 [...snip...]

Likewise; latest WIP:
https://dmalcolm.fedorapeople.org/gcc/patch-backups/rtx-classes/v22/
has:
  * the renaming of as_a_nullable to safe_as_a
  * the elimination of rtx_real_insn

Still TODO:
* make various scaffolding functions be inline, updating gdbinit.in as




Re: [PATCH 000/236] Introduce rtx subclasses

2014-08-13 Thread Jeff Law

On 08/12/14 18:26, David Malcolm wrote:

Essentially zero in the work-to-date.

Of the 13 new subclasses of rtx, I only make major use of about half of
them; here are the frequencies (as reported by grep -w in my current
working copy):

   class rtx_def;   /* ~26000 for rtx */
 class rtx_expr_list;   /* 73 */
 class rtx_insn_list;   /* 130 */
 class rtx_sequence;/* 71 */
 class rtx_insn;/* ~4300 */
   class rtx_real_insn; /* 17 */
 class rtx_debug_insn;  /* 9 */
 class rtx_nonjump_insn;/* 5 */
 class rtx_jump_insn;   /* 9 */
 class rtx_call_insn;   /* 25 */
   class rtx_jump_table_data;   /* 41 */
   class rtx_barrier;   /* 25 */
   class rtx_code_label;/* 176 */
   class rtx_note;  /* 63 */

i.e. rtx_real_insn is basically unused, as are rtx_debug_insn,
rtx_nonjump_insn and rtx_jump_insn.

I think there's a case for keeping the concrete subclasses (those last
three), but we can drop rtx_real_insn.
Let's go with that.  I wouldn't be surprised if we find more cases where 
we want the more concrete classes in the future, so I'm not worried 
about their low usage at this point.


And we can always re-introduce the real-insn concept in the future if we 
find the need.  I guess I'm rather biased against deep inheritance 
hierarchies after working on projects with unnecessarily deep 
hierarchies in the past.





with an rtx_real_insn you're guaranteed at least a uuBeiie.  But
nothing is using that in the patches as they stand, so we can simply
drop the class.
And that's the one property I think argues to keep the intermediate 
class.  But with nothing using it right now, let's drop.




Perhaps the class hierarchy diagram in coretypes.h should gain the above
operand annotation?

Sure.  Feel free to add that tidbit whenever you feel.


No strong opinion here.  I think we added NULL_TREE/NULL_RTX.  I could
possibly see extending that to the overall concept of insn chain
things, but I think doing one for each subclass would probably be overkill.


In the absence of strong opinions, maybe we should proceed with the
patches as written i.e. *without* a NULL_INSN define.

OK.  We can revisit if/when we make things in the chain separate from rtxs.




So out of curiosity, any thoughts on what other big things are out there
that need to be fixed.  I'm keen to keep this stuff moving as much as
possible.


Arguably PATTERN() should require an rtx_insn * rather than a plain rtx,
but it would be an involved patch.

Yea.  That's a good one.  Probably a series unto itself at some point.





Other followups would be to reduce the number of as_a rtx_insn * in
the code; for example grepping for uncast_ shows quite a few, a
pattern where I strengthen the type of a parameter as seen within the
function without needing to strengthen the param itself, where:

Yea.  These are reasonable to attack independently.





A part of me really wonders if the 6 phases above should have been
submitted independently.  ie, once the scaffolding was done why not go
ahead and get that reviewed  installed, then move on to phase2 patches.
   I bring it up more for future work of a similar nature.

I realize that you had to do a fair amount of the later work to ensure
the scaffolding was right and so that we could see what the end result
would likely look like.  But something feels like it could be staged better.


FWIW What you're seeing is the end result of a *lot* of
git rebase -i, where I was splitting, combining, and reordering
patches: the phases didn't exist as fully-formed entities until I was
ready to send the patches.

Though I appreciate things were suboptimal here.
Understood.  Just looking for a way where we an move this kind of work 
forward without the level of pain on your side and without having to 
wait for a 236 series patchset before it can be contributed.  Maybe we 
just have to accept that this kind of work is going to be difficult to 
stage for reviews.  Dunno, just feels like we ought to be able to make 
it simpler for both of us and have smaller hunks staging in earlier.


Jeff


Re: [PATCH 000/236] Introduce rtx subclasses

2014-08-13 Thread David Malcolm
On Wed, 2014-08-06 at 13:19 -0400, David Malcolm wrote:
 This is the patch series I spoke about at Cauldron in the talk
 A proposal for typesafe RTL; slides here:
 http://dmalcolm.fedorapeople.org/presentations/cauldron-2014/rtl
 
 They can also be seen at:
 https://dmalcolm.fedorapeople.org/gcc/patch-backups/rtx-classes/v20/

[...snip...]

 The patches (and my control for testing) has been on top of r211061, which
 being from 2014-05-29 is now two months old, but hopefully is still
 reviewable; naturally I'll perform bootstrap and regression testing for
 each patch in turn before committing.

FWIW I've now rebased the patches against today's trunk (specifically
r213914) and I've backed up the results to:
https://dmalcolm.fedorapeople.org/gcc/patch-backups/rtx-classes/v21/

Still TODO:
  * rename as_a_nullable to safe_as_a
  * eliminate rtx_real_insn
  * make various scaffolding functions be inline, updating gdbinit.in as
appropriate.

[...snip...]

Dave



Re: [PATCH 000/236] Introduce rtx subclasses

2014-08-12 Thread Jeff Law

On 08/06/14 11:19, David Malcolm wrote:


The aim of the patch series is to improve the type-safety and
readability of the backend by introducing subclasses of rtx (actually
rtx_def) for *instructions*, and also for EXPR_LIST, INSN_LIST, SEQUENCE.

That way we can document directly in the code the various places that
manipulate insn chains vs other kinds of rtx node.
Right.  And by catching this stuff at compile time developers working on 
RTL bits should become at least a tiny bit more efficient.




The class hierarchy looks like this (using indentation to show
inheritance, and indicating the invariants):

class rtx_def;
   class rtx_expr_list;   /* GET_CODE (X) == EXPR_LIST */
   class rtx_insn_list;   /* GET_CODE (X) == INSN_LIST */
   class rtx_sequence;/* GET_CODE (X) == SEQUENCE */
   class rtx_insn;/* INSN_CHAIN_CODE_P (GET_CODE (X)) */
 class rtx_real_insn; /* INSN_P (X) */
   class rtx_debug_insn;  /* DEBUG_INSN_P (X) */
   class rtx_nonjump_insn;/* NONJUMP_INSN_P (X) */
   class rtx_jump_insn;   /* JUMP_P (X) */
   class rtx_call_insn;   /* CALL_P (X) */
 class rtx_jump_table_data;   /* JUMP_TABLE_DATA_P (X) */
 class rtx_barrier;   /* BARRIER_P (X) */
 class rtx_code_label;/* LABEL_P (X) */
 class rtx_note;  /* NOTE_P (X) */
So I think the only real question here is do we want to distinguish 
between rtx_real_insn and the others?


When I see rtx_real_insn my first thought is something that's going to 
turn into an instruction.  But that's not true in this hierarchy due to 
rtx_debug_insn.


So throughout the work-to-date, did you find much use for rtx_real_insn?



The patch series also contains some cleanups using inline methods:

   * being able to get rid of this boilerplate everywhere that jump tables
 are handled:

   if (GET_CODE (PATTERN (table)) == ADDR_VEC)
 vec = XVEC (PATTERN (table), 0);
   else
 vec = XVEC (PATTERN (table), 1);

in favor of a helper method (inlined):

   vec = table-get_labels ();
Right.  I suspect there's other cleanups like this all over the place 
that we could do and probably should in the interest of readability.







   * having a subclass for EXPR_LIST allows for replacing this kind of
 thing:

   for (x = forced_labels; x; x = XEXP (x, 1))
 if (XEXP (x, 0))
set_label_offsets (XEXP (x, 0), NULL_RTX, 1);

 with the following, which captures that it's an EXPR_LIST, and makes
 it clearer that we're simply walking a singly-linked list:

   for (rtx_expr_list *x = forced_labels; x; x = x-next ())
 if (x-element ())
   set_label_offsets (x-element (), NULL_RTX, 1);
And would could argue here that we really just want to utilize some 
standard list container.  I'm not sure what we're getting by rolling our 
own.  When we discussed this a few weeks ago, I probably should have 
pushed a bit harder on that.  But I think we can/should table that 
discussion.  What you've done is a clear step forward and I think should 
be reviewed on those merits.  If someone wants to go another step 
forward and use standard containers, that can be dealt with independently.





There are some surface details to the patches:

   * class names.  The subclass names correspond to the lower_case name
 from rtl.def, with an rtx_ prefix.  rtx_insn and rtx_real_insn
 don't correspond to concrete node kinds, and hence I had to invent
 the names.  (In an earlier version of the patches these were
 rtx_base_insn and rtx_insn respectively, but the former occurred
 much more than the latter and so it seemed better to use the shorter
 spelling for the common case).

I can live with the existing class names.



   * there's a NULL_RTX define in rtl.h.   In an earlier version of the
 patch I added a NULL_INSN define, but in this version I simply use
 NULL, and I'm in two minds about whether a NULL_INSN is desirable
 (would we have a NULL_FOO for all of the subclasses?).  I like having
 a strong distinction between arbitrary RTL nodes vs instructions,
 so maybe there's a case for NULL_INSN, but not for the subclasses?
No strong opinion here.  I think we added NULL_TREE/NULL_RTX.  I could 
possibly see extending that to the overall concept of insn chain 
things, but I think doing one for each subclass would probably be overkill.




   * I added an rtx_real_insn subclass for the INSN_P predicate, adding
 the idea of a PATTERN, a basic_block, and a location - but I hardly
 use this anywhere.  That said, it seems to be a real concept in the
 code, so I added it.
So DEBUG_INSN has a pattern and that's how it got into rtx_real_insn. 
Hmmm.  Would agree it's a real concept in the code, but as much as I 
hate bikeshedding on names, this is one that just doesn't fit well.  If 
you didn't use it much, maybe it's fixable without an 

Re: [PATCH 000/236] Introduce rtx subclasses

2014-08-12 Thread David Malcolm
On Tue, 2014-08-12 at 14:39 -0600, Jeff Law wrote:
 On 08/06/14 11:19, David Malcolm wrote:
 
  The aim of the patch series is to improve the type-safety and
  readability of the backend by introducing subclasses of rtx (actually
  rtx_def) for *instructions*, and also for EXPR_LIST, INSN_LIST, SEQUENCE.
 
  That way we can document directly in the code the various places that
  manipulate insn chains vs other kinds of rtx node.
 Right.  And by catching this stuff at compile time developers working on 
 RTL bits should become at least a tiny bit more efficient.
 
  The class hierarchy looks like this (using indentation to show
  inheritance, and indicating the invariants):
 
  class rtx_def;
 class rtx_expr_list;   /* GET_CODE (X) == EXPR_LIST */
 class rtx_insn_list;   /* GET_CODE (X) == INSN_LIST */
 class rtx_sequence;/* GET_CODE (X) == SEQUENCE */
 class rtx_insn;/* INSN_CHAIN_CODE_P (GET_CODE (X)) */
   class rtx_real_insn; /* INSN_P (X) */
 class rtx_debug_insn;  /* DEBUG_INSN_P (X) */
 class rtx_nonjump_insn;/* NONJUMP_INSN_P (X) */
 class rtx_jump_insn;   /* JUMP_P (X) */
 class rtx_call_insn;   /* CALL_P (X) */
   class rtx_jump_table_data;   /* JUMP_TABLE_DATA_P (X) */
   class rtx_barrier;   /* BARRIER_P (X) */
   class rtx_code_label;/* LABEL_P (X) */
   class rtx_note;  /* NOTE_P (X) */
 So I think the only real question here is do we want to distinguish 
 between rtx_real_insn and the others?
 
 When I see rtx_real_insn my first thought is something that's going to 
 turn into an instruction.  But that's not true in this hierarchy due to 
 rtx_debug_insn.
 
 So throughout the work-to-date, did you find much use for rtx_real_insn?

Essentially zero in the work-to-date.

Of the 13 new subclasses of rtx, I only make major use of about half of
them; here are the frequencies (as reported by grep -w in my current
working copy):

  class rtx_def;   /* ~26000 for rtx */
class rtx_expr_list;   /* 73 */
class rtx_insn_list;   /* 130 */
class rtx_sequence;/* 71 */
class rtx_insn;/* ~4300 */
  class rtx_real_insn; /* 17 */
class rtx_debug_insn;  /* 9 */
class rtx_nonjump_insn;/* 5 */
class rtx_jump_insn;   /* 9 */
class rtx_call_insn;   /* 25 */
  class rtx_jump_table_data;   /* 41 */
  class rtx_barrier;   /* 25 */
  class rtx_code_label;/* 176 */
  class rtx_note;  /* 63 */

i.e. rtx_real_insn is basically unused, as are rtx_debug_insn,
rtx_nonjump_insn and rtx_jump_insn.

I think there's a case for keeping the concrete subclasses (those last
three), but we can drop rtx_real_insn.

I did some analysis of the operands of the various subclasses:

 class rtx_def;
   class rtx_expr_list;   /* ee */
   class rtx_insn_list;   /* ue */
   class rtx_sequence;/* E  */
   class rtx_insn;
 class rtx_real_insn;
  /*  01234567 */
   class rtx_debug_insn;  /* uuBeiie */
   class rtx_nonjump_insn;/* uuBeiie */
   class rtx_jump_insn;   /* uuBeiie0 */
   class rtx_call_insn;   /* uuBeiiee */
 class rtx_jump_table_data;   /* uuBe */
 class rtx_barrier;   /* uu0  */
 class rtx_code_label;/* uuB00is  */
 class rtx_note;  /* uuB0ni  */
/*01234567
  
XEXP (insn, 0):   PREV_INSN()─┘│││
XEXP (insn, 1):NEXT_INSN()─┘││
  XBBDEF (insn, 2):BLOCK_FOR_INSN()─┘│
XEXP (insn, 3):PATTERN()─┘
   XUINT (insn, 4):   INSN_LOCATION()─┘│││
XINT (insn, 5):INSN_CODE()─┘││
XEXP (insn, 6): REG_NOTES()─┘│
  XEXP(INSN, 7): CALL_INSN_FUNCTION_USAGE(INSN) ─┘ */

with an rtx_real_insn you're guaranteed at least a uuBeiie.  But
nothing is using that in the patches as they stand, so we can simply
drop the class.

Perhaps the class hierarchy diagram in coretypes.h should gain the above
operand annotation?


  The patch series also contains some cleanups using inline methods:
 
 * being able to get rid of this boilerplate everywhere that jump tables
   are handled:
 
 if (GET_CODE (PATTERN (table)) == ADDR_VEC)
   vec = XVEC (PATTERN (table), 0);
 else
   vec = XVEC (PATTERN (table), 1);
 
  in favor of a helper method (inlined):
 
 vec = table-get_labels ();
 Right.  I suspect there's other cleanups like this all over the place 
 that we could do and probably should in the interest of readability.

(nods)

 * 

[PATCH 000/236] Introduce rtx subclasses

2014-08-06 Thread David Malcolm
This is the patch series I spoke about at Cauldron in the talk
A proposal for typesafe RTL; slides here:
http://dmalcolm.fedorapeople.org/presentations/cauldron-2014/rtl

They can also be seen at:
https://dmalcolm.fedorapeople.org/gcc/patch-backups/rtx-classes/v20/

The aim of the patch series is to improve the type-safety and
readability of the backend by introducing subclasses of rtx (actually
rtx_def) for *instructions*, and also for EXPR_LIST, INSN_LIST, SEQUENCE.

That way we can document directly in the code the various places that
manipulate insn chains vs other kinds of rtx node.

An example of a bug detected using this approach: in mn10300.c there
was dead code of the form:

  if (GET_CODE (insn) == PARALLEL)
insn = XVECEXP (insn, 0, 0);

where the test should really have been on PATTERN (insn), not insn:

  if (GET_CODE (PATTERN (insn)) == PARALLEL)
insn = XVECEXP (PATTERN (insn), 0, 0);

[as discussed in https://gcc.gnu.org/ml/gcc/2014-07/msg00078.html]

The class hierarchy looks like this (using indentation to show
inheritance, and indicating the invariants):

class rtx_def;
  class rtx_expr_list;   /* GET_CODE (X) == EXPR_LIST */
  class rtx_insn_list;   /* GET_CODE (X) == INSN_LIST */
  class rtx_sequence;/* GET_CODE (X) == SEQUENCE */
  class rtx_insn;/* INSN_CHAIN_CODE_P (GET_CODE (X)) */
class rtx_real_insn; /* INSN_P (X) */
  class rtx_debug_insn;  /* DEBUG_INSN_P (X) */
  class rtx_nonjump_insn;/* NONJUMP_INSN_P (X) */
  class rtx_jump_insn;   /* JUMP_P (X) */
  class rtx_call_insn;   /* CALL_P (X) */
class rtx_jump_table_data;   /* JUMP_TABLE_DATA_P (X) */
class rtx_barrier;   /* BARRIER_P (X) */
class rtx_code_label;/* LABEL_P (X) */
class rtx_note;  /* NOTE_P (X) */

The patch series converts roughly 4300 places in the code from using
rtx to the more concrete rtx_insn *, in such places as:

  * the core types within basic blocks

  * hundreds of function params, struct fields, etc.  e.g. within
register allocators, schedulers

  * insn and curr_insn within .md files (peephole, attributes,
define_bypass guards)

  * insn_t in sel-sched-ir.h

  * Target hooks: updated params of 25 of them

  * Debug hooks: label and var_location

etc

The patch series also contains some cleanups using inline methods:

  * being able to get rid of this boilerplate everywhere that jump tables
are handled:

  if (GET_CODE (PATTERN (table)) == ADDR_VEC)
vec = XVEC (PATTERN (table), 0);
  else
vec = XVEC (PATTERN (table), 1);

   in favor of a helper method (inlined):

  vec = table-get_labels ();

  * having a subclass for EXPR_LIST allows for replacing this kind of
thing:

  for (x = forced_labels; x; x = XEXP (x, 1))
if (XEXP (x, 0))
   set_label_offsets (XEXP (x, 0), NULL_RTX, 1);

with the following, which captures that it's an EXPR_LIST, and makes
it clearer that we're simply walking a singly-linked list:

  for (rtx_expr_list *x = forced_labels; x; x = x-next ())
if (x-element ())
  set_label_offsets (x-element (), NULL_RTX, 1);

There are some surface details to the patches:

  * class names.  The subclass names correspond to the lower_case name
from rtl.def, with an rtx_ prefix.  rtx_insn and rtx_real_insn
don't correspond to concrete node kinds, and hence I had to invent
the names.  (In an earlier version of the patches these were
rtx_base_insn and rtx_insn respectively, but the former occurred
much more than the latter and so it seemed better to use the shorter
spelling for the common case).

  * there's a NULL_RTX define in rtl.h.   In an earlier version of the
patch I added a NULL_INSN define, but in this version I simply use
NULL, and I'm in two minds about whether a NULL_INSN is desirable
(would we have a NULL_FOO for all of the subclasses?).  I like having
a strong distinction between arbitrary RTL nodes vs instructions,
so maybe there's a case for NULL_INSN, but not for the subclasses?

  * I added an rtx_real_insn subclass for the INSN_P predicate, adding
the idea of a PATTERN, a basic_block, and a location - but I hardly
use this anywhere.  That said, it seems to be a real concept in the
code, so I added it.

  * pointerness of the types.  rtx is a typedef to rtx_def * i.e.
there's an implicit pointer.  In the discussion about using C++
classes for gimple statements:
   https://gcc.gnu.org/ml/gcc-patches/2014-04/msg01427.html
Richi said:

 To followup myself here, it's because 'tree' is a typedef to a pointer
 and thus 'const tree' is different from 'const tree_node *'.

 Not sure why we re-introduced the 'mistake' of making 'tree' a pointer
 when we introduced 'gimple'.  If we were to make 'gimple' the class
 type itself we can use gimple *, const gimple * and also const gimple 
 (when a