Re: [PATCH 4/4] Devirtualization based on global objects

2011-04-18 Thread Martin Jambor
Hi,

On Fri, Apr 15, 2011 at 05:38:50PM +0200, Richard Guenther wrote:
 On Fri, 15 Apr 2011, Martin Jambor wrote:
 
  Hi,
  
  this is the patch that actually speeds up astar (together with the
  previous one).  It implements devirtualization based on global
  objects.  In the past we have refrained from doing this because in
  general it is difficult to say whether the object is currently being
  constructed and so it might have a dynamic type of one of its
  ancestors.  However, if the object's class does not have any ancestors
  that obviously cannot happen.
  
  Devirtualizing in such conditions is enough to change a virtual call
  to regwayobj::isaddtobound in 473.astar to a direct one which can and
  should be inlined.  That seemed a good justification to implement this
  and so the patch below does so and brings about 3.1% speedup for that
  benchmark with LTO.
  
  I acknowledge that instead of discarding all classes with ancestors it
  would be better to check that the called virtual method has the same
  implementation in all ancestors instead.  That is perhaps something
  for later.
  
  It took me surprisingly long to realize that this technique can be
  used for folding virtual calls based on local automatically allocated
  objexts too and so can be used to un-XFAIL g++.dg/opt/devirt1.c that
  regressed in 4.6.
  
  Bootstrapped and tested on x86_64-linux.  OK for trunk?
  
  Thanks,
  
  Martin
  
  
  2011-04-15  Martin Jambor  mjam...@suse.cz
  
  * ipa-cp.c (ipcp_process_devirtualization_opportunities): Devirtualize
  also according to actual contants.
  * gimple-fold.c (gimple_extract_devirt_binfo_from_cst): New function.
  (gimple_fold_obj_type_ref_call): New function.
  (gimple_fold_call): Call gimple_fold_obj_type_ref_call on
  OBJ_TYPE_REFs.
  * gimple.h (gimple_extract_devirt_binfo_from_cst): Declare.
  
  * testsuite/g++.dg/opt/devirt1.C: Bump to -O2, remove XFAIL.
  * testsuite/g++.dg/opt/devirt2.C: New test.
  * testsuite/g++.dg/ipa/devirt-g-1.C: Likewise.
  
  

...

  Index: src/gcc/gimple-fold.c
  ===
  --- src.orig/gcc/gimple-fold.c
  +++ src/gcc/gimple-fold.c
  @@ -1438,6 +1438,95 @@ gimple_adjust_this_by_delta (gimple_stmt
 gimple_call_set_arg (call_stmt, 0, tmp);
   }
   
  +/* Return a binfo to be used for devirtualization of calls based on an 
  object
  +   represented by a declaration (i.e. a global or automatically allocated 
  one)
  +   or NULL if it cannot be found or is not safe.  CST is expected to be an
  +   ADDR_EXPR of such object or the function will return NULL.  Currently 
  it is
  +   safe to use such binfo only if it has no base binfo (i.e. no 
  ancestors).  */
  +
  +tree
  +gimple_extract_devirt_binfo_from_cst (tree cst)
  +{
  +  HOST_WIDE_INT offset, size, max_size;
  +  tree base, type, expected_type, binfo;
  +  bool last_artificial = false;
  +
  +  if (!flag_devirtualize
  +  || TREE_CODE (cst) != ADDR_EXPR
  +  || TREE_CODE (TREE_TYPE (TREE_TYPE (cst))) != RECORD_TYPE)
  +return NULL_TREE;
  +
  +  cst = TREE_OPERAND (cst, 0);
  +  expected_type = TREE_TYPE (cst);
  +  base = get_ref_base_and_extent (cst, offset, size, max_size);
  +  type = TREE_TYPE (base);
  +  if (!DECL_P (base)
  +  || max_size == -1
  +  || max_size != size
  +  || TREE_CODE (type) != RECORD_TYPE)
  +return NULL_TREE;
  +
  +  while (true)
  +{
  +  HOST_WIDE_INT pos, size;
  +  tree fld;
  +
  +  if (TYPE_MAIN_VARIANT (type) == TYPE_MAIN_VARIANT (expected_type))
  +   break;
  +  if (offset  0)
  +   return NULL_TREE;
  +
  +  for (fld = TYPE_FIELDS (type); fld; fld = DECL_CHAIN (fld))
  +   {
  + if (TREE_CODE (fld) != FIELD_DECL)
  +   continue;
  +
  + pos = int_bit_position (fld);
  + size = tree_low_cst (DECL_SIZE (fld), 1);
  + if (pos = offset  (pos + size)  offset)
  +   break;
  +   }
  +  if (!fld || TREE_CODE (TREE_TYPE (fld)) != RECORD_TYPE)
  +   return NULL_TREE;
  +
  +  last_artificial = DECL_ARTIFICIAL (fld);
  +  type = TREE_TYPE (fld);
  +  offset -= pos;
  +}
 
 Please add come comments on what this loop does.  It looks like
 it searches for the FIELD_DECL that corresponds to the incoming
 constant address (but it doesn't have to exactly point to its
 beginning?). 

Yes, an object can be within another.  Like S within R in the added
testcase g++/opt/devirt2.C.  Then it checks it is not a representative
of an ancestor but a user-defined object by looking at DECL_ARTIFICIAL.

 Note that you miss to handle the case where the
 declaration is view-converted to a different type and you
 instead use the declared type.  Which ISTR was correct as
 placement new on decls is kind of invalid.

Yes, I know I'm assuming that.

 
  +  if (last_artificial)
  +return NULL_TREE;
  +  binfo = TYPE_BINFO (type);
  +  if (!binfo || BINFO_N_BASE_BINFOS (binfo)  0)
  + 

[PATCH 4/4] Devirtualization based on global objects

2011-04-15 Thread Martin Jambor
Hi,

this is the patch that actually speeds up astar (together with the
previous one).  It implements devirtualization based on global
objects.  In the past we have refrained from doing this because in
general it is difficult to say whether the object is currently being
constructed and so it might have a dynamic type of one of its
ancestors.  However, if the object's class does not have any ancestors
that obviously cannot happen.

Devirtualizing in such conditions is enough to change a virtual call
to regwayobj::isaddtobound in 473.astar to a direct one which can and
should be inlined.  That seemed a good justification to implement this
and so the patch below does so and brings about 3.1% speedup for that
benchmark with LTO.

I acknowledge that instead of discarding all classes with ancestors it
would be better to check that the called virtual method has the same
implementation in all ancestors instead.  That is perhaps something
for later.

It took me surprisingly long to realize that this technique can be
used for folding virtual calls based on local automatically allocated
objexts too and so can be used to un-XFAIL g++.dg/opt/devirt1.c that
regressed in 4.6.

Bootstrapped and tested on x86_64-linux.  OK for trunk?

Thanks,

Martin


2011-04-15  Martin Jambor  mjam...@suse.cz

* ipa-cp.c (ipcp_process_devirtualization_opportunities): Devirtualize
also according to actual contants.
* gimple-fold.c (gimple_extract_devirt_binfo_from_cst): New function.
(gimple_fold_obj_type_ref_call): New function.
(gimple_fold_call): Call gimple_fold_obj_type_ref_call on
OBJ_TYPE_REFs.
* gimple.h (gimple_extract_devirt_binfo_from_cst): Declare.

* testsuite/g++.dg/opt/devirt1.C: Bump to -O2, remove XFAIL.
* testsuite/g++.dg/opt/devirt2.C: New test.
* testsuite/g++.dg/ipa/devirt-g-1.C: Likewise.


Index: src/gcc/ipa-cp.c
===
--- src.orig/gcc/ipa-cp.c
+++ src/gcc/ipa-cp.c
@@ -1245,51 +1245,71 @@ ipcp_process_devirtualization_opportunit
 
   for (ie = node-indirect_calls; ie; ie = next_ie)
 {
-  int param_index, types_count, j;
+  int param_index;
   HOST_WIDE_INT token, anc_offset;
   tree target, delta, otr_type;
+  struct ipcp_lattice *lat;
 
   next_ie = ie-next_callee;
   if (!ie-indirect_info-polymorphic)
continue;
   param_index = ie-indirect_info-param_index;
-  if (param_index == -1
- || ipa_param_cannot_devirtualize_p (info, param_index)
- || ipa_param_types_vec_empty (info, param_index))
+  if (param_index == -1)
continue;
 
+  lat = ipcp_get_lattice (info, param_index);
   token = ie-indirect_info-otr_token;
   anc_offset = ie-indirect_info-anc_offset;
   otr_type = ie-indirect_info-otr_type;
   target = NULL_TREE;
-  types_count = VEC_length (tree, info-params[param_index].types);
-  for (j = 0; j  types_count; j++)
+  if (lat-type == IPA_CONST_VALUE)
{
- tree binfo = VEC_index (tree, info-params[param_index].types, j);
- tree d, t;
-
+ tree binfo = gimple_extract_devirt_binfo_from_cst (lat-constant);
+ if (!binfo)
+   continue;
  binfo = get_binfo_at_offset (binfo, anc_offset, otr_type);
  if (!binfo)
-   {
- target = NULL_TREE;
- break;
-   }
+   continue;
+ target = gimple_get_virt_mehtod_for_binfo (token, binfo, delta,
+false);
+   }
+  else
+   {
+ int  types_count, j;
 
- t = gimple_get_virt_mehtod_for_binfo (token, binfo, d, true);
- if (!t)
-   {
- target = NULL_TREE;
- break;
-   }
- else if (!target)
-   {
- target = t;
- delta = d;
-   }
- else if (target != t || !tree_int_cst_equal (delta, d))
+ if (ipa_param_cannot_devirtualize_p (info, param_index)
+ || ipa_param_types_vec_empty (info, param_index))
+   continue;
+
+ types_count = VEC_length (tree, info-params[param_index].types);
+ for (j = 0; j  types_count; j++)
{
- target = NULL_TREE;
- break;
+ tree binfo = VEC_index (tree, info-params[param_index].types, j);
+ tree d, t;
+
+ binfo = get_binfo_at_offset (binfo, anc_offset, otr_type);
+ if (!binfo)
+   {
+ target = NULL_TREE;
+ break;
+   }
+
+ t = gimple_get_virt_mehtod_for_binfo (token, binfo, d, true);
+ if (!t)
+   {
+ target = NULL_TREE;
+ break;
+   }
+ else if (!target)
+   {
+ target = t;
+ delta = d;
+   }
+

Re: [PATCH 4/4] Devirtualization based on global objects

2011-04-15 Thread Richard Guenther
On Fri, 15 Apr 2011, Martin Jambor wrote:

 Hi,
 
 this is the patch that actually speeds up astar (together with the
 previous one).  It implements devirtualization based on global
 objects.  In the past we have refrained from doing this because in
 general it is difficult to say whether the object is currently being
 constructed and so it might have a dynamic type of one of its
 ancestors.  However, if the object's class does not have any ancestors
 that obviously cannot happen.
 
 Devirtualizing in such conditions is enough to change a virtual call
 to regwayobj::isaddtobound in 473.astar to a direct one which can and
 should be inlined.  That seemed a good justification to implement this
 and so the patch below does so and brings about 3.1% speedup for that
 benchmark with LTO.
 
 I acknowledge that instead of discarding all classes with ancestors it
 would be better to check that the called virtual method has the same
 implementation in all ancestors instead.  That is perhaps something
 for later.
 
 It took me surprisingly long to realize that this technique can be
 used for folding virtual calls based on local automatically allocated
 objexts too and so can be used to un-XFAIL g++.dg/opt/devirt1.c that
 regressed in 4.6.
 
 Bootstrapped and tested on x86_64-linux.  OK for trunk?
 
 Thanks,
 
 Martin
 
 
 2011-04-15  Martin Jambor  mjam...@suse.cz
 
   * ipa-cp.c (ipcp_process_devirtualization_opportunities): Devirtualize
   also according to actual contants.
   * gimple-fold.c (gimple_extract_devirt_binfo_from_cst): New function.
   (gimple_fold_obj_type_ref_call): New function.
   (gimple_fold_call): Call gimple_fold_obj_type_ref_call on
   OBJ_TYPE_REFs.
   * gimple.h (gimple_extract_devirt_binfo_from_cst): Declare.
 
   * testsuite/g++.dg/opt/devirt1.C: Bump to -O2, remove XFAIL.
   * testsuite/g++.dg/opt/devirt2.C: New test.
   * testsuite/g++.dg/ipa/devirt-g-1.C: Likewise.
 
 
 Index: src/gcc/ipa-cp.c
 ===
 --- src.orig/gcc/ipa-cp.c
 +++ src/gcc/ipa-cp.c
 @@ -1245,51 +1245,71 @@ ipcp_process_devirtualization_opportunit
  
for (ie = node-indirect_calls; ie; ie = next_ie)
  {
 -  int param_index, types_count, j;
 +  int param_index;
HOST_WIDE_INT token, anc_offset;
tree target, delta, otr_type;
 +  struct ipcp_lattice *lat;
  
next_ie = ie-next_callee;
if (!ie-indirect_info-polymorphic)
   continue;
param_index = ie-indirect_info-param_index;
 -  if (param_index == -1
 -   || ipa_param_cannot_devirtualize_p (info, param_index)
 -   || ipa_param_types_vec_empty (info, param_index))
 +  if (param_index == -1)
   continue;
  
 +  lat = ipcp_get_lattice (info, param_index);
token = ie-indirect_info-otr_token;
anc_offset = ie-indirect_info-anc_offset;
otr_type = ie-indirect_info-otr_type;
target = NULL_TREE;
 -  types_count = VEC_length (tree, info-params[param_index].types);
 -  for (j = 0; j  types_count; j++)
 +  if (lat-type == IPA_CONST_VALUE)
   {
 -   tree binfo = VEC_index (tree, info-params[param_index].types, j);
 -   tree d, t;
 -
 +   tree binfo = gimple_extract_devirt_binfo_from_cst (lat-constant);
 +   if (!binfo)
 + continue;
 binfo = get_binfo_at_offset (binfo, anc_offset, otr_type);
 if (!binfo)
 - {
 -   target = NULL_TREE;
 -   break;
 - }
 + continue;
 +   target = gimple_get_virt_mehtod_for_binfo (token, binfo, delta,
 +  false);
 + }
 +  else
 + {
 +   int  types_count, j;
  
 -   t = gimple_get_virt_mehtod_for_binfo (token, binfo, d, true);
 -   if (!t)
 - {
 -   target = NULL_TREE;
 -   break;
 - }
 -   else if (!target)
 - {
 -   target = t;
 -   delta = d;
 - }
 -   else if (target != t || !tree_int_cst_equal (delta, d))
 +   if (ipa_param_cannot_devirtualize_p (info, param_index)
 +   || ipa_param_types_vec_empty (info, param_index))
 + continue;
 +
 +   types_count = VEC_length (tree, info-params[param_index].types);
 +   for (j = 0; j  types_count; j++)
   {
 -   target = NULL_TREE;
 -   break;
 +   tree binfo = VEC_index (tree, info-params[param_index].types, j);
 +   tree d, t;
 +
 +   binfo = get_binfo_at_offset (binfo, anc_offset, otr_type);
 +   if (!binfo)
 + {
 +   target = NULL_TREE;
 +   break;
 + }
 +
 +   t = gimple_get_virt_mehtod_for_binfo (token, binfo, d, true);
 +   if (!t)
 + {
 +   target = NULL_TREE;
 +   break;
 + }
 +   else if (!target)
 + {
 +   target = t;
 +