Re: [patch] speed up ifcvt:cond_move_convert_if_block

2012-08-18 Thread Steven Bosscher
On Thu, Aug 16, 2012 at 12:06 PM, Richard Guenther
richard.guent...@gmail.com wrote:
 On Thu, Aug 16, 2012 at 1:11 AM, Steven Bosscher stevenb@gmail.com 
 wrote:
 On Mon, Aug 6, 2012 at 1:27 PM, Paolo Bonzini bonz...@gnu.org wrote:
 2. sparseset has the same problem of memory clearing (for valgrind,
 see sparseset_alloc).

 ... only the sparse array needs this clearing, but currently we do it
 for both.

 And according to the fat comment before the xcalloc, it's not even
 really needed. Why can't we just tell valgrind to treat the memory as
 defined, like in this patchlet:

 Indeed ... I suppose ok if it bootstraps / tests w/o valgrind checking and
 if a cc1 builds with valgrind checking.

Here's what I'll commit:

Index: sparseset.c
===
--- sparseset.c (revision 190474)
+++ sparseset.c (working copy)
@@ -30,12 +30,14 @@
   unsigned int n_bytes = sizeof (struct sparseset_def)
 + ((n_elms - 1) * 2 * sizeof (SPARSESET_ELT_TYPE));

-  /* We use xcalloc rather than xmalloc to silence some valgrind uninitialized
+  sparseset set = XNEWVAR(struct sparseset_def, n_bytes);
+
+  /* Mark the sparseset as defined to silence some valgrind uninitialized
  read errors when accessing set-sparse[n] when n is not, and never has
  been, in the set.  These uninitialized reads are expected, by design and
- harmless.  If this turns into a performance problem due to some future
- additional users of sparseset, we can revisit this decision.  */
-  sparseset set = (sparseset) xcalloc (1, n_bytes);
+ harmless.  */
+  VALGRIND_DISCARD (VALGRIND_MAKE_MEM_DEFINED (set, n_bytes));
+
   set-dense = (set-elms[0]);
   set-sparse = (set-elms[n_elms]);
   set-size = n_elms;


Re: [patch] speed up ifcvt:cond_move_convert_if_block

2012-08-16 Thread Richard Guenther
On Thu, Aug 16, 2012 at 1:11 AM, Steven Bosscher stevenb@gmail.com wrote:
 On Mon, Aug 6, 2012 at 1:27 PM, Paolo Bonzini bonz...@gnu.org wrote:
 2. sparseset has the same problem of memory clearing (for valgrind,
 see sparseset_alloc).

 ... only the sparse array needs this clearing, but currently we do it
 for both.

 And according to the fat comment before the xcalloc, it's not even
 really needed. Why can't we just tell valgrind to treat the memory as
 defined, like in this patchlet:

Indeed ... I suppose ok if it bootstraps / tests w/o valgrind checking and
if a cc1 builds with valgrind checking.

Thanks,
Richard.

 Index: sparseset.c
 ===
 --- sparseset.c (revision 190418)
 +++ sparseset.c (working copy)
 @@ -30,12 +30,14 @@ sparseset_alloc (SPARSESET_ELT_TYPE n_elms)
unsigned int n_bytes = sizeof (struct sparseset_def)
  + ((n_elms - 1) * 2 * sizeof (SPARSESET_ELT_TYPE));

 -  /* We use xcalloc rather than xmalloc to silence some valgrind 
 uninitialized
 +  sparseset set = XNEWVAR(sparseset, n_bytes);
 +
 +  /* Mark the sparseset as defined to silence some valgrind uninitialized
   read errors when accessing set-sparse[n] when n is not, and never has
   been, in the set.  These uninitialized reads are expected, by design and
 - harmless.  If this turns into a performance problem due to some future
 - additional users of sparseset, we can revisit this decision.  */
 -  sparseset set = (sparseset) xcalloc (1, n_bytes);
 + harmless.  */
 +  VALGRIND_DISCARD (VALGRIND_MAKE_MEM_DEFINED (set, n_bytes));
 +
set-dense = (set-elms[0]);
set-sparse = (set-elms[n_elms]);
set-size = n_elms;


 I have never built GCC with valgrind checking, so I don't know if this
 is correct. But there has to be a better solution than calling
 xcalloc...

 Ciao!
 Steven


Re: [patch] speed up ifcvt:cond_move_convert_if_block

2012-08-15 Thread Steven Bosscher
On Mon, Aug 6, 2012 at 1:27 PM, Paolo Bonzini bonz...@gnu.org wrote:
 2. sparseset has the same problem of memory clearing (for valgrind,
 see sparseset_alloc).

 ... only the sparse array needs this clearing, but currently we do it
 for both.

And according to the fat comment before the xcalloc, it's not even
really needed. Why can't we just tell valgrind to treat the memory as
defined, like in this patchlet:

Index: sparseset.c
===
--- sparseset.c (revision 190418)
+++ sparseset.c (working copy)
@@ -30,12 +30,14 @@ sparseset_alloc (SPARSESET_ELT_TYPE n_elms)
   unsigned int n_bytes = sizeof (struct sparseset_def)
 + ((n_elms - 1) * 2 * sizeof (SPARSESET_ELT_TYPE));

-  /* We use xcalloc rather than xmalloc to silence some valgrind uninitialized
+  sparseset set = XNEWVAR(sparseset, n_bytes);
+
+  /* Mark the sparseset as defined to silence some valgrind uninitialized
  read errors when accessing set-sparse[n] when n is not, and never has
  been, in the set.  These uninitialized reads are expected, by design and
- harmless.  If this turns into a performance problem due to some future
- additional users of sparseset, we can revisit this decision.  */
-  sparseset set = (sparseset) xcalloc (1, n_bytes);
+ harmless.  */
+  VALGRIND_DISCARD (VALGRIND_MAKE_MEM_DEFINED (set, n_bytes));
+
   set-dense = (set-elms[0]);
   set-sparse = (set-elms[n_elms]);
   set-size = n_elms;


I have never built GCC with valgrind checking, so I don't know if this
is correct. But there has to be a better solution than calling
xcalloc...

Ciao!
Steven


[patch] speed up ifcvt:cond_move_convert_if_block

2012-08-06 Thread Steven Bosscher
Hello,

In PR54146, ifcvt spends a lot of time just clearing memory. This
patch changes the value maps to pointer-maps to fix this issue.

Bootstrappedtested on x86_64-unknown-linux-gnu. OK?

Ciao!
Steven


PR54146_ifcvt.diff
Description: Binary data


Re: [patch] speed up ifcvt:cond_move_convert_if_block

2012-08-06 Thread Richard Guenther
On Mon, Aug 6, 2012 at 8:54 AM, Steven Bosscher stevenb@gmail.com wrote:
 Hello,

 In PR54146, ifcvt spends a lot of time just clearing memory. This
 patch changes the value maps to pointer-maps to fix this issue.

 Bootstrappedtested on x86_64-unknown-linux-gnu. OK?

Ok!

Thanks,
Richard.

 Ciao!
 Steven


Re: [patch] speed up ifcvt:cond_move_convert_if_block

2012-08-06 Thread Paolo Bonzini
Il 06/08/2012 08:54, Steven Bosscher ha scritto:
 Hello,
 
 In PR54146, ifcvt spends a lot of time just clearing memory. This
 patch changes the value maps to pointer-maps to fix this issue.
 
 Bootstrappedtested on x86_64-unknown-linux-gnu. OK?

Nice, but perhaps we need a sparsemap to do even better?

Paolo



Re: [patch] speed up ifcvt:cond_move_convert_if_block

2012-08-06 Thread Steven Bosscher
On Mon, Aug 6, 2012 at 1:07 PM, Paolo Bonzini bonz...@gnu.org wrote:
 Il 06/08/2012 08:54, Steven Bosscher ha scritto:
 Hello,

 In PR54146, ifcvt spends a lot of time just clearing memory. This
 patch changes the value maps to pointer-maps to fix this issue.

 Bootstrappedtested on x86_64-unknown-linux-gnu. OK?

 Nice, but perhaps we need a sparsemap to do even better?

If you mean sparseset: no, it won't do any better.

My first idea was to use a sparseset, but:

1. The amount of memory allocated is simply too big. In fact, it'd be
even worse than before: 4*max_regno*sizeof(rtx) instead of only
2*max_regno*sizeof(rtx).

2. sparseset has the same problem of memory clearing (for valgrind,
see sparseset_alloc).

Remember, we can call this function cond_move_convert_if_block
O(n_basic_block) times: On a CFG with mostly regular control flow,
i.e. no or few computed jumps, jump tables, and exception handlers,
the number of IF-THEN-{,ELSE_}-JOIN blocks will be O(n_basic_blocks)
and cond_move_convert_if_block is called on almost all of them.

What could work, is to allocate a sparseset once at the beginning of
if_convert, but I don't see any good reason to add a pair of global
variables for this. My patch reduces the time spent in ifcvt to almost
nothing (down from ~250s, which is still not so bad, considering we're
looking at a function with 26 basic blocks!).

Ciao!
Steven


Re: [patch] speed up ifcvt:cond_move_convert_if_block

2012-08-06 Thread Paolo Bonzini
Il 06/08/2012 13:15, Steven Bosscher ha scritto:
 On Mon, Aug 6, 2012 at 1:07 PM, Paolo Bonzini bonz...@gnu.org wrote:
 Il 06/08/2012 08:54, Steven Bosscher ha scritto:
 Hello,

 In PR54146, ifcvt spends a lot of time just clearing memory. This
 patch changes the value maps to pointer-maps to fix this issue.

 Bootstrappedtested on x86_64-unknown-linux-gnu. OK?

 Nice, but perhaps we need a sparsemap to do even better?
 
 If you mean sparseset: no, it won't do any better.

No, I mean inventing a sparseset that cannot do boolean operations,
but can store an associative value in a second dense array.  That is

   sparsemap_insert(m, k, v)
   {
 if (!sparsemap_contains(m, k)) {
   s-sparse[k] = s-members++;
   s-dense_keys[i] = k;
 }
 i = s-sparse[k];
 s-dense_values[i] = v;
   }

   sparsemap_contains(m, k)
   {
 if (!sparsemap_contains(m, k)) {
   return -1;
 } else {
   return s-dense_values[i];
 }

 My first idea was to use a sparseset, but:
 
 1. The amount of memory allocated is simply too big. In fact, it'd be
 even worse than before: 4*max_regno*sizeof(rtx) instead of only
 2*max_regno*sizeof(rtx).

Yeah, sparseset wastes some memory.  I wonder if the dense array should
be allocated separately and even made dynamically resizable, also because...

 2. sparseset has the same problem of memory clearing (for valgrind,
 see sparseset_alloc).

... only the sparse array needs this clearing, but currently we do it
for both.

 What could work, is to allocate a sparseset once at the beginning of
 if_convert, but I don't see any good reason to add a pair of global
 variables for this.

Yes, this is what I meant.  Fast clearing is where sparsesets behave
well, so it makes sense to make them global in that case.  What I missed
was that the maps remain small.  Otherwise they would also need clearing.

Paolo