Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)

2014-11-20 Thread Jakub Jelinek
On Wed, Nov 19, 2014 at 02:23:47PM -0800, Mike Stump wrote:
 On Nov 19, 2014, at 1:57 PM, Jakub Jelinek ja...@redhat.com wrote:
  Though, following patch is just fine for me too, I don't think it will
  make a significant difference:
 
 This version is fine by me.

Richard, are you ok with that too?

Bootstrapped/regtested on x86_64-linux and i686-linux now.

2014-11-20  Jakub Jelinek  ja...@redhat.com

PR target/63910
* simplify-rtx.c (simplify_immed_subreg): Return NULL for integer
modes wider than MAX_BITSIZE_MODE_ANY_INT.  If not using
CONST_WIDE_INT, make sure r fits into CONST_DOUBLE.

* gcc.target/i386/pr63910.c: New test.

--- gcc/simplify-rtx.c.jj   2014-11-19 09:17:15.491327992 +0100
+++ gcc/simplify-rtx.c  2014-11-19 12:28:16.223808178 +0100
@@ -5504,6 +5504,8 @@ simplify_immed_subreg (machine_mode oute
HOST_WIDE_INT tmp[MAX_BITSIZE_MODE_ANY_INT / 
HOST_BITS_PER_WIDE_INT];
wide_int r;
 
+   if (GET_MODE_PRECISION (outer_submode)  MAX_BITSIZE_MODE_ANY_INT)
+ return NULL_RTX;
for (u = 0; u  units; u++)
  {
unsigned HOST_WIDE_INT buf = 0;
@@ -5515,10 +5517,13 @@ simplify_immed_subreg (machine_mode oute
tmp[u] = buf;
base += HOST_BITS_PER_WIDE_INT;
  }
-   gcc_assert (GET_MODE_PRECISION (outer_submode)
-   = MAX_BITSIZE_MODE_ANY_INT);
r = wide_int::from_array (tmp, units,
  GET_MODE_PRECISION (outer_submode));
+#if TARGET_SUPPORTS_WIDE_INT == 0
+   /* Make sure r will fit into CONST_INT or CONST_DOUBLE.  */
+   if (wi::min_precision (r, SIGNED)  HOST_BITS_PER_DOUBLE_INT)
+ return NULL_RTX;
+#endif
elems[elem] = immed_wide_int_const (r, outer_submode);
  }
  break;
--- gcc/testsuite/gcc.target/i386/pr63910.c.jj  2014-11-19 12:04:23.490489130 
+0100
+++ gcc/testsuite/gcc.target/i386/pr63910.c 2014-11-19 12:04:23.490489130 
+0100
@@ -0,0 +1,12 @@
+/* PR target/63910 */
+/* { dg-do compile } */
+/* { dg-options -O -mstringop-strategy=vector_loop -mavx512f } */
+
+extern void bar (float *c);
+
+void
+foo (void)
+{
+  float c[1024] = { };
+  bar (c);
+}


Jakub


Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)

2014-11-20 Thread Richard Sandiford
Jakub Jelinek ja...@redhat.com writes:
 On Wed, Nov 19, 2014 at 02:23:47PM -0800, Mike Stump wrote:
 On Nov 19, 2014, at 1:57 PM, Jakub Jelinek ja...@redhat.com wrote:
  Though, following patch is just fine for me too, I don't think it will
  make a significant difference:
 
 This version is fine by me.

 Richard, are you ok with that too?

Yeah, looks good to me, thanks.

Richard


Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)

2014-11-19 Thread Jakub Jelinek
On Tue, Nov 18, 2014 at 04:34:12PM -0800, Mike Stump wrote:
 On Nov 18, 2014, at 3:42 PM, Jakub Jelinek ja...@redhat.com wrote:
  No, I'm not touching tmp array at all in that case, only look at vp
  individual bytes.  Either they are all 0, or all 0xff, or I return NULL.
 
 Oh, sorry, I misread where the break; in your code was going.  I might have 
 been misled by:
 
  -   gcc_assert (GET_MODE_PRECISION (outer_submode)
  -   = MAX_BITSIZE_MODE_ANY_INT);
 
 in your patch.  You don’t need that anymore, do you?  If not, can you remove 
 this part.

I thought the assert is unnecessary given the condition just a few lines
above it.  But can keep it, perhaps gcc_checking_assert would be enough, and
hopefully compiler optimizes it away completely.

 
 The rest looks like normal rtl/vector code, I don’t see anything wrong with 
 it.

Jakub


Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)

2014-11-19 Thread Richard Biener
On Tue, 18 Nov 2014, Jakub Jelinek wrote:

 Hi!
 
 OImode/XImode on i?86/x86_64 are not = MAX_BITSIZE_MODE_ANY_INT, because
 they are never used for integer arithmetics (and there is no way
 to represent all their values in RTL if not using CONST_WIDE_INT).
 As the following testcase shows, simplify_immed_subreg can be called
 with such modes though, e.g. trying to forward propagate a CONST_VECTOR
 (i?86/x86_64 handles all zeros and all ones as CONST_VECTORs that can appear
 in the IL directly) into a SUBREG_REG.

So we have (subreg:OI (reg:V4DF ...) ...) for example?  What do we
end doing with that OI mode subreg?  (why can't we simply use the
V4DF one?)

 The following patch instead of ICE handles the most common cases (all 0
 and all 1 CONST_VECTORs) and returns NULL otherwise.
 
 Before wide-int got merged, the testcase worked, though the code didn't
 bother checking anything, just created 0 or constm1_rtx for the two cases
 that could happen and if something else appeared, could just return what
 matched low TImode (or DImode for -m32).
 
 Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Looks ok to me.  Not sure if the zero/all-ones case really happens that
much to be worth special-casing - I think you could use
fixed_wide_intproper-size and simply see if the result is representable
in a CONST_INT/CONST_DOUBLE?  Can you try if that works?  It looks like
the patch may be smaller for that.

Thanks,
Richard.

 2014-11-18  Jakub Jelinek  ja...@redhat.com
 
   PR target/63910
   * simplify-rtx.c (simplify_immed_subreg): For integer modes
   wider than MAX_BITSIZE_MODE_ANY_INT, handle all zeros and all ones
   and for other values return NULL_RTX.
 
   * gcc.target/i386/pr63910.c: New test.
 
 --- gcc/simplify-rtx.c.jj 2014-11-11 00:06:19.0 +0100
 +++ gcc/simplify-rtx.c2014-11-18 10:17:01.198668555 +0100
 @@ -5505,6 +5505,21 @@ simplify_immed_subreg (machine_mode oute
   HOST_WIDE_INT tmp[MAX_BITSIZE_MODE_ANY_INT / 
 HOST_BITS_PER_WIDE_INT];
   wide_int r;
  
 + if (GET_MODE_PRECISION (outer_submode)  MAX_BITSIZE_MODE_ANY_INT)
 +   {
 + /* Handle just all zeros and all ones CONST_VECTORs in
 +this case.  */
 + if ((vp[0]  value_mask) == 0)
 +   elems[elem] = const0_rtx;
 + else if ((vp[0]  value_mask) == value_mask)
 +   elems[elem] = constm1_rtx;
 + else
 +   return NULL_RTX;
 + for (i = value_bit; i  elem_bitsize; i += value_bit)
 +   if ((vp[i / value_bit]  value_mask) != (vp[0]  value_mask))
 + return NULL_RTX;
 + break;
 +   }
   for (u = 0; u  units; u++)
 {
   unsigned HOST_WIDE_INT buf = 0;
 @@ -5516,8 +5531,6 @@ simplify_immed_subreg (machine_mode oute
   tmp[u] = buf;
   base += HOST_BITS_PER_WIDE_INT;
 }
 - gcc_assert (GET_MODE_PRECISION (outer_submode)
 - = MAX_BITSIZE_MODE_ANY_INT);
   r = wide_int::from_array (tmp, units,
 GET_MODE_PRECISION (outer_submode));
   elems[elem] = immed_wide_int_const (r, outer_submode);
 --- gcc/testsuite/gcc.target/i386/pr63910.c.jj2014-11-18 
 10:12:24.282659318 +0100
 +++ gcc/testsuite/gcc.target/i386/pr63910.c   2014-11-18 10:12:00.0 
 +0100
 @@ -0,0 +1,12 @@
 +/* PR target/63910 */
 +/* { dg-do compile } */
 +/* { dg-options -O -mstringop-strategy=vector_loop -mavx512f } */
 +
 +extern void bar (float *c);
 +
 +void
 +foo (void)
 +{
 +  float c[1024] = { };
 +  bar (c);
 +}
 
   Jakub
 
 

-- 
Richard Biener rguent...@suse.de
SUSE LINUX GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendoerffer, HRB 21284
(AG Nuernberg)
Maxfeldstrasse 5, 90409 Nuernberg, Germany


Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)

2014-11-19 Thread Jakub Jelinek
On Wed, Nov 19, 2014 at 09:59:45AM +0100, Richard Biener wrote:
  OImode/XImode on i?86/x86_64 are not = MAX_BITSIZE_MODE_ANY_INT, because
  they are never used for integer arithmetics (and there is no way
  to represent all their values in RTL if not using CONST_WIDE_INT).
  As the following testcase shows, simplify_immed_subreg can be called
  with such modes though, e.g. trying to forward propagate a CONST_VECTOR
  (i?86/x86_64 handles all zeros and all ones as CONST_VECTORs that can appear
  in the IL directly) into a SUBREG_REG.
 
 So we have (subreg:OI (reg:V4DF ...) ...) for example?  What do we
 end doing with that OI mode subreg?  (why can't we simply use the
 V4DF one?)

propagate_rtx_1 is called on
(subreg:OI (reg:V8DI 89) 0)
with old_rtx
(reg:V8DI 89)
and new_rtx
(const_vector:V8DI [
(const_int 0 [0])
(const_int 0 [0])
(const_int 0 [0])
(const_int 0 [0])
(const_int 0 [0])
(const_int 0 [0])
(const_int 0 [0])
(const_int 0 [0])
])

Seems we throw away the result in that case though, because
gen_lowpart_common doesn't like to return low parts of VOIDmode constants
larger if the mode is larger than double int:
1353  innermode = GET_MODE (x);
1354  if (CONST_INT_P (x)
1355   msize * BITS_PER_UNIT = HOST_BITS_PER_WIDE_INT)
1356innermode = mode_for_size (HOST_BITS_PER_WIDE_INT, MODE_INT, 0);
1357  else if (innermode == VOIDmode)
1358innermode = mode_for_size (HOST_BITS_PER_DOUBLE_INT, MODE_INT, 0);
we hit the last if and as mode is wider than innermode, we return NULL later
on.

  The following patch instead of ICE handles the most common cases (all 0
  and all 1 CONST_VECTORs) and returns NULL otherwise.
  
  Before wide-int got merged, the testcase worked, though the code didn't
  bother checking anything, just created 0 or constm1_rtx for the two cases
  that could happen and if something else appeared, could just return what
  matched low TImode (or DImode for -m32).
  
  Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
 
 Looks ok to me.  Not sure if the zero/all-ones case really happens that
 much to be worth special-casing - I think you could use
 fixed_wide_intproper-size and simply see if the result is representable
 in a CONST_INT/CONST_DOUBLE?  Can you try if that works?  It looks like
 the patch may be smaller for that.

So perhaps something like this?  Don't know how much more inefficient it is
compared to what it used to do before.

Or just keep the existing code and just remove the assert and instead return
NULL whenever outer_submode is wider than MAX_BITSIZE_MODE_ANY_INT?  At
least during propagation that will make zero change.
Though, in that case I have still doubts about the current code handling right
modes wider than HOST_BITS_PER_DOUBLE_INT but smaller than
MAX_BITSIZE_MODE_ANY_INT (none on i?86/x86_64).  If TARGET_SUPPORTS_WIDE_INT
== 0, we still silently throw away the upper bits, don't we?

BTW, to Mike, the assert has been misplaced, there was first buffer overflow
and only after that the assert.

2014-11-19  Jakub Jelinek  ja...@redhat.com

PR target/63910
* simplify-rtx.c (simplify_immed_subreg): Handle integer modes wider
than MAX_BITSIZE_MODE_ANY_INT.

* gcc.target/i386/pr63910.c: New test.

--- gcc/simplify-rtx.c.jj   2014-11-19 09:17:15.491327992 +0100
+++ gcc/simplify-rtx.c  2014-11-19 12:28:16.223808178 +0100
@@ -5501,8 +5501,12 @@ simplify_immed_subreg (machine_mode oute
int units
  = (GET_MODE_BITSIZE (outer_submode) + HOST_BITS_PER_WIDE_INT - 1)
  / HOST_BITS_PER_WIDE_INT;
-   HOST_WIDE_INT tmp[MAX_BITSIZE_MODE_ANY_INT / 
HOST_BITS_PER_WIDE_INT];
-   wide_int r;
+   const int tmpsize = (MAX_BITSIZE_MODE_ANY_MODE 
++ HOST_BITS_PER_WIDE_INT - 1)
+   / HOST_BITS_PER_WIDE_INT;
+   HOST_WIDE_INT tmp[tmpsize];
+   typedef FIXED_WIDE_INT (tmpsize * HOST_BITS_PER_WIDE_INT) imm_int;
+   imm_int r;
 
for (u = 0; u  units; u++)
  {
@@ -5515,10 +5519,12 @@ simplify_immed_subreg (machine_mode oute
tmp[u] = buf;
base += HOST_BITS_PER_WIDE_INT;
  }
-   gcc_assert (GET_MODE_PRECISION (outer_submode)
-   = MAX_BITSIZE_MODE_ANY_INT);
-   r = wide_int::from_array (tmp, units,
- GET_MODE_PRECISION (outer_submode));
+   r = imm_int::from_array (tmp, units,
+GET_MODE_PRECISION (outer_submode));
+#if TARGET_SUPPORTS_WIDE_INT == 0
+   if (wi::min_precision (r, SIGNED)  HOST_BITS_PER_DOUBLE_INT)
+ return NULL_RTX;
+#endif
elems[elem] = immed_wide_int_const (r, outer_submode);
  }
  break;
--- gcc/testsuite/gcc.target/i386/pr63910.c.jj  2014-11-19 

Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)

2014-11-19 Thread Richard Biener
On Wed, 19 Nov 2014, Jakub Jelinek wrote:

 On Wed, Nov 19, 2014 at 09:59:45AM +0100, Richard Biener wrote:
   OImode/XImode on i?86/x86_64 are not = MAX_BITSIZE_MODE_ANY_INT, because
   they are never used for integer arithmetics (and there is no way
   to represent all their values in RTL if not using CONST_WIDE_INT).
   As the following testcase shows, simplify_immed_subreg can be called
   with such modes though, e.g. trying to forward propagate a CONST_VECTOR
   (i?86/x86_64 handles all zeros and all ones as CONST_VECTORs that can 
   appear
   in the IL directly) into a SUBREG_REG.
  
  So we have (subreg:OI (reg:V4DF ...) ...) for example?  What do we
  end doing with that OI mode subreg?  (why can't we simply use the
  V4DF one?)
 
 propagate_rtx_1 is called on
 (subreg:OI (reg:V8DI 89) 0)
 with old_rtx
 (reg:V8DI 89)
 and new_rtx
 (const_vector:V8DI [
 (const_int 0 [0])
 (const_int 0 [0])
 (const_int 0 [0])
 (const_int 0 [0])
 (const_int 0 [0])
 (const_int 0 [0])
 (const_int 0 [0])
 (const_int 0 [0])
 ])
 
 Seems we throw away the result in that case though, because
 gen_lowpart_common doesn't like to return low parts of VOIDmode constants
 larger if the mode is larger than double int:
 1353innermode = GET_MODE (x);
 1354if (CONST_INT_P (x)
 1355 msize * BITS_PER_UNIT = HOST_BITS_PER_WIDE_INT)
 1356  innermode = mode_for_size (HOST_BITS_PER_WIDE_INT, MODE_INT, 0);
 1357else if (innermode == VOIDmode)
 1358  innermode = mode_for_size (HOST_BITS_PER_DOUBLE_INT, MODE_INT, 0);
 we hit the last if and as mode is wider than innermode, we return NULL later
 on.
 
   The following patch instead of ICE handles the most common cases (all 0
   and all 1 CONST_VECTORs) and returns NULL otherwise.
   
   Before wide-int got merged, the testcase worked, though the code didn't
   bother checking anything, just created 0 or constm1_rtx for the two cases
   that could happen and if something else appeared, could just return what
   matched low TImode (or DImode for -m32).
   
   Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
  
  Looks ok to me.  Not sure if the zero/all-ones case really happens that
  much to be worth special-casing - I think you could use
  fixed_wide_intproper-size and simply see if the result is representable
  in a CONST_INT/CONST_DOUBLE?  Can you try if that works?  It looks like
  the patch may be smaller for that.
 
 So perhaps something like this?  Don't know how much more inefficient it is
 compared to what it used to do before.

Yes, that looks good.

 Or just keep the existing code and just remove the assert and instead return
 NULL whenever outer_submode is wider than MAX_BITSIZE_MODE_ANY_INT?  At
 least during propagation that will make zero change.
 Though, in that case I have still doubts about the current code handling right
 modes wider than HOST_BITS_PER_DOUBLE_INT but smaller than
 MAX_BITSIZE_MODE_ANY_INT (none on i?86/x86_64).  If TARGET_SUPPORTS_WIDE_INT
 == 0, we still silently throw away the upper bits, don't we?

Well - not with your added check, no?

I'd say the patch is ok.

Thanks,
Richard.

 BTW, to Mike, the assert has been misplaced, there was first buffer overflow
 and only after that the assert.
 
 2014-11-19  Jakub Jelinek  ja...@redhat.com
 
   PR target/63910
   * simplify-rtx.c (simplify_immed_subreg): Handle integer modes wider
   than MAX_BITSIZE_MODE_ANY_INT.
 
   * gcc.target/i386/pr63910.c: New test.
 
 --- gcc/simplify-rtx.c.jj 2014-11-19 09:17:15.491327992 +0100
 +++ gcc/simplify-rtx.c2014-11-19 12:28:16.223808178 +0100
 @@ -5501,8 +5501,12 @@ simplify_immed_subreg (machine_mode oute
   int units
 = (GET_MODE_BITSIZE (outer_submode) + HOST_BITS_PER_WIDE_INT - 1)
 / HOST_BITS_PER_WIDE_INT;
 - HOST_WIDE_INT tmp[MAX_BITSIZE_MODE_ANY_INT / 
 HOST_BITS_PER_WIDE_INT];
 - wide_int r;
 + const int tmpsize = (MAX_BITSIZE_MODE_ANY_MODE 
 +  + HOST_BITS_PER_WIDE_INT - 1)
 + / HOST_BITS_PER_WIDE_INT;
 + HOST_WIDE_INT tmp[tmpsize];
 + typedef FIXED_WIDE_INT (tmpsize * HOST_BITS_PER_WIDE_INT) imm_int;
 + imm_int r;
  
   for (u = 0; u  units; u++)
 {
 @@ -5515,10 +5519,12 @@ simplify_immed_subreg (machine_mode oute
   tmp[u] = buf;
   base += HOST_BITS_PER_WIDE_INT;
 }
 - gcc_assert (GET_MODE_PRECISION (outer_submode)
 - = MAX_BITSIZE_MODE_ANY_INT);
 - r = wide_int::from_array (tmp, units,
 -   GET_MODE_PRECISION (outer_submode));
 + r = imm_int::from_array (tmp, units,
 +  GET_MODE_PRECISION (outer_submode));
 +#if TARGET_SUPPORTS_WIDE_INT == 0
 + if (wi::min_precision (r, SIGNED)  HOST_BITS_PER_DOUBLE_INT)
 +   

Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)

2014-11-19 Thread Jakub Jelinek
On Wed, Nov 19, 2014 at 12:59:06PM +0100, Richard Biener wrote:
  So perhaps something like this?  Don't know how much more inefficient it is
  compared to what it used to do before.
 
 Yes, that looks good.
 
  Or just keep the existing code and just remove the assert and instead return
  NULL whenever outer_submode is wider than MAX_BITSIZE_MODE_ANY_INT?  At
  least during propagation that will make zero change.
  Though, in that case I have still doubts about the current code handling 
  right
  modes wider than HOST_BITS_PER_DOUBLE_INT but smaller than
  MAX_BITSIZE_MODE_ANY_INT (none on i?86/x86_64).  If TARGET_SUPPORTS_WIDE_INT
  == 0, we still silently throw away the upper bits, don't we?
 
 Well - not with your added check, no?

For TARGET_SUPPORTS_WIDE_INT == 0 should be hopefully ok.  Not sure
about TARGET_SUPPORTS_WIDE_INT != 0, can it express any generic_wide_int, or
is it still bound to wide_int (i.e. MAX_BITSIZE_MODE_ANY_INT rounded up)
precision?  Mike?

Jakub


Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)

2014-11-19 Thread Richard Biener
On Wed, 19 Nov 2014, Jakub Jelinek wrote:

 On Wed, Nov 19, 2014 at 12:59:06PM +0100, Richard Biener wrote:
   So perhaps something like this?  Don't know how much more inefficient it 
   is
   compared to what it used to do before.
  
  Yes, that looks good.
  
   Or just keep the existing code and just remove the assert and instead 
   return
   NULL whenever outer_submode is wider than MAX_BITSIZE_MODE_ANY_INT?  At
   least during propagation that will make zero change.
   Though, in that case I have still doubts about the current code handling 
   right
   modes wider than HOST_BITS_PER_DOUBLE_INT but smaller than
   MAX_BITSIZE_MODE_ANY_INT (none on i?86/x86_64).  If 
   TARGET_SUPPORTS_WIDE_INT
   == 0, we still silently throw away the upper bits, don't we?
  
  Well - not with your added check, no?
 
 For TARGET_SUPPORTS_WIDE_INT == 0 should be hopefully ok.  Not sure
 about TARGET_SUPPORTS_WIDE_INT != 0, can it express any generic_wide_int, or
 is it still bound to wide_int (i.e. MAX_BITSIZE_MODE_ANY_INT rounded up)
 precision?  Mike?

It can represent any - well, the RTX at least.  Code then using
simple wide_int may wreck then though.

Richard.


Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)

2014-11-19 Thread Mike Stump
On Nov 19, 2014, at 4:24 AM, Richard Biener rguent...@suse.de wrote:
 On Wed, 19 Nov 2014, Jakub Jelinek wrote:
 For TARGET_SUPPORTS_WIDE_INT == 0 should be hopefully ok.  Not sure
 about TARGET_SUPPORTS_WIDE_INT != 0, can it express any generic_wide_int, or
 is it still bound to wide_int (i.e. MAX_BITSIZE_MODE_ANY_INT rounded up)
 precision?  Mike?
 
 It can represent any - well, the RTX at least.  Code then using
 simple wide_int may wreck then though.

So, my worry is this…  once you start in on adding support here or there for 
int modes wider than the largest supported int mode, you create a never ending 
maintenance nightmare with complex rules that one will never be able to keep 
straight.  There needs to be a single line or two, that explains the rules that 
we all agree to, then it will always be clear what the rule is.  The largest 
supported int mode is: x, has a nice, simple, easy to explain aspect to it.

If one looks at the rtl optimization code, and all the code that cracks rtl 
integer constants and plays with them, I don’t think _all_ that code is larger 
than max safe.  I have reason to believe it is all reasonable safe up to the 
max however; that was our contribution in wide-int.

Originally I was going to say, this patch needs a int larger than 
MAX_BITSIZE_MODE_ANY_INT person to review it, meaning, that isn’t me.  It still 
isn’t me; so, I defer that to someone that wants to try and keep those rules 
straight and memorize them and ensure the totality of the compiler actually 
works for theses cases by design.  Luckily, that only affects x86, as it is the 
only port to try and slice and dice like this.  So, I’ll defer to others on 
that.

Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)

2014-11-19 Thread Mike Stump
On Nov 19, 2014, at 4:07 AM, Jakub Jelinek ja...@redhat.com wrote:
 
 For TARGET_SUPPORTS_WIDE_INT == 0 should be hopefully ok.  Not sure
 about TARGET_SUPPORTS_WIDE_INT != 0, can it express any generic_wide_int, or
 is it still bound to wide_int (i.e. MAX_BITSIZE_MODE_ANY_INT rounded up)
 precision?  Mike?

So, you can generate any finite size const_int you want.  That is safe.  The 
question, what does the entire rest of the compiler do with these bigger than 
max things, well that’s the part that I’ll defer to others on.  Generally when 
Kenny and I did the code, I think we had a tendency to treat wide_int as a 
maximal size.

Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)

2014-11-19 Thread Richard Sandiford
Jakub Jelinek ja...@redhat.com writes:
 On Wed, Nov 19, 2014 at 09:59:45AM +0100, Richard Biener wrote:
  OImode/XImode on i?86/x86_64 are not = MAX_BITSIZE_MODE_ANY_INT, because
  they are never used for integer arithmetics (and there is no way
  to represent all their values in RTL if not using CONST_WIDE_INT).
  As the following testcase shows, simplify_immed_subreg can be called
  with such modes though, e.g. trying to forward propagate a CONST_VECTOR
  (i?86/x86_64 handles all zeros and all ones as CONST_VECTORs that can 
  appear
  in the IL directly) into a SUBREG_REG.
 
 So we have (subreg:OI (reg:V4DF ...) ...) for example?  What do we
 end doing with that OI mode subreg?  (why can't we simply use the
 V4DF one?)

 propagate_rtx_1 is called on
 (subreg:OI (reg:V8DI 89) 0)
 with old_rtx
 (reg:V8DI 89)
 and new_rtx
 (const_vector:V8DI [
 (const_int 0 [0])
 (const_int 0 [0])
 (const_int 0 [0])
 (const_int 0 [0])
 (const_int 0 [0])
 (const_int 0 [0])
 (const_int 0 [0])
 (const_int 0 [0])
 ])

 Seems we throw away the result in that case though, because
 gen_lowpart_common doesn't like to return low parts of VOIDmode constants
 larger if the mode is larger than double int:
 1353innermode = GET_MODE (x);
 1354if (CONST_INT_P (x)
 1355 msize * BITS_PER_UNIT = HOST_BITS_PER_WIDE_INT)
 1356  innermode = mode_for_size (HOST_BITS_PER_WIDE_INT, MODE_INT, 0);
 1357else if (innermode == VOIDmode)
 1358  innermode = mode_for_size (HOST_BITS_PER_DOUBLE_INT, MODE_INT, 0);
 we hit the last if and as mode is wider than innermode, we return NULL later
 on.

  The following patch instead of ICE handles the most common cases (all 0
  and all 1 CONST_VECTORs) and returns NULL otherwise.
  
  Before wide-int got merged, the testcase worked, though the code didn't
  bother checking anything, just created 0 or constm1_rtx for the two cases
  that could happen and if something else appeared, could just return what
  matched low TImode (or DImode for -m32).
  
  Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
 
 Looks ok to me.  Not sure if the zero/all-ones case really happens that
 much to be worth special-casing - I think you could use
 fixed_wide_intproper-size and simply see if the result is representable
 in a CONST_INT/CONST_DOUBLE?  Can you try if that works?  It looks like
 the patch may be smaller for that.

 So perhaps something like this?  Don't know how much more inefficient it is
 compared to what it used to do before.

I think the effect is that OImode integers are interesting on x86 if the
value fits in a sign-extended TImode, but not otherwise.  Is that right?
In some ways it seems like an arbitrary distinction.  If any nonzero
OImode values are potentially interesting then aren't all of them potentially
interesting?

 Or just keep the existing code and just remove the assert and instead return
 NULL whenever outer_submode is wider than MAX_BITSIZE_MODE_ANY_INT?  At
 least during propagation that will make zero change.

Sounds good to me FWIW, especially if there are no known cases where we
want a nonnull return for modes wider than MAX_BITSIZE_MODE_ANY_INT.
(I assume the assert would have triggered for them too.)

Thanks,
Richard


Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)

2014-11-19 Thread Jakub Jelinek
On Wed, Nov 19, 2014 at 10:38:57AM -0800, Mike Stump wrote:
 On Nov 19, 2014, at 4:24 AM, Richard Biener rguent...@suse.de wrote:
  On Wed, 19 Nov 2014, Jakub Jelinek wrote:
  For TARGET_SUPPORTS_WIDE_INT == 0 should be hopefully ok.  Not sure
  about TARGET_SUPPORTS_WIDE_INT != 0, can it express any generic_wide_int, 
  or
  is it still bound to wide_int (i.e. MAX_BITSIZE_MODE_ANY_INT rounded up)
  precision?  Mike?
  
  It can represent any - well, the RTX at least.  Code then using
  simple wide_int may wreck then though.
 
 So, my worry is this…  once you start in on adding support here or there
 for int modes wider than the largest supported int mode, you create a
 never ending maintenance nightmare with complex rules that one will never
 be able to keep straight.  There needs to be a single line or two, that
 explains the rules that we all agree to, then it will always be clear what
 the rule is.  The largest supported int mode is: x, has a nice, simple,
 easy to explain aspect to it.

Well, at least for TARGET_SUPPORTS_WIDE_INT == 0 my patch would always
create CONST_INTs or CONST_DOUBLEs, which all fit into
MAX_BITSIZE_MODE_ANY_INT bits, CONST_INTs are modeless, so in what wider
mode you use them doesn't matter.  For TARGET_SUPPORTS_WIDE_INT != 0
we could certainly cap it similarly, if wi::min_precision (r, SIGNED) 
MAX_BITSIZE_MODE_ANY_INT we could return NULL_RTX.

Though, following patch is just fine for me too, I don't think it will
make a significant difference:

--- gcc/simplify-rtx.c  2014-11-19 15:39:24.073113107 +0100
+++ gcc/simplify-rtx.c  2014-11-19 22:55:44.201464253 +0100
@@ -5504,6 +5504,8 @@ simplify_immed_subreg (machine_mode oute
HOST_WIDE_INT tmp[MAX_BITSIZE_MODE_ANY_INT / 
HOST_BITS_PER_WIDE_INT];
wide_int r;
 
+   if (GET_MODE_PRECISION (outer_submode)  MAX_BITSIZE_MODE_ANY_INT)
+ return NULL_RTX;
for (u = 0; u  units; u++)
  {
unsigned HOST_WIDE_INT buf = 0;
@@ -5515,10 +5517,13 @@ simplify_immed_subreg (machine_mode oute
tmp[u] = buf;
base += HOST_BITS_PER_WIDE_INT;
  }
-   gcc_assert (GET_MODE_PRECISION (outer_submode)
-   = MAX_BITSIZE_MODE_ANY_INT);
r = wide_int::from_array (tmp, units,
  GET_MODE_PRECISION (outer_submode));
+#if TARGET_SUPPORTS_WIDE_INT == 0
+   /* Make sure r will fit into CONST_INT or CONST_DOUBLE.  */
+   if (wi::min_precision (r, SIGNED)  HOST_BITS_PER_DOUBLE_INT)
+ return NULL_RTX;
+#endif
elems[elem] = immed_wide_int_const (r, outer_submode);
  }
  break;


Jakub


Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)

2014-11-19 Thread Mike Stump
On Nov 19, 2014, at 1:57 PM, Jakub Jelinek ja...@redhat.com wrote:
 Though, following patch is just fine for me too, I don't think it will
 make a significant difference:

This version is fine by me.

 --- gcc/simplify-rtx.c2014-11-19 15:39:24.073113107 +0100
 +++ gcc/simplify-rtx.c2014-11-19 22:55:44.201464253 +0100
 @@ -5504,6 +5504,8 @@ simplify_immed_subreg (machine_mode oute
   HOST_WIDE_INT tmp[MAX_BITSIZE_MODE_ANY_INT / 
 HOST_BITS_PER_WIDE_INT];
   wide_int r;
 
 + if (GET_MODE_PRECISION (outer_submode)  MAX_BITSIZE_MODE_ANY_INT)
 +   return NULL_RTX;
   for (u = 0; u  units; u++)
 {
   unsigned HOST_WIDE_INT buf = 0;
 @@ -5515,10 +5517,13 @@ simplify_immed_subreg (machine_mode oute
   tmp[u] = buf;
   base += HOST_BITS_PER_WIDE_INT;
 }
 - gcc_assert (GET_MODE_PRECISION (outer_submode)
 - = MAX_BITSIZE_MODE_ANY_INT);
   r = wide_int::from_array (tmp, units,
 GET_MODE_PRECISION (outer_submode));
 +#if TARGET_SUPPORTS_WIDE_INT == 0
 + /* Make sure r will fit into CONST_INT or CONST_DOUBLE.  */
 + if (wi::min_precision (r, SIGNED)  HOST_BITS_PER_DOUBLE_INT)
 +   return NULL_RTX;
 +#endif



[PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)

2014-11-18 Thread Jakub Jelinek
Hi!

OImode/XImode on i?86/x86_64 are not = MAX_BITSIZE_MODE_ANY_INT, because
they are never used for integer arithmetics (and there is no way
to represent all their values in RTL if not using CONST_WIDE_INT).
As the following testcase shows, simplify_immed_subreg can be called
with such modes though, e.g. trying to forward propagate a CONST_VECTOR
(i?86/x86_64 handles all zeros and all ones as CONST_VECTORs that can appear
in the IL directly) into a SUBREG_REG.
The following patch instead of ICE handles the most common cases (all 0
and all 1 CONST_VECTORs) and returns NULL otherwise.

Before wide-int got merged, the testcase worked, though the code didn't
bother checking anything, just created 0 or constm1_rtx for the two cases
that could happen and if something else appeared, could just return what
matched low TImode (or DImode for -m32).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2014-11-18  Jakub Jelinek  ja...@redhat.com

PR target/63910
* simplify-rtx.c (simplify_immed_subreg): For integer modes
wider than MAX_BITSIZE_MODE_ANY_INT, handle all zeros and all ones
and for other values return NULL_RTX.

* gcc.target/i386/pr63910.c: New test.

--- gcc/simplify-rtx.c.jj   2014-11-11 00:06:19.0 +0100
+++ gcc/simplify-rtx.c  2014-11-18 10:17:01.198668555 +0100
@@ -5505,6 +5505,21 @@ simplify_immed_subreg (machine_mode oute
HOST_WIDE_INT tmp[MAX_BITSIZE_MODE_ANY_INT / 
HOST_BITS_PER_WIDE_INT];
wide_int r;
 
+   if (GET_MODE_PRECISION (outer_submode)  MAX_BITSIZE_MODE_ANY_INT)
+ {
+   /* Handle just all zeros and all ones CONST_VECTORs in
+  this case.  */
+   if ((vp[0]  value_mask) == 0)
+ elems[elem] = const0_rtx;
+   else if ((vp[0]  value_mask) == value_mask)
+ elems[elem] = constm1_rtx;
+   else
+ return NULL_RTX;
+   for (i = value_bit; i  elem_bitsize; i += value_bit)
+ if ((vp[i / value_bit]  value_mask) != (vp[0]  value_mask))
+   return NULL_RTX;
+   break;
+ }
for (u = 0; u  units; u++)
  {
unsigned HOST_WIDE_INT buf = 0;
@@ -5516,8 +5531,6 @@ simplify_immed_subreg (machine_mode oute
tmp[u] = buf;
base += HOST_BITS_PER_WIDE_INT;
  }
-   gcc_assert (GET_MODE_PRECISION (outer_submode)
-   = MAX_BITSIZE_MODE_ANY_INT);
r = wide_int::from_array (tmp, units,
  GET_MODE_PRECISION (outer_submode));
elems[elem] = immed_wide_int_const (r, outer_submode);
--- gcc/testsuite/gcc.target/i386/pr63910.c.jj  2014-11-18 10:12:24.282659318 
+0100
+++ gcc/testsuite/gcc.target/i386/pr63910.c 2014-11-18 10:12:00.0 
+0100
@@ -0,0 +1,12 @@
+/* PR target/63910 */
+/* { dg-do compile } */
+/* { dg-options -O -mstringop-strategy=vector_loop -mavx512f } */
+
+extern void bar (float *c);
+
+void
+foo (void)
+{
+  float c[1024] = { };
+  bar (c);
+}

Jakub


Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)

2014-11-18 Thread Mike Stump
On Nov 18, 2014, at 1:52 PM, Jakub Jelinek ja...@redhat.com wrote:
 OImode/XImode on i?86/x86_64 are not = MAX_BITSIZE_MODE_ANY_INT, because
 they are never used for integer arithmetics (and there is no way
 to represent all their values in RTL if not using CONST_WIDE_INT).
 As the following testcase shows, simplify_immed_subreg can be called
 with such modes though, e.g. trying to forward propagate a CONST_VECTOR
 (i?86/x86_64 handles all zeros and all ones as CONST_VECTORs that can appear
 in the IL directly) into a SUBREG_REG.
 The following patch instead of ICE handles the most common cases (all 0
 and all 1 CONST_VECTORs) and returns NULL otherwise.
 
 Before wide-int got merged, the testcase worked, though the code didn't
 bother checking anything, just created 0 or constm1_rtx for the two cases
 that could happen and if something else appeared, could just return what
 matched low TImode (or DImode for -m32).

tmp is sized for MAX_BITSIZE_MODE_ANY_INT, but, you remove the limiter for 
units that keeps u in bounds.  Doesn’t this access 32 bytes of OImode values in 
a 16 byte data structure?

Next, from_arrary uses a wide_int, and this from the documentation applies:

   All three flavors of wide_int are represented as a vector of 
 
   HOST_WIDE_INTs.  The default and widest_int vectors contain enough elements  
 
   to hold a value of MAX_BITSIZE_MODE_ANY_INT bits.  offset_int contains only  

   enough elements to hold ADDR_MAX_PRECISION bits.  The values are stored  

   in the vector with the least significant HOST_BITS_PER_WIDE_INT bits 
 
   in element 0.
 

If you look at the code to from_arrary:

wide_int_storage::from_array (const HOST_WIDE_INT *val, unsigned int len,
  unsigned int precision, bool need_canon_p)
{
  wide_int result = wide_int::create (precision);
  result.set_len (wi::from_array (result.write_val (), val, len, precision,
  need_canon_p));


unsigned int
wi::from_array (HOST_WIDE_INT *val, const HOST_WIDE_INT *xval,
unsigned int xlen, unsigned int precision, bool need_canon)
{
  for (unsigned i = 0; i  xlen; i++)
val[i] = xval[i];

it just does a blind copy of all the xlen hunks which forms from units, and 
units is:

int units
  = (GET_MODE_BITSIZE (outer_submode) + HOST_BITS_PER_WIDE_INT - 1)
  / HOST_BITS_PER_WIDE_INT;

and GET_MODE_BITSIZE (outer_submode) is  MAX_BITSIZE_MODE_ANY_INT, right?

You can’t copy more bytes than the size of the destination has?



Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)

2014-11-18 Thread Jakub Jelinek
On Tue, Nov 18, 2014 at 03:30:56PM -0800, Mike Stump wrote:
  Before wide-int got merged, the testcase worked, though the code didn't
  bother checking anything, just created 0 or constm1_rtx for the two cases
  that could happen and if something else appeared, could just return what
  matched low TImode (or DImode for -m32).
 
 tmp is sized for MAX_BITSIZE_MODE_ANY_INT, but, you remove the limiter for
 units that keeps u in bounds.  Doesn’t this access 32 bytes of OImode
 values in a 16 byte data structure?

No, I'm not touching tmp array at all in that case, only look at vp
individual bytes.  Either they are all 0, or all 0xff, or I return NULL.

 and GET_MODE_BITSIZE (outer_submode) is  MAX_BITSIZE_MODE_ANY_INT, right?
 
 You can’t copy more bytes than the size of the destination has?

On the testcase we have
(subreg:OI (reg:V8SI 1234) 0)
and try to propagate
(const_vector:V8SI [8 x (const_int 0)])
into it.  All zeros even in OImode (or XImode) is still
(const_int 0), all ones even in those modes is (const_int -1),
which are the only constants I'm using for those ultra-wide modes,
anything else will return NULL (but not ICE).

Jakub


Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)

2014-11-18 Thread Mike Stump
On Nov 18, 2014, at 3:42 PM, Jakub Jelinek ja...@redhat.com wrote:
 No, I'm not touching tmp array at all in that case, only look at vp
 individual bytes.  Either they are all 0, or all 0xff, or I return NULL.

Oh, sorry, I misread where the break; in your code was going.  I might have 
been misled by:

 - gcc_assert (GET_MODE_PRECISION (outer_submode)
 - = MAX_BITSIZE_MODE_ANY_INT);

in your patch.  You don’t need that anymore, do you?  If not, can you remove 
this part.

The rest looks like normal rtl/vector code, I don’t see anything wrong with it.