Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)
On Wed, Nov 19, 2014 at 02:23:47PM -0800, Mike Stump wrote: On Nov 19, 2014, at 1:57 PM, Jakub Jelinek ja...@redhat.com wrote: Though, following patch is just fine for me too, I don't think it will make a significant difference: This version is fine by me. Richard, are you ok with that too? Bootstrapped/regtested on x86_64-linux and i686-linux now. 2014-11-20 Jakub Jelinek ja...@redhat.com PR target/63910 * simplify-rtx.c (simplify_immed_subreg): Return NULL for integer modes wider than MAX_BITSIZE_MODE_ANY_INT. If not using CONST_WIDE_INT, make sure r fits into CONST_DOUBLE. * gcc.target/i386/pr63910.c: New test. --- gcc/simplify-rtx.c.jj 2014-11-19 09:17:15.491327992 +0100 +++ gcc/simplify-rtx.c 2014-11-19 12:28:16.223808178 +0100 @@ -5504,6 +5504,8 @@ simplify_immed_subreg (machine_mode oute HOST_WIDE_INT tmp[MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT]; wide_int r; + if (GET_MODE_PRECISION (outer_submode) MAX_BITSIZE_MODE_ANY_INT) + return NULL_RTX; for (u = 0; u units; u++) { unsigned HOST_WIDE_INT buf = 0; @@ -5515,10 +5517,13 @@ simplify_immed_subreg (machine_mode oute tmp[u] = buf; base += HOST_BITS_PER_WIDE_INT; } - gcc_assert (GET_MODE_PRECISION (outer_submode) - = MAX_BITSIZE_MODE_ANY_INT); r = wide_int::from_array (tmp, units, GET_MODE_PRECISION (outer_submode)); +#if TARGET_SUPPORTS_WIDE_INT == 0 + /* Make sure r will fit into CONST_INT or CONST_DOUBLE. */ + if (wi::min_precision (r, SIGNED) HOST_BITS_PER_DOUBLE_INT) + return NULL_RTX; +#endif elems[elem] = immed_wide_int_const (r, outer_submode); } break; --- gcc/testsuite/gcc.target/i386/pr63910.c.jj 2014-11-19 12:04:23.490489130 +0100 +++ gcc/testsuite/gcc.target/i386/pr63910.c 2014-11-19 12:04:23.490489130 +0100 @@ -0,0 +1,12 @@ +/* PR target/63910 */ +/* { dg-do compile } */ +/* { dg-options -O -mstringop-strategy=vector_loop -mavx512f } */ + +extern void bar (float *c); + +void +foo (void) +{ + float c[1024] = { }; + bar (c); +} Jakub
Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)
Jakub Jelinek ja...@redhat.com writes: On Wed, Nov 19, 2014 at 02:23:47PM -0800, Mike Stump wrote: On Nov 19, 2014, at 1:57 PM, Jakub Jelinek ja...@redhat.com wrote: Though, following patch is just fine for me too, I don't think it will make a significant difference: This version is fine by me. Richard, are you ok with that too? Yeah, looks good to me, thanks. Richard
Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)
On Tue, Nov 18, 2014 at 04:34:12PM -0800, Mike Stump wrote: On Nov 18, 2014, at 3:42 PM, Jakub Jelinek ja...@redhat.com wrote: No, I'm not touching tmp array at all in that case, only look at vp individual bytes. Either they are all 0, or all 0xff, or I return NULL. Oh, sorry, I misread where the break; in your code was going. I might have been misled by: - gcc_assert (GET_MODE_PRECISION (outer_submode) - = MAX_BITSIZE_MODE_ANY_INT); in your patch. You don’t need that anymore, do you? If not, can you remove this part. I thought the assert is unnecessary given the condition just a few lines above it. But can keep it, perhaps gcc_checking_assert would be enough, and hopefully compiler optimizes it away completely. The rest looks like normal rtl/vector code, I don’t see anything wrong with it. Jakub
Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)
On Tue, 18 Nov 2014, Jakub Jelinek wrote: Hi! OImode/XImode on i?86/x86_64 are not = MAX_BITSIZE_MODE_ANY_INT, because they are never used for integer arithmetics (and there is no way to represent all their values in RTL if not using CONST_WIDE_INT). As the following testcase shows, simplify_immed_subreg can be called with such modes though, e.g. trying to forward propagate a CONST_VECTOR (i?86/x86_64 handles all zeros and all ones as CONST_VECTORs that can appear in the IL directly) into a SUBREG_REG. So we have (subreg:OI (reg:V4DF ...) ...) for example? What do we end doing with that OI mode subreg? (why can't we simply use the V4DF one?) The following patch instead of ICE handles the most common cases (all 0 and all 1 CONST_VECTORs) and returns NULL otherwise. Before wide-int got merged, the testcase worked, though the code didn't bother checking anything, just created 0 or constm1_rtx for the two cases that could happen and if something else appeared, could just return what matched low TImode (or DImode for -m32). Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Looks ok to me. Not sure if the zero/all-ones case really happens that much to be worth special-casing - I think you could use fixed_wide_intproper-size and simply see if the result is representable in a CONST_INT/CONST_DOUBLE? Can you try if that works? It looks like the patch may be smaller for that. Thanks, Richard. 2014-11-18 Jakub Jelinek ja...@redhat.com PR target/63910 * simplify-rtx.c (simplify_immed_subreg): For integer modes wider than MAX_BITSIZE_MODE_ANY_INT, handle all zeros and all ones and for other values return NULL_RTX. * gcc.target/i386/pr63910.c: New test. --- gcc/simplify-rtx.c.jj 2014-11-11 00:06:19.0 +0100 +++ gcc/simplify-rtx.c2014-11-18 10:17:01.198668555 +0100 @@ -5505,6 +5505,21 @@ simplify_immed_subreg (machine_mode oute HOST_WIDE_INT tmp[MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT]; wide_int r; + if (GET_MODE_PRECISION (outer_submode) MAX_BITSIZE_MODE_ANY_INT) + { + /* Handle just all zeros and all ones CONST_VECTORs in +this case. */ + if ((vp[0] value_mask) == 0) + elems[elem] = const0_rtx; + else if ((vp[0] value_mask) == value_mask) + elems[elem] = constm1_rtx; + else + return NULL_RTX; + for (i = value_bit; i elem_bitsize; i += value_bit) + if ((vp[i / value_bit] value_mask) != (vp[0] value_mask)) + return NULL_RTX; + break; + } for (u = 0; u units; u++) { unsigned HOST_WIDE_INT buf = 0; @@ -5516,8 +5531,6 @@ simplify_immed_subreg (machine_mode oute tmp[u] = buf; base += HOST_BITS_PER_WIDE_INT; } - gcc_assert (GET_MODE_PRECISION (outer_submode) - = MAX_BITSIZE_MODE_ANY_INT); r = wide_int::from_array (tmp, units, GET_MODE_PRECISION (outer_submode)); elems[elem] = immed_wide_int_const (r, outer_submode); --- gcc/testsuite/gcc.target/i386/pr63910.c.jj2014-11-18 10:12:24.282659318 +0100 +++ gcc/testsuite/gcc.target/i386/pr63910.c 2014-11-18 10:12:00.0 +0100 @@ -0,0 +1,12 @@ +/* PR target/63910 */ +/* { dg-do compile } */ +/* { dg-options -O -mstringop-strategy=vector_loop -mavx512f } */ + +extern void bar (float *c); + +void +foo (void) +{ + float c[1024] = { }; + bar (c); +} Jakub -- Richard Biener rguent...@suse.de SUSE LINUX GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendoerffer, HRB 21284 (AG Nuernberg) Maxfeldstrasse 5, 90409 Nuernberg, Germany
Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)
On Wed, Nov 19, 2014 at 09:59:45AM +0100, Richard Biener wrote: OImode/XImode on i?86/x86_64 are not = MAX_BITSIZE_MODE_ANY_INT, because they are never used for integer arithmetics (and there is no way to represent all their values in RTL if not using CONST_WIDE_INT). As the following testcase shows, simplify_immed_subreg can be called with such modes though, e.g. trying to forward propagate a CONST_VECTOR (i?86/x86_64 handles all zeros and all ones as CONST_VECTORs that can appear in the IL directly) into a SUBREG_REG. So we have (subreg:OI (reg:V4DF ...) ...) for example? What do we end doing with that OI mode subreg? (why can't we simply use the V4DF one?) propagate_rtx_1 is called on (subreg:OI (reg:V8DI 89) 0) with old_rtx (reg:V8DI 89) and new_rtx (const_vector:V8DI [ (const_int 0 [0]) (const_int 0 [0]) (const_int 0 [0]) (const_int 0 [0]) (const_int 0 [0]) (const_int 0 [0]) (const_int 0 [0]) (const_int 0 [0]) ]) Seems we throw away the result in that case though, because gen_lowpart_common doesn't like to return low parts of VOIDmode constants larger if the mode is larger than double int: 1353 innermode = GET_MODE (x); 1354 if (CONST_INT_P (x) 1355 msize * BITS_PER_UNIT = HOST_BITS_PER_WIDE_INT) 1356innermode = mode_for_size (HOST_BITS_PER_WIDE_INT, MODE_INT, 0); 1357 else if (innermode == VOIDmode) 1358innermode = mode_for_size (HOST_BITS_PER_DOUBLE_INT, MODE_INT, 0); we hit the last if and as mode is wider than innermode, we return NULL later on. The following patch instead of ICE handles the most common cases (all 0 and all 1 CONST_VECTORs) and returns NULL otherwise. Before wide-int got merged, the testcase worked, though the code didn't bother checking anything, just created 0 or constm1_rtx for the two cases that could happen and if something else appeared, could just return what matched low TImode (or DImode for -m32). Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Looks ok to me. Not sure if the zero/all-ones case really happens that much to be worth special-casing - I think you could use fixed_wide_intproper-size and simply see if the result is representable in a CONST_INT/CONST_DOUBLE? Can you try if that works? It looks like the patch may be smaller for that. So perhaps something like this? Don't know how much more inefficient it is compared to what it used to do before. Or just keep the existing code and just remove the assert and instead return NULL whenever outer_submode is wider than MAX_BITSIZE_MODE_ANY_INT? At least during propagation that will make zero change. Though, in that case I have still doubts about the current code handling right modes wider than HOST_BITS_PER_DOUBLE_INT but smaller than MAX_BITSIZE_MODE_ANY_INT (none on i?86/x86_64). If TARGET_SUPPORTS_WIDE_INT == 0, we still silently throw away the upper bits, don't we? BTW, to Mike, the assert has been misplaced, there was first buffer overflow and only after that the assert. 2014-11-19 Jakub Jelinek ja...@redhat.com PR target/63910 * simplify-rtx.c (simplify_immed_subreg): Handle integer modes wider than MAX_BITSIZE_MODE_ANY_INT. * gcc.target/i386/pr63910.c: New test. --- gcc/simplify-rtx.c.jj 2014-11-19 09:17:15.491327992 +0100 +++ gcc/simplify-rtx.c 2014-11-19 12:28:16.223808178 +0100 @@ -5501,8 +5501,12 @@ simplify_immed_subreg (machine_mode oute int units = (GET_MODE_BITSIZE (outer_submode) + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT; - HOST_WIDE_INT tmp[MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT]; - wide_int r; + const int tmpsize = (MAX_BITSIZE_MODE_ANY_MODE ++ HOST_BITS_PER_WIDE_INT - 1) + / HOST_BITS_PER_WIDE_INT; + HOST_WIDE_INT tmp[tmpsize]; + typedef FIXED_WIDE_INT (tmpsize * HOST_BITS_PER_WIDE_INT) imm_int; + imm_int r; for (u = 0; u units; u++) { @@ -5515,10 +5519,12 @@ simplify_immed_subreg (machine_mode oute tmp[u] = buf; base += HOST_BITS_PER_WIDE_INT; } - gcc_assert (GET_MODE_PRECISION (outer_submode) - = MAX_BITSIZE_MODE_ANY_INT); - r = wide_int::from_array (tmp, units, - GET_MODE_PRECISION (outer_submode)); + r = imm_int::from_array (tmp, units, +GET_MODE_PRECISION (outer_submode)); +#if TARGET_SUPPORTS_WIDE_INT == 0 + if (wi::min_precision (r, SIGNED) HOST_BITS_PER_DOUBLE_INT) + return NULL_RTX; +#endif elems[elem] = immed_wide_int_const (r, outer_submode); } break; --- gcc/testsuite/gcc.target/i386/pr63910.c.jj 2014-11-19
Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)
On Wed, 19 Nov 2014, Jakub Jelinek wrote: On Wed, Nov 19, 2014 at 09:59:45AM +0100, Richard Biener wrote: OImode/XImode on i?86/x86_64 are not = MAX_BITSIZE_MODE_ANY_INT, because they are never used for integer arithmetics (and there is no way to represent all their values in RTL if not using CONST_WIDE_INT). As the following testcase shows, simplify_immed_subreg can be called with such modes though, e.g. trying to forward propagate a CONST_VECTOR (i?86/x86_64 handles all zeros and all ones as CONST_VECTORs that can appear in the IL directly) into a SUBREG_REG. So we have (subreg:OI (reg:V4DF ...) ...) for example? What do we end doing with that OI mode subreg? (why can't we simply use the V4DF one?) propagate_rtx_1 is called on (subreg:OI (reg:V8DI 89) 0) with old_rtx (reg:V8DI 89) and new_rtx (const_vector:V8DI [ (const_int 0 [0]) (const_int 0 [0]) (const_int 0 [0]) (const_int 0 [0]) (const_int 0 [0]) (const_int 0 [0]) (const_int 0 [0]) (const_int 0 [0]) ]) Seems we throw away the result in that case though, because gen_lowpart_common doesn't like to return low parts of VOIDmode constants larger if the mode is larger than double int: 1353innermode = GET_MODE (x); 1354if (CONST_INT_P (x) 1355 msize * BITS_PER_UNIT = HOST_BITS_PER_WIDE_INT) 1356 innermode = mode_for_size (HOST_BITS_PER_WIDE_INT, MODE_INT, 0); 1357else if (innermode == VOIDmode) 1358 innermode = mode_for_size (HOST_BITS_PER_DOUBLE_INT, MODE_INT, 0); we hit the last if and as mode is wider than innermode, we return NULL later on. The following patch instead of ICE handles the most common cases (all 0 and all 1 CONST_VECTORs) and returns NULL otherwise. Before wide-int got merged, the testcase worked, though the code didn't bother checking anything, just created 0 or constm1_rtx for the two cases that could happen and if something else appeared, could just return what matched low TImode (or DImode for -m32). Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Looks ok to me. Not sure if the zero/all-ones case really happens that much to be worth special-casing - I think you could use fixed_wide_intproper-size and simply see if the result is representable in a CONST_INT/CONST_DOUBLE? Can you try if that works? It looks like the patch may be smaller for that. So perhaps something like this? Don't know how much more inefficient it is compared to what it used to do before. Yes, that looks good. Or just keep the existing code and just remove the assert and instead return NULL whenever outer_submode is wider than MAX_BITSIZE_MODE_ANY_INT? At least during propagation that will make zero change. Though, in that case I have still doubts about the current code handling right modes wider than HOST_BITS_PER_DOUBLE_INT but smaller than MAX_BITSIZE_MODE_ANY_INT (none on i?86/x86_64). If TARGET_SUPPORTS_WIDE_INT == 0, we still silently throw away the upper bits, don't we? Well - not with your added check, no? I'd say the patch is ok. Thanks, Richard. BTW, to Mike, the assert has been misplaced, there was first buffer overflow and only after that the assert. 2014-11-19 Jakub Jelinek ja...@redhat.com PR target/63910 * simplify-rtx.c (simplify_immed_subreg): Handle integer modes wider than MAX_BITSIZE_MODE_ANY_INT. * gcc.target/i386/pr63910.c: New test. --- gcc/simplify-rtx.c.jj 2014-11-19 09:17:15.491327992 +0100 +++ gcc/simplify-rtx.c2014-11-19 12:28:16.223808178 +0100 @@ -5501,8 +5501,12 @@ simplify_immed_subreg (machine_mode oute int units = (GET_MODE_BITSIZE (outer_submode) + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT; - HOST_WIDE_INT tmp[MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT]; - wide_int r; + const int tmpsize = (MAX_BITSIZE_MODE_ANY_MODE + + HOST_BITS_PER_WIDE_INT - 1) + / HOST_BITS_PER_WIDE_INT; + HOST_WIDE_INT tmp[tmpsize]; + typedef FIXED_WIDE_INT (tmpsize * HOST_BITS_PER_WIDE_INT) imm_int; + imm_int r; for (u = 0; u units; u++) { @@ -5515,10 +5519,12 @@ simplify_immed_subreg (machine_mode oute tmp[u] = buf; base += HOST_BITS_PER_WIDE_INT; } - gcc_assert (GET_MODE_PRECISION (outer_submode) - = MAX_BITSIZE_MODE_ANY_INT); - r = wide_int::from_array (tmp, units, - GET_MODE_PRECISION (outer_submode)); + r = imm_int::from_array (tmp, units, + GET_MODE_PRECISION (outer_submode)); +#if TARGET_SUPPORTS_WIDE_INT == 0 + if (wi::min_precision (r, SIGNED) HOST_BITS_PER_DOUBLE_INT) +
Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)
On Wed, Nov 19, 2014 at 12:59:06PM +0100, Richard Biener wrote: So perhaps something like this? Don't know how much more inefficient it is compared to what it used to do before. Yes, that looks good. Or just keep the existing code and just remove the assert and instead return NULL whenever outer_submode is wider than MAX_BITSIZE_MODE_ANY_INT? At least during propagation that will make zero change. Though, in that case I have still doubts about the current code handling right modes wider than HOST_BITS_PER_DOUBLE_INT but smaller than MAX_BITSIZE_MODE_ANY_INT (none on i?86/x86_64). If TARGET_SUPPORTS_WIDE_INT == 0, we still silently throw away the upper bits, don't we? Well - not with your added check, no? For TARGET_SUPPORTS_WIDE_INT == 0 should be hopefully ok. Not sure about TARGET_SUPPORTS_WIDE_INT != 0, can it express any generic_wide_int, or is it still bound to wide_int (i.e. MAX_BITSIZE_MODE_ANY_INT rounded up) precision? Mike? Jakub
Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)
On Wed, 19 Nov 2014, Jakub Jelinek wrote: On Wed, Nov 19, 2014 at 12:59:06PM +0100, Richard Biener wrote: So perhaps something like this? Don't know how much more inefficient it is compared to what it used to do before. Yes, that looks good. Or just keep the existing code and just remove the assert and instead return NULL whenever outer_submode is wider than MAX_BITSIZE_MODE_ANY_INT? At least during propagation that will make zero change. Though, in that case I have still doubts about the current code handling right modes wider than HOST_BITS_PER_DOUBLE_INT but smaller than MAX_BITSIZE_MODE_ANY_INT (none on i?86/x86_64). If TARGET_SUPPORTS_WIDE_INT == 0, we still silently throw away the upper bits, don't we? Well - not with your added check, no? For TARGET_SUPPORTS_WIDE_INT == 0 should be hopefully ok. Not sure about TARGET_SUPPORTS_WIDE_INT != 0, can it express any generic_wide_int, or is it still bound to wide_int (i.e. MAX_BITSIZE_MODE_ANY_INT rounded up) precision? Mike? It can represent any - well, the RTX at least. Code then using simple wide_int may wreck then though. Richard.
Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)
On Nov 19, 2014, at 4:24 AM, Richard Biener rguent...@suse.de wrote: On Wed, 19 Nov 2014, Jakub Jelinek wrote: For TARGET_SUPPORTS_WIDE_INT == 0 should be hopefully ok. Not sure about TARGET_SUPPORTS_WIDE_INT != 0, can it express any generic_wide_int, or is it still bound to wide_int (i.e. MAX_BITSIZE_MODE_ANY_INT rounded up) precision? Mike? It can represent any - well, the RTX at least. Code then using simple wide_int may wreck then though. So, my worry is this… once you start in on adding support here or there for int modes wider than the largest supported int mode, you create a never ending maintenance nightmare with complex rules that one will never be able to keep straight. There needs to be a single line or two, that explains the rules that we all agree to, then it will always be clear what the rule is. The largest supported int mode is: x, has a nice, simple, easy to explain aspect to it. If one looks at the rtl optimization code, and all the code that cracks rtl integer constants and plays with them, I don’t think _all_ that code is larger than max safe. I have reason to believe it is all reasonable safe up to the max however; that was our contribution in wide-int. Originally I was going to say, this patch needs a int larger than MAX_BITSIZE_MODE_ANY_INT person to review it, meaning, that isn’t me. It still isn’t me; so, I defer that to someone that wants to try and keep those rules straight and memorize them and ensure the totality of the compiler actually works for theses cases by design. Luckily, that only affects x86, as it is the only port to try and slice and dice like this. So, I’ll defer to others on that.
Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)
On Nov 19, 2014, at 4:07 AM, Jakub Jelinek ja...@redhat.com wrote: For TARGET_SUPPORTS_WIDE_INT == 0 should be hopefully ok. Not sure about TARGET_SUPPORTS_WIDE_INT != 0, can it express any generic_wide_int, or is it still bound to wide_int (i.e. MAX_BITSIZE_MODE_ANY_INT rounded up) precision? Mike? So, you can generate any finite size const_int you want. That is safe. The question, what does the entire rest of the compiler do with these bigger than max things, well that’s the part that I’ll defer to others on. Generally when Kenny and I did the code, I think we had a tendency to treat wide_int as a maximal size.
Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)
Jakub Jelinek ja...@redhat.com writes: On Wed, Nov 19, 2014 at 09:59:45AM +0100, Richard Biener wrote: OImode/XImode on i?86/x86_64 are not = MAX_BITSIZE_MODE_ANY_INT, because they are never used for integer arithmetics (and there is no way to represent all their values in RTL if not using CONST_WIDE_INT). As the following testcase shows, simplify_immed_subreg can be called with such modes though, e.g. trying to forward propagate a CONST_VECTOR (i?86/x86_64 handles all zeros and all ones as CONST_VECTORs that can appear in the IL directly) into a SUBREG_REG. So we have (subreg:OI (reg:V4DF ...) ...) for example? What do we end doing with that OI mode subreg? (why can't we simply use the V4DF one?) propagate_rtx_1 is called on (subreg:OI (reg:V8DI 89) 0) with old_rtx (reg:V8DI 89) and new_rtx (const_vector:V8DI [ (const_int 0 [0]) (const_int 0 [0]) (const_int 0 [0]) (const_int 0 [0]) (const_int 0 [0]) (const_int 0 [0]) (const_int 0 [0]) (const_int 0 [0]) ]) Seems we throw away the result in that case though, because gen_lowpart_common doesn't like to return low parts of VOIDmode constants larger if the mode is larger than double int: 1353innermode = GET_MODE (x); 1354if (CONST_INT_P (x) 1355 msize * BITS_PER_UNIT = HOST_BITS_PER_WIDE_INT) 1356 innermode = mode_for_size (HOST_BITS_PER_WIDE_INT, MODE_INT, 0); 1357else if (innermode == VOIDmode) 1358 innermode = mode_for_size (HOST_BITS_PER_DOUBLE_INT, MODE_INT, 0); we hit the last if and as mode is wider than innermode, we return NULL later on. The following patch instead of ICE handles the most common cases (all 0 and all 1 CONST_VECTORs) and returns NULL otherwise. Before wide-int got merged, the testcase worked, though the code didn't bother checking anything, just created 0 or constm1_rtx for the two cases that could happen and if something else appeared, could just return what matched low TImode (or DImode for -m32). Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Looks ok to me. Not sure if the zero/all-ones case really happens that much to be worth special-casing - I think you could use fixed_wide_intproper-size and simply see if the result is representable in a CONST_INT/CONST_DOUBLE? Can you try if that works? It looks like the patch may be smaller for that. So perhaps something like this? Don't know how much more inefficient it is compared to what it used to do before. I think the effect is that OImode integers are interesting on x86 if the value fits in a sign-extended TImode, but not otherwise. Is that right? In some ways it seems like an arbitrary distinction. If any nonzero OImode values are potentially interesting then aren't all of them potentially interesting? Or just keep the existing code and just remove the assert and instead return NULL whenever outer_submode is wider than MAX_BITSIZE_MODE_ANY_INT? At least during propagation that will make zero change. Sounds good to me FWIW, especially if there are no known cases where we want a nonnull return for modes wider than MAX_BITSIZE_MODE_ANY_INT. (I assume the assert would have triggered for them too.) Thanks, Richard
Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)
On Wed, Nov 19, 2014 at 10:38:57AM -0800, Mike Stump wrote: On Nov 19, 2014, at 4:24 AM, Richard Biener rguent...@suse.de wrote: On Wed, 19 Nov 2014, Jakub Jelinek wrote: For TARGET_SUPPORTS_WIDE_INT == 0 should be hopefully ok. Not sure about TARGET_SUPPORTS_WIDE_INT != 0, can it express any generic_wide_int, or is it still bound to wide_int (i.e. MAX_BITSIZE_MODE_ANY_INT rounded up) precision? Mike? It can represent any - well, the RTX at least. Code then using simple wide_int may wreck then though. So, my worry is this… once you start in on adding support here or there for int modes wider than the largest supported int mode, you create a never ending maintenance nightmare with complex rules that one will never be able to keep straight. There needs to be a single line or two, that explains the rules that we all agree to, then it will always be clear what the rule is. The largest supported int mode is: x, has a nice, simple, easy to explain aspect to it. Well, at least for TARGET_SUPPORTS_WIDE_INT == 0 my patch would always create CONST_INTs or CONST_DOUBLEs, which all fit into MAX_BITSIZE_MODE_ANY_INT bits, CONST_INTs are modeless, so in what wider mode you use them doesn't matter. For TARGET_SUPPORTS_WIDE_INT != 0 we could certainly cap it similarly, if wi::min_precision (r, SIGNED) MAX_BITSIZE_MODE_ANY_INT we could return NULL_RTX. Though, following patch is just fine for me too, I don't think it will make a significant difference: --- gcc/simplify-rtx.c 2014-11-19 15:39:24.073113107 +0100 +++ gcc/simplify-rtx.c 2014-11-19 22:55:44.201464253 +0100 @@ -5504,6 +5504,8 @@ simplify_immed_subreg (machine_mode oute HOST_WIDE_INT tmp[MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT]; wide_int r; + if (GET_MODE_PRECISION (outer_submode) MAX_BITSIZE_MODE_ANY_INT) + return NULL_RTX; for (u = 0; u units; u++) { unsigned HOST_WIDE_INT buf = 0; @@ -5515,10 +5517,13 @@ simplify_immed_subreg (machine_mode oute tmp[u] = buf; base += HOST_BITS_PER_WIDE_INT; } - gcc_assert (GET_MODE_PRECISION (outer_submode) - = MAX_BITSIZE_MODE_ANY_INT); r = wide_int::from_array (tmp, units, GET_MODE_PRECISION (outer_submode)); +#if TARGET_SUPPORTS_WIDE_INT == 0 + /* Make sure r will fit into CONST_INT or CONST_DOUBLE. */ + if (wi::min_precision (r, SIGNED) HOST_BITS_PER_DOUBLE_INT) + return NULL_RTX; +#endif elems[elem] = immed_wide_int_const (r, outer_submode); } break; Jakub
Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)
On Nov 19, 2014, at 1:57 PM, Jakub Jelinek ja...@redhat.com wrote: Though, following patch is just fine for me too, I don't think it will make a significant difference: This version is fine by me. --- gcc/simplify-rtx.c2014-11-19 15:39:24.073113107 +0100 +++ gcc/simplify-rtx.c2014-11-19 22:55:44.201464253 +0100 @@ -5504,6 +5504,8 @@ simplify_immed_subreg (machine_mode oute HOST_WIDE_INT tmp[MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT]; wide_int r; + if (GET_MODE_PRECISION (outer_submode) MAX_BITSIZE_MODE_ANY_INT) + return NULL_RTX; for (u = 0; u units; u++) { unsigned HOST_WIDE_INT buf = 0; @@ -5515,10 +5517,13 @@ simplify_immed_subreg (machine_mode oute tmp[u] = buf; base += HOST_BITS_PER_WIDE_INT; } - gcc_assert (GET_MODE_PRECISION (outer_submode) - = MAX_BITSIZE_MODE_ANY_INT); r = wide_int::from_array (tmp, units, GET_MODE_PRECISION (outer_submode)); +#if TARGET_SUPPORTS_WIDE_INT == 0 + /* Make sure r will fit into CONST_INT or CONST_DOUBLE. */ + if (wi::min_precision (r, SIGNED) HOST_BITS_PER_DOUBLE_INT) + return NULL_RTX; +#endif
[PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)
Hi! OImode/XImode on i?86/x86_64 are not = MAX_BITSIZE_MODE_ANY_INT, because they are never used for integer arithmetics (and there is no way to represent all their values in RTL if not using CONST_WIDE_INT). As the following testcase shows, simplify_immed_subreg can be called with such modes though, e.g. trying to forward propagate a CONST_VECTOR (i?86/x86_64 handles all zeros and all ones as CONST_VECTORs that can appear in the IL directly) into a SUBREG_REG. The following patch instead of ICE handles the most common cases (all 0 and all 1 CONST_VECTORs) and returns NULL otherwise. Before wide-int got merged, the testcase worked, though the code didn't bother checking anything, just created 0 or constm1_rtx for the two cases that could happen and if something else appeared, could just return what matched low TImode (or DImode for -m32). Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2014-11-18 Jakub Jelinek ja...@redhat.com PR target/63910 * simplify-rtx.c (simplify_immed_subreg): For integer modes wider than MAX_BITSIZE_MODE_ANY_INT, handle all zeros and all ones and for other values return NULL_RTX. * gcc.target/i386/pr63910.c: New test. --- gcc/simplify-rtx.c.jj 2014-11-11 00:06:19.0 +0100 +++ gcc/simplify-rtx.c 2014-11-18 10:17:01.198668555 +0100 @@ -5505,6 +5505,21 @@ simplify_immed_subreg (machine_mode oute HOST_WIDE_INT tmp[MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT]; wide_int r; + if (GET_MODE_PRECISION (outer_submode) MAX_BITSIZE_MODE_ANY_INT) + { + /* Handle just all zeros and all ones CONST_VECTORs in + this case. */ + if ((vp[0] value_mask) == 0) + elems[elem] = const0_rtx; + else if ((vp[0] value_mask) == value_mask) + elems[elem] = constm1_rtx; + else + return NULL_RTX; + for (i = value_bit; i elem_bitsize; i += value_bit) + if ((vp[i / value_bit] value_mask) != (vp[0] value_mask)) + return NULL_RTX; + break; + } for (u = 0; u units; u++) { unsigned HOST_WIDE_INT buf = 0; @@ -5516,8 +5531,6 @@ simplify_immed_subreg (machine_mode oute tmp[u] = buf; base += HOST_BITS_PER_WIDE_INT; } - gcc_assert (GET_MODE_PRECISION (outer_submode) - = MAX_BITSIZE_MODE_ANY_INT); r = wide_int::from_array (tmp, units, GET_MODE_PRECISION (outer_submode)); elems[elem] = immed_wide_int_const (r, outer_submode); --- gcc/testsuite/gcc.target/i386/pr63910.c.jj 2014-11-18 10:12:24.282659318 +0100 +++ gcc/testsuite/gcc.target/i386/pr63910.c 2014-11-18 10:12:00.0 +0100 @@ -0,0 +1,12 @@ +/* PR target/63910 */ +/* { dg-do compile } */ +/* { dg-options -O -mstringop-strategy=vector_loop -mavx512f } */ + +extern void bar (float *c); + +void +foo (void) +{ + float c[1024] = { }; + bar (c); +} Jakub
Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)
On Nov 18, 2014, at 1:52 PM, Jakub Jelinek ja...@redhat.com wrote: OImode/XImode on i?86/x86_64 are not = MAX_BITSIZE_MODE_ANY_INT, because they are never used for integer arithmetics (and there is no way to represent all their values in RTL if not using CONST_WIDE_INT). As the following testcase shows, simplify_immed_subreg can be called with such modes though, e.g. trying to forward propagate a CONST_VECTOR (i?86/x86_64 handles all zeros and all ones as CONST_VECTORs that can appear in the IL directly) into a SUBREG_REG. The following patch instead of ICE handles the most common cases (all 0 and all 1 CONST_VECTORs) and returns NULL otherwise. Before wide-int got merged, the testcase worked, though the code didn't bother checking anything, just created 0 or constm1_rtx for the two cases that could happen and if something else appeared, could just return what matched low TImode (or DImode for -m32). tmp is sized for MAX_BITSIZE_MODE_ANY_INT, but, you remove the limiter for units that keeps u in bounds. Doesn’t this access 32 bytes of OImode values in a 16 byte data structure? Next, from_arrary uses a wide_int, and this from the documentation applies: All three flavors of wide_int are represented as a vector of HOST_WIDE_INTs. The default and widest_int vectors contain enough elements to hold a value of MAX_BITSIZE_MODE_ANY_INT bits. offset_int contains only enough elements to hold ADDR_MAX_PRECISION bits. The values are stored in the vector with the least significant HOST_BITS_PER_WIDE_INT bits in element 0. If you look at the code to from_arrary: wide_int_storage::from_array (const HOST_WIDE_INT *val, unsigned int len, unsigned int precision, bool need_canon_p) { wide_int result = wide_int::create (precision); result.set_len (wi::from_array (result.write_val (), val, len, precision, need_canon_p)); unsigned int wi::from_array (HOST_WIDE_INT *val, const HOST_WIDE_INT *xval, unsigned int xlen, unsigned int precision, bool need_canon) { for (unsigned i = 0; i xlen; i++) val[i] = xval[i]; it just does a blind copy of all the xlen hunks which forms from units, and units is: int units = (GET_MODE_BITSIZE (outer_submode) + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT; and GET_MODE_BITSIZE (outer_submode) is MAX_BITSIZE_MODE_ANY_INT, right? You can’t copy more bytes than the size of the destination has?
Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)
On Tue, Nov 18, 2014 at 03:30:56PM -0800, Mike Stump wrote: Before wide-int got merged, the testcase worked, though the code didn't bother checking anything, just created 0 or constm1_rtx for the two cases that could happen and if something else appeared, could just return what matched low TImode (or DImode for -m32). tmp is sized for MAX_BITSIZE_MODE_ANY_INT, but, you remove the limiter for units that keeps u in bounds. Doesn’t this access 32 bytes of OImode values in a 16 byte data structure? No, I'm not touching tmp array at all in that case, only look at vp individual bytes. Either they are all 0, or all 0xff, or I return NULL. and GET_MODE_BITSIZE (outer_submode) is MAX_BITSIZE_MODE_ANY_INT, right? You can’t copy more bytes than the size of the destination has? On the testcase we have (subreg:OI (reg:V8SI 1234) 0) and try to propagate (const_vector:V8SI [8 x (const_int 0)]) into it. All zeros even in OImode (or XImode) is still (const_int 0), all ones even in those modes is (const_int -1), which are the only constants I'm using for those ultra-wide modes, anything else will return NULL (but not ICE). Jakub
Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)
On Nov 18, 2014, at 3:42 PM, Jakub Jelinek ja...@redhat.com wrote: No, I'm not touching tmp array at all in that case, only look at vp individual bytes. Either they are all 0, or all 0xff, or I return NULL. Oh, sorry, I misread where the break; in your code was going. I might have been misled by: - gcc_assert (GET_MODE_PRECISION (outer_submode) - = MAX_BITSIZE_MODE_ANY_INT); in your patch. You don’t need that anymore, do you? If not, can you remove this part. The rest looks like normal rtl/vector code, I don’t see anything wrong with it.