[Bug target/96814] [11 Regression] wrong code with -march=cascadelake since r11-1445-g502d63b6d6141597
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96814 Richard Biener changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #8 from Richard Biener --- Latent issue fixed as well.
[Bug target/96814] [11 Regression] wrong code with -march=cascadelake since r11-1445-g502d63b6d6141597
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96814 --- Comment #7 from CVS Commits --- The master branch has been updated by Richard Biener : https://gcc.gnu.org/g:d16b5975ca985cbe97698479fc38b6a636886978 commit r11-3460-gd16b5975ca985cbe97698479fc38b6a636886978 Author: Richard Biener Date: Fri Sep 25 11:13:13 2020 +0200 middle-end/96814 - fix VECTOR_BOOLEAN_TYPE_P CTOR RTL expansion The RTL expansion code for CTORs doesn't handle VECTOR_BOOLEAN_TYPE_P with bit-precision elements correctly as the testcase shows before the PR97085 fix. The following makes it do the correct thing (not 100% sure for CTOR of sub-vectors due to the lack of a testcase). The alternative would be to assert such CTORs do not happen (and also add IL verification for this). The GIMPLE FE needs a way to declare the VECTOR_BOOLEAN_TYPE_P vectors (thus the C FE needs that). 2020-09-25 Richard Biener PR middle-end/96814 * expr.c (store_constructor): Handle VECTOR_BOOLEAN_TYPE_P CTORs correctly. * gcc.target/i386/pr96814.c: New testcase.
[Bug target/96814] [11 Regression] wrong code with -march=cascadelake since r11-1445-g502d63b6d6141597
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96814 --- Comment #6 from Richard Biener --- OK, so the extra folding for PR97085 also fixed this testcase, now expanding from _1 = { 0, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1 }; _2 = .VCOND_MASK (_1, { -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1 }, { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }); _3 = VIEW_CONVERT_EXPR(_2); I still have a patch for the CTOR expansion which I'll test and propose.
[Bug target/96814] [11 Regression] wrong code with -march=cascadelake since r11-1445-g502d63b6d6141597
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96814 --- Comment #5 from Richard Biener --- So the issue of the miscompile is that we do not expand the vector constructor created by vector lowering correctly. Of course we don't actually _want_ CTORs of those ... The following fixes the miscompile (but will still mishandle vector CTORs concatenating multiple other vector) diff --git a/gcc/expr.c b/gcc/expr.c index 1a15f24b397..8bb3b4caf1c 100644 --- a/gcc/expr.c +++ b/gcc/expr.c @@ -6922,7 +6922,9 @@ store_constructor (tree exp, rtx target, int cleared, poly _int64 size, insn_code icode = CODE_FOR_nothing; tree elt; tree elttype = TREE_TYPE (type); - int elt_size = tree_to_uhwi (TYPE_SIZE (elttype)); + int elt_size + = (VECTOR_BOOLEAN_TYPE_P (type) ? TYPE_PRECISION (elttype) +: tree_to_uhwi (TYPE_SIZE (elttype))); machine_mode eltmode = TYPE_MODE (elttype); HOST_WIDE_INT bitsize; HOST_WIDE_INT bitpos; @@ -7045,7 +7047,9 @@ store_constructor (tree exp, rtx target, int cleared, poly_int64 size, HOST_WIDE_INT eltpos; tree value = ce->value; - bitsize = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (value))); + bitsize = (VECTOR_BOOLEAN_TYPE_P (type) + ? elt_size + : tree_to_uhwi (TYPE_SIZE (TREE_TYPE (value; if (cleared && initializer_zerop (value)) continue; Another interesting case is IMHO typedef unsigned char __attribute__ ((__vector_size__ (32))) V; unsigned char c = 8; int main (void) { V x = ((V){c} > 0) == 0; for (unsigned i = 0; i < sizeof (x); i++) if (x[i] != (i ? 0xff : 0)) __builtin_abort(); return 0; } which shows different behavior of veclower: + vector(32) _20; + vector(32) _21; : c.0_1 = c; _11 = {c.0_1}; _2 = _11 != { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }; - _3 = VEC_COND_EXPR <_2, { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }, { -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1 }>; + _20 = _11 != { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }; + _21 = ~_20; + _3 = _21; _4 = VEC_COND_EXPR <_3, { -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1 }, { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }>; it shows that while the initial FE IL has the nested VEC_COND (maybe avoid that for boolean vectors?) we fold one case but not the other?
[Bug target/96814] [11 Regression] wrong code with -march=cascadelake since r11-1445-g502d63b6d6141597
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96814 --- Comment #4 from Martin Liška --- > > diff --git a/gcc/fold-const.c b/gcc/fold-const.c > index 1f861630225..0cc80adf632 100644 > --- a/gcc/fold-const.c > +++ b/gcc/fold-const.c > @@ -12581,7 +12581,9 @@ fold_ternary_loc (location_t loc, enum tree_code > code, t > ree type, > && tree_fits_uhwi_p (op2)) > { > tree eltype = TREE_TYPE (TREE_TYPE (arg0)); > - unsigned HOST_WIDE_INT width = tree_to_uhwi (TYPE_SIZE (eltype)); > + unsigned HOST_WIDE_INT width > + = (TREE_CODE (eltype) == BOOLEAN_TYPE > + ? TYPE_PRECISION (eltype) : tree_to_uhwi (TYPE_SIZE > (eltype))); > unsigned HOST_WIDE_INT n = tree_to_uhwi (arg1); > unsigned HOST_WIDE_INT idx = tree_to_uhwi (op2); > > diff --git a/gcc/tree-vect-generic.c b/gcc/tree-vect-generic.c > index 6d5d65195ae..e9dbe07dccc 100644 > --- a/gcc/tree-vect-generic.c > +++ b/gcc/tree-vect-generic.c > @@ -137,7 +137,7 @@ tree_vec_extract (gimple_stmt_iterator *gsi, tree type, > } >if (bitpos) > { > - if (TREE_CODE (type) == BOOLEAN_TYPE) > + if (0 && TREE_CODE (type) == BOOLEAN_TYPE) > { > tree itype > = build_nonstandard_integer_type (tree_to_uhwi (bitsize), 0); > > > makes the generated code a bit easier to follow but I guess still ends up > miscompiling things? Ok, so apparently you installed already the patch. And yes, we still miscompile it :( > > Note if I add -fdisable-tree-veclower ISEL ICEs. > > So I'd see where this VEC_COND_EXPR comes from. The vector is already build in C FE: #0 build_opaque_vector_type (innertype=, nunits=...) at /home/marxin/Programming/gcc/gcc/tree.c:10964 #1 0x0082368e in build_binary_op (location=271264, code=GT_EXPR, orig_op0=, orig_op1=, convert_p=true) at /home/marxin/Programming/gcc/gcc/c/c-typeck.c:12164 #2 0x008028e3 in parser_build_binary_op (location=271264, code=GT_EXPR, arg1=..., arg2=...) at /home/marxin/Programming/gcc/gcc/c/c-typeck.c:3755 #3 0x0084dcfd in c_parser_binary_expression (parser=0x77fc4ab0, after=, omp_atomic_lhs=) at /home/marxin/Programming/gcc/gcc/c/c-parser.c:8087 #4 0x0084e9b6 in c_parser_conditional_expression (parser=0x77fc4ab0, after=, omp_atomic_lhs=) at /home/marxin/Programming/gcc/gcc/c/c-parser.c:7691 #5 0x0084efe1 in c_parser_expr_no_commas (parser=0x77fc4ab0, after=, omp_atomic_lhs=) at /home/marxin/Programming/gcc/gcc/c/c-parser.c:7606 for (gdb) p debug_tree(orig_op0) unit-size align:8 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x775ec348 precision:8 min max > unsigned V16QI size unit-size align:128 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x7740ec78 nunits:16> side-effects arg:0 > side-effects arg:0 used unsigned ignored read decl_5 V16QI /home/marxin/Programming/testcases/pr96814.c:8:12 size unit-size align:128 warn_if_not_align:0 context initial > /home/marxin/Programming/testcases/pr96814.c:8:12 start: /home/marxin/Programming/testcases/pr96814.c:8:12 finish: /home/marxin/Programming/testcases/pr96814.c:8:12>> $10 = void (gdb) p debug_tree(orig_op1) constant 0> So the '(V){8} > 0' if I see correctly. I tried to manually change 'intt' in parser_build_binary_op to boolean_type_node, but it leads to: /home/marxin/Programming/testcases/pr96814.c:8:21: error: invalid operands to binary == (have ‘__vector(16) _Bool’ and ‘int’) 8 |x = ((V){8} > 0) == 0; | ^~ || |__vector(16) _Bool :/ Any hint how to resolve it?
[Bug target/96814] [11 Regression] wrong code with -march=cascadelake since r11-1445-g502d63b6d6141597
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96814 --- Comment #3 from Richard Biener --- It's indeed odd that we have a VEC_COND_EXPR creating a boolean vector. With GCC 10 veclower saw _1 = { 0, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1 }; _2 = VEC_COND_EXPR <_1, { -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1 }, { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }>; and _2 is vector while _1 is vector now we have _1 = { -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }; _2 = VEC_COND_EXPR <_1, { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }, { -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1 }>; _3 = VEC_COND_EXPR <_2, { -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1 }, { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }>; where _2 is vector, appearantly inverting _1. This is either new or missed folding I guess (I see this in .original already). Now, veclower better shouldn't touch this - definitely a vector ctor from SSA names isn't something we want... Though in principle we should generate correct code here or give up. So I'd say we hit a latent issue here? I see we're doing a bad job in constant folding as well, mostly because our representation of vector is special: unit-size align:8 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x766472a0 precision:1 min max > SI size unit-size align:32 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x76647348 nunits:32> constant npatterns:1 nelts-per-pattern:1 elt:0: constant 0>> see how we're having a vector type of a QImode boolean with size 8. The folding code assumes it can work with TYPE_SIZE of the component: case BIT_FIELD_REF: if (TREE_CODE (arg0) == VECTOR_CST && (type == TREE_TYPE (TREE_TYPE (arg0)) || (VECTOR_TYPE_P (type) && TREE_TYPE (type) == TREE_TYPE (TREE_TYPE (arg0 && tree_fits_uhwi_p (op1) && tree_fits_uhwi_p (op2)) { tree eltype = TREE_TYPE (TREE_TYPE (arg0)); unsigned HOST_WIDE_INT width = tree_to_uhwi (TYPE_SIZE (eltype)); unsigned HOST_WIDE_INT n = tree_to_uhwi (arg1); but that's not true for these types. diff --git a/gcc/fold-const.c b/gcc/fold-const.c index 1f861630225..0cc80adf632 100644 --- a/gcc/fold-const.c +++ b/gcc/fold-const.c @@ -12581,7 +12581,9 @@ fold_ternary_loc (location_t loc, enum tree_code code, t ree type, && tree_fits_uhwi_p (op2)) { tree eltype = TREE_TYPE (TREE_TYPE (arg0)); - unsigned HOST_WIDE_INT width = tree_to_uhwi (TYPE_SIZE (eltype)); + unsigned HOST_WIDE_INT width + = (TREE_CODE (eltype) == BOOLEAN_TYPE + ? TYPE_PRECISION (eltype) : tree_to_uhwi (TYPE_SIZE (eltype))); unsigned HOST_WIDE_INT n = tree_to_uhwi (arg1); unsigned HOST_WIDE_INT idx = tree_to_uhwi (op2); diff --git a/gcc/tree-vect-generic.c b/gcc/tree-vect-generic.c index 6d5d65195ae..e9dbe07dccc 100644 --- a/gcc/tree-vect-generic.c +++ b/gcc/tree-vect-generic.c @@ -137,7 +137,7 @@ tree_vec_extract (gimple_stmt_iterator *gsi, tree type, } if (bitpos) { - if (TREE_CODE (type) == BOOLEAN_TYPE) + if (0 && TREE_CODE (type) == BOOLEAN_TYPE) { tree itype = build_nonstandard_integer_type (tree_to_uhwi (bitsize), 0); makes the generated code a bit easier to follow but I guess still ends up miscompiling things? Note if I add -fdisable-tree-veclower ISEL ICEs. So I'd see where this VEC_COND_EXPR comes from.
[Bug target/96814] [11 Regression] wrong code with -march=cascadelake since r11-1445-g502d63b6d6141597
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96814 Martin Liška changed: What|Removed |Added CC||rguenth at gcc dot gnu.org --- Comment #2 from Martin Liška --- So before the revision we generated: _1 = { 0, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1 }; _2 = VEC_COND_EXPR <_1, { -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1 }, { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }>; _3 = VIEW_CONVERT_EXPR(_2); x = _3; but now we have: : _1 = { -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }; _24 = VIEW_CONVERT_EXPR(_1); _25 = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }; _26 = BIT_FIELD_REF <_25, 1, 0>; _27 = _26; _28 = { -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1 }; _29 = BIT_FIELD_REF <_28, 1, 0>; _30 = _29; _31 = _24 & 1; _32 = _31 != 0 ? _27 : _30; ... _151 = _24 & 32768; _152 = _151 != 0 ? _147 : _150; _2 = {_32, _40, _48, _56, _64, _72, _80, _88, _96, _104, _112, _120, _128, _136, _144, _152}; _3 = .VCOND_MASK (_2, { -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1 }, { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }); _4 = VIEW_CONVERT_EXPR(_3); x = _4; i_15 = 0; goto ; [INV] It's because veclower sees: _1 = { -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }; _2 = VEC_COND_EXPR <_1, { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }, { -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1 }>; _3 = VEC_COND_EXPR <_2, { -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1 }, { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }>; _4 = VIEW_CONVERT_EXPR(_3); and if I see correctly the creation of the mask _2 is broken in RTL (probably one can't build one from a vector costructor)? Anyway SSA_NAMEs like _128 don't have any use. @Richi: Can you please help me with that? I'm looking into it for quite some time :/
[Bug target/96814] [11 Regression] wrong code with -march=cascadelake since r11-1445-g502d63b6d6141597
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96814 Richard Biener changed: What|Removed |Added Target Milestone|--- |11.0
[Bug target/96814] [11 Regression] wrong code with -march=cascadelake since r11-1445-g502d63b6d6141597
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96814 Martin Liška changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |marxin at gcc dot gnu.org Summary|[11 Regression] wrong code |[11 Regression] wrong code |with -march=cascadelake |with -march=cascadelake ||since ||r11-1445-g502d63b6d6141597 Last reconfirmed||2020-08-27 CC||marxin at gcc dot gnu.org Ever confirmed|0 |1 Status|UNCONFIRMED |ASSIGNED --- Comment #1 from Martin Liška --- Started with my r11-1445-g502d63b6d6141597. Thank you for the report.