Re: [r13-7135 Regression] FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 on Linux/x86_64
On 14/04/2023 12:47, Richard Biener wrote: On Fri, Apr 14, 2023 at 11:42 AM Andre Vieira (lists) wrote: Ah, but then vect_get_smallest_scalar_type should simply ignore that pointer in MASK_CALL. It should only look at the arguments relevant for vectorization. So diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc index 8daf7bd7dd3..70bb8595951 100644 --- a/gcc/tree-vect-data-refs.cc +++ b/gcc/tree-vect-data-refs.cc @@ -165,6 +165,8 @@ vect_get_smallest_scalar_type (stmt_vec_info stmt_info, tree scalar_type) } else if (internal_fn_mask_index (ifn) == 0) i = 1; + else if (ifn == IFN_MASK_CALL) + i = 1; } if (i < gimple_call_num_args (call)) { ? The problem with that is that not all arguments of a simdclone are mapped to vectors, for instance if one of them is 'uniform' then it's passed as a single scalar as uniform says it won't change between iteration, thus it has no effect on VF. End of the day, the logic behind VF selection for simd clones is very different to that of a normal statement. Take for instance the logic behind simdlen selection for SVE, its not yet in trunk, where we select the number of elements to handle based on the widest element type. So say a: #pragma omp declare simd int foo (int, long long); would have 4 NEON simd clones with (based on Narrowest element type): /* 128 bit */ V4SI foo (V4SI, V4DI) V4SI foo (V4SI, V4DI, V4BI) /* 64 bit */ V2SI foo (V2SI, V2DI) V2SI foo (V2SI, V2DI, V2BI) and SVE (based on Widest element type): VNx2SI foo (VN2xSI, VN2xDI, VN2xBI) This also shows how basing this on the return value wouldn't quite work either for SVE :( For now I think all targets base the selection on narrowest element type and I think we could get away with only looking at the types that map to vectors using the simdclone information, that's a pretty straight forward change to the existing code I think. And I'll make sure to come up with a way of dealing with this in a more robust way for SVE to work. I'll try to write something up, but probably won't be able to post it until Monday. Kind regards, Andre
Re: [r13-7135 Regression] FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 on Linux/x86_64
On Fri, Apr 14, 2023 at 11:42 AM Andre Vieira (lists) wrote: > > > > On 14/04/2023 10:09, Richard Biener wrote: > > On Fri, Apr 14, 2023 at 10:43 AM Andre Vieira (lists) > > wrote: > >> > >> Resending this to everyone (sorry for the double send Richard). > >> > >> On 14/04/2023 09:15, Andre Vieira (lists) wrote: > >> > > >> > > >> > On 14/04/2023 07:55, Richard Biener wrote: > >> >> On Thu, Apr 13, 2023 at 4:25 PM Andre Vieira (lists) > >> >> wrote: > >> >>> > >> >>> > >> >>> > >> >>> On 13/04/2023 15:00, Richard Biener wrote: > >> On Thu, Apr 13, 2023 at 3:00 PM Andre Vieira (lists) via Gcc-patches > >> wrote: > >> > > >> > > >> > > >> >>> > >> >>> But that's not it, I've been looking at it, and there is code in > >> place > >> >>> that does what I expected which is defer the choice of vectype for > >> simd > >> >>> clones until vectorizable_simd_clone_call, unfortunately it has a > >> >>> mistaken assumption that simdclones don't return :/ > >> >> > >> >> I think that's not it - when the SIMD clone returns a vector we have > >> to > >> >> determine the vector type in this function. We cannot defer this. > >> > > >> > What's 'this function' here, do you mean we have to determine the > >> > vectype in 'vect_get_vector_types_for_stmt' & > >> > 'vect_determine_vf_for_stmt' ? > > > > Yes. > > > >> Because at that time we don't yet know > >> > what clone we will be using, this choice is done inside > >> > vectorizable_simd_clone_call. In fact, to choose the simd clone, we > >> need > >> > to know the vf as that has to be a multiple of the chosen clone's > >> > simdlen. So we simply can't use the simdclone's types (as that depends > >> > on the simdlen) to choose the vf because the choice of simdlend depends > >> > on the vf. And there was already code in place to handle this, > >> > unfortunately that code was wrong and had the wrong assumption that > >> > simdclones didn't return (probably was true at some point and > >> bitrotted). > > > > But to compute the VF we need to know the vector types! We're only > > calling vectorizable_* when the VF is final. That said, the code you quote: > > > >> >> > >> >>> see vect_get_vector_types_for_stmt: > >> >>> ... > >> >>> if (gimple_get_lhs (stmt) == NULL_TREE > > > > is just for the case of a function without return value. For this case > > it's OK to do nothing - 'vectype' is the vector type of all vector defs > > a stmt produces. > > > > For calls with a LHS it should fall through to generic code doing > > get_vectype_for_scalar_type on the LHS type. > > I think that may work, but right now it will still go and look at the > arguments of the call and use the smallest type among them to adjust the > nunits (in 'vect_get_vector_types_for_stmt'). > > In fact (this is just for illustration) if I hack that function like this: > --- a/gcc/tree-vect-stmts.cc > +++ b/gcc/tree-vect-stmts.cc > @@ -12745,8 +12745,11 @@ vect_get_vector_types_for_stmt (vec_info > *vinfo, stmt_vec_info stmt_info, > /* The number of units is set according to the smallest scalar > type (or the largest vector size, but we only support one > vector size per vectorization). */ > - scalar_type = vect_get_smallest_scalar_type (stmt_info, > - TREE_TYPE (vectype)); > + if (simd_clone_call_p (stmt_info->stmt)) > + scalar_type = TREE_TYPE (vectype); > + else > + scalar_type = vect_get_smallest_scalar_type (stmt_info, > +TREE_TYPE (vectype)); > if (scalar_type != TREE_TYPE (vectype)) > { >if (dump_enabled_p ()) > > It will use the same types as before without (-m32), like I explained > before the -m32 turns the pointer inside MASK_CALL into a 32-bit pointer > so now the smallest size is 32-bits. This makes it pick V8SI instead of > the original V4DI (scalar return type is DImode). Changing the VF to 8, > thus unrolling the loop as it needs to make 2 calls, each handling 4 nunits. Ah, but then vect_get_smallest_scalar_type should simply ignore that pointer in MASK_CALL. It should only look at the arguments relevant for vectorization. So diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc index 8daf7bd7dd3..70bb8595951 100644 --- a/gcc/tree-vect-data-refs.cc +++ b/gcc/tree-vect-data-refs.cc @@ -165,6 +165,8 @@ vect_get_smallest_scalar_type (stmt_vec_info stmt_info, tree scalar_type) } else if (internal_fn_mask_index (ifn) == 0) i = 1; + else if (ifn == IFN_MASK_CALL) + i = 1; } if (i < gimple_call_num_args (call)) { ? Richard. > >
Re: [r13-7135 Regression] FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 on Linux/x86_64
On 14/04/2023 10:09, Richard Biener wrote: On Fri, Apr 14, 2023 at 10:43 AM Andre Vieira (lists) wrote: Resending this to everyone (sorry for the double send Richard). On 14/04/2023 09:15, Andre Vieira (lists) wrote: > > > On 14/04/2023 07:55, Richard Biener wrote: >> On Thu, Apr 13, 2023 at 4:25 PM Andre Vieira (lists) >> wrote: >>> >>> >>> >>> On 13/04/2023 15:00, Richard Biener wrote: On Thu, Apr 13, 2023 at 3:00 PM Andre Vieira (lists) via Gcc-patches wrote: > > > >>> >>> But that's not it, I've been looking at it, and there is code in place >>> that does what I expected which is defer the choice of vectype for simd >>> clones until vectorizable_simd_clone_call, unfortunately it has a >>> mistaken assumption that simdclones don't return :/ >> >> I think that's not it - when the SIMD clone returns a vector we have to >> determine the vector type in this function. We cannot defer this. > > What's 'this function' here, do you mean we have to determine the > vectype in 'vect_get_vector_types_for_stmt' & > 'vect_determine_vf_for_stmt' ? Yes. Because at that time we don't yet know > what clone we will be using, this choice is done inside > vectorizable_simd_clone_call. In fact, to choose the simd clone, we need > to know the vf as that has to be a multiple of the chosen clone's > simdlen. So we simply can't use the simdclone's types (as that depends > on the simdlen) to choose the vf because the choice of simdlend depends > on the vf. And there was already code in place to handle this, > unfortunately that code was wrong and had the wrong assumption that > simdclones didn't return (probably was true at some point and bitrotted). But to compute the VF we need to know the vector types! We're only calling vectorizable_* when the VF is final. That said, the code you quote: >> >>> see vect_get_vector_types_for_stmt: >>> ... >>> if (gimple_get_lhs (stmt) == NULL_TREE is just for the case of a function without return value. For this case it's OK to do nothing - 'vectype' is the vector type of all vector defs a stmt produces. For calls with a LHS it should fall through to generic code doing get_vectype_for_scalar_type on the LHS type. I think that may work, but right now it will still go and look at the arguments of the call and use the smallest type among them to adjust the nunits (in 'vect_get_vector_types_for_stmt'). In fact (this is just for illustration) if I hack that function like this: --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -12745,8 +12745,11 @@ vect_get_vector_types_for_stmt (vec_info *vinfo, stmt_vec_info stmt_info, /* The number of units is set according to the smallest scalar type (or the largest vector size, but we only support one vector size per vectorization). */ - scalar_type = vect_get_smallest_scalar_type (stmt_info, - TREE_TYPE (vectype)); + if (simd_clone_call_p (stmt_info->stmt)) + scalar_type = TREE_TYPE (vectype); + else + scalar_type = vect_get_smallest_scalar_type (stmt_info, +TREE_TYPE (vectype)); if (scalar_type != TREE_TYPE (vectype)) { if (dump_enabled_p ()) It will use the same types as before without (-m32), like I explained before the -m32 turns the pointer inside MASK_CALL into a 32-bit pointer so now the smallest size is 32-bits. This makes it pick V8SI instead of the original V4DI (scalar return type is DImode). Changing the VF to 8, thus unrolling the loop as it needs to make 2 calls, each handling 4 nunits.
Re: [r13-7135 Regression] FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 on Linux/x86_64
On Fri, Apr 14, 2023 at 11:09 AM Richard Biener wrote: > > On Fri, Apr 14, 2023 at 10:43 AM Andre Vieira (lists) > wrote: > > > > Resending this to everyone (sorry for the double send Richard). > > > > On 14/04/2023 09:15, Andre Vieira (lists) wrote: > > > > > > > > > On 14/04/2023 07:55, Richard Biener wrote: > > >> On Thu, Apr 13, 2023 at 4:25 PM Andre Vieira (lists) > > >> wrote: > > >>> > > >>> > > >>> > > >>> On 13/04/2023 15:00, Richard Biener wrote: > > On Thu, Apr 13, 2023 at 3:00 PM Andre Vieira (lists) via Gcc-patches > > wrote: > > > > > > > > > > > >>> > > >>> But that's not it, I've been looking at it, and there is code in place > > >>> that does what I expected which is defer the choice of vectype for simd > > >>> clones until vectorizable_simd_clone_call, unfortunately it has a > > >>> mistaken assumption that simdclones don't return :/ > > >> > > >> I think that's not it - when the SIMD clone returns a vector we have to > > >> determine the vector type in this function. We cannot defer this. > > > > > > What's 'this function' here, do you mean we have to determine the > > > vectype in 'vect_get_vector_types_for_stmt' & > > > 'vect_determine_vf_for_stmt' ? > > Yes. > > > Because at that time we don't yet know > > > what clone we will be using, this choice is done inside > > > vectorizable_simd_clone_call. In fact, to choose the simd clone, we need > > > to know the vf as that has to be a multiple of the chosen clone's > > > simdlen. So we simply can't use the simdclone's types (as that depends > > > on the simdlen) to choose the vf because the choice of simdlend depends > > > on the vf. And there was already code in place to handle this, > > > unfortunately that code was wrong and had the wrong assumption that > > > simdclones didn't return (probably was true at some point and bitrotted). > > But to compute the VF we need to know the vector types! We're only > calling vectorizable_* when the VF is final. That said, the code you quote: > > > >> > > >>> see vect_get_vector_types_for_stmt: > > >>> ... > > >>> if (gimple_get_lhs (stmt) == NULL_TREE > > is just for the case of a function without return value. For this case > it's OK to do nothing - 'vectype' is the vector type of all vector defs > a stmt produces. > > For calls with a LHS it should fall through to generic code doing > get_vectype_for_scalar_type on the LHS type. Isn't just the target selector wrong in these tests? /* { dg-final { scan-tree-dump-times {[\n\r] [^\n]* = foo\.simdclone} 2 "vect" { target { { ! avx_runtime } && { ! { i686*-*-* && { ! lp64 } } } } } } } */ /* { dg-final { scan-tree-dump-times {[\n\r] [^\n]* = foo\.simdclone} 3 "vect" { target { avx_runtime && { ! { i686*-*-* && { ! lp64 } } } } } } } */ /* { dg-final { scan-tree-dump-times {[\n\r] [^\n]* = foo\.simdclone} 4 "vect" { target { i686*-*-* && { ! lp64 } } } } } */ the i686-*-* should be { i?86-*-* x86_64-*-* } && { ! lp64 } to properly check x86_64 -m32? I'm going to patch that. Richard. > > >>> /* MASK_STORE has no lhs, but is ok. */ > > >>> && !gimple_call_internal_p (stmt, IFN_MASK_STORE)) > > >>> { > > >>> if (is_a (stmt)) > > >>> { > > >>> /* Ignore calls with no lhs. These must be calls to > > >>>#pragma omp simd functions, and what vectorization > > factor > > >>>it really needs can't be determined until > > >>>vectorizable_simd_clone_call. */ > > >>> if (dump_enabled_p ()) > > >>> dump_printf_loc (MSG_NOTE, vect_location, > > >>>"defer to SIMD clone analysis.\n"); > > >>> return opt_result::success (); > > >>> } > > >>> > > >>> return opt_result::failure_at (stmt, > > >>>"not vectorized: irregular > > >>> stmt.%G", stmt); > > >>> } > > >>> ... > > >>> > > >>> I'm working on a patch. > > > > > Kind Regards, > > > Andre
Re: [r13-7135 Regression] FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 on Linux/x86_64
On Fri, Apr 14, 2023 at 10:43 AM Andre Vieira (lists) wrote: > > Resending this to everyone (sorry for the double send Richard). > > On 14/04/2023 09:15, Andre Vieira (lists) wrote: > > > > > > On 14/04/2023 07:55, Richard Biener wrote: > >> On Thu, Apr 13, 2023 at 4:25 PM Andre Vieira (lists) > >> wrote: > >>> > >>> > >>> > >>> On 13/04/2023 15:00, Richard Biener wrote: > On Thu, Apr 13, 2023 at 3:00 PM Andre Vieira (lists) via Gcc-patches > wrote: > > > > > > > >>> > >>> But that's not it, I've been looking at it, and there is code in place > >>> that does what I expected which is defer the choice of vectype for simd > >>> clones until vectorizable_simd_clone_call, unfortunately it has a > >>> mistaken assumption that simdclones don't return :/ > >> > >> I think that's not it - when the SIMD clone returns a vector we have to > >> determine the vector type in this function. We cannot defer this. > > > > What's 'this function' here, do you mean we have to determine the > > vectype in 'vect_get_vector_types_for_stmt' & > > 'vect_determine_vf_for_stmt' ? Yes. > Because at that time we don't yet know > > what clone we will be using, this choice is done inside > > vectorizable_simd_clone_call. In fact, to choose the simd clone, we need > > to know the vf as that has to be a multiple of the chosen clone's > > simdlen. So we simply can't use the simdclone's types (as that depends > > on the simdlen) to choose the vf because the choice of simdlend depends > > on the vf. And there was already code in place to handle this, > > unfortunately that code was wrong and had the wrong assumption that > > simdclones didn't return (probably was true at some point and bitrotted). But to compute the VF we need to know the vector types! We're only calling vectorizable_* when the VF is final. That said, the code you quote: > >> > >>> see vect_get_vector_types_for_stmt: > >>> ... > >>> if (gimple_get_lhs (stmt) == NULL_TREE is just for the case of a function without return value. For this case it's OK to do nothing - 'vectype' is the vector type of all vector defs a stmt produces. For calls with a LHS it should fall through to generic code doing get_vectype_for_scalar_type on the LHS type. > >>> /* MASK_STORE has no lhs, but is ok. */ > >>> && !gimple_call_internal_p (stmt, IFN_MASK_STORE)) > >>> { > >>> if (is_a (stmt)) > >>> { > >>> /* Ignore calls with no lhs. These must be calls to > >>>#pragma omp simd functions, and what vectorization > factor > >>>it really needs can't be determined until > >>>vectorizable_simd_clone_call. */ > >>> if (dump_enabled_p ()) > >>> dump_printf_loc (MSG_NOTE, vect_location, > >>>"defer to SIMD clone analysis.\n"); > >>> return opt_result::success (); > >>> } > >>> > >>> return opt_result::failure_at (stmt, > >>>"not vectorized: irregular > >>> stmt.%G", stmt); > >>> } > >>> ... > >>> > >>> I'm working on a patch. > > > Kind Regards, > > Andre
Re: [r13-7135 Regression] FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 on Linux/x86_64
Resending this to everyone (sorry for the double send Richard). On 14/04/2023 09:15, Andre Vieira (lists) wrote: > > > On 14/04/2023 07:55, Richard Biener wrote: >> On Thu, Apr 13, 2023 at 4:25 PM Andre Vieira (lists) >> wrote: >>> >>> >>> >>> On 13/04/2023 15:00, Richard Biener wrote: On Thu, Apr 13, 2023 at 3:00 PM Andre Vieira (lists) via Gcc-patches wrote: > > > >>> >>> But that's not it, I've been looking at it, and there is code in place >>> that does what I expected which is defer the choice of vectype for simd >>> clones until vectorizable_simd_clone_call, unfortunately it has a >>> mistaken assumption that simdclones don't return :/ >> >> I think that's not it - when the SIMD clone returns a vector we have to >> determine the vector type in this function. We cannot defer this. > > What's 'this function' here, do you mean we have to determine the > vectype in 'vect_get_vector_types_for_stmt' & > 'vect_determine_vf_for_stmt' ? Because at that time we don't yet know > what clone we will be using, this choice is done inside > vectorizable_simd_clone_call. In fact, to choose the simd clone, we need > to know the vf as that has to be a multiple of the chosen clone's > simdlen. So we simply can't use the simdclone's types (as that depends > on the simdlen) to choose the vf because the choice of simdlend depends > on the vf. And there was already code in place to handle this, > unfortunately that code was wrong and had the wrong assumption that > simdclones didn't return (probably was true at some point and bitrotted). > >> >>> see vect_get_vector_types_for_stmt: >>> ... >>> if (gimple_get_lhs (stmt) == NULL_TREE >>> /* MASK_STORE has no lhs, but is ok. */ >>> && !gimple_call_internal_p (stmt, IFN_MASK_STORE)) >>> { >>> if (is_a (stmt)) >>> { >>> /* Ignore calls with no lhs. These must be calls to >>>#pragma omp simd functions, and what vectorization factor >>>it really needs can't be determined until >>>vectorizable_simd_clone_call. */ >>> if (dump_enabled_p ()) >>> dump_printf_loc (MSG_NOTE, vect_location, >>>"defer to SIMD clone analysis.\n"); >>> return opt_result::success (); >>> } >>> >>> return opt_result::failure_at (stmt, >>>"not vectorized: irregular >>> stmt.%G", stmt); >>> } >>> ... >>> >>> I'm working on a patch. > Kind Regards, > Andre
Re: [r13-7135 Regression] FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 on Linux/x86_64
On Thu, Apr 13, 2023 at 4:25 PM Andre Vieira (lists) wrote: > > > > On 13/04/2023 15:00, Richard Biener wrote: > > On Thu, Apr 13, 2023 at 3:00 PM Andre Vieira (lists) via Gcc-patches > > wrote: > >> > >> > >> > >> On 13/04/2023 11:01, Andrew Stubbs wrote: > >>> Hi Andre, > >>> > >>> I don't have a cascadelake device to test on, nor any knowledge about > >>> what makes it different from regular x86_64. > >> > >> Not sure you need one, but yeah I don't know either, it looks like it > >> fails because: > >> in-branch vector clones are not yet supported for integer mask modes. > >> > >> A quick look tells me this is because mask_mode is not VOIDmode. > >> i386.cc's TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN will set > >> mask_mode to either DI or SI mode when TARGET_AVX512F. So I suspect > >> cascadelake is TARGET_AVX512F. > >> > >> This is where I bail out as I really don't want to dive into the target > >> specific simd clone handling of x86 ;) > >> > >>> > >>> If the cascadelake device is supposed to work the same as other x86_64 > >>> devices for these vectors then the test has found a bug in the compiler > >>> and you should be looking to fix that, not fudge the testcase. > >>> > >>> Alternatively, if the device's capabilities really are different and the > >>> tests should behave differently, then the actual expectations need to be > >>> encoded in the dejagnu directives. If you can't tell the difference by > >>> looking at the "x86_64*-*-*" target selector alone then the correct > >>> solution is to invent a new "effective-target" selector. There are lots > >>> of examples of using these throughout the testsuite (you could use > >>> dg-require-effective-target to disable the whole testcase, or just use > >>> the name in the scan-tree-dump-times directive to customise the > >>> expectations), and the definitions can be found in the > >>> lib/target-supports.exp and lib/target-supports-dg.exp scripts. Some are > >>> fixed expressions and some run the compiler to probe the configuration, > >>> but in this case you probably want to do something with "check-flags". > >> > >> Even though I agree with you, I'm not the right person to do this > >> digging for such target specific stuff. So for now I'd probably suggest > >> xfailing this for avx512f. > >>> > >>> For the unroll problem, you can probably tweak the optimization options > >>> to disable that, the same as has been done for the epilogues feature > >>> that had the same problem. > >> > >> I mistaken the current behaviour for unrolling, it's actually because of > >> a latent bug. The vectorizer calls `vect_get_smallest_scalar_type` to > >> determine the vectype of a stmt. For a function like foo, that has the > >> same type (long long) everywhere this wouldn't be a problem, however, > >> because you transformed it into a MASK_CALL that has a function pointer > >> (which is 32-bit in -m32) that now becomes the 'smallest' type. > >> > >> This is all a red-herring though, I don't think we should be calling > >> this function for potential simdclone calls as the type on which the > >> veclen is not necessarily the 'smallest' type. And some arguments (like > >> uniform and linear) should be ignored anyway as they won't be mapped to > >> vectors. So I do think this might have been broken even before your > >> changes, but needs further investigation. > >>> Since these are new tests for a new feature, I don't really understand > >>> why this is classed as a regression? > >>> > >>> Andrew > >>> > >>> P.S. there was a commit to these tests in the last few days, so make > >>> sure you pull that before making changes. > >> > >> The latest commit to these tests was mine, it's the one Haochen is > >> reporting this regression against. My commit was to fix the issue richi > >> had introduced that was preventing the feature you introduced from > >> working. The reason nobody noticed was because the tests you introduced > >> didn't actually test your feature, since you didn't specify 'inbranch' > >> the omp declare simd pragma was allowing the use of not-inbranch simd > >> clones and the vectorizer was being smart enough to circumvent the > >> conditional and was still able to use simdclones (non inbranch ones) so > >> when the inbranch stopped working, the test didn't notice. > >> > >> The other changes to this test were already after the fix for 10 > >> that broke the inbranch feature you added, and so it was fixing a > >> cascadelake testism but for the not-inbranch simdclones. So basically > >> fixing a testism of a testism :/ > >> > >> > >> I am working on simdclone's for AArch64 for next Stage 1 so I don't mind > >> looking at the issue with the vectype being chosen wrongly, as for the > >> other x86 specific testisms I'll leave them to someone else. > > > > Btw, the new testsuite FAILs could be just epiloge vectorizations, so > > maybe try the usual --param vect-epilogues-nomask=0 ... > It already has those, Jakub added them. > > But that's not it,
Re: [r13-7135 Regression] FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 on Linux/x86_64
On 13/04/2023 15:00, Richard Biener wrote: On Thu, Apr 13, 2023 at 3:00 PM Andre Vieira (lists) via Gcc-patches wrote: On 13/04/2023 11:01, Andrew Stubbs wrote: Hi Andre, I don't have a cascadelake device to test on, nor any knowledge about what makes it different from regular x86_64. Not sure you need one, but yeah I don't know either, it looks like it fails because: in-branch vector clones are not yet supported for integer mask modes. A quick look tells me this is because mask_mode is not VOIDmode. i386.cc's TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN will set mask_mode to either DI or SI mode when TARGET_AVX512F. So I suspect cascadelake is TARGET_AVX512F. This is where I bail out as I really don't want to dive into the target specific simd clone handling of x86 ;) If the cascadelake device is supposed to work the same as other x86_64 devices for these vectors then the test has found a bug in the compiler and you should be looking to fix that, not fudge the testcase. Alternatively, if the device's capabilities really are different and the tests should behave differently, then the actual expectations need to be encoded in the dejagnu directives. If you can't tell the difference by looking at the "x86_64*-*-*" target selector alone then the correct solution is to invent a new "effective-target" selector. There are lots of examples of using these throughout the testsuite (you could use dg-require-effective-target to disable the whole testcase, or just use the name in the scan-tree-dump-times directive to customise the expectations), and the definitions can be found in the lib/target-supports.exp and lib/target-supports-dg.exp scripts. Some are fixed expressions and some run the compiler to probe the configuration, but in this case you probably want to do something with "check-flags". Even though I agree with you, I'm not the right person to do this digging for such target specific stuff. So for now I'd probably suggest xfailing this for avx512f. For the unroll problem, you can probably tweak the optimization options to disable that, the same as has been done for the epilogues feature that had the same problem. I mistaken the current behaviour for unrolling, it's actually because of a latent bug. The vectorizer calls `vect_get_smallest_scalar_type` to determine the vectype of a stmt. For a function like foo, that has the same type (long long) everywhere this wouldn't be a problem, however, because you transformed it into a MASK_CALL that has a function pointer (which is 32-bit in -m32) that now becomes the 'smallest' type. This is all a red-herring though, I don't think we should be calling this function for potential simdclone calls as the type on which the veclen is not necessarily the 'smallest' type. And some arguments (like uniform and linear) should be ignored anyway as they won't be mapped to vectors. So I do think this might have been broken even before your changes, but needs further investigation. Since these are new tests for a new feature, I don't really understand why this is classed as a regression? Andrew P.S. there was a commit to these tests in the last few days, so make sure you pull that before making changes. The latest commit to these tests was mine, it's the one Haochen is reporting this regression against. My commit was to fix the issue richi had introduced that was preventing the feature you introduced from working. The reason nobody noticed was because the tests you introduced didn't actually test your feature, since you didn't specify 'inbranch' the omp declare simd pragma was allowing the use of not-inbranch simd clones and the vectorizer was being smart enough to circumvent the conditional and was still able to use simdclones (non inbranch ones) so when the inbranch stopped working, the test didn't notice. The other changes to this test were already after the fix for 10 that broke the inbranch feature you added, and so it was fixing a cascadelake testism but for the not-inbranch simdclones. So basically fixing a testism of a testism :/ I am working on simdclone's for AArch64 for next Stage 1 so I don't mind looking at the issue with the vectype being chosen wrongly, as for the other x86 specific testisms I'll leave them to someone else. Btw, the new testsuite FAILs could be just epiloge vectorizations, so maybe try the usual --param vect-epilogues-nomask=0 ... It already has those, Jakub added them. But that's not it, I've been looking at it, and there is code in place that does what I expected which is defer the choice of vectype for simd clones until vectorizable_simd_clone_call, unfortunately it has a mistaken assumption that simdclones don't return :/ see vect_get_vector_types_for_stmt: ... if (gimple_get_lhs (stmt) == NULL_TREE /* MASK_STORE has no lhs, but is ok. */ && !gimple_call_internal_p (stmt, IFN_MASK_STORE)) { if (is_a (stmt)) { /* Ignore calls with no lhs. These must be
Re: [r13-7135 Regression] FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 on Linux/x86_64
On Thu, Apr 13, 2023 at 3:00 PM Andre Vieira (lists) via Gcc-patches wrote: > > > > On 13/04/2023 11:01, Andrew Stubbs wrote: > > Hi Andre, > > > > I don't have a cascadelake device to test on, nor any knowledge about > > what makes it different from regular x86_64. > > Not sure you need one, but yeah I don't know either, it looks like it > fails because: > in-branch vector clones are not yet supported for integer mask modes. > > A quick look tells me this is because mask_mode is not VOIDmode. > i386.cc's TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN will set > mask_mode to either DI or SI mode when TARGET_AVX512F. So I suspect > cascadelake is TARGET_AVX512F. > > This is where I bail out as I really don't want to dive into the target > specific simd clone handling of x86 ;) > > > > > If the cascadelake device is supposed to work the same as other x86_64 > > devices for these vectors then the test has found a bug in the compiler > > and you should be looking to fix that, not fudge the testcase. > > > > Alternatively, if the device's capabilities really are different and the > > tests should behave differently, then the actual expectations need to be > > encoded in the dejagnu directives. If you can't tell the difference by > > looking at the "x86_64*-*-*" target selector alone then the correct > > solution is to invent a new "effective-target" selector. There are lots > > of examples of using these throughout the testsuite (you could use > > dg-require-effective-target to disable the whole testcase, or just use > > the name in the scan-tree-dump-times directive to customise the > > expectations), and the definitions can be found in the > > lib/target-supports.exp and lib/target-supports-dg.exp scripts. Some are > > fixed expressions and some run the compiler to probe the configuration, > > but in this case you probably want to do something with "check-flags". > > Even though I agree with you, I'm not the right person to do this > digging for such target specific stuff. So for now I'd probably suggest > xfailing this for avx512f. > > > > For the unroll problem, you can probably tweak the optimization options > > to disable that, the same as has been done for the epilogues feature > > that had the same problem. > > I mistaken the current behaviour for unrolling, it's actually because of > a latent bug. The vectorizer calls `vect_get_smallest_scalar_type` to > determine the vectype of a stmt. For a function like foo, that has the > same type (long long) everywhere this wouldn't be a problem, however, > because you transformed it into a MASK_CALL that has a function pointer > (which is 32-bit in -m32) that now becomes the 'smallest' type. > > This is all a red-herring though, I don't think we should be calling > this function for potential simdclone calls as the type on which the > veclen is not necessarily the 'smallest' type. And some arguments (like > uniform and linear) should be ignored anyway as they won't be mapped to > vectors. So I do think this might have been broken even before your > changes, but needs further investigation. > > Since these are new tests for a new feature, I don't really understand > > why this is classed as a regression? > > > > Andrew > > > > P.S. there was a commit to these tests in the last few days, so make > > sure you pull that before making changes. > > The latest commit to these tests was mine, it's the one Haochen is > reporting this regression against. My commit was to fix the issue richi > had introduced that was preventing the feature you introduced from > working. The reason nobody noticed was because the tests you introduced > didn't actually test your feature, since you didn't specify 'inbranch' > the omp declare simd pragma was allowing the use of not-inbranch simd > clones and the vectorizer was being smart enough to circumvent the > conditional and was still able to use simdclones (non inbranch ones) so > when the inbranch stopped working, the test didn't notice. > > The other changes to this test were already after the fix for 10 > that broke the inbranch feature you added, and so it was fixing a > cascadelake testism but for the not-inbranch simdclones. So basically > fixing a testism of a testism :/ > > > I am working on simdclone's for AArch64 for next Stage 1 so I don't mind > looking at the issue with the vectype being chosen wrongly, as for the > other x86 specific testisms I'll leave them to someone else. Btw, the new testsuite FAILs could be just epiloge vectorizations, so maybe try the usual --param vect-epilogues-nomask=0 ... > Kind Regards, > Andre
Re: [r13-7135 Regression] FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 on Linux/x86_64
On 13/04/2023 11:01, Andrew Stubbs wrote: Hi Andre, I don't have a cascadelake device to test on, nor any knowledge about what makes it different from regular x86_64. Not sure you need one, but yeah I don't know either, it looks like it fails because: in-branch vector clones are not yet supported for integer mask modes. A quick look tells me this is because mask_mode is not VOIDmode. i386.cc's TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN will set mask_mode to either DI or SI mode when TARGET_AVX512F. So I suspect cascadelake is TARGET_AVX512F. This is where I bail out as I really don't want to dive into the target specific simd clone handling of x86 ;) If the cascadelake device is supposed to work the same as other x86_64 devices for these vectors then the test has found a bug in the compiler and you should be looking to fix that, not fudge the testcase. Alternatively, if the device's capabilities really are different and the tests should behave differently, then the actual expectations need to be encoded in the dejagnu directives. If you can't tell the difference by looking at the "x86_64*-*-*" target selector alone then the correct solution is to invent a new "effective-target" selector. There are lots of examples of using these throughout the testsuite (you could use dg-require-effective-target to disable the whole testcase, or just use the name in the scan-tree-dump-times directive to customise the expectations), and the definitions can be found in the lib/target-supports.exp and lib/target-supports-dg.exp scripts. Some are fixed expressions and some run the compiler to probe the configuration, but in this case you probably want to do something with "check-flags". Even though I agree with you, I'm not the right person to do this digging for such target specific stuff. So for now I'd probably suggest xfailing this for avx512f. For the unroll problem, you can probably tweak the optimization options to disable that, the same as has been done for the epilogues feature that had the same problem. I mistaken the current behaviour for unrolling, it's actually because of a latent bug. The vectorizer calls `vect_get_smallest_scalar_type` to determine the vectype of a stmt. For a function like foo, that has the same type (long long) everywhere this wouldn't be a problem, however, because you transformed it into a MASK_CALL that has a function pointer (which is 32-bit in -m32) that now becomes the 'smallest' type. This is all a red-herring though, I don't think we should be calling this function for potential simdclone calls as the type on which the veclen is not necessarily the 'smallest' type. And some arguments (like uniform and linear) should be ignored anyway as they won't be mapped to vectors. So I do think this might have been broken even before your changes, but needs further investigation. Since these are new tests for a new feature, I don't really understand why this is classed as a regression? Andrew P.S. there was a commit to these tests in the last few days, so make sure you pull that before making changes. The latest commit to these tests was mine, it's the one Haochen is reporting this regression against. My commit was to fix the issue richi had introduced that was preventing the feature you introduced from working. The reason nobody noticed was because the tests you introduced didn't actually test your feature, since you didn't specify 'inbranch' the omp declare simd pragma was allowing the use of not-inbranch simd clones and the vectorizer was being smart enough to circumvent the conditional and was still able to use simdclones (non inbranch ones) so when the inbranch stopped working, the test didn't notice. The other changes to this test were already after the fix for 10 that broke the inbranch feature you added, and so it was fixing a cascadelake testism but for the not-inbranch simdclones. So basically fixing a testism of a testism :/ I am working on simdclone's for AArch64 for next Stage 1 so I don't mind looking at the issue with the vectype being chosen wrongly, as for the other x86 specific testisms I'll leave them to someone else. Kind Regards, Andre
Re: [r13-7135 Regression] FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 on Linux/x86_64
Hi Andre, I don't have a cascadelake device to test on, nor any knowledge about what makes it different from regular x86_64. If the cascadelake device is supposed to work the same as other x86_64 devices for these vectors then the test has found a bug in the compiler and you should be looking to fix that, not fudge the testcase. Alternatively, if the device's capabilities really are different and the tests should behave differently, then the actual expectations need to be encoded in the dejagnu directives. If you can't tell the difference by looking at the "x86_64*-*-*" target selector alone then the correct solution is to invent a new "effective-target" selector. There are lots of examples of using these throughout the testsuite (you could use dg-require-effective-target to disable the whole testcase, or just use the name in the scan-tree-dump-times directive to customise the expectations), and the definitions can be found in the lib/target-supports.exp and lib/target-supports-dg.exp scripts. Some are fixed expressions and some run the compiler to probe the configuration, but in this case you probably want to do something with "check-flags". For the unroll problem, you can probably tweak the optimization options to disable that, the same as has been done for the epilogues feature that had the same problem. Since these are new tests for a new feature, I don't really understand why this is classed as a regression? Andrew P.S. there was a commit to these tests in the last few days, so make sure you pull that before making changes. On 13/04/2023 10:15, Andre Simoes Dias Vieira wrote: Hi, @Andrew: Could you have a look at these? I had a quick look at 17f.c and it looks to me like the target selectors aren't specific enough. Unfortunately I am not familiar enough with target selectors (or targets for that matter) for x86_64. From what I could tell with -m32 gcc decides to unroll the vectorized loop so you end up with 4 simdclones rather than the 2 it tests for, GCC probably uses a different cost structure for -m32 that decides it is profitable to unroll? As for -march=cascadelake, that seems to prevent gcc from using the inbranch simdclones altogether, so I suspect that cascadelake doesn't support these inbranch simdclones or the vector types it is trying to use. Kind regards, Andre From: haochen.jiang Sent: Thursday, April 13, 2023 2:48 AM To: Andre Simoes Dias Vieira; gcc-regress...@gcc.gnu.org; gcc-patches@gcc.gnu.org; haochen.ji...@intel.com Subject: [r13-7135 Regression] FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 on Linux/x86_64 On Linux/x86_64, 58c8c1b383bc3c286d6527fc6e8fb62463f9a877 is the first bad commit commit 58c8c1b383bc3c286d6527fc6e8fb62463f9a877 Author: Andre Vieira Date: Tue Apr 11 10:07:43 2023 +0100 if-conv: Restore MASK_CALL conversion [PR10] caused FAIL: gcc.dg/vect/vect-simd-clone-16e.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 3 FAIL: gcc.dg/vect/vect-simd-clone-16f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 FAIL: gcc.dg/vect/vect-simd-clone-17e.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 3 FAIL: gcc.dg/vect/vect-simd-clone-17f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 FAIL: gcc.dg/vect/vect-simd-clone-18e.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 3 FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 with GCC configured with ../../gcc/configure --prefix=/export/users/haochenj/src/gcc-bisect/master/master/r13-7135/usr --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld --with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl --enable-libmpx x86_64-linux --disable-bootstrap To reproduce: $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-16e.c --target_board='unix{-m32}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-16e.c --target_board='unix{-m32\ -march=cascadelake}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-16f.c --target_board='unix{-m32}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-16f.c --target_board='unix{-m32\ -march=cascadelake}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-17e.c --target_board='unix{-m32}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-17e.c --target_board='unix{-m32\ -march=cascadelake}'" $ cd {build_dir}/gcc && make chec
Re: [r13-7135 Regression] FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 on Linux/x86_64
Hi, @Andrew: Could you have a look at these? I had a quick look at 17f.c and it looks to me like the target selectors aren't specific enough. Unfortunately I am not familiar enough with target selectors (or targets for that matter) for x86_64. From what I could tell with -m32 gcc decides to unroll the vectorized loop so you end up with 4 simdclones rather than the 2 it tests for, GCC probably uses a different cost structure for -m32 that decides it is profitable to unroll? As for -march=cascadelake, that seems to prevent gcc from using the inbranch simdclones altogether, so I suspect that cascadelake doesn't support these inbranch simdclones or the vector types it is trying to use. Kind regards, Andre From: haochen.jiang Sent: Thursday, April 13, 2023 2:48 AM To: Andre Simoes Dias Vieira; gcc-regress...@gcc.gnu.org; gcc-patches@gcc.gnu.org; haochen.ji...@intel.com Subject: [r13-7135 Regression] FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 on Linux/x86_64 On Linux/x86_64, 58c8c1b383bc3c286d6527fc6e8fb62463f9a877 is the first bad commit commit 58c8c1b383bc3c286d6527fc6e8fb62463f9a877 Author: Andre Vieira Date: Tue Apr 11 10:07:43 2023 +0100 if-conv: Restore MASK_CALL conversion [PR10] caused FAIL: gcc.dg/vect/vect-simd-clone-16e.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 3 FAIL: gcc.dg/vect/vect-simd-clone-16f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 FAIL: gcc.dg/vect/vect-simd-clone-17e.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 3 FAIL: gcc.dg/vect/vect-simd-clone-17f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 FAIL: gcc.dg/vect/vect-simd-clone-18e.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 3 FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 with GCC configured with ../../gcc/configure --prefix=/export/users/haochenj/src/gcc-bisect/master/master/r13-7135/usr --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld --with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl --enable-libmpx x86_64-linux --disable-bootstrap To reproduce: $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-16e.c --target_board='unix{-m32}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-16e.c --target_board='unix{-m32\ -march=cascadelake}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-16f.c --target_board='unix{-m32}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-16f.c --target_board='unix{-m32\ -march=cascadelake}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-17e.c --target_board='unix{-m32}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-17e.c --target_board='unix{-m32\ -march=cascadelake}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-17f.c --target_board='unix{-m32}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-17f.c --target_board='unix{-m32\ -march=cascadelake}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-18e.c --target_board='unix{-m32}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-18e.c --target_board='unix{-m32\ -march=cascadelake}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-18f.c --target_board='unix{-m32}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-18f.c --target_board='unix{-m32\ -march=cascadelake}'" (Please do not reply to this email, for question about this report, contact me at haochen dot jiang at intel.com)
[r13-7135 Regression] FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 on Linux/x86_64
On Linux/x86_64, 58c8c1b383bc3c286d6527fc6e8fb62463f9a877 is the first bad commit commit 58c8c1b383bc3c286d6527fc6e8fb62463f9a877 Author: Andre Vieira Date: Tue Apr 11 10:07:43 2023 +0100 if-conv: Restore MASK_CALL conversion [PR10] caused FAIL: gcc.dg/vect/vect-simd-clone-16e.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 3 FAIL: gcc.dg/vect/vect-simd-clone-16f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 FAIL: gcc.dg/vect/vect-simd-clone-17e.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 3 FAIL: gcc.dg/vect/vect-simd-clone-17f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 FAIL: gcc.dg/vect/vect-simd-clone-18e.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 3 FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 with GCC configured with ../../gcc/configure --prefix=/export/users/haochenj/src/gcc-bisect/master/master/r13-7135/usr --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld --with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl --enable-libmpx x86_64-linux --disable-bootstrap To reproduce: $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-16e.c --target_board='unix{-m32}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-16e.c --target_board='unix{-m32\ -march=cascadelake}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-16f.c --target_board='unix{-m32}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-16f.c --target_board='unix{-m32\ -march=cascadelake}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-17e.c --target_board='unix{-m32}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-17e.c --target_board='unix{-m32\ -march=cascadelake}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-17f.c --target_board='unix{-m32}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-17f.c --target_board='unix{-m32\ -march=cascadelake}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-18e.c --target_board='unix{-m32}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-18e.c --target_board='unix{-m32\ -march=cascadelake}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-18f.c --target_board='unix{-m32}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-18f.c --target_board='unix{-m32\ -march=cascadelake}'" (Please do not reply to this email, for question about this report, contact me at haochen dot jiang at intel.com)