Re: [PATCH] rtl-optimization/80960 - avoid creating garbage RTL in DSE
On Wed, 27 Jan 2021, Jakub Jelinek wrote: > On Wed, Jan 27, 2021 at 03:40:38PM +0100, Richard Biener wrote: > > The following avoids repeatedly turning VALUE RTXen into > > sth useful and re-applying a constant offset through get_addr > > via DSE check_mem_read_rtx. Instead perform this once for > > all stores to be visited in check_mem_read_rtx. This avoids > > allocating 1.6GB of garbage PLUS RTXen on the PR80960 > > testcase, fixing the memory usage regression from old GCC. > > > > Bootstrap and regtest running on x86_64-unknown-linux-gnu, OK? > > > > Thanks, > > Richard. > > > > 2021-01-27 Richard Biener > > > > PR rtl-optimization/80960 > > * dse.c (check_mem_read_rtx): Call get_addr on the > > offsetted address. > > --- > > gcc/dse.c | 5 + > > 1 file changed, 5 insertions(+) > > > > diff --git a/gcc/dse.c b/gcc/dse.c > > index c88587e7d94..da0df54a2dd 100644 > > --- a/gcc/dse.c > > +++ b/gcc/dse.c > > @@ -2219,6 +2219,11 @@ check_mem_read_rtx (rtx *loc, bb_info_t bb_info) > > } > >if (maybe_ne (offset, 0)) > > mem_addr = plus_constant (get_address_mode (mem), mem_addr, offset); > > + /* Avoid passing VALUE RTXen as mem_addr to canon_true_dependence > > + which will over and over re-create proper RTL and re-apply the > > + offset above. See PR80960 where we almost allocate 1.6GB of PLUS > > + RTXen that way. */ > > + mem_addr = get_addr (mem_addr); > > > >if (group_id >= 0) > > { > > Does that result in any changes on how much does DSE optimize? > I mean, if you do 2 bootstraps/regtests, one with this patch and another one > without it, and at the end of rest_of_handle_dse dump > locally_deleted, globally_deleted > for each CU/function, do you get the same counts except perhaps for dse.c? So I see no difference for stage2-gcc/*.o dse1/dse2 with/without the patch but counts are _extremely_ small. Statistics: 70148 dse: local deletions = 0, global deletions = 0 32 dse: local deletions = 0, global deletions = 1 9 dse: local deletions = 0, global deletions = 2 7 dse: local deletions = 0, global deletions = 3 2 dse: local deletions = 0, global deletions = 4 2 dse: local deletions = 0, global deletions = 5 3 dse: local deletions = 0, global deletions = 7 67 dse: local deletions = 1, global deletions = 0 1 dse: local deletions = 1, global deletions = 2 12 dse: local deletions = 2, global deletions = 0 1 dse: local deletions = 24, global deletions = 1 2 dse: local deletions = 3, global deletions = 0 4 dse: local deletions = 4, global deletions = 0 4 dse: local deletions = 6, global deletions = 0 1 dse: local deletions = 7, global deletions = 0 1 dse: local deletions = 8, global deletions = 0 so not sure how much confidence this brings over the analytical reasoning that it shouldn't make a difference ... stats on just dse2 are even more depressing (given it's cost) 35123 dse: local deletions = 0, global deletions = 0 2 dse: local deletions = 0, global deletions = 1 20 dse: local deletions = 1, global deletions = 0 1 dse: local deletions = 2, global deletions = 0 1 dse: local deletions = 3, global deletions = 0 1 dse: local deletions = 4, global deletions = 0 Richard.
Fix LTO bootstrap on Windows (PR lto/85574)
The last fix made for PR lto/85574 introduced a comparison of executables and this cannot directly work on Windows because they are timestamped. Moreover nobody sets $(exeext) at top level, at least on MinGW, so you get a weird behavior because some tools add the implicit .exe suffix and others don't. Bootstrapped in LTO mode on Windows, OK for all active branches? 2021-01-28 Eric Botcazou contrib/ PR lto/85574 * compare-lto: Deal with PE-COFF executables specifically. -- Eric Botcazoudiff --git a/contrib/compare-lto b/contrib/compare-lto index 17379e196a7..c0bb71c0765 100755 --- a/contrib/compare-lto +++ b/contrib/compare-lto @@ -32,7 +32,7 @@ case $1 in esac if test $# != 2; then - echo 'usage: compare-lto file1.o file2.o' >&2 + echo 'usage: compare-lto file1 file2' >&2 exit 1 fi @@ -101,6 +101,25 @@ else else status=1 fi + + # PE-COFF executables are timestamped so skip leading bytes for them. + else +case "$1" in + *.exe) +if cmp -i 256 "$1" "$2"; then + status=0 +else + status=1 +fi +;; + *) +if test -f "$1.exe" && cmp -i 256 "$1.exe" "$2.exe"; then + status=0 +else + status=1 +fi +;; +esac fi fi
Re: [PATCH v3] clear VLA bounds in attribute access (PR 97172)
On Thu, Jan 28, 2021 at 12:08 AM Martin Sebor via Gcc-patches wrote: > > Attached is another attempt to fix the problem caused by allowing > front-end trees representing nontrivial VLA bound expressions to > stay in attribute access attached to functions. Since removing > these trees seems to be everyone's preference this patch does that > by extending the free_lang_data pass to look for and zero out these > trees. > > Because free_lang_data only frees anything when LTO is enabled and > we want these trees cleared regardless to keep them from getting > clobbered during gimplification, this change also modifies the pass > to do the clearing even when the pass is otherwise inactive. if (TREE_CODE (bound) == NOP_EXPR) +bound = TREE_OPERAND (bound, 0); + + if (TREE_CODE (bound) == CONVERT_EXPR) +{ + tree op0 = TREE_OPERAND (bound, 0); + tree bndtyp = TREE_TYPE (bound); + tree op0typ = TREE_TYPE (op0); + if (TYPE_PRECISION (bndtyp) == TYPE_PRECISION (op0typ)) + bound = op0; +} + + if (TREE_CODE (bound) == NON_LVALUE_EXPR) +bound = TREE_OPERAND (bound, 0); all of the above can be just STRIP_NOPS (bound); which also handles nesting of the above in any order. + if (TREE_CODE (bound) == PLUS_EXPR + && integer_all_onesp (TREE_OPERAND (bound, 1))) +{ + bound = TREE_OPERAND (bound, 0); + if (TREE_CODE (bound) == NOP_EXPR) + bound = TREE_OPERAND (bound, 0); +} so it either does or does not strip a -1 but has no indication on whether it did that? That looks fragile and broken. Anyway, the split out of this function seems unrelated to the original problem and should be submitted separately. + for (vblist = TREE_VALUE (vblist); vblist; vblist = TREE_CHAIN (vblist)) + { + tree *pvbnd = &TREE_VALUE (vblist); + if (!*pvbnd || DECL_P (*pvbnd)) + continue; so this doesn't let constant bounds prevail but only symbolical ones? Not that I care but I'd have expected || CONSTANT_CLASS_P (*pvbnd) The free-lang-data parts are OK. Richard. > Tested on x86_64-linux. > > Martin
[PATCH] testsuite: Run vec_insert case on P8 and P9 with option specified
Move common functions to header file for cleanup. gcc/testsuite/ChangeLog: 2021-01-27 Xionghu Luo * gcc.target/powerpc/pr79251.p8.c: Move definition to ... * gcc.target/powerpc/pr79251.h: ...this. * gcc.target/powerpc/pr79251.p9.c: Likewise. * gcc.target/powerpc/pr79251-run.c: Rename to... * gcc.target/powerpc/pr79251-run.p8.c: ...this. * gcc.target/powerpc/pr79251-run.p9.c: New test. --- .../gcc.target/powerpc/pr79251-run.c | 30 --- .../gcc.target/powerpc/pr79251-run.p8.c | 14 + .../gcc.target/powerpc/pr79251-run.p9.c | 14 + gcc/testsuite/gcc.target/powerpc/pr79251.h| 17 +++ gcc/testsuite/gcc.target/powerpc/pr79251.p8.c | 2 -- gcc/testsuite/gcc.target/powerpc/pr79251.p9.c | 2 -- 6 files changed, 45 insertions(+), 34 deletions(-) delete mode 100644 gcc/testsuite/gcc.target/powerpc/pr79251-run.c create mode 100644 gcc/testsuite/gcc.target/powerpc/pr79251-run.p8.c create mode 100644 gcc/testsuite/gcc.target/powerpc/pr79251-run.p9.c diff --git a/gcc/testsuite/gcc.target/powerpc/pr79251-run.c b/gcc/testsuite/gcc.target/powerpc/pr79251-run.c deleted file mode 100644 index 6afd357c7ba..000 --- a/gcc/testsuite/gcc.target/powerpc/pr79251-run.c +++ /dev/null @@ -1,30 +0,0 @@ -/* { dg-do run } */ -/* { dg-require-effective-target vsx_hw } */ -/* { dg-options "-O2 -mvsx" } */ - -#include -#include -#include "pr79251.h" - -TEST_VEC_INSERT_ALL (test) - -#define run_test(TYPE, num) \ - { \ -vector TYPE v; \ -vector TYPE u = {0x0}; \ -for (long k = 0; k < 16 / sizeof (TYPE); k++) \ - v[k] = 0xaa; \ -for (long k = 0; k < 16 / sizeof (TYPE); k++) \ - { \ - u = test##num (v, 254, k); \ - if (u[k] != (TYPE) 254)\ - __builtin_abort (); \ - } \ - } - -int -main (void) -{ - TEST_VEC_INSERT_ALL (run_test) - return 0; -} diff --git a/gcc/testsuite/gcc.target/powerpc/pr79251-run.p8.c b/gcc/testsuite/gcc.target/powerpc/pr79251-run.p8.c new file mode 100644 index 000..47d4d288f3c --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/pr79251-run.p8.c @@ -0,0 +1,14 @@ +/* { dg-do run } */ +/* { dg-require-effective-target p8vector_hw } */ +/* { dg-options "-O2 -mvsx -mdejagnu-cpu=power8" } */ + +#include +#include +#include "pr79251.h" + +int +main (void) +{ + TEST_VEC_INSERT_ALL (run_test) + return 0; +} diff --git a/gcc/testsuite/gcc.target/powerpc/pr79251-run.p9.c b/gcc/testsuite/gcc.target/powerpc/pr79251-run.p9.c new file mode 100644 index 000..fd56b2356f4 --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/pr79251-run.p9.c @@ -0,0 +1,14 @@ +/* { dg-do run } */ +/* { dg-require-effective-target p9vector_hw } */ +/* { dg-options "-O2 -mvsx -mdejagnu-cpu=power9" } */ + +#include +#include +#include "pr79251.h" + +int +main (void) +{ + TEST_VEC_INSERT_ALL (run_test) + return 0; +} diff --git a/gcc/testsuite/gcc.target/powerpc/pr79251.h b/gcc/testsuite/gcc.target/powerpc/pr79251.h index addb067f9ed..2684b660966 100644 --- a/gcc/testsuite/gcc.target/powerpc/pr79251.h +++ b/gcc/testsuite/gcc.target/powerpc/pr79251.h @@ -17,3 +17,20 @@ T (unsigned long long, 7) \ T (float, 8) \ T (double, 9) + +TEST_VEC_INSERT_ALL (test) + +#define run_test(TYPE, num) \ + { \ +vector TYPE v; \ +vector TYPE u = {0x0}; \ +for (long k = 0; k < 16 / sizeof (TYPE); k++) \ + v[k] = 0xaa; \ +for (long k = 0; k < 16 / sizeof (TYPE); k++) \ + { \ + u = test##num (v, 254, k); \ + if (u[k] != (TYPE) 254)\ + __builtin_abort (); \ + }
Re: [PATCH PR97627]Avoid computing niters info for fake edges
On Thu, Jan 28, 2021 at 3:49 AM bin.cheng via Gcc-patches wrote: > > Hi, > As described in commit message, we need to avoid computing niters info for > fake > edges. This simple patch does this by two changes. > > Bootstrap and test on X86_64, is it ok? Hmm, so I think the patch is a bit complicated and avoiding niter compute for fake edges would be easier when just returning false for fake edges in number_of_iterations_exit_assumptions? Which pass was the problematical that had infinite loops connected to exit? I guess the cfgloop code should simply ignore fake exits - they mostly exist to make reverse CFG walks easy. Specifically single_exit and single_likely_exit but also exit edge recording should ignore them. That said, the testcase seems to be fixed with just diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c index 7d61ef080eb..7775bc7275c 100644 --- a/gcc/tree-ssa-loop-niter.c +++ b/gcc/tree-ssa-loop-niter.c @@ -2407,6 +2407,11 @@ number_of_iterations_exit_assumptions (class loop *loop, edge exit, affine_iv iv0, iv1; bool safe; + /* The condition at a fake exit (if it exists) does not control its + execution. */ + if (exit->flags & EDGE_FAKE) +return false; + /* Nothing to analyze if the loop is known to be infinite. */ if (loop_constraint_set_p (loop, LOOP_C_INFINITE)) return false; Your dfs_find_deadend change likely "breaks" post-dominance DFS order (this is a very fragile area). So any objection to just simplify the patch to the above hunk? Thanks, Richard. > Thanks, > bin
[PATCH] c++: Fix up handling of register ... asm ("...") vars in templates [PR33661, PR98847]
Hi! As the testcase shows, for vars appearing in templates, we don't attach the asm spec string to the pattern decls, nor pass it back to cp_finish_decl during instantiation. The following patch does that. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2021-01-27 Jakub Jelinek PR c++/33661 PR c++/98847 * decl.c (cp_finish_decl): For register vars with asmspec in templates call set_user_assembler_name and set DECL_HARD_REGISTER. * pt.c (tsubst_expr): When instantiating DECL_HARD_REGISTER vars, pass asmspec_tree to cp_finish_decl. * g++.target/i386/pr98847.C: New test. --- gcc/cp/decl.c.jj2021-01-07 22:54:34.0 +0100 +++ gcc/cp/decl.c 2021-01-27 10:48:14.399582310 +0100 @@ -7840,6 +7840,12 @@ cp_finish_decl (tree decl, tree init, bo retrofit_lang_decl (decl); SET_DECL_DEPENDENT_INIT_P (decl, true); } + + if (VAR_P (decl) && DECL_REGISTER (decl) && asmspec) + { + set_user_assembler_name (decl, asmspec); + DECL_HARD_REGISTER (decl) = 1; + } return; } --- gcc/cp/pt.c.jj 2021-01-22 10:07:53.643466723 +0100 +++ gcc/cp/pt.c 2021-01-27 10:59:04.109126313 +0100 @@ -18227,6 +18227,7 @@ tsubst_expr (tree t, tree args, tsubst_f bool const_init = false; unsigned int cnt = 0; tree first = NULL_TREE, ndecl = error_mark_node; + tree asmspec_tree = NULL_TREE; maybe_push_decl (decl); if (VAR_P (decl) @@ -18250,7 +18251,18 @@ tsubst_expr (tree t, tree args, tsubst_f now. */ predeclare_vla (decl); - cp_finish_decl (decl, init, const_init, NULL_TREE, 0); + if (VAR_P (decl) && DECL_HARD_REGISTER (pattern_decl)) + { + tree id = DECL_ASSEMBLER_NAME (pattern_decl); + const char *asmspec = IDENTIFIER_POINTER (id); + gcc_assert (asmspec[0] == '*'); + asmspec_tree + = build_string (IDENTIFIER_LENGTH (id) - 1, + asmspec + 1); + TREE_TYPE (asmspec_tree) = char_array_type_node; + } + + cp_finish_decl (decl, init, const_init, asmspec_tree, 0); if (ndecl != error_mark_node) cp_finish_decomp (ndecl, first, cnt); --- gcc/testsuite/g++.target/i386/pr98847.C.jj 2021-01-27 11:01:55.312161637 +0100 +++ gcc/testsuite/g++.target/i386/pr98847.C 2021-01-27 11:00:26.601179659 +0100 @@ -0,0 +1,20 @@ +// PR c++/98847 +// { dg-do run } +// { dg-options "-O2 -masm=att" } + +template +int +foo () +{ + register int edx asm ("edx"); + asm ("movl $1234, %%edx" : "=r" (edx)); + return edx; +} + +int +main () +{ + if (foo<0> () != 1234) +__builtin_abort (); + return 0; +} Jakub
[PATCH] c++: Fix -Weffc++ in templates [PR98841]
Hi! We emit a bogus warning on the following testcase, suggesting that the operator should return *this even when it does that already. The problem is that normally cp_build_indirect_ref_1 ensures that *this is folded as current_class_ref, but in templates (if return type is non-dependent, otherwise check_return_expr doesn't check it) it didn't go through cp_build_indirect_ref_1, but just built another INDIRECT_REF. Which means it then doesn't compare pointer-equal to current_class_ref. The following patch just checks if it is the same thing like cp_build_indirect_ref_1 would check if it would be called. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2021-01-27 Jakub Jelinek PR c++/98841 * typeck.c (check_return_expr): If processing_template_decl, don't rely on *this expression being always pointer-equal to current_class_ref. * g++.dg/warn/effc5.C: New test. --- gcc/cp/typeck.c.jj 2021-01-25 10:02:28.380126847 +0100 +++ gcc/cp/typeck.c 2021-01-27 11:21:13.117875551 +0100 @@ -10188,6 +10188,19 @@ check_return_expr (tree retval, bool *no /* Returning '*this' is obviously OK. */ if (retval == current_class_ref) warn = false; + /* In templates, '*this' might not compare pointer equal to +current_class_ref. */ + else if (processing_template_decl + && retval + && TREE_CODE (retval) == INDIRECT_REF + && (TREE_OPERAND (retval, 0) == current_class_ptr + || (TREE_CODE (TREE_OPERAND (retval, 0)) == NOP_EXPR + && (TREE_OPERAND (TREE_OPERAND (retval, 0), 0) + == current_class_ptr) + && (same_type_ignoring_top_level_qualifiers_p +(TREE_TYPE (TREE_OPERAND (retval, 0)), + TREE_TYPE (current_class_ptr)) + warn = false; /* If we are calling a function whose return type is the same of the current class reference, it is ok. */ else if (INDIRECT_REF_P (retval) --- gcc/testsuite/g++.dg/warn/effc5.C.jj2021-01-27 11:24:34.451565246 +0100 +++ gcc/testsuite/g++.dg/warn/effc5.C 2021-01-27 11:23:23.836375556 +0100 @@ -0,0 +1,17 @@ +// PR c++/98841 +// { dg-do compile } +// { dg-options "-Weffc++" } + +struct S { + template + S& operator=(const T&) { return *this; } // { dg-bogus "should return a reference to" } + S& operator=(const S&) { return *this; } +}; + +void +foo () +{ + S s, t; + s = 1; + s = t; +} Jakub
Re: [RFC] test builtin ratio for loop distribution
On Thu, Jan 28, 2021 at 6:28 AM Alexandre Oliva wrote: > > On Jan 27, 2021, Richard Biener wrote: > > > That said, rather than not transforming the loop as you do I'd > > say we want to re-inline small copies more forcefully during > > loop distribution code-gen so we turn a loop that sets > > 3 'short int' to zero into a 'int' store and a 'short' store for example > > (we have more code-size leeway here because we formerly had > > a loop). > > Ok, that makes sense to me, if there's a chance of growing the access > stride. > > > Since you don't add a testcase > > Uhh, sorry, I mentioned TFmode emulation calls, but I wasn't explicit I > meant the soft-fp ones from libgcc. > > ./xgcc -B./ -O2 $srcdir/libgcc/soft-fp/fixtfdi.c \ > -I$srcdir/libgcc/config/riscv -I$srcdir/include \ > -S -o - | grep memset > > > I can't see whether the actual case would be fixed by setting SSA > > pointer alignment on the memset arguments > > The dest pointer alignment is known at the builtin expansion time, > that's not a problem. What isn't known (*) is that the length is a > multiple of the word size: what gets to the expander is that it's > between 4 and 12 bytes (targeting 32-bit risc-v), but not that it's > either 4, 8 or 12. Coming up with an efficient inline expansion becomes > a bit of a challenge without that extra knowledge. Ah, yes - that's also useful information. I guess for memset an enhanced builtin would then have calloc style size and nmemb arguments where the destination is expected to have at least 'size' alignment. That would allow turning back the memset into the original loop (but with optimal IVs, etc.). So kind of the x86 rep mov{bwlq}. Now think of sth similarly useful for memcpy/move. Richard. > > > (*) at least before my related patch for get_nonzero_bits > https://gcc.gnu.org/pipermail/gcc-patches/2021-January/564344.html > > -- > Alexandre Oliva, happy hacker https://FSFLA.org/blogs/lxo/ >Free Software Activist GNU Toolchain Engineer > Vim, Vi, Voltei pro Emacs -- GNUlius Caesar
Re: [PATCH][X86] Enable X86_TUNE_AVX256_UNALIGNED_{LOAD, STORE}_OPTIMAL for generic tune [PR target/98172]
On Thu, Jan 28, 2021 at 7:32 AM Hongtao Liu via Gcc-patches wrote: > > Hi: >GCC11 will be the system GCC 2 years from now, and for the > processors then, they shouldn't even need to split a 256-bit vector > into 2 128-bits vectors. >.i.e. Test SPEC2017 with the below 2 options on Zen3/ICL show > option B is better than Option A. > Option A: > -march=x86-64 -mtune=generic -mavx2 -mfma -Ofast > > Option B: > Option A + > -mtune-ctrl="256_unaligned_load_optimal,256_unaligned_store_optimal" > > Bootstrapped and regtested on x86-64_iinux-gnu{-m32,}. Given the explicit list for unaligned loads it's a no-brainer to change that for X86_TUNE_AVX256_UNALIGNED_LOAD_OPTIMAL. Given both BDVER and ZNVER1 are listed for X86_TUNE_AVX256_UNALIGNED_STORE_OPTIMAL we should try to benchmark the effect on ZNVER1 - Martin, do we still have a znver1 machine around? Note that with the settings differing in a way to split stores but not to split loads, loading a just stored value can cause bad STLF and quite a performance hit (since znver1 has 128bit data paths that shouldn't be an issue there but it would have an issue for actually aligned data on CPUs with 256bit data paths). Thanks, Richard. > Ok for trunk? > > > > > -- > BR, > Hongtao
Re: Fix LTO bootstrap on Windows (PR lto/85574)
On Thu, Jan 28, 2021 at 9:35 AM Eric Botcazou wrote: > > The last fix made for PR lto/85574 introduced a comparison of executables and > this cannot directly work on Windows because they are timestamped. Moreover > nobody sets $(exeext) at top level, at least on MinGW, so you get a weird > behavior because some tools add the implicit .exe suffix and others don't. > > Bootstrapped in LTO mode on Windows, OK for all active branches? OK and sorry for the breakage. Thanks, Richard. > > 2021-01-28 Eric Botcazou > > contrib/ > PR lto/85574 > * compare-lto: Deal with PE-COFF executables specifically. > > -- > Eric Botcazou
Re: [PATCH] [8/9/10/11 Regression] [OOP] PR fortran/86470 - ICE with OpenMP
Hi Thomas, > > Should the testcase be moved to the gomp/ subdirectory? > Yes. It's a compile-time test, and it will then only be run > if the the compiler can do OpenMP. > > You will not need the > > +! { dg-options "-fopenmp" } > > line, then. Adjusted and committed as r11-6950-g33a7a93218b1393d0135e3c4a9ad9ced87808f5e > Thanks for the patch! Thanks, Harald
Re: [PATCH v3] clear VLA bounds in attribute access (PR 97172)
On Thu, Jan 28, 2021 at 09:31:46AM +0100, Richard Biener via Gcc-patches wrote: > + if (TREE_CODE (bound) == PLUS_EXPR > + && integer_all_onesp (TREE_OPERAND (bound, 1))) > +{ > + bound = TREE_OPERAND (bound, 0); > + if (TREE_CODE (bound) == NOP_EXPR) > + bound = TREE_OPERAND (bound, 0); > +} > > so it either does or does not strip a -1 but has no indication on > whether it did that? That looks fragile and broken. Yeah. Plus again STRIP_NOPs in there should be used. But certainly it needs to remember whether there was + -1 or not and compensate for it. > The free-lang-data parts are OK. But is fld the right spot to do it? If it is only the FE that emits the warnings, shouldn't it be stripped already before gimplification so that it isn't actually corrupted? I mean in c_parse_final_cleanups or c_common_parse_file depending on if it is just C or C++ too? Plus, if the expressions in the attribute don't contain SAVE_EXPRs, why there isn't unshare_expr called on them (and if they do, I don't see how it would be guaranteed, can't one e.g. do int bar (void); void foo (int n, int a[n + 1][bar () + 2], int b[sizeof (a[0]) + 32]) { } ?) the unsharing variant I've pasted into the PR. Jakub
[committed] c++: Some C++20 and C++23 option help fixes
Hi! I've noticed we still refer to C++20 as draft standard, and there is a pasto in C++23 description. Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk as obvious. 2021-01-28 Jakub Jelinek * c.opt (-std=c++2a, -std=c++20, -std=gnu++2a, -std=gnu++20): Remove draft from description. (-std=c++2b): Fix a pasto, 2020 -> 2023. --- gcc/c-family/c.opt.jj 2021-01-27 10:05:35.231942979 +0100 +++ gcc/c-family/c.opt 2021-01-27 21:29:45.398365022 +0100 @@ -2208,15 +2208,15 @@ Conform to the ISO 2017 C++ standard. std=c++2a C++ ObjC++ Alias(std=c++20) Undocumented -Conform to the ISO 2020 C++ draft standard (experimental and incomplete support). +Conform to the ISO 2020 C++ standard (experimental and incomplete support). std=c++20 C++ ObjC++ -Conform to the ISO 2020 C++ draft standard (experimental and incomplete support). +Conform to the ISO 2020 C++ standard (experimental and incomplete support). std=c++2b C++ ObjC++ Alias(std=c++23) -Conform to the ISO 2020 C++ draft standard (experimental and incomplete support). +Conform to the ISO 2023 C++ draft standard (experimental and incomplete support). std=c++23 C++ ObjC++ Undocumented @@ -2294,11 +2294,11 @@ Conform to the ISO 2017 C++ standard wit std=gnu++2a C++ ObjC++ Alias(std=gnu++20) Undocumented -Conform to the ISO 2020 C++ draft standard with GNU extensions (experimental and incomplete support). +Conform to the ISO 2020 C++ standard with GNU extensions (experimental and incomplete support). std=gnu++20 C++ ObjC++ -Conform to the ISO 2020 C++ draft standard with GNU extensions (experimental and incomplete support). +Conform to the ISO 2020 C++ standard with GNU extensions (experimental and incomplete support). std=gnu++2b C++ ObjC++ Alias(std=gnu++23) Jakub
Re: [PATCH] libstdc++: implement locale support for AIX
>> * While you now define _GLIBCXX_C_LOCALE_XPG7 in >> config/locale/xpg7/c_locale.h, config/os/aix/ctype_configure_char.cc >> still tests the previous _GLIBCXX_C_LOCALE_IEEE_2008. > > Arf, I've missed that. it might not be the last patch then. > (I've made so much versions of it as I've tried to backport it to our > old versions. It seems that I've mixed up things...) Well, after several tries this morning, I can tell that I still don't understand at all how to setup this ctype for char... I'm sure it have improved things at a moment. But now, whether I'm removing or adding it, nothing changes.. Anyway, _GLIBCXX_C_LOCALE_XPG7 is needed for gnu or for the new testsuite check. So I'm just going to update aix/ctype_configure_char.c locally. I'll see later if I can figure what's wrong. Clément going to update it locally.
Re: [PATCH PR97627]Avoid computing niters info for fake edges
On Thu, Jan 28, 2021 at 5:08 PM Richard Biener via Gcc-patches wrote: > > On Thu, Jan 28, 2021 at 3:49 AM bin.cheng via Gcc-patches > wrote: > > > > Hi, > > As described in commit message, we need to avoid computing niters info for > > fake > > edges. This simple patch does this by two changes. > > > > Bootstrap and test on X86_64, is it ok? > > Hmm, so I think the patch is a bit complicated and avoiding niter compute > for fake edges would be easier when just returning false for > fake edges in number_of_iterations_exit_assumptions? I just grepped calls to get_loop_exit_edges, and thought there might be cases other than niters analysis that would also like to skip fake edges. But I didn't check the calls one by one. > > Which pass was the problematical that had infinite loops connected to exit? > > I guess the cfgloop code should simply ignore fake exits - they mostly > exist to make reverse CFG walks easy. Specifically single_exit > and single_likely_exit but also exit edge recording should ignore them. > > That said, the testcase seems to be fixed with just > > diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c > index 7d61ef080eb..7775bc7275c 100644 > --- a/gcc/tree-ssa-loop-niter.c > +++ b/gcc/tree-ssa-loop-niter.c > @@ -2407,6 +2407,11 @@ number_of_iterations_exit_assumptions (class > loop *loop, edge exit, >affine_iv iv0, iv1; >bool safe; > > + /* The condition at a fake exit (if it exists) does not control its > + execution. */ > + if (exit->flags & EDGE_FAKE) > +return false; > + >/* Nothing to analyze if the loop is known to be infinite. */ >if (loop_constraint_set_p (loop, LOOP_C_INFINITE)) > return false; > > Your dfs_find_deadend change likely "breaks" post-dominance DFS order > (this is a very fragile area). > > So any objection to just simplify the patch to the above hunk? Considering we are in late stage3? No objection to this change. But I do think dfs_find_deadend needs to be improved, if not as this patch does. For a loop nest with the outermost loop as the infinite one, the function adds fake (exit) edges for inner loops, which is counter-intuitive. Thanks, bin > > Thanks, > Richard. > > > Thanks, > > bin
[PATCH] PING lra: clear lra_insn_recog_data after simplifying a mem subreg
Hello, I would like to ping the following patch: lra: clear lra_insn_recog_data after simplifying a mem subreg https://gcc.gnu.org/pipermail/gcc-patches/2021-January/563428.html Best regards, Ilya
[PATCH] aarch64: Reimplement vshrn_n* intrinsics using builtins
Hi all, This patch reimplements the vshrn_n* intrinsics to use RTL builtins. These perform a narrowing right shift. Although the intrinsic generates the half-width mode (e.g. V8HI -> V8QI), the new pattern generates a full 128-bit mode (V8HI -> V16QI) by representing the fill-with-zeroes semantics of the SHRN instruction. The narrower (V8QI) result is extracted with a lowpart subreg. I found this allows the RTL optimisers to do a better job at optimising redundant moves away in frequently-occurring SHRN+SRHN2 pairs, like in: uint8x16_t foo (uint16x8_t in1, uint16x8_t in2) { uint8x8_t tmp = vshrn_n_u16 (in2, 7); uint8x16_t tmp2 = vshrn_high_n_u16 (tmp, in1, 4); return tmp2; } Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf. Pushing to trunk. Thanks, Kyrill gcc/ChangeLog: * config/aarch64/aarch64-simd-builtins.def (shrn): Define builtin. * config/aarch64/aarch64-simd.md (aarch64_shrn_insn_le): Define. (aarch64_shrn_insn_be): Likewise. (aarch64_shrn): Likewise. * config/aarch64/arm_neon.h (vshrn_n_s16): Reimplement using builtins. (vshrn_n_s32): Likewise. (vshrn_n_s64): Likewise. (vshrn_n_u16): Likewise. (vshrn_n_u32): Likewise. (vshrn_n_u64): Likewise. * config/aarch64/iterators.md (vn_mode): New mode attribute. vshrn.patch Description: vshrn.patch
[PATCH] aarch64: Reimplement vshrn_high_n* intrinsics using builtins
Hi all, This patch reimplements the vshrn_high_n* intrinsics that generate the SHRN2 instruction. It is a vec_concat of the narrowing shift with the bottom part of the destination register, so we need a little-endian and a big-endian version and an expander to pick between them. Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf. Pushing to trunk. Thanks, Kyrill gcc/ChangeLog: * config/aarch64/aarch64-simd-builtins.def (shrn2): Define builtin. * config/aarch64/aarch64-simd.md (aarch64_shrn2_insn_le): Define. (aarch64_shrn2_insn_be): Likewise. (aarch64_shrn2): Likewise. * config/aarch64/arm_neon.h (vshrn_high_n_s16): Reimlplement using builtins. (vshrn_high_n_s32): Likewise. (vshrn_high_n_s64): Likewise. (vshrn_high_n_u16): Likewise. (vshrn_high_n_u32): Likewise. (vshrn_high_n_u64): Likewise. vshrn-hi.patch Description: vshrn-hi.patch
aarch64: Use RTL builtins for [su]mlal_n intrinsics
Hi, As subject, this patch rewrites [su]mlal_n Neon intrinsics to use RTL builtins rather than inline assembly code, allowing for better scheduling and optimization. Regression tested and bootstrapped on aarch64-none-linux-gnu - no issues. Ok for master? Thanks, Jonathan --- gcc/ChangeLog: 2021-01-26 Jonathan Wright * config/aarch64/aarch64-simd-builtins.def: Add [su]mlal_n builtin generator macros. * config/aarch64/aarch64-simd.md (aarch64_mlal_n): Define. * config/aarch64/arm_neon.h (vmlal_n_s16): Use RTL builtin instead of inline asm. (vmlal_n_s32): Likewise. (vmlal_n_u16): Likewise. (vmlal_n_u32): Likewise. diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def index a71ae4d724136c8b626d397bf6187e8b595a2b8a..4f8e28dc3c8478ea50aad333b21bd83f4a4b750e 100644 --- a/gcc/config/aarch64/aarch64-simd-builtins.def +++ b/gcc/config/aarch64/aarch64-simd-builtins.def @@ -196,6 +196,10 @@ BUILTIN_VD_BHSI (TERNOP, smlal, 0, NONE) BUILTIN_VD_BHSI (TERNOPU, umlal, 0, NONE) + /* Implemented by aarch64_mlal_n. */ + BUILTIN_VD_HSI (TERNOP, smlal_n, 0, NONE) + BUILTIN_VD_HSI (TERNOPU, umlal_n, 0, NONE) + /* Implemented by aarch64_mlsl_hi. */ BUILTIN_VQW (TERNOP, smlsl_hi, 0, NONE) BUILTIN_VQW (TERNOPU, umlsl_hi, 0, NONE) diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index db56b61baf2093c88d8757b25580b3032f00a355..d78f26be19a16163eb1b8f661c6100ac290e6c6b 100644 --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -1839,6 +1839,21 @@ [(set_attr "type" "neon_mla__long")] ) +(define_insn "aarch64_mlal_n" + [(set (match_operand: 0 "register_operand" "=w") +(plus: + (mult: +(ANY_EXTEND: + (vec_duplicate:VD_HSI + (match_operand: 3 "register_operand" ""))) +(ANY_EXTEND: + (match_operand:VD_HSI 2 "register_operand" "w"))) + (match_operand: 1 "register_operand" "0")))] + "TARGET_SIMD" + "mlal\t%0., %2., %3.[0]" + [(set_attr "type" "neon_mla__long")] +) + (define_insn "aarch64_mlsl" [(set (match_operand: 0 "register_operand" "=w") (minus: diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h index 674ccc63b69ca1945dc684d2b06c1e31f52bfdb3..004c73d9e0ec4c33e24968d17e4307f858b51263 100644 --- a/gcc/config/aarch64/arm_neon.h +++ b/gcc/config/aarch64/arm_neon.h @@ -7608,48 +7608,28 @@ __extension__ extern __inline int32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vmlal_n_s16 (int32x4_t __a, int16x4_t __b, int16_t __c) { - int32x4_t __result; - __asm__ ("smlal %0.4s,%2.4h,%3.h[0]" - : "=w"(__result) - : "0"(__a), "w"(__b), "x"(__c) - : /* No clobbers */); - return __result; + return __builtin_aarch64_smlal_nv4hi (__a, __b, __c); } __extension__ extern __inline int64x2_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vmlal_n_s32 (int64x2_t __a, int32x2_t __b, int32_t __c) { - int64x2_t __result; - __asm__ ("smlal %0.2d,%2.2s,%3.s[0]" - : "=w"(__result) - : "0"(__a), "w"(__b), "w"(__c) - : /* No clobbers */); - return __result; + return __builtin_aarch64_smlal_nv2si (__a, __b, __c); } __extension__ extern __inline uint32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vmlal_n_u16 (uint32x4_t __a, uint16x4_t __b, uint16_t __c) { - uint32x4_t __result; - __asm__ ("umlal %0.4s,%2.4h,%3.h[0]" - : "=w"(__result) - : "0"(__a), "w"(__b), "x"(__c) - : /* No clobbers */); - return __result; + return __builtin_aarch64_umlal_nv4hi_ (__a, __b, __c); } __extension__ extern __inline uint64x2_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vmlal_n_u32 (uint64x2_t __a, uint32x2_t __b, uint32_t __c) { - uint64x2_t __result; - __asm__ ("umlal %0.2d,%2.2s,%3.s[0]" - : "=w"(__result) - : "0"(__a), "w"(__b), "w"(__c) - : /* No clobbers */); - return __result; + return __builtin_aarch64_umlal_nv2si_ (__a, __b, __c); } __extension__ extern __inline int16x8_t
c++: header unit template alias merging [PR 98770]
Typedefs are streamed by streaming the underlying type, and then recreating the typedef. But this breaks checking a duplicate is the same as the original when it is a template alias -- we end up checking a template alias (eg __void_t) against the underlying type (void). And those are not the same template alias. This stops pretendig that the underlying type is the typedef for that checking and tells is_matching_decl 'you have a typedef', so it knows what to do. (We do not want to recreate the typedef of the duplicate, because that whole set of nodes is going to go away.) PR c++/98770 gcc/cp/ gcc/testsuite/ * g++.dg/modules/pr98770_a.C: New. * g++.dg/modules/pr98770_b.C: New. -- Nathan Sidwell diff --git c/gcc/cp/module.cc w/gcc/cp/module.cc index 18f5de8724b..daf75b16007 100644 --- c/gcc/cp/module.cc +++ w/gcc/cp/module.cc @@ -3029,7 +3029,7 @@ public: bool read_definition (tree decl); private: - bool is_matching_decl (tree existing, tree decl); + bool is_matching_decl (tree existing, tree decl, bool is_typedef); static bool install_implicit_member (tree decl); bool read_function_def (tree decl, tree maybe_template); bool read_var_def (tree decl, tree maybe_template); @@ -7864,8 +7864,8 @@ trees_out::decl_value (tree decl, depset *dep) || !dep == (VAR_OR_FUNCTION_DECL_P (inner) && DECL_LOCAL_DECL_P (inner))); else if ((TREE_CODE (inner) == TYPE_DECL - && TYPE_NAME (TREE_TYPE (inner)) == inner - && !is_typedef) + && !is_typedef + && TYPE_NAME (TREE_TYPE (inner)) == inner) || TREE_CODE (inner) == FUNCTION_DECL) { bool write_defn = !dep && has_definition (decl); @@ -8088,12 +8088,6 @@ trees_in::decl_value () && TREE_CODE (inner) == TYPE_DECL && DECL_ORIGINAL_TYPE (inner) && !TREE_TYPE (inner)); - if (is_typedef) -{ - /* Frob it to be ready for cloning. */ - TREE_TYPE (inner) = DECL_ORIGINAL_TYPE (inner); - DECL_ORIGINAL_TYPE (inner) = NULL_TREE; -} existing = back_refs[~tag]; bool installed = install_entity (existing); @@ -8156,7 +8150,12 @@ trees_in::decl_value () } if (is_typedef) - set_underlying_type (inner); + { + /* Frob it to be ready for cloning. */ + TREE_TYPE (inner) = DECL_ORIGINAL_TYPE (inner); + DECL_ORIGINAL_TYPE (inner) = NULL_TREE; + set_underlying_type (inner); + } if (inner_tag) /* Set the TEMPLATE_DECL's type. */ @@ -8218,7 +8217,7 @@ trees_in::decl_value () /* Set the TEMPLATE_DECL's type. */ TREE_TYPE (decl) = TREE_TYPE (inner); - if (!is_matching_decl (existing, decl)) + if (!is_matching_decl (existing, decl, is_typedef)) unmatched_duplicate (existing); /* And our result is the existing node. */ @@ -8257,8 +8256,8 @@ trees_in::decl_value () if (inner && !NAMESPACE_SCOPE_P (inner) && ((TREE_CODE (inner) == TYPE_DECL - && TYPE_NAME (TREE_TYPE (inner)) == inner - && !is_typedef) + && !is_typedef + && TYPE_NAME (TREE_TYPE (inner)) == inner) || TREE_CODE (inner) == FUNCTION_DECL) && u ()) read_definition (decl); @@ -11088,7 +11087,7 @@ trees_in::binfo_mergeable (tree *type) decls_match because it can cause instantiations of constraints. */ bool -trees_in::is_matching_decl (tree existing, tree decl) +trees_in::is_matching_decl (tree existing, tree decl, bool is_typedef) { // FIXME: We should probably do some duplicate decl-like stuff here // (beware, default parms should be the same?) Can we just call @@ -11099,35 +11098,36 @@ trees_in::is_matching_decl (tree existing, tree decl) // can elide some of the checking gcc_checking_assert (TREE_CODE (existing) == TREE_CODE (decl)); - tree inner = decl; + tree d_inner = decl; + tree e_inner = existing; if (TREE_CODE (decl) == TEMPLATE_DECL) { - inner = DECL_TEMPLATE_RESULT (decl); - gcc_checking_assert (TREE_CODE (DECL_TEMPLATE_RESULT (existing)) - == TREE_CODE (inner)); + d_inner = DECL_TEMPLATE_RESULT (d_inner); + e_inner = DECL_TEMPLATE_RESULT (e_inner); + gcc_checking_assert (TREE_CODE (e_inner) == TREE_CODE (d_inner)); } gcc_checking_assert (!map_context_from); /* This mapping requres the new decl on the lhs and the existing entity on the rhs of the comparitors below. */ - map_context_from = inner; - map_context_to = STRIP_TEMPLATE (existing); + map_context_from = d_inner; + map_context_to = e_inner; - if (TREE_CODE (inner) == FUNCTION_DECL) + if (TREE_CODE (d_inner) == FUNCTION_DECL) { tree e_ret = fndecl_declared_return_type (existing); tree d_ret = fndecl_declared_return_type (decl); - if (decl != inner && DECL_NAME (inner) == fun_identifier - && LAMBDA_TYPE_P (DECL_CONTEXT (inner))) + if (decl != d_inner && DECL_NAME (d_inner) == fun_identifier + && LAMBDA_TYPE_P (DECL_CONTEXT (d_inner))) /* This has a recursive type that will compare dif
Re: c++: header unit template alias merging [PR 98770]
On 1/28/21 7:54 AM, Nathan Sidwell wrote: Typedefs are streamed by streaming the underlying type, and then recreating the typedef. But this breaks checking a duplicate is the same as the original when it is a template alias -- we end up checking a template alias (eg __void_t) against the underlying type (void). And those are not the same template alias. This stops pretendig that the underlying type is the typedef for that checking and tells is_matching_decl 'you have a typedef', so it knows what to do. (We do not want to recreate the typedef of the duplicate, because that whole set of nodes is going to go away.) d'oh! PR c++/98770 gcc/cp/ * module.cc (trees_out::decl_value): Swap is_typedef & TYPE_NAME check order. (trees_in::decl_value): Do typedef frobbing only when installing a new typedef, adjust is_matching_decl call. Swap is_typedef & TYPE_NAME check. (trees_in::is_matching_decl): Add is_typedef parm. Adjust variable names and deal with typedef checking. gcc/testsuite/ * g++.dg/modules/pr98770_a.C: New. * g++.dg/modules/pr98770_b.C: New. -- Nathan Sidwell
RE: aarch64: Use RTL builtins for [su]mlal_n intrinsics
> -Original Message- > From: Jonathan Wright > Sent: 28 January 2021 12:04 > To: gcc-patches@gcc.gnu.org > Cc: Kyrylo Tkachov > Subject: aarch64: Use RTL builtins for [su]mlal_n intrinsics > > Hi, > > As subject, this patch rewrites [su]mlal_n Neon intrinsics to use RTL builtins > rather than inline assembly code, allowing for better scheduling and > optimization. > > Regression tested and bootstrapped on aarch64-none-linux-gnu - no > issues. > > Ok for master? Ok. Thanks, Kyrill > > Thanks, > Jonathan > > --- > > gcc/ChangeLog: > > 2021-01-26 Jonathan Wright > > * config/aarch64/aarch64-simd-builtins.def: Add [su]mlal_n > builtin generator macros. > * config/aarch64/aarch64-simd.md (aarch64_mlal_n): > Define. > * config/aarch64/arm_neon.h (vmlal_n_s16): Use RTL builtin > instead of inline asm. > (vmlal_n_s32): Likewise. > (vmlal_n_u16): Likewise. > (vmlal_n_u32): Likewise.
Re: [PATCH] c++: Fix up handling of register ... asm ("...") vars in templates [PR33661, PR98847]
On 1/28/21 3:52 AM, Jakub Jelinek wrote: Hi! As the testcase shows, for vars appearing in templates, we don't attach the asm spec string to the pattern decls, nor pass it back to cp_finish_decl during instantiation. The following patch does that. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? OK. 2021-01-27 Jakub Jelinek PR c++/33661 PR c++/98847 * decl.c (cp_finish_decl): For register vars with asmspec in templates call set_user_assembler_name and set DECL_HARD_REGISTER. * pt.c (tsubst_expr): When instantiating DECL_HARD_REGISTER vars, pass asmspec_tree to cp_finish_decl. * g++.target/i386/pr98847.C: New test. --- gcc/cp/decl.c.jj2021-01-07 22:54:34.0 +0100 +++ gcc/cp/decl.c 2021-01-27 10:48:14.399582310 +0100 @@ -7840,6 +7840,12 @@ cp_finish_decl (tree decl, tree init, bo retrofit_lang_decl (decl); SET_DECL_DEPENDENT_INIT_P (decl, true); } + + if (VAR_P (decl) && DECL_REGISTER (decl) && asmspec) + { + set_user_assembler_name (decl, asmspec); + DECL_HARD_REGISTER (decl) = 1; + } return; } --- gcc/cp/pt.c.jj 2021-01-22 10:07:53.643466723 +0100 +++ gcc/cp/pt.c 2021-01-27 10:59:04.109126313 +0100 @@ -18227,6 +18227,7 @@ tsubst_expr (tree t, tree args, tsubst_f bool const_init = false; unsigned int cnt = 0; tree first = NULL_TREE, ndecl = error_mark_node; + tree asmspec_tree = NULL_TREE; maybe_push_decl (decl); if (VAR_P (decl) @@ -18250,7 +18251,18 @@ tsubst_expr (tree t, tree args, tsubst_f now. */ predeclare_vla (decl); - cp_finish_decl (decl, init, const_init, NULL_TREE, 0); + if (VAR_P (decl) && DECL_HARD_REGISTER (pattern_decl)) + { + tree id = DECL_ASSEMBLER_NAME (pattern_decl); + const char *asmspec = IDENTIFIER_POINTER (id); + gcc_assert (asmspec[0] == '*'); + asmspec_tree + = build_string (IDENTIFIER_LENGTH (id) - 1, + asmspec + 1); + TREE_TYPE (asmspec_tree) = char_array_type_node; + } + + cp_finish_decl (decl, init, const_init, asmspec_tree, 0); if (ndecl != error_mark_node) cp_finish_decomp (ndecl, first, cnt); --- gcc/testsuite/g++.target/i386/pr98847.C.jj 2021-01-27 11:01:55.312161637 +0100 +++ gcc/testsuite/g++.target/i386/pr98847.C 2021-01-27 11:00:26.601179659 +0100 @@ -0,0 +1,20 @@ +// PR c++/98847 +// { dg-do run } +// { dg-options "-O2 -masm=att" } + +template +int +foo () +{ + register int edx asm ("edx"); + asm ("movl $1234, %%edx" : "=r" (edx)); + return edx; +} + +int +main () +{ + if (foo<0> () != 1234) +__builtin_abort (); + return 0; +} Jakub
Re: [PATCH][X86] Enable X86_TUNE_AVX256_UNALIGNED_{LOAD, STORE}_OPTIMAL for generic tune [PR target/98172]
On Thu, Jan 28, 2021 at 1:21 AM Richard Biener via Gcc-patches wrote: > > On Thu, Jan 28, 2021 at 7:32 AM Hongtao Liu via Gcc-patches > wrote: > > > > Hi: > >GCC11 will be the system GCC 2 years from now, and for the > > processors then, they shouldn't even need to split a 256-bit vector > > into 2 128-bits vectors. > >.i.e. Test SPEC2017 with the below 2 options on Zen3/ICL show > > option B is better than Option A. > > Option A: > > -march=x86-64 -mtune=generic -mavx2 -mfma -Ofast > > > > Option B: > > Option A + > > -mtune-ctrl="256_unaligned_load_optimal,256_unaligned_store_optimal" > > > > Bootstrapped and regtested on x86-64_iinux-gnu{-m32,}. > > Given the explicit list for unaligned loads it's a no-brainer to change that > for X86_TUNE_AVX256_UNALIGNED_LOAD_OPTIMAL. Given both > BDVER and ZNVER1 are listed for X86_TUNE_AVX256_UNALIGNED_STORE_OPTIMAL > we should try to benchmark the effect on ZNVER1 - Martin, do we still > have a znver1 machine around? They are also turned on for Sandybridge. I don't believe we should keep it in GCC 11 to penalize today's CPUs as well as CPUs in 2024. > Note that with the settings differing in a way to split stores but not to > split > loads, loading a just stored value can cause bad STLF and quite a > performance hit (since znver1 has 128bit data paths that shouldn't > be an issue there but it would have an issue for actually aligned data > on CPUs with 256bit data paths). > > Thanks, > Richard. > > > Ok for trunk? > > > > > > > > > > -- > > BR, > > Hongtao -- H.J.
[PATCH] aarch64: Use RTL builtins for [su]mlsl_n intrinsics
Hi, As subject, this patch rewrites [su]mlsl_n Neon intrinsics to use RTL builtins rather than inline assembly code, allowing for better scheduling and optimization. Regression tested and bootstrapped on aarch64-none-linux-gnu - no issues. Ok for master? Thanks, Jonathan --- gcc/ChangeLog: 2021-01-27 Jonathan Wright * config/aarch64/aarch64-simd-builtins.def: Add [su]mlsl_n builtin generator macros. * config/aarch64/aarch64-simd.md (aarch64_mlsl_n): Define. * config/aarch64/arm_neon.h (vmlsl_n_s16): Use RTL builtin instead of inline asm. (vmlsl_n_s32): Likewise. (vmlsl_n_u16): Likewise. (vmlsl_n_u32): Likewise. diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def index 4f8e28dc3c8478ea50aad333b21bd83f4a4b750e..2b582bee9133039b05b4fdbef92766a30caeab20 100644 --- a/gcc/config/aarch64/aarch64-simd-builtins.def +++ b/gcc/config/aarch64/aarch64-simd-builtins.def @@ -192,6 +192,10 @@ BUILTIN_VD_BHSI (TERNOP, smlsl, 0, NONE) BUILTIN_VD_BHSI (TERNOPU, umlsl, 0, NONE) + /* Implemented by aarch64_mlsl_n. */ + BUILTIN_VD_HSI (TERNOP, smlsl_n, 0, NONE) + BUILTIN_VD_HSI (TERNOPU, umlsl_n, 0, NONE) + /* Implemented by aarch64_mlal. */ BUILTIN_VD_BHSI (TERNOP, smlal, 0, NONE) BUILTIN_VD_BHSI (TERNOPU, umlal, 0, NONE) diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index d78f26be19a16163eb1b8f661c6100ac290e6c6b..f2539cf84e30032ed609c12de7530d3e9be77b60 100644 --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -1868,6 +1868,21 @@ [(set_attr "type" "neon_mla__long")] ) +(define_insn "aarch64_mlsl_n" + [(set (match_operand: 0 "register_operand" "=w") +(minus: + (match_operand: 1 "register_operand" "0") + (mult: +(ANY_EXTEND: + (vec_duplicate:VD_HSI + (match_operand: 3 "register_operand" ""))) +(ANY_EXTEND: + (match_operand:VD_HSI 2 "register_operand" "w")] + "TARGET_SIMD" + "mlsl\t%0., %2., %3.[0]" + [(set_attr "type" "neon_mla__long")] +) + (define_insn "aarch64_simd_vec_mult_lo_" [(set (match_operand: 0 "register_operand" "=w") (mult: (ANY_EXTEND: (vec_select: diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h index 004c73d9e0ec4c33e24968d17e4307f858b51263..95c5e36530f1a3b72672f62737ced45704323fff 100644 --- a/gcc/config/aarch64/arm_neon.h +++ b/gcc/config/aarch64/arm_neon.h @@ -8166,48 +8166,28 @@ __extension__ extern __inline int32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vmlsl_n_s16 (int32x4_t __a, int16x4_t __b, int16_t __c) { - int32x4_t __result; - __asm__ ("smlsl %0.4s, %2.4h, %3.h[0]" - : "=w"(__result) - : "0"(__a), "w"(__b), "x"(__c) - : /* No clobbers */); - return __result; + return __builtin_aarch64_smlsl_nv4hi (__a, __b, __c); } __extension__ extern __inline int64x2_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vmlsl_n_s32 (int64x2_t __a, int32x2_t __b, int32_t __c) { - int64x2_t __result; - __asm__ ("smlsl %0.2d, %2.2s, %3.s[0]" - : "=w"(__result) - : "0"(__a), "w"(__b), "w"(__c) - : /* No clobbers */); - return __result; + return __builtin_aarch64_smlsl_nv2si (__a, __b, __c); } __extension__ extern __inline uint32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vmlsl_n_u16 (uint32x4_t __a, uint16x4_t __b, uint16_t __c) { - uint32x4_t __result; - __asm__ ("umlsl %0.4s, %2.4h, %3.h[0]" - : "=w"(__result) - : "0"(__a), "w"(__b), "x"(__c) - : /* No clobbers */); - return __result; + return __builtin_aarch64_umlsl_nv4hi_ (__a, __b, __c); } __extension__ extern __inline uint64x2_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vmlsl_n_u32 (uint64x2_t __a, uint32x2_t __b, uint32_t __c) { - uint64x2_t __result; - __asm__ ("umlsl %0.2d, %2.2s, %3.s[0]" - : "=w"(__result) - : "0"(__a), "w"(__b), "w"(__c) - : /* No clobbers */); - return __result; + return __builtin_aarch64_umlsl_nv2si_ (__a, __b, __c); } __extension__ extern __inline int16x8_t
[PATCH] aarch64: Fix gcc.target/aarch64/narrow_high-intrinsics.c testism
Pushing to fix recently-updated assembly generation. gcc/testsuite/ * gcc.target/aarch64/narrow_high-intrinsics.c: Fix shrn2 scan. shrn2.patch Description: shrn2.patch
RE: [PATCH] aarch64: Use RTL builtins for [su]mlsl_n intrinsics
> -Original Message- > From: Jonathan Wright > Sent: 28 January 2021 13:23 > To: gcc-patches@gcc.gnu.org > Cc: Kyrylo Tkachov > Subject: [PATCH] aarch64: Use RTL builtins for [su]mlsl_n intrinsics > > Hi, > > As subject, this patch rewrites [su]mlsl_n Neon intrinsics to use RTL builtins > rather than inline assembly code, allowing for better scheduling and > optimization. > > Regression tested and bootstrapped on aarch64-none-linux-gnu - no > issues. > > Ok for master? > Ok. Thanks, Kyrill > Thanks, > Jonathan > > --- > > gcc/ChangeLog: > > 2021-01-27 Jonathan Wright > > * config/aarch64/aarch64-simd-builtins.def: Add [su]mlsl_n > builtin generator macros. > * config/aarch64/aarch64-simd.md (aarch64_mlsl_n): > Define. > * config/aarch64/arm_neon.h (vmlsl_n_s16): Use RTL builtin > instead of inline asm. > (vmlsl_n_s32): Likewise. > (vmlsl_n_u16): Likewise. > (vmlsl_n_u32): Likewise.
RE: [PATCH] c++: fix string literal member initializer bug [PR90926]
While trying to fix the suggested overload resolution issue I run into another bug: struct A { char str[4]; }; void f(A) {}; int main () { f({"foo"}); } Does not compile on GCC: "error: could not convert ‘{"foo"}’ from ‘’ to ‘A’", but works fine on Clang. Is this a known bug or a new one? -Original Message- From: Jason Merrill Sent: 22 January 2021 16:30 To: Tom Greenslade (thomgree) ; gcc-patches@gcc.gnu.org Subject: Re: [PATCH] c++: fix string literal member initializer bug [PR90926] On 12/17/20 5:12 PM, Thomas Greenslade (thomgree) via Gcc-patches wrote: > build_aggr_conv did not correctly handle string literal member initializers. > Extended can_convert_array to handle this case. The additional checks > of compatibility of character types, and whether string literal will > fit, would be quite complicated, so are deferred until the actual conversion > takes place. It seems that we need to check the type, though not the length. [over.ics.list]: "Otherwise, if the parameter type is a character array 125 and the initializer list has a single element that is an appropriately-typed string-literal (9.4.2), the implicit conversion sequence is the identity conversion." So this should be unambiguous: struct A { char str[10]; }; struct B { char16_t str[10]; }; void f(A); void f(B); int main () { f({"foo"}); // calls A overload f({u"foo"}); // calls B overload } You could factor the type matching code out of digest_init_r and use it here. > Testcase added for this. > > Bootstrapped/regtested on x86_64-pc-linux-gnu. > > gcc/cp/ChangeLog: > > PR c++/90926 > * call.c (can_convert_array): Extend to handle all valid aggregate > initializers of an array; including by string literals, not just by > brace-init-list. > (build_aggr_conv): Call can_convert_array more often, not just in > brace-init-list case. > * g++.dg/cpp1y/nsdmi-aggr12.C: New test. > > diff --git a/gcc/cp/call.c b/gcc/cp/call.c index > c2d62e582bf..e4ba31f3f2b 100644 > --- a/gcc/cp/call.c > +++ b/gcc/cp/call.c > @@ -887,28 +887,41 @@ strip_standard_conversion (conversion *conv) > return conv; > } > > -/* Subroutine of build_aggr_conv: check whether CTOR, a braced-init-list, > - is a valid aggregate initializer for array type ATYPE. */ > +/* Subroutine of build_aggr_conv: check whether FROM is a valid aggregate > + initializer for array type ATYPE. */ > > static bool > -can_convert_array (tree atype, tree ctor, int flags, tsubst_flags_t > complain) > +can_convert_array (tree atype, tree from, int flags, tsubst_flags_t > +complain) > { > - unsigned i; > tree elttype = TREE_TYPE (atype); > - for (i = 0; i < CONSTRUCTOR_NELTS (ctor); ++i) > + unsigned i; > + > + if (TREE_CODE (from) == CONSTRUCTOR) > { > - tree val = CONSTRUCTOR_ELT (ctor, i)->value; > - bool ok; > - if (TREE_CODE (elttype) == ARRAY_TYPE > - && TREE_CODE (val) == CONSTRUCTOR) > - ok = can_convert_array (elttype, val, flags, complain); > - else > - ok = can_convert_arg (elttype, TREE_TYPE (val), val, flags, > - complain); > - if (!ok) > - return false; > + for (i = 0; i < CONSTRUCTOR_NELTS (from); ++i) > + { > + tree val = CONSTRUCTOR_ELT (from, i)->value; > + bool ok; > + if (TREE_CODE (elttype) == ARRAY_TYPE) > + ok = can_convert_array (elttype, val, flags, complain); > + else > + ok = can_convert_arg (elttype, TREE_TYPE (val), val, flags, > + complain); > + if (!ok) > + return false; > + } > + return true; > } > - return true; > + > + if ( char_type_p (TYPE_MAIN_VARIANT (elttype)) > + && TREE_CODE (tree_strip_any_location_wrapper (from)) == STRING_CST) > +/* Defer the other necessary checks (compatibility of character types and > + whether string literal will fit) until the conversion actually takes > + place. */ > +return true; > + > + /* No other valid way to aggregate initialize an array. */ return > + false; > } > > /* Helper for build_aggr_conv. Return true if FIELD is in PSET, or > if @@ -965,8 +978,7 @@ build_aggr_conv (tree type, tree ctor, int flags, > tsubst_flags_t complain) >tree ftype = TREE_TYPE (idx); >bool ok; > > - if (TREE_CODE (ftype) == ARRAY_TYPE > - && TREE_CODE (val) == CONSTRUCTOR) > + if (TREE_CODE (ftype) == ARRAY_TYPE) > ok = can_convert_array (ftype, val, flags, complain); >else > ok = can_convert_arg (ftype, TREE_TYPE (val), val, > flags, @@ -1013,9 +1025,8 @@ build_aggr_conv (tree type, tree ctor, int > flags, tsubst_flags_t complain) >val = empty_ctor; > } > > - if (TREE_CODE (ftype) == ARRAY_TYPE > - &&
Re: [PATCH] c++: Fix -Weffc++ in templates [PR98841]
On 1/28/21 3:58 AM, Jakub Jelinek wrote: Hi! We emit a bogus warning on the following testcase, suggesting that the operator should return *this even when it does that already. The problem is that normally cp_build_indirect_ref_1 ensures that *this is folded as current_class_ref, but in templates (if return type is non-dependent, otherwise check_return_expr doesn't check it) it didn't go through cp_build_indirect_ref_1, but just built another INDIRECT_REF. That seems like a bug in build_x_indirect_ref. Which means it then doesn't compare pointer-equal to current_class_ref. The following patch just checks if it is the same thing like cp_build_indirect_ref_1 would check if it would be called. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2021-01-27 Jakub Jelinek PR c++/98841 * typeck.c (check_return_expr): If processing_template_decl, don't rely on *this expression being always pointer-equal to current_class_ref. * g++.dg/warn/effc5.C: New test. --- gcc/cp/typeck.c.jj 2021-01-25 10:02:28.380126847 +0100 +++ gcc/cp/typeck.c 2021-01-27 11:21:13.117875551 +0100 @@ -10188,6 +10188,19 @@ check_return_expr (tree retval, bool *no /* Returning '*this' is obviously OK. */ if (retval == current_class_ref) warn = false; + /* In templates, '*this' might not compare pointer equal to +current_class_ref. */ + else if (processing_template_decl + && retval + && TREE_CODE (retval) == INDIRECT_REF + && (TREE_OPERAND (retval, 0) == current_class_ptr + || (TREE_CODE (TREE_OPERAND (retval, 0)) == NOP_EXPR + && (TREE_OPERAND (TREE_OPERAND (retval, 0), 0) + == current_class_ptr) + && (same_type_ignoring_top_level_qualifiers_p +(TREE_TYPE (TREE_OPERAND (retval, 0)), + TREE_TYPE (current_class_ptr)) + warn = false; /* If we are calling a function whose return type is the same of the current class reference, it is ok. */ else if (INDIRECT_REF_P (retval) --- gcc/testsuite/g++.dg/warn/effc5.C.jj2021-01-27 11:24:34.451565246 +0100 +++ gcc/testsuite/g++.dg/warn/effc5.C 2021-01-27 11:23:23.836375556 +0100 @@ -0,0 +1,17 @@ +// PR c++/98841 +// { dg-do compile } +// { dg-options "-Weffc++" } + +struct S { + template + S& operator=(const T&) { return *this; } // { dg-bogus "should return a reference to" } + S& operator=(const S&) { return *this; } +}; + +void +foo () +{ + S s, t; + s = 1; + s = t; +} Jakub
Re: [PATCH] c++: fix string literal member initializer bug [PR90926]
On 1/28/21 9:54 AM, Tom Greenslade (thomgree) wrote: While trying to fix the suggested overload resolution issue I run into another bug: struct A { char str[4]; }; void f(A) {}; int main () { f({"foo"}); } Does not compile on GCC: "error: could not convert ‘{"foo"}’ from ‘’ to ‘A’", but works fine on Clang. Is this a known bug or a new one? That looks like the same bug you're fixing with this patch. -Original Message- From: Jason Merrill Sent: 22 January 2021 16:30 To: Tom Greenslade (thomgree) ; gcc-patches@gcc.gnu.org Subject: Re: [PATCH] c++: fix string literal member initializer bug [PR90926] On 12/17/20 5:12 PM, Thomas Greenslade (thomgree) via Gcc-patches wrote: build_aggr_conv did not correctly handle string literal member initializers. Extended can_convert_array to handle this case. The additional checks of compatibility of character types, and whether string literal will fit, would be quite complicated, so are deferred until the actual conversion takes place. It seems that we need to check the type, though not the length. [over.ics.list]: "Otherwise, if the parameter type is a character array 125 and the initializer list has a single element that is an appropriately-typed string-literal (9.4.2), the implicit conversion sequence is the identity conversion." So this should be unambiguous: struct A { char str[10]; }; struct B { char16_t str[10]; }; void f(A); void f(B); int main () { f({"foo"}); // calls A overload f({u"foo"}); // calls B overload } You could factor the type matching code out of digest_init_r and use it here. Testcase added for this. Bootstrapped/regtested on x86_64-pc-linux-gnu. gcc/cp/ChangeLog: PR c++/90926 * call.c (can_convert_array): Extend to handle all valid aggregate initializers of an array; including by string literals, not just by brace-init-list. (build_aggr_conv): Call can_convert_array more often, not just in brace-init-list case. * g++.dg/cpp1y/nsdmi-aggr12.C: New test. diff --git a/gcc/cp/call.c b/gcc/cp/call.c index c2d62e582bf..e4ba31f3f2b 100644 --- a/gcc/cp/call.c +++ b/gcc/cp/call.c @@ -887,28 +887,41 @@ strip_standard_conversion (conversion *conv) return conv; } -/* Subroutine of build_aggr_conv: check whether CTOR, a braced-init-list, - is a valid aggregate initializer for array type ATYPE. */ +/* Subroutine of build_aggr_conv: check whether FROM is a valid aggregate + initializer for array type ATYPE. */ static bool -can_convert_array (tree atype, tree ctor, int flags, tsubst_flags_t complain) +can_convert_array (tree atype, tree from, int flags, tsubst_flags_t +complain) { - unsigned i; tree elttype = TREE_TYPE (atype); - for (i = 0; i < CONSTRUCTOR_NELTS (ctor); ++i) + unsigned i; + + if (TREE_CODE (from) == CONSTRUCTOR) { - tree val = CONSTRUCTOR_ELT (ctor, i)->value; - bool ok; - if (TREE_CODE (elttype) == ARRAY_TYPE - && TREE_CODE (val) == CONSTRUCTOR) - ok = can_convert_array (elttype, val, flags, complain); - else - ok = can_convert_arg (elttype, TREE_TYPE (val), val, flags, - complain); - if (!ok) - return false; + for (i = 0; i < CONSTRUCTOR_NELTS (from); ++i) + { + tree val = CONSTRUCTOR_ELT (from, i)->value; + bool ok; + if (TREE_CODE (elttype) == ARRAY_TYPE) + ok = can_convert_array (elttype, val, flags, complain); + else + ok = can_convert_arg (elttype, TREE_TYPE (val), val, flags, + complain); + if (!ok) + return false; + } + return true; } - return true; + + if ( char_type_p (TYPE_MAIN_VARIANT (elttype)) + && TREE_CODE (tree_strip_any_location_wrapper (from)) == STRING_CST) +/* Defer the other necessary checks (compatibility of character types and + whether string literal will fit) until the conversion actually takes + place. */ +return true; + + /* No other valid way to aggregate initialize an array. */ return + false; } /* Helper for build_aggr_conv. Return true if FIELD is in PSET, or if @@ -965,8 +978,7 @@ build_aggr_conv (tree type, tree ctor, int flags, tsubst_flags_t complain) tree ftype = TREE_TYPE (idx); bool ok; - if (TREE_CODE (ftype) == ARRAY_TYPE - && TREE_CODE (val) == CONSTRUCTOR) + if (TREE_CODE (ftype) == ARRAY_TYPE) ok = can_convert_array (ftype, val, flags, complain); else ok = can_convert_arg (ftype, TREE_TYPE (val), val, flags, @@ -1013,9 +1025,8 @@ build_aggr_conv (tree type, tree ctor, int flags, tsubst_flags_t complain) val = empty_ctor; } - if (TREE_CODE (ftype) == ARRAY_TYPE - && TREE_CODE (val) == CONSTRUCTOR) - ok = can_convert_array (
[PATCH] c++, v2: Fix -Weffc++ in templates [PR98841]
On Thu, Jan 28, 2021 at 10:04:12AM -0500, Jason Merrill wrote: > > We emit a bogus warning on the following testcase, suggesting that the > > operator should return *this even when it does that already. > > The problem is that normally cp_build_indirect_ref_1 ensures that *this > > is folded as current_class_ref, but in templates (if return type is > > non-dependent, otherwise check_return_expr doesn't check it) it didn't > > go through cp_build_indirect_ref_1, but just built another INDIRECT_REF. > > That seems like a bug in build_x_indirect_ref. So do you want this instead (if it passes bootstrap/regtest)? 2021-01-28 Jakub Jelinek PR c++/98841 * typeck.c (build_x_indirect_ref): For *this, return current_class_ref. * g++.dg/warn/effc5.C: New test. --- gcc/cp/typeck.c.jj 2021-01-27 11:48:49.715890458 +0100 +++ gcc/cp/typeck.c 2021-01-28 16:17:18.712755173 +0100 @@ -3326,7 +3326,15 @@ build_x_indirect_ref (location_t loc, tr { /* Retain the type if we know the operand is a pointer. */ if (TREE_TYPE (expr) && INDIRECT_TYPE_P (TREE_TYPE (expr))) - return build_min (INDIRECT_REF, TREE_TYPE (TREE_TYPE (expr)), expr); + { + if (expr == current_class_ptr + || (TREE_CODE (expr) == NOP_EXPR + && TREE_OPERAND (expr, 0) == current_class_ptr + && (same_type_ignoring_top_level_qualifiers_p + (TREE_TYPE (expr), TREE_TYPE (current_class_ptr) + return current_class_ref; + return build_min (INDIRECT_REF, TREE_TYPE (TREE_TYPE (expr)), expr); + } if (type_dependent_expression_p (expr)) return build_min_nt_loc (loc, INDIRECT_REF, expr); expr = build_non_dependent_expr (expr); --- gcc/testsuite/g++.dg/warn/effc5.C.jj2021-01-28 16:15:05.820256255 +0100 +++ gcc/testsuite/g++.dg/warn/effc5.C 2021-01-28 16:15:05.820256255 +0100 @@ -0,0 +1,17 @@ +// PR c++/98841 +// { dg-do compile } +// { dg-options "-Weffc++" } + +struct S { + template + S& operator=(const T&) { return *this; } // { dg-bogus "should return a reference to" } + S& operator=(const S&) { return *this; } +}; + +void +foo () +{ + S s, t; + s = 1; + s = t; +} Jakub
[PATCH] tree: Don't reuse types if TYPE_USER_ALIGN differ [PR94775]
A year ago I submitted this patch: ~~ Here we trip on the TYPE_USER_ALIGN (t) assert in strip_typedefs: it gets "const d[0]" with TYPE_USER_ALIGN=0 but the result built by build_cplus_array_type is "const char[0]" with TYPE_USER_ALIGN=1. When we strip_typedefs the element of the array "const d", we see it's a typedef_variant_p, so we look at its DECL_ORIGINAL_TYPE, which is char, but we need to add the const qualifier, so we call cp_build_qualified_type -> build_qualified_type where get_qualified_type checks to see if we already have such a type by walking the variants list, which in this case is: char -> c -> const char -> const char -> d -> const d Because check_base_type only checks TYPE_ALIGN and not TYPE_USER_ALIGN, we choose the first const char, which has TYPE_USER_ALIGN set. If the element type of an array has TYPE_USER_ALIGN, the array type gets it too. So we can make check_base_type stricter. I was afraid that it might make us reuse types less often, but measuring showed that we build the same amount of types with and without the patch, while bootstrapping. ~~ However, the patch broke a few tests on STRICT_ALIGNMENT platforms and had to be reverted. This is another try. The original patch is kept unchanged, but I added the finalize_type_size hunk that ought to fix the STRICT_ALIGNMENT issues. The problem is that finalize_type_size can clear TYPE_USER_ALIGN on the main variant of a type, but doesn't clear it on any of the variants. Then we end up with types which share the same TYPE_MAIN_VARIANT, but their TYPE_CANONICAL differs and then the usual "canonical types differ for identical types" follows. I've created alignas19.C to exercise this scenario. What happens is: - when parsing the class S we create a type S in xref_tag, - we see alignas(8) so common_handle_aligned_attribute sets T_U_A in S, - we parse the member function fn and build_memfn_type creates a copy of S to add const; this variant has T_U_A set, - we finish_struct S which calls layout_class_type -> finish_record_type -> finalize_size_type where we reset T_U_A in S (but const S keeps it), - finish_non_static_data_member for arr calls maybe_dummy_object with type = S, - maybe_dummy_object calls same_type_ignoring_top_level_qualifiers_p to check if S and TREE_TYPE (current_class_ref), which is const S, are the same, - same_type_ignoring_top_level_qualifiers_p creates cv-unqualified versions of the passed types. Previously we'd use our main variant S when stripping "const S" of const, but since the T_U_A flags don't match (check_base_type), we create a new variant S'. Then we crash in comptypes because S and S' have the same TYPE_MAIN_VARIANT but different TYPE_CANONICALs. With my patch we'll clear T_U_A for S's variants too, and then instead of S' we'll just use S. Does this seem sane? Bootstrapped/regtested on * x86_64-pc-linux-gnu * powerpc64le-unknown-linux-gnu * aarch64-linux-gnu ok for trunk? gcc/ChangeLog: PR c++/94775 * stor-layout.c (finalize_type_size): If we reset TYPE_USER_ALIGN in the main variant, maybe reset it in its variants too. * tree.c (check_base_type): Return true only if TYPE_USER_ALIGN match. (check_aligned_type): Check if TYPE_USER_ALIGN match. gcc/testsuite/ChangeLog: PR c++/94775 * g++.dg/cpp0x/alignas19.C: New test. * g++.dg/warn/Warray-bounds15.C: New test. --- gcc/stor-layout.c | 16 +++-- gcc/testsuite/g++.dg/cpp0x/alignas19.C | 8 + gcc/testsuite/g++.dg/warn/Warray-bounds15.C | 40 + gcc/tree.c | 4 ++- 4 files changed, 64 insertions(+), 4 deletions(-) create mode 100644 gcc/testsuite/g++.dg/cpp0x/alignas19.C create mode 100644 gcc/testsuite/g++.dg/warn/Warray-bounds15.C diff --git a/gcc/stor-layout.c b/gcc/stor-layout.c index 7d6b1b5ce52..784f131ebb8 100644 --- a/gcc/stor-layout.c +++ b/gcc/stor-layout.c @@ -1926,6 +1926,7 @@ finalize_type_size (tree type) However, where strict alignment is not required, avoid over-aligning structures, since most compilers do not do this alignment. */ + bool tua_cleared_p = false; if (TYPE_MODE (type) != BLKmode && TYPE_MODE (type) != VOIDmode && (STRICT_ALIGNMENT || !AGGREGATE_TYPE_P (type))) @@ -1937,7 +1938,9 @@ finalize_type_size (tree type) if (mode_align >= TYPE_ALIGN (type)) { SET_TYPE_ALIGN (type, mode_align); - TYPE_USER_ALIGN (type) = 0; + /* Remember that we're about to reset this flag. */ + tua_cleared_p = TYPE_USER_ALIGN (type); + TYPE_USER_ALIGN (type) = false; } } @@ -1991,14 +1994,21 @@ finalize_type_size (tree type) /* Copy it into all variants. */ for (variant = TYPE_MAIN_VARIANT (type); - variant != 0; + variant != NULL_TREE; variant = TYPE_NEXT_VARIANT (variant)) { TYPE_SIZE (v
[COMMITTED] PowerPC: Map IEEE 128-bit long double built-ins.
After testing the patch I submitted on November 17th that was approved to make sure it still works, I commited the patch to the master branch. Sorry about the intermediate rewrites. [PATCH] Map long double built-ins correctly with IEEE 128-bit long double. The PowerPC has two different 128-bit long double types, one that uses a pair of doubles to get more mantissa range, and the other using the IEEE 128-bit 754R binary floating point format. The pair of doubles has been used as the traditional format, and we are in the process of moving to allow an implementation to switch to using IEEE 128-bit floating point. The GLIBC and LIBSTDC++ libraries have been modified to have functions using the two different formats in their libraries with different names. This patch goes through all of the built-in functions that either take long double arguments or return long double, and changes the name from the traditional name to the IEEE 128-bit name. The minimum GLIBC version to support IEEE 128-bit floating point is 2.32. The names changed are: * l is usually mapped to __ieee128; * printf is mapped to __printfieee128; (and) * scanf is mapped to __isoc99_scanfieee128. A few functions have different mappings: * dreml => __remainderieee128; * gammal => __lgammaieee128; * gammal_r=> __lgammaieee128_r; * lgammal_r => __lgammaieee128_r; * nexttoward => __nexttoward_to_ieee128; * nexttowardf => __nexttowardf_to_ieee128; * nexttowardl => __nexttowardl_to_ieee128; * pow10l => __exp10ieee128; * scalbl => __scalbieee128; * significandl=> __significandieee128; (and) * sincosl => __sincosieee128. gcc/ 2021-01-28 Michael Meissner * config/rs6000/rs6000.c (rs6000_mangle_decl_assembler_name): Add support for mapping built-in function names for long double built-in functions if long double is IEEE 128-bit. gcc/testsuite/ 2021-01-28 Michael Meissner * gcc.target/powerpc/float128-longdouble-math.c: New test. * gcc.target/powerpc/float128-longdouble-stdio.c: New test. * gcc.target/powerpc/float128-math.c: Adjust test for new name being generated. Add support for running test on power10. Add support for running if long double defaults to 64-bits. --- gcc/config/rs6000/rs6000.c| 135 -- .../powerpc/float128-longdouble-math.c| 442 ++ .../powerpc/float128-longdouble-stdio.c | 36 ++ .../gcc.target/powerpc/float128-math.c| 16 +- 4 files changed, 589 insertions(+), 40 deletions(-) create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-longdouble-math.c create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-longdouble-stdio.c diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index ccfbe7ba9f1..fbaff289a40 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -27338,57 +27338,128 @@ rs6000_globalize_decl_name (FILE * stream, tree decl) library before you can switch the real*16 type at compile time. We use the TARGET_MANGLE_DECL_ASSEMBLER_NAME hook to change this name. We - only do this if the default is that long double is IBM extended double, and - the user asked for IEEE 128-bit. */ + only do this transformation if the __float128 type is enabled. This + prevents us from doing the transformation on older 32-bit ports that might + have enabled using IEEE 128-bit floating point as the default long double + type. */ static tree rs6000_mangle_decl_assembler_name (tree decl, tree id) { - if (!TARGET_IEEEQUAD_DEFAULT && TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128 + if (TARGET_FLOAT128_TYPE && TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128 && TREE_CODE (decl) == FUNCTION_DECL - && DECL_IS_UNDECLARED_BUILTIN (decl)) + && DECL_IS_UNDECLARED_BUILTIN (decl) + && DECL_BUILT_IN_CLASS (decl) == BUILT_IN_NORMAL) { size_t len = IDENTIFIER_LENGTH (id); const char *name = IDENTIFIER_POINTER (id); + char *newname = NULL; - if (name[len - 1] == 'l') + /* See if it is one of the built-in functions with an unusual name. */ + switch (DECL_FUNCTION_CODE (decl)) { - bool uses_ieee128_p = false; - tree type = TREE_TYPE (decl); - machine_mode ret_mode = TYPE_MODE (type); + case BUILT_IN_DREML: + newname = xstrdup ("__remainderieee128"); + break; - /* See if the function returns a IEEE 128-bit floating point type or -complex type. */ - if (ret_mode == TFmode || ret_mode == TCmode) - uses_ieee128_p = true; - else - { - function_args_iterator args_iter; - tree arg; + case BUILT_IN_GAMMAL: + newname = xstrdup ("__lgammaieee128"); + break; + + case BUILT_IN_GAMMAL
[PATCH] use memcpy instead of strncpy for dyn_string insertion
Calling strncpy in libiberty's dyn_string_insert() with the last argument equal to the length of the second triggers the warning: /src/gcc/master/libiberty/dyn-string.c:280:3: warning: ‘strncpy’ output truncated before terminating nul copying as many bytes from a string as its length [-Wstringop-truncation] 280 | strncpy (dest->s + pos, src, length); | ^~~~ /src/gcc/master/libiberty/dyn-string.c:272:16: note: length computed here 272 | int length = strlen (src); |^~~~ The attached patch avoids the warning by calling memcpy instead. I'll go ahead and commit the patch as obvious if there are no concerns/objections. Martin libiberty/ChangeLog: * dyn-string.c (dyn_string_insert_cstr): Use memcpy instead of strncpy to avoid warnings. diff --git a/libiberty/dyn-string.c b/libiberty/dyn-string.c index ea711182ca5..8d2456b86c8 100644 --- a/libiberty/dyn-string.c +++ b/libiberty/dyn-string.c @@ -277,7 +277,7 @@ dyn_string_insert_cstr (dyn_string_t dest, int pos, const char *src) for (i = dest->length; i >= pos; --i) dest->s[i + length] = dest->s[i]; /* Splice in the new stuff. */ - strncpy (dest->s + pos, src, length); + memcpy (dest->s + pos, src, length); /* Compute the new length. */ dest->length += length; return 1;
Re: arm: Adjust cost of vector of constant zero
On Wed, 27 Jan 2021 at 15:03, Kyrylo Tkachov wrote: > > > > > -Original Message- > > From: Christophe Lyon > > Sent: 27 January 2021 13:56 > > To: Kyrylo Tkachov > > Cc: Kyrylo Tkachov via Gcc-patches > > Subject: Re: arm: Adjust cost of vector of constant zero > > > > On Wed, 27 Jan 2021 at 14:44, Kyrylo Tkachov > > wrote: > > > > > > > > > > > > > -Original Message- > > > > From: Christophe Lyon > > > > Sent: 27 January 2021 13:12 > > > > To: Kyrylo Tkachov > > > > Cc: Kyrylo Tkachov via Gcc-patches > > > > Subject: Re: arm: Adjust cost of vector of constant zero > > > > > > > > On Wed, 27 Jan 2021 at 10:15, Kyrylo Tkachov > > > > > > wrote: > > > > > > > > > > Hi Christophe, > > > > > > > > > > > -Original Message- > > > > > > From: Gcc-patches On Behalf > > Of > > > > > > Christophe Lyon via Gcc-patches > > > > > > Sent: 26 January 2021 18:03 > > > > > > To: gcc Patches > > > > > > Subject: arm: Adjust cost of vector of constant zero > > > > > > > > > > > > Neon vector comparisons have a dedicated version when comparing > > with > > > > > > constant zero: it means its cost is free. > > > > > > > > > > > > Adjust the cost in arm_rtx_costs_internal accordingly, for Neon > > > > > > only, > > > > > > since MVE does not support this. > > > > > > > > > > I guess the other way to do this would be in the comparison code > > handling > > > > in this function where we could check for a const_vector of zeroes and a > > > > Neon mode and avoid recursing into the operands. > > > > > That would avoid the extra switch statement in your patch. > > > > > WDYT? > > > > > > > > Do you mean like so: > > > > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c > > > > index 4a5f265..542c15e 100644 > > > > --- a/gcc/config/arm/arm.c > > > > +++ b/gcc/config/arm/arm.c > > > > @@ -11316,6 +11316,28 @@ arm_rtx_costs_internal (rtx x, enum > > rtx_code > > > > code, enum rtx_code outer_code, > > > > *cost = 0; > > > > return true; > > > > } > > > > + /* Neon has special instructions when comparing with 0 (vceq, > > > > vcge, > > > > vcgt, > > > > + vcle and vclt). */ > > > > + else if (TARGET_NEON > > > > +&& TARGET_HARD_FLOAT > > > > +&& (VALID_NEON_DREG_MODE (mode) || > > VALID_NEON_QREG_MODE > > > > (mode)) > > > > +&& (XEXP (x, 1) == CONST0_RTX (mode))) > > > > + { > > > > + switch (code) > > > > + { > > > > + case EQ: > > > > + case GE: > > > > + case GT: > > > > + case LE: > > > > + case LT: > > > > + *cost = 0; > > > > + return true; > > > > + > > > > + default: > > > > + break; > > > > + } > > > > + } > > > > + > > > >return false; > > > > > > > > I'm not sure I can remove the switch, since the other comparisons are > > > > not supported by Neon anyway. > > > > > > > > > > No, I mean where: > > > case EQ: > > > case NE: > > > case LT: > > > case LE: > > > case GT: > > > case GE: > > > case LTU: > > > case LEU: > > > case GEU: > > > case GTU: > > > case ORDERED: > > > case UNORDERED: > > > case UNEQ: > > > case UNLE: > > > case UNLT: > > > case UNGE: > > > case UNGT: > > > case LTGT: > > > if (outer_code == SET) > > > { > > > /* Is it a store-flag operation? */ > > > if (REG_P (XEXP (x, 0)) && REGNO (XEXP (x, 0)) == CC_REGNUM > > > > > > you reorder the codes that are relevant to NEON, handle them for a vector > > zero argument (and the right target checks), and fall through to the rest if > > not. > > > > > > > OK, I didn't find reordering this appealing :-) > > > > Like so, then? > > > > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c > > index 4a5f265..88e398d 100644 > > --- a/gcc/config/arm/arm.c > > +++ b/gcc/config/arm/arm.c > > @@ -11211,11 +11211,23 @@ arm_rtx_costs_internal (rtx x, enum rtx_code > > code, enum rtx_code outer_code, > >return true; > > > > case EQ: > > -case NE: > > -case LT: > > -case LE: > > -case GT: > > case GE: > > +case GT: > > +case LE: > > +case LT: > > + /* Neon has special instructions when comparing with 0 (vceq, vcge, > > vcgt, > > + vcle and vclt). */ > > + if (TARGET_NEON > > + && TARGET_HARD_FLOAT > > + && (VALID_NEON_DREG_MODE (mode) || VALID_NEON_QREG_MODE > > (mode)) > > + && (XEXP (x, 1) == CONST0_RTX (mode))) > > + { > > + *cost = 0; > > + return true; > > + } > > + > > + /* Fall through. */ > > +case NE: > > case LTU: > > case LEU: > > case GEU: > > > > I find it much cleaner, but I guess it's subjective > Ok if it passes bootstrap and testing. > Thanks, > Kyrill > Pushed after successful bootstrap as 31a0ab9213f780d2fa1da6e4879df214c0f247f9 (r11-6961) Thanks, Christophe > > > > Thanks, > > > > Christophe > > > > > Kyrill > > > > > > > Thanks, > > > > > > > > Christophe > > > > > > > > > > > > > Thanks, > > > > > Kyrill > > > >
Re: [PATCH] testsuite: Run vec_insert case on P8 and P9 with option specified
Hi! On Thu, Jan 28, 2021 at 02:40:25AM -0600, Xionghu Luo wrote: > Move common functions to header file for cleanup. > > gcc/testsuite/ChangeLog: > > 2021-01-27 Xionghu Luo > > * gcc.target/powerpc/pr79251.p8.c: Move definition to ... > * gcc.target/powerpc/pr79251.h: ...this. > * gcc.target/powerpc/pr79251.p9.c: Likewise. > * gcc.target/powerpc/pr79251-run.c: Rename to... > * gcc.target/powerpc/pr79251-run.p8.c: ...this. > * gcc.target/powerpc/pr79251-run.p9.c: New test. > --- a/gcc/testsuite/gcc.target/powerpc/pr79251.p8.c > +++ b/gcc/testsuite/gcc.target/powerpc/pr79251.p8.c > @@ -6,8 +6,6 @@ > #include > #include "pr79251.h" > > -TEST_VEC_INSERT_ALL (test) The changelog does not mention this (unless that is what "move definition" means -- please improve the changelog a bit). But the patch is fine as far as I can see. Okay for trunk. Thanks! Segher
Re: [Ping] PowerPC: Add float128/Decimal conversions.
On Wed, Jan 27, 2021 at 09:11:49PM -0600, Segher Boessenkool wrote: > On Tue, Jan 26, 2021 at 06:43:06PM -0500, Michael Meissner wrote: > > I posted this patch on January 14th, 2021: > > https://gcc.gnu.org/pipermail/gcc-patches/2021-January/563498.html > > > > | Date: Thu, 14 Jan 2021 12:09:36 -0500 > > | Subject: [PATCH] PowerPC: Add float128/Decimal conversions. > > | Message-ID: <20210114170936.ga3...@ibm-toto.the-meissners.org> > > > > You had a question about what changed, and I replied: > > > > | In your last message, you said that it was unacceptable that the > > conversion > > | fails if the user uses an old GLIBC. So I rewrote the code using weak > > | references. If the user has at least GLIBC 2.32, it will use the IEEE > > 128-bit > > | string support in the library. > > | > > | If an older GLIBC is used, I then use the IBM 128-bit format as an > > intermediate > > | value. Obviously there are cases where IEEE 128-bit can hold values that > > IBM > > | 128-bit can't (mostly due to the increased exponent range in IEEE > > 128-bit), but > > | it at least does the conversion for the numbers in the common range. > > | > > | In doing this transformation, I needed to do minor edits to the main > > decimal > > | to/from binary conversion functions to allow the KF functions to be > > declared. > > | Previously, I used preprocessor magic to rename the functions. > > > > This is the second most important patch in the IEEE 128-bit work. What do I > > need to do to be able to commit the patch? > > The whole thread is at > https://patchwork.ozlabs.org/project/gcc/patch/2020112524.ga...@ibm-toto.the-meissners.org/ > . > > I approved *that* version of the patch. Yes you approved the built-in renaming patch and I checked in the patch that you approved (rather than the modifications that I submitted). However, this is about the second patch (that provides conversions between _Float128 and Decimal), which as far as I can tell was not approved. That is what this particular question is about. You did not like the previous _Float128 <-> Decimal conversion patch because it was tied into having a minimum GLIBC version. So, I rewrote the whole patch so that it will work with older GLIBC's. If you have a newer GLIBC (detected at runtime via a weak reference), it uses the support functions in the new GLIBC to do the conversion. I had a cut+paste error in the original patch that I resubmitted, and I resubmitted with the changed line. If you have an older GLIBC and want to convert _Float128 to one of the Decimal types, it will convert the _Float128 to __ibm128 and then convert that to Decimal. Similarly if you have one of the Decimal types, and want to convert to _Float128, it will first convert the Decimal type to __ibm128, and then convert __ibm128 to _Float128. So what do I do? Can I check in this patch or do I need to do further modifications. -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797
[committed] libstdc++: Fix copyright dates for simd headers and tests
libstdc++-v3/ChangeLog: * include/experimental/bits/numeric_traits.h: Update copyright dates. * include/experimental/bits/simd.h: Likewise. * include/experimental/bits/simd_builtin.h: Likewise. * include/experimental/bits/simd_converter.h: Likewise. * include/experimental/bits/simd_detail.h: Likewise. * include/experimental/bits/simd_fixed_size.h: Likewise. * include/experimental/bits/simd_math.h: Likewise. * include/experimental/bits/simd_neon.h: Likewise. * include/experimental/bits/simd_ppc.h: Likewise. * include/experimental/bits/simd_scalar.h: Likewise. * include/experimental/bits/simd_x86.h: Likewise. * include/experimental/bits/simd_x86_conversions.h: Likewise. * include/experimental/simd: Likewise. * testsuite/experimental/simd/*: Likewise. Committed to trunk. commit a054608c9c409245575e3dfe61b9a36e1bf7ffcf Author: Jonathan Wakely Date: Thu Jan 28 18:13:03 2021 libstdc++: Fix copyright dates for simd headers and tests libstdc++-v3/ChangeLog: * include/experimental/bits/numeric_traits.h: Update copyright dates. * include/experimental/bits/simd.h: Likewise. * include/experimental/bits/simd_builtin.h: Likewise. * include/experimental/bits/simd_converter.h: Likewise. * include/experimental/bits/simd_detail.h: Likewise. * include/experimental/bits/simd_fixed_size.h: Likewise. * include/experimental/bits/simd_math.h: Likewise. * include/experimental/bits/simd_neon.h: Likewise. * include/experimental/bits/simd_ppc.h: Likewise. * include/experimental/bits/simd_scalar.h: Likewise. * include/experimental/bits/simd_x86.h: Likewise. * include/experimental/bits/simd_x86_conversions.h: Likewise. * include/experimental/simd: Likewise. * testsuite/experimental/simd/*: Likewise. diff --git a/libstdc++-v3/include/experimental/bits/numeric_traits.h b/libstdc++-v3/include/experimental/bits/numeric_traits.h index 1b60874b788..0a1c711a559 100644 --- a/libstdc++-v3/include/experimental/bits/numeric_traits.h +++ b/libstdc++-v3/include/experimental/bits/numeric_traits.h @@ -1,6 +1,6 @@ // Definition of numeric_limits replacement traits P1841R1 -*- C++ -*- -// Copyright (C) 2020 Free Software Foundation, Inc. +// Copyright (C) 2020-2021 Free Software Foundation, Inc. // // This file is part of the GNU ISO C++ Library. This library is free // software; you can redistribute it and/or modify it under the diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/include/experimental/bits/simd.h index 00eec50d64f..084849e2dc9 100644 --- a/libstdc++-v3/include/experimental/bits/simd.h +++ b/libstdc++-v3/include/experimental/bits/simd.h @@ -1,6 +1,6 @@ // Definition of the public simd interfaces -*- C++ -*- -// Copyright (C) 2020 Free Software Foundation, Inc. +// Copyright (C) 2020-2021 Free Software Foundation, Inc. // // This file is part of the GNU ISO C++ Library. This library is free // software; you can redistribute it and/or modify it under the diff --git a/libstdc++-v3/include/experimental/bits/simd_builtin.h b/libstdc++-v3/include/experimental/bits/simd_builtin.h index f2c99faa4ee..efca65fa6e3 100644 --- a/libstdc++-v3/include/experimental/bits/simd_builtin.h +++ b/libstdc++-v3/include/experimental/bits/simd_builtin.h @@ -1,6 +1,6 @@ // Simd Abi specific implementations -*- C++ -*- -// Copyright (C) 2020 Free Software Foundation, Inc. +// Copyright (C) 2020-2021 Free Software Foundation, Inc. // // This file is part of the GNU ISO C++ Library. This library is free // software; you can redistribute it and/or modify it under the diff --git a/libstdc++-v3/include/experimental/bits/simd_converter.h b/libstdc++-v3/include/experimental/bits/simd_converter.h index dc4598743f9..9c8bf382df9 100644 --- a/libstdc++-v3/include/experimental/bits/simd_converter.h +++ b/libstdc++-v3/include/experimental/bits/simd_converter.h @@ -1,6 +1,6 @@ // Generic simd conversions -*- C++ -*- -// Copyright (C) 2020 Free Software Foundation, Inc. +// Copyright (C) 2020-2021 Free Software Foundation, Inc. // // This file is part of the GNU ISO C++ Library. This library is free // software; you can redistribute it and/or modify it under the diff --git a/libstdc++-v3/include/experimental/bits/simd_detail.h b/libstdc++-v3/include/experimental/bits/simd_detail.h index a49a9d88b7f..1e75812d098 100644 --- a/libstdc++-v3/include/experimental/bits/simd_detail.h +++ b/libstdc++-v3/include/experimental/bits/simd_detail.h @@ -1,6 +1,6 @@ // Internal macros for the simd implementation -*- C++ -*- -// Copyright (C) 2020 Free Software Foundation, Inc. +// Copyright (C) 2020-2021 Free Software Foundation, Inc. // // This file is part of the GNU ISO C++ Library. This library is free // softw
Re: [Ping] PowerPC: Add float128/Decimal conversions.
On Thu, Jan 28, 2021 at 01:10:39PM -0500, Michael Meissner wrote: > > The whole thread is at > > https://patchwork.ozlabs.org/project/gcc/patch/2020112524.ga...@ibm-toto.the-meissners.org/ > > . > > > > I approved *that* version of the patch. > > Yes you approved the built-in renaming patch and I checked in the patch that > you approved (rather than the modifications that I submitted). > > However, this is about the second patch (that provides conversions between > _Float128 and Decimal), which as far as I can tell was not approved. That is > what this particular question is about. I see no second patch? > So, I rewrote the whole patch so that it will work with older GLIBC's. Did you check in what I approved or not? It is a very simple question, it's just about facts, with a simple yes/no answer. Segher
Re: [PATCH] c++, v2: Fix -Weffc++ in templates [PR98841]
On 1/28/21 10:24 AM, Jakub Jelinek wrote: On Thu, Jan 28, 2021 at 10:04:12AM -0500, Jason Merrill wrote: We emit a bogus warning on the following testcase, suggesting that the operator should return *this even when it does that already. The problem is that normally cp_build_indirect_ref_1 ensures that *this is folded as current_class_ref, but in templates (if return type is non-dependent, otherwise check_return_expr doesn't check it) it didn't go through cp_build_indirect_ref_1, but just built another INDIRECT_REF. That seems like a bug in build_x_indirect_ref. So do you want this instead (if it passes bootstrap/regtest)? OK. 2021-01-28 Jakub Jelinek PR c++/98841 * typeck.c (build_x_indirect_ref): For *this, return current_class_ref. * g++.dg/warn/effc5.C: New test. --- gcc/cp/typeck.c.jj 2021-01-27 11:48:49.715890458 +0100 +++ gcc/cp/typeck.c 2021-01-28 16:17:18.712755173 +0100 @@ -3326,7 +3326,15 @@ build_x_indirect_ref (location_t loc, tr { /* Retain the type if we know the operand is a pointer. */ if (TREE_TYPE (expr) && INDIRECT_TYPE_P (TREE_TYPE (expr))) - return build_min (INDIRECT_REF, TREE_TYPE (TREE_TYPE (expr)), expr); + { + if (expr == current_class_ptr + || (TREE_CODE (expr) == NOP_EXPR + && TREE_OPERAND (expr, 0) == current_class_ptr + && (same_type_ignoring_top_level_qualifiers_p + (TREE_TYPE (expr), TREE_TYPE (current_class_ptr) + return current_class_ref; + return build_min (INDIRECT_REF, TREE_TYPE (TREE_TYPE (expr)), expr); + } if (type_dependent_expression_p (expr)) return build_min_nt_loc (loc, INDIRECT_REF, expr); expr = build_non_dependent_expr (expr); --- gcc/testsuite/g++.dg/warn/effc5.C.jj2021-01-28 16:15:05.820256255 +0100 +++ gcc/testsuite/g++.dg/warn/effc5.C 2021-01-28 16:15:05.820256255 +0100 @@ -0,0 +1,17 @@ +// PR c++/98841 +// { dg-do compile } +// { dg-options "-Weffc++" } + +struct S { + template + S& operator=(const T&) { return *this; } // { dg-bogus "should return a reference to" } + S& operator=(const S&) { return *this; } +}; + +void +foo () +{ + S s, t; + s = 1; + s = t; +} Jakub
Re: [Ping] PowerPC: Add float128/Decimal conversions.
On Thu, Jan 28, 2021 at 12:59:18PM -0600, Segher Boessenkool wrote: > On Thu, Jan 28, 2021 at 01:10:39PM -0500, Michael Meissner wrote: > > > The whole thread is at > > > https://patchwork.ozlabs.org/project/gcc/patch/2020112524.ga...@ibm-toto.the-meissners.org/ > > > . > > > > > > I approved *that* version of the patch. > > > > Yes you approved the built-in renaming patch and I checked in the patch that > > you approved (rather than the modifications that I submitted). > > > > However, this is about the second patch (that provides conversions between > > _Float128 and Decimal), which as far as I can tell was not approved. That > > is > > what this particular question is about. > > I see no second patch? > > > So, I rewrote the whole patch so that it will work with older GLIBC's. > > Did you check in what I approved or not? It is a very simple question, > it's just about facts, with a simple yes/no answer. Yes I checked in that patch earlier this morning (the original patch that you approved). The second patch I want you to review is: https://gcc.gnu.org/pipermail/gcc-patches/2021-January/564407.html | Date: Wed, 27 Jan 2021 16:19:52 -0500 | Subject: [PATCH, revised, #2] PowerPC: Add float128/Decimal conversions. | Message-ID: <20210127211952.ga28...@ibm-toto.the-meissners.org> -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797
Re: [Ping] PowerPC: Add float128/Decimal conversions.
On Thu, Jan 28, 2021 at 02:30:56PM -0500, Michael Meissner wrote: > On Thu, Jan 28, 2021 at 12:59:18PM -0600, Segher Boessenkool wrote: > > On Thu, Jan 28, 2021 at 01:10:39PM -0500, Michael Meissner wrote: > > > > The whole thread is at > > > > https://patchwork.ozlabs.org/project/gcc/patch/2020112524.ga...@ibm-toto.the-meissners.org/ > > > > . > > > > > > > > I approved *that* version of the patch. > > > > > > Yes you approved the built-in renaming patch and I checked in the patch > > > that > > > you approved (rather than the modifications that I submitted). > > > > > > However, this is about the second patch (that provides conversions between > > > _Float128 and Decimal), which as far as I can tell was not approved. > > > That is > > > what this particular question is about. > > > > I see no second patch? > > > > > So, I rewrote the whole patch so that it will work with older GLIBC's. > > > > Did you check in what I approved or not? It is a very simple question, > > it's just about facts, with a simple yes/no answer. > > Yes I checked in that patch earlier this morning (the original patch that you > approved). > > The second patch I want you to review is: "This patch replaces the following three patches:" Please send a patch that modifies *current* code, and that is *tested* with that. With a good explanation, and a commit message for *that* patch (not for other, older patches). Segher
Re: [Ping] PowerPC: Add float128/Decimal conversions.
On 1/28/21 1:47 PM, Segher Boessenkool wrote: > On Thu, Jan 28, 2021 at 02:30:56PM -0500, Michael Meissner wrote: >> The second patch I want you to review is: > > "This patch replaces the following three patches:" > > Please send a patch that modifies *current* code, and that is *tested* > with that. With a good explanation, and a commit message for *that* > patch (not for other, older patches). Isn't the email Mike linked to which was submitted yesterday exactly this? Admittedly, it doesn't have a "commit message header", but it does seem to be against current code and he did say he tested it and there is some explanation given, just not in a commit message form. Is it just a nice commit message you're looking for now? Peter
Re: [PATCH] tree: Don't reuse types if TYPE_USER_ALIGN differ [PR94775]
On 1/28/21 10:34 AM, Marek Polacek wrote: A year ago I submitted this patch: ~~ Here we trip on the TYPE_USER_ALIGN (t) assert in strip_typedefs: it gets "const d[0]" with TYPE_USER_ALIGN=0 but the result built by build_cplus_array_type is "const char[0]" with TYPE_USER_ALIGN=1. When we strip_typedefs the element of the array "const d", we see it's a typedef_variant_p, so we look at its DECL_ORIGINAL_TYPE, which is char, but we need to add the const qualifier, so we call cp_build_qualified_type -> build_qualified_type where get_qualified_type checks to see if we already have such a type by walking the variants list, which in this case is: char -> c -> const char -> const char -> d -> const d Because check_base_type only checks TYPE_ALIGN and not TYPE_USER_ALIGN, we choose the first const char, which has TYPE_USER_ALIGN set. If the element type of an array has TYPE_USER_ALIGN, the array type gets it too. So we can make check_base_type stricter. I was afraid that it might make us reuse types less often, but measuring showed that we build the same amount of types with and without the patch, while bootstrapping. ~~ However, the patch broke a few tests on STRICT_ALIGNMENT platforms and had to be reverted. This is another try. The original patch is kept unchanged, but I added the finalize_type_size hunk that ought to fix the STRICT_ALIGNMENT issues. The problem is that finalize_type_size can clear TYPE_USER_ALIGN on the main variant of a type, but doesn't clear it on any of the variants. Then we end up with types which share the same TYPE_MAIN_VARIANT, but their TYPE_CANONICAL differs and then the usual "canonical types differ for identical types" follows. I've created alignas19.C to exercise this scenario. What happens is: - when parsing the class S we create a type S in xref_tag, - we see alignas(8) so common_handle_aligned_attribute sets T_U_A in S, - we parse the member function fn and build_memfn_type creates a copy of S to add const; this variant has T_U_A set, - we finish_struct S which calls layout_class_type -> finish_record_type -> finalize_size_type where we reset T_U_A in S (but const S keeps it), - finish_non_static_data_member for arr calls maybe_dummy_object with type = S, - maybe_dummy_object calls same_type_ignoring_top_level_qualifiers_p to check if S and TREE_TYPE (current_class_ref), which is const S, are the same, - same_type_ignoring_top_level_qualifiers_p creates cv-unqualified versions of the passed types. Previously we'd use our main variant S when stripping "const S" of const, but since the T_U_A flags don't match (check_base_type), we create a new variant S'. Then we crash in comptypes because S and S' have the same TYPE_MAIN_VARIANT but different TYPE_CANONICALs. With my patch we'll clear T_U_A for S's variants too, and then instead of S' we'll just use S. Does this seem sane? Bootstrapped/regtested on * x86_64-pc-linux-gnu * powerpc64le-unknown-linux-gnu * aarch64-linux-gnu ok for trunk? gcc/ChangeLog: PR c++/94775 * stor-layout.c (finalize_type_size): If we reset TYPE_USER_ALIGN in the main variant, maybe reset it in its variants too. * tree.c (check_base_type): Return true only if TYPE_USER_ALIGN match. (check_aligned_type): Check if TYPE_USER_ALIGN match. gcc/testsuite/ChangeLog: PR c++/94775 * g++.dg/cpp0x/alignas19.C: New test. * g++.dg/warn/Warray-bounds15.C: New test. --- gcc/stor-layout.c | 16 +++-- gcc/testsuite/g++.dg/cpp0x/alignas19.C | 8 + gcc/testsuite/g++.dg/warn/Warray-bounds15.C | 40 + gcc/tree.c | 4 ++- 4 files changed, 64 insertions(+), 4 deletions(-) create mode 100644 gcc/testsuite/g++.dg/cpp0x/alignas19.C create mode 100644 gcc/testsuite/g++.dg/warn/Warray-bounds15.C diff --git a/gcc/stor-layout.c b/gcc/stor-layout.c index 7d6b1b5ce52..784f131ebb8 100644 --- a/gcc/stor-layout.c +++ b/gcc/stor-layout.c @@ -1926,6 +1926,7 @@ finalize_type_size (tree type) However, where strict alignment is not required, avoid over-aligning structures, since most compilers do not do this alignment. */ + bool tua_cleared_p = false; if (TYPE_MODE (type) != BLKmode && TYPE_MODE (type) != VOIDmode && (STRICT_ALIGNMENT || !AGGREGATE_TYPE_P (type))) @@ -1937,7 +1938,9 @@ finalize_type_size (tree type) if (mode_align >= TYPE_ALIGN (type)) { SET_TYPE_ALIGN (type, mode_align); - TYPE_USER_ALIGN (type) = 0; + /* Remember that we're about to reset this flag. */ + tua_cleared_p = TYPE_USER_ALIGN (type); + TYPE_USER_ALIGN (type) = false; } } @@ -1991,14 +1994,21 @@ finalize_type_size (tree type) /* Copy it into all variants. */ for (variant = TYPE_MAIN_VARIANT (type); - variant != 0; + variant != NULL_TREE;
Re: [PATCH] tree: Don't reuse types if TYPE_USER_ALIGN differ [PR94775]
On Thu, Jan 28, 2021 at 03:18:55PM -0500, Jason Merrill via Gcc-patches wrote: > On 1/28/21 10:34 AM, Marek Polacek wrote: > > A year ago I submitted this patch: > > > > ~~ > > Here we trip on the TYPE_USER_ALIGN (t) assert in strip_typedefs: it > > gets "const d[0]" with TYPE_USER_ALIGN=0 but the result built by > > build_cplus_array_type is "const char[0]" with TYPE_USER_ALIGN=1. > > > > When we strip_typedefs the element of the array "const d", we see it's > > a typedef_variant_p, so we look at its DECL_ORIGINAL_TYPE, which is > > char, but we need to add the const qualifier, so we call > > cp_build_qualified_type -> build_qualified_type > > where get_qualified_type checks to see if we already have such a type > > by walking the variants list, which in this case is: > > > >char -> c -> const char -> const char -> d -> const d > > > > Because check_base_type only checks TYPE_ALIGN and not TYPE_USER_ALIGN, > > we choose the first const char, which has TYPE_USER_ALIGN set. If the > > element type of an array has TYPE_USER_ALIGN, the array type gets it too. > > > > So we can make check_base_type stricter. I was afraid that it might make > > us reuse types less often, but measuring showed that we build the same > > amount of types with and without the patch, while bootstrapping. > > ~~ > > > > However, the patch broke a few tests on STRICT_ALIGNMENT platforms and > > had to be reverted. This is another try. The original patch is kept > > unchanged, but I added the finalize_type_size hunk that ought to fix the > > STRICT_ALIGNMENT issues. > > > > The problem is that finalize_type_size can clear TYPE_USER_ALIGN on the > > main variant of a type, but doesn't clear it on any of the variants. > > Then we end up with types which share the same TYPE_MAIN_VARIANT, but > > their TYPE_CANONICAL differs and then the usual "canonical types differ > > for identical types" follows. > > > > I've created alignas19.C to exercise this scenario. What happens is: > > - when parsing the class S we create a type S in xref_tag, > > - we see alignas(8) so common_handle_aligned_attribute sets T_U_A in S, > > - we parse the member function fn and build_memfn_type creates a copy > >of S to add const; this variant has T_U_A set, > > - we finish_struct S which calls layout_class_type -> finish_record_type > >-> finalize_size_type where we reset T_U_A in S (but const S keeps it), > > - finish_non_static_data_member for arr calls maybe_dummy_object with > >type = S, > > - maybe_dummy_object calls same_type_ignoring_top_level_qualifiers_p > >to check if S and TREE_TYPE (current_class_ref), which is const S, > >are the same, > > - same_type_ignoring_top_level_qualifiers_p creates cv-unqualified > >versions of the passed types. Previously we'd use our main variant > >S when stripping "const S" of const, but since the T_U_A flags don't > >match (check_base_type), we create a new variant S'. Then we crash in > >comptypes because S and S' have the same TYPE_MAIN_VARIANT but > >different TYPE_CANONICALs. > > > > With my patch we'll clear T_U_A for S's variants too, and then instead > > of S' we'll just use S. Does this seem sane? > > > > Bootstrapped/regtested on > > * x86_64-pc-linux-gnu > > * powerpc64le-unknown-linux-gnu > > * aarch64-linux-gnu > > ok for trunk? > > > > gcc/ChangeLog: > > > > PR c++/94775 > > * stor-layout.c (finalize_type_size): If we reset TYPE_USER_ALIGN in > > the main variant, maybe reset it in its variants too. > > * tree.c (check_base_type): Return true only if TYPE_USER_ALIGN match. > > (check_aligned_type): Check if TYPE_USER_ALIGN match. > > > > gcc/testsuite/ChangeLog: > > > > PR c++/94775 > > * g++.dg/cpp0x/alignas19.C: New test. > > * g++.dg/warn/Warray-bounds15.C: New test. > > --- > > gcc/stor-layout.c | 16 +++-- > > gcc/testsuite/g++.dg/cpp0x/alignas19.C | 8 + > > gcc/testsuite/g++.dg/warn/Warray-bounds15.C | 40 + > > gcc/tree.c | 4 ++- > > 4 files changed, 64 insertions(+), 4 deletions(-) > > create mode 100644 gcc/testsuite/g++.dg/cpp0x/alignas19.C > > create mode 100644 gcc/testsuite/g++.dg/warn/Warray-bounds15.C > > > > diff --git a/gcc/stor-layout.c b/gcc/stor-layout.c > > index 7d6b1b5ce52..784f131ebb8 100644 > > --- a/gcc/stor-layout.c > > +++ b/gcc/stor-layout.c > > @@ -1926,6 +1926,7 @@ finalize_type_size (tree type) > >However, where strict alignment is not required, avoid > >over-aligning structures, since most compilers do not do this > >alignment. */ > > + bool tua_cleared_p = false; > > if (TYPE_MODE (type) != BLKmode > > && TYPE_MODE (type) != VOIDmode > > && (STRICT_ALIGNMENT || !AGGREGATE_TYPE_P (type))) > > @@ -1937,7 +1938,9 @@ finalize_type_size (tree type) > > if (mode_align >= TYPE_ALIGN (type)) > > { > > SET_TYPE_ALIGN (t
Re: [PATCH v3] clear VLA bounds in attribute access (PR 97172)
On 1/28/21 1:31 AM, Richard Biener wrote: On Thu, Jan 28, 2021 at 12:08 AM Martin Sebor via Gcc-patches wrote: Attached is another attempt to fix the problem caused by allowing front-end trees representing nontrivial VLA bound expressions to stay in attribute access attached to functions. Since removing these trees seems to be everyone's preference this patch does that by extending the free_lang_data pass to look for and zero out these trees. Because free_lang_data only frees anything when LTO is enabled and we want these trees cleared regardless to keep them from getting clobbered during gimplification, this change also modifies the pass to do the clearing even when the pass is otherwise inactive. if (TREE_CODE (bound) == NOP_EXPR) +bound = TREE_OPERAND (bound, 0); + + if (TREE_CODE (bound) == CONVERT_EXPR) +{ + tree op0 = TREE_OPERAND (bound, 0); + tree bndtyp = TREE_TYPE (bound); + tree op0typ = TREE_TYPE (op0); + if (TYPE_PRECISION (bndtyp) == TYPE_PRECISION (op0typ)) + bound = op0; +} + + if (TREE_CODE (bound) == NON_LVALUE_EXPR) +bound = TREE_OPERAND (bound, 0); all of the above can be just STRIP_NOPS (bound); which also handles nesting of the above in any order. No, it can't be just STRIP_NOPS. The goal is to strip the meaningless (to the user) cast to sizetype from the array type. For example: void f (int n, int[n]); void f (int n, int[n + 1]); I want the type in the warning to reflect the source: warning: argument 2 of type ‘int[n + 1]’ declared with mismatched bound ‘n + 1’ [-Wvla-parameter] and not: warning: ... ‘int[(sizetype)(n + 1)]’ ... + if (TREE_CODE (bound) == PLUS_EXPR + && integer_all_onesp (TREE_OPERAND (bound, 1))) +{ + bound = TREE_OPERAND (bound, 0); + if (TREE_CODE (bound) == NOP_EXPR) + bound = TREE_OPERAND (bound, 0); +} so it either does or does not strip a -1 but has no indication on whether it did that? That looks fragile and broken. Indication to what? The caller? The function is only used to recover a meaningful VLA bound for warnings and its callers aren't interested in whether the -1 was stripped or not. Anyway, the split out of this function seems unrelated to the original problem and should be submitted separately. It was a remnant of the previous patch where it was used to format the string representation of the VLA bounds and called from three sites. I've removed the function from this revision (and restored the one site in the pretty printer that open-codes the same thing). + for (vblist = TREE_VALUE (vblist); vblist; vblist = TREE_CHAIN (vblist)) + { + tree *pvbnd = &TREE_VALUE (vblist); + if (!*pvbnd || DECL_P (*pvbnd)) + continue; so this doesn't let constant bounds prevail but only symbolical ones? Not that I care but I'd have expected || CONSTANT_CLASS_P (*pvbnd) There must be some confusion here. There are no constant VLA bounds. The essential purpose of this patch is to remove expressions from the attributes, such as PLUS_EXPR, that denote nontrivial VLA bounds. The test above retains decls that might refer to function parameters or global variables so that they can be mentioned in middle end warnings. Attached is yet another revision of this fix that moves the call to attr_access:free_lang_data() to c_parse_final_cleanups() as Jakub suggested. Martin PR middle-end/97172 - ICE: tree code 'ssa_name' is not supported in LTO streams gcc/ChangeLog: PR middle-end/97172 * attribs.c (attr_access::free_lang_data): Define new function. * attribs.h (attr_access::free_lang_data): Declare new function. gcc/c/ChangeLog: PR middle-end/97172 * c-decl.c (free_attr_access_data): New function. (c_parse_final_cleanups): Call free_attr_access_data. gcc/testsuite/ChangeLog: PR middle-end/97172 * gcc.dg/pr97172.c: New test. diff --git a/gcc/attribs.c b/gcc/attribs.c index 94991fbbeab..81322d40f1d 100644 --- a/gcc/attribs.c +++ b/gcc/attribs.c @@ -2238,6 +2238,38 @@ attr_access::vla_bounds (unsigned *nunspec) const return list_length (size); } +/* Reset front end-specific attribute access data from ATTRS. + Called from the free_lang_data pass. */ + +/* static */ void +attr_access::free_lang_data (tree attrs) +{ + for (tree acs = attrs; (acs = lookup_attribute ("access", acs)); + acs = TREE_CHAIN (acs)) +{ + tree vblist = TREE_VALUE (acs); + vblist = TREE_CHAIN (vblist); + if (!vblist) + continue; + + vblist = TREE_VALUE (vblist); + if (!vblist) + continue; + + for (vblist = TREE_VALUE (vblist); vblist; vblist = TREE_CHAIN (vblist)) + { + tree *pvbnd = &TREE_VALUE (vblist); + if (!*pvbnd || DECL_P (*pvbnd)) + continue; + + /* VLA bounds that are expressions as opposed to DECLs are + only used in the front end. Reset them to keep front end + trees leaking into the middle end (see pr97172) and to + free up memory. */ + *pvb
Re: [PATCH v3] clear VLA bounds in attribute access (PR 97172)
On 1/28/21 2:22 AM, Jakub Jelinek wrote: On Thu, Jan 28, 2021 at 09:31:46AM +0100, Richard Biener via Gcc-patches wrote: + if (TREE_CODE (bound) == PLUS_EXPR + && integer_all_onesp (TREE_OPERAND (bound, 1))) +{ + bound = TREE_OPERAND (bound, 0); + if (TREE_CODE (bound) == NOP_EXPR) + bound = TREE_OPERAND (bound, 0); +} so it either does or does not strip a -1 but has no indication on whether it did that? That looks fragile and broken. Yeah. Plus again STRIP_NOPs in there should be used. But certainly it needs to remember whether there was + -1 or not and compensate for it. The free-lang-data parts are OK. But is fld the right spot to do it? If it is only the FE that emits the warnings, shouldn't it be stripped already before gimplification so that it isn't actually corrupted? I mean in c_parse_final_cleanups or c_common_parse_file depending on if it is just C or C++ too? I've been asking for a good place to do it since December and free_lang_data was the only suggestion. I even mentioned it to you this Tuesday. c_parse_final_cleanups looks like a better place but I don't understand why you've waited until now to point me to it. Plus, if the expressions in the attribute don't contain SAVE_EXPRs, why there isn't unshare_expr called on them (and if they do, I don't see how it would be guaranteed, can't one e.g. do int bar (void); void foo (int n, int a[n + 1][bar () + 2], int b[sizeof (a[0]) + 32]) { } ?) the unsharing variant I've pasted into the PR. I'm not sure what you're asking here or if you're just repeating the same question we went over in November when I posted the first patch to unshare the expressions. In that thread, after I explained that the expressions in the attribute aren't evaluated, you ultimately agreed it wasn't necessary: https://gcc.gnu.org/pipermail/gcc-patches/2020-November/559953.html The front end wraps each VLA bound that needs to be evaluated in a SAVE_EXPR (and, as Joseph points out in the same thread, if there are more of them, they're wrapped in a COMPOUND_EXPR). But the attribute doesn't use these SAVE_EXPRs -- instead, it uses the expressions before they're wrapped. This wasn't a choice I made deliberately. I just didn't know about the COMPOUND_EXPR wrapped bounds. Martin
Re: [PATCH] tree: Don't reuse types if TYPE_USER_ALIGN differ [PR94775]
On 1/28/21 3:36 PM, Marek Polacek wrote: On Thu, Jan 28, 2021 at 03:18:55PM -0500, Jason Merrill via Gcc-patches wrote: On 1/28/21 10:34 AM, Marek Polacek wrote: A year ago I submitted this patch: ~~ Here we trip on the TYPE_USER_ALIGN (t) assert in strip_typedefs: it gets "const d[0]" with TYPE_USER_ALIGN=0 but the result built by build_cplus_array_type is "const char[0]" with TYPE_USER_ALIGN=1. When we strip_typedefs the element of the array "const d", we see it's a typedef_variant_p, so we look at its DECL_ORIGINAL_TYPE, which is char, but we need to add the const qualifier, so we call cp_build_qualified_type -> build_qualified_type where get_qualified_type checks to see if we already have such a type by walking the variants list, which in this case is: char -> c -> const char -> const char -> d -> const d Because check_base_type only checks TYPE_ALIGN and not TYPE_USER_ALIGN, we choose the first const char, which has TYPE_USER_ALIGN set. If the element type of an array has TYPE_USER_ALIGN, the array type gets it too. So we can make check_base_type stricter. I was afraid that it might make us reuse types less often, but measuring showed that we build the same amount of types with and without the patch, while bootstrapping. ~~ However, the patch broke a few tests on STRICT_ALIGNMENT platforms and had to be reverted. This is another try. The original patch is kept unchanged, but I added the finalize_type_size hunk that ought to fix the STRICT_ALIGNMENT issues. The problem is that finalize_type_size can clear TYPE_USER_ALIGN on the main variant of a type, but doesn't clear it on any of the variants. Then we end up with types which share the same TYPE_MAIN_VARIANT, but their TYPE_CANONICAL differs and then the usual "canonical types differ for identical types" follows. I've created alignas19.C to exercise this scenario. What happens is: - when parsing the class S we create a type S in xref_tag, - we see alignas(8) so common_handle_aligned_attribute sets T_U_A in S, - we parse the member function fn and build_memfn_type creates a copy of S to add const; this variant has T_U_A set, - we finish_struct S which calls layout_class_type -> finish_record_type -> finalize_size_type where we reset T_U_A in S (but const S keeps it), - finish_non_static_data_member for arr calls maybe_dummy_object with type = S, - maybe_dummy_object calls same_type_ignoring_top_level_qualifiers_p to check if S and TREE_TYPE (current_class_ref), which is const S, are the same, - same_type_ignoring_top_level_qualifiers_p creates cv-unqualified versions of the passed types. Previously we'd use our main variant S when stripping "const S" of const, but since the T_U_A flags don't match (check_base_type), we create a new variant S'. Then we crash in comptypes because S and S' have the same TYPE_MAIN_VARIANT but different TYPE_CANONICALs. With my patch we'll clear T_U_A for S's variants too, and then instead of S' we'll just use S. Does this seem sane? Bootstrapped/regtested on * x86_64-pc-linux-gnu * powerpc64le-unknown-linux-gnu * aarch64-linux-gnu ok for trunk? gcc/ChangeLog: PR c++/94775 * stor-layout.c (finalize_type_size): If we reset TYPE_USER_ALIGN in the main variant, maybe reset it in its variants too. * tree.c (check_base_type): Return true only if TYPE_USER_ALIGN match. (check_aligned_type): Check if TYPE_USER_ALIGN match. gcc/testsuite/ChangeLog: PR c++/94775 * g++.dg/cpp0x/alignas19.C: New test. * g++.dg/warn/Warray-bounds15.C: New test. --- gcc/stor-layout.c | 16 +++-- gcc/testsuite/g++.dg/cpp0x/alignas19.C | 8 + gcc/testsuite/g++.dg/warn/Warray-bounds15.C | 40 + gcc/tree.c | 4 ++- 4 files changed, 64 insertions(+), 4 deletions(-) create mode 100644 gcc/testsuite/g++.dg/cpp0x/alignas19.C create mode 100644 gcc/testsuite/g++.dg/warn/Warray-bounds15.C diff --git a/gcc/stor-layout.c b/gcc/stor-layout.c index 7d6b1b5ce52..784f131ebb8 100644 --- a/gcc/stor-layout.c +++ b/gcc/stor-layout.c @@ -1926,6 +1926,7 @@ finalize_type_size (tree type) However, where strict alignment is not required, avoid over-aligning structures, since most compilers do not do this alignment. */ + bool tua_cleared_p = false; if (TYPE_MODE (type) != BLKmode && TYPE_MODE (type) != VOIDmode && (STRICT_ALIGNMENT || !AGGREGATE_TYPE_P (type))) @@ -1937,7 +1938,9 @@ finalize_type_size (tree type) if (mode_align >= TYPE_ALIGN (type)) { SET_TYPE_ALIGN (type, mode_align); - TYPE_USER_ALIGN (type) = 0; + /* Remember that we're about to reset this flag. */ + tua_cleared_p = TYPE_USER_ALIGN (type); + TYPE_USER_ALIGN (type) = false; } } @@ -1991,14 +1994,21 @@ finalize_type_size (tree type) /* Cop
Re: [Ping] PowerPC: Add float128/Decimal conversions.
On Thu, Jan 28, 2021 at 01:58:26PM -0600, Peter Bergner wrote: > On 1/28/21 1:47 PM, Segher Boessenkool wrote: > > On Thu, Jan 28, 2021 at 02:30:56PM -0500, Michael Meissner wrote: > >> The second patch I want you to review is: > > > > "This patch replaces the following three patches:" > > > > Please send a patch that modifies *current* code, and that is *tested* > > with that. With a good explanation, and a commit message for *that* > > patch (not for other, older patches). > > Isn't the email Mike linked to which was submitted yesterday exactly this? It says it is not. That is where I stopped reading. > Admittedly, it doesn't have a "commit message header", but it does seem to > be against current code and he did say he tested it and there is some > explanation given, just not in a commit message form. Is it just a nice > commit message you're looking for now? Just like for any other patch, I need to know what it *is*, why this is an improvement, and all that. The usual. And not in ten pages but in a few sentences. After such an introduction you can go into detail where that is warranted, for example necessary to understand the patch, but you do not explain the history of the world in a commit message. I do not have the bandwidth to spend hours on this again today trying to figure out what this patch is. Segher
Re: [PATCH] tree: Don't reuse types if TYPE_USER_ALIGN differ [PR94775]
> Bootstrapped/regtested on > * x86_64-pc-linux-gnu > * powerpc64le-unknown-linux-gnu > * aarch64-linux-gnu > ok for trunk? None of them is strict alignment though, isn't it? -- Eric Botcazou
Re: [PATCH] tree: Don't reuse types if TYPE_USER_ALIGN differ [PR94775]
On Thu, Jan 28, 2021 at 11:02:36PM +0100, Eric Botcazou wrote: > > Bootstrapped/regtested on > > * x86_64-pc-linux-gnu > > * powerpc64le-unknown-linux-gnu > > * aarch64-linux-gnu > > ok for trunk? > > None of them is strict alignment though, isn't it? Aware. I don't have access to, e.g., a sparc box. But the test I've added uses -mstrict-align where possible to check that the issue is resolved. -- Marek Polacek • Red Hat, Inc. • 300 A St, Boston, MA
Re: [PATCH] tree: Don't reuse types if TYPE_USER_ALIGN differ [PR94775]
> Aware. I don't have access to, e.g., a sparc box. But the test I've added > uses -mstrict-align where possible to check that the issue is resolved. There are relatively fast SPARC64/Linux (gcc202) and SPARC/Solaris machines (gcc210 and gcc211) in the Compile Farm: https://cfarm.tetaneutral.net/machines/list/ -- Eric Botcazou
Re: [PATCH] tree: Don't reuse types if TYPE_USER_ALIGN differ [PR94775]
On Thu, Jan 28, 2021 at 11:38:34PM +0100, Eric Botcazou wrote: > > Aware. I don't have access to, e.g., a sparc box. But the test I've added > > uses -mstrict-align where possible to check that the issue is resolved. > > There are relatively fast SPARC64/Linux (gcc202) and SPARC/Solaris machines > (gcc210 and gcc211) in the Compile Farm: > https://cfarm.tetaneutral.net/machines/list/ Ah, great. I'll give gcc202 a go. Thanks, Marek
[PATCH] adjust "partly out of bounds" warning (PR 98503)
The GCC 11 -Warray-bounds enhancement to diagnose accesses whose leading offset is in bounds but whose trailing offset is not has been causing some confusion. When the warning is issued for an access to an in-bounds member via a pointer to a struct that's larger than the pointed-to object, phrasing this strictly invalid access in terms of array subscripts can be misleading, especially when the source code doesn't involve any arrays or indexing. Since the problem boils down to an aliasing violation much more so than an actual out-of-bounds access, the attached patch adjusts the code to issue a -Wstrict-aliasing warning in these cases instead of -Warray-bounds. In addition, since the aliasing assumptions in GCC can be disabled by -fno-strict-aliasing, the patch also makes these instances of the warning conditional on -fstrict-aliasing being in effect. Martin PR middle-end/98503 -Warray-bounds when -Wstrict-aliasing would be more appropriate gcc/ChangeLog: PR middle-end/98503 * gimple-array-bounds.cc (array_bounds_checker::check_mem_ref): Issue -Wstrict-aliasing for a subset of violations. (array_bounds_checker::check_array_bounds): Set new member. * gimple-array-bounds.h (array_bounds_checker::cref_of_mref): New data member. gcc/testsuite/ChangeLog: PR middle-end/98503 * g++.dg/warn/Warray-bounds-10.C: Adjust text of expected warnings. * g++.dg/warn/Warray-bounds-11.C: Same. * g++.dg/warn/Warray-bounds-12.C: Same. * g++.dg/warn/Warray-bounds-13.C: Same. * gcc.dg/Warray-bounds-63.c: Avoid -Wstrict-aliasing. Adjust text of expected warnings. * gcc.dg/Warray-bounds-66.c: Adjust text of expected warnings. * gcc.dg/Wstrict-aliasing-2.c: New test. * gcc.dg/Wstrict-aliasing-3.c: New test. diff --git a/gcc/gimple-array-bounds.cc b/gcc/gimple-array-bounds.cc index 2576556f76b..f6b2af0d681 100644 --- a/gcc/gimple-array-bounds.cc +++ b/gcc/gimple-array-bounds.cc @@ -670,6 +670,8 @@ array_bounds_checker::check_mem_ref (location_t location, tree ref, axssize = wi::to_offset (access_size); const bool uboob = !lboob && offrange[0] + axssize > ubound; + /* Set to OFFRANGE converted to index range. */ + offset_int idxrange[2] = { offrange[0], offrange[1] }; if (lboob || uboob) { /* Treat a reference to a non-array object as one to an array @@ -681,43 +683,84 @@ array_bounds_checker::check_mem_ref (location_t location, tree ref, to compute the index to print in the diagnostic; arrays in MEM_REF don't mean anything. A type with no size like void is as good as having a size of 1. */ - tree type = TREE_TYPE (ref); - while (TREE_CODE (type) == ARRAY_TYPE) - type = TREE_TYPE (type); + tree type = strip_array_types (TREE_TYPE (ref)); if (tree size = TYPE_SIZE_UNIT (type)) { - offrange[0] = offrange[0] / wi::to_offset (size); - offrange[1] = offrange[1] / wi::to_offset (size); + idxrange[0] = offrange[0] / wi::to_offset (size); + idxrange[1] = offrange[1] / wi::to_offset (size); } } if (lboob) { - if (offrange[0] == offrange[1]) + if (idxrange[0] == idxrange[1]) warned = warning_at (location, OPT_Warray_bounds, "array subscript %wi is outside array bounds " "of %qT", - offrange[0].to_shwi (), reftype); + idxrange[0].to_shwi (), reftype); else warned = warning_at (location, OPT_Warray_bounds, "array subscript [%wi, %wi] is outside " "array bounds of %qT", - offrange[0].to_shwi (), - offrange[1].to_shwi (), reftype); + idxrange[0].to_shwi (), + idxrange[1].to_shwi (), reftype); } else if (uboob && !ignore_off_by_one) { + bool done = false; + tree backtype = reftype; if (alloc_stmt) /* If the memory was dynamically allocated refer to it as if it were an untyped array of bytes. */ backtype = build_array_type_nelts (unsigned_char_type_node, arrbounds[1].to_uhwi ()); + else if (cref_of_mref) + { + /* For a COMPONENT_REF (struct S, MEM_REF (T, ...), fld) see + if the offset of fld's last byte is in bounds of struct S, + and if so, issue -Wstrict-aliasing instead. */ + tree fld = TREE_OPERAND (cref_of_mref, 1); + offset_int fldoff = offrange[0] + wi::to_offset (byte_position (fld)); + if (tree fldsiz = DECL_SIZE_UNIT (fld)) + if (tree_fits_uhwi_p (fldsiz)) + fldoff += wi::to_offset (fldsiz); + + if (fldoff <= arrbounds[1]) + { + if (flag_strict_aliasing) + { + /* Only warn when -fstrict-aliasing is enabled (as per + the manual). */ + if (offrange[0] == 0) + warned = warning_at (location, OPT_Wstrict_aliasing, + "access to %qT by an lvalue of %qT " + "violates strict-aliasing rules", + TREE_TYPE (reftype), axstype); + else + warned = warning_at (location, OPT_Wstrict_aliasing, + "access to %qT by %<%T[%wi]%> " + "violates strict-aliasing rules", + TREE_TYPE (reftype), axstype, +
Go driver patch committed: Always act as though -g was passed
The go1 compiler always turns on debugging, to support Go stack traces and functions like runtime.Callers. With the recent switch to turn on DWARF 5 by default, this caused failures with some versions of gas, such as 2.35.1, because the assembly code would assume DWARF 5 but the driver would not pass --gdwarf-5 to gas. gas would then give an error: "file number less than one". This change avoids that problem by having the gccgo driver spec add a -g option to the command line if no other -g option is present. The newly added -g option is passed to the assembler as --gdwarf-5. Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu. Committed to mainline. Ian * gospec.c (lang_specific_driver): Add -g if no debugging options were passed. a7dbc8f83a4c146f5ab851ad7796c3ddbfe17b51 diff --git a/gcc/go/gospec.c b/gcc/go/gospec.c index aaf64e73949..cf8d0f2b60e 100644 --- a/gcc/go/gospec.c +++ b/gcc/go/gospec.c @@ -127,6 +127,9 @@ lang_specific_driver (struct cl_decoded_option **in_decoded_options, /* The first input file with an extension of .go. */ const char *first_go_file = NULL; + /* Whether we saw any -g option. */ + bool saw_opt_g = false; + argc = *in_decoded_options_count; decoded_options = *in_decoded_options; added_libraries = *in_added_libraries; @@ -208,6 +211,18 @@ lang_specific_driver (struct cl_decoded_option **in_decoded_options, saw_opt_o = true; break; + case OPT_g: + case OPT_gdwarf: + case OPT_gdwarf_: + case OPT_ggdb: + case OPT_gstabs: + case OPT_gstabs_: + case OPT_gvms: + case OPT_gxcoff: + case OPT_gxcoff_: + saw_opt_g = true; + break; + case OPT_static: static_link = 1; break; @@ -271,6 +286,15 @@ lang_specific_driver (struct cl_decoded_option **in_decoded_options, j++; } + /* The go1 compiler is going to enable debug info by default. If we + don't see any -g options, force -g, so that we invoke the + assembler with the right debug option. */ + if (!saw_opt_g) +{ + generate_option (OPT_g, "1", 0, CL_DRIVER, &new_decoded_options[j]); + j++; +} + /* NOTE: We start at 1 now, not 0. */ while (i < argc) {
Re: Default to DWARF5
The subject commit, 3804e937b0e252a7e42632fe6d9f898f1851a49c, causes a failure in the test suite for the IBM Advance Toolchain. The test in question uses "perf probe" to set a tracepoint at "main" in a newly built (with GCC 11) binary of "/bin/ld". With the patch applied, the command enters an infinte loop, calling libdw1 functions but making no progress. The infinite loop can be found in the Linux kernel tools/perf/utils/probe-finder.c: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/util/probe-finder.c?h=v5.11-rc5#n1190 Reverting this patch permits the command to succeed. PC
[PATCH] Add conversions between _Float128 and Decimal.
[PATCH] Add conversions between _Float128 and Decimal. This patch implements conversions between _Float128 and the 3 Decimal floating types. It does by extendending the dfp-bit conversions to add a new binary floating point type (KF), and doing the conversions in the same mannor as the other binary/decimal conversions. In particular for conversions from _Float128 to Decimal, it uses a sprintf variant to convert _Float128 to strings, and a type specific function that converts the string output to the appropriate Decimal type For conversions from one of the Decimal types to _Float128, it uses a decimal function to convert to string (i.e. __decimalToString), and then uses a variant of strtold to convert to _Float128. If the user is linked against GLIBC 2.32 or newer, then the sprintf and strtold variant functions can use the features directly in GLIBC 2.32 to do this conversion. If you have an older GLIBC and want to convert _Float128 to one of the Decimal types, it will convert the _Float128 to __ibm128 and then convert that to Decimal. Similarly if you have one of the Decimal types, and want to convert to _Float128, it will first convert the Decimal type to __ibm128, and then convert __ibm128 to _Float128. These functions will primarily be used if/when the default PowerPC long double type is changed to IEEE 128-bit, but they could also be used if the user explicitly converts _Float128 to/from a Decimal type. One test case relating to Decimal fails if I build a compiler where the default is IEEE 128-bit: * c-c++-common/dfp/convert-bfp-11.c I have patches for this test, and they have been submitted separately. I have tested this patch by doing builds, bootstraps, and make check with 3 builds on a power9 little endian server: * Build one used the default long double being IBM 128-bit; * Build two set the long double default to IEEE 128-bit; (and) * Build three set the long double default to 64-bit. The compilers built fine providing I recompiled gmp, mpc, and mpfr with the appropriate long double options. There were a few differences in the test suite runs that will be addressed in later patches, but over all it works well. This patch is required to be able to build a toolchain where the default long double is IEEE 128-bit. Can I check this patch into the master branch for GCC 11? I have also built compilers with this patch on a big endian power8 system that has both 32-bit and 64-bit support. There were no regressions in running these tests on the system. Can I check this patch into the master branch? libgcc/ 2021-01-28 Michael Meissner * config/rs6000/_dd_to_kf.c: New file. * config/rs6000/_kf_to_dd.c: New file. * config/rs6000/_kf_to_sd.c: New file. * config/rs6000/_kf_to_td.c: New file. * config/rs6000/_sd_to_kf.c: New file. * config/rs6000/_sprintfkf.c: New file. * config/rs6000/_sprintfkf.h: New file. * config/rs6000/_strtokf.h: New file. * config/rs6000/_strtokf.c: New file. * config/rs6000/_td_to_kf.c: New file. * config/rs6000/quad-float128.h: Add new declarations. * config/rs6000/t-float128 (fp128_dec_funcs): New macro. (fp128_decstr_funcs): New macro. (ibm128_dec_funcs): New macro. (fp128_ppc_funcs): Add the new conversions. (fp128_dec_objs): Force Decimal <-> __float128 conversions to be compiled with -mabi=ieeelongdouble. (fp128_decstr_objs): Force __float128 <-> string conversions to be compiled with -mabi=ibmlongdouble. (ibm128_dec_objs): Force Decimal <-> __float128 conversions to be compiled with -mabi=ieeelongdouble. (FP128_CFLAGS_DECIMAL): New macro. (IBM128_CFLAGS_DECIMAL): New macro. * dfp-bit.c (DFP_TO_BFP): Add PowerPC _Float128 support. (BFP_TO_DFP): Add PowerPC _Float128 support. * dfp-bit.h (BFP_KIND): Add new binary floating point kind for IEEE 128-bit floating point. (DFP_TO_BFP): Add PowerPC _Float128 support. (BFP_TO_DFP): Add PowerPC _Float128 support. (BFP_SPRINTF): New macro. --- libgcc/config/rs6000/_dd_to_kf.c | 37 ++ libgcc/config/rs6000/_kf_to_dd.c | 37 ++ libgcc/config/rs6000/_kf_to_sd.c | 37 ++ libgcc/config/rs6000/_kf_to_td.c | 37 ++ libgcc/config/rs6000/_sd_to_kf.c | 37 ++ libgcc/config/rs6000/_sprintfkf.c| 57 libgcc/config/rs6000/_sprintfkf.h| 28 ++ libgcc/config/rs6000/_strtokf.c | 56 +++ libgcc/config/rs6000/_strtokf.h | 27 + libgcc/config/rs6000/_td_to_kf.c | 37 ++ libgcc/config/rs6000/quad-float128.h | 8 libgcc/config/rs6000/t-float128 | 37 +- libgcc/dfp-bit.c | 12 +- libgcc/dfp-bit.h |
Re: [Ping] PowerPC: Add float128/Decimal conversions.
I rebusmitted the patch after verifying it still builds and works with the current branch as: https://gcc.gnu.org/pipermail/gcc-patches/2021-January/564486.html | Subject: [PATCH] Add conversions between _Float128 and Decimal. | Message-ID: <20210129024208.ga25...@ibm-toto.the-meissners.org> -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797
[PATCH, rs6000] Optimization for PowerPC 64bit constant generation [PR94395]
Hi, This patch tries to optimize PowerPC 64 bit constant generation when the constant can be transformed from a 32 bit or 16 bit constant by rotating, shifting and mask AND. The attachments are the patch diff file and change log file. Bootstrapped and tested on powerpc64le with no regressions. Is this okay for trunk? Any recommendations? Thanks a lot. PR target/94395 * config/rs6000/rs6000.c (rs6000_emit_set_32bit_const, rs6000_rotate_long_const, rs6000_peel_long_const): New functions. (rs6000_emit_set_long_const, num_insns_constant_gpr): Call new functions. * testsuite/gcc.target/powerpc/pr94395.c: New test. diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index f26fc13484b..bcb867ffe94 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -1109,6 +1109,9 @@ static tree rs6000_handle_longcall_attribute (tree *, tree, tree, int, bool *); static tree rs6000_handle_altivec_attribute (tree *, tree, tree, int, bool *); static tree rs6000_handle_struct_attribute (tree *, tree, tree, int, bool *); static tree rs6000_builtin_vectorized_libmass (combined_fn, tree, tree); +static HOST_WIDE_INT rs6000_rotate_long_const (unsigned HOST_WIDE_INT, int *); +static HOST_WIDE_INT rs6000_peel_long_const (unsigned HOST_WIDE_INT, int *, +int *); static void rs6000_emit_set_long_const (rtx, HOST_WIDE_INT); static int rs6000_memory_move_cost (machine_mode, reg_class_t, bool); static bool rs6000_debug_rtx_costs (rtx, machine_mode, int, int, int *, bool); @@ -5868,12 +5871,28 @@ num_insns_constant_gpr (HOST_WIDE_INT value) else if (TARGET_POWERPC64) { + int rotate, head, tail; + HOST_WIDE_INT imm1, imm2; + unsigned HOST_WIDE_INT uc = value; HOST_WIDE_INT low = ((value & 0x) ^ 0x8000) - 0x8000; HOST_WIDE_INT high = value >> 31; if (high == 0 || high == -1) return 2; + /* A long constant can be transformed from both a 16 bit constant and +a 32 bit constant. So we first test imm1 and imm2 if they're 16 +bit. */ + imm1 = rs6000_rotate_long_const (uc, &rotate); + if (SIGNED_INTEGER_16BIT_P (imm1)) + return 2; + imm2 = rs6000_peel_long_const (uc, &head, &tail); + if (SIGNED_INTEGER_16BIT_P (imm2)) + return 2; + if (SIGNED_INTEGER_NBIT_P (imm1, 32) + || SIGNED_INTEGER_NBIT_P (imm2, 32)) + return 3; + high >>= 1; if (low == 0 || low == high) @@ -9720,6 +9739,96 @@ rs6000_emit_set_const (rtx dest, rtx source) return true; } +/* Function to load 32 a bit constant. */ +static void +rs6000_emit_set_32bit_const (rtx dest, HOST_WIDE_INT c) +{ + gcc_assert (SIGNED_INTEGER_NBIT_P (c, 32)); + + rtx temp = can_create_pseudo_p () ? gen_reg_rtx (DImode) : dest; + + if (SIGNED_INTEGER_16BIT_P (c)) +emit_insn (gen_rtx_SET (dest, GEN_INT (c))); + else +{ + emit_insn (gen_rtx_SET (copy_rtx (temp), +GEN_INT (c & ~(HOST_WIDE_INT) 0x))); + emit_insn (gen_rtx_SET (dest, + gen_rtx_IOR (DImode, copy_rtx (temp), + GEN_INT (c & 0x; +} +} + +/* Helper function of rs6000_emit_set_long_const to left rotate a long + constant. It returns the result immediately when it finds a 32 bit + constant. It at most rotates for 31 bits. + For instant, the constant 0x1234 can be transformed to + a 32 bit constant 0x4123 by left rotating 12 bits. */ +static HOST_WIDE_INT +rs6000_rotate_long_const (unsigned HOST_WIDE_INT c, int *rot) +{ + int bitsize = GET_MODE_BITSIZE (DImode); + bool found = false; + unsigned HOST_WIDE_INT imm = c; + unsigned HOST_WIDE_INT m = imm >> (bitsize - 1); + int rotate = 0; + + while (rotate < 31 && !found) +{ + imm = imm << 1 | m; + if (clz_hwi (imm) > 32 || clz_hwi (~imm) > 32) + found = true; + rotate++; + m = imm >> (bitsize - 1); +} + + *rot = rotate; + return imm; +} + +/* Helper function of rs6000_emit_set_long_const to reutrn a constant by + removing consecutive 0s and 1s at the head and tail then setting all high + bits. + For instance, 0x00fff2345000 can be transformed to 0xfff2345 by + peeling the head and tail, then to 0x234 by setting all + high bits. + + 0x0 0 0 0 0 0 f f f 2 3 4 5 0 0 0 + |_| |___| + head_zero tail_zero + + 0xf f f f f f f f f 2 3 4 5 f f f + |___||___| + head_one tail_one + + The HEAD_PAD is set to the number of consecutive 0s at the head. If there + is no consecutive 0s, HEAD_PAD is set to 0. + If the tail has consecutive 0s, TAIL_PAD is a positive number. Otherwise, + TAIL_PAD is set to a negative number. */ +static HOST_WIDE_INT +rs6000_peel_long_const (unsigned HOS
[PATCH] c++: Fix infinite looping with invalid operator [PR96137]
My r11-86 adjusted cp_parser_class_name to do - scope = parser->scope; + scope = parser->scope ? parser->scope : parser->context->object_type; if (scope == error_mark_node) return error_mark_node; but that caused endless looping in cp_parser_type_specifier_seq (the while (true) loop) in this invalid test, because we never set a parser error, therefore cp_parser_type_specifier returned error_mark_node instead of NULL_TREE, and we never issued the "expected type-specifier" error. At first I thought I'd just add cp_parser_simulate_error right before the return, but that regresses crash81.C -- we'd emit multiple errors for "T::X". So the next best thing seemed to revert to pre-r11-86 behavior: return early when parser->scope is bad, otherwise proceed to get the parser error. Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk? gcc/cp/ChangeLog: PR c++/96137 * parser.c (cp_parser_class_name): If parser->scope is error_mark_node, return it, otherwise continue. gcc/testsuite/ChangeLog: PR c++/96137 * g++.dg/parse/error63.C: New test. --- gcc/cp/parser.c | 4 +++- gcc/testsuite/g++.dg/parse/error63.C | 8 2 files changed, 11 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/g++.dg/parse/error63.C diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c index e196db14113..5c1d880c9fc 100644 --- a/gcc/cp/parser.c +++ b/gcc/cp/parser.c @@ -24559,7 +24559,9 @@ cp_parser_class_name (cp_parser *parser, where we first want to look up A::a in the class of the object expression, as per [basic.lookup.classref]. */ tree scope = parser->scope ? parser->scope : parser->context->object_type; - if (scope == error_mark_node) + /* This only checks parser->scope to avoid duplicate errors; if + ->object_type is erroneous, go on to give a parse error. */ + if (parser->scope == error_mark_node) return error_mark_node; /* Any name names a type if we're following the `typename' keyword diff --git a/gcc/testsuite/g++.dg/parse/error63.C b/gcc/testsuite/g++.dg/parse/error63.C new file mode 100644 index 000..5472ef05a64 --- /dev/null +++ b/gcc/testsuite/g++.dg/parse/error63.C @@ -0,0 +1,8 @@ +// PR c++/96137 +// { dg-do compile } + +void +fn () +{ + X.operator T(); // { dg-error ".X. was not declared in this scope|expected" } +} base-commit: e6bce7fe17bf32ce969abc6f77f07acd352f6977 -- 2.29.2
Re: [PATCH] c++: Fix infinite looping with invalid operator [PR96137]
On 1/28/21 10:57 PM, Marek Polacek wrote: My r11-86 adjusted cp_parser_class_name to do - scope = parser->scope; + scope = parser->scope ? parser->scope : parser->context->object_type; if (scope == error_mark_node) return error_mark_node; but that caused endless looping in cp_parser_type_specifier_seq (the while (true) loop) in this invalid test, because we never set a parser error, therefore cp_parser_type_specifier returned error_mark_node instead of NULL_TREE, and we never issued the "expected type-specifier" error. At first I thought I'd just add cp_parser_simulate_error right before the return, but that regresses crash81.C -- we'd emit multiple errors for "T::X". So the next best thing seemed to revert to pre-r11-86 behavior: return early when parser->scope is bad, otherwise proceed to get the parser error. Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk? OK. gcc/cp/ChangeLog: PR c++/96137 * parser.c (cp_parser_class_name): If parser->scope is error_mark_node, return it, otherwise continue. gcc/testsuite/ChangeLog: PR c++/96137 * g++.dg/parse/error63.C: New test. --- gcc/cp/parser.c | 4 +++- gcc/testsuite/g++.dg/parse/error63.C | 8 2 files changed, 11 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/g++.dg/parse/error63.C diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c index e196db14113..5c1d880c9fc 100644 --- a/gcc/cp/parser.c +++ b/gcc/cp/parser.c @@ -24559,7 +24559,9 @@ cp_parser_class_name (cp_parser *parser, where we first want to look up A::a in the class of the object expression, as per [basic.lookup.classref]. */ tree scope = parser->scope ? parser->scope : parser->context->object_type; - if (scope == error_mark_node) + /* This only checks parser->scope to avoid duplicate errors; if + ->object_type is erroneous, go on to give a parse error. */ + if (parser->scope == error_mark_node) return error_mark_node; /* Any name names a type if we're following the `typename' keyword diff --git a/gcc/testsuite/g++.dg/parse/error63.C b/gcc/testsuite/g++.dg/parse/error63.C new file mode 100644 index 000..5472ef05a64 --- /dev/null +++ b/gcc/testsuite/g++.dg/parse/error63.C @@ -0,0 +1,8 @@ +// PR c++/96137 +// { dg-do compile } + +void +fn () +{ + X.operator T(); // { dg-error ".X. was not declared in this scope|expected" } +} base-commit: e6bce7fe17bf32ce969abc6f77f07acd352f6977
Re: [PATCH PR97627]Avoid computing niters info for fake edges
On Thu, Jan 28, 2021 at 11:31 AM Bin.Cheng wrote: > > On Thu, Jan 28, 2021 at 5:08 PM Richard Biener via Gcc-patches > wrote: > > > > On Thu, Jan 28, 2021 at 3:49 AM bin.cheng via Gcc-patches > > wrote: > > > > > > Hi, > > > As described in commit message, we need to avoid computing niters info > > > for fake > > > edges. This simple patch does this by two changes. > > > > > > Bootstrap and test on X86_64, is it ok? > > > > Hmm, so I think the patch is a bit complicated and avoiding niter compute > > for fake edges would be easier when just returning false for > > fake edges in number_of_iterations_exit_assumptions? > I just grepped calls to get_loop_exit_edges, and thought there might > be cases other than niters analysis that would also like to skip fake > edges. But I didn't check the calls one by one. My hunch is that the usual APIs always want to ignore them, but let's do a minimal fix that we can backport easily. > > > > Which pass was the problematical that had infinite loops connected to exit? > > > > I guess the cfgloop code should simply ignore fake exits - they mostly > > exist to make reverse CFG walks easy. Specifically single_exit > > and single_likely_exit but also exit edge recording should ignore them. > > > > That said, the testcase seems to be fixed with just > > > > diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c > > index 7d61ef080eb..7775bc7275c 100644 > > --- a/gcc/tree-ssa-loop-niter.c > > +++ b/gcc/tree-ssa-loop-niter.c > > @@ -2407,6 +2407,11 @@ number_of_iterations_exit_assumptions (class > > loop *loop, edge exit, > >affine_iv iv0, iv1; > >bool safe; > > > > + /* The condition at a fake exit (if it exists) does not control its > > + execution. */ > > + if (exit->flags & EDGE_FAKE) > > +return false; > > + > >/* Nothing to analyze if the loop is known to be infinite. */ > >if (loop_constraint_set_p (loop, LOOP_C_INFINITE)) > > return false; > > > > Your dfs_find_deadend change likely "breaks" post-dominance DFS order > > (this is a very fragile area). > > > > So any objection to just simplify the patch to the above hunk? > Considering we are in late stage3? No objection to this change. But I > do think dfs_find_deadend needs to be improved, if not as this patch > does. For a loop nest with the outermost loop as the infinite one, > the function adds fake (exit) edges for inner loops, which is > counter-intuitive. Sure, but then this is independent of the PR. As said, the fake exits only exist to make reverse CFG walkers easier - yes, for natural infinite loops we'd like to have "intuitive" post-dom behavior but for example for irreducible regions there's not much to do. Richard. > Thanks, > bin > > > > Thanks, > > Richard. > > > > > Thanks, > > > bin