RE: [PATCH] AArch64: Add if condition in aarch64_function_value [PR96479]
Hi Richard,

Thanks for the review and explanation. The previous fix, which added a TARGET_FLOAT condition, does break the glibc-2.29 build.

I checked the history of aarch64_init_cumulative_args and could not find the reason why Alan Lawrence added TREE_PUBLIC (fndecl) as one of the conditions for entering the function type check. Maybe Alan could clarify? I tried deleting TREE_PUBLIC (fndecl), which turns out to solve both the glibc problem and the previous ICE. The new fix below passed bootstrap and the DejaGnu tests.

I believe this fix is reasonable, since the function type should be checked whether or not the function has external linkage. aarch64_init_cumulative_args checks the function type and should catch the error that "-mgeneral-regs-only" is incompatible with the use of SIMD/FP registers. In the test case for PR96479, the function myfunc2 returns a vector of 4 integers but is defined static. TREE_PUBLIC (fndecl) is therefore false, so the if statement is not entered and the function type is not checked. With "TREE_PUBLIC (fndecl)" deleted, GCC can catch the error in aarch64_init_cumulative_args, and the test case from PR96479 now gets a diagnostic instead of an ICE.

The patch for the fix is as follows:

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index b7f5bc76f1b..9ce83dce131 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -6017,7 +6017,7 @@ aarch64_init_cumulative_args (CUMULATIVE_ARGS *pcum,
   if (!silent_p
       && !TARGET_FLOAT
-      && fndecl && TREE_PUBLIC (fndecl)
+      && fndecl
       && fntype && fntype != error_mark_node)
     {
       const_tree type = TREE_TYPE (fntype);

Christophe, thanks for your tests on glibc-2.29. With the above fix, I built glibc-2.29 and the previous error no longer shows up. Could you please check whether this fix works for you? Do you have any suggestions on this fix?
All the best,
Peixin

-----Original Message-----
From: Richard Sandiford [mailto:richard.sandif...@arm.com]
Sent: Thursday, August 13, 2020 8:19 PM
To: Christophe Lyon
Cc: qiaopeixin; gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] AArch64: Add if condition in aarch64_function_value [PR96479]

Christophe Lyon writes:
> On Thu, 13 Aug 2020 at 03:54, qiaopeixin wrote:
>>
>> Thanks for the review and commit.
>>
>> All the best,
>> Peixin
>>
>> -----Original Message-----
>> From: Richard Sandiford [mailto:richard.sandif...@arm.com]
>> Sent: August 13, 2020 0:25
>> To: qiaopeixin
>> Cc: gcc-patches@gcc.gnu.org
>> Subject: Re: [PATCH] AArch64: Add if condition in
>> aarch64_function_value [PR96479]
>>
>> qiaopeixin writes:
>> > Hi,
>> >
>> > The test case vector-subscript-2.c in the gcc testsuite will report an ICE
>> > in the expand pass since '-mgeneral-regs-only' is incompatible with the
>> > use of V4SI mode. I propose to report the diagnostic information instead
>> > of ICE, and the problem has been discussed on PR 96479.
>> >
>> > I attached the patch to solve the problem. Bootstrapped and tested on
>> > aarch64-linux-gnu. Any suggestions?
>>
>> Thanks, pushed. I was initially sceptical because raising an error here and
>> in aarch64_layout_arg is a hack. Both functions are just query functions
>> and shouldn't have any side effects.
>>
>> The approach we took for FP modes seemed better: we define the FP move
>> patterns unconditionally, and raise an error if we try to emit an FP move
>> with !TARGET_FLOAT. This defers any error reporting until we actually try
>> to generate code that depends on TARGET_FLOAT.
>>
>> But I guess SIMD stuff is different. There's no reason in principle why you
>> can't use:
>>
>> unsigned short __attribute__((vector_size(8)))
>>
>> *within* a function with -mgeneral-regs-only. It would just need to be
>> emulated, in the same way as for:
>>
>> unsigned short __attribute__((vector_size(4)))
>>
>> So it would be wrong to define the SIMD move patterns unconditionally and
>> raise an error there.
>>
>> So all in all, I agree this is the best we can do given the current
>> infrastructure.
>>
>
> Since this patch was committed my buildbot is broken for
> aarch64-linux-gnu because it now fails to build glibc-2.29:
> ../stdlib/bits/stdlib-float.h: In function 'atof':
> ../stdlib/bits/stdlib-float.h:26:1: error: '-mgeneral-regs-only' is
> incompatible with the use of floating-point types

Thanks for the heads-up. I've reverted the patch for now.

Looking more closely, it seems like aarch64_init_cumulative_args already tries to catch the problem that the patch was fixing:

  if (!silent_p
      && !TARGET_FLOAT
      && fndecl && TREE_PUBLIC (fndecl)
      && fntype && fntype != error_mark_node)
    {
      const_tree type = TREE_TYPE (fntype);
      machine_mode mode ATTRIBUTE_UNUSED; /* To pass pointer as argument. */
      int nregs ATTRIBUTE_UNUSED; /* Likewise. */
      if
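For readers following the thread, the shape of code under discussion (a static function returning a vector of four integers) can be sketched roughly as below. This is only an illustration, not the PR96479 test case itself: the type name v4si, the function bodies and the wrapper lane_of_myfunc2 are assumptions, and the code relies on the GCC vector extension. Whether it is rejected when built for AArch64 with -mgeneral-regs-only depends on the compiler version and on the check discussed above.

```c
/* Sketch only: v4si and the function bodies are illustrative assumptions,
   not copied from the PR96479 test case.  Uses GCC vector extensions.  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Static, so the function has no external linkage; in GCC terms
   TREE_PUBLIC (fndecl) is false for it.  */
static v4si
myfunc2 (int x)
{
  v4si v = { x, x + 1, x + 2, x + 3 };
  return v;
}

/* Public wrapper (hypothetical) that reads one lane of the result.  */
int
lane_of_myfunc2 (int x, int lane)
{
  v4si v = myfunc2 (x);
  return v[lane & 3];
}
```

On a target where vector types are usable, lane_of_myfunc2 (10, 3) simply reads the last lane of the returned vector.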
Re: [PATCH v2] C-SKY: Support -mfloat-abi=hard.
Hi Jojo,

This rule is not stated directly anywhere. But the indent options shown in https://www.gnu.org/prep/standards/html_node/Formatting.html#Formatting correspond to the recommended C formatting style, and they use the default tab width of 8 columns.

On 8/18/20 9:42 AM, Jojo R wrote:

Hi,

Is there a coding rule for this? I cannot find it in https://www.gnu.org/prep/standards/html_node/index.html or https://gcc.gnu.org/codingconventions.html

Could you give me any hints?

Thanks.

Jojo

On August 17, 2020 at 11:05 PM +0800, Xianmiao Qu wrote:

Hi Jojo,

On 8/17/20 7:09 PM, Jojo R wrote:

diff --git a/gcc/config/csky/csky.c b/gcc/config/csky/csky.c
index 7ba3ed3..b71291a 100644
--- a/gcc/config/csky/csky.c
+++ b/gcc/config/csky/csky.c
@@ -328,6 +328,16 @@ csky_cpu_cpp_builtins (cpp_reader *pfile)
     {
       builtin_define ("__csky_hard_float__");
       builtin_define ("__CSKY_HARD_FLOAT__");
+      if (TARGET_HARD_FLOAT_ABI)
+        {
+          builtin_define ("__csky_hard_float_abi__");
+          builtin_define ("__CSKY_HARD_FLOAT_ABI__");
+        }
+      if (TARGET_SINGLE_FPU)
+        {
+          builtin_define ("__csky_hard_float_fpu_sf__");
+          builtin_define ("__CSKY_HARD_FLOAT_FPU_SF__");
+        }
     }

There is one more thing you should pay attention to: if the number of spaces at the beginning of a line reaches 8, you should use a tab instead of 8 spaces.

Thanks,
Xianmiao
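The tab convention mentioned above can be illustrated with a small helper. This is a sketch under stated assumptions, not a tool that exists in the GCC tree: the name tabify_leading_spaces and its interface are hypothetical, and it only handles leading spaces (it does not deal with tabs already mixed into the indentation).

```c
#include <stddef.h>

/* Hypothetical helper, not part of GCC: copy LINE to OUT, replacing each
   full run of 8 leading spaces with one tab.  Returns the length of the
   result, or -1 if OUT (of OUT_SIZE bytes) is too small.  */
int
tabify_leading_spaces (const char *line, char *out, size_t out_size)
{
  /* Count the leading spaces.  */
  size_t spaces = 0;
  while (line[spaces] == ' ')
    spaces++;

  /* Measure the rest of the line.  */
  size_t tail = 0;
  while (line[spaces + tail] != '\0')
    tail++;

  size_t tabs = spaces / 8;   /* Full groups of 8 become tabs.  */
  size_t rest = spaces % 8;   /* Leftover spaces stay as spaces.  */
  size_t needed = tabs + rest + tail;
  if (needed + 1 > out_size)
    return -1;

  size_t pos = 0;
  for (size_t i = 0; i < tabs; i++)
    out[pos++] = '\t';
  for (size_t i = 0; i < rest; i++)
    out[pos++] = ' ';
  for (size_t i = 0; i < tail; i++)
    out[pos++] = line[spaces + i];
  out[pos] = '\0';
  return (int) pos;
}
```

For example, a line indented by 10 spaces would come out as one tab followed by 2 spaces, which is the layout the review comment asks for.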
Re: [PATCH]Don't use pinsr for struct initialization.
On Fri, Aug 14, 2020 at 5:57 PM Uros Bizjak wrote:
>
> On Fri, Aug 14, 2020 at 8:03 AM Hongtao Liu wrote:
> >
> > Hi:
> > For struct initialization, when it fits in a TImode, gcc will use
> > pinsr insn which causes poor codegen described in PR93897 and PR96562.
>
> You should probably remove TImode handling also from ix86_expand_pextr.
>

Yes, but I failed to construct a testcase to cover this part.
Anyway, the regression test for the i386/x86-64 backend is OK and bootstrap is OK.
I also ran the patch on SPEC2017, with no big impact.

> Uros.
>
> > Bootstrap is ok, regression test is ok for i386/x86-64 backend.
> > Ok for trunk?
> >
> > ChangeLog
> > gcc/
> > PR target/96562
> > PR target/93897
> > * config/i386/i386-expand.c (ix86_expand_pinsr): Don't use
> > pinsr for TImode.
> >
> > gcc/testsuite/
> > * gcc.target/i386/pr96562-1.c: New test.
> >
> > --
> > BR,
> > Hongtao

Update patch.

--
BR,
Hongtao

From 12e879c481ca7ff9c3477beb3dfd3b615dbe8f60 Mon Sep 17 00:00:00 2001
From: liuhongt
Date: Wed, 12 Aug 2020 10:48:17 +0800
Subject: [PATCH] Don't use pinsr/pextr for struct initialization/extraction.

gcc/
	PR target/96562
	PR target/93897
	* config/i386/i386-expand.c (ix86_expand_pinsr): Don't use
	pinsr for TImode.
	(ix86_expand_pextr): Don't use pextr for TImode.

gcc/testsuite/
	* gcc.target/i386/pr96562-1.c: New test.
--- gcc/config/i386/i386-expand.c | 2 - gcc/testsuite/gcc.target/i386/pr96562-1.c | 81 +++ 2 files changed, 81 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gcc.target/i386/pr96562-1.c diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c index e194214804b..9b585c8cc8c 100644 --- a/gcc/config/i386/i386-expand.c +++ b/gcc/config/i386/i386-expand.c @@ -20237,7 +20237,6 @@ ix86_expand_pextr (rtx *operands) case E_V4SImode: case E_V2DImode: case E_V1TImode: -case E_TImode: { machine_mode srcmode, dstmode; rtx d, pat; @@ -20333,7 +20332,6 @@ ix86_expand_pinsr (rtx *operands) case E_V4SImode: case E_V2DImode: case E_V1TImode: -case E_TImode: { machine_mode srcmode, dstmode; rtx (*pinsr)(rtx, rtx, rtx, rtx); diff --git a/gcc/testsuite/gcc.target/i386/pr96562-1.c b/gcc/testsuite/gcc.target/i386/pr96562-1.c new file mode 100644 index 000..6ebeeb1fb17 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr96562-1.c @@ -0,0 +1,81 @@ +/* { dg-do compile} */ +/* { dg-options "-msse4.1 -O2" } */ +/* { dg-final { scan-assembler-not "pinsr" } } */ + +typedef struct +{ + long long a; + int b; +} st1; + +typedef struct +{ + long long a; + int b; + short c; +} st2; + +typedef struct +{ + long long a; + int b; + short c; + char d; +} st3; + +typedef struct +{ + int b; + long long a; +} st4; + +typedef struct +{ + short c; + int b; + long long a; +} st5; + +typedef struct +{ + char d; + short c; + int b; + long long a; +} st6; + +st1 +foo1 (long long a, int b) +{ + return (st1){a, b}; +} + +st2 +foo2 (long long a, int b, short c) +{ + return (st2){a, b, c}; +} + +st3 +foo3 (long long a, int b, short c, char d) +{ + return (st3){a, b, c, d}; +} + +st4 +foo4 (long long a, int b) +{ + return (st4){b, a}; +} + +st5 +foo5 (long long a, int b, short c) +{ + return (st5){c, b, a}; +} + +st6 +foo6 (long long a, int b, short c, char d) +{ + return (st6){d, c, b, a}; +} -- 2.18.1
Re: [PATCH v2] C-SKY: Support -mfloat-abi=hard.
Hi,

Is there a coding rule for this? I cannot find it in https://www.gnu.org/prep/standards/html_node/index.html or https://gcc.gnu.org/codingconventions.html

Could you give me any hints?

Thanks.

Jojo

On August 17, 2020 at 11:05 PM +0800, Xianmiao Qu wrote:
> Hi Jojo,
>
>
> On 8/17/20 7:09 PM, Jojo R wrote:
> > diff --git a/gcc/config/csky/csky.c b/gcc/config/csky/csky.c
> > index 7ba3ed3..b71291a 100644
> > --- a/gcc/config/csky/csky.c
> > +++ b/gcc/config/csky/csky.c
> > @@ -328,6 +328,16 @@ csky_cpu_cpp_builtins (cpp_reader *pfile)
> >      {
> >        builtin_define ("__csky_hard_float__");
> >        builtin_define ("__CSKY_HARD_FLOAT__");
> > +      if (TARGET_HARD_FLOAT_ABI)
> > +        {
> > +          builtin_define ("__csky_hard_float_abi__");
> > +          builtin_define ("__CSKY_HARD_FLOAT_ABI__");
> > +        }
> > +      if (TARGET_SINGLE_FPU)
> > +        {
> > +          builtin_define ("__csky_hard_float_fpu_sf__");
> > +          builtin_define ("__CSKY_HARD_FLOAT_FPU_SF__");
> > +        }
> >      }
>
> There is one more thing you should pay attention to: if the number of
> spaces at the beginning of a line reaches 8, you should use a tab
> instead of 8 spaces.
>
>
> Thanks,
>
> Xianmiao
[committed] analyzer: fix name of local in region_model::get_rvalue_1
Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu. Pushed to master as da7c2773e56c889f4f131b80d4b91f1adbae80a2. gcc/analyzer/ChangeLog: * region-model.cc (region_model::get_rvalue_1): Fix name of local. --- gcc/analyzer/region-model.cc | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc index 305e9648c79..c3d9ca7f650 100644 --- a/gcc/analyzer/region-model.cc +++ b/gcc/analyzer/region-model.cc @@ -1118,8 +1118,8 @@ region_model::get_rvalue_1 (path_var pv, region_model_context *ctxt) case RESULT_DECL: case ARRAY_REF: { - const region *element_reg = get_lvalue (pv, ctxt); - return get_store_value (element_reg); + const region *reg = get_lvalue (pv, ctxt); + return get_store_value (reg); } case REALPART_EXPR: -- 2.26.2
[committed] analyzer: fix ICE on unhandled tree codes in get_rvalue_1 [PR96641]
The old implementation of region_model::get_rvalue_1 gracefully handled
tree codes it didn't understand, returning "UNKNOWN", whereas the new
implementation (r11-2694-g808f4dfeb3a95f50f15e71148e5c1067f90a126d)
had an assertion left over from development, leading to ICEs.

This patch restores the old behavior for these cases.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to master as r11-2729-g2242b975c08e150dd712d8e64341cae8457788ef.

gcc/analyzer/ChangeLog:
	PR analyzer/96641
	* region-model.cc (region_model::get_rvalue_1): Handle
	unrecognized tree codes by returning "UNKNOWN".

gcc/testsuite/ChangeLog:
	PR analyzer/96641
	* g++.dg/analyzer/pr96641.C: New test.
---
 gcc/analyzer/region-model.cc            |  2 +-
 gcc/testsuite/g++.dg/analyzer/pr96641.C | 18 ++++++++++++++++++
 2 files changed, 19 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/analyzer/pr96641.C

diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
index cd74c0f6195..305e9648c79 100644
--- a/gcc/analyzer/region-model.cc
+++ b/gcc/analyzer/region-model.cc
@@ -1097,7 +1097,7 @@ region_model::get_rvalue_1 (path_var pv, region_model_context *ctxt)
   switch (TREE_CODE (pv.m_tree))
     {
     default:
-      gcc_unreachable ();
+      return m_mgr->get_or_create_unknown_svalue (TREE_TYPE (pv.m_tree));
 
     case ADDR_EXPR:
       {
diff --git a/gcc/testsuite/g++.dg/analyzer/pr96641.C b/gcc/testsuite/g++.dg/analyzer/pr96641.C
new file mode 100644
index 000..eb11c8584b6
--- /dev/null
+++ b/gcc/testsuite/g++.dg/analyzer/pr96641.C
@@ -0,0 +1,18 @@
+struct uh {
+  virtual void
+  sx ();
+};
+
+struct iz : uh {
+  virtual void
+  sx ()
+  {
+    sx ();
+  }
+};
+
+void
+a2 ()
+{
+  iz ().sx ();
+}
--
2.26.2
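The pattern this patch restores (a conservative fallback in the default case instead of an unconditional abort) can be sketched in isolation as follows. All names here (expr_code, value_kind, classify_rvalue) are hypothetical stand-ins chosen for illustration; none of them come from the analyzer's actual API.

```c
/* Hypothetical stand-ins for tree codes and analyzer results; these
   are not GCC's real types.  */
enum expr_code { CODE_CONST, CODE_ADD, CODE_VEC_PERM };
enum value_kind { VALUE_KNOWN, VALUE_UNKNOWN };

/* Classify an expression code.  Codes the function does not model fall
   through to the default case and yield VALUE_UNKNOWN rather than
   aborting, mirroring the "return UNKNOWN" behavior described above.  */
enum value_kind
classify_rvalue (enum expr_code code)
{
  switch (code)
    {
    case CODE_CONST:
    case CODE_ADD:
      return VALUE_KNOWN;

    default:
      /* Unhandled code: give a conservative answer.  */
      return VALUE_UNKNOWN;
    }
}
```

The trade-off is the usual one: an assertion finds unhandled cases quickly during development, while a conservative fallback keeps the tool usable on inputs nobody anticipated.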
[committed] analyzer: fix ICE on unhandled tree codes in gassign [PR96640]
PR analyzer/96640 reports an ICE within region_model::on_assignment when
failing to handle a WIDEN_MULT_EVEN_EXPR, and various other tree codes.

The old implementation of region_model::on_assignment gracefully handled
tree codes it didn't understand, returning "UNKNOWN", whereas the new
implementation (r11-2694-g808f4dfeb3a95f50f15e71148e5c1067f90a126d)
had a "sorry_at" and an assertion left over from development, leading
to ICEs.

This patch restores the old behavior for these cases, and marks various
vector operations as leading to unknown results.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to master as r11-2728-g1b0be822208349b2795381deace2352e998c1ad0.

gcc/analyzer/ChangeLog:
	PR analyzer/96640
	* region-model.cc (region_model::get_gassign_result): Handle various
	VEC_* tree codes by returning UNKNOWN.
	(region_model::on_assignment): Handle unrecognized tree codes by
	setting lhs to an unknown value, rather than issuing a "sorry" and
	asserting.
---
 gcc/analyzer/region-model.cc | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
index 3c7ea40e8d8..cd74c0f6195 100644
--- a/gcc/analyzer/region-model.cc
+++ b/gcc/analyzer/region-model.cc
@@ -526,6 +526,22 @@ region_model::get_gassign_result (const gassign *assign,
     case VEC_SERIES_EXPR:
     case VEC_COND_EXPR:
     case VEC_PERM_EXPR:
+    case VEC_WIDEN_MULT_HI_EXPR:
+    case VEC_WIDEN_MULT_LO_EXPR:
+    case VEC_WIDEN_MULT_EVEN_EXPR:
+    case VEC_WIDEN_MULT_ODD_EXPR:
+    case VEC_UNPACK_HI_EXPR:
+    case VEC_UNPACK_LO_EXPR:
+    case VEC_UNPACK_FLOAT_HI_EXPR:
+    case VEC_UNPACK_FLOAT_LO_EXPR:
+    case VEC_UNPACK_FIX_TRUNC_HI_EXPR:
+    case VEC_UNPACK_FIX_TRUNC_LO_EXPR:
+    case VEC_PACK_TRUNC_EXPR:
+    case VEC_PACK_SAT_EXPR:
+    case VEC_PACK_FIX_TRUNC_EXPR:
+    case VEC_PACK_FLOAT_EXPR:
+    case VEC_WIDEN_LSHIFT_HI_EXPR:
+    case VEC_WIDEN_LSHIFT_LO_EXPR:
       return m_mgr->get_or_create_unknown_svalue (TREE_TYPE (lhs));
     }
 }
@@ -555,10 +571,12 @@ region_model::on_assignment (const gassign *assign, region_model_context *ctxt)
     {
     default:
       {
-        if (1)
+        if (0)
           sorry_at (assign->location, "unhandled assignment op: %qs",
                     get_tree_code_name (op));
-        gcc_unreachable ();
+        const svalue *unknown_sval
+          = m_mgr->get_or_create_unknown_svalue (TREE_TYPE (lhs));
+        set_value (lhs_reg, unknown_sval, ctxt);
       }
       break;

--
2.26.2
Go patch committed: Export thunks referenced by inline functions
This patch to the Go frontend exports thunks referenced by inline
functions. Otherwise we get a link time error. The test case is
https://golang.org/cl/248637. This fixes
https://golang.org/issue/40252. Bootstrapped and ran Go testsuite on
x86_64-pc-linux-gnu. Committed to mainline and GCC 10 branch.

Ian

5903b4561331e2a8907937baad8040e58b92aea3
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index e443282d0e8..e425f15285e 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-fe5d94c5792f7f990004c3dee0ea501835512200
+823c91088bc6ac606362fc34b2880ce0de1624ad
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/gcc/go/gofrontend/expressions.cc b/gcc/go/gofrontend/expressions.cc
index d295fd10136..8bbc557c65f 100644
--- a/gcc/go/gofrontend/expressions.cc
+++ b/gcc/go/gofrontend/expressions.cc
@@ -1635,16 +1635,15 @@ Func_descriptor_expression::do_get_backend(Translate_context* context)
               || no->name().find("equal") != std::string::npos))
         is_exported_runtime = true;
 
-      bool is_referenced_by_inline =
-        no->is_function() && no->func_value()->is_referenced_by_inline();
-
       bool is_hidden = ((no->is_function()
                          && no->func_value()->enclosing() != NULL)
                         || (Gogo::is_hidden_name(no->name())
-                            && !is_exported_runtime
-                            && !is_referenced_by_inline)
+                            && !is_exported_runtime)
                         || Gogo::is_thunk(no));
 
+      if (no->is_function() && no->func_value()->is_referenced_by_inline())
+        is_hidden = false;
+
       bvar = context->backend()->immutable_struct(var_name, asm_name,
                                                   is_hidden, false,
                                                   btype, bloc);
diff --git a/gcc/go/gofrontend/gogo.cc b/gcc/go/gofrontend/gogo.cc
index 13de74bc870..82d4c1fd54d 100644
--- a/gcc/go/gofrontend/gogo.cc
+++ b/gcc/go/gofrontend/gogo.cc
@@ -3370,7 +3370,8 @@ class Create_function_descriptors : public Traverse
   Gogo* gogo_;
 };
 
-// Create a descriptor for every top-level exported function.
+// Create a descriptor for every top-level exported function and every
+// function referenced by an inline function.
 
 int
 Create_function_descriptors::function(Named_object* no)
@@ -3378,8 +3379,9 @@ Create_function_descriptors::function(Named_object* no)
   if (no->is_function()
       && no->func_value()->enclosing() == NULL
       && !no->func_value()->is_method()
-      && !Gogo::is_hidden_name(no->name())
-      && !Gogo::is_thunk(no))
+      && ((!Gogo::is_hidden_name(no->name())
+           && !Gogo::is_thunk(no))
+          || no->func_value()->is_referenced_by_inline()))
     no->func_value()->descriptor(this->gogo_, no);
 
   return TRAVERSE_CONTINUE;
Re: [PATCH] libibery/hashtab: add new functions
On Mon, Aug 17, 2020 at 7:06 AM Martin Liška wrote:
>
> Adding libiberty maintainer to CC.

I guess I'm not sure why either of these belongs in libiberty.
htab_insert can be written elsewhere as needed. And while perhaps
some sort of stats API would be reasonable, I don't think it should be
something that prints values to a FILE.

Ian

> On 8/17/20 4:03 PM, Martin Liška wrote:
> > Hey.
> >
> > I'm working on binutils where I would like to port a hash table
> > implementation in gas/hash.[ch] to libiberty one.
> >
> > But it would be handy for me to add 2 new functions.
> >
> > Thoughts?
> > Thanks,
> > Martin
> >
> > include/ChangeLog:
> >
> >     * hashtab.h (htab_insert): New function.
> >     (htab_print_statistics): Likewise.
> >
> > libiberty/ChangeLog:
> >
> >     * hashtab.c (htab_insert): New function.
> >     (htab_print_statistics): Likewise.
> > ---
> >  include/hashtab.h   |  6 ++++++
> >  libiberty/hashtab.c | 23 +++++++++++++++++++++++
> >  2 files changed, 29 insertions(+)
> >
> > diff --git a/include/hashtab.h b/include/hashtab.h
> > index 6cca342b989..bcaee909bcf 100644
> > --- a/include/hashtab.h
> > +++ b/include/hashtab.h
> > @@ -37,6 +37,7 @@ extern "C" {
> >  #endif /* __cplusplus */
> >
> >  #include "ansidecl.h"
> > +#include <stdio.h>
> >
> >  /* The type for a hash code. */
> >  typedef unsigned int hashval_t;
> > @@ -172,6 +173,7 @@ extern void **htab_find_slot (htab_t, const void *,
> > enum insert_option);
> >  extern void *htab_find_with_hash (htab_t, const void *, hashval_t);
> >  extern void **htab_find_slot_with_hash (htab_t, const void *,
> >                        hashval_t, enum insert_option);
> > +extern void    htab_insert (htab_t, void *);
> >  extern void    htab_clear_slot (htab_t, void **);
> >  extern void    htab_remove_elt (htab_t, const void *);
> >  extern void    htab_remove_elt_with_hash (htab_t, const void *,
> > hashval_t);
> > @@ -183,6 +185,10 @@ extern size_t    htab_size (htab_t);
> >  extern size_t    htab_elements (htab_t);
> >  extern double    htab_collisions (htab_t);
> >
> > +extern void    htab_print_statistics (FILE *f, htab_t table,
> > +                       const char *name,
> > +                       const char *prefix);
> > +
> >  /* A hash function for pointers. */
> >  extern htab_hash htab_hash_pointer;
> >
> > diff --git a/libiberty/hashtab.c b/libiberty/hashtab.c
> > index 225e9e540a7..fb3152ec9c6 100644
> > --- a/libiberty/hashtab.c
> > +++ b/libiberty/hashtab.c
> > @@ -704,6 +704,15 @@ htab_find_slot (htab_t htab, const PTR element, enum
> > insert_option insert)
> >                     insert);
> >  }
> >
> > +/* Insert ELEMENT into HTAB. If the element exists, it is overwritten. */
> > +
> > +void
> > +htab_insert (htab_t htab, PTR element)
> > +{
> > +  void **slot = htab_find_slot (htab, element, INSERT);
> > +  *slot = element;
> > +}
> > +
> >  /* This function deletes an element with the given value from hash
> >     table (the hash is computed from the element). If there is no matching
> >     element in the hash table, this function does nothing. */
> > @@ -803,6 +812,20 @@ htab_collisions (htab_t htab)
> >    return (double) htab->collisions / (double) htab->searches;
> >  }
> >
> > +/* Print statistics about a hash table. */
> > +
> > +void
> > +htab_print_statistics (FILE *f, htab_t table, const char *name,
> > +               const char *prefix)
> > +{
> > +  fprintf (f, "%s hash statistics:\n", name);
> > +  fprintf (f, "%s%u searches\n", prefix, table->searches);
> > +  fprintf (f, "%s%lu elements\n", prefix, htab_elements (table));
> > +  fprintf (f, "%s%lu table size\n", prefix, htab_size (table));
> > +  fprintf (f, "%s%.2f collisions per search\n",
> > +       prefix, htab_collisions (table));
> > +}
> > +
> >  /* Hash P as a null-terminated string.
> >
> >     Copied from gcc/hashtable.c. Zack had the following to say with
> > respect
>
Re: [PATCH] bb-reorder: Remove a misfiring micro-optimization (PR96475)
Ping (added some Cc:s). Thanks in advance, Segher On Fri, Aug 07, 2020 at 09:51:04PM +, Segher Boessenkool wrote: > When the compgotos pass copies the tail of blocks ending in an indirect > jump, there is a micro-optimization to not copy the last one, since the > original block will then just be deleted. This does not work properly > if cleanup_cfg does not merge all pairs of blocks we expect it to. > > > v2: This also deletes the other use of single_pred_p, which has the same > problem in principle, I just never have triggered it so far. > > Tested on powerpc64-linux {-m32,-m64} like before. Is this okay for > trunk? > > > Segher > > > 2020-08-07 Segher Boessenkool > > PR rtl-optimization/96475 > * bb-reorder.c (maybe_duplicate_computed_goto): Remove single_pred_p > micro-optimization. > --- > gcc/bb-reorder.c | 10 +++--- > 1 file changed, 3 insertions(+), 7 deletions(-) > > diff --git a/gcc/bb-reorder.c b/gcc/bb-reorder.c > index c635010..76e56b5 100644 > --- a/gcc/bb-reorder.c > +++ b/gcc/bb-reorder.c > @@ -2680,9 +2680,6 @@ make_pass_reorder_blocks (gcc::context *ctxt) > static bool > maybe_duplicate_computed_goto (basic_block bb, int max_size) > { > - if (single_pred_p (bb)) > -return false; > - >/* Make sure that the block is small enough. */ >rtx_insn *insn; >FOR_BB_INSNS (bb, insn) > @@ -2700,10 +2697,9 @@ maybe_duplicate_computed_goto (basic_block bb, int > max_size) > { >basic_block pred = e->src; > > - /* Do not duplicate BB into PRED if that is the last predecessor, or if > - we cannot merge a copy of BB with PRED. */ > - if (single_pred_p (bb) > - || !single_succ_p (pred) > + /* Do not duplicate BB into PRED if we cannot merge a copy of BB > + with PRED. */ > + if (!single_succ_p (pred) > || e->flags & EDGE_COMPLEX > || pred->index < NUM_FIXED_BLOCKS > || (JUMP_P (BB_END (pred)) && !simplejump_p (BB_END (pred))) > -- > 1.8.3.1
Re: [PATCH] middle-end: Fix PR middle-end/85811: Introduce tree_expr_maybe_nan_p et al.
On Mon, Aug 17, 2020 at 10:31:08PM +, Joseph Myers wrote:
> On Sat, 15 Aug 2020, Segher Boessenkool wrote:
> > On Sat, Aug 15, 2020 at 12:10:42PM +0100, Roger Sayle wrote:
> > > I'll quote Joseph Myers (many thanks) who describes things clearly as:
> > > > (a) When both arguments are NaNs, the return value should be a qNaN,
> > > > but sometimes it is an sNaN if at least one argument is an sNaN.
> >
> > Where is this defined? I can't find it in C11, in 18661, and of course
> > it isn't what GCC does (it requires -fsignaling to even acknowledge the
> > existence of signaling NaNs :-) )
>
> The semantics of fmax and fmin are those of the maxNum and minNum
> operations in IEEE 754-2008 (that were removed in IEEE 754-2019); see the
> table of IEEE operation bindings that 18661-1 adds to Annex F.
>
> minNum(x, y) is the canonicalized number x if x < y, y if y < x, the
> canonicalized number if one operand is a number and the other a quiet
> NaN. Otherwise it is either x or y, canonicalized (this means results
> might differ among implementations). When either x or y is a
> signalingNaN, then the result is according to 6.2.
>
> maxNum(x, y) is the canonicalized number y if x < y, x if y < x, the
> canonicalized number if one operand is a number and the other a quiet
> NaN. Otherwise it is either x or y, canonicalized (this means results
> might differ among implementations). When either x or y is a
> signalingNaN, then the result is according to 6.2.
>
> where the relevant wording from 6.2 is
>
> Under default exception handling, any operation signaling an invalid
> operation exception and for which a floating-point result is to be
> delivered shall deliver a quiet NaN.
>
> Signaling NaNs shall be reserved operands that, under default exception
> handling, signal the invalid operation exception (see 7.2) for every
> general-computational and signaling-computational operation except for
> the conversions described in 5.12. For non-default treatment, see 8.
>
> (and maxNum and minNum are in 5.3 "Homogeneous general-computational
> operations").

Ah, so "When both arguments are NaNs, the return value should be a qNaN"
means the qNaN corresponding to either x or y. I see, thanks!


Segher
Re: [RFC PATCH v1 1/1] PPC64: Implement POWER Architecture Vector Function ABI.
On Mon, Aug 17, 2020 at 06:05:09PM -0400, David Edelsohn wrote: > The Power Vector ABI is available at > > https://github.com/power8-abi-doc/vector-function-abi > > It apparently did not attach correctly to the sourceware wiki or the > filename is different. Thanks! Segher
Re: [PATCH] rs6000: unaligned VSX in memcpy/memmove expansion
Hi!

On Fri, Aug 14, 2020 at 05:59:05PM -0500, Aaron Sawdey via Gcc-patches wrote:
> +static rtx
> +gen_lxvl_stxvl_move (rtx dest, rtx src, int length)
> +{
> +  gcc_assert (MEM_P (dest) ^ MEM_P (src));

Maybe just "!="?

> +  gcc_assert (GET_MODE (dest) == V16QImode && GET_MODE (src) == V16QImode);
> +  gcc_assert (length <= 16);
> +
> +  bool is_store = MEM_P (dest);
> +
> +  /* If the address form is not a simple register, make it so. */
> +  if (is_store)
> +    {
> +      dest = XEXP (dest, 0);
> +      if (!REG_P (dest))
> +        dest = force_reg (Pmode, dest);

So this changes what "dest" means.

Maybe it is clearer if you have a separate variable "addr"? You can
use that for dest and src alike, whichever is memory.

> +  if (is_store)
> +    return gen_stxvl (src, dest, len);
> +  else
> +    return  gen_lxvl (dest, src, len);

(doubled space -- well I guess you wanted to align the code)

> +  /* If we can't succeed in doing it in one pass, we can't do it in the
> +     might_overlap case. Bail out and return failure. */
> +  if (might_overlap && (num_reg+1) >= MAX_MOVE_REG
> +      && bytes > move_bytes)
> +    return 0;

The "num_reg+1" isn't obvious, and the comment doesn't explain it (we
usually write it as "num_reg + 1" fwiw, and the parens are superfluous).

Looks good, thanks! Okay for trunk with or without such changes.


Segher
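As a side note on the "^" versus "!=" remark: for operands that are already 0 or 1 the two spellings give the same answer, so the suggestion is about readability rather than behavior. A minimal sketch, assuming a hypothetical predicate is_mem in place of MEM_P (the integer "kind" encoding is invented for the example):

```c
#include <stdbool.h>

/* Hypothetical stand-in for MEM_P: KIND 1 means "memory operand".  */
static bool
is_mem (int kind)
{
  return kind == 1;
}

/* "Exactly one of the operands is memory", spelled with XOR.  */
bool
exactly_one_mem_xor (int dest_kind, int src_kind)
{
  return is_mem (dest_kind) ^ is_mem (src_kind);
}

/* The same condition spelled with "!=".  */
bool
exactly_one_mem_ne (int dest_kind, int src_kind)
{
  return is_mem (dest_kind) != is_mem (src_kind);
}
```

The equivalence only holds because both operands are normalized truth values; with arbitrary nonzero integers, "^" and "!=" can disagree.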
Re: [PATCH] middle-end: Fix PR middle-end/85811: Introduce tree_expr_maybe_nan_p et al.
On Sat, 15 Aug 2020, Segher Boessenkool wrote: > Hi! > > On Sat, Aug 15, 2020 at 12:10:42PM +0100, Roger Sayle wrote: > > I'll quote Joseph Myers (many thanks) who describes things clearly as: > > > (a) When both arguments are NaNs, the return value should be a qNaN, > > > but sometimes it is an sNaN if at least one argument is an sNaN. > > Where is this defined? I can't find it in C11, in 18661, and of course > it isn't what GCC does (it requires -fsignaling to even acknowledge the > existence of signaling NaNs :-) ) The semantics of fmax and fmin are those of the maxNum and minNum operations in IEEE 754-2008 (that were removed in IEEE 754-2019); see the table of IEEE operation bindings that 18661-1 adds to Annex F. minNum(x, y) is the canonicalized number x if x < y, y if y < x, the canonicalized number if one operand is a number and the other a quiet NaN. Otherwise it is either x or y, canonicalized (this means results might differ among implementations). When either x or y is a signalingNaN, then the result is according to 6.2. maxNum(x, y) is the canonicalized number y if x < y, x if y < x, the canonicalized number if one operand is a number and the other a quiet NaN. Otherwise it is either x or y, canonicalized (this means results might differ among implementations). When either x or y is a signalingNaN, then the result is according to 6.2. where the relevant wording from 6.2 is Under default exception handling, any operation signaling an invalid operation exception and for which a floating-point result is to be delivered shall deliver a quiet NaN. Signaling NaNs shall be reserved operands that, under default exception handling, signal the invalid operation exception (see 7.2) for every general-computational and signaling-computational operation except for the conversions described in 5.12. For non-default treatment, see 8. (and maxNum and minNum are in 5.3 "Homogeneous general-computational operations"). -- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH] c: Fix -Wunused-but-set-* warning with _Generic [PR96571]
On Fri, 14 Aug 2020, Jakub Jelinek via Gcc-patches wrote: > Hi! > > The following testcase shows various problems with -Wunused-but-set* > warnings and _Generic construct. I think it is best to treat the selector > and the ignored expressions as (potentially) read, because when they are > parsed, the vars in there are already marked as TREE_USED. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? OK. -- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH 2/5] C front end support to detect out-of-bounds accesses to array parameters
On Thu, 13 Aug 2020, Martin Sebor via Gcc-patches wrote:

> > * Maybe cdk_pointer is followed by cdk_attrs before cdk_id.  In this case
> >   the code won't return.
>
> I think I see the problem you're pointing out (I just don't see how
> to trigger it or test that it doesn't happen).  If the tweak in
> the attached update doesn't fix it a test case would be helpful.

I think you need a while loop there, not just an if, to account for the
case of multiple consecutive cdk_attrs.  At least the GNU attribute syntax

  direct-declarator:
    [...]
    ( gnu-attributes[opt] declarator )

should produce multiple consecutive cdk_attrs for each level of
parentheses with attributes inside.

> > * Maybe the code is correct to continue because we're in the case of an
> >   array of pointers (cdk_array follows).  But as I understand it, the intent
> >   is to set up an "arg spec" that describes only the (multidimensional)
> >   array that is the parameter itself - not any array pointed to.  And it
> >   looks to me like, in the case of an array of pointers to arrays, both sets
> >   of array bounds would end up in the spec constructed.
>
> Ideally, I'd like to check even pointers to arrays and so they should
> be recorded somewhere.  The middle end code doesn't do any checking
> of those yet for out-of-bounds accesses.  It wasn't a goal for
> the first iteration so I've tweaked the code to avoid recording them.

Could you expand the comment on get_parm_array_spec to specify exactly
what you think the function should be putting in the returned attribute,
in what order, in cases where there are array declarators (constant,
empty, [*] and VLA) intermixed with other kinds of declarators and the
type from the type specifiers may or may not be an array type itself?
That will provide a basis for subsequent rounds of review of whether the
function is actually behaving as expected.

As far as I can see, the logic

+	  if (TREE_CODE (nelts) == INTEGER_CST)
+	    {
+	      /* Skip all constant bounds except the most significant one.
+		 The interior ones are included in the array type.  */
+	      if (next && (next->kind == cdk_array || next->kind == cdk_pointer))
+		continue;

will skip constant bounds in an array that's the target of a pointer
declarator, but not any other kind of bounds.  Is that what you intend -
that all the other kinds of bounds in pointed-to arrays will be recorded
in this string?

> > Then, the code
> >
> > +      if (pd->kind == cdk_id)
> > +	{
> > +	  /* Extract the upper bound from a parameter of an array type.  */
> >
> > also seems misplaced.  If the type specifiers for the parameter are a
> > typedef for an array type, that array type should be processed *before*
> > the declarator to get the correct semantics (as if the bounds from those
> > type specifiers were given in the declarator), not at the end which gets
> > that type out of order with respect to array declarators.  (Processing
> > before the declarator also means clearing the results of that processing
> > if a pointer declarator is encountered at any point, because in that case
> > the array type in the type specifiers is irrelevant.)
>
> I'm not sure I follow you here.  Can you show me what you mean on
> a piece of code?  This test case (which IIUC does what you described)
> works as expected:
>
> $ cat q.c && gcc -O2 -S -Wall q.c
> typedef int A[7][9];
>
> void f (A[3][5]);

So this is equivalent to A[3][5][7][9].  The c_declarator structures have
the one for the [3] (the top-level bound) inside the one for the [5].
The [5] bound is skipped by the "Skip all constant bounds except the most
significant one." logic.  When the [3] bound is reached, the "break;" at
the end of that processing means the "Extract the upper bound from a
parameter of an array type." never gets executed.

Try replacing the [3] bound by a VLA bound.  As I read the code, it will
end up generating a spec string that records first the VLA, then the [7],
when it should be first the 9 (skipped), then the 7 (skipped), then the 5
(skipped), then the VLA.

Or if it's "void f (A *[variable][5]);", it will do the same thing (VLA,
then 7, although both the 7 and the 9 are part of the pointed-to type).

-- 
Joseph S. Myers
jos...@codesourcery.com
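As an illustration of the declarator nesting discussed above (not part of the patch or the review), the following C sketch shows what the quoted parameter type adjusts to. The function names `row_bytes` and `row_bytes_vla` are hypothetical; only the typedef and the parameter shapes come from the thread.

```c
#include <stddef.h>

typedef int A[7][9];

/* Declared as A[3][5], i.e. int[3][5][7][9].  The outermost bound (3) is
   adjusted away, so inside the function a has type int (*)[5][7][9].  */
size_t
row_bytes (A a[3][5])
{
  return sizeof *a;   /* size of one int[5][7][9] element */
}

/* The same adjustment when the outermost bound is a VLA bound: only the
   outermost bound differs, the inner 5, 7 and 9 are still part of the
   element type.  */
size_t
row_bytes_vla (int n, A a[n][5])
{
  return sizeof *a;
}
```

Both functions report the size of an int[5][7][9], which is why the order in which the 9, 7, 5 and outermost bounds are visited matters when building the spec string.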
Re: [RFC PATCH v1 1/1] PPC64: Implement POWER Architecture Vector Function ABI.
The Power Vector ABI is available at

https://github.com/power8-abi-doc/vector-function-abi

It apparently did not attach correctly to the sourceware wiki, or the
filename is different.

Thanks, David

On Mon, Aug 17, 2020 at 1:44 PM GT wrote:
>
> ‐‐‐ Original Message ‐‐‐
> On Thursday, August 13, 2020 6:49 PM, Segher Boessenkool wrote:
>
> > Hi!
> >
> > This is about the Power binding to some OpenMP API, right?  It has
> > nothing to do with "vector" or "ABI" -- we have vectors already, and
> > we have ABIs already, more than enough of each.
> >
> > It is very very VERY hard to review this without being told the proper
> > setting here.
> >
>
> What this is about:
>
> David Edelsohn wanted to have new library functions, one for each of
> these 6 single-precision functions: sinf, cosf, sincosf, expf, logf,
> powf; and these 6 double-precision functions: sin, cos, sincos, exp,
> log, and pow.
>
> For the single-precision functions, the corresponding new functions
> would compute 4 results simultaneously.  For the double-precision
> functions, the new ones would compute 2 results simultaneously.
>
> x86_64 has already done something very similar so I thought I would
> adapt as much of their documentation and implementation as I could
> for PPC64.
>
> Let's start with that.  Comments so far?
>
> Bert.
Re: [RFC PATCH v1 1/1] PPC64: Implement POWER Architecture Vector Function ABI.
On Mon, Aug 17, 2020 at 05:44:46PM +, GT wrote:
> > This is about the Power binding to some OpenMP API, right?  It has
> > nothing to do with "vector" or "ABI" -- we have vectors already, and
> > we have ABIs already, more than enough of each.
> >
> > It is very very VERY hard to review this without being told the proper
> > setting here.
>
> What this is about:
>
> David Edelsohn wanted to have new library functions, one for each of
> these 6 single-precision functions: sinf, cosf, sincosf, expf, logf,
> powf; and these 6 double-precision functions: sin, cos, sincos, exp,
> log, and pow.
>
> For the single-precision functions, the corresponding new functions
> would compute 4 results simultaneously.  For the double-precision
> functions, the new ones would compute 2 results simultaneously.
>
> x86_64 has already done something very similar so I thought I would
> adapt as much of their documentation and implementation as I could
> for PPC64.
>
> Let's start with that.  Comments so far?

That sounds like libmvec?  I still don't know what this is.

Segher
[PATCH] PR fortran/96613 - SIGFPE on min1() with -ffpe-trap=invalid switch
While looking at the reported issue, it appeared that the Fortran frontend
mishandled the conversion of functions of the MIN/MAX variety to inline
code.  At the same time, the simplification of expressions using a common
but non-standard GNU extension could give inconsistent results.

The patch below addresses both.

Regtested on x86_64-pc-linux-gnu.  OK for master?

Thanks,
Harald


PR fortran/96613 - Fix type/kind of temporaries evaluating MIN/MAX

When evaluating functions of the MIN/MAX variety inline, use a temporary
of appropriate type and kind, and convert to the result type at the end.
In the case of allowing for the GNU extensions to MIN/MAX, derive the
result kind consistently during simplification.

gcc/fortran/ChangeLog:

        * simplify.c (min_max_choose): The simplification result shall
        have the highest kind value of the arguments.
        * trans-intrinsic.c (gfc_conv_intrinsic_minmax): Choose type and
        kind of intermediate by looking at all arguments, not the result.

gcc/testsuite/ChangeLog:

        * gfortran.dg/min_max_kind.f90: New test.
        * gfortran.dg/pr96613.f90: New test.
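The selection rule in the ChangeLog ("choose type and kind of intermediate by looking at all arguments") can be modeled outside the compiler. The following is only a sketch in plain C under stated assumptions: `struct arg_type` and `widest_arg` are hypothetical names that stand in for GCC's tree types and are not part of the patch.

```c
#include <stddef.h>

/* Hypothetical stand-in for an argument's type; only its precision
   matters for this sketch.  */
struct arg_type
{
  unsigned precision;
};

/* Return the index of the argument with the greatest precision.  Ties keep
   the earliest argument.  Assumes nargs >= 1.  The temporary would then be
   created with that argument's type, and the final value converted to the
   result type once, at the end.  */
static size_t
widest_arg (const struct arg_type *args, size_t nargs)
{
  size_t best = 0;
  for (size_t i = 1; i < nargs; i++)
    if (args[i].precision > args[best].precision)
      best = i;
  return best;
}
```

Comparing in the widest argument type and converting only the final value avoids narrowing each argument to the result type before the comparison takes place.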
diff --git a/gcc/fortran/simplify.c b/gcc/fortran/simplify.c index eb8b2afeb29..074b50c2e68 100644 --- a/gcc/fortran/simplify.c +++ b/gcc/fortran/simplify.c @@ -4924,6 +4924,8 @@ min_max_choose (gfc_expr *arg, gfc_expr *extremum, int sign, bool back_val) switch (arg->ts.type) { case BT_INTEGER: + if (extremum->ts.kind < arg->ts.kind) + extremum->ts.kind = arg->ts.kind; ret = mpz_cmp (arg->value.integer, extremum->value.integer) * sign; if (ret > 0) @@ -4931,6 +4933,8 @@ min_max_choose (gfc_expr *arg, gfc_expr *extremum, int sign, bool back_val) break; case BT_REAL: + if (extremum->ts.kind < arg->ts.kind) + extremum->ts.kind = arg->ts.kind; if (mpfr_nan_p (extremum->value.real)) { ret = 1; diff --git a/gcc/fortran/trans-intrinsic.c b/gcc/fortran/trans-intrinsic.c index fd8809902b7..2483f016d8e 100644 --- a/gcc/fortran/trans-intrinsic.c +++ b/gcc/fortran/trans-intrinsic.c @@ -4073,6 +4073,7 @@ gfc_conv_intrinsic_minmax (gfc_se * se, gfc_expr * expr, enum tree_code op) tree val; tree *args; tree type; + tree argtype; gfc_actual_arglist *argexpr; unsigned int i, nargs; @@ -4082,16 +4083,24 @@ gfc_conv_intrinsic_minmax (gfc_se * se, gfc_expr * expr, enum tree_code op) gfc_conv_intrinsic_function_args (se, expr, args, nargs); type = gfc_typenode_for_spec (>ts); - argexpr = expr->value.function.actual; - if (TREE_TYPE (args[0]) != type) -args[0] = convert (type, args[0]); /* Only evaluate the argument once. */ if (!VAR_P (args[0]) && !TREE_CONSTANT (args[0])) args[0] = gfc_evaluate_now (args[0], >pre); - mvar = gfc_create_var (type, "M"); - gfc_add_modify (>pre, mvar, args[0]); + /* Determine suitable type of temporary, as a GNU extension allows + different argument kinds. 
*/ + argtype = TREE_TYPE (args[0]); + argexpr = expr->value.function.actual; + for (i = 1, argexpr = argexpr->next; i < nargs; i++, argexpr = argexpr->next) +{ + tree tmptype = TREE_TYPE (args[i]); + if (TYPE_PRECISION (tmptype) > TYPE_PRECISION (argtype)) + argtype = tmptype; +} + mvar = gfc_create_var (argtype, "M"); + gfc_add_modify (>pre, mvar, convert (argtype, args[0])); + argexpr = expr->value.function.actual; for (i = 1, argexpr = argexpr->next; i < nargs; i++, argexpr = argexpr->next) { tree cond = NULL_TREE; @@ -4119,8 +4128,8 @@ gfc_conv_intrinsic_minmax (gfc_se * se, gfc_expr * expr, enum tree_code op) Also, there is no consensus among other tested compilers. In short, it's a mess. So lets just do whatever is fastest. */ tree_code code = op == GT_EXPR ? MAX_EXPR : MIN_EXPR; - calc = fold_build2_loc (input_location, code, type, - convert (type, val), mvar); + calc = fold_build2_loc (input_location, code, argtype, + convert (argtype, val), mvar); tmp = build2_v (MODIFY_EXPR, mvar, calc); if (cond != NULL_TREE) @@ -4128,7 +4137,10 @@ gfc_conv_intrinsic_minmax (gfc_se * se, gfc_expr * expr, enum tree_code op) build_empty_stmt (input_location)); gfc_add_expr_to_block (>pre, tmp); } - se->expr = mvar; + if (TREE_CODE (type) == INTEGER_TYPE) +se->expr = fold_build1_loc (input_location, FIX_TRUNC_EXPR, type, mvar); + else +se->expr = convert (type, mvar); } diff --git a/gcc/testsuite/gfortran.dg/min_max_kind.f90 b/gcc/testsuite/gfortran.dg/min_max_kind.f90 new file mode 100644 index 000..b22691e1ffe --- /dev/null +++ b/gcc/testsuite/gfortran.dg/min_max_kind.f90 @@ -0,0 +1,15 @@ +! { dg-do run } +! { dg-options "-O2 -std=gnu" } +! Verify that the GNU extensions to MIN/MAX handle mixed kinds properly. + +program p + implicit none + integer(1), parameter :: i1 = 1 + integer(2), parameter :: i2 = 2 + real(4),parameter :: r4 = 4 + real(8),parameter :: r8 = 8 + if (kind
[committed] analyzer: fix ICE on NULL dereference [PR96644]
Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu. Pushed to master as r11-2725-gb00a83047574eb6f8d1e670ad439609125873506. gcc/analyzer/ChangeLog: PR analyzer/96644 * region-model-manager.cc (get_region_for_unexpected_tree_code): Handle ctxt being NULL. gcc/testsuite/ChangeLog: PR analyzer/96644 * gcc.dg/analyzer/pr96644.c: New test. --- gcc/analyzer/region-model-manager.cc| 4 ++-- gcc/testsuite/gcc.dg/analyzer/pr96644.c | 24 2 files changed, 26 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/analyzer/pr96644.c diff --git a/gcc/analyzer/region-model-manager.cc b/gcc/analyzer/region-model-manager.cc index 9c7b0602e88..4faeaa52a63 100644 --- a/gcc/analyzer/region-model-manager.cc +++ b/gcc/analyzer/region-model-manager.cc @@ -927,11 +927,11 @@ get_region_for_unexpected_tree_code (region_model_context *ctxt, tree t, const dump_location_t ) { - gcc_assert (ctxt); tree type = TYPE_P (t) ? t : TREE_TYPE (t); region *new_reg = new unknown_region (alloc_region_id (), _root_region, type); - ctxt->on_unexpected_tree_code (t, loc); + if (ctxt) +ctxt->on_unexpected_tree_code (t, loc); return new_reg; } diff --git a/gcc/testsuite/gcc.dg/analyzer/pr96644.c b/gcc/testsuite/gcc.dg/analyzer/pr96644.c new file mode 100644 index 000..3953c8d58c4 --- /dev/null +++ b/gcc/testsuite/gcc.dg/analyzer/pr96644.c @@ -0,0 +1,24 @@ +/* { dg-additional-options "-O1" } */ + +int oh[1]; +int *x3; + +int * +cm (char *m0) +{ + return oh; +} + +void +ek (void) +{ + for (;;) +{ + char *b2 = 0; + + if (*b2 != 0) /* { dg-warning "dereference of NULL" } */ + ++b2; + + x3 = cm (b2); +} +} -- 2.26.2
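The shape of this fix, replacing an assertion on a context pointer with a guarded call, can be sketched in plain C. This is an illustrative analogue only: the real code is C++ inside the analyzer, and `struct context`, `on_unexpected` and `handle_unexpected` are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical context object that records notifications.  */
struct context
{
  int unexpected_count;
};

static void
on_unexpected (struct context *ctxt)
{
  ctxt->unexpected_count++;
}

/* Always produces a placeholder result.  The context is optional, so it is
   notified only when the caller supplied one, instead of asserting that it
   is non-NULL.  */
static int
handle_unexpected (struct context *ctxt)
{
  if (ctxt)
    on_unexpected (ctxt);
  return -1;   /* placeholder standing in for the "unknown region" */
}
```

Callers without a context still get a usable result; they just skip the notification.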
[committed] analyzer: fix ICE due to NULL type [PR96639]
Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu. Pushed to master as r11-2724-g42c5ae5d7f0ad89b75d93c497fe44b6c66da7e76. gcc/analyzer/ChangeLog: PR analyzer/96639 * region.cc (region::get_subregions_for_binding): Check for "type" being NULL. gcc/testsuite/ChangeLog: PR analyzer/96639 * gcc.dg/analyzer/pr96639.c: New test. --- gcc/analyzer/region.cc | 2 +- gcc/testsuite/gcc.dg/analyzer/pr96639.c | 10 ++ 2 files changed, 11 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/gcc.dg/analyzer/pr96639.c diff --git a/gcc/analyzer/region.cc b/gcc/analyzer/region.cc index afe416b001b..eab1f2771cf 100644 --- a/gcc/analyzer/region.cc +++ b/gcc/analyzer/region.cc @@ -256,7 +256,7 @@ region::get_subregions_for_binding (region_model_manager *mgr, tree type, auto_vec *out) const { - if (get_type () == NULL_TREE) + if (get_type () == NULL_TREE || type == NULL_TREE) return; if (relative_bit_offset == 0 && types_compatible_p (get_type (), type)) diff --git a/gcc/testsuite/gcc.dg/analyzer/pr96639.c b/gcc/testsuite/gcc.dg/analyzer/pr96639.c new file mode 100644 index 000..02ca3f084a2 --- /dev/null +++ b/gcc/testsuite/gcc.dg/analyzer/pr96639.c @@ -0,0 +1,10 @@ +void *calloc (__SIZE_TYPE__, __SIZE_TYPE__); + +int +x7 (void) +{ + int **md = calloc (1, 1); + + return md[0][0]; /* { dg-warning "possibly-NULL" "unchecked deref" } */ + /* { dg-warning "leak of 'md'" "leak" { target *-*-* } .-1 } */ +} -- 2.26.2
[committed] analyzer: handle _CST in constant pool initializers [PR96642]
In r11-2708-g2867118ddda9b56d991c16022f7d3d634ed08313 I added support to the analyzer for initialization from var_decls in the global constant pool. However, that commit didn't support initialization from ADDR_EXPR of a STRING_CST leading to an ICE seen in data-model-1.c and pr94639.c on arm and powerpc64 at least, and as PR analyzer/96642 on x86_64 at least. This patch adds support for such initializers, fixing the ICE. Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu. Verified the fix to data-model-1.c and pr94639.c on arm, aarch64 and powerpc64. Pushed to master as r11-2723-g35c5f8fb432c8e68af68ab48c8d3107e7839775e. gcc/analyzer/ChangeLog: PR analyzer/96642 * store.cc (get_svalue_for_ctor_val): New. (binding_map::apply_ctor_to_region): Call it. gcc/testsuite/ChangeLog: PR analyzer/96642 * gcc.dg/analyzer/pr96642.c: New test. --- gcc/analyzer/store.cc | 21 ++--- gcc/testsuite/gcc.dg/analyzer/pr96642.c | 10 ++ 2 files changed, 28 insertions(+), 3 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/analyzer/pr96642.c diff --git a/gcc/analyzer/store.cc b/gcc/analyzer/store.cc index 232920019e0..5af86d09c2b 100644 --- a/gcc/analyzer/store.cc +++ b/gcc/analyzer/store.cc @@ -391,6 +391,22 @@ get_subregion_within_ctor (const region *parent_reg, tree index, } } +/* Get the svalue for VAL, a non-CONSTRUCTOR value within a CONSTRUCTOR. */ + +static const svalue * +get_svalue_for_ctor_val (tree val, region_model_manager *mgr) +{ + if (TREE_CODE (val) == ADDR_EXPR) +{ + gcc_assert (TREE_CODE (TREE_OPERAND (val, 0)) == STRING_CST); + const string_region *str_reg + = mgr->get_region_for_string (TREE_OPERAND (val, 0)); + return mgr->get_ptr_svalue (TREE_TYPE (val), str_reg); +} + gcc_assert (CONSTANT_CLASS_P (val)); + return mgr->get_or_create_constant_svalue (val); +} + /* Bind values from CONSTRUCTOR to this map, relative to PARENT_REG's relationship to its base region. 
*/ @@ -415,12 +431,11 @@ binding_map::apply_ctor_to_region (const region *parent_reg, tree ctor, apply_ctor_to_region (child_reg, val, mgr); else { - gcc_assert (CONSTANT_CLASS_P (val)); - const svalue *cst_sval = mgr->get_or_create_constant_svalue (val); + const svalue *sval = get_svalue_for_ctor_val (val, mgr); const binding_key *k = binding_key::make (mgr->get_store_manager (), child_reg, BK_direct); - put (k, cst_sval); + put (k, sval); } } } diff --git a/gcc/testsuite/gcc.dg/analyzer/pr96642.c b/gcc/testsuite/gcc.dg/analyzer/pr96642.c new file mode 100644 index 000..117aa0437ac --- /dev/null +++ b/gcc/testsuite/gcc.dg/analyzer/pr96642.c @@ -0,0 +1,10 @@ +void +ut (void) +{ + struct { +char *cc; + } sr[] = { +0, 0, 0, 0, 0, 0, 0, 0, 0, "", "", 0, "", 0, 0, "", +0, 0, "", 0, 0, "", 0, 0, "", 0, 0, "", 0, 0, 0, 0, 0, + }; +} -- 2.26.2
[committed] i386: Use parametrized pattern names some more.
Use parameterized pattern names to simplify calling of named patterns. 2020-08-15 Uroš Bizjak gcc/ChangeLog: * config/i386/i386-builtin.def (__builtin_ia32_bextri_u32) (__builtin_ia32_bextri_u64): Use CODE_FOR_nothing. * config/i386/i386.md (@tbm_bextri_): Implement as parametrized name pattern. (@rdrand): Ditto. (@rdseed): Ditto. * config/i386/i386-expand.c (ix86_expand_builtin) [case IX86_BUILTIN_BEXTRI32, case IX86_BUILTIN_BEXTRI64]: Update for parameterized name patterns. [case IX86_BUILTIN_RDRAND16_STEP, case IX86_BUILTIN_RDRAND32_STEP] [case IX86_BUILTIN_RDRAND64_STEP]: Ditto. [case IX86_BUILTIN_RDSEED16_STEP, case IX86_BUILTIN_RDSEED32_STEP] [case IX86_BUILTIN_RDSEED64_STEP]: Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/rdrand-1.c (dg-final): Update scan string. * gcc.target/i386/rdrand-2.c (dg-final): Ditto. * gcc.target/i386/rdrand-3.c (dg-final): Ditto. Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}. Uros. diff --git a/gcc/config/i386/i386-builtin.def b/gcc/config/i386/i386-builtin.def index 3b6c4a85579..fec5cef0b55 100644 --- a/gcc/config/i386/i386-builtin.def +++ b/gcc/config/i386/i386-builtin.def @@ -1270,8 +1270,8 @@ BDESC (OPTION_MASK_ISA_BMI, 0, CODE_FOR_tzcnt_si, "__builtin_ia32_tzcnt_u32", IX BDESC (OPTION_MASK_ISA_BMI | OPTION_MASK_ISA_64BIT, 0, CODE_FOR_tzcnt_di, "__builtin_ia32_tzcnt_u64", IX86_BUILTIN_TZCNT64, UNKNOWN, (int) UINT64_FTYPE_UINT64) /* TBM */ -BDESC (OPTION_MASK_ISA_TBM, 0, CODE_FOR_tbm_bextri_si, "__builtin_ia32_bextri_u32", IX86_BUILTIN_BEXTRI32, UNKNOWN, (int) UINT_FTYPE_UINT_UINT) -BDESC (OPTION_MASK_ISA_TBM | OPTION_MASK_ISA_64BIT, 0, CODE_FOR_tbm_bextri_di, "__builtin_ia32_bextri_u64", IX86_BUILTIN_BEXTRI64, UNKNOWN, (int) UINT64_FTYPE_UINT64_UINT64) +BDESC (OPTION_MASK_ISA_TBM, 0, CODE_FOR_nothing, "__builtin_ia32_bextri_u32", IX86_BUILTIN_BEXTRI32, UNKNOWN, (int) UINT_FTYPE_UINT_UINT) +BDESC (OPTION_MASK_ISA_TBM | OPTION_MASK_ISA_64BIT, 0, CODE_FOR_nothing, "__builtin_ia32_bextri_u64", 
IX86_BUILTIN_BEXTRI64, UNKNOWN, (int) UINT64_FTYPE_UINT64_UINT64) /* F16C */ BDESC (OPTION_MASK_ISA_F16C, 0, CODE_FOR_vcvtph2ps, "__builtin_ia32_vcvtph2ps", IX86_BUILTIN_CVTPH2PS, UNKNOWN, (int) V4SF_FTYPE_V8HI) diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c index 9de6f5029b9..d8368bfd4a9 100644 --- a/gcc/config/i386/i386-expand.c +++ b/gcc/config/i386/i386-expand.c @@ -11709,24 +11709,26 @@ ix86_expand_builtin (tree exp, rtx target, rtx subtarget, case IX86_BUILTIN_BEXTRI32: case IX86_BUILTIN_BEXTRI64: + mode = (fcode == IX86_BUILTIN_BEXTRI32 ? SImode : DImode); + arg0 = CALL_EXPR_ARG (exp, 0); arg1 = CALL_EXPR_ARG (exp, 1); op0 = expand_normal (arg0); op1 = expand_normal (arg1); - icode = (fcode == IX86_BUILTIN_BEXTRI32 - ? CODE_FOR_tbm_bextri_si - : CODE_FOR_tbm_bextri_di); + if (!CONST_INT_P (op1)) -{ - error ("last argument must be an immediate"); - return const0_rtx; -} + { + error ("last argument must be an immediate"); + return const0_rtx; + } else -{ - unsigned char length = (INTVAL (op1) >> 8) & 0xFF; - unsigned char lsb_index = INTVAL (op1) & 0xFF; - op1 = GEN_INT (length); - op2 = GEN_INT (lsb_index); + { + unsigned char lsb_index = UINTVAL (op1); + unsigned char length = UINTVAL (op1) >> 8; + + unsigned char bitsize = GET_MODE_BITSIZE (mode); + + icode = code_for_tbm_bextri (mode); mode1 = insn_data[icode].operand[1].mode; if (!insn_data[icode].operand[1].predicate (op0, mode1)) @@ -11737,25 +11739,32 @@ ix86_expand_builtin (tree exp, rtx target, rtx subtarget, || !register_operand (target, mode0)) target = gen_reg_rtx (mode0); - pat = GEN_FCN (icode) (target, op0, op1, op2); - if (pat) -emit_insn (pat); - return target; -} + if (length == 0 || lsb_index >= bitsize) + { + emit_move_insn (target, const0_rtx); + return target; + } + + if (length + lsb_index > bitsize) + length = bitsize - lsb_index; + + op1 = GEN_INT (length); + op2 = GEN_INT (lsb_index); + + emit_insn (GEN_FCN (icode) (target, op0, op1, op2)); + return 
target; + } case IX86_BUILTIN_RDRAND16_STEP: - icode = CODE_FOR_rdrandhi_1; - mode0 = HImode; + mode = HImode; goto rdrand_step; case IX86_BUILTIN_RDRAND32_STEP: - icode = CODE_FOR_rdrandsi_1; - mode0 = SImode; + mode = SImode; goto rdrand_step; case IX86_BUILTIN_RDRAND64_STEP: - icode = CODE_FOR_rdranddi_1; - mode0 = DImode; + mode = DImode; rdrand_step: arg0 = CALL_EXPR_ARG (exp, 0); @@ -11766,16
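The immediate handling added to the BEXTRI expander above (split the immediate into start bit and length, return zero for an empty or out-of-range field, clamp the length to the operand width) can be checked against a small software model. The sketch below covers only the 32-bit case; `bextri32_model` is a hypothetical name, and the final shift-and-mask is an assumption about what the emitted bit-field extract computes.

```c
#include <stdint.h>

/* Software model of the 32-bit case.  The immediate packs the start bit in
   bits 0-7 and the field length in bits 8-15, as in the expander code.  */
static uint32_t
bextri32_model (uint32_t src, unsigned imm)
{
  const unsigned bitsize = 32;
  unsigned lsb_index = imm & 0xff;
  unsigned length = (imm >> 8) & 0xff;

  /* Empty field, or a field starting past the top bit: result is zero.  */
  if (length == 0 || lsb_index >= bitsize)
    return 0;

  /* Clamp a field that would run past the top bit.  */
  if (length + lsb_index > bitsize)
    length = bitsize - lsb_index;

  /* length == 32 is only possible with lsb_index == 0; avoid shifting a
     32-bit value by 32.  */
  uint32_t mask = length == 32 ? UINT32_MAX : (UINT32_C (1) << length) - 1;
  return (src >> lsb_index) & mask;
}
```

For example, an immediate of `(8 << 8) | 4` selects eight bits starting at bit 4.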
[PATCH][Arm] Auto-vectorization for MVE: vsub
Hi all,

This patch enables MVE vsub instructions for auto-vectorization.
It adds RTL templates for MVE vsub instructions using 'minus' instead of
an unspec expression, to make the instructions recognizable for
vectorization.
The MVE target is added in the sub3 optab.  The sub3 optab is modified to
use a mode iterator that selects the available modes for each target.
MVE vector modes are enabled in arm_preferred_simd_mode in arm.c to
support vectorization.

This patch also fixes the 'vreinterpretq_*.c' MVE intrinsic tests.  The
tests generate wrong instruction counts because of unexpected icf
optimization.  This bug is exposed by the MVE vector modes enabled in this
patch, so it is corrected here to avoid test failures.

MVE instructions are documented here:
https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/helium-intrinsics

The patch is regtested for arm-none-eabi and bootstrapped for
arm-none-linux-gnueabihf.

Is it OK for trunk please?

Thanks
Dennis

gcc/ChangeLog:

2020-08-10  Dennis Zhang

        * config/arm/arm.c (arm_preferred_simd_mode): Enable MVE vector modes.
        * config/arm/arm.h (TARGET_NEON_IWMMXT): New macro.
        (TARGET_NEON_IWMMXT_MVE, TARGET_NEON_IWMMXT_MVE_FP): Likewise.
        (TARGET_NEON_MVE_HFP): Likewise.
        * config/arm/iterators.md (VSEL): New mode iterator to select modes
        for corresponding targets.
        * config/arm/mve.md (mve_vsubq): New entry for vsub instruction
        using expression 'minus'.
        (mve_vsubq_f): Use minus instead of VSUBQ_F unspec.
        * config/arm/neon.md (sub3): Removed here.  Integrated in the sub3
        in vec-common.md
        * config/arm/vec-common.md (sub3): Enable MVE target.
        Use VSEL to select available modes.
        Exclude TARGET_NEON_FP16INST from TARGET_NEON statement.
        Integrate TARGET_NEON_FP16INST which is originally in neon.md.

gcc/testsuite/ChangeLog:

2020-08-10  Dennis Zhang

        * gcc.target/arm/mve/intrinsics/vreinterpretq_f16.c: Use additional
        option -fno-ipa-icf and change the instruction count from 8 to 16.
* gcc.target/arm/mve/intrinsics/vreinterpretq_f32.c: Likewise. * gcc.target/arm/mve/intrinsics/vreinterpretq_s16.c: Likewise. * gcc.target/arm/mve/intrinsics/vreinterpretq_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vreinterpretq_s64.c: Likewise. * gcc.target/arm/mve/intrinsics/vreinterpretq_s8.c: Likewise. * gcc.target/arm/mve/intrinsics/vreinterpretq_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vreinterpretq_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vreinterpretq_u64.c: Likewise. * gcc.target/arm/mve/intrinsics/vreinterpretq_u8.c: Likewise. * gcc.target/arm/mve/mve.exp: Include tests in subdir 'vect'. * gcc.target/arm/mve/vect/vect_sub_0.c: New test. * gcc.target/arm/mve/vect/vect_sub_1.c: New test. diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h index 30e1d6dc994..eb8c9599357 100644 --- a/gcc/config/arm/arm.h +++ b/gcc/config/arm/arm.h @@ -334,6 +334,14 @@ emission of floating point pcs attributes. */ isa_bit_mve_float) \ && !TARGET_GENERAL_REGS_ONLY) +#define TARGET_NEON_IWMMXT (TARGET_NEON || TARGET_REALLY_IWMMXT) +#define TARGET_NEON_IWMMXT_MVE (TARGET_NEON || TARGET_REALLY_IWMMXT \ + || TARGET_HAVE_MVE) +#define TARGET_NEON_IWMMXT_MVE_FP ((TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT) \ + || TARGET_NEON || TARGET_REALLY_IWMMXT) +#define TARGET_NEON_MVE_HFP ((TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT) \ + || TARGET_NEON_FP16INST) + /* MVE have few common instructions as VFP, like VLDM alias VPOP, VLDR, VSTM alia VPUSH, VSTR and VMOV, VMSR and VMRS. In the same manner it updates few registers such as FPCAR, FPCCR, FPDSCR, FPSCR, MVFR0, MVFR1 and MVFR2. 
All diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 6b7ca829f1c..dcbcbbeced0 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -28913,6 +28913,30 @@ arm_preferred_simd_mode (scalar_mode mode) default:; } + if (TARGET_HAVE_MVE) +switch (mode) + { + case QImode: + return V16QImode; + case HImode: + return V8HImode; + case SImode: + return V4SImode; + + default:; + } + + if (TARGET_HAVE_MVE_FLOAT) +switch (mode) + { + case HFmode: + return V8HFmode; + case SFmode: + return V4SFmode; + + default:; + } + return word_mode; } diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md index 0bc9eba0722..52c3a8a4355 100644 --- a/gcc/config/arm/iterators.md +++ b/gcc/config/arm/iterators.md @@ -80,6 +80,19 @@ ;; Integer and float modes supported by Neon and IWMMXT but not MVE. (define_mode_iterator VNINOTM1 [V2SI V4HI V8QI V2SF]) +;; Select modes for NEON, IWMMXT and MVE. +(define_mode_iterator VSEL [(V16QI "TARGET_NEON_IWMMXT_MVE") + (V8HI "TARGET_NEON_IWMMXT_MVE") +
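For orientation, this is the kind of scalar loop an auto-vectorizer would look at for a vector subtract. It is a generic sketch, not the contents of the new vect_sub tests (which are not shown here), and the function name `vect_sub_s32` is hypothetical. Whether a given GCC build turns it into MVE vsub instructions depends on target options and optimization level that are not spelled out in this message.

```c
#include <stdint.h>

/* Element-wise subtraction over plain arrays.  With SImode elements and a
   128-bit preferred SIMD mode (V4SImode in the patch), a vectorizer can
   process four elements per iteration.  restrict tells the compiler the
   arrays do not overlap.  */
void
vect_sub_s32 (int32_t *restrict dest, const int32_t *restrict a,
              const int32_t *restrict b, int n)
{
  for (int i = 0; i < n; i++)
    dest[i] = a[i] - b[i];
}
```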
RE: [PATCH] rs6000, restrict bfloat convert intrinsic to Power 10. Fix BU_P10V macro definitions.
Bill:

On Mon, 2020-08-17 at 13:09 -0500, Bill Schmidt wrote:
> >
> > There are three prototypes __builtin_cfuged, __builtin_pdepd,
> > __builtin_pextd defined in the document.
> >
> > The corresponding builtin definitions in GCC are:
> >
> > __builtin_altivec_cfuged, __builtin_altivec_pdepd,
> > __builtin_altivec_pextd
> >
> > which does not match the defined prototype in the document.
> >
> These are scalar instructions, not vector, so they should not be using
> any flavor of "V".  They should be using BU_P10_MISC_n, where n is the
> number of arguments.

Yes, it looks like those are the scalar versions.  I got them mixed up
with the vector definitions

  vector unsigned long long int vec_pdep ()
  vector unsigned long long int vec_pext ()
  vector unsigned long long int vec_cfuge ()

I was thinking the __builtin_name() was also referring to the vector
versions.  So, given that there are separate definitions, it does appear
that the names are all consistent with the documentation.

Thanks Bill.

Carl
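To make the scalar/vector distinction concrete, here is a portable software model of a scalar 64-bit parallel bit deposit. It is a sketch under the assumption that the pdepd operation follows the usual parallel-bits-deposit definition; `pdep64_model` is a hypothetical name, it is not the GCC builtin, and it makes no claim about the builtin's argument order.

```c
#include <stdint.h>

/* Deposit the low-order bits of src, one by one, into the positions of the
   set bits of mask, from the least significant set bit upwards.  Operates
   on a single 64-bit scalar, not on a vector.  */
static uint64_t
pdep64_model (uint64_t src, uint64_t mask)
{
  uint64_t result = 0;
  for (uint64_t bit = 1; mask != 0; bit <<= 1)
    {
      uint64_t lowest = mask & (~mask + 1);   /* lowest set bit of mask */
      if (src & bit)
        result |= lowest;
      mask &= mask - 1;                       /* clear that bit */
    }
  return result;
}
```

With src = 0x5 and mask = 0x1A, the three low bits of src (1, 0, 1) land on mask bit positions 1, 3 and 4.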
Re: [EXTERNAL] Re: [PATCH] rs6000, restrict bfloat convert intrinsic to Power 10. Fix BU_P10V macro definitions.
On 8/17/20 12:13 PM, Carl Love wrote: Segher, Bill, Peter: On Fri, 2020-08-14 at 19:42 -0500, Segher Boessenkool wrote: Do the names agree with the (future) documentation now? Did not double check on the documentation. Someone should... Looking at the box document "Proposed function Prototypes for P10". There are a number of builtins of the form "name()" which get expanded to __builtin_altivec_name or __builtin_vsx_name. But there does not appear to be any additional defined prototype for the __builtin_altivec_name or __builtin_vsx_name in the document so we don't need to worry about these prototypes as far as I can see. There are three prototypes __builtin_cfuged, __builtin_pdepd, __builtin_pextd defined in the document. The corresponding builtin definitions in GCC are: __builtin_altivec_cfuged, __builtin_altivec_pdepd, __builtin_altivec_pextd which does not match the defined prototype in the document. These are scalar instructions, not vector, so they should not be using any flavor of "V". They should be using BU_P10_MISC_n, where n is the number of arguments. Bill I don't see any defines in gcc/config/rs6000 that would map __builtin_name to __builtin_altivec_name so these three appear to be unsupported as far as I can see. I assume adding #define __builtin_name __builtin_altivec_name to gcc/config/rs6000/altivec.h would be the easiest way to define the prototypes from the document. I can add the defines if you think that is the correct fix. Please let me know. The MMA related builtins at the end of the document appear to have the proper define BU_MMA_# macro expansions to generate the defined prototype names. Looking at the builtin definitions in box for RFC 2608, RFC 2609, RFC 2629 the builtins are all of the form name() so I don't see any issues with the internal GCC name changes for the builtins in these documents. Carl
Re: [PATCH] x86_64: PR rtl-optimization/92180: class_likely_spilled vs. cant_combine_insn.
Hi!

On Mon, Aug 17, 2020 at 01:06:10PM +0200, Uros Bizjak wrote:
> On Mon, Aug 17, 2020 at 12:42 PM Roger Sayle wrote:
> > (insn 14 7 15 2 (set (reg/i:SI 0 ax)
> >         (subreg:SI (reg:DI 84) 0)) "pr92180.c":5:1 67 {*movsi_internal}
> >      (expr_list:REG_DEAD (reg:DI 84)
> >         (nil)))
> >
> > Normally, combine/simplify-rtx would notice that insns 6 and 7
> > (which update highpart bits) are unnecessary as the final insn 14
> > only requires the lowpart bits.  The complication is that insn 14
> > sets a hard register in targetm.class_likely_spilled_p which
> > prevents combine from performing its simplifications, and removing
> > the redundant instructions.

> I think that fwprop interferes with recent change to combine, where
> combine won't propagate hard registers anymore.

It won't propagate move insns from a hard non-fixed register to a pseudo
into other insns, yeah.  But that does not apply here?

> So, following that
> change, there is no point for fwprop to create instructions that
> combine won't be able to process.  Alternatively, perhaps fwprop should
> be prevented from propagating likely_spilled hard registers?
>
> Let's ask Segher for his opinion.

I have no opinion about class_likely_spilled_p; it is just a gross target
hack as far as I can see.  (I wonder how much of that is still useful
with LRA?)

Maybe combine could move return values in a hard reg through a pseudo?
So pretty much the same as make_more_copies, but the other way around.
You'll get the copy to a pseudo (which is in SImode here) as a separate
insn that combines with the previous insns fine, and RA will give the
pseudo the same hard register in all cases where that is beneficial.


Segher
Re: [RFC PATCH v1 1/1] PPC64: Implement POWER Architecture Vector Function ABI.
‐‐‐ Original Message ‐‐‐
On Thursday, August 13, 2020 6:49 PM, Segher Boessenkool wrote:

> Hi!
>
> This is about the Power binding to some OpenMP API, right?  It has
> nothing to do with "vector" or "ABI" -- we have vectors already, and
> we have ABIs already, more than enough of each.
>
> It is very very VERY hard to review this without being told the proper
> setting here.
>

What this is about:

David Edelsohn wanted to have new library functions, one for each of
these 6 single-precision functions: sinf, cosf, sincosf, expf, logf,
powf; and these 6 double-precision functions: sin, cos, sincos, exp,
log, and pow.

For the single-precision functions, the corresponding new functions would
compute 4 results simultaneously.  For the double-precision functions,
the new ones would compute 2 results simultaneously.

x86_64 has already done something very similar so I thought I would adapt
as much of their documentation and implementation as I could for PPC64.

Let's start with that.  Comments so far?

Bert.
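As a rough picture of "one call, four results", the sketch below applies a scalar routine to each lane of a 4-float vector. It assumes the GCC/Clang `vector_size` extension; the names `v4sf`, `map4` and `twice` are hypothetical, and `twice` is only a stand-in for a scalar routine such as sinf. A real vector math library would use a hand-vectorized kernel and an ABI-defined symbol name rather than a lane loop.

```c
#include <stddef.h>

/* Hypothetical 4-lane single-precision vector type (GCC/Clang extension).  */
typedef float v4sf __attribute__ ((vector_size (16)));

/* Stand-in scalar routine; a real use would pass something like sinf.  */
static float
twice (float x)
{
  return 2.0f * x;
}

/* Compute four results in one call by applying scalar_fn to each lane.  */
static v4sf
map4 (float (*scalar_fn) (float), v4sf x)
{
  v4sf r = x;
  for (size_t i = 0; i < 4; i++)
    r[i] = scalar_fn (x[i]);
  return r;
}
```

The double-precision counterpart would use a 2-lane vector of double in the same way.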
Re: [PATCH][Hashtable 5/6] Remove H1/H2 template parameters
Hi

Here is the new proposal.  As we can't remove template parameters I simply
restore those that I tried to pass differently, _H2 and _ExtractKey, so in
the end I only remove the usage of _Hash, which I renamed to _Unused.
Maybe I can keep the doc about it in hashtable.h and just add a remark
saying that it is now unused.

For _RangeHash, formerly _H2, and _ExtractKey I just stop maintaining any
storage.  When we need those I always use a value-initialized instance.  I
kind of prefer the value initialization syntax because you can't confuse
it with a function call, but let me know if it is wrong and I should use
_ExtractKey() or _RangeHash().  I also add some static assertions about
those types regarding their noexcept qualifications.

I also included in this patch the few changes left from [Hashtable 0/6],
which are mostly _M_insert_unique_node and _M_insert_multi_node signature
cleanup, as the key part can be extracted from the inserted node.

Tested under Linux x86_64, OK to commit?

François

On 06/08/20 11:27 am, Jonathan Wakely wrote:
> On 06/08/20 08:35 +0200, François Dumont wrote:
> > On 17/07/20 1:35 pm, Jonathan Wakely wrote:
> > > I really like the general idea of getting rid of some of the
> > > complexity and not supporting infinite customization.  But we can do
> > > that without changing mangled names of the _Hashtable
> > > specializations.
> >
> > I didn't think we needed to keep ABI compatibility for extensions.
>
> These aren't extensions though, they're part of std::unordered_map
> etc.
>
> Just because something like _Vector_base is an internal type rather
> than something defined in the standard doesn't mean we can just change
> its ABI, because that would change the ABI of std::vector.  It's the
> same here.  Changing _Hashtable affects all users of
> std::unordered_map etc.
diff --git a/libstdc++-v3/include/bits/hashtable.h b/libstdc++-v3/include/bits/hashtable.h index 7b772a475e3..1ba32a3c7e2 100644 --- a/libstdc++-v3/include/bits/hashtable.h +++ b/libstdc++-v3/include/bits/hashtable.h @@ -69,21 +69,16 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION * and returns a bool-like value that is true if the two objects * are considered equal. * - * @tparam _H1 The hash function. A unary function object with + * @tparam _Hash The hash function. A unary function object with * argument type _Key and result type size_t. Return values should * be distributed over the entire range [0, numeric_limits:::max()]. * - * @tparam _H2 The range-hashing function (in the terminology of + * @tparam _RangeHash The range-hashing function (in the terminology of * Tavori and Dreizin). A binary function object whose argument * types and result type are all size_t. Given arguments r and N, * the return value is in the range [0, N). * - * @tparam _Hash The ranged hash function (Tavori and Dreizin). A - * binary function whose argument types are _Key and size_t and - * whose result type is size_t. Given arguments k and N, the - * return value is in the range [0, N). Default: hash(k, N) = - * h2(h1(k), N). If _Hash is anything other than the default, _H1 - * and _H2 are ignored. + * @tparam _Unused Not used. * * @tparam _RehashPolicy Policy class with three members, all of * which govern the bucket count. _M_next_bkt(n) returns a bucket @@ -91,9 +86,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION * bucket count appropriate for an element count of n. * _M_need_rehash(n_bkt, n_elt, n_ins) determines whether, if the * current bucket count is n_bkt and the current element count is - * n_elt, we need to increase the bucket count. If so, returns - * make_pair(true, n), where n is the new bucket count. If not, - * returns make_pair(false, ) + * n_elt, we need to increase the bucket count for n_ins insertions. + * If so, returns make_pair(true, n), where n is the new bucket count. 
If + * not, returns make_pair(false, ) * * @tparam _Traits Compile-time class with three boolean * std::integral_constant members: __cache_hash_code, __constant_iterators, @@ -168,19 +163,23 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION */ template class _Hashtable : public __detail::_Hashtable_base<_Key, _Value, _ExtractKey, _Equal, - _H1, _H2, _Hash, _Traits>, + _Hash, _RangeHash, _Unused, _Traits>, public __detail::_Map_base<_Key, _Value, _Alloc, _ExtractKey, _Equal, - _H1, _H2, _Hash, _RehashPolicy, _Traits>, + _Hash, _RangeHash, _Unused, + _RehashPolicy, _Traits>, public __detail::_Insert<_Key, _Value, _Alloc, _ExtractKey, _Equal, - _H1, _H2, _Hash, _RehashPolicy, _Traits>, + _Hash, _RangeHash, _Unused, + _RehashPolicy, _Traits>, public __detail::_Rehash_base<_Key, _Value, _Alloc, _ExtractKey, _Equal, -_H1, _H2, _Hash, _RehashPolicy, _Traits>, +_Hash, _RangeHash, _Unused, +_RehashPolicy, _Traits>, public
RE: [PATCH] rs6000, restrict bfloat convert intrinsic to Power 10. Fix BU_P10V macro definitions.
Segher, Bill, Peter: On Fri, 2020-08-14 at 19:42 -0500, Segher Boessenkool wrote: > > > Do the names agree with the (future) documentation now? > > > > Did not double check on the documentation. > > Someone should... Looking at the box document "Proposed function Prototypes for P10". There are a number of builtins of the form "name()" which get expanded to __builtin_altivec_name or __builtin_vsx_name. But there does not appear to be any additional defined prototype for the __builtin_altivec_name or __builtin_vsx_name in the document, so we don't need to worry about these prototypes as far as I can see. There are three prototypes __builtin_cfuged, __builtin_pdepd, __builtin_pextd defined in the document. The corresponding builtin definitions in GCC are: __builtin_altivec_cfuged, __builtin_altivec_pdepd, __builtin_altivec_pextd, which do not match the prototypes defined in the document. I don't see any defines in gcc/config/rs6000 that would map __builtin_name to __builtin_altivec_name, so these three appear to be unsupported as far as I can see. I assume adding #define __builtin_name __builtin_altivec_name to gcc/config/rs6000/altivec.h would be the easiest way to define the prototypes from the document. I can add the defines if you think that is the correct fix. Please let me know. The MMA related builtins at the end of the document appear to have the proper define BU_MMA_# macro expansions to generate the defined prototype names. Looking at the builtin definitions in box for RFC 2608, RFC 2609, RFC 2629, the builtins are all of the form name(), so I don't see any issues with the internal GCC name changes for the builtins in these documents. Carl
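The `#define`-based mapping suggested above can be illustrated without a Power toolchain. In this sketch `documented_pdepd` and `internal_altivec_pdepd` are hypothetical stand-ins for a documented name and an internally named builtin; the body is a plain software bit deposit written for illustration and is not claimed to match the exact semantics of the hardware instruction.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an internally named builtin.
// Deposits the low-order bits of src into the positions set in mask.
inline std::uint64_t internal_altivec_pdepd(std::uint64_t src, std::uint64_t mask) {
    std::uint64_t result = 0;
    for (std::uint64_t bit = 1; mask != 0; bit <<= 1) {
        std::uint64_t lowest = mask & (~mask + 1);  // lowest set bit of mask
        if (src & bit)
            result |= lowest;
        mask &= mask - 1;                           // clear that bit
    }
    return result;
}

// The proposed style of fix: map the documented spelling onto the
// internal one with a macro, as a header could do.
#define documented_pdepd internal_altivec_pdepd
```

A macro alias like this only renames the call; argument checking and overload resolution remain whatever the underlying builtin provides.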
Re: [Patch, fortran] PRs 96100 and 96101 - Problems with string lengths of array constructors
Hi Paul, > The fix for PR96101 is rather trivial and is the last chunk of the patch. > Finding the fix for PR96100 took a silly amount of time but it now looks > rather obvious. Trying to evaluate the string length by calling > gfc_conv_expr_descriptor, when this function is already failing to find it > is kind of doomed to failure :-) Therefore, gfc_conv_expr is used with > tse.descriptor_only set. This has the effect of ignoring trailing array > references and making use of gfc_conv_component_ref's being able to extract > the hidden string length for deferred length components. Finally, the > string length of the first element in the array constructor is set if this > is a deferred length component. The patch seems to be effective. However, I don't understand why, when it is a parenthesis op, you deduce that this has to be the string length? The explanation for the second fix left me completely lost. > Regtests OK on FC31/x86_64 - OK for master? Tests ok with no regression. Therefore ok by me. Regards, Andre > > Paul > > This patch fixes PR96100 and PR96101 by making some minor changes to > the evaluation of string lengths for gfc_conv_expr_descriptor. > > 2020-08-13 Paul Thomas > > gcc/fortran > PR fortran/96100 > PR fortran/96101 > * trans-array.c (get_array_charlen): Tidy up the evaluation of > the string length for array constructors. Avoid trailing array > references. Ensure string lengths of deferred length components > are set. For parentheses operator apply string length to both > the primary expression and the enclosed expression. > > gcc/testsuite/ > PR fortran/96100 > PR fortran/96101 > * gfortran.dg/char_length_23.f90: New test. -- Andre Vehreschild * Email: vehre ad gmx dot de
Re: PING: Fwd: [PATCH 2/2] Decouple adjust_range_from_scev from vr_values and value_range_equiv.
On 8/17/20 6:04 AM, Aldy Hernandez wrote: On 8/14/20 7:16 PM, Andrew MacLeod wrote: On 8/14/20 12:05 PM, Aldy Hernandez wrote: I made some minor changes to the function comments. gcc/ChangeLog: * vr-values.c (check_for_binary_op_overflow): Change type of store to range_query. (vr_values::adjust_range_with_scev): Abstract most of the code... (range_of_var_in_loop): ...here. Remove value_range_equiv uses. (simplify_using_ranges::simplify_using_ranges): Change type of store to range_query. * vr-values.h (class range_query): New. (class simplify_using_ranges): Use range_query. (class vr_values): Add OVERRIDE to get_value_range. (range_of_var_in_loop): New. --- gcc/vr-values.c | 150 ++-- gcc/vr-values.h | 23 ++-- 2 files changed, 88 insertions(+), 85 deletions(-) diff --git a/gcc/vr-values.c b/gcc/vr-values.c index 9002d87c14b..5b7bae3bfb7 100644 --- a/gcc/vr-values.c +++ b/gcc/vr-values.c @@ -1004,7 +1004,7 @@ vr_values::extract_range_from_comparison (value_range_equiv *vr, overflow. */ static bool -check_for_binary_op_overflow (vr_values *store, +check_for_binary_op_overflow (range_query *store, enum tree_code subcode, tree type, tree op0, tree op1, bool *ovf) { @@ -1737,22 +1737,18 @@ compare_range_with_value (enum tree_code comp, const value_range *vr, gcc_unreachable (); } -/* Given a range VR, a LOOP and a variable VAR, determine whether it - would be profitable to adjust VR using scalar evolution information - for VAR. If so, update VR with the new limits. */ + +/* Given a VAR in STMT within LOOP, determine the range of the + variable and store it in VR. If no range can be determined, the + resulting range will be set to VARYING. 
*/ void -vr_values::adjust_range_with_scev (value_range_equiv *vr, class loop *loop, - gimple *stmt, tree var) +range_of_var_in_loop (irange *vr, range_query *query, + class loop *loop, gimple *stmt, tree var) { - tree init, step, chrec, tmin, tmax, min, max, type, tem; + tree init, step, chrec, tmin, tmax, min, max, type; enum ev_direction dir; - /* TODO. Don't adjust anti-ranges. An anti-range may provide - better opportunities than a regular range, but I'm not sure. */ - if (vr->kind () == VR_ANTI_RANGE) - return; - IIUC, you've switched to using the new API, so the bounds calls will basically turn an ANTI range into a varying, so [lbound,ubound] will be [MIN, MAX]? So it's effectively a no-op, except we will not punt on getting a range when VR is an anti range anymore, so that's goodness... Yes. chrec = instantiate_parameters (loop, analyze_scalar_evolution (loop, var)); /* Like in PR19590, scev can return a constant function. */ @@ -1763,16 +1759,17 @@ vr_values::adjust_range_with_scev (value_range_equiv *vr, class loop *loop, } if (TREE_CODE (chrec) != POLYNOMIAL_CHREC) - return; + { + vr->set_varying (TREE_TYPE (var)); + return; + } I'm seeing a lot of this pattern... Maybe we should set vr to varying upon entry to the function as the default return value, then we can just return like it did before in all those places. Better yet, since this routine doesn't "update" anymore and simply returns a range, maybe it could instead return a boolean if it finds a range rather than the current behaviour... then those simply become + return false; We won't have to intersect at the caller if we don't need to, and it's useful information at other points to know a range was calculated without having to see if varying_p () came back from the call. I.e., the usage pattern would then be value_range_equiv r; if (range_of_var_in_loop (&r, this, loop, stmt, var)) vr->intersect (&r); This is the pattern we use throughout the ranger. Done. 
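The calling convention discussed in the review above (return false when no range is found, so the caller intersects only on success) can be shown with a toy interval type. `Interval`, `range_of_counter` and `refine` are hypothetical names used for illustration; they do not correspond to GCC's range classes, and overflow is ignored.

```cpp
#include <algorithm>
#include <cassert>

// Toy closed interval [lo, hi]; stands in for a value range.
struct Interval {
    long lo;
    long hi;
    void intersect(const Interval& other) {
        lo = std::max(lo, other.lo);
        hi = std::min(hi, other.hi);
    }
};

// Computes the range of a counter taking the values init, init + step, ...
// over `iterations` trips. Returns false, leaving *out untouched, when the
// trip count is unknown (modelled here as iterations <= 0).
bool range_of_counter(Interval* out, long init, long step, long iterations) {
    if (iterations <= 0)
        return false;
    long last = init + step * (iterations - 1);
    *out = Interval{std::min(init, last), std::max(init, last)};
    return true;
}

// Caller-side pattern: only intersect when a range was actually found.
void refine(Interval* vr, long init, long step, long iterations) {
    Interval r{};
    if (range_of_counter(&r, init, step, iterations))
        vr->intersect(r);
}
```

Compared with filling in a "varying" result on every early exit, the boolean return lets each failure path be a bare `return false;` and spares the caller a pointless intersection.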
init = initial_condition_in_loop_num (chrec, loop->num); - tem = op_with_constant_singleton_value_range (init); - if (tem) - init = tem; + if (TREE_CODE (init) == SSA_NAME) + query->get_value_range (init, stmt)->singleton_p (); step = evolution_part_in_loop_num (chrec, loop->num); - tem = op_with_constant_singleton_value_range (step); - if (tem) - step = tem; + if (TREE_CODE (step) == SSA_NAME) + query->get_value_range (step, stmt)->singleton_p (); If I read this correctly, we get values for init and step... and if they are SSA_NAMES, then we query ranges, otherwise use what we got back.. So that would seem to be the same behaviour as before then.. Perhaps a comment is warranted? I had to read it a few times :-) Indeed. I am trying to do too much in one line. I've added a comment. /* If STEP is symbolic, we can't know whether INIT will be the minimum or maximum value in the range. Also, unless INIT is @@ -1781,7 +1778,10 @@ vr_values::adjust_range_with_scev (value_range_equiv *vr, class loop *loop, if
Re: [PATCH] rs6000: unaligned VSX in memcpy/memmove expansion
On Fri, 2020-08-14 at 17:59 -0500, Aaron Sawdey via Gcc-patches wrote: Hi, > This patch adds a few new instructions to inline expansion of > memcpy/memmove. Generation of all these is controlled by s/is/are/ ? > the option -mblock-ops-unaligned-vsx which is set on by default if the > target has TARGET_EFFICIENT_UNALIGNED_VSX. > * unaligned vsx load/store (V2DImode) > * unaligned vsx pair load/store (POImode) which is also controlled > by -mblock-ops-vector-pair in case it is not wanted at some point. > The default for this option is also for it to be on if the target has > TARGET_EFFICIENT_UNALIGNED_VSX. 'this option' meaning the -mblock-ops-vector-pair option? > * unaligned vsx lxvl/stxvl but generally only to do the remainder > of a copy/move we started with some vsx loads/stores, and also prefer > to use lb/lh/lw/ld if the remainder is 1/2/4/8 bytes. > > Testing of this is actually accomplished by gcc.dg/memcmp-1.c which does > two memcpy() for each memcmp(). If the memcpy() calls don't do the right > thing then the memcmp() will fail unexpectedly. > > Regstrap passed on ppc64le power9 and the memcmp-1.c test passes on > power10 simulator, ok for trunk? > > Thanks! > Aaron > > gcc/ChangeLog: > > * config/rs6000/rs6000-string.c (gen_lxvl_stxvl_move): > Helper function. > (expand_block_move): Add lxvl/stxvl, vector pair, and > unaligned VSX. > * config/rs6000/rs6000.c (rs6000_option_override_internal): > Default value for -mblock-ops-vector-pair. > * config/rs6000/rs6000.opt: Add -mblock-ops-vector-pair. 
> --- > gcc/config/rs6000/rs6000-string.c | 105 ++ > gcc/config/rs6000/rs6000.c| 14 +++- > gcc/config/rs6000/rs6000.opt | 4 ++ > 3 files changed, 107 insertions(+), 16 deletions(-) > > diff --git a/gcc/config/rs6000/rs6000-string.c > b/gcc/config/rs6000/rs6000-string.c > index c35d93180ca..ce6db2ba14d 100644 > --- a/gcc/config/rs6000/rs6000-string.c > +++ b/gcc/config/rs6000/rs6000-string.c > @@ -2708,6 +2708,36 @@ gen_lvx_v4si_move (rtx dest, rtx src) > return gen_altivec_lvx_v4si_internal (dest, src); > } > > +static rtx > +gen_lxvl_stxvl_move (rtx dest, rtx src, int length) > +{ > + gcc_assert (MEM_P (dest) ^ MEM_P (src)); > + gcc_assert (GET_MODE (dest) == V16QImode && GET_MODE (src) == V16QImode); > + gcc_assert (length <= 16); > + > + bool is_store = MEM_P (dest); > + > + /* If the address form is not a simple register, make it so. */ Possibly just cosmetic - Would ' /* Force dest and src to be simple registers if necessary. */' make more sense? > + if (is_store) > +{ > + dest = XEXP (dest, 0); > + if (!REG_P (dest)) > + dest = force_reg (Pmode, dest); > +} > + else > +{ > + src = XEXP (src, 0); > + if (!REG_P (src)) > + src = force_reg (Pmode, src); > +} > + > + rtx len = force_reg (DImode, gen_int_mode (length, DImode)); > + if (is_store) > +return gen_stxvl (src, dest, len); > + else > +return gen_lxvl (dest, src, len); > +} > + > /* Expand a block move operation, and return 1 if successful. Return 0 > if we should let the compiler generate normal code. 
> ok > @@ -2750,18 +2780,57 @@ expand_block_move (rtx operands[], bool might_overlap) >if (bytes > rs6000_block_move_inline_limit) > return 0; > > + int orig_bytes = bytes; >for (offset = 0; bytes > 0; offset += move_bytes, bytes -= move_bytes) > { >union { > - rtx (*movmemsi) (rtx, rtx, rtx, rtx); > rtx (*mov) (rtx, rtx); > + rtx (*movlen) (rtx, rtx, int); >} gen_func; >machine_mode mode = BLKmode; >rtx src, dest; > - > - /* Altivec first, since it will be faster than a string move > - when it applies, and usually not significantly larger. */ > - if (TARGET_ALTIVEC && bytes >= 16 && align >= 128) > + bool move_with_length = false; > + > + /* Use POImode for paired vsx load/store. Use V2DI for single > + unaligned vsx load/store, for consistency with what other > + expansions (compare) already do, and so we can use lxvd2x on > + p8. Order is VSX pair unaligned, VSX unaligned, Altivec, vsx > + with length < 16 (if allowed), then smaller gpr > + load/store. */ s/vsx/VSX/ s/smaller// ? > + > + if (TARGET_MMA && TARGET_BLOCK_OPS_UNALIGNED_VSX > + && TARGET_BLOCK_OPS_VECTOR_PAIR > + && bytes >= 32 > + && (align >= 256 || !STRICT_ALIGNMENT)) > + { > + move_bytes = 32; > + mode = POImode; > + gen_func.mov = gen_movpoi; > + } > + else if (TARGET_POWERPC64 && TARGET_BLOCK_OPS_UNALIGNED_VSX > +&& VECTOR_MEM_VSX_P (V2DImode) > +&& bytes >= 16 && (align >= 128 || !STRICT_ALIGNMENT)) > + { > + move_bytes = 16; > + mode = V2DImode; > + gen_func.mov = gen_vsx_movv2di_64bit; > + } > + else if
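The ordering described in the quoted comment (widest move first, then narrower moves for the remainder) can be sketched as a greedy chunking loop. This is a simplified model: `plan_block_move` and its `allow_pair` and `allow_vsx` flags are hypothetical, alignment checks and the length-controlled lxvl/stxvl path are left out, and the real expander decides using target macros rather than booleans.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the sequence of chunk sizes (in bytes) used to copy `bytes` bytes,
// always picking the widest move that is enabled and still fits.
std::vector<std::size_t> plan_block_move(std::size_t bytes,
                                         bool allow_pair, bool allow_vsx) {
    std::vector<std::size_t> chunks;
    while (bytes > 0) {
        std::size_t move_bytes;
        if (allow_pair && bytes >= 32)
            move_bytes = 32;          // models a paired vector load/store
        else if (allow_vsx && bytes >= 16)
            move_bytes = 16;          // models a single vector load/store
        else if (bytes >= 8)
            move_bytes = 8;           // remaining cases model GPR moves
        else if (bytes >= 4)
            move_bytes = 4;
        else if (bytes >= 2)
            move_bytes = 2;
        else
            move_bytes = 1;
        chunks.push_back(move_bytes);
        bytes -= move_bytes;
    }
    return chunks;
}
```

With both flags set, a 55-byte copy is planned as 32, 16, 4, 2, 1; with both cleared it falls back to six 8-byte moves followed by 4, 2, 1.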
Re: [PATCH v2] C-SKY: Support -mfloat-abi=hard.
Hi Jojo, On 8/17/20 7:09 PM, Jojo R wrote: diff --git a/gcc/config/csky/csky.c b/gcc/config/csky/csky.c index 7ba3ed3..b71291a 100644 --- a/gcc/config/csky/csky.c +++ b/gcc/config/csky/csky.c @@ -328,6 +328,16 @@ csky_cpu_cpp_builtins (cpp_reader *pfile) { builtin_define ("__csky_hard_float__"); builtin_define ("__CSKY_HARD_FLOAT__"); + if (TARGET_HARD_FLOAT_ABI) +{ + builtin_define ("__csky_hard_float_abi__"); + builtin_define ("__CSKY_HARD_FLOAT_ABI__"); +} + if (TARGET_SINGLE_FPU) +{ + builtin_define ("__csky_hard_float_fpu_sf__"); + builtin_define ("__CSKY_HARD_FLOAT_FPU_SF__"); +} } There is one more thing you should pay attention to: if the number of spaces at the beginning of a line reaches 8, you should use a tab instead of 8 spaces. Thanks, Xianmiao
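User code could test the predefined macros from the patch above roughly as follows. The macro names are taken from the diff; the function `csky_float_config` and the strings it returns are made up for illustration, and on any compiler that does not define these macros the fallback branch is taken.

```cpp
#include <cassert>
#include <string>

// Reports which of the C-SKY floating-point macros the compiler defined.
std::string csky_float_config() {
#if defined(__CSKY_HARD_FLOAT_ABI__) && defined(__CSKY_HARD_FLOAT_FPU_SF__)
    return "hard-float ABI, single-precision FPU";
#elif defined(__CSKY_HARD_FLOAT_ABI__)
    return "hard-float ABI";
#elif defined(__CSKY_HARD_FLOAT__)
    return "hard float, ABI macro not defined";
#else
    return "no C-SKY hard-float macros defined";
#endif
}
```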
Re: [PATCH] improve memcmp and memchr constant folding (PR 78257)
On Sat, 2020-08-15 at 16:19 +0200, Christophe Lyon wrote: > Hi Martin, > > > On Sat, 15 Aug 2020 at 01:14, Martin Sebor via Gcc-patches > wrote: > > On 8/13/20 11:44 AM, Martin Sebor wrote: > > > On 8/13/20 10:21 AM, Jeff Law wrote: > > > > On Fri, 2020-07-31 at 17:55 -0600, Martin Sebor via Gcc-patches wrote: > > > > > The folders for these functions (and some others) call c_getsr > > > > > which relies on string_constant to return the representation of > > > > > constant strings. Because the function doesn't handle constants > > > > > of other types, including aggregates, memcmp or memchr calls > > > > > involving those are not folded when they could be. > > > > > > > > > > The attached patch extends the algorithm used by string_constant > > > > > to also handle constant aggregates involving elements or members > > > > > of the same types as native_encode_expr. (The change restores > > > > > the empty initializer optimization inadvertently disabled in > > > > > the fix for pr96058.) > > > > > > > > > > To avoid accidentally misusing either string_constant or c_getstr > > > > > with non-strings I have introduced a pair of new functions to get > > > > > the representation of those: byte_representation and getbyterep. > > > > > > > > > > Tested on x86_64-linux. > > > > > > > > > > Martin > > > > > PR tree-optimization/78257 - missing memcmp optimization with > > > > > constant arrays > > > > > > > > > > gcc/ChangeLog: > > > > > > > > > > PR middle-end/78257 > > > > > * builtins.c (expand_builtin_memory_copy_args): Rename called > > > > > function. > > > > > (expand_builtin_stpcpy_1): Remove argument from call. > > > > > (expand_builtin_memcmp): Rename called function. > > > > > (inline_expand_builtin_bytecmp): Same. > > > > > * expr.c (convert_to_bytes): New function. > > > > > (constant_byte_string): New function (formerly string_constant). > > > > > (string_constant): Call constant_byte_string. > > > > > (byte_representation): New function. 
> > > > > * expr.h (byte_representation): Declare. > > > > > * fold-const-call.c (fold_const_call): Rename called function. > > > > > * fold-const.c (c_getstr): Remove an argument. > > > > > (getbyterep): Define a new function. > > > > > * fold-const.h (c_getstr): Remove an argument. > > > > > (getbyterep): Declare a new function. > > > > > * gimple-fold.c (gimple_fold_builtin_memory_op): Rename callee. > > > > > (gimple_fold_builtin_string_compare): Same. > > > > > (gimple_fold_builtin_memchr): Same. > > > > > > > > > > gcc/testsuite/ChangeLog: > > > > > > > > > > PR middle-end/78257 > > > > > * gcc.dg/memchr.c: New test. > > > > > * gcc.dg/memcmp-2.c: New test. > > > > > * gcc.dg/memcmp-3.c: New test. > > > > > * gcc.dg/memcmp-4.c: New test. > > > > > > > > > > diff --git a/gcc/expr.c b/gcc/expr.c > > > > > index a150fa0d3b5..a124df54655 100644 > > > > > --- a/gcc/expr.c > > > > > +++ b/gcc/expr.c > > > > > @@ -11594,15 +11594,103 @@ is_aligning_offset (const_tree offset, > > > > > const_tree exp) > > > > > /* This must now be the address of EXP. */ > > > > > return TREE_CODE (offset) == ADDR_EXPR && TREE_OPERAND (offset, > > > > > 0) == exp; > > > > > } > > > > > - > > > > > -/* Return the tree node if an ARG corresponds to a string constant > > > > > or zero > > > > > - if it doesn't. If we return nonzero, set *PTR_OFFSET to the > > > > > (possibly > > > > > - non-constant) offset in bytes within the string that ARG is > > > > > accessing. > > > > > - If MEM_SIZE is non-zero the storage size of the memory is > > > > > returned. > > > > > - If DECL is non-zero the constant declaration is returned if > > > > > available. */ > > > > > -tree > > > > > -string_constant (tree arg, tree *ptr_offset, tree *mem_size, tree > > > > > *decl) > > > > > +/* If EXPR is a constant initializer (either an expression or > > > > > CONSTRUCTOR), > > > > > + attempt to obtain its native representation as an array of > > > > > nonzero BYTES. 
> > > > > + Return true on success and false on failure (the latter without > > > > > modifying > > > > > + BYTES). */ > > > > > + > > > > > +static bool > > > > > +convert_to_bytes (tree type, tree expr, vec *bytes) > > > > > +{ > > > > > + if (TREE_CODE (expr) == CONSTRUCTOR) > > > > > +{ > > > > > + /* Set to the size of the CONSTRUCTOR elements. */ > > > > > + unsigned HOST_WIDE_INT ctor_size = bytes->length (); > > > > > + > > > > > + if (TREE_CODE (type) == ARRAY_TYPE) > > > > > +{ > > > > > + tree val, idx; > > > > > + tree eltype = TREE_TYPE (type); > > > > > + unsigned HOST_WIDE_INT elsize = > > > > > +tree_to_uhwi (TYPE_SIZE_UNIT (eltype)); > > > > > + unsigned HOST_WIDE_INT i, last_idx = HOST_WIDE_INT_M1U; > > > > > + FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (expr), i, idx, val) > > > > > +{ > > > > > + /* Append zeros for
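For context, the source pattern this patch targets looks like the following: `memcmp` and `memchr` applied to constant aggregates rather than string literals. Whether a particular compiler version folds these calls at compile time is not asserted here; the sketch only shows the pattern and its run-time result. The array and function names are illustrative.

```cpp
#include <cassert>
#include <cstring>

// Constant aggregates that are not character strings.
static const int a[] = {1, 2, 3, 4};
static const int b[] = {1, 2, 3, 4};
static const char bytes[] = {1, 2, 3, 4};

// Both arguments are constant arrays, so the result is knowable
// at compile time in principle.
bool arrays_equal() {
    return std::memcmp(a, b, sizeof a) == 0;
}

// Searching a constant byte array for a constant value.
bool has_three() {
    return std::memchr(bytes, 3, sizeof bytes) != nullptr;
}
```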
Re: [PATCH] Add support for putting jump table into relocation read-only section
Hi! On Mon, Aug 17, 2020 at 10:28:31AM +0800, HAO CHEN GUI wrote: > >For the reloc, my understanding is the jump table needs to be > >relocated if it's a non-relative jump table and PIC flag is set at the > >same time. Yes, I did say the *existing* code seems sub-optimal, too :-) > >According to the slice of code in stmt.c, the non-relative jump table > >is created with PIC flag set when CASE_VECTOR_PC_RELATIVE is false, > >flag_pic is true and targetm.asm_out.generate_pic_addr_diff_vec is > >false. So I set the reloc to > > > >reloc = (! CASE_VECTOR_PC_RELATIVE && flag_pic && > > ! targetm.asm_out.generate_pic_addr_diff_vec ()) ? 1 > >: 0; > > > >The function_rodata_section is not only for jump tables. It's not > >relro in other cases. I am not sure if it's suitable to put selecting > >relro section in it. Of course, I can create a separate function for > >section selection of jump table and send its output to > >function_rodata_section. .data.rel.ro is just another kind of .rodata, one that *can* be relocated. So when we use it, fPIC or not doesn't matter. Also, we can just use the existing rodata functions for generating .data.rel.ro, and it should even simplify all the code. > -@deftypefn {Target Hook} {section *} TARGET_ASM_FUNCTION_RODATA_SECTION > (tree @var{decl}) > -Return the readonly data section associated with > +@deftypefn {Target Hook} {section *} TARGET_ASM_FUNCTION_RODATA_SECTION > (tree @var{decl}, bool @var{section_reloc}) > +Return the readonly or reloc readonly data section associated with Should this take the 2-bit int "reloc" field like other functions, instead of this bool? Segher
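The condition quoted in the message above can be restated as a small predicate. In this sketch the three booleans are hypothetical parameters standing in for `CASE_VECTOR_PC_RELATIVE`, `flag_pic` and the result of `targetm.asm_out.generate_pic_addr_diff_vec ()`; `jump_table_section` is only a guess at how such a value might select a section name and is not GCC's actual section-selection code.

```cpp
#include <cassert>
#include <string>

// 1 when the jump table holds absolute addresses in PIC code and therefore
// needs run-time relocation, 0 otherwise (mirrors the quoted expression).
int jump_table_reloc(bool case_vector_pc_relative, bool flag_pic,
                     bool generate_pic_addr_diff_vec) {
    return (!case_vector_pc_relative && flag_pic && !generate_pic_addr_diff_vec)
               ? 1 : 0;
}

// Assumed mapping from the reloc value to a section name.
std::string jump_table_section(int reloc) {
    return reloc ? ".data.rel.ro" : ".rodata";
}
```

The reply in the thread takes a broader view (a relocatable read-only section is usable whether or not PIC is in effect), so this predicate reflects only the quoted proposal.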
[committed] libstdc++: Remove inheritance from elements in std::tuple
This fixes a number of std::tuple bugs by no longer making use of the empty base-class optimization. By using the C++20 [[no_unique_address]] attribute we can always store the element as a data member, while still compressing the layout of tuples containing empty types. Since we no longer use inheritance we could also apply the compression optimization for final types and for tuples of tuples, but doing so would be an ABI break. Using [[no_unique_address]] more liberally for the unstable std::__8 configuration is left for a later date. There may be reasons not to apply the attribute unconditionally, e.g. see the discussion about guaranteed elision in PR 94062. libstdc++-v3/ChangeLog: PR libstdc++/55713 PR libstdc++/71096 PR libstdc++/93147 * include/std/tuple [__has_cpp_attribute(no_unique_address)] (_Head_base): New definition of the partial specialization, using [[no_unique_address]] instead of inheritance. * testsuite/libstdc++-prettyprinters/48362.cc: Adjust expected output. * testsuite/20_util/tuple/comparison_operators/93147.cc: New test. * testsuite/20_util/tuple/creation_functions/55713.cc: New test. * testsuite/20_util/tuple/element_access/71096.cc: New test. Tested powerpc64le-linux. Committed to trunk. commit 91e6226f880b048275a7ceedef716e159c7cefd9 Author: Jonathan Wakely Date: Fri Aug 7 17:13:56 2020 libstdc++: Remove inheritance from elements in std::tuple This fixes a number of std::tuple bugs by no longer making use of the empty base-class optimization. By using the C++20 [[no_unique_address]] attribute we can always store the element as a data member, while still compressing the layout of tuples containing empty types. Since we no longer use inheritance we could also apply the compression optimization for final types and for tuples of tuples, but doing so would be an ABI break. Using [[no_unique_address]] more liberally for the unstable std::__8 configuration is left for a later date. 
There may be reasons not to apply the attribute unconditionally, e.g. see the discussion about guaranteed elision in PR 94062. libstdc++-v3/ChangeLog: PR libstdc++/55713 PR libstdc++/71096 PR libstdc++/93147 * include/std/tuple [__has_cpp_attribute(no_unique_address)] (_Head_base): New definition of the partial specialization, using [[no_unique_address]] instead of inheritance. * testsuite/libstdc++-prettyprinters/48362.cc: Adjust expected output. * testsuite/20_util/tuple/comparison_operators/93147.cc: New test. * testsuite/20_util/tuple/creation_functions/55713.cc: New test. * testsuite/20_util/tuple/element_access/71096.cc: New test. diff --git a/libstdc++-v3/include/std/tuple b/libstdc++-v3/include/std/tuple index 0dc11768a90..d4a35f0fe7f 100644 --- a/libstdc++-v3/include/std/tuple +++ b/libstdc++-v3/include/std/tuple @@ -73,6 +73,58 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION bool = __empty_not_final<_Head>::value> struct _Head_base; +#if __has_cpp_attribute(no_unique_address) + template +struct _Head_base<_Idx, _Head, true> +{ + constexpr _Head_base() + : _M_head_impl() { } + + constexpr _Head_base(const _Head& __h) + : _M_head_impl(__h) { } + + constexpr _Head_base(const _Head_base&) = default; + constexpr _Head_base(_Head_base&&) = default; + + template + constexpr _Head_base(_UHead&& __h) + : _M_head_impl(std::forward<_UHead>(__h)) { } + + _GLIBCXX20_CONSTEXPR + _Head_base(allocator_arg_t, __uses_alloc0) + : _M_head_impl() { } + + template + _Head_base(allocator_arg_t, __uses_alloc1<_Alloc> __a) + : _M_head_impl(allocator_arg, *__a._M_a) { } + + template + _Head_base(allocator_arg_t, __uses_alloc2<_Alloc> __a) + : _M_head_impl(*__a._M_a) { } + + template + _GLIBCXX20_CONSTEXPR + _Head_base(__uses_alloc0, _UHead&& __uhead) + : _M_head_impl(std::forward<_UHead>(__uhead)) { } + + template + _Head_base(__uses_alloc1<_Alloc> __a, _UHead&& __uhead) + : _M_head_impl(allocator_arg, *__a._M_a, std::forward<_UHead>(__uhead)) + { } + + template + 
_Head_base(__uses_alloc2<_Alloc> __a, _UHead&& __uhead) + : _M_head_impl(std::forward<_UHead>(__uhead), *__a._M_a) { } + + static constexpr _Head& + _M_head(_Head_base& __b) noexcept { return __b._M_head_impl; } + + static constexpr const _Head& + _M_head(const _Head_base& __b) noexcept { return __b._M_head_impl; } + + [[no_unique_address]] _Head _M_head_impl; +}; +#else template struct _Head_base<_Idx, _Head, true> : public _Head @@ -119,6 +171,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION static constexpr const _Head&
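The layout effect this commit relies on can be seen in a small standalone example. The struct names below are illustrative and are not libstdc++ types; exact sizes depend on the ABI, so the only property checked is that the attribute never makes the layout larger than a plain member does.

```cpp
#include <cassert>

struct Empty {};

// Element stored as a base class (the empty-base-optimization approach).
struct ViaBase : Empty {
    int value;
};

// Element stored as a plain data member: the empty object still occupies
// its own address, so padding is usually added.
struct PlainMember {
    Empty head;
    int value;
};

// Element stored as a data member marked [[no_unique_address]]: the
// compiler may overlap it with other members.
struct AttrMember {
    [[no_unique_address]] Empty head;
    int value;
};

static_assert(sizeof(AttrMember) <= sizeof(PlainMember),
              "the attribute should never make the layout larger");
```

Because `AttrMember` does not inherit from `Empty`, it avoids the problems that come with deriving from an element type while still allowing the compressed layout.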
[PATCH 4/X] libsanitizer: options: Add hwasan flags and argument parsing
These flags can't be used at the same time as any of the other sanitizers. We add an equivalent flag to -static-libasan in -static-libhwasan to ensure static linking. The -fsanitize=kernel-hwaddress option is for compiling targeting the kernel. This flag has defaults that allow compiling KASAN with tags as it is currently implemented. These defaults are that we do not sanitize variables on the stack and always recover from a detected bug. Stack tagging in the kernel is a future aim, I don't know of any reason it would not work, but this has not yet been tested. We introduce a backend hook `targetm.memtag.can_tag_addresses` that indicates to the mid-end whether a target has a feature like AArch64 TBI where the top byte of an address is ignored. Without this feature hwasan sanitization is not done. gcc/ChangeLog: * common.opt (flag_sanitize_recover): Default for kernel hwaddress. (static-libhwasan): New cli option. * config/aarch64/aarch64.c (aarch64_can_tag_addresses): New. (TARGET_MEMTAG_CAN_TAG_ADDRESSES): New. * config/gnu-user.h (LIBHWASAN_EARLY_SPEC): hwasan equivalent of asan command line flags. * cppbuiltin.c (define_builtin_macros_for_compilation_flags): Add hwasan equivalent of __SANITIZE_ADDRESS__. * doc/invoke.texi: Document hwasan command line flags. * doc/tm.texi: Document new hook. * doc/tm.texi.in: Document new hook. * flag-types.h (enum sanitize_code): New sanitizer values. * gcc.c (STATIC_LIBHWASAN_LIBS): New macro. (LIBHWASAN_SPEC): New macro. (LIBHWASAN_EARLY_SPEC): New macro. (SANITIZER_EARLY_SPEC): Update to include hwasan. (SANITIZER_SPEC): Update to include hwasan. (sanitize_spec_function): Use hwasan options. * opts.c (finish_options): Describe conflicts between address sanitizers. (sanitizer_opts): Introduce new sanitizer flags. (common_handle_option): Add defaults for kernel sanitizer. 
* params.opt (hwasan--instrument-stack): New (hwasan-random-frame-tag): New (hwasan-instrument-allocas): New (hwasan-instrument-reads): New (hwasan-instrument-writes): New (hwasan-instrument-mem-intrinsics): New * target.def (HOOK_PREFIX): Add new hook. (can_tag_addresses): Add new hook under memtag prefix. * targhooks.c (default_memtag_can_tag_addresses): New. * targhooks.h (default_memtag_can_tag_addresses): New decl. * toplev.c (process_options): Ensure hwasan only on TBI architectures. gcc/c-family/ChangeLog: * c-attribs.c (handle_no_sanitize_hwaddress_attribute): New attribute. ### Attachment also inlined for ease of reply### diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c index 372148315389db6671dfd943fd1a68670fcb1cbc..f8bf165aa48b5709c26f4e8245e5ab929b44fca6 100644 --- a/gcc/c-family/c-attribs.c +++ b/gcc/c-family/c-attribs.c @@ -54,6 +54,8 @@ static tree handle_cold_attribute (tree *, tree, tree, int, bool *); static tree handle_no_sanitize_attribute (tree *, tree, tree, int, bool *); static tree handle_no_sanitize_address_attribute (tree *, tree, tree, int, bool *); +static tree handle_no_sanitize_hwaddress_attribute (tree *, tree, tree, + int, bool *); static tree handle_no_sanitize_thread_attribute (tree *, tree, tree, int, bool *); static tree handle_no_address_safety_analysis_attribute (tree *, tree, tree, @@ -412,6 +414,8 @@ const struct attribute_spec c_common_attribute_table[] = handle_no_sanitize_attribute, NULL }, { "no_sanitize_address",0, 0, true, false, false, false, handle_no_sanitize_address_attribute, NULL }, + { "no_sanitize_hwaddress",0, 0, true, false, false, false, + handle_no_sanitize_hwaddress_attribute, NULL }, { "no_sanitize_thread", 0, 0, true, false, false, false, handle_no_sanitize_thread_attribute, NULL }, { "no_sanitize_undefined", 0, 0, true, false, false, false, @@ -946,6 +950,22 @@ handle_no_sanitize_address_attribute (tree *node, tree name, tree, int, return NULL_TREE; } +/* Handle a "no_sanitize_hwaddress" 
attribute; arguments as in + struct attribute_spec.handler. */ + +static tree +handle_no_sanitize_hwaddress_attribute (tree *node, tree name, tree, int, + bool *no_add_attrs) +{ + *no_add_attrs = true; + if (TREE_CODE (*node) != FUNCTION_DECL) +warning (OPT_Wattributes, "%qE attribute ignored", name); + else +add_no_sanitize_value (*node, SANITIZE_HWADDRESS); + + return NULL_TREE; +} + /* Handle a "no_sanitize_thread" attribute; arguments as in struct
[PATCH 6/X] libsanitizer: Add hwasan pass and associated gimple changes
There are four main features to this change:

1) Check pointer tags match address tags.

   In the new `hwasan` pass we put HWASAN_CHECK internal functions before all memory accesses to check that the tag in the pointer being used matches the tag stored in shadow memory for the memory region being accessed.

   These internal functions are expanded into actual checks in the sanopt pass that happens just before expansion into RTL.

   We use the same mechanism that currently inserts ASAN_CHECK internal functions to insert the new HWASAN_CHECK functions.

2) Instrument known builtin function calls.

   Handle all builtin functions that we know use memory accesses. This commit uses the machinery added for ASAN to identify builtin functions that access memory.

   The main differences between the approaches for HWASAN and ASAN are:
   - libhwasan intercepts far fewer builtin functions.
   - Alloca needs to be transformed differently (instead of adding redzones it needs to tag shadow memory and return a tagged pointer).
   - stack_restore needs to untag the shadow stack between the current position and where it's going.
   - `noreturn` functions cannot be handled by simply unpoisoning the entire shadow stack -- there is no "always valid" tag. Exceptions and things such as longjmp need to be handled in a different way.

   For hardware implemented checking (such as AArch64's memory tagging extension) alloca and stack_restore will need to be handled by hooks in the backend rather than by transformation at the gimple level. This will allow architecture specific handling of such stack modifications.

3) Introduce HWASAN block-scope poisoning.

   Here we use exactly the same mechanism as ASAN_MARK to poison/unpoison variables on entry/exit of a block.

   In order to reuse the exact same machinery we use the same internal functions until the SANOPT pass. This means that all handling of ASAN_MARK is the same.
This has the negative that the naming may be a little confusing, but a positive that handling of the internal function doesn't have to be duplicated for a function that behaves exactly the same but has a different name. gcc/ChangeLog: * asan.c (asan_instrument_reads): New. (asan_instrument_writes): New. (asan_memintrin): New. (handle_builtin_stack_restore): Account for HWASAN. (handle_builtin_alloca): Account for HWASAN. (get_mem_refs_of_builtin_call): Special case strlen for HWASAN. (report_error_func): Assert not HWASAN. (build_check_stmt): Make HWASAN_CHECK instead of ASAN_CHECK. (instrument_derefs): HWASAN does not tag globals. (instrument_builtin_call): Use new helper functions. (maybe_instrument_call): Don't instrument `noreturn` functions. (initialize_sanitizer_builtins): Add new type. (asan_expand_mark_ifn): Account for HWASAN. (asan_expand_check_ifn): Assert never called by HWASAN. (asan_expand_poison_ifn): Account for HWASAN. (hwasan_instrument_reads): New. (hwasan_instrument_writes): New. (hwasan_memintrin): New. (hwasan_instrument): New. (hwasan_base): New. (hwasan_check_func): New. (hwasan_expand_check_ifn): New. (hwasan_expand_mark_ifn): New. (gate_hwasan): New. (class pass_hwasan): New. (make_pass_hwasan): New. (class pass_hwasan_O0): New. (make_pass_hwasan_O0): New. * asan.h (hwasan_base): New decl. (hwasan_expand_check_ifn): New decl. (hwasan_expand_mark_ifn): New decl. (gate_hwasan): New decl. (enum hwasan_mark_flags): New. (asan_intercepted_p): Always false for hwasan. (asan_sanitize_use_after_scope): Account for HWASAN. * builtin-types.def (BT_FN_PTR_CONST_PTR_UINT8): New. * gimple-pretty-print.c (dump_gimple_call_args): Account for HWASAN. * gimplify.c (asan_poison_variable): Account for HWASAN. (gimplify_function_tree): Remove requirement of SANITIZE_ADDRESS, requiring asan or hwasan is accounted for in `asan_sanitize_use_after_scope`. * internal-fn.c (expand_HWASAN_CHECK): New. (expand_HWASAN_CHOOSE_TAG): New. (expand_HWASAN_MARK): New. 
(expand_HWASAN_ALLOCA_UNPOISON): New. * internal-fn.def (HWASAN_CHOOSE_TAG): New. (HWASAN_CHECK): New. (HWASAN_MARK): New. (HWASAN_ALLOCA_UNPOISON): New. * passes.def: Add hwasan and hwasan_O0 passes. * sanitizer.def (BUILT_IN_HWASAN_LOAD1): New. (BUILT_IN_HWASAN_LOAD2): New. (BUILT_IN_HWASAN_LOAD4): New. (BUILT_IN_HWASAN_LOAD8): New. (BUILT_IN_HWASAN_LOAD16): New. (BUILT_IN_HWASAN_LOADN): New. (BUILT_IN_HWASAN_STORE1): New. (BUILT_IN_HWASAN_STORE2): New. (BUILT_IN_HWASAN_STORE4): New. (BUILT_IN_HWASAN_STORE8): New. (BUILT_IN_HWASAN_STORE16): New. (BUILT_IN_HWASAN_STOREN):
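To make feature 1) above more concrete, here is a rough, self-contained model of what a tag check does conceptually. This is only a sketch and not the code the pass emits: the 8-bit tag in the top byte, the 16-byte granule, the toy `shadow` array and all function names are assumptions made for illustration, and special cases that a real runtime may handle (such as partially used granules) are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout for this sketch: 8-bit tag in bits 56-63 of the pointer,
   one shadow byte per 16-byte granule, and a tiny fake address space.  */
enum { TAG_SHIFT = 56, GRANULE_SIZE = 16, NUM_GRANULES = 64 };

/* Toy shadow memory: shadow[g] holds the tag of granule g.  */
static uint8_t shadow[NUM_GRANULES];

static const uint64_t TAG_MASK = (uint64_t) 0xff << TAG_SHIFT;

/* Return ADDR with TAG placed in the top byte.  */
static uint64_t
with_tag (uint64_t addr, uint8_t tag)
{
  return (addr & ~TAG_MASK) | ((uint64_t) tag << TAG_SHIFT);
}

/* Extract the tag from a tagged pointer.  */
static uint8_t
pointer_tag (uint64_t p)
{
  return (uint8_t) (p >> TAG_SHIFT);
}

/* Strip the tag, leaving only the address bits.  */
static uint64_t
untagged (uint64_t p)
{
  return p & ~TAG_MASK;
}

/* Tag every granule overlapping [ADDR, ADDR + SIZE).  */
static void
tag_region (uint64_t addr, uint64_t size, uint8_t tag)
{
  assert (size > 0
	  && addr + size <= (uint64_t) NUM_GRANULES * GRANULE_SIZE);
  uint64_t last = (addr + size - 1) / GRANULE_SIZE;
  for (uint64_t g = addr / GRANULE_SIZE; g <= last; g++)
    shadow[g] = tag;
}

/* Conceptual equivalent of a tag check: the access is fine only if the
   pointer tag matches the shadow tag of every granule it touches.  */
static bool
access_ok (uint64_t p, uint64_t size)
{
  uint64_t addr = untagged (p);
  if (size == 0 || addr + size > (uint64_t) NUM_GRANULES * GRANULE_SIZE)
    return false;
  uint8_t tag = pointer_tag (p);
  uint64_t last = (addr + size - 1) / GRANULE_SIZE;
  for (uint64_t g = addr / GRANULE_SIZE; g <= last; g++)
    if (shadow[g] != tag)
      return false;
  return true;
}
```

An access through a pointer whose tag differs from the shadow tag, or one that runs into a neighbouring granule carrying a different tag, fails the check. That is the property the instrumentation relies on.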
[PATCH 5/X] libsanitizer: mid-end: Introduce stack variable handling for HWASAN
Handling stack variables has three features.

1) Ensure HWASAN required alignment for stack variables.

   When tagging shadow memory, we need to ensure that each tag granule is only used by one variable at a time.

   This is done by ensuring that each tagged variable is aligned to the tag granule representation size, and by also aligning the end of each object so that the start of any other data stored on the stack falls in a different granule.

   This patch ensures the above by adding alignment requirements in `align_local_variable` and forcing all stack variable allocation to be deferred so that `expand_stack_vars` can ensure the stack pointer is aligned before allocating any variable for the current frame.

2) Put tags into each stack variable pointer.

   Make sure that every pointer to a stack variable includes a tag of some sort. The way tagging works is:
     a) For every new stack frame, a random tag is generated.
     b) A base register is formed from the stack pointer value and this random tag.
     c) References to stack variables are now formed with RTL describing an offset from this base in both tag and value.

   The random tag generation is handled by a backend hook. This hook decides whether to introduce a random tag or use the stack background based on the parameter hwasan-random-frame-tag. Using the stack background is necessary for testing and bootstrap. It is necessary during bootstrap to avoid breaking the `configure` test program for determining stack direction.

   Using the stack background means that every stack frame has the initial tag of zero and variables are tagged with incrementing tags from 1, which also makes debugging a bit easier.

   The tag offsets are also handled by a backend hook.

   This patch also adds some macros defining how the HWASAN shadow memory is stored and how a tag is stored in a pointer.

3) For each stack variable, tag and untag the shadow stack on function prologue and epilogue.
On entry to each function we tag the relevant shadow stack region for each stack variable so that it matches the tag added to each pointer for that variable.

This is the first patch where we use the HWASAN shadow space, so we need to add the libhwasan initialisation code that creates this shadow memory region to the binary we produce. This instrumentation is done in `compile_file`.

When exiting a function we need to ensure the shadow stack for this function has no remaining tag. Without clearing the shadow stack area for this stack frame, later function calls could get false positives when those later function calls check untagged areas (such as parameters passed on the stack) against a shadow stack area with a left-over tag.

Hence we ensure that the entire stack frame is cleared on function exit.

config/ChangeLog: * bootstrap-hwasan.mk: Disable random frame tags for stack-tagging during bootstrap.

gcc/ChangeLog: * asan.c (hwasan_record_base): New function. (hwasan_emit_untag_frame): New. (hwasan_increment_tag): New function. (hwasan_with_tag): New function. (hwasan_tag_init): New function. (initialize_sanitizer_builtins): Define new builtins. (ATTR_NOTHROW_LIST): New macro. (hwasan_current_tag): New. (hwasan_extract_tag): New. (hwasan_emit_prologue): New. (hwasan_create_untagged_base): New. (hwasan_finish_file): New. (hwasan_ctor_statements): New variable. (hwasan_sanitize_stack_p): New. (hwasan_sanitize_p): New. (hwasan_sanitize_allocas_p): New. * asan.h (hwasan_record_base): New declaration. (hwasan_emit_untag_frame): New. (hwasan_increment_tag): New declaration. (hwasan_with_tag): New declaration. (hwasan_sanitize_stack_p): New declaration. (hwasan_sanitize_allocas_p): New declaration. (hwasan_tag_init): New declaration. (hwasan_sanitize_p): New declaration. (HWASAN_TAG_SIZE): New macro. (HWASAN_TAG_GRANULE_SIZE): New macro. (HWASAN_TAG_SHIFT_SIZE): New macro. (HWASAN_SHIFT): New macro. (HWASAN_SHIFT_RTX): New macro. (HWASAN_STACK_BACKGROUND): New macro.
(hwasan_finish_file): New declaration. (hwasan_current_tag): New declaration. (hwasan_create_untagged_base): New declaration. (hwasan_extract_tag): New declaration. (hwasan_emit_prologue): New declaration. * cfgexpand.c (struct stack_vars_data): Add information to record hwasan variable stack offsets. (expand_stack_vars): Ensure variables are offset from a tagged base. Record offsets for hwasan. Ensure alignment. (expand_used_vars): Call function to emit prologue, and get untagging instructions for function exit. (align_local_variable): Ensure alignment. (defer_stack_allocation): Ensure all variables are deferred so they can be handled by
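The alignment and tagging scheme described in this message can be modelled in a few lines. The following is only a sketch under stated assumptions: the 16-byte granule, the 8-bit tag that wraps on overflow, the upward-growing layout and the names `round_up_to_granule`, `struct slot` and `layout_frame` are all hypothetical and are not taken from the patch, which does this work on RTL inside `expand_stack_vars`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed for this sketch: 16-byte tag granule.  */
enum { GRANULE = 16 };

/* Round N up to a multiple of the granule size, so that the end of one
   object never shares a granule with the start of the next.  */
static uint64_t
round_up_to_granule (uint64_t n)
{
  return (n + GRANULE - 1) & ~(uint64_t) (GRANULE - 1);
}

/* Hypothetical record of where a variable lives in the frame and which
   tag its pointers carry.  */
struct slot
{
  uint64_t offset;	/* Offset from the granule-aligned frame base.  */
  uint64_t size;	/* Size after rounding up to whole granules.  */
  uint8_t tag;		/* Tag for pointers to this variable.  */
};

/* Lay out COUNT variables with the given SIZES.  Variable I gets the tag
   BASE_TAG + 1 + I, wrapping in 8 bits (the wrap-around is an assumption
   of this sketch).  With BASE_TAG == 0 this gives the "incrementing tags
   from 1" behaviour described for the stack background.  Returns the
   total frame size used.  */
static uint64_t
layout_frame (const uint64_t *sizes, int count, uint8_t base_tag,
	      struct slot *out)
{
  assert (sizes != NULL && out != NULL && count >= 0);
  uint64_t offset = 0;
  for (int i = 0; i < count; i++)
    {
      out[i].offset = offset;
      out[i].size = round_up_to_granule (sizes[i]);
      out[i].tag = (uint8_t) (base_tag + 1 + i);
      offset += out[i].size;
    }
  return offset;
}
```

With a base tag of zero, three variables of sizes 1, 16 and 20 bytes end up at offsets 0, 16 and 32 with tags 1, 2 and 3, and no two of them share a granule.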
[PATCH 7/X] libsanitizer: Add tests
Adding hwasan tests. Only interesting thing here is that we have to make sure the tagging mechanism is deterministic to avoid flaky tests. gcc/testsuite/ChangeLog: * c-c++-common/hwasan/aligned-alloc.c: New test. * c-c++-common/hwasan/alloca-array-accessible.c: New test. * c-c++-common/hwasan/alloca-gets-different-tag.c: New test. * c-c++-common/hwasan/alloca-outside-caught.c: New test. * c-c++-common/hwasan/arguments.c: New test. * c-c++-common/hwasan/arguments-1.c: New test. * c-c++-common/hwasan/arguments-2.c: New test. * c-c++-common/hwasan/arguments-3.c: New test. * c-c++-common/hwasan/asan-pr63316.c: New test. * c-c++-common/hwasan/asan-pr70541.c: New test. * c-c++-common/hwasan/asan-pr78106.c: New test. * c-c++-common/hwasan/asan-pr79944.c: New test. * c-c++-common/hwasan/asan-rlimit-mmap-test-1.c: New test. * c-c++-common/hwasan/bitfield-1.c: New test. * c-c++-common/hwasan/bitfield-2.c: New test. * c-c++-common/hwasan/builtin-special-handling.c: New test. * c-c++-common/hwasan/check-interface.c: New test. * c-c++-common/hwasan/halt_on_error-1.c: New test. * c-c++-common/hwasan/heap-overflow.c: New test. * c-c++-common/hwasan/hwasan-poison-optimisation.c: New test. * c-c++-common/hwasan/hwasan-thread-access-parent.c: New test. * c-c++-common/hwasan/hwasan-thread-basic-failure.c: New test. * c-c++-common/hwasan/hwasan-thread-clears-stack.c: New test. * c-c++-common/hwasan/hwasan-thread-success.c: New test. * c-c++-common/hwasan/kernel-defaults.c: New test. * c-c++-common/hwasan/large-aligned-0.c: New test. * c-c++-common/hwasan/large-aligned-1.c: New test. * c-c++-common/hwasan/large-aligned-untagging-0.c: New test. * c-c++-common/hwasan/large-aligned-untagging-1.c: New test. * c-c++-common/hwasan/large-aligned-untagging-2.c: New test. * c-c++-common/hwasan/large-aligned-untagging-3.c: New test. * c-c++-common/hwasan/large-aligned-untagging-4.c: New test. * c-c++-common/hwasan/large-aligned-untagging-5.c: New test. 
* c-c++-common/hwasan/large-aligned-untagging-6.c: New test. * c-c++-common/hwasan/large-aligned-untagging-7.c: New test. * c-c++-common/hwasan/macro-definition.c: New test. * c-c++-common/hwasan/no-sanitize-attribute.c: New test. * c-c++-common/hwasan/param-instrument-reads-and-writes.c: New test. * c-c++-common/hwasan/param-instrument-reads.c: New test. * c-c++-common/hwasan/param-instrument-writes.c: New test. * c-c++-common/hwasan/param-instrument-mem-intrinsics.c: New test. * c-c++-common/hwasan/random-frame-tag.c: New test. * c-c++-common/hwasan/sanity-check-pure-c.c: New test. * c-c++-common/hwasan/setjmp-longjmp-0.c: New test. * c-c++-common/hwasan/setjmp-longjmp-1.c: New test. * c-c++-common/hwasan/stack-tagging-basic-0.c: New test. * c-c++-common/hwasan/stack-tagging-basic-1.c: New test. * c-c++-common/hwasan/stack-tagging-disable.c: New test. * c-c++-common/hwasan/unprotected-allocas-0.c: New test. * c-c++-common/hwasan/unprotected-allocas-1.c: New test. * c-c++-common/hwasan/use-after-free.c: New test. * c-c++-common/hwasan/vararray-outside-caught.c: New test. * c-c++-common/hwasan/vararray-stack-restore-correct.c: New test. * c-c++-common/hwasan/very-large-objects.c: New test. * g++.dg/hwasan/hwasan.exp: New file. * g++.dg/hwasan/rvo-handled.C: New test. * gcc.dg/hwasan/hwasan.exp: New file. * gcc.dg/hwasan/nested-functions-0.c: New test. * gcc.dg/hwasan/nested-functions-1.c: New test. * gcc.dg/hwasan/nested-functions-2.c: New test. * lib/hwasan-dg.exp: New file. hwasan-diff6.patch.gz Description: application/gzip
[PATCH 1/X] libsanitizer: Tie the hwasan library into our build system
This patch tries to tie libhwasan into the GCC build system in the same way that the other sanitizer runtime libraries are handled. libsanitizer/ChangeLog: * Makefile.am: Build libhwasan. * Makefile.in: Build libhwasan. * asan/Makefile.in: Build libhwasan. * configure: Build libhwasan. * configure.ac: Build libhwasan. * hwasan/Makefile.am: New file. * hwasan/Makefile.in: New file. * hwasan/libtool-version: New file. * interception/Makefile.in: Build libhwasan. * libbacktrace/Makefile.in: Build libhwasan. * libsanitizer.spec.in: Build libhwasan. * lsan/Makefile.in: Build libhwasan. * sanitizer_common/Makefile.in: Build libhwasan. * tsan/Makefile.in: Build libhwasan. * ubsan/Makefile.in: Build libhwasan. ### Attachment also inlined for ease of reply### diff --git a/libsanitizer/Makefile.am b/libsanitizer/Makefile.am index 65ed1e712378ef453f820f86c4d3221f9dee5f2c..2a7e8e1debe838719db0f0fad218b2543cc3111b 100644 --- a/libsanitizer/Makefile.am +++ b/libsanitizer/Makefile.am @@ -14,11 +14,12 @@ endif if LIBBACKTRACE_SUPPORTED SUBDIRS += libbacktrace endif -SUBDIRS += lsan asan ubsan +SUBDIRS += lsan asan ubsan hwasan nodist_saninclude_HEADERS += \ include/sanitizer/lsan_interface.h \ include/sanitizer/asan_interface.h \ - include/sanitizer/tsan_interface.h + include/sanitizer/tsan_interface.h \ + include/sanitizer/hwasan_interface.h if TSAN_SUPPORTED SUBDIRS += tsan endif diff --git a/libsanitizer/Makefile.in b/libsanitizer/Makefile.in index 02c7f70ac6578a3e93a490ce8bd2c54fc0693c50..2c57d49cbffdb486645aeb5f2c0f85d6e0fad124 100644 --- a/libsanitizer/Makefile.in +++ b/libsanitizer/Makefile.in @@ -92,7 +92,8 @@ target_triplet = @target@ @SANITIZER_SUPPORTED_TRUE@am__append_1 = include/sanitizer/common_interface_defs.h \ @SANITIZER_SUPPORTED_TRUE@ include/sanitizer/lsan_interface.h \ @SANITIZER_SUPPORTED_TRUE@ include/sanitizer/asan_interface.h \ -@SANITIZER_SUPPORTED_TRUE@ include/sanitizer/tsan_interface.h +@SANITIZER_SUPPORTED_TRUE@ include/sanitizer/tsan_interface.h \ 
+@SANITIZER_SUPPORTED_TRUE@ include/sanitizer/hwasan_interface.h @SANITIZER_SUPPORTED_TRUE@@USING_MAC_INTERPOSE_FALSE@am__append_2 = interception @LIBBACKTRACE_SUPPORTED_TRUE@@SANITIZER_SUPPORTED_TRUE@am__append_3 = libbacktrace @SANITIZER_SUPPORTED_TRUE@@TSAN_SUPPORTED_TRUE@am__append_4 = tsan @@ -207,7 +208,7 @@ ETAGS = etags CTAGS = ctags CSCOPE = cscope DIST_SUBDIRS = sanitizer_common interception libbacktrace lsan asan \ - ubsan tsan + ubsan hwasan tsan ACLOCAL = @ACLOCAL@ ALLOC_FILE = @ALLOC_FILE@ AMTAR = @AMTAR@ @@ -329,6 +330,7 @@ install_sh = @install_sh@ libdir = @libdir@ libexecdir = @libexecdir@ link_libasan = @link_libasan@ +link_libhwasan = @link_libhwasan@ link_liblsan = @link_liblsan@ link_libtsan = @link_libtsan@ link_libubsan = @link_libubsan@ @@ -362,7 +364,7 @@ sanincludedir = $(libdir)/gcc/$(target_alias)/$(gcc_version)/include/sanitizer nodist_saninclude_HEADERS = $(am__append_1) @SANITIZER_SUPPORTED_TRUE@SUBDIRS = sanitizer_common $(am__append_2) \ @SANITIZER_SUPPORTED_TRUE@ $(am__append_3) lsan asan ubsan \ -@SANITIZER_SUPPORTED_TRUE@ $(am__append_4) +@SANITIZER_SUPPORTED_TRUE@ hwasan $(am__append_4) gcc_version := $(shell @get_gcc_base_ver@ $(top_srcdir)/../gcc/BASE-VER) # Work around what appears to be a GNU make bug handling MAKEFLAGS diff --git a/libsanitizer/asan/Makefile.in b/libsanitizer/asan/Makefile.in index 29622bf466a37f819c9fade30e31195adda51190..25c7fd7b7597d6e243005a1bb7de5b6243d2cfcf 100644 --- a/libsanitizer/asan/Makefile.in +++ b/libsanitizer/asan/Makefile.in @@ -383,6 +383,7 @@ install_sh = @install_sh@ libdir = @libdir@ libexecdir = @libexecdir@ link_libasan = @link_libasan@ +link_libhwasan = @link_libhwasan@ link_liblsan = @link_liblsan@ link_libtsan = @link_libtsan@ link_libubsan = @link_libubsan@ diff --git a/libsanitizer/configure b/libsanitizer/configure index 04eca04fbe5e59bae1ba00597de0cf1b7cf1b5fa..9ed9669a85d3cfc2f2f623e796e61a5f8f7e4ded 100755 --- a/libsanitizer/configure +++ b/libsanitizer/configure @@ -657,6 
+657,7 @@ USING_MAC_INTERPOSE_TRUE link_liblsan link_libubsan link_libtsan +link_libhwasan link_libasan LSAN_SUPPORTED_FALSE LSAN_SUPPORTED_TRUE @@ -12361,7 +12362,7 @@ else lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2 lt_status=$lt_dlunknown cat > conftest.$ac_ext <<_LT_EOF -#line 12364 "configure" +#line 12365 "configure" #include "confdefs.h" #if HAVE_DLFCN_H @@ -12467,7 +12468,7 @@ else lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2 lt_status=$lt_dlunknown cat > conftest.$ac_ext <<_LT_EOF -#line 12470 "configure" +#line 12471 "configure" #include "confdefs.h" #if HAVE_DLFCN_H @@ -15943,6 +15944,10 @@ fi link_libasan=$link_sanitizer_common +# Set up the set of additional
[PATCH 2/X] libsanitizer: Only build libhwasan when targeting AArch64
Though the library has limited support for x86, we don't have any support for generating code targeting x86 so there is no point building for that target. libsanitizer/ChangeLog: * Makefile.am: Condition building hwasan directory. * Makefile.in: Regenerate. * configure: Regenerate. * configure.ac: Set HWASAN_SUPPORTED based on target architecture. * configure.tgt: Likewise. ### Attachment also inlined for ease of reply### diff --git a/libsanitizer/Makefile.am b/libsanitizer/Makefile.am index 2a7e8e1debe838719db0f0fad218b2543cc3111b..065a65e78d49f7689a01ecb64db1f07ca83aa987 100644 --- a/libsanitizer/Makefile.am +++ b/libsanitizer/Makefile.am @@ -14,7 +14,7 @@ endif if LIBBACKTRACE_SUPPORTED SUBDIRS += libbacktrace endif -SUBDIRS += lsan asan ubsan hwasan +SUBDIRS += lsan asan ubsan nodist_saninclude_HEADERS += \ include/sanitizer/lsan_interface.h \ include/sanitizer/asan_interface.h \ @@ -23,6 +23,9 @@ nodist_saninclude_HEADERS += \ if TSAN_SUPPORTED SUBDIRS += tsan endif +if HWASAN_SUPPORTED +SUBDIRS += hwasan +endif endif ## May be used by toolexeclibdir. diff --git a/libsanitizer/Makefile.in b/libsanitizer/Makefile.in index 2c57d49cbffdb486645aeb5f2c0f85d6e0fad124..3873ea4d7050f04a3f7bbd0dd3f2a71e9b65d287 100644 --- a/libsanitizer/Makefile.in +++ b/libsanitizer/Makefile.in @@ -97,6 +97,7 @@ target_triplet = @target@ @SANITIZER_SUPPORTED_TRUE@@USING_MAC_INTERPOSE_FALSE@am__append_2 = interception @LIBBACKTRACE_SUPPORTED_TRUE@@SANITIZER_SUPPORTED_TRUE@am__append_3 = libbacktrace @SANITIZER_SUPPORTED_TRUE@@TSAN_SUPPORTED_TRUE@am__append_4 = tsan +@HWASAN_SUPPORTED_TRUE@@SANITIZER_SUPPORTED_TRUE@am__append_5 = hwasan subdir = . 
ACLOCAL_M4 = $(top_srcdir)/aclocal.m4 am__aclocal_m4_deps = $(top_srcdir)/../config/acx.m4 \ @@ -208,7 +209,7 @@ ETAGS = etags CTAGS = ctags CSCOPE = cscope DIST_SUBDIRS = sanitizer_common interception libbacktrace lsan asan \ - ubsan hwasan tsan + ubsan tsan hwasan ACLOCAL = @ACLOCAL@ ALLOC_FILE = @ALLOC_FILE@ AMTAR = @AMTAR@ @@ -364,7 +365,7 @@ sanincludedir = $(libdir)/gcc/$(target_alias)/$(gcc_version)/include/sanitizer nodist_saninclude_HEADERS = $(am__append_1) @SANITIZER_SUPPORTED_TRUE@SUBDIRS = sanitizer_common $(am__append_2) \ @SANITIZER_SUPPORTED_TRUE@ $(am__append_3) lsan asan ubsan \ -@SANITIZER_SUPPORTED_TRUE@ hwasan $(am__append_4) +@SANITIZER_SUPPORTED_TRUE@ $(am__append_4) $(am__append_5) gcc_version := $(shell @get_gcc_base_ver@ $(top_srcdir)/../gcc/BASE-VER) # Work around what appears to be a GNU make bug handling MAKEFLAGS diff --git a/libsanitizer/configure b/libsanitizer/configure index 9ed9669a85d3cfc2f2f623e796e61a5f8f7e4ded..cc5c229f4aebcdd454e9e2e415a8e16046dc1b1a 100755 --- a/libsanitizer/configure +++ b/libsanitizer/configure @@ -659,6 +659,8 @@ link_libubsan link_libtsan link_libhwasan link_libasan +HWASAN_SUPPORTED_FALSE +HWASAN_SUPPORTED_TRUE LSAN_SUPPORTED_FALSE LSAN_SUPPORTED_TRUE TSAN_SUPPORTED_FALSE @@ -12362,7 +12364,7 @@ else lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2 lt_status=$lt_dlunknown cat > conftest.$ac_ext <<_LT_EOF -#line 12365 "configure" +#line 12367 "configure" #include "confdefs.h" #if HAVE_DLFCN_H @@ -12468,7 +12470,7 @@ else lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2 lt_status=$lt_dlunknown cat > conftest.$ac_ext <<_LT_EOF -#line 12471 "configure" +#line 12473 "configure" #include "confdefs.h" #if HAVE_DLFCN_H @@ -15819,6 +15821,7 @@ fi # Get target configury. unset TSAN_SUPPORTED unset LSAN_SUPPORTED +unset HWASAN_SUPPORTED . 
${srcdir}/configure.tgt if test "x$TSAN_SUPPORTED" = "xyes"; then TSAN_SUPPORTED_TRUE= @@ -15836,6 +15839,14 @@ else LSAN_SUPPORTED_FALSE= fi + if test "x$HWASAN_SUPPORTED" = "xyes"; then + HWASAN_SUPPORTED_TRUE= + HWASAN_SUPPORTED_FALSE='#' +else + HWASAN_SUPPORTED_TRUE='#' + HWASAN_SUPPORTED_FALSE= +fi + # Check for functions needed. for ac_func in clock_getres clock_gettime clock_settime lstat readlink @@ -16818,7 +16829,7 @@ ac_config_files="$ac_config_files Makefile libsanitizer.spec libbacktrace/backtr ac_config_headers="$ac_config_headers config.h" -ac_config_files="$ac_config_files interception/Makefile sanitizer_common/Makefile libbacktrace/Makefile lsan/Makefile asan/Makefile hwasan/Makefile ubsan/Makefile" +ac_config_files="$ac_config_files interception/Makefile sanitizer_common/Makefile libbacktrace/Makefile lsan/Makefile asan/Makefile ubsan/Makefile" if test "x$TSAN_SUPPORTED" = "xyes"; then @@ -16826,6 +16837,11 @@ if test "x$TSAN_SUPPORTED" = "xyes"; then fi +if test "x$HWASAN_SUPPORTED" = "xyes"; then + ac_config_files="$ac_config_files hwasan/Makefile" + +fi + @@ -17090,6 +17106,10 @@ if test -z "${LSAN_SUPPORTED_TRUE}" && test -z "${LSAN_SUPPORTED_FALSE}"; then as_fn_error $? "conditional \"LSAN_SUPPORTED\" was never defined. Usually
[PATCH 3/X] libsanitizer: Add option to bootstrap using HWASAN
This is an analogous option to --bootstrap-asan to configure. It allows bootstrapping GCC using HWASAN. For the same reasons as for ASAN we have to avoid using the HWASAN sanitizer when compiling libiberty and the lto-plugin. Also add a function to query whether -fsanitize=hwaddress has been passed. ChangeLog: * configure: Regenerate. * configure.ac: Add --bootstrap-hwasan option. config/ChangeLog: * bootstrap-hwasan.mk: New file. gcc/ChangeLog: * doc/install.texi: Document new option. libiberty/ChangeLog: * configure: Regenerate. * configure.ac: Avoid using sanitizer. lto-plugin/ChangeLog: * Makefile.am: Avoid using sanitizer. * Makefile.in: Regenerate. ### Attachment also inlined for ease of reply### diff --git a/config/bootstrap-hwasan.mk b/config/bootstrap-hwasan.mk new file mode 100644 index ..4f60bed3fd6e98b47a3a38aea6eba2a7c320da25 --- /dev/null +++ b/config/bootstrap-hwasan.mk @@ -0,0 +1,8 @@ +# This option enables -fsanitize=hwaddress for stage2 and stage3. + +STAGE2_CFLAGS += -fsanitize=hwaddress +STAGE3_CFLAGS += -fsanitize=hwaddress +POSTSTAGE1_LDFLAGS += -fsanitize=hwaddress -static-libhwasan \ + -B$$r/prev-$(TARGET_SUBDIR)/libsanitizer/ \ + -B$$r/prev-$(TARGET_SUBDIR)/libsanitizer/hwasan/ \ + -B$$r/prev-$(TARGET_SUBDIR)/libsanitizer/hwasan/.libs diff --git a/configure b/configure index a0c5aca9e8d5cae2782c8fe4625a501853dc226a..203319e3f899e8d24429950c3a5d22927fb5150f 100755 --- a/configure +++ b/configure @@ -8297,7 +8297,7 @@ fi # or bootstrap-ubsan, bootstrap it. 
if echo " ${target_configdirs} " | grep " libsanitizer " > /dev/null 2>&1; then case "$BUILD_CONFIG" in -*bootstrap-asan* | *bootstrap-ubsan* ) +*bootstrap-hwasan* | *bootstrap-asan* | *bootstrap-ubsan* ) bootstrap_target_libs=${bootstrap_target_libs}target-libsanitizer, bootstrap_fixincludes=yes ;; diff --git a/configure.ac b/configure.ac index 1a53ed418e4d97606356b14a17b50186c79adcd3..9d5c187c31bfc01003e75058896b686807e47643 100644 --- a/configure.ac +++ b/configure.ac @@ -2809,7 +2809,7 @@ fi # or bootstrap-ubsan, bootstrap it. if echo " ${target_configdirs} " | grep " libsanitizer " > /dev/null 2>&1; then case "$BUILD_CONFIG" in -*bootstrap-asan* | *bootstrap-ubsan* ) +*bootstrap-hwasan* | *bootstrap-asan* | *bootstrap-ubsan* ) bootstrap_target_libs=${bootstrap_target_libs}target-libsanitizer, bootstrap_fixincludes=yes ;; diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi index d581a34653f61a440b3c3b832836fe109e2fbd08..25d041fcbb1f7c16f7ac47b7b5d4ea8308c6f69c 100644 --- a/gcc/doc/install.texi +++ b/gcc/doc/install.texi @@ -2767,6 +2767,11 @@ the build tree. Compiles GCC itself using Address Sanitization in order to catch invalid memory accesses within the GCC code. +@item @samp{bootstrap-hwasan} +Compiles GCC itself using HWAddress Sanitization in order to catch invalid +memory accesses within the GCC code. This option is only available on AArch64 +targets running a Linux kernel that supports the required ABI (5.4 or later). 
+ @end table @section Building a cross compiler diff --git a/libiberty/configure b/libiberty/configure index 1f8e23f0d235a6a5d5158bf6705023db95ac7023..59e0b73d5838bbd42a5548759084471e97ec254f 100755 --- a/libiberty/configure +++ b/libiberty/configure @@ -5264,6 +5264,7 @@ fi NOASANFLAG= case " ${CFLAGS} " in *\ -fsanitize=address\ *) NOASANFLAG=-fno-sanitize=address ;; + *\ -fsanitize=hwaddress\ *) NOASANFLAG=-fno-sanitize=hwaddress ;; esac diff --git a/libiberty/configure.ac b/libiberty/configure.ac index 4e2599c14a89bafcb8c7e523b9ce5b3d60b8c0f6..ad952963971a31968b5d109661b9cab0aa4b95fc 100644 --- a/libiberty/configure.ac +++ b/libiberty/configure.ac @@ -240,6 +240,7 @@ AC_SUBST(PICFLAG) NOASANFLAG= case " ${CFLAGS} " in *\ -fsanitize=address\ *) NOASANFLAG=-fno-sanitize=address ;; + *\ -fsanitize=hwaddress\ *) NOASANFLAG=-fno-sanitize=hwaddress ;; esac AC_SUBST(NOASANFLAG) diff --git a/lto-plugin/Makefile.am b/lto-plugin/Makefile.am index ba5882df7a7272f65219191c82ecd78ab4d3725e..50d6e09dac881d28d4ff70def47b09ed8c0ea66c 100644 --- a/lto-plugin/Makefile.am +++ b/lto-plugin/Makefile.am @@ -11,8 +11,8 @@ AM_CPPFLAGS = -I$(top_srcdir)/../include $(DEFS) AM_CFLAGS = @ac_lto_plugin_warn_cflags@ $(CET_HOST_FLAGS) AM_LDFLAGS = @ac_lto_plugin_ldflags@ AM_LIBTOOLFLAGS = --tag=disable-static -override CFLAGS := $(filter-out -fsanitize=address,$(CFLAGS)) -override LDFLAGS := $(filter-out -fsanitize=address,$(LDFLAGS)) +override CFLAGS := $(filter-out -fsanitize=address -fsanitize=hwaddress,$(CFLAGS)) +override LDFLAGS := $(filter-out -fsanitize=address -fsanitize=hwaddress,$(LDFLAGS)) libexecsub_LTLIBRARIES = liblto_plugin.la gcc_build_dir = @gcc_build_dir@ diff --git
Re: [PATCH] libiberty/hashtab: add new functions
Adding libiberty maintainer to CC. On 8/17/20 4:03 PM, Martin Liška wrote: Hey. I'm working on binutils where I would like to port a hash table implementation in gas/hash.[ch] to the libiberty one. But it would be handy for me to add 2 new functions. Thoughts? Thanks, Martin include/ChangeLog: * hashtab.h (htab_insert): New function. (htab_print_statistics): Likewise. libiberty/ChangeLog: * hashtab.c (htab_insert): New function. (htab_print_statistics): Likewise. --- include/hashtab.h | 6 ++ libiberty/hashtab.c | 23 +++ 2 files changed, 29 insertions(+) diff --git a/include/hashtab.h b/include/hashtab.h index 6cca342b989..bcaee909bcf 100644 --- a/include/hashtab.h +++ b/include/hashtab.h @@ -37,6 +37,7 @@ extern "C" { #endif /* __cplusplus */ #include "ansidecl.h" +#include <stdio.h> /* The type for a hash code. */ typedef unsigned int hashval_t; @@ -172,6 +173,7 @@ extern void ** htab_find_slot (htab_t, const void *, enum insert_option); extern void * htab_find_with_hash (htab_t, const void *, hashval_t); extern void ** htab_find_slot_with_hash (htab_t, const void *, hashval_t, enum insert_option); +extern void htab_insert (htab_t, void *); extern void htab_clear_slot (htab_t, void **); extern void htab_remove_elt (htab_t, const void *); extern void htab_remove_elt_with_hash (htab_t, const void *, hashval_t); @@ -183,6 +185,10 @@ extern size_t htab_size (htab_t); extern size_t htab_elements (htab_t); extern double htab_collisions (htab_t); +extern void htab_print_statistics (FILE *f, htab_t table, + const char *name, + const char *prefix); + /* A hash function for pointers. */ extern htab_hash htab_hash_pointer; diff --git a/libiberty/hashtab.c b/libiberty/hashtab.c index 225e9e540a7..fb3152ec9c6 100644 --- a/libiberty/hashtab.c +++ b/libiberty/hashtab.c @@ -704,6 +704,15 @@ htab_find_slot (htab_t htab, const PTR element, enum insert_option insert) insert); } +/* Insert ELEMENT into HTAB. If the element exists, it is overwritten. 
*/ + +void +htab_insert (htab_t htab, PTR element) +{ + void **slot = htab_find_slot (htab, element, INSERT); + *slot = element; +} + /* This function deletes an element with the given value from hash table (the hash is computed from the element). If there is no matching element in the hash table, this function does nothing. */ @@ -803,6 +812,20 @@ htab_collisions (htab_t htab) return (double) htab->collisions / (double) htab->searches; } +/* Print statistics about a hash table. */ + +void +htab_print_statistics (FILE *f, htab_t table, const char *name, + const char *prefix) +{ + fprintf (f, "%s hash statistics:\n", name); + fprintf (f, "%s%u searches\n", prefix, table->searches); + fprintf (f, "%s%lu elements\n", prefix, htab_elements (table)); + fprintf (f, "%s%lu table size\n", prefix, htab_size (table)); + fprintf (f, "%s%.2f collisions per search\n", + prefix, htab_collisions (table)); +} + /* Hash P as a null-terminated string. Copied from gcc/hashtable.c. Zack had the following to say with respect
[PATCH] libiberty/hashtab: add new functions
Hey. I'm working on binutils where I would like to port a hash table implementation in gas/hash.[ch] to the libiberty one. But it would be handy for me to add 2 new functions. Thoughts? Thanks, Martin include/ChangeLog: * hashtab.h (htab_insert): New function. (htab_print_statistics): Likewise. libiberty/ChangeLog: * hashtab.c (htab_insert): New function. (htab_print_statistics): Likewise. --- include/hashtab.h | 6 ++ libiberty/hashtab.c | 23 +++ 2 files changed, 29 insertions(+) diff --git a/include/hashtab.h b/include/hashtab.h index 6cca342b989..bcaee909bcf 100644 --- a/include/hashtab.h +++ b/include/hashtab.h @@ -37,6 +37,7 @@ extern "C" { #endif /* __cplusplus */ #include "ansidecl.h" +#include <stdio.h> /* The type for a hash code. */ typedef unsigned int hashval_t; @@ -172,6 +173,7 @@ extern void ** htab_find_slot (htab_t, const void *, enum insert_option); extern void * htab_find_with_hash (htab_t, const void *, hashval_t); extern void ** htab_find_slot_with_hash (htab_t, const void *, hashval_t, enum insert_option); +extern void htab_insert (htab_t, void *); extern void htab_clear_slot (htab_t, void **); extern void htab_remove_elt (htab_t, const void *); extern void htab_remove_elt_with_hash (htab_t, const void *, hashval_t); @@ -183,6 +185,10 @@ extern size_t htab_size (htab_t); extern size_t htab_elements (htab_t); extern double htab_collisions (htab_t); +extern void htab_print_statistics (FILE *f, htab_t table, + const char *name, + const char *prefix); + /* A hash function for pointers. */ extern htab_hash htab_hash_pointer; diff --git a/libiberty/hashtab.c b/libiberty/hashtab.c index 225e9e540a7..fb3152ec9c6 100644 --- a/libiberty/hashtab.c +++ b/libiberty/hashtab.c @@ -704,6 +704,15 @@ htab_find_slot (htab_t htab, const PTR element, enum insert_option insert) insert); } +/* Insert ELEMENT into HTAB. If the element exists, it is overwritten. 
*/ + +void +htab_insert (htab_t htab, PTR element) +{ + void **slot = htab_find_slot (htab, element, INSERT); + *slot = element; +} + /* This function deletes an element with the given value from hash table (the hash is computed from the element). If there is no matching element in the hash table, this function does nothing. */ @@ -803,6 +812,20 @@ htab_collisions (htab_t htab) return (double) htab->collisions / (double) htab->searches; } +/* Print statistics about a hash table. */ + +void +htab_print_statistics (FILE *f, htab_t table, const char *name, + const char *prefix) +{ + fprintf (f, "%s hash statistics:\n", name); + fprintf (f, "%s%u searches\n", prefix, table->searches); + fprintf (f, "%s%lu elements\n", prefix, htab_elements (table)); + fprintf (f, "%s%lu table size\n", prefix, htab_size (table)); + fprintf (f, "%s%.2f collisions per search\n", + prefix, htab_collisions (table)); +} + /* Hash P as a null-terminated string. Copied from gcc/hashtable.c. Zack had the following to say with respect -- 2.28.0
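The proposed htab_insert is a thin wrapper: find the slot for the element with INSERT, then store the element there, which overwrites any equal element already present. The same pattern can be shown on a self-contained toy table. Everything below (`toy_table`, `toy_find_slot`, `toy_insert`, the fixed size and the string hash) is hypothetical and only illustrates the idea; it is not libiberty's implementation, which among other things resizes the table and counts elements inside htab_find_slot itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { TOY_SIZE = 8 };

struct toy_entry
{
  const char *key;
  int value;
};

struct toy_table
{
  const struct toy_entry *slots[TOY_SIZE];
  size_t n_elements;
};

/* Simple multiplicative string hash, good enough for a toy.  */
static unsigned
toy_hash (const char *s)
{
  unsigned h = 0;
  while (*s)
    h = h * 31 + (unsigned char) *s++;
  return h;
}

/* Return the slot holding an entry whose key equals KEY, or the empty slot
   where such an entry would go.  Uses linear probing; terminates because
   toy_insert always keeps at least one slot empty.  */
static const struct toy_entry **
toy_find_slot (struct toy_table *t, const char *key)
{
  size_t i = toy_hash (key) % TOY_SIZE;
  while (t->slots[i] != NULL && strcmp (t->slots[i]->key, key) != 0)
    i = (i + 1) % TOY_SIZE;
  return &t->slots[i];
}

/* Insert E into T.  If an entry with the same key exists, it is
   overwritten, mirroring the find-slot-then-assign shape of htab_insert.  */
static void
toy_insert (struct toy_table *t, const struct toy_entry *e)
{
  const struct toy_entry **slot = toy_find_slot (t, e->key);
  if (*slot == NULL)
    {
      /* Keep one slot free so that probing always terminates.  */
      assert (t->n_elements + 1 < TOY_SIZE);
      t->n_elements++;
    }
  *slot = e;
}

/* Return the entry for KEY, or NULL if there is none.  */
static const struct toy_entry *
toy_lookup (struct toy_table *t, const char *key)
{
  return *toy_find_slot (t, key);
}
```

The point of the sketch is the overwrite semantics: inserting a second entry with an equal key replaces the first one and leaves the element count unchanged.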
Re: [PATCH] Implement no_stack_protect attribute.
PING^4 On 7/23/20 1:10 PM, Martin Liška wrote: PING^3 On 6/24/20 11:09 AM, Martin Liška wrote: PING^2 On 6/10/20 10:12 AM, Martin Liška wrote: PING^1 On 5/25/20 3:10 PM, Martin Liška wrote: On 5/21/20 4:53 PM, Martin Sebor wrote: On 5/21/20 5:28 AM, Martin Liška wrote: On 5/18/20 10:37 PM, Martin Sebor wrote: I know there are some somewhat complex cases the attribute exclusion mechanism isn't general enough to handle but this seems simple enough that it should work. Unless I'm missing something that makes it not feasible I would suggest to use it. Hi Martin. Do we have a better place where we check for attribute collision? If by collision you mean the same thing as the mutual exclusion I was talking about then that's done by creating an attribute_spec::exclusions array like for instance attr_cold_hot_exclusions in c-attribs.c and pointing to it from the attribute_spec entries for each of the mutually exclusive attributes in the attribute table. Everything else is handled automatically by decl_attributes. Martin Thanks, I'm sending updated version of the patch that utilizes the conflict detection. Patch can bootstrap on x86_64-linux-gnu and survives regression tests. Ready to be installed? Thanks, Martin
[PATCH] floatformat.h: Add bfloat16 support.
This change is motivated by a patchset that adds bfloat16 debugging support for new avx512 instructions to GDB. The gdb thread can be found here:
https://sourceware.org/pipermail/gdb-patches/2020-July/170820.html

include:
2020-08-17 Felix Willgerodt

    * floatformat.h (floatformat_bfloat16_big): New.
    (floatformat_bfloat16_little): New.

libiberty:
2020-08-17 Felix Willgerodt

    * floatformat.c (floatformat_bfloat16_big): New.
    (floatformat_bfloat16_little): New.
---
 include/floatformat.h | 3 +++
 libiberty/floatformat.c | 19 ++-
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/include/floatformat.h b/include/floatformat.h
index ce8d6d4add8..630fade0449 100644
--- a/include/floatformat.h
+++ b/include/floatformat.h
@@ -133,6 +133,9 @@ extern const struct floatformat floatformat_ia64_quad_little;
 /* IBM long double (double+double). */
 extern const struct floatformat floatformat_ibm_long_double_big;
 extern const struct floatformat floatformat_ibm_long_double_little;
+/* bfloat16. */
+extern const struct floatformat floatformat_bfloat16_big;
+extern const struct floatformat floatformat_bfloat16_little;

 /* Convert from FMT to a double.
    FROM is the address of the extended float.
diff --git a/libiberty/floatformat.c b/libiberty/floatformat.c
index 2fd5e688ec4..6b9b03288e2 100644
--- a/libiberty/floatformat.c
+++ b/libiberty/floatformat.c
@@ -389,7 +389,24 @@ const struct floatformat floatformat_ibm_long_double_little =
   floatformat_ibm_long_double_is_valid,
   &floatformat_ieee_double_little
 };
-
+
+const struct floatformat floatformat_bfloat16_big =
+{
+  floatformat_big, 16, 0, 1, 8, 127, 255, 9, 7,
+  floatformat_intbit_no,
+  "floatformat_bfloat16_big",
+  floatformat_always_valid,
+  NULL
+};
+
+const struct floatformat floatformat_bfloat16_little =
+{
+  floatformat_little, 16, 0, 1, 8, 127, 255, 9, 7,
+  floatformat_intbit_no,
+  "floatformat_bfloat16_little",
+  floatformat_always_valid,
+  NULL
+};

 #ifndef min
 #define min(a, b) ((a) < (b) ? (a) : (b))
--
2.25.4
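The field layout in the initializers above (16 bits total, sign at bit 0, an 8-bit exponent starting at bit 1 with bias 127, a 7-bit mantissa starting at bit 9, no explicit integer bit) matches the upper half of an IEEE binary32 value. A minimal sketch of that correspondence follows. The function names are hypothetical and not part of libiberty, and the widening helper assumes the host `float` is IEEE binary32.

```c
#include <stdint.h>
#include <string.h>

/* Widen a bfloat16 bit pattern to float by placing it in the upper
   16 bits of a 32-bit word.  Assumes the host float is IEEE binary32
   (hypothetical helper, not part of libiberty).  */
static float
bfloat16_bits_to_float (uint16_t bits)
{
  uint32_t wide = (uint32_t) bits << 16;
  float result;
  memcpy (&result, &wide, sizeof result);
  return result;
}

/* Field accessors following the layout in the patch, where bit
   positions are counted from the most significant bit:
   sign: 1 bit at position 0, exponent: 8 bits from position 1,
   mantissa: 7 bits from position 9.  */
static unsigned
bfloat16_sign (uint16_t bits)
{
  return bits >> 15;
}

static unsigned
bfloat16_biased_exponent (uint16_t bits)
{
  return (bits >> 7) & 0xffu;
}

static unsigned
bfloat16_mantissa (uint16_t bits)
{
  return bits & 0x7fu;
}
```

For example, the pattern 0x3f80 has biased exponent 127 and a zero mantissa, so it widens to 1.0f.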
Re: [PATCH] openmp: fix UBSAN error at gcc/fortran/openmp.c:4737
On 8/17/20 11:15 AM, Martin Liška wrote:
> I'm suggesting one more clean up that uses static assert instead of a run-time check.

I concur that compile-time checks are nicer.

LGTM – it should be able to catch this kind of mistake.

Tobias

> Thoughts?
> Martin

0001-opnemp-add-static-assert-for-clause_names.patch

From c9aee2c44d5cf7e417d381988b2f4900e9ea8b05 Mon Sep 17 00:00:00 2001
From: Martin Liska
Date: Mon, 17 Aug 2020 11:14:13 +0200
Subject: [PATCH] opnemp: add static assert for clause_names.

gcc/fortran/ChangeLog:

    * openmp.c (resolve_omp_clauses): Add static assert for
    OMP_LIST_NUM and size of clause_names array. Remove check
    that is always true.
---
 gcc/fortran/openmp.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index 60d8e5573c2..4d33a450a33 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -4371,6 +4371,7 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
         "TO", "FROM", "REDUCTION", "DEVICE_RESIDENT", "LINK", "USE_DEVICE",
         "CACHE", "IS_DEVICE_PTR", "USE_DEVICE_PTR", "USE_DEVICE_ADDR",
         "NONTEMPORAL" };
+  STATIC_ASSERT (ARRAY_SIZE (clause_names) == OMP_LIST_NUM);

   if (omp_clauses == NULL)
     return;
@@ -4732,12 +4733,7 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
   for (list = 0; list < OMP_LIST_NUM; list++)
     if ((n = omp_clauses->lists[list]) != NULL)
       {
-        const char *name;
-
-        if (list < OMP_LIST_NUM)
-          name = clause_names[list];
-        else
-          gcc_unreachable ();
+        const char *name = clause_names[list];

         switch (list)
           {
--
2.28.0
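The idea in this patch, tying the length of a name table to the enum that indexes it at compile time, can be sketched outside GCC as follows. The enum, the table and the TOY_ARRAY_SIZE macro are made up for illustration; GCC's own STATIC_ASSERT and ARRAY_SIZE macros are replaced here by C11 `_Static_assert` and a local macro.

```c
#include <stddef.h>

/* Hypothetical list kinds; the last enumerator counts the others.  */
enum toy_list
{
  TOY_LIST_PRIVATE,
  TOY_LIST_SHARED,
  TOY_LIST_REDUCTION,
  TOY_LIST_NUM
};

/* One name per enumerator, in the same order as enum toy_list.  */
static const char *const toy_clause_names[] =
{
  "PRIVATE", "SHARED", "REDUCTION"
};

#define TOY_ARRAY_SIZE(a) (sizeof (a) / sizeof ((a)[0]))

/* Compilation fails if an enumerator is added without a name, or a
   name is added without an enumerator.  */
_Static_assert (TOY_ARRAY_SIZE (toy_clause_names) == TOY_LIST_NUM,
                "toy_clause_names must have one entry per toy_list value");

/* With the table size checked at compile time, a lookup for any
   LIST below TOY_LIST_NUM needs no run-time bounds check.  */
static const char *
toy_clause_name (enum toy_list list)
{
  return toy_clause_names[list];
}
```

Removing one string from the table, or adding an enumerator before TOY_LIST_NUM, makes the translation unit fail to compile instead of failing at run time.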
[PATCH v2] C-SKY: Support -mfloat-abi=hard.
gcc/ChangeLog: * config/csky/csky.md (CSKY_NPARM_FREGS): New. (call_value_internal_vs/d): New. (untyped_call): New. * config/csky/csky.h (TARGET_SINGLE_FPU): New. (TARGET_DOUBLE_FPU): New. (FUNCTION_VARG_REGNO_P): New. (CSKY_VREG_MODE_P): New. (FUNCTION_VARG_MODE_P): New. (CUMULATIVE_ARGS): Add extra regs info. (INIT_CUMULATIVE_ARGS): Use csky_init_cumulative_args. (FUNCTION_ARG_REGNO_P): Use FUNCTION_VARG_REGNO_P. * config/csky/csky-protos.h (csky_init_cumulative_args): Extern. * config/csky/csky.c (csky_cpu_cpp_builtins): Support TARGET_HARD_FLOAT_ABI. (csky_function_arg): Likewise. (csky_num_arg_regs): Likewise. (csky_function_arg_advance): Likewise. (csky_function_value): Likewise. (csky_libcall_value): Likewise. (csky_function_value_regno_p): Likewise. (csky_arg_partial_bytes): Likewise. (csky_setup_incoming_varargs): Likewise. (csky_init_cumulative_args): New. gcc/testsuite/ChangeLog: * gcc.dg/builtin-apply2.c : Skip if CSKY. * gcc.dg/torture/stackalign/builtin-apply-2.c : Likewise. 
---
 gcc/config/csky/csky-protos.h | 2 +
 gcc/config/csky/csky.c | 96 +++--
 gcc/config/csky/csky.h | 34 ++--
 gcc/config/csky/csky.md | 84 +++
 gcc/testsuite/gcc.dg/builtin-apply2.c | 2 +-
 .../gcc.dg/torture/stackalign/builtin-apply-2.c | 2 +-
 6 files changed, 200 insertions(+), 20 deletions(-)

diff --git a/gcc/config/csky/csky-protos.h b/gcc/config/csky/csky-protos.h
index cc1a033..2c02399 100644
--- a/gcc/config/csky/csky-protos.h
+++ b/gcc/config/csky/csky-protos.h
@@ -68,4 +68,6 @@ extern int csky_compute_pushpop_length (rtx *);

 extern int csky_default_branch_cost (bool, bool);
 extern bool csky_default_logical_op_non_short_circuit (void);
+
+extern void csky_init_cumulative_args (CUMULATIVE_ARGS *, tree, rtx, tree);
 #endif /* GCC_CSKY_PROTOS_H */
diff --git a/gcc/config/csky/csky.c b/gcc/config/csky/csky.c
index 7ba3ed3..b71291a 100644
--- a/gcc/config/csky/csky.c
+++ b/gcc/config/csky/csky.c
@@ -328,6 +328,16 @@ csky_cpu_cpp_builtins (cpp_reader *pfile)
     {
       builtin_define ("__csky_hard_float__");
       builtin_define ("__CSKY_HARD_FLOAT__");
+      if (TARGET_HARD_FLOAT_ABI)
+        {
+          builtin_define ("__csky_hard_float_abi__");
+          builtin_define ("__CSKY_HARD_FLOAT_ABI__");
+        }
+      if (TARGET_SINGLE_FPU)
+        {
+          builtin_define ("__csky_hard_float_fpu_sf__");
+          builtin_define ("__CSKY_HARD_FLOAT_FPU_SF__");
+        }
     }
   else
     {
@@ -1790,9 +1800,22 @@ static rtx
 csky_function_arg (cumulative_args_t pcum_v, const function_arg_info &arg)
 {
   CUMULATIVE_ARGS *pcum = get_cumulative_args (pcum_v);
+  int reg = pcum->reg;
+  machine_mode mode = arg.mode;

-  if (*pcum < CSKY_NPARM_REGS)
-    return gen_rtx_REG (arg.mode, CSKY_FIRST_PARM_REGNUM + *pcum);
+  if (FUNCTION_VARG_MODE_P(mode)
+      && !pcum->is_stdarg)
+    {
+      reg = pcum->freg;
+
+      if (reg < CSKY_NPARM_FREGS)
+        return gen_rtx_REG (mode, CSKY_FIRST_VFP_REGNUM + reg);
+      else
+        return NULL_RTX;
+    }
+
+  if (reg < CSKY_NPARM_REGS)
+    return gen_rtx_REG (mode, CSKY_FIRST_PARM_REGNUM + reg);

   return NULL_RTX;
 }
@@ -1802,7 +1825,7 @@ csky_function_arg (cumulative_args_t pcum_v, const function_arg_info &arg)
    MODE and TYPE. */

 static int
-csky_num_arg_regs (machine_mode mode, const_tree type)
+csky_num_arg_regs (machine_mode mode, const_tree type, bool is_stdarg)
 {
   int size;

@@ -1811,6 +1834,14 @@ csky_num_arg_regs (machine_mode mode, const_tree type)
   else
     size = GET_MODE_SIZE (mode);

+  if (TARGET_HARD_FLOAT_ABI
+      && !is_stdarg)
+    {
+      if (CSKY_VREG_MODE_P(mode)
+          && !TARGET_SINGLE_FPU)
+        return ((CSKY_NUM_WORDS (size) + 1) / 2);
+    }
+
   return CSKY_NUM_WORDS (size);
 }

@@ -1822,12 +1853,23 @@ csky_function_arg_advance (cumulative_args_t pcum_v,
                            const function_arg_info &arg)
 {
   CUMULATIVE_ARGS *pcum = get_cumulative_args (pcum_v);
-  int param_size = csky_num_arg_regs (arg.mode, arg.type);
+  int *reg = &pcum->reg;
+  machine_mode mode = arg.mode;
+
+  int param_size = csky_num_arg_regs (mode, arg.type, pcum->is_stdarg);
+  int param_regs_nums = CSKY_NPARM_REGS;
+
+  if (FUNCTION_VARG_MODE_P(mode)
+      && !pcum->is_stdarg)
+    {
+      reg = &pcum->freg;
+      param_regs_nums = CSKY_NPARM_FREGS;
+    }

-  if (*pcum + param_size > CSKY_NPARM_REGS)
-    *pcum = CSKY_NPARM_REGS;
+  if (*reg + param_size > param_regs_nums)
+    *reg = param_regs_nums;
   else
-    *pcum += param_size;
+    *reg += param_size;
 }


@@ -1843,6 +1885,12 @@ csky_function_value (const_tree type, const_tree func,
   mode = TYPE_MODE (type);
   size
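The register-count arithmetic added to csky_num_arg_regs above can be restated as a standalone sketch. This is only an illustration under stated assumptions: the TOY_ names are hypothetical, the word size is assumed to be 4 bytes, TOY_NUM_WORDS is assumed to round a byte size up to whole words, and the target macros (TARGET_HARD_FLOAT_ABI, TARGET_SINGLE_FPU, CSKY_VREG_MODE_P) are replaced by plain boolean parameters.

```c
#include <stdbool.h>

/* Assumed word size in bytes; a stand-in for the target's
   UNITS_PER_WORD, not taken from the patch.  */
#define TOY_UNITS_PER_WORD 4

/* Round a size in bytes up to a whole number of words (assumed to
   correspond to what CSKY_NUM_WORDS does).  */
#define TOY_NUM_WORDS(size) \
  (((size) + TOY_UNITS_PER_WORD - 1) / TOY_UNITS_PER_WORD)

/* Number of argument registers needed for a value of SIZE_IN_BYTES.
   When the hard-float ABI is in use, the function is not variadic,
   the value has a floating-point register mode, and the FPU is not
   single-precision only, the word count is halved (rounding up),
   as in the quoted hunk.  */
static int
toy_num_arg_regs (int size_in_bytes, bool vreg_mode, bool hard_float_abi,
                  bool single_fpu, bool is_stdarg)
{
  if (hard_float_abi && !is_stdarg && vreg_mode && !single_fpu)
    return (TOY_NUM_WORDS (size_in_bytes) + 1) / 2;

  return TOY_NUM_WORDS (size_in_bytes);
}
```

Under these assumptions an 8-byte floating-point value needs one register with a double-precision FPU and two registers with a single-precision FPU or in a variadic function.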
Re: [PATCH] x86_64: PR rtl-optimization/92180: class_likely_spilled vs. cant_combine_insn.
On Mon, Aug 17, 2020 at 12:42 PM Roger Sayle wrote: > > > This patch catches a missed optimization opportunity where GCC currently > generates worse code than LLVM. The issue, as nicely analyzed in bugzilla, > boils down to the following three insns in combine: > > (insn 6 5 7 2 (parallel [ > (set (reg:DI 85) > (ashift:DI (reg:DI 85) > (const_int 32 [0x20]))) > (clobber (reg:CC 17 flags)) > ]) "pr92180.c":4:10 564 {*ashldi3_1} > (expr_list:REG_UNUSED (reg:CC 17 flags) > (nil))) > (insn 7 6 14 2 (parallel [ > (set (reg:DI 84) > (ior:DI (reg:DI 84) > (reg:DI 85))) > (clobber (reg:CC 17 flags)) > ]) "pr92180.c":4:10 454 {*iordi_1} > (expr_list:REG_DEAD (reg:DI 85) > (expr_list:REG_UNUSED (reg:CC 17 flags) > (nil > (insn 14 7 15 2 (set (reg/i:SI 0 ax) > (subreg:SI (reg:DI 84) 0)) "pr92180.c":5:1 67 {*movsi_internal} > (expr_list:REG_DEAD (reg:DI 84) > (nil))) > > Normally, combine/simplify-rtx would notice that insns 6 and 7 > (which update highpart bits) are unnecessary as the final insn 14 > only requires to lowpart bits. The complication is that insn 14 > sets a hard register in targetm.class_likely_spilled_p which > prevents combine from performing its simplifications, and removing > the redundant instructions. > > At first glance a fix would appear to require changes to combine, > potentially affecting code generation on all small register class > targets... An alternate (and I think clever) solution is to spot > that this problematic situation can be avoided by the backend. 
> > At RTL expansion time, the middle-end has a clear separation between > pseudos and hard registers, so the RTL initially contains: > > (insn 9 8 10 2 (set (reg:SI 86) > (subreg:SI (reg:DI 82 [ _1 ]) 0)) "pr92180.c":6:10 -1 > (nil)) > (insn 10 9 14 2 (set (reg:SI 83 [ ]) > (reg:SI 86)) "pr92180.c":6:10 -1 > (nil)) > (insn 14 10 15 2 (set (reg/i:SI 0 ax) > (reg:SI 83 [ ])) "pr92180.c":7:1 -1 > (nil)) > > which can be optimized without problems by combine; it is only the > intervening passes (initially fwprop1) that propagate computations > into sets of hard registers, and disable those opportunities. > > The solution proposed here is to have the x86 backend/recog prevent > early RTL passes composing instructions (that set likely_spilled hard > registers) that they (combine) can't simplify, until after reload. > We allow sets from pseudo registers, immediate constants and memory > accesses, but anything more complicated is performed via a temporary > pseudo. Not only does this simplify things for the register allocator, > but any remaining register-to-register moves are easily cleaned up > by the late optimization passes after reload, such as peephole2 and > cprop_hardreg. > > This patch has been tested on x86_64-pc-linux-gnu with a > "make bootstrap" and a "make -k check" with no new failures. > Ok for mainline? I think that fwprop interferes with recent change to combine, where combine won't propagate hard registers anymore. So, following that change, there is no point for fwprop to create instructions that combine won't be able to process. Alternatively, perhaps fwprop should be prevented from propagating likely_spilled hard registers? Let's ask Segher for his opinion. Uros. > > > 2020-08-17 Roger Sayle > > gcc/ChangeLog > PR rtl-optimization/92180 > * config/i386/i386.c (ix86_hardreg_mov_ok): New function to > determine whether (set DST SRC) should be allowed at this point. > * config/i386/i386-protos.h (ix86_hardreg_mov_ok): Prototype here. 
> * config/i386/i386-expand.c (ix86_expand_move): Check whether > this is a complex set of a likely spilled hard register, and if > so place the value in a pseudo, and load the hard reg from it. > * config/i386/i386.md (*movdi_internal, *movsi_internal, > *movhi_internal, *movqi_internal): Make these instructions > conditional on ix86_hardreg_mov_ok. > (*lea): Make this define_insn_and_split conditional on > ix86_hardreg_mov_ok. > > gcc/testsuite/ChangeLog > PR rtl-optimization/92180 > * gcc.target/i386/pr92180.c: New test. > > > Thanks in advance, > Roger > -- > Roger Sayle > NextMove Software > Cambridge, UK >
[PATCH] x86_64: PR rtl-optimization/92180: class_likely_spilled vs. cant_combine_insn.
This patch catches a missed optimization opportunity where GCC currently generates worse code than LLVM. The issue, as nicely analyzed in bugzilla, boils down to the following three insns in combine:

(insn 6 5 7 2 (parallel [
            (set (reg:DI 85)
                (ashift:DI (reg:DI 85)
                    (const_int 32 [0x20])))
            (clobber (reg:CC 17 flags))
        ]) "pr92180.c":4:10 564 {*ashldi3_1}
     (expr_list:REG_UNUSED (reg:CC 17 flags)
        (nil)))
(insn 7 6 14 2 (parallel [
            (set (reg:DI 84)
                (ior:DI (reg:DI 84)
                    (reg:DI 85)))
            (clobber (reg:CC 17 flags))
        ]) "pr92180.c":4:10 454 {*iordi_1}
     (expr_list:REG_DEAD (reg:DI 85)
        (expr_list:REG_UNUSED (reg:CC 17 flags)
            (nil))))
(insn 14 7 15 2 (set (reg/i:SI 0 ax)
        (subreg:SI (reg:DI 84) 0)) "pr92180.c":5:1 67 {*movsi_internal}
     (expr_list:REG_DEAD (reg:DI 84)
        (nil)))

Normally, combine/simplify-rtx would notice that insns 6 and 7 (which update highpart bits) are unnecessary as the final insn 14 only requires the lowpart bits. The complication is that insn 14 sets a hard register in targetm.class_likely_spilled_p which prevents combine from performing its simplifications, and removing the redundant instructions.

At first glance a fix would appear to require changes to combine, potentially affecting code generation on all small register class targets... An alternate (and I think clever) solution is to spot that this problematic situation can be avoided by the backend.

At RTL expansion time, the middle-end has a clear separation between pseudos and hard registers, so the RTL initially contains:

(insn 9 8 10 2 (set (reg:SI 86)
        (subreg:SI (reg:DI 82 [ _1 ]) 0)) "pr92180.c":6:10 -1
     (nil))
(insn 10 9 14 2 (set (reg:SI 83 [ ])
        (reg:SI 86)) "pr92180.c":6:10 -1
     (nil))
(insn 14 10 15 2 (set (reg/i:SI 0 ax)
        (reg:SI 83 [ ])) "pr92180.c":7:1 -1
     (nil))

which can be optimized without problems by combine; it is only the intervening passes (initially fwprop1) that propagate computations into sets of hard registers, and disable those opportunities.

The solution proposed here is to have the x86 backend/recog prevent early RTL passes composing instructions (that set likely_spilled hard registers) that they (combine) can't simplify, until after reload. We allow sets from pseudo registers, immediate constants and memory accesses, but anything more complicated is performed via a temporary pseudo. Not only does this simplify things for the register allocator, but any remaining register-to-register moves are easily cleaned up by the late optimization passes after reload, such as peephole2 and cprop_hardreg.

This patch has been tested on x86_64-pc-linux-gnu with a "make bootstrap" and a "make -k check" with no new failures. Ok for mainline?

2020-08-17 Roger Sayle

gcc/ChangeLog
    PR rtl-optimization/92180
    * config/i386/i386.c (ix86_hardreg_mov_ok): New function to
    determine whether (set DST SRC) should be allowed at this point.
    * config/i386/i386-protos.h (ix86_hardreg_mov_ok): Prototype here.
    * config/i386/i386-expand.c (ix86_expand_move): Check whether
    this is a complex set of a likely spilled hard register, and if
    so place the value in a pseudo, and load the hard reg from it.
    * config/i386/i386.md (*movdi_internal, *movsi_internal,
    *movhi_internal, *movqi_internal): Make these instructions
    conditional on ix86_hardreg_mov_ok.
    (*lea): Make this define_insn_and_split conditional on
    ix86_hardreg_mov_ok.

gcc/testsuite/ChangeLog
    PR rtl-optimization/92180
    * gcc.target/i386/pr92180.c: New test.

Thanks in advance,
Roger
--
Roger Sayle
NextMove Software
Cambridge, UK

diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
index f441ba9..e6e4433 100644
--- a/gcc/config/i386/i386-expand.c
+++ b/gcc/config/i386/i386-expand.c
@@ -190,6 +190,17 @@ ix86_expand_move (machine_mode mode, rtx operands[])
   op0 = operands[0];
   op1 = operands[1];

+  /* Avoid complex sets of likely spilled hard registers before reload. */
+  if (!ix86_hardreg_mov_ok (op0, op1))
+    {
+      tmp = gen_reg_rtx (mode);
+      operands[0] = tmp;
+      ix86_expand_move (mode, operands);
+      operands[0] = op0;
+      operands[1] = tmp;
+      op1 = tmp;
+    }
+
   switch (GET_CODE (op1))
     {
     case CONST:
diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
index b6088f2..a10bc56 100644
--- a/gcc/config/i386/i386-protos.h
+++ b/gcc/config/i386/i386-protos.h
@@ -161,6 +161,7 @@ extern rtx ix86_find_base_term (rtx);
 extern bool ix86_check_movabs (rtx, int);
 extern bool ix86_check_no_addr_space (rtx);
 extern void ix86_split_idivmod (machine_mode, rtx[], bool);
+extern bool ix86_hardreg_mov_ok (rtx, rtx);
 extern rtx assign_386_stack_local (machine_mode, enum
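The redundancy described in this thread (insns 6 and 7 only change the upper 32 bits, while insn 14 reads only the lower 32) can be checked at the C level. The sketch below is not the pr92180.c test case, which is not shown in the thread; the function names are hypothetical and simply mirror the three insns on 64-bit values.

```c
#include <stdint.h>

/* Mirrors the three insns: shift the second value into the high
   half, OR it into the first, then take the low 32 bits.  */
static uint32_t
low_after_high_update (uint64_t r84, uint64_t r85)
{
  r85 <<= 32;              /* insn 6: ashift by 32 */
  r84 |= r85;              /* insn 7: ior */
  return (uint32_t) r84;   /* insn 14: lowpart subreg */
}

/* What remains once the two high-part updates are recognised as
   dead: only the low 32 bits of the first value matter.  */
static uint32_t
low_directly (uint64_t r84)
{
  return (uint32_t) r84;
}
```

Both functions return the same value for every input, because a 64-bit value shifted left by 32 has all of its low 32 bits clear.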
Re: [PATCH 4/4][PR target/88808]Enable bitwise operator for AVX512 masks.
On Fri, Aug 14, 2020 at 10:26 AM Hongtao Liu wrote: > > Enable operator or/xor/and/andn/not for mask register, kxnor is not > enabled since there's no corresponding instruction for general > registers. > > gcc/ > PR target/88808 > * config/i386/i386.md: (*movsi_internal): Adjust constraints > for mask registers. > (*movhi_internal): Ditto. > (*movqi_internal): Ditto. > (*anddi_1): Support mask register operations > (*and_1): Ditto. > (*andqi_1): Ditto. > (*andn_1): Ditto. > (*_1): Ditto. > (*qi_1): Ditto. > (*one_cmpl2_1): Ditto. > (*one_cmplsi2_1_zext): Ditto. > (*one_cmplqi2_1): Ditto. > > gcc/testsuite/ > * gcc.target/i386/bitwise_mask_op-1.c: New test. > * gcc.target/i386/bitwise_mask_op-2.c: New test. > * gcc.target/i386/avx512bw-kunpckwd-1.c: Adjust testcase. > * gcc.target/i386/avx512bw-kunpckwd-3.c: Ditto. > * gcc.target/i386/avx512dq-kmovb-5.c: Ditto. > * gcc.target/i386/avx512f-kmovw-5.c: Ditto. index 74d207c3711..e8ad79d1b0a 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -2294,7 +2294,7 @@ (define_insn "*movsi_internal" [(set (match_operand:SI 0 "nonimmediate_operand" -"=r,m ,*y,*y,?*y,?m,?r,?*y,*v,*v,*v,m ,?r,?*v,*k,*k ,*rm,*k") +"=r,m ,*y,*y,?*y,?m,?r,?*y,*v,*v,*v,m ,?r,?*v,*k,*k ,*rm,k") (match_operand:SI 1 "general_operand" "g ,re,C ,*y,m ,*y,*y,r ,C ,*v,m ,*v,*v,r ,*r,*km,*k ,CBC"))] "!(MEM_P (operands[0]) && MEM_P (operands[1]))" I'd rather see *k everywhere, also with *movqi_internal and *movhi_internal patterns. The "*" means that the allocator won't allocate a mask register by default, but it will be used to optimize moves. With the above change, you are risking that during integer register pressure, the register allocator will allocate zero to a mask register, and later "optimize" the move with a direct maskreg-intreg move. The current strategy is that only general registers get allocated for integer modes. Let's keep it this way for now. Otherwise, the patchset LGTM, but please test the suggested changes and repost. 
BTW: Do you plan to remove mask operations from sse.md? ATM, they are used to distinguish mask operations, generated from builtins from generic operations, so I'd like to keep them for a while. The drawback is, that they are not combined with other operations, but at the end of the day, this is what the programmer asked for by using builtins. Uros.
Re: PING: Fwd: [PATCH 2/2] Decouple adjust_range_from_scev from vr_values and value_range_equiv.
On 8/14/20 7:16 PM, Andrew MacLeod wrote: On 8/14/20 12:05 PM, Aldy Hernandez wrote: I made some minor changes to the function comments. gcc/ChangeLog: * vr-values.c (check_for_binary_op_overflow): Change type of store to range_query. (vr_values::adjust_range_with_scev): Abstract most of the code... (range_of_var_in_loop): ...here. Remove value_range_equiv uses. (simplify_using_ranges::simplify_using_ranges): Change type of store to range_query. * vr-values.h (class range_query): New. (class simplify_using_ranges): Use range_query. (class vr_values): Add OVERRIDE to get_value_range. (range_of_var_in_loop): New. --- gcc/vr-values.c | 150 ++-- gcc/vr-values.h | 23 ++-- 2 files changed, 88 insertions(+), 85 deletions(-) diff --git a/gcc/vr-values.c b/gcc/vr-values.c index 9002d87c14b..5b7bae3bfb7 100644 --- a/gcc/vr-values.c +++ b/gcc/vr-values.c @@ -1004,7 +1004,7 @@ vr_values::extract_range_from_comparison (value_range_equiv *vr, overflow. */ static bool -check_for_binary_op_overflow (vr_values *store, +check_for_binary_op_overflow (range_query *store, enum tree_code subcode, tree type, tree op0, tree op1, bool *ovf) { @@ -1737,22 +1737,18 @@ compare_range_with_value (enum tree_code comp, const value_range *vr, gcc_unreachable (); } -/* Given a range VR, a LOOP and a variable VAR, determine whether it - would be profitable to adjust VR using scalar evolution information - for VAR. If so, update VR with the new limits. */ + +/* Given a VAR in STMT within LOOP, determine the range of the + variable and store it in VR. If no range can be determined, the + resulting range will be set to VARYING. */ void -vr_values::adjust_range_with_scev (value_range_equiv *vr, class loop *loop, - gimple *stmt, tree var) +range_of_var_in_loop (irange *vr, range_query *query, + class loop *loop, gimple *stmt, tree var) { - tree init, step, chrec, tmin, tmax, min, max, type, tem; + tree init, step, chrec, tmin, tmax, min, max, type; enum ev_direction dir; - /* TODO. 
Don't adjust anti-ranges. An anti-range may provide
-     better opportunities than a regular range, but I'm not sure. */
-  if (vr->kind () == VR_ANTI_RANGE)
-    return;
-

IIUC, you've switched to using the new API, so the bounds calls will basically turn an ANTI range into a varying, making [lbound, ubound] be [MIN, MAX]? So it's effectively a no-op, except we will not punt on getting a range when VR is an anti range anymore... so that's goodness...

Yes.

   chrec = instantiate_parameters (loop, analyze_scalar_evolution (loop, var));

   /* Like in PR19590, scev can return a constant function. */
@@ -1763,16 +1759,17 @@ vr_values::adjust_range_with_scev (value_range_equiv *vr, class loop *loop,
     }

   if (TREE_CODE (chrec) != POLYNOMIAL_CHREC)
-    return;
+    {
+      vr->set_varying (TREE_TYPE (var));
+      return;
+    }

I'm seeing a lot of this pattern... Maybe we should set vr to varying upon entry to the function as the default return value, then we can just return like it did before in all those places.

Better yet, since this routine doesn't "update" anymore and simply returns a range, maybe it could instead return a boolean if it finds a range rather than the current behaviour... then those simply become

+    return false;

We won't have to intersect at the caller if we don't need to, and it's useful information at other points to know a range was calculated without having to see if varying_p () came back from the call.

i.e., the usage pattern would then be

  value_range_equiv r;
  if (range_of_var_in_loop (&r, this, loop, stmt, var))
    vr->intersect (&r);

This is the pattern we use throughout the ranger.

Done.

init = initial_condition_in_loop_num (chrec, loop->num); - tem = op_with_constant_singleton_value_range (init); - if (tem) - init = tem; + if (TREE_CODE (init) == SSA_NAME) + query->get_value_range (init, stmt)->singleton_p (); step = evolution_part_in_loop_num (chrec, loop->num); - tem = op_with_constant_singleton_value_range (step); - if (tem) - step = tem; + if (TREE_CODE (step) == SSA_NAME) + query->get_value_range (step, stmt)->singleton_p (); If I read this correctly, we get values for init and step... and if they are SSA_NAMES, then we query ranges, otherwise use what we got back.. So that would seem to be the same behaviour as before then.. Perhaps a comment is warranted? I had to read it a few times :-) Indeed. I am trying to do too much in one line. I've added a comment. /* If STEP is symbolic, we can't know whether INIT will be the minimum or maximum value in the range. Also, unless INIT is @@ -1781,7 +1778,10 @@ vr_values::adjust_range_with_scev (value_range_equiv *vr, class loop *loop, if (step == NULL_TREE ||
Re: [PATCH 3/4][PR target/88808]Enable bitwise operator for AVX512 masks.
On Fri, Aug 14, 2020 at 10:25 AM Hongtao Liu wrote: > > 1. Set cost of movement inside mask registers a bit higher than gpr's. > 2. Set cost of movement between mask register and gpr much higher than > movement >inside gpr, but still less equal than load/store. > 3. Set cost of mask register load/store a bit higher than gpr load/store. I have no comment here (fine tuning costs is a painful task ;) ) Uros.
Re: [PATCH 2/4][PR target/88808]Enable bitwise operator for AVX512 masks.
On Fri, Aug 14, 2020 at 10:24 AM Hongtao Liu wrote: > > Enable direct move between masks and gprs in pass_reload with > consideration of cost model. > > Changelog > gcc/ > * config/i386/i386.c (inline_secondary_memory_needed): > No memory is needed between mask regs and gpr. > (ix86_hard_regno_mode_ok): Add condition TARGET_AVX512F for > mask regno. > * config/i386/i386.h (enum reg_class): Add INT_MASK_REGS. > (REG_CLASS_NAMES): Ditto. > (REG_CLASS_CONTENTS): Ditto. > * config/i386/i386.md: Exclude mask register in > define_peephole2 which is available only for gpr. > > gcc/testsuites/ > * gcc.target/i386/pr71453-1.c: New tests. > * gcc.target/i386/pr71453-2.c: Ditto. > * gcc.target/i386/pr71453-3.c: Ditto. > * gcc.target/i386/pr71453-4.c: Ditto. @@ -18571,9 +18571,7 @@ inline_secondary_memory_needed (machine_mode mode, reg_class_t class1, || MAYBE_SSE_CLASS_P (class1) != SSE_CLASS_P (class1) || MAYBE_SSE_CLASS_P (class2) != SSE_CLASS_P (class2) || MAYBE_MMX_CLASS_P (class1) != MMX_CLASS_P (class1) - || MAYBE_MMX_CLASS_P (class2) != MMX_CLASS_P (class2) - || MAYBE_MASK_CLASS_P (class1) != MASK_CLASS_P (class1) - || MAYBE_MASK_CLASS_P (class2) != MASK_CLASS_P (class2)) + || MAYBE_MMX_CLASS_P (class2) != MMX_CLASS_P (class2)) { gcc_assert (!strict || lra_in_progress); return true; No, this is still needed, the reason is explained in the comment above inline_secondary_memory_needed: The function can't work reliably when one of the CLASSES is a class containing registers from multiple sets. We avoid this by never combining different sets in a single alternative in the machine description. Ensure that this constraint holds to avoid unexpected surprises. 
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index b24a4557871..74d207c3711 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -15051,7 +15051,7 @@ (parallel [(set (reg:CC FLAGS_REG) (unspec:CC [(match_dup 0)] UNSPEC_PARITY)) (clobber (match_dup 0))])] - "" + "!MASK_REGNO_P (REGNO (operands[0]))" [(set (reg:CC FLAGS_REG) (unspec:CC [(match_dup 1)] UNSPEC_PARITY))]) @@ -15072,6 +15072,7 @@ (label_ref (match_operand 5)) (pc)))] "REGNO (operands[2]) == REGNO (operands[3]) + && !MASK_REGNO_P (REGNO (operands[1])) && peep2_reg_dead_p (3, operands[0]) && peep2_reg_dead_p (3, operands[2]) && peep2_regno_dead_p (4, FLAGS_REG)" Actually, there are several (historic?) peephole2 patterns that assume register_operand means only integer registers. Just change register_operand to general_reg_operand and eventually nonimmediate_operand to nonimmediate_gr_operand. Do not put additional predicates into insn predicate. Uros.
Re: [PATCH 1/4][PR target/88808]Enable bitwise operator for AVX512 masks.
On Fri, Aug 14, 2020 at 10:22 AM Hongtao Liu wrote:
>
> Hi:
>   First, since avx512 masks involve both the vector ISA and the general part,
> I have added both maintainers to the mailing list.
>
>   I'm doing this in 4 steps:
> 1 - Add cost model for operation of mask registers.
> 2 - Introduce new cover class INT_MASK_REGS, this will enable direct
> move between gpr and mask registers in pass_reload by consideration of
> cost model, this is similar to INT_SSE_REGS.
> 3 - Tune cost model.
> 4 - Enable operator or/xor/and/andn/not for mask register. kxnor is
> not enabled since there's no corresponding instruction for general
> registers, 64bit mask op is not enabled for 32bit target.
> kadd/kshift/ktest are not merged into general versions add/ashl/test
> since I think it would be odd to use mask register for those
> operations.
>
> Bootstrap is ok, regression test is ok for i386/x86-64 result.
> There's some improvement for performance of SPEC2017 tested on SKL,
> I observe there are many spills from integer to mask registers instead
> of memory which is the reason for the improvement.

+  if (MASK_CLASS_P (regclass))
+    {
+      int index;
+      switch (GET_MODE_SIZE (mode))
+        {
+        case 1:
+          index = 0;
+          break;
+        case 2:
+          index = 1;
+          break;
+        default:
+          index = 3;

Max index = 2!

+          break;
+        }
+
+      if (in == 2)
+        return MAX (ix86_cost->hard_register.mask_load[index],
+                    ix86_cost->hard_register.mask_store[index]);
+      return in ? ix86_cost->hard_register.mask_load[2]
+                : ix86_cost->hard_register.mask_store[2];
+    }

Are DImode loads and stores assumed to cost the same as SImode? A comment would be nice here.

Uros.
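The "Max index = 2!" remark implies that the mask cost arrays have three entries, so the default case must not select index 3. A minimal sketch of a bounded mapping follows. The names and cost values are hypothetical, the three-entry table size is inferred from the review comment rather than shown in the thread, and letting every size above 2 bytes share the last entry is an assumption (the review itself asks whether DImode should cost the same as SImode).

```c
#include <stddef.h>

#define TOY_MASK_COST_ENTRIES 3

/* Made-up load costs for 1-byte, 2-byte and wider mask moves.  */
static const int toy_mask_load[TOY_MASK_COST_ENTRIES] = { 4, 4, 6 };

/* Map a mode size in bytes to an index into the cost table.  Sizes
   of 4 bytes and above all share the last entry, which is an
   assumption of this sketch.  */
static int
toy_mask_cost_index (size_t mode_size_in_bytes)
{
  switch (mode_size_in_bytes)
    {
    case 1:
      return 0;
    case 2:
      return 1;
    default:
      return TOY_MASK_COST_ENTRIES - 1;
    }
}

/* Look up the load cost; the index is always within the table.  */
static int
toy_mask_load_cost (size_t mode_size_in_bytes)
{
  return toy_mask_load[toy_mask_cost_index (mode_size_in_bytes)];
}
```

Deriving the default index from the table size keeps the mapping in bounds if the table is later extended with a separate entry for 8-byte masks.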
Re: PING: Fwd: [PATCH 1/2] Add statement context to get_value_range.
On 8/14/20 6:03 PM, Andrew MacLeod wrote:
On 8/11/20 7:53 AM, Aldy Hernandez via Gcc-patches wrote:

-- Forwarded message -
From: Aldy Hernandez
Date: Tue, Aug 4, 2020, 13:55
Subject: [PATCH 1/2] Add statement context to get_value_range.
To:
Cc: , Aldy Hernandez

This is in line with the statement context that we have for get_value() in the substitute_and_fold_engine class.
---
 gcc/vr-values.c | 64 ++---
 gcc/vr-values.h | 14 +--
 2 files changed, 41 insertions(+), 37 deletions(-)

diff --git a/gcc/vr-values.c b/gcc/vr-values.c
index 511342f2f13..9002d87c14b 100644
--- a/gcc/vr-values.c
+++ b/gcc/vr-values.c
@@ -147,7 +147,8 @@ vr_values::get_lattice_entry (const_tree var)
    return NULL. Otherwise create an empty range if none existed for VAR. */

 const value_range_equiv *
-vr_values::get_value_range (const_tree var)
+vr_values::get_value_range (const_tree var,
+                            gimple *stmt ATTRIBUTE_UNUSED)
 {
   /* If we have no recorded ranges, then return NULL. */
   if (!vr_value)
@@ -450,7 +451,7 @@ simplify_using_ranges::op_with_boolean_value_range_p (tree op)

   /* ?? Errr, this should probably check for [0,0] and [1,1] as well
      as [0,1]. */
-  const value_range *vr = get_value_range (op);
+  const value_range *vr = get_value_range (op, NULL);
   return *vr == value_range (build_zero_cst (TREE_TYPE (op)),
                              build_one_cst (TREE_TYPE (op)));
 }

I think if we are adding "gimple *stmt" as a parameter, we should make it default to NULL... Then we won't have to change all the callers that don't have a need for it. I get that it helped us find all the places where stmts were available/needed originally, but I think that need is no longer relevant and we can revert to making it a default parameter now.

Done.

Furthermore, I don't think it should be ATTRIBUTE_UNUSED, and then pass a NULL further down :) We should be able to pass stmt.

@@ -972,12 +973,13 @@ vr_values::extract_range_from_cond_expr (value_range_equiv *vr, gassign *stmt)

 void
 vr_values::extract_range_from_comparison (value_range_equiv *vr,
+                                          gimple *stmt,
                                           enum tree_code code,
                                           tree type, tree op0, tree op1)

Now that we are passing stmt in, and there is only one use of this function, I think you can kill the final 4 parameters and just get them in the function itself...

Done.

 {
   bool sop;
   tree val
-    = simplifier.vrp_evaluate_conditional_warnv_with_ops (code, op0, op1,
+    = simplifier.vrp_evaluate_conditional_warnv_with_ops (stmt, code, op0, op1,
                                                           false, &sop, NULL);
   if (val)
     {
@@ -1008,14 +1010,14 @@ check_for_binary_op_overflow (vr_values *store,
 {
   value_range vr0, vr1;
   if (TREE_CODE (op0) == SSA_NAME)
-    vr0 = *store->get_value_range (op0);
+    vr0 = *store->get_value_range (op0, NULL);
   else if (TREE_CODE (op0) == INTEGER_CST)
     vr0.set (op0);
   else
     vr0.set_varying (TREE_TYPE (op0));

   if (TREE_CODE (op1) == SSA_NAME)
-    vr1 = *store->get_value_range (op1);
+    vr1 = *store->get_value_range (op1, NULL);
   else if (TREE_CODE (op1) == INTEGER_CST)
     vr1.set (op1);
   else
@@ -1472,7 +1474,7 @@ vr_values::extract_range_from_assignment (value_range_equiv *vr, gassign *stmt)
   else if (code == COND_EXPR)
     extract_range_from_cond_expr (vr, stmt);
   else if (TREE_CODE_CLASS (code) == tcc_comparison)
-    extract_range_from_comparison (vr, gimple_assign_rhs_code (stmt),
+    extract_range_from_comparison (vr, stmt, gimple_assign_rhs_code (stmt),
                                    gimple_expr_type (stmt),
                                    gimple_assign_rhs1 (stmt),
                                    gimple_assign_rhs2 (stmt));
@@ -1805,7 +1807,7 @@ vr_values::adjust_range_with_scev (value_range_equiv *vr, class loop *loop,
   if (TREE_CODE (step) == INTEGER_CST
       && is_gimple_val (init)
       && (TREE_CODE (init) != SSA_NAME
-          || get_value_range (init)->kind () == VR_RANGE))
+          || get_value_range (init, stmt)->kind () == VR_RANGE))
     {
       widest_int nit;

@@ -1838,7 +1840,7 @@ vr_values::adjust_range_with_scev (value_range_equiv *vr, class loop *loop,
           value_range initvr;

           if (TREE_CODE (init) ==
SSA_NAME) - initvr = *(get_value_range (init)); + initvr = *(get_value_range (init, stmt)); else if (is_gimple_min_invariant (init)) initvr.set (init); else @@ -2090,7 +2092,7 @@ const value_range_equiv * simplify_using_ranges::get_vr_for_comparison (int i, value_range_equiv *tem) { /* Shallow-copy equiv bitmap. */ - const value_range_equiv *vr =
Re: [PATCH] openmp: fix UBSAN error at gcc/fortran/openmp.c:4737
On 8/17/20 10:52 AM, Tobias Burnus wrote:
> LGTM & thanks! – Sorry for missing it.

That happens.

> (I re-checked against the OMP_LIST_* enum and it seems to be only
> missing one.)

Good.

I'm suggesting one more clean-up that uses a static assert instead of a run-time check. Thoughts?

Martin

From c9aee2c44d5cf7e417d381988b2f4900e9ea8b05 Mon Sep 17 00:00:00 2001
From: Martin Liska
Date: Mon, 17 Aug 2020 11:14:13 +0200
Subject: [PATCH] openmp: add static assert for clause_names.

gcc/fortran/ChangeLog:

        * openmp.c (resolve_omp_clauses): Add static assert
        for OMP_LIST_NUM and size of clause_names array.
        Remove check that is always true.
---
 gcc/fortran/openmp.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index 60d8e5573c2..4d33a450a33 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -4371,6 +4371,7 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
        "TO", "FROM", "REDUCTION", "DEVICE_RESIDENT", "LINK", "USE_DEVICE",
        "CACHE", "IS_DEVICE_PTR", "USE_DEVICE_PTR", "USE_DEVICE_ADDR",
        "NONTEMPORAL" };
+  STATIC_ASSERT (ARRAY_SIZE (clause_names) == OMP_LIST_NUM);
 
   if (omp_clauses == NULL)
     return;
@@ -4732,12 +4733,7 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
   for (list = 0; list < OMP_LIST_NUM; list++)
     if ((n = omp_clauses->lists[list]) != NULL)
       {
-       const char *name;
-
-       if (list < OMP_LIST_NUM)
-         name = clause_names[list];
-       else
-         gcc_unreachable ();
+       const char *name = clause_names[list];
 
        switch (list)
          {
-- 
2.28.0
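For readers outside GCC, the idea of tying a name table to its enum at compile time can be sketched in plain C11. This is only an illustration: it uses the standard `_Static_assert` as a stand-in for GCC's own `STATIC_ASSERT` macro, and the enum `toy_list`, the array `toy_clause_names` and the helper `toy_clause_name` are hypothetical names, not the ones in openmp.c.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_ARRAY_SIZE(a) (sizeof (a) / sizeof ((a)[0]))

/* Hypothetical, shortened version of an OMP_LIST_* style enum.  The last
   enumerator counts the real entries.  */
enum toy_list
{
  TOY_LIST_PRIVATE,
  TOY_LIST_SHARED,
  TOY_LIST_NONTEMPORAL,
  TOY_LIST_NUM
};

/* One name per enumerator, in the same order.  */
static const char *const toy_clause_names[]
  = { "PRIVATE", "SHARED", "NONTEMPORAL" };

/* Compilation fails if someone adds an enumerator without adding a name,
   which is the class of bug the UBSAN report exposed at run time.  */
_Static_assert (TOY_ARRAY_SIZE (toy_clause_names) == TOY_LIST_NUM,
                "toy_clause_names must have one entry per toy_list value");

/* Return the printable name for LIST.  */
static const char *
toy_clause_name (enum toy_list list)
{
  assert (list < TOY_LIST_NUM);
  return toy_clause_names[list];
}
```

Removing "NONTEMPORAL" from the array in this sketch makes the translation unit fail to compile instead of reading past the end of the array.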
RE: [PATCH] driver: Fix several memory leaks
Ping^2. > -Original Message- > From: Gcc-patches On Behalf Of Alex > Coplan > Sent: 03 August 2020 16:02 > To: gcc-patches@gcc.gnu.org > Subject: RE: [PATCH] driver: Fix several memory leaks > > Ping. > > > -Original Message- > > From: Jit On Behalf Of Alex Coplan > > Sent: 14 July 2020 10:08 > > To: gcc-patches@gcc.gnu.org; j...@gcc.gnu.org > > Subject: [PATCH] driver: Fix several memory leaks > > > > Updating the subject since this is really just a driver change (and > > therefore needs a review from those who can approve patches there). > > > > Thanks, > > Alex > > > > -Original Message- > > From: Jit On Behalf Of Alex Coplan > > Sent: 09 July 2020 21:13 > > To: gcc-patches@gcc.gnu.org; j...@gcc.gnu.org > > Cc: nd > > Subject: [PATCH] libgccjit: Fix several memory leaks in the driver > > > > Hello, > > > > This patch fixes several memory leaks in the driver, all of which > relate > > to the handling of static specs. We introduce functions > > set_static_spec_{shared,owned}() which are used to enforce proper > memory > > management when updating the strings in the static_specs table. > > > > This is achieved by making use of the alloc_p field in the table > > entries. Similarly to set_spec(), each time we update an entry, we > check > > whether alloc_p is set, and free the old value if so. We then set > > alloc_p correctly based on whether we "own" this memory or whether > we're > > just taking a pointer to a shared string which we shouldn't free. > > > > The following table shows the number of leaks found by AddressSanitizer > > when running a minimal libgccjit program on AArch64. The test program > > does the whole libgccjit compilation cycle in a loop (including > acquiring > > and releasing the context), and the table below shows the number of > leaks > > for different iterations of that loop. 
> >
> > +--------------+-----+-----+------+---------------+
> > | # of runs    | 1   | 2   | 3    | Leaks per run |
> > +--------------+-----+-----+------+---------------+
> > | Before patch | 463 | 940 | 1417 | 477           |
> > +--------------+-----+-----+------+---------------+
> > | After patch  | 416 | 846 | 1276 | 430           |
> > +--------------+-----+-----+------+---------------+
> >
> > Ensuring that we minimize "leaks per run" (ultimately eliminating all of
> > them) is important in order for long-running applications to be able to
> > make use of in-process libgccjit.
> >
> > Testing:
> > * Bootstrap and regtest on aarch64-linux-gnu, x86_64-linux-gnu.
> > * Bootstrap and regtest on aarch64-linux-gnu with bootstrap-asan config.
> > * Smoke test of libgccjit, ran regressions on a --disable-bootstrap
> >   build on aarch64-linux-gnu.
> >
> > OK for master?
> >
> > Thanks,
> > Alex
> >
> > ---
> >
> > gcc/ChangeLog:
> >
> > 2020-07-09  Alex Coplan
> >
> >         * gcc.c (set_static_spec): New.
> >         (set_static_spec_owned): New.
> >         (set_static_spec_shared): New.
> >         (driver::maybe_putenv_COLLECT_LTO_WRAPPER): Use
> >         set_static_spec_owned() to take ownership of lto_wrapper_file
> >         such that it gets freed in driver::finalize.
> >         (driver::maybe_run_linker): Use set_static_spec_shared() to
> >         ensure that we don't try and free() the static string "ld",
> >         also ensuring that any previously-allocated string in
> >         linker_name_spec is freed. Likewise with argv0.
> >         (driver::finalize): Use set_static_spec_shared() when resetting
> >         specs that previously had allocated strings; remove if(0)
> >         around call to free().
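The owned/shared distinction described in the quoted patch can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from gcc.c: the struct `toy_spec` and the functions `toy_set_spec`, `toy_set_spec_owned` and `toy_set_spec_shared` are hypothetical names, and only the `alloc_p` idea (free the old value when we own it, then record whether we own the new one) is taken from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical table entry: a string plus a flag saying whether the
   entry owns (and must eventually free) that string.  */
struct toy_spec
{
  const char *value;
  bool alloc_p;
};

/* Replace the stored string.  If the old value was owned, release it
   first; then record whether the new value is owned.  */
static void
toy_set_spec (struct toy_spec *spec, const char *value, bool alloc_p)
{
  assert (spec != NULL);
  if (spec->alloc_p)
    free ((void *) spec->value);
  spec->value = value;
  spec->alloc_p = alloc_p;
}

/* Take ownership of a heap-allocated string.  */
static void
toy_set_spec_owned (struct toy_spec *spec, char *value)
{
  toy_set_spec (spec, value, true);
}

/* Point at a string owned elsewhere (for example a literal).  */
static void
toy_set_spec_shared (struct toy_spec *spec, const char *value)
{
  toy_set_spec (spec, value, false);
}
```

In this sketch, resetting an entry to a literal such as "ld" through the shared setter frees any previously owned string and guarantees that the literal itself is never passed to free().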
[PATCH] aarch64: Don't generate invalid zero/sign-extend syntax
Hello,

Given the following C function:

double *f(double *p, unsigned x)
{
    return p + x;
}

prior to this patch, GCC at -O2 would generate:

f:
        add     x0, x0, x1, uxtw 3
        ret

but this add instruction uses architecturally-invalid syntax: the width of the third operand conflicts with the width of the extension specifier. The third operand is only permitted to be an x register when the extension specifier is (u|s)xtx.

This instruction, and analogous insns for adds, sub, subs, and cmp, are rejected by clang, but accepted by binutils. Assembling and disassembling such an insn with binutils gives the architecturally-valid version in the disassembly:

   0:   8b214c00        add     x0, x0, w1, uxtw #3

This patch fixes several patterns in the AArch64 backend to use the standard syntax as specified in the Arm ARM such that GCC's output can be assembled by assemblers other than GAS.

Note that an obvious omission here is that this patch does not touch the mult patterns such as *add__mult_. I found that I couldn't hit these patterns with C code since multiplications by powers of two always get turned into shifts by earlier RTL passes. If there's a way to reliably hit these patterns, then perhaps these should be updated as well.

Testing:
 * New test which checks for the correct syntax in all updated patterns
   (fails before and passes after the aarch64.md change).
 * New test can be assembled by both GAS and llvm-mc following the change.
 * Bootstrapped and regtested on aarch64-none-linux-gnu.

OK for master?

Thanks,
Alex

---

gcc/ChangeLog:

        * config/aarch64/aarch64.md (*adds__): Ensure extended operand
        agrees with width of extension specifier.
        (*subs__): Likewise.
        (*adds__shift_): Likewise.
        (*subs__shift_): Likewise.
        (*add__): Likewise.
        (*add__shft_): Likewise.
        (*add_uxt_shift2): Likewise.
        (*sub__): Likewise.
        (*sub__shft_): Likewise.
        (*sub_uxt_shift2): Likewise.
        (*cmp_swp__reg): Likewise.
        (*cmp_swp__shft_): Likewise.

gcc/testsuite/ChangeLog:

        * gcc.target/aarch64/adds3.c: Fix test w.r.t. new syntax.
* gcc.target/aarch64/cmp.c: Likewise. * gcc.target/aarch64/subs3.c: Likewise. * gcc.target/aarch64/subsp.c: Likewise. * gcc.target/aarch64/extend-syntax.c: New test. diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 9b20dd0b1a0..b1e83dfda78 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -2383,7 +2383,7 @@ (set (match_operand:GPI 0 "register_operand" "=r") (plus:GPI (ANY_EXTEND:GPI (match_dup 1)) (match_dup 2)))] "" - "adds\\t%0, %2, %1, xt" + "adds\\t%0, %2, %w1, xt" [(set_attr "type" "alus_ext")] ) @@ -2397,7 +2397,7 @@ (set (match_operand:GPI 0 "register_operand" "=r") (minus:GPI (match_dup 1) (ANY_EXTEND:GPI (match_dup 2] "" - "subs\\t%0, %1, %2, xt" + "subs\\t%0, %1, %w2, xt" [(set_attr "type" "alus_ext")] ) @@ -2415,7 +2415,7 @@ (match_dup 2)) (match_dup 3)))] "" - "adds\\t%0, %3, %1, xt %2" + "adds\\t%0, %3, %w1, xt %2" [(set_attr "type" "alus_ext")] ) @@ -2433,7 +2433,7 @@ (ashift:GPI (ANY_EXTEND:GPI (match_dup 2)) (match_dup 3] "" - "subs\\t%0, %1, %2, xt %3" + "subs\\t%0, %1, %w2, xt %3" [(set_attr "type" "alus_ext")] ) @@ -2549,7 +2549,7 @@ (plus:GPI (ANY_EXTEND:GPI (match_operand:ALLX 1 "register_operand" "r")) (match_operand:GPI 2 "register_operand" "r")))] "" - "add\\t%0, %2, %1, xt" + "add\\t%0, %2, %w1, xt" [(set_attr "type" "alu_ext")] ) @@ -2571,7 +2571,7 @@ (match_operand 2 "aarch64_imm3" "Ui3")) (match_operand:GPI 3 "register_operand" "r")))] "" - "add\\t%0, %3, %1, xt %2" + "add\\t%0, %3, %w1, xt %2" [(set_attr "type" "alu_ext")] ) @@ -2819,7 +2819,7 @@ "* operands[3] = GEN_INT (aarch64_uxt_size (INTVAL(operands[2]), INTVAL (operands[3]))); - return \"add\t%0, %4, %1, uxt%e3 %2\";" + return \"add\t%0, %4, %w1, uxt%e3 %2\";" [(set_attr "type" "alu_ext")] ) @@ -3305,7 +3305,7 @@ (ANY_EXTEND:GPI (match_operand:ALLX 2 "register_operand" "r"] "" - "sub\\t%0, %1, %2, xt" + "sub\\t%0, %1, %w2, xt" [(set_attr "type" "alu_ext")] ) @@ -3328,7 +3328,7 @@ (match_operand:ALLX 2 
"register_operand" "r")) (match_operand 3 "aarch64_imm3" "Ui3"] "" - "sub\\t%0, %1, %2, xt %3" + "sub\\t%0, %1, %w2, xt %3" [(set_attr "type" "alu_ext")] ) @@ -3607,7 +3607,7 @@ "* operands[3] = GEN_INT (aarch64_uxt_size (INTVAL (operands[2]), INTVAL (operands[3]))); - return \"sub\t%0, %4,
[PATCH] download_prerequisites: Add option --proxy
The script contrib/download_prerequisites now accepts the command line argument '--proxy', which instructs the fetcher program to use the specified proxy.

Signed-off-by: Mert Kirpici
---
 contrib/download_prerequisites | 23 ++++++++++++++++++-----
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/contrib/download_prerequisites b/contrib/download_prerequisites
index 7d0c4b5ea8d..4a297f29dd2 100755
--- a/contrib/download_prerequisites
+++ b/contrib/download_prerequisites
@@ -58,11 +58,6 @@ case $OS in
     ;;
 esac
 
-if type wget > /dev/null ; then
-  fetch='wget'
-else
-  fetch='curl -LO'
-fi
 chksum_extension='sha512'
 directory='.'
 
@@ -74,6 +69,7 @@ GCC source tree and the GCC build will do the right thing.
 The following options are available:
 
  --directory=DIR  download and unpack packages into DIR instead of '.'
+ --proxy=URL      download via specified http proxy server URL
  --force          download again overwriting existing packages
  --no-force       do not download existing packages again (default)
  --isl            download ISL, needed for Graphite loop optimizations (default)
@@ -143,6 +139,12 @@ do
         --directory=*)
             directory="${arg#--directory=}"
             ;;
+        --proxy=*)
+            proxy="${arg#--proxy=}"
+            ;;
+        --proxy)
+            argnext='proxy'
+            ;;
         --force)
             force=1
             ;;
@@ -202,6 +204,9 @@ do
         directory)
             directory="${arg}"
             ;;
+        proxy)
+            proxy="${arg}"
+            ;;
         *)
             die "The impossible has happened"
             ;;
@@ -218,6 +223,14 @@ unset arg argnext
 [ -d "${directory}" ] \
     || die "No such directory: ${directory}"
 
+if type wget > /dev/null ; then
+  fetch='wget'
+  [ -z "${proxy}" ] || fetch="${fetch} -e use_proxy=on -e http_proxy=${proxy}"
+else
+  fetch='curl -LO'
+  [ -z "${proxy}" ] || fetch="${fetch} --proxy ${proxy}"
+fi
+
 for ar in $(echo_archives)
 do
     if [ ${force} -gt 0 ]; then rm -f "${directory}/${ar}"; fi
-- 
2.20.1
Re: [PATCH] openmp: fix UBSAN error at gcc/fortran/openmp.c:4737
On 8/17/20 10:41 AM, Martin Liška wrote: Since 21cfe724cbdc30612bf1ef59b26f19ada2210832 there's a new OMP_LIST_NONTEMPORAL value, but it was missing in resolve_omp_clauses static array that is defined at the function beginning: gcc/fortran/ChangeLog: * openmp.c (resolve_omp_clauses): Add NONTEMPORAL to clause names. LGTM & thanks! – Sorry for missing it. (I re-checked against the OMP_LIST_* enum and it seems to be only missing one.) Tobias --- gcc/fortran/openmp.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c index c44a2530b88..60d8e5573c2 100644 --- a/gcc/fortran/openmp.c +++ b/gcc/fortran/openmp.c @@ -4369,7 +4369,8 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses, = { "PRIVATE", "FIRSTPRIVATE", "LASTPRIVATE", "COPYPRIVATE", "SHARED", "COPYIN", "UNIFORM", "ALIGNED", "LINEAR", "DEPEND", "MAP", "TO", "FROM", "REDUCTION", "DEVICE_RESIDENT", "LINK", "USE_DEVICE", -"CACHE", "IS_DEVICE_PTR", "USE_DEVICE_PTR", "USE_DEVICE_ADDR" }; +"CACHE", "IS_DEVICE_PTR", "USE_DEVICE_PTR", "USE_DEVICE_ADDR", +"NONTEMPORAL" }; if (omp_clauses == NULL) return; - Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter
[PATCH] openmp: fix UBSAN error at gcc/fortran/openmp.c:4737
Since 21cfe724cbdc30612bf1ef59b26f19ada2210832 there's a new OMP_LIST_NONTEMPORAL value, but it was missing in resolve_omp_clauses static array that is defined at the function beginning:

./xgcc -B. /home/marxin/Programming/gcc/gcc/testsuite/gfortran.dg/gomp/nontemporal-1.f90 -fopenmp -c
../../gcc/fortran/openmp.c:4737:28: runtime error: index 21 out of bounds for type 'char *[21]'
    #0 0xbdb956 in resolve_omp_clauses ../../gcc/fortran/openmp.c:4737
    #1 0xbeb076 in resolve_omp_do ../../gcc/fortran/openmp.c:6139
    #2 0xbf029a in gfc_resolve_omp_directive(gfc_code*, gfc_namespace*) ../../gcc/fortran/openmp.c:6792
    #3 0xcb6363 in gfc_resolve_code(gfc_code*, gfc_namespace*) ../../gcc/fortran/resolve.c:12185
    #4 0xcef8cf in resolve_codes ../../gcc/fortran/resolve.c:17303

Ready for master?

Thanks,
Martin

gcc/fortran/ChangeLog:

        * openmp.c (resolve_omp_clauses): Add NONTEMPORAL to clause names.
---
 gcc/fortran/openmp.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index c44a2530b88..60d8e5573c2 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -4369,7 +4369,8 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
     = { "PRIVATE", "FIRSTPRIVATE", "LASTPRIVATE", "COPYPRIVATE", "SHARED",
        "COPYIN", "UNIFORM", "ALIGNED", "LINEAR", "DEPEND", "MAP",
        "TO", "FROM", "REDUCTION", "DEVICE_RESIDENT", "LINK", "USE_DEVICE",
-       "CACHE", "IS_DEVICE_PTR", "USE_DEVICE_PTR", "USE_DEVICE_ADDR" };
+       "CACHE", "IS_DEVICE_PTR", "USE_DEVICE_PTR", "USE_DEVICE_ADDR",
+       "NONTEMPORAL" };
 
   if (omp_clauses == NULL)
     return;
-- 
2.28.0
[PATCH] vxworks: Fix GCC selftests for *-wrs-vxworks7-* targets
Hi, Currently when building a cross-compiler targeting arm-wrs-vxworks7, the selftests fail unless the VSB_DIR environment variable is set. The same !nostdinc condition is used for VXWORKS_ADDITIONAL_CPP_SPEC. OK for mainline? Iain. --- gcc/ChangeLog: * config/vxworks.h (STARTFILE_PREFIX_SPEC): Avoid using VSB_DIR if -nostdinc is used. --- gcc/config/vxworks.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/config/vxworks.h b/gcc/config/vxworks.h index d648d2f23cb..065c9e12b88 100644 --- a/gcc/config/vxworks.h +++ b/gcc/config/vxworks.h @@ -108,7 +108,7 @@ along with GCC; see the file COPYING3. If not see #if TARGET_VXWORKS7 #undef STARTFILE_PREFIX_SPEC -#define STARTFILE_PREFIX_SPEC "%:getenv(VSB_DIR /usr/lib/common)" +#define STARTFILE_PREFIX_SPEC "%{!nostdinc:%:getenv(VSB_DIR /usr/lib/common)}" #define TLS_SYM "-u __tls__" #else #define TLS_SYM "" -- 2.20.1
RE: [PATCH] middle-end: Recognize idioms for bswap32 and bswap64 in match.pd.
Hi Jakub and Marc, Here's version #3 of the patch to recognize bswap32 and bswap64 that now also implements Jakub's suggestion to support addition and xor in addition to bitwise ior when recognizing the union of highpart and lowpart (and two additional tests to check for these variants). This revised patch has been tested on x86_64-pc-linux-gnu with a "make bootstrap" and "make -k check" with no new failures, and confirming all four new tests pass. Ok for mainline? 2020-08-17 Roger Sayle Marc Glisse Jakub Jelinek gcc/ChangeLog * match.pd (((T)bswapX(x)<>C) -> bswapY(x)): New simplifications to recognize __builtin_bswap{32,64}. gcc/testsuite/ChangeLog * gcc.dg/fold-bswap-1.c: New test. * gcc.dg/fold-bswap-2.c: New test. * gcc.dg/fold-bswap-3.c: New test. * gcc.dg/fold-bswap-4.c: New test. Thanks in advance, Roger -- -Original Message- From: Jakub Jelinek Sent: 15 August 2020 14:26 To: Roger Sayle Cc: 'GCC Patches' ; 'Marc Glisse' Subject: Re: [PATCH] middle-end: Recognize idioms for bswap32 and bswap64 in match.pd. On Sat, Aug 15, 2020 at 11:09:17AM +0100, Roger Sayle wrote: > +/* Recognize ((T)bswap32(x)<<32)|bswap32(x>>32) as bswap64(x). */ > +(simplify > + (bit_ior:c Any reason for supporting bit_ior only? Don't plus:c or bit_xor:c work the same (i.e. use (for op (bit_ior bit_xor plus) ...)? Jakub diff --git a/gcc/match.pd b/gcc/match.pd index c3b8816..3d7a0db 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -3410,6 +3410,35 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (bswap (bitop:c (bswap @0) @1)) (bitop @0 (bswap @1) +/* Recognize ((T)bswap32(x)<<32)|bswap32(x>>32) as bswap64(x). 
*/ +(for op (bit_ior bit_xor plus) + (simplify +(op:c + (lshift (convert (BUILT_IN_BSWAP32 (convert@0 @1))) + INTEGER_CST@2) + (convert (BUILT_IN_BSWAP32 (convert@3 (rshift @1 @2) +(if (INTEGRAL_TYPE_P (type) +&& TYPE_PRECISION (type) == 64 +&& types_match (TREE_TYPE (@1), uint64_type_node) +&& types_match (TREE_TYPE (@0), uint32_type_node) +&& types_match (TREE_TYPE (@3), uint32_type_node) +&& wi::to_widest (@2) == 32) + (convert (BUILT_IN_BSWAP64 @1) + +/* Recognize ((T)bswap16(x)<<16)|bswap16(x>>16) as bswap32(x). */ +(for op (bit_ior bit_xor plus) + (simplify +(op:c + (lshift + (convert (BUILT_IN_BSWAP16 (convert (bit_and @0 INTEGER_CST@1 + (INTEGER_CST@2)) + (convert (BUILT_IN_BSWAP16 (convert (rshift @0 @2) +(if (INTEGRAL_TYPE_P (type) +&& TYPE_PRECISION (type) == 32 +&& types_match (TREE_TYPE (@0), uint32_type_node) +&& wi::to_widest (@1) == 65535 +&& wi::to_widest (@2) == 16) + (convert (BUILT_IN_BSWAP32 @0) /* Combine COND_EXPRs and VEC_COND_EXPRs. */ diff --git a/gcc/testsuite/gcc.dg/fold-bswap-1.c b/gcc/testsuite/gcc.dg/fold-bswap-1.c new file mode 100644 index 000..3abb862 --- /dev/null +++ b/gcc/testsuite/gcc.dg/fold-bswap-1.c @@ -0,0 +1,26 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-optimized" } */ + +unsigned int swap32(unsigned int x) +{ + if (sizeof(unsigned int)==4 && sizeof(unsigned short)==2) { +unsigned int a = __builtin_bswap16(x); +x >>= 16; +a <<= 16; +return __builtin_bswap16(x) | a; + } else return __builtin_bswap32(x); +} + +unsigned long swap64(unsigned long x) +{ + if (sizeof(unsigned long)==8 && sizeof(unsigned int)==4) { +unsigned long a = __builtin_bswap32(x); +x >>= 32; +a <<= 32; +return __builtin_bswap32(x) | a; + } else return __builtin_bswap64(x); +} + +/* { dg-final { scan-tree-dump-times "__builtin_bswap32" 1 "optimized" } } */ +/* { dg-final { scan-tree-dump-times "__builtin_bswap64" 1 "optimized" } } */ + diff --git a/gcc/testsuite/gcc.dg/fold-bswap-2.c b/gcc/testsuite/gcc.dg/fold-bswap-2.c new file mode 
100644 index 000..a581fd6 --- /dev/null +++ b/gcc/testsuite/gcc.dg/fold-bswap-2.c @@ -0,0 +1,26 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-optimized" } */ + +int swap32(unsigned int x) +{ + if (sizeof(int)==4 && sizeof(short)==2) { +int a = __builtin_bswap16(x); +x >>= 16; +a <<= 16; +return __builtin_bswap16(x) | a; + } else return __builtin_bswap32(x); +} + +long swap64(unsigned long x) +{ + if (sizeof(long)==8 && sizeof(int)==4) { +long a = __builtin_bswap32(x); +x >>= 32; +a <<= 32; +return __builtin_bswap32(x) | a; + } else return __builtin_bswap64(x); +} + +/* { dg-final { scan-tree-dump-times "__builtin_bswap32" 1 "optimized" } } */ +/* { dg-final { scan-tree-dump-times "__builtin_bswap64" 1 "optimized" } } */ + diff --git a/gcc/testsuite/gcc.dg/fold-bswap-3.c b/gcc/testsuite/gcc.dg/fold-bswap-3.c new file mode 100644 index 000..13bb6eb --- /dev/null +++ b/gcc/testsuite/gcc.dg/fold-bswap-3.c @@ -0,0 +1,26 @@ +/* { dg-do compile } */ +/* { dg-options "-O2
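The reason ior, xor and plus can all be accepted when joining the two swapped halves is that the halves occupy disjoint bits. The following standalone sketch shows this for the 32-bit case. It deliberately avoids the `__builtin_bswap*` functions so that it builds with any C99 compiler; `toy_swap16`, `toy_swap32` and `toy_swap32_from_halves` are hypothetical helper names, not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Portable 16-bit byte swap.  */
static uint16_t
toy_swap16 (uint16_t x)
{
  return (uint16_t) ((x << 8) | (x >> 8));
}

/* Portable 32-bit byte swap, used as the reference result.  */
static uint32_t
toy_swap32 (uint32_t x)
{
  return (x << 24) | ((x & 0xff00u) << 8) | ((x >> 8) & 0xff00u) | (x >> 24);
}

/* Build a 32-bit byte swap from two 16-bit swaps, joining the halves
   with the operator named by OP: '|', '^' or '+'.  */
static uint32_t
toy_swap32_from_halves (uint32_t x, char op)
{
  uint32_t hi = (uint32_t) toy_swap16 ((uint16_t) (x & 0xffffu)) << 16;
  uint32_t lo = toy_swap16 ((uint16_t) (x >> 16));

  /* The halves never share a set bit, so no carries or cancellations
     can occur and all three operators give the same result.  */
  assert ((hi & lo) == 0);

  switch (op)
    {
    case '^':
      return hi ^ lo;
    case '+':
      return hi + lo;
    default:
      return hi | lo;
    }
}
```

For any input, the three variants agree with the reference swap, which is the property the `(for op (bit_ior bit_xor plus) ...)` form of the pattern relies on.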
RE: [PATCH PR94442] [AArch64] Redundant ldp/stp instructions emitted at -O3
> -----Original Message-----
> From: Richard Sandiford [mailto:richard.sandif...@arm.com]
> Sent: Wednesday, August 5, 2020 12:26 AM
> To: xiezhiheng
> Cc: Richard Biener ; gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH PR94442] [AArch64] Redundant ldp/stp instructions
> emitted at -O3
>
> xiezhiheng writes:
> >> > Sorry, I should have used it. And I prepare a patch to use FLOAT_MODE_P
> >> > macro and add a flag FLAG_SUPPRESS_FP_EXCEPTIONS to suppress
> >> > FLAG_RAISE_FP_EXCEPTIONS for certain intrinsics in future.
> >>
> >> The same thing is true for reading FPCR as well, so I think the flag
> >> should suppress the FLOAT_MODE_P check, instead of fixing up the flags
> >> afterwards.
> >>
> >> I'm struggling to think of a good name though. How about adding
> >> FLAG_AUTO_FP and making the FLOAT_MODE_P check dependent on FLAG_AUTO_FP
> >> being set?
> >>
> >> We could leave FLAG_AUTO_FP out of FLAG_ALL, since FLAG_ALL already
> >> includes FLAG_FP. Including it in FLAG_ALL wouldn't do no any harm
> >> though.
> >
> > I could not think of a better name either. So I choose to use FLAG_AUTO_FP
> > to control the check of FLOAT_MODE_P finally.
> >
> > Bootstrapped and tested on aarch64 Linux platform.
>
> Thanks, pushed to master.
>
> Richard

As a first step, I have added FLAGS to some of the intrinsics in aarch64-simd-builtins.def, including all the add/sub arithmetic intrinsics.

For an intrinsic such as faddp, which only handles floating-point operations, both the FP and NONE flags are suitable, because FLAG_FP will be added later anyway if the intrinsic handles floating-point operations. I prefer FP since it is clearer.

The qadd intrinsics, however, modify the FPSR register, which is a scenario I missed before. I am considering adding an additional flag, FLAG_WRITE_FPSR, to represent it.

Bootstrapped and tested on aarch64 Linux platform.

Do you have any suggestions?

Thanks,
XieZhiheng

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 9cf1f9733e7..cde50c54d9e 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,10 @@
+2020-08-17  Zhiheng Xie
+
+        * config/aarch64/aarch64-builtins.c (aarch64_modifies_global_state_p):
+        Add flag FLAG_WRITE_FPSR to control attributes.
+        * config/aarch64/aarch64-simd-builtins.def: Add proper FLAGS
+        for intrinsic functions.
+

pr94442-v1.patch
Description: pr94442-v1.patch
[PING][PATCH 6/6] contrib: Add OPT-enable-obsolete to tile*-*-*
Ping.

On 31/05/2020 12:20, Iain Buclaw wrote:
> The tile*-*-* targets were marked as obsolete in SVN r259724.
>
> OK?
>
> Regards
> Iain
>
> ---
> contrib/ChangeLog:
>
>         * config-list.mk (LIST): Add OPT-enable-obsolete to tilegx-linux-gnu,
>         tilegxbe-linux-gnu, and tilepro-linux-gnu.
> ---
>  contrib/config-list.mk | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/contrib/config-list.mk b/contrib/config-list.mk
> index 5818f7df08b..8a4ce8aca25 100644
> --- a/contrib/config-list.mk
> +++ b/contrib/config-list.mk
> @@ -93,7 +93,8 @@ LIST = aarch64-elf aarch64-linux-gnu aarch64-rtems \
>    sparc64-sun-solaris2.11OPT-with-gnu-ldOPT-with-gnu-asOPT-enable-threads=posix \
>    sparc-wrs-vxworks sparc64-elf sparc64-rtems sparc64-linux sparc64-freebsd6 \
>    sparc64-netbsd sparc64-openbsd \
> -  tilegx-linux-gnu tilegxbe-linux-gnu tilepro-linux-gnu \
> +  tilegx-linux-gnuOPT-enable-obsolete tilegxbe-linux-gnuOPT-enable-obsolete \
> +  tilepro-linux-gnuOPT-enable-obsolete \
>    v850e1-elf v850e-elf v850-elf v850-rtems vax-linux-gnu \
>    vax-netbsdelf vax-openbsd visium-elf x86_64-apple-darwin \
>    x86_64-pc-linux-gnuOPT-with-fpmath=avx \
>
[PING][PATCH] tilepro: Update generator file to define IN_TARGET_CODE in target file.
Ping.

On 31/05/2020 12:48, Iain Buclaw wrote:
> Hi,
>
> The target files tilegx/mul-tables.c and tilepro/mul-tables.c were
> updated in SVN r255743, but the generator file that produces them
> wasn't, so it was reverting this change during builds.
>
> Only tested by running make all-gcc for all tile*-*-* targets present in
> config-list.mk.
>
> OK?
>
> Regards
> Iain
>
> ---
> gcc/ChangeLog:
>
>         * config/tilepro/gen-mul-tables.cc (main): Define IN_TARGET_CODE to 1
>         in the target file.
> ---
>  gcc/config/tilepro/gen-mul-tables.cc | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/gcc/config/tilepro/gen-mul-tables.cc b/gcc/config/tilepro/gen-mul-tables.cc
> index 2a345023aea..7f9fb65dc2f 100644
> --- a/gcc/config/tilepro/gen-mul-tables.cc
> +++ b/gcc/config/tilepro/gen-mul-tables.cc
> @@ -1252,6 +1252,8 @@ main ()
>    printf ("/* Note this file is auto-generated from gen-mul-tables.cc.\n");
>    printf (" Make any required changes there. */\n");
>    printf ("\n");
> +  printf ("#define IN_TARGET_CODE 1\n");
> +  printf ("\n");
>    printf ("#include \"config.h\"\n");
>    printf ("#include \"system.h\"\n");
>    printf ("#include \"coretypes.h\"\n");
>
[PATCH] Fortran : get_environment_variable runtime error PR96486
Please find attached a fix for PR96486. OK to commit? [PATCH] Fortran : get_environment_variable runtime error PR96486 Runtime error occurs when the type of the value argument is character(0): "Zero-length string passed as value...". The status argument, intent(out), will contain -1 if the value of the environment is too large to fit in the value argument, this is the case if the type is character(0) so there is no reason to produce a runtime error if the value argument is zero length. 2020-08-17 Mark Eggleston libgfortran/ PR fortran/96486 * intrinsics/env.c: If value_len is > 0 blank the string. Copy the result only if its length is > 0. 2020-08-17 Mark Eggleston gcc/testsuite/ PR fortran/96486 * gfortran.dg/pr96486.f90 -- https://www.codethink.co.uk/privacy.html >From 63827120e6286181652c72501f927599125a0508 Mon Sep 17 00:00:00 2001 From: Mark Eggleston Date: Mon, 10 Aug 2020 08:07:39 +0100 Subject: [PATCH] Fortran : get_environment_variable runtime error PR96486 Runtime error occurs when the type of the value argument is character(0): "Zero-length string passed as value...". The status argument, intent(out), will contain -1 if the value of the environment is too large to fit in the value argument, this is the case if the type is character(0) so there is no reason to produce a runtime error if the value argument is zero length. 2020-08-17 Mark Eggleston libgfortran/ PR fortran/96486 * intrinsics/env.c: If value_len is > 0 blank the string. Copy the result only if its length is > 0. 2020-08-17 Mark Eggleston gcc/testsuite/ PR fortran/96486 * gfortran.dg/pr96486.f90 --- gcc/testsuite/gfortran.dg/pr96486.f90 | 9 + libgfortran/intrinsics/env.c | 7 ++- 2 files changed, 11 insertions(+), 5 deletions(-) create mode 100644 gcc/testsuite/gfortran.dg/pr96486.f90 diff --git a/gcc/testsuite/gfortran.dg/pr96486.f90 b/gcc/testsuite/gfortran.dg/pr96486.f90 new file mode 100644 index 000..fdc7025d61c --- /dev/null +++ b/gcc/testsuite/gfortran.dg/pr96486.f90 @@ -0,0 +1,9 @@ +! 
{ dg-do run } + +program test + implicit none + character(0) :: value + integer :: l, stat + call get_environment_variable("HOME",value,length=l,status=stat) + if (stat.ne.-1) stop 1 +end program test diff --git a/libgfortran/intrinsics/env.c b/libgfortran/intrinsics/env.c index b7837b30873..7ab0b443897 100644 --- a/libgfortran/intrinsics/env.c +++ b/libgfortran/intrinsics/env.c @@ -110,10 +110,7 @@ get_environment_variable_i4 (char *name, char *value, GFC_INTEGER_4 *length, if (value != NULL) { - if (value_len < 1) - runtime_error ("Zero-length string passed as value to " - "get_environment_variable."); - else + if (value_len > 0) memset (value, ' ', value_len); /* Blank the string. */ } @@ -138,7 +135,7 @@ get_environment_variable_i4 (char *name, char *value, GFC_INTEGER_4 *length, memcpy (value, res, value_len); stat = GFC_VALUE_TOO_SHORT; } - else + else if (res_len > 0) memcpy (value, res, res_len); } } -- 2.11.0
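The behaviour after the patch can be summarised with a small standalone sketch: blank the buffer only if it has room, report "too short" when the result does not fit, and copy only when there is something to copy. This is an illustration under assumptions, not the libgfortran code: `toy_copy_env_value` and the `TOY_*` constants are hypothetical, the value -1 for the "too short" status is taken from the description above, and the sketch assumes the environment variable exists (a non-NULL `res`).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical status codes; -1 mirrors the "value too short" status
   mentioned in the patch description.  */
enum { TOY_OK = 0, TOY_VALUE_TOO_SHORT = -1 };

/* Copy RES into the blank-padded, fixed-length buffer VALUE of length
   VALUE_LEN.  A zero-length VALUE is not an error: nothing is written
   and the status reports that the result did not fit.  */
static int
toy_copy_env_value (char *value, size_t value_len, const char *res)
{
  size_t res_len;
  int stat = TOY_OK;

  assert (res != NULL);
  res_len = strlen (res);

  if (value_len > 0)
    memset (value, ' ', value_len);	/* Blank the string.  */

  if (value_len < res_len)
    {
      /* Truncate; with value_len == 0 this copies nothing.  */
      memcpy (value, res, value_len);
      stat = TOY_VALUE_TOO_SHORT;
    }
  else if (res_len > 0)
    memcpy (value, res, res_len);

  return stat;
}
```

Called with a zero-length buffer and a non-empty variable, the sketch returns the "too short" status without touching the buffer, which matches what the new pr96486.f90 test expects from `status=stat`.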
[PATCH] Fortran : rejected f0.d edit descriptor PR96436
Please find attached a patch for PR96436.

OK to commit?

[PATCH] Fortran : rejected f0.d edit descriptor PR96436

Zero-width F format descriptors are valid in Fortran 95 and later,
zero-width G format descriptors in Fortran 2008 and later, and
zero-width D, E, EN and ES format descriptors in Fortran 2018 and later.

2020-08-10  Mark Eggleston

libgfortran/io/

        PR fortran/96436
        * format.c (parse_format_list): Add new local variable
        "standard" to hold the required standard to check.  If the
        format width is zero select standard depending on descriptor.
        Call notification_std using the new standard variable.

2020-08-10  Mark Eggleston

gcc/testsuite/

        PR fortran/96436
        * gfortran.dg/pr96436_1.f90
        * gfortran.dg/pr96436_2.f90
        * gfortran.dg/pr96436_3.f90
        * gfortran.dg/pr96436_4.f90
        * gfortran.dg/pr96436_5.f90
        * gfortran.dg/pr96436_6.f90
        * gfortran.dg/pr96436_7.f90
        * gfortran.dg/pr96436_8.f90
        * gfortran.dg/pr96436_9.f90
        * gfortran.dg/pr96436_10.f90

From 9f60ccd71e0c675b48d6614141d1aeddaa863191 Mon Sep 17 00:00:00 2001
From: Mark Eggleston
Date: Tue, 4 Aug 2020 14:10:08 +0100
Subject: [PATCH] Fortran : rejected f0.d edit descriptor PR96436

Zero-width F format descriptors are valid in Fortran 95 and later,
zero-width G format descriptors in Fortran 2008 and later, and
zero-width D, E, EN and ES format descriptors in Fortran 2018 and later.

2020-08-10  Mark Eggleston

libgfortran/io/

        PR fortran/96436
        * format.c (parse_format_list): Add new local variable
        "standard" to hold the required standard to check.  If the
        format width is zero select standard depending on descriptor.
        Call notification_std using the new standard variable.

2020-08-10  Mark Eggleston

gcc/testsuite/

        PR fortran/96436
        * gfortran.dg/pr96436_1.f90
        * gfortran.dg/pr96436_2.f90
        * gfortran.dg/pr96436_3.f90
        * gfortran.dg/pr96436_4.f90
        * gfortran.dg/pr96436_5.f90
        * gfortran.dg/pr96436_6.f90
        * gfortran.dg/pr96436_7.f90
        * gfortran.dg/pr96436_8.f90
        * gfortran.dg/pr96436_9.f90
        * gfortran.dg/pr96436_10.f90
---
 gcc/testsuite/gfortran.dg/pr96436_1.f90  | 10 ++
 gcc/testsuite/gfortran.dg/pr96436_10.f90 | 10 ++
 gcc/testsuite/gfortran.dg/pr96436_2.f90  | 10 ++
 gcc/testsuite/gfortran.dg/pr96436_3.f90  | 13 +
 gcc/testsuite/gfortran.dg/pr96436_4.f90  | 25 +
 gcc/testsuite/gfortran.dg/pr96436_5.f90  | 25 +
 gcc/testsuite/gfortran.dg/pr96436_6.f90  | 10 ++
 gcc/testsuite/gfortran.dg/pr96436_7.f90  | 10 ++
 gcc/testsuite/gfortran.dg/pr96436_8.f90  | 10 ++
 gcc/testsuite/gfortran.dg/pr96436_9.f90  | 10 ++
 libgfortran/io/format.c                  | 10 +-
 11 files changed, 142 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gfortran.dg/pr96436_1.f90
 create mode 100644 gcc/testsuite/gfortran.dg/pr96436_10.f90
 create mode 100644 gcc/testsuite/gfortran.dg/pr96436_2.f90
 create mode 100644 gcc/testsuite/gfortran.dg/pr96436_3.f90
 create mode 100644 gcc/testsuite/gfortran.dg/pr96436_4.f90
 create mode 100644 gcc/testsuite/gfortran.dg/pr96436_5.f90
 create mode 100644 gcc/testsuite/gfortran.dg/pr96436_6.f90
 create mode 100644 gcc/testsuite/gfortran.dg/pr96436_7.f90
 create mode 100644 gcc/testsuite/gfortran.dg/pr96436_8.f90
 create mode 100644 gcc/testsuite/gfortran.dg/pr96436_9.f90

diff --git a/gcc/testsuite/gfortran.dg/pr96436_1.f90 b/gcc/testsuite/gfortran.dg/pr96436_1.f90
new file mode 100644
index 000..7cc6a0a69b1
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr96436_1.f90
@@ -0,0 +1,10 @@
+! { dg-do run }
+! { dg-options "-std=f95 -pedantic" }
+
+character(20) :: fmt
+character(9) :: buffer
+fmt = "(1a1,f0.2,1a1)"
+write(buffer,fmt) ">", 3.0, "<"
+if (buffer.ne.">3.00<") stop 1
+end
+
diff --git a/gcc/testsuite/gfortran.dg/pr96436_10.f90 b/gcc/testsuite/gfortran.dg/pr96436_10.f90
new file mode 100644
index 000..3bd30a9f16b
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr96436_10.f90
@@ -0,0 +1,10 @@
+! { dg-do run }
+! { dg-options "-std=f2008 -pedantic" }
+! { dg-shouldfail "Zero width in format descriptor" }
+
+character(10) :: fmt = "(es0.2)"
+print fmt, 3.
+end
+
+! { dg-output "Fortran runtime error: Zero width in format descriptor" }
+
diff --git a/gcc/testsuite/gfortran.dg/pr96436_2.f90 b/gcc/testsuite/gfortran.dg/pr96436_2.f90
new file mode 100644
index 000..d2d6caffbfe
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr96436_2.f90
@@ -0,0 +1,10 @@
+! { dg-do run }
+! { dg-options "-std=f2003 -pedantic" }
+
+character(20) :: fmt
+character(9) :: buffer
+fmt = "(1a1,f0.2,1a1)"
+write(buffer,fmt) ">", 3.0, "<"
+if (buffer.ne.">3.00<") stop 1
+end
+
diff --git a/gcc/testsuite/gfortran.dg/pr96436_3.f90 b/gcc/testsuite/gfortran.dg/pr96436_3.f90
new file mode 100644
index 000..2750231312f
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr96436_3.f90
@@ -0,0 +1,13 @@
+! { dg-do run }
+! { dg-options
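
The ChangeLog entry above says that, when the format width is zero, the required standard is selected from the descriptor. A minimal C sketch of that selection follows, assuming only the standards named in the commit message. The names fmt_token, std_level, zero_width_standard and zero_width_allowed are hypothetical stand-ins, not the identifiers used in libgfortran's format.c, and the ordered comparison of standard levels is a simplification made for this illustration.

```c
#include <assert.h>

/* Hypothetical stand-ins, not libgfortran's own constants.  */
enum fmt_token { TOK_F, TOK_G, TOK_D, TOK_E, TOK_EN, TOK_ES };
enum std_level { STD_F95 = 1, STD_F2003, STD_F2008, STD_F2018 };

/* Return the earliest standard that permits a zero width for TOK,
   following the commit message: F from Fortran 95, G from Fortran
   2008, and D, E, EN and ES from Fortran 2018.  */
static enum std_level
zero_width_standard (enum fmt_token tok)
{
  switch (tok)
    {
    case TOK_F:
      return STD_F95;
    case TOK_G:
      return STD_F2008;
    default:
      return STD_F2018;
    }
}

/* Return nonzero if a zero-width TOK is accepted under SELECTED.
   Assumes the levels are ordered oldest to newest.  */
static int
zero_width_allowed (enum fmt_token tok, enum std_level selected)
{
  return selected >= zero_width_standard (tok);
}

/* Mirror two of the new tests under the assumptions above.  */
static void
check_examples (void)
{
  /* pr96436_1.f90: f0.2 with -std=f95 is accepted.  */
  assert (zero_width_allowed (TOK_F, STD_F95));
  /* pr96436_10.f90: es0.2 with -std=f2008 is rejected.  */
  assert (!zero_width_allowed (TOK_ES, STD_F2008));
}
```

Under these assumptions, f0.2 is accepted with -std=f95 and -std=f2003 (as in pr96436_1.f90 and pr96436_2.f90), while es0.2 is rejected with -std=f2008 (as in pr96436_10.f90).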
*PING* – Re: [Patch] OpenMP: Handle cpp_implicit_alias in declare-target discovery (PR96390)
On 8/3/20 5:37 PM, Tobias Burnus wrote:

It turned out that omp_discover_declare_target_tgt_fn_r discovered all nodes, but because it tagged the C++ alias nodes and not the streamed-out nodes, no device function was created, and one got link errors if offloading devices were configured. (Only with -O0, as otherwise inlining happened.)

(The testcase is based on a sollve_vv testcase, which in turn was based on an LLVM bug report.)

OK?

Tobias
Re: [PATCH] C-SKY: Support -mfloat-abi=hard.
Hi Jojo,

On 8/17/20 9:52 AM, Jojo R wrote:

--- a/gcc/config/csky/csky.c
+++ b/gcc/config/csky/csky.c
@@ -328,6 +328,10 @@ csky_cpu_cpp_builtins (cpp_reader *pfile)
     {
       builtin_define ("__csky_hard_float__");
       builtin_define ("__CSKY_HARD_FLOAT__");
+      if (TARGET_HARD_FLOAT_ABI)
+        builtin_define ("__CSKY_HARD_FLOAT_ABI__");
+      if (TARGET_SINGLE_FPU)
+        builtin_define ("__CSKY_HARD_FLOAT_FPU_SF__");
     }

These two builtin definitions should also be provided in lowercase.

diff --git a/gcc/config/csky/csky.md b/gcc/config/csky/csky.md
@@ -3310,6 +3312,88 @@
                   force_reg (Pmode, XEXP (operands[1], 0)));
 }")
 
+;; Call subroutine returning any type.
+
+(define_expand "untyped_call"
+  [(parallel [(call (match_operand 0 "" "")
+                    (const_int 0))
+              (match_operand 1 "" "")
+              (match_operand 2 "" "")])]
+  "TARGET_HARD_FLOAT_ABI"
+{
+  int i;
+
+  emit_call_insn (gen_call (operands[0], const0_rtx));
+
+  for (i = 0; i < XVECLEN (operands[2], 0); i++)
+    {
+      rtx set = XVECEXP (operands[2], 0, i);
+      emit_move_insn (SET_DEST (set), SET_SRC (set));
+    }
+
+  /* The optimizer does not know that the call sets the function value
+     registers we stored in the result block.  We avoid problems by
+     claiming that all hard registers are used and clobbered at this
+     point.  */
+  emit_insn (gen_blockage ());
+
+  DONE;
+})

Why is untyped_call only supported with -mfloat-abi=hard? I think it should be supported for every float ABI.
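
For the two macros added in the csky_cpu_cpp_builtins hunk above, user code would typically test them with the preprocessor. The sketch below is only an illustration of that use: the function name csky_float_config and the returned strings are hypothetical, and the only identifiers taken from the patch are __CSKY_HARD_FLOAT_ABI__ and __CSKY_HARD_FLOAT_FPU_SF__. On any target that defines neither macro, the last branch is taken.

```c
#include <string.h>

/* Describe the float configuration visible at compile time.
   Hypothetical helper; only the two __CSKY_* macro names come
   from the quoted patch.  */
static const char *
csky_float_config (void)
{
#if defined (__CSKY_HARD_FLOAT_ABI__) && defined (__CSKY_HARD_FLOAT_FPU_SF__)
  return "hard-float ABI, single-precision FPU";
#elif defined (__CSKY_HARD_FLOAT_ABI__)
  return "hard-float ABI";
#else
  return "no hard-float ABI macro defined";
#endif
}
```

If lowercase variants are added as requested, the same checks could test those names as well; they are not assumed here because the quoted hunk does not define them.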