Re: [PATCH] Added PLUGIN_FINISH_TYPE callback on enum type processing
Sorry, first contribution ever :) Here is the entry:

2014-10-19 Bruno Loff bruno.l...@gmail.com

    * c-parser.c (c_parser_declspecs): Call invoke_plugin_callbacks after
    processing enum declaration.

The dates are off because I actually made the change a while ago (took me a while because I needed to test it with the plugin and make sure I didn't mess it up).

Bruno

In case you want a diff file...

diff --git a/gcc/c/ChangeLog b/gcc/c/ChangeLog
index 6cf964c..080bd61 100644
--- a/gcc/c/ChangeLog
+++ b/gcc/c/ChangeLog
@@ -1,3 +1,8 @@
+2014-10-19 Bruno Loff bruno.l...@gmail.com
+
+    * c-parser.c (c_parser_declspecs): Call invoke_plugin_callbacks after
+    processing enum declaration.
+
 2014-09-25 Thomas Schwinge tho...@codesourcery.com

     PR c++/63249
--
1.9.1

On 2 February 2015 at 20:17, Diego Novillo dnovi...@google.com wrote:
> On Mon, Feb 2, 2015 at 2:07 PM, Bruno Loff bruno.l...@gmail.com wrote:
>> Something like:
>>
>> The PLUGIN_FINISH_TYPE callback for gcc plugins is now triggered for
>> enum declarations.
>>
>> ?
>
> ChangeLog entries in GCC are pretty picky as to how they want to be
> formatted. See other entries for reference and
> https://gcc.gnu.org/codingconventions.html#ChangeLogs for specific
> documentation.
>
> Diego.
Re: [Ping] Port of VTV for Cygwin and MinGW
Hi,

after the missed bug on Linux without VTV, I checked everything again on trunk. I saw that I erroneously wrote "regenerate" in the changelog for libvtv/aclocal.m4 and deleted the change from the patch. The only change I made there in my working directory was the following.

Index: libvtv/aclocal.m4
===================================================================
--- libvtv/aclocal.m4 (revision 220306)
+++ libvtv/aclocal.m4 (working copy)
@@ -1006,6 +1006,7 @@ AC_SUBST([am__untar])
 m4_include([../config/acx.m4])
 m4_include([../config/depstand.m4])
 m4_include([../config/lead-dot.m4])
+m4_include([../config/lthostflags.m4])
 m4_include([../config/libstdc++-raw-cxx.m4])
 m4_include([../config/multi.m4])
 m4_include([../config/override.m4])

And then autoconf/automake again.

Something I missed during my last test, since Cygwin with gcc 4.9 and the patch bootstrapped fine, is the following. One of the last changes to the patch was to remove the implementation of mprotect in libvtv/ (copied from the MinGW port from libgcc2.c), because libgcc2.c implements it for MinGW, and cygwin1.dll implements it for Cygwin. However, PAGE_SIZE/PAGESIZE returns 0x1 on Cygwin on a 64bit PC/VM (I don't have a 32bit PC/VM but I assume that the value would be 0x1000 there). On Linux 64bit it returns 0x1000, and on Windows 64bit SYSTEM_INFO/dwPageSize also returns 0x1000. This causes Cygwin's mprotect to fail for libvtv, since the passed address is checked for alignment with PAGE_SIZE/PAGESIZE.

The solutions I came up with are:

- Set VTV_PAGE_SIZE to 0x1 on Cygwin with 64bit PCs/VMs. But this would set more than the desired section to be read/write. Practically the whole dll would be writable for the .vtable_map_vars section to be writable. Therefore I don't recommend this solution. The changes would be in include/vtv-change-permission.h, and various other files where sizes have to be changed.
- Add the mprotect implementation from libgcc2.c again for Cygwin in libvtv/. In libgcc2.c it isn't built for Cygwin. The changes would just be in libvtv/. I'd prefer this solution.

Patrick
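The alignment failure described above can be illustrated with a small sketch. It is not the libvtv or Cygwin code: `is_page_aligned` and `page_floor` are hypothetical helpers, and the larger page size used in the comments (0x10000) is chosen only for illustration. The point is that an address aligned to a 0x1000 boundary need not be aligned to a larger page size, so a check against the larger value rejects it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from libvtv or cygwin1.dll): returns nonzero
   if ADDR is a multiple of PAGE_SIZE.  PAGE_SIZE must be a power of two.  */
int is_page_aligned(uintptr_t addr, size_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return (addr & (uintptr_t)(page_size - 1)) == 0;
}

/* Hypothetical helper: rounds ADDR down to the start of its page.
   With a larger page size the resulting region starts earlier, which is
   why the first solution would make more memory writable than intended.  */
uintptr_t page_floor(uintptr_t addr, size_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return addr & ~(uintptr_t)(page_size - 1);
}
```

For example, the address 0x403000 passes `is_page_aligned` with a page size of 0x1000 but fails it with an assumed page size of 0x10000, where `page_floor` would move the start of the region down to 0x400000.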
Re: [PATCH] PR preprocessor/64803 - __LINE__ inside macro is not constant
On Mon, Feb 02, 2015 at 03:41:50PM +0100, Dodji Seketeli wrote:
> libcpp/ChangeLog:
>
>     * internal.h (cpp_reader::top_most_macro_node): New data member.
>     * macro.c (enter_macro_context): Pass the location of the end of
>     the top-most invocation of the function-like macro, or the
>     location of the expansion point of the top-most object-like
>     macro.
>     (cpp_get_token_1): Store the top-most macro node in the new
>     pfile->top_most_macro_node data member.
>     (_cpp_pop_context): Clear the new
>     cpp_reader::top_most_macro_node data member.
>
> gcc/testsuite/ChangeLog:
>
>     * gcc.dg/cpp/builtin-macro-1.c: New test case.

Ok, thanks.

Jakub
Re: [PATCH] Added PLUGIN_FINISH_TYPE callback on enum type processing
On Mon, Feb 2, 2015 at 2:39 PM, Bruno Loff bruno.l...@gmail.com wrote:
> Sorry, first contribution ever :) Here is the entry:
>
> 2014-10-19 Bruno Loff bruno.l...@gmail.com
>
>     * c-parser.c (c_parser_declspecs): Call invoke_plugin_callbacks after
>     processing enum declaration.

This is fine. Thanks.

> The dates are off because I actually made the change a while ago (took
> me a while because I needed to test it with the plugin and make sure I
> didn't mess it up).

Just change the date to the date on which you commit the patch.

Diego.
Re: [PATCH] Added PLUGIN_FINISH_TYPE callback on enum type processing
On Mon, Feb 2, 2015 at 2:39 PM, Bruno Loff bruno.l...@gmail.com wrote: 2014-10-19 Bruno Loff bruno.l...@gmail.com * c-parser.c (c_parser_declspecs): Call invoke_plugin_callbacks after processing enum declaration. Thanks. Committed at r220358. Diego.
Re: [PATCH] PR preprocessor/64803 - __LINE__ inside macro is not constant
Jakub Jelinek ja...@redhat.com writes:

> On Mon, Feb 02, 2015 at 03:41:50PM +0100, Dodji Seketeli wrote:
>> libcpp/ChangeLog:
>>
>>     * internal.h (cpp_reader::top_most_macro_node): New data member.
>>     * macro.c (enter_macro_context): Pass the location of the end of
>>     the top-most invocation of the function-like macro, or the
>>     location of the expansion point of the top-most object-like
>>     macro.
>>     (cpp_get_token_1): Store the top-most macro node in the new
>>     pfile->top_most_macro_node data member.
>>     (_cpp_pop_context): Clear the new
>>     cpp_reader::top_most_macro_node data member.
>>
>> gcc/testsuite/ChangeLog:
>>
>>     * gcc.dg/cpp/builtin-macro-1.c: New test case.
>
> Ok, thanks.

Thanks.

The patch that finally passed bootstrap is the one below. It's slightly different in the condition I use to detect that we are popping the context of the top-most macro expansion stored in pfile->top_most_macro_node in _cpp_pop_context(). I now use:

+ if (macro == pfile->top_most_macro_node && context->prev == NULL)

And the context->prev == NULL means that this is the first macro expansion context on the stack.

I have also corrected a typo by s/poping/popping/. I don't know what I was thinking before.

Bootstrapped and tested on x86_64-unknown-linux-gnu against trunk.

libcpp/ChangeLog:

    * internal.h (cpp_reader::top_most_macro_node): New data member.
    * macro.c (enter_macro_context): Pass the location of the end of
    the top-most invocation of the function-like macro, or the
    location of the expansion point of the top-most object-like
    macro.
    (cpp_get_token_1): Store the top-most macro node in the new
    pfile->top_most_macro_node data member.
    (_cpp_pop_context): Clear the new
    cpp_reader::top_most_macro_node data member.

gcc/testsuite/ChangeLog:

    * gcc.dg/cpp/builtin-macro-1.c: New test case.

Signed-off-by: Dodji Seketeli do...@redhat.com --- gcc/testsuite/gcc.dg/cpp/builtin-macro-1.c | 28 +++ libcpp/internal.h | 5 + libcpp/macro.c | 31 +++--- 3 files changed, 61 insertions(+), 3 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/cpp/builtin-macro-1.c diff --git a/gcc/testsuite/gcc.dg/cpp/builtin-macro-1.c b/gcc/testsuite/gcc.dg/cpp/builtin-macro-1.c new file mode 100644 index 000..90c2883 --- /dev/null +++ b/gcc/testsuite/gcc.dg/cpp/builtin-macro-1.c @@ -0,0 +1,28 @@ +/* Origin PR preprocessor/64803 + + This test ensures that the value the __LINE__ macro expands to is + constant and corresponds to the line of the closing parenthesis of + the top-most function-like macro expansion it's part of. + + { dg-do run } + { do-options -no-integrated-cpp } */ + +#include assert.h + +#define C(a, b) a ## b +#define L(x) C(L, x) +#define M(a) int L(__LINE__) = __LINE__; assert(L(__LINE__) == __LINE__); + +int +main() +{ + M(a +); + + assert(L20 == 20); /* 20 is the line number of the + closing parenthesis of the + invocation of the M macro. Please + adjust in case the layout of this + file changes. */ + return 0; +} diff --git a/libcpp/internal.h b/libcpp/internal.h index 1a74020..96ccc19 100644 --- a/libcpp/internal.h +++ b/libcpp/internal.h @@ -421,6 +421,11 @@ struct cpp_reader macro invocation. */ source_location invocation_location; + /* This is the node representing the macro being expanded at + top-level. The value of this data member is valid iff + in_macro_expansion_p() returns TRUE. */ + cpp_hashnode *top_most_macro_node; + /* Nonzero if we are about to expand a macro. Note that if we are really expanding a macro, the function macro_of_context returns the macro being expanded and this flag is set to false. 
Client diff --git a/libcpp/macro.c b/libcpp/macro.c index 9571345..1e0a0b5 100644 --- a/libcpp/macro.c +++ b/libcpp/macro.c @@ -1228,7 +1228,24 @@ enter_macro_context (cpp_reader *pfile, cpp_hashnode *node, pfile-about_to_expand_macro_p = false; /* Handle built-in macros and the _Pragma operator. */ - return builtin_macro (pfile, node, location); + { +source_location loc; +if (/* The top-level macro invocation that triggered the expansion + we are looking at is with a standard macro ...*/ + !(pfile-top_most_macro_node-flags NODE_BUILTIN) + /* ... and it's a function-like macro invocation. */ +pfile-top_most_macro_node-value.macro-fun_like) + /* Then the location of the end of the macro invocation is the +location of the closing parenthesis. */ + loc = pfile-cur_token[-1].src_loc; +else + /* Otherwise, the location of the end of the macro invocation is +the location of the expansion point of that top-level macro +invocation.
PATCH: PR target/64905: unsigned short is loaded with 4-byte load (movl)
This patch fixes a long-standing bug where aligned_operand ignores the alignment of a memory operand when it is less than 32 bits. It drops address decomposition and returns false if the alignment of the memory operand is less than 32 bits. Tested on Linux/x86-64. OK for trunk, 4.9 and 4.8 branches?

H.J.
---
gcc/

    PR target/64905
    * config/i386/predicates.md (aligned_operand): Don't decompose
    address. Return false if alignment of memory operand is less
    than 32 bits.

gcc/testsuite/

    PR target/64905
    * gcc.target/i386/pr64905.c: New file.
---
 gcc/config/i386/predicates.md           | 33 +
 gcc/testsuite/gcc.target/i386/pr64905.c | 22 ++
 2 files changed, 23 insertions(+), 32 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr64905.c

diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index 0f314cc..98dbcba 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -1095,9 +1095,6 @@
 (define_predicate "aligned_operand"
   (match_operand 0 "general_operand")
 {
-  struct ix86_address parts;
-  int ok;
-
   /* Registers and immediate operands are always aligned. */
   if (!MEM_P (op))
     return true;
@@ -1121,35 +1118,7 @@
       || GET_CODE (op) == POST_INC)
     return true;

-  /* Decode the address. */
-  ok = ix86_decompose_address (op, &parts);
-  gcc_assert (ok);
-
-  if (parts.base && GET_CODE (parts.base) == SUBREG)
-    parts.base = SUBREG_REG (parts.base);
-  if (parts.index && GET_CODE (parts.index) == SUBREG)
-    parts.index = SUBREG_REG (parts.index);
-
-  /* Look for some component that isn't known to be aligned. */
-  if (parts.index)
-    {
-      if (REGNO_POINTER_ALIGN (REGNO (parts.index)) * parts.scale < 32)
-        return false;
-    }
-  if (parts.base)
-    {
-      if (REGNO_POINTER_ALIGN (REGNO (parts.base)) < 32)
-        return false;
-    }
-  if (parts.disp)
-    {
-      if (!CONST_INT_P (parts.disp)
-          || (INTVAL (parts.disp) & 3))
-        return false;
-    }
-
-  /* Didn't find one -- this must be an aligned address. */
-  return true;
+  return false;
 })

 ;; Return true if OP is memory operand with a displacement.
diff --git a/gcc/testsuite/gcc.target/i386/pr64905.c b/gcc/testsuite/gcc.target/i386/pr64905.c
new file mode 100644
index 000..bc87d85
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr64905.c
@@ -0,0 +1,22 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-Os -ffixed-rax -ffixed-rbx -ffixed-rcx -ffixed-rdx -ffixed-rdi -ffixed-rsi -ffixed-r8 -ffixed-r9 -ffixed-r10 -ffixed-r11 -ffixed-r12 -ffixed-r13 -ffixed-r14 -ffixed-r15" } */
+/* { dg-final { scan-assembler-not "movl\[ \t\]0\\(%.*\\), %.*" } } */
+
+typedef unsigned short uint16_t;
+uint16_t a_global;
+
+void __attribute__ ((noinline))
+function (uint16_t **a_p)
+{
+  // unaligned access by address in %rbp: mov 0x0(%rbp),%ebp
+  a_global = **a_p;
+}
+
+int main(int argc, char **argv)
+{
+  uint16_t array [4] = { 1, 2, 3, 4 };
+  uint16_t *array_elem_p = &array [3];
+
+  function (&array_elem_p);
+  return 0;
+}
--
1.9.3
Fix pessimisation due to always_inline
Hi,

while looking into Firefox's regressions WRT 4.9 LTO builds, I noticed that some very small functions are not early inlined. This is because Firefox sometimes uses always_inline and we skip early inlining into those. Once the always_inline function is inlined, however, we do not inline recursively because we dropped the number of iterations to 1. This patch makes the inliner apply the changes from always_inline inlining before early inlining.

Bootstrapped/regtested x86_64-linux, will commit it shortly.

    * ipa-inline.c (early_inliner): Skip inlining only in always_inlined;
    if some always_inline was inlined, apply changes before inlining
    heuristically.

Index: ipa-inline.c
===================================================================
--- ipa-inline.c (revision 220313)
+++ ipa-inline.c (working copy)
@@ -2528,7 +2528,9 @@ early_inliner (function *fun)
          cycles of edges to be always inlined in the callgraph.

          We might want to be smarter and just avoid this type of inlining. */
-      || DECL_DISREGARD_INLINE_LIMITS (node->decl))
+      || (DECL_DISREGARD_INLINE_LIMITS (node->decl)
+          && lookup_attribute ("always_inline",
+                               DECL_ATTRIBUTES (node->decl))))
     ;
   else if (lookup_attribute ("flatten",
                              DECL_ATTRIBUTES (node->decl)) != NULL)
@@ -2543,6 +2545,17 @@ early_inliner (function *fun)
     }
   else
     {
+      /* If some always_inline functions was inlined, apply the changes.
+         This way we will not account always inline into growth limits and
+         moreover we will inline calls from always inlines that we skipped
+         previously becuase of conditional above. */
+      if (inlined)
+        {
+          timevar_push (TV_INTEGRATION);
+          todo |= optimize_inline_calls (current_function_decl);
+          inline_update_overall_summary (node);
+          timevar_pop (TV_INTEGRATION);
+        }
       /* We iterate incremental inlining to get trivial cases of indirect
          inlining. */
       while (iterations < PARAM_VALUE (PARAM_EARLY_INLINER_MAX_ITERATIONS)
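The shape of code the message describes can be sketched as follows. This is a made-up example, not taken from Firefox or from the GCC testsuite: `add_one`, `twice_next`, `compute` and `check_compute` are hypothetical names, and `__attribute__((always_inline))` is the GCC-specific attribute under discussion. Whether `add_one` actually gets early inlined into `compute` depends on the compiler version and options; the sketch only shows a very small function called from an always_inline wrapper.

```c
#include <assert.h>

/* Very small function: the kind of callee the message says was not
   being early inlined when reached through an always_inline caller.  */
static inline int add_one(int x)
{
    return x + 1;
}

/* always_inline wrapper (GCC-specific attribute).  Its body is inlined
   into every caller; the call to add_one inside it then becomes a call
   in the caller that the early inliner still has to consider.  */
static inline __attribute__((always_inline)) int twice_next(int x)
{
    return 2 * add_one(x);
}

/* Ordinary caller of the wrapper.  */
int compute(int x)
{
    return twice_next(x);
}

/* Tiny self-check of the arithmetic; independent of any inlining.  */
void check_compute(void)
{
    assert(compute(3) == 8);
}
```

The result is the same whether or not anything is inlined; only the generated code differs, which is what the patch is about.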
Re: [C++ PATCH] PR c++/64901
On 2 February 2015 at 20:50, Ville Voutilainen ville.voutilai...@gmail.com wrote: The modified test has been tested, I'm currently running the full testsuite, so testing is incomplete. I wanted to send this in asap, since this is a bad regression. /cp 2015-02-02 Ville Voutilainen ville.voutilai...@gmail.com PR c++/64901 * decl.c (duplicate_decls): Also duplicate DECL_FINAL_P and DECL_OVERRIDE_P. /testsuite 2015-02-02 Ville Voutilainen ville.voutilai...@gmail.com PR c++/64901 * g++.dg/cpp0x/override1.C: Add a test for the PR. For what it's worth, the complete testsuite passes without regressions, tested on Linux-x64.
Re: [PATCH, RFC] fortran [was Re: #pragma GCC unroll support]
On Feb 2, 2015, at 3:22 PM, Bernhard Reutner-Fischer rep.dot@gmail.com wrote:
> Untested draft patch

I looked it over, seems to slot in nicely.

> + gfc_error ("%<GCC unroll%> directive does not commence a loop at %C");

So, I don't like "commence" here.
Re: [PATCH] PR preprocessor/64803 - __LINE__ inside macro is not constant
On Mon, Feb 02, 2015 at 11:22:58PM +0100, Dodji Seketeli wrote:
> Thanks. The patch that finally passed bootstrap is the one below. It's
> slightly different in the condition I use to detect that we are
> popping the context of the top-most macro expansion stored in
> pfile->top_most_macro_node in _cpp_pop_context(). I now use:
>
> + if (macro == pfile->top_most_macro_node && context->prev == NULL)
>
> And the context->prev == NULL means, this is the first macro expansion

LGTM.

> context on the stack. I have also corrected a typo by
> s/poping/popping/. I don't know what I was thinking before.

Oops, sorry for missing that.

Jakub
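The property this thread is about can be checked with a sketch much smaller than the test case in the patch. It is an illustration only: `RECORD_LINES` and `lines_match_in_multiline_invocation` are hypothetical names, not part of the patch. The idea is that every `__LINE__` inside one expansion of a function-like macro should yield the same value, even when the invocation is spread over several source lines.

```c
#include <assert.h>

/* Hypothetical macro: stores __LINE__ twice within one expansion.  */
#define RECORD_LINES(first, second) \
    do { (first) = __LINE__; (second) = __LINE__; } while (0)

/* Returns nonzero if both uses of __LINE__ inside a single, multi-line
   macro invocation expanded to the same line number.  */
int lines_match_in_multiline_invocation(void)
{
    int first = 0;
    int second = -1;

    /* The invocation deliberately spans three source lines.  */
    RECORD_LINES(first,
                 second
    );

    assert(first > 0);  /* __LINE__ is always a positive line number.  */
    return first == second;
}
```

With the behaviour the patch restores, the function is expected to return nonzero. Which line the common value refers to (the patch's test case expects the line of the closing parenthesis) is a separate question that this sketch does not check.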
[PATCH, RFC] fortran [was Re: #pragma GCC unroll support]
Hi,

Some compilers IIRC use !DIR$ unroll; if memory serves me right, the DEC compiler had !DEC$ unroll. We could support one or the other three-letter keyword or maybe not.

I think a combination of unroll and ivdep directives is allowed (at least in some compilers); TODO.

Not sure what other statements should be annotated with that directive?

I do not like the global variable directive_unroll but it was the easy way out for cheap warnings.

Untested draft patch, regstrap running overnight, depends on Mike's unroll-5.diffs.txt patch in this thread ( https://gcc.gnu.org/ml/gcc-patches/2015-01/msg02733.html ). Just stage-1 tinkering here.

Cheers,

Bernhard Reutner-Fischer (1):
  fortran: !GCC$ unroll for DO

 gcc/fortran/decl.c                               | 38
 gcc/fortran/gfortran.h                           | 2 ++
 gcc/fortran/match.h                              | 1 +
 gcc/fortran/parse.c                              | 13 ++-
 gcc/fortran/trans-decl.c                         | 7
 gcc/fortran/trans-stmt.c                         | 14
 gcc/fortran/trans.h                              | 3 ++
 gcc/testsuite/gfortran.dg/directive_unroll_1.f90 | 46
 gcc/testsuite/gfortran.dg/directive_unroll_2.f90 | 39
 9 files changed, 162 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gfortran.dg/directive_unroll_1.f90
 create mode 100644 gcc/testsuite/gfortran.dg/directive_unroll_2.f90

--
2.1.4
Re: [PATCH] Fix combiner from accessing or writing out of bounds SET_N_REGS (PR other/63504)
On Mon, Feb 02, 2015 at 07:54:46PM +0100, Jakub Jelinek wrote: +/* Highest pseudo for which we track REG_N_SETS. */ +static unsigned int reg_n_sets_max; One more than the highest reg num, actually. Looks fine otherwise :-) Segher
Re: C++ PATCH for abi_tag sanity checking
On 02/02/2015 06:43 PM, Jason Merrill wrote:
> One of the EDG guys pointed out to me that we weren't doing any sanity
> checking on the arguments to the abi_tag attribute. This patch adds
> checks to require that the arguments be strings containing valid
> identifiers, so they work appropriately in mangled names.
>
> Tested x86_64-pc-linux-gnu, applying to trunk.

Thanks; however, it would be nice to document what this flag does at all. Please see PR 64859.

Matthias
[PATCH/AARCH64] Fix 64893: ICE with vget_lane_u32 with C++ front-end at -O0
While trying to build GCC 5 with GCC 5, I ran into an ICE when building libcpp at -O0. The problem is that the C++ front-end was not folding sizeof(a)/sizeof(a[0]) when passed to a function at -O0. The C++ front-end keeps sizeof around until the gimplifier, and there is no way to fold the expressions that involve it. So to work around the issue we need to change __builtin_aarch64_im_lane_boundsi to accept an extra argument and change the first two arguments to size_t type, so we don't get an extra cast there, and do the division inside the compiler itself.

Also, we don't want to cause an ICE on any source code, so I changed the assert to be a sorry if either of the first two arguments is not an integer constant.

OK? Bootstrapped and tested on aarch64-linux-gnu with no regressions, and I was able to bootstrap without a modified libcpp.

Thanks,
Andrew Pinski

ChangeLog:

    * config/aarch64/aarch64-builtins.c (aarch64_init_simd_builtins):
    Change the first argument type to size_type_node and add another
    size_type_node.
    (aarch64_simd_expand_builtin): Handle the new argument to
    AARCH64_SIMD_BUILTIN_LANE_CHECK and don't ICE but rather print
    sorry out when the first two arguments are not integer constants.
    * config/aarch64/arm_neon.h (__AARCH64_LANE_CHECK): Pass the sizeof's
    directly to __builtin_aarch64_im_lane_boundsi.

testsuite/ChangeLog:

    * c-c++-common/torture/aarch64-vect-lane-1.c: New testcase.

commit 455a54f36a205af281b3fe8dbc97916ede704ca8
Author: Andrew Pinski apin...@cavium.com
Date: Mon Feb 2 18:40:08 2015 +

    Fix bug 64893: ICE with vget_lane_u32 with C++ front-end

    PR target/64893
    * config/aarch64/aarch64-builtins.c (aarch64_init_simd_builtins):
    Change the first argument type to size_type_node and add another
    size_type_node.
    (aarch64_simd_expand_builtin): Handle the new argument to
    AARCH64_SIMD_BUILTIN_LANE_CHECK and don't ICE but rather print
    sorry out when the first two arguments are not integer constants.
* config/aarch64/arm_neon.h (__AARCH64_LANE_CHECK): Pass the sizeof directly to __builtin_aarch64_im_lane_boundsi. * testsuite/c-c++-common/torture/aarch64-vect-lane-1.c: New testcase. diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c index 87f1ac2..5bd15d1 100644 --- a/gcc/config/aarch64/aarch64-builtins.c +++ b/gcc/config/aarch64/aarch64-builtins.c @@ -712,7 +712,8 @@ aarch64_init_simd_builtins (void) aarch64_init_simd_builtin_scalar_types (); tree lane_check_fpr = build_function_type_list (void_type_node, - intSI_type_node, + size_type_node, + size_type_node, intSI_type_node, NULL); aarch64_builtin_decls[AARCH64_SIMD_BUILTIN_LANE_CHECK] = @@ -1001,13 +1002,18 @@ aarch64_simd_expand_builtin (int fcode, tree exp, rtx target) { if (fcode == AARCH64_SIMD_BUILTIN_LANE_CHECK) { - tree nlanes = CALL_EXPR_ARG (exp, 0); - gcc_assert (TREE_CODE (nlanes) == INTEGER_CST); - rtx lane_idx = expand_normal (CALL_EXPR_ARG (exp, 1)); - if (CONST_INT_P (lane_idx)) - aarch64_simd_lane_bounds (lane_idx, 0, TREE_INT_CST_LOW (nlanes), exp); + rtx totalsize = expand_normal (CALL_EXPR_ARG (exp, 0)); + rtx elementsize = expand_normal (CALL_EXPR_ARG (exp, 1)); + if (CONST_INT_P (totalsize) CONST_INT_P (elementsize)) + { + rtx lane_idx = expand_normal (CALL_EXPR_ARG (exp, 2)); + if (CONST_INT_P (lane_idx)) + aarch64_simd_lane_bounds (lane_idx, 0, UINTVAL (totalsize)/UINTVAL (elementsize), exp); + else + error (%Klane index must be a constant immediate, exp); + } else - error (%Klane index must be a constant immediate, exp); + sorry (%Ktotal size and element size must be a constant immediate, exp); /* Don't generate any RTL. 
*/ return const0_rtx; } diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h index d4ce0b8..938a3cc 100644 --- a/gcc/config/aarch64/arm_neon.h +++ b/gcc/config/aarch64/arm_neon.h @@ -541,7 +541,7 @@ typedef struct poly16x8x4_t #define __AARCH64_NUM_LANES(__v) (sizeof (__v) / sizeof (__v[0])) #define __AARCH64_LANE_CHECK(__vec, __idx) \ - __builtin_aarch64_im_lane_boundsi (__AARCH64_NUM_LANES (__vec), __idx) + __builtin_aarch64_im_lane_boundsi (sizeof(__vec), sizeof(__vec[0]), __idx) /* For big-endian, GCC's vector indices are the opposite way around to the architectural lane indices used by Neon intrinsics. */ diff --git
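The arithmetic behind the lane check can be sketched in plain C. This is only an illustration of the idea in the patch (pass the total size and the element size separately and divide later); `NUM_LANES`, `lane_in_bounds` and `example_lane_count` are hypothetical names, and an ordinary array stands in for a Neon vector type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the sizeof-division idiom used by
   __AARCH64_NUM_LANES: number of elements in an array-like object.  */
#define NUM_LANES(v) (sizeof (v) / sizeof ((v)[0]))

/* Hypothetical bounds check taking the two sizes separately, so that the
   division happens inside the checker rather than at the call site.
   Returns nonzero if LANE is a valid index.  */
int lane_in_bounds(size_t total_size, size_t element_size, size_t lane)
{
    assert(element_size != 0 && total_size % element_size == 0);
    return lane < total_size / element_size;
}

/* Example: a 4-element array of unsigned int has 4 "lanes".  */
size_t example_lane_count(void)
{
    unsigned int lanes[4] = { 0, 0, 0, 0 };
    return NUM_LANES(lanes);
}
```

For a 16-byte object with 4-byte elements, lanes 0 to 3 are accepted and lane 4 is rejected, matching the range 0 to nlanes - 1 that the original single-argument form computed.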
[PATCH, v0] fortran: !GCC$ unroll for DO
fortran/ChangeLog: 2015-02-02 Bernhard Reutner-Fischer al...@gcc.gnu.org * match.h (gfc_match_gcc_unroll): New prototype. * decl.c (directive_unroll): New global variable. (gfc_match_gcc_unroll): New function. * gfortran.h (directive_unroll): New extern declaration. [gfc_iterator]: New member unroll. * parse.c (decode_gcc_attribute): Match unroll. (parse_do_block): Set iterator's unroll. (parse_executable): Diagnose misplaced unroll directive. * trans.h (gfc_cfun_has_unroll): New prototype. * trans-decl.c (gfc_cfun_has_unroll): New function. * trans-stmt.c (gfc_trans_simple_do, gfc_trans_do): Annotate loop condition with annot_expr_unroll_kind. testsuite/ChangeLog: 2015-02-02 Bernhard Reutner-Fischer al...@gcc.gnu.org * gfortran.dg/directive_unroll_1.f90: New testcase. * gfortran.dg/directive_unroll_2.f90: Likewise. Signed-off-by: Bernhard Reutner-Fischer rep.dot@gmail.com --- gcc/fortran/decl.c | 38 gcc/fortran/gfortran.h | 2 ++ gcc/fortran/match.h | 1 + gcc/fortran/parse.c | 13 ++- gcc/fortran/trans-decl.c | 7 gcc/fortran/trans-stmt.c | 14 gcc/fortran/trans.h | 3 ++ gcc/testsuite/gfortran.dg/directive_unroll_1.f90 | 46 gcc/testsuite/gfortran.dg/directive_unroll_2.f90 | 39 9 files changed, 162 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/gfortran.dg/directive_unroll_1.f90 create mode 100644 gcc/testsuite/gfortran.dg/directive_unroll_2.f90 diff --git a/gcc/fortran/decl.c b/gcc/fortran/decl.c index 40d851c..713e6ee 100644 --- a/gcc/fortran/decl.c +++ b/gcc/fortran/decl.c @@ -103,6 +103,8 @@ gfc_symbol *gfc_new_block; bool gfc_matching_function; +/* Set upon parsing a !GCC$ unroll n directive for use in the next loop. 
*/ +int directive_unroll = -1; /* DATA statement subroutines */ @@ -8866,3 +8868,39 @@ syntax: gfc_error (Syntax error in !GCC$ ATTRIBUTES statement at %C); return MATCH_ERROR; } + + +/* Match a !GCC$ UNROLL statement of the form: + !GCC$ UNROLL n + + The parameter n is the number of times we are supposed to unroll; + Refer to the C frontend and loop-unroll.c decide_unrolling() for details. + + When we come here, we have already matched the !GCC$ UNROLL string. + */ +match +gfc_match_gcc_unroll (void) +{ + signed int value; + + if (gfc_match_small_int (value) == MATCH_YES) +{ + if (value 0 || value USHRT_MAX) + { + gfc_error (%GCC unroll% directive requires a + non-negative integral constant + less than or equal to %u at %C, + USHRT_MAX + ); + return MATCH_ERROR; + } + if (gfc_match_eos () == MATCH_YES) + { + directive_unroll = value; + return MATCH_YES; + } +} + + gfc_error (Syntax error in !GCC$ UNROLL directive at %C); + return MATCH_ERROR; +} diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h index 6b9f7dd..7bd2432 100644 --- a/gcc/fortran/gfortran.h +++ b/gcc/fortran/gfortran.h @@ -2185,6 +2185,7 @@ gfc_case; typedef struct { gfc_expr *var, *start, *end, *step; + unsigned short unroll; } gfc_iterator; @@ -2546,6 +2547,7 @@ gfc_finalizer; /* decl.c */ bool gfc_in_match_data (void); match gfc_match_char_spec (gfc_typespec *); +extern int directive_unroll; /* scanner.c */ void gfc_scanner_done_1 (void); diff --git a/gcc/fortran/match.h b/gcc/fortran/match.h index 96d3ec1..30c0aa3 100644 --- a/gcc/fortran/match.h +++ b/gcc/fortran/match.h @@ -219,6 +219,7 @@ match gfc_match_contiguous (void); match gfc_match_dimension (void); match gfc_match_external (void); match gfc_match_gcc_attributes (void); +match gfc_match_gcc_unroll (void); match gfc_match_import (void); match gfc_match_intent (void); match gfc_match_intrinsic (void); diff --git a/gcc/fortran/parse.c b/gcc/fortran/parse.c index 2c7c554..95c35b9 100644 --- a/gcc/fortran/parse.c +++ 
b/gcc/fortran/parse.c @@ -882,6 +882,7 @@ decode_gcc_attribute (void) old_locus = gfc_current_locus; match (attributes, gfc_match_gcc_attributes, ST_ATTR_DECL); + match (unroll, gfc_match_gcc_unroll, ST_NONE); /* All else has failed, so give up. See if any of the matchers has stored an error message of some sort. */ @@ -4020,7 +4021,14 @@ parse_do_block (void) s.ext.end_do_label = new_st.label1; if (new_st.ext.iterator != NULL) -stree = new_st.ext.iterator-var-symtree; +{ + stree = new_st.ext.iterator-var-symtree; + if (directive_unroll != -1) + { + new_st.ext.iterator-unroll = directive_unroll; + directive_unroll = -1; + } +}
Re: [testsuite] Run guality tests on Solaris
Jeff Law l...@redhat.com writes:

> On 01/30/15 01:19, Jakub Jelinek wrote:
>> The biggest problem is that what fails and what does not varies between
>> targets and between optimization levels. Right now we have no way to
>> xfail test XYZ for -Os on x86_64-linux and for -O2 and -O3 on
>> i686-linux ia32, and the lists would become very large. Some tests in
>> guality are xfailed just in case, even when they actually XPASS on
>> many targets.
>
> I thought we added that kind of capability a while back. There's still
> significant potential for them to get unwieldy. The hope would be that
> we'd have a set for x86, x86_64, aarch64, etc, but not have to do
> anything special for the OS.

I fear this won't suffice: it certainly will depend on the debug format used, and even so there are differences between Linux/x86 and Solaris/x86, both using ELF and DWARF (perhaps a DWARF-4 vs. DWARF-2 difference?). And Darwin/x86 with Mach-O will certainly differ again (not currently noticeable since the guality tests are disabled there wholesale).

>> The way to look for regressions in the guality area, at least as I do
>> it regularly, is just compare test_summary results. If we'd disable
>> this by default, I'm sure our debug quality would sink very quickly.
>
> Yup. But it'd still be nicer if our test runs were cleaner.

Very true. I wonder how best to go forward with filing PRs for the failures: one PR per failing test may be overkill, but it would require lots of analysis to group failures by common cause.

Rainer

--
Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [PATCH 4/4] OpenMP 4.0 offloading to Intel MIC: non-fallback testing
On 28 Jan 19:20, Ilya Verbin wrote: On 28 Jan 17:15, Jakub Jelinek wrote: On Wed, Jan 28, 2015 at 07:02:59PM +0300, Ilya Verbin wrote: + = XNEWVEC (char, len + sizeof (-B ../ DEFAULT_TARGET_MACHINE +/libgomp/)); + sprintf (optional_target_path2, -B%s/../../../ DEFAULT_TARGET_MACHINE + /libgomp/, current_path); This will surely overflow the buffer, won't it? There is space just for ../ but you put there /../../../. I'd strongly prefer if you rewrote all these XNEWVEC or XRESIZEVEC etc. + sprintf cases into concat, like optional_target_path2 = concat (-B, current_path, /../../../ DEFAULT_TARGET_MACHINE /libgomp/, NULL); and similar. That way you avoid all such bugs. The variable 'len' contains sizeof (/../../). I agree that this code looks ugly :) I'll rewrite it using concat. Here is the patch with concat. diff --git a/gcc/config.gcc b/gcc/config.gcc index abd915e..0ebdbd2 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -4374,7 +4374,7 @@ fi case ${target} in i[34567]86-*-* | x86_64-*-*) if test x$enable_as_accelerator = xyes; then - extra_programs=mkoffload\$(exeext) + extra_programs=mkoffload\$(exeext) accel/${target_noncanonical}/mkoffload$(exeext) fi ;; esac diff --git a/gcc/config/i386/intelmic-mkoffload.c b/gcc/config/i386/intelmic-mkoffload.c index edc3f92e..bc71004 100644 --- a/gcc/config/i386/intelmic-mkoffload.c +++ b/gcc/config/i386/intelmic-mkoffload.c @@ -22,13 +22,13 @@ #include config.h #include libgen.h -#include libgomp-plugin.h #include system.h #include coretypes.h #include obstack.h #include intl.h #include diagnostic.h #include collect-utils.h +#include intelmic-offload.h const char tool_name[] = intelmic mkoffload; @@ -45,6 +45,13 @@ const char *temp_files[MAX_NUM_TEMPS]; /* Shows if we should compile binaries for i386 instead of x86-64. */ bool target_ilp32 = false; +/* Optional prefixes for the target compiler, which are required when target + compiler is not installed. 
*/ +char *optional_target_path1 = NULL; +char *optional_target_path2 = NULL; +char *optional_target_lib_path = NULL; + + /* Delete tempfiles and exit function. */ void tool_cleanup (bool from_signal ATTRIBUTE_UNUSED) @@ -151,14 +158,17 @@ access_check (const char *name, int mode) return access (name, mode); } -/* Find target compiler using a path from COLLECT_GCC or COMPILER_PATH. */ +/* Find target compiler using a path from COLLECT_GCC, COMPILER_PATH, or a path + relative to ARGV0. */ static char * -find_target_compiler (const char *name) +find_target_compiler (const char *argv0) { bool found = false; char **paths = NULL; unsigned n_paths, i; + const char *current_path; const char *collect_path = dirname (ASTRDUP (getenv (COLLECT_GCC))); + const char *name = GCC_INSTALL_NAME; size_t len = strlen (collect_path) + 1 + strlen (name) + 1; char *target_compiler = XNEWVEC (char, len); sprintf (target_compiler, %s/%s, collect_path, name); @@ -177,13 +187,32 @@ find_target_compiler (const char *name) if (access_check (target_compiler, X_OK) == 0) { found = true; - break; + goto out; } } + XDELETEVEC (target_compiler); + + /* If installed compiler wasn't found, try to find a non-installed compiler, + using a path relative to mkoffload. */ + current_path = dirname (ASTRDUP (argv0)); + target_compiler = concat (current_path, /../../xgcc, NULL); + if (access_check (target_compiler, X_OK) == 0) +{ + optional_target_path1 = concat (-B, current_path, /../../, NULL); + optional_target_path2 + = concat (-B, current_path, + /../../../ DEFAULT_TARGET_MACHINE /libgomp/, NULL); + optional_target_lib_path + = concat (-L, current_path, + /../../../ DEFAULT_TARGET_MACHINE /libgomp/.libs/, NULL); + found = true; +} out: free_array_of_ptrs ((void **) paths, n_paths); - return found ? 
target_compiler : NULL; + if (!found) +fatal_error (offload compiler %s not found, name); + return target_compiler; } static void @@ -193,6 +222,14 @@ compile_for_target (struct obstack *argv_obstack) obstack_ptr_grow (argv_obstack, -m32); else obstack_ptr_grow (argv_obstack, -m64); + + if (optional_target_path1) +obstack_ptr_grow (argv_obstack, optional_target_path1); + if (optional_target_path2) +obstack_ptr_grow (argv_obstack, optional_target_path2); + if (optional_target_lib_path) +obstack_ptr_grow (argv_obstack, optional_target_lib_path); + obstack_ptr_grow (argv_obstack, NULL); char **argv = XOBFINISH (argv_obstack, char **); @@ -346,7 +383,7 @@ generate_host_descr_file (const char *host_compiler) init (void)\n {\n GOMP_offload_register
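The review point in this thread (compute the length from the same pieces that are written, instead of sizing a buffer by hand for sprintf) can be sketched as below. `join_strings` is a hypothetical helper written only for this illustration; it mimics the calling convention of libiberty's `concat` (a NULL-terminated list of strings, result allocated on the heap) but is not that function and uses plain malloc.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: concatenates a NULL-terminated list of strings
   into a freshly malloc'ed buffer.  The caller frees the result.
   Returns NULL if allocation fails.  */
char *join_strings(const char *first, ...)
{
    va_list ap;
    const char *s;
    size_t len = 0;

    /* First pass: add up the lengths of exactly the pieces we will copy,
       so the buffer size cannot drift out of sync with the contents.  */
    va_start(ap, first);
    for (s = first; s != NULL; s = va_arg(ap, const char *))
        len += strlen(s);
    va_end(ap);

    char *result = malloc(len + 1);
    if (result == NULL)
        return NULL;

    /* Second pass: copy each piece in order.  */
    char *p = result;
    va_start(ap, first);
    for (s = first; s != NULL; s = va_arg(ap, const char *)) {
        size_t n = strlen(s);
        memcpy(p, s, n);
        p += n;
    }
    va_end(ap);

    assert((size_t)(p - result) == len);
    *p = '\0';
    return result;
}
```

A call such as `join_strings("-B", "/build/gcc", "/../../", (char *)NULL)` yields "-B/build/gcc/../../"; the path here is made up, and the explicit cast on the terminating NULL matters because the argument is passed through a variadic parameter list.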
Re: [PATCH libstdc++] Fix for std::uncaught_exception (PR 62258)
On 2 February 2015 at 02:37, Michael Hanselmann wrote:
> Calls to `std::uncaught_exception' after calling
> `std::rethrow_exception' always return `true', when
> `std::uncaught_exception' should return `false' unless an exception
> is in flight. `std::rethrow_exception' does not update
> `__cxa_eh_globals::uncaughtExceptions', while the following call to
> `__cxa_begin_catch' decrements it. This fixes PR 62258.

The patch looks correct, but I think it can wait until the trunk reopens after the GCC 5 release.
Re: [Patch, libstdc++/64649] Fix regex_traits::lookup_collatename and regex_traits::lookup_classname
On 23/01/15 13:20 -0800, Tim Shen wrote:
> On Wed, Jan 21, 2015 at 9:10 PM, Tim Shen tims...@google.com wrote:
> Submitted version. I think this patch fits 4.9 branch well?

I don't think this needs to go on the 4.9 branch; apparently I'm the only person who's noticed the problem. I expect it's quite rare to try using those functions with forward iterators.
Re: [Patch, libstdc++/64680] Conform the standard regex interface
On 01/02/15 00:18 -0800, Tim Shen wrote:
> On Wed, Jan 21, 2015 at 9:08 PM, Tim Shen tims...@google.com wrote:
> Fixed and committed. I believe this one is also suitable for 4.9? I
> guess we don't have a 'code freeze' for 4.9 branch as we do for 5.0
> late stage?

Release branches are always in regression-fixes-and-docs-only mode, but this is OK for 4.9, thanks.
Re: [PATCH] PR preprocessor/64803 - __LINE__ inside macro is not constant
On Fri, Jan 30, 2015 at 10:19:26AM +0100, Dodji Seketeli wrote:
> [This is a P1 regression for gcc 5]
>
> libcpp/ChangeLog:
>
> * internal.h (cpp_reader::top_most_macro_node): New data member.
> * macro.c (enter_macro_context): Pass the location of the end of
> the top-most invocation of the function-like macro, or the
> location of the expansion point of the top-most object-like
> macro.
> (cpp_get_token_1): Store the top-most macro node in the new
> pfile->top_most_macro_node data member.

The thing that worries me a little bit about the patch is that the new field is never cleared, only overwritten the next time we attempt to expand a function-like (?) toplevel macro. So outside of that it can be stale and point to dead memory. But if it is guaranteed that it won't be accessed in that case, perhaps that is safe.

Jakub
Re: [PATCH][OpenMP] Forbid usage of non-target functions in target regions
On Sun, Jan 11, 2015 at 11:06:52PM +0300, Ilya Verbin wrote: On 09 Jan 16:02, Jakub Jelinek wrote: On Fri, Jan 09, 2015 at 05:57:02PM +0300, Ilya Verbin wrote: If one (by mistake) calls a non-target function from the target region, the offload compiler crashes in input_overwrite_node. This is because compute_ltrans_boundary during streaming-out includes to SET the non-offloadable nodes, called from offloadable nodes. Probably it's possible to ignore such incorrect nodes (and edges) in streaming-out, but such a situation can not appear in a correct OpenMP 4.0 program, therefore I've added a check to scan_omp_1_stmt. Unlike variables, the spec last time I've checked isn't all that clear about that. I think that GCC shouldn't allow such calls, at least for non-external functions. Otherwise why the 'declare target' directive is needed at all? As for external functions, it's an open question, e.g. for MIC it's OK to have in target region a call to a function from a native library, like printf. --- a/gcc/omp-low.c +++ b/gcc/omp-low.c @@ -2818,6 +2818,19 @@ scan_omp_1_stmt (gimple_stmt_iterator *gsi, bool *handled_ops_p, default: break; } + else if (!DECL_EXTERNAL (fndecl) + !cgraph_node::get_create (fndecl)-offloadable) What about if fndecl is defined in the current TU, but as global symbol and can be interposed (e.g. is in a shared library and not hidden in there), the local function definition is without target attribute but the definition used at runtime is not? I believe that if it is defined in the current TU and is used in a target region, then it should be declared as target in current TU too. + { + omp_context *octx; + if (cgraph_node::get (current_function_decl)-offloadable) + remove = true; + for (octx = ctx; octx !remove; octx = octx-outer) + if (is_targetreg_ctx (octx)) + remove = true; + if (remove) + error_at (gimple_location (stmt), function called from + target region, but not marked as 'declare target'); %declare target% ? Fixed. 
I guess I'm ok with warning in that case, but not with erroring out, at least not without -Werror. If somebody doing this causes an ICE, you should fix the problem in the offloading LTO, or wherever it is. Generally, if something goes wrong during the offloading compilation, the solution should be just to give up on offloading to that particular offloading target (i.e. fill in the sections libgomp reads in a way that will result in host fallback).

Jakub
Re: [PATCH][RFC][OpenMP] Forbid target* pragmas in target regions
On Mon, Jan 12, 2015 at 12:22:44AM +0300, Ilya Verbin wrote:
> Currently if a target* pragma appears within a target region, GCC
> successfully compiles such code (with a warning). But the binary
> fails at run-time, since it tries to call GOMP_target* functions on
> target.
>
> The spec says: If a target, target update, or target data construct
> appears within a target region then the behavior is unspecified.
>
> I see 2 options to make the behavior more user-friendly:
> 1. To return an error at compile-time.
> 2. To check at run-time in libgomp whether GOMP_target* is called on
> target, and perform target-fallback if so.
>
> If we select option #1, the patch is ready.

Option #1 is just wrong. There is nothing wrong with such constructs appearing in #pragma omp declare target functions etc.; the problem is if you hit them at runtime. You can very well have, say, a #pragma omp declare target function that optionally invokes a #pragma omp target region, e.g. based on its parameters, the state of global variables, what other functions return etc. - and the program can be written so that that condition just never happens if the function is already offloaded.

Jakub
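A minimal sketch of the situation Jakub describes, assuming OpenMP 4.0 syntax; `scale` and its `try_offload` parameter are hypothetical names. The function is marked `declare target`, yet it contains a `target` region that is only entered when the caller asks for it:

```c
/* Hypothetical example: a 'declare target' function that only
   optionally enters a 'target' region.  Without -fopenmp the pragmas
   are ignored and everything runs on the host.  */
#pragma omp declare target
int
scale (int x, int try_offload)
{
  int result = x;

  if (try_offload)
    {
      /* Reached only when the caller requests offloading; a program
         can arrange for this never to happen once the function is
         itself running on the device.  */
#pragma omp target map(tofrom: result)
      result *= 2;
    }
  else
    result *= 2;

  return result;
}
#pragma omp end declare target
```

Rejecting this at compile time would forbid a program whose nested construct is never reached on the device, which is the objection raised against option #1.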
Re: [RFC] PR64703, glibc sysdeps/powerpc/powerpc64/dl-machine.h miscompile
On Fri, Jan 30, 2015 at 10:33:06AM +0100, Jakub Jelinek wrote:
> On Fri, Jan 30, 2015 at 10:12:35AM +0100, Richard Biener wrote:
> > Ok - without digging into why the above would fail with your patch
> > (don't see that - the use in the function call can't be opdd) -
> > let's take a step back and decide whether we want to allow
> > user-created function descriptors. And, if we do, whether we should
> > rather expose this in a more sensible way to GCC, e.g. by using a
> > (target) builtin. Say, force you to do
> >
> > int (*f) (int) = __builtin_fdesc (opd.fd_func, opd.fd_toc, opd.fd_aux);
> > return f (3);
> >
> > which would allow GCC to even optimize the call to a direct one if
> > it (or the target) can fold reads from the __builtin_fdesc argument
> > to a function decl. Similar builtins could allow you to inspect a
> > function descriptor. That way the actual memory operations would be
> > hidden from the middle-end.
>
> That would be my preference too. Constructing the calls this way is
> so rare that pessimizing 99.9% of code out there that doesn't ever
> need this is IMHO undesirable.

I had a look at what it would take to fix the const function testcase, and rapidly came to the conclusion that it is not something I should attempt. I'd need at least a week, probably more, to be confident of making a proper fix. Right now, I see FUD in comments and read Fear Uncertainty and Doubt, rather than Function Use-Def chains. ;-) Besides, I hear your comments about pessimizing code.

The exercise also made me view the patch I submitted as a half-baked hack. :-( Well, maybe not quite that bad, but enough to have misgivings about applying the patch. I suspect that future changes to tree optimization passes will break glibc again (at least, versions of glibc that don't have the asm fix), so it's probably better to not apply an incomplete fix. We're in a really grey area of the C standard when it comes to the glibc code.

--
Alan Modra
Australia Development Lab, IBM
Re: [Patch, fortran] Cosmetics: Dup. code removal, indent fix, typo fix.
Hi Jerry, thanks for the review. Committed as r220345. Regards, Andre On Sat, 31 Jan 2015 07:41:24 -0800 Jerry DeLisle jvdeli...@charter.net wrote: On 01/30/2015 04:10 AM, Andre Vehreschild wrote: Hi all, I fear this fix is not so obvious in one location, I therefore ask for a review. The attached patch fixes: - a duplicate code fragment (possibly due to merged twice), - the indentation in the trans-expr.c block (in my first patch), and - a typo on the datatype-size to create for the charlen. The length of a char-array is stored as a 4-byte BT_INTEGER. Due to a typo a 1-byte BT_INTEGER was requested. The patch fixes this. I know this patch mixes several trivial issues. Should I do separate patches for each of them, or what is the most desirable method? Cosmetic things are OK to throw in the mix as long as you have the Changelog. Duplicate code removal is just about as obvious as one can get. No need for separate patches. Bootstraps and regtests ok on x86_64-linux-gnu/FC20. I have learned the hard way that if you forget the testing even on trivial things it can byte you. OK to commit! Thanks. -- Andre Vehreschild * Email: vehre ad gmx dot de Index: gcc/fortran/trans-decl.c === --- gcc/fortran/trans-decl.c (Revision 220344) +++ gcc/fortran/trans-decl.c (Arbeitskopie) @@ -1443,8 +1443,6 @@ if (sym-ts.type == BT_CLASS sym-backend_decl) GFC_DECL_CLASS(sym-backend_decl) = 1; - if (sym-ts.type == BT_CLASS sym-backend_decl) - GFC_DECL_CLASS(sym-backend_decl) = 1; return sym-backend_decl; } Index: gcc/fortran/trans-expr.c === --- gcc/fortran/trans-expr.c (Revision 220344) +++ gcc/fortran/trans-expr.c (Arbeitskopie) @@ -660,26 +660,26 @@ expression can be evaluated to a constant one. */ else { - /* Try to simplify the expression. */ - gfc_simplify_expr (e, 0); - if (e-expr_type == EXPR_CONSTANT !e-ts.u.cl-resolved) -{ - /* Amazingly all data is present to compute the length of a - constant string, but the expression is not yet there. 
*/ - e-ts.u.cl-length = gfc_get_constant_expr (BT_INTEGER, 1, - e-where); - mpz_set_ui (e-ts.u.cl-length-value.integer, - e-value.character.length); - gfc_conv_const_charlen (e-ts.u.cl); - e-ts.u.cl-resolved = 1; - gfc_add_modify (parmse-pre, ctree, e-ts.u.cl-backend_decl); -} - else -{ - gfc_error (Can't compute the length of the char array at %L., - e-where); -} -} + /* Try to simplify the expression. */ + gfc_simplify_expr (e, 0); + if (e-expr_type == EXPR_CONSTANT !e-ts.u.cl-resolved) + { + /* Amazingly all data is present to compute the length of a + constant string, but the expression is not yet there. */ + e-ts.u.cl-length = gfc_get_constant_expr (BT_INTEGER, 4, + e-where); + mpz_set_ui (e-ts.u.cl-length-value.integer, + e-value.character.length); + gfc_conv_const_charlen (e-ts.u.cl); + e-ts.u.cl-resolved = 1; + gfc_add_modify (parmse-pre, ctree, e-ts.u.cl-backend_decl); + } + else + { + gfc_error (Can't compute the length of the char array at %L., + e-where); + } + } } /* Pass the address of the class object. */ parmse-expr = gfc_build_addr_expr (NULL_TREE, var); Index: gcc/fortran/ChangeLog === --- gcc/fortran/ChangeLog (Revision 220344) +++ gcc/fortran/ChangeLog (Arbeitskopie) @@ -1,3 +1,10 @@ + +2015-01-30 Andre Vehreschild ve...@gmx.de + + * trans-decl.c (gfc_get_symbol_decl): Removed duplicate code. + * trans-expr.c (gfc_conv_intrinsic_to_class): Fixed indentation. + Fixed datatype of charlen to be a 32-bit int. + 2015-02-01 Joseph Myers jos...@codesourcery.com * error.c (gfc_warning (const char *, ...), gfc_warning_now (const
Re: [[ARM/AArch64][testsuite] 03/36] Add vmax, vmin, vhadd, vhsub and vrhadd tests.
On 26 January 2015 at 14:23, Christophe Lyon christophe.l...@linaro.org wrote: On 26 January 2015 at 13:10, Tejas Belagod tejas.bela...@arm.com wrote: On 25/01/15 21:05, Christophe Lyon wrote: On 23 January 2015 at 14:44, Christophe Lyon christophe.l...@linaro.org wrote: On 23 January 2015 at 12:42, Christophe Lyon christophe.l...@linaro.org wrote: On 23 January 2015 at 11:18, Tejas Belagod tejas.bela...@arm.com wrote: On 22/01/15 21:31, Christophe Lyon wrote: On 22 January 2015 at 16:22, Tejas Belagod tejas.bela...@arm.com wrote: On 22/01/15 14:28, Christophe Lyon wrote: On 22 January 2015 at 12:19, Tejas Belagod tejas.bela...@arm.com wrote: On 21/01/15 15:07, Christophe Lyon wrote: On 19 January 2015 at 17:54, Marcus Shawcroft marcus.shawcr...@gmail.com wrote: On 19 January 2015 at 15:43, Christophe Lyon christophe.l...@linaro.org wrote: On 19 January 2015 at 14:29, Marcus Shawcroft marcus.shawcr...@gmail.com wrote: On 16 January 2015 at 17:52, Christophe Lyon christophe.l...@linaro.org wrote: OK provided, as per the previous couple, that we don;t regression or introduce new fails on aarch64[_be] or aarch32. This patch shows failures on aarch64 and aarch64_be for vmax and vmin when the input is -NaN. It's a corner case, and my reading of the ARM ARM is that the result should the same as on aarch32. I haven't had time to look at it in more details though. So, not OK? They should have the same behaviour in aarch32 and aarch64. Did you test on HW or a model? I ran the tests on qemu for aarch32 and aarch64-linux, and on the foundation model for aarch64*-elf. Leave this one out until we understand why it fails. /Marcus I've looked at this a bit more. We have fmaxv0.4s, v0.4s, v1.4s where v0 is a vector of -NaN (0xffc0) and v1 is a vector of 1. The output is still -NaN (0xffc0), while the test expects defaultNaN (0x7fc0). In the AArch32 execution state, Advanced SIMD FP arithmetic always uses the DefaultNaN setting regardless of the DN-bit value in the FPSCR. 
In AArch64 execution state, result of Advanced SIMD FP arithmetic operations depend on the value of the DN-bit i.e. either propagate the input NaN or generate DefaultNaN depending on the value of DN. Maybe I'm using an outdated doc. On page 2282 of ARMv8 ARM rev C, I can see only the latter (no diff between aarch32 and aarch64 in FPProcessNan pseudo-code) If you see pg. 4005 in the same doc(rev C), you'll see the FPSCR spec - under DN: The value of this bit only controls scalar floating-point arithmetic. Advanced SIMD arithmetic always uses the Default NaN setting, regardless of the value of the DN bit. Also on page 3180 for the description of VMAX(vector FP), it says: * max(+0.0, -0.0) = +0.0 * If any input is a NaN, the corresponding result element is the default NaN. Oops I was looking at FMAX (vector) pg 936. The pseudocode for FPMax () on pg. 3180 passes StandardFPSCRValue() to FPMax() which is on pg. 2285 // StandardFPSCRValue() // FPCRType StandardFPSCRValue() return ‘0’ : FPSCR.AHP : ‘11’ Here bit-25(FPSCR.DN) is set to 1. So, we should get defaultNaN too on aarch64, and no need to try to force DN to 1 in gdb? What can be wrong? On pg 3180, I see VMAX(FPSIMD) for A32/T32, not A64. I hope we're reading the same document. Regardless of the page number, if you see the pseudocode for VMAX(FPSIMD) for AArch32, StandardFPSCRValue() (i.e. DN = 1) is passed to FPMax() which means generate DefaultNaN() regardless. OTOH, on pg 936, you have FMAX(vector) for A64 where FPMax() in the pseudocode gets just FPCR. Ok, that was my initial understanding but our discussion confused me. And that's why I tried to force DN = 1 in gdb before single-stepping over fmaxv0.4s, v0.4s, v1.4s but it changed nothing :-( Hence my question about a gdb possible bug or misuse. Hmm... user error, I missed one bit set $fpcr=0x200 works under gdb. I'll try modifying the test to have it force DN=1. Forcing DN=1 in the test makes it pass. 
I am going to look at adding that cleanly to my test, and resubmit it. Thanks, and sorry for the noise. Here is the updated version: - Now I set DN=1 on AArch64 in clean_results, as it is the main initialization function. - I removed the double negative :-) - I removed the useless [u]int64 and poly variants Christophe. 2015-01-25 Christophe Lyon christophe.l...@linaro.org * gcc.target/aarch64/advsimd-intrinsics/arm-neon-ref.h (_ARM_FPSRC): Add DN and AHP fields. (clean_results): Force DN=1 on AArch64. * gcc.target/aarch64/advsimd-intrinsics/binary_op_no64.inc: New file. * gcc.target/aarch64/advsimd-intrinsics/vhadd.c: New file. * gcc.target/aarch64/advsimd-intrinsics/vhsub.c: New file. * gcc.target/aarch64/advsimd-intrinsics/vmax.c: New file. *
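The two NaN encodings discussed in this thread differ only in the sign bit. A small sketch, assuming IEEE 754 single precision and assuming that the abbreviated values 0xffc0 and 0x7fc0 quoted above stand for the 32-bit patterns 0xffc00000 and 0x7fc00000; the helper names are hypothetical. The DN mask uses bit 25, as stated in the thread:

```c
#include <stdint.h>
#include <string.h>

/* Assumed full 32-bit encodings of the values abbreviated in the thread.  */
#define NEGATIVE_QNAN_BITS UINT32_C (0xffc00000)  /* the -NaN input */
#define DEFAULT_NAN_BITS   UINT32_C (0x7fc00000)  /* the Default NaN */

/* Bit 25 of the floating-point control register is the DN bit.  */
#define DN_BIT (UINT32_C (1) << 25)

/* Reinterpret a 32-bit pattern as a float without aliasing problems.  */
float
float_from_bits (uint32_t bits)
{
  float f;
  memcpy (&f, &bits, sizeof f);
  return f;
}

/* Return the control-register value with DN forced to 1.  */
uint32_t
force_default_nan (uint32_t control)
{
  return control | DN_BIT;
}
```

Both patterns are NaNs, so a test that compares bit patterns, as the expected-results tables in these tests appear to do, can only pass on AArch64 if DN is set before the operation runs.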
Re: [PATCH] Warn about unclosed pragma omp declare target.
On Tue, Jul 29, 2014 at 06:45:01PM +0400, Ilya Tocar wrote:
> Hi,
>
> As discussed in https://gcc.gnu.org/ml/gcc/2014-01/msg00189.html
> GCC should complain about pragma omp declare target without a
> corresponding pragma omp end declare target. This patch adds a
> warning for those cases. Bootstraps/passes make-check. Ok for trunk?
>
> ChangeLog:
>
> 2014-07-29 Ilya Tocar ilya.to...@intel.com
>
> * c-decl.c (omp_declare_target_location_stack): New.
> * c-lang.h (omp_declare_target_location_stack): Declare.
> * c-parser.c (warn_unclosed_pragma_omp_target): New.
> (c_parser_translation_unit): Call it.
> (c_parser_omp_declare_target): Remember location.
> (c_parser_omp_end_declare_target): Forget location.

Sorry for the long delay on this. Can you check what will happen if you have an unclosed #pragma omp declare target in some header you precompile? If you get the warning during the header compilation and then not during compilation using that PCH header, supposedly it might be fine and the patch might be ok as is. I mean something like

a.h:
#pragma omp declare target
int i;

a.c:
#include "a.h"
#pragma omp declare target
int j;
#pragma omp declare target
int k;
int main () { }

gcc -fopenmp -o a.gch a.h
gcc -fopenmp -o a a.c

If we wanted to warn even on a.c, supposedly the vector would need to be marked for GC.

Jakub
Fix crossmodule inline hint
Hi,

the inliner uses a cross-module hint that, during LTO, prefers in-module inlining over cross-module inlining. This hint is wrong for comdats that get merged, because their module information is then more or less random. The patch fixes this by adding a merged flag to cgraph_node that indicates merged comdats, and by always disabling the hint for those.

Bootstrapped/regtested x86_64-linux.

Honza

* ipa-inline-analysis.c (simple_edge_hints): Fix check for cross-module inlining.
* cgraph.h (cgraph_node): Add flag merged.
* ipa-icf.c (sem_function::merge): Maintain it.
* lto-symtab.c (lto_cgraph_replace_node): Maintain merged flag.

Index: ipa-inline-analysis.c === --- ipa-inline-analysis.c (revision 220329) +++ ipa-inline-analysis.c (working copy) @@ -3702,13 +3702,15 @@ simple_edge_hints (struct cgraph_edge *e int hints = 0; struct cgraph_node *to = (edge-caller-global.inlined_to ? edge-caller-global.inlined_to : edge-caller); + struct cgraph_node *callee = edge-callee-ultimate_alias_target (); if (inline_summaries-get (to)-scc_no inline_summaries-get (to)-scc_no == inline_summaries-get (edge-callee)-scc_no !edge-recursive_p ()) hints |= INLINE_HINT_same_scc; - if (to-lto_file_data edge-callee-lto_file_data - to-lto_file_data != edge-callee-lto_file_data) + if (callee-lto_file_data edge-caller-lto_file_data + edge-caller-lto_file_data != callee-lto_file_data + !callee-merged) hints |= INLINE_HINT_cross_module; return hints; Index: ipa-icf.c === --- ipa-icf.c (revision 220329) +++ ipa-icf.c (working copy) @@ -711,6 +711,10 @@ sem_function::merge (sem_item *alias_ite } alias-icf_merged = true; + if (local_original-lto_file_data + alias-lto_file_data + local_original-lto_file_data != alias-lto_file_data) + local_original-merged = true; /* The alias function is removed if symbol address does not matter.
*/ @@ -725,6 +729,10 @@ sem_function::merge (sem_item *alias_ite else if (create_alias) { alias-icf_merged = true; + if (local_original-lto_file_data + alias-lto_file_data + local_original-lto_file_data != alias-lto_file_data) + local_original-merged = true; /* Remove the function's body. */ ipa_merge_profiles (original, alias); @@ -762,6 +770,10 @@ sem_function::merge (sem_item *alias_ite } alias-icf_merged = true; + if (local_original-lto_file_data + alias-lto_file_data + local_original-lto_file_data != alias-lto_file_data) + local_original-merged = true; ipa_merge_profiles (local_original, alias, true); alias-create_wrapper (local_original); Index: lto/lto-symtab.c === --- lto/lto-symtab.c(revision 220329) +++ lto/lto-symtab.c(working copy) @@ -88,6 +88,8 @@ lto_cgraph_replace_node (struct cgraph_n gcc_assert (!prevailing_node-global.inlined_to); prevailing_node-mark_address_taken (); } + if (node-definition prevailing_node-definition) +prevailing_node-merged = true; /* Redirect all incoming edges. */ compatible_p Index: cgraph.h === --- cgraph.h(revision 220329) +++ cgraph.h(working copy) @@ -1296,6 +1296,8 @@ public: other operation that could make previously non-trapping memory accesses trapping. */ unsigned nonfreeing_fn : 1; + /* True if there was multiple COMDAT bodies merged by lto-symtab. */ + unsigned merged : 1; }; /* A cgraph node set is a collection of cgraph nodes. A cgraph node
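The condition the patch installs in simple_edge_hints can be read as a small predicate. A sketch under heavy simplification: `struct node` and `cross_module_hint_p` are hypothetical stand-ins for illustration, not GCC's cgraph_node or its API:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for cgraph_node.  */
struct node
{
  const void *lto_file_data;  /* module the body came from, or NULL */
  bool merged;                /* set when several bodies were merged */
};

/* Return true if inlining CALLEE into CALLER should be treated as
   cross-module.  Both nodes must carry module information, the
   modules must differ, and the callee must not be a merged body,
   because a merged body's module is effectively arbitrary.  */
bool
cross_module_hint_p (const struct node *caller, const struct node *callee)
{
  return callee->lto_file_data != NULL
         && caller->lto_file_data != NULL
         && caller->lto_file_data != callee->lto_file_data
         && !callee->merged;
}
```

The last conjunct is the new part: a merged comdat never triggers the hint, whatever module it happens to be attributed to.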
Re: [PATCH] Add new target h8300-*-linux
At Sun, 1 Feb 2015 00:39:08 +, Joseph Myers wrote:
> On Sat, 31 Jan 2015, Yoshinori Sato wrote:
> > + * config/h8300/linux.h: New file.
> > + * config/h8300/t-linux: New file.
>
> These files don't appear to be included in the patch.

I'll resend it.

> > +h8300-*-linux*)
> > + tmake_file=t-linux h8300/t-linux t-fpbit
> > + tm_file=$tm_file h8300/h8300-lib.h
> > + ;;
>
> Is there a good reason for using fp-bit instead of soft-fp here?

No, it was copied from h8300-elf.

> --
> Joseph S. Myers
> jos...@codesourcery.com

--
Yoshinori Sato
ys...@users.sourceforge.jp
Go patch committed: Fix 32-bit host to 64-bit target cross-compilation
In the backend interface for the Go frontend, I foolishly used size_t for the size of a type. That usually works, but fails when compiling on a 32-bit host for a 64-bit target and compiling code that uses very large types. The maximum type size for any Go target is a signed 64-bit number, so for simplicity I changed the type size and alignment routines to all return int64_t. The rest of this patch adjusts the Go frontend to deal with that correctly, and to fix the overflow checks when dealing with types whose size does not fit into a host unsigned long. This is PRs 64836 and 64838. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline. Ian 2015-02-02 Ian Lance Taylor i...@google.com PR go/64836 PR go/64838 * go-gcc.cc (Gcc_backend::type_size): Change return type to int64_t. (Gcc_backend::type_alignment): Likewise. (Gcc_backend::type_field_alignment): Likewise. (Gcc_backend::type_field_offset): Likewise. (Gcc_backend::implicit_variable): Change alignment parameter type to int64_t. Index: gcc/go/go-gcc.cc === --- gcc/go/go-gcc.cc(revision 219876) +++ gcc/go/go-gcc.cc(working copy) @@ -223,16 +223,16 @@ class Gcc_backend : public Backend bool is_circular_pointer_type(Btype*); - size_t + int64_t type_size(Btype*); - size_t + int64_t type_alignment(Btype*); - size_t + int64_t type_field_alignment(Btype*); - size_t + int64_t type_field_offset(Btype*, size_t index); // Expressions. @@ -411,7 +411,7 @@ class Gcc_backend : public Backend Bvariable* implicit_variable(const std::string, Btype*, bool, bool, bool, - size_t); + int64_t); void implicit_variable_set_init(Bvariable*, const std::string, Btype*, @@ -1097,7 +1097,7 @@ Gcc_backend::is_circular_pointer_type(Bt // Return the size of a type. 
-size_t +int64_t Gcc_backend::type_size(Btype* btype) { tree t = btype-get_tree(); @@ -1106,14 +1106,14 @@ Gcc_backend::type_size(Btype* btype) t = TYPE_SIZE_UNIT(t); gcc_assert(tree_fits_uhwi_p (t)); unsigned HOST_WIDE_INT val_wide = TREE_INT_CST_LOW(t); - size_t ret = static_castsize_t(val_wide); - gcc_assert(ret == val_wide); + int64_t ret = static_castint64_t(val_wide); + gcc_assert(ret = 0 static_castunsigned HOST_WIDE_INT(ret) == val_wide); return ret; } // Return the alignment of a type. -size_t +int64_t Gcc_backend::type_alignment(Btype* btype) { tree t = btype-get_tree(); @@ -1124,7 +1124,7 @@ Gcc_backend::type_alignment(Btype* btype // Return the alignment of a struct field of type BTYPE. -size_t +int64_t Gcc_backend::type_field_alignment(Btype* btype) { tree t = btype-get_tree(); @@ -1135,7 +1135,7 @@ Gcc_backend::type_field_alignment(Btype* // Return the offset of a field in a struct. -size_t +int64_t Gcc_backend::type_field_offset(Btype* btype, size_t index) { tree struct_tree = btype-get_tree(); @@ -1149,9 +1149,8 @@ Gcc_backend::type_field_offset(Btype* bt gcc_assert(field != NULL_TREE); } HOST_WIDE_INT offset_wide = int_byte_position(field); - gcc_assert(offset_wide = 0); - size_t ret = static_castsize_t(offset_wide); - gcc_assert(ret == static_castunsigned HOST_WIDE_INT(offset_wide)); + int64_t ret = static_castint64_t(offset_wide); + gcc_assert(ret == offset_wide); return ret; } @@ -2609,7 +2608,7 @@ Gcc_backend::temporary_variable(Bfunctio Bvariable* Gcc_backend::implicit_variable(const std::string name, Btype* type, bool is_hidden, bool is_constant, - bool is_common, size_t alignment) + bool is_common, int64_t alignment) { tree type_tree = type-get_tree(); if (type_tree == error_mark_node) Index: gcc/go/gofrontend/backend.h === --- gcc/go/gofrontend/backend.h (revision 219876) +++ gcc/go/gofrontend/backend.h (working copy) @@ -216,22 +216,22 @@ class Backend is_circular_pointer_type(Btype*) = 0; // Return the size of a type. 
- virtual size_t + virtual int64_t type_size(Btype*) = 0; // Return the alignment of a type. - virtual size_t + virtual int64_t type_alignment(Btype*) = 0; // Return the alignment of a struct field of this type. This is // normally the same as type_alignment, but not always. - virtual size_t + virtual int64_t type_field_alignment(Btype*) = 0; // Return the offset of field INDEX in a struct type. INDEX is the // entry in the FIELDS std::vector parameter of struct_type or // set_placeholder_struct_type. - virtual size_t + virtual int64_t type_field_offset(Btype*, size_t index) = 0; // Expressions. @@ -575,7 +575,7 @@ class Backend // If ALIGNMENT is not zero, it is the desired alignment of the variable. virtual Bvariable* implicit_variable(const std::string name, Btype* type,
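The problem described at the top of this message can be sketched in a few lines of C: a 32-bit `size_t` keeps only the low 32 bits of a large type size, whereas an `int64_t` holds it as long as the value is checked against the signed range. The helper names below are hypothetical and are not part of the Go frontend:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: convert an unsigned 64-bit size to int64_t,
   reporting failure instead of silently wrapping.  */
bool
size_to_int64 (uint64_t wide, int64_t *out)
{
  if (wide > (uint64_t) INT64_MAX)
    return false;
  *out = (int64_t) wide;
  return true;
}

/* What a 32-bit size_t would keep of the same value: only the low
   32 bits survive the conversion.  */
uint32_t
truncate_to_32_bits (uint64_t wide)
{
  return (uint32_t) wide;
}
```

For an 8 GiB type (2^33 bytes), `truncate_to_32_bits` yields 0 while `size_to_int64` succeeds, which is the 32-bit-host, 64-bit-target case the patch addresses.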
Re: [PATCH] PR preprocessor/64803 - __LINE__ inside macro is not constant
Jakub Jelinek ja...@redhat.com writes:
> On Fri, Jan 30, 2015 at 10:19:26AM +0100, Dodji Seketeli wrote:
> > [This is a P1 regression for gcc 5]
> >
> > libcpp/ChangeLog:
> >
> > * internal.h (cpp_reader::top_most_macro_node): New data member.
> > * macro.c (enter_macro_context): Pass the location of the end of
> > the top-most invocation of the function-like macro, or the
> > location of the expansion point of the top-most object-like
> > macro.
> > (cpp_get_token_1): Store the top-most macro node in the new
> > pfile->top_most_macro_node data member.
>
> The thing that worries me a little bit on the patch is that the new
> field is never cleared, only overwritten next time we attempt to
> expand a function-like? toplevel macro. So outside of that it can be
> stale, point to a dead memory. But if it is guaranteed it won't be
> accessed in that case, perhaps that is safe.

Yes, that is correct. I didn't worry too much myself because cpp_reader::top_most_macro_node has the same validity span as cpp_reader::invocation. But then, unlike top_most_macro_node, cpp_reader::invocation is not a pointer, so it's rather harmless.

More precisely, pfile->top_most_macro_node is (for now) only accessed from within enter_macro_context; and there, normally, pfile->top_most_macro_node is set.

But I agree that we'd rather be safe than sorry. So I have updated the patch to clear that data member when the context of the top-most macro being expanded is popped.

I have only lightly tested it locally, but a proper bootstrap test is currently underway. Below is the patch I am currently bootstrapping.

libcpp/ChangeLog:

* internal.h (cpp_reader::top_most_macro_node): New data member.
* macro.c (enter_macro_context): Pass the location of the end of the top-most invocation of the function-like macro, or the location of the expansion point of the top-most object-like macro.
(cpp_get_token_1): Store the top-most macro node in the new pfile->top_most_macro_node data member.
(_cpp_pop_context): Clear the new cpp_reader::top_most_macro_node data member.
gcc/testsuite/ChangeLog: * gcc.dg/cpp/builtin-macro-1.c: New test case. Signed-off-by: Dodji Seketeli do...@redhat.com --- gcc/testsuite/gcc.dg/cpp/builtin-macro-1.c | 28 ++ libcpp/internal.h | 5 + libcpp/macro.c | 32 +++--- 3 files changed, 62 insertions(+), 3 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/cpp/builtin-macro-1.c diff --git a/gcc/testsuite/gcc.dg/cpp/builtin-macro-1.c b/gcc/testsuite/gcc.dg/cpp/builtin-macro-1.c new file mode 100644 index 000..90c2883 --- /dev/null +++ b/gcc/testsuite/gcc.dg/cpp/builtin-macro-1.c @@ -0,0 +1,28 @@ +/* Origin PR preprocessor/64803 + + This test ensures that the value the __LINE__ macro expands to is + constant and corresponds to the line of the closing parenthesis of + the top-most function-like macro expansion it's part of. + + { dg-do run } + { do-options -no-integrated-cpp } */ + +#include assert.h + +#define C(a, b) a ## b +#define L(x) C(L, x) +#define M(a) int L(__LINE__) = __LINE__; assert(L(__LINE__) == __LINE__); + +int +main() +{ + M(a +); + + assert(L20 == 20); /* 20 is the line number of the + closing parenthesis of the + invocation of the M macro. Please + adjust in case the layout of this + file changes. */ + return 0; +} diff --git a/libcpp/internal.h b/libcpp/internal.h index 1a74020..96ccc19 100644 --- a/libcpp/internal.h +++ b/libcpp/internal.h @@ -421,6 +421,11 @@ struct cpp_reader macro invocation. */ source_location invocation_location; + /* This is the node representing the macro being expanded at + top-level. The value of this data member is valid iff + in_macro_expansion_p() returns TRUE. */ + cpp_hashnode *top_most_macro_node; + /* Nonzero if we are about to expand a macro. Note that if we are really expanding a macro, the function macro_of_context returns the macro being expanded and this flag is set to false. 
Client diff --git a/libcpp/macro.c b/libcpp/macro.c index 9571345..90ed11a 100644 --- a/libcpp/macro.c +++ b/libcpp/macro.c @@ -1228,7 +1228,24 @@ enter_macro_context (cpp_reader *pfile, cpp_hashnode *node, pfile-about_to_expand_macro_p = false; /* Handle built-in macros and the _Pragma operator. */ - return builtin_macro (pfile, node, location); + { +source_location loc; +if (/* The top-level macro invocation that triggered the expansion + we are looking at is with a standard macro ...*/ + !(pfile-top_most_macro_node-flags NODE_BUILTIN) + /* ... and it's a function-like macro invocation. */ +pfile-top_most_macro_node-value.macro-fun_like) + /* Then the location
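The property this patch restores can be checked with a much smaller program than the testsuite file: every use of `__LINE__` inside one macro expansion should yield the same value, even when the invocation spans several source lines. A sketch with hypothetical names; the result depends on the compiler, and a compiler affected by PR 64803 would be expected to return 0 here:

```c
/* Record __LINE__ twice inside a single macro expansion.  */
#define RECORD_LINES(first, second, unused) \
  do { (first) = __LINE__; (second) = __LINE__; } while (0)

/* Return 1 if both uses of __LINE__ inside the expansion agree.  */
int
line_is_constant_in_macro (void)
{
  int first = 0, second = 0;

  /* The invocation deliberately spans several lines.  */
  RECORD_LINES (first,
                second,
                0
  );

  return first == second;
}
```

Unlike the test in the patch, this does not check which line is reported, only that the value is constant within one expansion.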
Re: nvptx-tools and nvptx-newlib (was: The nvptx port [10/11+] Target files)
Hi! On Tue, 23 Dec 2014 19:49:35 +0100, I wrote: On Mon, 10 Nov 2014 17:19:57 +0100, Bernd Schmidt ber...@codesourcery.com wrote: The scripts (11/11) I've put up on github, along with a hacked up newlib. These are at [...] They are likely to migrate to MentorEmbedded from bernds, but that had some permissions problems last week. That has recently been done: https://github.com/MentorEmbedded/nvptx-tools and https://github.com/MentorEmbedded/nvptx-newlib are now available. (I'm aware that we still are to write up how to actually build and test all this.) I just updated https://gcc.gnu.org/wiki/Offloading?action=diffrev2=26rev1=25. OK to check in the following to trunk? commit a0c73cb76d1f13642df7725d64bc618ee0909abc Author: Thomas Schwinge tho...@codesourcery.com Date: Mon Feb 2 16:29:36 2015 +0100 Begin documenting the nvptx backend. gcc/ * doc/install.texi (nvptx-*-none): New section. * doc/invoke.texi (Nvidia PTX Options): Likewise. * config/nvptx/nvptx.opt: Update. --- gcc/config/nvptx/nvptx.opt | 10 +- gcc/doc/install.texi | 23 +++ gcc/doc/invoke.texi| 26 ++ 3 files changed, 54 insertions(+), 5 deletions(-) diff --git gcc/config/nvptx/nvptx.opt gcc/config/nvptx/nvptx.opt index 1448dfc..249a61d 100644 --- gcc/config/nvptx/nvptx.opt +++ gcc/config/nvptx/nvptx.opt @@ -17,13 +17,13 @@ ; along with GCC; see the file COPYING3. If not see ; http://www.gnu.org/licenses/. -m64 -Target Report RejectNegative Mask(ABI64) -Generate code for a 64 bit ABI - m32 Target Report RejectNegative InverseMask(ABI64) -Generate code for a 32 bit ABI +Generate code for a 32-bit ABI + +m64 +Target Report RejectNegative Mask(ABI64) +Generate code for a 64-bit ABI mmainkernel Target Report RejectNegative diff --git gcc/doc/install.texi gcc/doc/install.texi index c9e3bf1..b31f9b6 100644 --- gcc/doc/install.texi +++ gcc/doc/install.texi @@ -3302,6 +3302,8 @@ information have to. 
@item @uref{#nds32be-x-elf,,nds32be-*-elf} @item +@uref{#nvptx-x-none,,nvptx-*-none} +@item @uref{#powerpc-x-x,,powerpc*-*-*} @item @uref{#powerpc-x-darwin,,powerpc-*-darwin*} @@ -4269,6 +4271,27 @@ Andes NDS32 target in big endian mode. @html <hr /> @end html +@anchor{nvptx-x-none} +@heading nvptx-*-none +Nvidia PTX target. + +Instead of GNU binutils, you will need to install +@uref{https://github.com/MentorEmbedded/nvptx-tools/,,nvptx-tools}. +Tell GCC where to find it: +@option{--with-build-time-tools=[install-nvptx-tools]/nvptx-none/bin}. + +A nvptx port of newlib is available at +@uref{https://github.com/MentorEmbedded/nvptx-newlib/,,nvptx-newlib}. +It can be automatically built together with GCC@. For this, add a +symbolic link to nvptx-newlib's @file{newlib} directory to the +directory containing the GCC sources. + +Use the @option{--disable-sjlj-exceptions} and +@option{--enable-newlib-io-long-long} options when configuring. + +@html +<hr /> +@end html @anchor{powerpc-x-x} @heading powerpc-*-* You can specify a default version for the @option{-mcpu=@var{cpu_type}} diff --git gcc/doc/invoke.texi gcc/doc/invoke.texi index ba81ec7..1fb329e 100644 --- gcc/doc/invoke.texi +++ gcc/doc/invoke.texi @@ -840,6 +840,9 @@ Objective-C and Objective-C++ Dialects}. -mcustom-fpu-cfg=@var{name} @gol -mhal -msmallc -msys-crt0=@var{name} -msys-lib=@var{name}} +@emph{Nvidia PTX Options} +@gccoptlist{-m32 -m64 -mmainkernel} + @emph{PDP-11 Options} @gccoptlist{-mfpu -msoft-float -mac0 -mno-ac0 -m40 -m45 -m10 @gol -mbcopy -mbcopy-builtin -mint32 -mno-int16 @gol @@ -11967,6 +11970,7 @@ platform. * MSP430 Options:: * NDS32 Options:: * Nios II Options:: +* Nvidia PTX Options:: * PDP-11 Options:: * picoChip Options:: * PowerPC Options:: @@ -18277,6 +18281,28 @@ This option is typically used to link with a library provided by a HAL BSP. 
@end table +@node Nvidia PTX Options +@subsection Nvidia PTX Options +@cindex Nvidia PTX options +@cindex nvptx options + +These options are defined for Nvidia PTX: + +@table @gcctabopt + +@item -m32 +@itemx -m64 +@opindex m32 +@opindex m64 +Generate code for 32-bit or 64-bit ABI. + +@item -mmainkernel +@opindex mmainkernel +Link in code for a __main kernel. This is for stand-alone instead of +offloading execution. + +@end table + @node PDP-11 Options @subsection PDP-11 Options @cindex PDP-11 Options Regards, Thomas
Re: [[ARM/AArch64][testsuite] 03/36] Add vmax, vmin, vhadd, vhsub and vrhadd tests.
2015-01-25 Christophe Lyon christophe.l...@linaro.org * gcc.target/aarch64/advsimd-intrinsics/arm-neon-ref.h (_ARM_FPSRC): Add DN and AHP fields. (clean_results): Force DN=1 on AArch64. * gcc.target/aarch64/advsimd-intrinsics/binary_op_no64.inc: New file. * gcc.target/aarch64/advsimd-intrinsics/vhadd.c: New file. * gcc.target/aarch64/advsimd-intrinsics/vhsub.c: New file. * gcc.target/aarch64/advsimd-intrinsics/vmax.c: New file. * gcc.target/aarch64/advsimd-intrinsics/vmin.c: New file. * gcc.target/aarch64/advsimd-intrinsics/vrhadd.c: New file. I guess you don't need the fake dependency fix for this as this is mostly called only once? Yes, that is my current assumption: for the time being there is no other code which can potentially change this value. + _ARM_FPSCR _afpscr_for_dn; + asm volatile ("mrs %0,fpcr" : "=r" (_afpscr_for_dn)); + _afpscr_for_dn.b.DN = 1; + asm volatile ("msr fpcr,%0" : : "r" (_afpscr_for_dn)); Maybe in the future we'll want to check that DN=0 means that we actually forward a NaN != DefaultNaN, but that can be a further improvement to this patch. Marcus, Is it OK to commit this one? This is the only remaining one from this series. Yep, that's ok /Marcus
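For readers following the quoted hunk: it reads FPCR, sets the DN (Default NaN) bit, and writes the register back. The following is only a minimal sketch of that read-modify-write, not the code from arm-neon-ref.h. The helper names `fpcr_with_dn` and `force_default_nan` are made up for illustration, and the sketch assumes the architectural position of DN at bit 25 of FPCR instead of the bitfield type used in the patch. The register access is compiled only on AArch64, so the block builds elsewhere too.

```c
#include <stdint.h>

/* Assumed bit position of DN (Default NaN) in the AArch64 FPCR.  */
#define FPCR_DN_BIT 25

/* Hypothetical helper: return VALUE with the DN bit set.  */
static uint64_t
fpcr_with_dn (uint64_t value)
{
  return value | (UINT64_C (1) << FPCR_DN_BIT);
}

/* Hypothetical helper: force DN=1 on AArch64; a no-op on other targets.  */
static void
force_default_nan (void)
{
#if defined (__aarch64__)
  uint64_t fpcr;
  __asm__ volatile ("mrs %0, fpcr" : "=r" (fpcr));
  fpcr = fpcr_with_dn (fpcr);
  __asm__ volatile ("msr fpcr, %0" : : "r" (fpcr));
#endif
}
```

Keeping the bit manipulation in a separate pure function makes it checkable on any host, while the inline asm stays confined to the target that has the register.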
Re: [PATCH] Relax check against commuting XOR and ASHIFTRT in combine.c
Rainer Orth wrote: I'm still not really comfortable with those target lists; they tend to artificially exclude tests on targets where they are perfectly capable of running. At least with the comments added, it's better than before with no explanation whatsoever. Perhaps Mike can weigh in here? Well, it's been awhile, but on further reflection - my feeling is that we should be dropping the target lists here too. Maybe we end up introducing a dg-skip-if that grows over time, but it'd have to grow quite a bit to reach the size of the dg-do target we'd otherwise have... However I am a bit wary about dropping the dg-do target constraint just as we are nearing a release! So if we were to keep the whitelist approach, your patch looks good to me, and I'd be happy if that were committed. Cheers, Alan
[PATCH] Improve SSA propagator wrt instruction combining
I have queued the following patch for stage1 which improves the ability to utilize the match-and-simplify combiner during SSA propagation (by CCP and VRP). It makes sure to mark stmts as not-simulate-again when possible so that the valueization hooks know when it is safe to combine multiple statements. I had to prevent CCP from prematurely deciding that simplifying isn't worth it as well. Bootstrapped and tested on x86_64-unknown-linux-gnu. I have applied the changes to valueize_op_1 and vrp_valueize_1 now, as requested by Jakub. Thanks, Richard. 2015-02-02 Richard Biener rguent...@suse.de * tree-ssa-ccp.c (likely_value): See if we have operands that are marked as never simulate again and return CONSTANT in this case. (valueize_op_1): Always allow valueizing default-defs. * tree-vrp.c (vrp_valueize_1): Likewise. * tree-ssa-propagate.c (simulate_stmt): Mark stmts that do not have any operands that will be simulated again as not being simulated again. * gcc.dg/tree-ssa/ssa-ccp-35.c: New testcase. * gcc.dg/tree-ssa/pr37508.c: Adjust. * gfortran.dg/reassoc_6.f: Remove XFAIL. 
Index: gcc/tree-ssa-ccp.c === *** gcc/tree-ssa-ccp.c.orig 2015-01-30 13:20:25.794134935 +0100 --- gcc/tree-ssa-ccp.c 2015-02-02 15:48:47.810028364 +0100 *** static ccp_lattice_t *** 645,650 --- 645,651 likely_value (gimple stmt) { bool has_constant_operand, has_undefined_operand, all_undefined_operands; + bool has_nsa_operand; tree use; ssa_op_iter iter; unsigned i; *** likely_value (gimple stmt) *** 667,672 --- 668,674 has_constant_operand = false; has_undefined_operand = false; all_undefined_operands = true; + has_nsa_operand = false; FOR_EACH_SSA_TREE_OPERAND (use, stmt, iter, SSA_OP_USE) { ccp_prop_value_t *val = get_value (use); *** likely_value (gimple stmt) *** 678,683 --- 680,689 if (val->lattice_val == CONSTANT) has_constant_operand = true; + + if (SSA_NAME_IS_DEFAULT_DEF (use) + || !prop_simulate_again_p (SSA_NAME_DEF_STMT (use))) + has_nsa_operand = true; } /* There may be constants in regular rhs operands. For calls we *** likely_value (gimple stmt) *** 750,757 /* We do not consider virtual operands here -- load from read-only memory may have only VARYING virtual operands, but still be ! constant. */ if (has_constant_operand || gimple_references_memory_p (stmt)) return CONSTANT; --- 756,765 /* We do not consider virtual operands here -- load from read-only memory may have only VARYING virtual operands, but still be ! constant. Also we can combine the stmt with definitions from ! operands whose definitions are not simulated again. */ if (has_constant_operand + || has_nsa_operand || gimple_references_memory_p (stmt)) return CONSTANT; *** valueize_op_1 (tree op) *** 1145,1151 this SSA edge as the SSA propagator does not necessarily re-visit the use. */ gimple def_stmt = SSA_NAME_DEF_STMT (op); ! if (prop_simulate_again_p (def_stmt)) return NULL_TREE; tree tem = get_constant_value (op); if (tem) --- 1153,1160 this SSA edge as the SSA propagator does not necessarily re-visit the use. */ gimple def_stmt = SSA_NAME_DEF_STMT (op); ! 
if (!gimple_nop_p (def_stmt) ! && prop_simulate_again_p (def_stmt)) return NULL_TREE; tree tem = get_constant_value (op); if (tem) Index: gcc/tree-vrp.c === *** gcc/tree-vrp.c.orig 2015-01-30 13:20:25.816135225 +0100 --- gcc/tree-vrp.c 2015-02-02 15:48:47.825028438 +0100 *** vrp_valueize_1 (tree name) *** 7096,7102 this SSA edge as the SSA propagator does not necessarily re-visit the use. */ gimple def_stmt = SSA_NAME_DEF_STMT (name); ! if (prop_simulate_again_p (def_stmt)) return NULL_TREE; value_range_t *vr = get_value_range (name); if (range_int_cst_singleton_p (vr)) --- 7096,7103 this SSA edge as the SSA propagator does not necessarily re-visit the use. */ gimple def_stmt = SSA_NAME_DEF_STMT (name); ! if (!gimple_nop_p (def_stmt) ! && prop_simulate_again_p (def_stmt)) return NULL_TREE; value_range_t *vr = get_value_range (name); if (range_int_cst_singleton_p (vr)) Index: gcc/tree-ssa-propagate.c === *** gcc/tree-ssa-propagate.c.orig 2015-01-30 13:20:25.817135239 +0100 --- gcc/tree-ssa-propagate.c 2015-02-02 15:48:47.825028438
Re: [PATCH] Change __ARM_NEON__ to __ARM_NEON in libcpp/lex.c
On 30/01/15 19:14, Richard Henderson wrote: On 01/30/2015 04:52 AM, Szabolcs Nagy wrote: When running natively on AArch64 the preprocessor did not use the AdvSIMD optimized search_line_fast function, because it was ifdefed around by __ARM_NEON__ instead of __ARM_NEON. Yes, sorry I didn't follow up on that from September: https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00149.html You also want to change to use vaddvq_u16 instead of the current vpadd and vget_lane reduction. I was going to wait until stage1 to resubmit this. that's a bigger change, can it be a separate patch or do you plan to do it together with the ifdef fix?
[PATCH][libstdc++][Testsuite] isctype test fails for newlib.
Hello, With target arm-none-eabi, the libstdc++ tests 28_regex/traits/char/isctype.cc and 28_regex/traits/wchar_t/isctype.cc fail at -- VERIFY(!t.isctype('\n', t.lookup_classname(range("blank")))); -- This is because libstdc++ puts '\n' in the 'space' character class, rather than 'blank' when building on newlib. This problem was known when support for the blank character class was added to libstdc++ (see https://gcc.gnu.org/ml/gcc-patches/2014-10/msg01902.html) so this failure is not unexpected. Changes to newlib that would have allowed the problem to be fixed were made (https://sourceware.org/ml/newlib/2009/msg00342.html) but then reverted (https://sourceware.org/ml/newlib/2009/msg00438.html). This patch modifies the test to add a special case for the behaviour with newlib. Tested by running check-target-libstdc++-v3 - libstdc++-dg/conformance.exp, with the modified tests, for arm-none-eabi and aarch64-none-linux-gnu. No new failures and the modified tests now pass on arm-none-eabi. Ok for trunk? Matthew libstdc++-v3/testsuite/ 2015-02-02 Matthew Wahab matthew.wa...@arm.com * 28_regex/traits/char/isctype.cc (test01): Add newlib special case for '\n'. * 28_regex/traits/wchar_t/isctype.cc (test01): Likewise. diff --git a/libstdc++-v3/testsuite/28_regex/traits/char/isctype.cc b/libstdc++-v3/testsuite/28_regex/traits/char/isctype.cc index a7b1396..df0dac8 100644 --- a/libstdc++-v3/testsuite/28_regex/traits/char/isctype.cc +++ b/libstdc++-v3/testsuite/28_regex/traits/char/isctype.cc @@ -53,7 +53,12 @@ test01() VERIFY(!t.isctype('_', t.lookup_classname(range("digit")))); VERIFY( t.isctype(' ', t.lookup_classname(range("blank")))); VERIFY( t.isctype('\t', t.lookup_classname(range("blank")))); +#if defined (__NEWLIB__) + /* newlib includes '\n' in class 'blank'. 
*/ + VERIFY( t.isctype('\n', t.lookup_classname(range("blank")))); +#else VERIFY(!t.isctype('\n', t.lookup_classname(range("blank")))); +#endif VERIFY( t.isctype('t', t.lookup_classname(range("upper"), true))); VERIFY( t.isctype('T', t.lookup_classname(range("lower"), true))); #undef range diff --git a/libstdc++-v3/testsuite/28_regex/traits/wchar_t/isctype.cc b/libstdc++-v3/testsuite/28_regex/traits/wchar_t/isctype.cc index e450f6d..b6088bd 100644 --- a/libstdc++-v3/testsuite/28_regex/traits/wchar_t/isctype.cc +++ b/libstdc++-v3/testsuite/28_regex/traits/wchar_t/isctype.cc @@ -50,7 +50,12 @@ test01() VERIFY(!t.isctype(L'_', t.lookup_classname(range("digit")))); VERIFY( t.isctype(L' ', t.lookup_classname(range("blank")))); VERIFY( t.isctype(L'\t', t.lookup_classname(range("blank")))); +#if defined (__NEWLIB__) + /* newlib includes '\n' in class 'blank'. */ + VERIFY( t.isctype(L'\n', t.lookup_classname(range("blank")))); +#else VERIFY(!t.isctype(L'\n', t.lookup_classname(range("blank")))); +#endif VERIFY( t.isctype(L't', t.lookup_classname(range("upper"), true))); VERIFY( t.isctype(L'T', t.lookup_classname(range("lower"), true))); #undef range
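The test above hinges on whether '\n' is reported as a member of the 'blank' class. One way to see what a given C library says, independent of libstdc++, is to query the standard C99 `isblank` function directly. This is only a sketch: the helper name `in_blank_class` is hypothetical, and whether libstdc++'s regex traits end up consulting the same classification on a given target is an assumption, not something this thread states.

```c
#include <ctype.h>

/* Hypothetical helper: report whether the C library's isblank()
   accepts C in the current locale.  The cast avoids undefined
   behaviour for negative char values.  */
static int
in_blank_class (int c)
{
  return isblank ((unsigned char) c) != 0;
}
```

Calling `in_blank_class ('\n')` on the target in question would show which branch of the `__NEWLIB__` conditional matches that library's behaviour.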
Re: [PATCH] Change __ARM_NEON__ to __ARM_NEON in libcpp/lex.c
On 02/02/15 15:34, Szabolcs Nagy wrote: On 30/01/15 19:14, Richard Henderson wrote: On 01/30/2015 04:52 AM, Szabolcs Nagy wrote: When running natively on AArch64 the preprocessor did not use the AdvSIMD optimized search_line_fast function, because it was ifdefed around by __ARM_NEON__ instead of __ARM_NEON. Yes, sorry I didn't follow up on that from September: https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00149.html You also want to change to use vaddvq_u16 instead of the current vpadd and vget_lane reduction. I was going to wait until stage1 to resubmit this. that's a bigger change, can it be a separate patch or do you plan to do it together with the ifdef fix? I think the two should be separated. The existing code will work on AArch64, even though it could be improved upon. R.
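The ifdef issue discussed in this thread comes down to which spelling of the macro a compiler predefines: code guarded only by `__ARM_NEON__` is skipped on a target that defines just `__ARM_NEON`. A small sketch for checking this on a given toolchain follows; the function name `neon_macro_spelling` is made up for illustration and is not part of libcpp.

```c
/* Hypothetical helper: name the NEON feature macro this compiler
   predefines, preferring the spelling the patch switches to.  */
static const char *
neon_macro_spelling (void)
{
#if defined (__ARM_NEON)
  return "__ARM_NEON";
#elif defined (__ARM_NEON__)
  return "__ARM_NEON__";
#else
  return "none";
#endif
}
```

On a host without NEON this returns "none", which is the case where a generic fallback would be selected.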
Re: [PATCH] Relax check against commuting XOR and ASHIFTRT in combine.c
Hi Alan, I'm still not really comfortable with those target lists; they tend to artificially exclude tests on targets where they are perfectly capable of running. At least with the comments added, it's better than before with no explanation whatsoever. Perhaps Mike can weigh in here? Well, it's been awhile, but on further reflection - my feeling is that we should be dropping the target lists here too. Maybe we end up introducing a dg-skip-if that grows over time, but it'd have to grow quite a bit to reach the size of the dg-do target we'd otherwise have... It's not even necessary to use dg-skip if the scan-rtl-dump fails. You can just add an xfail there, which has the advantage that you do notice if the test starts to pass e.g. due to changes in a target. However I am a bit wary about dropping the dg-do target constraint just as we are nearing a release! So if we were to keep the whitelist approach, your patch looks good to me, and I'd be happy if that were committed. Let's give others a day or two to comment: if nobody is in favour of the more aggressive approach, I'll commit my patch. Thanks. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [PATCH IRA] update_equiv_regs fails to set EQUIV reg-note for pseudo with more than one definition
On 02/02/15 08:59, Alex Velenko wrote: On 11/10/14 13:44, Felix Yang wrote: Hello Jeff, I see that you have improved the RTL typesafety issue for ira.c, so I rebased this patch on the latest trunk and change to use the new list walking interface. Bootstrapped on x86_64-SUSE-Linux and make check regression tested. OK for trunk? Hi Felix, I believe your patch causes a regression for arm-none-eabi. FAIL: gcc.target/arm/pr43920-2.c object-size text = 54 FAIL: gcc.target/arm/pr43920-2.c scan-assembler-times pop 2 This happens because your patch stops reuse of code for return -1; statements in pr43920-2.c. As far as I investigated, your patch prevents adding (expr_list (-1) (nil) in ira pass, which prevents jump2 optimization from happening. So before, in ira pass I could see: (insn 9 53 34 8 (set (reg:SI 110 [ D.4934 ]) (const_int -1 [0x])) /work/fsf-trunk-ref-2/src/gcc/gcc/testsuite/gcc.target/arm/pr43920-2.c:20 613 {*thumb2_movsi_vfp} (expr_list:REG_EQUAL (const_int -1 [0x]) (nil))) But with your patch I get (insn 9 53 34 8 (set (reg:SI 110 [ D.5322 ]) (const_int -1 [0x])) /work/fsf-trunk-2/src/gcc/gcc/testsuite/gcc.target/arm/pr43920-2.c:20 615 {*thumb2_movsi_vfp} (nil)) This causes a code generation regression and needs to be fixed. Kind regards, We'd need to see the full dumps. In particular is reg110 set anywhere else? If so then the change is doing precisely what it should be doing and the test needs to be updated to handle the different code we generate. Jeff
Re: [PATCH/AARCH64] Fix 64893: ICE with vget_lane_u32 with C++ front-end at -O0
On Mon, Feb 02, 2015 at 02:51:43PM -0800, Andrew Pinski wrote: While trying to build the GCC 5 with GCC 5, I ran into an ICE when building libcpp at -O0. The problem is the C++ front-end was not folding sizeof(a)/sizeof(a[0]) when passed to a function at -O0. The C++ front-end keeps around sizeof until the gimplifier and there is no way to fold the expressions that involve them. So to work around the issue we need to change __builtin_aarch64_im_lane_boundsi to accept an extra argument and change the first two arguments to size_t type so we don't get an extra cast there and do the division inside the compiler itself. Relying on anything being folded at -O0 when the language does not guarantee it is going to be more and more of a problem. So I think your patch is reasonable (of course, I'll defer this to target maintainers). + rtx totalsize = expand_normal (CALL_EXPR_ARG (exp, 0)); + rtx elementsize = expand_normal (CALL_EXPR_ARG (exp, 1)); + if (CONST_INT_P (totalsize) && CONST_INT_P (elementsize)) + { + rtx lane_idx = expand_normal (CALL_EXPR_ARG (exp, 2)); + if (CONST_INT_P (lane_idx)) + aarch64_simd_lane_bounds (lane_idx, 0, UINTVAL (totalsize)/UINTVAL (elementsize), exp); Too long line? Also, missing spaces around / . And, ICE if somebody uses __builtin_aarch64_im_lane_boundsi (4, 0, 0); So you need to check and complain for zero elementsize too. + else + error ("%Klane index must be a constant immediate", exp); + } else - error ("%Klane index must be a constant immediate", exp); + sorry ("%Ktotal size and element size must be a constant immediate", exp); But why sorry? If you say the builtin requires constant arguments, then it is not sorry, but error, it is not an unimplemented feature. Jakub
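The review point about a zero element size can be illustrated outside the compiler. The sketch below is a plain C model of the check, assuming the lane count is total size divided by element size as in the quoted hunk; `lane_in_bounds` is a hypothetical name, and none of this is the GCC implementation, which works on RTL constants and reports diagnostics instead of returning a flag.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the lane bounds check: LANE is valid when it
   is below TOTAL_SIZE / ELEMENT_SIZE.  A zero ELEMENT_SIZE is rejected
   before the division so the check itself cannot fault.  */
static bool
lane_in_bounds (size_t total_size, size_t element_size, size_t lane)
{
  if (element_size == 0)
    return false;
  return lane < total_size / element_size;
}
```

With a 16-byte vector of 4-byte elements, lanes 0 to 3 are accepted and lane 4 is rejected; the arguments (4, 0, 0) from the review comment are rejected without dividing.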
RE: [PATCH MIPS RFA] Regression cleanup for nan2008 toolchain
Please could you add a comment explaining that the mips_nanlegacy is there because of the #include of system headers that might not compile with -mnan=legacy? I agree that that's a good reason, but it's not obvious without a comment. (And without a comment this could start a precedent of things being skipped in cases where the mips.exp options machinery could be updated instead.) True. Clarification added. Ok for trunk? Regards, Robert 2015-02-02 Robert Suchanek robert.sucha...@imgtec.com * gcc.target/mips/loongson-simd.c: Update comment to clarify the need for mips_nanlegacy target. diff --git a/gcc/testsuite/gcc.target/mips/loongson-simd.c b/gcc/testsuite/gcc.target/mips/loongson-simd.c index 949632e..9c3ebce 100644 --- a/gcc/testsuite/gcc.target/mips/loongson-simd.c +++ b/gcc/testsuite/gcc.target/mips/loongson-simd.c @@ -21,7 +21,10 @@ along with GCC; see the file COPYING3. If not see /* { dg-do run } */ /* loongson.h does not handle or check for MIPS16ness or microMIPSness. There doesn't seem any good reason for it to, given - that the Loongson processors do not support either. */ + that the Loongson processors do not support either. The effective target + mips_nanlegacy is required for a toolchain without the legacy NaN support + because inclusion of some system headers e.g. stdint.h will fail due to not + finding stubs-o32_hard.h. */ /* { dg-require-effective-target mips_nanlegacy } */ /* { dg-options "isa=loongson -mhard-float -mno-micromips -mno-mips16 -flax-vector-conversions" } */
Re: [PATCH][libstdc++][Testsuite] isctype test fails for newlib.
On 2 February 2015 at 16:17, Paolo Carlini paolo.carl...@oracle.com wrote: Hi, On 02/02/2015 04:49 PM, Matthew Wahab wrote: Hello, With target arm-none-eabi, the libstdc++ tests 28_regex/traits/char/isctype.cc and 28_regex/traits/wchar_t/isctype.cc fail at -- VERIFY(!t.isctype('\n', t.lookup_classname(range("blank")))); -- This is because libstdc++ puts '\n' in the 'space' character class, rather than 'blank' when building on newlib. This problem was known when support for the blank character class was added to libstdc++ (see https://gcc.gnu.org/ml/gcc-patches/2014-10/msg01902.html) so this failure is not unexpected. Changes to newlib that would have allowed the problem to be fixed were made (https://sourceware.org/ml/newlib/2009/msg00342.html) but then reverted (https://sourceware.org/ml/newlib/2009/msg00438.html). This patch modifies the test to add a special case for the behaviour with newlib. Tested by running check-target-libstdc++-v3 - libstdc++-dg/conformance.exp, with the modified tests, for arm-none-eabi and aarch64-none-linux-gnu. No new failures and the modified tests now pass on arm-none-eabi. Ok for trunk? This is https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64467 so please note that in the ChangeLog. I guess the patch is Ok for trunk, but please also add in the comment a link to this message of yours, that is https://gcc.gnu.org/ml/gcc-patches/2015-02/msg00059.html. Thanks, Paolo. PS: please remember to always CC libstdc++-v3 patches to libstd...@gcc.gnu.org. Yes, not everyone subscribes to gcc-patches so please always send libstdc++ patches to the libstdc++ list, as documented at https://gcc.gnu.org/lists.html and in the libstdc++ manual.
[PING, www] Re: [PATCH] update_web_docs_svn: support the JIT docs (PR jit/64257)
On Mon, 2015-01-26 at 19:14 -0500, David Malcolm wrote: On Mon, 2015-01-26 at 15:21 -0700, Jeff Law wrote: On 01/26/15 09:42, David Malcolm wrote: update_web_docs_svn-support-the-JIT-documention-v2.patch From 7f7e15881981228e51b347f23df6e3106ddd68ea Mon Sep 17 00:00:00 2001 From: David Malcolm dmalc...@redhat.com Date: Fri, 23 Jan 2015 17:26:57 -0500 Subject: [PATCH] update_web_docs_svn: support the JIT documentation maintainer-scripts/ChangeLog: * update_web_docs_svn: Don't delete gcc/jit/docs or gcc/jit/jit-common.h, gcc/jit/notes.txt. Special case the building of the jit docs (using sphinx-build). Special case copying them up. OK. Thanks. I've committed this to trunk as r220149. Does this automatically get propagated to the machine that builds the website (and thus would be run next time the relevant cronjob runs)? Or does someone need to do additional work for this to go live? (if nothing else, the machine needs to have sphinx-build in its $PATH, as noted in the patch). Ping re ^ I'm hoping to have the jit docs on the gcc website. In the best of all worlds, with r220149, the jit docs might have appeared at: https://gcc.gnu.org/onlinedocs/jit but that's currently a 404. Presumably, some machine needs to have the relevant sphinx package installed (if that's OK [1]), and perhaps the update to the update_web_docs_svn script needs to make it onto that machine? Or is this more appropriate for the overseers list? Thanks Dave [1] otherwise, do I need to look into another way of getting the docs built for the site?
Re: [Patch, fortran] PR 64757 - [5 Regression] ICE in fold_convert_loc, at fold-const.c:2353
Dear Paul, I have tested your patch at https://gcc.gnu.org/ml/fortran/2015-01/txtwnaoa1115V.txt (the latest version) and I found that the test type_to_class_3.f03 is miscompiled (FAIL) with -flto -O0 -m64 (this does not happen with -flto -O0 -m32 or with -Ox and x!=0). In addition, while the reduced test

  type :: Test
    integer :: i
  end type
  type :: TestReference
    class(Test), allocatable :: test(:)
  end type
  type(TestReference) :: testList
  type(test), allocatable :: x(:)
  allocate (testList%test(2), source = [Test(99), Test(199)]) ! Works, of course
  print *, size(testList%test)
  x = testList%test
  print *, x
  end

gives what I expect, i.e.,

  2
  99 199

the variant

  type :: Test
    integer :: i
  end type
  type :: TestReference
    class(Test), allocatable :: test(:)
  end type
  type(TestReference) :: testList
  type(test), allocatable :: x(:)
  testList = TestReference([Test(99), Test(199)]) ! Gave: The rank of the element in the
                                                  ! structure constructor at (1) does not
                                                  ! match that of the component (1/0)
  print *, size(testList%test)
  x = testList%test
  print *, x
  end

gives

  1
  99

Last problem I see,

  print *, TestReference([Test(99), Test(199)])

gives the following ICE

  f951: internal compiler error: Bad IO basetype (7)
  type_to_class_3_red_2.f03:12:0: print *, TestReference([Test(99), Test(199)])

Cheers, Dominique
MAINTAINERS: resign as testsuite maintainer, update address
I retired from Mentor Graphics 3 weeks ago and have no immediate plans to be active in GCC, so I'm resigning as a testsuite maintainer. I'm leaving myself under Write After Approval with my personal email address so people can find me. Five years ago while between jobs I got an individual FSF copyright assignment; is that still valid? Janis
Re: [RFC][PR target/39726 P4 regression] match.pd pattern to do type narrowing
On Sat, 31 Jan 2015, Jeff Law wrote: The nice thing about wrapping the result inside a convert is the types for the inner operations will propagate from the type of the inner operands, which is exactly what we want. We then remove the hack assigning type and instead the original type will be used for the outermost convert. Those inner operands still need converting to unsigned for arithmetic. And FWIW, there's no reason to restrict the pattern to just masking off the sign bit. That's what the PR complains about, but we can do considerably better here. That's part of the reason why I put in the iterators -- to generalize this to more cases. Well, we want to move shorten_binary_op and shorten_compare to the new mechanism. -- Joseph S. Myers jos...@codesourcery.com
Re: [Patch, AArch64, Obvious] Fix PR64231.
On 30/01/15 13:25, Jakub Jelinek wrote: On Fri, Jan 23, 2015 at 11:03:13AM +, Tejas Belagod wrote: Hi, This is an almost obvious patch to fix PR64231 as discovered by A. Pinski and as proposed by Jakub. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64231 Regressions happy. OK to commit? This is ok for trunk. We have a real bug that we need to fix, if we have some more useful macro in the future, this can be rewritten to use that macro together with the many other spots that would be changed for it as well. But blocking the fix for it doesn't sound right to me. Thanks. Committed as r220348. Thanks, Tejas. 2015-01-23 Tejas Belagod tejas.bela...@arm.com Andrew Pinski pins...@gcc.gnu.org Jakub Jelinek ja...@gcc.gnu.org PR target/64231 * config/aarch64/aarch64.c (aarch64_classify_symbol): Fix large integer typing for small model. Use IN_RANGE. Jakub
Re: Re: [PATCH IRA] update_equiv_regs fails to set EQUIV reg-note for pseudo with more than one definition
On 11/10/14 13:44, Felix Yang wrote: Hello Jeff, I see that you have improved the RTL typesafety issue for ira.c, so I rebased this patch on the latest trunk and change to use the new list walking interface. Bootstrapped on x86_64-SUSE-Linux and make check regression tested. OK for trunk? Hi Felix, I believe your patch causes a regression for arm-none-eabi. FAIL: gcc.target/arm/pr43920-2.c object-size text = 54 FAIL: gcc.target/arm/pr43920-2.c scan-assembler-times pop 2 This happens because your patch stops reuse of code for return -1; statements in pr43920-2.c. As far as I investigated, your patch prevents adding (expr_list (-1) (nil) in ira pass, which prevents jump2 optimization from happening. So before, in ira pass I could see: (insn 9 53 34 8 (set (reg:SI 110 [ D.4934 ]) (const_int -1 [0x])) /work/fsf-trunk-ref-2/src/gcc/gcc/testsuite/gcc.target/arm/pr43920-2.c:20 613 {*thumb2_movsi_vfp} (expr_list:REG_EQUAL (const_int -1 [0x]) (nil))) But with your patch I get (insn 9 53 34 8 (set (reg:SI 110 [ D.5322 ]) (const_int -1 [0x])) /work/fsf-trunk-2/src/gcc/gcc/testsuite/gcc.target/arm/pr43920-2.c:20 615 {*thumb2_movsi_vfp} (nil)) This causes a code generation regression and needs to be fixed. Kind regards, Alex Index: gcc/ChangeLog === --- gcc/ChangeLog(revision 216116) +++ gcc/ChangeLog(working copy) @@ -1,3 +1,14 @@ +2014-10-11 Felix Yang felix.y...@huawei.com +Jeff Law l...@redhat.com + +* ira.c (struct equivalence): Change member is_arg_equivalence and replace +into boolean bitfields; turn member loop_depth into a short integer; add new +member no_equiv and reserved. +(no_equiv): Set no_equiv of struct equivalence if register is marked +as having no known equivalence. +(update_equiv_regs): Check all definitions for a multiple-set +register to make sure that the RHS have the same value. 
+ 2014-10-11 Martin Liska mli...@suse.cz PR/63376 Index: gcc/ira.c === --- gcc/ira.c (revision 216116) +++ gcc/ira.c (working copy) @@ -2902,12 +2902,14 @@ struct equivalence /* Loop depth is used to recognize equivalences which appear to be present within the same loop (or in an inner loop). */ - int loop_depth; + short loop_depth; /* Nonzero if this had a preexisting REG_EQUIV note. */ - int is_arg_equivalence; + unsigned char is_arg_equivalence : 1; /* Set when an attempt should be made to replace a register with the associated src_p entry. */ - char replace; + unsigned char replace : 1; + /* Set if this register has no known equivalence. */ + unsigned char no_equiv : 1; }; /* reg_equiv[N] (where N is a pseudo reg number) is the equivalence @@ -3255,6 +3257,7 @@ no_equiv (rtx reg, const_rtx store ATTRIBUTE_UNUSE if (!REG_P (reg)) return; regno = REGNO (reg); + reg_equiv[regno].no_equiv = 1; list = reg_equiv[regno].init_insns; if (list && list->insn () == NULL) return; @@ -3381,7 +3384,7 @@ update_equiv_regs (void) /* If this insn contains more (or less) than a single SET, only mark all destinations as having no known equivalence. */ - if (set == 0) + if (set == NULL_RTX) { note_stores (PATTERN (insn), no_equiv, NULL); continue; @@ -3476,16 +3479,49 @@ update_equiv_regs (void) if (note && GET_CODE (XEXP (note, 0)) == EXPR_LIST) note = NULL_RTX; - if (DF_REG_DEF_COUNT (regno) != 1 - && (! note + if (DF_REG_DEF_COUNT (regno) != 1) +{ + bool equal_p = true; + rtx_insn_list *list; + + /* If we have already processed this pseudo and determined it + can not have an equivalence, then honor that decision. */ + if (reg_equiv[regno].no_equiv) +continue; + + if (! note || rtx_varies_p (XEXP (note, 0), 0) || (reg_equiv[regno].replacement && ! 
rtx_equal_p (XEXP (note, 0), -reg_equiv[regno].replacement)))) -{ - no_equiv (dest, set, NULL); - continue; +reg_equiv[regno].replacement))) +{ + no_equiv (dest, set, NULL); + continue; +} + + list = reg_equiv[regno].init_insns; + for (; list; list = list->next ()) +{ + rtx note_tmp; + rtx_insn *insn_tmp; + + insn_tmp = list->insn (); + note_tmp = find_reg_note (insn_tmp, REG_EQUAL, NULL_RTX); + gcc_assert (note_tmp); + if (! rtx_equal_p (XEXP (note, 0), XEXP (note_tmp, 0))) +{ + equal_p = false; + break; +} +} + + if (! equal_p) +{ + no_equiv (dest,
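The core idea in the ChangeLog entry, "check all definitions for a multiple-set register to make sure that the RHS have the same value", can be modelled in isolation. The sketch below is a toy stand-in, not the ira.c code: it assumes each definition is reduced to a single integer value, whereas the patch compares REG_EQUAL notes with rtx_equal_p, and the name `all_defs_equal` is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model: VALUES holds the value assigned by each of the N
   definitions of one pseudo register.  An equivalence is only
   plausible when there is at least one definition and every
   definition assigns the same value as the first.  */
static bool
all_defs_equal (const long *values, size_t n)
{
  if (n == 0)
    return false;
  for (size_t i = 1; i < n; i++)
    if (values[i] != values[0])
      return false;
  return true;
}
```

In this model a register set to -1 at three places passes the check, while one set to -1 and 0 on different paths does not and would be marked as having no equivalence.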
Re: [PATCH][libstdc++][Testsuite] isctype test fails for newlib.
Hi, On 02/02/2015 04:49 PM, Matthew Wahab wrote: Hello, With target arm-none-eabi, the libstdc++ tests 28_regex/traits/char/isctype.cc and 28_regex/traits/wchar_t/isctype.cc fail at -- VERIFY(!t.isctype('\n', t.lookup_classname(range("blank")))); -- This is because libstdc++ puts '\n' in the 'space' character class, rather than 'blank' when building on newlib. This problem was known when support for the blank character class was added to libstdc++ (see https://gcc.gnu.org/ml/gcc-patches/2014-10/msg01902.html) so this failure is not unexpected. Changes to newlib that would have allowed the problem to be fixed were made (https://sourceware.org/ml/newlib/2009/msg00342.html) but then reverted (https://sourceware.org/ml/newlib/2009/msg00438.html). This patch modifies the test to add a special case for the behaviour with newlib. Tested by running check-target-libstdc++-v3 - libstdc++-dg/conformance.exp, with the modified tests, for arm-none-eabi and aarch64-none-linux-gnu. No new failures and the modified tests now pass on arm-none-eabi. Ok for trunk? I guess the patch is Ok for trunk, but please also add in the comment a link to this message of yours, that is https://gcc.gnu.org/ml/gcc-patches/2015-02/msg00059.html. Thanks, Paolo. PS: please remember to always CC libstdc++-v3 patches to libstd...@gcc.gnu.org.
Re: [nvptx-tools, committed] Also install [...]/nvptx-none/bin/ar and [...]/nvptx-none/bin/ranlib.
Hi Bernd! On Fri, 9 Jan 2015 16:38:51 +0100, Bernd Schmidt ber...@codesourcery.com wrote: On 12/23/2014 07:50 PM, Thomas Schwinge wrote: [nvptx-tools patches] I've pushed the three patches you sent to my github repository. It probably makes sense for you to switch over to the MentorEmbedded repositories, http://news.gmane.org/find-root.php?message_id=%3C87vbl2w69s.fsf%40kepler.schwinge.homeip.net%3E (where I already had pushed these patches). Should we be posting nvptx-tools patches on this list (gcc-patches), somewhere else, or not at all? I thought, on gcc-patches, as nvptx-tools are only to be used with GCC? (But Joseph and I have not posted the patches we pushed recently.) Grüße, Thomas
Re: [PATCH][libstdc++][Testsuite] isctype test fails for newlib.
On 2 February 2015 at 18:03, Matthew Wahab wrote: Updated patch attached and changelog below. Looks good, OK for trunk - thanks for fixing it.
Re: [PATCH] Added PLUGIN_FINISH_TYPE callback on enum type processing
I am forwarding this reply to Cary Coutant, Diego Novillo and Le-Chun Wu, as they were listed as the plugin maintainers. Cary, Diego, Le-Chun, please let me know if you are on it, or if I should send it to someone else. On 29 January 2015 at 22:32, Bruno Loff bruno.l...@gmail.com wrote: The issue was first reported by Joachim Wieland to the list g...@gcc.gnu.org, on Wed, Jan 19, 2011 (Subject: PLUGIN_FINISH_TYPE not executed for enums). A description of the problem/bug and how my patch addresses it. --- The problem was that when gcc plugins registered callbacks on the PLUGIN_FINISH_TYPE event, this event would not be triggered after an enum had finished processing. The function call that does this was not there; it seems to me that it has simply been forgotten. Bootstrapping and testing make bootstrap make -k check === gcc Summary === # of expected passes106729 # of expected failures 256 # of unsupported tests 1409 on x86_64 ubuntu linux 14.04 Furthermore, I tested the plugin functionality (with a gcc-with-python script), and it now works properly. (However, changes to gcc-with-python also had to be made so that enum type info is properly converted to python types; see my github fork for these changes https://github.com/bloff/gcc-python-plugin) The Patch --- From: bloff bloff.si...@gmail.com Date: Sun, 19 Oct 2014 14:54:01 +0100 Subject: [PATCH] Added PLUGIN_FINISH_TYPE callback on enum type processing First reported by Joachim Wieland to the list g...@gcc.gnu.org, on Wed, Jan 19, 2011 (Subject: PLUGIN_FINISH_TYPE not executed for enums). 
--- gcc/c/c-parser.c | 1 + 1 file changed, 1 insertion(+) diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c index 264c170..cb515aa 100644 --- a/gcc/c/c-parser.c +++ b/gcc/c/c-parser.c @@ -2324,6 +2324,7 @@ c_parser_declspecs (c_parser *parser, struct c_declspecs *specs, attrs_ok = true; seen_type = true; t = c_parser_enum_specifier (parser); + invoke_plugin_callbacks (PLUGIN_FINISH_TYPE, t.spec); declspecs_add_type (loc, specs, t); break; case RID_STRUCT: -- 1.9.1
Re: [RFC][PR target/39726 P4 regression] match.pd pattern to do type narrowing
On 02/02/15 01:57, Richard Biener wrote: The nice thing about wrapping the result inside a convert is the types for the inner operations will propagate from the type of the inner operands, which is exactly what we want. We then remove the hack assigning type and instead the original type will be used for the outermost convert. It's not even a hack but wrong ;) Correct supported syntax is + (with { tree type0 = TREE_TYPE (@0); } + (convert:type0 (bit_and (inner_op @0 @1) (convert @3))) Thus whenever the generator cannot auto-guess a type (or would guess the wrong one) you can explicitly specify a type to convert to. I found that explicit types were ignored in some cases. It was frustrating to say the least. But I think I've got this part doing what I want without the hack. Why do you restrict this to GENERIC? On GIMPLE you'd eventually want to impose some single-use constraints as the result with all the conversions won't really be unconditionally better? That was strictly because of the mismatch between the resulting type and how it was later used. That restriction shouldn't be needed anymore. Jeff
Re: [PATCH] Added PLUGIN_FINISH_TYPE callback on enum type processing
I am forwarding this reply to Cary Coutant, Diego Novillo and Le-Chun Wu, as they were listed as the plugin maintainers. Cary, Diego, Le-Chun, please let me know if you are on it, or if I should send it to someone else. Sorry, this isn't my kind of plugin -- I'm a maintainer for the LTO linker plugin, but this looks like it's related to GCC plugins. Diego or Le-Chun should be able to help, though. -cary
[PATCH] Fix CSE volatile MEM handling (PR rtl-optimization/64756)
Hi! We miscompile the following testcase. First we add a mem/v into the hash table (which should not happen); later on, during merge_equiv_classes, a new element for that mem/v is added and doesn't even have in_memory set (because HASH failed with do_not_record but nothing checked it), and as a result that memory is not properly invalidated. The initial problem is that if SET_DEST (sets[i].rtl) is a volatile mem, but we have a known value at that memory, we initially compute sets[i].dest_hash as the hash value of the known value. That doesn't set do_not_record, and when we recompute the actual hash value for the MEM, we ignore the do_not_record flag. I've also tested it with logging of when this triggers, and in both bootstraps it triggered only on the new testcase and in the 64-bit build of libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/explicit-hle.cc. Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2015-02-02 Jakub Jelinek ja...@redhat.com PR rtl-optimization/64756 * cse.c (cse_insn): If dest != SET_DEST (sets[i].rtl) and HASH (SET_DEST (sets[i].rtl), mode) computation sets do_not_record, invalidate and do not record it. * gcc.c-torture/execute/pr64756.c: New test.
--- gcc/cse.c.jj2015-01-23 20:49:11.0 +0100 +++ gcc/cse.c 2015-02-02 11:51:57.508084360 +0100 @@ -5521,7 +5521,22 @@ cse_insn (rtx_insn *insn) } if (sets[i].rtl != 0 dest != SET_DEST (sets[i].rtl)) - sets[i].dest_hash = HASH (SET_DEST (sets[i].rtl), mode); + { + do_not_record = 0; + sets[i].dest_hash = HASH (SET_DEST (sets[i].rtl), mode); + if (do_not_record) + { + rtx dst = SET_DEST (sets[i].rtl); + if (REG_P (dst) || GET_CODE (dst) == SUBREG) + invalidate (dst, VOIDmode); + else if (MEM_P (dst)) + invalidate (dst, VOIDmode); + else if (GET_CODE (dst) == STRICT_LOW_PART + || GET_CODE (dst) == ZERO_EXTRACT) + invalidate (XEXP (dst, 0), GET_MODE (dst)); + sets[i].rtl = 0; + } + } #ifdef HAVE_cc0 /* If setting CC0, record what it was set to, or a constant, if it --- gcc/testsuite/gcc.c-torture/execute/pr64756.c.jj2015-02-02 11:53:06.903882851 +0100 +++ gcc/testsuite/gcc.c-torture/execute/pr64756.c 2015-02-02 11:52:53.0 +0100 @@ -0,0 +1,30 @@ +/* PR rtl-optimization/64756 */ + +int a, *tmp, **c = tmp; +volatile int d; +static int *volatile *e = tmp; +unsigned int f; + +static void +fn1 (int *p) +{ + int g; + for (; f 1; f++) +for (g = 1; g = 0; g--) + { + d || d; + *c = p; + + if (tmp != a) + __builtin_abort (); + + *e = 0; + } +} + +int +main () +{ + fn1 (a); + return 0; +} Jakub
[C++ PATCH] PR c++/64901
The modified test has been tested, I'm currently running the full testsuite, so testing is incomplete. I wanted to send this in asap, since this is a bad regression. /cp 2015-02-02 Ville Voutilainen ville.voutilai...@gmail.com PR c++/64901 * decl.c (duplicate_decls): Also duplicate DECL_FINAL_P and DECL_OVERRIDE_P. /testsuite 2015-02-02 Ville Voutilainen ville.voutilai...@gmail.com PR c++/64901 * g++.dg/cpp0x/override1.C: Add a test for the PR. diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c index 4527e3f..d77a0a8 100644 --- a/gcc/cp/decl.c +++ b/gcc/cp/decl.c @@ -1813,6 +1813,8 @@ duplicate_decls (tree newdecl, tree olddecl, bool newdecl_is_friend) DECL_PURE_VIRTUAL_P (newdecl) |= DECL_PURE_VIRTUAL_P (olddecl); DECL_VIRTUAL_P (newdecl) |= DECL_VIRTUAL_P (olddecl); DECL_INVALID_OVERRIDER_P (newdecl) |= DECL_INVALID_OVERRIDER_P (olddecl); + DECL_FINAL_P (newdecl) |= DECL_FINAL_P (olddecl); + DECL_OVERRIDE_P (newdecl) |= DECL_OVERRIDE_P (olddecl); DECL_THIS_STATIC (newdecl) |= DECL_THIS_STATIC (olddecl); if (DECL_OVERLOADED_OPERATOR_P (olddecl) != ERROR_MARK) SET_OVERLOADED_OPERATOR_CODE diff --git a/gcc/testsuite/g++.dg/cpp0x/override1.C b/gcc/testsuite/g++.dg/cpp0x/override1.C index 05d7290..7686a28 100644 --- a/gcc/testsuite/g++.dg/cpp0x/override1.C +++ b/gcc/testsuite/g++.dg/cpp0x/override1.C @@ -4,8 +4,11 @@ struct B virtual void f() final {} virtual void g() {} virtual void x() const {} + virtual void y() final; }; +void B::y() {} // { dg-error overriding } + struct B2 { virtual void h() {} @@ -14,6 +17,7 @@ struct B2 struct D : B { virtual void g() override final {} // { dg-error overriding } + virtual void y() override final {} // { dg-error virtual } }; template class T struct D2 : T
[PATCH] Fix combiner from accessing or writing out of bounds SET_N_REGS (PR other/63504)
Hi! During combine we sometimes split instructions, and on some targets that can create new pseudos. combine_split_insns in that case grows the reg_stat vector, but REG_N_SETS lives in an array allocated once by regstat.c, which currently has no way to grow it. Even adding the ability to grow it (by making it a vector) wouldn't help much though, because the question is what to initialize it with: DF isn't updated when we split, and it isn't certain that we actually keep the split insns. From what I can see in the combiner, using DF_REG_DEF_COUNT directly would probably not work, because combiner wants to adjust its counts during try_combine, rather than only when it is done with try_combine and DF is updated. So, this patch instead just stops touching (reading and/or writing) REG_N_SETS of pseudos created during combine. Combiner only cares about REG_N_SETS (...) == 1, so the patch is conservative, but it fixes something that previously resulted in pretty much random behavior. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2015-02-02 Jakub Jelinek ja...@redhat.com PR other/63504 * combine.c (reg_n_sets_max): New variable. (can_change_dest_mode, reg_nonzero_bits_for_combine, reg_num_sign_bit_copies_for_combine, get_last_value_validate, get_last_value): Use REG_N_SETS only on pseudos < reg_n_sets_max. (try_combine): Use INC_REG_N_SETS only on pseudos < reg_n_sets_max. (rest_of_handle_combine): Initialize reg_n_sets_max. --- gcc/combine.c.jj2015-01-31 10:07:45.0 +0100 +++ gcc/combine.c 2015-02-02 13:33:06.190821688 +0100 @@ -284,6 +284,9 @@ typedef struct reg_stat_struct { static vec<reg_stat_type> reg_stat; +/* Highest pseudo for which we track REG_N_SETS. */ +static unsigned int reg_n_sets_max; + /* Record the luid of the last insn that invalidated memory (anything that writes memory, and subroutine calls, but not pushes). */ @@ -2420,7 +2423,9 @@ can_change_dest_mode (rtx x, int added_s = hard_regno_nregs[regno][mode])); /* Or a pseudo that is only used once. 
*/ - return (REG_N_SETS (regno) == 1 !added_sets + return (regno reg_n_sets_max + REG_N_SETS (regno) == 1 + !added_sets !REG_USERVAR_P (x)); } @@ -3630,7 +3635,8 @@ try_combine (rtx_insn *i3, rtx_insn *i2, if (REG_P (new_i3_dest) REG_P (new_i2_dest) - REGNO (new_i3_dest) == REGNO (new_i2_dest)) + REGNO (new_i3_dest) == REGNO (new_i2_dest) + REGNO (new_i2_dest) reg_n_sets_max) INC_REG_N_SETS (REGNO (new_i2_dest), 1); } } @@ -4480,7 +4486,8 @@ try_combine (rtx_insn *i3, rtx_insn *i2, zero its use count so it won't make `reload' do any work. */ if (! added_sets_2 (newi2pat == 0 || ! reg_mentioned_p (i2dest, newi2pat)) -! i2dest_in_i2src) +! i2dest_in_i2src +REGNO (i2dest) reg_n_sets_max) INC_REG_N_SETS (REGNO (i2dest), -1); } @@ -4497,7 +4504,9 @@ try_combine (rtx_insn *i3, rtx_insn *i2, record_value_for_reg (i1dest, i1_insn, i1_val); - if (! added_sets_1 ! i1dest_in_i1src) + if (! added_sets_1 +! i1dest_in_i1src +REGNO (i1dest) reg_n_sets_max) INC_REG_N_SETS (REGNO (i1dest), -1); } @@ -4514,7 +4523,9 @@ try_combine (rtx_insn *i3, rtx_insn *i2, record_value_for_reg (i0dest, i0_insn, i0_val); - if (! added_sets_0 ! i0dest_in_i0src) + if (! added_sets_0 +! 
i0dest_in_i0src +REGNO (i0dest) reg_n_sets_max) INC_REG_N_SETS (REGNO (i0dest), -1); } @@ -9750,6 +9761,7 @@ reg_nonzero_bits_for_combine (const_rtx || (rsp-last_set_label == label_tick DF_INSN_LUID (rsp-last_set) subst_low_luid) || (REGNO (x) = FIRST_PSEUDO_REGISTER + REGNO (x) reg_n_sets_max REG_N_SETS (REGNO (x)) == 1 !REGNO_REG_SET_P (DF_LR_IN (ENTRY_BLOCK_PTR_FOR_FN (cfun)-next_bb), @@ -9825,6 +9837,7 @@ reg_num_sign_bit_copies_for_combine (con || (rsp-last_set_label == label_tick DF_INSN_LUID (rsp-last_set) subst_low_luid) || (REGNO (x) = FIRST_PSEUDO_REGISTER + REGNO (x) reg_n_sets_max REG_N_SETS (REGNO (x)) == 1 !REGNO_REG_SET_P (DF_LR_IN (ENTRY_BLOCK_PTR_FOR_FN (cfun)-next_bb), @@ -12863,6 +12876,7 @@ get_last_value_validate (rtx *loc, rtx_i /* If this is a pseudo-register that was only set once and not live at the beginning of the function, it is always valid. */ || (! (regno = FIRST_PSEUDO_REGISTER + regno reg_n_sets_max REG_N_SETS (regno) == 1 (!REGNO_REG_SET_P
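The guard pattern described in the PR other/63504 message (only read or adjust the per-register set count when the register number is below the tracked maximum) can be sketched in standalone C roughly as follows. This is only an illustration under stated assumptions: n_sets, n_sets_max, set_exactly_once and adjust_sets are hypothetical stand-ins for REG_N_SETS, reg_n_sets_max and the guarded uses in combine.c, not the actual GCC code.

```c
/* Hypothetical stand-in for the array that regstat.c allocates once
   and that cannot grow afterwards. */
#define N_TRACKED 4u

static int n_sets[N_TRACKED] = { 1, 2, 1, 0 };

/* Hypothetical stand-in for reg_n_sets_max: registers numbered at or
   above this value were created later and have no slot in n_sets. */
static unsigned int n_sets_max = N_TRACKED;

/* Conservative query: an untracked register is never reported as
   "set exactly once", so the array is never read out of bounds. */
static int set_exactly_once(unsigned int regno)
{
    return regno < n_sets_max && n_sets[regno] == 1;
}

/* Guarded update: silently ignore registers without a slot instead
   of writing past the end of the array. */
static void adjust_sets(unsigned int regno, int delta)
{
    if (regno < n_sets_max)
        n_sets[regno] += delta;
}
```

The design choice mirrors the message: answering "no" for untracked registers may lose an optimization, but it never touches memory outside the array.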
Re: [Patch, fortran] PR 64757 - [5 Regression] ICE in fold_convert_loc, at fold-const.c:2353
Dear Dominique, On transferring from my laptop to my workstation, I find that it segfaults at runtime - both are x86_64/FC21. If I can, I intend to investigate tonight. Thanks for the report. Paul

On 2 February 2015 at 17:53, Dominique Dhumieres domi...@lps.ens.fr wrote: Dear Paul, I have tested your patch at https://gcc.gnu.org/ml/fortran/2015-01/txtwnaoa1115V.txt (the latest version) and I found that the test type_to_class_3.f03 is miscompiled (FAIL) with -flto -O0 -m64 (this does not happen with -flto -O0 -m32 or with -Ox and x!=0). In addition, while the reduced test

  type :: Test
    integer :: i
  end type
  type :: TestReference
    class(Test), allocatable :: test(:)
  end type
  type(TestReference) :: testList
  type(test), allocatable :: x(:)
  allocate (testList%test(2), source = [Test(99), Test(199)]) ! Works, of course
  print *, size(testList%test)
  x = testList%test
  print *, x
  end

gives what I expect, i.e.,

  2
  99 199

the test

  type :: Test
    integer :: i
  end type
  type :: TestReference
    class(Test), allocatable :: test(:)
  end type
  type(TestReference) :: testList
  type(test), allocatable :: x(:)
  testList = TestReference([Test(99), Test(199)]) ! Gave: The rank of the element in the
                                                  ! structure constructor at (1) does not
                                                  ! match that of the component (1/0)
  print *, size(testList%test)
  x = testList%test
  print *, x
  end

gives

  1
  99

Last problem I see,

  print *, TestReference([Test(99), Test(199)])

gives the following ICE

  f951: internal compiler error: Bad IO basetype (7)
  type_to_class_3_red_2.f03:12:0: print *, TestReference([Test(99), Test(199)])

Cheers, Dominique -- Outside of a dog, a book is a man's best friend. Inside of a dog it's too dark to read. Groucho Marx
Re: [PING, www] Re: [PATCH] update_web_docs_svn: support the JIT docs (PR jit/64257)
Hi David, On Monday 2015-02-02 11:39, David Malcolm wrote: * update_web_docs_svn: Don't delete gcc/jit/docs or gcc/jit/jit-common.h, gcc/jit/notes.txt. Special case the building of the jit docs (using sphinx-build). Special case copying them up. I've committed this to trunk as r220149. Does this automatically get propagated to the machine that builds the website (and thus would be run next time the relevant cronjob runs)? I looked into this, and it does not get propagated automatically, so I did it manually (svn up at the proper location). Or does someone need to do additional work for this to go live? (if nothing else, the machine needs to have sphinx-build in its $PATH, as noted in the patch). I'm hoping to have the jit docs on the gcc website. I guess we'll be testing the error handling behavior of your new code :-), since indeed sphinx-build is not in the standard $PATH. Do you know where it can be found on this machine? Presumably, some machine needs to have the relevant sphinx package installed (if that's OK [1]), and perhaps the update to the update_web_docs_svn script needs to make it onto that machine? Or is this more appropriate for the overseers list? Yes, please direct the request for sphinx to overseers. (In parallel, if you can refine how the script behaves when sphinx is not present, that might be a good idea, too.) Gerald
Re: [RFC][PR target/39726 P4 regression] match.pd pattern to do type narrowing
On 02/02/15 09:59, Joseph Myers wrote: On Sat, 31 Jan 2015, Jeff Law wrote: The nice thing about wrapping the result inside a convert is the types for the inner operations will propagate from the type of the inner operands, which is exactly what we want. We then remove the hack assigning type and instead the original type will be used for the outermost convert. Those inner operands still need converting to unsigned for arithmetic. Yes. And FWIW, there's no reason to restrict the pattern to just masking off the sign bit. That's what the PR complains about, but we can do considerably better here. That's part of the reason why I put in the iterators -- to generalize this to more cases. Well, we want to move shorten_binary_op and shorten_compare to the new mechanism. Absolutely. If we could have match.pd cover those cases, that'd be a significant validation of match.pd for this class of problems. Seems like gcc-6 stuff to me though. I haven't looked at those routines in a long time, but reviewing them seems wise both in the immediate term WRT this bug and ensuring we're doing the right thing for the various corner cases. Jeff
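As a rough illustration of why narrowing an add-then-mask is safe once the inner arithmetic is done in an unsigned type, here is a small standalone C sketch. It is not the match.pd pattern under discussion; the function names add_mask_wide and add_mask_narrow are invented for this example, and fixed-width types from <stdint.h> stand in for the wide and narrow modes.

```c
#include <stdint.h>

/* Wide form: widen both operands to 32 bits, add, mask, and only
   then truncate the result back to 16 bits. */
static uint16_t add_mask_wide(uint16_t a, uint16_t b, uint16_t mask)
{
    uint32_t sum = (uint32_t)a + (uint32_t)b;
    return (uint16_t)(sum & (uint32_t)mask);
}

/* Narrow form: truncate the sum to 16 bits first, then mask.  The
   truncation is to an unsigned type, so wraparound is well defined,
   and the low 16 bits match those of the wide sum. */
static uint16_t add_mask_narrow(uint16_t a, uint16_t b, uint16_t mask)
{
    uint16_t sum = (uint16_t)(a + b);
    return (uint16_t)(sum & mask);
}
```

Both forms agree for every input because the mask never looks at bits above the narrow width; the unsigned intermediate is what keeps the wrapped sum well defined, in line with Joseph's remark about converting the inner operands to unsigned.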
C++ PATCH for abi_tag sanity checking
One of the EDG guys pointed out to me that we weren't doing any sanity checking on the arguments to the abi_tag attribute. This patch adds checks to require that the arguments be strings containing valid identifiers, so they work appropriately in mangled names. Tested x86_64-pc-linux-gnu, applying to trunk. commit 3c9202343aca0b0a9d74fee4e6843000f3a612cf Author: Jason Merrill ja...@redhat.com Date: Fri Jan 30 07:45:02 2015 -0500 * tree.c (handle_abi_tag_attribute): Diagnose invalid arguments. diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c index afb57a3..c51e42d 100644 --- a/gcc/cp/tree.c +++ b/gcc/cp/tree.c @@ -3501,6 +3501,50 @@ static tree handle_abi_tag_attribute (tree* node, tree name, tree args, int flags, bool* no_add_attrs) { + for (tree arg = args; arg; arg = TREE_CHAIN (arg)) +{ + tree elt = TREE_VALUE (arg); + if (TREE_CODE (elt) != STRING_CST + || (!same_type_ignoring_top_level_qualifiers_p + (strip_array_types (TREE_TYPE (elt)), + char_type_node))) + { + error (arguments to the %qE attribute must be narrow string + literals, name); + goto fail; + } + const char *begin = TREE_STRING_POINTER (elt); + const char *end = begin + TREE_STRING_LENGTH (elt); + for (const char *p = begin; p != end; ++p) + { + char c = *p; + if (p == begin) + { + if (!ISALPHA (c) c != '_') + { + error (arguments to the %qE attribute must contain valid + identifiers, name); + inform (input_location, %%c% is not a valid first + character for an identifier, c); + goto fail; + } + } + else if (p == end - 1) + gcc_assert (c == 0); + else + { + if (!ISALNUM (c) c != '_') + { + error (arguments to the %qE attribute must contain valid + identifiers, name); + inform (input_location, %%c% is not a valid character + in an identifier, c); + goto fail; + } + } + } +} + if (TYPE_P (*node)) { if (!OVERLOAD_TYPE_P (*node)) diff --git a/gcc/testsuite/g++.dg/abi/abi-tag13.C b/gcc/testsuite/g++.dg/abi/abi-tag13.C new file mode 100644 index 000..34e8da3 --- /dev/null +++ 
b/gcc/testsuite/g++.dg/abi/abi-tag13.C @@ -0,0 +1,5 @@ +const char *foo = bar; +void __attribute((abi_tag(foo))) f1() {} // { dg-error abi_tag } +void __attribute((abi_tag(Lfoo))) f2(); // { dg-error abi_tag } +void __attribute((abi_tag(3foo))) f3(); // { dg-error abi_tag } +void __attribute((abi_tag(1))) f5(); // { dg-error abi_tag }
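The character checks in the abi_tag patch above boil down to the usual identifier rule: the first character must be a letter or underscore, and the remaining ones must be letters, digits or underscores. A minimal standalone sketch in C follows, using the standard <ctype.h> functions instead of GCC's ISALPHA/ISALNUM macros. The function name is_valid_identifier is hypothetical and this is not the code that was committed.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: 1 if s is a non-empty identifier made of
   [A-Za-z_] followed by any number of [A-Za-z0-9_], else 0. */
static int is_valid_identifier(const char *s)
{
    if (s == NULL || *s == '\0')
        return 0;

    for (const char *p = s; *p != '\0'; ++p) {
        unsigned char c = (unsigned char)*p;

        if (p == s) {
            /* First character: letter or underscore only. */
            if (!isalpha(c) && c != '_')
                return 0;
        } else if (!isalnum(c) && c != '_') {
            /* Later characters: letters, digits or underscore. */
            return 0;
        }
    }
    return 1;
}
```

With this rule a tag such as "3foo" is rejected because of its leading digit, which is the kind of argument the new abi-tag13.C test expects to be diagnosed.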
Re: [PATCH] Added PLUGIN_FINISH_TYPE callback on enum type processing
On Sun, 1 Feb 2015, Bruno Loff wrote: Do I need to do anything else to get this patch into gcc? I suggest CC:ing the plugin maintainers (as listed in the MAINTAINERS file), and then pinging weekly for as long as needed. -- Joseph S. Myers jos...@codesourcery.com
Re: MAINTAINERS: resign as testsuite maintainer, update address
On 02/02/15 09:54, Janis Johnson wrote: I retired from Mentor Graphics 3 weeks ago and have no immediate plans to be active in GCC, so I'm resigning as a testsuite maintainer. I'm leaving myself under Write After Approval with my personal email address so people can find me. Sounds good. Thanks for all your work through the years. Five years ago while between jobs I got an individual FSF copyright assignment; is that still valid? Yes. It should be in effect until you start a new job. jeff
Re: [PATCH] Added PLUGIN_FINISH_TYPE callback on enum type processing
On Thu, Jan 29, 2015 at 4:32 PM, Bruno Loff bruno.l...@gmail.com wrote: The issue was first reported by Joachim Wieland to the list g...@gcc.gnu.org, on Wed, Jan 19, 2011 (Subject: PLUGIN_FINISH_TYPE not executed for enums). A description of the problem/bug and how my patch addresses it. --- The problem was that when gcc plugins registered callbacks on the PLUGIN_FINISH_TYPE event, this event would not be triggered after an enum had finished processing. The function call that does this was not there; it seems to me that it has simply been forgotten. Bootstrapping and testing make bootstrap make -k check === gcc Summary === # of expected passes106729 # of expected failures 256 # of unsupported tests 1409 on x86_64 ubuntu linux 14.04 Furthermore, I tested the plugin functionality (with a gcc-with-python script), and it now works properly. (However, changes to gcc-with-python also had to be made so that enum type info is properly converted to python types; see my github fork for these changes https://github.com/bloff/gcc-python-plugin) The Patch --- From: bloff bloff.si...@gmail.com Date: Sun, 19 Oct 2014 14:54:01 +0100 Subject: [PATCH] Added PLUGIN_FINISH_TYPE callback on enum type processing First reported by Joachim Wieland to the list g...@gcc.gnu.org, on Wed, Jan 19, 2011 (Subject: PLUGIN_FINISH_TYPE not executed for enums). --- gcc/c/c-parser.c | 1 + 1 file changed, 1 insertion(+) diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c index 264c170..cb515aa 100644 --- a/gcc/c/c-parser.c +++ b/gcc/c/c-parser.c @@ -2324,6 +2324,7 @@ c_parser_declspecs (c_parser *parser, struct c_declspecs *specs, attrs_ok = true; seen_type = true; t = c_parser_enum_specifier (parser); + invoke_plugin_callbacks (PLUGIN_FINISH_TYPE, t.spec); declspecs_add_type (loc, specs, t); break; case RID_STRUCT: This is OK with a ChangeLog entry. Thanks. Diego.
Re: [PATCH] Added PLUGIN_FINISH_TYPE callback on enum type processing
Something like: The PLUGIN_FINISH_TYPE callback for gcc plugins is now triggered for enum declarations. ? On 2 February 2015 at 20:03, Diego Novillo dnovi...@google.com wrote: On Thu, Jan 29, 2015 at 4:32 PM, Bruno Loff bruno.l...@gmail.com wrote: The issue was first reported by Joachim Wieland to the list g...@gcc.gnu.org, on Wed, Jan 19, 2011 (Subject: PLUGIN_FINISH_TYPE not executed for enums). A description of the problem/bug and how my patch addresses it. --- The problem was that when gcc plugins registered callbacks on the PLUGIN_FINISH_TYPE event, this event would not be triggered after an enum had finished processing. The function call that does this was not there; it seems to me that it has simply been forgotten. Bootstrapping and testing make bootstrap make -k check === gcc Summary === # of expected passes106729 # of expected failures 256 # of unsupported tests 1409 on x86_64 ubuntu linux 14.04 Furthermore, I tested the plugin functionality (with a gcc-with-python script), and it now works properly. (However, changes to gcc-with-python also had to be made so that enum type info is properly converted to python types; see my github fork for these changes https://github.com/bloff/gcc-python-plugin) The Patch --- From: bloff bloff.si...@gmail.com Date: Sun, 19 Oct 2014 14:54:01 +0100 Subject: [PATCH] Added PLUGIN_FINISH_TYPE callback on enum type processing First reported by Joachim Wieland to the list g...@gcc.gnu.org, on Wed, Jan 19, 2011 (Subject: PLUGIN_FINISH_TYPE not executed for enums). 
--- gcc/c/c-parser.c | 1 + 1 file changed, 1 insertion(+) diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c index 264c170..cb515aa 100644 --- a/gcc/c/c-parser.c +++ b/gcc/c/c-parser.c @@ -2324,6 +2324,7 @@ c_parser_declspecs (c_parser *parser, struct c_declspecs *specs, attrs_ok = true; seen_type = true; t = c_parser_enum_specifier (parser); + invoke_plugin_callbacks (PLUGIN_FINISH_TYPE, t.spec); declspecs_add_type (loc, specs, t); break; case RID_STRUCT: This is OK with a ChangeLog entry. Thanks. Diego.
Re: [PATCH] Added PLUGIN_FINISH_TYPE callback on enum type processing
On Mon, Feb 2, 2015 at 2:07 PM, Bruno Loff bruno.l...@gmail.com wrote: Something like: The PLUGIN_FINISH_TYPE callback for gcc plugins is now triggered for enum declarations. ? ChangeLog entries in GCC are pretty picky as to how they want to be formatted. See other entries for reference and https://gcc.gnu.org/codingconventions.html#ChangeLogs for specific documentation. Diego.
Re: [Patch, libstdc++/64649] Fix regex_traits::lookup_collatename and regex_traits::lookup_classname
On Mon, Feb 2, 2015 at 3:22 AM, Jonathan Wakely jwak...@redhat.com wrote: I don't think this needs to go on the 4.9 branch, apparently I'm the only person who's noticed the problem. I expect it's quite rare to try using those functions with forward iterators. Sorry, I was not talking about the first patch, which fixes the forward iterator problem, because it's already checked into 4.9; I'm suggesting the last one, which fixes the first one :) -- Regards, Tim Shen
Re: [Patch, libstdc++/64649] Fix regex_traits::lookup_collatename and regex_traits::lookup_classname
On 02/02/15 11:18 -0800, Tim Shen wrote: On Mon, Feb 2, 2015 at 3:22 AM, Jonathan Wakely jwak...@redhat.com wrote: I don't think this needs to go on the 4.9 branch, apparently I'm the only person who's noticed the problem. I expect it's quite rare to try using those functions with forward iterators. Sorry, I was not talking about the first patch, which fixes the forward iterator problem, because it's already checked into 4.9; I'm suggesting the last one, which fixes the first one :) Oh, I forgot the first one was already checked in to 4.9 -- OK, the second one is needed too.
Re: [PATCH, PR tree-optimization/64277] Improve loop iterations count estimation
On 27 Jan 12:29, Richard Biener wrote: On Tue, Jan 27, 2015 at 11:47 AM, Ilya Enkovich enkovich@gmail.com wrote: On 27 Jan 12:40, Ilya Enkovich wrote: Hi, This patch was supposed to fix PR tree-optimization/64277. Tracker is now fixed by warnings disabling but I think patch is still useful to avoid dead code generated by complete unroll. Bootstrapped and tested on x86_64-unknown-linux-gnu. Thanks, Ilya -- gcc/ 2015-01-27 Ilya Enkovich ilya.enkov...@intel.com * tree-ssa-loop-niter.c (record_nonwrapping_iv): Use base range info when possible to refine estimation. gcc/testsuite/ 2015-01-27 Ilya Enkovich ilya.enkov...@intel.com * gcc.dg/pr64277.c: New. Here is a new version fixed according to comments in the tracker. I also fixed a test to scan cunroll dumps. Does it look OK? Minor comments below. What are possible branches for this patch? You can probably create a testcase that shows code-size regressions against a version that didn't peel completely (GCC 4.7). Thus I'd say it would apply to 4.9 as well (4.8 doesn't have range information). 
Thanks, Ilya -- diff --git a/gcc/testsuite/gcc.dg/pr64277.c b/gcc/testsuite/gcc.dg/pr64277.c new file mode 100644 index 000..c6ef331 --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr64277.c @@ -0,0 +1,23 @@ +/* PR tree-optimization/64277 */ +/* { dg-do compile } */ +/* { dg-options -O3 -Wall -Werror -fdump-tree-cunroll-details } */ +/* { dg-final { scan-tree-dump loop with 5 iterations completely unrolled cunroll } } */ +/* { dg-final { scan-tree-dump loop with 6 iterations completely unrolled cunroll } } */ +/* { dg-final { cleanup-tree-dump cunroll } } */ + +int f1[10]; +void test1 (short a[], short m, unsigned short l) +{ + int i = l; + for (i = i + 5; i m; i++) +f1[i] = a[i]++; +} + +void test2 (short a[], short m, short l) +{ + int i; + if (m 5) +m = 5; + for (i = m; i l; i--) +f1[i] = a[i]++; +} diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c index 919f5c0..1cd297d 100644 --- a/gcc/tree-ssa-loop-niter.c +++ b/gcc/tree-ssa-loop-niter.c @@ -2754,6 +2754,7 @@ record_nonwrapping_iv (struct loop *loop, tree base, tree step, gimple stmt, { tree niter_bound, extreme, delta; tree type = TREE_TYPE (base), unsigned_type; + tree orig_base = base; if (TREE_CODE (step) != INTEGER_CST || integer_zerop (step)) return; @@ -2777,16 +2778,30 @@ record_nonwrapping_iv (struct loop *loop, tree base, tree step, gimple stmt, if (tree_int_cst_sign_bit (step)) { + wide_int min, max; extreme = fold_convert (unsigned_type, low); - if (TREE_CODE (base) != INTEGER_CST) + if (TREE_CODE (orig_base) == SSA_NAME + TREE_CODE (high) == INTEGER_CST + INTEGRAL_TYPE_P (TREE_TYPE (orig_base)) + get_range_info (orig_base, min, max) == VR_RANGE + wi::gts_p (wide_int (high), max)) For me a simple wi::gts_p (high, max) worked fine. 
+ base = wide_int_to_tree (unsigned_type, max); + else if (TREE_CODE (base) != INTEGER_CST) base = fold_convert (unsigned_type, high); delta = fold_build2 (MINUS_EXPR, unsigned_type, base, extreme); step = fold_build1 (NEGATE_EXPR, unsigned_type, step); } else { + wide_int min, max; extreme = fold_convert (unsigned_type, high); - if (TREE_CODE (base) != INTEGER_CST) + if (TREE_CODE (orig_base) == SSA_NAME + TREE_CODE (low) == INTEGER_CST + INTEGRAL_TYPE_P (TREE_TYPE (orig_base)) + get_range_info (orig_base, min, max) == VR_RANGE + wi::gts_p (min, wide_int (low))) Likewise. Ok for trunk with that changes. For the 4.9 branch you need to adjust the patch to not use wide-ints. I'd leave it on trunk for a while and eventually open a bugreport for the size regression to keep track of it. Thanks, Richard. Here is a version for 4.9 branch. Does it look OK? Bootstrapped and tested on x86_64-unknown-linux-gnu. Thanks, Ilya -- gcc/ 2015-02-02 Ilya Enkovich ilya.enkov...@intel.com PR tree-optimization/64277 * tree-ssa-loop-niter.c (record_nonwrapping_iv): Use base range info when possible to refine estimation. gcc/testsuite/ 2015-02-02 Ilya Enkovich ilya.enkov...@intel.com PR tree-optimization/64277 * gcc.dg/pr64277.c: New. diff --git a/gcc/testsuite/gcc.dg/pr64277.c b/gcc/testsuite/gcc.dg/pr64277.c new file mode 100644 index 000..c6ef331 --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr64277.c @@ -0,0 +1,23 @@ +/* PR tree-optimization/64277 */ +/* { dg-do compile } */ +/* { dg-options -O3 -Wall -Werror -fdump-tree-cunroll-details } */ +/* { dg-final { scan-tree-dump loop with 5 iterations completely unrolled cunroll } } */ +/* { dg-final {
Re: [PATCHv2][wwwdocs] Mention -freport-bug in release notes
On Monday 2015-01-26 11:03, Yury Gribov wrote: Second version of patch with updates from Gerald Pfeifer. Ok to commit? Yes, this looks good. Thank you, Gerald
Re: [PATCH, PR tree-optimization/64277] Improve loop iterations count estimation
On Mon, Feb 2, 2015 at 9:19 AM, Ilya Enkovich enkovich@gmail.com wrote: On 27 Jan 12:29, Richard Biener wrote: On Tue, Jan 27, 2015 at 11:47 AM, Ilya Enkovich enkovich@gmail.com wrote: On 27 Jan 12:40, Ilya Enkovich wrote: Hi, This patch was supposed to fix PR tree-optimization/64277. Tracker is now fixed by warnings disabling but I think patch is still useful to avoid dead code generated by complete unroll. Bootstrapped and tested on x86_64-unknown-linux-gnu. Thanks, Ilya -- gcc/ 2015-01-27 Ilya Enkovich ilya.enkov...@intel.com * tree-ssa-loop-niter.c (record_nonwrapping_iv): Use base range info when possible to refine estimation. gcc/testsuite/ 2015-01-27 Ilya Enkovich ilya.enkov...@intel.com * gcc.dg/pr64277.c: New. Here is a new version fixed according to comments in the tracker. I also fixed a test to scan cunroll dumps. Does it look OK? Minor comments below. What are possible branches for this patch? You can probably create a testcase that shows code-size regressions against a version that didn't peel completely (GCC 4.7). Thus I'd say it would apply to 4.9 as well (4.8 doesn't have range information). 
Thanks, Ilya -- diff --git a/gcc/testsuite/gcc.dg/pr64277.c b/gcc/testsuite/gcc.dg/pr64277.c new file mode 100644 index 000..c6ef331 --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr64277.c @@ -0,0 +1,23 @@ +/* PR tree-optimization/64277 */ +/* { dg-do compile } */ +/* { dg-options "-O3 -Wall -Werror -fdump-tree-cunroll-details" } */ +/* { dg-final { scan-tree-dump "loop with 5 iterations completely unrolled" "cunroll" } } */ +/* { dg-final { scan-tree-dump "loop with 6 iterations completely unrolled" "cunroll" } } */ +/* { dg-final { cleanup-tree-dump "cunroll" } } */ + +int f1[10]; +void test1 (short a[], short m, unsigned short l) +{ + int i = l; + for (i = i + 5; i < m; i++) +f1[i] = a[i]++; +} + +void test2 (short a[], short m, short l) +{ + int i; + if (m > 5) +m = 5; + for (i = m; i > l; i--) +f1[i] = a[i]++; +} diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c index 919f5c0..1cd297d 100644 --- a/gcc/tree-ssa-loop-niter.c +++ b/gcc/tree-ssa-loop-niter.c @@ -2754,6 +2754,7 @@ record_nonwrapping_iv (struct loop *loop, tree base, tree step, gimple stmt, { tree niter_bound, extreme, delta; tree type = TREE_TYPE (base), unsigned_type; + tree orig_base = base; if (TREE_CODE (step) != INTEGER_CST || integer_zerop (step)) return; @@ -2777,16 +2778,30 @@ record_nonwrapping_iv (struct loop *loop, tree base, tree step, gimple stmt, if (tree_int_cst_sign_bit (step)) { + wide_int min, max; extreme = fold_convert (unsigned_type, low); - if (TREE_CODE (base) != INTEGER_CST) + if (TREE_CODE (orig_base) == SSA_NAME + && TREE_CODE (high) == INTEGER_CST + && INTEGRAL_TYPE_P (TREE_TYPE (orig_base)) + && get_range_info (orig_base, &min, &max) == VR_RANGE + && wi::gts_p (wide_int (high), max)) For me a simple wi::gts_p (high, max) worked fine. 
+ base = wide_int_to_tree (unsigned_type, max); + else if (TREE_CODE (base) != INTEGER_CST) base = fold_convert (unsigned_type, high); delta = fold_build2 (MINUS_EXPR, unsigned_type, base, extreme); step = fold_build1 (NEGATE_EXPR, unsigned_type, step); } else { + wide_int min, max; extreme = fold_convert (unsigned_type, high); - if (TREE_CODE (base) != INTEGER_CST) + if (TREE_CODE (orig_base) == SSA_NAME + && TREE_CODE (low) == INTEGER_CST + && INTEGRAL_TYPE_P (TREE_TYPE (orig_base)) + && get_range_info (orig_base, &min, &max) == VR_RANGE + && wi::gts_p (min, wide_int (low))) Likewise. Ok for trunk with those changes. For the 4.9 branch you need to adjust the patch to not use wide-ints. I'd leave it on trunk for a while and eventually open a bug report for the size regression to keep track of it. Thanks, Richard. Here is a version for 4.9 branch. Does it look OK? Bootstrapped and tested on x86_64-unknown-linux-gnu. I don't think we want this kind of fix on the branch. Thanks, Richard. Thanks, Ilya -- gcc/ 2015-02-02 Ilya Enkovich ilya.enkov...@intel.com PR tree-optimization/64277 * tree-ssa-loop-niter.c (record_nonwrapping_iv): Use base range info when possible to refine estimation. gcc/testsuite/ 2015-02-02 Ilya Enkovich ilya.enkov...@intel.com PR tree-optimization/64277 * gcc.dg/pr64277.c: New. diff --git a/gcc/testsuite/gcc.dg/pr64277.c b/gcc/testsuite/gcc.dg/pr64277.c new file mode 100644 index 000..c6ef331 --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr64277.c @@ -0,0 +1,23 @@ +/* PR tree-optimization/64277 */ +/* { dg-do compile } */ +/* {
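For readers following the record_nonwrapping_iv change, here is a rough stand-alone sketch of the arithmetic it refines. It is only a simplified model, not the GCC code: the function max_executions and its plain long parameters are made up for illustration, and the real function works on trees and wide_ints in an unsigned type. The idea is that an induction variable which must stay inside [low, high] can execute a statement at most delta / |step| + 1 times. When the base is not a constant, the worst-case starting point is the far end of that interval, unless range info on the base gives a tighter one.

```c
#include <assert.h>

/* Hypothetical model of the bound refined by the patch.  The IV starts
   somewhere in [min, max] (only meaningful when HAVE_RANGE is nonzero),
   advances by STEP, and must stay inside [low, high].  Returns an upper
   bound on how many times a statement using the IV can execute.  */
static long
max_executions (int have_range, long min, long max,
                long step, long low, long high)
{
  long start, delta;

  /* The real function gives up on a zero step; the model requires
     callers not to pass one.  */
  assert (step != 0);

  if (step < 0)
    {
      /* Decreasing IV: the worst case is the largest possible start.
         Without range info that is HIGH; with it, MAX may be smaller.  */
      start = (have_range && max < high) ? max : high;
      delta = start - low;
      step = -step;
    }
  else
    {
      /* Increasing IV: the worst case is the smallest possible start.
         Without range info that is LOW; with it, MIN may be larger.  */
      start = (have_range && min > low) ? min : low;
      delta = high - start;
    }

  if (delta < 0)
    return 0;

  return delta / step + 1;
}
```

Applied to the testcase, and assuming the valid interval is the index range 0..9 of f1[10]: in test1 the start is l + 5 with l an unsigned short, so min is 5 and the bound drops from 10 to 5; in test2 the start is clamped to at most 5 and the IV counts down, so the bound drops from 10 to 6. That would be consistent with the two counts the test scans for, although the real pass reaches them through more machinery than this model has.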
Re: [RFC][PR target/39726 P4 regression] match.pd pattern to do type narrowing
On Sun, Feb 1, 2015 at 6:46 AM, Jeff Law l...@redhat.com wrote: On 01/31/15 17:47, Joseph Myers wrote: On Fri, 30 Jan 2015, Jeff Law wrote: +/* If we are testing a single bit resulting from a binary + operation in precision P1 where the operands were widened + precision P2 and the tested bit is the sign bit for + precision P2. Rewrite so the binary operation is in + precision P2. */ To avoid introducing undefined behavior, if the operation is arithmetic rather than bitwise and the original type with precision P2 is signed then you need to convert the operands to the corresponding unsigned type. Yea, probably so. (I'm not sure how you avoid needing to convert the final result back to the original type of the expression to avoid type mismatches in the containing expression, but such a conversion back to the original type would need to be a zero-extension not a sign-extension and so for that I'd suppose the inner type should be unsigned even for bitwise operations.) So I think the way to go to solve both issues is to wrap the result inside a convert. Right now by working on generic, we're relying on implicit type conversions, which feels wrong. The nice thing about wrapping the result inside a convert is the types for the inner operations will propagate from the type of the inner operands, which is exactly what we want. We then remove the hack assigning type and instead the original type will be used for the outermost convert. It's not even a hack but wrong ;) Correct supported syntax is + (with { tree type0 = TREE_TYPE (@0); } + (convert:type0 (bit_and (inner_op @0 @1) (convert @3))) Thus whenever the generator cannot auto-guess a type (or would guess the wrong one) you can explicitly specify a type to convert to. Why do you restrict this to GENERIC? On GIMPLE you'd eventually want to impose some single-use constraints as the result with all the conversions won't really be unconditionally better? 
(reminds me of thinking of a nicer way for all this single-use stuff for next stage1) That seems to DTRT in some initial sniff testing. And FWIW, there's no reason to restrict the pattern to just masking off the sign bit. That's what the PR complains about, but we can do considerably better here. That's part of the reason why I put in the iterators -- to generalize this to more cases. Indeed. Richard. jeff
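As a small illustration of the narrowing discussed in this thread, here is a C sketch under stated assumptions: signed char stands in for the narrow precision P2, int for the wide precision P1, and the helper names wide_test, narrow_test and count_mismatches are invented for the example. It follows Joseph's suggestion of doing the narrow addition on the corresponding unsigned type, so that wraparound is well defined, and it tests bit 0x80, the sign bit of the narrow type. It says nothing about how match.pd expresses the pattern.

```c
#include <assert.h>
#include <limits.h>

/* Wide form: both operands are widened to int before the addition, then
   the bit that is the sign bit of the narrow type is tested.  */
static int
wide_test (signed char a, signed char b)
{
  return (((int) a + (int) b) & 0x80) != 0;
}

/* Narrow form: the addition is modelled in unsigned char.  C still
   promotes the operands to int, so the cast back to unsigned char is
   what truncates the sum to 8 bits.  The result is then zero-extended,
   not sign-extended, when the bit is tested.  */
static int
narrow_test (signed char a, signed char b)
{
  unsigned char sum = (unsigned char) ((unsigned char) a + (unsigned char) b);
  return (sum & 0x80u) != 0;
}

/* Exhaustively compare the two forms over every pair of narrow values
   and return the number of disagreements.  */
static int
count_mismatches (void)
{
  int a, b, mismatches = 0;

  /* The 0x80 mask assumes an 8-bit char.  */
  assert (CHAR_BIT == 8);

  for (a = SCHAR_MIN; a <= SCHAR_MAX; a++)
    for (b = SCHAR_MIN; b <= SCHAR_MAX; b++)
      if (wide_test ((signed char) a, (signed char) b)
          != narrow_test ((signed char) a, (signed char) b))
        mismatches++;

  return mismatches;
}
```

On a two's complement target the two forms should agree for every input pair, because the low 8 bits of a sum depend only on the low 8 bits of the operands. Doing the narrow addition directly in signed char precision would instead overflow for inputs such as 100 + 100, which is the undefined behavior the unsigned conversion avoids.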