Re: [PATCH, testsuite] fix ggcplug.c test-case
On 12 January 2015 at 14:19, Richard Biener rguent...@suse.de wrote:
> On Sun, 11 Jan 2015, Prathamesh Kulkarni wrote:
>
>> Hi,
>> The test-case plugin/ggcplug.c was failing due to flattening of tree.h and tree-core.h.
>> Test-case was incorrect because it included gcc-plugin.h after tree.h
>> whereas gcc-plugin.h should be the first header to be included by plugins.
>
> No, it should be definitely included _after_ config.h, system.h and
> coretypes.h.

gcc-plugin.h already includes these files. Shall I remove config.h, system.h and coretypes.h from ggcplug.c instead ?

> Ok with moving it after coretypes.h.
>
> Thanks,
> Richard.
[PATCH] IPA ICF: handle correctly indirect_calls
Hello.

Following patch is needed to pass LTO compilation for chromium. IPA ICF verifies polymorphic types for functions that have any function call. I forgot to handle indirect_calls.

Patch can bootstrap on x86_64-linux-pc and no new regression is seen.

Ready for trunk?
Thanks,
Martin

From d0f7fc76d5dac5f4c3c57a2e632082485debbd8a Mon Sep 17 00:00:00 2001
From: mliska mli...@suse.cz
Date: Thu, 8 Jan 2015 13:49:45 +0100
Subject: [PATCH] IPA ICF: handle correctly indirect_calls.

gcc/ChangeLog:

2015-01-08  Martin Liska  mli...@suse.cz

        * ipa-icf.c (sem_function::equals_wpa): Add indirect_calls as
        indication that a function is not leaf.
        (sem_function::compare_polymorphic_p): Likewise.
---
 gcc/ipa-icf.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c
index 1b76a1d..ed6d019 100644
--- a/gcc/ipa-icf.c
+++ b/gcc/ipa-icf.c
@@ -340,7 +340,8 @@ sem_function::equals_wpa (sem_item *item,
         return return_false_with_msg ("NULL argument type");

       /* Polymorphic comparison is executed just for non-leaf functions.  */
-      bool is_not_leaf = get_node ()->callees != NULL;
+      bool is_not_leaf = get_node ()->callees != NULL
+                         || get_node ()->indirect_calls != NULL;

       if (!func_checker::compatible_types_p (arg_types[i],
                                              m_compared_func->arg_types[i],
@@ -884,7 +885,9 @@ bool
 sem_function::compare_polymorphic_p (void)
 {
   return get_node ()->callees != NULL
-         || m_compared_func->get_node ()->callees != NULL;
+         || get_node ()->indirect_calls != NULL
+         || m_compared_func->get_node ()->callees != NULL
+         || m_compared_func->get_node ()->indirect_calls != NULL;
 }

 /* For a given call graph NODE, the function constructs new
-- 
2.1.2
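Editorial illustration of the leaf test discussed in this message. This is only a minimal sketch: `call_edge` and `call_node` are hypothetical stand-ins that keep just the two edge lists the patch inspects, not GCC's real `cgraph_node`. It shows why a function whose only calls are indirect was previously treated as a leaf.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for call graph data (not GCC's types).
struct call_edge { call_edge *next; };

struct call_node {
  call_edge *callees = nullptr;         // direct calls
  call_edge *indirect_calls = nullptr;  // calls through pointers or vtables
};

// Check as it was before the patch: only direct callees count.
inline bool is_not_leaf_old(const call_node &n) {
  return n.callees != nullptr;
}

// Check as the patch writes it: an indirect call also makes the node non-leaf.
inline bool is_not_leaf_new(const call_node &n) {
  return n.callees != nullptr || n.indirect_calls != nullptr;
}

// Helper building a node whose only outgoing edge is an indirect call.
inline call_node node_with_only_indirect_call(call_edge &e) {
  call_node n;
  n.indirect_calls = &e;
  return n;
}
```

A node built with `node_with_only_indirect_call` is classified as a leaf by the old predicate and as a non-leaf by the new one, which is the case the polymorphic type check was missing.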
Re: [PATCH] Fix enum operands exchange in ipa-inline.c
Hi Richard,

Thanks for the quick review and comments.
Please find attached the modified patch as per your suggestion.

Thanks,
Naveen

From: Richard Biener richard.guent...@gmail.com
Sent: Monday, January 12, 2015 2:48 PM
To: Hurugalawadi, Naveen
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] Fix enum operands exchange in ipa-inline.c

On Mon, Jan 12, 2015 at 7:58 AM, Hurugalawadi, Naveen naveen.hurugalaw...@caviumnetworks.com wrote:
> Hi,
>
> Sorry, Had forgot the ChangeLog.

Ok, but please properly wrap the long lines, put '? gimple_...' on a new one.

Thanks,
Richard.

> ChangeLog
>
> 2015-01-12  Naveen H.S  naveen.hurugalaw...@caviumnetworks.com
>
>         * ipa-inline.c (inline_small_functions): Swap the operands in enum.
>
> Thanks,
> Naveen

--- gcc/ipa-inline.c    2015-01-12 14:55:25.291575873 +0530
+++ gcc/ipa-inline.c    2015-01-12 14:56:01.795575453 +0530
@@ -1730,10 +1730,12 @@ inline_small_functions (void)
                    " to be inlined into %s/%i in %s:%i\n"
                    " Estimated badness is %f, frequency %.2f.\n",
                    edge->caller->name (), edge->caller->order,
-                   edge->call_stmt ? "unknown"
-                   : gimple_filename ((const_gimple) edge->call_stmt),
-                   edge->call_stmt ? -1
-                   : gimple_lineno ((const_gimple) edge->call_stmt),
+                   edge->call_stmt
+                   ? gimple_filename ((const_gimple) edge->call_stmt)
+                   : "unknown",
+                   edge->call_stmt
+                   ? gimple_lineno ((const_gimple) edge->call_stmt)
+                   : -1,
                    badness.to_double (),
                    edge->frequency / (double)CGRAPH_FREQ_BASE);
           if (edge->count)
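Editorial illustration of the operand order this patch corrects. A minimal sketch, assuming a hypothetical `stmt_info` struct in place of a GIMPLE statement: the fallback value must sit in the branch taken when the pointer is null, otherwise the null pointer is the one that gets dereferenced.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a statement that carries a source location.
struct stmt_info {
  const char *file;
  int line;
};

// Corrected order: use the statement when it exists, else the fallback.
// The swapped form (stmt ? "unknown" : stmt->file) would read through a
// null pointer exactly when stmt is missing.
inline const char *stmt_filename(const stmt_info *stmt) {
  return stmt ? stmt->file : "unknown";
}

inline int stmt_lineno(const stmt_info *stmt) {
  return stmt ? stmt->line : -1;
}
```

With this order, a missing statement yields `"unknown"` and `-1`, while a present one reports its own file and line.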
Re: [PATCH] IPA ICF: handle correctly indirect_calls
On Mon, Jan 12, 2015 at 10:29 AM, Martin Liška mli...@suse.cz wrote:
> Hello.
>
> Following patch is needed to pass LTO compilation for chromium. IPA ICF verifies polymorphic types
> for functions that have any function call. I forgot to handle indirect_calls.
>
> Patch can bootstrap on x86_64-linux-pc and no new regression is seen.
>
> Ready for trunk?

Ok.

Thanks,
Richard.

> Thanks,
> Martin
[gomp4] Merge trunk r219425 (2015-01-10) into gomp-4_0-branch
Hi!

In r219453, I have committed a merge from trunk r219425 (2015-01-10) into gomp-4_0-branch.

Grüße,
 Thomas
PR ipa/63470 (zero sized call_stmt)
Hi,
this ICE is caused by double updating in ipa-prop, which reduces the call stmt size once when the edge becomes speculative and again when it is turned into a direct call. Fixed by the following patch, which makes the update happen during duplication, so ipa-prop needs to care only about the case where it turned a real indirect call into a direct one.

Bootstrapped/regtested x86_64-linux, will commit it shortly.

Honza

        PR ipa/63470
        * ipa-inline-analysis.c (inline_edge_duplication_hook): Adjust
        cost when edge becomes direct.
        * ipa-prop.c (make_edge_direct): Do not adjust when speculation
        is resolved or when introducing new speculation.

        * testsuite/g++.dg/ipa/pr63470.C: New testcase.

Index: ipa-inline-analysis.c
===================================================================
--- ipa-inline-analysis.c       (revision 219430)
+++ ipa-inline-analysis.c       (working copy)
@@ -1312,6 +1312,13 @@
       info->predicate = NULL;
       edge_set_predicate (dst, srcinfo->predicate);
       info->param = srcinfo->param.copy ();
+      if (!dst->indirect_unknown_callee && src->indirect_unknown_callee)
+        {
+          info->call_stmt_size -= (eni_size_weights.indirect_call_cost
+                                   - eni_size_weights.call_cost);
+          info->call_stmt_time -= (eni_time_weights.indirect_call_cost
+                                   - eni_time_weights.call_cost);
+        }
     }


Index: ipa-prop.c
===================================================================
--- ipa-prop.c  (revision 219430)
+++ ipa-prop.c  (working copy)
@@ -2737,7 +2737,20 @@
                  ie->caller->name (), callee->name ());
     }
   if (!speculative)
-    ie = ie->make_direct (callee);
+    {
+      struct cgraph_edge *orig = ie;
+      ie = ie->make_direct (callee);
+      /* If we resolved speculative edge the cost is already up to date
+         for direct call (adjusted by inline_edge_duplication_hook).  */
+      if (ie == orig)
+        {
+          es = inline_edge_summary (ie);
+          es->call_stmt_size -= (eni_size_weights.indirect_call_cost
+                                 - eni_size_weights.call_cost);
+          es->call_stmt_time -= (eni_time_weights.indirect_call_cost
+                                 - eni_time_weights.call_cost);
+        }
+    }
   else
     {
       if (!callee->can_be_discarded_p ())
@@ -2747,14 +2760,10 @@
           if (alias)
             callee = alias;
         }
+      /* make_speculative will update ie's cost to direct call cost. */
       ie = ie->make_speculative
              (callee, ie->count * 8 / 10, ie->frequency * 8 / 10);
     }
-  es = inline_edge_summary (ie);
-  es->call_stmt_size -= (eni_size_weights.indirect_call_cost
-                         - eni_size_weights.call_cost);
-  es->call_stmt_time -= (eni_time_weights.indirect_call_cost
-                         - eni_time_weights.call_cost);

   return ie;
 }
Index: testsuite/g++.dg/ipa/pr63470.C
===================================================================
--- testsuite/g++.dg/ipa/pr63470.C      (revision 0)
+++ testsuite/g++.dg/ipa/pr63470.C      (revision 0)
@@ -0,0 +1,54 @@
+/* PR ipa/63470.C */
+/* { dg-do compile } */
+/* { dg-options "-O2 -finline-functions" } */
+
+class A
+{
+public:
+  virtual bool m_fn1 ();
+  virtual const char **m_fn2 (int);
+  virtual int m_fn3 ();
+};
+class FTjackSupport : A
+{
+  ~FTjackSupport ();
+  bool m_fn1 ();
+  bool m_fn4 ();
+  const char **
+  m_fn2 (int)
+  {
+  }
+  int _inited;
+  int *_jackClient;
+  int _activePathCount;
+}
+
+* a;
+void fn1 (...);
+void fn2 (void *);
+int fn3 (int *);
+FTjackSupport::~FTjackSupport () { m_fn4 (); }
+
+bool
+FTjackSupport::m_fn1 ()
+{
+  if (!_jackClient)
+    return 0;
+  for (int i=0; _activePathCount; ++i)
+    if (m_fn2 (i))
+      fn2 (a);
+  if (m_fn3 ())
+    fn2 (a);
+  if (fn3 (_jackClient))
+    fn1 (0);
+}
+
+bool
+FTjackSupport::m_fn4 ()
+{
+  if (_inited && _jackClient)
+    {
+      m_fn1 ();
+      return 0;
+    }
+}
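Editorial illustration of the double update described in this message. This is a sketch under stated assumptions, not the patch's mechanism: `edge_summary` is a hypothetical struct, the two cost constants are illustrative values, and the sketch guards with an explicit flag, whereas the actual patch checks whether `make_direct` returned the original edge.

```cpp
#include <cassert>

// Illustrative costs (assumed for this sketch, not read from GCC).
constexpr int kIndirectCallCost = 3;
constexpr int kCallCost = 1;

// Hypothetical per-edge summary.
struct edge_summary {
  int call_stmt_size;
  bool cost_is_direct;  // assumption: records that the adjustment was applied
};

// Unguarded update: applying it on both the speculative step and the
// direct step subtracts the difference twice.
inline void unguarded_adjust(edge_summary &es) {
  es.call_stmt_size -= kIndirectCallCost - kCallCost;
}

// Guarded update: the indirect-to-direct discount is applied at most once.
inline void adjust_to_direct(edge_summary &es) {
  if (es.cost_is_direct)
    return;
  es.call_stmt_size -= kIndirectCallCost - kCallCost;
  es.cost_is_direct = true;
}
```

Starting from a size of 4, two unguarded updates leave a zero sized call statement, while the guarded version stops at 2.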
Re: [PATCH] Fix undefined label problem after crossjumping (PR rtl-optimization/64536)
On Fri, 9 Jan 2015, Jakub Jelinek wrote:

> On Fri, Jan 09, 2015 at 03:10:16PM +0100, Richard Biener wrote:
> > Well, you have until the end of next week ;)  For GIMPLE this is
> > a switch with all cases going to the same basic-block, right?
> > I think we optimize that in cleanup_control_expr_graph via the
> > single_succ_p case?
>
> No, it is a switch with cases that all look like:
>   _1 = a; // load
>   _2 = _1 + 1;
>   a = _2; // store
> So, either if tree-ssa-tail-merge could be taught about loads/stores,
> or some other pass would be able to hoist the loads before the switch and
> sink the store after the switch, because every switch case does that.

Ah, ok. Indeed code-hoisting on GIMPLE wasn't finished (there is a very old PR with patches still), and sinking has the same issue in that it only exploits partial dead code elimination opportunities.

I think that tail-merging already handles some of these cases, just maybe not the one with more than two PHI args or switches.

Richard.
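Editorial illustration of the transformation discussed above, written at the source level rather than in GIMPLE. A minimal sketch: the function names, the number of cases and the presence of a `default` label are assumptions made for the example.

```cpp
#include <cassert>

// Every case performs the same load, add and store.
inline void bump_per_case(int &a, int sel) {
  switch (sel) {
    case 0:  { int t = a; a = t + 1; break; }
    case 1:  { int t = a; a = t + 1; break; }
    default: { int t = a; a = t + 1; break; }
  }
}

// Same behaviour with the load hoisted before the switch and the store
// sunk after it; nothing is left for the switch itself to do.
inline void bump_hoisted(int &a, int sel) {
  int t = a;   // hoisted load
  (void) sel;  // the selector no longer influences the result
  a = t + 1;   // sunk store
}
```

Both functions leave `a` incremented by one for every selector value, which is the equivalence a hoisting or sinking pass would have to prove.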
[PATCH, autofdo] Some code cleanup
Hi,

The attached patch does some code cleanup for auto-profile.c: fix typos and remove some unnecessary MAX/MIN checks plus some else.

OK for the trunk?

Index: gcc/auto-profile.c
===================================================================
--- gcc/auto-profile.c  (revision 219297)
+++ gcc/auto-profile.c  (working copy)
@@ -96,7 +96,7 @@ along with GCC; see the file COPYING3.  If not see
      standalone symbol, or a clone of a function that is inlined into another
      function.

-   Phase 2: Early inline + valur profile transformation.
+   Phase 2: Early inline + value profile transformation.
      Early inline uses autofdo_source_profile to find if a callsite is:
         * inlined in the profiled binary.
         * callee body is hot in the profiling run.
@@ -361,7 +361,7 @@ get_original_name (const char *name)

 /* Return the combined location, which is a 32bit integer in which
    higher 16 bits stores the line offset of LOC to the start lineno
-   of DECL, The lower 16 bits stores the discrimnator.  */
+   of DECL, The lower 16 bits stores the discriminator.  */

 static unsigned
 get_combined_location (location_t loc, tree decl)
@@ -424,7 +424,7 @@ get_inline_stack (location_t locus, inline_stack *

 /* Return STMT's combined location, which is a 32bit integer in which
    higher 16 bits stores the line offset of LOC to the start lineno
-   of DECL, The lower 16 bits stores the discrimnator.  */
+   of DECL, The lower 16 bits stores the discriminator.  */

 static unsigned
 get_relative_location_for_stmt (gimple stmt)
@@ -481,8 +481,8 @@ string_table::get_index (const char *name) const
   string_index_map::const_iterator iter = map_.find (name);
   if (iter == map_.end ())
     return -1;
-  else
-    return iter->second;
+
+  return iter->second;
 }

 /* Return the index of a given function DECL.  Return -1 if DECL is not
@@ -502,8 +502,8 @@ string_table::get_index_by_decl (tree decl) const
     return ret;
   if (DECL_ABSTRACT_ORIGIN (decl))
     return get_index_by_decl (DECL_ABSTRACT_ORIGIN (decl));
-  else
-    return -1;
+
+  return -1;
 }

 /* Return the function name of a given INDEX.  */
@@ -569,8 +569,8 @@ function_instance::get_function_instance_by_decl (
     }
   if (DECL_ABSTRACT_ORIGIN (decl))
     return get_function_instance_by_decl (lineno, DECL_ABSTRACT_ORIGIN (decl));
-  else
-    return NULL;
+
+  return NULL;
 }

 /* Store the profile info for LOC in INFO. Return TRUE if profile info
@@ -597,7 +597,7 @@ function_instance::mark_annotated (location_t loc)
   iter->second.annotated = true;
 }

-/* Read the inlinied indirect call target profile for STMT and store it in
+/* Read the inlined indirect call target profile for STMT and store it in
    MAP, return the total count for all inlined indirect calls.  */

 gcov_type
@@ -824,8 +824,8 @@ autofdo_source_profile::get_callsite_total_count (
       || afdo_string_table->get_index (IDENTIFIER_POINTER (
              DECL_ASSEMBLER_NAME (edge->callee->decl))) != s->name ())
     return 0;
-  else
-    return s->total_count ();
+
+  return s->total_count ();
 }

 /* Read AutoFDO profile and returns TRUE on success.  */
@@ -956,9 +956,9 @@ read_profile (void)
    histograms for indirect-call optimization.

    This function is actually served for 2 purposes:
-     * before annotation, we need to mark histogram, promote and inline
-     * after annotation, we just need to mark, and let follow-up logic to
-       decide if it needs to promote and inline.  */
+     * before annotation, we need to mark histogram, promote and inline
+     * after annotation, we just need to mark, and let follow-up logic to
+       decide if it needs to promote and inline.  */

 static void
 afdo_indirect_call (gimple_stmt_iterator *gsi, const icall_target_map &map,
@@ -1054,7 +1054,7 @@ set_edge_annotated (edge e, edge_set *annotated)
 }

 /* For a given BB, set its execution count. Attach value profile if a stmt
-   is not in PROMOTED, because we only want to promot an indirect call once.
+   is not in PROMOTED, because we only want to promote an indirect call once.
    Return TRUE if BB is annotated.  */

 static bool
@@ -1138,7 +1138,7 @@ afdo_find_equiv_class (bb_set *annotated_bb)
             bb1->aux = bb;
             if (bb1->count > bb->count && is_bb_annotated (bb1, *annotated_bb))
               {
-                bb->count = MAX (bb->count, bb1->count);
+                bb->count = bb1->count;
                 set_bb_annotated (bb, annotated_bb);
               }
           }
@@ -1150,7 +1150,7 @@ afdo_find_equiv_class (bb_set *annotated_bb)
             bb1->aux = bb;
             if (bb1->count > bb->count && is_bb_annotated (bb1, *annotated_bb))
               {
-                bb->count = MAX (bb->count, bb1->count);
+                bb->count = bb1->count;
                 set_bb_annotated (bb, annotated_bb);
               }
           }
@@ -1455,13 +1455,14 @@ afdo_vpt_for_early_inline (stmt_set *promoted_stmt
             }
         }
     }
+
   if (has_vpt)
     {
       optimize_inline_calls
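Editorial illustration of why the MAX calls removed by this cleanup were redundant. A minimal sketch with hypothetical function names that ignores the annotation check: once the guard has established that the second count is strictly larger, taking the maximum can only return that second count.

```cpp
#include <algorithm>
#include <cassert>

// Shape of the code before the cleanup: a maximum computed under a guard
// that already decides its result.
inline long merged_count_old(long bb_count, long bb1_count) {
  if (bb1_count > bb_count)
    return std::max(bb_count, bb1_count);
  return bb_count;
}

// Shape after the cleanup: the guarded branch assigns the larger value directly.
inline long merged_count_new(long bb_count, long bb1_count) {
  if (bb1_count > bb_count)
    return bb1_count;
  return bb_count;
}
```

The two functions agree for every pair of inputs, so the simplification does not change behaviour.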
Re: [PATCH] Fix enum operands exchange in ipa-inline.c
On Mon, Jan 12, 2015 at 10:36 AM, Hurugalawadi, Naveen naveen.hurugalaw...@caviumnetworks.com wrote:
> Hi Richard,
>
> Thanks for the quick review and comments.
> Please find attached the modified patch as per your suggestion.

Ok.

Richard.

> Thanks,
> Naveen
>
> From: Richard Biener richard.guent...@gmail.com
> Sent: Monday, January 12, 2015 2:48 PM
> To: Hurugalawadi, Naveen
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] Fix enum operands exchange in ipa-inline.c
>
> On Mon, Jan 12, 2015 at 7:58 AM, Hurugalawadi, Naveen naveen.hurugalaw...@caviumnetworks.com wrote:
>> Hi,
>>
>> Sorry, Had forgot the ChangeLog.
>
> Ok, but please properly wrap the long lines, put '? gimple_...' on a new one.
>
> Thanks,
> Richard.
>
>> ChangeLog
>>
>> 2015-01-12  Naveen H.S  naveen.hurugalaw...@caviumnetworks.com
>>
>>         * ipa-inline.c (inline_small_functions): Swap the operands in enum.
>>
>> Thanks,
>> Naveen
Re: [PATCH] ipa-icf.c: Fix issues generated by original latest commit
On Sat, Jan 10, 2015 at 10:03 AM, Chen Gang S gang.c...@sunrus.com.cn wrote:
> The related commit is "275e275 IPA ICF: target and optimization flags
> comparison.".
>
> For sem_function::equals_private(), fix the typo issue, and for
> target_opts_for_fn(), fix access NULL issue.
>
> For cross compiling h8300, it will cause the issue below:
>
>   [root@localhost h8300]# cat fp-bit.i
>   __inline__ static int a (int x)
>   {
>     return __builtin_expect (x == 0, 0);
>   }
>
>   __inline__ static int b (int x)
>   {
>     return __builtin_expect (x == 1, 0);
>   }
>
>   __attribute__ ((__always_inline__)) int c (int x, int y)
>   {
>     if (a (x))
>       return x;
>     if (b (x))
>       return x;
>     return y;
>   }
>   [root@localhost h8300]# /upstream/build-gcc-h8300/gcc/cc1 -O2 fp-bit.i -o test.s
>    a b c
>   Analyzing compilation unit
>   fp-bit.i:11:41: warning: always_inline function might not be inlinable [-Wattributes]
>    __attribute__ ((__always_inline__)) int c (int x, int y)
>                                            ^
>
>   Performing interprocedural optimizations
>    <*free_lang_data> <visibility> <build_ssa_passes> <chkp_passes> <opt_local_passes> <free-inline-summary> <emutls> <whole-program> <profile_estimate> <icf>
>   fp-bit.i:18:1: internal compiler error: Segmentation fault
>    }
>    ^
>   0xa11f0e crash_signal
>           ../../gcc/gcc/toplev.c:372
>   0xda33e7 tree_check
>           ../../gcc/gcc/tree.h:2769
>   0xda33e7 target_opts_for_fn
>           ../../gcc/gcc/tree.h:4643
>   0xda33e7 ipa_icf::sem_function::equals_private(ipa_icf::sem_item*, hash_map<symtab_node*, ipa_icf::sem_item*, default_hashmap_traits>&)
>           ../../gcc/gcc/ipa-icf.c:438
>   0xda4023 ipa_icf::sem_function::equals(ipa_icf::sem_item*, hash_map<symtab_node*, ipa_icf::sem_item*, default_hashmap_traits>&)
>           ../../gcc/gcc/ipa-icf.c:393
>   0xda6472 ipa_icf::sem_item_optimizer::subdivide_classes_by_equality(bool)
>           ../../gcc/gcc/ipa-icf.c:1900
>   0xdaad3c ipa_icf::sem_item_optimizer::execute()
>           ../../gcc/gcc/ipa-icf.c:1719
>   0xdab961 ipa_icf_driver
>           ../../gcc/gcc/ipa-icf.c:2448
>   0xdab961 ipa_icf::pass_ipa_icf::execute(function*)
>           ../../gcc/gcc/ipa-icf.c:2496
>   Please submit a full bug report,
>   with preprocessed source if appropriate.
>   Please include the complete backtrace with any bug report.
>   See http://gcc.gnu.org/bugs.html for instructions.
>
> This issue can be found for cross compiling gcc make all-target-libgcc
> under h8300, after fix this issue, it can continue to cross compiling to
> meet the next building issue for h8300.

Ok.

Thanks,
Richard.

> 2015-01-10  Chen Gang  gang.chen.5...@gmail.com
>
>         * ipa-icf.c (sem_function::equals_private): Use '&&' instead of
>         '||' to fix typo issue.
>
>         * gcc/tree.h (target_opts_for_fn): Check NULL_TREE since it can
>         accept and return NULL.
> ---
>  gcc/ipa-icf.c | 2 +-
>  gcc/tree.h    | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c
> index 1b76a1d..4ccaf8c 100644
> --- a/gcc/ipa-icf.c
> +++ b/gcc/ipa-icf.c
> @@ -438,7 +438,7 @@ sem_function::equals_private (sem_item *item,
>    cl_target_option *tar1 = target_opts_for_fn (decl);
>    cl_target_option *tar2 = target_opts_for_fn (m_compared_func->decl);
>
> -  if (tar1 != NULL || tar2 != NULL)
> +  if (tar1 != NULL && tar2 != NULL)
>      {
>        if (!cl_target_option_eq (tar1, tar2))
>         {
> diff --git a/gcc/tree.h b/gcc/tree.h
> index fc8c8fe..ac27268 100644
> --- a/gcc/tree.h
> +++ b/gcc/tree.h
> @@ -4640,7 +4640,7 @@ target_opts_for_fn (const_tree fndecl)
>    tree fn_opts = DECL_FUNCTION_SPECIFIC_TARGET (fndecl);
>    if (fn_opts == NULL_TREE)
>      fn_opts = target_option_default_node;
> -  return TREE_TARGET_OPTION (fn_opts);
> +  return fn_opts == NULL_TREE ? NULL : TREE_TARGET_OPTION (fn_opts);
>  }
>
>  /* opt flag for function FNDECL, e.g. opts_for_fn (fndecl, optimize) is
> --
> 1.9.3
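Editorial illustration of the `||` versus `&&` guard fixed by this patch. A minimal sketch: `target_opts` and `opts_eq` are hypothetical stand-ins for `cl_target_option` and `cl_target_option_eq`, and treating a null/non-null pair as a mismatch is an assumption of the sketch rather than something stated in the thread.

```cpp
#include <cassert>

// Hypothetical stand-in for a per-function target option block.
struct target_opts { int isa_flags; };

// Comparison that requires both pointers to be non-null.
inline bool opts_eq(const target_opts *a, const target_opts *b) {
  return a->isa_flags == b->isa_flags;
}

// Guard as originally written: true even when one pointer is null, so the
// comparison behind it could dereference a null pointer.
inline bool guard_or(const target_opts *a, const target_opts *b) {
  return a != nullptr || b != nullptr;
}

// Guard after the fix: the comparison runs only when both pointers are valid.
inline bool guard_and(const target_opts *a, const target_opts *b) {
  return a != nullptr && b != nullptr;
}

// Null-safe matching built on the corrected guard.
inline bool target_opts_match(const target_opts *a, const target_opts *b) {
  if (guard_and(a, b))
    return opts_eq(a, b);
  // Assumption of this sketch: two missing blocks match, a mixed pair does not.
  return a == nullptr && b == nullptr;
}
```

For a pair where exactly one pointer is null, `guard_or` is true and `guard_and` is false, which is the input that led to the segmentation fault in the report.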
Re: [PATCH] Fix enum operands exchange in ipa-inline.c
On Mon, Jan 12, 2015 at 7:58 AM, Hurugalawadi, Naveen naveen.hurugalaw...@caviumnetworks.com wrote:
> Hi,
>
> Sorry, Had forgot the ChangeLog.

Ok, but please properly wrap the long lines, put '? gimple_...' on a new one.

Thanks,
Richard.

> ChangeLog
>
> 2015-01-12  Naveen H.S  naveen.hurugalaw...@caviumnetworks.com
>
>         * ipa-inline.c (inline_small_functions): Swap the operands in enum.
>
> Thanks,
> Naveen
Re: [testsuite] PATCH: Correct target selector in gcc.target/i386/nop-mcount.c
On Mon, Jan 12, 2015 at 1:31 AM, H.J. Lu hjl.to...@gmail.com wrote:
> nonpic in target selector in gcc.target/i386/nop-mcount.c is ignored
> since {} is misplaced. This patch properly places {} in target selector.
> Tested on Linux/x86. OK for trunk?
>
> Thanks.
>
> H.J.
> ---
>  gcc/testsuite/gcc.target/i386/nop-mcount.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> 2015-01-11  H.J. Lu  hongjiu...@intel.com
>
>         * gcc.target/i386/nop-mcount.c: Properly place {} in target
>         selector.

This counts as obvious, so OK.

Thanks,
Uros.

> diff --git a/gcc/testsuite/gcc.target/i386/nop-mcount.c b/gcc/testsuite/gcc.target/i386/nop-mcount.c
> index 561792f..139fbb0 100644
> --- a/gcc/testsuite/gcc.target/i386/nop-mcount.c
> +++ b/gcc/testsuite/gcc.target/i386/nop-mcount.c
> @@ -1,5 +1,5 @@
>  /* Test -mnop-mcount */
> -/* { dg-do compile { target { *-*-linux* } { nonpic } } } */
> +/* { dg-do compile { target { { *-*-linux* } && nonpic } } } */
>  /* { dg-options "-pg -mfentry -mrecord-mcount -mnop-mcount" } */
>  /* { dg-final { scan-assembler-not "__fentry__" } } */
>  /* Origin: Andi Kleen */
> --
> 1.9.3
Re: [match-and-simplify] Remove printing for expression
On Sat, 10 Jan 2015, Prathamesh Kulkarni wrote:

> On 8 January 2015 at 17:52, Richard Biener rguent...@suse.de wrote:
> > On Sun, 21 Dec 2014, Prathamesh Kulkarni wrote:
> >
> >> Hi,
> >> I removed printing for expression: from print_matches.
> >> I think it is out of place tvim here and we call print_matches after lowering.
> >> OK to commit ?
> >
> > Hum, it's now a very simple wrapper around print_operand - why not
> > replace the two callers with its content?
> Indeed. Done the changes in the attached patch.

Doesn't that miss the '\n' putc?

Thanks,
Richard.

> OK to commit to match-and-simplify branch ?
>
> Thanks,
> Prathamesh
> >
> > Thanks,
> > Richard.
> >
> >> Thanks,
> >> Prathamesh

-- 
Richard Biener rguent...@suse.de
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild, Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)
Re: PATCH: PR bootstrap/64561: [5 Regression] HAVE_LD_PIE_COPYRELOC is defined to 1 for broken linker
On Mon, Jan 12, 2015 at 2:48 AM, H.J. Lu hongjiu...@intel.com wrote:
> Hi,
>
> This patch updates Linux/x86-64 linker test for PIE with copy reloc.
> Tested with broken and working linkers on Linux/x86-64. OK to install?
>
> Thanks.
>
> H.J.
> ---
> 2015-01-12  H.J. Lu  hongjiu...@intel.com
>
>         PR bootstrap/64561
>         * configure.ac (HAVE_LD_PIE_COPYRELOC): Update Linux/x86-64 linker
>         test for PIE with copy reloc.
>         * configure: Regenerated.

OK.

Thanks,
Uros.

> diff --git a/gcc/configure b/gcc/configure
> index 8670f73..1bf4358 100755
> --- a/gcc/configure
> +++ b/gcc/configure
> @@ -27052,6 +27052,11 @@ EOF
>  main:
>         movl    %eax, a_glob(%rip)
>         .size   main, .-main
> +       .globl  ptr
> +       .section        .data.rel,"aw",@progbits
> +       .type   ptr, @object
> +ptr:
> +       .quad   a_glob
>  EOF
>        if $gcc_cv_as --64 -o conftest1.o conftest1.s > /dev/null 2>&1 \
>          && $gcc_cv_ld -shared -melf_x86_64 -o conftest1.so conftest1.o > /dev/null 2>&1 \
> diff --git a/gcc/configure.ac b/gcc/configure.ac
> index d010141..102dab9 100644
> --- a/gcc/configure.ac
> +++ b/gcc/configure.ac
> @@ -4719,6 +4719,11 @@ EOF
>  main:
>         movl    %eax, a_glob(%rip)
>         .size   main, .-main
> +       .globl  ptr
> +       .section        .data.rel,"aw",@progbits
> +       .type   ptr, @object
> +ptr:
> +       .quad   a_glob
>  EOF
>        if $gcc_cv_as --64 -o conftest1.o conftest1.s > /dev/null 2>&1 \
>          && $gcc_cv_ld -shared -melf_x86_64 -o conftest1.so conftest1.o > /dev/null 2>&1 \
Re: [Patch, Fortran, OOP] PR 63733: [4.8/4.9/5 Regression] wrong resolution for OPERATOR generics
Dear Janus,

Since it is a regression, by all means update the branches. We usually propose delaying a bit, but I am not convinced that this is effective for this kind of bug fix - usually, further problems take a long time to emerge. Thus, I would recommend that you get on with it.

Thanks

Paul

On 11 January 2015 at 23:01, Janus Weil ja...@gcc.gnu.org wrote:
>> Well done for sorting that out. OK for trunk.
>
> Thanks, Paul. Committed as r219440.
>
> What about the branches?
>
> Cheers,
> Janus
>
>
>> On 11 January 2015 at 14:38, Janus Weil ja...@gcc.gnu.org wrote:
>>> Hi all,
>>>
>>> this patch fixes a wrong-code regression related to operators, by
>>> making sure that we look for typebound operators first, before looking
>>> for non-typebound ones. (Note: Each typebound operator is also added
>>> to the list of non-typebound ones, for reasons of diagnostics.)
>>>
>>> Regtested on x86_64-unknown-linux-gnu. Ok for trunk? 4.9/4.8?
>>>
>>> Cheers,
>>> Janus
>>>
>>>
>>> 2015-01-11  Janus Weil  ja...@gcc.gnu.org
>>>
>>>     PR fortran/63733
>>>     * interface.c (gfc_extend_expr): Look for type-bound operators before
>>>     non-typebound ones.
>>>
>>> 2015-01-11  Janus Weil  ja...@gcc.gnu.org
>>>
>>>     PR fortran/63733
>>>     * gfortran.dg/typebound_operator_20.f90: New.

-- 
Outside of a dog, a book is a man's best friend. Inside of a dog it's too dark to read.

Groucho Marx
Simplify badness metrics in inliner, take 2
Hi,
this is a variant of my earlier patch that I committed. It solves issues with -fprofile-use and various roundoff errors that triggered sanity checks (partly by disabling them).

Bootstrapped/regtested x86_64-linux.

Honza

        PR ipa/63967
        PR ipa/64425
        * ipa-inline.c (compute_uninlined_call_time,
        compute_inlined_call_time): Use counts for extra precision when
        needed possible.
        (big_speedup_p): Fix formating.
        (RELATIVE_TIME_BENEFIT_RANGE): Remove.
        (relative_time_benefit): Remove.
        (edge_badness): Turn DECL_DISREGARD_INLINE_LIMITS into hint;
        merge guessed and read profile paths.
        (inline_small_functions): Count only !optimize_size functions into
        initial size; be more lax about sanity check when profile is used;
        be sure to update inlined function profile when profile is read.

Index: ipa-inline.c
===================================================================
--- ipa-inline.c        (revision 219430)
+++ ipa-inline.c        (working copy)
@@ -530,12 +530,19 @@ inline sreal
 compute_uninlined_call_time (struct inline_summary *callee_info,
                              struct cgraph_edge *edge)
 {
-  sreal uninlined_call_time = (sreal)callee_info->time
-                              * MAX (edge->frequency, 1)
-                              * cgraph_freq_base_rec;
-  int caller_time = inline_summaries->get (edge->caller->global.inlined_to
-                                           ? edge->caller->global.inlined_to
-                                           : edge->caller)->time;
+  sreal uninlined_call_time = (sreal)callee_info->time;
+  cgraph_node *caller = (edge->caller->global.inlined_to
+                         ? edge->caller->global.inlined_to
+                         : edge->caller);
+
+  if (edge->count && caller->count)
+    uninlined_call_time *= (sreal)edge->count / caller->count;
+  if (edge->frequency)
+    uninlined_call_time *= cgraph_freq_base_rec * edge->frequency;
+  else
+    uninlined_call_time = uninlined_call_time >> 11;
+
+  int caller_time = inline_summaries->get (caller)->time;
   return uninlined_call_time + caller_time;
 }

@@ -546,13 +553,28 @@ inline sreal
 compute_inlined_call_time (struct cgraph_edge *edge,
                            int edge_time)
 {
-  int caller_time = inline_summaries->get (edge->caller->global.inlined_to
-                                           ? edge->caller->global.inlined_to
-                                           : edge->caller)->time;
-  sreal time = (sreal)caller_time
-               + ((sreal) (edge_time - inline_edge_summary (edge)->call_stmt_time)
-                  * MAX (edge->frequency, 1)
-                  * cgraph_freq_base_rec);
+  cgraph_node *caller = (edge->caller->global.inlined_to
+                         ? edge->caller->global.inlined_to
+                         : edge->caller);
+  int caller_time = inline_summaries->get (caller)->time;
+  sreal time = edge_time;
+
+  if (edge->count && caller->count)
+    time *= (sreal)edge->count / caller->count;
+  if (edge->frequency)
+    time *= cgraph_freq_base_rec * edge->frequency;
+  else
+    time = time >> 11;
+
+  /* This calculation should match one in ipa-inline-analysis.
+     FIXME: Once ipa-inline-analysis is converted to sreal this can be
+     simplified.  */
+  time -= (sreal) ((gcov_type) edge->frequency
+                   * inline_edge_summary (edge)->call_stmt_time
+                   * (INLINE_TIME_SCALE / CGRAPH_FREQ_BASE)) / INLINE_TIME_SCALE;
+  time += caller_time;
+  if (time <= 0)
+    time = ((sreal) 1) >> 8;
   gcc_checking_assert (time >= 0);
   return time;
 }
@@ -563,8 +585,10 @@ compute_inlined_call_time (struct cgraph
 static bool
 big_speedup_p (struct cgraph_edge *e)
 {
-  sreal time = compute_uninlined_call_time (inline_summaries->get (e->callee), e);
+  sreal time = compute_uninlined_call_time (inline_summaries->get (e->callee),
+                                            e);
   sreal inlined_time = compute_inlined_call_time (e, estimate_edge_time (e));
+
   if (time - inlined_time
       > (sreal) time * PARAM_VALUE (PARAM_INLINE_MIN_SPEEDUP)
          * percent_rec)
@@ -862,49 +886,6 @@ want_inline_function_to_all_callers_p (s
   return true;
 }

-#define RELATIVE_TIME_BENEFIT_RANGE (INT_MAX / 64)
-
-/* Return relative time improvement for inlining EDGE in range
-   as value NUMERATOR/DENOMINATOR.  */
-
-static inline void
-relative_time_benefit (struct inline_summary *callee_info,
-                       struct cgraph_edge *edge,
-                       int edge_time,
-                       sreal *numerator,
-                       sreal *denominator)
-{
-  /* Inlining into extern inline function is not a win.  */
-  if (DECL_EXTERNAL (edge->caller->global.inlined_to
-                     ? edge->caller->global.inlined_to->decl
-                     : edge->caller->decl))
-    {
-      *numerator = (sreal) 1;
-      *denominator = (sreal) 1024;
-      return;
-    }
-
-  sreal uninlined_call_time = compute_uninlined_call_time
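Editorial illustration of the scaling done by the new `compute_uninlined_call_time` in this patch. This is only a sketch with plain `double` arithmetic instead of GCC's `sreal`; the function name, the parameter list and the value 1000 for the frequency base are assumptions made for the example, and the `>> 11` in the patch is modelled as a division by 2048.

```cpp
#include <cassert>

// Assumed frequency base for this sketch (stands in for CGRAPH_FREQ_BASE).
constexpr double kFreqBase = 1000.0;

// Estimated time of caller plus callee when the call is not inlined.
inline double uninlined_call_time(double callee_time, double caller_time,
                                  long edge_count, long caller_count,
                                  int edge_frequency) {
  double t = callee_time;
  // Use profile counts for extra precision when both are available.
  if (edge_count && caller_count)
    t *= static_cast<double>(edge_count) / caller_count;
  // Scale by the relative frequency of the edge, or by a small constant
  // factor when the frequency is zero.
  if (edge_frequency)
    t *= edge_frequency / kFreqBase;
  else
    t /= 2048.0;
  return t + caller_time;
}
```

For an edge executed exactly once per caller invocation (frequency equal to the base, no counts), the result is simply the callee time plus the caller time.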
Re: [PATCH] ipa-icf.c: Fix issues generated by original latest commit
On 01/12/2015 09:51 AM, Richard Biener wrote:
> On Sat, Jan 10, 2015 at 10:03 AM, Chen Gang S gang.c...@sunrus.com.cn wrote:
>> The related commit is "275e275 IPA ICF: target and optimization flags
>> comparison.".
>>
>> For sem_function::equals_private(), fix the typo issue, and for
>> target_opts_for_fn(), fix access NULL issue.
>>
>> For cross compiling h8300, it will cause the issue below:
>>
>>   [root@localhost h8300]# cat fp-bit.i
>>   __inline__ static int a (int x)
>>   {
>>     return __builtin_expect (x == 0, 0);
>>   }
>>
>>   __inline__ static int b (int x)
>>   {
>>     return __builtin_expect (x == 1, 0);
>>   }
>>
>>   __attribute__ ((__always_inline__)) int c (int x, int y)
>>   {
>>     if (a (x))
>>       return x;
>>     if (b (x))
>>       return x;
>>     return y;
>>   }
>>   [root@localhost h8300]# /upstream/build-gcc-h8300/gcc/cc1 -O2 fp-bit.i -o test.s
>>    a b c
>>   Analyzing compilation unit
>>   fp-bit.i:11:41: warning: always_inline function might not be inlinable [-Wattributes]
>>    __attribute__ ((__always_inline__)) int c (int x, int y)
>>                                            ^
>>
>>   Performing interprocedural optimizations
>>    <*free_lang_data> <visibility> <build_ssa_passes> <chkp_passes> <opt_local_passes> <free-inline-summary> <emutls> <whole-program> <profile_estimate> <icf>
>>   fp-bit.i:18:1: internal compiler error: Segmentation fault
>>    }
>>    ^
>>   0xa11f0e crash_signal
>>           ../../gcc/gcc/toplev.c:372
>>   0xda33e7 tree_check
>>           ../../gcc/gcc/tree.h:2769
>>   0xda33e7 target_opts_for_fn
>>           ../../gcc/gcc/tree.h:4643
>>   0xda33e7 ipa_icf::sem_function::equals_private(ipa_icf::sem_item*, hash_map<symtab_node*, ipa_icf::sem_item*, default_hashmap_traits>&)
>>           ../../gcc/gcc/ipa-icf.c:438
>>   0xda4023 ipa_icf::sem_function::equals(ipa_icf::sem_item*, hash_map<symtab_node*, ipa_icf::sem_item*, default_hashmap_traits>&)
>>           ../../gcc/gcc/ipa-icf.c:393
>>   0xda6472 ipa_icf::sem_item_optimizer::subdivide_classes_by_equality(bool)
>>           ../../gcc/gcc/ipa-icf.c:1900
>>   0xdaad3c ipa_icf::sem_item_optimizer::execute()
>>           ../../gcc/gcc/ipa-icf.c:1719
>>   0xdab961 ipa_icf_driver
>>           ../../gcc/gcc/ipa-icf.c:2448
>>   0xdab961 ipa_icf::pass_ipa_icf::execute(function*)
>>           ../../gcc/gcc/ipa-icf.c:2496
>>   Please submit a full bug report,
>>   with preprocessed source if appropriate.
>>   Please include the complete backtrace with any bug report.
>>   See http://gcc.gnu.org/bugs.html for instructions.
>>
>> This issue can be found for cross compiling gcc make all-target-libgcc
>> under h8300, after fix this issue, it can continue to cross compiling to
>> meet the next building issue for h8300.
>
> Ok.
>
> Thanks,
> Richard.

Hello.

I've just installed Chen's patch.

Thanks,
Martin

>> 2015-01-10  Chen Gang  gang.chen.5...@gmail.com
>>
>>         * ipa-icf.c (sem_function::equals_private): Use '&&' instead of
>>         '||' to fix typo issue.
>>
>>         * gcc/tree.h (target_opts_for_fn): Check NULL_TREE since it can
>>         accept and return NULL.
>> ---
>>  gcc/ipa-icf.c | 2 +-
>>  gcc/tree.h    | 2 +-
>>  2 files changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c
>> index 1b76a1d..4ccaf8c 100644
>> --- a/gcc/ipa-icf.c
>> +++ b/gcc/ipa-icf.c
>> @@ -438,7 +438,7 @@ sem_function::equals_private (sem_item *item,
>>    cl_target_option *tar1 = target_opts_for_fn (decl);
>>    cl_target_option *tar2 = target_opts_for_fn (m_compared_func->decl);
>>
>> -  if (tar1 != NULL || tar2 != NULL)
>> +  if (tar1 != NULL && tar2 != NULL)
>>      {
>>        if (!cl_target_option_eq (tar1, tar2))
>>         {
>> diff --git a/gcc/tree.h b/gcc/tree.h
>> index fc8c8fe..ac27268 100644
>> --- a/gcc/tree.h
>> +++ b/gcc/tree.h
>> @@ -4640,7 +4640,7 @@ target_opts_for_fn (const_tree fndecl)
>>    tree fn_opts = DECL_FUNCTION_SPECIFIC_TARGET (fndecl);
>>    if (fn_opts == NULL_TREE)
>>      fn_opts = target_option_default_node;
>> -  return TREE_TARGET_OPTION (fn_opts);
>> +  return fn_opts == NULL_TREE ? NULL : TREE_TARGET_OPTION (fn_opts);
>>  }
>>
>>  /* opt flag for function FNDECL, e.g. opts_for_fn (fndecl, optimize) is
>> --
>> 1.9.3
Re: [PATCH, testsuite] fix ggcplug.c test-case
On Mon, 12 Jan 2015, Prathamesh Kulkarni wrote:

> On 12 January 2015 at 14:19, Richard Biener rguent...@suse.de wrote:
> > On Sun, 11 Jan 2015, Prathamesh Kulkarni wrote:
> >
> >> Hi,
> >> The test-case plugin/ggcplug.c was failing due to flattening of tree.h and tree-core.h.
> >> Test-case was incorrect because it included gcc-plugin.h after tree.h
> >> whereas gcc-plugin.h should be the first header to be included by plugins.
> >
> > No, it should be definitely included _after_ config.h, system.h and
> > coretypes.h.
> gcc-plugin.h already includes these files. Shall I remove config.h,
> system.h and coretypes.h from ggcplug.c instead ?

No, keep the patch simple for now - we are inconsistent in all the testsuite plugins it seems and wasn't the idea that plugins _only_ need to include gcc-plugin.h now? Thus I'd rather cleanup all plugin testcases at once, with a separate patch.

Thanks,
Richard.

> > Ok with moving it after coretypes.h.
> >
> > Thanks,
> > Richard.

-- 
Richard Biener rguent...@suse.de
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild, Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)
[match-and-simplify] Merge from trunk
Committed. 2015-01-12 Richard Biener rguent...@suse.de Merge from trunk r218478 through r219383.
Re: [PATCH] fix visium build
On Fri, 9 Jan 2015, Prathamesh Kulkarni wrote: Hi, The tree.h and tree-core.h flattening patch (https://gcc.gnu.org/ml/gcc-patches/2015-01/msg00467.html) broke the visium build. The attached patch fixes that. Built on visium-elf. OK to commit ? Ok. Thanks, Richard.
Re: [PATCH, testsuite] fix ggcplug.c test-case
On Sun, 11 Jan 2015, Prathamesh Kulkarni wrote: Hi, The test-case plugin/ggcplug.c was failing due to flattening of tree.h and tree-core.h. Test-case was incorrect because it included gcc-plugin.h after tree.h whereas gcc-plugin.h should be the first header to be included by plugins. No, it should be definitely included _after_ config.h, system.h and coretypes.h. Ok with moving it after coretypes.h. Thanks, Richard.
Re: [PATCH] Enable experimental TSAN support for Ada
On Sun, Jan 11, 2015 at 1:39 PM, Bernd Edlinger bernd.edlin...@hotmail.de wrote: Hi Richard, On Fri, 9 Jan 2015 17:19:57, Richard Biener wrote: Yes. As said, you generally need to run folding results through force_gimple_operand. Richard. I have now used force_gimple_operand instead of special casing the VIEW_CONVERT_EXPRs. And I see that all Ada test cases still work with -fsanitize=thread. So this feels like an improvement. I have checked with a large C++ application, to see if the generated code changes or not. And although this looked like it should not change the resulting code, I found one small difference at -O3 -fsanitize=thread while compiling the function xmlSchemaCompareValuesInt in xmlschemastypes.c of libxml2. The generated code size did not change, only two blocks of code changed place. That was the only difference in about 16 MB of code. The reason for this seems to be the following changes in the xmlschemastypes.c.104t.tsan1 <bb 29>: p1_179 = xmlSchemaDateNormalize (x_7(D), 0.0); # DEBUG p1 => p1_179 _180 = _xmlSchemaDateCastYMToDays (p1_179); - _660 = &p1_179->value.date; - _659 = &MEM[(struct xmlSchemaValDate *)_660 + 8B]; - __builtin___tsan_read2 (_659); + _660 = &MEM[(struct xmlSchemaValDate *)p1_179 + 24B]; + __builtin___tsan_read2 (_660); _181 = p1_179->value.date.day; _182 = (long int) _181; p1d_183 = _180 + _182; this pattern is repeated everywhere. (- = before the patch. + = with the patch) So it looks as if the generated code quality slightly improves with this change.
I have also tried to fold base + offset + bitpos, like this: --- tsan.c.orig 2015-01-10 00:39:06.465210937 +0100 +++ tsan.c 2015-01-11 09:28:38.109423856 +0100 @@ -213,7 +213,18 @@ instrument_expr (gimple_stmt_iterator gs align = get_object_alignment (expr); if (align < BITS_PER_UNIT) return false; - expr_ptr = build_fold_addr_expr (unshare_expr (expr)); + expr_ptr = build_fold_addr_expr (unshare_expr (base)); + if (bitpos != 0) +{ + if (offset != NULL) +offset = size_binop (PLUS_EXPR, offset, + build_int_cst (sizetype, +bitpos / BITS_PER_UNIT)); + else +offset = build_int_cst (sizetype, bitpos / BITS_PER_UNIT); +} + if (offset != NULL) +expr_ptr = fold_build_pointer_plus (expr_ptr, offset); } expr_ptr = force_gimple_operand (expr_ptr, &seq, true, NULL_TREE); if ((size & (size - 1)) != 0 || size > 16 For simplicity first only in the simple case without DECL_BIT_FIELD_REPRESENTATIVE. I tried this change at the same large C++ application, and see the code still works, but the binary size increases at -O3 by about 1%. So my conclusion would be that it is better to use force_gimple_operand directly on build_fold_addr_expr (unshare_expr (expr)), without using offset. Yeah, it probably needs more investigation. Well, I think this still resolves your objections. Furthermore I used may_be_nonaddressable_p instead of is_gimple_addressable and just return if it is found to be not true. (That did not happen in my tests.) And I reworked the block with the pt_solution_includes.
I found that it can be rewritten, because pt_solution_includes can be expanded to (is_global_var (decl) || pt_solution_includes_1 (&cfun->gimple_df->escaped, decl) || pt_solution_includes_1 (&ipa_escaped_pt, decl)) So, by De Morgan's law, you can rewrite that block to if (DECL_P (base)) { if (!is_global_var (base) && !pt_solution_includes_1 (&cfun->gimple_df->escaped, base) && !pt_solution_includes_1 (&ipa_escaped_pt, base)) return false; if (!is_global_var (base) && !may_be_aliased (base)) return false; } Therefore I can move the common term !is_global_var (base) out of the block. That's what I did. As far as I can tell, none of the other terms here seem to be redundant. Attached patch was boot-strapped and regression-tested on x86_64-linux-gnu. OK for trunk? Ok. Thanks for these improvements! Richard. Thanks Bernd.
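Side note, as a sketch only: the De Morgan step described in this message can be checked mechanically with a small truth-table model. The booleans g, e, i and m are made-up stand-ins for is_global_var, the two escaped-set tests and may_be_aliased; the functions below model the control flow only and are not the tsan.c code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the predicates named in the mail:
   g = is_global_var (base)
   e = base is in the function's escaped set
   i = base is in the IPA escaped set
   m = may_be_aliased (base)  */

/* Original shape: two independent early exits.  Returns true when
   the access would still be instrumented.  */
bool
keep_original (bool g, bool e, bool i, bool m)
{
  if (!(g || e || i))		/* models !pt_solution_includes (...) */
    return false;
  if (!g && !m)
    return false;
  return true;
}

/* Factored shape: the common term !g is hoisted out of both tests.  */
bool
keep_factored (bool g, bool e, bool i, bool m)
{
  if (!g)
    {
      if (!e && !i)
	return false;
      if (!m)
	return false;
    }
  return true;
}

/* Count the input combinations (out of 16) on which the two shapes
   disagree.  */
int
count_mismatches (void)
{
  int bad = 0;
  for (int bits = 0; bits < 16; bits++)
    {
      bool g = bits & 1, e = bits & 2, i = bits & 4, m = bits & 8;
      if (keep_original (g, e, i, m) != keep_factored (g, e, i, m))
	bad++;
    }
  return bad;
}
```

If the rewrite is sound, count_mismatches () reports no disagreement over all 16 combinations.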
Re: Simplify badness metrics in inliner, take 2
On 2015.01.12 at 10:59 +0100, Markus Trippelsdorf wrote: On 2015.01.12 at 10:30 +0100, Jan Hubicka wrote: this is a variant of my earlier patch that I committed. It solves issues with -fprofile-use and various roundoff errors that triggered sanity checks (partly by disabling them). The new assert triggers during Firefox LTO build on ppc64: (final libxul link:) lto1: internal compiler error: in inline_small_functions, at ipa-inline.c:1664 0x10d0a023 inline_small_functions ../../gcc/gcc/ipa-inline.c:1664 0x10d0a023 ipa_inline ../../gcc/gcc/ipa-inline.c:2163 0x10d0a023 execute ../../gcc/gcc/ipa-inline.c:2536 Please submit a full bug report, with preprocessed source if appropriate. Please include the complete backtrace with any bug report. See http://gcc.gnu.org/bugs.html for instructions. lto-wrapper: fatal error: ../../../gcc_test/usr/local/bin/c++ returned 1 exit status compilation terminated. /home/trippels/bin/ld: fatal error: lto-wrapper failed collect2: error: ld returned 1 exit status make[5]: *** [libxul.so] Error 1 See: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64565 -- Markus
[PATCH][ARM] Fix PR target/64460: Set 'shift' attr properly on some patterns
Hi all, In this PR we ICE when compiling with -mtune=xscale. The ICE is a segfault in xscale_sched_adjust_cost. The root cause is that xscale_sched_adjust_cost uses the value of the 'shift' insn attribute to index the recog operands. In GCC 5 the form and number of operands in those patterns were updated but the shift value was not: Author: rearnsha rearnsha@138bc75d-0d04-0410-961f-82ee72b054a4 Date: Thu May 29 09:39:07 2014 + * arm/iterators.md (shiftable_ops): New code iterator. (t2_binop0, arith_shift_insn): New code attributes. * arm/predicates.md (shift_nomul_operator): New predicate. * arm/arm.md (insn_enabled): Delete. (enabled): Remove insn_enabled test. (*arith_shiftsi): Delete. Replace with ... (*arith_shift_insn_multsi): ... new pattern. (*arith_shift_insn_shiftsi): ... new pattern. * config/arm/arm.c (arm_print_operand): Handle operand format 'b'. This led to an out-of-bounds array access. Only xscale_sched_adjust_cost uses the shift attribute, so the segfault only happens for xscale tuning. In the future we might want to use a more general pattern-matching approach to find the shifted operand in an rtx... In any case, this patch fixes the value of 'shift' for the offending pattern and also updates 'shift' for the *arith_shift_insn_shiftsi pattern to point to the correct operand that is being shifted. Tested arm-none-eabi and bootstrapped with -mtune=xscale in BOOT_CFLAGS. Ok for trunk? Thanks, Kyrill 2014-01-12 Kyrylo Tkachov kyrylo.tkac...@arm.com PR target/64460 * config/arm/arm.md (*arith_shift_insn_multsi): Set 'shift' attr to 2. (*arith_shift_insn_shiftsi): Set 'shift' attr to 3. 2014-01-12 Kyrylo Tkachov kyrylo.tkac...@arm.com PR target/64460 * gcc.target/arm/pr64460_1.c: New test.
Re: [PATCH 7/10] OpenACC 2.0 support for libgomp - OpenACC runtime, NVidia PTX/CUDA plugin
Hi! On Tue, 23 Sep 2014 19:19:31 +0100, Julian Brown jul...@codesourcery.com wrote: This patch contains the bulk of the OpenACC 2.0 runtime support, [...] --- /dev/null +++ b/libgomp/libgomp-plugin.c @@ -0,0 +1,106 @@ +/* Exported (non-hidden) functions exposing libgomp interface for plugins. */ +void +gomp_plugin_mutex_init (gomp_mutex_t *mutex) +{ + gomp_mutex_init (mutex); +} + +void +gomp_plugin_mutex_destroy (gomp_mutex_t *mutex) +{ + gomp_mutex_destroy (mutex); +} + +void +gomp_plugin_mutex_lock (gomp_mutex_t *mutex) +{ + gomp_mutex_lock (mutex); +} + +void +gomp_plugin_mutex_unlock (gomp_mutex_t *mutex) +{ + gomp_mutex_unlock (mutex); +} --- a/libgomp/libgomp.map +++ b/libgomp/libgomp.map +PLUGIN_1.0 { + global: + gomp_plugin_mutex_init; + gomp_plugin_mutex_destroy; + gomp_plugin_mutex_lock; + gomp_plugin_mutex_unlock; +}; --- /dev/null +++ b/libgomp/plugin-nvptx.c @@ -0,0 +1,1854 @@ +/* Plugin for NVPTX execution. +#include "libgomp.h" Plugins in libgomp are not to depend on libgomp internals (libgomp.h), and given that... +struct PTX_device +{ + /* A lock for use when manipulating the above stream list and array. */ + gomp_mutex_t stream_lock; +}; +static gomp_mutex_t PTX_event_lock; +static void +init_streams_for_device (struct PTX_device *ptx_dev, int concurrency) +{ + gomp_plugin_mutex_init (&ptx_dev->stream_lock); +} +[...] ... it makes much more sense to just use pthread mutexes here. Committed to gomp-4_0-branch in r219467: commit 4de7ea8222739fa60d6eb81284dac61dc2bae7b2 Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4 Date: Mon Jan 12 14:35:51 2015 + libgomp: Use pthread mutexes in the nvptx plugin. ... instead of libgomp's internal mutex implementation. Plugins aren't to depend on internal libgomp interfaces, and how would you instantiate a gomp_mutex_t in a plugin without knowing what it is exactly? libgomp/ * plugin/plugin-nvptx.c (struct ptx_device): Turn stream_lock member into a pthread_mutex_t. Adjust all users. 
(ptx_event_lock): Likewise. * libgomp-plugin.c (GOMP_PLUGIN_mutex_init) (GOMP_PLUGIN_mutex_destroy, GOMP_PLUGIN_mutex_lock) (GOMP_PLUGIN_mutex_unlock): Remove. * libgomp-plugin.h (GOMP_PLUGIN_mutex_init) (GOMP_PLUGIN_mutex_destroy, GOMP_PLUGIN_mutex_lock) (GOMP_PLUGIN_mutex_unlock): Likewise. * libgomp.map (GOMP_PLUGIN_1.0): Remove GOMP_PLUGIN_mutex_init, GOMP_PLUGIN_mutex_destroy, GOMP_PLUGIN_mutex_lock, GOMP_PLUGIN_mutex_unlock. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@219467 138bc75d-0d04-0410-961f-82ee72b054a4 --- libgomp/ChangeLog.gomp| 15 +++ libgomp/libgomp-plugin.c | 24 libgomp/libgomp-plugin.h | 7 --- libgomp/libgomp.map | 4 libgomp/plugin/plugin-nvptx.c | 39 --- 5 files changed, 35 insertions(+), 54 deletions(-) diff --git libgomp/ChangeLog.gomp libgomp/ChangeLog.gomp index 745b836..d955a85 100644 --- libgomp/ChangeLog.gomp +++ libgomp/ChangeLog.gomp @@ -1,3 +1,18 @@ +2015-01-12 Thomas Schwinge tho...@codesourcery.com + + * plugin/plugin-nvptx.c (struct ptx_device): Turn stream_lock + member into a pthread_mutex_t. Adjust all users. + (ptx_event_lock): Likewise. + * libgomp-plugin.c (GOMP_PLUGIN_mutex_init) + (GOMP_PLUGIN_mutex_destroy, GOMP_PLUGIN_mutex_lock) + (GOMP_PLUGIN_mutex_unlock): Remove. + * libgomp-plugin.h (GOMP_PLUGIN_mutex_init) + (GOMP_PLUGIN_mutex_destroy, GOMP_PLUGIN_mutex_lock) + (GOMP_PLUGIN_mutex_unlock): Likewise. + * libgomp.map (GOMP_PLUGIN_1.0): Remove GOMP_PLUGIN_mutex_init, + GOMP_PLUGIN_mutex_destroy, GOMP_PLUGIN_mutex_lock, + GOMP_PLUGIN_mutex_unlock. + 2014-12-22 Thomas Schwinge tho...@codesourcery.com * libgomp.c (struct gomp_device_descr): Add lock member. diff --git libgomp/libgomp-plugin.c libgomp/libgomp-plugin.c index 0026270..77e250e 100644 --- libgomp/libgomp-plugin.c +++ libgomp/libgomp-plugin.c @@ -82,27 +82,3 @@ GOMP_PLUGIN_fatal (const char *msg, ...) /* Unreachable. 
*/ abort (); } - -void -GOMP_PLUGIN_mutex_init (gomp_mutex_t *mutex) -{ - gomp_mutex_init (mutex); -} - -void -GOMP_PLUGIN_mutex_destroy (gomp_mutex_t *mutex) -{ - gomp_mutex_destroy (mutex); -} - -void -GOMP_PLUGIN_mutex_lock (gomp_mutex_t *mutex) -{ - gomp_mutex_lock (mutex); -} - -void -GOMP_PLUGIN_mutex_unlock (gomp_mutex_t *mutex) -{ - gomp_mutex_unlock (mutex); -} diff --git libgomp/libgomp-plugin.h libgomp/libgomp-plugin.h index 051d4e2..2e2be1f 100644 --- libgomp/libgomp-plugin.h +++ libgomp/libgomp-plugin.h @@ -29,8 +29,6 @@ #ifndef LIBGOMP_PLUGIN_H #define LIBGOMP_PLUGIN_H 1 -#include mutex.h - extern void *GOMP_PLUGIN_malloc (size_t)
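For illustration only, a plugin-side lock built directly on pthreads, as this commit describes, could look roughly like the following. struct demo_device and the demo_device_* helpers are invented for the example; they are not the definitions from plugin-nvptx.c. The point is that the plugin can instantiate a pthread_mutex_t itself, with no knowledge of libgomp internals.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical cut-down device record: just a lock and the state it
   protects.  */
struct demo_device
{
  pthread_mutex_t stream_lock;	/* guards n_streams */
  int n_streams;
};

/* Initialize the record; returns 0 on success, like pthread_mutex_init.  */
int
demo_device_init (struct demo_device *dev)
{
  dev->n_streams = 0;
  return pthread_mutex_init (&dev->stream_lock, NULL);
}

/* Register one more stream under the lock; returns the new count.  */
int
demo_device_add_stream (struct demo_device *dev)
{
  int n;

  pthread_mutex_lock (&dev->stream_lock);
  n = ++dev->n_streams;
  pthread_mutex_unlock (&dev->stream_lock);
  return n;
}

/* Tear the lock down again; returns 0 on success.  */
int
demo_device_fini (struct demo_device *dev)
{
  return pthread_mutex_destroy (&dev->stream_lock);
}
```

On older C libraries this may need to be built with -pthread.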
[PATCH] Fix PR64535 - increase emergency EH buffers via a new allocator
This fixes PR64535 by changing the fixed object size emergency pool to a variable EH object size (but fixed arena size) allocator. Via combining the dependent and non-dependent EH arenas this should allow around 600 bad_alloc throws in OOM situations on x86_64-linux compared to the current 64 which should provide some headroom to the poor souls using EH to communicate OOM in a heavily threaded environment. Bootstrapped and tested on x86_64-unknown-linux-gnu (with the #if 1 as in the patch below, forcing the use of the allocator). Comments? Ok with only the #else path retained? What about the buffer size - we're now free to choose something that doesn't depend on the size of INT_MAX (previously required for old allocator bitmap)? With the cost of some more members I can make the allocator more generic (use a constructor with an arena and an arena size parameter) and we may move it somewhere public under __gnu_cxx? But eventually boost has something like this anyway. Thanks, Richard. 2015-01-12 Richard Biener rguent...@suse.de PR libstdc++/64535 * libsupc++/eh_alloc.cc: Include <new>. (bitmask_type): Remove. (one_buffer): Likewise. (emergency_buffer): Likewise. (emergency_used): Likewise. (dependents_buffer): Likewise. (dependents_used): Likewise. (class pool): New custom fixed-size arena, variable size object allocator. (emergency_pool): New global. (__cxxabiv1::__cxa_allocate_exception): Use new emergency_pool. (__cxxabiv1::__cxa_free_exception): Likewise. (__cxxabiv1::__cxa_allocate_dependent_exception): Likewise. (__cxxabiv1::__cxa_free_dependent_exception): Likewise. 
Index: libstdc++-v3/libsupc++/eh_alloc.cc === --- libstdc++-v3/libsupc++/eh_alloc.cc (revision 216303) +++ libstdc++-v3/libsupc++/eh_alloc.cc (working copy) @@ -34,6 +34,7 @@ #include <exception> #include "unwind-cxx.h" #include <ext/concurrence.h> +#include <new> #if _GLIBCXX_HOSTED using std::free; @@ -72,62 +73,176 @@ using namespace __cxxabiv1; # define EMERGENCY_OBJ_COUNT 4 #endif -#if INT_MAX == 32767 || EMERGENCY_OBJ_COUNT <= 32 -typedef unsigned int bitmask_type; -#else -#if defined (_GLIBCXX_LLP64) -typedef unsigned long long bitmask_type; -#else -typedef unsigned long bitmask_type; -#endif -#endif - - -typedef char one_buffer[EMERGENCY_OBJ_SIZE] __attribute__((aligned)); -static one_buffer emergency_buffer[EMERGENCY_OBJ_COUNT]; -static bitmask_type emergency_used; - -static __cxa_dependent_exception dependents_buffer[EMERGENCY_OBJ_COUNT]; -static bitmask_type dependents_used; namespace { // A single mutex controlling emergency allocations. __gnu_cxx::__mutex emergency_mutex; -} -extern "C" void * -__cxxabiv1::__cxa_allocate_exception(std::size_t thrown_size) _GLIBCXX_NOTHROW -{ - void *ret; + // A fixed-size heap, variable size object allocator + class pool +{ +public: + pool(); - thrown_size += sizeof (__cxa_refcounted_exception); - ret = malloc (thrown_size); + void *allocate (size_t); + void free (void *); + + bool in_pool (void *); + +private: + struct free_entry { + size_t size; + free_entry *next; + }; + struct allocated_entry { + size_t size; + char data[]; + }; + free_entry *first_free_entry; + char arena[EMERGENCY_OBJ_SIZE * EMERGENCY_OBJ_COUNT ++ EMERGENCY_OBJ_COUNT * sizeof (__cxa_dependent_exception)] +__attribute__((aligned(__alignof__(free_entry)))); +}; - if (! 
ret) + pool::pool() { - __gnu_cxx::__scoped_lock sentry(emergency_mutex); + first_free_entry = reinterpret_cast <free_entry *> (arena); + new (first_free_entry) free_entry; + first_free_entry->size = EMERGENCY_OBJ_SIZE * EMERGENCY_OBJ_COUNT; + first_free_entry->next = NULL; +} - bitmask_type used = emergency_used; - unsigned int which = 0; + void *pool::allocate (size_t size) +{ + __gnu_cxx::__scoped_lock sentry(emergency_mutex); + /* We need an additional size_t member. */ + size += sizeof (size_t); + /* And we need to at least hand out objects of the size of + a freelist entry. */ + if (size < sizeof (free_entry)) + size = sizeof (free_entry); + /* And we need to align objects we hand out to the required + alignment of a freelist entry (this really aligns the +tail which will become a new freelist entry). */ + size = ((size + __alignof__(free_entry) - 1) + & ~(__alignof__(free_entry) - 1)); + /* Search for an entry of proper size on the freelist. */ + free_entry **e; + for (e = &first_free_entry; + *e && (*e)->size < size; + e = &(*e)->next) + ; + if (!*e) + return NULL; + allocated_entry *x; + if ((*e)->size - size >= sizeof (free_entry)) + { +
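As a reading aid, here is a minimal C sketch of the general technique the patch description names: a fixed-size arena handing out variable-size objects from a first-fit free list, with a size header in front of each object. It is deliberately simplified and is not the eh_alloc.cc code: ARENA_SIZE, pool_init, pool_allocate and pool_free are invented names, there is no locking, and freed blocks are pushed back without coalescing neighbours.

```c
#include <assert.h>
#include <stddef.h>

#define ARENA_SIZE 1024		/* assumed size, chosen for the example */

struct free_entry
{
  size_t size;			/* size of this free block in bytes */
  struct free_entry *next;
};

#define POOL_ALIGN (_Alignof (struct free_entry))

/* The union only serves to give the byte array suitable alignment.  */
static union
{
  struct free_entry align;
  unsigned char bytes[ARENA_SIZE];
} arena;

static struct free_entry *first_free;

/* Put the whole arena on the free list as one block.  */
void
pool_init (void)
{
  first_free = (struct free_entry *) arena.bytes;
  first_free->size = ARENA_SIZE;
  first_free->next = NULL;
}

/* First-fit allocation; returns NULL when no free block is big enough.  */
void *
pool_allocate (size_t size)
{
  struct free_entry **e;

  size += sizeof (size_t);			/* room for the size header */
  if (size < sizeof (struct free_entry))	/* must be reusable later */
    size = sizeof (struct free_entry);
  size = (size + POOL_ALIGN - 1) & ~(POOL_ALIGN - 1);

  for (e = &first_free; *e && (*e)->size < size; e = &(*e)->next)
    ;
  if (!*e)
    return NULL;

  struct free_entry *found = *e;
  if (found->size - size >= sizeof (struct free_entry))
    {
      /* Split: the tail stays on the free list in place of FOUND.  */
      struct free_entry *tail
	= (struct free_entry *) ((unsigned char *) found + size);
      tail->size = found->size - size;
      tail->next = found->next;
      *e = tail;
    }
  else
    {
      /* Remainder too small to track: hand out the whole block.  */
      size = found->size;
      *e = found->next;
    }

  size_t *header = (size_t *) found;
  *header = size;
  return header + 1;
}

/* Return a block to the front of the free list (no coalescing).  */
void
pool_free (void *p)
{
  size_t *header = (size_t *) p - 1;
  size_t size = *header;
  struct free_entry *fe = (struct free_entry *) header;

  fe->size = size;
  fe->next = first_free;
  first_free = fe;
}
```

The minimum-size and alignment adjustments in pool_allocate exist for the same reason the comments in the quoted diff give: every block handed out must later be able to hold a free-list entry, and every split-off tail must be aligned for one.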
[PATCH] Fix PR64404
I am testing the following patch to fix a latent bug in the vectorizer dealing with redundant DRs. Bootstrap and regtest pending on x86_64-unknown-linux-gnu. Richard. 2015-01-12 Richard Biener rguent...@suse.de PR tree-optimization/64404 * tree-vect-stmts.c (vectorizable_load): Use the proper vectorized stmts for CSEing loads with the same DR. * gcc.dg/vect/pr64404.c: New testcase. Index: gcc/tree-vect-stmts.c === --- gcc/tree-vect-stmts.c (revision 219446) +++ gcc/tree-vect-stmts.c (working copy) @@ -6155,7 +6155,7 @@ vectorizable_load (gimple stmt, gimple_s is even wrong code. See PR56270. */ && !slp) { - *vec_stmt = STMT_VINFO_VEC_STMT (stmt_info); + *vec_stmt = STMT_VINFO_VEC_STMT (vinfo_for_stmt (first_stmt)); return true; } first_dr = STMT_VINFO_DATA_REF (vinfo_for_stmt (first_stmt)); Index: gcc/testsuite/gcc.dg/vect/pr64404.c === --- gcc/testsuite/gcc.dg/vect/pr64404.c (revision 0) +++ gcc/testsuite/gcc.dg/vect/pr64404.c (working copy) @@ -0,0 +1,26 @@ +/* { dg-do compile } */ +/* { dg-additional-options "--param=sccvn-max-alias-queries-per-access=1" } */ + +typedef struct +{ + float l, h; +} tFPinterval; + +tFPinterval X[1024]; +tFPinterval Y[1024]; +tFPinterval Z[1024]; + +void +Compute (void) +{ + int d; + for (d = 0; d < 1024; d++) +{ + Y[d].l = X[d].l + X[d].h; + Y[d].h = Y[d].l; + Z[d].l = X[d].l; + Z[d].h = X[d].h; +} +} + +/* { dg-final { cleanup-tree-dump "vect" } } */
Re: [PATCH 7/10] OpenACC 2.0 support for libgomp - OpenACC runtime, NVidia PTX/CUDA plugin
Hi! On Mon, 12 Jan 2015 15:37:46 +0100, I wrote: On Tue, 23 Sep 2014 19:19:31 +0100, Julian Brown jul...@codesourcery.com wrote: This patch contains the bulk of the OpenACC 2.0 runtime support, [...] --- /dev/null +++ b/libgomp/plugin-nvptx.c @@ -0,0 +1,1854 @@ +/* Plugin for NVPTX execution. +#include "libgomp.h" Plugins in libgomp are not to depend on libgomp internals (libgomp.h), ... it makes much more sense to just use pthread mutexes here. Committed to gomp-4_0-branch in r219467: commit 4de7ea8222739fa60d6eb81284dac61dc2bae7b2 Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4 Date: Mon Jan 12 14:35:51 2015 + libgomp: Use pthread mutexes in the nvptx plugin. ... instead of libgomp's internal mutex implementation. Plugins aren't to depend on internal libgomp interfaces, and how would you instantiate a gomp_mutex_t in a plugin without knowing what it is exactly? Given this, we can then tighten the libgomp plugins' include files; committed to gomp-4_0-branch in r219469: commit 7c011e60ec4e056e4c1b054966fd95fb2cb5e44a Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4 Date: Mon Jan 12 14:53:53 2015 + libgomp: Don't use internal libgomp.h for plugins. ..., and explicitly link libgomp plugins against libgomp. libgomp/ * plugin/plugin-host.c [HOST_NONSHM_PLUGIN]: Don't include libgomp.h. * plugin/plugin-nvptx.c: Likewise. Include stdbool.h. * plugin/Makefrag.am (libgomp_plugin_nvptx_la_LIBADD) (libgomp_plugin_host_nonshm_la_LIBADD): Append libgomp.la. * Makefile.in: Regenerate. 
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@219469 138bc75d-0d04-0410-961f-82ee72b054a4 --- libgomp/ChangeLog.gomp| 6 ++ libgomp/Makefile.in | 7 --- libgomp/plugin/Makefrag.am| 3 ++- libgomp/plugin/plugin-host.c | 2 +- libgomp/plugin/plugin-nvptx.c | 2 +- 5 files changed, 14 insertions(+), 6 deletions(-) diff --git libgomp/ChangeLog.gomp libgomp/ChangeLog.gomp index 76f21e6..c2566cf 100644 --- libgomp/ChangeLog.gomp +++ libgomp/ChangeLog.gomp @@ -1,5 +1,11 @@ 2015-01-12 Thomas Schwinge tho...@codesourcery.com + * plugin/plugin-host.c [HOST_NONSHM_PLUGIN]: Don't include libgomp.h. + * plugin/plugin-nvptx.c: Likewise. Include stdbool.h. + * plugin/Makefrag.am (libgomp_plugin_nvptx_la_LIBADD) + (libgomp_plugin_host_nonshm_la_LIBADD): Append libgomp.la. + * Makefile.in: Regenerate. + * env.c: Don't include libgomp_target.h. * libgomp-plugin.c: Likewise. * oacc-async.c: Likewise. diff --git libgomp/Makefile.in libgomp/Makefile.in index ac34b97..8758989 100644 --- libgomp/Makefile.in +++ libgomp/Makefile.in @@ -123,7 +123,7 @@ am__installdirs = $(DESTDIR)$(toolexeclibdir) $(DESTDIR)$(infodir) \ $(DESTDIR)$(fincludedir) $(DESTDIR)$(libsubincludedir) \ $(DESTDIR)$(toolexeclibdir) LTLIBRARIES = $(toolexeclib_LTLIBRARIES) -libgomp_plugin_host_nonshm_la_LIBADD = +libgomp_plugin_host_nonshm_la_DEPENDENCIES = libgomp.la am_libgomp_plugin_host_nonshm_la_OBJECTS = \ libgomp_plugin_host_nonshm_la-plugin-host.lo libgomp_plugin_host_nonshm_la_OBJECTS = \ @@ -133,7 +133,7 @@ libgomp_plugin_host_nonshm_la_LINK = $(LIBTOOL) --tag=CC \ --mode=link $(CCLD) $(AM_CFLAGS) $(CFLAGS) \ $(libgomp_plugin_host_nonshm_la_LDFLAGS) $(LDFLAGS) -o $@ am__DEPENDENCIES_1 = -@PLUGIN_NVPTX_TRUE@libgomp_plugin_nvptx_la_DEPENDENCIES = \ +@PLUGIN_NVPTX_TRUE@libgomp_plugin_nvptx_la_DEPENDENCIES = libgomp.la \ @PLUGIN_NVPTX_TRUE@$(am__DEPENDENCIES_1) @PLUGIN_NVPTX_TRUE@am_libgomp_plugin_nvptx_la_OBJECTS = \ @PLUGIN_NVPTX_TRUE@libgomp_plugin_nvptx_la-plugin-nvptx.lo @@ -407,7 +407,7 @@ 
libgomp_la_SOURCES = alloc.c barrier.c critical.c env.c error.c iter.c \ @PLUGIN_NVPTX_TRUE@libgomp_plugin_nvptx_la_LDFLAGS = \ @PLUGIN_NVPTX_TRUE@$(libgomp_plugin_nvptx_version_info) \ @PLUGIN_NVPTX_TRUE@$(lt_host_flags) $(PLUGIN_NVPTX_LDFLAGS) -@PLUGIN_NVPTX_TRUE@libgomp_plugin_nvptx_la_LIBADD = $(PLUGIN_NVPTX_LIBS) +@PLUGIN_NVPTX_TRUE@libgomp_plugin_nvptx_la_LIBADD = libgomp.la $(PLUGIN_NVPTX_LIBS) @PLUGIN_NVPTX_TRUE@libgomp_plugin_nvptx_la_LIBTOOLFLAGS = --tag=disable-static libgomp_plugin_host_nonshm_version_info = -version-info $(libtool_VERSION) libgomp_plugin_host_nonshm_la_SOURCES = plugin/plugin-host.c @@ -415,6 +415,7 @@ libgomp_plugin_host_nonshm_la_CPPFLAGS = $(AM_CPPFLAGS) -DHOST_NONSHM_PLUGIN libgomp_plugin_host_nonshm_la_LDFLAGS = \ $(libgomp_plugin_host_nonshm_version_info) $(lt_host_flags) +libgomp_plugin_host_nonshm_la_LIBADD = libgomp.la libgomp_plugin_host_nonshm_la_LIBTOOLFLAGS = --tag=disable-static nodist_noinst_HEADERS = libgomp_f.h nodist_libsubinclude_HEADERS = omp.h openacc.h diff --git libgomp/plugin/Makefrag.am libgomp/plugin/Makefrag.am index d2c5428..167485f 100644 ---
Re: [x86, PATCH] operand reordering for commutative operations
Hi All, Thanks a lot for your comments. I've re-written reorder_operands as you proposed, but I'd like to know if we should apply this reordering at -O0? I will re-send the patch after testing completion. Thanks. Yuri. 2015-01-09 13:13 GMT+03:00 Richard Biener richard.guent...@gmail.com: On Mon, Jan 5, 2015 at 9:26 PM, Jeff Law l...@redhat.com wrote: On 12/29/14 06:30, Yuri Rumyantsev wrote: Hi All, Here is a patch which fixed several performance degradation after operand canonicalization (r216728). Very simple approach is used - if operation is commutative and its second operand required more operations (statements) for computation, swap operands. Currently this is done under special option which is set-up to true only for x86 32-bit targets ( we have not seen any performance improvements on 64-bit). Is it OK for trunk? 2014-12-26 Yuri Rumyantsev ysrum...@gmail.com * cfgexpand.c (count_num_stmt): New function. (reorder_operands): Likewise. (expand_gimple_basic_block): Insert call of reorder_operands. * common.opt(flag_reorder_operands): Add new flag. * config/i386/i386.c (ix86_option_override_internal): Add setup of flag_reorder_operands for 32-bit target only. * (doc/invoke.texi: Add new optimization option -freorder-operands. gcc/testsuite/ChangeLog * gcc.target/i386/swap_opnd.c: New test. I'd do this unconditionally -- I don't think there's a compelling reason to add another flag here. Indeed. Could you use estimate_num_insns rather than rolling your own estimate code here? All you have to do is setup the weights structure and call the estimation code. I wouldn't be surprised if ultimately the existing insn estimator is better than the one you're adding. Just use eni_size_weights. Your counting is quadratic, that's a no-go. You'd have to keep a lattice of counts for SSA names to avoid this. There is swap_ssa_operands (), in your swapping code you fail to update SSA operands (maybe non-fatal because we are just expanding to RTL, but ...). 
bb->loop_father is always non-NULL, but doing this everywhere, not only in loops looks fine to me. You can swap comparison operands on GIMPLE_CONDs for all codes by also swapping the EDGE_TRUE_VALUE/EDGE_FALSE_VALUE flags on the outgoing BB edges. There are more cases that can be swapped in regular stmts as well, but I suppose we don't need to be complete here. So, in reorder_operands I'd do (pseudo-code) n = 0; for-all-stmts gimple_set_uid (stmt, n++); lattice = XALLOCVEC (unsigned, n); i = 0; for-all-stmts this_stmt_cost = estimate_num_insns (stmt, eni_size_weights); lattice[i] = this_stmt_cost; FOR_EACH_SSA_USE_OPERAND () if (use-in-this-BB) lattice[i] += lattice[gimple_uid (SSA_NAME_DEF_STMT)]; i++; swap-if-operand-cost says so Richard. Make sure to reference the PR in the ChangeLog. Please update and resubmit. Thanks, Jeff
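A toy model of the lattice idea from the pseudo-code above, as a sketch under stated assumptions: statements are entries of an array, each with its own cost and up to two operands that are either -1 (defined outside the block) or the index of an earlier statement in the same block. struct toy_stmt, compute_lattice and should_swap are invented names; real code would use estimate_num_insns and the SSA use iterators instead. One forward pass fills the lattice, so the work is linear in the number of statements rather than quadratic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a statement in one basic block.  */
struct toy_stmt
{
  unsigned cost;	/* this statement's own estimated cost */
  int op[2];		/* defining stmt index in this block, or -1 */
};

/* lattice[i] = own cost of stmt I plus the accumulated cost of the
   same-block definitions it uses.  Each statement is visited once.  */
void
compute_lattice (const struct toy_stmt *stmts, size_t n, unsigned *lattice)
{
  for (size_t i = 0; i < n; i++)
    {
      lattice[i] = stmts[i].cost;
      for (int k = 0; k < 2; k++)
	{
	  int def = stmts[i].op[k];
	  if (def >= 0)
	    {
	      /* Definitions must precede their uses in the block.  */
	      assert ((size_t) def < i);
	      lattice[i] += lattice[def];
	    }
	}
    }
}

/* Accumulated cost of computing one operand; 0 if defined elsewhere.  */
static unsigned
operand_cost (const unsigned *lattice, int def)
{
  return def >= 0 ? lattice[def] : 0;
}

/* True if the second operand is the more expensive one to compute,
   i.e. a commutative statement S would have its operands swapped.  */
bool
should_swap (const struct toy_stmt *s, const unsigned *lattice)
{
  return operand_cost (lattice, s->op[1]) > operand_cost (lattice, s->op[0]);
}
```

As in the pseudo-code, a definition that is used on several paths is counted once per use, so the lattice values are an estimate rather than an exact statement count.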
Re: [arm][patch] fix arm_neon_ok check on !arm_arch7
Sorry about the slow response- have been on holiday and still catching up on email. On 12/01/15 13:16, Andrew Stubbs wrote: Ping. On 23/12/14 16:46, Andrew Stubbs wrote: On 03/12/14 15:03, Andrew Stubbs wrote: The tools have always allowed us to drop down the arch to march=armv5te along with using -mfpu=neon. We are now changing command line behaviour, so an inform in terms of diagnostics to the user would be useful as it states that we don't really have mfpu=neon generating neon code any more because of this particular case. If we are to do this then the original patch is probably not enough as it then doesn't handle the case of TARGET_VFP3 / TARGET_VFP5 / TARGET_NEON_FP16 / TARGET_FP16 / TARGET_FPU_ARMV8 etc. etc. etc. I'll take a look at those shortly. Or, not so shortly. Sigh. It seems that, on ARM, the arch/CPU setting is basically orthogonal to the FPU setting, and the compiler doesn't even try to match the one to the other. The assembler does the same. In fact, the testcases that James refers to, that have hard-coded -march options, really do emit armv4 code with Neon, say, although most probably don't have vectorizable code. They only work because they're most likely executed on Neon hardware. Yes - though I'm surprised as I run an armv5te soft float only test run once a while on my Sheevaplug and don't see these issues. Maybe others do. This means that there's no obvious patch to fix the issue, in the compiler. It's easy to reject Neon for pre-v7 CPUs, but that has consequences, as we've seen. We'd have to have a table of fall-back FPUs or something, and that doesn't seem straight-forward (and anyway, I'm not sure what values to enter into that table). So, I've attacked the problem from the other end, and updated the compiler check. OK to commit? In principle ok, but I'd like a comment in there explaining why we've done this. Can you also post under what configurations these have been tested ? Ramana Andrew
Re: [x86, PATCH] operand reordering for commutative operations
On Mon, Jan 12, 2015 at 4:00 PM, Yuri Rumyantsev ysrum...@gmail.com wrote: Hi All, Thanks a lot for your comments. I've re-written reorder_operands as you proposed, but I'd like to know if we should apply this reordering at -O0? No, I think we can spare those cycles there. Richard. I will re-send the patch after testing completion. Thanks. Yuri. 2015-01-09 13:13 GMT+03:00 Richard Biener richard.guent...@gmail.com: On Mon, Jan 5, 2015 at 9:26 PM, Jeff Law l...@redhat.com wrote: On 12/29/14 06:30, Yuri Rumyantsev wrote: Hi All, Here is a patch which fixed several performance degradation after operand canonicalization (r216728). Very simple approach is used - if operation is commutative and its second operand required more operations (statements) for computation, swap operands. Currently this is done under special option which is set-up to true only for x86 32-bit targets ( we have not seen any performance improvements on 64-bit). Is it OK for trunk? 2014-12-26 Yuri Rumyantsev ysrum...@gmail.com * cfgexpand.c (count_num_stmt): New function. (reorder_operands): Likewise. (expand_gimple_basic_block): Insert call of reorder_operands. * common.opt(flag_reorder_operands): Add new flag. * config/i386/i386.c (ix86_option_override_internal): Add setup of flag_reorder_operands for 32-bit target only. * (doc/invoke.texi: Add new optimization option -freorder-operands. gcc/testsuite/ChangeLog * gcc.target/i386/swap_opnd.c: New test. I'd do this unconditionally -- I don't think there's a compelling reason to add another flag here. Indeed. Could you use estimate_num_insns rather than rolling your own estimate code here? All you have to do is setup the weights structure and call the estimation code. I wouldn't be surprised if ultimately the existing insn estimator is better than the one you're adding. Just use eni_size_weights. Your counting is quadratic, that's a no-go. You'd have to keep a lattice of counts for SSA names to avoid this. 
There is swap_ssa_operands (), in your swapping code you fail to update SSA operands (maybe non-fatal because we are just expanding to RTL, but ...). bb->loop_father is always non-NULL, but doing this everywhere, not only in loops looks fine to me. You can swap comparison operands on GIMPLE_CONDs for all codes by also swapping the EDGE_TRUE_VALUE/EDGE_FALSE_VALUE flags on the outgoing BB edges. There are more cases that can be swapped in regular stmts as well, but I suppose we don't need to be complete here. So, in reorder_operands I'd do (pseudo-code) n = 0; for-all-stmts gimple_set_uid (stmt, n++); lattice = XALLOCVEC (unsigned, n); i = 0; for-all-stmts this_stmt_cost = estimate_num_insns (stmt, eni_size_weights); lattice[i] = this_stmt_cost; FOR_EACH_SSA_USE_OPERAND () if (use-in-this-BB) lattice[i] += lattice[gimple_uid (SSA_NAME_DEF_STMT)]; i++; swap-if-operand-cost says so Richard. Make sure to reference the PR in the ChangeLog. Please update and resubmit. Thanks, Jeff
[PATCH][Aarch64] PR64149: Remove -mlra/-mno-lra option for Aarch64.
Hello, The LRA register allocator is enabled by default for the AArch64 backend and -mno-lra should no longer be used. This patch removes the -mlra/-mno-lra option for AArch64. Tested aarch64-none-linux-gnu with gcc-check. Matthew 2015-01-08 Matthew Wahab matthew.wa...@arm.com PR target/64149 * config/aarch64/aarch64.opt: Remove lra option and aarch64_lra_flag variable. * config/aarch64/aarch64.c (TARGET_LRA_P): Set to hook_bool_void_true. (aarch64_lra_p): Remove. diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 5100532..fc0bbad 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -146,7 +146,6 @@ enum aarch64_code_model aarch64_cmodel; #define TARGET_HAVE_TLS 1 #endif -static bool aarch64_lra_p (void); static bool aarch64_composite_type_p (const_tree, machine_mode); static bool aarch64_vfp_is_call_or_return_candidate (machine_mode, const_tree, @@ -7732,13 +7731,6 @@ aapcs_vfp_sub_candidate (const_tree type, machine_mode *modep) return -1; } -/* Return true if we use LRA instead of reload pass. */ -static bool -aarch64_lra_p (void) -{ - return aarch64_lra_flag; -} - /* Return TRUE if the type, as described by TYPE and MODE, is a composite type as described in AAPCS64 \S 4.3. This includes aggregate, union and array types. 
The C99 floating-point complex types are also considered @@ -11053,7 +11045,7 @@ aarch64_gen_adjusted_ldpstp (rtx *operands, bool load, #define TARGET_LIBGCC_CMP_RETURN_MODE aarch64_libgcc_cmp_return_mode #undef TARGET_LRA_P -#define TARGET_LRA_P aarch64_lra_p +#define TARGET_LRA_P hook_bool_void_true #undef TARGET_MANGLE_TYPE #define TARGET_MANGLE_TYPE aarch64_mangle_type diff --git a/gcc/config/aarch64/aarch64.opt b/gcc/config/aarch64/aarch64.opt index 44c6350..f2ef124 100644 --- a/gcc/config/aarch64/aarch64.opt +++ b/gcc/config/aarch64/aarch64.opt @@ -107,10 +107,6 @@ mabi= Target RejectNegative Joined Enum(aarch64_abi) Var(aarch64_abi) Init(AARCH64_ABI_DEFAULT) -mabi=ABI Generate code that conforms to the specified ABI -mlra -Target Report Var(aarch64_lra_flag) Init(1) Save -Use LRA instead of reload (transitional) - Enum Name(aarch64_abi) Type(int) Known AArch64 ABIs (for use with the -mabi= option):
[PATCH] Fix PR64436: broken logic to process bitwise ORs in bswap pass
Hi all,

To identify if a set of loads, shift, cast, mask (bitwise and) and bitwise OR is equivalent to a load or byteswap, the bswap pass assigns a number to each byte loaded according to its significance (1 for lsb, 2 for next least significant byte, etc.) and forms a symbolic number such as 0x04030201 for a 32bit load. When processing a bitwise OR of two such symbolic numbers, it is necessary to consider the lowest and highest addresses where a byte was loaded to renumber each byte accordingly. For instance, if the two numbers are 0x04030201 and they were loaded from consecutive words in memory the result would be 0x0807060504030201, but if they overlap fully the result would be 0x04030201.

Currently the computation of the byte with highest address is broken: it takes the byte with highest address of the symbolic number that starts last. That is, if one number represents an 8bit load at address 0x14 and another number represents a 32bit load at address 0x12 it will compute the end as 0x14 instead of 0x15. This error affects the computation of the size of the load for all targets and the computation of the symbolic number that results from the bitwise OR for big endian targets. This is what causes PR64436 due to a change in the gimple generated for that testcase.

ChangeLog entry is as follows:

gcc/ChangeLog

2014-12-30  Thomas Preud'homme  thomas.preudho...@arm.com

    PR tree-optimization/64436
    * tree-ssa-math-opts.c (find_bswap_or_nop_1): Move code performing the merge of two symbolic numbers for a bitwise OR to ...
    (perform_symbolic_merge): This. Also fix computation of the range and end of the symbolic number corresponding to the result of a bitwise OR.
diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c
index 1ed2838..286183a 100644
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -1816,6 +1816,123 @@ find_bswap_or_nop_load (gimple stmt, tree ref, struct symbolic_number *n)
   return true;
 }
 
+/* Compute the symbolic number N representing the result of a bitwise OR on 2
+   symbolic number N1 and N2 whose source statements are respectively
+   SOURCE_STMT1 and SOURCE_STMT2.  */
+
+static gimple
+perform_symbolic_merge (gimple source_stmt1, struct symbolic_number *n1,
+                        gimple source_stmt2, struct symbolic_number *n2,
+                        struct symbolic_number *n)
+{
+  int i, size;
+  uint64_t mask;
+  gimple source_stmt;
+  struct symbolic_number *n_start;
+
+  /* Sources are different, cancel bswap if they are not memory location with
+     the same base (array, structure, ...).  */
+  if (gimple_assign_rhs1 (source_stmt1) != gimple_assign_rhs1 (source_stmt2))
+    {
+      int64_t inc;
+      HOST_WIDE_INT start_sub, end_sub, end1, end2, end;
+      struct symbolic_number *toinc_n_ptr, *n_end;
+
+      if (!n1->base_addr || !n2->base_addr
+          || !operand_equal_p (n1->base_addr, n2->base_addr, 0))
+        return NULL;
+
+      if (!n1->offset != !n2->offset ||
+          (n1->offset && !operand_equal_p (n1->offset, n2->offset, 0)))
+        return NULL;
+
+      if (n1->bytepos < n2->bytepos)
+        {
+          n_start = n1;
+          start_sub = n2->bytepos - n1->bytepos;
+          source_stmt = source_stmt1;
+        }
+      else
+        {
+          n_start = n2;
+          start_sub = n1->bytepos - n2->bytepos;
+          source_stmt = source_stmt2;
+        }
+
+      /* Find the highest address at which a load is performed and
+         compute related info.  */
+      end1 = n1->bytepos + (n1->range - 1);
+      end2 = n2->bytepos + (n2->range - 1);
+      if (end1 < end2)
+        {
+          end = end2;
+          end_sub = end2 - end1;
+        }
+      else
+        {
+          end = end1;
+          end_sub = end1 - end2;
+        }
+      n_end = (end2 > end1) ? n2 : n1;
+
+      /* Find symbolic number whose lsb is the most significant.  */
+      if (BYTES_BIG_ENDIAN)
+        toinc_n_ptr = (n_end == n1) ? n2 : n1;
+      else
+        toinc_n_ptr = (n_start == n1) ? n2 : n1;
+
+      n->range = end - n_start->bytepos + 1;
+
+      /* Check that the range of memory covered can be represented by
+         a symbolic number.  */
+      if (n->range > 64 / BITS_PER_MARKER)
+        return NULL;
+
+      /* Reinterpret byte marks in symbolic number holding the value of
+         bigger weight according to target endianness.  */
+      inc = BYTES_BIG_ENDIAN ? end_sub : start_sub;
+      size = TYPE_PRECISION (n1->type) / BITS_PER_UNIT;
+      for (i = 0; i < size; i++, inc <<= BITS_PER_MARKER)
+        {
+          unsigned marker =
+            (toinc_n_ptr->n >> (i * BITS_PER_MARKER)) & MARKER_MASK;
+          if (marker && marker != MARKER_BYTE_UNKNOWN)
+            toinc_n_ptr->n += inc;
+        }
+    }
+  else
+    {
+      n->range = n1->range;
+      n_start = n1;
+      source_stmt = source_stmt1;
+    }
+
+  if (!n1->alias_set
+      || alias_ptr_types_compatible_p (n1->alias_set, n2->alias_set))
+    n->alias_set = n1->alias_set;
+  else
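The end-address example from the description (an 8bit load at 0x14 merged with a 32bit load at 0x12) can be worked through with a small stand-alone sketch. It is not the code from the pass: ToyLoad and the helper names are hypothetical, and only the bytepos/range arithmetic is modelled.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-in for the bytepos/range pair of a symbolic number.
struct ToyLoad {
  std::int64_t bytepos;  // address of the first byte loaded
  std::int64_t range;    // number of bytes loaded
};

// Address of the last byte touched by a load.
std::int64_t last_byte(const ToyLoad& l) { return l.bytepos + (l.range - 1); }

// Behaviour described as broken: use the end of the load that starts last.
std::int64_t merged_end_broken(const ToyLoad& a, const ToyLoad& b) {
  const ToyLoad& starts_last = (a.bytepos < b.bytepos) ? b : a;
  return last_byte(starts_last);
}

// Behaviour after the fix: the highest last byte of either load.
std::int64_t merged_end_fixed(const ToyLoad& a, const ToyLoad& b) {
  return std::max(last_byte(a), last_byte(b));
}

// Size in bytes of the merged access: end - lowest start + 1.
std::int64_t merged_range(const ToyLoad& a, const ToyLoad& b) {
  std::int64_t start = std::min(a.bytepos, b.bytepos);
  return merged_end_fixed(a, b) - start + 1;
}
```

For ToyLoad{0x14, 1} and ToyLoad{0x12, 4} the broken variant yields 0x14 while the fixed one yields 0x15, giving a merged range of 4 bytes, which matches the numbers quoted in the description.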
Re: [arm][patch] fix arm_neon_ok check on !arm_arch7
Ping. On 23/12/14 16:46, Andrew Stubbs wrote: On 03/12/14 15:03, Andrew Stubbs wrote: The tools have always allowed us to drop down the arch to march=armv5te along with using -mfpu=neon. We are now changing command line behaviour, so an inform in terms of diagnostics to the user would be useful as it states that we don't really have mfpu=neon generating neon code any more because of this particular case. If we are to do this then the original patch is probably not enough as it then doesn't handle the case of TARGET_VFP3 / TARGET_VFP5 / TARGET_NEON_FP16 / TARGET_FP16 / TARGET_FPU_ARMV8 etc. etc. etc. I'll take a look at those shortly. Or, not so shortly. It seems that, on ARM, the arch/CPU setting is basically orthogonal to the FPU setting, and the compiler doesn't even try to match the one to the other. The assembler does the same. In fact, the testcases that James refers to, that have hard-coded -march options, really do emit armv4 code with Neon, say, although most probably don't have vectorizable code. They only work because they're most likely executed on Neon hardware. This means that there's no obvious patch to fix the issue, in the compiler. It's easy to reject Neon for pre-v7 CPUs, but that has consequences, as we've seen. We'd have to have a table of fall-back FPUs or something, and that doesn't seem straight-forward (and anyway, I'm not sure what values to enter into that table). So, I've attacked the problem from the other end, and updated the compiler check. OK to commit? Andrew
C++ PATCH for c++/64547 (constexpr fn returning void)
In C++14 a constexpr function doesn't need to return a value.

Tested x86_64-pc-linux-gnu, applying to trunk.

commit 9675a7bde41b5430197854d8c1822c8f4d95b95e
Author: Jason Merrill ja...@redhat.com
Date:   Fri Jan 9 01:46:16 2015 -0500

    PR c++/64547
    * constexpr.c (cxx_eval_call_expression): A call to a void function doesn't need to return a value.

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index 9a0d518..650250b 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -1386,6 +1386,8 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree t,
              value by evaluating *this, but we don't bother; there's no
              need to put such a call in the hash table.  */
           result = lval ? ctx->object : ctx->ctor;
+        else if (VOID_TYPE_P (TREE_TYPE (res)))
+          result = void_node;
         else
           {
             result = *ctx->values->get (slot ? slot : res);
diff --git a/gcc/testsuite/g++.dg/cpp1y/constexpr-void2.C b/gcc/testsuite/g++.dg/cpp1y/constexpr-void2.C
new file mode 100644
index 000..321a35e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/constexpr-void2.C
@@ -0,0 +1,21 @@
+// PR c++/64547
+// { dg-do compile { target c++14 } }
+
+struct X
+{
+    int x;
+    constexpr int get() const {return x;}
+    constexpr void set(int foo) {x = foo;}
+};
+
+constexpr int bar()
+{
+    X x{42};
+    x.set(666);
+    return x.get();
+}
+
+int main()
+{
+    constexpr int foo = bar();
+}
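As a stand-alone illustration of the language rule behind this fix (not of the GCC change itself), a C++14 constexpr member function returning void can be called during constant evaluation. The names Counter and add_then_get are made up for this sketch, which assumes a compiler running in C++14 mode or later.

```cpp
// Hypothetical type; only the language feature matters here.
struct Counter {
  int value;
  constexpr int get() const { return value; }
  // C++14 allows a constexpr function to return void and to modify *this.
  constexpr void add(int n) { value += n; }
};

constexpr int add_then_get(int start, int n) {
  Counter c{start};
  c.add(n);        // void call evaluated during constant evaluation
  return c.get();
}

// Checked by the compiler, so the void call is part of a constant expression.
static_assert(add_then_get(40, 2) == 42,
              "void constexpr call is usable in a constant expression");
```

If the compiler rejected void constexpr calls, the static_assert would fail to compile rather than fail at run time.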
Re: [PATCH][ARM] Implement TARGET_SCHED_MACRO_FUSION_PAIR_P
On Thu, Dec 4, 2014 at 9:19 AM, Kyrill Tkachov kyrylo.tkac...@arm.com wrote:
On 02/12/14 22:58, Ramana Radhakrishnan wrote:
On Tue, Nov 11, 2014 at 11:55 AM, Kyrill Tkachov kyrylo.tkac...@arm.com wrote:

Hi all,

This is the arm implementation of the macro fusion hook. It tries to fuse movw+movt operations together. It also tries to take lo_sum RTXs into account since those generate movt instructions as well.

Bootstrapped and tested on arm-none-linux-gnueabihf.

Ok for trunk?

  if (current_tune->fuseable_ops & ARM_FUSE_MOVW_MOVT)
+    {
+      /* We are trying to fuse
+         movw imm / movt imm
+         instructions as a group that gets scheduled together.  */
+

A comment here about the insn structure would be useful.

Done. It's similar to the aarch64 adrp+add case.

It does make it easier to read, thanks.

2014-12-04  Kyrylo Tkachov  kyrylo.tkac...@arm.com

    * config/arm/arm-protos.h (tune_params): Add fuseable_ops field.
    * config/arm/arm.c (arm_macro_fusion_p): New function.
    (arm_macro_fusion_pair_p): Likewise.
    (TARGET_SCHED_MACRO_FUSION_P): Define.
    (TARGET_SCHED_MACRO_FUSION_PAIR_P): Likewise.
    (ARM_FUSE_NOTHING): Likewise.
    (ARM_FUSE_MOVW_MOVT): Likewise.
    (arm_slowmul_tune, arm_fastmul_tune, arm_strongarm_tune, arm_xscale_tune, arm_9e_tune, arm_v6t2_tune, arm_cortex_tune, arm_cortex_a8_tune, arm_cortex_a7_tune, arm_cortex_a15_tune, arm_cortex_a53_tune, arm_cortex_a57_tune, arm_cortex_a9_tune, arm_cortex_a12_tune, arm_v7m_tune, arm_v6m_tune, arm_fa726te_tune, arm_cortex_a5_tune): Specify fuseable_ops value.

+      set_dest = SET_DEST (curr_set);
+      if (GET_CODE (set_dest) == ZERO_EXTRACT)
+        {
+          if (CONST_INT_P (SET_SRC (curr_set))
+              && CONST_INT_P (SET_SRC (prev_set))
+              && REG_P (XEXP (set_dest, 0))
+              && REG_P (SET_DEST (prev_set))
+              && REGNO (XEXP (set_dest, 0)) == REGNO (SET_DEST (prev_set)))
+            return true;
+        }
+      else if (GET_CODE (SET_SRC (curr_set)) == LO_SUM
+               && REG_P (SET_DEST (curr_set))
+               && REG_P (SET_DEST (prev_set))
+               && GET_CODE (SET_SRC (prev_set)) == HIGH
+               && REGNO (SET_DEST (curr_set)) == REGNO (SET_DEST (prev_set)))
+        {
+          return true;
+        }

Can we add a fast path exit to be

  if (GET_MODE (set_dest) != SImode)
    return false;

Done, but if/when we extend the function to handle more fusion cases it will need to be refactored, since we will want to just bail out of this MOVW+MOVT case rather than the whole function.

Sure - I did think whether we wanted to use reg_overlap_mentioned_p as that may simplify the logic a bit but that's overkill here as we still want to restrict it to the cases above.

Otherwise OK.

Here's the updated patch. I've tested on arm-none-eabi and made sure that the fusion still happens on the benchmarks I looked at.

Ok?

Ok - thanks, sorry about the slow response - been on vacation and still catching up.

regards
Ramana

Thanks,
Kyrill

Ramana

+    }
+  return false;

Thanks,
Kyrill

2014-11-11  Kyrylo Tkachov  kyrylo.tkac...@arm.com

    * config/arm/arm-protos.h (tune_params): Add fuseable_ops field.
    * config/arm/arm.c (arm_macro_fusion_p): New function.
    (arm_macro_fusion_pair_p): Likewise.
    (TARGET_SCHED_MACRO_FUSION_P): Define.
    (TARGET_SCHED_MACRO_FUSION_PAIR_P): Likewise.
    (ARM_FUSE_NOTHING): Likewise.
    (ARM_FUSE_MOVW_MOVT): Likewise.
    (arm_slowmul_tune, arm_fastmul_tune, arm_strongarm_tune, arm_xscale_tune, arm_9e_tune, arm_v6t2_tune, arm_cortex_tune, arm_cortex_a8_tune, arm_cortex_a7_tune, arm_cortex_a15_tune, arm_cortex_a53_tune, arm_cortex_a57_tune, arm_cortex_a9_tune, arm_cortex_a12_tune, arm_v7m_tune, arm_v6m_tune, arm_fa726te_tune, arm_cortex_a5_tune): Specify fuseable_ops value.
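The pairing rule discussed in this thread can be sketched outside the compiler. The following is a deliberately simplified model, not the arm.c implementation: ToyInsn, ToyOp and toy_fusion_pair_p are hypothetical names, and the RTL shape checks (ZERO_EXTRACT, HIGH/LO_SUM, SImode) are reduced to an opcode tag plus a destination register number.

```cpp
// Hypothetical opcode tags; the real hook inspects RTL patterns instead.
enum class ToyOp { Movw, Movt, Other };

// Hypothetical instruction: an opcode and the register it writes.
struct ToyInsn {
  ToyOp op;
  int dest_reg;
};

// Returns true when PREV followed by CURR is a movw/movt pair that builds a
// constant in the same register, which is the case the thread wants to keep
// adjacent during scheduling.
bool toy_fusion_pair_p(const ToyInsn& prev, const ToyInsn& curr) {
  if (prev.op != ToyOp::Movw || curr.op != ToyOp::Movt)
    return false;                      // wrong opcodes or wrong order
  return prev.dest_reg == curr.dest_reg;  // both halves must target one register
}
```

A movw into r0 followed by a movt into r0 is reported as a pair, while a pair writing different registers, or the two instructions in the opposite order, is not.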
Re: [PATCH][ARM] Fix PR target/64460: Set 'shift' attr properly on some patterns
Now with patch attached Kyrill On 12/01/15 14:27, Kyrill Tkachov wrote: Hi all, In this PR we ICE when compiling with -mtune=xscale. The ICE is a segfault in xscale_sched_adjust_cost. The root cause is that xscale_sched_adjust_cost uses the value of the 'shift' insn attribute to index the recog operands. In GCC 5 the form and number of operands in those patterns were updated but the shift value was not: Author: rearnsha rearnsha@138bc75d-0d04-0410-961f-82ee72b054a4 Date: Thu May 29 09:39:07 2014 + * arm/iterators.md (shiftable_ops): New code iterator. (t2_binop0, arith_shift_insn): New code attributes. * arm/predicates.md (shift_nomul_operator): New predicate. * arm/arm.md (insn_enabled): Delete. (enabled): Remove insn_enabled test. (*arith_shiftsi): Delete. Replace with ... (*arith_shift_insn_multsi): ... new pattern. (*arith_shift_insn_shiftsi): ... new pattern. * config/arm/arm.c (arm_print_operand): Handle operand format 'b'. This led to an out-of-bounds array access. Only xscale_sched_adjust_cost uses the shift attribute, so the segfault only happens for xscale tuning. In the future we might want to use a more general pattern-matching approach to find the shifted operand in an rtx... In any case, this patch fixes the value of 'shift' for the offending pattern and also updates 'shift' for the *arith_shift_insn_shiftsi pattern to point to the correct operand that is being shifted. Tested arm-none-eabi and bootstrapped with -mtune=xscale in BOOT_CFLAGS. Ok for trunk? Thanks, Kyrill 2014-01-12 Kyrylo Tkachov kyrylo.tkac...@arm.com PR target/64460 * config/arm/arm.md (*arith_shift_insn_multsi): Set 'shift' attr to 2. (*arith_shift_insn_shiftsi): Set 'shift' attr to 3. 2014-01-12 Kyrylo Tkachov kyrylo.tkac...@arm.com PR target/64460 * gcc.target/arm/pr64460_1.c: New test. 
commit c89087db2f16eda521d6c938d342570c1d69a7a2 Author: Kyrylo Tkachov kyrylo.tkac...@arm.com Date: Fri Jan 9 16:41:44 2015 + [ARM] PR target/64460 ICE with -mtune=xscale in shift attr diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md index c61057f..bbefb93 100644 --- a/gcc/config/arm/arm.md +++ b/gcc/config/arm/arm.md @@ -8255,36 +8255,36 @@ (define_insn trap (define_insn *arith_shift_insn_multsi [(set (match_operand:SI 0 s_register_operand =r,r) (shiftable_ops:SI (mult:SI (match_operand:SI 2 s_register_operand r,r) (match_operand:SI 3 power_of_two_operand )) (match_operand:SI 1 s_register_operand rk,t2_binop0)))] TARGET_32BIT arith_shift_insn%?\\t%0, %1, %2, lsl %b3 [(set_attr predicable yes) (set_attr predicable_short_it no) - (set_attr shift 4) + (set_attr shift 2) (set_attr arch a,t2) (set_attr type alu_shift_imm)]) (define_insn *arith_shift_insn_shiftsi [(set (match_operand:SI 0 s_register_operand =r,r,r) (shiftable_ops:SI (match_operator:SI 2 shift_nomul_operator [(match_operand:SI 3 s_register_operand r,r,r) (match_operand:SI 4 shift_amount_operand M,M,r)]) (match_operand:SI 1 s_register_operand rk,t2_binop0,rk)))] TARGET_32BIT GET_CODE (operands[2]) != MULT arith_shift_insn%?\\t%0, %1, %3%S2 [(set_attr predicable yes) (set_attr predicable_short_it no) - (set_attr shift 4) + (set_attr shift 3) (set_attr arch a,t2,a) (set_attr type alu_shift_imm,alu_shift_imm,alu_shift_reg)]) (define_split [(set (match_operand:SI 0 s_register_operand ) (match_operator:SI 1 shiftable_operator [(match_operator:SI 2 shiftable_operator [(match_operator:SI 3 shift_operator [(match_operand:SI 4 s_register_operand ) (match_operand:SI 5 reg_or_int_operand )]) diff --git a/gcc/testsuite/gcc.target/arm/pr64460_1.c b/gcc/testsuite/gcc.target/arm/pr64460_1.c new file mode 100644 index 000..ee6ad4a --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/pr64460_1.c @@ -0,0 +1,69 @@ +/* { dg-do compile } */ +/* { dg-options -O2 -mtune=xscale } */ + +typedef unsigned int size_t; +typedef 
short unsigned int __uint16_t; +typedef long unsigned int __uint32_t; +typedef unsigned int __uintptr_t; +typedef __uint16_t uint16_t ; +typedef __uint32_t uint32_t ; +typedef __uintptr_t uintptr_t; +typedef uint32_t Objects_Id; +typedef uint16_t Objects_Maximum; +typedef struct { } Objects_Control; + +static __inline__ void *_Addresses_Align_up (void *address, size_t alignment) +{ + uintptr_t mask = alignment - (uintptr_t)1; + return (void*)(((uintptr_t)address + mask) ~mask); +} + +typedef struct { + Objects_Id minimum_id; + Objects_Maximum maximum; + _Bool + auto_extend; + Objects_Maximum allocation_size; + void **object_blocks; +} Objects_Information; + +extern uint32_t _Objects_Get_index (Objects_Id); +extern void** _Workspace_Allocate (size_t); + +void _Objects_Extend_information (Objects_Information *information) +{ + uint32_t block_count; + uint32_t
Re: [PATCH 7/10] OpenACC 2.0 support for libgomp - OpenACC runtime, NVidia PTX/CUDA plugin
Hi! On Tue, 23 Sep 2014 19:19:31 +0100, Julian Brown jul...@codesourcery.com wrote: This patch contains the bulk of the OpenACC 2.0 runtime support, [...] --- /dev/null +++ b/libgomp/libgomp-plugin.h @@ -0,0 +1,57 @@ +/* An interface to various libgomp-internal functions for use by plugins. */ ..., and in parallel, a libgomp_target.h file came into existence. In gomp-4_0-branch's r219468, I now merged the two into the one with -- in my opinion -- the more descriptive name: commit 5024605e60ed2a42fefaa6882ac0ca7493643460 Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4 Date: Mon Jan 12 14:47:46 2015 + libgomp: Merge libgomp_target.h into libgomp-plugin.h. libgomp/ * env.c: Don't include libgomp_target.h. * libgomp-plugin.c: Likewise. * oacc-async.c: Likewise. * oacc-cuda.c: Likewise. * oacc-init.c: Likewise. * oacc-mem.c: Likewise. * oacc-parallel.c: Likewise. * oacc-plugin.c: Likewise. * plugin/plugin-host.c: Likewise. * plugin/plugin-nvptx.c: Likewise. * target.c: Likewise. * libgomp_target.h: Remove file after merging its content into... * libgomp-plugin.h: ... this file. Adjust all users. 
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@219468 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/config/i386/intelmic-mkoffload.c | 2 +- libgomp/ChangeLog.gomp | 14 +++ libgomp/env.c| 1 - libgomp/libgomp-plugin.c | 1 - libgomp/libgomp-plugin.h | 37 + libgomp/libgomp.h| 2 +- libgomp/libgomp_target.h | 53 libgomp/oacc-async.c | 1 - libgomp/oacc-cuda.c | 1 - libgomp/oacc-init.c | 1 - libgomp/oacc-mem.c | 1 - libgomp/oacc-parallel.c | 1 - libgomp/oacc-plugin.c| 1 - libgomp/plugin/plugin-host.c | 1 - libgomp/plugin/plugin-nvptx.c| 1 - libgomp/target.c | 1 - liboffloadmic/plugin/libgomp-plugin-intelmic.cpp | 2 +- 17 files changed, 54 insertions(+), 67 deletions(-) diff --git gcc/config/i386/intelmic-mkoffload.c gcc/config/i386/intelmic-mkoffload.c index 050f2e6..edc3f92 100644 --- gcc/config/i386/intelmic-mkoffload.c +++ gcc/config/i386/intelmic-mkoffload.c @@ -22,13 +22,13 @@ #include config.h #include libgen.h +#include libgomp-plugin.h #include system.h #include coretypes.h #include obstack.h #include intl.h #include diagnostic.h #include collect-utils.h -#include libgomp_target.h const char tool_name[] = intelmic mkoffload; diff --git libgomp/ChangeLog.gomp libgomp/ChangeLog.gomp index d955a85..76f21e6 100644 --- libgomp/ChangeLog.gomp +++ libgomp/ChangeLog.gomp @@ -1,5 +1,19 @@ 2015-01-12 Thomas Schwinge tho...@codesourcery.com + * env.c: Don't include libgomp_target.h. + * libgomp-plugin.c: Likewise. + * oacc-async.c: Likewise. + * oacc-cuda.c: Likewise. + * oacc-init.c: Likewise. + * oacc-mem.c: Likewise. + * oacc-parallel.c: Likewise. + * oacc-plugin.c: Likewise. + * plugin/plugin-host.c: Likewise. + * plugin/plugin-nvptx.c: Likewise. + * target.c: Likewise. + * libgomp_target.h: Remove file after merging its content into... + * libgomp-plugin.h: ... this file. Adjust all users. + * plugin/plugin-nvptx.c (struct ptx_device): Turn stream_lock member into a pthread_mutex_t. Adjust all users. (ptx_event_lock): Likewise. 
diff --git libgomp/env.c libgomp/env.c index 81460dc..130c52c 100644 --- libgomp/env.c +++ libgomp/env.c @@ -28,7 +28,6 @@ #include libgomp.h #include libgomp_f.h -#include libgomp_target.h #include oacc-int.h #include ctype.h #include stdlib.h diff --git libgomp/libgomp-plugin.c libgomp/libgomp-plugin.c index 77e250e..1dd33f5 100644 --- libgomp/libgomp-plugin.c +++ libgomp/libgomp-plugin.c @@ -30,7 +30,6 @@ #include libgomp.h #include libgomp-plugin.h -#include libgomp_target.h void * GOMP_PLUGIN_malloc (size_t size) diff --git libgomp/libgomp-plugin.h libgomp/libgomp-plugin.h index 2e2be1f..c8383e1 100644 --- libgomp/libgomp-plugin.h +++ libgomp/libgomp-plugin.h @@ -29,6 +29,39 @@ #ifndef LIBGOMP_PLUGIN_H #define LIBGOMP_PLUGIN_H 1 +#include stddef.h +#include stdint.h + +#ifdef __cplusplus +extern C { +#endif + +/* Capabilities of offloading devices. */ +#define GOMP_OFFLOAD_CAP_SHARED_MEM(1 0) +#define GOMP_OFFLOAD_CAP_NATIVE_EXEC (1 1) +#define GOMP_OFFLOAD_CAP_OPENMP_400(1 2) +#define GOMP_OFFLOAD_CAP_OPENACC_200 (1 3) + +/* Type of offload target device. Keep in
Re: [PATCH]: Fix for PR ipa/64550
On Mon, 12 Jan 2015, Martin Liška wrote: Hello. Following patch is fix for PR ipa/64550 which can bootstrap on x86_64-linux-pc. Explanation for the patch is described here: [1]. I hope this is correct fix for such cases? Ah, using TREE_THIS_VOLATILE on the result of ao_ref_base is wrong - you need to use r1.volatile_p != r2.volatile_p. That's actually equivalent to what your patch does, thus that is ok. Thanks, Richard. Thanks, Martin [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64550 -- Richard Biener rguent...@suse.de SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild, Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)
Re: [PATCH, testsuite] fix ggcplug.c test-case
On 12 January 2015 at 15:49, Richard Biener rguent...@suse.de wrote: On Mon, 12 Jan 2015, Prathamesh Kulkarni wrote: On 12 January 2015 at 14:36, Richard Biener rguent...@suse.de wrote: On Mon, 12 Jan 2015, Prathamesh Kulkarni wrote: On 12 January 2015 at 14:19, Richard Biener rguent...@suse.de wrote: On Sun, 11 Jan 2015, Prathamesh Kulkarni wrote: Hi, The test-case plugin/ggcplug.c was failing due to flattening of tree.h and tree-core.h. Test-case was incorrect because it included gcc-plugin.h after tree.h whereas gcc-plugin.h should be the first header to be included by plugins. No, it should be definitely included _after_ config.h, system.h and coretypes.h. gcc-plugin.h already includes these files. Shall I remove config.h, system.h and coretypes.h from ggcplug.c instead ? No, keep the patch simple for now - we are inconsistent in all the testsuite plugins it seems and wasn't the idea that plugins _only_ need to include gcc-plugin.h now? Thus I'd rather cleanup all plugin testcases at once, with a separate patch. I thought gcc-plugin.h would contain include dependencies of all headers (to make plugins transparent to include restructuring) and if a plugin needs a particular header, it should explicitly include it. Or am I missing something ? No idea - I thought the idea was that plugins only ever need to include gcc-plugin.h which will include everything (aka the world) so plugins are immune to things moving between headers (another thing that happened a lot for GCC 5). Thanks, Richard. Ok with moving it after coretypes.h. Shall I commit the patch after this change since this is the only plugin test case that's failing ? You should commit a patch moving the gcc-plugin.h include in ggcplug.c to after the include of coretypes.h. Moved gcc-plugin.h include after coretypes.h and committed as r219458. Thanks, Prathamesh Thanks, Richard.
Re: [PATCH] Flatten tree.h and tree-core.h (Version 3)
On 12 January 2015 at 16:24, Andreas Schwab sch...@suse.de wrote: I'm getting this testsuite regression: FAIL: gcc.dg/plugin/ggcplug.c compilation Fixed with r219458. Thanks, Prathamesh In file included from /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree.h:23:0, from /usr/local/gcc/gcc-20150112/gcc/testsuite/gcc.dg/plugin/ggcplug.c:8: /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:705:18: error: 'hash_set' has not been declared void *, hash_settree *); ^ /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:705:26: error: expected ',' or '...' before '' token void *, hash_settree *); ^ /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1139:24: error: field 'id' has incomplete type 'ht_identifier' struct ht_identifier id; ^ /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1139:10: note: forward declaration of 'struct ht_identifier' struct ht_identifier id; ^ /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1164:3: error: 'vec' does not name a type vecconstructor_elt, va_gc *elts; ^ /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1206:3: error: 'location_t' does not name a type location_t locus; ^ /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1253:3: error: 'location_t' does not name a type location_t locus; ^ /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1258:3: error: 'location_t' does not name a type location_t locus; ^ /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1285:3: error: 'location_t' does not name a type location_t locus; ^ /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1286:3: error: 'location_t' does not name a type location_t end_locus; ^ /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1289:3: error: 'vec' does not name a type vectree, va_gc *nonlocalized_vars; ^ /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1324:3: error: 'alias_set_type' does not name 
a type alias_set_type alias_set; ^ /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1360:3: error: 'vec' does not name a type vectree, va_gc *base_accesses; ^ /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1366:3: error: 'vec' does not name a type vectree, va_gc base_binfos; ^ /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1371:3: error: 'location_t' does not name a type location_t locus; ^ /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1758:3: error: 'vec' does not name a type vectree, va_gc *pending_statics; ^ /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1799:3: error: 'vec' does not name a type vectree, va_gc *to; ^ /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1845:16: error: 'vec' does not name a type extern GTY(()) vecalias_pair, va_gc *alias_pairs; ^ /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1858:17: error: 'vec' does not name a type extern GTY (()) vectree, va_gc *all_translation_units; ^ In file included from /usr/local/gcc/gcc-20150112/gcc/testsuite/gcc.dg/plugin/ggcplug.c:8:0: /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree.h:1073:48: error: 'location_t' has not been declared extern void protected_set_expr_location (tree, location_t); ^ /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree.h:2642:8: error: 'vec' does not name a type extern vectree, va_gc **decl_debug_args_lookup (tree); ^ /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree.h:2643:8: error: 'vec' does not name a type extern vectree, va_gc **decl_debug_args_insert (tree); ^ /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree.h:3560:38: error: 'vec' has not been declared extern tree build_nt_call_vec (tree, vectree, va_gc *); ^ /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree.h:3560:41: error: expected ',' or '...' 
before '' token extern tree build_nt_call_vec (tree, vectree, va_gc *); ^ /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree.h:3580:18: error: 'build1_stat_loc' declared as an 'inline' variable build1_stat_loc (location_t loc, enum tree_code code, tree type, ^ /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree.h:3580:18: error: 'location_t' was not declared in this scope /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree.h:3580:34: error: expected primary-expression before 'enum' build1_stat_loc (location_t loc, enum tree_code code, tree type
Re: [PATCH, testsuite] fix ggcplug.c test-case
On Mon, 12 Jan 2015, Prathamesh Kulkarni wrote: On 12 January 2015 at 14:36, Richard Biener rguent...@suse.de wrote: On Mon, 12 Jan 2015, Prathamesh Kulkarni wrote: On 12 January 2015 at 14:19, Richard Biener rguent...@suse.de wrote: On Sun, 11 Jan 2015, Prathamesh Kulkarni wrote: Hi, The test-case plugin/ggcplug.c was failing due to flattening of tree.h and tree-core.h. Test-case was incorrect because it included gcc-plugin.h after tree.h whereas gcc-plugin.h should be the first header to be included by plugins. No, it should be definitely included _after_ config.h, system.h and coretypes.h. gcc-plugin.h already includes these files. Shall I remove config.h, system.h and coretypes.h from ggcplug.c instead ? No, keep the patch simple for now - we are inconsistent in all the testsuite plugins it seems and wasn't the idea that plugins _only_ need to include gcc-plugin.h now? Thus I'd rather cleanup all plugin testcases at once, with a separate patch. I thought gcc-plugin.h would contain include dependencies of all headers (to make plugins transparent to include restructuring) and if a plugin needs a particular header, it should explicitly include it. Or am I missing something ? No idea - I thought the idea was that plugins only ever need to include gcc-plugin.h which will include everything (aka the world) so plugins are immune to things moving between headers (another thing that happened a lot for GCC 5). Thanks, Richard. Ok with moving it after coretypes.h. Shall I commit the patch after this change since this is the only plugin test case that's failing ? You should commit a patch moving the gcc-plugin.h include in ggcplug.c to after the include of coretypes.h. Thanks, Richard.
Re: [PATCH, i386] Remove EBX usage from asm code
On Mon, Jan 12, 2015 at 01:36:05PM +0300, Evgeny Stupachenko wrote: frame_dummy does not use EBX in allocation now as there are enough other registers (that we don't need to save/restore). So if we do not modify frame_dummy EBX should stay unchanged. frame_dummy does not initialize EBX register at the beginning it expects that EBX is pic from glibc frame_dummy is called from glibc and while we have glibc compiled by 4.9 or older compiler EBX should come to frame_dummy as pic register I also don't understand how is this related to glibc in any way. From my understanding, the macro relied on %ebx being set to _GLOBAL_OFFSET_TABLE_ because the frame_dummy function does access GOT, so before the i?86 PIC reg changes it was computing %ebx. Jakub
Re: [PATCH][ARM][cleanup] Use R0_REGNUM and R1_REGNUM instead of 0 and 1 where appropriate
On Thu, Dec 11, 2014 at 9:34 AM, Kyrill Tkachov kyrylo.tkac...@arm.com wrote: Hi all, While looking in this area on other business I noticed we could be using the names R0_REGNUM and R1_REGNUM when creating those REG rtxs since it's a bit more descriptive that just 0 and 1. Tested arm-none-eabi. Ok for trunk? Sorry been on holiday and now catching up on emails. This is OK, thanks. Ramana Thanks, Kyrill 2014-12-11 Kyrylo Tkachov kyrylo.tkac...@arm.com * config/arm/arm.c (arm_load_tp): Use R0_REGNUM instead of constant 0 in gen_rtx_REG. (arm_tls_descseq_addr): Likewise. (arm_gen_movmemqi): Likewise. (arm_expand_epilogue_apcs_frame): Likewise. (arm_expand_epilogue): Likewise. (arm_expand_prologue): Likewise. Use R1_REGNUM instead of constant 1 in gen_rtx_REG.
Re: [PATCH, i386] Remove EBX usage from asm code
Agree, I've missed the usage of the function __register_frame_info_bases (frame_dummy assembly had only indirect call when I miss -pie in compilation). There is no reference on glibc that way. Sorry for the confusion. So that is potentially buggy right now. On Mon, Jan 12, 2015 at 1:50 PM, Jakub Jelinek ja...@redhat.com wrote: On Mon, Jan 12, 2015 at 01:36:05PM +0300, Evgeny Stupachenko wrote: frame_dummy does not use EBX in allocation now as there are enough other registers (that we don't need to save/restore). So if we do not modify frame_dummy EBX should stay unchanged. frame_dummy does not initialize EBX register at the beginning it expects that EBX is pic from glibc frame_dummy is called from glibc and while we have glibc compiled by 4.9 or older compiler EBX should come to frame_dummy as pic register I also don't understand how is this related to glibc in any way. From my understanding, the macro relied on %ebx being set to _GLOBAL_OFFSET_TABLE_ because the frame_dummy function does access GOT, so before the i?86 PIC reg changes it was computing %ebx. Jakub
[PATCH] Fix PR64357
The following patch fixes PR64357 (or papers over some latent issue). We were not protecting a certain aspect of simple latches properly (a simple latch should belong to its loop).

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

2015-01-12 Richard Biener rguent...@suse.de

	PR middle-end/64357
	* tree-cfg.c (gimple_can_merge_blocks_p): Protect simple latches properly.

	* gcc.dg/torture/pr64357.c: New testcase.

Index: gcc/tree-cfg.c
===
--- gcc/tree-cfg.c (revision 219446)
+++ gcc/tree-cfg.c (working copy)
@@ -1723,11 +1727,13 @@ gimple_can_merge_blocks_p (basic_block a
     }
 
   /* Protect simple loop latches.  We only want to avoid merging
-     the latch with the loop header in this case.  */
+     the latch with the loop header or with a block in another
+     loop in this case.  */
   if (current_loops
       && b->loop_father->latch == b
       && loops_state_satisfies_p (LOOPS_HAVE_SIMPLE_LATCHES)
-      && b->loop_father->header == a)
+      && (b->loop_father->header == a
+	  || b->loop_father != a->loop_father))
     return false;
 
   /* It must be possible to eliminate all phi nodes in B.  If ssa form
Index: gcc/testsuite/gcc.dg/torture/pr64357.c
===
--- gcc/testsuite/gcc.dg/torture/pr64357.c (revision 0)
+++ gcc/testsuite/gcc.dg/torture/pr64357.c (working copy)
@@ -0,0 +1,34 @@
+/* { dg-do compile } */
+
+int a, b, c, d, e, f;
+
+long long
+fn1 (int p)
+{
+  return p ? p : 1;
+}
+
+static int
+fn2 ()
+{
+lbl:
+  for (; f;)
+    return 0;
+  for (;;)
+    {
+      for (b = 0; b; ++b)
+	if (d)
+	  goto lbl;
+      c = e;
+    }
+}
+
+void
+fn3 ()
+{
+  for (; a; a = fn1 (a))
+    {
+      fn2 ();
+      e = 0;
+    }
+}
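For readers who want the condition in isolation, here is a minimal sketch of the extended check. It is not GCC code: `struct loop`, `struct block` and `latch_protected_p` are hypothetical stand-ins for the real loop and basic_block types, and the sketch assumes the LOOPS_HAVE_SIMPLE_LATCHES property already holds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for GCC's loop and basic_block.  */
struct block;

struct loop
{
  struct block *header;
  struct block *latch;
};

struct block
{
  struct loop *loop_father;
};

/* Return true if merging B into A must be refused so that B stays a
   simple latch of its own loop.  Only the patched condition is modelled;
   simple latches are assumed to be in effect.  */
static bool
latch_protected_p (const struct block *a, const struct block *b)
{
  assert (a != NULL && b != NULL && b->loop_father != NULL);

  /* B is not the latch of its loop: nothing to protect.  */
  if (b->loop_father->latch != b)
    return false;

  return (b->loop_father->header == a             /* check before the patch */
	  || b->loop_father != a->loop_father);    /* check added by the patch */
}
```

With this predicate a latch is kept apart from its loop header and from any block that lives in a different loop, while a merge with an ordinary block of the same loop is still allowed.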
Re: Simplify badness metrics in inliner, take 2
On 2015.01.12 at 10:30 +0100, Jan Hubicka wrote:
this is variant of my earlier patch I committed. It solves issues with -fprofile-use and various roundoff errors that triggered sanity checks (partly by disabling them).

The new assert triggers during Firefox LTO build on ppc64:

(final libxul link:)
lto1: internal compiler error: in inline_small_functions, at ipa-inline.c:1664
0x10d0a023 inline_small_functions
	../../gcc/gcc/ipa-inline.c:1664
0x10d0a023 ipa_inline
	../../gcc/gcc/ipa-inline.c:2163
0x10d0a023 execute
	../../gcc/gcc/ipa-inline.c:2536
Please submit a full bug report, with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See http://gcc.gnu.org/bugs.html for instructions.
lto-wrapper: fatal error: ../../../gcc_test/usr/local/bin/c++ returned 1 exit status
compilation terminated.
/home/trippels/bin/ld: fatal error: lto-wrapper failed
collect2: error: ld returned 1 exit status
make[5]: *** [libxul.so] Error 1

-- Markus
Re: [PATCH, testsuite] fix ggcplug.c test-case
On 12 January 2015 at 14:36, Richard Biener rguent...@suse.de wrote:
On Mon, 12 Jan 2015, Prathamesh Kulkarni wrote:
On 12 January 2015 at 14:19, Richard Biener rguent...@suse.de wrote:
On Sun, 11 Jan 2015, Prathamesh Kulkarni wrote:
Hi,
The test-case plugin/ggcplug.c was failing due to flattening of tree.h and tree-core.h. Test-case was incorrect because it included gcc-plugin.h after tree.h whereas gcc-plugin.h should be the first header to be included by plugins.

No, it should be definitely included _after_ config.h, system.h and coretypes.h.

gcc-plugin.h already includes these files. Shall I remove config.h, system.h and coretypes.h from ggcplug.c instead?

No, keep the patch simple for now - we are inconsistent in all the testsuite plugins it seems and wasn't the idea that plugins _only_ need to include gcc-plugin.h now? Thus I'd rather cleanup all plugin testcases at once, with a separate patch.

I thought gcc-plugin.h would contain include dependencies of all headers (to make plugins transparent to include restructuring) and if a plugin needs a particular header, it should explicitly include it. Or am I missing something?

Thanks,
Richard.

Ok with moving it after coretypes.h.

Shall I commit the patch after this change since this is the only plugin test case that's failing?

Thanks,
Prathamesh

Thanks,
Richard.

-- 
Richard Biener rguent...@suse.de
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild, Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)
[wwwdocs, patch] Update Fortran part of gcc-5/changes.html
Hi all, hi Gerald,

this syncs the changes from https://gcc.gnu.org/wiki/GFortran/News#GCC5 for today's added compatibility section and Janne's locale addition.

If there are no objections or comments, I will commit it this evening.

Tobias, who is really behind on reading fortran@gcc emails.

Index: changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v
retrieving revision 1.57
diff -p -u -r1.57 changes.html
--- changes.html 8 Jan 2015 16:50:23 - 1.57
+++ changes.html 12 Jan 2015 10:08:29 -
@@ -333,6 +333,24 @@ void operator delete[] (void *, std::siz
 <h3 id="fortran">Fortran</h3>
 <ul>
+<li>Compatibility notice:<ul>
+  <li>The version of the module files (.mod) has been incremented.</li>
+  <li>For free-form source files,
+    <a href="https://gcc.gnu.org/onlinedocs/gfortran/Error-and-Warning-Options.html"><code>-Werror=line-truncation</code></a>
+    is now enabled by default; note that comments exceeding the line length
+    are not diagnosed. (For fixed-form source code, the same warning is
+    available but turned off by default, such that excess characters are
+    ignored. <code>-ffree-line-length-<em>n</em></code> and
+    <code>-ffixed-line-length-<em>n</em></code> can be used to modify the default
+    line lengths of 132 and 72 columns, respectively.)</li>
+  <li>The <code>-Wtabs</code> option is now more sensible: with
+    <code>-Wtabs</code> the compiler warns if it encounters tabs and with
+    <code>-Wno-tabs</code> this warning is turned off. Before,
+    <code>-Wno-tabs</code> warned and <code>-Wtabs</code> turned the warning
+    off. As before, the warning is also enabled by <code>-Wall</code>,
+    <code>-pedantic</code> and the <code>f95</code>, <code>f2003</code>,
+    <code>f2008</code> and <code>f2008ts</code> options of <code>-std=</code>.</li>
+  </ul></li>
 <li>Incomplete support for colorizing diagnostics emitted by gfortran has
 been added. The option <code><a
 href="https://gcc.gnu.org/onlinedocs/gcc/Language-Independent-Options.html"
@@ -359,6 +377,10 @@ void operator delete[] (void *, std::siz
 <li>The <code>-Wuse-without-only</code> option has been added to warn when a
 <code>USE</code> statement has no <code>ONLY</code> qualifier and, thus,
 implicitly imports all public entities of the used module.</li>
+<li>Formatted READ and WRITE statements now work correctly in locale-aware
+  programs. For more information and potential caveats, see
+  <a href="https://gcc.gnu.org/onlinedocs/gfortran/Thread-safety-of-the-runtime-library.html">Section
+  5.3 Thread-safety of the runtime library in the manual</a>.</li>
 <li><a href="https://gcc.gnu.org/wiki/Fortran2003Status">Fortran 2003</a>:
 <ul>
 <li>The intrinsic IEEE modules (<code>IEEE_FEATURES</code>,
[PATCH]: Fix for PR ipa/64550
Hello.

The following patch is a fix for PR ipa/64550; it bootstraps on x86_64-linux-pc. The explanation for the patch is given here: [1]. I hope this is the correct fix for such cases?

Thanks,
Martin

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64550

From bef79e6e5e0d7d8e555e9241ffcfb88a92552e12 Mon Sep 17 00:00:00 2001
From: mliska mli...@suse.cz
Date: Mon, 12 Jan 2015 10:54:36 +0100
Subject: [PATCH] Fix for PR64550.

gcc/ChangeLog:

2015-01-12 Martin Liska mli...@suse.cz

	* ipa-icf-gimple.c (func_checker::compare_memory_operand): Compare volatility for correct operands.

gcc/testsuite/ChangeLog:

2015-01-12 Martin Liska mli...@suse.cz

	* gcc.dg/ipa/PR64550.c: New test.
---
 gcc/ipa-icf-gimple.c | 2 +-
 gcc/testsuite/gcc.dg/ipa/PR64550.c | 76 ++
 2 files changed, 77 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/ipa/PR64550.c

diff --git a/gcc/ipa-icf-gimple.c b/gcc/ipa-icf-gimple.c
index 8c3a27b..ed3cdf5 100644
--- a/gcc/ipa-icf-gimple.c
+++ b/gcc/ipa-icf-gimple.c
@@ -267,7 +267,7 @@ func_checker::compare_memory_operand (tree t1, tree t2)
 /* Compare alias sets for memory operands. 
*/ if (source_is_memop target_is_memop) { - if (TREE_THIS_VOLATILE (b1) != TREE_THIS_VOLATILE (b2)) + if (TREE_THIS_VOLATILE (t1) != TREE_THIS_VOLATILE (t2)) return return_false_with_msg (different operand volatility); if (ao_ref_alias_set (r1) != ao_ref_alias_set (r2) diff --git a/gcc/testsuite/gcc.dg/ipa/PR64550.c b/gcc/testsuite/gcc.dg/ipa/PR64550.c new file mode 100644 index 000..3b439c9 --- /dev/null +++ b/gcc/testsuite/gcc.dg/ipa/PR64550.c @@ -0,0 +1,76 @@ +/* { dg-do compile } */ +/* { dg-options -O2 -fdump-ipa-icf-details } */ + +struct __hlist_head +{ + struct __hlist_node *first; +}; + +struct __hlist_node +{ + struct __hlist_node *next, **pprev; +}; + +struct __net +{ + int ifindex; + struct __hlist_head * dev_index_head; +}; + +struct __net_device +{ + int ifindex; + struct __net *nd_net; + struct __hlist_node index_hlist; +}; + +__attribute__ ((noinline, noclone)) +static struct __hlist_head * __dev_index_hash(struct __net *net, +int ifindex) +{ + return net-dev_index_head[ifindex 1]; +} + +__attribute__ ((noinline, noclone)) +struct __net_device * __dev_get_by_index(struct __net *net, int ifindex) +{ + struct __net_device * dev; + struct __hlist_head * head = __dev_index_hash(net, ifindex); + + for (dev = ( { typeof((head)-first) ptr = ((head)-first); ptr ? ( { const typeof(((typeof(*(dev)) *) 0)-index_hlist) * __mptr = (ptr); (typeof(*(dev)) *) ((char *)__mptr - __builtin_offsetof(typeof(*(dev)), index_hlist));}): ((void *) 0);}); + dev; dev = ( { typeof ((dev)-index_hlist.next) ptr = ((dev)-index_hlist.next); ptr ? 
( { const typeof(((typeof(*(dev)) *) 0)-index_hlist) * __mptr = (ptr); (typeof(*(dev)) *) ((char *)__mptr - __builtin_offsetof(typeof(*(dev)), index_hlist));}): ((void *) 0);})) +if (dev-ifindex == ifindex) + return dev; + + return ((void *)0); +} + +__attribute__ ((noinline, noclone)) +struct __net_device * dev_get_by_index_rcu(struct __net *net, int ifindex) +{ + struct __net_device * dev; + struct __hlist_head * head = __dev_index_hash(net, ifindex); + + for (dev = ( { typeof(( { typeof (* ((*((struct __hlist_node **)((head)-first) * _p1 = (typeof(*((*((struct __hlist_node **)((head)-first) *) (*(volatile typeof(((*((struct __hlist_node **)((head)-first) *)(((*((struct __hlist_node **)((head)-first)); do { } while (0);; do { } while (0); ((typeof(*((*((struct __hlist_node **)((head)-first) *) (_p1));})) ptr = (( { typeof (* ((*((struct __hlist_node **)((head)-first) * _p1 = (typeof(*((*((struct __hlist_node **)((head)-first) *) (*(volatile typeof(((*((struct __hlist_node **)((head)-first) *)(((*((struct __hlist_node **)((head)-first)); do { } while (0);; do { } while (0); ((typeof(*((*((struct __hlist_node **)((head)-first) *) (_p1));})); ptr ? ( { const typeof(((typeof(*(dev)) *) 0)-index_hlist) * __mptr = (ptr); (typeof(*(dev)) *) ((char *)__mptr - __builtin_offsetof(typeof(*(dev)), index_hlist));}):((void *) 0);}); + dev; dev = ( { typeof(( { typeof (* ((*((struct __hlist_node **)(((dev)-index_hlist)-next) * _p1 = (typeof(*((*((struct __hlist_node **)(((dev)-index_hlist)-next) *) (*(volatile typeof(((*((struct __hlist_node **)(((dev)-index_hlist)-next) *)(((*((struct __hlist_node **)(((dev)-index_hlist)-next)); do { } while (0);; do { } while (0); ((typeof(*((*((struct __hlist_node **)(((dev)-index_hlist)-next) *) (_p1));})) ptr = (( { typeof (* ((*((struct __hlist_node **)(((dev)-index_hlist)-next) * _p1 = (typeof(*((*((struct __hlist_node **)(((dev)-index_hlist)-next) *) (*(volatile typeof(((*((struct __hlist_node **)(((dev)-index_hlist)-next) *)(((*((struct
Re: [PATCH] Flatten tree.h and tree-core.h (Version 3)
I'm getting this testsuite regression:

FAIL: gcc.dg/plugin/ggcplug.c compilation

In file included from /usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree.h:23:0,
                 from /usr/local/gcc/gcc-20150112/gcc/testsuite/gcc.dg/plugin/ggcplug.c:8:
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:705:18: error: 'hash_set' has not been declared
 void *, hash_set<tree> *);
 ^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:705:26: error: expected ',' or '...' before '<' token
 void *, hash_set<tree> *);
 ^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1139:24: error: field 'id' has incomplete type 'ht_identifier'
 struct ht_identifier id;
 ^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1139:10: note: forward declaration of 'struct ht_identifier'
 struct ht_identifier id;
 ^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1164:3: error: 'vec' does not name a type
 vec<constructor_elt, va_gc> *elts;
 ^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1206:3: error: 'location_t' does not name a type
 location_t locus;
 ^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1253:3: error: 'location_t' does not name a type
 location_t locus;
 ^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1258:3: error: 'location_t' does not name a type
 location_t locus;
 ^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1285:3: error: 'location_t' does not name a type
 location_t locus;
 ^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1286:3: error: 'location_t' does not name a type
 location_t end_locus;
 ^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1289:3: error: 'vec' does not name a type
 vec<tree, va_gc> *nonlocalized_vars;
 ^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1324:3: error: 'alias_set_type' does not name a type
 alias_set_type alias_set;
 ^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1360:3: error: 'vec' does not name a type
 vec<tree, va_gc> *base_accesses;
 ^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1366:3: error: 'vec' does not name a type
 vec<tree, va_gc> base_binfos;
 ^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1371:3: error: 'location_t' does not name a type
 location_t locus;
 ^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1758:3: error: 'vec' does not name a type
 vec<tree, va_gc> *pending_statics;
 ^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1799:3: error: 'vec' does not name a type
 vec<tree, va_gc> *to;
 ^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1845:16: error: 'vec' does not name a type
 extern GTY(()) vec<alias_pair, va_gc> *alias_pairs;
 ^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree-core.h:1858:17: error: 'vec' does not name a type
 extern GTY (()) vec<tree, va_gc> *all_translation_units;
 ^
In file included from /usr/local/gcc/gcc-20150112/gcc/testsuite/gcc.dg/plugin/ggcplug.c:8:0:
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree.h:1073:48: error: 'location_t' has not been declared
 extern void protected_set_expr_location (tree, location_t);
 ^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree.h:2642:8: error: 'vec' does not name a type
 extern vec<tree, va_gc> **decl_debug_args_lookup (tree);
 ^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree.h:2643:8: error: 'vec' does not name a type
 extern vec<tree, va_gc> **decl_debug_args_insert (tree);
 ^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree.h:3560:38: error: 'vec' has not been declared
 extern tree build_nt_call_vec (tree, vec<tree, va_gc> *);
 ^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree.h:3560:41: error: expected ',' or '...' before '<' token
 extern tree build_nt_call_vec (tree, vec<tree, va_gc> *);
 ^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree.h:3580:18: error: 'build1_stat_loc' declared as an 'inline' variable
 build1_stat_loc (location_t loc, enum tree_code code, tree type,
 ^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree.h:3580:18: error: 'location_t' was not declared in this scope
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree.h:3580:34: error: expected primary-expression before 'enum'
 build1_stat_loc (location_t loc, enum tree_code code, tree type,
 ^
/usr/local/gcc/gcc-20150112/gcc/testsuite/../../gcc/tree.h:3580:60: error: expected primary-expression before 'type'
 build1_stat_loc (location_t loc, enum tree_code code, tree type
Re: [PATCH 0/2] Offloading from dlopened libraries
Hi!

How about this patch? It adds a new symbol to the GOMP_4.0.1 symver, so it would be nice to include it in the GCC 5 release.

On 14 Nov 02:53, Ilya Verbin wrote:
This patch fixes offloading from dlopened libraries, part 1 is for libgomp and part 2 is for intelmic plugin.

How it works: When a library is loaded it calls GOMP_offload_register as usual. At this time some devices may already be initialized, and some may not be. Therefore libgomp goes through all devices and, for the initialized devices, calls GOMP_OFFLOAD_load_image, then receives the corresponding addresses and inserts them into the splay tree. It also fills the offload_images array for lazy initialization. When the library is unloaded it calls GOMP_offload_unregister. This function also needs to go through all devices and call GOMP_OFFLOAD_unload_image for all initialized devices. It also removes the mapped addresses from the corresponding splay trees and the pending images from the array.

Any thoughts on that?

Thomas, Julian,
Will this approach work for OpenACC+PTX? I hope that it is general enough. Yeah, I understand that this change will require some effort on your part to rebase the patches, but it would be good to define a common libgomp-plugin interface as early as possible.

-- Ilya
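The register/unregister flow described in this message can be sketched roughly as follows. This is only an illustration under assumptions, not libgomp code: `struct device`, `register_image`, `init_device` and `unregister_image` are hypothetical names, and a plain counter stands in for the splay tree of mapped addresses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEVICES 4
#define MAX_IMAGES 8

/* Hypothetical device record; loaded_images stands in for the splay
   tree of addresses mapped on the device.  */
struct device
{
  bool initialized;
  int loaded_images;
};

static struct device devices[MAX_DEVICES];
static const void *pending_images[MAX_IMAGES];
static size_t num_pending;

/* Called when a library is loaded: load the image on every device that
   is already initialized and remember it for devices initialized later.  */
static bool
register_image (const void *image)
{
  if (num_pending == MAX_IMAGES)
    return false;

  for (size_t d = 0; d < MAX_DEVICES; d++)
    if (devices[d].initialized)
      devices[d].loaded_images++;	/* stands in for loading the image */

  pending_images[num_pending++] = image;
  return true;
}

/* Lazy initialization: a device picks up all images registered so far.  */
static void
init_device (struct device *dev)
{
  assert (dev != NULL);
  if (dev->initialized)
    return;
  dev->initialized = true;
  dev->loaded_images = (int) num_pending;
}

/* Called when a library is unloaded: unload from initialized devices and
   drop the image from the pending array.  */
static void
unregister_image (const void *image)
{
  size_t i;

  for (i = 0; i < num_pending; i++)
    if (pending_images[i] == image)
      break;
  if (i == num_pending)
    return;			/* image was never registered */

  for (size_t d = 0; d < MAX_DEVICES; d++)
    if (devices[d].initialized)
      devices[d].loaded_images--;	/* stands in for unloading the image */

  pending_images[i] = pending_images[--num_pending];
}
```

The point of keeping the pending array is that a device initialized after a dlopen still sees every image, while a dlclose cleans up both the already loaded copies and the pending entry.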
Re: [PATCH, i386] Remove EBX usage from asm code
frame_dummy does not use EBX in allocation now as there are enough other registers (that we don't need to save/restore). So if we do not modify frame_dummy EBX should stay unchanged.
frame_dummy does not initialize EBX register at the beginning it expects that EBX is pic from glibc
frame_dummy is called from glibc and while we have glibc compiled by 4.9 or older compiler EBX should come to frame_dummy as pic register
Not sure that it is correct right now, but it will obviously be potentially buggy when glibc is recompiled with GCC 5.0.

libgcc (frame_dummy):

static void __attribute__((used))
frame_dummy (void)
{
#ifdef USE_EH_FRAME_REGISTRY
  static struct object object;
#ifdef CRT_GET_RFIB_DATA
  void *tbase, *dbase;
  tbase = 0;
  CRT_GET_RFIB_DATA (dbase);
  if (__register_frame_info_bases)
    __register_frame_info_bases (__EH_FRAME_BEGIN__, &object, tbase, dbase);
#else
  if (__register_frame_info)
    __register_frame_info (__EH_FRAME_BEGIN__, &object);
#endif /* CRT_GET_RFIB_DATA */

On Mon, Jan 5, 2015 at 11:50 PM, Jeff Law l...@redhat.com wrote:
On 12/28/14 09:46, Evgeny Stupachenko wrote:
Hi,
The patch removes EBX usage from asm code used in libgcc/crtstuff.c. It is safe now, but potentially buggy when glibc is rebuilt with GCC 5.0 as EBX is not GOT register any more.
x86 bootstrap, make check passed.
Is it ok?
Evgeny

2014-12-28 Evgeny Stupachenko evstu...@gmail.com

	* gnu-user.h (CRT_GET_RFIB_DATA): Remove EBX register usage.
	* config/i386/sysv4.h (CRT_GET_RFIB_DATA): Ditto.

I don't understand the glibc reference above. Ultimately what matters here, AFAICT, is the value assigned to the parameter to CRT_GET_RFIB_DATA, which should be the base of the data relative relocations. So the comment "It is safe now" seems wrong as well.

ISTM this is a critical fix as it would be possible for the PIC pseudo to be assigned to something other than %ebx when compiling libgcc/crtstuff.c. And if that happens, we'll pass in a junk value to register_frame_info_bases.

Evgeny, can you clarify why you think things are safe now, but would not be safe if glibc were to be built with the current GCC trunk?

Jeff
[PATCH] Fix PR64530
This fixes PR64530 by fixing a mistake (oops) in the iteration over all data-ref pairs in pg_add_dependence_edges.

Bootstrap and regtest pending on x86_64-unknown-linux-gnu.

Richard.

2015-01-12 Richard Biener rguent...@suse.de

	PR tree-optimization/64530
	* tree-loop-distribution.c (pg_add_dependence_edges): Shuffle back dr1.

	* gfortran.dg/pr64530.f90: New testcase.

Index: gcc/tree-loop-distribution.c
===
--- gcc/tree-loop-distribution.c (revision 219446)
+++ gcc/tree-loop-distribution.c (working copy)
@@ -1362,6 +1375,7 @@ pg_add_dependence_edges (struct graph *r
   for (int ii = 0; drs1.iterate (ii, &dr1); ++ii)
     for (int jj = 0; drs2.iterate (jj, &dr2); ++jj)
       {
+	data_reference_p saved_dr1 = dr1;
 	int this_dir = 1;
 	ddr_p ddr;
 	/* Re-shuffle data-refs to be in dominator order.  */
@@ -1407,6 +1421,8 @@ pg_add_dependence_edges (struct graph *r
 	  dir = this_dir;
 	else if (dir != this_dir)
 	  return 2;
+	/* Shuffle back dr1.  */
+	dr1 = saved_dr1;
       }
   return dir;
 }
Index: gcc/testsuite/gfortran.dg/pr64530.f90
===
--- gcc/testsuite/gfortran.dg/pr64530.f90 (revision 0)
+++ gcc/testsuite/gfortran.dg/pr64530.f90 (working copy)
@@ -0,0 +1,38 @@
+! { dg-do run }
+
+program bug
+  ! Bug triggered with at least three elements
+  integer, parameter :: asize = 3
+
+  double precision,save :: ave(asize)
+  double precision,save :: old(asize)
+  double precision,save :: tmp(asize)
+
+  ave(:) = 10.d0
+  old(:) = 3.d0
+  tmp(:) = 0.d0
+
+  call buggy(2.d0,asize,ave,old,tmp)
+  if (any (tmp(:) .ne. 3.5)) call abort
+end
+
+subroutine buggy(scale_factor, asize, ave, old, tmp)
+
+  implicit none
+  ! Args
+  double precision scale_factor
+  integer asize
+  double precision ave(asize)
+  double precision old(asize)
+  double precision tmp(asize)
+
+  ! Local
+  integer i
+
+  do i = 1, asize
+    tmp(i) = ave(i) - old(i)
+    old(i) = ave(i)
+    tmp(i) = tmp(i) / scale_factor
+  end do
+
+end subroutine buggy
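The kind of mistake fixed here is easy to reproduce outside GCC. The sketch below is an illustration only, with plain ints instead of data references and a hypothetical function name `sum_of_pair_minima`: the outer value is set once per outer iteration (like dr1), the inner loop may swap it with the inner value to order the pair, and it has to be restored before the next inner iteration.

```c
#include <assert.h>
#include <stddef.h>

/* Visit every (a[i], b[j]) pair, order each pair so the smaller value
   comes first, and return the sum of the first elements.  */
static int
sum_of_pair_minima (const int *a, size_t na, const int *b, size_t nb)
{
  assert (a != NULL && b != NULL);
  int sum = 0;

  for (size_t i = 0; i < na; i++)
    {
      int x = a[i];		/* set once per outer iteration, like dr1 */

      for (size_t j = 0; j < nb; j++)
	{
	  int y = b[j];		/* refreshed on every inner iteration, like dr2 */
	  int saved_x = x;

	  /* Re-shuffle the pair into order; this may overwrite x.  */
	  if (y < x)
	    {
	      int tmp = x;
	      x = y;
	      y = tmp;
	    }
	  sum += x;

	  /* Shuffle back x.  Without this, the remaining inner iterations
	     would pair b[j] with a value that came from b, not from a.  */
	  x = saved_x;
	}
    }
  return sum;
}
```

Dropping the final assignment makes the result depend on the order of the inner elements, which is the same class of error as the one behind the wrong dependence direction in the PR.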
[gomp4] Replace enum omp_clause_map_kind with enum gomp_map_kind (was: Including a file from include/ in gcc/*.h)
Hi! On Mon, 22 Dec 2014 16:13:01 +0100, I wrote: I'm sending this again with some more people copied -- because I see you're working on tree.h/tree-core.h flattening, or know you're familiar with GCC plugins. ;-) Here is a question concerning both of that, where I'd appreciate your input. (That said, I don't find a GCC plugins person listed in the MAINTAINERS file, would that be worth adding?) Full quote follows: On Fri, 19 Dec 2014 18:54:04 +0100, I wrote: On Thu, 18 Dec 2014 19:33:07 +0100, Jakub Jelinek ja...@redhat.com wrote: On Thu, Dec 18, 2014 at 07:25:03PM +0100, Thomas Schwinge wrote: On Wed, 17 Dec 2014 23:26:53 +0100, I wrote: Committed to gomp-4_0-branch in r218840: commit febcd8dfdb10fa80edff0880973d1915ca2fef74 Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4 Date: Wed Dec 17 22:26:24 2014 + Use include/gomp-constants.h more actively. diff --git gcc/tree-core.h gcc/tree-core.h index 743bc0d..fc61b88 100644 --- gcc/tree-core.h +++ gcc/tree-core.h @@ -32,6 +32,7 @@ along with GCC; see the file COPYING3. If not see #include alias.h #include flags.h #include symtab.h +#include gomp-constants.h /* This file contains all the data structures that define the 'tree' type. There are no accessor macros nor functions in this file. Only the Is it actually OK to #include gomp-constants.h (living in [GCC]/include/) from gcc/tree-core.h? Isn't the tree-core.h file getting installed, for later use by plugins -- which then need to be able to find gomp-constants, too, but that is not being installed? Generally, it must be possible to include include/ headers from gcc/ headers, even when they are used by plugins. Otherwise system.h including libiberty.h and safe-ctype.h etc. wouldn't work. Perhaps you need to add gomp-constants.h to some Makefile variable or something, look at how is safe-ctype.h etc. handled. Aha, that's how it is done, I guess, in gcc/Makefile.in: [...] 
SYSTEM_H = system.h hwint.h $(srcdir)/../include/libiberty.h \ $(srcdir)/../include/safe-ctype.h $(srcdir)/../include/filenames.h [...] PLUGIN_HEADERS = $(TREE_H) $(CONFIG_H) $(SYSTEM_H) coretypes.h [...] [...] # Install the headers needed to build a plugin. install-plugin: installdirs lang.install-plugin s-header-vars install-gengtype # We keep the directory structure for files in config or c-family and .def # files. All other files are flattened to a single directory. $(mkinstalldirs) $(DESTDIR)$(plugin_includedir) headers=`echo $(PLUGIN_HEADERS) | tr ' ' '\012' | sort -u`; \ [...] [...] That said, including gomp-constants.h from tree-core.h is I think very much against all the Andrew's efforts to flatten headers (which is something I'm not very happy with generally, but in this case, I think only the very few files that really need the constants should include it). Like this (not yet applied)? [Jakub: »I think it is fine.«] Talking about external code (GCC plugins), do we have to take any measures about the removed enum omp_clause_map_kind? (Would a mere »#define omp_clause_map_kind gomp_map_kind« work? That'd also mean that we do have to add include/gomp-constants.h to PLUGIN_HEADERS, and get it included automatically, I think?) commit b1255597c6b069719960e53e385399c479c4be8b Author: Thomas Schwinge tho...@codesourcery.com Date: Fri Dec 19 18:32:25 2014 +0100 Replace enum omp_clause_map_kind with enum gomp_map_kind. gcc/ * tree-core.h: Instead of defining enum omp_clause_map_kind, use include/gomp-constants.h's enum gomp_map_kind. Update all users. include/ * gomp-constants.h: Populate enum gomp_map_kind. 
---
 gcc/c/c-parser.c | 38 ++---
 gcc/c/c-typeck.c | 9 +++
 gcc/cp/parser.c | 38 ++---
 gcc/cp/semantics.c | 9 +++
 gcc/fortran/trans-openmp.c | 47 ++--
 gcc/gimplify.c | 18 +++---
 gcc/omp-low.c | 60 ++
 gcc/tree-core.h | 43 +++--
 gcc/tree-nested.c | 8 +++
 gcc/tree-pretty-print.c | 31
 gcc/tree-streamer-in.c | 2 +-
 gcc/tree-streamer-out.c | 2 +-
 gcc/tree.h | 4 ++--
 include/gomp-constants.h | 50 +++---
 14 files changed, 173 insertions(+), 186 deletions(-)

Here is (this is on top of gomp-4_0-branch, by the way) the patch: reordered, and snipped to relevant
[PATCH,MIPS] Add support for the R6 LSA and DLSA instructions
This patch adds support for the R6 [D]LSA instructions. The support has been structured to allow MSA (when implemented) to turn on the same instructions as they are also added by the MSA ASE.

I have continued to use the idea of 'ghost' options in the testsuite to indicate what features are required rather than arch revisions.

Thanks,
Matthew

gcc/

	* config/mips/mips.c (mips_rtx_costs): Set costs for LSA/DLSA.
	(mips_print_operand): Support 'y' to print exact log2 in decimal of a const_int.
	* config/mips/mips.h (ISA_HAS_LSA): New define.
	(ISA_HAS_DLSA): Likewise.
	* config/mips/mips.md (GPR:dlsa): New define_insn.
	* config/mips/predicates.md (const_immlsa_operand): New predicate.

gcc/testsuite/

	* gcc.target/mips/lsa.c: New file.
	* gcc.target/mips/mips64-lsa.c: Likewise.
	* gcc.target/mips/mulsize-2.c: Require !HAS_LSA.
	* gcc.target/mips/mulsize-4.c: Likewise.
	* gcc.target/mips/mulsize-5.c: New file.
	* gcc.target/mips/mulsize-6.c: Likewise.
	* gcc.target/mips/mips.exp (mips_option_groups): Support HAS_LSA and !HAS_LSA as ghost options.
	(mips-dg-options): Require rev 6 for HAS_LSA. Downgrade to rev 5 for !HAS_LSA.
--- gcc/config/mips/mips.c | 30 ++ gcc/config/mips/mips.h | 6 ++ gcc/config/mips/mips.md| 10 ++ gcc/config/mips/predicates.md | 4 gcc/testsuite/gcc.target/mips/lsa.c| 11 +++ gcc/testsuite/gcc.target/mips/mips.exp | 16 ++-- gcc/testsuite/gcc.target/mips/mips64-lsa.c | 11 +++ gcc/testsuite/gcc.target/mips/mulsize-2.c | 1 + gcc/testsuite/gcc.target/mips/mulsize-4.c | 1 + gcc/testsuite/gcc.target/mips/mulsize-5.c | 13 + gcc/testsuite/gcc.target/mips/mulsize-6.c | 13 + 11 files changed, 114 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gcc.target/mips/lsa.c create mode 100644 gcc/testsuite/gcc.target/mips/mips64-lsa.c create mode 100644 gcc/testsuite/gcc.target/mips/mulsize-5.c create mode 100644 gcc/testsuite/gcc.target/mips/mulsize-6.c diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c index c2cc76e..a858a84 100644 --- a/gcc/config/mips/mips.c +++ b/gcc/config/mips/mips.c @@ -4108,6 +4108,22 @@ mips_rtx_costs (rtx x, int code, int outer_code, int opno ATTRIBUTE_UNUSED, return false; } + /* If it's an add + mult (which is equivalent to shift left) and + it's immediate operand satisfies const_immlsa_operand predicate. */ + if (((ISA_HAS_LSA mode == SImode) + || (ISA_HAS_DLSA mode == DImode)) + GET_CODE (XEXP (x, 0)) == MULT) + { + rtx op2 = XEXP (XEXP (x, 0), 1); + if (const_immlsa_operand (op2, mode)) + { + *total = (COSTS_N_INSNS (1) + + set_src_cost (XEXP (XEXP (x, 0), 0), speed) + + set_src_cost (XEXP (x, 1), speed)); + return true; + } + } + /* Double-word operations require three single-word operations and an SLTU. The MIPS16 version then needs to move the result of the SLTU from $24 to a MIPS16 register. */ @@ -8413,6 +8429,7 @@ mips_print_operand_punct_valid_p (unsigned char code) 'x' Print the low 16 bits of CONST_INT OP in hexadecimal format. 'd' Print CONST_INT OP in decimal. 'm' Print one less than CONST_INT OP in decimal. + 'y' Print exact log2 of CONST_INT OP in decimal. 
'h' Print the high-part relocation associated with OP, after stripping any outermost HIGH. 'R' Print the low-part relocation associated with OP. @@ -8476,6 +8493,19 @@ mips_print_operand (FILE *file, rtx op, int letter) output_operand_lossage (invalid use of '%%%c', letter); break; +case 'y': + if (CONST_INT_P (op)) + { + int val = exact_log2 (INTVAL (op)); + if (val != -1) + fprintf (file, %d, val); + else + output_operand_lossage (invalid use of '%%%c', letter); + } + else + output_operand_lossage (invalid use of '%%%c', letter); + break; + case 'h': if (code == HIGH) op = XEXP (op, 0); diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h index 3d95a58..37d4cb4 100644 --- a/gcc/config/mips/mips.h +++ b/gcc/config/mips/mips.h @@ -181,6 +181,12 @@ struct mips_cpu_info { #define ISA_HAS_DSP_MULT ISA_HAS_DSPR2 #endif +/* ISA has LSA available. */ +#define ISA_HAS_LSA(mips_isa_rev = 6) + +/* ISA has DLSA available. */ +#define ISA_HAS_DLSA (TARGET_64BIT mips_isa_rev = 6) + /* The ISA compression flags that are currently in effect. */ #define TARGET_COMPRESSION (target_flags (MASK_MIPS16 | MASK_MICROMIPS)) diff --git a/gcc/config/mips/mips.md b/gcc/config/mips/mips.md index
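To make the 'y' operand modifier and the cost check above easier to follow, here is a small stand-alone sketch. It is not the patch's code: `exact_log2_sketch`, `lsa_multiplier_ok` and `lsa_value` are hypothetical names, and the accepted shift range of 1 to 4 (multipliers 2, 4, 8 and 16) is an assumption about const_immlsa_operand, whose definition is not shown in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Return log2 of X if X is a power of two, otherwise -1.  This mirrors
   how the patch relies on exact_log2 returning -1 for other values.  */
static int
exact_log2_sketch (uint64_t x)
{
  if (x == 0 || (x & (x - 1)) != 0)
    return -1;

  int n = 0;
  while ((x >>= 1) != 0)
    n++;
  return n;
}

/* Assumption: the multiplier must be a power of two whose shift amount
   lies between 1 and 4 to be usable by LSA/DLSA.  */
static bool
lsa_multiplier_ok (uint64_t mult)
{
  int shift = exact_log2_sketch (mult);
  return shift >= 1 && shift <= 4;
}

/* Compute index * mult + base as a single shift-and-add, which is what
   an add of a mult by such a constant collapses to.  */
static uint32_t
lsa_value (uint32_t index, uint32_t base, uint32_t mult)
{
  assert (lsa_multiplier_ok (mult));
  return (index << exact_log2_sketch (mult)) + base;
}
```

Read this way, the 'y' modifier simply prints the shift amount that corresponds to the multiplier in the RTL, and anything that is not an exact power of two is reported as an invalid operand use.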
[PATCH,MIPS] Only pass floating-point options to the assembler then
The new behaviour of the GCC driver passing floating-point options like -msoft-float to the assembler is essential for the new o32 ABI extensions but is a change in behaviour. In particular GCC 5 used with binutils 2.24 would require a user to fix any hand-crafted code that made use of floating-point instructions when building for soft-float.

This patch limits the new behaviour to a combination of GCC and binutils that both have the new ABI support.

This patch, along with parts of several previous patches, needs backporting to GCC 4.9 (and GCC 4.8) to enable use of binutils 2.25 with those compilers. The GCC 4.9 patch will be posted shortly.

Thanks,
Matthew

gcc/

	* config/mips/mips.h (FP_ASM_SPEC): New define.
	(ASM_SPEC): Remove floating-point options and use FP_ASM_SPEC instead.
---
 gcc/config/mips/mips.h | 21 ++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h
index 37d4cb4..ed241fa 100644
--- a/gcc/config/mips/mips.h
+++ b/gcc/config/mips/mips.h
@@ -1243,6 +1243,22 @@ struct mips_cpu_info {
 %{gcoff*:-mdebug} %{!gcoff*:-no-mdebug}
 #endif
 
+/* FP_ASM_SPEC represents the floating-point options that must be passed
+   to the assembler when FPXX support exists.  Prior to that point the
+   assembler could accept the options but were not required for
+   correctness.  We only add the options when absolutely necessary
+   because passing -msoft-float to the assembler will cause it to reject
+   all hard-float instructions which may require some user code to be
+   updated.  */
+
+#ifdef HAVE_AS_DOT_MODULE
+#define FP_ASM_SPEC \
+%{mhard-float} %{msoft-float} \
+%{msingle-float} %{mdouble-float}
+#else
+#define FP_ASM_SPEC
+#endif
+
 /* SUBTARGET_ASM_SPEC is always passed to the assembler.  It may be
    overridden by subtargets.  */
 
@@ -1277,9 +1293,8 @@ struct mips_cpu_info {
 %{modd-spreg} %{mno-odd-spreg} \
 %{mshared} %{mno-shared} \
 %{msym32} %{mno-sym32} \
-%{mtune=*} \
-%{mhard-float} %{msoft-float} \
-%{msingle-float} %{mdouble-float} \
+%{mtune=*} \
+FP_ASM_SPEC \
 %(subtarget_asm_spec)
 
 /* Extra switches sometimes passed to the linker.  */
-- 
2.2.1
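The mechanism used by this patch is ordinary string-literal concatenation guarded by a configure-time macro. The following sketch shows the idea in isolation; the `SKETCH_` macros and `spec_forwards` are hypothetical, the spec strings are shortened, and the quoting is my own since the quotes of the original strings are not visible in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for HAVE_AS_DOT_MODULE: define
   SKETCH_HAVE_AS_DOT_MODULE to model an assembler with the new ABI
   support.  It is left undefined here, modelling an older assembler.  */
#ifdef SKETCH_HAVE_AS_DOT_MODULE
#define SKETCH_FP_ASM_SPEC "%{mhard-float} %{msoft-float} "
#else
#define SKETCH_FP_ASM_SPEC ""
#endif

/* Adjacent string literals are concatenated by the compiler, so the
   floating-point part drops out completely when the macro is empty.  */
#define SKETCH_ASM_SPEC \
  "%{mtune=*} " SKETCH_FP_ASM_SPEC "%(subtarget_asm_spec)"

/* Return true if the assembler spec mentions OPT.  */
static bool
spec_forwards (const char *opt)
{
  assert (opt != NULL);
  return strstr (SKETCH_ASM_SPEC, opt) != NULL;
}
```

With the guard macro undefined, the soft-float option never reaches the assembler spec, which corresponds to the case the patch wants to keep unchanged for older binutils.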
[PATCH][AArch64] Use target builtin instead of __builtin_sqrt for vsqrt_f64
Hi all, As raised in https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01237.html and discussed in that thread, using __builtin_sqrt for vsqrt_f64 may end up in a call to the library sqrt at -O0. To avoid that, this patch uses a target builtin for sqrt on DF mode and uses that to implement the intrinsic. With this patch I don't see sqrt calls being created at -O0 on a large arm_neon.h testcase where they were generated before. aarch64-none-elf testing and the intrinsics testsuite in particular are clean. Ok for trunk? Thanks, Kyrill 2015-01-12 Kyrylo Tkachov kyrylo.tkac...@arm.com * config/aarch64/aarch64-simd-builtins.def (sqrt): Use BUILTIN_VDQF_DF. * config/aarch64/arm_neon.h (vsqrt_f64): Use __builtin_aarch64_sqrtdf instead of __builtin_sqrt.
commit 865be1cc8365886904d571e244746815e2317162 Author: Kyrylo Tkachov kyrylo.tkac...@arm.com Date: Fri Jan 9 12:18:59 2015 + [AArch64] Use target builtin for vsqrt_f64 diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def index b41d9f6..60cd1d7 100644 --- a/gcc/config/aarch64/aarch64-simd-builtins.def +++ b/gcc/config/aarch64/aarch64-simd-builtins.def @@ -41,7 +41,7 @@ BUILTIN_VDC (COMBINE, combine, 0) BUILTIN_VB (BINOP, pmul, 0) - BUILTIN_VDQF (UNOP, sqrt, 2) + BUILTIN_VDQF_DF (UNOP, sqrt, 2) BUILTIN_VD_BHSI (BINOP, addp, 0) VAR1 (UNOP, addp, 0, di) BUILTIN_VDQ_BHSI (UNOP, clrsb, 2) diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h index c679802..3b151a2 100644 --- a/gcc/config/aarch64/arm_neon.h +++ b/gcc/config/aarch64/arm_neon.h @@ -22194,7 +22194,7 @@ vsqrtq_f32 (float32x4_t a) __extension__ static __inline float64x1_t __attribute__ ((__always_inline__)) vsqrt_f64 (float64x1_t a) { - return (float64x1_t) { __builtin_sqrt (a[0]) }; + return (float64x1_t) { __builtin_aarch64_sqrtdf (a[0]) }; } __extension__ static __inline float64x2_t __attribute__ ((__always_inline__))
Re: [Patch, Fortran, OOP] PR 63733: [4.8/4.9/5 Regression] wrong resolution for OPERATOR generics
Good, I fully agree. Fortunately the patch applies cleanly to the 4.9 branch and regtests without errors. Thus I have applied it as r219475. Will do 4.8 soon. Cheers, Janus 2015-01-12 9:30 GMT+01:00 Paul Richard Thomas paul.richard.tho...@gmail.com: Dear Janus, Since it is a regression, by all means update the branches. We usually propose delaying a bit but I am not convinced that this is effective for this kind of bug fix - usually, further problems take a long time to emerge. Thus, I would recommend that you get on with it. Thanks Paul On 11 January 2015 at 23:01, Janus Weil ja...@gcc.gnu.org wrote: Well done for sorting that out. OK for trunk. Thanks, Paul. Committed as r219440. What about the branches? Cheers, Janus On 11 January 2015 at 14:38, Janus Weil ja...@gcc.gnu.org wrote: Hi all, this patch fixes a wrong-code regression related to operators, by making sure that we look for typebound operators first, before looking for non-typebound ones. (Note: Each typebound operator is also added to the list of non-typebound ones, for reasons of diagnostics.) Regtested on x86_64-unknown-linux-gnu. Ok for trunk? 4.9/4.8? Cheers, Janus 2015-01-11 Janus Weil ja...@gcc.gnu.org PR fortran/63733 * interface.c (gfc_extend_expr): Look for type-bound operators before non-typebound ones. 2015-01-11 Janus Weil ja...@gcc.gnu.org PR fortran/63733 * gfortran.dg/typebound_operator_20.f90: New. -- Outside of a dog, a book is a man's best friend. Inside of a dog it's too dark to read. Groucho Marx
[PATCH] Fix PR64568
The following avoids splitting TARGET_MEM_REFs by attaching REAL/IMAGPART_EXPRs around it which isn't allowed. Bootstrap and regtest running on x86_64-unknown-linux-gnu. Richard. 2014-01-12 Richard Biener rguent...@suse.de PR tree-optimization/64568 * tree-ssa-forwprop.c (pass_forwprop::execute): Properly release defs of removed stmts, avoid splitting TARGET_MEM_REFs. * g++.dg/torture/pr64568.C: New testcase. Index: gcc/tree-ssa-forwprop.c === --- gcc/tree-ssa-forwprop.c (revision 219446) +++ gcc/tree-ssa-forwprop.c (working copy) @@ -2267,6 +2267,8 @@ pass_forwprop::execute (function *fun) gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT); } + + release_defs (stmt); gsi_remove (gsi, true); } else @@ -2281,7 +2283,9 @@ pass_forwprop::execute (function *fun) if (single_imm_use (lhs, use_p, use_stmt) gimple_store_p (use_stmt) !gimple_has_volatile_ops (use_stmt) - is_gimple_assign (use_stmt)) + is_gimple_assign (use_stmt) + (TREE_CODE (gimple_assign_lhs (use_stmt)) + != TARGET_MEM_REF)) { tree use_lhs = gimple_assign_lhs (use_stmt); tree new_lhs = build1 (REALPART_EXPR, @@ -2302,6 +2306,7 @@ pass_forwprop::execute (function *fun) gimple_assign_set_rhs1 (use_stmt, gimple_assign_rhs2 (stmt)); update_stmt (use_stmt); + release_defs (stmt); gsi_remove (gsi, true); } else Index: gcc/testsuite/g++.dg/torture/pr64568.C === --- gcc/testsuite/g++.dg/torture/pr64568.C (revision 0) +++ gcc/testsuite/g++.dg/torture/pr64568.C (working copy) @@ -0,0 +1,111 @@ +// { dg-do compile } +// { dg-additional-options -std=c++11 } + +namespace std +{ +typedef long unsigned size_t; +template typename class complex; +template typename _Tp complex_Tp operator+(complex_Tp, complex_Tp) +{ + complex_Tp a = 0; + a += 0; + return a; +} +template struct complexdouble +{ + complex (int __i) : _M_value{ __i } {} + int imag (); + void operator+=(complex __z) { _M_value = __z.imag (); } + _Complex double _M_value; +}; +} +class A +{ +public: + A (int); + std::complexdouble operator[](int i) { return 
data_[i]; } + std::complexdouble *data_; +}; +struct B +{ + static std::complexdouble + apply (std::complexdouble t1, std::complexdouble t2) + { +return t1 + t2; + } +}; +template class T1, class struct C +{ + static void + apply (T1 t1, std::complexdouble t2) + { +t1 = t2; + } +}; +template class E class D +{ +public: + E operator()(); +}; +class G : public DG +{ +public: + typedef std::complexdouble value_type; + value_type operator()(int) { return B::apply (0, 0); } +}; +template class E1, class E2 G operator+(DE1, DE2); +template template class, class class F, class V, class E +void +indexing_vector_assign (V v, DE e) +{ + for (int i;; ++i) +Ftypename V::reference, typename E::value_type::apply (v (i), e ()(0)); +} +template template class, class class F, class V, class E +void +vector_assign (V v, DE e, int) +{ + indexing_vector_assignF (v, e); +} +template template class, class class F, class V, class E +void +vector_assign (V v, DE e) +{ + vector_assignF (v, e, typename V::storage_category ()); +} +class H : public Dint +{ +public: + typedef std::complexdouble reference; + typedef int storage_category; + H (int); + template class AE H (DAE ae) : data_ (0) + { +vector_assignC (*this, ae); + } + A + data () + { +return data_; + } + reference operator()(int i) { return data ()[i]; } + A data_; +}; +template class T1, class V1, class T2, class V2 +void +rot (T1, V1 v1, T2, V2 v2) +{ + H (v1 + v2); +} +template class, unsigned long struct F +{ + void test (); +}; +template struct FH, 3; +template class V, std::size_t N +void +FV, N::test () +{ + V b (0), c (0); + rot (0, b, 0, c); +}
[PATCH][test] Gate gcc.dg/aru-2.c test on profiling support
Hi all, This recently added test adds -pg to its dg-options but not all targets support this and fail at link-time with bin/ld: cannot find -lc_p. Looking around I see that all tests that use -pg also do a dg-require-profiling. This patch adds that. With this patch the test doesn't FAIL on aarch64-none-elf. It appears as UNSUPPORTED. Ok for trunk? Thanks, Kyrill 2015-01-12 Kyrylo Tkachov kyrylo.tkac...@arm.com * gcc.dg/aru-2.c: Add dg-require-profiling directive.
commit d4cffed0dede524cf1d0b3487e21ab4f783d06bd Author: Kyrylo Tkachov kyrylo.tkac...@arm.com Date: Fri Jan 9 12:13:02 2015 + [testuite] Add check on profiling in aru-2.c test diff --git a/gcc/testsuite/gcc.dg/aru-2.c b/gcc/testsuite/gcc.dg/aru-2.c index efd1f01..d36adc1 100644 --- a/gcc/testsuite/gcc.dg/aru-2.c +++ b/gcc/testsuite/gcc.dg/aru-2.c @@ -1,4 +1,5 @@ /* { dg-do run } */ +/* { dg-require-profiling -pg } */ /* { dg-options -O2 -pg } */ static int __attribute__((noinline))
[PATCH] Fix REE for vector modes (PR rtl-optimization/64286)
Hi! As mentioned in the PR, giving up for all vector mode extensions is unnecessary, but unlike scalar integer extensions, where the low part of the extended value is the original value, for vectors this is not true, thus the old value is lost. Which means we can perform REE, but only if all uses of the definition are the same (code+mode) extension. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2015-01-12 Jakub Jelinek ja...@redhat.com PR rtl-optimization/64286 * ree.c (add_removable_extension): Don't add vector mode extensions if all uses of the source register aren't the same vector extensions. * gcc.target/i386/avx2-pr64286.c: New test. --- gcc/ree.c.jj2015-01-09 22:00:00.427660442 +0100 +++ gcc/ree.c 2015-01-12 12:15:56.097184674 +0100 @@ -1027,6 +1027,7 @@ add_removable_extension (const_rtx expr, different extension. FIXME: this obviously can be improved. */ for (def = defs; def; def = def-next) if ((idx = def_map[INSN_UID (DF_REF_INSN (def-ref))]) +idx != -1U (cand = (*insn_list)[idx - 1]) cand-code != code) { @@ -1038,6 +1039,57 @@ add_removable_extension (const_rtx expr, } return; } + /* For vector mode extensions, ensure that all uses of the + XEXP (src, 0) register are the same extension (both code + and to which mode), as unlike integral extensions lowpart + subreg of the sign/zero extended register are not equal + to the original register, so we have to change all uses or + none. 
*/ + else if (VECTOR_MODE_P (GET_MODE (XEXP (src, 0 + { + if (idx == 0) + { + struct df_link *ref_chain, *ref_link; + + ref_chain = DF_REF_CHAIN (def-ref); + for (ref_link = ref_chain; ref_link; ref_link = ref_link-next) + { + if (ref_link-ref == NULL + || DF_REF_INSN_INFO (ref_link-ref) == NULL) + { + idx = -1U; + break; + } + rtx_insn *use_insn = DF_REF_INSN (ref_link-ref); + const_rtx use_set; + if (use_insn == insn || DEBUG_INSN_P (use_insn)) + continue; + if (!(use_set = single_set (use_insn)) + || !REG_P (SET_DEST (use_set)) + || GET_MODE (SET_DEST (use_set)) != GET_MODE (dest) + || GET_CODE (SET_SRC (use_set)) != code + || !rtx_equal_p (XEXP (SET_SRC (use_set), 0), +XEXP (src, 0))) + { + idx = -1U; + break; + } + } + if (idx == -1U) + def_map[INSN_UID (DF_REF_INSN (def-ref))] = idx; + } + if (idx == -1U) + { + if (dump_file) + { + fprintf (dump_file, Cannot eliminate extension:\n); + print_rtl_single (dump_file, insn); + fprintf (dump_file, + because some vector uses aren't extension\n); + } + return; + } + } /* Then add the candidate to the list and insert the reaching definitions into the definition map. 
*/ --- gcc/testsuite/gcc.target/i386/avx2-pr64286.c.jj 2015-01-12 12:19:54.863031657 +0100 +++ gcc/testsuite/gcc.target/i386/avx2-pr64286.c2015-01-12 12:19:36.0 +0100 @@ -0,0 +1,37 @@ +/* PR rtl-optimization/64286 */ +/* { dg-do run } */ +/* { dg-options -O2 -mavx2 } */ +/* { dg-require-effective-target avx2 } */ + +#include string.h +#include stdlib.h +#include x86intrin.h +#include avx2-check.h + +__m128i v; +__m256i w; + +__attribute__((noinline, noclone)) void +foo (__m128i *p, __m128i *q) +{ + __m128i a = _mm_loadu_si128 (p); + __m128i b = _mm_xor_si128 (a, v); + w = _mm256_cvtepu8_epi16 (a); + *q = b; +} + +static void +avx2_test (void) +{ + v = _mm_set1_epi8 (0x40); + __m128i c = _mm_set_epi8 (16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1); + __m128i d; + foo (c, d); + __m128i e = _mm_set_epi8 (0x50, 0x4f, 0x4e, 0x4d, 0x4c, 0x4b, 0x4a, 0x49, + 0x48, 0x47, 0x46, 0x45, 0x44, 0x43, 0x42, 0x41); + __m256i f = _mm256_set_epi16 (16, 15, 14, 13, 12, 11, 10, 9, + 8, 7, 6, 5, 4, 3, 2, 1); + if (memcmp (w, f, sizeof (w)) != 0 + || memcmp (d, e, sizeof (d)) != 0) +abort (); +} Jakub
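The key claim in the REE patch above is that the low part of a scalar extension still holds the original value, while the low part of a vector extension does not. The sketch below illustrates that with plain arrays rather than real vector registers; `scalar_lowpart_preserved` and `vector_lowpart_preserved` are hypothetical helpers written for this illustration and do not appear in ree.c.

```c
#include <stdint.h>
#include <string.h>

/* Scalar case: zero-extend an 8-bit value to 16 bits and check that the
   low 8 bits of the result are still the original value.  */
int scalar_lowpart_preserved (uint8_t x)
{
  uint16_t wide = x;            /* zero extension */
  return (uint8_t) wide == x;
}

/* Vector case, modelled with arrays: zero-extend sixteen 8-bit elements
   to sixteen 16-bit elements, then compare the first 16 bytes of the
   wide result (its "low part") with the original 16 bytes.  */
int vector_lowpart_preserved (const uint8_t src[16])
{
  uint16_t wide[16];
  for (int i = 0; i < 16; i++)
    wide[i] = src[i];           /* element-wise zero extension */
  return memcmp (wide, src, 16) == 0;
}
```

For a source vector holding 1..16, the widened elements interleave zero bytes with the data, so the low 16 bytes no longer match the source. That is why the patch only lets REE proceed when every use of the source register is the same extension.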
[PATCH] Use ldexp instead of scalbln for portability (PR other/64370)
Hi! As mentioned in the PR, HPUX doesn't have scalbln, but does have ldexp and that function is already used in gcj-dump, so supposedly it is more portable to use ldexp. Also in glibc it is defined in libc in addition to libm. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2015-01-12 Jakub Jelinek ja...@redhat.com PR other/64370 * sreal.c (sreal::to_double): Use ldexp instead of scalbln. --- gcc/sreal.c.jj 2015-01-09 12:01:33.0 +0100 +++ gcc/sreal.c 2015-01-12 14:26:44.339128332 +0100 @@ -122,7 +122,7 @@ sreal::to_double () const { double val = m_sig; if (m_exp) -val = scalbln (val, m_exp); +val = ldexp (val, m_exp); return val; } Jakub
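For readers unfamiliar with the two functions: both scale a double by a power of the radix, and on binary floating-point targets (FLT_RADIX == 2) ldexp (x, e) computes x * 2^e just as scalbln does. A minimal sketch of the conversion follows, assuming a simplified stand-in for sreal; `struct toy_sreal` and `toy_sreal_to_double` are hypothetical names, not the GCC class.

```c
#include <math.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for GCC's sreal: the value
   represented is sig * 2^exp.  */
struct toy_sreal
{
  int64_t sig;
  int exp;
};

/* Mirrors the shape of sreal::to_double after the patch: scale the
   significand by 2^exp using ldexp, skipping the call when exp is 0.  */
double toy_sreal_to_double (const struct toy_sreal *r)
{
  double val = (double) r->sig;
  if (r->exp)
    val = ldexp (val, r->exp);
  return val;
}
```

Depending on the platform, linking may need -lm, although the message above notes that glibc also provides ldexp in libc.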
Re: [testsuite] PATCH: Add check_effective_target_pie
On Mon, Jan 12, 2015 at 12:03 PM, Jeff Law l...@redhat.com wrote: On 01/12/15 12:59, H.J. Lu wrote: I don't know if -pg will work PIE on any targets. For Linux/x86 the choices of crt1.o are %{!shared: %{pg|p|profile:gcrt1.o%s;pie:Scrt1.o%s;:crt1.o%s}} -shared, -pg and -pie are mutually exclusive. Those crt1 files are only crt1 files provided by glibc. You can't even try -pg -pie on Linux without changing glibc. You're totally missing the point. What I care about is *why*. Showing me spec file fragments is totally unhelpful. What is the technical reason why pg and pie are mutually exclusive? What kind of technical reason are you looking for? glibc doesn't provide the right crt1 file for GCC to support this combination. You can't define GNU_USER_TARGET_STARTFILE_SPEC to support -pg and -pie. If you are asking why glibc doesn't provide one, my guess is no one has requested one before. -- H.J.
[PATCH 2/4] Pipeline model for APM XGene-1.
--- gcc/config/aarch64/aarch64.md | 1 + gcc/config/arm/xgene1.md | 531 ++ 2 files changed, 532 insertions(+) create mode 100644 gcc/config/arm/xgene1.md diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 12e1054..1f6b1b6 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -190,6 +190,7 @@ (include ../arm/cortex-a53.md) (include ../arm/cortex-a15.md) (include thunderx.md) +(include ../arm/xgene1.md) ;; --- ;; Jumps and other miscellaneous insns diff --git a/gcc/config/arm/xgene1.md b/gcc/config/arm/xgene1.md new file mode 100644 index 000..e909fd0 --- /dev/null +++ b/gcc/config/arm/xgene1.md @@ -0,0 +1,531 @@ +;; Machine description for AppliedMicro xgene1 core. +;; Copyright (C) 2012-2014 Free Software Foundation, Inc. +;; Contributed by Theobroma Systems Design und Consulting GmbH. +;; +;; This file is part of GCC. +;; +;; GCC is free software; you can redistribute it and/or modify it +;; under the terms of the GNU General Public License as published by +;; the Free Software Foundation; either version 3, or (at your option) +;; any later version. +;; +;; GCC is distributed in the hope that it will be useful, but +;; WITHOUT ANY WARRANTY; without even the implied warranty of +;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +;; General Public License for more details. +;; +;; You should have received a copy of the GNU General Public License +;; along with GCC; see the file COPYING3. If not see +;; http://www.gnu.org/licenses/. 
+ +;; Pipeline description for the xgene1 micro-architecture + +(define_automaton xgene1) + +(define_cpu_unit xgene1_decode_out0 xgene1) +(define_cpu_unit xgene1_decode_out1 xgene1) +(define_cpu_unit xgene1_decode_out2 xgene1) +(define_cpu_unit xgene1_decode_out3 xgene1) + +(define_cpu_unit xgene1_divide xgene1) +(define_cpu_unit xgene1_fp_divide xgene1) +(define_cpu_unit xgene1_fsu xgene1) +(define_cpu_unit xgene1_fcmp xgene1) + +(define_reservation xgene1_decode1op +( xgene1_decode_out0 ) +|( xgene1_decode_out1 ) +|( xgene1_decode_out2 ) +|( xgene1_decode_out3 ) +) +(define_reservation xgene1_decode2op +( xgene1_decode_out0 + xgene1_decode_out1 ) +|( xgene1_decode_out0 + xgene1_decode_out2 ) +|( xgene1_decode_out0 + xgene1_decode_out3 ) +|( xgene1_decode_out1 + xgene1_decode_out2 ) +|( xgene1_decode_out1 + xgene1_decode_out3 ) +|( xgene1_decode_out2 + xgene1_decode_out3 ) +) +(define_reservation xgene1_decodeIsolated +( xgene1_decode_out0 + xgene1_decode_out1 + xgene1_decode_out2 + xgene1_decode_out3 ) +) + +(define_insn_reservation xgene1_branch 1 + (and (eq_attr tune xgene1) + (eq_attr type branch)) + xgene1_decode1op) + +(define_insn_reservation xgene1_nop 1 + (and (eq_attr tune xgene1) + (eq_attr type no_insn)) + xgene1_decode1op) + +(define_insn_reservation xgene1_call 1 + (and (eq_attr tune xgene1) + (eq_attr type call)) + xgene1_decode2op) + +(define_insn_reservation xgene1_f_load 10 + (and (eq_attr tune xgene1) + (eq_attr type f_loadd,f_loads)) + xgene1_decode2op) + +(define_insn_reservation xgene1_f_store 4 + (and (eq_attr tune xgene1) + (eq_attr type f_stored,f_stores)) + xgene1_decode2op) + +(define_insn_reservation xgene1_fmov 2 + (and (eq_attr tune xgene1) + (eq_attr type fmov,fconsts,fconstd)) + xgene1_decode1op) + +(define_insn_reservation xgene1_f_mcr 10 + (and (eq_attr tune xgene1) + (eq_attr type f_mcr)) + xgene1_decodeIsolated) + +(define_insn_reservation xgene1_f_mrc 4 + (and (eq_attr tune xgene1) + (eq_attr type f_mrc)) + xgene1_decode2op) + 
+(define_insn_reservation xgene1_load_pair 6 + (and (eq_attr tune xgene1) + (eq_attr type load2)) + xgene1_decodeIsolated) + +(define_insn_reservation xgene1_store_pair 2 + (and (eq_attr tune xgene1) + (eq_attr type store2)) + xgene1_decodeIsolated) + +(define_insn_reservation xgene1_fp_load1 10 + (and (eq_attr tune xgene1) + (eq_attr type load1) + (eq_attr fp yes)) + xgene1_decode1op) + +(define_insn_reservation xgene1_load1 5 + (and (eq_attr tune xgene1) + (eq_attr type load1)) + xgene1_decode1op) + +(define_insn_reservation xgene1_store1 2 + (and (eq_attr tune xgene1) + (eq_attr type store1)) + xgene1_decode2op) + +(define_insn_reservation xgene1_move 1 + (and (eq_attr tune xgene1) + (eq_attr type mov_reg,mov_imm,mrs)) + xgene1_decode1op) + +(define_insn_reservation xgene1_alu 1 + (and (eq_attr tune xgene1) + (eq_attr type alu_imm,alu_sreg,alu_shift_imm,\ +alu_ext,adc_reg,csel,logic_imm,\ +logic_reg,logic_shift_imm,clz,\ +rbit,shift_reg,adr,mov_reg,\ +mov_imm,extend)) + xgene1_decode1op) +
[PATCH 0/4, AArch64, v4] APM X-Gene 1 cost-table and pipeline model
Marcus & Ramana, Attached is the updated---and hopefully final---revision of the changes to get XGene-1 properly wired up in the AArch64 and AArch32 backends. On the AArch64 side, we've only removed the URL from the credits of the xgene1.md file and the remaining content is unchanged (save for changes from rebasing to the current head). Note that the AArch32 integration is contained entirely in patch 4/4 and requires a gas-change that was merged as ea0d6bb on the binutils tree. These patches incorporate all earlier comments and have been tested for AArch64 (aarch64-linux-gnu in LE and BE configurations) and for AArch32 (arm-none-eabi). I'd be grateful if you could apply at least the AArch64 patches. Best, Phil.
Philipp Tomsich (4):
  Core definition for APM XGene-1 and associated cost-table.
  Pipeline model for APM XGene-1.
  Change the type of the prefetch-instructions to 'prefetch'.
  Wire X-Gene 1 up in the ARM (32bit) backend as a AArch32-capable core.
 gcc/ChangeLog-2014                   | 18 ++
 gcc/config/aarch64/aarch64-cores.def | 1 +
 gcc/config/aarch64/aarch64-tune.md   | 2 +-
 gcc/config/aarch64/aarch64.c         | 68 +
 gcc/config/aarch64/aarch64.md        | 3 +-
 gcc/config/arm/aarch-cost-tables.h   | 101 +++
 gcc/config/arm/arm-cores.def         | 1 +
 gcc/config/arm/arm-tables.opt        | 3 +
 gcc/config/arm/arm-tune.md           | 3 +-
 gcc/config/arm/arm.c                 | 22 ++
 gcc/config/arm/arm.md                | 11 +-
 gcc/config/arm/bpabi.h               | 2 +
 gcc/config/arm/t-arm                 | 1 +
 gcc/config/arm/types.md              | 2 +
 gcc/config/arm/xgene1.md             | 531 +++
 gcc/doc/invoke.texi                  | 6 +-
 16 files changed, 768 insertions(+), 7 deletions(-)
 create mode 100644 gcc/config/arm/xgene1.md
-- 1.9.1
[PATCH 1/4] Core definition for APM XGene-1 and associated cost-table.
To keep this change separately buildable from the pipeline model, this patch directs the APM XGene-1 to use the generic scheduling model. --- gcc/ChangeLog-2014 | 8 +++ gcc/config/aarch64/aarch64-cores.def | 1 + gcc/config/aarch64/aarch64-tune.md | 2 +- gcc/config/aarch64/aarch64.c | 68 +++ gcc/config/arm/aarch-cost-tables.h | 101 +++ gcc/doc/invoke.texi | 3 +- 6 files changed, 181 insertions(+), 2 deletions(-) diff --git a/gcc/ChangeLog-2014 b/gcc/ChangeLog-2014 index 58091df..dd49d7f 100644 --- a/gcc/ChangeLog-2014 +++ b/gcc/ChangeLog-2014 @@ -5350,6 +5350,14 @@ optimization of ashiftrt of subreg of lshiftrt, check that code is ASHIFTRT. +2014-11-19 Philipp Tomsich philipp.toms...@theobroma-systems.com + + * config/aarch64/aarch64-cores.def (xgene1): Update/add the + xgene1 (APM XGene-1) core definition. + * gcc/config/aarch64/aarch64.c: Add cost tables for APM XGene-1 + * config/arm/aarch-cost-tables.h: Add cost tables for APM XGene-1 + * doc/invoke.texi: Document -mcpu=xgene1. + 2014-11-18 Andrew MacLeod amacl...@redhat.com * attribs.c (decl_attributes): Remove always true condition, diff --git a/gcc/config/aarch64/aarch64-cores.def b/gcc/config/aarch64/aarch64-cores.def index 18f5c48..35a43e6 100644 --- a/gcc/config/aarch64/aarch64-cores.def +++ b/gcc/config/aarch64/aarch64-cores.def @@ -37,6 +37,7 @@ AARCH64_CORE(cortex-a53, cortexa53, cortexa53, 8, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa53) AARCH64_CORE(cortex-a57, cortexa15, cortexa15, 8, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa57) AARCH64_CORE(thunderx,thunderx, thunderx, 8, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx) +AARCH64_CORE(xgene1, xgene1,xgene1,8, AARCH64_FL_FOR_ARCH8, xgene1) /* V8 big.LITTLE implementations. 
*/ diff --git a/gcc/config/aarch64/aarch64-tune.md b/gcc/config/aarch64/aarch64-tune.md index c717ea8..6409082 100644 --- a/gcc/config/aarch64/aarch64-tune.md +++ b/gcc/config/aarch64/aarch64-tune.md @@ -1,5 +1,5 @@ ;; -*- buffer-read-only: t -*- ;; Generated automatically by gentune.sh from aarch64-cores.def (define_attr tune - cortexa53,cortexa15,thunderx,cortexa57cortexa53 + cortexa53,cortexa15,thunderx,xgene1,cortexa57cortexa53 (const (symbol_ref ((enum attr_tune) aarch64_tune diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 5100532..dd43a73 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -233,6 +233,27 @@ static const struct cpu_addrcost_table cortexa57_addrcost_table = #if HAVE_DESIGNATED_INITIALIZERS GCC_VERSION = 2007 __extension__ #endif +static const struct cpu_addrcost_table xgene1_addrcost_table = +{ +#if HAVE_DESIGNATED_INITIALIZERS + .addr_scale_costs = +#endif +{ + NAMED_PARAM (hi, 1), + NAMED_PARAM (si, 0), + NAMED_PARAM (di, 0), + NAMED_PARAM (ti, 1), +}, + NAMED_PARAM (pre_modify, 1), + NAMED_PARAM (post_modify, 0), + NAMED_PARAM (register_offset, 0), + NAMED_PARAM (register_extend, 1), + NAMED_PARAM (imm_offset, 0), +}; + +#if HAVE_DESIGNATED_INITIALIZERS GCC_VERSION = 2007 +__extension__ +#endif static const struct cpu_regmove_cost generic_regmove_cost = { NAMED_PARAM (GP2GP, 1), @@ -271,6 +292,16 @@ static const struct cpu_regmove_cost thunderx_regmove_cost = NAMED_PARAM (FP2FP, 4) }; +static const struct cpu_regmove_cost xgene1_regmove_cost = +{ + NAMED_PARAM (GP2GP, 1), + /* Avoid the use of slow int-fp moves for spilling by setting + their cost higher than memmov_cost. */ + NAMED_PARAM (GP2FP, 8), + NAMED_PARAM (FP2GP, 8), + NAMED_PARAM (FP2FP, 2) +}; + /* Generic costs for vector insn classes. 
*/ #if HAVE_DESIGNATED_INITIALIZERS GCC_VERSION = 2007 __extension__ @@ -311,6 +342,26 @@ static const struct cpu_vector_cost cortexa57_vector_cost = NAMED_PARAM (cond_not_taken_branch_cost, 1) }; +/* Generic costs for vector insn classes. */ +#if HAVE_DESIGNATED_INITIALIZERS GCC_VERSION = 2007 +__extension__ +#endif +static const struct cpu_vector_cost xgene1_vector_cost = +{ + NAMED_PARAM (scalar_stmt_cost, 1), + NAMED_PARAM (scalar_load_cost, 5), + NAMED_PARAM (scalar_store_cost, 1), + NAMED_PARAM (vec_stmt_cost, 2), + NAMED_PARAM (vec_to_scalar_cost, 4), + NAMED_PARAM (scalar_to_vec_cost, 4), + NAMED_PARAM (vec_align_load_cost, 10), + NAMED_PARAM (vec_unalign_load_cost, 10), + NAMED_PARAM (vec_unalign_store_cost, 2), + NAMED_PARAM (vec_store_cost, 2), + NAMED_PARAM (cond_taken_branch_cost, 2), + NAMED_PARAM (cond_not_taken_branch_cost, 1) +}; + #define AARCH64_FUSE_NOTHING (0) #define AARCH64_FUSE_MOV_MOVK (1 0) #define AARCH64_FUSE_ADRP_ADD (1 1) @@ -390,6 +441,23 @@ static const struct tune_params thunderx_tunings = 1/* vec_reassoc_width. */ }; +static const struct
[PATCH 3/4] Change the type of the prefetch-instructions to 'prefetch'.
--- gcc/config/aarch64/aarch64.md | 2 +- gcc/config/arm/types.md | 2 ++ 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 1f6b1b6..98f4f30 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -391,7 +391,7 @@ return pftype[INTVAL(operands[1])][locality]; } - [(set_attr type load1)] + [(set_attr type prefetch)] ) (define_insn trap diff --git a/gcc/config/arm/types.md b/gcc/config/arm/types.md index d368446..088c21a 100644 --- a/gcc/config/arm/types.md +++ b/gcc/config/arm/types.md @@ -118,6 +118,7 @@ ; mvn_shift_reg inverting move instruction, shifted operand by a register. ; no_insnan insn which does not represent an instruction in the ;final output, thus having no impact on scheduling. +; prefetch a prefetch instruction ; rbit reverse bits. ; revreverse bytes. ; sdiv signed division. @@ -556,6 +557,7 @@ call,\ clz,\ no_insn,\ + prefetch,\ csel,\ crc,\ extend,\ -- 1.9.1
Re: [RFC PATCH] Handle sequence in reg_set_p
On 01/11/15 04:40, Oleg Endo wrote: Any particular reason why the SEQUENCE handling isn't done first, then the REG_INC and CALL insn handling? I'd probably explicitly return false if we had a sequence and none of its elements returned true. There's no need to check anything on the toplevel SEQUENCE to the best of my knowledge. No meaningful reason. Attached is an updated patch that applies on 4.8 and trunk. Bootstrapped on trunk i686-pc-linux-gnu. Tested with make -k check RUNTESTFLAGS=--target_board=sh-sim \{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb} on trunk sh-elf. In the PR it has been reported that the patch fixes the problems when building a native SH GCC. Although the related SH problem occurs only on 4.8, I'd like to install this on trunk and 4.9, too, to avoid future surprises. OK? Cheers, Oleg gcc/ChangeLog: PR target/64479 * rtlanal.c (reg_set_p): Handle SEQUENCE constructs. OK. Jeff
[PATCH, committed] jit-playback.c: fix missing fclose
Reported by David Binderman within discussion of PR jit/63854. Before/after jit.sum has 7272 passes. Committed to trunk as r219487. gcc/jit/ChangeLog: * jit-playback.c (gcc::jit::playback::context::read_dump_file): Add missing fclose on error-handling path. --- gcc/jit/jit-playback.c | 1 + 1 file changed, 1 insertion(+) diff --git a/gcc/jit/jit-playback.c b/gcc/jit/jit-playback.c index 0e45e02..ca4e112 100644 --- a/gcc/jit/jit-playback.c +++ b/gcc/jit/jit-playback.c @@ -1947,6 +1947,7 @@ playback::context::read_dump_file (const char *path) { add_error (NULL, error reading from %s, path); free (result); + fclose (f_in); return NULL; } -- 1.8.5.3
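The leak fixed above is the usual one: an early return on a read error releases the buffer but not the FILE handle. The sketch below shows the general pattern of releasing both on every exit path. `read_whole_file` is a hypothetical helper written for illustration and is not the body of playback::context::read_dump_file.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read the whole file at PATH into a malloc'd,
   NUL-terminated buffer.  Returns NULL on any failure.  Every exit
   path after a successful fopen closes the stream.  */
char *read_whole_file (const char *path)
{
  FILE *f_in = fopen (path, "r");
  if (f_in == NULL)
    return NULL;

  size_t cap = 4096, len = 0;
  char *result = malloc (cap);
  if (result == NULL)
    {
      fclose (f_in);
      return NULL;
    }

  while (!feof (f_in))
    {
      /* Grow the buffer when only the slot for the terminator is left.  */
      if (len + 1 == cap)
        {
          char *grown = realloc (result, cap * 2);
          if (grown == NULL)
            {
              free (result);
              fclose (f_in);
              return NULL;
            }
          result = grown;
          cap *= 2;
        }

      len += fread (result + len, 1, cap - len - 1, f_in);
      if (ferror (f_in))
        {
          /* Error path: release the buffer and the stream.  */
          free (result);
          fclose (f_in);
          return NULL;
        }
    }

  result[len] = '\0';
  fclose (f_in);
  return result;
}
```

The caller owns the returned buffer and should free it.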
Re: [PATCH, committed] Fix build of jit (was Re: [PATCH] Flatten tree.h and tree-core.h (Version 3))
On 13 January 2015 at 00:01, Mike Stump mikest...@comcast.net wrote: On Jan 11, 2015, at 2:33 PM, Prathamesh Kulkarni prathamesh.kulka...@linaro.org wrote: oops, sorry about this. We will build further flattening patches with --enable-languages=all,go,jit,ada. Shall that cover all the front-ends ? No objc++ is non-default: Thanks! $ grep build_by_default */config-lang.in go/config-lang.in:build_by_default=no java/config-lang.in:#build_by_default=no jit/config-lang.in:build_by_default=no lto/config-lang.in:build_by_default=no objcp/config-lang.in:build_by_default=no I saw changes to objcp, so, I had assumed you did it, otherwise I was going to mention it. Michael had built objcp for tree.h flattening patch and tested it. Thanks, Prathamesh
Re: [testsuite] PATCH: Add check_effective_target_pie
On Mon, Jan 12, 2015 at 10:09 AM, Jeff Law l...@redhat.com wrote: On 01/11/15 16:58, H.J. Lu wrote: Hi, This patch adds check_effective_target_pie to check if the current multilib generatse PIE by default. I will submit other patches to use it. OK for trunk? Thanks. H.J. --- 2015-01-11 H.J. Lu hongjiu...@intel.com * gcc.target/i386/pie.c: New test. * lib/target-supports.exp (check_profiling_available): Return 0 if PIE is enabled. (check_effective_target_pie): New. --- gcc/testsuite/gcc.target/i386/pie.c | 12 gcc/testsuite/lib/target-supports.exp | 15 +++ 2 files changed, 27 insertions(+) create mode 100644 gcc/testsuite/gcc.target/i386/pie.c diff --git a/gcc/testsuite/gcc.target/i386/pie.c b/gcc/testsuite/gcc.target/i386/pie.c new file mode 100644 index 000..0a9f5ee --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pie.c @@ -0,0 +1,12 @@ +/* { dg-do compile { target pie } } */ +/* { dg-options -O2 } */ + +int foo (void); + +int +main (void) +{ + return foo (); +} + +/* { dg-final { scan-assembler foo@PLT } } */ diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp index f5c6db8..549bcdf 100644 --- a/gcc/testsuite/lib/target-supports.exp +++ b/gcc/testsuite/lib/target-supports.exp @@ -475,6 +475,11 @@ proc check_profiling_available { test_what } { } } +# Profiling don't work with -fPIE -pie. +if { [check_effective_target_pie] } { + return 0 +} Is this an inherent restriction of -fPIE, or is it merely an implementation detail? If the latter, is that implementation detail a target issue? ie, could we have a target that supports profiling in conjunction with -fPIE? If so, then this test seems too restrictive. 
[hjl@gnu-6 tmp]$ gcc -pie -fPIE -pg h.c /usr/local/bin/ld: /usr/lib/gcc/x86_64-redhat-linux/4.8.3/../../../../lib64/gcrt1.o: relocation R_X86_64_32S against `__libc_csu_fini' can not be used when making a shared object; recompile with -fPIC /usr/lib/gcc/x86_64-redhat-linux/4.8.3/../../../../lib64/gcrt1.o: error adding symbols: Bad value collect2: error: ld returned 1 exit status [hjl@gnu-6 tmp]$ There is no crt1.o from glibc to support -pg -pie -fPIE. I don't know if other targets support -pie -fPIE. } +# Return 1 if the current multilib generatse PIE by default. s/generatse/generates/ I will fix it. Waiting on answer to PIE vs pg question above prior to approving or requesting further refinement. Sure. -- H.J.
[PATCH] Install libgcj.pc as libgcj-5.pc rather than libgcj-5.0.pc (PR libgcj/64219)
Hi! This patch changes the libgcj*.pc installed filename to match the new GCC versioning scheme. Bootstrapped/regtested on x86_64-linux and i686-linux, tested make install. -rw-r--r--. 1 jakub jakub 192 Jan 12 21:02 /tmp/blah/usr/local/lib64/pkgconfig/libgcj-5.pc -rw-r--r--. 1 jakub jakub 192 Jan 12 21:02 /tmp/blah/usr/local/lib/pkgconfig/libgcj-5.pc Ok for trunk? 2015-01-12 Jakub Jelinek ja...@redhat.com PR libgcj/64219 * Makefile.am (install-data-local): Use just the major version from GCJVERSION instead of major.minor. * Makefile.in: Regenerated. --- libjava/Makefile.am.jj 2014-02-20 21:38:45.0 +0100 +++ libjava/Makefile.am 2015-01-12 12:40:50.453179067 +0100 @@ -779,7 +779,7 @@ install_data_local_split = 50 install-data-local: $(PRE_INSTALL) ## Install the .pc file. - @pc_version=`echo $(GCJVERSION) | sed -e 's/[.][^.]*$$//'`; \ + @pc_version=`echo $(GCJVERSION) | sed -e 's/[.][^.]*[.][^.]*$$//'`; \ file=libgcj-$${pc_version}.pc; \ $(mkinstalldirs) $(DESTDIR)$(pkgconfigdir); \ echo $(INSTALL_DATA) libgcj.pc $(DESTDIR)$(pkgconfigdir)/$$file; \ --- libjava/Makefile.in.jj 2014-02-20 21:38:45.0 +0100 +++ libjava/Makefile.in 2015-01-12 12:41:09.376849424 +0100 @@ -12455,7 +12455,7 @@ install-exec-hook: install-binPROGRAMS i @BUILD_ECJ1_TRUE@ mv $(DESTDIR)$(libexecsubdir)/`echo ecjx | sed 's,^.*/,,;$(transform);s/$$/$(EXEEXT)/'` $(DESTDIR)$(libexecsubdir)/ecj1$(host_exeext) install-data-local: $(PRE_INSTALL) - @pc_version=`echo $(GCJVERSION) | sed -e 's/[.][^.]*$$//'`; \ + @pc_version=`echo $(GCJVERSION) | sed -e 's/[.][^.]*[.][^.]*$$//'`; \ file=libgcj-$${pc_version}.pc; \ $(mkinstalldirs) $(DESTDIR)$(pkgconfigdir); \ echo $(INSTALL_DATA) libgcj.pc $(DESTDIR)$(pkgconfigdir)/$$file; \ Jakub
[PATCH 4/4] Wire X-Gene 1 up in the ARM (32bit) backend as a AArch32-capable core.
--- gcc/ChangeLog-2014| 10 ++ gcc/config/arm/arm-cores.def | 1 + gcc/config/arm/arm-tables.opt | 3 +++ gcc/config/arm/arm-tune.md| 3 ++- gcc/config/arm/arm.c | 22 ++ gcc/config/arm/arm.md | 11 +-- gcc/config/arm/bpabi.h| 2 ++ gcc/config/arm/t-arm | 1 + gcc/doc/invoke.texi | 3 ++- 9 files changed, 52 insertions(+), 4 deletions(-) diff --git a/gcc/ChangeLog-2014 b/gcc/ChangeLog-2014 index dd49d7f..c3c62db 100644 --- a/gcc/ChangeLog-2014 +++ b/gcc/ChangeLog-2014 @@ -3497,6 +3497,16 @@ 63965. * config/rs6000/rs6000.c: Likewise. +2014-12-23 Philipp Tomsich philipp.toms...@theobroma-systems.com + + * config/arm/arm.md (generic_sched): Specify xgene1 in 'no' list. + Include xgene1.md. + * config/arm/arm.c (arm_issue_rate): Specify 4 for xgene1. + * config/arm/arm-cores.def (xgene1): New entry. + * config/arm/arm-tables.opt: Regenerate. + * config/arm/arm-tune.md: Regenerate. + * config/arm/bpabi.h (BE8_LINK_SPEC): Specify mcpu=xgene1. + 2014-11-22 Jan Hubicka hubi...@ucw.cz PR ipa/63671 diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def index be125ac..fa13eb9 100644 --- a/gcc/config/arm/arm-cores.def +++ b/gcc/config/arm/arm-cores.def @@ -167,6 +167,7 @@ ARM_CORE(cortex-a17.cortex-a7, cortexa17cortexa7, cortexa7, 7A, FL_LDSCHED | /* V8 Architecture Processors */ ARM_CORE(cortex-a53, cortexa53, cortexa53, 8A, FL_LDSCHED | FL_CRC32, cortex_a53) ARM_CORE(cortex-a57, cortexa57, cortexa15, 8A, FL_LDSCHED | FL_CRC32, cortex_a57) +ARM_CORE(xgene1, xgene1,xgene1, 8A, FL_LDSCHED, xgene1) /* V8 big.LITTLE implementations */ ARM_CORE(cortex-a57.cortex-a53, cortexa57cortexa53, cortexa53, 8A, FL_LDSCHED | FL_CRC32, cortex_a57) diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt index ece9d5e..1392429 100644 --- a/gcc/config/arm/arm-tables.opt +++ b/gcc/config/arm/arm-tables.opt @@ -310,6 +310,9 @@ EnumValue Enum(processor_type) String(cortex-a57) Value(cortexa57) EnumValue +Enum(processor_type) String(xgene1) Value(xgene1) + +EnumValue 
Enum(processor_type) String(cortex-a57.cortex-a53) Value(cortexa57cortexa53) Enum diff --git a/gcc/config/arm/arm-tune.md b/gcc/config/arm/arm-tune.md index 452820ab..dcd5054 100644 --- a/gcc/config/arm/arm-tune.md +++ b/gcc/config/arm/arm-tune.md @@ -32,5 +32,6 @@ cortexr4f,cortexr5,cortexr7, cortexm7,cortexm4,cortexm3, marvell_pj4,cortexa15cortexa7,cortexa17cortexa7, - cortexa53,cortexa57,cortexa57cortexa53 + cortexa53,cortexa57,xgene1, + cortexa57cortexa53 (const (symbol_ref ((enum attr_tune) arm_tune diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 8ca2dd8..14c8a87 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -1903,6 +1903,25 @@ const struct tune_params arm_cortex_a57_tune = ARM_FUSE_MOVW_MOVT /* Fuseable pairs of instructions. */ }; +const struct tune_params arm_xgene1_tune = +{ + arm_9e_rtx_costs, + xgene1_extra_costs, + NULL,/* Scheduler cost adjustment. */ + 1, /* Constant limit. */ + 2, /* Max cond insns. */ + ARM_PREFETCH_NOT_BENEFICIAL, + false, /* Prefer constant pool. */ + arm_default_branch_cost, + true,/* Prefer LDRD/STRD. */ + {true, true},/* Prefer non short circuit. */ + arm_default_vec_cost, /* Vectorizer costs. */ + false, /* Prefer Neon for 64-bits bitops. */ + true, true, /* Prefer 32-bit encodings. */ + false, /* Prefer Neon for stringops. */ + 32 /* Maximum insns to inline memset. */ +}; + /* Branches can be dual-issued on Cortex-A5, so conditional execution is less appealing. Set max_insns_skipped to a low value. */ @@ -27066,6 +27085,9 @@ arm_issue_rate (void) { switch (arm_tune) { +case xgene1: + return 4; + case cortexa15: case cortexa57: return 3; diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md index c61057f..a3cbf3b 100644 --- a/gcc/config/arm/arm.md +++ b/gcc/config/arm/arm.md @@ -109,6 +109,11 @@ ;; given instruction does not shift one of its input operands. 
(define_attr shift (const_int 0)) +;; [For compatibility with AArch64 in pipeline models] +;; Attribute that specifies whether or not the instruction touches fp +;; registers. +(define_attr fp no,yes (const_string no)) + ; Floating Point Unit. If we only have floating point emulation, then there ; is no point in
Re: [COMMITTED] Merge libffi with upstream
Hello! Upstream libffi has added support for Go closures (using the static chain), and support for complex numbers. Perhaps less relevant is new support for arc, microblaze, moxie, nios, and or1k targets. Without additional changes for Go, this merge has little effect. Within the gcc tree libffi is primarily used by libjava. Tested with no regressions on {i686,x86_64,ppc64,s390x,aarch64,alpha}-linux. This patchset regressed libjava on -m32 x86_64-linux-gnu (Fedora 21): === libjava tests === Running target unix === libjava Summary for unix === Running target unix/-m32 FAIL: libjava.jar/TestClosureGC.jar execution - gij test FAIL: libjava.jar/simple.jar execution - gij test FAIL: PR15133 execution - gij test FAIL: PR18116 execution - gij test FAIL: PR28178 execution - gij test FAIL: bytebuffer execution - gij test FAIL: calls execution - gij test FAIL: cxxtest execution - gij test FAIL: directbuffer execution - gij test FAIL: field execution - gij test FAIL: final_method execution - gij test FAIL: findclass execution - gij test FAIL: findclass2 execution - gij test FAIL: iface execution - gij test FAIL: init execution - gij test FAIL: invoke execution - gij test FAIL: jniutf execution - gij test FAIL: martin execution - gij test FAIL: noclass execution - gij test FAIL: overload execution - gij test FAIL: pr11951 execution - gij test FAIL: pr18278 execution - gij test FAIL: pr23739 execution - gij test FAIL: register execution - gij test FAIL: register2 execution - gij test FAIL: simple_int execution - gij test FAIL: throwit execution - gij test FAIL: virtual execution - gij test FAIL: PR16923 run FAIL: pr29812 execution - gij test FAIL: getargssize run FAIL: getlocalvartable run FAIL: getstacktrace run FAIL: ExtraClassLoader execution - source compiled test FAIL: ExtraClassLoader -findirect-dispatch execution - source compiled test FAIL: ExtraClassLoader -O3 execution - source compiled test FAIL: ExtraClassLoader -O3 -findirect-dispatch execution - source compiled 
test FAIL: TestEarlyGC execution - source compiled test === libjava Summary for unix/-m32 === === libjava Summary === # of expected passes5092 # of unexpected failures38 # of expected failures8 # of untested testcases38 Uros.
Re: [testsuite] PATCH: Add check_effective_target_pie
On 01/12/15 12:29, H.J. Lu wrote: Is this an inherent restriction of -fPIE, or is it merely an implementation detail? If the latter, is that implementation detail a target issue? ie, could we have a target that supports profiling in conjunction with -fPIE? If so, then this test seems too restrictive. [hjl@gnu-6 tmp]$ gcc -pie -fPIE -pg h.c /usr/local/bin/ld: /usr/lib/gcc/x86_64-redhat-linux/4.8.3/../../../../lib64/gcrt1.o: relocation R_X86_64_32S against `__libc_csu_fini' can not be used when making a shared object; recompile with -fPIC /usr/lib/gcc/x86_64-redhat-linux/4.8.3/../../../../lib64/gcrt1.o: error adding symbols: Bad value collect2: error: ld returned 1 exit status [hjl@gnu-6 tmp]$ There is no crt1.o from glibc to support -pg -pie -fPIE. I don't know if other targets support -pie -fPIE. Can you please investigate the questions. Showing me the link failure doesn't really help much here. It tells me there's no crt1.o, but it says nothing about *why*. Is there inherently something about PIE/pg that makes them impossible to work together or is this an implementation detail? If the latter, then is the implementation detail a target issue or not. Jeff
[PATCH] Fix VRP ICE with -Wtype-limits (PR tree-optimization/64563)
Hi!

On the following testcase we ICE with -Os -Wtype-limits, as VR_UNDEFINED has NULL vr0->min and vr0->max. From what the code does I believe the code only means to handle VR_RANGE and not anything else.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2015-01-12  Jakub Jelinek  ja...@redhat.com

	PR tree-optimization/64563
	* tree-vrp.c (vrp_evaluate_conditional): Check for VR_RANGE
	instead of != VR_VARYING.

	* gcc.dg/pr64563.c: New test.

--- gcc/tree-vrp.c.jj 2015-01-09 21:59:29.0 +0100
+++ gcc/tree-vrp.c 2015-01-12 11:21:50.363521200 +0100
@@ -7545,7 +7545,7 @@ vrp_evaluate_conditional (enum tree_code
       tree type = TREE_TYPE (op0);
       value_range_t *vr0 = get_value_range (op0);
 
-      if (vr0->type != VR_VARYING
+      if (vr0->type == VR_RANGE
 	  && INTEGRAL_TYPE_P (type)
 	  && vrp_val_is_min (vr0->min)
 	  && vrp_val_is_max (vr0->max)
--- gcc/testsuite/gcc.dg/pr64563.c.jj 2015-01-12 11:24:07.595145836 +0100
+++ gcc/testsuite/gcc.dg/pr64563.c 2015-01-12 11:23:08.0 +0100
@@ -0,0 +1,14 @@
+/* PR tree-optimization/64563 */
+/* { dg-do compile } */
+/* { dg-options "-Os -Wtype-limits" } */
+
+int a, b, c, d, f;
+unsigned int e;
+
+void
+foo (void)
+{
+  d = b = (a != (e | 4294967288UL));
+  if (!d)
+    c = f || b;
+}

	Jakub
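[Editorial note] The fix above replaces a negative test (anything but VR_VARYING) with a positive one (exactly VR_RANGE), so that a range whose bounds are NULL is never dereferenced. A reduced sketch of that guard follows. The struct and the function covers_whole_type_p are simplified stand-ins invented for this illustration; GCC's real value_range_t stores trees, not pointers to long.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for GCC's value-range types.  */
enum value_range_type { VR_UNDEFINED, VR_RANGE, VR_ANTI_RANGE, VR_VARYING };

struct value_range
{
  enum value_range_type type;
  const long *min;   /* NULL for VR_UNDEFINED and VR_VARYING.  */
  const long *max;
};

/* Return 1 if VR is a plain range spanning [TYPE_MIN, TYPE_MAX].
   Only VR_RANGE is inspected, so NULL bounds are never dereferenced.
   Testing "type != VR_VARYING" instead would let VR_UNDEFINED through
   and crash on *vr->min.  */
static int
covers_whole_type_p (const struct value_range *vr,
                     long type_min, long type_max)
{
  return (vr->type == VR_RANGE
          && *vr->min == type_min
          && *vr->max == type_max);
}
```

The short-circuit evaluation of && is what makes the order of the checks matter: the type test has to come first.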
Re: LTO streaming of TARGET_OPTIMIZE_NODE
On 09 Jan 12:45, Jakub Jelinek wrote: --- gcc/cgraphunit.c.jj 2015-01-09 12:01:33.0 +0100 +++ gcc/cgraphunit.c2015-01-09 12:22:27.742692667 +0100 @@ -2108,11 +2108,14 @@ ipa_passes (void) if (g-have_offload) { section_name_prefix = OFFLOAD_SECTION_NAME_PREFIX; + lto_stream_offload_p = true; ipa_write_summaries (true); + lto_stream_offload_p = false; } if (flag_lto) { section_name_prefix = LTO_SECTION_NAME_PREFIX; + lto_stream_offload_p = false; ipa_write_summaries (false); } Now when we have a global flag, there is no longer need to pass a flag to ipa_write_summaries and to select_what_to_stream. Bootstrapped/regtested on x86_64-linux and i686-linux, OK for trunk? OK, Honza gcc/ * cgraphunit.c (ipa_passes): Remove argument from ipa_write_summaries. * lto-cgraph.c (select_what_to_stream): Remove argument, use lto_stream_offload_p instead. * lto-streamer.h (select_what_to_stream): Remove argument. * passes.c (ipa_write_summaries): Likewise. * tree-pass.h (ipa_write_summaries): Likewise. gcc/lto/ * lto-partition.c (lto_promote_cross_file_statics): Remove argument from select_what_to_stream. diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c index 149f447..1ef1b6c 100644 --- a/gcc/cgraphunit.c +++ b/gcc/cgraphunit.c @@ -2115,14 +2115,14 @@ ipa_passes (void) { section_name_prefix = OFFLOAD_SECTION_NAME_PREFIX; lto_stream_offload_p = true; - ipa_write_summaries (true); + ipa_write_summaries (); lto_stream_offload_p = false; } if (flag_lto) { section_name_prefix = LTO_SECTION_NAME_PREFIX; lto_stream_offload_p = false; - ipa_write_summaries (false); + ipa_write_summaries (); } } diff --git a/gcc/lto-cgraph.c b/gcc/lto-cgraph.c index 6c6501a..91be530 100644 --- a/gcc/lto-cgraph.c +++ b/gcc/lto-cgraph.c @@ -842,11 +842,11 @@ create_references (lto_symtab_encoder_t encoder, symtab_node *node) /* Select what needs to be streamed out. In regular lto mode stream everything. In offload lto mode stream only nodes marked as offloadable. 
*/ void -select_what_to_stream (bool offload_lto_mode) +select_what_to_stream (void) { struct symtab_node *snode; FOR_EACH_SYMBOL (snode) -snode-need_lto_streaming = !offload_lto_mode || snode-offloadable; +snode-need_lto_streaming = !lto_stream_offload_p || snode-offloadable; } /* Find all symbols we want to stream into given partition and insert them diff --git a/gcc/lto-streamer.h b/gcc/lto-streamer.h index 853630f..2d9f30c 100644 --- a/gcc/lto-streamer.h +++ b/gcc/lto-streamer.h @@ -837,7 +837,7 @@ bool referenced_from_this_partition_p (symtab_node *, bool reachable_from_this_partition_p (struct cgraph_node *, lto_symtab_encoder_t); lto_symtab_encoder_t compute_ltrans_boundary (lto_symtab_encoder_t encoder); -void select_what_to_stream (bool); +void select_what_to_stream (void); /* In options-save.c. */ void cl_target_option_stream_out (struct output_block *, struct bitpack_d *, diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c index 38809d2..c1179cb 100644 --- a/gcc/lto/lto-partition.c +++ b/gcc/lto/lto-partition.c @@ -973,7 +973,8 @@ lto_promote_cross_file_statics (void) gcc_assert (flag_wpa); - select_what_to_stream (false); + lto_stream_offload_p = false; + select_what_to_stream (); /* First compute boundaries. */ n_sets = ltrans_partitions.length (); diff --git a/gcc/passes.c b/gcc/passes.c index 52dc067..e78a325 100644 --- a/gcc/passes.c +++ b/gcc/passes.c @@ -2464,7 +2464,7 @@ ipa_write_summaries_1 (lto_symtab_encoder_t encoder) /* Write out summaries for all the nodes in the callgraph. 
*/ void -ipa_write_summaries (bool offload_lto_mode) +ipa_write_summaries (void) { lto_symtab_encoder_t encoder; int i, order_pos; @@ -2475,7 +2475,7 @@ ipa_write_summaries (bool offload_lto_mode) if ((!flag_generate_lto !flag_generate_offload) || seen_error ()) return; - select_what_to_stream (offload_lto_mode); + select_what_to_stream (); encoder = lto_symtab_encoder_new (false); diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h index 398ab83..9ff5bdc 100644 --- a/gcc/tree-pass.h +++ b/gcc/tree-pass.h @@ -603,7 +603,7 @@ extern void pass_fini_dump_file (opt_pass *); extern const char *get_current_pass_name (void); extern void print_current_pass (FILE *); extern void debug_pass (void); -extern void ipa_write_summaries (bool); +extern void ipa_write_summaries (void); extern void ipa_write_optimization_summaries (struct lto_symtab_encoder_d *); extern void ipa_read_summaries (void); extern void ipa_read_optimization_summaries (void); -- Ilya
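[Editorial note] The cleanup in this message relies on the caller setting a global flag before the call instead of passing a bool down through ipa_write_summaries. A reduced sketch of that pattern follows. The struct sym_node and the array-based signature are assumptions made to keep the example self-contained; the real select_what_to_stream takes no arguments and walks the symbol table with FOR_EACH_SYMBOL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the global flag introduced by the earlier patch.  */
static bool lto_stream_offload_p;

/* Hypothetical, reduced symbol node with only the two fields used here.  */
struct sym_node
{
  bool offloadable;
  bool need_lto_streaming;
};

/* In regular LTO mode mark every node for streaming; in offload mode
   mark only the offloadable ones.  The mode comes from the global
   flag rather than from a parameter.  */
static void
select_what_to_stream (struct sym_node *nodes, size_t n)
{
  for (size_t i = 0; i < n; i++)
    nodes[i].need_lto_streaming
      = !lto_stream_offload_p || nodes[i].offloadable;
}
```

The cost of this design is that every caller must leave the flag in a known state, which is why the patch sets lto_stream_offload_p explicitly in lto_promote_cross_file_statics before calling select_what_to_stream.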
[PATCH] Fix up some gcc.dg/vect/ testcases with -fpic (PR testsuite/64028)
Hi! Various gcc.dg/vect/ testcases now fail on the trunk with -fpic. The problem is that they expect that the global vars bind locally and vectorizer can increase their alignment, but with -fpic that does not work, as one can interpose them. Fixed by adding dg-add-options bind_pic_locally. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2015-01-12 Jakub Jelinek ja...@redhat.com PR testsuite/64028 * gcc.dg/vect/no-section-anchors-vect-31.c: Add dg-add-options bind_pic_locally. * gcc.dg/vect/no-section-anchors-vect-34.c: Likewise. * gcc.dg/vect/no-section-anchors-vect-36.c: Likewise. * gcc.dg/vect/no-section-anchors-vect-64.c: Likewise. * gcc.dg/vect/no-section-anchors-vect-65.c: Likewise. * gcc.dg/vect/no-section-anchors-vect-68.c: Likewise. * gcc.dg/vect/no-section-anchors-vect-69.c: Likewise. * gcc.dg/vect/slp-25.c: Likewise. * gcc.dg/vect/vect-109.c: Likewise. * gcc.dg/vect/vect-13.c: Likewise. * gcc.dg/vect/vect-17.c: Likewise. * gcc.dg/vect/vect-18.c: Likewise. * gcc.dg/vect/vect-19.c: Likewise. * gcc.dg/vect/vect-20.c: Likewise. * gcc.dg/vect/vect-21.c: Likewise. * gcc.dg/vect/vect-22.c: Likewise. * gcc.dg/vect/vect-27.c: Likewise. * gcc.dg/vect/vect-29.c: Likewise. * gcc.dg/vect/vect-2-big-array.c: Likewise. * gcc.dg/vect/vect-2.c: Likewise. * gcc.dg/vect/vect-3.c: Likewise. * gcc.dg/vect/vect-4.c: Likewise. * gcc.dg/vect/vect-5.c: Likewise. * gcc.dg/vect/vect-72.c: Likewise. * gcc.dg/vect/vect-73-big-array.c: Likewise. * gcc.dg/vect/vect-73.c: Likewise. * gcc.dg/vect/vect-77-global.c: Likewise. * gcc.dg/vect/vect-78-global.c: Likewise. * gcc.dg/vect/vect-7.c: Likewise. * gcc.dg/vect/vect-86.c: Likewise. * gcc.dg/vect/vect-align-1.c: Likewise. * gcc.dg/vect/vect-align-3.c: Likewise. * gcc.dg/vect/vect-all-big-array.c: Likewise. * gcc.dg/vect/vect-all.c: Likewise. * gcc.dg/vect/vect-multitypes-1.c: Likewise. * gcc.dg/vect/vect-multitypes-4.c: Likewise. * gcc.dg/vect/vect-peel-3.c: Likewise. * gcc.dg/vect/vect-peel-4.c: Likewise. 
* gcc.dg/vect/wrapv-vect-7.c: Likewise. --- gcc/testsuite/gcc.dg/vect/vect-18.c.jj 2008-09-05 12:54:35.0 +0200 +++ gcc/testsuite/gcc.dg/vect/vect-18.c 2015-01-12 13:54:28.201166746 +0100 @@ -1,4 +1,5 @@ /* { dg-require-effective-target vect_int } */ +/* { dg-add-options bind_pic_locally } */ #include stdarg.h #include tree-vect.h --- gcc/testsuite/gcc.dg/vect/vect-17.c.jj 2010-11-03 16:58:21.0 +0100 +++ gcc/testsuite/gcc.dg/vect/vect-17.c 2015-01-12 13:54:22.394268074 +0100 @@ -1,4 +1,5 @@ /* { dg-require-effective-target vect_int } */ +/* { dg-add-options bind_pic_locally } */ #include stdarg.h #include tree-vect.h --- gcc/testsuite/gcc.dg/vect/vect-4.c.jj 2008-09-05 12:54:35.0 +0200 +++ gcc/testsuite/gcc.dg/vect/vect-4.c 2015-01-12 13:55:08.322466645 +0100 @@ -1,4 +1,5 @@ /* { dg-require-effective-target vect_float } */ +/* { dg-add-options bind_pic_locally } */ #include stdarg.h #include tree-vect.h --- gcc/testsuite/gcc.dg/vect/vect-2-big-array.c.jj 2011-12-11 22:02:36.043642629 +0100 +++ gcc/testsuite/gcc.dg/vect/vect-2-big-array.c2015-01-12 13:54:59.260624770 +0100 @@ -1,4 +1,5 @@ /* { dg-require-effective-target vect_int } */ +/* { dg-add-options bind_pic_locally } */ #include stdarg.h #include tree-vect.h --- gcc/testsuite/gcc.dg/vect/vect-2.c.jj 2012-03-20 08:51:25.653267483 +0100 +++ gcc/testsuite/gcc.dg/vect/vect-2.c 2015-01-12 13:55:01.959577675 +0100 @@ -1,4 +1,5 @@ /* { dg-require-effective-target vect_int } */ +/* { dg-add-options bind_pic_locally } */ #include stdarg.h #include tree-vect.h --- gcc/testsuite/gcc.dg/vect/vect-77-global.c.jj 2009-05-13 08:42:37.0 +0200 +++ gcc/testsuite/gcc.dg/vect/vect-77-global.c 2015-01-12 13:55:23.363204190 +0100 @@ -1,4 +1,5 @@ /* { dg-require-effective-target vect_int } */ +/* { dg-add-options bind_pic_locally } */ #include stdarg.h #include tree-vect.h --- gcc/testsuite/gcc.dg/vect/vect-27.c.jj 2009-11-04 18:36:22.0 +0100 +++ gcc/testsuite/gcc.dg/vect/vect-27.c 2015-01-12 13:54:52.915735486 +0100 @@ -1,4 +1,5 @@ 
/* { dg-require-effective-target vect_int } */ +/* { dg-add-options bind_pic_locally } */ #include stdarg.h #include tree-vect.h --- gcc/testsuite/gcc.dg/vect/vect-78-global.c.jj 2009-05-13 08:42:37.0 +0200 +++ gcc/testsuite/gcc.dg/vect/vect-78-global.c 2015-01-12 13:55:25.802161631 +0100 @@ -1,4 +1,5 @@ /* { dg-require-effective-target vect_int } */ +/* { dg-add-options bind_pic_locally } */ #include stdarg.h #include tree-vect.h --- gcc/testsuite/gcc.dg/vect/vect-5.c.jj 2008-09-05 12:54:36.0
[PATCH] Fix up computed goto on POINTERS_EXTEND_UNSIGNED targets (PR middle-end/63974)
Hi!

The 991213-3.c testcase ICEs on aarch64-linux with -mabi=ilp32 since wide-int merge. The problem is that x = convert_memory_address (Pmode, x) is used twice on a VOIDmode CONST_INT, which is wrong. For non-VOIDmode rtl the second convert_memory_address is a NOP, but for VOIDmode the second call treats the CONST_INT returned by the first call as if it was again ptr_mode, rather than Pmode. On aarch64-linux in particular, the constant is zero-extended from SImode to DImode in the first call, so it is not valid SImode CONST_INT any longer.

emit_indirect_jump always calls convert_memory_address (Pmode, ...) on the operand in optabs.c when handling EXPAND_ADDRESS case in maybe_legitimize_operand, so the first convert_memory_address is both unnecessary and harmful.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux (which do not define POINTERS_EXTEND_UNSIGNED) and tested on the problematic testcase with aarch64-linux cross. Can anyone with easy access to POINTERS_EXTEND_UNSIGNED targets (aarch64-linux ilp32, x86_64 -mx32, ia64-hpux) please test this?

Ok for trunk if it works there?

2015-01-12  Jakub Jelinek  ja...@redhat.com

	PR middle-end/63974
	* cfgexpand.c (expand_computed_goto): Don't call
	convert_memory_address here.

--- gcc/cfgexpand.c.jj 2015-01-09 21:59:54.0 +0100
+++ gcc/cfgexpand.c 2015-01-12 14:41:35.210705174 +0100
@@ -3060,8 +3060,6 @@ expand_computed_goto (tree exp)
 {
   rtx x = expand_normal (exp);
 
-  x = convert_memory_address (Pmode, x);
-
   do_pending_stack_adjust ();
   emit_indirect_jump (x);
 }

	Jakub
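[Editorial note] The reason a zero-extended constant stops being a valid 32-bit constant can be shown with plain integers. The sketch below assumes that a mode-less integer constant counts as valid for a 32-bit mode only when it equals the sign extension of its low 32 bits; both helper names are hypothetical and model the arithmetic only, not GCC's RTL API.

```c
#include <assert.h>
#include <stdint.h>

/* Zero-extend the low 32 bits of V, as a 32-bit pointer would be when
   widened to 64 bits on a POINTERS_EXTEND_UNSIGNED target.  */
static int64_t
zero_extend_32 (int64_t v)
{
  return (int64_t) (uint32_t) v;
}

/* Assumption of this sketch: V is a valid 32-bit-mode constant only if
   it fits the signed 32-bit range, i.e. it is already sign-extended.  */
static int
valid_32bit_const_p (int64_t v)
{
  return v >= INT32_MIN && v <= INT32_MAX;
}
```

For an address constant with the top bit set, such as the sign-extended value -16, the first widening produces 0xfffffff0, which no longer passes the 32-bit validity check. A second conversion that still assumes the 32-bit mode therefore starts from an invalid input, which matches the failure described in the message.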
[PATCH] Fix PR64461, Incorrect code on coldfire targets
As suggested by Andreas in the PR, the simplest fix for this problem is to disable the various trunc* patterns for TARGET_COLDFIRE. That's precisely what this patch does.

Built cross compilers with and without the m68k.md hunk. Verified the test failed without the m68k.md hunk and passed with the m68k.md hunk.

For fun I've got an m68k bootstrap of the trunk running. I don't expect it to finish for at least a week or so, assuming it runs to completion.

Installed on the trunk (in separate commits due to stupidity on my part).

Jeff

	PR target/64461
	* config/m68k/m68k.md (truncsiqi2): Disable for TARGET_COLDFIRE.
	(trunchiqi2, truncsihi2): Similarly.

	PR target/64461
	* gcc.target/m68k/pr64461.c: New test.

diff --git a/gcc/config/m68k/m68k.md b/gcc/config/m68k/m68k.md
index 2783a8f..2a314c3 100644
--- a/gcc/config/m68k/m68k.md
+++ b/gcc/config/m68k/m68k.md
@@ -1572,7 +1572,7 @@
   [(set (match_operand:QI 0 "nonimmediate_operand" "=dm,d")
 	(truncate:QI
 	 (match_operand:SI 1 "general_src_operand" "doJS,i")))]
-  ""
+  "!TARGET_COLDFIRE"
 {
   if (GET_CODE (operands[0]) == REG)
     {
@@ -1590,7 +1590,7 @@
   [(set (match_operand:QI 0 "nonimmediate_operand" "=dm,d")
 	(truncate:QI
 	 (match_operand:HI 1 "general_src_operand" "doJS,i")))]
-  ""
+  "!TARGET_COLDFIRE"
 {
   if (GET_CODE (operands[0]) == REG
       && (GET_CODE (operands[1]) == MEM
@@ -1617,7 +1617,7 @@
   [(set (match_operand:HI 0 "nonimmediate_operand" "=dm,d")
 	(truncate:HI
 	 (match_operand:SI 1 "general_src_operand" "roJS,i")))]
-  ""
+  "!TARGET_COLDFIRE"
 {
   if (GET_CODE (operands[0]) == REG)
     {
diff --git a/gcc/testsuite/gcc.target/m68k/pr64461.c b/gcc/testsuite/gcc.target/m68k/pr64461.c
new file mode 100644
index 000..dd70355
--- /dev/null
+++ b/gcc/testsuite/gcc.target/m68k/pr64461.c
@@ -0,0 +1,16 @@
+/* { dg-do assemble } */
+/* { dg-options "-mcpu=5235 -Os" } */
+
+typedef struct rtems_rfs_block_map_s
+{
+  long unsigned int blocks[(5)];
+} rtems_rfs_block_map;
+
+extern int foo (void);
+
+int
+rtems_rfs_block_map_indirect_alloc (rtems_rfs_block_map *map,
+				    unsigned char* buffer, int b)
+{
+  (buffer + b * 4)[3] = (unsigned char) map->blocks[b];
+}
[PATCH] Optimize (x % 5) % 5 in VRP (PR tree-optimization/64454)
Hi!

This patch optimizes away TRUNC_MOD_EXPR by constant second argument (if not 0 and not type's minimum) if the range of the first argument is already known to be [-op1 + 1, op1 - 1] or its subset.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2015-01-12  Jakub Jelinek  ja...@redhat.com

	PR tree-optimization/64454
	* tree-vrp.c (simplify_div_or_mod_using_ranges): Optimize
	op0 % op1 into op0 if op0 is in range [-op1 + 1, op1 - 1]
	for signed or [0, op1 - 1] for unsigned modulo.
	(simplify_stmt_using_ranges): Call simplify_div_or_mod_using_ranges
	even if op1 does not satisfy integer_pow2p.

	* gcc.dg/pr64454.c: New test.

--- gcc/tree-vrp.c.jj 2015-01-12 11:21:50.0 +0100
+++ gcc/tree-vrp.c 2015-01-12 18:45:23.349335792 +0100
@@ -8998,7 +8998,11 @@ simplify_truth_ops_using_ranges (gimple_
 
 /* Simplify a division or modulo operator to a right shift or bitwise and
    if the first operand is unsigned or is greater
-   than zero and the second operand is an exact power of two.  */
+   than zero and the second operand is an exact power of two.
+   For TRUNC_MOD_EXPR op0 % op1 with constant op1, optimize it
+   into just op0 if op0's range is known to be a subset of
+   [-op1 + 1, op1 - 1] for signed and [0, op1 - 1] for unsigned
+   modulo.  */
 
 static bool
 simplify_div_or_mod_using_ranges (gimple stmt)
@@ -9007,7 +9011,30 @@ simplify_div_or_mod_using_ranges (gimple
   tree val = NULL;
   tree op0 = gimple_assign_rhs1 (stmt);
   tree op1 = gimple_assign_rhs2 (stmt);
-  value_range_t *vr = get_value_range (gimple_assign_rhs1 (stmt));
+  value_range_t *vr = get_value_range (op0);
+
+  if (rhs_code == TRUNC_MOD_EXPR
+      && TREE_CODE (op1) == INTEGER_CST
+      && tree_int_cst_sgn (op1) == 1
+      && range_int_cst_p (vr)
+      && tree_int_cst_lt (vr->max, op1))
+    {
+      if (TYPE_UNSIGNED (TREE_TYPE (op0))
+	  || tree_int_cst_sgn (vr->min) >= 0
+	  || tree_int_cst_lt (fold_unary (NEGATE_EXPR, TREE_TYPE (op1), op1),
+			      vr->min))
+	{
+	  /* If op0 already has the range op0 % op1 has,
+	     then TRUNC_MOD_EXPR won't change anything.  */
+	  gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
+	  gimple_assign_set_rhs_from_tree (&gsi, op0);
+	  update_stmt (stmt);
+	  return true;
+	}
+    }
+
+  if (!integer_pow2p (op1))
+    return false;
 
   if (TYPE_UNSIGNED (TREE_TYPE (op0)))
     {
@@ -9880,11 +9907,14 @@ simplify_stmt_using_ranges (gimple_stmt_
 
       /* Transform TRUNC_DIV_EXPR and TRUNC_MOD_EXPR into RSHIFT_EXPR
 	 and BIT_AND_EXPR respectively if the first operand is greater
-	 than zero and the second operand is an exact power of two.  */
+	 than zero and the second operand is an exact power of two.
+	 Also optimize TRUNC_MOD_EXPR away if the second operand is
+	 constant and the first operand already has the right value
+	 range.  */
 	case TRUNC_DIV_EXPR:
 	case TRUNC_MOD_EXPR:
-	  if (INTEGRAL_TYPE_P (TREE_TYPE (rhs1))
-	      && integer_pow2p (gimple_assign_rhs2 (stmt)))
+	  if (TREE_CODE (rhs1) == SSA_NAME
+	      && INTEGRAL_TYPE_P (TREE_TYPE (rhs1)))
 	    return simplify_div_or_mod_using_ranges (stmt);
 	  break;
 
--- gcc/testsuite/gcc.dg/pr64454.c.jj 2015-01-12 18:49:32.270004660 +0100
+++ gcc/testsuite/gcc.dg/pr64454.c 2015-01-12 18:50:43.964757197 +0100
@@ -0,0 +1,43 @@
+/* PR tree-optimization/64454 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-vrp1" } */
+
+unsigned
+f1 (unsigned x)
+{
+  return (x % 5) % 5;
+}
+
+int
+f2 (int x)
+{
+  return (x % 5) % 5;
+}
+
+int
+f3 (int x)
+{
+  return (x % -5) % -5;
+}
+
+unsigned
+f4 (unsigned x)
+{
+  return (x % 5) % 6;
+}
+
+int
+f5 (int x)
+{
+  return (x % 5) % 6;
+}
+
+int
+f6 (int x)
+{
+  return (x % -5) % -6;
+}
+
+/* { dg-final { scan-tree-dump-times "% 5" 6 "vrp1" } } */
+/* { dg-final { scan-tree-dump-times "% 6" 0 "vrp1" } } */
+/* { dg-final { cleanup-tree-dump "vrp1" } } */

	Jakub
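[Editorial note] The condition under which the patch drops the modulo can be restated on plain integers. The sketch below mirrors the checks in the new code (positive constant op1, max below op1, and for signed operands min above -op1) using a hypothetical helper mod_is_noop_p; it models the decision only and is not GCC code.

```c
#include <assert.h>

/* Return 1 if X % OP1 can be replaced by X for every X in [MIN, MAX].
   OP1 must be positive, matching the patch's
   tree_int_cst_sgn (op1) == 1 check; MAX < OP1 matches
   tree_int_cst_lt (vr->max, op1).  For signed operands MIN must also
   be greater than -OP1, because C's truncating % keeps the sign of
   the dividend.  */
static int
mod_is_noop_p (long min, long max, long op1, int is_unsigned)
{
  if (op1 <= 0 || max >= op1)
    return 0;
  return is_unsigned || min >= 0 || -op1 < min;
}
```

This is why (x % 5) % 6 in the testcase loses its outer modulo: the inner result lies in [-4, 4], which is inside [-5, 5].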
Re: [PATCH] Fix up some gcc.dg/vect/ testcases with -fpic (PR testsuite/64028)
On 01/12/15 13:08, Jakub Jelinek wrote: Hi! Various gcc.dg/vect/ testcases now fail on the trunk with -fpic. The problem is that they expect that the global vars bind locally and vectorizer can increase their alignment, but with -fpic that does not work, as one can interpose them. Fixed by adding dg-add-options bind_pic_locally. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2015-01-12 Jakub Jelinek ja...@redhat.com PR testsuite/64028 * gcc.dg/vect/no-section-anchors-vect-31.c: Add dg-add-options bind_pic_locally. * gcc.dg/vect/no-section-anchors-vect-34.c: Likewise. * gcc.dg/vect/no-section-anchors-vect-36.c: Likewise. * gcc.dg/vect/no-section-anchors-vect-64.c: Likewise. * gcc.dg/vect/no-section-anchors-vect-65.c: Likewise. * gcc.dg/vect/no-section-anchors-vect-68.c: Likewise. * gcc.dg/vect/no-section-anchors-vect-69.c: Likewise. * gcc.dg/vect/slp-25.c: Likewise. * gcc.dg/vect/vect-109.c: Likewise. * gcc.dg/vect/vect-13.c: Likewise. * gcc.dg/vect/vect-17.c: Likewise. * gcc.dg/vect/vect-18.c: Likewise. * gcc.dg/vect/vect-19.c: Likewise. * gcc.dg/vect/vect-20.c: Likewise. * gcc.dg/vect/vect-21.c: Likewise. * gcc.dg/vect/vect-22.c: Likewise. * gcc.dg/vect/vect-27.c: Likewise. * gcc.dg/vect/vect-29.c: Likewise. * gcc.dg/vect/vect-2-big-array.c: Likewise. * gcc.dg/vect/vect-2.c: Likewise. * gcc.dg/vect/vect-3.c: Likewise. * gcc.dg/vect/vect-4.c: Likewise. * gcc.dg/vect/vect-5.c: Likewise. * gcc.dg/vect/vect-72.c: Likewise. * gcc.dg/vect/vect-73-big-array.c: Likewise. * gcc.dg/vect/vect-73.c: Likewise. * gcc.dg/vect/vect-77-global.c: Likewise. * gcc.dg/vect/vect-78-global.c: Likewise. * gcc.dg/vect/vect-7.c: Likewise. * gcc.dg/vect/vect-86.c: Likewise. * gcc.dg/vect/vect-align-1.c: Likewise. * gcc.dg/vect/vect-align-3.c: Likewise. * gcc.dg/vect/vect-all-big-array.c: Likewise. * gcc.dg/vect/vect-all.c: Likewise. * gcc.dg/vect/vect-multitypes-1.c: Likewise. * gcc.dg/vect/vect-multitypes-4.c: Likewise. * gcc.dg/vect/vect-peel-3.c: Likewise. 
* gcc.dg/vect/vect-peel-4.c: Likewise. * gcc.dg/vect/wrapv-vect-7.c: Likewise. OK. jeff
Re: [PATCH] Fix up computed goto on POINTERS_EXTEND_UNSIGNED targets (PR middle-end/63974)
On 01/12/15 13:19, Jakub Jelinek wrote: Hi! The 991213-3.c testcase ICEs on aarch64-linux with -mabi=ilp32 since wide-int merge. The problem is that x = convert_memory_address (Pmode, x) is used twice on a VOIDmode CONST_INT, which is wrong. For non-VOIDmode rtl the second convert_memory_address is a NOP, but for VOIDmode the second call treats the CONST_INT returned by the first call as if it was again ptr_mode, rather than Pmode. On aarch64-linux in particular, the constant is zero-extended from SImode to DImode in the first call, so it is not valid SImode CONST_INT any longer. emit_indirect_jump always calls convert_memory_address (Pmode, ...) on the operand in optabs.c when handling EXPAND_ADDRESS case in maybe_legitimize_operand, so the first convert_memory_address is both unnecessary and harmful. Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux (which do not define POINTERS_EXTEND_UNSIGNED) and tested on the problematic testcase with aarch64-linux cross. Can anyone with easy access to POINTERS_EXTEND_UNSIGNED targets (aarch64-linux ilp32, x86_64 -mx32, ia64-hpux) please test this? Ok for trunk if it works there? 2015-01-12 Jakub Jelinek ja...@redhat.com PR middle-end/63974 * cfgexpand.c (expand_computed_goto): Don't call convert_memory_address here. OK. jeff
Open Issues in the TSAN Runtime
Hi Jakub,

I am asking if we plan to merge the TSAN runtime from the LLVM tree soon, or if it is better to cherry pick specific changes from there.

I am especially interested in fixing these two issues, but there may be other important improvements too:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64350 : TSAN fails after stress-testing for a while
was fixed by
http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/sanitizer_common/sanitizer_deadlock_detector.h?r1=224518&r2=224517&pathrev=224518
http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/sanitizer_common/sanitizer_deadlock_detector.h?r1=224519&r2=224518&pathrev=224519

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63251 : tsan: corrupted shadow stack
was fixed by
http://llvm.org/viewvc/llvm-project?view=revision&revision=224702
http://llvm.org/viewvc/llvm-project?view=revision&revision=224834

Thanks
Bernd.
Re: [Fortran, Patch] PR60334 - Segmentation fault on character pointer assignments
Hi Andre, + if (INDIRECT_REF_P (parmse.string_length)) +/* In chains of functions/procedure calls the string_length already + is a pointer to the variable holding the length. Therefore + remove the deref on call. */ +parmse.string_length = TREE_OPERAND (parmse.string_length, 0); This is OK but I would use instead: + if (POINTER_TYPE_P (parmse.string_length)) +/* In chains of functions/procedure calls the string_length already + is a pointer to the variable holding the length. Therefore + remove the deref on call. */ +parmse.string_length = build_fold_indirect_ref (parmse.string_length); If you look in ~/gcc/fold-const.c:15751, you will see that TREE_OPERAND (parmse.string_length, 0) but that it is preceded by cleaning up of NOOPS and, in any case, its usage will preserve the standard API just in case the internals change :-) of course, using TREE_OPERAND (xxx, 0) in the various fortran class functions makes such an assumption ;-) Apart from that, the patch is fine. I'll have a session of doing some commits later this week and will do this patch at that time. Cheers Paul On 11 January 2015 at 16:21, Andre Vehreschild ve...@gmx.de wrote: Hi Paul, thanks for the review. I do not have commits rights. Unfortunately is the patch not ok. I figured today, that it needs an extension when function calls that return deferred char len arrays are nested. In this special case the string length would have been lost. The attached extended version fixes this issue. Sorry for the duplicate work. Bootstraps and regtests ok on x86_64-linux-gnu. Regards, Andre On Sun, 11 Jan 2015 16:11:10 +0100 Paul Richard Thomas paul.richard.tho...@gmail.com wrote: Dear Andre, This is OK for trunk. I have not been keeping track of whether or not you have commit rights yet. If not, I will get to it sometime this week. Thanks for the patch. Paul On 10 January 2015 at 15:59, Andre Vehreschild ve...@gmx.de wrote: Hi all, attached patch fixes the bug reported in pr 60334. 
The issue here was that the function's result is (a pointer to) a deferred length char array. The string length for the result value was wrapped in a local variable, whose value was never written back to the string length of the result. This led the calling routine to take a random value as the length of the result, leading to a crash. This patch addresses the issue by preventing the instantiation of the local var and instead using a reference to the parameter. This not only saves one value on the stack but also, because for small functions the compiler will hold all parameters in registers at a significant level of optimization, avoids all the overhead of memory access (I hope :-). Bootstraps and regtests ok on x86_64-linux-gnu. - Andre -- Andre Vehreschild * Kreuzherrenstr. 8 * 52062 Aachen Tel.: +49 241 9291018 * Email: ve...@gmx.de -- Outside of a dog, a book is a man's best friend. Inside of a dog it's too dark to read. Groucho Marx
Re: [PATCH] Use ldexp instead of scalbln for portability (PR other/64370)
On 01/12/15 13:10, Jakub Jelinek wrote: Hi! As mentioned in the PR, HPUX doesn't have scalbln, but does have ldexp and that function is already used in gcj-dump, so supposedly it is more portable to use ldexp. Also in glibc it is defined in libc in addition to libm. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2015-01-12 Jakub Jelinek ja...@redhat.com PR other/64370 * sreal.c (sreal::to_double): Use ldexp instead of scalbnl. OK. Got to love HPUX. jeff
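As a rough illustration of the portability point, the sketch below scales a value by a power of two using only ldexp. The helper name and the significand/exponent representation are assumptions made for the example, not the actual sreal code.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical helper, not the actual sreal code: convert a value held as
   an integer significand and a binary exponent into a double.  */
double
sig_exp_to_double (long sig, int exp)
{
  /* ldexp (x, n) returns x * 2^n.  It has been in <math.h> since C90,
     whereas scalbln only arrived with C99.  */
  return ldexp ((double) sig, exp);
}

/* Spot checks; every value here is exactly representable as a double.  */
void
check_sig_exp_to_double (void)
{
  assert (sig_exp_to_double (3, 4) == 48.0);
  assert (sig_exp_to_double (1, -1) == 0.5);
  assert (sig_exp_to_double (5, 0) == 5.0);
}
```

Because ldexp takes an int exponent while scalbln takes a long, a caller with a wider exponent type would need to check the range before narrowing.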
Re: [PATCH] Fix -mstack-arg-probe (PR target/64513)
On 01/12/15 13:26, Jakub Jelinek wrote: Hi! For -mstack-arg-probe we push %rax and/or %r10 in the prologue, and mark that insn as RTX_FRAME_RELATED_P. But that means that the dwarf2 pass also considers that the %rax/%r10 registers, which are call used, to be saved in the unwind info, but they are never restored, which makes the dwarf2 pass ICE on it. Fixed by letting the dwarf2 pass know just that those instructions decrease stack pointer. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2015-01-12 Jakub Jelinek ja...@redhat.com PR target/64513 * config/i386/i386.c (ix86_expand_prologue): Add REG_FRAME_RELATED_EXPR to %rax and %r10 pushes. * gcc.target/i386/pr64513.c: New test. OK. jeff
Re: Open Issues in the TSAN Runtime
On Mon, Jan 12, 2015 at 09:53:16PM +0100, Bernd Edlinger wrote: I am asking if we plan to merge the TSAN runtime from the LLVM tree soon, or if it is better to cherry pick No. specific changes from there. Yes, I'll try to cherry-pick those tomorrow. I am especially interested in fixing these two issues, but there may be other important improvements too:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64350 : TSAN fails after stress-testing for a while
was fixed by
http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/sanitizer_common/sanitizer_deadlock_detector.h?r1=224518&r2=224517&pathrev=224518
http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/lib/sanitizer_common/sanitizer_deadlock_detector.h?r1=224519&r2=224518&pathrev=224519

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63251 : tsan: corrupted shadow stack
was fixed by
http://llvm.org/viewvc/llvm-project?view=revision&revision=224702
http://llvm.org/viewvc/llvm-project?view=revision&revision=224834

Jakub
[PATCH][4.9] PR 64569 - Backport support for MIPS binutils 2.25
This is a minimal backport of features added to GCC 5 to enable use of binutils 2.25 with GCC 4.9 for MIPS soft-float builds. Further details in the PR: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64569 The commits which are being backported are listed below (the last one is posted but not committed yet). r213870: Fix mips16.S for soft-float r213872: Pass -m(soft|hard|single|double)-float via ASM_SPEC r217446: Implement o32 FPXX (very minimal backport) r217939: Update configure check for HAVE_MIPS_DOT_MODULE r??: Make ASM_SPEC changes conditional on HAVE_MIPS_DOT_MODULE gcc/ * config.in [!USED_FOR_TARGET] (HAVE_AS_DOT_MODULE): Undefine. * config/mips/mips.h (FP_ASM_SPEC): New macro. (ASM_SPEC): Use FP_ASM_SPEC. * configure.ac (HAVE_AS_DOT_MODULE): Detect support for .module and FPXX extensions. libgcc/ * config/mips/mips16.S: Do not build for soft-float. Once this is done I will do the same backport for GCC 4.8. Tested to check that soft-float builds work with binutils 2.25 and the floating-point options are not passed for binutils 2.24. Thanks, Matthew --- gcc/config.in | 6 ++ gcc/config/mips/mips.h | 19 ++- gcc/configure | 32 gcc/configure.ac| 7 +++ libgcc/config/mips/mips16.S | 10 +++--- 5 files changed, 70 insertions(+), 4 deletions(-) diff --git a/gcc/config.in b/gcc/config.in index 1e85325..013a606 100644 --- a/gcc/config.in +++ b/gcc/config.in @@ -447,6 +447,12 @@ #endif +/* Define if the assembler understands .module. */ +#ifndef USED_FOR_TARGET +#undef HAVE_AS_DOT_MODULE +#endif + + /* Define if your assembler supports the -no-mul-bug-abort option. */ #ifndef USED_FOR_TARGET #undef HAVE_AS_NO_MUL_BUG_ABORT_OPTION diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h index a786d4c..ff88d98 100644 --- a/gcc/config/mips/mips.h +++ b/gcc/config/mips/mips.h @@ -1163,6 +1163,22 @@ struct mips_cpu_info { #define SUBTARGET_ASM_SPEC #endif +/* FP_ASM_SPEC represents the floating-point options that must be passed + to the assembler when FPXX support exists. 
Prior to that point the + assembler could accept the options but were not required for + correctness. We only add the options when absolutely necessary + because passing -msoft-float to the assembler will cause it to reject + all hard-float instructions which may require some user code to be + updated. */ + +#ifdef HAVE_AS_DOT_MODULE +#define FP_ASM_SPEC \ +%{mhard-float} %{msoft-float} \ +%{msingle-float} %{mdouble-float} +#else +#define FP_ASM_SPEC +#endif + #undef ASM_SPEC #define ASM_SPEC \ %{G*} %(endian_spec) %{mips1} %{mips2} %{mips3} %{mips4} \ @@ -1188,7 +1204,8 @@ struct mips_cpu_info { %{mfp32} %{mfp64} %{mnan=*} \ %{mshared} %{mno-shared} \ %{msym32} %{mno-sym32} \ -%{mtune=*} \ +%{mtune=*} \ +FP_ASM_SPEC \ %(subtarget_asm_spec) /* Extra switches sometimes passed to the linker. */ diff --git a/gcc/configure b/gcc/configure index 291e463..d5b6879 100755 --- a/gcc/configure +++ b/gcc/configure @@ -26140,6 +26140,38 @@ $as_echo #define HAVE_AS_GNU_ATTRIBUTE 1 confdefs.h fi +{ $as_echo $as_me:${as_lineno-$LINENO}: checking assembler for .module support 5 +$as_echo_n checking assembler for .module support... 6; } +if test ${gcc_cv_as_mips_dot_module+set} = set; then : + $as_echo_n (cached) 6 +else + gcc_cv_as_mips_dot_module=no + if test x$gcc_cv_as != x; then +$as_echo '.module mips2 + .module fp=xx' conftest.s +if { ac_try='$gcc_cv_as $gcc_cv_as_flags -32 -o conftest.o conftest.s 5' + { { eval echo \\$as_me\:${as_lineno-$LINENO}: \$ac_try\; } 5 + (eval $ac_try) 25 + ac_status=$? + $as_echo $as_me:${as_lineno-$LINENO}: \$? 
= $ac_status 5 + test $ac_status = 0; }; } +then + gcc_cv_as_mips_dot_module=yes +else + echo configure: failed program was 5 + cat conftest.s 5 +fi +rm -f conftest.o conftest.s + fi +fi +{ $as_echo $as_me:${as_lineno-$LINENO}: result: $gcc_cv_as_mips_dot_module 5 +$as_echo $gcc_cv_as_mips_dot_module 6; } +if test $gcc_cv_as_mips_dot_module = yes; then + +$as_echo #define HAVE_AS_DOT_MODULE 1 confdefs.h + +fi + { $as_echo $as_me:${as_lineno-$LINENO}: checking assembler for .micromips support 5 $as_echo_n checking assembler for .micromips support... 6; } if test ${gcc_cv_as_micromips_support+set} = set; then : diff --git a/gcc/configure.ac b/gcc/configure.ac index b9a3799..ded0c48 100644 --- a/gcc/configure.ac +++ b/gcc/configure.ac @@ -4251,6 +4251,13 @@ LCF0: [AC_DEFINE(HAVE_AS_GNU_ATTRIBUTE, 1, [Define if your assembler supports .gnu_attribute.])]) +gcc_GAS_CHECK_FEATURE([.module support], + gcc_cv_as_mips_dot_module,,[-32], + [.module mips2 + .module fp=xx],, + [AC_DEFINE(HAVE_AS_DOT_MODULE, 1, + [Define if your assembler supports .module.])]) +
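The conditional FP_ASM_SPEC idea in the mips.h hunk can be sketched in isolation as follows. This is a simplified stand-in, not the real mips.h: the macro names ending in _SKETCH and both helper functions are invented for the example, and the spec string is cut down to a few fragments.

```c
#include <assert.h>
#include <string.h>

/* HAVE_AS_DOT_MODULE would normally come from configure; here it is only
   defined if the compiler command line defines it.  */
#ifdef HAVE_AS_DOT_MODULE
#define FP_ASM_SPEC_SKETCH \
  "%{mhard-float} %{msoft-float} %{msingle-float} %{mdouble-float}"
#else
#define FP_ASM_SPEC_SKETCH ""
#endif

/* Adjacent string literals are concatenated, so the floating-point
   fragment simply drops out when it is empty.  */
#define ASM_SPEC_SKETCH \
  "%{mtune=*} " FP_ASM_SPEC_SKETCH " %(subtarget_asm_spec)"

/* Hypothetical accessor for the assembled spec string.  */
const char *
asm_spec_sketch (void)
{
  return ASM_SPEC_SKETCH;
}

/* Hypothetical helper: nonzero if SPEC contains the fragment OPTION.  */
int
spec_mentions (const char *spec, const char *option)
{
  assert (spec != NULL && option != NULL);
  return strstr (spec, option) != NULL;
}
```

With an older assembler the float options are never forwarded, which matches the stated goal of not passing them for binutils 2.24.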
[PATCH] Fix -mstack-arg-probe (PR target/64513)
Hi!

For -mstack-arg-probe we push %rax and/or %r10 in the prologue, and mark that insn as RTX_FRAME_RELATED_P. But that means that the dwarf2 pass also considers that the %rax/%r10 registers, which are call used, to be saved in the unwind info, but they are never restored, which makes the dwarf2 pass ICE on it. Fixed by letting the dwarf2 pass know just that those instructions decrease stack pointer.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2015-01-12  Jakub Jelinek  ja...@redhat.com

	PR target/64513
	* config/i386/i386.c (ix86_expand_prologue): Add
	REG_FRAME_RELATED_EXPR to %rax and %r10 pushes.

	* gcc.target/i386/pr64513.c: New test.

--- gcc/config/i386/i386.c.jj 2015-01-09 22:00:02.0 +0100
+++ gcc/config/i386/i386.c 2015-01-12 17:13:21.342463547 +0100
@@ -11559,6 +11559,10 @@ ix86_expand_prologue (void)
 	      if (sp_is_cfa_reg)
 		m->fs.cfa_offset += UNITS_PER_WORD;
 	      RTX_FRAME_RELATED_P (insn) = 1;
+	      add_reg_note (insn, REG_FRAME_RELATED_EXPR,
+			    gen_rtx_SET (VOIDmode, stack_pointer_rtx,
+					 plus_constant (Pmode, stack_pointer_rtx,
+							-UNITS_PER_WORD)));
 	    }
 	}
 
@@ -11572,6 +11576,10 @@ ix86_expand_prologue (void)
 	      if (sp_is_cfa_reg)
 		m->fs.cfa_offset += UNITS_PER_WORD;
 	      RTX_FRAME_RELATED_P (insn) = 1;
+	      add_reg_note (insn, REG_FRAME_RELATED_EXPR,
+			    gen_rtx_SET (VOIDmode, stack_pointer_rtx,
+					 plus_constant (Pmode, stack_pointer_rtx,
+							-UNITS_PER_WORD)));
 	    }
 	}
 
--- gcc/testsuite/gcc.target/i386/pr64513.c.jj 2015-01-12 17:20:12.052330807 +0100
+++ gcc/testsuite/gcc.target/i386/pr64513.c 2015-01-12 17:20:02.0 +0100
@@ -0,0 +1,17 @@
+/* PR target/64513 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mstack-arg-probe" } */
+
+struct A {};
+struct B { struct A y; };
+int foo (struct A);
+
+int
+bar (int x)
+{
+  struct B b;
+  int c;
+  while (x--)
+    c = foo (b.y);
+  return c;
+}

	Jakub
Re: [PATCH] Fix VRP ICE with -Wtype-limits (PR tree-optimization/64563)
On 01/12/15 13:01, Jakub Jelinek wrote: Hi! On the following testcase we ICE with -Os -Wtype-limits, as VR_UNDEFINED has NULL vr0->min and vr0->max. From what the code does I believe the code only means to handle VR_RANGE and not anything else. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2015-01-12 Jakub Jelinek ja...@redhat.com PR tree-optimization/64563 * tree-vrp.c (vrp_evaluate_conditional): Check for VR_RANGE instead of != VR_VARYING. * gcc.dg/pr64563.c: New test. What about VR_ANTI_RANGE here? I'm pretty sure vrp_evaluate_conditional's children will use anti-ranges to simplify conditionals to constants. Though I guess that doesn't really matter when it comes to this warning because of the minmax testing... Nevermind. OK for the trunk. jeff
Re: [PATCH] Fix VRP ICE with -Wtype-limits (PR tree-optimization/64563)
On Mon, Jan 12, 2015 at 01:37:47PM -0700, Jeff Law wrote: On 01/12/15 13:01, Jakub Jelinek wrote: On the following testcase we ICE with -Os -Wtype-limits, as VR_UNDEFINED has NULL vr0->min and vr0->max. From what the code does I believe the code only means to handle VR_RANGE and not anything else. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2015-01-12 Jakub Jelinek ja...@redhat.com PR tree-optimization/64563 * tree-vrp.c (vrp_evaluate_conditional): Check for VR_RANGE instead of != VR_VARYING. * gcc.dg/pr64563.c: New test. What about VR_ANTI_RANGE here? I'm pretty sure vrp_evaluate_conditional's children will use anti-ranges to simplify conditionals to constants. Though I guess that doesn't really matter when it comes to this warning because of the minmax testing... Nevermind. It checks for the case where e.g. enum has a smaller range of values than the underlying integer type, and tests min and max of the range against the enum type's min/max. For VR_ANTI_RANGE, that would have to be ~[type_min, type_max], i.e. say that the value is always not in the right range (thus always undefined behavior), so something we don't have to bother with. And, to handle VR_ANTI_RANGE, the diagnostics would need to change. Jakub
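The shape of the fix can be sketched with a toy model. The struct and function below are invented for the example and are not GCC's actual data structures. Only the enumerator names mirror the ones discussed in the thread, and the equality test is an assumption about what "tests min and max of the range against the type's min/max" means.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of the range kinds named in the thread.  */
enum range_kind { VR_UNDEFINED, VR_RANGE, VR_ANTI_RANGE, VR_VARYING };

/* Toy value range: min and max are NULL for VR_UNDEFINED and VR_VARYING.  */
struct value_range
{
  enum range_kind kind;
  const long *min;
  const long *max;
};

/* Return true if VR is a plain range covering exactly
   [type_min, type_max].  Testing kind == VR_RANGE, rather than
   kind != VR_VARYING, keeps VR_UNDEFINED (NULL bounds) away from the
   dereferences below.  */
bool
range_spans_type (const struct value_range *vr, long type_min, long type_max)
{
  if (vr->kind != VR_RANGE)
    return false;
  assert (vr->min != NULL && vr->max != NULL);
  return *vr->min == type_min && *vr->max == type_max;
}
```

In this model an anti-range is simply rejected, in line with the remark that handling VR_ANTI_RANGE would require different diagnostics.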
Re: [PATCH] Optimize (x % 5) % 5 in VRP (PR tree-optimization/64454)
On 01/12/15 13:28, Jakub Jelinek wrote: Hi! This patch optimizes away TRUNC_MOD_EXPR by constant second argument (if not 0 and not type's minimum) if the range of the first argument is already known to be [-op1 + 1, op1 - 1] or its subset. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2015-01-12 Jakub Jelinek ja...@redhat.com PR tree-optimization/64454 * tree-vrp.c (simplify_div_or_mod_using_ranges): Optimize op0 % op1 into op0 if op0 is in range [-op1 + 1, op1 - 1] for signed or [0, op1 - 1] for unsigned modulo. (simplify_stmt_using_ranges): Call simplify_div_or_mod_using_ranges even if op1 does not satisfy integer_pow2p. * gcc.dg/pr64454.c: New test. OK. jeff
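The condition in the ChangeLog entry can be checked on a small scale. The sketch below is limited to the signed case with a positive constant divisor, which is a simplification for the example. The function names are made up and nothing here is taken from tree-vrp.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Return true if x % d can be replaced by x for every x in [lo, hi],
   i.e. if [lo, hi] is a subset of [-d + 1, d - 1].  Only d > 0 is
   modelled here.  */
bool
mod_is_redundant (int lo, int hi, int d)
{
  assert (d > 0);
  return lo >= -d + 1 && hi <= d - 1;
}

/* The expression from the subject line...  */
int
mod_twice (int x)
{
  return (x % 5) % 5;
}

/* ...and what it can be simplified to.  */
int
mod_once (int x)
{
  return x % 5;
}
```

The inner x % 5 always lies in [-4, 4] because C's % takes the sign of the dividend, so the outer % 5 meets the condition and can be dropped.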
Re: [PATCH] testsuite/lib/target-supports.exp: Fix check_effective_target_lto
On 01/11/15 12:26, Ilya Verbin wrote: On 09 Jan 10:29, Thomas Schwinge wrote: As this was the only use of ENABLE_LTO in the testsuite, I suggest to also remove it from the gcc/Makefile.in:site.exp rule. Done. Here is an updated and retested patch. OK for trunk? gcc/ * Makefile.in (site.exp): Do not set ENABLE_LTO. gcc/testsuite/ * lib/target-supports.exp (check_effective_target_lto): Check for -flto option support instead of ENABLE_LTO from Makefile. OK. Jeff
Re: [PATCH] Fix REE for vector modes (PR rtl-optimization/64286)
On 01/12/15 12:59, Jakub Jelinek wrote: Hi! As mentioned in the PR, giving up for all vector mode extensions is unnecessary, but unlike scalar integer extensions, where the low part of the extended value is the original value, for vectors this is not true, thus the old value is lost. Which means we can perform REE, but only if all uses of the definition are the same (code+mode) extension. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2015-01-12 Jakub Jelinek ja...@redhat.com PR rtl-optimization/64286 * ree.c (add_removable_extension): Don't add vector mode extensions if all uses of the source register aren't the same vector extensions. * gcc.target/i386/avx2-pr64286.c: New test. Does it make sense to remove your change for 59754 in combine_reaching_defs? Shouldn't this patch handle that case as well? jeff
Re: [Patch docs 4/5] Update Output Template/Statement from md.texi
On 01/06/15 04:21, James Greenhalgh wrote: Hi, This patch updates the text in the Output Template and Output Statement sections of md.texi. I was aiming to:
* Remove outdated details of the compiler.
* Remove long or obscure words that, while accurate, only served to obfuscate a simple idea.
* Refer to similar things in a consistent fashion - in particular trying to keep consistent use of insn and pattern.
* Remove superfluous examples, or waffling.
OK? Thanks, James --- 2015-01-06 James Greenhalgh james.greenha...@arm.com * doc/md.texi (Output Template): Update text. (Output Statement): Likewise. With a better ChangeLog, this is OK. I'm a bit amazed how the m68k stuff for handling the different syntaxes ended up in the matching operands section. Egad. You might consider noting that as an example of how those characters are useful. One of the things about our manual that is kind of lame is the number of examples that come from obsolete/dead architectures. But I'm not going to ask you to change that. Jeff
[doc, committed] fix -Wbad-function-cast example
I was confused by the description of -Wbad-function-cast. It talks about function calls, but the malloc example looked more like a declaration, and IIRC it's not valid to redeclare functions from the standard C library with the wrong return type (or at the very least we shouldn't encourage doing so in an example). After checking the implementation to see what the option actually does, I decided it would be better not to name any particular function here. Checked in under the obvious fix rule. -Sandra 2015-01-12 Sandra Loosemore san...@codesourcery.com gcc/ * doc/invoke.texi ([-Wbad-function-cast]): Rewrite to avoid confusing example. Index: gcc/doc/invoke.texi === --- gcc/doc/invoke.texi (revision 219488) +++ gcc/doc/invoke.texi (working copy) @@ -4625,8 +4625,9 @@ example, warn if an unsigned variable is @item -Wbad-function-cast @r{(C and Objective-C only)} @opindex Wbad-function-cast @opindex Wno-bad-function-cast -Warn whenever a function call is cast to a non-matching type. -For example, warn if @code{int malloc()} is cast to @code{anything *}. +Warn when a function call is cast to a non-matching type. +For example, warn if a call to a function returning an integer type +is cast to a pointer type. @item -Wc90-c99-compat @r{(C and Objective-C only)} @opindex Wc90-c99-compat
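A concrete case of what the reworded text describes might look like the sketch below. All names are invented for the example. The first cast is the kind the option is expected to flag, and the second shows one way of avoiding a cast applied directly to a call.

```c
#include <assert.h>
#include <stdint.h>

static int slot;

/* Hypothetical function returning an integer type that happens to hold
   an address.  */
intptr_t
slot_address (void)
{
  return (intptr_t) &slot;
}

int *
cast_call_directly (void)
{
  /* With -Wbad-function-cast this line is expected to draw a warning:
     a call returning an integer type is cast to a pointer type.  */
  return (int *) slot_address ();
}

int *
cast_via_variable (void)
{
  /* The cast now applies to a variable rather than to a function call,
     so the option has nothing to flag.  */
  intptr_t raw = slot_address ();
  assert (raw != 0);
  return (int *) raw;
}
```

Both functions return the same pointer; only the form of the cast differs.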
[patch] libstdc++/64553 and libstdc++/64560 facet shims without RTTI or wchar_t
Two patches to make the new cxx11-shim_facets.cc file compile when RTTI and wchar_t are disabled. Tested x86_64-linux, commited to trunk. commit d2cbfa8426fae046eea01630e24d4d15c9aa1e61 Author: Jonathan Wakely jwak...@redhat.com Date: Mon Jan 12 11:46:57 2015 + PR libstdc++/64553 * src/c++11/cxx11-shim_facets.cc: Check for wchar_t support. diff --git a/libstdc++-v3/src/c++11/cxx11-shim_facets.cc b/libstdc++-v3/src/c++11/cxx11-shim_facets.cc index 56959b6..407b7b9 100644 --- a/libstdc++-v3/src/c++11/cxx11-shim_facets.cc +++ b/libstdc++-v3/src/c++11/cxx11-shim_facets.cc @@ -87,13 +87,17 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION union { const void* _M_p; char* _M_pc; +#ifdef _GLIBCXX_USE_WCHAR_T wchar_t* _M_pwc; +#endif }; size_t _M_len; char _M_unused[16]; operator const char*() const { return _M_pc; } +#ifdef _GLIBCXX_USE_WCHAR_T operator const wchar_t*() const { return _M_pwc; } +#endif }; union { __str_rep _M_str; @@ -251,9 +255,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION size_t _M_grouping_size; }; -template class numpunct_shimchar; -template class numpunct_shimwchar_t; - templatetypename _CharT struct collate_shim : std::collate_CharT, facet::__shim { @@ -279,9 +280,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION } }; -template class collate_shimchar; -template class collate_shimwchar_t; - templatetypename _CharT struct time_get_shim : std::time_get_CharT, facet::__shim { @@ -363,11 +361,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION __cache_type* _M_cache; }; -template class moneypunct_shimchar, true; -template class moneypunct_shimchar, false; -template class moneypunct_shimwchar_t, true; -template class moneypunct_shimwchar_t, false; - templatetypename _CharT struct money_get_shim : std::money_get_CharT, facet::__shim { @@ -409,9 +402,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION } }; -template class money_get_shimchar; -template class money_get_shimwchar_t; - templatetypename _CharT struct money_put_shim : std::money_put_CharT, facet::__shim { @@ -441,10 +431,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION 
} }; -template class money_put_shimchar; -template class money_put_shimwchar_t; - - templatetypename _CharT struct messages_shim : std::messages_CharT, facet::__shim { @@ -477,8 +463,22 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION } }; +template class numpunct_shimchar; +template class collate_shimchar; +template class moneypunct_shimchar, true; +template class moneypunct_shimchar, false; +template class money_get_shimchar; +template class money_put_shimchar; template class messages_shimchar; +#ifdef _GLIBCXX_USE_WCHAR_T +template class numpunct_shimwchar_t; +template class collate_shimwchar_t; +template class moneypunct_shimwchar_t, true; +template class moneypunct_shimwchar_t, false; +template class money_get_shimwchar_t; +template class money_put_shimwchar_t; template class messages_shimwchar_t; +#endif templatetypename C inline size_t @@ -524,9 +524,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION __numpunct_fill_cache(current_abi, const facet*, __numpunct_cachechar*, const char*, size_t); +#ifdef _GLIBCXX_USE_WCHAR_T template void __numpunct_fill_cache(current_abi, const facet*, __numpunct_cachewchar_t*, const char*, size_t); +#endif templatetypename C int @@ -540,9 +542,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION __collate_compare(current_abi, const facet*, const char*, const char*, const char*, const char*); +#ifdef _GLIBCXX_USE_WCHAR_T template int __collate_compare(current_abi, const facet*, const wchar_t*, const wchar_t*, const wchar_t*, const wchar_t*); +#endif templatetypename C void @@ -557,9 +561,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION __collate_transform(current_abi, const facet*, __any_string, const char*, const char*); +#ifdef _GLIBCXX_USE_WCHAR_T template void __collate_transform(current_abi, const facet*, __any_string, const wchar_t*, const wchar_t*); +#endif templatetypename C, bool Intl void @@ -599,6 +605,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION __moneypunct_fill_cache(current_abi, const facet*, __moneypunct_cachechar, false*); +#ifdef _GLIBCXX_USE_WCHAR_T template void 
__moneypunct_fill_cache(current_abi, const facet*, __moneypunct_cachewchar_t, true*); @@ -606,6 +613,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION template void __moneypunct_fill_cache(current_abi, const facet*, __moneypunct_cachewchar_t, false*); +#endif templatetypename C
Re: [testsuite] PATCH: Add check_effective_target_pie
On Monday, 12 January 2015 at 12:11:17, H.J. Lu wrote: On Mon, Jan 12, 2015 at 12:03 PM, Jeff Law l...@redhat.com wrote: On 01/12/15 12:59, H.J. Lu wrote: I don't know if -pg will work with PIE on any targets. For Linux/x86 the choices of crt1.o are %{!shared: %{pg|p|profile:gcrt1.o%s;pie:Scrt1.o%s;:crt1.o%s}} -shared, -pg and -pie are mutually exclusive. Those crt1 files are the only crt1 files provided by glibc. You can't even try -pg -pie on Linux without changing glibc. You're totally missing the point. What I care about is *why*. With -pg it uses the gcrt1.o object file, and that file is not compiled with -fPIC. When you build a shared lib on x86_64, all the object files need to be built with -fPIC, else you get an error like the one above, and it is the same problem when you build a binary with -fPIE and link with -pie. Glibc does not provide one that is compiled with -fPIC. Showing me spec file fragments is totally unhelpful. What is the technical reason why pg and pie are mutually exclusive? What kind of technical reason are you looking for? glibc doesn't provide the right crt1 file for GCC to support this combination. You can't define GNU_USER_TARGET_STARTFILE_SPEC to support -pg and -pie. If you are asking why glibc doesn't provide one, my guess is no one has requested one before.