Re: [PATCH] Add extra location information - PR43486
Thanks for your feedback.

> I also think this is to be preferred. I think of location_t as being, for most of the compiler, an opaque handle to location information. Just as it can now represent information about different concepts of location in the presence of macro expansion, so it's entirely reasonable for a location_t value to represent information about a range of locations (start and end points) for an expression, or to represent information about the locations (or ranges of locations) of the operands and operators of an expression (in the absence of an unfolded AST, where for locations of operands themselves you could just look one level down in the AST).

OK, I agree that's an interesting/promising approach (using/reserving a bit to add a level of indirection into a separate hash table). I'll investigate what it would take to implement such an approach, which would indeed get rid of the duplicate_expr_locations() calls, since this would be implicit as part of protected_set_expr_location() somehow, right?

> I would guess that much of the patch would be unchanged with such an approach - you still need to pass extra location information to various places, but the details of how you attach it to the expressions might be different.

Right, hopefully the main difference would be the implementation of the new functions in tree.[ch] (possibly moved elsewhere?) to set/retrieve these extra slocs; most of the changes would remain almost identical, I suspect.

> I would say that a location_t mapping to a set of other locations more complicated than at present should have some structure to how it maps to them. That is, rather than just mapping to an array of values with those values used in different ways for different source code constructs, it should be possible to tell that a given location_t is mapping to certain locations as corresponding to first and second operands of a binary operator (for example).

OK, so you mean, instead of knowing the number of locations from the tree kind (e.g. 1 extra sloc for unary exprs, 2 for binary exprs, ...), we would encode this as part of the extra loc info?

Note that the number of extra locations is already stored in my current patch: this is the unsigned char len field of sloc_struct in tree.c. Is that sufficient (knowing the number of slocs), or do you have something more advanced in mind?

Note that the structure stored in the hash table is:

  typedef struct {
    const_tree node;
    unsigned char len;
    location_t locus[1];
  } sloc_struct;

In other words, we have all the info you mentioned: number of extra slocs (len field) and tree kind (via the node field). If this is not what you had in mind, could you clarify what else?

Arno
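To make the layout of sloc_struct concrete, here is a minimal sketch of how a record with a trailing locus[1] array might be allocated and indexed. Only the struct layout comes from the message above; the typedefs for location_t and const_tree are simplified stand-ins, and sloc_alloc and sloc_slot are hypothetical helper names, not functions from the patch.

```cpp
#include <cstddef>
#include <cstdlib>

// Simplified stand-ins for GCC's types (assumptions for this sketch only).
typedef unsigned int location_t;
typedef const void *const_tree;

// Layout as quoted in the message: LEN extra locations follow NODE.
typedef struct {
  const_tree node;
  unsigned char len;
  location_t locus[1];   // really LEN elements (the old "struct hack")
} sloc_struct;

// Hypothetical helper: allocate a record with room for LEN locations.
static sloc_struct *
sloc_alloc (const_tree node, unsigned char len)
{
  // One element is already part of sizeof (sloc_struct).
  std::size_t extra = len > 1 ? (len - 1) * sizeof (location_t) : 0;
  sloc_struct *s
    = static_cast<sloc_struct *> (std::calloc (1, sizeof (sloc_struct) + extra));
  if (s)
    {
      s->node = node;
      s->len = len;
    }
  return s;
}

// Hypothetical helper: bounds-checked access to the IDX-th extra location.
static location_t *
sloc_slot (sloc_struct *s, unsigned idx)
{
  return idx < s->len ? s->locus + idx : NULL;
}
```

The len field is what lets a consumer walk the extra locations without consulting the tree kind, which is the point Arno makes; whether it also says which operand each slot belongs to is the open question in the thread.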
Re: [PATCH] Add -Og optimization level - optimize for compile-time/debugging experience
> This adds -Og as optimization level targeted at the devel-compile-debug cycle (formerly mostly tied to -O0 due to debug issues with even -O1). Discussion on g...@gcc.gnu.org at least shows interest in this, so this is a formal patch submission with a request for comments on the implementation (not necessarily on what passes are enabled and why).

There are a fair number of places in the compiler where things are done to help debugging (as opposed to not done like optimizations) if !optimize. I guess we want to enable (some of) them with -Og, in which case it would probably be convenient to have a shortcut for !optimize || optimize_debug.

-- 
Eric Botcazou
Re: Use conditional casting with symtab_node
> The language syntax would bind the conditional into the initializer, as in
>
>   if (varpool_node *vnode = (node->try_variable ()
>                              && vnode->finalized))
>     varpool_analyze_node (vnode);
>
> which does not type-match.
>
> So, if you want the type safety and performance, the cascade is really unavoidable.

Just write:

  varpool_node *vnode;

  if ((vnode = node->try_variable ()) && vnode->finalized)
    varpool_analyze_node (vnode);

This has been the standard style for the past 2 decades and trading it for cascading if's is really a bad idea.

-- 
Eric Botcazou
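For readers comparing the two styles under discussion, the following sketch puts them side by side. The symtab_node and varpool_node definitions here are tiny hypothetical stand-ins (the real GCC classes are far larger), and analyze_cascade and analyze_assign are names invented for this illustration.

```cpp
// Hypothetical, heavily simplified stand-ins for the GCC classes.
struct varpool_node
{
  bool finalized;
  bool analyzed;
};

struct symtab_node
{
  varpool_node *as_var;   // null when the node is not a variable
  varpool_node *try_variable () { return as_var; }
};

static void
varpool_analyze_node (varpool_node *vnode)
{
  vnode->analyzed = true;
}

// Style 1: declaring VNODE in the condition keeps its scope tight, but the
// extra test on vnode->finalized has to go into a nested ("cascading") if.
static void
analyze_cascade (symtab_node *node)
{
  if (varpool_node *vnode = node->try_variable ())
    if (vnode->finalized)
      varpool_analyze_node (vnode);
}

// Style 2: declare first, assign inside the condition, and combine both
// tests with && in a single if.
static void
analyze_assign (symtab_node *node)
{
  varpool_node *vnode;

  if ((vnode = node->try_variable ()) && vnode->finalized)
    varpool_analyze_node (vnode);
}
```

Both functions behave identically; the disagreement in the thread is about scoping and readability, not semantics.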
Re: [PATCH] Add extra location information - PR43486
On Wed, Sep 19, 2012 at 09:18:39AM +0200, Arnaud Charlet wrote:
> Thanks for your feedback.
>
> > I also think this is to be preferred. I think of location_t as being, for most of the compiler, an opaque handle to location information. Just as it can now represent information about different concepts of location in the presence of macro expansion, so it's entirely reasonable for a location_t value to represent information about a range of locations (start and end points) for an expression, or to represent information about the locations (or ranges of locations) of the operands and operators of an expression (in the absence of an unfolded AST, where for locations of operands themselves you could just look one level down in the AST).
>
> OK, I agree that's an interesting/promising approach (using/reserving a bit to add a level of indirection into a separate hash table), I'll investigate what it could take to implement such approach, which would indeed get rid of the duplicate_expr_locations() calls, since this would be implicit as part of protected_set_expr_location() somehow, right?

Please also read Dodji's slides from Cauldron on this:
http://dodji.seketeli.net/talks/gnu-cauldron-2012/track-macro-locations.pdf
and discuss with him; there is also the "Combine location with block using block_locations" patchset floating around that interferes with this partially.

But range locations as part of location_t is definitely the way to go, we want that for proper diagnostics anyway.

Jakub
Re: [PATCH] Changes in mode switching
On Tue, Sep 18, 2012 at 10:57 PM, Vladimir Yakovlev vbyakov...@gmail.com wrote:
> Attached files i386.patch contains changes for vzeroupper placement. post_reload.patch contains changes for post reload pass. I have bootstrap problem with post_reload.patch.

Does the patch without post_reload.patch work as expected, or are there failures/problems remaining?

Uros.
Re: [PATCH] OpenBSD/hppa support
Date: Tue, 18 Sep 2012 14:43:35 -0400 From: John David Anglin d...@hiauly1.hia.nrc.ca On Thu, 06 Sep 2012, Mark Kettenis wrote: Most bits are stolen from Linux, but there are a few subtle differences since our assembler is configured to be slightly more HP-UX-ish. libgcc/: 2012-09-06 Mark Kettenis kette...@openbsd.org * config.host (hppa-*-openbsd*): New target. * config/pa/t-openbsd: New file. gcc/: 2012-09-06 Mark Kettenis kette...@openbsd.org * config.gcc (hppa*-*-openbsd*): New target. * config/pa/pa-openbsd.h: New file. * config/pa/pa32-openbsd.h: New file. * config/host-openbsd.c (TRY_EXCEPT_VM_SPACE): Define for OpenBSD/hppa. OK. Please add 2012 to files with copyrights. Thanks Dave! Here is an update diff with the copyright year updates. Would you be so kind to commit this for me? Thanks, Mark libgcc/: 2012-09-19 Mark Kettenis kette...@openbsd.org * config.host (hppa-*-openbsd*): New target. * config/pa/t-openbsd: New file. gcc:/ 2012-09-19 Mark Kettenis kette...@openbsd.org * config.gcc (hppa*-*-openbsd*): New target. * config/pa/pa-openbsd.h: New file. * config/pa/pa32-openbsd.h: New file. * config/host-openbsd.c: Update copyright year. (TRY_EXCEPT_VM_SPACE): Define for OpenBSD/hppa. Index: gcc/config/pa/pa32-openbsd.h === --- gcc/config/pa/pa32-openbsd.h(revision 0) +++ gcc/config/pa/pa32-openbsd.h(working copy) @@ -0,0 +1,23 @@ +/* Definitions for PA_RISC with ELF-32 format + Copyright (C) 2000, 2002, 2004, 2006, 2007, 2010, 2011, 2012 + Free Software Foundation, Inc. + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 3, or (at your option) +any later version. + +GCC is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. 
+ +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +http://www.gnu.org/licenses/. */ + +/* Turn off various SOM crap we don't want. */ +#undef TARGET_ELF32 +#define TARGET_ELF32 1 Index: gcc/config/pa/pa-openbsd.h === --- gcc/config/pa/pa-openbsd.h (revision 0) +++ gcc/config/pa/pa-openbsd.h (working copy) @@ -0,0 +1,162 @@ +/* Definitions for PA_RISC with ELF format + Copyright 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2009, 2010, + 2011, 2012 + Free Software Foundation, Inc. + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 3, or (at your option) +any later version. + +GCC is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +http://www.gnu.org/licenses/. */ + + +#undef TARGET_OS_CPP_BUILTINS +#define TARGET_OS_CPP_BUILTINS() \ + do \ +{ \ + OPENBSD_OS_CPP_BUILTINS(); \ + builtin_assert (machine=bigendian); \ +} \ + while (0) + +/* Our profiling scheme doesn't LP labels and counter words. */ +#define NO_DEFERRED_PROFILE_COUNTERS 1 + +#undef STRING_ASM_OP +#define STRING_ASM_OP \t.stringz\t + +#define TEXT_SECTION_ASM_OP \t.text +#define DATA_SECTION_ASM_OP \t.data +#define BSS_SECTION_ASM_OP \t.section\t.bss + +/* We want local labels to start with period if made with asm_fprintf. */ +#undef LOCAL_LABEL_PREFIX +#define LOCAL_LABEL_PREFIX . + +/* Define these to generate the Linux/ELF/SysV style of internal + labels all the time - i.e. to be compatible with + ASM_GENERATE_INTERNAL_LABEL in elfos.h. 
Compare these with the + ones in pa.h and note the lack of dollar signs in these. FIXME: + shouldn't we fix pa.h to use ASM_GENERATE_INTERNAL_LABEL instead? */ + +#undef ASM_OUTPUT_ADDR_VEC_ELT +#define ASM_OUTPUT_ADDR_VEC_ELT(FILE, VALUE) \ + if (TARGET_BIG_SWITCH) \ +fprintf (FILE, \t.word .L%d\n, VALUE); \ + else \ +fprintf (FILE, \tb .L%d\n\tnop\n, VALUE) + +#undef ASM_OUTPUT_ADDR_DIFF_ELT +#define
Re: [testsuite] vect effective targets should use arm_neon_ok
On 18/09/12 21:59, Janis Johnson wrote:
> On 09/18/2012 12:54 PM, Janis Johnson wrote:
>> In most cases a test that requires ARM NEON should use effective target arm_neon, which means that flags run for all tests include NEON support. The result is cached the first time it is checked for a multilib.
>>
>> Vectorization tests, when run for ARM, add flags to support NEON if it's OK to do so, but those flags are not reflected in the cached results for arm_neon, nor should they be. Because of this, vect effective-target checks should use arm_neon_ok (as most already do) instead of arm_neon.
>>
>> This patch changes the checks for 7 effective targets, allowing more tests to run and decreasing the number of failures. The only new failures I've seen in tests on arm-none-eabi with a variety of test multilibs are for big-endian with vect_multiple_sizes, which means that vect_multiple_sizes should be false for big endian or that there's a bug in ARM big-endian support.

Sadly, there are almost certainly bugs in the big-endian support for ARM Neon.
Re: [PATCH] Combine location with block using block_locations
> Hi,
>
> I've integrated all the reviews from this thread (Thank you guys for helping refine this patch).
>
> Now the patch can pass all gcc testsuite as well as all spec2006 benchmarks (with LTO).
>
> Concerning memory consumption, for extreme benchmarks like tramp3d, this patch incurs around 2% peak memory overhead (mostly from the extra blocks that have been set NULL in the original implementation.)
>
> Attached is the new patch. Honza, could you help me try this on Mozilla lto to see if the error is gone?

Hi,
I tested the last version you posted and it works fine (i.e. no ICE). I also observed no real differences in memory use. linemap lookup seems to be the bottleneck in the streaming-out stage of WPA. I wonder if we can't stream locations better into LTO objects, perhaps by using the same encoding as we do in memory (i.e. streaming out locators and a separate table).

Honza
Re: [PATCH] Add -Og optimization level - optimize for compile-time/debugging experience
On Wed, 19 Sep 2012, Eric Botcazou wrote:
> > This adds -Og as optimization level targeted at the devel-compile-debug cycle (formerly mostly tied to -O0 due to debug issues with even -O1). Discussion on g...@gcc.gnu.org at least shows interest in this, so this is a formal patch submission with a request for comments on the implementation (not necessarily on what passes are enabled and why).
>
> There are a fair number of places in the compiler where things are done to help debugging (as opposed to not done like optimizations) if !optimize. I guess we want to enable (some of) them with -Og, in which case it would probably be convenient to have a shortcut for !optimize || optimize_debug.

Which means that -O0 should also set optimize_debug to 1? -O0 is then !optimize && optimize_debug and -Og is optimize == 1 && optimize_debug.

Richard.
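As a rough illustration of the encoding Richard floats here, the sketch below models the two variables and Eric's shortcut. The opt_state struct, the state_for_* helpers and debug_friendly_p are hypothetical names for this example only; the real GCC option handling is more involved, and the -O1 row is added purely for contrast.

```cpp
// Hypothetical mirror of the two option variables discussed in the thread.
struct opt_state
{
  int optimize;          // 0 for -O0, 1 for -Og and -O1, ...
  bool optimize_debug;   // set by -Og; also by -O0 under the proposal
};

static opt_state state_for_O0 () { opt_state s = { 0, true };  return s; }
static opt_state state_for_Og () { opt_state s = { 1, true };  return s; }
static opt_state state_for_O1 () { opt_state s = { 1, false }; return s; }

// The shortcut suggested earlier in the thread: true wherever the compiler
// should do the extra work that helps debugging.
static bool
debug_friendly_p (const opt_state &s)
{
  return !s.optimize || s.optimize_debug;
}
```

Under this assumed encoding the shortcut gives the same answer as optimize_debug alone for the three levels modelled, which seems to be the simplification Richard is hinting at.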
[PATCH] Fix PR54132
This fixes PR54132 where loop distribution pattern matching thinks that every loop that copies contiguous memory regions and has a dependence between input and output can be represented by a memmove. Not. Obviously. Thus, fixed by doing proper dependence checks.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2012-09-19  Richard Guenther  rguent...@suse.de

        PR tree-optimization/54132
        * tree-loop-distribution.c (classify_partition): Properly
        check dependences for memmove.
        * tree-data-ref.h (compute_affine_dependence): Declare.
        * tree-data-ref.c (compute_affine_dependence): Export.

        * gcc.dg/tree-ssa/ldist-21.c: New testcase.
        * gcc.dg/torture/pr54132.c: Likewise.

Index: gcc/tree-loop-distribution.c
===================================================================
*** gcc/tree-loop-distribution.c        (revision 191415)
--- gcc/tree-loop-distribution.c        (working copy)
*************** classify_partition (loop_p loop, struct
*** 1011,1016 ****
--- 1011,1049 ----
            || !operand_equal_p (DR_STEP (single_store),
                                 DR_STEP (single_load), 0))
          return;
+       /* Now check that if there is a dependence this dependence is
+          of a suitable form for memmove.  */
+       VEC(loop_p, heap) *loops = NULL;
+       ddr_p ddr;
+       VEC_safe_push (loop_p, heap, loops, loop);
+       ddr = initialize_data_dependence_relation (single_load, single_store,
+                                                  loops);
+       compute_affine_dependence (ddr, loop);
+       VEC_free (loop_p, heap, loops);
+       if (DDR_ARE_DEPENDENT (ddr) == chrec_dont_know)
+         {
+           free_dependence_relation (ddr);
+           return;
+         }
+       if (DDR_ARE_DEPENDENT (ddr) != chrec_known)
+         {
+           if (DDR_NUM_DIST_VECTS (ddr) == 0)
+             {
+               free_dependence_relation (ddr);
+               return;
+             }
+           lambda_vector dist_v;
+           FOR_EACH_VEC_ELT (lambda_vector, DDR_DIST_VECTS (ddr), i, dist_v)
+             {
+               int dist = dist_v[index_in_loop_nest (loop->num,
+                                                     DDR_LOOP_NEST (ddr))];
+               if (dist > 0 && !DDR_REVERSED_P (ddr))
+                 {
+                   free_dependence_relation (ddr);
+                   return;
+                 }
+             }
+         }
        partition->kind = PKIND_MEMCPY;
        partition->main_dr = single_store;
        partition->secondary_dr = single_load;
Index: gcc/testsuite/gcc.dg/tree-ssa/ldist-21.c
===================================================================
*** gcc/testsuite/gcc.dg/tree-ssa/ldist-21.c    (revision 0)
--- gcc/testsuite/gcc.dg/tree-ssa/ldist-21.c    (working copy)
***************
*** 0 ****
--- 1,11 ----
+ /* { dg-do compile } */
+ /* { dg-options "-O3 -fdump-tree-ldist-details" } */
+ 
+ void bar(char *p, int n)
+ {
+   int i;
+   for (i = 1; i < n; i++)
+     p[i-1] = p[i];
+ }
+ 
+ /* { dg-final { scan-tree-dump "generated memmove" "ldist" } } */
Index: gcc/testsuite/gcc.dg/torture/pr54132.c
===================================================================
*** gcc/testsuite/gcc.dg/torture/pr54132.c      (revision 0)
--- gcc/testsuite/gcc.dg/torture/pr54132.c      (working copy)
***************
*** 0 ****
--- 1,18 ----
+ /* { dg-do run } */
+ 
+ extern void abort (void);
+ void foo(char *p, int n)
+ {
+   int i;
+   for (i = 1; i < n; i++)
+     p[i] = p[i - 1];
+ }
+ int main()
+ {
+   char a[1024];
+   a[0] = 1;
+   foo (a, 1024);
+   if (a[1023] != 1)
+     abort ();
+   return 0;
+ }
Index: gcc/tree-data-ref.c
===================================================================
*** gcc/tree-data-ref.c (revision 191415)
--- gcc/tree-data-ref.c (working copy)
*************** ddr_consistent_p (FILE *file,
*** 4130,4136 ****
     relation the first time we detect a CHREC_KNOWN element for a given
     subscript.  */
  
! static void
  compute_affine_dependence (struct data_dependence_relation *ddr,
                             struct loop *loop_nest)
  {
--- 4130,4136 ----
     relation the first time we detect a CHREC_KNOWN element for a given
     subscript.  */
  
! void
  compute_affine_dependence (struct data_dependence_relation *ddr,
                             struct loop *loop_nest)
  {
Index: gcc/tree-data-ref.h
===================================================================
*** gcc/tree-data-ref.h (revision 191415)
--- gcc/tree-data-ref.h (working copy)
*************** struct data_reference *create_data_ref (
*** 396,401 ****
--- 396,403 ----
  extern bool find_loop_nest (struct loop *, VEC (loop_p, heap) **);
  extern struct data_dependence_relation *initialize_data_dependence_relation
       (struct data_reference *, struct data_reference *, VEC (loop_p, heap) *);
+ extern void compute_affine_dependence (struct data_dependence_relation *,
+                                        loop_p);
  extern void
Re: [PATCH] Fix PR54132
On Wed, Sep 19, 2012 at 10:53:00AM +0200, Richard Guenther wrote:
> + extern void abort (void);
> + void foo(char *p, int n)
> + {
> +   int i;
> +   for (i = 1; i < n; i++)
> +     p[i] = p[i - 1];

You could turn this into

  if (n > 1)
    memset (p + 1, p[0], n - 1);

if you wanted, though of course that is quite a special case and won't work with pointers to larger types than char, or if the dependency is different from -1 bytes.

Jakub
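The difference Jakub describes can be checked with a small sketch. The loop body is the one from pr54132.c; the function names copy_loop, copy_memset and copy_memmove are invented for this illustration, and copy_memmove shows the transformation the PR is about (it is not equivalent to the loop).

```cpp
#include <cstring>

// The loop from pr54132.c: each store feeds the next load, so p[0] is
// propagated forward through the whole buffer.
static void
copy_loop (char *p, int n)
{
  for (int i = 1; i < n; i++)
    p[i] = p[i - 1];
}

// Jakub's equivalent for the char, distance -1 special case.
static void
copy_memset (char *p, int n)
{
  if (n > 1)
    std::memset (p + 1, p[0], n - 1);
}

// What a memmove replacement would do instead: shift the old contents up
// by one byte, which is NOT what the loop computes.
static void
copy_memmove (char *p, int n)
{
  if (n > 1)
    std::memmove (p + 1, p, n - 1);
}
```

Starting from the seven bytes "abcdefg", the loop and the memset form both leave "aaaaaaa", while the memmove form leaves "aabcdef". That mismatch is why the dependence distance has to be checked before the pattern is classified.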
Re: RFA: Process '*' in '@'-output-template alternatives
On Wed, Sep 19, 2012 at 3:51 AM, Joern Rennecke amyl...@spamcop.net wrote: I am about to submit the ARCompact target port; this port needs a few patches to target-independent code. There is a move pattern with 20 alternatives; a few of them need a simple function call to decide which output pattern to use. With the '@'-syntax for multi-alternative templates, each alternative is still a one-liner. Requiring to transform this into some switch statement would make the thing several times as big, and very hard to take in; besides, it is generally a maintenance issue if you have to completely rewrite a multi-alternative template if you just change one alternative from a constant to some C-code, or vice versa for the last non-literal alternative. The attached patch makes the '*' syntax for C code fragments available for individual alternatives of an '@' multi-alternative output template. It does this by translating the input into a switch statement in the generated file, so in a way this is just syntactic sugar, but it's syntactic sugar that makes some machine descriptions easier to write and change. Bootstrapped in revision 191429 for i686-pc-linux-gnu. I've been wondering if it'd make sense to also support for '{' / '}' , but at least in the ARCompact context, I think the use of that syntax inside a multi-alternative template would reduce rather than improve legibility, so, having no application for the '{' / '}' in that place, there seems to be no use in adding support for that at this point in time. I think that needs to be documented somewhere in the internals manual, possibly with an example. Richard. 2008-11-19 Jorn Rennecke joern.renne...@arc.com * genoutput.c (process_template): Process '*' in '@' alternatives. Index: genoutput.c === --- genoutput.c (revision 191429) +++ genoutput.c (working copy) @@ -662,19 +662,55 @@ process_template (struct data *d, const list of assembler code templates, one for each alternative. 
*/ else if (template_code[0] == '@') { - d-template_code = 0; - d-output_format = INSN_OUTPUT_FORMAT_MULTI; + int found_star = 0; - printf (\nstatic const char * const output_%d[] = {\n, d-code_number); + for (cp = template_code[1]; *cp; ) + { + while (ISSPACE (*cp)) + cp++; + if (*cp == '*') + found_star = 1; + while (!IS_VSPACE (*cp) *cp != '\0') + ++cp; + } + d-template_code = 0; + if (found_star) + { + d-output_format = INSN_OUTPUT_FORMAT_FUNCTION; + puts (\nstatic const char *); + printf (output_%d (rtx *operands ATTRIBUTE_UNUSED, + rtx insn ATTRIBUTE_UNUSED)\n, d-code_number); + puts ({); + puts ( switch (which_alternative)\n{); + } + else + { + d-output_format = INSN_OUTPUT_FORMAT_MULTI; + printf (\nstatic const char * const output_%d[] = {\n, + d-code_number); + } for (i = 0, cp = template_code[1]; *cp; ) { - const char *ep, *sp; + const char *ep, *sp, *bp; while (ISSPACE (*cp)) cp++; - printf ( \); + bp = cp; + if (found_star) + { + printf (case %d:, i); + if (*cp == '*') + { + printf (\n ); + cp++; + } + else + printf ( return \); + } + else + printf ( \); for (ep = sp = cp; !IS_VSPACE (*ep) *ep != '\0'; ++ep) if (!ISSPACE (*ep)) @@ -690,7 +726,18 @@ process_template (struct data *d, const cp++; } - printf (\,\n); + if (!found_star) + puts (\,); + else if (*bp != '*') + puts (\;); + else + { + /* The usual action will end with a return. +If there is neither break or return at the end, this is +assumed to be intentional; this allows to have multiple +consecutive alternatives share some code. */ + puts (); + } i++; } if (i == 1) @@ -700,7 +747,10 @@ process_template (struct data *d, const error_with_line (d-lineno, wrong number of alternatives in the output template); - printf (};\n); + if (found_star) + puts ( default: gcc_unreachable ();\n}\n}); + else + printf (};\n); } else {
Re: Use conditional casting with symtab_node
On Wed, Sep 19, 2012 at 9:29 AM, Eric Botcazou ebotca...@adacore.com wrote:
>> The language syntax would bind the conditional into the initializer, as in
>>
>>   if (varpool_node *vnode = (node->try_variable ()
>>                              && vnode->finalized))
>>     varpool_analyze_node (vnode);
>>
>> which does not type-match.
>>
>> So, if you want the type safety and performance, the cascade is really unavoidable.
>
> Just write:
>
>   varpool_node *vnode;
>
>   if ((vnode = node->try_variable ()) && vnode->finalized)
>     varpool_analyze_node (vnode);
>
> This has been the standard style for the past 2 decades and trading it for cascading if's is really a bad idea.

Indeed. Btw, can we not provide a specialization for dynamic_cast? This ->try_... looks awkward to me compared to the more familiar

  vnode = dynamic_cast <varpool_node> (node)

but yeah - dynamic_cast is not a template ... (but maybe there is some standard library piece that mimics it?).

Richard.

> -- 
> Eric Botcazou
Re: [PATCH] Combine location with block using block_locations
On Wed, Sep 19, 2012 at 10:48 AM, Jan Hubicka hubi...@ucw.cz wrote:
>> Hi,
>>
>> I've integrated all the reviews from this thread (Thank you guys for helping refine this patch).
>>
>> Now the patch can pass all gcc testsuite as well as all spec2006 benchmarks (with LTO).
>>
>> Concerning memory consumption, for extreme benchmarks like tramp3d, this patch incurs around 2% peak memory overhead (mostly from the extra blocks that have been set NULL in the original implementation.)
>>
>> Attached is the new patch. Honza, could you help me try this on Mozilla lto to see if the error is gone?
>
> Hi,
> I tested the last version you posted and it works fine (i.e. no ICE). I also observed no real differences in memory use. linemap lookup seems to be the bottleneck in the streaming-out stage of WPA. I wonder if we can't stream locations better into LTO objects, perhaps by using the same encoding as we do in memory (i.e. streaming out locators and a separate table).

Yes, I think we should consider streaming locations separately so that we can read them in in one chunk and sort them to be able to assign most compact location_t's to them.

Dehao, the patch is ok for trunk - please be on the watch to address possible fallout.

Thanks,
Richard.

> Honza
Re: [PATCH] Add -Og optimization level - optimize for compile-time/debugging experience
On Tue, 18 Sep 2012, Ian Lance Taylor wrote:
> On Tue, Sep 18, 2012 at 4:23 AM, Richard Guenther rguent...@suse.de wrote:
> >
> > This adds -Og as optimization level targeted at the devel-compile-debug cycle (formerly mostly tied to -O0 due to debug issues with even -O1).
>
> This needs an entry in gcc-4.8/changes.html, of course.

Like the following.

Richard.

2012-09-19  Richard Guenther  rguent...@suse.de

        * gcc-4.8/changes.html: Document -Og.

Index: changes.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.8/changes.html,v
retrieving revision 1.28
diff -u -r1.28 changes.html
--- changes.html        6 Sep 2012 03:42:45 -   1.28
+++ changes.html        19 Sep 2012 09:27:38 -
@@ -41,6 +41,11 @@
 <h2>General Optimizer Improvements (and Changes)</h2>
 
   <ul>
+    <li>A new general optimization level, <code>-Og</code>, has been
+    introduced.  It addresses the need for fast compilation and a
+    superior debugging experience while providing a reasonable level
+    of runtime performance.  Overall experience for development should
+    be better than the default optimization level <code>-O0</code>.
     <li>A new option <code>-ftree-partial-pre</code> was added to control
     the partial redundancy elimination (PRE) optimization.
     This option is enabled by default at the <code>-O3</code> optimization
Re: [libbacktrace] Fix bootstrap with gcc 4.4
Ian Lance Taylor i...@google.com writes:

> On Tue, Sep 18, 2012 at 1:32 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote:
>> The libbacktrace integration broke Solaris 10 and 11 bootstrap when using gcc 4.4 (any version of gcc without __sync_* support actually):
>
> The patch is fine and should fix the problem, but GCC 4.4 does have

Indeed, thanks.

> __sync_* support. Might be worth looking into why the test failed.

On i386-pc-solaris2.*, __sync_bool_compare_and_swap_4 is missing. sparc-sun-solaris2.11 is fine, though.

>> Unfortunately, Solaris 10 (and certainly Solaris 9, too) bootstrap is still broken:
>>
>> /vol/gcc/src/hg/trunk/local/libbacktrace/dwarf.c:652: error: implicit declaration of function 'strnlen'
>> make[1]: *** [dwarf.lo] Error 1
>>
>> Both completely lack strnlen(). I haven't done anything about this yet.
>
> This should be fixed now.

The strnlen part is (i386-pc-solaris2.1[01] and sparc-sun-solaris2.11 bootstraps currently running the testsuite), but a new problem turned up on i386-pc-solaris2.9, which lacks stdint.h. The following patch fixes this for me (bootstrap currently into stage2). I've removed the stdint.h includes in btest.c and dwarf.c since that's already covered by backtrace.h.

Ok for mainline if that passes?

Thanks.
        Rainer

2012-09-19  Rainer Orth  r...@cebitec.uni-bielefeld.de

        * configure.ac (GCC_HEADER_STDINT): Invoke.
        * aclocal.m4: Regenerate.
        * configure: Regenerate.
        * backtrace.h: Include gstdint.h instead of stdint.h.
        * btest.c: Don't include stdint.h.
        * dwarf.c: Likewise.

# HG changeset patch
# Parent cbf27345ec507da8167eb50de6c21124cfd778c4
Provide stdint.h if missing

diff --git a/libbacktrace/backtrace.h b/libbacktrace/backtrace.h
--- a/libbacktrace/backtrace.h
+++ b/libbacktrace/backtrace.h
@@ -34,7 +34,7 @@ POSSIBILITY OF SUCH DAMAGE.  */
 #define BACKTRACE_H
 
 #include <stddef.h>
-#include <stdint.h>
+#include "gstdint.h"
 #include <stdio.h>
 
 #ifdef __cplusplus
diff --git a/libbacktrace/btest.c b/libbacktrace/btest.c
--- a/libbacktrace/btest.c
+++ b/libbacktrace/btest.c
@@ -34,7 +34,6 @@ POSSIBILITY OF SUCH DAMAGE.  */
    libbacktrace library.  */
 
 #include <assert.h>
-#include <stdint.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
diff --git a/libbacktrace/configure.ac b/libbacktrace/configure.ac
--- a/libbacktrace/configure.ac
+++ b/libbacktrace/configure.ac
@@ -168,6 +168,8 @@ if test $backtrace_supported = yes;
 fi
 AC_SUBST(BACKTRACE_SUPPORTED)
 
+GCC_HEADER_STDINT(gstdint.h)
+
 AC_CHECK_HEADERS(sys/mman.h)
 if test $ac_cv_header_sys_mman_h = no; then
   have_mmap=no
diff --git a/libbacktrace/dwarf.c b/libbacktrace/dwarf.c
--- a/libbacktrace/dwarf.c
+++ b/libbacktrace/dwarf.c
@@ -33,7 +33,6 @@ POSSIBILITY OF SUCH DAMAGE.  */
 #include "config.h"
 
 #include <errno.h>
-#include <stdint.h>
 #include <stdlib.h>
 #include <string.h>
 #include <sys/types.h>

-- 
Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [PATCH] Combine location with block using block_locations
Hi,

On Wed, Sep 12, 2012 at 04:17:45PM +0200, Michael Matz wrote:
> Hi,
>
> On Wed, 12 Sep 2012, Michael Matz wrote:
>
> > > Hm, but we shouldn't end up streaming any BLOCKs at this point (nor local TYPE_DECLs). Those are supposed to be in the local function sections only where no fixup for prevailing decls happens.
> >
> > That's true, something is fishy with the patch, will try to investigate.
>
> ipa-prop creates the problem. Its tree mapping can contain expressions, expressions can have locations, locations now have blocks. The tree maps are stored as part of jump functions, and hence as part of node summaries. Node summaries are global, hence blocks, and therefore block vars can be placed in the global blob.
>
> That's not supposed to happen. The patch below fixes this instance of the problem and makes the testcase work with Dehao's patch with the LTO_NO_PREVAIL call added back in.

The following patch implements the unsharing and location pruning at all required places in a way that passes bootstrap and testing on x86_64-linux. Honza pre-approved it on IRC, so unless there are any objections within a few hours I'm going to commit it.

(The patch does not introduce any of the asserts Michael's patch had because, as far as my grep told me, IS_UNKNOWN_LOCATION is not in trunk yet and I suppose the pre-approval does not cover introducing things like that.)

Thanks,

Martin

2012-09-18  Martin Jambor  mjam...@suse.cz

        * ipa-prop.c (prune_expression_for_jf): New function.
        (ipa_set_jf_constant): Use it.
        (ipa_set_jf_arith_pass_through): Likewise.
        (determine_known_aggregate_parts): Likewise.

Index: src/gcc/ipa-prop.c
===================================================================
--- src.orig/gcc/ipa-prop.c
+++ src/gcc/ipa-prop.c
@@ -287,6 +287,19 @@ ipa_print_all_jump_functions (FILE *f)
     }
 }
 
+/* Return the expression tree EXPR unshared and with location stripped off.  */
+
+static tree
+prune_expression_for_jf (tree exp)
+{
+  if (EXPR_P (exp))
+    {
+      exp = unshare_expr (exp);
+      SET_EXPR_LOCATION (exp, UNKNOWN_LOCATION);
+    }
+  return exp;
+}
+
 /* Set JFUNC to be a known type jump function.  */
 
 static void
@@ -305,7 +318,7 @@ static void
 ipa_set_jf_constant (struct ipa_jump_func *jfunc, tree constant)
 {
   jfunc->type = IPA_JF_CONST;
-  jfunc->value.constant = constant;
+  jfunc->value.constant = prune_expression_for_jf (constant);
 }
 
 /* Set JFUNC to be a simple pass-through jump function.  */
@@ -327,7 +340,7 @@ ipa_set_jf_arith_pass_through (struct ip
                                tree operand, enum tree_code operation)
 {
   jfunc->type = IPA_JF_PASS_THROUGH;
-  jfunc->value.pass_through.operand = operand;
+  jfunc->value.pass_through.operand = prune_expression_for_jf (operand);
   jfunc->value.pass_through.formal_id = formal_id;
   jfunc->value.pass_through.operation = operation;
   jfunc->value.pass_through.agg_preserved = false;
@@ -1344,7 +1357,7 @@ determine_known_aggregate_parts (gimple
             {
               struct ipa_agg_jf_item item;
               item.offset = list->offset - arg_offset;
-              item.value = list->constant;
+              item.value = prune_expression_for_jf (list->constant);
               VEC_quick_push (ipa_agg_jf_item_t, jfunc->agg.items, item);
             }
           list = list->next;
[PATCH, WWWDOCS] Document AArch64-4.7 branch
Hi,

This patch documents the AArch64-4.7 branch in wwwdocs/htdocs/svn.html.

OK?

Thanks
Sofiane

-

Proposed ChangeLog:

        * htdocs/svn.html: Document aarch64-4.7 branch.

Attachment: aarch64-4.7-branch-wwwdocs.patch (binary data)
Re: [PATCH, WWWDOCS] Document AArch64-4.7 branch
On 19 Sep 2012, at 11:26, Sofiane Naci sofiane.n...@arm.com wrote:

> Hi,
>
> This patch documents the AArch64-4.7 branch in wwwdocs/htdocs/svn.html.
>
> OK?

Sofiane,

When I posted the equivalent patch for the aarch64-branch, Gerald pointed out that someone with write access to gcc does not need approval for changes like this:

http://gcc.gnu.org/ml/gcc-patches/2012-06/msg00406.html

Cheers
/Marcus
[patch] split FRAME variables back into pieces
Hi, this transformation has been in our tree for a couple of years and was originally developed for SPARKSkein (http://www.skein-hash.info/node/48), which is the implementation in SPARK (subset of Ada) of the Skein algorithm. When nested functions access local variables of their parent, the compiler creates a special FRAME local variable in the parent, which represents the non-local frame, and puts into it all the variables accessed non-locally. If these nested functions are later inlined into their parent, these FRAME variables generally remain unmodified and this has various drawbacks: 1) the frame of the parent is unnecessarily large, 2) scalarization of aggregates put into the FRAME variables is hindered, 3) debug info for scalars put into the FRAME variables is poor since VTA only works on GIMPLE registers. The attached patch makes it so that the compiler splits FRAME variables back into pieces when all the nested functions have been inlined. The natural place to implement the transformation would probably be the SRA pass, but this would require a special path to work around all the heuristics and the pass is already complicated enough (sorry Martin ;-) The transformation is therefore implemented as a sub-pass of execute_update_addresses_taken for technical reasons exposed in the patch. Tested on x86-64/Linux, OK for the mainline? 2012-09-19 Eric Botcazou ebotca...@adacore.com * tree.h (DECL_NONLOCAL_FRAME): New macro. * gimple.c (gimple_ior_addresses_taken_1): Handle non-local frame structures specially. * tree-nested.c (get_frame_type): Set DECL_NONLOCAL_FRAME. * tree-ssa.c (lookup_decl_for_field): New static function. (split_nonlocal_frames_op): Likewise. (execute_update_addresses_taken): Break up non-local frame structures into variables when possible. * tree-streamer-in.c (unpack_ts_decl_common_value_fields): Stream in DECL_NONLOCAL_FRAME flag. * tree-streamer-out.c (pack_ts_decl_common_value_fields): Stream out DECL_NONLOCAL_FRAME flag. 
2012-09-19 Eric Botcazou ebotca...@adacore.com * gcc.dg/nested-func-9.c: New test. -- Eric BotcazouIndex: tree.h === --- tree.h (revision 191365) +++ tree.h (working copy) @@ -712,6 +712,9 @@ struct GTY(()) tree_base { SSA_NAME_IS_DEFAULT_DEF in SSA_NAME + + DECL_NONLOCAL_FRAME in + VAR_DECL */ struct GTY(()) tree_typed { @@ -3268,9 +3271,14 @@ extern void decl_fini_priority_insert (t libraries. */ #define MAX_RESERVED_INIT_PRIORITY 100 +/* In a VAR_DECL, nonzero if this is a global variable for VOPs. */ #define VAR_DECL_IS_VIRTUAL_OPERAND(NODE) \ (VAR_DECL_CHECK (NODE)-base.u.bits.saturating_flag) +/* In a VAR_DECL, nonzero if this is a non-local frame structure. */ +#define DECL_NONLOCAL_FRAME(NODE) \ + (VAR_DECL_CHECK (NODE)-base.default_def_flag) + struct GTY(()) tree_var_decl { struct tree_decl_with_vis common; }; Index: tree-streamer-out.c === --- tree-streamer-out.c (revision 191365) +++ tree-streamer-out.c (working copy) @@ -181,6 +181,9 @@ pack_ts_decl_common_value_fields (struct bp_pack_value (bp, expr-decl_common.off_align, 8); } + if (TREE_CODE (expr) == VAR_DECL) +bp_pack_value (bp, DECL_NONLOCAL_FRAME (expr), 1); + if (TREE_CODE (expr) == RESULT_DECL || TREE_CODE (expr) == PARM_DECL || TREE_CODE (expr) == VAR_DECL) Index: tree-nested.c === --- tree-nested.c (revision 191365) +++ tree-nested.c (working copy) @@ -235,6 +235,7 @@ get_frame_type (struct nesting_info *inf info-frame_type = type; info-frame_decl = create_tmp_var_for (info, type, FRAME); + DECL_NONLOCAL_FRAME (info-frame_decl) = 1; /* ??? Always make it addressable for now, since it is meant to be pointed to by the static chain pointer. This pessimizes Index: tree-ssa.c === --- tree-ssa.c (revision 191365) +++ tree-ssa.c (working copy) @@ -1930,6 +1930,152 @@ maybe_optimize_var (tree var, bitmap add } } + +struct walk_info_data +{ + /* Map of fields in non-local frame structures to variables. */ + struct pointer_map_t *field_map; + + /* Bitmap of variables whose address is taken. 
*/ + bitmap addresses_taken; + + /* Bitmap of variables to be renamed. */ + bitmap suitable_for_renaming; +}; + +/* Given FIELD, a field in a non-local frame structure, find or create a + variable in the current function and register it with MAP. This is + the reverse function of tree-nested.c:lookup_field_for_decl. */ + +static tree +lookup_decl_for_field (tree field, struct pointer_map_t *map, + bitmap suitable_for_renaming) +{ + void **slot = pointer_map_insert
[PATCH] Simplify get_prop_source_stmt
Pointer conversions have been useless for quite some time, so simplify get_prop_source_stmt.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2012-09-19  Richard Guenther  rguent...@suse.de

        * tree-ssa-forwprop.c (get_prop_source_stmt): Simplify.

Index: gcc/tree-ssa-forwprop.c
===================================================================
--- gcc/tree-ssa-forwprop.c (revision 191463)
+++ gcc/tree-ssa-forwprop.c (working copy)
@@ -227,29 +227,15 @@ get_prop_source_stmt (tree name, bool si
       if (!is_gimple_assign (def_stmt))
         return NULL;
 
-      /* If def_stmt is not a simple copy, we possibly found it.  */
-      if (!gimple_assign_ssa_name_copy_p (def_stmt))
+      /* If def_stmt is a simple copy, continue looking.  */
+      if (gimple_assign_rhs_code (def_stmt) == SSA_NAME)
+        name = gimple_assign_rhs1 (def_stmt);
+      else
         {
-          tree rhs;
-
           if (!single_use_only && single_use_p)
             *single_use_p = single_use;
 
-          /* We can look through pointer conversions in the search
-             for a useful stmt for the comparison folding.  */
-          rhs = gimple_assign_rhs1 (def_stmt);
-          if (CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def_stmt))
-              && TREE_CODE (rhs) == SSA_NAME
-              && POINTER_TYPE_P (TREE_TYPE (gimple_assign_lhs (def_stmt)))
-              && POINTER_TYPE_P (TREE_TYPE (rhs)))
-            name = rhs;
-          else
-            return def_stmt;
-        }
-      else
-        {
-          /* Continue searching the def of the copy source name.  */
-          name = gimple_assign_rhs1 (def_stmt);
+          return def_stmt;
         }
     } while (1);
 }
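For readers without the forwprop sources at hand, the control flow that remains after the simplification can be modelled on a toy IR. This is only a sketch under assumptions: ToyStmt, ToyCode and toy_prop_source are hypothetical names, not GCC's gimple API, and the single-use bookkeeping of the real function is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a GIMPLE statement; every name here is hypothetical.
enum class ToyCode { Copy, Plus, NotAssign };

struct ToyStmt {
  ToyCode code;  // kind of statement / right-hand side
  int lhs;       // index of the SSA name being defined
  int rhs1;      // index of the first operand, or -1 if it is not an SSA name
};

// defs[n] is the defining statement of SSA name n.
// Follow plain copies until a non-copy assignment is found; return
// nullptr when the chain ends in something that is not an assignment.
const ToyStmt *toy_prop_source(const std::vector<ToyStmt> &defs, int name) {
  for (;;) {
    if (name < 0 || static_cast<std::size_t>(name) >= defs.size())
      return nullptr;
    const ToyStmt &def = defs[name];
    if (def.code == ToyCode::NotAssign)
      return nullptr;
    if (def.code == ToyCode::Copy)
      name = def.rhs1;  // simple copy: continue looking
    else
      return &def;      // anything else is the statement we wanted
  }
}
```

The loop terminates because SSA definitions cannot form a cycle of copies; the toy model relies on the caller to respect that.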
[PATCH] Fix scan-dumps in testcase, for -Og
This fixes scans for fab with fab1 which now appears twice after adding -Og. Tested on x86_64-unknown-linux-gnu, applied. Sorry for the breakage, Richard. 2012-09-19 Richard Guenther rguent...@suse.de * gcc.dg/builtin-object-size-10.c: Adjust. * gcc.dg/builtin-unreachable-5.c: Adjust. * gcc.dg/tree-ssa/builtin-fprintf-1.c: Adjust. * gcc.dg/tree-ssa/builtin-fprintf-chk-1.c: Adjust. * gcc.dg/tree-ssa/builtin-printf-1.c: Adjust. * gcc.dg/tree-ssa/builtin-printf-chk-1.c: Adjust. * gcc.dg/tree-ssa/builtin-vfprintf-1.c: Adjust. * gcc.dg/tree-ssa/builtin-vfprintf-chk-1.c: Adjust. * gcc.dg/tree-ssa/builtin-vprintf-1.c: Adjust. * gcc.dg/tree-ssa/builtin-vprintf-chk-1.c: Adjust. * gcc.dg/tree-ssa/ssa-ccp-10.c: Adjust. * gcc.dg/vect/vec-scal-opt.c: Adjust. * gcc.dg/vect/vec-scal-opt1.c: Adjust. * gcc.dg/vect/vec-scal-opt2.c: Adjust. Index: gcc/testsuite/gcc.dg/builtin-object-size-10.c === --- gcc/testsuite/gcc.dg/builtin-object-size-10.c (revision 191466) +++ gcc/testsuite/gcc.dg/builtin-object-size-10.c (working copy) @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options -O2 -fdump-tree-objsz-details } */ +/* { dg-options -O2 -fdump-tree-objsz1-details } */ typedef struct { char sentinel[4]; @@ -21,6 +21,6 @@ foo(char *x) return dpkt; } -/* { dg-final { scan-tree-dump maximum object size 21 objsz } } */ -/* { dg-final { scan-tree-dump maximum subobject size 16 objsz } } */ -/* { dg-final { cleanup-tree-dump objsz } } */ +/* { dg-final { scan-tree-dump maximum object size 21 objsz1 } } */ +/* { dg-final { scan-tree-dump maximum subobject size 16 objsz1 } } */ +/* { dg-final { cleanup-tree-dump objsz1 } } */ Index: gcc/testsuite/gcc.dg/builtin-unreachable-5.c === --- gcc/testsuite/gcc.dg/builtin-unreachable-5.c(revision 191466) +++ gcc/testsuite/gcc.dg/builtin-unreachable-5.c(working copy) @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options -O2 -fdump-tree-fab } */ +/* { dg-options -O2 -fdump-tree-fab1 } */ int foo (int a) @@ -16,8 +16,8 @@ foo (int a) return a 0; } 
-/* { dg-final { scan-tree-dump-times if \\( 0 fab } } */ -/* { dg-final { scan-tree-dump-times goto 0 fab } } */ -/* { dg-final { scan-tree-dump-times L1: 0 fab } } */ -/* { dg-final { scan-tree-dump-times __builtin_unreachable 0 fab } } */ -/* { dg-final { cleanup-tree-dump fab } } */ +/* { dg-final { scan-tree-dump-times if \\( 0 fab1 } } */ +/* { dg-final { scan-tree-dump-times goto 0 fab1 } } */ +/* { dg-final { scan-tree-dump-times L1: 0 fab1 } } */ +/* { dg-final { scan-tree-dump-times __builtin_unreachable 0 fab1 } } */ +/* { dg-final { cleanup-tree-dump fab1 } } */ Index: gcc/testsuite/gcc.dg/tree-ssa/builtin-fprintf-1.c === --- gcc/testsuite/gcc.dg/tree-ssa/builtin-fprintf-1.c (revision 191466) +++ gcc/testsuite/gcc.dg/tree-ssa/builtin-fprintf-1.c (working copy) @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options -O2 -fdump-tree-fab } */ +/* { dg-options -O2 -fdump-tree-fab1 } */ typedef struct { int i; } FILE; FILE *fp; @@ -29,13 +29,13 @@ void test (void) vi9 = 0; } -/* { dg-final { scan-tree-dump vi0.*fwrite.*\hello\.*1, 5, fp.*vi1 fab} } */ -/* { dg-final { scan-tree-dump vi1.*fwrite.*\hellon\.*1, 6, fp.*vi2 fab} } */ -/* { dg-final { scan-tree-dump vi2.*fputc.*fp.*vi3 fab} } */ -/* { dg-final { scan-tree-dump vi3 ={v} 0\[^\(\)\]*vi4 ={v} 0 fab} } */ -/* { dg-final { scan-tree-dump vi4.*fwrite.*\hello\.*1, 5, fp.*vi5 fab} } */ -/* { dg-final { scan-tree-dump vi5.*fwrite.*\hellon\.*1, 6, fp.*vi6 fab} } */ -/* { dg-final { scan-tree-dump vi6.*fputc.*fp.*vi7 fab} } */ -/* { dg-final { scan-tree-dump vi7.*fputc.*fp.*vi8 fab} } */ -/* { dg-final { scan-tree-dump vi8.*fprintf.*fp.*\%d%d\.*vi9 fab} } */ -/* { dg-final { cleanup-tree-dump fab } } */ +/* { dg-final { scan-tree-dump vi0.*fwrite.*\hello\.*1, 5, fp.*vi1 fab1} } */ +/* { dg-final { scan-tree-dump vi1.*fwrite.*\hellon\.*1, 6, fp.*vi2 fab1} } */ +/* { dg-final { scan-tree-dump vi2.*fputc.*fp.*vi3 fab1} } */ +/* { dg-final { scan-tree-dump vi3 ={v} 0\[^\(\)\]*vi4 ={v} 0 fab1} } */ +/* { dg-final 
{ scan-tree-dump vi4.*fwrite.*\hello\.*1, 5, fp.*vi5 fab1} } */ +/* { dg-final { scan-tree-dump vi5.*fwrite.*\hellon\.*1, 6, fp.*vi6 fab1} } */ +/* { dg-final { scan-tree-dump vi6.*fputc.*fp.*vi7 fab1} } */ +/* { dg-final { scan-tree-dump vi7.*fputc.*fp.*vi8 fab1} } */ +/* { dg-final { scan-tree-dump vi8.*fprintf.*fp.*\%d%d\.*vi9 fab1} } */ +/* { dg-final { cleanup-tree-dump fab1 } } */ Index: gcc/testsuite/gcc.dg/tree-ssa/builtin-fprintf-chk-1.c === --- gcc/testsuite/gcc.dg/tree-ssa/builtin-fprintf-chk-1.c (revision
Re: [patch] split FRAME variables back into pieces
On Wed, Sep 19, 2012 at 12:58 PM, Eric Botcazou ebotca...@adacore.com wrote:
> Hi,
>
> this transformation has been in our tree for a couple of years and was originally developed for SPARKSkein (http://www.skein-hash.info/node/48), which is the implementation in SPARK (subset of Ada) of the Skein algorithm.
>
> When nested functions access local variables of their parent, the compiler creates a special FRAME local variable in the parent, which represents the non-local frame, and puts into it all the variables accessed non-locally. If these nested functions are later inlined into their parent, these FRAME variables generally remain unmodified and this has various drawbacks:
>  1) the frame of the parent is unnecessarily large,
>  2) scalarization of aggregates put into the FRAME variables is hindered,
>  3) debug info for scalars put into the FRAME variables is poor since VTA only works on GIMPLE registers.
>
> The attached patch makes it so that the compiler splits FRAME variables back into pieces when all the nested functions have been inlined. The natural place to implement the transformation would probably be the SRA pass, but this would require a special path to work around all the heuristics and the pass is already complicated enough (sorry Martin ;-) The transformation is therefore implemented as a sub-pass of execute_update_addresses_taken for technical reasons explained in the patch.
>
> Tested on x86-64/Linux, OK for the mainline?

I really don't like this to be done outside of SRA (and it is written in a non-MEM_REF way).

For the testcase in question we scalarize back 'i' in SRA (other scalars are optimized away already, but as SRA runs before DSE it still gets to see stores to FRAME.i). Now I wonder why we generate reasonable debug info even without inlining, thus there has to be an association between the original decls and the frame FIELD_DECLs. That is, lookup_decl_for_field should not be necessary and what we use for debug info generation should be used by SRA to assign a name to scalarized fields.

That alone would not solve your issue because of the 'arr' field in the structure which cannot be scalarized (moved to a stand-alone decl) by SRA. That's one missed feature of SRA though, and generally useful.

So no, I don't think this patch is the right approach.

Thanks,
Richard.

> 2012-09-19  Eric Botcazou  ebotca...@adacore.com
>
>         * tree.h (DECL_NONLOCAL_FRAME): New macro.
>         * gimple.c (gimple_ior_addresses_taken_1): Handle non-local frame structures specially.
>         * tree-nested.c (get_frame_type): Set DECL_NONLOCAL_FRAME.
>         * tree-ssa.c (lookup_decl_for_field): New static function.
>         (split_nonlocal_frames_op): Likewise.
>         (execute_update_addresses_taken): Break up non-local frame structures into variables when possible.
>         * tree-streamer-in.c (unpack_ts_decl_common_value_fields): Stream in DECL_NONLOCAL_FRAME flag.
>         * tree-streamer-out.c (pack_ts_decl_common_value_fields): Stream out DECL_NONLOCAL_FRAME flag.
>
> 2012-09-19  Eric Botcazou  ebotca...@adacore.com
>
>         * gcc.dg/nested-func-9.c: New test.
>
> --
> Eric Botcazou
[PATCHv4] rs6000: Add 2 built-ins to read the Time Base Register on PowerPC
Changes since v3:
 - specified the clobbered CC field;
 - removed the insn rs6000_get_timebase_ppc64 as it was identical to rs6000_mftb_di;
 - removed UNSPECV_GETTB as it was identical to UNSPECV_MFTB;
 - fixed indentation.

-- 8 --

Add __builtin_ppc_get_timebase and __builtin_ppc_mftb to read the Time Base Register on PowerPC. They are required by applications that measure time at high frequency and with high precision and cannot afford a syscall.

__builtin_ppc_get_timebase returns the 64 bits of the Time Base Register, while __builtin_ppc_mftb generates only 1 instruction and returns the least significant word on 32-bit environments and the whole Time Base value on 64-bit.

[gcc]
2012-09-17  Tulio Magno Quites Machado Filho  tul...@linux.vnet.ibm.com

        * config/rs6000/rs6000-builtin.def: Add __builtin_ppc_get_timebase and __builtin_ppc_mftb.
        * config/rs6000/rs6000.c (rs6000_expand_zeroop_builtin): New function to expand an expression that calls a built-in without arguments.
        (rs6000_expand_builtin): Add __builtin_ppc_get_timebase and __builtin_ppc_mftb.
        (rs6000_init_builtins): Likewise.
        * config/rs6000/rs6000.md: Likewise.
        * doc/extend.texi (PowerPC Built-in Functions): New section.
        (PowerPC AltiVec/VSX Built-in Functions): Move some built-ins unrelated to Altivec/VSX to the new section.

[gcc/testsuite]
2012-09-17  Tulio Magno Quites Machado Filho  tul...@linux.vnet.ibm.com

        * gcc.target/powerpc/ppc-get-timebase.c: New file.
        * gcc.target/powerpc/ppc-mftb.c: New file.
--- gcc/config/rs6000/rs6000-builtin.def |6 ++ gcc/config/rs6000/rs6000.c | 46 ++ gcc/config/rs6000/rs6000.md| 66 gcc/doc/extend.texi| 51 ++- .../gcc.target/powerpc/ppc-get-timebase.c | 20 ++ gcc/testsuite/gcc.target/powerpc/ppc-mftb.c| 18 + 6 files changed, 189 insertions(+), 18 deletions(-) create mode 100644 gcc/testsuite/gcc.target/powerpc/ppc-get-timebase.c create mode 100644 gcc/testsuite/gcc.target/powerpc/ppc-mftb.c diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def index c8f8f86..9fa3a0f 100644 --- a/gcc/config/rs6000/rs6000-builtin.def +++ b/gcc/config/rs6000/rs6000-builtin.def @@ -1429,6 +1429,12 @@ BU_SPECIAL_X (RS6000_BUILTIN_RSQRT, __builtin_rsqrt, RS6000_BTM_FRSQRTE, BU_SPECIAL_X (RS6000_BUILTIN_RSQRTF, __builtin_rsqrtf, RS6000_BTM_FRSQRTES, RS6000_BTC_FP) +BU_SPECIAL_X (RS6000_BUILTIN_GET_TB, __builtin_ppc_get_timebase, +RS6000_BTM_ALWAYS, RS6000_BTC_MISC) + +BU_SPECIAL_X (RS6000_BUILTIN_MFTB, __builtin_ppc_mftb, +RS6000_BTM_ALWAYS, RS6000_BTC_MISC) + /* Darwin CfString builtin. */ BU_SPECIAL_X (RS6000_BUILTIN_CFSTRING, __builtin_cfstring, RS6000_BTM_ALWAYS, RS6000_BTC_MISC) diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index a5a3848..c3bece1 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -9748,6 +9748,30 @@ rs6000_overloaded_builtin_p (enum rs6000_builtins fncode) return (rs6000_builtin_info[(int)fncode].attr RS6000_BTC_OVERLOADED) != 0; } +/* Expand an expression EXP that calls a builtin without arguments. */ +static rtx +rs6000_expand_zeroop_builtin (enum insn_code icode, rtx target) +{ + rtx pat; + enum machine_mode tmode = insn_data[icode].operand[0].mode; + + if (icode == CODE_FOR_nothing) +/* Builtin not supported on this processor. */ +return 0; + + if (target == 0 + || GET_MODE (target) != tmode + || ! (*insn_data[icode].operand[0].predicate) (target, tmode)) +target = gen_reg_rtx (tmode); + + pat = GEN_FCN (icode) (target); + if (! 
pat) +return 0; + emit_insn (pat); + + return target; +} + static rtx rs6000_expand_unop_builtin (enum insn_code icode, tree exp, rtx target) @@ -11337,6 +11361,16 @@ rs6000_expand_builtin (tree exp, rtx target, rtx subtarget ATTRIBUTE_UNUSED, ? CODE_FOR_bpermd_di : CODE_FOR_bpermd_si), exp, target); +case RS6000_BUILTIN_GET_TB: + return rs6000_expand_zeroop_builtin (CODE_FOR_rs6000_get_timebase, + target); + +case RS6000_BUILTIN_MFTB: + return rs6000_expand_zeroop_builtin (((TARGET_64BIT) + ? CODE_FOR_rs6000_mftb_di + : CODE_FOR_rs6000_mftb_si), + target); + case ALTIVEC_BUILTIN_MASK_FOR_LOAD: case ALTIVEC_BUILTIN_MASK_FOR_STORE: { @@ -11621,6 +11655,18 @@ rs6000_init_builtins (void) POWER7_BUILTIN_BPERMD, __builtin_bpermd); def_builtin (__builtin_bpermd, ftype, POWER7_BUILTIN_BPERMD);
Re: [patch] split FRAME variables back into pieces
On Wed, Sep 19, 2012 at 01:36:50PM +0200, Richard Guenther wrote:
> On Wed, Sep 19, 2012 at 12:58 PM, Eric Botcazou ebotca...@adacore.com wrote:
> I really don't like this to be done outside of SRA (and it is written in a non-MEM_REF way).
>
> For the testcase in question we scalarize back 'i' in SRA (other scalars are optimized away already, but as SRA runs before DSE it still gets to see stores to FRAME.i). Now I wonder why we generate reasonable debug info even without inlining, thus there has to be an association between the original decls and the frame FIELD_DECLs. That is, lookup_decl_for_field should not be necessary and what we use for debug info generation should be used by SRA to assign a name to scalarized fields.

For debug_info, the nested function has VAR_DECLs with DECL_VALUE_EXPR as FIELD_DECLs of the chain.

> That alone would not solve your issue because of the 'arr' field in the structure which cannot be scalarized (moved to a stand-alone decl) by SRA. That's one missed feature of SRA though, and generally useful.

I agree that SRA is the right approach for this, perhaps with the DECL_NONLOCAL_FRAME bit used by SRA to forcefully scalarize it into individual pieces (that bit should basically tell the scalarizer that valid code can't use pointer arithmetic to go from one outermost field to another outermost field, i.e. those can be safely split apart even if the whole thing is address taken).

        Jakub
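To make the thread easier to follow, here is roughly what the FRAME lowering and the proposed splitting look like when written out by hand. It is a sketch under assumptions: Frame, nested, parent and parent_split are made-up names, the fields 'i' and 'arr' only echo the ones mentioned above, and the real layout is decided by tree-nested.c.

```cpp
#include <cassert>

// Hand-written model of the non-local frame: every parent variable
// that the nested function touches is moved into this struct.
struct Frame {
  int i;
  int arr[4];
};

// The nested function after lowering: it receives the static chain
// pointer and reaches the parent's variables through it.
static int nested(Frame *chain) {
  chain->i += 1;
  return chain->arr[chain->i & 3];
}

// Parent before splitting: all accesses go through the aggregate,
// which also has its address taken for the static chain.
int parent() {
  Frame frame = {0, {10, 20, 30, 40}};
  int r = nested(&frame);
  return r + frame.i;
}

// Parent after inlining `nested` and splitting the frame: `i` is a
// plain scalar again and `arr` is a stand-alone array.
int parent_split() {
  int i = 0;
  int arr[4] = {10, 20, 30, 40};
  i += 1;
  int r = arr[i & 3];
  return r + i;
}
```

Both versions compute the same value; the second form is what lets the scalar live in a register and get ordinary debug info, which is the point of drawbacks 1) to 3) in the original submission.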
Re: [SH] Use more braced strings in MD
Oleg Endo oleg.e...@t-online.de wrote:
> Like the topic says. No functional change, just cosmetics.
> Tested on rev 191342 with
> make -k check RUNTESTFLAGS="--target_board=sh-sim\{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb}"
> and no new failures.
> OK to install?

OK.

Regards,
        kaz
Re: [SH] PR 54236 - Add another addc case
Oleg Endo oleg.e...@t-online.de wrote:
> There is another opportunity where SH's addc insn can be used.
> Tested on rev 191342 with
> make -k check RUNTESTFLAGS="--target_board=sh-sim\{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb}"
> and no new failures.
> OK to install?

OK.

Regards,
        kaz
Re: [SH] PR 54089 - Add another rotcr case
Oleg Endo oleg.e...@t-online.de wrote:
> There is another opportunity where SH's rotcr insn can be used.
> Tested on rev 191342 with
> make -k check RUNTESTFLAGS="--target_board=sh-sim\{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb}"
> and no new failures.
> OK to install?

OK.

Regards,
        kaz
Re: [PATCH] Add option for dumping to stderr (issue6190057)
On Tue, Sep 18, 2012 at 10:48 AM, Sharad Singhai sing...@google.com wrote:
> In response to the recent comments, I have updated the patch to do the following:
>  - Remove pass handling from -fopt-info
>  - Support additional flags in regular dumps
>
> I have massaged the options so that they have the following (hopefully clearer) behavior:
>
> gcc ... -fopt-info
>   --- dump all optimization info on stderr
>
> gcc ... -fopt-info-missed-optimized=file.txt
>   -- dump info about optimizations applied as well as missed opportunities to file.txt. If no file.txt is provided, then use stderr.
>
> I have enhanced regular dump flags, so that values accepted by -fopt-info are also accepted. For example,
>
> gcc ... -O2 -ftree-vectorize -fdump-tree-vect-optimized=foo.dump
>
> Now foo.dump will include the regular tree-vect dump as well as the output of -fopt-info=optimized. This way developers can get more detailed dumps when needed.

In addition? The dumping infrastructure has only one dump statement for each bit, so you make it emit things twice in some circumstances then? That doesn't sound too useful.

> I have also changed the meaning of dump option details to include optimization details. Thus -details flag implies -missed-optimized-note in addition to other dumps.

I think regular dumps should not accept the -fopt-info flags.

> The pass level filtering of -fopt-info dumps can be done in a follow up patch. It may even turn out to be unnecessary, because the equivalent effect can be achieved by -ftree-PASS-optimized-missed-note.

It can be done as a followup, but I think that is what is really useful. Directing users to -fdump-tree... should never be the answer here.

Richard.

> I have bootstrapped and tested the attached patch on x86_64 and didn't observe any new failures. Okay for trunk?
>
> Thanks,
> Sharad
Re: RFA: Process '*' in '@'-output-template alternatives
Quoting Richard Guenther richard.guent...@gmail.com: I think that needs to be documented somewhere in the internals manual, I suppose it should logically go to the current end of the Output Statement node in md.texi . Line 668 in revision 191429, just before the Predicates node. possibly with an example. AFAICT the existing examples are pieces of real machine descriptions. One possibility would be the movsi_insn from the arc-4_4-20090909-branch: (define_insn *movsi_insn [(set (match_operand:SI 0 move_dest_operand =Rcq,Rcq#q,w, w,w, w,???w, ?w, w,Rcq#q, w,Rcq, S,Us,RcqRck,!*x,r,m,???m,VUsc) (match_operand:SI 1 move_src_operand cL,cP,Rcq#q,cL,I,Crr,?Rac,Cpc,Clb,?Cal,?Cal,T,Rcq,RcqRck,Us,Usd,m,c,?Rac,C32))] register_operand (operands[0], SImode) || register_operand (operands[1], SImode) || (CONSTANT_P (operands[1]) /* Don't use a LIMM that we could load with a single insn - we loose delay-slot filling opportunities. */ !satisfies_constraint_I (operands[1]) satisfies_constraint_Usc (operands[0])) @ mov%? %0,%1% mov%? %0,%1% mov%? %0,%1% mov%? %0,%1 mov%? %0,%1 ror %0,((%1*2+1) 0x3f) mov%? %0,%1 add %0,%S1 * return arc_get_unalign () ? \add %0,pcl,%1-.+2\ : \add %0,pcl,%1-.\; mov%? %0,%S1% mov%? %0,%S1 ld%? %0,%1% st%? %1,%0% * return arc_short_long (insn, \push%? %1%\, \st%U0 %1,%0%\); * return arc_short_long (insn, \pop%? %0%\, \ld%U1 %0,%1%\); ld%? %0,%1% ld%U1%V1 %0,%1 st%U0%V0 %1,%0 st%U0%V0 %1,%0 st%U0%V0 %S1,%0 [(set_attr type move,move,move,move,move,two_cycle_core,move,binary,binary,move,move,load,store,store,load,load,load,store,store,store) (set_attr iscompact maybe,maybe,maybe,false,false,false,false,false,false,maybe_limm,false,true,true,true,true,true,false,false,false,false) ; Use default length for iscompact to mark length varying. But set length ; of Crr to 4. 
(set_attr length *,*,*,4,4,4,4,8,8,*,8,*,*,*,*,*,*,*,*,8) (set_attr cond canuse,canuse_limm,canuse,canuse,canuse_limm,canuse_limm,canuse,nocond,nocond,canuse,canuse,nocond,nocond,nocond,nocond,nocond,nocond,nocond,nocond,nocond)])

Although the number of different concepts combined here might distract a bit from the point. Also, it'll need steering committee approval to put this code, which was previously contributed under the GPL (on the branch), into GFDL documentation.

Or should I make up a reduced, synthetic example that is simpler, even if it is probably pointless as an actual output template?
Re: Rewrite lto-symtab to work on symbol table
On 2012.09.18 at 15:35 +0200, Jan Hubicka wrote:
> Hi,
> this patch reorganizes lto-symtab to work across symtab's symbol table instead of building its own. This simplifies things a bit and with the previous changes it is rather straightforward - i.e. replace all uses of lto_symtab_entry_t by symtab_node.
>
> There are a few differences between the symtab as built by lto-symtab and our symbol table. In one direction, the declarations that are not going to be output to the final assembly (i.e. are used by debug info and such) are not in the symbol table and consequently they no longer get merged. I think this is fine.
>
> The other difference is that the symbol table contains some symbols that are not really symbols in the classical definition - such as inline clones or functions held in the table only for purposes of materialization. I added the symtab_real_symbol_p predicate for this. It would make more sense to exclude those from the assembler name hash and drop the checks I added to lto-symtab.c. I plan to work on this incrementally - it is not completely trivial. The symbol can become non-real in several ways and it will need a bit of work to get this consistent.
>
> Bootstrapped/regtested x86_64-linux, tested by building Mozilla, Qt and other stuff with LTO. OK?

This patch causes:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54625

--
Markus
Re: Use conditional casting with symtab_node
On Wed, Sep 19, 2012 at 4:17 AM, Richard Guenther richard.guent...@gmail.com wrote:
> On Wed, Sep 19, 2012 at 9:29 AM, Eric Botcazou ebotca...@adacore.com wrote:
>>> The language syntax would bind the conditional into the initializer, as in
>>>
>>>   if (varpool_node *vnode = (node->try_variable ()
>>>                              && vnode->finalized))
>>>     varpool_analyze_node (vnode);
>>>
>>> which does not type-match. So, if you want the type safety and performance, the cascade is really unavoidable.
>>
>> Just write:
>>
>>   varpool_node *vnode;
>>
>>   if ((vnode = node->try_variable ()) && vnode->finalized)
>>     varpool_analyze_node (vnode);
>>
>> This has been the standard style for the past 2 decades and trading it for cascading if's is really a bad idea.
>
> Indeed. Btw, can we not provide a specialization for dynamic_cast?

No, it is a language primitive. But we can define our own operation with similar syntax that allows for specialization, whose generic implementation uses dynamic_cast.

  template <typename T, typename U>
  T* is(U* u) { return dynamic_cast<T*>(u); }

> This ->try_... looks awkward to me compared to the more familiar
>
>   vnode = dynamic_cast <varpool_node> (node)
>
> but yeah - dynamic_cast is not a template ... (but maybe there is some standard library piece that mimics it?).
>
> Richard.
>
>> --
>> Eric Botcazou
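The helper suggested in this message can be tried out in isolation. The following is a minimal sketch, assuming stand-in class names (symtab_node_model, varpool_node_model, cgraph_node_model) that are not GCC's real types; only the shape of is<T>() is taken from the message.

```cpp
#include <cassert>

// Hypothetical stand-ins for the symbol table hierarchy.
struct symtab_node_model {
  virtual ~symtab_node_model() {}
};
struct varpool_node_model : symtab_node_model {
  bool finalized = false;
};
struct cgraph_node_model : symtab_node_model {};

// Generic form of the helper; a specialization could replace
// dynamic_cast with a cheaper tag check without touching callers.
template <typename T, typename U>
T *is(U *u) {
  return dynamic_cast<T *>(u);
}

// A declaration in the condition keeps vnode scoped to the if. The
// second test has to be nested, because a declaration cannot be
// combined with && inside the condition.
bool analyzable(symtab_node_model *node) {
  if (varpool_node_model *vnode = is<varpool_node_model>(node))
    if (vnode->finalized)
      return true;
  return false;
}
```

The nested if is the "cascade" the thread argues about; the alternative shown above declares vnode first and tests `(vnode = ...) && vnode->finalized` in a single condition, at the cost of a wider scope for vnode.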
Re: RFA: Process '*' in '@'-output-template alternatives
On Wed, Sep 19, 2012 at 2:06 PM, Joern Rennecke amyl...@spamcop.net wrote: Quoting Richard Guenther richard.guent...@gmail.com: I think that needs to be documented somewhere in the internals manual, I suppose it should logically go to the current end of the Output Statement node in md.texi . Line 668 in revision 191429, just before the Predicates node. Yes. possibly with an example. AFAICT the existing examples are pieces of real machine descriptions. One possibility would be the movsi_insn from the arc-4_4-20090909-branch: (define_insn *movsi_insn [(set (match_operand:SI 0 move_dest_operand =Rcq,Rcq#q,w, w,w, w,???w, ?w, w,Rcq#q, w,Rcq, S,Us,RcqRck,!*x,r,m,???m,VUsc) (match_operand:SI 1 move_src_operand cL,cP,Rcq#q,cL,I,Crr,?Rac,Cpc,Clb,?Cal,?Cal,T,Rcq,RcqRck,Us,Usd,m,c,?Rac,C32))] register_operand (operands[0], SImode) || register_operand (operands[1], SImode) || (CONSTANT_P (operands[1]) /* Don't use a LIMM that we could load with a single insn - we loose delay-slot filling opportunities. */ !satisfies_constraint_I (operands[1]) satisfies_constraint_Usc (operands[0])) @ mov%? %0,%1% mov%? %0,%1% mov%? %0,%1% mov%? %0,%1 mov%? %0,%1 ror %0,((%1*2+1) 0x3f) mov%? %0,%1 add %0,%S1 * return arc_get_unalign () ? \add %0,pcl,%1-.+2\ : \add %0,pcl,%1-.\; mov%? %0,%S1% mov%? %0,%S1 ld%? %0,%1% st%? %1,%0% * return arc_short_long (insn, \push%? %1%\, \st%U0 %1,%0%\); * return arc_short_long (insn, \pop%? %0%\, \ld%U1 %0,%1%\); ld%? %0,%1% ld%U1%V1 %0,%1 st%U0%V0 %1,%0 st%U0%V0 %1,%0 st%U0%V0 %S1,%0 [(set_attr type move,move,move,move,move,two_cycle_core,move,binary,binary,move,move,load,store,store,load,load,load,store,store,store) (set_attr iscompact maybe,maybe,maybe,false,false,false,false,false,false,maybe_limm,false,true,true,true,true,true,false,false,false,false) ; Use default length for iscompact to mark length varying. But set length ; of Crr to 4. 
(set_attr length *,*,*,4,4,4,4,8,8,*,8,*,*,*,*,*,*,*,*,8) (set_attr cond canuse,canuse_limm,canuse,canuse,canuse_limm,canuse_limm,canuse,nocond,nocond,canuse,canuse,nocond,nocond,nocond,nocond,nocond,nocond,nocond,nocond,nocond)])

Although the number of different concepts combined here might distract a bit from the point. Also, it'll need steering committee approval to put this code, which was previously contributed under the GPL (on the branch), into GFDL documentation.

Or should I make up a reduced, synthetic example that is simpler, even if it is probably pointless as an actual output template?

Something smaller (but pointless) would be nice.

Richard.
Make cfun_push and cfun_pop also change current_function_decl
Hi,

this is my second attempt to make push_cfun and pop_cfun save and restore current_function_decl, so that code that wants to change the function context does not have to do the latter manually.

This of course enforces that cfun and current_function_decl match at push_cfun points, which is asserted, and the patch also checking-asserts that they match at pop_cfun time, except when cfun is NULL and current_function_decl has been changed in order to interact with code shared with front-ends (and in some other cases, e.g. in dwarf2out.c). However, this code does not make pushing and popping a NULL cfun cheaper because the patch is already quite big as it is. I will try doing that as a followup.

The patch is very similar to what I posted to the mailing list in August but without the two ugly spots (tricks in the Ada front end and dwarf2out.c), which I have meanwhile sorted out differently.

I have bootstrapped and tested the patch on x86_64-linux and have also LTO-built Firefox with it. OK for trunk?

Thanks,

Martin


2012-09-12  Martin Jambor  mjam...@suse.cz

        * function.c (push_cfun): Check old current_function_decl matches old cfun, set new current_function_decl to the decl of the new cfun.
        (push_struct_function): Likewise.
        (pop_cfun): Likewise.
        (allocate_struct_function): Move call to invoke_set_current_function_hook to the end of the function.
        * cfgexpand.c (estimated_stack_frame_size): Do not set and restore current_function_decl.
        * cgraph.c (cgraph_release_function_body): Likewise.
        * cgraphunit.c (cgraph_process_new_functions): Likewise.
        (cgraph_add_new_function): Likewise.
        (cgraph_analyze_function): Likewise.
        (assemble_thunk): Set cfun to NULL at the end.
        (expand_function): Move call to set_cfun downwards.
        * gimple-low.c (record_vars_into): Only check current_function_decl before possibly doing push_cfun.
        * gimplify.c (gimplify_function_tree): Do not set and restore current_function_decl.
        * ipa-inline-analysis.c (compute_inline_parameters): Likewise.
        (inline_analyze_function): Likewise.
        * ipa-prop.c (ipa_analyze_node): Likewise.
        * ipa-pure-const.c (analyze_function): Likewise.
        * lto-streamer-in.c (lto_input_function_body): Do not set current_function_decl.
        * lto-streamer-out.c (output_function): Do not set and restore current_function_decl.
        * omp-low.c (finalize_task_copyfn): Likewise.
        (expand_omp_taskreg): Likewise.
        (create_task_copyfn): Likewise, move push_cfun up quite a bit.
        * passes.c (dump_passes): Do not set and restore current_function_decl.
        (do_per_function): Likewise.
        (do_per_function_toporder): Likewise.
        * trans-mem.c (ipa_tm_scan_irr_function): Likewise.
        (ipa_tm_transform_transaction): Likewise.
        (ipa_tm_transform_clone): Likewise.
        (ipa_tm_execute): Likewise.
        * tree-emutls.c (lower_emutls_function_body): Likewise.
        * tree-inline.c (initialize_cfun): Do not call pop_cfun.
        (tree_function_versioning): Do not call push_cfun, do not set and restore current_function_decl. Remove assert checking consistency of cfun and current_function_decl.
        * tree-profile.c (tree_profiling): Do not set and restore current_function_decl.
        * tree-sra.c (convert_callers_for_node): Do not set current_function_decl.
        (convert_callers): Do not restore current_function_decl.
        (modify_function): Do not set current_function_decl.
        * tree-ssa-structalias.c (ipa_pta_execute): Do not set and restore current_function_decl.

fortran/
        * trans-decl.c (gfc_get_extern_function_decl): Push NULL cfun. Do not set and restore current_function_decl.
        (gfc_init_coarray_decl): Do not set and restore current_function_decl.

go/
        * gofrontend/gogo-tree.cc (Gogo::write_initialization_function): Do not set and restore current_function_decl.
        (Gogo::write_globals): Likewise.
        (Named_object::get_tree): Likewise.

lto/
        * lto.c (lto_materialize_function): Call push_struct_function and pop_cfun.

*** /tmp/WDLEAb_cfgexpand.c Wed Sep 19 14:29:03 2012 --- gcc/cfgexpand.c Mon Sep 17 14:48:19 2012 *** estimated_stack_frame_size (struct cgrap *** 1423,1432 HOST_WIDE_INT size = 0; size_t i; tree var; - tree old_cur_fun_decl = current_function_decl; struct function *fn = DECL_STRUCT_FUNCTION (node-symbol.decl); - current_function_decl = node-symbol.decl; push_cfun (fn); init_vars_expansion (); --- 1423,1430 *** estimated_stack_frame_size (struct cgrap *** 1446,1452 fini_vars_expansion (); pop_cfun (); - current_function_decl = old_cur_fun_decl; return size; } --- 1444,1449 ***
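
The invariant this patch aims for can be illustrated with a small stand-alone model. This is a sketch under assumptions, not the code in function.c: function_model, cfun_model, current_decl_model and the *_model functions are hypothetical, an integer id plays the role of the FUNCTION_DECL, and the NULL-cfun exceptions described in the message are deliberately ignored.

```cpp
#include <cassert>
#include <vector>

// Hypothetical miniature of the two globals and the cfun stack.
struct function_model {
  int decl;  // id standing in for the FUNCTION_DECL of this function
};

static function_model *cfun_model = nullptr;
static int current_decl_model = 0;  // 0 plays the role of NULL_TREE
static std::vector<function_model *> cfun_stack_model;

void push_cfun_model(function_model *new_cfun) {
  // The two globals are expected to agree whenever cfun is set.
  assert(cfun_model == nullptr || current_decl_model == cfun_model->decl);
  cfun_stack_model.push_back(cfun_model);
  cfun_model = new_cfun;
  current_decl_model = new_cfun ? new_cfun->decl : 0;
}

void pop_cfun_model() {
  assert(!cfun_stack_model.empty());
  cfun_model = cfun_stack_model.back();
  cfun_stack_model.pop_back();
  // Restoring the decl here is what lets callers drop their manual
  // save/restore of current_function_decl.
  current_decl_model = cfun_model ? cfun_model->decl : 0;
}
```

With this shape, a caller only writes the push and the pop, which corresponds to the removed old_cur_fun_decl lines in the cfgexpand.c hunk above.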
Patch ping^3
Hi! http://gcc.gnu.org/ml/gcc-patches/2012-08/msg01100.html - C++ -Wsizeof-pointer-memaccess support (C is already in) http://gcc.gnu.org/ml/gcc-patches/2012-09/msg00711.html - PR debug/54519 - first part of partial inlining debug info fixes Jakub
Re: [PATCHv4] rs6000: Add 2 built-ins to read the Time Base Register on PowerPC
On Wed, Sep 19, 2012 at 7:37 AM, Tulio Magno Quites Machado Filho tul...@linux.vnet.ibm.com wrote: Changes since v3: - specified the clobbered CC field; - removed the insn rs6000_get_timebase_ppc64 as it was identical to rs6000_mftb_di; - removed UNSPECV_GETTB as it was identical to UNSPECV_MFTB; - fixed indentation. -- 8 -- Add __builtin_ppc_get_timebase and __builtin_ppc_mftb to read the Time Base Register on PowerPC. They are required by applications that measure time at high frequencies with high precision that can't afford a syscall. __builtin_ppc_get_timebase returns the 64 bits of the Time Base Register while __builtin_ppc_mftb generates only 1 instruction and returns the least significant word on 32-bit environments and the whole Time Base value on 64-bit. [gcc] 2012-09-17 Tulio Magno Quites Machado Filho tul...@linux.vnet.ibm.com * config/rs6000/rs6000-builtin.def: Add __builtin_ppc_get_timebase and __builtin_ppc_mftb. * config/rs6000/rs6000.c (rs6000_expand_zeroop_builtin): New function to expand an expression that calls a built-in without arguments. (rs6000_expand_builtin): Add __builtin_ppc_get_timebase and __builtin_ppc_mftb. (rs6000_init_builtins): Likewise. * config/rs6000/rs6000.md: Likewise. * doc/extend.texi (PowerPC Built-in Functions): New section. (PowerPC AltiVec/VSX Built-in Functions): Move some built-ins unrelated to Altivec/VSX to the new section. [gcc/testsuite] 2012-09-17 Tulio Magno Quites Machado Filho tul...@linux.vnet.ibm.com * gcc.target/powerpc/ppc-get-timebase.c: New file. * gcc.target/powerpc/ppc-mftb.c: New file. Tulio, The revised patch looks very good now. Please change the condition register mode to CC -- the value technically is unsigned, but the comparison instruction is the signed version and the condition does not care about signedness. CCUNS is for bookkeeping within GCC. Also, please remove all of the spaces between assembly operands, e.g., %2, %0, %1 should be %2,%0,%1. The patch is okay with those changes. 
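A minimal sketch of how an application might use the proposed built-in to time a region of code. It assumes a PowerPC compiler that already carries this patch; the `__powerpc__`/`__PPC__` macro check and the `clock()` fallback are assumptions added only so the sketch builds on other hosts, and `read_ticks` is a hypothetical helper name.

```c
#include <assert.h>
#include <time.h>

/* Read a non-decreasing tick counter.
   On PowerPC this uses the built-in described in the patch above
   (full 64-bit Time Base value).  Elsewhere it falls back to
   clock(), which is an assumption of this sketch, not of the patch. */
static unsigned long long read_ticks(void)
{
#if defined(__GNUC__) && (defined(__powerpc__) || defined(__PPC__))
    return __builtin_ppc_get_timebase();
#else
    return (unsigned long long) clock();
#endif
}

/* Run WORK once and return the number of ticks it took. */
static unsigned long long ticks_for(void (*work)(void))
{
    unsigned long long start = read_ticks();
    work();
    return read_ticks() - start;
}
```

Converting ticks to seconds needs the time base frequency, which is platform-specific and outside the scope of this sketch.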
Obrigado, David
[PATCH] One more fab - fab1
Committed. Richard. 2012-09-19 Richard Guenther rguent...@suse.de * gcc.dg/builtin-unreachable-6.c: Adjust. Index: gcc/testsuite/gcc.dg/builtin-unreachable-6.c === --- gcc/testsuite/gcc.dg/builtin-unreachable-6.c(revision 191471) +++ gcc/testsuite/gcc.dg/builtin-unreachable-6.c(working copy) @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options -O2 -fdump-tree-fab } */ +/* { dg-options -O2 -fdump-tree-fab1 } */ void foo (int b, int c) @@ -16,6 +16,6 @@ lab2: goto *x; } -/* { dg-final { scan-tree-dump-times lab: 1 fab } } */ -/* { dg-final { scan-tree-dump-times __builtin_unreachable 1 fab } } */ -/* { dg-final { cleanup-tree-dump fab } } */ +/* { dg-final { scan-tree-dump-times lab: 1 fab1 } } */ +/* { dg-final { scan-tree-dump-times __builtin_unreachable 1 fab1 } } */ +/* { dg-final { cleanup-tree-dump fab1 } } */
Re: [patch] Fix PR rtl-optimization/54290
Eric Botcazou wrote: 2012-09-18 Eric Botcazou ebotca...@adacore.com PR rtl-optimization/54290 * reload1.c (choose_reload_regs): Also take into account secondary MEMs to remove address replacements for inherited reloads. (replaced_subreg): Move around. Looks good to me. OK if testing passes. Thanks, Ulrich -- Dr. Ulrich Weigand GNU Toolchain for Linux on System z and Cell BE ulrich.weig...@de.ibm.com
[PATCH] Move objsz pass for -Og
This moves objsz earlier so that basic constant propagation (done by copyprop) propagates its results before we fold the builtins in fab. This fixes execution failures in the builtins testsuite. Bootstrapped on x86_64-unknown-linux-gnu, full testing in progress. Richard. 2012-09-19 Richard Guenther rguent...@suse.de * passes.c (init_optimization_passes): For -Og move pass_object_sizes in between CCP and copyprop. Index: gcc/passes.c === --- gcc/passes.c (revision 191466) +++ gcc/passes.c (working copy) @@ -1528,11 +1528,13 @@ init_optimization_passes (void) NEXT_PASS (pass_lower_vector_ssa); /* Perform simple scalar cleanup which is constant/copy propagation. */ NEXT_PASS (pass_ccp); + NEXT_PASS (pass_object_sizes); + /* Copy propagation also copy-propagates constants, this is necessary + to forward object-size results properly. */ NEXT_PASS (pass_copy_prop); NEXT_PASS (pass_rename_ssa_copies); NEXT_PASS (pass_dce); /* Fold remaining builtins. */ - NEXT_PASS (pass_object_sizes); NEXT_PASS (pass_fold_builtins); /* ??? We do want some kind of loop invariant motion, but we possibly need to adjust LIM to be more friendly towards preserving accurate
Re: [libbacktrace] Fix bootstrap with gcc 4.4
On Wed, Sep 19, 2012 at 2:30 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote: Ian Lance Taylor i...@google.com writes: __sync_* support. Might be worth looking into why the test failed. On i386-pc-solaris2.*, __sync_bool_compare_and_swap_4 is missing. sparc-sun-solaris2.11 is fine, though. Ah, I see. __sync_bool_compare_and_swap only exists if compiling for 486 or above. I guess I'm not too worried about that. The following patch fixes this for me (bootstrap currently into stage2). I've removed the stdint.h includes in btest.c and dwarf.c since that's already covered by backtrace.h. Ok for mainline if that passes? I don't particularly want backtrace.h, a public header file, to depend on gstdint.h, a file created at build time. This would mean that backtrace.h can not be easily installed (it's not installed right now, but I don't want to rule that out in the future). But I guess it's OK to have that dependency on ancient systems. I installed this slightly different patch instead. Thanks. Ian 2012-09-19 Rainer Orth r...@cebitec.uni-bielefeld.de Ian Lance Taylor i...@google.com * configure.ac (GCC_HEADER_STDINT): Invoke. * backtrace.h: If we can't find stdint.h, use gstdint.h. * btest.c: Don't include stdint.h. * dwarf.c: Likewise. * configure, aclocal.m4, Makefile.in, config.h.in: Rebuild. foo.patch Description: Binary data
[PATCH] Add -Og -g to TORTURE_OPTIONS
Which passes testing now. Ok? Thanks, Richard. 2012-09-19 Richard Guenther rguent...@suse.de * lib/c-torture.exp (TORTURE_OPTIONS): Add -Og -g. Index: gcc/testsuite/lib/c-torture.exp === --- gcc/testsuite/lib/c-torture.exp (revision 191474) +++ gcc/testsuite/lib/c-torture.exp (working copy) @@ -42,7 +42,8 @@ if [info exists TORTURE_OPTIONS] { { -O3 -fomit-frame-pointer -funroll-loops } \ { -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions } \ { -O3 -g } \ - { -Os } ] + { -Os } \ + { -Og -g } ] } if [info exists ADDITIONAL_TORTURE_OPTIONS] {
Re: [C++ PATCH] -Wsizeof-pointer-memaccess warning
Hmm. This is a rather intrusive change to work around the problem of early folding, which we'd like to move away from anyway. How does it work to always keep SIZEOF_EXPR unfolded until cxx_eval_constant_expression? Jason
Re: [PATCH] Add -Og -g to TORTURE_OPTIONS
On Wed, Sep 19, 2012 at 03:51:56PM +0200, Richard Guenther wrote: Which passes testing now. Ok? Yes. Probably we'll need to retune parallel checking, but we probably need to do it anyway already now, e.g. i386.exp grew too much. Jakub
Re: [PATCH] Combine location with block using block_locations
Hi, On Wed, 19 Sep 2012, Martin Jambor wrote: (The patch does not introduce any of the asserts Michael's patch had because, as far as my grep told me, IS_UNKNOWN_LOCATION is not in trunk yet and I suppose the pre-approval does not cover introducing things like that.) Dehao's patch contains it. For current trunk it would simply be an equality test with UNKNOWN_LOCATION. I have no opinion on whether the asserts should be there or not. Ciao, Michael.
Re: [PATCH] Add -Og -g to TORTURE_OPTIONS
On Wed, 19 Sep 2012, Jakub Jelinek wrote: On Wed, Sep 19, 2012 at 03:51:56PM +0200, Richard Guenther wrote: Which passes testing now. Ok? Yes. Probably we'll need to retune parallel checking, but we probably need to do it anyway already now, e.g. i386.exp grew too much. Committed. Eventually we can shave off some time by removing some of the -O3 options: set C_TORTURE_OPTIONS [list \ { -O0 } \ { -O1 } \ { -O2 } \ { -O3 -fomit-frame-pointer } \ { -O3 -fomit-frame-pointer -funroll-loops } \ { -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions } \ { -O3 -g } \ { -Os } \ { -Og -g } ] -O3 and -Os already enable -finline-functions for example, likewise -fomit-frame-pointer is enabled on most targets. For testing coverage -funroll-all-loops should trump -funroll-loops, and rather than disabling frame-pointers I'd explicitly enable them. Thus keep { -O3 -fno-omit-frame-pointer }, { -O3 -funroll-all-loops } and { -O3 -g } (not sure why we have -g at -O3 but not at -O[012s] where it should be more common ...) Richard.
[PATCH,committed] AIX 6.1 TARGET_DEFAULT is POWER4
AIX 6.1 no longer supports POWER3 hardware, so this change adds the basic POWER4 capabilities to TARGET_DEFAULT. Bootstrapped and regression tested on powerpc-ibm-aix7.1.0.0. - David * config/rs6000/aix61.h (TARGET_DEFAULT): Add MASK_PPC_GPOPT, MASK_PPC_GFXOPT, and MASK_MFCRF. Index: aix61.h === --- aix61.h (revision 191452) +++ aix61.h (working copy) @@ -106,7 +106,7 @@ %{pthread: -D_THREAD_SAFE} #undef TARGET_DEFAULT -#define TARGET_DEFAULT 0 +#define TARGET_DEFAULT (MASK_PPC_GPOPT | MASK_PPC_GFXOPT | MASK_MFCRF) #undef PROCESSOR_DEFAULT #define PROCESSOR_DEFAULT PROCESSOR_POWER7
Re: [PATCH] Add extra location information - PR43486
On Wed, 19 Sep 2012, Arnaud Charlet wrote: OK, so you mean, instead of knowing the number of locations from the tree kind (e.g. 1 extra sloc for unary exprs, 2 for binary exprs, ...), we would encode this as part of the extra loc info? Note that the number of No, I mean that rather than the front end using an interface defined as "set the first location to A and the second location to B", it should use one meaning "set the first operand's location to A and the second operand's location to B" or "set the beginning of the range to A and the end of the range to B". The extra locations should in some way be tagged to indicate what their relation is to the expression, and users of the locations should then also look things up by tag rather than knowing that internally the first location for a binary operation has some particular semantics. As illustrated in the example I gave, there's more than one way one might naturally associate two locations with a binary operation, hence the desire for code to operate on them at a better-defined level than "first" and "second" locations. If you then want to insert a third location in the middle (the location of the operator, say) then you don't need to change everything that referred to locations symbolically. -- Joseph S. Myers jos...@codesourcery.com
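One way to picture the tagged interface suggested above is the following sketch. It is not taken from any patch in this thread: `loc_t` stands in for location_t, and the tag names, `struct extra_locs`, `extra_locs_set` and `extra_locs_get` are all hypothetical. The point is only that callers look locations up by role rather than by position, so adding an operator location later does not disturb code that asks for an operand.

```c
#include <assert.h>

/* Stand-in for GCC's location_t: an opaque integer handle. */
typedef unsigned int loc_t;
#define LOC_UNKNOWN 0u

/* Hypothetical tags naming how an extra location relates to its
   expression. */
enum loc_tag {
    LOC_TAG_OPERAND0,
    LOC_TAG_OPERAND1,
    LOC_TAG_OPERATOR,
    LOC_TAG_RANGE_START,
    LOC_TAG_RANGE_END
};

struct tagged_loc {
    enum loc_tag tag;
    loc_t loc;
};

#define MAX_EXTRA_LOCS 8

/* Extra locations attached to one expression, stored as (tag, loc)
   pairs instead of a bare positional array. */
struct extra_locs {
    unsigned char len;
    struct tagged_loc locus[MAX_EXTRA_LOCS];
};

/* Set or overwrite the location recorded for TAG.
   Returns 0 on success, -1 if the table is full. */
static int extra_locs_set(struct extra_locs *e, enum loc_tag tag, loc_t loc)
{
    for (unsigned char i = 0; i < e->len; i++) {
        if (e->locus[i].tag == tag) {
            e->locus[i].loc = loc;
            return 0;
        }
    }
    if (e->len >= MAX_EXTRA_LOCS)
        return -1;
    e->locus[e->len].tag = tag;
    e->locus[e->len].loc = loc;
    e->len++;
    return 0;
}

/* Look a location up by tag; LOC_UNKNOWN if none was recorded. */
static loc_t extra_locs_get(const struct extra_locs *e, enum loc_tag tag)
{
    for (unsigned char i = 0; i < e->len; i++)
        if (e->locus[i].tag == tag)
            return e->locus[i].loc;
    return LOC_UNKNOWN;
}
```

A linear scan is used because an expression would carry only a handful of extra locations; a real implementation might choose a different layout.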
Re: RFA: Process '*' in '@'-output-template alternatives
Quoting Richard Guenther richard.guent...@gmail.com: Something smaller (but pointless) woudl be nice. I have attached the patch with the example added to the documentation. Again, bootstrapped on i686-pc-linux-gnu. 2011-09-19 Jorn Rennecke joern.renne...@arc.com * genoutput.c (process_template): Process '*' in '@' alternatives. * doc/md.texi (node Output Statement): Provide example for the above. Index: doc/md.texi === --- doc/md.texi (revision 191429) +++ doc/md.texi (working copy) @@ -665,6 +665,22 @@ (define_insn @end group @end smallexample +If you just need a little bit of C code in one (or a few) alternatives, +you can use @samp{*} inside of a @samp{@@} multi-alternative template: + +@smallexample +@group +(define_insn + [(set (match_operand:SI 0 general_operand =r,,m) +(const_int 0))] + + @@ + clrreg %0 + * return stack_mem_p (operands[0]) ? \push 0 : \clrmem %0\; + clrmem %0) +@end group +@end smallexample + @node Predicates @section Predicates @cindex predicates Index: genoutput.c === --- genoutput.c (revision 191429) +++ genoutput.c (working copy) @@ -662,19 +662,55 @@ process_template (struct data *d, const list of assembler code templates, one for each alternative. 
*/ else if (template_code[0] == '@') { - d-template_code = 0; - d-output_format = INSN_OUTPUT_FORMAT_MULTI; + int found_star = 0; - printf (\nstatic const char * const output_%d[] = {\n, d-code_number); + for (cp = template_code[1]; *cp; ) + { + while (ISSPACE (*cp)) + cp++; + if (*cp == '*') + found_star = 1; + while (!IS_VSPACE (*cp) *cp != '\0') + ++cp; + } + d-template_code = 0; + if (found_star) + { + d-output_format = INSN_OUTPUT_FORMAT_FUNCTION; + puts (\nstatic const char *); + printf (output_%d (rtx *operands ATTRIBUTE_UNUSED, + rtx insn ATTRIBUTE_UNUSED)\n, d-code_number); + puts ({); + puts ( switch (which_alternative)\n{); + } + else + { + d-output_format = INSN_OUTPUT_FORMAT_MULTI; + printf (\nstatic const char * const output_%d[] = {\n, + d-code_number); + } for (i = 0, cp = template_code[1]; *cp; ) { - const char *ep, *sp; + const char *ep, *sp, *bp; while (ISSPACE (*cp)) cp++; - printf ( \); + bp = cp; + if (found_star) + { + printf (case %d:, i); + if (*cp == '*') + { + printf (\n ); + cp++; + } + else + printf ( return \); + } + else + printf ( \); for (ep = sp = cp; !IS_VSPACE (*ep) *ep != '\0'; ++ep) if (!ISSPACE (*ep)) @@ -690,7 +726,18 @@ process_template (struct data *d, const cp++; } - printf (\,\n); + if (!found_star) + puts (\,); + else if (*bp != '*') + puts (\;); + else + { + /* The usual action will end with a return. +If there is neither break or return at the end, this is +assumed to be intentional; this allows to have multiple +consecutive alternatives share some code. */ + puts (); + } i++; } if (i == 1) @@ -700,7 +747,10 @@ process_template (struct data *d, const error_with_line (d-lineno, wrong number of alternatives in the output template); - printf (};\n); + if (found_star) + puts ( default: gcc_unreachable ();\n}\n}); + else + printf (};\n); } else {
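For the three-alternative example added to md.texi, the function that genoutput emits once it sees a '*' alternative would have roughly the shape sketched below. This is a standalone imitation, not generator output: `struct operand`, the `on_stack` field, this `stack_mem_p` and the name `output_example` are hypothetical stand-ins, and `abort ()` takes the place of `gcc_unreachable ()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for GCC internals used by the example. */
static int which_alternative;

struct operand {
    bool on_stack;
};

static bool stack_mem_p(const struct operand *op)
{
    return op->on_stack;
}

/* Shape of the emitted function: plain template lines become
   "return <string>;" cases, while a '*' line is pasted in as C code
   that is expected to end with its own return. */
static const char *output_example(const struct operand *operands)
{
    switch (which_alternative) {
    case 0:
        return "clrreg %0";
    case 1:
        return stack_mem_p(&operands[0]) ? "push 0" : "clrmem %0";
    case 2:
        return "clrmem %0";
    default:
        abort();
    }
}
```

Without any '*' alternative the generator keeps emitting the plain array of strings, so existing patterns are unaffected.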
Go patch committed: Ignore byte-order-mark at start of file
For convenience with some Windows editors, the other Go compiler was modified to permit a byte-order-mark (0xfeff) at the start of a .go file. This byte-order-mark is meaningless when using UTF-8, but apparently some editors introduce it anyhow. This patch changes the gccgo frontend to also ignore the mark at the start of a file. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline. Will commit to 4.7 branch when it is open for commits. Ian diff -r 858f39eca5c5 go/lex.cc --- a/go/lex.cc Tue Sep 18 22:24:35 2012 -0700 +++ b/go/lex.cc Wed Sep 19 08:45:16 2012 -0700 @@ -722,7 +722,16 @@ unsigned int ci; bool issued_error; this-lineoff_ = p - this-linebuf_; - this-advance_one_utf8_char(p, ci, issued_error); + const char *pnext = this-advance_one_utf8_char(p, ci, +issued_error); + + // Ignore byte order mark at start of file. + if (ci == 0xfeff this-lineno_ == 1 this-lineoff_ == 0) + { + p = pnext; + break; + } + if (Lex::is_unicode_letter(ci)) return this-gather_identifier();
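The same idea can be sketched outside the lexer. In UTF-8 the byte-order-mark U+FEFF is the three bytes EF BB BF, so skipping it at the very start of a buffer is a short prefix check. The helper name `skip_utf8_bom` is hypothetical, and unlike the patch above this sketch works on raw bytes rather than on decoded characters.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return a pointer just past a leading UTF-8 byte-order-mark
   (U+FEFF, encoded as EF BB BF).  If BUF does not start with one,
   return BUF unchanged.  Only the start of the buffer is examined. */
static const char *skip_utf8_bom(const char *buf, size_t len)
{
    static const unsigned char bom[3] = { 0xEF, 0xBB, 0xBF };

    if (len >= sizeof bom && memcmp(buf, bom, sizeof bom) == 0)
        return buf + sizeof bom;
    return buf;
}
```

A mark that appears anywhere other than the start of the file is left alone, which matches the line 1, offset 0 condition in the patch.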
[C++ Patch / RFC] PR 52432
Hi all, Jason, the large testcase attached by Jon to this PR triggers an Internal compiler error: Error reporting routines re-entered. from -fdump-tree-gimple. The immediate cause seems obvious: in one place in tsubst_copy_and_build we are calling unqualified_name_lookup_error unconditionally, thus irrespective of complain. Changing that indeed avoids the ICE. Then I come to the existing testcase which doesn't pass as-is: it's the testcase added by Jason as part of fixing 50075, which, as analyzed by Jason himself, was clearly about an endless recursion. Currently, however, we error out with that unqualified_name_lookup_error, once, and we don't mention the recursion in the error message. With the patchlet applied, the diagnostics actually shows the recursion, and we error out few lines above in tsubst_copy_and_build, with the error messages koenig lookup related. That makes some sense to me, but the testcase needs tweaking. Thanks, Paolo. Index: testsuite/g++.dg/cpp0x/decltype32.C === --- testsuite/g++.dg/cpp0x/decltype32.C (revision 191479) +++ testsuite/g++.dg/cpp0x/decltype32.C (working copy) @@ -3,10 +3,10 @@ template typename T auto make_array(const T il) - -decltype(make_array(il)) // { dg-error not declared } +decltype(make_array(il))// { dg-error not declared|no matching|exceeds } { } int main() { - int z = make_array(1); // { dg-error no match } + int z = make_array(1);// { dg-error no matching } } Index: cp/pt.c === --- cp/pt.c (revision 191483) +++ cp/pt.c (working copy) @@ -13771,7 +13771,8 @@ tsubst_copy_and_build (tree t, } if (TREE_CODE (function) == IDENTIFIER_NODE) { - unqualified_name_lookup_error (function); + if (complain tf_error) + unqualified_name_lookup_error (function); release_tree_vector (call_args); RETURN (error_mark_node); }
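The pattern behind the pt.c change, failing in the same way but reporting only when the caller asked for diagnostics, can be sketched in isolation. The flag names, `errors_reported` and both functions below are hypothetical and only imitate the `complain & tf_error` test; they are not the front end's real types.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical flag set, loosely modelled on a "complain" bitmask. */
enum sketch_flags {
    SK_NONE  = 0,
    SK_ERROR = 1 << 0   /* caller wants hard errors reported */
};

static int errors_reported;

/* Emit a diagnostic and count it. */
static void report_lookup_error(const char *name)
{
    fprintf(stderr, "error: '%s' was not declared in this scope\n", name);
    errors_reported++;
}

/* Lookup always fails in this sketch.  The failure value is returned
   either way; the diagnostic is emitted only if COMPLAIN requests it,
   so a speculative (SFINAE-like) caller stays silent. */
static int resolve_unknown_name(const char *name, int complain)
{
    if (complain & SK_ERROR)
        report_lookup_error(name);
    return -1;
}
```

Keeping the diagnostic out of the silent path is also what avoids re-entering the error reporting routines while another message is being produced.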
[patch,wwwdocs,4.6,committed]
Applied the following patch that moved the note about the progmem attribute for AVR from the improvements section to caveats. Johann * gcc-4.6/changes.html (avr): Move note on progmem attribute from improvements to caveats. Index: gcc-4.6/changes.html === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.6/changes.html,v retrieving revision 1.142 diff -u -p -r1.142 changes.html --- gcc-4.6/changes.html 10 Aug 2012 16:25:46 - 1.142 +++ gcc-4.6/changes.html 19 Sep 2012 15:52:01 - @@ -77,6 +77,10 @@ by this change. (This change affects GCC versions 4.6.4 and later, with the exception of versions 4.7.0 and 4.7.1.)</li> +<li id="avr">On AVR, variables with the <code>progmem</code> +attribute to locate data in flash memory must be qualified +as <code>const</code>.</li> + <li><p>Support for a number of older systems and recently unmaintained or untested target ports of GCC has been declared obsolete in GCC 4.6. Unless there is activity to revive them, the @@ -801,13 +805,6 @@ default.</li> </ul> -<a name="avr"></a> -<h3 id="avr">AVR</h3> - <ul> -<li>Variables with the <code>progmem</code> attribute to locate data - in flash memory must be qualified as <code>const</code>.</li> - </ul> - <h3>IA-32/x86-64</h3> <ul> <li>
[patch, mips] Fix for PR 54619, GCC aborting with -O -mips16
While building newlib with -O2 -mips16 I ran into a problem with GCC aborting due to the compiler trying to execute '0 % 0' in mips16_unextended_reference_p. The problem was that the code checked for BLKmode and skipped the modulo operation in that case because GET_MODE_SIZE (BLKmode) is zero, but it didn't check for VOIDmode, whose size is also zero. Rather than add a check for VOIDmode I changed the check to 'GET_MODE_SIZE (mode) != 0'. While looking at the code I also noticed that if offset was zero then we should return true even if mode is BLKmode or VOIDmode. Returning false would not generate bad code or cause problems, but returning true would result in a better costing model in mips_address_insns, so I made that change too. Tested on a mips elf target, OK to check in? Steve Ellcey sell...@mips.com 2012-09-19 Steve Ellcey sell...@mips.com PR target/54619 * config/mips/mips.c (mips16_unextended_reference_p): Check for zero offset and zero size mode. diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c index 7f9df4c..ecdb811 100644 --- a/gcc/config/mips/mips.c +++ b/gcc/config/mips/mips.c @@ -2230,7 +2230,10 @@ static bool mips16_unextended_reference_p (enum machine_mode mode, rtx base, unsigned HOST_WIDE_INT offset) { - if (mode != BLKmode && offset % GET_MODE_SIZE (mode) == 0) + if (offset == 0) +return true; + + if (GET_MODE_SIZE (mode) != 0 && offset % GET_MODE_SIZE (mode) == 0) { if (GET_MODE_SIZE (mode) == 4 && base == stack_pointer_rtx) return offset < 256U * GET_MODE_SIZE (mode);
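The guard described above reduces to checking the divisor before taking the remainder. The sketch below isolates just that part; `offset_is_aligned` is a hypothetical helper, a size of 0 stands in for modes such as BLKmode and VOIDmode, and the range checks of the real function are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Return true when OFFSET is a multiple of SIZE.
   A SIZE of 0 models a mode with no size; it must never reach the
   '%' operator, because a zero divisor is undefined behaviour.
   Following the patch, a zero offset is accepted for any size. */
static bool offset_is_aligned(unsigned long offset, unsigned int size)
{
    if (offset == 0)
        return true;
    if (size == 0)
        return false;
    return offset % size == 0;
}
```

Testing the size directly rather than comparing against individual mode names covers every zero-sized mode at once, which is the reason given for the change.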
[Patch] catch builtin_bswap16 construct
Hi, The attached patch catches C constructs: (A 8) | (A 8) where A is unsigned 16 bits and maps them to builtin_bswap16(A) which can provide more efficient implementations on some targets. The construct above is equivalent to the default bswap16 implementation. I have added a testcase for ARM, and have found no regression with qemu-arm on arm-none-linux-gnueabi. OK? Christophe 2012-09-19 Christophe Lyon christophe.l...@linaro.org gcc/ * fold-const.c (fold_binary_loc): call builtin_bswap16 when the equivalent construct is detected. gcc/testsuite/ * gcc.target/arm/builtin-bswap-2.c: New testcase. diff --git a/gcc/fold-const.c b/gcc/fold-const.c index 2bf5179..0ff7e8b 100644 --- a/gcc/fold-const.c +++ b/gcc/fold-const.c @@ -10326,6 +10326,99 @@ fold_binary_loc (location_t loc, } } + /* Catch bswap16 construct: (A 8) | (A 8) where A is + unsigned 16 bits. + This has been expanded into: +(ior:SI (lshift:SI (nop:SI A:HI) 8) +(nop:SI (rshift:HI A:HI 8))) + */ + { + enum tree_code code0, code1; + tree my_arg0 = arg0; + tree my_arg1= arg1; + tree rtype; + + code0 = TREE_CODE (arg0); + code1 = TREE_CODE (arg1); + if (code1 == NOP_EXPR) + { + my_arg1 = TREE_OPERAND (arg1, 0); + code1 = TREE_CODE (my_arg1); + } + else if (code0 == NOP_EXPR) + { + my_arg0 = TREE_OPERAND (arg0, 0); + code0 = TREE_CODE (my_arg0); + } + + /* Handle (A C1) + (A C1). */ + if ((code1 == RSHIFT_EXPR code0 == LSHIFT_EXPR) + (TREE_CODE (TREE_OPERAND (my_arg0, 0)) == NOP_EXPR) + operand_equal_p (TREE_OPERAND (TREE_OPERAND (my_arg0, 0), 0), +TREE_OPERAND (my_arg1, 0), 0) + (rtype = TREE_TYPE (TREE_OPERAND (my_arg1, 0)), +TYPE_UNSIGNED (rtype))) + { + tree tree01, tree11; + enum tree_code code01, code11; + + tree01 = TREE_OPERAND (my_arg0, 1); + tree11 = TREE_OPERAND (my_arg1, 1); + STRIP_NOPS (tree01); + STRIP_NOPS (tree11); + code01 = TREE_CODE (tree01); + code11 = TREE_CODE (tree11); + + /* Check that shift amount is 8, and input 16 bits wide. 
*/ + if (code01 == INTEGER_CST +code11 == INTEGER_CST +TREE_INT_CST_HIGH (tree01) == 0 +TREE_INT_CST_HIGH (tree11) == 0 +TREE_INT_CST_LOW (tree01) == TREE_INT_CST_LOW (tree11) +TREE_INT_CST_LOW (tree01) == 8 +TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (my_arg1, 0))) == 16) + { + tree bswapfn = builtin_decl_explicit (BUILT_IN_BSWAP16); + return build_call_expr_loc (loc, bswapfn, 1, + TREE_OPERAND (my_arg1, 0)); + } + } + + /* Handle (A C1) + (A C1). */ + else if ((code0 == RSHIFT_EXPR code1 == LSHIFT_EXPR) + (TREE_CODE (TREE_OPERAND (my_arg1, 0)) == NOP_EXPR) + operand_equal_p (TREE_OPERAND (TREE_OPERAND (my_arg1, 0), + 0), +TREE_OPERAND (my_arg0, 0), 0) + (rtype = TREE_TYPE (TREE_OPERAND (my_arg0, 0)), +TYPE_UNSIGNED (rtype))) + { + tree tree01, tree11; + enum tree_code code01, code11; + + tree01 = TREE_OPERAND (my_arg0, 1); + tree11 = TREE_OPERAND (my_arg1, 1); + STRIP_NOPS (tree01); + STRIP_NOPS (tree11); + code01 = TREE_CODE (tree01); + code11 = TREE_CODE (tree11); + + /* Check that shift amount is 8, and input 16 bits wide. */ + if (code01 == INTEGER_CST +code11 == INTEGER_CST +TREE_INT_CST_HIGH (tree01) == 0 +TREE_INT_CST_HIGH (tree11) == 0 +TREE_INT_CST_LOW (tree01) == TREE_INT_CST_LOW (tree11) +TREE_INT_CST_LOW (tree01) == 8 +TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (my_arg0, 0))) == 16) + { + tree bswapfn = builtin_decl_explicit (BUILT_IN_BSWAP16); + return build_call_expr_loc (loc, bswapfn, 1, + TREE_OPERAND (my_arg0, 0)); + } + } + } + associate: /* In most languages, can't associate operations on floats through parentheses. Rather than remember where the parentheses were, we diff --git a/gcc/testsuite/gcc.target/arm/builtin-bswap-2.c b/gcc/testsuite/gcc.target/arm/builtin-bswap-2.c new file mode 100644 index 000..93dbb35 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/builtin-bswap-2.c @@ -0,0 +1,25 @@ +/* { dg-do compile } */ +/* { dg-options -O2 } */ +/*
Re: [patch, mips] Patch for new mips triplet - mips-mti-elf
Richard, Here is a new copy of my mips-mti-elf patch. I removed SYSROOT_SUFFIX_SPEC since that was unneeded as you said and changed DRIVER_SELF_SPECS to set the ABI to n32 by default on mips64 targets like mips-sde-elf does. I also updated t-mti-elf to add mips16 and mabi=64 targets and verified that they worked. That also entailed adding some MULTILIB_EXCEPTIONS entries to t-mti-elf. Here is a new patch for the gcc directory changes (config.gcc, config/mips/mti-elf.h, config/mips/t-mti-elf). I have not included the top-level changes and the testsuite change that were in my initial submission since those haven't changed since then. Once I have gotten this approved and checked in I will go back and revisit mips-mti-linux-gnu and see if I can implement your ideas for that target (making n32 the default ABI for mips64 targets and using the IRIX style sysroot directory setup. Steve Ellcey sell...@mips.com 2012-09-19 Steve Ellcey sell...@mips.com * config.gcc (mips*-mti-elf*): New target. * config/mips/mti-elf.h: New file. * config/mips/t-mti-elf: New file. diff --git a/gcc/config.gcc b/gcc/config.gcc index ba366b3..9f5e170 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -1741,6 +1741,11 @@ mips*-*-linux*) # Linux MIPS, either endian. esac test x$with_llsc != x || with_llsc=yes ;; +mips*-mti-elf*) + tm_file=elfos.h newlib-stdint.h ${tm_file} mips/elf.h mips/sde.h mips/mti-elf.h + tmake_file=mips/t-mti-elf + tm_defines=${tm_defines} MIPS_ISA_DEFAULT=33 MIPS_ABI_DEFAULT=ABI_32 + ;; mips*-sde-elf*) tm_file=elfos.h newlib-stdint.h ${tm_file} mips/elf.h mips/sde.h tmake_file=mips/t-sde diff --git a/gcc/config/mips/mti-elf.h b/gcc/config/mips/mti-elf.h new file mode 100644 index 000..f6b38a5 --- /dev/null +++ b/gcc/config/mips/mti-elf.h @@ -0,0 +1,43 @@ +/* Target macros for mips*-mti-elf targets. + Copyright (C) 2012 + Free Software Foundation, Inc. + +This file is part of GCC. 
+ +GCC is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 3, or (at your option) +any later version. + +GCC is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +http://www.gnu.org/licenses/. */ + +#undef DRIVER_SELF_SPECS +#define DRIVER_SELF_SPECS \ + /* Make sure a -mips option is present. This helps us to pick \ + the right multilib, and also makes the later specs easier \ + to write. */ \ + MIPS_ISA_LEVEL_SPEC, \ + \ + /* Infer the default float setting from -march. */ \ + MIPS_ARCH_FLOAT_SPEC, \ + \ + /* Infer the -msynci setting from -march if not explicitly set. */ \ + MIPS_ISA_SYNCI_SPEC, \ + \ + /* If no ABI option is specified, infer one from the ISA level \ + or -mgp setting. */ \ + %{!mabi=*: %{ MIPS_32BIT_OPTION_SPEC : -mabi=32;: -mabi=n32}}, \ + \ + /* Make sure that an endian option is always present. This makes\ + things like LINK_SPEC easier to write. */ \ + %{!EB:%{!EL:%(endian_spec)}}, \ + \ + /* Configuration-independent MIPS rules. */ \ + BASE_DRIVER_SELF_SPECS diff --git a/gcc/config/mips/t-mti-elf b/gcc/config/mips/t-mti-elf new file mode 100644 index 000..d1d975a --- /dev/null +++ b/gcc/config/mips/t-mti-elf @@ -0,0 +1,35 @@ +# Copyright (C) 2012 Free Software Foundation, Inc. +# +# This file is part of GCC. +# +# GCC is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 3, or (at your option) +# any later version. 
+# +# GCC is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more
Re: [patch, mips] Patch for new mips triplet - mips-mti-elf
Steve Ellcey sell...@mips.com writes: 2012-09-19 Steve Ellcey sell...@mips.com * config.gcc (mips*-mti-elf*): New target. * config/mips/mti-elf.h: New file. * config/mips/t-mti-elf: New file. OK, thanks. Richard
[patch testsuite]: Fix failing test for llp64 in gcc.dg/tree-ssa
Hi, this patch fixes testsuite failures for llp64 targets in the gcc.dg/tree-ssa testsuite. ChangeLog 2012-09-19 Kai Tietz * gcc.dg/tree-ssa/scev-3.c: Add llp64 to xfail. * gcc.dg/tree-ssa/scev-4.c: Likewise. OK to apply? Regards, Kai Index: scev-3.c === --- scev-3.c (revision 191443) +++ scev-3.c (working copy) @@ -14,5 +14,5 @@ } } -/* { dg-final { scan-tree-dump-times a 1 optimized { xfail lp64 } } } */ +/* { dg-final { scan-tree-dump-times a 1 optimized { xfail { lp64 || llp64 } } } } */ /* { dg-final { cleanup-tree-dump optimized } } */ Index: scev-4.c === --- scev-4.c (revision 191443) +++ scev-4.c (working copy) @@ -19,5 +19,5 @@ } } -/* { dg-final { scan-tree-dump-times a 1 optimized { xfail lp64 } } } */ +/* { dg-final { scan-tree-dump-times a 1 optimized { xfail { lp64 || llp64 } } } } */ /* { dg-final { cleanup-tree-dump optimized } } */
Re: [Patch] catch builtin_bswap16 construct
On Wed, 2012-09-19 at 18:44 +0200, Georg-Johann Lay wrote: This seems overly complicated on 16-bit platforms where (A << 8) | (A >> 8) is the same as rotate:HI (A 8). Does your patch make sure that (A << 8) | (A >> 8) is still mapped to ROTATE on these targets? If I remember correctly, the bswap16 builtin expansion will try to use a rotate pattern, if no HImode bswap is present. Cheers, Oleg
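The equivalence this thread relies on is easy to check in plain C: for an unsigned 16-bit value, (A << 8) | (A >> 8) is both a byte swap and a rotate by 8. The helper names below are hypothetical, and the sketch deliberately avoids calling any compiler built-in so that it does not depend on the patch being applied.

```c
#include <assert.h>
#include <stdint.h>

/* The source-level construct the patch looks for. */
static uint16_t swap16_shift_or(uint16_t a)
{
    return (uint16_t) (((uint32_t) a << 8) | (a >> 8));
}

/* 16-bit rotate left by N bits (N is reduced modulo 16). */
static uint16_t rotl16(uint16_t a, unsigned n)
{
    n &= 15;
    if (n == 0)
        return a;
    return (uint16_t) (((uint32_t) a << n) | (a >> (16 - n)));
}

/* Reference byte swap written byte by byte. */
static uint16_t swap16_bytes(uint16_t a)
{
    uint8_t lo = (uint8_t) (a & 0xFF);
    uint8_t hi = (uint8_t) (a >> 8);
    return (uint16_t) (((uint32_t) lo << 8) | hi);
}
```

The casts to uint32_t keep the left shifts in an unsigned type; the final cast to uint16_t drops the bits shifted out of the low 16.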
Re: [C++ Patch] for c++/54537
Hi Paolo, 2012/9/19 Paolo Carlini paolo.carl...@oracle.com: On 09/19/2012 01:24 AM, Paolo Carlini wrote: On 09/19/2012 01:12 AM, Paolo Carlini wrote: Hi again, On 09/18/2012 08:33 PM, Paolo Carlini wrote: But I'm not surprised, frankly, I think the conflict is expected, *IF* (please check) TR1 says that those three overloads, for float, double and long double, must be declared in std::tr1 (likewise for all the other math functions). Now, given that our implementation has the C math.h injecting stuff in the global namespace - and that is legal - I would say that there is nothing to fix, maybe just a library testcase to tweak. As a matter of QoI the idea of having in tr1 using std::pow seems good, if this is what you are suggesting. I found the time to go (again, hopefully it's the last time ;) through tr1/cmath, and now I understand why we have an issue with pow only, not with the other functions. Ok. On the other hand - because of that comment which you mentioned earlier - I don't think we can just have using std::pow in namespace std::tr1. I'm coming to the conclusion that having a using ::pow, instead of the open-coded pow (double, double), could work. Did you try that together with your front-end patch? Yes, I tried that last evening (using ::pow), it passes regtest as usual -- probably by lack of testsuite coverage. And 'using std::pow' simply does not work. But I'm afraid this is still not completely correct, because if the user code has a using std::pow in the global namespace and then includes tr1/cmath, the latter drags again into namespace std::tr1 the overloads pow (*, int) which we don't want there... gr I'll add a testcase for this, if you agree. You know what? All in all, I think we can go with your original idea of just removing the overload for (double, double): what I didn't realize the first time I saw the idea is that we have anyway the templatized pow which forwards to std::pow. Thus, I suppose things should work pretty well. 
But please add a big comment before the commented-out overload. And let's see if over the next months somebody complains; otherwise, I think it will be enough for TR1, at this point. Thanks for your patience!!! It seems reasonable, I'll update the patch soon with the comment that you suggest. Thank you for your immediate feedback! -- Fabien
Re: [testsuite] support using target and xfail together
Ping. On 07/24/2012 11:13 AM, Janis Johnson wrote: This patch allows the use of both target and xfail in the selector of any test directive that currently takes either target or xfail: { target selector1 xfail selector2 } The test is only used if the target selector is matched, and the test is expected to fail if the xfail selector is matched. The keyword target must come first; it doesn't make sense to me otherwise because the xfail part shouldn't be processed if the target selector doesn't match. Tested with i686-pc-linux-gnu for c,c++,gfortran,objc,obj-c++ plus with examples using the new feature, with and without errors. I'd like some feedback before checking this in so I'll wait at least a couple of days. I plan to put it on the 4.7 branch also. I'd still like some feedback; I guess I lied when I said I'd check it in anyway. Janis
Re: [patch, mips] Fix for PR 54619, GCC aborting with -O -mips16
Steve Ellcey sell...@mips.com writes: While building newlib with -O2 -mips16 I ran into a problem with GCC aborting due to the compiler trying to execute '0 % 0' in mips16_unextended_reference_p. The problem was that the code checked for BLKmode and skipped the modulo operation in that case because GET_MODE_SIZE (BLKmode) is zero but it didn't check for VOIDmode whose size is also zero. Rather than add a check for VOIDmode I changed the check to 'GET_MODE_SIZE (mode) != 0'. While looking at the code I also noticed that if offset was zero then we should return true even if mode is BLKmode or VOIDmode. Returning false would not generate bad code or cause problems but returning true would result in a better costing model in mips_address_insns so I made that change too. This came in with the recent change to pass the mode to address_cost. I hadn't realised when asking for the associated mips_address_cost change that address_cost sometimes gets passed VOIDmode. mips_address_insns shouldn't have to deal with VOIDmode. The quick hack I'd been using locally is to add: if (mode == VOIDmode) mode = SImode; to mips_address_cost, which restores the previous behaviour for this case. But the documentation says: This hook is never called with an invalid address. Since VOIDmode MEMs aren't valid, I think that should mean it's invalid to call this hook (and rtlanal.c:address_cost) with VOIDmode. I never got time to look at that though. (The culprit in the case I saw was tree-ssa-loop-ivopts.c. Sandra had some improvements in this area, so maybe they would fix this too.) Loading or storing BLKmode doesn't map to any instruction, so I don't think returning true for zero is any better than what we do now. Richard
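To make the failure mode concrete, here is a minimal sketch of the guard Steve describes, assuming a hypothetical helper that takes the mode size directly. The name offset_is_mode_aligned and its parameters are illustrative only and are not taken from mips.c; a size of zero stands in for both BLKmode and VOIDmode.

```cpp
#include <cassert>

// Hypothetical stand-in for the alignment test in
// mips16_unextended_reference_p. mode_size plays the role of
// GET_MODE_SIZE (mode); it is 0 for both BLKmode and VOIDmode.
bool offset_is_mode_aligned(long offset, long mode_size)
{
  // Test the size, not one particular mode, so that every zero-sized
  // mode is caught before the modulo would evaluate offset % 0.
  if (mode_size == 0)
    return false;
  return offset % mode_size == 0;
}
```

Checking the size rather than comparing against BLKmode alone means any other zero-sized mode is covered without a further special case.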
Re: [testsuite] support using target and xfail together
On Sep 19, 2012, at 10:35 AM, Janis Johnson janis_john...@mentor.com wrote: On 07/24/2012 11:13 AM, Janis Johnson wrote: This patch allows the use of both target and xfail in the selector of any test directive that currently takes either target or xfail: { target selector1 xfail selector2 } The test is only used if the target selector is matched, and the test is expected to fail if the xfail selector is matched. The keyword target must come first; it doesn't make sense to me otherwise because the xfail part shouldn't be processed if the target selector doesn't match. Tested with i686-pc-linux-gnu for c,c++,gfortran,objc,obj-c++ plus with examples using the new feature, with and without errors. I'd like some feedback before checking this in so I'll wait at least a couple of days. I plan to put it on the 4.7 branch also. I'd still like some feedback; I guess I lied when I said I'd check it in anyway. I like it. What's not to like?
Re: [testsuite] support using target and xfail together
On 09/19/2012 10:48 AM, Mike Stump wrote: On Sep 19, 2012, at 10:35 AM, Janis Johnson janis_john...@mentor.com wrote: On 07/24/2012 11:13 AM, Janis Johnson wrote: This patch allows the use of both target and xfail in the selector of any test directive that currently takes either target or xfail: { target selector1 xfail selector2 } The test is only used if the target selector is matched, and the test is expected to fail if the xfail selector is matched. The keyword target must come first; it doesn't make sense to me otherwise because the xfail part shouldn't be processed if the target selector doesn't match. Tested with i686-pc-linux-gnu for c,c++,gfortran,objc,obj-c++ plus with examples using the new feature, with and without errors. I'd like some feedback before checking this in so I'll wait at least a couple of days. I plan to put it on the 4.7 branch also. I'd still like some feedback; I guess I lied when I said I'd check it in anyway. I like it. What's not to like? All right then, I'll check it in on trunk, and on 4.7 when it's open. Thanks! Janis
[PATCHv5] rs6000: Add 2 built-ins to read the Time Base Register on PowerPC
Changes since v4: - Changed the condition register mode to CC; - Fixed some style issues (removed spaces between operands in asm, included spaces after function names in the testcases and before sentences in the manual) -- 8 -- Add __builtin_ppc_get_timebase and __builtin_ppc_mftb to read the Time Base Register on PowerPC. They are required by applications that measure time at high frequencies with high precision that can't afford a syscall. __builtin_ppc_get_timebase returns the 64 bits of the Time Base Register while __builtin_ppc_mftb generates only 1 instruction and returns the least significant word on 32-bit environments and the whole Time Base value on 64-bit. [gcc] 2012-09-19 Tulio Magno Quites Machado Filho tul...@linux.vnet.ibm.com * config/rs6000/rs6000-builtin.def: Add __builtin_ppc_get_timebase and __builtin_ppc_mftb. * config/rs6000/rs6000.c (rs6000_expand_zeroop_builtin): New function to expand an expression that calls a built-in without arguments. (rs6000_expand_builtin): Add __builtin_ppc_get_timebase and __builtin_ppc_mftb. (rs6000_init_builtins): Likewise. * config/rs6000/rs6000.md: Likewise. * doc/extend.texi (PowerPC Built-in Functions): New section. (PowerPC AltiVec/VSX Built-in Functions): Move some built-ins unrelated to Altivec/VSX to the new section. [gcc/testsuite] 2012-09-19 Tulio Magno Quites Machado Filho tul...@linux.vnet.ibm.com * gcc.target/powerpc/ppc-get-timebase.c: New file. * gcc.target/powerpc/ppc-mftb.c: New file. 
--- gcc/config/rs6000/rs6000-builtin.def |6 ++ gcc/config/rs6000/rs6000.c | 46 ++ gcc/config/rs6000/rs6000.md| 66 gcc/doc/extend.texi| 51 ++- .../gcc.target/powerpc/ppc-get-timebase.c | 20 ++ gcc/testsuite/gcc.target/powerpc/ppc-mftb.c| 18 + 6 files changed, 189 insertions(+), 18 deletions(-) create mode 100644 gcc/testsuite/gcc.target/powerpc/ppc-get-timebase.c create mode 100644 gcc/testsuite/gcc.target/powerpc/ppc-mftb.c diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def index c8f8f86..9fa3a0f 100644 --- a/gcc/config/rs6000/rs6000-builtin.def +++ b/gcc/config/rs6000/rs6000-builtin.def @@ -1429,6 +1429,12 @@ BU_SPECIAL_X (RS6000_BUILTIN_RSQRT, __builtin_rsqrt, RS6000_BTM_FRSQRTE, BU_SPECIAL_X (RS6000_BUILTIN_RSQRTF, __builtin_rsqrtf, RS6000_BTM_FRSQRTES, RS6000_BTC_FP) +BU_SPECIAL_X (RS6000_BUILTIN_GET_TB, __builtin_ppc_get_timebase, +RS6000_BTM_ALWAYS, RS6000_BTC_MISC) + +BU_SPECIAL_X (RS6000_BUILTIN_MFTB, __builtin_ppc_mftb, +RS6000_BTM_ALWAYS, RS6000_BTC_MISC) + /* Darwin CfString builtin. */ BU_SPECIAL_X (RS6000_BUILTIN_CFSTRING, __builtin_cfstring, RS6000_BTM_ALWAYS, RS6000_BTC_MISC) diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index a5a3848..c3bece1 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -9748,6 +9748,30 @@ rs6000_overloaded_builtin_p (enum rs6000_builtins fncode) return (rs6000_builtin_info[(int)fncode].attr RS6000_BTC_OVERLOADED) != 0; } +/* Expand an expression EXP that calls a builtin without arguments. */ +static rtx +rs6000_expand_zeroop_builtin (enum insn_code icode, rtx target) +{ + rtx pat; + enum machine_mode tmode = insn_data[icode].operand[0].mode; + + if (icode == CODE_FOR_nothing) +/* Builtin not supported on this processor. */ +return 0; + + if (target == 0 + || GET_MODE (target) != tmode + || ! (*insn_data[icode].operand[0].predicate) (target, tmode)) +target = gen_reg_rtx (tmode); + + pat = GEN_FCN (icode) (target); + if (! 
pat) +return 0; + emit_insn (pat); + + return target; +} + static rtx rs6000_expand_unop_builtin (enum insn_code icode, tree exp, rtx target) @@ -11337,6 +11361,16 @@ rs6000_expand_builtin (tree exp, rtx target, rtx subtarget ATTRIBUTE_UNUSED, ? CODE_FOR_bpermd_di : CODE_FOR_bpermd_si), exp, target); +case RS6000_BUILTIN_GET_TB: + return rs6000_expand_zeroop_builtin (CODE_FOR_rs6000_get_timebase, + target); + +case RS6000_BUILTIN_MFTB: + return rs6000_expand_zeroop_builtin (((TARGET_64BIT) + ? CODE_FOR_rs6000_mftb_di + : CODE_FOR_rs6000_mftb_si), + target); + case ALTIVEC_BUILTIN_MASK_FOR_LOAD: case ALTIVEC_BUILTIN_MASK_FOR_STORE: { @@ -11621,6 +11655,18 @@ rs6000_init_builtins (void) POWER7_BUILTIN_BPERMD, __builtin_bpermd); def_builtin (__builtin_bpermd, ftype, POWER7_BUILTIN_BPERMD);
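For readers who want to see how the new built-in might be called, here is a minimal sketch, assuming a compiler that provides __builtin_ppc_get_timebase when targeting PowerPC. The helper name read_ticks and the std::chrono fallback for other targets are assumptions added for illustration and are not part of the patch.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical helper (not part of the patch): return a monotonic tick count.
std::uint64_t read_ticks()
{
#if defined(__GNUC__) && (defined(__powerpc__) || defined(__powerpc64__))
  // Reads all 64 bits of the Time Base Register without a syscall.
  return __builtin_ppc_get_timebase();
#else
  // Fallback so the sketch still builds on non-PowerPC targets.
  return static_cast<std::uint64_t>(
      std::chrono::steady_clock::now().time_since_epoch().count());
#endif
}
```

Taking the difference of two read_ticks() results gives an elapsed tick count; converting ticks to seconds needs the platform's timebase frequency, which this sketch does not attempt to obtain.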
Re: Use conditional casting with symtab_node
On 9/19/12, Eric Botcazou ebotca...@adacore.com wrote: The language syntax would bind the conditional into the initializer, as in if (varpool_node *vnode = (node->try_variable () && vnode->finalized)) varpool_analyze_node (vnode); which does not type-match. So, if you want the type safety and performance, the cascade is really unavoidable. Just write: varpool_node *vnode; if ((vnode = node->try_variable ()) && vnode->finalized) varpool_analyze_node (vnode); This has been the standard style for the past 2 decades and trading it for cascading if's is really a bad idea. Assignments in if statements are known to cause confusion. The point of the change is to limit the scope of the variable to the if statement, which prevents its unintended use later. It acts like a type switch. Why do you think cascading ifs is a bad idea? -- Lawrence Crowl
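The two styles under discussion can be compared with a small sketch. The node types below are simplified, hypothetical stand-ins (the real symtab_node and varpool_node are far richer), and the analyzed flag replaces the call to varpool_analyze_node.

```cpp
#include <cassert>

// Hypothetical, heavily simplified node types for illustration only.
struct varpool_node
{
  bool finalized = false;
  bool analyzed = false;
};

struct symtab_node
{
  varpool_node *as_variable = nullptr;   // non-null only for variables
  // Conditional cast: yields the variable view or a null pointer.
  varpool_node *try_variable() { return as_variable; }
};

// Cascading style: vnode is declared in the condition, so its scope
// ends with the if statement and it cannot be misused afterwards.
void analyze_cascade(symtab_node *node)
{
  if (varpool_node *vnode = node->try_variable())
    if (vnode->finalized)
      vnode->analyzed = true;
}

// Assignment style: a single condition, but vnode stays in scope
// (possibly null) for the rest of the function.
void analyze_assign(symtab_node *node)
{
  varpool_node *vnode;
  if ((vnode = node->try_variable()) && vnode->finalized)
    vnode->analyzed = true;
}
```

Both functions behave identically; the disagreement in the thread is about scope and readability, not about semantics.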
Re: [C++ Patch] for c++/54537
Hi Fabien, On 09/19/2012 07:29 PM, Fabien Chêne wrote: But I'm afraid this is still not completely correct, because if the user code has a using std::pow in the global namespace and then an include of tr1/cmath, the latter again drags into namespace std::tr1 the overloads pow (*, int) which we don't want there... gr I'll add a testcase for this, if you agree. Sure, if you like (the never ending story of std::pow between C++99 and C++11, I'm still meeting people missing the (*, int) overloads ;) You know what? All in all, I think we can go with your original idea of just removing the overload for (double, double): what I didn't realize the first time I saw the idea is that we have anyway the templatized pow which forwards to std::pow. Thus, I suppose things should work pretty well. But please add a big comment before the commented out overload. And let's see if over the next months somebody complains, otherwise, I think it will be enough for TR1, at this point. Thanks for your patience!!! It seems reasonable, I'll update the patch soon with the comment that you suggest. Thanks! (if you feel lazy about the comment, just add a couple of URLs to the front-end issue and to this discussion, it's more than enough to understand what's going on) Paolo.
Re: [patch, mips] Fix for PR 54619, GCC aborting with -O -mips16
On Wed, 2012-09-19 at 18:42 +0100, Richard Sandiford wrote: This hook is never called with an invalid address. Since VOIDmode MEMs aren't valid, I think that should mean it's invalid to call this hook (and rtlanal.c:address_cost) with VOIDmode. I never got time to look at that though. (The culprit in the case I saw was tree-ssa-loop-ivopts.c. Sandra had some improvements in this area, so maybe they would fix this too.) Yes, the test case I put in PR 54619 also shows us getting there from tree-ssa-loop-ivopts.c with VOIDmode. I will look a bit more at this when I get a chance. Steve Ellcey sell...@mips.com
Re: Use conditional casting with symtab_node
On 9/19/12, Gabriel Dos Reis g...@integrable-solutions.net wrote: On Sep 19, 2012 Richard Guenther richard.guent...@gmail.com wrote: Indeed. Btw, can we not provide a specialization for dynamic_cast ? This ->try_... looks awkward to me compared to the more familiar vnode = dynamic_cast <varpool_node> (node) but yeah - dynamic_cast is not a template ... (but maybe there is some standard library piece that mimics it?). No, it is a language primitive. but we can define our own operation with similar syntax that allows for specialization, whose generic implementation uses dynamic_cast. template<typename T, typename U> T* is(U* u) { return dynamic_cast<T*>(u); } At this point, dynamic_cast is not available because we do not yet have polymorphic types. There has been some resistance to that notion. Absent dynamic cast, we need to specialize for various type combinations. Function template specialization would be handy, but C++ does not directly support that. We could work around that. However, in the end, the fact that try_whatever is a member function means that we can use a notation that depends on context and so can be shorter. That is, we can write 'function' instead of 'cgraph_node *'. -- Lawrence Crowl
Re: [Patch, fortran] PR46897 - [OOP] type-bound defined ASSIGNMENT(=) not used for derived type component in intrinsic assign
Dear Mikael, Thanks for the usually thorough review. snip And here comes the next round of comments. snip + e->rank = c && c->as ? c->as->rank : 0; This is bogus if e->rank was != 0 previously (think of the case array(:)%scalar_comp). mistaken maybe but not bogus! The c == NULL case should be handled at the beginning (if at all). + if (e->rank) the condition should be on c->as (for the case array(:)%scalar_comp again). OK point taken. snip... + this_code->op = op; all calls are with op == EXEC_ASSIGN, you may as well hardcode it. I thought to leave it general so that the function could be reused for other purposes. snip... . + gcc_assert (e->expr_type == EXPR_VARIABLE + || e->expr_type == EXPR_FUNCTION); As far as I know anything can be used, not only variables and functions. The derived type cases are a bit specific but at least array/structure constructors are missing. There could be also typebound function calls (I never know whether they are EXPR_FUNCTION or something else). The reason for this assert is the later use of e->symtree. I'll see what I can do to generalise it. snip I guess it should be `t1%cmp {defined=} expr2%cmp'? . it might just be + || comp1->attr.proc_pointer_comp That one doesn't look right. Why not? `this_code' should be cleared, otherwise it is used in the next iteration. I'll check that this is not done in gfc_free_statements (no source to hand at the moment) - I believe that it is. snip... += super_type->attr.defined_assign_comp; I guess Tobias' reported bug is here. The flag shouldn't be cleared here if it was set just before. I am sure that it is in this vicinity :-) To finish, I would like to draw your attention to the scalarizer not supporting multiple arrays in the reference chain. The initial expressions are guaranteed to have at most one array in the chain, but as we add subfield references, that condition can not remain true. We could try adding multiple references support in the scalarizer, but I don't know how difficult it would be. 
Or maybe better fix it at the front-end AST level by using elemental functions to split the scalarization work. Or something else. What do you think? resolve_expr punts on this, does it not? I'll check. I cannot conceivably come back to this for a week or so because daytime and private life are overwhelmingly hectic (wife and daughter moving back to UK). Thanks again Paul
Re: [PATCH] Rs6000 infrastructure cleanup (switches), revised patch
On Tue, Sep 18, 2012 at 08:04:06PM -0400, David Edelsohn wrote: On Mon, Sep 17, 2012 at 3:51 PM, Michael Meissner meiss...@linux.vnet.ibm.com wrote: This patch has support for all of the additional cleanups I mentioned in the first patch that I hadn't gotten to. At this point, I am not planning any more enhancements to the patch, and I would like to check it in. On my 64-bit powerpc system, there are 36 options in the main ISA flags fields, 23 options in the miscellaneous flags fields, and 8 options in the debug flag fields. I believe it answers the problems Ian had. I changed all of the debugging fprintf's to use HOST_WIDE_INT_PRINT_HEX to print the numeric value of the flags fields, and I changed the #ifdef TARGET_xxx to #ifdef OPTION_xxx. It builds and bootstraps fine on my powerpc64 linux system and there were no regressions. Is it OK to install? Mike, Thanks for working on this cleanup! Is it possible to split out some parts of the patch to make it easier to review and verify? Such as the debug parts? It looks like some parts are independent. Well they aren't that independent, since you need to change a few flags, and then catch all of the references, which oftentimes hits the same files. I can do it as several waves, perhaps doing all of the flags in isa flags first, then misc. However, the isa flags make up a lot of the changes, since those are likely the things that use MASK_* etc. So, I may first do just the stuff in target flags first. Then on the second wave, add SPE, PAIRED, etc. back into isa flags and clean up builtins. Then do the misc flags and finally the debug flags. Why do you use HOST_WIDE_INT instead of an explicit 64 bit type for the flags? Because the opt*.awk functions only support flag fields being HOST_WIDE_INT or plain int. They don't have support for Mask(..) and InverseMask(...) being any other type. 
Also, I imagine it would break using a C90 compiler as the stage1 compiler, since long long is not guaranteed to exist, and long might only be 32-bit. I am confident that it bootstraps and passes regression tests. But how did you verify that it uses the correct defaults after the patch? I did tests with a parallel compiler of the same svn id that does not have the patches installed to make sure the asm file for various options is the same. Iain Sandoe and Andreas Tobler tested the first version of the patches on their systems. Iain found some issues on Darwin9 that I fixed in the second version of the patches. Andreas had no issues on freebsd. -- Michael Meissner, IBM 5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA meiss...@linux.vnet.ibm.com fax +1 (978) 399-6899
[google/gcc-4_7] fix race in __cxa_guard_acquire
This is a merge of r191125 and r191191 from gcc-4_7-branch. It fixes a race in __cxa_guard_acquire which can cause duplicate initialization of static variables within functions. Okay for google/gcc-4_7? Thanks, Ollie Google ref b/7173106. * libsupc++/guard.cc (__cxa_guard_acquire): Exit the loop earlier if we detect that another thread has had success. Don't compare_exchange from a finished state back to a waiting state. Fix up the last argument of the first __atomic_compare_exchange_n. Comment. commit 40ec687ace62b4d4c64f72be2bdf4321f5213107 Author: Ollie Wild a...@google.com Date: Wed Sep 19 14:52:53 2012 -0500 Merge r191125 and r191191 from gcc-4_7-branch. Google ref b/7173106. * libsupc++/guard.cc (__cxa_guard_acquire): Exit the loop earlier if we detect that another thread has had success. Don't compare_exchange from a finished state back to a waiting state. Fix up the last argument of the first __atomic_compare_exchange_n. Comment. diff --git a/libstdc++-v3/libsupc++/guard.cc b/libstdc++-v3/libsupc++/guard.cc index adc9608..f8550c0 100644 --- a/libstdc++-v3/libsupc++/guard.cc +++ b/libstdc++-v3/libsupc++/guard.cc @@ -244,16 +244,16 @@ namespace __cxxabiv1 if (__gthread_active_p ()) { int *gi = (int *) (void *) g; - int expected(0); const int guard_bit = _GLIBCXX_GUARD_BIT; const int pending_bit = _GLIBCXX_GUARD_PENDING_BIT; const int waiting_bit = _GLIBCXX_GUARD_WAITING_BIT; while (1) { + int expected(0); if (__atomic_compare_exchange_n(gi, expected, pending_bit, false, __ATOMIC_ACQ_REL, - __ATOMIC_RELAXED)) + __ATOMIC_ACQUIRE)) { // This thread should do the initialization. return 1; @@ -264,13 +264,26 @@ namespace __cxxabiv1 // Already initialized. return 0; } + if (expected == pending_bit) { +// Use acquire here. 
int newv = expected | waiting_bit; if (!__atomic_compare_exchange_n(gi, expected, newv, false, __ATOMIC_ACQ_REL, - __ATOMIC_RELAXED)) - continue; + __ATOMIC_ACQUIRE)) + { +if (expected == guard_bit) + { +// Make a thread that failed to set the +// waiting bit exit the function earlier, +// if it detects that another thread has +// successfully finished initialising. +return 0; + } +if (expected == 0) + continue; + } expected = newv; }
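The control flow of the patched loop can be sketched with std::atomic, assuming hypothetical bit values in place of the _GLIBCXX_GUARD_* macros. The function try_acquire is illustrative only: where the real __cxa_guard_acquire blocks until woken and then retries, this sketch returns -1.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical bit values; the real code uses the _GLIBCXX_GUARD_* macros.
const int guard_bit = 1;     // initialization finished
const int pending_bit = 2;   // some thread is initializing
const int waiting_bit = 4;   // at least one thread is waiting

// Returns 1 if the caller must run the initializer, 0 if the variable is
// already initialized, and -1 where the real code would block and retry.
int try_acquire(std::atomic<int> &g)
{
  while (true)
    {
      // Reset on every iteration: a failed compare-exchange overwrites
      // expected with the value it observed.
      int expected = 0;
      if (g.compare_exchange_strong(expected, pending_bit,
                                    std::memory_order_acq_rel,
                                    std::memory_order_acquire))
        return 1;                 // this thread does the initialization

      if (expected == guard_bit)
        return 0;                 // already initialized

      if (expected == pending_bit)
        {
          int newv = expected | waiting_bit;
          if (!g.compare_exchange_strong(expected, newv,
                                         std::memory_order_acq_rel,
                                         std::memory_order_acquire))
            {
              if (expected == guard_bit)
                return 0;         // another thread finished in the meantime
              if (expected == 0)
                continue;         // initialization was aborted; start over
            }
          expected = newv;
        }
      return -1;                  // the real code waits here, then loops
    }
}
```

The acquire ordering on the failure path matters because a failed compare-exchange that observes guard_bit returns straight away, and the caller then reads the object that another thread initialized.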
Re: [PATCH] Combine location with block using block_locations
This patch was committed as r191494. Thanks to all for the reviews and for helping test. Dehao
Re: [google/gcc-4_7] fix race in __cxa_guard_acquire
On 2012-09-19 15:59 , Ollie Wild wrote: * libsupc++/guard.cc (__cxa_guard_acquire): Exit the loop earlier if we detect that another thread has had success. Don't compare_exchange from a finished state back to a waiting state. Fix up the last argument of the first __atomic_compare_exchange_n. Comment. OK. Diego.
Re: [Patch, fortran] PR46897 - [OOP] type-bound defined ASSIGNMENT(=) not used for derived type component in intrinsic assign
On 19/09/2012 20:46, Paul Richard Thomas wrote: + || comp1->attr.proc_pointer_comp That one doesn't look right. Why not? It skips any component containing a procedure pointer subcomponent. Actually, from looking at parse.c where the flag is set, it seems that the flag is only set for derived types, not for components, so it's not that bad; the condition never triggers. `this_code' should be cleared, otherwise it is used in the next iteration. I'll check that this is not done in gfc_free_statements (no source to hand at the moment) - I believe that it is. To be clear, the _pointer_ should be cleared: this_code = NULL; Mikael
Re: RFA: Process '*' in '@'-output-template alternatives
Joern Rennecke a écrit: Index: doc/md.texi === --- doc/md.texi (revision 191429) +++ doc/md.texi (working copy) @@ -665,6 +665,22 @@ (define_insn @end group @end smallexample +If you just need a little bit of C code in one (or a few) alternatives, +you can use @samp{*} inside of a @samp{@@} multi-alternative template: + +@smallexample +@group +(define_insn + [(set (match_operand:SI 0 general_operand =r,,m) +(const_int 0))] + + @@ + clrreg %0 + * return stack_mem_p (operands[0]) ? \push 0 : \clrmem %0\; ---^ Isn't there a backslash missing? + clrmem %0) +@end group +@end smallexample + @node Predicates @section Predicates @cindex predicates Index: genoutput.c === --- genoutput.c (revision 191429) +++ genoutput.c (working copy) @@ -662,19 +662,55 @@ process_template (struct data *d, const list of assembler code templates, one for each alternative. */ else if (template_code[0] == '@') { - d-template_code = 0; - d-output_format = INSN_OUTPUT_FORMAT_MULTI; + int found_star = 0; - printf (\nstatic const char * const output_%d[] = {\n, d-code_number); + for (cp = template_code[1]; *cp; ) + { + while (ISSPACE (*cp)) + cp++; + if (*cp == '*') + found_star = 1; + while (!IS_VSPACE (*cp) *cp != '\0') + ++cp; + } + d-template_code = 0; + if (found_star) + { + d-output_format = INSN_OUTPUT_FORMAT_FUNCTION; + puts (\nstatic const char *); + printf (output_%d (rtx *operands ATTRIBUTE_UNUSED, + rtx insn ATTRIBUTE_UNUSED)\n, d-code_number); + puts ({); + puts ( switch (which_alternative)\n{); + } + else + { + d-output_format = INSN_OUTPUT_FORMAT_MULTI; + printf (\nstatic const char * const output_%d[] = {\n, + d-code_number); + } for (i = 0, cp = template_code[1]; *cp; ) { - const char *ep, *sp; + const char *ep, *sp, *bp; while (ISSPACE (*cp)) cp++; - printf ( \); + bp = cp; + if (found_star) + { + printf (case %d:, i); + if (*cp == '*') + { + printf (\n ); + cp++; + } + else + printf ( return \); + } + else + printf ( \); for (ep = sp = cp; !IS_VSPACE (*ep) *ep != '\0'; ++ep) if 
(!ISSPACE (*ep)) @@ -690,7 +726,18 @@ process_template (struct data *d, const cp++; } - printf (\,\n); + if (!found_star) + puts (\,); + else if (*bp != '*') + puts (\;); + else + { + /* The usual action will end with a return. +If there is neither break or return at the end, this is +assumed to be intentional; this allows to have multiple +consecutive alternatives share some code. */ + puts (); + } i++; } if (i == 1) @@ -700,7 +747,10 @@ process_template (struct data *d, const error_with_line (d-lineno, wrong number of alternatives in the output template); - printf (};\n); + if (found_star) + puts ( default: gcc_unreachable ();\n}\n}); + else + printf (};\n); } else {
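As a reading aid, the function that genoutput would emit for the md.texi example above might look roughly like the following sketch. The name output_sketch and its two parameters are hypothetical: the generated function is named output_N, takes operands and insn, reads the global which_alternative, and ends in gcc_unreachable ().

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for a generated output_N function.
// operand_is_stack_mem replaces the stack_mem_p (operands[0]) test
// used in the documentation example.
const char *output_sketch(int which_alternative, bool operand_is_stack_mem)
{
  switch (which_alternative)
    {
    // Plain template alternatives become "return <string>;".
    case 0: return "clrreg %0";
    // An alternative starting with '*' is copied through as C code.
    case 1:
      return operand_is_stack_mem ? "push 0" : "clrmem %0";
    case 2: return "clrmem %0";
    // The generated code calls gcc_unreachable () here instead.
    default: return nullptr;
    }
}
```

Only the alternative that begins with '*' carries hand-written C; the others stay one-line templates, which is the convenience the patch is after.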
Re: [testsuite] vect effective targets should use arm_neon_ok
On 09/19/2012 01:43 AM, Richard Earnshaw wrote: On 18/09/12 21:59, Janis Johnson wrote: On 09/18/2012 12:54 PM, Janis Johnson wrote: In most cases a test that requires ARM NEON should use effective target arm_neon, which means that flags run for all tests include NEON support. The result is cached the first time it is checked for a multilib. Vectorization tests, when run for ARM, add flags to support NEON if it's OK to do so, but those flags are not reflected in the cached results for arm_neon, nor should they be. Because of this, vect effective-target checks should use arm_neon_ok (as most already do) instead of arm_neon. This patch changes the checks for 7 effective targets, allowing more tests to run and decreasing the number of failures. The only new failures I've seen in tests on arm-none-eabi with a variety of test multilibs are for big-endian with vect_multiple_sizes, which means that vect_multiple_sizes should be false for big endian or that there's a bug in ARM big-endian support. Sadly, there are almost certainly bugs in the big-endian support for ARM Neon. I filed PR testsuite/54622 with a list of the vect tests failures for ARM big-endian. There aren't really issues with vect_multiple_sizes, but there are some vect effective targets for which ARM big-endian fails all tests. I had originally thought the problems were with the tests; apparently not. Janis
Top Level GCC change questions
While checking in some changes at the top level of the GCC tree I remembered that these files are shared with the binutils tree. It looks like there is another patch that has been checked in to the GCC top-level tree but not binutils, and a patch in binutils but not GCC. Is there any automation for this or is it still up to each person checking in files to copy stuff over by hand? Not in binutils: 2012-09-19 Steve Ellcey sell...@mips.com * configure.ac: Add mips*-mti-elf* target. * configure: Regenerate. 2012-09-17 Ian Lance Taylor i...@google.com * MAINTAINERS (Various Maintainers): Add libbacktrace. * configure.ac (host_libs): Add libbacktrace. (target_libraries): Add libbacktrace. * Makefile.def (host_modules): Add libbacktrace. (target_modules): Likewise. * configure, Makefile.in: Rebuild. Not in GCC: 2012-09-15 Jiong Wang jiw...@tilera.com * configure.ac (ENABLE_GOLD): support tilegx* * configure: rebuild Steve Ellcey sell...@mips.com
Re: Backtrace library [1/3]
On 12-09-17 12:39 PM, Ian Lance Taylor wrote: On Thu, Sep 13, 2012 at 1:00 PM, Diego Novillo dnovi...@google.com wrote: On 2012-09-11 18:53 , Ian Lance Taylor wrote: 2012-09-11 Ian Lance Taylor i...@google.com * Initial implementation. OK. Thanks for all the reviews. I have committed the libbacktrace library to trunk. I will follow up with a patch to actually use it. Please let me know about any build problems. I'm hitting the following build issue /bin/sh ./libtool --tag=CC --mode=compile i686-unknown-linux-gnu-gcc -DHAVE_CONFIG_H -I. -I../../libbacktrace -I ../../libbacktrace/../include -I ../../libbacktrace/../libgcc -I ../libgcc -I ../gcc/include -I ../../gcc/include -W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition -Wmissing-format-attribute -Wcast-qual -Werror -g -O2 -MT backtrace.lo -MD -MP -MF .deps/backtrace.Tpo -c -o backtrace.lo ../../libbacktrace/backtrace.c libtool: compile: i686-unknown-linux-gnu-gcc -DHAVE_CONFIG_H -I. -I../../libbacktrace -I ../../libbacktrace/../include -I ../../libbacktrace/../libgcc -I ../libgcc -I ../gcc/include -I ../../gcc/include -W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition -Wmissing-format-attribute -Wcast-qual -Werror -g -O2 -MT backtrace.lo -MD -MP -MF .deps/backtrace.Tpo -c ../../libbacktrace/backtrace.c -o backtrace.o cc1: warnings being treated as errors ../../libbacktrace/backtrace.c: In function 'unwind': ../../libbacktrace/backtrace.c:69: warning: implicit declaration of function '_Unwind_GetIPInfo' make[3]: *** [backtrace.lo] Error 1 Regards, Ryan Mansfield
RE: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22)
-Original Message- From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- ow...@gcc.gnu.org] On Behalf Of Richard Henderson Sent: Tuesday, September 11, 2012 3:12 PM To: Iyer, Balaji V Cc: Richard Guenther; gcc-patches@gcc.gnu.org; Gabriel Dos Reis; Aldy Hernandez (al...@redhat.com); Jeff Law Subject: Re: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22) On 09/11/2012 10:14 AM, Iyer, Balaji V wrote: The function mangling handles several of the version inconsistencies you have mentioned. If the CPU revisions, vector lengths are not the same between the function declaration and the function, then the name of the function will be different and the linker should complain. Sure. I get that. And that works for code within a single project. But that means that if you build a shared library containing one of these elemental functions, its external ABI changes depending on what compiler flags you build it with. Can you not understand how totally unacceptable this is? Hello Richard, Thank you very much for pointing this out to us. We do see the problem with the default case (when the processor clause is not specified by the user) of elemental functions attribute. We have also found a solution. Since this has to do with the calling convention for elemental functions, it requires a fix to both the Intel compiler and to gcc. It will take us a couple weeks to validate this. I will re-implement this and send out another patch as soon as possible. In the meantime, I will work on the array notation patches, so we can keep making forward progress. Thanks again for pointing this out Yours Sincerely, Balaji V. Iyer. r~
Re: Use conditional casting with symtab_node
On Wed, Sep 19, 2012 at 1:39 PM, Lawrence Crowl cr...@googlers.com wrote: On 9/19/12, Gabriel Dos Reis g...@integrable-solutions.net wrote: On Sep 19, 2012 Richard Guenther richard.guent...@gmail.com wrote: Indeed. Btw, can we not provide a specialization for dynamic_cast ? This ->try_... looks awkward to me compared to the more familiar vnode = dynamic_cast <varpool_node> (node) but yeah - dynamic_cast is not a template ... (but maybe there is some standard library piece that mimics it?). No, it is a language primitive. but we can define our own operation with similar syntax that allows for specialization, whose generic implementation uses dynamic_cast. template<typename T, typename U> T* is(U* u) { return dynamic_cast<T*>(u); } At this point, dynamic_cast is not available because we do not yet have polymorphic types. There has been some resistance to that notion. Hmm, when did we rule that out? We currently implement dynamic_cast using the poor man's simulation based on tree_code checking. We can just as well give such simulation the is notation. Absent dynamic cast, we need to specialize for various type combinations. Function template specialization would be handy, but C++ does not directly support that. We could work around that. We can always use the standard workaround: call a static member function of a class template that can be specialized at will. However, in the end, the fact that try_whatever is a member function means that we can use a notation that depends on context and so can be shorter. That is, we can write 'function' instead of 'cgraph_node *'. -- Lawrence Crowl
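The "standard workaround" mentioned above can be sketched as follows, assuming a hypothetical tag-based hierarchy with no virtual functions. The names node_kind, is_impl and the simplified node structs are illustrative and are not GCC's actual declarations.

```cpp
#include <cassert>

// Hypothetical tagged hierarchy without polymorphic types.
enum node_kind { KIND_FUNCTION, KIND_VARIABLE };

struct symtab_node { node_kind kind; };
struct cgraph_node : symtab_node { cgraph_node() { kind = KIND_FUNCTION; } };
struct varpool_node : symtab_node { varpool_node() { kind = KIND_VARIABLE; } };

// Primary class template is left undefined, so only the type
// combinations that have been specialized will compile.
template<typename T, typename U> struct is_impl;

template<> struct is_impl<varpool_node, symtab_node>
{
  static varpool_node *apply(symtab_node *u)
  {
    // Tag check standing in for dynamic_cast.
    return u->kind == KIND_VARIABLE ? static_cast<varpool_node *>(u) : nullptr;
  }
};

template<> struct is_impl<cgraph_node, symtab_node>
{
  static cgraph_node *apply(symtab_node *u)
  {
    return u->kind == KIND_FUNCTION ? static_cast<cgraph_node *>(u) : nullptr;
  }
};

// The function template only forwards; all customization happens in the
// class template, which sidesteps the limits on specializing functions.
template<typename T, typename U> T *is(U *u)
{
  return is_impl<T, U>::apply(u);
}
```

A call such as is<varpool_node>(node) then reads much like the dynamic_cast form quoted above, while still working with the tag-based representation.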
Re: Top Level GCC change questions
Is there any automation for this or is it still up to each person checking in files to copy stuff over by hand? There is no automation, as neither project was willing to cede control to the other. In general, any patch applied to one repo should be (and may be) applied to the other, but at the moment it's up to each committer to actually do it.
Re: Top Level GCC change questions
On Wed, 2012-09-19 at 17:15 -0400, DJ Delorie wrote: Is there any automation for this or is it still up to each person checking in files to copy stuff over by hand? There is no automation, as neither project was willing to cede control to the other. In general, any patch applied to one repo should be (and may be) applied to the other, but at the moment it's up to each committer to actually do it. OK. I also noticed a change that is in the config directory in GCC but not in binutils: 2012-09-03 Richard Guenther rguent...@suse.de PR bootstrap/54138 * config/cloog.m4: Adjust for toplevel reorg. * config/isl.m4: Adjust.
Re: Use conditional casting with symtab_node
On 9/19/12, Gabriel Dos Reis g...@integrable-solutions.net wrote: On Wed, Sep 19, 2012 at 1:39 PM, Lawrence Crowl cr...@googlers.com wrote: On 9/19/12, Gabriel Dos Reis g...@integrable-solutions.net wrote: On Sep 19, 2012 Richard Guenther richard.guent...@gmail.com wrote: Indeed. Btw, can we not provide a specialization for dynamic_cast <>? This ->try_... looks awkward to me compared to the more familiar vnode = dynamic_cast <varpool_node> (node) but yeah - dynamic_cast is not a template ... (but maybe there is some standard library piece that mimics it?). No, it is a language primitive, but we can define our own operation with similar syntax that allows for specialization, whose generic implementation uses dynamic_cast. template<typename T, typename U> T* is(U* u) { return dynamic_cast<T*>(u); } At this point, dynamic_cast is not available because we do not yet have polymorphic types. There has been some resistance to that notion. Hmm, when did we rule that out? We have not ruled it out, but folks are, rightly, concerned about any size increase in critical data structures. We are also currently lacking a gengtype that will handle inheritance. So, for now at least, we need a scheme that will work across both inheritance and our current tag/union approach. We currently implement dynamic_cast using the poor man's simulation based on tree_code checking. We can just as well give such simulation the is<> notation. Absent dynamic cast, we need to specialize for various type combinations. Function template specialization would be handy, but C++ does not directly support that. We could work around that. We can always use the standard workaround: call a static member function of a class template that can be specialized at will. However, in the end, the fact that try_whatever is a member function means that we can use a notation that depends on context and so can be shorter. That is, we can write 'function' instead of 'cgraph_node *'. -- Lawrence Crowl
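For readers following the thread, the "standard workaround" mentioned above might look roughly like the sketch below. This is not code from the patch: the node types, the `node_kind` tag, `cast_helper` and `is_sketch_demo` are all hypothetical stand-ins, and the tag check only imitates the tree_code-style simulation the thread describes. The function template `is<T>` forwards to a static member of a class template, and that class template is what gets specialized per target type.

```cpp
#include <cassert>

// Hypothetical stand-ins for tagged symbol-table nodes (not GCC's real types).
enum node_kind { KIND_FUNCTION, KIND_VARIABLE };

struct symtab_node_sketch { node_kind kind; };

struct cgraph_node_sketch : symtab_node_sketch {
  cgraph_node_sketch() { kind = KIND_FUNCTION; }
};

struct varpool_node_sketch : symtab_node_sketch {
  varpool_node_sketch() { kind = KIND_VARIABLE; }
};

// Primary template is declared but never defined, so asking for an
// unsupported target type is a compile-time error.
template <typename T> struct cast_helper;

// One specialization per target type; each checks the tag instead of
// relying on dynamic_cast (the base type has no virtual functions).
template <> struct cast_helper<cgraph_node_sketch> {
  static cgraph_node_sketch* apply(symtab_node_sketch* n) {
    return (n && n->kind == KIND_FUNCTION)
               ? static_cast<cgraph_node_sketch*>(n) : nullptr;
  }
};

template <> struct cast_helper<varpool_node_sketch> {
  static varpool_node_sketch* apply(symtab_node_sketch* n) {
    return (n && n->kind == KIND_VARIABLE)
               ? static_cast<varpool_node_sketch*>(n) : nullptr;
  }
};

// The user-facing notation: is<T>(node) yields T* or a null pointer.
template <typename T>
T* is(symtab_node_sketch* n) { return cast_helper<T>::apply(n); }

// Small usage demonstration.
void is_sketch_demo() {
  cgraph_node_sketch f;
  symtab_node_sketch* node = &f;
  assert(is<cgraph_node_sketch>(node) == &f);        // tag matches
  assert(is<varpool_node_sketch>(node) == nullptr);  // tag does not match
}
```

The member-function form argued for in the thread would instead hang something like try_function() off the node itself, which is shorter at the call site; the sketch only illustrates the free-function alternative.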
[Patch, Fortran] PR54618 fix some INTENT(OUT) issues for CLASS
This patch fixes a couple of issues I ran into when working on FINAL subroutines.

a) PR54618: (i) For a nonallocatable CLASS(...), INTENT(OUT), gfortran sets the _def_init; however, for OPTIONAL this has to be guarded by an is-present check. (ii) For CLASS(...), ALLOCATABLE, INTENT(OUT), gfortran didn't deallocate the dummy argument, nor did it reset the var->_vtab to the declared type. Note: (ii) for polymorphic arrays has still to be implemented; currently, only scalars are handled. There are also some other issues related to OPTIONAL with polymorphic arrays. (See PR.)

b) When working on FINAL, I also ran into the problem that attr.alloc_comp is set when there is a pointer component which only in turn has allocatable components. That led to an ICE (segfault) with my FINAL patch.

c) I also include three coverity patches: (i) resolve.c: nl->sym is dereferenced many times (before and after that check), thus it cannot be NULL. (ii) simplify.c: There is an "if (extremum == NULL) ... continue;", hence one always loops at least once before one reaches that line; but then last gets set. Thus, the code is unreachable. (iii) trans-array.c: Here, class_expr is NULL_TREE if the condition is false, but TREE_TYPE (NULL_TREE) won't work. Hence, an assert is better.

I intend to do two commits: one for (a) and one for the rest. Built and regtested on x86-64-linux. OK for the trunk?

Tobias

2012-09-19 Tobias Burnus bur...@net-b.de

* parse.c (parse_derived): Don't set attr.alloc_comp for pointer components with allocatable subcomps.
* resolve.c (resolve_fl_namelist): Remove superfluous NULL check.
* simplify.c (simplify_min_max): Remove unreachable code.
* trans-array.c (gfc_trans_create_temp_array): Change a condition into an assert.

PR fortran/54618
* trans-expr.c (gfc_trans_class_init_assign): Guard re-setting of the vtab by gfc_conv_expr_present.
(gfc_conv_procedure_call): Fix INTENT(OUT) handling for allocatable BT_CLASS.
2012-09-19 Tobias Burnus bur...@net-b.de

PR fortran/54618
* gfortran.dg/class_array_14.f90: New.

diff --git a/gcc/fortran/parse.c b/gcc/fortran/parse.c
index 5c5d381..f31e309 100644
--- a/gcc/fortran/parse.c
+++ b/gcc/fortran/parse.c
@@ -2195,7 +2195,8 @@ endType:
       if (c->attr.allocatable
           || (c->ts.type == BT_CLASS && c->attr.class_ok
               && CLASS_DATA (c)->attr.allocatable)
-          || (c->ts.type == BT_DERIVED && c->ts.u.derived->attr.alloc_comp))
+          || (c->ts.type == BT_DERIVED && !c->attr.pointer
+              && c->ts.u.derived->attr.alloc_comp))
         {
           allocatable = true;
           sym->attr.alloc_comp = 1;
diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c
index f67c07f..0a20540 100644
--- a/gcc/fortran/resolve.c
+++ b/gcc/fortran/resolve.c
@@ -12478,7 +12478,7 @@ resolve_fl_namelist (gfc_symbol *sym)
         continue;

       nlsym = NULL;
-      if (nl->sym && nl->sym->name)
+      if (nl->sym->name)
         gfc_find_symbol (nl->sym->name, sym->ns, 1, &nlsym);
       if (nlsym && nlsym->attr.flavor == FL_PROCEDURE)
         {
diff --git a/gcc/fortran/simplify.c b/gcc/fortran/simplify.c
index 1c9dff2..2f96e90 100644
--- a/gcc/fortran/simplify.c
+++ b/gcc/fortran/simplify.c
@@ -4106,10 +4106,7 @@ simplify_min_max (gfc_expr *expr, int sign)
           min_max_choose (arg->expr, extremum->expr, sign);

           /* Delete the extra constant argument.  */
-          if (last == NULL)
-            expr->value.function.actual = arg->next;
-          else
-            last->next = arg->next;
+          last->next = arg->next;

           arg->next = NULL;
           gfc_free_actual_arglist (arg);
diff --git a/gcc/fortran/trans-array.c b/gcc/fortran/trans-array.c
index c350c3b..3e684ee 100644
--- a/gcc/fortran/trans-array.c
+++ b/gcc/fortran/trans-array.c
@@ -1022,8 +1022,8 @@ gfc_trans_create_temp_array (stmtblock_t * pre, stmtblock_t * post, gfc_ss * ss,
      dynamic type.  Generate an eltype and then the class expression.  */
   if (eltype == NULL_TREE && initial)
     {
-      if (POINTER_TYPE_P (TREE_TYPE (initial)))
-        class_expr = build_fold_indirect_ref_loc (input_location, initial);
+      gcc_assert (POINTER_TYPE_P (TREE_TYPE (initial)));
+      class_expr = build_fold_indirect_ref_loc (input_location, initial);
       eltype = TREE_TYPE (class_expr);
       eltype = gfc_get_element_type (eltype);
       /* Obtain the structure (class) expression.  */
diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c
index 98634c3..177d286 100644
--- a/gcc/fortran/trans-expr.c
+++ b/gcc/fortran/trans-expr.c
@@ -621,6 +621,16 @@ gfc_trans_class_init_assign (gfc_code *code)
       gfc_add_block_to_block (&block, &src.pre);
       tmp = gfc_build_memcpy_call (dst.expr, src.expr, memsz.expr);
     }
+
+  if (code->expr1->symtree->n.sym->attr.optional
+      || code->expr1->symtree->n.sym->ns->proc_name->attr.entry_master)
+    {
+      tree present = gfc_conv_expr_present (code->expr1->symtree->n.sym);
+      tmp = build3_loc (input_location, COND_EXPR, TREE_TYPE (tmp),
+                        present, tmp,
+                        build_empty_stmt (input_location));
+    }
+
Re: [google] Added new dump flag -pmu to display pmu data in pass summaries (issue 6489092)
FYI.

Dehao

https://codereview.appspot.com/6489092/diff/4001/gcc/common.opt
File gcc/common.opt (right):

https://codereview.appspot.com/6489092/diff/4001/gcc/common.opt#newcode1688
gcc/common.opt:1688: -fpmu-profile-use=[pmuprofile.gcda] The pmu profile data file to use for pmu feedback.
Looks like the default value is not implemented.

https://codereview.appspot.com/6489092/diff/4001/gcc/coverage.c
File gcc/coverage.c (right):

https://codereview.appspot.com/6489092/diff/4001/gcc/coverage.c#newcode777
gcc/coverage.c:777: +static void read_pmu_file (const char* da_file_name)
This function is very large. How about splitting it into read_pmu_file and process_pmu_file (the second one builds the hash tables, etc.)?

https://codereview.appspot.com/6489092/diff/4001/gcc/coverage.c#newcode781
gcc/coverage.c:781: + brm_infos_t* brm_infos = pmu_global_summary.brm_infos;
For consistency, please move the * after the space. (many places in this file)

https://codereview.appspot.com/6489092/diff/4001/gcc/coverage.c#newcode846
gcc/coverage.c:846: + unsigned length = gcov_read_unsigned ();
Please use gcov_unsigned_t.

https://codereview.appspot.com/6489092/diff/4001/gcc/coverage.c#newcode847
gcc/coverage.c:847: + unsigned long base = gcov_position ();
Please use gcov_position_t.

https://codereview.appspot.com/6489092/diff/4001/gcc/coverage.c#newcode940
gcc/coverage.c:940: + there should only be one entry per filename and line number */
Add a gcc_assert in the else path?

https://codereview.appspot.com/6489092/diff/4001/gcc/coverage.c#newcode966
gcc/coverage.c:966: +}
The above two loops look similar; can they be abstracted out?

https://codereview.appspot.com/6489092/diff/4001/gcc/coverage.c#newcode2573
gcc/coverage.c:2573: + if (pmu_profile_data != 0 TDF_PMU)
if (pmu_profile_data != NULL ...) or if (pmu_profile_data ...)

https://codereview.appspot.com/6489092/diff/4001/gcc/gcov-io.h
File gcc/gcov-io.h (right):

https://codereview.appspot.com/6489092/diff/4001/gcc/gcov-io.h#newcode705
gcc/gcov-io.h:705: /* Cumulative pmu data */
Seems this data structure should be moved into coverage.c.

https://codereview.appspot.com/6489092/diff/4001/gcc/gcov.c
File gcc/gcov.c (right):

https://codereview.appspot.com/6489092/diff/4001/gcc/gcov.c#newcode2353
gcc/gcov.c:2353: +
Remove blank line.

https://codereview.appspot.com/6489092/diff/4001/gcc/gimple-pretty-print.c
File gcc/gimple-pretty-print.c (right):

https://codereview.appspot.com/6489092/diff/4001/gcc/gimple-pretty-print.c#newcode1585
gcc/gimple-pretty-print.c:1585: static void
This function is very much like dump_pmu. Can we just call dump_pmu here?

https://codereview.appspot.com/6489092/diff/4001/gcc/tree-pretty-print.c
File gcc/tree-pretty-print.c (right):

https://codereview.appspot.com/6489092/diff/4001/gcc/tree-pretty-print.c#newcode28
gcc/tree-pretty-print.c:28: #include "basic-block.h"
Why do we need to include this header?

https://codereview.appspot.com/6489092/diff/4001/gcc/tree-pretty-print.c#newcode519
gcc/tree-pretty-print.c:519: static void
Looks to me that this function should be exported, while dump_load_latency_details should stay static.

https://codereview.appspot.com/6489092/
Re: Backtrace library [1/3]
On Wed, Sep 19, 2012 at 1:56 PM, Ryan Mansfield rmansfi...@qnx.com wrote: I'm hitting the following build issue /bin/sh ./libtool --tag=CC --mode=compile i686-unknown-linux-gnu-gcc -DHAVE_CONFIG_H -I. -I../../libbacktrace -I ../../libbacktrace/../include -I ../../libbacktrace/../libgcc -I ../libgcc -I ../gcc/include -I ../../gcc/include -W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition -Wmissing-format-attribute -Wcast-qual -Werror -g -O2 -MT backtrace.lo -MD -MP -MF .deps/backtrace.Tpo -c -o backtrace.lo ../../libbacktrace/backtrace.c libtool: compile: i686-unknown-linux-gnu-gcc -DHAVE_CONFIG_H -I. -I../../libbacktrace -I ../../libbacktrace/../include -I ../../libbacktrace/../libgcc -I ../libgcc -I ../gcc/include -I ../../gcc/include -W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition -Wmissing-format-attribute -Wcast-qual -Werror -g -O2 -MT backtrace.lo -MD -MP -MF .deps/backtrace.Tpo -c ../../libbacktrace/backtrace.c -o backtrace.o cc1: warnings being treated as errors ../../libbacktrace/backtrace.c: In function 'unwind': ../../libbacktrace/backtrace.c:69: warning: implicit declaration of function '_Unwind_GetIPInfo' make[3]: *** [backtrace.lo] Error 1 Don't omit the details: How precisely did you run configure? In what directory is this error occurring? Please check HAVE_GETIPINFO in libbacktrace/config.h. I assume it is 1. It looks like you may be doing some sort of Canadian Cross build. What version of GCC is i686-unknown-linux-gnu-gcc? Does that version of gcc declare _Unwind_GetIPInfo in its unwind.h? Does it provide _Unwind_GetIPInfo in its libgcc? Ian
Re: [PATCH] Combine location with block using block_locations
On Fri, Sep 14, 2012 at 11:03:34AM +0800, Dehao Chen wrote: Hi, I've integrated all the reviews from this thread (Thank you guys for helping refine this patch). Now the patch can pass the whole gcc testsuite as well as all spec2006 benchmarks (with LTO). Concerning memory consumption, for extreme benchmarks like tramp3d, this patch incurs around 2% peak memory overhead (mostly from the extra blocks that have been set NULL in the original implementation.) Attached is the new patch. Honza, could you help me try this on Mozilla LTO to see if the error is gone? The current patch that was checked in breaks powerpc, because you did not change INSN_LOCATOR to INSN_LOCATION in rs6000_final_prescan_insn in rs6000.c. I also see INSN_LOCATOR in the arm, bfin, c6x, mep, mips, picochip, s390, sh, and spu ports. -- Michael Meissner, IBM 5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA meiss...@linux.vnet.ibm.com fax +1 (978) 399-6899
Re: Backtrace library [1/3]
On 12-09-19 06:17 PM, Ian Lance Taylor wrote: On Wed, Sep 19, 2012 at 1:56 PM, Ryan Mansfield rmansfi...@qnx.com wrote: I'm hitting the following build issue /bin/sh ./libtool --tag=CC --mode=compile i686-unknown-linux-gnu-gcc -DHAVE_CONFIG_H -I. -I../../libbacktrace -I ../../libbacktrace/../include -I ../../libbacktrace/../libgcc -I ../libgcc -I ../gcc/include -I ../../gcc/include -W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition -Wmissing-format-attribute -Wcast-qual -Werror -g -O2 -MT backtrace.lo -MD -MP -MF .deps/backtrace.Tpo -c -o backtrace.lo ../../libbacktrace/backtrace.c libtool: compile: i686-unknown-linux-gnu-gcc -DHAVE_CONFIG_H -I. -I../../libbacktrace -I ../../libbacktrace/../include -I ../../libbacktrace/../libgcc -I ../libgcc -I ../gcc/include -I ../../gcc/include -W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition -Wmissing-format-attribute -Wcast-qual -Werror -g -O2 -MT backtrace.lo -MD -MP -MF .deps/backtrace.Tpo -c ../../libbacktrace/backtrace.c -o backtrace.o cc1: warnings being treated as errors ../../libbacktrace/backtrace.c: In function 'unwind': ../../libbacktrace/backtrace.c:69: warning: implicit declaration of function '_Unwind_GetIPInfo' make[3]: *** [backtrace.lo] Error 1 Don't omit the details: How precisely did you run configure? In what directory is this error occurring? $ head config.log This file contains any messages produced by compilers while running configure, to aid debugging if configure makes a mistake. It was created by configure, which was generated by GNU Autoconf 2.64. Invocation command line was $ ../configure --disable-bootstrap --enable-languages=c++ ## - ## ## Platform. ## Please check HAVE_GETIPINFO in libbacktrace/config.h. I assume it is 1. Yep, it's 1. $ grep HAVE_GETIPINFO config.h #define HAVE_GETIPINFO 1 It looks like you may be doing some sort of Canadian Cross build. What version of GCC is i686-unknown-linux-gnu-gcc? 
Does that version of gcc declare _Unwind_GetIPInfo in its unwind.h? Does it provide _Unwind_GetIPInfo in its libgcc? I'm using gcc 4.1 on this machine. No, it doesn't have it in the unwind.h or libgcc. Regards, Ryan Mansfield
Re: RFA: Process '*' in '@'-output-template alternatives
Quoting Georg-Johann Lay g...@gcc.gnu.org:

+ * return stack_mem_p (operands[0]) ? \"push 0" : \"clrmem %0\";
-----------------------------------------------^
Isn't there a backslash missing?

Oops, yes, there is.
Re: Backtrace library [1/3]
On Wed, Sep 19, 2012 at 3:31 PM, Ryan Mansfield rmansfi...@qnx.com wrote: $ head config.log This file contains any messages produced by compilers while running configure, to aid debugging if configure makes a mistake. It was created by configure, which was generated by GNU Autoconf 2.64. Invocation command line was $ ../configure --disable-bootstrap --enable-languages=c++ ## - ## ## Platform. ## Please check HAVE_GETIPINFO in libbacktrace/config.h. I assume it is 1. Yep, it's 1. $ grep HAVE_GETIPINFO config.h #define HAVE_GETIPINFO 1 It looks like you may be doing some sort of Canadian Cross build. What version of GCC is i686-unknown-linux-gnu-gcc? Does that version of gcc declare _Unwind_GetIPInfo in its unwind.h? Does it provide _Unwind_GetIPInfo in its libgcc? I'm using gcc 4.1 on this machine. No, it doesn't have it in the unwind.h or libgcc. Thanks for the additional info. I have committed this patch, which should fix the problem. Bootstrapped and ran libbacktrace tests on x86_64-unknown-linux-gnu. Ian 2012-09-19 Ian Lance Taylor i...@google.com * configure.ac: Only use GCC_CHECK_UNWIND_GETIPINFO when compiled as a target library. * configure: Rebuild. foo.patch Description: Binary data
Re: [PATCH] Combine location with block using block_locations
Thanks for reporting. I'll fix them now. Dehao On Thu, Sep 20, 2012 at 6:29 AM, Michael Meissner meiss...@linux.vnet.ibm.com wrote: On Fri, Sep 14, 2012 at 11:03:34AM +0800, Dehao Chen wrote: Hi, I've integrated all the reviews from this thread (Thank you guys for helping refine this patch). Now the patch can pass the whole gcc testsuite as well as all spec2006 benchmarks (with LTO). Concerning memory consumption, for extreme benchmarks like tramp3d, this patch incurs around 2% peak memory overhead (mostly from the extra blocks that have been set NULL in the original implementation.) Attached is the new patch. Honza, could you help me try this on Mozilla LTO to see if the error is gone? The current patch that was checked in breaks powerpc, because you did not change INSN_LOCATOR to INSN_LOCATION in rs6000_final_prescan_insn in rs6000.c. I also see INSN_LOCATOR in the arm, bfin, c6x, mep, mips, picochip, s390, sh, and spu ports. -- Michael Meissner, IBM 5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA meiss...@linux.vnet.ibm.com fax +1 (978) 399-6899
Re: Backtrace library [1/3]
On 12-09-19 06:58 PM, Ian Lance Taylor wrote: Thanks for the additional info. I have committed this patch, which should fix the problem. Bootstrapped and ran libbacktrace tests on x86_64-unknown-linux-gnu. Thanks Ian. This patch fixes the issue. Regards, Ryan Mansfield 2012-09-19 Ian Lance Taylor i...@google.com * configure.ac: Only use GCC_CHECK_UNWIND_GETIPINFO when compiled as a target library. * configure: Rebuild.
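As background for the failure discussed in this thread, here is a minimal sketch of how a HAVE_GETIPINFO-style macro typically gates the call that triggered the implicit-declaration warning. It is not copied from libbacktrace. `_Unwind_Backtrace`, `_Unwind_GetIPInfo` and `_Unwind_GetIP` are the real <unwind.h> entry points, while `trace_state`, `record_frame`, `count_frames` and the default value given to HAVE_GETIPINFO here are assumptions made for illustration. If the macro is 1 but the compiler's unwind.h predates `_Unwind_GetIPInfo` (as with the gcc 4.1 host compiler reported above), the first branch is what fails to compile.

```cpp
#include <unwind.h>

#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption for this sketch: a configure check would normally define this.
#ifndef HAVE_GETIPINFO
#define HAVE_GETIPINFO 1
#endif

// Hypothetical container for the program counters we collect.
struct trace_state {
  std::vector<std::uintptr_t> pcs;
};

// Callback invoked by _Unwind_Backtrace once per stack frame.
static _Unwind_Reason_Code record_frame(struct _Unwind_Context* context,
                                        void* data) {
  assert(data != nullptr);
  trace_state* state = static_cast<trace_state*>(data);
  int ip_before_insn = 0;
#if HAVE_GETIPINFO
  // Newer interface: also reports whether the IP precedes the instruction.
  std::uintptr_t pc =
      (std::uintptr_t)_Unwind_GetIPInfo(context, &ip_before_insn);
#else
  // Fallback for unwinders that only provide _Unwind_GetIP.
  std::uintptr_t pc = (std::uintptr_t)_Unwind_GetIP(context);
#endif
  // A return address points after the call; step back into the call insn.
  if (!ip_before_insn && pc != 0)
    --pc;
  state->pcs.push_back(pc);
  return _URC_NO_REASON;
}

// Walk the current stack and report how many frames were seen.
std::size_t count_frames() {
  trace_state state;
  _Unwind_Backtrace(record_frame, &state);
  return state.pcs.size();
}
```

The committed fix addresses the configure side: per the ChangeLog, GCC_CHECK_UNWIND_GETIPINFO is now used only when libbacktrace is compiled as a target library.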
Re: [PATCH] Combine location with block using block_locations
The patch to fix this problem is attached. As I don't have machines other than x86, I cannot test it. But this patch seemed straightforward. I'll check it in in a couple of hours if no objection is received. Thanks, Dehao gcc/ChangeLog: 2012-09-19 Dehao Chen de...@google.com * config/s390/s390.c (s390_chunkify_start): Replacing INSN_LOCATOR. * config/spu/spu.c (emit_nop_for_insn): Likewise. (pad_bb): Likewise. (spu_emit_branch_hint): Likewise. (insert_hbrp_for_ilb_runout): Likewise. * config/mep/mep.c (mep_make_bundle): Likewise. (mep_bundle_insns): Likewise. * config/sh/sh.c (gen_block_redirect): Likewise. * config/c6x/c6x.c (gen_one_bundle): Likewise. * config/rs6000/rs6000.c (rs6000_final_prescan_insn): Likewise. * config/picochip/picochip.c (picochip_reorg): Likewise. * config/arm/arm.c (require_pic_register): Likewise. * config/mips/mips.c (mips16_gp_pseudo_reg): Likewise. * config/bfin/bfin.c (gen_one_bundle): Likewise. Index: gcc/config/s390/s390.c === --- gcc/config/s390/s390.c (revision 191494) +++ gcc/config/s390/s390.c (working copy) @@ -6869,7 +6869,7 @@ s390_chunkify_start (void) prev = prev_nonnote_insn (prev); if (prev) jump = emit_jump_insn_after_setloc (gen_jump (label), insn, - INSN_LOCATOR (prev)); + INSN_LOCATION (prev)); else jump = emit_jump_insn_after_noloc (gen_jump (label), insn); barrier = emit_barrier_after (jump); Index: gcc/config/spu/spu.c === --- gcc/config/spu/spu.c(revision 191494) +++ gcc/config/spu/spu.c(working copy) @@ -1998,7 +1998,7 @@ emit_nop_for_insn (rtx insn) else new_insn = emit_insn_after (gen_lnop (), insn); recog_memoized (new_insn); - INSN_LOCATOR (new_insn) = INSN_LOCATOR (insn); + INSN_LOCATION (new_insn) = INSN_LOCATION (insn); } /* Insert nops in basic blocks to meet dual issue alignment @@ -2037,7 +2037,7 @@ pad_bb(void) prev_insn = emit_insn_before (gen_lnop (), insn); PUT_MODE (prev_insn, GET_MODE (insn)); PUT_MODE (insn, TImode); - INSN_LOCATOR (prev_insn) = INSN_LOCATOR (insn); + INSN_LOCATION (prev_insn) = 
INSN_LOCATION (insn); length += 4; } } @@ -2106,7 +2106,7 @@ spu_emit_branch_hint (rtx before, rtx branch, rtx hint = emit_insn_before (gen_hbr (branch_label, target), before); recog_memoized (hint); - INSN_LOCATOR (hint) = INSN_LOCATOR (branch); + INSN_LOCATION (hint) = INSN_LOCATION (branch); HINTED_P (branch) = 1; if (GET_CODE (target) == LABEL_REF) @@ -2129,7 +2129,7 @@ spu_emit_branch_hint (rtx before, rtx branch, rtx which could make it too far for the branch offest to fit */ insn = emit_insn_before (gen_blockage (), hint); recog_memoized (insn); - INSN_LOCATOR (insn) = INSN_LOCATOR (hint); + INSN_LOCATION (insn) = INSN_LOCATION (hint); } else if (distance = 8 * 4) { @@ -2141,20 +2141,20 @@ spu_emit_branch_hint (rtx before, rtx branch, rtx insn = emit_insn_after (gen_nopn_nv (gen_rtx_REG (SImode, 127)), hint); recog_memoized (insn); - INSN_LOCATOR (insn) = INSN_LOCATOR (hint); + INSN_LOCATION (insn) = INSN_LOCATION (hint); } /* Make sure any nops inserted aren't scheduled before the hint. */ insn = emit_insn_after (gen_blockage (), hint); recog_memoized (insn); - INSN_LOCATOR (insn) = INSN_LOCATOR (hint); + INSN_LOCATION (insn) = INSN_LOCATION (hint); /* Make sure any nops inserted aren't scheduled after the call. */ if (CALL_P (branch) distance 8 * 4) { insn = emit_insn_before (gen_blockage (), branch); recog_memoized (insn); - INSN_LOCATOR (insn) = INSN_LOCATOR (branch); + INSN_LOCATION (insn) = INSN_LOCATION (branch); } } } @@ -2340,7 +2340,7 @@ insert_hbrp_for_ilb_runout (rtx first) insn = emit_insn_before (gen_iprefetch (GEN_INT (1)), before_4); recog_memoized (insn); - INSN_LOCATOR (insn) = INSN_LOCATOR (before_4); + INSN_LOCATION (insn) = INSN_LOCATION (before_4); INSN_ADDRESSES_NEW (insn, INSN_ADDRESSES (INSN_UID (before_4))); PUT_MODE (insn, GET_MODE (before_4)); @@ -2349,7 +2349,7 @@ insert_hbrp_for_ilb_runout (rtx first) { insn = emit_insn_before (gen_lnop (), before_4); recog_memoized (insn); -
RFA: Fix PR rtl-optimization/38449 (patch updated)
Bootstrapped in revision 191429 on i686-pc-linux-gnu. The arc.c context can be seen here: http://gcc.gnu.org/viewcvs/branches/arc-4_4-20090909-branch/gcc/config/arc/arc.c?content-type=text%2Fplainview=co The relevant code in arc.c hasn't changed in the port updated to GCC 4.8 . 2012-08-14 Joern Rennecke joern.renne...@embecosm.com PR rtl-optimization/38449: * hooks.c (hook_bool_const_rtx_const_rtx_true): New function. * hooks.h (hook_bool_const_rtx_const_rtx_true): Declare. * target.def: Merge in definitions and documentation for TARGET_CAN_FOLLOW_JUMP. * doc/tm.texi.in: Add documentation locations for the above. * doc/tm.texi: Regenerate. * reorg.c (follow_jumps): New parameters jump and cp. Changed all callers. Index: reorg.c === --- reorg.c (revision 191429) +++ reorg.c (working copy) @@ -2494,24 +2494,28 @@ fill_simple_delay_slots (int non_jumps_p #endif } -/* Follow any unconditional jump at LABEL; +/* Follow any unconditional jump at LABEL, for the purpose of redirecting JUMP; return the ultimate label reached by any such chain of jumps. Return a suitable return rtx if the chain ultimately leads to a return instruction. If LABEL is not followed by a jump, return LABEL. If the chain loops or we can't find end, return LABEL, - since that tells caller to avoid changing the insn. */ + since that tells caller to avoid changing the insn. + If the returned label is obtained by following a REG_CROSSING_JUMP + jump, set *cp to (one of) the note(s), otherwise set it to NULL_RTX. 
*/ static rtx -follow_jumps (rtx label) +follow_jumps (rtx label, rtx jump, rtx *cp) { rtx insn; rtx next; rtx value = label; int depth; + rtx crossing = NULL_RTX; if (ANY_RETURN_P (label)) return label; + *cp = 0; for (depth = 0; (depth 10 (insn = next_active_insn (value)) != 0 @@ -2537,10 +2541,15 @@ follow_jumps (rtx label) || GET_CODE (PATTERN (tem)) == ADDR_DIFF_VEC)) break; + if (!targetm.can_follow_jump (jump, insn)) + break; + if (!crossing) + crossing = find_reg_note (insn, REG_CROSSING_JUMP, NULL_RTX); value = this_label; } if (depth == 10) return label; + *cp = crossing; return value; } @@ -2984,6 +2993,7 @@ fill_slots_from_thread (rtx insn, rtx co if (new_thread != thread) { rtx label; + rtx crossing = NULL_RTX; gcc_assert (thread_if_true); @@ -2991,7 +3001,7 @@ fill_slots_from_thread (rtx insn, rtx co redirect_with_delay_list_safe_p (insn, JUMP_LABEL (new_thread), delay_list)) - new_thread = follow_jumps (JUMP_LABEL (new_thread)); + new_thread = follow_jumps (JUMP_LABEL (new_thread), insn, crossing); if (ANY_RETURN_P (new_thread)) label = find_end_label (new_thread); @@ -3001,7 +3011,11 @@ fill_slots_from_thread (rtx insn, rtx co label = get_label_before (new_thread); if (label) - reorg_redirect_jump (insn, label); + { + reorg_redirect_jump (insn, label); + if (crossing) + set_unique_reg_note (insn, REG_CROSSING_JUMP, NULL_RTX); + } } return delay_list; @@ -3347,6 +3361,7 @@ relax_delay_slots (rtx first) for (insn = first; insn; insn = next) { rtx other; + rtx crossing; next = next_active_insn (insn); @@ -3357,7 +3372,9 @@ relax_delay_slots (rtx first) (condjump_p (insn) || condjump_in_parallel_p (insn)) !ANY_RETURN_P (target_label = JUMP_LABEL (insn))) { - target_label = skip_consecutive_labels (follow_jumps (target_label)); + target_label + = skip_consecutive_labels (follow_jumps (target_label, insn, +crossing)); if (ANY_RETURN_P (target_label)) target_label = find_end_label (target_label); @@ -3369,7 +3386,11 @@ relax_delay_slots (rtx first) } if 
(target_label target_label != JUMP_LABEL (insn)) - reorg_redirect_jump (insn, target_label); + { + reorg_redirect_jump (insn, target_label); + if (crossing) + set_unique_reg_note (insn, REG_CROSSING_JUMP, NULL_RTX); + } /* See if this jump conditionally branches around an unconditional jump. If so, invert this jump and point it to the target of the @@ -3505,7 +3526,8 @@ relax_delay_slots (rtx first) /* If this jump goes to another unconditional jump, thread it, but don't convert a jump into a RETURN here. */ - trial = skip_consecutive_labels (follow_jumps (target_label)); + trial = skip_consecutive_labels (follow_jumps (target_label,
Go patch committed: Error for byte-order-mark in middle of file
I should have looked at some more of the testsuite before my last patch. This patch to the Go frontend issues an error for a Unicode byte-order-mark in the middle of a Go file, while continuing to ignore it at the beginning of the file. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline. Will commit to 4.7 branch when it reopens.

Ian

diff -r 917ece6aa599 go/lex.cc
--- a/go/lex.cc Wed Sep 19 08:50:19 2012 -0700
+++ b/go/lex.cc Wed Sep 19 17:47:42 2012 -0700
@@ -726,7 +726,7 @@
                                           &issued_error);

           // Ignore byte order mark at start of file.
-          if (ci == 0xfeff && this->lineno_ == 1 && this->lineoff_ == 0)
+          if (ci == 0xfeff)
             {
               p = pnext;
               break;
@@ -840,6 +840,14 @@
       *issued_error = true;
       return p + 1;
     }
+
+  // Warn about byte order mark, except at start of file.
+  if (*value == 0xfeff && (this->lineno_ != 1 || this->lineoff_ != 0))
+    {
+      error_at(this->location(), "Unicode (UTF-8) BOM in middle of file");
+      *issued_error = true;
+    }
+
   return p + adv;
 }
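The rule the patch enforces (a byte-order mark is tolerated only at the very start of the file) can be illustrated outside the lexer with a small stand-alone sketch. This is not the frontend's code: `misplaced_boms` and `bom_sketch_demo` are hypothetical names, and the sketch searches the raw UTF-8 bytes EF BB BF rather than decoding code points as lex.cc does. In well-formed UTF-8 that byte sequence can only encode U+FEFF.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Return the byte offsets of every UTF-8 encoded U+FEFF that is not at
// offset 0.  A BOM at offset 0 is silently accepted.
std::vector<std::size_t> misplaced_boms(const std::string& src) {
  static const std::string bom = "\xEF\xBB\xBF";
  std::vector<std::size_t> bad;
  std::size_t pos = src.find(bom);
  while (pos != std::string::npos) {
    if (pos != 0)
      bad.push_back(pos);  // would be reported as an error by a lexer
    pos = src.find(bom, pos + bom.size());
  }
  return bad;
}

// Small usage demonstration.
void bom_sketch_demo() {
  // Leading BOM only: nothing to report.
  assert(misplaced_boms("\xEF\xBB\xBFpackage p").empty());
  // BOM after the 7-byte word "package": reported at offset 7.
  std::vector<std::size_t> bad = misplaced_boms("package\xEF\xBB\xBF p");
  assert(bad.size() == 1 && bad[0] == 7);
}
```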
Re: RFA: Process '*' in '@'-output-template alternatives
Quoting Georg-Johann Lay g...@gcc.gnu.org: + * return stack_mem_p (operands[0]) ? \push 0 : \clrmem %0\; ---^ Isn't there a backslash missing? Attached is the amended patch. Again, bootstrapped on i686-pc-linux-gnu . 2011-09-19 Jorn Rennecke joern.renne...@arc.com * genoutput.c (process_template): Process '*' in '@' alternatives. * doc/md.texi (node Output Statement): Provide example for the above. Index: genoutput.c === --- genoutput.c (revision 191429) +++ genoutput.c (working copy) @@ -662,19 +662,55 @@ process_template (struct data *d, const list of assembler code templates, one for each alternative. */ else if (template_code[0] == '@') { - d-template_code = 0; - d-output_format = INSN_OUTPUT_FORMAT_MULTI; + int found_star = 0; - printf (\nstatic const char * const output_%d[] = {\n, d-code_number); + for (cp = template_code[1]; *cp; ) + { + while (ISSPACE (*cp)) + cp++; + if (*cp == '*') + found_star = 1; + while (!IS_VSPACE (*cp) *cp != '\0') + ++cp; + } + d-template_code = 0; + if (found_star) + { + d-output_format = INSN_OUTPUT_FORMAT_FUNCTION; + puts (\nstatic const char *); + printf (output_%d (rtx *operands ATTRIBUTE_UNUSED, + rtx insn ATTRIBUTE_UNUSED)\n, d-code_number); + puts ({); + puts ( switch (which_alternative)\n{); + } + else + { + d-output_format = INSN_OUTPUT_FORMAT_MULTI; + printf (\nstatic const char * const output_%d[] = {\n, + d-code_number); + } for (i = 0, cp = template_code[1]; *cp; ) { - const char *ep, *sp; + const char *ep, *sp, *bp; while (ISSPACE (*cp)) cp++; - printf ( \); + bp = cp; + if (found_star) + { + printf (case %d:, i); + if (*cp == '*') + { + printf (\n ); + cp++; + } + else + printf ( return \); + } + else + printf ( \); for (ep = sp = cp; !IS_VSPACE (*ep) *ep != '\0'; ++ep) if (!ISSPACE (*ep)) @@ -690,7 +726,18 @@ process_template (struct data *d, const cp++; } - printf (\,\n); + if (!found_star) + puts (\,); + else if (*bp != '*') + puts (\;); + else + { + /* The usual action will end with a return. 
+If there is neither break or return at the end, this is +assumed to be intentional; this allows to have multiple +consecutive alternatives share some code. */ + puts (); + } i++; } if (i == 1) @@ -700,7 +747,10 @@ process_template (struct data *d, const error_with_line (d-lineno, wrong number of alternatives in the output template); - printf (};\n); + if (found_star) + puts ( default: gcc_unreachable ();\n}\n}); + else + printf (};\n); } else { Index: doc/md.texi === --- doc/md.texi (revision 191429) +++ doc/md.texi (working copy) @@ -665,6 +665,22 @@ (define_insn @end group @end smallexample +If you just need a little bit of C code in one (or a few) alternatives, +you can use @samp{*} inside of a @samp{@@} multi-alternative template: + +@smallexample +@group +(define_insn + [(set (match_operand:SI 0 general_operand =r,,m) +(const_int 0))] + + @@ + clrreg %0 + * return stack_mem_p (operands[0]) ? \push 0\ : \clrmem %0\; + clrmem %0) +@end group +@end smallexample + @node Predicates @section Predicates @cindex predicates
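To make the effect of the genoutput.c change concrete: for the three-alternative example added to md.texi, the generator would emit a function with a switch over which_alternative rather than a string table. The sketch below is a hand-written approximation, not actual generator output. `output_sketch`, `stack_mem_p_sketch`, `output_sketch_demo` and the boolean parameter are hypothetical (the real function takes rtx operands and reads the global which_alternative), and std::abort stands in for gcc_unreachable.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for the predicate used in the md.texi example.
static bool stack_mem_p_sketch(bool operand_is_stack_mem) {
  return operand_is_stack_mem;
}

// Approximates the shape of the generated function: plain alternatives
// become "return <string>;", a '*' alternative contributes its C code.
const char* output_sketch(int which_alternative, bool operand_is_stack_mem) {
  assert(which_alternative >= 0);
  switch (which_alternative) {
    case 0: return "clrreg %0";
    case 1:
      return stack_mem_p_sketch(operand_is_stack_mem) ? "push 0" : "clrmem %0";
    case 2: return "clrmem %0";
    default: std::abort();  // stands in for gcc_unreachable ()
  }
}

// Small usage demonstration.
void output_sketch_demo() {
  assert(std::strcmp(output_sketch(0, false), "clrreg %0") == 0);
  assert(std::strcmp(output_sketch(1, true), "push 0") == 0);
  assert(std::strcmp(output_sketch(1, false), "clrmem %0") == 0);
}
```

As the comment in the patch notes, a '*' alternative that ends without a return or break falls through into the next case, which is the stated way for consecutive alternatives to share code.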
RE: [PATCH] Enable bbro for -Os
-Original Message- From: Eric Botcazou [mailto:ebotca...@adacore.com] Sent: Thursday, September 13, 2012 10:31 PM To: Zhenqiang Chen Cc: gcc-patches@gcc.gnu.org; 'Steven Bosscher'; 'Richard Guenther' Subject: Re: [PATCH] Enable bbro for -Os The updated patched is attached. Is it OK? Yes, OK for mainline. -- Eric Botcazou Thanks, committed to trunk. -Zhenqiang
[v3] 28811, 54482
Fixes for various shared/static/pic issues, see SUBJECT.

tested x86_64/linux
tested x86_64/linux --with-pic
tested x86_64/linux --without-pic*
tested x86_64/linux --disable-shared
tested x86_64/linux --disable-static
earlier versions tested on cross arm-eabisim

Most of these do what you'd expect. The default is -prefer-pic, even if the exact route is sideways and full of libtool mystery.

* Configuring without pic, i.e., --without-pic, fails earlier (in liblto-plugin), and is otherwise ignored in non-libtool libraries like libgcc, but can be simulated and ends up creating an incorrect shared library, as might be expected. Seems like --disable-shared is a better option for that scenario.

I added some notes on the src build machinery to the manual.

There's a lot of fighting libtool here. There's a patch to allow a less tortured mechanism for setting libtool's compile (not link) output, for both shared and static compiles. See: http://gcc.gnu.org/ml/gcc-patches/2012-09/msg00340.html This seems like a good idea. This patch can be considered (c). With the completion of (a), and (b), the libtool variable overrides in libstdc++/configure.ac can be removed. If this tortured example proves the case for either a or b to the libtool maintainers then I'll have accomplished something useful today.

-benjamin

2012-09-19 Benjamin Kosnik b...@redhat.com

PR libstdc++/28811 PR libstdc++/54482 * configure.ac (glibcxx_lt_pic_flag, glibcxx_compiler_pic_flag, glibcxx_compiler_shared_flag): New. Use them. (lt_prog_compiler_pic_CXX): Set via glibcxx_*_flag(s) above. (pic_mode): Set to default. (PIC_CXXFLAGS): Remove. * Makefile.am (PICFLAG, PICFLAG_FOR_TARGET): Remove. Comment. * libsupc++/Makefile.am: Use glibcxx_lt_pic_flag and glibcxx_compiler_shared_flag. Comment. * src/c++11/Makefile.am: Same. * src/c++98/Makefile.am: Same. * src/Makefile.am: Use glibcxx_compiler_pic_flag. * Makefile.in: Regenerated. * aclocal.m4: Same. * configure: Same. * doc/Makefile.in: Same. * include/Makefile.in: Same.
* libsupc++/Makefile.in: Same. * po/Makefile.in: Same. * python/Makefile.in: Same. * src/Makefile.in: Same. * src/c++11/Makefile.in: Same. * src/c++98/Makefile.in: Same. * testsuite/Makefile.in: Same. * src/c++11/compatibility-atomic-c++0x.cc: Use _GLIBCXX_SHARED instead of PIC to designate shared-only code blocks. * src/c++11/compatibility-c++0x.cc: Same. * src/c++11/compatibility-thread-c++0x.cc: Same. * src/c++98/compatibility-list-2.cc: Same. * src/c++98/compatibility.cc: : Same. * testsuite/17_intro/shared_with_static_deps.cc: New. * doc/xml/manual/build_hacking.xml: Separate configure from make/build issues, add build details. diff --git a/libstdc++-v3/Makefile.am b/libstdc++-v3/Makefile.am index 76ff043..8be4f6c 100644 --- a/libstdc++-v3/Makefile.am +++ b/libstdc++-v3/Makefile.am @@ -152,8 +152,6 @@ AM_MAKEFLAGS = \ LIBCFLAGS_FOR_TARGET=$(LIBCFLAGS_FOR_TARGET) \ MAKE=$(MAKE) \ MAKEINFO=$(MAKEINFO) $(MAKEINFOFLAGS) \ - PICFLAG=$(PICFLAG) \ - PICFLAG_FOR_TARGET=$(PICFLAG_FOR_TARGET) \ SHELL=$(SHELL) \ RUNTESTFLAGS=$(RUNTESTFLAGS) \ exec_prefix=$(exec_prefix) \ diff --git a/libstdc++-v3/configure.ac b/libstdc++-v3/configure.ac index 3943669..aff19f58 100644 --- a/libstdc++-v3/configure.ac +++ b/libstdc++-v3/configure.ac @@ -88,6 +88,7 @@ CXXFLAGS=$save_CXXFLAGS # up critical shell variables. GLIBCXX_CONFIGURE +# Libtool setup. if test x${with_newlib} != xyes; then AC_LIBTOOL_DLOPEN fi @@ -96,6 +97,38 @@ ACX_LT_HOST_FLAGS AC_SUBST(enable_shared) AC_SUBST(enable_static) +# libtool variables for C++ shared and position-independent compiles. +# +# Use glibcxx_lt_pic_flag to designate the automake variable +# used to encapsulate the default libtool approach to creating objects +# with position-independent code. Default: -prefer-pic. +# +# Use glibcxx_compiler_shared_flag to designate a compile-time flags for +# creating shared objects. Default: -D_GLIBCXX_SHARED. 
+# +# Use glibcxx_compiler_pic_flag to designate a compile-time flags for +# creating position-independent objects. This varies with the target +# hardware and operating system, but is often: -DPIC -fPIC. +if test $enable_shared = yes; then + glibcxx_lt_pic_flag=-prefer-pic + glibcxx_compiler_pic_flag=$lt_prog_compiler_pic_CXX + glibcxx_compiler_shared_flag=-D_GLIBCXX_SHARED + +else + glibcxx_lt_pic_flag= + glibcxx_compiler_pic_flag= + glibcxx_compiler_shared_flag= +fi +AC_SUBST(glibcxx_lt_pic_flag) +AC_SUBST(glibcxx_compiler_pic_flag) +AC_SUBST(glibcxx_compiler_shared_flag) + +# Override the libtool's pic_flag and pic_mode. +# Do this step after AM_PROG_LIBTOOL, but before AC_OUTPUT. +# NB: this impacts --with-pic and --without-pic.
Re: [C++ Patch / RFC] PR 52432
OK. Jason
RFA: Fix COND_EXEC handling of dead_or_set_regno_p
I've seen a number of execution failures with the soon-to-be-submitted ARCompact port in the context of gcc 4.8, which were down to an rtlanal.c:dead_or_set_p problem, e.g.:

gcc.c-torture/execute/builtin-bitops-1.c -O1

As the comment at the top of dead_or_set_p states, this function should return true if a register dies in INSN or is entirely set. A COND_EXEC does not set the register at all if the condition is not true; for the purposes of dead_or_set_p, it should be treated like a partial set or a REG_INC, i.e. the COND_EXEC and its contents should be disregarded.

Yet dead_or_set_regno_p, a helper function of dead_or_set_p, instead processes a COND_EXEC as if it always took place. I see this was introduced in the 2000-04-07 mega-patch that introduced COND_EXEC. Certainly in lots of other places the right thing is to look inside the COND_EXEC, but not here.

Fixed with the attached patch. Bootstrapped on ia64-unknown-linux-gnu.

2011-11-27  Joern Rennecke  joern.renne...@embecosm.com

	* rtlanal.c (dead_or_set_regno_p): Fix COND_EXEC handling.

Index: rtlanal.c
===
--- rtlanal.c (revision 191467)
+++ rtlanal.c (working copy)
@@ -1701,8 +1701,9 @@
   pattern = PATTERN (insn);
+  /* If a COND_EXEC is not executed, the value survives. */
   if (GET_CODE (pattern) == COND_EXEC)
-    pattern = COND_EXEC_CODE (pattern);
+    return 0;
   if (GET_CODE (pattern) == SET)
     return covers_regno_p (SET_DEST (pattern), test_regno);
Go patch committed: Fix struct hash/equality with _ fields
This patch to the Go frontend fixes the handling of struct hash and equality when the struct has _ fields. Those fields should not participate in the hash and equality functions at all. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline. Will commit to 4.7 branch when it reopens.

Ian

diff -r 3408484e8448 go/expressions.cc
--- a/go/expressions.cc Wed Sep 19 17:53:19 2012 -0700
+++ b/go/expressions.cc Wed Sep 19 21:32:53 2012 -0700
@@ -5178,6 +5178,9 @@
        pf != fields->end();
        ++pf, ++field_index)
     {
+      if (Gogo::is_sink_name(pf->field_name()))
+        continue;
+
       if (field_index > 0)
         {
           if (left_temp == NULL)
diff -r 3408484e8448 go/types.cc
--- a/go/types.cc Wed Sep 19 17:53:19 2012 -0700
+++ b/go/types.cc Wed Sep 19 21:32:53 2012 -0700
@@ -579,6 +579,9 @@
        p != fields->end();
        ++p)
     {
+      if (Gogo::is_sink_name(p->field_name()))
+        continue;
+
       if (!p->type()->is_comparable())
         {
           if (reason != NULL)
@@ -4294,6 +4297,9 @@
        pf != fields->end();
        ++pf)
     {
+      if (Gogo::is_sink_name(pf->field_name()))
+        return false;
+
       if (!pf->type()->compare_is_identity(gogo))
         return false;
@@ -4767,6 +4773,9 @@
        pf != fields->end();
        ++pf)
     {
+      if (Gogo::is_sink_name(pf->field_name()))
+        continue;
+
       if (first)
         first = false;
       else
@@ -4858,6 +4867,9 @@
        pf != fields->end();
        ++pf, ++field_index)
     {
+      if (Gogo::is_sink_name(pf->field_name()))
+        continue;
+
       // Compare one field in both P1 and P2.
       Expression* f1 = Expression::make_temporary_reference(p1, bloc);
       f1 = Expression::make_unary(OPERATOR_MULT, f1, bloc);