Re: [PATCH] ARM: Use different linker path for hardfloat ABI
On 04/09/2012 11:17 PM, Adam Conrad wrote: Like I said, then, you didn't actually read or understand why proposing multilib paths doesn't work. You realize conceptually, I hope, that there's no guarantee of uniqueness in lib/lib64/lib32/libsf/libhf once you cross the base CPU architecture boundary, right? But what you haven't done is make a case for why anyone should care about this problem. Sure, I said that /libhf/ld-linux.so.3 would *accidentally* work for us right now, due to sheer luck, and you're running with that as saying that we clearly have no problem here worth solving. When the next architecture clashes with linkers on another (hint: it almost certainly will), do we get to argue about this all over again in six months, instead of codifying a new and saner practice now? My understanding of that architecture is that it's being handled as completely different from it's prior implementations. ie, the toolchain and other things are treating it as an entirely separate architecture even though there is some common lineage to prior implementations. If the tools are treating this upcoming architecture as a separate and distinct architecture rather than as a variant of a prior architecture, then why do we have to worry about conflicting in the filesystem space? And just to be clear, I'm not taking sides, merely pointing out that you haven't made the case in this forum in a way that folks understand why this is an important problem. Jeff
Re: Ada testcase CR line endings
So, I'd like to change all the ada testcases to use normal unix line endings. testsuite/gnat.dg/taft_type2_pkg.ads is an example if one such file, any objections? No objections.
Re: Ada testcase CR line endings
On Apr 9, 2012, at 11:24 PM, Arnaud Charlet wrote: So, I'd like to change all the ada testcases to use normal unix line endings. testsuite/gnat.dg/taft_type2_pkg.ads is an example if one such file, any objections? No objections. Applied to trunk and 4.7.x. Thanks. Index: gnat.dg/array19.adb === --- gnat.dg/array19.adb (revision 186264) +++ gnat.dg/array19.adb (revision 186265) @@ -1,34 +1,34 @@ --- { dg-do compile } - -package body Array19 is - - function N return Integer is - begin - return 1; - end; - - type Array_Type is array (1 .. N) of Float; - - type Enum is (One, Two); - - type Rec (D : Enum := Enum'First) is record - case D is - when One = null; - when Two = A : Array_Type; - end case; - end record; - - procedure Proc is - - R : Rec; - - function F return Array_Type is - begin - return (others = 0.0); - end F; - - begin - R.A := F; - end; - -end Array19; +-- { dg-do compile } + +package body Array19 is + + function N return Integer is + begin + return 1; + end; + + type Array_Type is array (1 .. N) of Float; + + type Enum is (One, Two); + + type Rec (D : Enum := Enum'First) is record + case D is + when One = null; + when Two = A : Array_Type; + end case; + end record; + + procedure Proc is + + R : Rec; + + function F return Array_Type is + begin + return (others = 0.0); + end F; + + begin + R.A := F; + end; + +end Array19; Index: gnat.dg/array19.ads === --- gnat.dg/array19.ads (revision 186264) +++ gnat.dg/array19.ads (revision 186265) @@ -1,5 +1,5 @@ -package Array19 is - - procedure Proc; - -end Array19; +package Array19 is + + procedure Proc; + +end Array19; Index: gnat.dg/aggr11.adb === --- gnat.dg/aggr11.adb (revision 186264) +++ gnat.dg/aggr11.adb (revision 186265) @@ -1,17 +1,17 @@ --- { dg-do compile } --- { dg-options -O } - -with Aggr11_Pkg; use Aggr11_Pkg; - -procedure Aggr11 is - - A : Arr := ((1 = (Kind = No_Error, B = True), - 2 = (Kind = Error), - 3 = (Kind = Error), - 4 = (Kind = No_Error, B = True), - 5 = (Kind = No_Error, B = True), - 6 = (Kind = No_Error, B = True))); - -begin - null; -end; +-- { dg-do compile } +-- { dg-options -O } + +with Aggr11_Pkg; use Aggr11_Pkg; + +procedure Aggr11 is + + A : Arr := ((1 = (Kind = No_Error, B = True), + 2 = (Kind = Error), + 3 = (Kind = Error), + 4 = (Kind = No_Error, B = True), + 5 = (Kind = No_Error, B = True), + 6 = (Kind = No_Error, B = True))); + +begin + null; +end; Index: gnat.dg/discr6.adb === --- gnat.dg/discr6.adb (revision 186264) +++ gnat.dg/discr6.adb (revision 186265) @@ -1,33 +1,33 @@ --- { dg-do compile } --- { dg-options -gnatdm -gnatws } - -with Discr6_Pkg; - -procedure Discr6 is - - type T_Bit is range 0..1; - type T_Entier_16 is range -2**15 .. 2**15-1; - - package My_Q is new Discr6_Pkg(T_Entier_16); - - type T_Valeur is (BIT, Entier_16); - - type R(D : T_Valeur) is record -case D is - when BIT = V_BIT : T_Bit; - when Entier_16 = V_E16 : T_Entier_16; -end case; - end record; - for R use record -V_BIT at 0 range 0..7; -V_E16 at 0 range 0..15; -D at 8 range 0..7; - end record; - for R'size use 128; - - A : R(Entier_16); - I : Integer; - -begin - I := My_Q.X(A.V_E16); -end; +-- { dg-do compile } +-- { dg-options -gnatdm -gnatws } + +with Discr6_Pkg; + +procedure Discr6 is + + type T_Bit is range 0..1; + type T_Entier_16 is range -2**15 .. 2**15-1; + + package My_Q is new Discr6_Pkg(T_Entier_16); + + type T_Valeur is (BIT, Entier_16); + + type R(D : T_Valeur) is record +case D is + when BIT = V_BIT : T_Bit; + when Entier_16 = V_E16 : T_Entier_16; +end case; + end record; + for R use record +V_BIT at 0 range 0..7; +V_E16 at 0 range 0..15; +D at 8 range 0..7; + end record; + for R'size use 128; + + A : R(Entier_16); + I : Integer; + +begin + I := My_Q.X(A.V_E16); +end; Index: gnat.dg/aggr11_pkg.ads === --- gnat.dg/aggr11_pkg.ads (revision 186264) +++ gnat.dg/aggr11_pkg.ads (revision 186265) @@ -1,14 +1,14 @@ -package Aggr11_Pkg is - - type Error_Type is (No_Error, Error); - - type Rec (Kind : Error_Type := No_Error) is record - case Kind is - when Error = null; - when others = B : Boolean; - end case; - end record; - - type Arr is array (1..6) of Rec; - -end Aggr11_Pkg; +package Aggr11_Pkg is + + type Error_Type is (No_Error, Error); + + type
Re: [Patch, Fortran, F03] PR52909: Procedure pointers not private to modules
No patch review - but and answer to a question and a comment. Janus Weil wrote: I am aware that it will break the ABI, but only for programs involving procedure pointers (which still is a 'relatively' new feature, supported since gfortran 4.4). Btw, speaking of ABI breaking: What are the chances of the array descriptor update and ABI cleanup happening for the 4.8 release? I know such an ABI breaking has been planned for some time, but I haven't followed the gfortran mailing list in detail during the last weeks and months, so I'm not sure what the current status is. Regarding Fortran-dev (which contains the new descriptor): On the branch, the new dimension triplet (lbound, extent, sm [stride multiplier in bytes]) is used - instead of the old (lbound, ubound, stride [in # of elements]); there are still about 20 failures. The next step is to reduce them to zero. Half of them are due to c_f_pointer; I have a draft patch for it [moving it from the library to the compiler], but it still fails in an odd way. The other issues haven't been debugged. Additionally, one needs to do some optimizations - Thomas' patch is the first step into that direction. We then also have to do further changes to the other fields of the descriptor. I have no idea whether it will be ready until the 4.8 release. It is doable, but quite a lot of work needs to be done. Especially as we want to do some additional cleanup. However, there are also other gfortran projects which have to be worked on. My personal guess is that the array descriptor will be mostly but not completely ready by the end of Stage 1 and thus be deferred another year. However, as the saying goes: Prediction is very difficult, especially about the future. (Possibly by Niels Bohr, who might have taken it from the Danish poet Piet Hein, though the source of the quote is disputed.) * * * Regarding ABI breakage: I strongly propose that we - starting from 4.8 or starting from the array-descriptor ABI breakage - keep backward compatibility for the .mod files. That is, if the mod version is bumped, the compiler shall be able to read the previous version. And unless the ABI is serverely remodelled as with the array descriptor: We should try to avoid ABI breakage and if it is needed as for the procedure pointer, we should do the same as GCC's C++ compiler: -fabi-version=n -fabi-version=n Use version n of the C++ ABI. Version 2 is the ver- sion of the C++ ABI that first appeared in G++ 3.4. Version 1 is the version of the C++ ABI that first appeared in G++ 3.2. Version 0 will always be the version that conforms most closely to the C++ ABI specification. Therefore, the ABI obtained using ver- sion 0 will change as ABI bugs are fixed. The default is version 2. Version 3 corrects an error in mangling a constant address as a template argument. [...] See also -Wabi. Regarding: I am aware that it will break the ABI, but only for programs involving procedure pointers (which still is a 'relatively' new feature, supported since gfortran 4.4). That's already supported since 3 years - and there are several programs out there which use it, given that is was somewhat stable. gfortran escaped some issues by always bumping the .mod version (due to new features), which made it less likely that a user could run into issues. We also fixed some very special corner cases - in particularly with truely new features. But I don't think that proc-pointers count in this regard: They are too widely supported by old gfortran versions and by other compilers and proc-pointers as module variable is not really a corner case. Hence, we should really think about -fabi-version=n (and .mod compatibility). Unless, we are positive that we will break the ABI for the array descriptor in 4.8, I am in favour of adding -fabi-version=n already for the proc-pointer patch. Comments? Tobias
Re: [SMS] Support new loop pattern
Hello Ayal, First of all, thanks for your feedback. Now to your questions: On 31.03.2012 3:20, Ayal Zaks wrote: Roman, Andrey, Sorry for the delayed response. It would indeed be good to have SMS apply to more loop patterns, still within the realm of *countable* loops. SMS was originally designed to handle doloops, with a specific pattern controlling the loop, easily identified and separable from the loop's body. The newly proposed change to support new loop patterns is pretty invasive and sizable, taking place entirely within modulo-sched.c. The main issue I've been considering, is whether it would be possible instead to transform the new loop patterns we want SMS to handle, into doloops (potentially introducing additional induction variables to feed other uses), and then feed the resulting loop into SMS as is? In other words, could you fold it into doloop.c? And if so, will doing so introduce significant overheads? Let me perhaps explain better. The patch itself is one core patch (this thread) adding the new functionality on detecting more complex loop patterns and the three fixes to SMS found while working on the main patch (the fixes are in the mails pinged at the very end of this message). The three fixes are worthwhile to commit separately anyways, they are splitted up from the main patch for this purpose, so I would suggest to consider them in any case. For the main patch, its core is as small as we could get. It stays with the countable loops as for the cases where we could get overflow behavior or infinite loops we bail out early. We handle only a case of simple same-step affine counters. The main reason why we add support to SMS and not to the doloop pass are is when we do not pipeline a loop newly transformed to the doloop form, this loop actually slows down on the platforms not having a true doloop pattern. One has to undo the doloop form and to get back to the original loop form to avoid this, which seems rather strange. Also, the separate decrement insn that changes the induction variable is better be scheduled to get more precise schedule. And yes, I believe that making an extra induction variable just to have the control part without uses in the loop core will be unnecessary overhead. Thus, I believe that if we do want SMS to handle more complex loop, then it is inevitable that SMS itself would be somewhat more complex. I would welcome your suggestions to make the patch more clear. One way I see is that the function for getting the condition of the new loop form can be moved to the generic RTL loop code given the agreement of other RTL maintainers. Also, some new helpers can be introduced for handling this specific loop forms. But it seems that the distinction between doloop/non-doloop loops has to stay in the code. Yours, Andrey 2012/3/29 Andrey Belevantseva...@ispras.ru: Hello, I'd like to ping again those SMS patches once we're back to Stage 1. Ayal, maybe it would remove some burden for you if you'd review the general SMS functionality of those patches, and we'd ask RTL folks to look at the pieces related to RTL pattern matching and generation? It definitely would ... especially in light of the above issue. Thanks (for your patches, patience, pings..), Ayal. Yours, Andrey On 10.02.2012 16:15, Roman Zhuykov wrote: Ping. Ayal, please review this patch and these three patches too: http://gcc.gnu.org/ml/gcc-patches/2011-12/msg00505.html http://gcc.gnu.org/ml/gcc-patches/2011-12/msg00506.html http://gcc.gnu.org/ml/gcc-patches/2011-12/msg01800.html -- Roman Zhuykov zhr...@ispras.ru
Re: [patch] Fix cygwin ada install [was Re: Yet another issue with gcc current trunk with ada on cygwin]
Back on this! It turn out that this breaks the shared Ada runtime. Indeed, exported variables on Ada packages in a DLLs are only accessible when linking against DLL (thanks to runtime pseudo reloc). With the patch applied it is not possible to build any Ada application using the shared runtime. This is a very serious problem, the patch should be reverted. Indeed, that's a blocking issue. Pascal, could you take care of reverting this change (at least the parts that are intrusve)? Thanks in advance. Arno
Re: [Patch ARM/ configury] Add fall-back check for gnu_unique_object
The patch with correct configure output is ok. Thanks - this is what I committed. Is this something that can be considered for backporting to release branches ? This patch technically doesn't fix a regression but brings in line behaviour of the normal bootstrap for %gnu_unique_object ? http://gcc.gnu.org/ml/gcc-patches/2012-03/msg01046.html regards, Ramana Ramana 2012-03-15 Ramana Radhakrishnan ramana.radhakrish...@linaro.org * config.gcc (target_type_format_char): New. Document it. Set it for arm*-*-* . * configure.ac (gnu_unique_option): Use target_type_format_char in test. Comment rationale. * configure: Regenerate . Ramana Paolo
Re: [PATCH] ARM: Use different linker path for hardfloat ABI
On Tue, 10 Apr 2012 07:36:07 +0200 Jakub Jelinek ja...@redhat.com wrote: We really want consistency about the dynamic linker names etc. across different targets and sneaking silently multiarch paths on one architecture would make it inconsistent with all the others. So, please just use /libhf/ld-linux.so.3. I personally find /libhf extremely ugly. If having a second path is a problem, how about using the triplet in the filename? Like: /lib/ld-linux-arm-linux-gnueabihf.so.3 ? Unique per arch and not multiarched. And AFAIK, Linux can handle long filenames just fine... Konstantinos
Re: [Libiberty - V2]: Handle VMS as a LLP64 platform in splay-tree.h
On Apr 6, 2012, at 12:26 AM, Ian Lance Taylor wrote: Tristan Gingold ging...@adacore.com writes: gcc/ 2012-04-05 Tristan Gingold ging...@adacore.com * gengtype.c (main): Make uintptr_t a known type. include/ 2012-04-05 Tristan Gingold ging...@adacore.com * splay-tree.h: Conditionnaly includes stdint.h and inttypes.h (libi_uhostptr_t, libi_shostptr_t): Remove, replaced by uintptr_t. This is OK. Thanks, committed. Tristan.
Re: Fix partitioning of aliases
On Mon, 9 Apr 2012, Jan Hubicka wrote: Hi, this patch fixes several different ICEs related to handling aliases in WHOPR partitioning. It took me over week debug this, but when variable alias is added to a boundary and its destination is not added, we get queue of unforutnate events where the destinatoin gets analyzed and added at ltrans time resulting in interesting miscompilation seen at Mozilla with some vtables. The problem is that constructor won't get streamed when the declaration is not in varpool at partitioning time and thus once the variable is re-added it has zero constructor. Of course the problem manifests itself in various weird ways depending on ordering of linker command maing it very difficult to reduce anything. While working on this I also noticed that PR 52634 is about related problem where aliases are incorectly partitioned into multiple partitions. The patch also fixes the varpool ICEs mentioned in the other two PRs. I failed to produce testcase version of PR52722 testcase, since it does not link now either, but it won't ICE. I will commit the patch and wait for some time, but I would like to backport it to 4.7, since it solves quite nasty misoptimization problem. Yeah, it looks fine to me. At mainline after this patch i would like to follow with series of cleanups and API changes I have in queue for symtab work. Thanks, Richard. Honza PR lto/52722 PR lto/51765 PR lto/52634 * lto-cgraph.c (compute_ltrans_boundary): When alias is in the boundary, add its target too. * lto.c (add_references_to_partition): Add also aliased nodes. (add_cgraph_node_to_partition, add_varpool_node_to_partition): Work on nodes, not functions/variables; when adding alias, add also the aliased object. * gcc.dg/lto/pr52634_1.c: New testcase. * gcc.dg/lto/pr52634_0.c: New testcase. Index: lto-cgraph.c === *** lto-cgraph.c (revision 185767) --- lto-cgraph.c (working copy) *** compute_ltrans_boundary (struct lto_out_ *** 799,804 --- 799,806 lto_set_varpool_encoder_encode_initializer (varpool_encoder, vnode); add_references (encoder, varpool_encoder, vnode-ref_list); } + else if (vnode-alias || vnode-alias_of) + add_references (encoder, varpool_encoder, vnode-ref_list); } /* Go over all the nodes again to include callees that are not in Index: lto/lto.c === *** lto/lto.c (revision 185767) --- lto/lto.c (working copy) *** free_ltrans_partitions (void) *** 1444,1450 VEC_free (ltrans_partition, heap, ltrans_partitions); } ! /* See all references that go to comdat objects and bring them into partition too. */ static void add_references_to_partition (ltrans_partition part, struct ipa_ref_list *refs) { --- 1444,1451 VEC_free (ltrans_partition, heap, ltrans_partitions); } ! /* See all references that go to comdat objects and bring them into partition too. !Also see all aliases of the newly added entry and bring them, too. */ static void add_references_to_partition (ltrans_partition part, struct ipa_ref_list *refs) { *** add_references_to_partition (ltrans_part *** 1453,1467 for (i = 0; ipa_ref_list_reference_iterate (refs, i, ref); i++) { if (ref-refered_type == IPA_REF_CGRAPH !DECL_COMDAT (cgraph_function_node (ipa_ref_node (ref), NULL)-decl) !cgraph_node_in_set_p (ipa_ref_node (ref), part-cgraph_set)) add_cgraph_node_to_partition (part, ipa_ref_node (ref)); else if (ref-refered_type == IPA_REF_VARPOOL ! DECL_COMDAT (ipa_ref_varpool_node (ref)-decl) ! !varpool_node_in_set_p (ipa_ref_varpool_node (ref), part-varpool_set)) add_varpool_node_to_partition (part, ipa_ref_varpool_node (ref)); } } /* Worker for add_cgraph_node_to_partition. */ --- 1454,1498 for (i = 0; ipa_ref_list_reference_iterate (refs, i, ref); i++) { if (ref-refered_type == IPA_REF_CGRAPH !(DECL_COMDAT (cgraph_function_node (ipa_ref_node (ref), !NULL)-decl) ! || (ref-use == IPA_REF_ALIAS !lookup_attribute !(weakref, DECL_ATTRIBUTES (ipa_ref_node (ref)-decl !cgraph_node_in_set_p (ipa_ref_node (ref), part-cgraph_set)) add_cgraph_node_to_partition (part, ipa_ref_node (ref)); else if (ref-refered_type == IPA_REF_VARPOOL ! (DECL_COMDAT (ipa_ref_varpool_node (ref)-decl) ! || (ref-use == IPA_REF_ALIAS ! lookup_attribute ! (weakref, ! DECL_ATTRIBUTES (ipa_ref_varpool_node (ref)-decl !
Re: [patch] Add generic __builtin_bswap16 support
On Mon, Apr 9, 2012 at 11:43 PM, Eric Botcazou ebotca...@adacore.com wrote: Hi, this adds generic support for __builtin_bswap16 (only PowerPC has it for now). It is mapped to the bswap optab in HImode, whose implementation is as follows: - if a bswaphi2 pattern is present (PowerPC), it is directly used; or else - if a rotlhi2/rotrhi2 pattern is present (x86), it is directly used, or else - if ashlxx2 lshrxx2 are present (SPARC), they are used to open-code the operation, or else, - if a bswapsi2 pattern is present, it is used (with a final shift), otherwise - the bswapsi2 libcall is used (with a final shift). Since it is expected that most architectures will fall into one of the first 3 cases, no __bswaphi2 symbol is added to libgcc. And pass_optimize_bswap isn't modified to recognize the builtin either, as this seems overkill to me. Tested on x86, x86-64 and PowerPC Linux, OK for the mainline? Ok. Thanks, Richard. 2012-04-09 Eric Botcazou ebotca...@adacore.com PR target/52624 * doc/extend.texi (Other Builtins): Document __builtin_bswap16. (PowerPC AltiVec/VSX Built-in Functions): Remove it. * builtin-types.def (BT_UINT16): New primitive type. (BT_FN_UINT16_UINT16): New function type. * builtins.def (BUILT_IN_BSWAP16): New. * builtins.c (expand_builtin_bswap): Add TARGET_MODE argument. (expand_builtin) BUILT_IN_BSWAP16: New case. Pass TARGET_MODE to expand_builtin_bswap. * optabs.c (expand_unop): Deal with bswap in HImode specially. Add missing bits for bswap to libcall code. * tree.c (build_common_tree_nodes): Build uint16_type_node. * tree.h (enum tree_index): Add TI_UINT16_TYPE. (uint16_type_node): New define. * config/rs6000/rs6000-builtin.def (RS6000_BUILTIN_BSWAP_HI): Delete. * config/rs6000/rs6000.c (rs6000_expand_builtin): Remove handling of above builtin. (rs6000_init_builtins): Likewise. * config/rs6000/rs6000.md (bswaphi2): Add TARGET_POWERPC predicate. 2012-04-09 Eric Botcazou ebotca...@adacore.com c-family/ * c-common.h (uint16_type_node): Rename into... (c_uint16_type_node): ...this. * c-common.c (c_common_nodes_and_builtins): Adjust for above renaming. * c-cppbuiltin.c (builtin_define_stdint_macros): Likewise. 2012-04-09 Eric Botcazou ebotca...@adacore.com testsuite/ * gcc.dg/builtin-bswap-1.c: Test __builtin_bswap16 __builtin_bswap64. * gcc.dg/builtin-bswap-4.c: Test __builtin_bswap16. * gcc.dg/builtin-bswap-5.c: Likewise. * gcc.target/i386/builtin-bswap-4.c: New test. -- Eric Botcazou
Re: [PATCH] Fix PR52614
I don't know enough about Fortran to know whether the same issues arise there. Perhaps in Fortran a common symbol is always a common symbol and can never be a defined symbol. If that is the case then for Fortran I think it would be safe to change the alignment of the common symbol. Of course we would need to have some way to indicate that in the middle-end--DECL_ALWAYS_COMMON or something. I don't know if this answers the question, but I have run the gfortran test suite with -fno-common and I have seen only one failure: [macbook] lin/test% gfc /opt/gcc/work/gcc/testsuite/gfortran.dg/global_vars_f90_init.f90 /opt/gcc/work/gcc/testsuite/gfortran.dg/global_vars_f90_init_driver.c -fno-common -save-temps ld: duplicate symbol _i in global_vars_f90_init_driver.o and global_vars_f90_init.o collect2: error: ld returned 1 exit status AFAICT, COMMON is only lightly tested in the gfortran test suite and mostly for errors or warnings. However I find surprising that gfortran.dg/bind_c_coms.f90 + gfortran.dg/bind_c_coms_driver.c works: [macbook] lin/test% gfc /opt/gcc/work/gcc/testsuite/gfortran.dg/bind_c_coms.f90 /opt/gcc/work/gcc/testsuite/gfortran.dg/bind_c_coms_driver.c -fno-common [macbook] lin/test% a.out [macbook] lin/test% Dominique
Re: [PATCH] Fix PR middle-end/52894
On Tue, Apr 10, 2012 at 2:45 AM, John David Anglin d...@hiauly1.hia.nrc.ca wrote: The current 4.5, 4.6 and 4.7 branches are now seriously broken for all PA target due to the fix applied for PR middle-end/52640, an optimization fix. This is because the PA backend defers output of function descriptors and externals (hpux) using TARGET_ASM_FILE_END. As a result, assemble_external can be called after after varasm.c processes it's list of pending externals and the vector used for this list deleted. The attached change fixes this problem. The change has been tested on hppa-unknown-linux-gnu and hppa2.0w-hp-hpux11.11 with no observed regressions. I believe it to be the simplest way to fix these release branches although it may not be optimal (externals are doublely delayed on PA hpux). Ok for the 4.5, 4.6 and 4.7 branches? The fix for PR middle-end/52640 was not applied to the trunk. I can't immediately see how your description of the list of pending externals and the vector is deleted. pa.c keeps its own vector which references the decls and the only issue I see is that if you call assemble_external after processing externals you ICE because the pointer-set is not initialized? Why does the pa backend end up calling assemble_external at final time? Richard. Dave -- J. David Anglin dave.ang...@nrc-cnrc.gc.ca National Research Council of Canada (613) 990-0752 (FAX: 952-6602) 2012-04-09 John David Anglin dave.ang...@nrc-cnrc.gc.ca PR middle-end/52894 * varasm.c (process_pending_assemble_externals): Set pending_assemble_externals_processed true. (assemble_external): Call assemble_external_real if the pending assemble externals have been processed. Index: varasm.c === --- varasm.c (revision 186213) +++ varasm.c (working copy) @@ -2108,6 +2108,11 @@ the entire pending_assemble_externals list. See assemble_external(). */ static struct pointer_set_t *pending_assemble_externals_set; +/* Some targets delay some output to final using TARGET_ASM_FILE_END. + As a result, assemble_external can be called after the list of externals + is processed and the pointer set destroyed. */ +static bool pending_assemble_externals_processed; + #ifdef ASM_OUTPUT_EXTERNAL /* True if DECL is a function decl for which no out-of-line copy exists. It is assumed that DECL's assembler name has been set. */ @@ -2160,6 +2165,7 @@ assemble_external_real (TREE_VALUE (list)); pending_assemble_externals = 0; + pending_assemble_externals_processed = true; pointer_set_destroy (pending_assemble_externals_set); #endif } @@ -2201,6 +2207,12 @@ weak_decls = tree_cons (NULL, decl, weak_decls); #ifdef ASM_OUTPUT_EXTERNAL + if (pending_assemble_externals_processed) + { + assemble_external_real (decl); + return; + } + if (! pointer_set_insert (pending_assemble_externals_set, decl)) pending_assemble_externals = tree_cons (NULL, decl, pending_assemble_externals);
Re: ICE on bootstrap of libstdc++ for mingw-targets in add_bb_to_loop
On Tue, Apr 10, 2012 at 12:41 PM, Kai Tietz ktiet...@googlemail.com wrote: Hello, recent changes to cfgloop have caused an ICE on bootstrapping libstdc++ for mingw targets. I assume same ICE happens for cygwin hosted gcc bootstrap, too. ICE happens on compiling of gcc/libstdc++-v3/libsupc++/eh_alloc.cc Code which produces seg-fault is in macro: DEF_VEC_P (loop_p); backtrace: #0 add_bb_to_loop (bb=0x7ef68ca0, loop=0x0) at ../../gcc/gcc/cfgloop.h:88 loop is NULL, so the issue is that mingw somewhere generates a basic-block without a loop father. #1 0x00a5e3d7 in split_block (bb=0x7ef68bc8, i=0x7eb6ea30) for this block (bb). Find out where it is generated and fix that. Richard. at ../../gcc/gcc/cfghooks.c:450 #2 0x00abaf49 in find_many_sub_basic_blocks (blocks=0x442f178) at ../../gcc/gcc/cfgbuild.c:474 #3 0x008a280f in break_superblocks () at ../../gcc/gcc/cfglayout.c:1345 #4 0x007293f9 in finish_eh_generation () at ../../gcc/gcc/except.c:1431 #5 0x00dc06f4 in gimple_expand_cfg () at ../../gcc/gcc/cfgexpand.c:4652 #6 0x008d4afc in execute_one_pass (pass=0x1143bc0) at ../../gcc/gcc/passes.c:2079 #7 0x008d4cc4 in execute_pass_list (pass=0x4350238) at ../../gcc/gcc/passes.c:2134 #8 0x00a8b1c4 in tree_rest_of_compilation (fndecl=0x7eca1d00) at ../../gcc/gcc/tree-optimize.c:422 #9 0x008da273 in cgraph_expand_function (node=0x7ecd7578) at ../../gcc/gcc/cgraphunit.c:1784 #10 0x008dc2d9 in cgraph_optimize () at ../../gcc/gcc/cgraphunit.c:1851 #11 0x008dcb2e in cgraph_finalize_compilation_unit () at ../../gcc/gcc/cgraphunit.c:2628 #12 0x00510c5b in cp_write_global_declarations () at ../../gcc/gcc/cp/decl2.c:4077 #13 0x008e6452 in toplev_main (argc=30, argv=0x43497d0) at ../../gcc/gcc/toplev.c:572 #14 0x006a104a in main (argc=30, argv=0x43497d0) at ../../gcc/gcc/main.c:36
Re: [PATCH] Fix PR52614
On Tue, Apr 10, 2012 at 1:00 PM, Dominique Dhumieres domi...@lps.ens.fr wrote: I don't know enough about Fortran to know whether the same issues arise there. Perhaps in Fortran a common symbol is always a common symbol and can never be a defined symbol. If that is the case then for Fortran I think it would be safe to change the alignment of the common symbol. Of course we would need to have some way to indicate that in the middle-end--DECL_ALWAYS_COMMON or something. I don't know if this answers the question, but I have run the gfortran test suite with -fno-common and I have seen only one failure: [macbook] lin/test% gfc /opt/gcc/work/gcc/testsuite/gfortran.dg/global_vars_f90_init.f90 /opt/gcc/work/gcc/testsuite/gfortran.dg/global_vars_f90_init_driver.c -fno-common -save-temps ld: duplicate symbol _i in global_vars_f90_init_driver.o and global_vars_f90_init.o collect2: error: ld returned 1 exit status AFAICT, COMMON is only lightly tested in the gfortran test suite and mostly for errors or warnings. However I find surprising that gfortran.dg/bind_c_coms.f90 + gfortran.dg/bind_c_coms_driver.c works: -fno-common seems to be ignored by the fortran frontend, thus, subroutine foo common/BAR/ x real(8) x end BAR is still a common decl with -fno-common. For your testcase only the C compiled parts end up not using commons. Thus the vectorizer would fail to promote the alignment of BAR regardless of -f[no-]common. GFortran OTOH does .comm bar_,8,16 for the above, thus aligns it to 16 already (to 32 with -mavx, probably for vectorization). So we should be fine. Richard. [macbook] lin/test% gfc /opt/gcc/work/gcc/testsuite/gfortran.dg/bind_c_coms.f90 /opt/gcc/work/gcc/testsuite/gfortran.dg/bind_c_coms_driver.c -fno-common [macbook] lin/test% a.out [macbook] lin/test% Dominique
[PATCH] Move tree_rest_of_compilation
To cgraphunit.c, where its only caller (should) reside. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied. Richard. 2012-04-10 Richard Guenther rguent...@suse.de * toplev.h (tree_rest_of_compilation): Remove. * tree-optimize.c (tree_rest_of_compilation): Likewise. * cgraph.h (tree_rest_of_compilation): Declare. * tree-optimize.c (tree_rest_of_compilation): Move ... * cgraphunit.c (tree_rest_of_compilation): ... here. * cgraph.c (cgraph_add_new_function): Adjust. * Makefile.in (tree-optimize.o): Adjust. (cgraphunit.o): Likewise. Index: trunk/gcc/cgraph.h === *** trunk.orig/gcc/cgraph.h 2012-04-10 10:19:22.0 +0200 --- trunk/gcc/cgraph.h 2012-04-10 10:20:45.216721264 +0200 *** void cgraph_mark_if_needed (tree); *** 565,570 --- 565,571 void cgraph_analyze_function (struct cgraph_node *); void cgraph_finalize_compilation_unit (void); void cgraph_optimize (void); + void tree_rest_of_compilation (struct cgraph_node *); void cgraph_mark_needed_node (struct cgraph_node *); void cgraph_mark_address_taken_node (struct cgraph_node *); void cgraph_mark_reachable_node (struct cgraph_node *); Index: trunk/gcc/cgraphunit.c === *** trunk.orig/gcc/cgraphunit.c 2012-04-10 10:20:00.0 +0200 --- trunk/gcc/cgraphunit.c 2012-04-10 10:57:26.947589800 +0200 *** along with GCC; see the file COPYING3. *** 111,116 --- 111,117 #include coretypes.h #include tm.h #include tree.h + #include output.h #include rtl.h #include tree-flow.h #include tree-inline.h *** along with GCC; see the file COPYING3. *** 141,146 --- 142,148 #include ipa-inline.h #include ipa-utils.h #include lto-streamer.h + #include regset.h /* FIXME: For reg_obstack. */ static void cgraph_expand_all_functions (void); static void cgraph_mark_functions_to_output (void); *** assemble_thunks_and_aliases (struct cgra *** 1768,1773 --- 1770,1863 } } + /* For functions-as-trees languages, this performs all optimization and +compilation for FNDECL. */ + + void + tree_rest_of_compilation (struct cgraph_node *node) + { + tree fndecl = node-decl; + location_t saved_loc; + + timevar_push (TV_REST_OF_COMPILATION); + + gcc_assert (cgraph_global_info_ready); + + /* Initialize the default bitmap obstack. */ + bitmap_obstack_initialize (NULL); + + /* Initialize the RTL code for the function. */ + current_function_decl = fndecl; + saved_loc = input_location; + input_location = DECL_SOURCE_LOCATION (fndecl); + init_function_start (fndecl); + + gimple_register_cfg_hooks (); + + bitmap_obstack_initialize (reg_obstack); /* FIXME, only at RTL generation*/ + + execute_all_ipa_transforms (); + + /* Perform all tree transforms and optimizations. */ + + /* Signal the start of passes. */ + invoke_plugin_callbacks (PLUGIN_ALL_PASSES_START, NULL); + + execute_pass_list (all_passes); + + /* Signal the end of passes. */ + invoke_plugin_callbacks (PLUGIN_ALL_PASSES_END, NULL); + + bitmap_obstack_release (reg_obstack); + + /* Release the default bitmap obstack. */ + bitmap_obstack_release (NULL); + + set_cfun (NULL); + + /* If requested, warn about function definitions where the function will + return a value (usually of some struct or union type) which itself will + take up a lot of stack space. */ + if (warn_larger_than !DECL_EXTERNAL (fndecl) TREE_TYPE (fndecl)) + { + tree ret_type = TREE_TYPE (TREE_TYPE (fndecl)); + + if (ret_type TYPE_SIZE_UNIT (ret_type) + TREE_CODE (TYPE_SIZE_UNIT (ret_type)) == INTEGER_CST + 0 compare_tree_int (TYPE_SIZE_UNIT (ret_type), + larger_than_size)) + { + unsigned int size_as_int + = TREE_INT_CST_LOW (TYPE_SIZE_UNIT (ret_type)); + + if (compare_tree_int (TYPE_SIZE_UNIT (ret_type), size_as_int) == 0) + warning (OPT_Wlarger_than_, size of return value of %q+D is %u bytes, + fndecl, size_as_int); + else + warning (OPT_Wlarger_than_, size of return value of %q+D is larger than %wd bytes, + fndecl, larger_than_size); + } + } + + gimple_set_body (fndecl, NULL); + if (DECL_STRUCT_FUNCTION (fndecl) == 0 +!cgraph_get_node (fndecl)-origin) + { + /* Stop pointing to the local nodes about to be freed. +But DECL_INITIAL must remain nonzero so we know this +was an actual function definition. +For a nested function, this is done in c_pop_function_context. +If rest_of_compilation set this to 0, leave it 0. */ + if (DECL_INITIAL (fndecl) != 0) + DECL_INITIAL (fndecl) =
[Patch] tree-parloops.c (eliminate_local_variables): Add braces to suppress warnings
Hi, Patch OK? ChangeLog: tree-parloops.c (eliminate_local_variables): Add braces to suppress warnings. Index: tree-parloops.c === --- tree-parloops.c (revision 186243) +++ tree-parloops.c (working copy) @@ -723,13 +723,15 @@ FOR_EACH_VEC_ELT (basic_block, body, i, bb) if (bb != entry_bb bb != exit_bb) for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (gsi)) - if (is_gimple_debug (gsi_stmt (gsi))) - { - if (gimple_debug_bind_p (gsi_stmt (gsi))) - has_debug_stmt = true; - } - else - eliminate_local_variables_stmt (entry, gsi, decl_address); +{ + if (is_gimple_debug (gsi_stmt (gsi))) + { + if (gimple_debug_bind_p (gsi_stmt (gsi))) + has_debug_stmt = true; + } + else + eliminate_local_variables_stmt (entry, gsi, decl_address); +} if (has_debug_stmt) FOR_EACH_VEC_ELT (basic_block, body, i, bb) signature.asc Description: OpenPGP digital signature
Re: Ada testcase CR line endings
On 4/10/2012 1:35 AM, Mike Stump wrote: So, I'd like to change all the ada testcases to use normal unix line endings. testsuite/gnat.dg/taft_type2_pkg.ads is an example if one such file, any objections? As long as the test is not about line endings this seems fine.
Re: [PATCH] Fix PR middle-end/52894
On 10-Apr-12, at 7:06 AM, Richard Guenther wrote: I can't immediately see how your description of the list of pending externals and the vector is deleted. pa.c keeps its own vector which references the decls and the only issue I see is that if you call assemble_external after processing externals you ICE because the pointer-set is not initialized? Why does the pa backend end up calling assemble_external at final time? The backend calls assemble_integer to output function descriptors at final time. This indirectly calls assemble_external from output_addr_const. This can be seen in comment #4 in the PR. This occurs after the pending externals are processed. We have to ensure that we only output function descriptors that are referenced, and we can only determine this at final. As a result, assemble_external is called after the VEC has been deleted. The compiler doesn't ICE. It goes into an infinite loop when a call to pointer_set_insert tries to extend the deleted VEC. It does this even if the pointer is already in the VEC hash. Previously, the storage wasn't deleted. I suspect the ineffective calls on hpux weren't noticed because assemble_external had already been called for the function descriptors earlier, so the list in the backend was complete. The problem was first seen with the Linux target. There is no backend list for this target because it doesn't use or need assemble_external. I personally think it was wrong to add the deferral in assemble_external, but that's another issue. Dave.
[patch, fortran] Trim spaces on list-directed reads
Hello world, this patch effectively trims the spaces from the string on list-directed reads. This avoids the large overhead on processing these spaces when reading from long lines. I didn't do this for internal units which are arrays because this would need a separate calculation for each record, and would mean a major change to the way the code is structured. The running time for the test case from PR 50673 is reduced from ~8 s to 0.1 s by this patch. Regression-tested. OK for trunk? Thomas 2012-04-09 Thomas Koenig tkoe...@gcc.gnu.org PR libfortran/38199 PR libfortran/50673 * intrinsics/string_intriniscs_inc.c (string_len_trim): Remove prototypes for string_len_trim and move to... * libgfortran.h (string_len_trim): ... here and (string_len_trim_char4): ...here. * io/unit.c: For non-array internal arrays where we do reading, adjust the record length to the last non-blank character. * io/unix.c: Fix typo. Index: intrinsics/string_intrinsics_inc.c === --- intrinsics/string_intrinsics_inc.c (Revision 186190) +++ intrinsics/string_intrinsics_inc.c (Arbeitskopie) @@ -44,9 +44,6 @@ extern void concat_string (gfc_charlen_type, CHART gfc_charlen_type, const CHARTYPE *); export_proto(concat_string); -extern gfc_charlen_type string_len_trim (gfc_charlen_type, const CHARTYPE *); -export_proto(string_len_trim); - extern void adjustl (CHARTYPE *, gfc_charlen_type, const CHARTYPE *); export_proto(adjustl); Index: libgfortran.h === --- libgfortran.h (Revision 186190) +++ libgfortran.h (Arbeitskopie) @@ -791,6 +791,13 @@ internal_proto(fstrcpy); extern gfc_charlen_type cf_strcpy (char *, gfc_charlen_type, const char *); internal_proto(cf_strcpy); +extern gfc_charlen_type string_len_trim (gfc_charlen_type, const char *); +export_proto(string_len_trim); + +extern gfc_charlen_type string_len_trim_char4 (gfc_charlen_type, + const gfc_char4_t *); +export_proto(string_len_trim_char4); + /* io/intrinsics.c */ extern void flush_all_units (void); Index: io/unit.c === --- io/unit.c (Revision 186190) +++ io/unit.c (Arbeitskopie) @@ -397,7 +397,7 @@ get_internal_unit (st_parameter_dt *dtp) __gthread_mutex_lock (iunit-lock); iunit-recl = dtp-internal_unit_len; - + /* For internal units we set the unit number to -1. Otherwise internal units can be mistaken for a pre-connected unit or some other file I/O unit. */ @@ -415,6 +415,26 @@ get_internal_unit (st_parameter_dt *dtp) start_record *= iunit-recl; } + else +{ + /* If we are not processing an array, adjust the unit record length not + to include trailing blanks for list-formatted reads. */ + if (dtp-u.p.mode == READING dtp-format == NULL) + { + if (dtp-common.unit == 0) + { + dtp-internal_unit_len = + string_len_trim (dtp-internal_unit_len, dtp-internal_unit); + iunit-recl = dtp-internal_unit_len; + } + else + { + dtp-internal_unit_len = + string_len_trim_char4 (dtp-internal_unit_len, dtp-internal_unit); + iunit-recl = dtp-internal_unit_len; + } + } +} /* Set initial values for unit parameters. */ if (dtp-common.unit) Index: io/unix.c === --- io/unix.c (Revision 186190) +++ io/unix.c (Arbeitskopie) @@ -736,7 +736,7 @@ mem_alloc_w4 (stream * strm, int * len) } -/* Stream read function for character(kine=1) internal units. */ +/* Stream read function for character(kind=1) internal units. */ static ssize_t mem_read (stream * s, void * buf, ssize_t nbytes)
Re: [Patch] tree-parloops.c (eliminate_local_variables): Add braces to suppress warnings
On Tue, Apr 10, 2012 at 2:15 PM, JonY jo...@users.sourceforge.net wrote: Hi, Patch OK? What kind of warning? ChangeLog: tree-parloops.c (eliminate_local_variables): Add braces to suppress warnings. Index: tree-parloops.c === --- tree-parloops.c (revision 186243) +++ tree-parloops.c (working copy) @@ -723,13 +723,15 @@ FOR_EACH_VEC_ELT (basic_block, body, i, bb) if (bb != entry_bb bb != exit_bb) for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (gsi)) - if (is_gimple_debug (gsi_stmt (gsi))) - { - if (gimple_debug_bind_p (gsi_stmt (gsi))) - has_debug_stmt = true; - } - else - eliminate_local_variables_stmt (entry, gsi, decl_address); + { + if (is_gimple_debug (gsi_stmt (gsi))) + { + if (gimple_debug_bind_p (gsi_stmt (gsi))) + has_debug_stmt = true; + } + else + eliminate_local_variables_stmt (entry, gsi, decl_address); + } if (has_debug_stmt) FOR_EACH_VEC_ELT (basic_block, body, i, bb)
Re: [PATCH, i386, Android] -mandroid support for i386 target
Ping. On Apr 4, 2012 at 1:02, Ilya Enkovich enkovich@gmail.com wrote: On Tue, Apr 3, 2012 at 3:49 AM, Ilya Enkovich enkovich@gmail.com wrote: On 3/04/2012, at 2:16 AM, Ilya Enkovich wrote: The point is that one can build a toolchain for i686-linux-gnu that will support both 32-bit and 64-bit multilibs. The 32-bit multilib will be used by default, and compilation for 64-bit user-space can be requested with -m64 option. Even though Android is not supported for -m64, such a toolchain can support Android as a additional -m32 -mandroid multilib. I.e., the toolchain will have three multilibs in total: -m32 (default), -m64 and -m32 -mandroid. I386/linux64.h will be picked up for such a toolchain, even though by default it would compile for 32-bit target. Does this clear up things? I think I see your point. And it seems to make all this work I'll also have to rename i386/gnu-user.h into i386/gnu-user32.h and create i386/gnu-user.h with common definitions to be included by gnu-user[32|64].h. Otherwise I wont be able to use some definitions (i.e. GNU_USER_TARGET_LINK_SPEC and GNU_USER_TARGET_MATHFILE_SPEC) in linux64.h. Right? It's simpler that you think. The target headers ($tm_file in config.gcc -- gnu-user.h, linux*.h, etc. in this case) are all included into tm.h, which serves as proxy to all those headers. All definitions made in preceding headers are available in subsequent headers. So, given that i386/gnu-user*.h precedes i386/linux*.h in config.gcc's $tm_file, you only need to touch linux*.h. Thanks, -- Maxim Kuvyrkov CodeSourcery / Mentor Graphics Hi, I prepared version with common linux.h and gnu-user.h. Does it look OK? Bootstrapped and checked on linux-x86_64. There are so many duplicates in gnu-user64.h and gnu-user32.h. Please move all of them to gnu-user.h. Thanks. -- H.J. Hi, Here is a new version with all gnu-user32.h and gnu-user64.h common definitions moved to gnu-user.h. Bootstrapped and checked on linux-x86_64. Thanks, Ilya --- 2012-04-04 Enkovich Ilya ilya.enkov...@intel.com * config/i386/linux.h: Renamed to ... * config/i386/linux32.h: ... this. * config/i386/gnu-user.h: Renamed to ... * config/i386/gnu-user32.h: ... this. (CPP_SPEC): Removed. (CC1_SPEC): Removed. (ENDFILE_SPEC): Removed. (DEFAULT_PCC_STRUCT_RETURN): Removed. (TARGET_TLS_DIRECT_SEG_REFS_DEFAULT): Removed. (TARGET_OS_CPP_BUILTINS): Removed. (LIBGCC2_HAS_TF_MODE): Removed. (LIBGCC2_TF_CEXT): Removed. (TF_SIZE): Removed. (TARGET_ASM_FILE_END): Removed. (STACK_CHECK_MOVING_SP): Removed. (STACK_CHECK_STATIC_BUILTIN): Removed. (TARGET_THREAD_SSP_OFFSET): Removed. (TARGET_CAN_SPLIT_STACK): Removed. (TARGET_THREAD_SPLIT_STACK_OFFSET): Removed. (GNU_USER_TARGET_LINK_SPEC): New. (LINK_SPEC): Use GNU_USER_TARGET_LINK_SPEC. * config.gcc: Rename i386/linux.h to i386/linux32.h and i386/gnu-user.h to i386/gnu-user32.h. * config/i386/linux.h: New. * config/i386/gnu-user.h: New. * config/i386/linux64.h: Include i386/linux.h. * config/i386/gnu-user64.h: Include i386/gnu-user.h (CPP_SPEC): Removed. (CC1_SPEC): Removed. (ENDFILE_SPEC): Removed. (DEFAULT_PCC_STRUCT_RETURN): Removed. (TARGET_TLS_DIRECT_SEG_REFS_DEFAULT): Removed. (TARGET_OS_CPP_BUILTINS): Removed. (LIBGCC2_HAS_TF_MODE): Removed. (LIBGCC2_TF_CEXT): Removed. (TF_SIZE): Removed. (TARGET_ASM_FILE_END): Removed. (STACK_CHECK_MOVING_SP): Removed. (STACK_CHECK_STATIC_BUILTIN): Removed. (TARGET_THREAD_SSP_OFFSET): Removed. (TARGET_CAN_SPLIT_STACK): Removed. (TARGET_THREAD_SPLIT_STACK_OFFSET): Removed. (GNU_USER_TARGET_LINK_SPEC): New. (LINK_SPEC): Use GNU_USER_TARGET_LINK_SPEC.
Re: [PATCH] Fix PR middle-end/52894
On Tue, Apr 10, 2012 at 2:28 PM, Dave Anglin dave.ang...@bell.net wrote: On 10-Apr-12, at 7:06 AM, Richard Guenther wrote: I can't immediately see how your description of the list of pending externals and the vector is deleted. pa.c keeps its own vector which references the decls and the only issue I see is that if you call assemble_external after processing externals you ICE because the pointer-set is not initialized? Why does the pa backend end up calling assemble_external at final time? The backend calls assemble_integer to output function descriptors at final time. This indirectly calls assemble_external from output_addr_const. This can be seen in comment #4 in the PR. This occurs after the pending externals are processed. We have to ensure that we only output function descriptors that are referenced, and we can only determine this at final. You mean function destcriptors for functions that are output? You already know this at cgraph clone materialization time. As a result, assemble_external is called after the VEC has been deleted. The compiler doesn't ICE. It goes into an infinite loop when a call to pointer_set_insert tries to extend the deleted VEC. It does this even if the pointer is already in the VEC hash. Isn't the bug then that the backend deletes the VEC too early? Previously, the storage wasn't deleted. I suspect the ineffective calls on hpux weren't noticed because assemble_external had already been called for the function descriptors earlier, so the list in the backend was complete. The problem was first seen with the Linux target. There is no backend list for this target because it doesn't use or need assemble_external. I personally think it was wrong to add the deferral in assemble_external, but that's another issue. I think the best thing would be to revert the offending patch on old branches (4.6 and 4.5) and see if the deferal can be fixed properly - though I didn't look at the pa issue in detail. Maybe Steven can do this. Richard. Dave.
Re: ICE on bootstrap of libstdc++ for mingw-targets in add_bb_to_loop
Hi, issue was that TARGET_EXCEPT_UNWIND_INFO wasn't set for i386 architectures. As mingw targets are using SjLj, it is necessary that this hook is present. ChangeLog 2012-04-10 Kai Tietz kti...@redhat.com PR c++/52918 * common/config/i386/i386-common.c (ix86_except_unwind_info): Add target-hook for supporting SjLj. Tested for x86_64-w64-mingw32, and for x86_64-unknown-linux-gnu. Ok for apply? Regards, Kai Index: common/config/i386/i386-common.c === --- common/config/i386/i386-common.c(revision 186264) +++ common/config/i386/i386-common.c(working copy) @@ -667,6 +667,26 @@ return ret; } +/* Implement TARGET_EXCEPT_UNWIND_INFO. */ + +static enum unwind_info_type +ix86_except_unwind_info (struct gcc_options *opts) +{ + /* Honor the --enable-sjlj-exceptions configure switch. */ +#ifdef CONFIG_SJLJ_EXCEPTIONS + if (CONFIG_SJLJ_EXCEPTIONS) +return UI_SJLJ; +#endif + + /* For simplicity elsewhere in this file, indicate that all unwind + info is disabled if we're not emitting unwind tables. */ + if (!opts-x_flag_exceptions !opts-x_flag_unwind_tables) +return UI_NONE; + + return UI_TARGET; +} + + #undef TARGET_DEFAULT_TARGET_FLAGS #define TARGET_DEFAULT_TARGET_FLAGS\ (TARGET_DEFAULT \ @@ -684,4 +704,7 @@ #undef TARGET_SUPPORTS_SPLIT_STACK #define TARGET_SUPPORTS_SPLIT_STACK ix86_supports_split_stack +#undef TARGET_EXCEPT_UNWIND_INFO +#define TARGET_EXCEPT_UNWIND_INFO ix86_except_unwind_info + struct gcc_targetm_common targetm_common = TARGETM_COMMON_INITIALIZER;
Re: [RFC ivopts] ARM - Make ivopts take into account whether pre and post increments are actually supported on targets.
On 28 March 2012 11:13, Richard Guenther richard.guent...@gmail.com wrote: On Wed, Mar 28, 2012 at 11:57 AM, Richard Guenther richard.guent...@gmail.com wrote: On Tue, Mar 27, 2012 at 3:17 PM, Ramana Radhakrishnan ramana.radhakrish...@linaro.org wrote: And the patch is now attached This does not look like it would compile on any other target. Looks like the macros are pre-existing in rtl.h. With that the ivopts change is ok. I'll let arm folks decide over the arm specific bits. Thanks for the approval on the ivopts stuff -which I haven't changed . I have revised the ARM backend specific changes to tweak costs to prevent auto-inc-dec from gratuitously adding more moves for the pre/post_modify_disp variety of instructions and to disable these for versions of the architecture which don't have support for LDRD. I would like another set of eyes on the backend specific changes - I am currently regression testing this final version on FSF trunk. 2012-04-10 Ramana Radhakrishnan ramana.radhakrish...@linaro.org * tree-ssa-loop-ivopts.c (add_autoinc_candidates, get_address_cost): Replace use of HAVE_{POST/PRE}_{INCREMENT/DECREMENT} with USE_{LOAD/STORE}_{PRE/POST}_{INCREMENT/DECREMENT} appropriately. * config/arm/arm.h (ARM_AUTOINC_VALID_FOR_MODE_P): New. (USE_LOAD_POST_INCREMENT): Define. (USE_LOAD_PRE_INCREMENT): Define. (USE_LOAD_POST_DECREMENT): Define. (USE_LOAD_PRE_DECREMENT): Define. (USE_STORE_PRE_DECREMENT): Define. (USE_STORE_PRE_INCREMENT): Define. (USE_STORE_POST_DECREMENT): Define. (USE_STORE_POST_INCREMENT): Define. (arm_auto_incmodes): Add enumeration. * config/arm/arm-protos.h (arm_autoinc_modes_ok_p): Declare. * config/arm/arm.c (arm_autoinc_modes_ok_p): Define. (arm_rtx_costs_1): Adjust costs for auto-inc modes and pre / post modify in floating point mode. (arm_size_rtx_costs): Likewise. regards, Ramana Richard. Richard. Ramana diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h index 900d09a..f9cb75a 100644 --- a/gcc/config/arm/arm-protos.h +++ b/gcc/config/arm/arm-protos.h @@ -248,4 +248,6 @@ extern int vfp3_const_double_for_fract_bits (rtx); extern void arm_expand_vec_perm (rtx target, rtx op0, rtx op1, rtx sel); extern bool arm_expand_vec_perm_const (rtx target, rtx op0, rtx op1, rtx sel); +extern bool arm_autoinc_modes_ok_p (enum machine_mode, enum arm_auto_incmodes); + #endif /* ! GCC_ARM_PROTOS_H */ diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 5522fc1..6bc5aa9 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -7121,6 +7121,19 @@ arm_rtx_costs_1 (rtx x, enum rtx_code outer, int* total, bool speed) /* Memory costs quite a lot for the first word, but subsequent words load at the equivalent of a single insn each. */ *total = COSTS_N_INSNS (2 + ARM_NUM_REGS (mode)); + + /* If we have hard float or there is no support for ldrd +and strd there is no point in allowing post_dec, +pre_inc and pre/post_modify_disp to have the same cost +for memory accesses in floating point modes. */ + if ((TARGET_HARD_FLOAT + || !TARGET_LDRD) + (FLOAT_MODE_P (mode) + (GET_CODE (XEXP (x, 0)) == POST_DEC + || GET_CODE (XEXP (x, 0)) == PRE_INC + || GET_CODE (XEXP (x, 0)) == PRE_MODIFY + || GET_CODE (XEXP (x, 0)) == POST_MODIFY))) + *total += COSTS_N_INSNS (2); return true; case DIV: @@ -7831,6 +7844,20 @@ arm_size_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code, *total = COSTS_N_INSNS (2); else *total = COSTS_N_INSNS (ARM_NUM_REGS (mode)); + + /* If we have hard float or there is no support for ldrd +and strd there is no point in allowing post_dec, +pre_inc and pre/post_modify_disp to have the same cost +for memory accesses in floating point modes. */ + if ((TARGET_HARD_FLOAT + || !TARGET_LDRD) + (FLOAT_MODE_P (mode) + (GET_CODE (XEXP (x, 0)) == POST_DEC + || GET_CODE (XEXP (x, 0)) == PRE_INC + || GET_CODE (XEXP (x, 0)) == PRE_MODIFY + || GET_CODE (XEXP (x, 0)) == POST_MODIFY))) + *total = COSTS_N_INSNS (2); + return true; case DIV: @@ -25680,5 +25707,51 @@ arm_vectorize_vec_perm_const_ok (enum machine_mode vmode, return ret; } - +bool +arm_autoinc_modes_ok_p (enum machine_mode mode, enum arm_auto_incmodes code) +{ + /* If we are soft float and we do not have ldrd + then all auto increment forms are ok. */ + if (TARGET_SOFT_FLOAT (TARGET_LDRD || GET_MODE_SIZE (mode) = 4)) +return true; + + switch (code) +{ + /* Post increment and Pre Decrement are supported for all +instruction forms except for vector forms. */ +case
Re: [PATCH][ARM] NEON DImode immediate constants
Ping. On 30/03/12 12:15, Andrew Stubbs wrote: On 28/02/12 16:20, Andrew Stubbs wrote: Hi all, This patch implements 64-bit immediate constant loads in NEON. The current state is that you can load const_vector, but not const_int. This is clearly not ideal. The result is a constant pool entry when it's not necessary. The patch disables the movdi_vfp patterns for loading DImode values, if the operand is const_int and NEON is enabled, and extends the neon_mov pattern to include DImode const_int, as well as the const_vector operands. I've modified neon_valid_immediate only enough to accept const_int input - the logic remains untouched. That patch failed to bootstrap successfully, but this updated patch bootstraps and tests with no regressions. OK? Andrew
Re: [PING] allowing fwprop to propagate subregs
Ulrich Weigand uweig...@de.ibm.com writes: http://gcc.gnu.org/ml/gcc-patches/2012-03/msg01269.html I don't like approving patches I had a hand in, but since Paolo was happy, and since no-one else seems to care... OK. Thanks for getting this working. Richard
Re: [SH] PR 50751 - add HImode displacement addressing support
Oleg Endo oleg.e...@t-online.de wrote: The attached patch adds HImode addressing support. Tested against rev. 186243 with sudo make -k check RUNTESTFLAGS=--target_board=sh-sim \{-m2/-ml,-m2/-mb,-m2a/-mb,-m2a-single/-mb,-m4/-ml,-m4/-mb, -m4-single/-ml,-m4-single/-mb,-m4a-single/-ml,-m4a-single/-mb} and no new failures. Test cases will follow soon. The patch is OK for trunk. BTW, do you have the numbers of CSiBE with this? Regards, kaz
rfa: vectorize strided loads [2/2] [PR 18437]
Hi, and this implements generally strided loads where the stride is a loop-invariant (constant or ssa-name). We only do so if the load can't be handled by interleaving groups. The implementation is fairly straight forward: for (i = 0; i n; i += stride) ... = array[i]; is transformed into: for (j = 0; ; j += VF*stride) tmp1 = array[j]; tmp2 = array[j + stride]; ... vectemp = {tmp1, tmp2, ...} (of course variously adjusted for component number) The nice thing is that by such separate loads we don't need to care for alignment (if the old access was aligned, so will be the new as the access size doesn't change). This is one more step in vectorizing the same loops as ICC. polyhedron has such a case in rnflow, where it helps quite much: Without patch: Benchmark Compile Executable Ave Run Number Estim Name(secs) (bytes)(secs) Repeats Err % - --- -- --- --- -- ac 3.71 3960337 10.06 2 0.0368 aermod 75.30 5594185 19.30 5 0.7102 air 10.30 483 6.59 2 0.0653 capacita 7.23 4185227 57.25 2 0.0968 channel 2.05 4658532 2.70 5 4.4701 doduc 14.55 4237850 23.31 2 0.0768 fatigue 6.64 4127554 7.48 5 0.3063 gas_dyn 13.25 4113557 2.99 5 4.5063 induct 7.23 4338356 8.85 5 0.9927 linpk 1.24 3927350 10.37 5 5.1174 mdbx 5.51 4053841 12.95 5 0.0956 nf 10.69 4062276 11.44 5 0.2727 protein 33.04 5011924 39.53 5 0.2328 rnflow 25.35 4238651 31.78 2 0.0673 test_fpu 12.24 4184939 9.00 5 0.1157 tfft 1.15 3976710 3.95 3 0.0736 Geometric Mean Execution Time = 11.21 seconds With patch: Benchmark Compile Executable Ave Run Number Estim Name(secs) (bytes)(secs) Repeats Err % - --- -- --- --- -- ac 3.91 3960337 10.34 5 0.3661 aermod 76.44 5598281 19.03 5 1.3394 air 11.52 483 6.75 5 0.9347 capacita 7.78 4189323 56.33 2 0.0976 channel 2.14 4658532 2.59 5 0.6640 doduc 14.61 4237850 23.41 5 0.2974 fatigue 6.62 4127554 7.44 5 0.1082 gas_dyn 13.14 4113557 2.82 5 0.5253 induct 7.26 4338356 8.49 2 0.0082 linpk 1.23 3927350 9.78 2 0.0705 mdbx 5.47 4053841 12.90 2 0.0601 nf 10.67 4062276 11.33 2 0.0004 protein 32.81 5011924 39.48 2 0.0893 rnflow 26.57 4246843 26.70 5 0.0915 test_fpu 13.29 4193131 8.82 5 0.2136 tfft 1.14 3976710 3.95 5 0.1753 Geometric Mean Execution Time = 10.95 seconds I.e. for rnflow from 31.78 to 26.70 seconds, nearly 20% better. Regstrapped together with [1/2] on x86_64-linux, all langs+Ada, no regressions. Okay for trunk? Ciao, Michael. PR tree-optimization/18437 * tree-vectorizer.h (_stmt_vec_info.stride_load_p): New member. (STMT_VINFO_STRIDE_LOAD_P): New accessor. (vect_check_strided_load): Declare. * tree-vect-data-refs.c (vect_check_strided_load): New function. (vect_analyze_data_refs): Use it to accept strided loads. * tree-vect-stmts.c (vectorizable_load): Ditto and handle them. testsuite/ * gfortran.dg/vect/rnflow-trs2a2.f90: New test. Index: tree-vectorizer.h === --- tree-vectorizer.h.orig 2012-04-10 13:19:18.0 +0200 +++ tree-vectorizer.h 2012-04-10 13:21:19.0 +0200 @@ -545,6 +545,7 @@ typedef struct _stmt_vec_info { /* For loads only, true if this is a gather load. */ bool gather_p; + bool stride_load_p; } *stmt_vec_info; /* Access Functions. */ @@ -559,6 +560,7 @@ typedef struct _stmt_vec_info { #define STMT_VINFO_VECTORIZABLE(S) (S)-vectorizable #define STMT_VINFO_DATA_REF(S) (S)-data_ref_info #define STMT_VINFO_GATHER_P(S)(S)-gather_p +#define STMT_VINFO_STRIDE_LOAD_P(S) (S)-stride_load_p #define STMT_VINFO_DR_BASE_ADDRESS(S) (S)-dr_base_address #define STMT_VINFO_DR_INIT(S) (S)-dr_init @@ -875,6 +877,7 @@ extern bool vect_analyze_data_ref_access extern bool vect_prune_runtime_alias_test_list (loop_vec_info); extern tree vect_check_gather (gimple, loop_vec_info, tree *, tree *,
Re: Symbol table 2/many: break out partitioning code out of lto.c
On 4/10/12 9:32 AM, Jan Hubicka wrote: Hi, LTO partitioning logic is one of places that are really symmetric accross most types of symbols; also the implementation is convoluted now by split in between lto.c that computes cgraph/varpool node sets specifying partitions and lto-cgraph.c that computes a boundaries (i.e. what functions/vars are referenced from given unit). There is a lot of logic duplication in between the promotion done in lto.c and boundary logic + there are subtle differences causing a lot of pain. The plan is to reorg the code to new API and unify it, so the partitioning code produce boundaries on the run and represents every of the partition by the encoder, not by sets. Excellent. * lto.c: Update copyright; remove params.h, ipa-inline.h and ipa-utils.h inlines; inline lto-partition.h (ltrans_partition_def, ltrans_partition, add_cgraph_node_to_partition, add_varpool_node_to_partition, new_partition, free_ltrans_partitions, add_references_to_partition, add_cgraph_node_to_partition_1, add_cgraph_node_to_partition, add_varpool_node_to_partition, undo_partition, partition_cgraph_node_p, partition_varpool_node_p, lto_1_to_1_map, node_cmp, varpool_node_cmp, lto_balanced_map, promote_var, promote_fn, lto_promote_cross_file_statics): move to... * lto-partition.c: ... here; new file. * lto-partition.h: New file. * Make-lang.in (lto.o): Update dependencies. (lto-partition.o): New. OK, thanks. Diego.
Re: Symbol table 1/many: symtab_nodes
On Tue, 10 Apr 2012, Jan Hubicka wrote: Hi, this is very basis of the symbol table work. The basic idea is: 1) give cgraph node and varpool node common base (symtab node) 2) Move common data there: that is type, declaration, visibility flags, force_output, ipa references and other stuff. 3) Introduce symtab.c for basic symbol table manipulation that in first run is represented as simple linked list of cgraph/varpool node entries. 4) Update ipa-reference API for the new class structure 5) Significantly clanup/rework old cgraph reachability and construction code that has grown up very convoluted to meet non-unit-at-a-time needs. Merge most of logic done on varpool and cgraph into common grounds 6) reorg WHOPR partitioning code 7) Add new types of symbol table entries - labels, aliases and possibly constant pools and symbols defined by asm statements so existing WHOPR bugs have chance to be solved. 8) Add actual symbol name hash - this is trickier than it sounds given that GCC currently don't really know the symbol names on non-LTO targets. I plan to simply take the existing lto-symtab logic and slowly require targtets to provide hook to translate ASSEMBLER_NAME into symbol name. It would be nice to get rid of ASSEMBLER_NAME and replace it by SYMBOL_NAME + set of flags, but this is very hard to do since assembler name goes everywhere into target machinery. I am not really volunteering to do that at this moment. 9) Reorg most of lto-symtab to work on symtab keeping the prevailing and other stuff there. I would like to get most of the changes into mainline for this stage1. In worst case at least 1-6. I prototytped all except for lto-symtab reorg, but I would like to re-do everything incrementally for mainline with cleanups on the way. Many of things can be significantly simplified over current implementation and I hope it will make it easier to follow the changes and get me some useful feedback. It is kind of second chance for cleaner cgraph API and it would be nice to get it mostly right. This patch takes the very first step by adding the base object. It implements inheritance in C via cgraph/varpool accessors. C++ would make sense here, but I think I still can't because of gengtype. Once C++ is not only allowed, but also useful, I don't think changing the syntactical shugar to actual classes would be a significant project. I agree. I hope we get to 8), as lto-symtab requires that ;) This is ok. Thanks for finally pushing this, Richard. Honza * cgraph.h: Remove misledaing comment on ipa-ref.h. (symtab_type): New enum. (symtab_node): New structure. (cgraph_node, varpool_node): Add symbol base type. (cgraph, varpool): New accestor functions. * cgraph.c (cgraph_create_node_1): Set symbol type. * varpool.c (varpool_node): Set symbol type. Index: cgraph.h === --- cgraph.h (revision 186252) +++ cgraph.h (working copy) @@ -27,7 +27,23 @@ along with GCC; see the file COPYING3. #include tree.h #include basic-block.h #include function.h -#include ipa-ref.h /* FIXME: inappropriate dependency of cgraph on IPA. */ +#include ipa-ref.h + +/* Symbol table consists of functions and variables. + TODO: add labels, constant pool and aliases. */ +enum symtab_type +{ + SYMTAB_FUNCTION, + SYMTAB_VARIABLE +}; + +/* Base of all entries in the symbol table. + The symtab_node is inherited by cgraph and varpol nodes. */ +struct GTY(()) symtab_node +{ + /* Type of the symbol. */ + enum symtab_type type; +}; enum availability { @@ -150,6 +166,7 @@ struct GTY(()) cgraph_clone_info Each function decl has assigned cgraph_node listing callees and callers. */ struct GTY((chain_next (%h.next), chain_prev (%h.previous))) cgraph_node { + struct symtab_node symbol; tree decl; struct cgraph_edge *callees; struct cgraph_edge *callers; @@ -387,6 +404,7 @@ DEF_VEC_ALLOC_P(cgraph_edge_p,heap); Each static variable decl has assigned varpool_node. */ struct GTY((chain_next (%h.next), chain_prev (%h.prev))) varpool_node { + struct symtab_node symbol; tree decl; /* For aliases points to declaration DECL is alias of. */ tree alias_of; @@ -688,6 +706,23 @@ void varpool_add_new_variable (tree); #define FOR_EACH_STATIC_VARIABLE(node) \ for ((node) = varpool_nodes_queue; (node); (node) = (node)-next_needed) +/* Return callgraph node for given symbol and check it is a function. */ +static inline struct cgraph_node * +cgraph (struct symtab_node *node) +{ + gcc_checking_assert (node-type == SYMTAB_FUNCTION); + return (struct cgraph_node *)node; +} + +/* Return varpool node for given symbol and check it is a variable. */ +static inline struct varpool_node * +varpool (struct symtab_node *node) +{ +
Re: [PATCH] Fix PR middle-end/52894
On Tue, Apr 10, 2012 at 3:36 PM, Steven Bosscher stevenb@gmail.com wrote: On Tue, Apr 10, 2012 at 2:44 PM, Richard Guenther richard.guent...@gmail.com wrote: On Tue, Apr 10, 2012 at 2:28 PM, Dave Anglin dave.ang...@bell.net wrote: On 10-Apr-12, at 7:06 AM, Richard Guenther wrote: I can't immediately see how your description of the list of pending externals and the vector is deleted. pa.c keeps its own vector which references the decls and the only issue I see is that if you call assemble_external after processing externals you ICE because the pointer-set is not initialized? Why does the pa backend end up calling assemble_external at final time? The backend calls assemble_integer to output function descriptors at final time. This indirectly calls assemble_external from output_addr_const. This can be seen in comment #4 in the PR. This occurs after the pending externals are processed. We have to ensure that we only output function descriptors that are referenced, and we can only determine this at final. You mean function destcriptors for functions that are output? You already know this at cgraph clone materialization time. As a result, assemble_external is called after the VEC has been deleted. The compiler doesn't ICE. It goes into an infinite loop when a call to pointer_set_insert tries to extend the deleted VEC. It does this even if the pointer is already in the VEC hash. Isn't the bug then that the backend deletes the VEC too early? Previously, the storage wasn't deleted. I suspect the ineffective calls on hpux weren't noticed because assemble_external had already been called for the function descriptors earlier, so the list in the backend was complete. The problem was first seen with the Linux target. There is no backend list for this target because it doesn't use or need assemble_external. I personally think it was wrong to add the deferral in assemble_external, but that's another issue. I think the best thing would be to revert the offending patch on old branches (4.6 and 4.5) and see if the deferral can be fixed properly - though I didn't look at the pa issue in detail. Maybe Steven can do this. It seems to me that the patch Dave posted is fine. It basically reverts to the old situation (i.e. before the pending_assemble_external stuff was added) for all back ends. Well, ok then. Thanks, Richard. Ciao! Steven
Re: [SH] PR 50751 - add HImode displacement addressing support
On Tue, 2012-04-10 at 22:42 +0900, Kaz Kojima wrote: Oleg Endo oleg.e...@t-online.de wrote: The attached patch adds HImode addressing support. Tested against rev. 186243 with sudo make -k check RUNTESTFLAGS=--target_board=sh-sim \{-m2/-ml,-m2/-mb,-m2a/-mb,-m2a-single/-mb,-m4/-ml,-m4/-mb, -m4-single/-ml,-m4-single/-mb,-m4a-single/-ml,-m4a-single/-mb} and no new failures. Test cases will follow soon. The patch is OK for trunk. BTW, do you have the numbers of CSiBE with this? Only for -m4-single -ml -O2 -mpretend-cmove so far. Not so spectacular :T I'll also do a comparison of more variants to see if something went really bad. It's a bit difficult to isolate the degradations because there's quite some code reordering happening after the patch... Cheers, Oleg OpenTCP-1.0.4 arp 2089 - 2061 -28 / -1.340354 % bootp/bootp 740 - 704 -36 / -4.864865 % demo/main_demo 372 - 368 -4 / -1.075269 % demo/tcp_client_demo 396 - 396 demo/tcp_server_demo 468 - 468 demo/udp_demo268 - 268 dhcp/dhcpc 1588 - 1588 dns/dns 1340 - 1328 -12 / -0.895522 % ethernet1240 - 1228 -12 / -0.967742 % http/http_server1784 - 1764 -20 / -1.121076 % http/https_callbacks 656 - 656 icmp 348 - 352 +4 / +1.149425 % ip 1780 - 1668-112 / -6.292135 % pop3/pop3_client3796 - 3728 -68 / -1.791359 % pop3/pop3c_callbacks 28 - 28 smtp/smtp_client2284 - 2212 -72 / -3.152364 % smtp/smtpc_callbacks 32 - 32 system 980 - 980 tcp 4966 - 4970 +4 / +0.080548 % tftp/tftps 1024 - 920 -104 / -10.156250 % timers 228 - 228 udp 1362 - 1322 -40 / -2.936858 % total: 27769 - 27269 -500 / -1.800569 % bzip2-1.0.2 blocksort 7262 - 7258 -4 / -0.055081 % bzip2 17288 - 17248-40 / -0.231374 % bzip2recover3868 - 3868 bzlib 10668 - 10636-32 / -0.299963 % compress 14208 - 14192-16 / -0.112613 % crctable1024 - 1024 decompress 8380 - 8380 huffman 1436 - 1436 randtable 2048 - 2048 total: 66182 - 66090 -92 / -0.139011 % cg_compiler_opensrc atom3728 - 3728 binding 1752 - 1728 -24 / -1.369863 % cgcmain 2388 - 2388 cgstruct 128 - 128 check 3300 - 3300 compile14052 - 14052 constfold 3996 - 3996 cpp 6776 - 6768 -8 / -0.118064 % generic_hal 2516 - 2516 hal 2720 - 2720 inline 2724 - 2724 memory 548 - 548 parser 15796 - 15796 printutils 11888 - 11888 scanner 6716 - 6672 -44 / -0.655152 % semantic4364 - 4364 stdlib 3224 - 3224 support24328 - 24332 +4 / +0.016442 % support_iter1832 - 1832 symbols 6324 - 6312 -12 / -0.189753 % tokenize 624 - 624 tokens 3112 - 3120 +8 / +0.257069 % total:122836 - 122760 -76 / -0.061871 % compiler cg 2868 - 2868 main 2600 - 2600 parser 4524 - 4528 +4 / +0.088417 % scanner3788 - 3788 vam5264 - 5264 vas6292 - 6300 +8 / +0.127146 % total: 25336 - 25348 +12 / +0.047363 % flex-2.5.31 buf 1100 - 1100 ccl 1220 - 1220 dfa 6952 - 6952 ecs 612 - 612 filter 2360 - 2360 gen 24512 - 24488-24 / -0.097911 % libmain28 - 28 libyywrap 4 - 4 main19804 - 19792-12 / -0.060594 % misc 6480 - 6480 nfa 3812 - 3812 options 2768 - 2768 parse9524 - 9528 +4 / +0.041999 % regex 596 - 596 scan47012 - 47004 -8 / -0.017017 % scanopt 3672 - 3672 skel96652 - 96652 sym 1012 - 1012 tables 2080 - 2064 -16 / -0.769231 % tables_shared 28 - 28 tblcmp 3764 - 3764 yylex1292 - 1292 total:235284 - 235228 -56 / -0.023801 % jikespg-1.3 src/ctabs 49912 - 49876-36 / -0.072127 % src/globals 288 - 288 src/lpgparse 47340 - 47036 -304 / -0.642163 % src/lpgutil 5152 - 5120 -32 / -0.621118 % src/main5440 - 5440 src/mkfirst
Re: [i386, patch, RFC] HLE support in GCC
Yeah. And you don't need to change the FEs in any way, all that is needed is to change the middle-end/expansion (builtins.c - e.g. get_memmodel) and the backend (plus predefine the macros in the backend). Jakub Hi Jakub, Attached patch implements HLE support for __atomic_compare_exchange_n. So, to emit HLE prefix, it is possible to do: int foo2 (int *p, int oldv, int newv) { __atomic_compare_exchange_n (p, oldv, newv, 0, __ATOMIC_ACQUIRE | __ATOMIC_USE_HLE, __ATOMIC_ACQUIRE); return oldv; } Which will generate: ... lock xacquire cmpxchgl %esi, (%rcx) ... Comments? PS: No tests and TARGET_HLE defined yet. Thanks, K hle-rfc-2.gcc.patch Description: Binary data
Re: [SH] PR 50751 - add HImode displacement addressing support
- Original Message - BTW, do you have the numbers of CSiBE with this? Only for -m4-single -ml -O2 -mpretend-cmove so far. Not so spectacular :T I'll also do a comparison of more variants to see if something went really bad. It's a bit difficult to isolate the degradations because there's quite some code reordering happening after the patch... Are you looking at execution time or code size? If the latter, you could try disabling scheduling for comparison purposes (-fno-schedule-insns -fno-schedule-insns2). Even if you're looking at execution time, disabling scheduling might make it easier to see where things are going wrong in the patched code. -Nathan
Re: RFC reminder: an alternative -fsched-pressure implementation
The condition I orignally set myself was that this patch should only go in if it becomes the default on at least one architecture, specifically ARM. Ulrich tells me that Linaro have now made it the default for ARM in their GCC 4.7 release, so hopefully Ramana would be OK with doing the same in upstream 4.8. Before I approve that I'd like to hopefully see some numbers across the M-class cores as well so that we make sure that side of the architecture isn't missed out (I don't expect it to make a huge difference there but it's still worth checking) - I'm quite happy for it to be default on A8 and A9 based on the evidence I've seen so far. regards, Ramana
Re: [SH] PR 50751 - add HImode displacement addressing support
On Tue, 2012-04-10 at 07:14 -0700, Nathan Froyd wrote: - Original Message - BTW, do you have the numbers of CSiBE with this? Only for -m4-single -ml -O2 -mpretend-cmove so far. Not so spectacular :T I'll also do a comparison of more variants to see if something went really bad. It's a bit difficult to isolate the degradations because there's quite some code reordering happening after the patch... Are you looking at execution time or code size? If the latter, you could try disabling scheduling for comparison purposes (-fno-schedule-insns -fno-schedule-insns2). Even if you're looking at execution time, disabling scheduling might make it easier to see where things are going wrong in the patched code. Code size only. Thanks for the hint, will try it out. Cheers, Oleg
Re: [i386, patch, RFC] HLE support in GCC
On Tue, Apr 10, 2012 at 06:12:08PM +0400, Kirill Yukhin wrote: Attached patch implements HLE support for __atomic_compare_exchange_n. The target hook is definitely not appropriate, just define it in ix86_target_macros in i386-c.c instead or so. Jakub
Re: rfa: vectorize strided loads [2/2] [PR 18437]
On Tue, Apr 10, 2012 at 3:46 PM, Michael Matz m...@suse.de wrote: Hi, and this implements generally strided loads where the stride is a loop-invariant (constant or ssa-name). We only do so if the load can't be handled by interleaving groups. The implementation is fairly straight forward: for (i = 0; i n; i += stride) ... = array[i]; is transformed into: for (j = 0; ; j += VF*stride) tmp1 = array[j]; tmp2 = array[j + stride]; ... vectemp = {tmp1, tmp2, ...} (of course variously adjusted for component number) The nice thing is that by such separate loads we don't need to care for alignment (if the old access was aligned, so will be the new as the access size doesn't change). This is one more step in vectorizing the same loops as ICC. polyhedron has such a case in rnflow, where it helps quite much: Without patch: Benchmark Compile Executable Ave Run Number Estim Name (secs) (bytes) (secs) Repeats Err % - --- -- --- --- -- ac 3.71 3960337 10.06 2 0.0368 aermod 75.30 5594185 19.30 5 0.7102 air 10.30 483 6.59 2 0.0653 capacita 7.23 4185227 57.25 2 0.0968 channel 2.05 4658532 2.70 5 4.4701 doduc 14.55 4237850 23.31 2 0.0768 fatigue 6.64 4127554 7.48 5 0.3063 gas_dyn 13.25 4113557 2.99 5 4.5063 induct 7.23 4338356 8.85 5 0.9927 linpk 1.24 3927350 10.37 5 5.1174 mdbx 5.51 4053841 12.95 5 0.0956 nf 10.69 4062276 11.44 5 0.2727 protein 33.04 5011924 39.53 5 0.2328 rnflow 25.35 4238651 31.78 2 0.0673 test_fpu 12.24 4184939 9.00 5 0.1157 tfft 1.15 3976710 3.95 3 0.0736 Geometric Mean Execution Time = 11.21 seconds With patch: Benchmark Compile Executable Ave Run Number Estim Name (secs) (bytes) (secs) Repeats Err % - --- -- --- --- -- ac 3.91 3960337 10.34 5 0.3661 aermod 76.44 5598281 19.03 5 1.3394 air 11.52 483 6.75 5 0.9347 capacita 7.78 4189323 56.33 2 0.0976 channel 2.14 4658532 2.59 5 0.6640 doduc 14.61 4237850 23.41 5 0.2974 fatigue 6.62 4127554 7.44 5 0.1082 gas_dyn 13.14 4113557 2.82 5 0.5253 induct 7.26 4338356 8.49 2 0.0082 linpk 1.23 3927350 9.78 2 0.0705 mdbx 5.47 4053841 12.90 2 0.0601 nf 10.67 4062276 11.33 2 0.0004 protein 32.81 5011924 39.48 2 0.0893 rnflow 26.57 4246843 26.70 5 0.0915 test_fpu 13.29 4193131 8.82 5 0.2136 tfft 1.14 3976710 3.95 5 0.1753 Geometric Mean Execution Time = 10.95 seconds I.e. for rnflow from 31.78 to 26.70 seconds, nearly 20% better. Regstrapped together with [1/2] on x86_64-linux, all langs+Ada, no regressions. Okay for trunk? Ok with ... Ciao, Michael. PR tree-optimization/18437 * tree-vectorizer.h (_stmt_vec_info.stride_load_p): New member. (STMT_VINFO_STRIDE_LOAD_P): New accessor. (vect_check_strided_load): Declare. * tree-vect-data-refs.c (vect_check_strided_load): New function. (vect_analyze_data_refs): Use it to accept strided loads. * tree-vect-stmts.c (vectorizable_load): Ditto and handle them. testsuite/ * gfortran.dg/vect/rnflow-trs2a2.f90: New test. Index: tree-vectorizer.h === --- tree-vectorizer.h.orig 2012-04-10 13:19:18.0 +0200 +++ tree-vectorizer.h 2012-04-10 13:21:19.0 +0200 @@ -545,6 +545,7 @@ typedef struct _stmt_vec_info { /* For loads only, true if this is a gather load. */ bool gather_p; + bool stride_load_p; } *stmt_vec_info; /* Access Functions. */ @@ -559,6 +560,7 @@ typedef struct _stmt_vec_info { #define STMT_VINFO_VECTORIZABLE(S) (S)-vectorizable #define STMT_VINFO_DATA_REF(S) (S)-data_ref_info #define STMT_VINFO_GATHER_P(S) (S)-gather_p +#define STMT_VINFO_STRIDE_LOAD_P(S) (S)-stride_load_p #define STMT_VINFO_DR_BASE_ADDRESS(S) (S)-dr_base_address #define STMT_VINFO_DR_INIT(S) (S)-dr_init @@ -875,6 +877,7 @@ extern bool vect_analyze_data_ref_access extern bool vect_prune_runtime_alias_test_list
Re: [PATCH] ARM: Use different linker path for hardfloat ABI
On Tue, 10 Apr 2012 12:18:51 +0300 Konstantinos Margaritis konstantinos.margari...@linaro.org wrote: On Tue, 10 Apr 2012 07:36:07 +0200 Jakub Jelinek ja...@redhat.com wrote: We really want consistency about the dynamic linker names etc. across different targets and sneaking silently multiarch paths on one architecture would make it inconsistent with all the others. So, please just use /libhf/ld-linux.so.3. I personally find /libhf extremely ugly. If having a second path is a problem, how about using the triplet in the filename? Like: /lib/ld-linux-arm-linux-gnueabihf.so.3 ? Unique per arch and not multiarched. And AFAIK, Linux can handle long filenames just fine... every distro uses a unique triplet, by putting the triplet in there you then need to get all distros to change to using the same triplets. I personally prefer /libhfp rather than /libhf but I am ok with using either. Any change from /lib would need us to do a mass rebuild. because while not 100% needed I would rather keep libraries with the linker. the changes to rpm to support it would be somewhat minimal. we have stated in Fedora though that we have no intention to support mixing hfp and sfp on the same system. we really do need to ensure consensus for arm64 which I think should be /lib64 for 64 bit arch consistency. Dennis
Re: [Patch] tree-parloops.c (eliminate_local_variables): Add braces to suppress warnings
On Tue, Apr 10, 2012 at 3:06 PM, JonY jo...@users.sourceforge.net wrote: On 4/10/2012 20:37, Richard Guenther wrote: On Tue, Apr 10, 2012 at 2:15 PM, JonY wrote: Hi, Patch OK? What kind of warning? Oops, I forgot to mention gcc was complaining about missing braces. Not for me. And I fail to see why it should. Richard. ChangeLog: tree-parloops.c (eliminate_local_variables): Add braces to suppress warnings. Index: tree-parloops.c === --- tree-parloops.c (revision 186243) +++ tree-parloops.c (working copy) @@ -723,13 +723,15 @@ FOR_EACH_VEC_ELT (basic_block, body, i, bb) if (bb != entry_bb bb != exit_bb) for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (gsi)) - if (is_gimple_debug (gsi_stmt (gsi))) - { - if (gimple_debug_bind_p (gsi_stmt (gsi))) - has_debug_stmt = true; - } - else - eliminate_local_variables_stmt (entry, gsi, decl_address); + { + if (is_gimple_debug (gsi_stmt (gsi))) + { + if (gimple_debug_bind_p (gsi_stmt (gsi))) + has_debug_stmt = true; + } + else + eliminate_local_variables_stmt (entry, gsi, decl_address); + } if (has_debug_stmt) FOR_EACH_VEC_ELT (basic_block, body, i, bb)
Re: [C11-atomic] [patch] gimple atomic statements
On Tue, Apr 10, 2012 at 3:14 PM, Richard Sandiford rdsandif...@googlemail.com wrote: Richard Guenther richard.guent...@gmail.com writes: On Fri, Apr 6, 2012 at 10:13 AM, Richard Sandiford rdsandif...@googlemail.com wrote: Richard Guenther richard.guent...@gmail.com writes: They can affect shared memory in some ways like a call, but don't have many of the other attributes of call. They are really more like an assignment or other operation with arbitrary shared memory side effects. I do hope to be able to teach the optimizers the directionality of the memory model restrictions. ie, ACQUIRE is only a barrier to hoisting shared memory code... stores can be moved downward past this mode. RELEASE is only a barrier to sinking code. RELAXED is no barrier at all to code motion. In fact, a relaxed store is barely different than a real store... but there is a slight difference so we can't make it a normal store :-P. By teaching the other parts of the compiler about a GIMPLE_ ATOMIC, we could hopefully lessen their impact eventually. Ok. I suppose having a GIMPLE_ATOMIC is fine then. Just for my own education really, but: does this mean that there'd be unnecessary pessimisation in representing the thing as a call? No, there are not. They are more pessimized than GIMPLE_ASSIGNs (unless you handle them specially in a few places). But the same is true for GIMPLE_ATOMIC. The question is one of convenience as far as I understand. In general I would like to avoid new GIMPLE codes, especially random ones. You can do everything with builtins or internal functions just fine. The interleaved load/store internal fns are really assignments too, so if calls aren't right for that kind of operation, maybe we need to replace the internal fns with something else. Or at least come up with some new call properties. What missed optimizations do you see? None. :-) But... Which is a roundabout way of wondering what the main difficulties would be in attaching things like directionality to a call. Directionality? [See above.] ...I was asking in the context quoted above, which seemed to be the bit that convinced you GIMPLE_ATOMIC would be OK after all. And the two main reasons in the bit quoted above seemed to be that GIMPLE_ATOMIC was more like GIMPLE_ASSIGN (which is true of the current internal fns too, and was why I was interested) and that we wanted to add the directionality of the memory model (which seemed at face value like something that could be attached to a call). Not arguing for anything here, just an onlooker wanting to understand. :-) (BTW, it sounds like restricting memory accesses to GIMPLE_ASSIGN might cause trouble for the interleave load/store stuff too.) Well. In the end my plan was to have a GIMPLE_LOAD and GIMPLE_STORE stmt, where the load would load to an SSA name and the store would store from a constant or an SSA name. Advantages would be simplified data-flow analysis and things like aggregate copy propagation and value-numbering for free. It would also be easier to attach special data / properties to the now single load / store sites (where in calls you can have an arbitrary number of loads at least). Whatever special property the interleaved load/store has would be stored there, too. The idea was to be able to fold most of the implicit information about loads/stores that are in the MEM_REF /COMPONENT_REF / ARRAY_REF trees into proper gimple level information, like having an array of index, stride tuples for (multidimensional) array accesses. Think of it like a place where we can properly embed the MEM_ATTRs we have on the RTL side for example. That's much easier if there were loads/stores only in very defined places. Ah, OK, so there'd still be a single gimple stmt (GIMPLE_LOAD or GIMPLE_STORE), and that stmt would do the interleaving too? Well, whatever we'd like to add (part of the atomic stuff would fit here, too, just not the operation part like increment). Richard. Sounds good. I was worried at first that we'd have two separate stmts (e.g. a load and an interleave) which was what the internal fns were supposed to avoid. Thanks, Richard
Re: PATCH: Define TRY_EMPTY_VM_SPACE for Linux/x32
On Thu, Apr 5, 2012 at 8:57 AM, H.J. Lu hjl.to...@gmail.com wrote: On Thu, Apr 5, 2012 at 8:36 AM, Uros Bizjak ubiz...@gmail.com wrote: On Thu, Apr 5, 2012 at 3:28 PM, H.J. Lu hjl.to...@gmail.com wrote: Looking at how other targets implement this check, I don't think that this is a problem at all. This issue only shows on a non-bootstrapped build. A full bootstrap will use correct address. The other place where it shows up is cross compilers but who is going to use a 3.2 compiler with GCC 4.8 anyways? FWIW, I have no problem with checking __LP64__ only. I just want to mention this potential issue. Here is a different patch. It checks sizeof (void *) for those hosts where it is appropriate. Does it look OK? Looks OK to me, but you need a GWP reviewer to OK this change now... Here is the patch with ChangeLog. Tested on Linux/x32. Jakub, is this OK for trunk? Hi Richard, Is this patch: http://gcc.gnu.org/ml/gcc-patches/2012-04/msg00303.html OK for trunk? Thanks. -- H.J.
Re: [i386, patch, RFC] HLE support in GCC
On Tue, Apr 10, 2012 at 7:12 AM, Kirill Yukhin kirill.yuk...@gmail.com wrote: Yeah. And you don't need to change the FEs in any way, all that is needed is to change the middle-end/expansion (builtins.c - e.g. get_memmodel) and the backend (plus predefine the macros in the backend). Jakub Hi Jakub, Attached patch implements HLE support for __atomic_compare_exchange_n. So, to emit HLE prefix, it is possible to do: int foo2 (int *p, int oldv, int newv) { __atomic_compare_exchange_n (p, oldv, newv, 0, __ATOMIC_ACQUIRE | __ATOMIC_USE_HLE, __ATOMIC_ACQUIRE); return oldv; } This is wrong since HLE ACQUIRE/RELEASE has nothing to do with C++ atomic acquire/release. You can have HLE RELEASE with C++ atomic acquire. -- H.J.
Re: [i386, patch, RFC] HLE support in GCC
On Tue, Apr 10, 2012 at 07:42:53AM -0700, H.J. Lu wrote: Attached patch implements HLE support for __atomic_compare_exchange_n. So, to emit HLE prefix, it is possible to do: int foo2 (int *p, int oldv, int newv) { __atomic_compare_exchange_n (p, oldv, newv, 0, __ATOMIC_ACQUIRE | __ATOMIC_USE_HLE, __ATOMIC_ACQUIRE); return oldv; } This is wrong since HLE ACQUIRE/RELEASE has nothing to do with C++ atomic acquire/release. You can have HLE RELEASE with C++ atomic acquire. Yes, of course, there should be two bits for HLE rather than one. Jakub
Re: [RFC] Unconditionally clean up CFG before emitting prologue
On Mon, Apr 9, 2012 at 6:11 PM, Eric Botcazou ebotca...@adacore.com wrote: Isn't the gimple cfg-cleanup we run post optimization (right before expansion) not enough? Or the cfg-cleanup we perform right after expansion now? At least if the branches are really caused by the gimplification process I would expect things to be cleaned up at this point, no? At least the patch has a non-null (albeit minor) effect, which pertains mainly to jump threading (typical diff attached, Ada code at -O0 on x86-64). So it's trivial forwarders that are removed. I wonder why cfgcleanup on the tree level does not do this, or why it does that on the RTL level. Are the blocks becoming empty only during/after register allocation? Richard. -- Eric Botcazou
Re: RFC: PR rtl-optimization/52876: Sign extend 32 to 64bit then clear upper 32bits fails O1 or higher
On Thu, Apr 5, 2012 at 11:48 AM, H.J. Lu hongjiu...@intel.com wrote: Hi, CSE turns (reg:DI 64) (insn 6 3 7 2 (set (reg:DI 64) (sign_extend:DI (subreg/u:SI (reg/v/f:DI 63 [ addr ]) 0))) x.i:6 122 {*extendsidi2_rex64} (nil)) into (reg/f:DI 64 [ addr ]). But nonzero_bits1 in rtlanal.c has #if defined(POINTERS_EXTEND_UNSIGNED) !defined(HAVE_ptr_extend) /* If pointers extend unsigned and this is a pointer in Pmode, say that all the bits above ptr_mode are known to be zero. */ /* As we do not know which address space the pointer is refering to, we can do this only if the target does not support different pointer or address modes depending on the address space. */ if (target_default_pointer_address_modes_p () POINTERS_EXTEND_UNSIGNED GET_MODE (x) == Pmode REG_POINTER (x)) nonzero = GET_MODE_MASK (ptr_mode); #endif Setting REG_POINTER with incompatible pointer sign extension is wrong. This patch adds a bool parameter to set_reg_attrs_from_value to indicate if a register can be marked as a pointer. Does it look OK? Thanks. H.J. --- gcc/ 2012-04-05 H.J. Lu hongjiu...@intel.com PR rtl-optimization/52876 * emit-rtl.c (set_reg_attrs_from_value): Add a bool parameter to indicate if a register can be a pointer. (gen_reg_rtx_and_attrs): Pass true to set_reg_attrs_from_value. (set_reg_attrs_for_parm): Likewise. * emit-rtl.h (set_reg_attrs_from_value): Add a bool parameter. * reginfo.c (reg_scan_mark_refs): Pass false to set_reg_attrs_from_value with incompatible pointer sign extension. gcc/testsuite 2012-04-05 H.J. Lu hongjiu...@intel.com PR rtl-optimization/52876 * gcc.target/i386/pr52876.c: New. diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c index 8d7d441..791ff5c 100644 --- a/gcc/emit-rtl.c +++ b/gcc/emit-rtl.c @@ -967,7 +967,7 @@ adjust_reg_mode (rtx reg, enum machine_mode mode) have different modes, REG is a (possibly paradoxical) lowpart of X. */ void -set_reg_attrs_from_value (rtx reg, rtx x) +set_reg_attrs_from_value (rtx reg, rtx x, bool can_be_reg_pointer) { int offset; @@ -983,14 +983,14 @@ set_reg_attrs_from_value (rtx reg, rtx x) if (MEM_OFFSET_KNOWN_P (x)) REG_ATTRS (reg) = get_reg_attrs (MEM_EXPR (x), MEM_OFFSET (x) + offset); - if (MEM_POINTER (x)) + if (can_be_reg_pointer MEM_POINTER (x)) mark_reg_pointer (reg, 0); } else if (REG_P (x)) { if (REG_ATTRS (x)) update_reg_offset (reg, x, offset); - if (REG_POINTER (x)) + if (can_be_reg_pointer REG_POINTER (x)) mark_reg_pointer (reg, REGNO_POINTER_ALIGN (REGNO (x))); } } @@ -1002,7 +1002,7 @@ rtx gen_reg_rtx_and_attrs (rtx x) { rtx reg = gen_reg_rtx (GET_MODE (x)); - set_reg_attrs_from_value (reg, x); + set_reg_attrs_from_value (reg, x, true); return reg; } @@ -1013,7 +1013,7 @@ void set_reg_attrs_for_parm (rtx parm_rtx, rtx mem) { if (REG_P (parm_rtx)) - set_reg_attrs_from_value (parm_rtx, mem); + set_reg_attrs_from_value (parm_rtx, mem, true); else if (GET_CODE (parm_rtx) == PARALLEL) { /* Check for a NULL entry in the first slot, used to indicate that the diff --git a/gcc/emit-rtl.h b/gcc/emit-rtl.h index bc91193..2f259ef 100644 --- a/gcc/emit-rtl.h +++ b/gcc/emit-rtl.h @@ -63,7 +63,7 @@ extern rtx copy_insn_1 (rtx); extern rtx copy_insn (rtx); extern rtx gen_int_mode (HOST_WIDE_INT, enum machine_mode); extern rtx emit_copy_of_insn_after (rtx, rtx); -extern void set_reg_attrs_from_value (rtx, rtx); +extern void set_reg_attrs_from_value (rtx, rtx, bool); extern void set_reg_attrs_for_parm (rtx, rtx); extern void set_reg_attrs_for_decl_rtl (tree t, rtx x); extern void adjust_reg_mode (rtx, enum machine_mode); diff --git a/gcc/reginfo.c b/gcc/reginfo.c index 6353126..585d7a8 100644 --- a/gcc/reginfo.c +++ b/gcc/reginfo.c @@ -1224,14 +1224,24 @@ reg_scan_mark_refs (rtx x, rtx insn) if (REG_P (dest) !REG_ATTRS (dest)) { rtx src = SET_SRC (x); + bool can_be_reg_pointer = true; while (GET_CODE (src) == SIGN_EXTEND || GET_CODE (src) == ZERO_EXTEND || GET_CODE (src) == TRUNCATE || (GET_CODE (src) == SUBREG subreg_lowpart_p (src))) - src = XEXP (src, 0); + { +#if defined(POINTERS_EXTEND_UNSIGNED) !defined(HAVE_ptr_extend) + if ((GET_CODE (src) == SIGN_EXTEND + POINTERS_EXTEND_UNSIGNED) + || (GET_CODE (src) != SIGN_EXTEND + ! POINTERS_EXTEND_UNSIGNED)) + can_be_reg_pointer = false; +#endif + src = XEXP (src, 0); + } - set_reg_attrs_from_value (dest, src); + set_reg_attrs_from_value (dest,
[PATCH] Fix PR52912, more loop preserving and jump-threading fallout
This properly tells the CFG manipulation of the case where we thread through a loop exit - the duplicated header block will end up in the outer loop and the original loop structure will be preserved. Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. Richard. 2012-04-10 Richard Guenther rguent...@suse.de PR tree-optimization/52912 * tree-ssa-threadupdate.c (thread_block): Tell the cfg manipulation code we are threading through a loop header to an exit destination. * gcc.dg/torture/pr52912.c: New testcase. Index: gcc/tree-ssa-threadupdate.c === *** gcc/tree-ssa-threadupdate.c (revision 186274) --- gcc/tree-ssa-threadupdate.c (working copy) *** thread_block (basic_block bb, bool noloo *** 661,666 --- 661,673 /* We do not update dominance info. */ free_dominance_info (CDI_DOMINATORS); + /* We know we only thread through the loop header to loop exits. + Let the basic block duplication hook know we are not creating + a multiple entry loop. */ + if (noloop_only +bb == bb-loop_father-header) + set_loop_copy (bb-loop_father, loop_outer (bb-loop_father)); + /* Now create duplicates of BB. Note that for a block with a high outgoing degree we can waste *** thread_block (basic_block bb, bool noloo *** 692,697 --- 699,708 htab_delete (redirection_data); redirection_data = NULL; + if (noloop_only +bb == bb-loop_father-header) + set_loop_copy (bb-loop_father, NULL); + /* Indicate to our caller whether or not any jumps were threaded. */ return local_info.jumps_threaded; } Index: gcc/testsuite/gcc.dg/torture/pr52912.c === *** gcc/testsuite/gcc.dg/torture/pr52912.c (revision 0) --- gcc/testsuite/gcc.dg/torture/pr52912.c (revision 0) *** *** 0 --- 1,18 + /* { dg-do compile } */ + + int a, b, c; + static int + fn1 (p1) + { + lbl_549: + if (p1) + goto lbl_549; + return 0; + } + + void + fn2 () + { + b = (c a) fn1 (c) = c; + } +
Patches to enable -ftrack-macro-expansion by default
Hello, I am proposing a series of patches which is supposed to address the remaining issues (I am aware of) preventing us from enabling the -ftrack-macro-expansion by default. The idea is to address each issue I notice in the course of trying to bootstrap the compiler and running the tests with -ftrack-macro-expansion enabled. Beside the fixes, I ended up disabling the -ftrack-macro-expansion for many test cases (sometimes globally in the dg-*.exp files, or on a case by case basis), because that option changes the compiler output and so requires that I either adapt the test case or disable the option. For other tests, I chose to adapt the test case. You will find the patch series as a follow-up of this message. -- Dodji
Guard use of modulo in cshift (speedup protein)
Hi, this patch speeds up polyhedrons protein on Bulldozer quite a bit. The things is that in this testcase cshift is called with a very short length (=3) and that the shift amount always is less than the length. Surprisingly the division instruction takes up considerable amount of time, so much that it makes sense to guard it, when the shift is in bound. Here's some oprofile of _gfortrani_cshift0_i4 (total 31020 cycles): 23 0.0032 : caf00: idiv %r13 13863 1.9055 : caf03: lea(%rdx,%r13,1),%r12 I.e. despite the memory shuffling one third of the cshift cycles are that division. With the patch the time for protein drops from 0m21.367s to 0m20.547s on this Bulldozer machine. I've checked that it has no adverse effect on older AMD or Intel cores (0:44.30elapsed vs 0:44.00elapsed, still an improvement). Regstrapped on x86_64-linux. Okay for trunk? Ciao, Michael. * m4/cshift0.m4 (cshift0_'rtype_code`): Guard use of modulo. * generated/cshift0_c10.c: Regenerated. * generated/cshift0_c16.c: Regenerated. * generated/cshift0_c4.c: Regenerated. * generated/cshift0_c8.c: Regenerated. * generated/cshift0_i16.c: Regenerated. * generated/cshift0_i1.c: Regenerated. * generated/cshift0_i2.c: Regenerated. * generated/cshift0_i4.c: Regenerated. * generated/cshift0_i8.c: Regenerated. * generated/cshift0_r10.c: Regenerated. * generated/cshift0_r16.c: Regenerated. * generated/cshift0_r4.c: Regenerated. * generated/cshift0_r8.c: Regenerated. Index: m4/cshift0.m4 === --- m4/cshift0.m4 (revision 186272) +++ m4/cshift0.m4 (working copy) @@ -98,9 +98,13 @@ cshift0_'rtype_code` ('rtype` *ret, cons rptr = ret-base_addr; sptr = array-base_addr; - shift = len == 0 ? 0 : shift % (ptrdiff_t)len; - if (shift 0) -shift += len; + /* Avoid the costly modulo for trivially in-bound shifts. */ + if (shift 0 || shift = len) +{ + shift = len == 0 ? 0 : shift % (ptrdiff_t)len; + if (shift 0) + shift += len; +} while (rptr) {
[PATCH 01/11] Fix cpp_sys_macro_p with -ftrack-macro-expansion
Hello, cpp_sys_macro_p crashes when -ftrack-macro-expansion is on. The issue can be reproduced by running the tests: runtest --tool gcc --tool_opts=-ftrack-macro-expansion cpp.exp=sysmac1.c runtest --tool gcc --tool_opts=-ftrack-macro-expansion cpp.exp=sysmac2.c This is because it just doesn't support that mode. Fixed thus. Tested and bootstrapped on x86_64-unknown-linux-gnu against trunk. Note that the bootstrap with -ftrack-macro-expansion turned on exhibits other separate issues that are addressed in subsequent patches. This patch just fixes one class of problems. The patch does pass bootstrap with -ftrack-macro-expansion turned off, though. libcpp/ * macro.c (cpp_sys_macro_p): Support -ftrack-macro-expansion. --- libcpp/macro.c | 10 +- 1 files changed, 9 insertions(+), 1 deletions(-) diff --git a/libcpp/macro.c b/libcpp/macro.c index 54de3e3..58a722c 100644 --- a/libcpp/macro.c +++ b/libcpp/macro.c @@ -2436,7 +2436,15 @@ cpp_get_token_with_location (cpp_reader *pfile, source_location *loc) int cpp_sys_macro_p (cpp_reader *pfile) { - cpp_hashnode *node = pfile-context-c.macro; + cpp_hashnode *node = NULL; + + if (CPP_OPTION (pfile, track_macro_expansion)) +{ + if (pfile-context-c.mc) + node = pfile-context-c.mc-macro_node; +} + else +node = pfile-context-c.macro; return node node-value.macro node-value.macro-syshdr; } -- Dodji
[PATCH] Fix PR52881, more loop preserving fallout with RTL optimizers
RTL optimizers perform CFG adjustments themselves rather than relying on cleanup_cfg (as the gimple level does). This makes it necessary to handle things all-over-the-place. This case is if-conversion performing forwarder block removal (well, it calls it speculation), removing empty loop latches. That's unwanted, thus the following patch makes sure we do not do this. I'm sure more RTL optimiziation fallout will pop up - and I wonder if we should simply avoid modifying the CFG all over the place and instead do that in cleanup_cfg. Thus, in the if-conversion case, simply do the speculation and leave the empty forwarder blocks to be cleaned up by cleanup_cfg (or for CSE to leave unconditional jumps conditional, as if (0/1), for example). Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. Richard. 2012-04-10 Richard Guenther rguent...@suse.de PR rtl-optimization/52881 * ifcvt.c (find_if_case_2): Avoid speculating loop latches. * gcc.dg/torture/pr52881.c: New testcase. * gcc.dg/torture/pr52913.c: Likewise. Index: gcc/ifcvt.c === *** gcc/ifcvt.c (revision 186274) --- gcc/ifcvt.c (working copy) *** find_if_case_2 (basic_block test_bb, edg *** 3927,3932 --- 3927,3937 edge else_succ; int then_prob, else_prob; + /* We do not want to speculate (empty) loop latches. */ + if (current_loops +else_bb-loop_father-latch == else_bb) + return FALSE; + /* If we are partitioning hot/cold basic blocks, we don't want to mess up unconditional or indirect jumps that cross between hot and cold sections. Index: gcc/testsuite/gcc.dg/torture/pr52881.c === *** gcc/testsuite/gcc.dg/torture/pr52881.c (revision 0) --- gcc/testsuite/gcc.dg/torture/pr52881.c (revision 0) *** *** 0 --- 1,35 + /* { dg-do compile } */ + + int a, b, c, d, e, f, h, i, j, k, l, m, n, o; + static int g; + int + fn1 () { + for (;; ++f) + if (e) + break; + return 0; + } + unsigned char fn2 (); + void + fn3 () { + lbl_220: + if (j) { + lbl_221: + l = (g || b) = fn1 (); + for (;;) { + g = 0; + fn2 (); + if (k) + goto lbl_220; + break; + } + if (l) + goto lbl_221; + } + } + unsigned char + fn2 () { + o = d ? 0 : c; + h = m | a % o != n; + return i; + } Index: gcc/testsuite/gcc.dg/torture/pr52913.c === --- gcc/testsuite/gcc.dg/torture/pr52913.c (revision 0) +++ gcc/testsuite/gcc.dg/torture/pr52913.c (revision 0) @@ -0,0 +1,18 @@ +/* { dg-do compile } */ + +int a, b, c, d, e; +void +fn1 () +{ +lbl_101: + e = 0; +lbl_274: + for (c = 0; c 1; c = a) +if (d) + if (b) + goto lbl_101; + else + break; + d = 1; + goto lbl_274; +}
[PATCH 02/11] Fix token pasting with -ftrack-macro-expansion
This patch makes token pasting work with -ftrack-macro-expansion turned on. It improves some pasting related tests of the gcc.dg/cpp sub-directory. Tested and bootstrapped on x86_64-unknown-linux-gnu against trunk. Note that the bootstrap with -ftrack-macro-expansion exhibits other separate issues that are addressed in subsequent patches. This patch just fixes one class of problems. The patch does pass bootstrap with -ftrack-macro-expansion turned off, though. libcpp/ * macro.c (paste_all_tokens): Put the token resulting from pasting into an extended token context with -ftrack-macro-location is in effect. gcc/testsuite/ * gcc.dg/cpp/paste17.c: New test case for -ftrack-macro-expansion=2 mode only. * gcc.dg/cpp/macro-exp-tracking-5.c: Likewise. --- gcc/testsuite/gcc.dg/cpp/macro-exp-tracking-5.c | 18 ++ gcc/testsuite/gcc.dg/cpp/paste17.c |8 ++ libcpp/macro.c | 28 ++- 3 files changed, 53 insertions(+), 1 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/cpp/macro-exp-tracking-5.c create mode 100644 gcc/testsuite/gcc.dg/cpp/paste17.c diff --git a/gcc/testsuite/gcc.dg/cpp/macro-exp-tracking-5.c b/gcc/testsuite/gcc.dg/cpp/macro-exp-tracking-5.c new file mode 100644 index 000..7933660 --- /dev/null +++ b/gcc/testsuite/gcc.dg/cpp/macro-exp-tracking-5.c @@ -0,0 +1,18 @@ +/* + { dg-options -fshow-column -ftrack-macro-expansion } + { dg-do compile } + */ + +#define PASTED var ## iable /* { dg-error undeclared } */ +#define call_foo(p1, p2) \ + foo (p1, \ + p2); /* { dg-message in expansion of macro } */ + +void foo(int, char); + +void +bar() +{ + call_foo(1,PASTED); /* { dg-message expanded from here } */ +} + diff --git a/gcc/testsuite/gcc.dg/cpp/paste17.c b/gcc/testsuite/gcc.dg/cpp/paste17.c new file mode 100644 index 000..9c6506f --- /dev/null +++ b/gcc/testsuite/gcc.dg/cpp/paste17.c @@ -0,0 +1,8 @@ + /* { dg-options -ftrack-macro-expansion=2 } */ +/* { dg-do preprocess } */ + +#define do_paste 1.0e ## -1 + +do_paste + +/* { dg-final {scan-file paste17.i 1.0e- 1 } }*/ diff --git a/libcpp/macro.c b/libcpp/macro.c index 58a722c..e0bfc31 100644 --- a/libcpp/macro.c +++ b/libcpp/macro.c @@ -611,6 +611,21 @@ paste_all_tokens (cpp_reader *pfile, const cpp_token *lhs) { const cpp_token *rhs = NULL; cpp_context *context = pfile-context; + source_location virt_loc = 0; + + /* We must have been called on a token that appears at the left + hand side of a ## operator. */ + if (!(lhs-flags PASTE_LEFT)) +abort (); + + if (context-tokens_kind == TOKENS_KIND_EXTENDED) +/* The caller must have called consume_next_token_from_context + right before calling us. That has incremented the pointer to + the current virtual location. So it now points to the location + of the token that comes right after *LHS. We want the + resulting pasted token to have the location of the current + *LHS, though. */ +virt_loc = *(context-c.mc-cur_virt_loc - 1); do { @@ -650,7 +665,18 @@ paste_all_tokens (cpp_reader *pfile, const cpp_token *lhs) while (rhs-flags PASTE_LEFT); /* Put the resulting token in its own context. */ - _cpp_push_token_context (pfile, NULL, lhs, 1); + if (context-tokens_kind == TOKENS_KIND_EXTENDED) +{ + source_location *virt_locs = NULL; + _cpp_buff *token_buf = tokens_buff_new (pfile, 1, virt_locs); + tokens_buff_add_token (token_buf, virt_locs, lhs, +virt_loc, 0, NULL, 0); + push_extended_tokens_context (pfile, context-c.mc-macro_node, + token_buf, virt_locs, + (const cpp_token **)token_buf-base, 1); +} + else +_cpp_push_token_context (pfile, NULL, lhs, 1); } /* Returns TRUE if the number of arguments ARGC supplied in an -- 1.7.6.5 -- Dodji
[PATCH 03/11] Fix PCH crash on GTYed pointer-to-scalar field of a
Disclaimer: I am sorry for the length of this message. In essence, it's just a one liner fix in gengtype.c. I felt the need to explain extensively what I think I understood because it didn't seem obvious to me. So, when -ftrack-macro-expansion is activated, the PCH generation machinery can crash in gt_pch_save when it's about to relocate the pointer for the line_maps::info_macro::maps[i]::d.macro.macro_locations member. The call that crashes (in ggc-common.c) is: state.ptrs[i]-note_ptr_fn (state.ptrs[i]-obj, state.ptrs[i]-note_ptr_cookie, relocate_ptrs, state); The -note_ptr_fn called in this case is the gengtype-generated gt_pch_p_9line_maps function. It crashes because the second argument passed to it is a pointer to struct line_map, instead of being a pointer to struct line_maps (extra 's') like what the function expects. You can see the crash for the test case: runtest --tool g++ --tool_opts=-ftrack-macro-expansion pch.exp=system-1.C I believe it's because a part of the code of gt_pch_nx_line_maps (generated as part of gtype-desc.c by gengtype) is not correct. Note that this gt_pch_nx_line_maps function is called from gt_pch_save in the snippet: for (rt = gt_ggc_rtab; *rt; rt++) for (rti = *rt; rti-base != NULL; rti++) for (i = 0; i rti-nelt; i++) (*rti-pchw)(*(void **)((char *)rti-base + rti-stride * i)); So, in that gt_pch_nx_line_maps, in the branch that starts with the code: if ((*x).info_macro.maps != NULL) { size_t i3; for (i3 = 0; i3 != (size_t)(((*x).info_macro).used); i3++) { switch (((*x).info_macro.maps[i3]).reason == LC_ENTER_MACRO) we have the code: gt_pch_note_object ((*x).info_macro.maps[i3].d.macro.macro_locations, (*x).info_macro.maps, gt_pch_p_9line_maps, gt_types_enum_last); This last snippet registers gt_pch_p_9line_maps to be called on the object pointed by (*x).info_macro.maps[i3].d.macro.macro_locations (as a first argument), with (*x).info_macro.maps as its second argument. Note that (*x).info_macro.maps is of type struct line_map*, while 'x' is of type struct line_maps* - beware, there is an 's' at the end of the latter. The problem is that gt_pch_p_9line_maps requires that its second argument be an instance of _struct line_maps_, not struct line_map. So later when gt_pch_p_9line_maps is called, it just crashes. More generally, these gt_pch_p_xxx functions seem to require that their second argument be an instance of the xxx in question. And that invariant is violated by the snippet of code above. The invariant seems to be violated only for the case where a GTYed structure (possibly embedded in another GTYed structure) contains a pointer to a scalar (that is not a string) which memory is ggc/GTY managed, like the line_map_macro::macro_locations field. And this only happens for PCH generation. Looking at gengtype.c, it seems like write_types_process_field can be fooled in that case. It expects that the expression d-prev_val[3] contains the name of the second argument of the gt_pch_p_xxx (which is generically referenced by wtd-subfield_marker_routine there). That expression can resolve to either x, as we would like it to be, but can also resolve to another arbitrary name for e.g, the case of a pointer-to-struct used as a root). This patch simply forces the second argument of gt_pch_p_xxx to be 'x' even in the case of a member that is a pointer to a scalar. As a result, here is the the diff the new generated gtype-desc.c file: @@ -5234,7 +5234,7 @@ gt_pch_nx_line_maps (void *x_p) size_t i2; for (i2 = 0; i2 != (size_t)(2 * ((*x).info_ordinary.maps[i0].d.macro).n_tokens); i2++) { } -gt_pch_note_object ((*x).info_ordinary.maps[i0].d.macro.macro_locations, (*x).info_ordinary.maps, gt_pch_p_9line_maps, gt_types_enum_last); +gt_pch_note_object ((*x).info_ordinary.maps[i0].d.macro.macro_locations, x, gt_pch_p_9line_maps, gt_types_enum_last); } break; default: @@ -5261,7 +5261,7 @@ gt_pch_nx_line_maps (void *x_p) size_t i5; for (i5 = 0; i5 != (size_t)(2 * ((*x).info_macro.maps[i3].d.macro).n_tokens); i5++) { } -gt_pch_note_object ((*x).info_macro.maps[i3].d.macro.macro_locations, (*x).info_macro.maps, gt_pch_p_9line_maps, gt_types_enum_last); +gt_pch_note_object ((*x).info_macro.maps[i3].d.macro.macro_locations, x, gt_pch_p_9line_maps, gt_types_enum_last); } break; default: @@ -9366,7 +9366,7 @@ gt_pch_na_regno_reg_rtx (ATTRIBUTE_UNUSED void *x_p) for (i1 = 0; i1 != (size_t)(crtl-emit.x_reg_rtx_no); i1++) { gt_pch_n_7rtx_def (regno_reg_rtx[i1]); } -gt_pch_note_object (regno_reg_rtx,
Re: [PATCH] ARM: Use different linker path for hardfloat ABI
On Tue, 10 Apr 2012 09:32:22 -0500 Dennis Gilmore den...@gilmore.net.au wrote: every distro uses a unique triplet, by putting the triplet in there you then need to get all distros to change to using the same triplets. I personally prefer /libhfp rather than /libhf but I am ok with using either. Any change from /lib would need us to do a mass rebuild. because while not 100% needed I would rather keep libraries with the linker. the changes to rpm to support it would be somewhat minimal. we have stated in Fedora though that we have no intention to support mixing hfp and sfp on the same system. we really do need to ensure consensus for arm64 which I think should be /lib64 for 64 bit arch consistency. Ok, I respect that, what about using the actual ABI name, ie aapcs-vfp? or something that includes both the architecture and the eabi, (arm-hardfloat, armhf, armhfp, etc), but *in* the filename (excluding the case of using a separate directory as it's not too popular). Also, I'm not suggesting changing the triplet or anything, just deciding on a unique name for the triplet. Debian's argument is that the default multilib solution is not future-proof and we would prefer something that is more unique. Regarding /lib64 for aarch64, that's an entirely different discussion, though I do agree it should be also be resolved sooner rather than later as well. Regards Konstantinos
Re: Guard use of modulo in cshift (speedup protein)
On Tue, Apr 10, 2012 at 4:53 PM, Michael Matz m...@suse.de wrote: Hi, this patch speeds up polyhedrons protein on Bulldozer quite a bit. The things is that in this testcase cshift is called with a very short length (=3) and that the shift amount always is less than the length. Surprisingly the division instruction takes up considerable amount of time, so much that it makes sense to guard it, when the shift is in bound. Here's some oprofile of _gfortrani_cshift0_i4 (total 31020 cycles): 23 0.0032 : caf00: idiv %r13 13863 1.9055 : caf03: lea (%rdx,%r13,1),%r12 I.e. despite the memory shuffling one third of the cshift cycles are that division. With the patch the time for protein drops from 0m21.367s to 0m20.547s on this Bulldozer machine. I've checked that it has no adverse effect on older AMD or Intel cores (0:44.30elapsed vs 0:44.00elapsed, still an improvement). Regstrapped on x86_64-linux. Okay for trunk? Ciao, Michael. * m4/cshift0.m4 (cshift0_'rtype_code`): Guard use of modulo. * generated/cshift0_c10.c: Regenerated. * generated/cshift0_c16.c: Regenerated. * generated/cshift0_c4.c: Regenerated. * generated/cshift0_c8.c: Regenerated. * generated/cshift0_i16.c: Regenerated. * generated/cshift0_i1.c: Regenerated. * generated/cshift0_i2.c: Regenerated. * generated/cshift0_i4.c: Regenerated. * generated/cshift0_i8.c: Regenerated. * generated/cshift0_r10.c: Regenerated. * generated/cshift0_r16.c: Regenerated. * generated/cshift0_r4.c: Regenerated. * generated/cshift0_r8.c: Regenerated. Index: m4/cshift0.m4 === --- m4/cshift0.m4 (revision 186272) +++ m4/cshift0.m4 (working copy) @@ -98,9 +98,13 @@ cshift0_'rtype_code` ('rtype` *ret, cons rptr = ret-base_addr; sptr = array-base_addr; - shift = len == 0 ? 0 : shift % (ptrdiff_t)len; - if (shift 0) - shift += len; + /* Avoid the costly modulo for trivially in-bound shifts. */ + if (shift 0 || shift = len) + { + shift = len == 0 ? 0 : shift % (ptrdiff_t)len; + if (shift 0) + shift += len; + } while (rptr) { This is OK. Do you think it would be worthwhile to do this transformation in the middle end too, based on profile information for values? IIRC value-prof handles constant divmod but not ranges for modulo operations. Steven
Re: ICE on bootstrap of libstdc++ for mingw-targets in add_bb_to_loop
Sorry, for the noise. This patch just activates dw2 and therefore avoids the ICE. Continue on that. Kai
move warning_if_unused_value to c-common.c
It is a C-family FE function, not sure how it ended up in stmt.c. OK to commit? Cheers, Manuel. Bootstrapped and regression tested with languages=all,ada,obj-c++ 2012-04-10 Manuel López-Ibáñez m...@gcc.gnu.org * c-family/c-common.c (warn_if_unused_valuer): Move definition to here. * tree.h (warn_if_unused_value): Move declaration from here. * c-family/c-common.h (warn_if_unused_value): Move declaration to here. * cp/cvt.c (convert_to_void): Update comment. * stmt.c (warn_if_unused_value): Move definition from here. warning_if_unused.diff Description: Binary data
Re: [PATCH] ARM: Use different linker path for hardfloat ABI
On Tue, Apr 10, 2012 at 09:32:22AM -0500, Dennis Gilmore wrote: On Tue, 10 Apr 2012 12:18:51 +0300 Konstantinos Margaritis konstantinos.margari...@linaro.org wrote: On Tue, 10 Apr 2012 07:36:07 +0200 Jakub Jelinek ja...@redhat.com wrote: We really want consistency about the dynamic linker names etc. across different targets and sneaking silently multiarch paths on one architecture would make it inconsistent with all the others. So, please just use /libhf/ld-linux.so.3. I personally find /libhf extremely ugly. If having a second path is a problem, how about using the triplet in the filename? Like: /lib/ld-linux-arm-linux-gnueabihf.so.3 ? Unique per arch and not multiarched. And AFAIK, Linux can handle long filenames just fine... every distro uses a unique triplet, by putting the triplet in there you then need to get all distros to change to using the same triplets. Aargh. Again, use of a standard triplet for arm hard-float was agreed by all parties at the Plumbers' meeting last September. For exactly this reason. Now it seems that a number of people have totally ignored that consensus for the last six months. I personally prefer /libhfp rather than /libhf but I am ok with using either. Any change from /lib would need us to do a mass rebuild. because while not 100% needed I would rather keep libraries with the linker. the changes to rpm to support it would be somewhat minimal. we have stated in Fedora though that we have no intention to support mixing hfp and sfp on the same system. we really do need to ensure consensus for arm64 which I think should be /lib64 for 64 bit arch consistency. Precisely who in Redhat/Fedora land has the power to make decisions in this area? We've been clearly wasting our time thus far trying to negotiate agreements about the hard-float platform. Cheers, -- Steve McIntyresteve.mcint...@linaro.org http://www.linaro.org/ Linaro.org | Open source software for ARM SoCs
Re: Guard use of modulo in cshift (speedup protein)
Hi, On Tue, 10 Apr 2012, Steven Bosscher wrote: This is OK. r186283. Do you think it would be worthwhile to do this transformation in the middle end too, based on profile information for values? I'd think so, but it probably requires a new profiler that counts for how often 0 = A = B for every A % B. Just profiling the range of values might be misleading (because A = N and B = M and N = M doesn't imply that A = B often holds). But it would possibly be an interesting experiment already to do such transformation generally (without profiling) and see what it gives on some benchmarks. Just to get a feel what's on the plate. IIRC value-prof handles constant divmod but not ranges for modulo operations. Ciao, Michael.
Re: [PATCH] Fix PR52881, more loop preserving fallout with RTL optimizers
On Tue, Apr 10, 2012 at 4:56 PM, Richard Guenther rguent...@suse.de wrote: I'm sure more RTL optimiziation fallout will pop up - and I wonder if we should simply avoid modifying the CFG all over the place and instead do that in cleanup_cfg. Thus, in the if-conversion case, simply do the speculation and leave the empty forwarder blocks to be cleaned up by cleanup_cfg (or for CSE to leave unconditional jumps conditional, as if (0/1), for example). The problem with that could be that if(0|1) is not a valid instruction for every machine. You may need to set a register (pseudo, or hard reg) that may not be available (e.g. if it is live). Ciao! Steven
Re: rfa: vectorize strided loads [2/2] [PR 18437]
Hi, On Tue, 10 Apr 2012, Richard Guenther wrote: + vec_inv = build_constructor (vectype, v); + new_temp = vect_init_vector (stmt, vec_inv, vectype, gsi); + new_stmt = SSA_NAME_DEF_STMT (new_temp); + mark_symbols_for_renaming (new_stmt); This should not be necessary - in fact please manually set the proper VUSE here. I don't see how I could know the right VUSE. It might not even exist yet. Reusing the existing vuse from the load that is replaced isn't correct. Consider that there are (unrelated) stores or calls before the load, and those are vectorized. When updating their operands they'll get a new vdef, and I'd need to use _that_ one: original: #vdef MEM_1 scalar_store = (this insn will be removed by vectorizable_store) #vuse MEM_1 ... = A[i] --- #vdef MEM (note, no ssaname yet) vect_store = ... #vuse MEM_1 ... = A[i] ( will be later removed as unused LHS) #vuse XXX ... = A[i] #vuse XXX ... = A[i+stride] So, which SSA name to use for XXX, when the vect_store doesn't yet know (which it doesn't). Tracking (and generating) proper VDEF SSA names is possible (we only vectorize a single BB, and that in stmt order), but that requires some rework of the vectorizer main loops, which should be a separate patch. Agreed? Ciao, Michael.
Re: move warning_if_unused_value to c-common.c
On Tue, Apr 10, 2012 at 10:25 AM, Manuel López-Ibáñez lopeziba...@gmail.com wrote: It is a C-family FE function, not sure how it ended up in stmt.c. OK to commit? OK.
Re: [i386, patch, RFC] HLE support in GCC
H.J. Lu hjl.to...@gmail.com writes: So, to emit HLE prefix, it is possible to do: int foo2 (int *p, int oldv, int newv) { __atomic_compare_exchange_n (p, oldv, newv, 0, __ATOMIC_ACQUIRE | __ATOMIC_USE_HLE, __ATOMIC_ACQUIRE); return oldv; } This is wrong since HLE ACQUIRE/RELEASE has nothing to do with C++ atomic acquire/release. You can have HLE RELEASE with C++ atomic acquire. It makes sense to combine the two. On x86 C++ atomic acquire/release means the compiler cannot move references outside. For HLE we really want the same, otherwise some of the memory references inside the transaction may not be transactional. So I think HLE_ACQUIRE should imply C++ acquire and HLE_RELEASE imply C++ release. -Andi -- a...@linux.intel.com -- Speaking for myself only
Re: RFC reminder: an alternative -fsched-pressure implementation
On 04/10/2012 09:35 AM, Richard Sandiford wrote: Hi Vlad, Back in Decemember, when we were still very much in stage 3, I sent an RFC about an alternative implementation of -fsched-pressure. Just wanted to send a reminder now that we're in the proper stage: http://gcc.gnu.org/ml/gcc-patches/2011-12/msg01684.html Ulrich has benchmarked it on ARM, S/390 and Power7 (thanks), and got reasonable results. (I mentioned bad Power 7 results in that message, because of the way the VSX_REGS class is handled. Ulrich's results are without -mvsx though.) The condition I orignally set myself was that this patch should only go in if it becomes the default on at least one architecture, specifically ARM. Ulrich tells me that Linaro have now made it the default for ARM in their GCC 4.7 release, so hopefully Ramana would be OK with doing the same in upstream 4.8. Sorry for forgetting about this and thanks for the remainder. I'll start looking at this patch thoroughly and reviewing it on this week (may be I'll do some benchmarks too). I realise the whole thing is probably more complicated and ad-hoc than you'd like. Saying it can't go in is a perfectly acceptable answer IMO. The first impression I got for the patch when you submitted it, that it probably should go in.
Re: [PATCH] Caret diagnostics
On 10 April 2012 00:28, Jason Merrill ja...@redhat.com wrote: On 04/09/2012 04:01 PM, Manuel López-Ibáńez wrote: * It uses the default cutoff as max_width, whatever it is (as controlled by -fmessage-length). * It uses the pretty-printer. The text cannot (should not) wrap because we still print only max_width chars at most. Hmm, I think if pp_line_cutoff is 0 and we're on a terminal, we still want to use COLUMNS to limit how much of the source we print. Like this? 2012-04-05 Manuel López-Ibáñez m...@gcc.gnu.org PR 24985 libstdc++-v3/ * testsuite/lib/prune.exp: Handle caret. libmudflap/ * testsuite/lib/libmudflap.exp: Handle caret. gcc/ * diagnostic.h (show_caret): Declare. (caret_max_width): Declare. (diagnostic_show_locus): Declare. (diagnostic_set_caret_max_width): Declare. * diagnostic.c (diagnostic_initialize): Initialize to false. (diagnostic_show_locus): New. (diagnostic_report_diagnostic): Call it. (getenv_columns): New. (diagnostic_set_caret_max_width): New. * input.c (read_line): New. (location_get_source_line): New. * input.h (location_get_source_line): Declare. * toplev.c (general_init): Initialize show_caret from options. * dwarf2out.c (gen_producer_string): Handle fdiagnostics-show-caret. * opts.c (common_handle_option): Likewise. * pretty-print.h (pp_get_prefix): New. (pp_base_get_prefix): New. * common.opt (fdiagnostics-show-caret): New option. * doc/invoke.texi (fdiagnostics-show-caret): Document it. testsuite/ * lib/prune.exp: Add -fno-diagnostics-show-caret. caret-diagnostics-20120410.diff Description: Binary data
Re: [PATCH] ARM: Use different linker path for hardfloat ABI
FWIW, my use case for multiarch is not sharing the root filesystem among multiple systems. It's sharing the non-lib namespace (/etc, /bin, data) among multiple instruction sets / ABI variants on the same system. I don't need (/usr)?/s?bin to be decorated with a triplet, because the kernel picks a fresh ld.so variant at the execve() boundary; I am happy to mix ARM and x86 binaries (and Python and shell scripts) in /bin, and let the kernel (and binfmt_misc + qemu) sort it out. I only need (/usr)?/lib to be disambiguated *at runtime* because ld.so is not as smart as the kernel. (It's not just ld.so, of course; module/plugin loaders for everything from Python to Firefox have the same problem, and if they don't have the triplet in there somewhere then multiarch breaks them.) As long as the kernel can find the right ld.so and each ld.so can find its own ld.so.conf, I don't really care where the libraries are put at runtime, as long as I can make it different for each ISA/ABI. However, I do care how much autoconf / pkg-config / debhelper misery I have to go through each time I need to pull in another library dependency. Upstream build machinery can usually accommodate /just/about/anything/lib. Trailing components like lib32, libhf, or lib-gnu-autoconf-triplet are not as consistently trivial. Personally, I would like for all shared objects to live in (/usr)?/gnu-autoconf-triplet(-extranoise)?/lib, and for the kernel to take responsibility for pointing (/usr)?/lib at an overlay mount containing whatever makes sense for the currently running binary, a la /proc/self. That way I can grandfather in binaries with ABI-ignorant hard-coded library paths, and still handle ISA variants. The extranoise might be neon, or ssse3, or android (so that non-Android binaries on the same system don't see Android-specific libraries with stupidly generic names like libui.so). And the overlay mount is so that I can, if I choose, build the vast majority of my system without NEON instructions (and thus not take the overhead of VFP context save/restore during timeslices that don't use actual floating point) and still use a subset of those libraries from NEON-anointed binaries (assuming I define some sensible way for the kernel to make that distinction). That isn't necessarily the right solution, of course -- either at a technical level or in the light of the LSB process and other inter-distro politics. But maybe it's at least a more plausible use case for 2012 than NFS-mounting /usr/local on a mix of sun4c, sun4u, and IRIX workstations. (That never did work quite right ...) Cheers, - Michael
Re: [i386, patch, RFC] HLE support in GCC
On Tue, Apr 10, 2012 at 4:20 PM, Jakub Jelinek ja...@redhat.com wrote: On Tue, Apr 10, 2012 at 06:12:08PM +0400, Kirill Yukhin wrote: Attached patch implements HLE support for __atomic_compare_exchange_n. The target hook is definitely not appropriate, just define it in ix86_target_macros in i386-c.c instead or so. Also, I think it is better to pass operand that holds the model constant to the final insn and conditionally output xacquire/xrelease based on INTVAL of this operand. Something like in attached patch, but probably with a helper function in i386.c. Uros. Index: config/i386/sync.md === --- config/i386/sync.md (revision 186282) +++ config/i386/sync.md (working copy) @@ -315,8 +315,9 @@ (match_operand:SI 7 const_int_operand)] ;; failure model TARGET_CMPXCHG { - emit_insn (gen_atomic_compare_and_swap_singlemode -(operands[1], operands[2], operands[3], operands[4])); + emit_insn + (gen_atomic_compare_and_swap_singlemode +(operands[1], operands[2], operands[3], operands[4], operands[6])); ix86_expand_setcc (operands[0], EQ, gen_rtx_REG (CCZmode, FLAGS_REG), const0_rtx); DONE; @@ -344,8 +345,9 @@ { if (MODEmode == DImode TARGET_64BIT) { - emit_insn (gen_atomic_compare_and_swap_singledi -(operands[1], operands[2], operands[3], operands[4])); + emit_insn + (gen_atomic_compare_and_swap_singledi + (operands[1], operands[2], operands[3], operands[4], operands[6])); } else { @@ -370,7 +372,7 @@ mem = replace_equiv_address (mem, force_reg (Pmode, XEXP (mem, 0))); emit_insn (gen_atomic_compare_and_swap_doublemode -(lo_o, hi_o, mem, lo_e, hi_e, lo_n, hi_n)); +(lo_o, hi_o, mem, lo_e, hi_e, lo_n, hi_n, operands[6])); } ix86_expand_setcc (operands[0], EQ, gen_rtx_REG (CCZmode, FLAGS_REG), const0_rtx); @@ -382,15 +384,24 @@ (unspec_volatile:SWI [(match_operand:SWI 1 memory_operand +m) (match_operand:SWI 2 register_operand 0) - (match_operand:SWI 3 register_operand r)] + (match_operand:SWI 3 register_operand r) + (match_operand:SI 4 const_int_operand)] UNSPECV_CMPXCHG_1)) (set (match_dup 1) (unspec_volatile:SWI [(const_int 0)] UNSPECV_CMPXCHG_2)) (set (reg:CCZ FLAGS_REG) (unspec_volatile:CCZ [(const_int 0)] UNSPECV_CMPXCHG_3))] TARGET_CMPXCHG - lock{%;} cmpxchg{imodesuffix}\t{%3, %1|%1, %3}) +{ + static char buf[128]; + const char *hle += (INTVAL (operands[4]) 8) ? xacquire : ; + snprintf (buf, sizeof (buf), + lock{%;} %scmpxchg{imodesuffix}\t{%3, %1|%1, %3}, hle); + return buf; +}) + ;; For double-word compare and swap, we are obliged to play tricks with ;; the input newval (op5:op6) because the Intel register numbering does ;; not match the gcc register numbering, so the pair must be CX:BX. @@ -403,7 +414,8 @@ (match_operand:DCASHMODE 3 register_operand 0) (match_operand:DCASHMODE 4 register_operand 1) (match_operand:DCASHMODE 5 register_operand b) - (match_operand:DCASHMODE 6 register_operand c)] + (match_operand:DCASHMODE 6 register_operand c) + (match_operand:SI 7 const_int_operand)] UNSPECV_CMPXCHG_1)) (set (match_operand:DCASHMODE 1 register_operand =d) (unspec_volatile:DCASHMODE [(const_int 0)] UNSPECV_CMPXCHG_2)) @@ -412,8 +424,16 @@ (set (reg:CCZ FLAGS_REG) (unspec_volatile:CCZ [(const_int 0)] UNSPECV_CMPXCHG_4))] - lock{%;} cmpxchgdoublemodesuffixb\t%2) +{ + static char buf[128]; + const char *hle += (INTVAL (operands[7]) 8) ? xacquire : ; + snprintf (buf, sizeof (buf), + lock{%;} %scmpxchgdoublemodesuffixb\t%2, hle); + return buf; +}) + ;; Theoretically we'd like to use constraint r (any reg) for op5, ;; but that includes ecx. If op5 and op6 are the same (like when ;; the input is -1LL) GCC might chose to allocate op5 to ecx, like
Re: ICE on bootstrap of libstdc++ for mingw-targets in add_bb_to_loop
Hi, issue was caused by the patch for PR middle-end/52772, which just handles dw2. This patch makes sure that for sjlj we add new generated bb's for dispatching table to loop, too. ChangeLog 2012-04-10 Kai Tietz kti...@redhat.com PR c++/52924 * except.c (sjlj_emit_dispatch_table): Add bb to loop, as done for dwarf2. Bootstrapped for x64_64-w64-mingw32, and x86_64-unknown-linux-gnu. Later for all languages including Ada and Obj-C++. Ok for apply? Regards, Kai Index: except.c === --- except.c(revision 186288) +++ except.c(working copy) @@ -1344,6 +1344,16 @@ e = make_edge (bb, bb-next_bb, EDGE_FALLTHRU); e-count = bb-count; e-probability = REG_BR_PROB_BASE; + if (current_loops) + { + struct loop *loop = bb-next_bb-loop_father; + /* If we created a pre-header block, add the new block to the + outer loop, otherwise to the loop itself. */ + if (bb-next_bb == loop-header) + add_bb_to_loop (bb, loop_outer (loop)); + else + add_bb_to_loop (bb, loop); + } disp_index++; } @@ -1365,6 +1375,17 @@ e-count = bb-count; e-probability = REG_BR_PROB_BASE; } + + if (current_loops) +{ + struct loop *loop = bb-next_bb-loop_father; + /* If we created a pre-header block, add the new block to the +outer loop, otherwise to the loop itself. */ + if (bb-next_bb == loop-header) + add_bb_to_loop (bb, loop_outer (loop)); + else + add_bb_to_loop (bb, loop); +} } static void
Re: [PATCH, rs6000] Fix PR16458, eliminate redudant compares
On Mon, 2012-01-30 at 10:46 +0100, Richard Guenther wrote: On Fri, Jan 27, 2012 at 5:56 PM, Peter Bergner berg...@vnet.ibm.com wrote: This patch fixes PR16458 by using the type expression attached to a reg rtx to detect its signedness and generating unsigned compares when appropriate. However, we continue to use signed compares for the special case of when we compare an unsigned reg against the constant 0. This is because signed and unsigned compares give the same results for equality and (unsigned)x 0) is equivalent to x != 0. Using a signed compare in this special case allows us to continue to generate record form instructions (ie, instructions that implicitly set cr0). [snip] This does not sound suitable for stage4. rs6000_unsigned_reg_p looks suspiciously non-rs6000 specific, and if we really can rely on the way it is implemented should find its way to general RTL helper routines. Now that we're in stage1 again, I'm attaching an updated patch that moves rs6000_unsigned_reg_p into an arch independent RTL file like you wanted. I have also attached a couple of patch hunks from Michael that sets register attributes in a couple of cases we didn't before and that allows my patch to coalesce the compares in pr16458-4.c. The description of that issue is located in: http://gcc.gnu.org/ml/gcc/2012-01/msg00339.html The updated and expanded patch passes bootstrap and regtesting with no regessions on powerpc64-linux. Ok for mainline now? Peter 2012-mm-dd Peter Bergner berg...@vnet.ibm.com Michael Matz m...@suse.de gcc/ PR target/16458 * rtlanal.c (unsigned_reg_p): New function. * rtl.h (unsigned_reg_p): Prototype it. * config/rs6000/rs6000.c (rs6000_generate_compare): Use it. Update comment. * expr.c (expand_expr_real_1): Set register attributes. * stmt.c (expand_case): Likewise. gcc/testsuite/ PR target/16458 * gcc.target/powerpc/pr16458-1.c: New test. * gcc.target/powerpc/pr16458-2.c: Likewise. * gcc.target/powerpc/pr16458-3.c: Likewise. * gcc.target/powerpc/pr16458-4.c: Likewise. Index: gcc/rtlanal.c === --- gcc/rtlanal.c (revision 186244) +++ gcc/rtlanal.c (working copy) @@ -636,6 +636,25 @@ count_occurrences (const_rtx x, const_rt } +/* Return TRUE if OP is a register or subreg of a register that + holds an unsigned quantity. Otherwise, return FALSE. */ + +bool +unsigned_reg_p (rtx op) +{ + if (REG_P (op) + REG_EXPR (op) + TYPE_UNSIGNED (TREE_TYPE (REG_EXPR (op +return true; + + if (GET_CODE (op) == SUBREG + SUBREG_PROMOTED_UNSIGNED_P (op)) +return true; + + return false; +} + + /* Nonzero if register REG appears somewhere within IN. Also works if REG is not a register; in this case it checks for a subexpression of IN that is Lisp equal to REG. */ Index: gcc/rtl.h === --- gcc/rtl.h (revision 186244) +++ gcc/rtl.h (working copy) @@ -1909,6 +1909,7 @@ extern HOST_WIDE_INT get_integer_term (c extern rtx get_related_value (const_rtx); extern bool offset_within_block_p (const_rtx, HOST_WIDE_INT); extern void split_const (rtx, rtx *, rtx *); +extern bool unsigned_reg_p (rtx); extern int reg_mentioned_p (const_rtx, const_rtx); extern int count_occurrences (const_rtx, const_rtx, int); extern int reg_referenced_p (const_rtx, const_rtx); Index: gcc/config/rs6000/rs6000.c === --- gcc/config/rs6000/rs6000.c (revision 186244) +++ gcc/config/rs6000/rs6000.c (working copy) @@ -15570,14 +15570,11 @@ rs6000_generate_compare (rtx cmp, enum m || code == GEU || code == LEU) comp_mode = CCUNSmode; else if ((code == EQ || code == NE) - GET_CODE (op0) == SUBREG - GET_CODE (op1) == SUBREG - SUBREG_PROMOTED_UNSIGNED_P (op0) - SUBREG_PROMOTED_UNSIGNED_P (op1)) + unsigned_reg_p (op0) + (unsigned_reg_p (op1) + || (CONST_INT_P (op1) INTVAL (op1) != 0))) /* These are unsigned values, perhaps there will be a later - ordering compare that can be shared with this one. - Unfortunately we cannot detect the signedness of the operands - for non-subregs. */ + ordering compare that can be shared with this one. */ comp_mode = CCUNSmode; else comp_mode = CCmode; Index: gcc/expr.c === --- gcc/expr.c (revision 186244) +++ gcc/expr.c (working copy) @@ -9015,8 +9015,13 @@ expand_expr_real_1 (tree exp, rtx target stmt_is_replaceable_p (SSA_NAME_DEF_STMT (exp))) g = SSA_NAME_DEF_STMT (exp); if (g) - return expand_expr_real (gimple_assign_rhs_to_tree (g), target, tmode, -
Re: [i386, patch, RFC] HLE support in GCC
Hello! This is wrong since HLE ACQUIRE/RELEASE has nothing to do with C++ atomic acquire/release. You can have HLE RELEASE with C++ atomic acquire. It makes sense to combine the two. On x86 C++ atomic acquire/release means the compiler cannot move references outside. For HLE we really want the same, otherwise some of the memory references inside the transaction may not be transactional. So I think HLE_ACQUIRE should imply C++ acquire and HLE_RELEASE imply C++ release. In this case, can we reverse this sentence and just emit lock xacquire for MEMMODEL_ACQUIRE and lock xrelease for MEMMODEL_RELEASE ? Do we need separate HLE_* defines or can we somehow recycle existing C++11 memmodel defines? Uros.
Re: Added better handling of outdated profile data. (issue 5989046)
On 2012/04/05 18:53:28, asharif wrote: Jan, please take a look and provide some feedback. Thanks, Ping? http://codereview.appspot.com/5989046/
Re: [patch] Fix PR52822 (stable_partition move-assigns object to itself) in trunk, 4.7, and 4.6
.. ah: remember to add PR libstdc++/52822 at the beginning of the ChangeLog entry, otherwise Bugzilla will not pick your commit! Thanks, Paolo.
[PATCH] Atom: Enabling unroll at O2 optimization level
Hi All! Here is a patch that enables unroll at O2 for Atom. This gives good performance boost on EEMBC 2.0 (~+8% in Geomean for 32 bits) with quite moderate code size increase (~5% for EEMBC2.0, 32 bits). Tested for i386 and x86-64, ok for trunk? Thanks, Igor ChangeLog: 2012-04-10 Yakovlev Vladimir vladimir.b.yakov...@intel.com * gcc/config/i386/i386.c (check_imull): New routine. (ix86_loop_unroll_adjust): New target hook. (ix86_option_override_internal): Enable unrolling on Atom at -O2. (TARGET_LOOP_UNROLL_ADJUST): New define. unroll.patch Description: Binary data
Re: [PATCH, rs6000] Fix PR16458, eliminate redudant compares
On Tue, Apr 10, 2012 at 1:36 PM, Peter Bergner berg...@vnet.ibm.com wrote: 2012-mm-dd Peter Bergner berg...@vnet.ibm.com Michael Matz m...@suse.de gcc/ PR target/16458 * rtlanal.c (unsigned_reg_p): New function. * rtl.h (unsigned_reg_p): Prototype it. * config/rs6000/rs6000.c (rs6000_generate_compare): Use it. Update comment. * expr.c (expand_expr_real_1): Set register attributes. * stmt.c (expand_case): Likewise. gcc/testsuite/ PR target/16458 * gcc.target/powerpc/pr16458-1.c: New test. * gcc.target/powerpc/pr16458-2.c: Likewise. * gcc.target/powerpc/pr16458-3.c: Likewise. * gcc.target/powerpc/pr16458-4.c: Likewise. The rs6000 and testsuite parts of the patch are okay with me. Thanks, David
Re: [PATCH] Caret diagnostics
On Tue, Apr 10, 2012 at 11:46 AM, Manuel López-Ibáñez lopeziba...@gmail.com wrote: On 10 April 2012 00:28, Jason Merrill ja...@redhat.com wrote: On 04/09/2012 04:01 PM, Manuel López-Ibáńez wrote: * It uses the default cutoff as max_width, whatever it is (as controlled by -fmessage-length). * It uses the pretty-printer. The text cannot (should not) wrap because we still print only max_width chars at most. Hmm, I think if pp_line_cutoff is 0 and we're on a terminal, we still want to use COLUMNS to limit how much of the source we print. Like this? There is a novelty in this new version that I don't think we discussed before: automatic expansion of tabs to 8 hard space characters. That number should not be hardcoded as there is no reason to believe a tab character always expands to 8 space characters. You should check the environment first; if not present the default expansion number should be a symbolic constant as opposed to a magic number sprinkled all over the places. I would also encourage you to introduce more abstraction to reduce the size of diagnostic_show_locus.
follow up patch to clean up TODO_dump_func
Hi Richard, this is a follow up patch for more cleanups. Bootstrap and tested on x86-64/linux. Ok for trunk? thanks, David todo_dump.p Description: Binary data cg Description: Binary data
Re: [Patch] tree-parloops.c (eliminate_local_variables): Add braces to suppress warnings
On Tue, Apr 10, 2012 at 10:38 AM, Richard Guenther richard.guent...@gmail.com wrote: On Tue, Apr 10, 2012 at 3:06 PM, JonY jo...@users.sourceforge.net wrote: On 4/10/2012 20:37, Richard Guenther wrote: On Tue, Apr 10, 2012 at 2:15 PM, JonY wrote: Hi, Patch OK? What kind of warning? Oops, I forgot to mention gcc was complaining about missing braces. Not for me. And I fail to see why it should. This is the warning: ../../../../build/gcc/src/gcc/tree-parloops.c:724: warning: suggest explicit braces to avoid ambiguous `else'
Re: [PATCH] Caret diagnostics
On 04/10/2012 12:46 PM, Manuel López-Ibáñez wrote: + max_width = context-caret_max_width; + if (max_width= 0) +max_width = INT_MAX; I don't think we need the test here; diagnostic_set_caret_max_width should make sure caret_max_width is set sensibly. Otherwise, yes, thanks. On 04/10/2012 02:52 PM, Gabriel Dos Reis wrote: There is a novelty in this new version that I don't think we discussed before: automatic expansion of tabs to 8 hard space characters. That number should not be hardcoded as there is no reason to believe a tab character always expands to 8 space characters. You should check the environment first; if not present the default expansion number should be a symbolic constant as opposed to a magic number sprinkled all over the places. Hmm. I don't know if there's any reliable way to query tab stops; the it termcap/terminfo capability tells you what it was originally (presumably 8), but it might have changed. Jason
Re: [PATCH] Caret diagnostics
On 10 April 2012 20:52, Gabriel Dos Reis g...@integrable-solutions.net wrote: On Tue, Apr 10, 2012 at 11:46 AM, Manuel López-Ibáñez lopeziba...@gmail.com wrote: On 10 April 2012 00:28, Jason Merrill ja...@redhat.com wrote: On 04/09/2012 04:01 PM, Manuel López-Ibáńez wrote: * It uses the default cutoff as max_width, whatever it is (as controlled by -fmessage-length). * It uses the pretty-printer. The text cannot (should not) wrap because we still print only max_width chars at most. Hmm, I think if pp_line_cutoff is 0 and we're on a terminal, we still want to use COLUMNS to limit how much of the source we print. Like this? There is a novelty in this new version that I don't think we discussed before: automatic expansion of tabs to 8 hard space characters. That number should not be hardcoded as there is no reason to believe a tab character always expands to 8 space characters. You should check the environment first; if not present the default expansion number should be a symbolic constant as opposed to a magic number sprinkled all over the places. I would also encourage you to introduce more abstraction to reduce the size of diagnostic_show_locus. There is no novelty, it has been there from the beginning, and now I realize it was a mistake to do the extra work to implement this. The GCS says: Calculate column numbers assuming that space and all ASCII printing characters have equal width, and assuming tab stops every 8 columns http://www.gnu.org/prep/standards/standards.html#Errors So, in fact, libcpp is buggy for not implementing this (as can be seen in emacs). If/When libcpp is fixed, the column info will be correct for tabs. And then, one may care about printing tabs as anything different from a single space. But until then, a single space is a good as anything. In fact this is what is done in c-ppoutput.c. As can be seen below, this simplifies a lot the function, so I appreciate the suggestion. If you have suggestions for removing other code, I will be happy to implement them. The less code, the less potential for bugs. Cheers, Manuel. +/* Print the physical source line corresponding to the location of + this diagnostics, and a caret indicating the precise column. */ +void +diagnostic_show_locus (diagnostic_context * context, + const diagnostic_info *diagnostic) +{ + const char *line; + char *buffer; + expanded_location s; + int real_width; + int max_width; + const char *saved_prefix; + int right_margin = 10; + + if (!context-show_caret + || diagnostic-location = BUILTINS_LOCATION) +return; + + s = expand_location(diagnostic-location); + line = location_get_source_line (s); + if (line == NULL) +return; + + real_width = strlen (line); + + max_width = context-caret_max_width; + if (max_width = 0) +max_width = INT_MAX; + + /* If the line is longer than max_width and the column is too close + to max_width, skip part of the line until it is not so close. */ + right_margin = MIN(real_width - s.column, right_margin); + right_margin = max_width - right_margin; + if (real_width = max_width s.column right_margin) +{ + line += s.column - right_margin; + s.column = right_margin; +} + + pp_newline (context-printer); + saved_prefix = pp_get_prefix (context-printer); + pp_set_prefix (context-printer, NULL); + pp_character (context-printer, ' '); + while (max_width 0 *line != '\0') +{ + char c = *line == '\t' ? ' ' : *line; + pp_character (context-printer, c); + max_width--; + line++; +} + pp_newline (context-printer); + /* pp_printf does not implement %*c. */ + buffer = XALLOCAVEC (char, s.column + 3); + snprintf (buffer, s.column + 3, %*c, s.column, '^'); + pp_string (context-printer, buffer); + pp_set_prefix (context-printer, saved_prefix); +} +
Re: [PATCH] Caret diagnostics
On 10 April 2012 21:31, Jason Merrill ja...@redhat.com wrote: On 04/10/2012 12:46 PM, Manuel López-Ibáñez wrote: + max_width = context-caret_max_width; + if (max_width= 0) + max_width = INT_MAX; I don't think we need the test here; diagnostic_set_caret_max_width should make sure caret_max_width is set sensibly. Otherwise, yes, thanks. It was just a bit of defensive programming, since nothing stops anyone to set the value directly. But I will remove it. On 04/10/2012 02:52 PM, Gabriel Dos Reis wrote: There is a novelty in this new version that I don't think we discussed before: automatic expansion of tabs to 8 hard space characters. That number should not be hardcoded as there is no reason to believe a tab character always expands to 8 space characters. You should check the environment first; if not present the default expansion number should be a symbolic constant as opposed to a magic number sprinkled all over the places. Hmm. I don't know if there's any reliable way to query tab stops; the it termcap/terminfo capability tells you what it was originally (presumably 8), but it might have changed. No idea either, but it is in fact easier to print tabs a single spaces. This simplifies the code a lot, as pointed by Gabriel. So if you also prefer the simpler version, I am fine with committing that one (it also saves space in the output). Cheers, Manuel.
[PATCH 04/11] Fix expansion point loc for macro-like tokens
Consider the test case gcc/testsuite/gcc.dg/debug/dwarf2/pr41445-5.c. Its interesting part is: #define A(x) vari x /* line 7. */ #define vari(x) #define B , varj int A(B) ; /* line 10. */ In its initial version, this test was being pre-processed as: # 1 gcc/testsuite/gcc.dg/debug/dwarf2/pr41445-5.c # 1 build/gcc// # 1 command-line # 1 gcc/testsuite/gcc.dg/debug/dwarf2/pr41445-5.c # 10 gcc/testsuite/gcc.dg/debug/dwarf2/pr41445-5.c int # 7 gcc/testsuite/gcc.dg/debug/dwarf2/pr41445-5.c vari , varj ; Note how int and vari are on separate lines, whereas int and , varj are on the same line. This looks like a bug to me, even independantly from the macro location tracking work. With macro location tracking turned on, the preprocessed output becomes: # 1 gcc/testsuite/gcc.dg/debug/dwarf2/pr41445-5.c # 1 command-line # 1 gcc/testsuite/gcc.dg/debug/dwarf2/pr41445-5.c # 10 gcc/testsuite/gcc.dg/debug/dwarf2/pr41445-5.c int vari , varj ; Which, IMO, is what we'd expect. This is due to an unexpected side effect of enter_macro_context when passed a token that might look like a function-like macro at first sight, but that it eventually considers to not be a macro after all. This is the case for the vari token which looks like a macro when it is first lexed, but is eventually considered to be a normal token by enter_macro_context because it's not used as a function-like macro invocation. In that case, besides returning NULL, enter_macro_context sets pfile-context-c.macro to NULL, making cpp_get_token_1 forget to set the location of the vari to the expansion point of A. Please look at the extensive comments I have added to cpp_get_token_1 to understand the details. The easy fix here is to speculatively set the location of the returned token before calling enter_macro_context. Tested on x86_64-unknown-linux-gnu against trunk. Now this test has the same output with and without tracking locations accross macro expansions. Note that the bootstrap with -ftrack-macro-expansion exhibits other separate issues that are addressed in subsequent patches. This patch just fixes one class of problems. The patch does pass bootstrap with -ftrack-macro-expansion turned off, though. libcpp/ * macro.c (enter_macro_context): Update comment. (cpp_get_token_1): Set virtual location before calling enter_macro_context. gcc/testsuite/ * gcc.dg/debug/dwarf2/pr41445-5.c: Adjust. * gcc.dg/debug/dwarf2/pr41445-6.c: Likewise. --- gcc/testsuite/gcc.dg/debug/dwarf2/pr41445-5.c |5 ++- gcc/testsuite/gcc.dg/debug/dwarf2/pr41445-6.c |5 ++- libcpp/macro.c| 43 +++- 3 files changed, 49 insertions(+), 4 deletions(-) diff --git a/gcc/testsuite/gcc.dg/debug/dwarf2/pr41445-5.c b/gcc/testsuite/gcc.dg/debug/dwarf2/pr41445-5.c index 03af604..d21acd5 100644 --- a/gcc/testsuite/gcc.dg/debug/dwarf2/pr41445-5.c +++ b/gcc/testsuite/gcc.dg/debug/dwarf2/pr41445-5.c @@ -9,6 +9,9 @@ #define B , varj int A(B) ; -/* { dg-final { scan-assembler DW_TAG_variable\[^\\r\\n\]*\[\\r\\n\]+\[^\\r\\n\]*\vari\[^\\r\\n\]*DW_AT_name(\[^\\r\\n\]*\[\\r\\n\]+\[^\\r\\n\]*DW_AT_)*\[^\\r\\n\]*\[\\r\\n\]+\[^\\r\\n\]*\[^0-9a-fA-FxX](0x)?7\[^0-9a-fA-FxX]\[^\\r\\n\]*DW_AT_decl_line } } */ +/* We want to check that both vari and varj have the same line +number. */ + +/* { dg-final { scan-assembler DW_TAG_variable\[^\\r\\n\]*\[\\r\\n\]+\[^\\r\\n\]*\vari\[^\\r\\n\]*DW_AT_name(\[^\\r\\n\]*\[\\r\\n\]+\[^\\r\\n\]*DW_AT_)*\[^\\r\\n\]*\[\\r\\n\]+\[^\\r\\n\]*\[^0-9a-fA-FxX](0xa|10)\[^0-9a-fA-FxX]\[^\\r\\n\]*DW_AT_decl_line } } */ /* { dg-final { scan-assembler DW_TAG_variable\[^\\r\\n\]*\[\\r\\n\]+\[^\\r\\n\]*\varj\[^\\r\\n\]*DW_AT_name(\[^\\r\\n\]*\[\\r\\n\]+\[^\\r\\n\]*DW_AT_)*\[^\\r\\n\]*\[\\r\\n\]+\[^\\r\\n\]*\[^0-9a-fA-FxX](0xa|10)\[^0-9a-fA-FxX]\[^\\r\\n\]*DW_AT_decl_line } } */ /* { dg-final { cleanup-saved-temps } } */ diff --git a/gcc/testsuite/gcc.dg/debug/dwarf2/pr41445-6.c b/gcc/testsuite/gcc.dg/debug/dwarf2/pr41445-6.c index 8aa37d1..d6d79cc 100644 --- a/gcc/testsuite/gcc.dg/debug/dwarf2/pr41445-6.c +++ b/gcc/testsuite/gcc.dg/debug/dwarf2/pr41445-6.c @@ -4,5 +4,8 @@ #include pr41445-5.c -/* { dg-final { scan-assembler DW_TAG_variable\[^\\r\\n\]*\[\\r\\n\]+\[^\\r\\n\]*\vari\[^\\r\\n\]*DW_AT_name(\[^\\r\\n\]*\[\\r\\n\]+\[^\\r\\n\]*DW_AT_)*\[^\\r\\n\]*\[\\r\\n\]+\[^\\r\\n\]*\[^0-9a-fA-FxX](0x)?7\[^0-9a-fA-FxX]\[^\\r\\n\]*DW_AT_decl_line } } */ +/* We want to check that both vari and varj have the same line +number. */ + +/* { dg-final { scan-assembler DW_TAG_variable\[^\\r\\n\]*\[\\r\\n\]+\[^\\r\\n\]*\vari\[^\\r\\n\]*DW_AT_name(\[^\\r\\n\]*\[\\r\\n\]+\[^\\r\\n\]*DW_AT_)*\[^\\r\\n\]*\[\\r\\n\]+\[^\\r\\n\]*\[^0-9a-fA-FxX](0xa|10)?\[^0-9a-fA-FxX]\[^\\r\\n\]*DW_AT_decl_line } } */ /* { dg-final { scan-assembler
Re: [i386, patch, RFC] HLE support in GCC
In this case, can we reverse this sentence and just emit lock xacquire for MEMMODEL_ACQUIRE and lock xrelease for MEMMODEL_RELEASE ? Do we need separate HLE_* defines or can we somehow recycle existing C++11 memmodel defines? No you absolutely can't. Transactions are quite different from a normal lock. There can be good reasons to have locks that never speculates (e.g. if they do some operation that always aborts) -Andi -- a...@linux.intel.com -- Speaking for myself only.
[PATCH 05/11] Make expand_location resolve to locus in main source file
Apparently, quite some places in the compiler (like the C/C++ preprocessor, the debug info machinery) expect expand_location to resolve to locations that are in the main source file, even if the token at stake comes from a macro that was defined in a header somewhere. Turning on -ftrack-macro-expansion by default was triggering a lot of failures (not necessarily related to diagnostics) because expand_location resolves to spelling locations instead. So I have changed expand_location to honour the initial expectation. In addition, I came up with the new expand_location_to_spelling_point used in diagnostic_build_prefix because the diagnostic system, on the other hand, wants to point to the location of the token where it was spelled, and then display the error context involving all the macro whose expansion led to that spelling point - if we are in the context of a macro expansion there. This seems to me like a reasonable balance. Tested and bootstrapped on x86_64-unknown-linux-gnu against trunk and whitnessed that a lot more tests were PASSing. Note that the bootstrap with -ftrack-macro-expansion exhibits other separate issues that are addressed in subsequent patches. This patch just fixes one class of problems. The patch does pass bootstrap with -ftrack-macro-expansion turned off, though. gcc/ * input.c (expand_location_1): New. Takes a parameter to choose whether to resolve the location to spelling or expansion point. Was factorized from ... (expand_location): ... here. (expand_location_to_spelling_point): New. Implemented in terms of expand_location_1. * diagnostic.c (diagnostic_build_prefix): Use the new expand_location_to_spelling_point instead of expand_location. --- gcc/diagnostic.c |2 +- gcc/input.c | 39 ++- gcc/input.h |9 + 3 files changed, 44 insertions(+), 6 deletions(-) diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c index a68d6ce..337328c 100644 --- a/gcc/diagnostic.c +++ b/gcc/diagnostic.c @@ -183,7 +183,7 @@ diagnostic_build_prefix (diagnostic_context *context, must-not-happen }; const char *text = _(diagnostic_kind_text[diagnostic-kind]); - expanded_location s = expand_location (diagnostic-location); + expanded_location s = expand_location_to_spelling_point (diagnostic-location); if (diagnostic-override_column) s.column = diagnostic-override_column; gcc_assert (diagnostic-kind DK_LAST_DIAGNOSTIC_KIND); diff --git a/gcc/input.c b/gcc/input.c index 4077f9e..dcd348b 100644 --- a/gcc/input.c +++ b/gcc/input.c @@ -32,16 +32,22 @@ struct line_maps *line_table; /* Expand the source location LOC into a human readable location. If LOC resolves to a builtin location, the file name of the readable - location is set to the string built-in. */ - -expanded_location -expand_location (source_location loc) + location is set to the string built-in. If EXPANSION_POINT_P is + TRUE and LOC is virtual, then it is resolved to the expansion + point of the involved macro. Otherwise, it is resolved to the + spelling location of the token. */ + +static expanded_location +expand_location_1 (source_location loc, + bool expansion_point_p) { expanded_location xloc; const struct line_map *map; loc = linemap_resolve_location (line_table, loc, - LRK_SPELLING_LOCATION, map); + expansion_point_p + ? LRK_MACRO_EXPANSION_POINT + : LRK_SPELLING_LOCATION, map); xloc = linemap_expand_location (line_table, map, loc); if (loc = BUILTINS_LOCATION) @@ -50,6 +56,29 @@ expand_location (source_location loc) return xloc; } +/* Expand the source location LOC into a human readable location. If + LOC is virtual, it resolves to the expansion point of the involved + macro. If LOC resolves to a builtin location, the file name of the + readable location is set to the string built-in. */ + +expanded_location +expand_location (source_location loc) +{ + return expand_location_1 (loc, /*expansion_point_p=*/true); +} + +/* Expand the source location LOC into a human readable location. If + LOC is virtual, it resolves to the expansion location of the + relevant macro. If LOC resolves to a builtin location, the file + name of the readable location is set to the string + built-in. */ + +expanded_location +expand_location_to_spelling_point (source_location loc) +{ + return expand_location_1 (loc, /*expansion_piont_p=*/false); +} + #define ONE_K 1024 #define ONE_M (ONE_K * ONE_K) diff --git a/gcc/input.h b/gcc/input.h index f2f3513..5838153 100644 --- a/gcc/input.h +++ b/gcc/input.h @@ -38,6 +38,7 @@ extern char builtins_location_check[(BUILTINS_LOCATION RESERVED_LOCATION_COUNT) ? 1 : -1]; extern expanded_location expand_location
[PATCH 06/11] Strip built-in loc from displayed expansion context
Now that diagnostics for tokens coming from macro expansions point to the spelling location of the relevant token (and then displays the context of the expansion), some ugly (not so seldom) corner cases can happen. When the relevant token is a built-in token (which means the location of that token is BUILTINS_LOCATION) the location prefix displayed to the user in the diagnostic line is the built-in:0:0 string. For instance: built-in:0:0: warning: conversion to 'float' alters 'int' constant value For the user, I think this is surprising and useless. A more user-friendly approach would be to refer to the first location that (in the reported macro expansion context) is for a location in real source code, like what is shown in the new test case gcc/testsuite/g++.dg/warn/Wconversion-real-integer2.C accompanying this patch. To do this, I am making the line-map module provide a new linemap_unwind_to_first_non_reserved_loc function that resolves a virtual location to the first spelling location that is in real source code. I am then using that facility in the diagnostics printing module and in the macro unwinder to avoid printing diagnostics lines that refer to the locations for built-ins or more generally for reserved locations. Note that when I start the dance of skipping a built-in location I also skip locations that are in system headers, because it turned out that a lot of those built-ins are actually used in system headers (e.g, #define INT_MAX __INT_MAX__ where __INT_MAX__ is a built-in). Besides the user-friendliness gain, this patch allows a number of regression tests to PASS unchanged with and without -ftrack-macro-expansion. Tested and bootstrapped on x86_64-unknown-linux-gnu against trunk. Note that the bootstrap with -ftrack-macro-expansion exhibits other separate issues that are addressed in subsequent patches. This patch just fixes one class of problems. The patch does pass bootstrap with -ftrack-macro-expansion turned off, though. libcpp/ * include/line-map.h (linemap_unwind_toward_expansion): Fix typo in comment. (linemap_unwind_to_first_non_reserved_loc): Declare new function. * line-map.c (linemap_unwind_to_first_non_reserved_loc): Define new function. gcc/ * input.c (expand_location_1): When expanding to spelling location in a context of a macro expansion, skip reserved system header locations. Update comments. * tree-diagnostic.c (maybe_unwind_expanded_macro_loc): Likewise. gcc/testsuite/ * g++.dg/warn/Wconversion-real-integer2.C: New test. * g++.dg/warn/Wconversion-real-integer-3.C: Likewise. * g++.dg/warn/conversion-real-integer-3.h: New header used by the new test above. --- gcc/input.c| 38 --- .../g++.dg/warn/Wconversion-real-integer-3.C | 20 .../g++.dg/warn/Wconversion-real-integer2.C| 33 + .../g++.dg/warn/conversion-real-integer-3.h|3 + gcc/tree-diagnostic.c | 12 + libcpp/include/line-map.h | 20 - libcpp/line-map.c | 49 7 files changed, 167 insertions(+), 8 deletions(-) create mode 100644 gcc/testsuite/g++.dg/warn/Wconversion-real-integer-3.C create mode 100644 gcc/testsuite/g++.dg/warn/Wconversion-real-integer2.C create mode 100644 gcc/testsuite/g++.dg/warn/conversion-real-integer-3.h diff --git a/gcc/input.c b/gcc/input.c index dcd348b..8edf05b 100644 --- a/gcc/input.c +++ b/gcc/input.c @@ -35,7 +35,14 @@ struct line_maps *line_table; location is set to the string built-in. If EXPANSION_POINT_P is TRUE and LOC is virtual, then it is resolved to the expansion point of the involved macro. Otherwise, it is resolved to the - spelling location of the token. */ + spelling location of the token. + + When resolving to the spelling location of the token, if the + resulting location is for a built-in location (that is, it has no + associated line/column) in the context of a macro expansion, the + returned location is the first one (while unwinding the macro + location towards its expansion point) that is in real source + code. */ static expanded_location expand_location_1 (source_location loc, @@ -43,12 +50,29 @@ expand_location_1 (source_location loc, { expanded_location xloc; const struct line_map *map; - - loc = linemap_resolve_location (line_table, loc, - expansion_point_p - ? LRK_MACRO_EXPANSION_POINT - : LRK_SPELLING_LOCATION, map); - xloc = linemap_expand_location (line_table, map, loc); + enum location_resolution_kind lrk = LRK_MACRO_EXPANSION_POINT; + + memset (xloc, 0, sizeof (xloc)); + + if (loc = RESERVED_LOCATION_COUNT) +{ + if (!expansion_point_p) + { +
Re: [i386, patch, RFC] HLE support in GCC
On Tue, Apr 10, 2012 at 9:34 AM, Andi Kleen a...@firstfloor.org wrote: H.J. Lu hjl.to...@gmail.com writes: So, to emit HLE prefix, it is possible to do: int foo2 (int *p, int oldv, int newv) { __atomic_compare_exchange_n (p, oldv, newv, 0, __ATOMIC_ACQUIRE | __ATOMIC_USE_HLE, __ATOMIC_ACQUIRE); return oldv; } This is wrong since HLE ACQUIRE/RELEASE has nothing to do with C++ atomic acquire/release. You can have HLE RELEASE with C++ atomic acquire. It makes sense to combine the two. On x86 C++ atomic acquire/release means the compiler cannot move references outside. For HLE we really want the same, otherwise some of the memory references inside the transaction may not be transactional. So I think HLE_ACQUIRE should imply C++ acquire and HLE_RELEASE imply C++ release. If it is the case, can we generate HLE RELEASE/ACQUIRE prefix automatically for C++ atomics via -mhle command line option. Then you don't need to modify the source codes to enable HLE support. -- H.J.
Re: [i386, patch, RFC] HLE support in GCC
If it is the case, can we generate HLE RELEASE/ACQUIRE prefix automatically for C++ atomics via -mhle command line option. Then you don't need to modify the source codes to enable HLE support. No, for HLE someone needs to decide whether HLE is beneficial for a given lock. There are cases where it is not or even wrong (e.g. if there is no matching unlock) -Andi -- a...@linux.intel.com -- Speaking for myself only.
Re: [C++ Patch] for c++/52465
2012/4/7 Jason Merrill ja...@redhat.com: On 04/07/2012 11:37 AM, Fabien Chêne wrote: Perhaps it is more correct like that, in cp_parser_set_decl_spec_type ? Even that seems late. Why not just return the target decl from cp_parser_class_name? Ah yes, that's slightly better. (I've kept the NULL check in strip_using_decl, it seems safer to me. Just tell me if you prefer not) Tested x86_64-unkown-linux-gnu. OK for trunk and 4.7 ? gcc/testsuite/ChangeLog 2012-03-08 Fabien Chêne fab...@gcc.gnu.org PR c++/52465 * g++.dg/lookup/using52.C: New. gcc/cp/ChangeLog 2012-03-08 Fabien Chêne fab...@gcc.gnu.org PR c++/52465 * decl.c (grokdeclarator): Call strip_using_decl. * parser.c (cp_parser_class_name): Call strip_using_decl and perform some checks on the target decl. * name-lookup.c (strip_using_decl): Returns NULL_TREE if the decl to be stripped is NULL_TREE. (qualify_lookup): Call strip_using_decl and perform some checks on the target decl. -- Fabien 52465_3.patch Description: Binary data
Re: [PATCH] ARM: Use different linker path for hardfloat ABI
On Tue, Apr 3, 2012 at 6:56 PM, Joseph S. Myers jos...@codesourcery.com wrote: (e) Existing practice for cases that do use different dynamic linkers is to use a separate library directory, not just dynamic linker name, as in lib32 and lib64 for MIPS or libx32 for x32; it's certainly a lot easier to make two sets of libraries work in parallel if you have separate library directories like that. So it would seem more appropriate to define a directory libhf for ARM (meaning you need a binutils patch as well to handle that directory, I think), and these different Debian-style names could be implemented separately in a multiarch patch if someone submits one that properly accounts for my review comments on previous patch versions (failure to produce such a fixed patch being why Debian multiarch directory support has not got into GCC so far). The thread doesn't seem to be wrapping up, instead it appears to go in circles :-) As a glibc maintainer, when it comes to this issue, I am guided by: (1) This is a distribution problem and the solution needs to come from a consensus among the distributions. (2) The gcc/glibc community should listen to the distributions. The distributions have the most experience when it comes to whole-system issues. I certainly don't have that experience. Unfortunately *I* give the distributions a B- or C+ for communication. Please make it easy for me to give you an A. It is exceedingly difficult for me to review solutions that span multiple patches, emails, mailing lists, and communities. The best way to avoid rehashing old problems is to document the design and get sign off from the interested parties. If I see uncoordinated and conflicting responses from the distributions... I get worried. Is there a proposal on a wiki that begins to summarize the suggestions and solution? Cheers, Carlos.
[PATCH 07/11] Fix -Wuninitialized for -ftrack-macro-expansion
Besides the warning emitted by warn_uninit, this function wants to hint the user at where the uninitialized variable was declared, for cases where the declaration location is outside the current function. Now that expand_location expands to the location that is in the main source file (even for -ftrack-macro-expansion) the hinting part was not working well for cases where the variable is declared in a macro (outside the function), which is then expanded in the function. So I had to adjust warn_uninit a little bit to make it consider the spelling location of the variable declaration. I have fixed the test gcc.dg/cpp/pragma-diagnostic-2.c on which I believe gcc shouldn't emit any error. Here is the new output on that test: =~= gcc.dg/cpp/pragma-diagnostic-2.c: In function 'g': gcc.dg/cpp/pragma-diagnostic-2.c:10:5: warning: 'a' is used uninitialized in this function [-Wuninitialized] gcc.dg/cpp/pragma-diagnostic-2.c:9:7: note: 'a' was declared here gcc.dg/cpp/pragma-diagnostic-2.c:9:7: note: in expansion of macro 'CODE_WITH_WARNING' gcc.dg/cpp/pragma-diagnostic-2.c:17:3: note: expanded from here gcc.dg/cpp/pragma-diagnostic-2.c: In function 'h': gcc.dg/cpp/pragma-diagnostic-2.c:10:5: warning: 'a' is used uninitialized in this function [-Wuninitialized] gcc.dg/cpp/pragma-diagnostic-2.c:9:7: note: 'a' was declared here gcc.dg/cpp/pragma-diagnostic-2.c:9:7: note: in expansion of macro 'CODE_WITH_WARNING' gcc.dg/cpp/pragma-diagnostic-2.c:27:3: note: expanded from here =~= Tested and bootstrapped on x86_64-unknown-linux-gnu against trunk. Note that the bootstrap with -ftrack-macro-expansion turned on exhibits other separate issues that are addressed in subsequent patches. This patch just fixes one class of problems. The patch does pass bootstrap with -ftrack-macro-expansion turned off, though. gcc/ * tree-ssa.c (warn_uninit): Use the spelling location of the variable declaration. Use linemap_location_before_p for source locations. gcc/testsuite/ * gcc.dg/cpp/pragma-diagnostic-2.c: Fix this. --- gcc/testsuite/gcc.dg/cpp/pragma-diagnostic-2.c | 10 ++ gcc/tree-ssa.c | 15 +++ 2 files changed, 13 insertions(+), 12 deletions(-) diff --git a/gcc/testsuite/gcc.dg/cpp/pragma-diagnostic-2.c b/gcc/testsuite/gcc.dg/cpp/pragma-diagnostic-2.c index 7ab95b0..57f3f01 100644 --- a/gcc/testsuite/gcc.dg/cpp/pragma-diagnostic-2.c +++ b/gcc/testsuite/gcc.dg/cpp/pragma-diagnostic-2.c @@ -6,8 +6,8 @@ void f (unsigned); #define CODE_WITH_WARNING \ - int a; /* { dg-message expansion|declared here } */ \ - f (a) /* { dg-message expansion } */ + int a; /* { dg-message was declared here } */ \ + f (a) /* { dg-warning used uninitialized } */ #pragma GCC diagnostic ignored -Wuninitialized @@ -26,9 +26,3 @@ h (void) { CODE_WITH_WARNING; /* { dg-message expanded } */ } - -/* - { dg-message some warnings being treated as errors {target *-*-*} 0 } -*/ - -/* { dg-error uninitialized { target *-*-* } { 10 } } */ diff --git a/gcc/tree-ssa.c b/gcc/tree-ssa.c index 08f908f..977f0f7 100644 --- a/gcc/tree-ssa.c +++ b/gcc/tree-ssa.c @@ -1584,7 +1584,7 @@ warn_uninit (enum opt_code wc, tree t, tree expr, tree var, const char *gmsgid, void *data) { gimple context = (gimple) data; - location_t location; + location_t location, cfun_loc; expanded_location xloc, floc; if (!ssa_undefined_value_p (t)) @@ -1602,8 +1602,12 @@ warn_uninit (enum opt_code wc, tree t, location = (context != NULL gimple_has_location (context)) ? gimple_location (context) : DECL_SOURCE_LOCATION (var); + location = linemap_resolve_location (line_table, location, + LRK_SPELLING_LOCATION, + NULL); + cfun_loc = DECL_SOURCE_LOCATION (cfun-decl); xloc = expand_location (location); - floc = expand_location (DECL_SOURCE_LOCATION (cfun-decl)); + floc = expand_location (cfun_loc); if (warning_at (location, wc, gmsgid, expr)) { TREE_NO_WARNING (expr) = 1; @@ -1611,8 +1615,11 @@ warn_uninit (enum opt_code wc, tree t, if (location == DECL_SOURCE_LOCATION (var)) return; if (xloc.file != floc.file - || xloc.line floc.line - || xloc.line LOCATION_LINE (cfun-function_end_locus)) + || linemap_location_before_p (line_table, + location, cfun_loc) + || linemap_location_before_p (line_table, + cfun-function_end_locus, + location)) inform (DECL_SOURCE_LOCATION (var), %qD was declared here, var); } } -- Dodji
Re: [PATCH] Caret diagnostics
On 10 April 2012 21:41, Manuel López-Ibáñez lopeziba...@gmail.com wrote: On 10 April 2012 21:31, Jason Merrill ja...@redhat.com wrote: On 04/10/2012 12:46 PM, Manuel López-Ibáñez wrote: + max_width = context-caret_max_width; + if (max_width= 0) + max_width = INT_MAX; I don't think we need the test here; diagnostic_set_caret_max_width should make sure caret_max_width is set sensibly. Otherwise, yes, thanks. It was just a bit of defensive programming, since nothing stops anyone to set the value directly. But I will remove it. On 04/10/2012 02:52 PM, Gabriel Dos Reis wrote: There is a novelty in this new version that I don't think we discussed before: automatic expansion of tabs to 8 hard space characters. That number should not be hardcoded as there is no reason to believe a tab character always expands to 8 space characters. You should check the environment first; if not present the default expansion number should be a symbolic constant as opposed to a magic number sprinkled all over the places. Hmm. I don't know if there's any reliable way to query tab stops; the it termcap/terminfo capability tells you what it was originally (presumably 8), but it might have changed. No idea either, but it is in fact easier to print tabs a single spaces. This simplifies the code a lot, as pointed by Gabriel. So if you also prefer the simpler version, I am fine with committing that one (it also saves space in the output). A new version of the patch. It removes the special handling of tabs. And abstracts out the adjusting of the line as Gabriel suggested. I am not sure what else could be abstracted out. The function is much shorter than other functions in diagnostic.c. This patch builds and I am running a full boostrap+check now. OK if it passes? Cheers, Manuel. 2012-04-05 Manuel López-Ibáñez m...@gcc.gnu.org PR 24985 libstdc++-v3/ * testsuite/lib/prune.exp: Handle caret. libmudflap/ * testsuite/lib/libmudflap.exp: Handle caret. gcc/ * diagnostic.h (show_caret): Declare. (caret_max_width): Declare. (diagnostic_show_locus): Declare. * diagnostic.c (diagnostic_initialize): Initialize to false. (diagnostic_show_locus): New. (diagnostic_report_diagnostic): Call it. (getenv_columns): New. (adjust_line): New. (diagnostic_set_caret_max_width): New. * input.c (read_line): New. (location_get_source_line): New. * input.h (location_get_source_line): Declare. * toplev.c (general_init): Initialize show_caret from options. * dwarf2out.c (gen_producer_string): Handle fdiagnostics-show-caret. * opts.c (common_handle_option): Likewise. * pretty-print.h (pp_get_prefix): New. (pp_base_get_prefix): New. * common.opt (fdiagnostics-show-caret): New option. * doc/invoke.texi (fdiagnostics-show-caret): Document it. testsuite/ * lib/prune.exp: Add -fno-diagnostics-show-caret. caret-diagnostics-20120411.diff Description: Binary data
Re: Ada testcase CR line endings
On Apr 10, 2012, at 5:24 AM, Robert Dewar wrote: On 4/10/2012 1:35 AM, Mike Stump wrote: So, I'd like to change all the ada testcases to use normal unix line endings. testsuite/gnat.dg/taft_type2_pkg.ads is an example if one such file, any objections? As long as the test is not about line endings this seems fine. The C testsuite does indeed have such testcases, from the last time I converted endings in bulk for the C testsuite. The Ada testsuite appears to not have any such testcase.
Re: [PATCH] Caret diagnostics
On Apr 10, 2012, at 11:52 AM, Gabriel Dos Reis wrote: On Tue, Apr 10, 2012 at 11:46 AM, Manuel López-Ibáñez lopeziba...@gmail.com wrote: On 10 April 2012 00:28, Jason Merrill ja...@redhat.com wrote: On 04/09/2012 04:01 PM, Manuel López-Ibáńez wrote: * It uses the default cutoff as max_width, whatever it is (as controlled by -fmessage-length). * It uses the pretty-printer. The text cannot (should not) wrap because we still print only max_width chars at most. Hmm, I think if pp_line_cutoff is 0 and we're on a terminal, we still want to use COLUMNS to limit how much of the source we print. Like this? There is a novelty in this new version that I don't think we discussed before: automatic expansion of tabs to 8 hard space characters. That number should not be hardcoded as there is no reason to believe a tab character always expands to 8 space characters. You should check the environment first; if not present the default expansion number should be a symbolic constant as opposed to a magic number sprinkled all over the places. I would also encourage you to introduce more abstraction to reduce the size of diagnostic_show_locus. Knobs have a cost. Having 1000s of command line options doesn't really make a product better. Our job is to provide what we must and omit what we can. You argue for a knob that no user has asked for or expressed a need for. I think we should have a much higher bar than this. The reality is, tabs are 8, period. There was a time went editors managed tabs stops by changing column into which a tab would move, this ability taken from devices called typewriters which had a little metal bit that you moved around where you wanted tab stops. The days of there devices called typewriters are over, we no longer cater to them. Today, editors can manage what the key called TAB does, and its semantics are disconnected from the file save format. This file format can be set at 8, while the edit function is indent 2, 4, 8 or as complex as what emacs does with c-mode. It is a bad idea to have a file format tab is other than 8, and we should encourage all but legacy people to convert to 8. We do this by having a wiki entry and explains the use of pr, and how to set the file save format to 8, if we need to. For legacy people, they don't _want_ a new compiler, pretty much by definition. So, I'd argue for a symbol constant, but, a constant nonetheless. If a large user _had_ serious needs for something other than 8, _they_ can change the compiler to: #define TAB_COLS (getenv(TABS) ? atoi (genenv(TABS)) : 8) if they want. I don't think we should have a knob for this.
Re: [PATCH] Caret diagnostics
On Tue, Apr 10, 2012 at 5:23 PM, Mike Stump mikest...@comcast.net wrote: On Apr 10, 2012, at 11:52 AM, Gabriel Dos Reis wrote: On Tue, Apr 10, 2012 at 11:46 AM, Manuel López-Ibáñez lopeziba...@gmail.com wrote: On 10 April 2012 00:28, Jason Merrill ja...@redhat.com wrote: On 04/09/2012 04:01 PM, Manuel López-Ibáńez wrote: * It uses the default cutoff as max_width, whatever it is (as controlled by -fmessage-length). * It uses the pretty-printer. The text cannot (should not) wrap because we still print only max_width chars at most. Hmm, I think if pp_line_cutoff is 0 and we're on a terminal, we still want to use COLUMNS to limit how much of the source we print. Like this? There is a novelty in this new version that I don't think we discussed before: automatic expansion of tabs to 8 hard space characters. That number should not be hardcoded as there is no reason to believe a tab character always expands to 8 space characters. You should check the environment first; if not present the default expansion number should be a symbolic constant as opposed to a magic number sprinkled all over the places. I would also encourage you to introduce more abstraction to reduce the size of diagnostic_show_locus. Knobs have a cost. Having 1000s of command line options doesn't really make a product better. Our job is to provide what we must and omit what we can. You argue for a knob that no user has asked for or expressed a need for. I think we should have a much higher bar than this. The reality is, tabs are 8, period. There was a time went editors managed tabs stops by changing column into which a tab would move, this ability taken from devices called typewriters which had a little metal bit that you moved around where you wanted tab stops. The days of there devices called typewriters are over, we no longer cater to them. Today, editors can manage what the key called TAB does, and its semantics are disconnected from the file save format. This file format can be set at 8, while the edit function is indent 2, 4, 8 or as complex as what emacs does with c-mode. It is a bad idea to have a file format tab is other than 8, and we should encourage all but legacy people to convert to 8. We do this by having a wiki entry and explains the use of pr, and how to set the file save format to 8, if we need to. For legacy people, they don't _want_ a new compiler, pretty much by definition. So, I'd argue for a symbol constant, but, a constant nonetheless. If a large user _had_ serious needs for something other than 8, _they_ can change the compiler to: #define TAB_COLS (getenv(TABS) ? atoi (genenv(TABS)) : 8) if they want. I don't think we should have a knob for this. This is fully mystifying to me and I still do not understand what it is all about after reading it 8 times. - Gaby
Re: execute permissions in ada
On Apr 10, 2012, at 1:45, Mike Stump mikest...@comcast.net wrote: I assume that was a friendly, please feel free to fix *.ads as well. Yes, sorry for the terse email. I wasn't quite sure if your message implied there was only an issue with *.adb or not and wasn't in a position to check at that time. -Geert
Re: [PATCH] Caret diagnostics
On Apr 10, 2012, at 4:37 PM, Gabriel Dos Reis wrote: You should check the environment first I don't think we should have a knob for this. This is fully mystifying to me and I still do not understand what it is all about after reading it 8 times. Let me rephrase, I disagree, we should not check the environment first. I assume, you meant checking the environment, git getenv, for some named variable to control the expansion of tabs. Yes?
Re: [PATCH] Caret diagnostics
On 04/10/2012 04:33 PM, Manuel López-Ibáñez wrote: A new version of the patch. It removes the special handling of tabs. And abstracts out the adjusting of the line as Gabriel suggested. I am not sure what else could be abstracted out. The function is much shorter than other functions in diagnostic.c. This patch builds and I am running a full boostrap+check now. OK if it passes? OK. Jason
Re: [C++ Patch] for c++/52465
On 04/10/2012 04:23 PM, Fabien Chêne wrote: Ah yes, that's slightly better. (I've kept the NULL check in strip_using_decl, it seems safer to me. Just tell me if you prefer not) Tested x86_64-unkown-linux-gnu. OK for trunk and 4.7 ? 2012-03-08 Fabien Chênefab...@gcc.gnu.org PR c++/52465 * decl.c (grokdeclarator): Call strip_using_decl. * parser.c (cp_parser_class_name): Call strip_using_decl and perform some checks on the target decl. * name-lookup.c (strip_using_decl): Returns NULL_TREE if the decl to be stripped is NULL_TREE. (qualify_lookup): Call strip_using_decl and perform some checks on the target decl. Your ChangeLog needs to be adjusted. :) The patch is OK, thanks! Jason
Re: [PATCH] ARM: Use different linker path for hardfloat ABI
On Tuesday 10 April 2012 01:17:36 Adam Conrad wrote: On Tue, Apr 10, 2012 at 12:01:57AM -0400, Mike Frysinger wrote: On Monday 09 April 2012 19:31:40 Adam Conrad wrote: I realize that most people can't see past their own use case to understand why a unique location for linkers is helpful, useful, and important for some other people's use cases, but you either didn't read or chose to ignore why using multilib paths just plain doesn't scale past a single base architecture, and why that's a problem for people who aren't you. and as already stated, the proposed paths here, free of multiarch subpaths, satisfy the requirements that you've put forth Like I said, then, you didn't actually read or understand why proposing multilib paths doesn't work. You realize conceptually, I hope, that there's no guarantee of uniqueness in lib/lib64/lib32/libsf/libhf once you cross the base CPU architecture boundary, right? i don't see this as a problem Sure, I said that /libhf/ld-linux.so.3 would *accidentally* work for us right now, due to sheer luck, and you're running with that as saying that we clearly have no problem here worth solving. my point was: it works today and has no clashes. that satisfies the omg we have to ship something $now and satisfies the requirement that only the Debian people are putting forth (and has already been violated by many targets): the ldso must be unique across all arches/multilibs. When the next architecture clashes with linkers on another (hint: it almost certainly will), do we get to argue about this all over again in six months, instead of codifying a new and saner practice now? i don't buy that it'll happen that soon (since ldso's don't get generated quickly), but that is certainly plenty of time for the Debian project to attempt to convince everyone else that multiarch isn't a waste of time. and does so without holding up moving forward with a unified arm hardfloat abi. -mike signature.asc Description: This is a digitally signed message part.
Re: [PATCH] ARM: Use different linker path for hardfloat ABI
On Tuesday 10 April 2012 12:46:49 Michael Edwards wrote: That way I can grandfather in binaries with ABI-ignorant hard-coded library paths, and still handle ISA variants. The extranoise might be neon, or ssse3 aren't ISA variants handled already by glibc ? that's what the hwcaps stuff does -- you can put optimized versions in ISA-specific subdirs of the normal lib paths. glibc will look for those first before falling back to the common libs. -mike signature.asc Description: This is a digitally signed message part.