Re: [C++ PATCH, RFC] PR c++/63959, continued
On 6 March 2015 at 23:58, Jason Merrill ja...@redhat.com wrote: On 01/19/2015 11:28 AM, Ville Voutilainen wrote: * class.c (check_field_decls): If any field is volatile, make the class type have complex copy/move operations. Discussion on the cxx-abi list points out that this breaks ABI compatibility between C and C++, and is therefore unacceptable. So.. just to clarify that we're on the same page.. making volatile-qualified types non-trivially copyable is ok, but making wrappers of volatile-qualified types non-trivially copyable is not ok? That's easily doable implementation-wise, but it makes me question the overall approach and its consistency. Is there a way to indicate that from the point of C++ a type is not trivially copyable without changing the complexness of a copy operation, ultimately without changing ABI?
RE: [PATCH] Fix a typo in the german language translation
On Wed, 4 Mar 2015, Bernd Edlinger wrote: On Wed, 4 Mar 2015 17:52:11, Jospeh S. Myers wrote: On Wed, 4 Mar 2015, Bernd Edlinger wrote: if nobody objects, I'd like to fix this in the german language translation file locally. I have already informed the german language All local changes to .po files get lost every time new .po files are imported from upstream. So no such changes should be made. Translation teams are welcome to upload new translations for .pot files from past releases (I presume the TP supports that), if e.g. you want a fixed version imported on 4.9 branch. I would have been happy to fix this message in gcc-5, because it comes quite often, and it just terribly wrong. Is there any chance to upload a corrected translation before gcc-5 gets released? See r221191. As and when the TP provides new translations they'll be committed to GCC. -- Joseph S. Myers jos...@codesourcery.com
[PATCH, libgfortran] PR 65200 Handle EPERM as EACCES
Hi, the attached patch makes GFortran handle errno=EPERM in the same way as EACCES (this affects only when no ACTION= specifier is used in the OPEN statement and opening the file in read-write mode fails). Ostensibly the distinction is roughly - EACCES: Insufficient privilege. E.g. do the same as root and this should work. - EPERM: Operation not permitted. That is, no matter the privilege level, this isn't allowed. E.g. truncating a file opened with O_APPEND and such. However, it seems at least the Linux NFSv4 client returns EPERM when trying to open a read-only file in read-write mode, whereas local filesystems and the NFSv3 client return EACCES. So it seems reasonable to check for EPERM as well in the same situations one checks for EACCES. The patch also contains a doc snippet to explain how GFortran behaves when no ACTION= is given when opening a file. This brings me to a standards interpretation question: When no ACTION= is given, F2008 9.5.6.4 (ACCESS= specifier in OPEN statement) says: If this specifier is omitted, the default value is processor dependent. So does this mean that the current GFortran behavior is allowed, or must the behavior be exactly as with one of the possible ACTION= specifiers (READ, WRITE, READWRITE)? The patch as is causes gfortran.dg/open_errors.f90 to fail, due to changed error messages. I'm a bit unsure of to fix this, as now strerror* is used to generate part of the message, and thus the message can be different on different targets, and even dependent on the locale settings. Just checking for the fixed part of the error message doesn't seem that useful in this case; should the entire testcase just be removed? Regtested on x86_64-unknown-linux-gnu, Ok for trunk (with some decision wrt open_errors.f90)? gcc/fortran ChangeLog: 2015-03-07 Janne Blomqvist j...@gcc.gnu.org PR libfortran/65200 * gfortran.texi: Document behavior when opening files without explicit ACTION= specifier. libgfortran ChangeLog: 2015-03-07 Janne Blomqvist j...@gcc.gnu.org PR libfortran/65200 * io/open.c (new_unit): Use gf_strerror rather than hardcoding error messages for different errno values. * io/unix.c (regular_file2): Handle EPERM in addition to EACCES. gcc/testsuite ChangeLog: 2015-03-07 Janne Blomqvist j...@gcc.gnu.org PR libfortran/65200 * gfortran.dg/open_new_segv.f90: Fix error message pattern. -- Janne Blomqvist diff --git a/gcc/fortran/gfortran.texi b/gcc/fortran/gfortran.texi index 300b8b8..34999db 100644 --- a/gcc/fortran/gfortran.texi +++ b/gcc/fortran/gfortran.texi @@ -1139,6 +1139,7 @@ might in some way or another become visible to the programmer. * Internal representation of LOGICAL variables:: * Thread-safety of the runtime library:: * Data consistency and durability:: +* Files opened without an explicit ACTION= specifier:: @end menu @@ -1328,6 +1329,22 @@ releasing @code{fcntl} file locks, if the server supports them, will also force cache validation and flushing dirty data and metadata. +@node Files opened without an explicit ACTION= specifier +@section Files opened without an explicit ACTION= specifier +@cindex open, action + +The Fortran standard says that if an @code{OPEN} statement is executed +without an explicit @code{ACTION=} specifier, the default value is +processor dependent. GNU Fortran behaves as follows: + +@enumerate +@item Attempt to open the file with @code{ACTION='READWRITE'} +@item If that fails, try to open with @code{ACTION='READ'} +@item If that fails, try to open with @code{ACTION='WRITE'} +@item If that fails, generate an error +@end enumerate + + @c - @c Extensions @c - diff --git a/gcc/testsuite/gfortran.dg/open_new_segv.f90 b/gcc/testsuite/gfortran.dg/open_new_segv.f90 index fe548f1..d9f2871 100644 --- a/gcc/testsuite/gfortran.dg/open_new_segv.f90 +++ b/gcc/testsuite/gfortran.dg/open_new_segv.f90 @@ -1,5 +1,5 @@ ! { dg-do run } -! { dg-shouldfail File already exists } +! { dg-shouldfail Cannot open file } ! PR 64770 SIGSEGV when trying to open an existing file with status=new program pr64770 implicit none @@ -10,5 +10,5 @@ program pr64770 status=new) end program pr64770 ! { dg-output At line 10 of file.* } -! { dg-output Fortran runtime error: File .pr64770test.dat. already exists } +! { dg-output Fortran runtime error: Cannot open file .pr64770test.dat.: } ! { dg-final { remote_file build delete pr64770test.dat } } diff --git a/libgfortran/io/open.c b/libgfortran/io/open.c index 0a2fda9..4654de2 100644 --- a/libgfortran/io/open.c +++ b/libgfortran/io/open.c @@ -502,34 +502,12 @@ new_unit (st_parameter_open *opp, gfc_unit *u, unit_flags * flags) s = open_external (opp, flags); if (s == NULL) { + char errbuf[256]; char *path = fc_strdup (opp-file, opp-file_len); - size_t msglen = opp-file_len + 51;
Re: #pragma GCC unroll support
On Thu, 5 Mar 2015, Mike Stump wrote: On Jan 30, 2015, at 8:27 AM, Mike Stump mikest...@comcast.net wrote: On Jan 30, 2015, at 7:49 AM, Joseph Myers jos...@codesourcery.com wrote: Use error_at, and %u directly in the format. Done. Ping? I don't see any sign of https://gcc.gnu.org/ml/gcc-patches/2015-01/msg02735.html having been addressed. -- Joseph S. Myers jos...@codesourcery.com
Re: RFC: PATCHES: Properly handle reference to protected data on x86
On Fri, Mar 06, 2015 at 05:04:56AM -0800, H.J. Lu wrote: On Thu, Mar 5, 2015 at 8:19 PM, Alan Modra amo...@gmail.com wrote: On Wed, Mar 04, 2015 at 03:26:10PM -0800, H.J. Lu wrote: Protected symbol means that it can't be pre-emptied. It doesn't mean its address won't be external. This is true for pointer to protected function. With copy relocation, address of protected data defined in the shared library may also be external. We only know that for sure at run-time. Here are patches for glibc, binutils and GCC to handle it properly. Any comments? I'd like to see this pass some more tests. For example reference in non-PIC exe to var x protected visibility definition of x in libA protected visibility definition of x in libB I suspect you don't have this case correct, but congratulations if you do! Assuming libA is first on the breadth first search for libraries, then exe and libA ought to use the same x, but libB have its own x. I believe my new testcases on hjl/pr17711 branch at https://sourceware.org/git/?p=glibc.git;a=summary covers those and they work correctly. The test that I see in commit 9ea148bb does not. Please notice that I'm asking you about two protected definitions in the libraries, not one protected and one with default visibility. -- Alan Modra Australia Development Lab, IBM
Re: RFC: PATCHES: Properly handle reference to protected data on x86
On Fri, Mar 6, 2015 at 4:15 PM, Alan Modra amo...@gmail.com wrote: On Fri, Mar 06, 2015 at 05:04:56AM -0800, H.J. Lu wrote: On Thu, Mar 5, 2015 at 8:19 PM, Alan Modra amo...@gmail.com wrote: On Wed, Mar 04, 2015 at 03:26:10PM -0800, H.J. Lu wrote: Protected symbol means that it can't be pre-emptied. It doesn't mean its address won't be external. This is true for pointer to protected function. With copy relocation, address of protected data defined in the shared library may also be external. We only know that for sure at run-time. Here are patches for glibc, binutils and GCC to handle it properly. Any comments? I'd like to see this pass some more tests. For example reference in non-PIC exe to var x protected visibility definition of x in libA protected visibility definition of x in libB I suspect you don't have this case correct, but congratulations if you do! Assuming libA is first on the breadth first search for libraries, then exe and libA ought to use the same x, but libB have its own x. I believe my new testcases on hjl/pr17711 branch at https://sourceware.org/git/?p=glibc.git;a=summary covers those and they work correctly. The test that I see in commit 9ea148bb does not. Please notice that I'm asking you about two protected definitions in the libraries, not one protected and one with default visibility. My patch extends my work for protected function pointer to cover protected data. There is no reason why it shouldn't work. I updated the testcase to commit 88af4693bd32e3658206b73c121de9a36c510f6b Please check it out. -- H.J.
Re: RFC: PATCHES: Properly handle reference to protected data on x86
On Thu, Mar 5, 2015 at 6:39 AM, H.J. Lu hjl.to...@gmail.com wrote: On Wed, Mar 4, 2015 at 3:26 PM, H.J. Lu hjl.to...@gmail.com wrote: Protected symbol means that it can't be pre-emptied. It doesn't mean its address won't be external. This is true for pointer to protected function. With copy relocation, address of protected data defined in the shared library may also be external. We only know that for sure at run-time. Here are patches for glibc, binutils and GCC to handle it properly. Any comments? This is the binutils patch I checked in. It basically reverted the change for https://sourceware.org/bugzilla/show_bug.cgi?id=15228 on x86. Copy relocations against protected symbols should work. -- H.J. --- bfd/ PR ld/pr15228 PR ld/pr17709 * elf-bfd.h (elf_backend_data): Add extern_protected_data. * elf32-i386.c (elf_backend_extern_protected_data): New. Defined to 1. * elf64-x86-64.c (elf_backend_extern_protected_data): Likewise. * elflink.c (_bfd_elf_adjust_dynamic_copy): Don't error on copy relocs against protected symbols if extern_protected_data is true. (_bfd_elf_symbol_refs_local_p): Don't return true on protected non-function symbols if extern_protected_data is true. * elfxx-target.h (elf_backend_extern_protected_data): New. Default to 0. (elfNN_bed): Initialize extern_protected_data with elf_backend_extern_protected_data. ld/testsuite/ PR ld/pr15228 PR ld/pr17709 * ld-i386/i386.exp (i386tests): Add a test for PR ld/17709. * ld-i386/pr17709-nacl.rd: New file. * ld-i386/pr17709.rd: Likewise. * ld-i386/pr17709a.s: Likewise. * ld-i386/pr17709b.s: Likewise. * ld-i386/protected3.d: Updated. * ld-i386/protected3.s: Likewise. * ld-x86-64/pr17709-nacl.rd: New file. * ld-x86-64/pr17709.rd: Likewise. * ld-x86-64/pr17709a.s: Likewise. * ld-x86-64/pr17709b.s: Likewise. * ld-x86-64/protected3.d: Updated. * ld-x86-64/protected3.s: Likewise. * ld-x86-64/x86-64.exp (x86_64tests): Add a test for PR ld/17709. I back ported it to 2.25 branch. -- H.J.
Re: Strengthen ICF hash
On Wed, 4 Mar 2015, Jan Hubicka wrote: But yes, I do not think we need to significantly slow things down in current implemetnation when this feature does not work. Just perhaps keep track of changes that introduce host specific stuff and not introduce these when it is easily avoidable. Known LTO host dependencies are listed in bug 41526, so I suggest listing any new such host dependencies there (and noting there if anything listed is in fact now fixed). -- Joseph S. Myers jos...@codesourcery.com
RE: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields
Hi, On Thu, 5 Mar 2015 16:36:48, Richard Biener wrote: On Thu, Mar 5, 2015 at 4:05 PM, Bernd Edlinger bernd.edlin...@hotmail.de wrote: every access is split in 4 QImode accesses. but that is as expected, because the structure is byte aligned. No, it is not expected because the CPU can handle unaligned SImode reads/writes just fine (even if not as an atomic operation). The C++ memory model allows an SImode read to s.y (-fstrict-volatile-bitfields would, as well, but the CPU doesn't guarantee atomic operation here) Hmm, well. I understand. However this is the normal way how the non-strict-volatile-bitfields work. But I can probably work out a patch that enables the strict-volatile-bitfields to generate un-aligned SImode accesses when necessary on !STRICT_ALIGNMENT targets. Now, I checked again the ARM port, and I am a bit surprised, because with gcc 4.9.0 this structure gets 4 QImode accesses which is correct, but with recent trunk (gcc version 5.0.0 20150301) I get SImode accesses, but the structure is not aligned and the compiler cant possibly know how the memory will be aligned. Something must have changed in be meantime, but it wasn't by me. IIRC the field mode in this example was QImode but now it seems to be SImode. struct s { unsigned int y:31; } __attribute__((packed)); int test (volatile struct s* x) { x-y=0x7FFF; return x-y; } So what would you think of this change at strict_volatile_bitfield_p? diff -up expmed.c.jj expmed.c --- expmed.c.jj 2015-01-16 11:20:40.0 +0100 +++ expmed.c 2015-03-06 10:07:14.362383274 +0100 @@ -472,9 +472,9 @@ strict_volatile_bitfield_p (rtx op0, uns return false; /* Check for cases of unaligned fields that must be split. */ - if (bitnum % BITS_PER_UNIT + bitsize modesize - || (STRICT_ALIGNMENT - bitnum % GET_MODE_ALIGNMENT (fieldmode) + bitsize modesize)) + if (bitnum % (STRICT_ALIGNMENT ? modesize : BITS_PER_UNIT) + + bitsize modesize + || (STRICT_ALIGNMENT MEM_ALIGN (op0) modesize)) return false; /* Check for cases where the C++ memory model applies. */ of course this is incomplete, and needs special handling for !STRICT_ALIGNMENT in the strict-volatile-bitfields code path later. Thanks Bernd.
Re: libgomp nvptx plugin: rework initialisation and support the proposed load/unload hooks (was: Merge current set of OpenACC changes from gomp-4_0-branch)
On Thu, Feb 26, 2015 at 20:25:11 +0300, Ilya Verbin wrote: On Wed, Feb 25, 2015 at 10:36:08 +0100, Thomas Schwinge wrote: Julian Brown jul...@codesourcery.com wrote: This is a version of the previously-posted patch to rework initialisation and support the proposed load/unload hooks, merged to gomp4 branch and tested alongside the two patches (from Currently the 'struct gomp_memory_mapping' contains 'lock' and 'is_initialized'. Do you still need them? Or we can use gomp_device_descr::lock and is_initialized instead? If yes, then we can replace the gomp_memory_mapping structure with a splay_tree, as it was before the OpenACC merge. Ping? Thanks, -- Ilya
Another tweak to c-ada-spec.c
This is an old pasto spotted by Jonathan with the help of Coverity. Tested on x86_64-suse-linux, applied on the mainline and 4.9 branch. 2015-03-06 Eric Botcazou ebotca...@adacore.com Jonathan Wakely jwakely@gmail.com * c-ada-spec.c (dump_ada_double_name): Fix pasto. 2015-03-06 Eric Botcazou ebotca...@adacore.com * g++.dg/other/dump-ada-spec-3.C: Remove include and adjust. -- Eric BotcazouIndex: c-family/c-ada-spec.c === --- c-family/c-ada-spec.c (revision 221213) +++ c-family/c-ada-spec.c (working copy) @@ -1390,7 +1390,7 @@ dump_ada_double_name (pretty_printer *bu pp_underscore (buffer); - if (DECL_NAME (t1)) + if (DECL_NAME (t2)) pp_ada_tree_identifier (buffer, DECL_NAME (t2), t2, false); else { Index: testsuite/g++.dg/other/dump-ada-spec-3.C === --- testsuite/g++.dg/other/dump-ada-spec-3.C (revision 221202) +++ testsuite/g++.dg/other/dump-ada-spec-3.C (working copy) @@ -1,8 +1,6 @@ /* { dg-do compile } */ /* { dg-options -fdump-ada-spec } */ -#include iostream - using namespace std; class Base { @@ -14,7 +12,6 @@ class Base { }; void Base::Primitive () { - cout C++ Primitivethis-My_V \n; } Base::Base () {
Re: [PATCH] PR rtl-optimization/32219: optimizer causees wrong code in pic/hidden/weak symbol checking
On 05/03/15 15:28, Ramana Radhakrishnan wrote: diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 7bf5b4d..777230e 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -6392,14 +6392,8 @@ arm_set_default_type_attributes (tree type) static bool arm_function_in_section_p (tree decl, section *section) { - /* We can only be certain about functions defined in the same - compilation unit. */ - if (!TREE_STATIC (decl)) -return false; - - /* Make sure that SYMBOL always binds to the definition in this - compilation unit. */ - if (!targetm.binds_local_p (decl)) + /* We can only be certain about the prevailing symbol definition. */ + if (!decl_binds_to_current_def_p (decl)) return false; /* If DECL_SECTION_NAME is set, assume it is trustworthy. */ Sorry to have missed this - I've also been traveling recently which has made it harder with patch traffic - this is OK if no regressions. Please apply with an appropriate Changelog. regressions Ramana Hi, Committed as r221220 and fixed ChangeLog entry in r221234. Sorry for claiming the patch for myself. Kind regards, Alex
Re: [PATCH] Fix PR ipa/65318
Yes - but I said that having an alias should have the same effect as the MEM_REF wrapping we do in LTO (to not barf on stmt verification if symbol merging merges an int and a float for example). Yep, though alias is bit different from LTO - in LTO we replace the decl in place, while alias still has its oriignal type only its constructor is not necessarily cooperating. (We sort of do that with -fmerge-constants for CONST_DECL and in constant pool vairables for ages) Concerning callers handling mismatches, the VIEW_CONVERT_EXPR thing seems valid thing to do for all uses except for fold_const_aggregate_ref_1. So perhaps we can just inline rest of get_symbol_constant_value in there and document that get_symbol_constant_value returns value in correct type. Or am I missing something obvious? Yeah, that looks good. Note that we can as well change all callers of get_symbol_constant_value to use fold_const_aggregate_ref, no? So reduce the number of APIs. Yes, that seems like good idea. As I recall, we used to have get_symbol_constant_value first and introduced aggregate folding later. Honza Richard. Thanks! Honza + val = fold_unary (VIEW_CONVERT_EXPR, TREE_TYPE (sym), val); +} + return val; +} else return NULL_TREE; }
Re: [PATCH] ICF: move readonly decision for variables to the right place
On Sun, Mar 1, 2015 at 11:53 PM, Jan Hubicka hubi...@ucw.cz wrote: Hi, this is a variant of patch I commited. It takes care to load the constructor to memory in sem_variable::equals and donot touch it earlier. I also made sem_variable::parse to skip volatile and reigster variables. Bootstrapped/regtested x86_64-linux, lto-bootstrapped with -fmerge-all-constants comitted. Honza 2015-02-28 Martin Liska mli...@suse.cz Jan Hubicka hubi...@ucw.cz * ipa-icf.c (sem_variable::equals): Improve debug output; get variable constructor. (sem_variable::parse): Do not filter out too early; give up on volatile and register vars. (sem_item_optimizer::filter_removed_items): Filter out nonreadonly variables. * ipa-icf.h (sem_variable::init): Do not set ctor. (sem_variable::ctor): Remove. This caused: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65334 -- H.J.
[PATCH] optionally disable global check
Hi all! Currently !ASAN_GLOBALS disables red-zones for global variables but keeps their checks. This simple patch disables these checks too. --Marat gcc/ChangeLog: 2015-01-22 Marat Zakirov m.zaki...@samsung.com * asan.c (instrument_derefs): asan-globals=0 disable instrumentation. gcc/testsuite/ChangeLog: 2015-01-22 Marat Zakirov m.zaki...@samsung.com * c-c++-common/asan/no-asan-check-glob.c: New test. diff --git a/gcc/asan.c b/gcc/asan.c index be28ede..c331f67 100644 --- a/gcc/asan.c +++ b/gcc/asan.c @@ -1809,6 +1809,8 @@ instrument_derefs (gimple_stmt_iterator *iter, tree t, { if (DECL_THREAD_LOCAL_P (inner)) return; + if (!ASAN_GLOBALS is_global_var (inner)) +return; if (!TREE_STATIC (inner)) { /* Automatic vars in the current function will be always diff --git a/gcc/testsuite/c-c++-common/asan/no-asan-check-glob.c b/gcc/testsuite/c-c++-common/asan/no-asan-check-glob.c new file mode 100644 index 000..a1b1410 --- /dev/null +++ b/gcc/testsuite/c-c++-common/asan/no-asan-check-glob.c @@ -0,0 +1,13 @@ +/* { dg-options --param asan-globals=0 -fdump-tree-asan } */ +/* { dg-do compile } */ +/* { dg-skip-if { *-*-* } { -O0 } { * } } */ + +extern int a; + +int foo () +{ + return a; +} + +/* { dg-final { scan-tree-dump-times ASAN_CHECK 0 asan1 } } */ +/* { dg-final { cleanup-tree-dump asan1 } } */
Re: [PATCH] optionally disable global check
On 03/06/2015 05:23 PM, Marat Zakirov wrote: Hi all! Currently !ASAN_GLOBALS disables red-zones for global variables but keeps their checks. This simple patch disables these checks too. --Marat Jakub, Given that this may be considered a bugfix for --param asan-globals, perhaps this is ok for 5.0? 2015-01-22 Marat Zakirov m.zaki...@samsung.com I think you'll want to update dates here. * asan.c (instrument_derefs): asan-globals=0 disable instrumentation. s/asan-globals=0 disable instrumentation/Disable instrumentation on asan-globals=0./g -Y
RE: [PATCH, stage1] Move insns without introducing new temporaries in loop2_invariant
From: Richard Biener [mailto:richard.guent...@gmail.com] Sent: Thursday, March 05, 2015 7:12 PM loop header start of loop body //stuff (set (reg 128) (const_int 0)) //other stuff end of loop body becomes: (set (reg 129) (const_int 0)) loop header start of loop body //stuff (set (reg 128) (reg 128)) //other stuff end of loop body Why doesn't copy-propagation clean this up? It's run after loop2. Actually cprop3 is what makes the situation worse in this case as it will copy the constant that is set outside the loop in the mov that is in the loop. In the case or PR64616 the constant is a symbol_ref which makes it a memory access so it propagates the memory access in the loop, making the load executed many times. Note that as I said in the intro this bug is also solved by [1] which is the first thing that goes wrong for this example. That being said, loop invariant pass ought to simply move instructions if it can safely do so. [1] https://gcc.gnu.org/ml/gcc-patches/2015-02/msg00933.html Best regards, Thomas
RE: [PATCH] Fix PR rtl-optimization/65067
Hi, On Fri, 6 Mar 2015 10:09:30, Eric Botcazou wrote: Hmm. As you also modify the no-strict-volatile-bitfield path I'm not sure you don't regress the case where EP_insv can work on memory. I agree that simplifying the strict-volatile-bitfield path to extract the memory within strict-volatile-bitfield constraints to a reg and then using the regular path is a good thing. Eric? Even if the no-strict-volatile-bitfield path isn't touched, I don't understand @@ -976,7 +976,7 @@ store_bit_field (rtx str_rtx, unsigned HOST_WIDE_I /* Storing any naturally aligned field can be done with a simple store. For targets that support fast unaligned memory, any naturally sized, unit aligned field can be done directly. */ - if (simple_mem_bitfield_p (str_rtx, bitsize, bitnum, fieldmode)) + if (bitsize == GET_MODE_BITSIZE (fieldmode)) { str_rtx = adjust_bitfield_address (str_rtx, fieldmode, bitnum / BITS_PER_UNIT); { - rtx result; - /* Extraction of a full MODE1 value can be done with a load as long as the field is on a byte boundary and is sufficiently aligned. */ - if (simple_mem_bitfield_p (str_rtx, bitsize, bitnum, mode1)) - result = adjust_bitfield_address (str_rtx, mode1, - bitnum / BITS_PER_UNIT); - else + if (bitsize == GET_MODE_BITSIZE(mode1)) { - str_rtx = narrow_bit_field_mem (str_rtx, mode1, bitsize, bitnum, - bitnum); - result = extract_fixed_bit_field_1 (mode, str_rtx, bitsize, bitnum, - target, unsignedp); + rtx result = adjust_bitfield_address (str_rtx, mode1, + bitnum / BITS_PER_UNIT); adjust_bitfield_address takes (bitnum / BITS_PER_UNIT) so don't you need to make sure that bitnum % BITS_PER_UNIT == 0? I know it because strict_volatile_bitfield_p checks this: /* Check for cases of unaligned fields that must be split. */ if (bitnum % BITS_PER_UNIT + bitsize modesize Therefore, if bitsize == modesize, we know that bitnum % BITS_PER_UNIT must be zero. In any case, the comments are now totally out of sync. -- Eric Botcazou
Re: [C++ Patch] PR 65323
... in case, I think we can as well apply the below, a tad simpler. Also passes testing. Paolo. Index: decl.c === --- decl.c (revision 221230) +++ decl.c (working copy) @@ -11227,11 +11227,8 @@ check_default_argument (tree decl, tree arg, tsubs LOOKUP_IMPLICIT); --cp_unevaluated_operand; - if (warn_zero_as_null_pointer_constant - TYPE_PTR_OR_PTRMEM_P (decl_type) - null_ptr_cst_p (arg) - (complain tf_warning) - maybe_warn_zero_as_null_pointer_constant (arg, input_location)) + if (TYPE_PTR_OR_PTRMEM_P (decl_type) + null_ptr_cst_p (arg)) return nullptr_node; /* [dcl.fct.default]
[PATCH] Check all python modules in contrib/dg-extract-results.sh
Hi, I have a box, where only python-minimal-2.7 seems to be installed and I have tried to use make -k -j2; but this results in ./dg-extract-results.sh Traceback (most recent call last): File ./dg-extract-results.py, line 13, in module import io ImportError: No module named io This causes contrib/test_summary to be unable to show the results. I would like to fix this with the attached patch: The idea is to import all python modules that the dg-extract-results.py will use in the python version check, and fall back to the default implementation when any modules are missing. Is that OK for trunk? Thanks Bernd. patch-dg-extract-results.diff Description: Binary data
Re: [PATCH] PR target/65240, Fix Power{7,8} insn constraint issue with -O3 -ffast-math
On Thu, Mar 5, 2015 at 9:06 PM, Michael Meissner meiss...@linux.vnet.ibm.com wrote: This patch fixes PR 65240, which was a latent bug that was introduced when I added the -mupper-regs support to the PowerPC compiler. In the PowerPC compiler, if you use -ffast-math, the compiler allows floating point constants in move RTLs until register allocation time, in order to allow the optimizations that replace division by a constant with multiplication by the reciprocal. Don't we perform this optimization on the GIMPLE/tree level already? Richard. If the register being loaded up is a traditional Altivec register, the load will fail, since there is no D-FORM (register+offset) addressing mode for the traditional altivec registers. I had had a define_split to try and handle this situation, but it didn't work all of the time. I rewrote the insns, so there is an UNSPEC in it to prevent over optimization. I have done bootstrap builds and make check's on both a big endian power7 system, and a little endian power8 system with no regressions in the tests. I have also built both power7 and power8 spec 2006 (v5) versions, and Spec built. I have run the Spec floating point tests on a power7 box, and all of the tests performed about at the same speed as before the bug fix was done, with the exception of gromacs, which is 4% faster with the fix. Looking at the static instruction counts, the big change is to fold the addi instruction into the load of the constant if it is loading the value to a traditional floating point register (previously, it would load up the constant with addis/addi and then do an indirect load). I do see other instructions change, including a few more floating point loads, but with the exception of gromacs, most of the changes are in the noise level. Is this ok to install into the trunk? At present, it is not needed for GCC 4.9, since that branch does not have the -mupper-regs support. [gcc] 2015-03-05 Michael Meissner meiss...@linux.vnet.ibm.com PR target/65240 * config/rs6000/rs6000.md (UNSPEC_FUSION_FP_CONSTANT): New unspec. (Vsx_load): New mode attribute to give appropriate VSX load instruction. (move floating point constant define_split): Use an UNSPEC to make sure that the load of the constant is kept as a memory address, instead of being converted back into a move of the constant. (load_mode_constant): Likewise. [gcc/testsuite] 2015-03-05 Michael Meissner meiss...@linux.vnet.ibm.com PR target/65240 * gcc.target/powerpc/pr65240.c: New test. -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
[patch] Obvious fix for typo in gcc/real.c
Tested x86_64-linux. Committed as obvious. commit d16f0a5adc3cc29ce96e606b94c129989b4ca6ff Author: Jonathan Wakely jwak...@redhat.com Date: Fri Mar 6 11:04:31 2015 + * real.c (real_from_string): Fix typo in assertion. diff --git a/gcc/real.c b/gcc/real.c index 1d1d510..43f124e 100644 --- a/gcc/real.c +++ b/gcc/real.c @@ -2075,7 +2075,7 @@ real_from_string (REAL_VALUE_TYPE *r, const char *str) because the hex digits used in real_from_mpfr did not start with a digit 8 to f, but the exponent bounds above should have avoided underflow or overflow. */ - gcc_assert (r-cl = rvc_normal); + gcc_assert (r-cl == rvc_normal); /* Set a sticky bit if mpfr_strtofr was inexact. */ r-sig[0] |= inexact; mpfr_clear (m);
[PATCH] Fix intelmic-mkoffload (was: [PATCH 1/4] Add mkoffload for Intel MIC)
Hi, I've found a bug in intelmic-mkoffload, it works only when the path to gcc is absolute or relative, but doesn't work when it's specified in the PATH env var. Here is the fix, I've got a piece of code from gcc/config/nvptx/mkoffload.c. Regtested on x86_64-linux and i686-linux. Ok for trunk? gcc/ * config/i386/intelmic-mkoffload.c: Include intelmic-offload.h instead of libgomp-plugin.h. (find_target_compiler): Support a case when the path to gcc is specified in the PATH env var, so COLLECT_GCC doesn't contain a path. (generate_host_descr_file): Use GOMP_DEVICE_INTEL_MIC from intelmic-offload.h instead of OFFLOAD_TARGET_TYPE_INTEL_MIC from libgomp-plugin.h. (main): Use GCC_INSTALL_NAME as target_driver_name. * config/i386/t-intelmic (CFLAGS-mkoffload.o): Add GCC_INSTALL_NAME define. (mkoffload.o): Remove obsolete include path and defines. (mkoffload$(exeext)): Use $(LINKER) instead of $(COMPILER). diff --git a/gcc/config/i386/intelmic-mkoffload.c b/gcc/config/i386/intelmic-mkoffload.c index e6394e9..f93007c 100644 --- a/gcc/config/i386/intelmic-mkoffload.c +++ b/gcc/config/i386/intelmic-mkoffload.c @@ -22,13 +22,13 @@ #include config.h #include libgen.h -#include libgomp-plugin.h #include system.h #include coretypes.h #include obstack.h #include intl.h #include diagnostic.h #include collect-utils.h +#include intelmic-offload.h const char tool_name[] = intelmic mkoffload; @@ -158,10 +158,21 @@ find_target_compiler (const char *name) bool found = false; char **paths = NULL; unsigned n_paths, i; - const char *collect_path = dirname (ASTRDUP (getenv (COLLECT_GCC))); - size_t len = strlen (collect_path) + 1 + strlen (name) + 1; - char *target_compiler = XNEWVEC (char, len); - sprintf (target_compiler, %s/%s, collect_path, name); + char *target_compiler; + const char *collect_gcc = getenv (COLLECT_GCC); + const char *gcc_path = dirname (ASTRDUP (collect_gcc)); + const char *gcc_exec = basename (ASTRDUP (collect_gcc)); + + if (strcmp (gcc_exec, collect_gcc) == 0) +{ + /* collect_gcc has no path, so it was found in PATH. Make sure we also +find accel-gcc in PATH. */ + target_compiler = XDUPVEC (char, name, strlen (name) + 1); + found = true; + goto out; +} + + target_compiler = concat (gcc_path, /, name, NULL); if (access_check (target_compiler, X_OK) == 0) { found = true; @@ -171,7 +182,7 @@ find_target_compiler (const char *name) n_paths = parse_env_var (getenv (COMPILER_PATH), paths); for (i = 0; i n_paths; i++) { - len = strlen (paths[i]) + 1 + strlen (name) + 1; + size_t len = strlen (paths[i]) + 1 + strlen (name) + 1; target_compiler = XRESIZEVEC (char, target_compiler, len); sprintf (target_compiler, %s/%s, paths[i], name); if (access_check (target_compiler, X_OK) == 0) @@ -346,7 +357,7 @@ generate_host_descr_file (const char *host_compiler) init (void)\n {\n GOMP_offload_register (__OFFLOAD_TABLE__, %d, __offload_target_data);\n - }\n, OFFLOAD_TARGET_TYPE_INTEL_MIC); + }\n, GOMP_DEVICE_INTEL_MIC); fclose (src_file); unsigned new_argc = 0; @@ -483,8 +494,7 @@ main (int argc, char **argv) if (!host_compiler) fatal_error (input_location, COLLECT_GCC must be set); - const char *target_driver_name -= DEFAULT_REAL_TARGET_MACHINE -accel- DEFAULT_TARGET_MACHINE -gcc; + const char *target_driver_name = GCC_INSTALL_NAME; char *target_compiler = find_target_compiler (target_driver_name); if (target_compiler == NULL) fatal_error (input_location, offload compiler %s not found, diff --git a/gcc/config/i386/t-intelmic b/gcc/config/i386/t-intelmic index 287460e..9de4b76 100644 --- a/gcc/config/i386/t-intelmic +++ b/gcc/config/i386/t-intelmic @@ -1,10 +1,10 @@ +CFLAGS-mkoffload.o += $(DRIVER_DEFINES) -DGCC_INSTALL_NAME=\$(GCC_INSTALL_NAME)\ + mkoffload.o: $(srcdir)/config/i386/intelmic-mkoffload.c - $(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \ - -I$(srcdir)/../libgomp \ - -DDEFAULT_REAL_TARGET_MACHINE=\$(real_target_noncanonical)\ \ - -DDEFAULT_TARGET_MACHINE=\$(target_noncanonical)\ \ - $ $(OUTPUT_OPTION) + $(COMPILE) $ + $(POSTCOMPILE) ALL_HOST_OBJS += mkoffload.o mkoffload$(exeext): mkoffload.o collect-utils.o libcommon-target.a $(LIBIBERTY) $(LIBDEPS) - $(COMPILER) -o $@ mkoffload.o collect-utils.o libcommon-target.a $(LIBIBERTY) $(LIBS) + $(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) -o $@ \ + mkoffload.o collect-utils.o libcommon-target.a $(LIBIBERTY) $(LIBS) Thanks, -- Ilya
Re: [PATCH] [RTL] Relax CSE check to set REG_EQUAL notes.
On Wed, Mar 04, 2015 at 11:09:14AM +, Alex Velenko wrote: I prefer adding notes in CSE instead of adding additional checks in Jump2 and, if any, other passes, as I think it is more uniform solution and allows single point fix. Downside is having more notes. The other downside is that every other RTL producer will have to add these notes as well, or passes that look at this info will have to look at the insns themselves anyway. Not a good trade-off in my opinion. Just add a simple helper function to do these checks (isn't there one already)? Segher
Re: [PATCH] Improve memory usage on PR64928
On Thu, 5 Mar 2015, Richard Biener wrote: I am currently testing the following patch to reduce peak memory usage of the out-of-SSA phase for the testcase in the PR. The issue is (as usual) big live and SSA conflict graph memory use. This side tackles live info and frees livein before computing the conflict graph (which only needs liveout). Bootstrap and regtest ongoing on x86_64-unknown-linux-gnu. I'm currently giving the following slightly re-factored patch final testing. Peak memory usage drops from 7.6GB to 5.8GB for the small testcase in the PR. When I remove SSA coalescing from the picture (I have a followup patch to do that, by simply limiting the amount of coalescing we do) then memory usage peaks at 7.6GB again which is because CSE and DF blow up later on the testcase. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk. Richard. 2015-03-06 Richard Biener rguent...@suse.de PR middle-end/64928 * tree-ssa-live.h (struct tree_live_info_d): Add livein_obstack and liveout_obstack members. (calculate_live_on_exit): Remove. (calculate_live_ranges): Change declaration. * tree-ssa-live.c (liveness_bitmap_obstack): Remove global var. (new_tree_live_info): Adjust. (calculate_live_ranges): Delete livein when not wanted. (calculate_live_ranges): Do not initialize liveness_bitmap_obstack. Deal with partly deleted live info. (loe_visit_block): Remove temporary bitmap by using bitmap_ior_and_compl_into. (live_worklist): Adjust accordingly. (calculate_live_on_exit): Make static. * tree-ssa-coalesce.c (coalesce_ssa_name): Tell calculate_live_ranges we do not need livein. Index: gcc/tree-ssa-live.c === --- gcc/tree-ssa-live.c (revision 221207) +++ gcc/tree-ssa-live.c (working copy) @@ -973,13 +973,6 @@ remove_unused_locals (void) timevar_pop (TV_REMOVE_UNUSED); } -/* Obstack for globale liveness info bitmaps. We don't want to put these - on the default obstack because these bitmaps can grow quite large and - we'll hold on to all that memory until the end of the compiler run. - As a bonus, delete_tree_live_info can destroy all the bitmaps by just - releasing the whole obstack. */ -static bitmap_obstack liveness_bitmap_obstack; - /* Allocate and return a new live range information object base on MAP. */ static tree_live_info_p @@ -992,18 +985,20 @@ new_tree_live_info (var_map map) live-map = map; live-num_blocks = last_basic_block_for_fn (cfun); + bitmap_obstack_initialize (live-livein_obstack); + bitmap_obstack_initialize (live-liveout_obstack); live-livein = XNEWVEC (bitmap_head, last_basic_block_for_fn (cfun)); FOR_EACH_BB_FN (bb, cfun) -bitmap_initialize (live-livein[bb-index], liveness_bitmap_obstack); +bitmap_initialize (live-livein[bb-index], live-livein_obstack); live-liveout = XNEWVEC (bitmap_head, last_basic_block_for_fn (cfun)); FOR_EACH_BB_FN (bb, cfun) -bitmap_initialize (live-liveout[bb-index], liveness_bitmap_obstack); +bitmap_initialize (live-liveout[bb-index], live-liveout_obstack); live-work_stack = XNEWVEC (int, last_basic_block_for_fn (cfun)); live-stack_top = live-work_stack; - live-global = BITMAP_ALLOC (liveness_bitmap_obstack); + live-global = BITMAP_ALLOC (NULL); return live; } @@ -1013,10 +1008,18 @@ new_tree_live_info (var_map map) void delete_tree_live_info (tree_live_info_p live) { - bitmap_obstack_release (liveness_bitmap_obstack); + if (live-livein) +{ + bitmap_obstack_release (live-livein_obstack); + free (live-livein); +} + if (live-liveout) +{ + bitmap_obstack_release (live-liveout_obstack); + free (live-liveout); +} + BITMAP_FREE (live-global); free (live-work_stack); - free (live-liveout); - free (live-livein); free (live); } @@ -1027,8 +1030,7 @@ delete_tree_live_info (tree_live_info_p it each time. */ static void -loe_visit_block (tree_live_info_p live, basic_block bb, sbitmap visited, -bitmap tmp) +loe_visit_block (tree_live_info_p live, basic_block bb, sbitmap visited) { edge e; bool change; @@ -1046,17 +1048,17 @@ loe_visit_block (tree_live_info_p live, pred_bb = e-src; if (pred_bb == ENTRY_BLOCK_PTR_FOR_FN (cfun)) continue; - /* TMP is variables live-on-entry from BB that aren't defined in the + /* Variables live-on-entry from BB that aren't defined in the predecessor block. This should be the live on entry vars to pred. Note that liveout is the DEFs in a block while live on entry is -being calculated. */ - bitmap_and_compl (tmp, loe, live-liveout[pred_bb-index]); - - /* Add these bits to live-on-entry for the pred. if there are any +being calculated. +Add these bits to live-on-entry for the pred. if
Re: [PATCH] Fix PR63743: Incorrect ordering of operands in sequence of commutative operations
On Fri, Mar 6, 2015 at 11:31 AM, Thomas Preud'homme thomas.preudho...@arm.com wrote: Hi, Improved canonization after r216728 causes GCC to more often generate poor code due to suboptimal ordering of operand of commutative libcalls. Indeed, if one of the operands of a commutative operation is the result of a previous operation, both being implemented by libcall, the wrong ordering of the operands in the second operation can lead to extra mov. Consider the following case on softfloat targets: double test1 (double x, double y) { return x * (x + y); } If x + y is put in the operand using the same register as the result of the libcall for x + y then no mov is generated, otherwise mov is needed. The following happens on arm softfloat with the right ordering: bl __aeabi_dadd ldr r2, [sp] ldr r3, [sp, #4] /* r0, r1 are reused from the return values of the __aeabi_dadd libcall. */ bl __aeabi_dmul With the wrong ordering one gets: bl __aeabi_dadd mov r2, r0 mov r3, r1 ldr r0, [sp] ldr r1, [sp, #4] bl __aeabi_dmul This patch extend the patch written by Yuri Rumyantsev in r219646 to also deal with the case of only one of the operand being the result of an operation. ChangeLog entries are as follows: *** gcc/ChangeLog *** 2015-03-05 Thomas Preud'homme thomas.preudho...@arm.com PR tree-optimization/63743 * cfgexpand.c (reorder_operands): Also reorder if only second operand had its definition forwarded by TER. *** gcc/testsuite/ChangeLog *** 2015-03-05 Thomas Preud'homme thomas.preudho...@arm.com PR tree-optimization/63743 * gcc.dg/pr63743.c: New test. diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c index 7dfe1f6..4fbc037 100644 --- a/gcc/cfgexpand.c +++ b/gcc/cfgexpand.c @@ -5117,13 +5117,11 @@ reorder_operands (basic_block bb) continue; /* Swap operands if the second one is more expensive. */ def0 = get_gimple_for_ssa_name (op0); - if (!def0) - continue; def1 = get_gimple_for_ssa_name (op1); if (!def1) continue; swap = false; - if (lattice[gimple_uid (def1)] lattice[gimple_uid (def0)]) + if (!def0 || lattice[gimple_uid (def1)] lattice[gimple_uid (def0)]) swap = true; if (swap) { @@ -5132,7 +5130,7 @@ reorder_operands (basic_block bb) fprintf (dump_file, Swap operands in stmt:\n); print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM); fprintf (dump_file, Cost left opnd=%d, right opnd=%d\n, - lattice[gimple_uid (def0)], + def0 ? lattice[gimple_uid (def0)] : 0, lattice[gimple_uid (def1)]); } swap_ssa_operands (stmt, gimple_assign_rhs1_ptr (stmt), diff --git a/gcc/testsuite/gcc.dg/pr63743.c b/gcc/testsuite/gcc.dg/pr63743.c new file mode 100644 index 000..87254ed --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr63743.c @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-options -O1 -fdump-rtl-expand-details } */ + +double +libcall_dep (double x, double y) +{ + return x * (x + y); +} + +/* { dg-final { scan-rtl-dump-times Swap operands 1 expand } } */ +/* { dg-final { cleanup-rtl-dump expand } } */ Testsuite was run in QEMU when compiled by an arm-none-eabi GCC cross-compiler targeting Cortex-M3 and a bootstrapped x86_64 native GCC without any regression. CSiBE sees a -0.5034% code size decrease on arm-none-eabi and a 0.0058% code size increase on x86_64-linux-gnu. Is it ok for trunk (since it fixes a code size regression in 5.0)? Ok. Thanks, Richard. Best regards, Thomas
Re: #pragma GCC unroll support
On 6 March 2015 at 02:31, Sandra Loosemore san...@codesourcery.com wrote: On 03/05/2015 04:12 PM, Mike Stump wrote: Ping? Just commenting on the documentation part: [] and a few coding style nits: +++ b/gcc/c-family/c-pragma.c @@ -1459,6 +1459,10 @@ init_pragma (void) cpp_register_deferred_pragma (parse_in, GCC, ivdep, PRAGMA_IVDEP, false, false); + if (!flag_preprocess_only) +cpp_register_deferred_pragma (parse_in, GCC, unroll, PRAGMA_UNROLL, false, + false); + overlong line (also for the IVDEP above) +++ b/gcc/c/c-parser.c +static void c_parser_while_statement (c_parser *, bool, unsigned short); +static void c_parser_do_statement (c_parser *, bool, unsigned short); +static void c_parser_for_statement (c_parser *, bool, unsigned short); since we're now a C++ app I would have added a default for the unsigned short unroll of = 0 Same for finish_while_stmt_cond, finish_do_stmt, finish_for_cond et al. In cp_parser_range_for() i take it you remember there is a //TODO I am attaching an unroll-5.C which might show that this does not seem to be implemented yet, IIUC gcc/loop-unroll.c::decide_unrolling() I'd put the if (loop-unroll == 1) {continue} earlier in the FOR_EACH_LOOP body (we're C++ nowadays) but maybe our optimizers are good enough to do that anyway (but i fear we're not up to that?). I did not see c/c++ tests for both !DIR$ UNROLL and !DIR$ IVDEP, fwiw. You seem to handle both placements proper, though. cheers, /* { dg-do compile } */ /* { dg-options -O2 -fdump-rtl-loop2_unroll -fdump-tree-cunrolli-details } */ #include string #include vector void bar(int); int j; void test1() { unsigned long m = j; const std::string s(abcdefgh); const std::vectorint v = {7, 6, 5, 4, 3, 2, 1, 0}; const std::size_t v_sz = 8; // { dg-final { scan-tree-dump 18:\[0-9\]*: note: loop turned into non-loop cunrolli } } #pragma GCC unroll sizeof(abcdefgh) * 4 for (auto ch : s) bar(ch); // { dg-final { scan-tree-dump 23:\[0-9\]*: note: loop turned into non-loop cunrolli } } #pragma GCC unroll v_sz for (auto i : v) bar(i); // { dg-final { scan-tree-dump 28:\[0-9\]*: note: loop turned into non-loop cunrolli } } #pragma GCC unroll 99 for (auto i : v) bar(i); } // { dg-final { scan-rtl-dump-not Unable to prove that the loop iterates constant times loop2_unroll } } // { dg-final { cleanup-tree-dump cunrolli } } // { dg-final { cleanup-rtl-dump loop2_unroll } }
[PATCH][RFC] Fix PR63155
This fixes PR63155 and reduces the memory usage at -O0 from reported 10GB (couldn't verify/update on my small box) to 350MB (still worse compared to 4.8 which needs only 50MB). It fixes this by no longer computing live info or building a conflict graph for coalescing of SSA names flowing over abnormal edges (which needs to succeed). Of course this also removes verification that this coalescing is valid (but computing this has quadratic cost). With this it turns ICEs into miscompiles. We could restrict verifying that we can perform abnormal coalescing to ENABLE_CHECKING (and I've wanted a verifier pass that can verify this multiple times to be able to catch what breaks it and not having to work back from out-of-SSA ICEing...). So any opinion on this patch welcome. Bootstrap and regtest running on x86_64-unknown-linux-gnu. Ok for trunk? ;) Thanks, Richard. 2015-03-06 Richard Biener rguent...@suse.de PR middle-end/63155 * tree-ssa-coalesce.c (attempt_coalesce): Handle graph being NULL. (coalesce_partitions): Split out abnormal coalescing to ... (perform_abnormal_coalescing): ... this function. (coalesce_ssa_name): Perform abnormal coalescing without computing live/conflict. Index: gcc/tree-ssa-coalesce.c === *** gcc/tree-ssa-coalesce.c (revision 221237) --- gcc/tree-ssa-coalesce.c (working copy) *** create_outofssa_var_map (coalesce_list_p *** 1121,1128 /* Attempt to coalesce ssa versions X and Y together using the partition !mapping in MAP and checking conflicts in GRAPH. Output any debug info to !DEBUG, if it is nun-NULL. */ static inline bool attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y, --- 1121,1128 /* Attempt to coalesce ssa versions X and Y together using the partition !mapping in MAP and checking conflicts in GRAPH if not NULL. !Output any debug info to DEBUG, if it is nun-NULL. */ static inline bool attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y, *** attempt_coalesce (var_map map, ssa_confl *** 1154,1160 fprintf (debug, [map: %d, %d] , p1, p2); ! if (!ssa_conflicts_test_p (graph, p1, p2)) { var1 = partition_to_var (map, p1); var2 = partition_to_var (map, p2); --- 1154,1161 fprintf (debug, [map: %d, %d] , p1, p2); ! if (!graph ! || !ssa_conflicts_test_p (graph, p1, p2)) { var1 = partition_to_var (map, p1); var2 = partition_to_var (map, p2); *** attempt_coalesce (var_map map, ssa_confl *** 1168,1177 /* z is the new combined partition. Remove the other partition from the list, and merge the conflicts. */ ! if (z == p1) ! ssa_conflicts_merge (graph, p1, p2); ! else ! ssa_conflicts_merge (graph, p2, p1); if (debug) fprintf (debug, : Success - %d\n, z); --- 1169,1181 /* z is the new combined partition. Remove the other partition from the list, and merge the conflicts. */ ! if (graph) ! { ! if (z == p1) ! ssa_conflicts_merge (graph, p1, p2); ! else ! ssa_conflicts_merge (graph, p2, p1); ! } if (debug) fprintf (debug, : Success - %d\n, z); *** attempt_coalesce (var_map map, ssa_confl *** 1185,1208 } ! /* Attempt to Coalesce partitions in MAP which occur in the list CL using !GRAPH. Debug output is sent to DEBUG if it is non-NULL. */ static void ! coalesce_partitions (var_map map, ssa_conflicts_p graph, coalesce_list_p cl, !FILE *debug) { - int x = 0, y = 0; - tree var1, var2; - int cost; basic_block bb; edge e; edge_iterator ei; - /* First, coalesce all the copies across abnormal edges. These are not placed - in the coalesce list because they do not need to be sorted, and simply - consume extra memory/compilation time in large programs. */ - FOR_EACH_BB_FN (bb, cfun) { FOR_EACH_EDGE (e, ei, bb-preds) --- 1189,1204 } ! /* Perform all abnormal coalescing on MAP. !Debug output is sent to DEBUG if it is non-NULL. */ static void ! perform_abnormal_coalescing (var_map map, FILE *debug) { basic_block bb; edge e; edge_iterator ei; FOR_EACH_BB_FN (bb, cfun) { FOR_EACH_EDGE (e, ei, bb-preds) *** coalesce_partitions (var_map map, ssa_co *** 1226,1236 if (debug) fprintf (debug, Abnormal coalesce: ); ! if (!attempt_coalesce (map, graph, v1, v2, debug)) fail_abnormal_edge_coalesce (v1, v2); } } } /* Now process the items in the coalesce list. */ --- 1222,1244 if (debug)
Re: [PATCH] PR target/65248: Copy relocation against protected symbol doesn't work
On Thu, Mar 05, 2015 at 05:31:51PM -0800, H.J. Lu wrote: Protected data symbol means that it can't be pre-emptied. It doesn't mean its address won't be external. This is true for pointer to protected function. With copy relocation, address of protected data defined in the shared library may also be external. We only know that for sure at run-time. TARGET_BINDS_LOCAL_P should return false on protected data symbol. OK for trunk? Thanks. H.J. --- PR target/65248 * output.h (default_binds_local_p_2): New. * varasm.c (default_binds_local_p_2): Renamed to ... (default_binds_local_p_3): This. Don't return true on protected data symbol if protected data may be external. (default_binds_local_p): Use default_binds_local_p_3. (default_binds_local_p_1): Likewise. (default_binds_local_p_2): New. * config/i386/i386.c (TARGET_BINDS_LOCAL_P): Replace darwin_binds_local_p with default_binds_local_p_2. Here is the updated patch with testcases. Tested on Linux/x86. OK for trunk? Thanks. H.J. gcc/ PR target/65248 * output.h (default_binds_local_p_2): New. * varasm.c (default_binds_local_p_2): Renamed to ... (default_binds_local_p_3): This. Don't return true on protected data symbol if protected data may be external. (default_binds_local_p): Use default_binds_local_p_3. (default_binds_local_p_1): Likewise. (default_binds_local_p_2): New. * config/i386/i386.c (TARGET_BINDS_LOCAL_P): Replace darwin_binds_local_p with default_binds_local_p_2. gcc/testsuite/ PR target/65248 * gcc.target/i386/pr65248-1.c: New file. * gcc.target/i386/pr65248-2.c: Likewise. * gcc.target/i386/pr65248-3.c: Likewise. * gcc.target/i386/pr65248-4.c: Likewise. --- gcc/config/i386/i386.c| 3 +++ gcc/output.h | 1 + gcc/testsuite/gcc.target/i386/pr65248-1.c | 17 + gcc/testsuite/gcc.target/i386/pr65248-2.c | 17 + gcc/testsuite/gcc.target/i386/pr65248-3.c | 17 + gcc/testsuite/gcc.target/i386/pr65248-4.c | 17 + gcc/varasm.c | 18 +++--- 7 files changed, 87 insertions(+), 3 deletions(-) create mode 100644 gcc/testsuite/gcc.target/i386/pr65248-1.c create mode 100644 gcc/testsuite/gcc.target/i386/pr65248-2.c create mode 100644 gcc/testsuite/gcc.target/i386/pr65248-3.c create mode 100644 gcc/testsuite/gcc.target/i386/pr65248-4.c diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index ab8f03a..41a487a 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -51878,6 +51878,9 @@ ix86_initialize_bounds (tree var, tree lb, tree ub, tree *stmts) #if TARGET_MACHO #undef TARGET_BINDS_LOCAL_P #define TARGET_BINDS_LOCAL_P darwin_binds_local_p +#else +#undef TARGET_BINDS_LOCAL_P +#define TARGET_BINDS_LOCAL_P default_binds_local_p_2 #endif #if TARGET_DLLIMPORT_DECL_ATTRIBUTES #undef TARGET_BINDS_LOCAL_P diff --git a/gcc/output.h b/gcc/output.h index 217d979..53e47d0 100644 --- a/gcc/output.h +++ b/gcc/output.h @@ -586,6 +586,7 @@ extern void default_asm_output_anchor (rtx); extern bool default_use_anchors_for_symbol_p (const_rtx); extern bool default_binds_local_p (const_tree); extern bool default_binds_local_p_1 (const_tree, int); +extern bool default_binds_local_p_2 (const_tree); extern void default_globalize_label (FILE *, const char *); extern void default_globalize_decl_name (FILE *, tree); extern void default_emit_unwind_label (FILE *, tree, int, int); diff --git a/gcc/testsuite/gcc.target/i386/pr65248-1.c b/gcc/testsuite/gcc.target/i386/pr65248-1.c new file mode 100644 index 000..7b0d2af --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr65248-1.c @@ -0,0 +1,17 @@ +/* { dg-do compile { target *-*-linux* } } */ +/* { dg-options -O2 -fpic } */ + +/* Common symbol with -fpic. */ +__attribute__((visibility(protected))) +int xxx; + +int +foo () +{ + return xxx; +} + +/* { dg-final { scan-assembler-not movl\[ \t\]xxx\\(%rip\\), %eax { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler xxx@GOTPCREL { target { ! ia32 } } } } */ +/* { dg-final { scan-assembler-not movl\[ \t\]xxx@GOTOFF\\(%\[^,\]*\\), %eax { target ia32 } } } */ +/* { dg-final { scan-assembler movl\[ \t\]xxx@GOT\\(%\[^,\]*\\), %eax { target ia32 } } } */ diff --git a/gcc/testsuite/gcc.target/i386/pr65248-2.c b/gcc/testsuite/gcc.target/i386/pr65248-2.c new file mode 100644 index 000..78efa34 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr65248-2.c @@ -0,0 +1,17 @@ +/* { dg-do compile { target *-*-linux* } } */ +/* { dg-options -O2 -fpic } */ + +/* Weak common symbol with -fpic. */ +__attribute__((weak, visibility(protected))) +int xxx; + +int +foo () +{ + return xxx; +} + +/* { dg-final { scan-assembler-not movl\[ \t\]xxx\\(%rip\\), %eax { target { ! ia32 } } } }
Re: [PATCH, stage1] Move insns without introducing new temporaries in loop2_invariant
On 05/03/15 09:53, Thomas Preud'homme wrote: *** gcc/testsuite/ChangeLog *** 2015-02-16 Thomas Preud'homme thomas.preudho...@arm.com * gcc.dg/loop-7.c: Run on all targets and check for loop2_invariant being able to move instructions without introducing new temporary Thomas, Can you please confirm this relax on all target will not fail on AArch64? It do fails on my quick test.
[patch] Fix dangling pointer in future_error::what()
This bug didn't show up with COW strings, but is a real problem with the new std::string. The fix is to call error_code::message() early and store the result in the logic_error base class during construction. Tested x86_64-linux, committed to trunk. commit 3f7252dcc10eba698c12460c0a74bd313c7fef01 Author: Jonathan Wakely jwak...@redhat.com Date: Fri Mar 6 11:04:13 2015 + * include/std/future (future_error(error_code)): Construct base class with error_code's message. * src/c++11/future.cc (future_error::what()): Do not call c_str() on temporary string. diff --git a/libstdc++-v3/include/std/future b/libstdc++-v3/include/std/future index cb0226d..fc3f816 100644 --- a/libstdc++-v3/include/std/future +++ b/libstdc++-v3/include/std/future @@ -98,7 +98,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION public: explicit future_error(error_code __ec) -: logic_error(std::future_error), _M_code(__ec) +: logic_error(std::future_error: + __ec.message()), _M_code(__ec) { } virtual ~future_error() noexcept; diff --git a/libstdc++-v3/src/c++11/future.cc b/libstdc++-v3/src/c++11/future.cc index c711a5f..3cf503b 100644 --- a/libstdc++-v3/src/c++11/future.cc +++ b/libstdc++-v3/src/c++11/future.cc @@ -75,7 +75,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION future_error::~future_error() noexcept { } const char* - future_error::what() const noexcept { return _M_code.message().c_str(); } + future_error::what() const noexcept { return logic_error::what(); } #if defined(_GLIBCXX_HAS_GTHREADS) defined(_GLIBCXX_USE_C99_STDINT_TR1) \ (ATOMIC_INT_LOCK_FREE 1)
Re: [Patch 2/2, v3, Fortran, pr60322 a.o.] [OOP] Incorrect bounds on polymorphic dummy array
Hi everyone, please find attached the latest version of the patch with the comments from Dominique via IRC worked in. Those were mostly about clarifying comments and style. Nevertheless, thanks for your help Dominique. Still basing on the first part of the patch at: https://gcc.gnu.org/ml/fortran/2015-02/msg00105.html Bootstraps and regtests fine on x86_64-linux-gnu/F20. Reviews welcome. Regards, Andre On Wed, 4 Mar 2015 20:30:15 +0100 Andre Vehreschild ve...@gmx.de wrote: Dear all, during his initial tests Dominique pointed out, that my patch did not fix all issues in the comments of pr60322. This new version of the patch fixes this issue now. The trick is to use a temporary array in the variable association of select type statements. The old code referenced the incoming array directly. This array may have an arbitrary base (i.e. lbound != 1) making indexing very difficult. Creating a temporary array descriptor for the associated variable allows for having the array base set to one (i.e., lbound == 1) making indexing fairly simple and adhere to the Fortran rules. The patch boostraps and regtests cleanly on x86_64-linux-gnu/F20. Note, you need to apply part 1 of this patch first. Part 1 can be found at: https://gcc.gnu.org/ml/fortran/2015-02/msg00105.html Regards, Andre On Thu, 26 Feb 2015 18:17:21 +0100 Andre Vehreschild ve...@gmx.de wrote: Hi all, here is the second part of the patch for pr60322. This patch also addresses the issue reported in pr64692, ... (more to come). The patch fixes the incorrect bounds of polymorphic arrays used to call functions and subroutines by using the same mechanism as regular arrays. In fact most of the code for treating regular arrays is reused by simply inserting the class array's symbol_attributes and gfc_array_specs into the same routines, i.e., doing the switching whether a symbol is a class array or a regular array and assigning the CLASS_DATA(sym).attr or sym.attr to the symbol_attribute temporary variable introduced by the first part (same for gfc_array_spec). The temporary array descriptor is then stored in the tree and extracted were it is needed. This patch furthermore addresses an issue with elemental functions, where the elemental function was not applied to class array members. By introducing the temporary array this merely fixed itself. During fixing elemental function application, an issue about using the polymorphic initializer popped up. The existing code would declare a variable of the base type (not a reference or pointer to it), assign the _vptr's _def_init to it and use the _vptr's _copy to copy the initializer into the object to initialize. This had to be patched to use a pointer for the variable and the correct addressing to be able to make use of the polymorphic init. Furthermore is the array offset in certain cases set to be -1. This helped to get the addressing correct for subarrays of (unlimited polymorphic) objects, where the array offset is used in a select type or other kind of association and the with the offset set incorrectly wrong elements from the array were selected. I had to fix two testcases, too: - finalize_15: We agreed (including checking with other compilers) that the value of an intent(out) variable should be that of its initializer and not that of its finalizer. - finalize_10: By using the temporary arrays the scan-tree-count expressions had to be adapted, too. Thank you's go to: Tobias, Dominique and Paul for their support during figuring what it going wrong and thorough testing and to Antony for reporting the issue. Bootstraps and regtests ok on x86_64-linux-gnu/F20. Regards, Andre -- Andre Vehreschild * Email: vehre ad gmx dot de pr60322_3.clog Description: Binary data diff --git a/gcc/fortran/expr.c b/gcc/fortran/expr.c index d28cf77..7f3a59d 100644 --- a/gcc/fortran/expr.c +++ b/gcc/fortran/expr.c @@ -4060,7 +4060,7 @@ gfc_lval_expr_from_sym (gfc_symbol *sym) lval-symtree = gfc_find_symtree (sym-ns-sym_root, sym-name); /* It will always be a full array. */ - as = sym-as; + as = IS_CLASS_ARRAY (sym) ? CLASS_DATA (sym)-as : sym-as; lval-rank = as ? as-rank : 0; if (lval-rank) gfc_add_full_array_ref (lval, as); diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h index ff0054e..f55c691 100644 --- a/gcc/fortran/gfortran.h +++ b/gcc/fortran/gfortran.h @@ -3198,6 +3198,11 @@ bool gfc_is_finalizable (gfc_symbol *, gfc_expr **); CLASS_DATA (sym) \ CLASS_DATA (sym)-ts.u.derived \ CLASS_DATA (sym)-ts.u.derived-attr.unlimited_polymorphic) +#define IS_CLASS_ARRAY(sym) \ + (sym-ts.type == BT_CLASS \ + CLASS_DATA (sym) \ + CLASS_DATA (sym)-attr.dimension \ + !CLASS_DATA (sym)-attr.class_pointer) /* frontend-passes.c */ diff --git a/gcc/fortran/trans-array.c b/gcc/fortran/trans-array.c index 0d4d7b2..22fc7c7
Re: RFC: PATCHES: Properly handle reference to protected data on x86
On Thu, Mar 5, 2015 at 8:19 PM, Alan Modra amo...@gmail.com wrote: On Wed, Mar 04, 2015 at 03:26:10PM -0800, H.J. Lu wrote: Protected symbol means that it can't be pre-emptied. It doesn't mean its address won't be external. This is true for pointer to protected function. With copy relocation, address of protected data defined in the shared library may also be external. We only know that for sure at run-time. Here are patches for glibc, binutils and GCC to handle it properly. Any comments? I'd like to see this pass some more tests. For example reference in non-PIC exe to var x protected visibility definition of x in libA protected visibility definition of x in libB I suspect you don't have this case correct, but congratulations if you do! Assuming libA is first on the breadth first search for libraries, then exe and libA ought to use the same x, but libB have its own x. I believe my new testcases on hjl/pr17711 branch at https://sourceware.org/git/?p=glibc.git;a=summary covers those and they work correctly. In fact it would be good to prove that all variations of either a reference, a default visibility definition or a protected visibility definition worked in the exe plus two libs case. You can git my branch a try on PPC. If PPC uses copy relocation, it shouldn't be too hard to update PPC to make it work. -- H.J.
[ARM testsuite obvious] Fixup atomic-comp-swap-release-acquire.c to not use ICF
On Wed, Mar 04, 2015 at 09:38:51AM +, James Greenhalgh wrote: It took me a while longer than expected to get round to it, but I've committed the attached (revision 221175) as the obvious fix, after checking that it worked on aarch64-none-elf. Thanks, James --- gcc/testsuite/ 2015-03-04 James Greenhalgh james.greenha...@arm.com * gcc.target/aarch64/atomic-comp-swap-release-acquire.c: Add -fno-ipa-icf to dg-options * gcc.target/aarch64/vect_saddl_1.c: Likewise. * gcc.target/aarch64/vect_smlal_1.c: Likewise. Hi, ARM will want the same fix for the copy of atomic-comp-swap-release-acquire.c that lives in gcc.target/arm/ . I've committed the attached as obvious as revision 221243 after testing on arm-none-eabi with no issues. Thanks, James --- gcc/testsuite/ 2015-03-06 James Greenhalgh james.greenha...@arm.com * gcc.target/arm/atomic-comp-swap-release-acquire.c: Add -fno-ipa-icf to dg-options.
Re: [Patch,microblaze]: Optimized usage of pcmp conditional instruction.
On 03/05/15 21:12, Ajit Kumar Agarwal wrote: Changes are incorporated. Please find the log of the updated patch. commit 91f275c144165320850ddf18e3a1e059a66c Author: Ajit Kumar Agarwal ajitkum@xhdspdgnu.(none) Date: Fri Mar 6 09:55:11 2015 +0530 [Patch,microblaze]: Optimized usage of pcmp conditional instruction. The changes are made in the patch for optimized usage of pcmpne/pcmpeq instructions. The xor with register to register is replaced with pcmpeq /pcmpne instructions and for immediate check still the xori will be used. The purpose of the change is to acheive the aggressive usage of pcmpne /pcmpeq instructions instead of xor being used for comparison. ChangeLog: 2015-03-06 Ajit Agarwal ajit...@xilinx.com * config/microblaze/microblaze.md (cbranchsi4): Added immediate constraints. (cbranchsi4_reg): New. * config/microblaze/microblaze.c (microblaze_expand_conditional_branch_reg): New. * config/microblaze/microblaze-protos.h (microblaze_expand_conditional_branch_reg): New prototype. Signed-off-by:Ajit Agarwal ajit...@xilinx.com Thanks Regards Ajit OK. -- Michael Eagerea...@eagercon.com 1960 Park Blvd., Palo Alto, CA 94306 650-325-8077
Re: [PATCH] Check all python modules in contrib/dg-extract-results.sh
On 03/06/15 03:19, Bernd Edlinger wrote: Hi, I have a box, where only python-minimal-2.7 seems to be installed and I have tried to use make -k -j2; but this results in ./dg-extract-results.sh Traceback (most recent call last): File ./dg-extract-results.py, line 13, in module import io ImportError: No module named io This causes contrib/test_summary to be unable to show the results. I would like to fix this with the attached patch: The idea is to import all python modules that the dg-extract-results.py will use in the python version check, and fall back to the default implementation when any modules are missing. Is that OK for trunk? Yes, this is OK. Jeff
libgo patch committed: update go Go 1.4.2 release
I committed this patch to update libgo to the Go 1.4.2 release (it was at 1.4 before). Bootstrapped and ran testsuite on x86_64-unknown-linux-gnu. Committed to mainline. Ian diff -r d42a0819e2eb go/gogo.cc --- a/go/gogo.ccFri Feb 06 08:17:54 2015 -0800 +++ b/go/gogo.ccThu Mar 05 16:21:29 2015 -0800 @@ -6030,6 +6030,7 @@ Type* type = this-type_; Expression* init = this-init_; if (this-is_type_switch_var_ + type != NULL this-type_-is_nil_constant_as_type()) { Type_guard_expression* tge = this-init_-type_guard_expression(); @@ -6103,7 +6104,9 @@ // type here. It will have an initializer which is a type guard. // We want to initialize it to the value without the type guard, and // use the type of that value as well. - if (this-is_type_switch_var_ this-type_-is_nil_constant_as_type()) + if (this-is_type_switch_var_ + this-type_ != NULL + this-type_-is_nil_constant_as_type()) { Type_guard_expression* tge = this-init_-type_guard_expression(); go_assert(tge != NULL); diff -r d42a0819e2eb go/parse.cc --- a/go/parse.cc Fri Feb 06 08:17:54 2015 -0800 +++ b/go/parse.cc Thu Mar 05 16:21:29 2015 -0800 @@ -50,8 +50,7 @@ break_stack_(NULL), continue_stack_(NULL), iota_(0), -enclosing_vars_(), -type_switch_vars_() +enclosing_vars_() { } @@ -4596,32 +4595,33 @@ Parse::type_switch_body(Label* label, const Type_switch type_switch, Location location) { - Named_object* switch_no = NULL; - if (!type_switch.name.empty()) -{ - if (Gogo::is_sink_name(type_switch.name)) - error_at(type_switch.location, -no new variables on left side of %:=%); + Expression* init = type_switch.expr; + std::string var_name = type_switch.name; + if (!var_name.empty()) +{ + if (Gogo::is_sink_name(var_name)) +{ + error_at(type_switch.location, + no new variables on left side of %:=%); + var_name.clear(); +} else { - Variable* switch_var = new Variable(NULL, type_switch.expr, false, - false, false, - type_switch.location); - switch_no = this-gogo_-add_variable(type_switch.name, switch_var); + Location loc = type_switch.location; + Temporary_statement* switch_temp = + Statement::make_temporary(NULL, init, loc); + this-gogo_-add_statement(switch_temp); + init = Expression::make_temporary_reference(switch_temp, loc); } } Type_switch_statement* statement = -Statement::make_type_switch_statement(switch_no, - (switch_no == NULL - ? type_switch.expr - : NULL), - location); - + Statement::make_type_switch_statement(var_name, init, location); this-push_break_statement(statement, label); Type_case_clauses* case_clauses = new Type_case_clauses(); bool saw_default = false; + std::vectorNamed_object* implicit_vars; while (!this-peek_token()-is_op(OPERATOR_RCURLY)) { if (this-peek_token()-is_eof()) @@ -4629,7 +4629,8 @@ error_at(this-location(), missing %}%); return NULL; } - this-type_case_clause(switch_no, case_clauses, saw_default); + this-type_case_clause(var_name, init, case_clauses, saw_default, + implicit_vars); } this-advance_token(); @@ -4637,14 +4638,36 @@ this-pop_break_statement(); + // If there is a type switch variable implicitly declared in each case clause, + // check that it is used in at least one of the cases. + if (!var_name.empty()) +{ + bool used = false; + for (std::vectorNamed_object*::iterator p = implicit_vars.begin(); + p != implicit_vars.end(); + ++p) + { + if ((*p)-var_value()-is_used()) + { + used = true; + break; + } + } + if (!used) + error_at(type_switch.location, %qs declared and not used, +Gogo::message_name(var_name).c_str()); +} return statement; } // TypeCaseClause = TypeSwitchCase : [ StatementList ] . +// IMPLICIT_VARS is the list of variables implicitly declared for each type +// case if there is a type switch variable declared. void -Parse::type_case_clause(Named_object* switch_no, Type_case_clauses* clauses, - bool* saw_default) +Parse::type_case_clause(const std::string var_name, Expression* init, +Type_case_clauses* clauses, bool* saw_default, + std::vectorNamed_object** implicit_vars) { Location location = this-location(); @@ -4661,24 +4684,21 @@ if (this-statement_list_may_start_here()) {
web site update: Note that GCC 5 includes Go 1.4.2
I committed this patch to the GCC 5 changes.html file, to note that it includes Go 1.4.2. Ian Index: gcc-5/changes.html === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v retrieving revision 1.85 diff -u -r1.85 changes.html --- gcc-5/changes.html 2 Mar 2015 16:41:59 - 1.85 +++ gcc-5/changes.html 6 Mar 2015 16:19:51 - @@ -454,7 +454,12 @@ /ul/li /ul -!-- h3 id=goGo/h3 -- +h3 id=goGo/h3 + ul +liGCC 5 provides a complete implementation of the Go 1.4.2 +release./li + /ul + !--h3 id=javaJava (GCJ)/h3-- h2 id=jitlibgccjit/h2
RE: [PATCH] Remove inefficient branchless conditional negate optimization
Jeff Law wrote: Can you move pr45685.c into gcc.target/i386? I know Richi said next stage1, but given this fixes a performance regression for ARM and it's reverting rather than adding new code, I think this is OK for the trunk with the testcase moved. So, OK with the testcase moved into gcc.target/i386/ I've moved it and changed the compile condition: /* { dg-do compile { target { ! { ia32 } } } } */ Jiong, can you commit this please? Wilco 2015-03-06 Wilco Dijkstra wdijk...@arm.com * gcc/tree-ssa-phiopt.c (neg_replacement): Remove. (tree_ssa_phiopt_worker): Remove negate optimization. * gcc/testsuite/gcc.dg/tree-ssa/pr45685.c: Move to gcc.target/i386. * gcc/testsuite/gcc.target/aarch64/csneg-1.c (test_csneg_cmp): New test. * gcc/gcc.target/i386/pr45685.c: Moved test, check for conditional move on x64.--- gcc/testsuite/gcc.dg/tree-ssa/pr45685.c| 41 gcc/testsuite/gcc.target/aarch64/csneg-1.c | 8 ++ gcc/testsuite/gcc.target/i386/pr45685.c| 39 gcc/tree-ssa-phiopt.c | 156 - 4 files changed, 47 insertions(+), 197 deletions(-) delete mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr45685.c create mode 100644 gcc/testsuite/gcc.target/i386/pr45685.c diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr45685.c b/gcc/testsuite/gcc.dg/tree-ssa/pr45685.c deleted file mode 100644 index 0628943..000 --- a/gcc/testsuite/gcc.dg/tree-ssa/pr45685.c +++ /dev/null @@ -1,41 +0,0 @@ -/* { dg-do compile } */ -/* { dg-options -O3 -fdump-tree-phiopt1-details } */ - -typedef unsigned long int uint64_t; -typedef long int int64_t; -int summation_helper_1(int64_t* products, uint64_t count) -{ - int s = 0; - uint64_t i; - for(i=0; icount; i++) - { - int64_t val = (products[i]0) ? 1 : -1; - products[i] *= val; - if(products[i] != i) - val = -val; - products[i] = val; - s += val; - } - return s; -} - - -int summation_helper_2(int64_t* products, uint64_t count) -{ - int s = 0; - uint64_t i; - for(i=0; icount; i++) - { - int val = (products[i]0) ? 1 : -1; - products[i] *= val; - if(products[i] != i) - val = -val; - products[i] = val; - s += val; - } - return s; -} - -/* { dg-final { scan-tree-dump-times converted to straightline code 2 phiopt1 } } */ -/* { dg-final { cleanup-tree-dump phiopt1 } } */ - diff --git a/gcc/testsuite/gcc.target/aarch64/csneg-1.c b/gcc/testsuite/gcc.target/aarch64/csneg-1.c index 08001af..29d4e4e 100644 --- a/gcc/testsuite/gcc.target/aarch64/csneg-1.c +++ b/gcc/testsuite/gcc.target/aarch64/csneg-1.c @@ -48,3 +48,11 @@ test_csneg64_condasn2(long long x0, x4 = (x0 == x1) ? x3 : -x2; return x4; } + +int test_csneg_cmp(int x) +{ + /* { dg-final { scan-assembler csneg\tw\[0-9\] } } */ + if (x 3) +x = -x; + return x; +} diff --git a/gcc/testsuite/gcc.target/i386/pr45685.c b/gcc/testsuite/gcc.target/i386/pr45685.c new file mode 100644 index 000..7f50bb3 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr45685.c @@ -0,0 +1,39 @@ +/* { dg-do compile { target { ! { ia32 } } } } */ +/* { dg-options -O3 } */ + +typedef unsigned long int uint64_t; +typedef long int int64_t; +int summation_helper_1(int64_t* products, uint64_t count) +{ + int s = 0; + uint64_t i; + for(i=0; icount; i++) + { + int64_t val = (products[i]0) ? 1 : -1; + products[i] *= val; + if(products[i] != i) + val = -val; + products[i] = val; + s += val; + } + return s; +} + + +int summation_helper_2(int64_t* products, uint64_t count) +{ + int s = 0; + uint64_t i; + for(i=0; icount; i++) + { + int val = (products[i]0) ? 1 : -1; + products[i] *= val; + if(products[i] != i) + val = -val; + products[i] = val; + s += val; + } + return s; +} + +/* { dg-final { scan-assembler-times cmov 4 } } */ diff --git a/gcc/tree-ssa-phiopt.c b/gcc/tree-ssa-phiopt.c index c7fb073..14a7122 100644 --- a/gcc/tree-ssa-phiopt.c +++ b/gcc/tree-ssa-phiopt.c @@ -96,8 +96,6 @@ static bool minmax_replacement (basic_block, basic_block, edge, edge, gimple, tree, tree); static bool abs_replacement (basic_block, basic_block, edge, edge, gimple, tree, tree); -static bool neg_replacement (basic_block, basic_block, -edge, edge, gimple, tree, tree); static bool cond_store_replacement (basic_block, basic_block, edge, edge, hash_settree *); static bool
Re: [patch] Optimize empty class copies within a C++ return statement
On 03/05/2015 10:20 PM, Jason Merrill wrote: On 03/05/2015 06:25 PM, Aldy Hernandez wrote: +tree ret = TREE_OPERAND (*expr_p, 0); +if (ret (TREE_CODE (ret) == INIT_EXPR +|| TREE_CODE (ret) == MODIFY_EXPR) + TREE_CODE (TREE_OPERAND (ret, 0)) == RESULT_DECL + is_gimple_lvalue (TREE_OPERAND (ret, 0)) + is_really_empty_class (TREE_TYPE (TREE_OPERAND (ret, 0 + { +tree result_decl = TREE_OPERAND (ret, 0); +tree list = alloc_stmt_list (); +append_to_statement_list (TREE_OPERAND (ret, 1), list); +append_to_statement_list (build1 (RETURN_EXPR, void_type_node, + result_decl), list); +*expr_p = list; +return GS_OK; + } This should really use the MODIFY_EXPR case rather than duplicate it here. Actually, why don't we already hit that case when processing the RETURN_EXPR? We are hitting the MODIFY_EXPR case. Indeed, _because_ we hit the MODIFY_EXPR is that we return the uninitialized temporary. For example, we start with: return retval = TARGET_EXPR D.2347, ... which becomes: stuff with D.2347 return D.2349 = D.2347; which the aforementioned MODIFY_EXPR case turns into: stuff with D.2347 return D.2349; My patch was trying to notice this pattern and get rid of the temporary, thereby generating: stuff with D.2347 return retval; If you still think it's profitable, I can come up with an alternative using the MODIFY_EXPR code ??. Aldy
Re: [PATCH] Remove inefficient branchless conditional negate optimization
Wilco Dijkstra writes: Jeff Law wrote: Can you move pr45685.c into gcc.target/i386? I know Richi said next stage1, but given this fixes a performance regression for ARM and it's reverting rather than adding new code, I think this is OK for the trunk with the testcase moved. So, OK with the testcase moved into gcc.target/i386/ I've moved it and changed the compile condition: /* { dg-do compile { target { ! { ia32 } } } } */ Jiong, can you commit this please? Done as 221246. Wilco 2015-03-06 Wilco Dijkstra wdijk...@arm.com * gcc/tree-ssa-phiopt.c (neg_replacement): Remove. (tree_ssa_phiopt_worker): Remove negate optimization. * gcc/testsuite/gcc.dg/tree-ssa/pr45685.c: Move to gcc.target/i386. * gcc/testsuite/gcc.target/aarch64/csneg-1.c (test_csneg_cmp): New test. * gcc/gcc.target/i386/pr45685.c: Moved test, check for conditional move on x64. -- Regards, Jiong
Fix gimple_ic WRT EH
Hi, this patch makes gimple_ic to purge dead EH when the direct call does not need it. Bootstrapped/regtested x86_64-linux, will commit it shortly. Honza PR ipa/65302 * value-prof.c (gimple_ic): Pure dead eh edges when needed. * g++.dg/lto/pr65302_1.C: New testcase. * g++.dg/lto/pr65302_0.C: New testcase. Index: value-prof.c === --- value-prof.c(revision 221223) +++ value-prof.c(working copy) @@ -1576,6 +1576,8 @@ gimple_ic (gcall *icall_stmt, struct cgr PHI_ARG_DEF_FROM_EDGE (phi, e_eh)); } } + if (!stmt_could_throw_p (dcall_stmt)) +gimple_purge_dead_eh_edges (dcall_bb); return dcall_stmt; } Index: testsuite/g++.dg/lto/pr65302_1.C === --- testsuite/g++.dg/lto/pr65302_1.C(revision 0) +++ testsuite/g++.dg/lto/pr65302_1.C(revision 0) @@ -0,0 +1,83 @@ +#pragma implementation +#pragma interface +class CstringStorageReference { + public: + ~CstringStorageReference (); +}; +class Cstring { + CstringStorageReference m_stringRef; + public: + Cstring (const char *str, int l = 0); + unsigned int getLength () const; +}; +inline unsigned int +Cstring::getLength () const { }; +class ZEvent_Component { }; +class ZEvent_Data { }; +class ZEvent_Interrupt { }; +class ZEvent_Mouse { }; +class ZEvent_Key { }; +class ZEventHandler +{ + virtual void HandleEvent (const ZEvent_Component event); + virtual void HandleEvent (const ZEvent_Mouse event); + virtual void HandleEvent (const ZEvent_Key event); + virtual void HandleEvent (const ZEvent_Interrupt event); + virtual void HandleEvent (const ZEvent_Data event); +}; +class ZColor { }; +class ZViewPort2D { }; +enum ZVerticalAlignment { VA_Baseline }; +struct ZDevicePointStruct { }; +class ZCursor; +class ZPixmap; +class Foo; +class ZOutputDevice : public ZEventHandler { + public: + typedef ZVerticalAlignment TVerticalAlignment; + virtual const char *MyName () const { } + virtual ~ ZOutputDevice (); + virtual Cstring getTitle () const; + virtual void setTitle (const Cstring ) { } + virtual void Init (); + virtual void shutdown (); + virtual void minimize (); + virtual void normalize (); + virtual void raiseToTop (); + virtual ZViewPort2D GetViewPort () const; + virtual void setBackgroundColor (const ZColor color) = 0; + virtual void Clear () = 0; + virtual void Flush (int forced) = 0; + virtual void dismissCache () { } + virtual int GetDeviceWidth () const = 0; + virtual int GetDeviceHeight () const = 0; + virtual Foo *CreateGraphicContext () = 0; + virtual ZCursor *createCursor (const ZPixmap , int, int) { } + virtual void DrawLine (const Foo gc, int x1, int y2) = 0; + virtual void DrawLines (const Foo gc, const ZDevicePointStruct * points, + unsigned int count) = 0; +}; +class ZOutputDevicePS :public ZOutputDevice +{ + virtual void FillPolygon (const Foo gc, unsigned int count); + virtual void DrawPoint (const Foo gc, int x1, int y1); + virtual void DrawPoints (const Foo gc, const ZDevicePointStruct * points, + unsigned int count); + virtual void DrawRectangle (const Foo gc, int x, int height); + virtual void DrawRectangles (const Foo gc, unsigned int count); + virtual void FillRectangle (const Foo gc, int x, int height); + virtual void FillRectangles (const Foo gc, unsigned int count); + virtual void DrawCircle (const Foo gc, int x, int y, int radius); + virtual void DrawCircles (const Foo gc, unsigned int count); + virtual void FillCircle (const Foo gc, int x, int y, int radius); + virtual void FillCircles (const Foo gc, unsigned int count); + virtual void DrawString (const Foo gc, int xx, int yy, + TVerticalAlignment verAlign); + virtual void getStringBounds (const Foo gc, const Cstring theString, + int width, int height, int acsent) const; +}; +void +ZOutputDevicePS::getStringBounds (const Foo , const Cstring theString, + int width, int height, int ascent) const { + width = theString.getLength () * 8; +} Index: testsuite/g++.dg/lto/pr65302_0.C === --- testsuite/g++.dg/lto/pr65302_0.C(revision 0) +++ testsuite/g++.dg/lto/pr65302_0.C(revision 0) @@ -0,0 +1,99 @@ +// { dg-lto-do link } +// { dg-lto-options { { -flto -O2 } } } +// { dg-extra-ld-options -r -nostdlib -O0 } + +class CstringStorageReference { + public: + ~CstringStorageReference (); +}; +class Cstring { + CstringStorageReference m_stringRef; + public: + Cstring (const char *str, int l = 0); + unsigned int getLength () const; +}; +class ZEvent_Component { }; +class ZEvent_Data { }; +class ZEvent_Interrupt { }; +class ZEvent_Mouse { }; +class ZEvent_Key { }; +class ZEventHandler { + virtual void
Re: [PATCH] Fix a number of -Wformat-security warnings in gcc/config/*/*
On 02/13/15 06:58, David Howells wrote: * config/avr/avr.c (avr_print_operand_address, avr_print_operand): Avoid -Wformat-security warning. * config/m68k/m68k.c (print_operand): Likewise. * config/s390/s390.c (print_operand): Likewise. * config/tilegx/tilegx.c (tilegx_print_operand): Likewise. Normally I'd say this should wait as we're in stage4 and this is not a regression. But it's scope is limited to those backends... so, OK for the trunk. jeff
Re: [PATCH][RFC] Fix PR63155
On 03/06/15 06:16, Richard Biener wrote: This fixes PR63155 and reduces the memory usage at -O0 from reported 10GB (couldn't verify/update on my small box) to 350MB (still worse compared to 4.8 which needs only 50MB). It fixes this by no longer computing live info or building a conflict graph for coalescing of SSA names flowing over abnormal edges (which needs to succeed). Of course this also removes verification that this coalescing is valid (but computing this has quadratic cost). With this it turns ICEs into miscompiles. We could restrict verifying that we can perform abnormal coalescing to ENABLE_CHECKING (and I've wanted a verifier pass that can verify this multiple times to be able to catch what breaks it and not having to work back from out-of-SSA ICEing...). So any opinion on this patch welcome. Bootstrap and regtest running on x86_64-unknown-linux-gnu. Ok for trunk? ;) Thanks, Richard. 2015-03-06 Richard Biener rguent...@suse.de PR middle-end/63155 * tree-ssa-coalesce.c (attempt_coalesce): Handle graph being NULL. (coalesce_partitions): Split out abnormal coalescing to ... (perform_abnormal_coalescing): ... this function. (coalesce_ssa_name): Perform abnormal coalescing without computing live/conflict. I'd personally like to keep the checking when ENABLE_CHECKING. I haven't followed this discussion real closely, but I wonder if some kind of blocking approach would work without blowing up the memory consumption. There's no inherent reason why we have to coalesce everything at the same time. We can use a blocking factor and do coalescing on some N number of SSA_NAMEs at a time. I suspect we can select an N that ultimately degenerates into the current do everything together for the common case and only has to iterate over blocks of SSA_NAMEs in extreme cases. Jeff
Re: [ARM testsuite obvious] Fixup atomic-comp-swap-release-acquire.c to not use ICF
On Fri, Mar 06, 2015 at 04:09:40PM +, James Greenhalgh wrote: On Wed, Mar 04, 2015 at 09:38:51AM +, James Greenhalgh wrote: It took me a while longer than expected to get round to it, but I've committed the attached (revision 221175) as the obvious fix, after checking that it worked on aarch64-none-elf. Thanks, James --- gcc/testsuite/ 2015-03-04 James Greenhalgh james.greenha...@arm.com * gcc.target/aarch64/atomic-comp-swap-release-acquire.c: Add -fno-ipa-icf to dg-options * gcc.target/aarch64/vect_saddl_1.c: Likewise. * gcc.target/aarch64/vect_smlal_1.c: Likewise. Hi, ARM will want the same fix for the copy of atomic-comp-swap-release-acquire.c that lives in gcc.target/arm/ . I've committed the attached as obvious as revision 221243 after testing on arm-none-eabi with no issues. ENOPATCH, I'll try again! r221243 looks like this... Cheers, James Index: gcc/testsuite/gcc.target/arm/atomic-comp-swap-release-acquire.c === --- gcc/testsuite/gcc.target/arm/atomic-comp-swap-release-acquire.c (revision 221242) +++ gcc/testsuite/gcc.target/arm/atomic-comp-swap-release-acquire.c (revision 221243) @@ -1,6 +1,6 @@ /* { dg-do compile } */ /* { dg-require-effective-target arm_arch_v8a_ok } */ -/* { dg-options -O2 } */ +/* { dg-options -O2 -fno-ipa-icf } */ /* { dg-add-options arm_arch_v8a } */ #include ../aarch64/atomic-comp-swap-release-acquire.x Index: gcc/testsuite/ChangeLog === --- gcc/testsuite/ChangeLog (revision 221242) +++ gcc/testsuite/ChangeLog (revision 221243) @@ -1,5 +1,10 @@ 2015-03-06 James Greenhalgh james.greenha...@arm.com + * gcc.target/arm/atomic-comp-swap-release-acquire.c: Add + -fno-ipa-icf to dg-options. + +2015-03-06 James Greenhalgh james.greenha...@arm.com + * c-c++-common/torture/aarch64-vect-lane-2.c: XFAIL for LTO compiles using the linker plugin.
Re: [PATCH] Fix PR rtl-optimization/65067
Hmm. As you also modify the no-strict-volatile-bitfield path I'm not sure you don't regress the case where EP_insv can work on memory. I agree that simplifying the strict-volatile-bitfield path to extract the memory within strict-volatile-bitfield constraints to a reg and then using the regular path is a good thing. Eric? Even if the no-strict-volatile-bitfield path isn't touched, I don't understand @@ -976,7 +976,7 @@ store_bit_field (rtx str_rtx, unsigned HOST_WIDE_I /* Storing any naturally aligned field can be done with a simple store. For targets that support fast unaligned memory, any naturally sized, unit aligned field can be done directly. */ - if (simple_mem_bitfield_p (str_rtx, bitsize, bitnum, fieldmode)) + if (bitsize == GET_MODE_BITSIZE (fieldmode)) { str_rtx = adjust_bitfield_address (str_rtx, fieldmode, bitnum / BITS_PER_UNIT); { - rtx result; - /* Extraction of a full MODE1 value can be done with a load as long as the field is on a byte boundary and is sufficiently aligned. */ - if (simple_mem_bitfield_p (str_rtx, bitsize, bitnum, mode1)) - result = adjust_bitfield_address (str_rtx, mode1, - bitnum / BITS_PER_UNIT); - else + if (bitsize == GET_MODE_BITSIZE(mode1)) { - str_rtx = narrow_bit_field_mem (str_rtx, mode1, bitsize, bitnum, - bitnum); - result = extract_fixed_bit_field_1 (mode, str_rtx, bitsize, bitnum, - target, unsignedp); + rtx result = adjust_bitfield_address (str_rtx, mode1, + bitnum / BITS_PER_UNIT); adjust_bitfield_address takes (bitnum / BITS_PER_UNIT) so don't you need to make sure that bitnum % BITS_PER_UNIT == 0? In any case, the comments are now totally out of sync. -- Eric Botcazou
Re: [PATCH/AARCH64] Fix 64893: ICE with vget_lane_u32 with C++ front-end at -O0
On Mar 6, 2015, at 1:45 AM, James Greenhalgh james.greenha...@arm.com wrote: On Thu, Feb 12, 2015 at 03:37:33PM +, Christophe Lyon wrote: On 8 February 2015 at 03:24, Andrew Pinski pins...@gmail.com wrote: On Fri, Feb 6, 2015 at 5:02 PM, Andrew Pinski pins...@gmail.com wrote: PR target/64893 * config/aarch64/aarch64-builtins.c (aarch64_init_simd_builtins): Change the first argument type to size_type_node and add another size_type_node. (aarch64_simd_expand_builtin): Handle the new argument to AARCH64_SIMD_BUILTIN_LANE_CHECK and don't ICE but rather print an out when the first two arguments are not nonzero integer constants. * config/aarch64/arm_neon.h (__AARCH64_LANE_CHECK): Pass the sizeof directly to __builtin_aarch64_im_lane_boundsi. testsuite/ChangeLog: * c-c++-common/torture/aarch64-vect-lane-1.c: New testcase. * c-c++-common/torture/aarch64-vect-lane-2.c: New testcase. In case you haven't noticed, aarch64-vect-lane-2.c FAILs when compiled at -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects but PASSes with the other optimization levels. Hi Andrew, Did you get a chance to look at this? Given the way that your error checking works, the test probably just needs a dg-skip-if -flto directive. Did you intend to come back to this and make it fail earlier than link time with -flto? If I don't hear otherwise, I'll propose the patch adding dg-skip-if on Monday. Yes that is the correct approach as I see it. Thanks, Andrew Cheers, James
Re: RFC: PATCHES: Properly handle reference to protected data on x86
On 04/03/15 23:26, H.J. Lu wrote: Protected symbol means that it can't be pre-emptied. It doesn't mean its address won't be external. This is true for pointer to protected function. With copy relocation, address of protected data defined in the shared library may also be external. We only know that for sure at run-time. Here are patches for glibc, binutils and GCC to handle it properly. Any comments? i think this fixes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55012
Re: [PATCH][simplify-rtx] PR 65235: Calculate element size correctly when simplifying (vec_select (vec_concat (const_int) (...)) [...])
The patch fixes that by calculating the size of the first element by taking the size of the outer mode and subtracting the size of the second element. I've added an assert to make sure that the second element is not also a const_int, as a vec_concat of const_ints doesn't make sense as far as I can see. I'm not sure about the assert, can't we just punt in this case? Bootstrapped and tested on aarch64-none-linux-gnu, arm-none-linux-gnueabihf, x86_64-linux-gnu. This bug appears on trunk, 4.9 and 4.8, so it's not a regression on the release branches but it is a wrong-code bug. I think that the fix would be acceptable for GCC 5 without the assert. -- Eric Botcazou
[PATCH] Fix PR63743: Incorrect ordering of operands in sequence of commutative operations
Hi, Improved canonization after r216728 causes GCC to more often generate poor code due to suboptimal ordering of operand of commutative libcalls. Indeed, if one of the operands of a commutative operation is the result of a previous operation, both being implemented by libcall, the wrong ordering of the operands in the second operation can lead to extra mov. Consider the following case on softfloat targets: double test1 (double x, double y) { return x * (x + y); } If x + y is put in the operand using the same register as the result of the libcall for x + y then no mov is generated, otherwise mov is needed. The following happens on arm softfloat with the right ordering: bl __aeabi_dadd ldr r2, [sp] ldr r3, [sp, #4] /* r0, r1 are reused from the return values of the __aeabi_dadd libcall. */ bl __aeabi_dmul With the wrong ordering one gets: bl __aeabi_dadd mov r2, r0 mov r3, r1 ldr r0, [sp] ldr r1, [sp, #4] bl __aeabi_dmul This patch extend the patch written by Yuri Rumyantsev in r219646 to also deal with the case of only one of the operand being the result of an operation. ChangeLog entries are as follows: *** gcc/ChangeLog *** 2015-03-05 Thomas Preud'homme thomas.preudho...@arm.com PR tree-optimization/63743 * cfgexpand.c (reorder_operands): Also reorder if only second operand had its definition forwarded by TER. *** gcc/testsuite/ChangeLog *** 2015-03-05 Thomas Preud'homme thomas.preudho...@arm.com PR tree-optimization/63743 * gcc.dg/pr63743.c: New test. diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c index 7dfe1f6..4fbc037 100644 --- a/gcc/cfgexpand.c +++ b/gcc/cfgexpand.c @@ -5117,13 +5117,11 @@ reorder_operands (basic_block bb) continue; /* Swap operands if the second one is more expensive. */ def0 = get_gimple_for_ssa_name (op0); - if (!def0) - continue; def1 = get_gimple_for_ssa_name (op1); if (!def1) continue; swap = false; - if (lattice[gimple_uid (def1)] lattice[gimple_uid (def0)]) + if (!def0 || lattice[gimple_uid (def1)] lattice[gimple_uid (def0)]) swap = true; if (swap) { @@ -5132,7 +5130,7 @@ reorder_operands (basic_block bb) fprintf (dump_file, Swap operands in stmt:\n); print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM); fprintf (dump_file, Cost left opnd=%d, right opnd=%d\n, - lattice[gimple_uid (def0)], + def0 ? lattice[gimple_uid (def0)] : 0, lattice[gimple_uid (def1)]); } swap_ssa_operands (stmt, gimple_assign_rhs1_ptr (stmt), diff --git a/gcc/testsuite/gcc.dg/pr63743.c b/gcc/testsuite/gcc.dg/pr63743.c new file mode 100644 index 000..87254ed --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr63743.c @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-options -O1 -fdump-rtl-expand-details } */ + +double +libcall_dep (double x, double y) +{ + return x * (x + y); +} + +/* { dg-final { scan-rtl-dump-times Swap operands 1 expand } } */ +/* { dg-final { cleanup-rtl-dump expand } } */ Testsuite was run in QEMU when compiled by an arm-none-eabi GCC cross-compiler targeting Cortex-M3 and a bootstrapped x86_64 native GCC without any regression. CSiBE sees a -0.5034% code size decrease on arm-none-eabi and a 0.0058% code size increase on x86_64-linux-gnu. Is it ok for trunk (since it fixes a code size regression in 5.0)? Best regards, Thomas
Re: [PATCH/AARCH64] Fix 64893: ICE with vget_lane_u32 with C++ front-end at -O0
On Fri, Mar 06, 2015 at 10:03:46AM +, pins...@gmail.com wrote: On Mar 6, 2015, at 1:45 AM, James Greenhalgh james.greenha...@arm.com wrote: On Thu, Feb 12, 2015 at 03:37:33PM +, Christophe Lyon wrote: On 8 February 2015 at 03:24, Andrew Pinski pins...@gmail.com wrote: On Fri, Feb 6, 2015 at 5:02 PM, Andrew Pinski pins...@gmail.com wrote: PR target/64893 * config/aarch64/aarch64-builtins.c (aarch64_init_simd_builtins): Change the first argument type to size_type_node and add another size_type_node. (aarch64_simd_expand_builtin): Handle the new argument to AARCH64_SIMD_BUILTIN_LANE_CHECK and don't ICE but rather print an out when the first two arguments are not nonzero integer constants. * config/aarch64/arm_neon.h (__AARCH64_LANE_CHECK): Pass the sizeof directly to __builtin_aarch64_im_lane_boundsi. testsuite/ChangeLog: * c-c++-common/torture/aarch64-vect-lane-1.c: New testcase. * c-c++-common/torture/aarch64-vect-lane-2.c: New testcase. In case you haven't noticed, aarch64-vect-lane-2.c FAILs when compiled at -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects but PASSes with the other optimization levels. Hi Andrew, Did you get a chance to look at this? Given the way that your error checking works, the test probably just needs a dg-skip-if -flto directive. Did you intend to come back to this and make it fail earlier than link time with -flto? If I don't hear otherwise, I'll propose the patch adding dg-skip-if on Monday. Yes that is the correct approach as I see it. Thanks for the confirmation. In the end I settled on XFAILing (It would be nice to catch the error early enough that one day this test PASSes) the test rather than skipping it. I committed the attached patch as obvious as r221233. Thanks, James --- 2015-03-06 James Greenhalgh james.greenha...@arm.com * c-c++-common/torture/aarch64-vect-lane-2.c: XFAIL for LTO compiles using the linker plugin. Index: gcc/testsuite/ChangeLog === --- gcc/testsuite/ChangeLog (revision 221232) +++ gcc/testsuite/ChangeLog (working copy) @@ -1,3 +1,8 @@ +2015-03-06 James Greenhalgh james.greenha...@arm.com + + * c-c++-common/torture/aarch64-vect-lane-2.c: XFAIL for LTO + compiles using the linker plugin. + 2015-03-06 Eric Botcazou ebotca...@adacore.com * g++.dg/other/dump-ada-spec-3.C: Remove include and adjust. Index: gcc/testsuite/c-c++-common/torture/aarch64-vect-lane-2.c === --- gcc/testsuite/c-c++-common/torture/aarch64-vect-lane-2.c (revision 221232) +++ gcc/testsuite/c-c++-common/torture/aarch64-vect-lane-2.c (working copy) @@ -1,4 +1,5 @@ // { dg-do compile { target aarch64*-*-* } } +// { dg-xfail-if { *-*-* } { -flto -fuse-linker-plugin } { } } int search_line_fast (void) {
RE: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields
Hi Richard, here is my new proposal, it addresses your objections and generates better code for this test case: main: .LFB0: .cfi_startproc pushq %rbp .cfi_def_cfa_offset 16 .cfi_offset 6, -16 movq %rsp, %rbp .cfi_def_cfa_register 6 movl global+1(%rip), %eax orl $2147483647, %eax movl %eax, global+1(%rip) movl global+1(%rip), %eax andl $2147483647, %eax cmpl $2147483647, %eax je .L2 call abort .L2: movl $0, %eax popq %rbp .cfi_def_cfa 7, 8 ret .cfi_endproc I also tried to fix the comments. Reg-tested on x86_64 successfully and ARM is still running. Is it OK for trunk? Thanks Bernd. gcc: 2015-03-06 Bernd Edlinger bernd.edlin...@hotmail.de * expmed.c (strict_volatile_bitfield_p): For STRICT_ALIGNMENT check that MEM_ALIGN (op0) allows a MODESIZE access. (store_bit_field, extract_bit_field): For !STRICT_ALIGNMENT explicitly generate an unaligned access if the field crosses a word boundary. testsuite: 2015-03-06 Bernd Edlinger bernd.edlin...@hotmail.de * gcc.dg/20150306-1.c: New test. patch-volatile-bitfields-1.diff Description: Binary data
Re: [PATCH] Fix PR ipa/65318
On Thu, Mar 5, 2015 at 7:38 PM, Jan Hubicka hubi...@ucw.cz wrote: Index: gimple-fold.c === --- gimple-fold.c(revision 221170) +++ gimple-fold.c(working copy) @@ -263,7 +263,16 @@ get_symbol_constant_value (tree sym) { val = canonicalize_constructor_val (unshare_expr (val), sym); if (val is_gimple_min_invariant (val)) -return val; +{ + if (!useless_type_conversion_p (TREE_TYPE (sym), TREE_TYPE (val))) +{ + if (operand_equal_p (TYPE_SIZE (TREE_TYPE (sym)), + TYPE_SIZE (TREE_TYPE (val)), 0)) +return NULL_TREE; And no, I don't think this is sane. Callers need to handle mismatches IIRC. OK, I am little bit confused about your MEM_REF suggestion. So you mean that MEM_REF should be added around all references to symbol that is an alias? Where it is done? Is there a reason why we do not add MEM_REF always? I would like to keep optimization passes (like ipa-visibility or ICF) to turn symbol into an alias without having to update underlying IL. Yes - but I said that having an alias should have the same effect as the MEM_REF wrapping we do in LTO (to not barf on stmt verification if symbol merging merges an int and a float for example). Concerning callers handling mismatches, the VIEW_CONVERT_EXPR thing seems valid thing to do for all uses except for fold_const_aggregate_ref_1. So perhaps we can just inline rest of get_symbol_constant_value in there and document that get_symbol_constant_value returns value in correct type. Or am I missing something obvious? Yeah, that looks good. Note that we can as well change all callers of get_symbol_constant_value to use fold_const_aggregate_ref, no? So reduce the number of APIs. Richard. Thanks! Honza + val = fold_unary (VIEW_CONVERT_EXPR, TREE_TYPE (sym), val); +} + return val; +} else return NULL_TREE; }
Re: [PATCH] Fix PR rtl-optimization/65067
I know it because strict_volatile_bitfield_p checks this: /* Check for cases of unaligned fields that must be split. */ if (bitnum % BITS_PER_UNIT + bitsize modesize Therefore, if bitsize == modesize, we know that bitnum % BITS_PER_UNIT must be zero. Isn't that precisely the condition you're trying to relax in the other change? -- Eric Botcazou
RE: [PATCH] Fix PR rtl-optimization/65067
On Fri, 6 Mar 2015 10:52:53, Eric Botcazou wrote: I know it because strict_volatile_bitfield_p checks this: /* Check for cases of unaligned fields that must be split. */ if (bitnum % BITS_PER_UNIT + bitsize modesize Therefore, if bitsize == modesize, we know that bitnum % BITS_PER_UNIT must be zero. Isn't that precisely the condition you're trying to relax in the other change? Yes, but modesize is a mutliple of BITS_PER_UNIT, therefore: bitnum % modesize + bitsize modesize implies bitnum % BITS_PER_UNIT + bitsize modesize I agree that the comments are still out of sync. Thanks Bernd.
Re: [PATCH/AARCH64] Fix 64893: ICE with vget_lane_u32 with C++ front-end at -O0
On Thu, Feb 12, 2015 at 03:37:33PM +, Christophe Lyon wrote: On 8 February 2015 at 03:24, Andrew Pinski pins...@gmail.com wrote: On Fri, Feb 6, 2015 at 5:02 PM, Andrew Pinski pins...@gmail.com wrote: PR target/64893 * config/aarch64/aarch64-builtins.c (aarch64_init_simd_builtins): Change the first argument type to size_type_node and add another size_type_node. (aarch64_simd_expand_builtin): Handle the new argument to AARCH64_SIMD_BUILTIN_LANE_CHECK and don't ICE but rather print an out when the first two arguments are not nonzero integer constants. * config/aarch64/arm_neon.h (__AARCH64_LANE_CHECK): Pass the sizeof directly to __builtin_aarch64_im_lane_boundsi. testsuite/ChangeLog: * c-c++-common/torture/aarch64-vect-lane-1.c: New testcase. * c-c++-common/torture/aarch64-vect-lane-2.c: New testcase. In case you haven't noticed, aarch64-vect-lane-2.c FAILs when compiled at -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects but PASSes with the other optimization levels. Hi Andrew, Did you get a chance to look at this? Given the way that your error checking works, the test probably just needs a dg-skip-if -flto directive. Did you intend to come back to this and make it fail earlier than link time with -flto? If I don't hear otherwise, I'll propose the patch adding dg-skip-if on Monday. Cheers, James
PING^3: [PATCH]: New configure options that make the compiler use -fPIE and -pie as default option
PING. I am enclosing the patch here for review. On Wed, Feb 11, 2015 at 8:47 AM, H.J. Lu hjl.to...@gmail.com wrote: PING. On Wed, Jan 28, 2015 at 8:05 AM, H.J. Lu hjl.to...@gmail.com wrote: PING. On Tue, Jan 13, 2015 at 3:25 PM, H.J. Lu hjl.to...@gmail.com wrote: On Tue, Jan 13, 2015 at 5:03 AM, H.J. Lu hjl.to...@gmail.com wrote: On Mon, Jan 12, 2015 at 11:50:41PM +, Joseph Myers wrote: On Mon, 12 Jan 2015, H.J. Lu wrote: +if test x$enable_default_pie = xyes; then + AC_MSG_CHECKING(if $target supports default PIE) + enable_default_pie=no + case $target in +i?86*-*-linux* | x86_64*-*-linux*) + saved_LDFLAGS=$LDFLAGS + saved_CFLAGS=$CFLAGS + CFLAGS=$CFLAGS -fPIE + LDFLAGS=$LDFLAGS -fPIE -pie + AC_TRY_LINK(,,[enable_default_pie=yes],) + LDFLAGS=$saved_LDFLAGS + CFLAGS=$saved_CFLAGS + ;; +*) + ;; +esac There should not be any such hardcoding of targets here without concrete evidence that the targets for which this sets enable_default_pie=no really cannot support PIE. In particular, there is no reason at all for this to be architecture-specific; all GNU/Linux architectures should support PIE. I believe AC_TRY_LINK here will test for the host, whereas what you want to know is what's supported for the target (but it's not possible to run link tests for the target at this point; the compiler for the target hasn't even been built). So: just presume that if the user passes --enable-default-pie then they know what they are doing, and don't try to override their choice. diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi index c9e3bf1..89fc305 100644 --- a/gcc/doc/install.texi +++ b/gcc/doc/install.texi @@ -1583,6 +1583,10 @@ not be built. Specify that the run-time libraries for stack smashing protection should not be built. +@item --enable-default-pie +Turn on @option{-fPIE} and @option{-pie} by default if supported. +Currently supported targets are i?86-*-linux* and x86-64-*-linux*. The if supported and target list can then be removed here. Here is the updated patch. To support --enable-default-pie, each target must update STARTFILE_SPEC to support PIE_SPEC and NO_PIE_SPEC. I can provide STARTFILE_SPEC patch if needed. Thanks. H.J. --- gcc/ 2015-01-12 Magnus Granberg zo...@gentoo.org H.J. Lu hongjiu...@intel.com * Makefile.in (COMPILER): Add @NO_PIE_CFLAGS@. (LINKER): Add @NO_PIE_FLAG@. (libgcc.mvars): Set NO_PIE_CFLAGS to -fno-PIE for --enable-default-pie. * common.opt (fPIE): Initialize to -1. (fpie): Likewise. (static): Add RejectNegative Negative(shared). (no-pie): New option. (pie): Replace Negative(shared) with Negative(no-pie). * configure.ac: Add --enable-default-pie. (NO_PIE_CFLAGS): New. Check if -fno-PIE works. AC_SUBST. (NO_PIE_FLAG): New. Check if -no-pie works. AC_SUBST. * defaults.h (DEFAULT_FLAG_PIE): New. Default PIE to -fPIE. * gcc.c (NO_PIE_SPEC): New. (PIE_SPEC): Likewise. (LD_PIE_SPEC): Likewise. (LINK_PIE_SPEC): Handle -no-pie. Use PIE_SPEC and LD_PIE_SPEC. * opts.c (DEFAULT_FLAG_PIE): New. Set to 0 if ENABLE_DEFAULT_PIE is undefined. (finish_options): Update opts-x_flag_pie if it is -1. * config/gnu-user.h (FVTABLE_VERIFY_SPEC): New. (GNU_USER_TARGET_STARTFILE_SPEC): Use FVTABLE_VERIFY_SPEC. Use NO_PIE_SPEC and NO_PIE_SPEC if ENABLE_DEFAULT_PIE is defined. (GNU_USER_TARGET_STARTFILE_SPEC): Use FVTABLE_VERIFY_SPEC. * doc/install.texi: Document --enable-default-pie. * doc/invoke.texi: Document -no-pie. * config.in: Regenerated. * configure: Likewise. gcc/ada/ 2015-01-12 H.J. Lu hongjiu...@intel.com * gcc-interface/Makefile.in (TOOLS_LIBS): Add @NO_PIE_FLAG@. libgcc/ 2015-01-12 H.J. Lu hongjiu...@intel.com * Makefile.in (CRTSTUFF_CFLAGS): Add $(NO_PIE_CFLAGS). This is the updated patch. I fixed the -r regression. LTO tests pass now. -- H.J. -- H.J. -- H.J. -- H.J. From 46faeb11166103c6a8b1608731eccf75385fe5c9 Mon Sep 17 00:00:00 2001 From: H.J. Lu hjl.to...@gmail.com Date: Fri, 6 Mar 2015 09:07:54 -0800 Subject: [PATCH] Add --enable-default-pie option to GCC configure Add --enable-default-pie option to configure GCC to generate PIE by default. gcc/ * Makefile.in (COMPILER): Add @NO_PIE_CFLAGS@. (LINKER): Add @NO_PIE_FLAG@. (libgcc.mvars): Set NO_PIE_CFLAGS to -fno-PIE for --enable-default-pie. * common.opt (fPIE): Initialize to -1. (fpie): Likewise. (no-pie): New option. (pie): Replace Negative(shared) with Negative(no-pie). * configure.ac: Add --enable-default-pie. (NO_PIE_CFLAGS): New. Check if -fno-PIE works. AC_SUBST. (NO_PIE_FLAG): New. Check if -no-pie works. AC_SUBST. *
Re: [PATCH] PR63175 - [4.9/5 regression] FAIL: gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c scan-tree-dump-times slp2 basic block vectorized using SLP 1
On 03/02/15 09:28, Martin Sebor wrote: On 03/02/2015 06:58 AM, Richard Biener wrote: On Fri, 27 Feb 2015, Martin Sebor wrote: Given that Martin's fix to the testcase allowed it to succeed without Richi's fix for the underlying problem, is there a modification to the testcase or a new testcase that would really test the optimization? Let me work on it. Below is a patch with a couple of minor tweaks to the existing test first to update the search string and second to better exercise the vectorization not only when the source address isn't aligned on the expected boundary but also when the destination address isn't. This enhancement revealed an outstanding aspect of the regression (not fixed by Richard's already committed patch). Besides this change, the patch also adds a number of other tests to better exercise the vectorization by verifying it takes place for arrays of elements of other sizes besides word: byte, half word, and double word. Those tests reveal both another regression WRT 4.8 and further vectorization opportunities not exploited even in 4.8. I marked the latter XFAIL in the tests so that when the regression is fully resolved, the tests should pass with no unexpected failures. I have a hard time applying the patch because of line-wrapping issues or my patch tool not groking the git diffs. Can you please either commit the patch or extract the testcase that still regresses and paste it into PR63175? I pasted a couple of such test cases to the bug. The full patch is also attached to this email in case there was a problem with line wrapping. So for the unaligned case, is that really a regression when compared to earlier compilers? If not, then it seems that we ought to at least be at a point where the regression marker for that BZ can be removed, right? ie, Richi's patch fixed the actual code quality regression and your patch fixes the testsuite aspects, right? jeff
Re: RFC: PATCHES: Properly handle reference to protected data on x86
On Fri, Mar 6, 2015 at 10:29 AM, Joseph Myers jos...@codesourcery.com wrote: On Wed, 4 Mar 2015, H.J. Lu wrote: Protected symbol means that it can't be pre-emptied. It doesn't mean its address won't be external. This is true for pointer to protected function. With copy relocation, address of protected data defined in the shared library may also be external. We only know that for sure at run-time. Here are patches for glibc, binutils and GCC to handle it properly. Any comments? I don't see any testcases in the glibc patch; it seems critical to have sufficient testcases to make it easy for architecture maintainers to tell if there is an issue for their architecture (and the testcases need to have clear comments explaining any requirements on GCC and binutils for them to work - that is, comments referring to committed patches or releases rather than to anything uncommitted). https://sourceware.org/ml/libc-alpha/2015-03/msg00231.html -- H.J.
[PATCH, Fortran] Sync memory action delegated to OpenCoarrays
Dear all, so far a sync memory statement is translated into a local __sync_synchronize (). The attached draft patch delegates the action for sync memory (when -fcoarray=lib is used) to the external function _gfortran_caf_sync_memory() implemented in the OpenCoarrays library. Any suggestions? Best regards Alessandro commit fdb560e3f548fe60edf26717b8de6bb92f3fd555 Author: Alessandro Fanfarillo fanfarillo@gmail.com Date: Fri Mar 6 10:25:56 2015 -0800 Sync memory is translated into an invokation to _gfortran_caf_sync_memory() diff --git a/gcc/fortran/trans-decl.c b/gcc/fortran/trans-decl.c index 3664824..a66c25c 100644 --- a/gcc/fortran/trans-decl.c +++ b/gcc/fortran/trans-decl.c @@ -153,6 +153,7 @@ tree gfor_fndecl_caf_get; tree gfor_fndecl_caf_send; tree gfor_fndecl_caf_sendget; tree gfor_fndecl_caf_sync_all; +tree gfor_fndecl_caf_sync_memory; tree gfor_fndecl_caf_sync_images; tree gfor_fndecl_caf_error_stop; tree gfor_fndecl_caf_error_stop_str; @@ -3451,6 +3452,10 @@ gfc_build_builtin_function_decls (void) get_identifier (PREFIX(caf_sync_all)), .WW, void_type_node, 3, pint_type, pchar_type_node, integer_type_node); + gfor_fndecl_caf_sync_memory = gfc_build_library_function_decl_with_spec ( + get_identifier (PREFIX(caf_sync_memory)), .WW, void_type_node, + 3, pint_type, pchar_type_node, integer_type_node); + gfor_fndecl_caf_sync_images = gfc_build_library_function_decl_with_spec ( get_identifier (PREFIX(caf_sync_images)), .RRWW, void_type_node, 5, integer_type_node, pint_type, pint_type, diff --git a/gcc/fortran/trans-stmt.c b/gcc/fortran/trans-stmt.c index 505f905..dce0e87 100644 --- a/gcc/fortran/trans-stmt.c +++ b/gcc/fortran/trans-stmt.c @@ -768,8 +768,7 @@ gfc_trans_sync (gfc_code *code, gfc_exec_op type) else stat = null_pointer_node; - if (code-expr3 flag_coarray == GFC_FCOARRAY_LIB - type != EXEC_SYNC_MEMORY) + if (code-expr3 flag_coarray == GFC_FCOARRAY_LIB) { gcc_assert (code-expr3-expr_type == EXPR_VARIABLE); gfc_init_se (argse, NULL); @@ -778,7 +777,7 @@ gfc_trans_sync (gfc_code *code, gfc_exec_op type) errmsg = gfc_build_addr_expr (NULL, argse.expr); errmsglen = argse.string_length; } - else if (flag_coarray == GFC_FCOARRAY_LIB type != EXEC_SYNC_MEMORY) + else if (flag_coarray == GFC_FCOARRAY_LIB) { errmsg = null_pointer_node; errmsglen = build_int_cst (integer_type_node, 0); @@ -822,13 +821,13 @@ gfc_trans_sync (gfc_code *code, gfc_exec_op type) gfc_add_expr_to_block (se.pre, tmp); } - if (flag_coarray != GFC_FCOARRAY_LIB || type == EXEC_SYNC_MEMORY) + if (flag_coarray != GFC_FCOARRAY_LIB) { /* Set STAT to zero. */ if (code-expr2) gfc_add_modify (se.pre, stat, build_int_cst (TREE_TYPE (stat), 0)); } - else if (type == EXEC_SYNC_ALL) + else if (type == EXEC_SYNC_ALL || type == EXEC_SYNC_MEMORY) { /* SYNC ALL = stat == null_pointer_node SYNC ALL(stat=s) = stat has an integer type @@ -840,8 +839,13 @@ gfc_trans_sync (gfc_code *code, gfc_exec_op type) if (TREE_TYPE (stat) == integer_type_node) stat = gfc_build_addr_expr (NULL, stat); - tmp = build_call_expr_loc (input_location, gfor_fndecl_caf_sync_all, -3, stat, errmsg, errmsglen); + if(type == EXEC_SYNC_MEMORY) + tmp = build_call_expr_loc (input_location, gfor_fndecl_caf_sync_memory, + 3, stat, errmsg, errmsglen); + else + tmp = build_call_expr_loc (input_location, gfor_fndecl_caf_sync_all, + 3, stat, errmsg, errmsglen); + gfc_add_expr_to_block (se.pre, tmp); } else diff --git a/gcc/fortran/trans.h b/gcc/fortran/trans.h index bd1520a..3ba2f88 100644 --- a/gcc/fortran/trans.h +++ b/gcc/fortran/trans.h @@ -731,6 +731,7 @@ extern GTY(()) tree gfor_fndecl_caf_get; extern GTY(()) tree gfor_fndecl_caf_send; extern GTY(()) tree gfor_fndecl_caf_sendget; extern GTY(()) tree gfor_fndecl_caf_sync_all; +extern GTY(()) tree gfor_fndecl_caf_sync_memory; extern GTY(()) tree gfor_fndecl_caf_sync_images; extern GTY(()) tree gfor_fndecl_caf_error_stop; extern GTY(()) tree gfor_fndecl_caf_error_stop_str;
Re: [PATCH] ICF: move readonly decision for variables to the right place
This caused: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65334 This is another alias issue exposed by ICF. Vectorizer attempts to increase an alignment of alias by modifying its DECL_ALIGN, but it doesnot really achieve it - when outputting the definition we use DECL_ALIGN of the ultimate_alias_target. Frankly, increasing alias target is not safe thing to do: alias may be static, while target interposable and we do not want code touching actual target to assume greater alignment. I suppose only way to get this right is to make varasm to walk all aliases and figure largest alignment assumed. I am workign on a patch. Honza -- H.J.
Re: [PR58315] reset inlined debug vars at return-to point
On Feb 26, 2015, Alexandre Oliva aol...@redhat.com wrote: So far, all the differences I looked at were caused by padding at the end of BBs, and by jump stmts without line numbers at the end of BBs, both right after the debug reset stmts the proposed patch introduces. Further investigation showed there were other sources of spurious differences: - copies arising from the expansion of PHI nodes; source code information associated with these copies points at the source of the copy, which is hardly useful and sensible. - optimization of single-range location lists to a location expression covering the entire function, which extends, often incorrectly, the range of the expression, and thus the coverage of the function. By patching GCC so as to eliminate these differences (patches attached for reference), I managed to reduce the coverage differences in libgcc_s.so to a manageable size (about a dozen variables in 3 functions in 2 object files), so I could manually investigate each one of them. All remaining differences amounted to single insns, originally before the reset debug stmt introduced by the patch, computing subexpressions of expressions expanded after the return point of the inlined function. Because of TER, this often causes stmts to be moved past corresponding debug stmts. This is a long-standing and long-known issue to me, thus nothing particularly new brought about or worsened by the proposed patch, and that would be fixed with stmt frontiers. Ultimately, the issue is not so much that the insn gets emitted at a different spot, but rather the possibility that no insn will remain at the point where it was expected, which might make it impossible (with current compiler and debugger) to inspect the results of that computation. The worst the reset debug stmts introduced by this patch could do is to make those effects not visible at other points of the computation, where they are not guaranteed or expected to be visible to begin with. That said, it might be nice if we emitted partial of full expansions of subexpressions at their original location, relative to debug stmts expanded to debug insns, rather than all at the point of the full expression. I recall having started something along these lines years ago, but that project never got to a working state. The differences in libstdc++.so, even after the patches intended to minimize differences, are too many to investigate in depth, but from what I can tell from a quick glance at the diff in dwlocstat per-variable per-range coverage dumps, nearly all of the differences amount to one insn in the final range, which likely means TER-introduced differences. End scopes before location-less insns From: Alexandre Oliva aol...@redhat.com --- gcc/final.c | 25 - 1 file changed, 20 insertions(+), 5 deletions(-) diff --git a/gcc/final.c b/gcc/final.c index 1fa93d9..6fab871 100644 --- a/gcc/final.c +++ b/gcc/final.c @@ -1634,13 +1634,19 @@ choose_inner_scope (tree s1, tree s2) /* Emit lexical block notes needed to change scope from S1 to S2. */ static void -change_scope (rtx_insn *orig_insn, tree s1, tree s2) +change_scope (rtx_insn *prev_insn, rtx_insn *orig_insn, tree s1, tree s2) { rtx_insn *insn = orig_insn; + rtx_insn *einsn; tree com = NULL_TREE; tree ts1 = s1, ts2 = s2; tree s; + if (!prev_insn) +einsn = insn; + else +einsn = NEXT_INSN (prev_insn); + while (ts1 != ts2) { gcc_assert (ts1 ts2); @@ -1660,7 +1666,7 @@ change_scope (rtx_insn *orig_insn, tree s1, tree s2) s = s1; while (s != com) { - rtx_note *note = emit_note_before (NOTE_INSN_BLOCK_END, insn); + rtx_note *note = emit_note_before (NOTE_INSN_BLOCK_END, einsn); NOTE_BLOCK (note) = s; s = BLOCK_SUPERCONTEXT (s); } @@ -1684,6 +1690,7 @@ reemit_insn_block_notes (void) tree cur_block = DECL_INITIAL (cfun-decl); rtx_insn *insn; rtx_note *note; + rtx_insn *prev_insn = NULL; insn = get_insns (); for (; insn; insn = NEXT_INSN (insn)) @@ -1693,10 +1700,16 @@ reemit_insn_block_notes (void) /* Prevent lexical blocks from straddling section boundaries. */ if (NOTE_P (insn) NOTE_KIND (insn) == NOTE_INSN_SWITCH_TEXT_SECTIONS) { + rtx_insn *einsn; + if (prev_insn) + einsn = NEXT_INSN (prev_insn); + else + einsn = insn; + prev_insn = NULL; for (tree s = cur_block; s != DECL_INITIAL (cfun-decl); s = BLOCK_SUPERCONTEXT (s)) { - rtx_note *note = emit_note_before (NOTE_INSN_BLOCK_END, insn); + rtx_note *note = emit_note_before (NOTE_INSN_BLOCK_END, einsn); NOTE_BLOCK (note) = s; note = emit_note_after (NOTE_INSN_BLOCK_BEG, insn); NOTE_BLOCK (note) = s; @@ -1732,14 +1745,16 @@ reemit_insn_block_notes (void) if (this_block != cur_block) { - change_scope (insn, cur_block, this_block); + change_scope
Re: RFC: PATCHES: Properly handle reference to protected data on x86
On Wed, 4 Mar 2015, H.J. Lu wrote: Protected symbol means that it can't be pre-emptied. It doesn't mean its address won't be external. This is true for pointer to protected function. With copy relocation, address of protected data defined in the shared library may also be external. We only know that for sure at run-time. Here are patches for glibc, binutils and GCC to handle it properly. Any comments? I don't see any testcases in the glibc patch; it seems critical to have sufficient testcases to make it easy for architecture maintainers to tell if there is an issue for their architecture (and the testcases need to have clear comments explaining any requirements on GCC and binutils for them to work - that is, comments referring to committed patches or releases rather than to anything uncommitted). -- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH, Fortran] Sync memory action delegated to OpenCoarrays
Dear Alessandro, Alessandro Fanfarillo wrote: so far a sync memory statement is translated into a local __sync_synchronize (). The attached draft patch delegates the action for sync memory (when -fcoarray=lib is used) to the external function _gfortran_caf_sync_memory() implemented in the OpenCoarrays library. Looks good to me. However, you should add a test case with 'dg-options -fdump-tree-original -fcoarray=lib' to check that this works; cf. gcc/testsuite/gfortran.dg/coarray*.f90 for examples. And you have to provide a stub implementation in libgfortran/caf/{libcaf.h,single.c}. Tobias PS: I wonder whether it makes sense to remove the __sync_synchronize for -fcoarray=single and replace it by the equivalent to |asm volatile ( : : : memory). It almost certainly does. |
Re: [PATCH] PR63175 - [4.9/5 regression] FAIL: gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c scan-tree-dump-times slp2 basic block vectorized using SLP 1
On 03/06/2015 10:28 AM, Jeff Law wrote: On 03/02/15 09:28, Martin Sebor wrote: On 03/02/2015 06:58 AM, Richard Biener wrote: On Fri, 27 Feb 2015, Martin Sebor wrote: Given that Martin's fix to the testcase allowed it to succeed without Richi's fix for the underlying problem, is there a modification to the testcase or a new testcase that would really test the optimization? Let me work on it. Below is a patch with a couple of minor tweaks to the existing test first to update the search string and second to better exercise the vectorization not only when the source address isn't aligned on the expected boundary but also when the destination address isn't. This enhancement revealed an outstanding aspect of the regression (not fixed by Richard's already committed patch). Besides this change, the patch also adds a number of other tests to better exercise the vectorization by verifying it takes place for arrays of elements of other sizes besides word: byte, half word, and double word. Those tests reveal both another regression WRT 4.8 and further vectorization opportunities not exploited even in 4.8. I marked the latter XFAIL in the tests so that when the regression is fully resolved, the tests should pass with no unexpected failures. I have a hard time applying the patch because of line-wrapping issues or my patch tool not groking the git diffs. Can you please either commit the patch or extract the testcase that still regresses and paste it into PR63175? I pasted a couple of such test cases to the bug. The full patch is also attached to this email in case there was a problem with line wrapping. So for the unaligned case, is that really a regression when compared to earlier compilers? If not, then it seems that we ought to at least be at a point where the regression marker for that BZ can be removed, right? ie, Richi's patch fixed the actual code quality regression and your patch fixes the testsuite aspects, right? My interpretation of the bug report is that it points out two problems: 1) a failure in the costmodel-bb-slp-9a.c test 2) a quality regression observed by inspecting the assembly emitted for the test The two are unrelated in that (2) didn't cause (1). Since Richi's patch fixed (2) and my latest patch fixes (1) I would be inclined to consider the bug resolved. While GCC 5 doesn't vectorize some code that 4.8 does with the same options, it's apparently by accident (or due to a bug in 4.8). Since 5.0 does vectorize the same code when the right set of options is specified, I agree with others that none of my additional tests has exposed any other regressions than the one that's already been addressed. Martin
Re: PING^3: [PATCH]: New configure options that make the compiler use -fPIE and -pie as default option
On Fri, Mar 6, 2015 at 12:11 PM, Magnus Granberg zo...@gentoo.org wrote: fredag 06 mars 2015 09.31.26 skrev H.J. Lu: PING. I am enclosing the patch here for review. Have you tested it on mips? gcc pass -mno-shared if HAVE_AS_NO_SHARED is defened in config/mips/gnu-user.h. -mshared don't get enable. I am not familiar with mips. MIPS maintainer needs to look into it. -- H.J.
[patch] add LAMBDA_EXPR support to debug_tree()
Hi Jason. I know LAMBDA_EXPR will never make it to the -fdump-* files, but debugging them internally is a pain. I don't know how you fly blind :). Would it be ok to add tree dump support for them? Not pretty, but practical... Aldy commit 67794c518d439ff2f6a886b19c8e9f0ad32de43b Author: Aldy Hernandez al...@redhat.com Date: Fri Mar 6 11:54:31 2015 -0800 * ptree.c (cxx_print_lambda_node): New. (cxx_print_xnode): Handle LAMBDA_EXPR. diff --git a/gcc/cp/ptree.c b/gcc/cp/ptree.c index 79c80a3..2d0b584 100644 --- a/gcc/cp/ptree.c +++ b/gcc/cp/ptree.c @@ -204,6 +204,34 @@ cxx_print_identifier (FILE *file, tree node, int indent) } void +cxx_print_lambda_node (FILE *file, tree node, int indent) +{ + if (LAMBDA_EXPR_MUTABLE_P (node)) +fprintf (file, /mutable); + fprintf (file, default_capture_mode=[); + switch (LAMBDA_EXPR_DEFAULT_CAPTURE_MODE (node)) +{ +case CPLD_NONE: + fprintf (file, NONE); + break; +case CPLD_COPY: + fprintf (file, COPY); + break; +case CPLD_REFERENCE: + fprintf (file, CPLD_REFERENCE); + break; +default: + fprintf (file, ??); + break; +} + fprintf (file, ] ); + print_node (file, capture_list, LAMBDA_EXPR_CAPTURE_LIST (node), indent + 4); + print_node (file, this_capture, LAMBDA_EXPR_THIS_CAPTURE (node), indent + 4); + print_node (file, return_type, LAMBDA_EXPR_RETURN_TYPE (node), indent + 4); + print_node (file, closure, LAMBDA_EXPR_CLOSURE (node), indent + 4); +} + +void cxx_print_xnode (FILE *file, tree node, int indent) { switch (TREE_CODE (node)) @@ -243,6 +271,9 @@ cxx_print_xnode (FILE *file, tree node, int indent) print_node (file, pattern, DEFERRED_NOEXCEPT_PATTERN (node), indent+4); print_node (file, args, DEFERRED_NOEXCEPT_ARGS (node), indent+4); break; +case LAMBDA_EXPR: + cxx_print_lambda_node (file, node, indent); + break; default: break; }
New German PO file for 'gcc' (version 5.1-b20150208)
Hello, gentle maintainer. This is a message from the Translation Project robot. A revised PO file for textual domain 'gcc' has been submitted by the German team of translators. The file is available at: http://translationproject.org/latest/gcc/de.po (This file, 'gcc-5.1-b20150208.de.po', has just now been sent to you in a separate email.) All other PO files for your package are available in: http://translationproject.org/latest/gcc/ Please consider including all of these in your next release, whether official or a pretest. Whenever you have a new distribution with a new version number ready, containing a newer POT file, please send the URL of that distribution tarball to the address below. The tarball may be just a pretest or a snapshot, it does not even have to compile. It is just used by the translators when they need some extra translation context. The following HTML page has been updated: http://translationproject.org/domain/gcc.html If any question arises, please contact the translation coordinator. Thank you for all your work, The Translation Project robot, in the name of your translation coordinator. coordina...@translationproject.org
Re: PING^3: [PATCH]: New configure options that make the compiler use -fPIE and -pie as default option
fredag 06 mars 2015 09.31.26 skrev H.J. Lu: PING. I am enclosing the patch here for review. Have you tested it on mips? gcc pass -mno-shared if HAVE_AS_NO_SHARED is defened in config/mips/gnu-user.h. -mshared don't get enable. /Magnus G.
Re: [patch] Optimize empty class copies within a C++ return statement
On 03/06/2015 11:19 AM, Aldy Hernandez wrote: We are hitting the MODIFY_EXPR case. Indeed, _because_ we hit the MODIFY_EXPR is that we return the uninitialized temporary. For example, we start with: return retval = TARGET_EXPR D.2347, ... which becomes: stuff with D.2347 return D.2349 = D.2347; which the aforementioned MODIFY_EXPR case turns into: stuff with D.2347 return D.2349; But doesn't this still involve a MODIFY_EXPR, i.e. return retval = D.2349? Jason
Re: [patch] add LAMBDA_EXPR support to debug_tree()
OK, sure. Jason
Re: [C++ PATCH, RFC] PR c++/63959, continued
On 01/19/2015 11:28 AM, Ville Voutilainen wrote: * class.c (check_field_decls): If any field is volatile, make the class type have complex copy/move operations. Discussion on the cxx-abi list points out that this breaks ABI compatibility between C and C++, and is therefore unacceptable. Jason
Re: [patch] Optimize empty class copies within a C++ return statement
On 03/06/2015 01:46 PM, Jason Merrill wrote: On 03/06/2015 11:19 AM, Aldy Hernandez wrote: We are hitting the MODIFY_EXPR case. Indeed, _because_ we hit the MODIFY_EXPR is that we return the uninitialized temporary. For example, we start with: return retval = TARGET_EXPR D.2347, ... which becomes: stuff with D.2347 return D.2349 = D.2347; which the aforementioned MODIFY_EXPR case turns into: stuff with D.2347 return D.2349; But doesn't this still involve a MODIFY_EXPR, i.e. return retval = D.2349? If I understand you correct, no. gimplify_return_expr creates a new temporary and uses that instead of retval: else if (gimplify_ctxp-return_temp) result = gimplify_ctxp-return_temp; else { result = create_tmp_reg (TREE_TYPE (result_decl)); ... } ... ... /* Smash the lhs of the MODIFY_EXPR to the temporary we plan to use. Then gimplify the whole thing. */ if (result != result_decl) TREE_OPERAND (ret_expr, 0) = result; Aldy
Re: [patch] Optimize empty class copies within a C++ return statement
On 03/06/2015 05:01 PM, Jason Merrill wrote: On 03/06/2015 04:54 PM, Aldy Hernandez wrote: But doesn't this still involve a MODIFY_EXPR, i.e. return retval = D.2349? If I understand you correct, no. gimplify_return_expr creates a new temporary and uses that instead of retval: else if (gimplify_ctxp-return_temp) result = gimplify_ctxp-return_temp; else { result = create_tmp_reg (TREE_TYPE (result_decl)); ... } ... ... /* Smash the lhs of the MODIFY_EXPR to the temporary we plan to use. Then gimplify the whole thing. */ if (result != result_decl) TREE_OPERAND (ret_expr, 0) = result; Sounds like ret_expr is a MODIFY_EXPR. Oh, but with the wrong lhs, I see. Jason
Re: [c-family] Fix -fdump-ada-spec ICEs
Following this commit (r221088) testing dump-ada-spec-3.C with make -k check-g++ RUNTESTFLAGS=dg.exp=other/dump-ada-spec-3.C generates a lot of *.ads files in the gcc/testsuite/g++ directory which are not cleaned up after completion. Sorry about that, I thought cleanup-ada-spec would clean this up, but I misremembered. I'm going to tweak the testcase. -- Eric Botcazou
[C++ Patch] PR 65323
Hi, this is a regression about duplicate warnings with -Wzero-as-null-pointer-constant. The regression is rather old, affects 4_8-branch too, and started when check_default_argument got a perform_implicit_conversion_flags call which warns a first time, then maybe_warn_zero_as_null_pointer_constant as called by check_default_argument itself warns a second time. The latter call is even older, dates back to c++/52718, I think we can now safely remove it and keep on returning nullptr_node to avoid warning later still at the call sites (that was the point of c++/52718). Tested x86_64-linux. Thanks, Paolo. /// Index: decl.c === --- decl.c (revision 221230) +++ decl.c (working copy) @@ -11227,11 +11227,9 @@ check_default_argument (tree decl, tree arg, tsubs LOOKUP_IMPLICIT); --cp_unevaluated_operand; - if (warn_zero_as_null_pointer_constant - TYPE_PTR_OR_PTRMEM_P (decl_type) + if (TYPE_PTR_OR_PTRMEM_P (decl_type) null_ptr_cst_p (arg) - (complain tf_warning) - maybe_warn_zero_as_null_pointer_constant (arg, input_location)) + !NULLPTR_TYPE_P (TREE_TYPE (arg))) return nullptr_node; /* [dcl.fct.default]
Re: [PATCH] PR63175 - [4.9/5 regression] FAIL: gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c scan-tree-dump-times slp2 basic block vectorized using SLP 1
On Thu, 5 Mar 2015, Martin Sebor wrote: Attached is a scaled down version of the test for the bug. It fixes the scan-tree-dump-times string to match what GCC 5 prints and moves the result checking out of the test function and into main to prevent it from getting optimized away (as observed in comment #8 on the bug). The patch also adds a regression test for the bug to scan the assembly for the absence of ordinary loads and stores. Tested on ppc64le-linux. Does it look okay to everyone? Looks ok to me. Thanks, Richard.
Re: [patch] Optimize empty class copies within a C++ return statement
On 03/06/2015 04:54 PM, Aldy Hernandez wrote: But doesn't this still involve a MODIFY_EXPR, i.e. return retval = D.2349? If I understand you correct, no. gimplify_return_expr creates a new temporary and uses that instead of retval: else if (gimplify_ctxp-return_temp) result = gimplify_ctxp-return_temp; else { result = create_tmp_reg (TREE_TYPE (result_decl)); ... } ... ... /* Smash the lhs of the MODIFY_EXPR to the temporary we plan to use. Then gimplify the whole thing. */ if (result != result_decl) TREE_OPERAND (ret_expr, 0) = result; Sounds like ret_expr is a MODIFY_EXPR. Jason