Re: PING: PATCH [9/n]: Prepare x32: PR middle-end/47383: ivopts miscompiles Pmode != ptr_mode
On Tue, Jul 5, 2011 at 7:07 PM, H.J. Lu hjl.to...@gmail.com wrote:
On Tue, Jul 5, 2011 at 8:24 AM, Richard Guenther richard.guent...@gmail.com wrote:
On Tue, Jul 5, 2011 at 4:56 PM, Ulrich Weigand uweig...@de.ibm.com wrote:
H.J. Lu wrote:

However, this still seems odd to me, as I had understood the address in a TARGET_MEM_REF needs to be an *address*, i.e. use address_mode. If this is not true (has changed?) a lot of other places would need to change as well ...

I was told that TARGET_MEM_REF needs ptr_mode.

Can you elaborate? We are talking about the mode returned from addr_for_mem_ref here. I do not understand how this can be anything but an address mode:

That is an address mode, but the intermediate computation (base + index * step + offset) is done in pointer mode. The code currently performs this in address mode as well which is bogus. Which is why I suggested to use pointer mode for the computation (now, with the other target hook you mention) and then convert the result to address mode.

I am testing this patch. OK for trunk if there are no regressions?

Ok.

Thanks,
Richard.

Thanks.

--
H.J.
---
diff --git a/gcc/ChangeLog.x32 b/gcc/ChangeLog.x32
index c5edfe7..7d85746 100644
--- a/gcc/ChangeLog.x32
+++ b/gcc/ChangeLog.x32
@@ -1,5 +1,11 @@
 2011-07-05  H.J. Lu  hongjiu...@intel.com
 
+	PR middle-end/47383
+	* tree-ssa-address.c (addr_for_mem_ref): Use pointer_mode for
+	address computation and convert to address_mode if needed.
+
+2011-07-05  H.J. Lu  hongjiu...@intel.com
+
 	* tree-ssa-address.c (addr_for_mem_ref): Use
 	targetm.addr_space.address_mode.
 
diff --git a/gcc/testsuite/ChangeLog.x32 b/gcc/testsuite/ChangeLog.x32
index cde8d41..492be5c 100644
--- a/gcc/testsuite/ChangeLog.x32
+++ b/gcc/testsuite/ChangeLog.x32
@@ -1,3 +1,8 @@
+2011-07-05  H.J. Lu  hongjiu...@intel.com
+
+	PR middle-end/47383
+	* gcc.dg/pr47383.c: New.
+
 2011-06-23  H.J. Lu  hongjiu...@intel.com
 
 	* gcc.target/i386/pr49504.c (main): Check correct return value.
diff --git a/gcc/testsuite/gcc.dg/pr47383.c b/gcc/testsuite/gcc.dg/pr47383.c
new file mode 100644
index 000..3e2b9ba
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr47383.c
@@ -0,0 +1,31 @@
+/* { dg-do run { target fpic } } */
+/* { dg-options "-O2 -fPIC" } */
+
+static int heap[2*(256 +1+29)+1];
+static int heap_len;
+static int heap_max;
+void
+__attribute__ ((noinline))
+foo (int elems)
+{
+  int n, m;
+  int max_code = -1;
+  int node = elems;
+  heap_len = 0, heap_max = (2*(256 +1+29)+1);
+  for (n = 0; n < elems; n++)
+    heap[++heap_len] = max_code = n;
+  do {
+    n = heap[1];
+    heap[1] = heap[heap_len--];
+    m = heap[1];
+    heap[--heap_max] = n;
+    heap[--heap_max] = m;
+  } while (heap_len >= 2);
+}
+
+int
+main ()
+{
+  foo (286);
+  return 0;
+}
diff --git a/gcc/tree-ssa-address.c b/gcc/tree-ssa-address.c
index e3934e1..c6dced1 100644
--- a/gcc/tree-ssa-address.c
+++ b/gcc/tree-ssa-address.c
@@ -189,11 +189,12 @@ addr_for_mem_ref (struct mem_address *addr, addr_space_t as,
 		  bool really_expand)
 {
   enum machine_mode address_mode = targetm.addr_space.address_mode (as);
+  enum machine_mode pointer_mode = targetm.addr_space.pointer_mode (as);
   rtx address, sym, bse, idx, st, off;
   struct mem_addr_template *templ;
 
   if (addr->step && !integer_onep (addr->step))
-    st = immed_double_int_const (tree_to_double_int (addr->step), address_mode);
+    st = immed_double_int_const (tree_to_double_int (addr->step), pointer_mode);
   else
     st = NULL_RTX;
 
@@ -201,7 +202,7 @@ addr_for_mem_ref (struct mem_address *addr, addr_space_t as,
     off = immed_double_int_const
 	    (double_int_sext (tree_to_double_int (addr->offset),
 			      TYPE_PRECISION (TREE_TYPE (addr->offset))),
-	     address_mode);
+	     pointer_mode);
   else
     off = NULL_RTX;
 
@@ -220,16 +221,16 @@ addr_for_mem_ref (struct mem_address *addr, addr_space_t as,
       if (!templ->ref)
 	{
 	  sym = (addr->symbol ?
-		 gen_rtx_SYMBOL_REF (address_mode, ggc_strdup ("test_symbol"))
+		 gen_rtx_SYMBOL_REF (pointer_mode, ggc_strdup ("test_symbol"))
 		 : NULL_RTX);
 	  bse = (addr->base ?
-		 gen_raw_REG (address_mode, LAST_VIRTUAL_REGISTER + 1)
+		 gen_raw_REG (pointer_mode, LAST_VIRTUAL_REGISTER + 1)
 		 : NULL_RTX);
 	  idx = (addr->index ?
-		 gen_raw_REG (address_mode, LAST_VIRTUAL_REGISTER + 2)
+		 gen_raw_REG (pointer_mode, LAST_VIRTUAL_REGISTER + 2)
 		 : NULL_RTX);
 
-	  gen_addr_rtx (address_mode, sym, bse, idx,
+	  gen_addr_rtx (pointer_mode, sym, bse, idx,
 			st? const0_rtx : NULL_RTX,
 			off? const0_rtx : NULL_RTX,
 			&templ->ref,
@@ -247,16
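
To see why the mode of the intermediate computation matters, here is a minimal standalone sketch in plain C. It is not GCC code: it assumes an x32-like target where pointer mode is 32 bits wide, address mode is 64 bits wide, and the conversion between them is a zero extension. The function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the patched order: compute
   base + index * step + offset in a 32-bit "pointer mode", then
   zero-extend the result to a 64-bit "address mode".  */
static uint64_t
addr_via_pointer_mode (uint32_t base, uint32_t index, uint32_t step,
                       uint32_t offset)
{
  uint32_t sum = base + index * step + offset;  /* wraps modulo 2^32 */
  return (uint64_t) sum;
}

/* Hypothetical model of the old order: extend every operand first and do
   the arithmetic in 64 bits, so no 32-bit wraparound happens.  */
static uint64_t
addr_via_address_mode (uint32_t base, uint32_t index, uint32_t step,
                       uint32_t offset)
{
  return (uint64_t) base + (uint64_t) index * step + (uint64_t) offset;
}

/* An offset of -4, stored as the 32-bit pattern 0xfffffffc, is where the
   two orders disagree.  */
static void
check_negative_offset (void)
{
  assert (addr_via_pointer_mode (0x1000u, 0, 4, 0xfffffffcu) == 0xffcu);
  assert (addr_via_address_mode (0x1000u, 0, 4, 0xfffffffcu)
          == UINT64_C (0x100000ffc));
}
```

Under these assumptions only the first variant yields the address the 32-bit program meant (0x1000 - 4); the second one carries into bit 32 and points outside the 32-bit address range.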
Re: Ping^2: TARGET_HAVE_NAMED_SECTIONS cleanup
On Tue, Jul 5, 2011 at 9:16 PM, Joseph S. Myers jos...@codesourcery.com wrote: Ping^2. The patch http://gcc.gnu.org/ml/gcc-patches/2011-06/msg01642.html is still pending review. Ok if there are no objections from target maintainers in 24h. Thanks, Richard. -- Joseph S. Myers jos...@codesourcery.com
[PATCH, ARM, iWMMXt][2/5]: intrinsic header file change
Hi,

This is the second part of the iWMMXt maintenance.

* config/arm/mmintrin.h: Revise the iWMMXt intrinsics header file. Fix some intrinsics and add some new intrinsics.

Thanks,
Xinyu

Attachment: 2_mmintrin.diff
[PATCH, ARM, iWMMXt][0/5]: iWMMXt intrinsics maintenance and pipeline description
Hi,

Since the patch for iWMMXt intrinsics maintenance and pipeline description is too big to review in one piece, I have split it into five parts:

1. ARM generic code change
2. iWMMXt intrinsic header file change
3. iWMMXt builtin definition and expansion
4. WMMX machine description
5. WMMX pipeline description

These five parts may not work separately.

Thanks,
Xinyu
[PATCH, ARM, iWMMXt][1/5]: ARM code generic change
Hi,

This is the first part of the iWMMXt maintenance.

* config/arm/arm.c (arm_option_override): Enable iWMMXt with VFP. iWMMXt and NEON are incompatible. iWMMXt is unsupported in Thumb-2 mode.
(arm_expand_binop_builtin): Accept immediate op (with mode VOID).
* config/arm/arm.md: Move the include of iwmmxt.md so that *arm_movdi and *arm_movsi_insn can be used when iWMMXt is enabled. Add include of the pipeline description file.

Thanks,
Xinyu

Attachment: 1_generic.diff
Cleanup Solaris ASM_SPEC handling
As already noted in the Solaris configuration cleanup patch, the ASM_SPEC handling on Solaris can be simplified. This patch does this, also as a prerequisite for a followup to provide a 64-bit default Solaris/x86 configuration.

The basic observation is that there's a common part handled by both Sun as and GNU as, and parts that are assembler-specific. The common part now lives in config/sol2.h (ASM_SPEC_BASE), and I also define ASM_PIC_SPEC there. In theory, this is understood by Sun as everywhere, but GNU as only handles it on SPARC. In practice, Sun as on Solaris/x86 warns about various constructs with -K PIC, which makes the option useless.

I'm moving the 32-bit vs. 64-bit handling on Solaris/x86 to ASM_CPU_SPEC, which is already used for that purpose on SPARC.

Bootstrapped on i386-pc-solaris2.10, i386-pc-solaris2.11, and sparc-sun-solaris2.11 with Sun as/ld, GNU as/Sun ld, and GNU as/ld without regressions.

Will commit shortly together with the amd64-pc-solaris2.1? patch

	Rainer

2011-07-02  Rainer Orth  r...@cebitec.uni-bielefeld.de

	* config/sol2.h (ASM_SPEC): Split into ...
	(ASM_SPEC_BASE, ASM_PIC_SPEC): ... this.
	* config/i386/sol2.h (ASM_SPEC): Define using ASM_SPEC_BASE.
	* config/i386/sol2-bi.h (ASM_CPU_SPEC): Redefine.
	(ASM_SPEC): Use ASM_SPEC_BASE.
	* config/sparc/sol2.h (ASM_SPEC): Redefine.

diff --git a/gcc/config/i386/sol2-bi.h b/gcc/config/i386/sol2-bi.h
--- a/gcc/config/i386/sol2-bi.h
+++ b/gcc/config/i386/sol2-bi.h
@@ -31,13 +31,20 @@ along with GCC; see the file COPYING3.
 
 /* GNU as understands --32 and --64, but the native Solaris
    assembler requires -xarch=generic or -xarch=generic64 instead.  */
+#undef ASM_CPU_SPEC
+#ifdef USE_GAS
+#define ASM_CPU_SPEC %{m32:--32} %{m64:--64}
+#else
+#define ASM_CPU_SPEC %{m32:-xarch=generic} %{m64:-xarch=generic64}
+#endif
+
+/* Don't let i386/x86-64.h override i386/sol2.h version.  Since Solaris
+   10, Sun as can handle -K PIC correctly.  */
 #undef ASM_SPEC
 #ifdef USE_GAS
-#define ASM_SPEC %{m32:--32} %{m64:--64} -s %(asm_cpu)
+#define ASM_SPEC ASM_SPEC_BASE
 #else
-#define ASM_SPEC %{v:-V} %{Qy:} %{!Qn:-Qy} %{Ym,*} \
-%{m32:-xarch=generic} %{m64:-xarch=generic64} \
--s %(asm_cpu)
+#define ASM_SPEC ASM_SPEC_BASE ASM_PIC_SPEC
 #endif
 
 /* We do not need to search a special directory for startup files.  */
diff --git a/gcc/config/i386/sol2.h b/gcc/config/i386/sol2.h
--- a/gcc/config/i386/sol2.h
+++ b/gcc/config/i386/sol2.h
@@ -61,12 +61,12 @@ along with GCC; see the file COPYING3.
 
 #define ASM_CPU_SPEC
 
-/* Removed -K PIC from generic sol2.h ASM_SPEC: the Solaris 8 and 9 assembler
-   gives many warnings: R_386_32 relocation is used for symbol .text, and
+/* Don't include ASM_PIC_SPEC.  While the Solaris 8 and 9 assembler accepts
+   -K PIC, it gives many warnings:
+	R_386_32 relocation is used for symbol .text
    GNU as doesn't recognize -K at all.  */
-/* FIXME: Perhaps split between common and CPU-specific parts?  */
 #undef ASM_SPEC
-#define ASM_SPEC %{v:-V} %{Qy:} %{!Qn:-Qy} %{Ym,*} -s %(asm_cpu)
+#define ASM_SPEC ASM_SPEC_BASE
 
 #define SUBTARGET_CPU_EXTRA_SPECS \
 { cpp_subtarget,CPP_SUBTARGET_SPEC }, \
diff --git a/gcc/config/sol2.h b/gcc/config/sol2.h
--- a/gcc/config/sol2.h
+++ b/gcc/config/sol2.h
@@ -99,13 +99,12 @@ along with GCC; see the file COPYING3.
 	TARGET_SUB_OS_CPP_BUILTINS(); \
     } while (0)
 
-/* It's safe to pass -s always, even if -g is not used.  */
-#undef ASM_SPEC
-#define ASM_SPEC \
-%{v:-V} %{Qy:} %{!Qn:-Qy} %{Ym,*} -s \
-%{fpic|fpie|fPIC|fPIE:-K PIC} \
-%(asm_cpu) \
-
+/* It's safe to pass -s always, even if -g is not used.  Those options are
+   handled by both Sun as and GNU as.  */
+#define ASM_SPEC_BASE \
+%{v:-V} %{Qy:} %{!Qn:-Qy} %{Ym,*} -s %(asm_cpu)
+
+#define ASM_PIC_SPEC %{fpic|fpie|fPIC|fPIE:-K PIC}
 
 #undef LIB_SPEC
 #define LIB_SPEC \
diff --git a/gcc/config/sparc/sol2.h b/gcc/config/sparc/sol2.h
--- a/gcc/config/sparc/sol2.h
+++ b/gcc/config/sparc/sol2.h
@@ -120,6 +120,10 @@ along with GCC; see the file COPYING3.
 #define ASM_CPU_DEFAULT_SPEC ASM_CPU32_DEFAULT_SPEC
 #endif
 
+/* Both Sun as and GNU as understand -K PIC.  */
+#undef ASM_SPEC
+#define ASM_SPEC ASM_BASE_SPEC ASM_PIC_SPEC
+
 #undef CPP_CPU_SPEC
 #define CPP_CPU_SPEC \
 %{mcpu=sparclet|mcpu=tsc701:-D__sparclet__} \

--
Rainer Orth, Center for Biotechnology, Bielefeld University
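
The split into a common part and an optional PIC part relies on the C rule that adjacent string literals are concatenated. The following standalone sketch illustrates that mechanism only. The quoted definitions, the trailing space in ASM_SPEC_BASE, and the names ASM_SPEC_WITH_PIC, ASM_SPEC_NO_PIC and asm_spec are assumptions of this sketch, not text taken from the patch.

```c
#include <assert.h>
#include <string.h>

/* Assumed definitions that mirror the patch.  The trailing space in
   ASM_SPEC_BASE is an assumption of this sketch; it keeps the two pieces
   separated once they are pasted together.  */
#define ASM_SPEC_BASE "%{v:-V} %{Qy:} %{!Qn:-Qy} %{Ym,*} -s %(asm_cpu) "
#define ASM_PIC_SPEC "%{fpic|fpie|fPIC|fPIE:-K PIC}"

/* Adjacent string literals are concatenated by the compiler, so a target
   header can opt in to the PIC part simply by listing it.  */
#define ASM_SPEC_WITH_PIC ASM_SPEC_BASE ASM_PIC_SPEC
#define ASM_SPEC_NO_PIC ASM_SPEC_BASE

/* Hypothetical helper: return the spec string a target would end up with.  */
static const char *
asm_spec (int want_pic)
{
  assert (want_pic == 0 || want_pic == 1);
  return want_pic ? ASM_SPEC_WITH_PIC : ASM_SPEC_NO_PIC;
}
```

In this model a target where the assembler handles -K PIC well would use the first form, and one where it only produces warnings would use the second.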
Provide 64-bit default Solaris/x86 configuration (PR target/39150)
There has long been some clamoring for an amd64-*-solaris2 configuration similar to sparcv9-sun-solaris2. I've resisted this for quite some time, primarily because it doubles the maintenance effort of testing both the 32-bit default and 64-bit default configurations. After the recent cleanup patches to the Solaris configuration, it proved to be quite easy and straightforward to implement.

Only a few changes are worth mentioning:

* TRY_EMPTY_VM_SPACE for amd64 had to be massively reduced in host-solaris.c, otherwise mmap would fail with ENOMEM. The old value had been in the middle of the 64-bit address space, which obviously doesn't work.

* The TARGET_LD_EMULATION macro needed a clause to properly deal with the default case: at first, I had used a 32-bit GNU ld for the 64-bit configuration, which only works if you explicitly specify the 64-bit emulation. I'm leaving this in although a 32-bit gld doesn't work properly: the lto-plugin is built as a 64-bit object, thus all -flto tests fail in such a configuration.

I think practically the whole patch falls under the Solaris maintainership, with the possible exception of the change to the copy of libtool.m4 in libgo/config. This is not for the technical content, but for the special commit rules to that directory. Ian? Anyway, this part of the patch will have to go to upstream libtool. Ralf, could you take care of that?

Bootstrapped without regression on i386-pc-solaris2.10 (both 32-bit default and 64-bit default configurations), i386-pc-solaris2.11 and sparc-sun-solaris2.11 in progress.

There are two caveats, which will be addressed subsequently:

* libstdc++-abi/abi_check FAILs for the 64-bit ABI due to a mismatch between --print-multi-directory and --print-multi-os-directory. I'll report and hopefully address this subsequently; the problem already exists in the existing sparcv9-sun-solaris2 configurations.

* In the sparcv9-sun-solaris2.11 builds, the 32-bit libgo tests fail to link since they have unresolved references to __sync_bool_compare_and_swap_8 and __sync_add_and_fetch_8. I could trace this to -mv8plus being missing in that configuration. I'm uncertain where best to handle this. Eric?

Once all the bootstraps have finished, I'll commit this patch (at least the non-libgo parts) unless anything unexpected comes up.

	Rainer

2011-07-02  Rainer Orth  r...@cebitec.uni-bielefeld.de

	gcc:
	PR target/39150
	* configure.ac (gcc_cv_as_hidden): Also accept
	x86_64-*-solaris2.1[0-9]*.
	(gcc_cv_as_cfi_directive): Likewise.
	(gcc_cv_as_comdat_group_group): Likewise.
	(set_have_as_tls): Likewise.
	* configure: Regenerate.
	* config.gcc (i[34567]86-*-solaris2*): Also handle
	x86_64-*-solaris2.1[0-9]*.
	* config.host (i[34567]86-*-solaris2*): Likewise.
	* config/sparc/sol2.h (ASM_CPU_DEFAULT_SPEC): Remove.
	* config/sol2-bi.h (ASM_CPU_DEFAULT_SPEC): Redefine.
	[USE_GLD] (ARCH_DEFAULT_EMULATION): Define.
	(TARGET_LD_EMULATION): Use it.
	* config/i386/sol2.h (ASM_CPU_DEFAULT_SPEC): Define.
	(SUBTARGET_CPU_EXTRA_SPECS): Add asm_cpu_default.
	* config/i386/sol2-bi.h (ASM_CPU32_DEFAULT_SPEC): Define.
	(ASM_CPU64_DEFAULT_SPEC): Define.
	(ASM_CPU_SPEC): Use %(asm_cpu_default).
	(ASM_SPEC): Redefine.
	(DEFAULT_ARCH32_P): Define using TARGET_64BIT_DEFAULT.
	* config/host-solaris.c [__x86_64__] (TRY_EMPTY_VM_SPACE): Reduce.
	* doc/install.texi (Specific, amd64-*-solaris2.1[0-9]*): Document.
	(Specific, i?86-*-solaris2.10): Mention x86_64-*-solaris2.1[0-9]*
	configuration.
	(Specific, x86_64-*-solaris2.1[0-9]*): Document.

	gcc/ada:
	PR target/39150
	* gcc-interface/Makefile.in: Handle x86_64-solaris2.

	libgcc:
	PR target/39150
	* config.host (*-*-solaris2*): Handle x86_64-*-solaris2.1[0-9]*
	like i?86-*-solaris2.1[0-9]*.
	(i[34567]86-*-solaris2*): Also handle x86_64-*-solaris2.1[0-9]*.
	* configure.ac (i?86-*-solaris2*): Likewise.
	* configure: Regenerate.

	gcc/testsuite:
	PR target/39150
	* gcc.misc-tests/linkage.exp: Handle x86_64-*-solaris2.1[0-9]*.

	toplevel:
	PR target/39150
	* configure.ac (i[3456789]86-*-solaris2*): Also accept
	x86_64-*-solaris2.1[0-9]*.
	* configure: Regenerate.

	boehm-gc:
	PR target/39150
	* configure.ac (i?86-*-solaris2.[89]): Also accept
	x86_64-*-solaris2.1?.
	* configure: Regenerate.

	gnattools:
	PR target/39150
	* configure.ac (*86-*-solaris2*): Also accept
	x86_64-*-solaris2.1[0-9]*.
	* configure: Regenerate.

	libcpp:
	PR target/39150
	* configure.ac (host_wide_int): Handle x86_64-*-solaris2.1[0-9]
	like i[34567]86-*-solaris2.1[0-9]*.
	* configure: Regenerate.
[testsuite] Don't XFAIL gcc.dg/tree-ssa/20030807-7.c (PR tree-optimization/49647)
As described in the PR, gcc.dg/tree-ssa/20030807-7.c seems to XPASS everywhere. This patch removes the xfail.

Tested with the appropriate runtest invocation on i386-pc-solaris2.10. Ok for mainline?

	Rainer

2011-07-06  Rainer Orth  r...@cebitec.uni-bielefeld.de

	PR tree-optimization/49647
	* gcc.dg/tree-ssa/20030807-7.c: Remove xfail *-*-*.

Index: gcc/testsuite/gcc.dg/tree-ssa/20030807-7.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/20030807-7.c (revision 175909)
+++ gcc/testsuite/gcc.dg/tree-ssa/20030807-7.c (working copy)
@@ -33,5 +33,5 @@
 }
 
 /* There should be exactly one IF conditional.  */
-/* { dg-final { scan-tree-dump-times if 1 vrp1 { xfail *-*-* } } } */
+/* { dg-final { scan-tree-dump-times if 1 vrp1 } } */
 /* { dg-final { cleanup-tree-dump vrp1 } } */

--
Rainer Orth, Center for Biotechnology, Bielefeld University
[Patch,AVR]: Improve loading of 32-bit constants
For loading a 32-bit constant into a register, there is room for improvement:

* SF can be handled the same way as SI, so the patch adds a peep2 that produces a *reload_insf analogue to *reload_insi.

* If the destination register overlaps NO_LD_REGS, values already loaded into some other byte can be reused by a simple MOV. This is helpful when moving values like -2, -100, etc., because all high bytes are 0xff.

* 0.0f can be directly moved to memory.

* The mov insns contain a !d constraint. I see no reason to make d expensive and discourage the use of d-regs. A *d to hide it is better because it neither puts additional pressure on d nor discourages d.

The patch is basically a rewrite of output_reload_insisf.

Tested without regressions.

Ok to commit?

Johann

	* config/avr/avr.md (*reload_insi): Change predicate #1 to
	const_int_operand.  Ditto for peep2 producing this insn.
	Add argument to output_reload_insisf call.
	(*movsi,*movsf): Add argument to output_movsisf call.
	Change !d constraint to *d.
	(*reload_insf): New insn and new peep2 to produce it.
	* config/avr/avr-protos.h (output_movsisf): Change prototype.
	(output_reload_insisf): Change prototype.
	* config/avr/avr.c (avr_asm_len): New function.
	(output_reload_insisf): Rewrite.
	(output_movsisf): Change prototype.  Use output_reload_insisf for
	all CONST_INT and CONST_DOUBLE.  Allow moving 0.0f to memory.
	(adjust_insn_length): Add argument to output_movsisf and
	output_reload_insisf call.

Index: config/avr/avr.md === --- config/avr/avr.md (revision 175811) +++ config/avr/avr.md (working copy) @@ -402,10 +402,10 @@ (define_expand movsi -(define_peephole2 ; movsi_lreg_const +(define_peephole2 ; *reload_insi [(match_scratch:QI 2 d) (set (match_operand:SI 0 l_register_operand ) -(match_operand:SI 1 immediate_operand )) +(match_operand:SI 1 const_int_operand )) (match_dup 2)] (operands[1] != const0_rtx operands[1] != constm1_rtx) @@ -416,22 +416,26 @@ (define_peephole2 ; movsi_lreg_const ;; '*' because it is not used in rtl generation. (define_insn *reload_insi [(set (match_operand:SI 0 register_operand =r) -(match_operand:SI 1 immediate_operand i)) +(match_operand:SI 1 const_int_operand n)) (clobber (match_operand:QI 2 register_operand =d))] reload_completed - * return output_reload_insisf (insn, operands, NULL); + { +return output_reload_insisf (insn, operands, operands[2], NULL); + } [(set_attr length 8) - (set_attr cc none)]) + (set_attr cc clobber)]) (define_insn *movsi - [(set (match_operand:SI 0 nonimmediate_operand =r,r,r,Qm,!d,r) + [(set (match_operand:SI 0 nonimmediate_operand =r,r,r,Qm,*d,r) (match_operand:SI 1 general_operand r,L,Qm,rL,i,i))] (register_operand (operands[0],SImode) || register_operand (operands[1],SImode) || const0_rtx == operands[1]) - * return output_movsisf (insn, operands, NULL); + { +return output_movsisf (insn, operands, NULL_RTX, NULL); + } [(set_attr length 4,4,8,9,4,10) - (set_attr cc none,set_zn,clobber,clobber,none,clobber)]) + (set_attr cc none,set_zn,clobber,clobber,clobber,clobber)]) ;; f ;; move floating point numbers (32 bit) @@ -451,13 +455,39 @@ (define_expand movsf }) (define_insn *movsf - [(set (match_operand:SF 0 nonimmediate_operand =r,r,r,Qm,!d,r) -(match_operand:SF 1 general_operand r,G,Qm,r,F,F))] + [(set (match_operand:SF 0 nonimmediate_operand =r,r,r,Qm,*d,r) +(match_operand:SF 1 general_operand r,G,Qm,rG,F,F))] register_operand (operands[0], SFmode) - || register_operand (operands[1], SFmode) - * 
return output_movsisf (insn, operands, NULL); + || register_operand (operands[1], SFmode) + || operands[1] == CONST0_RTX (SFmode) + { +return output_movsisf (insn, operands, NULL_RTX, NULL); + } [(set_attr length 4,4,8,9,4,10) - (set_attr cc none,set_zn,clobber,clobber,none,clobber)]) + (set_attr cc none,set_zn,clobber,clobber,clobber,clobber)]) + +(define_peephole2 ; *reload_insf + [(match_scratch:QI 2 d) + (set (match_operand:SF 0 l_register_operand ) +(match_operand:SF 1 const_double_operand )) + (match_dup 2)] + operands[1] != CONST0_RTX (SFmode) + [(parallel [(set (match_dup 0) + (match_dup 1)) + (clobber (match_dup 2))])] + ) + +;; '*' because it is not used in rtl generation. +(define_insn *reload_insf + [(set (match_operand:SF 0 register_operand =r) +(match_operand:SF 1 const_double_operand F)) + (clobber (match_operand:QI 2 register_operand =d))] + reload_completed + { +return output_reload_insisf (insn, operands, operands[2], NULL); + } + [(set_attr length 8) + (set_attr cc clobber)])
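
The byte-reuse idea from the description can be sketched with a small cost model in plain C. This is only an illustration under stated assumptions, not the logic of output_reload_insisf: it assumes that a byte value not seen before costs two instructions (LDI into a d-register scratch plus MOV into the destination byte), that a byte equal to one already placed costs a single MOV, and it ignores special cases such as clearing a zero byte. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cost model: number of instructions needed to load VALUE
   into four non-LD registers when bytes that were already loaded can be
   copied with a single MOV.  */
static unsigned
insns_with_byte_reuse (uint32_t value)
{
  unsigned count = 0;

  for (int i = 0; i < 4; i++)
    {
      uint8_t byte = (uint8_t) (value >> (8 * i));
      int seen = 0;

      /* Was the same byte value already placed in a lower byte?  */
      for (int j = 0; j < i; j++)
        if ((uint8_t) (value >> (8 * j)) == byte)
          seen = 1;

      count += seen ? 1 : 2;  /* MOV only, or LDI + MOV */
    }
  return count;
}

/* -2 is 0xfffffffe: the three high bytes are all 0xff, so two of them
   can be copied instead of loaded again.  */
static void
check_minus_two (void)
{
  assert (insns_with_byte_reuse ((uint32_t) -2) == 6);
}
```

In this model a constant with four distinct bytes still costs eight instructions, which matches the length of 8 given for *reload_insi, while values such as -2 or -100 drop to six.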
Re: [testsuite] Don't XFAIL gcc.dg/tree-ssa/20030807-7.c (PR tree-optimization/49647)
On Wed, 6 Jul 2011, Rainer Orth wrote:

As described in the PR, gcc.dg/tree-ssa/20030807-7.c seems to XPASS everywhere. This patch removes the xfail.

Tested with the appropriate runtest invocation on i386-pc-solaris2.10. Ok for mainline?

Ok.

Thanks,
Richard.

	Rainer

2011-07-06  Rainer Orth  r...@cebitec.uni-bielefeld.de

	PR tree-optimization/49647
	* gcc.dg/tree-ssa/20030807-7.c: Remove xfail *-*-*.

Index: gcc/testsuite/gcc.dg/tree-ssa/20030807-7.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/20030807-7.c (revision 175909)
+++ gcc/testsuite/gcc.dg/tree-ssa/20030807-7.c (working copy)
@@ -33,5 +33,5 @@
 }
 
 /* There should be exactly one IF conditional.  */
-/* { dg-final { scan-tree-dump-times if 1 vrp1 { xfail *-*-* } } } */
+/* { dg-final { scan-tree-dump-times if 1 vrp1 } } */
 /* { dg-final { cleanup-tree-dump vrp1 } } */
Re: [PATCH] Fix configure --with-cloog
Hello,

This patch fixes an issue when building with cloog and gmp installed in separate custom directories.

How to reproduce:
- Make sure you've installed cloog and gmp in separate directories (i.e. ${WITH-CLOOG-PATH}/lib doesn't contain libgmp).
- Make sure neither gmp nor cloog is installed in a directory searched by default by your linker when looking for libs.
- Launch the configure script with both the --with-gmp and --with-cloog switches properly set.

This results in an unexpected error while configuring:
error: Unable to find a usable CLooG.

2011-07-06  Romain Geissler  romain.geiss...@gmail.com

	* configure: Add $gmplibs to cloog $LDFLAGS

Index: configure
===================================================================
--- configure (revision 175709)
+++ configure (working copy)
@@ -5713,7 +5713,7 @@ if test x$with_cloog != xno; then
 
   CFLAGS=${CFLAGS} ${clooginc} ${gmpinc}
   CPPFLAGS=${CPPFLAGS} ${_cloogorginc}
-  LDFLAGS=${LDFLAGS} ${clooglibs}
+  LDFLAGS=${LDFLAGS} ${clooglibs} ${gmplibs}
 
   case $cloog_backend in
     ppl-legacy)

I forgot that configure is a generated script. Here is the patch that fixes it at the m4 macro level:

2011-07-06  Romain Geissler  romain.geiss...@gmail.com

	* config/cloog.m4: Add $gmplibs to cloog $LDFLAGS
	* configure: Regenerate

Index: config/cloog.m4
===================================================================
--- config/cloog.m4 (revision 175907)
+++ config/cloog.m4 (working copy)
@@ -142,7 +142,7 @@ AC_DEFUN([CLOOG_FIND_FLAGS],
   dnl clooglibs clooginc may have been initialized by CLOOG_INIT_FLAGS.
   CFLAGS=${CFLAGS} ${clooginc} ${gmpinc}
   CPPFLAGS=${CPPFLAGS} ${_cloogorginc}
-  LDFLAGS=${LDFLAGS} ${clooglibs}
+  LDFLAGS=${LDFLAGS} ${clooglibs} ${gmplibs}
 
   case $cloog_backend in
     ppl-legacy)
PATCH TRUNK: [gcc/configure.ac] Generate GCCPLUGIN_VERSION_* macros
Hello All,

The file plugin-version.h is now generated by gcc/configure.ac. It contains version information (about the GCC supposed to load the plugin compiled with it) as constant strings.

But I think it will also help some plugin developers if that file (which is packaged in gcc-4.6-plugin-dev on Debian/Sid, i.e. as the plugin development package) contained preprocessor macros defining the same versions.

So this patch generates constant macros like

#define GCCPLUGIN_VERSION_STRING 4.7.0
#define GCCPLUGIN_VERSION_MAJOR 4
#define GCCPLUGIN_VERSION_MINOR 7
#define GCCPLUGIN_VERSION_MICRO 0
#define GCCPLUGIN_VERSION (GCCPLUGIN_VERSION_MAJOR*1000 + GCCPLUGIN_VERSION_MINOR)
#define GCCPLUGIN_DEVPHASE experimental
#define GCCPLUGIN_REVISION [trunk revision 175910]

in the plugin-version.h file.

I believe it can help to make plugin code more robust. A serious plugin developer could then add to his plugin code something like

#if GCCPLUGIN_VERSION != 4007
#error this plugin can be built only for GCC 4.7
#endif

and with such a feature the plugin won't even compile if, for one reason or another, the wrong gcc has been considered, i.e. passed with -I$(gcc -print-file-name=plugin)

This brings some help to the careful plugin coder, and doesn't harm GCC itself.

For GCC trunk or branches we have the BUILDING_GCC_VERSION macro, but it does not appear in the headers installed by the gcc-4.6-plugin-dev package.

Plugins can currently only check for version compatibility at plugin dlopen time, not at plugin build time!

I am attaching the diff file gccplugin_rev_configure_r175910.diff against trunk 175910

### gcc/ChangeLog entry
2011-07-06  Basile Starynkevitch  bas...@starynkevitch.net

	* configure.ac (plugin-version.h): Generate
	GCCPLUGIN_VERSION_STRING, GCCPLUGIN_VERSION_MAJOR,
	GCCPLUGIN_VERSION_MINOR, GCCPLUGIN_VERSION_MICRO,
	GCCPLUGIN_VERSION_NUMBER, GCCPLUGIN_DEVPHASE, GCCPLUGIN_REVISION
	macros.
	* configure: Regenerate.
#

Ok for trunk, with what changes?

Regards.

PS. While this would help the MELT plugin, I am very sure it could help other plugin coders!

--
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basileatstarynkevitchdotnet mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mines, sont seulement les miennes} ***

Index: gcc/configure
===================================================================
--- gcc/configure (revision 175910)
+++ gcc/configure (working copy)
@@ -11072,6 +11072,14 @@
 cat plugin-version.h EOF
 #include configargs.h
 
+#define GCCPLUGIN_VERSION_STRING $gcc_BASEVER
+#define GCCPLUGIN_VERSION_MAJOR `cut -d. -f1 $srcdir/BASE-VER`
+#define GCCPLUGIN_VERSION_MINOR `cut -d. -f2 $srcdir/BASE-VER`
+#define GCCPLUGIN_VERSION_MICRO `cut -d. -f3 $srcdir/BASE-VER`
+#define GCCPLUGIN_VERSION (GCCPLUGIN_VERSION_MAJOR*1000 + GCCPLUGIN_VERSION_MINOR)
+#define GCCPLUGIN_DEVPHASE $gcc_DEVPHASE
+#define GCCPLUGIN_REVISION $gcc_REVISION
+
 static char basever[] = $gcc_BASEVER;
 static char datestamp[] = $gcc_DATESTAMP;
 static char devphase[] = $gcc_DEVPHASE;
@@ -17623,7 +17631,7 @@
 lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
 lt_status=$lt_dlunknown
 cat conftest.$ac_ext _LT_EOF
-#line 17626 configure
+#line 17634 configure
 #include confdefs.h
 
 #if HAVE_DLFCN_H
@@ -17729,7 +17737,7 @@
 lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
 lt_status=$lt_dlunknown
 cat conftest.$ac_ext _LT_EOF
-#line 17732 configure
+#line 17740 configure
 #include confdefs.h
 
 #if HAVE_DLFCN_H
Index: gcc/configure.ac
===================================================================
--- gcc/configure.ac (revision 175910)
+++ gcc/configure.ac (working copy)
@@ -1511,6 +1511,14 @@
 cat plugin-version.h EOF
 #include configargs.h
 
+#define GCCPLUGIN_VERSION_STRING $gcc_BASEVER
+#define GCCPLUGIN_VERSION_MAJOR `cut -d. -f1 $srcdir/BASE-VER`
+#define GCCPLUGIN_VERSION_MINOR `cut -d. -f2 $srcdir/BASE-VER`
+#define GCCPLUGIN_VERSION_MICRO `cut -d. -f3 $srcdir/BASE-VER`
+#define GCCPLUGIN_VERSION (GCCPLUGIN_VERSION_MAJOR*1000 + GCCPLUGIN_VERSION_MINOR)
+#define GCCPLUGIN_DEVPHASE $gcc_DEVPHASE
+#define GCCPLUGIN_REVISION $gcc_REVISION
+
 static char basever[] = $gcc_BASEVER;
 static char datestamp[] = $gcc_DATESTAMP;
 static char devphase[] = $gcc_DEVPHASE;
[Patch,testsuite]: target-supports.exp: Disable -fprofile-generate for AVR
AVR tests will fail if -fprofile-generate is given because that is not (yet) implemented.

CCed the avr port maintainers in case they have objections.

Ok to commit?

Johann

	* lib/target-supports.exp (check_profiling_available): Disable
	profiling with -fprofile-generate for target avr.

Index: lib/target-supports.exp
===================================================================
--- lib/target-supports.exp (revision 175811)
+++ lib/target-supports.exp (working copy)
@@ -497,6 +497,13 @@ proc check_profiling_available { test_wh
 
     # Tree profiling requires TLS runtime support.
     if { $test_what == -fprofile-generate } {
+	# Target AVR does not support profile generation because
+	# it does not implement needed support functions.
+	# A call to check_effective_target_tls_runtime won't
+	# reveal that.
+	if { [istarget avr-*-*] } {
+	    return 0
+	}
 	return [check_effective_target_tls_runtime]
     }
 
Index: gcc.dg/tree-ssa/vrp51.c
===================================================================
--- gcc.dg/tree-ssa/vrp51.c (revision 175811)
+++ gcc.dg/tree-ssa/vrp51.c (working copy)
@@ -1,6 +1,7 @@
 /* PR tree-optimization/28632 */
 /* { dg-do compile } */
 /* { dg-options -O2 -ftree-vrp } */
+/* { dg-require-effective-target int32plus } */
 
 void
 v4 (unsigned a, unsigned b)
Re: [PATCH] Address lowering [1/3] Main patch
On Tue, Jul 5, 2011 at 3:59 PM, William J. Schmidt wschm...@linux.vnet.ibm.com wrote: (Sorry for the late response; yesterday was a holiday here.) On Mon, 2011-07-04 at 16:21 +0200, Richard Guenther wrote: On Thu, Jun 30, 2011 at 4:39 PM, William J. Schmidt wschm...@linux.vnet.ibm.com wrote: This is the first of three patches related to lowering addressing expressions to MEM_REFs and TARGET_MEM_REFs in late gimple. This patch contains the new pass together with supporting changes in existing modules. The second patch contains an independent change to the RTL forward propagator to keep it from undoing an optimization made in the first patch. The third patch contains new test cases and changes to existing test cases. Although I've broken it up into three patches to make the review easier, it would be best to commit at least the first and third together to avoid regressions. The second can stand alone. I've done regression tests on powerpc64 and x86_64, and have asked Andreas Krebbel to test against the IBM z (390) platform. I've done performance regression testing on powerpc64. The only performance regression of note is the 2% degradation to 188.ammp due to loss of field disambiguation information. As discussed in another thread, fixing this introduces more complexity than it's worth. Are there also performance improvements? What about code size? Yes, there are performance improvements. I've been running cpu2000 on 32- and 64-bit powerpc64. Thirteen tests show measurable improvements between 1% and 9%, with 187.facerec showing the largest improvements for both 32 and 64. I don't have formal code size results, but anecdotally from code crawling, I have seen code size either neutral or slightly improved. The largest code size improvements I've seen were on 32-bit code where the commoning allowed removal of a number of sign-extend and zero-extend instructions that were otherwise not seen to be redundant. 
I tried to get an understanding to what kind of optimizations this patch produces based on the test of testcases you added, but I have a hard time here. Can you outline some please? The primary goal is to clean up code such as is shown in the original post of PR46556. In late 2007 there were some changes made to address canonicalization that caused the code gen to be suboptimal on powerpc64. At that time you and others suggested a pattern recognizer prior to expand as probably the best solution, similar to what IVopts is doing. The PR46556 case looks quite simple. By using the same mem_ref generation machinery used by IVopts, together with local CSE, the goal was to ensure base registers are properly shared so that optimal code is generated, particularly for cases of shared addressability to structures and arrays. I also observed cases where it was useful to extend the sharing across the dominator tree. As you are doing IV selection per individual statement only, using the affine combination machinery looks quite a big hammer to me. Especially as it is hard to imagine what the side-effects are, apart from re-associating dependencies that do not fit the MEM-REF and making the MEM-REF as complicated as permitted by the target. What I thought originally when suggesting to do something similar to IVOPTs was to build a list of candidates and uses and optimize that set using a cost function similar to how IVOPTs does. Doing addressing-mode selection locally per statement seems like more a task for a few pattern matchers, for example in tree-ssa-forwprop.c (for its last invocation). One pattern would be that of PR46556, MEM[(p + ((n + 16)*4))] which we can transform to TARGET_MEM_REF[x + 64] with x = p + n*4 if ((n + 16)*4)) was a single-use. The TARGET_MEM_REF generation can easily re-use the address-description and target-availability checks from tree-ssa-address.c. 
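
The PR46556 rewrite mentioned above can be checked at the source level with a small standalone sketch. It is plain C rather than GIMPLE, assumes a 4-byte int, and uses hypothetical function names; it only shows that MEM[p + ((n + 16)*4)] and the form with x = p + n*4 followed by a constant offset of 64 name the same location.

```c
#include <assert.h>

/* Original form: the constant 16 is buried inside the scaled index.  */
static int
load_original (const int *p, long n)
{
  assert (sizeof (int) == 4);
  return *(const int *) ((const char *) p + (n + 16) * 4);
}

/* Rewritten form: x = p + n*4 is a shareable base, and the remaining
   16*4 = 64 becomes a plain constant offset.  */
static int
load_rewritten (const int *p, long n)
{
  const char *x;

  assert (sizeof (int) == 4);
  x = (const char *) p + n * 4;
  return *(const int *) (x + 64);
}
```

If several accesses in a block differ only in the constant, the second form lets them share x, which is the kind of base-register sharing the discussion is about.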
I would be at least interested in whether handling the pattern from PR46556 alone (or maybe with a few similar other cases) is responsible for the performance improvements. Ideally we'd of course have a cost driven machinery that considers (part of) the whole function. Secondarily, I noticed that once this was cleaned up we still had the suboptimal code generation for the zero-offset mem refs scenario outlined in the code commentary. The extra logic to clean this up helps keep register pressure down. I've observed some spill code reduction when this is in place. Again, using expression availability from dominating blocks helps here in some cases as well. Yeah, the case is quite odd and doesn't really fit existing optimizers given that the CSE opportunity is hidden within the TARGET_MEM_REF ... I still do not like the implementation of yet another CSE machinery given that we already have two. I think most of the need for CSE comes from the use of the affine combination framework and force_gimple_operand. In fact I'd be interested to see cases
Re: PATCH TRUNK: [gcc/configure.ac] Generate GCCPLUGIN_VERSION_* macros
On Wed, Jul 6, 2011 at 2:50 PM, Basile Starynkevitch bas...@starynkevitch.net wrote:

Hello All,

The file plugin-version.h is now generated by gcc/configure.ac. It contains version information (about the GCC supposed to load the plugin compiled with it) as constant strings. But I think it would also help some plugin developers if that file (which is packaged in gcc-4.6-plugin-dev on Debian/Sid, i.e. as the plugin development package) contained preprocessor macros defining the same versions.

So this patch generates constant macros like

  #define GCCPLUGIN_VERSION_STRING 4.7.0
  #define GCCPLUGIN_VERSION_MAJOR 4
  #define GCCPLUGIN_VERSION_MINOR 7
  #define GCCPLUGIN_VERSION_MICRO 0
  #define GCCPLUGIN_VERSION (GCCPLUGIN_VERSION_MAJOR*1000 + GCCPLUGIN_VERSION_MINOR)
  #define GCCPLUGIN_DEVPHASE experimental
  #define GCCPLUGIN_REVISION [trunk revision 175910]

in the plugin-version.h file.

I believe it can help to make plugin code more robust. A serious plugin developer could then add to his plugin code something like

  #if GCCPLUGIN_VERSION != 4007
  #error this plugin can be built only for GCC 4.7
  #endif

and with such a feature the plugin won't even compile if, for one reason or another, the wrong gcc has been considered, i.e. passed with -I$(gcc -print-file-name=plugin).

This brings some help to the careful plugin coder, and doesn't harm GCC itself. For GCC trunk or branches we have the BUILDING_GCC_VERSION macro, but it does not appear in the headers installed by the gcc-4.6-plugin-dev package. Plugins can currently only check for version compatibility at plugin dlopen time, not at plugin build time!

I'd say exposing major, minor and patchlevel (instead of micro) should be enough. Similar to what the host compiler gives you via __GNUC__, __GNUC_MINOR__ and __GNUC_PATCHLEVEL__.

Richard.
I am attaching the diff file gccplugin_rev_configure_r175910.diff against trunk 175910 ### gcc/ChangeLog entry 2011-07-06 Basile Starynkevitch bas...@starynkevitch.net * configure.ac (plugin-version.h): Generate GCCPLUGIN_VERSION_STRING, GCCPLUGIN_VERSION_MAJOR, GCCPLUGIN_VERSION_MINOR, GCCPLUGIN_VERSION_MICRO, GCCPLUGIN_VERSION_NUMBER, GCCPLUGIN_DEVPHASE, GCCPLUGIN_REVISION macros. * configure: Regenerate. # Ok for trunk, with what changes? Regards. PS. While this would help the MELT plugin, I am very sure it could help other plugins coders! -- Basile STARYNKEVITCH http://starynkevitch.net/Basile/ email: basileatstarynkevitchdotnet mobile: +33 6 8501 2359 8, rue de la Faiencerie, 92340 Bourg La Reine, France *** opinions {are only mines, sont seulement les miennes} ***
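For illustration, here is how a plugin source file might use the proposed macros for a build-time check. This is a sketch only: the GCCPLUGIN_VERSION_* macros exist only in the patch under discussion, so the block defines stand-in values under #ifndef (an assumption made so that the example compiles on its own), and plugin_built_for_version is a hypothetical helper.

```c
/* Stand-ins for what the proposed plugin-version.h would define.
   These values are assumptions for the sake of a self-contained example. */
#ifndef GCCPLUGIN_VERSION_MAJOR
#define GCCPLUGIN_VERSION_MAJOR 4
#define GCCPLUGIN_VERSION_MINOR 7
#endif
#ifndef GCCPLUGIN_VERSION
#define GCCPLUGIN_VERSION (GCCPLUGIN_VERSION_MAJOR * 1000 + GCCPLUGIN_VERSION_MINOR)
#endif

/* Refuse to compile against headers from any GCC other than 4.7. */
#if GCCPLUGIN_VERSION != 4007
#error "this plugin can be built only for GCC 4.7"
#endif

/* Hypothetical helper: report the version the plugin was built for. */
int plugin_built_for_version(void)
{
    return GCCPLUGIN_VERSION;
}
```

Because the macro expands to an integer constant expression, the preprocessor can evaluate it in #if, which is what makes the check happen at plugin build time rather than at dlopen time.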
[PATCH] Fix PR49645, with C FE pieces
This fixes PR49645 - with MEM_REF the value-numbering machinery to look through aggregate copies wasn't working reliably as we have two representations for X, X and MEM[&X]. The following patch fixes that by internally always using the more complicated representation.

The patch needs consistent DECL_HARD_REGISTER settings to avoid generating MEM_REFs for them though and the C frontend fails to set that flag for global variables - hence the c-decl.c part (otherwise compile.exp 20041119-1.c ICEs).

Bootstrapped and tested on x86_64-unknown-linux-gnu, are the C frontend parts ok for trunk?

Thanks,
Richard.

2011-07-06 Richard Guenther rguent...@suse.de

        PR tree-optimization/49645
        * c-decl.c (finish_decl): Also set DECL_HARD_REGISTER for global
        register variables.
        * tree-ssa-sccvn.c (vn_reference_op_eq): Disregard differences
        in type qualification here ...
        (copy_reference_ops_from_ref): ... not here.
        (vn_reference_lookup_3): ... or here.
        (copy_reference_ops_from_ref): Record decl bases as MEM[&decl].
        (vn_reference_lookup): Do the lookup with a valueized ao-ref.

        * g++.dg/tree-ssa/pr8781.C: Disable SRA.

Index: gcc/c-decl.c
===
*** gcc/c-decl.c (revision 175905)
--- gcc/c-decl.c (working copy)
*** finish_decl (tree decl, location_t init_
*** 4357,4362
--- 4357,4364
               when a tentative file-scope definition is seen.
               But at end of compilation, do output code for them.  */
            DECL_DEFER_OUTPUT (decl) = 1;
+           if (asmspec && C_DECL_REGISTER (decl))
+             DECL_HARD_REGISTER (decl) = 1;
            rest_of_decl_compilation (decl, true, 0);
          }
        else
Index: gcc/tree-ssa-sccvn.c
===
*** gcc/tree-ssa-sccvn.c (revision 175905)
--- gcc/tree-ssa-sccvn.c (working copy)
*** vn_reference_op_eq (const void *p1, cons
*** 391,401
    const_vn_reference_op_t const vro1 = (const_vn_reference_op_t) p1;
    const_vn_reference_op_t const vro2 = (const_vn_reference_op_t) p2;

!   return vro1->opcode == vro2->opcode
!     && types_compatible_p (vro1->type, vro2->type)
!     && expressions_equal_p (vro1->op0, vro2->op0)
!     && expressions_equal_p (vro1->op1, vro2->op1)
!     && expressions_equal_p (vro1->op2, vro2->op2);
  }

  /* Compute the hash for a reference operand VRO1.  */
--- 391,405
    const_vn_reference_op_t const vro1 = (const_vn_reference_op_t) p1;
    const_vn_reference_op_t const vro2 = (const_vn_reference_op_t) p2;

!   return (vro1->opcode == vro2->opcode
!           /* We do not care for differences in type qualification.  */
!           && (vro1->type == vro2->type
!               || (vro1->type && vro2->type
!                   && types_compatible_p (TYPE_MAIN_VARIANT (vro1->type),
!                                          TYPE_MAIN_VARIANT (vro2->type))))
!           && expressions_equal_p (vro1->op0, vro2->op0)
!           && expressions_equal_p (vro1->op1, vro2->op1)
!           && expressions_equal_p (vro1->op2, vro2->op2));
  }

  /* Compute the hash for a reference operand VRO1.  */
*** copy_reference_ops_from_ref (tree ref, V
*** 579,585
        memset (&temp, 0, sizeof (temp));
        /* We do not care for spurious type qualifications.  */
!       temp.type = TYPE_MAIN_VARIANT (TREE_TYPE (ref));
        temp.opcode = TREE_CODE (ref);
        temp.op0 = TMR_INDEX (ref);
        temp.op1 = TMR_STEP (ref);
--- 583,589
        memset (&temp, 0, sizeof (temp));
        /* We do not care for spurious type qualifications.  */
!       temp.type = TREE_TYPE (ref);
        temp.opcode = TREE_CODE (ref);
        temp.op0 = TMR_INDEX (ref);
        temp.op1 = TMR_STEP (ref);
*** copy_reference_ops_from_ref (tree ref, V
*** 610,617
        vn_reference_op_s temp;

        memset (&temp, 0, sizeof (temp));
!       /* We do not care for spurious type qualifications.  */
!       temp.type = TYPE_MAIN_VARIANT (TREE_TYPE (ref));
        temp.opcode = TREE_CODE (ref);
        temp.off = -1;
--- 614,620
        vn_reference_op_s temp;

        memset (&temp, 0, sizeof (temp));
!       temp.type = TREE_TYPE (ref);
        temp.opcode = TREE_CODE (ref);
        temp.off = -1;
*** copy_reference_ops_from_ref (tree ref, V
*** 676,691
                temp.off = off.low;
            }
          break;
        case STRING_CST:
        case INTEGER_CST:
        case COMPLEX_CST:
        case VECTOR_CST:
        case REAL_CST:
        case CONSTRUCTOR:
-       case VAR_DECL:
-       case PARM_DECL:
-       case CONST_DECL:
-       case RESULT_DECL:
        case SSA_NAME:
          temp.op0 = ref;
          break;
--- 679,711
                temp.off = off.low;
            }
          break;
+       case VAR_DECL:
+         if (DECL_HARD_REGISTER (ref))
+           {
+
Re: [Patch,testsuite]: target-supports.exp: Disable -fprofile-generate for AVR
Georg-Johann Lay a...@gjlay.de writes: Index: lib/target-supports.exp === --- lib/target-supports.exp (revision 175811) +++ lib/target-supports.exp (working copy) @@ -497,6 +497,13 @@ proc check_profiling_available { test_wh # Tree profiling requires TLS runtime support. if { $test_what == -fprofile-generate } { + # Target AVR does not support profile generation because ^ Leave out the `Target' + # it does not implement needed support functions. + # A call to check_effective_target_tls_runtime won't + # reveal that. Omit the second sentence: it isn't supposed to, but just documents a general requirement of -fprofile-generate. + if { [istarget avr-*-*] } { + return 0 + } return [check_effective_target_tls_runtime] } Index: gcc.dg/tree-ssa/vrp51.c === --- gcc.dg/tree-ssa/vrp51.c (revision 175811) +++ gcc.dg/tree-ssa/vrp51.c (working copy) @@ -1,6 +1,7 @@ /* PR tree-optimization/28632 */ /* { dg-do compile } */ /* { dg-options -O2 -ftree-vrp } */ +/* { dg-require-effective-target int32plus } */ void v4 (unsigned a, unsigned b) This is completely unrelated to the first; please don't mix such patches in one post. Thanks. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [PATCH] Fix PR49645, with C FE pieces
Hi, On Wed, 6 Jul 2011, Richard Guenther wrote: *** copy_reference_ops_from_ref (tree ref, V *** 579,585 memset (temp, 0, sizeof (temp)); /* We do not care for spurious type qualifications. */ ! temp.type = TYPE_MAIN_VARIANT (TREE_TYPE (ref)); temp.opcode = TREE_CODE (ref); temp.op0 = TMR_INDEX (ref); temp.op1 = TMR_STEP (ref); --- 583,589 memset (temp, 0, sizeof (temp)); /* We do not care for spurious type qualifications. */ ! temp.type = TREE_TYPE (ref); The comment is superfluous now. Ciao, Michael.
Re: [Patch,AVR]: Improve loading of 32-bit constants
2011/7/6 Georg-Johann Lay a...@gjlay.de:

For loading a 32-bit constant in a register, there is room for improvement:

* SF can be handled the same way as SI and therefore the patch adds a peep2 to produce a *reload_insf analogue to *reload_insi.
* If the destination register overlaps NO_LD_REGS, values already loaded into some other byte can be reused by a simple MOV. This is helpful when moving values like, e.g., -2, -100 etc. because all high bytes are 0xff.
* 0.0f can be directly moved to memory.
* The mov insns contain a !d constraint. I see no reason to make d expensive and discourage use of d-regs. Hiding it with *d is better because that neither puts additional pressure on d nor discourages d.

I would like to see some real code examples.

Denis.
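The byte-reuse point in the list above can be made concrete with a small count. This is not code from the AVR backend; it is a minimal sketch, with a hypothetical function name, that counts how many distinct byte values a 32-bit constant contains. Under the assumption that every repeated byte could be copied with a register move instead of a fresh immediate load, that count is the number of immediate loads needed.

```c
#include <stdint.h>

/* Count the distinct byte values in a 32-bit constant.  Each distinct
   value would need one immediate load; repeated bytes could be copied
   from a byte that was already loaded (assumption of this sketch). */
int distinct_bytes(uint32_t value)
{
    uint8_t seen[4];
    int n_seen = 0;

    for (int i = 0; i < 4; i++) {
        uint8_t b = (uint8_t)(value >> (8 * i));
        int found = 0;

        for (int j = 0; j < n_seen; j++)
            if (seen[j] == b)
                found = 1;
        if (!found)
            seen[n_seen++] = b;
    }
    return n_seen;
}
```

For -2 (0xfffffffe) and -100 (0xffffff9c) only two byte values occur, because the three high bytes are all 0xff, which is the case the message singles out.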
Re: [PATCH] Fix PR49645, with C FE pieces
On Wed, 6 Jul 2011, Richard Guenther wrote: * c-decl.c (finish_decl): Also set DECL_HARD_REGISTER for global register variables. OK. -- Joseph S. Myers jos...@codesourcery.com
--enable-gnu-indirect-function patch
This patch: http://gcc.gnu.org/ml/gcc-patches/2010-09/msg02070.html for x86_64-*-linux* sets the default for --enable-gnu-indirect-function to glibc-2011. This string is not used anywhere else in gcc as far as I can see. What is the purpose of that? The original version of the patch http://gcc.gnu.org/ml/gcc-patches/2010-09/msg01546.html actually checked the glibc version in a native configuration. Why did you change this? i[34567]86-*-linux* instead sets the default to yes. I don't know why the two targets should be treated differently here. The current situation is that gcc configured for x86_64-gnu-linux will not support IFUNC unless the user knows to use the --enable-gnu-indirect-function option at configure time. That does not seem desirable. Ian
patch committed: Correct configure option name in docs
This patch corrects the name of the configure option --enable-gnu-indirect-function in the docs to correspond to the source. Committed as obvious. Ian 2011-07-06 Ian Lance Taylor i...@google.com * doc/install.texi (Configuration): It's --enable-gnu-indirect-function, not --enable-indirect-function. Index: doc/install.texi === --- doc/install.texi (revision 175914) +++ doc/install.texi (working copy) @@ -1245,7 +1245,7 @@ destructors, but requires __cxa_atexit i only available on systems with GNU libc. When enabled, this will cause @option{-fuse-cxa-atexit} to be passed by default. -@item --enable-indirect-function +@item --enable-gnu-indirect-function Define if you want to enable the @code{ifunc} attribute. This option is currently only available on systems with GNU libc on certain targets.
Re: PATCH TRUNK: [gcc/configure.ac] Generate GCCPLUGIN_VERSION_* macros
On Wed, Jul 6, 2011 at 3:46 PM, Basile Starynkevitch bas...@starynkevitch.net wrote: On Wed, Jul 06, 2011 at 03:21:47PM +0200, Richard Guenther wrote: On Wed, Jul 6, 2011 at 2:50 PM, Basile Starynkevitch bas...@starynkevitch.net wrote: I believe it can help to make plugin code more robust. A serious plugin developer could then add to his plugin code something like #if GCCPLUGIN_VERSION != 4007 #error this plugin can be built only for GCC 4.7 #endif and with such a feature the plugin won't even compile if, for one reason or another, the wrong gcc has been considered, i.e. passed with -I$(gcc -print-file-name=plugin). This brings some help to the careful plugin coder, and doesn't harm GCC itself. For GCC trunk or branches we have the BUILDING_GCC_VERSION macro, but it does not appear in the headers installed by the gcc-4.6-plugin-dev package. Plugins can currently only check for version compatibility at plugin dlopen time, not at plugin build time! I'd say exposing major, minor and patchlevel (instead of micro) should be enough. Thanks for the comment. I am attaching a diff to GCC trunk 175912, and I documented the feature. gcc/ChangeLog entry # 2011-07-06 Basile Starynkevitch bas...@starynkevitch.net * configure.ac (plugin-version.h): Generate GCCPLUGIN_VERSION_MAJOR, GCCPLUGIN_VERSION_MINOR, GCCPLUGIN_VERSION_PATCHLEVEL, GCCPLUGIN_VERSION constant integer macros. * configure: Regenerate. * doc/plugins.texi (Building GCC plugins): Mention GCCPLUGIN_VERSION ... constant macros in plugin-version.h. ### end of gcc/ChangeLog entry Ok for trunk? I'm not sure using cut is portable enough - for bversion.h generation we use sed instead (see Makefile.in), so I suppose copying that would be better. I can't approve the configury changes but the change itself looks reasonable to me. Thanks, Richard. Regards.
-- Basile STARYNKEVITCH http://starynkevitch.net/Basile/ email: basileatstarynkevitchdotnet mobile: +33 6 8501 2359 8, rue de la Faiencerie, 92340 Bourg La Reine, France *** opinions {are only mines, sont seulement les miennes} ***
Re: --enable-gnu-indirect-function patch
On 07/06/11 14:57, Ian Lance Taylor wrote: This patch: http://gcc.gnu.org/ml/gcc-patches/2010-09/msg02070.html for x86_64-*-linux* sets the default for --enable-gnu-indirect-function to glibc-2011. This string is not used anywhere else in gcc as far as I can see. What is the purpose of that? I think it's an error on my part, and that x86_64 should behave the same as x86 in this regard, as you say. nathan -- Nathan Sidwell
Define WORDS_BIG_ENDIAN in rs6000/vxworks.h
At http://gcc.gnu.org/ml/gcc-patches/2010-10/msg01104.html I enumerated the cases in GCC where WORDS_BIG_ENDIAN and BYTES_BIG_ENDIAN may differ. The ARM -mwords-little-endian case has now had a deprecation patch submitted http://gcc.gnu.org/ml/gcc-patches/2011-06/msg02217.html and approved (with changes) but apparently not yet committed. This patch fixes the powerpc-wrs-vxworks* case, where as I noted before it's clearly a bug to undefine and redefine only one of the two macros. Tested building cc1 and xgcc for cross to powerpc-wrs-vxworks. Committed as obvious. 2011-07-06 Joseph Myers jos...@codesourcery.com * config/rs6000/vxworks.h (WORDS_BIG_ENDIAN): Define. Index: gcc/config/rs6000/vxworks.h === --- gcc/config/rs6000/vxworks.h (revision 175913) +++ gcc/config/rs6000/vxworks.h (working copy) @@ -47,6 +47,8 @@ /* Only big endian PPC is supported by VxWorks. */ #undef BYTES_BIG_ENDIAN #define BYTES_BIG_ENDIAN 1 +#undef WORDS_BIG_ENDIAN +#define WORDS_BIG_ENDIAN 1 /* We have to kill off the entire specs set created by rs6000/sysv4.h and substitute our own set. The top level vxworks.h has done some -- Joseph S. Myers jos...@codesourcery.com
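To see why defining only one of the two macros is suspicious, it helps to look at what each one controls. The sketch below is not GCC code; it is a hypothetical helper that lays out a 64-bit value as two 32-bit words in memory, with the byte order inside a word and the order of the words chosen independently, which is how I understand the two macros to be documented.

```c
#include <stdint.h>

/* Store VALUE into OUT as two 32-bit words.
   bytes_big_endian: most significant byte of each word at lowest address.
   words_big_endian: most significant word at lowest address.
   (Hypothetical helper mirroring the documented meaning of the macros.) */
void store_u64(uint8_t out[8], uint64_t value,
               int bytes_big_endian, int words_big_endian)
{
    uint32_t hi = (uint32_t)(value >> 32);
    uint32_t lo = (uint32_t)value;
    uint32_t words[2];

    words[0] = words_big_endian ? hi : lo;
    words[1] = words_big_endian ? lo : hi;

    for (int w = 0; w < 2; w++) {
        for (int b = 0; b < 4; b++) {
            int shift = bytes_big_endian ? 8 * (3 - b) : 8 * b;
            out[4 * w + b] = (uint8_t)(words[w] >> shift);
        }
    }
}
```

With both flags set, 0x0102030405060708 is stored as 01 02 ... 08. With big-endian bytes but little-endian words it becomes 05 06 07 08 01 02 03 04, the kind of mixed layout that a target gets if one macro is redefined and the other is left at a different inherited value.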
Re: [PATCH] Address lowering [1/3] Main patch
On Wed, 2011-07-06 at 15:16 +0200, Richard Guenther wrote: On Tue, Jul 5, 2011 at 3:59 PM, William J. Schmidt wschm...@linux.vnet.ibm.com wrote: (Sorry for the late response; yesterday was a holiday here.) On Mon, 2011-07-04 at 16:21 +0200, Richard Guenther wrote: On Thu, Jun 30, 2011 at 4:39 PM, William J. Schmidt wschm...@linux.vnet.ibm.com wrote: This is the first of three patches related to lowering addressing expressions to MEM_REFs and TARGET_MEM_REFs in late gimple. This patch contains the new pass together with supporting changes in existing modules. The second patch contains an independent change to the RTL forward propagator to keep it from undoing an optimization made in the first patch. The third patch contains new test cases and changes to existing test cases. Although I've broken it up into three patches to make the review easier, it would be best to commit at least the first and third together to avoid regressions. The second can stand alone. I've done regression tests on powerpc64 and x86_64, and have asked Andreas Krebbel to test against the IBM z (390) platform. I've done performance regression testing on powerpc64. The only performance regression of note is the 2% degradation to 188.ammp due to loss of field disambiguation information. As discussed in another thread, fixing this introduces more complexity than it's worth. Are there also performance improvements? What about code size? Yes, there are performance improvements. I've been running cpu2000 on 32- and 64-bit powerpc64. Thirteen tests show measurable improvements between 1% and 9%, with 187.facerec showing the largest improvements for both 32 and 64. I don't have formal code size results, but anecdotally from code crawling, I have seen code size either neutral or slightly improved. The largest code size improvements I've seen were on 32-bit code where the commoning allowed removal of a number of sign-extend and zero-extend instructions that were otherwise not seen to be redundant. 
I tried to get an understanding to what kind of optimizations this patch produces based on the test of testcases you added, but I have a hard time here. Can you outline some please? The primary goal is to clean up code such as is shown in the original post of PR46556. In late 2007 there were some changes made to address canonicalization that caused the code gen to be suboptimal on powerpc64. At that time you and others suggested a pattern recognizer prior to expand as probably the best solution, similar to what IVopts is doing. The PR46556 case looks quite simple. It certainly is. I was personally curious whether there were other suboptimal sequences that might be hiding out there, that a more general approach might expose. There was a comment at the end of the bugzilla about a pass to expose target addressing modes in gimple for this purpose. When I first started looking at this, I looked for some feedback from the community about whether that should be done, and got a few favorable comments along with one negative one. So that's how we got on this road... By using the same mem_ref generation machinery used by IVopts, together with local CSE, the goal was to ensure base registers are properly shared so that optimal code is generated, particularly for cases of shared addressability to structures and arrays. I also observed cases where it was useful to extend the sharing across the dominator tree. As you are doing IV selection per individual statement only, using the affine combination machinery looks quite a big hammer to me. Especially as it is hard to imagine what the side-effects are, apart from re-associating dependencies that do not fit the MEM-REF and making the MEM-REF as complicated as permitted by the target. What I thought originally when suggesting to do something similar to IVOPTs was to build a list of candidates and uses and optimize that set using a cost function similar to how IVOPTs does. OK, reading back I can see that now... 
Doing addressing-mode selection locally per statement seems like more a task for a few pattern matchers, for example in tree-ssa-forwprop.c (for its last invocation). One pattern would be that of PR46556, MEM[(p + ((n + 16)*4))] which we can transform to TARGET_MEM_REF[x + 64] with x = p + n*4 if ((n + 16)*4)) was a single-use. The TARGET_MEM_REF generation can easily re-use the address-description and target-availability checks from tree-ssa-address.c. I would be at least interested in whether handling the pattern from PR46556 alone (or maybe with a few similar other cases) is responsible for the performance improvements. Hm, but I don't think forwprop sees the code in this form. At the time the last pass of forwprop runs, the gimple for the original problem is: D.1997_3 = p_1(D)-a[n_2(D)]; D.1998_4 = p_1(D)-c[n_2(D)]; D.1999_5 =
Re: PATCH TRUNK: [gcc/configure.ac] Generate GCCPLUGIN_VERSION_* macros
On Wed, Jul 06, 2011 at 04:02:48PM +0200, Richard Guenther wrote: On Wed, Jul 6, 2011 at 3:46 PM, Basile Starynkevitch bas...@starynkevitch.net wrote: On Wed, Jul 06, 2011 at 03:21:47PM +0200, Richard Guenther wrote: On Wed, Jul 6, 2011 at 2:50 PM, Basile Starynkevitch bas...@starynkevitch.net wrote:

I believe it can help to make plugin code more robust.

I'm not sure using cut is portable enough - for bversion.h generation we use sed instead (see Makefile.in), so I suppose copying that would be better. I can't approve the configury changes but the change itself looks reasonable to me.

I am attaching an improved patch to trunk rev 175920

gcc/ChangeLog entry #
2011-07-06 Basile Starynkevitch bas...@starynkevitch.net

        * configure.ac (plugin-version.h): Generate
        GCCPLUGIN_VERSION_MAJOR, GCCPLUGIN_VERSION_MINOR,
        GCCPLUGIN_VERSION_PATCHLEVEL, GCCPLUGIN_VERSION constant integer
        macros.
        * configure: Regenerate.
        * doc/plugins.texi (Building GCC plugins): Mention
        GCCPLUGIN_VERSION ... constant macros in plugin-version.h.
end of gcc/ChangeLog entry #

Ok for trunk?

Cheers.
-- Basile STARYNKEVITCH http://starynkevitch.net/Basile/ email: basileatstarynkevitchdotnet mobile: +33 6 8501 2359 8, rue de la Faiencerie, 92340 Bourg La Reine, France *** opinions {are only mines, sont seulement les miennes} ***

Index: gcc/doc/plugins.texi
===
--- gcc/doc/plugins.texi (revision 175920)
+++ gcc/doc/plugins.texi (working copy)
@@ -417,6 +417,17 @@ invoking @command{gcc -print-file-name=plugin} (re
 Inside plugins, this @code{plugin} directory name can be queried by
 calling @code{default_plugin_dir_name ()}.
 
+Plugins may know, when they are compiled, the GCC version for which
+@file{plugin-version.h} is provided. The constant macros
+@code{GCCPLUGIN_VERSION_MAJOR}, @code{GCCPLUGIN_VERSION_MINOR},
+@code{GCCPLUGIN_VERSION_PATCHLEVEL}, @code{GCCPLUGIN_VERSION} are
+integer numbers, so a plugin could ensure it is built for GCC 4.7 with
+@smallexample
+#if GCCPLUGIN_VERSION != 4007
+#error this GCC plugin is for GCC 4.7
+#endif
+@end smallexample
+
 The following GNU Makefile excerpt shows how to build a simple plugin:
 
 @smallexample
Index: gcc/configure
===
--- gcc/configure (revision 175920)
+++ gcc/configure (working copy)
@@ -11072,6 +11072,11 @@ fi
 cat > plugin-version.h <<EOF
 #include "configargs.h"
 
+#define GCCPLUGIN_VERSION_MAJOR `echo $gcc_BASEVER | sed -e 's/^\([0-9]*\).*$/\1/'`
+#define GCCPLUGIN_VERSION_MINOR `echo $gcc_BASEVER | sed -e 's/^[0-9]*\.\([0-9]*\).*$/\1/'`
+#define GCCPLUGIN_VERSION_PATCHLEVEL `echo $gcc_BASEVER | sed -e 's/^[0-9]*\.[0-9]*\.\([0-9]*\)$/\1/'`
+#define GCCPLUGIN_VERSION (GCCPLUGIN_VERSION_MAJOR*1000 + GCCPLUGIN_VERSION_MINOR)
+
 static char basever[] = "$gcc_BASEVER";
 static char datestamp[] = "$gcc_DATESTAMP";
 static char devphase[] = "$gcc_DEVPHASE";
@@ -17623,7 +17628,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 17626 "configure"
+#line 17631 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -17729,7 +17734,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 17732 "configure"
+#line 17737 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
Index: gcc/configure.ac
===
--- gcc/configure.ac (revision 175920)
+++ gcc/configure.ac (working copy)
@@ -1511,6 +1511,11 @@ fi
 cat > plugin-version.h <<EOF
 #include "configargs.h"
 
+#define GCCPLUGIN_VERSION_MAJOR `echo $gcc_BASEVER | sed -e 's/^\([0-9]*\).*$/\1/'`
+#define GCCPLUGIN_VERSION_MINOR `echo $gcc_BASEVER | sed -e 's/^[0-9]*\.\([0-9]*\).*$/\1/'`
+#define GCCPLUGIN_VERSION_PATCHLEVEL `echo $gcc_BASEVER | sed -e 's/^[0-9]*\.[0-9]*\.\([0-9]*\)$/\1/'`
+#define GCCPLUGIN_VERSION (GCCPLUGIN_VERSION_MAJOR*1000 + GCCPLUGIN_VERSION_MINOR)
+
 static char basever[] = "$gcc_BASEVER";
 static char datestamp[] = "$gcc_DATESTAMP";
 static char devphase[] = "$gcc_DEVPHASE";
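The sed expressions in the patch split a base version string such as 4.7.0 into its numeric fields, and GCCPLUGIN_VERSION then combines major and minor as major*1000 + minor. The same arithmetic can be sketched in C; the function below is a hypothetical illustration, not part of the patch, and its name and error convention (-1 when no major.minor pair is found) are assumptions.

```c
#include <stdio.h>

/* Compute major*1000 + minor from a "major.minor[.patchlevel]" string,
   mirroring the arithmetic of the proposed GCCPLUGIN_VERSION macro.
   Returns -1 if the string does not start with "major.minor". */
int gccplugin_version_from_basever(const char *basever)
{
    int major = 0, minor = 0, patchlevel = 0;

    /* The patchlevel is parsed when present but does not enter the result. */
    if (sscanf(basever, "%d.%d.%d", &major, &minor, &patchlevel) < 2)
        return -1;
    return major * 1000 + minor;
}
```

For the base version 4.7.0 this gives 4007, the value the documentation example compares against; the patchlevel is deliberately left out of the combined number, as in the patch.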
Re: PATCH [1/n] X32: Add initial -x32 support
Hi Paolo, DJ, Nathanael, Alexandre, Ralf,

Is the change

        * configure.ac: Support --enable-x32.
        * configure: Regenerated.

diff --git a/gcc/configure.ac b/gcc/configure.ac
index 5f3641b..bddabeb 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -611,6 +611,11 @@ AC_ARG_ENABLE(multilib, [],
 [enable_multilib=yes])
 AC_SUBST(enable_multilib)
 
+# With x32 support
+AC_ARG_ENABLE(x32,
+[ --enable-x32 enable x32 library support for multiple ABIs],
+[], [enable_x32=no])
+
 # Enable __cxa_atexit for C++.
 AC_ARG_ENABLE(__cxa_atexit,
 [AS_HELP_STRING([--enable-__cxa_atexit], [enable __cxa_atexit for C++])],

OK?

Thanks.

-- H.J. ---
Re: [Patch,testsuite]: target-supports.exp: Disable -fprofile-generate for AVR
Rainer Orth wrote: Georg-Johann Lay writes: Index: lib/target-supports.exp === --- lib/target-supports.exp (revision 175811) +++ lib/target-supports.exp (working copy) @@ -497,6 +497,13 @@ proc check_profiling_available { test_wh # Tree profiling requires TLS runtime support. if { $test_what == -fprofile-generate } { +# Target AVR does not support profile generation because ^ Leave out the `Target' +# it does not implement needed support functions. +# A call to check_effective_target_tls_runtime won't +# reveal that. Omit the second sentence: it isn't supposed to, but just documents a general requirement of -fprofile-generate. +if { [istarget avr-*-*] } { +return 0 +} return [check_effective_target_tls_runtime] } Index: gcc.dg/tree-ssa/vrp51.c === --- gcc.dg/tree-ssa/vrp51.c (revision 175811) +++ gcc.dg/tree-ssa/vrp51.c (working copy) @@ -1,6 +1,7 @@ /* PR tree-optimization/28632 */ /* { dg-do compile } */ /* { dg-options -O2 -ftree-vrp } */ +/* { dg-require-effective-target int32plus } */ void v4 (unsigned a, unsigned b) This is completely unrelated to the first; please don't mix such patches in one post. Sorry for the glitch. Thanks. Rainer Here is a revised patch. Ok? Johann * lib/target-supports.exp (check_profiling_available): Disable profiling with -fprofile-generate for target avr. Index: lib/target-supports.exp === --- lib/target-supports.exp (revision 175811) +++ lib/target-supports.exp (working copy) @@ -497,6 +497,11 @@ proc check_profiling_available { test_wh # Tree profiling requires TLS runtime support. if { $test_what == -fprofile-generate } { + # AVR does not support profile generation because + # it does not implement needed support functions. + if { [istarget avr-*-*] } { + return 0 + } return [check_effective_target_tls_runtime] }
Re: [PATCH] Address lowering [1/3] Main patch
On Wed, Jul 6, 2011 at 4:28 PM, William J. Schmidt wschm...@linux.vnet.ibm.com wrote: On Wed, 2011-07-06 at 15:16 +0200, Richard Guenther wrote: On Tue, Jul 5, 2011 at 3:59 PM, William J. Schmidt wschm...@linux.vnet.ibm.com wrote: (Sorry for the late response; yesterday was a holiday here.) On Mon, 2011-07-04 at 16:21 +0200, Richard Guenther wrote: On Thu, Jun 30, 2011 at 4:39 PM, William J. Schmidt wschm...@linux.vnet.ibm.com wrote: This is the first of three patches related to lowering addressing expressions to MEM_REFs and TARGET_MEM_REFs in late gimple. This patch contains the new pass together with supporting changes in existing modules. The second patch contains an independent change to the RTL forward propagator to keep it from undoing an optimization made in the first patch. The third patch contains new test cases and changes to existing test cases. Although I've broken it up into three patches to make the review easier, it would be best to commit at least the first and third together to avoid regressions. The second can stand alone. I've done regression tests on powerpc64 and x86_64, and have asked Andreas Krebbel to test against the IBM z (390) platform. I've done performance regression testing on powerpc64. The only performance regression of note is the 2% degradation to 188.ammp due to loss of field disambiguation information. As discussed in another thread, fixing this introduces more complexity than it's worth. Are there also performance improvements? What about code size? Yes, there are performance improvements. I've been running cpu2000 on 32- and 64-bit powerpc64. Thirteen tests show measurable improvements between 1% and 9%, with 187.facerec showing the largest improvements for both 32 and 64. I don't have formal code size results, but anecdotally from code crawling, I have seen code size either neutral or slightly improved. 
The largest code size improvements I've seen were on 32-bit code where the commoning allowed removal of a number of sign-extend and zero-extend instructions that were otherwise not seen to be redundant. I tried to get an understanding to what kind of optimizations this patch produces based on the test of testcases you added, but I have a hard time here. Can you outline some please? The primary goal is to clean up code such as is shown in the original post of PR46556. In late 2007 there were some changes made to address canonicalization that caused the code gen to be suboptimal on powerpc64. At that time you and others suggested a pattern recognizer prior to expand as probably the best solution, similar to what IVopts is doing. The PR46556 case looks quite simple. It certainly is. I was personally curious whether there were other suboptimal sequences that might be hiding out there, that a more general approach might expose. There was a comment at the end of the bugzilla about a pass to expose target addressing modes in gimple for this purpose. When I first started looking at this, I looked for some feedback from the community about whether that should be done, and got a few favorable comments along with one negative one. So that's how we got on this road... By using the same mem_ref generation machinery used by IVopts, together with local CSE, the goal was to ensure base registers are properly shared so that optimal code is generated, particularly for cases of shared addressability to structures and arrays. I also observed cases where it was useful to extend the sharing across the dominator tree. As you are doing IV selection per individual statement only, using the affine combination machinery looks quite a big hammer to me. Especially as it is hard to imagine what the side-effects are, apart from re-associating dependencies that do not fit the MEM-REF and making the MEM-REF as complicated as permitted by the target. 
What I thought originally when suggesting to do something similar to IVOPTs was to build a list of candidates and uses and optimize that set using a cost function similar to how IVOPTs does. OK, reading back I can see that now... Doing addressing-mode selection locally per statement seems like more a task for a few pattern matchers, for example in tree-ssa-forwprop.c (for its last invocation). One pattern would be that of PR46556, MEM[(p + ((n + 16)*4))] which we can transform to TARGET_MEM_REF[x + 64] with x = p + n*4 if ((n + 16)*4)) was a single-use. The TARGET_MEM_REF generation can easily re-use the address-description and target-availability checks from tree-ssa-address.c. I would be at least interested in whether handling the pattern from PR46556 alone (or maybe with a few similar other cases) is responsible for the performance improvements. Hm, but I don't think forwprop sees the code in this form. At the time the last pass of forwprop runs, the gimple for the
Re: PATCH [1/n] X32: Add initial -x32 support
On Wed, Jul 6, 2011 at 4:48 PM, H.J. Lu hjl.to...@gmail.com wrote: Hi Paolo, DJ, Nathanael, Alexandre, Ralf, Is the change . * configure.ac: Support --enable-x32. * configure: Regenerated. diff --git a/gcc/configure.ac b/gcc/configure.ac index 5f3641b..bddabeb 100644 --- a/gcc/configure.ac +++ b/gcc/configure.ac @@ -611,6 +611,11 @@ AC_ARG_ENABLE(multilib, [], [enable_multilib=yes]) AC_SUBST(enable_multilib) +# With x32 support +AC_ARG_ENABLE(x32, +[ --enable-x32 enable x32 library support for multiple ABIs], Looks like a very very generic switch for a global configury ... we already have --with-multilib-list (SH only), why not extend that to also work for x86_64? Richard. +[], [enable_x32=no]) + # Enable __cxa_atexit for C++. AC_ARG_ENABLE(__cxa_atexit, [AS_HELP_STRING([--enable-__cxa_atexit], [enable __cxa_atexit for C++])], OK? Thanks. -- H.J. ---
Re: PATCH TRUNK: [gcc/configure.ac] Generate GCCPLUGIN_VERSION_* macros
On Wed, Jul 6, 2011 at 10:37, Basile Starynkevitch bas...@starynkevitch.net wrote: On Wed, Jul 06, 2011 at 04:02:48PM +0200, Richard Guenther wrote: On Wed, Jul 6, 2011 at 3:46 PM, Basile Starynkevitch bas...@starynkevitch.net wrote: On Wed, Jul 06, 2011 at 03:21:47PM +0200, Richard Guenther wrote: On Wed, Jul 6, 2011 at 2:50 PM, Basile Starynkevitch bas...@starynkevitch.net wrote: I believe it can help to make plugin code more robust. I'm not sure using cut is portable enough - for bversion.h generation we use sed instead (see Makefile.in), so I suppose copying that would be better. I can't approve the configury changes but the change itself looks reasonable to me. I am attaching an improved patch to trunk rev 175920 gcc/ChangeLog entry # 2011-07-06 Basile Starynkevitch bas...@starynkevitch.net * configure.ac (plugin-version.h): Generate GCCPLUGIN_VERSION_MAJOR, GCCPLUGIN_VERSION_MINOR, GCCPLUGIN_VERSION_PATCHLEVEL, GCCPLUGIN_VERSION constant integer macros. * configure: Regenerate. * doc/plugins.texi (Building GCC plugins): Mention GCCPLUGIN_VERSION ... constant macros in plugin-version.h. OK. Diego.
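A hedged sketch of how plugin code might use the macros named in the ChangeLog entry above. In a real plugin the values would come from plugin-version.h; the fallback values, the PLUGIN_GCC_AT_LEAST helper and the function name are assumptions made up for this illustration so that the fragment compiles on its own.

```c
/* Hypothetical fallback values, used only when plugin-version.h has
   not been included; a real plugin would get these from GCC.  */
#ifndef GCCPLUGIN_VERSION_MAJOR
#define GCCPLUGIN_VERSION_MAJOR 4
#define GCCPLUGIN_VERSION_MINOR 7
#define GCCPLUGIN_VERSION_PATCHLEVEL 0
#endif

/* Illustrative helper (not part of GCC): true when the plugin is being
   built against GCC MAJ.MIN or later.  Usable in #if because the
   version macros are constant integers.  */
#define PLUGIN_GCC_AT_LEAST(maj, min)                     \
  (GCCPLUGIN_VERSION_MAJOR > (maj)                        \
   || (GCCPLUGIN_VERSION_MAJOR == (maj)                   \
       && GCCPLUGIN_VERSION_MINOR >= (min)))

/* Select a code path at compile time depending on the GCC version.  */
int
plugin_uses_new_api (void)
{
#if PLUGIN_GCC_AT_LEAST (4, 7)
  return 1;   /* code written against the newer internal API */
#else
  return 0;   /* compatibility path for older releases */
#endif
}
```

Comparing the major and minor numbers separately avoids relying on how the combined GCCPLUGIN_VERSION value is encoded, which the message above does not spell out.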
Re: [Patch,testsuite]: target-supports.exp: Disable -fprofile-generate for AVR
Georg-Johann Lay a...@gjlay.de writes: Here is a revised patch. Ok? I'd like to defer to the target maintainers here: they know their port, while I don't. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: Provide 64-bit default Solaris/x86 configuration (PR target/39150)
* In the sparcv9-sun-solaris2.11 builds, the 32-bit libgo tests fail to link since they have unresolved references to __sync_bool_compare_and_swap_8 and __sync_add_and_fetch_8. I could trace this to -mv8plus being missing in that configuration. I'm uncertain where best to handle this. Eric? Probably add MASK_V8PLUS to the 64-bit TARGET_DEFAULT in sol2.h, it will be disabled in 64-bit mode by sparc_override_options. -- Eric Botcazou
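For context, a small sketch of the kind of C code that leads to the unresolved symbols named above. It uses GCC's __sync builtins on 8-byte operands; when the selected target cannot expand them inline, GCC emits calls to out-of-line functions with a size suffix such as __sync_add_and_fetch_8. The function and variable names here are hypothetical and not taken from libgo.

```c
#include <stdint.h>

/* A 64-bit counter updated atomically.  */
static uint64_t counter;

/* Atomically increment the counter and return the new value.  With an
   8-byte operand this is the operation behind __sync_add_and_fetch_8.  */
uint64_t
bump_counter (void)
{
  return __sync_add_and_fetch (&counter, (uint64_t) 1);
}

/* Atomically replace *SLOT with DESIRED if it still holds EXPECTED.
   Returns nonzero on success; with an 8-byte operand this is the
   operation behind __sync_bool_compare_and_swap_8.  */
int
claim_slot (uint64_t *slot, uint64_t expected, uint64_t desired)
{
  return __sync_bool_compare_and_swap (slot, expected, desired);
}
```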
Re: Provide 64-bit default Solaris/x86 configuration (PR target/39150)
Eric Botcazou ebotca...@adacore.com writes: * In the sparcv9-sun-solaris2.11 builds, the 32-bit libgo tests fail to link since they have unresolved references to __sync_bool_compare_and_swap_8 and __sync_add_and_fetch_8. I could trace this to -mv8plus being missing in that configuration. I'm uncertain where best to handle this. Eric? Probably add MASK_V8PLUS to the 64-bit TARGET_DEFAULT in sol2.h, it will be disabled in 64-bit mode by sparc_override_options. Thanks, I'll give it a try. It just seemed weird to have MASK_V8PLUS and MASK_V9 at the same time. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [Patch,testsuite]: target-supports.exp: Disable -fprofile-generate for AVR
2011/7/6 Rainer Orth r...@cebitec.uni-bielefeld.de: Georg-Johann Lay a...@gjlay.de writes: Here is a revised patch. Ok? I'd like to defer to the target maintainers here: they know their port, while I don't. Approved. Denis.
Re: [Path,AVR]: Improve loading of 32-bit constants
Denis Chertykov wrote: 2011/7/6 Georg-Johann Lay a...@gjlay.de: For loading a 32-bit constant in a register, there is room for improvement: * SF can be handled the same way as SI and therefore the patch adds a peep2 to produce a *reload_insf analogue to *reload_insi. * If the destination register overlaps NO_LD_REGS, values already loaded into some other byte can be reused by a simple MOV. This is helpful when moving values such as -2 or -100, because all high bytes are 0xff. * 0.0f can be directly moved to memory. * The mov insns contain !d constraint. I see no reason to make d expensive and discourage use of d-regs. A *d to hide is better because it neither puts additional pressure on d nor discourages d. I would like to have real code examples. Denis. Hi Denis. Attached you find a small C file and the asm that is generated by new and old versions (-Os -mmcu=atmega88 -S -dp). I took away some regs as potential clobbers (or -fno-peephole2) to show the effect of high register pressure. But even if a clobber was available you can see that the new version is smarter in reusing values, e.g. note the loading of -1L to r22-r25. 
Johann register int _x asm (26); register int _y asm (28); register int _z asm (30); void ibar (long, long, long, long); void fbar (long, long, float, float); void foo1 (long x) { ibar (-1, x, -2, 0xff008000); } void foo2 (long x) { ibar (x, x, 65537L, 0x0408); } void foo3 (long x) { fbar (x, x, -3.0f, 2.0f); } .file oint.c __SREG__ = 0x3f __SP_H__ = 0x3e __SP_L__ = 0x3d __tmp_reg__ = 0 __zero_reg__ = 1 .global __do_copy_data .global __do_clear_bss .text .global foo1 .type foo1, @function foo1: push r10 ; 16 *pushqi/1 [length = 1] push r11 ; 17 *pushqi/1 [length = 1] push r12 ; 18 *pushqi/1 [length = 1] push r13 ; 19 *pushqi/1 [length = 1] push r14 ; 20 *pushqi/1 [length = 1] push r15 ; 21 *pushqi/1 [length = 1] push r16 ; 22 *pushqi/1 [length = 1] push r17 ; 23 *pushqi/1 [length = 1] /* prologue: function */ /* frame size = 0 */ /* stack size = 8 */ .L__stack_usage = 8 movw r18,r22 ; 2 *movsi/1[length = 2] movw r20,r24 ldi r22,lo8(-1) ; 7 *movsi/5[length = 4] ldi r23,hi8(-1) ldi r24,hlo8(-1) ldi r25,hhi8(-1) mov __tmp_reg__,r31 ; 9 *movsi/6[length = 10] ldi r31,lo8(-2) mov r14,r31 ldi r31,hi8(-2) mov r15,r31 ldi r31,hlo8(-2) mov r16,r31 ldi r31,hhi8(-2) mov r17,r31 mov r31,__tmp_reg__ mov __tmp_reg__,r31 ; 10 *movsi/6[length = 10] ldi r31,lo8(-1678) mov r10,r31 ldi r31,hi8(-1678) mov r11,r31 ldi r31,hlo8(-1678) mov r12,r31 ldi r31,hhi8(-1678) mov r13,r31 mov r31,__tmp_reg__ rcall ibar ; 11 call_insn/3 [length = 1] /* epilogue start */ pop r17 ; 26 popqi [length = 1] pop r16 ; 27 popqi [length = 1] pop r15 ; 28 popqi [length = 1] pop r14 ; 29 popqi [length = 1] pop r13 ; 30 popqi [length = 1] pop r12 ; 31 popqi [length = 1] pop r11 ; 32 popqi [length = 1] pop r10 ; 33 popqi [length = 1] ret ; 34 return_from_epilogue[length = 1] .size foo1, .-foo1 .global foo2 .type foo2, @function foo2: push r10 ; 16 *pushqi/1 [length = 1] push r11 ; 17 *pushqi/1 [length = 1] push r12 ; 18 *pushqi/1 [length = 1] push r13 ; 19 *pushqi/1 [length = 1] push r14 ; 20 *pushqi/1 [length = 
1] push r15 ; 21 *pushqi/1 [length = 1] push r16 ; 22 *pushqi/1 [length = 1] push r17 ; 23 *pushqi/1 [length = 1] /* prologue: function */ /* frame size = 0 */ /* stack size = 8 */ .L__stack_usage = 8 movw r18,r22 ; 2 *movsi/1[length = 2] movw r20,r24 mov __tmp_reg__,r31 ; 9 *movsi/6[length = 10] ldi r31,lo8(65537) mov r14,r31 ldi r31,hi8(65537) mov r15,r31 ldi r31,hlo8(65537) mov r16,r31 ldi r31,hhi8(65537) mov r17,r31 mov r31,__tmp_reg__ mov __tmp_reg__,r31 ; 10 *movsi/6[length = 10] ldi r31,lo8(-64504) mov r10,r31 ldi r31,hi8(-64504) mov r11,r31 ldi r31,hlo8(-64504) mov r12,r31 ldi r31,hhi8(-64504) mov r13,r31 mov r31,__tmp_reg__ rcall ibar ; 11 call_insn/3
[PATCH, testsuite] Fix for PR49519, miscompiled 447.dealII in SPEC CPU 2006
Hi, I've prepared a patch for: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49519 I've also prepared a test which reproduces the error. ChangeLog entry: 2011-07-06 Kirill Yukhin kirill.yuk...@intel.com PR tailcall-optimization/49519 * calls.c (mem_overlaps_already_clobbered_arg_p): Additional check if address is stored in register. If so - give up. (check_sibcall_argument_overlap_1): Do not perform check of overlapping when it is call to address. testsuite/ChangeLog entry: 2011-07-06 Kirill Yukhin kirill.yuk...@intel.com * g++.dg/torture/pr49519.C: New test for tailcall fix. Bootstrapped; the new test fails without the patch and passes when it is applied. This fixes the problem with the SPEC2006/447.dealII miscompile. Ok for trunk? Thanks, K pr49519.gcc.patch Description: Binary data
Re: Improve Solaris mudflap support (PR libmudflap/49550)
Hi, Rainer - It addresses a couple of testsuite failures: [...] where the registration of __iob has been done automatically by the compiler. I avoid this problem by not registering stdin, stdout, and stderr separately on Solaris. OK. * Some tests were failing while calling unregister in munmap. It turned out that there had been no corresponding mmap registration before. This occurs because Solaris has mmap64 for largefile-aware programs instead. Fixed by wrapping mmap64, too. What I don't know is if mmap64 needs to be added to MFWRAP_SPEC in gcc.c? I believe so. If so, I'd rather do it by adding some MFWRAP_OS_SPEC to avoid having to duplicate the whole spec in the Solaris config headers. Why would solaris have to duplicate MFWRAP_SPEC if mmap64 is added to the default gcc.c one? * As noted in the last patch, the getmntent signature differs in Solaris. This patch implements a wrapper for the Solaris version. OK. * libmudflap.cth/pass37-frag.c would fail like this: Investigating with -trace-calls reveals that all registrations and unregistrations of errno are for the same address, which is wrong for multithreaded programs which access errno via an accessor function. To enable that, errno.h needs to be included with _REENTRANT defined. It turned out that it suffices to do this in mf-hooks3.c. OK. * libmudflap.c/heap-scalestress.c always timed out on my SPARC test system: on a 1.2 GHz UltraSPARC-T2, it takes real8:47.06 user 43.12 sys 8:03.77 which is way over the limit. On my laptop (1.6 GHz Core i7), it takes real 37.35 user 5.06 sys 32.23 I've divided SCALE by 10 to account for this. OK; I'm surprised by the order-of-magnitude performance difference between the machines though. * I've replaced all the __FreeBSD__ ... tests in libmudflap.c/pass-stratcliff.c with appropriate autoconf macros, and also define MIN which can be missing. OK. 
* libmudflap.c/pass47-frag.c originally failed like this: With this patch (and the next), I get almost clean testsuite results on sparc-sun-solaris2.11 (both multilibs): OK. - FChE
Re: Improve Solaris mudflap support (PR libmudflap/49550)
Hi Frank, * Some tests were failing while calling unregister in munmap. It turned out that there had been no corresponding mmap registration before. This occurs because Solaris has mmap64 for largefile-aware programs instead. Fixed by wrapping mmap64, too. What I don't know is if mmap64 needs to be added to MFWRAP_SPEC in gcc.c? I believe so. ok, though I haven't seen a failure so far without. If so, I'd rather do it by adding some MFWRAP_OS_SPEC to avoid having to duplicate the whole spec in the Solaris config headers. Why would solaris have to duplicate MFWRAP_SPEC if mmap64 is added to the default gcc.c one? I assumed that you wanted to keep the default generic, and meant to separate target specific additions from the generic part. * libmudflap.c/heap-scalestress.c always timed out on my SPARC test system: on a 1.2 GHz UltraSPARC-T2, it takes real8:47.06 user 43.12 sys 8:03.77 which is way over the limit. On my laptop (1.6 GHz Core i7), it takes real 37.35 user 5.06 sys 32.23 I've divided SCALE by 10 to account for this. OK; I'm surprised by the order-of-magnitude performance difference between the machines though. Right: though the Niagara CPUs are slow, I hadn't expected that much either. So if you agree, I can add mmap64 to the default MFWRAP_SPEC. All other parts are approved, I think. In the meantime, I've rebuild and re-tested on Solaris 11/x86, too. While the gld results are as good as on SPARC, I still get several failures with Sun ld (which works fine on SPARC). I haven't analyzed them yet. I could either commit the current version with the MFWRAP_SPEC addition and work from there, or wait until those failures are understood and fixed, too. Thanks. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: Provide 64-bit default Solaris/x86 configuration (PR target/39150)
Thanks, I'll give it a try. It just seemed weird to have MASK_V8PLUS and MASK_V9 at the same time. Yes, that's why the existing comment should also be enhanced. I'll fix it. -- Eric Botcazou
[v3] Correctly determine baseline_subdir for 64-bit default Solaris gcc
As alluded to in Provide 64-bit default Solaris/x86 configuration (PR target/39150) http://gcc.gnu.org/ml/gcc-patches/2011-07/msg00327.html (which was meant to be Cc'ed to libstdc++, but bounced due to a stupid typo), there are now two variant bi-arch gcc configurations for Solaris: i386-pc-solaris2.* sparc-sun-solaris2.* which default to 32-bit code generation, and x86_64-pc-solaris2.* sparc64-sun-solaris2.* which default to 64-bit code generation. Unfortunately, libstdc++-abi/abi_check fails for the latter two for the 64-bit (default) multilib. The problem is that testsuite/Makefile.am and testsuite/libstdc++-abi/abi.exp use g++ --print-multi-directory to determine the subdirectory of config/abi/post/baseline_dir to use for the multilib at hand. For the 32-bit configurations, all is fine, while there's a mismatch for the 64-bit ones:

                             32-bit default   64-bit default
--print-multi-directory      .      amd64     .      32
--print-multi-os-directory   .      amd64     amd64  .

For the 32-bit case, everything works (sort of by chance): if abi.exp cannot find the baseline in the subdir, it defaults to the baseline dir, which is exactly right. In the 64-bit case, the 32-bit baseline is used instead, which breaks completely. Unfortunately, one cannot simply use --print-multi-os-directory instead everywhere: while this is fine on Solaris, it would break Linux/x86_64:

--print-multi-directory      .         32
--print-multi-os-directory   ../lib64  ../lib

So it seems the whole thing needs to be made configurable, which is what this patch does. It allows setting a non-default switch in configure.host, but defaults to --print-multi-directory otherwise. Tested on sparc-sun-solaris2.11 and sparcv9-sun-solaris2.11 by rebuilding libstdc++-v3 and running make RUNTESTFLAGS=abi.exp check. As expected, libstdc++-abi/abi_check now succeeds for both multilibs. I'll also test on x86_64-unknown-linux-gnu to make sure nothing breaks there. Ok for mainline if that passes? Thanks. 
Rainer 2011-07-06 Rainer Orth r...@cebitec.uni-bielefeld.de * configure.host (abi_baseline_subdir_switch): Describe. Provide default. (*-*-solaris2.[89], *-*-solaris2.1[0-9]): Override. * acinclude.m4 (GLIBCXX_CONFIGURE_TESTSUITE): Substitute baseline_subdir_switch. * testsuite/Makefile.am (site.exp): Emit it. (baseline_subdir): Use it. * testsuite/libstdc++-abi/abi.exp: Use it. * configure: Regenerate. * Makefile.in: Regenerate. * doc/Makefile.in: Regenerate. * include/Makefile.in: Regenerate. * libsupc++/Makefile.in: Regenerate. * po/Makefile.in: Regenerate. * python/Makefile.in: Regenerate. * src/Makefile.in: Regenerate. * testsuite/Makefile.in: Regenerate. diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4 --- a/libstdc++-v3/acinclude.m4 +++ b/libstdc++-v3/acinclude.m4 @@ -590,6 +590,7 @@ dnl GLIBCXX_TEST_WCHAR_T dnl GLIBCXX_TEST_THREAD dnl Substs: dnl baseline_dir +dnl baseline_subdir_switch dnl AC_DEFUN([GLIBCXX_CONFIGURE_TESTSUITE], [ if $GLIBCXX_IS_NATIVE ; then @@ -617,6 +618,8 @@ AC_DEFUN([GLIBCXX_CONFIGURE_TESTSUITE], # Export file names for ABI checking. 
baseline_dir=$glibcxx_srcdir/config/abi/post/${abi_baseline_pair} AC_SUBST(baseline_dir) + baseline_subdir_switch=$abi_baseline_subdir_switch + AC_SUBST(baseline_subdir_switch) ]) diff --git a/libstdc++-v3/configure.host b/libstdc++-v3/configure.host --- a/libstdc++-v3/configure.host +++ b/libstdc++-v3/configure.host @@ -30,6 +30,11 @@ # abi_baseline_pair directory name for ABI compat testing, # defaults to host_cpu-host_os (as per config.guess) # +# abi_baseline_subdir_switch +# g++ switch to determine ABI baseline subdir for +# multilibbed targets, +# defaults to --print-multi-directory +# # abi_tweaks_dir location of cxxabi_tweaks.h, # defaults to cpu_include_dir # @@ -78,6 +83,7 @@ atomic_flags= atomicity_dir=cpu/generic cpu_defines_dir=cpu/generic try_cpu=generic +abi_baseline_subdir_switch=--print-multi-directory abi_tweaks_dir=cpu/generic error_constants_dir=os/generic @@ -336,8 +342,10 @@ case ${host} in ;; *-*-solaris2.[89]) abi_baseline_pair=solaris2.8 +abi_baseline_subdir_switch=--print-multi-os-directory ;; *-*-solaris2.1[0-9]) abi_baseline_pair=solaris2.10 +abi_baseline_subdir_switch=--print-multi-os-directory ;; esac diff --git a/libstdc++-v3/testsuite/Makefile.am b/libstdc++-v3/testsuite/Makefile.am --- a/libstdc++-v3/testsuite/Makefile.am +++ b/libstdc++-v3/testsuite/Makefile.am @@ -59,6 +59,7 @@ site.exp: Makefile @echo 'set target_triplet
Re: C++ mangling, function name to mangled name (or tree)
On 06/07/2011 18:25, Kevin André wrote: On Wed, Jul 6, 2011 at 18:00, Pierre Vittet pier...@pvittet.com wrote: I would like the user of the plugin to give as arguments the names of the functions on which he would like a test to be run. That means that I must convert the string containing a function name (like myclass::init) and get either the mangled name or the tree corresponding to the function. I know that there might be several results (functions with the same name and different arguments); a good policy for me would be to recover every function concerned (at least for the moment). I guess what I want to do is possible, because there are already some tools doing it (like gdb). Are you absolutely sure about gdb? It could be doing it the other way around, i.e. start from the mangled names in the object file and demangle all of them. Then it would search for a function name in its list of demangled names. Just guessing, though :) Regards, Kevin André Hello, no, I am not sure, but I guess doing it the way you describe would be quite costly. Would it not be easier to have a field containing the 'demangled' names? At least in debug, since it has a significant space cost. Thanks! Pierre Vittet
Re: Improve Solaris mudflap support (PR libmudflap/49550)
Hi, Rainer - If so, I'd rather do it by adding some MFWRAP_OS_SPEC to avoid having to duplicate the whole spec in the Solaris config headers. Why would solaris have to duplicate MFWRAP_SPEC if mmap64 is added to the default gcc.c one? I assumed that you wanted to keep the default generic, and meant to separate target specific additions from the generic part. I don't have a strong opinion on this, but if you add mmap64 to the default, there need be no target-specific additions for solaris, right? So we can delay the decision to another day. [...] I could either commit the current version with the MFWRAP_SPEC addition and work from there, or wait until those failures are understood and fixed, too. Committing now would be fine, assuming no regressions on a primary platform. - FChE
Re: C++ mangling, function name to mangled name (or tree)
On 6 Jul 2011, at 18:40, Pierre Vittet wrote: On 06/07/2011 18:25, Kevin André wrote: On Wed, Jul 6, 2011 at 18:00, Pierre Vittet pier...@pvittet.com wrote: I would like the user of the plugin to give as arguments the names of the functions on which he would like a test to be run. That means that I must convert the string containing a function name (like myclass::init) and get either the mangled name or the tree corresponding to the function. I know that there might be several results (functions with the same name and different arguments); a good policy for me would be to recover every function concerned (at least for the moment). I guess what I want to do is possible, because there are already some tools doing it (like gdb). Are you absolutely sure about gdb? It could be doing it the other way around, i.e. start from the mangled names in the object file and demangle all of them. Then it would search for a function name in its list of demangled names. Just guessing, though :) Regards, Kevin André Hello, no, I am not sure, but I guess doing it the way you describe would be quite costly. Would it not be easier to have a field containing the 'demangled' names? At least in debug, since it has a significant space cost. Thanks! Pierre Vittet Hello, Have you considered doing it the other way around? I mean, why don't you hook on the PLUGIN_PRE_GENERICIZE event to catch all function bodies, and then compare the argument the user gave you to current_function_name() (which will return the full prototype of the current function, i.e. the full name of malloc is void* malloc(size_t)). Then, you can store the FUNCTION_DECL tree if there's a match and use it for later processing. That's how I proceed for my plugins. Romain Geissler
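The comparison step suggested above could be sketched roughly as follows. This is plain C with no GCC headers, so it only models the string matching; prototype_matches_name is a hypothetical helper, and the assumption that the printed prototype looks like "void* malloc(size_t)" is taken from the example in the message, not verified against GCC's output.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if PROTO (for example "void* malloc(size_t)") declares a
   function whose qualified name is exactly NAME (for example "malloc"
   or "myclass::init"), and 0 otherwise.  Hypothetical helper: it looks
   at the text just before the first '(' and does not handle function
   pointer return types or operator().  */
int
prototype_matches_name (const char *proto, const char *name)
{
  const char *paren = strchr (proto, '(');
  size_t len = strlen (name);
  const char *start;
  char before;

  if (paren == NULL || len == 0 || (size_t) (paren - proto) < len)
    return 0;

  /* The name must end right where the parameter list begins.  */
  start = paren - len;
  if (strncmp (start, name, len) != 0)
    return 0;

  /* The name must not be the tail of a longer identifier or of a
     longer qualified name such as "other::init".  */
  if (start == proto)
    return 1;
  before = start[-1];
  return before == ' ' || before == '*' || before == '&';
}
```

In a plugin, a callback registered for PLUGIN_PRE_GENERICIZE would call such a helper with the printed name of the current function and the user-supplied string, and keep the FUNCTION_DECL on a match.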
Re: Improve Solaris mudflap support (PR libmudflap/49550)
Hi Frank, Why would solaris have to duplicate MFWRAP_SPEC if mmap64 is added to the default gcc.c one? I assumed that you wanted to keep the default generic, and meant to separate target specific additions from the generic part. I don't have a strong opinion on this, but if you add mmap64 to the default, there need be no target-specific additions for solaris, right? So we can delay the decision to another day. fine with me :-) I could either commit the current version with the MFWRAP_SPEC addition and work from there, or wait until those failures are understood and fixed, too. Committing now would be fine, assuming no regressions on a primary platform. I've got a x86_64-unknown-linux-gnu bootstrap running just now. I'll commit if that succeeds. Thanks. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [pph] Stream and restore static_aggregates (issue4626096)
On Tue, Jul 5, 2011 at 9:08 PM, Diego Novillo dnovi...@google.com wrote: * pph-streamer-in.c (pph_add_bindings_to_namespace): diff --git a/gcc/cp/pph-streamer-in.c b/gcc/cp/pph-streamer-in.c index 72536a5..0bab93b 100644 --- a/gcc/cp/pph-streamer-in.c +++ b/gcc/cp/pph-streamer-in.c @@ -1167,11 +1167,9 @@ pph_add_bindings_to_namespace (struct cp_binding_level *bl, tree ns) /* Pushing a decl into a scope clobbers its DECL_CHAIN. Preserve it. */ chain = DECL_CHAIN (t); - - /* FIXME pph: we should first check to see if it isn't already there. - If it is, we should use this function recursively to merge - the bindings in T in the corresponding namespace. */ pushdecl_into_namespace (t, ns); + if (NAMESPACE_LEVEL (t)) + pph_add_bindings_to_namespace (NAMESPACE_LEVEL (t), t); } } I had removed these two lines because that pretty much does nothing (I think, at least removing them didn't make anything fail...)... When we stream out the other namespaces (through the tree which's root is scope_chain-bindings), we also stream out these namespaces' bindings. This is all rebuilt when streaming in, so calling pph_add_bindings_to_namespace (NAMESPACE_LEVEL (t), t); seems like it's trying to add to namespace t all the bindings that are already in it... (unless pushdecl_into_namespace does more magic then adding the bindings to the namespace?) To me this only made sense if we had found a corresponding namespace we wanted to merge the streamed in bindings into... Otherwise I agree we will need to modify pph_add_bindings_to_namespace. It is still missing the merge of usings and using_directives. static_decls if I remember correctly were automatically reinserted when doing pushdecl_into_namespace for the static decls in names. Gab
DOC patch: about gengtype plugins
Hello All, The attached documentation patch is nearly trivial, I was tempted to apply it without review. ### gcc/ChangeLog entry ### 2011-07-06 Basile Starynkevitch bas...@starynkevitch.net * doc/plugins.texi (Building GCC plugins): gengtype needs its gtype.state ### end gcc/ChangeLog entry ### Ok? -- Basile STARYNKEVITCH http://starynkevitch.net/Basile/ email: basileatstarynkevitchdotnet mobile: +33 6 8501 2359 8, rue de la Faiencerie, 92340 Bourg La Reine, France *** opinions {are only mine, sont seulement les miennes} *** Index: gcc/doc/plugins.texi === --- gcc/doc/plugins.texi (revision 175923) +++ gcc/doc/plugins.texi (working copy) @@ -446,6 +446,6 @@ A single source file plugin may be built with @cod plugin.so}, using backquote shell syntax to query the @file{plugin} directory. -Plugins needing to use @command{gengtype} require a GCC build -directory for the same version of GCC that they will be linked -against. +When a plugin needs to use @command{gengtype}, be sure that both +@file{gengtype} and @file{gtype.state} have the same version than the +GCC for which the plugin is built.
Re: [Path,AVR]: Improve loading of 32-bit constants
2011/7/6 Georg-Johann Lay a...@gjlay.de: Denis Chertykov wrote: 2011/7/6 Georg-Johann Lay a...@gjlay.de: For loading a 32-bit constant in a register, there is room for improvement: * SF can be handled the same way as SI and therefore the patch adds a peep2 to produce a *reload_insf analogue to *reload_insi. * If the destination register overlaps NO_LD_REGS, values already loaded into some other byte can be reused by a simple MOV. This is helpful when moving values such as -2 or -100, because all high bytes are 0xff. * 0.0f can be directly moved to memory. * The mov insns contain !d constraint. I see no reason to make d expensive and discourage use of d-regs. A *d to hide is better because it neither puts additional pressure on d nor discourages d. I would like to have real code examples. Denis. Hi Denis. Attached you find a small C file and the asm that is generated by new and old versions (-Os -mmcu=atmega88 -S -dp). I took away some regs as potential clobbers (or -fno-peephole2) to show the effect of high register pressure. But even if a clobber was available you can see that the new version is smarter in reusing values, e.g. note the loading of -1L to r22-r25. I asked for an example of *d instead of !d: just svn GCC with *d vs. svn GCC with !d. Denis.
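As a side note on the byte-reuse point quoted above, the following sketch counts how many distinct byte values a 32-bit constant contains. It is only an illustration of why constants such as -2 or -100 benefit from copying an already loaded byte with MOV; it is not the logic of the AVR patch, and distinct_bytes is a made-up name.

```c
#include <stdint.h>

/* Count the distinct byte values in VALUE.  If a byte that has already
   been materialised can be copied with a register move, this is an
   upper bound on the number of immediate loads needed.  */
int
distinct_bytes (uint32_t value)
{
  uint8_t seen[4];
  int n_seen = 0;
  int i, j;

  for (i = 0; i < 4; i++)
    {
      uint8_t byte = (uint8_t) (value >> (8 * i));
      int duplicate = 0;

      for (j = 0; j < n_seen; j++)
        if (seen[j] == byte)
          duplicate = 1;

      if (!duplicate)
        seen[n_seen++] = byte;
    }
  return n_seen;
}
```

For -2 (0xfffffffe) and -100 (0xffffff9c) only two distinct bytes occur, so the three 0xff bytes need a single immediate load plus copies.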
[Patch,testsuite]: Filter more test cases to fit target capabilities
Hi, I am struggling against hundreds of fails in the testsuite because many cases are not carefully written, e.g. stuff like shifting an int by 19 bits if int is only 16 bits wide. This patch adds some additional tests to avoid FAILs that are confusing. Sorry for gathering it in one patch, other patches may follow. I just don't want to flood you with a bulk of patches. Ok to commit? Johann. testsuite/ * gcc.dg/pragma-align.c: Run only if target !default_packed. * gcc.dg/pr46212.c: Run only if target int32plus. * gcc.dg/torture/pr48146.c: Ditto. * gcc.dg/tree-ssa/vrp51.c: Ditto. * c-c++-common/pr44832.c: Ditto. * gcc.dg/pr49544.c: Run only if target ptr32plus. * gcc.dg/pr31490.c: Ditto. * gcc.dg/torture/builtin-math-7.c: Run only if target large_double. * gcc.dg/torture/pr45764.c: Skip for AVR. * gcc.dg/pr47893.c: Ditto. Index: gcc.dg/pragma-align.c === --- gcc.dg/pragma-align.c (revision 175811) +++ gcc.dg/pragma-align.c (working copy) @@ -1,6 +1,6 @@ /* Prove that pragma alignment handling works somewhat. */ -/* { dg-do run } */ +/* { dg-do run { target { ! 
default_packed } } } */ extern void abort (void); Index: gcc.dg/pr46212.c === --- gcc.dg/pr46212.c (revision 175811) +++ gcc.dg/pr46212.c (working copy) @@ -2,6 +2,7 @@ /* { dg-do compile } */ /* { dg-options -O3 -funroll-loops } */ /* { dg-options -O3 -funroll-loops -march=i386 { target { { i686-*-* x86_64-*-* } ilp32 } } } */ +/* { dg-require-effective-target int32plus } */ static inline unsigned foo (void *x) Index: gcc.dg/pr49544.c === --- gcc.dg/pr49544.c (revision 175811) +++ gcc.dg/pr49544.c (working copy) @@ -1,6 +1,7 @@ /* PR debug/49544 */ /* { dg-do compile } */ /* { dg-options -g -O2 } */ +/* { dg-require-effective-target ptr32plus } */ int baz (int, int, void *); Index: gcc.dg/torture/pr45764.c === --- gcc.dg/torture/pr45764.c (revision 175811) +++ gcc.dg/torture/pr45764.c (working copy) @@ -1,4 +1,5 @@ /* { dg-do run } */ +/* { dg-skip-if Too much RAM needed { avr-*-* } { * } { } } */ int result[64][16]; Index: gcc.dg/torture/builtin-math-7.c === --- gcc.dg/torture/builtin-math-7.c (revision 175811) +++ gcc.dg/torture/builtin-math-7.c (working copy) @@ -6,6 +6,7 @@ /* { dg-do run } */ /* { dg-add-options ieee } */ +/* { dg-require-effective-target large_double } */ extern void link_error(int); Index: gcc.dg/torture/pr48146.c === --- gcc.dg/torture/pr48146.c (revision 175811) +++ gcc.dg/torture/pr48146.c (working copy) @@ -1,4 +1,5 @@ /* { dg-do compile } */ +/* { dg-require-effective-target int32plus } */ static unsigned char safe_sub_func_int_s_s (int si1, unsigned char si2) Index: gcc.dg/pr47893.c === --- gcc.dg/pr47893.c (revision 175811) +++ gcc.dg/pr47893.c (working copy) @@ -2,6 +2,7 @@ /* { dg-do run } */ /* { dg-options -O2 } */ /* { dg-options -O2 -mtune=atom -fno-omit-frame-pointer -fno-strict-aliasing { target { { i?86-*-* x86_64-*-* } ilp32 } } } */ +/* { dg-skip-if Too much RAM needed { avr-*-* } { * } { } } */ extern void abort (void); Index: gcc.dg/tree-ssa/vrp51.c === --- gcc.dg/tree-ssa/vrp51.c (revision 175811) +++ 
gcc.dg/tree-ssa/vrp51.c (working copy) @@ -1,6 +1,7 @@ /* PR tree-optimization/28632 */ /* { dg-do compile } */ /* { dg-options -O2 -ftree-vrp } */ +/* { dg-require-effective-target int32plus } */ void v4 (unsigned a, unsigned b) Index: gcc.dg/pr31490.c === --- gcc.dg/pr31490.c (revision 175811) +++ gcc.dg/pr31490.c (working copy) @@ -1,6 +1,8 @@ /* PR middle-end/31490 */ /* { dg-do compile } */ /* { dg-require-named-sections } */ +/* { dg-require-effective-target ptr32plus } */ + int cpu (void *attr) {} const unsigned long x __attribute__((section(foo))) = (unsigned long)cpu; const unsigned long g __attribute__((section(foo))) = 0; Index: c-c++-common/pr44832.c === --- c-c++-common/pr44832.c (revision 175811) +++ c-c++-common/pr44832.c (working copy) @@ -2,6 +2,7 @@ /* { dg-do compile } */ /* { dg-options -O2 -fcompare-debug } */ /* { dg-options -O2 -fcompare-debug -fno-short-enums {target short_enums} } */ +/* { dg-require-effective-target int32plus } */ struct rtx_def; typedef struct rtx_def *rtx;
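To illustrate the int-width problem described at the start of this message, here is a small C sketch of a test-style fragment that stays valid on a target with 16-bit int. The names are hypothetical and none of this comes from the listed test cases; it only shows the kind of assumption that the int32plus requirement makes explicit.

```c
#include <limits.h>

/* Number of bits in an int on the target being compiled for
   (assumes int has no padding bits).  */
#define INT_BITS ((int) (sizeof (int) * CHAR_BIT))

/* Return 1 if shifting an int by COUNT bits is defined on this
   target, 0 otherwise.  */
int
shift_count_is_valid (int count)
{
  return count >= 0 && count < INT_BITS;
}

/* Return a mask with bit 19 set, or 0 on targets where unsigned int
   is too narrow to hold it (for example a 16-bit int).  */
unsigned int
bit19_mask (void)
{
#if UINT_MAX >= 0x80000u
  return 1u << 19;
#else
  return 0u;
#endif
}
```

A test that shifts by 19 unconditionally is only meaningful where int has at least 20 bits, which is why gating it on the effective target is simpler than guarding every expression as above.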
Re: [testsuite] ARM wmul tests: require arm_dsp_multiply
On 06/29/2011 06:25 AM, Richard Earnshaw wrote: On 23/06/11 22:38, Janis Johnson wrote: Tests wmul-[1234].c and mla-2.c in gcc.target/arm require support that the arm backend identifies as TARGET_DSP_MULTIPLY. The tests all specify a -march option with that support, but it is overridden by multilib flags. This patch adds a new effective target, arm_dsp_multiply, and requires it for those tests instead of having them specify a -march value. This means that the tests will be skipped for older targets and test coverage relies on testing for some newer multilibs. The same effective target is needed for tests smlaltb-1.c, smlaltt-1.c, smlatb-1.c, and smlatt-1.c, but those also need to be renamed so the scans don't pass just because the file name is in the assembly file. OK for trunk, and later for 4.6? (btw, I'm currently testing ARM compile-only tests with 43 sets of multilib flags) I've recently approved a patch from James Greenhalgh (http://gcc.gnu.org/ml/gcc-patches/2011-06/msg01852.html) that defines __ARM_DSP_MULTIPLY when these features are available. That should simplify your target-supports change and also serve as a check that we aren't erroneously defining that macro. R. This version uses the new macro from James Greenhalgh, making the effective-target check trivial. The patch removes -march options from the tests, and adds a tab to the scans in smla*.c so the scan won't match the file name; there are other arm tests that use tab in the search target. OK for trunk, and later for 4.6? Putting this patch on 4.6 requires the new macro there as well. 2011-07-06 Janis Johnson jani...@codesourcery.com * lib/target-supports.exp (check_effective_target_arm_dsp): New. * gcc.target/arm/mla-2.c: Use it instead of specific -march. * gcc.target/arm/wmul-1.c: Likewise. * gcc.target/arm/wmul-2.c: Likewise. * gcc.target/arm/wmul-3.c: Likewise. * gcc.target/arm/wmul-4.c: Likewise. * gcc.target/arm/smlaltb-1.c: Require arm_dsp, don't specify -march, add tab after scan target. 
* gcc.target/arm/smlaltt-1.c: Likewise. * gcc.target/arm/smlatb-1.c: Likewise. * gcc.target/arm/smlatt-1.c: Likewise. Index: lib/target-supports.exp === --- lib/target-supports.exp (revision 175921) +++ lib/target-supports.exp (working copy) @@ -1911,6 +1911,18 @@ } } +# Return 1 if this is an ARM target that supports DSP multiply with +# current multilib flags. + +proc check_effective_target_arm_dsp { } { +return [check_no_compiler_messages arm_dsp assembly { + #ifndef __ARM_FEATURE_DSP + #error not DSP + #endif + int i; +}] +} + # Add the options needed for NEON. We need either -mfloat-abi=softfp # or -mfloat-abi=hard, but if one is already specified by the # multilib, use it. Similarly, if a -mfpu option already enables Index: gcc.target/arm/mla-2.c === --- gcc.target/arm/mla-2.c (revision 175921) +++ gcc.target/arm/mla-2.c (working copy) @@ -1,5 +1,6 @@ /* { dg-do compile } */ -/* { dg-options -O2 -march=armv7-a } */ +/* { dg-require-effective-target arm_dsp } */ +/* { dg-options -O2 } */ long long foolong (long long x, short *a, short *b) { Index: gcc.target/arm/wmul-1.c === --- gcc.target/arm/wmul-1.c (revision 175921) +++ gcc.target/arm/wmul-1.c (working copy) @@ -1,5 +1,6 @@ /* { dg-do compile } */ -/* { dg-options -O2 -march=armv6t2 } */ +/* { dg-require-effective-target arm_dsp } */ +/* { dg-options -O2 } */ int mac(const short *a, const short *b, int sqr, int *sum) { Index: gcc.target/arm/wmul-2.c === --- gcc.target/arm/wmul-2.c (revision 175921) +++ gcc.target/arm/wmul-2.c (working copy) @@ -1,5 +1,6 @@ /* { dg-do compile } */ -/* { dg-options -O2 -march=armv6t2 } */ +/* { dg-require-effective-target arm_dsp } */ +/* { dg-options -O2 } */ void vec_mpy(int y[], const short x[], short scaler) { Index: gcc.target/arm/wmul-3.c === --- gcc.target/arm/wmul-3.c (revision 175921) +++ gcc.target/arm/wmul-3.c (working copy) @@ -1,5 +1,6 @@ /* { dg-do compile } */ -/* { dg-options -O2 -march=armv6t2 } */ +/* { dg-require-effective-target arm_dsp } */ +/* { 
dg-options -O2 } */ int mac(const short *a, const short *b, int sqr, int *sum) { Index: gcc.target/arm/wmul-4.c === --- gcc.target/arm/wmul-4.c (revision 175921) +++ gcc.target/arm/wmul-4.c (working copy) @@ -1,5 +1,6 @@ /* { dg-do compile } */ -/* { dg-options -O2 -march=armv6t2 } */ +/* { dg-require-effective-target arm_dsp } */ +/* { dg-options -O2 } */ int mac(const
Re: [RFC PATCH, go]: Port to ALPHA arch - sysinfo.go fixup
Uros Bizjak ubiz...@gmail.com writes: On Tue, Jul 5, 2011 at 7:17 PM, Mike Stump mikest...@comcast.net wrote: Please note that we set the -mieee flag to compile .go files from the library and also add this flag to the default testsuite compile flags. Ick, I think this patch might be expedient, but, wrong. Ian will have to think about it and decide. This is something I came up with after a lot of staring at the build system: 2011-07-06 Uros Bizjak ubiz...@gmail.com * mt-alphaieee (GOCFLAGS_FOR_TARGET): Add -mieee. This patch by itself does not fix go testsuite failures, although the library is now OK. An additional patch is needed to pass GOCFLAGS to the compiler when checking the package. I will submit it separately. Tested on alphaev68-pc-linux-gnu. OK for mainline? Uros. Index: config/mt-alphaieee === --- config/mt-alphaieee (revision 175904) +++ config/mt-alphaieee (working copy) @@ -1,2 +1,3 @@ CFLAGS_FOR_TARGET += -mieee CXXFLAGS_FOR_TARGET += -mieee +GOCFLAGS_FOR_TARGET += -mieee This seems like a reasonable patch to me, but technically speaking it is incomplete. Go should have IEEE floating point behaviour by default. I believe Java is the same. Ideally there would be a target-independent way for a frontend to request this mode by default. It's a little bit odd because as far as I know every other backend does default to proper IEEE arithmetic, and only deviates when using -ffast-math or equivalent. Anyhow, it's hard for me to care all that much about the Alpha, so I will approve this patch. It's clearly better than the current situation, and it follows what other languages are doing. Thanks. Ian
Re: [RFC PATCH, go]: Port to ALPHA arch - sysinfo.go fixup
On Wed, Jul 6, 2011 at 7:34 PM, Ian Lance Taylor i...@google.com wrote: This seems like a reasonable patch to me, but technically speaking it is incomplete. Go should have IEEE floating point behaviour by default. I believe Java is the same. Ideally there would be a target-independent way for a frontend to request this mode by default. It's a little bit odd because as far as I know every other backend does default to proper IEEE arithmetic, and only deviates when using -ffast-math or equivalent. sh*-*-* also needs -mieee to handle NaN and Inf; spu-*-* simply doesn't support them. Uros.
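For readers who have not run into this, the behaviour under discussion can be sketched in a few lines of C. This is only an illustration under the assumption of a host that already defaults to IEEE arithmetic; the function name ieee_nan_inf_ok is made up for the example. On a target that needs a flag such as -mieee, the same operations may trap instead of producing Inf and NaN.

```c
#include <math.h>

/* Hypothetical helper: returns 1 when division by zero yields an infinity
   and inf - inf yields a NaN, which is what IEEE 754 arithmetic specifies.
   The volatile operands keep the compiler from folding the operations at
   compile time, so the check exercises the run-time arithmetic. */
int ieee_nan_inf_ok(void)
{
    volatile double zero = 0.0;
    volatile double one = 1.0;
    double inf = one / zero;       /* IEEE: +Inf, no trap */
    double nan_val = inf - inf;    /* IEEE: NaN */

    /* A NaN compares unequal to itself. */
    return isinf(inf) && isnan(nan_val) && nan_val != nan_val;
}
```

A front end such as Go relies on this non-trapping behaviour, which is why the flag matters for both the library and the tests.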
[testsuite] fixes for gcc.target/arm/mla-1.c
Test gcc.target/arm/mla-1.c scans the assembly file for a string that is part of the name, which always succeeds. This patch adds a tab to the search target to avoid that. It also removes the -march option and pruning of warnings about conflicts, and restricts the tests to targets that support Thumb-2, with support for new effective target arm_thumb2. Changes to the comment for arm_thumb1 support clarify the difference between it and arm_thumb1_ok. OK for trunk, and for 4.6 in a few days if no problems? 2011-07-06 Janis Johnson jani...@codesourcery.com * lib/target-supports.exp (check_effective_target_arm_thumb1): New. (check_effective_target_arm_thumb2): Clarify comment, add valid code. * gcc.target/arm/mla-1.c: Skip for arm_thumb1, don't specify -march, add tab to scan target. Index: lib/target-supports.exp === --- lib/target-supports.exp (revision 175921) +++ lib/target-supports.exp (working copy) @@ -2027,13 +2039,27 @@ } -mthumb] } -# Return 1 is this is an ARM target where is Thumb-2 used. +# Return 1 if this is an ARM target where Thumb-1 is used without options +# added by the test. +proc check_effective_target_arm_thumb1 { } { +return [check_no_compiler_messages arm_thumb1 assembly { + #if !defined(__arm__) || !defined(__thumb__) || defined(__thumb2__) + #error not thumb1 + #endif + int i; +} ] +} + +# Return 1 if this is an ARM target where Thumb-2 is used without options +# added by the test. 
+ proc check_effective_target_arm_thumb2 { } { return [check_no_compiler_messages arm_thumb2 assembly { #if !defined(__thumb2__) #error FOO #endif + int i; } ] } Index: gcc.target/arm/mla-1.c === --- gcc.target/arm/mla-1.c (revision 175921) +++ gcc.target/arm/mla-1.c (working copy) @@ -1,6 +1,6 @@ /* { dg-do compile } */ -/* { dg-options -O2 -march=armv5te } */ -/* { dg-prune-output switch .* conflicts with } */ +/* { dg-skip-if { arm_thumb1 } { * } { } } */ +/* { dg-options -O2 } */ int @@ -19,4 +19,4 @@ return accum; } -/* { dg-final { scan-assembler mla } } */ +/* { dg-final { scan-assembler mla\\t } } */
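The false positive that the added tab avoids can be reproduced outside the testsuite. The sketch below is an assumption-laden illustration: it uses plain substring search rather than the regular-expression matching that scan-assembler performs, and the assembly snippets and the helper name scan_matches are invented for the example.

```c
#include <string.h>

/* Hypothetical stand-in for a scan: returns 1 if pattern occurs anywhere
   in the assembly text, 0 otherwise. */
int scan_matches(const char *assembly, const char *pattern)
{
    return strstr(assembly, pattern) != NULL;
}

/* Invented assembly output with no mla instruction; the only "mla" is in
   the file name recorded by the .file directive. */
static const char asm_without_mla[] =
    "\t.file\t\"mla-1.c\"\n"
    "\tmul\tr0, r1, r0\n"
    "\tadd\tr0, r0, r2\n";

/* Invented assembly output that really contains an mla instruction. */
static const char asm_with_mla[] =
    "\t.file\t\"mla-1.c\"\n"
    "\tmla\tr0, r1, r0, r2\n";
```

Searching asm_without_mla for "mla" succeeds because of the file name, while searching for "mla\t" only succeeds when the mnemonic is followed by its operand separator.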
Re: [Path,AVR]: Improve loading of 32-bit constants
Denis Chertykov wrote: 2011/7/6 Georg-Johann Lay a...@gjlay.de: Denis Chertykov wrote: 2011/7/6 Georg-Johann Lay a...@gjlay.de: For loading a 32-bit constant in a register, there is room for improvement: * SF can be handled the same way as SI and therefore the patch adds a peep2 to produce a *reload_insf analogue to *reload_insi. * If the destination register overlaps NO_LD_REGS, values already loaded into some other byte can be reused by a simple MOV. This is helpful when moving values like, e.g. -2, -100 etc. because all high bytes are 0xff. * 0.0f can be directly moved to memory. * The mov insns contain !d constraint. I see no reason to make d expensive and discourage use of d-regs. A *d to hide d is better because it neither puts additional pressure on d nor discourages d. I would like to have some real code examples. Denis. Hi Denis. Attached you find a small C file and the asm that is generated by new and old versions (-Os -mmcu=atmega88 -S -dp). I took away some regs as potential clobbers (or -fno-peephole2) to show the effect of high register pressure. But even if a clobber was available you can see that the new version is smarter in reusing values, e.g. note the loading of -1L to r22-r25. I have asked about an example of *d instead of !d. Just svn GCC with *d vs svn GCC !d. Denis. Ah, I couldn't tell that from your question. I thought it could help in cases like these: long z; void inc (long y) { z += y; } that gets compiled with -Os to inc: push r16 push r17 /* prologue: function */ /* frame size = 0 */ /* stack size = 2 */ .L__stack_usage = 2 lds r16,z lds r17,z+1 lds r18,z+2 lds r19,z+3 add r16,r22 adc r17,r23 adc r18,r24 adc r19,r25 sts z,r16 sts z+1,r17 sts z+2,r18 sts z+3,r19 /* epilogue start */ pop r17 pop r16 ret But with the *d the code is still the same and R16 is chosen instead of the better R18. Maybe that's an IRA issue. Looking again at the *d resp.
!d, I think the alternative is superfluous because there is an r alternative and d is a subset of r, so the allocator can always switch to r if it does not like or see d. I think we can remove that alternative, it's just confusing. Johann
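The byte-reuse idea from the start of this thread (values such as -2 or -100 have identical high bytes, so a byte that is already loaded can be copied with a MOV) can be modelled in C. This is a rough sketch, not the avr backend's algorithm; the function name distinct_byte_loads is hypothetical and the model ignores any special handling of particular byte values.

```c
#include <stdint.h>

/* Hypothetical model: counts how many of the four bytes of a 32-bit
   constant need a fresh load. Any byte equal to one that was already
   loaded could instead be copied from that register. */
int distinct_byte_loads(uint32_t value)
{
    uint8_t seen[4];
    int loads = 0;

    for (int i = 0; i < 4; i++) {
        uint8_t byte = (uint8_t)(value >> (8 * i));
        int reused = 0;

        /* Look for an earlier byte with the same value. */
        for (int j = 0; j < loads; j++)
            if (seen[j] == byte)
                reused = 1;

        if (!reused)
            seen[loads++] = byte;
    }
    return loads;
}
```

In this model -2 (0xfffffffe) and -100 (0xffffff9c) each need two fresh loads and two copies, whereas a constant with four different bytes needs four loads.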
Re: [1/11] Use targetm.shift_truncation_mask more consistently
Bernd Schmidt ber...@codesourcery.com writes: At some point we've grown a shift_truncation_mask hook, but we're not using it everywhere we're masking shift counts. This patch changes the instances I found. The documentation reads: Note that, unlike @code{SHIFT_COUNT_TRUNCATED}, this function does @emph{not} apply to general shift rtxes; it applies only to instructions that are generated by the named shift patterns. I think you need to update the documentation, and check that existing target definitions do in fact apply to shift rtxes as well. Richard
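As background for the distinction being drawn, the effect of a truncation mask on a shift count can be sketched in C. The function name and the idea of passing the mask as a parameter are assumptions made for illustration; in GCC the mask comes from the target hook and, per the documentation quoted above, describes only the instructions generated by the named shift patterns.

```c
#include <stdint.h>

/* Hypothetical model of a shift instruction that truncates its count.
   mask stands in for the value a target might report for a 32-bit mode,
   for example 31. Callers are assumed to pass a mask no larger than 31,
   so the effective count stays within the width of uint32_t. */
uint32_t shift_left_truncated(uint32_t x, unsigned count, unsigned mask)
{
    return x << (count & mask);
}
```

With a mask of 31, a count of 33 behaves like a count of 1, which is the kind of equivalence an optimizer may only exploit where the hook is documented to apply.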
Re: [7/11] rtl optimizer changes
On 07/01/2011 10:35 AM, Bernd Schmidt wrote: * explow.c (trunc_int_for_mode): Use GET_MODE_PRECISION instead of GET_MODE_BITSIZE where appropriate. * rtlanal.c (subreg_lsb_1, subreg_get_info, nonzero_bits1, num_sign_bit_copies1, canonicalize_condition, low_bitmask_len, init_num_sign_bit_copies_in_rep): Likewise. * cse.c (fold_rtx, cse_insn): Likewise. * loop-doloop.c (doloop_modify, doloop_optimize): Likewise. * simplify-rtx.c (simplify_unary_operation_1, simplify_const_unary_operation, simplify_binary_operation_1, simplify_const_binary_operation, simplify_ternary_operation, simplify_const_relational_operation, simplify_subreg): Likewise. * combine.c (try_combine, find_split_point, combine_simplify_rtx, simplify_if_then_else, simplify_set, expand_compound_operation, expand_field_assignment, make_extraction, if_then_else_cond, make_compound_operation, force_to_mode, make_field_assignment, reg_nonzero_bits_for_combine, reg_num_sign_bit_copies_for_combine, extended_count, try_widen_shift_mode, simplify_shift_const_1, simplify_comparison, record_promoted_value, simplify_compare_const, record_dead_and_set_regs_1): Likewise. Ok. r~
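The difference between the two macros matters only for modes whose precision is smaller than their storage size. The sketch below assumes a hypothetical 40-bit integer mode stored in 8 bytes; the struct, the constant partial_40 and the function names are invented for illustration and are not GCC's data structures.

```c
#include <stdint.h>

/* Hypothetical description of a machine mode. */
struct mode_info {
    unsigned size_bytes;      /* storage size */
    unsigned precision_bits;  /* number of value bits actually used */
};

/* Assumed example: a 40-bit integer kept in 8 bytes of storage. */
static const struct mode_info partial_40 = { 8, 40 };

/* Bit size derived from the storage size, as a bitsize-style query would. */
unsigned mode_bitsize(struct mode_info m)
{
    return m.size_bytes * 8;
}

/* Sign-extend the low `bits` bits of value; bits is assumed to be in the
   range 1..64. */
int64_t sign_extend(int64_t value, unsigned bits)
{
    if (bits >= 64)
        return value;

    uint64_t mask = (UINT64_C(1) << bits) - 1;
    uint64_t low = (uint64_t)value & mask;
    uint64_t sign = UINT64_C(1) << (bits - 1);

    /* Flipping and subtracting the sign bit propagates it upwards. */
    return (int64_t)((low ^ sign) - sign);
}
```

Truncating 0xffffffffff to the 40-bit precision gives -1, while using the 64-bit storage size leaves it as a large positive number, so a truncation routine has to use the precision.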
Re: [8/11] Expander changes
On 07/01/2011 10:36 AM, Bernd Schmidt wrote: * optabs.c (expand_binop): Use GET_MODE_PRECISION instead of GET_MODE_BITSIZE where appropriate. (widen_leading, expand_parity, expand_ctz, expand_ffs, expand_unop, expand_abs_nojump, expand_one_cmpl_abs_nojump, expand_float, expand_fix): Likewise. * expr.c (convert_move, convert_modes, expand_expr_real_2, expand_expr_real_1, reduce_to_bit_field_precision): Likewise. * stor-layout.c (get_mode_bounds): Likewise. * cfgexpand.c (convert_debug_memory_address, expand_debug_expr): Likewise. * convert.c (convert_to_integer): Likewise. * expmed.c (expand_shift_1): Likewise. Ok. r~
Re: [PATCH] Fix PR49645, with C FE pieces
On Wed, Jul 6, 2011 at 6:26 AM, Richard Guenther rguent...@suse.de wrote: This fixes PR49645 - with MEM_REF the value-numbering machinery to look through aggregate copies wasn't working reliably as we have two representations for X, X and MEM[X]. The following patch fixes that by internally always using the more complicated representation. The patch needs consistent DECL_HARD_REGISTER settings to avoid generating MEM_REFs for them though and the C frontend fails to set that flag for global variables - hence the c-decl.c part (otherwise compile.exp 20041119-1.c ICEs). Bootstrapped and tested on x86_64-unknown-linux-gnu, are the C frontend parts ok for trunk? Thanks, Richard. 2011-07-06 Richard Guenther rguent...@suse.de PR tree-optimization/49645 * c-decl.c (finish_decl): Also set DECL_HARD_REGISTER for global register variables. * tree-ssa-sccvn.c (vn_reference_op_eq): Disregard differences in type qualification here ... (copy_reference_ops_from_ref): ... not here. (vn_reference_lookup_3): ... or here. (copy_reference_ops_from_ref): Record decl bases as MEM[decl]. (vn_reference_lookup): Do the lookup with a valueized ao-ref. * g++.dg/tree-ssa/pr8781.C: Disable SRA. This caused: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49662 -- H.J.
Re: [9/11] Fix units mismatch in comparison
On 07/01/2011 10:38 AM, Bernd Schmidt wrote: * rtlanal.c (nonzero_bits1): Don't compare GET_MODE_SIZE against a bitsize. Ok. r~
Re: [10/11] Expander fixes for 40-bit integers
On 07/01/2011 10:41 AM, Bernd Schmidt wrote: * optabs.c (expand_binop): Tighten conditions for doubleword expansions. (widen_bswap): Assert that mode bitsize and precision are the same. * stor-layout.c (get_best_mode): Skip modes that have lower precision than bitsize. * recog.c (simplify_while_replacing): Assert that bitsize and precision are the same. Ok. r~
Re: [11/11] Fix get_mode_bounds
On 07/01/2011 10:42 AM, Bernd Schmidt wrote: get_mode_bounds should also use GET_MODE_PRECISION, but this exposes a problem on ia64 - BImode needs to be handled specially here to work around another preexisting special case in gen_int_mode. Would it be better to remove the trunc_int_for_mode special case? It appears that I added that for ia64 and it's unchanged since... r~
libgo patch committed: Fix json test when rand returns 0
This patch is necessary when compiling the test cases with optimization, which changes the order of the calls to rand. I proposed the same patch to the upstream library. Bootstrapped and tested on x86_64-unknown-linux-gnu. Committed to mainline. Ian diff -r 48658f7ed377 libgo/go/json/scanner_test.go --- a/libgo/go/json/scanner_test.go Fri Jun 24 07:06:48 2011 -0700 +++ b/libgo/go/json/scanner_test.go Wed Jul 06 11:35:51 2011 -0700 @@ -252,6 +252,9 @@ if f > n { f = n } + if n > 0 && f == 0 { + f = 1 + } x := make([]interface{}, int(f)) for i := range x { x[i] = genValue(((i+1)*n)/f - (i*n)/f)
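The arithmetic in the hunk can be checked with a small C model. This is a sketch that mirrors the integer expressions in the diff; the function names are hypothetical, and the observation about the f == 0 case is an inference from the arithmetic rather than something stated in the message. The per-element budgets ((i+1)*n)/f - (i*n)/f telescope to n when f is positive, and with f equal to 0 the loop body never runs, so none of the budget n is handed out.

```c
/* Hypothetical model of the guard: clamp the element count f to at most n
   and to at least 1 whenever n is positive. */
int clamp_count(int f, int n)
{
    if (f > n)
        f = n;
    if (n > 0 && f == 0)
        f = 1;
    return f;
}

/* Sum of the per-element budgets handed to the f elements. For f > 0 the
   terms telescope to n; for f == 0 the loop does not execute. */
int total_budget(int n, int f)
{
    int total = 0;

    for (int i = 0; i < f; i++)
        total += ((i + 1) * n) / f - (i * n) / f;
    return total;
}
```

For instance, total_budget(10, 3) adds up to 10, while total_budget(10, 0) is 0.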
Re: [PATCH] Use ira_reg_class_max_nregs array instead of CLASS_MAX_NREGS macro
Hi. The patch http://gcc.gnu.org/ml/gcc-patches/2011-06/msg02405.html is still pending review. Anatoly.
Re: [Path,AVR]: Improve loading of 32-bit constants
Denis Chertykov wrote: I have asked about example of *d instead of !d. Just svn GCC with *d vs svn GCC !d. Denis. Is the patch ok with the original !d instead of *d ? It still improves and the !d vs. *d don't matter because there's always r I think. Johann
[patch] PR other/49658 fix typo in extend.texi
Index: extend.texi === --- extend.texi (revision 175887) +++ extend.texi (working copy) @@ -1629,7 +1629,7 @@ char **foo = (char *[]) @{ x, y, z @}; @end smallexample -Compound literals for scalar types and union types are is +Compound literals for scalar types and union types are also allowed, but then the compound literal is equivalent to a cast. committed to trunk as obvious PR other/49658 * doc/extend.texi (Compound Literals): Fix typo.
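For context, the sentence being corrected describes compound literals of scalar and union type. A minimal sketch follows, with invented function names and a union made up for the example; the array case follows the example shown in the hunk.

```c
/* Hypothetical union used only for this illustration. */
union number {
    int i;
    float f;
};

/* Scalar compound literal: yields the same value as the cast (int) 42. */
int scalar_literal(void)
{
    return (int){ 42 };
}

/* Union compound literal initializing the member i. */
int union_literal(void)
{
    union number n = (union number){ .i = 7 };
    return n.i;
}

/* Array compound literal, as in the manual's example; returns the first
   character of the third string. */
char third_first_char(void)
{
    const char **foo = (const char *[]){ "x", "y", "z" };
    return foo[2][0];
}
```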
Re: [PATCH, testsuite] Fix for PR49519, miscompiled 447.dealII in SPEC CPU 2006
2011-07-06 Kirill Yukhin kirill.yuk...@intel.com PR tailcall-optimization/49519 Please do not invent components, this will disable the automatic xref of the commit in bugzilla. Copy the Component field of the PR, middle-end here. * calls.c (mem_overlaps_already_clobbered_arg_p): Additional check if address is stored in register. If so - give up. (check_sibcall_argument_overlap_1): Do not perform check of overlapping when it is call to address. Are the 2 changes totally unrelated? Bootstrapped, new test fails without patch, passes when it is applied. This fixes the problem with the SPEC2006/447.dealII miscompile. Ok for trunk? The patch lacks comments - one shouldn't need to read the PR audit trail to understand why the new lines are there. -- Eric Botcazou
[PATCH] Fix dead_debug_insert_before ICE (PR debug/49522, take 2)
On Tue, Jul 05, 2011 at 10:06:51PM +0200, Jakub Jelinek wrote: On Tue, Jul 05, 2011 at 10:35:11AM +0200, Eric Botcazou wrote: There are two kinds of changes we do on the debug insns without immediate rescanning: 1) reset the debug insn 2) replace a reg use with DEBUG_EXPR of the same mode or subreg of a larger DEBUG_EXPR with the same outer mode as the reg In the attached testcase on arm a debug insn is reset, because a multi-reg register has been used there and as the debug insn location was that multi-reg register before, it is now VOIDmode after the reset - (clobber (const_int 0)). That can happen only in this case, right? Otherwise, for a single register, the debug insn would have been removed from debug-head already. If so, how simpler would it be to remove the other uses in dead_debug_reset instead? So you prefer something like this (untested) instead? Without the second loop I have no idea how to make it work in dead_debug_reset, the other dead_debug_use referencing the same insn might be earlier or later in the chain. And here is a version that passed bootstrap/regtest on x86_64-linux and i686-linux: 2011-07-06 Jakub Jelinek ja...@redhat.com PR debug/49522 * df-problems.c (dead_debug_reset): Remove dead_debug_uses referencing debug insns that have been reset. (dead_debug_insert_before): Don't assert reg is non-NULL, instead return immediately if it is NULL. * gcc.dg/debug/pr49522.c: New test. --- gcc/df-problems.c.jj2011-07-04 19:17:50.757435754 +0200 +++ gcc/df-problems.c 2011-07-06 17:20:06.264420868 +0200 @@ -3117,6 +3117,25 @@ dead_debug_reset (struct dead_debug *deb else tailp = (*tailp)-next; } + + /* If any other dead_debug_use structs refer to the debug insns + that have been reset above, remove them too. 
*/ + if (debug->to_rescan != NULL) +{ + tailp = &debug->head; + while ((cur = *tailp)) + { + insn = DF_REF_INSN (cur->use); + if (bitmap_bit_p (debug->to_rescan, INSN_UID (insn)) + && VAR_LOC_UNKNOWN_P (INSN_VAR_LOCATION_LOC (insn))) + { + *tailp = cur->next; + XDELETE (cur); + } + else + tailp = &(*tailp)->next; + } +} } /* Add USE to DEBUG. It must be a dead reference to UREGNO in a debug @@ -3174,7 +3193,8 @@ dead_debug_insert_before (struct dead_de tailp = &(*tailp)->next; } - gcc_assert (reg); + if (reg == NULL) +return; /* Create DEBUG_EXPR (and DEBUG_EXPR_DECL). */ dval = make_debug_expr_from_rtl (reg); --- gcc/testsuite/gcc.dg/debug/pr49522.c.jj 2011-07-04 10:54:23.0 +0200 +++ gcc/testsuite/gcc.dg/debug/pr49522.c2011-07-04 10:54:02.0 +0200 @@ -0,0 +1,41 @@ +/* PR debug/49522 */ +/* { dg-do compile } */ +/* { dg-options -fcompare-debug } */ + +int val1 = 0L; +volatile int val2 = 7L; +long long val3; +int *ptr = &val1; + +static int +func1 () +{ + return 0; +} + +static short int +func2 (short int a, unsigned int b) +{ + return !b ? a : a b; +} + +static unsigned long long +func3 (unsigned long long a, unsigned long long b) +{ + return !b ? a : a % b; +} + +void +func4 (unsigned short arg1, int arg2) +{ + for (arg2 = 0; arg2 2; arg2++) +{ + *ptr = func3 (func3 (10, func2 (val3, val2)), val3); + for (arg1 = -14; arg1 14; arg1 = func1 ()) + { + *ptr = -1; + if (foo ()) + ; + } +} +} Jakub
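The second loop in the patch uses the pointer-to-pointer idiom for unlinking nodes from a singly linked list. A self-contained sketch of that idiom is below; the struct and function names are hypothetical and far simpler than the dead_debug structures in df-problems.c, and matching on an integer uid stands in for the bitmap and VAR_LOC_UNKNOWN_P tests.

```c
#include <stdlib.h>

/* Hypothetical list node: one recorded use, keyed by an insn uid. */
struct use_node {
    int insn_uid;
    struct use_node *next;
};

/* Prepend a node for uid and return the new head; aborts if allocation
   fails, to keep the sketch short. */
struct use_node *push_use(struct use_node *head, int uid)
{
    struct use_node *node = malloc(sizeof *node);

    if (node == NULL)
        abort();
    node->insn_uid = uid;
    node->next = head;
    return node;
}

/* Unlink and free every node whose insn_uid equals uid. tailp always
   points at the link that leads to the current node, so removing a node
   is a single store and the head needs no special case. Returns the
   number of nodes removed. */
int remove_uses(struct use_node **head, int uid)
{
    struct use_node **tailp = head;
    struct use_node *cur;
    int removed = 0;

    while ((cur = *tailp) != NULL) {
        if (cur->insn_uid == uid) {
            *tailp = cur->next;       /* bypass the node */
            free(cur);
            removed++;
        } else {
            tailp = &(*tailp)->next;  /* advance to the next link */
        }
    }
    return removed;
}

/* Number of nodes currently in the list. */
int list_length(const struct use_node *head)
{
    int n = 0;

    for (; head != NULL; head = head->next)
        n++;
    return n;
}
```

Because matching nodes can sit anywhere in the chain, a full pass like this is needed, which is the point Jakub makes about the other uses being earlier or later in the list.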
[wwwdocs] Buildstat update for 4.5
Latest results for 4.5.x -tgc Testresults for 4.5.3: powerpc-apple-darwin8.11.0 sparc-sun-solaris2.7 Testresults for 4.5.2: powerpc-apple-darwin8.11.0 Testresults for 4.5.1: powerpc-apple-darwin8.11.0 Index: buildstat.html === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.5/buildstat.html,v retrieving revision 1.11 diff -u -r1.11 buildstat.html --- buildstat.html 4 Jun 2011 12:48:05 - 1.11 +++ buildstat.html 6 Jul 2011 19:45:29 - @@ -238,6 +238,9 @@ <td>powerpc-apple-darwin8.11.0</td> <td>&nbsp;</td> <td>Test results: +<a href="http://gcc.gnu.org/ml/gcc-testresults/2011-06/msg02820.html">4.5.3</a>, +<a href="http://gcc.gnu.org/ml/gcc-testresults/2011-06/msg00695.html">4.5.2</a>, +<a href="http://gcc.gnu.org/ml/gcc-testresults/2011-06/msg00694.html">4.5.1</a>, <a href="http://gcc.gnu.org/ml/gcc-testresults/2010-08/msg01506.html">4.5.1</a> </td> </tr> @@ -271,6 +274,7 @@ <td>sparc-sun-solaris2.7</td> <td>&nbsp;</td> <td>Test results: +<a href="http://gcc.gnu.org/ml/gcc-testresults/2011-06/msg03485.html">4.5.3</a>, <a href="http://gcc.gnu.org/ml/gcc-testresults/2010-06/msg01449.html">4.5.0</a> </td> </tr>
[wwwdocs] Buildstat update for 4.4
Latest results for 4.4.x. -tgc Testresults for 4.4.6: powerpc-apple-darwin8.11.0 Testresults for 4.4.4: powerpc-apple-darwin8.11.0 Index: buildstat.html === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.4/buildstat.html,v retrieving revision 1.23 diff -u -r1.23 buildstat.html --- buildstat.html 4 Jun 2011 11:56:35 - 1.23 +++ buildstat.html 6 Jul 2011 19:44:58 - @@ -343,6 +343,8 @@ <td>powerpc-apple-darwin8.11.0</td> <td>&nbsp;</td> <td>Test results: +<a href="http://gcc.gnu.org/ml/gcc-testresults/2011-06/msg00888.html">4.4.6</a>, +<a href="http://gcc.gnu.org/ml/gcc-testresults/2011-06/msg00887.html">4.4.4</a>, <a href="http://gcc.gnu.org/ml/gcc-testresults/2010-08/msg03013.html">4.4.4</a> </td> </tr>
Re: [PATCH, go]: Compile go checks with $(GOCFLAGS)
Uros Bizjak ubiz...@gmail.com writes: IMO, it makes sense to compile go tests with the same pack of flags as the library. Additionally, this solves an issue with extra compile flags (i.e. -mieee) that needs to be added to handle NaN/Inf. Patch was tested on x86_64-pc-linux-gnu {, -m32} and alphaev68-pc-linux-gnu (where the patch fixes all FPE related testsuite failures due to missing -mieee flags). Approved and applied. Thanks. Ian
Remove unused t-* fragments
This patch removes three unused t-* makefile fragments. (t-pa is unused because no target uses it explicitly and all PA targets define nonempty tmake_file; t-$cpu_type is only used implicitly if tmake_file is empty after config.gcc.) Bootstrapped with no regressions on x86_64-unknown-linux-gnu. OK to commit? 2011-07-06 Joseph Myers jos...@codesourcery.com * config/i386/t-crtpic, config/i386/t-svr3dbx, config/pa/t-pa: Remove. Index: gcc/config/i386/t-svr3dbx === --- gcc/config/i386/t-svr3dbx (revision 175919) +++ gcc/config/i386/t-svr3dbx (working copy) @@ -1,7 +0,0 @@ -# gas 1.38.1 supporting dbx-in-coff requires a link script. - -svr3.ifile: $(srcdir)/config/i386/svr3.ifile - rm -f svr3.ifile; cp $(srcdir)/config/i386/svr3.ifile . - -svr3z.ifile: $(srcdir)/config/i386/svr3z.ifile - rm -f svr3z.ifile; cp $(srcdir)/config/i386/svr3z.ifile . Index: gcc/config/i386/t-crtpic === --- gcc/config/i386/t-crtpic (revision 175919) +++ gcc/config/i386/t-crtpic (working copy) @@ -1,10 +0,0 @@ -# The pushl in CTOR initialization interferes with frame pointer elimination. - -# We need to use -fPIC when we are using gcc to compile the routines in -# crtstuff.c. This is only really needed when we are going to use gcc/g++ -# to produce a shared library, but since we don't know ahead of time when -# we will be doing that, we just always use -fPIC when compiling the -# routines in crtstuff.c. - -CRTSTUFF_T_CFLAGS = -fPIC -fno-omit-frame-pointer -TARGET_LIBGCC2_CFLAGS = -fPIC Index: gcc/config/pa/t-pa === --- gcc/config/pa/t-pa (revision 175919) +++ gcc/config/pa/t-pa (working copy) @@ -1,7 +0,0 @@ -TARGET_LIBGCC2_CFLAGS = -fPIC - -LIB2FUNCS_EXTRA=lib2funcs.asm - -lib2funcs.asm: $(srcdir)/config/pa/lib2funcs.asm - rm -f lib2funcs.asm - cp $(srcdir)/config/pa/lib2funcs.asm . -- Joseph S. Myers jos...@codesourcery.com
[lra] initial support of debug info and some fixes
The following patch contains some of my work since the last lra patch: o initial support of debug info in LRA. o code size improvement for i386 (changes in register banks). o code improvement by better reallocation of non-reload pseudos in LRA, by using the allocno class, which is usually wider than the preferred class used before. o a fix for IRA working with reload. Without the fix IRA+reload generated much worse code than IRA + LRA (now IRA+reload generates the same code as on the trunk). o restoring original x86 constraints for movzbl generation (they were changed to fix some LRA testsuite degradations). The patch was successfully bootstrapped on x86-64, IA-64, and PPC64. 2011-07-06 Vladimir Makarov vmaka...@redhat.com * lra-assigns.c (reload_pseudo_compare_func, find_hard_regno_for): Use lra_get_allocno_class instead of lra_get_preferred_class. (spill_for, assign_by_spills): Ditto. * lra-constraints.c (get_try_hard_regno, get_reg_class): Use lra_get_allocno_class instead of lra_get_preferred_class. (make_early_clobber_input_reload_reg, inherit_reload_reg): Ditto. (inherit_in_ebb): Ditto. (equivalence_change_p): New function. (lra_constraints): Call equivalence_change_p for debug insns. * lra-int.h (lra_get_preferred_class): Rename to lra_get_allocno_class and return allocno class. (struct lra_reg): Add comments for members insn_bitmap, nrefs, and freq. (lra_operand_data, lra_static_insn_data): Add comment about future changes and debug insns. * lra.c (debug_operand_data, debug_insn_static_data): New initialized variables. (free_insn_recog_data, lra_get_insn_recog_data): Add code for processing debug insns. (lra_update_insn_recog_data, invalidate_insn_data_regno_info): Ditto. (lra_update_insn_regno_info): Ditto. * lra-eliminations.c (lra_eliminate_regs_1): Process asm operands too. (eliminate_regs_in_insn): Process debug insns too. * lra-spills.c (lra_hard_reg_substitution): Process debug insns too. * lra-equivs.c (memref_used_between_p): Process debug insns too. 
* ira.c (ira_setup_eliminable_regset): Initialize dont_use_regs for reload. * reginfo.c (allocate_reg_info, resize_reg_info): Initialize allocno class as GENERAL_REGS. * config/i386/i386.md (*anddi_1, *andsi_1, *andhi_1): Restore correct constraints for movzb. * config/i386/i386.c (ix86_register_bank): Make AX as the most preferable. Index: lra-assigns.c === --- lra-assigns.c (revision 175929) +++ lra-assigns.c (working copy) @@ -78,8 +78,8 @@ static int reload_pseudo_compare_func (const void *v1p, const void *v2p) { int r1 = *(const int *) v1p, r2 = *(const int *) v2p; - enum reg_class cl1 = lra_get_preferred_class (r1); - enum reg_class cl2 = lra_get_preferred_class (r2); + enum reg_class cl1 = lra_get_allocno_class (r1); + enum reg_class cl2 = lra_get_allocno_class (r2); int diff; gcc_assert (r1 = lra_constraint_new_regno_start @@ -292,7 +292,7 @@ find_hard_regno_for (int regno, int *cos bool all_p; COPY_HARD_REG_SET (conflict_set, lra_no_alloc_regs); - rclass = lra_get_preferred_class (regno); + rclass = lra_get_allocno_class (regno); curr_hard_regno_costs_check++; sparseset_clear (conflict_reload_pseudos); sparseset_clear (live_range_hard_reg_pseudos); @@ -344,7 +344,7 @@ find_hard_regno_for (int regno, int *cos for (i = FIRST_STACK_REG; i = LAST_STACK_REG; i++) SET_HARD_REG_BIT (conflict_set, i); #endif - gcc_assert (rclass == lra_get_preferred_class (curr_regno)); + gcc_assert (rclass == lra_get_allocno_class (curr_regno)); } for (curr_regno = lra_reg_info[regno].first; curr_regno = 0; @@ -642,7 +642,7 @@ spill_for (int regno, bitmap spilled_pse bitmap_iterator bi, bi2; gcc_assert (lra_reg_info[regno].first == regno); - rclass = lra_get_preferred_class (regno); + rclass = lra_get_allocno_class (regno); gcc_assert (reg_renumber[regno] 0 rclass != NO_REGS); bitmap_clear (ignore_pseudos_bitmap); bitmap_clear (best_spill_pseudos_bitmap); @@ -737,7 +737,7 @@ spill_for (int regno, bitmap spilled_pse if (reg_renumber[reload_regno] 0 lra_reg_info[reload_regno].first 
== (int) reload_regno (hard_reg_set_intersect_p - (reg_class_contents[lra_get_preferred_class (reload_regno)], + (reg_class_contents[lra_get_allocno_class (reload_regno)], spilled_hard_regs))) sorted_reload_pseudos[n++] = reload_regno; qsort (sorted_reload_pseudos, n, sizeof (int), pseudo_compare_func); @@ -950,7 +950,7 @@ assign_by_spills (void) for (n = 0, i =
Re: Remove unused t-* fragments
On Wed, Jul 6, 2011 at 10:14 PM, Joseph S. Myers jos...@codesourcery.com wrote: This patch removes three unused t-* makefile fragments. (t-pa is unused because no target uses it explicitly and all PA targets define nonempty tmake_file; t-$cpu_type is only used implicitly if tmake_file is empty after config.gcc.) Bootstrapped with no regressions on x86_64-unknown-linux-gnu. OK to commit? 2011-07-06 Joseph Myers jos...@codesourcery.com * config/i386/t-crtpic, config/i386/t-svr3dbx, config/pa/t-pa: Remove. OK for x86. Thanks, Uros.
Re: [PATCH] Fix dead_debug_insert_before ICE (PR debug/49522)
So you prefer something like this (untested) instead? I think that, ideally, we should avoid leaving the dead_debug chain in the semi-broken state that we currently have. Without the second loop I have no idea how to make it work in dead_debug_reset, the other dead_debug_use referencing the same insn might be earlier or later in the chain. I guess I was somehow hoping that you could use one of the numerous DF links to get to the other uses; probably not, in the end, indeed. But you can set a flag in the first loop in order to decide whether to run the second loop. But I don't really have a strong opinion so, if you think that the original patch is good enough, fine with me. Maybe use gcc_checking_assert though. -- Eric Botcazou
Re: [PATCH] Fix dead_debug_insert_before ICE (PR debug/49522, take 2)
And here is a version that passed bootstrap/regtest on x86_64-linux and i686-linux: 2011-07-06 Jakub Jelinek ja...@redhat.com PR debug/49522 * df-problems.c (dead_debug_reset): Remove dead_debug_uses referencing debug insns that have been reset. (dead_debug_insert_before): Don't assert reg is non-NULL, instead return immediately if it is NULL. * gcc.dg/debug/pr49522.c: New test. Sorry, our messages crossed. I'd set a flag in the first loop. In the end, it's up to you. -- Eric Botcazou
[pph] Add FIXME comment to avoid finalizing decls when generating pph image. (issue4626099)
We do not need to finalize decls and add them to the varpool when generating the pph image as we will do this when streaming in (lto also already does it this way). I simply added a comment for now, because this will not fix anything, it will simply avoid streaming out unnecessary stuff. Since this is not in the front-end it is not as easy as checking pph_out_file != NULL here. Gab diff --git a/gcc/ChangeLog.pph b/gcc/ChangeLog.pph index b9aeb4d..2776bf0 100644 --- a/gcc/ChangeLog.pph +++ b/gcc/ChangeLog.pph @@ -1,3 +1,7 @@ +2011-07-06 Gabriel Charette gch...@google.com + + * passes.c (rest_of_decl_compilation): Add FIXME pph comment. + 2011-07-05 Diego Novillo dnovi...@google.com Merge from trunk rev 175832. diff --git a/gcc/passes.c b/gcc/passes.c index fc9767e..ce7f846 100644 --- a/gcc/passes.c +++ b/gcc/passes.c @@ -184,7 +184,9 @@ rest_of_decl_compilation (tree decl, !DECL_EXTERNAL (decl)) { /* When reading LTO unit, we also read varpool, so do not -rebuild it. */ +rebuild it. +FIXME pph: This is also true for pph and we should not +call varpool_finalize_decl when generating a pph image. */ if (in_lto_p && !at_end) ; else if (TREE_CODE (decl) != FUNCTION_DECL) -- This patch is available for review at http://codereview.appspot.com/4626099
Re: [Patch, Fortran] Register allocatable coarrays.
Daniel Carrera wrote: On 07/05/2011 09:57 AM, Tobias Burnus wrote: On 07/04/2011 11:34 PM, Daniel Carrera wrote: The test compiles, but there are expected failures because gcc doesn't think that allocatable scalar coarrays are supported. [...] I really don't want to knowingly add a failing test. Thus, either one adds a test for allocatable array coarrays - or one simply relies on the existing gfortran.dg/coarray/{dummy_1.f90,image_index_1.f90,this_image_1.f90}, which allocate allocatable (array) coarrays. I vote for the second option. Rely on the existing tests. [...] Attached is an updated ChangeLog. And as I wrote above, it seems better to rely on the existing test cases than to duplicate a check for allocatable array coarrays. 2011-07-04 Daniel Carrera dcarr...@gmail.com * trans-array.c (gfc_array_allocate): Rename allocatable_array to allocatable. Rename function gfc_allocate_array_with_status to gfc_allocate_allocatable_with_status. Update function call for gfc_allocate_with_status. * trans-openmp.c (gfc_omp_clause_default_ctor): Rename function gfc_allocate_array_with_status to gfc_allocate_allocatable_with_status. * trans-stmt.c (gfc_trans_allocate): Update function call for gfc_allocate_with_status. Rename function gfc_allocate_array_with_status to gfc_allocate_allocatable_with_status. * trans.c (gfc_call_malloc): Add new parameter gfc_allocate_with_status so it uses the library for memory allocation when -fcoarray=lib. (gfc_allocate_allocatable_with_status): Renamed from gfc_allocate_array_with_status. (gfc_allocate_allocatable_with_status): Update function call for gfc_allocate_with_status. * trans.h (gfc_coarray_type): New enum. (gfc_allocate_with_status): Update prototype. (gfc_allocate_allocatable_with_status): Renamed from gfc_allocate_array_with_status. * trans-decl.c: Use the new constant GFC_CAF_COARRAY_ALLOC in the call to gfor_fndecl_caf_register. Nit: There is a (generate_coarray_sym_init) missing in the trans-decl.c entry. 
Otherwise the patch is OK. I have now committed it as Rev. 175937. Thanks for the patch! Tobias
Re: [Patch, Fortran] Register allocatable coarrays.
On Wed, Jul 06, 2011 at 10:57:35PM +0200, Tobias Burnus wrote: I have now committed it as Rev. 175937 Does Daniel have write-after-approval svn access? If not, we should probably get him access. -- Steve
PING: PATCH [8/n]: Prepare x32: PR other/48007: Unwind library doesn't work with UNITS_PER_WORD > sizeof (void *)
PING. On Thu, Jun 30, 2011 at 1:47 PM, H.J. Lu hjl.to...@gmail.com wrote: On Thu, Jun 30, 2011 at 12:02 PM, Richard Henderson r...@redhat.com wrote: On 06/30/2011 11:23 AM, H.J. Lu wrote: +#ifdef REG_VALUE_IN_UNWIND_CONTEXT +typedef _Unwind_Word _Unwind_Context_Reg_Val; +/* Signal frame context. */ +#define SIGNAL_FRAME_BIT ((_Unwind_Word) 1 0) There's absolutely no reason to re-define this. So what if the value is most-significant-bit set? Nor do I see any reason not to continue setting E_C_B. Done. +#define _Unwind_IsExtendedContext(c) 1 Why is this not still an inline function? It is defined before _Unwind_Context is declared. I used macros so that there can be one less #ifdef. + +static inline _Unwind_Word +_Unwind_Get_Unwind_Word (_Unwind_Context_Reg_Val val) +{ + return val; +} + +static inline _Unwind_Context_Reg_Val +_Unwind_Get_Unwind_Context_Reg_Val (_Unwind_Word val) +{ + return val; +} I cannot believe this actually works. I see nowhere that you copy the by-address slot out of the stack frame and place it into the by-value slot in the unwind context. I changed the implementation based on the feedback from Jason. Now I use the same reg field for both value and address. /* This will segfault if the register hasn't been saved. */ if (size == sizeof(_Unwind_Ptr)) - return * (_Unwind_Ptr *) ptr; + return * (_Unwind_Ptr *) (_Unwind_Internal_Ptr) val; else { gcc_assert (size == sizeof(_Unwind_Word)); - return * (_Unwind_Word *) ptr; + return * (_Unwind_Word *) (_Unwind_Internal_Ptr) val; } Indeed, this section is both wrong and belies the change you purport to make. You didn't even test this, did you? Here is the updated patch. It works on simple tests. I am running full tests. I kept config/i386/value-unwind.h since libgcc/md-unwind-support.h is included too late in unwind-dw2.c and, to be on the safe side, I don't want to move it. OK for trunk? Thanks. -- H.J. --- gcc/ 2011-06-30 H.J.
Lu hongjiu...@intel.com * config.gcc (libgcc_tm_file): Add i386/value-unwind.h for Linux/x86. * system.h (REG_VALUE_IN_UNWIND_CONTEXT): Poisoned. * unwind-dw2.c (_Unwind_Context_Reg_Val): New. (_Unwind_Get_Unwind_Word): Likewise. (_Unwind_Get_Unwind_Context_Reg_Val): Likewise. (_Unwind_Context): Use _Unwind_Context_Reg_Val on the reg field. (_Unwind_IsExtendedContext): Defined as macro. (_Unwind_GetGR): Updated. (_Unwind_SetGR): Likewise. (_Unwind_GetGRPtr): Likewise. (_Unwind_SetGRPtr): Likewise. (_Unwind_SetGRValue): Likewise. (_Unwind_GRByValue): Likewise. (__frame_state_for): Likewise. (uw_install_context_1): Likewise. * doc/tm.texi.in: Document REG_VALUE_IN_UNWIND_CONTEXT. * doc/tm.texi: Regenerated. libgcc/ 2011-06-30 H.J. Lu hongjiu...@intel.com * config/i386/value-unwind.h: New. -- H.J.
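For readers following the by-value versus by-address discussion above, here is a minimal sketch of a register slot that can hold either the register's value or the address where it was saved. It is not the libgcc code: the names reg_word, reg_context, ctx_set_value, ctx_set_address and ctx_get are hypothetical stand-ins, and the slot is assumed to be at least as wide as a pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the unwinder's word type; wide enough to
   hold either a register value or a pointer.  */
typedef uint64_t reg_word;

#define NUM_REGS 4

struct reg_context {
  reg_word reg[NUM_REGS];            /* one slot: a value or an address */
  unsigned char by_value[NUM_REGS];  /* nonzero: reg[i] is the value itself */
};

/* Store the register's value directly in the slot.  */
void ctx_set_value(struct reg_context *ctx, int i, reg_word val)
{
  assert(i >= 0 && i < NUM_REGS);
  ctx->reg[i] = val;
  ctx->by_value[i] = 1;
}

/* Store the address of the saved register in the same slot.  */
void ctx_set_address(struct reg_context *ctx, int i, const reg_word *saved)
{
  assert(i >= 0 && i < NUM_REGS);
  ctx->reg[i] = (reg_word)(uintptr_t)saved;
  ctx->by_value[i] = 0;
}

/* Read the register: return the slot itself, or dereference it.  */
reg_word ctx_get(const struct reg_context *ctx, int i)
{
  assert(i >= 0 && i < NUM_REGS);
  if (ctx->by_value[i])
    return ctx->reg[i];
  return *(const reg_word *)(uintptr_t)ctx->reg[i];
}
```

The point of the single field is that a target whose word is wider than a pointer can keep a full-width value in the slot, while other registers still refer to their save location in the frame.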
Re: [pph] Test cleanup (issue4572050)
After having a look at how pph.exp works last Friday I think I could do this myself easily enough. Or are you still modifying the tests and want me to avoid touching this for now? Gab On Fri, Jul 1, 2011 at 5:51 PM, Lawrence Crowl cr...@google.com wrote: On 7/1/11, Gabriel Charette gch...@google.com wrote: One problem now though: `// pph asm xdiff`, only flags for asm diffs, but those could be different diffs after a change (for the better or worse) and this won't be caught. It's probably hard to get something precise on this, but maybe we could simply add the # of lines of diff expected, e.g. `// pph asm xdiff 32`. Then we XFAIL if the number of expected lines in the diff match, but actually fail if the number of lines in the diff is now different. I'm not very familiar with dg.. Is that doable? Would be very helpful at this stage. That looks easy enough. I need to finish the current test stuff before I get to that though. -- Lawrence Crowl
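The proposed `// pph asm xdiff 32` rule can be written down as a small decision function. This is only a sketch of the logic in C; the real check would live in pph.exp, and the names outcome and classify_asm_diff are made up for illustration.

```c
#include <assert.h>

enum outcome { OUTCOME_XFAIL, OUTCOME_FAIL };

/* expected_lines: the count written after "xdiff" in the test file.
   actual_lines:   the number of lines in the asm diff just produced.
   Assumption: any change in the count, including a drop to zero,
   is reported as a failure so that someone looks at it.  */
enum outcome classify_asm_diff(int expected_lines, int actual_lines)
{
  assert(expected_lines > 0 && actual_lines >= 0);
  if (actual_lines == expected_lines)
    return OUTCOME_XFAIL;   /* known diff, unchanged in size */
  return OUTCOME_FAIL;      /* diff changed, for better or worse */
}
```

A line count is a coarse fingerprint: two different diffs of the same length would still compare equal, which matches the "hard to get something precise" caveat in the proposal.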
[PATCH] Add -mno-r11 option to suppress load of ppc64 static chain in indirect calls
This patch adds an option to not load the static chain (r11) for 64-bit PowerPC calls through function pointers (or virtual functions). Most of the languages on the PowerPC do not need the static chain loaded when a function is called, and adding this instruction can slow down code that calls very short functions. In addition, if the function does not call alloca, setjmp or deal with exceptions where the stack is modified, the compiler can move the store of the TOC value for the current function to the prologue of the function, rather than doing it at each call site. The effect of these patches is to speed up 464.h264ref in the Spec 2006 benchmark by about 7% if -mno-r11 is used, and 5% if it is not used (but the save of the TOC register is hoisted). I believe this is due to the load of the current function's TOC (r2) having to wait until the store queue is drained with the store just before the call. Unfortunately, I do see a 3% slowdown in 429.mcf, and I don't know what the cause is. I have bootstrapped the compiler and saw that there were no regressions in make check. Is it ok to install in the trunk? [gcc] 2011-07-06 Michael Meissner meiss...@linux.vnet.ibm.com * config/rs6000/rs6000-protos.h (rs6000_call_indirect_aix): New declaration. (rs6000_save_toc_in_prologue_p): Ditto. * config/rs6000/rs6000.opt (-mr11): New switch to disable loading up the static chain (r11) during indirect function calls. (-msave-toc-indirect): New undocumented debug switch. * config/rs6000/rs6000.c (struct machine_function): Add save_toc_in_prologue field to note whether the prologue needs to save the TOC value in the reserved stack location. (rs6000_emit_prologue): Use TOC_REGNUM instead of 2. If we need to save the TOC in the prologue, do so. (rs6000_trampoline_init): Don't allow creating AIX style trampolines if -mno-r11 is in effect.
(rs6000_call_indirect_aix): New function to create AIX style indirect calls, adding support for -mno-r11 to suppress loading the static chain, and saving the TOC in the prologue instead of the call body. (rs6000_save_toc_in_prologue_p): Return true if we are saving the TOC in the prologue. * config/rs6000/rs6000.md (STACK_POINTER_REGNUM): Add more fixed register numbers. (TOC_REGNUM): Ditto. (STATIC_CHAIN_REGNUM): Ditto. (ARG_POINTER_REGNUM): Ditto. (SFP_REGNO): Delete, unused. (TOC_SAVE_OFFSET_32BIT): Add constants for AIX TOC save and function descriptor offsets. (TOC_SAVE_OFFSET_64BIT): Ditto. (AIX_FUNC_DESC_TOC_32BIT): Ditto. (AIX_FUNC_DESC_TOC_64BIT): Ditto. (AIX_FUNC_DESC_SC_32BIT): Ditto. (AIX_FUNC_DESC_SC_64BIT): Ditto. (ptrload): New mode attribute for the appropriate load of a pointer. (call_indirect_aix32): Delete, rewrite AIX indirect function calls. (call_indirect_aix64): Ditto. (call_value_indirect_aix32): Ditto. (call_value_indirect_aix64): Ditto. (call_indirect_nonlocal_aix32_internal): Ditto. (call_indirect_nonlocal_aix32): Ditto. (call_indirect_nonlocal_aix64_internal): Ditto. (call_indirect_nonlocal_aix64): Ditto. (call): Rewrite AIX indirect function calls. Add support for eliminating the static chain, and for moving the save of the TOC to the function prologue. (call_value): Ditto. (call_indirect_aixptrsize): Ditto. (call_indirect_aixptrsize_internal): Ditto. (call_indirect_aixptrsize_internal2): Ditto. (call_indirect_aixptrsize_nor11): Ditto. (call_value_indirect_aixptrsize): Ditto. (call_value_indirect_aixptrsize_internal): Ditto. (call_value_indirect_aixptrsize_internal2): Ditto. (call_value_indirect_aixptrsize_nor11): Ditto. (call_nonlocal_aix32): Relocate in the rs6000.md file. (call_nonlocal_aix64): Ditto. * doc/invoke.texi (RS/6000 and PowerPC Options): Add -mr11 and -mno-r11 documentation. [gcc/testsuite] 2011-07-06 Michael Meissner meiss...@linux.vnet.ibm.com * gcc.target/powerpc/no-r11-1.c: New test for -mr11, -mno-r11. 
* gcc.target/powerpc/no-r11-2.c: Ditto. * gcc.target/powerpc/no-r11-3.c: Ditto. -- Michael Meissner, IBM 5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA meiss...@linux.vnet.ibm.com fax +1 (978) 399-6899 Index: gcc/config/rs6000/rs6000-protos.h === --- gcc/config/rs6000/rs6000-protos.h (revision 175921) +++ gcc/config/rs6000/rs6000-protos.h (working copy) @@ -171,6 +171,8 @@ extern unsigned int rs6000_dbx_register_ extern void rs6000_emit_epilogue (int); extern void rs6000_emit_eh_reg_restore (rtx, rtx); extern const char * output_isel (rtx *); +extern void
Re: [PATCH] [Annotalysis] Change to get_virtual_function_decl
On Wed, Jul 6, 2011 at 17:56, Delesley Hutchins deles...@google.com wrote: This patch modifies the behavior of cp_get_virtual_function_decl in gcc/cp/class.c so that it returns NULL if the function declaration cannot be found. The previous behavior was to fail with a segmentation fault. The method-not-found case may occur when Annotalysis uses the function to look up a method, in cases where the static type cannot be accurately determined. Bootstrapped and passed GCC regression testsuite on x86_64-unknown-linux-gnu. Okay for branches/annotalysis and google/main? -DeLesley 2011-07-06 DeLesley Hutchins deles...@google.com * cp_get_virtual_function_decl.c (handle_call_gs): Changes Blank line before first entry. The file name should be tree-threadsafe-analyze.c, right? function to return null if the method cannot be found. * thread_annot_lock-79.C: Additional annotalysis test cases This goes in testsuite/ChangeLog.* Index: gcc/cp/class.c === --- gcc/cp/class.c (revision 175718) +++ gcc/cp/class.c (working copy) @@ -8391,13 +8391,17 @@ cp_get_virtual_function_decl (tree ref, tree known HOST_WIDE_INT i = 0; tree v = BINFO_VIRTUALS (TYPE_BINFO (known_type)); tree fndecl; - - while (i != index) + + while (v && i != index) { i += (TARGET_VTABLE_USES_DESCRIPTORS ? TARGET_VTABLE_USES_DESCRIPTORS : 1); v = TREE_CHAIN (v); } + + /* Return null if the method is not found. */ s/null/NULL_TREE/ OK with those changes. Diego.
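The shape of the fix, walking a chain by index and stopping when the chain runs out instead of dereferencing past its end, can be sketched outside of GCC as follows. The vnode type and nth_entry function are hypothetical; stride plays the role of the per-entry step that the patch takes from TARGET_VTABLE_USES_DESCRIPTORS.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a chain of virtual function entries.  */
struct vnode {
  int fn_id;
  struct vnode *next;
};

/* Return the entry whose running index equals INDEX, advancing the
   index by STRIDE per entry, or NULL if the chain ends first.  */
const struct vnode *nth_entry(const struct vnode *v, long index, long stride)
{
  long i = 0;

  assert(stride > 0);
  while (v && i != index)   /* the added "v &&" guard prevents the crash */
    {
      i += stride;
      v = v->next;
    }
  return v;                 /* NULL when the method is not found */
}
```

Callers then have to test the result before using it, which is the behavioral change the patch description asks reviewers to accept.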
[testsuite] arm tests: remove -march= and dg-prune-output from 3 tests
For three tests in gcc.target/arm that don't depend on processor-specific behavior, don't specify the -march option. This makes dg-prune-output for warnings about conflicts unnecessary, so remove it. Two of these tests are for internal compiler errors that showed up with particular values of -march. I think it's fine to test them with normal multilibs, some of which will use those -march values, and others of which could trigger a closely-related ICE. If there's a desire to use specific options in a test, I'd prefer to see it done in a copy of the test that is skipped for all multilibs but the default. OK for trunk, and for 4.6 after a few days? 2011-07-06 Janis Johnson jani...@codesourcery.com * gcc.target/arm/pr41679.c: Remove -march options and unneeded dg-prune-output. * gcc.target/arm/pr46883.c: Likewise. * gcc.target/arm/xor-and.c: Likewise. Index: gcc.target/arm/pr41679.c === --- gcc.target/arm/pr41679.c(revision 175921) +++ gcc.target/arm/pr41679.c(working copy) @@ -1,6 +1,5 @@ /* { dg-do compile } */ -/* { dg-options -march=armv5te -g -O2 } */ -/* { dg-prune-output switch .* conflicts with } */ +/* { dg-options -g -O2 } */ extern int a; extern char b; Index: gcc.target/arm/pr46883.c === --- gcc.target/arm/pr46883.c(revision 175921) +++ gcc.target/arm/pr46883.c(working copy) @@ -1,6 +1,5 @@ /* { dg-do compile } */ -/* { dg-options -O1 -march=armv5te } */ -/* { dg-prune-output switch .* conflicts with } */ +/* { dg-options -O1 } */ void bar (unsigned char *q, unsigned short *data16s, int len) { Index: gcc.target/arm/xor-and.c === --- gcc.target/arm/xor-and.c(revision 175921) +++ gcc.target/arm/xor-and.c(working copy) @@ -1,6 +1,5 @@ /* { dg-do compile } */ -/* { dg-options -O -march=armv6 } */ -/* { dg-prune-output switch .* conflicts with } */ +/* { dg-options -O } */ unsigned short foo (unsigned short x) {
[testsuite] arm thumb tests: remove -march= and dg-prune-output from 9 tests
This patch removes -march= from nine tests that also check for relevant effective targets. If -march is removed there is no need to ignore compiler warnings about conflicting options with dg-prune-output, so the patch removes that from the tests. OK for trunk, and for 4.6 in a few days? 2011-07-06 Janis Johnson jani...@codesourcery.com * gcc.target/arm/pr39839.c: Remove -march option and unneeded dg-prune-output. * gcc.target/arm/pr40657-2.c: Likewise. * gcc.target/arm/pr40956.c: Likewise. * gcc.target/arm/pr42235.c: Likewise. * gcc.target/arm/pr42495.c: Likewise. * gcc.target/arm/pr42505.c: Likewise. * gcc.target/arm/pr42574.c: Likewise. * gcc.target/arm/pr46934.c: Likewise. * gcc.target/arm/thumb-branch1.c: Likewise. Index: gcc.target/arm/pr39839.c === --- gcc.target/arm/pr39839.c(revision 175921) +++ gcc.target/arm/pr39839.c(working copy) @@ -1,6 +1,5 @@ -/* { dg-options -mthumb -Os -march=armv5te -mthumb-interwork -fpic } */ +/* { dg-options -mthumb -Os -mthumb-interwork -fpic } */ /* { dg-require-effective-target arm_thumb1_ok } */ -/* { dg-prune-output switch .* conflicts with } */ /* { dg-final { scan-assembler-not str\[\\t \]*r.,\[\\t \]*.sp, } } */ struct S Index: gcc.target/arm/pr40657-2.c === --- gcc.target/arm/pr40657-2.c (revision 175921) +++ gcc.target/arm/pr40657-2.c (working copy) @@ -1,6 +1,5 @@ -/* { dg-options -Os -march=armv4t -mthumb } */ +/* { dg-options -Os -mthumb } */ /* { dg-require-effective-target arm_thumb1_ok } */ -/* { dg-prune-output switch .* conflicts with } */ /* { dg-final { scan-assembler-not sub\[\\t \]*sp,\[\\t \]*sp } } */ /* { dg-final { scan-assembler-not add\[\\t \]*sp,\[\\t \]*sp } } */ Index: gcc.target/arm/pr40956.c === --- gcc.target/arm/pr40956.c(revision 175921) +++ gcc.target/arm/pr40956.c(working copy) @@ -1,7 +1,6 @@ -/* { dg-options -mthumb -Os -fpic -march=armv5te } */ +/* { dg-options -mthumb -Os -fpic } */ /* { dg-require-effective-target arm_thumb1_ok } */ /* { dg-require-effective-target fpic } */ -/* { 
dg-prune-output switch .* conflicts with } */ /* Make sure the constant 0 is loaded into register only once. */ /* { dg-final { scan-assembler-times mov\[\\t \]*r., #0 1 } } */ Index: gcc.target/arm/pr42235.c === --- gcc.target/arm/pr42235.c(revision 175921) +++ gcc.target/arm/pr42235.c(working copy) @@ -1,6 +1,5 @@ -/* { dg-options -mthumb -O2 -march=armv5te } */ +/* { dg-options -mthumb -O2 } */ /* { dg-require-effective-target arm_thumb1_ok } */ -/* { dg-prune-output switch .* conflicts with } */ /* { dg-final { scan-assembler-not add\[\\t \]*r.,\[\\t \]*r.,\[\\t \]*\#1 } } */ /* { dg-final { scan-assembler-not add\[\\t \]*r.,\[\\t \]*\#1 } } */ Index: gcc.target/arm/pr42495.c === --- gcc.target/arm/pr42495.c(revision 175921) +++ gcc.target/arm/pr42495.c(working copy) @@ -1,7 +1,6 @@ -/* { dg-options -mthumb -Os -fpic -march=armv5te -fdump-rtl-hoist } */ +/* { dg-options -mthumb -Os -fpic -fdump-rtl-hoist } */ /* { dg-require-effective-target arm_thumb1_ok } */ /* { dg-require-effective-target fpic } */ -/* { dg-prune-output switch .* conflicts with } */ /* Make sure all calculations of gObj's address get hoisted to one location. 
*/ /* { dg-final { scan-rtl-dump PRE/HOIST: end of bb .* copying expression hoist } } */ Index: gcc.target/arm/pr42505.c === --- gcc.target/arm/pr42505.c(revision 175921) +++ gcc.target/arm/pr42505.c(working copy) @@ -1,6 +1,5 @@ -/* { dg-options -mthumb -Os -march=armv5te } */ +/* { dg-options -mthumb -Os } */ /* { dg-require-effective-target arm_thumb1_ok } */ -/* { dg-prune-output switch .* conflicts with } */ /* { dg-final { scan-assembler-not str\[\\t \]*r.,\[\\t \]*.sp, } } */ struct A { Index: gcc.target/arm/pr42574.c === --- gcc.target/arm/pr42574.c(revision 175921) +++ gcc.target/arm/pr42574.c(working copy) @@ -1,7 +1,6 @@ -/* { dg-options -mthumb -Os -fpic -march=armv5te } */ +/* { dg-options -mthumb -Os -fpic } */ /* { dg-require-effective-target arm_thumb1_ok } */ /* { dg-require-effective-target fpic } */ -/* { dg-prune-output switch .* conflicts with } */ /* Make sure the address of glob.c is calculated only once and using a logical shift for the offset (2001). */ /* { dg-final { scan-assembler-times lsl 1 } } */ Index: gcc.target/arm/pr46934.c === --- gcc.target/arm/pr46934.c(revision 175921) +++
Re: [pph] Add FIXME comment to avoid finalizing decls when generating pph image. (issue4626099)
On Wed, Jul 6, 2011 at 16:55, Gabriel Charette gch...@google.com wrote: +2011-07-06 Gabriel Charette gch...@google.com + + * passes.c (rest_of_decl_compilation): Add FIXME pph comment. + OK under the obvious rule (small patches like this one that make obvious fixes to documentation or formatting or code do not need explicit approval). Diego.
Re: [11/11] Fix get_mode_bounds
On 07/06/2011 04:04 PM, Bernd Schmidt wrote: That might require target specific changes if there are assumptions that a BImode value is either 0 or 1, not 0 or -1. For now I'd prefer to minimize the impact. Systems that set STORE_FLAG_VALUE to -1: m68k spu Systems that use BImode: bfin ia64 mep sh rs6000 stormy16 There's no overlap. That said, I'm willing to approve the patch as-is. Certainly testing the signed-ness of the tree type seems preferable to just the mode, which can't tell signedness. r~
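The 0/1 versus 0/-1 question comes down to whether a 1-bit value is read as unsigned or signed. A minimal sketch of the usual two's-complement bounds, assuming a hypothetical helper int_bounds rather than GCC's own get_mode_bounds, makes the difference concrete:

```c
#include <assert.h>
#include <stdint.h>

struct bounds {
  int64_t min;
  int64_t max;
};

/* Bounds of a BITS-wide two's-complement integer.  The width is
   limited to 62 bits so the shifts below cannot overflow int64_t.  */
struct bounds int_bounds(unsigned bits, int is_signed)
{
  struct bounds b;

  assert(bits >= 1 && bits <= 62);
  if (is_signed)
    {
      b.min = -((int64_t)1 << (bits - 1));
      b.max = ((int64_t)1 << (bits - 1)) - 1;
    }
  else
    {
      b.min = 0;
      b.max = ((int64_t)1 << bits) - 1;
    }
  return b;
}
```

With bits equal to 1 the signed range is -1..0 and the unsigned range is 0..1, which is why the signedness has to come from somewhere other than the mode.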
[PATCH] Update html docs for -mno-r11 and --param case-value-threshold
I updated the html documents for my two recent changes: *** changes.html.~1~ 2011-07-06 19:26:37.0 -0400 --- changes.html 2011-07-06 19:35:22.0 -0400 *** *** 48,54 <h2>General Optimizer Improvements</h2> <ul> ! <li>...</li> </ul> <h2>New Languages and Language specific improvements</h2> --- 48,57 <h2>General Optimizer Improvements</h2> <ul> ! <li>Support for a new parameter <code>--param case-value-threshold=n</code> ! was added to allow users to control the cutoff between doing switch statements ! as a series of if statements and using a jump table. ! </li> </ul> <h2>New Languages and Language specific improvements</h2> *** struct F: E { }; // error: deriving from *** 230,235 --- 233,246 instruction set. Previously the GCC compiler did not adhere to the ABI for 128-bit vectors with 64-bit integer base types (PR 48857). This will also be fixed in the GCC 4.6.1 and 4.5.4 releases.</li> + + <li>A new option (<code>-mno-r11)</code> was added to allow AIX +32-bit/64-bit and Linux 64-bit PowerPC users to specify that the compiler +should not load up the chain register (<i>r11</i>) before calling a +function through a pointer. If you use this option, you cannot call +nested functions through a pointer, or call other languages that might +use the static chain. + </li> </ul> <h3>MIPS</h3> -- Michael Meissner, IBM 5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA meiss...@linux.vnet.ibm.com fax +1 (978) 399-6899
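To make the -mno-r11 restriction concrete, here is a small C model of an indirect call through an AIX-style function descriptor (entry point, TOC pointer, static chain). It only simulates the registers with globals; func_desc, reg_r2, reg_r11, call_indirect and callee_uses_chain are hypothetical names, not anything from rs6000.c.

```c
#include <assert.h>
#include <stddef.h>

/* Model of a three-word function descriptor.  */
struct func_desc {
  long (*entry)(long);   /* code address */
  void *toc;             /* callee's TOC, goes in r2 */
  void *static_chain;    /* environment pointer, goes in r11 */
};

/* Simulated registers.  */
void *reg_r2;
void *reg_r11;

/* A callee that, like a nested function, reads its environment
   through the static chain register.  */
long callee_uses_chain(long x)
{
  return reg_r11 ? x + *(long *)reg_r11 : x;
}

/* Indirect call: save the caller's TOC, switch to the callee's,
   optionally load the static chain, call, then restore the TOC.
   load_r11 == 0 models -mno-r11.  */
long call_indirect(const struct func_desc *fd, long arg, int load_r11)
{
  void *saved_toc = reg_r2;
  long result;

  assert(fd != NULL && fd->entry != NULL);
  reg_r2 = fd->toc;
  if (load_r11)
    reg_r11 = fd->static_chain;
  result = fd->entry(arg);
  reg_r2 = saved_toc;
  return result;
}
```

When load_r11 is zero the callee never sees its environment, which is the reason nested functions cannot be called through a pointer once the option is in effect.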
Re: [1/11] Use targetm.shift_truncation_mask more consistently
On 07/06/11 20:06, Richard Sandiford wrote: Bernd Schmidt ber...@codesourcery.com writes: At some point we've grown a shift_truncation_mask hook, but we're not using it everywhere we're masking shift counts. This patch changes the instances I found. The documentation reads: Note that, unlike @code{SHIFT_COUNT_TRUNCATED}, this function does @emph{not} apply to general shift rtxes; it applies only to instructions that are generated by the named shift patterns. Ouch. That is one seriously misnamed hook then. I think you need to update the documentation, and check that existing target definitions do in fact apply to shift rtxes as well. Until I can do that, I've reverted this patch. Bernd
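As a reminder of what a shift truncation mask means in practice, the following sketch applies a mask to a shift count before shifting. It assumes the convention that a nonzero mask M means the count is treated as if ANDed with M and that zero means no truncation is promised; shl32 is a hypothetical helper, not a GCC function.

```c
#include <assert.h>
#include <stdint.h>

/* Shift X left by COUNT, truncating COUNT with MASK first when the
   target promises truncation (MASK != 0).  */
uint32_t shl32(uint32_t x, unsigned count, unsigned mask)
{
  if (mask != 0)
    count &= mask;
  /* Without a truncation promise the caller must pass an in-range
     count; shifting a 32-bit value by 32 or more is undefined in C.  */
  assert(count < 32);
  return x << count;
}
```

The distinction raised in the thread is about where such a promise holds: for instructions generated from the named shift patterns only, or for general shift rtxes as well.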
C++ PATCH for c++/49353 (emitting functions with DECL_EXTERNAL set)
The C++ front end sets DECL_EXTERNAL on functions and variables with vague linkage during most of compilation, and then clears the flag at EOF if we actually want to emit them. But we were failing to clear DECL_EXTERNAL in the case of inlines that we are emitting because of -fkeep-inline-functions. Tested x86_64-pc-linux-gnu, applying to trunk. commit 600157c6ee5b6425f47b24d03dceaa4b5ac06359 Author: Jason Merrill ja...@redhat.com Date: Wed Jul 6 18:01:40 2011 -0400 PR c++/49353 * semantics.c (expand_or_defer_fn_1): Clear DECL_EXTERNAL on kept inlines. diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c index 6fcf0da..5caeafe 100644 --- a/gcc/cp/semantics.c +++ b/gcc/cp/semantics.c @@ -3634,7 +3634,10 @@ expand_or_defer_fn_1 (tree fn) !DECL_REALLY_EXTERN (fn)) || (flag_keep_inline_dllexport lookup_attribute (dllexport, DECL_ATTRIBUTES (fn - mark_needed (fn); + { + mark_needed (fn); + DECL_EXTERNAL (fn) = 0; + } } /* There's no reason to do any of the work here if we're only doing
[Patch, Fortran] Add stat=/errmsg= support to _gfortran_caf_register
This patch cleans up the ABI mess I created at some point. The initial version of _gfortran_caf_register didn't handle stat/errmsg, as one could leave that to the front end: if the returned memory is NULL, it's an error. However, as Nick pointed out, for stat= one can also return STAT_STOPPED_IMAGE. In order to handle this, one needs an additional argument. That's what was done, albeit incompletely: the documentation was updated, cf. http://gcc.gnu.org/wiki/CoarrayLib#Registering_coarrays, as was the front end (cf. function declaration and call in trans-decl.c); however, the library itself (single.c and mpi.c) was not accepting the new arguments. The attached patch solves this: it updates the trans.c call just added by Daniel and implements the new arguments in the library. TODO: In trans.c (for the ALLOCATE statement), I currently pass NULL pointers for the stat and errmsg arguments. Hence, the ABI is fixed, but the error diagnostic is not yet standard-conforming. However, I think one can defer this to another patch. I added a note in my BUG file to make sure it won't get forgotten. Cf. http://users.physik.fu-berlin.de/~tburnus/coarray/BUGS.txt Built and regtested on x86-64-linux. OK for the trunk? (Daniel Carrera, I would be happy if you could also have a look at the patch.) Tobias 2011-07-06 Tobias Burnus bur...@net-b.de * trans.c (gfc_allocate_with_status): Call _gfortran_caf_register with NULL arguments for (new) stat=/errmsg= arguments. 2011-07-06 Tobias Burnus bur...@net-b.de * libcaf.h (__attribute__, unlikely, likely): New macros. (caf_register_t): Update comment. (_gfortran_caf_register): Add stat, errmsg, errmsg_len arguments. * single.c (_gfortran_caf_register): Ditto; add error diagnostics. * mpi.c (_gfortran_caf_register): Ditto. (caf_is_finalized): New global variable. (_gfortran_caf_finalize): Use it.
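The stat=/errmsg= convention described above can be illustrated with a small helper. This is a sketch under assumptions, not the code in single.c or mpi.c: report_failure is a hypothetical name, and the blank padding reflects the fact that Fortran character variables have a fixed length and no NUL terminator.

```c
#include <assert.h>
#include <string.h>

/* Report an error the way a stat=/errmsg= pair expects.
   Returns 1 if the error was handed back through STAT, or 0 if no
   stat= was given, in which case the caller should stop the program.
   ERRMSG is a Fortran-style buffer: fixed length, blank padded.  */
int report_failure(int *stat, char *errmsg, int errmsg_len,
                   int code, const char *msg)
{
  assert(msg != NULL);
  if (stat == NULL)
    return 0;

  *stat = code;
  if (errmsg != NULL && errmsg_len > 0)
    {
      size_t len = strlen(msg);
      if (len > (size_t)errmsg_len)
        len = (size_t)errmsg_len;          /* truncate to fit */
      memcpy(errmsg, msg, len);
      memset(errmsg + len, ' ', (size_t)errmsg_len - len);
    }
  return 1;
}
```

Passing NULL for stat, as the front end currently does for ALLOCATE, therefore means every failure is fatal, which is the remaining gap the TODO mentions.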
diff --git a/gcc/fortran/trans.c b/gcc/fortran/trans.c index 683e3f1..4043df2 100644 --- a/gcc/fortran/trans.c +++ b/gcc/fortran/trans.c @@ -622,13 +622,16 @@ gfc_allocate_with_status (stmtblock_t * block, tree size, tree status, gfc_add_modify (alloc_block, res, fold_convert (prvoid_type_node, build_call_expr_loc (input_location, - gfor_fndecl_caf_register, 3, + gfor_fndecl_caf_register, 6, fold_build2_loc (input_location, MAX_EXPR, size_type_node, size, build_int_cst (size_type_node, 1)), build_int_cst (integer_type_node, GFC_CAF_COARRAY_ALLOC), - null_pointer_node))); /* Token */ + null_pointer_node, /* token */ + null_pointer_node, /* stat */ + null_pointer_node, /* errmsg, errmsg_len */ + build_int_cst (integer_type_node, 0; } else { diff --git a/libgfortran/caf/libcaf.h b/libgfortran/caf/libcaf.h index 4177985..4fe09e4 100644 --- a/libgfortran/caf/libcaf.h +++ b/libgfortran/caf/libcaf.h @@ -30,6 +30,14 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see #include stdint.h /* For int32_t. */ #include stddef.h /* For ptrdiff_t. */ +#ifndef __GNUC__ +#define __attribute__(x) +#define likely(x) (x) +#define unlikely(x) (x) +#else +#define likely(x) __builtin_expect(!!(x), 1) +#define unlikely(x) __builtin_expect(!!(x), 0) +#endif /* Definitions of the Fortran 2008 standard; need to kept in sync with ISO_FORTRAN_ENV, cf. libgfortran.h. */ @@ -38,7 +46,8 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see #define STAT_LOCKED_OTHER_IMAGE 2 #define STAT_STOPPED_IMAGE 3 -/* Describes what type of array we are registerring. */ +/* Describes what type of array we are registerring. Keep in sync with + gcc/fortran/trans.h. 
*/ typedef enum caf_register_t { CAF_REGTYPE_COARRAY_STATIC, CAF_REGTYPE_COARRAY_ALLOC, @@ -58,7 +67,8 @@ caf_static_t; void _gfortran_caf_init (int *, char ***, int *, int *); void _gfortran_caf_finalize (void); -void * _gfortran_caf_register (ptrdiff_t, caf_register_t, void **); +void * _gfortran_caf_register (ptrdiff_t, caf_register_t, void **, int *, + char *, int); int _gfortran_caf_deregister (void **); diff --git a/libgfortran/caf/mpi.c b/libgfortran/caf/mpi.c index 83f39f6..2d4af6b 100644 --- a/libgfortran/caf/mpi.c +++ b/libgfortran/caf/mpi.c @@ -41,6 +41,7 @@ static void error_stop (int error) __attribute__ ((noreturn)); static int caf_mpi_initialized; static int caf_this_image; static int caf_num_images; +static int caf_is_finalized; caf_static_t *caf_static_list = NULL; @@ -87,14 +88,20 @@ _gfortran_caf_finalize (void) if (!caf_mpi_initialized) MPI_Finalize (); + + caf_is_finalized = 1; } void * -_gfortran_caf_register (ptrdiff_t size, caf_register_t type, -void **token) +_gfortran_caf_register (ptrdiff_t size, caf_register_t type, void **token, + int *stat, char *errmsg, int errmsg_len) { void *local; + int err; + + if (unlikely (caf_is_finalized)) +goto error;