[patch, committed] invoke.texi: copy-edit x86 sections
I noticed that the x86-specific material in invoke.texi had a lot of copy-editing problems. Besides fixing the usual grammar and punctuation mistakes, I cleaned up a number of problems in the Texinfo markup. Additionally, I corrected quite a large number of places where incorrect names were used for processors and chip manufacturers. For example, Pentium III is the correct name of that processor; Pentium3 is not. (Wikipedia is very useful for this sort of thing, with links to manufacturers' data sheets and/or photos of the processor showing the branding on it.)

I've checked this in as a supposedly content-free patch. I suggest, though, that the port maintainers look over this section (and not just my changes to it), as the reference to GCC 2.6.1 that I removed indicates to me that nobody has reviewed this material for quite a long time and some of it may be bit-rotten in other ways.

-Sandra

2012-03-04  Sandra Loosemore  <san...@codesourcery.com>

        gcc/
        * doc/invoke.texi (C++ Dialect Options): Minor copy-edits to
        x86-specific text.
        (Debugging Options): Likewise.
        (Optimize Options): Likewise.
        (i386 and x86-64 Options): Discuss -march before -mtune,
        consistently with other architectures.  Use official processor
        names with correct spelling/capitalization.  Fix formatting and
        grammar issues.
        (i386 and x86-64 Windows Options): Similar cleanup here.

Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi (revision 184878)
+++ gcc/doc/invoke.texi (working copy)
@@ -2376,14 +2376,14 @@ Instantiations of these templates may be
 @end itemize

-It also warns psABI related changes.  The known psABI changes at this
+It also warns about psABI-related changes.  The known psABI changes at this
 point include:

 @itemize @bullet

 @item
-For SYSV/x86-64, when passing union with long double, it is changed to
-pass in memory as specified in psABI.  For example:
+For SysV/x86-64, unions with @code{long double} members are
+passed in memory as specified in psABI.  For example:

 @smallexample
 union U @{
@@ -2393,7 +2393,7 @@ union U @{
 @end smallexample

 @noindent
-@code{union U} will always be passed in memory.
+@code{union U} is always passed in memory.

 @end itemize

@@ -5484,7 +5484,7 @@ architectures.

 @item -fdump-rtl-stack
 @opindex fdump-rtl-stack
-Dump after conversion from GCC's "flat register file" registers to the
+Dump after conversion from GCC's ``flat register file'' registers to the
 x87's stack-like registers.  This pass is only run on x86 variants.

 @item -fdump-rtl-subreg1
@@ -6333,7 +6333,7 @@ whether a target machine supports this f
 Usage, gccint, GNU Compiler Collection (GCC) Internals}.

 Starting with GCC version 4.6, the default setting (when not optimizing for
-size) for 32-bit Linux x86 and 32-bit Darwin x86 targets has been changed to
+size) for 32-bit GNU/Linux x86 and 32-bit Darwin x86 targets has been changed to
 @option{-fomit-frame-pointer}.  The default can be reverted to
 @option{-fno-omit-frame-pointer} by configuring GCC with the
 @option{--enable-frame-pointer} configure option.
@@ -6740,7 +6740,7 @@ Enabled at levels @option{-O2}, @option{
 @item -free
 @opindex free
 Attempt to remove redundant extension instructions.  This is especially
-helpful for the x86-64 architecture which implicitly zero-extends in 64-bit
+helpful for the x86-64 architecture, which implicitly zero-extends in 64-bit
 registers after writing to their lower 32-bit half.

 Enabled for x86 at levels @option{-O2}, @option{-O3}.
@@ -12977,102 +12977,134 @@ These @samp{-m} options are defined for
 computers:

 @table @gcctabopt
-@item -mtune=@var{cpu-type}
-@opindex mtune
-Tune to @var{cpu-type} everything applicable about the generated code, except
-for the ABI and the set of available instructions.  The choices for
-@var{cpu-type} are:
-@table @emph
-@item generic
-Produce code optimized for the most common IA32/@/AMD64/@/EM64T processors.
-If you know the CPU on which your code will run, then you should use
-the corresponding @option{-mtune} option instead of
-@option{-mtune=generic}.  But, if you do not know exactly what CPU users
-of your application will have, then you should use this option.
-As new processors are deployed in the marketplace, the behavior of this
-option will change.  Therefore, if you upgrade to a newer version of
-GCC, the code generated option will change to reflect the processors
-that were most common when that version of GCC was released.
+@item -march=@var{cpu-type}
+@opindex march
+Generate instructions for the machine type @var{cpu-type}.  In contrast to
+@option{-mtune=@var{cpu-type}}, which merely tunes the generated code
+for the specified @var{cpu-type}, @option{-march=@var{cpu-type}} allows GCC
+to generate code that may not run at all on processors other than the one
+indicated.  Specifying
fix libstdc++/52433
        PR libstdc++/52433
        * include/debug/safe_iterator.h (_Safe_iterator): Add move
        constructor and move assignment operator.
        * testsuite/23_containers/vector/debug/52433.cc: New.

Tested 'make check check-debug' on x86_64 and committed to trunk.

I plan to fix this for 4.7.1 and 4.6.4 as well.

diff --git a/libstdc++-v3/include/debug/safe_iterator.h b/libstdc++-v3/include/debug/safe_iterator.h
index e7cfe9c..65dff55 100644
--- a/libstdc++-v3/include/debug/safe_iterator.h
+++ b/libstdc++-v3/include/debug/safe_iterator.h
@@ -169,6 +169,19 @@ namespace __gnu_debug
                              ._M_iterator(__x, "other"));
       }

+#ifdef __GXX_EXPERIMENTAL_CXX0X__
+      /**
+       * @brief Move construction.
+       * @post __x is singular and unattached
+       */
+      _Safe_iterator(_Safe_iterator&& __x) : _M_current()
+      {
+        std::swap(_M_current, __x._M_current);
+        this->_M_attach(__x._M_sequence);
+        __x._M_detach();
+      }
+#endif
+
       /**
        * @brief Converting constructor from a mutable iterator to a
        * constant iterator.
@@ -208,6 +221,22 @@ namespace __gnu_debug
         return *this;
       }

+#ifdef __GXX_EXPERIMENTAL_CXX0X__
+      /**
+       * @brief Move assignment.
+       * @post __x is singular and unattached
+       */
+      _Safe_iterator&
+      operator=(_Safe_iterator&& __x)
+      {
+        _M_current = __x._M_current;
+        _M_attach(__x._M_sequence);
+        __x._M_detach();
+        __x._M_current = _Iterator();
+        return *this;
+      }
+#endif
+
       /**
        * @brief Iterator dereference.
        * @pre iterator is dereferenceable
@@ -422,7 +451,9 @@ namespace __gnu_debug
       /// Is this iterator equal to the sequence's before_begin() iterator if
       /// any?
       bool _M_is_before_begin() const
-      { return _BeforeBeginHelper<_Sequence>::_M_Is(base(), _M_get_sequence()); }
+      {
+        return _BeforeBeginHelper<_Sequence>::_M_Is(base(), _M_get_sequence());
+      }
     };

   template<typename _IteratorL, typename _IteratorR, typename _Sequence>
diff --git a/libstdc++-v3/testsuite/23_containers/vector/debug/52433.cc b/libstdc++-v3/testsuite/23_containers/vector/debug/52433.cc
new file mode 100644
index 000..f1f5917
--- /dev/null
+++ b/libstdc++-v3/testsuite/23_containers/vector/debug/52433.cc
@@ -0,0 +1,43 @@
+// Copyright (C) 2012 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+//
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+//
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+//
+// { dg-require-debug-mode "" }
+// { dg-options "-std=gnu++0x" }
+// { dg-do compile }
+
+// PR libstdc++/52433
+
+#include <vector>
+
+struct X
+{
+    std::vector<int>::iterator i;
+
+    X() = default;
+    X(const X&) = default;
+    X(X&&) = default;
+    X& operator=(const X&) = default;
+    X& operator=(X&&) = default;
+};
+
+X test01()
+{
+    X x;
+    x = X();
+    return x;
+}
+
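As a rough illustration of the postcondition the patch documents ("__x is singular and unattached"), here is a minimal sketch. It is not the libstdc++ implementation: `Sequence`, `TrackedIter`, `attach`, and `detach` are hypothetical stand-ins for the debug-mode machinery, and the self-assignment check is an addition of this sketch. The only point shown is that a move transfers the position, attaches the destination to the same sequence, and leaves the source detached with a null position.

```cpp
#include <cstddef>
#include <utility>

// Hypothetical stand-in for a debug-mode container that counts attached iterators.
struct Sequence {
    std::size_t attached = 0;
};

// Hypothetical stand-in for a "safe" iterator that registers itself with its sequence.
class TrackedIter {
public:
    TrackedIter() = default;
    TrackedIter(Sequence* seq, int* pos) : current_(pos) { attach(seq); }

    // Move construction: take the position, attach to the same sequence,
    // then leave the source singular (null position) and unattached.
    TrackedIter(TrackedIter&& other) noexcept : current_(nullptr) {
        std::swap(current_, other.current_);
        attach(other.sequence_);
        other.detach();
    }

    // Move assignment with the same postcondition on the source.
    TrackedIter& operator=(TrackedIter&& other) noexcept {
        if (this != &other) {
            detach();
            current_ = other.current_;
            attach(other.sequence_);
            other.detach();
            other.current_ = nullptr;
        }
        return *this;
    }

    ~TrackedIter() { detach(); }

    bool attached() const { return sequence_ != nullptr; }
    bool singular() const { return current_ == nullptr; }
    int* get() const { return current_; }

private:
    void attach(Sequence* seq) {
        sequence_ = seq;
        if (sequence_) ++sequence_->attached;
    }
    void detach() {
        if (sequence_) --sequence_->attached;
        sequence_ = nullptr;
    }

    int* current_ = nullptr;
    Sequence* sequence_ = nullptr;
};
```

After `TrackedIter b(std::move(a));` the sequence still counts exactly one attached iterator, which is the bookkeeping a debug-mode container relies on.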
Re: [fortran, patch] Fix display of locus when source contains wide characters
On Sat, Mar 3, 2012 at 4:08 PM, FX <fxcoud...@gmail.com> wrote:
> The attached patch fixes PR 36160 (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36160). It should correctly account for wide characters when displaying error loci. I'm not sure if we can check that in the testsuite harness, but you can manually see it at work on the attached test.f90.
>
> Bootstrapped and regtested on x86_64-apple-darwin11, OK for trunk?
>
> FX

Looks OK to me except for:

-      for (; i > 0; i--)
+      for (; i > 0;)

Might as well just make that a while loop.

Ciao!
Steven
Re: [fortran, patch] Fix display of locus when source contains wide characters
> Looks OK to me except for:
>
> -      for (; i > 0; i--)
> +      for (; i > 0;)
>
> Might as well just make that a while loop.

Indeed! Committed with a while loop, thanks for the review!

FX
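For readers who have not seen the patch, the review point can be sketched as follows. This is not the code from the Fortran front end: the helper names and the width-counting logic are hypothetical, and the sketch assumes the loop condition in the quoted hunk is `i > 0`. It only shows that a `for` with an empty increment slot, whose counter is adjusted inside the body, reads more naturally as a `while`.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: how many characters fit in `columns` display columns,
// given each character's display width. The counter is decremented by a
// variable amount inside the body, which is why the increment slot is empty.
std::size_t chars_that_fit_for(const std::vector<int>& widths, int columns) {
    std::size_t n = 0;
    int i = columns;
    for (; i > 0;) {
        if (n == widths.size() || widths[n] > i) break;
        i -= widths[n];
        ++n;
    }
    return n;
}

// Same logic written as the while loop the review asks for.
std::size_t chars_that_fit_while(const std::vector<int>& widths, int columns) {
    std::size_t n = 0;
    int i = columns;
    while (i > 0) {
        if (n == widths.size() || widths[n] > i) break;
        i -= widths[n];
        ++n;
    }
    return n;
}
```

Both functions compute the same result for any input; the second simply states the intent directly.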
[Patch,AVR] PR52461: Fix RAMPZ clobbering and RAMP* in epilogue
This patch fixes several issues with RAMP registers:

* On devices with more than 64 KiB of RAM, RAMPZ is used as the high byte of the RAM address. If RAMPZ is used to read flash, it must be reset to 0 after the read so that subsequent RAM reads operate correctly. There is no support for RAM > 64 KiB, so RAMPZ = 0 is in order.

* The ISR epilogue restored RAMP* registers in the wrong order.

* As RAMPZ is used both in ELPM and LD/LDD on some xmega cores, the right condition for setting RAMPZ prior to ELPM is "have ELPM", not "have RAMPZ".

* Never read unintentionally from RAM, because a flash address interpreted as a RAM address might point to the I/O area.

Ok for trunk and 4.7?

Johann

libgcc/
        PR target/52461
        * config/avr/lib1funcs.S (__do_copy_data): Clear RAMPZ after usage
        if RAMPZ affects reading from RAM.
        (__tablejump_elpm__): Ditto.
        (.xload): Ditto.
        (__movmemx_hi): Ditto.
        (__do_global_ctors): Right condition for RAMPZ usage is "have ELPM".
        (__do_global_dtors): Ditto.
        (__xload_1, __xload_2, __xload_3, __xload_4): Ditto.  And make weak.
        (__movmemx_hi): Ditto.  And fix RAM-loop label.
        (__xload_1): Never read unintentionally from RAM.

gcc/
        PR target/52461
        * gcc/config/avr/avr.c (expand_prologue): Depend save/restore of
        RAMPZ on HAVE_RAMPD, not HAVE_RAMPZ.
        (expand_epilogue): Ditto.  And fix order of restoration to: RAMPZ,
        RAMPY, RAMPX, RAMPD.
        (avr_xload_libgcc_p): Always load __memx by libgcc call on big-RAM
        devices.
        (avr_out_lpm): Clear RAMPZ after usage if RAMPZ affects reading
        from RAM.
        (avr_out_xload): Never read unintentionally from RAM.
        * config/avr/avr.md (xload_8): Adjust insn length.
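The restore-order fix can be illustrated with a small stack model. This is a sketch under an assumption: the prologue is taken to push RAMPD, RAMPX, RAMPY, RAMPZ in that order, which is inferred from the restore order named in the ChangeLog (RAMPZ, RAMPY, RAMPX, RAMPD) and not confirmed by the text. `Machine` and `save_and_restore` are hypothetical names.

```cpp
#include <array>
#include <cstdint>
#include <initializer_list>
#include <vector>

enum Reg { RAMPD, RAMPX, RAMPY, RAMPZ, NUM_REGS };

// Hypothetical model of four registers and a stack of saved bytes.
struct Machine {
    std::array<std::uint8_t, NUM_REGS> regs{};
    std::vector<std::uint8_t> stack;
    void push(Reg r) { stack.push_back(regs[r]); }
    void pop(Reg r) { regs[r] = stack.back(); stack.pop_back(); }
};

// Save in the assumed prologue order, clobber everything, then restore
// in the order supplied by the caller and return the final register values.
std::array<std::uint8_t, NUM_REGS>
save_and_restore(const std::array<Reg, NUM_REGS>& restore_order) {
    Machine m;
    m.regs = {1, 2, 3, 4};                       // RAMPD=1, RAMPX=2, RAMPY=3, RAMPZ=4
    for (Reg r : {RAMPD, RAMPX, RAMPY, RAMPZ})   // assumed push order
        m.push(r);
    m.regs.fill(0);                              // the ISR body clobbers them
    for (Reg r : restore_order)
        m.pop(r);
    return m.regs;
}
```

Restoring in the reverse of the push order returns every register to its saved value; restoring in the same order as the pushes leaves RAMPD holding RAMPZ's value and vice versa.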
Index: libgcc/config/avr/lib1funcs.S === --- libgcc/config/avr/lib1funcs.S (revision 184887) +++ libgcc/config/avr/lib1funcs.S (working copy) @@ -1853,7 +1853,7 @@ DEFUN __do_copy_data cpi r26, lo8(__data_end) cpc r27, r17 brne .L__do_copy_data_loop -#elif !defined(__AVR_HAVE_ELPMX__) defined(__AVR_HAVE_ELPM__) +#elif defined(__AVR_HAVE_ELPM__) ldi r17, hi8(__data_end) ldi r26, lo8(__data_start) ldi r27, hi8(__data_start) @@ -1873,7 +1873,7 @@ DEFUN __do_copy_data cpi r26, lo8(__data_end) cpc r27, r17 brne .L__do_copy_data_loop -#elif !defined(__AVR_HAVE_ELPMX__) !defined(__AVR_HAVE_ELPM__) +#else /* !ELPM */ ldi r17, hi8(__data_end) ldi r26, lo8(__data_start) ldi r27, hi8(__data_start) @@ -1892,7 +1892,11 @@ DEFUN __do_copy_data cpi r26, lo8(__data_end) cpc r27, r17 brne .L__do_copy_data_loop -#endif /* !defined(__AVR_HAVE_ELPMX__) !defined(__AVR_HAVE_ELPM__) */ +#endif /* ELPMX / ELPM / LPM */ +#if defined (__AVR_HAVE_ELPM__) defined (__AVR_HAVE_RAMPD__) + ;; Reset RAMPZ to 0 so that EBI devices don't read garbage from RAM + out __RAMPZ__, __zero_reg__ +#endif /* ELPM RAMPD */ ENDF __do_copy_data #endif /* L_copy_data */ @@ -1920,7 +1924,7 @@ ENDF __do_clear_bss #ifdef L_ctors .section .init6,ax,@progbits DEFUN __do_global_ctors -#if defined(__AVR_HAVE_RAMPZ__) +#if defined(__AVR_HAVE_ELPM__) ldi r17, hi8(__ctors_start) ldi r28, lo8(__ctors_end) ldi r29, hi8(__ctors_end) @@ -1939,7 +1943,7 @@ DEFUN __do_global_ctors ldi r24, hh8(__ctors_start) cpc r16, r24 brne .L__do_global_ctors_loop -#else +#else /* !ELPM */ ldi r17, hi8(__ctors_start) ldi r28, lo8(__ctors_end) ldi r29, hi8(__ctors_end) @@ -1953,14 +1957,14 @@ DEFUN __do_global_ctors cpi r28, lo8(__ctors_start) cpc r29, r17 brne .L__do_global_ctors_loop -#endif /* defined(__AVR_HAVE_RAMPZ__) */ +#endif /* defined(__AVR_HAVE_ELPM__) */ ENDF __do_global_ctors #endif /* L_ctors */ #ifdef L_dtors .section .fini6,ax,@progbits DEFUN __do_global_dtors -#if defined(__AVR_HAVE_RAMPZ__) +#if defined(__AVR_HAVE_ELPM__) 
ldi r17, hi8(__dtors_end) ldi r28, lo8(__dtors_start) ldi r29, hi8(__dtors_start) @@ -1979,7 +1983,7 @@ DEFUN __do_global_dtors ldi r24, hh8(__dtors_end) cpc r16, r24 brne .L__do_global_dtors_loop -#else +#else /* !ELPM */ ldi r17, hi8(__dtors_end) ldi r28, lo8(__dtors_start) ldi r29, hi8(__dtors_start) @@ -1993,7 +1997,7 @@ DEFUN __do_global_dtors cpi r28, lo8(__dtors_end) cpc r29, r17 brne .L__do_global_dtors_loop -#endif /* defined(__AVR_HAVE_RAMPZ__) */ +#endif /* defined(__AVR_HAVE_ELPM__) */ ENDF __do_global_dtors #endif /* L_dtors */ @@ -2001,18 +2005,21 @@ ENDF __do_global_dtors #ifdef L_tablejump_elpm DEFUN __tablejump_elpm__ -#if defined (__AVR_HAVE_ELPM__) -#if defined (__AVR_HAVE_LPMX__) +#if defined (__AVR_HAVE_ELPMX__) elpm __tmp_reg__, Z+ elpm r31, Z mov r30, __tmp_reg__ +#if defined (__AVR_HAVE_RAMPD__) + ;; Reset RAMPZ to 0 so that EBI devices don't read garbage from RAM + out __RAMPZ__, __zero_reg__ +#endif /* RAMPD */ #if defined (__AVR_HAVE_EIJMP_EICALL__) eijmp #else ijmp -#endif
[Patch,AVR]: Tweak a+2*b
This patch adds a straightforward combine pattern and split for int + 2*byte, as frequently seen with accesses to int arrays with a byte offset.

Ok for trunk?

Johann

        * config/avr/avr.md (*umaddqihi4.2): New insn-and-split.

Index: config/avr/avr.md
===
--- config/avr/avr.md (revision 184887)
+++ config/avr/avr.md (working copy)
@@ -1692,6 +1692,30 @@ (define_insn *any_extend:extend_suan

 ;; Handle small constants

+;; Special case of a += 2*b as frequently seen with accesses to int arrays.
+;; This is shorter, faster than MUL and has lower register pressure.
+
+(define_insn_and_split "*umaddqihi4.2"
+  [(set (match_operand:HI 0 "register_operand"                                  "=r")
+        (plus:HI (mult:HI (zero_extend:HI (match_operand:QI 1 "register_operand" "r"))
+                          (const_int 2))
+                 (match_operand:HI 2 "register_operand"                          "r")))]
+  "AVR_HAVE_MUL
+   && !reload_completed
+   && !reg_overlap_mentioned_p (operands[0], operands[1])"
+  { gcc_unreachable(); }
+  "&& 1"
+  [(set (match_dup 0)
+        (match_dup 2))
+   ; *addhi3_zero_extend
+   (set (match_dup 0)
+        (plus:HI (zero_extend:HI (match_dup 1))
+                 (match_dup 0)))
+   ; *addhi3_zero_extend
+   (set (match_dup 0)
+        (plus:HI (zero_extend:HI (match_dup 1))
+                 (match_dup 0)))])
+
 ;; umaddqihi4.uconst
 ;; maddqihi4.sconst
 (define_insn_and_split "*<extend_u>maddqihi4.<extend_su>const"
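The arithmetic behind the split can be checked in isolation. The sketch below is not AVR code; the function names are hypothetical, and it only models 16-bit wrapping arithmetic with a zero-extended byte operand to show that `a + 2*b` equals a copy followed by two additions of `b`.

```cpp
#include <cstdint>

// Reference form: widen the byte, multiply by two, add (mod 2^16).
std::uint16_t madd_with_mul(std::uint16_t a, std::uint8_t b) {
    return static_cast<std::uint16_t>(a + 2u * b);
}

// Split form mirroring the three sets in the pattern: copy operand 2 into
// the destination, then add the zero-extended byte twice.
std::uint16_t madd_with_two_adds(std::uint16_t a, std::uint8_t b) {
    std::uint16_t r = a;                       // (set (match_dup 0) (match_dup 2))
    r = static_cast<std::uint16_t>(r + b);     // first *addhi3_zero_extend
    r = static_cast<std::uint16_t>(r + b);     // second *addhi3_zero_extend
    return r;
}
```

Because the destination is written before the byte operand is read, the byte must not live in the destination register, which is presumably why the pattern's condition includes `!reg_overlap_mentioned_p (operands[0], operands[1])`.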
[PATCH, i386]: Improve zero_extend patterns
Hello!

Attached patch improves zero_extend patterns by:

- removing the flags reg clobber from zero_extendsidi patterns for 32-bit targets. Everything, including "movl $0, mem", can be split without using a flags reg clobber.
- removing intermediate *zero_extend*2_movzbl_and patterns. We do not need to remove any fake clobbers in the !TARGET_ZERO_EXTEND_WITH_AND case anymore.
- adding o,0 and x,x register alternatives. We can split matching memory to load 0 in the highpart for 64-bit and 32-bit targets, and movd zero-extends in the xmm->xmm case as well.
- truly splitting "and" RTXes to zero_extend RTXes when appropriate (but only in the !TARGET_ZERO_EXTEND_WITH_AND case), again removing unneeded flags reg clobbers.
- fixing the TARGET_ZERO_EXTEND_WITH_AND peephole2.

2012-03-04  Uros Bizjak  <ubiz...@gmail.com>

        * config/i386/constraints.md (Ya): New internal constraint.
        * config/i386/i386.md (zero_extendsidi2): Remove expansion.
        (*zero_extendsidi2_rex64): Add x,x alternative.
        (*zero_extendsidi2): Ditto.  Add o,0 alternative.  Remove flags
        reg clobber.  Adjust corresponding splits.
        (zero_extend<mode>si2): Macroize expander from zero_extendhisi2
        and zero_extendqisi2 expanders using SWI12 mode iterator.
        (zero_extend<mode>si2_and): Macroize insn from zero_extendhisi2_and
        and zero_extendqisi2_and.  Merge corresponding splitters.
        (*zero_extend<mode>si2): Macroize insn from *zero_extendhisi2_movzbl
        and *zero_extendqisi2_movzbl.
        (*zero_extend*2_movzbl_and): Remove insn patterns.
        (zero_extendqihi2_and): Merge corresponding splitter.
        (*zero_extendqihi2): Rename from *zero_extendqihi2_movzbl.
        (*zero_extend*2_movzbl_and): Remove insn patterns.
        (*anddi_1): Split TYPE_IMOVX instructions.
        (*andsi_1): Use Ya for alternative 2.  Split TYPE_IMOVX instructions.
        (*andhi_1): Ditto.
        (and-zext splitter): Add splitter pattern.
        (zero extend with andsi3 splitter): Adjust zero_extend pattern.

Patch was tested on x86_64-pc-linux-gnu {,-m32} and committed to mainline SVN.

Uros.
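The equivalence that lets an "and" with a mask be treated as a zero extension can be shown with plain integer code. This is only a value-level sketch with hypothetical function names; it does not model the flags register, which is the practical difference on x86 (AND writes the flags, while MOVZX and a plain 32-bit MOV do not).

```cpp
#include <cstdint>

// Zero extension of a 32-bit value: the upper 32 bits of the result are zero.
std::uint64_t zext_si_to_di(std::uint32_t x) { return x; }

// Zero extension of the low byte written as a widening conversion...
std::uint32_t zext_qi_movz(std::uint32_t x) { return static_cast<std::uint8_t>(x); }

// ...and as a mask. Both produce the same value.
std::uint32_t zext_qi_and(std::uint32_t x) { return x & 0xffu; }

// The same mask form for the low 16 bits.
std::uint32_t zext_hi_and(std::uint32_t x) { return x & 0xffffu; }
```

Since the two byte forms agree for every input, a splitter is free to pick whichever form suits the target's tuning, as the patch description outlines.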
Index: config/i386/constraints.md === --- config/i386/constraints.md (revision 184886) +++ config/i386/constraints.md (working copy) @@ -89,6 +89,7 @@ ;; z First SSE register. ;; i SSE2 inter-unit moves enabled ;; m MMX inter-unit moves enabled +;; a Integer register when zero extensions with AND are disabled ;; p Integer register when TARGET_PARTIAL_REG_STALL is disabled ;; d Integer register when integer DFmode moves are enabled ;; x Integer register when integer XFmode moves are enabled @@ -108,6 +109,11 @@ TARGET_PARTIAL_REG_STALL ? NO_REGS : GENERAL_REGS @internal Any integer register when TARGET_PARTIAL_REG_STALL is disabled.) +(define_register_constraint Ya + TARGET_ZERO_EXTEND_WITH_AND optimize_function_for_speed_p (cfun) + ? NO_REGS : GENERAL_REGS + @internal Any integer register when zero extensions with AND are disabled.) + (define_register_constraint Yd (TARGET_64BIT || (TARGET_INTEGER_DFMODE_MOVES optimize_function_for_speed_p (cfun))) Index: config/i386/i386.md === --- config/i386/i386.md (revision 184886) +++ config/i386/i386.md (working copy) @@ -3371,20 +3371,14 @@ (define_expand zero_extendsidi2 [(set (match_operand:DI 0 nonimmediate_operand ) - (zero_extend:DI (match_operand:SI 1 nonimmediate_operand )))] - -{ - if (!TARGET_64BIT) -{ - emit_insn (gen_zero_extendsidi2_1 (operands[0], operands[1])); - DONE; -} -}) + (zero_extend:DI (match_operand:SI 1 nonimmediate_operand )))]) (define_insn *zero_extendsidi2_rex64 - [(set (match_operand:DI 0 nonimmediate_operand =r,o,?*Ym,?*y,?*Yi,*x) + [(set (match_operand:DI 0 nonimmediate_operand + =r,o,?*Ym,?*y,?*Yi,!*x) (zero_extend:DI -(match_operand:SI 1 nonimmediate_operand rm,0,r ,m ,r ,m)))] +(match_operand:SI 1 nonimmediate_operand + rm,0,r ,m ,r ,m*x)))] TARGET_64BIT @ mov{l}\t{%1, %k0|%k0, %1} @@ -3393,24 +3387,17 @@ movd\t{%1, %0|%0, %1} %vmovd\t{%1, %0|%0, %1} %vmovd\t{%1, %0|%0, %1} - [(set_attr type imovx,imov,mmxmov,mmxmov,ssemov,ssemov) + [(set_attr isa *,*,*,*,*,sse2) + (set_attr type 
imovx,multi,mmxmov,mmxmov,ssemov,ssemov) (set_attr prefix orig,*,orig,orig,maybe_vex,maybe_vex) (set_attr prefix_0f 0,*,*,*,*,*) - (set_attr mode SI,DI,DI,DI,TI,TI)]) + (set_attr mode SI,SI,DI,DI,TI,TI)]) -(define_split - [(set (match_operand:DI 0 memory_operand ) - (zero_extend:DI (match_dup 0)))] - TARGET_64BIT - [(set (match_dup 4) (const_int 0))] - split_double_mode (DImode, operands[0], 1, operands[3], operands[4]);) - -;; %%% Kill me once multi-word ops are sane. -(define_insn zero_extendsidi2_1 - [(set (match_operand:DI 0 nonimmediate_operand
Re: [4.7][SH] Binary compatibility with atomic_test_and_test_trueval != 1
On Sat, 2012-03-03 at 10:31 -0800, Richard Henderson wrote:
> On 03/02/2012 10:11 AM, Richard Henderson wrote:
>> I'm in the process of sanity testing this on x86_64 with trueval set to 0x80.
>> Jakub, ok for 4.7 branch if it passes?
>>
>>         * optabs.c (expand_atomic_test_and_set): Honor
>>         atomic_test_and_set_trueval even when atomic_test_and_set optab
>>         is not in use.
>
> I've committed this patch to mainline. I still think it ought to go onto the 4.7 branch...

Attached is a slightly modified version of the patch from
http://gcc.gnu.org/ml/gcc-patches/2012-03/msg00085.html

I have removed the signed char weirdo and adjusted the comment above TARGET_ATOMIC_TEST_AND_SET_TRUEVAL accordingly.

Tested by compiling some test functions that use __atomic_test_and_set / __GCC_ATOMIC_TEST_AND_SET_TRUEVAL with various SH atomic option combinations and looking at the output asm.

OK to apply to trunk?

Richard, could you also please take the TARGET_ATOMIC_TEST_AND_SET_TRUEVAL hunk from this patch for the 4.7 branch?

Cheers,
Oleg

2012-03-04  Oleg Endo  <olege...@gcc.gnu.org>

        * config/sh/sh.h (TARGET_ATOMIC_TEST_AND_SET_TRUEVAL): New hook.
        * config/sh/sync.md (atomic_test_and_set): New expander.
        (tasb, atomic_test_and_set_soft): New insns.
        * config/sh/sh.opt (menable-tas): New option.
        * doc/invoke.texi (SH Options): Document it.

Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi (revision 184877)
+++ gcc/doc/invoke.texi (working copy)
@@ -887,7 +887,8 @@
 -mdivsi3_libfunc=@var{name} -mfixed-range=@var{register-range} @gol
 -madjust-unroll -mindexed-addressing -mgettrcost=@var{number} -mpt-fixed @gol
 -maccumulate-outgoing-args -minvalid-symbols -msoft-atomic @gol
--mbranch-cost=@var{num} -mcbranchdi -mcmpeqdi -mfused-madd -mpretend-cmove}
+-mbranch-cost=@var{num} -mcbranchdi -mcmpeqdi -mfused-madd -mpretend-cmove @gol
+-menable-tas}

 @emph{Solaris 2 Options}
 @gccoptlist{-mimpure-text -mno-impure-text @gol
@@ -17823,6 +17824,15 @@
 This option is enabled by default when the target is @code{sh-*-linux*}.
For details on the atomic built-in functions see @ref{__atomic Builtins}. +@item -menable-tas +@opindex menable-tas +Generate the @code{tas.b} opcode for @code{__atomic_test_and_set}. +Notice that depending on the particular hardware and software configuration +this can degrade overall performance due to the operand cache line flushes +that are implied by the @code{tas.b} instruction. On multi-core SH4A +processors the @code{tas.b} instruction must be used with caution since it +can result in data corruption for certain cache configurations. + @item -mspace @opindex mspace Optimize for space instead of speed. Implied by @option{-Os}. Index: gcc/config/sh/sh.h === --- gcc/config/sh/sh.h (revision 184877) +++ gcc/config/sh/sh.h (working copy) @@ -2473,4 +2473,10 @@ /* FIXME: middle-end support for highpart optimizations is missing. */ #define high_life_started reload_in_progress +/* The tas.b instruction sets the 7th bit in the byte, i.e. 0x80. + This value is used by optabs.c atomic op expansion code as well as in + sync.md. */ +#undef TARGET_ATOMIC_TEST_AND_SET_TRUEVAL +#define TARGET_ATOMIC_TEST_AND_SET_TRUEVAL 0x80 + #endif /* ! GCC_SH_H */ Index: gcc/config/sh/sync.md === --- gcc/config/sh/sync.md (revision 184877) +++ gcc/config/sh/sync.md (working copy) @@ -404,3 +404,61 @@ 1: mov r1,r15; } [(set_attr length 18)]) + +(define_expand atomic_test_and_set + [(match_operand:SI 0 register_operand ) ;; bool result output + (match_operand:QI 1 memory_operand ) ;; memory + (match_operand:SI 2 const_int_operand )] ;; model + (TARGET_SOFT_ATOMIC || TARGET_ENABLE_TAS) !TARGET_SHMEDIA +{ + rtx addr = force_reg (Pmode, XEXP (operands[1], 0)); + + if (TARGET_ENABLE_TAS) +emit_insn (gen_tasb (addr)); + else +{ + rtx val = force_reg (QImode, + gen_int_mode (TARGET_ATOMIC_TEST_AND_SET_TRUEVAL, + QImode)); + emit_insn (gen_atomic_test_and_set_soft (addr, val)); +} + + /* The result of the test op is the inverse of what we are + supposed to return. Thus invert the T bit. 
The inversion will be + potentially optimized away and integrated into surrounding code. */ + emit_insn (gen_movnegt (operands[0])); + DONE; +}) + +(define_insn tasb + [(set (reg:SI T_REG) + (eq:SI (mem:QI (match_operand:SI 0 register_operand r)) + (const_int 0))) + (set (mem:QI (match_dup 0)) + (unspec:QI [(const_int 128)] UNSPEC_ATOMIC))] + TARGET_ENABLE_TAS !TARGET_SHMEDIA + tas.b @%0 + [(set_attr insn_class co_group)]) + +(define_insn atomic_test_and_set_soft + [(set (reg:SI T_REG) + (eq:SI (mem:QI (match_operand:SI 0 register_operand u)) + (const_int 0))) + (set (mem:QI (match_dup 0)) + (unspec:QI
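From the user's side, the "true" value only matters if code inspects the byte directly. The following sketch uses the GCC `__atomic_test_and_set` and `__atomic_clear` built-ins (compiler-specific, not ISO C++); the function names and the `lock_byte` variable are hypothetical. It treats the byte as an opaque flag and only ever compares it against zero, which stays correct whether the target stores 1 or, as with `tas.b` on SH, 0x80.

```cpp
// GCC/Clang-specific built-ins; this is a sketch, not portable ISO C++.
static char lock_byte = 0;

// Try to take the flag. __atomic_test_and_set returns true if the byte was
// already "set", so acquisition succeeded when it returns false.
bool try_acquire() {
    return !__atomic_test_and_set(&lock_byte, __ATOMIC_ACQUIRE);
}

// Reset the flag to the clear state.
void release() {
    __atomic_clear(&lock_byte, __ATOMIC_RELEASE);
}

// Inspect the flag without assuming the "set" value is 1: the stored value is
// target-defined (the __GCC_ATOMIC_TEST_AND_SET_TRUEVAL macro reports it).
bool byte_is_set() {
    return __atomic_load_n(&lock_byte, __ATOMIC_RELAXED) != 0;
}
```

Code that compared the byte against `1` instead of testing for non-zero is the kind of use that a TRUEVAL of 0x80 would break.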
New Swedish PO file for 'gcc' (version 4.7-b20120128)
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted by the Swedish team of translators. The file is available at:

http://translationproject.org/latest/gcc/sv.po

(This file, 'gcc-4.7-b20120128.sv.po', has just now been sent to you in a separate email.)

All other PO files for your package are available in:

http://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether official or a pretest.

Whenever you have a new distribution with a new version number ready, containing a newer POT file, please send the URL of that distribution tarball to the address below. The tarball may be just a pretest or a snapshot, it does not even have to compile. It is just used by the translators when they need some extra translation context.

The following HTML page has been updated:

http://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the name of your translation coordinator.
coordina...@translationproject.org
Re: PATCH [1/n] addr32: Properly use Pmode and word_mode
On Sat, Nov 12, 2011 at 3:19 AM, H.J. Lu <hongjiu...@intel.com> wrote:

> The current x32 implementation uses LEAs to convert 32-bit addresses to 64-bit. However, we can use the addr32 prefix to use 32-bit addresses directly. It improves performance by 5% in SPEC CPU 2K/2006. All changes are done in the x86 backend, except for a small unwind library assert change:
>
> http://gcc.gnu.org/ml/gcc-patches/2011-11/msg01555.html
>
> due to return column size difference.
>
> For x86-64, Pmode can be 32-bit or 64-bit, but word_mode is always 64-bit. push/pop only work on word_mode. Also, string instructions take Pmode pointers. I will submit a set of patches to use 32-bit Pmode for x32. This is the first patch to properly use Pmode and word_mode. It also adds the addr32 prefix to string instructions if needed. OK for trunk?

First round of review comments:

> @@ -10252,14 +10260,18 @@ ix86_expand_prologue (void)
>        if (r10_live && eax_live)
>          {
>            t = choose_baseaddr (m->fs.sp_offset - allocate);
> -          emit_move_insn (r10, gen_frame_mem (Pmode, t));
> +          emit_move_insn (gen_rtx_REG (word_mode, R10_REG),
> +                          gen_frame_mem (word_mode, t));
>            t = choose_baseaddr (m->fs.sp_offset - allocate - UNITS_PER_WORD);
> -          emit_move_insn (eax, gen_frame_mem (Pmode, t));
> +          emit_move_insn (gen_rtx_REG (word_mode, AX_REG),
> +                          gen_frame_mem (word_mode, t));
>          }
>        else if (eax_live || r10_live)
>          {
>            t = choose_baseaddr (m->fs.sp_offset - allocate);
> -          emit_move_insn ((eax_live ? eax : r10), gen_frame_mem (Pmode, t));
> +          emit_move_insn (gen_rtx_REG (word_mode,
> +                                       (eax_live ? AX_REG : R10_REG)),
> +                          gen_frame_mem (word_mode, t));
>          }
>      }
>    gcc_assert (m->fs.sp_offset == frame.stack_pointer_offset);

Please just change

  rtx eax = gen_rtx_REG (Pmode, AX_REG);

and

  r10 = gen_rtx_REG (Pmode, R10_REG);

around line 10305 and line 10324. You also have gen_push in Pmode, just following the former line. Please review the whole of ix86_expand_prologue for how AX and R10 are defined and used.

> @@ -11060,8 +11072,8 @@ ix86_expand_split_stack_prologue (void)
>      {
>        rtx rax;
>
> -      rax = gen_rtx_REG (Pmode, AX_REG);
> -      emit_move_insn (rax, reg10);
> +      rax = gen_rtx_REG (word_mode, AX_REG);
> +      emit_move_insn (rax, gen_rtx_REG (word_mode, R10_REG));
>        use_reg (&call_fusage, rax);
>      }

Same here. Please review how AX, R10 and R11 are defined and used. Also, this needs review from the split stack author.

> @@ -11388,6 +11400,11 @@ ix86_decompose_address (rtx addr, struct ix86_address *out)
>    else
>      disp = addr;                        /* displacement */
>
> +  /* Since address override works only on the (reg) part in fs:(reg),
> +     we can't use it as memory operand.  */
> +  if (Pmode != word_mode && seg == SEG_FS && (base || index))
> +    return 0;

Can you explain the above some more? IMO, if the override works on the (reg) part, this is just what we want.

> @@ -13637,7 +13665,8 @@ ix86_print_operand (FILE *file, rtx x, int code)
>                gcc_unreachable ();
>              }
>
> -          ix86_print_operand (file, x, 0);
> +          ix86_print_operand (file, x,
> +                              TARGET_64BIT && REG_P (x) ? 'q' : 0);
>            return;

This is too big a hammer. You output everything in DImode, so even if the address is in fact in SImode, you output it in DImode with an addr32 prefix.

Uros.
Re: [PATCH 02/10] addr32: Only handle zero-extended DImode addresses
On Fri, Mar 2, 2012 at 9:38 PM, H.J. Lu <hongjiu...@intel.com> wrote:
> We only need to handle zero-extended addresses in DImode. OK for trunk?
>
> 2012-03-02  H.J. Lu  <hongjiu...@intel.com>
>
>         * config/i386/i386.c (ix86_print_operand_address): Only handle
>         zero-extended DImode addresses.

OK.

Thanks,
Uros.
Re: [PATCH 06/10] addr32: Check Pmode to set adjust_stack_insn
On Fri, Mar 2, 2012 at 9:58 PM, H.J. Lu <hongjiu...@intel.com> wrote:
> Since the stack register may be in SImode for TARGET_64BIT, this patch checks Pmode to set adjust_stack_insn. OK for trunk?
>
> 2012-03-02  H.J. Lu  <hongjiu...@intel.com>
>
>         * config/i386/i386.c (ix86_expand_prologue): Check Pmode to set
>         adjust_stack_insn.

OK.

Thanks,
Uros.
Re: [PATCH 08/10] addr32: Check Pmode instead of TARGET_64BIT
On Fri, Mar 2, 2012 at 10:04 PM, H.J. Lu <hongjiu...@intel.com> wrote:
> Since the stack register may be in SImode for TARGET_64BIT, this patch checks Pmode to adjust the stack in the proper mode. OK for trunk?
>
> 2012-03-02  H.J. Lu  <hongjiu...@intel.com>
>
>         * config/i386/i386.c (pro_epilogue_adjust_stack): Check Pmode
>         instead of TARGET_64BIT.

OK.

Thanks,
Uros.
Re: libgo patch committed: Update to weekly.2012-02-22 release
Hello!

It looks like this patch introduced:

/home/uros/gcc-build-go/x86_64-unknown-linux-gnu/32/libgo/.libs/libgo.so: undefined reference to `libgo_runtime.runtime.Callers'
collect2: error: ld returned 1 exit status

All libgo tests fail due to this undefined reference.

Uros.
[PATCH, i386]: Declare some variables bool
Hello!

2012-03-04  Uros Bizjak  <ubiz...@gmail.com>

        * config/i386/i386.c (ix86_print_operand) <case '+'>: Declare
        taken and cputaken as bool.

Tested on x86_64-pc-linux-gnu {,-m32}, committed to mainline as obvious.

Uros.

Index: config/i386/i386.c
===
--- config/i386/i386.c (revision 184886)
+++ config/i386/i386.c (working copy)
@@ -14147,8 +14148,9 @@ ix86_print_operand (FILE *file, rtx x, int code)
                 if (pred_val < REG_BR_PROB_BASE * 45 / 100
                     || pred_val > REG_BR_PROB_BASE * 55 / 100)
                   {
-                    int taken = pred_val > REG_BR_PROB_BASE / 2;
-                    int cputaken = final_forward_branch_p (current_output_insn) == 0;
+                    bool taken = pred_val > REG_BR_PROB_BASE / 2;
+                    bool cputaken
+                      = final_forward_branch_p (current_output_insn) == 0;

                     /* Emit hints only in the case default branch prediction
                        heuristics would fail.  */
Re: PATCH [1/n] addr32: Properly use Pmode and word_mode
On Sun, Mar 4, 2012 at 12:09 PM, Uros Bizjak ubiz...@gmail.com wrote: On Sat, Nov 12, 2011 at 3:19 AM, H.J. Lu hongjiu...@intel.com wrote: The current x32 implementation uses LEAs to convert 32bit address to 64bit. However, we can use addr32 prefix to use 32bit address directly. It improves performance by 5% in SPEC CPU 2K/2006. All changes are done in x86 backend, except for a smaill unwind library assert change: http://gcc.gnu.org/ml/gcc-patches/2011-11/msg01555.html due to return column size difference. For x86-64, Pmode can be 32bit or 64bit, but word_mode is always 64bit. push/pop only work on word_mode. Also string instructions take Pmode pointers. I will submit a set of patches to use 32bit Pmode for x32. This is the first patch to properly use Pmode and word_mode. It also adds addr32 prefix to string instructions if needed. OK for trunk? First round of review comments: @@ -10252,14 +10260,18 @@ ix86_expand_prologue (void) if (r10_live eax_live) { t = choose_baseaddr (m-fs.sp_offset - allocate); - emit_move_insn (r10, gen_frame_mem (Pmode, t)); + emit_move_insn (gen_rtx_REG (word_mode, R10_REG), + gen_frame_mem (word_mode, t)); t = choose_baseaddr (m-fs.sp_offset - allocate - UNITS_PER_WORD); - emit_move_insn (eax, gen_frame_mem (Pmode, t)); + emit_move_insn (gen_rtx_REG (word_mode, AX_REG), + gen_frame_mem (word_mode, t)); } else if (eax_live || r10_live) { t = choose_baseaddr (m-fs.sp_offset - allocate); - emit_move_insn ((eax_live ? eax : r10), gen_frame_mem (Pmode, t)); + emit_move_insn (gen_rtx_REG (word_mode, + (eax_live ? AX_REG : R10_REG)), + gen_frame_mem (word_mode, t)); } } gcc_assert (m-fs.sp_offset == frame.stack_pointer_offset); Please just change rtx eax = gen_rtx_REG (Pmode, AX_REG); and r10 = gen_rtx_REG (Pmode, R10_REG); This is done on purpose. We manipulate stack using AX and R10 as scratch registers in Pmode since stack is in Pmode. But AX and R10 registers have to be saved and restored in word_mode. around line 10305 and line 10324. 
You also have gen_push in Pmode,

In those places, they just want to push a register on stack to save it. Callers don't care how it is done. I changed gen_push to allow Pmode by always pushing registers in word_mode:

if (REG_P (arg) && GET_MODE (arg) != word_mode)
  arg = gen_rtx_REG (word_mode, REGNO (arg));

just following the former line. Please review the whole ix86_expand_prologue how AX and R10 are defined and used.

The same issue applies here.

@@ -11060,8 +11072,8 @@ ix86_expand_split_stack_prologue (void)
     {
       rtx rax;

-      rax = gen_rtx_REG (Pmode, AX_REG);
-      emit_move_insn (rax, reg10);
+      rax = gen_rtx_REG (word_mode, AX_REG);
+      emit_move_insn (rax, gen_rtx_REG (word_mode, R10_REG));
       use_reg (call_fusage, rax);
     }

Same here. Please review how AX, R10 and R11 are defined and used. Also, this needs review from split stack author.

I CCed Ian. That is the same issue. We need some scratch registers in Pmode to manipulate stack. But we have to save and restore them in word_mode, not Pmode.

@@ -11388,6 +11400,11 @@ ix86_decompose_address (rtx addr, struct ix86_address *out)
   else
     disp = addr;    /* displacement */

+  /* Since address override works only on the (reg) part in fs:(reg),
+     we can't use it as memory operand.  */
+  if (Pmode != word_mode && seg == SEG_FS && (base || index))
+    return 0;

Can you explain the above some more? IMO, if the override works on (reg) part, this is just what we want.

When Pmode == SImode, we have fs segment register == 0x1001 and base register (SImode) == -1 (0x). We are expecting address to be 0x1001 - 1 == 0x1000. But, what we get is 0x1000 + 0x, not 0x1000 since 0x67 address prefix only applies to base register to zero-extend 0x to 64bit.

@@ -13637,7 +13665,8 @@ ix86_print_operand (FILE *file, rtx x, int code)
       gcc_unreachable ();
     }

-  ix86_print_operand (file, x, 0);
+  ix86_print_operand (file, x,
+                      TARGET_64BIT && REG_P (x) ? 'q' : 0);
   return;

This is too big hammer.
You output everything in DImode, so even if the address is in fact in SImode, you output it in DImode with an addr32 prefix.

%A is only used in jmp\t%A0 and there is no jmp *%eax instruction in 64bit mode, only jmp *%rax:

[hjl@gnu-4 tmp]$ cat j.s
jmp *%eax
jmp *%rax
[hjl@gnu-4 tmp]$ gcc -c j.s
j.s: Assembler messages:
j.s:1: Error: operand type mismatch for `jmp'
[hjl@gnu-4 tmp]$

It is OK for x32 since the upper 32bits are zero when we are loading %eax.

--
H.J.
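For anyone following the fs:(reg) point in this message, here is a minimal sketch of the arithmetic being described. It is not GCC code: the function names are made up, and it assumes the simplified model given above, in which the addr32 prefix zero-extends only the register part before the 64-bit segment base is added.

```cpp
#include <cstdint>

// Linear address when the 0x67 (addr32) prefix is applied to fs:(reg):
// only the register part is zero-extended to 64 bits, then the 64-bit
// segment base is added on top. Hypothetical model, not GCC code.
constexpr std::uint64_t addr32_linear(std::uint64_t seg_base, std::int32_t reg)
{
  return seg_base + static_cast<std::uint32_t>(reg);
}

// What 32-bit pointer arithmetic would expect instead: the whole sum
// wraps modulo 2^32.
constexpr std::uint64_t wrapped_linear(std::uint64_t seg_base, std::int32_t reg)
{
  return static_cast<std::uint32_t>(seg_base + static_cast<std::uint32_t>(reg));
}

// The example from the mail: segment base 0x1001, base register -1.
static_assert(wrapped_linear(0x1001, -1) == 0x1000, "address we want");
static_assert(addr32_linear(0x1001, -1) == 0x100001000ULL, "address we get");
```

The two results differ by exactly 2^32, which is why the patch refuses to form such an address when Pmode != word_mode.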
Re: [PATCH 02/10] addr32: Output REX prefix for UNSPEC_GOTNTPOFF
On Fri, Mar 2, 2012 at 9:36 PM, H.J. Lu hongjiu...@intel.com wrote:

X86-64 linker optimizes TLS_MODEL_INITIAL_EXEC to TLS_MODEL_LOCAL_EXEC by checking

movq foo@gottpoff(%rip), %reg

and

addq foo@gottpoff(%rip), %reg

It uses the REX prefix to avoid the last byte of the previous instruction. With 32bit Pmode, we may not have the REX prefix and the last byte of the previous instruction may be an offset, which may look like a REX prefix. IE-LE optimization will generate corrupted binary. This patch makes sure we always output an REX prefix for UNSPEC_GOTNTPOFF. OK for trunk?

No, please implement this using UNSPEC in the same way as tls_initial_exec_64_sun implements Sun linker quirk.

Uros.
Re: [PATCH 02/10] addr32: Output REX prefix for UNSPEC_GOTNTPOFF
On Sun, Mar 4, 2012 at 2:12 PM, Uros Bizjak ubiz...@gmail.com wrote:
On Fri, Mar 2, 2012 at 9:36 PM, H.J. Lu hongjiu...@intel.com wrote:

X86-64 linker optimizes TLS_MODEL_INITIAL_EXEC to TLS_MODEL_LOCAL_EXEC by checking

movq foo@gottpoff(%rip), %reg

and

addq foo@gottpoff(%rip), %reg

It uses the REX prefix to avoid the last byte of the previous instruction. With 32bit Pmode, we may not have the REX prefix and the last byte of the previous instruction may be an offset, which may look like a REX prefix. IE-LE optimization will generate corrupted binary. This patch makes sure we always output an REX prefix for UNSPEC_GOTNTPOFF. OK for trunk?

No, please implement this using UNSPEC in the same way as tls_initial_exec_64_sun implements Sun linker quirk.

I am not sure how it can be done with UNSPEC cleanly. Unlike the Sun linker issue, this is an instruction encoding issue. At instruction pattern level, there is no difference between x32 and x86-64. I need to make sure that there is always one and only one REX prefix with UNSPEC_GOTNTPOFF. If REG is r8-r15, we shouldn't add another REX prefix.

--
H.J.
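To make the encoding problem concrete, here is a rough sketch. It is not linker or GCC code: the helper names and the byte sequences are made up for illustration, and the check itself is only an assumption about how a byte in front of the opcode might be inspected. The one hard fact used is that in 64-bit mode any byte in 0x40..0x4f decodes as a REX prefix.

```cpp
#include <cstddef>
#include <cstdint>

// In 64-bit mode every byte in 0x40..0x4f decodes as a REX prefix.
constexpr bool looks_like_rex(std::uint8_t byte)
{
  return (byte & 0xf0) == 0x40;
}

// Hypothetical helper: with hardware register numbers 0..15, r8-r15 already
// force a REX prefix, so an explicit one is only wanted for numbers 0..7.
constexpr bool needs_explicit_rex(unsigned regno)
{
  return regno < 8;
}

// Hypothetical model of a check that inspects the byte in front of the
// opcode. reloc_offset is the offset of the 4-byte displacement, so the
// opcode and ModRM bytes sit at reloc_offset - 2 and reloc_offset - 1.
inline bool byte_before_opcode_looks_like_rex(const std::uint8_t* code,
                                              std::size_t reloc_offset)
{
  return reloc_offset >= 3 && looks_like_rex(code[reloc_offset - 3]);
}

// movl 0x48(%rdi), %edx ; movl foo@gottpoff(%rip), %eax  (no REX prefix).
// The displacement byte 0x48 of the first instruction sits exactly where a
// REX prefix of the second one would be.
const std::uint8_t without_rex[] = {0x8b, 0x57, 0x48,
                                    0x8b, 0x05, 0x00, 0x00, 0x00, 0x00};

// Same sequence with a real prefix: movq foo@gottpoff(%rip), %rax.
const std::uint8_t with_rex[] = {0x8b, 0x57, 0x48,
                                 0x48, 0x8b, 0x05, 0x00, 0x00, 0x00, 0x00};
```

In this model both sequences pass the check (displacement offsets 5 and 6 respectively), so a byte-level test cannot tell a genuine prefix from the tail of the previous instruction. That is the reason for wanting exactly one REX prefix to always be present.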
Re: PATCH [1/n] addr32: Properly use Pmode and word_mode
On Sun, Mar 4, 2012 at 11:01 PM, H.J. Lu hjl.to...@gmail.com wrote:

@@ -13637,7 +13665,8 @@ ix86_print_operand (FILE *file, rtx x, int code)
       gcc_unreachable ();
     }

-  ix86_print_operand (file, x, 0);
+  ix86_print_operand (file, x,
+                      TARGET_64BIT && REG_P (x) ? 'q' : 0);
   return;

This is too big hammer. You output everything in DImode, so even if the address is in fact in SImode, you output it in DImode with an addr32 prefix.

%A is only used in jmp\t%A0 and there is no jmp *%eax instruction in 64bit mode, only jmp *%rax:

[hjl@gnu-4 tmp]$ cat j.s
jmp *%eax
jmp *%rax
[hjl@gnu-4 tmp]$ gcc -c j.s
j.s: Assembler messages:
j.s:1: Error: operand type mismatch for `jmp'
[hjl@gnu-4 tmp]$

It is OK for x32 since the upper 32bits are zero when we are loading %eax.

Just zero_extend register in wrong mode to DImode in indirect_jump and tablejump expanders. If above is true, then gcc will remove this extension automatically.

Uros.
Re: [PATCH 02/10] addr32: Output REX prefix for UNSPEC_GOTNTPOFF
On Sun, Mar 4, 2012 at 11:38 PM, H.J. Lu hjl.to...@gmail.com wrote:
On Sun, Mar 4, 2012 at 2:12 PM, Uros Bizjak ubiz...@gmail.com wrote:
On Fri, Mar 2, 2012 at 9:36 PM, H.J. Lu hongjiu...@intel.com wrote:

X86-64 linker optimizes TLS_MODEL_INITIAL_EXEC to TLS_MODEL_LOCAL_EXEC by checking

movq foo@gottpoff(%rip), %reg

and

addq foo@gottpoff(%rip), %reg

It uses the REX prefix to avoid the last byte of the previous instruction. With 32bit Pmode, we may not have the REX prefix and the last byte of the previous instruction may be an offset, which may look like a REX prefix. IE-LE optimization will generate corrupted binary. This patch makes sure we always output an REX prefix for UNSPEC_GOTNTPOFF. OK for trunk?

No, please implement this using UNSPEC in the same way as tls_initial_exec_64_sun implements Sun linker quirk.

I am not sure how it can be done with UNSPEC cleanly. Unlike the Sun linker issue, this is an instruction encoding issue. At instruction pattern level, there is no difference between x32 and x86-64. I need to make sure that there is always one and only one REX prefix with UNSPEC_GOTNTPOFF. If REG is r8-r15, we shouldn't add another REX prefix.

(define_insn "tls_initial_exec_x32"
  [(set (match_operand:SI 0 "register_operand" "=r")
        (unspec:SI
         [(match_operand:SI 1 "tls_symbolic_operand" "")]
         UNSPEC_TLS_IE_X32))
   (clobber (reg:CC FLAGS_REG))]
  "TARGET_X32"
{
  if (!REX_INT_REG_P (operands[0]))
    fputs ("\trex\n", asm_out_file);
  output_asm_insn ("mov{l}\t{%%fs:0, %0|%0, QWORD PTR fs:0}", operands);
  if (!REX_INT_REG_P (operands[0]))
    fputs ("\trex\n", asm_out_file);
  return "add{l}\t{%a1@gottpoff(%%rip), %0|%0, %a1@gottpoff[rip]}";
}
  [(set_attr "type" "multi")])

rex or rex64 prefix, whatever you wish.

Then generate this pattern from legitimize_tls_address, TLS_MODEL_INITIAL_EXEC, see TARGET_SUN_TLS part.

Uros.
Re: [4.7][SH] Binary compatibility with atomic_test_and_test_trueval != 1
Oleg Endo oleg.e...@t-online.de wrote: Attached is a slightly modified version of the patch from http://gcc.gnu.org/ml/gcc-patches/2012-03/msg00085.html I have removed the signed char weirdo and adjusted the comment above TARGET_ATOMIC_TEST_AND_SET_TRUEVAL accordingly. Tested by compiling some test functions that use __atomic_test_and_set / __GCC_ATOMIC_TEST_AND_SET_TRUEVAL with various SH atomic option combinations and looking at the output asm. OK to apply to trunk? OK. Regards, kaz
Re: [patch, libffi] Sync merge libffi
On 04.03.2012 22:20, Anthony Green wrote: Hello, The attached patch includes changes that have been reviewed, approved and merged into the stand-alone libffi release tree**. ** http://github.com/atgreen/libffi does this correspond to a libffi release or release candidate?
[v3] libstdc++/43813
Hi, this is what I did to implement the resolution of lwg 1234. All in all pretty straightforward. Note I'm leaving alone for now basic_string and all the trickery with its exports (and well, I don't think *that* many people are instantiating basic_string for, eg, a pointer type). Tested x86_64-linux, normal and debug, committed. Thanks, Paolo. /// 2012-03-04 Paolo Carlini paolo.carl...@oracle.com Jonathan Wakely jwakely@gmail.com PR libstdc++/43813 * include/bits/stl_iterator_base_types.h (_RequireInputIter): New. * include/ext/vstring.h (__versa_string::__versa_string (_InputIterator, _InputIterator, const _Alloc), __versa_string::append(_InputIterator, _InputIterator), __versa_string::assign(_InputIterator, _InputIterator), __versa_string::insert(iterator, _InputIterator, _InputIterator), __versa_string::replace(iterator, iterator, _InputIterator, _InputIterator)): Use it. * include/bits/stl_list.h (list::list(_InputIterator, _InputIterator, const allocator_type), list::assign(_InputIterator, _InputIterator), list::insert(iterator, _InputIterator, _InputIterator)): Likewise. * include/bits/stl_vector.h (vector::vector(_InputIterator, _InputIterator, const allocator_type), vector::assign(_InputIterator, _InputIterator), vectort::insert(iterator, _InputIterator, _InputIterator)): Likewise. * include/bits/stl_deque.h (deque::deque(_InputIterator, _InputIterator, const allocator_type), deque::deque(_InputIterator, _InputIterator), deque::insert(iterator, _InputIterator, _InputIterator)): Likewise. * include/bits/stl_bvector.h (vector::vector(_InputIterator, _InputIterator, const allocator_type), vector::deque(_InputIterator, _InputIterator), vector::insert(iterator, _InputIterator, _InputIterator)): Likewise. * include/bits/forward_list.h (forward_list::forward_list (_InputIterator, _InputIterator, const allocator_type), forward_list::assign(_InputIterator, _InputIterator), forward_list::insert_after(const_iterator, _InputIterator, _InputIterator)): Likewise. 
(forward_list::_M_initialize_dispatch(,, __true_type): Remove. (forward_list::_M_range_initialize): Add, adjust everywhere. * include/bits/forward_list.tcc: Adjust. * include/debug/forward_list: Adjust. * include/debug/vector: Likewise. * include/debug/deque: Likewise. * include/debug/list: Likewise. * testsuite/ext/vstring/requirements/do_the_right_thing.cc: New. * testsuite/23_containers/forward_list/requirements/ do_the_right_thing.cc: Likewise. * testsuite/23_containers/vector/requirements/ do_the_right_thing.cc: Likewise. * testsuite/23_containers/deque/requirements/ do_the_right_thing.cc: Likewise. * testsuite/23_containers/list/requirements/ do_the_right_thing.cc: Likewise. * testsuite/23_containers/forward_list/requirements/dr438/ assign_neg.cc: Adjust dg-error line number. * testsuite/23_containers/forward_list/requirements/dr438/ insert_neg.cc: Likewise. * testsuite/23_containers/forward_list/requirements/dr438/ constructor_1_neg.cc: Likewise. * testsuite/23_containers/forward_list/requirements/dr438/ constructor_2_neg.cc: Likewise. * testsuite/23_containers/vector/requirements/dr438/ assign_neg.cc: Likewise. * testsuite/23_containers/vector/requirements/dr438/ insert_neg.cc: Likewise. * testsuite/23_containers/vector/requirements/dr438/ constructor_1_neg.cc: Likewise. * testsuite/23_containers/vector/requirements/dr438/ constructor_2_neg.cc: Likewise. * testsuite/23_containers/deque/requirements/dr438/ assign_neg.cc: Likewise. * testsuite/23_containers/deque/requirements/dr438/ insert_neg.cc: Likewise. * testsuite/23_containers/deque/requirements/dr438/ constructor_1_neg.cc: Likewise. * testsuite/23_containers/deque/requirements/dr438/ constructor_2_neg.cc: Likewise. * testsuite/23_containers/list/requirements/dr438/ assign_neg.cc: Likewise. * testsuite/23_containers/list/requirements/dr438/ insert_neg.cc: Likewise. * testsuite/23_containers/list/requirements/dr438/ constructor_1_neg.cc: Likewise. 
* testsuite/23_containers/list/requirements/dr438/ constructor_2_neg.cc: Likewise. Index: include/debug/forward_list === --- include/debug/forward_list (revision 184887) +++ include/debug/forward_list (working copy) @@ -1,6 +1,6 @@ // forward_list -*- C++ -*- -// Copyright (C) 2010 Free Software Foundation, Inc. +//
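For readers who have not looked at the issue, the sketch below shows the kind of constraint the ChangeLog entry for _RequireInputIter suggests. It is an illustration only, not the committed code: RequireInputIter and TinyVec are made-up names, and the enable_if on the iterator category is my assumption about how such a check can be written with a standard library whose iterator_traits is SFINAE-friendly.

```cpp
#include <cstddef>
#include <iterator>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for the new constraint: the alias is only valid when
// the iterator category of It converts to std::input_iterator_tag.
template<typename It>
using RequireInputIter = typename std::enable_if<
    std::is_convertible<typename std::iterator_traits<It>::iterator_category,
                        std::input_iterator_tag>::value>::type;

// Toy container with the two constructors that used to compete.
struct TinyVec
{
  std::vector<int> data;
  bool used_range_ctor = false;

  // n copies of value.
  TinyVec(std::size_t n, const int& value) : data(n, value) { }

  // Iterator range; drops out of overload resolution for non-iterators.
  template<typename It, typename = RequireInputIter<It>>
    TinyVec(It first, It last) : data(first, last), used_range_ctor(true) { }
};
```

With the constraint in place, TinyVec(10, 5) picks the count-and-value constructor even though both arguments are int, while TinyVec(arr, arr + 3) still picks the range constructor. Without it, the template would be the better match for two int arguments.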
Re: [patch, libffi] Sync merge libffi
On 3/4/2012 7:53 PM, Matthias Klose wrote: On 04.03.2012 22:20, Anthony Green wrote: Hello, The attached patch includes changes that have been reviewed, approved and merged into the stand-alone libffi release tree**. ** http://github.com/atgreen/libffi does this correspond to a libffi release or release candidate? No, but very close. There's still the outstanding ARM fp issue (libffi doesn't build for soft-fp targets). I'll sync again when 3.0.11 is final. Chung-Lin, this is related to your VFP support patch. I hope you have time to look at this soon. Let me know if you won't. Thanks, AG
Re: [patch, libffi] Sync merge libffi
On Sun, 04 Mar 2012, Anthony Green wrote:

Hello, The attached patch includes changes that have been reviewed, approved and merged into the stand-alone libffi release tree**. Tested on x86_64 linux with no regressions, and committed. Thanks, Anthony Green

I'd like to question some of the changes in copyright. For example, the file src/pa/ffi.c was originally written by Randolph Chung. The copyright notice read as follows on the original contribution:

   ffi.c - (c) 2003-2004 Randolph Chung ta...@debian.org

   HPPA Foreign Function Interface

   Permission is hereby granted, free of charge, to any person obtaining
   a copy of this software and associated documentation files (the
   ``Software''), to deal in the Software without restriction, including
   without limitation the rights to use, copy, modify, merge, publish,
   distribute, sublicense, and/or sell copies of the Software, and to
   permit persons to whom the Software is furnished to do so, subject to
   the following conditions:

   The above copyright notice and this permission notice shall be included
   in all copies or substantial portions of the Software.

   THE SOFTWARE IS PROVIDED ``AS IS'', WITHOUT WARRANTY OF ANY KIND, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
   MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
   IN NO EVENT SHALL CYGNUS SOLUTIONS BE LIABLE FOR ANY CLAIM, DAMAGES OR
   OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
   ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
   OTHER DEALINGS IN THE SOFTWARE.

It now reads as follows:

/* ---
   ffi.c - (c) 2011 Anthony Green
           (c) 2008 Red Hat, Inc.
           (c) 2006 Free Software Foundation, Inc.
           (c) 2003-2004 Randolph Chung ta...@debian.org

   HPPA Foreign Function Interface
   HP-UX PA ABI support

   Permission is hereby granted, free of charge, to any person obtaining
   a copy of this software and associated documentation files (the
   ``Software''), to deal in the Software without restriction, including
   without limitation the rights to use, copy, modify, merge, publish,
   distribute, sublicense, and/or sell copies of the Software, and to
   permit persons to whom the Software is furnished to do so, subject to
   the following conditions:

   The above copyright notice and this permission notice shall be included
   in all copies or substantial portions of the Software.

   THE SOFTWARE IS PROVIDED ``AS IS'', WITHOUT WARRANTY OF ANY KIND,
   EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
   MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
   NONINFRINGEMENT.  IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
   HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
   WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
   OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
   DEALINGS IN THE SOFTWARE.
   --- */

I'm just wondering why Anthony Green and Redhat are listed as copyright holders. I can understand the Free Software Foundation addition since the file was contributed to it.

Dave
--
J. David Anglin dave.ang...@nrc-cnrc.gc.ca
National Research Council of Canada (613) 990-0752 (FAX: 952-6602)
Re: PATCH [1/n] addr32: Properly use Pmode and word_mode
On Sun, Mar 4, 2012 at 2:40 PM, Uros Bizjak ubiz...@gmail.com wrote:
On Sun, Mar 4, 2012 at 11:01 PM, H.J. Lu hjl.to...@gmail.com wrote:

@@ -13637,7 +13665,8 @@ ix86_print_operand (FILE *file, rtx x, int code)
       gcc_unreachable ();
     }

-  ix86_print_operand (file, x, 0);
+  ix86_print_operand (file, x,
+                      TARGET_64BIT && REG_P (x) ? 'q' : 0);
   return;

This is too big hammer. You output everything in DImode, so even if the address is in fact in SImode, you output it in DImode with an addr32 prefix.

%A is only used in jmp\t%A0 and there is no jmp *%eax instruction in 64bit mode, only jmp *%rax:

[hjl@gnu-4 tmp]$ cat j.s
jmp *%eax
jmp *%rax
[hjl@gnu-4 tmp]$ gcc -c j.s
j.s: Assembler messages:
j.s:1: Error: operand type mismatch for `jmp'
[hjl@gnu-4 tmp]$

It is OK for x32 since the upper 32bits are zero when we are loading %eax.

Just zero_extend register in wrong mode to DImode in indirect_jump and tablejump expanders. If above is true, then gcc will remove this extension automatically.

I tried:

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 715e7ea..de5cf67 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -11100,10 +11100,15 @@
    (set_attr "modrm" "0")])

 (define_expand "indirect_jump"
-  [(set (pc) (match_operand 0 "indirect_branch_operand" ""))])
+  [(set (pc) (match_operand 0 "indirect_branch_operand" ""))]
+  ""
+{
+  if (TARGET_X32)
+    operands[0] = convert_memory_address (word_mode, operands[0]);
+})

 (define_insn "*indirect_jump"
-  [(set (pc) (match_operand:P 0 "indirect_branch_operand" "rw"))]
+  [(set (pc) (match_operand:W 0 "indirect_branch_operand" "rw"))]
   ""
   "jmp\t%A0"
   [(set_attr "type" "ibr")
@@ -11145,12 +11150,12 @@
       operands[0] = expand_simple_binop (Pmode, code, op0, op1, NULL_RTX, 0,
                                          OPTAB_DIRECT);
     }
-  else if (TARGET_X32)
-    operands[0] = convert_memory_address (Pmode, operands[0]);
+  if (TARGET_X32)
+    operands[0] = convert_memory_address (word_mode, operands[0]);
 })

 (define_insn "*tablejump_1"
-  [(set (pc) (match_operand:P 0 "indirect_branch_operand" "rw"))
+  [(set (pc) (match_operand:W 0 "indirect_branch_operand" "rw"))
    (use (label_ref (match_operand 1 "" "")))]
   ""
   "jmp\t%A0"

and compiler does generate the same output. i386.c also has

xasm = "jmp\t%A0";
xasm = "call\t%A0";

for calls. There are no separate indirect call patterns. For x32, only indirect register calls have to be in DImode. The direct call should be in Pmode (SImode).

Since x86-64 hardware always zero-extends upper 32bits of 64bit registers when loading its lower 32bits, it is safe and easier to just output 64bit registers for %A than zero-extend it by hand for all jump/call patterns.

--
H.J.
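The zero-extension rule this argument relies on can be modelled in a few lines. This is only an illustration of the architectural behaviour (a 32-bit register write clears the upper half, a 16-bit write does not); Reg64, write32 and write16 are made-up names and have nothing to do with GCC internals.

```cpp
#include <cstdint>

// Toy model of one 64-bit general purpose register.
struct Reg64
{
  std::uint64_t bits;
};

// Writing the low 32 bits (e.g. movl ..., %eax) clears bits 63..32.
inline void write32(Reg64& r, std::uint32_t v)
{
  r.bits = v;
}

// Writing the low 16 bits (e.g. movw ..., %ax) leaves bits 63..16 untouched.
inline void write16(Reg64& r, std::uint16_t v)
{
  r.bits = (r.bits & ~std::uint64_t{0xffff}) | v;
}
```

So after any 32-bit load into %eax, reading %rax yields the same value zero-extended, which is the property the %A change depends on. The same would not hold after a 16-bit or 8-bit write.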
Re: [PATCH 02/10] addr32: Output REX prefix for UNSPEC_GOTNTPOFF
On Sun, Mar 4, 2012 at 2:52 PM, Uros Bizjak ubiz...@gmail.com wrote:
On Sun, Mar 4, 2012 at 11:38 PM, H.J. Lu hjl.to...@gmail.com wrote:
On Sun, Mar 4, 2012 at 2:12 PM, Uros Bizjak ubiz...@gmail.com wrote:
On Fri, Mar 2, 2012 at 9:36 PM, H.J. Lu hongjiu...@intel.com wrote:

X86-64 linker optimizes TLS_MODEL_INITIAL_EXEC to TLS_MODEL_LOCAL_EXEC by checking

movq foo@gottpoff(%rip), %reg

and

addq foo@gottpoff(%rip), %reg

It uses the REX prefix to avoid the last byte of the previous instruction. With 32bit Pmode, we may not have the REX prefix and the last byte of the previous instruction may be an offset, which may look like a REX prefix. IE-LE optimization will generate corrupted binary. This patch makes sure we always output an REX prefix for UNSPEC_GOTNTPOFF. OK for trunk?

No, please implement this using UNSPEC in the same way as tls_initial_exec_64_sun implements Sun linker quirk.

I am not sure how it can be done with UNSPEC cleanly. Unlike the Sun linker issue, this is an instruction encoding issue. At instruction pattern level, there is no difference between x32 and x86-64. I need to make sure that there is always one and only one REX prefix with UNSPEC_GOTNTPOFF. If REG is r8-r15, we shouldn't add another REX prefix.

(define_insn "tls_initial_exec_x32"
  [(set (match_operand:SI 0 "register_operand" "=r")
        (unspec:SI
         [(match_operand:SI 1 "tls_symbolic_operand" "")]
         UNSPEC_TLS_IE_X32))
   (clobber (reg:CC FLAGS_REG))]
  "TARGET_X32"
{
  if (!REX_INT_REG_P (operands[0]))
    fputs ("\trex\n", asm_out_file);
  output_asm_insn ("mov{l}\t{%%fs:0, %0|%0, QWORD PTR fs:0}", operands);
  if (!REX_INT_REG_P (operands[0]))
    fputs ("\trex\n", asm_out_file);
  return "add{l}\t{%a1@gottpoff(%%rip), %0|%0, %a1@gottpoff[rip]}";
}
  [(set_attr "type" "multi")])

rex or rex64 prefix, whatever you wish.

Will it introduce bugs? The normal code looks like

  off = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, x), type);
  off = gen_rtx_CONST (Pmode, off);
  if (pic)
    off = gen_rtx_PLUS (Pmode, pic, off);
  off = gen_const_mem (Pmode, off);
  set_mem_alias_set (off, ix86_GOT_alias_set ());

  if (TARGET_64BIT || TARGET_ANY_GNU_TLS)
    {
      base = get_thread_pointer (for_mov || !TARGET_TLS_DIRECT_SEG_REFS);
      off = force_reg (Pmode, off);
      return gen_rtx_PLUS (Pmode, base, off);
    }

There is a call to

  set_mem_alias_set (off, ix86_GOT_alias_set ());

--
H.J.
Re: [patch, libffi] Sync merge libffi
On 3/4/2012 10:22 PM, John David Anglin wrote: I'm just wondering why Anthony Green and Redhat are listed as copyright holders. I can understand the Free Software Foundation addition since the file was contributed to it. Simply because of changes that were made to that source file over the years. For instance, in 2011 I made a small change to that file (ABI check change), and the copyright notice update was largely a mechanical byproduct (emacs mostly automates this). I have no objection to removing this if you feel strongly about it. AG
Re: PATCH [1/n] addr32: Properly use Pmode and word_mode
H.J. Lu hjl.to...@gmail.com writes:

@@ -11060,8 +11072,8 @@ ix86_expand_split_stack_prologue (void)
     {
       rtx rax;

-      rax = gen_rtx_REG (Pmode, AX_REG);
-      emit_move_insn (rax, reg10);
+      rax = gen_rtx_REG (word_mode, AX_REG);
+      emit_move_insn (rax, gen_rtx_REG (word_mode, R10_REG));
       use_reg (call_fusage, rax);
     }

Same here. Please review how AX, R10 and R11 are defined and used. Also, this needs review from split stack author.

I CCed Ian. That is the same issue. We need some scratch registers in Pmode to manipulate stack. But we have to save and restore them in word_mode, not Pmode.

Changing Pmode to word_mode is fine here, if the x86 maintainers approve the rest of the patch.

Ian
Re: libgo patch committed: Fill out syscall package for GNU/Linux
Rainer Orth r...@cebitec.uni-bielefeld.de writes:
Rainer Orth r...@cebitec.uni-bielefeld.de writes:
Ian Lance Taylor i...@google.com writes:

This patch to libgo fills out the syscall package for GNU/Linux to match all the functions in the syscall package in the master Go library. There is a test case for this patch at http://code.google.com/p/go/issues/detail?id=3071 . Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline.

Unfortunately, this broke Solaris bootstrap:

It also broke Linux/x86_64 bootstrap (CentOS 5.6):

In file included from /usr/include/sys/ustat.h:30:0,
                 from /usr/include/ustat.h:1,
                 from sysinfo.c:91:
/usr/include/bits/ustat.h:25:8: error: redefinition of 'struct ustat'
In file included from /usr/include/linux/filter.h:8:0,
                 from sysinfo.c:61:
/usr/include/linux/types.h:156:8: note: originally defined here

After some actual testing, this additional patch seems to be needed to fix the problem. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline.

Ian

diff -r 6ec2944349b2 libgo/configure.ac
--- a/libgo/configure.ac Fri Mar 02 13:07:34 2012 -0800
+++ b/libgo/configure.ac Sun Mar 04 21:53:22 2012 -0800
@@ -463,6 +463,8 @@

 AC_CACHE_CHECK([whether ustat.h can be used],
 [libgo_cv_c_ustat_h],
+[CFLAGS_hold=$CFLAGS
+CFLAGS="$CFLAGS -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE $OSCFLAGS"
 AC_COMPILE_IFELSE(
 [AC_LANG_SOURCE([
 #include <sys/types.h>
@@ -470,7 +472,8 @@
 #include <linux/filter.h>
 #endif
 #include <ustat.h>
-])], [libgo_cv_c_ustat_h=yes], [libgo_cv_c_ustat_h=no]))
+])], [libgo_cv_c_ustat_h=yes], [libgo_cv_c_ustat_h=no])
+CFLAGS=$CFLAGS_hold])
 if test $libgo_cv_c_ustat_h = yes; then
   AC_DEFINE(HAVE_USTAT_H, 1,
     [Define to 1 if you have the ustat.h header file and it works.])
libgo patch committed: Better big-endian hash function
This libgo patch improves the big-endian hash function for key sizes less than 8 bytes. The previous hash function would always make all hash values a large multiple of some constants, which interacted badly with the map code. This patch fixes that problem and fixes PR 52342. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline.

Ian

diff -r 071257161dab libgo/runtime/go-type-identity.c
--- a/libgo/runtime/go-type-identity.c Sun Mar 04 22:03:57 2012 -0800
+++ b/libgo/runtime/go-type-identity.c Sun Mar 04 22:36:37 2012 -0800
@@ -6,6 +6,7 @@

 #include <stddef.h>

+#include "config.h"
 #include "go-type.h"

 /* The 64-bit type.  */
@@ -31,7 +32,11 @@
         unsigned char a[8];
       } u;
       u.v = 0;
-      __builtin_memcpy (u.a, key, key_size);
+#ifdef WORDS_BIGENDIAN
+      __builtin_memcpy (&u.a[8 - key_size], key, key_size);
+#else
+      __builtin_memcpy (&u.a[0], key, key_size);
+#endif
       if (sizeof (uintptr_t) >= 8)
         return (uintptr_t) u.v;
       else
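To see why the copy offset matters, here is a small stand-alone sketch. It is not the libgo code: host_is_big_endian and pack_small_key are made-up names, the runtime endianness probe stands in for the WORDS_BIGENDIAN configure macro, and memcpy is used in place of the union so the example is valid C++.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Runtime stand-in for WORDS_BIGENDIAN: look at the first byte of the
// 16-bit value 1.
inline bool host_is_big_endian()
{
  const std::uint16_t probe = 1;
  unsigned char first;
  std::memcpy(&first, &probe, 1);
  return first == 0;
}

// Place a key of at most 8 bytes into a zeroed 8-byte buffer so that the
// buffer, read as a 64-bit integer, has the same numeric value as the key.
// On a big-endian host the key has to go at the end of the buffer.
inline std::uint64_t pack_small_key(const void* key, std::size_t key_size)
{
  assert(key_size <= 8);
  unsigned char bytes[8] = {0};
  const std::size_t offset = host_is_big_endian() ? 8 - key_size : 0;
  std::memcpy(bytes + offset, key, key_size);

  std::uint64_t value;
  std::memcpy(&value, bytes, sizeof value);
  return value;
}
```

With this placement a 2-byte key 0x0102 packs to 0x0102 on either kind of host. If the key were always copied to offset 0, a big-endian host would read it as 0x0102 shifted left by 48 bits, so every hash value would share the same low zero bits, which I take to be the "large multiple" behaviour described above.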
Re: PATCH [1/n] addr32: Properly use Pmode and word_mode
On Mon, Mar 5, 2012 at 4:53 AM, H.J. Lu hjl.to...@gmail.com wrote:

and compiler does generate the same output. i386.c also has

xasm = "jmp\t%A0";
xasm = "call\t%A0";

for calls. There are no separate indirect call patterns. For x32, only indirect register calls have to be in DImode. The direct call should be in Pmode (SImode).

Direct call just expects label to some absolute address that is assumed to fit in 32 bits (see constant_call_address_operand_p). call and jmp insn expect word_mode operands, so please change ix86_expand_call and call patterns in the same way as jump instructions above.

Since x86-64 hardware always zero-extends upper 32bits of 64bit registers when loading its lower 32bits, it is safe and easier to just output 64bit registers for %A than zero-extend it by hand for all jump/call patterns.

No, the instruction expects word_mode operands, so we have to extend values to expected mode. I don't think that patching at insn output time is acceptable.

BTW: I propose to split the patch into smaller pieces, dealing with various independent parts separately. Handling jump/call insn is definitely one of them, the other is stringops handling, another prologue/epilogue expansion.

Uros.