[patch, committed] invoke.texi: copy-edit x86 sections

2012-03-04 Thread Sandra Loosemore
I noticed that the x86-specific material in invoke.texi had a lot of 
copy-editing problems; besides the usual grammar and punctuation 
mistakes, I cleaned up a bunch of problems in the Texinfo markup. 
Additionally, I corrected quite a large number of issues where incorrect 
names were used for processors and chip manufacturers.  For example, 
Pentium III is the correct name of that processor; Pentium3 is not. 
(Wikipedia is very useful for this sort of thing, with links to 
manufacturers' data sheets and/or photos of the processor showing the 
branding on it.)


I've checked this in as a supposedly content-free patch.  I suggest, 
though, that the port maintainers look over this section (and not just 
my changes to it), as the reference to GCC 2.6.1 that I removed 
indicates to me that nobody has reviewed this material for quite a long 
time and some of it may be bit-rotten in other ways.


-Sandra


2012-03-04  Sandra Loosemore  san...@codesourcery.com

gcc/
* doc/invoke.texi (C++ Dialect Options): Minor copy-edits to
x86-specific text.
(Debugging Options): Likewise.
(Optimize Options): Likewise.
(i386 and x86-64 Options): Discuss -march before -mtune, consistently
with other architectures.  Use official processor names with correct
spelling/capitalization.  Fix formatting and grammar issues.
(i386 and x86-64 Windows Options): Similar cleanup here.

Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi	(revision 184878)
+++ gcc/doc/invoke.texi	(working copy)
@@ -2376,14 +2376,14 @@ Instantiations of these templates may be
 
 @end itemize
 
-It also warns psABI related changes.  The known psABI changes at this
+It also warns about psABI-related changes.  The known psABI changes at this
 point include:
 
 @itemize @bullet
 
 @item
-For SYSV/x86-64, when passing union with long double, it is changed to
-pass in memory as specified in psABI.  For example:
+For SysV/x86-64, unions with @code{long double} members are 
+passed in memory as specified in psABI.  For example:
 
 @smallexample
 union U @{
@@ -2393,7 +2393,7 @@ union U @{
 @end smallexample
 
 @noindent
-@code{union U} will always be passed in memory.
+@code{union U} is always passed in memory.
 
 @end itemize
 
@@ -5484,7 +5484,7 @@ architectures.
 
 @item -fdump-rtl-stack
 @opindex fdump-rtl-stack
-Dump after conversion from GCC's flat register file registers to the
+Dump after conversion from GCC's ``flat register file'' registers to the
 x87's stack-like registers.  This pass is only run on x86 variants.
 
 @item -fdump-rtl-subreg1
@@ -6333,7 +6333,7 @@ whether a target machine supports this f
 Usage, gccint, GNU Compiler Collection (GCC) Internals}.
 
 Starting with GCC version 4.6, the default setting (when not optimizing for
-size) for 32-bit Linux x86 and 32-bit Darwin x86 targets has been changed to
+size) for 32-bit GNU/Linux x86 and 32-bit Darwin x86 targets has been changed to
 @option{-fomit-frame-pointer}.  The default can be reverted to
 @option{-fno-omit-frame-pointer} by configuring GCC with the
 @option{--enable-frame-pointer} configure option.
@@ -6740,7 +6740,7 @@ Enabled at levels @option{-O2}, @option{
 @item -free
 @opindex free
 Attempt to remove redundant extension instructions.  This is especially
-helpful for the x86-64 architecture which implicitly zero-extends in 64-bit
+helpful for the x86-64 architecture, which implicitly zero-extends in 64-bit
 registers after writing to their lower 32-bit half.
 
 Enabled for x86 at levels @option{-O2}, @option{-O3}.
@@ -12977,102 +12977,134 @@ These @samp{-m} options are defined for 
 computers:
 
 @table @gcctabopt
-@item -mtune=@var{cpu-type}
-@opindex mtune
-Tune to @var{cpu-type} everything applicable about the generated code, except
-for the ABI and the set of available instructions.  The choices for
-@var{cpu-type} are:
-@table @emph
-@item generic
-Produce code optimized for the most common IA32/@/AMD64/@/EM64T processors.
-If you know the CPU on which your code will run, then you should use
-the corresponding @option{-mtune} option instead of
-@option{-mtune=generic}.  But, if you do not know exactly what CPU users
-of your application will have, then you should use this option.
 
-As new processors are deployed in the marketplace, the behavior of this
-option will change.  Therefore, if you upgrade to a newer version of
-GCC, the code generated option will change to reflect the processors
-that were most common when that version of GCC was released.
+@item -march=@var{cpu-type}
+@opindex march
+Generate instructions for the machine type @var{cpu-type}.  In contrast to
+@option{-mtune=@var{cpu-type}}, which merely tunes the generated code 
+for the specified @var{cpu-type}, @option{-march=@var{cpu-type}} allows GCC
+to generate code that may not run at all on processors other than the one
+indicated.  Specifying 

fix libstdc++/52433

2012-03-04 Thread Jonathan Wakely
PR libstdc++/52433
* include/debug/safe_iterator.h (_Safe_iterator): Add move
constructor and move assignment operator.
* testsuite/23_containers/vector/debug/52433.cc: New.

Tested 'make check check-debug' on x86_64 and committed to trunk.  I
plan to fix this for 4.7.1 and 4.6.4 as well.
diff --git a/libstdc++-v3/include/debug/safe_iterator.h b/libstdc++-v3/include/debug/safe_iterator.h
index e7cfe9c..65dff55 100644
--- a/libstdc++-v3/include/debug/safe_iterator.h
+++ b/libstdc++-v3/include/debug/safe_iterator.h
@@ -169,6 +169,19 @@ namespace __gnu_debug
  ._M_iterator(__x, "other"));
   }
 
+#ifdef __GXX_EXPERIMENTAL_CXX0X__
+  /**
+   * @brief Move construction.
+   * @post __x is singular and unattached
+   */
+  _Safe_iterator(_Safe_iterator&& __x) : _M_current()
+  {
+   std::swap(_M_current, __x._M_current);
+   this->_M_attach(__x._M_sequence);
+   __x._M_detach();
+  }
+#endif
+
   /**
*  @brief Converting constructor from a mutable iterator to a
*  constant iterator.
@@ -208,6 +221,22 @@ namespace __gnu_debug
return *this;
   }
 
+#ifdef __GXX_EXPERIMENTAL_CXX0X__
+  /**
+   * @brief Move assignment.
+   * @post __x is singular and unattached
+   */
+  _Safe_iterator&
+  operator=(_Safe_iterator&& __x)
+  {
+   _M_current = __x._M_current;
+   _M_attach(__x._M_sequence);
+   __x._M_detach();
+   __x._M_current = _Iterator();
+   return *this;
+  }
+#endif
+
   /**
*  @brief Iterator dereference.
*  @pre iterator is dereferenceable
@@ -422,7 +451,9 @@ namespace __gnu_debug
   /// Is this iterator equal to the sequence's before_begin() iterator if
   /// any?
   bool _M_is_before_begin() const
-  { return _BeforeBeginHelper<_Sequence>::_M_Is(base(), _M_get_sequence()); }
+  {
+   return _BeforeBeginHelper<_Sequence>::_M_Is(base(), _M_get_sequence());
+  }
 };
 
   template<typename _IteratorL, typename _IteratorR, typename _Sequence>
diff --git a/libstdc++-v3/testsuite/23_containers/vector/debug/52433.cc b/libstdc++-v3/testsuite/23_containers/vector/debug/52433.cc
new file mode 100644
index 0000000..f1f5917
--- /dev/null
+++ b/libstdc++-v3/testsuite/23_containers/vector/debug/52433.cc
@@ -0,0 +1,43 @@
+// Copyright (C) 2012 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+//
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+//
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+//
+// { dg-require-debug-mode "" }
+// { dg-options "-std=gnu++0x" }
+// { dg-do compile }
+
+// PR libstdc++/52433
+
+#include <vector>
+
+struct X
+{
+std::vector<int>::iterator i;
+
+X() = default;
+X(const X&) = default;
+X(X&&) = default;
+X& operator=(const X&) = default;
+X& operator=(X&&) = default;
+};
+
+X test01()
+{
+X x;
+x = X();
+return x;
+}
+
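The postcondition documented in the patch (the moved-from iterator ends up singular and unattached) can be illustrated with a small stand-alone sketch. `TrackedIter`, its members, and `demo_moves` are hypothetical names invented for this illustration. This is not the libstdc++ `_Safe_iterator` implementation, only a simplified model of the same move semantics.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for a debug-mode iterator: it remembers which
// container ("sequence") it is attached to, or nullptr when singular.
struct TrackedIter {
    const std::vector<int>* seq = nullptr;  // owning sequence, if attached
    const int* cur = nullptr;               // current position

    TrackedIter() = default;
    TrackedIter(const std::vector<int>& v, const int* p) : seq(&v), cur(p) {}
    TrackedIter(const TrackedIter&) = default;
    TrackedIter& operator=(const TrackedIter&) = default;

    // Move construction: take over the position and attachment, then
    // leave the source singular and unattached.
    TrackedIter(TrackedIter&& x) noexcept : seq(x.seq), cur(x.cur) {
        x.seq = nullptr;
        x.cur = nullptr;
    }

    // Move assignment with the same postcondition on the source.
    TrackedIter& operator=(TrackedIter&& x) noexcept {
        if (this != &x) {
            seq = x.seq;
            cur = x.cur;
            x.seq = nullptr;
            x.cur = nullptr;
        }
        return *this;
    }

    bool singular() const { return seq == nullptr; }
};

// Exercises both move operations and checks the postconditions.
inline void demo_moves() {
    std::vector<int> v{1, 2, 3};
    TrackedIter a(v, v.data());
    TrackedIter b(std::move(a));  // move construction
    assert(a.singular() && !b.singular());
    TrackedIter c;
    c = std::move(b);             // move assignment
    assert(b.singular() && !c.singular() && *c.cur == 1);
}
```

A struct such as `X` in the test case above, which defaults its move operations, relies on its iterator member providing operations of this shape.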


Re: [fortran, patch] Fix display of locus when source contains wide characters

2012-03-04 Thread Steven Bosscher
On Sat, Mar 3, 2012 at 4:08 PM, FX fxcoud...@gmail.com wrote:
> The attached patch fixes PR 36160 
> (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36160). It should correctly 
> account for wide characters when displaying error loci. I'm not sure if we can 
> check that in the testsuite harness, but you can manually see it at work on 
> the attached test.f90.
>
> Bootstrapped and regtested on x86_64-apple-darwin11, OK for trunk?
> FX


Looks OK to me except for:

> -  for (; i > 0; i--)
> +  for (; i > 0;)

Might as well just make that a while loop.

Ciao!
Steven


Re: [fortran, patch] Fix display of locus when source contains wide characters

2012-03-04 Thread FX
> Looks OK to me except for:
> 
>> -  for (; i > 0; i--)
>> +  for (; i > 0;)
> 
> Might as well just make that a while loop.

Indeed! Committed with a while loop, thanks for the review!

FX


[Patch,AVR] PR52461: Fix RAMPZ clobbering and RAMP* in epilogue

2012-03-04 Thread Georg-Johann Lay
This patch fixes several issues with RAMP registers:

* On devices with more than 64 KiB of RAM, RAMPZ is used as the high byte
  of the RAM address. If RAMPZ is used to read flash, it must be reset to 0
  after the read so that RAM reads operate correctly in the remainder.
  There is no support for RAM > 64 KiB, so RAMPZ = 0 is in order.

* The ISR epilogue restored RAMP* registers in the wrong order.

* As RAMPZ is used both in ELPM and LD/LDD on some xmega cores, the right
  condition for setting RAMPZ prior to ELPM is "have ELPM", not "have RAMPZ".

* Never read unintentionally from RAM because a flash address interpreted
  as a RAM address might point to the I/O area.

Ok for trunk and 4.7?

Johann

libgcc/
PR target/52461
* config/avr/lib1funcs.S (__do_copy_data): Clear RAMPZ after usage
if RAMPZ affects reading from RAM.
(__tablejump_elpm__): Ditto.
(.xload): Ditto.
(__movmemx_hi): Ditto.
(__do_global_ctors): Right condition for RAMPZ usage is have ELPM.
(__do_global_dtors): Ditto.
(__xload_1, __xload_2, __xload_3, __xload_4): Ditto.  And make weak.
(__movmemx_hi): Ditto.  And fix RAM-loop label.
(__xload_1): Never read unintentionally from RAM.

gcc/
PR target/52461
* gcc/config/avr/avr.c (expand_prologue): Depend save/restore of
RAMPZ on HAVE_RAMPD, not HAVE_RAMPZ.
(expand_epilogue): Ditto.  And fix order of restoration to:
RAMPZ, RAMPY, RAMPX, RAMPD.
(avr_xload_libgcc_p): Always load __memx by libgcc call on
big-RAM devices.
(avr_out_lpm): Clear RAMPZ after usage if RAMPZ affects reading
from RAM.
(avr_out_xload): Never read unintentionally from RAM.
* config/avr/avr.md (xload_8): Adjust insn length.
Index: libgcc/config/avr/lib1funcs.S
===
--- libgcc/config/avr/lib1funcs.S	(revision 184887)
+++ libgcc/config/avr/lib1funcs.S	(working copy)
@@ -1853,7 +1853,7 @@ DEFUN __do_copy_data
 	cpi	r26, lo8(__data_end)
 	cpc	r27, r17
 	brne	.L__do_copy_data_loop
-#elif  !defined(__AVR_HAVE_ELPMX__) && defined(__AVR_HAVE_ELPM__)
+#elif defined(__AVR_HAVE_ELPM__)
 	ldi	r17, hi8(__data_end)
 	ldi	r26, lo8(__data_start)
 	ldi	r27, hi8(__data_start)
@@ -1873,7 +1873,7 @@ DEFUN __do_copy_data
 	cpi	r26, lo8(__data_end)
 	cpc	r27, r17
 	brne	.L__do_copy_data_loop
-#elif !defined(__AVR_HAVE_ELPMX__) && !defined(__AVR_HAVE_ELPM__)
+#else /* !ELPM */
 	ldi	r17, hi8(__data_end)
 	ldi	r26, lo8(__data_start)
 	ldi	r27, hi8(__data_start)
@@ -1892,7 +1892,11 @@ DEFUN __do_copy_data
 	cpi	r26, lo8(__data_end)
 	cpc	r27, r17
 	brne	.L__do_copy_data_loop
-#endif /* !defined(__AVR_HAVE_ELPMX__) && !defined(__AVR_HAVE_ELPM__) */
+#endif /* ELPMX / ELPM / LPM */
+#if defined (__AVR_HAVE_ELPM__) && defined (__AVR_HAVE_RAMPD__)
+	;; Reset RAMPZ to 0 so that EBI devices don't read garbage from RAM
+	out	__RAMPZ__, __zero_reg__
+#endif /* ELPM && RAMPD */
 ENDF __do_copy_data
 #endif /* L_copy_data */
 
@@ -1920,7 +1924,7 @@ ENDF __do_clear_bss
 #ifdef L_ctors
 	.section .init6,"ax",@progbits
 DEFUN __do_global_ctors
-#if defined(__AVR_HAVE_RAMPZ__)
+#if defined(__AVR_HAVE_ELPM__)
 	ldi	r17, hi8(__ctors_start)
 	ldi	r28, lo8(__ctors_end)
 	ldi	r29, hi8(__ctors_end)
@@ -1939,7 +1943,7 @@ DEFUN __do_global_ctors
 	ldi	r24, hh8(__ctors_start)
 	cpc	r16, r24
 	brne	.L__do_global_ctors_loop
-#else
+#else /* !ELPM */
 	ldi	r17, hi8(__ctors_start)
 	ldi	r28, lo8(__ctors_end)
 	ldi	r29, hi8(__ctors_end)
@@ -1953,14 +1957,14 @@ DEFUN __do_global_ctors
 	cpi	r28, lo8(__ctors_start)
 	cpc	r29, r17
 	brne	.L__do_global_ctors_loop
-#endif /* defined(__AVR_HAVE_RAMPZ__) */
+#endif /* defined(__AVR_HAVE_ELPM__) */
 ENDF __do_global_ctors
 #endif /* L_ctors */
 
 #ifdef L_dtors
 	.section .fini6,"ax",@progbits
 DEFUN __do_global_dtors
-#if defined(__AVR_HAVE_RAMPZ__)
+#if defined(__AVR_HAVE_ELPM__)
 	ldi	r17, hi8(__dtors_end)
 	ldi	r28, lo8(__dtors_start)
 	ldi	r29, hi8(__dtors_start)
@@ -1979,7 +1983,7 @@ DEFUN __do_global_dtors
 	ldi	r24, hh8(__dtors_end)
 	cpc	r16, r24
 	brne	.L__do_global_dtors_loop
-#else
+#else /* !ELPM */
 	ldi	r17, hi8(__dtors_end)
 	ldi	r28, lo8(__dtors_start)
 	ldi	r29, hi8(__dtors_start)
@@ -1993,7 +1997,7 @@ DEFUN __do_global_dtors
 	cpi	r28, lo8(__dtors_end)
 	cpc	r29, r17
 	brne	.L__do_global_dtors_loop
-#endif /* defined(__AVR_HAVE_RAMPZ__) */
+#endif /* defined(__AVR_HAVE_ELPM__) */
 ENDF __do_global_dtors
 #endif /* L_dtors */
 
@@ -2001,18 +2005,21 @@ ENDF __do_global_dtors
 
 #ifdef L_tablejump_elpm
 DEFUN __tablejump_elpm__
-#if defined (__AVR_HAVE_ELPM__)
-#if defined (__AVR_HAVE_LPMX__)
+#if defined (__AVR_HAVE_ELPMX__)
 	elpm	__tmp_reg__, Z+
 	elpm	r31, Z
 	mov	r30, __tmp_reg__
+#if defined (__AVR_HAVE_RAMPD__)
+	;; Reset RAMPZ to 0 so that EBI devices don't read garbage from RAM
+	out	__RAMPZ__, __zero_reg__
+#endif /* RAMPD */
 #if defined (__AVR_HAVE_EIJMP_EICALL__)
 	eijmp
 #else
 	ijmp
-#endif

[Patch,AVR]: Tweak a+2*b

2012-03-04 Thread Georg-Johann Lay
This patch adds a straightforward combine pattern and split for int + 2*byte,
as frequently seen with accesses to int arrays with a byte offset.

Ok for trunk?

Johann

* config/avr/avr.md (*umaddqihi4.2): New insn-and-split.
Index: config/avr/avr.md
===
--- config/avr/avr.md	(revision 184887)
+++ config/avr/avr.md	(working copy)
@@ -1692,6 +1692,30 @@ (define_insn *any_extend:extend_suan
 
 ;; Handle small constants
 
+;; Special case of a += 2*b as frequently seen with accesses to int arrays.
+;; This is shorter, faster than MUL and has lower register pressure.
+
+(define_insn_and_split "*umaddqihi4.2"
+  [(set (match_operand:HI 0 "register_operand"  "=r")
+(plus:HI (mult:HI (zero_extend:HI (match_operand:QI 1 "register_operand" "r"))
+  (const_int 2))
+ (match_operand:HI 2 "register_operand"  "r")))]
+  "AVR_HAVE_MUL
+   && !reload_completed
+   && !reg_overlap_mentioned_p (operands[0], operands[1])"
+  { gcc_unreachable(); }
+  "&& 1"
+  [(set (match_dup 0)
+(match_dup 2))
+   ; *addhi3_zero_extend
+   (set (match_dup 0)
+(plus:HI (zero_extend:HI (match_dup 1))
+ (match_dup 0)))
+   ; *addhi3_zero_extend
+   (set (match_dup 0)
+(plus:HI (zero_extend:HI (match_dup 1))
+ (match_dup 0)))])
+
 ;; umaddqihi4.uconst
 ;; maddqihi4.sconst
 (define_insn_and_split *extend_umaddqihi4.extend_suconst
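The arithmetic behind the split can be sketched in ordinary C++: `a + 2*b` with a zero-extended byte `b` is computed as a move followed by two zero-extending 16-bit adds, with no multiply. The function names below are hypothetical and only model the three RTL sets in the pattern; this is not code from the AVR backend.

```cpp
#include <cassert>
#include <cstdint>

// Models the split: dest = a; dest += zext(b); dest += zext(b).
// The result wraps modulo 2^16, like a 16-bit (HImode) register.
inline std::uint16_t add_twice_byte(std::uint16_t a, std::uint8_t b) {
    std::uint16_t dest = a;                          // (set (match_dup 0) (match_dup 2))
    dest = static_cast<std::uint16_t>(dest + b);     // first zero-extending add
    dest = static_cast<std::uint16_t>(dest + b);     // second zero-extending add
    return dest;
}

// Reference form using an explicit multiplication, for comparison.
inline std::uint16_t add_twice_byte_mul(std::uint16_t a, std::uint8_t b) {
    return static_cast<std::uint16_t>(a + 2u * b);
}

// Checks that the two forms agree for a few inputs, including wrap-around.
inline void demo_add_twice_byte() {
    assert(add_twice_byte(1000, 200) == add_twice_byte_mul(1000, 200));
    assert(add_twice_byte(0xFFFF, 1) == add_twice_byte_mul(0xFFFF, 1));
    assert(add_twice_byte(0, 255) == 510);
}
```

The sketch also suggests why the pattern requires that the destination not overlap operand 1: the initial move into the destination would otherwise overwrite `b` before it is added.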


[PATCH, i386]: Improve zero_extend patterns

2012-03-04 Thread Uros Bizjak
Hello!

Attached patch improves zero_extend patterns by:
- removing flags reg clobber from zero_extendsidi patterns for 32bit
targets. Everything, including movl $0, mem can be split without
using flags reg clobber.
- removing intermediate *zero_extend*2_movzbl_and patterns. We do not
need to remove any fake clobbers in !TARGET_ZERO_EXTEND_WITH_AND case
anymore
- adding o,0 and x,x register alternatives. We can split matching
memory to load 0 in highpart for 64bit and 32bit targets, and movd
zero extends also in xmm-xmm case
- truly splitting AND RTXes to zero_extend RTXes when appropriate
(but only in !TARGET_ZERO_EXTEND_WITH_AND case), again removing
unneeded flags reg clobbers
- fixing TARGET_ZERO_EXTEND_WITH_AND peephole2
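As a rough illustration of why no flags clobber is needed on 32-bit targets: zero-extending a 32-bit value to 64 bits amounts to a plain move of the low word plus loading 0 into the high word, and in the matching-memory (o,0) case only the high word needs to be written. The sketch below uses a hypothetical `DoubleWord` struct to model a 64-bit value held as two 32-bit words. It is not backend code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a 64-bit value stored as two 32-bit words,
// as it would be on a 32-bit target.
struct DoubleWord {
    std::uint32_t lo;
    std::uint32_t hi;
};

// General case: move the source into the low word, load 0 into the high word.
inline DoubleWord zero_extend(std::uint32_t x) { return DoubleWord{x, 0}; }

// Matching-memory case: the low word already holds the value, so only
// the high word has to be stored.
inline void zero_extend_in_place(DoubleWord& d) { d.hi = 0; }

// Reassembles the two words into a single 64-bit integer.
inline std::uint64_t to_u64(const DoubleWord& d) {
    return (static_cast<std::uint64_t>(d.hi) << 32) | d.lo;
}

inline void demo_zero_extend() {
    assert(to_u64(zero_extend(0xFFFFFFFFu)) == 0xFFFFFFFFull);
    DoubleWord d{0x12345678u, 0xDEADBEEFu};
    zero_extend_in_place(d);
    assert(to_u64(d) == 0x12345678ull);
}
```

Neither step involves arithmetic, which is consistent with the statement that everything can be split without touching the flags register.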

2012-03-04  Uros Bizjak  ubiz...@gmail.com

* config/i386/constraints.md (Ya): New internal constraint.
* config/i386/i386.md (zero_extendsidi2): Remove expansion.
(*zero_extendsidi2_rex64): Add x,x alternative.
(*zero_extendsidi2): Ditto.  Add o,0 alternative.
Remove flags reg clobber.  Adjust corresponding splits.
(zero_extend<mode>si2): Macroize expander from zero_extendhisi2 and
zero_extendqisi2 expanders using SWI12 mode iterator.
(zero_extend<mode>si2_and): Macroize insn from
zero_extendhisi2_and and zero_extendqisi2_and.  Merge corresponding
splitters.
(*zero_extend<mode>si2):  Macroize insn from
*zero_extendhisi2_movzbl and *zero_extendqisi2_movzbl.
(*zero_extend*2_movzbl_and): Remove insn patterns.
(zero_extendqihi2_and): Merge corresponding splitter.
(*zero_extendqihi2): Rename from *zero_extendqihi2_movzbl.
(*zero_extend*2_movzbl_and): Remove insn patterns.
(*anddi_1): Split TYPE_IMOVX instructions.
(*andsi_1): Use Ya for alternative 2.  Split TYPE_IMOVX instructions.
(*andhi_1): Ditto.
(and-zext splitter): Add splitter pattern.
(zero extend with andsi3 splitter): Adjust zero_extend pattern.

Patch was tested on x86_64-pc-linux-gnu {,-m32} and committed to mainline SVN.

Uros.
Index: config/i386/constraints.md
===
--- config/i386/constraints.md  (revision 184886)
+++ config/i386/constraints.md  (working copy)
@@ -89,6 +89,7 @@
 ;;  z  First SSE register.
 ;;  i  SSE2 inter-unit moves enabled
 ;;  m  MMX inter-unit moves enabled
+;;  a  Integer register when zero extensions with AND are disabled
 ;;  p  Integer register when TARGET_PARTIAL_REG_STALL is disabled
 ;;  d  Integer register when integer DFmode moves are enabled
 ;;  x  Integer register when integer XFmode moves are enabled
@@ -108,6 +109,11 @@
  "TARGET_PARTIAL_REG_STALL ? NO_REGS : GENERAL_REGS"
  "@internal Any integer register when TARGET_PARTIAL_REG_STALL is disabled.")
 
+(define_register_constraint "Ya"
+ "TARGET_ZERO_EXTEND_WITH_AND && optimize_function_for_speed_p (cfun)
+  ? NO_REGS : GENERAL_REGS"
+ "@internal Any integer register when zero extensions with AND are disabled.")
+
 (define_register_constraint "Yd"
  "(TARGET_64BIT
    || (TARGET_INTEGER_DFMODE_MOVES && optimize_function_for_speed_p (cfun)))
Index: config/i386/i386.md
===
--- config/i386/i386.md (revision 184886)
+++ config/i386/i386.md (working copy)
@@ -3371,20 +3371,14 @@
 
 (define_expand "zero_extendsidi2"
   [(set (match_operand:DI 0 "nonimmediate_operand" "")
-   (zero_extend:DI (match_operand:SI 1 "nonimmediate_operand" "")))]
-  ""
-{
-  if (!TARGET_64BIT)
-{
-  emit_insn (gen_zero_extendsidi2_1 (operands[0], operands[1]));
-  DONE;
-}
-})
+   (zero_extend:DI (match_operand:SI 1 "nonimmediate_operand" "")))])
 
 (define_insn "*zero_extendsidi2_rex64"
-  [(set (match_operand:DI 0 "nonimmediate_operand"  "=r,o,?*Ym,?*y,?*Yi,*x")
+  [(set (match_operand:DI 0 "nonimmediate_operand"
+   "=r,o,?*Ym,?*y,?*Yi,!*x")
 (zero_extend:DI
-(match_operand:SI 1 "nonimmediate_operand" "rm,0,r   ,m  ,r   ,m")))]
+(match_operand:SI 1 "nonimmediate_operand"
+   "rm,0,r   ,m  ,r   ,m*x")))]
   "TARGET_64BIT"
   "@
 mov{l}\t{%1, %k0|%k0, %1}
@@ -3393,24 +3387,17 @@
movd\t{%1, %0|%0, %1}
%vmovd\t{%1, %0|%0, %1}
 %vmovd\t{%1, %0|%0, %1}"
-  [(set_attr "type" "imovx,imov,mmxmov,mmxmov,ssemov,ssemov")
+  [(set_attr "isa" "*,*,*,*,*,sse2")
+   (set_attr "type" "imovx,multi,mmxmov,mmxmov,ssemov,ssemov")
 (set_attr "prefix" "orig,*,orig,orig,maybe_vex,maybe_vex")
 (set_attr "prefix_0f" "0,*,*,*,*,*")
-   (set_attr "mode" "SI,DI,DI,DI,TI,TI")])
+   (set_attr "mode" "SI,SI,DI,DI,TI,TI")])
 
-(define_split
-  [(set (match_operand:DI 0 "memory_operand" "")
-   (zero_extend:DI (match_dup 0)))]
-  "TARGET_64BIT"
-  [(set (match_dup 4) (const_int 0))]
-  "split_double_mode (DImode, &operands[0], 1, &operands[3], &operands[4]);")
-
-;; %%% Kill me once multi-word ops are sane.
-(define_insn "zero_extendsidi2_1"
-  [(set (match_operand:DI 0 "nonimmediate_operand" 

Re: [4.7][SH] Binary compatibility with atomic_test_and_test_trueval != 1

2012-03-04 Thread Oleg Endo
On Sat, 2012-03-03 at 10:31 -0800, Richard Henderson wrote:
> On 03/02/2012 10:11 AM, Richard Henderson wrote:
>> I'm in the process of sanity testing this on x86_64 with trueval set to 
>> 0x80.
>> Jakub, ok for 4.7 branch if it passes?
>> 
>> * optabs.c (expand_atomic_test_and_set): Honor
>> atomic_test_and_set_trueval even when atomic_test_and_set
>> optab is not in use.
> 
> I've committed this patch to mainline.  I still think it ought to 
> go onto the 4.7 branch...
> 

Attached is a slightly modified version of the patch from
http://gcc.gnu.org/ml/gcc-patches/2012-03/msg00085.html

I have removed the signed char weirdo and adjusted the comment above
TARGET_ATOMIC_TEST_AND_SET_TRUEVAL accordingly.

Tested by compiling some test functions that use __atomic_test_and_set /
__GCC_ATOMIC_TEST_AND_SET_TRUEVAL with various SH atomic option
combinations and looking at the output asm.
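A test function of the kind described might look roughly like the sketch below. It is not one of the actual test functions, which are not shown in this message. `lock_byte`, `try_acquire`, `release`, and `demo_test_and_set` are hypothetical names, and the code assumes a GCC-compatible compiler that provides the `__atomic_test_and_set` and `__atomic_clear` built-ins and the `__GCC_ATOMIC_TEST_AND_SET_TRUEVAL` predefined macro.

```cpp
#include <cassert>

// Assumption: GCC-compatible compiler with the __atomic built-ins.
// The byte is deliberately only compared against zero: the "set" value
// stored by __atomic_test_and_set is target-defined (0x80 for SH with
// this patch, 1 on many other targets).
static unsigned char lock_byte = 0;

// Returns true if the lock was free and has now been taken.
// __atomic_test_and_set returns true when the byte was already set.
inline bool try_acquire() {
    return !__atomic_test_and_set(&lock_byte, __ATOMIC_ACQUIRE);
}

// Clears the byte again so that the next try_acquire succeeds.
inline void release() {
    __atomic_clear(&lock_byte, __ATOMIC_RELEASE);
}

// The target's "set" value as seen by user code.
inline int trueval() { return __GCC_ATOMIC_TEST_AND_SET_TRUEVAL; }

inline void demo_test_and_set() {
    assert(try_acquire());    // first attempt takes the lock
    assert(!try_acquire());   // second attempt sees it already set
    assert(lock_byte != 0);
    release();
    assert(lock_byte == 0);
    assert(trueval() != 0);   // the set value must be nonzero
}
```

Compiling such a function for SH with and without the option would then show either the `tas.b` sequence or the software fallback in the generated assembly.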

OK to apply to trunk?

Richard, could you also please take the
TARGET_ATOMIC_TEST_AND_SET_TRUEVAL hunk from this patch for the 4.7
branch?

Cheers,
Oleg


2012-03-04  Oleg Endo  olege...@gcc.gnu.org

* config/sh/sh.h (TARGET_ATOMIC_TEST_AND_SET_TRUEVAL): New hook.
* config/sh/sync.md (atomic_test_and_set): New expander.
(tasb, atomic_test_and_set_soft): New insns.
* config/sh/sh.opt (menable-tas): New option.
* doc/invoke.texi (SH Options): Document it.
Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi	(revision 184877)
+++ gcc/doc/invoke.texi	(working copy)
@@ -887,7 +887,8 @@
 -mdivsi3_libfunc=@var{name} -mfixed-range=@var{register-range} @gol
 -madjust-unroll -mindexed-addressing -mgettrcost=@var{number} -mpt-fixed @gol
 -maccumulate-outgoing-args -minvalid-symbols -msoft-atomic @gol
--mbranch-cost=@var{num} -mcbranchdi -mcmpeqdi -mfused-madd -mpretend-cmove}
+-mbranch-cost=@var{num} -mcbranchdi -mcmpeqdi -mfused-madd -mpretend-cmove @gol
+-menable-tas}
 
 @emph{Solaris 2 Options}
 @gccoptlist{-mimpure-text  -mno-impure-text @gol
@@ -17823,6 +17824,15 @@
 This option is enabled by default when the target is @code{sh-*-linux*}.
 For details on the atomic built-in functions see @ref{__atomic Builtins}.
 
+@item -menable-tas
+@opindex menable-tas
+Generate the @code{tas.b} opcode for @code{__atomic_test_and_set}.
+Notice that depending on the particular hardware and software configuration
+this can degrade overall performance due to the operand cache line flushes
+that are implied by the @code{tas.b} instruction.  On multi-core SH4A
+processors the @code{tas.b} instruction must be used with caution since it
+can result in data corruption for certain cache configurations.
+
 @item -mspace
 @opindex mspace
 Optimize for space instead of speed.  Implied by @option{-Os}.
Index: gcc/config/sh/sh.h
===
--- gcc/config/sh/sh.h	(revision 184877)
+++ gcc/config/sh/sh.h	(working copy)
@@ -2473,4 +2473,10 @@
 /* FIXME: middle-end support for highpart optimizations is missing.  */
 #define high_life_started reload_in_progress
 
+/* The tas.b instruction sets the 7th bit in the byte, i.e. 0x80.
+   This value is used by optabs.c atomic op expansion code as well as in 
+   sync.md.  */
+#undef TARGET_ATOMIC_TEST_AND_SET_TRUEVAL
+#define TARGET_ATOMIC_TEST_AND_SET_TRUEVAL 0x80
+
 #endif /* ! GCC_SH_H */
Index: gcc/config/sh/sync.md
===
--- gcc/config/sh/sync.md	(revision 184877)
+++ gcc/config/sh/sync.md	(working copy)
@@ -404,3 +404,61 @@
 	 1:	mov	r1,r15;
 }
   [(set_attr "length" "18")])
+
+(define_expand "atomic_test_and_set"
+  [(match_operand:SI 0 "register_operand" "")		;; bool result output
+   (match_operand:QI 1 "memory_operand" "")		;; memory
+   (match_operand:SI 2 "const_int_operand" "")]		;; model
+  "(TARGET_SOFT_ATOMIC || TARGET_ENABLE_TAS) && !TARGET_SHMEDIA"
+{
+  rtx addr = force_reg (Pmode, XEXP (operands[1], 0));
+
+  if (TARGET_ENABLE_TAS)
+emit_insn (gen_tasb (addr));
+  else
+{
+  rtx val = force_reg (QImode, 
+			   gen_int_mode (TARGET_ATOMIC_TEST_AND_SET_TRUEVAL,
+	 QImode));
+  emit_insn (gen_atomic_test_and_set_soft (addr, val));
+}
+
+  /* The result of the test op is the inverse of what we are
+ supposed to return.  Thus invert the T bit.  The inversion will be
+ potentially optimized away and integrated into surrounding code.  */
+  emit_insn (gen_movnegt (operands[0]));
+  DONE;
+})
+
+(define_insn "tasb"
+  [(set (reg:SI T_REG)
+	(eq:SI (mem:QI (match_operand:SI 0 "register_operand" "r"))
+	   (const_int 0)))
+   (set (mem:QI (match_dup 0))
+	(unspec:QI [(const_int 128)] UNSPEC_ATOMIC))]
+  "TARGET_ENABLE_TAS && !TARGET_SHMEDIA"
+  "tas.b	@%0"
+  [(set_attr "insn_class" "co_group")])
+
+(define_insn "atomic_test_and_set_soft"
+  [(set (reg:SI T_REG)
+	(eq:SI (mem:QI (match_operand:SI 0 "register_operand" "u"))
+	   (const_int 0)))
+   (set (mem:QI (match_dup 0))
+	(unspec:QI 

New Swedish PO file for 'gcc' (version 4.7-b20120128)

2012-03-04 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the Swedish team of translators.  The file is available at:

http://translationproject.org/latest/gcc/sv.po

(This file, 'gcc-4.7-b20120128.sv.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

http://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

http://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.
coordina...@translationproject.org



Re: PATCH [1/n] addr32: Properly use Pmode and word_mode

2012-03-04 Thread Uros Bizjak
On Sat, Nov 12, 2011 at 3:19 AM, H.J. Lu hongjiu...@intel.com wrote:

> The current x32 implementation uses LEAs to convert 32bit address to
> 64bit.  However, we can use addr32 prefix to use 32bit address directly.
> It improves performance by 5% in SPEC CPU 2K/2006.  All changes are done
> in x86 backend, except for a small unwind library assert change:
>
> http://gcc.gnu.org/ml/gcc-patches/2011-11/msg01555.html
>
> due to return column size difference.
>
> For x86-64, Pmode can be 32bit or 64bit, but word_mode is always 64bit.
> push/pop only work on word_mode.  Also string instructions take Pmode
> pointers.
>
> I will submit a set of patches to use 32bit Pmode for x32.  This is
> the first patch to properly use Pmode and word_mode.  It also adds
> addr32 prefix to string instructions if needed.  OK for trunk?

First round of review comments:

@@ -10252,14 +10260,18 @@ ix86_expand_prologue (void)
   if (r10_live && eax_live)
 {
  t = choose_baseaddr (m->fs.sp_offset - allocate);
- emit_move_insn (r10, gen_frame_mem (Pmode, t));
+ emit_move_insn (gen_rtx_REG (word_mode, R10_REG),
+ gen_frame_mem (word_mode, t));
  t = choose_baseaddr (m->fs.sp_offset - allocate - UNITS_PER_WORD);
- emit_move_insn (eax, gen_frame_mem (Pmode, t));
+ emit_move_insn (gen_rtx_REG (word_mode, AX_REG),
+ gen_frame_mem (word_mode, t));
}
   else if (eax_live || r10_live)
{
  t = choose_baseaddr (m->fs.sp_offset - allocate);
- emit_move_insn ((eax_live ? eax : r10), gen_frame_mem (Pmode, t));
+ emit_move_insn (gen_rtx_REG (word_mode,
+  (eax_live ? AX_REG : R10_REG)),
+ gen_frame_mem (word_mode, t));
}
 }
   gcc_assert (m->fs.sp_offset == frame.stack_pointer_offset);

Please just change

  rtx eax = gen_rtx_REG (Pmode, AX_REG);

and
  r10 = gen_rtx_REG (Pmode, R10_REG);

around line 10305 and line 10324. You also have gen_push in Pmode,
just following the former line. Please review how AX and R10 are defined
and used throughout ix86_expand_prologue.

@@ -11060,8 +11072,8 @@ ix86_expand_split_stack_prologue (void)
{
  rtx rax;

- rax = gen_rtx_REG (Pmode, AX_REG);
- emit_move_insn (rax, reg10);
+ rax = gen_rtx_REG (word_mode, AX_REG);
+ emit_move_insn (rax, gen_rtx_REG (word_mode, R10_REG));
  use_reg (call_fusage, rax);
}

Same here. Please review how AX, R10 and R11 are defined and used.
Also, this needs review from the split-stack author.

@@ -11388,6 +11400,11 @@ ix86_decompose_address (rtx addr, struct
ix86_address *out)
   else
 disp = addr;   /* displacement */

+  /* Since address override works only on the (reg) part in fs:(reg),
+ we can't use it as memory operand.  */
+  if (Pmode != word_mode && seg == SEG_FS && (base || index))
+return 0;

Can you explain the above some more? IMO, if the override works on
(reg) part, this is just what we want.

@@ -13637,7 +13665,8 @@ ix86_print_operand (FILE *file, rtx x, int code)
  gcc_unreachable ();
}

- ix86_print_operand (file, x, 0);
+ ix86_print_operand (file, x,
+ TARGET_64BIT && REG_P (x) ? 'q' : 0);
  return;

This is too big a hammer. You output everything in DImode, so even if
the address is in fact in SImode, you output it in DImode with an
addr32 prefix.

Uros.


Re: [PATCH 02/10] addr32: Only handle zero-extended DImode addresses

2012-03-04 Thread Uros Bizjak
On Fri, Mar 2, 2012 at 9:38 PM, H.J. Lu hongjiu...@intel.com wrote:

> We only need to handle zero-extended addresses in DImode.
> OK for trunk?
>
> 2012-03-02  H.J. Lu  hongjiu...@intel.com
>
>        * config/i386/i386.c (ix86_print_operand_address): Only handle
>        zero-extended DImode addresses.

OK.

Thanks,
Uros.


Re: [PATCH 06/10] addr32: Check Pmode to set adjust_stack_insn

2012-03-04 Thread Uros Bizjak
On Fri, Mar 2, 2012 at 9:58 PM, H.J. Lu hongjiu...@intel.com wrote:
> Since stack register may be in SImode for TARGET_64BIT, this patch
> checks Pmode to set adjust_stack_insn.  OK for trunk?
>
> 2012-03-02  H.J. Lu  hongjiu...@intel.com
>
>        * config/i386/i386.c (ix86_expand_prologue): Check Pmode to set
>        adjust_stack_insn.

OK.

Thanks,
Uros.


Re: [PATCH 08/10] addr32: Check Pmode instead of TARGET_64BIT

2012-03-04 Thread Uros Bizjak
On Fri, Mar 2, 2012 at 10:04 PM, H.J. Lu hongjiu...@intel.com wrote:

> Since stack register may be in SImode for TARGET_64BIT, this patch
> checks Pmode to adjust stack in proper mode.  OK for trunk?
>
> 2012-03-02  H.J. Lu  hongjiu...@intel.com
>
>        * config/i386/i386.c (pro_epilogue_adjust_stack): Check Pmode
>        instead of TARGET_64BIT.

OK.

Thanks,
Uros.


Re: libgo patch committed: Update to weekly.2012-02-22 release

2012-03-04 Thread Uros Bizjak
Hello!

It looks like this patch introduced:

/home/uros/gcc-build-go/x86_64-unknown-linux-gnu/32/libgo/.libs/libgo.so:
undefined reference to `libgo_runtime.runtime.Callers'
collect2: error: ld returned 1 exit status

All libgo tests fail due to this undefined reference.

Uros.


[PATCH, i386]: Declare some variables bool

2012-03-04 Thread Uros Bizjak
Hello!

2012-03-04  Uros Bizjak  ubiz...@gmail.com

* config/i386/i386.c (ix86_print_operand) <case '+'>: Declare
taken and cputaken as bool.

Tested on x86_64-pc-linux-gnu {,-m32}, committed to mainline as obvious.

Uros.
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 184886)
+++ config/i386/i386.c  (working copy)
@@ -14147,8 +14148,9 @@ ix86_print_operand (FILE *file, rtx x, int code)
 if (pred_val < REG_BR_PROB_BASE * 45 / 100
 || pred_val > REG_BR_PROB_BASE * 55 / 100)
   {
-   int taken = pred_val > REG_BR_PROB_BASE / 2;
-   int cputaken = final_forward_branch_p (current_output_insn) == 0;
+   bool taken = pred_val > REG_BR_PROB_BASE / 2;
+   bool cputaken
+ = final_forward_branch_p (current_output_insn) == 0;
 
/* Emit hints only in the case default branch prediction
   heuristics would fail.  */


Re: PATCH [1/n] addr32: Properly use Pmode and word_mode

2012-03-04 Thread H.J. Lu
On Sun, Mar 4, 2012 at 12:09 PM, Uros Bizjak ubiz...@gmail.com wrote:
 On Sat, Nov 12, 2011 at 3:19 AM, H.J. Lu hongjiu...@intel.com wrote:

 The current x32 implementation uses LEAs to convert 32bit address to
 64bit.  However, we can use addr32 prefix to use 32bit address directly.
 It improves performance by 5% in SPEC CPU 2K/2006.  All changes are done
 in x86 backend, except for a small unwind library assert change:

 http://gcc.gnu.org/ml/gcc-patches/2011-11/msg01555.html

 due to return column size difference.

 For x86-64, Pmode can be 32bit or 64bit, but word_mode is always 64bit.
 push/pop only work on word_mode.  Also string instructions take Pmode
 pointers.

 I will submit a set of patches to use 32bit Pmode for x32.  This is
 the first patch to properly use Pmode and word_mode.  It also adds
 addr32 prefix to string instructions if needed.  OK for trunk?

 First round of review comments:

 @@ -10252,14 +10260,18 @@ ix86_expand_prologue (void)
       if (r10_live && eax_live)
         {
           t = choose_baseaddr (m->fs.sp_offset - allocate);
 -         emit_move_insn (r10, gen_frame_mem (Pmode, t));
 +         emit_move_insn (gen_rtx_REG (word_mode, R10_REG),
 +                         gen_frame_mem (word_mode, t));
           t = choose_baseaddr (m->fs.sp_offset - allocate - UNITS_PER_WORD);
 -         emit_move_insn (eax, gen_frame_mem (Pmode, t));
 +         emit_move_insn (gen_rtx_REG (word_mode, AX_REG),
 +                         gen_frame_mem (word_mode, t));
         }
       else if (eax_live || r10_live)
         {
           t = choose_baseaddr (m->fs.sp_offset - allocate);
 -         emit_move_insn ((eax_live ? eax : r10), gen_frame_mem (Pmode, t));
 +         emit_move_insn (gen_rtx_REG (word_mode,
 +                                      (eax_live ? AX_REG : R10_REG)),
 +                         gen_frame_mem (word_mode, t));
         }
     }
   gcc_assert (m->fs.sp_offset == frame.stack_pointer_offset);

 Please just change

      rtx eax = gen_rtx_REG (Pmode, AX_REG);

 and
          r10 = gen_rtx_REG (Pmode, R10_REG);

This is done on purpose.  We manipulate stack using AX and R10 as
scratch registers in Pmode since stack is in Pmode.  But AX and R10
registers have to be saved and restored in word_mode.

 around line 10305 and line 10324. You also have gen_push in Pmode,

In those places, they just want to push a register on stack to save it.
Callers don't care how it is done.  I changed gen_push  to allow
Pmode by always pushing registers in word_mode:

  if (REG_P (arg) && GET_MODE (arg) != word_mode)
    arg = gen_rtx_REG (word_mode, REGNO (arg));
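
The effect of that coercion can be shown with a toy model.  Nothing below is
GCC API: toy_mode, toy_reg, toy_push_operand and toy_push_size are
hypothetical names, and a mode is represented simply by its size in bytes.
The point is only that the same hard register is pushed, but always at
word_mode width, whatever mode the caller happened to use.

```c
#include <assert.h>

/* Toy stand-ins for machine modes; the value is the size in bytes.  */
enum toy_mode { TOY_SImode = 4, TOY_DImode = 8 };

/* A register reference: a mode plus a hard register number.  */
struct toy_reg
{
  enum toy_mode mode;
  unsigned regno;
};

/* Return the operand that would actually be pushed: the same hard
   register, re-expressed in WORD_MODE if the caller used another mode.  */
struct toy_reg
toy_push_operand (struct toy_reg arg, enum toy_mode word_mode)
{
  assert (word_mode == TOY_SImode || word_mode == TOY_DImode);
  if (arg.mode != word_mode)
    arg.mode = word_mode;
  return arg;
}

/* Number of bytes by which the push moves the stack pointer.  */
unsigned
toy_push_size (struct toy_reg arg, enum toy_mode word_mode)
{
  return (unsigned) toy_push_operand (arg, word_mode).mode;
}
```

In this model, pushing an SImode reference to register 0 with a DImode
word_mode still moves the stack by 8 bytes, which is why callers do not
need to care which mode they pass.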

 just following the former line. Please review the whole
 ix86_expand_prologue how AX and R10 are defined and used.

The same issue applies here.

 @@ -11060,8 +11072,8 @@ ix86_expand_split_stack_prologue (void)
        {
          rtx rax;

 -         rax = gen_rtx_REG (Pmode, AX_REG);
 -         emit_move_insn (rax, reg10);
 +         rax = gen_rtx_REG (word_mode, AX_REG);
 +         emit_move_insn (rax, gen_rtx_REG (word_mode, R10_REG));
          use_reg (call_fusage, rax);
        }

 Same here. Please review how AX, R10 and R11 are defined and used.
 Also, this needs review from split stack author.

I CCed Ian. That is the same issue.  We need some scratch registers
in Pmode to manipulate stack.  But we have to save and restore them
in word_mode, not Pmode.

 @@ -11388,6 +11400,11 @@ ix86_decompose_address (rtx addr, struct
 ix86_address *out)
   else
     disp = addr;                       /* displacement */

 +  /* Since address override works only on the (reg) part in fs:(reg),
 +     we can't use it as memory operand.  */
 +  if (Pmode != word_mode && seg == SEG_FS && (base || index))
 +    return 0;

 Can you explain the above some more? IMO, if the override works on
 (reg) part, this is just what we want.

When Pmode == SImode, we have

fs segment register == 0x1001

and

base register (SImode) == -1 (0xffffffff).

We are expecting the address to be 0x1001 - 1 == 0x1000.  But what we get
is 0x1001 + 0xffffffff, not 0x1000, since the 0x67 address prefix only
applies to the base register, zero-extending 0xffffffff to 64 bits.
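
The arithmetic can be checked with a small stand-alone sketch.  The function
names are hypothetical, and the model assumes only what is described above:
the prefix truncates the base register to 32 bits and zero-extends it, and
the segment base is then added at full 64-bit width.

```c
#include <assert.h>
#include <stdint.h>

/* Address formed when only the base register is treated as a 32-bit
   value (zero-extended) and the fs base is added afterwards in 64 bits.  */
uint64_t
addr_with_addr32_prefix (uint64_t fs_base, int32_t base)
{
  uint64_t zero_extended = (uint32_t) base;
  return fs_base + zero_extended;
}

/* Address that 32-bit pointer arithmetic was meant to produce: the
   whole sum wraps around at 32 bits.  */
uint64_t
addr_intended (uint64_t fs_base, int32_t base)
{
  assert (fs_base <= UINT32_MAX);
  return (uint32_t) ((uint32_t) fs_base + (uint32_t) base);
}
```

For fs_base == 0x1001 and base == -1 the first function yields 0x100001000
and the second yields 0x1000, which is the mismatch the new check in
ix86_decompose_address is guarding against.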

 @@ -13637,7 +13665,8 @@ ix86_print_operand (FILE *file, rtx x, int code)
              gcc_unreachable ();
            }

 -         ix86_print_operand (file, x, 0);
 +         ix86_print_operand (file, x,
 +                             TARGET_64BIT && REG_P (x) ? 'q' : 0);
           return;

 This is too big a hammer. You output everything in DImode, so even if
 the address is in fact in SImode, you output it in DImode with an
 addr32 prefix.


%A is only used in jmp\t%A0 and there is no jmp *%eax instruction in
64bit mode, only jmp *%rax:

[hjl@gnu-4 tmp]$ cat j.s
jmp *%eax
jmp *%rax
[hjl@gnu-4 tmp]$ gcc -c j.s
j.s: Assembler messages:
j.s:1: Error: operand type mismatch for `jmp'
[hjl@gnu-4 tmp]$

It is OK for x32 since the upper 32bits are zero when we are loading %eax.


-- 
H.J.


Re: [PATCH 02/10] addr32: Output REX prefix for UNSPEC_GOTNTPOFF

2012-03-04 Thread Uros Bizjak
On Fri, Mar 2, 2012 at 9:36 PM, H.J. Lu hongjiu...@intel.com wrote:

 X86-64 linker optimizes TLS_MODEL_INITIAL_EXEC to TLS_MODEL_LOCAL_EXEC
 by checking

        movq foo@gottpoff(%rip), %reg

 and

        addq foo@gottpoff(%rip), %reg

 It uses the REX prefix to avoid the last byte of the previous
 instruction.  With 32bit Pmode, we may not have the REX prefix and
 the last byte of the previous instruction may be an offset, which
 may look like a REX prefix.  IE->LE optimization will generate corrupted
 binary.  This patch makes sure we always output a REX prefix for
 UNSPEC_GOTNTPOFF.  OK for trunk?

No, please implement this using UNSPEC in the same way as
tls_initial_exec_64_sun implements Sun linker quirk.

Uros.


Re: [PATCH 02/10] addr32: Output REX prefix for UNSPEC_GOTNTPOFF

2012-03-04 Thread H.J. Lu
On Sun, Mar 4, 2012 at 2:12 PM, Uros Bizjak ubiz...@gmail.com wrote:
 On Fri, Mar 2, 2012 at 9:36 PM, H.J. Lu hongjiu...@intel.com wrote:

 X86-64 linker optimizes TLS_MODEL_INITIAL_EXEC to TLS_MODEL_LOCAL_EXEC
 by checking

        movq foo@gottpoff(%rip), %reg

 and

        addq foo@gottpoff(%rip), %reg

 It uses the REX prefix to avoid the last byte of the previous
 instruction.  With 32bit Pmode, we may not have the REX prefix and
 the last byte of the previous instruction may be an offset, which
 may look like a REX prefix.  IE->LE optimization will generate corrupted
 binary.  This patch makes sure we always output a REX prefix for
 UNSPEC_GOTNTPOFF.  OK for trunk?

 No, please implement this using UNSPEC in the same way as
 tls_initial_exec_64_sun implements Sun linker quirk.


I am not sure how it can be done with UNSPEC cleanly.
Unlike the Sun linker issue, this is an instruction encoding
issue.  At instruction pattern level, there is no difference
between x32 and x86-64. I need to make sure that there is
always one and only one REX prefix with UNSPEC_GOTNTPOFF.
If REG is r8-r15, we shouldn't add another REX prefix.
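
To make the encoding concern concrete, here is a hedged sketch; it is not
linker or GCC code, and all names are hypothetical.  It relies on two facts
about 64-bit mode: REX prefixes are the bytes 0x40..0x4f, and registers
r8-r15 cannot be encoded without one.  The sample bytes were assembled by
hand for illustration: 8b 40 48 is a load from 0x48(%rax) into %eax, and
8b 05 plus four displacement bytes is a %rip-relative load with no REX
prefix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* In 64-bit mode every byte in 0x40..0x4f is a REX prefix.  */
bool
looks_like_rex (uint8_t byte)
{
  return (byte & 0xf0) == 0x40;
}

/* Registers r8-r15 (numbers 8..15) already force a REX prefix, so no
   second one may be added for them.  */
bool
reg_forces_rex (unsigned regno)
{
  assert (regno < 16);
  return regno >= 8;
}

/* Naive check a tool might make: is the byte just before the opcode at
   OPCODE_POS a REX prefix?  */
bool
byte_before_opcode_is_rex (const uint8_t *code, size_t opcode_pos)
{
  return opcode_pos > 0 && looks_like_rex (code[opcode_pos - 1]);
}

/* Hand-assembled illustration: the 0x48 at index 2 is the displacement
   of the first instruction, not a prefix of the second one.  */
const uint8_t sample_code[] = {
  0x8b, 0x40, 0x48,                   /* load from 0x48(%rax) into %eax */
  0x8b, 0x05, 0x00, 0x00, 0x00, 0x00  /* %rip-relative load, no REX */
};
```

Here byte_before_opcode_is_rex (sample_code, 3) returns true even though the
second instruction carries no prefix, which is the misreading that an
always-present REX prefix is meant to rule out.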

-- 
H.J.


Re: PATCH [1/n] addr32: Properly use Pmode and word_mode

2012-03-04 Thread Uros Bizjak
On Sun, Mar 4, 2012 at 11:01 PM, H.J. Lu hjl.to...@gmail.com wrote:

 @@ -13637,7 +13665,8 @@ ix86_print_operand (FILE *file, rtx x, int code)
              gcc_unreachable ();
            }

 -         ix86_print_operand (file, x, 0);
 +         ix86_print_operand (file, x,
 +                             TARGET_64BIT && REG_P (x) ? 'q' : 0);
           return;

 This is too big a hammer. You output everything in DImode, so even if
 the address is in fact in SImode, you output it in DImode with an
 addr32 prefix.


 %A is only used in jmp\t%A0 and there is no jmp *%eax instruction in
 64bit mode, only jmp *%rax:

 [hjl@gnu-4 tmp]$ cat j.s
        jmp *%eax
        jmp *%rax
 [hjl@gnu-4 tmp]$ gcc -c j.s
 j.s: Assembler messages:
 j.s:1: Error: operand type mismatch for `jmp'
 [hjl@gnu-4 tmp]$

 It is OK for x32 since the upper 32bits are zero when we are loading %eax.

Just zero_extend the register that is in the wrong mode to DImode in the
indirect_jump and tablejump expanders. If the above is true, then gcc will
remove this extension automatically.

Uros.


Re: [PATCH 02/10] addr32: Output REX prefix for UNSPEC_GOTNTPOFF

2012-03-04 Thread Uros Bizjak
On Sun, Mar 4, 2012 at 11:38 PM, H.J. Lu hjl.to...@gmail.com wrote:
 On Sun, Mar 4, 2012 at 2:12 PM, Uros Bizjak ubiz...@gmail.com wrote:
 On Fri, Mar 2, 2012 at 9:36 PM, H.J. Lu hongjiu...@intel.com wrote:

 X86-64 linker optimizes TLS_MODEL_INITIAL_EXEC to TLS_MODEL_LOCAL_EXEC
 by checking

        movq foo@gottpoff(%rip), %reg

 and

        addq foo@gottpoff(%rip), %reg

 It uses the REX prefix to avoid the last byte of the previous
 instruction.  With 32bit Pmode, we may not have the REX prefix and
 the last byte of the previous instruction may be an offset, which
 may look like a REX prefix.  IE->LE optimization will generate corrupted
 binary.  This patch makes sure we always output a REX prefix for
 UNSPEC_GOTNTPOFF.  OK for trunk?

 No, please implement this using UNSPEC in the same way as
 tls_initial_exec_64_sun implements Sun linker quirk.


 I am not sure how it can be done with UNSPEC cleanly.
 Unlike the Sun linker issue, this is an instruction encoding
 issue.  At instruction pattern level, there is no difference
 between x32 and x86-64. I need to make sure that there is
 always one and only one REX prefix with UNSPEC_GOTNTPOFF.
 If REG is r8-r15, we shouldn't add another REX prefix.

(define_insn "tls_initial_exec_x32"
  [(set (match_operand:SI 0 "register_operand" "=r")
        (unspec:SI
         [(match_operand:SI 1 "tls_symbolic_operand" "")]
         UNSPEC_TLS_IE_X32))
   (clobber (reg:CC FLAGS_REG))]
  "TARGET_X32"
{
  if (!REX_INT_REG_P (operands[0]))
    fputs ("\trex\n", asm_out_file);
  output_asm_insn
    ("mov{l}\t{%%fs:0, %0|%0, QWORD PTR fs:0}", operands);
  if (!REX_INT_REG_P (operands[0]))
    fputs ("\trex\n", asm_out_file);
  return "add{l}\t{%a1@gottpoff(%%rip), %0|%0, %a1@gottpoff[rip]}";
}
  [(set_attr "type" "multi")])

rex or rex64 prefix, whatever you wish.

Then generate this pattern from legitimize_tls_address,
TLS_MODEL_INITIAL_EXEC, see TARGET_SUN_TLS part.

Uros.


Re: [4.7][SH] Binary compatibility with atomic_test_and_test_trueval != 1

2012-03-04 Thread Kaz Kojima
Oleg Endo oleg.e...@t-online.de wrote:
 Attached is a slightly modified version of the patch from
 http://gcc.gnu.org/ml/gcc-patches/2012-03/msg00085.html
 
 I have removed the signed char weirdo and adjusted the comment above
 TARGET_ATOMIC_TEST_AND_SET_TRUEVAL accordingly.
 
 Tested by compiling some test functions that use __atomic_test_and_set /
 __GCC_ATOMIC_TEST_AND_SET_TRUEVAL with various SH atomic option
 combinations and looking at the output asm.
 
 OK to apply to trunk?

OK.

Regards,
kaz


Re: [patch, libffi] Sync merge libffi

2012-03-04 Thread Matthias Klose

On 04.03.2012 22:20, Anthony Green wrote:

Hello,

The attached patch includes changes that have been reviewed, approved and merged
into the stand-alone libffi release tree**.

** http://github.com/atgreen/libffi


does this correspond to a libffi release or release candidate?


[v3] libstdc++/43813

2012-03-04 Thread Paolo Carlini

Hi,

this is what I did to implement the resolution of lwg 1234. All in all 
pretty straightforward. Note I'm leaving alone for now basic_string and 
all the trickery with its exports (and well, I don't think *that* many 
people are instantiating basic_string for, eg, a pointer type).


Tested x86_64-linux, normal and debug, committed.

Thanks,
Paolo.

///
2012-03-04  Paolo Carlini  paolo.carl...@oracle.com
Jonathan Wakely  jwakely@gmail.com

PR libstdc++/43813
* include/bits/stl_iterator_base_types.h (_RequireInputIter): New.
* include/ext/vstring.h (__versa_string::__versa_string
(_InputIterator, _InputIterator, const _Alloc&),
__versa_string::append(_InputIterator, _InputIterator),
__versa_string::assign(_InputIterator, _InputIterator),
__versa_string::insert(iterator, _InputIterator,
_InputIterator), __versa_string::replace(iterator, iterator,
_InputIterator, _InputIterator)): Use it.
* include/bits/stl_list.h (list::list(_InputIterator,
_InputIterator, const allocator_type&), list::assign(_InputIterator,
_InputIterator), list::insert(iterator, _InputIterator,
_InputIterator)): Likewise.
* include/bits/stl_vector.h (vector::vector(_InputIterator,
_InputIterator, const allocator_type&), vector::assign(_InputIterator,
_InputIterator), vector::insert(iterator, _InputIterator,
_InputIterator)): Likewise.
* include/bits/stl_deque.h (deque::deque(_InputIterator,
_InputIterator, const allocator_type&), deque::assign(_InputIterator,
_InputIterator), deque::insert(iterator, _InputIterator,
_InputIterator)): Likewise.
* include/bits/stl_bvector.h (vector::vector(_InputIterator,
_InputIterator, const allocator_type&), vector::assign(_InputIterator,
_InputIterator), vector::insert(iterator, _InputIterator,
_InputIterator)): Likewise.
* include/bits/forward_list.h (forward_list::forward_list
(_InputIterator, _InputIterator, const allocator_type&),
forward_list::assign(_InputIterator, _InputIterator),
forward_list::insert_after(const_iterator, _InputIterator,
_InputIterator)): Likewise.
(forward_list::_M_initialize_dispatch(,, __true_type)): Remove.
(forward_list::_M_range_initialize): Add, adjust everywhere.
* include/bits/forward_list.tcc: Adjust.
* include/debug/forward_list: Adjust.
* include/debug/vector: Likewise.
* include/debug/deque: Likewise.
* include/debug/list: Likewise.
* testsuite/ext/vstring/requirements/do_the_right_thing.cc: New.
* testsuite/23_containers/forward_list/requirements/
do_the_right_thing.cc: Likewise.
* testsuite/23_containers/vector/requirements/
do_the_right_thing.cc: Likewise.
* testsuite/23_containers/deque/requirements/
do_the_right_thing.cc: Likewise.
* testsuite/23_containers/list/requirements/
do_the_right_thing.cc: Likewise.
* testsuite/23_containers/forward_list/requirements/dr438/
assign_neg.cc: Adjust dg-error line number.
* testsuite/23_containers/forward_list/requirements/dr438/
insert_neg.cc: Likewise.
* testsuite/23_containers/forward_list/requirements/dr438/
constructor_1_neg.cc: Likewise.
* testsuite/23_containers/forward_list/requirements/dr438/
constructor_2_neg.cc: Likewise.
* testsuite/23_containers/vector/requirements/dr438/
assign_neg.cc: Likewise.
* testsuite/23_containers/vector/requirements/dr438/
insert_neg.cc: Likewise.
* testsuite/23_containers/vector/requirements/dr438/
constructor_1_neg.cc: Likewise.
* testsuite/23_containers/vector/requirements/dr438/
constructor_2_neg.cc: Likewise.
* testsuite/23_containers/deque/requirements/dr438/
assign_neg.cc: Likewise.
* testsuite/23_containers/deque/requirements/dr438/
insert_neg.cc: Likewise.
* testsuite/23_containers/deque/requirements/dr438/
constructor_1_neg.cc: Likewise.
* testsuite/23_containers/deque/requirements/dr438/
constructor_2_neg.cc: Likewise.
* testsuite/23_containers/list/requirements/dr438/
assign_neg.cc: Likewise.
* testsuite/23_containers/list/requirements/dr438/
insert_neg.cc: Likewise.
* testsuite/23_containers/list/requirements/dr438/
constructor_1_neg.cc: Likewise.
* testsuite/23_containers/list/requirements/dr438/
constructor_2_neg.cc: Likewise.
Index: include/debug/forward_list
===
--- include/debug/forward_list  (revision 184887)
+++ include/debug/forward_list  (working copy)
@@ -1,6 +1,6 @@
 // forward_list -*- C++ -*-
 
-// Copyright (C) 2010 Free Software Foundation, Inc.
+// 

Re: [patch, libffi] Sync merge libffi

2012-03-04 Thread Anthony Green

On 3/4/2012 7:53 PM, Matthias Klose wrote:

On 04.03.2012 22:20, Anthony Green wrote:

Hello,

The attached patch includes changes that have been reviewed, approved
and merged into the stand-alone libffi release tree**.

** http://github.com/atgreen/libffi


does this correspond to a libffi release or release candidate?


No, but very close.   There's still the outstanding ARM fp issue
(libffi doesn't build for soft-fp targets).  I'll sync again when 3.0.11 
is final.


Chung-Lin, this is related to your VFP support patch.  I hope you have 
time to look at this soon.  Let me know if you won't.


Thanks,

AG



Re: [patch, libffi] Sync merge libffi

2012-03-04 Thread John David Anglin
On Sun, 04 Mar 2012, Anthony Green wrote:

 Hello,

 The attached patch includes changes that have been reviewed, approved and 
 merged into the stand-alone libffi release tree**.
 Tested on x86_64 linux with no regressions, and committed.

 Thanks,
 Anthony Green

I'd like to question some of the changes in copyright.  For example,
the file src/pa/ffi.c was originally written by Randolph Chung.  The
copyright notice read as follows on the original contribution:

ffi.c - (c) 2003-2004 Randolph Chung ta...@debian.org

HPPA Foreign Function Interface

Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the
``Software''), to deal in the Software without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to
permit persons to whom the Software is furnished to do so, subject to
the following conditions:

The above copyright notice and this permission notice shall be included
in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED ``AS IS'', WITHOUT WARRANTY OF ANY KIND, EXPRESS
OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL CYGNUS SOLUTIONS BE LIABLE FOR ANY CLAIM, DAMAGES OR
OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
OTHER DEALINGS IN THE SOFTWARE.

It now reads as follows:

/* ---
   ffi.c - (c) 2011 Anthony Green
   (c) 2008 Red Hat, Inc.
   (c) 2006 Free Software Foundation, Inc.
   (c) 2003-2004 Randolph Chung ta...@debian.org
   
  HPPA Foreign Function Interface
  HP-UX PA ABI support 

  Permission is hereby granted, free of charge, to any person obtaining
  a copy of this software and associated documentation files (the
  ``Software''), to deal in the Software without restriction, including
  without limitation the rights to use, copy, modify, merge, publish,
  distribute, sublicense, and/or sell copies of the Software, and to
  permit persons to whom the Software is furnished to do so, subject to
  the following conditions:

  The above copyright notice and this permission notice shall be included
  in all copies or substantial portions of the Software.

  THE SOFTWARE IS PROVIDED ``AS IS'', WITHOUT WARRANTY OF ANY KIND,
  EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
  MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
  NONINFRINGEMENT.  IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
  HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
  WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
  OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
  DEALINGS IN THE SOFTWARE.
--- */

I'm just wondering why Anthony Green and Red Hat are listed as copyright holders.
I can understand the Free Software Foundation addition since the file was
contributed to it.

Dave
-- 
J. David Anglin  dave.ang...@nrc-cnrc.gc.ca
National Research Council of Canada  (613) 990-0752 (FAX: 952-6602)


Re: PATCH [1/n] addr32: Properly use Pmode and word_mode

2012-03-04 Thread H.J. Lu
On Sun, Mar 4, 2012 at 2:40 PM, Uros Bizjak ubiz...@gmail.com wrote:
 On Sun, Mar 4, 2012 at 11:01 PM, H.J. Lu hjl.to...@gmail.com wrote:

 @@ -13637,7 +13665,8 @@ ix86_print_operand (FILE *file, rtx x, int code)
              gcc_unreachable ();
            }

 -         ix86_print_operand (file, x, 0);
 +         ix86_print_operand (file, x,
 +                             TARGET_64BIT && REG_P (x) ? 'q' : 0);
           return;

 This is too big a hammer. You output everything in DImode, so even if
 the address is in fact in SImode, you output it in DImode with an
 addr32 prefix.


 %A is only used in jmp\t%A0 and there is no jmp *%eax instruction in
 64bit mode, only jmp *%rax:

 [hjl@gnu-4 tmp]$ cat j.s
        jmp *%eax
        jmp *%rax
 [hjl@gnu-4 tmp]$ gcc -c j.s
 j.s: Assembler messages:
 j.s:1: Error: operand type mismatch for `jmp'
 [hjl@gnu-4 tmp]$

 It is OK for x32 since the upper 32bits are zero when we are loading %eax.

 Just zero_extend the register that is in the wrong mode to DImode in the
 indirect_jump and tablejump expanders. If the above is true, then gcc will
 remove this extension automatically.


I tried:

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 715e7ea..de5cf67 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -11100,10 +11100,15 @@
    (set_attr "modrm" "0")])

 (define_expand "indirect_jump"
-  [(set (pc) (match_operand 0 "indirect_branch_operand" ""))])
+  [(set (pc) (match_operand 0 "indirect_branch_operand" ""))]
+  ""
+{
+  if (TARGET_X32)
+    operands[0] = convert_memory_address (word_mode, operands[0]);
+})

 (define_insn "*indirect_jump"
-  [(set (pc) (match_operand:P 0 "indirect_branch_operand" "rw"))]
+  [(set (pc) (match_operand:W 0 "indirect_branch_operand" "rw"))]
   ""
   "jmp\t%A0"
   [(set_attr "type" "ibr")
@@ -11145,12 +11150,12 @@
   operands[0] = expand_simple_binop (Pmode, code, op0, op1, NULL_RTX, 0,
                                      OPTAB_DIRECT);
 }
-  else if (TARGET_X32)
-    operands[0] = convert_memory_address (Pmode, operands[0]);
+  if (TARGET_X32)
+    operands[0] = convert_memory_address (word_mode, operands[0]);
 })

 (define_insn "*tablejump_1"
-  [(set (pc) (match_operand:P 0 "indirect_branch_operand" "rw"))
+  [(set (pc) (match_operand:W 0 "indirect_branch_operand" "rw"))
    (use (label_ref (match_operand 1 "" "")))]
   ""
   "jmp\t%A0"

and compiler does generate the same output. i386.c also has

xasm = "jmp\t%A0";
xasm = "call\t%A0";

for calls.  There are no separate indirect call patterns.  For x32,
only indirect register calls have to be in DImode.  The direct call
should be in Pmode (SImode).

Since x86-64 hardware always zero-extends the upper 32 bits of a 64-bit
register when loading its lower 32 bits, it is safe and easier to just
output 64-bit registers for %A than to zero-extend by hand in all
jump/call patterns.
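
The hardware rule this argument leans on can be modelled in a few lines.
This is a plain C model with hypothetical function names, not an emulator:
a 32-bit register write clears bits 63..32 of the full register, whereas a
16-bit write leaves the upper bits untouched.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a 64-bit register after writing its low 32 bits (for example
   a move into %eax): the old upper half is discarded.  */
uint64_t
write_low32 (uint64_t old_value, uint32_t new_low)
{
  (void) old_value;             /* The previous contents do not survive.  */
  uint64_t result = new_low;
  assert ((result >> 32) == 0);
  return result;
}

/* Contrast: writing the low 16 bits (for example %ax) preserves the
   remaining 48 bits of the old value.  */
uint64_t
write_low16 (uint64_t old_value, uint16_t new_low)
{
  return (old_value & ~(uint64_t) 0xffff) | new_low;
}
```

Under this model, a pointer loaded into the 32-bit half of a register can be
used through the 64-bit name as a jump target without a separate extension
instruction.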

-- 
H.J.


Re: [PATCH 02/10] addr32: Output REX prefix for UNSPEC_GOTNTPOFF

2012-03-04 Thread H.J. Lu
On Sun, Mar 4, 2012 at 2:52 PM, Uros Bizjak ubiz...@gmail.com wrote:
 On Sun, Mar 4, 2012 at 11:38 PM, H.J. Lu hjl.to...@gmail.com wrote:
 On Sun, Mar 4, 2012 at 2:12 PM, Uros Bizjak ubiz...@gmail.com wrote:
 On Fri, Mar 2, 2012 at 9:36 PM, H.J. Lu hongjiu...@intel.com wrote:

 X86-64 linker optimizes TLS_MODEL_INITIAL_EXEC to TLS_MODEL_LOCAL_EXEC
 by checking

        movq foo@gottpoff(%rip), %reg

 and

        addq foo@gottpoff(%rip), %reg

 It uses the REX prefix to avoid the last byte of the previous
 instruction.  With 32bit Pmode, we may not have the REX prefix and
 the last byte of the previous instruction may be an offset, which
 may look like a REX prefix.  IE->LE optimization will generate corrupted
 binary.  This patch makes sure we always output a REX prefix for
 UNSPEC_GOTNTPOFF.  OK for trunk?

 No, please implement this using UNSPEC in the same way as
 tls_initial_exec_64_sun implements Sun linker quirk.


 I am not sure how it can be done with UNSPEC cleanly.
 Unlike the Sun linker issue, this is an instruction encoding
 issue.  At instruction pattern level, there is no difference
 between x32 and x86-64. I need to make sure that there is
 always one and only one REX prefix with UNSPEC_GOTNTPOFF.
 If REG is r8-r15, we shouldn't add another REX prefix.

 (define_insn "tls_initial_exec_x32"
   [(set (match_operand:SI 0 "register_operand" "=r")
         (unspec:SI
          [(match_operand:SI 1 "tls_symbolic_operand" "")]
          UNSPEC_TLS_IE_X32))
    (clobber (reg:CC FLAGS_REG))]
   "TARGET_X32"
 {
   if (!REX_INT_REG_P (operands[0]))
     fputs ("\trex\n", asm_out_file);
   output_asm_insn
     ("mov{l}\t{%%fs:0, %0|%0, QWORD PTR fs:0}", operands);
   if (!REX_INT_REG_P (operands[0]))
     fputs ("\trex\n", asm_out_file);
   return "add{l}\t{%a1@gottpoff(%%rip), %0|%0, %a1@gottpoff[rip]}";
 }
   [(set_attr "type" "multi")])

 rex or rex64 prefix, whatever you wish.


Will it introduce bugs? The normal code looks like

  off = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, x), type);
  off = gen_rtx_CONST (Pmode, off);
  if (pic)
    off = gen_rtx_PLUS (Pmode, pic, off);
  off = gen_const_mem (Pmode, off);
  set_mem_alias_set (off, ix86_GOT_alias_set ());

  if (TARGET_64BIT || TARGET_ANY_GNU_TLS)
    {
      base = get_thread_pointer (for_mov || !TARGET_TLS_DIRECT_SEG_REFS);
      off = force_reg (Pmode, off);
      return gen_rtx_PLUS (Pmode, base, off);
    }

There is a call to

 set_mem_alias_set (off, ix86_GOT_alias_set ());

-- 
H.J.


Re: [patch, libffi] Sync merge libffi

2012-03-04 Thread Anthony Green

On 3/4/2012 10:22 PM, John David Anglin wrote:
I'm just wondering why Anthony Green and Red Hat are listed as 
copyright holders. I can understand the Free Software Foundation 
addition since the file was contributed to it.


Simply because of changes that were made to that source file over the 
years.  For instance, in 2011 I made a small change to that file (ABI 
check change), and the copyright notice update was largely a mechanical 
byproduct (emacs mostly automates this).


I have no objection to removing this if you feel strongly about it.

AG



Re: PATCH [1/n] addr32: Properly use Pmode and word_mode

2012-03-04 Thread Ian Lance Taylor
H.J. Lu hjl.to...@gmail.com writes:

 @@ -11060,8 +11072,8 @@ ix86_expand_split_stack_prologue (void)
        {
          rtx rax;

 -         rax = gen_rtx_REG (Pmode, AX_REG);
 -         emit_move_insn (rax, reg10);
 +         rax = gen_rtx_REG (word_mode, AX_REG);
 +         emit_move_insn (rax, gen_rtx_REG (word_mode, R10_REG));
          use_reg (call_fusage, rax);
        }

 Same here. Please review how AX, R10 and R11 are defined and used.
 Also, this needs review from split stack author.

 I CCed Ian. That is the same issue.  We need some scratch registers
 in Pmode to manipulate stack.  But we have to save and restore them
 in word_mode, not Pmode.

Changing Pmode to word_mode is fine here, if the x86 maintainers approve
the rest of the patch.

Ian


Re: libgo patch committed: Fill out syscall package for GNU/Linux

2012-03-04 Thread Ian Lance Taylor
Rainer Orth r...@cebitec.uni-bielefeld.de writes:

 Rainer Orth r...@cebitec.uni-bielefeld.de writes:

 Ian Lance Taylor i...@google.com writes:

 This patch to libgo fills out the syscall package for GNU/Linux to match
 all the functions in the syscall package in the master Go library.
 There is a test case for this patch at
 http://code.google.com/p/go/issues/detail?id=3071 .  Bootstrapped and
 ran Go testsuite on x86_64-unknown-linux-gnu.  Committed to mainline.

 Unfortunately, this broke Solaris bootstrap:

 It also broke Linux/x86_64 bootstrap (CentOS 5.6):

 In file included from /usr/include/sys/ustat.h:30:0,
  from /usr/include/ustat.h:1,
  from sysinfo.c:91:
 /usr/include/bits/ustat.h:25:8: error: redefinition of 'struct ustat'
 In file included from /usr/include/linux/filter.h:8:0,
  from sysinfo.c:61:
 /usr/include/linux/types.h:156:8: note: originally defined here


After some actual testing, this additional patch seems to be needed to
fix the problem.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline.

Ian

diff -r 6ec2944349b2 libgo/configure.ac
--- a/libgo/configure.ac	Fri Mar 02 13:07:34 2012 -0800
+++ b/libgo/configure.ac	Sun Mar 04 21:53:22 2012 -0800
@@ -463,6 +463,8 @@
 
 AC_CACHE_CHECK([whether <ustat.h> can be used],
 [libgo_cv_c_ustat_h],
+[CFLAGS_hold=$CFLAGS
+CFLAGS="$CFLAGS -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE $OSCFLAGS"
 AC_COMPILE_IFELSE(
 [AC_LANG_SOURCE([
 #include <sys/types.h>
@@ -470,7 +472,8 @@
 #include <linux/filter.h>
 #endif
 #include <ustat.h>
-])], [libgo_cv_c_ustat_h=yes], [libgo_cv_c_ustat_h=no]))
+])], [libgo_cv_c_ustat_h=yes], [libgo_cv_c_ustat_h=no])
+CFLAGS=$CFLAGS_hold])
 if test $libgo_cv_c_ustat_h = yes; then
   AC_DEFINE(HAVE_USTAT_H, 1,
 [Define to 1 if you have the <ustat.h> header file and it works.])


libgo patch committed: Better big-endian hash function

2012-03-04 Thread Ian Lance Taylor
This libgo patch improves the big-endian hash function for key sizes
less than 8 bytes.  The previous hash function would always make all
hash values a large multiple of some constants, which interacted badly
with the map code.  This patch fixes that problem and fixes PR 52342.
Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu.
Committed to mainline.

Ian

diff -r 071257161dab libgo/runtime/go-type-identity.c
--- a/libgo/runtime/go-type-identity.c	Sun Mar 04 22:03:57 2012 -0800
+++ b/libgo/runtime/go-type-identity.c	Sun Mar 04 22:36:37 2012 -0800
@@ -6,6 +6,7 @@
 
 #include <stddef.h>
 
+#include "config.h"
 #include "go-type.h"
 
 /* The 64-bit type.  */
@@ -31,7 +32,11 @@
 	unsigned char a[8];
   } u;
   u.v = 0;
-  __builtin_memcpy (u.a, key, key_size);
+#ifdef WORDS_BIGENDIAN
+  __builtin_memcpy (&u.a[8 - key_size], key, key_size);
+#else
+  __builtin_memcpy (&u.a[0], key, key_size);
+#endif
   if (sizeof (uintptr_t) >= 8)
 	return (uintptr_t) u.v;
   else
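
A host-independent sketch of why the copy offset matters is given below.  It
is not the libgo code: widen_key and read_u64 are hypothetical names, and
the byte order is passed in as a flag rather than taken from
WORDS_BIGENDIAN so that both cases can be tried on any machine.  The
align_low flag selects between the patched placement (key bytes at the end
of the buffer on big-endian) and the old one (key bytes always at the
start).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Interpret eight bytes as an integer in the given byte order.  */
static uint64_t
read_u64 (const unsigned char a[8], bool big_endian)
{
  uint64_t v = 0;
  for (size_t i = 0; i < 8; i++)
    {
      if (big_endian)
        v = (v << 8) | a[i];
      else
        v |= (uint64_t) a[i] << (8 * i);
    }
  return v;
}

/* Widen a key of up to 8 bytes to a 64-bit value.  With ALIGN_LOW set,
   a big-endian key is copied to the end of the zeroed buffer so that it
   lands in the low-order bits; otherwise it is copied to the start.  */
uint64_t
widen_key (const unsigned char *key, size_t key_size,
           bool big_endian, bool align_low)
{
  unsigned char a[8] = { 0 };
  assert (key_size <= 8);
  size_t start = (big_endian && align_low) ? 8 - key_size : 0;
  memcpy (&a[start], key, key_size);
  return read_u64 (a, big_endian);
}
```

For the two-byte big-endian key 12 34, the old placement gives
0x1234000000000000, a multiple of 2^48 for every such key, while the
patched placement gives 0x1234.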


Re: PATCH [1/n] addr32: Properly use Pmode and word_mode

2012-03-04 Thread Uros Bizjak
On Mon, Mar 5, 2012 at 4:53 AM, H.J. Lu hjl.to...@gmail.com wrote:

 and compiler does generate the same output. i386.c also has

         xasm = "jmp\t%A0";
         xasm = "call\t%A0";

 for calls.  There are no separate indirect call patterns.  For x32,
 only indirect register calls have to be in DImode.  The direct call
 should be in Pmode (SImode).

A direct call just expects a label for some absolute address that is
assumed to fit in 32 bits (see constant_call_address_operand_p).

call and jmp insn expect word_mode operands, so please change
ix86_expand_call and call patterns in the same way as jump
instructions above.

 Since x86-64 hardware always zero-extends the upper 32 bits of a 64-bit
 register when loading its lower 32 bits, it is safe and easier to just
 output 64-bit registers for %A than to zero-extend by hand in all
 jump/call patterns.

No, the instruction expects word_mode operands, so we have to extend
values to expected mode. I don't think that patching at insn output
time is acceptable.

BTW: I propose to split the patch into smaller pieces, dealing with
various independent parts separately. Handling jump/call insn is
definitely one of them, the other is stringops handling, another
prologue/epilogue expansion.

Uros.