Re: [PATCH, i386] RTM support

2012-03-11 Thread Kirill Yukhin

 The patch is OK for mainline, if there are no further comments in the next 24h.

Thank you!

Following Tobias's input, I've added a few lines about RTM to
doc/invoke.texi. If there are no objections, I'll commit the patch tomorrow.

Updated patch attached.
Updated ChangeLog entry:
2012-03-11  Kirill Yukhin  kirill.yuk...@intel.com

* doc/invoke.texi: Document -mrtm option.
* common/config/i386/i386-common.c (OPTION_MASK_ISA_RTM_SET):
New.
(OPTION_MASK_ISA_RTM_UNSET): Ditto.
(ix86_handle_option): Handle OPT_mrtm.
* config.gcc (i[34567]86-*-*): Add rtmintrin.h and
xtestintrin.h.
(x86_64-*-*): Ditto.
* i386-builtin-types.def (INT_FTYPE_VOID): New.
* config/i386/i386-c.c (ix86_target_macros_internal): Define
__RTM__ if needed.
(ix86_target_string): Define -mrtm option.
(PTA_RTM): New.
(ix86_option_override_internal): Extend corei7-avx with
RTM option. Handle new option.
(ix86_valid_target_attribute_inner_p): Add OPT_mrtm.
(ix86_builtins): Add IX86_BUILTIN_XBEGIN, IX86_BUILTIN_XEND,
IX86_BUILTIN_XTEST.
(bdesc_special_args): Ditto.
(ix86_init_mmx_sse_builtins): Add IX86_BUILTIN_XABORT.
(ix86_expand_special_args_builtin): Handle new built-in type.
(ix86_expand_builtin): Handle XABORT instruction.
* config/i386/i386.h (TARGET_RTM): New.
* config/i386/i386.md (UNSPECV_XBEGIN): New.
(UNSPECV_XEND): Ditto.
(UNSPECV_XABORT): Ditto.
(UNSPECV_XTEST): Ditto.
(xbegin): Ditto.
(xbegin_1): Ditto.
(xend): Ditto.
(xabort): Ditto.
(xtest): Ditto.
(xtest_1): Ditto.
* config/i386/i386.opt (mrtm): New.
* config/i386/immintrin.h: Include rtmintrin.h and
xtestintrin.h.
* config/i386/rtmintrin.h: New header.
* config/i386/xtestintrin.h: Ditto.


Thanks, K
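
For readers who have not used the new intrinsics, here is a minimal sketch of how the RTM built-ins exposed through immintrin.h might be used. It assumes the _xbegin/_xend intrinsics and the _XBEGIN_STARTED constant as shipped in current GCC releases; the counter and function names are hypothetical. The transactional path is compiled only when -mrtm defines __RTM__, and no runtime CPU detection is attempted, so a binary built with -mrtm still needs RTM-capable hardware.

```c
#ifdef __RTM__
#include <immintrin.h>   /* pulls in rtmintrin.h and xtestintrin.h */
#endif

/* Hypothetical shared state, for illustration only. */
static int shared_counter;

/* Increment the counter, transactionally when RTM code generation is
   enabled.  Returns 1 if the transaction committed, 0 if the fallback
   path was used.  */
int
increment_counter (void)
{
#ifdef __RTM__
  unsigned int status = _xbegin ();
  if (status == _XBEGIN_STARTED)
    {
      shared_counter++;
      _xend ();
      return 1;
    }
  /* The transaction aborted and its writes were rolled back;
     fall through to the non-transactional path.  */
#endif
  /* Fallback path.  Real concurrent code would take a lock here and
     have the transaction read that lock as well.  */
  shared_counter++;
  return 0;
}

/* Accessor so that callers need not touch the static directly.  */
int
counter_value (void)
{
  return shared_counter;
}
```

Either path leaves the counter incremented exactly once per call, which is the property a lock-elision fallback has to preserve.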


Re: PATCH: Check Pmode in lwp_slwpcb

2012-03-11 Thread Uros Bizjak
On Sat, Mar 10, 2012 at 8:13 PM, H.J. Lu hongjiu...@intel.com wrote:

 Pmode may be SImode for TARGET_64BIT.  This patch checks Pmode instead
 of TARGET_64BIT in lwp_slwpcb.  Tested on Linux/x86-64.  OK for trunk?

 2012-03-02  H.J. Lu  hongjiu...@intel.com

        * config/i386/i386.md (lwp_slwpcb): Check Pmode instead of
        TARGET_64BIT.

OK.

Thanks,
Uros.


Re: PATCH: Use Pmode on x86_64 this parameter

2012-03-11 Thread Uros Bizjak
On Sun, Mar 11, 2012 at 2:11 AM, H.J. Lu hongjiu...@intel.com wrote:

 This patch replaces DImode with Pmode on x86_64 this parameter.  OK
 for trunk?

 2012-03-10  H.J. Lu  hongjiu...@intel.com

        * config/i386/i386.c (x86_this_parameter): Replace DImode with
        Pmode.

OK.

Thanks,
Uros.


Re: [Patch ARM/ configury] Add fall-back check for gnu_unique_object

2012-03-11 Thread Ramana Radhakrishnan
On 10 March 2012 00:39, DJ Delorie d...@redhat.com wrote:

  Ping -  http://gcc.gnu.org/ml/gcc-patches/2012-03/msg00549.html

 And now really add Paolo and DJ.

 +   [.type foo, '$target_type_format_char'gnu_unique_object],,

 This un-quoting looks incorrect if you don't know what's going on
 under the hood, but I don't see a clean way around it.  A suitable
 comment would be appropriate.

Thanks for the quick review.  I thought, however, that this was
a fairly standard workaround for the issue, having seen it elsewhere
in the same file; it is used in multiple places.



 +target_type_format_char=@
 +       target_type_format_char=%

 Since the string always has special characters, it's likely that
 single quotes are more appropriate here.  The two characters in the
 patch don't care, but some future porter might naively do $ and
 wonder why (or worse, not wonder why) it doesn't work right.

Fair point  - done.


 Other than that it looks OK to me, assuming you tested it on all the
 relevant targets (i.e. arm and not-arm).


I tested x86_64-linux-gnu with a bootstrap; auto-host.h was identical
to the previous run, so that appears to be fine.  (This is a target
that uses the default '@'.)  On ARM I've done a full bootstrap, and
auto-host.h shows the appropriate macro defined (as it does with this
version of the patch as well).  Are there any other targets you'd
suggest?

Assuming all tests pass, is this version better?
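
To make the role of target_type_format_char concrete, the following is a small sketch (not part of the patch) that builds the .type directive for a given prefix character. The function names are hypothetical. The only fact it relies on is that the ARM assembler treats '@' as a comment character, so '%' is used there instead.

```c
#include <stdio.h>

/* Pick the prefix character for the type argument of a .type
   directive.  ARM assemblers treat '@' as a comment start, so '%'
   is used there; '@' is the default elsewhere.  */
char
default_type_format_char (void)
{
#if defined(__arm__)
  return '%';
#else
  return '@';
#endif
}

/* Write ".type SYMBOL, <TYPE_CHAR>PROPERTY" into BUF.  Returns the
   value of snprintf, i.e. the length that was (or would have been)
   written.  */
int
format_type_directive (char *buf, size_t size, const char *symbol,
                       char type_char, const char *property)
{
  return snprintf (buf, size, ".type %s, %c%s",
                   symbol, type_char, property);
}
```

Calling format_type_directive with "foo", '%' and "gnu_unique_object" yields the line the configure test feeds to the assembler on ARM.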


cheers
Ramana

 2012-03-11  Ramana Radhakrishnan   ramana.radhakrish...@linaro.org

   * config.gcc (target_type_format_char): New. Document it. Set it for
 arm*-*-* .
   * configure.ac (gnu_unique_option): Use target_type_format_char in test.
 Comment rationale.
   * configure: Regenerate .
diff --git a/gcc/config.gcc b/gcc/config.gcc
index 99f0b47..a769d0c 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -184,6 +184,11 @@
 #  the --with-sysroot configure option or the
 #  --sysroot command line option is used this
 #  will be relative to the sysroot.
+# target_type_format_char 
+#  The default character to be used for formatting
+#  the attribute in a
+#  .type symbol_name, ${t_t_f_c}property
+#  directive.
 
 # The following variables are used in each case-construct to build up the
 # outgoing variables:
@@ -235,6 +240,7 @@ target_gtfiles=
 need_64bit_hwint=
 need_64bit_isa=
 native_system_header_dir=/usr/include
+target_type_format_char='@'
 
 # Don't carry these over build-host-target.  Please.
 xm_file=
@@ -321,6 +327,7 @@ am33_2.0-*-linux*)
 arm*-*-*)
cpu_type=arm
extra_headers=mmintrin.h arm_neon.h
+   target_type_format_char='%'
c_target_objs=arm-c.o
cxx_target_objs=arm-c.o
extra_options=${extra_options} arm/arm-tables.opt
diff --git a/gcc/configure.ac b/gcc/configure.ac
index 39302ad..4a534a1 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -4188,7 +4188,9 @@ Valid choices are 'yes' and 'no'.]) ;;
   esac],
  [gcc_GAS_CHECK_FEATURE([gnu_unique_object], gcc_cv_as_gnu_unique_object,
[elf,2,19,52],,
-   [.type foo, @gnu_unique_object],,
+#We have to unquote here to reuse the variable from 
+#config.gcc.
+   [.type foo, '$target_type_format_char'gnu_unique_object],,
 # Also check for ld.so support, i.e. glibc 2.11 or higher.
[[if test x$host = x$build -a x$host = x$target &&
ldd --version 2>/dev/null &&


Re: [SH] PR 51244 - Improve conditional branches

2012-03-11 Thread Oleg Endo
On Thu, 2012-03-08 at 09:31 +0100, Oleg Endo wrote:

 This is the patch for the patch, as attached in the PR.
 Tested against rev 184966 as before and no changes in the test results
 for me (i.e. no new failures).

This one had a bug, as discussed in the PR.
I've tested the attached latest version of the patch (same as in the PR)
against rev 185160 with 

make -k check RUNTESTFLAGS=--target_board=sh-sim
\{-m2/-ml,-m2/-mb,-m2a-single/-mb,
-m4-single/-ml,-m4-single/-mb,
-m4a-single/-ml,-m4a-single/-mb}

once more and confirmed that there are no new failures.  The new failure
'21_strings/basic_string/cons/char/6.cc' mentioned in the PR is failing
due to the test program allocating too much memory for the simulator.
It aborts with a 'heap and stack collision'.  Changing the number of test
iterations in the test case from '13' to '12' makes it pass again.

OK to commit the patch?

Cheers,
Oleg

ChangeLog:

PR target/51244
* config/sh/sh.md (movnegt): Expand into respective insns 
immediately.  Use movrt_negc instead of negc pattern for
non-SH2A.
(*movnegt): Remove.
(*movrt_negc, *negnegt, *movtt, *movt_qi): New insns and splits.

testsuite/ChangeLog:
 
PR target/51244
* gcc.target/sh/pr51244-1.c: Fix thinkos.
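
As background for the movrt_negc pattern, here is a small C model of the SH negc instruction, written from the instruction's documented behaviour (Rn = 0 - Rm - T, with the borrow going to T). It is only an illustration, and the function names are hypothetical. It shows why feeding the constant -1 to negc produces the complement of the T bit, and why T is left set to 1 afterwards, which matches the (set (reg:SI T_REG) (const_int 1)) in the new insn.

```c
#include <stdint.h>

/* Model of SH "negc Rm,Rn": Rn = 0 - Rm - T, borrow -> T.
   *T holds the T bit (0 or 1) on entry and the borrow on exit.  */
uint32_t
sh_negc (uint32_t rm, unsigned int *t)
{
  uint32_t temp = 0u - rm;
  uint32_t rn = temp - *t;
  /* A borrow occurs if either subtraction wrapped around.  */
  *t = (0u < temp) || (temp < rn);
  return rn;
}

/* movnegt for non-SH2A: negc with Rm = -1 yields 1 - T, i.e. T ^ 1.
   The updated T bit is returned through T_OUT.  */
uint32_t
movnegt_via_negc (unsigned int t_in, unsigned int *t_out)
{
  unsigned int t = t_in;
  uint32_t result = sh_negc (UINT32_C (0xFFFFFFFF), &t);
  *t_out = t;
  return result;
}
```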

Index: gcc/testsuite/gcc.target/sh/pr51244-1.c
===
--- gcc/testsuite/gcc.target/sh/pr51244-1.c	(revision 184966)
+++ gcc/testsuite/gcc.target/sh/pr51244-1.c	(working copy)
@@ -13,20 +13,20 @@
 }
 
 int
-testfunc_01 (int a, char* p, int b, int c)
+testfunc_01 (int a, int b, int c, int d)
 {
-  return (a == b && a == c) ? b : c;
+  return (a == b || a == d) ? b : c;
 }
 
 int
-testfunc_02 (int a, char* p, int b, int c)
+testfunc_02 (int a, int b, int c, int d)
 {
-  return (a == b && a == c) ? b : c;
+  return (a == b && a == d) ? b : c;
 }
 
 int
-testfunc_03 (int a, char* p, int b, int c)
+testfunc_03 (int a, int b, int c, int d)
 {
-  return (a != b && a != c) ? b : c;
+  return (a != b && a != d) ? b : c;
 }
 
Index: gcc/config/sh/sh.md
===
--- gcc/config/sh/sh.md	(revision 184966)
+++ gcc/config/sh/sh.md	(working copy)
@@ -9679,39 +9679,90 @@
 ;; If the constant -1 can be CSE-ed or lifted out of a loop it effectively
 ;; becomes a one instruction operation.  Moreover, care must be taken that
 ;; the insn can still be combined with inverted compare and branch code
-;; around it.
-;; The expander will reserve the constant -1, the insn makes the whole thing
-;; combinable, the splitter finally emits the insn if it was not combined 
-;; away.
-;; Notice that when using the negc variant the T bit also gets inverted.
+;; around it.  On the other hand, if a function returns the complement of
+;; a previous comparison result in the T bit, the xor #1,r0 approach might
+;; lead to better code.
 
 (define_expand movnegt
-  [(set (match_dup 1) (const_int -1))
-   (parallel [(set (match_operand:SI 0 arith_reg_dest )
-		   (xor:SI (reg:SI T_REG) (const_int 1)))
-   (use (match_dup 1))])]
+  [(set (match_operand:SI 0 arith_reg_dest )
+	(xor:SI (reg:SI T_REG) (const_int 1)))]
   
 {
-  operands[1] = gen_reg_rtx (SImode);
+  if (TARGET_SH2A)
+emit_insn (gen_movrt (operands[0]));
+  else
+{
+  rtx val = force_reg (SImode, gen_int_mode (-1, SImode));
+  emit_insn (gen_movrt_negc (operands[0], val));
+}
+  DONE;
 })
 
-(define_insn_and_split *movnegt
+(define_insn movrt_negc
   [(set (match_operand:SI 0 arith_reg_dest =r)
 	(xor:SI (reg:SI T_REG) (const_int 1)))
+   (set (reg:SI T_REG) (const_int 1))
(use (match_operand:SI 1 arith_reg_operand r))]
   TARGET_SH1
+  negc	%1,%0
+  [(set_attr type arith)])
+
+;; The *negnegt patterns help the combine pass to figure out how to fold 
+;; an explicit double T bit negation.
+(define_insn_and_split *negnegt
+  [(set (reg:SI T_REG)
+	(eq:SI (subreg:QI (xor:SI (reg:SI T_REG) (const_int 1)) 3)
+(const_int 0)))]
+  ! TARGET_LITTLE_ENDIAN
   #
-   1
-  [(const_int 0)]
-{
-  if (TARGET_SH2A)
-emit_insn (gen_movrt (operands[0]));
-  else
-emit_insn (gen_negc (operands[0], operands[1]));
-  DONE;
-}
+  
+  [(const_int 0)])
+
+(define_insn_and_split *negnegt
+  [(set (reg:SI T_REG)
+	(eq:SI (subreg:QI (xor:SI (reg:SI T_REG) (const_int 1)) 0)
+(const_int 0)))]
+  TARGET_LITTLE_ENDIAN
+  #
+  
+  [(const_int 0)])
+
+;; The *movtt patterns improve code at -O1.
+(define_insn_and_split *movtt
+  [(set (reg:SI T_REG)
+	(eq:SI (zero_extend:SI (subreg:QI (reg:SI T_REG) 3))
+(const_int 1)))]
+  ! TARGET_LITTLE_ENDIAN
+  #
+  
+  [(const_int 0)])
+
+(define_insn_and_split *movtt
+  [(set (reg:SI T_REG)
+	(eq:SI (zero_extend:SI (subreg:QI (reg:SI T_REG) 0))
+(const_int 1)))]
+  TARGET_LITTLE_ENDIAN
+  #
+  
+  [(const_int 0)])
+
+;; The *movt_qi patterns help the combine pass convert a movrt_negc pattern
+;; into a movt Rn, xor #1 Rn pattern.  This can happen 

Re: [SH] PR 51244 - Improve conditional branches

2012-03-11 Thread Kaz Kojima
Oleg Endo oleg.e...@t-online.de wrote:
 This one had a bug, as discussed in the PR.
 I've tested the attached latest version of the patch (same as in the PR)
 against rev 185160 with 
 
 make -k check RUNTESTFLAGS=--target_board=sh-sim
 \{-m2/-ml,-m2/-mb,-m2a-single/-mb,
 -m4-single/-ml,-m4-single/-mb,
 -m4a-single/-ml,-m4a-single/-mb}
 
 once more and confirmed that there are no new failures.  The new failure
 '21_strings/basic_string/cons/char/6.cc' mentioned in the PR is failing
 due to the test program allocating too much memory for the simulator.
 It aborts with a 'heap and stack collision'.  Changing the number of test
 iterations in the test case from '13' to '12' makes it pass again.
 
 OK to commit the patch?

OK.

Regards,
kaz


Re: 4.4 branch frozen

2012-03-11 Thread Gerald Pfeifer
On Tue, 6 Mar 2012, Jakub Jelinek wrote:
 The 4.4 branch is now frozen, all commits require RM approval.
 There will be the 4.4.7 release next week released from it and
 after that the branch will be closed.

Cool.  At that point I suggest removing GCC 4.4 from the Release
Series and Status; let me know, and I'll be happy to help with any
web page updates.

Gerald


Re: [PATCH] [SH] Fix target/48596

2012-03-11 Thread Oleg Endo
On Tue, 2012-03-06 at 08:24 +0900, Kaz Kojima wrote:
 Oleg Endo oleg.e...@t-online.de wrote:
  I'd like to add the test case from the PR to the testsuite.
  
  Tested with
  make check-gcc RUNTESTFLAGS=sh.exp=pr48596.c --target_board=sh-sim
  \{-m2/-ml,-m2/-mb,-m2a-single/-mb,
  -m4-single/-ml,-m4-single/-mb,-m4a-single/-ml,-m4a-single/-mb}
  
  OK?
 
 A gcc.c-torture/compile test is better, isn't it?
 

I just noticed that I accidentally added pr48596.c to
gcc.target/sh in another commit.  I'm sorry about that.

The attached patch moves it to gcc.c-torture/compile, as suggested.
Briefly tested by running the gcc.c-torture/compile set on sh-sim
-m4a-single -ml.


testsuite/ChangeLog:
 
PR target/48596
* gcc.target/sh/pr48596.c: Move accidentally added new test
case to ...
* gcc.c-torture/compile/pr48596.c: ... here.

Index: gcc/testsuite/gcc.target/sh/pr48596.c
===
--- gcc/testsuite/gcc.target/sh/pr48596.c	(revision 185191)
+++ gcc/testsuite/gcc.target/sh/pr48596.c	(working copy)
@@ -1,31 +0,0 @@
-/* Check that the following code compiles without errors.  */
-/* { dg-do compile { target sh*-*-* } } */
-/* { dg-options "-O1" } */
-
-enum { nrrdCenterUnknown, nrrdCenterNode, nrrdCenterCell, nrrdCenterLast };
-typedef struct { int size; int center; }  NrrdAxis;
-typedef struct { int dim; NrrdAxis axis[10]; } Nrrd;
-typedef struct { } NrrdKernel;
-typedef struct { const NrrdKernel *kernel[10]; int samples[10]; } Info;
-
-void
-foo (Nrrd *nout, Nrrd *nin, const NrrdKernel *kernel, const double *parm,
-     const int *samples, const double *scalings)
-{
-  Info *info;
-  int d, p, np, center;
-  for (d=0; d<nin->dim; d++)
-    {
-      info->kernel[d] = kernel;
-      if (samples)
-        info->samples[d] = samples[d];
-      else
-        {
-          center = _nrrdCenter(nin->axis[d].center);
-          if (nrrdCenterCell == center)
-            info->samples[d] = nin->axis[d].size*scalings[d];
-          else
-            info->samples[d] = (nin->axis[d].size - 1)*scalings[d] + 1;
-        }
-    }
-}
Index: gcc/testsuite/gcc.c-torture/compile/pr48596.c
===
--- gcc/testsuite/gcc.c-torture/compile/pr48596.c	(revision 0)
+++ gcc/testsuite/gcc.c-torture/compile/pr48596.c	(revision 0)
@@ -0,0 +1,31 @@
+/* PR target/48596  */
+/* { dg-do compile } */
+/* { dg-options "-O1" } */
+
+enum { nrrdCenterUnknown, nrrdCenterNode, nrrdCenterCell, nrrdCenterLast };
+typedef struct { int size; int center; }  NrrdAxis;
+typedef struct { int dim; NrrdAxis axis[10]; } Nrrd;
+typedef struct { } NrrdKernel;
+typedef struct { const NrrdKernel *kernel[10]; int samples[10]; } Info;
+
+void
+foo (Nrrd *nout, Nrrd *nin, const NrrdKernel *kernel, const double *parm,
+     const int *samples, const double *scalings)
+{
+  Info *info;
+  int d, p, np, center;
+  for (d=0; d<nin->dim; d++)
+    {
+      info->kernel[d] = kernel;
+      if (samples)
+        info->samples[d] = samples[d];
+      else
+        {
+          center = _nrrdCenter(nin->axis[d].center);
+          if (nrrdCenterCell == center)
+            info->samples[d] = nin->axis[d].size*scalings[d];
+          else
+            info->samples[d] = (nin->axis[d].size - 1)*scalings[d] + 1;
+        }
+    }
+}


Re: PATCH: Properly check mode for x86 call/jmp address

2012-03-11 Thread Uros Bizjak
On Sat, Mar 10, 2012 at 5:05 PM, H.J. Lu hjl.to...@gmail.com wrote:

  (define_insn *call
 -  [(call (mem:QI (match_operand:P 0 call_insn_operand czw))
 +  [(call (mem:QI (match_operand:C 0 call_insn_operand czw))
        (match_operand 1  ))]
 -  !SIBLING_CALL_P (insn)
 +  !SIBLING_CALL_P (insn)
 +    (GET_CODE (operands[0]) == SYMBOL_REF
 +       || GET_MODE (operands[0]) == word_mode)

 There are enough copies of this extra constraint that I wonder
 if it simply ought to be folded into call_insn_operand.

 Which would need to be changed to define_special_predicate,
 since you'd be doing your own mode checking.

 Probably similar changes to sibcall_insn_operand.

 Here is the updated patch.  I changed constant_call_address_operand
 and call_register_no_elim_operand to use define_special_predicate.
 OK for trunk?

 Please do not complicate matters that much. Just stick word_mode
 overrides for register operands in predicates.md, like in attached
 patch. These changed predicates now allow registers only in word_mode
 (and VOIDmode).

 You can now remove all new mode iterators and leave call patterns 
 untouched.

 @@ -22940,14 +22940,18 @@ ix86_expand_call (rtx retval, rtx fnaddr,
 rtx callarg1,
        GET_CODE (XEXP (fnaddr, 0)) == SYMBOL_REF
        !local_symbolic_operand (XEXP (fnaddr, 0), VOIDmode))
     fnaddr = gen_rtx_MEM (QImode, construct_plt_address (XEXP (fnaddr, 
 0)));
 -  else if (sibcall
 -          ? !sibcall_insn_operand (XEXP (fnaddr, 0), Pmode)
 -          : !call_insn_operand (XEXP (fnaddr, 0), Pmode))
 +  else if (!(constant_call_address_operand (XEXP (fnaddr, 0), Pmode)
 +            || call_register_no_elim_operand (XEXP (fnaddr, 0),
 +                                              word_mode)
 +            || (!sibcall
 +                 !TARGET_X32
 +                 memory_operand (XEXP (fnaddr, 0), word_mode
     {
       fnaddr = XEXP (fnaddr, 0);
 -      if (GET_MODE (fnaddr) != Pmode)
 -       fnaddr = convert_to_mode (Pmode, fnaddr, 1);
 -      fnaddr = gen_rtx_MEM (QImode, copy_to_mode_reg (Pmode, fnaddr));
 +      if (GET_MODE (fnaddr) != word_mode)
 +       fnaddr = convert_to_mode (word_mode, fnaddr, 1);
 +      fnaddr = gen_rtx_MEM (QImode,
 +                           copy_to_mode_reg (word_mode, fnaddr));
     }

   vec_len = 0;

 Please update the above part. It looks like you don't even have to
 change the condition with the new predicates. Basically, you should only
 convert the address to word_mode instead of Pmode.

 +  if (TARGET_X32)
 +    operands[0] = convert_memory_address (word_mode, operands[0]);

 This addition to indirect_jump and tablejump should be the only
 change needed in i386.md now. Please write the condition

 if (Pmode != word_mode)

 for consistency.

 BTW: The attached patch was bootstrapped and regression tested on
 x86_64-pc-linux-gnu {,-m32}.

 Uros.

 It doesn't work:

 x.i:7:1: error: unrecognizable insn:
 (call_insn/j 8 7 9 3 (call (mem:QI (reg:DI 62) [0 *foo.0_1 S1 A8])
        (const_int 0 [0])) x.i:6 -1
     (nil)
    (nil))
 x.i:7:1: internal compiler error: in extract_insn, at recog.c:2123
 Please submit a full bug report,
 with preprocessed source if appropriate.
 See http://gcc.gnu.org/bugs.html for instructions.
 make: *** [x.s] Error 1

 I will investigate it.

 For reference, attached is the complete patch that uses
 define_special_predicate. This patch works OK with the current
 mainline, with additional patch to i386.h, where

 Index: i386.h
 ===
 --- i386.h      (revision 185079)
 +++ i386.h      (working copy)
 @@ -1744,7 +1744,7 @@
  /* Specify the machine mode that pointers have.
    After generation of rtl, the compiler makes no further distinction
    between pointers and any other objects of this machine mode.  */
 -#define Pmode (TARGET_64BIT ? DImode : SImode)
 +#define Pmode (TARGET_LP64 ? DImode : SImode)

  /* A C expression whose value is zero if pointers that need to be extended
    from being `POINTER_SIZE' bits wide to `Pmode' are sign-extended and

 I tested this patch and it passed all my x32 tests.
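
For readers unfamiliar with the distinction that TARGET_LP64 draws here: on x32 the target is 64-bit, but pointers are 32 bits wide. The sketch below (function names are hypothetical) identifies the ABI at compile time from the predefined macros __x86_64__, __ILP32__ and __i386__, and reports the pointer width. It is an illustration of the user-visible side of the Pmode/word_mode split, not code from the patch.

```c
#include <limits.h>
#include <stddef.h>

/* Name the x86 ABI the translation unit was compiled for.
   x32 defines both __x86_64__ and __ILP32__.  */
const char *
x86_abi_name (void)
{
#if defined(__x86_64__) && defined(__ILP32__)
  return "x32";
#elif defined(__x86_64__)
  return "x86-64 (LP64)";
#elif defined(__i386__)
  return "ia32";
#else
  return "non-x86";
#endif
}

/* Width of a data pointer in bits on this target.  */
int
pointer_bits (void)
{
  return (int) (sizeof (void *) * CHAR_BIT);
}
```

On an x32 build this reports "x32" together with 32-bit pointers, which is the case where an address has to be widened to word_mode before an indirect call or jump.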

Committed to mainline with the following ChangeLog:

2012-03-11  H.J. Lu  hongjiu...@intel.com
Uros Bizjak  ubiz...@gmail.com

* config/i386/predicates.md (call_insn_operand): Allow
constant_call_address_operand in Pmode only.
(sibcall_insn_operand): Ditto.
* config/i386/i386.md (*call): Use W mode iterator instead of P mode.
(*call_vzeroupper): Ditto.
(*sibcall): Ditto.
(*sibcall_vzeroupper): Ditto.
(*call_value): Ditto.
(*call_value_vzeroupper): Ditto.
(*sibcall_value): Ditto.
(*sibcall_value_vzeroupper): Ditto.
(*indirect_jump): Ditto.
(*tablejump_1): Ditto.
(indirect_jump): Convert memory address to word mode for TARGET_X32.
(tablejump): Ditto.
* config/i386/i386.c (ix86_expand_call): Convert indirect operands

[patch, RFA] -no-integrated-cpp documentation

2012-03-11 Thread Sandra Loosemore
While I've been cleaning up invoke.texi I noticed that the blurb about 
-no-integrated-cpp needed some copy-editing and markup changes.  Then I 
noticed that the description didn't make a whole lot of sense, and that 
it talked about what might happen in the hypothetical case that 
cc1/cc1plus/cc1obj are merged, which I think only further confused 
things.  And, I further noticed that this option was documented with the 
C Dialect Options instead of the Preprocessor Options, which is where 
users might be most likely to look for it.


I dug up the original discussion that led to this option being added 
back in 2003 -- it's here:


http://gcc.gnu.org/ml/gcc/2002-12/msg01163.html

Based on that and reading the code, I've tried to rewrite the 
documentation so it makes more sense.  Did I get this right?  If I'm 
understanding the intended purpose of this option correctly, it sounds 
like a really convoluted hack and maybe not what the manual ought to 
recommend.  (If you really want to do stuff with the preprocessed code 
before compiling it, why not just write a makefile rule or a shell 
script to use as your $(CC)?)  But, I think we have a gazillion other 
useless options too, and it's probably more trouble to remove than it's 
worth.


Anyway, I'd appreciate another pair of eyes looking at this, and 
suggestions on what better to do here if this rewrite isn't adequate.
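
To illustrate the kind of tool the rewritten text has in mind, here is a sketch of the argument check a program used with -wrapper (or a replacement front end found via -B) might perform to tell the two passes apart. It assumes, as can be observed with -v, that the driver passes -E to the preprocessing invocation and -fpreprocessed to the compiling one; the enum and function names are hypothetical.

```c
#include <string.h>

/* Which kind of front-end invocation an argument vector describes.  */
enum cc1_pass
{
  CC1_PREPROCESS_ONLY,       /* first pass under -no-integrated-cpp */
  CC1_COMPILE_PREPROCESSED,  /* second pass, input already preprocessed */
  CC1_INTEGRATED             /* default: preprocess and compile at once */
};

/* Classify a cc1/cc1plus/cc1obj command line by scanning its
   arguments (argv[0] is the program name and is skipped).  */
enum cc1_pass
classify_cc1_pass (int argc, char *const argv[])
{
  int saw_E = 0;
  int saw_fpreprocessed = 0;
  int i;

  for (i = 1; i < argc; i++)
    {
      if (strcmp (argv[i], "-E") == 0)
        saw_E = 1;
      else if (strcmp (argv[i], "-fpreprocessed") == 0)
        saw_fpreprocessed = 1;
    }

  if (saw_E)
    return CC1_PREPROCESS_ONLY;
  if (saw_fpreprocessed)
    return CC1_COMPILE_PREPROCESSED;
  return CC1_INTEGRATED;
}
```

A wrapper could run its extra processing on the intermediate file when it sees the second kind of invocation and then exec the real front end unchanged.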


-Sandra


2012-03-11  Sandra Loosemore  san...@codesourcery.com

gcc/
* doc/invoke.texi (Option Summary): Move -no-integrated-cpp
from C Language Options to Preprocessor Options.
(C Dialect Options): Move -no-integrated-cpp documentation
from here...
(Preprocessor Options): ...to here.  Rewrite the description
so it makes more sense, and remove discussion of merging
front ends.


Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi	(revision 185168)
+++ gcc/doc/invoke.texi	(working copy)
@@ -174,7 +174,7 @@ in the following sections.
 -aux-info @var{filename} -fallow-parameterless-variadic-functions @gol
 -fno-asm  -fno-builtin  -fno-builtin-@var{function} @gol
 -fhosted  -ffreestanding -fopenmp -fms-extensions -fplan9-extensions @gol
--trigraphs  -no-integrated-cpp  -traditional  -traditional-cpp @gol
+-trigraphs  -traditional  -traditional-cpp @gol
 -fallow-single-precision  -fcond-mismatch -flax-vector-conversions @gol
 -fsigned-bitfields  -fsigned-char @gol
 -funsigned-bitfields  -funsigned-char}
@@ -433,7 +433,7 @@ Objective-C and Objective-C++ Dialects}.
 -M  -MM  -MF  -MG  -MP  -MQ  -MT  -nostdinc  @gol
 -P  -fdebug-cpp -ftrack-macro-expansion -fworking-directory @gol
 -remap -trigraphs  -undef  -U@var{macro}  @gol
--Wp,@var{option} -Xpreprocessor @var{option}}
+-Wp,@var{option} -Xpreprocessor @var{option} -no-integrated-cpp}
 
 @item Assembler Option
 @xref{Assembler Options,,Passing Options to the Assembler}.
@@ -1794,17 +1794,6 @@ supported for C, not C++.
 Support ISO C trigraphs.  The @option{-ansi} option (and @option{-std}
 options for strict ISO C conformance) implies @option{-trigraphs}.
 
-@item -no-integrated-cpp
-@opindex no-integrated-cpp
-Performs a compilation in two passes: preprocessing and compiling.  This
-option allows a user supplied cc1, cc1plus, or cc1obj via the
-@option{-B} option.  The user supplied compilation step can then add in
-an additional preprocessing step after normal preprocessing but before
-compiling.  The default is to use the integrated cpp (internal cpp)
-
-The semantics of this option will change if cc1, cc1plus, and
-cc1obj are merged.
-
 @cindex traditional C language
 @cindex C language, traditional
 @item -traditional
@@ -9300,6 +9289,21 @@ recognize.
 
 If you want to pass an option that takes an argument, you must use
 @option{-Xpreprocessor} twice, once for the option and once for the argument.
+
+@item -no-integrated-cpp
+@opindex no-integrated-cpp
+Perform preprocessing as a separate pass before compilation.
+By default, GCC performs preprocessing as an integrated part of
+input tokenization and parsing.
+If this option is provided, the appropriate language front end
+(@command{cc1}, @command{cc1plus}, or @command{cc1obj} for C, C++,
+and Objective-C, respectively) is instead invoked twice,
+once for preprocessing only and once for actual compilation
+of the preprocessed input.
+This option may be useful in conjunction with the @option{-B} or
+@option{-wrapper} options to specify an alternate preprocessor or
+perform additional processing of the program source between
+normal preprocessing and compilation.
 @end table
 
 @include cppopts.texi


Re: [PATCH 07/10] addr32: Use word_mode instead of Pmode in loop expand

2012-03-11 Thread Uros Bizjak
On Sun, Mar 11, 2012 at 2:06 AM, H.J. Lu hjl.to...@gmail.com wrote:
 On Thu, Mar 8, 2012 at 3:22 AM, Uros Bizjak ubiz...@gmail.com wrote:
 On Fri, Mar 2, 2012 at 10:02 PM, H.J. Lu hongjiu...@intel.com wrote:

 This patch uses word_mode instead of Pmode in loop expand since
 word_mode may have bigger size than Pmode.  OK for trunk?

 Thanks.

 H.J.
 ---
 2012-03-02  H.J. Lu  hongjiu...@intel.com

        * config/i386/i386.c (ix86_expand_movmem): Use word_mode instead
        of Pmode on loop.
        (ix86_expand_setmem): Likewise.

 Jan, can you please comment on the changes in this patch?


 Here is a complete updated patch to use word_mode in ix86_expand_movmem
 and ix86_expand_setmem.  It also fixes ix86_zero_extend_to_Pmode to handle
 Pmode != DImode.  OK for trunk?

Please rewrite ix86_zero_extend_to_Pmode to something like:
  if (GET_MODE (exp) != Pmode)
    exp = convert_to_mode (Pmode, exp, 1);
  return force_reg (Pmode, exp);

Uros.
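
The last argument of convert_to_mode in the snippet above requests an unsigned (zero-extending) conversion. The following standalone sketch, with hypothetical function names, shows why that matters for addresses: a 32-bit address with its top bit set must not be sign-extended when it is widened to 64 bits.

```c
#include <stdint.h>

/* Widen a 32-bit address the way an unsigned conversion does:
   the upper 32 bits of the result are zero.  */
uint64_t
zero_extend_address (uint32_t addr)
{
  return (uint64_t) addr;
}

/* For contrast: a signed conversion replicates bit 31 into the upper
   half.  The cast to int32_t is implementation-defined for values
   above INT32_MAX; GCC wraps modulo 2^32.  */
uint64_t
sign_extend_address (uint32_t addr)
{
  return (uint64_t) (int64_t) (int32_t) addr;
}
```

For an address below 2 GiB the two agree; above that, only the zero-extended value still names the same location.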


Re: [patch, RFA] -no-integrated-cpp documentation

2012-03-11 Thread Gerald Pfeifer

On Sun, 11 Mar 2012, Sandra Loosemore wrote:
Anyway, I'd appreciate another pair of eyes looking at this, and 
suggestions on what better to do here if this rewrite isn't adequate.


Looks good to me, but better wait for Joseph's take.

Gerald




Re: PATCH: Check ptr_mode and use Pmode in ix86_trampoline_init

2012-03-11 Thread Uros Bizjak
On Sun, Mar 11, 2012 at 2:18 AM, H.J. Lu hongjiu...@intel.com wrote:
 Hi,

 x86 trampoline depends on ptr_mode.  This patch checks ptr_mode, instead
 of TARGET_X32.  Also we should use Pmode for address mode.  Tested on
 Linux/x86-64.  OK for trunk?

Why are we looking at ptr_mode here?

Uros.


New Swedish PO file for 'gcc' (version 4.7-b20120128)

2012-03-11 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the Swedish team of translators.  The file is available at:

http://translationproject.org/latest/gcc/sv.po

(This file, 'gcc-4.7-b20120128.sv.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

http://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

http://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.
coordina...@translationproject.org



Re: [PATCH 02/10] addr32: Only handle zero-extended DImode addresses

2012-03-11 Thread H.J. Lu
On Fri, Mar 9, 2012 at 10:15 AM, Uros Bizjak ubiz...@gmail.com wrote:
 On Fri, Mar 9, 2012 at 4:26 PM, H.J. Lu hjl.to...@gmail.com wrote:
 On Thu, Mar 8, 2012 at 7:20 AM, Uros Bizjak ubiz...@gmail.com wrote:
 On Sun, Mar 4, 2012 at 9:13 PM, Uros Bizjak ubiz...@gmail.com wrote:

 We only need to handle zero-extended addresses in DImode.
 OK for trunk?

 2012-03-02  H.J. Lu  hongjiu...@intel.com

        * config/i386/i386.c (ix86_print_operand_address): Only handle
        zero-extended DImode addresses.

 OK.

 The patch was reverted due to PR target/52530.


 Revert breaks Pmode == SImode for x32.  Here is a different patch.
 It checks Pmode == DImode instead of TARGET_64BIT.  Tested on
 Linux/x32.  OK for trunk?

 This will still emit i.e. leal 1(%rSImode), %rSImode on Pmode ==
 SImode targets, so you win nothing really.

 Attached patch finally decouples LEA operand handling from generic
 address handling, and by introducing %E operand modifier, we are able
 to always emit DImode registers for LEAs (which is good anyway to
 avoid unnecessary addr32 prefixes). Luckily, the leal 1(%rSImode),
 %rSImode triggered some unknown problem with Sun assembler, so we
 were able to detect the problem.

 I would like to point out that the patched compiler now also emits
 address registers in their natural mode (modulo zero-extended RTXes)
 and fixes following failure on Pmode == SImode targets:

 --cut here--
 struct foo
 {
  int *f;
  int i;
 };

 void
 __attribute__ ((noinline))
 bar (struct foo x)
 {
  *(x.f) = 1;
 }
 --cut here--

 For Pmode == SImode, the compiler emitted (%rdi) address, which was
 wrong, since i was passed in the high part of (%rdi) register.

 2012-03-09  Uros Bizjak  ubiz...@gmail.com

        PR target/52530
        * config/i386/i386.c (ix86_print_operand): Handle 'E' operand modifier.
        (ix86_print_operand_address): Handle UNSPEC_LEA_ADDR. Do not fallback
        to set code to 'q'.
        * config/i386/i386.md (UNSPEC_LEA_ADDR): New unspec.
        (*movdi_internal_rex64): Use %E operand modifier for lea.
        (*movsi_internal): Ditto.
        (*lea_1): Ditto.
        (*leamode_2): Ditto.
        (*lea_{3,4,5,6}_zext): Ditto.
        (*tls_global_dynamic_32_gnu): Ditto.
        (*tls_global_dynamic_64): Ditto.
        (*tls_dynamic_gnu2_lea_32): Ditto.
        (*tls_dynamic_gnu2_lea_64): Ditto.
        (pro_epilogue_adjust_stack_mode_add): Ditto.

 Patch was tested on x86_64-pc-linux-gnu {,-m32}. I have also eyeballed
 x32 code (Pmode == SImode) and found no problems.

 Committed to mainline SVN.

 H.J., can you please construct a runtime test from the above example code?

 Uros.

It passed all my x32 tests.

Thanks.

-- 
H.J.


Re: [PATCH 07/10] addr32: Use word_mode instead of Pmode in loop expand

2012-03-11 Thread Uros Bizjak
On Sun, Mar 11, 2012 at 3:30 PM, Uros Bizjak ubiz...@gmail.com wrote:

 This patch uses word_mode instead of Pmode in loop expand since
 word_mode may have bigger size than Pmode.  OK for trunk?

 Thanks.

 H.J.
 ---
 2012-03-02  H.J. Lu  hongjiu...@intel.com

        * config/i386/i386.c (ix86_expand_movmem): Use word_mode instead
        of Pmode on loop.
        (ix86_expand_setmem): Likewise.

 Jan, can you please comment on the changes in this patch?


 Here is a complete updated patch to use word_mode in ix86_expand_movmem
 and ix86_expand_setmem.  It also fixes ix86_zero_extend_to_Pmode to handle
 Pmode != DImode.  OK for trunk?

 Please rewrite ix86_zero_extend_to_Pmode to something like:
  rtx tmp;
  if (GET_MODE (exp) != Pmode)
    tmp = convert_to_mode (Pmode, exp, 1);
  return force_reg (Pmode, tmp);

I am testing attached patch:

2012-03-11  Uros Bizjak  ubiz...@gmail.com

* config/i386/i386.c (ix86_zero_extend_to_Pmode): Rewrite using
convert_to_mode.
(ix86_expand_call): Use force_reg instead of copy_to_mode_reg.

Uros.


 Uros.
Index: i386.c
===
--- i386.c  (revision 185193)
+++ i386.c  (working copy)
@@ -21025,14 +21025,9 @@ ix86_adjust_counter (rtx countreg, HOST_WIDE_INT v
 rtx
 ix86_zero_extend_to_Pmode (rtx exp)
 {
-  rtx r;
-  if (GET_MODE (exp) == VOIDmode)
-return force_reg (Pmode, exp);
-  if (GET_MODE (exp) == Pmode)
-return copy_to_mode_reg (Pmode, exp);
-  r = gen_reg_rtx (Pmode);
-  emit_insn (gen_zero_extendsidi2 (r, exp));
-  return r;
+  if (GET_MODE (exp) != Pmode)
+exp = convert_to_mode (Pmode, exp, 1);
+  return force_reg (Pmode, exp);
 }
 
 /* Divide COUNTREG by SCALE.  */
@@ -22996,7 +22991,7 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx call
   fnaddr = XEXP (fnaddr, 0);
   if (GET_MODE (fnaddr) != word_mode)
fnaddr = convert_to_mode (word_mode, fnaddr, 1);
-  fnaddr = gen_rtx_MEM (QImode, copy_to_mode_reg (word_mode, fnaddr));
+  fnaddr = gen_rtx_MEM (QImode, force_reg (word_mode, fnaddr));
 }
 
   vec_len = 0;


Re: PATCH: Check ptr_mode and use Pmode in ix86_trampoline_init

2012-03-11 Thread Uros Bizjak
On Sun, Mar 11, 2012 at 4:52 PM, H.J. Lu hjl.to...@gmail.com wrote:

 x86 trampoline depends on ptr_mode.  This patch checks ptr_mode, instead
 of TARGET_X32.  Also we should use Pmode for address mode.  Tested on
 Linux/x86-64.  OK for trunk?

 Why are we looking at ptr_mode here?


 If ptr_mode is SImode, we can always use movl to reach our target.
 We don't need to check anything else.

Under this assumption, the patch is OK.

Thanks,
Uros.


Re: PATCH: Properly generate X32 IE sequence

2012-03-11 Thread H.J. Lu
On Sat, Mar 10, 2012 at 10:49 AM, H.J. Lu hjl.to...@gmail.com wrote:
 On Sat, Mar 10, 2012 at 5:09 AM, Uros Bizjak ubiz...@gmail.com wrote:
 On Fri, Mar 9, 2012 at 11:26 PM, H.J. Lu hjl.to...@gmail.com wrote:
 On Mon, Mar 5, 2012 at 9:25 AM, Uros Bizjak ubiz...@gmail.com wrote:
 On Mon, Mar 5, 2012 at 6:03 PM, H.J. Lu hjl.to...@gmail.com wrote:

 X86-64 linker optimizes TLS_MODEL_INITIAL_EXEC to TLS_MODEL_LOCAL_EXEC
 by checking

        movq foo@gottpoff(%rip), %reg

 and

        addq foo@gottpoff(%rip), %reg

 It uses the REX prefix to avoid the last byte of the previous
 instruction.  With 32bit Pmode, we may not have the REX prefix and
 the last byte of the previous instruction may be an offset, which
 may look like a REX prefix.  IE-LE optimization will generate corrupted
 binary.  This patch makes sure we always output a REX prefix for
 UNSPEC_GOTNTPOFF.  OK for trunk?

 Actually, linker has:

    case R_X86_64_GOTTPOFF:
      /* Check transition from IE access model:
                mov foo@gottpoff(%rip), %reg
                add foo@gottpoff(%rip), %reg
       */

      /* Check REX prefix first.  */
      if (offset >= 3 && (offset + 4) <= sec->size)
        {
          val = bfd_get_8 (abfd, contents + offset - 3);
          if (val != 0x48 && val != 0x4c)
            {
              /* X32 may have 0x44 REX prefix or no REX prefix.  */
              if (ABI_64_P (abfd))
                return FALSE;
            }
        }
      else
        {
          /* X32 may not have any REX prefix.  */
          if (ABI_64_P (abfd))
            return FALSE;
          if (offset < 2 || (offset + 3) > sec->size)
            return FALSE;
        }

 So, it should handle the case without REX just OK. If it doesn't, then
 this is a bug in binutils.


 The last byte of the displacement in the previous instruction
 may happen to look like a REX byte. In that case, linker
 will overwrite the last byte of the previous instruction and
 generate the wrong instruction sequence.

 I need to update linker to enforce the REX byte check.

 One important observation: if we want to follow the x86_64 TLS spec
 strictly, we have to use existing DImode patterns only. This also
 means that we should NOT convert other TLS patterns to Pmode, since
 they explicitly state movq and addq. If this is not the case, then we
 need new TLS specification for X32.

 Here is a patch to properly generate X32 IE sequence.

 This is the summary of differences between x86-64 TLS and x32 TLS:

                     x86-64                               x32
 GD
    byte 0x66; leaq foo@tlsgd(%rip),%rdi;         leaq foo@tlsgd(%rip),%rdi;
    .word 0x; rex64; call __tls_get_addr@plt      .word 0x; rex64; call __tls_get_addr@plt

 GD-IE optimization
   movq %fs:0,%rax; addq x@gottpoff(%rip),%rax    movl %fs:0,%eax; addq x@gottpoff(%rip),%rax

 GD-LE optimization
   movq %fs:0,%rax; leaq x@tpoff(%rax),%rax       movl %fs:0,%eax; leaq x@tpoff(%rax),%rax

 LD
  leaq foo@tlsld(%rip),%rdi;                      leaq foo@tlsld(%rip),%rdi;
  call __tls_get_addr@plt                         call __tls_get_addr@plt

 LD-LE optimization
  .word 0x; .byte 0x66; movq %fs:0, %rax          nopl 0x0(%rax); movl %fs:0, %eax

 IE
   movq %fs:0,%reg64;                             movl %fs:0,%reg32;
   addq x@gottpoff(%rip),%reg64                   addl x@gottpoff(%rip),%reg32

   or
                                                  Not supported if Pmode == SImode
   movq x@gottpoff(%rip),%reg64;                  movq x@gottpoff(%rip),%reg64;
   movq %fs:(%reg64),%reg32                       movl %fs:(%reg64), %reg32

 IE-LE optimization

   movq %fs:0,%reg64;                             movl %fs:0,%reg32;
   addq x@gottpoff(%rip),%reg64                   addl x@gottpoff(%rip),%reg32

   to

   movq %fs:0,%reg64;                             movl %fs:0,%reg32;
   addq foo@tpoff, %reg64                         addl foo@tpoff, %reg32

   movq %fs:0,%reg64;                             movl %fs:0,%reg32;
   leaq foo@tpoff(%reg64), %reg64                 leal foo@tpoff(%reg32), %reg32

   or

   movq x@gottpoff(%rip),%reg64                   movq x@gottpoff(%rip),%reg64;
   movl %fs:(%reg64),%reg32                       movl %fs:(%reg64), %reg32

   to

   movq foo@tpoff, %reg64                         movq foo@tpoff, %reg64
   movl %fs:(%reg64),%reg32                       movl %fs:(%reg64), %reg32

 LE
   movq %fs:0,%reg64;                             movl %fs:0,%reg32;
   leaq x@tpoff(%reg64),%reg32                    leal x@tpoff(%reg32),%reg32

   or

   movq %fs:0,%reg64;                             movl %fs:0,%reg32;
   addq $x@tpoff,%reg64                           addl $x@tpoff,%reg32

   or

   movq %fs:0,%reg64;                             movl %fs:0,%reg32;
   movl x@tpoff(%reg64),%reg32                    movl x@tpoff(%reg32),%reg32

   or

   movl %fs:x@tpoff,%reg32                        movl %fs:x@tpoff,%reg32


 X32 TLS implementation is 

Re: [Fortran-dev, patch] Use only lbound/extent/sm in the array descriptor

2012-03-11 Thread Thomas Koenig

Hi Tobias,


with that patch, the array descriptor on the fortran-dev branch uses now
the dimension triplet as defined in TS29113. This patch removes
ubound/stride and updates all calls.


Great!



There are still 227 test-suite failures (FAIL lines) affecting 27
test-suite files. That's slightly down from the 269 lines the branch
currently has. (Some issues can be fixed by modifying the tree dump
patterns, but most seem to be real problems.)

Built and regtested on x86-64-linux.
OK for the branch?


The library parts look OK to me.

There is just one point of efficiency.

+#define GFC_DESCRIPTOR_STRIDE(desc,i) \
+  (GFC_DESCRIPTOR_SM(desc,i) / GFC_DESCRIPTOR_SIZE(desc))

In most generated files, GFC_DESCRIPTOR_SIZE is a constant known
at compile-time (and usually a power of two), which means that the
division can be done in a simple shift.  If we get the size from the 
descriptor, we actually have to divide, which is expensive.


I would commit the patch now, adding


TODO:
- Fixing the regressions.
- Cleanup of the library and the front end
- Switching also from ubound -> extent for nondescriptor arrays?
- Properly implement subpointers


- Avoid division for GFC_DESCRIPTOR_STRIDE where possible.

Thomas


Re: PATCH: Properly generate X32 IE sequence

2012-03-11 Thread Uros Bizjak
On Sun, Mar 11, 2012 at 6:11 PM, H.J. Lu hjl.to...@gmail.com wrote:

 X86-64 linker optimizes TLS_MODEL_INITIAL_EXEC to TLS_MODEL_LOCAL_EXEC
 by checking

        movq foo@gottpoff(%rip), %reg

 and

        addq foo@gottpoff(%rip), %reg

 It uses the REX prefix to avoid the last byte of the previous
 instruction.  With 32bit Pmode, we may not have the REX prefix and
 the last byte of the previous instruction may be an offset, which
 may look like a REX prefix.  IE-LE optimization will generate 
 corrupted
 binary.  This patch makes sure we always output a REX prefix for
 UNSPEC_GOTNTPOFF.  OK for trunk?

 Actually, linker has:

    case R_X86_64_GOTTPOFF:
      /* Check transition from IE access model:
                mov foo@gottpoff(%rip), %reg
                add foo@gottpoff(%rip), %reg
       */

      /* Check REX prefix first.  */
      if (offset >= 3 && (offset + 4) <= sec->size)
        {
          val = bfd_get_8 (abfd, contents + offset - 3);
          if (val != 0x48 && val != 0x4c)
            {
              /* X32 may have 0x44 REX prefix or no REX prefix.  */
              if (ABI_64_P (abfd))
                return FALSE;
            }
        }
      else
        {
          /* X32 may not have any REX prefix.  */
          if (ABI_64_P (abfd))
            return FALSE;
          if (offset < 2 || (offset + 3) > sec->size)
            return FALSE;
        }

 So, it should handle the case without REX just OK. If it doesn't, then
 this is a bug in binutils.


 The last byte of the displacement in the previous instruction
 may happen to look like a REX byte. In that case, linker
 will overwrite the last byte of the previous instruction and
 generate the wrong instruction sequence.

 I need to update linker to enforce the REX byte check.

 One important observation: if we want to follow the x86_64 TLS spec
 strictly, we have to use existing DImode patterns only. This also
 means that we should NOT convert other TLS patterns to Pmode, since
 they explicitly state movq and addq. If this is not the case, then we
 need new TLS specification for X32.

 Here is a patch to properly generate X32 IE sequence.

 This is the summary of differences between x86-64 TLS and x32 TLS:

                     x86-64                               x32
 GD
    byte 0x66; leaq foo@tlsgd(%rip),%rdi;         leaq foo@tlsgd(%rip),%rdi;
    .word 0x; rex64; call __tls_get_addr@plt  .word 0x; rex64;
 call __tls_get_addr@plt

 GD-IE optimization
   movq %fs:0,%rax; addq x@gottpoff(%rip),%rax    movl %fs:0,%eax;
 addq x@gottpoff(%rip),%rax

 GD-LE optimization
   movq %fs:0,%rax; leaq x@tpoff(%rax),%rax       movl %fs:0,%eax;
 leaq x@tpoff(%rax),%rax

 LD
  leaq foo@tlsld(%rip),%rdi;                      leaq foo@tlsld(%rip),%rdi;
  call __tls_get_addr@plt                         call __tls_get_addr@plt

 LD-LE optimization
  .word 0x; .byte 0x66; movq %fs:0, %rax      nopl 0x0(%rax); movl
 %fs:0, %eax

 IE
   movq %fs:0,%reg64;                             movl %fs:0,%reg32;
   addq x@gottpoff(%rip),%reg64                   addl 
 x@gottpoff(%rip),%reg32

   or
                                                  Not supported if
 Pmode == SImode
   movq x@gottpoff(%rip),%reg64;                  movq 
 x@gottpoff(%rip),%reg64;
   movq %fs:(%reg64),%reg32                       movl %fs:(%reg64), %reg32

 IE-LE optimization

   movq %fs:0,%reg64;                             movl %fs:0,%reg32;
   addq x@gottpoff(%rip),%reg64                   addl 
 x@gottpoff(%rip),%reg32

   to

   movq %fs:0,%reg64;                             movl %fs:0,%reg32;
   addq foo@tpoff, %reg64                         addl foo@tpoff, %reg32

   movq %fs:0,%reg64;                             movl %fs:0,%reg32;
   leaq foo@tpoff(%reg64), %reg64                 leal foo@tpoff(%reg32), 
 %reg32

   or

   movq x@gottpoff(%rip),%reg64                   movq 
 x@gottpoff(%rip),%reg64;
   movl %fs:(%reg64),%reg32                       movl %fs:(%reg64), %reg32

   to

   movq foo@tpoff, %reg64                         movq foo@tpoff, %reg64
   movl %fs:(%reg64),%reg32                       movl %fs:(%reg64), %reg32

 LE
   movq %fs:0,%reg64;                             movl %fs:0,%reg32;
   leaq x@tpoff(%reg64),%reg32                    leal 
 x@tpoff(%reg32),%reg32

   or

   movq %fs:0,%reg64;                             movl %fs:0,%reg32;
   addq $x@tpoff,%reg64                           addl $x@tpoff,%reg32

   or

   movq %fs:0,%reg64;                             movl %fs:0,%reg32;
   movl x@tpoff(%reg64),%reg32                    movl 
 x@tpoff(%reg32),%reg32

   or

   movl %fs:x@tpoff,%reg32                        movl %fs:x@tpoff,%reg32


 X32 TLS implementation is straight forward, except for IE:

 1. Since address override works only on the (reg32) part in fs:(reg32),
 we can't use it as memory operand.  This patch changes 
 ix86_decompose_address
 to disallow  fs:(reg) if Pmode != word_mode.
 2. When Pmode == SImode, there may be no REX 

Re: PATCH: Properly generate X32 IE sequence

2012-03-11 Thread H.J. Lu
On Sun, Mar 11, 2012 at 10:55 AM, Uros Bizjak ubiz...@gmail.com wrote:
 On Sun, Mar 11, 2012 at 6:11 PM, H.J. Lu hjl.to...@gmail.com wrote:

 X86-64 linker optimizes TLS_MODEL_INITIAL_EXEC to TLS_MODEL_LOCAL_EXEC
 by checking

        movq foo@gottpoff(%rip), %reg

 and

        addq foo@gottpoff(%rip), %reg

 It uses the REX prefix to avoid the last byte of the previous
 instruction.  With 32bit Pmode, we may not have the REX prefix and
 the last byte of the previous instruction may be an offset, which
 may look like a REX prefix.  IE-LE optimization will generate 
 corrupted
 binary.  This patch makes sure we always output a REX prefix for
 UNSPEC_GOTNTPOFF.  OK for trunk?

 Actually, linker has:

    case R_X86_64_GOTTPOFF:
      /* Check transition from IE access model:
                mov foo@gottpoff(%rip), %reg
                add foo@gottpoff(%rip), %reg
       */

      /* Check REX prefix first.  */
      if (offset >= 3 && (offset + 4) <= sec->size)
        {
          val = bfd_get_8 (abfd, contents + offset - 3);
          if (val != 0x48 && val != 0x4c)
            {
              /* X32 may have 0x44 REX prefix or no REX prefix.  */
              if (ABI_64_P (abfd))
                return FALSE;
            }
        }
      else
        {
          /* X32 may not have any REX prefix.  */
          if (ABI_64_P (abfd))
            return FALSE;
          if (offset < 2 || (offset + 3) > sec->size)
            return FALSE;
        }

 So, it should handle the case without REX just OK. If it doesn't, then
 this is a bug in binutils.


 The last byte of the displacement in the previous instruction
 may happen to look like a REX byte. In that case, linker
 will overwrite the last byte of the previous instruction and
 generate the wrong instruction sequence.

 I need to update linker to enforce the REX byte check.

 One important observation: if we want to follow the x86_64 TLS spec
 strictly, we have to use existing DImode patterns only. This also
 means that we should NOT convert other TLS patterns to Pmode, since
 they explicitly state movq and addq. If this is not the case, then we
 need new TLS specification for X32.

 Here is a patch to properly generate X32 IE sequence.

 This is the summary of differences between x86-64 TLS and x32 TLS:

                     x86-64                               x32
 GD
    byte 0x66; leaq foo@tlsgd(%rip),%rdi;         leaq 
 foo@tlsgd(%rip),%rdi;
    .word 0x; rex64; call __tls_get_addr@plt  .word 0x; rex64;
 call __tls_get_addr@plt

 GD-IE optimization
   movq %fs:0,%rax; addq x@gottpoff(%rip),%rax    movl %fs:0,%eax;
 addq x@gottpoff(%rip),%rax

 GD-LE optimization
   movq %fs:0,%rax; leaq x@tpoff(%rax),%rax       movl %fs:0,%eax;
 leaq x@tpoff(%rax),%rax

 LD
  leaq foo@tlsld(%rip),%rdi;                      leaq 
 foo@tlsld(%rip),%rdi;
  call __tls_get_addr@plt                         call __tls_get_addr@plt

 LD-LE optimization
  .word 0x; .byte 0x66; movq %fs:0, %rax      nopl 0x0(%rax); movl
 %fs:0, %eax

 IE
   movq %fs:0,%reg64;                             movl %fs:0,%reg32;
   addq x@gottpoff(%rip),%reg64                   addl 
 x@gottpoff(%rip),%reg32

   or
                                                  Not supported if
 Pmode == SImode
   movq x@gottpoff(%rip),%reg64;                  movq 
 x@gottpoff(%rip),%reg64;
   movq %fs:(%reg64),%reg32                       movl %fs:(%reg64), %reg32

 IE-LE optimization

   movq %fs:0,%reg64;                             movl %fs:0,%reg32;
   addq x@gottpoff(%rip),%reg64                   addl 
 x@gottpoff(%rip),%reg32

   to

   movq %fs:0,%reg64;                             movl %fs:0,%reg32;
   addq foo@tpoff, %reg64                         addl foo@tpoff, %reg32

   movq %fs:0,%reg64;                             movl %fs:0,%reg32;
   leaq foo@tpoff(%reg64), %reg64                 leal foo@tpoff(%reg32), 
 %reg32

   or

   movq x@gottpoff(%rip),%reg64                   movq 
 x@gottpoff(%rip),%reg64;
   movl %fs:(%reg64),%reg32                       movl %fs:(%reg64), %reg32

   to

   movq foo@tpoff, %reg64                         movq foo@tpoff, %reg64
   movl %fs:(%reg64),%reg32                       movl %fs:(%reg64), %reg32

 LE
   movq %fs:0,%reg64;                             movl %fs:0,%reg32;
   leaq x@tpoff(%reg64),%reg32                    leal 
 x@tpoff(%reg32),%reg32

   or

   movq %fs:0,%reg64;                             movl %fs:0,%reg32;
   addq $x@tpoff,%reg64                           addl $x@tpoff,%reg32

   or

   movq %fs:0,%reg64;                             movl %fs:0,%reg32;
   movl x@tpoff(%reg64),%reg32                    movl 
 x@tpoff(%reg32),%reg32

   or

   movl %fs:x@tpoff,%reg32                        movl %fs:x@tpoff,%reg32


 X32 TLS implementation is straight forward, except for IE:

 1. Since address override works only on the (reg32) part in fs:(reg32),
 we can't use it as memory operand.  This patch changes 
 ix86_decompose_address
 to disallow  

Re: PATCH: Properly generate X32 IE sequence

2012-03-11 Thread Uros Bizjak
On Sun, Mar 11, 2012 at 7:16 PM, H.J. Lu hjl.to...@gmail.com wrote:

        * config/i386/i386.c (ix86_decompose_address): Disallow fs:(reg)
        if Pmode != word_mode.
        (legitimize_tls_address): Call gen_tls_initial_exec_x32 if
        Pmode == SImode for x32.

        * config/i386/i386.md (UNSPEC_TLS_IE_X32): New.
        (tls_initial_exec_x32): Likewise.

 Nice solution!

 OK for mainline.

 Done.

 BTW: Did you investigate the issue with memory aliasing?


 It isn't a problem since it is wrapped in UNSPEC_TLS_IE_X32
 which loads address of the TLS symbol.

 Thanks.


 Since we must use reg64 in %fs:(%reg) memory operand like

 movq x@gottpoff(%rip),%reg64;
 mov %fs:(%reg64),%reg

 this patch optimizes x32 TLS IE load and store by wrapping
 %reg64 inside of UNSPEC when Pmode == SImode.  OK for
 trunk?

 I think we should just scrap all these complications and go with the
 idea of clearing MASK_TLS_DIRECT_SEG_REFS.


 I will give it a try.

You can also revert:

* config/i386/i386.c (ix86_decompose_address): Disallow fs:(reg)
if Pmode != word_mode.

then, since this part is handled later in the function.

Uros.


Re: [PATCH 07/10] addr32: Use word_mode instead of Pmode in loop expand

2012-03-11 Thread Uros Bizjak
On Sun, Mar 11, 2012 at 5:56 PM, H.J. Lu hjl.to...@gmail.com wrote:

 This patch uses word_mode instead of Pmode in loop expand since
 word_mode may have bigger size than Pmode.  OK for trunk?

 Thanks.

 H.J.
 ---
 2012-03-02  H.J. Lu  hongjiu...@intel.com

        * config/i386/i386.c (ix86_expand_movmem): Use word_mode instead
        of Pmode on loop.
        (ix86_expand_setmem): Likewise.

 Jan, can you please comment on the changes in this patch?


 Here is a complete updated patch to use word_mode in ix86_expand_movmem
 and ix86_expand_setmem.  It also fixes ix86_zero_extend_to_Pmode to handle
 Pmode != DImode.  OK for trunk?

 Please rewrite ix86_zero_extend_to_Pmode to something like:
  rtx tmp;
  if (GET_MODE (exp) != Pmode)
    tmp = convert_to_mode (Pmode, exp, 1);
  return force_reg (Pmode, tmp);

 I am testing attached patch:

 2012-03-11  Uros Bizjak  ubiz...@gmail.com

        * config/i386/i386.c (ix86_zero_extend_to_Pmode): Rewrite using
        convert_to_mode.
        (ix86_expand_call): Use force_reg instead of copy_to_mode_reg.


 It passed all tests in GCC testsuite under Linux/x32 and glibc x32
 tests.

I have committed the patch without the (ix86_expand_call) change. The
latter change was wrong, since it allowed an arg register in the call
pattern.

Please commit your loop expand patch.

Thanks,
Uros.


Re: [Fortran-dev, patch] Use only lbound/extent/sm in the array descriptor

2012-03-11 Thread Tobias Burnus

Thomas Koenig wrote:

There are still 227 test-suite failures (FAIL lines) affecting 27
test-suite files. That's slightly down from the 269 lines the branch
currently has. (Some issues can be fixed by modifying the tree dump
patterns, but most seem to be real problems.)

Built and regtested on x86-64-linux.
OK for the branch?


The library parts look OK to me.


I have now committed it (Rev. 185199) - thanks for looking at the library part.



There is just one point of efficiency.
+#define GFC_DESCRIPTOR_STRIDE(desc,i) \
+  (GFC_DESCRIPTOR_SM(desc,i) / GFC_DESCRIPTOR_SIZE(desc))

In most generated files, GFC_DESCRIPTOR_SIZE is a constant known
at compile-time (and usually a power of two), which means that the
division can be done in a simple shift.  If we get the size from the 
descriptor, we actually have to divide, which is expensive.


 I think one should also go through all the files and check whether one 
can replace _STRIDE by _SM. In some cases, the code actually does this: 
It calls _STRIDE and later multiplies by the byte size. I tried to avoid 
_STRIDE at some places, but I only looked at it in the context of 
setting the descriptor. Similarly, but less critical: One should check 
whether EXTENT can replace UBOUND.


Side note: c_f_pointer0 (which is called for array with shape) is 
currently broken as dtype is not set before the call - and thus the 
size is not known, which is required for setting the sm. I think the 
best would be to replace it by inline code. (That's the reason behind 5 
of the 24 failing test-case files.)



I would commit the patch now, adding


TODO:
- Fixing the regressions.
- Cleanup of the library and the front end
- Switching also from ubound -> extent for nondescriptor arrays?
- Properly implement subpointers

- Avoid division for GFC_DESCRIPTOR_STRIDE where possible.


Well, that's what I meant by cleanup: Trying to update the usage such 
that one avoids ubound/stride when extent/sm are required - and doing 
other optimizations like that one. Maybe one can also get rid of some of 
the macros if they are unused.


In any case, I would be happy if you could have a look.

I think the real fun will start when we have to implement the other 
parts (esp. lower_bound semantic, elem_len and in particular the type 
system of TS29113).


Tobias


Re: [PATCH 02/10] addr32: Only handle zero-extended DImode addresses

2012-03-11 Thread Uros Bizjak
On Fri, Mar 9, 2012 at 6:58 PM, Uros Bizjak ubiz...@gmail.com wrote:

 I would like to point out that the patched compiler now also emits
 address registers in their natural mode (modulo zero-extended RTXes)
 and fixes following failure on Pmode == SImode targets:

 --cut here--
 struct foo
 {
  int *f;
  int i;
 };

 void
 __attribute__ ((noinline))
 bar (struct foo x)
 {
  *(x.f) = 1;
 }
 --cut here--

 For Pmode == SImode, the compiler emitted (%rdi) address, which was
 wrong, since i was passed in the high part of (%rdi) register.

The following patch adds a torture test that checks for this problem.

2012-03-11  Uros Bizjak  ubiz...@gmail.com

PR target/52530
* gcc.dg/torture/pr52530.c: New test.

Tested on x86_64-pc-linux-gnu {,-m32}, committed to mainline.

Uros.
Index: gcc.dg/torture/pr52530.c
===
--- gcc.dg/torture/pr52530.c(revision 0)
+++ gcc.dg/torture/pr52530.c(revision 0)
@@ -0,0 +1,30 @@
+/* { dg-do run } */
+
+extern void abort (void);
+
+struct foo
+{
+ int *f;
+ int i;
+};
+
+int baz;
+
+void __attribute__ ((noinline))
+bar (struct foo x)
+{
+ *(x.f) = x.i;
+}
+
+int
+main ()
+{
+  struct foo x = { &baz, 0xdeadbeef };
+
+  bar (x);
+
+  if (baz != 0xdeadbeef)
+abort ();
+
+  return 0;
+}


Re: PATCH: Properly generate X32 IE sequence

2012-03-11 Thread H.J. Lu
On Sun, Mar 11, 2012 at 11:21 AM, Uros Bizjak ubiz...@gmail.com wrote:
 On Sun, Mar 11, 2012 at 7:16 PM, H.J. Lu hjl.to...@gmail.com wrote:

        * config/i386/i386.c (ix86_decompose_address): Disallow fs:(reg)
        if Pmode != word_mode.
        (legitimize_tls_address): Call gen_tls_initial_exec_x32 if
        Pmode == SImode for x32.

        * config/i386/i386.md (UNSPEC_TLS_IE_X32): New.
        (tls_initial_exec_x32): Likewise.

 Nice solution!

 OK for mainline.

 Done.

 BTW: Did you investigate the issue with memory aliasing?


 It isn't a problem since it is wrapped in UNSPEC_TLS_IE_X32
 which loads address of the TLS symbol.

 Thanks.


 Since we must use reg64 in %fs:(%reg) memory operand like

 movq x@gottpoff(%rip),%reg64;
 mov %fs:(%reg64),%reg

 this patch optimizes x32 TLS IE load and store by wrapping
 %reg64 inside of UNSPEC when Pmode == SImode.  OK for
 trunk?

 I think we should just scrap all these complications and go with the
 idea of clearing MASK_TLS_DIRECT_SEG_REFS.


 I will give it a try.

 You can also revert:

        * config/i386/i386.c (ix86_decompose_address): Disallow fs:(reg)
        if Pmode != word_mode.

 then, since this part is handled later in the function.


Here is the patch which is equivalent to clearing MASK_TLS_DIRECT_SEG_REFS
when Pmode != word_mode.  We need to keep

  else if (Pmode == SImode)
{
  /* Always generate
movl %fs:0, %reg32
addl xgottpoff(%rip), %reg32
 to support linker IE-LE optimization and avoid
 fs:(%reg32) as memory operand.  */
  dest = gen_reg_rtx (Pmode);
  emit_insn (gen_tls_initial_exec_x32 (dest, x));
  return dest;
}

to support linker IE-LE optimization.  TARGET_TLS_DIRECT_SEG_REFS only affects
TLS LE access and fs:(%reg) is only generated by combine.

So the main impact of disabling TARGET_TLS_DIRECT_SEG_REFS is to disable
fs:immediate memory operand for TLS LE access, which doesn't have any problems
to begin with.

I would prefer to keep TARGET_TLS_DIRECT_SEG_REFS and disable only
fs:(%reg), which is generated by combine.

-- 
H.J.
--
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index b101922..1ffcc85 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -11478,6 +11478,7 @@ ix86_decompose_address (rtx addr, struct ix86_address *out)

case UNSPEC:
  if (XINT (op, 1) == UNSPEC_TP
+  && Pmode == word_mode
   && TARGET_TLS_DIRECT_SEG_REFS
   && seg == SEG_DEFAULT)
seg = TARGET_64BIT ? SEG_FS : SEG_GS;
@@ -11534,11 +11535,6 @@ ix86_decompose_address (rtx addr, struct ix86_address *out)
   else
 disp = addr;   /* displacement */

-  /* Since address override works only on the (reg32) part in fs:(reg32),
- we can't use it as memory operand.  */
-  if (Pmode != word_mode && seg == SEG_FS && (base || index))
-return 0;
-
   if (index)
 {
   if (REG_P (index))
@@ -12706,7 +12702,9 @@ legitimize_tls_address (rtx x, enum tls_model model, bool for_mov)

   if (TARGET_64BIT || TARGET_ANY_GNU_TLS)
{
- base = get_thread_pointer (for_mov || !TARGET_TLS_DIRECT_SEG_REFS);
+ base = get_thread_pointer (for_mov
+|| Pmode != word_mode
+|| !TARGET_TLS_DIRECT_SEG_REFS);
  return gen_rtx_PLUS (Pmode, base, off);
}
   else
@@ -13239,7 +13237,7 @@ ix86_delegitimize_tls_address (rtx orig_x)
   rtx x = orig_x, unspec;
   struct ix86_address addr;

-  if (!TARGET_TLS_DIRECT_SEG_REFS)
+  if (Pmode != word_mode || !TARGET_TLS_DIRECT_SEG_REFS)
 return orig_x;
   if (MEM_P (x))
 x = XEXP (x, 0);


Re: PING PATCH: Assert DWARF register size <= saved reg size

2012-03-11 Thread H.J. Lu
On Fri, Mar 2, 2012 at 12:42 PM, H.J. Lu hjl.to...@gmail.com wrote:
 On Fri, Nov 11, 2011 at 11:04 AM, H.J. Lu hongjiu...@intel.com wrote:
 Hi,

 I am working on 32bit Pmode for x32:

 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50797

 It removes all LEAs, which convert 32bit address to 64bit, and uses 0x67
 address prefix instead.  I got 5% speed up in SPEC CPU 2K/2006.

 But assert in _Unwind_SetGRValue:

 gcc_assert (dwarf_reg_size_table[index] == sizeof (_Unwind_Context_Reg_Val));

 failed on return column since init_return_column_size uses Pmode, not
 word_mode.  In this case, _Unwind_Context_Reg_Val is 64bit, but return
 column size is 32bit.  This patch changes it to assert DWARF register
 size <= saved reg size.  OK for trunk?

 Thanks.


 H.J.
 ---
 2011-11-11  H.J. Lu  hongjiu...@intel.com

        * unwind-dw2.c (_Unwind_SetGRValue): Assert DWARF register size
        <= saved reg size.

 diff --git a/libgcc/unwind-dw2.c b/libgcc/unwind-dw2.c
 index 475ad00..db1c757 100644
 --- a/libgcc/unwind-dw2.c
 +++ b/libgcc/unwind-dw2.c
 @@ -294,7 +294,8 @@ _Unwind_SetGRValue (struct _Unwind_Context *context, int index,
  {
   index = DWARF_REG_TO_UNWIND_COLUMN (index);
   gcc_assert (index < (int) sizeof(dwarf_reg_size_table));
 -  gcc_assert (dwarf_reg_size_table[index] == sizeof (_Unwind_Context_Reg_Val));
 +  /* Return column size may be smaller than _Unwind_Context_Reg_Val.  */
 +  gcc_assert (dwarf_reg_size_table[index] <= sizeof (_Unwind_Context_Reg_Val));

   context->by_value[index] = 1;
   context->reg[index] = _Unwind_Get_Unwind_Context_Reg_Val (val);

 Now trunk is in stage 1. Jason, is this OK for trunk?

 Thanks.

Ping.


-- 
H.J.


PATCH: Properly set ix86_gen_leave and ix86_gen_monitor

2012-03-11 Thread H.J. Lu
Hi,

leave_rex64 works on DImode and sse3_monitor64 works on Pmode.  This
patch properly sets ix86_gen_leave and ix86_gen_monitor, depending on
TARGET_64BIT and Pmode.  Tested on Linux/x86-64.  OK for trunk?
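
The selection logic can be sketched in standalone C as follows.  This is
only an illustration under stated assumptions, not the code in i386.c:
the enum, the struct and choose_generators are hypothetical, and the
generator functions are represented by their names as strings (taken
from the diff below) rather than by real function pointers.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the two pointer modes involved.  */
enum pointer_mode { MODE_SI, MODE_DI };

/* Hypothetical record of which generator names were picked.  */
struct gen_choice
{
  const char *leave;
  const char *monitor;
  const char *add3;
};

/* leave depends only on TARGET_64BIT; monitor depends on both
   TARGET_64BIT and Pmode; add3 and the other Pmode-sized helpers
   depend only on Pmode.  */
static struct gen_choice
choose_generators (bool target_64bit, enum pointer_mode pmode)
{
  struct gen_choice c;

  if (target_64bit)
    {
      c.leave = "gen_leave_rex64";
      c.monitor = (pmode == MODE_DI
                   ? "gen_sse3_monitor64_di"
                   : "gen_sse3_monitor64_si");
    }
  else
    {
      c.leave = "gen_leave";
      c.monitor = "gen_sse3_monitor";
    }

  c.add3 = pmode == MODE_DI ? "gen_adddi3" : "gen_addsi3";
  return c;
}
```

With TARGET_64BIT set and a 32-bit Pmode, choose_generators (true, MODE_SI)
pairs gen_leave_rex64 with gen_sse3_monitor64_si and gen_addsi3, a
combination the old single if/else on TARGET_64BIT could not express.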

Thanks.


H.J.
---
2012-03-11  H.J. Lu  hongjiu...@intel.com

* config/i386/i386.c (ix86_option_override_internal): Properly
set ix86_gen_leave and ix86_gen_monitor.  Check Pmode == DImode.

* config/i386/sse.md (sse3_monitor64): Renamed to ...
(sse3_monitor64_<mode>): This.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index d673101..f21721f 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -3748,11 +3748,23 @@ ix86_option_override_internal (bool main_args_p)
   if (TARGET_64BIT)
     {
       ix86_gen_leave = gen_leave_rex64;
+      if (Pmode == DImode)
+        ix86_gen_monitor = gen_sse3_monitor64_di;
+      else
+        ix86_gen_monitor = gen_sse3_monitor64_si;
+    }
+  else
+    {
+      ix86_gen_leave = gen_leave;
+      ix86_gen_monitor = gen_sse3_monitor;
+    }
+
+  if (Pmode == DImode)
+    {
       ix86_gen_add3 = gen_adddi3;
       ix86_gen_sub3 = gen_subdi3;
       ix86_gen_sub3_carry = gen_subdi3_carry;
       ix86_gen_one_cmpl2 = gen_one_cmpldi2;
-      ix86_gen_monitor = gen_sse3_monitor64;
       ix86_gen_andsp = gen_anddi3;
       ix86_gen_allocate_stack_worker = gen_allocate_stack_worker_probe_di;
       ix86_gen_adjust_stack_and_probe = gen_adjust_stack_and_probedi;
@@ -3760,12 +3772,10 @@ ix86_option_override_internal (bool main_args_p)
     }
   else
     {
-      ix86_gen_leave = gen_leave;
       ix86_gen_add3 = gen_addsi3;
       ix86_gen_sub3 = gen_subsi3;
       ix86_gen_sub3_carry = gen_subsi3_carry;
       ix86_gen_one_cmpl2 = gen_one_cmplsi2;
-      ix86_gen_monitor = gen_sse3_monitor;
       ix86_gen_andsp = gen_andsi3;
       ix86_gen_allocate_stack_worker = gen_allocate_stack_worker_probe_si;
       ix86_gen_adjust_stack_and_probe = gen_adjust_stack_and_probesi;
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 4afc4b3..f5935f1 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -8147,8 +8147,8 @@
   "monitor\t%0, %1, %2"
   [(set_attr "length" "3")])
 
-(define_insn "sse3_monitor64"
-  [(unspec_volatile [(match_operand:DI 0 "register_operand" "a")
+(define_insn "sse3_monitor64_<mode>"
+  [(unspec_volatile [(match_operand:P 0 "register_operand" "a")
                      (match_operand:SI 1 "register_operand" "c")
                      (match_operand:SI 2 "register_operand" "d")]
                     UNSPECV_MONITOR)]


Re: [PATCH] [SH] Fix target/48596

2012-03-11 Thread Kaz Kojima
Oleg Endo oleg.e...@t-online.de wrote:
 The attached patch moves it as suggested to gcc.c-torture/compile.
 Briefly tested by running the gcc.c-torture/compile set on sh-sim
 -m4a-single -ml.

You forgot to remove two dg-* lines:

 +/* { dg-do compile } */
 +/* { dg-options "-O1" } */

unneeded for this gcc.c-torture/compile test.  Looks OK
with that change.  FYI, I've tested it on i686-pc-linux-gnu
with no problem.

Regards,
kaz


Re: PATCH RFA: Update Go frontend on gcc 4.7 branch

2012-03-11 Thread Ian Lance Taylor
Jakub Jelinek ja...@redhat.com writes:

 FYI, on Fedora 17 I had recent testresults without the patch, so below are
 just testsuite differences for that (debug/dwarf fails consistently
 everywhere), on RHEL5/6 I didn't have earlier go testsuite results,
 so I'm just providing summaries there.

The reason debug/dwarf fails everywhere with the patch is simply that
there are a couple of binary test files in debug/dwarf, and the patch
program did not update them correctly.  The debug/dwarf tests should
pass now that the correct binary files have been committed to the
branch.

Ian