[PATCH] PR52528, combine fix

2012-03-10 Thread Chung-Lin Tang
Hi,

As described in the PR, a testcase compiled for PowerPC:
struct S {
  unsigned a : 30;
  unsigned b :  2;
};

int foo (int b)
{
  struct S s = {0};
  s.b = b;
  return bar (0x000b0010, 0x00040100ULL, *(unsigned long *)&s);
}

currently this is compiled to:
foo:
lis 6,0x4
li 5,0
ori 6,6,256
li 7,0
crxor 6,6,6
b bar

Notice the incorrect code generated: the 1st arg (reg 3) is never
constructed, and the code for the 3rd arg (reg 7) is wrong.
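
For checking the behaviour at run time rather than by reading the assembly, a self-contained variant of the testcase can be sketched as below. This is only an illustration under stated assumptions: bar() here is a hypothetical stand-in that records its arguments (the real bar is external), the struct is assumed to occupy 32 bits, and memcpy replaces the pointer cast of the original so the sketch does not depend on aliasing rules. decode_b() is likewise a helper invented for this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct S {
  unsigned a : 30;
  unsigned b :  2;
};

/* Hypothetical stand-in for the external bar(): it only records its
   arguments so that a caller can inspect them afterwards.  */
static uint32_t seen_arg1;
static uint64_t seen_arg2;
static uint32_t seen_arg3;

static int bar (uint32_t x, uint64_t y, uint32_t z)
{
  seen_arg1 = x;
  seen_arg2 = y;
  seen_arg3 = z;
  return 0;
}

int foo (int b)
{
  struct S s = {0};
  uint32_t raw;

  /* The reinterpretation below assumes a 32-bit struct.  */
  assert (sizeof s == sizeof raw);
  s.b = b;
  /* memcpy replaces the pointer cast used in the original testcase.  */
  memcpy (&raw, &s, sizeof raw);
  return bar (0x000b0010, 0x00040100ULL, raw);
}

/* Decode a recorded third argument back into the bit-field view, so
   the check does not depend on where the ABI places field b.  */
unsigned decode_b (uint32_t raw)
{
  struct S s;

  memcpy (&s, &raw, sizeof s);
  return s.b;
}
```

With correct code generation, calling foo (3) should leave the first constant in seen_arg1 and a third argument from which decode_b recovers 3; with the miscompilation described above, both checks would be expected to fail.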

The problem seems to be in combine, during calls from try_combine() to
can_combine_p():

  can_combine_p() has a call to expand_field_assignment(), which may
call get_last_value() during its simplification operations (through the
reg_nonzero_bits_for_combine() hook); not setting subst_low_luid
properly affects its correctness.

So the fix is a one-liner that sets subst_low_luid before the
expand_field_assignment() call. Bootstrapped and tested on i686,
x86-64, powerpc64. Cross-tested on ARM. I was a bit wary that some
optimization regression might appear, which would complicate things, but
everything looks fine.

I have a larger (customer provided) testcase that exposed this bug after
rev.161655 (the mem-ref2 merge, may be related to effects on bitfields).
So if suitable, please also approve this patch for 4.6/4.7 branches.

Thanks,
Chung-Lin

2012-03-10  Chung-Lin Tang  clt...@codesourcery.com

PR rtl-optimization/52528
* combine.c (can_combine_p): Add setting of subst_low_luid
before call to expand_field_assignment().
Index: combine.c
===
--- combine.c   (revision 185168)
+++ combine.c   (working copy)
@@ -1822,6 +1822,10 @@ can_combine_p (rtx insn, rtx i3, rtx pred ATTRIBUT
   if (set == 0)
 return 0;
 
+  /* The simplification in expand_field_assignment() may call back to
+ get_last_value(), so set safe guard here.  */
+  subst_low_luid = DF_INSN_LUID (insn);
+
   set = expand_field_assignment (set);
   src = SET_SRC (set), dest = SET_DEST (set);
 


Re: [PATCH] PR52528, combine fix

2012-03-10 Thread Eric Botcazou
 So the fix is a one-liner that sets subst_low_luid before the
 expand_field_assignment() call. Bootstrapped and tested under i686,
 x86-64, powerpc64. Cross-tested on ARM. I was a bit wary that some
 optimization regression might appear, which will complicate things, but
 everything looks fine.

 I have a larger (customer provided) testcase that exposed this bug after
 rev.161655 (the mem-ref2 merge, may be related to effects on bitfields).
 So if suitable, please also approve this patch for 4.6/4.7 branches.

 Thanks,
 Chung-Lin

 2012-03-10  Chung-Lin Tang  clt...@codesourcery.com

   PR rtl-optimization/52528
   * combine.c (can_combine_p): Add setting of subst_low_luid
   before call to expand_field_assignment().

OK for mainline, 4.7 branch (once 4.7.0 is released) and 4.6 branch, modulo:

+  /* The simplification in expand_field_assignment() may call back to
+ get_last_value(), so set safe guard here.  */
+  subst_low_luid = DF_INSN_LUID (insn);

No () in comments, just use the function name.

-- 
Eric Botcazou


[google/integration] Add XFAIL file for arm-grtev2-linux-gnueabi target (issue5798046)

2012-03-10 Thread Doug Kwan
Hi Diego,

   This patch adds an .xfail file for the arm-grtev2-linux-gnueabi target
in the integration branch.

-Doug

2012-03-10   Doug Kwan  dougk...@google.com

* contrib/testsuite-management/arm-grtev2-linux-gnueabi.xfail:
New file.

Index: contrib/testsuite-management/arm-grtev2-linux-gnueabi.xfail
===
--- contrib/testsuite-management/arm-grtev2-linux-gnueabi.xfail (revision 0)
+++ contrib/testsuite-management/arm-grtev2-linux-gnueabi.xfail (revision 0)
@@ -0,0 +1,126 @@
+# Failures in ./gcc/testsuite/gcc/gcc.sum:
+# *** gcc:
+FAIL: gcc.c-torture/compile/920928-2.c  -Os  (internal compiler error)
+FAIL: gcc.c-torture/compile/920928-2.c  -Os  (test for excess errors)
+FAIL: gcc.dg/builtin-apply2.c execution test
+FAIL: gcc.dg/cproj-fails-with-broken-glibc.c execution test
+FAIL: gcc.dg/di-longlong64-sync-1.c (test for excess errors)
+UNRESOLVED: gcc.dg/di-longlong64-sync-1.c compilation failed to produce executable
+FAIL: gcc.dg/di-sync-multithread.c execution test
+FAIL: gcc.dg/pr49994-3.c (test for excess errors)
+FAIL: gcc.dg/tls/pr42894.c (test for excess errors)
+FAIL: gcc.dg/torture/stackalign/builtin-apply-2.c  -O0  execution test
+FAIL: gcc.dg/torture/stackalign/builtin-apply-2.c  -O0  execution test
+FAIL: gcc.dg/torture/stackalign/builtin-apply-2.c  -O1  execution test
+FAIL: gcc.dg/torture/stackalign/builtin-apply-2.c  -Os  execution test
+
+# These are flaky when running on QEMU
+flaky | FAIL: gcc.dg/torture/tls/tls-test.c  -O0  execution test
+flaky | FAIL: gcc.dg/torture/tls/tls-test.c  -O1  execution test
+flaky | FAIL: gcc.dg/torture/tls/tls-test.c  -O2  execution test
+flaky | FAIL: gcc.dg/torture/tls/tls-test.c  -O3 -fomit-frame-pointer  execution test
+flaky | FAIL: gcc.dg/torture/tls/tls-test.c  -O3 -g  execution test
+flaky | FAIL: gcc.dg/torture/tls/tls-test.c  -Os  execution test
+flaky | FAIL: gcc.dg/torture/tls/tls-test.c  -O0  -fpic  execution test
+flaky | FAIL: gcc.dg/torture/tls/tls-test.c  -O1  -fpic  execution test
+flaky | FAIL: gcc.dg/torture/tls/tls-test.c  -O2  -fpic  execution test
+flaky | FAIL: gcc.dg/torture/tls/tls-test.c  -O3 -fomit-frame-pointer  -fpic  execution test
+flaky | FAIL: gcc.dg/torture/tls/tls-test.c  -O3 -g  -fpic  execution test
+flaky | FAIL: gcc.dg/torture/tls/tls-test.c  -Os  -fpic  execution test
+flaky | FAIL: gcc.dg/torture/tls/tls-test.c  -O0  -fPIC  execution test
+flaky | FAIL: gcc.dg/torture/tls/tls-test.c  -O1  -fPIC  execution test
+flaky | FAIL: gcc.dg/torture/tls/tls-test.c  -O2  -fPIC  execution test
+flaky | FAIL: gcc.dg/torture/tls/tls-test.c  -O3 -fomit-frame-pointer  -fPIC  execution test
+flaky | FAIL: gcc.dg/torture/tls/tls-test.c  -O3 -g  -fPIC  execution test
+flaky | FAIL: gcc.dg/torture/tls/tls-test.c  -Os  -fPIC  execution test
+flaky | FAIL: gcc.dg/torture/tls/tls-test.c  -O0  -pie -fpie  execution test
+flaky | FAIL: gcc.dg/torture/tls/tls-test.c  -O1  -pie -fpie  execution test
+flaky | FAIL: gcc.dg/torture/tls/tls-test.c  -O2  -pie -fpie  execution test
+flaky | FAIL: gcc.dg/torture/tls/tls-test.c  -O3 -fomit-frame-pointer  -pie -fpie  execution test
+flaky | FAIL: gcc.dg/torture/tls/tls-test.c  -O3 -g  -pie -fpie  execution test
+flaky | FAIL: gcc.dg/torture/tls/tls-test.c  -Os  -pie -fpie  execution test
+flaky | FAIL: gcc.dg/torture/tls/tls-test.c  -O0  -pie -fPIE  execution test
+flaky | FAIL: gcc.dg/torture/tls/tls-test.c  -O1  -pie -fPIE  execution test
+flaky | FAIL: gcc.dg/torture/tls/tls-test.c  -O2  -pie -fPIE  execution test
+flaky | FAIL: gcc.dg/torture/tls/tls-test.c  -O3 -fomit-frame-pointer  -pie -fPIE  execution test
+flaky | FAIL: gcc.dg/torture/tls/tls-test.c  -O3 -g  -pie -fPIE  execution test
+flaky | FAIL: gcc.dg/torture/tls/tls-test.c  -Os  -pie -fPIE  execution test
+flaky | FAIL: gcc.dg/torture/tls/tls-test.c  -O2 -flto -fno-use-linker-plugin -flto-partition=none  execution test
+flaky | FAIL: gcc.dg/torture/tls/tls-test.c  -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  execution test
+
+FAIL: gcc.dg/tree-ssa/sra-12.c scan-tree-dump-times release_ssa l; 0
+FAIL: gcc.dg/vect/vect-104.c scan-tree-dump-times vect possible dependence between data-refs 1
+FAIL: gcc.dg/vect/vect-multitypes-11.c scan-tree-dump-times vect vectorized 1 loops 1
+FAIL: gcc.dg/vect/vect-multitypes-12.c scan-tree-dump-times vect vectorized 1 loops 1
+FAIL: gcc.dg/vect/vect-outer-1-big-array.c scan-tree-dump-times vect strided access in outer loop 1
+FAIL: gcc.dg/vect/vect-outer-1.c scan-tree-dump-times vect strided access in outer loop 1
+FAIL: gcc.dg/vect/vect-outer-1a-big-array.c scan-tree-dump-times vect strided access in outer loop 1
+FAIL: gcc.dg/vect/vect-outer-1a.c scan-tree-dump-times vect strided access in outer loop 1
+FAIL: gcc.dg/vect/vect-outer-1b-big-array.c scan-tree-dump-times vect strided access in outer loop 1
+FAIL: gcc.dg/vect/vect-outer-1b.c scan-tree-dump-times vect strided access in outer loop 1

Many regressions with: [patch] Cleanup fortran/convert.c

2012-03-10 Thread Tobias Burnus

Steven Bosscher wrote:

This cleans up some remnants of the ancestors of fortran's convert.c,
which was copied from GNAT IIRC. I would bootstrap and test this, but trunk
appears to be broken for x86_64-linux right now (ICE in patch_jump_insn).
But I can post this for review, at least.
OK for trunk, after bootstrap+test?


Your patch seems to have caused many Fortran regressions. At least I see 
with 185156 only one (known) failure, cf. 
http://gcc.gnu.org/ml/gcc-testresults/2012-03/msg01069.html


While starting with 185160 there are many, many gfortran failures, cf. 
http://gcc.gnu.org/ml/gcc-testresults/2012-03/msg01073.html


Tobias



* Make-lang.in (convert.o): Depend on convert.h.
* convert.c: Header and comment cleanups.
(gfc_truthvalue_conversion): Rename static function
to truthvalue_conversion.  Do not use 'internal_error' from here,
use 'gcc_unreachable' instead.
(convert): Do not use 'error' for conversions to void, use
'gcc_unreachable' instead.  Likewise for conversions to non-scalar
types.  Do not handle ENUMERAL_TYPE, the front end never creates them.
Clean up #if 0 code.




Re: Many regressions with: [patch] Cleanup fortran/convert.c

2012-03-10 Thread Steven Bosscher
On Sat, Mar 10, 2012 at 11:19 AM, Tobias Burnus bur...@net-b.de wrote:
 Steven Bosscher wrote:

 This cleans up some remnants of the ancestors of fortran's convert.c,
 which was copied from GNAT IIRC. I would bootstrap and test this, but trunk
 appears to be broken for x86_64-linux right now (ICE in patch_jump_insn).
 But I can post this for review, at least.
 OK for trunk, after bootstrap+test?


 Your patch seems to have caused many Fortran regressions. At least I see
 with 185156 only one (known) failure, cf.
 http://gcc.gnu.org/ml/gcc-testresults/2012-03/msg01069.html

 While starting with 185160 there are many, many gfortran failures, cf.
 http://gcc.gnu.org/ml/gcc-testresults/2012-03/msg01073.html

Yes, it seems that different boolean types aren't allowed. I must have
looked at the wrong test results somehow.

I'm testing this fix:

Index: convert.c
===
--- convert.c   (revision 185160)
+++ convert.c   (working copy)
@@ -95,7 +95,8 @@ convert (tree type, tree expr)
   if (code == VOID_TYPE)
 return fold_build1_loc (input_location, CONVERT_EXPR, type, e);
   if (code == BOOLEAN_TYPE)
-return truthvalue_conversion (e);
+return fold_build1_loc (input_location, NOP_EXPR, type,
+   truthvalue_conversion (e));
   if (code == INTEGER_TYPE)
 return fold (convert_to_integer (type, e));
   if (code == POINTER_TYPE || code == REFERENCE_TYPE)


Re: PATCH: Properly generate X32 IE sequence

2012-03-10 Thread Uros Bizjak
On Fri, Mar 9, 2012 at 11:26 PM, H.J. Lu hjl.to...@gmail.com wrote:
 On Mon, Mar 5, 2012 at 9:25 AM, Uros Bizjak ubiz...@gmail.com wrote:
 On Mon, Mar 5, 2012 at 6:03 PM, H.J. Lu hjl.to...@gmail.com wrote:

 X86-64 linker optimizes TLS_MODEL_INITIAL_EXEC to TLS_MODEL_LOCAL_EXEC
 by checking

        movq foo@gottpoff(%rip), %reg

 and

        addq foo@gottpoff(%rip), %reg

 It uses the REX prefix to avoid the last byte of the previous
 instruction.  With 32bit Pmode, we may not have the REX prefix and
 the last byte of the previous instruction may be an offset, which
 may look like a REX prefix.  IE-LE optimization will generate corrupted
 binary.  This patch makes sure we always output a REX prefix for
 UNSPEC_GOTNTPOFF.  OK for trunk?

 Actually, linker has:

    case R_X86_64_GOTTPOFF:
      /* Check transition from IE access model:
                mov foo@gottpoff(%rip), %reg
                add foo@gottpoff(%rip), %reg
       */

      /* Check REX prefix first.  */
      if (offset >= 3 && (offset + 4) <= sec->size)
        {
          val = bfd_get_8 (abfd, contents + offset - 3);
          if (val != 0x48 && val != 0x4c)
            {
              /* X32 may have 0x44 REX prefix or no REX prefix.  */
              if (ABI_64_P (abfd))
                return FALSE;
            }
        }
      else
        {
          /* X32 may not have any REX prefix.  */
          if (ABI_64_P (abfd))
            return FALSE;
          if (offset < 2 || (offset + 3) > sec->size)
            return FALSE;
        }

 So, it should handle the case without REX just OK. If it doesn't, then
 this is a bug in binutils.


 The last byte of the displacement in the previous instruction
 may happen to look like a REX byte. In that case, linker
 will overwrite the last byte of the previous instruction and
 generate the wrong instruction sequence.

 I need to update linker to enforce the REX byte check.
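
The ambiguity described here can be sketched in a few lines of C. This is only an illustration, not the binutils code: byte_before_looks_like_rex() and the two byte arrays are hypothetical names, and the instruction encodings in the comments are assumptions chosen to show that a value test on the byte at offset - 3 cannot tell a real REX prefix from the tail of a preceding displacement.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative byte sequences; the encodings are assumptions made for
   this sketch.  */

/* 48 8b 05 <disp32>: a mov with a real REX.W prefix.  The relocated
   field starts at offset 3.  */
static const unsigned char with_rex[] =
  { 0x48, 0x8b, 0x05, 0x00, 0x00, 0x00, 0x00 };

/* 8b 8b 00 00 00 48: a previous instruction whose displacement ends
   in 0x48, followed by 8b 05 <disp32>: a mov without any REX prefix.
   The relocated field starts at offset 8.  */
static const unsigned char without_rex[] =
  { 0x8b, 0x8b, 0x00, 0x00, 0x00, 0x48,
    0x8b, 0x05, 0x00, 0x00, 0x00, 0x00 };

/* Return nonzero if the byte three positions before the relocation
   offset has the value 0x48 or 0x4c.  A matching value does not prove
   that the byte is a prefix of the current instruction.  */
int byte_before_looks_like_rex (const unsigned char *contents,
                                size_t size, size_t offset)
{
  assert (offset <= size);
  if (offset < 3 || offset + 4 > size)
    return 0;
  return contents[offset - 3] == 0x48 || contents[offset - 3] == 0x4c;
}
```

Both arrays pass the value test at their respective offsets, which is why a rewrite that assumes a prefix is present would clobber the last displacement byte in the second case.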

 One important observation: if we want to follow the x86_64 TLS spec
 strictly, we have to use existing DImode patterns only. This also
 means that we should NOT convert other TLS patterns to Pmode, since
 they explicitly state movq and addq. If this is not the case, then we
 need new TLS specification for X32.

 Here is a patch to properly generate X32 IE sequence.

 This is the summary of differences between x86-64 TLS and x32 TLS:

                     x86-64                               x32
 GD
    .byte 0x66; leaq foo@tlsgd(%rip),%rdi;        leaq foo@tlsgd(%rip),%rdi;
    .word 0x6666; rex64; call __tls_get_addr@plt  .word 0x6666; rex64; call __tls_get_addr@plt

 GD-IE optimization
   movq %fs:0,%rax; addq x@gottpoff(%rip),%rax    movl %fs:0,%eax; addq x@gottpoff(%rip),%rax

 GD-LE optimization
   movq %fs:0,%rax; leaq x@tpoff(%rax),%rax       movl %fs:0,%eax; leaq x@tpoff(%rax),%rax

 LD
  leaq foo@tlsld(%rip),%rdi;                      leaq foo@tlsld(%rip),%rdi;
  call __tls_get_addr@plt                         call __tls_get_addr@plt

 LD-LE optimization
  .word 0x6666; .byte 0x66; movq %fs:0, %rax      nopl 0x0(%rax); movl %fs:0, %eax

 IE
   movq %fs:0,%reg64;                             movl %fs:0,%reg32;
   addq x@gottpoff(%rip),%reg64                   addl x@gottpoff(%rip),%reg32

   or
                                                  Not supported if Pmode == SImode
   movq x@gottpoff(%rip),%reg64;                  movq x@gottpoff(%rip),%reg64;
   movq %fs:(%reg64),%reg32                       movl %fs:(%reg64), %reg32

 IE-LE optimization

   movq %fs:0,%reg64;                             movl %fs:0,%reg32;
   addq x@gottpoff(%rip),%reg64                   addl x@gottpoff(%rip),%reg32

   to

   movq %fs:0,%reg64;                             movl %fs:0,%reg32;
   addq foo@tpoff, %reg64                         addl foo@tpoff, %reg32

   movq %fs:0,%reg64;                             movl %fs:0,%reg32;
   leaq foo@tpoff(%reg64), %reg64                 leal foo@tpoff(%reg32), %reg32

   or

   movq x@gottpoff(%rip),%reg64                   movq x@gottpoff(%rip),%reg64;
   movl %fs:(%reg64),%reg32                       movl %fs:(%reg64), %reg32

   to

   movq foo@tpoff, %reg64                         movq foo@tpoff, %reg64
   movl %fs:(%reg64),%reg32                       movl %fs:(%reg64), %reg32

 LE
   movq %fs:0,%reg64;                             movl %fs:0,%reg32;
   leaq x@tpoff(%reg64),%reg32                    leal x@tpoff(%reg32),%reg32

   or

   movq %fs:0,%reg64;                             movl %fs:0,%reg32;
   addq $x@tpoff,%reg64                           addl $x@tpoff,%reg32

   or

   movq %fs:0,%reg64;                             movl %fs:0,%reg32;
   movl x@tpoff(%reg64),%reg32                    movl x@tpoff(%reg32),%reg32

   or

   movl %fs:x@tpoff,%reg32                        movl %fs:x@tpoff,%reg32


 X32 TLS implementation is straightforward, except for IE:

 1. Since address override works only on the (reg32) part in fs:(reg32),
 we can't use it as memory operand.  This patch 

Re: [PR51752] publication safety violations in loop invariant motion pass

2012-03-10 Thread Torvald Riegel
On Fri, 2012-03-09 at 15:48 -0600, Aldy Hernandez wrote:
 Torvald is this what you were thinking of?

Yes, but with an exit in the else branch or something else that can
prevent x from being read after the condition.  I _suppose_ that your
original example would be an allowed transformation, but only because x
would be read anyway, independently of flag's value; we can assume
data-race freedom, and thus we must be able to read x in a data-race-free
way even if flag is false, so flag's value actually doesn't matter.

What about modifying the example like below?  In this case, if flag2 is
true, flag's value will matter and we can't move the load to x before
it.  Will PRE still introduce tmp = x + 4 in such an example?

Torvald

 +__transaction_atomic {
 +  if (flag)
 +y = x + 4;
 +  else
 +// stuff
 if (flag2)
   return;
 +  z = x + 4;
 +}
 +
 +  PRE can rewrite this into:
 +
 +__transaction_atomic {
 +  if (flag) {
 +tmp = x + 4;
 +y = tmp;
 +  } else {
 +// stuff
 +tmp = x + 4;
 if (flag2)
   return;
 +  }
 +  z = tmp;
 +}
 +
 +  A later pass can move the now totally redundant [x + 4]
 +  before its publication predicated by flag:
 +
 +__transaction_atomic {
 +  tmp = x + 4;
 +  if (flag) {
 +  } else {
 +// stuff
 if (flag2)
   return;
 +  }
 +  z = tmp;
 +   */




Re: [Patch, Fortran] PR 52542 - Fix PROCEDURE() with Bind(C)

2012-03-10 Thread Tobias Burnus

Tobias Burnus wrote:
If the interface in a PROCEDURE() statement is Bind(C), also the 
procedure (pointer) declared in that statement is BIND(C).


From the F2008 standard: A proc-language-binding-spec without a NAME= 
is allowed, but is redundant with the proc-interface required by C1222.


Build and currently regtested on x86-64-linux.
OK for the trunk (if regtesting succeeded)?


Well, it didn't, as I forgot to reset two variables: one then either gets
an error that a binding name has been specified, or the wrong binding
name might be used.


Build and regtested on x86-64-linux.
OK?

Tobias
2012-03-10  Tobias Burnus  bur...@net-b.de

	PR fortran/52542
	* decl.c (match_procedure_decl): If the interface
	is bind(C), the procedure is as well.

2012-03-10  Tobias Burnus  bur...@net-b.de

	PR fortran/52542
	* gfortran.dg/proc_ptr_35.f90: New.

--- /dev/null	2012-03-09 19:41:57.079829322 +0100
+++ gcc/gcc/testsuite/gfortran.dg/proc_ptr_35.f90	2012-03-09 22:22:31.0 +0100
@@ -0,0 +1,16 @@
+! { dg-do compile }
+!
+! PR fortran/52542
+!
+! Ensure that the procedure myproc is Bind(C).
+!
+! Contributed by Mat Cross of NAG
+!
+interface
+  subroutine s() bind(c)
+  end subroutine s
+end interface
+procedure(s) :: myproc
+call myproc()
+end
+! { dg-final { scan-assembler-not myproc_ } }
diff --git a/gcc/fortran/decl.c b/gcc/fortran/decl.c
index 75b8a89..4da21c3 100644
--- a/gcc/fortran/decl.c
+++ b/gcc/fortran/decl.c
@@ -4855,6 +4855,13 @@ match_procedure_decl (void)
   if (m == MATCH_ERROR)
 return MATCH_ERROR;
 
+  if (proc_if && proc_if->attr.is_bind_c && !current_attr.is_bind_c)
+{
+  current_attr.is_bind_c = 1;
+  has_name_equals = 0;
+  curr_binding_label = NULL;
+}
+
   /* Get procedure symbols.  */
   for(num=1;;num++)
 {


Re: PATCH: Properly check mode for x86 call/jmp address

2012-03-10 Thread H.J. Lu
On Wed, Mar 7, 2012 at 1:58 PM, Uros Bizjak ubiz...@gmail.com wrote:
 On Wed, Mar 7, 2012 at 5:03 PM, H.J. Lu hjl.to...@gmail.com wrote:

  (define_insn *call
 -  [(call (mem:QI (match_operand:P 0 call_insn_operand czw))
 +  [(call (mem:QI (match_operand:C 0 call_insn_operand czw))
        (match_operand 1  ))]
 -  !SIBLING_CALL_P (insn)
 +  !SIBLING_CALL_P (insn)
 +   && (GET_CODE (operands[0]) == SYMBOL_REF
 +       || GET_MODE (operands[0]) == word_mode)

 There are enough copies of this extra constraint that I wonder
 if it simply ought to be folded into call_insn_operand.

 Which would need to be changed to define_special_predicate,
 since you'd be doing your own mode checking.

 Probably similar changes to sibcall_insn_operand.

 Here is the updated patch.  I changed constant_call_address_operand
 and call_register_no_elim_operand to use define_special_predicate.
 OK for trunk?

 Please do not complicate matters that much. Just stick word_mode
 overrides for register operands in predicates.md, like in attached
 patch. These changed predicates now allow registers only in word_mode
 (and VOIDmode).

 You can now remove all new mode iterators and leave call patterns untouched.

 @@ -22940,14 +22940,18 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx callarg1,
        && GET_CODE (XEXP (fnaddr, 0)) == SYMBOL_REF
        && !local_symbolic_operand (XEXP (fnaddr, 0), VOIDmode))
     fnaddr = gen_rtx_MEM (QImode, construct_plt_address (XEXP (fnaddr, 0)));
 -  else if (sibcall
 -          ? !sibcall_insn_operand (XEXP (fnaddr, 0), Pmode)
 -          : !call_insn_operand (XEXP (fnaddr, 0), Pmode))
 +  else if (!(constant_call_address_operand (XEXP (fnaddr, 0), Pmode)
 +            || call_register_no_elim_operand (XEXP (fnaddr, 0),
 +                                              word_mode)
 +            || (!sibcall
 +                && !TARGET_X32
 +                && memory_operand (XEXP (fnaddr, 0), word_mode))))
     {
       fnaddr = XEXP (fnaddr, 0);
 -      if (GET_MODE (fnaddr) != Pmode)
 -       fnaddr = convert_to_mode (Pmode, fnaddr, 1);
 -      fnaddr = gen_rtx_MEM (QImode, copy_to_mode_reg (Pmode, fnaddr));
 +      if (GET_MODE (fnaddr) != word_mode)
 +       fnaddr = convert_to_mode (word_mode, fnaddr, 1);
 +      fnaddr = gen_rtx_MEM (QImode,
 +                           copy_to_mode_reg (word_mode, fnaddr));
     }

   vec_len = 0;

 Please update the above part. It looks like you don't even have to change
 the condition with the new predicates. Basically, you should only convert
 the address to word_mode instead of Pmode.

 +  if (TARGET_X32)
 +    operands[0] = convert_memory_address (word_mode, operands[0]);

 This addition to indirect_jump and tablejump should be the only
 change needed in i386.md now. Please write the condition

 if (Pmode != word_mode)

 for consistency.

 BTW: The attached patch was bootstrapped and regression tested on
 x86_64-pc-linux-gnu {,-m32}.

 Uros.

 It doesn't work:

 x.i:7:1: error: unrecognizable insn:
 (call_insn/j 8 7 9 3 (call (mem:QI (reg:DI 62) [0 *foo.0_1 S1 A8])
        (const_int 0 [0])) x.i:6 -1
     (nil)
    (nil))
 x.i:7:1: internal compiler error: in extract_insn, at recog.c:2123
 Please submit a full bug report,
 with preprocessed source if appropriate.
 See http://gcc.gnu.org/bugs.html for instructions.
 make: *** [x.s] Error 1

 I will investigate it.

 For reference, attached is the complete patch that uses
 define_special_predicate. This patch works OK with the current
 mainline, with additional patch to i386.h, where

 Index: i386.h
 ===
 --- i386.h      (revision 185079)
 +++ i386.h      (working copy)
 @@ -1744,7 +1744,7 @@
  /* Specify the machine mode that pointers have.
    After generation of rtl, the compiler makes no further distinction
    between pointers and any other objects of this machine mode.  */
 -#define Pmode (TARGET_64BIT ? DImode : SImode)
 +#define Pmode (TARGET_LP64 ? DImode : SImode)

  /* A C expression whose value is zero if pointers that need to be extended
    from being `POINTER_SIZE' bits wide to `Pmode' are sign-extended and

 Uros.

I tested this patch and it passed all my x32 tests.

Thanks.

-- 
H.J.
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index c2cad5a..33ef330 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -23032,13 +23031,13 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx callarg1,
   && !local_symbolic_operand (XEXP (fnaddr, 0), VOIDmode))
 fnaddr = gen_rtx_MEM (QImode, construct_plt_address (XEXP (fnaddr, 0)));
   else if (sibcall
-  ? !sibcall_insn_operand (XEXP (fnaddr, 0), Pmode)
-  : !call_insn_operand (XEXP (fnaddr, 0), Pmode))
+  ? !sibcall_insn_operand (XEXP (fnaddr, 0), word_mode)
+  : !call_insn_operand (XEXP (fnaddr, 0), word_mode))
 {
   fnaddr = XEXP (fnaddr, 0);
-  if (GET_MODE (fnaddr) != Pmode)
-   fnaddr = convert_to_mode (Pmode, fnaddr, 1);
-  

Re: [Patch, Fortran] Change array descriptor's data to base_addr for TS 29113

2012-03-10 Thread Paul Richard Thomas
Tobias,

These patches are OK for trunk and fortran-dev.

Many thanks

Paul

On Sat, Mar 10, 2012 at 4:53 PM, Tobias Burnus bur...@net-b.de wrote:
 The attached patch renames (in libgfortran/) the array descriptor's data
 field to base_addr and lbound to lower_bound.

 The reason is that Technical Specification (TS) 29113* uses those names in
 their C bindings, defined in ISO_Fortran_binding.h. But I would like to
 include that header file in libgfortran/libgfortran.h (cf. fortran-dev
 branch). Hence, the renaming.

 In order to make later merging of the fortran-dev branch into the trunk
 easier to review, I'd prefer to commit this patch already to the trunk, but
 it can also be committed to the branch.

 The patch shouldn't have any effect in terms of the ABI, however, I am not
 sure whether it formally fulfills the criteria in C99's 6.2.7p1**
 (compatible type and composite type). On the other hand, the fields in the
 dimension triplet are already differently named: ubound
 (gcc/fortran/trans-types.c) vs. _ubound (libgfortran/libgfortran.h).

 Okay for the trunk? (Or for the fortran-dev branch?) Comments?
 (Bootstrapped and regtested on x86-64-linux.)

 Tobias

 * Current TS 29113 draft: ftp://ftp.nag.co.uk/sc22wg5/N1901-N1950/N1904.pdf
 ** C99 plus TC1 to TC3,
 http://www.open-std.org/JTC1/SC22/WG14/www/docs/n1256.pdf
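
The point about the ABI versus C99 6.2.7p1 can be illustrated with a small sketch. The structs below are hypothetical and heavily simplified, not the real libgfortran descriptor: they only show that renaming members leaves sizes and offsets unchanged, while two struct types declared in separate translation units with different member names are, formally, not compatible types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified descriptors: one with the old field
   names and one with the TS 29113 names.  */
struct dim_old { ptrdiff_t stride, lbound, ubound; };
struct dim_new { ptrdiff_t stride, lower_bound, ubound; };

struct desc_old { void *data; size_t offset; struct dim_old dim[1]; };
struct desc_new { void *base_addr; size_t offset; struct dim_new dim[1]; };

/* Return nonzero if renaming the members left every checked offset and
   the overall size unchanged.  */
int same_layout (void)
{
  return sizeof (struct desc_old) == sizeof (struct desc_new)
         && offsetof (struct desc_old, data)
            == offsetof (struct desc_new, base_addr)
         && offsetof (struct desc_old, dim)
            == offsetof (struct desc_new, dim)
         && offsetof (struct dim_old, lbound)
            == offsetof (struct dim_new, lower_bound);
}

/* Read the base address through the new member name; the assert is
   only a null-pointer precondition for this sketch.  */
void *base_address (const struct desc_new *d)
{
  assert (d != NULL);
  return d->base_addr;
}
```

The layout check is what matters for the ABI; the naming question only concerns whether the two declarations count as the same type across translation units.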



-- 
The knack of flying is learning how to throw yourself at the ground and miss.
       --Hitchhikers Guide to the Galaxy


[patch, 4.7] libitm: Fix lost wake-up in serial lock.

2012-03-10 Thread Torvald Riegel
This patch fixes PR52526, a lost wake-up in libitm (i.e., one or more
threads could hang and not get woken up anymore).

The problem was missing handling of one corner case in the futex-based
serial lock implementation (config/linux/rwlock.cc, read_lock()):
Multiple readers would set READERS to 1 and only call
futex_wait(readers, 1) if there were any writers.  Writers would set
READERS to 0 and then call futex_wake(readers).  That's fine, but
because there are multiple readers, it can happen that some would set
READERS to 1 after the writer's futex_wake() call, enabling the
futex_wait() in other readers (because READERS isn't 0 anymore).  This
patch fixes this by having readers wake up all potentially waiting
readers when they set READERS to 1 without an existing writer (thus
taking over what the writer would do).

OK for trunk?

OK for 4.7 too?  This is a showstopper if users hit it, so I'd prefer if
it could go into 4.7 as well.
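
The interleaving described above can be replayed in a deterministic, single-threaded toy model. This is a sketch under stated assumptions, not the libitm code: struct model, model_futex_wait(), model_futex_wake() and replay() are hypothetical names, the futex is reduced to "block only if the word still holds the expected value", and the writer's unlock path is simplified to an unconditional store and wake.

```c
#include <assert.h>
#include <limits.h>

/* Toy model of the lock words; all names here are hypothetical.  */
struct model {
  int readers;   /* the futex word */
  int writers;   /* nonzero while a writer is active */
  int sleeping;  /* readers blocked in the modelled futex_wait */
};

/* Modelled futex_wait: block only if the word still holds EXPECTED.  */
static void model_futex_wait (struct model *m, int expected)
{
  if (m->readers == expected)
    m->sleeping++;
}

/* Modelled futex_wake: wake up to COUNT sleepers.  */
static void model_futex_wake (struct model *m, int count)
{
  assert (count >= 0);
  m->sleeping = count >= m->sleeping ? 0 : m->sleeping - count;
}

/* Replay the problematic interleaving and return the number of readers
   still asleep at the end.  WITH_FIX selects the patched behaviour.  */
int replay (int with_fix)
{
  struct model m = { 0, 1, 0 };   /* a writer is active */
  int a_saw_writer;

  /* Reader A sets readers, sees the writer, and is then delayed just
     before it calls futex_wait.  */
  m.readers = 1;
  a_saw_writer = m.writers;

  /* The writer unlocks: it clears readers and wakes, but nobody is
     sleeping yet.  */
  m.writers = 0;
  m.readers = 0;
  model_futex_wake (&m, INT_MAX);

  /* Reader B sets readers to 1 after the writer's wake-up.  */
  m.readers = 1;
  if (m.writers)
    model_futex_wait (&m, 1);
  else if (with_fix)
    {
      /* Patched path: take over what the writer would have done.  */
      m.readers = 0;
      model_futex_wake (&m, INT_MAX);
    }

  /* Reader A resumes and finally calls futex_wait.  */
  if (a_saw_writer)
    model_futex_wait (&m, 1);

  return m.sleeping;
}
```

In this model replay (0) leaves reader A asleep with no one left to wake it, while replay (1) leaves no sleepers, because reader B has reset the word before A waits.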
commit 07d6d68b423797311bb04d8eb571f053d2078aa4
Author: Torvald Riegel trie...@redhat.com
Date:   Sat Mar 10 17:44:37 2012 +0100

libitm: Fix lost wake-up in serial lock.

PR libitm/52526
* config/linux/rwlock.cc (GTM::gtm_rwlock::read_lock): Fix lost
wake-up.

diff --git a/libitm/config/linux/rwlock.cc b/libitm/config/linux/rwlock.cc
index ad1b042..cf1fdd5 100644
--- a/libitm/config/linux/rwlock.cc
+++ b/libitm/config/linux/rwlock.cc
@@ -74,6 +74,32 @@ gtm_rwlock::read_lock (gtm_thread *tx)
  atomic_thread_fence (memory_order_seq_cst);
  if (writers.load (memory_order_relaxed))
futex_wait(&readers, 1);
+ else
+   {
+ // There is no writer, actually.  However, we can have enabled
+ // a futex_wait in other readers by previously setting readers
+ // to 1, so we have to wake them up because there is no writer
+ // that will do that.  We don't know whether the wake-up is
+ // really necessary, but we can get lost wake-up situations
+ // otherwise.
+ // No additional barrier nor a nonrelaxed load is required due
+ // to coherency constraints.  write_unlock() checks readers to
+ // see if any wake-up is necessary, but it is not possible that
+ // a reader's store prevents a required later writer wake-up;
+ // If the waking reader's store (value 0) is in modification
+ // order after the waiting readers store (value 1), then the
+ // latter will have to read 0 in the futex due to coherency
+ // constraints and the happens-before enforced by the futex
+ // (paragraph 6.10 in the standard, 6.19.4 in the Batty et al
+ // TR); second, the writer will be forced to read in
+ // modification order too due to Dekker-style synchronization
+ // with the waiting reader (see write_unlock()).
+ // ??? Can we avoid the wake-up if readers is zero (like in
+ // write_unlock())?  Anyway, this might happen too infrequently
+ // to improve performance significantly.
+ readers.store (0, memory_order_relaxed);
+ futex_wake(&readers, INT_MAX);
+   }
}
 
   // And we try again to acquire a read lock.


Re: PATCH RFA: Update Go frontend on gcc 4.7 branch

2012-03-10 Thread Jakub Jelinek
On Fri, Mar 09, 2012 at 02:20:14PM -0800, Ian Lance Taylor wrote:
 I would like to update the Go support on the 4.7 branch.  As I've
 mentioned before, Go is working toward a stable Go 1 release.  That
 release is not complete, but it is quite close.  The 4.7 branch was made
 at a slightly unstable point in the process.  I've updated the library
 one more time, and I've spent the week testing the result on a bunch of
 Google-internal programs.  What I have now is not perfect, but it is
 better than what is on the 4.7 branch today.

I'm not very excited by such huge changes, but I've tested this on Fedora 17
(various architectures) and RHEL6/5 today, let's check this in.  But
certainly no further such large change will be accepted on the 4.7 branch.

FYI, on Fedora 17 I had recent testresults without the patch, so below are
just testsuite differences for that (debug/dwarf fails consistently
everywhere), on RHEL5/6 I didn't have earlier go testsuite results,
so I'm just providing summaries there.

Fedora 17

i686-linux
-FAIL: database/sql
+FAIL: debug/dwarf
x86_64-linux
+FAIL: go.test/test/stack.go execution,  -O2 -g 
+FAIL: debug/dwarf
-FAIL: database/sql
+FAIL: debug/dwarf
ppc-linux
+FAIL: log
-FAIL: database/sql
+FAIL: debug/dwarf
-FAIL: exp/signal
+FAIL: net/http/httptest
+FAIL: os/signal
-FAIL: testing/script
ppc64-linux
-FAIL: exp/signal
+FAIL: net/http/httptest
+FAIL: os/signal
-FAIL: testing/script
+FAIL: log
+FAIL: debug/dwarf
s390-linux
+FAIL: log
+FAIL: debug/dwarf
-FAIL: sync/atomic
s390x-linux
+FAIL: log
-FAIL: sync/atomic
+FAIL: debug/dwarf
+FAIL: log
+FAIL: debug/dwarf
-FAIL: sync/atomic

RHEL 5, x86_64-linux (insufficient .cfi* support, so -fsplit-stack
not supported):
=== go Summary ===

# of expected passes            1045
# of unexpected failures        556
# of expected failures          4
# of untested testcases         535
=== libgo tests ===


Running target unix
FAIL: debug/dwarf

=== libgo Summary for unix ===

# of expected passes            122
# of unexpected failures        1

Running target unix/-m32
FAIL: net
FAIL: debug/dwarf

RHEL6, x86_64-linux

=== go Summary ===

# of expected passes            3296
# of expected failures          4
# of untested testcases         4

=== libgo tests ===


Running target unix
FAIL: debug/dwarf

=== libgo Summary for unix ===

# of expected passes            122
# of unexpected failures        1


Jakub


Re: PATCH: Properly generate X32 IE sequence

2012-03-10 Thread H.J. Lu
On Sat, Mar 10, 2012 at 5:09 AM, Uros Bizjak ubiz...@gmail.com wrote:
 On Fri, Mar 9, 2012 at 11:26 PM, H.J. Lu hjl.to...@gmail.com wrote:
 On Mon, Mar 5, 2012 at 9:25 AM, Uros Bizjak ubiz...@gmail.com wrote:
 On Mon, Mar 5, 2012 at 6:03 PM, H.J. Lu hjl.to...@gmail.com wrote:

 X86-64 linker optimizes TLS_MODEL_INITIAL_EXEC to TLS_MODEL_LOCAL_EXEC
 by checking

        movq foo@gottpoff(%rip), %reg

 and

        addq foo@gottpoff(%rip), %reg

 It uses the REX prefix to avoid the last byte of the previous
 instruction.  With 32bit Pmode, we may not have the REX prefix and
 the last byte of the previous instruction may be an offset, which
 may look like a REX prefix.  IE-LE optimization will generate corrupted
 binary.  This patch makes sure we always output a REX prefix for
 UNSPEC_GOTNTPOFF.  OK for trunk?

 Actually, linker has:

    case R_X86_64_GOTTPOFF:
      /* Check transition from IE access model:
                mov foo@gottpoff(%rip), %reg
                add foo@gottpoff(%rip), %reg
       */

      /* Check REX prefix first.  */
       if (offset >= 3 && (offset + 4) <= sec->size)
        {
          val = bfd_get_8 (abfd, contents + offset - 3);
           if (val != 0x48 && val != 0x4c)
            {
              /* X32 may have 0x44 REX prefix or no REX prefix.  */
              if (ABI_64_P (abfd))
                return FALSE;
            }
        }
      else
        {
          /* X32 may not have any REX prefix.  */
          if (ABI_64_P (abfd))
            return FALSE;
           if (offset < 2 || (offset + 3) > sec->size)
            return FALSE;
        }

 So, it should handle the case without REX just OK. If it doesn't, then
 this is a bug in binutils.


 The last byte of the displacement in the previous instruction
 may happen to look like a REX byte. In that case, linker
 will overwrite the last byte of the previous instruction and
 generate the wrong instruction sequence.

 I need to update linker to enforce the REX byte check.
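
The ambiguity described above can be illustrated with a small sketch. The helper name byte_before_reloc_is_rex_w and the two byte sequences are assumptions chosen for illustration; this is not binutils code. The 0x48/0x4c comparison mirrors the quoted linker check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not a binutils function: returns 1 if the byte
   three positions before the relocated displacement is 0x48 or 0x4c,
   the same comparison the quoted linker code makes.  */
static int
byte_before_reloc_is_rex_w (const uint8_t *code, size_t size, size_t offset)
{
  assert (code != NULL);
  if (offset < 3 || offset + 4 > size)
    return 0;
  return code[offset - 3] == 0x48 || code[offset - 3] == 0x4c;
}

/* movl 0x48000010(%rbx), %ecx   ->  8b 8b 10 00 00 48
   addl x@gottpoff(%rip), %eax   ->  03 05 <disp32>      (no REX prefix)
   The relocation applies at offset 8, so offset - 3 is the final
   displacement byte (0x48) of the previous instruction.  */
static const uint8_t without_rex[] = {
  0x8b, 0x8b, 0x10, 0x00, 0x00, 0x48,
  0x03, 0x05, 0x00, 0x00, 0x00, 0x00
};

/* The same first instruction followed by
   addq x@gottpoff(%rip), %rax   ->  48 03 05 <disp32>
   The relocation applies at offset 9, so offset - 3 is a real REX.W.  */
static const uint8_t with_rex[] = {
  0x8b, 0x8b, 0x10, 0x00, 0x00, 0x48,
  0x48, 0x03, 0x05, 0x00, 0x00, 0x00, 0x00
};
```

The helper returns 1 for both sequences, so a check of that single byte cannot tell a genuine REX prefix from the tail of the preceding displacement. Always emitting the prefix on the compiler side removes the first case.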

 One important observation: if we want to follow the x86_64 TLS spec
 strictly, we have to use existing DImode patterns only. This also
 means that we should NOT convert other TLS patterns to Pmode, since
 they explicitly state movq and addq. If this is not the case, then we
 need new TLS specification for X32.

 Here is a patch to properly generate X32 IE sequence.

 This is the summary of differences between x86-64 TLS and x32 TLS:

                     x86-64                               x32
 GD
    .byte 0x66; leaq foo@tlsgd(%rip),%rdi;        leaq foo@tlsgd(%rip),%rdi;
    .word 0x6666; rex64; call __tls_get_addr@plt  .word 0x6666; rex64; call __tls_get_addr@plt

 GD-IE optimization
   movq %fs:0,%rax; addq x@gottpoff(%rip),%rax    movl %fs:0,%eax; addq x@gottpoff(%rip),%rax

 GD-LE optimization
   movq %fs:0,%rax; leaq x@tpoff(%rax),%rax       movl %fs:0,%eax; leaq x@tpoff(%rax),%rax

 LD
  leaq foo@tlsld(%rip),%rdi;                      leaq foo@tlsld(%rip),%rdi;
  call __tls_get_addr@plt                         call __tls_get_addr@plt

 LD-LE optimization
  .word 0x6666; .byte 0x66; movq %fs:0, %rax      nopl 0x0(%rax); movl %fs:0, %eax

 IE
   movq %fs:0,%reg64;                             movl %fs:0,%reg32;
   addq x@gottpoff(%rip),%reg64                   addl x@gottpoff(%rip),%reg32

   or
                                                  Not supported if Pmode == SImode
   movq x@gottpoff(%rip),%reg64;                  movq x@gottpoff(%rip),%reg64;
   movq %fs:(%reg64),%reg32                       movl %fs:(%reg64), %reg32

 IE-LE optimization

   movq %fs:0,%reg64;                             movl %fs:0,%reg32;
   addq x@gottpoff(%rip),%reg64                   addl x@gottpoff(%rip),%reg32

   to

   movq %fs:0,%reg64;                             movl %fs:0,%reg32;
   addq foo@tpoff, %reg64                         addl foo@tpoff, %reg32

   movq %fs:0,%reg64;                             movl %fs:0,%reg32;
   leaq foo@tpoff(%reg64), %reg64                 leal foo@tpoff(%reg32), %reg32

   or

   movq x@gottpoff(%rip),%reg64                   movq x@gottpoff(%rip),%reg64;
   movl %fs:(%reg64),%reg32                       movl %fs:(%reg64), %reg32

   to

   movq foo@tpoff, %reg64                         movq foo@tpoff, %reg64
   movl %fs:(%reg64),%reg32                       movl %fs:(%reg64), %reg32

 LE
   movq %fs:0,%reg64;                             movl %fs:0,%reg32;
   leaq x@tpoff(%reg64),%reg32                    leal x@tpoff(%reg32),%reg32

   or

   movq %fs:0,%reg64;                             movl %fs:0,%reg32;
   addq $x@tpoff,%reg64                           addl $x@tpoff,%reg32

   or

   movq %fs:0,%reg64;                             movl %fs:0,%reg32;
   movl x@tpoff(%reg64),%reg32                    movl x@tpoff(%reg32),%reg32

   or

   movl %fs:x@tpoff,%reg32                        movl %fs:x@tpoff,%reg32


 X32 TLS implementation is straightforward, except for IE:

 1. Since address override works only on the 

Re: PATCH RFA: Update Go frontend on gcc 4.7 branch

2012-03-10 Thread Ian Lance Taylor
Jakub Jelinek ja...@redhat.com writes:

 I'm not very excited by such huge changes, but I've tested this on Fedora 17
 (various architectures) and RHEL6/5 today, let's check this in.

Thanks.  Committed.


 But
 certainly no further such large change will be accepted on the 4.7 branch.

Understood.


 FYI, on Fedora 17 I had recent testresults without the patch, so below are
 just testsuite differences for that (debug/dwarf fails consistently
 everywhere), on RHEL5/6 I didn't have earlier go testsuite results,
 so I'm just providing summaries there.

I will look into these failures; I'm not sure what is happening here.

Ian


PATCH: Check Pmode in lwp_slwpcb

2012-03-10 Thread H.J. Lu
Hi,

Pmode may be SImode for TARGET_64BIT.  This patch checks Pmode instead
of TARGET_64BIT in lwp_slwpcb.  Tested on Linux/x86-64.  OK for trunk?
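
A toy model of why the two conditions differ. The enum, struct, and function names below are stand-ins invented for illustration, not GCC's definitions; the x32-like configuration assumes a 64-bit target whose Pmode is SImode, as the description above allows.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for GCC machine modes; the values are sizes in bytes.  */
enum toy_mode { TOY_SImode = 4, TOY_DImode = 8 };

struct toy_target
{
  int target_64bit;       /* models TARGET_64BIT */
  enum toy_mode pmode;    /* models Pmode */
};

/* Operand width chosen when the selection is keyed on TARGET_64BIT,
   as before the patch.  */
static int
width_by_target_64bit (const struct toy_target *t)
{
  assert (t != NULL);
  return t->target_64bit ? 8 : 4;
}

/* Operand width chosen when the selection is keyed on Pmode,
   as after the patch.  */
static int
width_by_pmode (const struct toy_target *t)
{
  assert (t != NULL);
  return t->pmode == TOY_DImode ? 8 : 4;
}

static const struct toy_target toy_ia32 = { 0, TOY_SImode };
static const struct toy_target toy_x86_64 = { 1, TOY_DImode };
static const struct toy_target toy_x32 = { 1, TOY_SImode };
```

The two selectors agree for toy_ia32 and toy_x86_64 and disagree only for toy_x32, which is the case the patch addresses.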

Thanks.


H.J.
---
2012-03-02  H.J. Lu  hongjiu...@intel.com

* config/i386/i386.md (lwp_slwpcb): Check Pmode instead of
TARGET_64BIT.

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 7f5a9e0..8fc7918 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -18015,7 +18065,7 @@
 {
   rtx (*insn)(rtx);
 
-  insn = (TARGET_64BIT
+  insn = (Pmode == DImode
  ? gen_lwp_slwpcbdi
  : gen_lwp_slwpcbsi);
 
-- 
1.7.6.5




Re: [google/integration] Add XFAIL file for arm-grtev2-linux-gnueabi target (issue5798046)

2012-03-10 Thread Diego Novillo

On 10/03/12 01:16 , Doug Kwan wrote:



* contrib/testsuite-management/arm-grtev2-linux-gnueabi.xfail:
New file.


OK.

Diego.



Re: [committed] Update baseline symbols for hppa-linux-gnu

2012-03-10 Thread Jakub Jelinek
On Sat, Mar 10, 2012 at 04:27:47PM -0500, John David Anglin wrote:
 Tested on hppa-unknown-linux-gnu and committed to trunk.
 
 Ok for 4.7?

Ok, but please leave the two TLS: lines out (similarly to how they are left
out for other targets) for now.

 @@ -3288,3 +3613,5 @@
  OBJECT:8:_ZTTSo@@GLIBCXX_3.4
  OBJECT:8:_ZTTSt13basic_istreamIwSt11char_traitsIwEE@@GLIBCXX_3.4
  OBJECT:8:_ZTTSt13basic_ostreamIwSt11char_traitsIwEE@@GLIBCXX_3.4
 +TLS:4:_ZSt11__once_call@@GLIBCXX_3.4.11
 +TLS:4:_ZSt15__once_callable@@GLIBCXX_3.4.11

Jakub


[committed] Skip gcc.dg/torture/pr52402.c execution on 32-bit hppa*-*-hpux*

2012-03-10 Thread John David Anglin
Tested on hppa2.0w-hp-hpux11.11 and hppa64-hp-hpux11.11.  Committed to trunk.

Ok for 4.7?

Dave
-- 
J. David Anglin  dave.ang...@nrc-cnrc.gc.ca
National Research Council of Canada  (613) 990-0752 (FAX: 952-6602)

2012-03-10  John David Anglin  dave.ang...@nrc-cnrc.gc.ca

PR target/52450
* gcc.dg/torture/pr52402.c: Skip execution on 32-bit hppa*-*-hpux*.

Index: gcc.dg/torture/pr52402.c
===
--- gcc.dg/torture/pr52402.c(revision 185121)
+++ gcc.dg/torture/pr52402.c(working copy)
@@ -1,6 +1,7 @@
 /* { dg-do run } */
 /* { dg-options "-w -Wno-psabi" } */
 /* { dg-require-effective-target int32plus } */
+/* { dg-xfail-run-if "pr52450" { { hppa*-*-hpux* } && { ! lp64 } } } */
 
 typedef int v4si __attribute__((vector_size(16)));
 struct T { v4si i[2]; int j; } __attribute__((packed));


Re: [Ping][PATCH, libstdc++-v3] Enable to cross-test libstdc++ on simulator

2012-03-10 Thread Jonathan Wakely
On 7 March 2012 05:22, Terry Guo wrote:
 Hello,

 Can anybody please review and approve the following simple patch? Thanks
 very much.

 http://gcc.gnu.org/ml/libstdc++/2011-08/msg00063.html

I think this looks OK but I'm not familiar with those details of the
testsuite - do any ARM or other maintainers have any comments?


Re: [PATCH 07/10] addr32: Use word_mode instead of Pmode in loop expand

2012-03-10 Thread H.J. Lu
On Thu, Mar 8, 2012 at 3:22 AM, Uros Bizjak ubiz...@gmail.com wrote:
 On Fri, Mar 2, 2012 at 10:02 PM, H.J. Lu hongjiu...@intel.com wrote:

 This patch uses word_mode instead of Pmode in loop expand since
 word_mode may have bigger size than Pmode.  OK for trunk?

 Thanks.

 H.J.
 ---
 2012-03-02  H.J. Lu  hongjiu...@intel.com

        * config/i386/i386.c (ix86_expand_movmem): Use word_mode instead
        of Pmode on loop.
         (ix86_expand_setmem): Likewise.

 Jan, can you please comment on the changes in this patch?


Here is a complete updated patch to use word_mode in ix86_expand_movmem
and ix86_expand_setmem.  It also fixes ix86_zero_extend_to_Pmode to handle
Pmode != DImode.  OK for trunk?
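
A minimal sketch of how the per-iteration size changes when word_mode is wider than Pmode. The struct, its fields, and the toy_x32 values are assumptions for illustration; the 4-versus-2 unroll factor is taken from the movmem hunk in this patch.

```c
#include <assert.h>

/* Toy stand-ins for GET_MODE_SIZE (Pmode) and GET_MODE_SIZE (word_mode);
   the struct and its field names are assumptions for illustration.  */
struct toy_config
{
  int target_64bit;      /* models TARGET_64BIT */
  int pmode_size;        /* bytes in Pmode */
  int word_mode_size;    /* bytes in word_mode */
};

/* Bytes handled per iteration of the plain loop strategy.  */
static int
loop_size_needed (int mode_size)
{
  assert (mode_size > 0);
  return mode_size;
}

/* Bytes handled per iteration of the unrolled loop in the movmem hunk:
   four copies on 64-bit targets, two otherwise.  */
static int
unrolled_loop_size_needed (int mode_size, int target_64bit)
{
  assert (mode_size > 0);
  return mode_size * (target_64bit ? 4 : 2);
}

/* A 64-bit target whose pointers are 32 bits wide.  */
static const struct toy_config toy_x32 = { 1, 4, 8 };
```

For toy_x32, sizing by Pmode gives 4 and 16 bytes per iteration, while sizing by word_mode gives 8 and 32, so the loops move full machine words.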

Thanks.

-- 
H.J.
---
2012-03-10  H.J. Lu  hongjiu...@intel.com

* config/i386/i386.c (ix86_zero_extend_to_Pmode): Handle Pmode
!= DImode.
(ix86_expand_movmem): Use word_mode for size needed for loop.
(ix86_expand_setmem): Likewise.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index bc144a9..a51c6b4 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -21031,7 +21031,11 @@ ix86_zero_extend_to_Pmode (rtx exp)
   if (GET_MODE (exp) == Pmode)
 return copy_to_mode_reg (Pmode, exp);
   r = gen_reg_rtx (Pmode);
-  emit_insn (gen_zero_extendsidi2 (r, exp));
+  if (Pmode == DImode)
+emit_insn (gen_zero_extendsidi2 (r, exp));
+  else
+emit_move_insn (r,
+   simplify_gen_subreg (Pmode, exp, GET_MODE (exp), 0));
   return r;
 }
 
@@ -22060,11 +22064,11 @@ ix86_expand_movmem (rtx dst, rtx src, rtx count_exp, rtx align_exp,
   gcc_unreachable ();
 case loop:
   need_zero_guard = true;
-  size_needed = GET_MODE_SIZE (Pmode);
+  size_needed = GET_MODE_SIZE (word_mode);
   break;
 case unrolled_loop:
   need_zero_guard = true;
-  size_needed = GET_MODE_SIZE (Pmode) * (TARGET_64BIT ? 4 : 2);
+  size_needed = GET_MODE_SIZE (word_mode) * (TARGET_64BIT ? 4 : 2);
   break;
 case rep_prefix_8_byte:
   size_needed = 8;
@@ -22230,13 +22234,13 @@ ix86_expand_movmem (rtx dst, rtx src, rtx count_exp, rtx align_exp,
   break;
 case loop:
   expand_set_or_movmem_via_loop (dst, src, destreg, srcreg, NULL,
-count_exp, Pmode, 1, expected_size);
+count_exp, word_mode, 1, expected_size);
   break;
 case unrolled_loop:
   /* Unroll only by factor of 2 in 32bit mode, since we don't have enough
 registers for 4 temporaries anyway.  */
   expand_set_or_movmem_via_loop (dst, src, destreg, srcreg, NULL,
-count_exp, Pmode, TARGET_64BIT ? 4 : 2,
+count_exp, word_mode, TARGET_64BIT ? 4 : 2,
 expected_size);
   break;
 case rep_prefix_8_byte:
@@ -22448,11 +22452,11 @@ ix86_expand_setmem (rtx dst, rtx count_exp, rtx val_exp, rtx align_exp,
   gcc_unreachable ();
 case loop:
   need_zero_guard = true;
-  size_needed = GET_MODE_SIZE (Pmode);
+  size_needed = GET_MODE_SIZE (word_mode);
   break;
 case unrolled_loop:
   need_zero_guard = true;
-  size_needed = GET_MODE_SIZE (Pmode) * 4;
+  size_needed = GET_MODE_SIZE (word_mode) * 4;
   break;
 case rep_prefix_8_byte:
   size_needed = 8;
@@ -22623,11 +22627,11 @@ ix86_expand_setmem (rtx dst, rtx count_exp, rtx val_exp, rtx align_exp,
   break;
 case loop:
   expand_set_or_movmem_via_loop (dst, NULL, destreg, NULL, promoted_val,
-count_exp, Pmode, 1, expected_size);
+count_exp, word_mode, 1, expected_size);
   break;
 case unrolled_loop:
   expand_set_or_movmem_via_loop (dst, NULL, destreg, NULL, promoted_val,
-count_exp, Pmode, 4, expected_size);
+count_exp, word_mode, 4, expected_size);
   break;
 case rep_prefix_8_byte:
   expand_setmem_via_rep_stos (dst, destreg, promoted_val, count_exp,


PATCH: Use Pmode on x86_64 this parameter

2012-03-10 Thread H.J. Lu
Hi,

This patch replaces DImode with Pmode for the "this" parameter on x86_64.
OK for trunk?

Thanks.

H.J.
---
2012-03-10  H.J. Lu  hongjiu...@intel.com

* config/i386/i386.c (x86_this_parameter): Replace DImode with
Pmode.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index bc144a9..bfa3cdc 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -31971,7 +31978,7 @@ x86_this_parameter (tree function)
 parm_regs = x86_64_ms_abi_int_parameter_registers;
   else
 parm_regs = x86_64_int_parameter_registers;
-  return gen_rtx_REG (DImode, parm_regs[aggr]);
+  return gen_rtx_REG (Pmode, parm_regs[aggr]);
 }
 
   nregs = ix86_function_regparm (type, function);


PATCH: Check ptr_mode and use Pmode in ix86_trampoline_init

2012-03-10 Thread H.J. Lu
Hi,

x86 trampoline depends on ptr_mode.  This patch checks ptr_mode, instead
of TARGET_X32.  Also we should use Pmode for address mode.  Tested on
Linux/x86-64.  OK for trunk?
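
A sketch of the two encodings the trampoline chooses between when loading the function address into r11. The function names are hypothetical, and the numeric range test stands in for x86_64_zext_immediate_operand, which in GCC works on RTL operands rather than plain integers. The opcode bytes 41 bb match the 0xbb41 HImode store in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store the low NBYTES bytes of VALUE at BUF in little-endian order.  */
static void
put_le (uint8_t *buf, uint64_t value, size_t nbytes)
{
  for (size_t i = 0; i < nbytes; i++)
    buf[i] = (uint8_t) (value >> (8 * i));
}

/* Hypothetical helper: emit "movl $imm32, %r11d" (41 bb imm32) when the
   pointer mode is 32 bits wide or the address fits in 32 bits zero
   extended; otherwise emit "movabs $imm64, %r11" (49 bb imm64).
   Returns the number of bytes written (6 or 10).  */
static size_t
emit_load_r11 (uint8_t *buf, uint64_t fnaddr, int ptr_mode_is_simode)
{
  assert (buf != NULL);
  if (ptr_mode_is_simode || fnaddr <= UINT32_MAX)
    {
      buf[0] = 0x41;            /* REX.B */
      buf[1] = 0xbb;            /* mov imm32 to r11d */
      put_le (buf + 2, fnaddr, 4);
      return 6;
    }
  buf[0] = 0x49;                /* REX.W + REX.B */
  buf[1] = 0xbb;                /* mov imm64 to r11 */
  put_le (buf + 2, fnaddr, 8);
  return 10;
}
```

With a 32-bit pointer mode the address always fits in the 6-byte form, which is why the patch takes that path without consulting the immediate predicate.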

Thanks.


H.J.
---
2012-03-10  H.J. Lu  hongjiu...@intel.com

* config/i386/i386.c (ix86_trampoline_init): Use movl for 64bit if
ptr_mode == SImode.  Replace DImode with Pmode or ptr_mode.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index bc144a9..bfa3cdc 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -24309,10 +24313,13 @@ ix86_trampoline_init (rtx m_tramp, tree fndecl, rtx chain_value)
   /* Load the function address to r11.  Try to load address using
 the shorter movl instead of movabs.  We may want to support
 movq for kernel mode, but kernel does not use trampolines at
-the moment.  */
-  if (x86_64_zext_immediate_operand (fnaddr, VOIDmode))
+the moment.  FNADDR is a 32bit address and may not be in
+DImode when ptr_mode == SImode.  Always use movl in this
+case.  */
+  if (ptr_mode == SImode
+ || x86_64_zext_immediate_operand (fnaddr, VOIDmode))
{
- fnaddr = copy_to_mode_reg (DImode, fnaddr);
+ fnaddr = copy_to_mode_reg (Pmode, fnaddr);
 
  mem = adjust_address (m_tramp, HImode, offset);
  emit_move_insn (mem, gen_int_mode (0xbb41, HImode));
@@ -24331,9 +24338,9 @@ ix86_trampoline_init (rtx m_tramp, tree fndecl, rtx chain_value)
  offset += 10;
}
 
-  /* Load static chain using movabs to r10.  Use the
-shorter movl instead of movabs for x32.  */
-  if (TARGET_X32)
+  /* Load static chain using movabs to r10.  Use the shorter movl
+ instead of movabs when ptr_mode == SImode.  */
+  if (ptr_mode == SImode)
{
  opcode = 0xba41;
  size = 6;