date:20120307

Re: [RFC]: Add support for pragma pointer_size

2012-03-07 Thread Tristan Gingold


On Mar 6, 2012, at 6:34 PM, Joseph S. Myers wrote:

 On Tue, 6 Mar 2012, Tristan Gingold wrote:
 
 The patch is simple: the C front-end will now call c_build_pointer_type 
 (instead of build_pointer_type), which in turn calls 
 build_pointer_type_for_mode using the right mode.
 
 There seem to be quite a lot of build_pointer_type calls in the C front 
 end (and in c-common.c) that you haven't changed.  Could you explain the 
 rule for when a call should or should not be changed, and how it applies 
 to all these calls?

The global approach is to have the same effect as a default 
__attribute__((mode(SI/DImode))) on pointers declared by users so that layouts 
match.  That's why only grokdeclarator is changed.

There might be bugs with this approach (e.g. it looks like 
c-common.c:handle_noreturn_attribute doesn't preserve the mode of the pointer 
to function), but my understanding is that they also correspond to bugs of 
__attribute__((mode ())) applied to pointers.  The latter isn't well tested 
and one advantage of the VMS port is that it will test it much more (as there 
are many pragma pointer_size in VMS headers).

I haven't tried to exactly follow the DEC-C rules, because:
- they aren't formally written
- the current behavior of the front and middle end is already correct

I agree that it is possible to concoct tests (e.g. using sizeof) that will 
return different results between gcc and DEC-C, but my purpose is to be 
reasonably compatible, not fully compatible.  There are some features (such as 
supporting VAX float) that I don't plan to implement :-)
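
To illustrate why the pointer mode matters for layout, here is a minimal
sketch.  It does not use the pragma or the mode attribute; it only models a
32-bit and a 64-bit pointer with fixed-width integers, and the type and
function names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for pointers of two different modes.  These are plain
   integers, not real pointers; the names are hypothetical.  */
typedef uint32_t short_ptr;   /* models an SImode (32-bit) pointer */
typedef uint64_t long_ptr;    /* models a DImode (64-bit) pointer */

static_assert(sizeof(short_ptr) < sizeof(long_ptr),
              "the two pointer models must differ in size");

/* The same record declared under each pointer size.  */
struct item_short { short_ptr next; uint32_t value; };
struct item_long  { long_ptr  next; uint32_t value; };

/* Offset of the `value` member under each pointer size.  */
size_t value_offset_short(void) { return offsetof(struct item_short, value); }
size_t value_offset_long(void)  { return offsetof(struct item_long, value); }
```

A header that declares such a record under one pointer size must be compiled
with the same size by every user, otherwise the member offsets disagree.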

Tristan.

Re: [Ping][PATCH, libstdc++-v3] Enable to cross-test libstdc++ on simulator

2012-03-07 Thread Jonathan Wakely

On 7 March 2012 05:22, Terry Guo wrote:
 Hello,

 Can anybody please review and approve the following simple patch? Thanks
 very much.

 http://gcc.gnu.org/ml/libstdc++/2011-08/msg00063.html

I'll test it on x86_64-linux as soon as trunk is able to build again.

Re: [patch] PR 51417

2012-03-07 Thread Richard Guenther

On Wed, 7 Mar 2012, Ralf Corsepius wrote:

 On 03/06/2012 10:43 AM, Richard Guenther wrote:
  On Mon, 5 Mar 2012, Ralf Corsépius wrote:
  
   Hi,
   
   The patch below addresses an issue with gcc-4.7.0 that I had reported in
   http://gcc.gnu.org/ml/gcc/2012-03/msg00035.html
   
   and somebody else had bz'ed as
   http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51417
   
   Tested by cross-building gcc-4_7-branch for several *rtems targets on
   Fedora
   16. Further test-builds are in progress.
   
   No native build-testing, yet.
   
   OK to apply?
  
  Ok for trunk and the 4.7 branch if you've done one native build
  and install successfully.
 
 The patch seems to work for native build/install tests (Here
 fedora-16-x86_64):
 
 # ../configure --disable-nls --prefix=/foo/bar --enable-languages=c,c++ --disable-multilib --with-system-zlib
 # make
 # make install DESTDIR=~/tmp/INSTALL
 
 # cd ~/tmp/INSTALL
 # ls -l foo/bar/bin/*
 -rwxr-xr-x. 4 rtems rtems 2099206 Mar  7 08:51 foo/bar/bin/c++
 -rwxr-xr-x. 1 rtems rtems 2096784 Mar  7 08:51 foo/bar/bin/cpp
 -rwxr-xr-x. 4 rtems rtems 2099206 Mar  7 08:51 foo/bar/bin/g++
 -rwxr-xr-x. 3 rtems rtems 2094252 Mar  7 08:51 foo/bar/bin/gcc
 -rwxr-xr-x. 2 rtems rtems  113695 Mar  7 08:51 foo/bar/bin/gcc-ar
 -rwxr-xr-x. 2 rtems rtems  113631 Mar  7 08:51 foo/bar/bin/gcc-nm
 -rwxr-xr-x. 2 rtems rtems  113643 Mar  7 08:51 foo/bar/bin/gcc-ranlib
 -rwxr-xr-x. 1 rtems rtems 1064274 Mar  7 08:51 foo/bar/bin/gcov
 -rwxr-xr-x. 4 rtems rtems 2099206 Mar  7 08:51 foo/bar/bin/x86_64-unknown-linux-gnu-c++
 -rwxr-xr-x. 4 rtems rtems 2099206 Mar  7 08:51 foo/bar/bin/x86_64-unknown-linux-gnu-g++
 -rwxr-xr-x. 3 rtems rtems 2094252 Mar  7 08:51 foo/bar/bin/x86_64-unknown-linux-gnu-gcc
 -rwxr-xr-x. 3 rtems rtems 2094252 Mar  7 08:51 foo/bar/bin/x86_64-unknown-linux-gnu-gcc-4.7.0
 -rwxr-xr-x. 2 rtems rtems  113695 Mar  7 08:51 foo/bar/bin/x86_64-unknown-linux-gnu-gcc-ar
 -rwxr-xr-x. 2 rtems rtems  113631 Mar  7 08:51 foo/bar/bin/x86_64-unknown-linux-gnu-gcc-nm
 -rwxr-xr-x. 2 rtems rtems  113643 Mar  7 08:51 foo/bar/bin/x86_64-unknown-linux-gnu-gcc-ranlib

The patch is ok to install on the trunk and on the 4.7 branch then.

Thanks,
Richard.

Re: [PR51752] publication safety violations in loop invariant motion pass

2012-03-07 Thread Richard Guenther

On Tue, Mar 6, 2012 at 9:56 PM, Torvald Riegel trie...@redhat.com wrote:
 On Tue, 2012-03-06 at 21:18 +0100, Richard Guenther wrote:
 On Tue, Mar 6, 2012 at 6:55 PM, Aldy Hernandez al...@redhat.com wrote:
  On 02/29/12 03:22, Richard Guenther wrote:
 
  So fixing up individual passes is easier - I can only think of PRE being
  problematic right now, I am not aware that any other pass moves loads
  or stores.  So I'd simply pre-compute the stmt bit in PRE and adjust
  the
 
            if (gimple_has_volatile_ops (stmt)
                || stmt_could_throw_p (stmt))
              continue;
 
  in compute_avail accordingly.
 
 
  Initially I thought PRE would be problematic for transactions, but perhaps
  it isn't.  As I understand, for PRE we hoist loads/computations that are
  mostly redundant, but will be performed on every path:
 
         if (flag)
                 a = b + c;
         else
                 stuff;
         d = b + c;              <-- [b + c] always computed
 
  Even if we hoist [b + c] before the flag, [b + c] will be computed on every
  path out of if (flag).  So... we can allow this transformation within
  transactions, right?

 In this particular example, I agree.  We can move [b + c] into the else
 branch, and then move it to before flag because it will happen on all
 paths to the exit anyway.

 Note that partial PRE (enabled at -O3) can insert expressions into paths
 that did _not_ execute the expression.  For regular PRE you are right.

 I suppose if only loads will be moved around by PRE, then this could be
 fine, as long as those expressions do not have visible side effects or
 can crash if reading garbage.  For example, dereferencing pointers
 could lead to accessing unmapped memory and thus segfaults, speculative
 stores are not allowed (even if you undo them later on), etc.

 Also, if PRE inserts expressions into paths that did not execute the
 transactions, can it happen that then something like loop invariant
 motion comes around and optimizes based on that and moves the code to
 before if (flag)...?  If so, PRE would break publication safety
 indirectly by pretending that the expression happened on every path to
 the exit, tricking subsequent passes to believe things that were not in
 place in the source code.  Is this a realistic scenario?

I think so.
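
The concern can be sketched in plain C.  This is a single-threaded model, not
compiler output: the load is instrumented with a counter so that the extra
read introduced by hoisting becomes observable.  All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

static int load_count;          /* counts reads of the shared data */

/* Instrumented load: every read of *p is recorded.  */
static int load(const int *p)
{
    assert(p != NULL);
    load_count++;
    return *p;
}

/* Source form: the data is read only on the path where flag is set.  */
int reader_original(int flag, const int *data)
{
    int result = 0;
    if (flag)
        result = load(data);
    return result;
}

/* After a hypothetical hoist: the read happens before the flag test,
   so it is also performed on the path that never asked for it.  */
int reader_hoisted(int flag, const int *data)
{
    int tmp = load(data);
    int result = 0;
    if (flag)
        result = tmp;
    return result;
}

int loads_performed(void) { return load_count; }
void reset_loads(void)    { load_count = 0; }
```

Both functions return the same value, but with flag clear only the hoisted
form touches the data.  If another thread publishes the data by setting the
flag, that early read is exactly what publication safety forbids.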

Richard.

Re: [PATCH][1/n] No longer sign-extend sizetype constants, remove TYPE_IS_SIZETYPE

2012-03-07 Thread Richard Guenther

On Tue, 6 Mar 2012, Eric Botcazou wrote:

  Well.  I suppose fixing that negative DECL_FIELD_OFFSET thing should
  be #1 priority.
 
 OK, let me try over the next few days.

Thanks.

Btw, making [s]bitsizetype have TYPE_PRECISION of [s]sizetype plus
log2 (BITS_PER_UNIT) solves quite a few issues with no longer
sign-extending sizetype constants.  A proper-precision bitsizetype
behaves more like a twos-complement type (and avoids similar
issues we have when doing twos-complement arithmetic on an
unsigned HOST_WIDE_INT variable that we get from a -m32
sizetype constant on a 64bit HWI host - for that we'd need a
HOST_SIZE_INT, maybe a good idea anyway?).
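
A small arithmetic sketch of why the extra log2 (BITS_PER_UNIT) bits help,
assuming 8-bit units and a 32-bit sizetype.  The helper names are made up for
the illustration and uint64_t merely stands in for a 35-bit type:

```c
#include <assert.h>
#include <stdint.h>

#define BITS_PER_UNIT_SKETCH 8u   /* assumption: 8-bit storage units */

/* log2 of a power of two.  */
unsigned log2_pow2(unsigned x)
{
    assert(x != 0 && (x & (x - 1)) == 0);
    unsigned n = 0;
    while (x > 1) {
        x >>= 1;
        n++;
    }
    return n;
}

/* Precision a bit-offset type needs so that any byte offset of the
   given precision can be converted to bits without wrapping.  */
unsigned bitsize_precision(unsigned size_precision)
{
    return size_precision + log2_pow2(BITS_PER_UNIT_SKETCH);
}

/* Byte offset to bit offset in a type no wider than sizetype: may wrap.  */
uint32_t bits_in_32(uint32_t bytes)
{
    return (uint32_t) (bytes * BITS_PER_UNIT_SKETCH);
}

/* The same conversion in a wider type: never wraps for 32-bit input.  */
uint64_t bits_wide(uint32_t bytes)
{
    return (uint64_t) bytes * BITS_PER_UNIT_SKETCH;
}
```

With a 32-bit size the bit offset needs 35 bits; the byte offset 0x20000000
already wraps to zero when scaled in 32 bits.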

Richard.

[PATCH] Fix WIDEN_MULT_EXPR generation

2012-03-07 Thread Richard Guenther


When the optab causes the actual mode to be wider we have to re-check
whether we still fulfill the gimple constraints we verify.
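
As a rough sketch of the re-check, assuming the constraint is that a widening
multiply needs a result at least twice as wide as its inputs.  The helper is
hypothetical and takes plain precisions instead of GCC types and modes:

```c
#include <assert.h>
#include <stdbool.h>

/* Returns true if inputs of ACTUAL_PRECISION bits can still feed a
   widening multiply whose result type has RESULT_PRECISION bits.
   Hypothetical helper; it mirrors only the precision test.  */
bool widen_mult_fits(unsigned actual_precision, unsigned result_precision)
{
    assert(actual_precision > 0 && result_precision > 0);
    return 2 * actual_precision <= result_precision;
}
```

For example, 16-bit inputs fit a 32-bit result, but if the optab forces the
inputs up to 32 bits the same 32-bit result is no longer wide enough.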

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2012-03-07  Richard Guenther  rguent...@suse.de

* tree-ssa-math-opts.c (convert_mult_to_widen): Check actual
precision against gimple constraints.

Index: gcc/tree-ssa-math-opts.c
===
*** gcc/tree-ssa-math-opts.c(revision 185025)
--- gcc/tree-ssa-math-opts.c(working copy)
*** convert_mult_to_widen (gimple stmt, gimp
*** 2158,2163 
--- 2158,2165 
/* Ensure that the inputs to the handler are in the correct precison
   for the opcode.  This will be the full mode size.  */
actual_precision = GET_MODE_PRECISION (actual_mode);
+   if (2 * actual_precision > TYPE_PRECISION (type))
+ return false;
if (actual_precision != TYPE_PRECISION (type1)
|| from_unsigned1 != TYPE_UNSIGNED (type1))
  {

Re: [patch] Fix non-standard Ada bootstrap failure on IA-64

2012-03-07 Thread Richard Guenther

On Tue, Mar 6, 2012 at 11:08 PM, Eric Botcazou ebotca...@adacore.com wrote:
 If you try to bootstrap the GNAT 4.7.0 compiler on IA-64/Linux with non-default
 options (-gnatpg replaced with -gnatpgn), you get another comparison failure
 caused by debug insns, stemming from the machine-specific reorg pass (aka insn
 bundling on IA-64).  With -g, when cselib is called on:

 (insn 17 41 18 2 (set (reg/f:DI 14 r14 [357])
        (plus:DI (reg:DI 16 r16 [356])
            (const_int 28 [0x1c]))) /home/eric/gnat.b/gnat7_47/src/gcc/ada/atree.adb:2244 205 {adddi3}
     (expr_list:REG_DEAD (reg:DI 16 r16 [356])
        (nil)))

 it finds a previous equivalent value:

 (plus:DI (plus:DI (ashift:DI (value:DI 5:111 @0x29abbe0/0x29f5f20)
            (const_int 5 [0x5]))
        (value:DI 9:9 @0x29abc40/0x29f5fe0))
    (const_int 28 [0x1c]))

 computed for a debug insn:

 (debug_insn 12 10 65 2 (var_location:SI n (mem/j:SI (plus:DI (plus:DI
 (ashift:DI (reg:DI 14 r14 [orig:344 D.2979 ] [344])
                    (const_int 5 [0x5]))
                (reg/f:DI 15 r15 [orig:342
 atree__atree_private_part__nodes__table.32 ] [342]))
            (const_int 28 [0x1c])) [0
 *atree__atree_private_part__nodes__table.32_17
 [D.2979_19].is_extension___XVN.S0.field5+0 S4 A8])) sem_ch2.adb:49 -1
     (nil))

 When output_dependence is called on a couple of MEMs, it uses the above value
 to get the equivalent addresses:

 (plus:DI (value:DI 9:9 @0x29abc40/0x29f5fe0)
    (value:DI 12:4189 @0x29abc88/0x29c8a20))

 and

 (plus:DI (plus:DI (ashift:DI (value:DI 5:111 @0x29abbe0/0x29f5f20)
            (const_int 5 [0x5]))
        (value:DI 9:9 @0x29abc40/0x29f5fe0))
    (const_int 28 [0x1c]))

 and rtx_refs_may_alias_p returns true on them because ao_ref_from_mem returns
 false for one of the MEMs.


 Without -g, when cselib is called on:

 (insn 14 30 15 2 (set (reg/f:DI 14 r14 [357])
        (plus:DI (reg:DI 16 r16 [356])
            (const_int 28 [0x1c]))) /home/eric/gnat.b/gnat7_47/src/gcc/ada/atree.adb:2244 205 {adddi3}
     (expr_list:REG_DEAD (reg:DI 16 r16 [356])
        (nil)))

 output_dependence only gets the equivalent addresses:

 (plus:DI (value:DI 8:8 @0x299f2e8/0x299f1a0)
    (value:DI 10:4188 @0x299f318/0x299f200))

 and

 (plus:DI (value:DI 12:4254 @0x299f348/0x29a0490)
    (const_int 28 [0x1c]))

 and memrefs_conflict_p is able to prove that they don't conflict.


 The problem is that the more complex expression in the first case fools
 memrefs_conflict_p because the predicate makes a wrong assumption about the
 canonicalization of address expressions.  Hence the attached patch.

 Bootstrapped/regtested on IA-64/Linux, OK for the mainline?  Do we also want it
 for 4.7.1 or is it too specific?

Hmm, but isn't the bug that we feed debug-insn mems to memrefs_conflict_p?
Or that we have non-legitimate address expressions in them?

Richard.


 2012-03-06  Eric Botcazou  ebotca...@adacore.com

        * alias.c (memrefs_conflict_p) PLUS: Correct wrong assumption about
        canonicalization of address expressions.


 --
 Eric Botcazou

Re: [patch] Fix non-standard Ada bootstrap failure on IA-64

2012-03-07 Thread Jakub Jelinek

On Wed, Mar 07, 2012 at 10:27:22AM +0100, Richard Guenther wrote:
 Hmm, but isn't the bug that we feed debug-insn mems to memrefs_conflict_p?
 Or that we have non-legitimate address expressions in them?

CCing Alex.  I think we feed debug insn mems in the scheduler to be able to
find out what debug insns need to be invalidated and what can be kept.
And any address expressions are legitimate for debug insns, why should we be
limited by what the ISA allows?  All that limits us is whether we can express
those expressions in DWARF or not.
How can this be reproduced with a cross?

  2012-03-06  Eric Botcazou  ebotca...@adacore.com
 
         * alias.c (memrefs_conflict_p) PLUS: Correct wrong assumption about
         canonicalization of address expressions.

Jakub

Re: [patch mingw/cygwin]: Allow relocated const data to be put in read-only section by default

2012-03-07 Thread Kai Tietz

Hi,

as there is no objection from Dave's side, I have committed this patch to
4.8 at revision 185027.

Regards,
Kai

2012/3/3 Kai Tietz ktiet...@googlemail.com:
 Hi,

 this patch allows relocated const data to be placed into .rdata.  To provide the
 old behavior for older runtimes that do not support pseudo-relocation operating on
 read-only sections, the option -fwritable-relocated-rdata can be used.
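
 For readers unfamiliar with the term, a minimal sketch of "relocated const
 data": a const object whose initializer is an address.  Whether a given
 object really needs a run-time relocation depends on the target and on how
 the image is linked; the names below are only for illustration.

```c
#include <assert.h>
#include <stddef.h>

int counter = 42;

/* Const, but the initializer is an address.  If the image is loaded at
   a different base, this stored address has to be fixed up, so the
   object is a candidate for "relocated const data".  */
const int *const counter_ptr = &counter;

/* Const with a plain value: nothing to fix up at load time.  */
const int plain_constant = 7;

int read_through_relocated_pointer(void)
{
    assert(counter_ptr != NULL);
    return *counter_ptr;
}
```

 The patch changes only which section the first kind of object lands in;
 the program's observable behavior is meant to stay the same.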

 ChangeLog

 2012-03-03  Kai Tietz  kti...@redhat.com

        * doc/invoke.texi (fwritable-relocated-rdata): Document
        new Cygwin/MinGW target option.
        * config/i386/winnt.c (i386_pe_unique_section): Ignore
        reloc if flag -fwritable-relocated-rdata is not set.
        (i386_pe_section_type_flags): Likewise.
        * config/i386/cygming.opt (fwritable-relocated-rdata):
        Add new flag variable flag_writable_rel_rdata.

 Tested for i686-w64-mingw32, x86_64-w64-mingw32, and i686-pc-cygwin.
 OK to apply?

 Regards,
 Kai

 Index: doc/invoke.texi
 ===
 --- doc/invoke.texi     (revision 184760)
 +++ doc/invoke.texi     (working copy)
 @@ -13826,6 +13826,13 @@
  Windows, as there the user32 API, which is used to set executable
  privileges, isn't available.

 +@item -fwritable-relocated-rdata
 +@opindex fno-writable-relocated-rdata
 +This option is available for MinGW and Cygwin targets.  It specifies
 +that relocated-data in read-only section is put into .data
 +section.  This is a necessary for older runtimes not supporting
 +modification of .rdata sections for pseudo-relocation.
 +
  @item -mpe-aligned-commons
  @opindex mpe-aligned-commons
  This option is available for Cygwin and MinGW targets.  It
 Index: config/i386/winnt.c
 ===
 --- config/i386/winnt.c (revision 184760)
 +++ config/i386/winnt.c (working copy)
 @@ -395,6 +395,10 @@
   const char *name, *prefix;
   char *string;

 +  /* Ignore RELOC, if we are allowed to put relocated
 +     const data into read-only section.  */
 +  if (!flag_writable_rel_rdata)
 +    reloc = 0;
   name = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl));
   name = i386_pe_strip_name_encoding_full (name);

 @@ -441,6 +445,10 @@
   unsigned int flags;
   unsigned int **slot;

 +  /* Ignore RELOC, if we are allowed to put relocated
 +     const data into read-only section.  */
 +  if (!flag_writable_rel_rdata)
 +    reloc = 0;
   /* The names we put in the hashtable will always be the unique
      versions given to us by the stringtable, so we can just use
      their addresses as the keys.  */
 Index: config/i386/cygming.opt
 ===
 --- config/i386/cygming.opt     (revision 184760)
 +++ config/i386/cygming.opt     (working copy)
 @@ -53,4 +53,8 @@
  posix
  Driver

 +fwritable-relocated-rdata
 +Common Report Var(flag_writable_rel_rdata) Init(0)
 +Put relocated read-only data into .data section.
 +
  ; Retain blank line above




[committed] PR 52515: Missing GTY markers

2012-03-07 Thread Richard Sandiford

This patch restores x86_64-linux-gnu bootstrap after my patch for 52372.
Applied as obvious.

Sorry for the breakage.

Richard


gcc/
PR middle-end/52515
* rtl.h (pc_rtx, cc0_rtx, ret_rtx, simple_return_rtx): Add GTY markers.

Index: gcc/rtl.h
===
--- gcc/rtl.h   2012-03-07 09:49:12.0 +
+++ gcc/rtl.h   2012-03-07 09:49:12.878790539 +
@@ -2089,10 +2089,10 @@ #define CONST1_RTX(MODE) (const_tiny_rtx
 #define CONST2_RTX(MODE) (const_tiny_rtx[2][(int) (MODE)])
 #define CONSTM1_RTX(MODE) (const_tiny_rtx[3][(int) (MODE)])
 
-extern rtx pc_rtx;
-extern rtx cc0_rtx;
-extern rtx ret_rtx;
-extern rtx simple_return_rtx;
+extern GTY(()) rtx pc_rtx;
+extern GTY(()) rtx cc0_rtx;
+extern GTY(()) rtx ret_rtx;
+extern GTY(()) rtx simple_return_rtx;
 
 /* If HARD_FRAME_POINTER_REGNUM is defined, then a special dummy reg
is used to represent the frame pointer.  This is because the

Re: [committed] PR 52515: Missing GTY markers

2012-03-07 Thread Jakub Jelinek

On Wed, Mar 07, 2012 at 09:52:51AM +, Richard Sandiford wrote:
 This patch restores x86_64-linux-gnu bootstrap after my patch for 52372.
 Applied as obvious.
 
 Sorry for the breakage.

Was it really necessary to move these out of the global rtx array?
Now you completely unnecessarily have 4 new GTY roots with all the overhead
it has.  Wouldn't just moving their initialization elsewhere be sufficient?

 gcc/
   PR middle-end/52515
   * rtl.h (pc_rtx, cc0_rtx, ret_rtx, simple_return_rtx): Add GTY markers.
 
 Index: gcc/rtl.h
 ===
 --- gcc/rtl.h 2012-03-07 09:49:12.0 +
 +++ gcc/rtl.h 2012-03-07 09:49:12.878790539 +
 @@ -2089,10 +2089,10 @@ #define CONST1_RTX(MODE) (const_tiny_rtx
  #define CONST2_RTX(MODE) (const_tiny_rtx[2][(int) (MODE)])
  #define CONSTM1_RTX(MODE) (const_tiny_rtx[3][(int) (MODE)])
  
 -extern rtx pc_rtx;
 -extern rtx cc0_rtx;
 -extern rtx ret_rtx;
 -extern rtx simple_return_rtx;
 +extern GTY(()) rtx pc_rtx;
 +extern GTY(()) rtx cc0_rtx;
 +extern GTY(()) rtx ret_rtx;
 +extern GTY(()) rtx simple_return_rtx;
  
  /* If HARD_FRAME_POINTER_REGNUM is defined, then a special dummy reg
 is used to represent the frame pointer.  This is because the

Jakub

Re: [patch] Fix non-standard Ada bootstrap failure on IA-64

2012-03-07 Thread Richard Guenther

On Wed, Mar 7, 2012 at 10:35 AM, Jakub Jelinek ja...@redhat.com wrote:
 On Wed, Mar 07, 2012 at 10:27:22AM +0100, Richard Guenther wrote:
 Hmm, but isn't the bug that we feed debug-insn mems to memrefs_conflict_p?
 Or that we have non-legitimate address expressions in them?

 CCing Alex.  I think we feed debug insn mems in the scheduler to be able to
 find out what debug insns need to be invalidated and what can be kept.
 And any address expressions are legitimate for debug insns, why should we be
 limited by what the ISA allows?  All that limits us is whether we can express
 those expressions in DWARF or not.
 How can this be reproduced with a cross?

Hmm, but then this complicates and slows down the generic alias machinery.
Of course IMHO the RTL alias machinery should be conservative with respect
to what the RTL IL allows - so the question is are non-legitimate addresses
valid in regular instructions at any point?

Richard.

  2012-03-06  Eric Botcazou  ebotca...@adacore.com
 
         * alias.c (memrefs_conflict_p) PLUS: Correct wrong assumption 
  about
         canonicalization of address expressions.

        Jakub

Re: PATCH: Properly check mode for x86 call/jmp address

2012-03-07 Thread Uros Bizjak

On Wed, Mar 7, 2012 at 10:28 AM, Uros Bizjak ubiz...@gmail.com wrote:

 +  if (TARGET_X32)
 +    operands[0] = convert_memory_address (word_mode, operands[0]);

 This addition to indirect_jump and tablejump should be the only
 change needed in i386.md now. Please write the condition

 if (Pmode != word_mode)

 for consistency.

Ah, I vaguely remember that indirect call/jmp is invalid on X32 for
some other reason. So, please leave the condition above as is and also
revert similar change in attached patch back to (not (match_test
TARGET_X32)).
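
What the quoted convert_memory_address call has to achieve can be modelled
with fixed-width integers: a 32-bit address is widened to a 64-bit word
before it is used as a jump target.  The sketch assumes that pointers are
zero-extended on this target; the sign-extending variant is shown only for
contrast, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

static_assert(sizeof(uint64_t) > sizeof(uint32_t),
              "the word model must be wider than the pointer model");

/* Zero-extend a 32-bit address to a 64-bit word (the assumed behavior).  */
uint64_t widen_address_unsigned(uint32_t addr)
{
    return (uint64_t) addr;
}

/* Sign-extend instead, for contrast.  The conversion to int32_t is
   implementation-defined for large values; GCC wraps modulo 2^32.  */
uint64_t widen_address_signed(uint32_t addr)
{
    return (uint64_t) (int64_t) (int32_t) addr;
}
```

The two results differ for addresses with the top bit set, which is why the
mode of the operand has to be checked rather than assumed.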

Uros.

Re: [committed] PR 52515: Missing GTY markers

2012-03-07 Thread Richard Sandiford

Jakub Jelinek ja...@redhat.com writes:
 On Wed, Mar 07, 2012 at 09:52:51AM +, Richard Sandiford wrote:
 This patch restores x86_64-linux-gnu bootstrap after my patch for 52372.
 Applied as obvious.
 
 Sorry for the breakage.

 Was it really necessary to move these out of the global rtx array?
 Now you completely unnecessarily have 4 new GTY roots with all the overhead
 it has.  Wouldn't just moving their initialization elsewhere be sufficient?

It had to be outside the global array, because that's a per-target thing.
E.g. MIPS16 and non-MIPS16 have separate target_rtl structures, but ought
to have the same pc_rtx (just as they ought to have the same const0_rtx, etc.)

I hadn't realised that the overhead of 4 roots was much greater than the
overhead of one root pointing to a 4-element array.  We could have a new
4-element array if that's a problem.
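
A sketch of that alternative, with the four objects reached through one
array so that only a single object would have to be registered as a root.
The type, the array, and the macro names are hypothetical stand-ins and not
GCC's real declarations:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for GCC's rtx; only a name is stored.  */
struct rtx_sketch { const char *name; };

enum shared_index { SH_PC, SH_CC0, SH_RET, SH_SIMPLE_RETURN, SH_MAX };

/* Backing objects for the target-independent rtxes.  */
static struct rtx_sketch shared_storage[SH_MAX] = {
    { "pc" }, { "cc0" }, { "return" }, { "simple_return" }
};

/* The single array a collector would have to register as a root.  */
static struct rtx_sketch *shared_rtl[SH_MAX];

/* Accessor macros keep a familiar spelling at use sites.  */
#define pc_rtx_sketch            (shared_rtl[SH_PC])
#define cc0_rtx_sketch           (shared_rtl[SH_CC0])
#define ret_rtx_sketch           (shared_rtl[SH_RET])
#define simple_return_rtx_sketch (shared_rtl[SH_SIMPLE_RETURN])

void init_shared_rtl(void)
{
    for (int i = 0; i < SH_MAX; i++)
        shared_rtl[i] = &shared_storage[i];
}

const char *shared_name(enum shared_index i)
{
    assert(i < SH_MAX && shared_rtl[i] != NULL);
    return shared_rtl[i]->name;
}
```

Use sites are unchanged by such a scheme; only the declaration side moves
from four separate variables to one array.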

Richard

Re: PATCH: Properly check mode for x86 call/jmp address

2012-03-07 Thread Uros Bizjak

On Wed, Mar 7, 2012 at 11:07 AM, Uros Bizjak ubiz...@gmail.com wrote:
 On Wed, Mar 7, 2012 at 10:28 AM, Uros Bizjak ubiz...@gmail.com wrote:

 +  if (TARGET_X32)
 +    operands[0] = convert_memory_address (word_mode, operands[0]);

 This addition to indirect_jump and tablejump should be the only
 change needed in i386.md now. Please write the condition

 if (Pmode != word_mode)

 for consistency.

 Ah, I vaguely remember that indirect call/jmp is invalid on X32 for
 some other reason. So, please leave the condition above as is and also
 revert similar change in attached patch back to (not (match_test
 TARGET_X32)).

Now with the predicates.md patch attached.

Uros.
Index: predicates.md
===
--- predicates.md   (revision 184992)
+++ predicates.md   (working copy)
@@ -1,5 +1,5 @@
 ;; Predicate definitions for IA-32 and x86-64.
-;; Copyright (C) 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011
+;; Copyright (C) 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012
 ;; Free Software Foundation, Inc.
 ;;
 ;; This file is part of GCC.
@@ -557,22 +565,27 @@
        (match_operand 0 "immediate_operand")))
 
 ;; Test for a valid operand for indirect branch.
+;; Allow register operands in word mode only.
 (define_predicate "indirect_branch_operand"
-  (if_then_else (match_test "TARGET_X32")
-    (match_operand 0 "register_operand")
-    (match_operand 0 "nonimmediate_operand")))
+  (ior (match_test "register_operand
+                    (op, mode == VOIDmode ? mode : word_mode)")
+       (and (not (match_test "TARGET_X32"))
+            (match_operand 0 "memory_operand"))))
 
 ;; Test for a valid operand for a call instruction.
+;; Allow register operands in word mode only.
 (define_predicate "call_insn_operand"
   (ior (match_operand 0 "constant_call_address_operand")
-       (match_operand 0 "call_register_no_elim_operand")
+       (match_test "call_register_no_elim_operand
+                    (op, mode == VOIDmode ? mode : word_mode)")
        (and (not (match_test "TARGET_X32"))
             (match_operand 0 "memory_operand"))))
 
 ;; Similarly, but for tail calls, in which we cannot allow memory references.
 (define_predicate "sibcall_insn_operand"
   (ior (match_operand 0 "constant_call_address_operand")
-       (match_operand 0 "register_no_elim_operand")))
+       (match_test "register_no_elim_operand
+                    (op, mode == VOIDmode ? mode : word_mode)")))
 
 ;; Match exactly zero.
 (define_predicate "const0_operand"

Re: [committed] PR 52515: Missing GTY markers

2012-03-07 Thread Richard Guenther

On Wed, Mar 7, 2012 at 11:11 AM, Richard Sandiford
rdsandif...@googlemail.com wrote:
 Jakub Jelinek ja...@redhat.com writes:
 On Wed, Mar 07, 2012 at 09:52:51AM +, Richard Sandiford wrote:
 This patch restores x86_64-linux-gnu bootstrap after my patch for 52372.
 Applied as obvious.

 Sorry for the breakage.

 Was it really necessary to move these out of the global rtx array?
 Now you completely unnecessarily have 4 new GTY roots with all the overhead
 it has.  Wouldn't just moving their initialization elsewhere be sufficient?

 It had to be outside the global array, because that's a per-target thing.
 E.g. MIPS16 and non-MIPS16 have separate target_rtl structures, but ought
 to have the same pc_rtx (just as they ought to have the same const0_rtx, etc.)

 I hadn't realised that the overhead of 4 roots was much greater than the
 overhead of one root pointing to a 4-element array.  We could have a new
 4-element array if that's a problem.

Why can't you put the same RTXen in the per-target rtl structures?

 Richard

Re: PATCH: Properly check mode for x86 call/jmp address

2012-03-07 Thread Uros Bizjak

On Wed, Mar 7, 2012 at 11:07 AM, Uros Bizjak ubiz...@gmail.com wrote:

 Ah, I vaguely remember that indirect call/jmp is invalid on X32 for

This should read "... indirect call/jmp FROM MEMORY is invalid on X32 ...".
It looks like I've had too much morning coffee already ;)

Uros.

Re: [committed] PR 52515: Missing GTY markers

2012-03-07 Thread Richard Sandiford

Richard Guenther richard.guent...@gmail.com writes:
 On Wed, Mar 7, 2012 at 11:11 AM, Richard Sandiford
 rdsandif...@googlemail.com wrote:
 Jakub Jelinek ja...@redhat.com writes:
 On Wed, Mar 07, 2012 at 09:52:51AM +, Richard Sandiford wrote:
 This patch restores x86_64-linux-gnu bootstrap after my patch for 52372.
 Applied as obvious.

 Sorry for the breakage.

 Was it really necessary to move these out of the global rtx array?
 Now you completely unnecessarily have 4 new GTY roots with all the overhead
 it has.  Wouldn't just moving their initialization elsewhere be sufficient?

 It had to be outside the global array, because that's a per-target thing.
 E.g. MIPS16 and non-MIPS16 have separate target_rtl structures, but ought
 to have the same pc_rtx (just as they ought to have the same const0_rtx, 
 etc.)

 I hadn't realised that the overhead of 4 roots was much greater than the
 overhead of one root pointing to a 4-element array.  We could have a new
 4-element array if that's a problem.

 Why can't you put the same RTXen in the per-target rtl structures?

But where does it stop?  Do we move const_tiny_rtx to the target
structure too?

Logically, the target structure should be for target-specific rtl.
Target-independent rtl like constants, (cc0), (pc), etc., should be
shared between targets.  It seems a shame if GC inefficiencies
force us to use a different scheme.

Richard

Re: [committed] PR 52515: Missing GTY markers

2012-03-07 Thread Jakub Jelinek

On Wed, Mar 07, 2012 at 11:57:18AM +0100, Richard Guenther wrote:
  Logically, the target structure should be for target-specific rtl.
  Target-independent rtl like constants, (cc0), (pc), etc., should be
  shared between targets.  It seems a shame if GC inefficiencies
  force us to use a different scheme.
 
 Ok, if we know for certain this rtl will be never target specific (consider
 a gcc for multiple targets, sparc and x86_64 for example), then we
 should have a single structure that contains all those global rtxen.

Or perhaps just optimize, say by putting such global rtxs (or tree and other
single entry GC roots of the exactly same properties) into named sections
(when supported) and letting gengtype create just one root covering the
whole section.

Jakub

[i386, patch, RFC] HLE support in GCC

2012-03-07 Thread Kirill Yukhin

Hello guys,
I am attaching initial patch which enables TSX's HLE [1] prefixes in
GCC. Since we have no official intrinsics declarations, I want to hear
your comments about the patch

Note, there is no option '-mhle' and no tests (I'll do that after)

[1] - 
http://software.intel.com/en-us/blogs/2012/02/07/transactional-synchronization-in-haswell/

Thanks, K


hle-rfc.gcc.patch
Description: Binary data

Re: [i386, patch, RFC] HLE support in GCC

2012-03-07 Thread Jakub Jelinek

On Wed, Mar 07, 2012 at 03:05:58PM +0400, Kirill Yukhin wrote:
 Hello guys,
 I am attaching initial patch which enables TSX's HLE [1] prefixes in
 GCC. Since we have no official intrinsics declarations, I want to hear
 your comments about the patch

I think this is a wrong approach.  Instead we should use for this a flag
on the __atomic_* builtins (some higher bit of the memmodel) that would
say we want to emit an XACQUIRE or XRELEASE insn prefix.
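
To make the suggestion concrete, here is a minimal sketch of how a hint could ride in the upper bits of the memmodel argument.  The HYP_* names and the bit positions are assumptions made up for illustration, not anything the patch or GCC defines; only the __ATOMIC_* constants are real.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits, chosen only for illustration; they sit above
   the range used by the standard __ATOMIC_* memory models.  */
#define HYP_HLE_ACQUIRE (1 << 16)
#define HYP_HLE_RELEASE (1 << 17)
#define HYP_MODEL_MASK  0xffff

/* Combine a plain memory model with a hint bit.  */
static int
hyp_make_model (int base, int hint)
{
  assert ((base & ~HYP_MODEL_MASK) == 0);
  return base | hint;
}

/* Return the plain memory model with the hint bits stripped.  */
static int
hyp_base_model (int model)
{
  return model & HYP_MODEL_MASK;
}

/* True when the caller asked for an XACQUIRE prefix.  */
static bool
hyp_wants_xacquire (int model)
{
  return (model & HYP_HLE_ACQUIRE) != 0;
}

/* True when the caller asked for an XRELEASE prefix.  */
static bool
hyp_wants_xrelease (int model)
{
  return (model & HYP_HLE_RELEASE) != 0;
}
```

With such an encoding the expander would look at the base model as before and only consult the extra bits when choosing whether to emit a prefix, so existing callers are unaffected.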

 Note, there is no option '-mhle' and no tests (I'll do that after)
 
 [1] - 
 http://software.intel.com/en-us/blogs/2012/02/07/transactional-synchronization-in-haswell/

Jakub

[PATCH] Fix PR 52203

2012-03-07 Thread Andrey Belevantsev


Hello,

This PR is again about insns that are recog'ed as >=0 but do not change the 
processor state.  As explained in 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52203#c8, I've tried 
experimenting with an attribute marking those empty insns in MD files and 
asserting that all other insns do have reservations.  As this doesn't seem 
to be interesting, I gave up on the idea, and the patch below makes 
sel-sched do exactly what the Haifa scheduler does, i.e. not count such 
insns against issue_rate when modelling clock cycles.
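
The "does not change the processor state" test can be sketched outside the scheduler as follows.  This is a toy model only: struct toy_state and toy_transition are invented stand-ins for the DFA state and state_transition, and the copy-then-memcmp step is the only part that mirrors the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy stand-in for the scheduler's DFA state: one counter per unit.  */
struct toy_state
{
  unsigned char units[4];
};

/* Toy transition: an insn with UNIT < 0 reserves nothing; otherwise it
   occupies one slot of the given unit.  Returns 0 for "issued".  */
static int
toy_transition (struct toy_state *s, int unit)
{
  if (unit >= 0)
    s->units[unit]++;
  return 0;
}

/* Run the transition on a scratch copy and report through PEMPTY
   whether the state stayed byte-for-byte the same.  */
static int
toy_estimate_cost (const struct toy_state *state, int unit, bool *pempty)
{
  struct toy_state temp;
  int cost;

  assert (unit < 4);
  memcpy (&temp, state, sizeof temp);
  cost = toy_transition (&temp, unit);
  if (pempty)
    *pempty = (memcmp (&temp, state, sizeof temp) == 0);
  return cost;
}
```

A caller that models issue slots would then bump its issued-insn counter only when the reported flag is false.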


Tested on ia64 and x86-64, OK for trunk?  No testcase again because of the 
amount of flags needed.


Andrey

2012-03-07  Andrey Belevantsev  a...@ispras.ru

PR rtl-optimization/52203
* sel-sched.c (estimate_insn_cost): New parameter pempty.  Adjust
all callers to pass NULL except ...
(reset_sched_cycles_in_current_ebb): ... here, save the value
in new variable 'empty'.  Increase issue_rate only for
non-empty insns.

diff --git a/gcc/sel-sched.c b/gcc/sel-sched.c
index 2af01ae..2829f60 100644
--- a/gcc/sel-sched.c
+++ b/gcc/sel-sched.c
@@ -4265,9 +4265,10 @@ invoke_aftermath_hooks (fence_t fence, rtx 
best_insn, int issue_more)

   return issue_more;
 }

-/* Estimate the cost of issuing INSN on DFA state STATE.  */
+/* Estimate the cost of issuing INSN on DFA state STATE.  Write to PEMPTY
+   true when INSN does not change the processor state.  */
 static int
-estimate_insn_cost (rtx insn, state_t state)
+estimate_insn_cost (rtx insn, state_t state, bool *pempty)
 {
   static state_t temp = NULL;
   int cost;
@@ -4277,6 +4278,8 @@ estimate_insn_cost (rtx insn, state_t state)

   memcpy (temp, state, dfa_state_size);
   cost = state_transition (temp, insn);
+  if (pempty)
+*pempty = (memcmp (temp, state, dfa_state_size) == 0);

   if (cost < 0)
 return 0;
@@ -4307,7 +4310,7 @@ get_expr_cost (expr_t expr, fence_t fence)
return 0;
 }
   else
-return estimate_insn_cost (insn, FENCE_STATE (fence));
+return estimate_insn_cost (insn, FENCE_STATE (fence), NULL);
 }

 /* Find the best insn for scheduling, either via max_issue or just take
@@ -7020,7 +7023,7 @@ reset_sched_cycles_in_current_ebb (void)
 {
   int cost, haifa_cost;
   int sort_p;
-  bool asm_p, real_insn, after_stall, all_issued;
+  bool asm_p, real_insn, after_stall, all_issued, empty;
   int clock;

   if (!INSN_P (insn))
@@ -7047,7 +7050,7 @@ reset_sched_cycles_in_current_ebb (void)
haifa_cost = 0;
}
   else
-haifa_cost = estimate_insn_cost (insn, curr_state);
+haifa_cost = estimate_insn_cost (insn, curr_state, &empty);

   /* Stall for whatever cycles we've stalled before.  */
   after_stall = 0;
@@ -7081,7 +7084,7 @@ reset_sched_cycles_in_current_ebb (void)
   if (!after_stall
&& real_insn
&& haifa_cost > 0
-   && estimate_insn_cost (insn, curr_state) == 0)
+   && estimate_insn_cost (insn, curr_state, NULL) == 0)
 break;

   /* When the data dependency stall is longer than the DFA stall,
@@ -7093,7 +7096,7 @@ reset_sched_cycles_in_current_ebb (void)
   if ((after_stall || all_issued)
&& real_insn
&& haifa_cost == 0)
-haifa_cost = estimate_insn_cost (insn, curr_state);
+haifa_cost = estimate_insn_cost (insn, curr_state, NULL);
 }

  haifa_clock += i;
@@ -7125,7 +7128,8 @@ reset_sched_cycles_in_current_ebb (void)
   if (real_insn)
{
  cost = state_transition (curr_state, insn);
- issued_insns++;
+ if (!empty)
+   issued_insns++;

   if (sched_verbose >= 2)
{

Re: [PATCH] Fix PR 52203

2012-03-07 Thread Alexander Monakov



On Wed, 7 Mar 2012, Andrey Belevantsev wrote:

 Hello,
 
 This PR is again about insns that are recog'ed as >=0 but do not change the
 processor state.  As explained in
 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52203#c8, I've tried experimenting
 with an attribute marking those empty insns in MD files and asserting that
 all other insns do have reservations.  As this doesn't seem to be interesting,
 I give up with the idea, and the below patch makes sel-sched do exactly what
 the Haifa scheduler does, i.e. do not count such insns against issue_rate when
 modelling clock cycles.
 
 Tested on ia64 and x86-64, OK for trunk?  No testcase again because of the
 amount of flags needed.
 
 Andrey
 
 2012-03-07  Andrey Belevantsev  a...@ispras.ru
 
   PR rtl-optimization/52203
   * sel-sched.c (estimate_insn_cost): New parameter pempty.  Adjust
   all callers to pass NULL except ...
   (reset_sched_cycles_in_current_ebb): ... here, save the value
   in new variable 'empty'.  Increase issue_rate only for
   non-empty insns.

This is OK.

Thanks.

-- 
Alexander

Re: [patch] Fix non-standard Ada bootstrap failure on IA-64

2012-03-07 Thread Jakub Jelinek

On Wed, Mar 07, 2012 at 11:01:36AM +0100, Richard Guenther wrote:
 Hmm, but then this complicates and slows down the generic alias machinery.
 Of course IMHO the RTL alias machinery should be conservative with respect
 to what the RTL IL allows - so the question is are non-legitimate addresses
 valid in regular instructions at any point?

I'd say that alias.c when processing non-debug insn content should just
ignore any cselib locs which have DEBUG_INSN_P setting_insn.
Then both -g and -g0 will see the same things.

Jakub

[PATCH] Do not use lang_hooks.types.type_for_size in signed_or_unsigned_type_for

2012-03-07 Thread Richard Guenther


This makes us use build_nonstandard_integer_type in 
signed_or_unsigned_type_for and adjusts the function to return
NULL_TREE for nonsensical inputs (only allowing pointer and
integral types).  This way we make sure that the precision of
the result type matches that of the input - something which
fold-const.c definitely expects for example (it uses type_for_size
itself if it doesn't).

In the long run type_for_size should go - or it should be a
wrapper that calls type_for_mode (int_mode_for_size ()) instead.
I'm working towards that.
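
The contract described above can be sketched with a toy type representation.  Everything named toy_* is invented for illustration; the real function works on trees and its INTEGRAL_TYPE_P test also covers enum and boolean types, which this sketch leaves out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_kind { TOY_INTEGER, TOY_POINTER, TOY_REAL };

struct toy_type
{
  enum toy_kind kind;
  unsigned precision;
  bool unsignedp;
};

/* Fill *OUT with an integer type of TYPE's precision and the requested
   signedness and return it; return TYPE itself when it already is such
   an integer type, or NULL when TYPE is neither integer nor pointer.  */
static const struct toy_type *
toy_signed_or_unsigned_type_for (bool unsignedp, const struct toy_type *type,
                                 struct toy_type *out)
{
  assert (type != NULL && out != NULL);
  if (type->kind == TOY_INTEGER && type->unsignedp == unsignedp)
    return type;
  if (type->kind != TOY_INTEGER && type->kind != TOY_POINTER)
    return NULL;
  out->kind = TOY_INTEGER;
  out->precision = type->precision;
  out->unsignedp = unsignedp;
  return out;
}
```

The point of the sketch is that the result precision is always copied from the input, so a caller never gets back a type of a different width.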

Bootstrapped and tested on x86_64-unknown-linux-gnu for all languages.

I get

FAIL: gcc.dg/tree-ssa/pr31261.c scan-tree-dump-times original "return \\(char\\) -\\(unsigned char\\) c & 31;" 1
FAIL: gcc.dg/tree-ssa/pr31261.c scan-tree-dump-times original "return \\(int\\) \\(12 - \\(unsigned int\\) d\\) & 7;" 1

because we dump the unsigned type variant differently now.  What do
people think - adjust the testcase?  Adjust how we pretty-print
these non-standard integer types?

Thanks,
Richard.

2012-03-07  Richard Guenther  rguent...@suse.de

* tree.c (signed_or_unsigned_type_for): Use
build_nonstandard_integer_type.
(signed_type_for): Adjust documentation.
(unsigned_type_for): Likewise.

Index: gcc/tree.c
===
*** gcc/tree.c  (revision 185029)
--- gcc/tree.c  (working copy)
*** widest_int_cst_value (const_tree x)
*** 10197,10228 
return val;
  }
  
! /* If TYPE is an integral type, return an equivalent type which is
! unsigned iff UNSIGNEDP is true.  If TYPE is not an integral type,
! return TYPE itself.  */
  
  tree
  signed_or_unsigned_type_for (int unsignedp, tree type)
  {
!   tree t = type;
!   if (POINTER_TYPE_P (type))
! {
!   /* If the pointer points to the normal address space, use the
!size_type_node.  Otherwise use an appropriate size for the pointer
!based on the named address space it points to.  */
!   if (!TYPE_ADDR_SPACE (TREE_TYPE (t)))
!   t = size_type_node;
!   else
!   return lang_hooks.types.type_for_size (TYPE_PRECISION (t), unsignedp);
! }
  
!   if (!INTEGRAL_TYPE_P (t) || TYPE_UNSIGNED (t) == unsignedp)
! return t;
  
!   return lang_hooks.types.type_for_size (TYPE_PRECISION (t), unsignedp);
  }
  
! /* Returns unsigned variant of TYPE.  */
  
  tree
  unsigned_type_for (tree type)
--- 10197,10222 
return val;
  }
  
! /* If TYPE is an integral or pointer type, return an integer type with
!the same precision which is unsigned iff UNSIGNEDP is true, or itself
!if TYPE is already an integer type of signedness UNSIGNEDP.  */
  
  tree
  signed_or_unsigned_type_for (int unsignedp, tree type)
  {
!   if (TREE_CODE (type) == INTEGER_TYPE && TYPE_UNSIGNED (type) == unsignedp)
! return type;
  
!   if (!INTEGRAL_TYPE_P (type)
!   && !POINTER_TYPE_P (type))
! return NULL_TREE;
  
!   return build_nonstandard_integer_type (TYPE_PRECISION (type), unsignedp);
  }
  
! /* If TYPE is an integral or pointer type, return an integer type with
!the same precision which is unsigned, or itself if TYPE is already an
!unsigned integer type.  */
  
  tree
  unsigned_type_for (tree type)
*** unsigned_type_for (tree type)
*** 10230,10236 
return signed_or_unsigned_type_for (1, type);
  }
  
! /* Returns signed variant of TYPE.  */
  
  tree
  signed_type_for (tree type)
--- 10224,10232 
return signed_or_unsigned_type_for (1, type);
  }
  
! /* If TYPE is an integral or pointer type, return an integer type with
!the same precision which is signed, or itself if TYPE is already a
!signed integer type.  */
  
  tree
  signed_type_for (tree type)

Re: [patch] Fix non-standard Ada bootstrap failure on IA-64

2012-03-07 Thread Eric Botcazou

 Hmm, but isn't the bug that we feed debug-insn mems to memrefs_conflict_p?

We don't.  The addresses come from regular insns, but cselib is able to 
equivalence one of them with an address that is already in its hashtable 
because of a debug insn (see cselib.c:promote_debug_loc).

 Or that we have non-legitimate address expressions in them?

My understanding is that this is by design.

-- 
Eric Botcazou

[Patch,AVR]: Fix PR52496: Add memory barriers to built-ins

2012-03-07 Thread Georg-Johann Lay

This patch adds memory barriers to

__builtin_avr_nop
__builtin_avr_sei
__builtin_avr_cli
__builtin_avr_wdr
__builtin_avr_sleep
__builtin_avr_delay_cycles

so that their code cannot be dragged over memory accesses.
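
For readers less familiar with the problem, the user-level analogue of what the patch does inside the machine description is the usual empty asm with a "memory" clobber.  The sketch below is not part of the patch; the variable and function names are made up, and it only illustrates the compile-time ordering effect.

```c
/* GCC-style compiler barrier: emits no instruction, but the "memory"
   clobber tells the compiler that memory may be read or written here.  */
#define COMPILER_BARRIER() __asm__ __volatile__ ("" ::: "memory")

static int shared_flag;
static int shared_data;

/* Publish DATA, then raise the flag; the barrier keeps the compiler
   from sinking the data store below the flag store.  */
static void
publish (int data)
{
  shared_data = data;
  COMPILER_BARRIER ();
  shared_flag = 1;
}
```

Without the barrier the compiler is free to reorder the two plain stores; with it, accesses on either side stay on their side, which is the property the built-ins are meant to get.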

Ok for trunk?

PR target/52496
* config/avr/avr.c (avr_mem_clobber): New static function.
(avr_expand_delay_cycles): Add memory clobber operand to
delay_cycles_1, delay_cycles_2, delay_cycles_3, delay_cycles_4.
* config/avr/avr.md (unspec): Add UNSPEC_MEMORY_BARRIER.
(enable_interrupt, disable_interrupt): New expander.
(nopv, sleep, wdr): New expanders.
(delay_cycles_1): Add memory clobber.
(delay_cycles_2): Add memory clobber.
(delay_cycles_3): Add memory clobber.
(delay_cycles_4): Add memory clobber.
(cli_sei): New insn from former enable_interrupt,
disable_interrupt with memory clobber.
(*wdt): New insn from former wdt with memory clobber.
(*sleep): New insn from former sleep with memory clobber.
Index: config/avr/avr.md
===
--- config/avr/avr.md	(revision 185031)
+++ config/avr/avr.md	(working copy)
@@ -69,6 +69,7 @@ (define_c_enum unspec
UNSPEC_COPYSIGN
UNSPEC_IDENTITY
UNSPEC_INSERT_BITS
+   UNSPEC_MEMORY_BARRIER
])
 
 (define_c_enum unspecv
@@ -5236,18 +5237,36 @@ (define_insn popqi
(set_attr length 1)])
 
 ;; Enable Interrupts
-(define_insn enable_interrupt
-  [(unspec_volatile [(const_int 1)] UNSPECV_ENABLE_IRQS)]
+(define_expand enable_interrupt
+  [(clobber (const_int 0))]
   
-  sei
-  [(set_attr length 1)
-   (set_attr cc none)])
+  {
+rtx mem = gen_rtx_MEM (BLKmode, gen_rtx_SCRATCH (Pmode));
+MEM_VOLATILE_P (mem) = 1;
+emit_insn (gen_cli_sei (const1_rtx, mem));
+DONE;
+  })
 
 ;; Disable Interrupts
-(define_insn disable_interrupt
-  [(unspec_volatile [(const_int 0)] UNSPECV_ENABLE_IRQS)]
+(define_expand disable_interrupt
+  [(clobber (const_int 0))]
   
-  cli
+  {
+rtx mem = gen_rtx_MEM (BLKmode, gen_rtx_SCRATCH (Pmode));
+MEM_VOLATILE_P (mem) = 1;
+emit_insn (gen_cli_sei (const0_rtx, mem));
+DONE;
+  })
+
+(define_insn cli_sei
+  [(unspec_volatile [(match_operand:QI 0 const_int_operand L,P)]
+UNSPECV_ENABLE_IRQS)
+   (set (match_operand:BLK 1  )
+	(unspec:BLK [(match_dup 1)] UNSPEC_MEMORY_BARRIER))]
+  
+  @
+	cli
+	sei
   [(set_attr length 1)
(set_attr cc none)])
 
@@ -5354,10 +5373,12 @@ (define_insn delay_cycles_1
   [(unspec_volatile [(match_operand:QI 0 const_int_operand n)
  (const_int 1)]
 UNSPECV_DELAY_CYCLES)
-   (clobber (match_scratch:QI 1 =d))]
+   (set (match_operand:BLK 1  )
+	(unspec:BLK [(match_dup 1)] UNSPEC_MEMORY_BARRIER))
+   (clobber (match_scratch:QI 2 =d))]
   
-  ldi %1,lo8(%0)
-	1: dec %1
+  ldi %2,lo8(%0)
+	1: dec %2
 	brne 1b
   [(set_attr length 3)
(set_attr cc clobber)])
@@ -5366,11 +5387,13 @@ (define_insn delay_cycles_2
   [(unspec_volatile [(match_operand:HI 0 const_int_operand n)
  (const_int 2)]
 UNSPECV_DELAY_CYCLES)
-   (clobber (match_scratch:HI 1 =w))]
-  
-  ldi %A1,lo8(%0)
-	ldi %B1,hi8(%0)
-	1: sbiw %A1,1
+   (set (match_operand:BLK 1  )
+	(unspec:BLK [(match_dup 1)] UNSPEC_MEMORY_BARRIER))
+   (clobber (match_scratch:HI 2 =w))]
+  
+  ldi %A2,lo8(%0)
+	ldi %B2,hi8(%0)
+	1: sbiw %A2,1
 	brne 1b
   [(set_attr length 4)
(set_attr cc clobber)])
@@ -5379,16 +5402,18 @@ (define_insn delay_cycles_3
   [(unspec_volatile [(match_operand:SI 0 const_int_operand n)
  (const_int 3)]
 UNSPECV_DELAY_CYCLES)
-   (clobber (match_scratch:QI 1 =d))
+   (set (match_operand:BLK 1  )
+	(unspec:BLK [(match_dup 1)] UNSPEC_MEMORY_BARRIER))
(clobber (match_scratch:QI 2 =d))
-   (clobber (match_scratch:QI 3 =d))]
+   (clobber (match_scratch:QI 3 =d))
+   (clobber (match_scratch:QI 4 =d))]
   
-  ldi %1,lo8(%0)
-	ldi %2,hi8(%0)
-	ldi %3,hlo8(%0)
-	1: subi %1,1
-	sbci %2,0
+  ldi %2,lo8(%0)
+	ldi %3,hi8(%0)
+	ldi %4,hlo8(%0)
+	1: subi %2,1
 	sbci %3,0
+	sbci %4,0
 	brne 1b
   [(set_attr length 7)
(set_attr cc clobber)])
@@ -5397,19 +5422,21 @@ (define_insn delay_cycles_4
   [(unspec_volatile [(match_operand:SI 0 const_int_operand n)
  (const_int 4)]
 UNSPECV_DELAY_CYCLES)
-   (clobber (match_scratch:QI 1 =d))
+   (set (match_operand:BLK 1  )
+	(unspec:BLK [(match_dup 1)] UNSPEC_MEMORY_BARRIER))
(clobber (match_scratch:QI 2 =d))
(clobber (match_scratch:QI 3 =d))
-   (clobber (match_scratch:QI 4 =d))]
+   (clobber (match_scratch:QI 4 =d))
+   (clobber (match_scratch:QI 5 =d))]
   
-  ldi %1,lo8(%0)
-	ldi %2,hi8(%0)
-	ldi %3,hlo8(%0)
-	ldi %4,hhi8(%0)
-	1: subi %1,1
-	sbci %2,0
+  ldi %2,lo8(%0)
+	ldi %3,hi8(%0)
+	ldi %4,hlo8(%0)
+	ldi %5,hhi8(%0)
+	1: subi %2,1
 	sbci %3,0
 	sbci %4,0
+	sbci %5,0
 	brne 1b
   [(set_attr length 9)

Re: [patch] Fix non-standard Ada bootstrap failure on IA-64

2012-03-07 Thread Eric Botcazou

 CCing Alex.  I think we feed debug insn mems in the scheduler to be able to
 find out what debug insns need to be invalidated and what can be kept.
 And any address expressions are legitimate for debug insns, why should we
 be limited by what the ISA allows?  All we are limited is if we can express
 those expressions in DWARF or not.
 How can this be reproduced with a cross?

Run -fcompare-debug on sem_ch2.o with -O2 -gnatpgn -g (on the 4.7 branch).

-- 
Eric Botcazou

[PATCH] Move strip_float_extensions to tree.c

2012-03-07 Thread Richard Guenther


Prototyped in tree.h, called from the middle-end but implemented
in convert.c.  That looks wrong.

Now, convert.c is used from all frontends to implement convert ()
(that looks backwards - the language convert should be a langhook,
called from convert implemented in convert.c).  But well, I aint
not touching this beast ;)

At least with this patch convert.c functions are all convert_to_*,
only called directly by frontends (while they should be all
static to convert.c, called from convert () only(?)).

Queued for bootstrap / regtest.

Richard.

2012-03-07  Richard Guenther  rguent...@suse.de

* convert.c (strip_float_extensions): Move ...
* tree.c (strip_float_extensions): ... here.

Index: gcc/convert.c
===
*** gcc/convert.c   (revision 185029)
--- gcc/convert.c   (working copy)
*** convert_to_pointer (tree type, tree expr
*** 90,141 
  }
  }
  
- /* Avoid any floating point extensions from EXP.  */
- tree
- strip_float_extensions (tree exp)
- {
-   tree sub, expt, subt;
- 
-   /*  For floating point constant look up the narrowest type that can hold
-   it properly and handle it like (type)(narrowest_type)constant.
-   This way we can optimize for instance a=a*2.0 where a is float
-   but 2.0 is double constant.  */
-   if (TREE_CODE (exp) == REAL_CST && !DECIMAL_FLOAT_TYPE_P (TREE_TYPE (exp)))
- {
-   REAL_VALUE_TYPE orig;
-   tree type = NULL;
- 
-   orig = TREE_REAL_CST (exp);
-   if (TYPE_PRECISION (TREE_TYPE (exp)) > TYPE_PRECISION (float_type_node)
-  && exact_real_truncate (TYPE_MODE (float_type_node), &orig))
-   type = float_type_node;
-   else if (TYPE_PRECISION (TREE_TYPE (exp))
-   > TYPE_PRECISION (double_type_node)
-   && exact_real_truncate (TYPE_MODE (double_type_node), &orig))
-   type = double_type_node;
-   if (type)
-   return build_real (type, real_value_truncate (TYPE_MODE (type), orig));
- }
- 
-   if (!CONVERT_EXPR_P (exp))
- return exp;
- 
-   sub = TREE_OPERAND (exp, 0);
-   subt = TREE_TYPE (sub);
-   expt = TREE_TYPE (exp);
- 
-   if (!FLOAT_TYPE_P (subt))
- return exp;
- 
-   if (DECIMAL_FLOAT_TYPE_P (expt) != DECIMAL_FLOAT_TYPE_P (subt))
- return exp;
- 
-   if (TYPE_PRECISION (subt) > TYPE_PRECISION (expt))
- return exp;
- 
-   return strip_float_extensions (sub);
- }
- 
  
  /* Convert EXPR to some floating-point type TYPE.
  
--- 90,95 
Index: gcc/tree.c
===
*** gcc/tree.c  (revision 185029)
--- gcc/tree.c  (working copy)
*** tree_strip_sign_nop_conversions (tree ex
*** 11213,11218 
--- 11209,11260 
return exp;
  }
  
+ /* Avoid any floating point extensions from EXP.  */
+ tree
+ strip_float_extensions (tree exp)
+ {
+   tree sub, expt, subt;
+ 
+   /*  For floating point constant look up the narrowest type that can hold
+   it properly and handle it like (type)(narrowest_type)constant.
+   This way we can optimize for instance a=a*2.0 where a is float
+   but 2.0 is double constant.  */
+   if (TREE_CODE (exp) == REAL_CST && !DECIMAL_FLOAT_TYPE_P (TREE_TYPE (exp)))
+ {
+   REAL_VALUE_TYPE orig;
+   tree type = NULL;
+ 
+   orig = TREE_REAL_CST (exp);
+   if (TYPE_PRECISION (TREE_TYPE (exp)) > TYPE_PRECISION (float_type_node)
+  && exact_real_truncate (TYPE_MODE (float_type_node), &orig))
+   type = float_type_node;
+   else if (TYPE_PRECISION (TREE_TYPE (exp))
+   > TYPE_PRECISION (double_type_node)
+   && exact_real_truncate (TYPE_MODE (double_type_node), &orig))
+   type = double_type_node;
+   if (type)
+   return build_real (type, real_value_truncate (TYPE_MODE (type), orig));
+ }
+ 
+   if (!CONVERT_EXPR_P (exp))
+ return exp;
+ 
+   sub = TREE_OPERAND (exp, 0);
+   subt = TREE_TYPE (sub);
+   expt = TREE_TYPE (exp);
+ 
+   if (!FLOAT_TYPE_P (subt))
+ return exp;
+ 
+   if (DECIMAL_FLOAT_TYPE_P (expt) != DECIMAL_FLOAT_TYPE_P (subt))
+ return exp;
+ 
+   if (TYPE_PRECISION (subt) > TYPE_PRECISION (expt))
+ return exp;
+ 
+   return strip_float_extensions (sub);
+ }
+ 
  /* Strip out all handled components that produce invariant
 offsets.  */
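
The constant-narrowing step in strip_float_extensions (the a=a*2.0 case from its comment) boils down to asking whether a wider constant survives truncation to the narrower type.  A minimal sketch in plain C, assuming IEEE float/double and ignoring values outside float's range; fits_float_exactly is a made-up name, not a GCC function:

```c
#include <stdbool.h>

/* True when the double constant D survives a round trip through float
   unchanged, i.e. float is wide enough to hold it exactly.  */
static bool
fits_float_exactly (double d)
{
  volatile float f = (float) d;   /* volatile forces the narrowing */
  return (double) f == d;
}
```

Here 2.0 passes the check, so the multiplication can be done in float, while a constant such as 0.1 does not and has to stay double.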

[PATCH] Remove some callers of lang_hooks.types.type_for_size

2012-03-07 Thread Richard Guenther


The type_for_size langhook should go (it does not handle non-mode
precision well, at least it handles it unexpectedly to most callers).
Instead callers that want to call a langhook should use type_for_mode.
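
To illustrate what "non-mode precision" means: a request for, say, 24 bits has no matching integer mode, so a caller going through modes first has to round up to the next one.  The sketch below is a toy with an assumed list of mode widths; the toy_* names are invented, and unlike this sketch the real smallest_mode_for_size does not return a failure value.

```c
#include <stddef.h>

/* Toy list of integer mode widths in bits, smallest first.  */
static const unsigned toy_mode_bits[] = { 8, 16, 32, 64 };

/* Return the width of the narrowest toy mode holding SIZE bits,
   or 0 when no mode is wide enough.  */
static unsigned
toy_smallest_mode_for_size (unsigned size)
{
  size_t i;

  for (i = 0; i < sizeof toy_mode_bits / sizeof toy_mode_bits[0]; i++)
    if (toy_mode_bits[i] >= size)
      return toy_mode_bits[i];
  return 0;
}
```

Once the size has been rounded to a mode this way, asking the langhook for the type of that mode is unambiguous.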

The following patch makes some direct uses of the langhook from the
middle-end use something more suitable, for example [un]signed_type_for
for getting an integer type for a pointer type.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Jakub, are the OMP bits ok?

Thanks,
Richard.

2012-03-07  Richard Guenther  rguent...@suse.de

* omp-low.c (extract_omp_for_data): Use signed_type_for.
(expand_omp_for_generic): Likewise.
(expand_omp_for_static_nochunk): Likewise.
(expand_omp_for_static_chunk): Likewise.
* tree-vect-stmts.c (vect_gen_perm_mask): Use type_for_mode.
* tree-vect-slp.c (vect_transform_slp_perm_load): Likewise.
* tree-vect-loop-manip.c (vect_gen_niters_for_prolog_loop):
Use unsigned_type_for.
(vect_create_cond_for_align_checks): Use signed_type_for.

Index: gcc/omp-low.c
===
*** gcc/omp-low.c   (revision 185029)
--- gcc/omp-low.c   (working copy)
*** extract_omp_for_data (gimple for_stmt, s
*** 407,414 
  tree itype = TREE_TYPE (loop->v);
  
  if (POINTER_TYPE_P (itype))
!   itype
! = lang_hooks.types.type_for_size (TYPE_PRECISION (itype), 0);
  t = build_int_cst (itype, (loop->cond_code == LT_EXPR ? -1 : 1));
  t = fold_build2_loc (loc,
   PLUS_EXPR, itype,
--- 407,413 
  tree itype = TREE_TYPE (loop->v);
  
  if (POINTER_TYPE_P (itype))
!   itype = signed_type_for (itype);
  t = build_int_cst (itype, (loop->cond_code == LT_EXPR ? -1 : 1));
  t = fold_build2_loc (loc,
   PLUS_EXPR, itype,
*** expand_omp_for_generic (struct omp_regio
*** 3772,3778 
  tree itype = TREE_TYPE (fd->loops[i].v);
  
  if (POINTER_TYPE_P (itype))
!   itype = lang_hooks.types.type_for_size (TYPE_PRECISION (itype), 0);
  t = build_int_cst (itype, (fd->loops[i].cond_code == LT_EXPR
 ? -1 : 1));
  t = fold_build2 (PLUS_EXPR, itype,
--- 3771,3777 
  tree itype = TREE_TYPE (fd->loops[i].v);
  
  if (POINTER_TYPE_P (itype))
!   itype = signed_type_for (itype);
  t = build_int_cst (itype, (fd->loops[i].cond_code == LT_EXPR
 ? -1 : 1));
  t = fold_build2 (PLUS_EXPR, itype,
*** expand_omp_for_generic (struct omp_regio
*** 3836,3843 
   && TYPE_PRECISION (type) != TYPE_PRECISION (fd->iter_type))
{
  /* Avoid casting pointers to integer of a different size.  */
! tree itype
!   = lang_hooks.types.type_for_size (TYPE_PRECISION (type), 0);
  t1 = fold_convert (fd->iter_type, fold_convert (itype, fd->loop.n2));
  t0 = fold_convert (fd->iter_type, fold_convert (itype, fd->loop.n1));
}
--- 3835,3841 
   && TYPE_PRECISION (type) != TYPE_PRECISION (fd->iter_type))
{
  /* Avoid casting pointers to integer of a different size.  */
! tree itype = signed_type_for (type);
  t1 = fold_convert (fd->iter_type, fold_convert (itype, fd->loop.n2));
  t0 = fold_convert (fd->iter_type, fold_convert (itype, fd->loop.n1));
}
*** expand_omp_for_generic (struct omp_regio
*** 3904,3911 
if (bias)
  t = fold_build2 (MINUS_EXPR, fd->iter_type, t, bias);
if (POINTER_TYPE_P (type))
! t = fold_convert (lang_hooks.types.type_for_size (TYPE_PRECISION (type),
! 0), t);
t = fold_convert (type, t);
t = force_gimple_operand_gsi (gsi, t, false, NULL_TREE,
false, GSI_CONTINUE_LINKING);
--- 3902,3908 
if (bias)
  t = fold_build2 (MINUS_EXPR, fd->iter_type, t, bias);
if (POINTER_TYPE_P (type))
! t = fold_convert (signed_type_for (type), t);
t = fold_convert (type, t);
t = force_gimple_operand_gsi (gsi, t, false, NULL_TREE,
false, GSI_CONTINUE_LINKING);
*** expand_omp_for_generic (struct omp_regio
*** 3916,3923 
if (bias)
  t = fold_build2 (MINUS_EXPR, fd->iter_type, t, bias);
if (POINTER_TYPE_P (type))
! t = fold_convert (lang_hooks.types.type_for_size (TYPE_PRECISION (type),
! 0), t);
t = fold_convert (type, t);
iend = force_gimple_operand_gsi (gsi, t, true, NULL_TREE,
   false, GSI_CONTINUE_LINKING);
--- 3913,3919 
if (bias)
  t = fold_build2 (MINUS_EXPR, fd->iter_type, t, bias);
if (POINTER_TYPE_P

Re: [PATCH] Remove some callers of lang_hooks.types.type_for_size

2012-03-07 Thread Jakub Jelinek

On Wed, Mar 07, 2012 at 01:53:12PM +0100, Richard Guenther wrote:
 Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
 
 Jakub, are the OMP bits ok?

Yeah.

Jakub

[PATCH] Use c_common_type_for_size from the C frontend

2012-03-07 Thread Richard Guenther


I'll apply this as obvious after including it in a bootstrap / regtest.

Richard.

2012-03-07  Richard Guenther  rguent...@suse.de

* c-typeck.c (pointer_diff): Use c_common_type_for_size.

Index: gcc/c-typeck.c
===
--- gcc/c-typeck.c  (revision 185029)
+++ gcc/c-typeck.c  (working copy)
@@ -3413,8 +3410,7 @@ pointer_diff (location_t loc, tree op0,
  be the same as the result type (ptrdiff_t), but may need to be a wider
  type if pointers for the address space are wider than ptrdiff_t.  */
   if (TYPE_PRECISION (restype) < TYPE_PRECISION (TREE_TYPE (op0)))
-inttype = lang_hooks.types.type_for_size
-   (TYPE_PRECISION (TREE_TYPE (op0)), 0);
+inttype = c_common_type_for_size (TYPE_PRECISION (TREE_TYPE (op0)), 0);
   else
 inttype = restype;

[PATCH] More type_for_size caller removals

2012-03-07 Thread Richard Guenther


Nearly obvious.  One question would be whether IVOPTs wants to
use SImode and word_mode instead of int/long.  And the dojump.c
code is dead - TYPE_PRECISION (type) < TYPE_PRECISION (TREE_TYPE (exp))
cannot ever hold here, but I'm double-checking with an assert
(consider the patch adjusted before commit to remove the
case completely).

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

2012-03-07  Richard Guenther  rguent...@suse.de

* coverage.c (get_gcov_type): Use type_for_mode.
(get_gcov_unsigned_t): Likewise.
* expr.c (store_constructor): Use type_for_mode.
(try_casesi): Likewise.
* tree-ssa-loop-ivopts.c (add_standard_iv_candidates_for_size):
Remove.
(add_standard_iv_candidates): Use standard type trees.
* dojump.c (do_jump): Remove dead code.

Index: gcc/coverage.c
===
*** gcc/coverage.c  (revision 185029)
--- gcc/coverage.c  (working copy)
*** static void coverage_obj_finish (VEC(con
*** 131,137 
  tree
  get_gcov_type (void)
  {
!   return lang_hooks.types.type_for_size (GCOV_TYPE_SIZE, false);
  }
  
  /* Return the type node for gcov_unsigned_t.  */
--- 131,138 
  tree
  get_gcov_type (void)
  {
!   enum machine_mode mode = smallest_mode_for_size (GCOV_TYPE_SIZE, MODE_INT);
!   return lang_hooks.types.type_for_mode (mode, false);
  }
  
  /* Return the type node for gcov_unsigned_t.  */
*** get_gcov_type (void)
*** 139,145 
  static tree
  get_gcov_unsigned_t (void)
  {
!   return lang_hooks.types.type_for_size (32, true);
  }
  
  static hashval_t
--- 140,147 
  static tree
  get_gcov_unsigned_t (void)
  {
!   enum machine_mode mode = smallest_mode_for_size (32, MODE_INT);
!   return lang_hooks.types.type_for_mode (mode, true);
  }
  
  static hashval_t
Index: gcc/expr.c
===
*** gcc/expr.c  (revision 185029)
--- gcc/expr.c  (working copy)
*** store_constructor (tree exp, rtx target,
*** 5893,5900 
  
if (TYPE_PRECISION (type) < BITS_PER_WORD)
  {
!   type = lang_hooks.types.type_for_size
! (BITS_PER_WORD, TYPE_UNSIGNED (type));
value = fold_convert (type, value);
  }
  
--- 5893,5900 
  
if (TYPE_PRECISION (type) < BITS_PER_WORD)
  {
!   type = lang_hooks.types.type_for_mode
! (word_mode, TYPE_UNSIGNED (type));
value = fold_convert (type, value);
  }
  
*** try_casesi (tree index_type, tree index_
*** 10726,10732 
  {
struct expand_operand ops[5];
enum machine_mode index_mode = SImode;
-   int index_bits = GET_MODE_BITSIZE (index_mode);
rtx op1, op2, index;
  
if (! HAVE_casesi)
--- 10726,10731 
*** try_casesi (tree index_type, tree index_
*** 10753,10759 
  {
if (TYPE_MODE (index_type) != index_mode)
{
! index_type = lang_hooks.types.type_for_size (index_bits, 0);
  index_expr = fold_convert (index_type, index_expr);
}
  
--- 10752,10758 
  {
if (TYPE_MODE (index_type) != index_mode)
{
! index_type = lang_hooks.types.type_for_mode (index_mode, 0);
  index_expr = fold_convert (index_type, index_expr);
}
  
Index: gcc/tree-ssa-loop-ivopts.c
===
*** gcc/tree-ssa-loop-ivopts.c  (revision 185029)
--- gcc/tree-ssa-loop-ivopts.c  (working copy)
*** add_candidate (struct ivopts_data *data,
*** 2405,2432 
  add_autoinc_candidates (data, base, step, important, use);
  }
  
- /* Add a standard 0 + 1 * iteration iv candidate for a
-type with SIZE bits.  */
- 
- static void
- add_standard_iv_candidates_for_size (struct ivopts_data *data,
-unsigned int size)
- {
-   tree type = lang_hooks.types.type_for_size (size, true);
-   add_candidate (data, build_int_cst (type, 0), build_int_cst (type, 1),
-true, NULL);
- }
- 
  /* Adds standard iv candidates.  */
  
  static void
  add_standard_iv_candidates (struct ivopts_data *data)
  {
!   add_standard_iv_candidates_for_size (data, INT_TYPE_SIZE);
  
/* The same for a double-integer type if it is still fast enough.  */
!   if (BITS_PER_WORD >= INT_TYPE_SIZE * 2)
! add_standard_iv_candidates_for_size (data, INT_TYPE_SIZE * 2);
  }
  
  
--- 2405,2430 
  add_autoinc_candidates (data, base, step, important, use);
  }
  
  /* Adds standard iv candidates.  */
  
  static void
  add_standard_iv_candidates (struct ivopts_data *data)
  {
!   add_candidate (data, integer_zero_node, integer_one_node, true, NULL);
! 
!   /* The same for a double-integer type if it is still fast enough.  */
!   if

Re: Support for Runtime CPU type detection via builtins (issue5754058)

2012-03-07 Thread Richard Guenther

On Wed, Mar 7, 2012 at 1:49 AM, Sriraman Tallam tmsri...@google.com wrote:
 Patch for CPU detection at run-time.
 ===

 Patch for CPU detection at run-time, to be used in dispatching of
 multi-versioned functions.   Please see this discussion:
 http://gcc.gnu.org/ml/gcc-patches/2011-08/msg01355.html
 where this patch was reviewed the last time.

 For more detailed description:
 http://gcc.gnu.org/ml/gcc/2012-03/msg00074.html

 One of the main concerns was about making CPU detection initialization a
 constructor. The main point raised was about constructor ordering. I have
 added a priority value to the CPU detection constructor to make it very high
 priority so that it is guaranteed to fire before every constructor without
 an explicitly marked priority value of 101.  However, IFUNC initializers
 will still fire before this constructor, so the cpu initialization routine
 has to be explicitly called in such initializers for which I have added a
 builtin: __builtin_cpu_init ().
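
 The ordering argument can be sketched in isolation with GCC's constructor attribute.  This is an illustration only, assuming GCC or Clang on a target that supports constructor priorities; the toy_* functions and the two variables are invented and are not the patch's code.

```c
static int cpu_model_ready;
static int seen_ready_in_later_ctor;

/* Priority 101 is the first one available to user code; it runs before
   constructors that carry no explicit priority.  */
static void __attribute__ ((constructor (101)))
toy_cpu_init (void)
{
  cpu_model_ready = 1;
}

/* An ordinary constructor with no priority: runs after the one above,
   so it can already rely on the detection result.  */
static void __attribute__ ((constructor))
toy_later_ctor (void)
{
  seen_ready_in_later_ctor = cpu_model_ready;
}
```

 Code that runs even earlier than any constructor, such as an IFUNC resolver, gets no such guarantee and would have to trigger the initialization itself.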

 This patch adds the following new builtins:

 * __builtin_cpu_init
 * __builtin_cpu_supports_cmov
 * __builtin_cpu_supports_mmx
 * __builtin_cpu_supports_popcount
 * __builtin_cpu_supports_sse
 * __builtin_cpu_supports_sse2
 * __builtin_cpu_supports_sse3
 * __builtin_cpu_supports_ssse3
 * __builtin_cpu_supports_sse4_1
 * __builtin_cpu_supports_sse4_2
 * __builtin_cpu_is_amd
 * __builtin_cpu_is_intel_atom
 * __builtin_cpu_is_intel_core2
 * __builtin_cpu_is_intel
 * __builtin_cpu_is_intel_corei7
 * __builtin_cpu_is_intel_corei7_nehalem
 * __builtin_cpu_is_intel_corei7_westmere
 * __builtin_cpu_is_intel_corei7_sandybridge
 * __builtin_cpu_is_amdfam10
 * __builtin_cpu_is_amdfam10_barcelona
 * __builtin_cpu_is_amdfam10_shanghai
 * __builtin_cpu_is_amdfam10_istanbul
 * __builtin_cpu_is_amdfam15_bdver1
 * __builtin_cpu_is_amdfam15_bdver2

I think the non-feature detection functions are not necessary at all.
Builtin functions are not exactly cheap, nor is the scheme you invent
backward/forward compatible.  Instead, why not add a single builtin
function, __builtin_cpu_supports(const char *), and decode from
a comma-separated list of features?  Unknown features are simply
not present.  So I can write code with only a single configure check,
for __builtin_cpu_supports, and cater for future features or older compilers.

And of course that builtin would be even cross-platform.
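
To make the suggestion concrete, here is a rough sketch in plain C of how a comma-separated feature list might be decoded. Everything in it is an assumption for illustration: the feature table is hard-coded rather than filled from CPUID, the names `demo_features` and `demo_cpu_supports` are hypothetical, and treating the list as "all features must be present" is just one possible reading of the proposal.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical feature table; a real implementation would fill the
   "present" flags at run time.  */
struct cpu_feature { const char *name; bool present; };

static const struct cpu_feature demo_features[] = {
    { "cmov",   true  },
    { "mmx",    true  },
    { "sse",    true  },
    { "sse2",   true  },
    { "sse4.2", false },
};

/* Look up one feature name of length LEN.  Unknown names are simply
   treated as not present.  */
static bool feature_present(const char *name, size_t len)
{
    for (size_t i = 0; i < sizeof demo_features / sizeof demo_features[0]; i++)
        if (strlen(demo_features[i].name) == len
            && memcmp(demo_features[i].name, name, len) == 0)
            return demo_features[i].present;
    return false;
}

/* Return true only if every feature in the comma-separated LIST is
   present.  */
bool demo_cpu_supports(const char *list)
{
    while (*list != '\0') {
        size_t len = strcspn(list, ",");
        if (!feature_present(list, len))
            return false;
        list += len;
        if (*list == ',')
            list++;
    }
    return true;
}
```

With such an interface, a feature name that an older compiler does not know about just yields false instead of a compile error, which is the forward-compatibility point made above.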

Implementation-wise I'll leave this to x86 maintainers to comment on.

Richard.


        * config/i386/i386.c (build_struct_with_one_bit_fields): New function.
        (make_var_decl): New function.
        (get_field_from_struct): New function.
        (fold_builtin_target): New function.
        (ix86_fold_builtin): New function.
        (ix86_expand_builtin): Expand new builtins by folding them.
        (make_platform_builtin): New functions.
        (ix86_init_platform_type_builtins): Make the new builtins.
        (ix86_init_builtins): Make new builtins to detect CPU type.
        (TARGET_FOLD_BUILTIN): New macro.
        (IX86_BUILTIN_CPU_SUPPORTS_CMOV): New enum value.
        (IX86_BUILTIN_CPU_SUPPORTS_MMX): New enum value.
        (IX86_BUILTIN_CPU_SUPPORTS_POPCOUNT): New enum value.
        (IX86_BUILTIN_CPU_SUPPORTS_SSE): New enum value.
        (IX86_BUILTIN_CPU_SUPPORTS_SSE2): New enum value.
        (IX86_BUILTIN_CPU_SUPPORTS_SSE3): New enum value.
        (IX86_BUILTIN_CPU_SUPPORTS_SSSE3): New enum value.
        (IX86_BUILTIN_CPU_SUPPORTS_SSE4_1): New enum value.
        (IX86_BUILTIN_CPU_SUPPORTS_SSE4_2): New enum value.
        (IX86_BUILTIN_CPU_INIT): New enum value.
        (IX86_BUILTIN_CPU_IS_AMD): New enum value.
        (IX86_BUILTIN_CPU_IS_INTEL): New enum value.
        (IX86_BUILTIN_CPU_IS_INTEL_ATOM): New enum value.
        (IX86_BUILTIN_CPU_IS_INTEL_CORE2): New enum value.
        (IX86_BUILTIN_CPU_IS_INTEL_COREI7_NEHALEM): New enum value.
        (IX86_BUILTIN_CPU_IS_INTEL_COREI7_WESTMERE): New enum value.
        (IX86_BUILTIN_CPU_IS_INTEL_COREI7_SANDYBRIDGE): New enum value.
        (IX86_BUILTIN_CPU_IS_AMDFAM10_BARCELONA): New enum value.
        (IX86_BUILTIN_CPU_IS_AMDFAM10_SHANGHAI): New enum value.
        (IX86_BUILTIN_CPU_IS_AMDFAM10_ISTANBUL): New enum value.
        (IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER1): New enum value.
        (IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER2): New enum value.
        * config/i386/i386-builtin-types.def: New function type.
        * testsuite/gcc.target/builtin_target.c: New testcase.

        * libgcc/config/i386/i386-cpuinfo.c: New file.
        * libgcc/config/i386/t-cpuinfo: New file.
        * libgcc/config.host: Include t-cpuinfo.
        * libgcc/config/i386/libgcc-glibc.ver: Version symbols __cpu_model
        and __cpu_features.

 Index: libgcc/config.host
 ===
 --- libgcc/config.host  (revision 184971)
 +++ libgcc/config.host  (working copy)
 @@ -1142,7 +1142,7 @@ i[34567]86-*-linux* | x86_64-*-linux*

[Patch,AVR,committed]: Fix PR52484

2012-03-07 Thread Georg-Johann Lay

http://gcc.gnu.org/viewcvs?view=revision&revision=185043

This wrong-code bug for the __memx address space was caused by a missing R22
in the register footprint of the xload<mode>_A split.

Johann

PR target/52484
* config/avr/avr.md (xload<mode>_A): Add R22... to register footprint.

Index: config/avr/avr.md
===
--- config/avr/avr.md   (revision 185031)
+++ config/avr/avr.md   (working copy)
@@ -436,6 +436,7 @@ (define_insn_and_split xload8_A
 (define_insn_and_split "xload<mode>_A"
   [(set (match_operand:MOVMODE 0 "register_operand" "=r")
 (match_operand:MOVMODE 1 "memory_operand" "m"))
+   (clobber (reg:MOVMODE 22))
(clobber (reg:QI 21))
(clobber (reg:HI REG_Z))]
   can_create_pseudo_p()

[Ada] Swapped inputs and outputs in UG Inline Assembler section

2012-03-07 Thread Arnaud Charlet

This corrects the order of Input and Output operands in the documentation on
machine language insertions.

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-03-07  Eric Botcazou  ebotca...@adacore.com

* gnat_ugn.texi (Inline Assembler): Fix swapping of Input and
Output operands throughout.

Index: gnat_ugn.texi
===
--- gnat_ugn.texi   (revision 185043)
+++ gnat_ugn.texi   (working copy)
@@ -26107,8 +26107,8 @@
   Result : Unsigned_32;
begin
   Asm ("incl %0",
-   Inputs  => Unsigned_32'Asm_Input ("a", Value),
-   Outputs => Unsigned_32'Asm_Output ("=a", Result));
+   Outputs => Unsigned_32'Asm_Output ("=a", Result),
+   Inputs  => Unsigned_32'Asm_Input ("a", Value));
   return Result;
end Incr;
 
@@ -26134,10 +26134,8 @@
 You can have multiple input variables, in the same way that you can have more
 than one output variable.
 
-The parameter count (%0, %1) etc, now starts at the first input
-statement, and continues with the output statements.
-When both parameters use the same variable, the
-compiler will treat them as the same %n operand, which is the case here.
+The parameter count (%0, %1) etc, still starts at the first output statement,
+and continues with the input statements.
 
 Just as the @code{Outputs} parameter causes the register to be stored into the
 target variable after execution of the assembler statements, so does the
@@ -26191,8 +26189,8 @@
   Result : Unsigned_32;
begin
   Asm ("incl %0",
-   Inputs  => Unsigned_32'Asm_Input ("a", Value),
-   Outputs => Unsigned_32'Asm_Output ("=a", Result));
+   Outputs => Unsigned_32'Asm_Output ("=a", Result),
+   Inputs  => Unsigned_32'Asm_Input ("a", Value));
   return Result;
end Incr;
pragma Inline (Increment);
@@ -26274,8 +26272,8 @@
 @group
 Asm ("movl %0, %%ebx" & LF & HT &
  "movl %%ebx, %1",
- Inputs  => Unsigned_32'Asm_Input  ("g", Var_In),
- Outputs => Unsigned_32'Asm_Output ("=g", Var_Out));
+ Outputs => Unsigned_32'Asm_Output ("=g", Var_Out),
+ Inputs  => Unsigned_32'Asm_Input  ("g", Var_In));
 @end group
 @end smallexample
 @noindent
@@ -26289,8 +26287,8 @@
 @group
 Asm ("movl %0, %%ebx" & LF & HT &
  "movl %%ebx, %1",
- Inputs  => Unsigned_32'Asm_Input  ("g", Var_In),
  Outputs => Unsigned_32'Asm_Output ("=g", Var_Out),
+ Inputs  => Unsigned_32'Asm_Input  ("g", Var_In),
  Clobber => "ebx");
 @end group
 @end smallexample
@@ -26324,8 +26322,8 @@
 @group
 Asm ("movl %0, %%ebx" & LF & HT &
  "movl %%ebx, %1",
- Inputs   => Unsigned_32'Asm_Input  ("g", Var_In),
  Outputs  => Unsigned_32'Asm_Output ("=g", Var_Out),
+ Inputs   => Unsigned_32'Asm_Input  ("g", Var_In),
  Clobber  => "ebx",
  Volatile => True);
 @end group

[Ada] Internal error on multiple layers of nested generics

2012-03-07 Thread Arnaud Charlet

This patch corrects the machinery which detects whether one node appears
earlier in the tree with respect to another node.

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-03-07  Hristian Kirtchev  kirtc...@adacore.com

* sem_ch12.adb (Earlier): When two nodes come from the same
generic instantiation, compare their locations. Otherwise always
use the top level locations of the nodes.
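
The rule in this ChangeLog entry amounts to a two-level comparison. A minimal C sketch of it follows; the `demo_sloc` structure and its integer fields are hypothetical stand-ins for GNAT's source locations, not the compiler's actual data structures.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a node's source location.  */
struct demo_sloc {
    int top_level;    /* location of the outermost instantiation, or of
                         the node itself if it is not instantiated code */
    int in_template;  /* location within the generic template */
};

/* Return true if P1 appears earlier than P2.  */
bool demo_earlier(struct demo_sloc p1, struct demo_sloc p2)
{
    /* Same instance: the top level locations are identical, so only
       the position inside the template can order the two nodes.  */
    if (p1.top_level == p2.top_level)
        return p1.in_template < p2.in_template;

    /* Unrelated instances or plain source code: compare the top level
       locations.  */
    return p1.top_level < p2.top_level;
}
```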

Index: sem_ch12.adb
===
--- sem_ch12.adb(revision 185043)
+++ sem_ch12.adb(working copy)
@@ -7159,12 +7159,22 @@
   end if;
 
   --  At this point either both nodes came from source or we approximated
-  --  their source locations through neighbouring source statements. There
-  --  is no need to look at the top level locations of P1 and P2 because
-  --  both nodes are in the same list and whether the enclosing context is
-  --  instantiated is irrelevant.
+  --  their source locations through neighbouring source statements.
 
-  return Sloc (P1) < Sloc (P2);
+  --  When two nodes come from the same instance, they have identical top
+  --  level locations. To determine proper relation within the tree, check
+  --  their locations within the template.
+
+  if Top_Level_Location (Sloc (P1)) = Top_Level_Location (Sloc (P2)) then
+ return Sloc (P1) < Sloc (P2);
+
+  --  The two nodes either come from unrelated instances or do not come
+  --  from instantiated code at all.
+
+  else
+ return Top_Level_Location (Sloc (P1))
+   < Top_Level_Location (Sloc (P2));
+  end if;
end Earlier;
 
--

Re: [PATCH] Do not use lang_hooks.types.type_for_size in signed_or_unsigned_type_for

2012-03-07 Thread Michael Matz

Hi,

On Wed, 7 Mar 2012, Richard Guenther wrote:

 FAIL: gcc.dg/tree-ssa/pr31261.c scan-tree-dump-times original return 
 \\(char\\)
  -\\(unsigned char\\) c  31; 1
 FAIL: gcc.dg/tree-ssa/pr31261.c scan-tree-dump-times original return 
 \\(int\\) 
 \\(12 - \\(unsigned int\\) d\\)  7; 1
 
 because we dump the unsigned type variant differently now.  What do
 people think - adjust the testcase?  Adjust how we pretty-print
 these non-standard integer types?

Adjusting the pretty printer would be nice anyway.  Those unnamed:35 
thingies hurt my eyes.  Just printing int17 or uint18 would be perfectly 
fine, with special casing of sizes that match the normal C types for 
the target in question (so that input 'unsigned char' isn't converted to 
'uint8' on one and 'uint16' on another target).


Ciao,
Michael.
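
A small C sketch of the naming scheme suggested above, for illustration only. The function name `demo_int_type_name` is hypothetical, and the sketch uses the host's type sizes where a compiler would use the target's; it is not GCC's pretty-printer code.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>

/* Write a printable name for an integer type of PRECISION bits into
   BUF.  Precisions that match a standard C type get the C name;
   everything else is printed as intN or uintN.  */
void demo_int_type_name(char *buf, size_t size, unsigned precision,
                        bool is_unsigned)
{
    static const struct {
        unsigned bits;
        const char *signed_name;
        const char *unsigned_name;
    } c_types[] = {
        { CHAR_BIT,                     "signed char", "unsigned char"      },
        { sizeof(short) * CHAR_BIT,     "short",       "unsigned short"     },
        { sizeof(int) * CHAR_BIT,       "int",         "unsigned int"       },
        { sizeof(long) * CHAR_BIT,      "long",        "unsigned long"      },
        { sizeof(long long) * CHAR_BIT, "long long",   "unsigned long long" },
    };

    /* Special-case the sizes that correspond to normal C types.  */
    for (size_t i = 0; i < sizeof c_types / sizeof c_types[0]; i++) {
        if (c_types[i].bits == precision) {
            snprintf(buf, size, "%s", is_unsigned ? c_types[i].unsigned_name
                                                  : c_types[i].signed_name);
            return;
        }
    }

    /* Non-standard precision: fall back to a generic name.  */
    snprintf(buf, size, "%s%u", is_unsigned ? "uint" : "int", precision);
}
```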

Re: [PATCH] Do not use lang_hooks.types.type_for_size in signed_or_unsigned_type_for

2012-03-07 Thread Richard Guenther

On Wed, 7 Mar 2012, Michael Matz wrote:

 Hi,
 
 On Wed, 7 Mar 2012, Richard Guenther wrote:
 
  FAIL: gcc.dg/tree-ssa/pr31261.c scan-tree-dump-times original return 
  \\(char\\)
   -\\(unsigned char\\) c  31; 1
  FAIL: gcc.dg/tree-ssa/pr31261.c scan-tree-dump-times original return 
  \\(int\\) 
  \\(12 - \\(unsigned int\\) d\\)  7; 1
  
  because we dump the unsigned type variant differently now.  What do
  people think - adjust the testcase?  Adjust how we pretty-print
  these non-standard integer types?
 
 Adjusting the pretty printer would be nice anyway.  Those unnamed:35 
 thingies hurt my eyes.  Just printing int17 or uint18 would be perfectly 
 fine, with special casing of sizes that match the normal C types for 
 the target in question (so that input 'unsigned char' isn't converted to 
 'uint8' on one and 'uint16' on another target).

Ok, I'll do that (special-casing some precisions via *_TYPE_SIZE).
I won't touch the unnamed-unsigned-type:35 stuff, for now.

Richard.

Re: Remove obsolete Tru64 UNIX V5.1B support

2012-03-07 Thread Rainer Orth

David Daney david.da...@cavium.com writes:

 I'd have expected regeneration to use GCJ built to use ECJ, though I don't
 know.

 I've never tried this.  Given that the .class file lives below
 libjava/classpath and has to be synced with upstream Classpath anyway, I
 hope the Java maintainers will take care of that.


 This is documented (although perhaps badly) in install/configure.html

 You should use --enable-java-maintainer-mode, this will cause the build to
 use ecj and gjavah to regenerate all the generated files in the 'standard'
 manner.

Ok, I'll try it when incorporating review comments and testing the
result.  ecj-4.5.jar is still the right file to use?  The
contrib/download_ecj script mentioned in HACKING doesn't exist any longer.

 At least with the javac-built File.class I had no libjava testsuite
 failures.


 It probably results in a usable .class file, but is error prone and not
 very reproducible.

No doubt about that: it took me several iterations to get the right
invocation.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

[Ada] Increase maximum number of instantiations

2012-03-07 Thread Arnaud Charlet

This patch increases the maximum number of instantiations allowed
per compilation.

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-03-07  Bob Duff  d...@adacore.com

* hostparm.ads (Max_Instantiations): Increase parameter.

Index: hostparm.ads
===
--- hostparm.ads(revision 185043)
+++ hostparm.ads(working copy)
@@ -6,7 +6,7 @@
 --  --
 -- S p e c  --
 --  --
---  Copyright (C) 1992-2009, Free Software Foundation, Inc. --
+--  Copyright (C) 1992-2012, Free Software Foundation, Inc. --
 --  --
 -- GNAT is free software;  you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -69,7 +69,7 @@
--  of file names in the library, must be at least Max_Line_Length, but
--  can be larger.
 
-   Max_Instantiations : constant := 4000;
+   Max_Instantiations : constant := 8000;
--  Maximum number of instantiations permitted (to stop runaway cases
--  of nested instantiations). These situations probably only occur in
--  specially concocted test cases.

[Ada] Increase efficiency of bounded strings

2012-03-07 Thread Arnaud Charlet

This patch increases the efficiency of bounded strings by removing an
unnecessary default.
No change in functional behavior.

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-03-07  Bob Duff  d...@adacore.com

* a-strsup.ads, a-stwisu.ads, a-stzsup.ads (Super_String):
Remove default initial value for Data. It is no longer needed
because "=" now composes properly for untagged records. This
default has caused efficiency problems.
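
As a loose analogy in C (not GNAT's implementation; the type and function names are hypothetical), the point is that once equality on the string type looks only at the live prefix, the unused tail of the buffer never needs to be initialized:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_LENGTH 8

struct demo_super_string {
    size_t current_length;
    char data[DEMO_MAX_LENGTH];  /* bytes past current_length are unspecified */
};

/* Equality that compares only the first current_length characters, so
   the contents of the unused tail cannot affect the result.  */
bool demo_super_equal(const struct demo_super_string *a,
                      const struct demo_super_string *b)
{
    return a->current_length == b->current_length
        && memcmp(a->data, b->data, a->current_length) == 0;
}
```

Filling the whole buffer with NUL on every object creation is then pure overhead, which is the efficiency problem the ChangeLog entry refers to.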

Index: a-strsup.ads
===
--- a-strsup.ads(revision 185043)
+++ a-strsup.ads(working copy)
@@ -6,7 +6,7 @@
 --  --
 -- S p e c  --
 --  --
---  Copyright (C) 2003-2010, Free Software Foundation, Inc. --
+--  Copyright (C) 2003-2012, Free Software Foundation, Inc. --
 --  --
 -- GNAT is free software;  you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -43,7 +43,10 @@
 
type Super_String (Max_Length : Positive) is record
   Current_Length : Natural := 0;
-  Data   : String (1 .. Max_Length) := (others => ASCII.NUL);
+  Data   : String (1 .. Max_Length);
+  --  A previous version had a default initial value for Data, which is no
+  --  longer necessary, because "=" now composes properly for untagged
+  --  records. Leaving it out is more efficient.
end record;
--  Type Bounded_String in Ada.Strings.Bounded.Generic_Bounded_Length is
--  derived from this type, with the constraint of the maximum length.
Index: a-stwisu.ads
===
--- a-stwisu.ads(revision 185043)
+++ a-stwisu.ads(working copy)
@@ -6,7 +6,7 @@
 --  --
 -- S p e c  --
 --  --
---  Copyright (C) 2003-2010, Free Software Foundation, Inc. --
+--  Copyright (C) 2003-2012, Free Software Foundation, Inc. --
 --  --
 -- GNAT is free software;  you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -46,7 +46,10 @@
 
type Super_String (Max_Length : Positive) is record
   Current_Length : Natural := 0;
-  Data   : Wide_String (1 .. Max_Length) := (others => Wide_NUL);
+  Data   : Wide_String (1 .. Max_Length);
+  --  A previous version had a default initial value for Data, which is no
+  --  longer necessary, because "=" now composes properly for untagged
+  --  records. Leaving it out is more efficient.
end record;
--  Ada.Strings.Wide_Bounded.Generic_Bounded_Length.Wide_Bounded_String is
--  derived from this type, with the constraint of the maximum length.
Index: a-stzsup.ads
===
--- a-stzsup.ads(revision 185043)
+++ a-stzsup.ads(working copy)
@@ -6,7 +6,7 @@
 --  --
 -- S p e c  --
 --  --
---  Copyright (C) 2003-2010, Free Software Foundation, Inc. --
+--  Copyright (C) 2003-2012, Free Software Foundation, Inc. --
 --  --
 -- GNAT is free software;  you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -47,8 +47,10 @@
 
type Super_String (Max_Length : Positive) is record
   Current_Length : Natural := 0;
-  Data   : Wide_Wide_String (1 .. Max_Length) :=
- (others => Wide_Wide_NUL);
+  Data   : Wide_Wide_String (1 .. Max_Length);
+  --  A previous version had a default initial value for Data, which is no
+  --  longer necessary, because "=" now composes properly for untagged
+  --  records. Leaving it out is more efficient.
end record;
--  Wide_Wide_Bounded.Generic_Bounded_Length.Wide_Wide_Bounded_String is
--  derived from this type, with the constraint of the maximum length.

[Ada] Merge s-osinte-vms and s-osinte-vms-ia64

2012-03-07 Thread Arnaud Charlet

The differences are not significant enough to require a separate version.
Pthread_Self is now an imported subprogram on Alpha too.  Could be replaced
by inline assembler on both platforms in case of performance issues.

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-03-07  Tristan Gingold  ging...@adacore.com

* s-osinte-vms-ia64.adb, s-osinte-vms-ia64.ads, s-osinte-vms.adb,
s-osinte-vms.ads, gcc-interface/Makefile.in: Merge s-osinte-vms and
s-osinte-vms-ia64.

Index: s-osinte-vms-ia64.adb
===
--- s-osinte-vms-ia64.adb   (revision 185043)
+++ s-osinte-vms-ia64.adb   (working copy)
@@ -1,58 +0,0 @@
---
---  --
--- GNAT RUN-TIME LIBRARY (GNARL) COMPONENTS --
---  --
---   S Y S T E M . O S _ I N T E R F A C E  --
---  --
---  B o d y --
---  --
---  Copyright (C) 2003-2010, Free Software Foundation, Inc. --
---  --
--- GNAT is free software;  you can  redistribute it  and/or modify it under --
--- terms of the  GNU General Public License as published  by the Free Soft- --
--- ware  Foundation;  either version 3,  or (at your option) any later ver- --
--- sion.  GNAT is distributed in the hope that it will be useful, but WITH- --
--- OUT ANY WARRANTY;  without even the  implied warranty of MERCHANTABILITY --
--- or FITNESS FOR A PARTICULAR PURPOSE. --
---  --
--- As a special exception under Section 7 of GPL version 3, you are granted --
--- additional permissions described in the GCC Runtime Library Exception,   --
--- version 3.1, as published by the Free Software Foundation.   --
---  --
--- You should have received a copy of the GNU General Public License and--
--- a copy of the GCC Runtime Library Exception along with this program; --
--- see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see--
--- http://www.gnu.org/licenses/.  --
---  --
--- GNARL was developed by the GNARL team at Florida State University.   --
--- Extensive contributions were provided by Ada Core Technologies, Inc. --
---  --
---
-
---  This is a OpenVMS/IA64 version of this package
-
---  This package encapsulates all direct interfaces to OS services
---  that are needed by children of System.
-
-pragma Polling (Off);
---  Turn off polling, we do not want ATC polling to take place during
---  tasking operations. It causes infinite loops and other problems.
-
-with Interfaces.C; use Interfaces.C;
-
-package body System.OS_Interface is
-
-   -
-   -- sched_yield --
-   -
-
-   function sched_yield return int is
-  procedure sched_yield_base;
-  pragma Import (C, sched_yield_base, PTHREAD_YIELD_NP);
-
-   begin
-  sched_yield_base;
-  return 0;
-   end sched_yield;
-
-end System.OS_Interface;
Index: s-osinte-vms-ia64.ads
===
--- s-osinte-vms-ia64.ads   (revision 185043)
+++ s-osinte-vms-ia64.ads   (working copy)
@@ -1,652 +0,0 @@
---
---  --
--- GNAT RUN-TIME LIBRARY (GNARL) COMPONENTS --
---  --
---   S Y S T E M . O S _ I N T E R F A C E  --
---  --
---  S p e c --
---  --
--- Copyright (C) 1991-1994, Florida State University--
---  Copyright (C) 1995-2010, Free Software Foundation, Inc. --
---  --
--- GNAT is free software;  you can  redistribute it  and/or modify it

[Ada] Set pthread names on OpenVMS

2012-03-07 Thread Arnaud Charlet

Just in order to ease debugging (at pthread level).
No functional change.

Committed on trunk

2012-03-07  Tristan Gingold  ging...@adacore.com

* s-taprop-vms.adb (Create_Task): set thread name.
* s-osinte-vms.ads (pthread_attr_setname_np): Declare.

Index: s-taprop-vms.adb
===
--- s-taprop-vms.adb(revision 185051)
+++ s-taprop-vms.adb(working copy)
@@ -780,6 +780,7 @@
   function Thread_Body_Access is new
 Ada.Unchecked_Conversion (System.Aux_DEC.Short_Address, Thread_Body);
 
+  Task_Name : String (1 .. System.Parameters.Max_Task_Image_Length + 1);
begin
   --  Since the initial signal mask of a thread is inherited from the
   --  creator, we need to set our local signal mask to mask all signals
@@ -809,6 +810,18 @@
   (Attributes'Access, PTHREAD_EXPLICIT_SCHED);
   pragma Assert (Result = 0);
 
+  if T.Common.Task_Image_Len > 0 then
+ --  Set thread name to ease debugging
+
+ Task_Name (1 .. T.Common.Task_Image_Len) :=
+   T.Common.Task_Image (1 .. T.Common.Task_Image_Len);
+ Task_Name (T.Common.Task_Image_Len + 1) := ASCII.NUL;
+
+ Result := pthread_attr_setname_np
+   (Attributes'Access, Task_Name'Address, Null_Address);
+ pragma Assert (Result = 0);
+  end if;
+
   --  Note: the use of Unrestricted_Access in the following call is needed
   --  because otherwise we have an error of getting a access-to-volatile
   --  value which points to a non-volatile object. But in this case it is
Index: s-osinte-vms.ads
===
--- s-osinte-vms.ads(revision 185051)
+++ s-osinte-vms.ads(working copy)
@@ -520,6 +520,12 @@
   sched_param : int) return int;
pragma Import (C, pthread_attr_setschedparam, PTHREAD_ATTR_SETSCHEDPARAM);
 
+   function pthread_attr_setname_np
+ (attr : access pthread_attr_t;
+  name : System.Address;
+  mbz  : System.Address) return int;
+   pragma Import (C, pthread_attr_setname_np, "PTHREAD_ATTR_SETNAME_NP");
+
function sched_yield return int;
 
--

Re: [PR51752] publication safety violations in loop invariant motion pass

2012-03-07 Thread Aldy Hernandez


On 03/07/12 03:18, Richard Guenther wrote:

On Tue, Mar 6, 2012 at 9:56 PM, Torvald Riegel trie...@redhat.com wrote:

On Tue, 2012-03-06 at 21:18 +0100, Richard Guenther wrote:

On Tue, Mar 6, 2012 at 6:55 PM, Aldy Hernandez al...@redhat.com wrote:

On 02/29/12 03:22, Richard Guenther wrote:


So fixing up individual passes is easier - I can only think of PRE being
problematic right now, I am not aware that any other pass moves loads
or stores.  So I'd simply pre-compute the stmt bit in PRE and adjust
the

   if (gimple_has_volatile_ops (stmt)
   || stmt_could_throw_p (stmt))
 continue;

in compute_avail accordingly.



Initially I thought PRE would be problematic for transactions, but perhaps
it isn't.  As I understand, for PRE we hoist loads/computations that are
mostly redundant, but will be performed on every path:

if (flag)
a = b + c;
else
stuff;
d = b + c;    <-- [b + c] always computed

Even if we hoist [b + c] before the flag, [b + c] will be computed on every
path out of if (flag).  So... we can allow this transformation within
transactions, right?


In this particular example, I agree.  We can move [b + c] into the else
branch, and then move it to before flag because it will happen on all
paths to the exit anyway.


Note that partial PRE (enabled at -O3) can insert expressions into paths
that did _not_ execute the expression.  For regular PRE you are right.


I suppose if only loads will be moved around by PRE, then this could be
fine, as long as those expressions do not have visible side effects or
can crash if reading garbage.  For example, dereferencing pointers
could lead to accessing unmapped memory and thus segfaults, speculative
stores are not allowed (even if you undo them later on), etc.

Also, if PRE inserts expressions into paths that did not execute the
transactions, can it happen that then something like loop invariant
motion comes around and optimizes based on that and moves the code to
before if (flag)...?  If so, PRE would break publication safety
indirectly by pretending that the expression happened on every path to
the exit, tricking subsequent passes to believe things that were not in
place in the source code.  Is this a realistic scenario?


I think so.


Wow, yeah.  I hadn't thought about that.

Fortunately in the current code base this won't happen because loop 
invariant motion will refuse to move _any_ loads that happen inside a 
transaction, and because of the memory barrier at the beginning of 
transactions, no code will be moved out of a transaction.  However, when 
we optimize things (4.8?) and allow loop invariant motion inside 
transactions when the load happens on every path to exit, then yes... we 
need to keep this problematic scenario in mind.


For now I believe we're safe, modulo the partial PRE scenario that 
Richard G pointed out.


Aldy
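
For readers following the thread, a single-threaded C sketch of the concern may help. It is only an illustration with hypothetical names: it counts reads of a datum to show that hoisting a load above the flag test introduces a read on a path that did not have one. In the real publication pattern another thread writes the datum and then sets the flag, so such an extra read can observe unpublished data.

```c
#include <stdbool.h>

static int demo_data = 42;
static int demo_data_reads;   /* how often the datum has been read */

static int read_data(void)
{
    demo_data_reads++;
    return demo_data;
}

/* As written in the source: the datum is read only when FLAG is set.  */
int demo_original(bool flag)
{
    int result = 0;
    if (flag)
        result = read_data();
    return result;
}

/* After an unsafe hoist: the read now also happens when FLAG is false.
   The returned value is unchanged, but the extra read is what breaks
   publication safety.  */
int demo_hoisted(bool flag)
{
    int tmp = read_data();
    int result = 0;
    if (flag)
        result = tmp;
    return result;
}

int demo_reads(void)
{
    return demo_data_reads;
}
```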

Re: [RFC PATCH]: Handle Pmode == SImode in stringop patterns

2012-03-07 Thread Uros Bizjak

On Tue, Mar 6, 2012 at 10:27 PM, H.J. Lu hjl.to...@gmail.com wrote:
 +     case '^':
 +       if (TARGET_64BIT && Pmode == SImode)
 +         {
 +           fputs ("addr32", file);
 +#ifndef HAVE_AS_IX86_REP_LOCK_PREFIX
 +           if (ASSEMBLER_DIALECT == ASM_ATT)
 +             fputs ("addr32; ", file);
 +           else
 +#endif
 +             fputs ("addr32 ", file);
 +         }

 Why do you print "addr32" twice? "addr32addr32; " or "addr32addr32 ".

 Oops, please remove the first one.


 It looks OK to me.  I will test after I fix indirect jmp/call.

 FYI, addr32 prefix can't stand alone (but addr32 rep; insn is OK),
 so #ifndefed part is bogus.


 I changed it to

 +  case '^':
 +    if (TARGET_64BIT && Pmode == SImode)
 +      fputs ("addr32 ", file);
 +    return;

 and it seems to work.

Committed with slight adjustment to above code

+   case '^':
+ if (TARGET_64BIT && Pmode != word_mode)
+   fputs ("addr32 ", file);
+ return;
+

2012-03-07  Uros Bizjak  ubiz...@gmail.com

* config/i386/i386.c (ix86_print_operand_punct_valid_p): Add '^'.
(ix86_print_operand): Handle '^'.
* config/i386/i386.md (*strmovdi_rex_1): Macroize memory operands
using P mode iterator.  Add %^ to asm template to conditionally emit
addr32 prefix.
(*rep_movdi_rex64): Ditto.
(*strsetdi_rex_1): Ditto.
(*rep_stosdi_rex64): Ditto.
(*strmov{si,hi,qi}_1): Add %^ to asm template to
conditionally emit addr32 prefix.
(*rep_mov{si,qi}): Ditto.
(*strset{si,hi,qi}): Ditto.
(*rep_stos{si,qi}): Ditto.
(*cmpstrnqi_nz_1): Ditto.
(*cmpstrnqi_1): Ditto.
(*strlenqi_1): Ditto.

Re-tested on x86_64-pc-linux-gnu.

Uros.
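
The committed '^' handling boils down to one condition. A standalone C sketch of that logic follows; the `demo_target` structure and `demo_print_caret` function are hypothetical stand-ins for TARGET_64BIT, Pmode and word_mode, not the code in i386.c.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for the GCC target macros.  */
enum demo_mode { DEMO_SImode, DEMO_DImode };

struct demo_target {
    bool is_64bit;             /* TARGET_64BIT */
    enum demo_mode pmode;      /* Pmode: mode of pointers */
    enum demo_mode word_mode;  /* word_mode: mode of a machine word */
};

/* Write the text emitted for the '^' punctuation code into BUF and
   return its length.  The prefix is needed only when pointers are
   narrower than the 64-bit word, as on x32.  */
int demo_print_caret(const struct demo_target *t, char *buf, size_t size)
{
    if (t->is_64bit && t->pmode != t->word_mode)
        return snprintf(buf, size, "addr32 ");

    if (size > 0)
        buf[0] = '\0';
    return 0;
}
```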

Re: PATCH: Properly check mode for x86 call/jmp address

2012-03-07 Thread H.J. Lu

On Wed, Mar 7, 2012 at 1:28 AM, Uros Bizjak ubiz...@gmail.com wrote:
 On Tue, Mar 6, 2012 at 9:57 PM, H.J. Lu hjl.to...@gmail.com wrote:
  (define_insn *call
 -  [(call (mem:QI (match_operand:P 0 call_insn_operand czw))
 +  [(call (mem:QI (match_operand:C 0 call_insn_operand czw))
        (match_operand 1  ))]
 -  !SIBLING_CALL_P (insn)
 +  !SIBLING_CALL_P (insn)
 +    (GET_CODE (operands[0]) == SYMBOL_REF
 +       || GET_MODE (operands[0]) == word_mode)

 There are enough copies of this extra constraint that I wonder
 if it simply ought to be folded into call_insn_operand.

 Which would need to be changed to define_special_predicate,
 since you'd be doing your own mode checking.

 Probably similar changes to sibcall_insn_operand.

 Here is the updated patch.  I changed constant_call_address_operand
 and call_register_no_elim_operand to use define_special_predicate.
 OK for trunk?

 Please do not complicate matters that much. Just stick word_mode
 overrides for register operands in predicates.md, like in attached
 patch. These changed predicates now allow registers only in word_mode
 (and VOIDmode).

 You can now remove all new mode iterators and leave call patterns untouched.

 @@ -22940,14 +22940,18 @@ ix86_expand_call (rtx retval, rtx fnaddr,
 rtx callarg1,
        GET_CODE (XEXP (fnaddr, 0)) == SYMBOL_REF
        !local_symbolic_operand (XEXP (fnaddr, 0), VOIDmode))
     fnaddr = gen_rtx_MEM (QImode, construct_plt_address (XEXP (fnaddr, 0)));
 -  else if (sibcall
 -          ? !sibcall_insn_operand (XEXP (fnaddr, 0), Pmode)
 -          : !call_insn_operand (XEXP (fnaddr, 0), Pmode))
 +  else if (!(constant_call_address_operand (XEXP (fnaddr, 0), Pmode)
 +            || call_register_no_elim_operand (XEXP (fnaddr, 0),
 +                                              word_mode)
 +            || (!sibcall
 +                 !TARGET_X32
 +                 memory_operand (XEXP (fnaddr, 0), word_mode
     {
       fnaddr = XEXP (fnaddr, 0);
 -      if (GET_MODE (fnaddr) != Pmode)
 -       fnaddr = convert_to_mode (Pmode, fnaddr, 1);
 -      fnaddr = gen_rtx_MEM (QImode, copy_to_mode_reg (Pmode, fnaddr));
 +      if (GET_MODE (fnaddr) != word_mode)
 +       fnaddr = convert_to_mode (word_mode, fnaddr, 1);
 +      fnaddr = gen_rtx_MEM (QImode,
 +                           copy_to_mode_reg (word_mode, fnaddr));
     }

   vec_len = 0;

 Please update the above part. It looks like you don't even have to change
 the condition with the new predicates. Basically, you should only convert the
 address to word_mode instead of Pmode.

 +  if (TARGET_X32)
 +    operands[0] = convert_memory_address (word_mode, operands[0]);

 This addition to indirect_jump and tablejump should be the only
 change needed in i386.md now. Please write the condition

 if (Pmode != word_mode)

 for consistency.

 BTW: The attached patch was bootstrapped and regression tested on
 x86_64-pc-linux-gnu {,-m32}.

 Uros.

It doesn't work:

x.i:7:1: error: unrecognizable insn:
(call_insn/j 8 7 9 3 (call (mem:QI (reg:DI 62) [0 *foo.0_1 S1 A8])
(const_int 0 [0])) x.i:6 -1
 (nil)
(nil))
x.i:7:1: internal compiler error: in extract_insn, at recog.c:2123
Please submit a full bug report,
with preprocessed source if appropriate.
See http://gcc.gnu.org/bugs.html for instructions.
make: *** [x.s] Error 1

I will investigate it.

-- 
H.J.

Go patch committed: Fix bug when struct inherits varargs

2012-03-07 Thread Ian Lance Taylor

This patch to the Go compiler fixes a bug when a struct inherits from an
interface with a varargs method.  I added a test for this to the master
Go testsuite, which will be copied into gcc in due course.  Bootstrapped
and ran Go testsuite on x86_64-unknown-linux-gnu.  Committed to
mainline.

Ian

diff -r e9c1caae3b22 go/types.cc
--- a/go/types.cc	Tue Mar 06 17:06:46 2012 -0800
+++ b/go/types.cc	Wed Mar 07 08:03:33 2012 -0800
@@ -3744,8 +3744,12 @@
   go_assert(!this->is_method());
   Typed_identifier* receiver = new Typed_identifier("", receiver_type,
 		this->location_);
-  return Type::make_function_type(receiver, this->parameters_,
-  this->results_, this->location_);
+  Function_type* ret = Type::make_function_type(receiver, this->parameters_,
+		this->results_,
+		this->location_);
+  if (this->is_varargs_)
+ret->set_is_varargs();
+  return ret;
 }
 
 // Make a function type.

[Ada] Set task name under Linux

2012-03-07 Thread Arnaud Charlet

It's useful to set a task's/thread's name for debugging.
Under Linux, there is a prctl() system call to do that. Note that we don't
want to use pthread_setname_np since only very recent glibc versions (>= 2.12)
provide this function.
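
For reference, the equivalent call from C looks roughly like this. It is a Linux-only sketch with hypothetical wrapper names, not the Ada code in the patch; it relies on the documented limit that the kernel keeps at most 16 bytes of name, including the terminating NUL.

```c
#include <string.h>
#include <sys/prctl.h>

/* Set the calling thread's name.  Longer names are truncated to 15
   characters plus the terminating NUL.  */
int demo_set_thread_name(const char *name)
{
    char buf[16];

    strncpy(buf, name, sizeof buf - 1);
    buf[sizeof buf - 1] = '\0';
    return prctl(PR_SET_NAME, (unsigned long) buf, 0UL, 0UL, 0UL);
}

/* Read the calling thread's name back into OUT, which must have room
   for at least 16 bytes.  */
int demo_get_thread_name(char *out)
{
    return prctl(PR_GET_NAME, (unsigned long) out, 0UL, 0UL, 0UL);
}
```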

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-03-07  Arnaud Charlet  char...@adacore.com

* s-osinte-linux.ads, s-taprop-linux.adb (prctl): New function.
(Enter_Task): Call prctl when relevant.

Index: s-osinte-linux.ads
===
--- s-osinte-linux.ads  (revision 185051)
+++ s-osinte-linux.ads  (working copy)
@@ -7,7 +7,7 @@
 --  S p e c --
 --  --
 -- Copyright (C) 1991-1994, Florida State University--
---  Copyright (C) 1995-2011, Free Software Foundation, Inc. --
+--  Copyright (C) 1995-2012, Free Software Foundation, Inc. --
 --  --
 -- GNAT is free software;  you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -255,6 +255,12 @@
function getpid return pid_t;
pragma Import (C, getpid, getpid);
 
+   PR_SET_NAME : constant := 15;
+
+   function prctl
+ (option : int; arg2, arg3, arg4, arg5 : unsigned_long := 0) return int;
+   pragma Import (C, prctl);
+
-
-- Threads --
-
Index: s-taprop-linux.adb
===
--- s-taprop-linux.adb  (revision 185051)
+++ s-taprop-linux.adb  (working copy)
@@ -6,7 +6,7 @@
 --  --
 --  B o d y --
 --  --
--- Copyright (C) 1992-2011, Free Software Foundation, Inc.  --
+-- Copyright (C) 1992-2012, Free Software Foundation, Inc.  --
 --  --
 -- GNARL is free software; you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -767,6 +767,22 @@
   Self_ID.Common.LL.Thread := pthread_self;
   Self_ID.Common.LL.LWP := lwp_self;
 
+  if Self_ID.Common.Task_Image_Len > 0 then
+ declare
+Task_Name : String (1 .. Parameters.Max_Task_Image_Length + 1);
+Result: int;
+ begin
+--  Set thread name to ease debugging
+
+Task_Name (1 .. Self_ID.Common.Task_Image_Len) :=
+  Self_ID.Common.Task_Image (1 .. Self_ID.Common.Task_Image_Len);
+Task_Name (Self_ID.Common.Task_Image_Len + 1) := ASCII.NUL;
+
+Result := prctl (PR_SET_NAME, unsigned_long (Task_Name'Address));
+pragma Assert (Result = 0);
+ end;
+  end if;
+
   Specific.Set (Self_ID);
 
   if Use_Alternate_Stack
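
For readers who have not used the call, the following is a minimal C sketch of what the Ada code above does through its imported prctl binding: copy the name into a NUL-terminated buffer and pass its address with PR_SET_NAME.  The helper names set_thread_name and get_thread_name and the THREAD_NAME_MAX macro are hypothetical and not part of the patch; the 16-byte limit (including the NUL) is the one documented for PR_SET_NAME on Linux.

```c
#include <string.h>
#include <sys/prctl.h>

/* Linux thread names are at most 16 bytes, trailing NUL included.  */
#define THREAD_NAME_MAX 16

/* Hypothetical helper: name the calling thread.  Longer names are
   truncated here rather than left to the kernel.  Returns 0 on success.  */
int
set_thread_name (const char *name)
{
  char buf[THREAD_NAME_MAX];

  strncpy (buf, name, sizeof buf - 1);
  buf[sizeof buf - 1] = '\0';
  return prctl (PR_SET_NAME, (unsigned long) buf, 0UL, 0UL, 0UL);
}

/* Hypothetical helper: read the calling thread's name back into BUF,
   which must hold THREAD_NAME_MAX bytes.  Returns 0 on success.  */
int
get_thread_name (char *buf)
{
  return prctl (PR_GET_NAME, (unsigned long) buf, 0UL, 0UL, 0UL);
}
```

A debugger or /proc/<pid>/task/<tid>/comm should then show the name set this way, which is the debugging aid the patch is after.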

[PATCH] Do not handle SUBREG in apply_distributive_law (Re: RFC: allowing fwprop to propagate subregs)

2012-03-07 Thread Ulrich Weigand

Richard Kenner wrote:

> > Given the current set of results, since I do not have any way to verify
> > whether my simplify_set changes would actually trigger correctly, I'd
> > rather propose to just remove the SUBREG case in apply_distributive_law
> > (i.e. only apply the first patch below).
> >
> > Thoughts?
>
> I think that's reasonable.  But I'd replace it with a comment saying
> what used to be there and why it was removed.

Now that we're back in stage 1, I'd like to try and move forward with
this again.  Here's a patch that implements your suggestion.

Tested on arm-linux-gnueabi, i386-linux-gnu and s390-linux-gnu.

OK for mainline?

Bye,
Ulrich

2012-03-07  Ulrich Weigand  ulrich.weig...@linaro.org

gcc/
* combine.c (apply_distributive_law): Do not distribute SUBREG.

=== modified file 'gcc/combine.c'
--- gcc/combine.c   2012-02-07 15:48:52 +
+++ gcc/combine.c   2012-02-22 11:57:19 +
@@ -9286,36 +9286,22 @@
   /* This is also a multiply, so it distributes over everything.  */
   break;
 
-case SUBREG:
-  /* Non-paradoxical SUBREGs distributes over all operations,
-provided the inner modes and byte offsets are the same, this
-is an extraction of a low-order part, we don't convert an fp
-operation to int or vice versa, this is not a vector mode,
-and we would not be converting a single-word operation into a
-multi-word operation.  The latter test is not required, but
-it prevents generating unneeded multi-word operations.  Some
-of the previous tests are redundant given the latter test,
-but are retained because they are required for correctness.
-
-We produce the result slightly differently in this case.  */
-
-  if (GET_MODE (SUBREG_REG (lhs)) != GET_MODE (SUBREG_REG (rhs))
- || SUBREG_BYTE (lhs) != SUBREG_BYTE (rhs)
- || ! subreg_lowpart_p (lhs)
- || (GET_MODE_CLASS (GET_MODE (lhs))
- != GET_MODE_CLASS (GET_MODE (SUBREG_REG (lhs))))
- || paradoxical_subreg_p (lhs)
- || VECTOR_MODE_P (GET_MODE (lhs))
- || GET_MODE_SIZE (GET_MODE (SUBREG_REG (lhs))) > UNITS_PER_WORD
- /* Result might need to be truncated.  Don't change mode if
-explicit truncation is needed.  */
- || !TRULY_NOOP_TRUNCATION_MODES_P (GET_MODE (x),
-GET_MODE (SUBREG_REG (lhs))))
-   return x;
-
-  tem = simplify_gen_binary (code, GET_MODE (SUBREG_REG (lhs)),
-SUBREG_REG (lhs), SUBREG_REG (rhs));
-  return gen_lowpart (GET_MODE (x), tem);
+/* This used to handle SUBREG, but this turned out to be counter-
+   productive, since (subreg (op ...)) usually is not handled by
+   insn patterns, and this optimization therefore transformed
+   recognizable patterns into unrecognizable ones.  Therefore the
+   SUBREG case was removed from here.
+
+   It is possible that distributing SUBREG over arithmetic operations
+   leads to an intermediate result that can then be optimized further,
+   e.g. by moving the outer SUBREG to the other side of a SET as done
+   in simplify_set.  This seems to have been the original intent of
+   handling SUBREGs here.
+
+   However, with current GCC this does not appear to actually happen,
+   at least on major platforms.  If some case is found where removing
+   the SUBREG case here prevents follow-on optimizations, distributing
+   SUBREGs ought to be re-added at that place, e.g. in simplify_set.  */
 
 default:
   return x;
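
As background for the removed case: the transformation was valid because taking the low-order part of a value commutes with operations such as plus, and, ior and xor.  The sketch below illustrates only that arithmetic fact in plain C, using truncation from 64 to 32 bits as a stand-in for a lowpart SUBREG; lowpart32 and lowpart_distributes are hypothetical names and none of this is GCC code.  The patch does not dispute the identity, it removes the rewrite because the resulting RTL shape tends not to match insn patterns.

```c
#include <stdint.h>

/* Stand-in for a lowpart (subreg:SI (reg:DI ...)): keep the low 32 bits.  */
uint32_t
lowpart32 (uint64_t x)
{
  return (uint32_t) x;
}

/* Return 1 if, for A and B, truncating the result of each operation gives
   the same value as performing the operation on the truncated operands.  */
int
lowpart_distributes (uint64_t a, uint64_t b)
{
  return lowpart32 (a + b) == (uint32_t) (lowpart32 (a) + lowpart32 (b))
         && lowpart32 (a & b) == (lowpart32 (a) & lowpart32 (b))
         && lowpart32 (a | b) == (lowpart32 (a) | lowpart32 (b))
         && lowpart32 (a ^ b) == (lowpart32 (a) ^ lowpart32 (b));
}
```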



-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  ulrich.weig...@de.ibm.com

Re: Remove obsolete OpenBSD/MIPS support

2012-03-07 Thread Rainer Orth

Richard Sandiford rdsandif...@googlemail.com writes:

> Definitely.  Thanks for saving me the legwork :-)

It was both easy to do and a prerequisite for the MIPS_DEBUGGING_INFO
removal :-)

> There are also some SDB_OUTPUT_SOURCE_LINE references in mips.c and
> mips.h that could go.  Removing those is preapproved if you want to
> do it as part of the same patch, otherwise I can do it this weekend.

I'll probably get to it since I have to make minor adjustments to a
couple of patches anyway, but won't be able to before the weekend
myself.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

[committed] TILE-Gx/TILEPro: unwind fix for dynamic frames

2012-03-07 Thread Walter Lee


This patch fixes an unwinding bug for functions with dynamic stack
frames.  We stop generating REG_CFA_* notes for stack pointer, and at
the end of unwinding we restore the stack pointer by adjusting it by
EH_RETURN_STACKADJ_RTX.

Walter

diff --git a/gcc/config/tilegx/tilegx.c b/gcc/config/tilegx/tilegx.c
index fa739e3..217682e 100644
--- a/gcc/config/tilegx/tilegx.c
+++ b/gcc/config/tilegx/tilegx.c
@@ -3881,9 +3881,8 @@ tilegx_expand_prologue (void)
 {
   /* Copy the old stack pointer aside so we can save it later.  */
   sp_copy_regno = next_scratch_regno--;
-  insn = FRP (emit_move_insn (gen_rtx_REG (Pmode, sp_copy_regno),
- stack_pointer_rtx));
-  add_reg_note (insn, REG_CFA_REGISTER, NULL_RTX);
+  emit_move_insn (gen_rtx_REG (Pmode, sp_copy_regno),
+ stack_pointer_rtx);
 }
 
   if (tilegx_current_function_is_leaf ())
@@ -3925,8 +3924,8 @@ tilegx_expand_prologue (void)
}
 
   /* Save our frame pointer for backtrace chaining.  */
-  FRP (frame_emit_store (sp_copy_regno, STACK_POINTER_REGNUM,
-chain_addr, cfa, cfa_offset));
+  emit_insn (gen_movdi (gen_frame_mem (DImode, chain_addr),
+   gen_rtx_REG (DImode, sp_copy_regno)));
 }
 
   /* Compute where to start storing registers we need to save.  */
@@ -4067,16 +4066,7 @@ tilegx_expand_epilogue (bool sibcall_p)
 
   emit_insn (gen_blockage ());
 
-  if (crtl->calls_eh_return)
-{
-  rtx r = compute_frame_addr (-total_size + UNITS_PER_WORD,
- next_scratch_regno);
-  insn = emit_move_insn (gen_lowpart (DImode, stack_pointer_rtx),
-gen_frame_mem (DImode, r));
-  RTX_FRAME_RELATED_P (insn) = 1;
-  REG_NOTES (insn) = cfa_restores;
-}
-  else if (frame_pointer_needed)
+  if (frame_pointer_needed)
 {
   /* Restore the old stack pointer by copying from the frame
  pointer.  */
@@ -4100,6 +4090,16 @@ tilegx_expand_epilogue (bool sibcall_p)
 cfa_restores);
 }
 
+  if (crtl->calls_eh_return)
+{
+  if (TARGET_32BIT)
+   emit_insn (gen_sp_adjust_32bit (stack_pointer_rtx, stack_pointer_rtx,
+   EH_RETURN_STACKADJ_RTX));
+  else
+   emit_insn (gen_sp_adjust (stack_pointer_rtx, stack_pointer_rtx,
+ EH_RETURN_STACKADJ_RTX));
+}
+
   /* Restore the old frame pointer.  */
   if (frame_pointer_needed)
 {
diff --git a/gcc/config/tilepro/tilepro.c b/gcc/config/tilepro/tilepro.c
index 71b5807..011ac08 100644
--- a/gcc/config/tilepro/tilepro.c
+++ b/gcc/config/tilepro/tilepro.c
@@ -3556,9 +3556,8 @@ tilepro_expand_prologue (void)
 {
   /* Copy the old stack pointer aside so we can save it later.  */
   sp_copy_regno = next_scratch_regno--;
-  insn = FRP (emit_move_insn (gen_rtx_REG (Pmode, sp_copy_regno),
- stack_pointer_rtx));
-  add_reg_note (insn, REG_CFA_REGISTER, NULL_RTX);
+  emit_move_insn (gen_rtx_REG (Pmode, sp_copy_regno),
+ stack_pointer_rtx);
 }
 
   if (tilepro_current_function_is_leaf ())
@@ -3600,8 +3599,8 @@ tilepro_expand_prologue (void)
}
 
   /* Save our frame pointer for backtrace chaining.  */
-  FRP (frame_emit_store (sp_copy_regno, STACK_POINTER_REGNUM,
-chain_addr, cfa, cfa_offset));
+  emit_insn (gen_movsi (gen_frame_mem (SImode, chain_addr),
+   gen_rtx_REG (SImode, sp_copy_regno)));
 }
 
   /* Compute where to start storing registers we need to save.  */
@@ -3742,16 +3741,7 @@ tilepro_expand_epilogue (bool sibcall_p)
 
   emit_insn (gen_blockage ());
 
-  if (crtl->calls_eh_return)
-{
-  rtx r = compute_frame_addr (-total_size + UNITS_PER_WORD,
- next_scratch_regno);
-  insn = emit_move_insn (gen_rtx_REG (Pmode, STACK_POINTER_REGNUM),
-gen_frame_mem (Pmode, r));
-  RTX_FRAME_RELATED_P (insn) = 1;
-  REG_NOTES (insn) = cfa_restores;
-}
-  else if (frame_pointer_needed)
+  if (frame_pointer_needed)
 {
   /* Restore the old stack pointer by copying from the frame
  pointer.  */
@@ -3767,6 +3757,10 @@ tilepro_expand_epilogue (bool sibcall_p)
 cfa_restores);
 }
 
+  if (crtl->calls_eh_return)
+emit_insn (gen_sp_adjust (stack_pointer_rtx, stack_pointer_rtx,
+ EH_RETURN_STACKADJ_RTX));
+
   /* Restore the old frame pointer.  */
   if (frame_pointer_needed)
 {

Re: [RFC]: Add support for pragma pointer_size

2012-03-07 Thread Joseph S. Myers

On Wed, 7 Mar 2012, Tristan Gingold wrote:

> On Mar 6, 2012, at 6:34 PM, Joseph S. Myers wrote:
>
> > On Tue, 6 Mar 2012, Tristan Gingold wrote:
> >
> >> The patch is simple: the C front-end will now call c_build_pointer_type
> >> (instead of build_pointer_type), which in turn calls
> >> build_pointer_type_for_mode using the right mode.
> >
> > There seem to be quite a lot of build_pointer_type calls in the C front
> > end (and in c-common.c) that you haven't changed.  Could you explain the
> > rule for when a call should or should not be changed, and how it applies
> > to all these calls?
>
> The global approach is to have the same effect as a default
> __attribute__((mode(SI/DImode))) on pointers declared by users so that
> layouts match.  That's why only grokdeclarator is changed.
>
> There might be bugs with this approach (e.g. it looks like
> c-common.c:handle_noreturn_attribute doesn't preserve the mode of the
> pointer to function), but my understanding is that they also correspond
> to bugs of __attribute__((mode ())) applied to pointers.  The latter one
> isn't well tested and one advantage of the VMS port is that it will test
> it much more (as there are many pragma pointer_size in VMS headers).

So those places would need to change to use build_pointer_type_for_mode as 
is done for composite types in c-typeck.c:composite_type, for example?

I think the patch at least needs a (VMS-specific?) testcase that tests the 
new pragma (presuming the testsuite can be run for VMS targets) even if 
some cases can't be tested until they are fixed.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH] Move strip_float_extensions to tree.c

2012-03-07 Thread Joseph S. Myers

On Wed, 7 Mar 2012, Richard Guenther wrote:

> Now, convert.c is used from all frontends to implement convert ()
> (that looks backwards - the language convert should be a langhook,
> called from convert implemented in convert.c).  But well, I aint
> not touching this beast ;)

I don't think convert () should be a langhook; it's all about 
language-specific semantics (the only non-front-end places calling it, 
outside of convert.c itself, now appear to be in arm.c).

I'm not sure of the extent to which the recursive calls to convert inside 
convert.c need any language-specific semantics.  To the extent that they 
do, I think front ends should be dealing with the semantics rather than 
having convert call back into the front end.  (I also think all the errors 
from convert.c should be given by front ends instead; front ends should 
only call the convert_to_* helpers for code they have checked is valid.)

-- 
Joseph S. Myers
jos...@codesourcery.com

[Ada] Remove call to unshare_expr from gigi

2012-03-07 Thread Eric Botcazou

There is a single call to unshare_expr in gigi and it is actually superfluous 
if you do things properly.

Tested on i586-suse-linux, applied on the mainline.


2012-03-07  Eric Botcazou  ebotca...@adacore.com

* gcc-interface/trans.c (Identifier_to_gnu): Don't unshare initializer.
(add_decl_expr): Mark external DECLs as used.
* gcc-interface/utils.c (convert) CONSTRUCTOR: Copy the vector.


-- 
Eric Botcazou
Index: gcc-interface/utils.c
===
--- gcc-interface/utils.c	(revision 184852)
+++ gcc-interface/utils.c	(working copy)
@@ -3894,6 +3894,8 @@ convert (tree type, tree expr)
 	{
 	  expr = copy_node (expr);
 	  TREE_TYPE (expr) = type;
+	  CONSTRUCTOR_ELTS (expr)
+	= VEC_copy (constructor_elt, gc, CONSTRUCTOR_ELTS (expr));
 	  return expr;
 	}
 
Index: gcc-interface/trans.c
===
--- gcc-interface/trans.c	(revision 184852)
+++ gcc-interface/trans.c	(working copy)
@@ -1102,11 +1102,9 @@ Identifier_to_gnu (Node_Id gnat_node, tr
 	  = lvalue_required_p (gnat_node, gnu_result_type, true,
 			   address_of_constant, Is_Aliased (gnat_temp));
 
-  /* ??? We need to unshare the initializer if the object is external
-	 as such objects are not marked for unsharing if we are not at the
-	 global level.  This should be fixed in add_decl_expr.  */
+  /* Finally retrieve the initializer if this is deemed valid.  */
   if ((constant_only && !address_of_constant) || !require_lvalue)
-	gnu_result = unshare_expr (DECL_INITIAL (gnu_result));
+	gnu_result = DECL_INITIAL (gnu_result);
 }
 
   /* The GNAT tree has the type of a function set to its result type, so we
@@ -7113,10 +7111,10 @@ add_decl_expr (tree gnu_decl, Entity_Id
 
   gnu_stmt = build1 (DECL_EXPR, void_type_node, gnu_decl);
 
-  /* If we are global, we don't want to actually output the DECL_EXPR for
- this decl since we already have evaluated the expressions in the
+  /* If we are external or global, we don't want to output the DECL_EXPR for
+ this DECL node since we already have evaluated the expressions in the
  sizes and positions as globals and doing it again would be wrong.  */
-  if (global_bindings_p ())
+  if (DECL_EXTERNAL (gnu_decl) || global_bindings_p ())
 {
   /* Mark everything as used to prevent node sharing with subprograms.
 	 Note that walk_tree knows how to deal with TYPE_DECL, but neither
@@ -7135,7 +7133,7 @@ add_decl_expr (tree gnu_decl, Entity_Id
 	&& !TYPE_FAT_POINTER_P (type))
 	MARK_VISITED (TYPE_ADA_SIZE (type));
 }
-  else if (!DECL_EXTERNAL (gnu_decl))
+  else
 add_stmt_with_node (gnu_stmt, gnat_entity);
 
   /* If this is a variable and an initializer is attached to it, it must be

Re: [PATCH 10/10] addr32: Add *zero_extendsidi2_x32.

2012-03-07 Thread Uros Bizjak

On Tue, Mar 6, 2012 at 8:44 PM, H.J. Lu hjl.to...@gmail.com wrote:

>>> This is the last patch for Pmode == SImode in x32.  In x32, the return value
>>> of the symbol address must be zero-extended to DImode.  This patch adds
>>> *zero_extendsidi2_x32 to load the address of a symbol in SImode and
>>> zero-extend it to DImode.  It works for x32 since the address size is 32 bits.
>>> OK for trunk?
>>
>> Can you please try attached patch instead? It enhances existing insn
>> pattern with required functionality.
>
> It works.  Thanks.

Committed to mainline with following ChangeLog:

2012-03-07  Uros Bizjak  ubiz...@gmail.com

* config/i386/predicates.md (x86_64_zext_general_operand): New.
* config/i386/i386.md (*zero_extendsidi2_rex64): Change operand 1
predicate to x86_64_zext_general_operand.  Accept Z constraint.

Tested on x86_64-pc-linux-gnu.

Uros.

[Ada] Remove support for type completion deferring in gigi

2012-03-07 Thread Eric Botcazou

It was needed for STABS, but is totally obsolete for DWARF.

Tested on i586-suse-linux, applied on the mainline.


2012-03-07  Eric Botcazou  ebotca...@adacore.com

* gcc-interface/gigi.h (rest_of_type_decl_compilation): Delete.
* gcc-interface/decl.c (defer_finalize_level): Likewise.
(defer_finalize_list): Likewise.
(gnat_to_gnu_entity): Delete references to above variables and do not
call rest_of_type_decl_compilation.
(rest_of_type_decl_compilation): Delete.
(rest_of_type_decl_compilation_no_defer): Likewise.
* gcc-interface/utils.c (rest_of_record_type_compilation): Do not call
rest_of_type_decl_compilation.
(create_type_decl): Likewise.
(update_pointer_to): Likewise.


-- 
Eric Botcazou
Index: gcc-interface/utils.c
===
--- gcc-interface/utils.c	(revision 185072)
+++ gcc-interface/utils.c	(working copy)
@@ -1056,15 +1056,8 @@ rest_of_record_type_compilation (tree re
   TYPE_FIELDS (new_record_type)
 	= nreverse (TYPE_FIELDS (new_record_type));
 
-  /* We used to explicitly invoke rest_of_type_decl_compilation on the
-	 parallel type for the sake of STABS.  We don't do it any more, so
-	 as to ensure that the parallel type be processed after the type
-	 by the debug back-end and, thus, prevent it from interfering with
-	 the processing of a recursive type.  */
   add_parallel_type (TYPE_STUB_DECL (record_type), new_record_type);
 }
-
-  rest_of_type_decl_compilation (TYPE_STUB_DECL (record_type));
 }
 
 /* Append PARALLEL_TYPE on the chain of parallel types for decl.  */
@@ -1354,21 +1347,10 @@ create_type_decl (tree type_name, tree t
   if (!named)
 TYPE_STUB_DECL (type) = type_decl;
 
-  /* Pass the type declaration to the debug back-end unless this is an
- UNCONSTRAINED_ARRAY_TYPE that the back-end does not support, or a
- type for which debugging information was not requested, or else an
- ENUMERAL_TYPE or RECORD_TYPE (except for fat pointers) which are
- handled separately.  And do not pass dummy types either.  */
+  /* Do not generate debug info for UNCONSTRAINED_ARRAY_TYPE that the
+ back-end doesn't support, and for others if we don't need to.  */
   if (code == UNCONSTRAINED_ARRAY_TYPE || !debug_info_p)
 DECL_IGNORED_P (type_decl) = 1;
-  else if (code != ENUMERAL_TYPE
-	&& (code != RECORD_TYPE || TYPE_FAT_POINTER_P (type))
-	&& !((code == POINTER_TYPE || code == REFERENCE_TYPE)
-		&& TYPE_IS_DUMMY_P (TREE_TYPE (type)))
-	&& !(code == RECORD_TYPE
-		&& TYPE_IS_DUMMY_P
-		   (TREE_TYPE (TREE_TYPE (TYPE_FIELDS (type))))))
-rest_of_type_decl_compilation (type_decl);
 
   return type_decl;
 }
@@ -3531,12 +3513,6 @@ update_pointer_to (tree old_type, tree n
 	  TREE_TYPE (TREE_OPERAND (TYPE_NULL_BOUNDS (t), 0)) = new_type;
 	  }
 
-  /* If we have adjusted named types, finalize them.  This is necessary
-	 since we had forced a DWARF typedef for them in gnat_pushdecl.  */
-  for (ptr = TYPE_POINTER_TO (old_type); ptr; ptr = TYPE_NEXT_PTR_TO (ptr))
-	if (TYPE_NAME (ptr) && TREE_CODE (TYPE_NAME (ptr)) == TYPE_DECL)
-	  rest_of_type_decl_compilation (TYPE_NAME (ptr));
-
   /* Chain REF and its variants at the end.  */
   new_ref = TYPE_REFERENCE_TO (new_type);
   if (new_ref)
Index: gcc-interface/decl.c
===
--- gcc-interface/decl.c	(revision 185072)
+++ gcc-interface/decl.c	(working copy)
@@ -97,11 +97,6 @@ static struct incomplete *defer_incomple
end of the spec.  */
 static struct incomplete *defer_limited_with;
 
-/* These variables are used to defer finalizing types.  The element of the
-   list is the TYPE_DECL associated with the type.  */
-static int defer_finalize_level = 0;
-static VEC (tree,heap) *defer_finalize_list;
-
 typedef struct subst_pair_d {
   tree discriminant;
   tree replacement;
@@ -181,7 +176,6 @@ static tree get_rep_part (tree);
 static tree create_variant_part_from (tree, VEC(variant_desc,heap) *, tree,
   tree, VEC(subst_pair,heap) *);
 static void copy_and_substitute_in_size (tree, tree, VEC(subst_pair,heap) *);
-static void rest_of_type_decl_compilation_no_defer (tree);
 
 /* The relevant constituents of a subprogram binding to a GCC builtin.  Used
to pass around calls performing profile compatibility checks.  */
@@ -3880,10 +3874,8 @@ gnat_to_gnu_entity (Entity_Id gnat_entit
 	   care of those situations.  */
 	if (defer_incomplete_level == 0 && !is_from_limited_with)
 	  {
-		defer_finalize_level++;
 		update_pointer_to (TYPE_MAIN_VARIANT (gnu_old_desig_type),
    gnat_to_gnu_type (gnat_desig_equiv));
-		defer_finalize_level--;
 	  }
 	else
 	  {
@@ -5112,11 +5104,7 @@ gnat_to_gnu_entity (Entity_Id gnat_entit
 	  /* Enumeration types have specific RM bounds.  */
 	  SET_TYPE_RM_MIN_VALUE (gnu_scalar_type, gnu_low_bound);
 	  SET_TYPE_RM_MAX_VALUE

[committed] TILEPro/TILE-Gx: rename internal atomic macros

2012-03-07 Thread Walter Lee


This patch renames some internal atomic macros to have a less generic
prefix.

Walter

   * config/tilepro/atomic.c: Rename atomic_ prefix to
   arch_atomic_.
   (atomic_xor): Rename and move definition to
   config/tilepro/atomic.h.
   (atomic_nand): Ditto.
   * config/tilepro/atomic.h: Rename atomic_ prefix to
   arch_atomic_.
   (arch_atomic_xor): Move from config/tilepro/atomic.c.
   (arch_atomic_nand): Ditto.
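
For context, the xor and nand helpers being moved are built on a compare-and-exchange update (the removed definitions in the diff below expand to __atomic_update_cmpxchg with __old ^ __value and ~(__old & __value)).  A rough portable sketch of that technique follows, using GCC's generic __atomic builtins instead of the TILEPro-internal macros; cas_fetch_xor and cas_fetch_nand are hypothetical names and this is not the code in atomic.h.

```c
#include <stdint.h>

/* Hypothetical helper: atomically replace *MEM with *MEM ^ MASK using a
   compare-and-exchange loop; returns the value seen before the update.  */
uint32_t
cas_fetch_xor (uint32_t *mem, uint32_t mask)
{
  uint32_t old = __atomic_load_n (mem, __ATOMIC_RELAXED);

  /* On failure the builtin stores the current contents of *MEM in OLD,
     so the new value is recomputed from it on the next iteration.  */
  while (!__atomic_compare_exchange_n (mem, &old, old ^ mask, 0,
                                       __ATOMIC_SEQ_CST, __ATOMIC_RELAXED))
    ;
  return old;
}

/* Same loop for nand: the new value is ~(old & MASK).  */
uint32_t
cas_fetch_nand (uint32_t *mem, uint32_t mask)
{
  uint32_t old = __atomic_load_n (mem, __ATOMIC_RELAXED);

  while (!__atomic_compare_exchange_n (mem, &old, ~(old & mask), 0,
                                       __ATOMIC_SEQ_CST, __ATOMIC_RELAXED))
    ;
  return old;
}
```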

diff --git a/libgcc/config/tilepro/atomic.c b/libgcc/config/tilepro/atomic.c
index cafbde8..bdf8098 100644
--- a/libgcc/config/tilepro/atomic.c
+++ b/libgcc/config/tilepro/atomic.c
@@ -63,18 +63,12 @@ post_atomic_barrier (int model)
 
 #define __unused __attribute__((unused))
 
-/* Provide additional methods not implemented by atomic.h. */
-#define atomic_xor(mem, mask) \
-  __atomic_update_cmpxchg(mem, mask, __old ^ __value)
-#define atomic_nand(mem, mask) \
-  __atomic_update_cmpxchg(mem, mask, ~(__old & __value))
-
 #define __atomic_fetch_and_do(type, size, opname)  \
 type   \
 __atomic_fetch_##opname##_##size(type* p, type i, int model)   \
 {  \
   pre_atomic_barrier(model);   \
-  type rv = atomic_##opname(p, i); \
+  type rv = arch_atomic_##opname(p, i);\
   post_atomic_barrier(model);  \
   return rv;   \
 }
@@ -96,7 +90,7 @@ type  \
 __atomic_##opname##_fetch_##size(type* p, type i, int model)   \
 {  \
   pre_atomic_barrier(model);   \
-  type rv = atomic_##opname(p, i) op i;\
+  type rv = arch_atomic_##opname(p, i) op i;   \
   post_atomic_barrier(model);  \
   return rv;   \
 }
@@ -120,7 +114,7 @@ __atomic_compare_exchange_##size(volatile type* ptr, type* oldvalp, \
 {  \
   type oldval = *oldvalp;  \
   pre_atomic_barrier(models);  \
-  type retval = atomic_val_compare_and_exchange(ptr, oldval, newval);  \
+  type retval = arch_atomic_val_compare_and_exchange(ptr, oldval, newval); \
   post_atomic_barrier(models); \
   bool success = (retval == oldval);   \
   *oldvalp = retval;   \
@@ -131,7 +125,7 @@ type  \
 __atomic_exchange_##size(volatile type* ptr, type val, int model)  \
 {  \
   pre_atomic_barrier(model);   \
-  type retval = atomic_exchange(ptr, val); \
+  type retval = arch_atomic_exchange(ptr, val);    \
   post_atomic_barrier(model);  \
   return retval;   \
 }
@@ -159,7 +153,7 @@ __atomic_compare_exchange_##size(volatile type* ptr, type* guess,   \
   type oldval = (oldword >> shift) & valmask;  \
   if (__builtin_expect((oldval == *guess), 1)) {   \
 unsigned int word = (oldword & bgmask) | ((val & valmask) << shift); \
-oldword = atomic_val_compare_and_exchange(p, oldword, word);   \
+oldword = arch_atomic_val_compare_and_exchange(p, oldword, word);  \
 oldval = (oldword >> shift) & valmask; \
   }\
   post_atomic_barrier(models); \
@@ -187,7 +181,7 @@ proto   \
 oldval = (oldword >> shift) & valmask; \
 val = expr; \
 unsigned int word = (oldword & bgmask) | ((val & valmask) << shift); \
-xword = atomic_val_compare_and_exchange(p, oldword, word);  \
+xword = arch_atomic_val_compare_and_exchange(p, oldword, word);\
   } while (__builtin_expect(xword != oldword, 0)); \
   bottom   \
 }
diff --git a/libgcc/config/tilepro/atomic.h b/libgcc/config/tilepro/atomic.h
index 16306fe..d49d13b 100644
--- a/libgcc/config/tilepro/atomic.h
+++ b/libgcc/config/tilepro/atomic.h
@@ -104,8 +104,8 @@
 
 /* 32-bit integer

Re: [committed] TILE-Gx/TILEPro: unwind fix for dynamic frames

2012-03-07 Thread Walter Lee


On 3/7/2012 1:01 PM, Walter Lee wrote:


> This patch fixes an unwinding bug for functions with dynamic stack
> frames.  We stop generating REG_CFA_* notes for stack pointer, and at
> the end of unwinding we restore the stack pointer by adjusting it by
> EH_RETURN_STACKADJ_RTX.


I forgot to attach the ChangeLog:

* config/tilegx/tilegx.c (tilegx_expand_prologue): Don't generate
REG_CFA_* notes for the stack pointer.
(tilegx_expand_epilogue): Restore stack pointer by adjusting it by
EH_RETURN_STACKADJ_RTX.
* config/tilepro/tilepro.c (tilepro_expand_prologue): Don't
generate REG_CFA_* notes for the stack pointer.
(tilepro_expand_epilogue): Restore stack pointer by adjusting it
by EH_RETURN_STACKADJ_RTX.

Walter

[Ada] Fix incomplete debug info in corner cases

2012-03-07 Thread Eric Botcazou

The previous patch has exposed a latent issue: we sometimes set some flags on a
DECL node using information that comes from another entity.

Tested on i586-suse-linux, applied on the mainline.


2012-03-07  Eric Botcazou  ebotca...@adacore.com

* gcc-interface/decl.c (gnat_to_gnu_entity): Do not set flags on the
DECL node built for a type which has a non-trivial equivalent type.


-- 
Eric Botcazou
Index: gcc-interface/decl.c
===
--- gcc-interface/decl.c	(revision 185075)
+++ gcc-interface/decl.c	(working copy)
@@ -5061,9 +5061,11 @@ gnat_to_gnu_entity (Entity_Id gnat_entit
 }
 
   /* If we really have a ..._DECL node, set a couple of flags on it.  But we
- cannot do so if we are reusing the ..._DECL node made for an alias or a
- renamed object as the predicates don't apply to it but to GNAT_ENTITY.  */
+ cannot do so if we are reusing the ..._DECL node made for an equivalent
+ type or an alias or a renamed object as the predicates don't apply to it
+ but to GNAT_ENTITY.  */
   if (DECL_P (gnu_decl)
+      && !(is_type && gnat_equiv_type != gnat_entity)
       && !Present (Alias (gnat_entity))
       && !(Present (Renamed_Object (gnat_entity)) && saved))
 {

Merge from trunk to gccgo branch

2012-03-07 Thread Ian Lance Taylor

I've merged trunk revision 185072 to the gccgo branch.

Ian

C++ PATCH to user-defined literal operator mangling

2012-03-07 Thread Jason Merrill

In c++/52521 it was pointed out that our mangling of user-defined 
literals was wrong.


Tested x86_64-pc-linux-gnu, applying to 4.7 and trunk.
commit 287cd9ecf4877db64774f5a29828081a62a53f5b
Author: Jason Merrill ja...@redhat.com
Date:   Wed Mar 7 14:01:56 2012 -0500

	PR c++/52521
	* mangle.c (write_literal_operator_name): The length comes after the
	operator prefix.

diff --git a/gcc/cp/mangle.c b/gcc/cp/mangle.c
index 1379e3b..5d6beb5 100644
--- a/gcc/cp/mangle.c
+++ b/gcc/cp/mangle.c
@@ -1292,18 +1292,16 @@ write_source_name (tree identifier)
 }
 
 /* Write a user-defined literal operator.
+  ::= li <source-name>	# "" <source-name>
IDENTIFIER is an LITERAL_IDENTIFIER_NODE.  */
 
 static void
 write_literal_operator_name (tree identifier)
 {
   const char* suffix = UDLIT_OP_SUFFIX (identifier);
-  char* buffer = XNEWVEC (char, strlen (UDLIT_OP_MANGLED_PREFIX)
-			  + strlen (suffix) + 10);
-  sprintf (buffer, UDLIT_OP_MANGLED_FORMAT, suffix);
-
-  write_unsigned_number (strlen (buffer));
-  write_identifier (buffer);
+  write_identifier (UDLIT_OP_MANGLED_PREFIX);
+  write_unsigned_number (strlen (suffix));
+  write_identifier (suffix);
 }
 
 /* Encode 0 as _, and 1+ as n-1_.  */
diff --git a/gcc/testsuite/g++.dg/cpp0x/udlit-mangle.C b/gcc/testsuite/g++.dg/cpp0x/udlit-mangle.C
new file mode 100644
index 000..6de31b6
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/udlit-mangle.C
@@ -0,0 +1,8 @@
+// PR c++/52521
+// { dg-options -std=c++0x }
+// { dg-final { scan-assembler _Zli2_wPKc } }
+
+int operator "" _w(const char*);
+int main() {
+  123_w;
+}
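
To make the corrected ordering concrete: the operator name is now the prefix, then the length of the suffix, then the suffix itself, so the _w operator in the test mangles to li2_w inside _Zli2_wPKc.  Below is a small standalone C sketch of that string construction; mangle_literal_operator is a hypothetical helper, not the code in mangle.c, and it assumes the prefix is the literal "li" seen in the test's expected symbol.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write the operator-name part of the mangling for a
   user-defined literal with the given SUFFIX into OUT (at most SIZE bytes),
   as prefix "li", then the suffix length, then the suffix.  Returns the
   number of characters in the full name, as snprintf does.  */
int
mangle_literal_operator (char *out, size_t size, const char *suffix)
{
  return snprintf (out, size, "li%zu%s", strlen (suffix), suffix);
}
```

Before the fix the length was written in front of the whole buffer, prefix included, which is what PR c++/52521 reported as wrong.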

Go patch committed: Don't crash if writing type functions too late

2012-03-07 Thread Ian Lance Taylor

This patch to the Go frontend avoids a crash if an error causes us to
write out type functions too late.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline.

Ian

diff -r 90c26bab1684 go/gogo.h
--- a/go/gogo.h	Wed Mar 07 08:04:32 2012 -0800
+++ b/go/gogo.h	Wed Mar 07 13:50:47 2012 -0800
@@ -398,6 +398,11 @@
   void
   write_specific_type_functions();
 
+  // Whether we are done writing out specific type functions.
+  bool
+  specific_type_functions_are_written() const
+  { return this->specific_type_functions_are_written_; }
+
   // Traverse the tree.  See the Traverse class.
   void
   traverse(Traverse*);
diff -r 90c26bab1684 go/types.cc
--- a/go/types.cc	Wed Mar 07 08:04:32 2012 -0800
+++ b/go/types.cc	Wed Mar 07 13:50:47 2012 -0800
@@ -1790,6 +1790,12 @@
 {
   Location bloc = Linemap::predeclared_location();
 
+  if (gogo->specific_type_functions_are_written())
+{
+  go_assert(saw_errors());
+  return;
+}
+
   Named_object* hash_fn = gogo->start_function(hash_name, hash_fntype, false,
 	   bloc);
   gogo->start_block(bloc);

Re: PATCH: Properly check mode for x86 call/jmp address

2012-03-07 Thread Uros Bizjak

On Wed, Mar 7, 2012 at 5:03 PM, H.J. Lu hjl.to...@gmail.com wrote:

  (define_insn *call
 -  [(call (mem:QI (match_operand:P 0 call_insn_operand czw))
 +  [(call (mem:QI (match_operand:C 0 call_insn_operand czw))
        (match_operand 1  ))]
 -  !SIBLING_CALL_P (insn)
 +  !SIBLING_CALL_P (insn)
 +    (GET_CODE (operands[0]) == SYMBOL_REF
 +       || GET_MODE (operands[0]) == word_mode)

 There are enough copies of this extra constraint that I wonder
 if it simply ought to be folded into call_insn_operand.

 Which would need to be changed to define_special_predicate,
 since you'd be doing your own mode checking.

 Probably similar changes to sibcall_insn_operand.

 Here is the updated patch.  I changed constant_call_address_operand
 and call_register_no_elim_operand to use define_special_predicate.
 OK for trunk?

 Please do not complicate matters that much. Just stick word_mode
 overrides for register operands in predicates.md, like in attached
 patch. These changed predicates now allow registers only in word_mode
 (and VOIDmode).

 You can now remove all new mode iterators and leave call patterns untouched.

 @@ -22940,14 +22940,18 @@ ix86_expand_call (rtx retval, rtx fnaddr,
 rtx callarg1,
        GET_CODE (XEXP (fnaddr, 0)) == SYMBOL_REF
        !local_symbolic_operand (XEXP (fnaddr, 0), VOIDmode))
     fnaddr = gen_rtx_MEM (QImode, construct_plt_address (XEXP (fnaddr, 0)));
 -  else if (sibcall
 -          ? !sibcall_insn_operand (XEXP (fnaddr, 0), Pmode)
 -          : !call_insn_operand (XEXP (fnaddr, 0), Pmode))
 +  else if (!(constant_call_address_operand (XEXP (fnaddr, 0), Pmode)
 +            || call_register_no_elim_operand (XEXP (fnaddr, 0),
 +                                              word_mode)
 +            || (!sibcall
 +                 !TARGET_X32
 +                 memory_operand (XEXP (fnaddr, 0), word_mode
     {
       fnaddr = XEXP (fnaddr, 0);
 -      if (GET_MODE (fnaddr) != Pmode)
 -       fnaddr = convert_to_mode (Pmode, fnaddr, 1);
 -      fnaddr = gen_rtx_MEM (QImode, copy_to_mode_reg (Pmode, fnaddr));
 +      if (GET_MODE (fnaddr) != word_mode)
 +       fnaddr = convert_to_mode (word_mode, fnaddr, 1);
 +      fnaddr = gen_rtx_MEM (QImode,
 +                           copy_to_mode_reg (word_mode, fnaddr));
     }

   vec_len = 0;

 Please update the above part. It looks you don't even have to change
 condition with new predicates. Basically, you should only convert the
 address to word_mode instead of Pmode.

 +  if (TARGET_X32)
 +    operands[0] = convert_memory_address (word_mode, operands[0]);

 This addition to indirect_jump and tablejump should be the only
 change, needed in i386.md now. Please write the condition

 if (Pmode != word_mode)

 for consistency.

 BTW: The attached patch was bootstrapped and regression tested on
 x86_64-pc-linux-gnu {,-m32}.

 Uros.

 It doesn't work:

 x.i:7:1: error: unrecognizable insn:
 (call_insn/j 8 7 9 3 (call (mem:QI (reg:DI 62) [0 *foo.0_1 S1 A8])
        (const_int 0 [0])) x.i:6 -1
     (nil)
    (nil))
 x.i:7:1: internal compiler error: in extract_insn, at recog.c:2123
 Please submit a full bug report,
 with preprocessed source if appropriate.
 See http://gcc.gnu.org/bugs.html for instructions.
 make: *** [x.s] Error 1

 I will investigate it.

For reference, attached is the complete patch that uses
define_special_predicate. This patch works OK with the current
mainline, with additional patch to i386.h, where

Index: i386.h
===
--- i386.h  (revision 185079)
+++ i386.h  (working copy)
@@ -1744,7 +1744,7 @@
 /* Specify the machine mode that pointers have.
After generation of rtl, the compiler makes no further distinction
between pointers and any other objects of this machine mode.  */
-#define Pmode (TARGET_64BIT ? DImode : SImode)
+#define Pmode (TARGET_LP64 ? DImode : SImode)

 /* A C expression whose value is zero if pointers that need to be extended
from being `POINTER_SIZE' bits wide to `Pmode' are sign-extended and

Uros.
Index: i386.c
===
--- i386.c  (revision 185058)
+++ i386.c  (working copy)
@@ -22952,9 +22952,9 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx call
   : !call_insn_operand (XEXP (fnaddr, 0), Pmode))
 {
   fnaddr = XEXP (fnaddr, 0);
-  if (GET_MODE (fnaddr) != Pmode)
-   fnaddr = convert_to_mode (Pmode, fnaddr, 1);
-  fnaddr = gen_rtx_MEM (QImode, copy_to_mode_reg (Pmode, fnaddr));
+  if (GET_MODE (fnaddr) != word_mode)
+   fnaddr = convert_to_mode (word_mode, fnaddr, 1);
+  fnaddr = gen_rtx_MEM (QImode, copy_to_mode_reg (word_mode, fnaddr));
 }
 
   vec_len = 0;
Index: i386.md
===
--- i386.md (revision 185073)
+++ i386.md (working copy)
@@ -11078,7 +11078,12 @@
(set_attr modrm 0)])
 
 (define_expand indirect_jump
-  [(set (pc)

[patch committed SH] Fix target/52503

2012-03-07 Thread Kaz Kojima

Hi,

The attached patch is to fix PR target/52503 which is a build failure
for sh-wrs-vxworks.  We've defined too many target option masks.
Tested with usual tests on sh4-unknown-linux-gnu and with a cc1-only
build on sh-wrs-vxworks.  Applied on trunk.

Regards,
kaz
--
2012-03-07  Oleg Endo  olege...@gcc.gnu.org
Kaz Kojima  kkoj...@gcc.gnu.org

PR target/52503
* config/sh/sh.opt (msoft-atomic): Use Var instead of Mask.
* config/sh/linux.h (TARGET_DEFAULT): Remove MASK_SOFT_ATOMIC.
(SUBTARGET_OVERRIDE_OPTIONS): Define.

diff -upr ORIG/trunk/gcc/config/sh/linux.h trunk/gcc/config/sh/linux.h
--- ORIG/trunk/gcc/config/sh/linux.h2011-12-05 10:04:44.0 +0900
+++ trunk/gcc/config/sh/linux.h 2012-03-07 13:54:42.0 +0900
@@ -1,5 +1,6 @@
 /* Definitions for SH running Linux-based GNU systems using ELF
-   Copyright (C) 1999, 2000, 2002, 2003, 2004, 2005, 2006, 2007, 2010, 2011
+   Copyright (C) 1999, 2000, 2002, 2003, 2004, 2005, 2006, 2007, 2010, 2011,
+   2012
Free Software Foundation, Inc.
Contributed by Kazumoto Kojima kkoj...@rr.iij4u.or.jp
 
@@ -41,7 +42,7 @@ along with GCC; see the file COPYING3.  
 #undef TARGET_DEFAULT
 #define TARGET_DEFAULT \
   (TARGET_CPU_DEFAULT | MASK_USERMODE | TARGET_ENDIAN_DEFAULT \
-   | TARGET_OPT_DEFAULT | MASK_SOFT_ATOMIC)
+   | TARGET_OPT_DEFAULT)
 
 #define TARGET_ASM_FILE_END file_end_indicate_exec_stack
 
@@ -135,3 +136,13 @@ along with GCC; see the file COPYING3.  
 /* Install the __sync libcalls.  */
 #undef TARGET_INIT_LIBFUNCS
 #define TARGET_INIT_LIBFUNCS  sh_init_sync_libfuncs
+
+#undef SUBTARGET_OVERRIDE_OPTIONS
+#define SUBTARGET_OVERRIDE_OPTIONS \
+  do   \
+{  \
+  /* Defaulting to -msoft-atomic.  */  \
+  if (global_options_set.x_TARGET_SOFT_ATOMIC == 0)\
+   TARGET_SOFT_ATOMIC = 1; \
+}  \
+  while (0)
diff -upr ORIG/trunk/gcc/config/sh/sh.opt trunk/gcc/config/sh/sh.opt
--- ORIG/trunk/gcc/config/sh/sh.opt 2012-03-06 10:28:32.0 +0900
+++ trunk/gcc/config/sh/sh.opt  2012-03-07 07:13:58.0 +0900
@@ -320,7 +320,7 @@ Target Mask(HITACHI) MaskExists
 Follow Renesas (formerly Hitachi) / SuperH calling conventions
 
 msoft-atomic
-Target Report Mask(SOFT_ATOMIC)
+Target Report Var(TARGET_SOFT_ATOMIC)
 Use software atomic sequences supported by kernel
 
 menable-tas
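
The effect of the new SUBTARGET_OVERRIDE_OPTIONS can be sketched outside the compiler as a "default unless the user said otherwise" check. This is a minimal model under stated assumptions: struct soft_atomic_opt, set_from_command_line and override_options are hypothetical names, with value standing in for TARGET_SOFT_ATOMIC and explicitly_set for global_options_set.x_TARGET_SOFT_ATOMIC.

```c
#include <stdbool.h>

/* Hypothetical miniature of the option state.  */
struct soft_atomic_opt
{
  int value;            /* plays the role of TARGET_SOFT_ATOMIC */
  bool explicitly_set;  /* plays the role of the global_options_set flag */
};

/* Record an explicit -msoft-atomic or -mno-soft-atomic.  */
void
set_from_command_line (struct soft_atomic_opt *opt, int value)
{
  opt->value = value;
  opt->explicitly_set = true;
}

/* Mirror of the override logic: default to 1 only when the user
   did not pass the option either way.  */
void
override_options (struct soft_atomic_opt *opt)
{
  if (!opt->explicitly_set)
    opt->value = 1;
}
```

Because the default is applied in the override hook rather than through TARGET_DEFAULT, the option no longer needs a bit in the target mask.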

Re: Support for getting CPU type and feature information at run-time (issue5715051)

2012-03-07 Thread Sriraman Tallam

I committed this patch to google/main.
I have created a new patch for review for trunk here :
http://gcc.gnu.org/ml/gcc-patches/2012-03/msg00458.html

Thanks,
-Sri.

On Thu, Mar 1, 2012 at 2:08 PM, Sriraman Tallam tmsri...@google.com wrote:
 Removing [google] prefix from the subject line.

 On Thu, Mar 1, 2012 at 12:54 PM, Xinliang David Li davi...@google.com wrote:
 Sri, probably need to remove the [google] prefix in the subject line
 to prevent this from being filtered.

 David

 On Thu, Mar 1, 2012 at 12:45 PM, Sriraman Tallam tmsri...@google.com wrote:
 Patch to add builtins to detect CPU type:
 

 I have ported the patch from google/gcc-4_6 to google/main.  I also want this
 patch to be considered for trunk.  Please see this discussion:
 http://gcc.gnu.org/ml/gcc-patches/2011-08/msg01355.html
 from when this patch was last reviewed.

 One of the main concerns was about making CPU detection initialization a
 constructor. The main point raised was about constructor ordering. I have now
 added a priority value of 101 to the CPU detection constructor, making it very
 high priority, so that it is guaranteed to fire before every constructor
 without an explicitly marked priority value.  However, IFUNC initializers
 will still fire before this constructor, so the CPU initialization routine
 has to be called explicitly in such initializers, for which I have added a
 builtin: __builtin_cpu_init ().

 I would like to reopen discussions on this to make it suitable for trunk
 this time around.
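
The constructor-ordering point can be illustrated with GCC's constructor attribute on its own. This is a small sketch, not part of the patch: the function and variable names are invented, and it only shows that a constructor with priority 101 runs before one with no explicit priority.

```c
/* Records the order in which the constructors below run.  */
static int ctor_order[2];
static int ctor_count;

/* Priority 101 is the highest priority available to user code
   (0 to 100 are reserved for the implementation).  */
__attribute__ ((constructor (101)))
static void
early_init (void)
{
  ctor_order[ctor_count++] = 101;
}

/* No explicit priority: runs after all prioritized constructors.  */
__attribute__ ((constructor))
static void
default_init (void)
{
  ctor_order[ctor_count++] = 0;
}
```

By the time main starts, ctor_order holds 101 followed by 0. An IFUNC resolver can run before either constructor, which is the gap an explicit initialization call is meant to cover.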

 This patch adds the following new builtins:

 __builtin_cpu_init
 __builtin_cpu_supports_cmov
 __builtin_cpu_supports_mmx
 __builtin_cpu_supports_popcount
 __builtin_cpu_supports_sse
 __builtin_cpu_supports_sse2
 __builtin_cpu_supports_sse3
 __builtin_cpu_supports_ssse3
 __builtin_cpu_supports_sse4_1
 __builtin_cpu_supports_sse4_2
 __builtin_cpu_is_amd
 __builtin_cpu_is_intel_atom
 __builtin_cpu_is_intel_core2
 __builtin_cpu_is_intel
 __builtin_cpu_is_intel_corei7
 __builtin_cpu_is_intel_corei7_nehalem
 __builtin_cpu_is_intel_corei7_westmere
 __builtin_cpu_is_intel_corei7_sandybridge
 __builtin_cpu_is_amdfam10
 __builtin_cpu_is_amdfam10_barcelona
 __builtin_cpu_is_amdfam10_shanghai
 __builtin_cpu_is_amdfam10_istanbul
 __builtin_cpu_is_amdfam15_bdver1
 __builtin_cpu_is_amdfam15_bdver2


         * config/i386/i386.c (build_struct_with_one_bit_fields): New function.
        (make_var_decl): New function.
        (get_field_from_struct): New function.
        (fold_builtin_target): New function.
        (ix86_fold_builtin): New function.
        (ix86_expand_builtin): Expand new builtins by folding them.
        (make_platform_builtin): New functions.
        (ix86_init_platform_type_builtins): Make the new builtins.
        (ix86_init_builtins): Make new builtins to detect CPU type.
        (TARGET_FOLD_BUILTIN): New macro.
        (IX86_BUILTIN_CPU_SUPPORTS_CMOV): New enum value.
        (IX86_BUILTIN_CPU_SUPPORTS_MMX): New enum value.
        (IX86_BUILTIN_CPU_SUPPORTS_POPCOUNT): New enum value.
        (IX86_BUILTIN_CPU_SUPPORTS_SSE): New enum value.
        (IX86_BUILTIN_CPU_SUPPORTS_SSE2): New enum value.
        (IX86_BUILTIN_CPU_SUPPORTS_SSE3): New enum value.
        (IX86_BUILTIN_CPU_SUPPORTS_SSSE3): New enum value.
        (IX86_BUILTIN_CPU_SUPPORTS_SSE4_1): New enum value.
        (IX86_BUILTIN_CPU_SUPPORTS_SSE4_2): New enum value.
        (IX86_BUILTIN_CPU_INIT): New enum value.
        (IX86_BUILTIN_CPU_IS_AMD): New enum value.
        (IX86_BUILTIN_CPU_IS_INTEL): New enum value.
        (IX86_BUILTIN_CPU_IS_INTEL_ATOM): New enum value.
        (IX86_BUILTIN_CPU_IS_INTEL_CORE2): New enum value.
        (IX86_BUILTIN_CPU_IS_INTEL_COREI7_NEHALEM): New enum value.
        (IX86_BUILTIN_CPU_IS_INTEL_COREI7_WESTMERE): New enum value.
        (IX86_BUILTIN_CPU_IS_INTEL_COREI7_SANDYBRIDGE): New enum value.
        (IX86_BUILTIN_CPU_IS_AMDFAM10_BARCELONA): New enum value.
        (IX86_BUILTIN_CPU_IS_AMDFAM10_SHANGHAI): New enum value.
        (IX86_BUILTIN_CPU_IS_AMDFAM10_ISTANBUL): New enum value.
        (IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER1): New enum value.
        (IX86_BUILTIN_CPU_IS_AMDFAM15H_BDVER2): New enum value.
        * config/i386/i386-builtin-types.def: New function type.
        * testsuite/gcc.target/builtin_target.c: New testcase.

        * libgcc/config/i386/i386-cpuinfo.c: New file.
        * libgcc/config/i386/t-cpuinfo: New file.
        * libgcc/config.host: Include t-cpuinfo.
        * libgcc/config/i386/libgcc-glibc.ver: Version symbols __cpu_model
        and __cpu_features.



 Index: libgcc/config.host
 ===
 --- libgcc/config.host  (revision 184644)
 +++ libgcc/config.host  (working copy)
 @@ -1128,7 +1128,7 @@ i[34567]86-*-linux* | x86_64-*-linux* | \
   i[34567]86-*-kfreebsd*-gnu | x86_64-*-kfreebsd*-gnu | \
   i[34567]86-*-knetbsd*-gnu | \
   i[34567]86-*-gnu*)
 -

RE: [Patch,AVR]: Fix PR52496: Add memory barriers to built-ins

2012-03-07 Thread Weddington, Eric



 -Original Message-
 From: Georg-Johann Lay [mailto:a...@gjlay.de]
 Sent: Wednesday, March 07, 2012 5:40 AM
 To: gcc-patches@gcc.gnu.org
 Cc: Denis Chertykov; Weddington, Eric
 Subject: [Patch,AVR]: Fix PR52496: Add memory barriers to built-ins
 
 This patch adds memory barriers to
 
 __builtin_avr_nop
 __builtin_avr_sei
 __builtin_avr_cli
 __builtin_avr_wdr
 __builtin_avr_sleep
 __builtin_avr_delay_cycles
 
 so that their code cannot be dragged over memory accesses.
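
For readers unfamiliar with the term, a compiler-level memory barrier in GNU C is commonly written as an empty asm with a "memory" clobber. The sketch below shows that idiom in user code; it is not how the patch implements the barriers inside the AVR backend, and COMPILER_BARRIER, shared_flag and store_then_reload are names made up here.

```c
/* Emits no instruction, but the "memory" clobber tells the compiler
   that memory may have changed, so it must not move loads or stores
   across this point or keep memory values cached in registers.  */
#define COMPILER_BARRIER() __asm__ __volatile__ ("" ::: "memory")

int shared_flag;

/* The store is completed before the barrier and the load is
   performed again after it.  */
int
store_then_reload (int v)
{
  shared_flag = v;
  COMPILER_BARRIER ();
  return shared_flag;
}
```

A built-in such as sei or cli that lacks this property could have a surrounding memory access scheduled across it, which is the problem PR52496 describes.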
 
 Ok for trunk?
 

Please commit.

Eric

cp-demangle PATCH for user-defined literal operator demangling

2012-03-07 Thread Jason Merrill

The discussion of 52521 also noted that the demangler doesn't support 
literal operators yet.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit ee622f8167f66b6ae9fff1526688a2dfdd84a8be
Author: Jason Merrill ja...@redhat.com
Date:   Wed Mar 7 17:50:19 2012 -0500

	* cp-demangle.c (cplus_demangle_operators): Add li.
	(d_unqualified_name): Handle it specially.

diff --git a/libiberty/cp-demangle.c b/libiberty/cp-demangle.c
index 18b84a1..2b3d182 100644
--- a/libiberty/cp-demangle.c
+++ b/libiberty/cp-demangle.c
@@ -1419,7 +1419,12 @@ d_unqualified_name (struct d_info *di)
 
   ret = d_operator_name (di);
   if (ret != NULL && ret->type == DEMANGLE_COMPONENT_OPERATOR)
-	di->expansion += sizeof "operator" + ret->u.s_operator.op->len - 2;
+	{
+	  di->expansion += sizeof "operator" + ret->u.s_operator.op->len - 2;
+	  if (!strcmp (ret->u.s_operator.op->code, "li"))
+	ret = d_make_comp (di, DEMANGLE_COMPONENT_UNARY, ret,
+			   d_source_name (di));
+	}
   return ret;
 }
   else if (peek == 'C' || peek == 'D')
@@ -1596,6 +1601,7 @@ const struct demangle_operator_info cplus_demangle_operators[] =
   { "ix", NL ("[]"),2 },
   { "lS", NL ("<<="),   2 },
   { "le", NL ("<="),2 },
+  { "li", NL ("operator\"\" "), 1 },
   { "ls", NL ("<<"),2 },
   { "lt", NL ("<"), 2 },
   { "mI", NL ("-="),2 },
diff --git a/libiberty/testsuite/demangle-expected b/libiberty/testsuite/demangle-expected
index 408c4f4..036c481 100644
--- a/libiberty/testsuite/demangle-expected
+++ b/libiberty/testsuite/demangle-expected
@@ -4072,6 +4072,9 @@ decltype (g({parm#1}, {})) f1<int>(int)
 --format=gnu-v3
 _Z2f1IiEDTnw_T_ilEES0_
 decltype (new int{}) f1<int>(int)
+--format=gnu-v3
+_Zli2_wPKc
+operator"" _w(char const*)
 #
 # Ada (GNAT) tests.
 #
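
The new test case shows the shape of the encoding: after the _Z prefix comes the operator code li, followed by an ordinary source name (a decimal length and that many characters) giving the literal suffix. The toy parser below pulls the suffix out of such a name. It is only an illustration of the encoding and does not use libiberty; literal_operator_suffix is a hypothetical helper, and it handles nothing beyond this one unnested form.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Given a mangled name of the form _Zli<len><suffix>..., copy the
   literal suffix into OUT.  Returns 0 on success, -1 if the name does
   not start with a literal operator or OUT is too small.  */
int
literal_operator_suffix (const char *mangled, char *out, size_t out_size)
{
  size_t len = 0;
  const char *p;

  if (strncmp (mangled, "_Zli", 4) != 0)
    return -1;
  p = mangled + 4;

  /* <source-name> is a decimal length followed by the identifier.  */
  if (!isdigit ((unsigned char) *p))
    return -1;
  while (isdigit ((unsigned char) *p))
    len = len * 10 + (size_t) (*p++ - '0');

  if (strlen (p) < len || len + 1 > out_size)
    return -1;
  memcpy (out, p, len);
  out[len] = '\0';
  return 0;
}
```

For _Zli2_wPKc this yields the suffix _w, and the remaining PKc is the parameter type char const*.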

Re: [Patch libffi, Darwin, PPC64] PR29152 (Darwin64 implementation), PR42387 (ppc va faults).

2012-03-07 Thread David Edelsohn

IainS,

Your patch completely broke libffi on AIX and your changes were not
Darwin-specific, despite Mike Stump's comment with his approval.

AIX shares ffi_darwin.c, as one should be able to see from all of the
AIX comments in the file.

aix_closure.S expects ffi_closure_helper_DARWIN to return the type, as
it previously did.  So the following change completely breaks FFI on
AIX:

   /* Tell ffi_closure_ASM to perform return type promotions.  */
-  return cif->rtype->type;
+  return cif->rtype;

- David

Re: fix libstdc++/52433

2012-03-07 Thread Jonathan Wakely

On 4 March 2012 12:56, Jonathan Wakely wrote:
        PR libstdc++/52433
        * include/debug/safe_iterator.h (_Safe_iterator): Add move
        constructor and move assignment operator.
        * testsuite/23_containers/vector/debug/52433.cc: New.

 Tested 'make check check-debug' on x86_64 and committed to trunk.  I
 plan to fix this for 4.7.1 and 4.6.4 as well

This restores the debug mode checks when moving singular iterators.

Tested x86_64-linux, committed to trunk.
commit 9ada43f026087d440ed6e70d007b51c497d4b790
Author: Jonathan Wakely jwakely@gmail.com
Date:   Wed Mar 7 01:24:45 2012 +

PR libstdc++/52433
* include/debug/safe_iterator.h (_Safe_iterator): Add debug checks
to move constructor and move assignment operator.

diff --git a/libstdc++-v3/include/debug/safe_iterator.h 
b/libstdc++-v3/include/debug/safe_iterator.h
index 65dff55..6bb3cd2 100644
--- a/libstdc++-v3/include/debug/safe_iterator.h
+++ b/libstdc++-v3/include/debug/safe_iterator.h
@@ -1,6 +1,6 @@
 // Safe iterator implementation  -*- C++ -*-
 
-// Copyright (C) 2003, 2004, 2005, 2006, 2009, 2010, 2011
+// Copyright (C) 2003, 2004, 2005, 2006, 2009, 2010, 2011, 2012
 // Free Software Foundation, Inc.
 //
 // This file is part of the GNU ISO C++ Library.  This library is free
@@ -176,6 +176,11 @@ namespace __gnu_debug
*/
   _Safe_iterator(_Safe_iterator&& __x) : _M_current()
   {
+   _GLIBCXX_DEBUG_VERIFY(!__x._M_singular()
+ || __x._M_current == _Iterator(),
+ _M_message(__msg_init_copy_singular)
+ ._M_iterator(*this, "this")
+ ._M_iterator(__x, "other"));
std::swap(_M_current, __x._M_current);
this->_M_attach(__x._M_sequence);
__x._M_detach();
@@ -229,6 +234,11 @@ namespace __gnu_debug
   _Safe_iterator&
   operator=(_Safe_iterator&& __x)
   {
+   _GLIBCXX_DEBUG_VERIFY(!__x._M_singular()
+ || __x._M_current == _Iterator(),
+ _M_message(__msg_copy_singular)
+ ._M_iterator(*this, "this")
+ ._M_iterator(__x, "other"));
_M_current = __x._M_current;
_M_attach(__x._M_sequence);
__x._M_detach();

Go patch committed: Don't crash on array assignment

2012-03-07 Thread Ian Lance Taylor

The Go frontend crashed doing an array assignment of two values of type
[2][]int.  This patch fixes it.  Normal array assignments worked because
gcc uses the same structure for all arrays with the same length and
element type.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline.

Ian

diff -r 3a8aa04c3a9a go/expressions.cc
--- a/go/expressions.cc	Wed Mar 07 13:51:48 2012 -0800
+++ b/go/expressions.cc	Wed Mar 07 17:36:41 2012 -0800
@@ -284,8 +284,10 @@
 	   || SCALAR_FLOAT_TYPE_P(lhs_type_tree)
 	   || COMPLEX_FLOAT_TYPE_P(lhs_type_tree))
 return fold_convert_loc(location.gcc_location(), lhs_type_tree, rhs_tree);
-  else if (TREE_CODE(lhs_type_tree) == RECORD_TYPE
-	&& TREE_CODE(TREE_TYPE(rhs_tree)) == RECORD_TYPE)
+  else if ((TREE_CODE(lhs_type_tree) == RECORD_TYPE
+	 && TREE_CODE(TREE_TYPE(rhs_tree)) == RECORD_TYPE)
+	   || (TREE_CODE(lhs_type_tree) == ARRAY_TYPE
+	&& TREE_CODE(TREE_TYPE(rhs_tree)) == ARRAY_TYPE))
 {
   // This conversion must be permitted by Go, or we wouldn't have
   // gotten here.

Re: [Patch libffi, Darwin, PPC64] PR29152 (Darwin64 implementation), PR42387 (ppc va faults).

2012-03-07 Thread David Edelsohn

This patch applies the same logic to aix_closure.S that was modified
in darwin_closure.S, returning from 100% failure to:

=== libffi Summary ===

# of expected passes		1638
# of unexpected failures	13
# of unsupported tests		55

* src/powerpc/aix_closure.S (ffi_closure_ASM): Load type
from ffi_type return type.
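
The offsets 10 and 6 in the lhz instructions below follow from the layout of ffi_type: a size_t, then two unsigned shorts (alignment and type), then a pointer. The sketch checks that arithmetic with a local mirror of the struct; struct ffi_type_mirror and type_field_offset are names invented here, and the real definition lives in ffi.h.

```c
#include <stddef.h>

/* Local mirror of libffi's ffi_type layout, for illustration only.  */
struct ffi_type_mirror
{
  size_t size;
  unsigned short alignment;
  unsigned short type;
  struct ffi_type_mirror **elements;
};

/* Byte offset of the type field: sizeof (size_t) + 2, i.e. 10 when
   size_t is 8 bytes and 6 when it is 4 bytes.  */
size_t
type_field_offset (void)
{
  return offsetof (struct ffi_type_mirror, type);
}
```

Since the helper now returns the ffi_type pointer instead of the type code, the assembly has to load the halfword at that offset before indexing the jump table.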

diff --git a/src/powerpc/aix_closure.S b/src/powerpc/aix_closure.S
index c906017..7c319a3 100644
--- a/src/powerpc/aix_closure.S
+++ b/src/powerpc/aix_closure.S
@@ -168,6 +168,7 @@ ffi_closure_ASM:
/* look up the proper starting point in table  */
/* by using return type as offset */
ld  r4, LC..60(2)   /* get address of jump table */
+   lhz r3, 10(r3)
sldi r3, r3, 4   /* now multiply return type by 16 */
ld  r0, 240+16(r1)  /* load return address */
add r3, r3, r4  /* add contents of table to table address */
@@ -340,6 +341,7 @@ L..finish:
/* look up the proper starting point in table  */
/* by using return type as offset */
lwz r4, LC..60(2)   /* get address of jump table */
+   lhz r3, 6(r3)
slwi r3, r3, 4   /* now multiply return type by 4 */
lwz r0, 176+8(r1)   /* load return address */
add r3, r3, r4  /* add contents of table to table address */

[PATCH] Make powerpc honor PROCESSOR_DEFAULT{,64} in tm*.h files

2012-03-07 Thread Michael Meissner

David discovered that there was a thinko in the logic of the powerpc setting
for the default tuning option.  In theory, the compiler was supposed to use
PROCESSOR_DEFAULT for 32-bit targets and PROCESSOR_DEFAULT64 for 64-bit targets
if no cpu or tuning option was used in configuring the compiler.  The current
linux64.h and aix61.h files set this to be power7.  However, the code actually
set the tuning to the default cpu used for code generation (powerpc and
powerpc64).

This patch for both 4.7 and 4.8 fixes the code so that the tm.h can set the
default as intended.  The person configuring the compiler can still use
--with-cpu=xxx, --with-tune=xxx, --with-cpu-32=xxx, --with-tune-32=xxx,
--with-cpu-64=xxx and --with-tune-64= to adjust the defaults to meet
local conditions.
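
The intended selection order for the tuning index can be sketched as a small standalone function. This is a simplified model of the logic in the patch, not the rs6000.c code itself: the enum, the three-entry table and select_tune_index are stand-ins invented for the example.

```c
#include <stddef.h>

enum processor { PROC_POWERPC, PROC_POWERPC64, PROC_POWER7 };

struct cpu_entry
{
  const char *name;
  enum processor processor;
};

/* Stand-in for processor_target_table.  */
static const struct cpu_entry cpu_table[] = {
  { "powerpc", PROC_POWERPC },
  { "powerpc64", PROC_POWERPC64 },
  { "power7", PROC_POWER7 },
};

/* 1. An explicit tuning option wins.
   2. Otherwise an explicitly chosen cpu is also used for tuning.
   3. Otherwise look up the tm.h default (PROCESSOR_DEFAULT or
      PROCESSOR_DEFAULT64 in the real code) in the table.  */
int
select_tune_index (int tune_index, int cpu_index, int have_cpu,
                   enum processor default_tune)
{
  size_t i;

  if (tune_index >= 0)
    return tune_index;
  if (have_cpu)
    return cpu_index;
  for (i = 0; i < sizeof cpu_table / sizeof cpu_table[0]; i++)
    if (cpu_table[i].processor == default_tune)
      return (int) i;
  return -1;
}
```

The bug described above corresponds to skipping step 3: with a defaulted cpu, the tuning simply followed the code-generation default instead of the tm.h default.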

I have bootstrapped and run make check with and without the patches.  There are
no regressions with the patch, and one test now succeeds if power7 tuning is
used (64-bit gcc.dg/tree-prof/bb-reorg.c).  Is this ok to install?  David and I
think this is important to get into 4.7 rather than 4.7.1.

2012-03-07  Michael Meissner  meiss...@the-meissners.org

* config/rs6000/linux64.h (OPTION_TARGET_CPU_DEFAULT): Do not
redefine to be NULL if the current bit-size is different from the
configured bit-size.

* config/rs6000/rs6000.c (rs6000_option_override_internal): If the
cpu is defaulted, use PROCESSOR_DEFAULT and PROCESSOR_DEFAULT64 to
set the default tuning.  Add asserts to make sure the cpu and tune
indexes are defined.
(rs6000_file_start): Do not reset the default cpu if the current
bit-size is different from the configured bit-size.

-- 
Michael Meissner, IBM
5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA
meiss...@linux.vnet.ibm.com fax +1 (978) 399-6899
Index: gcc/config/rs6000/linux64.h
===
--- gcc/config/rs6000/linux64.h (revision 185071)
+++ gcc/config/rs6000/linux64.h (working copy)
@@ -159,15 +159,6 @@ extern int dot_symbols;
 }  \
   while (0)
 
-#ifdef RS6000_BI_ARCH
-
-#undef OPTION_TARGET_CPU_DEFAULT
-#define	OPTION_TARGET_CPU_DEFAULT \
-  (((TARGET_DEFAULT ^ target_flags) & MASK_64BIT) \
-   ? (char *) 0 : TARGET_CPU_DEFAULT)
-
-#endif
-
 #undef ASM_DEFAULT_SPEC
 #undef ASM_SPEC
 #undef LINK_OS_LINUX_SPEC
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 185071)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -2596,6 +2596,7 @@ static bool
 rs6000_option_override_internal (bool global_init_p)
 {
   bool ret = true;
+  bool have_cpu = false;
   const char *default_cpu = OPTION_TARGET_CPU_DEFAULT;
   int set_masks;
   int cpu_index;
@@ -2665,30 +2666,51 @@ rs6000_option_override_internal (bool gl
  the cpu in a target attribute or pragma, but did not specify a tuning
  option, use the cpu for the tuning option rather than the option specified
  with -mtune on the command line.  */
-  if (rs6000_cpu_index > 0)
-cpu_index = rs6000_cpu_index;
-  else if (main_target_opt != NULL && main_target_opt->x_rs6000_cpu_index > 0)
-rs6000_cpu_index = cpu_index = main_target_opt->x_rs6000_cpu_index;
+  if (rs6000_cpu_index >= 0)
+{
+  cpu_index = rs6000_cpu_index;
+  have_cpu = true;
+}
+  else if (main_target_opt != NULL && main_target_opt->x_rs6000_cpu_index >= 0)
+{
+  rs6000_cpu_index = cpu_index = main_target_opt->x_rs6000_cpu_index;
+  have_cpu = true;
+}
   else
-rs6000_cpu_index = cpu_index = rs6000_cpu_name_lookup (default_cpu);
+{
+  if (!default_cpu)
+   default_cpu = (TARGET_POWERPC64 ? "powerpc64" : "powerpc");
+
+  rs6000_cpu_index = cpu_index = rs6000_cpu_name_lookup (default_cpu);
+}
+
+  gcc_assert (cpu_index >= 0);
+
+  target_flags &= ~set_masks;
+  target_flags |= (processor_target_table[cpu_index].target_enable
+   & set_masks);
 
-  if (rs6000_tune_index > 0)
+  if (rs6000_tune_index >= 0)
 tune_index = rs6000_tune_index;
-  else
+  else if (have_cpu)
 rs6000_tune_index = tune_index = cpu_index;
-
-  if (cpu_index >= 0)
+  else
 {
-  target_flags &= ~set_masks;
-  target_flags |= (processor_target_table[cpu_index].target_enable
-   & set_masks);
+  size_t i;
+  enum processor_type tune_proc
+   = (TARGET_POWERPC64 ? PROCESSOR_DEFAULT64 : PROCESSOR_DEFAULT);
+
+  tune_index = -1;
+  for (i = 0; i < ARRAY_SIZE (processor_target_table); i++)
+   if (processor_target_table[i].processor == tune_proc)
+ {
+   rs6000_tune_index = tune_index = i;
+   break;
+ }
 }
 
-  rs6000_cpu = ((tune_index >= 0)
-   ? processor_target_table[tune_index].processor
-   : (TARGET_POWERPC64
-  ? PROCESSOR_DEFAULT64
-

RE: [PATCH] Improve SCEV for array element

2012-03-07 Thread Jiangning Liu



 -Original Message-
 From: Richard Guenther [mailto:richard.guent...@gmail.com]
 Sent: Tuesday, March 06, 2012 9:12 PM
 To: Jiangning Liu
 Cc: gcc-patches@gcc.gnu.org
 Subject: Re: [PATCH] Improve SCEV for array element
 
 On Fri, Jan 20, 2012 at 10:06 AM, Jiangning Liu jiangning@arm.com
 wrote:
  It's definitely not ok at this stage but at most for next stage1.
 
  OK. I may wait until next stage1.
 
  This is a very narrow pattern-match.  It doesn't allow for a[i].x
 for
  example,
  even if a[i] is a one-element structure.  I think the canonical way
 of
  handling
  ADDR_EXPR is to use sth like
 
  base = get_inner_reference (TREE_OPERAND (rhs1, 0), ...,
 offset,  ...);
  base = build1 (ADDR_EXPR, TREE_TYPE (rhs1), base);
          chrec1 = analyze_scalar_evolution (loop, base);
          chrec2 = analyze_scalar_evolution (loop, offset);
          chrec1 = chrec_convert (type, chrec1, at_stmt);
          chrec2 = chrec_convert (TREE_TYPE (offset), chrec2, at_stmt);
          res = chrec_fold_plus (type, chrec1, chrec2);
 
  where you probably need to handle scev_not_known when analyzing
 offset
  (which might be NULL).  You also need to add bitpos to the base
 address
  (in bytes, of course).  Note that the MEM_REF case would naturally
  work
  with this as well.
 
  OK. New patch is like below, and bootstrapped on x86-32.
 
 You want instead of
 
 +  if (TREE_CODE (TREE_OPERAND (rhs1, 0)) == ARRAY_REF
 +  || TREE_CODE (TREE_OPERAND (rhs1, 0)) == MEM_REF
 +  || TREE_CODE (TREE_OPERAND (rhs1, 0)) == COMPONENT_REF)
 +{
 
 if (TREE_CODE (TREE_OPERAND (rhs1, 0)) == MEM_REF
 || handled_component_p (TREE_OPERAND (rhs1, 0)))
   {
 
 + base = build1 (ADDR_EXPR, TREE_TYPE (rhs1), base);
 + chrec1 = analyze_scalar_evolution (loop, base);
 
 can you please add a wrapper
 
 tree
 analyze_scalar_evolution_for_address_of (struct loop *loop, tree var)
 {
   return analyze_scalar_evolution (loop, build_fold_addr_expr (var));
 }
 
 and call that instead of building the ADDR_EXPR there?  We want
 to avoid building that tree node, but even such a simple wrapper would
 be preferred.
 
 + if (bitpos)
 
 if (bitpos != 0)
 
 + chrec3 = build_int_cst (integer_type_node,
 + bitpos / BITS_PER_UNIT);
 
 please use size_int (bitpos / BITS_PER_UNIT) instead.  Using
 integer_type_node is definitely wrong.
 
 OK with those changes.
 

Richard,

Modified as you suggested; the new code is below. Bootstrapped on
x86-32. OK for trunk now?

Thanks,
-Jiangning

ChangeLog:

2012-03-08  Jiangning Liu  jiangning@arm.com

* tree-scalar-evolution.c (interpret_rhs_expr): Generate chrec for
array reference and component reference.
(analyze_scalar_evolution_for_address_of): New.

ChangeLog for testsuite:

2012-03-08  Jiangning Liu  jiangning@arm.com

* gcc.dg/tree-ssa/scev-3.c: New.
* gcc.dg/tree-ssa/scev-4.c: New.
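
What the analysis is expected to derive for an address such as &a[i].y can be written out by hand: if i evolves as {i0, +, istep}, the address evolves as {a + i0 * sizeof (S) + offsetof (S, y), +, istep * sizeof (S)}. The sketch below checks that arithmetic in plain C. It is not the GCC implementation; struct affine_chrec, address_evolution and chrec_apply are names invented for the example, and it assumes a flat address space where pointer-to-integer conversion preserves byte offsets.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct
{
  int x;
  int y;
} S;

/* An affine evolution {init, +, step}: the value at iteration n
   is init + n * step.  */
struct affine_chrec
{
  uintptr_t init;
  uintptr_t step;
};

/* Evolution of &base[i].y where i itself evolves as {i0, +, istep}.
   The constant member offset corresponds to bitpos / BITS_PER_UNIT.  */
struct affine_chrec
address_evolution (const S *base, size_t i0, size_t istep)
{
  struct affine_chrec c;

  c.init = (uintptr_t) base + i0 * sizeof (S) + offsetof (S, y);
  c.step = istep * sizeof (S);
  return c;
}

/* Evaluate the evolution at iteration n.  */
uintptr_t
chrec_apply (struct affine_chrec c, size_t n)
{
  return c.init + n * c.step;
}
```

Once the address is known to be affine in the loop counter, the optimizer can strength-reduce it instead of recomputing the array address on every iteration, which is what the new test cases look for in the optimized dump.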

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/scev-3.c
b/gcc/testsuite/gcc.dg/tree-ssa/scev-3.c
new file mode 100644
index 000..28d5c93
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/scev-3.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+int *a_p;
+int a[1000];
+
+f(int k)
+{
+   int i;
+
+   for (i=k; i<1000; i+=k) {
+   a_p = &a[i];
+   *a_p = 100;
+}
+}
+
+/* { dg-final { scan-tree-dump-times "&a" 1 "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/scev-4.c
b/gcc/testsuite/gcc.dg/tree-ssa/scev-4.c
new file mode 100644
index 000..6c1e530
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/scev-4.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+typedef struct {
+   int x;
+   int y;
+} S;
+
+int *a_p;
+S a[1000];
+
+f(int k)
+{
+   int i;
+
+   for (i=k; i<1000; i+=k) {
+   a_p = &a[i].y;
+   *a_p = 100;
+}
+}
+
+/* { dg-final { scan-tree-dump-times "&a" 1 "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
diff --git a/gcc/tree-scalar-evolution.c b/gcc/tree-scalar-evolution.c
index 2077c8d..c719984
--- a/gcc/tree-scalar-evolution.c
+++ b/gcc/tree-scalar-evolution.c
@@ -266,6 +266,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "params.h"
 
 static tree analyze_scalar_evolution_1 (struct loop *, tree, tree);
+static tree analyze_scalar_evolution_for_address_of (struct loop *loop,
+tree var);
 
 /* The cached information about an SSA name VAR, claiming that below
basic block INSTANTIATED_BELOW, the value of VAR can be expressed
@@ -1712,16 +1714,59 @@ interpret_rhs_expr (struct loop *loop, gimple at_stmt,
   switch (code)
 {
 case ADDR_EXPR:
-  /* Handle MEM[ptr + CST] which is equivalent to POINTER_PLUS_EXPR.  */
-  if
