Re: [Patch, mips] MIPS performance patch for PR 56552

2013-11-16 Thread Richard Sandiford
Steve Ellcey  sell...@mips.com writes:
 diff --git a/gcc/config/mips/mips.md b/gcc/config/mips/mips.md
 index 0cda169..49c2bf7 100644
 --- a/gcc/config/mips/mips.md
 +++ b/gcc/config/mips/mips.md
 @@ -6721,7 +6721,7 @@
  (define_insn "*mov<GPR:mode>_on_<MOVECC:mode>"
    [(set (match_operand:GPR 0 "register_operand" "=d,d")
   (if_then_else:GPR
 -  (match_operator:MOVECC 4 "equality_operator"
 +  (match_operator 4 "equality_operator"
   [(match_operand:MOVECC 1 "register_operand" "<MOVECC:reg>,<MOVECC:reg>")
    (const_int 0)])
    (match_operand:GPR 2 "reg_or_0_operand" "dJ,0")


Sorry, I didn't notice this before, but we should remove _on_<MOVECC:mode>
from the name of the insn.  Same for the FP version.

OK with that change, thanks.

It'd be good to add a testcase too.  E.g. we could take your example in
the PR and check for the redundant 0x.

Richard


Re: [PATCH] MIPS: MIPS32r2 FP reciprocal instruction set support

2013-11-16 Thread Richard Sandiford
Maciej W. Rozycki ma...@codesourcery.com writes:
  Note that these instructions were allowed in either FPU mode in the MIPS 
 IV ISA, but for forward ISA compatibility this change does not enable them 
 for -march=mips4 in the 32-bit FPR mode because the original revision of 
 the MIPS64 ISA did not support it.

Yeah, sounds good.

 Index: gcc-fsf-trunk-quilt/gcc/config/mips/mips.h
 ===
 --- gcc-fsf-trunk-quilt.orig/gcc/config/mips/mips.h   2013-11-12 
 15:31:46.758734464 +
 +++ gcc-fsf-trunk-quilt/gcc/config/mips/mips.h2013-11-12 
 15:33:22.277646941 +
 @@ -921,6 +921,21 @@ struct mips_cpu_info {
 'c = -((a * b) [+-] c)'.  */
  #define ISA_HAS_NMADD3_NMSUB3	TARGET_LOONGSON_2EF
  
 +/* ISA has floating-point RECIP.fmt and RSQRT.fmt instructions.  The
 +   MIPS64 rev. 1 ISA says that RECIP.D and RSQRT.D are unpredictable when
 +   doubles are stored in pairs of FPRs, so for safety's sake, we apply
 +   this restriction to the MIPS IV ISA too.  */
 +#define ISA_HAS_FP_RECIP_RSQRT(MODE) \
 + (((ISA_HAS_FP4  \
 +    || (ISA_MIPS32R2 && !TARGET_MIPS16)) \
 +   && ((MODE) == SFmode  \
 +       || ((TARGET_FLOAT64   \
 +            || !(ISA_MIPS4   \
 +                 || ISA_MIPS64)) \
 +           && (MODE) == DFmode)))\
 +  || ((TARGET_SB1 && !TARGET_MIPS16) \
 +      && (MODE) == V2SFmode))

I think the !(ISA_MIPS4 || ISA_MIPS64) part is really r2 or later,
which elsewhere we test as ISA_MIPS32R2 || ISA_MIPS64R2.  Obviously
that isn't as future-proof, but I think consistency wins here.
(E.g. the earlier ISA_MIPS32R2 seems like it's really r2 or later too).

Cleaning up these macros has been on my todo list for about ten years :-(

Please also test !TARGET_MIPS16 at the outermost level, so that there's
only one instance.  I think that gives something like:

#define ISA_HAS_FP_RECIP_RSQRT(MODE)		\
  ((((ISA_HAS_FP4 || ISA_MIPS32R2)		\
     && ((MODE) == SFmode			\
	 || ((TARGET_FLOAT64			\
	      || ISA_MIPS32R2			\
	      || ISA_MIPS64R2)			\
	     && (MODE) == DFmode)))		\
    || (TARGET_SB1				\
	&& (MODE) == V2SFmode))			\
   && !TARGET_MIPS16)

OK with those changes, thanks.

Richard
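
For anyone wanting to sanity-check the restructured condition, here is a
minimal sketch that restates it as a plain C function over boolean inputs.
The struct, its field names and the mode enum are stand-ins invented for
illustration; in GCC these are target macros evaluated by the compiler's
configuration, not runtime values.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the target macros.  */
struct isa_flags {
  bool has_fp4;   /* ISA_HAS_FP4 */
  bool mips32r2;  /* ISA_MIPS32R2 */
  bool mips64r2;  /* ISA_MIPS64R2 */
  bool float64;   /* TARGET_FLOAT64 */
  bool sb1;       /* TARGET_SB1 */
  bool mips16;    /* TARGET_MIPS16 */
};

enum fp_mode { MODE_SF, MODE_DF, MODE_V2SF };

/* Mirrors the shape of the suggested macro: the scalar and paired-single
   cases are ORed together and !mips16 is tested once, at the outermost
   level.  */
bool
has_fp_recip_rsqrt (const struct isa_flags *f, enum fp_mode mode)
{
  bool scalar_isa = f->has_fp4 || f->mips32r2;
  bool double_ok = f->float64 || f->mips32r2 || f->mips64r2;
  bool scalar = scalar_isa
                && (mode == MODE_SF || (double_ok && mode == MODE_DF));
  bool paired = f->sb1 && mode == MODE_V2SF;

  return (scalar || paired) && !f->mips16;
}
```

With only has_fp4 set (no 64-bit FPRs, no r2), SFmode is accepted and DFmode
is rejected, which is the restriction the comment in the patch describes.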


Re: [PATCH] Fix lto bootstrap verification failure with -freorder-blocks-and-partition

2013-11-16 Thread Jan Hubicka
 When testing with -freorder-blocks-and-partition enabled, I hit a
 verification failure in an LTO profiledbootstrap. Edge forwarding
 performed when we went into cfg layout mode after bb reordering
 (during compgotos) created a situation where a hot block was then
 dominated by a cold block and was therefore remarked as cold. Because
 bb reorder was complete at that point, it was not moved in the
 physical layout, and we incorrectly went in and out of the cold
 section multiple times.
 
 The following patch addresses that by fixing the layout when we move
 blocks to the cold section after bb reordering is complete.
 
 Tested with an LTO profiledbootstrap with
 -freorder-blocks-and-partition enabled. Ok for trunk?
 
 Thanks,
 Teresa
 
 2013-11-15  Teresa Johnson  tejohn...@google.com
 
 * cfgrtl.c (fixup_partitions): Reorder blocks if necessary.

computed_gotos just unfactors unified blocks that we use to avoid CFGs with
O(n^2) edges. This is mostly to avoid problems with nonlinearity of other passes
and to reduce the quadratic memory use case to one function at a time.

I wonder if it wouldn't be cleaner to simply unfactor those just before
pass_reorder_blocks.

Computed gotos are used e.g. in the libjava interpreter to optimize the tight
interpreting loop.  I think those cases would benefit from having at least
scheduling/reordering and alignments done right.

Of course it depends on how bad the compile-time implications are (I think in
addition to libjava, there was a Lucier's testcase that made us go for this
trick), but I would prefer it over adding yet another hack into cfgrtl...
We also may just avoid cfglayout cleanup_cfg while doing computed gotos...

Honza
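
For readers unfamiliar with the construct, here is a minimal sketch of the
kind of interpreter loop meant above, written with the GNU C labels-as-values
extension.  The opcode set and the function name are made up for
illustration.  It only shows why every `goto *` can reach every label, which
is what makes the unfactored CFG quadratic in the number of handlers.

```c
/* Toy bytecode interpreter using GNU C computed gotos.  Each handler
   ends in its own indirect jump, so with N handlers the unfactored CFG
   has on the order of N * N edges.  */
enum { OP_INC, OP_DBL, OP_DEC, OP_HALT };

int
run_bytecode (const unsigned char *code, int acc)
{
  static void *const dispatch[] = { &&do_inc, &&do_dbl, &&do_dec, &&do_halt };

  goto *dispatch[*code++];

do_inc:
  acc += 1;
  goto *dispatch[*code++];
do_dbl:
  acc *= 2;
  goto *dispatch[*code++];
do_dec:
  acc -= 1;
  goto *dispatch[*code++];
do_halt:
  return acc;
}
```

Factoring replaces the per-handler indirect jumps with a jump to one shared
dispatch block, which keeps the edge count linear but gives the layout and
scheduling passes a worse picture of the loop.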


Re: [PATCH] MIPS: MIPS32r2 FP indexed access instruction set support

2013-11-16 Thread Richard Sandiford
Maciej W. Rozycki ma...@codesourcery.com writes:
 2013-11-14  Maciej W. Rozycki  ma...@codesourcery.com

   gcc/
   * config/mips/mips.h (ISA_HAS_FP4): Remove TARGET_FLOAT64 
   restriction for ISA_MIPS32R2.
   (ISA_HAS_FP_MADD4_MSUB4): Remove ISA_MIPS32R2 special-casing.
   (ISA_HAS_NMADD4_NMSUB4): Likewise.
   (ISA_HAS_FP_RECIP_RSQRT): Likewise.
   (ISA_HAS_PREFETCHX): Redefine in terms of ISA_HAS_FP4.

Nice.

So the reasoning is that, after your RECIP.fmt patch, the only direct uses
of ISA_HAS_FP4 for instruction selection are indexed loads and stores.
That's why extending them to ISA_MIPS32R2 && !TARGET_FLOAT64 allows
ISA_HAS_FP4 to be simplified.  But if we keep:

 @@ -906,16 +906,14 @@ struct mips_cpu_info {
  #define GENERATE_MADD_MSUB   (TARGET_IMADD && !TARGET_MIPS16)
  
  /* ISA has floating-point madd and msub instructions 'd = a * b [+-] c'.  */
 -#define ISA_HAS_FP_MADD4_MSUB4  (ISA_HAS_FP4 \
 -  || (ISA_MIPS32R2 && !TARGET_MIPS16))
 +#define ISA_HAS_FP_MADD4_MSUB4  ISA_HAS_FP4
  
  /* ISA has floating-point madd and msub instructions 'c = a * b [+-] c'.  */
  #define ISA_HAS_FP_MADD3_MSUB3  TARGET_LOONGSON_2EF
  
  /* ISA has floating-point nmadd and nmsub instructions
     'd = -((a * b) [+-] c)'.  */
 -#define ISA_HAS_NMADD4_NMSUB4  (ISA_HAS_FP4 \
 -  || (ISA_MIPS32R2 && !TARGET_MIPS16))
 +#define ISA_HAS_NMADD4_NMSUB4  ISA_HAS_FP4

then I think we should also have a macro like:

/* ISA has indexed floating-point loads and stores (LWXC1, LDXC1, SWXC1
   and SDXC1).  */
#define ISA_HAS_LXC1_SXC1   ISA_HAS_FP4

and add:

   Note that this macro should only be used by other ISA_HAS_* macros.

to the ISA_HAS_FP4 comment.

OK with those changes, thanks.

Richard


Re: [ia64] [PR target/57491] internal compiler error: in ia64_split_tmode -O2, quadmath

2013-11-16 Thread Eric Botcazou
 As far as I understand semantics of this insn:
   (insn 200 199 0 (set (reg:DI 15 r15)
   (mem:DI (post_dec:DI (reg/f:DI 15 r15 [447])) [3
   *_61[_12]{lb: 1 sz: 64}.text+8 S8 A64])) -1 (nil))
 What is done is (in this sequence):
   1. Calculate the address of MEM: get the r15 value.
   2. Decrement the r15 value.
   3. Load MEM into r15.
 
 Point 2 is useless as we kill it by 3.
 So, it is clobbered, and as mentioned in the comment it is sometimes OK to
 overwrite a pointer with the value it points to.

That depends on the semantics of the hardware instruction though, does it 
really guarantee 1/2/3 in that order?

 We need to set the `dead' flag only when the address is actually going to
 be killed by the load.
 
 Patch at the bottom. The test from the PR passes.

The patch looks good to me if you also adjust the last sentence in the comment 
just above the block:

  /* It is possible for reload to decide to overwrite a pointer with
 the value it points to.  In that case we have to do the loads in
 the appropriate order so that the pointer is not destroyed too
 early.  Also we must not generate a postmodify for that second
 load, or rws_access_regno will die.  */

Something like "And we must not generate a postmodify for the second load if
the destination register overlaps with the base register."

Thanks for fixing this.

-- 
Eric Botcazou
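
To make the 1/2/3 ordering discussed above concrete, here is a small C model
of a post-decrement load on a toy machine.  Everything in it (the struct, the
register and memory sizes, the function name) is invented for illustration.
It models the RTL reading given in the quoted text and says nothing about
what the ia64 hardware actually guarantees, which is the open question.

```c
#include <stdint.h>

/* Toy machine state: a few registers and a small word-indexed memory.  */
struct toy_cpu {
  uint64_t reg[4];
  uint64_t mem[8];
};

/* Models (set (reg DST) (mem (post_dec (reg BASE)))) with the assumed
   ordering:
     1. take the address from BASE,
     2. decrement BASE,
     3. write the loaded value to DST.
   When DST == BASE, step 3 overwrites the result of step 2.  */
void
load_post_dec (struct toy_cpu *cpu, int dst, int base)
{
  uint64_t addr = cpu->reg[base];   /* step 1 */
  cpu->reg[base] = addr - 1;        /* step 2 (one word, in this model) */
  cpu->reg[dst] = cpu->mem[addr];   /* step 3 */
}
```

With distinct registers the base ends up decremented; with dst == base only
the loaded value survives, so under this model the decrement is dead.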


Re: [PATCH][3/3] Re-submission of Altera Nios II port, libgcc parts

2013-11-16 Thread Chung-Lin Tang
On 2013/7/14 03:55 PM, Chung-Lin Tang wrote:
 nios2 libgcc parts. Since the original post, the only main change has
 been the fdpbit vs soft-fp issue raised by Joseph, which has been
 resolved. Other parts are mostly the same.

The Nios II libgcc parts have been further updated to include a
sfp-machine.h file, and the Linux atomic cmpxchg updated to now use a
fixed address kernel helper cmpxchg routine, similar to ARM.

Thanks,
Chung-Lin

2013-11-16  Sandra Loosemore  san...@codesourcery.com
Chung-Lin Tang  clt...@codesourcery.com
Based on patches from Altera Corporation

* config.host (nios2-*-*,nios2-*-linux*): Add nios2 host cases.
* config/nios2/lib2-nios2.h: New file.
* config/nios2/lib2-divmod-hi.c: New file.
* config/nios2/linux-unwind.h: New file.
* config/nios2/lib2-divmod.c: New file.
* config/nios2/linux-atomic.c: New file.
* config/nios2/t-nios2: New file.
* config/nios2/crti.asm: New file.
* config/nios2/t-linux: New file.
* config/nios2/lib2-divtable.c: New file.
* config/nios2/lib2-mul.c: New file.
* config/nios2/tramp.c: New file.
* config/nios2/crtn.asm: New file.
* config/nios2/sfp-machine.h: New file.

Index: libgcc/config.host
===
--- libgcc/config.host	(revision 204897)
+++ libgcc/config.host	(working copy)
@@ -146,6 +146,9 @@ mips*-*-*)
 nds32*-*)
 	cpu_type=nds32
 	;;
+nios2*-*-*)
+	cpu_type=nios2
+	;;
 powerpc*-*-*)
 	cpu_type=rs6000
 	;;
@@ -876,6 +879,15 @@ nds32*-elf*)
 		;;
 	esac
 	;;
+nios2-*-linux*)
+	tmake_file="$tmake_file nios2/t-nios2 nios2/t-linux t-libgcc-pic t-slibgcc-libgcc"
+	extra_parts="$extra_parts crti.o crtn.o"
+	md_unwind_header=nios2/linux-unwind.h
+	;;
+nios2-*-*)
+	tmake_file="$tmake_file nios2/t-nios2 t-softfp-sfdf t-softfp-excl t-softfp"
+	extra_parts="$extra_parts crti.o crtn.o"
+	;;
 pdp11-*-*)
 	tmake_file=pdp11/t-pdp11 t-fdpbit
 	;;
Index: libgcc/config/nios2/t-linux
===
--- libgcc/config/nios2/t-linux	(revision 0)
+++ libgcc/config/nios2/t-linux	(revision 0)
@@ -0,0 +1,7 @@
+# Soft-float functions go in glibc only, to facilitate the possible
+# future addition of exception and rounding mode support integrated
+# with fenv.h.
+
+LIB2FUNCS_EXCLUDE = _floatdidf _floatdisf _fixunsdfsi _fixunssfsi \
+  _fixunsdfdi _fixdfdi _fixunssfdi _fixsfdi _floatundidf _floatundisf
+LIB2ADD += $(srcdir)/config/nios2/linux-atomic.c
Index: libgcc/config/nios2/sfp-machine.h
===
--- libgcc/config/nios2/sfp-machine.h	(revision 0)
+++ libgcc/config/nios2/sfp-machine.h	(revision 0)
@@ -0,0 +1,78 @@
+/* Soft-FP definitions for Altera Nios II.
+   Copyright (C) 2013 Free Software Foundation, Inc.
+
+This file is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by the
+Free Software Foundation; either version 3, or (at your option) any
+later version.
+
+This file is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+General Public License for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+http://www.gnu.org/licenses/.  */
+
+#define _FP_W_TYPE_SIZE		32
+#define _FP_W_TYPE		unsigned long
+#define _FP_WS_TYPE		signed long
+#define _FP_I_TYPE		long
+
+#define _FP_MUL_MEAT_S(R,X,Y)\
+  _FP_MUL_MEAT_1_wide(_FP_WFRACBITS_S,R,X,Y,umul_ppmm)
+#define _FP_MUL_MEAT_D(R,X,Y)\
+  _FP_MUL_MEAT_2_wide(_FP_WFRACBITS_D,R,X,Y,umul_ppmm)
+#define _FP_MUL_MEAT_Q(R,X,Y)\
+  _FP_MUL_MEAT_4_wide(_FP_WFRACBITS_Q,R,X,Y,umul_ppmm)
+
+#define _FP_DIV_MEAT_S(R,X,Y)	_FP_DIV_MEAT_1_loop(S,R,X,Y)
+#define _FP_DIV_MEAT_D(R,X,Y)	_FP_DIV_MEAT_2_udiv(D,R,X,Y)
+#define _FP_DIV_MEAT_Q(R,X,Y)	_FP_DIV_MEAT_4_udiv(Q,R,X,Y)
+
+#define _FP_NANFRAC_S		((_FP_QNANBIT_S << 1) - 1)
+#define _FP_NANFRAC_D		((_FP_QNANBIT_D << 1) - 1), -1
+#define _FP_NANFRAC_Q		((_FP_QNANBIT_Q << 1) - 1), -1, -1, -1
+#define _FP_NANSIGN_S		0
+#define _FP_NANSIGN_D		0
+#define _FP_NANSIGN_Q		0
+
+#define _FP_KEEPNANFRACP 1
+#define _FP_QNANNEGATEDP 0
+
+/* Someone please check this.  */
+#define _FP_CHOOSENAN(fs, wc, R, X, Y, OP)			\
+  do {\
+if ((_FP_FRAC_HIGH_RAW_##fs(X) & _FP_QNANBIT_##fs)		\
+	&& !(_FP_FRAC_HIGH_RAW_##fs(Y) & _FP_QNANBIT_##fs))	\
+  {\
+	R##_s = Y##_s;		\
+	_FP_FRAC_COPY_##wc(R,Y);\
+  }\
+else			

Re: [PATCH][2/3] Re-submission of Altera Nios II port, testsuite parts

2013-11-16 Thread Chung-Lin Tang
On 2013/10/17 10:20 PM, Bernd Schmidt wrote:
 On 07/14/2013 09:54 AM, Chung-Lin Tang wrote:
 These are nios2 patches for the gcc testsuite. Some new testcases were
 added since the last posting.
 
 Index: gcc/testsuite/gcc.c-torture/execute/builtins/lib/chk.c
 ===
 --- gcc/testsuite/gcc.c-torture/execute/builtins/lib/chk.c   (revision 
 200946)
 +++ gcc/testsuite/gcc.c-torture/execute/builtins/lib/chk.c   (working copy)
 @@ -124,16 +124,17 @@ __memmove_chk (void *dst, const void *src, __SIZE_
  void *
  memset (void *dst, int c, __SIZE_TYPE__ n)
  {
 +  while (n-- != 0)
 +n[(char *) dst] = c;
 +
/* Single-byte memsets should be done inline when optimisation
 - is enabled.  */
 + is enabled.  Do this after the copy in case we're being called to
 + initialize bss.  */
  #ifdef __OPTIMIZE__
    if (memset_disallowed && inside_main && n < 2)
  abort ();
  #endif
  
 -  while (n-- != 0)
 -n[(char *) dst] = c;
 -
return dst;
  }
 
 I'm not sure I understand this change. Is nios2 the only target calling
 memset to initialize bss, and memset_disallowed is nonzero at the start
 of execution?

This appears to be for the nios2-elf bare metal testing. Looking at the
upstream libgloss sources, nios2 is indeed not the only target that
calls memset for zeroing bss.

Note, however, the somewhat reversed situation here:
https://sourceware.org/ml/newlib/2013/msg00264.html

It appears that due to the presumed usage model for Nios II, Sandra did
not contribute the libgloss port. So the original code that needed this
testsuite change is probably not there.

OTOH, if this change is not deemed harmful, then it might make the
testsuite more robust.
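
One thing worth noting about the reordered version: the fill loop counts n
down past zero, so a check on n placed after the loop no longer sees the
caller's length.  A minimal sketch of a variant that fills first and still
checks the original length is below.  The function name and the saved copy
are additions made for this sketch, and the two globals are stand-ins for the
testsuite variables of the same names.

```c
#include <stddef.h>
#include <stdlib.h>

/* Stand-ins for the testsuite globals of the same names.  */
int memset_disallowed;
int inside_main;

/* Fill first, check afterwards, as in the patch.  The check uses a copy
   of the length taken before the loop, since the loop modifies N.  */
void *
demo_memset (void *dst, int c, size_t n)
{
  size_t orig_n = n;

  while (n-- != 0)
    ((char *) dst)[n] = c;

  if (memset_disallowed && inside_main && orig_n < 2)
    abort ();

  return dst;
}
```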

 Index: gcc/testsuite/gcc.target/nios2/nios2-int-types.c
 ===
 --- gcc/testsuite/gcc.target/nios2/nios2-int-types.c (revision 0)
 +++ gcc/testsuite/gcc.target/nios2/nios2-int-types.c (revision 0)
 @@ -0,0 +1,34 @@
 +/* Test that various types are all derived from int.  */
 +/* { dg-do compile { target nios2-*-* } } */
 
 I think you can lose the { target nios2-*-* } for everything inside
 gcc.target/nios2.

Done.

The new attached patch also has the Dxx constraint test removed, as that
feature is now removed from the compiler. The memset() change mentioned
above is still in the patch, but I will remove it before committing if it
is not approved.

Thanks,
Chung-Lin

2013-11-16  Sandra Loosemore  san...@codesourcery.com
Chung-Lin Tang  clt...@codesourcery.com
Based on patches from Altera Corporation

* gcc.dg/stack-usage-1.c (SIZE): Define case for __nios2__.
* gcc.dg/20040813-1.c: Skip for nios2-*-*.
* gcc.dg/20020312-2.c: Add __nios2__ case.
* g++.dg/other/PR23205.C: Skip for nios2-*-*.
* g++.dg/other/pr23205-2.C: Skip for nios2-*-*.
* g++.dg/cpp0x/constexpr-rom.C: Skip for nios2-*-*.
* g++.dg/cpp0x/alias-decl-debug-0.C: Skip for nios2-*-*.
* g++.old-deja/g++.jason/thunk3.C: Skip for nios2-*-*.
* lib/target-supports.exp (check_profiling_available): Check for
nios2-*-elf.
* gcc.c-torture/execute/pr47237.x: Skip for nios2-*-*.
* gcc.c-torture/execute/20101011-1.c: Skip for nios2-*-*.
* gcc.c-torture/execute/builtins/lib/chk.c (memset): Place
char-based memset loop before inline check, to prevent
problems when called to initialize .bss. Update comments.
* gcc.target/nios2/nios2.exp: New DejaGNU file.
* gcc.target/nios2/nios2-custom-1.c: New test.
* gcc.target/nios2/nios2-trap-insn.c: New test.
* gcc.target/nios2/nios2-builtin-custom.c: New test.
* gcc.target/nios2/nios2-builtin-io.c: New test.
* gcc.target/nios2/nios2-stack-check-1.c: New test.
* gcc.target/nios2/nios2-stack-check-2.c: New test.
* gcc.target/nios2/nios2-rdctl.c: New test.
* gcc.target/nios2/nios2-wrctl.c: New test.
* gcc.target/nios2/nios2-wrctl-zero.c: New test.
* gcc.target/nios2/nios2-wrctl-not-zero.c: New test.
* gcc.target/nios2/nios2-rdwrctl-1.c: New test.
* gcc.target/nios2/nios2-ashlsi3-one_shift.c: New test.
* gcc.target/nios2/nios2-mul-options-1.c: New test.
* gcc.target/nios2/nios2-mul-options-2.c: New test.
* gcc.target/nios2/nios2-mul-options-3.c: New test.
* gcc.target/nios2/nios2-mul-options-4.c: New test.
* gcc.target/nios2/nios2-nor.c: New test.
* gcc.target/nios2/nios2-stxio.c: New test.
* gcc.target/nios2/custom-fp-1.c: New test.
* gcc.target/nios2/custom-fp-2.c: New test.
* gcc.target/nios2/custom-fp-3.c: New test.
* gcc.target/nios2/custom-fp-4.c: New test.
* gcc.target/nios2/custom-fp-5.c: New test.
* gcc.target/nios2/custom-fp-6.c: New test.
* gcc.target/nios2/custom-fp-7.c: New test.
* 

Re: [PATCH] Generate a label for the split cold function while using -freorder-blocks-and-partition

2013-11-16 Thread Richard Sandiford
Cary Coutant ccout...@google.com writes:
 Isn't this something that should be expressed in DWARF with
 DW_AT_ranges? See DWARF4, section 2.17,

 Does GCC generate such ranges?

 GCC does generate these ranges. However, according to Cary many tools
 do not rely on dwarf info for locating the corresponding function
 name, they just look at the symbols to identify what function an
 address resides in. Nor would we want tools such as objdump and
 profilers to rely on dwarf for locating the function names as this
 would not work for binaries that were generated without -g options or
 had their debug info stripped.

 Yes, while the information needed is in the DWARF info, I don't think
 it's a good idea to depend on having debug info in all binaries. It's
 quite common to need to symbolize binaries that don't have debug info,
 and without a symbol such as Sri and Teresa are proposing, the result
 will be not just an address that didn't get symbolized, but an address
 that gets symbolized incorrectly (in a way that will often be quite
 misleading).

+1 FWIW.

Another reason is that on MIPS, we could be throwing cold MIPS and
MIPS16/microMIPS code into the same section.  Tools like objdump rely
on symbols to figure out which ISA mode is being used where.

Thanks,
Richard


Re: [wide-int] Documentation and comment tweaks

2013-11-16 Thread Richard Sandiford
Richard Sandiford rdsandif...@googlemail.com writes:
 Some minor tweaks to the documentation and commentary.  The hyphenation
 and non zero -> nonzero changes are supposed to be per guidelines:

http://gcc.gnu.org/codingconventions.html#Spelling

 Hope I got them right.

 OK to install?

Ping.

 Index: gcc/dfp.c
 ===
 --- gcc/dfp.c 2013-11-09 09:50:47.392396760 +
 +++ gcc/dfp.c 2013-11-09 11:07:22.754160541 +
 @@ -605,8 +605,8 @@ decimal_real_to_integer (const REAL_VALU
return real_to_integer (to);
  }
  
 -/* Likewise, but returns a wide_int with PRECISION.  Fail
 -   is set if the value does not fit.  */
 +/* Likewise, but returns a wide_int with PRECISION.  *FAIL is set if the
 +   value does not fit.  */
  
  wide_int
  decimal_real_to_integer (const REAL_VALUE_TYPE *r, bool *fail, int precision)
 Index: gcc/doc/rtl.texi
 ===
 --- gcc/doc/rtl.texi  2013-11-09 09:50:47.392396760 +
 +++ gcc/doc/rtl.texi  2013-11-09 11:07:22.755160549 +
 @@ -1542,11 +1542,10 @@ Similarly, there is only one object for
  @findex const_double
  @item (const_double:@var{m} @var{i0} @var{i1} @dots{})
  This represents either a floating-point constant of mode @var{m} or
 -(on ports older ports that do not define
 +(on older ports that do not define
  @code{TARGET_SUPPORTS_WIDE_INT}) an integer constant too large to fit
  into @code{HOST_BITS_PER_WIDE_INT} bits but small enough to fit within
 -twice that number of bits (GCC does not provide a mechanism to
 -represent even larger constants).  In the latter case, @var{m} will be
 +twice that number of bits.  In the latter case, @var{m} will be
  @code{VOIDmode}.  For integral values constants for modes with more
  bits than twice the number in @code{HOST_WIDE_INT} the implied high
  order bits of that constant are copies of the top bit of
 @@ -1576,25 +1575,25 @@ the precise bit pattern used by the targ
  This contains an array of @code{HOST_WIDE_INTS} that is large enough
  to hold any constant that can be represented on the target.  This form
  of rtl is only used on targets that define
 -@code{TARGET_SUPPORTS_WIDE_INT} to be non zero and then
 -@code{CONST_DOUBLES} are only used to hold floating point values.  If
 +@code{TARGET_SUPPORTS_WIDE_INT} to be nonzero and then
 +@code{CONST_DOUBLE}s are only used to hold floating-point values.  If
  the target leaves @code{TARGET_SUPPORTS_WIDE_INT} defined as 0,
  @code{CONST_WIDE_INT}s are not used and @code{CONST_DOUBLE}s are as
  they were before.
  
 -The values are stored in a compressed format.   The higher order
 +The values are stored in a compressed format.  The higher-order
  0s or -1s are not represented if they are just the logical sign
  extension of the number that is represented.   
  
  @findex CONST_WIDE_INT_VEC
  @item CONST_WIDE_INT_VEC (@var{code})
  Returns the entire array of @code{HOST_WIDE_INT}s that are used to
 -store the value.   This macro should be rarely used.
 +store the value.  This macro should be rarely used.
  
  @findex CONST_WIDE_INT_NUNITS
  @item CONST_WIDE_INT_NUNITS (@var{code})
  The number of @code{HOST_WIDE_INT}s used to represent the number.
 -Note that this generally be smaller than the number of
 +Note that this generally is smaller than the number of
  @code{HOST_WIDE_INT}s implied by the mode size.
  
  @findex CONST_WIDE_INT_ELT
 Index: gcc/doc/tm.texi
 ===
 --- gcc/doc/tm.texi   2013-11-09 09:50:47.392396760 +
 +++ gcc/doc/tm.texi   2013-11-09 11:07:22.757160564 +
 @@ -9683,10 +9683,9 @@ Returns the negative of the floating poi
  Returns the absolute value of @var{x}.
  @end deftypefn
  
 -@deftypefn Macro void REAL_VALUE_FROM_INT (REAL_VALUE_TYPE @var{x}, 
 HOST_WIDE_INT @var{val}, enum machine_mode @var{mode})
 -Converts a double-precision integer found in @var{val},
 -into a floating point value which is then stored into @var{x}.  The
 -value is truncated to fit in mode @var{mode}.
 +@deftypefn Macro void REAL_VALUE_FROM_INT (REAL_VALUE_TYPE @var{x}, const 
 wide_int_ref @var{val}, enum machine_mode @var{mode})
 +Converts integer @var{val} into a floating-point value which is then
 +stored into @var{x}.  The value is truncated to fit in mode @var{mode}.
  @end deftypefn
  
  @node Mode Switching
 @@ -11497,15 +11496,15 @@ The default value of this hook is based
  @defmac TARGET_SUPPORTS_WIDE_INT
  
  On older ports, large integers are stored in @code{CONST_DOUBLE} rtl
 -objects.  Newer ports define @code{TARGET_SUPPORTS_WIDE_INT} to be non
 -zero to indicate that large integers are stored in
 +objects.  Newer ports define @code{TARGET_SUPPORTS_WIDE_INT} to be nonzero
 +to indicate that large integers are stored in
  @code{CONST_WIDE_INT} rtl objects.  The @code{CONST_WIDE_INT} allows
  very large integer constants to be represented.  @code{CONST_DOUBLE}
 -are 

Re: [PATCH] Avoid some unnecessary set_cfun calls

2013-11-16 Thread Richard Sandiford
Jakub Jelinek ja...@redhat.com writes:
 On Wed, Nov 13, 2013 at 11:27:10AM +0100, Richard Biener wrote:
  Also, I wonder if we couldn't defer the expensive ira_init, if the info
  computed by it is used only during RTL optimization passes (haven't verified
  it yet), then supposedly we could just remember using some target hook
  what the last state was when we did ira_init last time, and call ira_init
  again at the start of expansion or so if it is different from the last time.
  For i?86/x86_64/ppc* this would be whether the current function's
  DECL_FUNCTION_SPECIFIC_TARGET is the same as one for which ira_init has been
  called, for rx whether interrupt attribute is the same and for mips whatever
  is needed.
 
 I wonder why we cannot move all the stuff we re-init to a member
 of struct function (or rather have a pointer to that info there
 to cache it across functions with the same options).  That is,
 get rid of more global state?  That would make switching back
 and forth cheaper.

 Isn't that what the SWITCHABLE_TARGET stuff is all about?
 So, perhaps we should just define SWITCHABLE_TARGET on i?86/x86_64/powerpc*
 (and rx if maintainer cares) and tweak it to attach
 struct target_globals * to TARGET_OPTION_NODE somehow.
 A problem might be that lots of the save_target_globals
 allocated structures are heap allocated rather than GC, so we might leak
 memory.  Wonder if save_target_globals couldn't just compute the
 aggregate size of all the structures it allocates with XCNEW right now
 (plus required alignment if needed) and just allocate them together
 with the ggc_alloc_target_globals after the target_globals structure
 itself.

Yeah, that might be worth doing.  I think the only non-GCed structures
with subpointers are target_ira_int and target_lra_int, but we could
probably convert them to GCed structures.  (And perhaps use the same
technique recursively.  E.g. LRA could work out the maximum number of
operand_alternative structures needed and allocate them in one go.)

Thanks,
Richard
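
A minimal sketch of the "compute the aggregate size, then allocate once" idea
discussed above, using plain calloc rather than GCC's GC allocator.  The
structure names and sizes are hypothetical and stand in for the per-target
tables; only the layout technique is the point.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical sub-structures standing in for the per-target tables.  */
struct demo_regs { int cost[16]; };
struct demo_ira { double weight[8]; };

struct demo_globals {
  struct demo_regs *regs;
  struct demo_ira *ira;
};

/* Round OFFSET up to a multiple of ALIGN (a power of two).  */
static size_t
round_up (size_t offset, size_t align)
{
  return (offset + align - 1) & ~(align - 1);
}

/* Allocate the header and all sub-structures in one zeroed block, so a
   single free (or a single GC object) covers everything.  */
struct demo_globals *
alloc_demo_globals (void)
{
  size_t regs_off = round_up (sizeof (struct demo_globals),
                              _Alignof (struct demo_regs));
  size_t ira_off = round_up (regs_off + sizeof (struct demo_regs),
                             _Alignof (struct demo_ira));
  size_t total = ira_off + sizeof (struct demo_ira);
  char *block = calloc (1, total);

  if (block == NULL)
    return NULL;

  struct demo_globals *g = (struct demo_globals *) block;
  g->regs = (struct demo_regs *) (block + regs_off);
  g->ira = (struct demo_ira *) (block + ira_off);
  return g;
}
```

Structures that themselves hold pointers to further heap data would need the
same treatment applied recursively, as noted above for the IRA and LRA
structures.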


Re: [PATCH] Time profiler - phase 2

2013-11-16 Thread Jan Hubicka
 diff --git a/gcc/ChangeLog b/gcc/ChangeLog
 index c566a85..1562098 100644
 --- a/gcc/ChangeLog
 +++ b/gcc/ChangeLog
 @@ -1,3 +1,15 @@
 +2013-11-13  Martin Liska  marxin.li...@gmail.com
 + Jan Hubicka  j...@suse.cz
 +
 + * cgraphunit.c (node_cmp): New function.
 + (expand_all_functions): Function ordering added.
 + * common.opt: New profile based function reordering flag introduced.
 + * coverage.c (get_coverage_counts): Wrong profile handled.
 + * ipa.c (cgraph_externally_visible_p): New late flag introduced.
 + * lto-partition.c: Support for time profile added.
 + * lto.c: Likewise.
 + * value-prof.c: Histogram instrumentation switch added.
 +
  2013-11-13  Vladimir Makarov  vmaka...@redhat.com
  
   PR rtl-optimization/59036
 diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
 index 4765e6a..7cdd9a4 100644
 --- a/gcc/cgraphunit.c
 +++ b/gcc/cgraphunit.c
 @@ -1821,6 +1821,17 @@ expand_function (struct cgraph_node *node)
    ipa_remove_all_references (&node->ref_list);
  }
  
 +/* Node comparer that is responsible for the order that corresponds
 +   to time when a function was launched for the first time.  */
 +
 +static int
 +node_cmp (const void *pa, const void *pb)
 +{
 +  const struct cgraph_node *a = *(const struct cgraph_node * const *) pa;
 +  const struct cgraph_node *b = *(const struct cgraph_node * const *) pb;
 +
 +  return b->tp_first_run - a->tp_first_run;

Please stabilize this by using node->order when tp_first_run is equal.
Later we ought to use a better heuristic here, but order may be good enough to
start with.
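
A minimal sketch of such a stabilized comparator, on a cut-down stand-in for
struct cgraph_node (the struct and function names here are hypothetical).
The sort direction is chosen for illustration only and is ascending, unlike
the quoted hunk.  Comparisons are used instead of subtraction so the result
cannot overflow.

```c
#include <stdlib.h>

/* Cut-down stand-in for struct cgraph_node; only the two fields used
   by the comparison are modelled.  */
struct demo_node {
  int tp_first_run;  /* time of first execution from the profile */
  int order;         /* unique creation order, used as tie-breaker */
};

/* qsort callback over an array of node pointers.  The primary key is
   tp_first_run; equal keys fall back to order, so the result does not
   depend on how qsort treats ties.  */
int
demo_node_cmp (const void *pa, const void *pb)
{
  const struct demo_node *a = *(const struct demo_node *const *) pa;
  const struct demo_node *b = *(const struct demo_node *const *) pb;

  if (a->tp_first_run != b->tp_first_run)
    return (a->tp_first_run > b->tp_first_run)
           - (a->tp_first_run < b->tp_first_run);
  return (a->order > b->order) - (a->order < b->order);
}
```
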
 diff --git a/gcc/ipa.c b/gcc/ipa.c
 index a11b1c7..d92a332 100644
 --- a/gcc/ipa.c
 +++ b/gcc/ipa.c
 @@ -761,10 +761,14 @@ cgraph_externally_visible_p (struct cgraph_node *node,
   This improves code quality and we know we will duplicate them at most 
 twice
   (in the case that we are not using plugin and link with object file
implementing same COMDAT)  */
 -  if ((in_lto_p || whole_program)
 -      && DECL_COMDAT (node->decl)
 -      && comdat_can_be_unshared_p (node))
 -    return false;
 +  if ((in_lto_p || whole_program || profile_arc_flag)
 +      && DECL_COMDAT (node->decl)
 +      && comdat_can_be_unshared_p (node))
 +    {
 +      gcc_checking_assert (cgraph_function_body_availability (node)
 +                           > AVAIL_OVERWRITABLE);
 +      return false;
 +    }
  
/* When doing link time optimizations, hidden symbols become local.  */
if (in_lto_p
 @@ -932,7 +936,7 @@ function_and_variable_visibility (bool whole_program)
   }
       gcc_assert ((!DECL_WEAK (node->decl)
                    && !DECL_COMDAT (node->decl))
 -                 || TREE_PUBLIC (node->decl)
 +                 || TREE_PUBLIC (node->decl)
                   || node->weakref
                   || DECL_EXTERNAL (node->decl));
       if (cgraph_externally_visible_p (node, whole_program))
 @@ -949,7 +953,7 @@ function_and_variable_visibility (bool whole_program)
           && node->definition && !node->weakref
           && !DECL_EXTERNAL (node->decl))
         {
 -         gcc_assert (whole_program || in_lto_p
 +         gcc_assert (whole_program || in_lto_p || profile_arc_flag
                       || !TREE_PUBLIC (node->decl));
           node->unique_name = ((node->resolution == LDPR_PREVAILING_DEF_IRONLY
                                 || node->resolution == LDPR_PREVAILING_DEF_IRONLY_EXP)

These changes are unrelated, please remove them.
 @@ -395,6 +397,20 @@ node_cmp (const void *pa, const void *pb)
  {
const struct cgraph_node *a = *(const struct cgraph_node * const *) pa;
const struct cgraph_node *b = *(const struct cgraph_node * const *) pb;
 +
 +  /* Profile reorder flag enables function reordering based on first execution
 +     of a function. All functions with profile are placed in ascending
 +     order at the beginning.  */
 +
 +  if (flag_profile_reorder_functions)
&& a->tp_first_run != b->tp_first_run
 +  {
 +    if (a->tp_first_run && b->tp_first_run)
 +      return a->tp_first_run - b->tp_first_run;
 +
 +    if (a->tp_first_run || b->tp_first_run)
 +      return b->tp_first_run - a->tp_first_run;

Drop a comment explaining the logic here ;)
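
For what it's worth, here is a sketch of the hunk's logic on bare integers
with the sort of comments being asked for.  It assumes that a tp_first_run of
0 means "no time profile", which is how the hunk appears to test the field;
the function name is made up.

```c
/* Returns <0, 0 or >0 like a qsort callback; 0 means "fall through to
   the existing ordering".  */
int
first_run_cmp (int a_first_run, int b_first_run)
{
  /* Both profiled: the earlier first run sorts first.  */
  if (a_first_run && b_first_run)
    return a_first_run - b_first_run;

  /* Exactly one profiled: b - a is negative when only A has a profile,
     so the profiled function sorts before the unprofiled one.  */
  if (a_first_run || b_first_run)
    return b_first_run - a_first_run;

  /* Neither profiled.  */
  return 0;
}
```
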
 @@ -449,7 +465,7 @@ void
  lto_balanced_map (void)
  {
int n_nodes = 0;
 -  int n_varpool_nodes = 0, varpool_pos = 0, best_varpool_pos = 0;
 +  int n_varpool_nodes = 0, varpool_pos = 0;
struct cgraph_node **order = XNEWVEC (struct cgraph_node *, 
 cgraph_max_uid);
struct varpool_node **varpool_order = NULL;
int i;
 @@ -481,10 +497,13 @@ lto_balanced_map (void)
   get better about minimizing the function bounday, but until that
   things works smoother if we order in source order.  */
qsort (order, n_nodes, sizeof (struct cgraph_node *), node_cmp);
 +
 +  if (cgraph_dump_file)
 +    for(i = 0; i < n_nodes; i++)
 +      fprintf (cgraph_dump_file, "Balanced map symbol order:%s:%u\n", 
 cgraph_node_asm_name 

[PowerPC] libffi fixes and support for PowerPC64 ELFv2

2013-11-16 Thread Alan Modra
The following six patches correspond to patches posted to the libffi
mailing list a few days ago to add support for PowerPC64 ELFv2.  The
patch series has been tested on powerpc-linux, powerpc64-linux,
powerpc64le-linux and powerpc-freebsd by running the libffi testsuite,
and on powerpc64-linux and powerpc64le-linux by gcc bootstrap and
regression testing.  I guess the normal procedure would be to wait for
upstream approval before applying here, but since Uli's gcc support
for ELFv2 is in, it would be nice to have a working libffi along with
that.

-- 
Alan Modra
Australia Development Lab, IBM


Reinstate powerpc bounce buffer copying in ffi.c

2013-11-16 Thread Alan Modra
The first patch in the series is a little different to the
corresponding upstream libffi patch, because there I needed to revert
some fixes first.  The second patch in the series is entirely missing
due to the testsuite already being fixed in gcc.

This patch properly copies the bounce buffer to destination, and only
uses the bounce buffer for FFI_SYSV.  I also fix an accounting error
in integer register usage.

* src/powerpc/ffi.c (ffi_prep_cif_machdep): Do not consume an
int arg when returning a small struct for FFI_SYSV ABI.
(ffi_call): Only use bounce buffer when FLAG_RETURNS_SMST.
Properly copy bounce buffer to destination.
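
As a rough illustration of the bounce-buffer idea (not the code in the
patch): the callee stores both return registers into an 8-byte scratch
buffer, and only the bytes that belong to the struct are then copied to the
caller's destination.  The helper name and the offset parameter are
hypothetical; which offset is right depends on the ABI and endianness and is
deliberately left to the caller here.

```c
#include <stddef.h>
#include <string.h>

/* Copy SIZE bytes, starting at byte OFFSET within BOUNCE, into DEST.
   Returns 0 on success and -1 if the request does not fit inside the
   two-word buffer.  */
int
copy_from_bounce (void *dest, const unsigned int bounce[2],
                  size_t offset, size_t size)
{
  if (size > 2 * sizeof (unsigned int)
      || offset > 2 * sizeof (unsigned int) - size)
    return -1;

  memcpy (dest, (const char *) bounce + offset, size);
  return 0;
}
```

The point of the indirection is that a struct smaller than 8 bytes never has
bytes written past its end, however the register contents are stored.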

diff -urp gcc-virgin/libffi/src/powerpc/ffi.c gcc1/libffi/src/powerpc/ffi.c
--- gcc-virgin/libffi/src/powerpc/ffi.c 2013-06-25 09:36:39.259402853 +0930
+++ gcc1/libffi/src/powerpc/ffi.c   2013-11-15 23:06:57.313036827 +1030
@@ -691,7 +691,7 @@
 case FFI_TYPE_STRUCT:
   /*
* The final SYSV ABI says that structures smaller or equal 8 bytes
-   * are returned in r3/r4. The FFI_GCC_SYSV ABI instead returns them
+   * are returned in r3/r4.  The FFI_GCC_SYSV ABI instead returns them
* in memory.
*
* NOTE: The assembly code can safely assume that it just needs to
@@ -700,7 +700,10 @@
*   set.
*/
   if (cif->abi == FFI_SYSV && size <= 8)
-   flags |= FLAG_RETURNS_SMST;
+   {
+ flags |= FLAG_RETURNS_SMST;
+ break;
+   }
   intarg_count++;
   flags |= FLAG_RETVAL_REFERENCE;
   /* Fall through.  */
@@ -919,30 +922,25 @@
 {
   /*
* The final SYSV ABI says that structures smaller or equal 8 bytes
-   * are returned in r3/r4. The FFI_GCC_SYSV ABI instead returns them
+   * are returned in r3/r4.  The FFI_GCC_SYSV ABI instead returns them
* in memory.
*
-   * Just to keep things simple for the assembly code, we will always
-   * bounce-buffer struct return values less than or equal to 8 bytes.
-   * This allows the ASM to handle SYSV small structures by directly
-   * writing r3 and r4 to memory without worrying about struct size.
+   * We bounce-buffer SYSV small struct return values so that sysv.S
+   * can write r3 and r4 to memory without worrying about struct size.
*/
   unsigned int smst_buffer[2];
   extended_cif ecif;
-  unsigned int rsize = 0;
 
   ecif.cif = cif;
   ecif.avalue = avalue;
 
-  /* Ensure that we have a valid struct return value */
   ecif.rvalue = rvalue;
-  if (cif->rtype->type == FFI_TYPE_STRUCT) {
-    rsize = cif->rtype->size;
-    if (rsize <= 8)
-      ecif.rvalue = smst_buffer;
-    else if (!rvalue)
-      ecif.rvalue = alloca(rsize);
-  }
+  if ((cif->flags & FLAG_RETURNS_SMST) != 0)
+    ecif.rvalue = smst_buffer;
+  /* Ensure that we have a valid struct return value.
+     FIXME: Isn't this just papering over a user problem?  */
+  else if (!rvalue && cif->rtype->type == FFI_TYPE_STRUCT)
+    ecif.rvalue = alloca (cif->rtype->size);
 
   switch (cif-abi)
 {
@@ -967,7 +965,21 @@
 
   /* Check for a bounce-buffered return value */
   if (rvalue && ecif.rvalue == smst_buffer)
-    memcpy(rvalue, smst_buffer, rsize);
+    {
+      unsigned int rsize = cif->rtype->size;
+#ifndef __LITTLE_ENDIAN__
+      /* The SYSV ABI returns a structure of up to 4 bytes in size
+         left-padded in r3.  */
+      if (rsize <= 4)
+        memcpy (rvalue, (char *) smst_buffer + 4 - rsize, rsize);
+      /* The SYSV ABI returns a structure of up to 8 bytes in size
+         left-padded in r3/r4.  */
+      else if (rsize <= 8)
+        memcpy (rvalue, (char *) smst_buffer + 8 - rsize, rsize);
+      else
+#endif
+        memcpy (rvalue, smst_buffer, rsize);
+}
 }
 
 

-- 
Alan Modra
Australia Development Lab, IBM


libffi doc fixes

2013-11-16 Thread Alan Modra
This enshrines the current testsuite practice of using ffi_arg for
returned values.  It would be reasonable and logical to use the actual
return argument type as passed to ffi_prep_cif, but this would mean
changing a large number of tests that use ffi_arg and all backends
that write results to an ffi_arg.

* doc/libffi.texi: Correct example code.

diff -urp gcc1/libffi/doc/libffi.texi gcc3/libffi/doc/libffi.texi
--- gcc1/libffi/doc/libffi.texi 2013-06-13 21:03:53.0 +0930
+++ gcc3/libffi/doc/libffi.texi 2013-11-15 23:16:06.811643952 +1030
@@ -214,7 +214,7 @@ int main()
   ffi_type *args[1];
   void *values[1];
   char *s;
-  int rc;
+  ffi_arg rc;
   
   /* Initialize the argument info vectors */
   args[0] = &ffi_type_pointer;
@@ -222,7 +222,7 @@ int main()
   
   /* Initialize the cif */
   if (ffi_prep_cif(&cif, FFI_DEFAULT_ABI, 1, 
-                      &ffi_type_uint, args) == FFI_OK)
+                      &ffi_type_sint, args) == FFI_OK)
     @{
       s = "Hello World!";
       ffi_call(&cif, puts, &rc, values);
@@ -414,6 +414,7 @@ Here is the corresponding code to descri
   int i;
 
   tm_type.size = tm_type.alignment = 0;
+  tm_type.type = FFI_TYPE_STRUCT;
   tm_type.elements = tm_type_elements;
 
   for (i = 0; i < 9; i++)
@@ -540,7 +541,7 @@ A trivial example that creates a new @co
 #include <ffi.h>
 
 /* Acts like puts with the file given at time of enclosure. */
-void puts_binding(ffi_cif *cif, unsigned int *ret, void* args[], 
+void puts_binding(ffi_cif *cif, ffi_arg *ret, void* args[], 
   FILE *stream)
 @{
   *ret = fputs(*(char **)args[0], stream);
@@ -565,7 +566,7 @@ int main()
 
   /* Initialize the cif */
       if (ffi_prep_cif(&cif, FFI_DEFAULT_ABI, 1,
-                       &ffi_type_uint, args) == FFI_OK)
+                       &ffi_type_sint, args) == FFI_OK)
         @{
           /* Initialize the closure, setting stream to stdout */
           if (ffi_prep_closure_loc(closure, &cif, puts_binding, 

-- 
Alan Modra
Australia Development Lab, IBM


Pass floating point values on powerpc64 as per ABI

2013-11-16 Thread Alan Modra
The powerpc64 support opted to pass floating point values both in the
fpr area and the parameter save area, necessary when the backend
doesn't know if a function argument corresponds to the ellipsis
arguments of a variadic function.  This patch adds powerpc support for
variadic functions, and changes the code to only pass fp in the ABI
mandated area.  ELFv2 needs this change since the parameter save area
may not exist there.

This also fixes two faulty tests that used a non-variadic function
cast to call a variadic function, and spuriously reasoned that this is
somehow necessary for static functions.

The whitespace changes, and comment changes in the tests, are to make
the gcc versions of these files mirror upstream libffi.
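
The placement rule for one floating-point argument can be sketched on its
own as below.  This is a model, not the patched code: fp_arg_home and the
fp_home enum are hypothetical names, and the value 13 for
NUM_FPR_ARG_REGISTERS64 is an assumption (f1 to f13).

```c
#include <assert.h>

/* Assumed value: the powerpc64 ABIs pass floating point in f1..f13.  */
enum { NUM_FPR_ARG_REGISTERS64 = 13 };

enum fp_home { FP_HOME_FPR, FP_HOME_PARAM_SAVE };

/* Decide where argument number I (0-based), a floating-point value,
   is passed.  FPARG_COUNT is the number of fp arguments already placed;
   NFIXEDARGS is the number of named (non-ellipsis) arguments.  */
static enum fp_home
fp_arg_home (unsigned int fparg_count, unsigned int i,
             unsigned int nargs, unsigned int nfixedargs)
{
  assert (i < nargs && nfixedargs <= nargs);
  if (fparg_count < NUM_FPR_ARG_REGISTERS64 && i < nfixedargs)
    return FP_HOME_FPR;
  /* Ellipsis arguments, and fp arguments past the last FPR, go to the
     parameter save area (or the GPRs that shadow it).  */
  return FP_HOME_PARAM_SAVE;
}
```

For a non-variadic call nfixedargs equals nargs, so the second condition
only matters for the ellipsis part of a variadic call.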

* src/powerpc/ffitarget.h (FFI_TARGET_SPECIFIC_VARIADIC): Define.
(FFI_EXTRA_CIF_FIELDS): Define.
* src/powerpc/ffi.c (ffi_prep_args64): Save fprs as per the
ABI, not to both fpr and param save area.
(ffi_prep_cif_machdep_core): Renamed from ffi_prep_cif_machdep.
Keep initial flags.  Formatting.  Remove dead FFI_LINUX_SOFT_FLOAT
code.
(ffi_prep_cif_machdep, ffi_prep_cif_machdep_var): New functions.
(ffi_closure_helper_LINUX64): Pass floating point as per ABI,
not to both fpr and parameter save areas.

* libffi/testsuite/libffi.call/cls_double_va.c (main): Correct
function cast and don't call ffi_prep_cif.
* libffi/testsuite/libffi.call/cls_longdouble_va.c (main): Likewise.

diff -urp gcc3/libffi/src/powerpc/ffitarget.h gcc4/libffi/src/powerpc/ffitarget.h
--- gcc3/libffi/src/powerpc/ffitarget.h 2013-11-15 23:03:07.313959745 +1030
+++ gcc4/libffi/src/powerpc/ffitarget.h 2013-11-15 23:19:21.692053339 +1030
@@ -106,6 +106,10 @@ typedef enum ffi_abi {
 
 #define FFI_CLOSURES 1
 #define FFI_NATIVE_RAW_API 0
+#if defined (POWERPC) || defined (POWERPC_FREEBSD)
+# define FFI_TARGET_SPECIFIC_VARIADIC 1
+# define FFI_EXTRA_CIF_FIELDS unsigned nfixedargs
+#endif
 
 /* For additional types like the below, take care about the order in
ppc_closures.S. They must follow after the FFI_TYPE_LAST.  */
diff -urp gcc3/libffi/src/powerpc/ffi.c gcc4/libffi/src/powerpc/ffi.c
--- gcc3/libffi/src/powerpc/ffi.c   2013-11-15 23:06:57.313036827 +1030
+++ gcc4/libffi/src/powerpc/ffi.c   2013-11-15 23:47:24.402296569 +1030
@@ -443,9 +443,9 @@ ffi_prep_args64 (extended_cif *ecif, uns
   /* 'fpr_base' points at the space for fpr3, and grows upwards as
  we use FPR registers.  */
   valp fpr_base;
-  int fparg_count;
+  unsigned int fparg_count;
 
-  int i, words;
+  unsigned int i, words, nargs, nfixedargs;
   ffi_type **ptr;
   double double_tmp;
   union {
@@ -482,30 +482,34 @@ ffi_prep_args64 (extended_cif *ecif, uns
 
   /* Now for the arguments.  */
   p_argv.v = ecif->avalue;
-  for (ptr = ecif->cif->arg_types, i = ecif->cif->nargs;
-       i > 0;
-       i--, ptr++, p_argv.v++)
+  nargs = ecif->cif->nargs;
+  nfixedargs = ecif->cif->nfixedargs;
+  for (ptr = ecif->cif->arg_types, i = 0;
+       i < nargs;
+       i++, ptr++, p_argv.v++)
     {
       switch ((*ptr)->type)
        {
        case FFI_TYPE_FLOAT:
          double_tmp = **p_argv.f;
-         *next_arg.f = (float) double_tmp;
+         if (fparg_count < NUM_FPR_ARG_REGISTERS64 && i < nfixedargs)
+           *fpr_base.d++ = double_tmp;
+         else
+           *next_arg.f = (float) double_tmp;
          if (++next_arg.ul == gpr_end.ul)
            next_arg.ul = rest.ul;
-         if (fparg_count < NUM_FPR_ARG_REGISTERS64)
-           *fpr_base.d++ = double_tmp;
          fparg_count++;
          FFI_ASSERT (flags & FLAG_FP_ARGUMENTS);
          break;
 
        case FFI_TYPE_DOUBLE:
          double_tmp = **p_argv.d;
-         *next_arg.d = double_tmp;
+         if (fparg_count < NUM_FPR_ARG_REGISTERS64 && i < nfixedargs)
+           *fpr_base.d++ = double_tmp;
+         else
+           *next_arg.d = double_tmp;
          if (++next_arg.ul == gpr_end.ul)
            next_arg.ul = rest.ul;
-         if (fparg_count < NUM_FPR_ARG_REGISTERS64)
-           *fpr_base.d++ = double_tmp;
          fparg_count++;
          FFI_ASSERT (flags & FLAG_FP_ARGUMENTS);
          break;
@@ -513,18 +517,20 @@ ffi_prep_args64 (extended_cif *ecif, uns
 #if FFI_TYPE_LONGDOUBLE != FFI_TYPE_DOUBLE
        case FFI_TYPE_LONGDOUBLE:
          double_tmp = (*p_argv.d)[0];
-         *next_arg.d = double_tmp;
+         if (fparg_count < NUM_FPR_ARG_REGISTERS64 && i < nfixedargs)
+           *fpr_base.d++ = double_tmp;
+         else
+           *next_arg.d = double_tmp;
          if (++next_arg.ul == gpr_end.ul)
            next_arg.ul = rest.ul;
-         if (fparg_count < NUM_FPR_ARG_REGISTERS64)
-           *fpr_base.d++ = double_tmp;
          fparg_count++;
          double_tmp = (*p_argv.d)[1];
-         *next_arg.d = double_tmp;
+         if (fparg_count < NUM_FPR_ARG_REGISTERS64 && i < nfixedargs)
+   *fpr_base.d++ = double_tmp;
+ else
+   *next_arg.d = 

Support PowerPC64 ELFv2 ABI

2013-11-16 Thread Alan Modra
Finally, this adds _CALL_ELF == 2 support.  ELFv1 objects can't be
linked with ELFv2 objects, so this is one case where preprocessor
tests in ffi.c are fine.  Also, there is no need to define a new
FFI_ELFv2 or somesuch value in enum ffi_abi.  FFI_LINUX64 will happily
serve both ABIs.

* src/powerpc/ffitarget.h (FFI_V2_TYPE_FLOAT_HOMOG,
FFI_V2_TYPE_DOUBLE_HOMOG, FFI_V2_TYPE_SMALL_STRUCT): Define.
(FFI_TRAMPOLINE_SIZE): Define variant for ELFv2.
* src/powerpc/ffi.c (FLAG_ARG_NEEDS_PSAVE): Define.
(discover_homogeneous_aggregate): New function.
(ffi_prep_args64): Adjust start of param save area for ELFv2.
Handle homogenous floating point struct parms.
(ffi_prep_cif_machdep_core): Adjust space calculation for ELFv2.
Handle ELFv2 return values.  Set FLAG_ARG_NEEDS_PSAVE.  Handle
homogenous floating point structs.
(ffi_call): Increase size of smst_buffer for ELFv2.  Handle ELFv2.
(flush_icache): Compile for ELFv2.
(ffi_prep_closure_loc): Set up ELFv2 trampoline.
(ffi_closure_helper_LINUX64): Don't return all structs directly
to caller.  Handle homogenous floating point structs.  Handle
ELFv2 struct return values.
* src/powerpc/linux64.S (ffi_call_LINUX64): Set up r2 for
ELFv2.  Adjust toc save location.  Call function pointer using
r12.  Handle FLAG_RETURNS_SMST.  Don't predict branches.
* src/powerpc/linux64_closure.S (ffi_closure_LINUX64): Set up r2
for ELFv2.  Define ELFv2 versions of STACKFRAME, PARMSAVE, and
RETVAL.  Handle possibly missing parameter save area.  Handle
ELFv2 return values.
(.note.GNU-stack): Move inside outer #ifdef.
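
As an illustration of the homogeneous aggregate test that the ELFv2 support
relies on, here is a self-contained sketch using toy types rather than
ffi_type.  The names toy_kind, toy_type and toy_homogeneous are hypothetical;
the limit of eight elements is the one used by the patch.

```c
#include <assert.h>
#include <stddef.h>

enum toy_kind { TOY_INT = 1, TOY_FLOAT, TOY_DOUBLE, TOY_STRUCT };

/* Toy stand-in for ffi_type: ELEMENTS is a NULL-terminated array and is
   only used for TOY_STRUCT.  */
struct toy_type
{
  enum toy_kind kind;
  const struct toy_type **elements;
};

/* Return the common floating-point kind of T and store the number of
   elements in *ELNUM, or return 0 if T is not a homogeneous float or
   double aggregate of at most eight elements.  */
static unsigned int
toy_homogeneous (const struct toy_type *t, unsigned int *elnum)
{
  assert (t != NULL && elnum != NULL);
  switch (t->kind)
    {
    case TOY_FLOAT:
    case TOY_DOUBLE:
      *elnum = 1;
      return t->kind;

    case TOY_STRUCT:
      {
        unsigned int base = 0, total = 0;
        const struct toy_type **el;

        for (el = t->elements; *el; el++)
          {
            unsigned int n = 0;
            unsigned int kind = toy_homogeneous (*el, &n);

            /* Reject non-fp members and mixed float/double members.  */
            if (kind == 0 || (base && base != kind))
              return 0;
            base = kind;
            total += n;
            if (total > 8)
              return 0;
          }
        *elnum = total;
        return base;
      }

    default:
      return 0;
    }
}
```

Nested structs are flattened by the recursion, so a struct holding a pair of
doubles plus one more double counts as three doubles.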

diff -urp gcc6/libffi/src/powerpc/ffitarget.h gcc7/libffi/src/powerpc/ffitarget.h
--- gcc6/libffi/src/powerpc/ffitarget.h 2013-11-15 23:19:21.692053339 +1030
+++ gcc7/libffi/src/powerpc/ffitarget.h 2013-11-15 23:48:02.452807673 +1030
@@ -122,14 +122,23 @@ typedef enum ffi_abi {
defined in ffi.c, to determine the exact return type and its size.  */
 #define FFI_SYSV_TYPE_SMALL_STRUCT (FFI_TYPE_LAST + 2)
 
-#if defined(POWERPC64) || defined(POWERPC_AIX)
+/* Used by ELFv2 for homogenous structure returns.  */
+#define FFI_V2_TYPE_FLOAT_HOMOG    (FFI_TYPE_LAST + 1)
+#define FFI_V2_TYPE_DOUBLE_HOMOG   (FFI_TYPE_LAST + 2)
+#define FFI_V2_TYPE_SMALL_STRUCT   (FFI_TYPE_LAST + 3)
+
+#if _CALL_ELF == 2
+# define FFI_TRAMPOLINE_SIZE 32
+#else
+# if defined(POWERPC64) || defined(POWERPC_AIX)
 #  if defined(POWERPC_DARWIN64)
 #define FFI_TRAMPOLINE_SIZE 48
 #  else
 #define FFI_TRAMPOLINE_SIZE 24
 #  endif
-#else /* POWERPC || POWERPC_AIX */
+# else /* POWERPC || POWERPC_AIX */
 #  define FFI_TRAMPOLINE_SIZE 40
+# endif
 #endif
 
 #ifndef LIBFFI_ASM
diff -urp gcc6/libffi/src/powerpc/ffi.c gcc7/libffi/src/powerpc/ffi.c
--- gcc6/libffi/src/powerpc/ffi.c   2013-11-15 23:47:40.153680507 +1030
+++ gcc7/libffi/src/powerpc/ffi.c   2013-11-15 23:51:02.333766929 +1030
@@ -49,6 +49,7 @@ enum {
   FLAG_RETURNS_128BITS  = 1 << (31-27), /* cr6  */
 
   FLAG_ARG_NEEDS_COPY   = 1 << (31- 7),
+  FLAG_ARG_NEEDS_PSAVE  = FLAG_ARG_NEEDS_COPY, /* Used by ELFv2 */
 #ifndef __NO_FPRS__
   FLAG_FP_ARGUMENTS     = 1 << (31- 6), /* cr1.eq; specified by ABI */
 #endif
@@ -383,6 +384,45 @@ enum {
 };
 enum { ASM_NEEDS_REGISTERS64 = 4 };
 
+#if _CALL_ELF == 2
+static unsigned int
+discover_homogeneous_aggregate (const ffi_type *t, unsigned int *elnum)
+{
+  switch (t->type)
+    {
+    case FFI_TYPE_FLOAT:
+    case FFI_TYPE_DOUBLE:
+      *elnum = 1;
+      return (int) t->type;
+
+    case FFI_TYPE_STRUCT:;
+      {
+       unsigned int base_elt = 0, total_elnum = 0;
+       ffi_type **el = t->elements;
+       while (*el)
+         {
+           unsigned int el_elt, el_elnum = 0;
+           el_elt = discover_homogeneous_aggregate (*el, &el_elnum);
+           if (el_elt == 0
+               || (base_elt && base_elt != el_elt))
+             return 0;
+           base_elt = el_elt;
+           total_elnum += el_elnum;
+           if (total_elnum > 8)
+             return 0;
+   el++;
+ }
+   *elnum = total_elnum;
+   return base_elt;
+  }
+
+default:
+  return 0;
+}
+}
+#endif
+
+
 /* ffi_prep_args64 is called by the assembly routine once stack space
has been allocated for the function's arguments.
 
@@ -470,7 +510,11 @@ ffi_prep_args64 (extended_cif *ecif, uns
   stacktop.c = (char *) stack + bytes;
   gpr_base.ul = stacktop.ul - ASM_NEEDS_REGISTERS64 - NUM_GPR_ARG_REGISTERS64;
   gpr_end.ul = gpr_base.ul + NUM_GPR_ARG_REGISTERS64;
+#if _CALL_ELF == 2
+  rest.ul = stack + 4 + NUM_GPR_ARG_REGISTERS64;
+#else
   rest.ul = stack + 6 + NUM_GPR_ARG_REGISTERS64;
+#endif
   fpr_base.d = gpr_base.d - NUM_FPR_ARG_REGISTERS64;
   fparg_count = 0;
   next_arg.ul = gpr_base.ul;
@@ -492,6 +536,8 @@ ffi_prep_args64 (extended_cif *ecif, uns
        i < nargs;

Tidy powerpc linux64_closure.S with defines for stack offsets

2013-11-16 Thread Alan Modra
This patch prepares for ELFv2, where sizes of these areas change.  It
also makes some minor changes to improve code efficiency.

* src/powerpc/linux64.S (ffi_call_LINUX64): Tweak restore of r28.
(.note.GNU-stack): Move inside outer #ifdef.
* src/powerpc/linux64_closure.S (STACKFRAME, PARMSAVE,
RETVAL): Define and use throughout.
(ffi_closure_LINUX64): Save fprs before buying stack.
(.note.GNU-stack): Move inside outer #ifdef.
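
The frame-size arithmetic behind the new defines can be checked with a few
lines of C.  This is only a worked example of the numbers quoted in the
patch comment; the enumerator names other than PARMSAVE and RETVAL, and the
helper functions, are hypothetical.

```c
#include <assert.h>

enum
{
  SPECIAL_REG_SAVE = 48,        /* special reg save area */
  PARM_SAVE_SIZE = 64,          /* eight 8-byte parameter slots */
  RETVAL_SIZE = 16,             /* retval area */
  FPR_SAVE_SIZE = 13 * 8,       /* f1..f13 */

  PARMSAVE = SPECIAL_REG_SAVE,
  RETVAL = PARMSAVE + PARM_SAVE_SIZE
};

/* Round N up to a multiple of ALIGN, which must be a power of two.  */
static unsigned int
round_up (unsigned int n, unsigned int align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return (n + align - 1) & ~(align - 1);
}

/* The ELFv1 closure frame size as described in the patch comment.  */
static unsigned int
elfv1_closure_stackframe (void)
{
  return round_up (SPECIAL_REG_SAVE + PARM_SAVE_SIZE
                   + RETVAL_SIZE + FPR_SAVE_SIZE, 16);
}
```

The areas sum to 232 bytes, which rounds up to the STACKFRAME value of 240.
Storing the fprs at -104 from the caller's r1 before the stdu places them in
the top 104 bytes of that frame.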

diff -urp gcc4/libffi/src/powerpc/linux64.S gcc5/libffi/src/powerpc/linux64.S
--- gcc4/libffi/src/powerpc/linux64.S   2013-11-15 23:03:07.337958821 +1030
+++ gcc5/libffi/src/powerpc/linux64.S   2013-11-15 23:37:49.672792802 +1030
@@ -130,7 +130,7 @@ ffi_call_LINUX64:
/* Restore the registers we used and return.  */
mr  %r1, %r28
ld  %r0, 16(%r28)
-   ld  %r28, -32(%r1)
+   ld  %r28, -32(%r28)
        mtlr %r0
ld  %r29, -24(%r1)
ld  %r30, -16(%r1)
@@ -197,8 +197,8 @@ ffi_call_LINUX64:
.uleb128 0x4
.align 3
 .LEFDE1:
-#endif
 
-#if defined __ELF__ && defined __linux__
+# if (defined __ELF__ && defined __linux__) || _CALL_ELF == 2
        .section .note.GNU-stack,"",@progbits
+# endif
 #endif
diff -urp gcc4/libffi/src/powerpc/linux64_closure.S gcc5/libffi/src/powerpc/linux64_closure.S
--- gcc4/libffi/src/powerpc/linux64_closure.S   2013-11-15 23:03:07.333958973 +1030
+++ gcc5/libffi/src/powerpc/linux64_closure.S   2013-11-15 23:37:49.672792802 +1030
@@ -50,53 +50,57 @@ ffi_closure_LINUX64:
.text
 .ffi_closure_LINUX64:
 #endif
+
+#  48 bytes special reg save area + 64 bytes parm save area
+#  + 16 bytes retval area + 13*8 bytes fpr save area + round to 16
+#  define STACKFRAME 240
+#  define PARMSAVE 48
+#  define RETVAL PARMSAVE+64
+
 .LFB1:
-   # save general regs into parm save area
-   std %r3, 48(%r1)
-   std %r4, 56(%r1)
-   std %r5, 64(%r1)
-   std %r6, 72(%r1)
        mflr %r0
+   # Save general regs into parm save area
+   # This is the parameter save area set up by our caller.
+   std %r3, PARMSAVE+0(%r1)
+   std %r4, PARMSAVE+8(%r1)
+   std %r5, PARMSAVE+16(%r1)
+   std %r6, PARMSAVE+24(%r1)
+   std %r7, PARMSAVE+32(%r1)
+   std %r8, PARMSAVE+40(%r1)
+   std %r9, PARMSAVE+48(%r1)
+   std %r10, PARMSAVE+56(%r1)
 
-   std %r7, 80(%r1)
-   std %r8, 88(%r1)
-   std %r9, 96(%r1)
-   std %r10, 104(%r1)
std %r0, 16(%r1)
 
-   # mandatory 48 bytes special reg save area + 64 bytes parm save area
-   # + 16 bytes retval area + 13*8 bytes fpr save area + round to 16
-   stdu %r1, -240(%r1)
-.LCFI0:
+   # load up the pointer to the parm save area
+   addi %r5, %r1, PARMSAVE
 
# next save fpr 1 to fpr 13
-   stfd  %f1, 128+(0*8)(%r1)
-   stfd  %f2, 128+(1*8)(%r1)
-   stfd  %f3, 128+(2*8)(%r1)
-   stfd  %f4, 128+(3*8)(%r1)
-   stfd  %f5, 128+(4*8)(%r1)
-   stfd  %f6, 128+(5*8)(%r1)
-   stfd  %f7, 128+(6*8)(%r1)
-   stfd  %f8, 128+(7*8)(%r1)
-   stfd  %f9, 128+(8*8)(%r1)
-   stfd  %f10, 128+(9*8)(%r1)
-   stfd  %f11, 128+(10*8)(%r1)
-   stfd  %f12, 128+(11*8)(%r1)
-   stfd  %f13, 128+(12*8)(%r1)
+   stfd %f1, -104+(0*8)(%r1)
+   stfd %f2, -104+(1*8)(%r1)
+   stfd %f3, -104+(2*8)(%r1)
+   stfd %f4, -104+(3*8)(%r1)
+   stfd %f5, -104+(4*8)(%r1)
+   stfd %f6, -104+(5*8)(%r1)
+   stfd %f7, -104+(6*8)(%r1)
+   stfd %f8, -104+(7*8)(%r1)
+   stfd %f9, -104+(8*8)(%r1)
+   stfd %f10, -104+(9*8)(%r1)
+   stfd %f11, -104+(10*8)(%r1)
+   stfd %f12, -104+(11*8)(%r1)
+   stfd %f13, -104+(12*8)(%r1)
 
-   # set up registers for the routine that actually does the work
-   # get the context pointer from the trampoline
-   mr %r3, %r11
+   # load up the pointer to the saved fpr registers */
+   addi %r6, %r1, -104
 
-   # now load up the pointer to the result storage
-   addi %r4, %r1, 112
+   # load up the pointer to the result storage
+   addi %r4, %r1, -STACKFRAME+RETVAL
 
-   # now load up the pointer to the parameter save area
-   # in the previous frame
-   addi %r5, %r1, 240 + 48
+   stdu %r1, -STACKFRAME(%r1)
+.LCFI0:
 
-   # now load up the pointer to the saved fpr registers */
-   addi %r6, %r1, 128
+   # get the context pointer from the trampoline
+   mr  %r3, %r11
 
# make the call
 #ifdef _CALL_LINUX
@@ -115,7 +119,7 @@ ffi_closure_LINUX64:
mflr %r4# move address of .Lret to r4
sldi %r3, %r3, 4# now multiply return type by 16
addi %r4, %r4, .Lret_type0 - .Lret
-   ld %r0, 240+16(%r1)
+   ld %r0, STACKFRAME+16(%r1)
add %r3, %r3, %r4   # add contents of table to table 

Align powerpc64 structs passed by value as per ABI

2013-11-16 Thread Alan Modra
The powerpc64 ABIs align structs passed by value, a fact ignored by
gcc for quite some time.  Since gcc now does the correct alignment,
libffi needs to follow suit.  This ought to be made selectable via
a new abi value, and the #ifdefs removed from ffi.c along with almost
all the other #ifdefs present there and in assembly.

* src/powerpc/ffi.c (ffi_prep_args64): Align struct parameters
according to __STRUCT_PARM_ALIGN__.
(ffi_prep_cif_machdep_core): Likewise.
(ffi_closure_helper_LINUX64): Likewise.
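
A minimal sketch of the alignment step follows, assuming an ALIGN-style
round-up helper.  The names align_up and align_struct_arg are hypothetical,
and max_align stands in for __STRUCT_PARM_ALIGN__, whose value is supplied by
the compiler and is not assumed here.

```c
#include <assert.h>
#include <stddef.h>

/* Round V up to a multiple of A, which must be a power of two.  */
static size_t
align_up (size_t v, size_t a)
{
  assert (a != 0 && (a & (a - 1)) == 0);
  return ((v - 1) | (a - 1)) + 1;
}

/* Return the address at which a struct argument with alignment
   TYPE_ALIGN is placed, given the next free address NEXT_ARG.  The
   alignment is capped at MAX_ALIGN, a stand-in for
   __STRUCT_PARM_ALIGN__.  */
static size_t
align_struct_arg (size_t next_arg, size_t type_align, size_t max_align)
{
  size_t align = type_align;

  if (align > max_align)
    align = max_align;
  if (align > 1)
    next_arg = align_up (next_arg, align);
  return next_arg;
}
```

In ffi_prep_cif_machdep_core the same cap is applied, but the alignment is
then divided by 8 because that function counts doublewords rather than
bytes.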

diff -urp gcc5/libffi/src/powerpc/ffi.c gcc6/libffi/src/powerpc/ffi.c
--- gcc5/libffi/src/powerpc/ffi.c   2013-11-15 23:47:31.890003986 +1030
+++ gcc6/libffi/src/powerpc/ffi.c   2013-11-15 23:47:40.153680507 +1030
@@ -428,6 +428,7 @@ ffi_prep_args64 (extended_cif *ecif, uns
 unsigned long *ul;
 float *f;
 double *d;
+size_t p;
   } valp;
 
   /* 'stacktop' points at the previous backchain pointer.  */
@@ -462,6 +463,9 @@ ffi_prep_args64 (extended_cif *ecif, uns
 double **d;
   } p_argv;
   unsigned long gprvalue;
+#ifdef __STRUCT_PARM_ALIGN__
+  unsigned long align;
+#endif
 
   stacktop.c = (char *) stack + bytes;
   gpr_base.ul = stacktop.ul - ASM_NEEDS_REGISTERS64 - NUM_GPR_ARG_REGISTERS64;
@@ -538,6 +542,13 @@ ffi_prep_args64 (extended_cif *ecif, uns
 #endif
 
case FFI_TYPE_STRUCT:
+#ifdef __STRUCT_PARM_ALIGN__
+         align = (*ptr)->alignment;
+         if (align > __STRUCT_PARM_ALIGN__)
+           align = __STRUCT_PARM_ALIGN__;
+         if (align > 1)
+           next_arg.p = ALIGN (next_arg.p, align);
+#endif
          words = ((*ptr)->size + 7) / 8;
          if (next_arg.ul >= gpr_base.ul && next_arg.ul + words > gpr_end.ul)
{
@@ -828,6 +839,10 @@ ffi_prep_cif_machdep_core (ffi_cif *cif)
   else
     for (ptr = cif->arg_types, i = cif->nargs; i > 0; i--, ptr++)
   {
+#ifdef __STRUCT_PARM_ALIGN__
+   unsigned int align;
+#endif
+
        switch ((*ptr)->type)
  {
 #if FFI_TYPE_LONGDOUBLE != FFI_TYPE_DOUBLE
@@ -843,6 +858,14 @@ ffi_prep_cif_machdep_core (ffi_cif *cif)
break;
 
  case FFI_TYPE_STRUCT:
+#ifdef __STRUCT_PARM_ALIGN__
+           align = (*ptr)->alignment;
+           if (align > __STRUCT_PARM_ALIGN__)
+             align = __STRUCT_PARM_ALIGN__;
+           align = align / 8;
+           if (align > 1)
+             intarg_count = ALIGN (intarg_count, align);
+#endif
            intarg_count += ((*ptr)->size + 7) / 8;
break;
 
@@ -1383,6 +1406,9 @@ ffi_closure_helper_LINUX64 (ffi_closure
   unsigned long i, avn, nfixedargs;
   ffi_cif *cif;
   ffi_dblfl *end_pfr = pfr + NUM_FPR_ARG_REGISTERS64;
+#ifdef __STRUCT_PARM_ALIGN__
+  unsigned long align;
+#endif
 
   cif = closure->cif;
   avalue = alloca (cif->nargs * sizeof (void *));
@@ -1437,6 +1463,13 @@ ffi_closure_helper_LINUX64 (ffi_closure
  break;
 
case FFI_TYPE_STRUCT:
+#ifdef __STRUCT_PARM_ALIGN__
+         align = arg_types[i]->alignment;
+         if (align > __STRUCT_PARM_ALIGN__)
+           align = __STRUCT_PARM_ALIGN__;
+         if (align > 1)
+           pst = (unsigned long *) ALIGN ((size_t) pst, align);
+#endif
 #ifndef __LITTLE_ENDIAN__
  /* Structures with size less than eight bytes are passed
 left-padded.  */

-- 
Alan Modra
Australia Development Lab, IBM


Committed: arc/constraints.md: simplify Rcq definition

2013-11-16 Thread Joern Rennecke


2013-11-16  Joern Rennecke  <joern.renne...@embecosm.com>

* config/arc/constraints.md (Rcq): Simplify register number test.
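
A quick way to convince oneself that the two register number tests agree is
to compare them exhaustively.  The sketch below does that outside of GCC:
in_range is a stand-in for GCC's IN_RANGE macro, and rcq_old, rcq_new and
check_rcq_equivalence are hypothetical names.  Both forms accept exactly
r0-r3 and r12-r15.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for GCC's IN_RANGE: true if LOWER <= VALUE <= UPPER.  */
static bool
in_range (unsigned int value, unsigned int lower, unsigned int upper)
{
  return value - lower <= upper - lower;
}

/* The register number test before the patch.  */
static bool
rcq_old (unsigned int regno)
{
  return ((((regno & 7) ^ 4) - 4) & 15) == regno;
}

/* The simplified test after the patch.  */
static bool
rcq_new (unsigned int regno)
{
  return in_range (regno ^ 4, 4, 11);
}

/* Check both forms against the intended set for register numbers
   below NREGS.  */
static void
check_rcq_equivalence (unsigned int nregs)
{
  unsigned int r;

  for (r = 0; r < nregs; r++)
    {
      bool expected = r <= 3 || (r >= 12 && r <= 15);

      assert (rcq_old (r) == expected);
      assert (rcq_new (r) == expected);
    }
}
```

The XOR with 4 maps r0-r3 to 4-7 and r12-r15 to 8-11, which is why a single
range check is enough.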

Index: config/arc/constraints.md
===
--- config/arc/constraints.md   (revision 204899)
+++ config/arc/constraints.md   (revision 204900)
@@ -338,7 +338,7 @@ (define_constraint "Rcq"
   (and (match_code "REG")
        (match_test "TARGET_Rcq
                     && !arc_ccfsm_cond_exec_p ()
-                    && ((((REGNO (op) & 7) ^ 4) - 4) & 15) == REGNO (op)")))
+                    && IN_RANGE (REGNO (op) ^ 4, 4, 11)")))
 
 ; If we need a reload, we generally want to steer reload to use three-address
 ; alternatives in preference of two-address alternatives, unless the


Committed: arc.c: Make predication in delay slots explicit

2013-11-16 Thread Joern Rennecke


2013-11-16  Joern Rennecke  <joern.renne...@embecosm.com>
 
* config/arc/arc.c (arc_predicate_delay_insns): New function.
(pass_data_arc_predicate_delay_insns): New pass_data instance.
(pass_arc_predicate_delay_insns): New subclass of rtl_opt_class.
(make_pass_arc_predicate_delay_insns): New function.
(arc_init): Register pass_arc_predicate_delay_insns if
flag_delayed_branch is active.

Index: config/arc/arc.c
===
--- config/arc/arc.c(revision 204900)
+++ config/arc/arc.c(revision 204901)
@@ -632,6 +632,44 @@ make_pass_arc_ifcvt (gcc::context *ctxt)
   return new pass_arc_ifcvt (ctxt);
 }
 
+static unsigned arc_predicate_delay_insns (void);
+
+namespace {
+
+const pass_data pass_data_arc_predicate_delay_insns =
+{
+  RTL_PASS,
+  "arc_predicate_delay_insns", /* name */
+  OPTGROUP_NONE,   /* optinfo_flags */
+  false,   /* has_gate */
+  true,/* has_execute */
+  TV_IFCVT2,   /* tv_id */
+  0,   /* properties_required */
+  0,   /* properties_provided */
+  0,   /* properties_destroyed */
+  0,   /* todo_flags_start */
+  TODO_df_finish   /* todo_flags_finish */
+};
+
+class pass_arc_predicate_delay_insns : public rtl_opt_pass
+{
+public:
+  pass_arc_predicate_delay_insns(gcc::context *ctxt)
+  : rtl_opt_pass(pass_data_arc_predicate_delay_insns, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  unsigned int execute () { return arc_predicate_delay_insns (); }
+};
+
+} // anon namespace
+
+rtl_opt_pass *
+make_pass_arc_predicate_delay_insns (gcc::context *ctxt)
+{
+  return new pass_arc_predicate_delay_insns (ctxt);
+}
+
 /* Called by OVERRIDE_OPTIONS to initialize various things.  */
 
 void
@@ -752,6 +790,16 @@ arc_init (void)
       register_pass (&arc_ifcvt4_info);
       register_pass (&arc_ifcvt5_info);
 }
+
+  if (flag_delayed_branch)
+{
+      opt_pass *pass_arc_predicate_delay_insns
+       = make_pass_arc_predicate_delay_insns (g);
+      struct register_pass_info arc_predicate_delay_info
+       = { pass_arc_predicate_delay_insns, "dbr", 1, PASS_POS_INSERT_AFTER };
+
+      register_pass (&arc_predicate_delay_info);
+}
 }
 
 /* Check ARC options, generate derived target attributes.  */
@@ -8296,6 +8344,74 @@ arc_ifcvt (void)
 }
   return 0;
 }
+
+/* Find annulled delay insns and convert them to use the appropriate predicate.
+   This allows branch shortening to size up these insns properly.  */
+
+static unsigned
+arc_predicate_delay_insns (void)
+{
+  for (rtx insn = get_insns (); insn; insn = NEXT_INSN (insn))
+{
+  rtx pat, jump, dlay, src, cond, *patp;
+  int reverse;
+
+  if (!NONJUMP_INSN_P (insn)
+ || GET_CODE (pat = PATTERN (insn)) != SEQUENCE)
+   continue;
+  jump = XVECEXP (pat, 0, 0);
+  dlay = XVECEXP (pat, 0, 1);
+  if (!JUMP_P (jump) || !INSN_ANNULLED_BRANCH_P (jump))
+   continue;
+  /* If the branch insn does the annulling, leave the delay insn alone.  */
+      if (!TARGET_AT_DBR_CONDEXEC && !INSN_FROM_TARGET_P (dlay))
+   continue;
+  /* ??? Could also leave DLAY un-conditionalized if its target is dead
+on the other path.  */
+  gcc_assert (GET_CODE (PATTERN (jump)) == SET);
+  gcc_assert (SET_DEST (PATTERN (jump)) == pc_rtx);
+  src = SET_SRC (PATTERN (jump));
+  gcc_assert (GET_CODE (src) == IF_THEN_ELSE);
+  cond = XEXP (src, 0);
+  if (XEXP (src, 2) == pc_rtx)
+   reverse = 0;
+  else if (XEXP (src, 1) == pc_rtx)
+   reverse = 1;
+  else
+   gcc_unreachable ();
+  if (!INSN_FROM_TARGET_P (dlay) != reverse)
+   {
+ enum machine_mode ccm = GET_MODE (XEXP (cond, 0));
+ enum rtx_code code = reverse_condition (GET_CODE (cond));
+ if (code == UNKNOWN || ccm == CC_FP_GTmode || ccm == CC_FP_GEmode)
+   code = reverse_condition_maybe_unordered (GET_CODE (cond));
+
+ cond = gen_rtx_fmt_ee (code, GET_MODE (cond),
+copy_rtx (XEXP (cond, 0)),
+copy_rtx (XEXP (cond, 1)));
+   }
+  else
+   cond = copy_rtx (cond);
+      patp = &PATTERN (dlay);
+  pat = *patp;
+  /* dwarf2out.c:dwarf2out_frame_debug_expr doesn't know
+what to do with COND_EXEC.  */
+  if (RTX_FRAME_RELATED_P (dlay))
+   {
+ /* As this is the delay slot insn of an anulled branch,
+dwarf2out.c:scan_trace understands the anulling semantics
+without the COND_EXEC.  */
+ rtx note = alloc_reg_note (REG_FRAME_RELATED_EXPR, pat,
+REG_NOTES (dlay));
+         validate_change (dlay, &REG_NOTES (dlay), note, 1);
+   }
+

Re: Implement C11/C++11 set of UCNs allowed in identifiers

2013-11-16 Thread Joseph S. Myers
On Fri, 15 Nov 2013, Tom Tromey wrote:

> >>>>> "Joseph" == Joseph S Myers <jos...@codesourcery.com> writes:
> 
> Joseph> Any comments on whether we should consider the Unicode character data
> Joseph> - UnicodeData.txt and DerivedNormalizationProps.txt, a total of about
> Joseph> 2MB - as source code for the generated ucnid.h that should be checked
> Joseph> into the repository and included in releases, or as an external build
> Joseph> tool or system library that doesn't need including in the GCC source
> Joseph> code?
> 
> The last time this came up, for something in libgcj, it wasn't
> permissible, according to the Unicode rules, to check in the file.
> I haven't checked whether this has changed.

According to the NEWS for GNU miscfiles-1.4, "License worries about the 
Unicode data are no longer a problem due to a change in the Unicode 
license.", and according to 
<http://www.gnu.org/licenses/license-list.html>, "It is a lax permissive 
license, compatible with all versions of the GPL.".  My recollection is 
that previously there was a license peculiarity meaning that you could 
import the character data and export an equivalent file under a free 
software license, but not distribute the original file under such a 
license.

(The license text is included in the generated ucnid.h.)

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH][1/3] Re-submission of Altera Nios II port, gcc parts

2013-11-16 Thread Joseph S. Myers
On Sat, 16 Nov 2013, Chung-Lin Tang wrote:

> >> +/* Local prototypes.  */
> > 
> > I'd much prefer not to have any of those. Achieve this by putting
> >> +struct gcc_target targetm = TARGET_INITIALIZER;
> > along with all the necessary definitions at the end of the file (and
> > reordering some other functions).
> 
> I would rather keep it that way. The ARM backend is another example of this.

I agree with Bernd's preference of topologically sorting static functions 
/ variables so forward declarations are only needed in cases of recursion.

I sometimes think it should be possible to convert many target macros to 
hooks, including generating function definitions from the macro 
definitions in .h files, with a lot more automation than I think has been 
used for that before.  Having some back ends use a style that requires 
forward function declarations is a needless complication for that sort of 
thing (indeed, if anyone were working on automated target macro to hook 
conversion, I'd suggest an early automated change should be making all 
back ends define targetm at the end of the file and avoid forward static 
declarations where possible).

-- 
Joseph S. Myers
jos...@codesourcery.com


[0/10] Replace host_integerp and tree_low_cst

2013-11-16 Thread Richard Sandiford
After the patch that went in yesterday, all calls to host_integerp and
tree_low_cst pass a constant pos argument.  This series replaces each
function with two separate ones:

  host_integerp (x, 0) -> tree_fits_shwi_p (x)
  host_integerp (x, 1) -> tree_fits_uhwi_p (x)
  tree_low_cst (x, 0) -> tree_to_shwi (x)
  tree_low_cst (x, 1) -> tree_to_uhwi (x)

The change is part of the wide-int conversion.  In some ways it's one
of the more bikesheddy parts because, unlike wide_int itself, it just
changes an interface without adding new functionality.  The two main
reasons for doing it IMO are:

1. the new functions are direct analogues of wide-int functions
2. the return type of tree_to_*hwi matches the function name,
   whereas tree_low_cst (x, 1) gets an unsigned value as a signed type
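
To make the second point concrete, here is a toy model of the split
interface.  It is not GCC code: struct toy_cst and the toy_* functions are
hypothetical, long long stands in for HOST_WIDE_INT, and having the toy_to_*
functions assert the corresponding toy_fits_* predicate is an assumption of
this sketch.

```c
#include <assert.h>
#include <stdbool.h>

typedef long long shwi;           /* stand-in for HOST_WIDE_INT */
typedef unsigned long long uhwi;  /* stand-in for unsigned HOST_WIDE_INT */

/* Toy double-word integer constant: LOW and HIGH halves plus the
   signedness of its type.  */
struct toy_cst
{
  uhwi low;
  shwi high;
  bool type_unsigned;
};

static bool
toy_fits_shwi_p (const struct toy_cst *t)
{
  return ((t->high == 0 && (shwi) t->low >= 0)
          || (t->high == -1 && (shwi) t->low < 0 && !t->type_unsigned));
}

static bool
toy_fits_uhwi_p (const struct toy_cst *t)
{
  return t->high == 0;
}

/* The return type matches the function name, so the caller does not
   need a cast to get the signedness right.  */
static shwi
toy_to_shwi (const struct toy_cst *t)
{
  assert (toy_fits_shwi_p (t));
  return (shwi) t->low;
}

static uhwi
toy_to_uhwi (const struct toy_cst *t)
{
  assert (toy_fits_uhwi_p (t));
  return t->low;
}
```

With a single function returning a signed type, the all-ones unsigned value
would come back as -1 unless the caller remembered to cast it.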

The series is pretty laboured because I wanted to separate out the
large mechanical changes from the small manual changes for ease of review.

Tested by building:

  aarch64-linux-gnueabi alpha-linux-gnu arm-linux-gnueabi c6x-elf
  epiphany-elf ia64-linux-gnu iq2000-elf m32c-elf mep-elf mips-linux-gnu
  picochip-elf powerpc-linux-gnu s390-linux-gnu sparc-linux-gnu
  x86_64-darwin

before and after the patch, checking that there were no new warnings,
and comparing the before and after assembly output at -O2 for gcc.dg,
g++.dg and gcc.c-torture.  Also tested normally on x86_64-linux-gnu
and powerpc64-linux-gnu.

Thanks,
Richard


[1/10] Add tree_fits_shwi_p and tree_fits_uhwi_p

2013-11-16 Thread Richard Sandiford
Add tree_fits_shwi_p and tree_fits_uhwi_p.  The implementations are taken
directly from host_integerp.

Thanks,
Richard


gcc/
* tree.h (tree_fits_shwi_p, tree_fits_uhwi_p): Declare.
* tree.c (tree_fits_shwi_p, tree_fits_uhwi_p): Define.

Index: gcc/tree.h
===
--- gcc/tree.h  2013-11-16 09:09:56.388037088 +
+++ gcc/tree.h  2013-11-16 09:11:53.535874667 +
@@ -3659,6 +3659,16 @@ extern int host_integerp (const_tree, in
   ATTRIBUTE_PURE /* host_integerp is pure only when checking is disabled.  */
 #endif
   ;
+extern bool tree_fits_shwi_p (const_tree)
+#ifndef ENABLE_TREE_CHECKING
+  ATTRIBUTE_PURE /* tree_fits_shwi_p is pure only when checking is disabled.  */
+#endif
+  ;
+extern bool tree_fits_uhwi_p (const_tree)
+#ifndef ENABLE_TREE_CHECKING
+  ATTRIBUTE_PURE /* tree_fits_uhwi_p is pure only when checking is disabled.  */
+#endif
+  ;
 extern HOST_WIDE_INT tree_low_cst (const_tree, int);
 #if !defined ENABLE_TREE_CHECKING && (GCC_VERSION >= 4003)
 extern inline __attribute__ ((__gnu_inline__)) HOST_WIDE_INT
Index: gcc/tree.c
===
--- gcc/tree.c  2013-11-16 09:09:56.388037088 +
+++ gcc/tree.c  2013-11-16 09:11:53.534874659 +
@@ -6990,6 +6990,32 @@ host_integerp (const_tree t, int pos)
              || (pos && TREE_INT_CST_HIGH (t) == 0)));
 }
 
+/* Return true if T is an INTEGER_CST whose numerical value (extended
+   according to TYPE_UNSIGNED) fits in a signed HOST_WIDE_INT.  */
+
+bool
+tree_fits_shwi_p (const_tree t)
+{
+  return (t != NULL_TREE
+         && TREE_CODE (t) == INTEGER_CST
+         && ((TREE_INT_CST_HIGH (t) == 0
+              && (HOST_WIDE_INT) TREE_INT_CST_LOW (t) >= 0)
+             || (TREE_INT_CST_HIGH (t) == -1
+                 && (HOST_WIDE_INT) TREE_INT_CST_LOW (t) < 0
+                 && !TYPE_UNSIGNED (TREE_TYPE (t)))));
+}
+
+/* Return true if T is an INTEGER_CST whose numerical value (extended
+   according to TYPE_UNSIGNED) fits in an unsigned HOST_WIDE_INT.  */
+
+bool
+tree_fits_uhwi_p (const_tree t)
+{
+  return (t != NULL_TREE
+          && TREE_CODE (t) == INTEGER_CST
+          && TREE_INT_CST_HIGH (t) == 0);
+}
+
 /* Return the HOST_WIDE_INT least significant bits of T if it is an
INTEGER_CST and there is no overflow.  POS is nonzero if the result must
be non-negative.  We must be able to satisfy the above conditions.  */
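
For readers following the series, the two predicates can be modelled
outside GCC.  The sketch below is an illustration only: struct
int_cst_model and the model_* functions are hypothetical stand-ins, not
GCC APIs, and it assumes a 64-bit HOST_WIDE_INT with the constant held
as a low/high pair of host words, as in the patch above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an INTEGER_CST: a double-width value stored
   as two host words, plus the signedness of its type.  */
struct int_cst_model
{
  uint64_t low;        /* models TREE_INT_CST_LOW */
  int64_t high;        /* models TREE_INT_CST_HIGH */
  bool type_unsigned;  /* models TYPE_UNSIGNED (TREE_TYPE (t)) */
};

/* Model of tree_fits_shwi_p: either a non-negative value whose high word
   is zero, or a negative value of signed type whose high word is all
   ones.  The uint64_t -> int64_t conversion relies on the usual
   two's-complement behaviour of GCC and Clang.  */
bool
model_fits_shwi_p (const struct int_cst_model *c)
{
  return ((c->high == 0 && (int64_t) c->low >= 0)
          || (c->high == -1 && (int64_t) c->low < 0 && !c->type_unsigned));
}

/* Model of tree_fits_uhwi_p: the high word must be zero.  */
bool
model_fits_uhwi_p (const struct int_cst_model *c)
{
  return c->high == 0;
}
```

Under this model the signed constant -1 (low word all ones, high word -1)
fits a signed HOST_WIDE_INT but not an unsigned one, while the unsigned
value 2^64 - 1 (low word all ones, high word 0) is the other way round.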


[2/10] Mechanical replacement of host_integerp (..., 0)

2013-11-16 Thread Richard Sandiford
This is the result of using sed to replace all single-line
host_integerp (x, 0)s with tree_fits_shwi_p (x), taking care to handle
bracket nesting in x.
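
To illustrate what handling bracket nesting involves, here is a small
sketch of the same rewrite on a single line, using the replacement name
given in the ChangeLog below.  It is not the sed script used for the
patch, which is not shown in this thread; rewrite_line and match_args
are hypothetical helpers, and the sketch ignores string literals and
comments.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* If P points at "(X, 0)", where any brackets inside X are balanced,
   store the bounds of X and return a pointer just past the closing
   bracket.  Return NULL otherwise.  */
static const char *
match_args (const char *p, const char **arg, size_t *arg_len)
{
  if (*p != '(')
    return NULL;
  const char *start = p + 1;
  int depth = 0;
  for (p = start; *p; p++)
    {
      if (*p == '(')
        depth++;
      else if (*p == ')')
        {
          if (depth == 0)
            return NULL;  /* Call ended before a top-level comma.  */
          depth--;
        }
      else if (*p == ',' && depth == 0)
        break;            /* First comma outside nested brackets.  */
    }
  if (strncmp (p, ", 0)", 4) != 0)
    return NULL;
  *arg = start;
  *arg_len = (size_t) (p - start);
  return p + 4;
}

/* Copy SRC to DST (DST_SIZE bytes), replacing each
   "host_integerp (X, 0)" with "tree_fits_shwi_p (X)".  Return false if
   DST is too small.  */
bool
rewrite_line (const char *src, char *dst, size_t dst_size)
{
  static const char old_call[] = "host_integerp ";
  const size_t old_len = sizeof old_call - 1;
  const char *begin = src;
  size_t used = 0;

  if (dst_size == 0)
    return false;
  dst[0] = '\0';
  while (*src)
    {
      const char *arg = NULL;
      const char *end = NULL;
      size_t arg_len = 0;
      /* Only match at the start of an identifier.  */
      bool ident_start = (src == begin
                          || !(isalnum ((unsigned char) src[-1])
                               || src[-1] == '_'));
      if (ident_start && strncmp (src, old_call, old_len) == 0)
        end = match_args (src + old_len, &arg, &arg_len);

      int n;
      if (end)
        n = snprintf (dst + used, dst_size - used,
                      "tree_fits_shwi_p (%.*s)", (int) arg_len, arg);
      else
        n = snprintf (dst + used, dst_size - used, "%c", *src);
      if (n < 0 || (size_t) n >= dst_size - used)
        return false;
      used += (size_t) n;
      src = end ? end : src + 1;
    }
  return true;
}
```

The depth counter is what keeps a call such as
host_integerp (f (a, b), 0) from being split at the comma inside f.
Calls whose second argument is not 0 are copied through unchanged.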

Thanks,
Richard


gcc/ada/
* gcc-interface/cuintp.c: Replace host_integerp (..., 0) with
tree_fits_shwi_p throughout.

gcc/c-family/
* c-ada-spec.c, c-common.c, c-format.c, c-pretty-print.c: Replace
host_integerp (..., 0) with tree_fits_shwi_p throughout.

gcc/c/
* c-parser.c: Replace host_integerp (..., 0) with tree_fits_shwi_p
throughout.

gcc/cp/
* error.c, init.c, parser.c, semantics.c: Replace
host_integerp (..., 0) with tree_fits_shwi_p throughout.

gcc/go/
* gofrontend/expressions.cc: Replace host_integerp (..., 0) with
tree_fits_shwi_p throughout.

gcc/java/
* class.c, expr.c: Replace host_integerp (..., 0) with
tree_fits_shwi_p throughout.

gcc/
* builtins.c, config/alpha/alpha.c, config/c6x/predicates.md,
config/ia64/predicates.md, config/iq2000/iq2000.c, config/mips/mips.c,
config/s390/s390.c, dbxout.c, dwarf2out.c, except.c, explow.c, expr.c,
expr.h, fold-const.c, gimple-fold.c, gimple-ssa-strength-reduction.c,
gimple.c, godump.c, graphite-scop-detection.c, graphite-sese-to-poly.c,
omp-low.c, predict.c, rtlanal.c, sdbout.c, simplify-rtx.c,
stor-layout.c, tree-data-ref.c, tree-dfa.c, tree-pretty-print.c,
tree-sra.c, tree-ssa-alias.c, tree-ssa-forwprop.c,
tree-ssa-loop-ivopts.c, tree-ssa-loop-prefetch.c, tree-ssa-math-opts.c,
tree-ssa-phiopt.c, tree-ssa-reassoc.c, tree-ssa-sccvn.c,
tree-ssa-strlen.c, tree-ssa-structalias.c, tree-vect-data-refs.c,
tree-vect-patterns.c, tree-vectorizer.h, tree.c, var-tracking.c,
varasm.c: Replace host_integerp (..., 0) with tree_fits_shwi_p
throughout.



tree-to-shwi.diff.bz2
Description: BZip2 compressed data


[3/10] Mechanical replacement of host_integerp (..., 1)

2013-11-16 Thread Richard Sandiford
Like the previous patch, but for host_integerp (x, 1) -> tree_fits_uhwi_p (x).

Thanks,
Richard


gcc/ada/
* gcc-interface/decl.c, gcc-interface/misc.c, gcc-interface/utils.c:
Replace host_integerp (..., 1) with tree_fits_uhwi_p throughout.

gcc/c-family/
* c-ada-spec.c, c-common.c, c-pretty-print.c: Replace
host_integerp (..., 1) with tree_fits_uhwi_p throughout.

gcc/cp/
* decl.c: Replace host_integerp (..., 1) with tree_fits_uhwi_p
throughout.

gcc/
* builtins.c, config/alpha/alpha.c, config/iq2000/iq2000.c,
config/mips/mips.c, dbxout.c, dwarf2out.c, expr.c, fold-const.c,
gimple-fold.c, godump.c, omp-low.c, predict.c, sdbout.c, stor-layout.c,
tree-dfa.c, tree-sra.c, tree-ssa-forwprop.c, tree-ssa-loop-prefetch.c,
tree-ssa-phiopt.c, tree-ssa-sccvn.c, tree-ssa-strlen.c,
tree-ssa-structalias.c, tree-vect-data-refs.c, tree-vect-patterns.c,
tree.c, varasm.c, alias.c, cfgexpand.c, config/aarch64/aarch64.c,
config/arm/arm.c, config/epiphany/epiphany.c, config/i386/i386.c,
config/m32c/m32c-pragma.c, config/mep/mep-pragma.c,
config/rs6000/rs6000.c, config/sparc/sparc.c, emit-rtl.c, function.c,
gimplify.c, ipa-prop.c, stmt.c, trans-mem.c, tree-cfg.c,
tree-object-size.c, tree-ssa-ccp.c, tree-ssa-loop-ivcanon.c,
tree-stdarg.c, tree-switch-conversion.c, tree-vect-generic.c,
tree-vrp.c, tsan.c, ubsan.c: Replace host_integerp (..., 1) with
tree_fits_uhwi_p throughout.



tree-to-uhwi.diff.bz2
Description: BZip2 compressed data


[4/10] Mop up remaining host_integerp calls

2013-11-16 Thread Richard Sandiford
Handle host_integerp references that weren't caught by the sed.

Thanks,
Richard


gcc/ada/
* gcc-interface/cuintp.c: Update comments to refer to
tree_fits_shwi_p rather than host_integerp.
* gcc-interface/decl.c (gnat_to_gnu_entity): Use tree_fits_uhwi_p
rather than host_integerp.
* gcc-interface/utils.c (rest_of_record_type_compilation): Likewise.

gcc/
* expr.h: Update comments to refer to tree_fits_[su]hwi_p rather
than host_integerp.

Index: gcc/ada/gcc-interface/cuintp.c
===
--- gcc/ada/gcc-interface/cuintp.c  2013-11-16 09:14:25.293995960 +
+++ gcc/ada/gcc-interface/cuintp.c  2013-11-16 09:33:31.591920981 +
@@ -150,7 +150,7 @@ UI_From_gnu (tree Input)
   Int_Vector vec;
 
 #if HOST_BITS_PER_WIDE_INT == 64
-  /* On 64-bit hosts, host_integerp tells whether the input fits in a
+  /* On 64-bit hosts, tree_fits_shwi_p tells whether the input fits in a
  signed 64-bit integer.  Then a truncation tells whether it fits
  in a signed 32-bit integer.  */
   if (tree_fits_shwi_p (Input))
@@ -162,7 +162,7 @@ UI_From_gnu (tree Input)
   else
 return No_Uint;
 #else
-  /* On 32-bit hosts, host_integerp tells whether the input fits in a
+  /* On 32-bit hosts, tree_fits_shwi_p tells whether the input fits in a
  signed 32-bit integer.  Then a sign test tells whether it fits
  in a signed 64-bit integer.  */
   if (tree_fits_shwi_p (Input))
Index: gcc/ada/gcc-interface/decl.c
===
--- gcc/ada/gcc-interface/decl.c2013-11-16 09:22:06.982466042 +
+++ gcc/ada/gcc-interface/decl.c2013-11-16 09:33:31.614921192 +
@@ -1480,8 +1480,8 @@ gnat_to_gnu_entity (Entity_Id gnat_entit
     && AGGREGATE_TYPE_P (gnu_type)
     && tree_fits_uhwi_p (TYPE_SIZE_UNIT (gnu_type))
     && !(TYPE_IS_PADDING_P (gnu_type)
-         && !host_integerp (TYPE_SIZE_UNIT
-                            (TREE_TYPE (TYPE_FIELDS (gnu_type))), 1)))
+         && !tree_fits_uhwi_p (TYPE_SIZE_UNIT
+                               (TREE_TYPE (TYPE_FIELDS (gnu_type))))))
  static_p = true;
 
/* Now create the variable or the constant and set various flags.  */
Index: gcc/ada/gcc-interface/utils.c
===
--- gcc/ada/gcc-interface/utils.c   2013-11-16 09:22:06.985466064 +
+++ gcc/ada/gcc-interface/utils.c   2013-11-16 09:33:31.616921211 +
@@ -1753,8 +1753,8 @@ rest_of_record_type_compilation (tree re
TREE_CODE (curpos) == PLUS_EXPR
tree_fits_uhwi_p (TREE_OPERAND (curpos, 1))
TREE_CODE (TREE_OPERAND (curpos, 0)) == MULT_EXPR
-   host_integerp
- (TREE_OPERAND (TREE_OPERAND (curpos, 0), 1), 1))
+   tree_fits_uhwi_p
+ (TREE_OPERAND (TREE_OPERAND (curpos, 0), 1)))
{
  tree offset = TREE_OPERAND (TREE_OPERAND (curpos, 0), 0);
  unsigned HOST_WIDE_INT addend
Index: gcc/expr.h
===
--- gcc/expr.h  2013-11-16 09:14:25.398996758 +
+++ gcc/expr.h  2013-11-16 09:33:31.719922154 +
@@ -26,7 +26,7 @@ #define GCC_EXPR_H
 #include "rtl.h"
 /* For optimize_size */
 #include "flags.h"
-/* For host_integerp, tree_low_cst, fold_convert, size_binop, ssize_int,
+/* For tree_fits_[su]hwi_p, tree_low_cst, fold_convert, size_binop, ssize_int,
TREE_CODE, TYPE_SIZE, int_size_in_bytes,*/
 #include "tree-core.h"
 /* For GET_MODE_BITSIZE, word_mode */


[5/10] Add tree_to_shwi and tree_to_uhwi

2013-11-16 Thread Richard Sandiford
Add tree_to_shwi and tree_to_uhwi.  Initially tree_to_uhwi returns a
HOST_WIDE_INT, so that it's a direct replacement for tree_low_cst.
Patch 10 makes it return unsigned HOST_WIDE_INT instead.

Thanks,
Richard


gcc/
* tree.h (tree_to_shwi, tree_to_uhwi): Declare, with inline expansions.
* tree.c (tree_to_shwi, tree_to_uhwi): New functions.

Index: gcc/tree.c
===
--- gcc/tree.c  2013-11-15 16:46:27.420395607 +
+++ gcc/tree.c  2013-11-15 16:47:15.226216885 +
@@ -7027,6 +7027,28 @@ tree_low_cst (const_tree t, int pos)
   return TREE_INT_CST_LOW (t);
 }
 
+/* T is an INTEGER_CST whose numerical value (extended according to
+   TYPE_UNSIGNED) fits in a signed HOST_WIDE_INT.  Return that
+   HOST_WIDE_INT.  */
+
+HOST_WIDE_INT
+tree_to_shwi (const_tree t)
+{
+  gcc_assert (tree_fits_shwi_p (t));
+  return TREE_INT_CST_LOW (t);
+}
+
+/* T is an INTEGER_CST whose numerical value (extended according to
+   TYPE_UNSIGNED) fits in an unsigned HOST_WIDE_INT.  Return that
+   HOST_WIDE_INT.  */
+
+HOST_WIDE_INT
+tree_to_uhwi (const_tree t)
+{
+  gcc_assert (tree_fits_uhwi_p (t));
+  return TREE_INT_CST_LOW (t);
+}
+
 /* Return the most significant (sign) bit of T.  */
 
 int
Index: gcc/tree.h
===
--- gcc/tree.h  2013-11-15 16:46:26.263399881 +
+++ gcc/tree.h  2013-11-15 16:46:56.569287095 +
@@ -3662,6 +3662,8 @@ extern bool tree_fits_uhwi_p (const_tree
 #endif
   ;
 extern HOST_WIDE_INT tree_low_cst (const_tree, int);
+extern HOST_WIDE_INT tree_to_shwi (const_tree);
+extern HOST_WIDE_INT tree_to_uhwi (const_tree);
 #if !defined ENABLE_TREE_CHECKING && (GCC_VERSION >= 4003)
 extern inline __attribute__ ((__gnu_inline__)) HOST_WIDE_INT
 tree_low_cst (const_tree t, int pos)
@@ -3669,6 +3671,20 @@ tree_low_cst (const_tree t, int pos)
   gcc_assert (host_integerp (t, pos));
   return TREE_INT_CST_LOW (t);
 }
+
+extern inline __attribute__ ((__gnu_inline__)) HOST_WIDE_INT
+tree_to_shwi (const_tree t)
+{
+  gcc_assert (tree_fits_shwi_p (t));
+  return TREE_INT_CST_LOW (t);
+}
+
+extern inline __attribute__ ((__gnu_inline__)) HOST_WIDE_INT
+tree_to_uhwi (const_tree t)
+{
+  gcc_assert (tree_fits_uhwi_p (t));
+  return TREE_INT_CST_LOW (t);
+}
 #endif
 extern int tree_int_cst_sgn (const_tree);
 extern int tree_int_cst_sign_bit (const_tree);
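
The intended calling pattern pairs each conversion with its predicate:
test tree_fits_[su]hwi_p first and convert only on success, since the
conversion asserts.  Below is a standalone sketch of that pattern.  The
struct and the model_* names are hypothetical stand-ins rather than GCC
APIs, a 64-bit HOST_WIDE_INT is assumed, and the conversion uses the
unsigned return type that patch 10 introduces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an INTEGER_CST stored as two host words.  */
struct int_cst_model
{
  uint64_t low;   /* models TREE_INT_CST_LOW */
  int64_t high;   /* models TREE_INT_CST_HIGH */
};

/* Model of tree_fits_uhwi_p.  */
bool
model_fits_uhwi_p (const struct int_cst_model *c)
{
  return c->high == 0;
}

/* Model of tree_to_uhwi: the caller must already know the value fits;
   assert stands in for gcc_assert.  */
uint64_t
model_to_uhwi (const struct int_cst_model *c)
{
  assert (model_fits_uhwi_p (c));
  return c->low;
}

/* Caller-side idiom: check the predicate, then convert.  Stores the
   size in bytes in *BYTES and returns true if SIZE_IN_BITS fits an
   unsigned host word; otherwise leaves *BYTES alone and returns
   false.  */
bool
model_size_in_bytes (const struct int_cst_model *size_in_bits,
                     uint64_t *bytes)
{
  if (!model_fits_uhwi_p (size_in_bits))
    return false;
  *bytes = model_to_uhwi (size_in_bits) / 8;
  return true;
}
```

Calling model_to_uhwi without the check would trip the assertion for
any value with a nonzero high word, which corresponds to the
"eventually crash" case mentioned in the array-size comments touched
later in the series.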


[6/10] Mechanical replacement of tree_low_cst (..., 0)

2013-11-16 Thread Richard Sandiford
Like patch 2, but using sed to replace tree_low_cst (x, 0) with
tree_to_shwi (x).

Thanks,
Richard


gcc/c-family/
* c-common.c, c-format.c, c-omp.c, c-pretty-print.c: Replace
tree_low_cst (..., 0) with tree_to_shwi throughout.

gcc/c/
* c-parser.c: Replace tree_low_cst (..., 0) with tree_to_shwi
throughout.

gcc/cp/
* class.c, dump.c, error.c, init.c, method.c, parser.c, semantics.c:
Replace tree_low_cst (..., 0) with tree_to_shwi throughout.

gcc/go/
* gofrontend/expressions.cc: Replace tree_low_cst (..., 0) with
tree_to_shwi throughout.

gcc/java/
* class.c, expr.c: Replace tree_low_cst (..., 0) with tree_to_shwi
throughout.

gcc/objc/
* objc-next-runtime-abi-02.c: Replace tree_low_cst (..., 0) with
tree_to_shwi throughout.

gcc/
* builtins.c, cilk-common.c, config/aarch64/aarch64.c,
config/alpha/alpha.c, config/arm/arm.c, config/c6x/predicates.md,
config/i386/i386.c, config/ia64/predicates.md, config/s390/s390.c,
coverage.c, dbxout.c, dwarf2out.c, except.c, explow.c, expr.c, expr.h,
fold-const.c, gimple-fold.c, godump.c, ipa-prop.c, omp-low.c,
predict.c, rtlanal.c, sdbout.c, stmt.c, stor-layout.c, targhooks.c,
tree-cfg.c, tree-data-ref.c, tree-inline.c, tree-ssa-forwprop.c,
tree-ssa-loop-prefetch.c, tree-ssa-phiopt.c, tree-ssa-sccvn.c,
tree-ssa-strlen.c, tree-stdarg.c, tree-vect-data-refs.c,
tree-vect-patterns.c, tree.c, tree.h, var-tracking.c, varasm.c:
Replace tree_low_cst (..., 0) with tree_to_shwi throughout.



tree-to-shwi.diff.bz2
Description: BZip2 compressed data


[7/10] Mechanical replacement of tree_low_cst (..., 1)

2013-11-16 Thread Richard Sandiford
Like the previous patch, but for tree_low_cst (x, 1) -> tree_to_uhwi (x).

Thanks,
Richard


gcc/ada/
* gcc-interface/decl.c, gcc-interface/utils.c, gcc-interface/utils2.c:
Replace tree_low_cst (..., 1) with tree_to_uhwi throughout.

gcc/c-family/
* c-common.c, c-cppbuiltin.c: Replace tree_low_cst (..., 1) with
tree_to_uhwi throughout.

gcc/c/
* c-decl.c, c-typeck.c: Replace tree_low_cst (..., 1) with
tree_to_uhwi throughout.

gcc/cp/
* call.c, class.c, decl.c, error.c: Replace tree_low_cst (..., 1) with
tree_to_uhwi throughout.

gcc/objc/
* objc-encoding.c: Replace tree_low_cst (..., 1) with tree_to_uhwi
throughout.

gcc/
* alias.c, asan.c, builtins.c, cfgexpand.c, cgraph.c,
config/aarch64/aarch64.c, config/alpha/predicates.md,
config/arm/arm.c, config/darwin.c, config/epiphany/epiphany.c,
config/i386/i386.c, config/iq2000/iq2000.c, config/m32c/m32c-pragma.c,
config/mep/mep-pragma.c, config/mips/mips.c,
config/picochip/picochip.c, config/rs6000/rs6000.c, cppbuiltin.c,
dbxout.c, dwarf2out.c, emit-rtl.c, except.c, expr.c, fold-const.c,
function.c, gimple-fold.c, godump.c, ipa-cp.c, ipa-prop.c, omp-low.c,
predict.c, sdbout.c, stor-layout.c, trans-mem.c, tree-object-size.c,
tree-sra.c, tree-ssa-ccp.c, tree-ssa-forwprop.c,
tree-ssa-loop-ivcanon.c, tree-ssa-loop-ivopts.c, tree-ssa-loop-niter.c,
tree-ssa-loop-prefetch.c, tree-ssa-strlen.c, tree-stdarg.c,
tree-switch-conversion.c, tree-vect-generic.c, tree-vect-loop.c,
tree-vect-patterns.c, tree-vrp.c, tree.c, tsan.c, ubsan.c, varasm.c:
Replace tree_low_cst (..., 1) with tree_to_uhwi throughout.



tree-to-uhwi.diff.bz2
Description: BZip2 compressed data


[8/10] Mop up remaining tree_low_cst calls

2013-11-16 Thread Richard Sandiford
Handle tree_low_cst references that weren't caught by the sed.

Thanks,
Richard


gcc/ada/
* gcc-interface/cuintp.c (UI_From_gnu): Use tree_to_shwi rather than
tree_low_cst.

gcc/c-family/
* c-common.c (fold_offsetof_1): Use tree_to_uhwi rather than
tree_low_cst.
(complete_array_type): Update comment to refer to tree_to_[su]hwi
rather than tree_low_cst.

gcc/c/
* c-decl.c (grokdeclarator): Update comment to refer to
tree_to_[su]hwi rather than tree_low_cst.

gcc/cp/
* decl.c (reshape_init_array_1): Use tree_to_uhwi rather than
tree_low_cst.
(grokdeclarator): Update comment to refer to tree_to_[su]hwi rather
than tree_low_cst.

gcc/
* expr.h: Update comments to refer to tree_to_[su]hwi rather
than tree_low_cst.
* fold-const.c (fold_binary_loc): Likewise.
* expr.c (store_constructor): Use tree_to_uhwi rather than
tree_low_cst.
* ipa-utils.h (possible_polymorphic_call_target_p): Likewise.
* stmt.c (emit_case_dispatch_table): Likewise.
* tree-switch-conversion.c (emit_case_bit_tests): Likewise.

Index: gcc/ada/gcc-interface/cuintp.c
===
--- gcc/ada/gcc-interface/cuintp.c  2013-11-16 13:08:22.531824320 +
+++ gcc/ada/gcc-interface/cuintp.c  2013-11-16 13:08:24.254837390 +
@@ -176,9 +176,9 @@ UI_From_gnu (tree Input)
 
   for (i = Max_For_Dint - 1; i >= 0; i--)
 {
-  v[i] = tree_low_cst (fold_build1 (ABS_EXPR, gnu_type,
+  v[i] = tree_to_shwi (fold_build1 (ABS_EXPR, gnu_type,
fold_build2 (TRUNC_MOD_EXPR, gnu_type,
-gnu_temp, gnu_base)), 0);
+gnu_temp, gnu_base)));
   gnu_temp = fold_build2 (TRUNC_DIV_EXPR, gnu_type, gnu_temp, gnu_base);
 }
 
Index: gcc/c-family/c-common.c
===
--- gcc/c-family/c-common.c 2013-11-16 13:08:22.531824320 +
+++ gcc/c-family/c-common.c 2013-11-16 13:08:46.45771 +
@@ -9721,8 +9721,7 @@ fold_offsetof_1 (tree expr)
  return error_mark_node;
}
   off = size_binop_loc (input_location, PLUS_EXPR, DECL_FIELD_OFFSET (t),
-   size_int (tree_low_cst (DECL_FIELD_BIT_OFFSET (t),
-   1)
+   size_int (tree_to_uhwi (DECL_FIELD_BIT_OFFSET (t))
  / BITS_PER_UNIT));
   break;
 
@@ -10091,7 +10090,7 @@ complete_array_type (tree *ptype, tree i
 {
   error ("size of array is too large");
   /* If we proceed with the array type as it is, we'll eventually
-crash in tree_low_cst().  */
+crash in tree_to_[su]hwi().  */
   type = error_mark_node;
 }
 
Index: gcc/c/c-decl.c
===
--- gcc/c/c-decl.c  2013-11-16 13:08:22.531824320 +
+++ gcc/c/c-decl.c  2013-11-16 13:08:24.258837421 +
@@ -5912,7 +5912,7 @@ grokdeclarator (const struct c_declarato
   else
 error_at (loc, "size of unnamed array is too large");
   /* If we proceed with the array type as it is, we'll eventually
-crash in tree_low_cst().  */
+crash in tree_to_[su]hwi().  */
   type = error_mark_node;
 }
 
Index: gcc/cp/decl.c
===
--- gcc/cp/decl.c   2013-11-16 13:08:22.531824320 +
+++ gcc/cp/decl.c   2013-11-16 13:09:31.845353189 +
@@ -5095,8 +5095,7 @@ reshape_init_array_1 (tree elt_type, tre
max_index_cst = tree_to_uhwi (max_index);
   /* sizetype is sign extended, not zero extended.  */
   else
-   max_index_cst = tree_low_cst (fold_convert (size_type_node, max_index),
- 1);
+   max_index_cst = tree_to_uhwi (fold_convert (size_type_node, max_index));
 }
 
   /* Loop until there are no more initializers.  */
@@ -10031,7 +10030,7 @@ grokdeclarator (const cp_declarator *dec
 {
   error ("size of array %qs is too large", name);
   /* If we proceed with the array type as it is, we'll eventually
-crash in tree_low_cst().  */
+crash in tree_to_[su]hwi().  */
   type = error_mark_node;
 }
 
Index: gcc/expr.h
===
--- gcc/expr.h  2013-11-16 13:08:22.531824320 +
+++ gcc/expr.h  2013-11-16 13:08:24.263837459 +
@@ -26,8 +26,8 @@ #define GCC_EXPR_H
 #include "rtl.h"
 /* For optimize_size */
 #include "flags.h"
-/* For tree_fits_[su]hwi_p, tree_low_cst, fold_convert, size_binop, ssize_int,
-   TREE_CODE, TYPE_SIZE, int_size_in_bytes,*/
+/* For tree_fits_[su]hwi_p, tree_to_[su]hwi, fold_convert, size_binop,
+   ssize_int, TREE_CODE, TYPE_SIZE, 

[9/10] Remove host_integerp and tree_low_cst

2013-11-16 Thread Richard Sandiford
Remove the old functions, which are now unused.

Thanks,
Richard


gcc/
* tree.h (host_integerp, tree_low_cst): Delete.
* tree.c (host_integerp, tree_low_cst): Delete.

Index: gcc/tree.h
===
--- gcc/tree.h  2013-11-16 09:35:59.381239766 +
+++ gcc/tree.h  2013-11-16 10:14:00.618868694 +
@@ -3654,11 +3654,6 @@ extern int attribute_list_contained (con
 extern int tree_int_cst_equal (const_tree, const_tree);
 extern int tree_int_cst_lt (const_tree, const_tree);
 extern int tree_int_cst_compare (const_tree, const_tree);
-extern int host_integerp (const_tree, int)
-#ifndef ENABLE_TREE_CHECKING
-  ATTRIBUTE_PURE /* host_integerp is pure only when checking is disabled.  */
-#endif
-  ;
 extern bool tree_fits_shwi_p (const_tree)
 #ifndef ENABLE_TREE_CHECKING
   ATTRIBUTE_PURE /* tree_fits_shwi_p is pure only when checking is disabled.  */
@@ -3669,18 +3664,10 @@ extern bool tree_fits_uhwi_p (const_tree
   ATTRIBUTE_PURE /* tree_fits_uhwi_p is pure only when checking is disabled.  */
 #endif
   ;
-extern HOST_WIDE_INT tree_low_cst (const_tree, int);
 extern HOST_WIDE_INT tree_to_shwi (const_tree);
 extern HOST_WIDE_INT tree_to_uhwi (const_tree);
 #if !defined ENABLE_TREE_CHECKING && (GCC_VERSION >= 4003)
 extern inline __attribute__ ((__gnu_inline__)) HOST_WIDE_INT
-tree_low_cst (const_tree t, int pos)
-{
-  gcc_assert (host_integerp (t, pos));
-  return TREE_INT_CST_LOW (t);
-}
-
-extern inline __attribute__ ((__gnu_inline__)) HOST_WIDE_INT
 tree_to_shwi (const_tree t)
 {
   gcc_assert (tree_fits_shwi_p (t));
Index: gcc/tree.c
===
--- gcc/tree.c  2013-11-16 09:59:37.205620348 +
+++ gcc/tree.c  2013-11-16 10:14:00.604868554 +
@@ -6970,26 +6970,6 @@ tree_int_cst_compare (const_tree t1, con
 return 0;
 }
 
-/* Return 1 if T is an INTEGER_CST that can be manipulated efficiently on
-   the host.  If POS is zero, the value can be represented in a single
-   HOST_WIDE_INT.  If POS is nonzero, the value must be non-negative and can
-   be represented in a single unsigned HOST_WIDE_INT.  */
-
-int
-host_integerp (const_tree t, int pos)
-{
-  if (t == NULL_TREE)
-return 0;
-
-  return (TREE_CODE (t) == INTEGER_CST
-          && ((TREE_INT_CST_HIGH (t) == 0
-               && (HOST_WIDE_INT) TREE_INT_CST_LOW (t) >= 0)
-              || (! pos && TREE_INT_CST_HIGH (t) == -1
-                  && (HOST_WIDE_INT) TREE_INT_CST_LOW (t) < 0
-                  && !TYPE_UNSIGNED (TREE_TYPE (t)))
-              || (pos && TREE_INT_CST_HIGH (t) == 0)));
-}
-
 /* Return true if T is an INTEGER_CST whose numerical value (extended
according to TYPE_UNSIGNED) fits in a signed HOST_WIDE_INT.  */
 
@@ -7016,17 +6996,6 @@ tree_fits_uhwi_p (const_tree t)
           && TREE_INT_CST_HIGH (t) == 0);
 }
 
-/* Return the HOST_WIDE_INT least significant bits of T if it is an
-   INTEGER_CST and there is no overflow.  POS is nonzero if the result must
-   be non-negative.  We must be able to satisfy the above conditions.  */
-
-HOST_WIDE_INT
-tree_low_cst (const_tree t, int pos)
-{
-  gcc_assert (host_integerp (t, pos));
-  return TREE_INT_CST_LOW (t);
-}
-
 /* T is an INTEGER_CST whose numerical value (extended according to
TYPE_UNSIGNED) fits in a signed HOST_WIDE_INT.  Return that
HOST_WIDE_INT.  */


[10/10] Make tree_to_uhwi return unsigned

2013-11-16 Thread Richard Sandiford
This is probably the only non-obvious part of the series.  I went through
all callers to tree_to_uhwi to see whether they were used in a context
where signedness mattered.  If so, I tried to adjust the casting to match.

This mostly meant removing casts to unsigned types.  There are a couple
of cases where I added casts to HOST_WIDE_INT though, to mimic the old
tree_low_cst behaviour:

- In cfgexpand.c and trans-mem.c, where we're comparing the value
  with an int PARAM_VALUE.  The test isn't watertight since any
  unsigned constant > HOST_WIDE_INT_MAX is going to be accepted.
  That's a preexisting problem though and it can be fixed more
  easily with wi:: routines.  Until then this preserves the current
  behaviour.

- In the AArch32/64 and powerpc ABI handling.  Here too count
  is an int and is probably not safe for large values anyway; e.g.:

count *= (1 + tree_to_uhwi (TYPE_MAX_VALUE (index))
  - tree_to_uhwi (TYPE_MIN_VALUE (index)));

  is done without overflow checking.  This too is easier to fix
  with wi::, so I've just kept it as a signed comparison for now.
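
As a rough illustration of why the added casts matter, the sketch below
compares an unsigned 64-bit value against an int limit in both ways.
The function names are hypothetical, uint64_t and int64_t stand in for
unsigned HOST_WIDE_INT and HOST_WIDE_INT on a 64-bit host, and the
actual comparisons in the files listed above may be shaped differently.

```c
#include <stdbool.h>
#include <stdint.h>

/* Signed comparison, as the old tree_low_cst callers behaved: SIZE is
   reinterpreted as a signed value before being compared with LIMIT.  */
bool
below_limit_signed (uint64_t size, int limit)
{
  return (int64_t) size < limit;
}

/* Unsigned comparison, which is what an uncast unsigned return value
   would give: LIMIT is converted to uint64_t first.  */
bool
below_limit_unsigned (uint64_t size, int limit)
{
  return size < (uint64_t) limit;
}
```

With size = UINT64_MAX and limit = 1000, the signed form sees -1 and
accepts the value, which is the preexisting hole described in the first
bullet; the unsigned form rejects it.  With a negative limit the two
forms also disagree, since the unsigned form turns the limit into a very
large number.  Keeping the cast preserves the signed behaviour for now.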

Thanks,
Richard


gcc/c-family/
* c-common.c (convert_vector_to_pointer_for_subscript): Remove
cast to unsigned type.

gcc/
* tree.h (tree_to_uhwi): Return an unsigned HOST_WIDE_INT.
* tree.c (tree_to_uhwi): Return an unsigned HOST_WIDE_INT.
(tree_ctz): Remove cast to unsigned type.
* builtins.c (fold_builtin_memory_op): Likewise.
* dwarf2out.c (descr_info_loc): Likewise.
* godump.c (go_output_typedef): Likewise.
* omp-low.c (expand_omp_simd): Likewise.
* stor-layout.c (excess_unit_span): Likewise.
* tree-object-size.c (addr_object_size): Likewise.
* tree-sra.c (analyze_all_variable_accesses): Likewise.
* tree-ssa-forwprop.c (simplify_builtin_call): Likewise.
(simplify_rotate): Likewise.
* tree-ssa-strlen.c (adjust_last_stmt, handle_builtin_memcpy)
(handle_pointer_plus): Likewise.
* tree-switch-conversion.c (check_range): Likewise.
* tree-vect-patterns.c (vect_recog_rotate_pattern): Likewise.
* tsan.c (instrument_builtin_call): Likewise.
* cfgexpand.c (defer_stack_allocation): Add cast to HOST_WIDE_INT.
* trans-mem.c (tm_log_add): Likewise.
* config/aarch64/aarch64.c (aapcs_vfp_sub_candidate): Likewise.
* config/arm/arm.c (aapcs_vfp_sub_candidate): Likewise.
* config/rs6000/rs6000.c (rs6000_aggregate_candidate): Likewise.
* config/mips/mips.c (r10k_safe_mem_expr_p): Make offset unsigned.

Index: gcc/c-family/c-common.c
===
--- gcc/c-family/c-common.c 2013-11-16 10:13:53.825800713 +
+++ gcc/c-family/c-common.c 2013-11-16 10:14:40.373263297 +
@@ -11702,8 +11702,7 @@ convert_vector_to_pointer_for_subscript
 
   if (TREE_CODE (index) == INTEGER_CST)
 if (!tree_fits_uhwi_p (index)
-        || ((unsigned HOST_WIDE_INT) tree_to_uhwi (index)
-            >= TYPE_VECTOR_SUBPARTS (type)))
+        || tree_to_uhwi (index) >= TYPE_VECTOR_SUBPARTS (type))
       warning_at (loc, OPT_Warray_bounds, "index value is out of bound");
 
   c_common_mark_addressable_vec (*vecp);
Index: gcc/tree.h
===
--- gcc/tree.h  2013-11-16 10:14:00.618868694 +
+++ gcc/tree.h  2013-11-16 10:14:40.488264431 +
@@ -3665,7 +3665,7 @@ extern bool tree_fits_uhwi_p (const_tree
 #endif
   ;
 extern HOST_WIDE_INT tree_to_shwi (const_tree);
-extern HOST_WIDE_INT tree_to_uhwi (const_tree);
+extern unsigned HOST_WIDE_INT tree_to_uhwi (const_tree);
 #if !defined ENABLE_TREE_CHECKING && (GCC_VERSION >= 4003)
 extern inline __attribute__ ((__gnu_inline__)) HOST_WIDE_INT
 tree_to_shwi (const_tree t)
@@ -3674,7 +3674,7 @@ tree_to_shwi (const_tree t)
   return TREE_INT_CST_LOW (t);
 }
 
-extern inline __attribute__ ((__gnu_inline__)) HOST_WIDE_INT
+extern inline __attribute__ ((__gnu_inline__)) unsigned HOST_WIDE_INT
 tree_to_uhwi (const_tree t)
 {
   gcc_assert (tree_fits_uhwi_p (t));
Index: gcc/tree.c
===
--- gcc/tree.c  2013-11-16 10:14:00.604868554 +
+++ gcc/tree.c  2013-11-16 10:14:40.488264431 +
@@ -2211,8 +2211,7 @@ tree_ctz (const_tree expr)
 case LSHIFT_EXPR:
   ret1 = tree_ctz (TREE_OPERAND (expr, 0));
   if (tree_fits_uhwi_p (TREE_OPERAND (expr, 1))
-          && ((unsigned HOST_WIDE_INT) tree_to_uhwi (TREE_OPERAND (expr, 1))
-              < (unsigned HOST_WIDE_INT) prec))
+          && (tree_to_uhwi (TREE_OPERAND (expr, 1)) < prec))
{
  ret2 = tree_to_uhwi (TREE_OPERAND (expr, 1));
  return MIN (ret1 + ret2, prec);
@@ -2220,8 +2219,7 @@ tree_ctz (const_tree expr)
   return ret1;
 case RSHIFT_EXPR:
   if (tree_fits_uhwi_p (TREE_OPERAND (expr, 1))
-  ((unsigned 

Re: [3/10] Mechanical replacement of host_integerp (..., 1)

2013-11-16 Thread Richard Sandiford
Richard Sandiford <rdsandif...@googlemail.com> writes:
> Like the previous patch, but for host_integerp (x, 1) -> tree_fits_uhwi_p (x).

Should have been this patch.



tree-fits-uhwi-p.diff.bz2
Description: BZip2 compressed data


Re: [2/10] Mechanical replacement of host_integerp (..., 0)

2013-11-16 Thread Richard Sandiford
Richard Sandiford <rdsandif...@googlemail.com> writes:
> This is the result of using sed to replace all single-line
> host_integerp (x, 0)s with tree_fits_shwi_p (x), taking care to handle
> bracket nesting in x.

Bah, wrong patch, sorry.



tree-fits-shwi-p.diff.bz2
Description: BZip2 compressed data


Re: [Patch, mips] MIPS performance patch for PR 56552

2013-11-16 Thread Richard Sandiford
Richard Sandiford <rdsandif...@googlemail.com> writes:
> Steve Ellcey <sell...@mips.com> writes:
>> diff --git a/gcc/config/mips/mips.md b/gcc/config/mips/mips.md
>> index 0cda169..49c2bf7 100644
>> --- a/gcc/config/mips/mips.md
>> +++ b/gcc/config/mips/mips.md
>> @@ -6721,7 +6721,7 @@
>>  (define_insn "*mov<GPR:mode>_on_<MOVECC:mode>"
>>    [(set (match_operand:GPR 0 "register_operand" "=d,d")
>>          (if_then_else:GPR
>> -         (match_operator:MOVECC 4 "equality_operator"
>> +         (match_operator 4 "equality_operator"
>>                  [(match_operand:MOVECC 1 "register_operand" "<MOVECC:reg>,<MOVECC:reg>")
>>                   (const_int 0)])
>>           (match_operand:GPR 2 "reg_or_0_operand" "dJ,0")
>
> Sorry, I didn't notice this before, but we should remove _on_<MOVECC:mode>
> from the name of the insn.  Same for the FP version.
>
> OK with that change, thanks.

Sorry, MOVECC is still used of course.  The patch is OK as-is.

Thanks,
Richard




Re: [PATCH] Fix lto bootstrap verification failure with -freorder-blocks-and-partition

2013-11-16 Thread Teresa Johnson
On Sat, Nov 16, 2013 at 12:33 AM, Jan Hubicka <hubi...@ucw.cz> wrote:
>> When testing with -freorder-blocks-and-partition enabled, I hit a
>> verification failure in an LTO profiledbootstrap. Edge forwarding
>> performed when we went into cfg layout mode after bb reordering
>> (during compgotos) created a situation where a hot block was then
>> dominated by a cold block and was therefore remarked as cold. Because
>> bb reorder was complete at that point, it was not moved in the
>> physical layout, and we incorrectly went in and out of the cold
>> section multiple times.
>>
>> The following patch addresses that by fixing the layout when we move
>> blocks to the cold section after bb reordering is complete.
>>
>> Tested with an LTO profiledbootstrap with
>> -freorder-blocks-and-partition enabled. Ok for trunk?
>>
>> Thanks,
>> Teresa
>>
>> 2013-11-15  Teresa Johnson  <tejohn...@google.com>
>>
>> * cfgrtl.c (fixup_partitions): Reorder blocks if necessary.
>
> computed_gotos just unfactors unified blocks that we use to avoid CFGs with
> O(n^2) edges. This is mostly to avoid problems with nonlinearity of other
> passes and to reduce the quadratic memory use case to one function at a time.
>
> I wonder if it won't be cleaner to simply unfactor those just before
> pass_reorder_blocks.
>
> Computed gotos are used e.g. in libjava interpreter to optimize the tight
> interpreting loop.  I think those cases would benefit from having at least
> scheduling/reordering and alignments done right.
>
> Of course it depends on how bad the compile time implications are (I think
> in addition to libjava, there was a lucier's testcase that made us go for
> this trick), but I would prefer it over adding yet another hack into cfgrtl...
> We also may just avoid cfglayout cleanup_cfg while doing computed gotos...

Note I haven't done an extensive check to see if compgotos is the only
phase that goes back into cfglayout mode after bb reordering is done,
that's just the one that hit this. Eventually it might be good to
prevent going into cfglayout mode after bb reordering.

For now we could either fix up the layout as I am doing here. Or as
you suggest, prevent some cleanup/cfg optimization after bb reordering
is done. I thought about preventing the forwarding optimization after
bb reordering when splitting was on initially, but didn't want
enabling -freorder-blocks-and-partition to unnecessarily prevent
optimization. The reordering seemed reasonably straightforward so I
went with that solution in this patch.

Let me know if you'd rather have the solution of preventing the
forwarding (or maybe all of try_optimize_cfg to be safe) under
-freorder-blocks-and-partition after bb reordering.

Thanks,
Teresa


> Honza



-- 
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413


Re: [PATCH] Fix lto bootstrap verification failure with -freorder-blocks-and-partition

2013-11-16 Thread Jan Hubicka
> Note I haven't done an extensive check to see if compgotos is the only
> phase that goes back into cfglayout mode after bb reordering is done,
> that's just the one that hit this. Eventually it might be good to
> prevent going into cfglayout mode after bb reordering.

Can we just try to abort when into cfg layout is called after
bb reorder?  It seems to make sense to avoid that - in/out will
definitely result in misplaced gotos and other stuff.

> For now we could either fix up the layout as I am doing here. Or as
> you suggest, prevent some cleanup/cfg optimization after bb reordering
> is done. I thought about preventing the forwarding optimization after
> bb reordering when splitting was on initially, but didn't want
> enabling -freorder-blocks-and-partition to unnecessarily prevent
> optimization. The reordering seemed reasonably straightforward so I
> went with that solution in this patch.
>
> Let me know if you'd rather have the solution of preventing the
> forwarding (or maybe all of try_optimize_cfg to be safe) under
> -freorder-blocks-and-partition after bb reordering.

Generally I would like to be consistent about the stage of IL - i.e. go to
cfglayout after RTL expansion and stay in it until after the bb reorder and
then consistently work with the actual insns layout we decided on.

Honza

> Thanks,
> Teresa
>
>
> > Honza
>
>
>
> -- 
> Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413


[PING] [PATCH] Add POST_LINK_SPEC for adding possibility of additional steps after linking

2013-11-16 Thread Andris Pavenis

On 11/05/2013 02:09 PM, Andris Pavenis wrote:

Attached patch adds a possibility to add additional build steps after linking.
Without this patch the only possibility is to redefine the entire
LINK_COMMAND_SPEC.  Currently only DJGPP seems to need it.

2013-11-05  Andris Pavenis  <andris.pave...@iki.fi>

* gcc/gcc.c: Add macro POST_LINK_SPEC for specifying additional steps
to invoke after linking.
* gcc/doc/tm.texi.in (POST_LINK_SPEC): New.
* gcc/doc/tm.texi: Regenerate.

Bootstrapped and tested on Linux x86_64 (Fedora 19)

Andris



Original post

http://gcc.gnu.org/ml/gcc-patches/2013-11/msg00367.html

Andris




Re: libbacktrace patch RFC: Look up variables in backtrace_syminfo

2013-11-16 Thread Ian Lance Taylor
On Fri, Nov 15, 2013 at 1:34 PM, Jakub Jelinek <ja...@redhat.com> wrote:
> On Fri, Nov 15, 2013 at 01:26:54PM -0800, Ian Lance Taylor wrote:
>> Jakub asked whether it would be possible to extend backtrace_syminfo to
>> work for variables as well as functions.  It's a straightforward
>> extension, implemented by this patch.  Bootstrapped and ran libbacktrace
>> tests on x86_64-unknown-linux-gnu.  Any comments on this patch before I
>> submit it?
>
> Looks good to me.

Committed.

> OT,
>
>    permanent buffer.  If THREADED is non-zero the state may be
>    accessed by multiple threads simultaneously, and the library will
>    use appropriate locks (this requires that the library be configured
>    with --enable-backtrace-threads).  If THREADED is zero the state
>
> in backtrace.h in backtrace_create_state comment doesn't look to be
> up to date, there is no --enable-backtrace-threads it seems, just
> depending on configure either it is thread safe or not (and doesn't use
> locks).

Thanks.  I committed the following patch to correct the comment.

Ian


2013-11-16  Ian Lance Taylor  i...@google.com

* backtrace.h (backtrace_create_state): Correct comment about
threading.
Index: backtrace.h
===
--- backtrace.h	(revision 204904)
+++ backtrace.h	(working copy)
@@ -89,8 +89,7 @@ typedef void (*backtrace_error_callback)
system-specific path names.  If not NULL, FILENAME must point to a
permanent buffer.  If THREADED is non-zero the state may be
accessed by multiple threads simultaneously, and the library will
-   use appropriate locks (this requires that the library be configured
-   with --enable-backtrace-threads).  If THREADED is zero the state
+   use appropriate atomic operations.  If THREADED is zero the state
may only be accessed by one thread at a time.  This returns a state
pointer on success, NULL on error.  If an error occurs, this will
call the ERROR_CALLBACK routine.  */


Re: [PATCH][ARM] Add Cortex-A53 rtx costs table

2013-11-16 Thread Richard Earnshaw (home)
On 15 Nov 2013, at 15:42, Kyrill Tkachov kyrylo.tkac...@arm.com wrote:

 Hi all,
 
 This patch adds the rtx costs table for the Cortex-A53. It goes in the new 
 aarch-cost-tables.h file because we will want to share it with AArch64.
 
 We add a corresponding tuning struct and set the tuning from generic cortex 
 tuning to the new one.
 
 Tested arm-none-eabi on model.
 
 Ok for trunk?
 
 Thanks,
 Kyrill
 
 
 2013-11-15  Kyrylo Tkachov  kyrylo.tkac...@arm.com
 
 * config/arm/aarch-cost-tables.h (cortexa53_extra_costs): New table.
 * config/arm/arm.c (arm_cortex_a53_tune): New.
 * config/arm/arm-cores.def (cortex-a53): Use cortex_a53 tuning struct.
 a53-costs.patch

Ok.

R.

Re: [wide-int] Documentation and comment tweaks

2013-11-16 Thread Kenneth Zadeck

On 11/16/2013 05:49 AM, Richard Sandiford wrote:

Richard Sandiford rdsandif...@googlemail.com writes:

Some minor tweaks to the documentation and commentary.  The hyphenation
and non zero -> nonzero changes are supposed to be per guidelines:

http://gcc.gnu.org/codingconventions.html#Spelling

Hope I got them right.

OK to install?

sorry, yes it is ok to install.

Ping.


Index: gcc/dfp.c
===
--- gcc/dfp.c   2013-11-09 09:50:47.392396760 +
+++ gcc/dfp.c   2013-11-09 11:07:22.754160541 +
@@ -605,8 +605,8 @@ decimal_real_to_integer (const REAL_VALU
return real_to_integer (to);
  }
  
-/* Likewise, but returns a wide_int with PRECISION.  Fail

-   is set if the value does not fit.  */
+/* Likewise, but returns a wide_int with PRECISION.  *FAIL is set if the
+   value does not fit.  */
  
  wide_int

  decimal_real_to_integer (const REAL_VALUE_TYPE *r, bool *fail, int precision)
Index: gcc/doc/rtl.texi
===
--- gcc/doc/rtl.texi2013-11-09 09:50:47.392396760 +
+++ gcc/doc/rtl.texi2013-11-09 11:07:22.755160549 +
@@ -1542,11 +1542,10 @@ Similarly, there is only one object for
  @findex const_double
  @item (const_double:@var{m} @var{i0} @var{i1} @dots{})
  This represents either a floating-point constant of mode @var{m} or
-(on ports older ports that do not define
+(on older ports that do not define
  @code{TARGET_SUPPORTS_WIDE_INT}) an integer constant too large to fit
  into @code{HOST_BITS_PER_WIDE_INT} bits but small enough to fit within
-twice that number of bits (GCC does not provide a mechanism to
-represent even larger constants).  In the latter case, @var{m} will be
+twice that number of bits.  In the latter case, @var{m} will be
  @code{VOIDmode}.  For integral values constants for modes with more
  bits than twice the number in @code{HOST_WIDE_INT} the implied high
  order bits of that constant are copies of the top bit of
@@ -1576,25 +1575,25 @@ the precise bit pattern used by the targ
  This contains an array of @code{HOST_WIDE_INTS} that is large enough
  to hold any constant that can be represented on the target.  This form
  of rtl is only used on targets that define
-@code{TARGET_SUPPORTS_WIDE_INT} to be non zero and then
-@code{CONST_DOUBLES} are only used to hold floating point values.  If
+@code{TARGET_SUPPORTS_WIDE_INT} to be nonzero and then
+@code{CONST_DOUBLE}s are only used to hold floating-point values.  If
  the target leaves @code{TARGET_SUPPORTS_WIDE_INT} defined as 0,
  @code{CONST_WIDE_INT}s are not used and @code{CONST_DOUBLE}s are as
  they were before.
  
-The values are stored in a compressed format.   The higher order

+The values are stored in a compressed format.  The higher-order
  0s or -1s are not represented if they are just the logical sign
  extension of the number that is represented.
  
  @findex CONST_WIDE_INT_VEC

  @item CONST_WIDE_INT_VEC (@var{code})
  Returns the entire array of @code{HOST_WIDE_INT}s that are used to
-store the value.   This macro should be rarely used.
+store the value.  This macro should be rarely used.
  
  @findex CONST_WIDE_INT_NUNITS

  @item CONST_WIDE_INT_NUNITS (@var{code})
  The number of @code{HOST_WIDE_INT}s used to represent the number.
-Note that this generally be smaller than the number of
+Note that this generally is smaller than the number of
  @code{HOST_WIDE_INT}s implied by the mode size.
  
  @findex CONST_WIDE_INT_ELT
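
The compressed format described in the rtl.texi hunk above (high-order 0s or -1s are dropped when they are just the sign extension of the element below) can be illustrated with a small standalone sketch. This is not GCC code: `canonical_len` is a hypothetical helper, and `int64_t` stands in for `HOST_WIDE_INT` under the assumption of a 64-bit host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of GCC.  VAL holds LEN 64-bit elements,
   least significant first.  Return how many elements are still needed
   once redundant sign-extension elements at the top are dropped.  */
static size_t
canonical_len (const int64_t *val, size_t len)
{
  assert (len > 0);
  while (len > 1)
    {
      int64_t top = val[len - 1];
      /* What the top element would be if it were pure sign extension
         of the element below it: all zeros or all ones.  */
      int64_t ext = val[len - 2] < 0 ? -1 : 0;
      if (top != ext)
        break;
      len--;
    }
  return len;
}
```

Under these assumptions, `{5, 0, 0}` needs one element, while `{INT64_MIN, 0}` keeps both, because dropping the zero would turn a positive value into a negative one.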

Index: gcc/doc/tm.texi
===
--- gcc/doc/tm.texi 2013-11-09 09:50:47.392396760 +
+++ gcc/doc/tm.texi 2013-11-09 11:07:22.757160564 +
@@ -9683,10 +9683,9 @@ Returns the negative of the floating poi
  Returns the absolute value of @var{x}.
  @end deftypefn
  
-@deftypefn Macro void REAL_VALUE_FROM_INT (REAL_VALUE_TYPE @var{x}, HOST_WIDE_INT @var{val}, enum machine_mode @var{mode})

-Converts a double-precision integer found in @var{val},
-into a floating point value which is then stored into @var{x}.  The
-value is truncated to fit in mode @var{mode}.
+@deftypefn Macro void REAL_VALUE_FROM_INT (REAL_VALUE_TYPE @var{x}, const wide_int_ref @var{val}, enum machine_mode @var{mode})
+Converts integer @var{val} into a floating-point value which is then
+stored into @var{x}.  The value is truncated to fit in mode @var{mode}.
  @end deftypefn
  
  @node Mode Switching

@@ -11497,15 +11496,15 @@ The default value of this hook is based
  @defmac TARGET_SUPPORTS_WIDE_INT
  
  On older ports, large integers are stored in @code{CONST_DOUBLE} rtl

-objects.  Newer ports define @code{TARGET_SUPPORTS_WIDE_INT} to be non
-zero to indicate that large integers are stored in
+objects.  Newer ports define @code{TARGET_SUPPORTS_WIDE_INT} to be nonzero
+to indicate that large integers are stored in
  @code{CONST_WIDE_INT} rtl objects.  The @code{CONST_WIDE_INT} allows
  very large integer constants 

Re: [PATCH] Fix various reassoc issues (PR tree-optimization/58791, tree-optimization/58775)

2013-11-16 Thread H.J. Lu
On Tue, Oct 22, 2013 at 6:09 AM, Jakub Jelinek ja...@redhat.com wrote:
 Hi!

 I've spent over two days looking at reassoc, fixing spots where
 we invalidly reused SSA_NAMEs (this results in wrong-debug, as the added
 guality testcases show, even some ICEs (pr58791-3.c) and wrong range info
 for SSA_NAMEs) and cleaning up the stmt scheduling stuff (e.g. all gsi_move*
 calls are gone, if we need to move something or set an SSA_NAME to
 different value than previously, we'll now always create new
 stmt and the old one depending on the case either remove or mark as visited
 zero uses, so that it will be removed later on by reassociate_bb.
 Of course some gimple_assign_set_rhs* etc. calls are still valid even
 without creating new stmts, optimizing some statement to equivalent
 computation is fine, but computing something different in an old SSA_NAME
 is not.

 I've also noticed that build_and_add_sum was using different framework from
 rewrite_expr_tree, the former was using stmt_dominates_stmt_p (which is IMHO
 quite clean interface, but with the added uid stuff in reassoc can be
 unnecessarily slow on large basic blocks) and rewrite_expr_tree was using
 worse APIs, but using the uids.  So, the patch also unifies that, into
 a new reassoc_stmt_dominates_stmt_p that has the same semantics as the
 tree-ssa-loop-niter.c function, but uses uids internally.  rewrite_expr_tree
 is changed so that it recurses first, then handles current level (which is
 needed if the recursion needs to create new stmt and give back a new
 SSA_NAME), which allowed removing the ensure_ops_are_available recursive
 stuff.  Also, uids are now computed in break_up_subtract_bb (and are per-bb,
 starting with 1, we never compare uids from different bbs), which allows
 us to get rid of an extra whole IL walk.
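
The uid-based dominance check described above can be sketched in isolation. This is a toy model under stated assumptions, not the code in tree-ssa-reassoc.c: `toy_stmt`, `toy_stmt_dominates_stmt_p` and `linear_bb_dominates` are hypothetical names, uids are assumed unique within a block, and any special cases the real function may need are omitted.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a statement; not GCC's gimple type.  */
struct toy_stmt
{
  int bb;        /* index of the containing basic block */
  unsigned uid;  /* position within that block, starting at 1 */
};

/* Hypothetical block-level dominance query supplied by the caller.  */
typedef bool (*bb_dominates_fn) (int bb1, int bb2);

/* Return true if S1 dominates S2.  Within one block only the uids are
   compared, so no walk over the block is needed; uids from different
   blocks are never compared with each other.  */
static bool
toy_stmt_dominates_stmt_p (const struct toy_stmt *s1,
                           const struct toy_stmt *s2,
                           bb_dominates_fn bb_dominates)
{
  if (s1->bb == s2->bb)
    return s1->uid < s2->uid;
  return bb_dominates (s1->bb, s2->bb);
}

/* Example oracle for a straight-line CFG 0 -> 1 -> 2 -> ...  */
static bool
linear_bb_dominates (int bb1, int bb2)
{
  return bb1 <= bb2;
}
```

The point of the design is that the same-block case becomes a constant-time comparison, which matters on large basic blocks.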

 For the inter-bb optimization, I had to stop modifying stmts right away
 in update_range_test, because we don't want to reuse SSA_NAMEs and if we
 modified there, we'd need to modify potentially many dependent SSA_NAMEs
 and sometimes many times.  So, now it instead just updates oe->op values
 and maybe_optimize_range_tests just looks at those values and updates
 what is needed.

 Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

 For 4.8 a partial backport would be possible, but quite a lot of work,
 for 4.7 I'd prefer not to backport given that there gsi_for_stmt isn't O(1).

 2013-10-22  Jakub Jelinek  ja...@redhat.com

 PR tree-optimization/58775
 PR tree-optimization/58791
 * tree-ssa-reassoc.c (reassoc_stmt_dominates_stmt_p): New function.
 (insert_stmt_after): Rewritten, don't move the stmt, but really
 insert it.
 (get_stmt_uid_with_default): Remove.
 (build_and_add_sum): Use insert_stmt_after and
 reassoc_stmt_dominates_stmt_p.  Fix up uid if bb contains only
 labels.
 (update_range_test): Set uid on stmts added by
 force_gimple_operand_gsi.  Don't immediately modify statements
 in inter-bb optimization, just update oe->op values.
 (optimize_range_tests): Return bool whether any changed have
 been made.
 (update_ops): New function.
 (struct inter_bb_range_test_entry): New type.
 (maybe_optimize_range_tests): Perform statement changes here.
 (not_dominated_by, appears_later_in_bb, get_def_stmt,
 ensure_ops_are_available): Remove.
 (find_insert_point): Rewritten.
 (rewrite_expr_tree): Remove MOVED argument, add CHANGED argument,
 return LHS of the (new resp. old) stmt.  Don't call
 ensure_ops_are_available, don't reuse SSA_NAMEs, recurse first
 instead of last, move new stmt at the right place.
 (linearize_expr, repropagate_negates): Don't reuse SSA_NAMEs.
 (negate_value): Likewise.  Set uids.
 (break_up_subtract_bb): Initialize uids.
 (reassociate_bb): Adjust rewrite_expr_tree caller.
 (do_reassoc): Don't call renumber_gimple_stmt_uids.


It caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59154

H.J.


Re: [PATCH][1-3] New configure option to enable Position independent executable as default.

2013-11-16 Thread Ryan Hill
On Wed, 13 Nov 2013 23:28:45 +0100
Magnus Granberg zo...@gentoo.org wrote:

 Hi
 This patchset adds a new configure option, --enable-default-pie.
 When the option is enabled, the gcc and g++ front ends pass -fPIE and -pie
 by default.  I have only added support for two targets, but it should work
 on more.  The new option is added in configure.ac.  We can't compile the
 compiler or the crt files with -fPIE because that breaks PCH and the
 crtbegin and crtend files, so it is disabled for them in the Makefiles.
 The needed spec is added to DRIVER_SELF_SPECS.  We disable all the
 profiling tests because linking would fail.  Tested on x86_64 linux (Gentoo).
 
 /Magnus Granberg

Hey Magnus.  Some nits:

 --- a/gcc/configure.ac2013-09-25 18:10:35.0 +0200
 +++ b/gcc/configure.ac2013-10-22 21:26:56.287602139 +0200
 @@ -5434,6 +5434,31 @@ if test x${LINKER_HASH_STYLE} != x; th
   [The linker hash style])
  fi
  
 +# Check whether --enable-default-pie was given and target have the support.
 +AC_ARG_ENABLE(default-pie,
 +[AS_HELP_STRING([--enable-default-pie], [Enable Position independent 
 executable as default.

Help strings begin with a lowercase letter and do not end with a period, e.g.
"enable Position Independent Executables by default".

 + If we have suppot for it when compiling and linking.
 + Linux targets supported i?86 and x86_64.])],

I would drop these lines.

 +enable_default_pie=$enableval,
 +enable_default_pie=no)
 +if test x$enable_default_pie = xyes; then
 +  AC_MSG_CHECKING(if $target support to default with -fPIE and link with 
 -pie as default)

if $target supports default PIE

 +  enable_default_pie=no
 +  case $target in
 +i?86*-*-linux* | x86_64*-*-linux*)
 +  enable_default_pie=yes
 +  ;;
 +*)
 +  ;;
 +esac
 +  AC_MSG_RESULT($enable_default_pie)
 +fi
 +if test x$enable_default_pie == xyes ; then
 +  AC_DEFINE(ENABLE_DEFAULT_PIE, 1,
 +  [Define if your target support default-pie and you have enable it.])

supports default PIE and it is enabled.

 +fi
 +AC_SUBST([enable_default_pie])
 +
  # Configure the subdirectories
  # AC_CONFIG_SUBDIRS($subdirs)
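
To see which target triplets the `case` patterns in the hunk above accept, here is a small standalone check. It assumes that shell `case` globbing behaves like POSIX `fnmatch` with no flags for these two patterns; `target_defaults_to_pie` is a hypothetical helper, not something the patch adds.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stdbool.h>
#include <stddef.h>

/* Patterns copied from the case statement in the patch.  */
static const char *const pie_target_patterns[] = {
  "i?86*-*-linux*",
  "x86_64*-*-linux*",
};

/* Hypothetical helper: return true if TARGET matches any pattern.  */
static bool
target_defaults_to_pie (const char *target)
{
  size_t n = sizeof pie_target_patterns / sizeof pie_target_patterns[0];
  for (size_t i = 0; i < n; i++)
    if (fnmatch (pie_target_patterns[i], target, 0) == 0)
      return true;
  return false;
}
```

With these patterns, triplets such as `i686-pc-linux-gnu` and `x86_64-unknown-linux-gnu` match, while `powerpc64-unknown-linux-gnu` does not.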
  
 --- a/gcc/doc/install.texi2013-10-01 19:29:40.0 +0200
 +++ b/gcc/doc/install.texi2013-11-09 15:40:20.831402110 +0100
 @@ -1421,6 +1421,11 @@ do a @samp{make -C gcc gnatlib_and_tools
  Specify that the run-time libraries for stack smashing protection
  should not be built.
  
 +@item --enable-default-pie
 +We will turn on @option{-fPIE} and @option{-pie} as default when
 +compileing and linking if the support is there. We only support 
 +i?86-*-linux* and x86-64-*-linux* as target for now.

Turn on @option{-fPIE} and @option{-pie} by default if supported.  
Currently supported targets are i?86-*-linux* and x86-64-*-linux*.

Also two spaces between sentences.

 --- a/gcc/doc/invoke.texi 2012-03-01 10:57:59.0 +0100
 +++ b/gcc/doc/invoke.texi 2012-07-30 00:57:03.766847851 +0200
 @@ -9457,6 +9480,12 @@ For predictable results, you must also s
  that were used to generate code (@option{-fpie}, @option{-fPIE},
  or model suboptions) when you specify this option.
  
 +NOTE: With configure --enable-default-pie  this option is enabled by default 

Extra space (also in the hunk for fPIE).

 +for C, C++, ObjC, ObjC++, if none of @option{-fno-PIE}, @option{-fno-pie}, 
 +@option{-fPIC}, @option{-fpic}, @option{-fno-PIC}, @option{-fno-pic}, 
 +@option{-nostdlib}, @option{-nostartfiles}, @option{-shared}, 
 +@option{-nodefaultlibs}, nor @option{static} are found.

Looks like nodefaultlibs is missing from PIE_DRIVER_SELF_SPECS or this needs
to be updated.

Thanks!


-- 
Ryan Hill                        psn: dirtyepic_sk
   gcc-porting/toolchain/wxwidgets @ gentoo.org

47C3 6D62 4864 0E49 8E9E  7F92 ED38 BD49 957A 8463




[PATCH, rs6000] Emit correct note for DWARF CFI information on LE prolog VSX stores

2013-11-16 Thread Bill Schmidt
Hi,

For VSX in little endian we currently split vector register stores into
a permute/store pair.  For prolog stores, this results in a
REG_FRAME_RELATED_EXPR note that doesn't have a simple register for its
RHS, which it needs to have.  This patch detects that situation and
ensures we produce the correct note.

This problem was breaking bootstrap when configured with
--with-cpu=power7, something we hadn't tried before.  With the patch we
now get past stage 1.  There is at least one wrong-code bug to track
down in stage 2, but modifying this note is clearly not involved with
that.

Otherwise bootstrapped and tested on powerpc64-unknown-linux-gnu with no
regressions on the big-endian side, also bootstrapped with
--with-cpu=power7.  Is this ok for trunk?

Thanks,
Bill


2011-11-16  Bill Schmidt  wschm...@linux.vnet.ibm.com

* config/rs6000/rs6000.c (rs6000_frame_related): Add split_reg
parameter and use it in REG_FRAME_RELATED_EXPR note.
(emit_frame_save): Call rs6000_frame_related with extra NULL_RTX
parameter.
(rs6000_emit_prologue): Likewise, but for little endian VSX
stores, pass the source register of the store instead.


Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 204861)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -21439,7 +21439,7 @@ output_probe_stack_range (rtx reg1, rtx reg2)
 
 static rtx
 rs6000_frame_related (rtx insn, rtx reg, HOST_WIDE_INT val,
- rtx reg2, rtx rreg)
+ rtx reg2, rtx rreg, rtx split_reg)
 {
   rtx real, temp;
 
@@ -21530,6 +21530,11 @@ rs6000_frame_related (rtx insn, rtx reg, HOST_WIDE
  }
 }
 
+  /* If a store insn has been split into multiple insns, the
+ true source register is given by split_reg.  */
+  if (split_reg != NULL_RTX)
+real = gen_rtx_SET (VOIDmode, SET_DEST (real), split_reg);
+
   RTX_FRAME_RELATED_P (insn) = 1;
   add_reg_note (insn, REG_FRAME_RELATED_EXPR, real);
 
@@ -21637,7 +21642,7 @@ emit_frame_save (rtx frame_reg, enum machine_mode
   reg = gen_rtx_REG (mode, regno);
   insn = emit_insn (gen_frame_store (reg, frame_reg, offset));
   return rs6000_frame_related (insn, frame_reg, frame_reg_to_sp,
-  NULL_RTX, NULL_RTX);
+  NULL_RTX, NULL_RTX, NULL_RTX);
 }
 
 /* Emit an offset memory reference suitable for a frame store, while
@@ -22217,7 +22222,7 @@ rs6000_emit_prologue (void)
 
   insn = emit_insn (gen_rtx_PARALLEL (VOIDmode, p));
   rs6000_frame_related (insn, frame_reg_rtx, sp_off - frame_off,
-   treg, GEN_INT (-info->total_size));
+   treg, GEN_INT (-info->total_size), NULL_RTX);
   sp_off = frame_off = info->total_size;
 }
 
@@ -22302,7 +22307,7 @@ rs6000_emit_prologue (void)
 
  insn = emit_move_insn (mem, reg);
  rs6000_frame_related (insn, frame_reg_rtx, sp_off - frame_off,
-   NULL_RTX, NULL_RTX);
+   NULL_RTX, NULL_RTX, NULL_RTX);
  END_USE (0);
}
 }
@@ -22358,7 +22363,7 @@ rs6000_emit_prologue (void)
 info->lr_save_offset,
 DFmode, sel);
   rs6000_frame_related (insn, ptr_reg, sp_off,
-   NULL_RTX, NULL_RTX);
+   NULL_RTX, NULL_RTX, NULL_RTX);
   if (lr)
END_USE (0);
 }
@@ -22437,7 +22442,7 @@ rs6000_emit_prologue (void)
 SAVRES_SAVE | SAVRES_GPR);
 
  rs6000_frame_related (insn, spe_save_area_ptr, sp_off - save_off,
-   NULL_RTX, NULL_RTX);
+   NULL_RTX, NULL_RTX, NULL_RTX);
}
 
   /* Move the static chain pointer back.  */
@@ -22487,7 +22492,7 @@ rs6000_emit_prologue (void)
 info->lr_save_offset + ptr_off,
 reg_mode, sel);
   rs6000_frame_related (insn, ptr_reg, sp_off - ptr_off,
-   NULL_RTX, NULL_RTX);
+   NULL_RTX, NULL_RTX, NULL_RTX);
   if (lr)
END_USE (0);
 }
@@ -22503,7 +22508,7 @@ rs6000_emit_prologue (void)
 info->gp_save_offset + frame_off + reg_size * i);
   insn = emit_insn (gen_rtx_PARALLEL (VOIDmode, p));
   rs6000_frame_related (insn, frame_reg_rtx, sp_off - frame_off,
-   NULL_RTX, NULL_RTX);
+   NULL_RTX, NULL_RTX, NULL_RTX);
 }
   else if (!WORLD_SAVE_P (info))
 {
@@ -22826,7 +22831,7 @@ rs6000_emit_prologue (void)
 info->altivec_save_offset + ptr_off,
 0, V4SImode, SAVRES_SAVE | SAVRES_VR);
   rs6000_frame_related (insn, scratch_reg, sp_off - ptr_off,
-

Re: [PowerPC] libffi fixes and support for PowerPC64 ELFv2

2013-11-16 Thread Alan Modra
On Sat, Nov 16, 2013 at 10:18:05PM +1030, Alan Modra wrote:
 The following six patches correspond to patches posted to the libffi
 mailing list a few days ago to add support for PowerPC64 ELFv2.  The

The ChangeLog just became easier to write.  :)

* src/powerpc/ffitarget.h: Import from upstream.
* src/powerpc/ffi.c: Likewise.
* src/powerpc/linux64.S: Likewise.
* src/powerpc/linux64_closure.S: Likewise.
* doc/libffi.texi: Likewise.
* testsuite/libffi.call/cls_double_va.c: Likewise.
* testsuite/libffi.call/cls_longdouble_va.c: Likewise.

OK to apply?

-- 
Alan Modra
Australia Development Lab, IBM


[RFA][PATCH]Fix 59019

2013-11-16 Thread Jeff Law


59019 is currently latent on the trunk, but it's likely to fail again at 
some point.


The problem we have is combine transforms a conditional trap into an 
unconditional trap.


Conditional traps are not considered control flow insns, but 
unconditional traps are.


Thus, if we turn a conditional trap in the middle of a block into an 
unconditional trap, we end up with a control flow insn in the middle of 
a block and trip a checking assert.


This is, IMHO, a bandaid.  The inconsistency is amazingly annoying.  But 
I've got bigger fish to fry and I was unhappy with the number of issues 
I was running into when I tried to make conditional traps control flow 
insns.


Basically when we see an unconditional trap after we've done combining, 
we remove all the insns after the trap to the end of the block, delete 
the block's outgoing edges and emit a barrier into the block's footer.


It's similar in spirit to the cleanups we do for other situations.
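
The cleanup described above can be modelled on a toy basic block. This is only a sketch under simplifying assumptions: `toy_bb` and `toy_fixup_uncond_trap` are hypothetical, insns are reduced to an enum, and edges to plain integers. The actual patch operates on GCC's RTL insn chain and CFG.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_code { TOY_INSN, TOY_COND_TRAP, TOY_UNCOND_TRAP };

#define TOY_MAX_INSNS 8
#define TOY_MAX_SUCCS 2

/* Toy basic block; not GCC's basic_block.  */
struct toy_bb
{
  enum toy_code insns[TOY_MAX_INSNS];
  size_t n_insns;
  int succs[TOY_MAX_SUCCS];   /* indices of successor blocks */
  size_t n_succs;
  bool barrier_after;         /* block does not fall through */
};

/* If the insn at index POS is an unconditional trap, drop every insn
   after it, remove all outgoing edges and record a barrier.  Return
   true if the block was changed, so the caller can request a CFG
   cleanup.  */
static bool
toy_fixup_uncond_trap (struct toy_bb *bb, size_t pos)
{
  assert (pos < bb->n_insns);
  if (bb->insns[pos] != TOY_UNCOND_TRAP)
    return false;
  bb->n_insns = pos + 1;
  bb->n_succs = 0;
  bb->barrier_after = true;
  return true;
}
```

A conditional trap in the middle of the block leaves the block untouched; only the unconditional form triggers the truncation.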

Bootstrapped on ia64 with a hack installed to make this situation more 
likely to arise.


Ok for the trunk if it passes a bootstrap & regression test on 
x86_64-unknown-linux-gnu?


Jeff

* combine.c (try_combine): If we have created an unconditional trap,
make sure to fixup the insn stream & CFG appropriately.

diff --git a/gcc/combine.c b/gcc/combine.c
index 13f5e29..b3d20f2 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -4348,6 +4348,37 @@ try_combine (rtx i3, rtx i2, rtx i1, rtx i0, int *new_direct_jump_p,
   update_cfg_for_uncondjump (undobuf.other_insn);
 }
 
+  /* If we might have created an unconditional trap, then we have
+ cleanup work to do.
+
+ The fundamental problem is a conditional trap is not considered
+ control flow altering, while an unconditional trap is considered
+ control flow altering.
+
+ So while we could have a conditional trap in the middle of a block
+ we can not have an unconditional trap in the middle of a block.  */
+  if (GET_CODE (i3) == INSN
+      && GET_CODE (PATTERN (i3)) == TRAP_IF
+      && XEXP (PATTERN (i3), 0) == const1_rtx)
+{
+  basic_block bb = BLOCK_FOR_INSN (i3);
+  rtx last = get_last_bb_insn (bb);
+
+  /* First remove all the insns after the trap.  */
+  if (i3 != last)
+   delete_insn_chain (NEXT_INSN (i3), last, true);
+
+  /* And ensure there's no outgoing edges anymore.  */
+  while (EDGE_COUNT (bb->succs) > 0)
+   remove_edge (EDGE_SUCC (bb, 0));
+
+  /* And ensure cfglayout knows this block does not fall through.  */
+  emit_barrier_after_bb (bb);
+
+  /* Not exactly true, but gets the effect we want.  */
+  *new_direct_jump_p = 1;
+}
+
   /* A noop might also need cleaning up of CFG, if it comes from the
  simplification of a jump.  */
   if (JUMP_P (i3)