Re: [PATCH] Disable all target libraries if not building gcc
On Fri, 13 Mar 2015, H.J. Lu wrote:
> I am working on SHF_COMPRESSED support:
> http://www.sco.com/developers/gabi/latest/ch4.sheader.html
> I will import zlib from the GCC source tree into the binutils-gdb tree.
> But zlib failed to build as a target library in binutils-gdb.  There is
> no need to build target libraries if not building gcc.  OK for master?

This is definitely wrong.  newlib / libgloss is a target library that
certainly makes sense to build separately from GCC, and the toplevel
configure / build code should be identical in all three repositories
(GCC, binutils-gdb, newlib-cygwin).

-- 
Joseph S. Myers
jos...@codesourcery.com
Re: [PATCH] Disable all target libraries if not building gcc
On Mon, Mar 16, 2015 at 4:43 PM, H.J. Lu <hjl.to...@gmail.com> wrote:
> On Mon, Mar 16, 2015 at 4:41 PM, Joseph Myers <jos...@codesourcery.com> wrote:
>> On Mon, 16 Mar 2015, H.J. Lu wrote:
>>> On Mon, Mar 16, 2015 at 3:50 PM, Joseph Myers <jos...@codesourcery.com> wrote:
>>>> On Fri, 13 Mar 2015, H.J. Lu wrote:
>>>>> I am working on SHF_COMPRESSED support:
>>>>> http://www.sco.com/developers/gabi/latest/ch4.sheader.html
>>>>> I will import zlib from the GCC source tree into the binutils-gdb
>>>>> tree.  But zlib failed to build as a target library in binutils-gdb.
>>>>> There is no need to build target libraries if not building gcc.
>>>>> OK for master?
>>>> This is definitely wrong.  newlib / libgloss is a target library that
>>>> certainly makes sense to build separately from GCC, and the toplevel
>>>> configure / build code should be identical in all three repositories
>>>> (GCC, binutils-gdb, newlib-cygwin).
>>> We need to work out something to avoid building target libraries in
>>> binutils.
>> I suggest not building zlib as a target library unless libgcj is also
>> being built as a target library, given that there should be no need to
>> build it as a target library in isolation.
> The logic is there.  But somehow it doesn't work for binutils.  I will
> take another look.

Here is a patch.  It excludes target-zlib if target-libjava isn't built.
Tested in binutils and GCC with java.

-- 
H.J.

diff --git a/configure b/configure
index 9e54319..f9455e9 100755
--- a/configure
+++ b/configure
@@ -6432,9 +6432,6 @@ Supported languages are: ${potential_languages}" "$LINENO" 5
   ac_configure_args=`echo " $ac_configure_args" | sed -e "s/ '--enable-languages=[^ ]*'//g" -e "s/$/ '--enable-languages=$enable_languages'/"`
-else
-  # Disable all target libraries if not building gcc.
-  target_configdirs=
 fi
 
 # Handle --disable-component generically.
@@ -6554,6 +6551,13 @@ for i in ${target_configdirs_all} ; do
   fi
 done
 
+# Exclude target-zlib if target-libjava isn't built.
+if echo ${target_configdirs} | grep target-libjava > /dev/null 2>&1; then
+  :
+else
+  target_configdirs=`echo ${target_configdirs} | sed -e 's/target-zlib//'`
+fi
+
 # libiberty-linker-plugin is special: it doesn't have its own source directory,
 # so we have to add it after the preceding checks.
 if test x"$extra_linker_plugin_flags$extra_linker_plugin_configure_flags" != x
diff --git a/configure.ac b/configure.ac
index eaf4661..a5a0fd0 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2129,9 +2129,6 @@ Supported languages are: ${potential_languages}])
 AC_SUBST(stage1_languages)
   ac_configure_args=`echo " $ac_configure_args" | sed -e "s/ '--enable-languages=[[^ ]]*'//g" -e "s/$/ '--enable-languages=$enable_languages'/"`
-else
-  # Disable all target libraries if not building gcc.
-  target_configdirs=
 fi
 
 # Handle --disable-component generically.
@@ -2251,6 +2248,13 @@ for i in ${target_configdirs_all} ; do
   fi
 done
 
+# Exclude target-zlib if target-libjava isn't built.
+if echo ${target_configdirs} | grep target-libjava > /dev/null 2>&1; then
+  :
+else
+  target_configdirs=`echo ${target_configdirs} | sed -e 's/target-zlib//'`
+fi
+
 # libiberty-linker-plugin is special: it doesn't have its own source directory,
 # so we have to add it after the preceding checks.
 if test x"$extra_linker_plugin_flags$extra_linker_plugin_configure_flags" != x
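[Editorial note] For illustration only, the selection logic of the patch — drop target-zlib from the directory list unless target-libjava is present — can be sketched in C++ (the function name and string handling are mine, not part of the patch; the real logic lives in shell/sed in the toplevel configure):

```cpp
#include <string>

// Hypothetical mirror of the configure logic above: if the list of target
// directories does not mention target-libjava, strip target-zlib from it,
// just as the sed expression 's/target-zlib//' does.
std::string prune_target_zlib(std::string dirs) {
  if (dirs.find("target-libjava") == std::string::npos) {
    const std::string zlib = "target-zlib";
    std::string::size_type pos = dirs.find(zlib);
    if (pos != std::string::npos)
      dirs.erase(pos, zlib.size());  // leaves the surrounding whitespace, like sed
  }
  return dirs;
}
```

Note that, like the sed substitution, this leaves the neighbouring whitespace behind, which is harmless for a space-separated directory list.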
Re: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields
On Mon, 16 Mar 2015, Eric Botcazou wrote:
>> If we have BIGGEST_ALIGNMENT=16 that means we have likely a 16 bit
>> architecture.  I doubt that the strict alignment code makes any sense
>> for modesize > BIGGEST_ALIGNMENT.
> Note that m68k is a 32-bit port (UNITS_PER_WORD is 4) but, by definition
> of BIGGEST_ALIGNMENT, not honoring an alignment larger than it is
> perfectly OK.

The context of the above statement is somewhat ambiguous: if the
statement is taken to include "no matter what is specified in the
program, including use of __attribute__ ((__aligned__ ...))" then I have
to object.  (I guess Eric, you didn't actually mean that, though.)

The definition of BIGGEST_ALIGNMENT is (tm.texi.in): "Biggest alignment
that any data type can require on this machine, in bits.  Note that this
is not the biggest alignment that is supported, just the biggest
alignment that, when violated, may cause a fault."

So, IMNSHO we'd better *support* a *larger* alignment, as in "if the
code specified it by explicit means", at least up to MAX_OFILE_ALIGNMENT.
But, "perfectly OK" when the code didn't explicitly say anything else.

BTW, unfortunately, BIGGEST_ALIGNMENT can't be blindly followed for
atomic accesses to undecorated (without explicit alignment specifiers)
code either.  Rather, the minimum alignment is the natural alignment
*absolutely everywhere no matter what target*, which matters for targets
where BIGGEST_ALIGNMENT < natural_alignment(all supported types) (well,
as long as the natural alignment is smaller than the target page-size or
cache-line, where at least one of the concepts is usually applicable).

Q: So why not adjust the BIGGEST_ALIGNMENT definition in such targets to
be at least the natural alignment of supported atomic types?
A: Because that unfortunately has bad effects on generated code for all
accesses to all sizes now <= BIGGEST_ALIGNMENT.

brgds, H-P
PS. It's very unfortunate that there's now __BIGGEST_ALIGNMENT__ visible
to programs.
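[Editorial note] The distinction being argued — the *default* (natural) alignment versus an alignment the program *explicitly requests*, which should be supported even beyond BIGGEST_ALIGNMENT — can be made concrete with a small standalone C++ sketch (the 64-byte figure is an arbitrary illustration, not tied to any particular target's BIGGEST_ALIGNMENT):

```cpp
#include <cstddef>

// A type whose alignment is explicitly raised by the program, as one would
// with __attribute__ ((__aligned__ (64))) in GNU C; per the argument above,
// such an explicit request should be *supported* even when it exceeds the
// default alignment of any type.
struct alignas(64) OverAligned {
  int v;
};

// The natural (default) alignment of int is the minimum that undecorated
// accesses can rely on; the explicit request only ever raises it.
static_assert(alignof(OverAligned) == 64, "explicit alignment is honored");
static_assert(alignof(int) <= alignof(OverAligned), "explicit >= natural");

constexpr std::size_t over_aligned_alignment() { return alignof(OverAligned); }
```

The static_asserts encode exactly the two claims in the message: an explicitly requested alignment is honored, and it is never below the natural one.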
Re: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields
Hans-Peter Nilsson <h...@bitrange.com> writes:
> Q: So why not adjust the BIGGEST_ALIGNMENT definition in such targets
> to be at least the natural alignment of supported atomic types?

A: Because it's an ABI change.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
Re: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields
On Tue, 17 Mar 2015, Andreas Schwab wrote:
> Hans-Peter Nilsson <h...@bitrange.com> writes:
>> Q: So why not adjust the BIGGEST_ALIGNMENT definition in such targets
>> to be at least the natural alignment of supported atomic types?
> A: Because it's an ABI change.

I intended that to be included in "bad effects":

  A: Because that unfortunately has bad effects on generated code for
  all accesses to all sizes now <= BIGGEST_ALIGNMENT.

but thanks for being specific (and I didn't remember exactly *what* bad
effects it was that I saw when I tried that long ago :)

brgds, H-P
Re: [PATCH] Disable all target libraries if not building gcc
On Mon, Mar 16, 2015 at 7:26 PM, H.J. Lu <hjl.to...@gmail.com> wrote:
> On Mon, Mar 16, 2015 at 4:43 PM, H.J. Lu <hjl.to...@gmail.com> wrote:
>> On Mon, Mar 16, 2015 at 4:41 PM, Joseph Myers <jos...@codesourcery.com> wrote:
>>> On Mon, 16 Mar 2015, H.J. Lu wrote:
>>>> On Mon, Mar 16, 2015 at 3:50 PM, Joseph Myers <jos...@codesourcery.com> wrote:
>>>>> On Fri, 13 Mar 2015, H.J. Lu wrote:
>>>>>> I am working on SHF_COMPRESSED support:
>>>>>> http://www.sco.com/developers/gabi/latest/ch4.sheader.html
>>>>>> I will import zlib from the GCC source tree into the binutils-gdb
>>>>>> tree.  But zlib failed to build as a target library in binutils-gdb.
>>>>>> There is no need to build target libraries if not building gcc.
>>>>>> OK for master?
>>>>> This is definitely wrong.  newlib / libgloss is a target library
>>>>> that certainly makes sense to build separately from GCC, and the
>>>>> toplevel configure / build code should be identical in all three
>>>>> repositories (GCC, binutils-gdb, newlib-cygwin).
>>>> We need to work out something to avoid building target libraries in
>>>> binutils.
>>> I suggest not building zlib as a target library unless libgcj is also
>>> being built as a target library, given that there should be no need
>>> to build it as a target library in isolation.
>> The logic is there.  But somehow it doesn't work for binutils.  I will
>> take another look.
> Here is a patch.  It excludes target-zlib if target-libjava isn't
> built.  Tested in binutils and GCC with java.

This version is more flexible to support future target libraries which
depend on target zlib.

-- 
H.J.

diff --git a/configure b/configure
index 9e54319..b719d38 100755
--- a/configure
+++ b/configure
@@ -6554,6 +6551,15 @@ for i in ${target_configdirs_all} ; do
   fi
 done
 
+# Exclude target-zlib if target-libjava isn't built.
+case ${target_configdirs} in
+*target-libjava*)
+  ;;
+*)
+  target_configdirs=`echo ${target_configdirs} | sed -e 's/target-zlib//'`
+  ;;
+esac
+
 # libiberty-linker-plugin is special: it doesn't have its own source directory,
 # so we have to add it after the preceding checks.
 if test x"$extra_linker_plugin_flags$extra_linker_plugin_configure_flags" != x
diff --git a/configure.ac b/configure.ac
index eaf4661..a4e4c7d 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2251,6 +2248,15 @@ for i in ${target_configdirs_all} ; do
   fi
 done
 
+# Exclude target-zlib if target-libjava isn't built.
+case ${target_configdirs} in
+*target-libjava*)
+  ;;
+*)
+  target_configdirs=`echo ${target_configdirs} | sed -e 's/target-zlib//'`
+  ;;
+esac
+
 # libiberty-linker-plugin is special: it doesn't have its own source directory,
 # so we have to add it after the preceding checks.
 if test x"$extra_linker_plugin_flags$extra_linker_plugin_configure_flags" != x
Re: [AArch64][PR65375] Fix RTX cost for vector SET
On 17/03/15 03:48, Kyrill Tkachov wrote:
> On 16/03/15 13:15, Kugan wrote:
>> On 16/03/15 23:32, Kugan wrote:
>>>> lower-subreg.c:compute_costs() only cares about the cost of a
>>>> (set (reg) (const_int)) move but I think the intention, at least for
>>>> now, is to return extra_cost->vect.alu for all the vector operations.
>>> Almost, what we want at the moment is COSTS_N_INSNS (1) +
>>> extra_cost->vect.alu
>> Thanks Kyrill for the review.  Regression tested on aarch64-linux-gnu
>> with no new regression.  Is this OK for trunk?
>
> Are you sure it's a (set (reg) (const_int)) that's being costed here?
> I thought for moves into vector registers it would be a (set (reg)
> (const_vector)) which we don't handle in our rtx costs currently.  I
> think the correct approach would be to extend the aarch64_rtx_costs
> switch statement to handle the CONST_VECTOR case.  I believe you can
> use aarch64_simd_valid_immediate to check whether x is a valid
> immediate for a simd instruction and give it a cost of
> extra_cost->vect.alu.  The logic should be similar to the CONST_INT
> case.

Sorry about the (set (reg) (const_int)) above.  But the actual RTL that
is being split at 220r.subreg2 is

(insn 11 10 12 2 (set (subreg:V4SF (reg/v:OI 77 [ __o ]) 0)
        (subreg:V4SF (reg/v:OI 73 [ __o ]) 0))
     /home/kugan/work/builds/gcc-fsf-gcc/tools/lib/gcc/aarch64-none-linux-gnu/5.0.0/include/arm_neon.h:22625 800 {*aarch64_simd_movv4sf}
     (nil))

And also, if we return an RTX cost above COSTS_N_INSNS (1), it will be
split and it doesn't recover from there.  Therefore we need something
like the below to prevent that happening.

Hi Kyrill,
How about the attached patch?  It is similar to what is currently done
for scalar register moves.

Hi Kugan,
yeah, I think this is a better approach, though I can't approve.

Here is the patch with a minor comment update.  Regression tested on
aarch64-linux-gnu with no new regression.  Is this OK for trunk?

Thanks,
Kugan

gcc/ChangeLog:

2015-03-17  Kugan Vivekanandarajah  <kug...@linaro.org>
	    Jim Wilson  <jim.wil...@linaro.org>

	PR target/65375
	* config/aarch64/aarch64.c (aarch64_rtx_costs): Handle vector
	register copies.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index cba3c1a..d6ad0af 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -5544,10 +5544,17 @@ aarch64_rtx_costs (rtx x, int code, int outer ATTRIBUTE_UNUSED,
       /* Fall through.  */
     case REG:
+      if (VECTOR_MODE_P (GET_MODE (op0)) && REG_P (op1))
+	{
+	  /* The cost is 1 per vector-register copied.  */
+	  int n_minus_1 = (GET_MODE_SIZE (GET_MODE (op0)) - 1)
+			  / GET_MODE_SIZE (V4SImode);
+	  *cost = COSTS_N_INSNS (n_minus_1 + 1);
+	}
       /* const0_rtx is in general free, but we will use an
	 instruction to set a register to 0.  */
-      if (REG_P (op1) || op1 == const0_rtx)
-	{
+      else if (REG_P (op1) || op1 == const0_rtx)
+	{
	  /* The cost is 1 per register copied.  */
	  int n_minus_1 = (GET_MODE_SIZE (GET_MODE (op0)) - 1)
			  / UNITS_PER_WORD;
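[Editorial note] The arithmetic in the patch — `(size - 1) / regsize + 1` — is the usual integer ceiling division, so the cost scales with the number of vector registers the move touches. A standalone sketch (illustrative names, not GCC's):

```cpp
// Illustrative reimplementation of the patch's cost arithmetic: the number
// of register copies needed to move `size` bytes through registers of
// `regsize` bytes each is ceil(size / regsize), computed in integer
// arithmetic as (size - 1) / regsize + 1 (valid for size >= 1).
constexpr int copies_needed(int size, int regsize) {
  return (size - 1) / regsize + 1;
}

// A V4SF value fits in one 16-byte vector register; an OImode-sized value
// (32 bytes) takes two, matching the (subreg:V4SF (reg:OI ...)) case above.
static_assert(copies_needed(16, 16) == 1, "one vector register");
static_assert(copies_needed(32, 16) == 2, "an OImode-sized value takes two");
```

The patch then feeds this count to COSTS_N_INSNS, so a multi-register copy is costed as that many instructions.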
Re: [PATCH ARM]Fix memset-inline-* failures on cortex-a9 tune by checking tune information.
On Fri, Mar 13, 2015 at 7:56 PM, Ramana Radhakrishnan
<ramana@googlemail.com> wrote:
> On Fri, Mar 6, 2015 at 7:46 AM, Bin Cheng <bin.ch...@arm.com> wrote:
>> Hi,
>> This patch is the second part fixing memset-inline-{4,5,6,8,9}.c
>> failures on cortex-a9.  It adds a function checking CPU tuning
>> information in dejagnu, and uses that function to skip the related
>> testcases when we are compiling for the cortex-a9 tune.
> Skips the related testcase for all tests where the tuning information
> doesn't use neon.  I think this technique can be used to clean up a
> number of multilib related failures in the gcc.target/arm testsuite.

Actually these are all related cases.  Cases {1,2,3} are intended for
non-neon target inlining tests; case 7 is an executable test which
should be run whatever the target supports.

>> Build and test on arm-none-eabi.  Is it OK?
>>
>> gcc/testsuite/ChangeLog
>> 2015-03-06  Bin Cheng  <bin.ch...@arm.com>
>>
>>	* lib/target-supports.exp (arm_tune_string_ops_prefer_neon): New.
>>	* gcc.target/arm/memset-inline-4.c: Skip for
>>	arm_tune_string_ops_prefer_neon.
>>	* gcc.target/arm/memset-inline-5.c: Ditto.
>>	* gcc.target/arm/memset-inline-6.c: Ditto.
>>	* gcc.target/arm/memset-inline-8.c: Ditto.
>>	* gcc.target/arm/memset-inline-9.c: Ditto.
> Ok, please document the new dejagnu helper routine in sourcebuild.texi

Done.  Patch updated; I will push both patches in if you are ok with it.

Thanks,
bin

gcc/ChangeLog
2015-03-17  Bin Cheng  <bin.ch...@arm.com>

	* doc/sourcebuild.texi (arm_tune_string_ops_prefer_neon): New.

gcc/testsuite/ChangeLog
2015-03-17  Bin Cheng  <bin.ch...@arm.com>

	* lib/target-supports.exp (arm_tune_string_ops_prefer_neon): New.
	* gcc.target/arm/memset-inline-4.c: Skip for
	arm_tune_string_ops_prefer_neon.
	* gcc.target/arm/memset-inline-5.c: Ditto.
	* gcc.target/arm/memset-inline-6.c: Ditto.
	* gcc.target/arm/memset-inline-8.c: Ditto.
	* gcc.target/arm/memset-inline-9.c: Ditto.
Index: gcc/testsuite/gcc.target/arm/memset-inline-4.c
===================================================================
--- gcc/testsuite/gcc.target/arm/memset-inline-4.c	(revision 221097)
+++ gcc/testsuite/gcc.target/arm/memset-inline-4.c	(working copy)
@@ -1,6 +1,5 @@
 /* { dg-do run } */
-/* { dg-skip-if "Don't inline memset using neon instructions on cortex-a9" { *-*-* } { "-mcpu=cortex-a9" } { "" } } */
-/* { dg-skip-if "Don't inline memset using neon instructions on cortex-a9" { *-*-* } { "-mtune=cortex-a9" } { "" } } */
+/* { dg-skip-if "Don't inline memset using neon instructions" { ! arm_tune_string_ops_prefer_neon } } */
 /* { dg-options "-save-temps -O2 -fno-inline" } */
 /* { dg-add-options arm_neon } */
Index: gcc/testsuite/gcc.target/arm/memset-inline-5.c
===================================================================
--- gcc/testsuite/gcc.target/arm/memset-inline-5.c	(revision 221097)
+++ gcc/testsuite/gcc.target/arm/memset-inline-5.c	(working copy)
@@ -1,6 +1,5 @@
 /* { dg-do run } */
-/* { dg-skip-if "Don't inline memset using neon instructions on cortex-a9" { *-*-* } { "-mcpu=cortex-a9" } { "" } } */
-/* { dg-skip-if "Don't inline memset using neon instructions on cortex-a9" { *-*-* } { "-mtune=cortex-a9" } { "" } } */
+/* { dg-skip-if "Don't inline memset using neon instructions" { ! arm_tune_string_ops_prefer_neon } } */
 /* { dg-options "-save-temps -O2 -fno-inline" } */
 /* { dg-add-options arm_neon } */
Index: gcc/testsuite/gcc.target/arm/memset-inline-6.c
===================================================================
--- gcc/testsuite/gcc.target/arm/memset-inline-6.c	(revision 221097)
+++ gcc/testsuite/gcc.target/arm/memset-inline-6.c	(working copy)
@@ -1,6 +1,5 @@
 /* { dg-do run } */
-/* { dg-skip-if "Don't inline memset using neon instructions on cortex-a9" { *-*-* } { "-mcpu=cortex-a9" } { "" } } */
-/* { dg-skip-if "Don't inline memset using neon instructions on cortex-a9" { *-*-* } { "-mtune=cortex-a9" } { "" } } */
+/* { dg-skip-if "Don't inline memset using neon instructions" { ! arm_tune_string_ops_prefer_neon } } */
 /* { dg-options "-save-temps -O2 -fno-inline" } */
 /* { dg-add-options arm_neon } */
Index: gcc/testsuite/gcc.target/arm/memset-inline-8.c
===================================================================
--- gcc/testsuite/gcc.target/arm/memset-inline-8.c	(revision 221097)
+++ gcc/testsuite/gcc.target/arm/memset-inline-8.c	(working copy)
@@ -1,6 +1,5 @@
 /* { dg-do run } */
-/* { dg-skip-if "Don't inline memset using neon instructions on cortex-a9" { *-*-* } { "-mcpu=cortex-a9" } { "" } } */
-/* { dg-skip-if "Don't inline memset using neon instructions on cortex-a9" { *-*-* } { "-mtune=cortex-a9" } { "" } } */
+/* { dg-skip-if "Don't inline memset using neon instructions" { ! arm_tune_string_ops_prefer_neon } } */
 /* { dg-options "-save-temps -O2 -fno-inline" } */
 /* { dg-add-options arm_neon } */
Index: gcc/testsuite/gcc.target/arm/memset-inline-9.c
C++ PATCH for c++/65327 aka DR 1688 (volatile constexpr variables)
The PR points out DR 1688: volatile + constexpr is a valid combination.
This patch makes the compiler accept such code.

There's a related problem:

  constexpr volatile int a = 42;
  constexpr int b = a;

the initialization of b should be rejected, but it is not.  My patch
does not address this issue though.

Bootstrapped/regtested on x86_64-linux, ok for trunk?  If so, should
this be backported to 4.8/4.9?

2015-03-16  Marek Polacek  <pola...@redhat.com>

	DR 1688
	PR c++/65327
	* decl.c (grokdeclarator): Allow volatile and constexpr together.

	* g++.dg/cpp0x/constexpr-object1.C: Change dg-error to dg-bogus.
	* g++.dg/cpp0x/pr65327.C: New test.

diff --git gcc/cp/decl.c gcc/cp/decl.c
index e35e484..cb0f11f 100644
--- gcc/cp/decl.c
+++ gcc/cp/decl.c
@@ -10134,8 +10134,9 @@ grokdeclarator (const cp_declarator *declarator,
      the object as `const'.  */
   if (constexpr_p && innermost_code != cdk_function)
     {
-      if (type_quals & TYPE_QUAL_VOLATILE)
-	error ("both %<volatile%> and %<constexpr%> cannot be used here");
+      /* DR1688 says that a `constexpr' specifier in combination with
+	 `volatile' is valid.  */
+
       if (TREE_CODE (type) != REFERENCE_TYPE)
	 {
	   type_quals |= TYPE_QUAL_CONST;
diff --git gcc/testsuite/g++.dg/cpp0x/constexpr-object1.C gcc/testsuite/g++.dg/cpp0x/constexpr-object1.C
index 41afbe9..287c3ac 100644
--- gcc/testsuite/g++.dg/cpp0x/constexpr-object1.C
+++ gcc/testsuite/g++.dg/cpp0x/constexpr-object1.C
@@ -19,7 +19,7 @@ constexpr A1 a2;	// { dg-error "uninitialized const" }
 
 const constexpr A1 a3 = A1();
 
-volatile constexpr A1 a4 = A1();	// { dg-error "both .volatile. and .constexpr. cannot" }
+volatile constexpr A1 a4 = A1();	// { dg-bogus "both .volatile. and .constexpr. cannot" }
 
 // error: on type declaration
 constexpr struct pixel
diff --git gcc/testsuite/g++.dg/cpp0x/pr65327.C gcc/testsuite/g++.dg/cpp0x/pr65327.C
index e69de29..c6cefab 100644
--- gcc/testsuite/g++.dg/cpp0x/pr65327.C
+++ gcc/testsuite/g++.dg/cpp0x/pr65327.C
@@ -0,0 +1,18 @@
+// PR c++/65327
+// { dg-do compile { target c++11 } }
+// DR1688 says that constexpr can be used together with volatile.
+
+constexpr volatile int i = 10;
+
+void
+foo ()
+{
+  constexpr volatile int j = 5;
+  static constexpr volatile int k = 5;
+}
+
+constexpr volatile int
+bar ()
+{
+  return i;
+}

	Marek
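[Editorial note] A minimal sketch of the semantics the patch enables, assuming a compiler with the DR 1688 resolution and C++11 or later: the object is constant-initialized, but each read of it is a volatile access performed at run time.

```cpp
// Valid per DR 1688: constexpr (constant initialization) combined with
// volatile (every access is a volatile access).
constexpr volatile int answer = 42;

int read_answer() {
  // A runtime volatile load.  By contrast, "constexpr int b = answer;"
  // would be the ill-formed case the message above says should be
  // rejected, since a volatile lvalue-to-rvalue conversion cannot appear
  // in a constant expression.
  return answer;
}
```

So the fix only removes the blanket error on the declaration; uses of such a variable in constant expressions remain a separate (and, per the message, still-unfixed) diagnostic question.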
[patch c++]: Fix for PR/65390
Hi,

this patch avoids the attempt to create a user-aligned type for variants
when the original's and the main variant's type alignment differ and the
original type isn't user-aligned.

Not sure if this is the preferred variant; we could create for such
cases an aligned type without setting user-align.  But as this should
just happen for invalid types, I am not sure if we actually need to
handle it later at all.  So I suggest the following fix for it:

ChangeLog

	PR c++/65390
	* tree.c (strip_typedefs): Don't attempt to build user-aligned
	type if original doesn't set it and main-variant differs to it.

Jonathan is just about to reduce the testcase, so the final patch should
include a reduced testcase for it.

Regression-tested for x86_64-unknown-linux-gnu.  Ok for apply?

Regards,
Kai

Index: tree.c
===================================================================
--- tree.c	(Revision 221277)
+++ tree.c	(Arbeitskopie)
@@ -1356,7 +1356,7 @@ strip_typedefs (tree t)
   if (!result)
     result = TYPE_MAIN_VARIANT (t);
   if (TYPE_USER_ALIGN (t) != TYPE_USER_ALIGN (result)
-      || TYPE_ALIGN (t) != TYPE_ALIGN (result))
+      || (TYPE_USER_ALIGN (t) && TYPE_ALIGN (t) != TYPE_ALIGN (result)))
     {
       gcc_assert (TYPE_USER_ALIGN (t));
       if (TYPE_ALIGN (t) == TYPE_ALIGN (result))
Re: [AArch64][PR65375] Fix RTX cost for vector SET
On 16/03/15 23:32, Kugan wrote:
>>> lower-subreg.c:compute_costs() only cares about the cost of a
>>> (set (reg) (const_int)) move but I think the intention, at least for
>>> now, is to return extra_cost->vect.alu for all the vector operations.
>> Almost, what we want at the moment is COSTS_N_INSNS (1) +
>> extra_cost->vect.alu
> Thanks Kyrill for the review.  Regression tested on aarch64-linux-gnu
> with no new regression.  Is this OK for trunk?

> Are you sure it's a (set (reg) (const_int)) that's being costed here?
> I thought for moves into vector registers it would be a (set (reg)
> (const_vector)) which we don't handle in our rtx costs currently.  I
> think the correct approach would be to extend the aarch64_rtx_costs
> switch statement to handle the CONST_VECTOR case.  I believe you can
> use aarch64_simd_valid_immediate to check whether x is a valid
> immediate for a simd instruction and give it a cost of
> extra_cost->vect.alu.  The logic should be similar to the CONST_INT
> case.

Sorry about the (set (reg) (const_int)) above.  But the actual RTL that
is being split at 220r.subreg2 is

(insn 11 10 12 2 (set (subreg:V4SF (reg/v:OI 77 [ __o ]) 0)
        (subreg:V4SF (reg/v:OI 73 [ __o ]) 0))
     /home/kugan/work/builds/gcc-fsf-gcc/tools/lib/gcc/aarch64-none-linux-gnu/5.0.0/include/arm_neon.h:22625 800 {*aarch64_simd_movv4sf}
     (nil))

And also, if we return an RTX cost above COSTS_N_INSNS (1), it will be
split and it doesn't recover from there.  Therefore we need something
like the below to prevent that happening.

Hi Kyrill,
How about the attached patch?  It is similar to what is currently done
for scalar register moves.

Thanks,
Kugan

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index cba3c1a..b9db3ac 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -5544,10 +5544,17 @@ aarch64_rtx_costs (rtx x, int code, int outer ATTRIBUTE_UNUSED,
       /* Fall through.  */
     case REG:
+      if (VECTOR_MODE_P (GET_MODE (op0)) && REG_P (op1))
+	{
+	  /* The cost is 1 per register copied.  */
+	  int n_minus_1 = (GET_MODE_SIZE (GET_MODE (op0)) - 1)
+			  / GET_MODE_SIZE (V4SImode);
+	  *cost = COSTS_N_INSNS (n_minus_1 + 1);
+	}
       /* const0_rtx is in general free, but we will use an
	 instruction to set a register to 0.  */
-      if (REG_P (op1) || op1 == const0_rtx)
-	{
+      else if (REG_P (op1) || op1 == const0_rtx)
+	{
	  /* The cost is 1 per register copied.  */
	  int n_minus_1 = (GET_MODE_SIZE (GET_MODE (op0)) - 1)
			  / UNITS_PER_WORD;
Re: RFC: Avoid calling convert_to_mode with invalid rtl.
Hi Eric,

> Note that the very same code is in expand_assignment, so they probably
> should be kept in sync.

Oops - I had missed that.

>> The patch adds a second call to expand_expr(), giving the address mode
>> as the suggested mode, and using a normal expansion, rather than
>> EXPAND_SUM.
> This might work, but even if it does not, the rtl in offset_rtx will be
> valid in a non-address context so that whatever convert_to_mode does
> will still remain valid.  I don't think that we want to expand twice
> the same expression.  What about calling force_operand on offset_rtx
> right before convert_to_mode?

That works too, and is much better than my solution.  What do you think
of the attached patch?  Is it suitable for submission once the sources
are out of stage 4?  (No regressions with an x86_64-pc-linux-gnu
toolchain and several fixes for an rl78-elf toolchain.)

Cheers
Nick

expr.c.patch.2
Description: Unix manual page
[PATCH] Remove unused function
This function is unused since the matrix-reorg.c removal back in 2012.
I think there's no point in keeping it around.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2015-03-16  Marek Polacek  <pola...@redhat.com>

	* cgraph.h (add_new_static_var): Remove declaration.
	* varpool.c (add_new_static_var): Remove function.

diff --git gcc/cgraph.h gcc/cgraph.h
index 99af026..52b15c5 100644
--- gcc/cgraph.h
+++ gcc/cgraph.h
@@ -2717,9 +2717,6 @@ cgraph_node::has_gimple_body_p (void)
    for ((node) = symtab->first_function_with_gimple_body (); (node); \
	(node) = symtab->next_function_with_gimple_body (node))
 
-/* Create a new static variable of type TYPE.  */
-tree add_new_static_var (tree type);
-
 /* Uniquize all constants that appear in memory.
    Each constant in memory thus far output is recorded
    in `const_desc_table'.  */
diff --git gcc/varpool.c gcc/varpool.c
index ce64279..f1439ca 100644
--- gcc/varpool.c
+++ gcc/varpool.c
@@ -760,27 +760,6 @@ symbol_table::output_variables (void)
   return changed;
 }
 
-/* Create a new global variable of type TYPE.  */
-tree
-add_new_static_var (tree type)
-{
-  tree new_decl;
-  varpool_node *new_node;
-
-  new_decl = create_tmp_var_raw (type);
-  DECL_NAME (new_decl) = create_tmp_var_name (NULL);
-  TREE_READONLY (new_decl) = 0;
-  TREE_STATIC (new_decl) = 1;
-  TREE_USED (new_decl) = 1;
-  DECL_CONTEXT (new_decl) = NULL_TREE;
-  DECL_ABSTRACT_P (new_decl) = false;
-  lang_hooks.dup_lang_specific_decl (new_decl);
-  new_node = varpool_node::get_create (new_decl);
-  varpool_node::finalize_decl (new_decl);
-
-  return new_node->decl;
-}
-
 /* Attempt to mark ALIAS as an alias to DECL.
    Return TRUE if successful.
    Extra name aliases are output whenever DECL is output.  */

	Marek
[Patch, Fortran, 2/2] Proposal on renaming gfc_vtable_*_get() to gfc_class_vtab_*_get()
Hi all,

this is the second part of the patch, substituting gfc_class_vtab_*_get()
for gfc_vtable_*_get() where needed.

Bootstraps and regtests ok on x86_64-linux-gnu/F20.

Ok for trunk?

Regards,
	Andre

-- 
Andre Vehreschild * Email: vehre ad gmx dot de

vtab_access_rework2_v1.clog
Description: Binary data

diff --git a/gcc/fortran/trans-array.c b/gcc/fortran/trans-array.c
index 22fc7c7..54f8f4a 100644
--- a/gcc/fortran/trans-array.c
+++ b/gcc/fortran/trans-array.c
@@ -1196,7 +1196,7 @@ gfc_trans_create_temp_array (stmtblock_t * pre, stmtblock_t * post, gfc_ss * ss,
	elemsize = fold_convert (gfc_array_index_type,
				 TYPE_SIZE_UNIT (gfc_get_element_type (type)));
       else
-	elemsize = gfc_vtable_size_get (class_expr);
+	elemsize = gfc_class_vtab_size_get (class_expr);
 
       size = fold_build2_loc (input_location, MULT_EXPR,
			      gfc_array_index_type, size, elemsize);
@@ -3076,7 +3076,7 @@ build_class_array_ref (gfc_se *se, tree base, tree index)
   if (!GFC_CLASS_TYPE_P (TREE_TYPE (decl)))
     return false;
 
-  size = gfc_vtable_size_get (decl);
+  size = gfc_class_vtab_size_get (decl);
 
   /* Build the address of the element.  */
   type = TREE_TYPE (TREE_TYPE (base));
@@ -7986,7 +7986,8 @@ structure_alloc_comps (gfc_symbol * der_type, tree decl,
	  dst_data = gfc_class_data_get (dcmp);
	  src_data = gfc_class_data_get (comp);
-	  size = fold_convert (size_type_node, gfc_vtable_size_get (comp));
+	  size = fold_convert (size_type_node,
+			       gfc_class_vtab_size_get (comp));
 
	  if (CLASS_DATA (c)->attr.dimension)
	    {
diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c
index 65a6ff1..24c20fc 100644
--- a/gcc/fortran/trans-expr.c
+++ b/gcc/fortran/trans-expr.c
@@ -907,7 +907,7 @@ tree
 gfc_get_class_array_ref (tree index, tree class_decl)
 {
   tree data = gfc_class_data_get (class_decl);
-  tree size = gfc_vtable_size_get (class_decl);
+  tree size = gfc_class_vtab_size_get (class_decl);
   tree offset = fold_build2_loc (input_location, MULT_EXPR,
				 gfc_array_index_type,
				 index, size);
@@ -943,16 +943,16 @@ gfc_copy_class_to_class (tree from, tree to, tree nelems)
   args = NULL;
 
   if (from != NULL_TREE)
-    fcn = gfc_vtable_copy_get (from);
+    fcn = gfc_class_vtab_copy_get (from);
   else
-    fcn = gfc_vtable_copy_get (to);
+    fcn = gfc_class_vtab_copy_get (to);
 
   fcn_type = TREE_TYPE (TREE_TYPE (fcn));
 
   if (from != NULL_TREE)
     from_data = gfc_class_data_get (from);
   else
-    from_data = gfc_vtable_def_init_get (to);
+    from_data = gfc_class_vtab_def_init_get (to);
 
   to_data = gfc_class_data_get (to);
@@ -5744,7 +5744,7 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
				CLASS_DATA (expr->value.function.esym->result)->attr);
	    }
 
-	  final_fndecl = gfc_vtable_final_get (se->expr);
+	  final_fndecl = gfc_class_vtab_final_get (se->expr);
	  is_final = fold_build2_loc (input_location, NE_EXPR,
				      boolean_type_node,
				      final_fndecl,
@@ -5755,7 +5755,7 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
	  tmp = build_call_expr_loc (input_location,
				     final_fndecl, 3,
				     gfc_build_addr_expr (NULL, tmp),
-				     gfc_vtable_size_get (se->expr),
+				     gfc_class_vtab_size_get (se->expr),
				     boolean_false_node);
	  tmp = fold_build3_loc (input_location, COND_EXPR,
				 void_type_node, is_final, tmp,
diff --git a/gcc/fortran/trans-intrinsic.c b/gcc/fortran/trans-intrinsic.c
index c899a73..1e43f1f 100644
--- a/gcc/fortran/trans-intrinsic.c
+++ b/gcc/fortran/trans-intrinsic.c
@@ -5866,14 +5866,14 @@ gfc_conv_intrinsic_sizeof (gfc_se *se, gfc_expr *expr)
       /* For deferred length arrays, conv_expr_descriptor returns an
	 indirect_ref to the component.  */
       if (arg->rank < 0)
-	byte_size = gfc_vtable_size_get (TREE_OPERAND (argse.expr, 0));
+	byte_size = gfc_class_vtab_size_get (TREE_OPERAND (argse.expr, 0));
       else if (arg->rank > 0)
	 /* The scalarizer added an additional temp.  To get the class' vptr
	    one has to look at the original backend_decl.  */
-	byte_size = gfc_vtable_size_get (
-	     GFC_DECL_SAVED_DESCRIPTOR (arg->symtree->n.sym->backend_decl));
+	byte_size = gfc_class_vtab_size_get (
+	     GFC_DECL_SAVED_DESCRIPTOR (arg->symtree->n.sym->backend_decl));
       else
-	byte_size = gfc_vtable_size_get (argse.expr);
+	byte_size = gfc_class_vtab_size_get (argse.expr);
     }
   else
     {
@@ -6003,10 +6003,10 @@ gfc_conv_intrinsic_storage_size (gfc_se *se, gfc_expr *expr)
       if (arg->ts.type == BT_CLASS)
	 {
	   if (arg->rank > 0)
-	    tmp = gfc_vtable_size_get (
+	    tmp = gfc_class_vtab_size_get (
		 GFC_DECL_SAVED_DESCRIPTOR (arg->symtree->n.sym->backend_decl));
	   else
-	    tmp = gfc_vtable_size_get (TREE_OPERAND (argse.expr, 0));
+	    tmp = gfc_class_vtab_size_get (TREE_OPERAND (argse.expr, 0));
	   tmp = fold_convert (result_type, tmp);
	   goto done;
	 }
@@ -6151,7 +6151,7 @@ gfc_conv_intrinsic_transfer (gfc_se * se, gfc_expr * expr)
				  argse.string_length);
       break;
     case
[Ada] Fix another ICE with -gnatct
An obscure case with a limited-with clause and 'Class in a subprogram
profile.

Tested on x86_64-suse-linux, applied on the mainline.

2015-03-16  Eric Botcazou  <ebotca...@adacore.com>

	* gcc-interface/decl.c (is_from_limited_with_of_main): New predicate.
	(gnat_to_gnu_entity) <E_Subprogram_Type>: Invoke it on return and
	parameter types to detect circularities in ASIS mode.
	* gcc-interface/trans.c (Attribute_to_gnu): Mention AI05-0151.

-- 
Eric Botcazou

Index: gcc-interface/decl.c
===================================================================
--- gcc-interface/decl.c	(revision 221446)
+++ gcc-interface/decl.c	(working copy)
@@ -182,6 +182,7 @@ static tree gnat_to_gnu_component_type (
 static tree gnat_to_gnu_param (Entity_Id, Mechanism_Type, Entity_Id, bool,
			       bool *);
 static tree gnat_to_gnu_field (Entity_Id, tree, int, bool, bool);
+static bool is_from_limited_with_of_main (Entity_Id);
 static tree change_qualified_type (tree, int);
 static bool same_discriminant_p (Entity_Id, Entity_Id);
 static bool array_type_has_nonaliased_component (tree, Entity_Id);
@@ -4252,11 +4253,7 @@ gnat_to_gnu_entity (Entity_Id gnat_entit
	   context may now appear in parameter and result profiles.  If we
	   are only annotating types, break circularities here.  */
	if (type_annotate_only
-	    && IN (Ekind (gnat_return_type), Incomplete_Kind)
-	    && From_Limited_With (gnat_return_type)
-	    && In_Extended_Main_Code_Unit
-	       (Non_Limited_View (gnat_return_type))
-	    && !present_gnu_tree (Non_Limited_View (gnat_return_type)))
+	    && is_from_limited_with_of_main (gnat_return_type))
	  gnu_return_type = ptr_void_type_node;
	else
	  gnu_return_type = gnat_to_gnu_type (gnat_return_type);
@@ -4365,11 +4362,7 @@ gnat_to_gnu_entity (Entity_Id gnat_entit
	       context may now appear in parameter and result profiles.  If
	       we are only annotating types, break circularities here.  */
	    if (type_annotate_only
-		&& IN (Ekind (gnat_param_type), Incomplete_Kind)
-		&& From_Limited_With (Etype (gnat_param_type))
-		&& In_Extended_Main_Code_Unit
-		   (Non_Limited_View (gnat_param_type))
-		&& !present_gnu_tree (Non_Limited_View (gnat_param_type)))
+		&& is_from_limited_with_of_main (gnat_param_type))
	      {
		gnu_param_type = ptr_void_type_node;
		fake_param_type = true;
@@ -5810,6 +5803,30 @@ gnat_to_gnu_param (Entity_Id gnat_param,
   return gnu_param;
 }
 
+/* Return true if GNAT_ENTITY is an incomplete entity coming from a limited
+   with of the main unit and whose full view has not been elaborated yet.  */
+
+static bool
+is_from_limited_with_of_main (Entity_Id gnat_entity)
+{
+  /* Class-wide types are always transformed into their root type.  */
+  if (Ekind (gnat_entity) == E_Class_Wide_Type)
+    gnat_entity = Root_Type (gnat_entity);
+
+  if (IN (Ekind (gnat_entity), Incomplete_Kind)
+      && From_Limited_With (gnat_entity))
+    {
+      Entity_Id gnat_full_view = Non_Limited_View (gnat_entity);
+
+      if (present_gnu_tree (gnat_full_view))
+	return false;
+
+      return In_Extended_Main_Code_Unit (gnat_full_view);
+    }
+
+  return false;
+}
+
 /* Like build_qualified_type, but TYPE_QUALS is added to the existing
    qualifiers on TYPE.  */
 
Index: gcc-interface/trans.c
===================================================================
--- gcc-interface/trans.c	(revision 221407)
+++ gcc-interface/trans.c	(working copy)
@@ -1593,8 +1593,9 @@ Attribute_to_gnu (Node_Id gnat_node, tre
   bool prefix_unused = false;
 
   /* ??? If this is an access attribute for a public subprogram to be used in
-     a dispatch table, do not translate its type as it's useless there and the
-     parameter types might be incomplete types coming from a limited with.  */
+     a dispatch table, do not translate its type as it's useless in this case
+     and the parameter types might be incomplete types coming from a limited
+     context in Ada 2012 (AI05-0151).  */
   if (Ekind (Etype (gnat_node)) == E_Access_Subprogram_Type
       && Is_Dispatch_Table_Entity (Etype (gnat_node))
       && Nkind (gnat_prefix) == N_Identifier
Re: [PATCH, i386 testsuite]: Require nonpic target for some tests
On 12-03-15 11:51, Uros Bizjak wrote: On Thu, Mar 12, 2015 at 11:41 AM, Tom de Vries tom_devr...@mentor.com wrote: Attached patch adds nonpic target requirement for some (obvious) cases, where data access or PIC register setup confuses scan-asms. 2015-01-30 Uros Bizjak ubiz...@gmail.com * gcc.target/i386/fuse-caller-save-rec.c: Require nonpic target. * gcc.target/i386/fuse-caller-save-xmm.c: Ditto. * gcc.target/i386/fuse-caller-save.c: Ditto. Hi, I've reverted this part of the patch. The scans were failing because the -fipa-ra optimization was broken for -m32 -fpic (PR64895). Not really. Allocator is free to allocate %ebx (or other call-saved register) as PIC register. In this case, unwanted push/pop sequence will be emitted. Sure, but I don't see what that has to do with the test-cases. I don't see a pic register used in fuse-caller-save.c and fuse-caller-save-rec.c. I do see a pic register used in gcc.target/i386/fuse-caller-save-xmm.c, but there's no scan for push/pop sequence in there. You are right, the call is (obviously) to a local function. There is no need for PIC reg, so this clears my concerns. The patch for PR64895 has been reverted, so the scans started failing again. I've added xfails. Thanks, - Tom 2015-03-16 Tom de Vries t...@codesourcery.com * gcc.target/i386/fuse-caller-save-rec.c: Add PR64895 xfail on scans. * gcc.target/i386/fuse-caller-save-xmm.c: Same. * gcc.target/i386/fuse-caller-save.c: Same. diff --git a/gcc/testsuite/gcc.target/i386/fuse-caller-save-rec.c b/gcc/testsuite/gcc.target/i386/fuse-caller-save-rec.c index c660e01..7abcf91 100644 --- a/gcc/testsuite/gcc.target/i386/fuse-caller-save-rec.c +++ b/gcc/testsuite/gcc.target/i386/fuse-caller-save-rec.c @@ -18,12 +18,14 @@ foo (int y) return y + bar (y); } +/* For !nonpic ia32 xfails, see PR64895. */ + /* Check that no registers are saved/restored. 
*/ -/* { dg-final { scan-assembler-not push } } */ -/* { dg-final { scan-assembler-not pop } } */ +/* { dg-final { scan-assembler-not push { xfail { { ! nonpic } ia32 } } } } */ +/* { dg-final { scan-assembler-not pop { xfail { { ! nonpic } ia32 } } } } */ /* Check that addition uses dx. */ -/* { dg-final { scan-assembler-times addl\t%\[re\]?dx, %\[re\]?ax 1 } } */ +/* { dg-final { scan-assembler-times addl\t%\[re\]?dx, %\[re\]?ax 1 { xfail { { ! nonpic } ia32 } } } } */ /* Verify that bar is self-recursive. */ /* { dg-final { scan-assembler-times call\t_?bar 2 } } */ diff --git a/gcc/testsuite/gcc.target/i386/fuse-caller-save-xmm.c b/gcc/testsuite/gcc.target/i386/fuse-caller-save-xmm.c index 1d02844..c2d0544 100644 --- a/gcc/testsuite/gcc.target/i386/fuse-caller-save-xmm.c +++ b/gcc/testsuite/gcc.target/i386/fuse-caller-save-xmm.c @@ -15,11 +15,13 @@ foo (v2df y) return y + bar (y); } +/* For !nonpic ia32 xfails, see PR64895. */ + /* Check presence of all insns on xmm registers. These checks are expected to pass with both -fipa-ra and -fno-ipa-ra. */ /* { dg-final { scan-assembler-times addpd\t\\.?LC0.*, %xmm0 1 } } */ -/* { dg-final { scan-assembler-times addpd\t%xmm1, %xmm0 1 } } */ -/* { dg-final { scan-assembler-times movapd\t%xmm0, %xmm1 1 } } */ +/* { dg-final { scan-assembler-times addpd\t%xmm1, %xmm0 1 { xfail { { ! nonpic } ia32 } } } } */ +/* { dg-final { scan-assembler-times movapd\t%xmm0, %xmm1 1 { xfail { { ! nonpic } ia32 } } } } */ /* Check absence of save/restore of xmm1 register. */ /* { dg-final { scan-assembler-not movaps\t%xmm1, \\(%\[re\]?sp\\) } } */ diff --git a/gcc/testsuite/gcc.target/i386/fuse-caller-save.c b/gcc/testsuite/gcc.target/i386/fuse-caller-save.c index 7cfd22a..4b8e68d 100644 --- a/gcc/testsuite/gcc.target/i386/fuse-caller-save.c +++ b/gcc/testsuite/gcc.target/i386/fuse-caller-save.c @@ -16,9 +16,11 @@ foo (int y) return y + bar (y); } +/* For !nonpic ia32 xfails, see PR64895. */ + /* Check that no registers are saved/restored. 
*/ -/* { dg-final { scan-assembler-not push } } */ -/* { dg-final { scan-assembler-not pop } } */ +/* { dg-final { scan-assembler-not push { xfail { { ! nonpic } ia32 } } } } */ +/* { dg-final { scan-assembler-not pop { xfail { { ! nonpic } ia32 } } } } */ /* PR61605. If the first argument register and the return register differ, then bar leaves the first argument register intact. That means in foo that the @@ -29,4 +31,4 @@ foo (int y) /* { dg-final { scan-assembler-not movl { target { ! ia32 } } } } */ /* Check that addition uses di (in case of no copy) or dx (in case of copy). */ -/* { dg-final { scan-assembler-times addl\t%\[re\]?d\[ix\], %\[re\]?ax 1 } } */ +/* { dg-final { scan-assembler-times addl\t%\[re\]?d\[ix\], %\[re\]?ax 1 { xfail { { ! nonpic } ia32 } } } } */ -- 1.9.1
RE: [PATCH, stage1] Move insns without introducing new temporaries in loop2_invariant
From: Steven Bosscher [mailto:stevenb@gmail.com] Sent: Monday, March 09, 2015 7:48 PM To: Thomas Preud'homme Cc: GCC Patches; Eric Botcazou Subject: Re: [PATCH, stage1] Move insns without introducing new temporaries in loop2_invariant New patch below. It looks like this would run for all candidate loop invariants, right? If so, you're creating run time of O(n_invariants*n_bbs_in_loop), a potential compile time hog for large loops. But why compute this at all? Perhaps I'm missing something, but you already have inv-always_executed available, no? Indeed. I didn't realize the information was already there. + basic_block use_bb; + + ref = DF_REF_INSN (use); + use_bb = BLOCK_FOR_INSN (ref); You can use DF_REF_BB. Since I need use_insn here I kept BLOCK_FOR_INSN but I used DF_REF_BB for the def below. So here are the new ChangeLog entries: *** gcc/ChangeLog *** 2015-03-11 Thomas Preud'homme thomas.preudho...@arm.com * loop-invariant.c (can_move_invariant_reg): New. (move_invariant_reg): Call above new function to decide whether instruction can just be moved, skipping creation of temporary register. *** gcc/testsuite/ChangeLog *** 2015-03-12 Thomas Preud'homme thomas.preudho...@arm.com * gcc.dg/loop-8.c: New test. * gcc.dg/loop-9.c: New test. diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c index f79b497..8217d62 100644 --- a/gcc/loop-invariant.c +++ b/gcc/loop-invariant.c @@ -1512,6 +1512,79 @@ replace_uses (struct invariant *inv, rtx reg, bool in_group) return 1; } And the new patch: +/* Whether invariant INV setting REG can be moved out of LOOP, at the end of + the block preceding its header. */ + +static bool +can_move_invariant_reg (struct loop *loop, struct invariant *inv, rtx reg) +{ + df_ref def, use; + unsigned int dest_regno, defs_in_loop_count = 0; + rtx_insn *insn = inv-insn; + basic_block bb = BLOCK_FOR_INSN (inv-insn); + + /* We ignore hard register and memory access for cost and complexity reasons. 
+ Hard registers are few at this stage and expensive to consider as they + require building a separate data flow. Memory access would require using + df_simulate_* and can_move_insns_across functions and is more complex. */ + if (!REG_P (reg) || HARD_REGISTER_P (reg)) +return false; + + /* Check whether the set is always executed. We could omit this condition if + we know that the register is unused outside of the loop, but it does not + seem worth finding out. */ + if (!inv->always_executed) +return false; + + /* Check that all uses reached by the def in insn would still be reached + by it. */ + dest_regno = REGNO (reg); + for (use = DF_REG_USE_CHAIN (dest_regno); use; use = DF_REF_NEXT_REG (use)) +{ + rtx_insn *use_insn; + basic_block use_bb; + + use_insn = DF_REF_INSN (use); + use_bb = BLOCK_FOR_INSN (use_insn); + + /* Ignore instruction considered for moving. */ + if (use_insn == insn) + continue; + + /* Don't consider uses outside loop. */ + if (!flow_bb_inside_loop_p (loop, use_bb)) + continue; + + /* Don't move if a use is not dominated by def in insn. */ + if (use_bb == bb && DF_INSN_LUID (insn) >= DF_INSN_LUID (use_insn)) + return false; + if (!dominated_by_p (CDI_DOMINATORS, use_bb, bb)) + return false; +} + + /* Check for other defs. Any other def in the loop might reach a use + currently reached by the def in insn. */ + for (def = DF_REG_DEF_CHAIN (dest_regno); def; def = DF_REF_NEXT_REG (def)) +{ + basic_block def_bb = DF_REF_BB (def); + + /* Defs in exit block cannot reach a use they weren't already reaching. */ + if (single_succ_p (def_bb)) + { + basic_block def_bb_succ; + + def_bb_succ = single_succ (def_bb); + if (!flow_bb_inside_loop_p (loop, def_bb_succ)) + continue; + } + + if (++defs_in_loop_count > 1) + return false; +} + + return true; +} + /* Move invariant INVNO out of the LOOP. Returns true if this succeeds, false otherwise. */ @@ -1545,11 +1618,8 @@ move_invariant_reg (struct loop *loop, unsigned invno) } } - /* Move the set out of the loop.
If the set is always executed (we could -omit this condition if we know that the register is unused outside of -the loop, but it does not seem worth finding out) and it has no uses -that would not be dominated by it, we may just move it (TODO). -Otherwise we need to create a temporary register. */ + /* If possible, just move the set out of the loop. Otherwise, we +need to create a temporary register. */ set = single_set (inv-insn); reg = dest = SET_DEST (set); if (GET_CODE (reg) == SUBREG) @@ -1557,19 +1627,25 @@ move_invariant_reg (struct loop *loop, unsigned invno) if (REG_P (reg))
[Patch, Fortran, 1/2] Proposal on renaming gfc_vtable_*_get() to gfc_class_vtab_*_get()
Hi all, I would like to propose a rename and split of the functions trans-expr.c::gfc_vtable_*_get(). Background: While fixing an issue, I encountered a case where I already had a handle to the vptr and wanted to retrieve the size component through that vptr handle. There was no function to do this easily: one can only get the size member of a class' vtab via gfc_vtable_size_get(). I therefore resorted to extracting from gfc_vtable_field_get(), which is called by all gfc_vtable_*_get() functions, the part that handles the vptr's component ref, and putting it into a new function vptr_field_get(). The old gfc_vtable_field_get() now calls this new vptr_field_get() function. During the split it occurred to me that the name gfc_vtable_*_get() does not clearly say what the function does. On first reading the name, one may think the function gets the named field from a vtable, but looking at the body one learns that it expects not a handle to a vptr but a handle to a class object. I therefore propose to rename the gfc_vtable_*_get() functions to gfc_class_vtab_*_get(). Side note: unfortunately we are not allowed to use ' in function names, or the names would have been even better as gfc_class'_vtab's_*_get(), e.g. for size gfc_class'_vtab's_size_get(). This is a two-part patch. This first patch does the split and introduces defines to keep the old gfc_vtable_*_get() names working. The second part substitutes the new name gfc_class_vtab_*_get() where needed. Bootstraps and regtests fine on x86_64-linux-gnu/F20. Ok for trunk?
Regards, Andre -- Andre Vehreschild * Email: vehre ad gmx dot de vtab_access_rework1_v1.clog Description: Binary data diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c index f61c79f..062c927 100644 --- a/gcc/fortran/trans-expr.c +++ b/gcc/fortran/trans-expr.c @@ -176,72 +176,85 @@ gfc_class_len_get (tree decl) if (POINTER_TYPE_P (TREE_TYPE (decl))) decl = build_fold_indirect_ref_loc (input_location, decl); len = gfc_advance_chain (TYPE_FIELDS (TREE_TYPE (decl)), - CLASS_LEN_FIELD); + CLASS_LEN_FIELD); return fold_build3_loc (input_location, COMPONENT_REF, TREE_TYPE (len), decl, len, NULL_TREE); } +/* Get the specified FIELD from the VPTR. */ + static tree -gfc_vtable_field_get (tree decl, int field) +vptr_field_get (tree vptr, int fieldno) { - tree size; - tree vptr; - vptr = gfc_class_vptr_get (decl); + tree field; vptr = build_fold_indirect_ref_loc (input_location, vptr); - size = gfc_advance_chain (TYPE_FIELDS (TREE_TYPE (vptr)), - field); - size = fold_build3_loc (input_location, COMPONENT_REF, - TREE_TYPE (size), vptr, size, - NULL_TREE); - /* Always return size as an array index type. */ - if (field == VTABLE_SIZE_FIELD) -size = fold_convert (gfc_array_index_type, size); - gcc_assert (size); - return size; + field = gfc_advance_chain (TYPE_FIELDS (TREE_TYPE (vptr)), + fieldno); + field = fold_build3_loc (input_location, COMPONENT_REF, + TREE_TYPE (field), vptr, field, + NULL_TREE); + gcc_assert (field); + return field; } -tree -gfc_vtable_hash_get (tree decl) -{ - return gfc_vtable_field_get (decl, VTABLE_HASH_FIELD); -} - +/* Get the field from the class' vptr. 
*/ -tree -gfc_vtable_size_get (tree decl) +static tree +class_vtab_field_get (tree decl, int fieldno) { - return gfc_vtable_field_get (decl, VTABLE_SIZE_FIELD); + tree vptr; + vptr = gfc_class_vptr_get (decl); + return vptr_field_get (vptr, fieldno); } -tree -gfc_vtable_extends_get (tree decl) -{ - return gfc_vtable_field_get (decl, VTABLE_EXTENDS_FIELD); +/* Define a macro for creating the class_vtab_* and vptr_* accessors in + unison. */ +#define VTAB_GET_FIELD_GEN(name, field) tree \ +gfc_class_vtab_## name ##_get (tree cl) \ +{ \ + return class_vtab_field_get (cl, field); \ +} \ + \ +tree \ +gfc_vptr_## name ##_get (tree vptr) \ +{ \ + return vptr_field_get (vptr, field); \ } +VTAB_GET_FIELD_GEN (hash, VTABLE_HASH_FIELD) +VTAB_GET_FIELD_GEN (extends, VTABLE_EXTENDS_FIELD) +VTAB_GET_FIELD_GEN (def_init, VTABLE_DEF_INIT_FIELD) +VTAB_GET_FIELD_GEN (copy, VTABLE_COPY_FIELD) +VTAB_GET_FIELD_GEN (final, VTABLE_FINAL_FIELD) -tree -gfc_vtable_def_init_get (tree decl) -{ - return gfc_vtable_field_get (decl, VTABLE_DEF_INIT_FIELD); -} +/* The size field is returned as an array index. Therefore treat it and only + it specially. */ tree -gfc_vtable_copy_get (tree decl) +gfc_class_vtab_size_get (tree cl) { - return gfc_vtable_field_get (decl, VTABLE_COPY_FIELD); + tree size; + size = class_vtab_field_get (cl, VTABLE_SIZE_FIELD); + /* Always return size as an array index type. */ + size = fold_convert (gfc_array_index_type, size); + gcc_assert (size); + return size; } - tree -gfc_vtable_final_get (tree decl) +gfc_vptr_size_get (tree vptr) { - return gfc_vtable_field_get
Re: PR c++/64626 - C++14 single quote should not always be a digit separator
OK. Jason
[PATCH][expmed][cleanup] Use std::swap instead of manual swapping
Hi all, This patch replaces manual swapping in synth_mult with std::swap. Not much else to say about this. This code could arguably be refactored a bit but that's another story. I believe these are considered obvious at this point. I'll apply it in 24 hours unless somebody objects Tested aarch64-none-elf and bootstrapped on x86_64-linux-gnu. Thanks, Kyrill 2015-03-16 Kyrylo Tkachov kyrylo.tkac...@arm.com * expmed.c (synth_mult): Use std::swap instead of manually swapping algorithms.commit 20372baa835e35e365117b4e3f6bdc5e28a78a98 Author: Kyrylo Tkachov kyrylo.tkac...@arm.com Date: Thu Mar 12 16:22:58 2015 + [expmed] Use std::swap instead of manual swapping diff --git a/gcc/expmed.c b/gcc/expmed.c index d2b2534..e84ab36 100644 --- a/gcc/expmed.c +++ b/gcc/expmed.c @@ -2551,9 +2551,8 @@ synth_mult (struct algorithm *alg_out, unsigned HOST_WIDE_INT t, alg_in-cost.latency += op_cost; if (CHEAPER_MULT_COST (alg_in-cost, best_cost)) { - struct algorithm *x; best_cost = alg_in-cost; - x = alg_in, alg_in = best_alg, best_alg = x; + std::swap (alg_in, best_alg); best_alg-log[best_alg-ops] = m; best_alg-op[best_alg-ops] = alg_shift; } @@ -2582,9 +2581,8 @@ synth_mult (struct algorithm *alg_out, unsigned HOST_WIDE_INT t, alg_in-cost.latency += op_cost; if (CHEAPER_MULT_COST (alg_in-cost, best_cost)) { - struct algorithm *x; best_cost = alg_in-cost; - x = alg_in, alg_in = best_alg, best_alg = x; + std::swap (alg_in, best_alg); best_alg-log[best_alg-ops] = m; best_alg-op[best_alg-ops] = alg_shift; } @@ -2624,9 +2622,8 @@ synth_mult (struct algorithm *alg_out, unsigned HOST_WIDE_INT t, alg_in-cost.latency += op_cost; if (CHEAPER_MULT_COST (alg_in-cost, best_cost)) { - struct algorithm *x; best_cost = alg_in-cost; - x = alg_in, alg_in = best_alg, best_alg = x; + std::swap (alg_in, best_alg); best_alg-log[best_alg-ops] = 0; best_alg-op[best_alg-ops] = alg_sub_t_m2; } @@ -2644,9 +2641,8 @@ synth_mult (struct algorithm *alg_out, unsigned HOST_WIDE_INT t, alg_in-cost.latency += 
op_cost; if (CHEAPER_MULT_COST (alg_in-cost, best_cost)) { - struct algorithm *x; best_cost = alg_in-cost; - x = alg_in, alg_in = best_alg, best_alg = x; + std::swap (alg_in, best_alg); best_alg-log[best_alg-ops] = 0; best_alg-op[best_alg-ops] = alg_add_t_m2; } @@ -2667,9 +2663,8 @@ synth_mult (struct algorithm *alg_out, unsigned HOST_WIDE_INT t, alg_in-cost.latency += op_cost; if (CHEAPER_MULT_COST (alg_in-cost, best_cost)) { - struct algorithm *x; best_cost = alg_in-cost; - x = alg_in, alg_in = best_alg, best_alg = x; + std::swap (alg_in, best_alg); best_alg-log[best_alg-ops] = m; best_alg-op[best_alg-ops] = alg_sub_t_m2; } @@ -2723,9 +2718,8 @@ synth_mult (struct algorithm *alg_out, unsigned HOST_WIDE_INT t, alg_in-cost.latency = op_cost; if (CHEAPER_MULT_COST (alg_in-cost, best_cost)) { - struct algorithm *x; best_cost = alg_in-cost; - x = alg_in, alg_in = best_alg, best_alg = x; + std::swap (alg_in, best_alg); best_alg-log[best_alg-ops] = m; best_alg-op[best_alg-ops] = alg_add_factor; } @@ -2762,9 +2756,8 @@ synth_mult (struct algorithm *alg_out, unsigned HOST_WIDE_INT t, alg_in-cost.latency = op_cost; if (CHEAPER_MULT_COST (alg_in-cost, best_cost)) { - struct algorithm *x; best_cost = alg_in-cost; - x = alg_in, alg_in = best_alg, best_alg = x; + std::swap (alg_in, best_alg); best_alg-log[best_alg-ops] = m; best_alg-op[best_alg-ops] = alg_sub_factor; } @@ -2793,9 +2786,8 @@ synth_mult (struct algorithm *alg_out, unsigned HOST_WIDE_INT t, alg_in-cost.latency += op_cost; if (CHEAPER_MULT_COST (alg_in-cost, best_cost)) { - struct algorithm *x; best_cost = alg_in-cost; - x = alg_in, alg_in = best_alg, best_alg = x; + std::swap (alg_in, best_alg); best_alg-log[best_alg-ops] = m; best_alg-op[best_alg-ops] = alg_add_t2_m; } @@ -2818,9 +2810,8 @@ synth_mult (struct algorithm *alg_out, unsigned HOST_WIDE_INT t, alg_in-cost.latency += op_cost; if (CHEAPER_MULT_COST (alg_in-cost, best_cost)) { - struct algorithm *x; best_cost = alg_in-cost; - x = alg_in, alg_in = 
best_alg, best_alg = x; + std::swap (alg_in, best_alg); best_alg-log[best_alg-ops] = m; best_alg-op[best_alg-ops] = alg_sub_t2_m; }
Re: [patch, fortran] Bug 64432 - [5 Regression] SYSTEM_CLOCK(COUNT_RATE=rate) wrong result for integer(4)::rate
On Sat, Mar 14, 2015 at 4:24 PM, Jerry DeLisle jvdeli...@charter.net wrote: Attachment on this one. On 03/14/2015 07:22 AM, Jerry DeLisle wrote: On 03/08/2015 04:58 PM, Steve Kargl wrote: On Mon, Mar 09, 2015 at 01:07:25AM +0200, Janne Blomqvist wrote: So I would prefer if we just hardcode the error values in the frontend (-HUGE, 0, 0), in case somebody tries to use the kind=1,2 versions, thus also removing the need for the new library functions, keeping the existing simpler ones instead. AFAICT this would be standards conforming. Any other opinions on this? Revised patch attached as requested. Regression tested on x86_64 linux. Typical results are shown below. I will provide a test case for the test-suite. $ ./a.out KIND=1: -127 0 0 KIND=1: -127 0 0 KIND=1: -127 . 0 --- KIND=2: -32767 0 0 KIND=2: -32767 . 0 --- KIND=4: 57496123 1000 2147483647 KIND=4: 57496123 1000.0 2147483647 --- KIND=8: 57496123484138 10 9223372036854775807 KIND=8: 57496123522116 10.000 9223372036854775807 --- KIND=10: 57496123575504 10 9223372036854775807 KIND=10: 57496123612377 10.000 9223372036854775807 --- KIND=16: 57496123669210 10 9223372036854775807 KIND=16: 57496123698413 10.00 9223372036854775807 OK for trunk? Regards, Jerry 2015-03-14 Jerry DeLisle jvdeli...@gcc.gnu.org PR fortran/64432 *trans-intrinisic.c (conv_intrinsic_system_clock): Check the smallest kind passed in user arguments and hard-code results for KIND=1 or KIND=2 to indicate no clock available. 2015-03-14 Jerry DeLisle jvdeli...@gcc.gnu.org PR libgfortran/64432 * intrinsics/system_clock.c (system_clock4, system_clock8): Cleanup some whitespace. Thanks, this looks good. Ok for trunk. -- Janne Blomqvist
[PATCH][expmed] Calculate mult-by-const cost properly in mult_by_coeff_cost
Hi all, Eyeballing the mult_by_coeff_cost function I think it has a typo/bug. It's supposed to return the cost of multiplying by a constant 'coeff'. It calculates that by taking the cost of a MULT rtx by that constant, comparing it to the cost of synthesizing that multiplication, and returning the cheapest. However, in the MULT rtx cost calculation it creates a MULT rtx of two REGs rather than a REG and the GEN_INT of coeff as I would expect. This patch fixes that in the obvious way. Tested aarch64-none-elf and bootstrapped on x86_64-linux-gnu. I'm guessing this is stage 1 material at this point? Thanks, Kyrill 2015-03-13 Kyrylo Tkachov kyrylo.tkac...@arm.com * expmed.c (mult_by_coeff_cost): Pass CONST_INT rtx to MULT cost calculation rather than fake_reg.commit fa230baf4f35d03cf072904dc048ce4ffcf43148 Author: Kyrylo Tkachov kyrylo.tkac...@arm.com Date: Thu Mar 12 09:47:06 2015 + [expmed] Calculate mult-by-const properly in mult_by_coeff_cost diff --git a/gcc/expmed.c b/gcc/expmed.c index 0034203..fec9501 100644 --- a/gcc/expmed.c +++ b/gcc/expmed.c @@ -3285,7 +3285,8 @@ mult_by_coeff_cost (HOST_WIDE_INT coeff, machine_mode mode, bool speed) enum mult_variant variant; rtx fake_reg = gen_raw_REG (mode, LAST_VIRTUAL_REGISTER + 1); - max_cost = set_src_cost (gen_rtx_MULT (mode, fake_reg, fake_reg), speed); + max_cost = set_src_cost (gen_rtx_MULT (mode, fake_reg, GEN_INT (coeff)), + speed); if (choose_mult_variant (mode, coeff, algorithm, variant, max_cost)) return algorithm.cost.cost; else
Re: [Patch, Fortran] Prevent segfault on illegal input
Hi Tobias, hi all, thanks for the review. Commited as r221455: r221455 | vehre | 2015-03-16 11:29:59 +0100 (Mo, 16. Mär 2015) | 13 Zeilen gcc/fortran/ChangeLog: 2015-03-16 Andre Vehreschild ve...@gmx.de * resolve.c: Prevent segfault on illegal input. gcc/testsuite/ChangeLog: 2015-03-16 Andre Vehreschild ve...@gmx.de * gfortran.dg/pointer_2.f90: New test. Regards, Andre On Fri, 13 Mar 2015 14:05:00 +0100 Tobias Burnus tobias.bur...@physik.fu-berlin.de wrote: Andre Vehreschild wrote: during debugging I found a segfault of gfortran, when it encounters an illegal fortran code. The mistake in Fortran is flagged correctly, but later gfortran crashes. This patch prevents the crash. Bootstraps and regtest ok on x86_64-linux-gnu. Ok for trunk? OK. Thanks for the patch. Tobias -- Andre Vehreschild * Email: vehre ad gmx dot de Index: gcc/fortran/ChangeLog === --- gcc/fortran/ChangeLog (Revision 221454) +++ gcc/fortran/ChangeLog (Arbeitskopie) @@ -1,3 +1,7 @@ +2015-03-16 Andre Vehreschild ve...@gmx.de + + * resolve.c: Prevent segfault on illegal input. + 2015-03-14 Mikael Morin mik...@gcc.gnu.org PR fortran/61138 Index: gcc/fortran/resolve.c === --- gcc/fortran/resolve.c (Revision 221454) +++ gcc/fortran/resolve.c (Arbeitskopie) @@ -2639,6 +2639,10 @@ expr-ts = sym-ts; expr-value.function.name = sym-name; expr-value.function.esym = sym; + /* Prevent crash when sym-ts.u.derived-components is not set due to previous + error(s). */ + if (sym-ts.type == BT_CLASS !CLASS_DATA (sym)) +return MATCH_ERROR; if (sym-ts.type == BT_CLASS CLASS_DATA (sym)-as) expr-rank = CLASS_DATA (sym)-as-rank; else if (sym-as != NULL) Index: gcc/testsuite/gfortran.dg/pointer_2.f90 === --- gcc/testsuite/gfortran.dg/pointer_2.f90 (Revision 0) +++ gcc/testsuite/gfortran.dg/pointer_2.f90 (Revision 221455) @@ -0,0 +1,18 @@ +! { dg-do compile } +! Check that the compiler reports the errors, but does not segfault. +! Contributed by: Andre Vehreschild ve...@gcc.gnu.org +! 
+program test +implicit none +class(*), pointer :: P +class(*), allocatable :: P2 + +allocate(P2, source=convertType(P)) + +contains + + function convertType(in) ! { dg-error must be dummy, allocatable or pointer } +class(*), intent(in) :: in +class(*) :: convertType + end function +end program test Index: gcc/testsuite/ChangeLog === --- gcc/testsuite/ChangeLog (Revision 221454) +++ gcc/testsuite/ChangeLog (Arbeitskopie) @@ -1,3 +1,7 @@ +2015-03-16 Andre Vehreschild ve...@gmx.de + + * gfortran.dg/pointer_2.f90: New test. + 2015-03-16 Eric Botcazou ebotca...@adacore.com * testsuite/g++.dg/pr65049.C: New test.
Fix PR middle-end/65409
This is an ICE present on all active branches, a regression introduced by the overhaul I applied throughout the RTL expander to avoid creating unnecessary stack temporaries for return values. We enter the special block for MEM:BLK in store_field with a PARALLEL:BLK and abort on the assertion. This RTX is dealt with immediately below and, as discussed with Jakub in the audit trail, adding the obvious guard is the safest thing to do for the release branches. Tested on x86_64-suse-linux, applied on all active branches. 2015-03-16 Eric Botcazou ebotca...@adacore.com PR middle-end/65409 * expr.c (store_field): Do not do a direct block copy if the source is a PARALLEL with BLKmode. 2015-03-16 Eric Botcazou ebotca...@adacore.com * testsuite/g++.dg/pr65049.C: New test. -- Eric BotcazouIndex: expr.c === --- expr.c (revision 221407) +++ expr.c (working copy) @@ -6645,11 +6645,12 @@ store_field (rtx target, HOST_WIDE_INT b mode != TYPE_MODE (TREE_TYPE (exp))) temp = convert_modes (mode, TYPE_MODE (TREE_TYPE (exp)), temp, 1); - /* If the modes of TEMP and TARGET are both BLKmode, both - must be in memory and BITPOS must be aligned on a byte - boundary. If so, we simply do a block copy. Likewise - for a BLKmode-like TARGET. */ - if (GET_MODE (temp) == BLKmode + /* If TEMP is not a PARALLEL (see below) and its mode and that of TARGET + are both BLKmode, both must be in memory and BITPOS must be aligned + on a byte boundary. If so, we simply do a block copy. Likewise for + a BLKmode-like TARGET. */ + if (GET_CODE (temp) != PARALLEL + GET_MODE (temp) == BLKmode (GET_MODE (target) == BLKmode || (MEM_P (target) GET_MODE_CLASS (GET_MODE (target)) == MODE_INT// PR middle-end/65409 // Reported by Ignacy Gawedzki b...@qult.net struct Foo { Foo() {} int a; int b; char c; }; Foo copy_foo(Foo); struct Bar : Foo { Bar(Foo t) : Foo(copy_foo(t)) {} }; Bar a = Foo();
[RFC AArch64] Implement TARGET_PROMOTE_FUNCTION_MODE for ILP32 code generation
Hi, For this trivial testcase: extern int bar (int, int); int foo (int *a, int *b) { return bar (*a, *b); } I noticed that GCC generates redundant zero-extension instructions under ILP32 (aarch64-linux-gnu-gcc -S -O2 -mabi=ilp32). Assembly code: .arch armv8-a+fp+simd .file 1.c .text .align 2 .p2align 3,,7 .global foo .type foo, %function foo: uxtw x0, w0 uxtw x1, w1 ldr w0, [x0] ldr w1, [x1] b bar .size foo, .-foo .ident GCC: (20140403) 5.0.0 20150312 (experimental) According to the ILP32 ABI, the two uxtw instructions here are not necessary. The following is a proposed patch to fix this issue, i.e. promoting pointer function arguments to word_mode. But I don't know whether it's a good idea to do this for pointer return values. Any comments? Index: gcc/config/aarch64/aarch64.c === --- gcc/config/aarch64/aarch64.c (revision 221393) +++ gcc/config/aarch64/aarch64.c (working copy) @@ -1587,7 +1587,7 @@ aarch64_function_value (const_tree type, const_tre machine_mode ag_mode; mode = TYPE_MODE (type); - if (INTEGRAL_TYPE_P (type)) + if (INTEGRAL_TYPE_P (type) || POINTER_TYPE_P (type)) mode = promote_function_mode (type, mode, unsignedp, func, 1); if (aarch64_return_in_msb (type)) @@ -1650,6 +1650,24 @@ aarch64_function_value_regno_p (const unsigned int return false; } +/* Implement TARGET_PROMOTE_FUNCTION_MODE. */ + +static machine_mode +aarch64_promote_function_mode (const_tree type, machine_mode mode, + int *punsignedp, const_tree fntype, + int for_return) +{ + /* Pointer function arguments and return values are promoted to word_mode. */ + if (type != NULL_TREE && POINTER_TYPE_P (type)) +{ + *punsignedp = POINTERS_EXTEND_UNSIGNED; + return word_mode; +} + + return default_promote_function_mode (type, mode, punsignedp, fntype, +for_return); +} + /* Implement TARGET_RETURN_IN_MEMORY.
If the type T of the result of a function is such that @@ -11329,6 +11347,9 @@ aarch64_gen_adjusted_ldpstp (rtx *operands, bool l #define TARGET_OVERRIDE_OPTIONS_AFTER_CHANGE \ aarch64_override_options_after_change +#undef TARGET_PROMOTE_FUNCTION_MODE +#define TARGET_PROMOTE_FUNCTION_MODE aarch64_promote_function_mode + #undef TARGET_PASS_BY_REFERENCE #define TARGET_PASS_BY_REFERENCE aarch64_pass_by_reference aarch64-promote-v2.diff Description: aarch64-promote-v2.diff
Re: [RFC AArch64] Implement TARGET_PROMOTE_FUNCTION_MODE for ILP32 code generation
On Mar 16, 2015, at 2:28 AM, Yangfei (Felix) felix.y...@huawei.com wrote: Hi, For this trivial testcase: extern int bar (int , int); int foo (int *a, int *b) { return bar (*a, *b); } I noticed that GCC generate redundant zero-extension instructions under ILP32 (aarch64-linux-gnu-gcc -S -O2 -mabi=ilp32). Assembly code: .arch armv8-a+fp+simd .file 1.c .text .align 2 .p2align 3,,7 .global foo .type foo, %function foo: uxtwx0, w0 uxtwx1, w1 ldr w0, [x0] ldr w1, [x1] b bar .size foo, .-foo .ident GCC: (20140403) 5.0.0 20150312 (experimental) According the ILP32 ABI, the two uxtw instrunctions here is not necessary. The following is a proposed patch to fix this issue, i.e. promoting pointer function arguments to word_mode. But I don't know whether it's a good idea to do this for pointer return values. Any comments? Actually they are required. The abi says the upper 32bits are undefined for arguments smaller then 64bits. I had this discussion a year or more ago about this case. A simple struct like struct a { int * b; int c; }; Will break the code if we don't have the zero extends Try void f(int *); void g(struct a d) { f(d.b); } And see that there is no zero extends inside g. I saw this exact thing when working on getting gccgo working. It also means the assembly functions in glibc are broken and need to be fixed. Thanks, Andrew Index: gcc/config/aarch64/aarch64.c === --- gcc/config/aarch64/aarch64.c(revision 221393) +++ gcc/config/aarch64/aarch64.c(working copy) @@ -1587,7 +1587,7 @@ aarch64_function_value (const_tree type, const_tre machine_mode ag_mode; mode = TYPE_MODE (type); - if (INTEGRAL_TYPE_P (type)) + if (INTEGRAL_TYPE_P (type) || POINTER_TYPE_P (type)) mode = promote_function_mode (type, mode, unsignedp, func, 1); if (aarch64_return_in_msb (type)) @@ -1650,6 +1650,24 @@ aarch64_function_value_regno_p (const unsigned int return false; } +/* Implement TARGET_PROMOTE_FUNCTION_MODE. 
*/ + +static machine_mode +aarch64_promote_function_mode (const_tree type, machine_mode mode, + int *punsignedp, const_tree fntype, + int for_return) +{ + /* Pointer function arguments and return values are promoted to word_mode. */ + if (type != NULL_TREE && POINTER_TYPE_P (type)) +{ + *punsignedp = POINTERS_EXTEND_UNSIGNED; + return word_mode; +} + + return default_promote_function_mode (type, mode, punsignedp, fntype, +for_return); +} + /* Implement TARGET_RETURN_IN_MEMORY. If the type T of the result of a function is such that @@ -11329,6 +11347,9 @@ aarch64_gen_adjusted_ldpstp (rtx *operands, bool l #define TARGET_OVERRIDE_OPTIONS_AFTER_CHANGE \ aarch64_override_options_after_change +#undef TARGET_PROMOTE_FUNCTION_MODE +#define TARGET_PROMOTE_FUNCTION_MODE aarch64_promote_function_mode + #undef TARGET_PASS_BY_REFERENCE #define TARGET_PASS_BY_REFERENCE aarch64_pass_by_reference aarch64-promote-v2.diff
[Ada] Fix ICE on loop at -O3
This fixes a regression present on all active branches at -O3, caused by a call to fold_convert with an aggregate type. Tested on x86_64-suse-linux, applied on all active branches. 2015-03-16 Eric Botcazou ebotca...@adacore.com * gcc-interface/utils2.c (gnat_invariant_expr): Return null if the type of the expression ends up being composite. 2015-03-16 Eric Botcazou ebotca...@adacore.com * gnat.dg/loop_optimization18.ad[sb]: New test. * gnat.dg/loop_optimization18_pkg.ads: New helper. -- Eric Botcazou Index: gcc-interface/utils2.c === --- gcc-interface/utils2.c (revision 221407) +++ gcc-interface/utils2.c (working copy) @@ -2805,6 +2805,12 @@ gnat_invariant_expr (tree expr) expr = remove_conversions (expr, false); } + /* We are only interested in scalar types at the moment and, even if we may + have gone through padding types in the above loop, we must be back to a + scalar value at this point. */ + if (AGGREGATE_TYPE_P (TREE_TYPE (expr))) + return NULL_TREE; + if (TREE_CONSTANT (expr)) return fold_convert (type, expr); with Unchecked_Conversion; package Loop_Optimization18_Pkg is type Arr is array (Integer range <>) of Natural; type Rec (UB : Integer) is record L : Arr (1 .. UB); end record; type Byte_Array_Type is new String (1..4); function Conv is new Unchecked_Conversion (Byte_Array_Type, Integer); end Loop_Optimization18_Pkg; with Loop_Optimization18_Pkg; use Loop_Optimization18_Pkg; package Loop_Optimization18 is procedure Proc (Message : Byte_Array_Type); end Loop_Optimization18; -- { dg-do compile } -- { dg-options "-O3" } package body Loop_Optimization18 is procedure Proc (Message : Byte_Array_Type) is R : Rec (Conv (Message)); begin for Division in 1 .. R.UB loop R.L (Division) := 0; end loop; end; end Loop_Optimization18;
Re: [AArch64][PR65375] Fix RTX cost for vector SET
lower-subreg.c:compute_costs() only cares about the cost of a (set (reg) (const_int)) move but I think the intention, at least for now, is to return extra_cost->vect.alu for all the vector operations. Almost, what we want at the moment is COSTS_N_INSNS (1) + extra_cost->vect.alu Thanks Kyrill for the review. Regression tested on aarch64-linux-gnu with no new regression. Is this OK for trunk? Are you sure it's a (set (reg) (const_int)) that's being costed here? I thought for moves into vector registers it would be a (set (reg) (const_vector)) which we don't handle in our rtx costs currently. I think the correct approach would be to extend the aarch64_rtx_costs switch statement to handle the CONST_VECT case. I believe you can use aarch64_simd_valid_immediate to check whether x is a valid immediate for a simd instruction and give it a cost of extra_cost->vect.alu. The logic should be similar to the CONST_INT case. Sorry about the (set (reg) (const_int)) above. But the actual RTL that is being split at 220r.subreg2 is (insn 11 10 12 2 (set (subreg:V4SF (reg/v:OI 77 [ __o ]) 0) (subreg:V4SF (reg/v:OI 73 [ __o ]) 0)) /home/kugan/work/builds/gcc-fsf-gcc/tools/lib/gcc/aarch64-none-linux-gnu/5.0.0/include/arm_neon.h:22625 800 {*aarch64_simd_movv4sf} (nil)) And also, if we return RTX cost above COSTS_N_INSNS (1), it will be split and it doesn't recover from there. Therefore we need something like the below to prevent that happening. diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index cba3c1a..d5c80f1 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -5544,10 +5544,14 @@ aarch64_rtx_costs (rtx x, int code, int outer ATTRIBUTE_UNUSED, /* Fall through. */ case REG: + if (VECTOR_MODE_P (GET_MODE (op0)) && REG_P (op1)) + { + *cost = COSTS_N_INSNS (1); + } /* const0_rtx is in general free, but we will use an instruction to set a register to 0.
*/ - if (REG_P (op1) || op1 == const0_rtx) -{ + else if (REG_P (op1) || op1 == const0_rtx) + { /* The cost is 1 per register copied. */ int n_minus_1 = (GET_MODE_SIZE (GET_MODE (op0)) - 1) / UNITS_PER_WORD; Thanks, Kugan
[committed] Fix reduction_map destruction (PR middle-end/65431)
Hi! Trying to access *ctx->outer in delete_omp_context, which is called from splay tree deletion and thus it is pretty much random if the outer context is destructed before or after the child, is a very bad idea. It seems a unique splay tree is assigned to ctx->reduction_map only for the offloading openacc GIMPLE_OMP_TARGET constructs, so this patch deletes them on those contexts only and no other contexts. Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk. 2015-03-16 Jakub Jelinek ja...@redhat.com PR middle-end/65431 * omp-low.c (delete_omp_context): Only splay_tree_delete reduction_map in GIMPLE_OMP_TARGET && is_gimple_omp_offloaded && is_gimple_omp_oacc contexts. Don't look at ctx->outer. --- gcc/omp-low.c.jj 2015-03-12 09:49:16.0 +0100 +++ gcc/omp-low.c 2015-03-16 10:18:35.082465483 +0100 @@ -1580,10 +1580,12 @@ delete_omp_context (splay_tree_value val splay_tree_delete (ctx->field_map); if (ctx->sfield_map) splay_tree_delete (ctx->sfield_map); + /* Reduction map is copied to nested contexts, so only delete it in the + owner. */ if (ctx->reduction_map - /* Shared over several omp_contexts. */ - && (ctx->outer == NULL - || ctx->reduction_map != ctx->outer->reduction_map)) + && gimple_code (ctx->stmt) == GIMPLE_OMP_TARGET + && is_gimple_omp_offloaded (ctx->stmt) + && is_gimple_omp_oacc (ctx->stmt)) splay_tree_delete (ctx->reduction_map); /* We hijacked DECL_ABSTRACT_ORIGIN earlier. We need to clear it before Jakub
[committed] Fix a memory leak in omp-low.c offloading
Hi! While looking at PR65431, I've noticed we leak memory for every expand_omp_target called. Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk. 2015-03-16 Jakub Jelinek ja...@redhat.com * omp-low.c (expand_omp_target): Use auto_vec<tree, 11> instead of vec<tree> * with vec_alloc and release for args. Adjust all users. --- gcc/omp-low.c.jj 2015-03-16 10:47:40.0 +0100 +++ gcc/omp-low.c 2015-03-16 11:15:00.74227 +0100 @@ -9105,14 +9105,11 @@ expand_omp_target (struct omp_region *re } gimple g; - vec<tree> *args; /* The maximum number used by any start_ix, without varargs. */ - unsigned int argcnt = 11; - - vec_alloc (args, argcnt); - args->quick_push (device); + auto_vec<tree, 11> args; + args.quick_push (device); if (offloaded) -args->quick_push (build_fold_addr_expr (child_fn)); +args.quick_push (build_fold_addr_expr (child_fn)); switch (start_ix) { case BUILT_IN_GOMP_TARGET: @@ -9120,7 +9117,7 @@ expand_omp_target (struct omp_region *re case BUILT_IN_GOMP_TARGET_UPDATE: /* This const void * is part of the current ABI, but we're not actually using it.
*/ - args->quick_push (build_zero_cst (ptr_type_node)); + args.quick_push (build_zero_cst (ptr_type_node)); break; case BUILT_IN_GOACC_DATA_START: case BUILT_IN_GOACC_ENTER_EXIT_DATA: @@ -9130,10 +9127,10 @@ expand_omp_target (struct omp_region *re default: gcc_unreachable (); } - args->quick_push (t1); - args->quick_push (t2); - args->quick_push (t3); - args->quick_push (t4); + args.quick_push (t1); + args.quick_push (t2); + args.quick_push (t3); + args.quick_push (t4); switch (start_ix) { case BUILT_IN_GOACC_DATA_START: @@ -9166,9 +9163,9 @@ expand_omp_target (struct omp_region *re t_vector_length = fold_convert_loc (OMP_CLAUSE_LOCATION (c), integer_type_node, OMP_CLAUSE_VECTOR_LENGTH_EXPR (c)); - args->quick_push (t_num_gangs); - args->quick_push (t_num_workers); - args->quick_push (t_vector_length); + args.quick_push (t_num_gangs); + args.quick_push (t_num_workers); + args.quick_push (t_vector_length); } /* FALLTHRU */ case BUILT_IN_GOACC_ENTER_EXIT_DATA: @@ -9190,13 +9187,13 @@ expand_omp_target (struct omp_region *re integer_type_node, OMP_CLAUSE_ASYNC_EXPR (c)); - args->quick_push (t_async); + args.quick_push (t_async); /* Save the index, and... */ - t_wait_idx = args->length (); + t_wait_idx = args.length (); /* ... push a default value. */ - args->quick_push (fold_convert_loc (gimple_location (entry_stmt), - integer_type_node, - integer_zero_node)); + args.quick_push (fold_convert_loc (gimple_location (entry_stmt), + integer_type_node, + integer_zero_node)); c = find_omp_clause (clauses, OMP_CLAUSE_WAIT); if (c) { @@ -9206,19 +9203,19 @@ expand_omp_target (struct omp_region *re { if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_WAIT) { - args->safe_push (fold_convert_loc (OMP_CLAUSE_LOCATION (c), - integer_type_node, - OMP_CLAUSE_WAIT_EXPR (c))); + args.safe_push (fold_convert_loc (OMP_CLAUSE_LOCATION (c), + integer_type_node, + OMP_CLAUSE_WAIT_EXPR (c))); n++; } } /* Now that we know the number, replace the default value.
*/ - args->ordered_remove (t_wait_idx); - args->quick_insert (t_wait_idx, - fold_convert_loc (gimple_location (entry_stmt), - integer_type_node, - build_int_cst (integer_type_node, n))); + args.ordered_remove (t_wait_idx); + args.quick_insert (t_wait_idx, + fold_convert_loc (gimple_location (entry_stmt), +integer_type_node, +build_int_cst (integer_type_node, n))); } } break; @@ -9226,8 +9223,7 @@ expand_omp_target (struct omp_region *re gcc_unreachable (); } - g = gimple_build_call_vec (builtin_decl_explicit (start_ix), *args); - args->release (); + g = gimple_build_call_vec
Re: [AArch64][PR65375] Fix RTX cost for vector SET
Resending, now that I've figured out how to make gmail send text email instead of html. Almost, what we want at the moment is COSTS_N_INSNS (1) + extra_cost->vect.alu This won't work, because extra_cost->vect.alu is COSTS_N_INSNS (1), which means the total is COSTS_N_INSNS (2). The lower-subreg pass makes a decision on whether to split based on cost >= (word_move_cost * size/word_mode_size). Vectors are twice the size of word mode, and word moves are cost COSTS_N_INSNS (1). Setting the vector move cost to COSTS_N_INSNS (2) means we have COSTS_N_INSNS (2) >= COSTS_N_INSNS (2) and vector moves are split which is bad for vector register allocation. This calculation happens in compute_costs in lower-subreg.c. How about the attached patch? It is similar to what is currently done for scalar register move. I like this approach of using the vector register size instead of word size when we have a vector mode. Jim
Re: [AArch64][PR65375] Fix RTX cost for vector SET
On 16/03/15 13:15, Kugan wrote: On 16/03/15 23:32, Kugan wrote: lower-subreg.c:compute_costs() only cares about the cost of a (set (reg) (const_int)) move but I think the intention, at least for now, is to return extra_cost->vect.alu for all the vector operations. Almost, what we want at the moment is COSTS_N_INSNS (1) + extra_cost->vect.alu Thanks Kyrill for the review. Regression tested on aarch64-linux-gnu with no new regression. Is this OK for trunk? Are you sure it's a (set (reg) (const_int)) that's being costed here? I thought for moves into vector registers it would be a (set (reg) (const_vector)) which we don't handle in our rtx costs currently. I think the correct approach would be to extend the aarch64_rtx_costs switch statement to handle the CONST_VECT case. I believe you can use aarch64_simd_valid_immediate to check whether x is a valid immediate for a simd instruction and give it a cost of extra_cost->vect.alu. The logic should be similar to the CONST_INT case. Sorry about the (set (reg) (const_int)) above. But the actual RTL that is being split at 220r.subreg2 is (insn 11 10 12 2 (set (subreg:V4SF (reg/v:OI 77 [ __o ]) 0) (subreg:V4SF (reg/v:OI 73 [ __o ]) 0)) /home/kugan/work/builds/gcc-fsf-gcc/tools/lib/gcc/aarch64-none-linux-gnu/5.0.0/include/arm_neon.h:22625 800 {*aarch64_simd_movv4sf} (nil)) And also, if we return RTX cost above COSTS_N_INSNS (1), it will be split and it doesn't recover from there. Therefore we need something like the below to prevent that happening. Hi Kyrill, How about the attached patch? It is similar to what is currently done for scalar register move. Hi Kugan, yeah, I think this is a better approach, though I can't approve. Kyrill Thanks, Kugan
[PATCH] Handle vector COND_EXPRs in vector genericization (PR tree-optimization/65427)
Hi! On the following testcase, gimple LIM creates a vector COND_EXPR (scalar condition, vector lhs, rhs2 and rhs3), but if we don't have corresponding vector mode for it, we ICE trying to expand the BLKmode COND_EXPR, as it is unprepared for that. This patch lowers those (parallel or piecewise). Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2015-03-16 Jakub Jelinek ja...@redhat.com PR tree-optimization/65427 * tree-vect-generic.c (do_cond, expand_vector_scalar_condition): New functions. (expand_vector_operations_1): Handle BLKmode vector COND_EXPR. * gcc.c-torture/execute/pr65427.c: New test. --- gcc/tree-vect-generic.c.jj 2015-01-15 20:25:40.0 +0100 +++ gcc/tree-vect-generic.c 2015-03-16 14:25:37.391932269 +0100 @@ -1417,6 +1417,57 @@ count_type_subparts (tree type) return VECTOR_TYPE_P (type) ? TYPE_VECTOR_SUBPARTS (type) : 1; } +static tree +do_cond (gimple_stmt_iterator *gsi, tree inner_type, tree a, tree b, +tree bitpos, tree bitsize, enum tree_code code) +{ + if (TREE_CODE (TREE_TYPE (a)) == VECTOR_TYPE) +a = tree_vec_extract (gsi, inner_type, a, bitsize, bitpos); + if (TREE_CODE (TREE_TYPE (b)) == VECTOR_TYPE) +b = tree_vec_extract (gsi, inner_type, b, bitsize, bitpos); + tree cond = gimple_assign_rhs1 (gsi_stmt (*gsi)); + return gimplify_build3 (gsi, code, inner_type, cond, a, b); +} + +/* Expand a vector COND_EXPR to scalars, piecewise. 
*/ +static void +expand_vector_scalar_condition (gimple_stmt_iterator *gsi) +{ + gassign *stmt = as_a <gassign *> (gsi_stmt (*gsi)); + tree type = gimple_expr_type (stmt); + tree compute_type = get_compute_type (COND_EXPR, mov_optab, type); + machine_mode compute_mode = TYPE_MODE (compute_type); + gcc_assert (compute_mode != BLKmode); + tree lhs = gimple_assign_lhs (stmt); + tree rhs2 = gimple_assign_rhs2 (stmt); + tree rhs3 = gimple_assign_rhs3 (stmt); + tree new_rhs; + + /* If the compute mode is not a vector mode (hence we are not decomposing + a BLKmode vector to smaller, hardware-supported vectors), we may want + to expand the operations in parallel. */ + if (GET_MODE_CLASS (compute_mode) != MODE_VECTOR_INT + && GET_MODE_CLASS (compute_mode) != MODE_VECTOR_FLOAT + && GET_MODE_CLASS (compute_mode) != MODE_VECTOR_FRACT + && GET_MODE_CLASS (compute_mode) != MODE_VECTOR_UFRACT + && GET_MODE_CLASS (compute_mode) != MODE_VECTOR_ACCUM + && GET_MODE_CLASS (compute_mode) != MODE_VECTOR_UACCUM) +new_rhs = expand_vector_parallel (gsi, do_cond, type, rhs2, rhs3, + COND_EXPR); + else +new_rhs = expand_vector_piecewise (gsi, do_cond, type, compute_type, + rhs2, rhs3, COND_EXPR); + if (!useless_type_conversion_p (TREE_TYPE (lhs), TREE_TYPE (new_rhs))) +new_rhs = gimplify_build1 (gsi, VIEW_CONVERT_EXPR, TREE_TYPE (lhs), + new_rhs); + + /* NOTE: We should avoid using gimple_assign_set_rhs_from_tree. One + way to do it is change expand_vector_operation and its callees to + return a tree_code, RHS1 and RHS2 instead of a tree. */ + gimple_assign_set_rhs_from_tree (gsi, new_rhs); + update_stmt (gsi_stmt (*gsi)); +} + /* Process one statement. If we identify a vector operation, expand it.
*/ static void @@ -1449,6 +1500,14 @@ expand_vector_operations_1 (gimple_stmt_ return; } + if (code == COND_EXPR + && TREE_CODE (TREE_TYPE (gimple_assign_lhs (stmt))) == VECTOR_TYPE + && TYPE_MODE (TREE_TYPE (gimple_assign_lhs (stmt))) == BLKmode) +{ + expand_vector_scalar_condition (gsi); + return; +} + if (code == CONSTRUCTOR && TREE_CODE (lhs) == SSA_NAME && VECTOR_MODE_P (TYPE_MODE (TREE_TYPE (lhs))) --- gcc/testsuite/gcc.c-torture/execute/pr65427.c.jj 2015-03-16 14:36:29.489254701 +0100 +++ gcc/testsuite/gcc.c-torture/execute/pr65427.c 2015-03-16 14:40:58.789851433 +0100 @@ -0,0 +1,34 @@ +/* PR tree-optimization/65427 */ + +typedef int V __attribute__ ((vector_size (8 * sizeof (int)))); +V a, b, c, d, e, f; + +__attribute__((noinline, noclone)) void +foo (int x, int y) +{ + do +{ + if (x) + d = a ^ c; + else + d = a ^ b; +} + while (y); +} + +int +main () +{ + a = (V) { 1, 2, 3, 4, 5, 6, 7, 8 }; + b = (V) { 0x40, 0x80, 0x40, 0x80, 0x40, 0x80, 0x40, 0x80 }; + e = (V) { 0x41, 0x82, 0x43, 0x84, 0x45, 0x86, 0x47, 0x88 }; + foo (0, 0); + if (__builtin_memcmp (&d, &e, sizeof (V)) != 0) +__builtin_abort (); + c = (V) { 0x80, 0x40, 0x80, 0x40, 0x80, 0x40, 0x80, 0x40 }; + f = (V) { 0x81, 0x42, 0x83, 0x44, 0x85, 0x46, 0x87, 0x48 }; + foo (1, 0); + if (__builtin_memcmp (&d, &f, sizeof (V)) != 0) +__builtin_abort (); + return 0; +} Jakub
[RFC, stage1] Richer source location information for gcc 6 (location ranges etc)
I've been experimenting with revamping our diagnostics to better show source locations. Some of the ideas are inspired by Clang's diagnostics, but I'm going beyond what it does in some areas. I'm attaching a patch (against r221423), which is very much a work-in-progress, but in a release early, release often spirit I thought I'd solicit feedback here (obviously this would be for our next stage 1/gcc 6): Examples of the output using the patch (some very long lines, sorry): bitfld1.C:12:15: error: invalid operands of types ‘char*’ and ‘unsigned char:1’ to binary ‘operator|’ void *a = s.p | s.f; ~~~ ^ ~~~ multiline-rich-location-01.C:13:11: error: no match for ‘operator+’ (operand types are ‘int’ and ‘s’) return (first_function_with_a_very_long_name () ~~~ + second_function_with_a_very_long_name ()); ^ Going beyond pure ASCII, here's an example using unicode box-drawing chars (and showing a macro expansion): gcc.dg/20150304-01.c: In function ‘f’: gcc.dg/20150304-01.c:4:94: error: invalid operands to binary > (have ‘struct mystruct’ and ‘float’) #define MYMAX(A,B) __extension__ ({ __typeof__(A) __a = (A); __typeof__(B) __b = (B); __a > __b ?
__b : __a; }) ─── ▲ ─── gcc.dg/20150304-01.c:11:7: note: in expansion of macro ‘MYMAX’ X = MYMAX(P, F); ▲── An example showing captions on ranges, in this case via right-hand margins: multiline-rich-location-02.C: In function ‘int test(int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int, int)’: multiline-rich-location-02.C:30:11: error: no match for ‘operator+’ (operand types are ‘int’ and ‘s’) return (first_function_with_a_very_long_name (lorem, ipsum, dolor, sit, amet,┐ └│ consectetur, adipiscing, elit, │ ── │ sed, eiusmod, tempor, │ ─ ├type ‘int’ incididunt, ut, labore, et, │ ─── │ dolore, magna, aliqua) │ ─┘ ┘ + second_function_with_a_very_long_name (lorem, ipsum, dolor, sit, // { dg-error "no match" }┐ ▲ └──│ amet, consectetur, │ ── │ adipiscing, elit, sed, │ ── ├type ‘s’ eiusmod, tempor, incididunt,│ │ ut, labore, et, dolore, │ ─── │ magna, aliqua)); │ ─┘ ┘ Potentially we can support locations within string literals, for -Wformat: pr52952-spurious-trailing-percent.c: In function ‘foo’: pr52952-spurious-trailing-percent.c:7:23: warning: spurious trailing ‘%’ in format [-Wformat=] printf("hello world %"); ▲ including ranges: pr52952-mismatching-argument-type.c: In function ‘foo’: pr52952-mismatching-argument-type.c:7:17: warning: format ‘%i’ expects argument of type ‘int’, but argument 2 has type ‘const char *’ [-Wformat=] printf("hello %i", msg); ▲─ potentially for multiline ones: pr52952-multiline-format-string.c: In function ‘f’: pr52952-multiline-format-string.c:6:12: warning: format ‘%d’ expects a matching ‘int’ argument [-Wformat=] "%" ▲─ "d" ─┘ You can see colorized versions of these at: https://dmalcolm.fedorapeople.org/gcc/2015-03-06/ https://dmalcolm.fedorapeople.org/gcc/2015-03-09/ https://dmalcolm.fedorapeople.org/gcc/2015-03-11/ The first two ranges in a
[Committed, PATCH] Disable all target libraries if not building gcc
On Fri, Mar 13, 2015 at 1:06 PM, H.J. Lu hongjiu...@intel.com wrote: I am working on SHF_COMPRESSED support: http://www.sco.com/developers/gabi/latest/ch4.sheader.html I will import zlib from GCC source tree into binutils-gdb tree. But zlib failed to build as a target library in binutils-gdb. There is no need to build target libraries if not building gcc. OK for master? Tested with binutils and GCC build. Checked into binutils-gdb tree. H.J. --- * configure.ac (target_configdirs): Clear if not building gcc. * configure: Regenerated. --- configure | 3 +++ configure.ac | 3 +++ 2 files changed, 6 insertions(+) diff --git a/configure b/configure index d075cf3..5caf82b 100755 --- a/configure +++ b/configure @@ -6614,6 +6614,9 @@ Supported languages are: ${potential_languages} $LINENO 5 ac_configure_args=`echo $ac_configure_args | sed -e "s/ '--enable-languages=[^ ]*'//g" -e "s/$/ '--enable-languages=$enable_languages'/"` +else + # Disable all target libraries if not building gcc. + target_configdirs= fi # Handle --disable-component generically. diff --git a/configure.ac b/configure.ac index 5ff56bf..34774d8 100644 --- a/configure.ac +++ b/configure.ac @@ -2039,6 +2039,9 @@ Supported languages are: ${potential_languages}]) AC_SUBST(stage1_languages) ac_configure_args=`echo $ac_configure_args | sed -e "s/ '--enable-languages=[[^ ]]*'//g" -e "s/$/ '--enable-languages=$enable_languages'/"` +else + # Disable all target libraries if not building gcc. + target_configdirs= fi # Handle --disable-component generically. -- 1.9.3 -- H.J.
[Patch, Fortran, PR 64787, v1] Invalid code on sourced allocation of class(*) character string
Dear all, herewith I submit a patch to fix another issue on deferred length strings and unlimited polymorphic objects. When an unlimited polymorphic object is allocated to store a char array, the _len component was not set correctly in all cases in gfc_allocate (). While analyzing the issue in gfc_allocate () I figured that it would be easier to fix by restructuring gfc_allocate () completely. I set to work and voila, here it is. Now the source=/mold= expression is evaluated just once before the allocate of all objects, when the objects to allocate aren't arrays. So most of the source=/mold= expression (expr3) evaluation has been extracted from the loop running over the objects to allocate. This also makes it possible to compute the size of a char array correctly, which needs the size of a single character. This is now done by storing that size in the size component of the vtab (for deferred length char arrays only). The size to allocate is then computed by the expression: expr3->_len > 0 ? expr3->_len * expr3->_vptr->size : expr3->_vptr->size Furthermore, PR64787 addresses the issue of copying the data correctly. In the vtab of an unlimited polymorphic object the copy function needs four arguments: from, to, from_len, to_cap, but before this patch copy () was called with from and to only, resulting in a crash of the compiled program. This patch introduces this block to solve the issue: if (to->_len > 0) to->_vptr->copy (from->_data, to->_data, from->_len, to->_len); else to->_vptr->copy (from->_data, to->_data); (For simplicity this explanation assumes that both objects from and to are unlimited polymorphic; the code will handle deferred length strings for from, too.) Furthermore, I tried to address the open comment #2 of PR57456. Bootstraps and regtests ok on x86_64-linux-gnu/F20. Comments and reviews welcome!
Note, this patch assumes that the following patches, which are all awaiting review, BTW, are present in your code tree (apply from top to bottom): https://gcc.gnu.org/ml/fortran/2015-02/msg00105.html (missing may produce deltas when patching) https://gcc.gnu.org/ml/fortran/2015-03/msg00032.html (same) https://gcc.gnu.org/ml/fortran/2015-03/msg00063.html (nice to have, or the new patch may apply with delta.) https://gcc.gnu.org/ml/fortran/2015-03/msg00074.html (necessary) https://gcc.gnu.org/ml/fortran/2015-03/msg00075.html (necessary) Regards, Andre -- Andre Vehreschild * Email: vehre ad gmx dot de pr64787_v1.clog Description: Binary data diff --git a/gcc/fortran/class.c b/gcc/fortran/class.c index 786876c..949fc98 100644 --- a/gcc/fortran/class.c +++ b/gcc/fortran/class.c @@ -2562,13 +2562,19 @@ find_intrinsic_vtab (gfc_typespec *ts) c->attr.access = ACCESS_PRIVATE; /* Build a minimal expression to make use of - target-memory.c/gfc_element_size for 'size'. */ + target-memory.c/gfc_element_size for 'size'. Special handling + for character arrays, that are not constant sized: to support + len(str)*kind, only the kind information is stored in the + vtab. */ e = gfc_get_expr (); e->ts = *ts; e->expr_type = EXPR_VARIABLE; c->initializer = gfc_get_int_expr (gfc_default_integer_kind, NULL, - (int)gfc_element_size (e)); + ts->type == BT_CHARACTER && charlen == 0 ? + ts->kind : + (int)gfc_element_size (e)); gfc_free_expr (e); /* Add component _extends.
*/ diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h index f55c691..f4fa9c8 100644 --- a/gcc/fortran/gfortran.h +++ b/gcc/fortran/gfortran.h @@ -3168,6 +3168,7 @@ void gfc_add_component_ref (gfc_expr *, const char *); void gfc_add_class_array_ref (gfc_expr *); #define gfc_add_data_component(e) gfc_add_component_ref(e,"_data") #define gfc_add_vptr_component(e) gfc_add_component_ref(e,"_vptr") +#define gfc_add_len_component(e) gfc_add_component_ref(e,"_len") #define gfc_add_hash_component(e) gfc_add_component_ref(e,"_hash") #define gfc_add_size_component(e) gfc_add_component_ref(e,"_size") #define gfc_add_def_init_component(e) gfc_add_component_ref(e,"_def_init") diff --git a/gcc/fortran/trans-array.c b/gcc/fortran/trans-array.c index 54f8f4a..9976f79 100644 --- a/gcc/fortran/trans-array.c +++ b/gcc/fortran/trans-array.c @@ -4975,8 +4975,7 @@ static tree gfc_array_init_size (tree descriptor, int rank, int corank, tree * poffset, gfc_expr ** lower, gfc_expr ** upper, stmtblock_t * pblock, stmtblock_t * descriptor_block, tree * overflow, - tree expr3_elem_size, tree *nelems, gfc_expr *expr3, - gfc_typespec *ts) + tree expr3_elem_size, tree *nelems, gfc_expr *expr3) { tree type; tree tmp; @@ -5156,9 +5155,6 @@ gfc_array_init_size (tree descriptor, int rank, int corank, tree * poffset, tmp = TYPE_SIZE_UNIT (tmp); } } - else if (ts->type != BT_UNKNOWN && ts->type != BT_CHARACTER) - /* FIXME: Properly handle characters. See
[Committed, PATCH]: Move cloog.m4 ChangeLog to config/ChangeLog
I checked in this patch to move cloog.m4 ChangeLog to config/ChangeLog. H.J. --- Index: ChangeLog === --- ChangeLog (revision 221457) +++ ChangeLog (working copy) @@ -331,7 +331,6 @@ 2014-11-11 Tobias Burnus bur...@net-b.de - * config/cloog.m4: Remove. * Makefile.def: Remove CLooG. * Makefile.tpl: Ditto. * configure.ac: Ditto. Index: config/ChangeLog === --- config/ChangeLog(revision 221457) +++ config/ChangeLog(working copy) @@ -25,6 +25,10 @@ * target-posix: New file. +2014-11-11 Tobias Burnus bur...@net-b.de + + * cloog.m4: Remove. + 2014-10-27 Tom Tromey tro...@redhat.com * gcc-plugin.m4: New file.
Re: [patch,avr]: Part5: Fix various problems with specs and specs file generation.
This patch introduces a new AVR-specific command-line option '-nodevicelib' so that linking of lib<dev>.a can be bypassed. The argument of -specs= is suffixed by %s instead of supplying the absolute path. That way -specs= works with installation paths that contain spaces. avr_mcu_t.library_name and its initializers in avr-mcus.def are cleaned up. This field was used to define __AVR_DEV_LIB_NAME__. If no device macro is defined because a device is not supported by avr-gcc, that hook macro can be defined to tell avr/io.h where to find the device's header file. This means it is pointless to define __AVR_DEV_LIB_NAME__ in avr-gcc: If a device is supported, the device macro (e.g. __AVR_ATmega8__) is defined and __AVR_DEV_LIB_NAME__ will never be used. The patch adds more help text to the device specs file to inform a potential reader about the role of the hook macro __AVR_DEV_LIB_NAME__. The patch also removes specs known to GCC and which don't directly depend on the device from the device specs file. This makes the specs file smaller and less confusing. Ok for trunk? Johann PR target/65296 * config/avr/avrlibc.h (LIB_SPEC, LIBGCC_SPEC) [AVR1]: Don't link libgcc.a, libc.a, libm.a. * config/avr/specs.h: Same. * config/avr/gen-avr-mmcu-specs.c (print_mcu): Don't print specs which don't (directly) depend on the device. Print more help. (*avrlibc_devicelib) [-nodevicelib]: Don't link libdev.a. (*cpp): Don't define __AVR_DEV_LIB_NAME__. * config/avr/driver-avr.c: Remove -nodevicelib from option list in case of an error. (avr_devicespecs_file): Use suffix %s for the specs file name instead of the absolute path. * config/avr/avr-arch.h (avr_mcu_t) [.library_name]: Remove. * config/avr/avr-mcus.def: Same: Adjust initializers. * config/avr/avr.opt (-nodevicelib): New option. * doc/invoke.texi (AVR Options): Document it.
Index: config/avr/avrlibc.h === --- config/avr/avrlibc.h (revision 221448) +++ config/avr/avrlibc.h (working copy) @@ -19,21 +19,17 @@ You should have received a copy of the G along with GCC; see the file COPYING3. If not see http://www.gnu.org/licenses/. */ -/* AVR-Libc implements functions from libgcc.a in libm.a, see PR54461. - More AVR-Libc specific specs originate from gen-avr-mmcu-specs.c: - - - LIBGCC_SPEC (*libgcc) - - LIB_SPEC (*lib) - -*/ - #undef LIB_SPEC #define LIB_SPEC\ - -lc %(avrlibc_devicelib) + %{!mmcu=avr1:-lc} %(avrlibc_devicelib) + +// AVR-Libc implements functions from libgcc.a in libm.a, see PR54461. +// For a list of functions which are provided by libm.a and are +// omitted from libgcc.a see libgcc's t-avrlibc. #undef LIBGCC_SPEC #define LIBGCC_SPEC \ - -lgcc -lm + %{!mmcu=avr1:-lgcc -lm} #undef STARTFILE_SPEC #define STARTFILE_SPEC \ Index: config/avr/avr-arch.h === --- config/avr/avr-arch.h (revision 221448) +++ config/avr/avr-arch.h (working copy) @@ -122,9 +122,6 @@ const char *const macro; /* Number of 64k segments in the flash. */ int n_flash; - - /* Old name of device library. */ - const char *const library_name; } avr_mcu_t; /* AVR device specific features. Index: config/avr/avr-devices.c === --- config/avr/avr-devices.c (revision 221448) +++ config/avr/avr-devices.c (working copy) @@ -111,12 +111,12 @@ avr_texinfo[] = const avr_mcu_t avr_mcu_types[] = { -#define AVR_MCU(NAME, ARCH, DEV_ATTRIBUTE, MACRO, DATA_SEC, TEXT_SEC, N_FLASH, LIBNAME)\ - { NAME, ARCH, DEV_ATTRIBUTE, MACRO, DATA_SEC, TEXT_SEC, N_FLASH, LIBNAME }, +#define AVR_MCU(NAME, ARCH, DEV_ATTRIBUTE, MACRO, DATA_SEC, TEXT_SEC, N_FLASH)\ + { NAME, ARCH, DEV_ATTRIBUTE, MACRO, DATA_SEC, TEXT_SEC, N_FLASH }, #include avr-mcus.def #undef AVR_MCU /* End of list. 
*/ - { NULL, ARCH_UNKNOWN, AVR_ISA_NONE, NULL, 0, 0, 0, NULL } + { NULL, ARCH_UNKNOWN, AVR_ISA_NONE, NULL, 0, 0, 0 } }; Index: config/avr/avr.opt === --- config/avr/avr.opt (revision 221448) +++ config/avr/avr.opt (working copy) @@ -94,3 +94,7 @@ Warn if the address space of an address mfract-convert-truncate Target Report Mask(FRACT_CONV_TRUNC) Allow to use truncation instead of rounding towards 0 for fractional int types + +nodevicelib +Driver Target Report RejectNegative +Do not link against the device-specific library libdev.a Index: config/avr/avr-mcus.def === --- config/avr/avr-mcus.def (revision 221448) +++ config/avr/avr-mcus.def (working copy) @@ -29,16 +29,25 @@ After that, rebuild everything and
Re: [PATCH] Remove unused function
On 03/16/15 07:36, Marek Polacek wrote: This function is unused since matrix-reorg.c removal back in 2012. I think there's no point in keeping it around. Bootstrapped/regtested on x86_64-linux, ok for trunk? 2015-03-16 Marek Polacek pola...@redhat.com * cgraph.h (add_new_static_var): Remove declaration. * varpool.c (add_new_static_var): Remove function. OK. jeff
Re: [RFC, stage1] Richer source location information for gcc 6 (location ranges etc)
On 16 March 2015 at 17:52, David Malcolm dmalc...@redhat.com wrote: I've been experimenting with revamping our diagnostics to better show source locations. Some of the ideas are inspired by Clang's diagnostics, but I'm going beyond what it does in some areas. I'm attaching a patch (against r221423), which is very much a work-in-progress, but in a release early, release often spirit I thought I'd solicit feedback here (obviously this would be for our next stage 1/gcc 6): Wow, awesome! Obviously, I'm very much in favour of this in general. From my own experience, you are going to have two enormous bike-shedding experiences: one for the API and another for the output. My advice would be to implement a few obvious testcases and just the ability to print some more detailed location info, like Clang's -fdiagnostics-print-source-range-info (but see also my past troubles with that: https://gcc.gnu.org/ml/gcc-patches/2009-08/msg00174.html) without even printing any caret info. This way you can work out the internal details without having to defend at the same time the actual user interface. I think my mistake was to try to implement -fdiagnostics-print-source-range-info as a user option, I should have proposed it as a debug option such as -fdump-location-info and just move forward from that. Also, people are going to more easily review and accept this if you propose small incremental steps that just tackle (initially) one case at a time (of course, you may be working on your grand plan behind the scenes). Cheers, Manuel.
Re: C++ PATCH for c++/65327 aka DR 1688 (volatile constexpr variables)
OK for trunk, not for release branches. Jason
Re: [patch c++]: Fix for PR/65390
If there is an alignment mismatch without user intervention, there is a problem, we can't just ignore it. Where we run into trouble is with array types where the version built earlier has not been laid out yet but the new one has been. I've been trying to deal with that by making sure that we lay out the original type as well, but obviously that isn't working for this case. Why not? I suppose we could avoid checking TYPE_ALIGN if neither TYPE_USER_ALIGN nor TYPE_SIZE are set on 't', but checking TYPE_USER_ALIGN isn't enough. Jason
Re: [PATCH] Handle vector COND_EXPRs in vector genericization (PR tree-optimization/65427)
On March 16, 2015 5:21:02 PM GMT+01:00, Jakub Jelinek ja...@redhat.com wrote: Hi! On the following testcase, gimple LIM creates a vector COND_EXPR (scalar condition, vector lhs, rhs2 and rhs3), but if we don't have a corresponding vector mode for it, we ICE trying to expand the BLKmode COND_EXPR, as it is unprepared for that. This patch lowers those (parallel or piecewise). Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK. Though maybe LIM should not create these for cost reasons? I also wonder if we should lower them to control flow?

Thanks, Richard.

2015-03-16  Jakub Jelinek  ja...@redhat.com

	PR tree-optimization/65427
	* tree-vect-generic.c (do_cond, expand_vector_scalar_condition): New
	functions.
	(expand_vector_operations_1): Handle BLKmode vector COND_EXPR.

	* gcc.c-torture/execute/pr65427.c: New test.

--- gcc/tree-vect-generic.c.jj	2015-01-15 20:25:40.0 +0100
+++ gcc/tree-vect-generic.c	2015-03-16 14:25:37.391932269 +0100
@@ -1417,6 +1417,57 @@ count_type_subparts (tree type)
   return VECTOR_TYPE_P (type) ? TYPE_VECTOR_SUBPARTS (type) : 1;
 }

+static tree
+do_cond (gimple_stmt_iterator *gsi, tree inner_type, tree a, tree b,
+	 tree bitpos, tree bitsize, enum tree_code code)
+{
+  if (TREE_CODE (TREE_TYPE (a)) == VECTOR_TYPE)
+    a = tree_vec_extract (gsi, inner_type, a, bitsize, bitpos);
+  if (TREE_CODE (TREE_TYPE (b)) == VECTOR_TYPE)
+    b = tree_vec_extract (gsi, inner_type, b, bitsize, bitpos);
+  tree cond = gimple_assign_rhs1 (gsi_stmt (*gsi));
+  return gimplify_build3 (gsi, code, inner_type, cond, a, b);
+}
+
+/* Expand a vector COND_EXPR to scalars, piecewise.  */
+
+static void
+expand_vector_scalar_condition (gimple_stmt_iterator *gsi)
+{
+  gassign *stmt = as_a <gassign *> (gsi_stmt (*gsi));
+  tree type = gimple_expr_type (stmt);
+  tree compute_type = get_compute_type (COND_EXPR, mov_optab, type);
+  machine_mode compute_mode = TYPE_MODE (compute_type);
+  gcc_assert (compute_mode != BLKmode);
+  tree lhs = gimple_assign_lhs (stmt);
+  tree rhs2 = gimple_assign_rhs2 (stmt);
+  tree rhs3 = gimple_assign_rhs3 (stmt);
+  tree new_rhs;
+
+  /* If the compute mode is not a vector mode (hence we are not decomposing
+     a BLKmode vector to smaller, hardware-supported vectors), we may want
+     to expand the operations in parallel.  */
+  if (GET_MODE_CLASS (compute_mode) != MODE_VECTOR_INT
+      && GET_MODE_CLASS (compute_mode) != MODE_VECTOR_FLOAT
+      && GET_MODE_CLASS (compute_mode) != MODE_VECTOR_FRACT
+      && GET_MODE_CLASS (compute_mode) != MODE_VECTOR_UFRACT
+      && GET_MODE_CLASS (compute_mode) != MODE_VECTOR_ACCUM
+      && GET_MODE_CLASS (compute_mode) != MODE_VECTOR_UACCUM)
+    new_rhs = expand_vector_parallel (gsi, do_cond, type, rhs2, rhs3,
+				      COND_EXPR);
+  else
+    new_rhs = expand_vector_piecewise (gsi, do_cond, type, compute_type,
+				       rhs2, rhs3, COND_EXPR);
+  if (!useless_type_conversion_p (TREE_TYPE (lhs), TREE_TYPE (new_rhs)))
+    new_rhs = gimplify_build1 (gsi, VIEW_CONVERT_EXPR, TREE_TYPE (lhs),
+			       new_rhs);
+
+  /* NOTE: We should avoid using gimple_assign_set_rhs_from_tree.  One
+     way to do it is change expand_vector_operation and its callees to
+     return a tree_code, RHS1 and RHS2 instead of a tree.  */
+  gimple_assign_set_rhs_from_tree (gsi, new_rhs);
+  update_stmt (gsi_stmt (*gsi));
+}
+
 /* Process one statement.  If we identify a vector operation, expand it.
    */

 static void
@@ -1449,6 +1500,14 @@ expand_vector_operations_1 (gimple_stmt_
       return;
     }

+  if (code == COND_EXPR
+      && TREE_CODE (TREE_TYPE (gimple_assign_lhs (stmt))) == VECTOR_TYPE
+      && TYPE_MODE (TREE_TYPE (gimple_assign_lhs (stmt))) == BLKmode)
+    {
+      expand_vector_scalar_condition (gsi);
+      return;
+    }
+
   if (code == CONSTRUCTOR
       && TREE_CODE (lhs) == SSA_NAME
      && VECTOR_MODE_P (TYPE_MODE (TREE_TYPE (lhs)))
--- gcc/testsuite/gcc.c-torture/execute/pr65427.c.jj	2015-03-16 14:36:29.489254701 +0100
+++ gcc/testsuite/gcc.c-torture/execute/pr65427.c	2015-03-16 14:40:58.789851433 +0100
@@ -0,0 +1,34 @@
+/* PR tree-optimization/65427 */
+
+typedef int V __attribute__ ((vector_size (8 * sizeof (int))));
+V a, b, c, d, e, f;
+
+__attribute__((noinline, noclone)) void
+foo (int x, int y)
+{
+  do
+    {
+      if (x)
+	d = a ^ c;
+      else
+	d = a ^ b;
+    }
+  while (y);
+}
+
+int
+main ()
+{
+  a = (V) { 1, 2, 3, 4, 5, 6, 7, 8 };
+  b = (V) { 0x40, 0x80, 0x40, 0x80, 0x40, 0x80, 0x40, 0x80 };
+  e = (V) { 0x41, 0x82, 0x43, 0x84, 0x45, 0x86, 0x47, 0x88 };
+  foo (0, 0);
+  if (__builtin_memcmp (&d, &e, sizeof (V)) != 0)
+    __builtin_abort ();
+  c = (V) { 0x80, 0x40, 0x80, 0x40, 0x80, 0x40, 0x80, 0x40 };
+  f = (V) { 0x81, 0x42, 0x83, 0x44, 0x85, 0x46, 0x87, 0x48 };
+  foo (1, 0);
+  if (__builtin_memcmp (&d, &f, sizeof (V)) != 0)
+    __builtin_abort ();
+  return 0;
+}
Re: [Patch, fortran] PR59198 - [4.8/4.9/5 Regression] ICE on cyclically dependent polymorphic types
Dear Tobias, As far as I can see, without the patch, gfc_get_derived_type goes into a continuous loop trying to build the abstract type. Why this is not the case with an additional non-procedure pointer component, I do not know. I suspect that there is a corner case out there that will challenge this patch but I was unable to generate it. I decided therefore to commit, with an additional condition in the loop to prevent repeated attempts to build the component field. Committed to trunk as revision 221474. 4.8 and 4.9 to follow. Cheers Paul On 16 March 2015 at 09:39, Paul Richard Thomas paul.richard.tho...@gmail.com wrote: Dear Tobias, I think that I have a partial understanding now and will attempt to verify it tonight. Certainly to not build the components, when a derived type is flagged to have proc_pointer components, cannot be right just because there can be other components as in the original testcase. This led to the lack of DECL_SIZE information for the field 'term' and the subsequent ICE. The key to understanding is the problem with typebound_operator_9.f03, when the check for proc_pointer components is omitted completely. This implies that there is something missing from the function (sorry, I forget its name) that builds proc_pointer fields. I will check that idea - enlightenment might lead to a different patch. Thanks for the review. Paul On 16 March 2015 at 09:07, Tobias Burnus tobias.bur...@physik.fu-berlin.de wrote: Dear Paul, Paul Richard Thomas wrote: The ChangeLog says it all. If the check is not done for components that are not procedure pointers, typebound_operator_9.f03 breaks. I am not entirely sure why this is the case but the fix works fine. Bootstraps and regtests on FC21/x86_64 - OK for 4.8, 4.9 and 5.0? OK. I have to admit that I also do not really understand why that's required. Tobias -- Outside of a dog, a book is a man's best friend. Inside of a dog it's too dark to read. Groucho Marx -- Outside of a dog, a book is a man's best friend.
Inside of a dog it's too dark to read. Groucho Marx
Re: [PATCH] Disable all target libraries if not building gcc
On Mon, Mar 16, 2015 at 3:50 PM, Joseph Myers jos...@codesourcery.com wrote: On Fri, 13 Mar 2015, H.J. Lu wrote: I am working on SHF_COMPRESSED support: http://www.sco.com/developers/gabi/latest/ch4.sheader.html I will import zlib from GCC source tree into binutils-gdb tree. But zlib failed to build as a target library in binutils-gdb. There is no need to build target libraries if not building gcc. OK for master? This is definitely wrong. newlib / libgloss is a target library that certainly makes sense to build separately from GCC, and the toplevel configure / build code should be identical in all three repositories (GCC, binutils-gdb, newlib-cygwin). We need to work out something to avoid building target libraries in binutils. -- H.J.
Re: [PATCH] Disable all target libraries if not building gcc
On Mon, 16 Mar 2015, H.J. Lu wrote: On Mon, Mar 16, 2015 at 3:50 PM, Joseph Myers jos...@codesourcery.com wrote: On Fri, 13 Mar 2015, H.J. Lu wrote: I am working on SHF_COMPRESSED support: http://www.sco.com/developers/gabi/latest/ch4.sheader.html I will import zlib from GCC source tree into binutils-gdb tree. But zlib failed to build as a target library in binutils-gdb. There is no need to build target libraries if not building gcc. OK for master? This is definitely wrong. newlib / libgloss is a target library that certainly makes sense to build separately from GCC, and the toplevel configure / build code should be identical in all three repositories (GCC, binutils-gdb, newlib-cygwin). We need to work out something to avoid building target libraries in binutils. I suggest not building zlib as a target library unless libgcj is also being built as a target library, given that there should be no need to build it as a target library in isolation. It can be useful to build even GCC's target libraries when not building GCC (for example, rebuilding them with different optimization options using an installed compiler) - for libraries that don't use the host-side tm.h (all except libgcc and libobjc, I think), this is practical. -- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH] Disable all target libraries if not building gcc
On Mon, Mar 16, 2015 at 4:41 PM, Joseph Myers jos...@codesourcery.com wrote: On Mon, 16 Mar 2015, H.J. Lu wrote: On Mon, Mar 16, 2015 at 3:50 PM, Joseph Myers jos...@codesourcery.com wrote: On Fri, 13 Mar 2015, H.J. Lu wrote: I am working on SHF_COMPRESSED support: http://www.sco.com/developers/gabi/latest/ch4.sheader.html I will import zlib from GCC source tree into binutils-gdb tree. But zlib failed to build as a target library in binutils-gdb. There is no need to build target libraries if not building gcc. OK for master? This is definitely wrong. newlib / libgloss is a target library that certainly makes sense to build separately from GCC, and the toplevel configure / build code should be identical in all three repositories (GCC, binutils-gdb, newlib-cygwin). We need to work out something to avoid building target libraries in binutils. I suggest not building zlib as a target library unless libgcj is also being built as a target library, given that there should be no need to built it as a target library in isolation. The logic is there. But somehow it doesn't work for binutils. I will take another look. -- H.J.
Re: [Patch, libstdc++/65420] Use constexpr variables as regex_constans flags
On Sun, Mar 15, 2015 at 5:59 AM, Daniel Krügler daniel.krueg...@gmail.com wrote: (I switched to a less interesting but practical change, so all class stuff is gone. But I'd like to reply to your comments.) Your implementation choice is an interesting approach. I believe that a strict reading of the library specification does not allow that form, because you are using as bitmask type a class type that is not std::bitset: You are too kind :) I simply forgot the exact requirements of a bitmask type, and didn't know how it should work; I thought it should be a concept, with several operators defined. As you mentioned, this isn't actually a bad idea; we are using concepts everywhere, so why not make it a concept? What I would consider as more controversial are the following things: a) The class takes advantage of a deprecated rule to implicitly declare a copy-assignment operator (because it has a user-declared copy-constructor). I suggest fixing that and adding a defaulted one before, at some future point, this bitmask type is no longer copy-assignable. This would be an easy fix. I admit that I can never remember in which cases a special member function will be implicitly defined or deleted :( b) Basically no operation is marked as noexcept. This is user-observable and seems like an unfortunate outcome. Obviously all of them should be marked as noexcept. c) The presence of a non-explicit conversion to bool allows users to unintentionally write code like std::regex_constants::syntax_option_type a = ..; [..] int x = a + 3; // Oops The presence of this function also *seemingly* allows to implicitly convert syntax_option_type to any integer type: int x = a; // Oops I suggest to make this conversion function explicit. I did so just because I used this implicit conversion (since they are originally enums) elsewhere in the code. I didn't want to change all that code; besides, the standard didn't specify the exact behavior of these two types.
So it is crappy, yes, but was more like an engineering trade-off to me (when I'm overengineering things ;). d) One could consider marking the mutable operations (syntax_option_type operator=(syntax_option_type)) as constexpr in C++14 mode, but that is just an idea, not a required functionality. I did notice that in other functions (operator bool, etc), but ignored these. On Sun, Mar 15, 2015 at 6:13 AM, Jonathan Wakely jwakely@gmail.com wrote: On 15 March 2015 at 08:09, Tim Shen wrote: Did a little bit of refactoring on regex_constants, so that users won't be polluted too much (they are not enum types anymore). I think this is overengineered and unnecessary. All it needs is something like:

enum syntax_option_type : unsigned { };
constexpr syntax_option_type icase = ...;

Acknowledged. Here's the simple version of it. Thanks!

-- Regards, Tim Shen

commit 33a88cefd85e04fcb7af463d6b23ea2248feef67
Author: Tim Shen tims...@google.com
Date:   Sun Mar 15 23:05:57 2015 -0700

	PR libstdc++/65420
	* include/bits/regex_constants.h: Use constexpr variables for flags.
	* testsuite/28_regex/constants/constexpr.cc: New testcase.

diff --git a/libstdc++-v3/include/bits/regex_constants.h b/libstdc++-v3/include/bits/regex_constants.h
index 1ef5a43..e2c7631 100644
--- a/libstdc++-v3/include/bits/regex_constants.h
+++ b/libstdc++-v3/include/bits/regex_constants.h
@@ -77,88 +77,97 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
    * elements @c ECMAScript, @c basic, @c extended, @c awk, @c grep, @c egrep
    * %set.
    */
-  enum syntax_option_type : unsigned int
-  {
-    /**
-     * Specifies that the matching of regular expressions against a character
-     * sequence shall be performed without regard to case.
-     */
-    icase = 1 << _S_icase,
-
-    /**
-     * Specifies that when a regular expression is matched against a character
-     * container sequence, no sub-expression matches are to be stored in the
-     * supplied match_results structure.
-     */
-    nosubs = 1 << _S_nosubs,
-
-    /**
-     * Specifies that the regular expression engine should pay more attention to
-     * the speed with which regular expressions are matched, and less to the
-     * speed with which regular expression objects are constructed.  Otherwise
-     * it has no detectable effect on the program output.
-     */
-    optimize = 1 << _S_optimize,
-
-    /**
-     * Specifies that character ranges of the form [a-b] should be locale
-     * sensitive.
-     */
-    collate = 1 << _S_collate,
-
-    /**
-     * Specifies that the grammar recognized by the regular expression engine is
-     * that used by ECMAScript in ECMA-262 [Ecma International, ECMAScript
-     * Language Specification, Standard Ecma-262, third edition, 1999], as
-     * modified in section [28.13].  This grammar is similar to that defined
-     * in the PERL scripting language but extended with elements found in the
-     * POSIX regular expression grammar.
-     */
-    ECMAScript = 1
Re: [PATCH][RFA] [PR rtl-optimization/64317] Enhance postreload-gcse.c to eliminate more redundant loads
On Wed, Mar 11, 2015 at 03:30:36PM -0600, Jeff Law wrote:

+#ifndef GCC_GCSE__COMMONH
+#define GCC_GCSE__COMMONH

GCC_GCSE_COMMON_H instead?

@@ -1308,8 +1396,19 @@ gcse_after_reload_main (rtx f ATTRIBUTE_UNUSED)
   if (expr_table->elements () > 0)
     {
+      /* Knowing which MEMs are transparent through a block can signifiantly
+	 increase the number of reundant loads found.  So compute transparency
+	 information for for each memory expression in the hash table.  */

s/for for/for/ ?

+      df_analyze ();
+      /* This can not be part of the normal allocation routine because
+	 we have to know the number of elements in the hash table.  */
+      transp = sbitmap_vector_alloc (last_basic_block_for_fn (cfun),
+				     expr_table->elements ());
+      bitmap_vector_ones (transp, last_basic_block_for_fn (cfun));
+      expr_table->traverse <FILE *, compute_expr_transp> (dump_file);
       eliminate_partially_redundant_loads ();
       delete_redundant_insns ();
+      sbitmap_vector_free (transp);

       if (dump_file)
	 {

What effect does the patch have on compile time on say x86_64 or ppc64? I'm slightly leaning towards trying it even in stage4, but if e.g. richi disagrees, we could defer it to stage1 too.

	Jakub
Re: [patch c++]: Fix for PR/65390
2015-03-16 19:07 GMT+01:00 Jason Merrill ja...@redhat.com: If there is an alignment mismatch without user intervention, there is a problem, we can't just ignore it. Where we run into trouble is with array types where the version built earlier has not been laid out yet but the new one has been. I've been trying to deal with that by making sure that we lay out the original type as well, but obviously that isn't working for this case. Why not? Well, TYPE_ALIGN (t) is set to 32, and it differs from TYPE_ALIGN (result) (value 8), and TYPE_USER_ALIGN isn't set. I suppose we could avoid checking TYPE_ALIGN if neither TYPE_USER_ALIGN nor TYPE_SIZE are set on 't', but checking TYPE_USER_ALIGN isn't enough. For t, TYPE_SIZE is set, but it isn't a constant (as it is a variably modified type). So could we add an additional check here for TYPE_SIZE being an integer constant? Something like this condition, you mean?

...
if (TYPE_USER_ALIGN (t) != TYPE_USER_ALIGN (result)
    || ((TYPE_USER_ALIGN (t) || TREE_CODE (TYPE_SIZE (t)) == INTEGER_CST)
	&& TYPE_ALIGN (t) != TYPE_ALIGN (result)))
  {
...

Jason

Kai
Re: [PATCH] Handle vector COND_EXPRs in vector genericization (PR tree-optimization/65427)
On Mon, Mar 16, 2015 at 07:15:59PM +0100, Richard Biener wrote: On March 16, 2015 5:21:02 PM GMT+01:00, Jakub Jelinek ja...@redhat.com wrote: On the following testcase, gimple LIM creates a vector COND_EXPR (scalar condition, vector lhs, rhs2 and rhs3), but if we don't have corresponding vector mode for it, we ICE trying to expand the BLKmode COND_EXPR, as it is unprepared for that. This patch lowers those (parallel or piecewise). Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? OK. Though maybe LIM should not create these for cost reasons? I also wonder if we should lower them to control flow? Yeah, I've been thinking about teaching LIM not to do that if it is BLKmode. But then found how many other spots create COND_EXPRs and thought I just can't catch all of them. But guess I can change LIM incrementally too. I've also been thinking about lowering it to control flow, but only if it couldn't be done in say two halves comparison as in the testcase. I suppose doing say 2 V4SImode COND_EXPRs would still be beneficial over control flow, but if it is more than that (say 4 or 8+) it might be already better to just expand it as a PHI. But, as we don't create basic blocks in tree-vect-generic.c right now, I thought it might be too much for stage4. Jakub
Re: [PATCH] Handle vector COND_EXPRs in vector genericization (PR tree-optimization/65427)
On March 16, 2015 7:26:58 PM GMT+01:00, Jakub Jelinek ja...@redhat.com wrote: On Mon, Mar 16, 2015 at 07:15:59PM +0100, Richard Biener wrote: On March 16, 2015 5:21:02 PM GMT+01:00, Jakub Jelinek ja...@redhat.com wrote: On the following testcase, gimple LIM creates a vector COND_EXPR (scalar condition, vector lhs, rhs2 and rhs3), but if we don't have corresponding vector mode for it, we ICE trying to expand the BLKmode COND_EXPR, as it is unprepared for that. This patch lowers those (parallel or piecewise). Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? OK. Though maybe LIM should not create these for cost reasons? I also wonder if we should lower them to control flow? Yeah, I've been thinking about teaching LIM not to do that if it is BLKmode. But then found how many other spots create COND_EXPRs and thought I just can't catch all of them. But guess I can change LIM incrementally too. I've also been thinking about lowering it to control flow, but only if it couldn't be done in say two halves comparison as in the testcase. I suppose doing say 2 V4SImode COND_EXPRs would still be beneficial over control flow, but if it is more than that (say 4 or 8+) it might be already better to just expand it as a PHI. But, as we don't create basic blocks in tree-vect-generic.c right now, I thought it might be too much for stage4. Yeah, the patch is fine for stage 4. I just wrote down my random thoughts. Richard. Jakub
Re: [AArch64][PR65375] Fix RTX cost for vector SET
On 16/03/15 05:36, Kugan wrote:

Hi Kugan,

AArch64 RTX cost for vector SET is causing PR65375. Lower subreg is using this rtx_cost to compute the cost of moves, and splitting anything larger than word size, 64-bits in this case. The aarch64 rtx_costs is returning 2 * COSTS_N_INSNS (1) for vector moves, so they get split. The attached patch fixes this. With the patch the testcase in the PR:

#include <arm_neon.h>

void hello_vst2 (float *fout, float *fin)
{
  float32x4x2_t a;
  a = vld2q_f32 (fin);
  vst2q_f32 (fout, a);
}

changes to:

hello_vst2:
-	ld2	{v0.4s - v1.4s}, [x1]
-	sub	sp, sp, #32
-	umov	x1, v0.d[0]
-	umov	x2, v0.d[1]
-	str	q1, [sp, 16]
-	mov	x5, x1
-	stp	x5, x2, [sp]
-	ld1	{v0.16b - v1.16b}, [sp]
+	ld2	{v2.4s - v3.4s}, [x1]
+	orr	v0.16b, v2.16b, v2.16b
+	orr	v1.16b, v3.16b, v3.16b
 	st2	{v0.4s - v1.4s}, [x0]
-	add	sp, sp, 32
 	ret

lower-subreg.c:compute_costs() only cares about the cost of a (set (reg) (const_int )) move but I think the intention, at least for now, is to return extra_cost->vect.alu for all the vector operations.

Almost, what we want at the moment is COSTS_N_INSNS (1) + extra_cost->vect.alu

Regression tested on aarch64-linux-gnu with no new regression. Is this OK for trunk?

Are you sure it's a (set (reg) (const_int)) that's being costed here? I thought for moves into vector registers it would be a (set (reg) (const_vector)) which we don't handle in our rtx costs currently. I think the correct approach would be to extend the aarch64_rtx_costs switch statement to handle the CONST_VECTOR case. I believe you can use aarch64_simd_valid_immediate to check whether x is a valid immediate for a simd instruction and give it a cost of extra_cost->vect.alu. The logic should be similar to the CONST_INT case.

Thanks,
Kyrill

Thanks,
Kugan

gcc/ChangeLog:

2015-03-16  Kugan Vivekanandarajah  kug...@linaro.org
	    Jim Wilson  jim.wil...@linaro.org

	PR target/65375
	* config/aarch64/aarch64.c (aarch64_rtx_costs): Return
	extra_cost->vect.alu for SET.
RE: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields
Hi,

when looking at the m68k I realized the following, which is a general problem... If the alignment of the structure is less than sizeof (field), the strict volatile bitfields code may read beyond the end of the structure! Consider this example:

struct s
{
  char x : 8;
  volatile unsigned int y : 31;
  volatile unsigned int z : 1;
} __attribute__ ((packed));

struct s global;

Here we have sizeof (struct s) == 5 and alignment (global) == 1. However, when we access global.z we read a 32-bit word at offset 4, which touches 3 bytes that are not safe to use. Something like that never happens with -fno-strict-volatile-bitfields, because IIRC, with the only exception of the simple_mem_bitfield_p code path, there is never an access mode used which is larger than MEM_ALIGN (x). In this example, if I want to use the packed attribute, I also have to use the aligned (4) attribute; this satisfies the check MEM_ALIGN (op0) >= modesize, which is IMO always necessary for strict volatile bitfields, not only on STRICT_ALIGNMENT targets. On a target that has BIGGEST_ALIGNMENT < BITS_PER_WORD, to use the strict volatile bitfields you have to add __attribute__ ((aligned (4))) to the structure. I had to do that on the pr23623.c test case to make it pass on m68k, for instance. I have attached the updated patch. As explained before, the check MEM_ALIGN (op0) >= modesize should always be done in strict_volatile_bitfield_p. For the targets that usually enable -fstrict-volatile-bitfields, nothing changes, except that when we use packed on the structure, we also need to add an aligned (4) attribute. For m68k, where the natural alignment of any structure is <= 2, we need to force aligned (4) if we want to ensure the access is in SImode. Boot-strapped and reg-tested on x86_64-linux-gnu. OK for trunk?

Thanks
Bernd.

gcc:
2015-03-15  Bernd Edlinger  bernd.edlin...@hotmail.de

	* expmed.c (strict_volatile_bitfield_p): Check that MEM_ALIGN allows
	a MODESIZE access.
	(store_bit_field, extract_bit_field): For !STRICT_ALIGNMENT explicitly
	generate an unaligned access if the field crosses a word boundary.

testsuite:
2015-03-15  Bernd Edlinger  bernd.edlin...@hotmail.de

	* gcc.dg/pr23623.c: Added aligned attribute.
	* gcc.dg/20141029-1.c: Likewise.
	* gcc.dg/20150306-1.c: New test.

patch-volatile-bitfields-1.diff
Description: Binary data
Re: [PATCH][RFA] [PR rtl-optimization/64317] Enhance postreload-gcse.c to eliminate more redundant loads
On 03/16/15 13:27, Jakub Jelinek wrote: On Wed, Mar 11, 2015 at 03:30:36PM -0600, Jeff Law wrote:

+#ifndef GCC_GCSE__COMMONH
+#define GCC_GCSE__COMMONH

GCC_GCSE_COMMON_H instead?

:-) Will fix.

@@ -1308,8 +1396,19 @@ gcse_after_reload_main (rtx f ATTRIBUTE_UNUSED)
   if (expr_table->elements () > 0)
     {
+      /* Knowing which MEMs are transparent through a block can signifiantly
+	 increase the number of reundant loads found.  So compute transparency
+	 information for for each memory expression in the hash table.  */

s/for for/for/ ?

Similarly.

+      df_analyze ();
+      /* This can not be part of the normal allocation routine because
+	 we have to know the number of elements in the hash table.  */
+      transp = sbitmap_vector_alloc (last_basic_block_for_fn (cfun),
+				     expr_table->elements ());
+      bitmap_vector_ones (transp, last_basic_block_for_fn (cfun));
+      expr_table->traverse <FILE *, compute_expr_transp> (dump_file);
       eliminate_partially_redundant_loads ();
       delete_redundant_insns ();
+      sbitmap_vector_free (transp);

       if (dump_file)
	 {

What effect does the patch have on compile time on say x86_64 or ppc64?

I'll test both. In the common case, the cost is going to be the basic bookkeeping so that we can compute the transparent property. The actual computation of transparency and everything else is guarded on having something in the hash tables -- and the overwhelming majority of the time there's nothing in the hash tables. Regardless, I'll pin down boxes and do some testing.

I'm slightly leaning towards trying it even in stage4, but if e.g. richi disagrees, we could defer it to stage1 too.

I'd be OK either way. I just want us to make a decision one way or the other :-)

Jeff
Re: [Patch, fortran] PR59198 - [4.8/4.9/5 Regression] ICE on cyclically dependent polymorphic types
Dear Paul, Paul Richard Thomas wrote: The ChangeLog says it all. If the check is not done for components that are not procedure pointers, typebound_operator_9.f03 breaks. I am not entirely sure why this is the case but the fix works fine. Bootstraps and regtests on FC21/x86_64 - OK for 4.8, 4.9 and 5.0? OK. I have to admit that I also do not really understand why that's required. Tobias
Re: RFC: Avoid calling convert_to_mode with invalid rtl.
if (offset)
  {
    machine_mode address_mode;
    rtx offset_rtx = expand_expr (offset, NULL_RTX, VOIDmode, EXPAND_SUM);

    gcc_assert (MEM_P (op0));
    address_mode = get_address_mode (op0);
    if (GET_MODE (offset_rtx) != address_mode)
      offset_rtx = convert_to_mode (address_mode, offset_rtx, 0);

The problem here is that convert_to_mode() may decide to offload offset_rtx into another reg and then extend the reg to address_mode, but it assumes that offset_rtx is a valid rtl expression for the target. Note that the very same code is in expand_assignment, so they probably should be kept in sync. The patch adds a second call to expand_expr(), giving the address mode as the suggested mode, and using a normal expansion, rather than EXPAND_SUM. This might work, but even if it does not, the rtl in offset_rtx will be valid in a non-address context so that whatever convert_to_mode does will still remain valid. I don't think that we want to expand the same expression twice. What about calling force_operand on offset_rtx right before convert_to_mode?

-- Eric Botcazou
Re: [Patch, fortran] PR59198 - [4.8/4.9/5 Regression] ICE on cyclically dependent polymorphic types
Dear Tobias, I think that I have a partial understanding now and will attempt to verify it tonight. Certainly to not build the components, when a derived type is flagged to have proc_pointer components cannot be right just because there can be other components as in the original testcase. This led to the lack of DECL_SIZE information for the field 'term' and the subsequent ICE. The key to understanding is the problem with typebound_operator_9.f03, when the check for proc_pointer components is omitted completely. This implies that there is something missing from the function (sorry, I forget its name) that build proc_pointer fields. I will check that idea - enlightenment might lead to a different patch. Thanks for the review. Paul On 16 March 2015 at 09:07, Tobias Burnus tobias.bur...@physik.fu-berlin.de wrote: Dear Paul, Paul Richard Thomas wrote: The ChangeLog says it all. If the check is not done for components that are not procedure pointers, typebound_operator_9.f03 breaks. I am not entirely sure why this is the case but the fix works fine. Bootstraps and regtests on FC21/x86_64 - OK for 4.8, 4.9 and 5.0? OK. I have to admit that I also do not really understand why that's required. Tobias -- Outside of a dog, a book is a man's best friend. Inside of a dog it's too dark to read. Groucho Marx
Re: [PATCH] Fix another wrong-code bug with -fstrict-volatile-bitfields
If we have BIGGEST_ALIGNMENT=16 that means we likely have a 16-bit architecture. I doubt that the strict volatile bitfields code makes any sense for modesize > BIGGEST_ALIGNMENT. Note that m68k is a 32-bit port (UNITS_PER_WORD is 4) but, by definition of BIGGEST_ALIGNMENT, not honoring an alignment larger than it is perfectly OK. -- Eric Botcazou
[Ada] Fix minor ICE with -gnatct
A short-circuit was added in gigi for E_Abstract_State but it results in an unbalanced handling of the context. Tested on x86_64-suse-linux, applied on the mainline.

2015-03-16  Eric Botcazou  ebotca...@adacore.com

	* gcc-interface/decl.c (gnat_to_gnu_entity) <E_Abstract_State>: Do not
	short-circuit the regular handling.

-- Eric Botcazou

Index: gcc-interface/decl.c
===================================================================
--- gcc-interface/decl.c	(revision 221407)
+++ gcc-interface/decl.c	(working copy)
@@ -4800,9 +4800,11 @@ gnat_to_gnu_entity (Entity_Id gnat_entit

     case E_Abstract_State:
       /* This is a SPARK annotation that only reaches here when compiling in
-	 ASIS mode and has no characteristics to annotate.  */
+	 ASIS mode.  */
       gcc_assert (type_annotate_only);
-      return error_mark_node;
+      gnu_decl = error_mark_node;
+      saved = true;
+      break;

     default:
       gcc_unreachable ();