Re: [PATCH/IPA] Fix ipa-polymorphic-call when size of Pmode is not the size of pointers in user code

2014-11-20 Thread Jan Hubicka
 Hi,
   For ILP32 on AArch64, we have ptr_mode != Pmode (ptr_mode is SImode
 while Pmode is DImode, and POINTER_SIZE is 32).  This breaks the
 ipa-polymorphic-call assumption that Pmode is the correct mode for
 pointers.  Without this patch we get many testcase failures in the
 C++ testsuite.  Some of the tests fail because the wrong
 devirtualization happens (using the base class rather than the
 current class).
 
 This patch fixes the issue by using POINTER_SIZE in place of
 GET_MODE_BITSIZE (Pmode) all over the file.
 
 OK?  Bootstrapped and tested on x86_64 and cross built and tested for
 aarch64-elf with no regressions.
 
 Thanks,
 Andrew Pinski
 
 ChangeLog:
 ipa/63981
 * ipa-polymorphic-call.c (possible_placement_new):
 Use POINTER_SIZE instead of GET_MODE_BITSIZE (Pmode).
 (ipa_polymorphic_call_context::restrict_to_inner_class): Likewise.
 (extr_type_from_vtbl_ptr_store): Likewise.

OK,
thanks!
Honza
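
To illustrate why the bit size matters, here is a minimal C sketch (not GCC code) of the kind of bounds check the patch touches. The function name `fits_pointer_at` and the two enum constants are hypothetical; the 32-bit and 64-bit values are simply the ILP32 AArch64 numbers quoted in the description above.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the ILP32 AArch64 values described above:
   user-visible pointers are 32 bits, Pmode (DImode) is 64 bits.  */
enum { ILP32_POINTER_SIZE = 32, ILP32_PMODE_BITSIZE = 64 };

/* Return true if a pointer-sized field starting at bit offset CUR_OFFSET
   fits inside an object of TYPE_SIZE bits, given a pointer of PTR_BITS.  */
bool
fits_pointer_at (unsigned long cur_offset, unsigned long ptr_bits,
                 unsigned long type_size)
{
  return cur_offset + ptr_bits <= type_size;
}
```

With a 64-bit object and a pointer field at bit offset 32, the check succeeds for the 32-bit pointer size but fails if the 64-bit Pmode size is used, which is the sort of mismatch that could lead the analysis to the wrong conclusion.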

 diff --git a/gcc/ipa-polymorphic-call.c b/gcc/ipa-polymorphic-call.c
 index 452f2d2..a746c49 100644
 --- a/gcc/ipa-polymorphic-call.c
 +++ b/gcc/ipa-polymorphic-call.c
 @@ -112,7 +112,7 @@ possible_placement_new (tree type, tree expected_type,
 || !tree_fits_shwi_p (TYPE_SIZE (type))
 || (cur_offset
 + (expected_type ? tree_to_uhwi (TYPE_SIZE (expected_type))
 -  : GET_MODE_BITSIZE (Pmode))
 +  : POINTER_SIZE)
 <= tree_to_uhwi (TYPE_SIZE (type)))));
  }
  
 @@ -155,7 +155,7 @@ ipa_polymorphic_call_context::restrict_to_inner_class 
 (tree otr_type,
HOST_WIDE_INT cur_offset = offset;
bool speculative = false;
bool size_unknown = false;
 -  unsigned HOST_WIDE_INT otr_type_size = GET_MODE_BITSIZE (Pmode);
 +  unsigned HOST_WIDE_INT otr_type_size = POINTER_SIZE;
  
/* Update OUTER_TYPE to match EXPECTED_TYPE if it is not set.  */
if (!outer_type)
 @@ -316,7 +316,7 @@ ipa_polymorphic_call_context::restrict_to_inner_class 
 (tree otr_type,
   
 if (pos <= (unsigned HOST_WIDE_INT)cur_offset
  && (pos + size) >= (unsigned HOST_WIDE_INT)cur_offset
 -  + GET_MODE_BITSIZE (Pmode)
 +  + POINTER_SIZE
  && (!otr_type
 || !TYPE_SIZE (TREE_TYPE (fld))
 || !tree_fits_shwi_p (TYPE_SIZE (TREE_TYPE (fld)))
 @@ -1243,7 +1243,7 @@ extr_type_from_vtbl_ptr_store (gimple stmt, struct 
 type_change_info *tci,
 print_generic_expr (dump_file, tci->instance, TDF_SLIM);
 fprintf (dump_file, " with offset %i\n", (int)tci->offset);
   }
 -   return tci->offset > GET_MODE_BITSIZE (Pmode) ? error_mark_node : NULL_TREE;
 +   return tci->offset > POINTER_SIZE ? error_mark_node : NULL_TREE;
   }
if (offset != tci->offset
 || size != POINTER_SIZE
 @@ -1252,9 +1252,9 @@ extr_type_from_vtbl_ptr_store (gimple stmt, struct 
 type_change_info *tci,
 if (dump_file)
   fprintf (dump_file, "wrong offset %i!=%i or size %i\n",
(int)offset, (int)tci->offset, (int)size);
 -   return offset + GET_MODE_BITSIZE (Pmode) <= tci->offset
 +   return offset + POINTER_SIZE <= tci->offset
|| (max_size != -1
 -   && tci->offset + GET_MODE_BITSIZE (Pmode) > offset + max_size)
 +   && tci->offset + POINTER_SIZE > offset + max_size)
? error_mark_node : NULL;
   }
  }



Re: [PATCH][x86] Add clwb,pcommit,avx512avbmi,avx512ifma.

2014-11-20 Thread Uros Bizjak
On Wed, Nov 19, 2014 at 6:32 PM, Ilya Tocar tocarip.in...@gmail.com wrote:
 Hi,

 A new revision of the Intel ISA reference [1] has new instructions:
 clwb, pcommit and new flavors of AVX512. The patch below adds them.
 I understand that stage 1 is closed; however, these changes shouldn't
 affect anything outside of the i386 backend and are extremely unlikely
 to break existing functionality, and I personally think it's desirable
 for the newest GCC to support the newest spec.
 Bootstrapped/regtested on x86_64-unknown-linux-gnu.
 Ok for trunk?

Please split the patch into patch series, like it was done previously
for AVX512F patches.

Uros.

 [1]:https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf


 gcc/

 2014-11-19  Ilya Tocar  ilya.to...@intel.com

 * common/config/i386/i386-common.c (OPTION_MASK_ISA_AVX512IFMA_SET,
 OPTION_MASK_ISA_AVX512VBMI_SET, OPTION_MASK_ISA_AVX512IFMA_UNSET,
 OPTION_MASK_ISA_AVX512VBMI_UNSET, OPTION_MASK_ISA_PCOMMIT_UNSET,
 OPTION_MASK_ISA_CLWB_UNSET, OPTION_MASK_ISA_CLWB_SET,
 OPTION_MASK_ISA_PCOMMIT_SET): New.
 (ix86_handle_option): Handle OPT_mavx512ifma, OPT_mavx512vbmi,
 OPT_mpcommit, OPT_mclwb.
 * config.gcc: Add avx512ifmaintrin.h, avx512ifmavlintrin.h,
 avx512vbmiintrin.h, avx512vbmivlintrin.h, clwbintrin.h, pcommitintrin.h.
 * config/i386/avx512ifmaintrin.h: New file.
 * config/i386/avx512ifmavlintrin.h: Ditto.
 * config/i386/avx512vbmiintrin.h: Ditto.
 * config/i386/avx512vbmivlintrin.h: Ditto.
 * config/i386/clwbintrin.h: Ditto.
 * config/i386/pcommitintrin.h: Ditto.
 * config/i386/cpuid.h (bit_AVX512IFMA, bit_PCOMMIT, bit_CLWB,
 bit_AVX512VBMI): New.
 * config/i386/driver-i386.c (host_detect_local_cpu): Detect pcommit,
 clwb, avx512ifma, avx512vbmi.
 * config/i386/i386-c.c (ix86_target_macros_internal): Define
 __AVX512VBMI__, __AVX512IFMA__, __PCOMMIT__, __CLWB__.
 * config/i386/i386.c (ix86_target_string): Add -mavx512ifma,
 -mavx512vbmi, -mclwb, -mpcommit.
 (PTA_AVX512VBMI, PTA_AVX512IFMA, PTA_CLWB, PTA_PCOMMIT): Define.
 (ix86_option_override_internal): Handle new options.
 (ix86_valid_target_attribute_inner_p): Add avx512vbmi, avx512ifma,
 clwb, pcommit.
 (ix86_builtins): Add IX86_BUILTIN_VPMADD52LUQ512,
 IX86_BUILTIN_VPMADD52HUQ512, IX86_BUILTIN_VPMADD52LUQ256,
 IX86_BUILTIN_VPMADD52HUQ256, IX86_BUILTIN_VPMADD52LUQ128,
 IX86_BUILTIN_VPMADD52HUQ128, IX86_BUILTIN_VPMADD52LUQ512_MASKZ,
 IX86_BUILTIN_VPMADD52HUQ512_MASKZ, IX86_BUILTIN_VPMADD52LUQ256_MASKZ,
 IX86_BUILTIN_VPMADD52HUQ256_MASKZ, IX86_BUILTIN_VPMADD52LUQ128_MASKZ,
 IX86_BUILTIN_VPMADD52HUQ128_MASKZ, IX86_BUILTIN_VPMULTISHIFTQB512,
 IX86_BUILTIN_VPMULTISHIFTQB256, IX86_BUILTIN_VPMULTISHIFTQB128,
 IX86_BUILTIN_VPERMVARQI512_MASK, IX86_BUILTIN_VPERMT2VARQI512,
 IX86_BUILTIN_VPERMT2VARQI512_MASKZ, IX86_BUILTIN_VPERMI2VARQI512,
 IX86_BUILTIN_VPERMVARQI256_MASK, IX86_BUILTIN_VPERMVARQI128_MASK,
 IX86_BUILTIN_VPERMT2VARQI256, IX86_BUILTIN_VPERMT2VARQI256_MASKZ,
 IX86_BUILTIN_VPERMT2VARQI128, IX86_BUILTIN_VPERMI2VARQI256,
 IX86_BUILTIN_VPERMI2VARQI128, IX86_BUILTIN_CLWB, IX86_BUILTIN_PCOMMIT.
 (bdesc_special_args): Add __builtin_ia32_pcommit,
 __builtin_ia32_vpmadd52luq512_mask,
 __builtin_ia32_vpmadd52luq512_maskz,
 __builtin_ia32_vpmadd52huq512_mask,
 __builtin_ia32_vpmadd52huq512_maskz,
 __builtin_ia32_vpmadd52luq256_mask,
 __builtin_ia32_vpmadd52luq256_maskz,
 __builtin_ia32_vpmadd52huq256_mask,
 __builtin_ia32_vpmadd52huq256_maskz,
 __builtin_ia32_vpmadd52luq128_mask,
 __builtin_ia32_vpmadd52luq128_maskz,
 __builtin_ia32_vpmadd52huq128_mask,
 __builtin_ia32_vpmadd52huq128_maskz,
 __builtin_ia32_vpmultishiftqb512_mask,
 __builtin_ia32_vpmultishiftqb256_mask,
 __builtin_ia32_vpmultishiftqb128_mask,
 __builtin_ia32_permvarqi512_mask, __builtin_ia32_vpermt2varqi512_mask,
 __builtin_ia32_vpermt2varqi512_maskz,
 __builtin_ia32_vpermi2varqi512_mask, __builtin_ia32_permvarqi256_mask,
 __builtin_ia32_permvarqi128_mask, __builtin_ia32_vpermt2varqi256_mask,
 __builtin_ia32_vpermt2varqi256_maskz,
 __builtin_ia32_vpermt2varqi128_mask,
 __builtin_ia32_vpermt2varqi128_maskz,
 __builtin_ia32_vpermi2varqi256_mask,
 __builtin_ia32_vpermi2varqi128_mask.
 (ix86_init_mmx_sse_builtins): Add __builtin_ia32_clwb.
 (ix86_expand_builtin): Handle IX86_BUILTIN_CLWB.
 (ix86_hard_regno_mode_ok): Allow big masks for AVX512VBMI.
 * config/i386/i386.h (TARGET_AVX512VBMI, TARGET_AVX512VBMI_P,
 TARGET_AVX512IFMA, TARGET_AVX512IFMA_P, TARGET_PCOMMIT,
 TARGET_PCOMMIT_P, 

Re: [PATCH, PR63742][ARM] Fix arm *movhi_insn_arch4 pattern for big-endian

2014-11-20 Thread Ramana Radhakrishnan



On 19/11/14 09:29, Yangfei (Felix) wrote:

Sorry for missing the point.  It seems to me that 't2' here will conflict with
the condition of the pattern *movhi_insn_arch4:

TARGET_ARM
  && arm_arch4
  && (register_operand (operands[0], HImode)
      || register_operand (operands[1], HImode))

#define TARGET_ARM  (! TARGET_THUMB)
/* 32-bit Thumb-2 code.  */
#define TARGET_THUMB2   (TARGET_THUMB && arm_arch_thumb2)




Bah, Indeed ! - I misremembered the t2 there, my mistake.

Yes you are right there, but what I'd like you to do is to use that mechanism
rather than putting all this logic in the predicate.

So, I'd prefer you to add a v6t2 to the values for the arch attribute, don't 
forget
to update the comments above.

and in arch_enabled you need to enforce this with

   (and (eq_attr "arch" "v6t2")
        (match_test "TARGET_32BIT && arm_arch6 && arm_arch_thumb2"))
   (const_string "yes")

And in the pattern use v6t2 ...

arm_arch_thumb2 implies that this is at the architecture level of v6t2.
Therefore TARGET_ARM && arm_arch_thumb2 implies ARM state.



Hi Ramana,
 Thank you for your suggestions.  I rebased the patch on the latest trunk 
and updated it accordingly.
 As this patch will not work for architectures older than armv6t2,  I also 
prefer Thomas's patch to fix for them.
 I am currently performing test for this patch.  Assuming no issues pops 
up, OK for the trunk?
 And is it necessary to backport this patch to the 4.8 & 4.9 branches?



I've applied the following as obvious after Kugan mentioned on IRC this 
morning noticing a movwne r0, #-32768. Obviously this won't be accepted 
as is by the assembler and we should be using the %L character. Applied 
to trunk as obvious.
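
As a small illustration of why the unmodified operand is a problem: movw encodes a 16-bit unsigned immediate, so a negative HImode constant has to be emitted as its low 16 bits. The C sketch below shows only that masking step. `low16` is a hypothetical helper name, and reading the %L fix as "print the low 16 bits" is an interpretation of the message above, not something spelled out in this thread.

```c
#include <stdint.h>

/* Hypothetical helper: keep only the low 16 bits of a constant, which is
   the range a movw immediate can encode (0..65535).  */
uint32_t
low16 (int32_t value)
{
  return (uint32_t) value & 0xffffu;
}
```

Under that reading, the constant from the report, -32768, would be printed as 32768 (0x8000), which is within the immediate range.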


Felix, How did you test this patch ?

regards
Ramana

2014-11-20  Ramana Radhakrishnan  ramana.radhakrish...@arm.com

PR target/59593
* config/arm/arm.md (*movhi_insn): Use right formatting
for immediate.






Index: gcc/ChangeLog
===
--- gcc/ChangeLog   (revision 217717)
+++ gcc/ChangeLog   (working copy)
@@ -1,3 +1,11 @@
+2014-11-19  Felix Yang  felix.y...@huawei.com
+   Shanyao Chen  chenshan...@huawei.com
+
+   PR target/59593
+   * config/arm/arm.md (define_attr arch): Add v6t2.
+   (define_attr arch_enabled): Add test for the above.
+   (*movhi_insn_arch4): Add new alternative.
+
  2014-11-18  Felix Yang  felix.y...@huawei.com

* config/aarch64/aarch64.c (doloop_end): New pattern.
Index: gcc/config/arm/arm.md
===
--- gcc/config/arm/arm.md   (revision 217717)
+++ gcc/config/arm/arm.md   (working copy)
@@ -125,9 +125,10 @@
  ; This can be "a" for ARM, "t" for either of the Thumbs, "32" for
  ; TARGET_32BIT, "t1" or "t2" to specify a specific Thumb mode.  "v6"
  ; for ARM or Thumb-2 with arm_arch6, and nov6 for ARM without
-; arm_arch6.  This attribute is used to compute attribute "enabled",
-; use type "any" to enable an alternative in all cases.
-(define_attr "arch" "any,a,t,32,t1,t2,v6,nov6,neon_for_64bits,avoid_neon_for_64bits,iwmmxt,iwmmxt2,armv6_or_vfpv3"
+; arm_arch6.  "v6t2" for Thumb-2 with arm_arch6.  This attribute is
+; used to compute attribute "enabled", use type "any" to enable an
+; alternative in all cases.
+(define_attr "arch" "any,a,t,32,t1,t2,v6,nov6,v6t2,neon_for_64bits,avoid_neon_for_64bits,iwmmxt,iwmmxt2,armv6_or_vfpv3"
(const_string "any"))

  (define_attr "arch_enabled" "no,yes"
@@ -162,6 +163,10 @@
  (match_test "TARGET_32BIT && !arm_arch6"))
 (const_string "yes")

+(and (eq_attr "arch" "v6t2")
+ (match_test "TARGET_32BIT && arm_arch6 && arm_arch_thumb2"))
+(const_string "yes")
+
 (and (eq_attr "arch" "avoid_neon_for_64bits")
  (match_test "TARGET_NEON")
  (not (match_test "TARGET_PREFER_NEON_64BITS")))
@@ -6288,8 +6293,8 @@

  ;; Pattern to recognize insn generated default case above
  (define_insn "*movhi_insn_arch4"
-  [(set (match_operand:HI 0 "nonimmediate_operand" "=r,r,m,r")
-   (match_operand:HI 1 "general_operand"  "rIk,K,r,mi"))]
+  [(set (match_operand:HI 0 "nonimmediate_operand" "=r,r,r,m,r")
+   (match_operand:HI 1 "general_operand"  "rIk,K,n,r,mi"))]
"TARGET_ARM
  && arm_arch4
  && (register_operand (operands[0], HImode)
@@ -6297,16 +6302,19 @@
"@
 mov%?\\t%0, %1\\t%@ movhi
 mvn%?\\t%0, #%B1\\t%@ movhi
+   movw%?\\t%0, %1\\t%@ movhi
 str%(h%)\\t%1, %0\\t%@ movhi
 ldr%(h%)\\t%0, %1\\t%@ movhi"
[(set_attr "predicable" "yes")
-   (set_attr "pool_range" "*,*,*,256")
-   (set_attr "neg_pool_range" "*,*,*,244")
+   (set_attr "pool_range" "*,*,*,*,256")
+   (set_attr "neg_pool_range" "*,*,*,*,244")
+   (set_attr "arch" "*,*,v6t2,*,*")
 (set_attr_alternative "type"
   [(if_then_else (match_operand 1 "const_int_operand" "")
  (const_string "mov_imm" )

RE: [PATCH][wwwdocs] Update 5.0 changes.html with Thumb1 UAL

2014-11-20 Thread Terry Guo


 -Original Message-
 From: Kyrill Tkachov [mailto:kyrylo.tkac...@arm.com]
 Sent: Tuesday, November 18, 2014 11:08 PM
 To: Terry Guo; gcc-patches@gcc.gnu.org
 Cc: ger...@pfeifer.com
 Subject: Re: [PATCH][wwwdocs] Update 5.0 changes.html with Thumb1 UAL
 
 
 On 18/11/14 02:48, Terry Guo wrote:
  + <ul>
  +  <li> The Thumb-1 assembly code are now generated in unified syntax. The new option
  +<code>-masm-syntax-unified</code> can be used to specify whether inline assembly
  +code are using unified syntax. By default the option is off which means
  +non-unified syntax is used. However this is subject to change in future releases.
  +Eventually the non-unified syntax will be deprecated.
  +  </li>
  + </ul>
 Hi Terry,
 
 Sorry for the late comment, I see this has already been committed.
 
 I think it should be "assembly code is now generated".
 Also "whether inline assembly code is using unified syntax".
 
 Kyrill

Thanks for the comments. I committed the patch below to fix those typos.

BR,
Terry

Index: htdocs/gcc-5/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v
retrieving revision 1.39
diff -u -r1.39 changes.html
--- htdocs/gcc-5/changes.html   19 Nov 2014 12:13:00 -  1.39
+++ htdocs/gcc-5/changes.html   20 Nov 2014 03:48:26 -
@@ -387,9 +387,9 @@
 
 <h3 id="arm">ARM</h3>
  <ul>
-  <li> The Thumb-1 assembly code are now generated in unified syntax. The new option
+  <li> The Thumb-1 assembly code is now generated in unified syntax. The new option
 <code>-masm-syntax-unified</code> can be used to specify whether inline assembly
-code are using unified syntax. By default the option is off which means
+code is using unified syntax. By default the option is off which means
 non-unified syntax is used. However this is subject to change in future releases.
 Eventually the non-unified syntax will be deprecated.
   </li>





Re: [PATCH, PR63742][ARM] Fix arm *movhi_insn_arch4 pattern for big-endian

2014-11-20 Thread Yangfei (Felix)
 On 19/11/14 09:29, Yangfei (Felix) wrote:
  Sorry for missing the point.  It seems to me that 't2' here will
  conflict with
  condition of the pattern *movhi_insn_arch4:
  TARGET_ARM
    && arm_arch4
    && (register_operand (operands[0], HImode)
        || register_operand (operands[1], HImode))
 
  #define TARGET_ARM  (! TARGET_THUMB)
  /* 32-bit Thumb-2 code.  */
  #define TARGET_THUMB2   (TARGET_THUMB && arm_arch_thumb2)
 
 
  Bah, Indeed ! - I misremembered the t2 there, my mistake.
 
  Yes you are right there, but what I'd like you to do is to use that
  mechanism rather than putting all this logic in the predicate.
 
  So, I'd prefer you to add a v6t2 to the values for the arch
  attribute, don't forget to update the comments above.
 
  and in arch_enabled you need to enforce this with
 
     (and (eq_attr "arch" "v6t2")
          (match_test "TARGET_32BIT && arm_arch6 && arm_arch_thumb2"))
     (const_string "yes")
 
  And in the pattern use v6t2 ...
 
  arm_arch_thumb2 implies that this is at the architecture level of v6t2.
  Therefore TARGET_ARM && arm_arch_thumb2 implies ARM state.
 
 
  Hi Ramana,
   Thank you for your suggestions.  I rebased the patch on the latest 
  trunk
 and updated it accordingly.
   As this patch will not work for architectures older than armv6t2,  I 
  also
 prefer Thomas's patch to fix for them.
   I am currently performing test for this patch.  Assuming no issues pops
 up, OK for the trunk?
   And is it necessary to backport this patch to the 4.8 & 4.9 branches?
 
 
 I've applied the following as obvious after Kugan mentioned on IRC this 
 morning
 noticing a movwne r0, #-32768. Obviously this won't be accepted as is by the
 assembler and we should be using the %L character. Applied to trunk as 
 obvious.
 
 Felix, How did you test this patch ?
 
 regards
 Ramana


I regtested the patch for arm-eabi-gcc/g++  big-endian with qemu.  The test 
result is OK.  That's strange ...  

This issue can be reproduced by the following testcase.  Thanks for fixing it.  

#include <stdio.h>
unsigned short v = 0x5678;
int i;
int j = 0;
int *ptr = &j;
int func()
{
    for (i = 0; i < 1; ++i)
    {
        *ptr = -1;
        v = 0xF234;
    }
    return v;
}

 
 2014-11-20  Ramana Radhakrishnan  ramana.radhakrish...@arm.com
 
  PR target/59593
  * config/arm/arm.md (*movhi_insn): Use right formatting
  for immediate.


Re: [PATCH 2/2, AArch64, v2] Pipeline model for APM XGene-1.

2014-11-20 Thread Kyrill Tkachov

Hi Philipp,

I don't mind it being in config/arm if you plan to wire it up later, 
good to know.

Another comment inline

Thanks,
Kyrill

On 19/11/14 21:42, Philipp Tomsich wrote:

Here's an updated patch with Kyrill's and Andrew's comments integrated.

I left the file in the config/arm-directory, as XGene-family is capable of
executing ARMv7 and we will wire this into the 32bit backend in the near
future (moving it now would just cause another move in the near future).

We also moved the 'include' up to where the pipeline models for the
A53/A57/ThunderX are included, as the previous dependency on picking up the
SIMD types from aarch64-simd.md no longer holds true since gcc-4.9.

Cheers,
-Philipp.


---
  gcc/ChangeLog |   6 +
  gcc/config/aarch64/aarch64.md |   3 +-
  gcc/config/arm/xgene1.md  | 520 ++
  3 files changed, 528 insertions(+), 1 deletion(-)
  create mode 100644 gcc/config/arm/xgene1.md

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index c9ac0d9..dad2278 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,11 @@
  2014-11-19  Philipp Tomsich  philipp.toms...@theobroma-systems.com

+   * config/aarch64/aarch64.md: Include xgene1.md.
+   (generic_sched): Set to no for xgene1.
+   * config/arm/xgene1.md: New file.
+
+2014-11-19  Philipp Tomsich  philipp.toms...@theobroma-systems.com
+
 * config/aarch64/aarch64-cores.def (xgene1): Update/add the
 xgene1 (APM XGene-1) core definition.
 * gcc/config/aarch64/aarch64.c: Add cost tables for APM XGene-1
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 597ff8c..1b36384 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -191,7 +191,7 @@

  (define_attr "generic_sched" "yes,no"
(const (if_then_else
-  (eq_attr "tune" "cortexa53,cortexa15,thunderx")
+  (eq_attr "tune" "cortexa53,cortexa15,thunderx,xgene1")
(const_string "no")
(const_string "yes"))))

@@ -199,6 +199,7 @@
  (include "../arm/cortex-a53.md")
  (include "../arm/cortex-a15.md")
  (include "thunderx.md")
+(include "../arm/xgene1.md")

  ;; ---
  ;; Jumps and other miscellaneous insns
diff --git a/gcc/config/arm/xgene1.md b/gcc/config/arm/xgene1.md
new file mode 100644
index 000..227f2c7
--- /dev/null
+++ b/gcc/config/arm/xgene1.md
@@ -0,0 +1,520 @@
+;; Machine description for AppliedMicro xgene1 core.
+;; Copyright (C) 2012-2014 Free Software Foundation, Inc.
+;; Contributed by Theobroma Systems Design und Consulting GmbH.
+;;See http://www.theobroma-systems.com for more info.
+;;
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify it
+;; under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+;;
+;; GCC is distributed in the hope that it will be useful, but
+;; WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+;; General Public License for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; http://www.gnu.org/licenses/.
+
+;; Pipeline description for the xgene1 micro-architecture
+
+(define_automaton "xgene1")
+
+(define_cpu_unit "xgene1_decode_out0" "xgene1")
+(define_cpu_unit "xgene1_decode_out1" "xgene1")
+(define_cpu_unit "xgene1_decode_out2" "xgene1")
+(define_cpu_unit "xgene1_decode_out3" "xgene1")
+
+(define_cpu_unit "xgene_divide" "xgene1")
+(define_cpu_unit "xgene_fp_divide" "xgene1")


Why are these named xgene_* while the other units are xgene1_*?


+
+(define_reservation "xgene1_decode1op"
+"( xgene1_decode_out0 )
+|( xgene1_decode_out1 )
+|( xgene1_decode_out2 )
+|( xgene1_decode_out3 )"
+)
+(define_reservation "xgene1_decode2op"
+"( xgene1_decode_out0 + xgene1_decode_out1 )
+|( xgene1_decode_out0 + xgene1_decode_out2 )
+|( xgene1_decode_out0 + xgene1_decode_out3 )
+|( xgene1_decode_out1 + xgene1_decode_out2 )
+|( xgene1_decode_out1 + xgene1_decode_out3 )
+|( xgene1_decode_out2 + xgene1_decode_out3 )"
+)
+(define_reservation "xgene1_decodeIsolated"
+"( xgene1_decode_out0 + xgene1_decode_out1 + xgene1_decode_out2 + xgene1_decode_out3 )"
+)
+
+(define_insn_reservation "branch" 1
+  (and (eq_attr "tune" "xgene1")
+   (eq_attr "type" "branch"))
+  "xgene1_decode1op")


The insn_reservation names should also be in the xgene1_* namespace.


+
+(define_insn_reservation "nop" 1
+  (and (eq_attr "tune" "xgene1")
+   (eq_attr "type" "no_insn"))
+  "xgene1_decode1op")
+
+(define_insn_reservation "call" 1
+  (and (eq_attr "tune" "xgene1")
+   (eq_attr "type" "call"))
+  "xgene1_decode2op")
+
+(define_insn_reservation "f_load" 10
+  (and (eq_attr "tune" "xgene1")
+   (eq_attr "type" "f_loadd,f_loads"))
+  

Re: [PATCH, PR63742][ARM] Fix arm *movhi_insn_arch4 pattern for big-endian

2014-11-20 Thread Yangfei (Felix)
  On 19/11/14 09:29, Yangfei (Felix) wrote:
   Sorry for missing the point.  It seems to me that 't2' here will
   conflict with
   condition of the pattern *movhi_insn_arch4:
   TARGET_ARM
     && arm_arch4
     && (register_operand (operands[0], HImode)
         || register_operand (operands[1], HImode))
  
   #define TARGET_ARM  (! TARGET_THUMB)
   /* 32-bit Thumb-2 code.  */
   #define TARGET_THUMB2   (TARGET_THUMB && arm_arch_thumb2)
  
  
   Bah, Indeed ! - I misremembered the t2 there, my mistake.
  
   Yes you are right there, but what I'd like you to do is to use that
   mechanism rather than putting all this logic in the predicate.
  
   So, I'd prefer you to add a v6t2 to the values for the arch
   attribute, don't forget to update the comments above.
  
   and in arch_enabled you need to enforce this with
  
      (and (eq_attr "arch" "v6t2")
           (match_test "TARGET_32BIT && arm_arch6 && arm_arch_thumb2"))
      (const_string "yes")
  
   And in the pattern use v6t2 ...
  
   arm_arch_thumb2 implies that this is at the architecture level of v6t2.
   Therefore TARGET_ARM && arm_arch_thumb2 implies ARM state.
  
  
   Hi Ramana,
Thank you for your suggestions.  I rebased the patch on the
   latest trunk
  and updated it accordingly.
As this patch will not work for architectures older than
   armv6t2,  I also
  prefer Thomas's patch to fix for them.
I am currently performing test for this patch.  Assuming no
   issues pops
  up, OK for the trunk?
And is it necessary to backport this patch to the 4.8 & 4.9 branches?
  
 
  I've applied the following as obvious after Kugan mentioned on IRC
  this morning noticing a movwne r0, #-32768. Obviously this won't be
  accepted as is by the assembler and we should be using the %L character.
 Applied to trunk as obvious.
 
  Felix, How did you test this patch ?
 
  regards
  Ramana
 
 
 I regtested the patch for arm-eabi-gcc/g++  big-endian with qemu.  The test
 result is OK.  That's strange ...
 
 This issue can be reproduced by the following testcase.  Thanks for fixing it.
 
 #include <stdio.h>
 unsigned short v = 0x5678;
 int i;
 int j = 0;
 int *ptr = &j;
 int func()
 {
     for (i = 0; i < 1; ++i)
     {
         *ptr = -1;
         v = 0xF234;
     }
     return v;
 }


And the architecture level is set to armv7-a by default when testing. 


[PATCH] Fix PR63962

2014-11-20 Thread Richard Biener

When moving tree-ssa-forwprop.c:associate_plusminus to match.pd patterns
a single-use restriction escaped my eye.  It is indeed important
for non-simplifications like
  (ptr p+ off1) p+ off2 -> ptr p+ (off1 + off2)
to not un-CSE.  The association is most useful to enable later
re-association as reassoc isn't able to associate pointer-plus
chains but only unsigned integer arithmetic.
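
The un-CSE concern can be sketched at the source level. The two C functions below are an illustration only, with hypothetical names (`saved`, `keep_shared`, `reassociated`); they are not the new testcase. Both return the same address, but the second form still has to compute q + i for the store and additionally needs the integer sum i + j.

```c
#include <stddef.h>

/* Hypothetical global mirroring the situation described above: the
   intermediate pointer is stored, so it has a second use.  */
int *saved;

/* Form before reassociation: two pointer additions, the first one is
   shared between the store and the return value.  */
int *
keep_shared (int *q, ptrdiff_t i, ptrdiff_t j)
{
  int *p = q + i;   /* used twice */
  saved = p;
  return p + j;
}

/* Form after reassociating (q + i) + j into q + (i + j): q + i is still
   needed for the store, so there is now an extra integer addition.  */
int *
reassociated (int *q, ptrdiff_t i, ptrdiff_t j)
{
  saved = q + i;
  return q + (i + j);
}
```

When the inner pointer-plus has a single use, the reassociated form costs nothing extra, which is why the guard only rejects the multi-use case.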

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2014-11-20  Richard Biener  rguent...@suse.de

PR middle-end/63962
* match.pd ((p +p off1) +p off2 -> (p +p (off1 + off2))):
Guard with single-use operand 0.

* gcc.dg/tree-ssa/forwprop-30.c: New testcase.

Index: gcc/match.pd
===
--- gcc/match.pd(revision 217767)
+++ gcc/match.pd(working copy)
@@ -370,8 +370,9 @@ (define_operator_list inverted_tcc_compa
 
 /* Associate (p +p off1) +p off2 as (p +p (off1 + off2)).  */
 (simplify
-  (pointer_plus (pointer_plus @0 @1) @3)
-  (pointer_plus @0 (plus @1 @3)))
+  (pointer_plus (pointer_plus@2 @0 @1) @3)
+  (if (TREE_CODE (@2) != SSA_NAME || has_single_use (@2))
+   (pointer_plus @0 (plus @1 @3))))
 
 /* Pattern match
  tem1 = (long) ptr1;
Index: gcc/testsuite/gcc.dg/tree-ssa/forwprop-30.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/forwprop-30.c (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/forwprop-30.c (working copy)
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-optimized" } */
+
+int *p;
+int *foo (int *q, int i, int j)
+{
+  p = q + i;
+  return p + j;
+}
+
+/* We shouldn't associate (q + i) + j to q + (i + j) here as we
+   need q + i as well.  */
+
+/* { dg-final { scan-tree-dump-times "\\+" 2 "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */


Re: [patch v2, aarch64] additional bics patterns

2014-11-20 Thread Richard Earnshaw
On 19/11/14 18:22, Sandra Loosemore wrote:
 On 11/13/2014 10:47 AM, Andrew Pinski wrote:
 On Thu, Nov 13, 2014 at 9:42 AM, Sandra Loosemore
 san...@codesourcery.com wrote:
 On 11/13/2014 10:27 AM, Richard Earnshaw wrote:

 On 13/11/14 17:05, Ramana Radhakrishnan wrote:

 On Thu, Nov 13, 2014 at 4:55 PM, Sandra Loosemore
 san...@codesourcery.com wrote:

 This patch to the AArch64 back end adds a couple of additional bics
 patterns
 to match code of the form

 if ((x & y) == x) ...;

 This is testing whether the bits set in x are a subset of the bits set
 in y;
 or, that no bits in x are set that are not set in y.  So, it is
 equivalent
 to

 if ((x & ~y) == 0) ...;

 Presently this generates code like
 and x21, x21, x20
 cmp x21, x20
 b.eq c0 <main+0xc0>

 and this patch allows it to be written more concisely as:
 bics x21, x20, x21
 b.eq c0 <main+0xc0>

 Since the bics instruction sets the condition codes itself, no explicit
 comparison is required and the result of the bics computation can be
 discarded.

 Regression-tested on aarch64-linux-gnu.  OK to commit?


 Is this not a duplicate of
 https://gcc.gnu.org/ml/gcc-patches/2014-11/msg00943.html ?


 I don't think so.  However, I think it is something that should be
 caught in generic simplification code

 ie map ((a & b) == b) <==> ((~a & b) == 0), etc

 Bit-clear operations are not that uncommon.  Furthermore, A may be a
 constant.


 Alex posted his patch when I already had Chris's in my regression test
 queue, but I've just confirmed that it does not fix the test case I
 included.

 I already thought a little about making this a generic simplification, but
 it seemed to me like it was only useful on targets that have a bit-clear
 instruction that happens to set condition codes, and that it would pessimize
 code on targets that don't have a bit-clear instruction at all (by inserting
 the extra complement operation).  So to me it seemed reasonable to do it in
 the back end.

 But can't you do this in simplify-rtx.c and allow for the cost model
 to do the correct thing?
 
 OK, here is a revised patch to apply the identity there.  This version 
 depends on Alex's aarch64 BICS patch for the included test case to pass, 
 though.
 
 In addition to the aarch64 testing, I bootstrapped and regression-tested 
 the target-independent part of the patch on x86_64-linux-gnu.  Is this 
 OK?  Should I hold off on committing it until Alex's patch is in?
 
 -Sandra
 
 
 2014-11-19  Sandra Loosemore  san...@codesourcery.com
 
   gcc/
   * simplify-rtx.c (simplify_relational_operation_1): Handle
   simplification identities for BICS patterns.
 
   gcc/testsuite/
   * gcc.target/aarch64/bics_4.c: New.
 

Looks sensible to me.  Eric, are you happy?

R.
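
The identity behind the simplification is easy to check at the C level. The sketch below is an illustration with hypothetical function names, not part of the patch or its testcase: it compares the original subset test with the bit-clear form exhaustively over all pairs of 8-bit values.

```c
#include <stdbool.h>
#include <stdint.h>

/* The original form of the test: are all bits of X also set in Y?  */
bool
subset_via_and (uint32_t x, uint32_t y)
{
  return (x & y) == x;
}

/* The rewritten form that maps onto a bit-clear-and-set-flags insn.  */
bool
subset_via_bic (uint32_t x, uint32_t y)
{
  return (x & ~y) == 0;
}

/* Exhaustively compare both forms over all pairs of 8-bit values;
   returns true if they always agree.  */
bool
forms_agree_on_bytes (void)
{
  for (uint32_t x = 0; x < 256; x++)
    for (uint32_t y = 0; y < 256; y++)
      if (subset_via_and (x, y) != subset_via_bic (x, y))
        return false;
  return true;
}
```

Because the identity is bitwise, agreement on 8-bit inputs carries over to wider modes; whether the rewritten form is actually cheaper is left to the target's cost model, as discussed above.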

 
 bics2.patch
 
 
 Index: gcc/simplify-rtx.c
 ===
 --- gcc/simplify-rtx.c(revision 217322)
 +++ gcc/simplify-rtx.c(working copy)
 @@ -4551,6 +4551,32 @@ simplify_relational_operation_1 (enum rt
   simplify_gen_binary (XOR, cmp_mode,
XEXP (op0, 1), op1));
  
 +  /* (eq/ne (and x y) x) simplifies to (eq/ne (and (not y) x) 0), which
 + can be implemented with a BICS instruction on some targets, or
 + constant-folded if y is a constant.  */
 +  if ((code == EQ || code == NE)
 +      && op0code == AND
 +      && rtx_equal_p (XEXP (op0, 0), op1)
 +      && !side_effects_p (op1))
 +    {
 +      rtx not_y = simplify_gen_unary (NOT, cmp_mode, XEXP (op0, 1), cmp_mode);
 +      rtx lhs = simplify_gen_binary (AND, cmp_mode, not_y, XEXP (op0, 0));
 +
 +      return simplify_gen_relational (code, mode, cmp_mode, lhs, const0_rtx);
 +    }
 +
 +  /* Likewise for (eq/ne (and x y) y).  */
 +  if ((code == EQ || code == NE)
 +      && op0code == AND
 +      && rtx_equal_p (XEXP (op0, 1), op1)
 +      && !side_effects_p (op1))
 +    {
 +      rtx not_x = simplify_gen_unary (NOT, cmp_mode, XEXP (op0, 0), cmp_mode);
 +      rtx lhs = simplify_gen_binary (AND, cmp_mode, not_x, XEXP (op0, 1));
 +
 +      return simplify_gen_relational (code, mode, cmp_mode, lhs, const0_rtx);
 +    }
 +
/* (eq/ne (bswap x) C1) simplifies to (eq/ne x C2) with C2 swapped.  */
if ((code == EQ || code == NE)
 && GET_CODE (op0) == BSWAP
 Index: gcc/testsuite/gcc.target/aarch64/bics_4.c
 ===
 --- gcc/testsuite/gcc.target/aarch64/bics_4.c (revision 0)
 +++ gcc/testsuite/gcc.target/aarch64/bics_4.c (revision 0)
 @@ -0,0 +1,87 @@
 +/* { dg-do run } */
 +/* { dg-options "-O2 --save-temps -fno-inline" } */
 +
 +extern void abort (void);
 +
 +int
 +bics_si_test1 (int a, int b, int c)
 +{
 +  if ((a & b) == a)
 +return a;
 +  else
 +return c;
 +}
 +
 +int
 +bics_si_test2 (int a, int b, int c)
 +{
 +  if ((a & b) == b)
 +return b;
 +  else
 +return c;
 +}
 

[PATCH, ifcvt] Fix PR63917

2014-11-20 Thread Zhenqiang Chen
Hi,

r217646 enhances ifcvt to handle the cbranchcc4 instruction, but ifcvt does
not strictly check dependences before moving instructions in front of the
IF.  As a result, some instructions that clobber CC are inserted before the
cbranchcc4 instruction.

For the case in the patch, ifcvt transforms the code from

   5: r87:SI=r117:SI
   22: pc={(flags:CCGOC>=0)?L26:pc}
   25: {r87:SI=-r117:SI;clobber flags:CC;}

to
   5: r87:SI=r117:SI
  136: {r145:SI=-r117:SI;clobber flags:CC;} // CC is clobbered
  137: r87:SI={(flags:CCGOC<0)?r145:SI:r117:SI}

The patch skips moving insns, which clobber CC, before cbranchcc4.
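
The shape of the check can be modelled outside GCC. The following C sketch uses a toy expression type with hypothetical names (`toy_rtx`, `toy_clobbers_cc`, `toy_can_hoist`); it is not GCC's RTL API and only mirrors the idea of looking for a CC clobber, including inside a PARALLEL, before allowing an insn to move in front of a CC-using branch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy expression codes; these are NOT GCC's RTX codes, just a model.  */
enum toy_code { TOY_SET, TOY_CLOBBER, TOY_PARALLEL };

/* Toy node: a SET or CLOBBER names one destination; a PARALLEL holds
   N_ELTS children.  DEST_IS_CC marks a condition-code destination.  */
struct toy_rtx
{
  enum toy_code code;
  bool dest_is_cc;
  size_t n_elts;
  const struct toy_rtx *const *elts;
};

/* Return true if X clobbers the condition-code register, looking inside
   PARALLELs recursively.  */
bool
toy_clobbers_cc (const struct toy_rtx *x)
{
  if (x->code == TOY_CLOBBER && x->dest_is_cc)
    return true;
  if (x->code == TOY_PARALLEL)
    for (size_t i = 0; i < x->n_elts; i++)
      if (toy_clobbers_cc (x->elts[i]))
        return true;
  return false;
}

/* Decide whether an insn may be hoisted in front of a branch: refuse
   when the branch condition reads CC and the insn clobbers it.  */
bool
toy_can_hoist (bool cond_uses_cc, const struct toy_rtx *insn)
{
  return !(cond_uses_cc && toy_clobbers_cc (insn));
}
```

In this model, insn 25 from the example above corresponds to a PARALLEL containing a SET and a CC CLOBBER, so it would be rejected when the branch condition reads CC.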

Bootstrapped with no make check regressions on x86-64 and i686.
All the failed cases in PR63917 PASS.

OK for trunk?
Thanks!
-Zhenqiang

ChangeLog:
2014-11-20  Zhenqiang Chen  zhenqiang.c...@arm.com

PR rtl-optimization/63917
* ifcvt.c (clobber_cc_p, use_cc_p): New functions.
(noce_process_if_block, check_cond_move_block): Check CC references.

testsuite/ChangeLog:
2014-11-20  Zhenqiang Chen  zhenqiang.c...@arm.com

* gcc.target/i386/floatsitf.c: New test.

diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index 21f08c2..760eeb6 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -2528,6 +2528,34 @@ noce_can_store_speculate_p (basic_block top_bb,
const_rtx mem)
   return false;
 }
 
+/* Check X clobber CC reg or not.  */
+
+static bool
+clobber_cc_p (rtx x)
+{
+  RTX_CODE code = GET_CODE (x);
+  int i;
+
+  if (code == CLOBBER
+      && REG_P (XEXP (x, 0))
+      && (GET_MODE_CLASS (GET_MODE (XEXP (x, 0))) == MODE_CC))
+    return TRUE;
+  else if (code == PARALLEL)
+    for (i = 0; i < XVECLEN (x, 0); i++)
+      if (clobber_cc_p (XVECEXP (x, 0, i)))
+        return TRUE;
+  return FALSE;
+}
+
+/* Check CC reg is used in COND or not.  */
+
+static bool
+use_cc_p (rtx cond)
+{
+  return (HAVE_cbranchcc4)
+         && (GET_MODE_CLASS (GET_MODE (XEXP (cond, 0))) == MODE_CC);
+}
+
 /* Given a simple IF-THEN-JOIN or IF-THEN-ELSE-JOIN block, attempt to convert
    it without using conditional execution.  Return TRUE if we were successful
    at converting the block.  */
@@ -2655,6 +2683,12 @@ noce_process_if_block (struct noce_if_info *if_info)
   if_info->a = a;
   if_info->b = b;
 
+  /* Skip it if the instruction to be moved might clobber CC.  */
+  if (use_cc_p (if_info->cond)
+      && (clobber_cc_p (PATTERN (insn_a))
+          || (insn_b && clobber_cc_p (PATTERN (insn_b)))))
+    return FALSE;
+
   /* Try optimizations in some approximation of a useful order.  */
   /* ??? Should first look to see if X is live incoming at all.  If it
  isn't, we don't need anything but an unconditional set.  */
@@ -2868,6 +2902,10 @@ check_cond_move_block (basic_block bb,
          && modified_between_p (src, insn, NEXT_INSN (BB_END (bb))))
        return FALSE;
 
+      /* Skip it if the instruction to be moved might clobber CC.  */
+      if (use_cc_p (cond) && clobber_cc_p (PATTERN (insn)))
+        return FALSE;
+
       vals->put (dest, src);
 
       regs->safe_push (dest);

diff --git a/gcc/testsuite/gcc.target/i386/floatsitf.c b/gcc/testsuite/gcc.target/i386/floatsitf.c
new file mode 100644
index 000..6b249cc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/floatsitf.c
@@ -0,0 +1,48 @@
+/* { dg-do compile { target { { i?86-*-* x86_64-*-* } && ilp32 } } } */
+/* { dg-options "-O2 -fdump-rtl-ce2" } */
+
+typedef int __gcc_CMPtype __attribute__ ((mode (__libgcc_cmp_return__)));
+void __sfp_handle_exceptions (int);
+
+typedef int QItype __attribute__ ((mode (QI)));
+typedef int SItype __attribute__ ((mode (SI)));
+typedef int DItype __attribute__ ((mode (DI)));
+typedef unsigned int UQItype __attribute__ ((mode (QI)));
+typedef unsigned int USItype __attribute__ ((mode (SI)));
+typedef unsigned int UDItype __attribute__ ((mode (DI)));
+
+typedef unsigned int UHWtype __attribute__ ((mode (HI)));
+extern const UQItype __clz_tab[256] ;
+
+extern void abort (void);
+typedef float TFtype __attribute__ ((mode (TF)));
+
+union _FP_UNION_Q
+{
+  TFtype flt;
+  struct
+  {
+    unsigned long frac0 : 32;
+    unsigned long frac1 : 32;
+    unsigned long frac2 : 32;
+    unsigned long frac3 : 113 - (((unsigned int) 1 << (113 -1) % 32) != 0)-(32 * 3);
+    unsigned exp : 15;
+    unsigned sign : 1;
+
+  } bits __attribute__ ((packed));
+};
+
+TFtype
+__floatsitf (SItype i)
+{
+  int A_c __attribute__ ((unused)); int A_s __attribute__ ((unused)); int A_e __attribute__ ((unused)); unsigned int A_f[4];
+  TFtype a;
+
+  do { if ((i)) { USItype _FP_FROM_INT_ur; if ((A_s = (((i))  0))) ((i)) =
-(USItype) ((i)); _FP_FROM_INT_ur = (USItype) ((i)); (void) (8 * (int)
sizeof (SItype = 32) ? ({ int _FP_FROM_INT_lz; do { if (sizeof
(unsigned int) == sizeof (unsigned int)) (_FP_FROM_INT_lz) = __builtin_clz
((unsigned int) _FP_FROM_INT_ur); else if (sizeof (unsigned int) == sizeof
(unsigned long)) (_FP_FROM_INT_lz) = __builtin_clzl ((unsigned int)
_FP_FROM_INT_ur); else if (sizeof (unsigned int) == sizeof (unsigned long
long)) (_FP_FROM_INT_lz) = 

[Ada] Missing interface conversion in access type

2014-11-20 Thread Arnaud Charlet
The compiler silently skips the generation of code to perform the
conversion of an access type whose designated type is a class-wide
interface type, thus causing unexpected problems at runtime in
dispatching calls to the target object. After this patch the
following test compiles and executes without errors:

package Lists is
   type List is interface;

   function Element (Self : access List) return Natural is abstract;
end Lists;

limited with Lists;
package Types is
   type List_Access is access all Lists.List'Class;
end Types;

with Types;
with Lists;
with Ada.Finalization;

package My_Lists is
   type My_List is new Ada.Finalization.Controlled
 and Lists.List
   with null record;

   type My_List_Access is access all My_List'Class;

   overriding function Element (Self : access My_List) return Natural
 is (2);
end My_Lists;

with My_Lists;
with Types;
procedure Test is
   X : My_Lists.My_List_Access := new My_Lists.My_List;
   Y : Types.List_Access := Types.List_Access (X);  -- Test
begin
   if Y.Element /= 2 then
  raise Program_Error;
   end if;
end Test;

Command: gnatmake main.adb; ./main
No output

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-11-20  Javier Miranda  mira...@adacore.com

* exp_ch4.adb (Expand_N_Type_Conversion): Add missing implicit
conversion to force the displacement of the pointer to the object
to reference the secondary dispatch table.
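
For readers unfamiliar with why a pointer displacement is needed at all, here is a rough C analogy. It is only a sketch: the toy_* names and the object layout are hypothetical and are not GNAT's actual representation. An object that implements an interface carries a second dispatch-table pointer at a non-zero offset, so converting an access-to-object into an access-to-interface must move the pointer to that slot instead of reusing the object address.

```c
#include <stddef.h>

struct toy_list_iface;

/* Dispatch table of the interface view (hypothetical layout).  */
struct toy_list_vtbl
{
  int (*element) (struct toy_list_iface *self);
};

/* The interface view: just a pointer to the secondary table.  */
struct toy_list_iface
{
  const struct toy_list_vtbl *vptr;
};

/* A concrete type: parent state first, then the interface slot.  */
struct toy_my_list
{
  long controlled_state;
  struct toy_list_iface as_list;   /* Lives at a non-zero offset.  */
  int element_value;
};

/* Implementation reached through the interface view: it shifts the
   pointer back to the start of the enclosing object.  */
static int
toy_my_list_element (struct toy_list_iface *self)
{
  struct toy_my_list *obj
    = (struct toy_my_list *) ((char *) self
                              - offsetof (struct toy_my_list, as_list));
  return obj->element_value;
}

static const struct toy_list_vtbl toy_my_list_vtbl = { toy_my_list_element };

/* The conversion: displace the pointer to the interface slot.  Reusing
   the object address unchanged would make dispatching calls read the
   wrong memory.  */
static struct toy_list_iface *
toy_to_list_access (struct toy_my_list *obj)
{
  return &obj->as_list;
}
```

In this analogy, skipping toy_to_list_access and casting the object pointer directly corresponds to the missing conversion that the patch adds.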

Index: exp_ch4.adb
===
--- exp_ch4.adb (revision 217828)
+++ exp_ch4.adb (working copy)
@@ -10622,7 +10622,9 @@
 
 --  Ada 2005 (AI-251): Handle interface type conversion
 
-if Is_Interface (Actual_Op_Typ) then
+if Is_Interface (Actual_Op_Typ)
+  or else Is_Interface (Actual_Targ_Typ)
+then
Expand_Interface_Conversion (N);
goto Done;
 end if;


[Ada] Lift limitation of inter-unit inlining with generic packages

2014-11-20 Thread Arnaud Charlet
This change lifts the arbitrary limitation on the number of iterations that
can be executed between loading of the inlined bodies and instantiation of
the generic bodies of external units when inter-unit inlining is activated.
It was previously limited to 1, but this may not be sufficient in some cases,
which can result in pragma Inline_Always not being honored.
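
The control flow can be pictured with a small C model. This is a sketch under assumptions, not the code in inline.adb: the toy_* names are hypothetical, and each "body" is reduced to a depth counter, where analyzing a body of depth d > 0 queues one instantiation and performing that instantiation loads one more body of depth d - 1.

```c
#include <stddef.h>

#define TOY_MAX_BODIES 16

struct toy_state
{
  int bodies[TOY_MAX_BODIES];   /* Depth of each loaded body.  */
  size_t last;                  /* Number of bodies loaded so far.  */
  int pending[TOY_MAX_BODIES];  /* Queued generic instantiations.  */
  size_t n_pending;
};

/* Perform every queued instantiation; each one loads a new body.  */
static void
toy_instantiate_bodies (struct toy_state *s)
{
  for (size_t i = 0; i < s->n_pending; i++)
    if (s->last < TOY_MAX_BODIES)
      s->bodies[s->last++] = s->pending[i];
  s->n_pending = 0;
}

/* Analyze bodies until neither step produces new work.  Returns the
   number of bodies analyzed.  */
static size_t
toy_analyze_inlined_bodies (struct toy_state *s)
{
  size_t j = 0;
  while (j < s->last)
    {
      /* Analyzing body J may queue an instantiation.  */
      if (s->bodies[j] > 0 && s->n_pending < TOY_MAX_BODIES)
        s->pending[s->n_pending++] = s->bodies[j] - 1;
      j++;

      /* Once J runs past the last loaded body, instantiate; if that
         loads more bodies, the loop condition holds again.  */
      if (j >= s->last)
        toy_instantiate_bodies (s);
    }
  return j;
}
```

With a single extra pass after the loop, a chain deeper than one level would be left unprocessed; moving the instantiation inside the loop lets it run to a fixed point.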

The following code must compile quietly with -O -gnatn:

with Q; use Q;

package P is

   function F (Cal : Calendar) return Boolean;

end P;
package body P is

   function F (Cal : Calendar) return Boolean is
   begin
  return Pred (Cal);
   end;

end P;

with R; use R;

package Q is

   type Calendar is new Object_Ref;

   type Root_Calendar is new Root_Object with record
  B : Boolean;
   end record;

   type Root_Calendar_Ptr is access all Root_Calendar'Class;

   function Pred (Cal : Calendar) return Boolean;
   pragma Inline (Pred);

end Q;
package body Q is

   function Get_Calendar is new Get_Object (Root_Calendar, Root_Calendar_Ptr);
   pragma Inline (Get_Calendar);

   function Pred (Cal : Calendar) return Boolean is
  Cal_Object : constant Root_Calendar_Ptr
 := Get_Calendar (Object_Ref (Cal));
   begin
  return Cal_Object.B;
   end;
end Q;

with Ada.Finalization;

package R is

   type Root_Object is new Ada.Finalization.Controlled with record
  Reference_Count : Natural;
   end record;

   type Object_Ref is private;

   type Root_Object_Ptr is access all Root_Object'Class;

   generic
      type Object (<>) is abstract new Root_Object with private;
  type Object_Ptr is access all Object'Class;
   function Get_Object (Ref : in Object_Ref) return Object_Ptr;

private

   type Object_Ref is new Ada.Finalization.Controlled with record
  Ptr : Root_Object_Ptr;
   end record;

end R;
package body R is

   function Get_Object (Ref : in Object_Ref) return Object_Ptr is
   begin
  return Object_Ptr (Ref.Ptr);
   end Get_Object;

end R;

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-11-20  Eric Botcazou  ebotca...@adacore.com

* inline.adb (Analyze_Inlined_Bodies): Iterate between loading
of the inlined bodies and instantiation of the generic bodies
until no more bodies need to be loaded.

Index: inline.adb
===
--- inline.adb  (revision 217828)
+++ inline.adb  (working copy)
@@ -774,16 +774,21 @@
 end if;
 
 J := J + 1;
- end loop;
 
- --  The analysis of required bodies may have produced additional
- --  generic instantiations. To obtain further inlining, we perform
- --  another round of generic body instantiations. Establishing a
- --  fully recursive loop between inlining and generic instantiations
- --  is unlikely to yield more than this one additional pass.
+if J > Inlined_Bodies.Last then
 
- Instantiate_Bodies;
+   --  The analysis of required bodies may have produced additional
+   --  generic instantiations. To obtain further inlining, we need
+   --  to perform another round of generic body instantiations.
 
+   Instantiate_Bodies;
+
+   --  Symmetrically, the instantiation of required generic bodies
+   --  may have caused additional bodies to be inlined. To obtain
+   --  further inlining, we keep looping over the inlined bodies.
+end if;
+ end loop;
+
  --  The list of inlined subprograms is an overestimate, because it
  --  includes inlined functions called from functions that are compiled
  --  as part of an inlined package, but are not themselves called. An


Re: [PATCH 2/2, AArch64, v2] Pipeline model for APM XGene-1.

2014-11-20 Thread Dr . Philipp Tomsich
Kyrill,

 I don't mind it being in config/arm if you plan to wire it up later, good to 
 know.
 Another comment inline….

I’ll clean up the missing xgene1_ and the mistyped xgene_ prefix and resubmit.

 +(define_insn_reservation "div" 2
 +  (and (eq_attr "tune" "xgene1")
 +       (eq_attr "type" "sdiv,udiv"))
 +  "xgene1_decode1op,xgene_divide")
 
 The dangerous part was the reservation duration (the xgene_divide*large 
 number).
 The latency number (2 in this version, 66 in the previous) is not harmful to 
 the automaton size
 and can be as high as needed (if this operation is high latency)

It doesn’t really matter for any workload we’ve encountered, as the hardware is 
better at dealing with ‘div’-latencies than the scheduler (especially, as ‘div’ 
is variable latency and any guess we have will be wrong… we’ll likely add a
scheduling hook function in the future).
The more important thing is to keep the cost of divides high enough in the 
cost-model.

In other words: 66 would be the worst case and will normally not be correct 
anyway. Furthermore, it’s rather implausible that we find 264 instructions 
(for this worst-case scenario) to fill the scheduling bubble between the 
div-insn and its result usage.
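
As a back-of-the-envelope check of the figure above, here is a sketch that assumes the 264 comes from multiplying the 66-cycle worst case by four instructions per cycle (the pipeline model declares four decode outputs, xgene1_decode_out0 to xgene1_decode_out3). The helper name is hypothetical.

```c
/* Assumed issue width: one instruction per decode output per cycle.  */
enum { TOY_DECODE_SLOTS = 4 };

/* Number of independent instructions needed to fill the bubble behind
   an operation with the given latency.  */
static unsigned
toy_insns_to_hide_latency (unsigned latency_cycles)
{
  return latency_cycles * TOY_DECODE_SLOTS;
}
```

Under the same assumption, the latency of 2 used in this version of the patch corresponds to only 8 instructions.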

Best,
Philipp.

Re: [PATCH 2/2, AArch64, v2] Pipeline model for APM XGene-1.

2014-11-20 Thread Kyrill Tkachov

Hi Philipp,

On 20/11/14 10:47, Dr. Philipp Tomsich wrote:

Kyrill,


I don't mind it being in config/arm if you plan to wire it up later, good to 
know.
Another comment inline….

I’ll clean up the missing xgene1_ and the mistyped xgene_ prefix and resubmit.


+(define_insn_reservation "div" 2
+  (and (eq_attr "tune" "xgene1")
+       (eq_attr "type" "sdiv,udiv"))
+  "xgene1_decode1op,xgene_divide")

The dangerous part was the reservation duration (the xgene_divide*large 
number).
The latency number (2 in this version, 66 in the previous) is not harmful to 
the automaton size
and can be as high as needed (if this operation is high latency)

It doesn’t really matter for any workload we’ve encountered, as the hardware is 
better at dealing with ‘div’-latencies than the scheduler (especially, as ‘div’ 
is variable latency and any guess we have will be wrong… we’ll likely add a
scheduling hook function in the future).
The more important thing is to keep the cost of divides high enough in the 
cost-model.

In other words: 66 would be the worst case and will normally not be correct 
anyway. Furthermore, it’s rather implausible that we find 264 instructions 
(for this worst-case scenario) to fill the scheduling bubble between the 
div-insn and its result usage.


Ok, makes sense. I just thought that 2 is a bit too low but if your 
benchmarking showed it to be reasonable I won't complain ;)


Kyrill



Best,
Philipp.






[Ada] Fix costly call to Following_Address_Clause

2014-11-20 Thread Arnaud Charlet
This change makes it so that Following_Address_Clause is invoked only if this
is really necessary from Analyze_Object_Declaration.  This saves about 1% of
the compilation time at low optimization levels.  No functional changes.
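
The saving comes purely from short-circuit evaluation order. Ada's "and then" behaves like C's &&, so the idea can be sketched in C. The predicates below are hypothetical stand-ins: the real tests are Nkind (E) = N_Aggregate and Present (Following_Address_Clause (N)), and the counter exists only to make the cost visible.

```c
#include <stdbool.h>

/* Counts how often the expensive predicate runs.  */
static unsigned toy_expensive_calls;

/* Stand-in for the costly scan of the declarative part.  */
static bool
toy_following_address_clause (int node)
{
  toy_expensive_calls++;
  return node % 2 == 0;
}

/* Stand-in for the cheap node-kind test.  */
static bool
toy_is_aggregate (int node)
{
  return node % 10 == 0;
}

/* Old order: the expensive test runs for every node.  */
static bool
toy_condition_old (int node)
{
  return toy_following_address_clause (node) && toy_is_aggregate (node);
}

/* New order: the expensive test runs only when the cheap one passes.  */
static bool
toy_condition_new (int node)
{
  return toy_is_aggregate (node) && toy_following_address_clause (node);
}
```

Both orderings return the same result for every node; only the number of expensive calls differs.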

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-11-20  Eric Botcazou  ebotca...@adacore.com

* sem_ch3.adb (Analyze_Object_Declaration): Swap a couple of
tests in a condition so Following_Address_Clause is invoked
only if need be.
* exp_util.ads (Following_Address_Clause): Add small note.

Index: sem_ch3.adb
===
--- sem_ch3.adb (revision 217828)
+++ sem_ch3.adb (working copy)
@@ -3648,8 +3648,13 @@
 
  if Comes_From_Source (N)
and then Expander_Active
+   and then Nkind (E) = N_Aggregate
+
+   --  Note the importance of doing this the following test after the
+   --  N_Aggregate test to avoid inefficiencies from too many calls to
+   --  the function Following_Address_Clause which can be expensive.
+
and then Present (Following_Address_Clause (N))
-   and then Nkind (E) = N_Aggregate
  then
 Set_Etype (E, T);
 
Index: exp_util.ads
===
--- exp_util.ads(revision 217828)
+++ exp_util.ads(working copy)
@@ -507,6 +507,10 @@
--  current declarative part to look for an address clause for the object
--  being declared, and returns the clause if one is found, returns
--  Empty otherwise.
+   --
+   --  Note: this function can be costly and must be invoked with special care.
+   --  Possibly we could introduce a flag at parse time indicating the presence
+   --  of an address clause to speed this up???
 
procedure Force_Evaluation
  (Exp  : Node_Id;


[Ada] Handling of function calls to predefined operators in ASIS

2014-11-20 Thread Arnaud Charlet
An operator that is called in functional notation is rewritten as an operator
so that its operands can be properly resolved. ASIS needs the semantic info
to be available on the original node, so in ASIS mode the resolved operands
are linked back to the original call. This patch takes into account that the
call may have had named associations, using the standard operator arguments
Left and Right.

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-11-20  Ed Schonberg  schonb...@adacore.com

* sem_res.adb (Make_Call_Into_Operator): In ASIS mode, propagate
back the resolved operands to the original call node, taking
into account that the original call may have named associations.

Index: sem_res.adb
===
--- sem_res.adb (revision 217828)
+++ sem_res.adb (working copy)
@@ -1793,16 +1793,62 @@
 and then Nkind (N) in N_Op
 and then Nkind (Original_Node (N)) = N_Function_Call
   then
- if Is_Binary then
-Rewrite (First (Parameter_Associations (Original_Node (N))),
-   Relocate_Node (Left_Opnd (N)));
-Rewrite (Next (First (Parameter_Associations (Original_Node (N)))),
-   Relocate_Node (Right_Opnd (N)));
- else
-Rewrite (First (Parameter_Associations (Original_Node (N))),
-   Relocate_Node (Right_Opnd (N)));
- end if;
+ declare
+L : constant Node_Id := Left_Opnd  (N);
+R : constant Node_Id := Right_Opnd (N);
 
+Old_First : constant Node_Id :=
+  First (Parameter_Associations (Original_Node (N)));
+Old_Sec   : Node_Id;
+
+ begin
+if Is_Binary then
+   Old_Sec   := Next (Old_First);
+
+   --  If the original call has named associations, replace the
+   --  explicit actual parameter in the association with the proper
+   --  resolved operand.
+
+   if Nkind (Old_First) = N_Parameter_Association then
+  if Chars (Selector_Name (Old_First)) =
+ Chars (First_Entity (Op_Id))
+  then
+ Rewrite (Explicit_Actual_Parameter (Old_First),
+   Relocate_Node (L));
+  else
+ Rewrite (Explicit_Actual_Parameter (Old_First),
+   Relocate_Node (R));
+  end if;
+
+   else
+  Rewrite (Old_First, Relocate_Node (L));
+   end if;
+
+   if Nkind (Old_Sec) = N_Parameter_Association then
+  if Chars (Selector_Name (Old_Sec))  =
+ Chars (First_Entity (Op_Id))
+  then
+ Rewrite (Explicit_Actual_Parameter (Old_Sec),
+   Relocate_Node (L));
+  else
+ Rewrite (Explicit_Actual_Parameter (Old_Sec),
+   Relocate_Node (R));
+  end if;
+
+   else
+  Rewrite (Old_Sec, Relocate_Node (R));
+   end if;
+
+else
+   if Nkind (Old_First) = N_Parameter_Association then
+  Rewrite (Explicit_Actual_Parameter (Old_First),
+Relocate_Node (R));
+   else
+  Rewrite (Old_First, Relocate_Node (R));
+   end if;
+end if;
+ end;
+
  Set_Parent (Original_Node (N), Parent (N));
   end if;
end Make_Call_Into_Operator;


[Ada] Improper assignment on indexing operation with implicit dereference

2014-11-20 Thread Arnaud Charlet
If the left-hand side of an assignment is an Ada 2012 generalized indexing
with an implicit dereference, the compiler must verify that the type of
the access discriminant that provides the implicit dereference is not an
access_to_constant.

Compiling ada_test.adb must yield:

   ada_test.adb:24:25: left hand side of assignment must be a variable
   ada_test.adb:25:04: left hand side of assignment must be a variable

---
with Ada.Text_IO; use Ada.Text_IO;
with Ada.Integer_Text_IO; use Ada.Integer_Text_IO;

procedure Ada_Test is

   type Obj is record
  A : aliased Integer;
   end record;

   type Obj_Access is access all Obj;

   type Accessor (Data : access constant Integer) is null record with
     Implicit_Dereference => Data;

   function Get_Int (This : Obj_Access) return Accessor is
   begin
      return Accessor'(Data => This.A'Access);
   end Get_Int;

   X : aliased Obj := (A => 11);
   X_Ptr : Obj_Access := X'Access;

begin
   Get_Int (X_Ptr).Data.all := 33;   -- Error
   Get_Int (X_Ptr) := 33;-- Error
   Put (X.A);-- Should never execute..
   New_Line;
end Ada_Test;

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-11-20  Ed Schonberg  schonb...@adacore.com

* sem_util.adb (Is_Variable): For an Ada 2012 implicit
dereference introduced for an indexing operation, check that the
type of the corresponding access discriminant is not an access
to constant.

Index: sem_util.adb
===
--- sem_util.adb(revision 217829)
+++ sem_util.adb(working copy)
@@ -12806,12 +12806,14 @@
  Is_Variable_Prefix (Original_Node (Prefix (N)));
 
   --  in Ada 2012, the dereference may have been added for a type with
-  --  a declared implicit dereference aspect.
+  --  a declared implicit dereference aspect. Check that it is not an
+  --  access to constant.
 
   elsif Nkind (N) = N_Explicit_Dereference
 and then Present (Etype (Orig_Node))
        and then Ada_Version >= Ada_2012
 and then Has_Implicit_Dereference (Etype (Orig_Node))
+and then not Is_Access_Constant (Etype (Prefix (N)))
   then
  return True;
 


[Ada] Rework win32_wait to behave more like the UNIX waitpid()

2014-11-20 Thread Arnaud Charlet
The following changes are important:

- It is possible to have multiple tasks waiting for a child process
  to terminate.

- When a child terminates, a single wait call will receive the
  corresponding process id.

- A call to wait will handle new incoming child processes.

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-11-20  Pascal Obry  o...@adacore.com

* initialize.c (ProcListCS): New extern variable (critical section).
(ProcListEvt): New extern variable (handle).
(__gnat_initialize)[Win32]: Initialize the ProcListCS critical
section object and the ProcListEvt event.
* final.c (__gnat_finalize)[Win32]: Properly finalize the
ProcListCS critical section and the ProcListEvt event.
* adaint.c (ProcListEvt): New Win32 event handle.
(EnterCS): New routine to enter the critical section when dealing with
child processes chain list.
(LeaveCS): As above to exit from the critical section.
(SignalListChanged): Routine to signal that the chain process list has
been updated.
(add_handle): Use EnterCS/LeaveCS, also call SignalListChanged when the
handle has been added.
(__gnat_win32_remove_handle): Use EnterCS/LeaveCS,
also call SignalListChanged if the handle has been found and removed.
(remove_handle): Routine removed, implementation merged with the above.
(win32_wait): Use EnterCS/LeaveCS for the critical section. Properly
copy the PID list locally to ensure that even if the list is updated
the local copy remains valid. Add into the hl (handle list) the
ProcListEvt handle. This handle is used to signal that a change has
been made into the process chain list. This is to ensure that a waiting
call can be resumed to take into account new processes. We also make
sure that if the handle was not found into the list we start over
the wait call. Indeed another concurrent call to win32_wait()
could already have handled this process.
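
The copy-under-lock step described for win32_wait can be sketched in portable C. This is an illustration under assumptions, not the adaint.c code: all toy_* names are hypothetical, the list is a fixed array instead of a chained list, and the critical-section helpers are no-ops here (as in the patch's single-threaded configuration); a real build would wrap EnterCriticalSection and LeaveCriticalSection.

```c
#include <stddef.h>

#define TOY_MAX_PROCS 8

/* Shared list of child process ids (hypothetical representation).  */
static int toy_proc_list[TOY_MAX_PROCS];
static size_t toy_proc_count;

/* No-op stand-ins for the critical section.  */
static void toy_enter_cs (void) {}
static void toy_leave_cs (void) {}

/* Register a child; returns 1 on success, 0 if the list is full.  */
static int
toy_add_pid (int pid)
{
  int added = 0;
  toy_enter_cs ();
  if (toy_proc_count < TOY_MAX_PROCS)
    {
      toy_proc_list[toy_proc_count++] = pid;
      added = 1;
    }
  toy_leave_cs ();
  return added;
}

/* Copy the list into caller-owned storage while holding the lock, so
   the caller's view stays valid even if the shared list changes.  */
static size_t
toy_snapshot_pids (int *out, size_t cap)
{
  size_t n;
  toy_enter_cs ();
  n = toy_proc_count < cap ? toy_proc_count : cap;
  for (size_t i = 0; i < n; i++)
    out[i] = toy_proc_list[i];
  toy_leave_cs ();
  return n;
}

/* Remove PID; returns 0 if it was not found, which a waiter treats as
   "another waiter already handled it, start the wait over".  */
static int
toy_remove_pid (int pid)
{
  int found = 0;
  toy_enter_cs ();
  for (size_t i = 0; i < toy_proc_count; i++)
    if (toy_proc_list[i] == pid)
      {
        toy_proc_list[i] = toy_proc_list[--toy_proc_count];
        found = 1;
        break;
      }
  toy_leave_cs ();
  return found;
}
```

The event handle from the patch has no counterpart in this sketch; it is what wakes a blocked waiter so that it takes a fresh snapshot after the list changes.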

Index: final.c
===
--- final.c (revision 217828)
+++ final.c (working copy)
@@ -6,7 +6,7 @@
  *  *
  *  C Implementation File   *
  *  *
- *  Copyright (C) 1992-2011, Free Software Foundation, Inc. *
+ *  Copyright (C) 1992-2014, Free Software Foundation, Inc. *
  *  *
  * GNAT is free software;  you can  redistribute it  and/or modify it under *
  * terms of the  GNU General Public License as published  by the Free Soft- *
@@ -40,11 +40,29 @@
at all, the intention is that this be replaced by system specific code
where finalization is required.  */
 
+#if defined (__MINGW32__)
+#include "mingw32.h"
+#include <windows.h>
+
+extern CRITICAL_SECTION ProcListCS;
+extern HANDLE ProcListEvt;
+
 void
 __gnat_finalize (void)
 {
+  /* delete critical section and event handle used for the
+ processes chain list */
+  DeleteCriticalSection(&ProcListCS);
+  CloseHandle (ProcListEvt);
 }
 
+#else
+void
+__gnat_finalize (void)
+{
+}
+#endif
+
 #ifdef __cplusplus
 }
 #endif
Index: initialize.c
===
--- initialize.c(revision 217828)
+++ initialize.c(working copy)
@@ -74,6 +74,8 @@
 
 extern int gnat_argc;
 extern char **gnat_argv;
+extern CRITICAL_SECTION ProcListCS;
+extern HANDLE ProcListEvt;
 
 #ifdef GNAT_UNICODE_SUPPORT
 
@@ -138,6 +140,11 @@
   given that we have set Max_Digits etc with this in mind */
__gnat_init_float ();
 
+   /* Initialize the critical section and event handle for the win32_wait()
+  implementation, see adaint.c */
+   InitializeCriticalSection (&ProcListCS);
+   ProcListEvt = CreateEvent (NULL, FALSE, FALSE, NULL);
+
 #ifdef GNAT_UNICODE_SUPPORT
/* Set current code page for filenames handling. */
{
Index: adaint.c
===
--- adaint.c(revision 217836)
+++ adaint.c(working copy)
@@ -2311,21 +2311,30 @@
for locking and unlocking tasks since we do not support multiple
threads on this configuration (Cert run time on native Windows). */
 
-static void dummy (void)
+static void EnterCS (void) {}
+static void LeaveCS (void) {}
+static void SignalListChanged (void) {}
+
+#else
+
+CRITICAL_SECTION ProcListCS;
+HANDLE ProcListEvt;
+
+static void EnterCS (void)
 {
+  EnterCriticalSection(&ProcListCS);
 }
 
-void (*Lock_Task) ()   = dummy;
-void (*Unlock_Task) () = dummy;
+static void LeaveCS (void)
+{
+  LeaveCriticalSection(&ProcListCS);
+}
 
-#else
+static void SignalListChanged (void)
+{
+  SetEvent (ProcListEvt);
+}
 
-#define Lock_Task 

[Ada] Attributes 'Old and 'Update must preserve the tag of their prefix

2014-11-20 Thread Arnaud Charlet
The patch modifies the expansion of attributes 'Old and 'Update to ensure that
the tag of a tagged prefix is not modified as a result of attribute evaluation.


-- Source --


--  types.ads

package Types is
   type Root is tagged record
  X : Integer;
   end record;

   procedure Show (R : Root);

   type Ext is new Root with record
  Y : Integer;
   end record;

   overriding procedure Show (R : Ext);
end Types;

--  types.adb

with Ada.Text_IO; use Ada.Text_IO;

package body Types is
   procedure Show (R : Root) is
   begin
      Put_Line ("(root) X =" & R.X'Img);
   end Show;

   overriding procedure Show (R : Ext) is
   begin
      Put_Line ("(ext) X =" & R.X'Img);
      Put_Line ("(ext) Y =" & R.Y'Img);
   end Show;
end Types;

--  main.adb

with Ada.Text_IO; use Ada.Text_IO;
with Types;   use Types;

procedure Main is
   procedure Show_Me (R : Root) is
  Tmp : Root'Class := R;
   begin
  Show (Tmp);
   end Show_Me;

   procedure Wibble (R : Root) is
   begin
  Show_Me (R);
      Show_Me (R'Update (X => 5));
   end Wibble;

   A : Ext;
begin
   A.X := 0;
   A.Y := 1;

   Wibble (Root (A));
end Main;


-- Compilation and output --


$ gnatmake -q main.adb
$ ./main
(ext) X = 0
(ext) Y = 1
(ext) X = 5
(ext) Y = 1

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-11-20  Hristian Kirtchev  kirtc...@adacore.com

* exp_attr.adb (Expand_N_Attribute_Reference,
Expand_Update_Attribute): Preserve the tag of a prefix by offering
a specific view of the class-wide version of the prefix.

Index: exp_attr.adb
===
--- exp_attr.adb(revision 217828)
+++ exp_attr.adb(working copy)
@@ -1021,6 +1021,9 @@
   Pref  : constant Node_Id   := Prefix (N);
   Typ   : constant Entity_Id := Etype (Pref);
   Blk   : Node_Id;
+  CW_Decl   : Node_Id;
+  CW_Temp   : Entity_Id;
+  CW_Typ: Entity_Id;
   Decls : List_Id;
   Installed : Boolean;
   Loc   : Source_Ptr;
@@ -1338,19 +1341,56 @@
   --  Step 3: Create a constant to capture the value of the prefix at the
   --  entry point into the loop.
 
-  --  Generate:
-  --    Temp : constant <type of Pref> := Pref;
-
   Temp_Id := Make_Temporary (Loc, 'P');
 
-  Temp_Decl :=
-    Make_Object_Declaration (Loc,
-      Defining_Identifier => Temp_Id,
-      Constant_Present    => True,
-      Object_Definition   => New_Occurrence_Of (Typ, Loc),
-      Expression          => Relocate_Node (Pref));
-  Append_To (Decls, Temp_Decl);
+  --  Preserve the tag of the prefix by offering a specific view of the
+  --  class-wide version of the prefix.
 
+  if Is_Tagged_Type (Typ) then
+
+ --  Generate:
+ --CW_Temp : constant Typ'Class := Typ'Class (Pref);
+
+ CW_Temp := Make_Temporary (Loc, 'T');
+ CW_Typ  := Class_Wide_Type (Typ);
+
+ CW_Decl :=
+   Make_Object_Declaration (Loc,
+     Defining_Identifier => CW_Temp,
+     Constant_Present    => True,
+     Object_Definition   => New_Occurrence_Of (CW_Typ, Loc),
+     Expression          =>
+       Convert_To (CW_Typ, Relocate_Node (Pref)));
+ Append_To (Decls, CW_Decl);
+
+ --  Generate:
+ --Temp : Typ renames Typ (CW_Temp);
+
+ Temp_Decl :=
+   Make_Object_Renaming_Declaration (Loc,
+     Defining_Identifier => Temp_Id,
+     Subtype_Mark        => New_Occurrence_Of (Typ, Loc),
+     Name                =>
+       Convert_To (Typ, New_Occurrence_Of (CW_Temp, Loc)));
+ Append_To (Decls, Temp_Decl);
+
+  --  Non-tagged case
+
+  else
+ CW_Decl := Empty;
+
+ --  Generate:
+ --Temp : constant Typ := Pref;
+
+ Temp_Decl :=
+   Make_Object_Declaration (Loc,
+     Defining_Identifier => Temp_Id,
+     Constant_Present    => True,
+     Object_Definition   => New_Occurrence_Of (Typ, Loc),
+     Expression          => Relocate_Node (Pref));
+ Append_To (Decls, Temp_Decl);
+  end if;
+
   --  Step 4: Analyze all bits
 
   Installed := Current_Scope = Scope (Loop_Id);
@@ -1374,6 +1414,10 @@
   --  the declaration of the constant.
 
   else
+ if Present (CW_Decl) then
+Analyze (CW_Decl);
+ end if;
+
  Analyze (Temp_Decl);
   end if;
 
@@ -4358,19 +4402,13 @@
   -
 
   when Attribute_Old => Old : declare
- Asn_Stm : Node_Id;
+ Typ : constant Entity_Id := Etype (N);
+ CW_Temp : Entity_Id;
+ CW_Typ  : Entity_Id;
  Subp: Node_Id;
  Temp: Entity_Id;
 
   begin
- Temp := Make_Temporary (Loc, 'T', Pref);
-
- --  Set the entity kind now in order to mark 

Re: [PATCH 2/2, AArch64, v2] Pipeline model for APM XGene-1.

2014-11-20 Thread Ramana Radhakrishnan
On Wed, Nov 19, 2014 at 9:42 PM, Philipp Tomsich
philipp.toms...@theobroma-systems.com wrote:
 Here's an updated patch with Kyrill's and Andrew's comments integrated.

 I left the file in the config/arm-directory, as XGene-family is capable of
 executing ARMv7 and we will wire this into the 32bit backend in the near
 future (moving it now would just cause another move in the near future).


Right, if this were making it into the arm backend and if the core
indeed does have AArch32 support, I'd like to see support for the
command line for xgene1 in the AArch32 backend as well for 5.0. Do
have a look in arm-cores.def in gcc/config/arm - there are ways of
using existing tuning options with the command line or putting this as
part of generic. We've been here before and users typically complain
about CPU option X being available in AArch32 state but not in AArch64
state. Since this is a separate tuning option, I'm less worried about
this going in later in stage3 but realistically it would be good to
have the command line options wired up for AArch32 by the end of the
year.

Ramana


 We also moved the 'include' up to where the pipeline models for the
 A53/A57/ThunderX are included, as the previous dependency on picking up the
 SIMD types from aarch64-simd.md no longer holds true since gcc-4.9.

 Cheers,
 -Philipp.


 ---
  gcc/ChangeLog |   6 +
  gcc/config/aarch64/aarch64.md |   3 +-
  gcc/config/arm/xgene1.md  | 520 
 ++
  3 files changed, 528 insertions(+), 1 deletion(-)
  create mode 100644 gcc/config/arm/xgene1.md

 diff --git a/gcc/ChangeLog b/gcc/ChangeLog
 index c9ac0d9..dad2278 100644
 --- a/gcc/ChangeLog
 +++ b/gcc/ChangeLog
 @@ -1,5 +1,11 @@
  2014-11-19  Philipp Tomsich  philipp.toms...@theobroma-systems.com

 +   * config/aarch64/aarch64.md: Include xgene1.md.
 +   (generic_sched): Set to no for xgene1.
 +   * config/arm/xgene1.md: New file.
 +
 +2014-11-19  Philipp Tomsich  philipp.toms...@theobroma-systems.com
 +
 * config/aarch64/aarch64-cores.def (xgene1): Update/add the
 xgene1 (APM XGene-1) core definition.
 * gcc/config/aarch64/aarch64.c: Add cost tables for APM XGene-1
 diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
 index 597ff8c..1b36384 100644
 --- a/gcc/config/aarch64/aarch64.md
 +++ b/gcc/config/aarch64/aarch64.md
 @@ -191,7 +191,7 @@

   (define_attr "generic_sched" "yes,no"
     (const (if_then_else
  -          (eq_attr "tune" "cortexa53,cortexa15,thunderx")
  +          (eq_attr "tune" "cortexa53,cortexa15,thunderx,xgene1")
             (const_string "no")
             (const_string "yes"))))

 @@ -199,6 +199,7 @@
   (include "../arm/cortex-a53.md")
   (include "../arm/cortex-a15.md")
   (include "thunderx.md")
  +(include "../arm/xgene1.md")

  ;; ---
  ;; Jumps and other miscellaneous insns
 diff --git a/gcc/config/arm/xgene1.md b/gcc/config/arm/xgene1.md
 new file mode 100644
 index 000..227f2c7
 --- /dev/null
 +++ b/gcc/config/arm/xgene1.md
 @@ -0,0 +1,520 @@
 +;; Machine description for AppliedMicro xgene1 core.
 +;; Copyright (C) 2012-2014 Free Software Foundation, Inc.
 +;; Contributed by Theobroma Systems Design und Consulting GmbH.
 +;;See http://www.theobroma-systems.com for more info.
 +;;
 +;; This file is part of GCC.
 +;;
 +;; GCC is free software; you can redistribute it and/or modify it
 +;; under the terms of the GNU General Public License as published by
 +;; the Free Software Foundation; either version 3, or (at your option)
 +;; any later version.
 +;;
 +;; GCC is distributed in the hope that it will be useful, but
 +;; WITHOUT ANY WARRANTY; without even the implied warranty of
 +;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 +;; General Public License for more details.
 +;;
 +;; You should have received a copy of the GNU General Public License
 +;; along with GCC; see the file COPYING3.  If not see
 +;; http://www.gnu.org/licenses/.
 +
 +;; Pipeline description for the xgene1 micro-architecture
 +
 +(define_automaton "xgene1")
 +
 +(define_cpu_unit "xgene1_decode_out0" "xgene1")
 +(define_cpu_unit "xgene1_decode_out1" "xgene1")
 +(define_cpu_unit "xgene1_decode_out2" "xgene1")
 +(define_cpu_unit "xgene1_decode_out3" "xgene1")
 +
 +(define_cpu_unit "xgene_divide" "xgene1")
 +(define_cpu_unit "xgene_fp_divide" "xgene1")
 +
 +(define_reservation "xgene1_decode1op"
 +        "( xgene1_decode_out0 )
 +        |( xgene1_decode_out1 )
 +        |( xgene1_decode_out2 )
 +        |( xgene1_decode_out3 )"
 +)
 +(define_reservation "xgene1_decode2op"
 +        "( xgene1_decode_out0 + xgene1_decode_out1 )
 +        |( xgene1_decode_out0 + xgene1_decode_out2 )
 +        |( xgene1_decode_out0 + xgene1_decode_out3 )
 +        |( xgene1_decode_out1 + xgene1_decode_out2 )
 +        |( xgene1_decode_out1 + xgene1_decode_out3 )
 +        |( xgene1_decode_out2 + xgene1_decode_out3 )"
 +)
 +(define_reservation 

[Ada] Interaction between 'Loop_Entry, 'Old, 'Update and Extensions_Visible

2014-11-20 Thread Arnaud Charlet
This patch implements the following SPARK rule (the part about 'Loop_Entry,
'Old, 'Update):

   If the Extensions_Visible aspect is False for a subprogram, then certain
   restrictions are imposed on the use of any parameter of the subprogram which
   is of a specific tagged type. Such a parameter shall not be converted to a
   class-wide type. Such a parameter shall not be passed as an actual parameter
   in a call to a subprogram whose Extensions_Visible aspect is True. These
   restrictions also apply to any parenthesized expression, qualified
   expression, or type conversion whose operand is subject to these
   restrictions, to any Old, Update, or Loop_Entry attribute_reference whose
   prefix is subject to these restrictions, and to any conditional expression
   having at least one dependent_expression which is subject to these
   restrictions.
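
The recursive shape of this rule can be sketched in C on a toy expression
tree. This is only an illustration under assumptions: the struct, the enum
names and is_evf_expression below are hypothetical and are not the GNAT
sources (the real predicate is Is_EVF_Expression in sem_util.adb, written
in Ada).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node kinds for a toy expression tree.  */
enum expr_kind
{
  EXPR_FORMAL,            /* reference to a formal parameter */
  EXPR_PARENTHESIZED,
  EXPR_QUALIFIED,
  EXPR_TYPE_CONVERSION,
  EXPR_ATTRIBUTE,
  EXPR_CONDITIONAL,
  EXPR_OTHER
};

enum attr_name { ATTR_NONE, ATTR_LOOP_ENTRY, ATTR_OLD, ATTR_UPDATE, ATTR_OTHER };

struct expr
{
  enum expr_kind kind;
  enum attr_name attr;            /* only meaningful for EXPR_ATTRIBUTE */
  bool evf_formal;                /* formal of a specific tagged type in a
                                     subprogram with Extensions_Visible False */
  const struct expr *operand;     /* operand or attribute prefix */
  const struct expr *then_expr;   /* dependent expressions of a conditional */
  const struct expr *else_expr;
};

/* Return true if E is subject to the Extensions_Visible restrictions.  */
bool
is_evf_expression (const struct expr *e)
{
  if (e == NULL)
    return false;

  switch (e->kind)
    {
    case EXPR_FORMAL:
      return e->evf_formal;

    case EXPR_PARENTHESIZED:
    case EXPR_QUALIFIED:
    case EXPR_TYPE_CONVERSION:
      return is_evf_expression (e->operand);

    case EXPR_ATTRIBUTE:
      /* Only 'Loop_Entry, 'Old and 'Update propagate the restriction.  */
      return (e->attr == ATTR_LOOP_ENTRY
              || e->attr == ATTR_OLD
              || e->attr == ATTR_UPDATE)
             && is_evf_expression (e->operand);

    case EXPR_CONDITIONAL:
      return is_evf_expression (e->then_expr)
             || is_evf_expression (e->else_expr);

    default:
      return false;
    }
}
```

In this model X'Old is restricted whenever X is, while an attribute outside
the three listed ones does not carry the restriction.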


-- Source --


--  test_loop_entry_old_update.adb

procedure Test_Loop_Entry_Old_Update is

   -- Test that Extensions_Visible restrictions are enforced for
   -- Old, Update, and Loop_Entry attribute references.

   pragma Assertion_Policy (Check);

   package Pkg is
      type T is abstract tagged record Int1, Int2, Int3 : Integer; end record;
      function Is_Bodacious (X : T) return Boolean is abstract;
   end Pkg;
   use Pkg;

   procedure P1 (X : in out T) with
     Post => Is_Bodacious (T'Class (X'Old)), --  ERROR
     Extensions_Visible => False;
   procedure P1 (X : in out T) is begin null; end P1;

   procedure P2 (X : in out T) with Extensions_Visible => False;
   procedure P2 (X : in out T) is
   begin
      if Is_Bodacious (T'Class (X'Update (Int1 => 123))) then  --  ERROR
         X.Int1 := 123;
      end if;
   end P2;

   procedure P3 (X : in out T) with Extensions_Visible => False;
   procedure P3 (X : in out T) is
   begin
      for I in 1 .. 10 loop
         X.Int1 := X.Int1 + 1;
         pragma Assert ((X.Int1 /= X.Int2)
           or else Is_Bodacious (T'Class (X'Loop_Entry)));   --  ERROR
      end loop;
   end P3;

   procedure P4 (X : in out T; Y : T'Class) with Extensions_Visible => False;
   procedure P4 (X : in out T; Y : T'Class) is
   begin
      if Is_Bodacious
        (T'Class
          (T'(if X.Int1 = X.Int2 --  ERROR
              then X'Update (Int1 => X.Int1 + 1)
              else T (Y)))) then
         X.Int1 := 456;
      end if;
   end P4;

begin null; end Test_Loop_Entry_Old_Update;


-- Compilation and output --


$ gcc -c test_loop_entry_old_update.adb
test_loop_entry_old_update.adb:15:38: formal parameter with Extensions_Visible
  False cannot be converted to class-wide type
test_loop_entry_old_update.adb:22:34: formal parameter with Extensions_Visible
  False cannot be converted to class-wide type
test_loop_entry_old_update.adb:33:44: formal parameter with Extensions_Visible
  False cannot be converted to class-wide type
test_loop_entry_old_update.adb:42:13: formal parameter with Extensions_Visible
  False cannot be converted to class-wide type

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-11-20  Hristian Kirtchev  kirtc...@adacore.com

* sem_util.adb (Is_EVF_Expression): Include
attributes 'Loop_Entry, 'Old and 'Update to the logic.

Index: sem_util.adb
===
--- sem_util.adb(revision 217835)
+++ sem_util.adb(working copy)
@@ -10846,6 +10846,16 @@
  N_Type_Conversion)
   then
  return Is_EVF_Expression (Expression (N));
+
+  --  Attributes 'Loop_Entry, 'Old and 'Update are an EVF expression when
+  --  their prefix denotes an EVF expression.
+
+  elsif Nkind (N) = N_Attribute_Reference
+and then Nam_In (Attribute_Name (N), Name_Loop_Entry,
+ Name_Old,
+ Name_Update)
+  then
+ return Is_EVF_Expression (Prefix (N));
   end if;
 
   return False;


[Ada] Add missing SPARK_Mode aspects/pragmas on formal containers

2014-11-20 Thread Arnaud Charlet
While the library of formal maps/sets correctly sets SPARK_Mode on the spec
(On) and the private part / body (Off), this was not the case for lists and
vectors. That caused errors in GNATprove when instantiating such formal
containers, because the bodies contain non-SPARK features (e.g. access
types in formal vectors). This is now fixed, which requires that formal lists
and vectors be instantiated at library level, like the other formal
containers.

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-11-20  Yannick Moy  m...@adacore.com

* a-cfdlli.adb, a-cfdlli.ads, a-cfinve.adb, a-cfinve.ads,
* a-cofove.adb, a-cofove.ads: Mark spec as SPARK_Mode, and private
part/body as SPARK_Mode Off.
* a-cfhama.adb, a-cfhama.ads, a-cfhase.adb, a-cfhase.ads,
* a-cforma.adb, a-cforma.ads, a-cforse.adb, a-cforse.ads: Use
aspect instead of pragma for uniformity.

Index: a-cfdlli.adb
===
--- a-cfdlli.adb(revision 217828)
+++ a-cfdlli.adb(working copy)
@@ -6,7 +6,7 @@
 --  --
 -- B o d y  --
 --  --
---  Copyright (C) 2010-2013, Free Software Foundation, Inc. --
+--  Copyright (C) 2010-2014, Free Software Foundation, Inc. --
 --  --
 -- GNAT is free software;  you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -27,7 +27,9 @@
 
 with System;  use type System.Address;
 
-package body Ada.Containers.Formal_Doubly_Linked_Lists is
+package body Ada.Containers.Formal_Doubly_Linked_Lists with
+  SPARK_Mode => Off
+is
 
---
-- Local Subprograms --
Index: a-cfdlli.ads
===
--- a-cfdlli.ads(revision 217828)
+++ a-cfdlli.ads(working copy)
@@ -61,9 +61,11 @@
with function "=" (Left, Right : Element_Type)
   return Boolean is <>;
 
-package Ada.Containers.Formal_Doubly_Linked_Lists is
+package Ada.Containers.Formal_Doubly_Linked_Lists with
+  Pure,
+  SPARK_Mode
+is
pragma Annotate (GNATprove, External_Axiomatization);
-   pragma Pure;
 
type List (Capacity : Count_Type) is private with
  Iterable => (First   => First,
@@ -337,6 +339,7 @@
--  scanned yet.
 
 private
+   pragma SPARK_Mode (Off);
 
type Node_Type is record
   Prev: Count_Type'Base := -1;
Index: a-cfhase.adb
===
--- a-cfhase.adb(revision 217828)
+++ a-cfhase.adb(working copy)
@@ -35,8 +35,9 @@
 
 with System; use type System.Address;
 
-package body Ada.Containers.Formal_Hashed_Sets is
-   pragma SPARK_Mode (Off);
+package body Ada.Containers.Formal_Hashed_Sets with
+  SPARK_Mode => Off
+is
 
---
-- Local Subprograms --
Index: a-cfhase.ads
===
--- a-cfhase.ads(revision 217828)
+++ a-cfhase.ads(working copy)
@@ -67,10 +67,11 @@
 
with function "=" (Left, Right : Element_Type) return Boolean is <>;
 
-package Ada.Containers.Formal_Hashed_Sets is
+package Ada.Containers.Formal_Hashed_Sets with
+  Pure,
+  SPARK_Mode
+is
pragma Annotate (GNATprove, External_Axiomatization);
-   pragma Pure;
-   pragma SPARK_Mode (On);
 
type Set (Capacity : Count_Type; Modulus : Hash_Type) is private with
  Iterable => (First   => First,
@@ -335,9 +336,10 @@
--  scanned yet.
 
 private
-   pragma Inline (Next);
pragma SPARK_Mode (Off);
 
+   pragma Inline (Next);
+
type Node_Type is
   record
  Element : Element_Type;
Index: a-cfinve.adb
===
--- a-cfinve.adb(revision 217828)
+++ a-cfinve.adb(working copy)
@@ -26,7 +26,9 @@
 -- http://www.gnu.org/licenses/.  --
 --
 
-package body Ada.Containers.Formal_Indefinite_Vectors is
+package body Ada.Containers.Formal_Indefinite_Vectors with
+  SPARK_Mode => Off
+is
 
function H (New_Item : Element_Type) return Holder renames To_Holder;
function E (Container : Holder) return Element_Type renames Get;
Index: a-cfinve.ads
===
--- a-cfinve.ads(revision 217828)
+++ a-cfinve.ads(working copy)
@@ -52,7 +52,9 @@
--  size, and heap allocation will be avoided. If False, the containers can
--  grow via heap allocation.
 
-package Ada.Containers.Formal_Indefinite_Vectors is
+package 

[Ada] Generate VC in GNATprove instead of error for empty range check

2014-11-20 Thread Arnaud Charlet
Range checks on empty ranges typically correspond to deactivated code
based on a given configuration (say, dead code inside a loop over the
empty range). In GNATprove mode, instead of issuing an error message
(which would stop analysis), enable the range check so that GNATprove
will issue a message if it cannot prove that the check is unreachable.
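
The decision described above can be sketched in C as a small model. This
is only a sketch under assumptions: the enum and the function name are
hypothetical and do not appear in the GNAT sources (the real change is in
Apply_Scalar_Range_Check, written in Ada).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcomes of checking a value against a target range.  */
enum range_action
{
  CONTINUE_ANALYSIS,    /* range is not empty, nothing special to do */
  REPORT_BAD_VALUE,     /* normal compilation: diagnose at compile time */
  ENABLE_RANGE_CHECK    /* GNATprove mode: leave a check to be proved */
};

/* LO and HI are the compile-time known bounds of the target range.  */
enum range_action
empty_range_action (long lo, long hi, bool gnatprove_mode)
{
  if (lo <= hi)
    return CONTINUE_ANALYSIS;

  /* Empty range: every value would raise Constraint_Error.  */
  return gnatprove_mode ? ENABLE_RANGE_CHECK : REPORT_BAD_VALUE;
}
```

The point of the model is that the empty range itself is not an error in
GNATprove mode; only a reachable check on it is.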

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-11-20  Yannick Moy  m...@adacore.com

* checks.adb (Apply_Scalar_Range_Check): In GNATprove mode,
put a range check when an empty range is used, instead of an
error message.
* sinfo.ads: Update comment on GNATprove mode.

Index: sinfo.ads
===
--- sinfo.ads   (revision 217828)
+++ sinfo.ads   (working copy)
@@ -581,6 +581,12 @@
--   bounds are generated from an expression: Expand_Subtype_From_Expr
--   should be noop.
 
+   --5. Errors (instead of warnings) are issued on compile-time known
+   --   constraint errors, except in a few selected cases where it should
+   --   be allowed to let analysis proceed (e.g. range checks on empty
+   --   ranges, typically in deactivated code based on a given
+   --   configuration).
+
---
-- Check Flag Fields --
---
Index: checks.adb
===
--- checks.adb  (revision 217828)
+++ checks.adb  (working copy)
@@ -2926,7 +2926,21 @@
   --  since all possible values will raise CE).
 
   if Lov > Hiv then
- Bad_Value;
+
+ --  In GNATprove mode, do not issue a message in that case
+ --  (which would be an error stopping analysis), as this
+ --  likely corresponds to deactivated code based on a
+ --  given configuration (say, dead code inside a loop over
+ --  the empty range). Instead, we enable the range check
+ --  so that GNATprove will issue a message if it cannot be
+ --  proved.
+
+ if GNATprove_Mode then
+Enable_Range_Check (Expr);
+ else
+Bad_Value;
+ end if;
+
  return;
   end if;
 


[Ada] Give error message if duplicate Linker_Section given

2014-11-20 Thread Arnaud Charlet
Like other similar pragmas, we should disallow duplicate pragma or
aspect Linker_Section for non-overloadable entities (for the case
of overloading, the pragma only applies to previous entities which
do not have such a pragma).

The following should compile with the given error:

     1. package Pkg1 is
     2.    Var_Dyn : natural;
     3.    pragma Linker_Section (Var_Dyn, ".data_dyn");
     4.    pragma Linker_Section (Var_Dyn, ".data_dyn1");
           |
        Linker_Section already specified for Var_Dyn at line 3

     5. end Pkg1;

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-11-20  Robert Dewar  de...@adacore.com

* sem_prag.adb (Analyze_Pragma, case Linker_Section): Detect
duplicate Linker_Section.

Index: sem_prag.adb
===
--- sem_prag.adb(revision 217838)
+++ sem_prag.adb(working copy)
@@ -16380,6 +16380,7 @@
  when Pragma_Linker_Section => Linker_Section : declare
 Arg : Node_Id;
 Ent : Entity_Id;
+LPE : Node_Id;
 
  begin
 GNAT_Pragma;
@@ -16398,9 +16399,18 @@
 case Ekind (Ent) is
 
--  Objects (constants and variables) and types. For these cases
-   --  all we need to do is to set the Linker_Section_pragma field.
+   --  all we need to do is to set the Linker_Section_pragma field,
+   --  checking that we do not have a duplicate.
 
when E_Constant | E_Variable | Type_Kind =>
+  LPE := Linker_Section_Pragma (Ent);
+
+  if Present (LPE) then
+ Error_Msg_Sloc := Sloc (LPE);
+ Error_Msg_NE
+   (Linker_Section already specified for #, Arg1, Ent);
+  end if;
+
   Set_Linker_Section_Pragma (Ent, N);
 
--  Subprograms


RE: [PATCH] If using branch likelies in MIPS sync code fill the delay slot with a nop

2014-11-20 Thread Matthew Fortune
 Ok to commit?

 gcc/
   * config/mips/mips.c (mips_process_sync_loop): Place a nop in the 
   delay slot of the branch likely instruction.

With an updated ChangeLog to account for the changes in the callers, OK.

Matthew


[PATCH x86, PR60451] Expand even/odd permutation using pack insn.

2014-11-20 Thread Evgeny Stupachenko
Hi,

The patch expands even/odd permutations using:
and, and, pack in the even case
shift, shift, pack in the odd case

instead of the current "pshufb, pshufb, or" sequence or a big set of unpack insns.
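
The idea can be checked with a scalar C model for the V8HI case. This is
only a sketch under assumptions: the function names are hypothetical, the
32-bit lanes are built explicitly with the even element in the low half
(the x86 little-endian layout), and packus_lane models one lane of an
unsigned-saturating pack such as packusdw.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NELT 8   /* 16-bit elements per input vector (V8HI) */

/* One lane of an unsigned-saturating pack: signed 32-bit in, 16-bit out.  */
uint16_t
packus_lane (int32_t x)
{
  if (x < 0)
    return 0;
  if (x > 0xffff)
    return 0xffff;
  return (uint16_t) x;
}

/* Write the even (ODD == 0) or odd (ODD == 1) elements of the
   concatenation OP0:OP1 to OUT, which must hold NELT elements.  */
void
extract_even_odd (const uint16_t *op0, const uint16_t *op1,
                  unsigned odd, uint16_t *out)
{
  assert (odd <= 1);

  for (size_t half = 0; half < 2; half++)
    {
      const uint16_t *src = half ? op1 : op0;

      for (size_t i = 0; i < NELT / 2; i++)
        {
          /* View two adjacent elements as one 32-bit lane.  */
          uint32_t lane = (uint32_t) src[2 * i]
                          | ((uint32_t) src[2 * i + 1] << 16);

          /* Even: mask off the high half.  Odd: shift it down.  Either
             way the value fits in 16 bits, so the pack never saturates.  */
          uint32_t narrowed = odd ? lane >> 16 : lane & 0xffffu;

          out[half * (NELT / 2) + i] = packus_lane ((int32_t) narrowed);
        }
    }
}
```

Because the mask or the shift clears the upper half of every lane first,
the saturating pack behaves as a plain truncation, which is what makes the
three-insn sequence a valid permutation.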

AVX2/CORE bootstrap and make check passed.
Expensive tests are in progress.

Is it ok for trunk?

Evgeny

2014-11-20  Evgeny Stupachenko  evstu...@gmail.com

gcc/testsuite
PR target/60451
* gcc.target/i386/pr60451.c: New.

gcc/
PR target/60451
* config/i386/i386.c (expand_vec_perm_even_odd_pack): New.
(expand_vec_perm_even_odd_1): Add new expand for SSE cases,
replace with for AVX2 cases.
(ix86_expand_vec_perm_const_1): Add new expand.


+/* A subroutine of expand_vec_perm_even_odd_1.  Implement extract-even
+   and extract-odd permutations of two V16QI, V8HI, V16HI or V32QI operands
+   with two "and" and "pack" or two "shift" and "pack" insns.  We should
+   have already failed all two instruction sequences.  */
+
+static bool
+expand_vec_perm_even_odd_pack (struct expand_vec_perm_d *d)
+{
+  rtx op, dop0, dop1, t, rperm[16];
+  unsigned i, odd, c, s, nelt = d->nelt;
+  bool end_perm = false;
+  machine_mode half_mode;
+  rtx (*gen_and) (rtx, rtx, rtx);
+  rtx (*gen_pack) (rtx, rtx, rtx);
+  rtx (*gen_shift) (rtx, rtx, rtx);
+
+  /* Required for "pack".  */
+  if (!TARGET_SSE4_2 || d->one_operand_p)
+    return false;
+
+  /* Only V8HI, V16QI, V16HI and V32QI modes are more profitable than general
+     shuffles.  */
+  if (d->vmode == V8HImode)
+    {
+      c = 0xffff;
+      s = 16;
+      half_mode = V4SImode;
+      gen_and = gen_andv4si3;
+      gen_pack = gen_sse4_1_packusdw;
+      gen_shift = gen_lshrv4si3;
+    }
+  else if (d->vmode == V16QImode)
+    {
+      c = 0xff;
+      s = 8;
+      half_mode = V8HImode;
+      gen_and = gen_andv8hi3;
+      gen_pack = gen_sse2_packuswb;
+      gen_shift = gen_lshrv8hi3;
+    }
+  else if (d->vmode == V16HImode)
+    {
+      c = 0xffff;
+      s = 16;
+      half_mode = V8SImode;
+      gen_and = gen_andv8si3;
+      gen_pack = gen_avx2_packusdw;
+      gen_shift = gen_lshrv8si3;
+      end_perm = true;
+    }
+  else if (d->vmode == V32QImode)
+    {
+      c = 0xff;
+      s = 8;
+      half_mode = V16HImode;
+      gen_and = gen_andv16hi3;
+      gen_pack = gen_avx2_packuswb;
+      gen_shift = gen_lshrv16hi3;
+      end_perm = true;
+    }
+  else
+    return false;
+
+  /* Check that permutation is even or odd.  */
+  odd = d->perm[0];
+  if (odd != 0 && odd != 1)
+    return false;
+
+  for (i = 1; i < nelt; ++i)
+    if (d->perm[i] != 2 * i + odd)
+      return false;
+
+  if (d->testing_p)
+    return true;
+
+  dop0 = gen_reg_rtx (half_mode);
+  dop1 = gen_reg_rtx (half_mode);
+  if (odd == 0)
+    {
+      for (i = 0; i < nelt / 2; rperm[i++] = GEN_INT (c));
+      t = gen_rtx_CONST_VECTOR (half_mode, gen_rtvec_v (nelt / 2, rperm));
+      t = force_reg (half_mode, t);
+      emit_insn (gen_and (dop0, t, gen_lowpart (half_mode, d->op0)));
+      emit_insn (gen_and (dop1, t, gen_lowpart (half_mode, d->op1)));
+    }
+  else
+    {
+      emit_insn (gen_shift (dop0,
+                            gen_lowpart (half_mode, d->op0),
+                            GEN_INT (s)));
+      emit_insn (gen_shift (dop1,
+                            gen_lowpart (half_mode, d->op1),
+                            GEN_INT (s)));
+    }
+  /* In AVX2 for 256 bit case we need to permute pack result.  */
+  if (TARGET_AVX2 && end_perm)
+    {
+      op = gen_reg_rtx (d->vmode);
+      t = gen_reg_rtx (V4DImode);
+      emit_insn (gen_pack (op, dop0, dop1));
+      emit_insn (gen_avx2_permv4di_1 (t, gen_lowpart (V4DImode, op), const0_rtx,
+                                      const2_rtx, const1_rtx, GEN_INT (3)));
+      emit_move_insn (d->target, gen_lowpart (d->vmode, t));
+    }
+  else
+    emit_insn (gen_pack (d->target, dop0, dop1));
+
+  return true;
+}
+
+
 /* A subroutine of ix86_expand_vec_perm_builtin_1.  Implement extract-even
and extract-odd permutations.  */

@@ -48393,6 +48503,8 @@ expand_vec_perm_even_odd_1 (struct
expand_vec_perm_d *d, unsigned odd)
   gcc_unreachable ();

 case V8HImode:
+  if (TARGET_SSE4_2)
+   return expand_vec_perm_even_odd_pack (d);
   if (TARGET_SSSE3 && !TARGET_SLOW_PSHUFB)
return expand_vec_perm_pshufb2 (d);
   else
@@ -48416,6 +48528,8 @@ expand_vec_perm_even_odd_1 (struct
expand_vec_perm_d *d, unsigned odd)
   break;

 case V16QImode:
+  if (TARGET_SSE4_2)
+   return expand_vec_perm_even_odd_pack (d);
   if (TARGET_SSSE3 && !TARGET_SLOW_PSHUFB)
return expand_vec_perm_pshufb2 (d);
   else
@@ -48441,7 +48555,7 @@ expand_vec_perm_even_odd_1 (struct
expand_vec_perm_d *d, unsigned odd)

 case V16HImode:
 case V32QImode:
-  return expand_vec_perm_vpshufb2_vpermq_even_odd (d);
+  return expand_vec_perm_even_odd_pack (d);

 case V4DImode:
   if (!TARGET_AVX2)
@@ -48814,6 +48928,9 @@ ix86_expand_vec_perm_const_1 (struct
expand_vec_perm_d *d)

   /* Try 

Re: [PATCH x86, PR60451] Expand even/odd permutation using pack insn.

2014-11-20 Thread Jakub Jelinek
On Thu, Nov 20, 2014 at 02:36:26PM +0300, Evgeny Stupachenko wrote:
 +  /* Only V8HI, V16QI, V16HI and V32QI modes are more profitable than general
 + shuffles.  */

I think switch (d->vmode) would be more readable.

 +      op = gen_reg_rtx (d->vmode);
 +      t = gen_reg_rtx (V4DImode);
 +      emit_insn (gen_pack (op, dop0, dop1));
 +      emit_insn (gen_avx2_permv4di_1 (t, gen_lowpart (V4DImode, op), const0_rtx,

Too long line, wrap it?

Will leave the rest to Uros.

Jakub


[PATCH, ARM] Fix PR63718, Thumb1 bootstrap -- disable fuse-caller-save for Thumb1

2014-11-20 Thread Tom de Vries

Richard,

This patch fixes PR63718, which currently breaks Thumb1 bootstrap.

The problem is that in Thumb1 mode, we emit the epilogue in RTL, but the last
insn - epilogue_insns - does not accurately model the corresponding insns
emitted in the asm file. For instance, the asm file may contain an insn:
...
  pop {r0}

while the corresponding RTL pattern looks like this:
...
(jump_insn (unspec_volatile [
(return)
 ] VUNSPEC_EPILOGUE))
...

As a consequence, the epilogue may clobber registers without fuse-caller-save
being able to analyze that.

Adding the missing clobbers to epilogue_insns is not trivial, and probably not a
good idea for stage3. The patch works around the problem by disabling
fuse-caller-save in Thumb1 mode.


Built and regression-tested on arm-none-eabi.

OK for stage3?

Thanks,
- Tom
2014-11-20  Tom de Vries  t...@codesourcery.com

	PR rtl-optimization/63718
	* config/arm/arm.c (arm_option_override): Disable fuse-caller-save for
	Thumb1.

Index: gcc/config/arm/arm.c
===
--- gcc/config/arm/arm.c (revision 217730)
+++ gcc/config/arm/arm.c (working copy)
@@ -3105,6 +3105,18 @@ arm_option_override (void)
       && (!arm_arch7 || !current_tune->prefer_ldrd_strd))
 flag_schedule_fusion = 0;
 
+  /* In Thumb1 mode, we emit the epilogue in RTL, but the last insn
+ - epilogue_insns - does not accurately model the corresponding insns
+ emitted in the asm file.  In particular, see the comment in thumb_exit
+ 'Find out how many of the (return) argument registers we can corrupt'.
+ As a consequence, the epilogue may clobber registers without
+ fuse-caller-save finding out about it.  Therefore, disable fuse-caller-save
+ in Thumb1 mode.
+ TODO: Accurately model clobbers for epilogue_insns and reenable
+ fuse-caller-save.  */
+  if (TARGET_THUMB1)
+flag_use_caller_save = 0;
+
   /* Register global variables with the garbage collector.  */
   arm_add_gc_roots ();
 }


[Ada] gnat1: back end switch -G nnn (PR ada/47500)

2014-11-20 Thread Arnaud Charlet
On platforms where the switch is allowed, the gcc driver, when called with
-Gnnn (nnn is a non-negative number), invokes the compiler (gnat1) with
-G nnn. This patch skips the argument nnn after -G, so that it is not
taken as a source file name.
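
The scanning rule can be sketched in C. This is a minimal model under
assumptions: the function names are hypothetical, and only the two
switches named in the patch (-o and -G) are treated as taking a separate
argument; the real scanner (Scan_Back_End_Switches in back_end.adb) also
handles internal GCC switches.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return nonzero if switch SW (without the leading '-') is followed by a
   separate argument that must not be taken as a file name.  */
int
switch_takes_argument (const char *sw)
{
  return strcmp (sw, "o") == 0 || strcmp (sw, "G") == 0;
}

/* Scan ARGV[0 .. ARGC-1], store the source file names in FILES (which
   must have room for ARGC entries) and return how many were found.  */
size_t
collect_source_files (int argc, const char *const *argv, const char **files)
{
  size_t n = 0;

  assert (argv != NULL && files != NULL);

  for (int i = 0; i < argc; i++)
    {
      const char *arg = argv[i];

      if (arg[0] == '-' && arg[1] != '\0')
        {
          /* Skip the switch, and its argument if it has one.  */
          if (switch_takes_argument (arg + 1))
            i++;
          continue;
        }

      files[n++] = arg;
    }

  return n;
}
```

Without the "G" test, the "8" in "-G 8 foo.adb" would be collected as a
second source file, which is the failure reported in the PR.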

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-11-20  Vincent Celier  cel...@adacore.com

PR ada/47500
* back_end.adb (Scan_Back_End_Switches): Skip switch -G and
its argument.

Index: back_end.adb
===
--- back_end.adb(revision 217828)
+++ back_end.adb(working copy)
@@ -232,9 +232,10 @@
  Last  : constant Natural  := Switch_Last (Switch_Chars);
 
   begin
- --  Skip -o or internal GCC switches together with their argument
+ --  Skip -o, -G or internal GCC switches together with their argument.
 
  if Switch_Chars (First .. Last) = "o"
+   or else Switch_Chars (First .. Last) = "G"
or else Is_Internal_GCC_Switch (Switch_Chars)
  then
 Next_Arg := Next_Arg + 1;


Re: [PATCH, i386]: Fix PR 63966, inconsistent operand constraints compiling libcpp

2014-11-20 Thread Uros Bizjak
On Wed, Nov 19, 2014 at 9:59 PM, Uros Bizjak ubiz...@gmail.com wrote:
 Hello!

 libcpp/lex.c includes ../gcc/config/i386/cpuid.h, and is picked up
 by the system compiler during stage1. Recently, cpuid.h was changed to
 account for %ebx changes and now uses the "b" asm constraint for i686 even
 with __PIC__.

Attached patch is what I have committed to mainline SVN.

2014-11-20  Uros Bizjak  ubiz...@gmail.com

PR target/63966
* lex.c [__i386__ || __x86_64__]: Compile special SSE functions
only for (__GNUC__ >= 5 || !defined(__PIC__)).

Bootstrapped on x86_64-linux-gnu, Fedora 20 and CentOS 5.11.

Uros.

Index: lex.c
===
--- lex.c   (revision 217830)
+++ lex.c   (working copy)
@@ -270,7 +270,7 @@
extensions used, so SSE4.2 executables cannot run on machines that
don't support that extension.  */

-#if (GCC_VERSION >= 4005) && (defined(__i386__) || defined(__x86_64__)) && !(defined(__sun__) && defined(__svr4__))
+#if (GCC_VERSION >= 4005) && (__GNUC__ >= 5 || !defined(__PIC__)) && (defined(__i386__) || defined(__x86_64__)) && !(defined(__sun__) && de

 /* Replicated character data to be shared between implementations.
Recall that outside of a context with vector support we can't


Re: [PATCH] driver: ignore SIGINT while waiting on subprocesses to finish

2014-11-20 Thread Patrick Palka
On Tue, Nov 18, 2014 at 11:14 AM, Michael Matz m...@suse.de wrote:
 Hi,

 On Mon, 17 Nov 2014, Richard Biener wrote:

 This means I can no longer interrupt a compile that is running too long?

 No, that's not what it means, cc1 will also get the SIGINT.

 You should instead debug the actual compiler, not the driver.

 -wrapper is specifically also for invoking cc1 with gdb from the driver
 (that's the usecase documented with -wrapper), so it better should work as
 intended.  I don't know what problems Patrick had with that, though.  For
 me gcc -wrapper gdb,--args works as expected (as in ^C interrupts cc1
 returning to gdb).

Yes it does for me too. But pressing ^C in gdb while cc1 is not
running (by accident or with intention, e.g. pressing ^C to quickly
clear the command prompt) will kill the driver and gdb after it. It's
not a huge problem but it does cause some inconvenience for users of
-wrapper gdb.
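
For readers unfamiliar with the technique in the subject line, here is a
POSIX sketch of a parent that ignores SIGINT only while it waits for a
child. It is an illustration under assumptions, not the code from the
patch: run_and_wait and example_child are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical child body, used for illustration only.  */
int
example_child (void)
{
  return 7;
}

/* Run CHILD_MAIN in a subprocess and wait for it.  While waiting, the
   parent ignores SIGINT; the previous disposition is restored afterwards.
   Return the child's exit status, or -1 on failure or abnormal exit.  */
int
run_and_wait (int (*child_main) (void))
{
  struct sigaction ignore, saved;
  int status = 0;
  pid_t pid;

  assert (child_main != NULL);

  ignore.sa_handler = SIG_IGN;
  sigemptyset (&ignore.sa_mask);
  ignore.sa_flags = 0;
  if (sigaction (SIGINT, &ignore, &saved) != 0)
    return -1;

  pid = fork ();
  if (pid == 0)
    {
      /* Child: restore the original disposition so ^C still reaches it.  */
      sigaction (SIGINT, &saved, NULL);
      _exit (child_main ());
    }

  if (pid > 0)
    while (waitpid (pid, &status, 0) < 0)
      if (errno != EINTR)
        {
          pid = -1;
          break;
        }

  /* Parent: go back to the previous SIGINT handling.  */
  sigaction (SIGINT, &saved, NULL);

  if (pid < 0 || !WIFEXITED (status))
    return -1;
  return WEXITSTATUS (status);
}
```

Installing SIG_IGN before fork and undoing it in the child avoids a window
in which ^C could still kill the parent between fork and waitpid.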



 Ciao,
 Michael.


Re: [PATCH, ifcvt] Fix PR63917

2014-11-20 Thread H.J. Lu
On Thu, Nov 20, 2014 at 1:48 AM, Zhenqiang Chen zhenqiang.c...@arm.com wrote:
 Hi,

 r217646 enhances ifcvt to handle cbranchcc4 instruction. But ifcvt does not
 strictly check the dependence before moving instructions before IF. Then
 some instructions, which clobber CC, are inserted before the cbranchcc4
 instruction.

 For the case in the patch, ifcvt transfers code from

5: r87:SI=r117:SI
22: pc={(flags:CCGOC>=0)?L26:pc}
25: {r87:SI=-r117:SI;clobber flags:CC;}

 to
5: r87:SI=r117:SI
   136: {r145:SI=-r117:SI;clobber flags:CC;} // CC is clobbered
   137: r87:SI={(flags:CCGOC<0)?r145:SI:r117:SI}

 The patch skips moving insns, which clobber CC, before cbranchcc4.
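
The check can be sketched on a toy model in C. This is only an
illustration under assumptions: struct toy_insn and
can_move_before_cc_user are hypothetical and much simpler than the
clobber_cc_p / use_cc_p functions named in the ChangeLog.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of an insn: only the property the check needs.  */
struct toy_insn
{
  bool clobbers_cc;   /* e.g. {r145:SI=-r117:SI; clobber flags:CC;} */
};

/* Return true if all N insns may be moved in front of a branch that
   reads the condition-code register.  */
bool
can_move_before_cc_user (const struct toy_insn *insns, size_t n)
{
  assert (insns != NULL || n == 0);

  for (size_t i = 0; i < n; i++)
    if (insns[i].clobbers_cc)
      return false;   /* moving it would destroy the flags the branch tests */

  return true;
}
```

In the example above, insn 25 (the negation) clobbers flags, so in this
model it would not be moved in front of the conditional on flags.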

 Bootstrap and no make check regression on X86-64 and i686.
 All the failed cases in PR63917 PASS.

 OK for trunk?
 Thanks!
 -Zhenqiang

 ChangeLog:
 2014-11-20  Zhenqiang Chen  zhenqiang.c...@arm.com

 PR rtl-optimization/63917
 * ifcvt.c (clobber_cc_p, use_cc_p): New functions.
 (noce_process_if_block, check_cond_move_block): Check CC references.

 testsuite/ChangeLog:
 2014-11-20  Zhenqiang Chen  zhenqiang.c...@arm.com

 * gcc.target/i386/floatsitf.c: New test.


Why do you need a new testcase?  There are many failures with the
existing testcases.

-- 
H.J.


Re: [PATCH x86, PR60451] Expand even/odd permutation using pack insn.

2014-11-20 Thread Uros Bizjak
On Thu, Nov 20, 2014 at 12:36 PM, Evgeny Stupachenko evstu...@gmail.com wrote:
 Hi,

 The patch expands even/odd permutations using:
 and, and, pack in the even case
 shift, shift, pack in the odd case

 instead of the current "pshufb, pshufb, or" sequence or a big set of unpack insns.

 AVX2/CORE bootstrap and make check passed.
 expensive tests are in progress

 Is it ok for trunk?

 Evgeny

 2014-11-20  Evgeny Stupachenko  evstu...@gmail.com

 gcc/testsuite
 PR target/60451
 * gcc.target/i386/pr60451.c: New.

 gcc/
 PR target/60451
 * config/i386/i386.c (expand_vec_perm_even_odd_pack): New.
 (expand_vec_perm_even_odd_1): Add new expand for SSE cases,
 replace with for AVX2 cases.
 (ix86_expand_vec_perm_const_1): Add new expand.

OK with a couple of small adjustments below.

Thanks,
Uros.

 +/* A subroutine of expand_vec_perm_even_odd_1.  Implement extract-even
 +   and extract-odd permutations of two V16QI, V8HI, V16HI or V32QI operands
 +   with two "and" and "pack" or two "shift" and "pack" insns.  We should
 +   have already failed all two instruction sequences.  */
 +
 +static bool
 +expand_vec_perm_even_odd_pack (struct expand_vec_perm_d *d)
 +{
 +  rtx op, dop0, dop1, t, rperm[16];
 +  unsigned i, odd, c, s, nelt = d->nelt;
 +  bool end_perm = false;
 +  machine_mode half_mode;
 +  rtx (*gen_and) (rtx, rtx, rtx);
 +  rtx (*gen_pack) (rtx, rtx, rtx);
 +  rtx (*gen_shift) (rtx, rtx, rtx);
 +
 +  /* Required for "pack".  */
 +  if (!TARGET_SSE4_2 || d->one_operand_p)
 +    return false;
 +
 +  /* Only V8HI, V16QI, V16HI and V32QI modes are more profitable than general
 +     shuffles.  */
 +  if (d->vmode == V8HImode)

Use switch, as proposed by Jakub.

 +    {
 +      c = 0xffff;
 +      s = 16;
 +      half_mode = V4SImode;
 +      gen_and = gen_andv4si3;
 +      gen_pack = gen_sse4_1_packusdw;
 +      gen_shift = gen_lshrv4si3;
 +    }
 +  else if (d->vmode == V16QImode)
 +    {
 +      c = 0xff;
 +      s = 8;
 +      half_mode = V8HImode;
 +      gen_and = gen_andv8hi3;
 +      gen_pack = gen_sse2_packuswb;
 +      gen_shift = gen_lshrv8hi3;
 +    }
 +  else if (d->vmode == V16HImode)
 +    {
 +      c = 0xffff;
 +      s = 16;
 +      half_mode = V8SImode;
 +      gen_and = gen_andv8si3;
 +      gen_pack = gen_avx2_packusdw;
 +      gen_shift = gen_lshrv8si3;
 +      end_perm = true;
 +    }
 +  else if (d->vmode == V32QImode)
 +    {
 +      c = 0xff;
 +      s = 8;
 +      half_mode = V16HImode;
 +      gen_and = gen_andv16hi3;
 +      gen_pack = gen_avx2_packuswb;
 +      gen_shift = gen_lshrv16hi3;
 +      end_perm = true;
 +    }
 +  else
 +    return false;
 +
 +  /* Check that permutation is even or odd.  */
 +  odd = d->perm[0];
 +  if (odd != 0 && odd != 1)

if (odd > 1)
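
The shorter test works because odd is declared unsigned in the patch, so
"greater than 1" and "neither 0 nor 1" select the same values. A small
sketch, with hypothetical function names, makes the equivalence explicit:

```c
#include <assert.h>
#include <stdbool.h>

/* The condition as written in the patch.  */
bool
rejects_long_form (unsigned odd)
{
  return odd != 0 && odd != 1;
}

/* The condition suggested in review; valid because ODD is unsigned.  */
bool
rejects_short_form (unsigned odd)
{
  return odd > 1;
}
```

With a signed variable the two forms would differ for negative values, so
the simplification depends on the declared type.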

 +    return false;
 +
 +  for (i = 1; i < nelt; ++i)
 +    if (d->perm[i] != 2 * i + odd)
 +      return false;
 +
 +  if (d->testing_p)
 +    return true;
 +
 +  dop0 = gen_reg_rtx (half_mode);
 +  dop1 = gen_reg_rtx (half_mode);
 +  if (odd == 0)
 +    {
 +      for (i = 0; i < nelt / 2; rperm[i++] = GEN_INT (c));

Please write above as:

 for (i = 0; i < nelt / 2; i++)
   rperm[i] = GEN_INT (c);

 +      t = gen_rtx_CONST_VECTOR (half_mode, gen_rtvec_v (nelt / 2, rperm));
 +      t = force_reg (half_mode, t);
 +      emit_insn (gen_and (dop0, t, gen_lowpart (half_mode, d->op0)));
 +      emit_insn (gen_and (dop1, t, gen_lowpart (half_mode, d->op1)));
 +    }
 +  else
 +    {
 +      emit_insn (gen_shift (dop0,
 +                            gen_lowpart (half_mode, d->op0),
 +                            GEN_INT (s)));
 +      emit_insn (gen_shift (dop1,
 +                            gen_lowpart (half_mode, d->op1),
 +                            GEN_INT (s)));
 +    }
 +  /* In AVX2 for 256 bit case we need to permute pack result.  */
 +  if (TARGET_AVX2 && end_perm)
 +    {
 +      op = gen_reg_rtx (d->vmode);
 +      t = gen_reg_rtx (V4DImode);
 +      emit_insn (gen_pack (op, dop0, dop1));
 +      emit_insn (gen_avx2_permv4di_1 (t, gen_lowpart (V4DImode, op), const0_rtx,
 +                                      const2_rtx, const1_rtx, GEN_INT (3)));
 +      emit_move_insn (d->target, gen_lowpart (d->vmode, t));
 +    }
 +  else
 +    emit_insn (gen_pack (d->target, dop0, dop1));
 +
 +  return true;
 +}
 +
  /* A subroutine of ix86_expand_vec_perm_builtin_1.  Implement extract-even
 and extract-odd permutations.  */

 @@ -48393,6 +48503,8 @@ expand_vec_perm_even_odd_1 (struct
 expand_vec_perm_d *d, unsigned odd)
gcc_unreachable ();

  case V8HImode:
 +  if (TARGET_SSE4_2)
 +   return expand_vec_perm_even_odd_pack (d);
if (TARGET_SSSE3 && !TARGET_SLOW_PSHUFB)

"else if" in the above line, to be consistent with "else" below.

 return expand_vec_perm_pshufb2 (d);
else
 @@ -48416,6 +48528,8 @@ expand_vec_perm_even_odd_1 (struct
 expand_vec_perm_d *d, unsigned odd)
break;

  case V16QImode:
 +  if (TARGET_SSE4_2)
 +   return expand_vec_perm_even_odd_pack (d);
   

Re: LTO streaming of TARGET_OPTIMIZE_NODE

2014-11-20 Thread Bernd Schmidt

On 11/13/2014 05:06 AM, Jan Hubicka wrote:

this patch adds infrastructure for proper streaming and merging of
TREE_TARGET_OPTION.


This breaks the offloading path via LTO since it introduces an 
incompatibility in LTO format between host and offload machine.


A very quick patch to fix it is below - the OpenACC testcase I was using 
seems to be working again with this. Thoughts, suggestions?



Bernd

diff --git a/gcc/lto-streamer-out.c b/gcc/lto-streamer-out.c
index be041e9..3c4b8c9 100644
--- a/gcc/lto-streamer-out.c
+++ b/gcc/lto-streamer-out.c
@@ -65,7 +65,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "streamer-hooks.h"
 #include "cfgloop.h"
 #include "builtins.h"
-
+#include "lto-section-names.h"
 
 static void lto_write_tree (struct output_block*, tree, bool);
 
@@ -944,7 +944,9 @@ hash_tree (struct streamer_tree_cache_d *cache, hash_map<tree, hashval_t> *map,
 hstate.add (TRANSLATION_UNIT_LANGUAGE (t),
 			strlen (TRANSLATION_UNIT_LANGUAGE (t)));
 
-  if (CODE_CONTAINS_STRUCT (code, TS_TARGET_OPTION))
+  if (CODE_CONTAINS_STRUCT (code, TS_TARGET_OPTION)
+      /* We don't stream these when passing things to a different target.  */
+      && strcmp (section_name_prefix, LTO_SECTION_NAME_PREFIX) == 0)
 hstate.add_wide_int (cl_target_option_hash (TREE_TARGET_OPTION (t)));
 
   if (CODE_CONTAINS_STRUCT (code, TS_OPTIMIZATION))
diff --git a/gcc/tree-streamer-in.c b/gcc/tree-streamer-in.c
index a2a2382..88d36d3 100644
--- a/gcc/tree-streamer-in.c
+++ b/gcc/tree-streamer-in.c
@@ -514,8 +514,10 @@ unpack_value_fields (struct data_in *data_in, struct bitpack_d *bp, tree expr)
 	vec_safe_grow (CONSTRUCTOR_ELTS (expr), length);
 }
 
+#ifndef ACCEL_COMPILER
   if (CODE_CONTAINS_STRUCT (code, TS_TARGET_OPTION))
 cl_target_option_stream_in (data_in, bp, TREE_TARGET_OPTION (expr));
+#endif
 
   if (code == OMP_CLAUSE)
 unpack_ts_omp_clause_value_fields (data_in, bp, expr);
@@ -779,7 +781,9 @@ lto_input_ts_function_decl_tree_pointers (struct lto_input_block *ib,
   DECL_VINDEX (expr) = stream_read_tree (ib, data_in);
   /* DECL_STRUCT_FUNCTION is loaded on demand by cgraph_get_body.  */
   DECL_FUNCTION_PERSONALITY (expr) = stream_read_tree (ib, data_in);
+#ifndef ACCEL_COMPILER
   DECL_FUNCTION_SPECIFIC_TARGET (expr) = stream_read_tree (ib, data_in);
+#endif
   DECL_FUNCTION_SPECIFIC_OPTIMIZATION (expr) = stream_read_tree (ib, data_in);
 
   /* If the file contains a function with an EH personality set,
diff --git a/gcc/tree-streamer-out.c b/gcc/tree-streamer-out.c
index b959454..fca101e 100644
--- a/gcc/tree-streamer-out.c
+++ b/gcc/tree-streamer-out.c
@@ -47,6 +47,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-streamer.h"
 #include "data-streamer.h"
 #include "streamer-hooks.h"
+#include "lto-section-names.h"
 
 /* Output the STRING constant to the string
table in OB.  Then put the index onto the INDEX_STREAM.  */
@@ -463,7 +464,9 @@ streamer_pack_tree_bitfields (struct output_block *ob,
   if (CODE_CONTAINS_STRUCT (code, TS_CONSTRUCTOR))
 bp_pack_var_len_unsigned (bp, CONSTRUCTOR_NELTS (expr));
 
-  if (CODE_CONTAINS_STRUCT (code, TS_TARGET_OPTION))
+  if (CODE_CONTAINS_STRUCT (code, TS_TARGET_OPTION)
+      /* Don't stream these when passing things to a different target.  */
+      && strcmp (section_name_prefix, LTO_SECTION_NAME_PREFIX) == 0)
 cl_target_option_stream_out (ob, bp, TREE_TARGET_OPTION (expr));
 
   if (code == OMP_CLAUSE)
@@ -678,7 +681,9 @@ write_ts_function_decl_tree_pointers (struct output_block *ob, tree expr,
   stream_write_tree (ob, DECL_VINDEX (expr), ref_p);
   /* DECL_STRUCT_FUNCTION is handled by lto_output_function.  */
   stream_write_tree (ob, DECL_FUNCTION_PERSONALITY (expr), ref_p);
-  stream_write_tree (ob, DECL_FUNCTION_SPECIFIC_TARGET (expr), ref_p);
+  /* Don't stream these when passing things to a different target.  */
+  if (strcmp (section_name_prefix, LTO_SECTION_NAME_PREFIX) == 0)
+stream_write_tree (ob, DECL_FUNCTION_SPECIFIC_TARGET (expr), ref_p);
   stream_write_tree (ob, DECL_FUNCTION_SPECIFIC_OPTIMIZATION (expr), ref_p);
 }
 


Another ptx offloading patch

2014-11-20 Thread Bernd Schmidt
Now that I've managed to put together and test all the submitted OpenACC 
patches I found there was one piece missing. The problem is that omp-low 
on the host likes to generate function names like _main._omp_fn. On 
ptx, the dot is not allowed in identifiers, so we have to rewrite this 
to use a dollar sign.
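
The rewriting itself is simple. A minimal C sketch, assuming a plain
in-place character substitution (the function name is hypothetical; the
patch does this in lto-partition.c through maybe_rewrite_identifier and
validize_symbol_for_target, whose bodies are not shown here):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Replace every '.' in NAME with '$', in place.  Return true if NAME was
   changed, so that the caller knows a renamed symbol must be recorded.  */
bool
rewrite_dots_to_dollars (char *name)
{
  bool changed = false;

  assert (name != NULL);

  for (; *name != '\0'; name++)
    if (*name == '.')
      {
        *name = '$';
        changed = true;
      }

  return changed;
}
```

For example, "_main._omp_fn" becomes "_main$_omp_fn", and a second call on
the result reports that nothing changed.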


The patch below does this at the lto-read stage. Bootstrapped on 
x86_64-linux, ok if testing is successful?



Bernd
commit 26b41de43c6db6e2368a9511c589c433b1e49c96
Author: Bernd Schmidt ber...@codesourcery.com
Date:   Wed Nov 19 21:47:59 2014 +0100

Renaming for invalid symbols when reading LTO.

	* cgraph.h (clone_function_name_1): Declare.
	* cgraphclones.c (clone_function_name_1): New function.
	(clone_function_name): Use it.
	* lto-partition.c: Include stringpool.h.
	(must_not_rename, maybe_rewrite_identifier,
	validize_symbol_for_target): New static functions.
	(privatize_symbol_name): Use must_not_rename.
	(promote_symbol): Call validize_symbol_for_target.
	(lto_promote_cross_file_statics): Likewise.
	(lto_promote_statics_nonwpa): Likewise.

diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index a5c5f56..7be6413 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -2150,6 +2150,7 @@ basic_block init_lowered_empty_function (tree, bool);
 
 /* In cgraphclones.c  */
 
+tree clone_function_name_1 (const char *, const char *);
 tree clone_function_name (tree decl, const char *);
 
 void tree_function_versioning (tree, tree, vec<ipa_replace_map *, va_gc> *,
diff --git a/gcc/cgraphclones.c b/gcc/cgraphclones.c
index 086dd92..1b7d8d2 100644
--- a/gcc/cgraphclones.c
+++ b/gcc/cgraphclones.c
@@ -506,19 +506,19 @@ cgraph_node::create_clone (tree decl, gcov_type gcov_count, int freq,
   return new_node;
 }
 
-/* Return a new assembler name for a clone of DECL with SUFFIX.  */
-
 static GTY(()) unsigned int clone_fn_id_num;
 
+/* Return a new assembler name for a clone with SUFFIX of a decl named
+   NAME.  */
+
 tree
-clone_function_name (tree decl, const char *suffix)
+clone_function_name_1 (const char *name, const char *suffix)
 {
-  tree name = DECL_ASSEMBLER_NAME (decl);
-  size_t len = IDENTIFIER_LENGTH (name);
+  size_t len = strlen (name);
   char *tmp_name, *prefix;
 
   prefix = XALLOCAVEC (char, len + strlen (suffix) + 2);
-  memcpy (prefix, IDENTIFIER_POINTER (name), len);
+  memcpy (prefix, name, len);
   strcpy (prefix + len + 1, suffix);
 #ifndef NO_DOT_IN_LABEL
   prefix[len] = '.';
@@ -531,6 +531,16 @@ clone_function_name (tree decl, const char *suffix)
   return get_identifier (tmp_name);
 }
 
+/* Return a new assembler name for a clone of DECL with SUFFIX.  */
+
+tree
+clone_function_name (tree decl, const char *suffix)
+{
+  tree name = DECL_ASSEMBLER_NAME (decl);
+  return clone_function_name_1 (IDENTIFIER_POINTER (name), suffix);
+}
+
+
 /* Create callgraph node clone with new declaration.  The actual body will
be copied later at compilation stage.
 
diff --git a/gcc/gcc.c b/gcc/gcc.c
index 80dc87c..c49401b 100644
--- a/gcc/gcc.c
+++ b/gcc/gcc.c
@@ -4238,14 +4238,14 @@ process_command (unsigned int decoded_options_count,
 }
 
   gcc_assert (!IS_ABSOLUTE_PATH (tooldir_base_prefix));
-  tooldir_prefix2 = concat (tooldir_base_prefix, spec_host_machine,
+  tooldir_prefix2 = concat (tooldir_base_prefix, spec_machine,
 			dir_separator_str, NULL);
 
   /* Look for tools relative to the location from which the driver is
  running, or, if that is not available, the configured prefix.  */
   tooldir_prefix
 = concat (gcc_exec_prefix ? gcc_exec_prefix : standard_exec_prefix,
-	  spec_host_machine, dir_separator_str, spec_version,
+	  spec_machine, dir_separator_str, spec_version,
 	  accel_dir_suffix, dir_separator_str, tooldir_prefix2, NULL);
   free (tooldir_prefix2);
 
diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c
index 65f0582..ac10c90 100644
--- a/gcc/lto/lto-partition.c
+++ b/gcc/lto/lto-partition.c
@@ -49,6 +49,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "ipa-inline.h"
 #include "ipa-utils.h"
 #include "lto-partition.h"
+#include "stringpool.h"
 
 vec<ltrans_partition> ltrans_partitions;
 
@@ -775,21 +776,12 @@ lto_balanced_map (int n_lto_partitions)
   free (order);
 }
 
-/* Mangle NODE symbol name into a local name.  
-   This is necessary to do
-   1) if two or more static vars of same assembler name
-  are merged into single ltrans unit.
-   2) if prevoiusly static var was promoted hidden to avoid possible conflict
-  with symbols defined out of the LTO world.
-*/
+/* Return true if we must not change the name of the NODE.  The name as
+   extracted from the corresponding decl should be passed in NAME.  */
 
 static bool
-privatize_symbol_name (symtab_node *node)
+must_not_rename (symtab_node *node, const char *name)
 {
-  tree decl = node->decl;
-  const char *name = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl));
-  cgraph_node *cnode;
-
   /* Our renaming machinery do not 

Re: OpenACC middle end changes

2014-11-20 Thread Bernd Schmidt

On 11/20/2014 07:52 AM, Jakub Jelinek wrote:

On Thu, Nov 20, 2014 at 03:19:11AM +0100, Bernd Schmidt wrote:

Thomas had apparently already pointed out an issue with the new gomp_target
class (there are multiple similar types of statements we want to handle with
OpenACC, they have different codes but we want to have function pointers
operating on any of them) back in July. That seems to have been ignored. By
necessity, some of David's changes are reverted in the following patch.


I thought the agreement was to use GIMPLE_OMP_TARGET gimple_code and just
two new gimple_omp_target_kind GF_* flags.


If that's the case I'll leave it to Thomas to make these changes. At the 
moment I'm just trying to put together all the pieces into versions that 
apply to trunk and can be made to work together.



Bernd




Re: gimple-classes-v2-option-3 git branch committed to svn trunk as r217787

2014-11-20 Thread Richard Biener
On Thu, Nov 20, 2014 at 12:05 AM, Andrew MacLeod amacl...@redhat.com wrote:
 On 11/19/2014 05:24 PM, David Malcolm wrote:

 On Wed, 2014-11-19 at 22:36 +0100, Richard Biener wrote:

 On November 19, 2014 10:09:56 PM CET, Andrew MacLeod
 amacl...@redhat.com wrote:

 On 11/19/2014 03:43 PM, Richard Biener wrote:

 On November 19, 2014 8:26:23 PM CET, Andrew MacLeod

 amacl...@redhat.com wrote:

 On 11/19/2014 01:12 PM, David Malcolm wrote:

 (A) could become:

  greturn *stmt = gsi->as_a_greturn ();

 (B) could become:

  stmt = gsi->dyn_cast <gcall *> ();
  if (!stmt)
 or:

  stmt = gsi->dyn_cast_gcall ();
  if (!stmt)

 or maybe:

  stmt = gsi->is_a_gcall ();
  if (!stmt)

 An earlier version of my patches had casting methods within the
 gimple_statement_base class, which were rejected; the above

 proposals

 would instead put them within gimple_stmt_iterator.

 I would like all gsi routines to be consistent, not a mix of

 functions

 and methods.
 so something like

 stmt = gsi_as_call (gsi);
 stmt = gsi_dyn_call (gsi);

 or we need to change gsi_stmt() and friends into methods and access
 them as gsi->stmt ()..  which is possibly better, but that much more
 intrusive again (another 2000+ locations).
 If we switched to methods everywhere for gsi, I'd prefer something
 like

 gsi->as_a_call ()
 gsi->is_a_call ()
 gsi->dyn_cast_call ()

 I think it's more readable... and it removes a dependency on the
 implementation.. so if we ever decide to change the name of 'gcall'
 down the road to using a namespace, and make it gimple::call or
 whatever, we won't have to change every single gsi-> location which
 has a templated use of the type.

 I also think this sort of thing could probably wait until next
 stage 1..

 my 2 cents...

 Why not as_a <gassign *> (*gsi)?  It would
 Add operator* to gsi of course.

 Richard.


 I could live with that form too.

 we often have an instance of gimple_stmt_iterator rather than a pointer
 to it, so wouldn't operator gimple *() to implicitly call gsi_stmt()
 when needed work better? (or operator gimple () before the next
 change) ..

 Not sure.  The * matches how iterators work in STL...

 Note that for the cases where we pass a pointer to an iterator we can
 change those to use references to avoid writing **gsi.

 Richard.

 Andrew

 I had a go at adding an operator * to gimple_stmt_iterator, using it
 everywhere that we do an as_a or dyn_cast on the result of a
 gsi_stmt, to abbreviate the gsi_stmt call down to one character.

 Patch attached; only lightly smoketested; am posting it for the sake of
 discussion.

 I don't think this API will make the non-C++-fans happier; I think the
 objection to the work I just merged is that it's adding more C++ than
 those people are comfortable with.

 So although the attached patch makes things shorter (good), it's taking
 things in a more C++ direction (questionable).  I'd hoped to paper
 over the C++ somewhat.

 I suspect that any API which requires the use of <> characters within the
 implementation of a gimple pass to mean a template is going to give
 those less happy with C++ a visceral "ugh" reaction.  I wonder if
 there's a way to spell these things that's concise and which doesn't
 involve <>?

 wasn't that my last thought?  is_a_call(), as_a_call() and dyn_cast_call ()
 ?

 I think lack of <> in identifiers helps us old brains parse faster :-)  <>
 are like ()... many many years of causing a certain kind of break in mental
 processing. I'm accustomed to single <> these days, but once you get into
 multiple <>'s I quickly lose track.  I find the same thing with ()... hence
 I'm not a lisp fan :-)

I think we want to have a consistent style across GCC even if seen as
ugly to some people.  Thus having (member) functions for conversion in
some cases and as_a <> templates in others is bad.  C++ was supposed to
make grokking GCC easier for newbies - this is exactly making it harder
(not that I believe in this story at all...)

 I don't think 'operator *' c++ifies it too much, but I still think operator
 gimple() would be easier...  no extra character at all, and no odd looking
 dereference of a non-pointer object or double dereference of a pointer.  I
 can't think of how that could get us into trouble... it'll always map to the
 stmt the iterator currently points to.

I dislike conversion operators.  Why is 'operator *' bad?  It's exactly
how iterators are supposed to work - after all the gsi stuff was modeled
after STL iterators!

So that's a definitive no from me to is_a_call () as_a_call () etc.

Richard.

 Andrew


Re: gimple-classes-v2-option-3 git branch committed to svn trunk as r217787

2014-11-20 Thread Richard Biener
On Wed, Nov 19, 2014 at 11:24 PM, David Malcolm dmalc...@redhat.com wrote:
 On Wed, 2014-11-19 at 22:36 +0100, Richard Biener wrote:
 On November 19, 2014 10:09:56 PM CET, Andrew MacLeod amacl...@redhat.com 
 wrote:
 On 11/19/2014 03:43 PM, Richard Biener wrote:
  On November 19, 2014 8:26:23 PM CET, Andrew MacLeod
 amacl...@redhat.com wrote:
  On 11/19/2014 01:12 PM, David Malcolm wrote:
 
  (A) could become:
 
  greturn *stmt = gsi->as_a_greturn ();
 
  (B) could become:
 
  stmt = gsi->dyn_cast <gcall *> ();
  if (!stmt)
  or:
 
  stmt = gsi->dyn_cast_gcall ();
  if (!stmt)
 
  or maybe:
 
  stmt = gsi->is_a_gcall ();
  if (!stmt)
 
  An earlier version of my patches had casting methods within the
  gimple_statement_base class, which were rejected; the above
 proposals
  would instead put them within gimple_stmt_iterator.
 
  I would like all gsi routines to be consistent, not a mix of
 functions
  and methods.
  so something like
 
  stmt = gsi_as_call (gsi);
  stmt = gsi_dyn_call (gsi);
 
  or we need to change gsi_stmt() and friends into methods and access
  them as gsi->stmt ()..  which is possibly better, but that much more
  intrusive again (another 2000+ locations).
  If we switched to methods everywhere for gsi, I'd prefer something
  like
  gsi->as_a_call ()
  gsi->is_a_call ()
  gsi->dyn_cast_call ()
 
  I think it's more readable... and it removes a dependency on the
  implementation.. so if we ever decide to change the name of 'gcall'
  down the road to using a namespace, and make it gimple::call or
  whatever, we won't have to change every single gsi-> location which
  has a templated use of the type.
 
  I also think this sort of thing could probably wait until next
  stage 1..
 
  my 2 cents...
  Why not as_a <gassign *> (*gsi)?  It would
  Add operator* to gsi of course.
 
  Richard.
 
 
 
 I could live with that form too.
 
 we often have an instance of gimple_stmt_iterator rather than a pointer
 to it, so wouldn't operator gimple *() to implicitly call gsi_stmt()
 when needed work better? (or operator gimple () before the next
 change) ..

 Not sure.  The * matches how iterators work in STL...

 Note that for the cases where we pass a pointer to an iterator we can change 
 those to use references to avoid writing **gsi.

 Richard.

 Andrew

 I had a go at adding an operator * to gimple_stmt_iterator, using it
 everywhere that we do an as_a or dyn_cast on the result of a
 gsi_stmt, to abbreviate the gsi_stmt call down to one character.

 Patch attached; only lightly smoketested; am posting it for the sake of
 discussion.

Looks good.  Note that

diff --git a/gcc/asan.c b/gcc/asan.c
index be28ede..d06d60c 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -1902,7 +1902,7 @@ instrument_builtin_call (gimple_stmt_iterator *iter)
 return false;

   bool iter_advanced_p = false;
-  gcall *call = as_a <gcall *> (gsi_stmt (*iter));
+  gcall *call = as_a <gcall *> (**iter);

should be fixed by making instrument_builtin_call take a reference
to the iterator so the above becomes

   gcall *call = as_a <gcall *> (*iter);

probably not possible in 100% of all cases (where we sometimes pass
NULL as the iterator pointer) but in most.
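
A small sketch of the suggestion, using toy stand-ins rather than GCC's real gimple classes: an iterator with `operator*`, and a function that takes the iterator by reference so one `*` is enough. The names `stmt`, `call_stmt`, `stmt_iterator`, `dyn_cast_sketch` and `is_call_at` are all hypothetical.

```cpp
// Toy statement hierarchy; these are not GCC's real types.
struct stmt { virtual ~stmt () {} };
struct call_stmt : stmt { int nargs = 0; };
struct return_stmt : stmt {};

// Toy iterator positioned on a single statement.
struct stmt_iterator
{
  stmt *ptr;
  // STL-style dereference: yields the statement the iterator points at.
  stmt *operator* () const { return ptr; }
};

// Checked downcast in the spirit of dyn_cast; returns null on mismatch.
template <typename T>
T
dyn_cast_sketch (stmt *s)
{
  return dynamic_cast<T> (s);
}

// Taking the iterator by reference avoids writing **iter at the use site.
bool
is_call_at (stmt_iterator &iter)
{
  return dyn_cast_sketch<call_stmt *> (*iter) != nullptr;
}
```

A function that must accept a null iterator cannot be converted this way, which matches the caveat above about the cases where NULL is passed.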

 I don't think this API will make the non-C++-fans happier; I think the
 objection to the work I just merged is that it's adding more C++ than
 those people are comfortable with.

How so?  It's already super-ugly in those views.  We decided to get C++.
Now we have it.  Now please make it AT LEAST CONSISTENT.

 So although the attached patch makes things shorter (good), it's taking
 things in a more C++ direction (questionable).  I'd hoped to paper
 over the C++ somewhat.

 I suspect that any API which requires the use of <> characters within the
 implementation of a gimple pass to mean a template is going to give
 those less happy with C++ a visceral "ugh" reaction.  I wonder if
 there's a way to spell these things that's concise and which doesn't
 involve <>?

Only if you drop as_a/is_a/dyn_cast everywhere.

Richard.


Re: bitmap fix for current

2014-11-20 Thread Richard Biener
On Thu, Nov 20, 2014 at 1:18 AM, Mike Stump mikest...@comcast.net wrote:
 On Nov 14, 2014, at 2:26 AM, Richard Biener richard.guent...@gmail.com 
 wrote:
 On Fri, Nov 14, 2014 at 2:10 AM, Jeff Law l...@redhat.com wrote:
 On 11/13/14 12:37, Mike Stump wrote:

 I was doing a merge, and it failed to even compile the runtime
 libraries due to checking in bitmap.  bitmap goes to remove set bits
 from the bitmap (the second hunk in a two hunk set), and it fails to
 update the current pointer.  That memory is freed and then
 reallocated and a new index is put into it, and then we fail a
 consistency check later on due to the mismatch between head->index
 and head->current->indx, because current was not properly maintained.
 This patch removes the old value of current when we remove what it
 points to from the bitmap.

 Was the calling code iterating through the bit with a form like

 EXECUTE_IF_SET_IN_BITMAP (something, 0, i, bi)
 {
   bitmap_clear_bit (something, i)
   [ ... whatever code we want to process i, ... ]
 }

 If so, that's the real issue and we'd really like to identify & fix any code
 that has that kind of structure.

 Nope, that doesn’t appear to be the problem.

 Indeed.  I can't see how this can have triggered:

  prev = elt->prev;
  if (prev)
    {
      prev->next = NULL;
      if (head->current->indx > prev->indx)
        {
          head->current = prev;
          head->indx = prev->indx;

 so if there was elt->prev then if current == elt current->indx should
 better be > prev->indx.

 Sth else must be wrong (and I doubt it's the above bogus use of
 bitmaps).

 So bitmap_ior_and_compl has an overly clever optimization to overwrite an 
 existing bitmap with the newly computed bitmap.  We write it over in place 
 and then at the end, we do:

   if (dst_elt)
 {
   changed = true;
   bitmap_elt_clear_from (dst, dst_elt);
 }

 which is all fine and good, however, notice that when we update a list with:

   0 1

 with:

 1 2

 we get:

 1 1 2

 and then we want to kill from the second 1 to the end.

 The problem is current points at the second 1, and because the update code 
 for current does:

   if (head->current->indx > prev->indx)
 {
   head->current = prev;
   head->indx = prev->indx;
 }

 and index is not greater (it is indeed unrelated to the other index), we 
 don’t update current.  So, even my patch was wrong, in that the two are 
 unrelated, so no comparison will help here.

 Curious the and and xor routine do this:

   /* Ensure that dst->current is valid.  */
   dst->current = dst->first;
   bitmap_elt_clear_from (dst, dst_elt);

 so, certainly the previous authors know of this type of problem.

 and_into almost seems wrong:

   if (a_elt)
 {
   changed = true;
   bitmap_elt_clear_from (a, a_elt);
 }

 as and can remove elements, but, they are saved by the code in 
 bitmap_elt_clear_from:

   if (head->current->indx > prev->indx)
 {
   head->current = prev;
   head->indx = prev->indx;
 }

 which kicks in since and cannot add any elements, it is purely subtractive.

 and_compl works as it does:

   /* Ensure that dst->current is valid.  */
   dst->current = dst->first;

 ior doesn’t reset current, and is broken.

 ior_and_compl doesn’t reset current and likewise, is broken.

 If these were _into varieties, they would have been ok.  But, they are not.

 I added checking code to ensure the current was in the bitmap at the end of 
 bitmap_elt_clear_from, and sure enough, it fired.

 So, next up, is there anything else that is supposed to save us in this case? 
  If not, Ok?
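
The failure mode described above can be sketched on a toy linked list. This is a simplification under stated assumptions, not GCC's bitmap code: `elt`, `list_head`, `clear_from` and `current_in_list` are hypothetical names, and elements are unlinked without being freed.

```cpp
// Toy element list with a cached "current" pointer, loosely modelled on
// the description above.
struct elt { unsigned indx; elt *prev; elt *next; };
struct list_head { elt *first; elt *current; unsigned indx; };

// Unlink E and everything after it.  The cached pointer is only moved
// back when its index is strictly greater than the new tail's index.
void
clear_from (list_head &h, elt *e)
{
  elt *prev = e->prev;
  if (prev)
    {
      prev->next = nullptr;
      if (h.current->indx > prev->indx)
        {
          h.current = prev;
          h.indx = prev->indx;
        }
    }
  else
    {
      h.first = nullptr;
      h.current = nullptr;
      h.indx = 0;
    }
}

// True if the cached pointer still refers to an element on the list.
bool
current_in_list (const list_head &h)
{
  for (const elt *p = h.first; p; p = p->next)
    if (p == h.current)
      return true;
  return false;
}
```

With elements indexed 1, 1, 2 and `current` on the second element, clearing from that element leaves `current` pointing at an unlinked node, because 1 is not greater than 1. Setting `current` to `first` before the call, as the and/xor routines quoted above do, keeps it valid.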

The bitmap_ior and bitmap_ior_and_compl hunks are ok.  Please leave
out the checking bits - they will be very much too expensive.

Thanks,
Richard.








Re: [PATCH] Disable an unsafe VRP transformation when -fno-strict-overflow is set

2014-11-20 Thread Richard Biener
On Thu, Nov 20, 2014 at 4:21 AM, Patrick Palka patr...@parcs.ath.cx wrote:
 VRP may simplify a conditional like i <= 5 to i == 5 if it is known that
 the lower bound of i's range is 5, e.g. [5, +INF].  But if the upper
 bound of i's range is also overflow infinity, i.e. [5, +INF(OVF)] then
 this transformation is only valid if -fstrict-overflow is in effect.
 Likewise for transforming i > 10 to i != 10 given i's range is
 [10, +INF(OVF)] and for transforming i >= 20 to i == 20 given i's range
 is [-INF(OVF), 20].

 This patch makes this transformation only get performed if strict
 overflow rules are in effect and potentially emits a -Wstrict-overflow=3
 warning when the transformation takes place.
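
A small sketch of why the simplification depends on the range being trustworthy. It is not VRP code; `le_matches_eq` and `le_matches_eq_on_range` are hypothetical helpers that compare the two forms of the conditional directly.

```cpp
// True when "i <= bound" and "i == bound" give the same answer for I.
bool
le_matches_eq (int i, int bound)
{
  return (i <= bound) == (i == bound);
}

// Checks the equivalence for every i in [lo, hi].  Keep HI below INT_MAX
// so the loop counter itself cannot overflow.
bool
le_matches_eq_on_range (int lo, int hi, int bound)
{
  for (int i = lo; i <= hi; ++i)
    if (!le_matches_eq (i, bound))
      return false;
  return true;
}
```

As long as i stays at or above the bound the two forms agree. If i can wrap around to a value below the bound, which is possible when signed overflow is defined to wrap, they disagree and the rewrite is no longer safe.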

 Bootstrap and regtesting in progress on x86_64-unknown-linux-gnu.  Does
 the patch look OK if there are no new regressions?

Ok.

Thanks,
Richard.

 gcc/
 * tree-vrp.c (test_for_singularity): New parameter
 strict_overflow_p.  Set *strict_overflow_p to true if signed
 overflow must be undefined for the return value to satisfy the
 conditional.
 (simplify_cond_using_ranges): Don't perform the simplification
 if it violates overflow rules.

 gcc/testsuite/
 * gcc.dg/no-strict-overflow-8.c: New test.
 ---
  gcc/testsuite/gcc.dg/no-strict-overflow-8.c | 25 +
  gcc/tree-vrp.c  | 57 
 +
  2 files changed, 74 insertions(+), 8 deletions(-)
  create mode 100644 gcc/testsuite/gcc.dg/no-strict-overflow-8.c

 diff --git a/gcc/testsuite/gcc.dg/no-strict-overflow-8.c 
 b/gcc/testsuite/gcc.dg/no-strict-overflow-8.c
 new file mode 100644
 index 0000000..11ef935
 --- /dev/null
 +++ b/gcc/testsuite/gcc.dg/no-strict-overflow-8.c
 @@ -0,0 +1,25 @@
 +/* { dg-do compile } */
 +/* { dg-options "-fno-strict-overflow -O2 -fdump-tree-optimized" } */
 +
 +/* We cannot fold i > 0 because p->a - p->b can be larger than INT_MAX
 +   and thus i can wrap.  Dual of Wstrict-overflow-18.c  */
 +
 +struct c { unsigned int a; unsigned int b; };
 +extern void bar (struct c *);
 +int
 +foo (struct c *p)
 +{
 +  int i;
 +  int sum = 0;
 +
 +  for (i = 0; i < p->a - p->b; ++i)
 +{
 +  if (i > 0)
 +   sum += 2;
 +  bar (p);
 +}
 +  return sum;
 +}
 +
 +/* { dg-final { scan-tree-dump "i_.* > 0" "optimized" } } */
 +/* { dg-final { cleanup-tree-dump "optimized" } } */
 diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
 index bcf4c2b..444af71 100644
 --- a/gcc/tree-vrp.c
 +++ b/gcc/tree-vrp.c
 @@ -9117,11 +9117,15 @@ simplify_bit_ops_using_ranges (gimple_stmt_iterator 
 *gsi, gimple stmt)
 a known value range VR.

 If there is one and only one value which will satisfy the
 -   conditional, then return that value.  Else return NULL.  */
 +   conditional, then return that value.  Else return NULL.
 +
 +   If signed overflow must be undefined for the value to satisfy
 +   the conditional, then set *STRICT_OVERFLOW_P to true.  */

  static tree
  test_for_singularity (enum tree_code cond_code, tree op0,
 - tree op1, value_range_t *vr)
 + tree op1, value_range_t *vr,
 + bool *strict_overflow_p)
  {
tree min = NULL;
tree max = NULL;
 @@ -9172,7 +9176,16 @@ test_for_singularity (enum tree_code cond_code, tree 
 op0,
  then there is only one value which can satisfy the condition,
  return that value.  */
   if (operand_equal_p (min, max, 0) && is_gimple_min_invariant (min))
 -   return min;
 +   {
 + if ((cond_code == LE_EXPR || cond_code == LT_EXPR)
 +  && is_overflow_infinity (vr->max))
 +   *strict_overflow_p = true;
 + if ((cond_code == GE_EXPR || cond_code == GT_EXPR)
 +  && is_overflow_infinity (vr->min))
 +   *strict_overflow_p = true;
 +
 + return min;
 +   }
  }
return NULL;
  }
 @@ -9252,9 +9265,12 @@ simplify_cond_using_ranges (gcond *stmt)
  able to simplify this conditional. */
   if (vr->type == VR_RANGE)
 {
 - tree new_tree = test_for_singularity (cond_code, op0, op1, vr);
 + enum warn_strict_overflow_code wc = 
 WARN_STRICT_OVERFLOW_CONDITIONAL;
 + bool sop = false;
 + tree new_tree = test_for_singularity (cond_code, op0, op1, vr, 
 &sop);

 - if (new_tree)
 + if (new_tree
 +  && (!sop || TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (op0))))
 {
   if (dump_file)
 {
 @@ -9275,16 +9291,30 @@ simplify_cond_using_ranges (gcond *stmt)
   fprintf (dump_file, "\n");
 }

 + if (sop && issue_strict_overflow_warning (wc))
 +   {
 + location_t location = input_location;
 + if (gimple_has_location (stmt))
 +   location = gimple_location (stmt);
 +
 + warning_at (location, OPT_Wstrict_overflow,
 + assuming signed overflow does 

Re: LTO streaming of TARGET_OPTIMIZE_NODE

2014-11-20 Thread Richard Biener
On Thu, 20 Nov 2014, Bernd Schmidt wrote:

 On 11/13/2014 05:06 AM, Jan Hubicka wrote:
  this patch adds infrastructure for proper streaming and merging of
  TREE_TARGET_OPTION.
 
 This breaks the offloading path via LTO since it introduces an incompatibility
 in LTO format between host and offload machine.
 
 A very quick patch to fix it is below - the OpenACC testcase I was using seems
 to be working again with this. Thoughts, suggestions?

The offload target needs to have the same target options as the host?

Are the offload functions marked somehow?  That is, can we avoid
setting TREE_TARGET_OPTION on them?  Or rather we need to have a
default TREE_TARGET_OPTION node for the offload target which we'd
need to set - how would you otherwise transfer different offload
target options to the offload compile?  How do you transfer
offload target options to the offload compile at all?

I think this just shows conceptual issues with the LTO approach...

Thanks,
Richard.


[PATCH] PR63426 Fix various signed integer overflows

2014-11-20 Thread Markus Trippelsdorf
Running the testsuite after bootstrap-ubsan on gcc112 shows several issues. See
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63426 for the full list.

This patch fixes several of them.
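
The recurring pattern in these fixes is to convert to an unsigned type before doing the arithmetic, so that wraparound is well defined. A minimal sketch of the "fits in a signed 16-bit immediate" check, assuming a 64-bit value; `fits_signed_16` is a hypothetical name, not a function from the patch.

```cpp
#include <cstdint>

// True if VALUE is in [-32768, 32767].  The cast happens before the
// addition, so the sum wraps modulo 2^64 instead of overflowing a signed
// type.
bool
fits_signed_16 (std::int64_t value)
{
  return static_cast<std::uint64_t> (value) + 0x8000u < 0x10000u;
}
```

Writing the cast around the whole sum instead, as in `(uint64_t) (value + 0x8000)`, performs the addition in the signed type first, which is undefined for values close to the maximum and is what the sanitizer reports.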

Tested on powerpc64-unknown-linux-gnu.

OK for trunk?

Thanks.

2014-11-20  Markus Trippelsdorf  mar...@trippelsdorf.de

* config/rs6000/constraints.md: Avoid signed integer overflows.
* config/rs6000/predicates.md: Likewise.
* config/rs6000/rs6000.c (num_insns_constant_wide): Likewise.
(includes_rldic_lshift_p): Likewise.
(includes_rldicr_lshift_p): Likewise. 
* emit-rtl.c (const_wide_int_htab_hash): Likewise.
* loop-iv.c (determine_max_iter): Likewise.
(iv_number_of_iterations): Likewise.
* tree-ssa-loop-ivopts.c (get_computation_cost_at): Likewise.
* varasm.c (get_section_anchor): Likewise.

diff --git a/gcc/config/rs6000/constraints.md b/gcc/config/rs6000/constraints.md
index 0e0e517d7a1d..3f12b07e4899 100644
--- a/gcc/config/rs6000/constraints.md
+++ b/gcc/config/rs6000/constraints.md
@@ -176,7 +176,7 @@
 (define_constraint "P"
   "constant whose negation is signed 16-bit constant"
   (and (match_code "const_int")
-   (match_test "(unsigned HOST_WIDE_INT) ((- ival) + 0x8000) < 0x10000")))
+   (match_test "((- (unsigned HOST_WIDE_INT) ival) + 0x8000) < 0x10000")))
 
 ;; Floating-point constraints
 
diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index 1767cbd7a11b..ea230a5b29a6 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -408,7 +408,7 @@
 (define_predicate "reg_or_sub_cint_operand"
   (if_then_else (match_code "const_int")
 (match_test "(unsigned HOST_WIDE_INT)
-  (- INTVAL (op) + (mode == SImode ? 0x80000000 : 0x80008000))
+  (- UINTVAL (op) + (mode == SImode ? 0x80000000 : 0x80008000))
  < (unsigned HOST_WIDE_INT) 0x100000000ll")
 (match_operand 0 "gpc_reg_operand")))
 
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 506daa1d31e7..a9604cf3fa97 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -5083,7 +5083,7 @@ int
 num_insns_constant_wide (HOST_WIDE_INT value)
 {
   /* signed constant loadable with addi */
-  if ((unsigned HOST_WIDE_INT) (value + 0x8000) < 0x10000)
+  if (((unsigned HOST_WIDE_INT) value + 0x8000) < 0x10000)
 return 1;
 
   /* constant loadable with addis */
@@ -16194,7 +16194,7 @@ includes_rldic_lshift_p (rtx shiftop, rtx andop)
 {
   if (GET_CODE (andop) == CONST_INT)
 {
-  HOST_WIDE_INT c, lsb, shift_mask;
+  unsigned HOST_WIDE_INT c, lsb, shift_mask;
 
   c = INTVAL (andop);
   if (c == 0 || c == ~0)
@@ -16233,7 +16233,7 @@ includes_rldicr_lshift_p (rtx shiftop, rtx andop)
 {
   if (GET_CODE (andop) == CONST_INT)
 {
-  HOST_WIDE_INT c, lsb, shift_mask;
+  unsigned HOST_WIDE_INT c, lsb, shift_mask;
 
   shift_mask = ~0;
   shift_mask <<= INTVAL (shiftop);
diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
index 04f677eb608d..9d60d42c01f8 100644
--- a/gcc/emit-rtl.c
+++ b/gcc/emit-rtl.c
@@ -203,7 +203,7 @@ static hashval_t
 const_wide_int_htab_hash (const void *x)
 {
   int i;
-  HOST_WIDE_INT hash = 0;
+  unsigned HOST_WIDE_INT hash = 0;
   const_rtx xr = (const_rtx) x;
 
   for (i = 0; i < CONST_WIDE_INT_NUNITS (xr); i++)
diff --git a/gcc/loop-iv.c b/gcc/loop-iv.c
index 8ea458c3fc53..f55cea2a9859 100644
--- a/gcc/loop-iv.c
+++ b/gcc/loop-iv.c
@@ -2311,7 +2311,7 @@ determine_max_iter (struct loop *loop, struct niter_desc 
*desc, rtx old_niter)
 }
 
   get_mode_bounds (desc->mode, desc->signed_p, desc->mode, &mmin, &mmax);
-  nmax = INTVAL (mmax) - INTVAL (mmin);
+  nmax = UINTVAL (mmax) - UINTVAL (mmin);
 
   if (GET_CODE (niter) == UDIV)
 {
@@ -2649,7 +2649,7 @@ iv_number_of_iterations (struct loop *loop, rtx_insn 
*insn, rtx condition,
  down = INTVAL (CONST_INT_P (iv0.base)
 ? iv0.base
 : mode_mmin);
- max = (up - down) / inc + 1;
+ max = (uint64_t) (up - down) / inc + 1;
  if (!desc->infinite
   && !desc->assumptions)
record_niter_bound (loop, max, false, true);
diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
index 4007e5483b27..fca18b6cdfe3 100644
--- a/gcc/tree-ssa-loop-ivopts.c
+++ b/gcc/tree-ssa-loop-ivopts.c
@@ -4183,7 +4183,7 @@ get_computation_cost_at (struct ivopts_data *data,
 
   if (cst_and_fits_in_hwi (cbase))
 {
-  offset = - ratio * int_cst_value (cbase);
+  offset = - ratio * (unsigned HOST_WIDE_INT) int_cst_value (cbase);
   cost = difference_cost (data,
  ubase, build_int_cst (utype, 0),
  symbol_present, var_present, offset,
diff --git a/gcc/varasm.c b/gcc/varasm.c
index 54611f8fd3f1..b93e2559843c 100644
--- a/gcc/varasm.c
+++ b/gcc/varasm.c
@@ -7188,7 +7188,7 @@ get_section_anchor (struct object_block 

RE: [PATCH, committed] Update Automake files

2014-11-20 Thread Bernd Edlinger
Hello Jan-Benedict,

 Hi!
 
 This patch updates the files taken from Automake.  Committed.
 
 MfG, JBG


the updated version of missing will confuse the gmp-4.3.2 configure
script if it is installed in-tree with contrib/download_prerequisites
and flex is not installed:

...
checking readline detected... no
checking for bison... (cached) /home/ed/gnu/gcc-5-20141116/missing bison -y
checking for flex... (cached) /home/ed/gnu/gcc-5-20141116/missing flex
checking lex output file root... configure: error: cannot find output from 
/home/ed/gnu/gcc-5-20141116/missing flex; giving up
make[3]: *** [config.status] Error 1
make[3]: Leaving directory `/home/ed/gnu/gcc-build-arm-linux-gnueabihf/gmp'
make[2]: *** [all-stage1-gmp] Error 2
make[2]: Leaving directory `/home/ed/gnu/gcc-build-arm-linux-gnueabihf'
make[1]: *** [stage1-bubble] Error 2
make[1]: Leaving directory `/home/ed/gnu/gcc-build-arm-linux-gnueabihf'
make: *** [all] Error 2

The previous version of missing flex produced a dummy lex.yy.c,
as does the version in the gmp package, but unfortunately it is
overwritten by the missing script in the gcc tree.

That's probably just not a supported configuration anymore,
but all previous GCC releases worked without an installed flex tool.

Maybe the problem goes away if a newer version of gmp is used,
or if the missing flex is not passed down to the gmp configure script,
somehow.  Actually, it is not really needed by gmp at all.
I tried to add this hunk from the old version and it made the gmp configure
script work again:

--- missing.orig 2014-11-16 14:07:13.000000000 +0000
+++ missing 2014-11-19 15:01:57.168967538 +0000
@@ -172,6 +172,21 @@
   echo You should only need it if you modified a '.l' file.
   echo You may want to install the Fast Lexical Analyzer package:
   echo $flex_URL
+  rm -f lex.yy.c
+  if test $# -ne 1; then
+eval LASTARG=\${$#}
+case $LASTARG in
+*.l)
+  SRCFILE=`echo $LASTARG | sed 's/l$/c/'`
+  if test -f $SRCFILE; then
+  cp $SRCFILE lex.yy.c
+  fi
+;;
+esac
+  fi
+  if test ! -f lex.yy.c; then
+  echo 'main() { return 0; }' >lex.yy.c
+  fi
   ;;
 help2man*)
   echo You should only need it if you modified a dependency \



What do you think?


Regards,
Bernd.
  

Re: [PATCH x86, PR60451] Expand even/odd permutation using pack insn.

2014-11-20 Thread Evgeny Stupachenko
Thank you.
Patch with proposed fixes:

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 085eb54..09c0057 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -48322,6 +48322,120 @@ expand_vec_perm_vpshufb2_vpermq_even_odd
(struct expand_vec_perm_d *d)
   return true;
 }

+/* A subroutine of expand_vec_perm_even_odd_1.  Implement extract-even
+   and extract-odd permutations of two V16QI, V8HI, V16HI or V32QI operands
+   with two and and pack or two shift and pack insns.  We should
+   have already failed all two instruction sequences.  */
+
+static bool
+expand_vec_perm_even_odd_pack (struct expand_vec_perm_d *d)
+{
+  rtx op, dop0, dop1, t, rperm[16];
+  unsigned i, odd, c, s, nelt = d->nelt;
+  bool end_perm = false;
+  machine_mode half_mode;
+  rtx (*gen_and) (rtx, rtx, rtx);
+  rtx (*gen_pack) (rtx, rtx, rtx);
+  rtx (*gen_shift) (rtx, rtx, rtx);
+
+  /* Required for pack.  */
+  if (!TARGET_SSE4_2 || d->one_operand_p)
+return false;
+
+  switch (d->vmode)
+{
+case V8HImode:
+  c = 0xffff;
+  s = 16;
+  half_mode = V4SImode;
+  gen_and = gen_andv4si3;
+  gen_pack = gen_sse4_1_packusdw;
+  gen_shift = gen_lshrv4si3;
+  break;
+case V16QImode:
+  c = 0xff;
+  s = 8;
+  half_mode = V8HImode;
+  gen_and = gen_andv8hi3;
+  gen_pack = gen_sse2_packuswb;
+  gen_shift = gen_lshrv8hi3;
+  break;
+case V16HImode:
+  c = 0xffff;
+  s = 16;
+  half_mode = V8SImode;
+  gen_and = gen_andv8si3;
+  gen_pack = gen_avx2_packusdw;
+  gen_shift = gen_lshrv8si3;
+  end_perm = true;
+  break;
+case V32QImode:
+  c = 0xff;
+  s = 8;
+  half_mode = V16HImode;
+  gen_and = gen_andv16hi3;
+  gen_pack = gen_avx2_packuswb;
+  gen_shift = gen_lshrv16hi3;
+  end_perm = true;
+  break;
+default:
+  /* Only V8HI, V16QI, V16HI and V32QI modes are more profitable than
+general shuffles.  */
+  return false;
+}
+
+  /* Check that permutation is even or odd.  */
+  odd = d->perm[0];
+  if (odd > 1)
+    return false;
+
+  for (i = 1; i < nelt; ++i)
+    if (d->perm[i] != 2 * i + odd)
+      return false;
+
+  if (d->testing_p)
+    return true;
+
+  dop0 = gen_reg_rtx (half_mode);
+  dop1 = gen_reg_rtx (half_mode);
+  if (odd == 0)
+{
+  for (i = 0; i < nelt / 2; i++)
+   rperm[i] = GEN_INT (c);
+  t = gen_rtx_CONST_VECTOR (half_mode, gen_rtvec_v (nelt / 2, rperm));
+  t = force_reg (half_mode, t);
+  emit_insn (gen_and (dop0, t, gen_lowpart (half_mode, d->op0)));
+  emit_insn (gen_and (dop1, t, gen_lowpart (half_mode, d->op1)));
+}
+  else
+{
+  emit_insn (gen_shift (dop0,
+   gen_lowpart (half_mode, d->op0),
+   GEN_INT (s)));
+  emit_insn (gen_shift (dop1,
+   gen_lowpart (half_mode, d->op1),
+   GEN_INT (s)));
+}
+  /* In AVX2 for 256 bit case we need to permute pack result.  */
+  if (TARGET_AVX2 && end_perm)
+{
+  op = gen_reg_rtx (d->vmode);
+  t = gen_reg_rtx (V4DImode);
+  emit_insn (gen_pack (op, dop0, dop1));
+  emit_insn (gen_avx2_permv4di_1 (t,
+ gen_lowpart (V4DImode, op),
+ const0_rtx,
+ const2_rtx,
+ const1_rtx,
+ GEN_INT (3)));
+  emit_move_insn (d->target, gen_lowpart (d->vmode, t));
+}
+  else
+    emit_insn (gen_pack (d->target, dop0, dop1));
+
+  return true;
+}
+
 /* A subroutine of ix86_expand_vec_perm_builtin_1.  Implement extract-even
and extract-odd permutations.  */

@@ -48393,7 +48507,9 @@ expand_vec_perm_even_odd_1 (struct expand_vec_perm_d *d, unsigned odd)
   gcc_unreachable ();

 case V8HImode:
-  if (TARGET_SSSE3 && !TARGET_SLOW_PSHUFB)
+  if (TARGET_SSE4_2)
+   return expand_vec_perm_even_odd_pack (d);
+  else if (TARGET_SSSE3 && !TARGET_SLOW_PSHUFB)
return expand_vec_perm_pshufb2 (d);
   else
{
@@ -48416,7 +48532,9 @@ expand_vec_perm_even_odd_1 (struct expand_vec_perm_d *d, unsigned odd)
   break;

 case V16QImode:
-  if (TARGET_SSSE3 && !TARGET_SLOW_PSHUFB)
+  if (TARGET_SSE4_2)
+   return expand_vec_perm_even_odd_pack (d);
+  else if (TARGET_SSSE3 && !TARGET_SLOW_PSHUFB)
return expand_vec_perm_pshufb2 (d);
   else
{
@@ -48441,7 +48559,7 @@ expand_vec_perm_even_odd_1 (struct expand_vec_perm_d *d, unsigned odd)

 case V16HImode:
 case V32QImode:
-  return expand_vec_perm_vpshufb2_vpermq_even_odd (d);
+  return expand_vec_perm_even_odd_pack (d);

 case V4DImode:
   if (!TARGET_AVX2)
@@ -48814,6 +48932,9 @@ ix86_expand_vec_perm_const_1 (struct expand_vec_perm_d *d)

   /* Try sequences of three instructions.  */

+  if (expand_vec_perm_even_odd_pack (d))
+return true;
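
The idea behind the patch (mask or shift each wide lane, then pack the two operands) can be modelled on plain arrays. The following is only a scalar sketch of the V16QI case, not the GCC code: the names `V16QI`, `V8HI`, `as_v8hi`, `pack_lane`, `pack` and `extract_even_odd` are made up for illustration, and little-endian lane layout is assumed.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using V16QI = std::array<std::uint8_t, 16>;  // sixteen 8-bit elements
using V8HI = std::array<std::uint16_t, 8>;   // the same 128 bits as eight 16-bit lanes

// View 16 bytes as 8 little-endian 16-bit lanes (the role gen_lowpart plays
// in the patch; here it is an explicit copy).
static V8HI as_v8hi(const V16QI &v) {
  V8HI r{};
  for (std::size_t j = 0; j < 8; ++j)
    r[j] = static_cast<std::uint16_t>(v[2 * j] | (v[2 * j + 1] << 8));
  return r;
}

// One lane of an unsigned-saturating pack: the input is treated as a signed
// 16-bit value and clamped to the range 0..255.
static std::uint8_t pack_lane(std::uint16_t x) {
  if (x & 0x8000u) return 0;   // negative when seen as signed
  if (x > 255u) return 255;    // too large for 8 bits
  return static_cast<std::uint8_t>(x);
}

// Pack: narrowed lanes of A fill the low half, narrowed lanes of B the high half.
static V16QI pack(const V8HI &a, const V8HI &b) {
  V16QI r{};
  for (std::size_t j = 0; j < 8; ++j) {
    r[j] = pack_lane(a[j]);
    r[8 + j] = pack_lane(b[j]);
  }
  return r;
}

// Extract-even (odd == 0) or extract-odd (odd == 1) from the concatenation
// of OP0 and OP1, i.e. result[i] = concat[2 * i + odd].
V16QI extract_even_odd(const V16QI &op0, const V16QI &op1, unsigned odd) {
  assert(odd <= 1);
  V8HI d0 = as_v8hi(op0), d1 = as_v8hi(op1);
  for (std::size_t j = 0; j < 8; ++j) {
    if (odd == 0) {
      d0[j] &= 0xff;           // keep the low (even) byte of each lane
      d1[j] &= 0xff;
    } else {
      d0[j] >>= 8;             // move the high (odd) byte down
      d1[j] >>= 8;
    }
  }
  // Every lane is now in 0..255, so the saturation in pack never triggers.
  return pack(d0, d1);
}
```

Because the mask or shift leaves every lane in the range 0..255, the saturating pack behaves as a plain truncation, which is why three instructions suffice for the 128-bit modes.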

Re: gimple-classes-v2-option-3 git branch committed to svn trunk as r217787

2014-11-20 Thread Andrew MacLeod

On 11/20/2014 08:08 AM, Richard Biener wrote:

On Thu, Nov 20, 2014 at 12:05 AM, Andrew MacLeod amacl...@redhat.com wrote:

On 11/19/2014 05:24 PM, David Malcolm wrote:

On Wed, 2014-11-19 at 22:36 +0100, Richard Biener wrote:

On November 19, 2014 10:09:56 PM CET, Andrew MacLeod
amacl...@redhat.com wrote:

On 11/19/2014 03:43 PM, Richard Biener wrote:

On November 19, 2014 8:26:23 PM CET, Andrew MacLeod

amacl...@redhat.com wrote:

On 11/19/2014 01:12 PM, David Malcolm wrote:


(A) could become:

  greturn *stmt = gsi->as_a_greturn ();

(B) could become:

  stmt = gsi->dyn_cast <gcall *> ();
  if (!stmt)
or:

  stmt = gsi->dyn_cast_gcall ();
  if (!stmt)

or maybe:

  stmt = gsi->is_a_gcall ();
  if (!stmt)

An earlier version of my patches had casting methods within the
gimple_statement_base class, which were rejected; the above proposals
would instead put them within gimple_stmt_iterator.


I would like all gsi routines to be consistent, not a mix of functions
and methods.
so something like

stmt = gsi_as_call (gsi);
stmt = gsi_dyn_call (gsi);

or we need to change gsi_stmt() and friends into methods and access them
as gsi->stmt ()..  which is possibly better, but that much more
intrusive again (another 2000+ locations).
If we switched to methods everywhere for gsi, I'd prefer something like

gsi->as_a_call ()
gsi->is_a_call ()
gsi->dyn_cast_call ()

I think it's more readable... and it removes a dependency on the
implementation.. so if we ever decide to change the name of 'gcall' down
the road to using a namespace, and make it gimple::call or whatever, we
won't have to change every single gsi-> location which has a templated
use of the type.

I also think this sort of thing could probably wait until next stage 1..

my 2 cents...

Why not as_a <gassign *> (*gsi)?  It would
Add operator* to gsi of course.

Richard.



I could live with that form too.

we often have an instance of gimple_stmt_iterator rather than a pointer
to it, so wouldn't operator gimple *() to implicitly call gsi_stmt()
when needed work better? (or operator gimple () before the next
change) ..

Not sure.  The * matches how iterators work in STL...

Note that for the cases where we pass a pointer to an iterator we can
change those to use references to avoid writing **gsi.

Richard.


Andrew

I had a go at adding an operator * to gimple_stmt_iterator, using it
everywhere that we do an as_a or dyn_cast on the result of a
gsi_stmt, to abbreviate the gsi_stmt call down to one character.

Patch attached; only lightly smoketested; am posting it for the sake of
discussion.

I don't think this API will make the non-C++-fans happier; I think the
objection to the work I just merged is that it's adding more C++ than
those people are comfortable with.

So although the attached patch makes things shorter (good), it's taking
things in a more C++ direction (questionable).  I'd hoped to paper
over the C++ somewhat.

I suspect that any API which requires the use of < > characters within the
implementation of a gimple pass to mean a template is going to give
those less happy with C++ a visceral ugh reaction.  I wonder if
there's a way to spell these things that's concise and which doesn't
involve < or >?


wasn't that my last thought?  is_a_call(), as_a_call() and dyn_cast_call ()?

I think lack of <> in identifiers helps us old brains parse faster :-)  <>
are like ()... many many years of causing a certain kind of break in mental
processing. I'm accustomed to single <> these days, but once you get into
multiple <>'s I quickly lose track.  I find the same thing with ()... hence
I'm not a lisp fan :-)

I think we want to have a consistent style across GCC even if seen as
ugly to some people.  Thus having (member) functions for conversion in
some cases and as_a <> templates in others is bad.  C++ was supposed to make
grokking GCC easier for newbies - this is exactly making it harder (not that
I believe in this story at all...)


I don't think 'operator *' c++ifies it too much, but I still think operator
gimple() would be easier...  no extra character at all, and no odd looking
dereference of a non-pointer object or double dereference of a pointer.  I
can't think of how that could get us into trouble... it'll always map to the
stmt the iterator currently points to.

I dislike conversion operators.  Why is 'operator *' bad?  It's exactly
how iterators are supposed to work - after all the gsi stuff was modeled
after STL iterators!

So that's a definitive no from me to is_a_call () as_a_call () etc.

Richard.

Fine by me, Just running through the options to make sure we know what 
we are getting :-)


Andrew
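
For readers following the thread, the form Richard argues for (an STL-style `operator *` on the iterator, combined with the `as_a <>` / `dyn_cast <>` templates) can be sketched roughly as below. This is a toy model only: `toy_stmt`, `toy_call`, `toy_return`, `toy_iterator` and the helper templates are invented names and do not reproduce the real gimple classes or is-a.h.

```cpp
#include <cassert>

// Toy statement hierarchy; the kind tag stands in for the gimple code.
enum toy_code { TOY_CALL, TOY_RETURN };

struct toy_stmt {
  explicit toy_stmt(toy_code c) : code(c) {}
  toy_code code;
};
struct toy_call : toy_stmt { toy_call() : toy_stmt(TOY_CALL) {} };
struct toy_return : toy_stmt { toy_return() : toy_stmt(TOY_RETURN) {} };

// Per-type test, specialised for each pointer type we want to cast to.
template <typename T> struct toy_is_a_helper;
template <> struct toy_is_a_helper<toy_call *> {
  static bool test(const toy_stmt *s) { return s->code == TOY_CALL; }
};
template <> struct toy_is_a_helper<toy_return *> {
  static bool test(const toy_stmt *s) { return s->code == TOY_RETURN; }
};

template <typename T> bool is_a(toy_stmt *s) {
  return toy_is_a_helper<T>::test(s);
}

// Checked cast: the caller asserts the statement has the expected kind.
template <typename T> T as_a(toy_stmt *s) {
  assert(is_a<T>(s));
  return static_cast<T>(s);
}

// Conditional cast: yields a null pointer on a kind mismatch.
template <typename T> T dyn_cast(toy_stmt *s) {
  return is_a<T>(s) ? static_cast<T>(s) : nullptr;
}

// Iterator over an array of statement pointers; operator* plays the role
// that gsi_stmt () plays for gimple_stmt_iterator in the discussion.
struct toy_iterator {
  toy_stmt **ptr;
  toy_stmt *operator*() const { return *ptr; }
  toy_iterator &operator++() { ++ptr; return *this; }
};
```

With this shape, a use site reads `dyn_cast <toy_call *> (*it)`, which is the one-character abbreviation of the `gsi_stmt` call that David's experimental patch tried out.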



Re: [PATCH] PR63426 Fix various signed integer overflows

2014-11-20 Thread David Edelsohn
On Thu, Nov 20, 2014 at 8:27 AM, Markus Trippelsdorf
mar...@trippelsdorf.de wrote:
 Running the testsuite after bootstrap-ubsan on gcc112 shows several issues.
 See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63426 for the full list.

 This patch fixes several of them.

 Tested on powerpc64-unknown-linux-gnu.

 OK for trunk?

 Thanks.

 2014-11-20  Markus Trippelsdorf  mar...@trippelsdorf.de

 * config/rs6000/constraints.md: Avoid signed integer overflows.
 * config/rs6000/predicates.md: Likewise.
 * config/rs6000/rs6000.c (num_insns_constant_wide): Likewise.
 (includes_rldic_lshift_p): Likewise.
 (includes_rldicr_lshift_p): Likewise.
 * emit-rtl.c (const_wide_int_htab_hash): Likewise.
 * loop-iv.c (determine_max_iter): Likewise.
 (iv_number_of_iterations): Likewise.
 * tree-ssa-loop-ivopts.c (get_computation_cost_at): Likewise.
 * varasm.c (get_section_anchor): Likewise.

The rs6000 patches are okay.

Someone like Richi or Jakub needs to approve the changes to the common
parts of the compiler.

Thanks, David


Re: LTO streaming of TARGET_OPTIMIZE_NODE

2014-11-20 Thread Bernd Schmidt

On 11/20/2014 02:20 PM, Richard Biener wrote:

On Thu, 20 Nov 2014, Bernd Schmidt wrote:


On 11/13/2014 05:06 AM, Jan Hubicka wrote:

this patch adds infrastructure for proper streaming and merging of
TREE_TARGET_OPTION.


This breaks the offloading path via LTO since it introduces an incompatibility
in LTO format between host and offload machine.

A very quick patch to fix it is below - the OpenACC testcase I was using seems
to be working again with this. Thoughts, suggestions?


The offload target needs to have the same target options as the host?


Not really meaningful I'd think.


Are the offload functions marked somehow?  That is, can we avoid
setting TREE_TARGET_OPTION on them?


Well, they are mostly generated automatically by omp-low.c, so 
TREE_TARGET_OPTION wouldn't normally be set anyway. So the field is 
unnecessary, we just can't write it out since the two compilers involved 
disagree on its layout.



Or rather we need to have a
default TREE_TARGET_OPTION node for the offload target which we'd
need to set - how would you otherwise transfer different offload
target options to the offload compile?  How do you transfer
offload target options to the offload compile at all?


ABI options are transferred via the -foffload-abi mechanism. No other 
target options can be transferred.



I think this just shows conceptual issues with the LTO approach...


I don't think running into a few problems demonstrates a conceptual 
problem when it works fine with some fairly small patches.



Bernd



[PATCH] PR lto/63968: 175.vpr from cpu2000 fails to build with LTO

2014-11-20 Thread Martin Liška

Hello.

As I reimplemented fibheap as a C++ template, Honza told me that the
replace_key method actually supports only the decrement operation. The old
implementation silently suppresses any feedback if we try to increase a key:

fibheap.c:
...
  /* If we wanted to, we could actually do a real increase by redeleting and
     inserting. However, this would require O (log n) time. So just bail out
     for now.  */
  if (fibheap_comp_data (heap, key, data, node) > 0)
    return NULL;
...

My reimplementation added an assert for that kind of operation and, as this PR
shows, we do try to increment a key in bb-reorder.
Thus, I added a fibonacci_heap::replace_key method that can increment a key (it
deletes the node and the new key is associated with the node).

The patch bootstraps on x86_64-linux-pc and no new regression was introduced.
I would like to ask someone whether the increase operation in bb-reorder is
valid or not.

Thanks,
Martin
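
To make the difference between the two operations concrete, here is a small stand-in written for this note. It is not a Fibonacci heap and not the fibonacci_heap.h API: `min_queue` and its `handle` type are hypothetical, and it is backed by `std::multimap`, so every key change is an erase plus re-insert. It only illustrates the contract: `decrease_key` asserts that the key does not grow, while `replace_key` accepts either direction by deleting and re-inserting.

```cpp
#include <cassert>
#include <map>

// Toy min-priority queue with stable handles (an assumption for this sketch).
template <class K, class V>
class min_queue {
public:
  using handle = typename std::multimap<K, V>::iterator;

  handle insert(K key, V data) { return m_.emplace(key, data); }

  // Set a new KEY for the entry behind H, whether smaller or larger.
  // Implemented as delete + re-insert; H is updated to the new entry.
  // Returns the old key.
  K replace_key(handle &h, K key) {
    K okey = h->first;
    V data = h->second;
    m_.erase(h);
    h = m_.emplace(key, data);
    return okey;
  }

  // Same, but the caller promises that the key does not increase.
  K decrease_key(handle &h, K key) {
    assert(!(h->first < key));
    return replace_key(h, key);
  }

  // Remove and return the data with the smallest key.
  V extract_min() {
    assert(!m_.empty());
    auto it = m_.begin();
    V data = it->second;
    m_.erase(it);
    return data;
  }

  bool empty() const { return m_.empty(); }

private:
  std::multimap<K, V> m_;
};
```

In a real Fibonacci heap the decrease path is the cheap one (amortised constant time), and only the increase path needs the delete and re-insert that the patch adds.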
gcc/ChangeLog:

2014-11-20  Martin Liska  mli...@suse.cz

* bb-reorder.c (find_traces_1_round): decrease_key is replaced
with replace_key method.
* fibonacci_heap.h (fibonacci_heap::insert): New argument.
(fibonacci_heap::replace_key_data): Likewise.
(fibonacci_heap::replace_key): New method that can even increment key,
this operation costs O(log N).
(fibonacci_heap::extract_min): New argument.
(fibonacci_heap::delete_node): Likewise.
diff --git a/gcc/bb-reorder.c b/gcc/bb-reorder.c
index 689d7b6..b568114 100644
--- a/gcc/bb-reorder.c
+++ b/gcc/bb-reorder.c
@@ -644,7 +644,7 @@ find_traces_1_round (int branch_th, int exec_th, gcov_type count_th,
    (long) bbd[e->dest->index].node->get_key (),
    key);
 			}
-		  bbd[e->dest->index].heap->decrease_key
+		  bbd[e->dest->index].heap->replace_key
 		(bbd[e->dest->index].node, key);
 		}
 		}
@@ -812,7 +812,7 @@ find_traces_1_round (int branch_th, int exec_th, gcov_type count_th,
 			   e->dest->index,
 			   (long) bbd[e->dest->index].node->get_key (), key);
 		}
-		  bbd[e->dest->index].heap->decrease_key
+		  bbd[e->dest->index].heap->replace_key
 		(bbd[e->dest->index].node, key);
 		}
 	}
diff --git a/gcc/fibonacci_heap.h b/gcc/fibonacci_heap.h
index ecb92f8..3fce370 100644
--- a/gcc/fibonacci_heap.h
+++ b/gcc/fibonacci_heap.h
@@ -183,20 +183,27 @@ public:
   }
 
   /* For given NODE, set new KEY value.  */
-  K decrease_key (fibonacci_node_t *node, K key)
+  K replace_key (fibonacci_node_t *node, K key)
   {
 K okey = node->m_key;
-gcc_assert (key <= okey);
 
 replace_key_data (node, key, node->m_data);
 return okey;
   }
 
+  /* For given NODE, decrease value to new KEY.  */
+  K decrease_key (fibonacci_node_t *node, K key)
+  {
+gcc_assert (key <= node->m_key);
+return replace_key (node, key);
+  }
+
   /* For given NODE, set new KEY and DATA value.  */
   V *replace_key_data (fibonacci_node_t *node, K key, V *data);
 
-  /* Extract minimum node in the heap. */
-  V *extract_min ();
+  /* Extract minimum node in the heap. If RELEASE is specified,
+ memory is released.  */
+  V *extract_min (bool release = true);
 
   /* Return value associated with minimum node in the heap.  */
   V *min ()
@@ -214,12 +221,15 @@ public:
   }
 
   /* Delete NODE in the heap.  */
-  V *delete_node (fibonacci_node_t *node);
+  V *delete_node (fibonacci_node_t *node, bool release = true);
 
   /* Union the heap with HEAPB.  */
   fibonacci_heap *union_with (fibonacci_heap *heapb);
 
 private:
+  /* Insert new NODE given by KEY and DATA associated with the key.  */
+  fibonacci_node_t *insert (fibonacci_node_t *node, K key, V *data);
+
   /* Insert it into the root list.  */
   void insert_root (fibonacci_node_t *node);
 
@@ -322,6 +332,15 @@ fibonacci_heap<K,V>::insert (K key, V *data)
   /* Create the new node.  */
   fibonacci_node<K,V> *node = new fibonacci_node_t ();
 
+  return insert (node, key, data);
+}
+
+/* Insert new NODE given by KEY and DATA associated with the key.  */
+
+template<class K, class V>
+fibonacci_node<K,V>*
+fibonacci_heap<K,V>::insert (fibonacci_node_t *node, K key, V *data)
+{
   /* Set the node's data.  */
   node->m_data = data;
   node->m_key = key;
@@ -345,17 +364,22 @@ V*
 fibonacci_heap<K,V>::replace_key_data (fibonacci_node<K,V> *node, K key,
    V *data)
 {
-  V *odata;
   K okey;
   fibonacci_node<K,V> *y;
+  V *odata = node->m_data;
 
-  /* If we wanted to, we could actually do a real increase by redeleting and
- inserting. However, this would require O (log n) time. So just bail out
- for now.  */
+  /* If we wanted to, we do a real increase by redeleting and
+ inserting.  */
   if (node->compare_data (key) > 0)
-return NULL;
+{
+  delete_node (node, false);
+
+  node = new (node) fibonacci_node_t ();
+  insert (node, key, data);
+
+  return odata;
+}
 
-  odata = node->m_data;
   okey = node->m_key;
   node->m_data = data;
   node->m_key = key;
@@ -385,7 +409,7 @@ fibonacci_heap<K,V>::replace_key_data 

[PATCH] Fix target/63977

2014-11-20 Thread Richard Henderson
My mistake yesterday.  I thought I'd tested both x86_64 -m64/-m32, but not so.
Anyway, as the comment says, the backend keeps querying the static chain, and
if you don't early out, it sets ix86_static_chain_on_stack, at which point the
setting is permanent and affects prologue generation, and not in a good way.

Tested i686-linux and committed.


r~
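
The failure mode described above (a query function that sets a sticky flag as a side effect, so it must bail out early when the query does not apply) can be shown with a toy model. Everything here is hypothetical: `toy_decl`, `toy_frame`, `chain_loc` and `toy_static_chain` are illustration names and the logic is far simpler than ix86_static_chain.

```cpp
#include <cassert>

// Hypothetical stand-ins for a function decl and per-function backend state.
struct toy_decl { bool needs_static_chain; };
struct toy_frame { bool chain_on_stack = false; };  // sticky once set

enum chain_loc { CHAIN_NONE, CHAIN_REG, CHAIN_STACK };

// Report where the static chain lives.  When no register is free the chain
// goes on the stack, and that decision is recorded permanently in FRAME
// because it changes how the prologue is generated.
chain_loc toy_static_chain(const toy_decl &fn, bool reg_free, toy_frame &frame) {
  // Early out: callers all over the backend may ask about functions that
  // have no static chain; answering them must not touch the sticky flag.
  if (!fn.needs_static_chain)
    return CHAIN_NONE;

  if (reg_free)
    return CHAIN_REG;

  frame.chain_on_stack = true;
  return CHAIN_STACK;
}
```

Without the early return, merely querying a function that has no static chain would set the flag and alter its prologue, which matches the misbehaviour the commit message describes.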
PR target/63977
* config/i386/i386.c (ix86_static_chain): Reinstate the check
for DECL_STATIC_CHAIN.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index fffddfc..6c8dbd6 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -27360,6 +27360,12 @@ ix86_static_chain (const_tree fndecl_or_type, bool incoming_p)
 {
   unsigned regno;
 
+  /* While this function won't be called by the middle-end when a static
+ chain isn't needed, it's also used throughout the backend so it's
+ easiest to keep this check centralized.  */
+  if (DECL_P (fndecl_or_type) && !DECL_STATIC_CHAIN (fndecl_or_type))
+return NULL;
+
   if (TARGET_64BIT)
 {
   /* We always use R10 in 64-bit mode.  */


[Ada] Spurious errors on extension aggregate for limited type

2014-11-20 Thread Arnaud Charlet
This patch fixes two errors in the handling of extension aggregates for limited
types: Ancestor part of extension aggregate can itself be an extension aggregate
as well as a function call that is rewritten as a reference.

The following must compile quietly:

   gcc -c p2.adb
   gcc -c bugzilla.ads

---
package body P1 is
function Create return T1 is
begin
   return (Length => 3);
end Create;
end P1;
---
package P1 is
type T1 is tagged limited private;

function Create return T1;
private
type T1 (Length : Positive := 3) is
  tagged limited null record;
end P1;
---
with P1;
package P2 is
type T2 is
  limited new P1.T1 with null record;

function Create return T2;
end P2;
---
package body P2 is
function Create return T2 is
begin
   return (P1.Create with null record);
end Create;
end P2;
---
with Ada.Finalization;
package Bugzilla is
   type T1 is limited new Ada.Finalization.Limited_Controlled with null record;
   type T2 is new T1 with null record;
   X : T2 := (T1 with null record);
   Z : T2 := (T1'(Ada.Finalization.Limited_Controlled with null record)
   with null record);
end Bugzilla;

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-11-20  Ed Schonberg  schonb...@adacore.com

* sem_aggr.adb (Valid_Limited_Ancestor): Ancestor part of
extension aggregate can itself be an extension aggregate, as
well as a call that is rewritten as a reference.

Index: sem_aggr.adb
===
--- sem_aggr.adb(revision 217828)
+++ sem_aggr.adb(working copy)
@@ -2663,12 +2663,19 @@
 
   function Valid_Limited_Ancestor (Anc : Node_Id) return Boolean is
   begin
- if Is_Entity_Name (Anc)
-   and then Is_Type (Entity (Anc))
+ if Is_Entity_Name (Anc) and then Is_Type (Entity (Anc)) then
+return True;
+
+ --  The ancestor must be a call or an aggregate, but a call may
+ --  have been expanded into a temporary, so check original node.
+
+ elsif Nkind_In (Anc, N_Aggregate,
+  N_Extension_Aggregate,
+  N_Function_Call)
  then
 return True;
 
- elsif Nkind_In (Anc, N_Aggregate, N_Function_Call) then
+ elsif Nkind (Original_Node (Anc)) = N_Function_Call then
 return True;
 
  elsif Nkind (Anc) = N_Attribute_Reference


[Ada] Inter-unit inlining of expression functions with -gnatn1

2014-11-20 Thread Arnaud Charlet
This enables inter-unit inlining of expression functions with -gnatn1, or more
simply with -O1/-O2 -gnatn.  These functions are automatically candidates for
inlining, but they were actually inlined across units only with -gnatn2, or
more simply -O3 -gnatn.

The following program must compile without warnings with -O -gnatn -Winline:

with Q; use Q;

procedure P (I : Integer) is
begin
  if Process (I) /= 2 * I then
raise Program_Error;
  end if;
end;
package Q is

  function Process (I : Integer) return Integer;
  pragma Inline (Process);

end Q;
with R; use R;

package body Q is

  function Process (I : Integer) return Integer is
  begin
return Process2 (I) + Process3 (I);
  end;

end Q;
package R is

  function Process2 (I : Integer) return Integer;

  function Process3 (I : Integer) return Integer is (I);

private

  function Process2 (I : Integer) return Integer is (I);

end R;

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-11-20  Eric Botcazou  ebotca...@adacore.com

* inline.adb (Add_Inlined_Subprogram): Insert all programs
generated as a body or whose declaration was provided along with
the body.

Index: inline.adb
===
--- inline.adb  (revision 217842)
+++ inline.adb  (working copy)
@@ -454,6 +454,7 @@
 
procedure Add_Inlined_Subprogram (Index : Subp_Index) is
   E: constant Entity_Id := Inlined.Table (Index).Name;
+  Decl : constant Node_Id   := Parent (Declaration_Node (E));
   Pack : constant Entity_Id := Get_Code_Unit_Entity (E);
 
   procedure Register_Backend_Inlined_Subprogram (Subp : Entity_Id);
@@ -486,14 +487,17 @@
begin
   --  If the subprogram is to be inlined, and if its unit is known to be
   --  inlined or is an instance whose body will be analyzed anyway or the
-  --  subprogram has been generated by the compiler, and if it is declared
+  --  subprogram was generated as a body by the compiler (for example an
+  --  initialization procedure) or its declaration was provided along with
+  --  the body (for example an expression function), and if it is declared
   --  at the library level not in the main unit, and if it can be inlined
   --  by the back-end, then insert it in the list of inlined subprograms.
 
   if Is_Inlined (E)
 and then (Is_Inlined (Pack)
or else Is_Generic_Instance (Pack)
-   or else Is_Internal (E))
+   or else Nkind (Decl) = N_Subprogram_Body
+   or else Present (Corresponding_Body (Decl)))
 and then not In_Main_Unit_Or_Subunit (E)
 and then not Is_Nested (E)
 and then not Has_Initialized_Type (E)


[Ada] Type conversion to String causes Constraint_Error

2014-11-20 Thread Arnaud Charlet
This patch modifies the mechanism which creates a subtype from an arbitrary
expression. The mechanism now captures the bounds of all index constraints
when the expression is of an array type.


-- Source --


--  pack.ads

with Ada.Finalization; use Ada.Finalization;

package Pack is
   type Ctrl is new Controlled with record
  Flag : Boolean := False;
   end record;

   type New_String is new String;

   function Make_Ctrl return Ctrl;
   function Make_String (Val : String) return New_String;
end Pack;

--  pack.adb

package body Pack is
   function Make_Ctrl return Ctrl is
  Result : Ctrl;
   begin
  return Result;
   end Make_Ctrl;

   function Make_String (Val : String) return New_String is
   begin
  return New_String (Val);
   end Make_String;
end Pack;

--  pack2.ads

package Pack2 is
   procedure Reproduce;
end Pack2;

--  pack2.adb

with Ada.Text_IO; use Ada.Text_IO;
with Pack;use Pack;

package body Pack2 is
   Str : constant New_String := Make_String ("Hello");
   Ctr : constant Ctrl := Make_Ctrl;

   procedure Reproduce is
   begin
  Put_Line (String (Str));
   end Reproduce;
end Pack2;

--  main.adb

with Pack2; use Pack2;

procedure Main is
begin
   Reproduce;
end Main;


-- Compilation and output --


$ gnatmake -q main.adb
$ ./main
Hello

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-11-20  Hristian Kirtchev  kirtc...@adacore.com

* exp_util.adb (Make_Subtype_From_Expr): Capture the bounds of
all index constraints when the expression is of an array type.

Index: exp_util.adb
===
--- exp_util.adb(revision 217854)
+++ exp_util.adb(working copy)
@@ -6399,22 +6399,24 @@
  (E   : Node_Id;
   Unc_Typ : Entity_Id) return Node_Id
is
+  List_Constr : constant List_Id:= New_List;
   Loc : constant Source_Ptr := Sloc (E);
-  List_Constr : constant List_Id:= New_List;
   D   : Entity_Id;
+  Full_Exp: Node_Id;
+  Full_Subtyp : Entity_Id;
+  High_Bound  : Entity_Id;
+  Index_Typ   : Entity_Id;
+  Low_Bound   : Entity_Id;
+  Priv_Subtyp : Entity_Id;
+  Utyp: Entity_Id;
 
-  Full_Subtyp  : Entity_Id;
-  Priv_Subtyp  : Entity_Id;
-  Utyp : Entity_Id;
-  Full_Exp : Node_Id;
-
begin
   if Is_Private_Type (Unc_Typ)
 and then Has_Unknown_Discriminants (Unc_Typ)
   then
- --  Prepare the subtype completion, Go to base type to
- --  find underlying type, because the type may be a generic
- --  actual or an explicit subtype.
+ --  Prepare the subtype completion. Use the base type to find the
+ --  underlying type because the type may be a generic actual or an
+ --  explicit subtype.
 
  Utyp:= Underlying_Type (Base_Type (Unc_Typ));
  Full_Subtyp := Make_Temporary (Loc, 'C');
@@ -6451,22 +6453,67 @@
  return New_Occurrence_Of (Priv_Subtyp, Loc);
 
   elsif Is_Array_Type (Unc_Typ) then
+ Index_Typ := First_Index (Unc_Typ);
  for J in 1 .. Number_Dimensions (Unc_Typ) loop
-Append_To (List_Constr,
-  Make_Range (Loc,
-Low_Bound =>
+
+--  Capture the bounds of each index constraint in case the context
+--  is an object declaration of an unconstrained type initialized
+--  by a function call:
+
+--Obj : Unconstr_Typ := Func_Call;
+
+--  This scenario requires secondary scope management and the index
+--  constraint cannot depend on the temporary used to capture the
+--  result of the function call.
+
+--SS_Mark;
+--Temp : Unconstr_Typ_Ptr := Func_Call'reference;
+--subtype S is Unconstr_Typ (Temp.all'First .. Temp.all'Last);
+--Obj : S := Temp.all;
+--SS_Release;  --  Temp is gone at this point, bounds of S are
+-- --  non existent.
+
+--  The bounds are kept as variables rather than constants because
+--  this prevents spurious optimizations down the line.
+
+--  Generate:
+--Low_Bound : Base_Type (Index_Typ) := E'First (J);
+
+Low_Bound := Make_Temporary (Loc, 'B');
+Insert_Action (E,
+  Make_Object_Declaration (Loc,
+Defining_Identifier => Low_Bound,
+Object_Definition   =>
+  New_Occurrence_Of (Base_Type (Etype (Index_Typ)), Loc),
+Expression  =>
   Make_Attribute_Reference (Loc,
-Prefix => Duplicate_Subexpr_No_Checks (E),
+Prefix => Duplicate_Subexpr_No_Checks (E),
 Attribute_Name => Name_First,
-

Re: [PATCH x86, PR60451] Expand even/odd permutation using pack insn.

2014-11-20 Thread Richard Henderson
On 11/20/2014 12:36 PM, Evgeny Stupachenko wrote:
 +  /* Required for "pack".  */
 +  if (!TARGET_SSE4_2 || d->one_operand_p)
 +return false;

Why the SSE4_2 check here when...

 +
 +  /* Only V8HI, V16QI, V16HI and V32QI modes are more profitable than general
 + shuffles.  */
 +  if (d->vmode == V8HImode)
 +{
 +  c = 0xffff;
 +  s = 16;
 +  half_mode = V4SImode;
 +  gen_and = gen_andv4si3;
 +  gen_pack = gen_sse4_1_packusdw;

... it's SSE4_1 here,

 +  gen_shift = gen_lshrv4si3;
 +}
 +  else if (d->vmode == V16QImode)
 +{
 +  c = 0xff;
 +  s = 8;
 +  half_mode = V8HImode;
 +  gen_and = gen_andv8hi3;
 +  gen_pack = gen_sse2_packuswb;

... and SSE2 here?



r~


[Ada] Debugging information for inlined predefined units

2014-11-20 Thread Arnaud Charlet
The compiler suppresses debugging information on predefined units that are
inlined in the code, because stepping into run-time units often complicates
debugging activity. We make an exception for calls that appear in the source,
when the unit is part of the Ada hierarchy, to facilitate monitoring of storage
management.

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-11-20  Ed Schonberg  schonb...@adacore.com

* exp_ch6.adb (Expand_Call, Inlined_Subprogram): Do not suppress
debugging information for a call to a predefined unit, if the
call comes from source and the unit is in the Ada hierarchy.

Index: exp_ch6.adb
===
--- exp_ch6.adb (revision 217828)
+++ exp_ch6.adb (working copy)
@@ -3720,7 +3720,17 @@
  (Unit_File_Name (Get_Source_Unit (Sloc (Subp))))
   and then In_Extended_Main_Source_Unit (N)
 then
-   Set_Needs_Debug_Info (Subp, False);
+   --  We make an exception for calls to the Ada hierarchy if call
+   --  comes from source, because some user applications need the
+   --  debugging information for such calls.
+
+   if Comes_From_Source (Call_Node)
+ and then Name_Buffer (1 .. 2) = "a-"
+   then
+  null;
+   else
+  Set_Needs_Debug_Info (Subp, False);
+   end if;
 end if;
 
  --  Front end expansion of simple functions returning unconstrained


Re: [AArch64, Patch] Add range-check for Symbol + offset addressing.

2014-11-20 Thread Tejas Belagod

On 14/11/14 17:33, Marcus Shawcroft wrote:

On 14 November 2014 08:12, Tejas Belagod tejas.bela...@arm.com wrote:


2014-11-14  Tejas Belagod  tejas.bela...@arm.com

gcc/
 * config/aarch64/aarch64-protos.h (aarch64_classify_symbol):
 Fixup prototype.
 * config/aarch64/aarch64.c (aarch64_expand_mov_immediate,
 aarch64_cannot_force_const_mem, aarch64_classify_address,
 aarch64_classify_symbolic_expression): Fixup call to
 aarch64_classify_symbol.
 (aarch64_classify_symbol): Add range-checking for
 symbol + offset addressing for tiny and small models.

testsuite/
 * gcc.target/aarch64/symbol-range.c: New.
 * gcc.target/aarch64/symbol-range-tiny.c: New.



OK.
Could you rustle up a back port ?


The same patch applies cleanly to 4.9. OK to commit?

Thanks,
Tejas.




Re: [PATCH] driver: ignore SIGINT while waiting on subprocesses to finish

2014-11-20 Thread Michael Matz
Hi,

On Thu, 20 Nov 2014, Patrick Palka wrote:

  -wrapper is specifically also for invoking cc1 with gdb from the 
  driver (that's the usecase documented with -wrapper), so it better 
  should work as intended.  I don't know what problems Patrick had with 
  that, though.  For me gcc -wrapper gdb,--args works as expected (as in 
  ^C interrupts cc1 returning to gdb).
 
 Yes it does for me too. But pressing ^C in gdb while cc1 is not running 
 (by accident or with intention, e.g. pressing ^C to quickly clear the 
 command prompt) will kill the driver and gdb after it. It's not a huge 
 problem but it does cause some inconvenience for users of -wrapper gdb.

Aha!  Indeed that's quite ugly.  I think fixing this would be appropriate.


Ciao,
Michael.
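
The behaviour under discussion (the driver survives ^C while a child such as gdb is in the foreground) is commonly obtained by ignoring SIGINT in the parent for the duration of the wait. The sketch below is a generic POSIX illustration, not the proposed driver patch; `run_ignoring_sigint` is a made-up name and error handling is minimal.

```cpp
#include <cerrno>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Run ARGV as a child and wait for it, ignoring SIGINT in the parent while
// waiting.  Returns the child's exit status, or -1 on failure or if the
// child was killed by a signal.
int run_ignoring_sigint(char *const argv[]) {
  struct sigaction ignore {};
  struct sigaction saved {};
  ignore.sa_handler = SIG_IGN;
  sigemptyset(&ignore.sa_mask);

  // Parent: ignore ^C from now on, remembering the previous disposition.
  if (sigaction(SIGINT, &ignore, &saved) != 0)
    return -1;

  pid_t pid = fork();
  if (pid == 0) {
    // Child: an ignored signal stays ignored across exec, so restore the
    // old disposition first; ^C must still reach the subprocess.
    sigaction(SIGINT, &saved, nullptr);
    execvp(argv[0], argv);
    _exit(127);  // exec failed
  }

  int status = -1;
  if (pid > 0) {
    // Retry if the wait itself is interrupted by some other signal.
    while (waitpid(pid, &status, 0) < 0 && errno == EINTR) {
    }
  }

  // Parent: put the original SIGINT handling back.
  sigaction(SIGINT, &saved, nullptr);

  if (pid < 0)
    return -1;
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The terminal still delivers ^C to the whole foreground process group, so the child decides how to react while the parent simply keeps waiting.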


Re: [PATCH] PR jit/63969: Fix segfault in error-handling when driver isn't found

2014-11-20 Thread David Malcolm
On Wed, 2014-11-19 at 23:09 -0800, Mike Stump wrote:
 On Nov 19, 2014, at 10:23 PM, David Malcolm dmalc...@redhat.com
 wrote:
  It's not clear to me if I can approve my own patches to the jit
 
 So, to answer that, we look at MAINTAINERS, and look up your name:
 
 Various Maintainers
 
 jit David Malcolm   dmalc...@redhat.com
 
 So, this means that you can review other peoples work and approve it
 for the jit code, and you can review and approve your own work for the
 jit code.  This is the definition of Maintainer.  If you had been
 listed under Reviewers, you would need approval for your own work.
 
 Now, that doesn’t mean, you can’t ask for a review for any reason you
 want.  :-)

[CCing Jeff]

Indeed, but Jeff wrote in another thread:
 JIT space, yours to approve :-) We haven't formalized that
 yet, but it'd be silly to do anything else.

[ https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02514.html ]

...and that got me wondering if:

(A) there's an additional governance step here that should happen, or

(B) if I can go ahead and commit suitably tested patches that are
confined to the:
   gcc/jit
   gcc/testsuite/jit.exp
subdirectories (and approve other people's such patches), or

(C) both i.e. do (B) whilst (A) is pending


Thanks
Dave



[Ada] Improvements to handling of unchecked union discriminants

2014-11-20 Thread Arnaud Charlet
This patch avoids issuing a warning for a missing component clause
for a discriminant in an unchecked union, and also avoids printing
a line for such a component in the -gnatR2 output.

The following program:

with Interfaces;
procedure Test_Union is
   type Test_Type (Flag : Boolean) is
     record
        case Flag is
           when True =>
              Thing_1 : Interfaces.Unsigned_32;
           when False =>
              Thing_2 : Interfaces.Unsigned_32;
        end case;
     end record
   with Unchecked_Union;
   for Test_Type use
     record
        Thing_1 at 0 range 0 .. 31;
        Thing_2 at 0 range 0 .. 31;
     end record;
   pragma Unreferenced (Test_Type);
begin
   null;
end Test_Union;

compiles quietly with switches -gnatwa -gnatR2, and generates
this representation output:

Representation information for unit Test_Union (body)

for Test_Type'Size use 32;
for Test_Type'Alignment use 4;
for Test_Type use record
   Thing_1 at 0 range  0 .. 31;
   Thing_2 at 0 range  0 .. 31;
end record;

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-11-20  Robert Dewar  de...@adacore.com

* repinfo.adb (List_Record_Info): Do not list discriminant in
unchecked union.
* sem_ch13.adb (Has_Good_Profile): Minor reformatting
(Analyze_Stream_TSS_Definition): Minor reformatting
(Analyze_Record_Representation_Clause): Do not issue warning
for missing rep clause for discriminant in unchecked union.

Index: repinfo.adb
===
--- repinfo.adb (revision 217828)
+++ repinfo.adb (working copy)
@@ -847,37 +847,49 @@
 
   Comp := First_Component_Or_Discriminant (Ent);
   while Present (Comp) loop
- Get_Decoded_Name_String (Chars (Comp));
- Max_Name_Length := Natural'Max (Max_Name_Length, Name_Len);
 
- Cfbit := Component_Bit_Offset (Comp);
+ --  Skip discriminant in unchecked union (since it is not there!)
 
- if Rep_Not_Constant (Cfbit) then
-UI_Image_Length := 2;
+ if Ekind (Comp) = E_Discriminant
+   and then Is_Unchecked_Union (Ent)
+ then
+null;
 
+ --  All other cases
+
  else
---  Complete annotation in case not done
+Get_Decoded_Name_String (Chars (Comp));
+Max_Name_Length := Natural'Max (Max_Name_Length, Name_Len);
 
-Set_Normalized_Position (Comp, Cfbit / SSU);
-Set_Normalized_First_Bit (Comp, Cfbit mod SSU);
+Cfbit := Component_Bit_Offset (Comp);
 
-Sunit := Cfbit / SSU;
-UI_Image (Sunit);
- end if;
+if Rep_Not_Constant (Cfbit) then
+   UI_Image_Length := 2;
 
- --  If the record is not packed, then we know that all fields whose
- --  position is not specified have a starting normalized bit position
- --  of zero.
+else
+   --  Complete annotation in case not done
 
- if Unknown_Normalized_First_Bit (Comp)
-   and then not Is_Packed (Ent)
- then
-Set_Normalized_First_Bit (Comp, Uint_0);
+   Set_Normalized_Position (Comp, Cfbit / SSU);
+   Set_Normalized_First_Bit (Comp, Cfbit mod SSU);
+
+   Sunit := Cfbit / SSU;
+   UI_Image (Sunit);
+end if;
+
+--  If the record is not packed, then we know that all fields
+--  whose position is not specified have a starting normalized
+--  bit position of zero.
+
+if Unknown_Normalized_First_Bit (Comp)
+  and then not Is_Packed (Ent)
+then
+   Set_Normalized_First_Bit (Comp, Uint_0);
+end if;
+
+Max_Suni_Length :=
+  Natural'Max (Max_Suni_Length, UI_Image_Length);
  end if;
 
- Max_Suni_Length :=
-   Natural'Max (Max_Suni_Length, UI_Image_Length);
-
  Next_Component_Or_Discriminant (Comp);
   end loop;
 
@@ -885,6 +897,17 @@
 
   Comp := First_Component_Or_Discriminant (Ent);
   while Present (Comp) loop
+
+ --  Skip discriminant in unchecked union (since it is not there!)
+
+ if Ekind (Comp) = E_Discriminant
+   and then Is_Unchecked_Union (Ent)
+ then
+goto Continue;
+ end if;
+
+ --  All other cases
+
  declare
 Esiz : constant Uint := Esize (Comp);
 Bofs : constant Uint := Component_Bit_Offset (Comp);
Index: sem_ch13.adb
===
--- sem_ch13.adb (revision 217857)
+++ sem_ch13.adb (working copy)
@@ -3555,7 +3555,7 @@
 
 if  Base_Type (Typ) = Base_Type (Ent)
   or else (Is_Class_Wide_Type (Typ)

Re: [COMMITTED 1/3] Make TARGET_STATIC_CHAIN allow a function type

2014-11-20 Thread Richard Henderson
On 11/19/2014 08:56 PM, H.J. Lu wrote:
 On Wed, Nov 19, 2014 at 10:04 AM, Jakub Jelinek ja...@redhat.com wrote:
 On Wed, Nov 19, 2014 at 03:58:50PM +0100, Richard Henderson wrote:
 As opposed to always being a decl.  This is a prerequisite
 to allowing the static chain to be loaded for indirect calls.

 * targhooks.c (default_static_chain): Remove check for
 DECL_STATIC_CHAIN.
 * config/moxie/moxie.c (moxie_static_chain): Likewise.
 * config/i386/i386.c (ix86_static_chain): Allow decl or type
 as the first argument.
 * config/xtensa/xtensa.c (xtensa_static_chain): Change the name
 of the unused first parameter.
 * doc/tm.texi (TARGET_STATIC_CHAIN): Document the first parameter
 may be a type.
 * target.def (static_chain): Likewise.

 r217769 broke lots of tests on i686-linux...

Guh.  I thought I tested both multilibs from x86_64, but I guess not.
Anyway, fixed as the comment describes.


r~
PR target/63977
* config/i386/i386.c (ix86_static_chain): Reinstate the check
for DECL_STATIC_CHAIN.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index fffddfc..6c8dbd6 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -27360,6 +27360,12 @@ ix86_static_chain (const_tree fndecl_or_type, bool incoming_p)
 {
   unsigned regno;
 
+  /* While this function won't be called by the middle-end when a static
+ chain isn't needed, it's also used throughout the backend so it's
+ easiest to keep this check centralized.  */
+  if (DECL_P (fndecl_or_type) && !DECL_STATIC_CHAIN (fndecl_or_type))
+return NULL;
+
   if (TARGET_64BIT)
 {
   /* We always use R10 in 64-bit mode.  */


Re: [PATCH x86, PR60451] Expand even/odd permutation using pack insn.

2014-11-20 Thread Evgeny Stupachenko
Good point! gen_shift also requires only SSE2.
That way we can optimize out interleave sequence for V16QI mode in
expand_vec_perm_even_odd_1.
Thanks!

Evgeny
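
For anyone following along, here is a scalar sketch of the idea for the V16QI case: view each 16-byte operand as eight 16-bit lanes, isolate the wanted byte of every lane with an and (even) or a logical right shift (odd), then pack the lanes of both operands back into bytes. This only models the data movement in plain C; the function names are made up for the illustration and it assumes little-endian lane order.

```c
#include <stdint.h>

/* Narrow one 16-bit lane to a byte with unsigned saturation, the job the
   pack instruction does per lane.  After the mask or the shift below the
   lane is already in 0..255, so the clamp never changes the value here.  */
static uint8_t pack_lane(uint16_t lane)
{
  return lane > 255 ? 255 : (uint8_t) lane;
}

/* Scalar model of extract-even (odd == 0) or extract-odd (odd == 1) over
   two 16-byte vectors, viewing each as eight little-endian 16-bit lanes.  */
void extract_even_odd_v16qi(const uint8_t op0[16], const uint8_t op1[16],
                            int odd, uint8_t out[16])
{
  const uint8_t *src[2] = { op0, op1 };

  for (int v = 0; v < 2; v++)
    for (int j = 0; j < 8; j++)
      {
        uint16_t lane = (uint16_t) (src[v][2 * j] | (src[v][2 * j + 1] << 8));

        if (odd)
          lane = (uint16_t) (lane >> 8);    /* shift: keep the odd byte */
        else
          lane = (uint16_t) (lane & 0xff);  /* and: keep the even byte */

        out[8 * v + j] = pack_lane(lane);
      }
}
```

With the two operands holding bytes 0..15 and 16..31, the even case yields 0, 2, 4, ..., 30 and the odd case yields 1, 3, 5, ..., 31.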

Updated patch:

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 085eb54..054089b 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -48322,6 +48322,127 @@ expand_vec_perm_vpshufb2_vpermq_even_odd (struct expand_vec_perm_d *d)
   return true;
 }

+/* A subroutine of expand_vec_perm_even_odd_1.  Implement extract-even
+   and extract-odd permutations of two V16QI, V8HI, V16HI or V32QI operands
+   with two and and pack or two shift and pack insns.  We should
+   have already failed all two instruction sequences.  */
+
+static bool
+expand_vec_perm_even_odd_pack (struct expand_vec_perm_d *d)
+{
+  rtx op, dop0, dop1, t, rperm[16];
+  unsigned i, odd, c, s, nelt = d->nelt;
+  bool end_perm = false;
+  machine_mode half_mode;
+  rtx (*gen_and) (rtx, rtx, rtx);
+  rtx (*gen_pack) (rtx, rtx, rtx);
+  rtx (*gen_shift) (rtx, rtx, rtx);
+
+  if (d->one_operand_p)
+return false;
+
+  switch (d->vmode)
+{
+case V8HImode:
+  /* Required for pack.  */
+  if (!TARGET_SSE4_1)
+return false;
+  c = 0xffff;
+  s = 16;
+  half_mode = V4SImode;
+  gen_and = gen_andv4si3;
+  gen_pack = gen_sse4_1_packusdw;
+  gen_shift = gen_lshrv4si3;
+  break;
+case V16QImode:
+  /* No check as all instructions are SSE2.  */
+  c = 0xff;
+  s = 8;
+  half_mode = V8HImode;
+  gen_and = gen_andv8hi3;
+  gen_pack = gen_sse2_packuswb;
+  gen_shift = gen_lshrv8hi3;
+  break;
+case V16HImode:
+  if (!TARGET_AVX2)
+return false;
+  c = 0xffff;
+  s = 16;
+  half_mode = V8SImode;
+  gen_and = gen_andv8si3;
+  gen_pack = gen_avx2_packusdw;
+  gen_shift = gen_lshrv8si3;
+  end_perm = true;
+  break;
+case V32QImode:
+  if (!TARGET_AVX2)
+return false;
+  c = 0xff;
+  s = 8;
+  half_mode = V16HImode;
+  gen_and = gen_andv16hi3;
+  gen_pack = gen_avx2_packuswb;
+  gen_shift = gen_lshrv16hi3;
+  end_perm = true;
+  break;
+default:
+  /* Only V8HI, V16QI, V16HI and V32QI modes are more profitable than
+general shuffles.  */
+  return false;
+}
+
+  /* Check that permutation is even or odd.  */
+  odd = d->perm[0];
+  if (odd > 1)
+return false;
+
+  for (i = 1; i < nelt; ++i)
+if (d->perm[i] != 2 * i + odd)
+  return false;
+
+  if (d->testing_p)
+return true;
+
+  dop0 = gen_reg_rtx (half_mode);
+  dop1 = gen_reg_rtx (half_mode);
+  if (odd == 0)
+{
+  for (i = 0; i < nelt / 2; i++)
+   rperm[i] = GEN_INT (c);
+  t = gen_rtx_CONST_VECTOR (half_mode, gen_rtvec_v (nelt / 2, rperm));
+  t = force_reg (half_mode, t);
+  emit_insn (gen_and (dop0, t, gen_lowpart (half_mode, d->op0)));
+  emit_insn (gen_and (dop1, t, gen_lowpart (half_mode, d->op1)));
+}
+  else
+{
+  emit_insn (gen_shift (dop0,
+   gen_lowpart (half_mode, d->op0),
+   GEN_INT (s)));
+  emit_insn (gen_shift (dop1,
+   gen_lowpart (half_mode, d->op1),
+   GEN_INT (s)));
+}
+  /* In AVX2 for 256 bit case we need to permute pack result.  */
+  if (TARGET_AVX2 && end_perm)
+{
+  op = gen_reg_rtx (d->vmode);
+  t = gen_reg_rtx (V4DImode);
+  emit_insn (gen_pack (op, dop0, dop1));
+  emit_insn (gen_avx2_permv4di_1 (t,
+ gen_lowpart (V4DImode, op),
+ const0_rtx,
+ const2_rtx,
+ const1_rtx,
+ GEN_INT (3)));
+  emit_move_insn (d->target, gen_lowpart (d->vmode, t));
+}
+  else
+emit_insn (gen_pack (d->target, dop0, dop1));
+
+  return true;
+}
+
 /* A subroutine of ix86_expand_vec_perm_builtin_1.  Implement extract-even
and extract-odd permutations.  */

@@ -48393,7 +48514,9 @@ expand_vec_perm_even_odd_1 (struct expand_vec_perm_d *d, unsigned odd)
   gcc_unreachable ();

 case V8HImode:
-  if (TARGET_SSSE3 && !TARGET_SLOW_PSHUFB)
+  if (TARGET_SSE4_1)
+   return expand_vec_perm_even_odd_pack (d);
+  else if (TARGET_SSSE3 && !TARGET_SLOW_PSHUFB)
return expand_vec_perm_pshufb2 (d);
   else
{
@@ -48416,32 +48539,11 @@ expand_vec_perm_even_odd_1 (struct expand_vec_perm_d *d, unsigned odd)
   break;

 case V16QImode:
-  if (TARGET_SSSE3 && !TARGET_SLOW_PSHUFB)
-   return expand_vec_perm_pshufb2 (d);
-  else
-   {
- if (d->testing_p)
-   break;
- t1 = gen_reg_rtx (V16QImode);
- t2 = gen_reg_rtx (V16QImode);
- t3 = gen_reg_rtx (V16QImode);
- emit_insn (gen_vec_interleave_highv16qi (t1, d->op0, d->op1));
- emit_insn (gen_vec_interleave_lowv16qi 

Re: [PATCH] gcc/ubsan.c: Extend 'pretty_name' space to avoid memory overflow

2014-11-20 Thread Chen Gang
On 11/17/14 18:52, Chen Gang wrote:
 
 What you said sound reasonable to me.
 
 I shall try to send patch v2 within this week (use pretty_printer).
 
 Thanks.
 
 On 11/17/14 16:15, Marek Polacek wrote:
 On Mon, Nov 17, 2014 at 08:38:19AM +0100, Jakub Jelinek wrote:

 I think easiest would be to rewrite the code so that it uses pretty_printer
 to construct the string, grep asan.c for asan_pp.  Or obstacks, but you
 don't have a printer to print integers into it easily.
     if (dom && TREE_CODE (TYPE_MAX_VALUE (dom)) == INTEGER_CST)
       pos += sprintf (&pretty_name[pos], HOST_WIDE_INT_PRINT_DEC,
                       tree_to_uhwi (TYPE_MAX_VALUE (dom)) + 1);
     else
       /* ??? We can't determine the variable name; print VLA unspec.  */
       pretty_name[pos++] = '*';
 looks wrong anyway, as not all integers fit into uhwi.
 Guess you could use wide_int to add 1 there and pp_wide_int.

I have finished switching from plain sprintf to pretty_printer, but for
the case above, once I tried to use wide_int, the testsuite no longer
passes. Please help check whether what I have done is correct:

For make check-gcc RUNTESTFLAGS=ubsan.exp.

 - A simple replacement is OK:

pp_printf (pretty_name, HOST_WIDE_INT_PRINT_DEC,
tree_to_uhwi (TYPE_MAX_VALUE (dom)) + 1);

 - But using pp_wide_int with wide_int causes failures:

pp_wide_int(pretty_name, wi::add (wi::max_value (dom), 1),
TYPE_SIGN (TREE_TYPE (dom)));

 - The related issues are:

   Running /upstream/gcc-new-x86/gcc/testsuite/gcc.dg/ubsan/ubsan.exp ...
   FAIL: c-c++-common/ubsan/bounds-2.c   -O0  output pattern test
   FAIL: c-c++-common/ubsan/bounds-2.c   -O1  output pattern test
   FAIL: c-c++-common/ubsan/bounds-2.c   -O2  output pattern test
   FAIL: c-c++-common/ubsan/bounds-2.c   -O3 -fomit-frame-pointer  output pattern test
   FAIL: c-c++-common/ubsan/bounds-2.c   -O3 -fomit-frame-pointer -funroll-loops  output pattern test
   FAIL: c-c++-common/ubsan/bounds-2.c   -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions  output pattern test
   FAIL: c-c++-common/ubsan/bounds-2.c   -O3 -g  output pattern test
   FAIL: c-c++-common/ubsan/bounds-2.c   -Os  output pattern test
   FAIL: c-c++-common/ubsan/bounds-2.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  output pattern test
   FAIL: c-c++-common/ubsan/bounds-2.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  output pattern test
   FAIL: c-c++-common/ubsan/bounds-5.c   -O0  output pattern test
   FAIL: c-c++-common/ubsan/bounds-5.c   -O1  output pattern test
   FAIL: c-c++-common/ubsan/bounds-5.c   -O2  output pattern test
   FAIL: c-c++-common/ubsan/bounds-5.c   -O3 -fomit-frame-pointer  output pattern test
   FAIL: c-c++-common/ubsan/bounds-5.c   -O3 -fomit-frame-pointer -funroll-loops  output pattern test
   FAIL: c-c++-common/ubsan/bounds-5.c   -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions  output pattern test
   FAIL: c-c++-common/ubsan/bounds-5.c   -O3 -g  output pattern test
   FAIL: c-c++-common/ubsan/bounds-5.c   -Os  output pattern test
   FAIL: c-c++-common/ubsan/bounds-5.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  output pattern test
   FAIL: c-c++-common/ubsan/bounds-5.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  output pattern test
   FAIL: c-c++-common/ubsan/bounds-7.c   -O0  output pattern test
   FAIL: c-c++-common/ubsan/bounds-7.c   -O1  output pattern test
   FAIL: c-c++-common/ubsan/bounds-7.c   -O2  output pattern test
   FAIL: c-c++-common/ubsan/bounds-7.c   -O3 -fomit-frame-pointer  output pattern test
   FAIL: c-c++-common/ubsan/bounds-7.c   -O3 -fomit-frame-pointer -funroll-loops  output pattern test
   FAIL: c-c++-common/ubsan/bounds-7.c   -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions  output pattern test
   FAIL: c-c++-common/ubsan/bounds-7.c   -O3 -g  output pattern test
   FAIL: c-c++-common/ubsan/bounds-7.c   -Os  output pattern test
   FAIL: c-c++-common/ubsan/bounds-7.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  output pattern test
   FAIL: c-c++-common/ubsan/bounds-7.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  output pattern test
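
As background on the overflow that started this thread: the sprintf-based code writes into a fixed buffer without checking how much room is left. The following is only a generic C sketch of a bounded append, not the ubsan.c code and not the pretty_printer API; append_decimal and its parameters are hypothetical names for the illustration.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Append VALUE in decimal at BUF + *POS without writing past SIZE bytes.
   Returns 0 and advances *POS on success; returns -1 and leaves the
   string as it was if the digits plus the terminating NUL do not fit.  */
int append_decimal(char *buf, size_t size, size_t *pos, uint64_t value)
{
  if (*pos >= size)
    return -1;

  size_t room = size - *pos;
  int n = snprintf(buf + *pos, room, "%" PRIu64, value);

  if (n < 0 || (size_t) n >= room)
    {
      buf[*pos] = '\0';   /* undo the truncated write */
      return -1;
    }

  *pos += (size_t) n;
  return 0;
}
```

A growable string builder such as pretty_printer avoids the fixed-size question altogether, which is presumably why it was suggested above.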


Thanks.
-- 
Chen Gang

Open, share, and attitude like air, water, and life which God blessed


Re: [PATCH] microblaze: microblaze.md: Use 'SI' instead of 'VOID' for operand 1 of 'call_value_intern'

2014-11-20 Thread Chen Gang

Oh, sorry, after running for more than 10 days, qemu crashed :-(

After checking the output log and comparing it with the original log, we
know that more than 90% of the tests had finished, and the results are OK
(no new issues). I guess the reason for the crash is that I started too
many other things on this machine.

Next, I shall try to analyze the issue where the cross-compiled Linux
kernel runs into a deadlock. After finishing the analysis, I shall restart
the test. I guess it needs 12-13 days (more than the week I originally
expected).

Thanks.

On 11/9/14 21:15, Chen Gang wrote:
 
 At present, I use simplified sshd, ssh and scp (dropbear open source
 program) to communicate with microblaze qemu successfully, and let gcc
 'make check' have real effect.
 
 This is only a trial run, but after almost 10 hours the log output is
 OK. Each ssh login wastes 10 - 20 seconds, so I guess the make check
 may run for a week!!  The recent steps are below:
 
  - zlib (for dropbear):
 
export CHOST=microblaze-gchen-linux
export PATH=/upstream/release/bin:$PATH
./configure --prefix=/upstream/release && make && make install
 
  - dropbear (it is a simple sshd, ssh and scp):
 
export PATH=/upstream/release/bin:$PATH
./configure --prefix=/upstream/release \
  --host=microblaze-gchen-linux \
  CC=microblaze-gchen-linux-gcc
 
modify /upstream/release/include/stdio.h to avoid redefining sscanf to
iso99_sscanf
 
link libz.a (static lib) to 'dropbear' (sshd) and 'dbclient' (ssh).
and make scp to generate 'scp' command.
 
for supporting 'none' username:
 
  under ramfs, echo 'none:x:0:0:none:/:/bin/sh'  ./etc/passwd
 
for supporting no passwords (it is temporary fix):
 
  modify  common-session.c: ses.authstate.pw_passwd[0] = '\0';
 
put 'dropbear', 'dbclient' and 'scp' to ./sbin of ramfs and symbol
links to ./usr/bin of ramfs.
 
for temporary fix its stable issue, need modify code to let it 'fork'
as soon as startup, and only permit one child connection each time.
 
usage:
 
  in microblaze qemu:
 
/sbin/dropbear -F -E -B -p 192.168.122.2:22
 
  in host (x86_64), use system scp and ssh is OK (without password):
 
ssh none@192.168.122.2 cd /test; ./test
scp test.c none@192.168.122.2:/test/
scp none@192.168.122.2:/test/* ./
 
  - For dejagnu:
 
need `echo 192.168.122.2   microblaze-xilinx-gdb  /etc/hosts`
under /usr/local/share/dejagnu/*, use ssh/scp instead of rsh/rcp.
  (need replace all, or will cause failure during make check).
for /usr/local/share/dejagnu/baseboards/microblaze-xilinx-gdb.exp,
need add additional variables:
 
  set_board_info sockethost 192.168.122.2
  set_board_info username none
  set timeout 600
 
 Current left issues are:
 
  - Linux kernel built by current upstream microblaze toolchain will be
dead lock. I shall analyze it (I guess it may be kernel self issue,
which may caused by include/compiler-gcc5.h).
 
  - One patch for qemu microblaze dtb file, just checking by related
members (originally I though it was kernel issue, but after
communicate with kernel members, it is more suitable to change qemu).
 
  - One or more issues for dropbear (at least include stable issues), and
one or more issues for glibc. Sorry for I have to bypass them, since
I have no enough time resource on it.
 
 
 Welcome any ideas, suggestions or completions.
 
 Thanks.
 
 
 On 11/01/2014 01:07 AM, Chen Gang wrote:

 At present, I use telnet (without password), login to microblaze qemu
 successfully!  :-)

  - I compiled busybox with the glibc in the original 'ramfs' to get telnetd:
use the new busybox to replace the old one, and add a symbolic link
'telnetd' to busybox in /bin.

  - configure qemu with network support (device xlnx.xps-ethernetlite).

yum install libvirt
yum install tunctl
tunctl -b
ip link set tap0 up
brctl addif virbr0 tap0
./microblaze-softmmu/qemu-system-microblaze -M petalogix-s3adsp1800 \
  -kernel ../linux-stable.microblaze/arch/microblaze/boot/linux.bin \
  -no-reboot -append console=ttyUL0,115200 doreboot -nographic \
  -net nic,vlan=0,model=xlnx.xps-ethernetlite,macaddr=00:16:35:AF:94:00 \
  -net tap,vlan=0,ifname=tap0,script=no,downscript=no

  - fix a kernel bug: add xlnx,xps-ethernetlite-2.00.b for compatible
with its firmware (can find it under /sys/firmware/compatible,
within microblaze qemu bash environments). Related diff:

diff --git a/drivers/net/ethernet/xilinx/xilinx_emaclite.c b/drivers/net/ethernet/xilinx/xilinx_emaclite.c
index 28dbbdc..298fad3 100644
--- a/drivers/net/ethernet/xilinx/xilinx_emaclite.c
+++ b/drivers/net/ethernet/xilinx/xilinx_emaclite.c
@@ -1236,6 +1236,7 @@ static struct of_device_id xemaclite_of_match[] = {
{ .compatible = xlnx,opb-ethernetlite-1.01.b, },
{ .compatible = xlnx,xps-ethernetlite-1.00.a, },
{ .compatible = 

[PATCH][doc] Document cortex-a17 and cortex-a17.cortex-a7 -m{cpu,tune} options

2014-11-20 Thread Kyrill Tkachov

Hi all,

As Joseph reminded, new -mcpu options should be documented in 
invoke.texi. This adds the documentation for the cortex-a17 and 
cortex-a17.cortex-a7 values.
Ok to go in if the corresponding support patches posted earlier are 
accepted?


Thanks,
Kyrill

2014-11-19  Kyrylo Tkachov  kyrylo.tkac...@arm.com

* doc/invoke.texi (ARM Options): Document cortex-a17 and
cortex-a17.cortex-a7 as permissible -mtune values.

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 785faec..a81cc16 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -12813,7 +12813,8 @@ Permissible names are: @samp{arm2}, @samp{arm250},
 @samp{arm1136j-s}, @samp{arm1136jf-s}, @samp{mpcore}, @samp{mpcorenovfp},
 @samp{arm1156t2-s}, @samp{arm1156t2f-s}, @samp{arm1176jz-s}, @samp{arm1176jzf-s},
 @samp{cortex-a5}, @samp{cortex-a7}, @samp{cortex-a8}, @samp{cortex-a9},
-@samp{cortex-a12}, @samp{cortex-a15}, @samp{cortex-a53}, @samp{cortex-a57},
+@samp{cortex-a12}, @samp{cortex-a15}, @samp{cortex-a17}, @samp{cortex-a53},
+@samp{cortex-a57},
 @samp{cortex-r4},
 @samp{cortex-r4f}, @samp{cortex-r5}, @samp{cortex-r7}, @samp{cortex-m7},
 @samp{cortex-m4},
@@ -12831,7 +12832,8 @@ Permissible names are: @samp{arm2}, @samp{arm250},
 
 Additionally, this option can specify that GCC should tune the performance
 of the code for a big.LITTLE system.  Permissible names are:
-@samp{cortex-a15.cortex-a7}, @samp{cortex-a57.cortex-a53}.
+@samp{cortex-a15.cortex-a7}, @samp{cortex-a17.cortex-a7},
+@samp{cortex-a57.cortex-a53}.
 
 @option{-mtune=generic-@var{arch}} specifies that GCC should tune the
 performance for a blend of processors within architecture @var{arch}.

Re: gimple-classes-v2-option-3 git branch committed to svn trunk as r217787

2014-11-20 Thread Michael Matz
Hi,

On Thu, 20 Nov 2014, Richard Biener wrote:

  I don't think this API will make the non-C++-fans happier; I think the
  objection to the work I just merged is that it's adding more C++ than
  those people are comfortable with.
 
 How so?  It's already super-ugly in those views.  We decided to get C++.
 Now we have it.

And?  Nobody says we can't have nice looking code even with C++.

 Now please make it AT LEAST CONSISTENT.

True.

  I suspect that any API which requires the use of < > characters within the 
  implementation of a gimple pass to mean a template is going to give 
  those less happy with C++ a visceral "ugh" reaction.  I wonder if 
  there's a way to spell these things that's concise and which doesn't 
  involve < >?
 
 Only if you drop as_a/is_a/dyn_cast everywhere.

Oh god, yes.  Please!  IMHO they don't accomplish much, but make code 
harder to visually parse.  They don't accomplish much because you 
have to write the snippets that check validity of conversions anyway, so 
they can just as well be written as proper methods or global functions, or 
even just conversion operators.  Nothing forces us to implement these 
snippets as noisy template specializations like:

  template <>
  template <>
  inline bool
  is_a_helper <cgraph_node *>::test (symtab_node *p)
  {
    return p->type == SYMTAB_FUNCTION;
  }

instead of the more mundane means.  And once you have those snippets as 
normal functions, you can just as well call them like they are functions, 
making the using side of those conversion also look nicer.
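
To make "the more mundane means" concrete, here is a rough sketch of what such a check and conversion could look like as ordinary functions. It is written in plain C with cut-down stand-in types; is_cgraph_node and dyn_cgraph_node are hypothetical names, not existing GCC functions, and the real symtab_node and cgraph_node are of course much larger.

```c
#include <stdbool.h>
#include <stddef.h>

/* Cut-down stand-ins for the real types, for illustration only.  */
enum symtab_type { SYMTAB_SYMBOL, SYMTAB_FUNCTION, SYMTAB_VARIABLE };

struct symtab_node { enum symtab_type type; };

struct cgraph_node {
  struct symtab_node base;   /* must stay the first member */
  int function_specific;     /* placeholder for function-only data */
};

/* The validity check, written as an ordinary function.  */
bool is_cgraph_node(const struct symtab_node *p)
{
  return p->type == SYMTAB_FUNCTION;
}

/* The checked conversion: NULL if P is not a function node.  */
struct cgraph_node *dyn_cgraph_node(struct symtab_node *p)
{
  return is_cgraph_node(p) ? (struct cgraph_node *) p : NULL;
}
```

The calling side then reads like any other function call, e.g. `if (is_cgraph_node (n))`, with no angle brackets involved.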


Ciao,
Michael.


[PATCH] sh: char isn't signed

2014-11-20 Thread Segher Boessenkool
An sh compiler fails to build on systems that have plain char unsigned.
Fix that.
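
A small sketch of the underlying portability issue, in case it helps: whether plain char is signed is up to the implementation, so a negative entry stored in a plain char field reads back as a large positive number on hosts where char is unsigned. The struct and function names below are invented for the illustration and only loosely mirror the field in sh.c.

```c
#include <limits.h>

/* Hypothetical stand-ins for the table entry before and after the patch.  */
struct seq_before { char amount[6]; };
struct seq_after { signed char amount[6]; };

/* Store V in the plain-char field and read it back.  */
int roundtrip_before(int v)
{
  struct seq_before s = { { 0 } };
  s.amount[0] = (char) v;
  return s.amount[0];
}

/* Same, but through the explicitly signed field.  */
int roundtrip_after(int v)
{
  struct seq_after s = { { 0 } };
  s.amount[0] = (signed char) v;
  return s.amount[0];
}

/* Nonzero when plain char is signed on the host doing the build.  */
int plain_char_is_signed(void)
{
  return CHAR_MIN < 0;
}
```

With signed char the value -8 survives the round trip on every host; with plain char it only does so where plain_char_is_signed () is true.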


2014-11-20  Segher Boessenkool  seg...@kernel.crashing.org

gcc/
PR target/60111
* config/sh/sh.c: Use signed char for signed field.

---
 gcc/config/sh/sh.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/sh/sh.c b/gcc/config/sh/sh.c
index be944da..175af44 100644
--- a/gcc/config/sh/sh.c
+++ b/gcc/config/sh/sh.c
@@ -3013,7 +3013,7 @@ enum
 struct ashl_lshr_sequence
 {
   char insn_count;
-  char amount[6];
+  signed char amount[6];
   char clobbers_t;
 };
 
-- 
1.8.1.4



Re: [PATCH] sh: char isn't signed

2014-11-20 Thread Oleg Endo
On Thu, 2014-11-20 at 07:41 -0800, Segher Boessenkool wrote:
 An sh compiler fails to build on systems that have plain char unsigned.
 Fix that.

Ouch.  Thanks for spotting this.  OK for trunk, 4.9 and 4.8.

Cheers,
Oleg



[Ada] Source in multi-unit source has unique object file name

2014-11-20 Thread Arnaud Charlet
Two units, one in a multi-source file and one in another source with
the same base file name do not have the same object file name.

No error during processing of the following project file should be
reported:

project Prj is
   package Naming is
      for Spec ("foo_bar") use "foo_bar.ads" at 2;
      for Spec ("foo_bar_types") use "foo_bar.ads" at 1;
      for Body ("foo_bar") use "foo_bar.adb";
   end Naming;
end Prj;

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-11-20  Vincent Celier  cel...@adacore.com

* prj-nmsc.adb (Check_Object): If a unit is in a multi-source
file, its object file is never the same as any other unit.

Index: prj-nmsc.adb
===
--- prj-nmsc.adb (revision 217874)
+++ prj-nmsc.adb (working copy)
@@ -2577,7 +2577,7 @@
 Error_Msg_Name_1 := Lang_Index.Display_Name;
 Error_Msg
   (Data.Flags,
-   ?no compiler specified for language %% 
+   ?\no compiler specified for language %% 
  , ignoring all its sources,
No_Location, Project);
 
@@ -2604,7 +2604,7 @@
 if Lang_Index.Config.Naming_Data.Spec_Suffix = No_File then
Error_Msg
  (Data.Flags,
-  Spec_Suffix not specified for  
+  \Spec_Suffix not specified for  
   Get_Name_String (Lang_Index.Name),
   No_Location, Project);
 end if;
@@ -2612,7 +2612,7 @@
 if Lang_Index.Config.Naming_Data.Body_Suffix = No_File then
Error_Msg
  (Data.Flags,
-  Body_Suffix not specified for  
+  \Body_Suffix not specified for  
   Get_Name_String (Lang_Index.Name),
   No_Location, Project);
 end if;
@@ -2630,7 +2630,7 @@
Error_Msg_Name_1 := Lang_Index.Display_Name;
Error_Msg
  (Data.Flags,
-  no suffixes specified for %%,
+  \no suffixes specified for %%,
   No_Location, Project);
 end if;
  end if;
@@ -3770,7 +3770,7 @@
if Switches /= No_Array_Element then
   Error_Msg
 (Data.Flags,
- ?Linker switches not taken into account in library  
+ ?\Linker switches not taken into account in library  
  projects,
  No_Location, Project);
end if;
@@ -6793,7 +6793,7 @@
 Error_Msg_Name_2 := Source.Unit.Name;
 Error_Or_Warning
   (Data.Flags, Data.Flags.Missing_Source_Files,
-   source file %% for unit %% not found,
+   \source file %% for unit %% not found,
No_Location, Project.Project);
  end if;
   end if;
@@ -7789,7 +7789,7 @@
 Error_Msg_File_1 := Source.File;
 Error_Msg
   (Data.Flags,
-   { cannot be both excluded and an exception file name,
+   \{ cannot be both excluded and an exception file name,
No_Location, Project.Project);
  end if;
 
@@ -7936,13 +7936,15 @@
  if Source /= No_Source
and then Source.Replaced_By = No_Source
and then Source.Path /= Src.Path
+   and then Source.Index = 0
+   and then Src.Index = 0
and then Is_Extending (Src.Project, Source.Project)
  then
 Error_Msg_File_1 := Src.File;
 Error_Msg_File_2 := Source.File;
 Error_Msg
   (Data.Flags,
-   { and { have the same object file name,
+   \{ and { have the same object file name,
No_Location, Project.Project);
 
  else


Re: [PATCH, committed] Update Automake files

2014-11-20 Thread Jan-Benedict Glaw
Hi Bernd,

On Thu, 2014-11-20 14:34:00 +0100, Bernd Edlinger bernd.edlin...@hotmail.de 
wrote:
  This patch updates the files taken from Automake.  Committed.

 the updated version of missing will confuse the gmp-4.3.2 configure
 script if it is installed in-tree with contrib/download_prerequisites
 and flex is not installed:
 
 ...
 checking readline detected... no
 checking for bison... (cached) /home/ed/gnu/gcc-5-20141116/missing bison -y
 checking for flex... (cached) /home/ed/gnu/gcc-5-20141116/missing flex
 checking lex output file root... configure: error: cannot find output from 
 /home/ed/gnu/gcc-5-20141116/missing flex; giving up
 make[3]: *** [config.status] Error 1
 make[3]: Leaving directory `/home/ed/gnu/gcc-build-arm-linux-gnueabihf/gmp'
 make[2]: *** [all-stage1-gmp] Error 2
 make[2]: Leaving directory `/home/ed/gnu/gcc-build-arm-linux-gnueabihf'
 make[1]: *** [stage1-bubble] Error 2
 make[1]: Leaving directory `/home/ed/gnu/gcc-build-arm-linux-gnueabihf'
 make: *** [all] Error 2
 
 previous version of missing flex produced a dummy lex.yy.c,
 as does the version in the gmp package, but unfortunately it is
 overwritten by the missing script in the gcc tree.

Overridden actually by setting/exporting $FLEX and $LEX to GCC's own
file and calling GMP's `configure' that way.

 That's probably just not a supported configuration anymore,
 but all previous GCC releases worked without a installed flex tool.

I don't think anything changed wrt. the GMP version, see
https://gcc.gnu.org/install/prerequisites.html. (And it's the same problem with
current gmp-6.0.0a, so even updating GMP wouldn't help.)

 Maybe the problem goes away if a newer version of gmp is used,
 or if the missing flex is not passed down to the gmp configure script,
 somehow.  Actually, it is not really needed by gmp at all.

It seems the only flex'able source file (in both GMP-4.3.2 and current
6.0.0a) is a demo file.

 I tried to add this hunk from the old version and it made the gmp configure
 script work again:
 
 --- missing.orig  2014-11-16 14:07:13.0 +
 +++ missing   2014-11-19 15:01:57.168967538 +
 @@ -172,6 +172,21 @@
echo You should only need it if you modified a '.l' file.
echo You may want to install the Fast Lexical Analyzer package:
echo $flex_URL
 +  rm -f lex.yy.c
 +  if test $# -ne 1; then
 +eval LASTARG=\${$#}
 +case $LASTARG in
 +*.l)
 +  SRCFILE=`echo $LASTARG | sed 's/l$/c/'`
 +  if test -f $SRCFILE; then
 +  cp $SRCFILE lex.yy.c
 +  fi
 +;;
 +esac
 +  fi
 +  if test ! -f lex.yy.c; then
 +  echo 'main() { return 0; }' >lex.yy.c
 +  fi
;;
  help2man*)
echo You should only need it if you modified a dependency \
 
 
 What do you think?

That patch defeats my attempt to re-sync with upstream files. :)  The
Automake guys decided that faking a tool by simulating a successful
run isn't that much of a good idea, and thinking about it, this looks
like a good decision.

  I actually kind of think that this is simply a small bug in GMP. It
shouldn't require a working flex just for potentially building some
demo file. After all, the release tarballs could just contain the .c
file. No need for the lex file at all!

  However, I *think* the real problem is the way flags are passed.
Top-level `configure' doesn't find `flex' (okay) and sets
FLEX=./missing flex. This $FLEX gets passed by Makefile down to
GMP's `configure', which by now should, IMO, point to a /working/
flex. That way, GMP's `configure' of course will choke finding the
generated output file.

  So what shall we do now?  I'd be quite okay with reverting my
`missing' update for now, until this is actually fixed. However, it
would be nice if we'd discuss that, along with the GMP guys.

  Another fix (or is it a workaround?) would be to not hand down $FLEX
and $LEX:

diff --git a/Makefile.def b/Makefile.def
index 40bbca9..7b988fe 100644
--- a/Makefile.def
+++ b/Makefile.def
@@ -229,13 +229,11 @@ flags_to_pass = { flag= CC_FOR_BUILD ; };
 flags_to_pass = { flag= CFLAGS_FOR_BUILD ; };
 flags_to_pass = { flag= CXX_FOR_BUILD ; };
 flags_to_pass = { flag= EXPECT ; };
-flags_to_pass = { flag= FLEX ; };
 flags_to_pass = { flag= INSTALL ; };
 flags_to_pass = { flag= INSTALL_DATA ; };
 flags_to_pass = { flag= INSTALL_PROGRAM ; };
 flags_to_pass = { flag= INSTALL_SCRIPT ; };
 flags_to_pass = { flag= LDFLAGS_FOR_BUILD ; };
-flags_to_pass = { flag= LEX ; };
 flags_to_pass = { flag= M4 ; };
 flags_to_pass = { flag= MAKE ; };
 flags_to_pass = { flag= RUNTEST ; };

(...and regenerate Makefile{,.in}.)


  However, we need to discuss that. I'll head over to the GMP bugs
mailing list and discuss your build error over there.

MfG, JBG

-- 
  Jan-Benedict Glaw  jbg...@lug-owl.de  +49-172-7608481
Signature of: Friends are relatives you make for yourself.
the second  :



Re: [PATCH] PR63426 Fix various signed integer overflows

2014-11-20 Thread Jakub Jelinek
On Thu, Nov 20, 2014 at 02:27:52PM +0100, Markus Trippelsdorf wrote:
 2014-11-20  Markus Trippelsdorf  mar...@trippelsdorf.de
 
   * emit-rtl.c (const_wide_int_htab_hash): Likewise.
   * loop-iv.c (determine_max_iter): Likewise.
   (iv_number_of_iterations): Likewise.
   * tree-ssa-loop-ivopts.c (get_computation_cost_at): Likewise.
   * varasm.c (get_section_anchor): Likewise.

Ok, with one small change:

 --- a/gcc/varasm.c
 +++ b/gcc/varasm.c
 @@ -7188,7 +7188,7 @@ get_section_anchor (struct object_block *block, HOST_WIDE_INT offset,
  offset = 0;
else
  {
 -  bias = 1 << (GET_MODE_BITSIZE (ptr_mode) - 1);
 +  bias = (unsigned HOST_WIDE_INT) 1 << (GET_MODE_BITSIZE (ptr_mode) - 1);

Please use HOST_WIDE_INT_1U instead of (unsigned HOST_WIDE_INT) 1.
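
For context, a minimal sketch of why the type of the constant matters: a plain 1 has type int, so shifting it into the sign bit (or beyond the width of int) is undefined, whereas a wide unsigned 1 makes the same shift well defined. Here uint64_t stands in for unsigned HOST_WIDE_INT on the assumption that it is 64 bits wide, and sign_bit_bias is a made-up helper name.

```c
#include <stdint.h>

/* Value with only bit (BITS - 1) set, for 1 <= BITS <= 64.  The cast makes
   the left operand a 64-bit unsigned value before the shift, so the result
   is well defined even for BITS == 32 or BITS == 64.  */
uint64_t sign_bit_bias(unsigned bits)
{
  return (uint64_t) 1 << (bits - 1);
}
```

Writing `1 << 31` instead would overflow a 32-bit signed int, which is the kind of overflow this patch series is removing.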

Jakub


Re: [PATCH][doc] Document cortex-a17 and cortex-a17.cortex-a7 -m{cpu,tune} options

2014-11-20 Thread Richard Earnshaw
On 20/11/14 15:31, Kyrill Tkachov wrote:
 Hi all,
 
 As Joseph reminded, new -mcpu options should be documented in 
 invoke.texi. This adds the documentation for the cortex-a17 and 
 cortex-a17.cortex-a7 values.
 Ok to go in if the corresponding support patches posted earlier are 
 accepted?
 
 Thanks,
 Kyrill
 
 2014-11-19  Kyrylo Tkachov  kyrylo.tkac...@arm.com
 
  * doc/invoke.texi (ARM Options): Document cortex-a17 and
  cortex-a17.cortex-a7 as permissible -mtune values.
 
 

Yes, this is fine.

R.

 a17-doc.patch
 
 
 diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
 index 785faec..a81cc16 100644
 --- a/gcc/doc/invoke.texi
 +++ b/gcc/doc/invoke.texi
 @@ -12813,7 +12813,8 @@ Permissible names are: @samp{arm2}, @samp{arm250},
  @samp{arm1136j-s}, @samp{arm1136jf-s}, @samp{mpcore}, @samp{mpcorenovfp},
  @samp{arm1156t2-s}, @samp{arm1156t2f-s}, @samp{arm1176jz-s}, 
 @samp{arm1176jzf-s},
  @samp{cortex-a5}, @samp{cortex-a7}, @samp{cortex-a8}, @samp{cortex-a9},
 -@samp{cortex-a12}, @samp{cortex-a15}, @samp{cortex-a53}, @samp{cortex-a57},
 +@samp{cortex-a12}, @samp{cortex-a15}, @samp{cortex-a17}, @samp{cortex-a53},
 +@samp{cortex-a57},
  @samp{cortex-r4},
  @samp{cortex-r4f}, @samp{cortex-r5}, @samp{cortex-r7}, @samp{cortex-m7},
  @samp{cortex-m4},
 @@ -12831,7 +12832,8 @@ Permissible names are: @samp{arm2}, @samp{arm250},
  
  Additionally, this option can specify that GCC should tune the performance
  of the code for a big.LITTLE system.  Permissible names are:
 -@samp{cortex-a15.cortex-a7}, @samp{cortex-a57.cortex-a53}.
 +@samp{cortex-a15.cortex-a7}, @samp{cortex-a17.cortex-a7},
 +@samp{cortex-a57.cortex-a53}.
  
  @option{-mtune=generic-@var{arch}} specifies that GCC should tune the
  performance for a blend of processors within architecture @var{arch}.
 




Re: [PATCH 10/21] PR jit/63854: Fix leak of worklist within jit-recording.c

2014-11-20 Thread Richard Biener
On Wed, Nov 19, 2014 at 9:02 PM, David Malcolm dmalc...@redhat.com wrote:
 On Wed, 2014-11-19 at 09:59 -0700, Jeff Law wrote:
 On 11/19/14 03:46, David Malcolm wrote:
  Fix this leak:
 
  160 bytes in 5 blocks are definitely lost in loss record 154 of 228
  at 0x4A0645D: malloc (in 
  /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
  by 0x5D75D4F: xrealloc (xmalloc.c:177)
  by 0x4DE1710: void 
  va_heap::reserve<gcc::jit::recording::block*>(vec<gcc::jit::recording::block*,
   va_heap, vl_embed>*&, unsigned int, bool) (vec.h:310)
  by 0x4DDFAB5: vec<gcc::jit::recording::block*, va_heap, 
  vl_ptr>::reserve(unsigned int, bool) (vec.h:1428)
  by 0x4DDFBFC: vec<gcc::jit::recording::block*, va_heap, 
  vl_ptr>::reserve_exact(unsigned int) (vec.h:1448)
  by 0x4DDE588: vec<gcc::jit::recording::block*, va_heap, 
  vl_ptr>::create(unsigned int) (vec.h:1463)
  by 0x4DD9B9F: gcc::jit::recording::function::validate() 
  (jit-recording.c:2191)
  by 0x4DD7AD3: gcc::jit::recording::context::validate() 
  (jit-recording.c:1005)
  by 0x4DD7660: gcc::jit::recording::context::compile() 
  (jit-recording.c:848)
  by 0x4DD5BD2: gcc_jit_context_compile (libgccjit.c:2014)
  by 0x401CA4: test_jit (harness.h:190)
  by 0x401D88: main (harness.h:232)
 
  gcc/jit/ChangeLog:
  PR jit/63854
  * jit-recording.c (recording::function::validate): Convert
  worklist from vec to auto_vec to fix a leak.
 JIT space, yours to approve :-)  We haven't formalized that yet, but
 it'd be silly to do anything else.

 FWIW, I added myself to the MAINTAINERS file as JIT maintainer as part
 of a change you reviewed as:
   https://gcc.gnu.org/ml/jit/2014-q4/msg00029.html

 Is there a governance distinction here, between patch review vs
 decisions of the steering committee?  i.e. do changes to the maintainers
 part of the MAINTAINERS file require higher-level approval?

Yes, reviewers and maintainers are appointed by the steering committee only.

Richard.

 Presumably I should continue to send (non-trivial) jit patches to this
 list and wait for review before committing to trunk?

 Anyway so formally, this is OK for the trunk.

 Thanks.



Re: gimple-classes-v2-option-3 git branch committed to svn trunk as r217787

2014-11-20 Thread Richard Biener
On Thu, Nov 20, 2014 at 4:34 PM, Michael Matz m...@suse.de wrote:
 Hi,

 On Thu, 20 Nov 2014, Richard Biener wrote:

  I don't think this API will make the non-C++-fans happier; I think the
  objection to the work I just merged is that it's adding more C++ than
  those people are comfortable with.

 How so?  It's already super-ugly in those views.  We decided to get C++.
 Now we have it.

 And?  Nobody says we can't have nice looking code even with C++.

 Now please make it AT LEAST CONSISTENT.

 True.

  I suspect that any API which requires the use of < > characters within the
  implementation of a gimple pass to mean a template is going to give
  those less happy with C++ a visceral ugh reaction.  I wonder if
  there's a way to spell these things that's concise and which doesn't
  involve < > ?

 Only if you drop as_a/is_a/dyn_cast everywhere.

 Oh god, yes.  Please!  IMHO they don't accomplish much, but make code
 harder to visually parse.  They don't accomplish much because you
 have to write the snippets that check validity of conversions anyway, so
 they can just as well be written as proper methods or global functions, or
 even just conversion operators.  Nothing forces us to implement these
 snippets as noisy template specializations like:

   template <>
   template <>
   inline bool
   is_a_helper <cgraph_node *>::test (symtab_node *p)
   {
     return p->type == SYMTAB_FUNCTION;
   }

 instead of the more mundane means.  And once you have those snippets as
 normal functions, you can just as well call them like they are functions,
 making the using side of those conversion also look nicer.

True.  I don't remember exactly but exclusively using member functions
wasn't in the list of proposals that ended up with us doing as_a/is_a
as it is done now.

Can unions / PODs have such member functions?  Just thinking about
a reason why it wasn't proposed.

Btw, I don't see as_a/is_a/dyn_cast as super-ugly - it's actually a
perfectly fine C++-way of doing RTTI.

I also guess that it requires less code in the actual implementation
as we share one helper for all three operations.  Of course we
could macroize the member function implementation in some
clever way 

That said - we have a substantial amount of code using as_a/is_a/dyn_cast
and I don't think it's appropriate at this point to change all of it
to a different
mechanism.  Proposals with example patches are of course welcome, but
beware that if you want to succeed here then as-a.h needs to go ;)

Thanks,
Richard.


 Ciao,
 Michael.


[PR63762][4.9] Backport the patch which fixes GCC generates UNPREDICTABLE STR with Rn = Rt for arm

2014-11-20 Thread Renlin Li

Hi all,

This is a backport for gcc-4_9-branch of the patch [PR63762]GCC 
generates UNPREDICTABLE STR with Rn = Rt for arm posted in: 
https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02253.html


arm-none-eabi has been tested on the model with no new issues. Bootstrapped 
and regression tested on x86 with no new issues.


Is it Okay for gcc-4_9-branch?

gcc/ChangeLog:

2014-11-20  Renlin Li  renlin...@arm.com

PR middle-end/63762
* ira.c (ira): Update preferred class.

gcc/testsuite/ChangeLog:

2014-11-20  Renlin Li  renlin...@arm.com

PR middle-end/63762
* gcc.dg/pr63762.c: New.

diff --git a/gcc/ira.c b/gcc/ira.c
index 9c9e71d..e610d35 100644
--- a/gcc/ira.c
+++ b/gcc/ira.c
@@ -5263,7 +5263,18 @@ ira (FILE *f)
 	  ira_allocno_iterator ai;
 
 	  FOR_EACH_ALLOCNO (a, ai)
-		ALLOCNO_REGNO (a) = REGNO (ALLOCNO_EMIT_DATA (a)->reg);
+{
+  int old_regno = ALLOCNO_REGNO (a);
+  int new_regno = REGNO (ALLOCNO_EMIT_DATA (a)->reg);
+
+  ALLOCNO_REGNO (a) = new_regno;
+
+  if (old_regno != new_regno)
+setup_reg_classes (new_regno, reg_preferred_class (old_regno),
+   reg_alternate_class (old_regno),
+   reg_allocno_class (old_regno));
+}
+
 	}
 	  else
 	{
diff --git a/gcc/testsuite/gcc.dg/pr63762.c b/gcc/testsuite/gcc.dg/pr63762.c
new file mode 100644
index 000..df11067
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr63762.c
@@ -0,0 +1,77 @@
+/* PR middle-end/63762 */
+/* { dg-do assemble } */
+/* { dg-options "-O2" } */
+
+#include <stdlib.h>
+
+void *astFree ();
+void *astMalloc ();
+void astNegate (void *);
+int astGetNegated (void *);
+void astGetRegionBounds (void *, double *, double *);
+int astResampleF (void *, ...);
+
+extern int astOK;
+
+int
+MaskF (int inside, int ndim, const int lbnd[], const int ubnd[],
+   float in[], float val)
+{
+
+  void *used_region;
+  float *c, *d, *out, *tmp_out;
+  double *lbndgd, *ubndgd;
+  int *lbndg, *ubndg, idim, ipix, nax, nin, nout, npix, npixg, result = 0;
+  if (!astOK) return result;
+  lbndg = astMalloc (sizeof (int)*(size_t) ndim);
+  ubndg = astMalloc (sizeof (int)*(size_t) ndim);
+  lbndgd = astMalloc (sizeof (double)*(size_t) ndim);
+  ubndgd = astMalloc (sizeof (double)*(size_t) ndim);
+  if (astOK)
+{
+  astGetRegionBounds (used_region, lbndgd, ubndgd);
+  npix = 1;
+  npixg = 1;
+  for (idim = 0; idim < ndim; idim++)
+{
+  lbndg[ idim ] = lbnd[ idim ];
+  ubndg[ idim ] = ubnd[ idim ];
+  npix *= (ubnd[ idim ] - lbnd[ idim ] + 1);
+  if (npixg >= 0) npixg *= (ubndg[ idim ] - lbndg[ idim ] + 1);
+}
+  if (npixg <= 0 && astOK)
+{
+  if ((inside != 0) == (astGetNegated( used_region ) != 0))
+{
+  c = in;
+  for (ipix = 0; ipix < npix; ipix++) *(c++) = val;
+  result = npix;
+}
+}
+  else if (npixg > 0 && astOK)
+{
+  if ((inside != 0) == (astGetNegated (used_region) != 0))
+{
+  tmp_out = astMalloc (sizeof (float)*(size_t) npix);
+  if (tmp_out)
+{
+  c = tmp_out;
+  for (ipix = 0; ipix < npix; ipix++) *(c++) = val;
+  result = npix - npixg;
+}
+  out = tmp_out;
+}
+  else
+{
+  tmp_out = NULL;
+  out = in;
+}
+  if (inside) astNegate (used_region);
+  result += astResampleF (used_region, ndim, lbnd, ubnd, in, NULL,
+  NULL, NULL, 0, 0.0, 100, val, ndim,
+  lbnd, ubnd, lbndg, ubndg, out, NULL);
+  if (inside) astNegate (used_region);
+}
+}
+  return result;
+}

[Ada] PR ada/63931

2014-11-20 Thread Arnaud Charlet
Fixing version number according to new GCC naming scheme.

PR ada/63931   
* gnatvsn.ads (Library_Version): Switch to 5.

Index: gnatvsn.ads
===
--- gnatvsn.ads (revision 217874)
+++ gnatvsn.ads (working copy)
@@ -82,7 +82,7 @@
--  Prefix generated by binder. If it is changed, be sure to change
--  GNAT.Compiler_Version.Ver_Prefix as well.
 
-   Library_Version : constant String := "5.0";
+   Library_Version : constant String := "5";
--  Library version. This value must be updated when the compiler
--  version number Gnat_Static_Version_String is updated.
--


Re: [PATCH 10/21] PR jit/63854: Fix leak of worklist within jit-recording.c

2014-11-20 Thread Jeff Law

On 11/20/14 09:01, Richard Biener wrote:

Is there a governance distinction here, between patch review vs
decisions of the steering committee?  i.e. do changes to the maintainers
part of the MAINTAINERS file require higher-level approval?


Yes, reviewers and maintainers are appointed by the steering committee only.
Right.  I've already raised appointing David as the JIT maintainer to 
the steering committee.  I just need to count the votes and take 
appropriate action.


Similarly for the MPX runtime and Ilya as the MPX maintainer,  Bernd as 
the nvptx maintainer.


If there are other maintainers who need to be appointed, nobody should 
hesitate to contact one of the SC members to get the nomination in front 
of the committee.


jeff



Re: [PATCH 1/2] teach mklog to get name / email from git config when available

2014-11-20 Thread Tom de Vries

On 09-05-14 16:47, Diego Novillo wrote:

I would probably use git config directly here. It would work with both
git and svn checkouts (if you have a global .git configuration). But
testing for .git is fine with me as well.

I like Peter's idea of having a ~/.mklog file to override. This would
work for both svn and git checkouts.



Diego,

this patch implements both:
- it uses the ~/.mklog file proposed by Peter
- in absence of a ~/.mklog file, it uses git config, also when not in a git
  repository

OK?

Thanks,
- Tom
2014-11-20  Tom de Vries  t...@codesourcery.com
	Peter Bergner  berg...@vnet.ibm.com

	* mklog: Handle .mklog.  Use git setting independent of presence .git
	directory.
---
 contrib/mklog | 56 +++-
 1 file changed, 35 insertions(+), 21 deletions(-)

diff --git a/contrib/mklog b/contrib/mklog
index 840f6f8..abbf0af 100755
--- a/contrib/mklog
+++ b/contrib/mklog
@@ -29,32 +29,46 @@
 use File::Temp;
 use File::Copy qw(cp mv);
 
-# Change these settings to reflect your profile.
-$username = $ENV{'USER'};
-$name = `finger $username | grep -o 'Name: .*'`;
-@n = split(/: /, $name);
-$name = $n[1]; chop($name);
-$addr = $username . "\@my.domain.org";
 $date = `date +%Y-%m-%d`; chop ($date);
 
+$dot_mklog_format_msg =
+"The .mklog format is:\n"
+. "NAME = ...\n"
+. "EMAIL = ...\n";
+
+# Create a .mklog to reflect your profile, if necessary.
+my $conf = "$ENV{HOME}/.mklog";
+if (-f $conf) {
+open (CONF, $conf)
+	or die "Could not open file '$conf' for reading: $!\n";
+while (<CONF>) {
+	if (m/^\s*NAME\s*=\s*(.*)\s*$/)	{
+	$name = $1;
+	} elsif (m/^\s*EMAIL\s*=\s*(.*)\s*$/) {
+	$addr = $1;
+	}
+}
+if (!($name && $addr)) {
+	die "Could not read .mklog settings.\n"
+	. $dot_mklog_format_msg;
+}
+} else {
+$name = `git config user.name`;
+chomp($name);
+$addr = `git config user.email`;
+chomp($addr);
+
+if (!($name && $addr)) {
+	die "Could not read git user.name and user.email settings.\n"
+	. "Please add missing git settings, or create a .mklog file in"
+	. " $ENV{HOME}.\n"
+	. $dot_mklog_format_msg;
+}
+}
+
 $gcc_root = $0;
 $gcc_root =~ s/[^\\\/]+$/../;
 
-# if this is a git tree then take name and email from the git configuration
-if (-d "$gcc_root/.git") {
-  $gitname = `git config user.name`;
-  chomp($gitname);
-  if ($gitname) {
-	  $name = $gitname;
-  }
-
-  $gitaddr = `git config user.email`;
-  chomp($gitaddr);
-  if ($gitaddr) {
-	  $addr = $gitaddr;
-  }
-}
-
 #-
 # Program starts here. You should not need to edit anything below this
 # line.
-- 
1.9.1



Re: [PATCH x86, PR60451] Expand even/odd permutation using pack insn.

2014-11-20 Thread Evgeny Stupachenko
Bootstrap / make check passed with updated patch.

Is it still ok?

It looks like we don't need expand_vec_perm_vpshufb2_vpermq_even_odd
any more with the patch.
However, the cleanup will be in a separate patch after appropriate testing.

Modified ChangeLog:

2014-11-20  Evgeny Stupachenko  evstu...@gmail.com

gcc/testsuite
PR target/60451
* gcc.target/i386/pr60451.c: New.

gcc/
PR target/60451
* config/i386/i386.c (expand_vec_perm_even_odd_pack): New.
(expand_vec_perm_even_odd_1): Add new expand for V8HI mode,
replace for V16QI, V16HI and V32QI modes.
(ix86_expand_vec_perm_const_1): Add new expand.

On Thu, Nov 20, 2014 at 6:03 PM, Evgeny Stupachenko evstu...@gmail.com wrote:
 Good point! gen_shift also requires only SSE2.
 That way we can optimize out interleave sequence for V16QI mode in
 expand_vec_perm_even_odd_1.
 Thanks!

 Evgeny

 Updated patch:

 diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
 index 085eb54..054089b 100644
 --- a/gcc/config/i386/i386.c
 +++ b/gcc/config/i386/i386.c
 @@ -48322,6 +48322,127 @@ expand_vec_perm_vpshufb2_vpermq_even_odd
 (struct expand_vec_perm_d *d)
return true;
  }

 +/* A subroutine of expand_vec_perm_even_odd_1.  Implement extract-even
 +   and extract-odd permutations of two V16QI, V8HI, V16HI or V32QI operands
 +   with two and and pack or two shift and pack insns.  We should
 +   have already failed all two instruction sequences.  */
 +
 +static bool
 +expand_vec_perm_even_odd_pack (struct expand_vec_perm_d *d)
 +{
 +  rtx op, dop0, dop1, t, rperm[16];
 +  unsigned i, odd, c, s, nelt = d->nelt;
 +  bool end_perm = false;
 +  machine_mode half_mode;
 +  rtx (*gen_and) (rtx, rtx, rtx);
 +  rtx (*gen_pack) (rtx, rtx, rtx);
 +  rtx (*gen_shift) (rtx, rtx, rtx);
 +
 +  if (d->one_operand_p)
 +return false;
 +
 +  switch (d->vmode)
 +{
 +case V8HImode:
 +  /* Required for pack.  */
 +  if (!TARGET_SSE4_1)
 +return false;
 +  c = 0xffff;
 +  s = 16;
 +  half_mode = V4SImode;
 +  gen_and = gen_andv4si3;
 +  gen_pack = gen_sse4_1_packusdw;
 +  gen_shift = gen_lshrv4si3;
 +  break;
 +case V16QImode:
 +  /* No check as all instructions are SSE2.  */
 +  c = 0xff;
 +  s = 8;
 +  half_mode = V8HImode;
 +  gen_and = gen_andv8hi3;
 +  gen_pack = gen_sse2_packuswb;
 +  gen_shift = gen_lshrv8hi3;
 +  break;
 +case V16HImode:
 +  if (!TARGET_AVX2)
 +return false;
 +  c = 0xffff;
 +  s = 16;
 +  half_mode = V8SImode;
 +  gen_and = gen_andv8si3;
 +  gen_pack = gen_avx2_packusdw;
 +  gen_shift = gen_lshrv8si3;
 +  end_perm = true;
 +  break;
 +case V32QImode:
 +  if (!TARGET_AVX2)
 +return false;
 +  c = 0xff;
 +  s = 8;
 +  half_mode = V16HImode;
 +  gen_and = gen_andv16hi3;
 +  gen_pack = gen_avx2_packuswb;
 +  gen_shift = gen_lshrv16hi3;
 +  end_perm = true;
 +  break;
 +default:
 +  /* Only V8HI, V16QI, V16HI and V32QI modes are more profitable than
 +general shuffles.  */
 +  return false;
 +}
 +
 +  /* Check that permutation is even or odd.  */
 +  odd = d->perm[0];
 +  if (odd > 1)
 +return false;
 +
 +  for (i = 1; i < nelt; ++i)
 +if (d->perm[i] != 2 * i + odd)
 +  return false;
 +
 +  if (d->testing_p)
 +return true;
 +
 +  dop0 = gen_reg_rtx (half_mode);
 +  dop1 = gen_reg_rtx (half_mode);
 +  if (odd == 0)
 +{
 +  for (i = 0; i < nelt / 2; i++)
 +   rperm[i] = GEN_INT (c);
 +  t = gen_rtx_CONST_VECTOR (half_mode, gen_rtvec_v (nelt / 2, rperm));
 +  t = force_reg (half_mode, t);
 +  emit_insn (gen_and (dop0, t, gen_lowpart (half_mode, d->op0)));
 +  emit_insn (gen_and (dop1, t, gen_lowpart (half_mode, d->op1)));
 +}
 +  else
 +{
 +  emit_insn (gen_shift (dop0,
 +   gen_lowpart (half_mode, d->op0),
 +   GEN_INT (s)));
 +  emit_insn (gen_shift (dop1,
 +   gen_lowpart (half_mode, d->op1),
 +   GEN_INT (s)));
 +}
 +  /* In AVX2 for 256 bit case we need to permute pack result.  */
 +  if (TARGET_AVX2 && end_perm)
 +{
 +  op = gen_reg_rtx (d->vmode);
 +  t = gen_reg_rtx (V4DImode);
 +  emit_insn (gen_pack (op, dop0, dop1));
 +  emit_insn (gen_avx2_permv4di_1 (t,
 + gen_lowpart (V4DImode, op),
 + const0_rtx,
 + const2_rtx,
 + const1_rtx,
 + GEN_INT (3)));
 +  emit_move_insn (d->target, gen_lowpart (d->vmode, t));
 +}
 +  else
 +emit_insn (gen_pack (d->target, dop0, dop1));
 +
 +  return true;
 +}
 +
  /* A subroutine of ix86_expand_vec_perm_builtin_1.  Implement extract-even
 and extract-odd permutations.  */

 @@ -48393,7 +48514,9 @@ 
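
The and/shift plus pack idea described in the comment of the patch above can be modelled in scalar C. This is only a sketch of the V16QI case under stated assumptions: the function name extract_even_odd_bytes is made up for illustration, the two operands are treated as arrays of 16 bytes viewed as little-endian 16-bit lanes, and no SSE intrinsics are used.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar model of the V16QI case: OP0 and OP1 each hold 16 bytes, viewed
   as 8 little-endian 16-bit lanes.  ODD == 0 selects the even-indexed
   bytes of the concatenation, ODD == 1 the odd-indexed ones.  */
void
extract_even_odd_bytes (const uint8_t op0[16], const uint8_t op1[16],
                        int odd, uint8_t out[16])
{
  const uint8_t *ops[2] = { op0, op1 };

  for (size_t k = 0; k < 2; k++)
    for (size_t i = 0; i < 8; i++)
      {
        /* Load one 16-bit lane.  */
        uint16_t lane
          = (uint16_t) (ops[k][2 * i] | (ops[k][2 * i + 1] << 8));

        /* Even: AND with 0xff keeps the low byte.
           Odd: a logical shift right by 8 moves the high byte down.  */
        lane = odd ? (uint16_t) (lane >> 8) : (uint16_t) (lane & 0xff);

        /* Pack: the lane now fits in a byte, so the unsigned saturation
           done by a pack instruction cannot clip it.  */
        out[8 * k + i] = (uint8_t) lane;
      }
}
```

Element j of the result is element 2 * j + odd of the concatenated inputs, which is the permutation the patch checks for before emitting the sequence.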

Re: [patch] Warn on undefined loop exit

2014-11-20 Thread Richard Biener
On Wed, Nov 19, 2014 at 9:19 PM, Andrew Stubbs a...@codesourcery.com wrote:
 On 19/11/14 16:39, Marek Polacek wrote:

 On Wed, Nov 19, 2014 at 04:32:43PM +, Andrew Stubbs wrote:

 +if (warning_at (gimple_location (elt-stmt),
 +OPT_Waggressive_loop_optimizations,
 +"Loop exit may only be reached after undefined behaviour."))


 Warnings should start with a lowercase and should be without
 a fullstop at the end.


 Fixed, and I spotted a britishism too.

If it's really duplicated code can you split it out to a function?

+  if (OPT_Waggressive_loop_optimizations)
+{

this doesn't do what you think it does ;)  The variable to check is
warn_aggressive_loop_optimizations.
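
To illustrate the distinction being made here, a tiny standalone sketch follows. It does not use GCC's headers: the enum and the flag variable are hypothetical miniatures, assuming only that an OPT_* name identifies an option while a warn_* variable holds its current setting.

```c
#include <stdbool.h>

/* Hypothetical miniature of the option machinery: OPT_* values identify
   an option, warn_* variables hold whether it is enabled.  */
enum opt_code_sketch
{
  OPT_Wall_sketch,              /* 0 */
  OPT_Waggressive_loop_sketch   /* 1: nonzero whatever the command line says */
};

/* Would be set from the command line; off by default in this sketch.  */
int warn_aggressive_loop_sketch = 0;

/* Wrong: tests the option's identifier, not whether it is enabled.  */
bool
enabled_by_code (void)
{
  return OPT_Waggressive_loop_sketch != 0;
}

/* Right: tests the flag variable.  */
bool
enabled_by_flag (void)
{
  return warn_aggressive_loop_sketch != 0;
}
```

In this sketch the first test is true even with the warning disabled, which is why the guard has to look at the variable.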

+  if (exit_warned && problem_stmts != vNULL)
+{

!problem_stmts.empty ()

Otherwise it looks ok.

Thanks,
Richard.

 Andrew


Add to maintainers list.

2014-11-20 Thread Alex Velenko

2014-11-20  Alex Velenko  alex.vele...@arm.com

* MAINTAINERS (write-after-approval): Add myself.

diff --git a/MAINTAINERS b/MAINTAINERS
index 11a28ef..eada4e9 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -566,6 +566,7 @@ David Ung   
dav...@mips.com
 Neil Vachharajani  nvach...@gmail.com
 Kris Van Hees  kris.van.h...@oracle.com
 Joost VandeVondele joost.vandevond...@mat.ethz.ch
+Alex Velenko   alex.vele...@arm.com
 Ilya Verbin  iver...@gmail.com
 Kugan Vivekanandarajah kug...@linaro.org
 Tom de Vries   t...@codesourcery.com

Re: [PATCH][AArch64] Add bounds checking to vqdm*_lane intrinsics via a qualifier that also flips endianness

2014-11-20 Thread Charles Baylis
On 20 November 2014 07:49, Marcus Shawcroft marcus.shawcr...@gmail.com wrote:
 On 19 November 2014 19:05, Charles Baylis charles.bay...@linaro.org wrote:

 PR target/63870
 * config/aarch64/aarch64-builtins.c (aarch64_simd_expand_args): Pass
 expression to aarch64_simd_lane_bounds.
 * config/aarch64/aarch64-protos.h (aarch64_simd_lane_bounds): Update
 prototype.
 * config/aarch64/aarch64.c (aarch64_simd_lane_bounds): Add exp
 parameter. Report calling function in error message if exp is 
 non-NULL.

 This needs to be updated to reflect the changes in the last revision
 of the patch where NULL is passed explicitly. Otherwise OK, commit it
 with a fixed ChangeLog.

Sorry... more haste, less speed.

Committed as r217885, with the following ChangeLog:

2014-11-20  Charles Baylis  charles.bay...@linaro.org

PR target/63870
* config/aarch64/aarch64-builtins.c (aarch64_simd_expand_args): Pass
expression to aarch64_simd_lane_bounds.
* config/aarch64/aarch64-protos.h (aarch64_simd_lane_bounds): Update
prototype.
* config/aarch64/aarch64-simd.md: (aarch64_combinez<mode>): Update
call to aarch64_simd_lane_bounds.
(aarch64_get_lanedi): Likewise.
(aarch64_ld2_lane<mode>): Likewise.
(aarch64_ld3_lane<mode>): Likewise.
(aarch64_ld4_lane<mode>): Likewise.
(aarch64_im_lane_boundsi): Likewise.
* config/aarch64/aarch64.c (aarch64_simd_lane_bounds): Add exp
parameter. Report calling function in error message if exp is non-NULL.


[PATCH] PR ipa/63909 ICE: SIGSEGV in ipa_icf_gimple::func_checker::compare_bb()

2014-11-20 Thread Martin Liška

Hello.

The following patch fixes an ICE in IPA ICF. The problem was that the number 
of non-debug statements in a BB can change (for instance by IPA split), so 
the patch recomputes that number before comparing.

The patch bootstraps on x86_64-linux-pc with no regressions.
Ready for trunk?

Thanks,
Martin
gcc/ChangeLog:

2014-11-20  Martin Liska  mli...@suse.cz

* gimple-iterator.h (gsi_nondebug_stmt_count): New function.
* ipa-icf-gimple.c (func_checker::compare_bb): Number of BB
is recomputed because it can be split.

gcc/testsuite/ChangeLog:

2014-11-20  Martin Liska  mli...@suse.cz

* gcc.dg/ipa/pr63909.c: New test.
diff --git a/gcc/gimple-iterator.h b/gcc/gimple-iterator.h
index fb6cc07..f73b1f6 100644
--- a/gcc/gimple-iterator.h
+++ b/gcc/gimple-iterator.h
@@ -331,4 +331,18 @@ gsi_seq (gimple_stmt_iterator i)
   return *i.seq;
 }
 
+/* Return number of nondebug statements in basic block BB.  */
+
+static inline unsigned
+gsi_nondebug_stmt_count (basic_block bb)
+{
+  unsigned c = 0;
+  for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
+   gsi_next (&gsi))
+if (!is_gimple_debug (gsi_stmt (gsi)))
+  c++;
+
+  return c;
+}
+
 #endif /* GCC_GIMPLE_ITERATOR_H */
diff --git a/gcc/ipa-icf-gimple.c b/gcc/ipa-icf-gimple.c
index 8f2a438..83661ac 100644
--- a/gcc/ipa-icf-gimple.c
+++ b/gcc/ipa-icf-gimple.c
@@ -563,6 +563,9 @@ func_checker::compare_bb (sem_bb *bb1, sem_bb *bb2)
   gimple_stmt_iterator gsi1, gsi2;
   gimple s1, s2;
 
+  bb1->nondbg_stmt_count = gsi_nondebug_stmt_count (bb1->bb);
+  bb2->nondbg_stmt_count = gsi_nondebug_stmt_count (bb2->bb);
+
   if (bb1->nondbg_stmt_count != bb2->nondbg_stmt_count
   || bb1->edge_count != bb2->edge_count)
 return return_false ();
diff --git a/gcc/testsuite/gcc.dg/ipa/pr63909.c b/gcc/testsuite/gcc.dg/ipa/pr63909.c
new file mode 100644
index 000..8538e21
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/pr63909.c
@@ -0,0 +1,27 @@
+/* { dg-options "-O2 -fno-guess-branch-probability" } */
+
+int z;
+
+__attribute__((noinline))
+void g ()
+{
+  if (++z)
+__builtin_exit (0);
+  g ();
+}
+
+__attribute__((noinline))
+void f ()
+{
+  if (++z)
+__builtin_exit (0);
+  f ();
+}
+
+int main()
+{
+  f ();
+  g ();
+
+  return 0;
+}


[PATCH, i386] Add new arg values for __builtin_cpu_supports

2014-11-20 Thread Ilya Enkovich
Hi,

MPX runtime checks some feature bits in order to check MPX is fully supported.  
Runtime does it by cpuid calls but there is a __builtin_cpu_supports which may 
be used for that.  Unfortunately currently it doesn't support required bits.  
Will it be OK to add them for trunk?

Thanks,
Ilya
--
gcc/

2014-11-20  Ilya Enkovich  ilya.enkov...@intel.com

* config/i386/cpuid.h (bit_MPX): New.
(bit_BNDREGS): New.
(bit_BNDCSR): New.
* config/i386/i386.c (processor_features): Add
F_XSAVE, F_OSXSAVE, F_MPX, F_BNDREGS, F_BNDCSR.
(isa_names_table): Likewise.
* doc/extend.texi (__builtin_cpu_supports): Add
xsave, osxsave, mpx, bndregs, bndcsr.

libgcc/

2014-11-20  Ilya Enkovich  ilya.enkov...@intel.com

* config/i386/cpuinfo.c (processor_features): Add
FEATURE_XSAVE, FEATURE_OSXSAVE, FEATURE_MPX,
FEATURE_BNDREGS, FEATURE_BNDCSR.
(get_available_features): Likewise.


diff --git a/gcc/config/i386/cpuid.h b/gcc/config/i386/cpuid.h
index 133e356..f85cebb 100644
--- a/gcc/config/i386/cpuid.h
+++ b/gcc/config/i386/cpuid.h
@@ -72,6 +72,7 @@
 #define bit_AVX2       (1 << 5)
 #define bit_BMI2       (1 << 8)
 #define bit_RTM        (1 << 11)
+#define bit_MPX        (1 << 14)
 #define bit_AVX512F    (1 << 16)
 #define bit_AVX512DQ   (1 << 17)
 #define bit_RDSEED     (1 << 18)
@@ -87,6 +88,10 @@
 /* %ecx */
 #define bit_PREFETCHWT1  (1 << 0)
 
+/* XFEATURE_ENABLED_MASK register bits (%eax == 13, %ecx == 0) */
+#define bit_BNDREGS (1 << 3)
+#define bit_BNDCSR  (1 << 4)
+
 /* Extended State Enumeration Sub-leaf (%eax == 13, %ecx == 1) */
 #define bit_XSAVEOPT   (1 << 0)
 #define bit_XSAVEC     (1 << 1)
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 3166e03..bbf3ea3 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -35106,6 +35106,11 @@ fold_builtin_cpu (tree fndecl, tree *args)
 F_FMA4,
 F_XOP,
 F_FMA,
+F_XSAVE,
+F_OSXSAVE,
+F_MPX,
+F_BNDREGS,
+F_BNDCSR,
 F_MAX
   };
 
@@ -35194,7 +35199,12 @@ fold_builtin_cpu (tree fndecl, tree *args)
   {"fma4",   F_FMA4},
   {"xop",    F_XOP},
   {"fma",    F_FMA},
-  {"avx2",   F_AVX2}
+  {"avx2",   F_AVX2},
+  {"xsave",  F_XSAVE},
+  {"osxsave", F_OSXSAVE},
+  {"mpx",    F_MPX},
+  {"bndregs", F_BNDREGS},
+  {"bndcsr", F_BNDCSR}
 };
 
   tree __processor_model_type = build_processor_model_struct ();
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index d10a815..a06ed0c 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -11629,6 +11629,16 @@ SSE4.2 instructions.
 AVX instructions.
 @item avx2
 AVX2 instructions.
+@item xsave
+XFEATURE_ENABLED_MASK register and XSAVE, XRSTOR, XSETBV, XGETBV instructions
+@item osxsave
+OS has enabled support for using XGETBV and XSETBV instructions
+@item mpx
+MPX instructions.
+@item bndregs
+Indicates bound register component of MPX state
+@item bndcsr
+Indicates bounds configuration and status component of MPX state
 @end table
 
 Here is an example:
diff --git a/libgcc/config/i386/cpuinfo.c b/libgcc/config/i386/cpuinfo.c
index 6ff7502..9e060b0 100644
--- a/libgcc/config/i386/cpuinfo.c
+++ b/libgcc/config/i386/cpuinfo.c
@@ -96,7 +96,12 @@ enum processor_features
   FEATURE_SSE4_A,
   FEATURE_FMA4,
   FEATURE_XOP,
-  FEATURE_FMA
+  FEATURE_FMA,
+  FEATURE_XSAVE,
+  FEATURE_OSXSAVE,
+  FEATURE_MPX,
+  FEATURE_BNDREGS,
+  FEATURE_BNDCSR
 };
 
 struct __processor_model
@@ -270,6 +275,10 @@ get_available_features (unsigned int ecx, unsigned int edx,
 features |= (1 << FEATURE_AVX);
   if (ecx & bit_FMA)
 features |= (1 << FEATURE_FMA);
+  if (ecx & bit_XSAVE)
+features |= (1 << FEATURE_XSAVE);
+  if (ecx & bit_OSXSAVE)
+features |= (1 << FEATURE_OSXSAVE);
 
   /* Get Advanced Features at level 7 (eax = 7, ecx = 0). */
   if (max_cpuid_level >= 7)
@@ -278,6 +287,19 @@ get_available_features (unsigned int ecx, unsigned int edx,
   __cpuid_count (7, 0, eax, ebx, ecx, edx);
   if (ebx & bit_AVX2)
features |= (1 << FEATURE_AVX2);
+  if (ebx & bit_MPX)
+   features |= (1 << FEATURE_MPX);
+}
+
+  /* Get Advanced Features at level 13 (eax = 13, ecx = 0). */
+  if (max_cpuid_level >= 13)
+{
+  unsigned int eax, ebx, ecx, edx;
+  __cpuid_count (13, 0, eax, ebx, ecx, edx);
+  if (eax & bit_BNDREGS)
+   features |= (1 << FEATURE_BNDREGS);
+  if (eax & bit_BNDCSR)
+   features |= (1 << FEATURE_BNDCSR);
 }
 
   unsigned int ext_level;


Re: [PATCH] PR ipa/63909 ICE: SIGSEGV in ipa_icf_gimple::func_checker::compare_bb()

2014-11-20 Thread Richard Biener
On Thu, Nov 20, 2014 at 5:30 PM, Martin Liška mli...@suse.cz wrote:
 Hello.

 Following patch fixes ICE in IPA ICF. Problem was that number of non-debug
 statements in a BB can
 change (for instance by IPA split), so that the number is recomputed.

Huh, so can it get different for both candidates?  I think the stmt compare
loop should be terminated on gsi_end_p of either iterator and return
false for any remaining non-debug-stmts on the other.

Thus, not walk all stmts twice here.

As IPA split is run early I don't see how it should affect a real IPA
pass though?
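
The single-walk comparison suggested above could look roughly like the following. This is a sketch over a made-up statement record, not the func_checker code: struct stmt, skip_debug and blocks_equal are hypothetical names, and GCC's real iterators are gimple_stmt_iterator objects rather than array indices.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical statement record standing in for a gimple statement.  */
struct stmt
{
  int code;
  bool is_debug;
};

/* Advance *I past debug statements in S[0..N).  */
static void
skip_debug (const struct stmt *s, size_t n, size_t *i)
{
  while (*i < n && s[*i].is_debug)
    (*i)++;
}

/* Compare the non-debug statements of two blocks in one pass, without
   counting them first.  */
bool
blocks_equal (const struct stmt *a, size_t na,
              const struct stmt *b, size_t nb)
{
  size_t i = 0, j = 0;

  for (;;)
    {
      skip_debug (a, na, &i);
      skip_debug (b, nb, &j);

      /* Stop as soon as either side is exhausted.  */
      if (i == na || j == nb)
        break;

      if (a[i].code != b[j].code)
        return false;
      i++;
      j++;
    }

  /* Debug statements were skipped, so anything left on one side is a
     non-debug statement and the blocks differ.  */
  return i == na && j == nb;
}
```

Each block is walked once, and a stale statement count can no longer make the loop run past the end of the shorter block.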

Thanks,
Richard.

 Patch can bootstrap on x86_64-linux-pc and no regression has been seen.
 Ready for trunk?

 Thanks,
 Martin


Re: [PATCH, i386] Add new arg values for __builtin_cpu_supports

2014-11-20 Thread Jakub Jelinek
On Thu, Nov 20, 2014 at 07:36:03PM +0300, Ilya Enkovich wrote:
 Hi,
 
 MPX runtime checks some feature bits in order to check MPX is fully
 supported.  Runtime does it by cpuid calls but there is a
 __builtin_cpu_supports which may be used for that.  Unfortunately
 currently it doesn't support required bits.  Will it be OK to add them for
 trunk?

I think using cpuid for that is just fine.  __builtin_cpu_supports
is for ISA additions users might actually want to version code for,
MPX stuff, as the instructions are nops without hw support, are not
something one would multi-version a function for.
If anything, AVX512F and AVX512BW+VL might be good candidates for that, not
MPX.

Jakub
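
As an illustration of the run-time versioning use case mentioned above, here is a minimal sketch using __builtin_cpu_supports with "avx2", a feature name already listed in extend.texi. The function name cpu_has_avx2 is made up, and the preprocessor guard is an assumption added so the file still compiles on non-x86 targets, where the builtin is not available.

```c
#include <stdbool.h>

/* Returns true when the running CPU reports AVX2.  On targets without
   the builtin this sketch simply reports false.  */
bool
cpu_has_avx2 (void)
{
#if (defined (__x86_64__) || defined (__i386__)) && defined (__GNUC__)
  /* Make sure the CPU model data is initialised before querying it.  */
  __builtin_cpu_init ();
  return __builtin_cpu_supports ("avx2") != 0;
#else
  return false;
#endif
}
```

A caller would test cpu_has_avx2 () once and pick an AVX2-specific routine or a generic fallback accordingly.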


[PATCH][ARM] Make issue rate part of per-core tuning structs

2014-11-20 Thread Kyrill Tkachov

Hi all,

This patch makes the arm_issue_rate function look up the issue rate of 
the processor in the tuning structs.
This makes it look more like the aarch64 mechanism and centralises a 
processor-specific construct in the tuning structs, so we no longer have 
to remember to update the arm_issue_rate function every time a new core 
is added.

A new tuning struct is added for the marvell-pj4 in order to decouple it 
from the 9e tuning struct and enable us to set its correct issue rate to 2.
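
The shape of the change can be sketched in a few lines of standalone C. The names below are hypothetical cut-down versions (the real struct tune_params has many more fields); only the idea of storing the rate per core and reading it through the current tuning pointer is taken from the description above.

```c
/* Hypothetical cut-down tuning record.  */
struct tune_params_sketch
{
  const char *name;
  unsigned int issue_rate;
};

static const struct tune_params_sketch slowmul_tune_sketch
  = { "slowmul", 1 };
static const struct tune_params_sketch marvell_pj4_tune_sketch
  = { "marvell-pj4", 2 };

/* Points at the tuning record of the selected core.  */
const struct tune_params_sketch *current_tune_sketch = &slowmul_tune_sketch;

/* With the rate stored per core, the hook needs no switch over cores.  */
unsigned int
issue_rate_sketch (void)
{
  return current_tune_sketch->issue_rate;
}
```

Adding a new core then only means filling in its tuning record; the lookup function stays untouched.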

Bootstrapped and tested on arm-none-gnueabihf.

Ok for trunk?

Thanks,
Kyrill

2014-11-19  Kyrylo Tkachov  kyrylo.tkac...@arm.com

* config/arm/arm-protos.h (struct tune_params): Add issue_rate field.
* config/arm/arm.c (arm_slowmul_tune, arm_fastmul_tune,
arm_strongarm_tune, arm_xscale_tune, arm_9e_tune, arm_v6t2_tune,
arm_cortex_tune, arm_cortex_a8_tune, arm_cortex_a7_tune,
arm_cortex_a15_tune, arm_cortex_a53_tune, arm_cortex_a57_tune,
arm_cortex_a9_tune, arm_cortex_a12_tune, arm_v7m_tune, arm_v6m_tune,
arm_fa726te_tune, arm_cortex_a5_tune): Specify issue_rate value.
(arm_issue_rate): Look up issue rate from tuning structs. Remove
large switch statement.
(arm_marvell_pj4_tune): New struct.
* config/arm/arm-cores.def (marvell-pj4): Use arm_marvell_pj4_tune
struct.

commit a2466d31869cd7edd0a9de14d96427d361d97dd7
Author: Kyrylo Tkachov kyrylo.tkac...@arm.com
Date:   Wed Nov 19 16:24:03 2014 +

[ARM] refactor issue_rate

diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def
index 637be15..12625c7 100644
--- a/gcc/config/arm/arm-cores.def
+++ b/gcc/config/arm/arm-cores.def
@@ -158,7 +158,7 @@ ARM_CORE(cortex-r7,		cortexr7, cortexr7,		7R,  FL_LDSCHED | FL_ARM_DIV, cortex
 ARM_CORE(cortex-m7,		cortexm7, cortexm7,		7EM, FL_LDSCHED, cortex_m7)
 ARM_CORE(cortex-m4,		cortexm4, cortexm4,		7EM, FL_LDSCHED, v7m)
 ARM_CORE(cortex-m3,		cortexm3, cortexm3,		7M,  FL_LDSCHED, v7m)
-ARM_CORE(marvell-pj4,		marvell_pj4, marvell_pj4,	7A,  FL_LDSCHED, 9e)
+ARM_CORE(marvell-pj4,		marvell_pj4, marvell_pj4,	7A,  FL_LDSCHED, marvell_pj4)
 
 /* V7 big.LITTLE implementations */
 ARM_CORE(cortex-a15.cortex-a7, cortexa15cortexa7, cortexa7,	7A,  FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, cortex_a15)
diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 71ce362..7d5bfd3 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -291,6 +291,8 @@ struct tune_params
   int max_insns_inline_memset;
   /* Bitfield encoding the fuseable pairs of instructions.  */
   unsigned int fuseable_ops : 1;
+  /* Issue rate of the processor.  */
+  unsigned int issue_rate;
 };
 
 extern const struct tune_params *current_tune;
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 9aa402f..94db2b2 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -1671,7 +1671,8 @@ const struct tune_params arm_slowmul_tune =
   false, false, /* Prefer 32-bit encodings.  */
   false,	/* Prefer Neon for stringops.  */
   8,		/* Maximum insns to inline memset.  */
-  ARM_FUSE_NOTHING/* Fuseable pairs of instructions.  */
+  ARM_FUSE_NOTHING,/* Fuseable pairs of instructions.  */
+  1		/* Issue rate.  */
 };
 
 const struct tune_params arm_fastmul_tune =
@@ -1691,7 +1692,8 @@ const struct tune_params arm_fastmul_tune =
   false, false, /* Prefer 32-bit encodings.  */
   false,	/* Prefer Neon for stringops.  */
   8,		/* Maximum insns to inline memset.  */
-  ARM_FUSE_NOTHING/* Fuseable pairs of instructions.  */
+  ARM_FUSE_NOTHING,/* Fuseable pairs of instructions.  */
+  1		/* Issue rate.  */
 };
 
 /* StrongARM has early execution of branches, so a sequence that is worth
@@ -1714,7 +1716,8 @@ const struct tune_params arm_strongarm_tune =
   false, false, /* Prefer 32-bit encodings.  */
   false,	/* Prefer Neon for stringops.  */
   8,		/* Maximum insns to inline memset.  */
-  ARM_FUSE_NOTHING/* Fuseable pairs of instructions.  */
+  ARM_FUSE_NOTHING,/* Fuseable pairs of instructions.  */
+  1		/* Issue rate.  */
 };
 
 const struct tune_params arm_xscale_tune =
@@ -1734,7 +1737,8 @@ const struct tune_params arm_xscale_tune =
   false, false, /* Prefer 32-bit encodings.  */
   false,	/* Prefer Neon for stringops.  */
   8,		/* Maximum insns to inline memset.  */
-  ARM_FUSE_NOTHING/* Fuseable pairs of instructions.  */
+  ARM_FUSE_NOTHING,/* Fuseable pairs of instructions.  */
+  1		/* Issue rate.  */
 };
 
 const struct tune_params arm_9e_tune =
@@ -1754,7 +1758,29 @@ const struct tune_params arm_9e_tune =
   false, false, /* Prefer 32-bit encodings.  */
   false,	/* Prefer Neon for stringops.  */
   8,		/* Maximum insns to inline memset.  */
-  ARM_FUSE_NOTHING/* Fuseable pairs of instructions.  */
+  

Re: [PATCH 1/2] teach mklog to get name / email from git config when available

2014-11-20 Thread Segher Boessenkool
On Thu, Nov 20, 2014 at 05:22:20PM +0100, Tom de Vries wrote:
 +my $conf = "$ENV{HOME}/.mklog";
 +if (-f $conf) {
 +open (CONF, $conf)
 + or die "Could not open file '$conf' for reading: $!\n";
 +while (<CONF>) {
 + if (m/^\s*NAME\s*=\s*(.*)\s*$/) {

The final \s* never matches anything since the .* gobbles up everything.
Use .*? if you really want to get rid of the trailing whitespace.
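
To make the point concrete, here is a minimal sketch, written with C++ std::regex (ECMAScript grammar) rather than Perl, of how the greedy (.*) keeps the trailing blanks while the non-greedy (.*?) leaves them to the final \s*. The helper capture_name and the sample input line are made up for illustration and are not part of the patch.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: return capture group 1 of PATTERN matched against
// the whole of LINE, or an empty string when LINE does not match.
std::string capture_name(const std::string &line, const std::string &pattern)
{
  const std::regex re(pattern);  // ECMAScript grammar by default.
  std::smatch m;
  if (std::regex_match(line, m, re))
    return m[1].str();
  return std::string();
}

// Greedy: (.*) runs to the end of the line, so the final \s* matches nothing.
const char *const greedy_pattern = R"(^\s*NAME\s*=\s*(.*)\s*$)";
// Non-greedy: (.*?) takes as little as possible and leaves the trailing
// blanks to the final \s*.
const char *const lazy_pattern = R"(^\s*NAME\s*=\s*(.*?)\s*$)";

// Small self-check on a made-up input line with three trailing spaces.
void check_capture_name()
{
  const std::string line = "NAME = John Doe   ";
  assert(capture_name(line, greedy_pattern) == "John Doe   ");
  assert(capture_name(line, lazy_pattern) == "John Doe");
}
```

With the greedy pattern the captured name still carries its three trailing spaces; with the non-greedy one it is just "John Doe".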


Segher


Re: [AArch64, Patch] Add range-check for Symbol + offset addressing.

2014-11-20 Thread Marcus Shawcroft
On 20 November 2014 14:33, Tejas Belagod tejas.bela...@arm.com wrote:

 The same patch applies cleanly to 4.9. OK to commit?

 Thanks,
 Tejas.

Provided it regresses ok, yes.
/Marcus


Re: [PATCH][ARM] Make issue rate part of per-core tuning structs

2014-11-20 Thread Kyrill Tkachov
I should say that the patch context depends on the macro fusion hook 
implementation posted here:

https://gcc.gnu.org/ml/gcc-patches/2014-11/msg00958.html

Kyrill

On 20/11/14 16:43, Kyrill Tkachov wrote:

Hi all,

This patch makes the arm_issue_rate function look up the issue rate of
the processor from the tuning structs.
This makes it look more like the aarch64 mechanism and centralises a
processor-specific construct to the
tuning structs, thus not forcing us to remember to update the
arm_issue_rate function every time a new core
is added.

A new tuning struct is added for the marvell-pj4 in order to decouple it
from the 9e tuning struct and
enable us to set its correct issue rate to 2.
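
As a rough illustration of the idea (not the actual arm.c code), the sketch below stores the issue rate in a per-core table and reads it back through a pointer to the active table, instead of switching on the core. All names ending in _sketch are hypothetical stand-ins; only the values 1 and 2 come from the patch and the description above.

```cpp
#include <cassert>

// Hypothetical, heavily reduced stand-in for struct tune_params; only the
// field discussed here is modelled.
struct tune_params_sketch
{
  unsigned int issue_rate;  // Issue rate of the processor.
};

// Illustrative per-core tables.
const tune_params_sketch slowmul_tune_sketch = { 1 };
const tune_params_sketch marvell_pj4_tune_sketch = { 2 };

// Stands in for current_tune: the table selected for the core in use.
const tune_params_sketch *current_tune_sketch = &slowmul_tune_sketch;

// With the rate stored per core, the hook is a plain field read, so adding
// a core only means adding a table, not editing a switch statement.
int issue_rate_sketch()
{
  return static_cast<int>(current_tune_sketch->issue_rate);
}

// Small self-check: switching the active table changes the reported rate.
void check_issue_rate_sketch()
{
  current_tune_sketch = &slowmul_tune_sketch;
  assert(issue_rate_sketch() == 1);
  current_tune_sketch = &marvell_pj4_tune_sketch;
  assert(issue_rate_sketch() == 2);
}
```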

Bootstrapped and tested on arm-none-gnueabihf.

Ok for trunk?

Thanks,
Kyrill

2014-11-19  Kyrylo Tkachov  kyrylo.tkac...@arm.com

  * config/arm/arm-protos.h (struct tune_params): Add issue_rate field.
  * config/arm/arm.c (arm_slowmul_tune, arm_fastmul_tune,
  arm_strongarm_tune, arm_xscale_tune, arm_9e_tune, arm_v6t2_tune,
  arm_cortex_tune, arm_cortex_a8_tune, arm_cortex_a7_tune,
  arm_cortex_a15_tune, arm_cortex_a53_tune, arm_cortex_a57_tune,
  arm_cortex_a9_tune, arm_cortex_a12_tune, arm_v7m_tune, arm_v6m_tune,
  arm_fa726te_tune, arm_cortex_a5_tune): Specify issue_rate value.
  (arm_issue_rate): Look up issue rate from tuning structs. Remove
  large switch statement.
  (arm_marvell_pj4_tune): New struct.
  * config/arm/arm-cores.def (marvell-pj4): Use arm_marvell_pj4_tune
  struct.





[PATCH] Fix ubsan and C++14 constexpr ICEs (PR sanitizer/63956)

2014-11-20 Thread Marek Polacek
This patch fixes a bunch of ICEs related to C++14 constexprs and
-fsanitize=undefined.  We should ignore ubsan internal functions
and ubsan builtins in constexpr functions in cxx_eval_call_expression.

Also add proper printing of internal functions into the C++ printer.

Bootstrapped/regtested on ppc64-linux, ok for trunk?

2014-11-20  Marek Polacek  pola...@redhat.com

PR sanitizer/63956
* constexpr.c: Include ubsan.h.
(cxx_eval_call_expression): Bail out for IFN_UBSAN_{NULL,BOUNDS}
internal functions and for ubsan builtins in constexpr functions.
* error.c: Include internal-fn.h.
(dump_expr): Add printing of internal functions.

* g++.dg/ubsan/pr63956.C: New test.

diff --git gcc/cp/constexpr.c gcc/cp/constexpr.c
index 2678223..684e36f 100644
--- gcc/cp/constexpr.c
+++ gcc/cp/constexpr.c
@@ -32,6 +32,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimplify.h"
 #include "builtins.h"
 #include "tree-inline.h"
+#include "ubsan.h"
 
 static bool verify_constant (tree, bool, bool *, bool *);
 #define VERIFY_CONSTANT(X) \
@@ -1151,6 +1152,16 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree 
t,
   constexpr_call *entry;
   bool depth_ok;
 
+  if (fun == NULL_TREE)
+switch (CALL_EXPR_IFN (t))
+  {
+  case IFN_UBSAN_NULL:
+  case IFN_UBSAN_BOUNDS:
+   return void_node;
+  default:
+   break;
+  }
+
   if (TREE_CODE (fun) != FUNCTION_DECL)
 {
   /* Might be a constexpr function pointer.  */
@@ -1171,6 +1182,10 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree 
t,
 }
   if (DECL_CLONED_FUNCTION_P (fun))
 fun = DECL_CLONED_FUNCTION (fun);
+
+  if (!current_function_decl && is_ubsan_builtin_p (fun))
+return void_node;
+
   if (is_builtin_fn (fun))
 return cxx_eval_builtin_function_call (ctx, t,
   addr, non_constant_p, overflow_p);
diff --git gcc/cp/error.c gcc/cp/error.c
index 76f86cb..09789ad 100644
--- gcc/cp/error.c
+++ gcc/cp/error.c
@@ -33,6 +33,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-pretty-print.h"
 #include "c-family/c-objc.h"
 #include "ubsan.h"
+#include "internal-fn.h"
 
 #include <new>                    // For placement-new.
 
@@ -2037,6 +2038,14 @@ dump_expr (cxx_pretty_printer *pp, tree t, int flags)
tree fn = CALL_EXPR_FN (t);
bool skipfirst = false;
 
+   /* Deal with internal functions.  */
+   if (fn == NULL_TREE)
+ {
+   pp_string (pp, internal_fn_name (CALL_EXPR_IFN (t)));
+   dump_call_expr_args (pp, t, flags, skipfirst);
+   break;
+ }
+
if (TREE_CODE (fn) == ADDR_EXPR)
  fn = TREE_OPERAND (fn, 0);
 
diff --git gcc/testsuite/g++.dg/ubsan/pr63956.C 
gcc/testsuite/g++.dg/ubsan/pr63956.C
index e69de29..7bc0b77 100644
--- gcc/testsuite/g++.dg/ubsan/pr63956.C
+++ gcc/testsuite/g++.dg/ubsan/pr63956.C
@@ -0,0 +1,172 @@
+// PR sanitizer/63956
+// { dg-do compile }
+// { dg-options "-std=c++14 -fsanitize=undefined,float-divide-by-zero,float-cast-overflow" }
+
+#define SA(X) static_assert((X),#X)
+#define INT_MIN (-__INT_MAX__ - 1)
+
+constexpr int
+fn1 (int a, int b)
+{
+  if (b != 2)
+a <<= b;
+  return a;
+}
+
+constexpr int i1 = fn1 (5, 3);
+constexpr int i2 = fn1 (5, -2);
+constexpr int i3 = fn1 (5, sizeof (int) * __CHAR_BIT__);
+constexpr int i4 = fn1 (5, 256);
+constexpr int i5 = fn1 (5, 2);
+constexpr int i6 = fn1 (-2, 4);
+constexpr int i7 = fn1 (0, 2);
+
+SA (i1 == 40);
+SA (i5 == 5);
+SA (i7 == 0);
+
+constexpr int
+fn2 (int a, int b)
+{
+  if (b != 2)
+a >>= b;
+  return a;
+}
+
+constexpr int j1 = fn2 (4, 1);
+constexpr int j2 = fn2 (4, -1);
+constexpr int j3 = fn2 (10, sizeof (int) * __CHAR_BIT__);
+constexpr int j4 = fn2 (1, 256);
+constexpr int j5 = fn2 (5, 2);
+constexpr int j6 = fn2 (-2, 4);
+constexpr int j7 = fn2 (0, 4);
+
+SA (j1 == 2);
+SA (j5 == 5);
+SA (j7 == 0);
+
+constexpr int
+fn3 (int a, int b)
+{
+  if (b != 2)
+a = a / b;
+  return a;
+}
+
+constexpr int k1 = fn3 (8, 4);
+constexpr int k2 = fn3 (7, 0); // { dg-error "is not a constant expression|constexpr call flows off" }
+constexpr int k3 = fn3 (INT_MIN, -1); // { dg-error "overflow in constant expression|constexpr call flows off" }
+
+SA (k1 == 2);
+
+constexpr float
+fn4 (float a, float b)
+{
+  if (b != 2.0)
+a = a / b;
+  return a;
+}
+
+constexpr float l1 = fn4 (5.0, 3.0);
+constexpr float l2 = fn4 (7.0, 0.0); // { dg-error "is not a constant expression|constexpr call flows off" }
+
+constexpr int
+fn5 (const int *a, int b)
+{
+  if (b != 2)
+b = a[b];
+  return b;
+}
+
+constexpr int m1[4] = { 1, 2, 3, 4 };
+constexpr int m2 = fn5 (m1, 3);
+constexpr int m3 = fn5 (m1, 4); // { dg-error "array subscript out of bound|constexpr call flows off" }
+
+constexpr int
+fn6 (const int a, int b)
+{
+  if (b != 2)
+b = a;
+  return b;
+}
+
+constexpr int
+fn7 (const int *a, int b)
+{
+  if 

Re: [PATCH] Fix ubsan and C++14 constexpr ICEs (PR sanitizer/63956)

2014-11-20 Thread Jakub Jelinek
On Thu, Nov 20, 2014 at 06:14:52PM +0100, Marek Polacek wrote:
 This patch fixes a bunch of ICEs related to C++14 constexprs and
 -fsanitize=undefined.  We should ignore ubsan internal functions
 and ubsan builtins in constexpr functions in cxx_eval_call_expression.
 
 Also add proper printing of internal functions into the C++ printer.
 
 Bootstrapped/regtested on ppc64-linux, ok for trunk?

I'd like Jason to review this.  But a few nits:

 @@ -1171,6 +1182,10 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, 
 tree t,
  }
if (DECL_CLONED_FUNCTION_P (fun))
  fun = DECL_CLONED_FUNCTION (fun);
 +
 +  if (!current_function_decl && is_ubsan_builtin_p (fun))
 +return void_node;
 +

I don't understand the !current_function_decl here.

Also, looking at is_ubsan_builtin_p definition, I'd say
it should IMHO at least test DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_NORMAL
before comparing the function name, you can declare
__builtin_ubsan_foobarbaz () and use it and it won't be a builtin.
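
A minimal sketch of the distinction, using made-up types instead of GCC's tree nodes: a name-only test accepts any function that happens to carry the prefix, whereas checking the builtin class first rejects a user declaration such as __builtin_ubsan_foobarbaz. The prefix string, the struct, and the function names below are all assumptions for illustration; the real check would use DECL_BUILT_IN_CLASS and the actual ubsan builtin names.

```cpp
#include <cassert>
#include <string>

// Hypothetical miniature of a function declaration.
enum class BuiltinClassSketch { NotBuiltIn, Normal };

struct FnDeclSketch
{
  std::string name;
  BuiltinClassSketch built_in_class;
};

// True when the declared name starts with the (assumed) ubsan prefix.
static bool has_ubsan_prefix(const FnDeclSketch &fn)
{
  const std::string prefix = "__builtin_ubsan_";
  return fn.name.compare(0, prefix.size(), prefix) == 0;
}

// Name-only test: also accepts a user function that merely has the prefix.
bool is_ubsan_builtin_name_only(const FnDeclSketch &fn)
{
  return has_ubsan_prefix(fn);
}

// Test the builtin class first, then the name, as suggested above.
bool is_ubsan_builtin_checked(const FnDeclSketch &fn)
{
  return fn.built_in_class == BuiltinClassSketch::Normal
         && has_ubsan_prefix(fn);
}

// Small self-check with a user-declared look-alike.
void check_ubsan_builtin_sketch()
{
  const FnDeclSketch user = { "__builtin_ubsan_foobarbaz",
                              BuiltinClassSketch::NotBuiltIn };
  assert(is_ubsan_builtin_name_only(user));   // wrongly accepted
  assert(!is_ubsan_builtin_checked(user));    // correctly rejected
}
```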

As for the testcase, I'd like to understand if C++ FE should reject
the constexpr functions when used with arguments that trigger undefined
behavior.  But certainly the behavior should not depend on whether
-fsanitize=undefined is used or not.
Also, what is the reason for the "constexpr call flows off the end" errors?
Shouldn't that be avoided if any error is found while interpreting the
function?

Jakub


[gomp4] fix a fortran bootstrap failure

2014-11-20 Thread Cesar Philippidis
This patch resolves a bootstrap failure in gomp-4_0-branch, which I
probably introduced after I switched the cache error message from a
gfc_error to a sorry. The code parameter isn't being used anymore by
resolve_oacc_cache, so I've explicitly marked it as unused.
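
For readers unfamiliar with the idiom, here is a small sketch of marking a parameter as unused so that an unused-parameter warning (presumably what broke a -Werror bootstrap) is not raised. The macro ATTRIBUTE_UNUSED_SKETCH, the struct, and the function are hypothetical stand-ins; the sketch assumes a GCC-compatible compiler for the attribute and falls back to nothing elsewhere.

```cpp
#include <cassert>

// Assumption: on GCC-compatible compilers the "unused" attribute silences
// the warning; the macro is named *_SKETCH to avoid clashing with GCC's own.
#if defined(__GNUC__)
# define ATTRIBUTE_UNUSED_SKETCH __attribute__ ((__unused__))
#else
# define ATTRIBUTE_UNUSED_SKETCH
#endif

// Hypothetical stand-in for gfc_code.
struct gfc_code_sketch
{
  int op;
};

// The body no longer looks at CODE, so the parameter is marked unused.
// Returning 0 stands in for reporting "unimplemented".
int resolve_cache_sketch (gfc_code_sketch *code ATTRIBUTE_UNUSED_SKETCH)
{
  return 0;
}

// Small self-check: the result does not depend on the argument.
void check_resolve_cache_sketch ()
{
  gfc_code_sketch c = { 1 };
  assert (resolve_cache_sketch (&c) == 0);
  assert (resolve_cache_sketch (nullptr) == 0);
}
```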

I've applied this patch to gomp-4_0-branch.

Cesar
2014-11-20  Cesar Philippidis  ce...@codesourcery.com

	gcc/fortran/
	* openmp.c (resolve_oacc_cache): Mark the code parameter
	as unused.


Index: gcc/fortran/openmp.c
===
--- gcc/fortran/openmp.c	(revision 442301)
+++ gcc/fortran/openmp.c	(working copy)
@@ -4600,7 +4600,7 @@ resolve_oacc_loop (gfc_code *code)
 
 
 static void
-resolve_oacc_cache (gfc_code *code)
+resolve_oacc_cache (gfc_code *code ATTRIBUTE_UNUSED)
 {
   sorry ("Sorry, !$ACC cache unimplemented yet");
 }


Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)

2014-11-20 Thread Jakub Jelinek
On Wed, Nov 19, 2014 at 02:23:47PM -0800, Mike Stump wrote:
 On Nov 19, 2014, at 1:57 PM, Jakub Jelinek ja...@redhat.com wrote:
  Though, following patch is just fine for me too, I don't think it will
  make a significant difference:
 
 This version is fine by me.

Richard, are you ok with that too?

Bootstrapped/regtested on x86_64-linux and i686-linux now.

2014-11-20  Jakub Jelinek  ja...@redhat.com

PR target/63910
* simplify-rtx.c (simplify_immed_subreg): Return NULL for integer
modes wider than MAX_BITSIZE_MODE_ANY_INT.  If not using
CONST_WIDE_INT, make sure r fits into CONST_DOUBLE.

* gcc.target/i386/pr63910.c: New test.

--- gcc/simplify-rtx.c.jj   2014-11-19 09:17:15.491327992 +0100
+++ gcc/simplify-rtx.c  2014-11-19 12:28:16.223808178 +0100
@@ -5504,6 +5504,8 @@ simplify_immed_subreg (machine_mode oute
HOST_WIDE_INT tmp[MAX_BITSIZE_MODE_ANY_INT / 
HOST_BITS_PER_WIDE_INT];
wide_int r;
 
+   if (GET_MODE_PRECISION (outer_submode) > MAX_BITSIZE_MODE_ANY_INT)
+ return NULL_RTX;
for (u = 0; u < units; u++)
  {
unsigned HOST_WIDE_INT buf = 0;
@@ -5515,10 +5517,13 @@ simplify_immed_subreg (machine_mode oute
tmp[u] = buf;
base += HOST_BITS_PER_WIDE_INT;
  }
-   gcc_assert (GET_MODE_PRECISION (outer_submode)
-   <= MAX_BITSIZE_MODE_ANY_INT);
r = wide_int::from_array (tmp, units,
  GET_MODE_PRECISION (outer_submode));
+#if TARGET_SUPPORTS_WIDE_INT == 0
+   /* Make sure r will fit into CONST_INT or CONST_DOUBLE.  */
+   if (wi::min_precision (r, SIGNED) > HOST_BITS_PER_DOUBLE_INT)
+ return NULL_RTX;
+#endif
elems[elem] = immed_wide_int_const (r, outer_submode);
  }
  break;
--- gcc/testsuite/gcc.target/i386/pr63910.c.jj  2014-11-19 12:04:23.490489130 
+0100
+++ gcc/testsuite/gcc.target/i386/pr63910.c 2014-11-19 12:04:23.490489130 
+0100
@@ -0,0 +1,12 @@
+/* PR target/63910 */
+/* { dg-do compile } */
+/* { dg-options "-O -mstringop-strategy=vector_loop -mavx512f" } */
+
+extern void bar (float *c);
+
+void
+foo (void)
+{
+  float c[1024] = { };
+  bar (c);
+}


Jakub


[PATCH] Fix ICE with non-lvalue vector subscripts and make sure non-lvalue vector subscripts aren't used as lvalues (PR target/63764)

2014-11-20 Thread Jakub Jelinek
Hi!

This patch fixes ICEs if a non-lvalue vector (say cast of one vector
to another vector type) was subscripted and used as lhs.
The following patch, if *vecp is not lvalue, will copy it to a temporary
variable which can be made addressable for the subscription, and afterwards
wrap it into a NON_LVALUE_EXPR so that it is properly rejected if later used
on the lhs.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2014-11-20  Jakub Jelinek  ja...@redhat.com

PR target/63764
c-family/
* c-common.h (convert_vector_to_pointer_for_subscript): Change
return type to bool.
* c-common.c: Include gimple-expr.h.
(convert_vector_to_pointer_for_subscript): Change return type to
bool.  If *vecp is not lvalue_p and has VECTOR_TYPE, return true
and copy it into a TARGET_EXPR and use that instead of *vecp
directly.
c/
* c-typeck.c (build_array_ref): Adjust
convert_vector_to_pointer_for_subscript caller.  If it returns true,
call non_lvalue_loc on the result.
cp/
* typeck.c (cp_build_array_ref): Adjust
convert_vector_to_pointer_for_subscript caller.  If it returns true,
call non_lvalue_loc on the result.
testsuite/
* c-c++-common/pr63764-1.c: New test.
* c-c++-common/pr63764-2.c: New test.

--- gcc/c-family/c-common.h.jj  2014-11-19 15:39:26.606065628 +0100
+++ gcc/c-family/c-common.h 2014-11-20 08:38:02.527655971 +0100
@@ -1310,7 +1310,7 @@ extern tree build_userdef_literal (tree
   enum overflow_type overflow,
   tree num_string);
 
-extern void convert_vector_to_pointer_for_subscript (location_t, tree*, tree);
+extern bool convert_vector_to_pointer_for_subscript (location_t, tree *, tree);
 
 /* Possibe cases of scalar_to_vector conversion.  */
 enum stv_conv {
--- gcc/c-family/c-common.c.jj  2014-11-19 15:39:26.606065628 +0100
+++ gcc/c-family/c-common.c 2014-11-20 08:50:21.000573676 +0100
@@ -60,6 +60,7 @@ along with GCC; see the file COPYING3.
 #include "target-def.h"
 #include "gimplify.h"
 #include "wide-int-print.h"
+#include "gimple-expr.h"
 
 cpp_reader *parse_in;  /* Declared in c-pragma.h.  */
 
@@ -12030,22 +12031,47 @@ build_userdef_literal (tree suffix_id, t
 }
 
 /* For vector[index], convert the vector to a
-   pointer of the underlying type.  */
-void
+   pointer of the underlying type.  Return true if the resulting
+   ARRAY_REF should not be an lvalue.  */
+
+bool
 convert_vector_to_pointer_for_subscript (location_t loc,
-tree* vecp, tree index)
+tree *vecp, tree index)
 {
+  bool ret = false;
   if (TREE_CODE (TREE_TYPE (*vecp)) == VECTOR_TYPE)
 {
   tree type = TREE_TYPE (*vecp);
   tree type1;
 
+  ret = !lvalue_p (*vecp);
   if (TREE_CODE (index) == INTEGER_CST)
 if (!tree_fits_uhwi_p (index)
 || tree_to_uhwi (index) >= TYPE_VECTOR_SUBPARTS (type))
   warning_at (loc, OPT_Warray_bounds, "index value is out of bound");
 
-  c_common_mark_addressable_vec (*vecp);
+  if (ret)
+   {
+ tree tmp = create_tmp_var_raw (type, NULL);
+ DECL_SOURCE_LOCATION (tmp) = loc;
+ *vecp = c_save_expr (*vecp);
+ if (TREE_CODE (*vecp) == C_MAYBE_CONST_EXPR)
+   {
+ bool non_const = C_MAYBE_CONST_EXPR_NON_CONST (*vecp);
+ *vecp = C_MAYBE_CONST_EXPR_EXPR (*vecp);
+ *vecp
+   = c_wrap_maybe_const (build4 (TARGET_EXPR, type, tmp,
+ *vecp, NULL_TREE, NULL_TREE),
+ non_const);
+   }
+ else
+   *vecp = build4 (TARGET_EXPR, type, tmp, *vecp,
+   NULL_TREE, NULL_TREE);
+ SET_EXPR_LOCATION (*vecp, loc);
+ c_common_mark_addressable_vec (tmp);
+   }
+  else
+   c_common_mark_addressable_vec (*vecp);
   type = build_qualified_type (TREE_TYPE (type), TYPE_QUALS (type));
   type1 = build_pointer_type (TREE_TYPE (*vecp));
   bool ref_all = TYPE_REF_CAN_ALIAS_ALL (type1);
@@ -12065,6 +12091,7 @@ convert_vector_to_pointer_for_subscript
   *vecp = build1 (ADDR_EXPR, type1, *vecp);
   *vecp = convert (type, *vecp);
 }
+  return ret;
 }
 
 /* Determine which of the operands, if any, is a scalar that needs to be
--- gcc/c/c-typeck.c.jj 2014-11-19 15:39:24.044113650 +0100
+++ gcc/c/c-typeck.c2014-11-20 08:38:02.534655847 +0100
@@ -2495,7 +2495,8 @@ build_array_ref (location_t loc, tree ar
 
   gcc_assert (TREE_CODE (TREE_TYPE (index)) == INTEGER_TYPE);
 
-  convert_vector_to_pointer_for_subscript (loc, array, index);
+  bool non_lvalue
+    = convert_vector_to_pointer_for_subscript (loc, array, index);
 
   if (TREE_CODE (TREE_TYPE (array)) == ARRAY_TYPE)
 {
@@ -2557,6 +2558,8 @@ build_array_ref (location_t loc, tree 

Re: SRA: don't drop clobbers

2014-11-20 Thread Martin Jambor
Hi,

On Mon, Nov 03, 2014 at 10:46:49PM +0100, Marc Glisse wrote:
 On Mon, 3 Nov 2014, Marc Glisse wrote:
 
 On Mon, 3 Nov 2014, Martin Jambor wrote:
 
 I just applied your patch on top of trunk revision 217032 on my
 
 Ah, that explains it, thanks. This patch is a follow-up to
 r217034. Still, I didn't expect the ICE you are seeing by applying
 this patch to older trunk, I'll try to reproduce that.
 
 It is TODO_update_address_taken that used to remove clobbers, and as
 you said ESRA goes straight to TODO_update_ssa, which explains why
 the clobbers caused trouble. In any case, after r217034, update_ssa
 should handle clobbers much better. Could you take another look
 based on a more recent trunk, please?
 

Sorry for the delay.  Anyway, on the current trunk (i.e. Tuesday
checkout) the patch works as expected, there are assignments from
default definitions now and even though we do not warn as we should,
the patch improves the generated code.  The function foo from the
testcase is optimized to "return SR.1_2(D);" as soon as release_ssa
now, whereas unpatched trunk leaves an undefined load even in the
optimized dump.

Thus, I like the patch and given that you posted it well before stage1
end, I'd like to see it committed.  Richi, can you have a look and
perhaps approve it?

Thanks,

Martin



Re: [Aarch64][BE][2/2] Fix vector load/stores to not use ld1/st1

2014-11-20 Thread Marcus Shawcroft
On 14 November 2014 16:48, Alan Hayward alan.hayw...@arm.com wrote:
 This is a new version of my BE patch from a few weeks ago.
 This is part 2 and covers all the aarch64 changes.

 When combined with the first patch, it fixes up movoi/ci/xi for Big
 Endian, so that the lsb of a big-endian integer ends up in
 the low byte of the highest-numbered register.

 This patch requires part 1 and David Sherwood’s patch:
  [AArch64] [BE] [1/2] Make large opaque integer modes endianness-safe.

 When tested with David’s patch and [1/2] of this patch, no regressions
 were seen when testing aarch64 and x86_64 on make check.


 Changelog:
 2014-11-14  Alan Hayward  alan.hayw...@arm.com

 * config/aarch64/aarch64.c
 (aarch64_classify_address): Allow extra addressing modes for BE.
 (aarch64_print_operand): new operand for printing a q register+1.


Just a bunch of ChangeLog nits.

ChangeLog entries are sentences.  All of these entries should start
with a capital letter.

Perhaps this one would be better written as:

Add 'R' specifier.

 (aarch64_simd_emit_reg_reg_move): replacement for

Replace with just:

 (aarch64_simd_emit_reg_reg_move): Remove.

 * config/aarch64/aarch64-protos.h
 (aarch64_simd_emit_reg_reg_move): replacement for
 aarch64_simd_disambiguate_copy.

How about:
 (aarch64_simd_disambiguate_copy): Define.

etc

 * config/aarch64/aarch64-simd.md
 (define_split): Use new aarch64_simd_emit_reg_reg_move.

 (define_expand mov<mode>): less restrictive predicates.


 (define_insn *aarch64_mov<mode>): Simplify and only allow for LE.
 (define_insn *aarch64_be_movoi): New.  BE only.  Plant ldp or
 stp.

Just say: Define.

 (define_insn *aarch64_be_movci): New.  BE only.  No instructions.
 (define_insn *aarch64_be_movxi): New.  BE only.  No instructions.

Likewise.

 (define_split): OI mov.  Use new aarch64_simd_emit_reg_reg_move.
 (define_split): CI mov.  Use new aarch64_simd_emit_reg_reg_move.


 On BE
 plant movs for reg to/from mem case.

Drop this part.


 (define_split): XI mov.  Use new aarch64_simd_emit_reg_reg_move.

 On BE
 plant movs for reg to/from mem case.

Likewise.

+void aarch64_simd_emit_reg_reg_move (rtx *operands, enum machine_mode mode,
+ unsigned int count);

Drop the formal argument names.


Can you respin with these changes please.

/Marcus


[PATCH] Fix tree-ssa-strlen ICE introduced by r211956 (PR tree-optimization/61773)

2014-11-20 Thread Jakub Jelinek
Hi!

Before the r211956 changes, the only places that set si->stmt
were required to check that stpcpy has been declared (with the right
prototype) to signal the strlen pass that it can use stpcpy for
optimization.  But r211956 sets si->stmt also for a malloc call,
which isn't in any way related to stpcpy.  So, this patch moves the
assertion where it really is needed (for strcat/strcpy and their checking
variants cases).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2014-11-20  Jakub Jelinek  ja...@redhat.com

PR tree-optimization/61773
* tree-ssa-strlen.c (get_string_length): Don't assert
stpcpy has been prototyped if si->stmt is BUILT_IN_MALLOC.

* gcc.dg/pr61773.c: New test.

--- gcc/tree-ssa-strlen.c.jj2014-11-19 18:47:59.0 +0100
+++ gcc/tree-ssa-strlen.c   2014-11-20 09:46:33.949017462 +0100
@@ -430,7 +430,6 @@ get_string_length (strinfo si)
   callee = gimple_call_fndecl (stmt);
   gcc_assert (callee && DECL_BUILT_IN_CLASS (callee) == BUILT_IN_NORMAL);
   lhs = gimple_call_lhs (stmt);
-  gcc_assert (builtin_decl_implicit_p (BUILT_IN_STPCPY));
   /* unshare_strinfo is intentionally not called here.  The (delayed)
 transformation of strcpy or strcat into stpcpy is done at the place
 of the former strcpy/strcat call and so can affect all the strinfos
@@ -479,6 +478,7 @@ get_string_length (strinfo si)
case BUILT_IN_STRCPY_CHK:
case BUILT_IN_STRCPY_CHKP:
case BUILT_IN_STRCPY_CHK_CHKP:
+ gcc_assert (builtin_decl_implicit_p (BUILT_IN_STPCPY));
  if (gimple_call_num_args (stmt) == (with_bounds ? 4 : 2))
fn = builtin_decl_implicit (BUILT_IN_STPCPY);
  else
--- gcc/testsuite/gcc.dg/pr61773.c.jj   2014-11-20 10:12:48.664616764 +0100
+++ gcc/testsuite/gcc.dg/pr61773.c  2014-11-20 10:13:47.384557904 +0100
@@ -0,0 +1,16 @@
+/* PR tree-optimization/61773 */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+void
+foo (char **x)
+{
+  char *p = __builtin_malloc (64);
+  char *q = __builtin_malloc (64);
+  __builtin_strcat (q, "abcde");
+  __builtin_strcat (p, "ab");
+  p[1] = q[3];
+  __builtin_strcat (p, q);
+  x[0] = p;
+  x[1] = q;
+}

Jakub


Re: New patch: [AArch64] [BE] [1/2] Make large opaque integer modes endianness-safe.

2014-11-20 Thread Marcus Shawcroft
On 13 November 2014 10:09, David Sherwood david.sherw...@arm.com wrote:

 gcc/:
 2014-11-13  David Sherwood  david.sherw...@arm.com

 * config/aarch64/aarch64-protos.h (aarch64_simd_attr_length_rglist,
 aarch64_reverse_mask): New decls.
 * config/aarch64/iterators.md (UNSPEC_REV_REGLIST): New enum.
 * config/aarch64/iterators.md (insn_count): New mode_attr.
* config/aarch64/aarch64-simd.md (vec_store_lanes(o/c/x)i,
 vec_load_lanes(o/c/x)i): Fixed to work for Big Endian.

Spell these out in full please, some folks like to be able to grep for
function names in these logs.

 * config/aarch64/aarch64-simd.md (aarch64_rev_reglist,
 aarch64_simd_(ld/st)(2/3/4)): Added.

Likewise.

 * config/aarch64/aarch64.c (aarch64_simd_attr_length_rglist,
 aarch64_reverse_mask): Added.

It isn't clear to me how far through the various BE patches we need to
get before 59810 is actually resolved?

Cheers
/Marcus


[ia64 PATCH] Fix up ia64 attribute handling (PR target/61137)

2014-11-20 Thread Jakub Jelinek
Hi!

Seems the gcc.target/ia64/small-addr-1.c testcase is failing on ia64 since
r210262 but clearly has been failing for much longer if compiled with C++
(just there is insufficient testsuite coverage).
The problem is that for the model attribute (and apparently common_object on
VMS too), the argument of that attribute is supposed to be an identifier
rather than expression (for common_object either an identifier or string),
and these days one has to tell the frontends about that in order not to
get the argument parsed as an expression.

The following untested patch fixes that (tested on small-addr-1.c with
a cross-compiler), I don't have ia64 hw nor spare cycles to test this
though, so I'm just offering the patch as is if anyone wants to test it.
Perhaps better testsuite coverage wouldn't hurt (test the model (small)
attribute also in C++, perhaps test the common_object attribute on VMS?).

2014-11-20  Jakub Jelinek  ja...@redhat.com

PR target/61137
* config/ia64/ia64.c (ia64_attribute_takes_identifier_p): New function.
(TARGET_ATTRIBUTE_TAKES_IDENTIFIER_P): Redefine to it.

--- gcc/config/ia64/ia64.c.jj   2014-11-11 00:06:23.0 +0100
+++ gcc/config/ia64/ia64.c  2014-11-20 11:51:59.729478773 +0100
@@ -324,6 +324,7 @@ static bool ia64_vms_valid_pointer_mode
 static tree ia64_vms_common_object_attribute (tree *, tree, tree, int, bool *)
  ATTRIBUTE_UNUSED;
 
+static bool ia64_attribute_takes_identifier_p (const_tree);
 static tree ia64_handle_model_attribute (tree *, tree, tree, int, bool *);
 static tree ia64_handle_version_id_attribute (tree *, tree, tree, int, bool *);
 static void ia64_encode_section_info (tree, rtx, int);
@@ -669,8 +670,26 @@ static const struct attribute_spec ia64_
 #undef TARGET_VECTORIZE_VEC_PERM_CONST_OK
 #define TARGET_VECTORIZE_VEC_PERM_CONST_OK ia64_vectorize_vec_perm_const_ok
 
+#undef TARGET_ATTRIBUTE_TAKES_IDENTIFIER_P
+#define TARGET_ATTRIBUTE_TAKES_IDENTIFIER_P ia64_attribute_takes_identifier_p
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
+/* Returns TRUE iff the target attribute indicated by ATTR_ID takes a plain
+   identifier as an argument, so the front end shouldn't look it up.  */
+
+static bool
+ia64_attribute_takes_identifier_p (const_tree attr_id)
+{
+  if (is_attribute_p ("model", attr_id))
+return true;
+#if TARGET_ABI_OPEN_VMS
+  if (is_attribute_p ("common_object", attr_id))
+return true;
+#endif
+  return false;
+}
+
 typedef enum
   {
 ADDR_AREA_NORMAL,  /* normal address area */

Jakub


Re: [PATCH, ifcvt] Fix PR63917

2014-11-20 Thread Richard Henderson
On 11/20/2014 10:48 AM, Zhenqiang Chen wrote:
 +/* Check X clobber CC reg or not.  */
 +
 +static bool
 +clobber_cc_p (rtx x)
 +{
 +  RTX_CODE code = GET_CODE (x);
 +  int i;
 +
 +  if (code == CLOBBER
 +   && REG_P (XEXP (x, 0))
 +   && (GET_MODE_CLASS (GET_MODE (XEXP (x, 0))) == MODE_CC))
 +return TRUE;
 +  else if (code == PARALLEL)
 +for (i = 0; i < XVECLEN (x, 0); i++)
 +  if (clobber_cc_p (XVECEXP (x, 0, i)))
 + return TRUE;
 +  return FALSE;
 +}

Why would you need something like this when modified_between_p or one of its
kin ought to do the job?


r~


[PATCH] rs6000: Follow up for signed integer overflow fix

2014-11-20 Thread Markus Trippelsdorf
On 2014.11.20 at 08:59 -0500, David Edelsohn wrote:
 On Thu, Nov 20, 2014 at 8:27 AM, Markus Trippelsdorf
 mar...@trippelsdorf.de wrote:
  Running the testsuite after bootstrap-ubsan on gcc112 shows several issues. 
  See
  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63426 for the full list.
 
  This patch fixes several of them.
 
  Tested on powerpc64-unknown-linux-gnu.
 
  OK for trunk?
 
  Thanks.
 
  2014-11-20  Markus Trippelsdorf  mar...@trippelsdorf.de
 
  * config/rs6000/constraints.md: Avoid signed integer overflows.
  * config/rs6000/predicates.md: Likewise.
  * config/rs6000/rs6000.c (num_insns_constant_wide): Likewise.
  (includes_rldic_lshift_p): Likewise.
  (includes_rldicr_lshift_p): Likewise.
  * emit-rtl.c (const_wide_int_htab_hash): Likewise.
  * loop-iv.c (determine_max_iter): Likewise.
  (iv_number_of_iterations): Likewise.
  * tree-ssa-loop-ivopts.c (get_computation_cost_at): Likewise.
  * varasm.c (get_section_anchor): Likewise.
 
 The rs6000 patches are okay.
 
 Someone like Richi or Jakub needs to approve the changes to the common
 parts of the compiler.

The patch needs a follow-up.  I introduced a new compiler warning that I
didn't notice, because I was unintentionally using --disable-werror
during testing.

Fixed by casting a few 0s to unsigned HOST_WIDE_INT.
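
To spell out why the cast matters, here is a small sketch under the assumption that unsigned HOST_WIDE_INT is 64 bits wide; std::uint64_t stands in for it and the function names are made up. A plain ~0 has type int, so mixing it with an unsigned wide value is presumably what triggered the warning, while ~0u would only set as many bits as unsigned int has. Casting the 0 first builds the mask at the full width.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Assumption: a 64-bit stand-in for "unsigned HOST_WIDE_INT".
typedef std::uint64_t uhwi_sketch;

// All-ones mask built at the full width of the type.
uhwi_sketch all_ones_cast()
{
  return ~static_cast<uhwi_sketch>(0);
}

// ~0u is computed at unsigned int width and then zero-extended, so any
// bits above that width stay clear.
uhwi_sketch all_ones_unsigned_int()
{
  return ~0u;
}

// Full-width mask with the low SHIFT bits cleared (SHIFT must be below 64).
uhwi_sketch shifted_mask(unsigned int shift)
{
  uhwi_sketch mask = ~static_cast<uhwi_sketch>(0);
  mask <<= shift;
  return mask;
}

// Small self-check of the three helpers.
void check_masks_sketch()
{
  assert(all_ones_cast() == std::numeric_limits<uhwi_sketch>::max());
  assert(all_ones_unsigned_int() == std::numeric_limits<unsigned int>::max());
  assert(shifted_mask(4) == 0xFFFFFFFFFFFFFFF0ULL);
}
```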

Tested with --enable-werror on powerpc64-unknown-linux-gnu.

OK for trunk?

Thanks.

2014-11-20  Markus Trippelsdorf  mar...@trippelsdorf.de

* config/rs6000/rs6000.c (includes_rldic_lshift_p): Cast 0 to unsigned.
(includes_rldicr_lshift_p): Likewise.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index a9604cf3fa97..d7958b33ba1a 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -16197,10 +16197,10 @@ includes_rldic_lshift_p (rtx shiftop, rtx andop)
   unsigned HOST_WIDE_INT c, lsb, shift_mask;
 
   c = INTVAL (andop);
-  if (c == 0 || c == ~0)
+  if (c == 0 || c == ~(unsigned HOST_WIDE_INT) 0)
return 0;
 
-  shift_mask = ~0;
+  shift_mask = ~(unsigned HOST_WIDE_INT) 0;
   shift_mask <<= INTVAL (shiftop);
 
   /* Find the least significant one bit.  */
@@ -16235,7 +16235,7 @@ includes_rldicr_lshift_p (rtx shiftop, rtx andop)
 {
   unsigned HOST_WIDE_INT c, lsb, shift_mask;
 
-  shift_mask = ~0;
+  shift_mask = ~(unsigned HOST_WIDE_INT) 0;
   shift_mask <<= INTVAL (shiftop);
   c = INTVAL (andop);
 

-- 
Markus

