Re: [PATCH/IPA] Fix ipa-polymorphic-call when size of Pmode is not the size of pointers in user code
Hi,

For ILP32 on AARCH64, we have ptr_mode != Pmode (ptr_mode is SImode while Pmode is DImode, and POINTER_SIZE is 32). This breaks the ipa-polymorphic-call assumption that Pmode is the correct mode for pointers. Before this patch we get many testcase failures in the C++ testsuite because of this. Some of the tests fail due to the wrong devirtualization happening (using the base class rather than the current class).

This patch fixes the issue by using POINTER_SIZE in place of GET_MODE_BITSIZE (Pmode) throughout the file.

OK? Bootstrapped and tested on x86_64, and cross-built and tested for aarch64-elf with no regressions.

Thanks,
Andrew Pinski

ChangeLog:
        ipa/63981
        * ipa-polymorphic-call.c (possible_placement_new): Use POINTER_SIZE
        instead of GET_MODE_BITSIZE (Pmode).
        (ipa_polymorphic_call_context::restrict_to_inner_class): Likewise.
        (extr_type_from_vtbl_ptr_store): Likewise.

OK, thanks!
Honza

diff --git a/gcc/ipa-polymorphic-call.c b/gcc/ipa-polymorphic-call.c
index 452f2d2..a746c49 100644
--- a/gcc/ipa-polymorphic-call.c
+++ b/gcc/ipa-polymorphic-call.c
@@ -112,7 +112,7 @@ possible_placement_new (tree type, tree expected_type,
               || !tree_fits_shwi_p (TYPE_SIZE (type))
               || (cur_offset
                   + (expected_type ? tree_to_uhwi (TYPE_SIZE (expected_type))
-                     : GET_MODE_BITSIZE (Pmode))
+                     : POINTER_SIZE)
                   <= tree_to_uhwi (TYPE_SIZE (type)))));
 }
 
@@ -155,7 +155,7 @@ ipa_polymorphic_call_context::restrict_to_inner_class (tree otr_type,
   HOST_WIDE_INT cur_offset = offset;
   bool speculative = false;
   bool size_unknown = false;
-  unsigned HOST_WIDE_INT otr_type_size = GET_MODE_BITSIZE (Pmode);
+  unsigned HOST_WIDE_INT otr_type_size = POINTER_SIZE;
 
   /* Update OUTER_TYPE to match EXPECTED_TYPE if it is not set.  */
   if (!outer_type)
@@ -316,7 +316,7 @@ ipa_polymorphic_call_context::restrict_to_inner_class (tree otr_type,
               if (pos <= (unsigned HOST_WIDE_INT)cur_offset
                   && (pos + size) >= (unsigned HOST_WIDE_INT)cur_offset
-                                     + GET_MODE_BITSIZE (Pmode)
+                                     + POINTER_SIZE
                   && (!otr_type
                       || !TYPE_SIZE (TREE_TYPE (fld))
                       || !tree_fits_shwi_p (TYPE_SIZE (TREE_TYPE (fld)))
@@ -1243,7 +1243,7 @@ extr_type_from_vtbl_ptr_store (gimple stmt, struct type_change_info *tci,
               print_generic_expr (dump_file, tci->instance, TDF_SLIM);
               fprintf (dump_file, " with offset %i\n", (int)tci->offset);
             }
-          return tci->offset > GET_MODE_BITSIZE (Pmode) ? error_mark_node : NULL_TREE;
+          return tci->offset > POINTER_SIZE ? error_mark_node : NULL_TREE;
         }
       if (offset != tci->offset
           || size != POINTER_SIZE
@@ -1252,9 +1252,9 @@ extr_type_from_vtbl_ptr_store (gimple stmt, struct type_change_info *tci,
           if (dump_file)
             fprintf (dump_file, " wrong offset %i!=%i or size %i\n",
                      (int)offset, (int)tci->offset, (int)size);
-          return offset + GET_MODE_BITSIZE (Pmode) <= tci->offset
+          return offset + POINTER_SIZE <= tci->offset
                  || (max_size != -1
-                     && tci->offset + GET_MODE_BITSIZE (Pmode) > offset + max_size)
+                     && tci->offset + POINTER_SIZE > offset + max_size)
                  ? error_mark_node : NULL;
         }
     }
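
To make the failure mode concrete, here is a minimal sketch of the kind of bounds check this patch touches. It is not GCC code: `vptr_fits`, `SKETCH_PMODE_BITS` and `SKETCH_POINTER_BITS` are hypothetical names, and the 64/32 values only mirror the ILP32 AArch64 numbers quoted above (Pmode is DImode, POINTER_SIZE is 32).

```c
#include <stdbool.h>

/* Hypothetical stand-ins for GET_MODE_BITSIZE (Pmode) and POINTER_SIZE
   on an ILP32 AArch64 target; these are not GCC's real macros.  */
enum { SKETCH_PMODE_BITS = 64, SKETCH_POINTER_BITS = 32 };

/* Return true if a vtable pointer that is PTR_BITS wide, stored at bit
   offset CUR_OFFSET, lies entirely inside an object TYPE_BITS wide.  */
static bool
vptr_fits (unsigned long cur_offset, unsigned long ptr_bits,
           unsigned long type_bits)
{
  return cur_offset + ptr_bits <= type_bits;
}
```

For a 64-bit object whose vtable pointer sits at bit offset 32, `vptr_fits (32, SKETCH_POINTER_BITS, 64)` is true while `vptr_fits (32, SKETCH_PMODE_BITS, 64)` is false. A disagreement of this sort is what the switch to POINTER_SIZE is meant to avoid.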
Re: [PATCH][x86] Add clwb,pcommit,avx512avbmi,avx512ifma.
On Wed, Nov 19, 2014 at 6:32 PM, Ilya Tocar tocarip.in...@gmail.com wrote:

Hi,

The new revision of the Intel ISA reference [1] has new instructions: clwb, pcommit and new flavors of AVX512. The patch below adds them. I understand that stage 1 is closed; however, these changes shouldn't affect anything outside of the i386 backend and are extremely unlikely to break existing functionality, and I personally think it's desirable for the newest GCC to support the newest spec.

Bootstrapped/regtested on x86_64-unknown-linux-gnu. Ok for trunk?

Please split the patch into a patch series, like it was done previously for the AVX512F patches.

Uros.

[1]: https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf

gcc/
2014-11-19 Ilya Tocar ilya.to...@intel.com

        * common/config/i386/i386-common.c (OPTION_MASK_ISA_AVX512IFMA_SET,
        OPTION_MASK_ISA_AVX512VBMI_SET, OPTION_MASK_ISA_AVX512IFMA_UNSET,
        OPTION_MASK_ISA_AVX512VBMI_UNSET, OPTION_MASK_ISA_PCOMMIT_UNSET,
        OPTION_MASK_ISA_CLWB_UNSET, OPTION_MASK_ISA_CLWB_SET,
        OPTION_MASK_ISA_PCOMMIT_SET): New.
        (ix86_handle_option): Handle OPT_mavx512ifma, OPT_mavx512vbmi,
        OPT_mpcommit, OPT_mclwb.
        * config.gcc: Add avx512ifmaintrin.h, avx512ifmavlintrin.h,
        avx512vbmiintrin.h, avx512vbmivlintrin.h, clwbintrin.h,
        pcommitintrin.h.
        * config/i386/avx512ifmaintrin.h: New file.
        * config/i386/avx512ifmavlintrin.h: Ditto.
        * config/i386/avx512vbmiintrin.h: Ditto.
        * config/i386/avx512vbmivlintrin.h: Ditto.
        * config/i386/clwbintrin.h: Ditto.
        * config/i386/pcommitintrin.h: Ditto.
        * config/i386/cpuid.h (bit_AVX512IFMA, bit_PCOMMIT, bit_CLWB,
        bit_AVX512VBMI): New.
        * config/i386/driver-i386.c (host_detect_local_cpu): Detect pcommit,
        clwb, avx512ifma, avx512vbmi.
        * config/i386/i386-c.c (ix86_target_macros_internal): Define
        __AVX512VBMI__, __AVX512IFMA__, __PCOMMIT__, __CLWB__.
        * config/i386/i386.c (ix86_target_string): Add -mavx512ifma,
        -mavx512vbmi, -mclwb, -mpcommit.
        (PTA_AVX512VBMI, PTA_AVX512IFMA, PTA_CLWB, PTA_PCOMMIT): Define.
(ix86_option_override_internal): Handle new options. (ix86_valid_target_attribute_inner_p): Add avx512vbmi, avx512ifma, clwb, pcommit. (ix86_builtins): Add IX86_BUILTIN_VPMADD52LUQ512, IX86_BUILTIN_VPMADD52HUQ512, IX86_BUILTIN_VPMADD52LUQ256, IX86_BUILTIN_VPMADD52HUQ256, IX86_BUILTIN_VPMADD52LUQ128, IX86_BUILTIN_VPMADD52HUQ128, IX86_BUILTIN_VPMADD52LUQ512_MASKZ, IX86_BUILTIN_VPMADD52HUQ512_MASKZ, IX86_BUILTIN_VPMADD52LUQ256_MASKZ, IX86_BUILTIN_VPMADD52HUQ256_MASKZ, IX86_BUILTIN_VPMADD52LUQ128_MASKZ, IX86_BUILTIN_VPMADD52HUQ128_MASKZ, IX86_BUILTIN_VPMULTISHIFTQB512, IX86_BUILTIN_VPMULTISHIFTQB256, IX86_BUILTIN_VPMULTISHIFTQB128, IX86_BUILTIN_VPERMVARQI512_MASK, IX86_BUILTIN_VPERMT2VARQI512, IX86_BUILTIN_VPERMT2VARQI512_MASKZ, IX86_BUILTIN_VPERMI2VARQI512, IX86_BUILTIN_VPERMVARQI256_MASK, IX86_BUILTIN_VPERMVARQI128_MASK, IX86_BUILTIN_VPERMT2VARQI256, IX86_BUILTIN_VPERMT2VARQI256_MASKZ, IX86_BUILTIN_VPERMT2VARQI128, IX86_BUILTIN_VPERMI2VARQI256, IX86_BUILTIN_VPERMI2VARQI128, IX86_BUILTIN_CLWB, IX86_BUILTIN_PCOMMIT. 
(bdesc_special_args): Add __builtin_ia32_pcommit, __builtin_ia32_vpmadd52luq512_mask, __builtin_ia32_vpmadd52luq512_maskz, __builtin_ia32_vpmadd52huq512_mask, __builtin_ia32_vpmadd52huq512_maskz, __builtin_ia32_vpmadd52luq256_mask, __builtin_ia32_vpmadd52luq256_maskz, __builtin_ia32_vpmadd52huq256_mask, __builtin_ia32_vpmadd52huq256_maskz, __builtin_ia32_vpmadd52luq128_mask, __builtin_ia32_vpmadd52luq128_maskz, __builtin_ia32_vpmadd52huq128_mask, __builtin_ia32_vpmadd52huq128_maskz, __builtin_ia32_vpmultishiftqb512_mask, __builtin_ia32_vpmultishiftqb256_mask, __builtin_ia32_vpmultishiftqb128_mask, __builtin_ia32_permvarqi512_mask, __builtin_ia32_vpermt2varqi512_mask, __builtin_ia32_vpermt2varqi512_maskz, __builtin_ia32_vpermi2varqi512_mask, __builtin_ia32_permvarqi256_mask, __builtin_ia32_permvarqi128_mask, __builtin_ia32_vpermt2varqi256_mask, __builtin_ia32_vpermt2varqi256_maskz, __builtin_ia32_vpermt2varqi128_mask, __builtin_ia32_vpermt2varqi128_maskz, __builtin_ia32_vpermi2varqi256_mask, __builtin_ia32_vpermi2varqi128_mask. (ix86_init_mmx_sse_builtins): Add __builtin_ia32_clwb. (ix86_expand_builtin): Handle IX86_BUILTIN_CLWB. (ix86_hard_regno_mode_ok): Allow big masks for AVX512VBMI. * config/i386/i386.h (TARGET_AVX512VBMI, TARGET_AVX512VBMI_P, TARGET_AVX512IFMA, TARGET_AVX512IFMA_P, TARGET_PCOMMIT, TARGET_PCOMMIT_P,
Re: [PATCH, PR63742][ARM] Fix arm *movhi_insn_arch4 pattern for big-endian
On 19/11/14 09:29, Yangfei (Felix) wrote:

Sorry for missing the point. It seems to me that 't2' here will conflict with the condition of the pattern *movhi_insn_arch4:

  TARGET_ARM && arm_arch4 && (register_operand (operands[0], HImode) || register_operand (operands[1], HImode))

  #define TARGET_ARM (! TARGET_THUMB)
  /* 32-bit Thumb-2 code. */
  #define TARGET_THUMB2 (TARGET_THUMB && arm_arch_thumb2)

Bah, indeed! I misremembered the t2 there, my mistake. Yes, you are right there, but what I'd like you to do is to use that mechanism rather than putting all this logic in the predicate. So I'd prefer you to add a v6t2 to the values for the arch attribute (don't forget to update the comments above), and in arch_enabled you need to enforce this with

  (and (eq_attr "arch" "v6t2")
       (match_test "TARGET_32BIT && arm_arch6 && arm_arch_thumb2"))
  (const_string "yes")

And in the pattern use v6t2 ... arm_arch_thumb2 implies that this is at the architecture level of v6t2. Therefore TARGET_ARM && arm_arch_thumb2 implies ARM state.

Hi Ramana,

Thank you for your suggestions. I rebased the patch on the latest trunk and updated it accordingly. As this patch will not work for architectures older than armv6t2, I also prefer Thomas's patch to fix it for them. I am currently testing this patch. Assuming no issues pop up, OK for the trunk? And is it necessary to backport this patch to the 4.8 & 4.9 branches?

I've applied the following as obvious after Kugan mentioned on IRC this morning noticing a movwne r0, #-32768. Obviously this won't be accepted as is by the assembler and we should be using the %L character. Applied to trunk as obvious.

Felix, how did you test this patch?

regards
Ramana

2014-11-20 Ramana Radhakrishnan ramana.radhakrish...@arm.com

        PR target/59593
        * config/arm/arm.md (*movhi_insn): Use right formatting for immediate.

Index: gcc/ChangeLog
===================================================================
--- gcc/ChangeLog (revision 217717)
+++ gcc/ChangeLog (working copy)
@@ -1,3 +1,11 @@
+2014-11-19 Felix Yang felix.y...@huawei.com
+           Shanyao Chen chenshan...@huawei.com
+
+        PR target/59593
+        * config/arm/arm.md (define_attr "arch"): Add v6t2.
+        (define_attr "arch_enabled"): Add test for the above.
+        (*movhi_insn_arch4): Add new alternative.
+
 2014-11-18 Felix Yang felix.y...@huawei.com
 
         * config/aarch64/aarch64.c (doloop_end): New pattern.
Index: gcc/config/arm/arm.md
===================================================================
--- gcc/config/arm/arm.md (revision 217717)
+++ gcc/config/arm/arm.md (working copy)
@@ -125,9 +125,10 @@
 ; This can be "a" for ARM, "t" for either of the Thumbs, "32" for
 ; TARGET_32BIT, "t1" or "t2" to specify a specific Thumb mode.  "v6"
 ; for ARM or Thumb-2 with arm_arch6, and nov6 for ARM without
-; arm_arch6.  This attribute is used to compute attribute "enabled",
-; use type "any" to enable an alternative in all cases.
-(define_attr "arch" "any,a,t,32,t1,t2,v6,nov6,neon_for_64bits,avoid_neon_for_64bits,iwmmxt,iwmmxt2,armv6_or_vfpv3"
+; arm_arch6.  "v6t2" for Thumb-2 with arm_arch6.  This attribute is
+; used to compute attribute "enabled", use type "any" to enable an
+; alternative in all cases.
+(define_attr "arch" "any,a,t,32,t1,t2,v6,nov6,v6t2,neon_for_64bits,avoid_neon_for_64bits,iwmmxt,iwmmxt2,armv6_or_vfpv3"
   (const_string "any"))
 
 (define_attr "arch_enabled" "no,yes"
@@ -162,6 +163,10 @@
               (match_test "TARGET_32BIT && !arm_arch6"))
          (const_string "yes")
 
+         (and (eq_attr "arch" "v6t2")
+              (match_test "TARGET_32BIT && arm_arch6 && arm_arch_thumb2"))
+         (const_string "yes")
+
          (and (eq_attr "arch" "avoid_neon_for_64bits")
               (match_test "TARGET_NEON")
               (not (match_test "TARGET_PREFER_NEON_64BITS")))
@@ -6288,8 +6293,8 @@
 
 ;; Pattern to recognize insn generated default case above
 (define_insn "*movhi_insn_arch4"
-  [(set (match_operand:HI 0 "nonimmediate_operand" "=r,r,m,r")
-        (match_operand:HI 1 "general_operand" "rIk,K,r,mi"))]
+  [(set (match_operand:HI 0 "nonimmediate_operand" "=r,r,r,m,r")
+        (match_operand:HI 1 "general_operand" "rIk,K,n,r,mi"))]
   "TARGET_ARM
    && arm_arch4
    && (register_operand (operands[0], HImode)
@@ -6297,16 +6302,19 @@
   "@
    mov%?\\t%0, %1\\t%@ movhi
    mvn%?\\t%0, #%B1\\t%@ movhi
+   movw%?\\t%0, %1\\t%@ movhi
    str%(h%)\\t%1, %0\\t%@ movhi
    ldr%(h%)\\t%0, %1\\t%@ movhi"
   [(set_attr "predicable" "yes")
-   (set_attr "pool_range" "*,*,*,256")
-   (set_attr "neg_pool_range" "*,*,*,244")
+   (set_attr "pool_range" "*,*,*,*,256")
+   (set_attr "neg_pool_range" "*,*,*,*,244")
+   (set_attr "arch" "*,*,v6t2,*,*")
    (set_attr_alternative "type"
                          [(if_then_else (match_operand 1 "const_int_operand" "")
                                         (const_string "mov_imm" )
RE: [PATCH][wwwdocs] Update 5.0 changes.html with Thumb1 UAL
-----Original Message-----
From: Kyrill Tkachov [mailto:kyrylo.tkac...@arm.com]
Sent: Tuesday, November 18, 2014 11:08 PM
To: Terry Guo; gcc-patches@gcc.gnu.org
Cc: ger...@pfeifer.com
Subject: Re: [PATCH][wwwdocs] Update 5.0 changes.html with Thumb1 UAL

On 18/11/14 02:48, Terry Guo wrote:

+ <ul>
+ <li>The Thumb-1 assembly code are now generated in unified syntax. The new option
+ <code>-masm-syntax-unified</code> can be used to specify whether inline assembly
+ code are using unified syntax. By default the option is off which means
+ non-unified syntax is used. However this is subject to change in future releases.
+ Eventually the non-unified syntax will be deprecated.
+ </li>
+ </ul>

Hi Terry,

Sorry for the late comment, I see this has already been committed. I think it should be "assembly code is now generated". Also "whether inline assembly code is using unified syntax".

Kyrill

Thanks for the comments. I committed the patch below to fix those typos.

BR,
Terry

Index: htdocs/gcc-5/changes.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v
retrieving revision 1.39
diff -u -r1.39 changes.html
--- htdocs/gcc-5/changes.html 19 Nov 2014 12:13:00 -0000 1.39
+++ htdocs/gcc-5/changes.html 20 Nov 2014 03:48:26 -0000
@@ -387,9 +387,9 @@
 <h3 id="arm">ARM</h3>
 <ul>
- <li>The Thumb-1 assembly code are now generated in unified syntax. The new option
+ <li>The Thumb-1 assembly code is now generated in unified syntax. The new option
  <code>-masm-syntax-unified</code> can be used to specify whether inline assembly
- code are using unified syntax. By default the option is off which means
+ code is using unified syntax. By default the option is off which means
  non-unified syntax is used. However this is subject to change in future releases.
  Eventually the non-unified syntax will be deprecated.
  </li>
Re: [PATCH, PR63742][ARM] Fix arm *movhi_insn_arch4 pattern for big-endian
On 19/11/14 09:29, Yangfei (Felix) wrote:

Sorry for missing the point. It seems to me that 't2' here will conflict with the condition of the pattern *movhi_insn_arch4:

  TARGET_ARM && arm_arch4 && (register_operand (operands[0], HImode) || register_operand (operands[1], HImode))

  #define TARGET_ARM (! TARGET_THUMB)
  /* 32-bit Thumb-2 code. */
  #define TARGET_THUMB2 (TARGET_THUMB && arm_arch_thumb2)

Bah, indeed! I misremembered the t2 there, my mistake. Yes, you are right there, but what I'd like you to do is to use that mechanism rather than putting all this logic in the predicate. So I'd prefer you to add a v6t2 to the values for the arch attribute (don't forget to update the comments above), and in arch_enabled you need to enforce this with

  (and (eq_attr "arch" "v6t2")
       (match_test "TARGET_32BIT && arm_arch6 && arm_arch_thumb2"))
  (const_string "yes")

And in the pattern use v6t2 ... arm_arch_thumb2 implies that this is at the architecture level of v6t2. Therefore TARGET_ARM && arm_arch_thumb2 implies ARM state.

Hi Ramana,

Thank you for your suggestions. I rebased the patch on the latest trunk and updated it accordingly. As this patch will not work for architectures older than armv6t2, I also prefer Thomas's patch to fix it for them. I am currently testing this patch. Assuming no issues pop up, OK for the trunk? And is it necessary to backport this patch to the 4.8 & 4.9 branches?

I've applied the following as obvious after Kugan mentioned on IRC this morning noticing a movwne r0, #-32768. Obviously this won't be accepted as is by the assembler and we should be using the %L character. Applied to trunk as obvious.

Felix, how did you test this patch?

regards
Ramana

I regtested the patch for arm-eabi-gcc/g++ big-endian with qemu. The test result is OK. That's strange ... This issue can be reproduced by the following testcase. Thanks for fixing it.

#include <stdio.h>

unsigned short v = 0x5678;
int i;
int j = 0;
int *ptr = &j;

int func()
{
  for (i = 0; i < 1; ++i)
    {
      *ptr = -1;
      v = 0xF234;
    }
  return v;
}

2014-11-20 Ramana Radhakrishnan ramana.radhakrish...@arm.com

        PR target/59593
        * config/arm/arm.md (*movhi_insn): Use right formatting for immediate.
Re: [PATCH 2/2, AArch64, v2] Pipeline model for APM XGene-1.
Hi Philipp, I don't mind it being in config/arm if you plan to wire it up later, good to know. Another comment inline Thanks, Kyrill On 19/11/14 21:42, Philipp Tomsich wrote: Here's an updated patch with Kyrill's and Andrew's comments integrated. I left the file in the config/arm-directory, as XGene-family is capable of executing ARMv7 and we will wire this into the 32bit backend in the near future (moving it now would just cause another move in the near future). We also moved the 'include' up to where the pipeline models for the A53/A57/ThunderX are included, as the previous dependency on picking up the SIMD types from aarch64-simd.md no longer holds true since gcc-4.9. Cheers, -Philipp. --- gcc/ChangeLog | 6 + gcc/config/aarch64/aarch64.md | 3 +- gcc/config/arm/xgene1.md | 520 ++ 3 files changed, 528 insertions(+), 1 deletion(-) create mode 100644 gcc/config/arm/xgene1.md diff --git a/gcc/ChangeLog b/gcc/ChangeLog index c9ac0d9..dad2278 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,5 +1,11 @@ 2014-11-19 Philipp Tomsich philipp.toms...@theobroma-systems.com + * config/aarch64/aarch64.md: Include xgene1.md. + (generic_sched): Set to no for xgene1. + * config/arm/xgene1.md: New file. + +2014-11-19 Philipp Tomsich philipp.toms...@theobroma-systems.com + * config/aarch64/aarch64-cores.def (xgene1): Update/add the xgene1 (APM XGene-1) core definition. 
* gcc/config/aarch64/aarch64.c: Add cost tables for APM XGene-1 diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 597ff8c..1b36384 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -191,7 +191,7 @@ (define_attr generic_sched yes,no (const (if_then_else - (eq_attr tune cortexa53,cortexa15,thunderx) + (eq_attr tune cortexa53,cortexa15,thunderx,xgene1) (const_string no) (const_string yes @@ -199,6 +199,7 @@ (include ../arm/cortex-a53.md) (include ../arm/cortex-a15.md) (include thunderx.md) +(include ../arm/xgene1.md) ;; --- ;; Jumps and other miscellaneous insns diff --git a/gcc/config/arm/xgene1.md b/gcc/config/arm/xgene1.md new file mode 100644 index 000..227f2c7 --- /dev/null +++ b/gcc/config/arm/xgene1.md @@ -0,0 +1,520 @@ +;; Machine description for AppliedMicro xgene1 core. +;; Copyright (C) 2012-2014 Free Software Foundation, Inc. +;; Contributed by Theobroma Systems Design und Consulting GmbH. +;;See http://www.theobroma-systems.com for more info. +;; +;; This file is part of GCC. +;; +;; GCC is free software; you can redistribute it and/or modify it +;; under the terms of the GNU General Public License as published by +;; the Free Software Foundation; either version 3, or (at your option) +;; any later version. +;; +;; GCC is distributed in the hope that it will be useful, but +;; WITHOUT ANY WARRANTY; without even the implied warranty of +;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +;; General Public License for more details. +;; +;; You should have received a copy of the GNU General Public License +;; along with GCC; see the file COPYING3. If not see +;; http://www.gnu.org/licenses/. 
+ +;; Pipeline description for the xgene1 micro-architecture + +(define_automaton xgene1) + +(define_cpu_unit xgene1_decode_out0 xgene1) +(define_cpu_unit xgene1_decode_out1 xgene1) +(define_cpu_unit xgene1_decode_out2 xgene1) +(define_cpu_unit xgene1_decode_out3 xgene1) + +(define_cpu_unit xgene_divide xgene1) +(define_cpu_unit xgene_fp_divide xgene1) Why is this xgene_* while the other units xgene1_*? + +(define_reservation xgene1_decode1op +( xgene1_decode_out0 ) +|( xgene1_decode_out1 ) +|( xgene1_decode_out2 ) +|( xgene1_decode_out3 ) +) +(define_reservation xgene1_decode2op +( xgene1_decode_out0 + xgene1_decode_out1 ) +|( xgene1_decode_out0 + xgene1_decode_out2 ) +|( xgene1_decode_out0 + xgene1_decode_out3 ) +|( xgene1_decode_out1 + xgene1_decode_out2 ) +|( xgene1_decode_out1 + xgene1_decode_out3 ) +|( xgene1_decode_out2 + xgene1_decode_out3 ) +) +(define_reservation xgene1_decodeIsolated +( xgene1_decode_out0 + xgene1_decode_out1 + xgene1_decode_out2 + xgene1_decode_out3 ) +) + +(define_insn_reservation branch 1 + (and (eq_attr tune xgene1) + (eq_attr type branch)) + xgene1_decode1op) insn_reservation names should also have the xgene1_* namespace + +(define_insn_reservation nop 1 + (and (eq_attr tune xgene1) + (eq_attr type no_insn)) + xgene1_decode1op) + +(define_insn_reservation call 1 + (and (eq_attr tune xgene1) + (eq_attr type call)) + xgene1_decode2op) + +(define_insn_reservation f_load 10 + (and (eq_attr tune xgene1) + (eq_attr type f_loadd,f_loads)) +
Re: [PATCH, PR63742][ARM] Fix arm *movhi_insn_arch4 pattern for big-endian
On 19/11/14 09:29, Yangfei (Felix) wrote:

Sorry for missing the point. It seems to me that 't2' here will conflict with the condition of the pattern *movhi_insn_arch4:

  TARGET_ARM && arm_arch4 && (register_operand (operands[0], HImode) || register_operand (operands[1], HImode))

  #define TARGET_ARM (! TARGET_THUMB)
  /* 32-bit Thumb-2 code. */
  #define TARGET_THUMB2 (TARGET_THUMB && arm_arch_thumb2)

Bah, indeed! I misremembered the t2 there, my mistake. Yes, you are right there, but what I'd like you to do is to use that mechanism rather than putting all this logic in the predicate. So I'd prefer you to add a v6t2 to the values for the arch attribute (don't forget to update the comments above), and in arch_enabled you need to enforce this with

  (and (eq_attr "arch" "v6t2")
       (match_test "TARGET_32BIT && arm_arch6 && arm_arch_thumb2"))
  (const_string "yes")

And in the pattern use v6t2 ... arm_arch_thumb2 implies that this is at the architecture level of v6t2. Therefore TARGET_ARM && arm_arch_thumb2 implies ARM state.

Hi Ramana,

Thank you for your suggestions. I rebased the patch on the latest trunk and updated it accordingly. As this patch will not work for architectures older than armv6t2, I also prefer Thomas's patch to fix it for them. I am currently testing this patch. Assuming no issues pop up, OK for the trunk? And is it necessary to backport this patch to the 4.8 & 4.9 branches?

I've applied the following as obvious after Kugan mentioned on IRC this morning noticing a movwne r0, #-32768. Obviously this won't be accepted as is by the assembler and we should be using the %L character. Applied to trunk as obvious.

Felix, how did you test this patch?

regards
Ramana

I regtested the patch for arm-eabi-gcc/g++ big-endian with qemu. The test result is OK. That's strange ... This issue can be reproduced by the following testcase. Thanks for fixing it.

#include <stdio.h>

unsigned short v = 0x5678;
int i;
int j = 0;
int *ptr = &j;

int func()
{
  for (i = 0; i < 1; ++i)
    {
      *ptr = -1;
      v = 0xF234;
    }
  return v;
}

And the architecture level is set to armv7-a by default when testing.
[PATCH] Fix PR63962
When moving tree-ssa-forwprop.c:associate_plusminus to match.pd patterns a single-use restriction escaped my eye. It is indeed important for non-simplifications like (ptr p+ off1) p+ off2 -> ptr p+ (off1 + off2) to not un-CSE. The association is most useful to enable later re-association as reassoc isn't able to associate pointer-plus chains but only unsigned integer arithmetic.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2014-11-20 Richard Biener rguent...@suse.de

        PR middle-end/63962
        * match.pd ((p +p off1) +p off2 -> (p +p (off1 + off2))):
        Guard with single-use operand 0.

        * gcc.dg/tree-ssa/forwprop-30.c: New testcase.

Index: gcc/match.pd
===================================================================
--- gcc/match.pd (revision 217767)
+++ gcc/match.pd (working copy)
@@ -370,8 +370,9 @@ (define_operator_list inverted_tcc_compa
 
 /* Associate (p +p off1) +p off2 as (p +p (off1 + off2)).  */
 (simplify
-  (pointer_plus (pointer_plus @0 @1) @3)
-  (pointer_plus @0 (plus @1 @3)))
+  (pointer_plus (pointer_plus@2 @0 @1) @3)
+  (if (TREE_CODE (@2) != SSA_NAME || has_single_use (@2))
+   (pointer_plus @0 (plus @1 @3))))
 
 /* Pattern match
      tem1 = (long) ptr1;
Index: gcc/testsuite/gcc.dg/tree-ssa/forwprop-30.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/forwprop-30.c (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/forwprop-30.c (working copy)
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-optimized" } */
+
+int *p;
+int *foo (int *q, int i, int j)
+{
+  p = q + i;
+  return p + j;
+}
+
+/* We shouldn't associate (q + i) + j to q + (i + j) here as we
+   need q + i as well.  */
+
+/* { dg-final { scan-tree-dump-times "\\+" 2 "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
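
The cost argument behind the single-use guard can be sketched outside the compiler. The following is only an illustration, not GCC code: it models addresses as integers, and `counted_add`, `shared_form` and `reassociated_form` are hypothetical helpers that count additions as a rough stand-in for emitted instructions.

```c
#include <stddef.h>

/* Number of additions performed so far.  */
static int add_count;

static ptrdiff_t
counted_add (ptrdiff_t a, ptrdiff_t b)
{
  add_count++;
  return a + b;
}

/* Keep q + i shared: p = q + i; result = p + j.  Two additions.  */
static ptrdiff_t
shared_form (ptrdiff_t q, ptrdiff_t i, ptrdiff_t j, ptrdiff_t *p)
{
  *p = counted_add (q, i);
  return counted_add (*p, j);
}

/* Re-associate to q + (i + j) although q + i is still needed for *P.
   Three additions, so the "simplification" made things worse.  */
static ptrdiff_t
reassociated_form (ptrdiff_t q, ptrdiff_t i, ptrdiff_t j, ptrdiff_t *p)
{
  *p = counted_add (q, i);
  return counted_add (q, counted_add (i, j));
}
```

Both forms compute the same values, but when the intermediate sum has a second use the re-associated form needs one more addition. That matches the testcase above, which expects exactly two `+` operations in the optimized dump.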
Re: [patch v2, aarch64] additional bics patterns
On 19/11/14 18:22, Sandra Loosemore wrote:
On 11/13/2014 10:47 AM, Andrew Pinski wrote:
On Thu, Nov 13, 2014 at 9:42 AM, Sandra Loosemore san...@codesourcery.com wrote:
On 11/13/2014 10:27 AM, Richard Earnshaw wrote:
On 13/11/14 17:05, Ramana Radhakrishnan wrote:
On Thu, Nov 13, 2014 at 4:55 PM, Sandra Loosemore san...@codesourcery.com wrote:

This patch to the AArch64 back end adds a couple of additional bics patterns to match code of the form

  if ((x & y) == x) ...;

This is testing whether the bits set in x are a subset of the bits set in y; or, that no bits in x are set that are not set in y. So, it is equivalent to

  if ((x & ~y) == 0) ...;

Presently this generates code like

  and  x21, x21, x20
  cmp  x21, x20
  b.eq c0 <main+0xc0>

and this patch allows it to be written more concisely as:

  bics x21, x20, x21
  b.eq c0 <main+0xc0>

Since the bics instruction sets the condition codes itself, no explicit comparison is required and the result of the bics computation can be discarded.

Regression-tested on aarch64-linux-gnu. OK to commit?

Is this not a duplicate of https://gcc.gnu.org/ml/gcc-patches/2014-11/msg00943.html ?

I don't think so. However, I think it is something that should be caught in generic simplification code, i.e. map ((a & b) == b) to ((~a & b) == 0), etc. Bit-clear operations are not that uncommon. Furthermore, A may be a constant.

Alex posted his patch when I already had Chris's in my regression test queue, but I've just confirmed that it does not fix the test case I included. I already thought a little about making this a generic simplification, but it seemed to me like it was only useful on targets that have a bit-clear instruction that happens to set condition codes, and that it would pessimize code on targets that don't have a bit-clear instruction at all (by inserting the extra complement operation). So to me it seemed reasonable to do it in the back end.

But can't you do this in simplify-rtx.c and allow for the cost model to do the correct thing?

OK, here is a revised patch to apply the identity there. This version depends on Alex's aarch64 BICS patch for the included test case to pass, though. In addition to the aarch64 testing, I bootstrapped and regression-tested the target-independent part of the patch on x86_64-linux-gnu. Is this OK? Should I hold off on committing it until Alex's patch is in?

-Sandra

2014-11-19 Sandra Loosemore san...@codesourcery.com

        gcc/
        * simplify-rtx.c (simplify_relational_operation_1): Handle
        simplification identities for BICS patterns.

        gcc/testsuite/
        * gcc.target/aarch64/bics_4.c: New.

Looks sensible to me. Eric, are you happy?

R.

bics2.patch

Index: gcc/simplify-rtx.c
===================================================================
--- gcc/simplify-rtx.c (revision 217322)
+++ gcc/simplify-rtx.c (working copy)
@@ -4551,6 +4551,32 @@ simplify_relational_operation_1 (enum rt
                                     simplify_gen_binary (XOR, cmp_mode,
                                                          XEXP (op0, 1), op1));
 
+  /* (eq/ne (and x y) x) simplifies to (eq/ne (and (not y) x) 0), which
+     can be implemented with a BICS instruction on some targets, or
+     constant-folded if y is a constant.  */
+  if ((code == EQ || code == NE)
+      && op0code == AND
+      && rtx_equal_p (XEXP (op0, 0), op1)
+      && !side_effects_p (op1))
+    {
+      rtx not_y = simplify_gen_unary (NOT, cmp_mode, XEXP (op0, 1), cmp_mode);
+      rtx lhs = simplify_gen_binary (AND, cmp_mode, not_y, XEXP (op0, 0));
+
+      return simplify_gen_relational (code, mode, cmp_mode, lhs, const0_rtx);
+    }
+
+  /* Likewise for (eq/ne (and x y) y).  */
+  if ((code == EQ || code == NE)
+      && op0code == AND
+      && rtx_equal_p (XEXP (op0, 1), op1)
+      && !side_effects_p (op1))
+    {
+      rtx not_x = simplify_gen_unary (NOT, cmp_mode, XEXP (op0, 0), cmp_mode);
+      rtx lhs = simplify_gen_binary (AND, cmp_mode, not_x, XEXP (op0, 1));
+
+      return simplify_gen_relational (code, mode, cmp_mode, lhs, const0_rtx);
+    }
+
   /* (eq/ne (bswap x) C1) simplifies to (eq/ne x C2) with C2 swapped.  */
   if ((code == EQ || code == NE)
       && GET_CODE (op0) == BSWAP
Index: gcc/testsuite/gcc.target/aarch64/bics_4.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/bics_4.c (revision 0)
+++ gcc/testsuite/gcc.target/aarch64/bics_4.c (revision 0)
@@ -0,0 +1,87 @@
+/* { dg-do run } */
+/* { dg-options "-O2 --save-temps -fno-inline" } */
+
+extern void abort (void);
+
+int
+bics_si_test1 (int a, int b, int c)
+{
+  if ((a & b) == a)
+    return a;
+  else
+    return c;
+}
+
+int
+bics_si_test2 (int a, int b, int c)
+{
+  if ((a & b) == b)
+    return b;
+  else
+    return c;
+}
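
The identity this thread relies on, that "x is a subset of y" can be tested either with an AND plus compare or with a bit-clear against zero, is easy to check by brute force. The sketch below is not part of the patch; `subset_identity_holds` is a hypothetical helper that walks all pairs of 8-bit values.

```c
#include <stdbool.h>

/* Check, for every pair of 8-bit values, that (x & y) == x and
   (x & ~y) == 0 agree.  Returns false on the first mismatch.  */
static bool
subset_identity_holds (void)
{
  for (unsigned x = 0; x < 256; x++)
    for (unsigned y = 0; y < 256; y++)
      {
        bool and_form = (x & y) == x;   /* and + cmp */
        bool bic_form = (x & ~y) == 0;  /* what bics tests directly */
        if (and_form != bic_form)
          return false;
      }
  return true;
}
```

The second form needs no separate comparison on a target whose bit-clear instruction sets the condition codes, which is the saving discussed above.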
[PATCH, ifcvt] Fix PR63917
Hi,

r217646 enhances ifcvt to handle the cbranchcc4 instruction. But ifcvt does not strictly check dependences before moving instructions in front of the IF, so some instructions that clobber CC end up inserted before the cbranchcc4 instruction.

For the case in the patch, ifcvt transforms the code from

  5: r87:SI=r117:SI
  22: pc={(flags:CCGOC>=0)?L26:pc}
  25: {r87:SI=-r117:SI;clobber flags:CC;}

to

  5: r87:SI=r117:SI
  136: {r145:SI=-r117:SI;clobber flags:CC;} // CC is clobbered
  137: r87:SI={(flags:CCGOC<0)?r145:SI:r117:SI}

The patch skips moving insns that clobber CC in front of cbranchcc4.

Bootstrapped with no make check regressions on x86-64 and i686. All the failing cases in PR63917 now pass.

OK for trunk?

Thanks!
-Zhenqiang

ChangeLog:
2014-11-20 Zhenqiang Chen zhenqiang.c...@arm.com

        PR rtl-optimization/63917
        * ifcvt.c (clobber_cc_p, use_cc_p): New functions.
        (noce_process_if_block, check_cond_move_block): Check CC references.

testsuite/ChangeLog:
2014-11-20 Zhenqiang Chen zhenqiang.c...@arm.com

        * gcc.target/i386/floatsitf.c: New test.

diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index 21f08c2..760eeb6 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -2528,6 +2528,34 @@ noce_can_store_speculate_p (basic_block top_bb, const_rtx mem)
   return false;
 }
 
+/* Check X clobber CC reg or not.  */
+
+static bool
+clobber_cc_p (rtx x)
+{
+  RTX_CODE code = GET_CODE (x);
+  int i;
+
+  if (code == CLOBBER
+      && REG_P (XEXP (x, 0))
+      && (GET_MODE_CLASS (GET_MODE (XEXP (x, 0))) == MODE_CC))
+    return TRUE;
+  else if (code == PARALLEL)
+    for (i = 0; i < XVECLEN (x, 0); i++)
+      if (clobber_cc_p (XVECEXP (x, 0, i)))
+        return TRUE;
+  return FALSE;
+}
+
+/* Check CC reg is used in COND or not.  */
+
+static bool
+use_cc_p (rtx cond)
+{
+  return (HAVE_cbranchcc4)
+    && (GET_MODE_CLASS (GET_MODE (XEXP (cond, 0))) == MODE_CC);
+}
+
 /* Given a simple IF-THEN-JOIN or IF-THEN-ELSE-JOIN block, attempt to convert
    it without using conditional execution.  Return TRUE if we were successful
    at converting the block.
*/ @@ -2655,6 +2683,12 @@ noce_process_if_block (struct noce_if_info *if_info) if_info-a = a; if_info-b = b; + /* Skip it if the instruction to be moved might clobber CC. */ + if (use_cc_p (if_info-cond) + (clobber_cc_p (PATTERN (insn_a)) + || (insn_b clobber_cc_p (PATTERN (insn_b) +return FALSE; + /* Try optimizations in some approximation of a useful order. */ /* ??? Should first look to see if X is live incoming at all. If it isn't, we don't need anything but an unconditional set. */ @@ -2868,6 +2902,10 @@ check_cond_move_block (basic_block bb, modified_between_p (src, insn, NEXT_INSN (BB_END (bb return FALSE; + /* Skip it if the instruction to be moved might clobber CC. */ + if (use_cc_p (cond) clobber_cc_p (PATTERN (insn))) + return FALSE; + vals-put (dest, src); regs-safe_push (dest); diff --git a/gcc/testsuite/gcc.target/i386/floatsitf.c b/gcc/testsuite/gcc.target/i386/floatsitf.c new file mode 100644 index 000..6b249cc --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/floatsitf.c @@ -0,0 +1,48 @@ +/* { dg-do compile { target { { i?86-*-* x86_64-*-* } ilp32 } } } */ +/* { dg-options -O2 -fdump-rtl-ce2 } */ + +typedef int __gcc_CMPtype __attribute__ ((mode (__libgcc_cmp_return__))); +void __sfp_handle_exceptions (int); + +typedef int QItype __attribute__ ((mode (QI))); +typedef int SItype __attribute__ ((mode (SI))); +typedef int DItype __attribute__ ((mode (DI))); +typedef unsigned int UQItype __attribute__ ((mode (QI))); +typedef unsigned int USItype __attribute__ ((mode (SI))); +typedef unsigned int UDItype __attribute__ ((mode (DI))); + +typedef unsigned int UHWtype __attribute__ ((mode (HI))); +extern const UQItype __clz_tab[256] ; + +extern void abort (void); +typedef float TFtype __attribute__ ((mode (TF))); + +union _FP_UNION_Q +{ + TFtype flt; + struct + { +unsigned long frac0 : 32; +unsigned long frac1 : 32; +unsigned long frac2 : 32; +unsigned long frac3 : 113 - (((unsigned int) 1 (113 -1) % 32) != 0)-(32 * 3); +unsigned exp : 15; +unsigned sign : 
1; + + } bits __attribute__ ((packed)); +}; + +TFtype +__floatsitf (SItype i) +{ + int A_c __attribute__ ((unused)); int A_s __attribute__ ((unused)); int A_e __attribute__ ((unused)); unsigned int A_f[4]; + TFtype a; + + do { if ((i)) { USItype _FP_FROM_INT_ur; if ((A_s = (((i)) 0))) ((i)) = -(USItype) ((i)); _FP_FROM_INT_ur = (USItype) ((i)); (void) (8 * (int) sizeof (SItype = 32) ? ({ int _FP_FROM_INT_lz; do { if (sizeof (unsigned int) == sizeof (unsigned int)) (_FP_FROM_INT_lz) = __builtin_clz ((unsigned int) _FP_FROM_INT_ur); else if (sizeof (unsigned int) == sizeof (unsigned long)) (_FP_FROM_INT_lz) = __builtin_clzl ((unsigned int) _FP_FROM_INT_ur); else if (sizeof (unsigned int) == sizeof (unsigned long long)) (_FP_FROM_INT_lz) =
[Ada] Missing interface conversion in access type
The compiler silently skips the generation of code to perform the conversion
of an access type whose designated type is a class-wide interface type, thus
causing unexpected problems at run time in dispatching calls to the target
object.

After this patch the following test compiles and executes without errors:

package Lists is
   type List is interface;
   function Element (Self : access List) return Natural is abstract;
end Lists;

limited with Lists;
package Types is
   type List_Access is access all Lists.List'Class;
end Types;

with Types;
with Lists;
with Ada.Finalization;
package My_Lists is
   type My_List is new Ada.Finalization.Controlled and Lists.List
     with null record;
   type My_List_Access is access all My_List'Class;

   overriding function Element (Self : access My_List) return Natural is (2);
end My_Lists;

with My_Lists;
with Types;
procedure Test is
   X : My_Lists.My_List_Access := new My_Lists.My_List;
   Y : Types.List_Access := Types.List_Access (X);  --  Test
begin
   if Y.Element /= 2 then
      raise Program_Error;
   end if;
end Test;

Command: gnatmake main.adb; ./main
No output

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-11-20  Javier Miranda  mira...@adacore.com

    * exp_ch4.adb (Expand_N_Type_Conversion): Add missing implicit
    conversion to force the displacement of the pointer to the object
    to reference the secondary dispatch table.

Index: exp_ch4.adb
===
--- exp_ch4.adb (revision 217828)
+++ exp_ch4.adb (working copy)
@@ -10622,7 +10622,9 @@
          --  Ada 2005 (AI-251): Handle interface type conversion
-         if Is_Interface (Actual_Op_Typ) then
+         if Is_Interface (Actual_Op_Typ)
+           or else Is_Interface (Actual_Targ_Typ)
+         then
             Expand_Interface_Conversion (N);
             goto Done;
          end if;
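For readers less familiar with secondary dispatch tables, here is a rough C
analogy of why the conversion to an interface view cannot simply be skipped.
It is only a sketch: the struct layout and every name in it are hypothetical
and do not reflect GNAT's actual object layout or run-time.

```c
#include <stddef.h>

/* Hypothetical dispatch table for the interface view.  */
struct list_iface_vtable
{
  int (*element) (void *self);
};

/* Hypothetical object layout: a primary dispatch table pointer at the
   start and a secondary one, for the interface, further in.  */
struct my_list
{
  const void *primary_tag;
  int payload;
  const struct list_iface_vtable *iface_tag;
};

/* Converting to the interface view displaces the pointer so that it
   designates the secondary table slot, not the start of the object.  */
static const struct list_iface_vtable **
to_iface_view (struct my_list *obj)
{
  return (const struct list_iface_vtable **)
    ((char *) obj + offsetof (struct my_list, iface_tag));
}

static int
my_list_element (void *self)
{
  (void) self;
  return 2;
}

static const struct list_iface_vtable my_list_iface = { my_list_element };

/* A dispatching call through the interface view reads the table from
   wherever the view points, so an undisplaced pointer would pick up the
   wrong table.  */
static int
call_element (const struct list_iface_vtable **view)
{
  return (*view)->element ((void *) view);
}
```

If the displacement in to_iface_view were omitted, call_element would read
primary_tag as if it were the interface table, which is the kind of run-time
failure described above.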
[Ada] Lift limitation of inter-unit inlining with generic packages
This change lifts the arbitrary limitation on the number of iterations that
can be executed between loading of the inlined bodies and instantiation of
the generic bodies of external units when inter-unit inlining is activated.

It was previously limited to 1, but this may not be sufficient in some cases,
which can result in pragma Inline_Always not being honored.

The following code must compile quietly with -O -gnatn:

with Q; use Q;
package P is
   function F (Cal : Calendar) return Boolean;
end P;

package body P is
   function F (Cal : Calendar) return Boolean is
   begin
      return Pred (Cal);
   end;
end P;

with R; use R;
package Q is
   type Calendar is new Object_Ref;
   type Root_Calendar is new Root_Object with record
      B : Boolean;
   end record;
   type Root_Calendar_Ptr is access all Root_Calendar'Class;
   function Pred (Cal : Calendar) return Boolean;
   pragma Inline (Pred);
end Q;

package body Q is
   function Get_Calendar is new Get_Object (Root_Calendar, Root_Calendar_Ptr);
   pragma Inline (Get_Calendar);

   function Pred (Cal : Calendar) return Boolean is
      Cal_Object : constant Root_Calendar_Ptr :=
        Get_Calendar (Object_Ref (Cal));
   begin
      return Cal_Object.B;
   end;
end Q;

with Ada.Finalization;
package R is
   type Root_Object is new Ada.Finalization.Controlled with record
      Reference_Count : Natural;
   end record;
   type Object_Ref is private;
   type Root_Object_Ptr is access all Root_Object'Class;

   generic
      type Object (<>) is abstract new Root_Object with private;
      type Object_Ptr is access all Object'Class;
   function Get_Object (Ref : in Object_Ref) return Object_Ptr;

private
   type Object_Ref is new Ada.Finalization.Controlled with record
      Ptr : Root_Object_Ptr;
   end record;
end R;

package body R is
   function Get_Object (Ref : in Object_Ref) return Object_Ptr is
   begin
      return Object_Ptr (Ref.Ptr);
   end Get_Object;
end R;

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-11-20  Eric Botcazou  ebotca...@adacore.com

    * inline.adb (Analyze_Inlined_Bodies): Iterate between loading
    of the inlined bodies and instantiation of the generic bodies
    until no more bodies need to be loaded.

Index: inline.adb
===
--- inline.adb (revision 217828)
+++ inline.adb (working copy)
@@ -774,16 +774,21 @@
                end if;
                J := J + 1;
-            end loop;
-            --  The analysis of required bodies may have produced additional
-            --  generic instantiations. To obtain further inlining, we perform
-            --  another round of generic body instantiations. Establishing a
-            --  fully recursive loop between inlining and generic instantiations
-            --  is unlikely to yield more than this one additional pass.
+               if J > Inlined_Bodies.Last then
-            Instantiate_Bodies;
+                  --  The analysis of required bodies may have produced additional
+                  --  generic instantiations. To obtain further inlining, we need
+                  --  to perform another round of generic body instantiations.
+                  Instantiate_Bodies;
+
+                  --  Symmetrically, the instantiation of required generic bodies
+                  --  may have caused additional bodies to be inlined. To obtain
+                  --  further inlining, we keep looping over the inlined bodies.
+               end if;
+            end loop;
+
             --  The list of inlined subprograms is an overestimate, because it
             --  includes inlined functions called from functions that are compiled
             --  as part of an inlined package, but are not themselves called. An
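The control flow of the new loop can be sketched in C as follows. This is a
toy model under stated assumptions, not the GNAT implementation: struct work,
load_body, instantiate_bodies and the "even bodies request an instance" rule
are all made up for illustration. The point is only that instantiation runs
whenever the index passes the end of the table, and the loop condition then
picks up any bodies that instantiation has appended.

```c
#define MAX_BODIES 16

/* Hypothetical stand-ins for the inlined-bodies table and the pending
   generic instantiations.  */
struct work
{
  int bodies_last;        /* index of the last body queued for loading */
  int pending_instances;  /* generic bodies still to instantiate */
  int loaded;             /* bodies processed so far */
  int rounds;             /* instantiation rounds performed */
};

/* Loading a body may request an instantiation (toy rule: bodies with an
   even index do).  */
static void
load_body (struct work *w, int j)
{
  w->loaded++;
  if (j % 2 == 0)
    w->pending_instances++;
}

/* Instantiating may in turn queue more bodies (toy rule: one new body
   per instance, capped at MAX_BODIES so the model terminates).  */
static void
instantiate_bodies (struct work *w)
{
  w->rounds++;
  while (w->pending_instances > 0)
    {
      w->pending_instances--;
      if (w->bodies_last < MAX_BODIES)
        w->bodies_last++;
    }
}

static void
analyze_inlined_bodies (struct work *w)
{
  int j = 1;

  while (j <= w->bodies_last)
    {
      load_body (w, j);
      j++;

      /* At the end of the current table, instantiate.  If that queued
         more bodies, the loop condition keeps the iteration going.  */
      if (j > w->bodies_last)
        instantiate_bodies (w);
    }
}
```

With the previous scheme (one extra Instantiate_Bodies call after the loop),
the body queued by the first instantiation round in this model would never be
loaded.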
Re: [PATCH 2/2, AArch64, v2] Pipeline model for APM XGene-1.
Kyrill,

> I don't mind it being in config/arm if you plan to wire it up later,
> good to know. Another comment inline….

I’ll clean up the missing xgene1_ and the mistyped xgene_ prefix and resubmit.

> +(define_insn_reservation "div" 2
> +  (and (eq_attr "tune" "xgene1")
> +       (eq_attr "type" "sdiv,udiv"))
> +  "xgene1_decode1op,xgene_divide")
>
> The dangerous part was the reservation duration (the xgene_divide*large
> number). The latency number (2 in this version, 66 in the previous) is not
> harmful to the automaton size and can be as high as needed (if this
> operation is high latency).

It doesn’t really matter for any workload we’ve encountered, as the hardware
is better at dealing with ‘div’ latencies than the scheduler (especially as
‘div’ is variable latency and any guess we have will be wrong… we’ll likely
add a scheduling hook function in the future). The more important thing is to
keep the cost of divides high enough in the cost model.

In other words: 66 would be the worst case and will normally not be correct
anyway. Furthermore, it’s rather implausible that we would find 264
instructions (for this worst-case scenario) to fill the scheduling bubble
between the div insn and the use of its result.

Best,
Philipp.
Re: [PATCH 2/2, AArch64, v2] Pipeline model for APM XGene-1.
Hi Philipp,

On 20/11/14 10:47, Dr. Philipp Tomsich wrote:
> Kyrill,
>
>> I don't mind it being in config/arm if you plan to wire it up later,
>> good to know. Another comment inline….
>
> I’ll clean up the missing xgene1_ and the mistyped xgene_ prefix and
> resubmit.
>
>> +(define_insn_reservation "div" 2
>> +  (and (eq_attr "tune" "xgene1")
>> +       (eq_attr "type" "sdiv,udiv"))
>> +  "xgene1_decode1op,xgene_divide")
>>
>> The dangerous part was the reservation duration (the xgene_divide*large
>> number). The latency number (2 in this version, 66 in the previous) is not
>> harmful to the automaton size and can be as high as needed (if this
>> operation is high latency).
>
> It doesn’t really matter for any workload we’ve encountered, as the hardware
> is better at dealing with ‘div’ latencies than the scheduler (especially as
> ‘div’ is variable latency and any guess we have will be wrong… we’ll likely
> add a scheduling hook function in the future). The more important thing is
> to keep the cost of divides high enough in the cost model.
>
> In other words: 66 would be the worst case and will normally not be correct
> anyway. Furthermore, it’s rather implausible that we would find 264
> instructions (for this worst-case scenario) to fill the scheduling bubble
> between the div insn and the use of its result.

Ok, makes sense. I just thought that 2 is a bit too low, but if your
benchmarking showed it to be reasonable I won't complain ;)

Kyrill

> Best,
> Philipp.
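As a back-of-the-envelope check of the 264 figure quoted above: it is
consistent with the 66-cycle worst case multiplied by a 4-wide issue. The
width of 4 is an assumption on my part (it matches the four
xgene1_decode_out units in the posted pipeline model, but the thread does not
spell the multiplication out). A minimal sketch in C:

```c
/* Assumed figures: 66 is the worst-case divide latency mentioned in the
   thread; an issue width of 4 is an assumption, not stated there.  */
enum { DIV_WORST_LATENCY = 66, ASSUMED_ISSUE_WIDTH = 4 };

/* Number of independent instructions the scheduler would have to find
   to hide a latency of LATENCY_CYCLES on a machine issuing ISSUE_WIDTH
   instructions per cycle.  */
static int
slots_to_fill (int latency_cycles, int issue_width)
{
  return latency_cycles * issue_width;
}
```

Under those assumptions slots_to_fill (DIV_WORST_LATENCY, ASSUMED_ISSUE_WIDTH)
gives 264, which is why modelling the full worst-case latency buys little in
practice.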
[Ada] Fix costly call to Following_Address_Clause
This change makes it so that Following_Address_Clause is invoked from
Analyze_Object_Declaration only if this is really necessary. This saves
about 1% of the compilation time at low optimization levels.

No functional changes.

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-11-20  Eric Botcazou  ebotca...@adacore.com

    * sem_ch3.adb (Analyze_Object_Declaration): Swap a couple of
    tests in a condition so Following_Address_Clause is invoked
    only if need be.
    * exp_util.ads (Following_Address_Clause): Add small note.

Index: sem_ch3.adb
===
--- sem_ch3.adb (revision 217828)
+++ sem_ch3.adb (working copy)
@@ -3648,8 +3648,13 @@
          if Comes_From_Source (N)
            and then Expander_Active
+           and then Nkind (E) = N_Aggregate
+
+           --  Note the importance of doing this the following test after the
+           --  N_Aggregate test to avoid inefficiencies from too many calls to
+           --  the function Following_Address_Clause which can be expensive.
+
            and then Present (Following_Address_Clause (N))
-           and then Nkind (E) = N_Aggregate
          then
             Set_Etype (E, T);
Index: exp_util.ads
===
--- exp_util.ads (revision 217828)
+++ exp_util.ads (working copy)
@@ -507,6 +507,10 @@
    --  current declarative part to look for an address clause for the object
    --  being declared, and returns the clause if one is found, returns
    --  Empty otherwise.
+   --
+   --  Note: this function can be costly and must be invoked with special care.
+   --  Possibly we could introduce a flag at parse time indicating the presence
+   --  of an address clause to speed this up???
    procedure Force_Evaluation
      (Exp : Node_Id;
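The effect of swapping the two tests relies on short-circuit evaluation
("and then" in Ada, "&&" in C). A minimal C sketch of the idea follows; the
function names and the toy predicates are hypothetical stand-ins, not the
GNAT routines.

```c
#include <stdbool.h>

static int expensive_calls;  /* counts invocations of the costly test */

/* Stand-in for a costly lookup, such as scanning the following
   declarations for an address clause.  */
static bool
has_following_address_clause (int decl)
{
  expensive_calls++;
  return decl % 2 == 0;
}

/* Stand-in for a cheap test: a plain comparison on the node kind.  */
static bool
is_aggregate (int kind)
{
  return kind == 1;
}

static bool
needs_etype_set (int decl, int kind)
{
  /* Cheap test first: && short-circuits, so the costly scan only runs
     when the expression really is an aggregate.  */
  return is_aggregate (kind) && has_following_address_clause (decl);
}
```

The result of the condition is unchanged by the reordering; only the number
of calls to the expensive predicate drops.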
[Ada] Handling of function calls to predefined operators in ASIS
An operator that is called in functional notation is rewritten as an operator
so that its operands can be properly resolved. ASIS needs the semantic info to
be available on the original node, so in ASIS mode the resolved operands are
linked back to the original call. This patch takes into account that the call
may have had named associations, using the standard operator arguments Left
and Right.

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-11-20  Ed Schonberg  schonb...@adacore.com

    * sem_res.adb (Make_Call_Into_Operator): In ASIS mode, propagate
    back the resolved operands to the original call node, taking
    into account that the original call may have named associations.

Index: sem_res.adb
===
--- sem_res.adb (revision 217828)
+++ sem_res.adb (working copy)
@@ -1793,16 +1793,62 @@
         and then Nkind (N) in N_Op
         and then Nkind (Original_Node (N)) = N_Function_Call
       then
-         if Is_Binary then
-            Rewrite (First (Parameter_Associations (Original_Node (N))),
-                     Relocate_Node (Left_Opnd (N)));
-            Rewrite (Next (First (Parameter_Associations (Original_Node (N)))),
-                     Relocate_Node (Right_Opnd (N)));
-         else
-            Rewrite (First (Parameter_Associations (Original_Node (N))),
-                     Relocate_Node (Right_Opnd (N)));
-         end if;
+         declare
+            L : constant Node_Id := Left_Opnd (N);
+            R : constant Node_Id := Right_Opnd (N);
+            Old_First : constant Node_Id :=
+              First (Parameter_Associations (Original_Node (N)));
+            Old_Sec : Node_Id;
+
+         begin
+            if Is_Binary then
+               Old_Sec := Next (Old_First);
+
+               --  If the original call has named associations, replace the
+               --  explicit actual parameter in the association with the proper
+               --  resolved operand.
+
+               if Nkind (Old_First) = N_Parameter_Association then
+                  if Chars (Selector_Name (Old_First)) =
+                     Chars (First_Entity (Op_Id))
+                  then
+                     Rewrite (Explicit_Actual_Parameter (Old_First),
+                       Relocate_Node (L));
+                  else
+                     Rewrite (Explicit_Actual_Parameter (Old_First),
+                       Relocate_Node (R));
+                  end if;
+
+               else
+                  Rewrite (Old_First, Relocate_Node (L));
+               end if;
+
+               if Nkind (Old_Sec) = N_Parameter_Association then
+                  if Chars (Selector_Name (Old_Sec)) =
+                     Chars (First_Entity (Op_Id))
+                  then
+                     Rewrite (Explicit_Actual_Parameter (Old_Sec),
+                       Relocate_Node (L));
+                  else
+                     Rewrite (Explicit_Actual_Parameter (Old_Sec),
+                       Relocate_Node (R));
+                  end if;
+
+               else
+                  Rewrite (Old_Sec, Relocate_Node (R));
+               end if;
+
+            else
+               if Nkind (Old_First) = N_Parameter_Association then
+                  Rewrite (Explicit_Actual_Parameter (Old_First),
+                    Relocate_Node (R));
+               else
+                  Rewrite (Old_First, Relocate_Node (R));
+               end if;
+            end if;
+         end;
+
          Set_Parent (Original_Node (N), Parent (N));
       end if;
    end Make_Call_Into_Operator;
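The mapping that the binary case of the patch applies to each actual of the
original call can be summarized with a small C sketch. It is illustrative
only: operand_for and its parameters are hypothetical, and the real code
rewrites tree nodes rather than returning a letter.

```c
#include <stddef.h>
#include <string.h>

/* For one actual of a binary operator call written in functional
   notation, say which resolved operand ('L' or 'R') it should receive.
   SELECTOR is the formal name used in a named association, or NULL for
   a positional actual.  POSITION is 0 for the first actual, 1 for the
   second.  FIRST_FORMAL is the name of the operator's first formal
   (Left for the predefined operators).  */
static char
operand_for (const char *selector, int position, const char *first_formal)
{
  /* Named association: the selector decides, whatever the position.  */
  if (selector != NULL)
    return strcmp (selector, first_formal) == 0 ? 'L' : 'R';

  /* Positional actual: the position decides.  */
  return position == 0 ? 'L' : 'R';
}
```

For a call such as "+" (Right => A, Left => B), the first actual therefore
receives the resolved right operand and the second the resolved left operand,
which is what the positional-only code before the patch got wrong.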
[Ada] Improper assignment on indexing operation with implicit dereference
If the left-hand side of an assignment is an Ada 2012 generalized indexing
with an implicit dereference, the compiler must verify that the type of the
access discriminant that provides the implicit dereference is not an
access-to-constant type.

Compiling ada_test.adb must yield:

ada_test.adb:24:25: left hand side of assignment must be a variable
ada_test.adb:25:04: left hand side of assignment must be a variable

---
with Ada.Text_IO; use Ada.Text_IO;
with Ada.Integer_Text_IO; use Ada.Integer_Text_IO;
procedure Ada_Test is
   type Obj is record
      A : aliased Integer;
   end record;
   type Obj_Access is access all Obj;
   type Accessor (Data : access constant Integer) is null record
     with Implicit_Dereference => Data;
   function Get_Int (This : Obj_Access) return Accessor is
   begin
      return Accessor'(Data => This.A'Access);
   end Get_Int;
   X : aliased Obj := (A => 11);
   X_Ptr : Obj_Access := X'Access;
begin
   Get_Int (X_Ptr).Data.all := 33;  --  Error
   Get_Int (X_Ptr) := 33;           --  Error
   Put (X.A);                       --  Should never execute..
   New_Line;
end Ada_Test;

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-11-20  Ed Schonberg  schonb...@adacore.com

    * sem_util.adb (Is_Variable): For an Ada 2012 implicit
    dereference introduced for an indexing operation, check that the
    type of the corresponding access discriminant is not an access
    to constant.

Index: sem_util.adb
===
--- sem_util.adb (revision 217829)
+++ sem_util.adb (working copy)
@@ -12806,12 +12806,14 @@
               Is_Variable_Prefix (Original_Node (Prefix (N)));
          --  in Ada 2012, the dereference may have been added for a type with
-         --  a declared implicit dereference aspect.
+         --  a declared implicit dereference aspect. Check that it is not an
+         --  access to constant.
          elsif Nkind (N) = N_Explicit_Dereference
            and then Present (Etype (Orig_Node))
            and then Ada_Version >= Ada_2012
            and then Has_Implicit_Dereference (Etype (Orig_Node))
+           and then not Is_Access_Constant (Etype (Prefix (N)))
          then
             return True;
[Ada] Rework win32_wait to behave more like the UNIX waitpid()
The following changes are important:

- It is possible to have multiple tasks waiting for a child process to
  terminate.

- When a child terminates, a single wait call will receive the
  corresponding process id.

- A call to wait will handle new incoming child processes.

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-11-20  Pascal Obry  o...@adacore.com

    * initialize.c (ProcListCS): New extern variable (critical section).
    (ProcListEvt): New extern variable (handle).
    (__gnat_initialize)[Win32]: Initialize the ProcListCS critical
    section object and the ProcListEvt event.
    * final.c (__gnat_finalize)[Win32]: Properly finalize the
    ProcListCS critical section and the ProcListEvt event.
    * adaint.c (ProcListEvt): New Win32 event handle.
    (EnterCS): New routine to enter the critical section when dealing with
    child processes chain list.
    (LeaveCS): As above to exit from the critical section.
    (SignalListChanged): Routine to signal that the chain process list has
    been updated.
    (add_handle): Use EnterCS/LeaveCS, also call SignalListChanged when the
    handle has been added.
    (__gnat_win32_remove_handle): Use EnterCS/LeaveCS, also call
    SignalListChanged if the handle has been found and removed.
    (remove_handle): Routine removed, implementation merged with the above.
    (win32_wait): Use EnterCS/LeaveCS for the critical section. Properly
    copy the PID list locally to ensure that even if the list is updated
    the local copy remains valid. Add into the hl (handle list) the
    ProcListEvt handle. This handle is used to signal that a change has
    been made into the process chain list. This is to ensure that a waiting
    call can be resumed to take into account new processes. We also make
    sure that if the handle was not found into the list we start over
    the wait call. Indeed another concurrent call to win32_wait() could
    already have handled this process.
Index: final.c === --- final.c (revision 217828) +++ final.c (working copy) @@ -6,7 +6,7 @@ * * * C Implementation File * * * - * Copyright (C) 1992-2011, Free Software Foundation, Inc. * + * Copyright (C) 1992-2014, Free Software Foundation, Inc. * * * * GNAT is free software; you can redistribute it and/or modify it under * * terms of the GNU General Public License as published by the Free Soft- * @@ -40,11 +40,29 @@ at all, the intention is that this be replaced by system specific code where finalization is required. */ +#if defined (__MINGW32__) +#include mingw32.h +#include windows.h + +extern CRITICAL_SECTION ProcListCS; +extern HANDLE ProcListEvt; + void __gnat_finalize (void) { + /* delete critical section and event handle used for the + processes chain list */ + DeleteCriticalSection(ProcListCS); + CloseHandle (ProcListEvt); } +#else +void +__gnat_finalize (void) +{ +} +#endif + #ifdef __cplusplus } #endif Index: initialize.c === --- initialize.c(revision 217828) +++ initialize.c(working copy) @@ -74,6 +74,8 @@ extern int gnat_argc; extern char **gnat_argv; +extern CRITICAL_SECTION ProcListCS; +extern HANDLE ProcListEvt; #ifdef GNAT_UNICODE_SUPPORT @@ -138,6 +140,11 @@ given that we have set Max_Digits etc with this in mind */ __gnat_init_float (); + /* Initialize the critical section and event handle for the win32_wait() + implementation, see adaint.c */ + InitializeCriticalSection (ProcListCS); + ProcListEvt = CreateEvent (NULL, FALSE, FALSE, NULL); + #ifdef GNAT_UNICODE_SUPPORT /* Set current code page for filenames handling. */ { Index: adaint.c === --- adaint.c(revision 217836) +++ adaint.c(working copy) @@ -2311,21 +2311,30 @@ for locking and unlocking tasks since we do not support multiple threads on this configuration (Cert run time on native Windows). 
*/ -static void dummy (void) +static void EnterCS (void) {} +static void LeaveCS (void) {} +static void SignalListChanged (void) {} + +#else + +CRITICAL_SECTION ProcListCS; +HANDLE ProcListEvt; + +static void EnterCS (void) { + EnterCriticalSection(ProcListCS); } -void (*Lock_Task) () = dummy; -void (*Unlock_Task) () = dummy; +static void LeaveCS (void) +{ + LeaveCriticalSection(ProcListCS); +} -#else +static void SignalListChanged (void) +{ + SetEvent (ProcListEvt); +} -#define Lock_Task
[Ada] Attributes 'Old and 'Update must preserve the tag of their prefix
The patch modifies the expansion of attributes 'Old and 'Update to ensure that
the tag of a tagged prefix is not modified as a result of attribute
evaluation.

-- Source --

--  types.ads

package Types is
   type Root is tagged record
      X : Integer;
   end record;

   procedure Show (R : Root);

   type Ext is new Root with record
      Y : Integer;
   end record;

   overriding procedure Show (R : Ext);
end Types;

--  types.adb

with Ada.Text_IO; use Ada.Text_IO;

package body Types is
   procedure Show (R : Root) is
   begin
      Put_Line ("(root) X =" & R.X'Img);
   end Show;

   overriding procedure Show (R : Ext) is
   begin
      Put_Line ("(ext) X =" & R.X'Img);
      Put_Line ("(ext) Y =" & R.Y'Img);
   end Show;
end Types;

--  main.adb

with Ada.Text_IO; use Ada.Text_IO;
with Types;       use Types;

procedure Main is
   procedure Show_Me (R : Root) is
      Tmp : Root'Class := R;
   begin
      Show (Tmp);
   end Show_Me;

   procedure Wibble (R : Root) is
   begin
      Show_Me (R);
      Show_Me (R'Update (X => 5));
   end Wibble;

   A : Ext;
begin
   A.X := 0;
   A.Y := 1;
   Wibble (Root (A));
end Main;

-- Compilation and output --

$ gnatmake -q main.adb
$ ./main
(ext) X = 0
(ext) Y = 1
(ext) X = 5
(ext) Y = 1

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-11-20  Hristian Kirtchev  kirtc...@adacore.com

    * exp_attr.adb (Expand_N_Attribute_Reference,
    Expand_Update_Attribute): Preserve the tag of a prefix by
    offering a specific view of the class-wide version of the prefix.

Index: exp_attr.adb
===
--- exp_attr.adb (revision 217828)
+++ exp_attr.adb (working copy)
@@ -1021,6 +1021,9 @@
       Pref      : constant Node_Id   := Prefix (N);
       Typ       : constant Entity_Id := Etype (Pref);
       Blk       : Node_Id;
+      CW_Decl   : Node_Id;
+      CW_Temp   : Entity_Id;
+      CW_Typ    : Entity_Id;
       Decls     : List_Id;
       Installed : Boolean;
       Loc       : Source_Ptr;
@@ -1338,19 +1341,56 @@
       --  Step 3: Create a constant to capture the value of the prefix at the
       --  entry point into the loop.
- -- Generate: - --Temp : constant type of Pref := Pref; - Temp_Id := Make_Temporary (Loc, 'P'); - Temp_Decl := -Make_Object_Declaration (Loc, - Defining_Identifier = Temp_Id, - Constant_Present= True, - Object_Definition = New_Occurrence_Of (Typ, Loc), - Expression = Relocate_Node (Pref)); - Append_To (Decls, Temp_Decl); + -- Preserve the tag of the prefix by offering a specific view of the + -- class-wide version of the prefix. + if Is_Tagged_Type (Typ) then + + -- Generate: + --CW_Temp : constant Typ'Class := Typ'Class (Pref); + + CW_Temp := Make_Temporary (Loc, 'T'); + CW_Typ := Class_Wide_Type (Typ); + + CW_Decl := + Make_Object_Declaration (Loc, + Defining_Identifier = CW_Temp, + Constant_Present= True, + Object_Definition = New_Occurrence_Of (CW_Typ, Loc), + Expression = + Convert_To (CW_Typ, Relocate_Node (Pref))); + Append_To (Decls, CW_Decl); + + -- Generate: + --Temp : Typ renames Typ (CW_Temp); + + Temp_Decl := + Make_Object_Renaming_Declaration (Loc, + Defining_Identifier = Temp_Id, + Subtype_Mark= New_Occurrence_Of (Typ, Loc), + Name= + Convert_To (Typ, New_Occurrence_Of (CW_Temp, Loc))); + Append_To (Decls, Temp_Decl); + + -- Non-tagged case + + else + CW_Decl := Empty; + + -- Generate: + --Temp : constant Typ := Pref; + + Temp_Decl := + Make_Object_Declaration (Loc, + Defining_Identifier = Temp_Id, + Constant_Present= True, + Object_Definition = New_Occurrence_Of (Typ, Loc), + Expression = Relocate_Node (Pref)); + Append_To (Decls, Temp_Decl); + end if; + -- Step 4: Analyze all bits Installed := Current_Scope = Scope (Loop_Id); @@ -1374,6 +1414,10 @@ -- the declaration of the constant. 
else + if Present (CW_Decl) then +Analyze (CW_Decl); + end if; + Analyze (Temp_Decl); end if; @@ -4358,19 +4402,13 @@ - when Attribute_Old = Old : declare - Asn_Stm : Node_Id; + Typ : constant Entity_Id := Etype (N); + CW_Temp : Entity_Id; + CW_Typ : Entity_Id; Subp: Node_Id; Temp: Entity_Id; begin - Temp := Make_Temporary (Loc, 'T', Pref); - - -- Set the entity kind now in order to mark
Re: [PATCH 2/2, AArch64, v2] Pipeline model for APM XGene-1.
On Wed, Nov 19, 2014 at 9:42 PM, Philipp Tomsich philipp.toms...@theobroma-systems.com wrote: Here's an updated patch with Kyrill's and Andrew's comments integrated. I left the file in the config/arm-directory, as XGene-family is capable of executing ARMv7 and we will wire this into the 32bit backend in the near future (moving it now would just cause another move in the near future). Right, if this were making it into the arm backend and if the core indeed does have AArch32 support, I'd like to see support for the command line for xgene1 in the AArch32 backend as well for 5.0. Do have a look in arm-cores.def in gcc/config/arm - there are ways of using existing tuning options with the command line or putting this as part of generic. We've been here before and users typically complain about CPU option X being available in AArch32 state but not in AArch64 state. Since this is a separate tuning option, I'm less worried about this going in later in stage3 but realistically it would be good to have the command line options wired up for AArch32 by the end of the year. Ramana We also moved the 'include' up to where the pipeline models for the A53/A57/ThunderX are included, as the previous dependency on picking up the SIMD types from aarch64-simd.md no longer holds true since gcc-4.9. Cheers, -Philipp. --- gcc/ChangeLog | 6 + gcc/config/aarch64/aarch64.md | 3 +- gcc/config/arm/xgene1.md | 520 ++ 3 files changed, 528 insertions(+), 1 deletion(-) create mode 100644 gcc/config/arm/xgene1.md diff --git a/gcc/ChangeLog b/gcc/ChangeLog index c9ac0d9..dad2278 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,5 +1,11 @@ 2014-11-19 Philipp Tomsich philipp.toms...@theobroma-systems.com + * config/aarch64/aarch64.md: Include xgene1.md. + (generic_sched): Set to no for xgene1. + * config/arm/xgene1.md: New file. + +2014-11-19 Philipp Tomsich philipp.toms...@theobroma-systems.com + * config/aarch64/aarch64-cores.def (xgene1): Update/add the xgene1 (APM XGene-1) core definition. 
* gcc/config/aarch64/aarch64.c: Add cost tables for APM XGene-1 diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 597ff8c..1b36384 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -191,7 +191,7 @@ (define_attr generic_sched yes,no (const (if_then_else - (eq_attr tune cortexa53,cortexa15,thunderx) + (eq_attr tune cortexa53,cortexa15,thunderx,xgene1) (const_string no) (const_string yes @@ -199,6 +199,7 @@ (include ../arm/cortex-a53.md) (include ../arm/cortex-a15.md) (include thunderx.md) +(include ../arm/xgene1.md) ;; --- ;; Jumps and other miscellaneous insns diff --git a/gcc/config/arm/xgene1.md b/gcc/config/arm/xgene1.md new file mode 100644 index 000..227f2c7 --- /dev/null +++ b/gcc/config/arm/xgene1.md @@ -0,0 +1,520 @@ +;; Machine description for AppliedMicro xgene1 core. +;; Copyright (C) 2012-2014 Free Software Foundation, Inc. +;; Contributed by Theobroma Systems Design und Consulting GmbH. +;;See http://www.theobroma-systems.com for more info. +;; +;; This file is part of GCC. +;; +;; GCC is free software; you can redistribute it and/or modify it +;; under the terms of the GNU General Public License as published by +;; the Free Software Foundation; either version 3, or (at your option) +;; any later version. +;; +;; GCC is distributed in the hope that it will be useful, but +;; WITHOUT ANY WARRANTY; without even the implied warranty of +;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +;; General Public License for more details. +;; +;; You should have received a copy of the GNU General Public License +;; along with GCC; see the file COPYING3. If not see +;; http://www.gnu.org/licenses/. 
+ +;; Pipeline description for the xgene1 micro-architecture + +(define_automaton xgene1) + +(define_cpu_unit xgene1_decode_out0 xgene1) +(define_cpu_unit xgene1_decode_out1 xgene1) +(define_cpu_unit xgene1_decode_out2 xgene1) +(define_cpu_unit xgene1_decode_out3 xgene1) + +(define_cpu_unit xgene_divide xgene1) +(define_cpu_unit xgene_fp_divide xgene1) + +(define_reservation xgene1_decode1op +( xgene1_decode_out0 ) +|( xgene1_decode_out1 ) +|( xgene1_decode_out2 ) +|( xgene1_decode_out3 ) +) +(define_reservation xgene1_decode2op +( xgene1_decode_out0 + xgene1_decode_out1 ) +|( xgene1_decode_out0 + xgene1_decode_out2 ) +|( xgene1_decode_out0 + xgene1_decode_out3 ) +|( xgene1_decode_out1 + xgene1_decode_out2 ) +|( xgene1_decode_out1 + xgene1_decode_out3 ) +|( xgene1_decode_out2 + xgene1_decode_out3 ) +) +(define_reservation
[Ada] Interaction between 'Loop_Entry, 'Old, 'Update and Extensions_Visible
This patch implements the following SPARK rule (the part about 'Loop_Entry,
'Old, 'Update):

   If the Extensions_Visible aspect is False for a subprogram, then certain
   restrictions are imposed on the use of any parameter of the subprogram
   which is of a specific tagged type. Such a parameter shall not be converted
   to a class-wide type. Such a parameter shall not be passed as an actual
   parameter in a call to a subprogram whose Extensions_Visible aspect is
   True. These restrictions also apply to any parenthesized expression,
   qualified expression, or type conversion whose operand is subject to these
   restrictions, to any Old, Update, or Loop_Entry attribute_reference whose
   prefix is subject to these restrictions, and to any conditional expression
   having at least one dependent_expression which is subject to these
   restrictions.

-- Source --

--  test_loop_entry_old_update.adb

procedure Test_Loop_Entry_Old_Update is
   --  Test that Extensions_Visible restrictions are enforced for
   --  Old, Update, and Loop_Entry attribute references.

   pragma Assertion_Policy (Check);

   package Pkg is
      type T is abstract tagged record
         Int1, Int2, Int3 : Integer;
      end record;
      function Is_Bodacious (X : T) return Boolean is abstract;
   end Pkg;
   use Pkg;

   procedure P1 (X : in out T) with
     Post => Is_Bodacious (T'Class (X'Old)),  --  ERROR
     Extensions_Visible => False;
   procedure P1 (X : in out T) is
   begin
      null;
   end P1;

   procedure P2 (X : in out T) with Extensions_Visible => False;
   procedure P2 (X : in out T) is
   begin
      if Is_Bodacious (T'Class (X'Update (Int1 => 123))) then  --  ERROR
         X.Int1 := 123;
      end if;
   end P2;

   procedure P3 (X : in out T) with Extensions_Visible => False;
   procedure P3 (X : in out T) is
   begin
      for I in 1 .. 10 loop
         X.Int1 := X.Int1 + 1;
         pragma Assert ((X.Int1 /= X.Int2) or else
                        Is_Bodacious (T'Class (X'Loop_Entry)));  --  ERROR
      end loop;
   end P3;

   procedure P4 (X : in out T; Y : T'Class) with Extensions_Visible => False;
   procedure P4 (X : in out T; Y : T'Class) is
   begin
      if Is_Bodacious (T'Class (T'(if X.Int1 = X.Int2  --  ERROR
                                   then X'Update (Int1 => X.Int1 + 1)
                                   else T (Y))))
      then
         X.Int1 := 456;
      end if;
   end P4;
begin
   null;
end Test_Loop_Entry_Old_Update;

-- Compilation and output --

$ gcc -c test_loop_entry_old_update.adb
test_loop_entry_old_update.adb:15:38: formal parameter with
  Extensions_Visible False cannot be converted to class-wide type
test_loop_entry_old_update.adb:22:34: formal parameter with
  Extensions_Visible False cannot be converted to class-wide type
test_loop_entry_old_update.adb:33:44: formal parameter with
  Extensions_Visible False cannot be converted to class-wide type
test_loop_entry_old_update.adb:42:13: formal parameter with
  Extensions_Visible False cannot be converted to class-wide type

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-11-20  Hristian Kirtchev  kirtc...@adacore.com

    * sem_util.adb (Is_EVF_Expression): Include attributes
    'Loop_Entry, 'Old and 'Update to the logic.

Index: sem_util.adb
===
--- sem_util.adb (revision 217835)
+++ sem_util.adb (working copy)
@@ -10846,6 +10846,16 @@
                          N_Type_Conversion)
       then
          return Is_EVF_Expression (Expression (N));
+
+      --  Attributes 'Loop_Entry, 'Old and 'Update are an EVF expression when
+      --  their prefix denotes an EVF expression.
+
+      elsif Nkind (N) = N_Attribute_Reference
+        and then Nam_In (Attribute_Name (N), Name_Loop_Entry,
+                                             Name_Old,
+                                             Name_Update)
+      then
+         return Is_EVF_Expression (Prefix (N));
       end if;
       return False;
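The shape of the check extended by this patch is a recursive predicate over
the expression tree. The C sketch below models only the cases visible in the
hunk above plus a leaf case; the node kinds, the attribute enumeration and
struct node are hypothetical simplifications, not GNAT's tree representation.

```c
#include <stdbool.h>
#include <stddef.h>

enum node_kind
{
  FORMAL_EVF,      /* formal of a specific tagged type, EV => False */
  FORMAL_OTHER,    /* any other entity */
  PAREN_EXPR,
  TYPE_CONVERSION,
  ATTRIBUTE_REF
};

enum attr_name { ATTR_NONE, ATTR_LOOP_ENTRY, ATTR_OLD, ATTR_UPDATE, ATTR_IMG };

struct node
{
  enum node_kind kind;
  enum attr_name attr;          /* meaningful for ATTRIBUTE_REF only */
  const struct node *operand;   /* operand or prefix, NULL for leaves */
};

/* An expression is "EVF" when it is, or wraps, a formal that is subject
   to the Extensions_Visible => False restrictions.  */
static bool
is_evf_expression (const struct node *n)
{
  if (n == NULL)
    return false;

  switch (n->kind)
    {
    case FORMAL_EVF:
      return true;

    case PAREN_EXPR:
    case TYPE_CONVERSION:
      return is_evf_expression (n->operand);

    case ATTRIBUTE_REF:
      /* Only 'Loop_Entry, 'Old and 'Update propagate the property from
         their prefix.  */
      return (n->attr == ATTR_LOOP_ENTRY
              || n->attr == ATTR_OLD
              || n->attr == ATTR_UPDATE)
             && is_evf_expression (n->operand);

    default:
      return false;
    }
}
```

In this model T (X'Old) is still EVF, so a further conversion to T'Class
would be flagged, whereas an attribute outside the three listed ones stops
the propagation.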
[Ada] Add missing SPARK_Mode aspects/pragmas on formal containers
While the library of formal maps/sets correctly set SPARK_Mode on spec (On) and private part / body (Off), it was not the case for lists and vectors, thus causing some errors in GNATprove when instantiating such formal containers because bodies contain non-SPARK features (e.g. access types in formal vectors). Now fixed, which requires for formal lists and vectors that they are instantiated at library level, as other formal containers. Tested on x86_64-pc-linux-gnu, committed on trunk 2014-11-20 Yannick Moy m...@adacore.com * a-cfdlli.adb, a-cfdlli.ads, a-cfinve.adb, a-cfinve.ads, * a-cofove.adb, a-cofove.ads: Mark spec as SPARK_Mode, and private part/body as SPARK_Mode Off. * a-cfhama.adb, a-cfhama.ads, a-cfhase.adb, a-cfhase.ads, * a-cforma.adb, a-cforma.ads, a-cforse.adb, a-cforse.ads: Use aspect instead of pragma for uniformity. Index: a-cfdlli.adb === --- a-cfdlli.adb(revision 217828) +++ a-cfdlli.adb(working copy) @@ -6,7 +6,7 @@ -- -- -- B o d y -- -- -- --- Copyright (C) 2010-2013, Free Software Foundation, Inc. -- +-- Copyright (C) 2010-2014, Free Software Foundation, Inc. -- -- -- -- GNAT is free software; you can redistribute it and/or modify it under -- -- terms of the GNU General Public License as published by the Free Soft- -- @@ -27,7 +27,9 @@ with System; use type System.Address; -package body Ada.Containers.Formal_Doubly_Linked_Lists is +package body Ada.Containers.Formal_Doubly_Linked_Lists with + SPARK_Mode = Off +is --- -- Local Subprograms -- Index: a-cfdlli.ads === --- a-cfdlli.ads(revision 217828) +++ a-cfdlli.ads(working copy) @@ -61,9 +61,11 @@ with function = (Left, Right : Element_Type) return Boolean is ; -package Ada.Containers.Formal_Doubly_Linked_Lists is +package Ada.Containers.Formal_Doubly_Linked_Lists with + Pure, + SPARK_Mode +is pragma Annotate (GNATprove, External_Axiomatization); - pragma Pure; type List (Capacity : Count_Type) is private with Iterable = (First = First, @@ -337,6 +339,7 @@ -- scanned yet. 
private + pragma SPARK_Mode (Off); type Node_Type is record Prev: Count_Type'Base := -1; Index: a-cfhase.adb === --- a-cfhase.adb(revision 217828) +++ a-cfhase.adb(working copy) @@ -35,8 +35,9 @@ with System; use type System.Address; -package body Ada.Containers.Formal_Hashed_Sets is - pragma SPARK_Mode (Off); +package body Ada.Containers.Formal_Hashed_Sets with + SPARK_Mode = Off +is --- -- Local Subprograms -- Index: a-cfhase.ads === --- a-cfhase.ads(revision 217828) +++ a-cfhase.ads(working copy) @@ -67,10 +67,11 @@ with function = (Left, Right : Element_Type) return Boolean is ; -package Ada.Containers.Formal_Hashed_Sets is +package Ada.Containers.Formal_Hashed_Sets with + Pure, + SPARK_Mode +is pragma Annotate (GNATprove, External_Axiomatization); - pragma Pure; - pragma SPARK_Mode (On); type Set (Capacity : Count_Type; Modulus : Hash_Type) is private with Iterable = (First = First, @@ -335,9 +336,10 @@ -- scanned yet. private - pragma Inline (Next); pragma SPARK_Mode (Off); + pragma Inline (Next); + type Node_Type is record Element : Element_Type; Index: a-cfinve.adb === --- a-cfinve.adb(revision 217828) +++ a-cfinve.adb(working copy) @@ -26,7 +26,9 @@ -- http://www.gnu.org/licenses/. -- -- -package body Ada.Containers.Formal_Indefinite_Vectors is +package body Ada.Containers.Formal_Indefinite_Vectors with + SPARK_Mode = Off +is function H (New_Item : Element_Type) return Holder renames To_Holder; function E (Container : Holder) return Element_Type renames Get; Index: a-cfinve.ads === --- a-cfinve.ads(revision 217828) +++ a-cfinve.ads(working copy) @@ -52,7 +52,9 @@ -- size, and heap allocation will be avoided. If False, the containers can -- grow via heap allocation. -package Ada.Containers.Formal_Indefinite_Vectors is +package
[Ada] Generate VC in GNATprove instead of error for empty range check
Range checks on empty ranges typically correspond to deactivated code based on a given configuration (say, dead code inside a loop over the empty range). In GNATprove mode, instead of issuing an error message (which would stop analysis), enable the range check so that GNATprove will issue a message if it cannot prove that the check is unreachable.

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-11-20  Yannick Moy  m...@adacore.com

    * checks.adb (Apply_Scalar_Range_Check): In GNATprove mode, put
    a range check when an empty range is used, instead of an error
    message.
    * sinfo.ads Update comment on GNATprove mode.

Index: sinfo.ads
===
--- sinfo.ads (revision 217828)
+++ sinfo.ads (working copy)
@@ -581,6 +581,12 @@
    --       bounds are generated from an expression: Expand_Subtype_From_Expr
    --       should be noop.
 
+   --    5. Errors (instead of warnings) are issued on compile-time known
+   --       constraint errors, except in a few selected cases where it should
+   --       be allowed to let analysis proceed (e.g. range checks on empty
+   --       ranges, typically in deactivated code based on a given
+   --       configuration).
+
    -----------------------
    -- Check Flag Fields --
    -----------------------

Index: checks.adb
===
--- checks.adb (revision 217828)
+++ checks.adb (working copy)
@@ -2926,7 +2926,21 @@
                --  since all possible values will raise CE).
 
                if Lov > Hiv then
-                  Bad_Value;
+
+                  --  In GNATprove mode, do not issue a message in that case
+                  --  (which would be an error stopping analysis), as this
+                  --  likely corresponds to deactivated code based on a
+                  --  given configuration (say, dead code inside a loop over
+                  --  the empty range). Instead, we enable the range check
+                  --  so that GNATprove will issue a message if it cannot be
+                  --  proved.
+
+                  if GNATprove_Mode then
+                     Enable_Range_Check (Expr);
+                  else
+                     Bad_Value;
+                  end if;
+
                   return;
                end if;
[Ada] Give error message if duplicate Linker_Section given
Like other similar pragmas, we should disallow duplicate pragma or aspect Linker_Section for non-overloadable entities (for the case of overloading, the pragma only applies to previous entities which do not have such a pragma).

The following should compile with the given error:

     1. package Pkg1 is
     2.    Var_Dyn : natural;
     3.    pragma Linker_Section (Var_Dyn, ".data_dyn");
     4.    pragma Linker_Section (Var_Dyn, ".data_dyn1");
           |
        Linker_Section already specified for Var_Dyn at line 3

     5. end Pkg1;

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-11-20  Robert Dewar  de...@adacore.com

    * sem_prag.adb (Analyze_Pragma, case Linker_Section): Detect
    duplicate Linker_Section.

Index: sem_prag.adb
===
--- sem_prag.adb (revision 217838)
+++ sem_prag.adb (working copy)
@@ -16380,6 +16380,7 @@
          when Pragma_Linker_Section => Linker_Section : declare
             Arg : Node_Id;
             Ent : Entity_Id;
+            LPE : Node_Id;
 
          begin
             GNAT_Pragma;
@@ -16398,9 +16399,18 @@
             case Ekind (Ent) is
 
                --  Objects (constants and variables) and types. For these cases
-               --  all we need to do is to set the Linker_Section_pragma field.
+               --  all we need to do is to set the Linker_Section_pragma field,
+               --  checking that we do not have a duplicate.
 
                when E_Constant | E_Variable | Type_Kind =>
+                  LPE := Linker_Section_Pragma (Ent);
+
+                  if Present (LPE) then
+                     Error_Msg_Sloc := Sloc (LPE);
+                     Error_Msg_NE
+                       ("Linker_Section already specified for &#", Arg1, Ent);
+                  end if;
+
                   Set_Linker_Section_Pragma (Ent, N);
 
                --  Subprograms
RE: [PATCH] If using branch likelies in MIPS sync code fill the delay slot with a nop
> Ok to commit?
>
> gcc/
>     * config/mips/mips.c (mips_process_sync_loop): Place a nop in the
>     delay slot of the branch likely instruction.

With an updated ChangeLog to account for the changes in the callers, OK.

Matthew
[PATCH x86, PR60451] Expand even/odd permutation using pack insn.
Hi,

The patch expand even/odd permutation using:
  and, and, pack in odd case
  shift, shift, pack in even case
instead of current pshufb, pshufb, or or big set of unpack insns.

AVX2/CORE bootstrap and make check passed.
expensive tests are in progress

Is it ok for trunk?

Evgeny

2014-11-20  Evgeny Stupachenko  evstu...@gmail.com

gcc/testsuite
    PR target/60451
    * gcc.target/i386/pr60451.c: New.

gcc/
    PR target/60451
    * config/i386/i386.c (expand_vec_perm_even_odd_pack): New.
    (expand_vec_perm_even_odd_1): Add new expand for SSE cases,
    replace with for AVX2 cases.
    (ix86_expand_vec_perm_const_1): Add new expand.

+/* A subroutine of expand_vec_perm_even_odd_1.  Implement extract-even
+   and extract-odd permutations of two V16QI, V8HI, V16HI or V32QI operands
+   with two "and" and "pack" or two "shift" and "pack" insns.  We should
+   have already failed all two instruction sequences.  */
+
+static bool
+expand_vec_perm_even_odd_pack (struct expand_vec_perm_d *d)
+{
+  rtx op, dop0, dop1, t, rperm[16];
+  unsigned i, odd, c, s, nelt = d->nelt;
+  bool end_perm = false;
+  machine_mode half_mode;
+  rtx (*gen_and) (rtx, rtx, rtx);
+  rtx (*gen_pack) (rtx, rtx, rtx);
+  rtx (*gen_shift) (rtx, rtx, rtx);
+
+  /* Required for "pack".  */
+  if (!TARGET_SSE4_2 || d->one_operand_p)
+    return false;
+
+  /* Only V8HI, V16QI, V16HI and V32QI modes are more profitable than general
+     shuffles.  */
+  if (d->vmode == V8HImode)
+    {
+      c = 0xffff;
+      s = 16;
+      half_mode = V4SImode;
+      gen_and = gen_andv4si3;
+      gen_pack = gen_sse4_1_packusdw;
+      gen_shift = gen_lshrv4si3;
+    }
+  else if (d->vmode == V16QImode)
+    {
+      c = 0xff;
+      s = 8;
+      half_mode = V8HImode;
+      gen_and = gen_andv8hi3;
+      gen_pack = gen_sse2_packuswb;
+      gen_shift = gen_lshrv8hi3;
+    }
+  else if (d->vmode == V16HImode)
+    {
+      c = 0xffff;
+      s = 16;
+      half_mode = V8SImode;
+      gen_and = gen_andv8si3;
+      gen_pack = gen_avx2_packusdw;
+      gen_shift = gen_lshrv8si3;
+      end_perm = true;
+    }
+  else if (d->vmode == V32QImode)
+    {
+      c = 0xff;
+      s = 8;
+      half_mode = V16HImode;
+      gen_and = gen_andv16hi3;
+      gen_pack = gen_avx2_packuswb;
+      gen_shift = gen_lshrv16hi3;
+      end_perm = true;
+    }
+  else
+    return false;
+
+  /* Check that permutation is even or odd.  */
+  odd = d->perm[0];
+  if (odd != 0 && odd != 1)
+    return false;
+
+  for (i = 1; i < nelt; ++i)
+    if (d->perm[i] != 2 * i + odd)
+      return false;
+
+  if (d->testing_p)
+    return true;
+
+  dop0 = gen_reg_rtx (half_mode);
+  dop1 = gen_reg_rtx (half_mode);
+  if (odd == 0)
+    {
+      for (i = 0; i < nelt / 2; rperm[i++] = GEN_INT (c));
+      t = gen_rtx_CONST_VECTOR (half_mode, gen_rtvec_v (nelt / 2, rperm));
+      t = force_reg (half_mode, t);
+      emit_insn (gen_and (dop0, t, gen_lowpart (half_mode, d->op0)));
+      emit_insn (gen_and (dop1, t, gen_lowpart (half_mode, d->op1)));
+    }
+  else
+    {
+      emit_insn (gen_shift (dop0,
+                            gen_lowpart (half_mode, d->op0),
+                            GEN_INT (s)));
+      emit_insn (gen_shift (dop1,
+                            gen_lowpart (half_mode, d->op1),
+                            GEN_INT (s)));
+    }
+  /* In AVX2 for 256 bit case we need to permute pack result.  */
+  if (TARGET_AVX2 && end_perm)
+    {
+      op = gen_reg_rtx (d->vmode);
+      t = gen_reg_rtx (V4DImode);
+      emit_insn (gen_pack (op, dop0, dop1));
+      emit_insn (gen_avx2_permv4di_1 (t, gen_lowpart (V4DImode, op), const0_rtx,
+                                      const2_rtx, const1_rtx, GEN_INT (3)));
+      emit_move_insn (d->target, gen_lowpart (d->vmode, t));
+    }
+  else
+    emit_insn (gen_pack (d->target, dop0, dop1));
+
+  return true;
+}
+
 /* A subroutine of ix86_expand_vec_perm_builtin_1.  Implement extract-even
    and extract-odd permutations.  */
 
@@ -48393,6 +48503,8 @@ expand_vec_perm_even_odd_1 (struct expand_vec_perm_d *d, unsigned odd)
       gcc_unreachable ();
 
     case V8HImode:
+      if (TARGET_SSE4_2)
+        return expand_vec_perm_even_odd_pack (d);
       if (TARGET_SSSE3 && !TARGET_SLOW_PSHUFB)
         return expand_vec_perm_pshufb2 (d);
       else
@@ -48416,6 +48528,8 @@ expand_vec_perm_even_odd_1 (struct expand_vec_perm_d *d, unsigned odd)
       break;
 
     case V16QImode:
+      if (TARGET_SSE4_2)
+        return expand_vec_perm_even_odd_pack (d);
       if (TARGET_SSSE3 && !TARGET_SLOW_PSHUFB)
         return expand_vec_perm_pshufb2 (d);
       else
@@ -48441,7 +48555,7 @@ expand_vec_perm_even_odd_1 (struct expand_vec_perm_d *d, unsigned odd)
 
     case V16HImode:
     case V32QImode:
-      return expand_vec_perm_vpshufb2_vpermq_even_odd (d);
+      return expand_vec_perm_even_odd_pack (d);
 
     case V4DImode:
       if (!TARGET_AVX2)
@@ -48814,6 +48928,9 @@ ix86_expand_vec_perm_const_1 (struct expand_vec_perm_d *d)
   /* Try
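For readers following the patch above, here is a minimal scalar sketch in C of the mask-or-shift-then-pack idea for the V16QI case. It is an illustration only, not the GCC implementation: `extract_even_odd_bytes` and `pack_us` are hypothetical names, the 16-bit lanes are assumed to be little-endian, and `pack_us` models an unsigned-saturating pack of one 16-bit lane. As in the posted function, the mask path is taken when `odd == 0` and the shift path otherwise.

```c
#include <stddef.h>
#include <stdint.h>

/* Model of an unsigned-saturating pack of one 16-bit lane into a byte:
   the lane is read as signed, negatives clamp to 0, values above 255
   clamp to 255.  (Hypothetical helper, for illustration only.)  */
static uint8_t
pack_us (uint16_t lane)
{
  if (lane & 0x8000u)
    return 0;
  if (lane > 255)
    return 255;
  return (uint8_t) lane;
}

/* Extract the even (odd == 0) or odd (odd == 1) bytes of the 32-byte
   concatenation A:B into OUT, i.e. out[i] = concat[2 * i + odd].  */
void
extract_even_odd_bytes (const uint8_t a[16], const uint8_t b[16],
                        int odd, uint8_t out[16])
{
  const uint8_t *src[2] = { a, b };

  for (size_t v = 0; v < 2; v++)
    for (size_t i = 0; i < 8; i++)
      {
        /* View two adjacent bytes as one little-endian 16-bit lane.  */
        uint16_t lane = (uint16_t) (src[v][2 * i]
                                    | (src[v][2 * i + 1] << 8));

        /* Even case: keep the low byte with a mask.
           Odd case: move the high byte down with a logical shift.  */
        lane = odd ? (uint16_t) (lane >> 8) : (uint16_t) (lane & 0x00ff);

        /* Pack: the first operand fills the low half of the result,
           the second operand the high half.  */
        out[8 * v + i] = pack_us (lane);
      }
}
```

Because the mask or the shift leaves every lane in the range 0..255, the saturation never fires in this sketch, which is why the pack step returns exactly the selected bytes.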
Re: [PATCH x86, PR60451] Expand even/odd permutation using pack insn.
On Thu, Nov 20, 2014 at 02:36:26PM +0300, Evgeny Stupachenko wrote:
> +  /* Only V8HI, V16QI, V16HI and V32QI modes are more profitable than general
> +     shuffles.  */

I think switch (d->vmode) would be more readable.

> +      op = gen_reg_rtx (d->vmode);
> +      t = gen_reg_rtx (V4DImode);
> +      emit_insn (gen_pack (op, dop0, dop1));
> +      emit_insn (gen_avx2_permv4di_1 (t, gen_lowpart (V4DImode, op), const0_rtx,

Too long line, wrap it?

Will leave the rest to Uros.

Jakub
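To make the switch suggestion concrete, here is a small stand-alone sketch in C of what a switch-based mode dispatch could look like. It does not use GCC's types: `enum vec_mode`, `struct pack_params` and `select_pack_params` are hypothetical stand-ins, and the 16-bit mask value 0xffff is an assumption derived from the 16-bit shift count in the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the machine modes handled by the patch.  */
enum vec_mode { MODE_V8HI, MODE_V16QI, MODE_V16HI, MODE_V32QI, MODE_V4SI };

/* Per-mode parameters: and-mask, shift count, and whether the packed
   result still needs a final lane permutation (256-bit modes).  */
struct pack_params
{
  unsigned mask;
  unsigned shift;
  bool end_perm;
};

/* Fill *P for MODE and return true, or return false for modes the
   pack-based expansion does not handle.  */
bool
select_pack_params (enum vec_mode mode, struct pack_params *p)
{
  switch (mode)
    {
    case MODE_V8HI:
      p->mask = 0xffff; p->shift = 16; p->end_perm = false;
      return true;
    case MODE_V16QI:
      p->mask = 0xff; p->shift = 8; p->end_perm = false;
      return true;
    case MODE_V16HI:
      p->mask = 0xffff; p->shift = 16; p->end_perm = true;
      return true;
    case MODE_V32QI:
      p->mask = 0xff; p->shift = 8; p->end_perm = true;
      return true;
    default:
      return false;
    }
}
```

The switch keeps one case per mode and lets the `default` label carry the "not handled" exit that the if/else chain expressed with a trailing `else return false`.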
[PATCH, ARM] Fix PR63718, Thumb1 bootstrap -- disable fuse-caller-save for Thumb1
Richard,

This patch fixes PR63718, which currently breaks Thumb1 bootstrap.

The problem is that in Thumb1 mode, we emit the epilogue in RTL, but the last insn - epilogue_insns - does not accurately model the corresponding insns emitted in the asm file.

F.i., the asm file may contain an insn:
...
    pop {r0}

while the corresponding RTL pattern looks like this:
...
(jump_insn (unspec_volatile [
    (return)
  ] VUNSPEC_EPILOGUE))
...

As a consequence, the epilogue may clobber registers without fuse-caller-save being able to analyze that.

Adding the missing clobbers to epilogue_insns is not trivial, and probably not a good idea for stage3.

The patch works around the problem by disabling fuse-caller-save in Thumb1 mode.

Build and reg-tested on arm-none-eabi.

OK for stage3?

Thanks,
- Tom

2014-11-20  Tom de Vries  t...@codesourcery.com

    PR rtl-optimization/63718
    * config/arm/arm.c (arm_option_override): Disable fuse-caller-save for
    Thumb1.

Index: gcc/config/arm/arm.c
===
--- gcc/config/arm/arm.c (revision 217730)
+++ gcc/config/arm/arm.c (working copy)
@@ -3105,6 +3105,18 @@ arm_option_override (void)
       && (!arm_arch7 || !current_tune->prefer_ldrd_strd))
     flag_schedule_fusion = 0;
 
+  /* In Thumb1 mode, we emit the epilogue in RTL, but the last insn
+     - epilogue_insns - does not accurately model the corresponding insns
+     emitted in the asm file.  In particular, see the comment in thumb_exit
+     'Find out how many of the (return) argument registers we can corrupt'.
+     As a consequence, the epilogue may clobber registers without
+     fuse-caller-save finding out about it.  Therefore, disable fuse-caller-save
+     in Thumb1 mode.
+     TODO: Accurately model clobbers for epilogue_insns and reenable
+     fuse-caller-save.  */
+  if (TARGET_THUMB1)
+    flag_use_caller_save = 0;
+
   /* Register global variables with the garbage collector.  */
   arm_add_gc_roots ();
 }
[Ada] gnat1: back end switch -G nnn (PR ada/47500)
On platforms where the switch is allowed, the gcc driver, when called with -Gnnn (nnn is a non-negative number), invokes the compiler (gnat1) with -G nnn. This patch skips the argument nnn after -G, so that it is not taken as a source file name.

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-11-20  Vincent Celier  cel...@adacore.com

    PR ada/47500
    * back_end.adb (Scan_Back_End_Switches): Skip switch -G and
    its argument.

Index: back_end.adb
===
--- back_end.adb (revision 217828)
+++ back_end.adb (working copy)
@@ -232,9 +232,10 @@
          Last  : constant Natural  := Switch_Last (Switch_Chars);
 
       begin
-         --  Skip -o or internal GCC switches together with their argument
+         --  Skip -o, -G or internal GCC switches together with their argument.
 
          if Switch_Chars (First .. Last) = "o"
+           or else Switch_Chars (First .. Last) = "G"
            or else Is_Internal_GCC_Switch (Switch_Chars)
          then
             Next_Arg := Next_Arg + 1;
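The effect of skipping a switch together with its argument can be sketched in C as follows. This is a simplified model, not the Ada code in back_end.adb: `collect_source_files` and `takes_separate_argument` are hypothetical names, and only `-o` and `-G` are treated as switches with a separate argument.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical predicate: switches whose value is passed as the next,
   separate command-line argument.  */
static int
takes_separate_argument (const char *sw)
{
  return strcmp (sw, "-o") == 0 || strcmp (sw, "-G") == 0;
}

/* Store every argument that looks like a source file name into FILES
   and return how many were found.  FILES must have room for ARGC
   entries.  */
size_t
collect_source_files (int argc, const char *const *argv, const char **files)
{
  size_t n = 0;

  for (int i = 1; i < argc; i++)
    {
      if (takes_separate_argument (argv[i]))
        i++;                    /* Also skip the switch's argument.  */
      else if (argv[i][0] != '-')
        files[n++] = argv[i];   /* Anything else without '-' is a file.  */
    }

  return n;
}
```

Without the `-G` entry in the predicate, the `8` in `-G 8` would fall through to the last branch and be recorded as a source file name, which is the symptom the patch addresses.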
Re: [PATCH, i386]: Fix PR 63966, inconsistent operand constraints compiling libcpp
On Wed, Nov 19, 2014 at 9:59 PM, Uros Bizjak ubiz...@gmail.com wrote:
> Hello!
>
> libcpp/lex.c includes ../gcc/config/i386/cpuid.h, and is picked up by
> the system compiler during stage1. Recently, cpuid.h was changed to
> account for %ebx changes and now uses "b" asm constraint for i686 even
> with __PIC__.

Attached patch is what I have committed to mainline SVN.

2014-11-20  Uros Bizjak  ubiz...@gmail.com

    PR target/63966
    * lex.c [__i386__ || __x86_64__]: Compile special SSE functions
    only for (__GNUC__ >= 5 || !defined(__PIC__)).

Bootstrapped on x86_64-linux-gnu, Fedora 20 and CentOS 5.11.

Uros.

Index: lex.c
===
--- lex.c (revision 217830)
+++ lex.c (working copy)
@@ -270,7 +270,7 @@
    extensions used, so SSE4.2 executables cannot run on machines that
    don't support that extension.  */
 
-#if (GCC_VERSION >= 4005) && (defined(__i386__) || defined(__x86_64__)) && !(defined(__sun__) && defined(__svr4__))
+#if (GCC_VERSION >= 4005) && (__GNUC__ >= 5 || !defined(__PIC__)) && (defined(__i386__) || defined(__x86_64__)) && !(defined(__sun__) && de
 
 /* Replicated character data to be shared between implementations.
    Recall that outside of a context with vector support we can't
Re: [PATCH] driver: ignore SIGINT while waiting on subprocesses to finish
On Tue, Nov 18, 2014 at 11:14 AM, Michael Matz m...@suse.de wrote: Hi, On Mon, 17 Nov 2014, Richard Biener wrote: This means I can no longer interrupt a compile that is running too long? No, that's not what it means, cc1 will also get the SIGINT. You should instead debug the actual compiler, not the driver. -wrapper is specifically also for invoking cc1 with gdb from the driver (that's the usecase documented with -wrapper), so it better should work as intended. I don't know what problems Patrick had with that, though. For me gcc -wrapper gdb,--args works as expected (as in ^C interrupts cc1 returning to gdb). Yes it does for me too. But pressing ^C in gdb while cc1 is not running (by accident or with intention, e.g. pressing ^C to quickly clear the command prompt) will kill the driver and gdb after it. It's not a huge problem but it does cause some inconvenience for users of -wrapper gdb. Ciao, Michael.
Re: [PATCH, ifcvt] Fix PR63917
On Thu, Nov 20, 2014 at 1:48 AM, Zhenqiang Chen zhenqiang.c...@arm.com wrote:
> Hi,
>
> r217646 enhances ifcvt to handle cbranchcc4 instruction. But ifcvt does not
> strictly check the dependence before moving instructions before IF. Then
> some instructions, which clobber CC, are inserted before the cbranchcc4
> instruction.
>
> For the case in the patch, ifcvt transfers code from
>
>     5: r87:SI=r117:SI
>    22: pc={(flags:CCGOC>=0)?L26:pc}
>    25: {r87:SI=-r117:SI;clobber flags:CC;}
>
> to
>
>     5: r87:SI=r117:SI
>   136: {r145:SI=-r117:SI;clobber flags:CC;}  // CC is clobbered
>   137: r87:SI={(flags:CCGOC<0)?r145:SI:r117:SI}
>
> The patch skips moving insns, which clobber CC, before cbranchcc4.
>
> Bootstrap and no make check regression on X86-64 and i686.
> All the failed cases in PR63917 PASS.
>
> OK for trunk?
>
> Thanks!
> -Zhenqiang
>
> ChangeLog:
> 2014-11-20  Zhenqiang Chen  zhenqiang.c...@arm.com
>
>     PR rtl-optimization/63917
>     * ifcvt.c (clobber_cc_p, use_cc_p): New functions.
>     (noce_process_if_block, check_cond_move_block): Check CC references.
>
> testsuite/ChangeLog:
> 2014-11-20  Zhenqiang Chen  zhenqiang.c...@arm.com
>
>     * gcc.target/i386/floatsitf.c: New test.

Why do you need a new testcase? There are many failures with the existing testcases.

-- 
H.J.
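The dependence problem described in the quoted message can be modelled in a few lines of C. This is a toy model under stated assumptions, not the ifcvt.c code: `struct insn_model` and `safe_to_hoist_before_cc_branch` are hypothetical, and each instruction is reduced to a single flag saying whether it clobbers the condition-code register.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy instruction: only records whether it clobbers the condition codes.  */
struct insn_model
{
  bool clobbers_cc;
};

/* Return true if all N instructions may be moved in front of a
   conditional branch that reads CC.  One CC clobber is enough to make
   the move unsafe, because the branch would then test the wrong flags.  */
bool
safe_to_hoist_before_cc_branch (const struct insn_model *insns, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (insns[i].clobbers_cc)
      return false;
  return true;
}
```

In the RTL example above, insn 136 negates a register and clobbers the flags, so in this model it would be rejected, while a plain register copy such as insn 5 would be accepted.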
Re: [PATCH x86, PR60451] Expand even/odd permutation using pack insn.
On Thu, Nov 20, 2014 at 12:36 PM, Evgeny Stupachenko evstu...@gmail.com wrote: Hi, The patch expand even/odd permutation using: and, and, pack in odd case shift, shift, pack in even case instead of current pshufb, pshufb, or or big set of unpack insns. AVX2/CORE bootstrap and make check passed. expensive tests are in progress Is it ok for trunk? Evgeny 2014-11-20 Evgeny Stupachenko evstu...@gmail.com gcc/testsuite PR target/60451 * gcc.target/i386/pr60451.c: New. gcc/ PR target/60451 * config/i386/i386.c (expand_vec_perm_even_odd_pack): New. (expand_vec_perm_even_odd_1): Add new expand for SSE cases, replace with for AVX2 cases. (ix86_expand_vec_perm_const_1): Add new expand. OK with a couple of small adjustments below. Thanks, Uros. +/* A subroutine of expand_vec_perm_even_odd_1. Implement extract-even + and extract-odd permutations of two V16QI, V8HI, V16HI or V32QI operands + with two and and pack or two shift and pack insns. We should + have already failed all two instruction sequences. */ + +static bool +expand_vec_perm_even_odd_pack (struct expand_vec_perm_d *d) +{ + rtx op, dop0, dop1, t, rperm[16]; + unsigned i, odd, c, s, nelt = d-nelt; + bool end_perm = false; + machine_mode half_mode; + rtx (*gen_and) (rtx, rtx, rtx); + rtx (*gen_pack) (rtx, rtx, rtx); + rtx (*gen_shift) (rtx, rtx, rtx); + + /* Required for pack. */ + if (!TARGET_SSE4_2 || d-one_operand_p) +return false; + + /* Only V8HI, V16QI, V16HI and V32QI modes are more profitable than general + shuffles. */ + if (d-vmode == V8HImode) Use switch, as proposed by Jakub. 
+{ + c = 0x; + s = 16; + half_mode = V4SImode; + gen_and = gen_andv4si3; + gen_pack = gen_sse4_1_packusdw; + gen_shift = gen_lshrv4si3; +} + else if (d-vmode == V16QImode) +{ + c = 0xff; + s = 8; + half_mode = V8HImode; + gen_and = gen_andv8hi3; + gen_pack = gen_sse2_packuswb; + gen_shift = gen_lshrv8hi3; +} + else if (d-vmode == V16HImode) +{ + c = 0x; + s = 16; + half_mode = V8SImode; + gen_and = gen_andv8si3; + gen_pack = gen_avx2_packusdw; + gen_shift = gen_lshrv8si3; + end_perm = true; +} + else if (d-vmode == V32QImode) +{ + c = 0xff; + s = 8; + half_mode = V16HImode; + gen_and = gen_andv16hi3; + gen_pack = gen_avx2_packuswb; + gen_shift = gen_lshrv16hi3; + end_perm = true; +} + else +return false; + + /* Check that permutation is even or odd. */ + odd = d-perm[0]; + if (odd != 0 odd != 1) if (odd 1) +return false; + + for (i = 1; i nelt; ++i) +if (d-perm[i] != 2 * i + odd) + return false; + + if (d-testing_p) +return true; + + dop0 = gen_reg_rtx (half_mode); + dop1 = gen_reg_rtx (half_mode); + if (odd == 0) +{ + for (i = 0; i nelt / 2; rperm[i++] = GEN_INT (c)); Please write above as: for (i = 0; i nelt / 2; i++) rperm[i] = GEN_INT (c)); + t = gen_rtx_CONST_VECTOR (half_mode, gen_rtvec_v (nelt / 2, rperm)); + t = force_reg (half_mode, t); + emit_insn (gen_and (dop0, t, gen_lowpart (half_mode, d-op0))); + emit_insn (gen_and (dop1, t, gen_lowpart (half_mode, d-op1))); +} + else +{ + emit_insn (gen_shift (dop0, + gen_lowpart (half_mode, d-op0), + GEN_INT (s))); + emit_insn (gen_shift (dop1, + gen_lowpart (half_mode, d-op1), + GEN_INT (s))); +} + /* In AVX2 for 256 bit case we need to permute pack result. 
*/ + if (TARGET_AVX2 end_perm) +{ + op = gen_reg_rtx (d-vmode); + t = gen_reg_rtx (V4DImode); + emit_insn (gen_pack (op, dop0, dop1)); + emit_insn (gen_avx2_permv4di_1 (t, gen_lowpart (V4DImode, op), const0_rtx, + const2_rtx, const1_rtx, GEN_INT (3))); + emit_move_insn (d-target, gen_lowpart (d-vmode, t)); +} + else +emit_insn (gen_pack (d-target, dop0, dop1)); + + return true; +} + /* A subroutine of ix86_expand_vec_perm_builtin_1. Implement extract-even and extract-odd permutations. */ @@ -48393,6 +48503,8 @@ expand_vec_perm_even_odd_1 (struct expand_vec_perm_d *d, unsigned odd) gcc_unreachable (); case V8HImode: + if (TARGET_SSE4_2) + return expand_vec_perm_even_odd_pack (d); if (TARGET_SSSE3 !TARGET_SLOW_PSHUFB) else if in the above line, to be consistent with else below. return expand_vec_perm_pshufb2 (d); else @@ -48416,6 +48528,8 @@ expand_vec_perm_even_odd_1 (struct expand_vec_perm_d *d, unsigned odd) break; case V16QImode: + if (TARGET_SSE4_2) + return expand_vec_perm_even_odd_pack (d);
Re: LTO streaming of TARGET_OPTIMIZE_NODE
On 11/13/2014 05:06 AM, Jan Hubicka wrote:
> this patch adds infrastructure for proper streaming and merging of
> TREE_TARGET_OPTION.

This breaks the offloading path via LTO since it introduces an incompatibility in LTO format between host and offload machine. A very quick patch to fix it is below - the OpenACC testcase I was using seems to be working again with this.

Thoughts, suggestions?

Bernd

diff --git a/gcc/lto-streamer-out.c b/gcc/lto-streamer-out.c
index be041e9..3c4b8c9 100644
--- a/gcc/lto-streamer-out.c
+++ b/gcc/lto-streamer-out.c
@@ -65,7 +65,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "streamer-hooks.h"
 #include "cfgloop.h"
 #include "builtins.h"
-
+#include "lto-section-names.h"
 
 static void lto_write_tree (struct output_block*, tree, bool);
 
@@ -944,7 +944,9 @@ hash_tree (struct streamer_tree_cache_d *cache, hash_map<tree, hashval_t> *map,
     hstate.add (TRANSLATION_UNIT_LANGUAGE (t),
                 strlen (TRANSLATION_UNIT_LANGUAGE (t)));
 
-  if (CODE_CONTAINS_STRUCT (code, TS_TARGET_OPTION))
+  if (CODE_CONTAINS_STRUCT (code, TS_TARGET_OPTION)
+      /* We don't stream these when passing things to a different target.  */
+      && strcmp (section_name_prefix, LTO_SECTION_NAME_PREFIX) == 0)
     hstate.add_wide_int (cl_target_option_hash (TREE_TARGET_OPTION (t)));
 
   if (CODE_CONTAINS_STRUCT (code, TS_OPTIMIZATION))
diff --git a/gcc/tree-streamer-in.c b/gcc/tree-streamer-in.c
index a2a2382..88d36d3 100644
--- a/gcc/tree-streamer-in.c
+++ b/gcc/tree-streamer-in.c
@@ -514,8 +514,10 @@ unpack_value_fields (struct data_in *data_in, struct bitpack_d *bp, tree expr)
       vec_safe_grow (CONSTRUCTOR_ELTS (expr), length);
     }
 
+#ifndef ACCEL_COMPILER
   if (CODE_CONTAINS_STRUCT (code, TS_TARGET_OPTION))
     cl_target_option_stream_in (data_in, bp, TREE_TARGET_OPTION (expr));
+#endif
 
   if (code == OMP_CLAUSE)
     unpack_ts_omp_clause_value_fields (data_in, bp, expr);
@@ -779,7 +781,9 @@ lto_input_ts_function_decl_tree_pointers (struct lto_input_block *ib,
   DECL_VINDEX (expr) = stream_read_tree (ib, data_in);
   /* DECL_STRUCT_FUNCTION is loaded on demand by cgraph_get_body.  */
   DECL_FUNCTION_PERSONALITY (expr) = stream_read_tree (ib, data_in);
+#ifndef ACCEL_COMPILER
   DECL_FUNCTION_SPECIFIC_TARGET (expr) = stream_read_tree (ib, data_in);
+#endif
   DECL_FUNCTION_SPECIFIC_OPTIMIZATION (expr) = stream_read_tree (ib, data_in);
 
   /* If the file contains a function with an EH personality set,
diff --git a/gcc/tree-streamer-out.c b/gcc/tree-streamer-out.c
index b959454..fca101e 100644
--- a/gcc/tree-streamer-out.c
+++ b/gcc/tree-streamer-out.c
@@ -47,6 +47,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-streamer.h"
 #include "data-streamer.h"
 #include "streamer-hooks.h"
+#include "lto-section-names.h"
 
 /* Output the STRING constant to the string table in OB.  Then put the index
    onto the INDEX_STREAM.  */
@@ -463,7 +464,9 @@ streamer_pack_tree_bitfields (struct output_block *ob,
   if (CODE_CONTAINS_STRUCT (code, TS_CONSTRUCTOR))
     bp_pack_var_len_unsigned (bp, CONSTRUCTOR_NELTS (expr));
 
-  if (CODE_CONTAINS_STRUCT (code, TS_TARGET_OPTION))
+  if (CODE_CONTAINS_STRUCT (code, TS_TARGET_OPTION)
+      /* Don't stream these when passing things to a different target.  */
+      && strcmp (section_name_prefix, LTO_SECTION_NAME_PREFIX) == 0)
     cl_target_option_stream_out (ob, bp, TREE_TARGET_OPTION (expr));
 
   if (code == OMP_CLAUSE)
@@ -678,7 +681,9 @@ write_ts_function_decl_tree_pointers (struct output_block *ob, tree expr,
   stream_write_tree (ob, DECL_VINDEX (expr), ref_p);
   /* DECL_STRUCT_FUNCTION is handled by lto_output_function.  */
   stream_write_tree (ob, DECL_FUNCTION_PERSONALITY (expr), ref_p);
-  stream_write_tree (ob, DECL_FUNCTION_SPECIFIC_TARGET (expr), ref_p);
+  /* Don't stream these when passing things to a different target.  */
+  if (strcmp (section_name_prefix, LTO_SECTION_NAME_PREFIX) == 0)
+    stream_write_tree (ob, DECL_FUNCTION_SPECIFIC_TARGET (expr), ref_p);
   stream_write_tree (ob, DECL_FUNCTION_SPECIFIC_OPTIMIZATION (expr), ref_p);
 }
Another ptx offloading patch
Now that I've managed to put together and test all the submitted OpenACC patches I found there was one piece missing. The problem is that omp-low on the host likes to generate function names like _main._omp_fn. On ptx, the dot is not allowed in identifiers, so we have to rewrite this to use a dollar sign. The patch below does this at the lto-read stage. Bootstrapped on x86_64-linux, ok if testing is successful? Bernd commit 26b41de43c6db6e2368a9511c589c433b1e49c96 Author: Bernd Schmidt ber...@codesourcery.com Date: Wed Nov 19 21:47:59 2014 +0100 Renaming for invalid symbols when reading LTO. * cgraph.h (clone_function_name_1): Declare. * cgraphclones.c (clone_function_name_1): New function. (clone_function_name): Use it. * lto-partition.c: Include stringpool.h. (must_not_rename, maybe_rewrite_identifier, validize_symbol_for_target): New static functions. (privatize_symbol_name): Use must_not_rename. (promote_symbol): Call validize_symbol_for_target. (lto_promote_cross_file_statics): Likewise. (lto_promote_statics_nonwpa): Likewise. diff --git a/gcc/cgraph.h b/gcc/cgraph.h index a5c5f56..7be6413 100644 --- a/gcc/cgraph.h +++ b/gcc/cgraph.h @@ -2150,6 +2150,7 @@ basic_block init_lowered_empty_function (tree, bool); /* In cgraphclones.c */ +tree clone_function_name_1 (const char *, const char *); tree clone_function_name (tree decl, const char *); void tree_function_versioning (tree, tree, vecipa_replace_map *, va_gc *, diff --git a/gcc/cgraphclones.c b/gcc/cgraphclones.c index 086dd92..1b7d8d2 100644 --- a/gcc/cgraphclones.c +++ b/gcc/cgraphclones.c @@ -506,19 +506,19 @@ cgraph_node::create_clone (tree decl, gcov_type gcov_count, int freq, return new_node; } -/* Return a new assembler name for a clone of DECL with SUFFIX. */ - static GTY(()) unsigned int clone_fn_id_num; +/* Return a new assembler name for a clone with SUFFIX of a decl named + NAME. 
*/ + tree -clone_function_name (tree decl, const char *suffix) +clone_function_name_1 (const char *name, const char *suffix) { - tree name = DECL_ASSEMBLER_NAME (decl); - size_t len = IDENTIFIER_LENGTH (name); + size_t len = strlen (name); char *tmp_name, *prefix; prefix = XALLOCAVEC (char, len + strlen (suffix) + 2); - memcpy (prefix, IDENTIFIER_POINTER (name), len); + memcpy (prefix, name, len); strcpy (prefix + len + 1, suffix); #ifndef NO_DOT_IN_LABEL prefix[len] = '.'; @@ -531,6 +531,16 @@ clone_function_name (tree decl, const char *suffix) return get_identifier (tmp_name); } +/* Return a new assembler name for a clone of DECL with SUFFIX. */ + +tree +clone_function_name (tree decl, const char *suffix) +{ + tree name = DECL_ASSEMBLER_NAME (decl); + return clone_function_name_1 (IDENTIFIER_POINTER (name), suffix); +} + + /* Create callgraph node clone with new declaration. The actual body will be copied later at compilation stage. diff --git a/gcc/gcc.c b/gcc/gcc.c index 80dc87c..c49401b 100644 --- a/gcc/gcc.c +++ b/gcc/gcc.c @@ -4238,14 +4238,14 @@ process_command (unsigned int decoded_options_count, } gcc_assert (!IS_ABSOLUTE_PATH (tooldir_base_prefix)); - tooldir_prefix2 = concat (tooldir_base_prefix, spec_host_machine, + tooldir_prefix2 = concat (tooldir_base_prefix, spec_machine, dir_separator_str, NULL); /* Look for tools relative to the location from which the driver is running, or, if that is not available, the configured prefix. */ tooldir_prefix = concat (gcc_exec_prefix ? gcc_exec_prefix : standard_exec_prefix, - spec_host_machine, dir_separator_str, spec_version, + spec_machine, dir_separator_str, spec_version, accel_dir_suffix, dir_separator_str, tooldir_prefix2, NULL); free (tooldir_prefix2); diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c index 65f0582..ac10c90 100644 --- a/gcc/lto/lto-partition.c +++ b/gcc/lto/lto-partition.c @@ -49,6 +49,7 @@ along with GCC; see the file COPYING3. 
If not see #include ipa-inline.h #include ipa-utils.h #include lto-partition.h +#include stringpool.h vecltrans_partition ltrans_partitions; @@ -775,21 +776,12 @@ lto_balanced_map (int n_lto_partitions) free (order); } -/* Mangle NODE symbol name into a local name. - This is necessary to do - 1) if two or more static vars of same assembler name - are merged into single ltrans unit. - 2) if prevoiusly static var was promoted hidden to avoid possible conflict - with symbols defined out of the LTO world. -*/ +/* Return true if we must not change the name of the NODE. The name as + extracted from the corresponding decl should be passed in NAME. */ static bool -privatize_symbol_name (symtab_node *node) +must_not_rename (symtab_node *node, const char *name) { - tree decl = node-decl; - const char *name = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl)); - cgraph_node *cnode; - /* Our renaming machinery do not
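The renaming described at the top of this message (a dot is not allowed in ptx identifiers, so it is rewritten to a dollar sign) can be sketched in stand-alone C. This is only an illustration of the idea, not the code in lto-partition.c: `rewrite_dots` is a hypothetical helper that works on plain C strings rather than on GCC identifier nodes.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a newly allocated copy of NAME in which
   every '.' has been replaced by '$', or NULL if allocation fails.
   The caller owns the result and must free it.  */
char *
rewrite_dots (const char *name)
{
  size_t len = strlen (name);
  char *copy = malloc (len + 1);

  if (copy == NULL)
    return NULL;

  for (size_t i = 0; i < len; i++)
    copy[i] = (name[i] == '.') ? '$' : name[i];
  copy[len] = '\0';

  return copy;
}
```

For a name such as `_main._omp_fn`, this sketch yields `_main$_omp_fn`; names without a dot come back unchanged.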
Re: OpenACC middle end changes
On 11/20/2014 07:52 AM, Jakub Jelinek wrote:
> On Thu, Nov 20, 2014 at 03:19:11AM +0100, Bernd Schmidt wrote:
>> Thomas had apparently already pointed out an issue with the new
>> gomp_target class (there are multiple similar types of statements we want
>> to handle with OpenACC, they have different codes but we want to have
>> function pointers operating on any of them) back in July. That seems to have
>> been ignored. By necessity, some of David's changes are reverted in the
>> following patch.
>
> I thought the agreement was to use GIMPLE_OMP_TARGET gimple_code and just
> two new gimple_omp_target_kind GF_* flags.

If that's the case I'll leave it to Thomas to make these changes. At the moment I'm just trying to put together all the pieces into versions that apply to trunk and can be made to work together.

Bernd
Re: gimple-classes-v2-option-3 git branch committed to svn trunk as r217787
On Thu, Nov 20, 2014 at 12:05 AM, Andrew MacLeod amacl...@redhat.com wrote:
> On 11/19/2014 05:24 PM, David Malcolm wrote:
>> On Wed, 2014-11-19 at 22:36 +0100, Richard Biener wrote:
>>> On November 19, 2014 10:09:56 PM CET, Andrew MacLeod amacl...@redhat.com wrote:
>>>> On 11/19/2014 03:43 PM, Richard Biener wrote:
>>>>> On November 19, 2014 8:26:23 PM CET, Andrew MacLeod amacl...@redhat.com wrote:
>>>>>> On 11/19/2014 01:12 PM, David Malcolm wrote:
>>>>>>> (A) could become:
>>>>>>>   greturn *stmt = gsi->as_a_greturn ();
>>>>>>> (B) could become:
>>>>>>>   stmt = gsi->dyn_cast <gcall *> ();
>>>>>>>   if (!stmt)
>>>>>>> or:
>>>>>>>   stmt = gsi->dyn_cast_gcall ();
>>>>>>>   if (!stmt)
>>>>>>> or maybe:
>>>>>>>   stmt = gsi->is_a_gcall ();
>>>>>>>   if (!stmt)
>>>>>>>
>>>>>>> An earlier version of my patches had casting methods within the
>>>>>>> gimple_statement_base class, which were rejected; the above
>>>>>>> proposals would instead put them within gimple_stmt_iterator.
>>>>>>
>>>>>> I would like all gsi routines to be consistent, not a mix of
>>>>>> functions and methods. So something like
>>>>>>   stmt = gsi_as_call (gsi);
>>>>>>   stmt = gsi_dyn_call (gsi);
>>>>>> or we need to change gsi_stmt() and friends into methods and access
>>>>>> them as gsi->stmt (), which is possibly better, but that much more
>>>>>> intrusive again (another 2000+ locations).
>>>>>>
>>>>>> If we switched to methods everywhere for gsi, I'd prefer something like
>>>>>>   gsi->as_a_call ()
>>>>>>   gsi->is_a_call ()
>>>>>>   gsi->dyn_cast_call ()
>>>>>> I think it's more readable, and it removes a dependency on the
>>>>>> implementation: if we ever decide to change the name of 'gcall' down
>>>>>> the road to using a namespace, and make it gimple::call or whatever,
>>>>>> we won't have to change every single gsi-> location which has a
>>>>>> templated use of the type.
>>>>>>
>>>>>> I also think this sort of thing could probably wait until next
>>>>>> stage 1. My 2 cents...
>>>>>
>>>>> Why not as_a <gassign *> (*gsi)? It would add operator* to gsi of course.
>>>>>
>>>>> Richard.
>>>>
>>>> I could live with that form too. We often have an instance of
>>>> gimple_stmt_iterator rather than a pointer to it, so wouldn't
>>>> operator gimple *() to implicitly call gsi_stmt() when needed work
>>>> better? (or operator gimple () before the next change)
>>>
>>> Not sure. The * matches how iterators work in STL...
>>>
>>> Note that for the cases where we pass a pointer to an iterator we can
>>> change those to use references to avoid writing **gsi.
>>>
>>> Richard.
>>>
>>>> Andrew
>>
>> I had a go at adding an operator * to gimple_stmt_iterator, using it
>> everywhere that we do an as_a or dyn_cast on the result of a gsi_stmt,
>> to abbreviate the gsi_stmt call down to one character.
>>
>> Patch attached; only lightly smoketested; am posting it for the sake of
>> discussion.
>>
>> I don't think this API will make the non-C++-fans happier; I think the
>> objection to the work I just merged is that it's adding more C++ than
>> those people are comfortable with.
>>
>> So although the attached patch makes things shorter (good), it's taking
>> things in a more C++ direction (questionable). I'd hoped to paper over
>> the C++ somewhat.
>>
>> I suspect that any API which requires the use of <> characters within
>> the implementation of a gimple pass to mean a template is going to give
>> those less happy with C++ a visceral ugh reaction. I wonder if there's
>> a way to spell these things that's concise and which doesn't involve <>?
>
> Wasn't that my last thought? is_a_call(), as_a_call() and
> dyn_cast_call ()?
>
> I think lack of <> in identifiers helps us old brains parse faster :-)
> <> are like ()... many many years of causing a certain kind of break in
> mental processing. I'm accustomed to single <> these days, but once you
> get into multiple <>'s I quickly lose track. I find the same thing with
> ()... hence I'm not a lisp fan :-)

I think we want to have a consistent style across GCC even if seen as ugly
to some people. Thus having (member) functions for conversion in some cases
and as_a templates in others is bad.

C++ was supposed to make grokking GCC easier for newbies - this is exactly
making it harder (not that I believe in this story at all...)

> I don't think 'operator *' c++ifies it too much, but I still think
> operator gimple() would be easier... no extra character at all, and no
> odd looking dereference of a non-pointer object or double dereference of
> a pointer. I can't think of how that could get us into trouble... it'll
> always map to the stmt the iterator currently points to.

I dislike conversion operators. Why is 'operator *' bad? It's exactly how
iterators are supposed to work - after all the gsi stuff was modeled after
STL iterators!

So that's a definitive no from me to is_a_call () as_a_call () etc.

Richard.

> Andrew
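For readers following the thread, the as_a <T> (*gsi) spelling argued for above can be illustrated with a minimal sketch. Everything here is hypothetical: stmt, call_stmt, return_stmt, kind_of and stmt_iterator are stand-ins invented for illustration, not GCC's gimple classes, and the two cast helpers only loosely follow what as_a and dyn_cast do.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical statement hierarchy; stand-ins, not GCC's gimple classes.
enum class stmt_kind { call, ret };

struct stmt {
  explicit stmt(stmt_kind k) : kind(k) {}
  virtual ~stmt() = default;
  stmt_kind kind;
};
struct call_stmt : stmt { call_stmt() : stmt(stmt_kind::call) {} };
struct return_stmt : stmt { return_stmt() : stmt(stmt_kind::ret) {} };

// Per-type tag so the cast helpers can check the dynamic kind.
template <typename T> struct kind_of;
template <> struct kind_of<call_stmt> {
  static constexpr stmt_kind value = stmt_kind::call;
};
template <> struct kind_of<return_stmt> {
  static constexpr stmt_kind value = stmt_kind::ret;
};

// Checked downcast: asserts that the kind matches.
template <typename T> T *as_a(stmt *s) {
  assert(s && s->kind == kind_of<T>::value);
  return static_cast<T *>(s);
}

// Returns nullptr when the kind does not match.
template <typename T> T *dyn_cast(stmt *s) {
  return (s && s->kind == kind_of<T>::value) ? static_cast<T *>(s) : nullptr;
}

// Minimal iterator over a statement sequence with an STL-style operator*.
struct stmt_iterator {
  std::vector<stmt *> *seq;
  std::size_t pos;
  stmt *operator*() const { return (*seq)[pos]; }
  void next() { ++pos; }
  bool at_end() const { return pos >= seq->size(); }
};

// Taking the iterator by reference keeps the call site at a single '*'
// instead of the '**' needed with a pointer to the iterator.
inline std::size_t count_calls(stmt_iterator &it) {
  std::size_t n = 0;
  for (; !it.at_end(); it.next())
    if (dyn_cast<call_stmt>(*it))
      ++n;
  return n;
}
```

With a stmt_iterator named it, a call site would read as_a<call_stmt>(*it), which is the shape Richard proposes; the member-function alternative would instead need one method per statement type on the iterator.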
Re: gimple-classes-v2-option-3 git branch committed to svn trunk as r217787
On Wed, Nov 19, 2014 at 11:24 PM, David Malcolm dmalc...@redhat.com wrote:
> On Wed, 2014-11-19 at 22:36 +0100, Richard Biener wrote:
>> On November 19, 2014 10:09:56 PM CET, Andrew MacLeod amacl...@redhat.com wrote:
>>> On 11/19/2014 03:43 PM, Richard Biener wrote:
>>>> On November 19, 2014 8:26:23 PM CET, Andrew MacLeod amacl...@redhat.com wrote:
>>>>> On 11/19/2014 01:12 PM, David Malcolm wrote:
>>>>>> (A) could become:
>>>>>>   greturn *stmt = gsi->as_a_greturn ();
>>>>>> (B) could become:
>>>>>>   stmt = gsi->dyn_cast <gcall *> ();
>>>>>>   if (!stmt)
>>>>>> or:
>>>>>>   stmt = gsi->dyn_cast_gcall ();
>>>>>>   if (!stmt)
>>>>>> or maybe:
>>>>>>   stmt = gsi->is_a_gcall ();
>>>>>>   if (!stmt)
>>>>>>
>>>>>> An earlier version of my patches had casting methods within the
>>>>>> gimple_statement_base class, which were rejected; the above
>>>>>> proposals would instead put them within gimple_stmt_iterator.
>>>>>
>>>>> I would like all gsi routines to be consistent, not a mix of
>>>>> functions and methods. So something like
>>>>>   stmt = gsi_as_call (gsi);
>>>>>   stmt = gsi_dyn_call (gsi);
>>>>> or we need to change gsi_stmt() and friends into methods and access
>>>>> them as gsi->stmt (), which is possibly better, but that much more
>>>>> intrusive again (another 2000+ locations).
>>>>>
>>>>> If we switched to methods everywhere for gsi, I'd prefer something like
>>>>>   gsi->as_a_call ()
>>>>>   gsi->is_a_call ()
>>>>>   gsi->dyn_cast_call ()
>>>>> I think it's more readable, and it removes a dependency on the
>>>>> implementation: if we ever decide to change the name of 'gcall' down
>>>>> the road to using a namespace, and make it gimple::call or whatever,
>>>>> we won't have to change every single gsi-> location which has a
>>>>> templated use of the type.
>>>>>
>>>>> I also think this sort of thing could probably wait until next
>>>>> stage 1. My 2 cents...
>>>>
>>>> Why not as_a <gassign *> (*gsi)? It would add operator* to gsi of course.
>>>>
>>>> Richard.
>>>
>>> I could live with that form too. We often have an instance of
>>> gimple_stmt_iterator rather than a pointer to it, so wouldn't
>>> operator gimple *() to implicitly call gsi_stmt() when needed work
>>> better? (or operator gimple () before the next change)
>>
>> Not sure. The * matches how iterators work in STL...
>>
>> Note that for the cases where we pass a pointer to an iterator we can
>> change those to use references to avoid writing **gsi.
>>
>> Richard.
>>
>>> Andrew
>
> I had a go at adding an operator * to gimple_stmt_iterator, using it
> everywhere that we do an as_a or dyn_cast on the result of a gsi_stmt,
> to abbreviate the gsi_stmt call down to one character.
>
> Patch attached; only lightly smoketested; am posting it for the sake of
> discussion.

Looks good. Note that

diff --git a/gcc/asan.c b/gcc/asan.c
index be28ede..d06d60c 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -1902,7 +1902,7 @@ instrument_builtin_call (gimple_stmt_iterator *iter)
     return false;

   bool iter_advanced_p = false;
-  gcall *call = as_a <gcall *> (gsi_stmt (*iter));
+  gcall *call = as_a <gcall *> (**iter);

should be fixed by making instrument_builtin_call take a reference to the
iterator so the above becomes

  gcall *call = as_a <gcall *> (*iter);

This is probably not possible in 100% of all cases (we sometimes pass NULL
as the iterator pointer) but in most.

> I don't think this API will make the non-C++-fans happier; I think the
> objection to the work I just merged is that it's adding more C++ than
> those people are comfortable with.

How so? It's already super-ugly in those views. We decided to get C++. Now
we have it. Now please make it AT LEAST CONSISTENT.

> So although the attached patch makes things shorter (good), it's taking
> things in a more C++ direction (questionable). I'd hoped to paper over
> the C++ somewhat.
>
> I suspect that any API which requires the use of <> characters within
> the implementation of a gimple pass to mean a template is going to give
> those less happy with C++ a visceral ugh reaction. I wonder if there's
> a way to spell these things that's concise and which doesn't involve <>?

Only if you drop as_a/is_a/dyn_cast everywhere.

Richard.
Re: bitmap fix for current
On Thu, Nov 20, 2014 at 1:18 AM, Mike Stump mikest...@comcast.net wrote:
> On Nov 14, 2014, at 2:26 AM, Richard Biener richard.guent...@gmail.com wrote:
>> On Fri, Nov 14, 2014 at 2:10 AM, Jeff Law l...@redhat.com wrote:
>>> On 11/13/14 12:37, Mike Stump wrote:
>>>> I was doing a merge, and it failed to even compile the runtime
>>>> libraries due to checking in bitmap. bitmap goes to remove set bits
>>>> from the bitmap (the second hunk in a two hunk set), and it fails to
>>>> update the current pointer. That memory is freed and then reallocated
>>>> and a new index is put into it, and then we fail a consistency check
>>>> later on due to the mismatch between head->index and
>>>> head->current->indx, because current was not properly maintained.
>>>> This patch removes the old value of current when we remove what it
>>>> points to from the bitmap.
>>>
>>> Was the calling code iterating through the bit with a form like
>>>
>>>   EXECUTE_IF_SET_IN_BITMAP (something, 0, i, bi)
>>>     {
>>>       bitmap_clear_bit (something, i)
>>>       [ ... whatever code we want to process i, ... ]
>>>     }
>>>
>>> If so, that's the real issue and we'd really like to identify and fix
>>> any code that has that kind of structure.
>
> Nope, that doesn’t appear to be the problem.
>
>> Indeed. I can't see how this can have triggered:
>>
>>   prev = elt->prev;
>>   if (prev)
>>     {
>>       prev->next = NULL;
>>       if (head->current->indx > prev->indx)
>>         {
>>           head->current = prev;
>>           head->indx = prev->indx;
>>
>> so if there was elt->prev then if current == elt, current->indx should
>> better be > prev->indx. Sth else must be wrong (and I doubt it's the
>> above bogus use of bitmaps).
>
> So bitmap_ior_and_compl has an overly clever optimization to overwrite
> an existing bitmap with the newly computed bitmap. We write it over in
> place and then at the end, we do:
>
>   if (dst_elt)
>     {
>       changed = true;
>       bitmap_elt_clear_from (dst, dst_elt);
>     }
>
> which is all fine and good, however, notice that when we update a list
> with: 0 1 with: 1 2 we get: 1 1 2 and then we want to kill from the
> second 1 to the end. The problem is current points at the second 1, and
> because the update code for current does:
>
>   if (head->current->indx > prev->indx)
>     {
>       head->current = prev;
>       head->indx = prev->indx;
>     }
>
> and index is not greater (it is indeed unrelated to the other index), we
> don’t update current. So, even my patch was wrong, in that the two are
> unrelated, so no comparison will help here.
>
> Curiously, the and and xor routines do this:
>
>   /* Ensure that dst->current is valid.  */
>   dst->current = dst->first;
>   bitmap_elt_clear_from (dst, dst_elt);
>
> so, certainly the previous authors know of this type of problem.
> and_into almost seems wrong:
>
>   if (a_elt)
>     {
>       changed = true;
>       bitmap_elt_clear_from (a, a_elt);
>     }
>
> as 'and' can remove elements, but it is saved by the code in
> bitmap_elt_clear_from:
>
>   if (head->current->indx > prev->indx)
>     {
>       head->current = prev;
>       head->indx = prev->indx;
>     }
>
> which kicks in since 'and' cannot add any elements, it is purely
> subtractive. and_compl works as it does:
>
>   /* Ensure that dst->current is valid.  */
>   dst->current = dst->first;
>
> ior doesn’t reset current, and is broken. ior_and_compl doesn’t reset
> current and likewise, is broken. If these were _into varieties, they
> would have been ok. But, they are not.
>
> I added checking code to ensure the current was in the bitmap at the end
> of bitmap_elt_clear_from, and sure enough, it fired.
>
> So, next up, is there anything else that is supposed to save us in this
> case? If not, Ok?

The bitmap_ior and bitmap_ior_and_compl hunks are ok. Please leave out the
checking bits - they will be very much too expensive.

Thanks,
Richard.
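The stale-cache problem Mike describes can be sketched in isolation. This is a simplified model, assuming a doubly linked list with a cached current element; the names elt, head, clear_from, cache_is_valid and overwrite_then_truncate are invented for illustration and are not GCC's bitmap API. Nothing is freed in the sketch, so the stale pointer can be detected without touching dead memory.

```cpp
#include <cassert>

// Hypothetical element and head types; a simplified stand-in for a linked
// bitmap, not GCC's bitmap_element / bitmap_head.
struct elt {
  unsigned indx;
  elt *prev;
  elt *next;
};

struct head {
  elt *first;
  elt *current;   // cache of the most recently used element
  unsigned indx;  // cached index of `current`
};

// Unlink `e` and everything after it.  The cache is only repaired when
// current->indx is greater than prev->indx, which silently assumes the
// indexes in the list have not been rewritten in the meantime.
inline void clear_from(head &h, elt *e) {
  elt *prev = e->prev;
  if (prev) {
    prev->next = nullptr;
    if (h.current && h.current->indx > prev->indx) {
      h.current = prev;
      h.indx = prev->indx;
    }
  } else {
    h.first = h.current = nullptr;
    h.indx = 0;
  }
  e->prev = nullptr;
}

// True when the cached element is still reachable from `first` and the
// cached index agrees with it.
inline bool cache_is_valid(const head &h) {
  if (!h.current)
    return h.first == nullptr;
  for (const elt *p = h.first; p; p = p->next)
    if (p == h.current)
      return h.indx == p->indx;
  return false;
}

// Overwrite the first two elements in place with new indexes, then drop
// the tail.  `reset_cache` selects whether the cache is reset to the first
// element before truncating.
inline bool overwrite_then_truncate(bool reset_cache) {
  elt a{0, nullptr, nullptr}, b{1, nullptr, nullptr}, c{5, nullptr, nullptr};
  a.next = &b; b.prev = &a;
  b.next = &c; c.prev = &b;
  head h{&a, &c, 5};  // the cache points at the last element

  a.indx = 7;         // in-place overwrite with the newly computed contents
  b.indx = 8;

  if (reset_cache) {
    h.current = h.first;
    h.indx = h.first->indx;
  }
  clear_from(h, &c);
  return cache_is_valid(h);
}
```

Without the reset, the comparison inside clear_from looks at two unrelated indexes (5 and 8), leaves the cache alone, and the cache ends up pointing at an unlinked element. Resetting the cache first sidesteps the comparison entirely, which is the shape of the fix approved for bitmap_ior and bitmap_ior_and_compl.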
Re: [PATCH] Disable an unsafe VRP transformation when -fno-strict-overflow is set
On Thu, Nov 20, 2014 at 4:21 AM, Patrick Palka patr...@parcs.ath.cx wrote:
> VRP may simplify a conditional like i <= 5 to i == 5 if it is known that
> the lower bound of i's range is 5, e.g. [5, +INF]. But if the upper
> bound of i's range is also overflow infinity, i.e. [5, +INF(OVF)] then
> this transformation is only valid if -fstrict-overflow is in effect.
> Likewise for transforming i > 10 to i != 10 given i's range is
> [10, +INF(OVF)] and for transforming i >= 20 to i == 20 given i's range
> is [-INF(OVF), 20].
>
> This patch makes this transformation only get performed if strict
> overflow rules are in effect and potentially emits a
> -Wstrict-overflow=3 warning when the transformation takes place.
>
> Bootstrap and regtesting in progress on x86_64-unknown-linux-gnu. Does
> the patch look OK if there are no new regressions?

Ok.

Thanks,
Richard.

> gcc/
>     * tree-vrp.c (test_for_singularity): New parameter
>     strict_overflow_p.  Set *strict_overflow_p to true if signed
>     overflow must be undefined for the return value to satisfy the
>     conditional.
>     (simplify_cond_using_ranges): Don't perform the simplification
>     if it violates overflow rules.
>
> gcc/testsuite/
>     * gcc.dg/no-strict-overflow-8.c: New test.
> ---
>  gcc/testsuite/gcc.dg/no-strict-overflow-8.c | 25 +
>  gcc/tree-vrp.c                              | 57 +
>  2 files changed, 74 insertions(+), 8 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/no-strict-overflow-8.c
>
> diff --git a/gcc/testsuite/gcc.dg/no-strict-overflow-8.c b/gcc/testsuite/gcc.dg/no-strict-overflow-8.c
> new file mode 100644
> index 000..11ef935
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/no-strict-overflow-8.c
> @@ -0,0 +1,25 @@
> +/* { dg-do compile } */
> +/* { dg-options "-fno-strict-overflow -O2 -fdump-tree-optimized" } */
> +
> +/* We cannot fold i > 0 because p->a - p->b can be larger than INT_MAX
> +   and thus i can wrap.  Dual of Wstrict-overflow-18.c  */
> +
> +struct c { unsigned int a; unsigned int b; };
> +extern void bar (struct c *);
> +int
> +foo (struct c *p)
> +{
> +  int i;
> +  int sum = 0;
> +
> +  for (i = 0; i < p->a - p->b; ++i)
> +    {
> +      if (i > 0)
> +        sum += 2;
> +      bar (p);
> +    }
> +  return sum;
> +}
> +
> +/* { dg-final { scan-tree-dump "i_.* > 0" "optimized" } } */
> +/* { dg-final { cleanup-tree-dump "optimized" } } */
> diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
> index bcf4c2b..444af71 100644
> --- a/gcc/tree-vrp.c
> +++ b/gcc/tree-vrp.c
> @@ -9117,11 +9117,15 @@ simplify_bit_ops_using_ranges (gimple_stmt_iterator *gsi, gimple stmt)
>     a known value range VR.
>
>     If there is one and only one value which will satisfy the
> -   conditional, then return that value.  Else return NULL.  */
> +   conditional, then return that value.  Else return NULL.
> +
> +   If signed overflow must be undefined for the value to satisfy
> +   the conditional, then set *STRICT_OVERFLOW_P to true.  */
>
>  static tree
>  test_for_singularity (enum tree_code cond_code, tree op0,
> -                      tree op1, value_range_t *vr)
> +                      tree op1, value_range_t *vr,
> +                      bool *strict_overflow_p)
>  {
>    tree min = NULL;
>    tree max = NULL;
> @@ -9172,7 +9176,16 @@ test_for_singularity (enum tree_code cond_code, tree op0,
>           then there is only one value which can satisfy the condition,
>           return that value.  */
>        if (operand_equal_p (min, max, 0) && is_gimple_min_invariant (min))
> -        return min;
> +        {
> +          if ((cond_code == LE_EXPR || cond_code == LT_EXPR)
> +              && is_overflow_infinity (vr->max))
> +            *strict_overflow_p = true;
> +          if ((cond_code == GE_EXPR || cond_code == GT_EXPR)
> +              && is_overflow_infinity (vr->min))
> +            *strict_overflow_p = true;
> +
> +          return min;
> +        }
>      }
>    return NULL;
>  }
> @@ -9252,9 +9265,12 @@ simplify_cond_using_ranges (gcond *stmt)
>           able to simplify this conditional. */
>        if (vr->type == VR_RANGE)
>          {
> -          tree new_tree = test_for_singularity (cond_code, op0, op1, vr);
> +          enum warn_strict_overflow_code wc = WARN_STRICT_OVERFLOW_CONDITIONAL;
> +          bool sop = false;
> +          tree new_tree = test_for_singularity (cond_code, op0, op1, vr, &sop);
>
> -          if (new_tree)
> +          if (new_tree
> +              && (!sop || TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (op0))))
>              {
>                if (dump_file)
>                  {
> @@ -9275,16 +9291,30 @@ simplify_cond_using_ranges (gcond *stmt)
>                    fprintf (dump_file, "\n");
>                  }
>
> +              if (sop && issue_strict_overflow_warning (wc))
> +                {
> +                  location_t location = input_location;
> +                  if (gimple_has_location (stmt))
> +                    location = gimple_location (stmt);
> +
> +                  warning_at (location, OPT_Wstrict_overflow,
> +                              "assuming signed overflow does
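To see why the rewrite needs undefined signed overflow, consider a variable whose computed range is [0, +INF(OVF)] and the condition i > 0, which VRP would like to turn into i != 0. The sketch below is only an illustration with invented function names; it represents a value that has wrapped past INT_MAX directly as INT_MIN rather than performing the overflow, so the code itself has no undefined behaviour.

```cpp
#include <cassert>
#include <climits>

// The condition as written in the source program.
inline bool original_cond(int i) { return i > 0; }

// The form VRP would substitute when it believes i is in [0, +INF].
inline bool simplified_cond(int i) { return i != 0; }

// True when both forms give the same answer for this value.
inline bool forms_agree(int i) {
  return original_cond(i) == simplified_cond(i);
}
```

For every value in [0, INT_MAX] the two forms agree. Once wrapping is permitted (-fno-strict-overflow), the variable can hold a negative value such as INT_MIN, where i > 0 is false but i != 0 is true, so the substitution would change behaviour.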
Re: LTO streaming of TARGET_OPTIMIZE_NODE
On Thu, 20 Nov 2014, Bernd Schmidt wrote:
> On 11/13/2014 05:06 AM, Jan Hubicka wrote:
>> this patch adds infrastructure for proper streaming and merging of
>> TREE_TARGET_OPTION.
>
> This breaks the offloading path via LTO since it introduces an
> incompatibility in LTO format between host and offload machine. A very
> quick patch to fix it is below - the OpenACC testcase I was using seems
> to be working again with this. Thoughts, suggestions?

The offload target needs to have the same target options as the host?

Are the offload functions marked somehow? That is, can we avoid setting
TREE_TARGET_OPTION on them?

Or rather we need to have a default TREE_TARGET_OPTION node for the offload
target which we'd need to set - how would you otherwise transfer different
offload target options to the offload compile? How do you transfer offload
target options to the offload compile at all?

I think this just shows conceptual issues with the LTO approach...

Thanks,
Richard.
[PATCH] PR63426 Fix various signed integer overflows
Running the testsuite after bootstrap-ubsan on gcc112 shows several issues. See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63426 for the full list. This patch fixes several of them. Tested on powerpc64-unknown-linux-gnu. OK for trunk? Thanks. 2014-11-20 Markus Trippelsdorf mar...@trippelsdorf.de * config/rs6000/constraints.md: Avoid signed integer overflows. * config/rs6000/predicates.md: Likewise. * config/rs6000/rs6000.c (num_insns_constant_wide): Likewise. (includes_rldic_lshift_p): Likewise. (includes_rldicr_lshift_p): Likewise. * emit-rtl.c (const_wide_int_htab_hash): Likewise. * loop-iv.c (determine_max_iter): Likewise. (iv_number_of_iterations): Likewise. * tree-ssa-loop-ivopts.c (get_computation_cost_at): Likewise. * varasm.c (get_section_anchor): Likewise. diff --git a/gcc/config/rs6000/constraints.md b/gcc/config/rs6000/constraints.md index 0e0e517d7a1d..3f12b07e4899 100644 --- a/gcc/config/rs6000/constraints.md +++ b/gcc/config/rs6000/constraints.md @@ -176,7 +176,7 @@ (define_constraint P constant whose negation is signed 16-bit constant (and (match_code const_int) - (match_test (unsigned HOST_WIDE_INT) ((- ival) + 0x8000) 0x1))) + (match_test ((- (unsigned HOST_WIDE_INT) ival) + 0x8000) 0x1))) ;; Floating-point constraints diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md index 1767cbd7a11b..ea230a5b29a6 100644 --- a/gcc/config/rs6000/predicates.md +++ b/gcc/config/rs6000/predicates.md @@ -408,7 +408,7 @@ (define_predicate reg_or_sub_cint_operand (if_then_else (match_code const_int) (match_test (unsigned HOST_WIDE_INT) - (- INTVAL (op) + (mode == SImode ? 0x8000 : 0x80008000)) + (- UINTVAL (op) + (mode == SImode ? 
0x8000 : 0x80008000)) (unsigned HOST_WIDE_INT) 0x1ll) (match_operand 0 gpc_reg_operand))) diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index 506daa1d31e7..a9604cf3fa97 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -5083,7 +5083,7 @@ int num_insns_constant_wide (HOST_WIDE_INT value) { /* signed constant loadable with addi */ - if ((unsigned HOST_WIDE_INT) (value + 0x8000) 0x1) + if (((unsigned HOST_WIDE_INT) value + 0x8000) 0x1) return 1; /* constant loadable with addis */ @@ -16194,7 +16194,7 @@ includes_rldic_lshift_p (rtx shiftop, rtx andop) { if (GET_CODE (andop) == CONST_INT) { - HOST_WIDE_INT c, lsb, shift_mask; + unsigned HOST_WIDE_INT c, lsb, shift_mask; c = INTVAL (andop); if (c == 0 || c == ~0) @@ -16233,7 +16233,7 @@ includes_rldicr_lshift_p (rtx shiftop, rtx andop) { if (GET_CODE (andop) == CONST_INT) { - HOST_WIDE_INT c, lsb, shift_mask; + unsigned HOST_WIDE_INT c, lsb, shift_mask; shift_mask = ~0; shift_mask = INTVAL (shiftop); diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c index 04f677eb608d..9d60d42c01f8 100644 --- a/gcc/emit-rtl.c +++ b/gcc/emit-rtl.c @@ -203,7 +203,7 @@ static hashval_t const_wide_int_htab_hash (const void *x) { int i; - HOST_WIDE_INT hash = 0; + unsigned HOST_WIDE_INT hash = 0; const_rtx xr = (const_rtx) x; for (i = 0; i CONST_WIDE_INT_NUNITS (xr); i++) diff --git a/gcc/loop-iv.c b/gcc/loop-iv.c index 8ea458c3fc53..f55cea2a9859 100644 --- a/gcc/loop-iv.c +++ b/gcc/loop-iv.c @@ -2311,7 +2311,7 @@ determine_max_iter (struct loop *loop, struct niter_desc *desc, rtx old_niter) } get_mode_bounds (desc-mode, desc-signed_p, desc-mode, mmin, mmax); - nmax = INTVAL (mmax) - INTVAL (mmin); + nmax = UINTVAL (mmax) - UINTVAL (mmin); if (GET_CODE (niter) == UDIV) { @@ -2649,7 +2649,7 @@ iv_number_of_iterations (struct loop *loop, rtx_insn *insn, rtx condition, down = INTVAL (CONST_INT_P (iv0.base) ? 
iv0.base : mode_mmin); - max = (up - down) / inc + 1; + max = (uint64_t) (up - down) / inc + 1; if (!desc-infinite !desc-assumptions) record_niter_bound (loop, max, false, true); diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c index 4007e5483b27..fca18b6cdfe3 100644 --- a/gcc/tree-ssa-loop-ivopts.c +++ b/gcc/tree-ssa-loop-ivopts.c @@ -4183,7 +4183,7 @@ get_computation_cost_at (struct ivopts_data *data, if (cst_and_fits_in_hwi (cbase)) { - offset = - ratio * int_cst_value (cbase); + offset = - ratio * (unsigned HOST_WIDE_INT) int_cst_value (cbase); cost = difference_cost (data, ubase, build_int_cst (utype, 0), symbol_present, var_present, offset, diff --git a/gcc/varasm.c b/gcc/varasm.c index 54611f8fd3f1..b93e2559843c 100644 --- a/gcc/varasm.c +++ b/gcc/varasm.c @@ -7188,7 +7188,7 @@ get_section_anchor (struct object_block
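Most hunks in this patch follow one pattern: move the conversion to an unsigned type before the addition, subtraction or negation, so the arithmetic wraps instead of overflowing a signed type. Below is a minimal sketch of that pattern for a signed 16-bit range check. The function names are invented and the constants 0x8000 and 0x10000 are this sketch's own choice for a 16-bit immediate; they are not copied from the patch, whose constants are partly garbled in this archive.

```cpp
#include <cassert>
#include <climits>

// True when v fits in a signed 16-bit immediate, i.e. -0x8000 <= v <= 0x7fff.
// The value is converted to unsigned *before* the bias is added, so the
// addition wraps modulo 2^N instead of overflowing a signed type.
inline bool fits_signed16(long long v) {
  return static_cast<unsigned long long>(v) + 0x8000ULL < 0x10000ULL;
}

// True when -v fits in a signed 16-bit immediate.  Negating through the
// unsigned type is well defined even for LLONG_MIN, whereas -v on the
// signed value would overflow.
inline bool negation_fits_signed16(long long v) {
  return -static_cast<unsigned long long>(v) + 0x8000ULL < 0x10000ULL;
}
```

Writing the check as (unsigned long long) (v + 0x8000) instead would perform the addition in the signed type first, which is what the undefined-behaviour sanitizer flags for values near the top of the range.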
RE: [PATCH, committed] Update Automake files
Hello Jan-Benedict,

> Hi!
>
> This patch updates the files taken from Automake. Committed.
>
> MfG, JBG

The updated version of missing will confuse the gmp-4.3.2 configure script
if it is installed in-tree with contrib/download_prerequisites and flex is
not installed:

...
checking readline detected... no
checking for bison... (cached) /home/ed/gnu/gcc-5-20141116/missing bison -y
checking for flex... (cached) /home/ed/gnu/gcc-5-20141116/missing flex
checking lex output file root... configure: error: cannot find output from /home/ed/gnu/gcc-5-20141116/missing flex; giving up
make[3]: *** [config.status] Error 1
make[3]: Leaving directory `/home/ed/gnu/gcc-build-arm-linux-gnueabihf/gmp'
make[2]: *** [all-stage1-gmp] Error 2
make[2]: Leaving directory `/home/ed/gnu/gcc-build-arm-linux-gnueabihf'
make[1]: *** [stage1-bubble] Error 2
make[1]: Leaving directory `/home/ed/gnu/gcc-build-arm-linux-gnueabihf'
make: *** [all] Error 2

The previous version of missing flex produced a dummy lex.yy.c, as does the
version in the gmp package, but unfortunately it is overwritten by the
missing script in the gcc tree.

That's probably just not a supported configuration anymore, but all previous
GCC releases worked without an installed flex tool.

Maybe the problem goes away if a newer version of gmp is used, or if the
missing flex is not passed down to the gmp configure script, somehow.
Actually, it is not really needed by gmp at all.

I tried to add this hunk from the old version and with it the gmp configure
script worked again:

--- missing.orig2014-11-16 14:07:13.0 +
+++ missing 2014-11-19 15:01:57.168967538 +
@@ -172,6 +172,21 @@
     echo "You should only need it if you modified a '.l' file."
     echo "You may want to install the Fast Lexical Analyzer package:"
     echo "<$flex_URL>"
+    rm -f lex.yy.c
+    if test $# -ne 1; then
+        eval LASTARG=\${$#}
+        case $LASTARG in
+        *.l)
+            SRCFILE=`echo $LASTARG | sed 's/l$/c/'`
+            if test -f $SRCFILE; then
+                cp $SRCFILE lex.yy.c
+            fi
+          ;;
+        esac
+    fi
+    if test ! -f lex.yy.c; then
+        echo 'main() { return 0; }' >lex.yy.c
+    fi
     ;;
   help2man*)
     echo "You should only need it if you modified a dependency" \

What do you think?

Regards,
Bernd.
Re: [PATCH x86, PR60451] Expand even/odd permutation using pack insn.
Thank you. Patch with proposed fixes: diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index 085eb54..09c0057 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -48322,6 +48322,120 @@ expand_vec_perm_vpshufb2_vpermq_even_odd (struct expand_vec_perm_d *d) return true; } +/* A subroutine of expand_vec_perm_even_odd_1. Implement extract-even + and extract-odd permutations of two V16QI, V8HI, V16HI or V32QI operands + with two and and pack or two shift and pack insns. We should + have already failed all two instruction sequences. */ + +static bool +expand_vec_perm_even_odd_pack (struct expand_vec_perm_d *d) +{ + rtx op, dop0, dop1, t, rperm[16]; + unsigned i, odd, c, s, nelt = d-nelt; + bool end_perm = false; + machine_mode half_mode; + rtx (*gen_and) (rtx, rtx, rtx); + rtx (*gen_pack) (rtx, rtx, rtx); + rtx (*gen_shift) (rtx, rtx, rtx); + + /* Required for pack. */ + if (!TARGET_SSE4_2 || d-one_operand_p) +return false; + + switch (d-vmode) +{ +case V8HImode: + c = 0x; + s = 16; + half_mode = V4SImode; + gen_and = gen_andv4si3; + gen_pack = gen_sse4_1_packusdw; + gen_shift = gen_lshrv4si3; + break; +case V16QImode: + c = 0xff; + s = 8; + half_mode = V8HImode; + gen_and = gen_andv8hi3; + gen_pack = gen_sse2_packuswb; + gen_shift = gen_lshrv8hi3; + break; +case V16HImode: + c = 0x; + s = 16; + half_mode = V8SImode; + gen_and = gen_andv8si3; + gen_pack = gen_avx2_packusdw; + gen_shift = gen_lshrv8si3; + end_perm = true; + break; +case V32QImode: + c = 0xff; + s = 8; + half_mode = V16HImode; + gen_and = gen_andv16hi3; + gen_pack = gen_avx2_packuswb; + gen_shift = gen_lshrv16hi3; + end_perm = true; + break; +default: + /* Only V8HI, V16QI, V16HI and V32QI modes are more profitable than +general shuffles. */ + return false; +} + + /* Check that permutation is even or odd. 
*/ + odd = d-perm[0]; + if (odd 1) +return false; + + for (i = 1; i nelt; ++i) +if (d-perm[i] != 2 * i + odd) + return false; + + if (d-testing_p) +return true; + + dop0 = gen_reg_rtx (half_mode); + dop1 = gen_reg_rtx (half_mode); + if (odd == 0) +{ + for (i = 0; i nelt / 2; i++) + rperm[i] = GEN_INT (c); + t = gen_rtx_CONST_VECTOR (half_mode, gen_rtvec_v (nelt / 2, rperm)); + t = force_reg (half_mode, t); + emit_insn (gen_and (dop0, t, gen_lowpart (half_mode, d-op0))); + emit_insn (gen_and (dop1, t, gen_lowpart (half_mode, d-op1))); +} + else +{ + emit_insn (gen_shift (dop0, + gen_lowpart (half_mode, d-op0), + GEN_INT (s))); + emit_insn (gen_shift (dop1, + gen_lowpart (half_mode, d-op1), + GEN_INT (s))); +} + /* In AVX2 for 256 bit case we need to permute pack result. */ + if (TARGET_AVX2 end_perm) +{ + op = gen_reg_rtx (d-vmode); + t = gen_reg_rtx (V4DImode); + emit_insn (gen_pack (op, dop0, dop1)); + emit_insn (gen_avx2_permv4di_1 (t, + gen_lowpart (V4DImode, op), + const0_rtx, + const2_rtx, + const1_rtx, + GEN_INT (3))); + emit_move_insn (d-target, gen_lowpart (d-vmode, t)); +} + else +emit_insn (gen_pack (d-target, dop0, dop1)); + + return true; +} + /* A subroutine of ix86_expand_vec_perm_builtin_1. Implement extract-even and extract-odd permutations. 
*/ @@ -48393,7 +48507,9 @@ expand_vec_perm_even_odd_1 (struct expand_vec_perm_d *d, unsigned odd) gcc_unreachable (); case V8HImode: - if (TARGET_SSSE3 !TARGET_SLOW_PSHUFB) + if (TARGET_SSE4_2) + return expand_vec_perm_even_odd_pack (d); + else if (TARGET_SSSE3 !TARGET_SLOW_PSHUFB) return expand_vec_perm_pshufb2 (d); else { @@ -48416,7 +48532,9 @@ expand_vec_perm_even_odd_1 (struct expand_vec_perm_d *d, unsigned odd) break; case V16QImode: - if (TARGET_SSSE3 !TARGET_SLOW_PSHUFB) + if (TARGET_SSE4_2) + return expand_vec_perm_even_odd_pack (d); + else if (TARGET_SSSE3 !TARGET_SLOW_PSHUFB) return expand_vec_perm_pshufb2 (d); else { @@ -48441,7 +48559,7 @@ expand_vec_perm_even_odd_1 (struct expand_vec_perm_d *d, unsigned odd) case V16HImode: case V32QImode: - return expand_vec_perm_vpshufb2_vpermq_even_odd (d); + return expand_vec_perm_even_odd_pack (d); case V4DImode: if (!TARGET_AVX2) @@ -48814,6 +48932,9 @@ ix86_expand_vec_perm_const_1 (struct expand_vec_perm_d *d) /* Try sequences of three instructions. */ + if (expand_vec_perm_even_odd_pack (d)) +return true;
Re: gimple-classes-v2-option-3 git branch committed to svn trunk as r217787
On 11/20/2014 08:08 AM, Richard Biener wrote:
> On Thu, Nov 20, 2014 at 12:05 AM, Andrew MacLeod amacl...@redhat.com wrote:
>> On 11/19/2014 05:24 PM, David Malcolm wrote:
>>> On Wed, 2014-11-19 at 22:36 +0100, Richard Biener wrote:
>>>> On November 19, 2014 10:09:56 PM CET, Andrew MacLeod amacl...@redhat.com wrote:
>>>>> On 11/19/2014 03:43 PM, Richard Biener wrote:
>>>>>> On November 19, 2014 8:26:23 PM CET, Andrew MacLeod amacl...@redhat.com wrote:
>>>>>>> On 11/19/2014 01:12 PM, David Malcolm wrote:
>>>>>>>> (A) could become:
>>>>>>>>   greturn *stmt = gsi->as_a_greturn ();
>>>>>>>> (B) could become:
>>>>>>>>   stmt = gsi->dyn_cast <gcall *> ();
>>>>>>>>   if (!stmt)
>>>>>>>> or:
>>>>>>>>   stmt = gsi->dyn_cast_gcall ();
>>>>>>>>   if (!stmt)
>>>>>>>> or maybe:
>>>>>>>>   stmt = gsi->is_a_gcall ();
>>>>>>>>   if (!stmt)
>>>>>>>>
>>>>>>>> An earlier version of my patches had casting methods within the
>>>>>>>> gimple_statement_base class, which were rejected; the above
>>>>>>>> proposals would instead put them within gimple_stmt_iterator.
>>>>>>>
>>>>>>> I would like all gsi routines to be consistent, not a mix of
>>>>>>> functions and methods. So something like
>>>>>>>   stmt = gsi_as_call (gsi);
>>>>>>>   stmt = gsi_dyn_call (gsi);
>>>>>>> or we need to change gsi_stmt() and friends into methods and
>>>>>>> access them as gsi->stmt (), which is possibly better, but that
>>>>>>> much more intrusive again (another 2000+ locations).
>>>>>>>
>>>>>>> If we switched to methods everywhere for gsi, I'd prefer
>>>>>>> something like
>>>>>>>   gsi->as_a_call ()
>>>>>>>   gsi->is_a_call ()
>>>>>>>   gsi->dyn_cast_call ()
>>>>>>> I think it's more readable, and it removes a dependency on the
>>>>>>> implementation: if we ever decide to change the name of 'gcall'
>>>>>>> down the road to using a namespace, and make it gimple::call or
>>>>>>> whatever, we won't have to change every single gsi-> location
>>>>>>> which has a templated use of the type.
>>>>>>>
>>>>>>> I also think this sort of thing could probably wait until next
>>>>>>> stage 1. My 2 cents...
>>>>>>
>>>>>> Why not as_a <gassign *> (*gsi)? It would add operator* to gsi of
>>>>>> course.
>>>>>>
>>>>>> Richard.
>>>>>
>>>>> I could live with that form too. We often have an instance of
>>>>> gimple_stmt_iterator rather than a pointer to it, so wouldn't
>>>>> operator gimple *() to implicitly call gsi_stmt() when needed work
>>>>> better? (or operator gimple () before the next change)
>>>>
>>>> Not sure. The * matches how iterators work in STL...
>>>>
>>>> Note that for the cases where we pass a pointer to an iterator we
>>>> can change those to use references to avoid writing **gsi.
>>>>
>>>> Richard.
>>>>
>>>>> Andrew
>>>
>>> I had a go at adding an operator * to gimple_stmt_iterator, using it
>>> everywhere that we do an as_a or dyn_cast on the result of a
>>> gsi_stmt, to abbreviate the gsi_stmt call down to one character.
>>>
>>> Patch attached; only lightly smoketested; am posting it for the sake
>>> of discussion.
>>>
>>> I don't think this API will make the non-C++-fans happier; I think
>>> the objection to the work I just merged is that it's adding more C++
>>> than those people are comfortable with.
>>>
>>> So although the attached patch makes things shorter (good), it's
>>> taking things in a more C++ direction (questionable). I'd hoped to
>>> paper over the C++ somewhat.
>>>
>>> I suspect that any API which requires the use of <> characters within
>>> the implementation of a gimple pass to mean a template is going to
>>> give those less happy with C++ a visceral ugh reaction. I wonder if
>>> there's a way to spell these things that's concise and which doesn't
>>> involve <>?
>>
>> Wasn't that my last thought? is_a_call(), as_a_call() and
>> dyn_cast_call ()?
>>
>> I think lack of <> in identifiers helps us old brains parse faster :-)
>> <> are like ()... many many years of causing a certain kind of break
>> in mental processing. I'm accustomed to single <> these days, but once
>> you get into multiple <>'s I quickly lose track. I find the same thing
>> with ()... hence I'm not a lisp fan :-)
>
> I think we want to have a consistent style across GCC even if seen as
> ugly to some people. Thus having (member) functions for conversion in
> some cases and as_a templates in others is bad.
>
> C++ was supposed to make grokking GCC easier for newbies - this is
> exactly making it harder (not that I believe in this story at all...)
>
>> I don't think 'operator *' c++ifies it too much, but I still think
>> operator gimple() would be easier... no extra character at all, and no
>> odd looking dereference of a non-pointer object or double dereference
>> of a pointer. I can't think of how that could get us into trouble...
>> it'll always map to the stmt the iterator currently points to.
>
> I dislike conversion operators. Why is 'operator *' bad? It's exactly
> how iterators are supposed to work - after all the gsi stuff was
> modeled after STL iterators!
>
> So that's a definitive no from me to is_a_call () as_a_call () etc.
>
> Richard.

Fine by me. Just running through the options to make sure we know what we
are getting :-)

Andrew
Re: [PATCH] PR63426 Fix various signed integer overflows
On Thu, Nov 20, 2014 at 8:27 AM, Markus Trippelsdorf mar...@trippelsdorf.de wrote:
> Running the testsuite after bootstrap-ubsan on gcc112 shows several
> issues. See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63426 for the
> full list.
>
> This patch fixes several of them.
>
> Tested on powerpc64-unknown-linux-gnu.
> OK for trunk?
>
> Thanks.
>
> 2014-11-20  Markus Trippelsdorf  mar...@trippelsdorf.de
>
>     * config/rs6000/constraints.md: Avoid signed integer overflows.
>     * config/rs6000/predicates.md: Likewise.
>     * config/rs6000/rs6000.c (num_insns_constant_wide): Likewise.
>     (includes_rldic_lshift_p): Likewise.
>     (includes_rldicr_lshift_p): Likewise.
>     * emit-rtl.c (const_wide_int_htab_hash): Likewise.
>     * loop-iv.c (determine_max_iter): Likewise.
>     (iv_number_of_iterations): Likewise.
>     * tree-ssa-loop-ivopts.c (get_computation_cost_at): Likewise.
>     * varasm.c (get_section_anchor): Likewise.

The rs6000 patches are okay.

Someone like Richi or Jakub needs to approve the changes to the common
parts of the compiler.

Thanks,
David
Re: LTO streaming of TARGET_OPTIMIZE_NODE
On 11/20/2014 02:20 PM, Richard Biener wrote:
> On Thu, 20 Nov 2014, Bernd Schmidt wrote:
>> On 11/13/2014 05:06 AM, Jan Hubicka wrote:
>>> this patch adds infrastructure for proper streaming and merging of
>>> TREE_TARGET_OPTION.
>>
>> This breaks the offloading path via LTO since it introduces an
>> incompatibility in LTO format between host and offload machine. A very
>> quick patch to fix it is below - the OpenACC testcase I was using seems
>> to be working again with this. Thoughts, suggestions?
>
> The offload target needs to have the same target options as the host?

Not really meaningful I'd think.

> Are the offload functions marked somehow? That is, can we avoid setting
> TREE_TARGET_OPTION on them?

Well, they are mostly generated automatically by omp-low.c, so
TREE_TARGET_OPTION wouldn't normally be set anyway. So the field is
unnecessary, we just can't write it out since the two compilers involved
disagree on its layout.

> Or rather we need to have a default TREE_TARGET_OPTION node for the
> offload target which we'd need to set - how would you otherwise transfer
> different offload target options to the offload compile? How do you
> transfer offload target options to the offload compile at all?

ABI options are transferred via the -foffload-abi mechanism. No other
target options can be transferred.

> I think this just shows conceptual issues with the LTO approach...

I don't think running into a few problems demonstrates a conceptual problem
when it works fine with some fairly small patches.

Bernd
[PATCH] PR lto/63968: 175.vpr from cpu2000 fails to build with LTO
Hello.

When I reimplemented fibheap as a C++ template, Honza told me that the replace_key method actually supports only the decrement operation. The old implementation silently gives no feedback if we try to increase a key:

fibheap.c:
...
  /* If we wanted to, we could actually do a real increase by redeleting and
     inserting. However, this would require O (log n) time. So just bail out
     for now. */
  if (fibheap_comp_data (heap, key, data, node) > 0)
    return NULL;
...

My reimplementation added an assert for this kind of operation, and as this PR shows, we do try to increase a key in bb-reorder. Thus, I added a fibonacci_heap::replace_key method that can increase the key (it deletes the node, and the new key is then associated with the node).

The patch bootstraps on x86_64-linux-pc and introduces no new regressions. Could someone tell me whether the increase operation in bb-reorder is valid or not?

Thanks,
Martin

gcc/ChangeLog:

2014-11-20  Martin Liska  mli...@suse.cz

        * bb-reorder.c (find_traces_1_round): decrease_key is replaced with
        replace_key method.
        * fibonacci_heap.h (fibonacci_heap::insert): New argument.
        (fibonacci_heap::replace_key_data): Likewise.
        (fibonacci_heap::replace_key): New method that can even increment key,
        this operation costs O(log N).
        (fibonacci_heap::extract_min): New argument.
        (fibonacci_heap::delete_node): Likewise.
diff --git a/gcc/bb-reorder.c b/gcc/bb-reorder.c
index 689d7b6..b568114 100644
--- a/gcc/bb-reorder.c
+++ b/gcc/bb-reorder.c
@@ -644,7 +644,7 @@ find_traces_1_round (int branch_th, int exec_th, gcov_type count_th,
                                (long) bbd[e->dest->index].node->get_key (), key);
                     }
 
-                  bbd[e->dest->index].heap->decrease_key
+                  bbd[e->dest->index].heap->replace_key
                     (bbd[e->dest->index].node, key);
                 }
             }
@@ -812,7 +812,7 @@ find_traces_1_round (int branch_th, int exec_th, gcov_type count_th,
                            e->dest->index,
                            (long) bbd[e->dest->index].node->get_key (), key);
                 }
-              bbd[e->dest->index].heap->decrease_key
+              bbd[e->dest->index].heap->replace_key
                 (bbd[e->dest->index].node, key);
             }
         }
diff --git a/gcc/fibonacci_heap.h b/gcc/fibonacci_heap.h
index ecb92f8..3fce370 100644
--- a/gcc/fibonacci_heap.h
+++ b/gcc/fibonacci_heap.h
@@ -183,20 +183,27 @@ public:
   }
 
   /* For given NODE, set new KEY value. */
-  K decrease_key (fibonacci_node_t *node, K key)
+  K replace_key (fibonacci_node_t *node, K key)
   {
     K okey = node->m_key;
-    gcc_assert (key <= okey);
 
     replace_key_data (node, key, node->m_data);
     return okey;
   }
 
+  /* For given NODE, decrease value to new KEY. */
+  K decrease_key (fibonacci_node_t *node, K key)
+  {
+    gcc_assert (key <= node->m_key);
+    return replace_key (node, key);
+  }
+
   /* For given NODE, set new KEY and DATA value. */
   V *replace_key_data (fibonacci_node_t *node, K key, V *data);
 
-  /* Extract minimum node in the heap. */
-  V *extract_min ();
+  /* Extract minimum node in the heap. If RELEASE is specified,
+     memory is released. */
+  V *extract_min (bool release = true);
 
   /* Return value associated with minimum node in the heap. */
   V *min ()
@@ -214,12 +221,15 @@ public:
   }
 
   /* Delete NODE in the heap. */
-  V *delete_node (fibonacci_node_t *node);
+  V *delete_node (fibonacci_node_t *node, bool release = true);
 
   /* Union the heap with HEAPB. */
   fibonacci_heap *union_with (fibonacci_heap *heapb);
 
 private:
+  /* Insert new NODE given by KEY and DATA associated with the key. */
+  fibonacci_node_t *insert (fibonacci_node_t *node, K key, V *data);
+
   /* Insert it into the root list. */
   void insert_root (fibonacci_node_t *node);
 
@@ -322,6 +332,15 @@ fibonacci_heap<K,V>::insert (K key, V *data)
   /* Create the new node. */
   fibonacci_node<K,V> *node = new fibonacci_node_t ();
 
+  return insert (node, key, data);
+}
+
+/* Insert new NODE given by KEY and DATA associated with the key. */
+
+template<class K, class V>
+fibonacci_node<K,V>*
+fibonacci_heap<K,V>::insert (fibonacci_node_t *node, K key, V *data)
+{
   /* Set the node's data. */
   node->m_data = data;
   node->m_key = key;
@@ -345,17 +364,22 @@ V*
 fibonacci_heap<K,V>::replace_key_data (fibonacci_node<K,V> *node, K key,
                                        V *data)
 {
-  V *odata;
   K okey;
   fibonacci_node<K,V> *y;
+  V *odata = node->m_data;
 
-  /* If we wanted to, we could actually do a real increase by redeleting and
-     inserting. However, this would require O (log n) time. So just bail out
-     for now. */
+  /* If we wanted to, we do a real increase by redeleting and
+     inserting. */
   if (node->compare_data (key) > 0)
-    return NULL;
+    {
+      delete_node (node, false);
+
+      node = new (node) fibonacci_node_t ();
+      insert (node, key, data);
+
+      return odata;
+    }
 
-  odata = node->m_data;
   okey = node->m_key;
   node->m_data = data;
   node->m_key = key;
@@ -385,7 +409,7 @@ fibonacci_heap<K,V>::replace_key_data
[PATCH] Fix target/63977
My mistake yesterday. I thought I'd tested both x86_64 -m64/-m32, but not so.

Anyway, as the comment says, the backend keeps querying the static chain, and if you don't early out, it sets ix86_static_chain_on_stack, at which point the setting is permanent and affects prologue generation, and not in a good way.

Tested i686-linux and committed.

r~

        PR target/63977
        * config/i386/i386.c (ix86_static_chain): Reinstate the check
        for DECL_STATIC_CHAIN.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index fffddfc..6c8dbd6 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -27360,6 +27360,12 @@ ix86_static_chain (const_tree fndecl_or_type, bool incoming_p)
 {
   unsigned regno;
 
+  /* While this function won't be called by the middle-end when a static
+     chain isn't needed, it's also used throughout the backend so it's
+     easiest to keep this check centralized. */
+  if (DECL_P (fndecl_or_type) && !DECL_STATIC_CHAIN (fndecl_or_type))
+    return NULL;
+
   if (TARGET_64BIT)
     {
       /* We always use R10 in 64-bit mode. */
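The failure mode described above (a query function that sets a permanent flag as a side effect) can be modelled in a few lines. This is only an illustrative sketch, not GCC code: fn_info, backend_state, static_chain_regno, the have_free_reg parameter and the returned numbers are all made up for the example.

```cpp
// Hypothetical model of a query with a sticky side effect.
struct fn_info
{
  bool needs_static_chain;   // stands in for DECL_STATIC_CHAIN
};

struct backend_state
{
  bool chain_on_stack;       // stands in for ix86_static_chain_on_stack
};

// Returns a made-up register number for the static chain, or -1 when the
// function has no static chain at all.  HAVE_FREE_REG models whether a
// register is available for it.
int
static_chain_regno (const fn_info &fn, bool have_free_reg, backend_state &st)
{
  // Early out.  Without this check, a query for a function that has no
  // static chain could still reach the fallback below and set the flag.
  if (!fn.needs_static_chain)
    return -1;

  if (have_free_reg)
    return 2;

  // Fallback: the chain lives on the stack.  Once set, the flag stays set
  // and later influences prologue generation.
  st.chain_on_stack = true;
  return 1;
}
```

Keeping the check inside the query, rather than at every call site, is the "centralized" choice the comment in the patch refers to.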
[Ada] Spurious errors on extension aggregate for limited type
This patch fixes two errors in the handling of extension aggregates for limited types: the ancestor part of an extension aggregate can itself be an extension aggregate, as well as a function call that is rewritten as a reference.

The following must compile quietly:

   gcc -c p2.adb
   gcc -c bugzilla.ads

---
package body P1 is
   function Create return T1 is
   begin
      return (Length => 3);
   end Create;
end P1;
---
package P1 is
   type T1 is tagged limited private;
   function Create return T1;
private
   type T1 (Length : Positive := 3) is tagged limited null record;
end P1;
---
with P1;
package P2 is
   type T2 is limited new P1.T1 with null record;
   function Create return T2;
end P2;
---
package body P2 is
   function Create return T2 is
   begin
      return (P1.Create with null record);
   end Create;
end P2;
---
with Ada.Finalization;
package Bugzilla is
   type T1 is limited new Ada.Finalization.Limited_Controlled
     with null record;
   type T2 is new T1 with null record;
   X : T2 := (T1 with null record);
   Z : T2 := (T1'(Ada.Finalization.Limited_Controlled with null record)
                with null record);
end Bugzilla;

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-11-20  Ed Schonberg  schonb...@adacore.com

        * sem_aggr.adb (Valid_Limited_Ancestor): Ancestor part of extension
        aggregate can itself be an extension aggregate, as well as a
        call that is rewritten as a reference.

Index: sem_aggr.adb
===================================================================
--- sem_aggr.adb (revision 217828)
+++ sem_aggr.adb (working copy)
@@ -2663,12 +2663,19 @@
       function Valid_Limited_Ancestor (Anc : Node_Id) return Boolean is
       begin
-         if Is_Entity_Name (Anc)
-           and then Is_Type (Entity (Anc))
+         if Is_Entity_Name (Anc) and then Is_Type (Entity (Anc)) then
+            return True;
+
+         --  The ancestor must be a call or an aggregate, but a call may
+         --  have been expanded into a temporary, so check original node.
+
+         elsif Nkind_In (Anc, N_Aggregate,
+                              N_Extension_Aggregate,
+                              N_Function_Call)
          then
             return True;
 
-         elsif Nkind_In (Anc, N_Aggregate, N_Function_Call) then
+         elsif Nkind (Original_Node (Anc)) = N_Function_Call then
             return True;
 
          elsif Nkind (Anc) = N_Attribute_Reference
[Ada] Inter-unit inlining of expression functions with -gnatn1
This enables inter-unit inlining of expression functions with -gnatn1, or more simply with -O1/-O2 -gnatn. These functions are automatically candidates for inlining, but they were actually inlined across units only with -gnatn2, or more simply -O3 -gnatn.

The following program must compile without warnings with -O -gnatn -Winline:

with Q; use Q;
procedure P (I : Integer) is
begin
   if Process (I) /= 2 * I then
      raise Program_Error;
   end if;
end;

package Q is
   function Process (I : Integer) return Integer;
   pragma Inline (Process);
end Q;

with R; use R;
package body Q is
   function Process (I : Integer) return Integer is
   begin
      return Process2 (I) + Process3 (I);
   end;
end Q;

package R is
   function Process2 (I : Integer) return Integer;
   function Process3 (I : Integer) return Integer is (I);
private
   function Process2 (I : Integer) return Integer is (I);
end R;

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-11-20  Eric Botcazou  ebotca...@adacore.com

        * inline.adb (Add_Inlined_Subprogram): Insert all programs
        generated as a body or whose declaration was provided along with
        the body.

Index: inline.adb
===================================================================
--- inline.adb (revision 217842)
+++ inline.adb (working copy)
@@ -454,6 +454,7 @@
    procedure Add_Inlined_Subprogram (Index : Subp_Index) is
       E    : constant Entity_Id := Inlined.Table (Index).Name;
+      Decl : constant Node_Id   := Parent (Declaration_Node (E));
       Pack : constant Entity_Id := Get_Code_Unit_Entity (E);
 
       procedure Register_Backend_Inlined_Subprogram (Subp : Entity_Id);
@@ -486,14 +487,17 @@
    begin
       --  If the subprogram is to be inlined, and if its unit is known to be
       --  inlined or is an instance whose body will be analyzed anyway or the
-      --  subprogram has been generated by the compiler, and if it is declared
+      --  subprogram was generated as a body by the compiler (for example an
+      --  initialization procedure) or its declaration was provided along with
+      --  the body (for example an expression function), and if it is declared
       --  at the library level not in the main unit, and if it can be inlined
       --  by the back-end, then insert it in the list of inlined subprograms.
 
       if Is_Inlined (E)
         and then (Is_Inlined (Pack)
                     or else Is_Generic_Instance (Pack)
-                    or else Is_Internal (E))
+                    or else Nkind (Decl) = N_Subprogram_Body
+                    or else Present (Corresponding_Body (Decl)))
         and then not In_Main_Unit_Or_Subunit (E)
         and then not Is_Nested (E)
         and then not Has_Initialized_Type (E)
[Ada] Type conversion to String causes Constraint_Error
This patch modifies the mechanism which creates a subtype from an arbitrary expression. The mechanism now captures the bounds of all index constraints when the expression is of an array type.

-- Source --

--  pack.ads

with Ada.Finalization; use Ada.Finalization;

package Pack is
   type Ctrl is new Controlled with record
      Flag : Boolean := False;
   end record;

   type New_String is new String;

   function Make_Ctrl return Ctrl;
   function Make_String (Val : String) return New_String;
end Pack;

--  pack.adb

package body Pack is
   function Make_Ctrl return Ctrl is
      Result : Ctrl;
   begin
      return Result;
   end Make_Ctrl;

   function Make_String (Val : String) return New_String is
   begin
      return New_String (Val);
   end Make_String;
end Pack;

--  pack2.ads

package Pack2 is
   procedure Reproduce;
end Pack2;

--  pack2.adb

with Ada.Text_IO; use Ada.Text_IO;
with Pack;        use Pack;

package body Pack2 is
   Str : constant New_String := Make_String ("Hello");
   Ctr : constant Ctrl := Make_Ctrl;

   procedure Reproduce is
   begin
      Put_Line (String (Str));
   end Reproduce;
end Pack2;

--  main.adb

with Pack2; use Pack2;

procedure Main is
begin
   Reproduce;
end Main;

-- Compilation and output --

$ gnatmake -q main.adb
$ ./main
Hello

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-11-20  Hristian Kirtchev  kirtc...@adacore.com

        * exp_util.adb (Make_Subtype_From_Expr): Capture the bounds of
        all index constraints when the expression is of an array type.
Index: exp_util.adb === --- exp_util.adb(revision 217854) +++ exp_util.adb(working copy) @@ -6399,22 +6399,24 @@ (E : Node_Id; Unc_Typ : Entity_Id) return Node_Id is + List_Constr : constant List_Id:= New_List; Loc : constant Source_Ptr := Sloc (E); - List_Constr : constant List_Id:= New_List; D : Entity_Id; + Full_Exp: Node_Id; + Full_Subtyp : Entity_Id; + High_Bound : Entity_Id; + Index_Typ : Entity_Id; + Low_Bound : Entity_Id; + Priv_Subtyp : Entity_Id; + Utyp: Entity_Id; - Full_Subtyp : Entity_Id; - Priv_Subtyp : Entity_Id; - Utyp : Entity_Id; - Full_Exp : Node_Id; - begin if Is_Private_Type (Unc_Typ) and then Has_Unknown_Discriminants (Unc_Typ) then - -- Prepare the subtype completion, Go to base type to - -- find underlying type, because the type may be a generic - -- actual or an explicit subtype. + -- Prepare the subtype completion. Use the base type to find the + -- underlying type because the type may be a generic actual or an + -- explicit subtype. Utyp:= Underlying_Type (Base_Type (Unc_Typ)); Full_Subtyp := Make_Temporary (Loc, 'C'); @@ -6451,22 +6453,67 @@ return New_Occurrence_Of (Priv_Subtyp, Loc); elsif Is_Array_Type (Unc_Typ) then + Index_Typ := First_Index (Unc_Typ); for J in 1 .. Number_Dimensions (Unc_Typ) loop -Append_To (List_Constr, - Make_Range (Loc, -Low_Bound = + +-- Capture the bounds of each index constraint in case the context +-- is an object declaration of an unconstrained type initialized +-- by a function call: + +--Obj : Unconstr_Typ := Func_Call; + +-- This scenario requires secondary scope management and the index +-- constraint cannot depend on the temporary used to capture the +-- result of the function call. + +--SS_Mark; +--Temp : Unconstr_Typ_Ptr := Func_Call'reference; +--subtype S is Unconstr_Typ (Temp.all'First .. Temp.all'Last); +--Obj : S := Temp.all; +--SS_Release; -- Temp is gone at this point, bounds of S are +-- -- non existent. 
+ +-- The bounds are kept as variables rather than constants because +-- this prevents spurious optimizations down the line. + +-- Generate: +--Low_Bound : Base_Type (Index_Typ) := E'First (J); + +Low_Bound := Make_Temporary (Loc, 'B'); +Insert_Action (E, + Make_Object_Declaration (Loc, +Defining_Identifier = Low_Bound, +Object_Definition = + New_Occurrence_Of (Base_Type (Etype (Index_Typ)), Loc), +Expression = Make_Attribute_Reference (Loc, -Prefix = Duplicate_Subexpr_No_Checks (E), +Prefix = Duplicate_Subexpr_No_Checks (E), Attribute_Name = Name_First, -
Re: [PATCH x86, PR60451] Expand even/odd permutation using pack insn.
On 11/20/2014 12:36 PM, Evgeny Stupachenko wrote:
> +  /* Required for pack. */
> +  if (!TARGET_SSE4_2 || d->one_operand_p)
> +    return false;

Why the SSE4_2 check here when...

> +
> +  /* Only V8HI, V16QI, V16HI and V32QI modes are more profitable than general
> +     shuffles. */
> +  if (d->vmode == V8HImode)
> +    {
> +      c = 0xffff;
> +      s = 16;
> +      half_mode = V4SImode;
> +      gen_and = gen_andv4si3;
> +      gen_pack = gen_sse4_1_packusdw;

... it's SSE4_1 here,

> +      gen_shift = gen_lshrv4si3;
> +    }
> +  else if (d->vmode == V16QImode)
> +    {
> +      c = 0xff;
> +      s = 8;
> +      half_mode = V8HImode;
> +      gen_and = gen_andv8hi3;
> +      gen_pack = gen_sse2_packuswb;

... and SSE2 here?

r~
[Ada] Debugging information for inlined predefined units
The compiler suppresses debugging information on predefined units that are inlined in the code, because stepping into run-time units often complicates debugging activity. We make an exception for calls that appear in the source, when the unit is part of the Ada hierarchy, to facilitate monitoring of storage management.

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-11-20  Ed Schonberg  schonb...@adacore.com

        * exp_ch6.adb (Expand_Call, Inlined_Subprogram): Do not suppress
        debugging information for a call to a predefined unit, if the
        call comes from source and the unit is in the Ada hierarchy.

Index: exp_ch6.adb
===================================================================
--- exp_ch6.adb (revision 217828)
+++ exp_ch6.adb (working copy)
@@ -3720,7 +3720,17 @@
                  (Unit_File_Name (Get_Source_Unit (Sloc (Subp))))
               and then In_Extended_Main_Source_Unit (N)
             then
-               Set_Needs_Debug_Info (Subp, False);
+               --  We make an exception for calls to the Ada hierarchy if call
+               --  comes from source, because some user applications need the
+               --  debugging information for such calls.
+
+               if Comes_From_Source (Call_Node)
+                 and then Name_Buffer (1 .. 2) = "a-"
+               then
+                  null;
+               else
+                  Set_Needs_Debug_Info (Subp, False);
+               end if;
             end if;
 
             --  Front end expansion of simple functions returning unconstrained
Re: [AArch64, Patch] Add range-check for Symbol + offset addressing.
On 14/11/14 17:33, Marcus Shawcroft wrote:
> On 14 November 2014 08:12, Tejas Belagod tejas.bela...@arm.com wrote:
>> 2014-11-14  Tejas Belagod  tejas.bela...@arm.com
>>
>> gcc/
>>         * config/aarch64/aarch64-protos.h (aarch64_classify_symbol):
>>         Fixup prototype.
>>         * config/aarch64/aarch64.c (aarch64_expand_mov_immediate,
>>         aarch64_cannot_force_const_mem, aarch64_classify_address,
>>         aarch64_classify_symbolic_expression): Fixup call to
>>         aarch64_classify_symbol.
>>         (aarch64_classify_symbol): Add range-checking for
>>         symbol + offset addressing for tiny and small models.
>>
>> testsuite/
>>         * gcc.target/aarch64/symbol-range.c: New.
>>         * gcc.target/aarch64/symbol-range-tiny.c: New.
>
> OK. Could you rustle up a back port ?

The same patch applies cleanly to 4.9. OK to commit?

Thanks,
Tejas.
Re: [PATCH] driver: ignore SIGINT while waiting on subprocesses to finish
Hi,

On Thu, 20 Nov 2014, Patrick Palka wrote:
>> -wrapper is specifically also for invoking cc1 with gdb from the driver
>> (that's the usecase documented with -wrapper), so it better should work as
>> intended. I don't know what problems Patrick had with that, though. For
>> me gcc -wrapper gdb,--args works as expected (as in ^C interrupts cc1
>> returning to gdb).
>
> Yes it does for me too. But pressing ^C in gdb while cc1 is not
> running (by accident or with intention, e.g. pressing ^C to quickly
> clear the command prompt) will kill the driver and gdb after it. It's
> not a huge problem but it does cause some inconvenience for users of
> -wrapper gdb.

Aha! Indeed that's quite ugly. I think fixing this would be appropriate.

Ciao,
Michael.
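As a rough illustration of the idea in the subject line (the parent ignores SIGINT while it waits, the child keeps the original disposition), here is a small POSIX sketch. It is an assumption about the general technique, not the driver's actual code; run_and_wait is a hypothetical helper name.

```cpp
#include <cerrno>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: run the program named by ARGV[0] and wait for it.
// While waiting, the parent ignores SIGINT, so a ^C typed at the terminal
// reaches the child (for example gdb) without killing the parent.
// Returns the child's exit status, or -1 on failure.
int
run_and_wait (char *const argv[])
{
  struct sigaction ignore_sa, old_sa;
  ignore_sa.sa_handler = SIG_IGN;
  sigemptyset (&ignore_sa.sa_mask);
  ignore_sa.sa_flags = 0;

  // Ignore SIGINT before forking so there is no window in which the
  // parent can still be interrupted.
  if (sigaction (SIGINT, &ignore_sa, &old_sa) != 0)
    return -1;

  pid_t pid = fork ();
  if (pid == 0)
    {
      // Child: an ignored signal stays ignored across exec, so restore
      // the original disposition first.
      sigaction (SIGINT, &old_sa, nullptr);
      execvp (argv[0], argv);
      _exit (127);
    }

  int result = -1;
  if (pid > 0)
    {
      int status = 0;
      pid_t r;
      do
        r = waitpid (pid, &status, 0);
      while (r < 0 && errno == EINTR);
      if (r == pid && WIFEXITED (status))
        result = WEXITSTATUS (status);
    }

  // Parent: put the original SIGINT disposition back.
  sigaction (SIGINT, &old_sa, nullptr);
  return result;
}
```

With this shape, pressing ^C while gdb sits at its prompt is handled by gdb alone; the waiting parent never sees a fatal SIGINT.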
Re: [PATCH] PR jit/63969: Fix segfault in error-handling when driver isn't found
On Wed, 2014-11-19 at 23:09 -0800, Mike Stump wrote:
> On Nov 19, 2014, at 10:23 PM, David Malcolm dmalc...@redhat.com wrote:
>> It's not clear to me if I can approve my own patches to the jit
>
> So, to answer that, we look at MAINTAINERS, and look up your name:
>
>                 Various Maintainers
>
> jit                     David Malcolm           dmalc...@redhat.com
>
> So, this means that you can review other peoples work and approve it
> for the jit code, and you can review and approve your own work for the
> jit code. This is the definition of Maintainer. If you had been
> listed under Reviewers, you would need approval for your own work.
>
> Now, that doesn’t mean, you can’t ask for a review for any reason you
> want. :-)

[CCing Jeff]

Indeed, but Jeff wrote in another thread:

> JIT space, yours to approve :-) We haven't formalized that yet, but
> it'd be silly to do anything else.

[ https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02514.html ]

...and that got me wondering if:

(A) there's an additional governance step here that should happen, or

(B) if I can go ahead and commit suitably tested patches that are confined to the:
      gcc/jit
      gcc/testsuite/jit.exp
    subdirectories (and approve other people's such patches), or

(C) both i.e. do (B) whilst (A) is pending

Thanks
Dave
[Ada] Improvements to handling of unchecked union discriminants
This patch avoids issuing a warning for a missing component clause for a discriminant in an unchecked union, and also avoids printing a line for such a component in the -gnatR2 output.

The following program:

with Interfaces;
procedure Test_Union is
   type Test_Type (Flag : Boolean) is
      record
         case Flag is
            when True =>
               Thing_1 : Interfaces.Unsigned_32;
            when False =>
               Thing_2 : Interfaces.Unsigned_32;
         end case;
      end record
      with Unchecked_Union;
   for Test_Type use
      record
         Thing_1 at 0 range 0 .. 31;
         Thing_2 at 0 range 0 .. 31;
      end record;
   pragma Unreferenced (Test_Type);
begin
   null;
end Test_Union;

compiles quietly with switches -gnatwa -gnatR2, and generates this representation output:

Representation information for unit Test_Union (body)

for Test_Type'Size use 32;
for Test_Type'Alignment use 4;
for Test_Type use record
   Thing_1 at 0 range 0 .. 31;
   Thing_2 at 0 range 0 .. 31;
end record;

Tested on x86_64-pc-linux-gnu, committed on trunk

2014-11-20  Robert Dewar  de...@adacore.com

        * repinfo.adb (List_Record_Info): Do not list discriminant in
        unchecked union.
        * sem_ch13.adb (Has_Good_Profile): Minor reformatting
        (Analyze_Stream_TSS_Definition): Minor reformatting
        (Analyze_Record_Representation_Clause): Do not issue warning for
        missing rep clause for discriminant in unchecked union.

Index: repinfo.adb
===================================================================
--- repinfo.adb (revision 217828)
+++ repinfo.adb (working copy)
@@ -847,37 +847,49 @@
       Comp := First_Component_Or_Discriminant (Ent);
       while Present (Comp) loop
-         Get_Decoded_Name_String (Chars (Comp));
-         Max_Name_Length := Natural'Max (Max_Name_Length, Name_Len);
 
-         Cfbit := Component_Bit_Offset (Comp);
+         --  Skip discriminant in unchecked union (since it is not there!)
- if Rep_Not_Constant (Cfbit) then -UI_Image_Length := 2; + if Ekind (Comp) = E_Discriminant + and then Is_Unchecked_Union (Ent) + then +null; + -- All other cases + else --- Complete annotation in case not done +Get_Decoded_Name_String (Chars (Comp)); +Max_Name_Length := Natural'Max (Max_Name_Length, Name_Len); -Set_Normalized_Position (Comp, Cfbit / SSU); -Set_Normalized_First_Bit (Comp, Cfbit mod SSU); +Cfbit := Component_Bit_Offset (Comp); -Sunit := Cfbit / SSU; -UI_Image (Sunit); - end if; +if Rep_Not_Constant (Cfbit) then + UI_Image_Length := 2; - -- If the record is not packed, then we know that all fields whose - -- position is not specified have a starting normalized bit position - -- of zero. +else + -- Complete annotation in case not done - if Unknown_Normalized_First_Bit (Comp) - and then not Is_Packed (Ent) - then -Set_Normalized_First_Bit (Comp, Uint_0); + Set_Normalized_Position (Comp, Cfbit / SSU); + Set_Normalized_First_Bit (Comp, Cfbit mod SSU); + + Sunit := Cfbit / SSU; + UI_Image (Sunit); +end if; + +-- If the record is not packed, then we know that all fields +-- whose position is not specified have a starting normalized +-- bit position of zero. + +if Unknown_Normalized_First_Bit (Comp) + and then not Is_Packed (Ent) +then + Set_Normalized_First_Bit (Comp, Uint_0); +end if; + +Max_Suni_Length := + Natural'Max (Max_Suni_Length, UI_Image_Length); end if; - Max_Suni_Length := - Natural'Max (Max_Suni_Length, UI_Image_Length); - Next_Component_Or_Discriminant (Comp); end loop; @@ -885,6 +897,17 @@ Comp := First_Component_Or_Discriminant (Ent); while Present (Comp) loop + + -- Skip discriminant in unchecked union (since it is not there!) 
+ + if Ekind (Comp) = E_Discriminant + and then Is_Unchecked_Union (Ent) + then +goto Continue; + end if; + + -- All other cases + declare Esiz : constant Uint := Esize (Comp); Bofs : constant Uint := Component_Bit_Offset (Comp); Index: sem_ch13.adb === --- sem_ch13.adb(revision 217857) +++ sem_ch13.adb(working copy) @@ -3555,7 +3555,7 @@ if Base_Type (Typ) = Base_Type (Ent) or else (Is_Class_Wide_Type (Typ)
Re: [COMMITTED 1/3] Make TARGET_STATIC_CHAIN allow a function type
On 11/19/2014 08:56 PM, H.J. Lu wrote:
> On Wed, Nov 19, 2014 at 10:04 AM, Jakub Jelinek ja...@redhat.com wrote:
> On Wed, Nov 19, 2014 at 03:58:50PM +0100, Richard Henderson wrote:
> As opposed to always being a decl. This is a prerequisite to allowing
> the static chain to be loaded for indirect calls.
>
>         * targhooks.c (default_static_chain): Remove check for
>         DECL_STATIC_CHAIN.
>         * config/moxie/moxie.c (moxie_static_chain): Likewise.
>         * config/i386/i386.c (ix86_static_chain): Allow decl or type
>         as the first argument.
>         * config/xtensa/xtensa.c (xtensa_static_chain): Change the name
>         of the unused first parameter.
>         * doc/tm.texi (TARGET_STATIC_CHAIN): Document the first parameter
>         may be a type.
>         * target.def (static_chain): Likewise.
>
> r217769 broke lots of tests on i686-linux...

Guh. I thought I tested both multilibs from x86_64, but I guess not.

Anyway, fixed as the comment describes.

r~

        PR target/63977
        * config/i386/i386.c (ix86_static_chain): Reinstate the check
        for DECL_STATIC_CHAIN.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index fffddfc..6c8dbd6 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -27360,6 +27360,12 @@ ix86_static_chain (const_tree fndecl_or_type, bool incoming_p)
 {
   unsigned regno;
 
+  /* While this function won't be called by the middle-end when a static
+     chain isn't needed, it's also used throughout the backend so it's
+     easiest to keep this check centralized. */
+  if (DECL_P (fndecl_or_type) && !DECL_STATIC_CHAIN (fndecl_or_type))
+    return NULL;
+
   if (TARGET_64BIT)
     {
       /* We always use R10 in 64-bit mode. */
Re: [PATCH x86, PR60451] Expand even/odd permutation using pack insn.
Good point! gen_shift also requires only SSE2. That way we can optimize out interleave sequence for V16QI mode in expand_vec_perm_even_odd_1. Thanks! Evgeny Updated patch: diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index 085eb54..054089b 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -48322,6 +48322,127 @@ expand_vec_perm_vpshufb2_vpermq_even_odd (struct expand_vec_perm_d *d) return true; } +/* A subroutine of expand_vec_perm_even_odd_1. Implement extract-even + and extract-odd permutations of two V16QI, V8HI, V16HI or V32QI operands + with two and and pack or two shift and pack insns. We should + have already failed all two instruction sequences. */ + +static bool +expand_vec_perm_even_odd_pack (struct expand_vec_perm_d *d) +{ + rtx op, dop0, dop1, t, rperm[16]; + unsigned i, odd, c, s, nelt = d-nelt; + bool end_perm = false; + machine_mode half_mode; + rtx (*gen_and) (rtx, rtx, rtx); + rtx (*gen_pack) (rtx, rtx, rtx); + rtx (*gen_shift) (rtx, rtx, rtx); + + if (d-one_operand_p) +return false; + + switch (d-vmode) +{ +case V8HImode: + /* Required for pack. */ + if (!TARGET_SSE4_1) +return false; + c = 0x; + s = 16; + half_mode = V4SImode; + gen_and = gen_andv4si3; + gen_pack = gen_sse4_1_packusdw; + gen_shift = gen_lshrv4si3; + break; +case V16QImode: + /* No check as all instructions are SSE2. 
*/ + c = 0xff; + s = 8; + half_mode = V8HImode; + gen_and = gen_andv8hi3; + gen_pack = gen_sse2_packuswb; + gen_shift = gen_lshrv8hi3; + break; +case V16HImode: + if (!TARGET_AVX2) +return false; + c = 0x; + s = 16; + half_mode = V8SImode; + gen_and = gen_andv8si3; + gen_pack = gen_avx2_packusdw; + gen_shift = gen_lshrv8si3; + end_perm = true; + break; +case V32QImode: + if (!TARGET_AVX2) +return false; + c = 0xff; + s = 8; + half_mode = V16HImode; + gen_and = gen_andv16hi3; + gen_pack = gen_avx2_packuswb; + gen_shift = gen_lshrv16hi3; + end_perm = true; + break; +default: + /* Only V8HI, V16QI, V16HI and V32QI modes are more profitable than +general shuffles. */ + return false; +} + + /* Check that permutation is even or odd. */ + odd = d-perm[0]; + if (odd 1) +return false; + + for (i = 1; i nelt; ++i) +if (d-perm[i] != 2 * i + odd) + return false; + + if (d-testing_p) +return true; + + dop0 = gen_reg_rtx (half_mode); + dop1 = gen_reg_rtx (half_mode); + if (odd == 0) +{ + for (i = 0; i nelt / 2; i++) + rperm[i] = GEN_INT (c); + t = gen_rtx_CONST_VECTOR (half_mode, gen_rtvec_v (nelt / 2, rperm)); + t = force_reg (half_mode, t); + emit_insn (gen_and (dop0, t, gen_lowpart (half_mode, d-op0))); + emit_insn (gen_and (dop1, t, gen_lowpart (half_mode, d-op1))); +} + else +{ + emit_insn (gen_shift (dop0, + gen_lowpart (half_mode, d-op0), + GEN_INT (s))); + emit_insn (gen_shift (dop1, + gen_lowpart (half_mode, d-op1), + GEN_INT (s))); +} + /* In AVX2 for 256 bit case we need to permute pack result. */ + if (TARGET_AVX2 end_perm) +{ + op = gen_reg_rtx (d-vmode); + t = gen_reg_rtx (V4DImode); + emit_insn (gen_pack (op, dop0, dop1)); + emit_insn (gen_avx2_permv4di_1 (t, + gen_lowpart (V4DImode, op), + const0_rtx, + const2_rtx, + const1_rtx, + GEN_INT (3))); + emit_move_insn (d-target, gen_lowpart (d-vmode, t)); +} + else +emit_insn (gen_pack (d-target, dop0, dop1)); + + return true; +} + /* A subroutine of ix86_expand_vec_perm_builtin_1. 
Implement extract-even and extract-odd permutations. */ @@ -48393,7 +48514,9 @@ expand_vec_perm_even_odd_1 (struct expand_vec_perm_d *d, unsigned odd) gcc_unreachable (); case V8HImode: - if (TARGET_SSSE3 !TARGET_SLOW_PSHUFB) + if (TARGET_SSE4_1) + return expand_vec_perm_even_odd_pack (d); + else if (TARGET_SSSE3 !TARGET_SLOW_PSHUFB) return expand_vec_perm_pshufb2 (d); else { @@ -48416,32 +48539,11 @@ expand_vec_perm_even_odd_1 (struct expand_vec_perm_d *d, unsigned odd) break; case V16QImode: - if (TARGET_SSSE3 !TARGET_SLOW_PSHUFB) - return expand_vec_perm_pshufb2 (d); - else - { - if (d-testing_p) - break; - t1 = gen_reg_rtx (V16QImode); - t2 = gen_reg_rtx (V16QImode); - t3 = gen_reg_rtx (V16QImode); - emit_insn (gen_vec_interleave_highv16qi (t1, d-op0, d-op1)); - emit_insn (gen_vec_interleave_lowv16qi
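To make the "and + pack" / "shift + pack" trick from expand_vec_perm_even_odd_pack concrete, here is a portable scalar model of the V16QI case. It is only a sketch: the helper names are hypothetical, the vectors are plain arrays rather than machine registers, and a little-endian lane layout (as on x86) is assumed.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

typedef std::array<std::uint8_t, 16> v16qi;   // 16 bytes
typedef std::array<std::uint16_t, 8> v8hi;    // same bits as 8 halfwords

// Reinterpret 16 bytes as 8 little-endian 16-bit lanes (the role that
// gen_lowpart (half_mode, ...) plays in the patch).
v8hi
as_v8hi (const v16qi &v)
{
  v8hi r;
  for (std::size_t i = 0; i < 8; i++)
    r[i] = static_cast<std::uint16_t> (v[2 * i] | (v[2 * i + 1] << 8));
  return r;
}

// Narrow one signed 16-bit lane to 8 bits with unsigned saturation.
std::uint8_t
saturate_us (std::uint16_t lane)
{
  std::int16_t s = static_cast<std::int16_t> (lane);
  if (s < 0)
    return 0;
  if (s > 255)
    return 255;
  return static_cast<std::uint8_t> (s);
}

// Model of a packuswb-style pack: lanes of A fill the low half of the
// result, lanes of B the high half.
v16qi
pack_us (const v8hi &a, const v8hi &b)
{
  v16qi r;
  for (std::size_t i = 0; i < 8; i++)
    {
      r[i] = saturate_us (a[i]);
      r[i + 8] = saturate_us (b[i]);
    }
  return r;
}

// Extract the even (ODD == false) or odd (ODD == true) bytes of the
// concatenation OP0:OP1.  Even bytes are isolated with an AND mask (c = 0xff),
// odd bytes with a logical right shift (s = 8).  Either way every lane ends
// up in 0..255, so the saturating pack cannot clip anything.
v16qi
extract_even_odd (const v16qi &op0, const v16qi &op1, bool odd)
{
  v8hi d0 = as_v8hi (op0);
  v8hi d1 = as_v8hi (op1);
  for (std::size_t i = 0; i < 8; i++)
    {
      if (odd)
        {
          d0[i] = static_cast<std::uint16_t> (d0[i] >> 8);
          d1[i] = static_cast<std::uint16_t> (d1[i] >> 8);
        }
      else
        {
          d0[i] = static_cast<std::uint16_t> (d0[i] & 0xff);
          d1[i] = static_cast<std::uint16_t> (d1[i] & 0xff);
        }
    }
  return pack_us (d0, d1);
}
```

The V8HI case in the patch follows the same pattern with 32-bit lanes, a 16-bit mask and a shift by 16.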
Re: [PATCH] gcc/ubsan.c: Extend 'pretty_name' space to avoid memory overflow
On 11/17/14 18:52, Chen Gang wrote: What you said sound reasonable to me. I shall try to send patch v2 within this week (use pretty_printer). Thanks. On 11/17/14 16:15, Marek Polacek wrote: On Mon, Nov 17, 2014 at 08:38:19AM +0100, Jakub Jelinek wrote: I think easiest would be to rewrite the code so that it uses pretty_printer to construct the string, grep asan.c for asan_pp . Or obstacks, but you don't have a printer to print integers into it easily. if (dom TREE_CODE (TYPE_MAX_VALUE (dom)) == INTEGER_CST) pos += sprintf (pretty_name[pos], HOST_WIDE_INT_PRINT_DEC, tree_to_uhwi (TYPE_MAX_VALUE (dom)) + 1); else /* ??? We can't determine the variable name; print VLA unspec. */ pretty_name[pos++] = '*'; looks wrong anyway, as not all integers fit into uhwi. Guess you could use wide_int to add 1 there and pp_wide_int. I have finish use pretty_print instead of normal sprintf, but for above case, after I tried to use wide_int, it can not pass testsuite, please help check whether what I have done is correct or not: For make check-gcc RUNTESTFLAGS=ubsan.exp. - Simply replace is OK: pp_printf (pretty_name, HOST_WIDE_INT_PRINT_DEC, tree_to_uhwi (TYPE_MAX_VALUE (dom)) + 1); - But use pp_wide_int for wide_int, will cause issue: pp_wide_int(pretty_name, wi::add (wi::max_value (dom), 1), TYPE_SIGN (TREE_TYPE (dom))); - The related issues are: Running /upstream/gcc-new-x86/gcc/testsuite/gcc.dg/ubsan/ubsan.exp ... 
FAIL: c-c++-common/ubsan/bounds-2.c -O0 output pattern test FAIL: c-c++-common/ubsan/bounds-2.c -O1 output pattern test FAIL: c-c++-common/ubsan/bounds-2.c -O2 output pattern test FAIL: c-c++-common/ubsan/bounds-2.c -O3 -fomit-frame-pointer output pattern test FAIL: c-c++-common/ubsan/bounds-2.c -O3 -fomit-frame-pointer -funroll-loops output pattern test FAIL: c-c++-common/ubsan/bounds-2.c -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions output pattern test FAIL: c-c++-common/ubsan/bounds-2.c -O3 -g output pattern test FAIL: c-c++-common/ubsan/bounds-2.c -Os output pattern test FAIL: c-c++-common/ubsan/bounds-2.c -O2 -flto -fno-use-linker-plugin -flto-partition=none output pattern test FAIL: c-c++-common/ubsan/bounds-2.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects output pattern test FAIL: c-c++-common/ubsan/bounds-5.c -O0 output pattern test FAIL: c-c++-common/ubsan/bounds-5.c -O1 output pattern test FAIL: c-c++-common/ubsan/bounds-5.c -O2 output pattern test FAIL: c-c++-common/ubsan/bounds-5.c -O3 -fomit-frame-pointer output pattern test FAIL: c-c++-common/ubsan/bounds-5.c -O3 -fomit-frame-pointer -funroll-loops output pattern test FAIL: c-c++-common/ubsan/bounds-5.c -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions output pattern test FAIL: c-c++-common/ubsan/bounds-5.c -O3 -g output pattern test FAIL: c-c++-common/ubsan/bounds-5.c -Os output pattern test FAIL: c-c++-common/ubsan/bounds-5.c -O2 -flto -fno-use-linker-plugin -flto-partition=none output pattern test FAIL: c-c++-common/ubsan/bounds-5.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects output pattern test FAIL: c-c++-common/ubsan/bounds-7.c -O0 output pattern test FAIL: c-c++-common/ubsan/bounds-7.c -O1 output pattern test FAIL: c-c++-common/ubsan/bounds-7.c -O2 output pattern test FAIL: c-c++-common/ubsan/bounds-7.c -O3 -fomit-frame-pointer output pattern test FAIL: c-c++-common/ubsan/bounds-7.c -O3 -fomit-frame-pointer -funroll-loops output pattern test FAIL: 
c-c++-common/ubsan/bounds-7.c -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions output pattern test FAIL: c-c++-common/ubsan/bounds-7.c -O3 -g output pattern test FAIL: c-c++-common/ubsan/bounds-7.c -Os output pattern test FAIL: c-c++-common/ubsan/bounds-7.c -O2 -flto -fno-use-linker-plugin -flto-partition=none output pattern test FAIL: c-c++-common/ubsan/bounds-7.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects output pattern test Thanks. -- Chen Gang Open, share, and attitude like air, water, and life which God blessed
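The remark above that not all integers fit into uhwi can be illustrated outside GCC. What follows is only a minimal sketch under stated assumptions: `element_count` and `naive_element_count` are hypothetical helpers, not GCC functions, and the sketch works on decimal digits instead of `wide_int`. It shows that adding 1 to the largest 64-bit bound needs more than 64 bits, so the naive addition wraps.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper, not part of GCC: returns max_index + 1 as decimal text.
// Working on the digits avoids the wrap-around that max_index + 1 would hit
// when max_index is already the largest 64-bit value.
std::string element_count(std::uint64_t max_index) {
    std::string digits = std::to_string(max_index);
    // Add one to the last digit and propagate the carry to the left.
    for (std::size_t i = digits.size(); i-- > 0;) {
        if (digits[i] != '9') {
            ++digits[i];
            return digits;
        }
        digits[i] = '0';
    }
    // Every digit was 9, so the result gains a leading 1.
    return "1" + digits;
}

// The naive computation, kept for comparison: unsigned arithmetic wraps.
std::uint64_t naive_element_count(std::uint64_t max_index) {
    return max_index + 1;
}
```

For an ordinary bound both helpers agree; only at the top of the 64-bit range does the naive version lose the carry, which is the case a wider type is meant to cover.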
Re: [PATCH] microblaze: microblaze.md: Use 'SI' instead of 'VOID' for operand 1 of 'call_value_intern'
Oh, sorry, after running for more than 10 days, qemu crashed :-(

After checking the output log and comparing it with the original log, we know that more than 90% of the tests have finished, and the results are OK (no new issues).

I guess the reason is that I started too many other things on this machine. Next, I shall try to analyze the deadlock of the cross-compiled Linux kernel. After finishing that analysis, I shall restart the test. I guess it needs 12-13 days (more than the week I originally expected).

Thanks.

On 11/9/14 21:15, Chen Gang wrote:

At present, I use a simplified sshd, ssh and scp (the open source program dropbear) to communicate with microblaze qemu successfully, so that gcc 'make check' has real effect.

It is just testing; at least after almost 10 hours, the log output is OK. Each ssh login wastes 10 - 20 seconds, so I guess the make check may run for a week!!

The recent operations are below:

 - zlib (for dropbear):

   export CHOST=microblaze-gchen-linux
   export PATH=/upstream/release/bin:$PATH
   ./configure --prefix=/upstream/release
   make
   make install

 - dropbear (it is a simple sshd, ssh and scp):

   export PATH=/upstream/release/bin:$PATH
   ./configure --prefix=/upstream/release \
     --host=microblaze-gchen-linux \
     CC=microblaze-gchen-linux-gcc

   modify /ustream/release/include/stdio.h to avoid redefining sscanf to iso99_sscanf

   link libz.a (static lib) into 'dropbear' (sshd) and 'dbclient' (ssh), and run make scp to generate the 'scp' command.

   to support the 'none' username: under ramfs, echo 'none:x:0:0:none:/:/bin/sh' ./etc/passwd

   to support logins without passwords (a temporary fix): modify common-session.c: ses.authstate.pw_passwd[0] = '\0';

   put 'dropbear', 'dbclient' and 'scp' into ./sbin of ramfs, with symbolic links in ./usr/bin of ramfs.

   as a temporary fix for its stability issue, modify the code to let it 'fork' as soon as it starts up, and permit only one child connection at a time.

   usage:

     in microblaze qemu:

       /sbin/dropbear -F -E -B -p 192.168.122.2:22

     on the host (x86_64), the system scp and ssh work (without password):

       ssh none@192.168.122.2 cd /test; ./test
       scp test.c none@192.168.122.2:/test/
       scp none@192.168.122.2:/test/* ./

 - For dejagnu:

   need `echo 192.168.122.2 microblaze-xilinx-gdb /etc/hosts`

   under /usr/local/share/dejagnu/*, use ssh/scp instead of rsh/rcp (all of them need to be replaced, or make check will fail).

   for /usr/local/share/dejagnu/baseboards/microblaze-xilinx-gdb.exp, add these additional variables:

     set_board_info sockethost 192.168.122.2
     set_board_info username none
     set timeout 600

The remaining issues are:

 - A Linux kernel built by the current upstream microblaze toolchain deadlocks. I shall analyze it (I guess it may be an issue in the kernel itself, possibly caused by include/compiler-gcc5.h).

 - One patch for the qemu microblaze dtb file, currently being checked by the related members (originally I thought it was a kernel issue, but after communicating with kernel members, it is more suitable to change qemu).

 - One or more issues in dropbear (at least including stability issues), and one or more issues in glibc. Sorry that I have to bypass them, since I do not have enough time for them.

Any ideas, suggestions or completions are welcome.

Thanks.

On 11/01/2014 01:07 AM, Chen Gang wrote:

At present, I use telnet (without password) to log in to microblaze qemu successfully! :-)

 - I compiled busybox with the glibc in the original 'ramfs' to get telnetd: replace the old busybox with the new one, and add a symbolic link 'telnetd' to busybox in /bin.

 - configure qemu with network support (device xlnx.xps-ethernetlite).
   yum install libvirt
   yum install tunctl
   tunctl -b
   ip link set tap0 up
   brctl addif virbr0 tap0
   ./microblaze-softmmu/qemu-system-microblaze -M petalogix-s3adsp1800 \
     -kernel ../linux-stable.microblaze/arch/microblaze/boot/linux.bin \
     -no-reboot -append console=ttyUL0,115200 doreboot -nographic \
     -net nic,vlan=0,model=xlnx.xps-ethernetlite,macaddr=00:16:35:AF:94:00 \
     -net tap,vlan=0,ifname=tap0,script=no,downscript=no

 - fix a kernel bug: add xlnx,xps-ethernetlite-2.00.b for compatibility with its firmware (it can be found under /sys/firmware/compatible, within the microblaze qemu bash environment). Related diff:

diff --git a/drivers/net/ethernet/xilinx/xilinx_emaclite.c b/drivers/net/ethernet/xilinx/xilinx_emaclite.c
index 28dbbdc..298fad3 100644
--- a/drivers/net/ethernet/xilinx/xilinx_emaclite.c
+++ b/drivers/net/ethernet/xilinx/xilinx_emaclite.c
@@ -1236,6 +1236,7 @@ static struct of_device_id xemaclite_of_match[] = {
 	{ .compatible = "xlnx,opb-ethernetlite-1.01.b", },
 	{ .compatible = "xlnx,xps-ethernetlite-1.00.a", },
 	{ .compatible =
[PATCH][doc] Document cortex-a17 and cortex-a17.cortex-a7 -m{cpu,tune} options
Hi all, As Joseph reminded, new -mcpu options should be documented in invoke.texi. This adds the documentation for the cortex-a17 and cortex-a17.cortex-a7 values. Ok to go in if the corresponding support patches posted earlier are accepted? Thanks, Kyrill 2014-11-19 Kyrylo Tkachov kyrylo.tkac...@arm.com * doc/invoke.texi (ARM Options): Document cortex-a17 and cortex-a17.cortex-a7 as permissible -mtune values.diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 785faec..a81cc16 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -12813,7 +12813,8 @@ Permissible names are: @samp{arm2}, @samp{arm250}, @samp{arm1136j-s}, @samp{arm1136jf-s}, @samp{mpcore}, @samp{mpcorenovfp}, @samp{arm1156t2-s}, @samp{arm1156t2f-s}, @samp{arm1176jz-s}, @samp{arm1176jzf-s}, @samp{cortex-a5}, @samp{cortex-a7}, @samp{cortex-a8}, @samp{cortex-a9}, -@samp{cortex-a12}, @samp{cortex-a15}, @samp{cortex-a53}, @samp{cortex-a57}, +@samp{cortex-a12}, @samp{cortex-a15}, @samp{cortex-a17}, @samp{cortex-a53}, +@samp{cortex-a57}, @samp{cortex-r4}, @samp{cortex-r4f}, @samp{cortex-r5}, @samp{cortex-r7}, @samp{cortex-m7}, @samp{cortex-m4}, @@ -12831,7 +12832,8 @@ Permissible names are: @samp{arm2}, @samp{arm250}, Additionally, this option can specify that GCC should tune the performance of the code for a big.LITTLE system. Permissible names are: -@samp{cortex-a15.cortex-a7}, @samp{cortex-a57.cortex-a53}. +@samp{cortex-a15.cortex-a7}, @samp{cortex-a17.cortex-a7}, +@samp{cortex-a57.cortex-a53}. @option{-mtune=generic-@var{arch}} specifies that GCC should tune the performance for a blend of processors within architecture @var{arch}.
Re: gimple-classes-v2-option-3 git branch committed to svn trunk as r217787
Hi,

On Thu, 20 Nov 2014, Richard Biener wrote:

I don't think this API will make the non-C++-fans happier; I think the objection to the work I just merged is that it's adding more C++ than those people are comfortable with.

How so? It's already super-ugly in those views. We decided to get C++. Now we have it. And?

Nobody says we can't have nice looking code even with C++.

Now please make it AT LEAST CONSISTENT.

True.

I suspect that any API which requires the use of <> characters within the implementation of a gimple pass to mean a template is going to give those less happy with C++ a visceral "ugh" reaction. I wonder if there's a way to spell these things that's concise and which doesn't involve <>?

Only if you drop as_a/is_a/dyn_cast everywhere.

Oh god, yes. Please! IMHO they don't accomplish much, but make code harder to visually parse. They don't accomplish much because you have to write the snippets that check validity of conversions anyway, so they can just as well be written as proper methods or global functions, or even just conversion operators. Nothing forces us to implement these snippets as noisy template specializations like:

  template <>
  template <>
  inline bool
  is_a_helper <cgraph_node *>::test (symtab_node *p)
  {
    return p->type == SYMTAB_FUNCTION;
  }

instead of the more mundane means. And once you have those snippets as normal functions, you can just as well call them like they are functions, making the using side of those conversions also look nicer.

Ciao,
Michael.
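To make the "more mundane means" concrete, here is a minimal sketch of the check and the checked conversion written as ordinary member functions. It is an assumption about what such code could look like, not a proposal from the thread: the types only mimic GCC's names, and `is_function` and `dyn_function` are hypothetical.

```cpp
// All names below are hypothetical stand-ins that only mimic GCC's symbol table.
enum symtab_type { SYMTAB_SYMBOL, SYMTAB_FUNCTION, SYMTAB_VARIABLE };

struct cgraph_node;

struct symtab_node {
    explicit symtab_node(symtab_type t) : type(t) {}

    // The validity check is an ordinary member function ...
    bool is_function() const { return type == SYMTAB_FUNCTION; }

    // ... and so is the checked conversion, which yields nullptr on mismatch.
    cgraph_node *dyn_function();

    symtab_type type;
};

struct cgraph_node : symtab_node {
    cgraph_node() : symtab_node(SYMTAB_FUNCTION) {}
};

// Defined after cgraph_node is complete so that the downcast is valid.
inline cgraph_node *symtab_node::dyn_function() {
    return is_function() ? static_cast<cgraph_node *>(this) : nullptr;
}
```

A caller would then write `node->dyn_function ()` where the template form spells the target type in angle brackets. The sketch needs one pair of functions per target type, which is the extra code that a single shared helper avoids.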
[PATCH] sh: char isn't signed
An sh compiler fails to build on systems that have plain char unsigned. Fix that.

2014-11-20  Segher Boessenkool  seg...@kernel.crashing.org

gcc/
	PR target/60111
	* config/sh/sh.c: Use signed char for signed field.

---
 gcc/config/sh/sh.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/sh/sh.c b/gcc/config/sh/sh.c
index be944da..175af44 100644
--- a/gcc/config/sh/sh.c
+++ b/gcc/config/sh/sh.c
@@ -3013,7 +3013,7 @@ enum
 struct ashl_lshr_sequence
 {
   char insn_count;
-  char amount[6];
+  signed char amount[6];
   char clobbers_t;
 };
-- 
1.8.1.4
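For readers who have not run into this before, the sketch below shows why the type of a field that holds negative numbers matters. It is an illustration only: the helper names are made up, and it assumes 8-bit chars. Whether plain char is signed is implementation-defined, so a negative value stored in it reads back unchanged on some hosts and as a large positive number on others.

```cpp
#include <limits>

// Whether plain char is signed is implementation-defined.
constexpr bool plain_char_is_signed = std::numeric_limits<char>::is_signed;

// Store a small value in a plain char and read it back as int.
// On hosts where plain char is unsigned, a negative input comes back positive.
int round_trip_plain(int value) {
    char stored = static_cast<char>(value);
    return stored;
}

// Same round trip through signed char, which is signed on every host.
int round_trip_signed(int value) {
    signed char stored = static_cast<signed char>(value);
    return stored;
}
```

With these helpers, `round_trip_signed (-2)` gives -2 everywhere, while `round_trip_plain (-2)` gives -2 or 254 depending on the host.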
Re: [PATCH] sh: char isn't signed
On Thu, 2014-11-20 at 07:41 -0800, Segher Boessenkool wrote:

An sh compiler fails to build on systems that have plain char unsigned. Fix that.

Ouch. Thanks for spotting this. OK for trunk, 4.9 and 4.8.

Cheers,
Oleg
[Ada] Source in multi-unit source has unique object file name
Two units, one in a multi-source file and one in another source with the same base file name do not have the same object file name. No error during processing of the following project file should be reported: project Prj is package Naming is for Spec (foo_bar) use foo_bar.ads at 2; for Spec (foo_bar_types) use foo_bar.ads at 1; for Body (foo_bar) use foo_bar.adb; end Naming; end Prj; Tested on x86_64-pc-linux-gnu, committed on trunk 2014-11-20 Vincent Celier cel...@adacore.com * prj-nmsc.adb (Check_Object): If a unit is in a multi-source file, its object file is never the same as any other unit. Index: prj-nmsc.adb === --- prj-nmsc.adb(revision 217874) +++ prj-nmsc.adb(working copy) @@ -2577,7 +2577,7 @@ Error_Msg_Name_1 := Lang_Index.Display_Name; Error_Msg (Data.Flags, - ?no compiler specified for language %% + ?\no compiler specified for language %% , ignoring all its sources, No_Location, Project); @@ -2604,7 +2604,7 @@ if Lang_Index.Config.Naming_Data.Spec_Suffix = No_File then Error_Msg (Data.Flags, - Spec_Suffix not specified for + \Spec_Suffix not specified for Get_Name_String (Lang_Index.Name), No_Location, Project); end if; @@ -2612,7 +2612,7 @@ if Lang_Index.Config.Naming_Data.Body_Suffix = No_File then Error_Msg (Data.Flags, - Body_Suffix not specified for + \Body_Suffix not specified for Get_Name_String (Lang_Index.Name), No_Location, Project); end if; @@ -2630,7 +2630,7 @@ Error_Msg_Name_1 := Lang_Index.Display_Name; Error_Msg (Data.Flags, - no suffixes specified for %%, + \no suffixes specified for %%, No_Location, Project); end if; end if; @@ -3770,7 +3770,7 @@ if Switches /= No_Array_Element then Error_Msg (Data.Flags, - ?Linker switches not taken into account in library + ?\Linker switches not taken into account in library projects, No_Location, Project); end if; @@ -6793,7 +6793,7 @@ Error_Msg_Name_2 := Source.Unit.Name; Error_Or_Warning (Data.Flags, Data.Flags.Missing_Source_Files, - source file %% for unit %% not found, + \source file %% for 
unit %% not found, No_Location, Project.Project); end if; end if; @@ -7789,7 +7789,7 @@ Error_Msg_File_1 := Source.File; Error_Msg (Data.Flags, - { cannot be both excluded and an exception file name, + \{ cannot be both excluded and an exception file name, No_Location, Project.Project); end if; @@ -7936,13 +7936,15 @@ if Source /= No_Source and then Source.Replaced_By = No_Source and then Source.Path /= Src.Path + and then Source.Index = 0 + and then Src.Index = 0 and then Is_Extending (Src.Project, Source.Project) then Error_Msg_File_1 := Src.File; Error_Msg_File_2 := Source.File; Error_Msg (Data.Flags, - { and { have the same object file name, + \{ and { have the same object file name, No_Location, Project.Project); else
Re: [PATCH, committed] Update Automake files
Hi Bernd,

On Thu, 2014-11-20 14:34:00 +0100, Bernd Edlinger bernd.edlin...@hotmail.de wrote:

This patch updates the files taken from Automake. Committed.

The updated version of missing will confuse the gmp-4.3.2 configure script if it is installed in-tree with contrib/download_prerequisites and flex is not installed:

...
checking readline detected... no
checking for bison... (cached) /home/ed/gnu/gcc-5-20141116/missing bison -y
checking for flex... (cached) /home/ed/gnu/gcc-5-20141116/missing flex
checking lex output file root... configure: error: cannot find output from /home/ed/gnu/gcc-5-20141116/missing flex; giving up
make[3]: *** [config.status] Error 1
make[3]: Leaving directory `/home/ed/gnu/gcc-build-arm-linux-gnueabihf/gmp'
make[2]: *** [all-stage1-gmp] Error 2
make[2]: Leaving directory `/home/ed/gnu/gcc-build-arm-linux-gnueabihf'
make[1]: *** [stage1-bubble] Error 2
make[1]: Leaving directory `/home/ed/gnu/gcc-build-arm-linux-gnueabihf'
make: *** [all] Error 2

The previous version of missing flex produced a dummy lex.yy.c, as does the version in the gmp package, but unfortunately it is overwritten by the missing script in the gcc tree.

Overridden actually by setting/exporting $FLEX and $LEX to GCC's own file and calling GMP's `configure' that way.

That's probably just not a supported configuration anymore, but all previous GCC releases worked without an installed flex tool.

I don't think anything changed wrt. the GMP version, see https://gcc.gnu.org/install/prerequisites.html. (And it's the same problem with current gmp-6.0.0a, so even updating GMP wouldn't help.)

Maybe the problem goes away if a newer version of gmp is used, or if the missing flex is not passed down to the gmp configure script, somehow. Actually, it is not really needed by gmp at all.

It seems the only flex'able source file (in both GMP-4.3.2 and current 6.0.0a) is a demo file.
I tried to add this hunk from the old version and it made, the gmp configure script worked again: --- missing.orig 2014-11-16 14:07:13.0 + +++ missing 2014-11-19 15:01:57.168967538 + @@ -172,6 +172,21 @@ echo You should only need it if you modified a '.l' file. echo You may want to install the Fast Lexical Analyzer package: echo $flex_URL + rm -f lex.yy.c + if test $# -ne 1; then +eval LASTARG=\${$#} +case $LASTARG in +*.l) + SRCFILE=`echo $LASTARG | sed 's/l$/c/'` + if test -f $SRCFILE; then + cp $SRCFILE lex.yy.c + fi +;; +esac + fi + if test ! -f lex.yy.c; then + echo 'main() { return 0; }'lex.yy.c + fi ;; help2man*) echo You should only need it if you modified a dependency \ What do you think? That patch defeats my attempt to re-sync with upstream files. :) The Automake guys decided that faking a tool by simulating a successful run isn't that much of a good idea, and thinking about it, this looks like a good decision. I actually kind of think that this is simply a small bug in GMP. It shouldn't require a working flex just for potentially building some demo file. After all, the release tarballs could just contain the .c file. No need for the lex file at all! However, I *think* the real problem is the way flags are passed. Top-level `configure' doesn't find `flex' (okay) and sets FLEX=./missing flex. This $FLEX gets passed by Makefile down to GMP's `configure', which by now should, IMO, point to a /working/ flex. That way, GMP's `configure' of course will choke finding the generated output file. So what shall we do now? I'd be quite okay with reverting my `missing' update for now, until this is actually fixed. However, It would be nice if we'd discuss that, along with the GMP guys. Another fix (or is it a workaround?) 
would be to not hand down $FLEX and $LEX: diff --git a/Makefile.def b/Makefile.def index 40bbca9..7b988fe 100644 --- a/Makefile.def +++ b/Makefile.def @@ -229,13 +229,11 @@ flags_to_pass = { flag= CC_FOR_BUILD ; }; flags_to_pass = { flag= CFLAGS_FOR_BUILD ; }; flags_to_pass = { flag= CXX_FOR_BUILD ; }; flags_to_pass = { flag= EXPECT ; }; -flags_to_pass = { flag= FLEX ; }; flags_to_pass = { flag= INSTALL ; }; flags_to_pass = { flag= INSTALL_DATA ; }; flags_to_pass = { flag= INSTALL_PROGRAM ; }; flags_to_pass = { flag= INSTALL_SCRIPT ; }; flags_to_pass = { flag= LDFLAGS_FOR_BUILD ; }; -flags_to_pass = { flag= LEX ; }; flags_to_pass = { flag= M4 ; }; flags_to_pass = { flag= MAKE ; }; flags_to_pass = { flag= RUNTEST ; }; (...and regenerate Makefile{,.in}.) However, we need to discuss that. I'll head over to the GMP bugs mailing list and discuss your build error over there. MfG, JBG -- Jan-Benedict Glaw jbg...@lug-owl.de +49-172-7608481 Signature of: Friends are relatives you make for yourself. the second :
Re: [PATCH] PR63426 Fix various signed integer overflows
On Thu, Nov 20, 2014 at 02:27:52PM +0100, Markus Trippelsdorf wrote:

2014-11-20  Markus Trippelsdorf  mar...@trippelsdorf.de

	* emit-rtl.c (const_wide_int_htab_hash): Likewise.
	* loop-iv.c (determine_max_iter): Likewise.
	(iv_number_of_iterations): Likewise.
	* tree-ssa-loop-ivopts.c (get_computation_cost_at): Likewise.
	* varasm.c (get_section_anchor): Likewise.

Ok, with one small change:

--- a/gcc/varasm.c
+++ b/gcc/varasm.c
@@ -7188,7 +7188,7 @@ get_section_anchor (struct object_block *block, HOST_WIDE_INT offset,
     offset = 0;
   else
     {
-      bias = 1 << (GET_MODE_BITSIZE (ptr_mode) - 1);
+      bias = (unsigned HOST_WIDE_INT) 1 << (GET_MODE_BITSIZE (ptr_mode) - 1);

Please use HOST_WIDE_INT_1U instead of (unsigned HOST_WIDE_INT) 1.

	Jakub
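The varasm.c hunk is a small instance of a common pattern: widen the constant to an unsigned type before shifting it into the top bit. The following is a minimal sketch of that pattern outside GCC. `top_bit` is a hypothetical name, and `std::uint64_t` stands in for unsigned HOST_WIDE_INT on the assumption that the latter is 64 bits wide.

```cpp
#include <cstdint>

// Returns a value with only bit (bits - 1) set, for 1 <= bits <= 64.
// The constant 1 is widened to an unsigned 64-bit type before the shift, so
// the result is always representable. With a plain int literal the same
// shift could not be represented once bits reaches the width of int.
std::uint64_t top_bit(unsigned bits) {
    return std::uint64_t{1} << (bits - 1);
}
```

For a 32-bit pointer mode this gives 0x80000000, the bias value that a signed 32-bit int cannot hold.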
Re: [PATCH][doc] Document cortex-a17 and cortex-a17.cortex-a7 -m{cpu,tune} options
On 20/11/14 15:31, Kyrill Tkachov wrote: Hi all, As Joseph reminded, new -mcpu options should be documented in invoke.texi. This adds the documentation for the cortex-a17 and cortex-a17.cortex-a7 values. Ok to go in if the corresponding support patches posted earlier are accepted? Thanks, Kyrill 2014-11-19 Kyrylo Tkachov kyrylo.tkac...@arm.com * doc/invoke.texi (ARM Options): Document cortex-a17 and cortex-a17.cortex-a7 as permissible -mtune values. Yes, this is fine. R. a17-doc.patch diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 785faec..a81cc16 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -12813,7 +12813,8 @@ Permissible names are: @samp{arm2}, @samp{arm250}, @samp{arm1136j-s}, @samp{arm1136jf-s}, @samp{mpcore}, @samp{mpcorenovfp}, @samp{arm1156t2-s}, @samp{arm1156t2f-s}, @samp{arm1176jz-s}, @samp{arm1176jzf-s}, @samp{cortex-a5}, @samp{cortex-a7}, @samp{cortex-a8}, @samp{cortex-a9}, -@samp{cortex-a12}, @samp{cortex-a15}, @samp{cortex-a53}, @samp{cortex-a57}, +@samp{cortex-a12}, @samp{cortex-a15}, @samp{cortex-a17}, @samp{cortex-a53}, +@samp{cortex-a57}, @samp{cortex-r4}, @samp{cortex-r4f}, @samp{cortex-r5}, @samp{cortex-r7}, @samp{cortex-m7}, @samp{cortex-m4}, @@ -12831,7 +12832,8 @@ Permissible names are: @samp{arm2}, @samp{arm250}, Additionally, this option can specify that GCC should tune the performance of the code for a big.LITTLE system. Permissible names are: -@samp{cortex-a15.cortex-a7}, @samp{cortex-a57.cortex-a53}. +@samp{cortex-a15.cortex-a7}, @samp{cortex-a17.cortex-a7}, +@samp{cortex-a57.cortex-a53}. @option{-mtune=generic-@var{arch}} specifies that GCC should tune the performance for a blend of processors within architecture @var{arch}.
Re: [PATCH 10/21] PR jit/63854: Fix leak of worklist within jit-recording.c
On Wed, Nov 19, 2014 at 9:02 PM, David Malcolm dmalc...@redhat.com wrote:

On Wed, 2014-11-19 at 09:59 -0700, Jeff Law wrote:

On 11/19/14 03:46, David Malcolm wrote:

Fix this leak:

160 bytes in 5 blocks are definitely lost in loss record 154 of 228
   at 0x4A0645D: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
   by 0x5D75D4F: xrealloc (xmalloc.c:177)
   by 0x4DE1710: void va_heap::reserve<gcc::jit::recording::block*>(vec<gcc::jit::recording::block*, va_heap, vl_embed>*&, unsigned int, bool) (vec.h:310)
   by 0x4DDFAB5: vec<gcc::jit::recording::block*, va_heap, vl_ptr>::reserve(unsigned int, bool) (vec.h:1428)
   by 0x4DDFBFC: vec<gcc::jit::recording::block*, va_heap, vl_ptr>::reserve_exact(unsigned int) (vec.h:1448)
   by 0x4DDE588: vec<gcc::jit::recording::block*, va_heap, vl_ptr>::create(unsigned int) (vec.h:1463)
   by 0x4DD9B9F: gcc::jit::recording::function::validate() (jit-recording.c:2191)
   by 0x4DD7AD3: gcc::jit::recording::context::validate() (jit-recording.c:1005)
   by 0x4DD7660: gcc::jit::recording::context::compile() (jit-recording.c:848)
   by 0x4DD5BD2: gcc_jit_context_compile (libgccjit.c:2014)
   by 0x401CA4: test_jit (harness.h:190)
   by 0x401D88: main (harness.h:232)

gcc/jit/ChangeLog:
	PR jit/63854
	* jit-recording.c (recording::function::validate): Convert worklist from vec to autovec to fix a leak.

JIT space, yours to approve :-) We haven't formalized that yet, but it'd be silly to do anything else.

FWIW, I added myself to the MAINTAINERS file as JIT maintainer as part of a change you reviewed as: https://gcc.gnu.org/ml/jit/2014-q4/msg00029.html

Is there a governance distinction here, between patch review vs decisions of the steering committee? i.e. do changes to the maintainers part of the MAINTAINERS file require higher-level approval?

Yes, reviewers and maintainers are appointed by the steering committee only.

Richard.

Presumably I should continue to send (non-trivial) jit patches to this list and wait for review before committing to trunk?
Anyway so formally, this is OK for the trunk. Thanks.
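The patch description only says that switching the worklist to a different vector type fixes the leak. Presumably that is because the new type releases its storage when it goes out of scope. The sketch below shows that general idea with a hypothetical `scoped_worklist` class. It is not GCC's vec.h, and `live_buffers` exists only so that the effect can be observed.

```cpp
#include <cstddef>
#include <cstdlib>

// Number of worklist buffers currently allocated; used only to observe leaks.
int live_buffers = 0;

// Hypothetical stand-in for an owning vector: the destructor frees the storage.
class scoped_worklist {
public:
    explicit scoped_worklist(std::size_t capacity)
        : slots_(static_cast<int *>(std::malloc(capacity * sizeof(int)))) {
        ++live_buffers;
    }
    ~scoped_worklist() {
        std::free(slots_);
        --live_buffers;
    }
    // Copying is disabled so that the buffer has exactly one owner.
    scoped_worklist(const scoped_worklist &) = delete;
    scoped_worklist &operator=(const scoped_worklist &) = delete;

private:
    int *slots_;
};

// Stand-in for a validation routine with more than one exit path.
bool validate(bool bail_out_early) {
    scoped_worklist worklist(4);
    if (bail_out_early)
        return false;  // the destructor still runs on this path
    return true;
}
```

Whichever path `validate` takes, `live_buffers` is back to zero afterwards. A manually managed buffer would need an explicit release on every exit path instead.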
Re: gimple-classes-v2-option-3 git branch committed to svn trunk as r217787
On Thu, Nov 20, 2014 at 4:34 PM, Michael Matz m...@suse.de wrote:

Hi,

On Thu, 20 Nov 2014, Richard Biener wrote:

I don't think this API will make the non-C++-fans happier; I think the objection to the work I just merged is that it's adding more C++ than those people are comfortable with.

How so? It's already super-ugly in those views. We decided to get C++. Now we have it. And?

Nobody says we can't have nice looking code even with C++.

Now please make it AT LEAST CONSISTENT.

True.

I suspect that any API which requires the use of <> characters within the implementation of a gimple pass to mean a template is going to give those less happy with C++ a visceral "ugh" reaction. I wonder if there's a way to spell these things that's concise and which doesn't involve <>?

Only if you drop as_a/is_a/dyn_cast everywhere.

Oh god, yes. Please! IMHO they don't accomplish much, but make code harder to visually parse. They don't accomplish much because you have to write the snippets that check validity of conversions anyway, so they can just as well be written as proper methods or global functions, or even just conversion operators. Nothing forces us to implement these snippets as noisy template specializations like:

  template <>
  template <>
  inline bool
  is_a_helper <cgraph_node *>::test (symtab_node *p)
  {
    return p->type == SYMTAB_FUNCTION;
  }

instead of the more mundane means. And once you have those snippets as normal functions, you can just as well call them like they are functions, making the using side of those conversions also look nicer.

True. I don't remember exactly, but exclusively using member functions wasn't in the list of proposals that ended up with us doing as_a/is_a as it is done now. Can unions / PODs have such member functions? Just thinking about a reason why it wasn't proposed.

Btw, I don't see as_a/is_a/dyn_cast as super-ugly - it's actually a perfectly fine C++-way of doing RTTI.

I also guess that it requires less code in the actual implementation as we share one helper for all three operations. Of course we could macroize the member function implementation in some clever way.

That said - we have a substantial amount of code using as_a/is_a/dyn_cast and I don't think it's appropriate at this point to change all of it to a different mechanism. Proposals with example patches are of course welcome, but beware that if you want to succeed here then as-a.h needs to go ;)

Thanks,
Richard.

Ciao,
Michael.
[PR63762][4.9] Backport the patch which fixes GCC generates UNPREDICTABLE STR with Rn = Rt for arm
Hi all, This is a backport for gcc-4_9-branch of the patch [PR63762]GCC generates UNPREDICTABLE STR with Rn = Rt for arm posted in: https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02253.html arm-none-eabi has been test on the model, no new issues. bootstrapping and regression tested on x86, no new issues. Is it Okay for gcc-4_9-branch? gcc/ChangeLog: 2014-11-20 Renlin Li renlin...@arm.com PR middle-end/63762 * ira.c (ira): Update preferred class. gcc/testsuite/ChangeLog: 2014-11-20 Renlin Li renlin...@arm.com PR middle-end/63762 * gcc.dg/pr63762.c: New.diff --git a/gcc/ira.c b/gcc/ira.c index 9c9e71d..e610d35 100644 --- a/gcc/ira.c +++ b/gcc/ira.c @@ -5263,7 +5263,18 @@ ira (FILE *f) ira_allocno_iterator ai; FOR_EACH_ALLOCNO (a, ai) - ALLOCNO_REGNO (a) = REGNO (ALLOCNO_EMIT_DATA (a)-reg); +{ + int old_regno = ALLOCNO_REGNO (a); + int new_regno = REGNO (ALLOCNO_EMIT_DATA (a)-reg); + + ALLOCNO_REGNO (a) = new_regno; + + if (old_regno != new_regno) +setup_reg_classes (new_regno, reg_preferred_class (old_regno), + reg_alternate_class (old_regno), + reg_allocno_class (old_regno)); +} + } else { diff --git a/gcc/testsuite/gcc.dg/pr63762.c b/gcc/testsuite/gcc.dg/pr63762.c new file mode 100644 index 000..df11067 --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr63762.c @@ -0,0 +1,77 @@ +/* PR middle-end/63762 */ +/* { dg-do assemble } */ +/* { dg-options -O2 } */ + +#include stdlib.h + +void *astFree (); +void *astMalloc (); +void astNegate (void *); +int astGetNegated (void *); +void astGetRegionBounds (void *, double *, double *); +int astResampleF (void *, ...); + +extern int astOK; + +int +MaskF (int inside, int ndim, const int lbnd[], const int ubnd[], + float in[], float val) +{ + + void *used_region; + float *c, *d, *out, *tmp_out; + double *lbndgd, *ubndgd; + int *lbndg, *ubndg, idim, ipix, nax, nin, nout, npix, npixg, result = 0; + if (!astOK) return result; + lbndg = astMalloc (sizeof (int)*(size_t) ndim); + ubndg = astMalloc (sizeof (int)*(size_t) ndim); + lbndgd = 
astMalloc (sizeof (double)*(size_t) ndim); + ubndgd = astMalloc (sizeof (double)*(size_t) ndim); + if (astOK) +{ + astGetRegionBounds (used_region, lbndgd, ubndgd); + npix = 1; + npixg = 1; + for (idim = 0; idim ndim; idim++) +{ + lbndg[ idim ] = lbnd[ idim ]; + ubndg[ idim ] = ubnd[ idim ]; + npix *= (ubnd[ idim ] - lbnd[ idim ] + 1); + if (npixg = 0) npixg *= (ubndg[ idim ] - lbndg[ idim ] + 1); +} + if (npixg = 0 astOK) +{ + if ((inside != 0) == (astGetNegated( used_region ) != 0)) +{ + c = in; + for (ipix = 0; ipix npix; ipix++) *(c++) = val; + result = npix; +} +} + else if (npixg 0 astOK) +{ + if ((inside != 0) == (astGetNegated (used_region) != 0)) +{ + tmp_out = astMalloc (sizeof (float)*(size_t) npix); + if (tmp_out) +{ + c = tmp_out; + for (ipix = 0; ipix npix; ipix++) *(c++) = val; + result = npix - npixg; +} + out = tmp_out; +} + else +{ + tmp_out = NULL; + out = in; +} + if (inside) astNegate (used_region); + result += astResampleF (used_region, ndim, lbnd, ubnd, in, NULL, + NULL, NULL, 0, 0.0, 100, val, ndim, + lbnd, ubnd, lbndg, ubndg, out, NULL); + if (inside) astNegate (used_region); +} +} + return result; +}
[Ada] PR ada/63931
Fixing version number according to new GCC naming scheme.

	PR ada/63931
	* gnatvsn.ads (Library_Version): Switch to 5.

Index: gnatvsn.ads
===
--- gnatvsn.ads	(revision 217874)
+++ gnatvsn.ads	(working copy)
@@ -82,7 +82,7 @@
    --  Prefix generated by binder. If it is changed, be sure to change
    --  GNAT.Compiler_Version.Ver_Prefix as well.

-   Library_Version : constant String := "5.0";
+   Library_Version : constant String := "5";
    --  Library version. This value must be updated when the compiler
    --  version number Gnat_Static_Version_String is updated.
    --
Re: [PATCH 10/21] PR jit/63854: Fix leak of worklist within jit-recording.c
On 11/20/14 09:01, Richard Biener wrote:

Is there a governance distinction here, between patch review vs decisions of the steering committee? i.e. do changes to the maintainers part of the MAINTAINERS file require higher-level approval?

Yes, reviewers and maintainers are appointed by the steering committee only.

Right. I've already raised appointing David as the JIT maintainer to the steering committee. I just need to count the votes and take appropriate action. Similarly for the MPX runtime and Ilya as the MPX maintainer, Bernd as the nvptx maintainer.

If there are other maintainers that need to get appointed, nobody should hesitate to contact one of the SC members to get the nomination in front of the committee.

jeff
Re: [PATCH 1/2] teach mklog to get name / email from git config when available
On 09-05-14 16:47, Diego Novillo wrote: I would probably use git config directly here. It would work with both git and svn checkouts (if you have a global .git configuration). But testing for .git is fine with me as well. I like Peter's idea of having a ~/.mklog file to override. This would work for both svn and git checkouts. Diego, this patch implements both: - it uses the ~/.mklog file proposed by Peter - in absence of a ~/.mklog file, it uses git config, also when not in a git repository OK? Thanks, - Tom 2014-11-20 Tom de Vries t...@codesourcery.com Peter Bergner berg...@vnet.ibm.com * mklog: Handle .mklog. Use git setting independent of presence .git directory. --- contrib/mklog | 56 +++- 1 file changed, 35 insertions(+), 21 deletions(-) diff --git a/contrib/mklog b/contrib/mklog index 840f6f8..abbf0af 100755 --- a/contrib/mklog +++ b/contrib/mklog @@ -29,32 +29,46 @@ use File::Temp; use File::Copy qw(cp mv); -# Change these settings to reflect your profile. -$username = $ENV{'USER'}; -$name = `finger $username | grep -o 'Name: .*'`; -@n = split(/: /, $name); -$name = $n[1]; chop($name); -$addr = $username . \@my.domain.org; $date = `date +%Y-%m-%d`; chop ($date); +$dot_mklog_format_msg = +The .mklog format is:\n +. NAME = ...\n +. EMAIL = ...\n; + +# Create a .mklog to reflect your profile, if necessary. +my $conf = $ENV{HOME}/.mklog; +if (-f $conf) { +open (CONF, $conf) + or die Could not open file '$conf' for reading: $!\n; +while (CONF) { + if (m/^\s*NAME\s*=\s*(.*)\s*$/) { + $name = $1; + } elsif (m/^\s*EMAIL\s*=\s*(.*)\s*$/) { + $addr = $1; + } +} +if (!($name $addr)) { + die Could not read .mklog settings.\n + . $dot_mklog_format_msg; +} +} else { +$name = `git config user.name`; +chomp($name); +$addr = `git config user.email`; +chomp($addr); + +if (!($name $addr)) { + die Could not read git user.name and user.email settings.\n + . Please add missing git settings, or create a .mklog file in + . $ENV{HOME}.\n + . 
$dot_mklog_format_msg; +} +} + $gcc_root = $0; $gcc_root =~ s/[^\\\/]+$/../; -# if this is a git tree then take name and email from the git configuration -if (-d $gcc_root/.git) { - $gitname = `git config user.name`; - chomp($gitname); - if ($gitname) { - $name = $gitname; - } - - $gitaddr = `git config user.email`; - chomp($gitaddr); - if ($gitaddr) { - $addr = $gitaddr; - } -} - #- # Program starts here. You should not need to edit anything below this # line. -- 1.9.1
Re: [PATCH x86, PR60451] Expand even/odd permutation using pack insn.
Bootstrap / make check passed with updated patch.
Is it still ok?
It looks like we don't need expand_vec_perm_vpshufb2_vpermq_even_odd any more with the patch. However, the cleanup will be in a separate patch after appropriate testing.

Modified ChangeLog:

2014-11-20  Evgeny Stupachenko  evstu...@gmail.com

gcc/testsuite
	PR target/60451
	* gcc.target/i386/pr60451.c: New.

gcc/
	PR target/60451
	* config/i386/i386.c (expand_vec_perm_even_odd_pack): New.
	(expand_vec_perm_even_odd_1): Add new expand for V8HI mode,
	replace for V16QI, V16HI and V32QI modes.
	(ix86_expand_vec_perm_const_1): Add new expand.

On Thu, Nov 20, 2014 at 6:03 PM, Evgeny Stupachenko evstu...@gmail.com wrote:
Good point!
gen_shift also requires only SSE2.
That way we can optimize out interleave sequence for V16QI mode in expand_vec_perm_even_odd_1.

Thanks!
Evgeny

Updated patch:

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 085eb54..054089b 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -48322,6 +48322,127 @@ expand_vec_perm_vpshufb2_vpermq_even_odd (struct expand_vec_perm_d *d)
   return true;
 }
 
+/* A subroutine of expand_vec_perm_even_odd_1.  Implement extract-even
+   and extract-odd permutations of two V16QI, V8HI, V16HI or V32QI operands
+   with two "and" and "pack" or two "shift" and "pack" insns.  We should
+   have already failed all two instruction sequences.  */
+
+static bool
+expand_vec_perm_even_odd_pack (struct expand_vec_perm_d *d)
+{
+  rtx op, dop0, dop1, t, rperm[16];
+  unsigned i, odd, c, s, nelt = d->nelt;
+  bool end_perm = false;
+  machine_mode half_mode;
+  rtx (*gen_and) (rtx, rtx, rtx);
+  rtx (*gen_pack) (rtx, rtx, rtx);
+  rtx (*gen_shift) (rtx, rtx, rtx);
+
+  if (d->one_operand_p)
+    return false;
+
+  switch (d->vmode)
+    {
+    case V8HImode:
+      /* Required for "pack".  */
+      if (!TARGET_SSE4_1)
+        return false;
+      c = 0xffff;
+      s = 16;
+      half_mode = V4SImode;
+      gen_and = gen_andv4si3;
+      gen_pack = gen_sse4_1_packusdw;
+      gen_shift = gen_lshrv4si3;
+      break;
+    case V16QImode:
+      /* No check as all instructions are SSE2.  */
+      c = 0xff;
+      s = 8;
+      half_mode = V8HImode;
+      gen_and = gen_andv8hi3;
+      gen_pack = gen_sse2_packuswb;
+      gen_shift = gen_lshrv8hi3;
+      break;
+    case V16HImode:
+      if (!TARGET_AVX2)
+        return false;
+      c = 0xffff;
+      s = 16;
+      half_mode = V8SImode;
+      gen_and = gen_andv8si3;
+      gen_pack = gen_avx2_packusdw;
+      gen_shift = gen_lshrv8si3;
+      end_perm = true;
+      break;
+    case V32QImode:
+      if (!TARGET_AVX2)
+        return false;
+      c = 0xff;
+      s = 8;
+      half_mode = V16HImode;
+      gen_and = gen_andv16hi3;
+      gen_pack = gen_avx2_packuswb;
+      gen_shift = gen_lshrv16hi3;
+      end_perm = true;
+      break;
+    default:
+      /* Only V8HI, V16QI, V16HI and V32QI modes are more profitable than
+         general shuffles.  */
+      return false;
+    }
+
+  /* Check that permutation is even or odd.  */
+  odd = d->perm[0];
+  if (odd > 1)
+    return false;
+
+  for (i = 1; i < nelt; ++i)
+    if (d->perm[i] != 2 * i + odd)
+      return false;
+
+  if (d->testing_p)
+    return true;
+
+  dop0 = gen_reg_rtx (half_mode);
+  dop1 = gen_reg_rtx (half_mode);
+  if (odd == 0)
+    {
+      for (i = 0; i < nelt / 2; i++)
+        rperm[i] = GEN_INT (c);
+      t = gen_rtx_CONST_VECTOR (half_mode, gen_rtvec_v (nelt / 2, rperm));
+      t = force_reg (half_mode, t);
+      emit_insn (gen_and (dop0, t, gen_lowpart (half_mode, d->op0)));
+      emit_insn (gen_and (dop1, t, gen_lowpart (half_mode, d->op1)));
+    }
+  else
+    {
+      emit_insn (gen_shift (dop0,
+                            gen_lowpart (half_mode, d->op0),
+                            GEN_INT (s)));
+      emit_insn (gen_shift (dop1,
+                            gen_lowpart (half_mode, d->op1),
+                            GEN_INT (s)));
+    }
+  /* In AVX2 for 256 bit case we need to permute pack result.  */
+  if (TARGET_AVX2 && end_perm)
+    {
+      op = gen_reg_rtx (d->vmode);
+      t = gen_reg_rtx (V4DImode);
+      emit_insn (gen_pack (op, dop0, dop1));
+      emit_insn (gen_avx2_permv4di_1 (t,
+                                      gen_lowpart (V4DImode, op),
+                                      const0_rtx,
+                                      const2_rtx,
+                                      const1_rtx,
+                                      GEN_INT (3)));
+      emit_move_insn (d->target, gen_lowpart (d->vmode, t));
+    }
+  else
+    emit_insn (gen_pack (d->target, dop0, dop1));
+
+  return true;
+}
+
 /* A subroutine of ix86_expand_vec_perm_builtin_1.  Implement extract-even
    and extract-odd permutations.  */
 
@@ -48393,7 +48514,9 @@
Re: [patch] Warn on undefined loop exit
On Wed, Nov 19, 2014 at 9:19 PM, Andrew Stubbs a...@codesourcery.com wrote:
> On 19/11/14 16:39, Marek Polacek wrote:
>> On Wed, Nov 19, 2014 at 04:32:43PM +, Andrew Stubbs wrote:
>>> +if (warning_at (gimple_location (elt->stmt),
>>> +OPT_Waggressive_loop_optimizations,
>>> +"Loop exit may only be reached after undefined behaviour."))
>>
>> Warnings should start with a lowercase and should be without
>> a fullstop at the end.
>
> Fixed, and I spotted a britishism too.

If it's really duplicated code can you split it out to a function?

> +  if (OPT_Waggressive_loop_optimizations)
> +{

this doesn't do what you think it does ;)  The variable to check is warn_aggressive_loop_optimizations.

> +  if (exit_warned && problem_stmts != vNULL)
> +{

!problem_stmts.empty ()

Otherwise it looks ok.

Thanks,
Richard.

> Andrew
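An editorial sketch of the point about OPT_Waggressive_loop_optimizations versus warn_aggressive_loop_optimizations: the first names the option, the second holds its current setting. The enum and variable below are hypothetical stand-ins, not GCC's real declarations, and the enumerator is deliberately given a nonzero value so the faulty test is visible.

```c
#include <stdbool.h>

/* Hypothetical stand-ins, not GCC's real declarations: an enumerator
   that names an option, and the flag variable that records whether the
   user enabled it.  */
enum opt_code { OPT_Wfirst_example, OPT_Wloop_example };
static int warn_loop_example = 0;

/* Tests the enumerator: a compile-time constant, so the answer never
   changes with the command line.  */
static bool
enabled_by_index (void)
{
  return OPT_Wloop_example != 0;
}

/* Tests the flag variable: this follows the user's setting.  */
static bool
enabled_by_flag (void)
{
  return warn_loop_example != 0;
}
```

With the flag at its default of 0, enabled_by_index () still reports true, which is the kind of always-taken branch the review comment warns about.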
Add to maintainers list.
2014-11-20  Alex Velenko  alex.vele...@arm.com

	* MAINTAINERS (write-after-approval): Add myself.

diff --git a/MAINTAINERS b/MAINTAINERS
index 11a28ef..eada4e9 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -566,6 +566,7 @@ David Ung dav...@mips.com
 Neil Vachharajani nvach...@gmail.com
 Kris Van Hees kris.van.h...@oracle.com
 Joost VandeVondele joost.vandevond...@mat.ethz.ch
+Alex Velenko alex.vele...@arm.com
 Ilya Verbin iver...@gmail.com
 Kugan Vivekanandarajah kug...@linaro.org
 Tom de Vries t...@codesourcery.com
Re: [PATCH][AArch64] Add bounds checking to vqdm*_lane intrinsics via a qualifier that also flips endianness
On 20 November 2014 07:49, Marcus Shawcroft marcus.shawcr...@gmail.com wrote:
> On 19 November 2014 19:05, Charles Baylis charles.bay...@linaro.org wrote:
>
>> PR target/63870
>> * config/aarch64/aarch64-builtins.c (aarch64_simd_expand_args): Pass
>> expression to aarch64_simd_lane_bounds.
>> * config/aarch64/aarch64-protos.h (aarch64_simd_lane_bounds): Update
>> prototype.
>> * config/aarch64/aarch64.c (aarch64_simd_lane_bounds): Add exp
>> parameter. Report calling function in error message if exp is non-NULL.
>
> These need to be updated to reflect the changes in the last revision
> of the patch where NULL is passed explicitly. Otherwise OK, commit it
> with a fixed ChangeLog.

Sorry... more haste, less speed.

Committed as r217885, with the following ChangeLog:

2014-11-20  Charles Baylis  charles.bay...@linaro.org

	PR target/63870
	* config/aarch64/aarch64-builtins.c (aarch64_simd_expand_args): Pass
	expression to aarch64_simd_lane_bounds.
	* config/aarch64/aarch64-protos.h (aarch64_simd_lane_bounds): Update
	prototype.
	* config/aarch64/aarch64-simd.md: (aarch64_combinez<mode>): Update
	call to aarch64_simd_lane_bounds.
	(aarch64_get_lanedi): Likewise.
	(aarch64_ld2_lane<mode>): Likewise.
	(aarch64_ld3_lane<mode>): Likewise.
	(aarch64_ld4_lane<mode>): Likewise.
	(aarch64_im_lane_boundsi): Likewise.
	* config/aarch64/aarch64.c (aarch64_simd_lane_bounds): Add exp
	parameter. Report calling function in error message if exp is non-NULL.
[PATCH] PR ipa/63909 ICE: SIGSEGV in ipa_icf_gimple::func_checker::compare_bb()
Hello.

Following patch fixes ICE in IPA ICF. Problem was that number of non-debug statements in a BB can change (for instance by IPA split), so that the number is recomputed.

Patch can bootstrap on x86_64-linux-pc and no regression has been seen.

Ready for trunk?
Thanks,
Martin

gcc/ChangeLog:

2014-11-20  Martin Liska  mli...@suse.cz

	* gimple-iterator.h (gsi_nondebug_stmt_count): New function.
	* ipa-icf-gimple.c (func_checker::compare_bb): Number of BB is recomputed
	because it can be split.

gcc/testsuite/ChangeLog:

2014-11-20  Martin Liska  mli...@suse.cz

	* gcc.dg/ipa/pr63909.c: New test.

diff --git a/gcc/gimple-iterator.h b/gcc/gimple-iterator.h
index fb6cc07..f73b1f6 100644
--- a/gcc/gimple-iterator.h
+++ b/gcc/gimple-iterator.h
@@ -331,4 +331,18 @@ gsi_seq (gimple_stmt_iterator i)
   return *i.seq;
 }
 
+/* Return number of nondebug statements in basic block BB.  */
+
+static inline unsigned
+gsi_nondebug_stmt_count (basic_block bb)
+{
+  unsigned c = 0;
+  for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
+       gsi_next (&gsi))
+    if (!is_gimple_debug (gsi_stmt (gsi)))
+      c++;
+
+  return c;
+}
+
 #endif /* GCC_GIMPLE_ITERATOR_H */
diff --git a/gcc/ipa-icf-gimple.c b/gcc/ipa-icf-gimple.c
index 8f2a438..83661ac 100644
--- a/gcc/ipa-icf-gimple.c
+++ b/gcc/ipa-icf-gimple.c
@@ -563,6 +563,9 @@ func_checker::compare_bb (sem_bb *bb1, sem_bb *bb2)
   gimple_stmt_iterator gsi1, gsi2;
   gimple s1, s2;
 
+  bb1->nondbg_stmt_count = gsi_nondebug_stmt_count (bb1->bb);
+  bb2->nondbg_stmt_count = gsi_nondebug_stmt_count (bb2->bb);
+
   if (bb1->nondbg_stmt_count != bb2->nondbg_stmt_count
       || bb1->edge_count != bb2->edge_count)
     return return_false ();
diff --git a/gcc/testsuite/gcc.dg/ipa/pr63909.c b/gcc/testsuite/gcc.dg/ipa/pr63909.c
new file mode 100644
index 000..8538e21
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/pr63909.c
@@ -0,0 +1,27 @@
+/* { dg-options "-O2 -fno-guess-branch-probability" } */
+
+int z;
+
+__attribute__((noinline))
+void g ()
+{
+  if (++z)
+    __builtin_exit (0);
+  g ();
+}
+
+__attribute__((noinline))
+void f ()
+{
+  if (++z)
+    __builtin_exit (0);
+  f ();
+}
+
+int main()
+{
+  f ();
+  g ();
+
+  return 0;
+}
[PATCH, i386] Add new arg values for __builtin_cpu_supports
Hi,

MPX runtime checks some feature bits in order to check MPX is fully supported. Runtime does it by cpuid calls but there is a __builtin_cpu_supports which may be used for that. Unfortunately currently it doesn't support required bits. Will it be OK to add them for trunk?

Thanks,
Ilya
--
gcc/

2014-11-20  Ilya Enkovich  ilya.enkov...@intel.com

	* config/i386/cpuid.h (bit_MPX): New.
	(bit_BNDREGS): New.
	(bit_BNDCSR): New.
	* config/i386/i386.c (processor_features): Add
	F_XSAVE, F_OSXSAVE, F_MPX, F_BNDREGS, F_BNDCSR.
	(isa_names_table): Likewise.
	* doc/extend.texi (__builtin_cpu_supports): Add
	xsave, osxsave, mpx, bndregs, bndcsr.

libgcc/

2014-11-20  Ilya Enkovich  ilya.enkov...@intel.com

	* config/i386/cpuinfo.c (processor_features): Add
	FEATURE_XSAVE, FEATURE_OSXSAVE, FEATURE_MPX,
	FEATURE_BNDREGS, FEATURE_BNDCSR.
	(get_available_features): Likewise.

diff --git a/gcc/config/i386/cpuid.h b/gcc/config/i386/cpuid.h
index 133e356..f85cebb 100644
--- a/gcc/config/i386/cpuid.h
+++ b/gcc/config/i386/cpuid.h
@@ -72,6 +72,7 @@
 #define bit_AVX2	(1 << 5)
 #define bit_BMI2	(1 << 8)
 #define bit_RTM	(1 << 11)
+#define bit_MPX	(1 << 14)
 #define bit_AVX512F	(1 << 16)
 #define bit_AVX512DQ	(1 << 17)
 #define bit_RDSEED	(1 << 18)
@@ -87,6 +88,10 @@
 /* %ecx */
 #define bit_PREFETCHWT1	(1 << 0)
 
+/* XFEATURE_ENABLED_MASK register bits (%eax == 13, %ecx == 0) */
+#define bit_BNDREGS	(1 << 3)
+#define bit_BNDCSR	(1 << 4)
+
 /* Extended State Enumeration Sub-leaf (%eax == 13, %ecx == 1) */
 #define bit_XSAVEOPT	(1 << 0)
 #define bit_XSAVEC	(1 << 1)
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 3166e03..bbf3ea3 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -35106,6 +35106,11 @@ fold_builtin_cpu (tree fndecl, tree *args)
     F_FMA4,
     F_XOP,
     F_FMA,
+    F_XSAVE,
+    F_OSXSAVE,
+    F_MPX,
+    F_BNDREGS,
+    F_BNDCSR,
     F_MAX
   };
 
@@ -35194,7 +35199,12 @@ fold_builtin_cpu (tree fndecl, tree *args)
       {"fma4",   F_FMA4},
       {"xop",    F_XOP},
       {"fma",    F_FMA},
-      {"avx2",   F_AVX2}
+      {"avx2",   F_AVX2},
+      {"xsave",  F_XSAVE},
+      {"osxsave", F_OSXSAVE},
+      {"mpx",    F_MPX},
+      {"bndregs", F_BNDREGS},
+      {"bndcsr", F_BNDCSR}
     };
 
   tree __processor_model_type = build_processor_model_struct ();
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index d10a815..a06ed0c 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -11629,6 +11629,16 @@ SSE4.2 instructions.
 AVX instructions.
 @item avx2
 AVX2 instructions.
+@item xsave
+XFEATURE_ENABLED_MASK register and XSAVE, XRSTOR, XSETBV, XGETBV instructions
+@item osxsave
+OS has enabled support for using XGETBV and XSETBV instructions
+@item mpx
+MPX instructions.
+@item bndregs
+Indicates bound register component of MPX state
+@item bndcsr
+Indicates bounds configuration and status component of MPX state
 @end table
 
 Here is an example:
diff --git a/libgcc/config/i386/cpuinfo.c b/libgcc/config/i386/cpuinfo.c
index 6ff7502..9e060b0 100644
--- a/libgcc/config/i386/cpuinfo.c
+++ b/libgcc/config/i386/cpuinfo.c
@@ -96,7 +96,12 @@ enum processor_features
   FEATURE_SSE4_A,
   FEATURE_FMA4,
   FEATURE_XOP,
-  FEATURE_FMA
+  FEATURE_FMA,
+  FEATURE_XSAVE,
+  FEATURE_OSXSAVE,
+  FEATURE_MPX,
+  FEATURE_BNDREGS,
+  FEATURE_BNDCSR
 };
 
 struct __processor_model
@@ -270,6 +275,10 @@ get_available_features (unsigned int ecx, unsigned int edx,
     features |= (1 << FEATURE_AVX);
   if (ecx & bit_FMA)
     features |= (1 << FEATURE_FMA);
+  if (ecx & bit_XSAVE)
+    features |= (1 << FEATURE_XSAVE);
+  if (ecx & bit_OSXSAVE)
+    features |= (1 << FEATURE_OSXSAVE);
 
   /* Get Advanced Features at level 7 (eax = 7, ecx = 0). */
   if (max_cpuid_level >= 7)
@@ -278,6 +287,19 @@ get_available_features (unsigned int ecx, unsigned int edx,
       __cpuid_count (7, 0, eax, ebx, ecx, edx);
       if (ebx & bit_AVX2)
         features |= (1 << FEATURE_AVX2);
+      if (ebx & bit_MPX)
+        features |= (1 << FEATURE_MPX);
+    }
+
+  /* Get Advanced Features at level 13 (eax = 13, ecx = 0). */
+  if (max_cpuid_level >= 13)
+    {
+      unsigned int eax, ebx, ecx, edx;
+      __cpuid_count (13, 0, eax, ebx, ecx, edx);
+      if (eax & bit_BNDREGS)
+        features |= (1 << FEATURE_BNDREGS);
+      if (eax & bit_BNDCSR)
+        features |= (1 << FEATURE_BNDCSR);
     }
 
   unsigned int ext_level;
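An editorial sketch of the bit tests this patch adds, written so it runs on any host: the register values are passed in as plain integers instead of being read with cpuid. The bit positions are the ones from the patch; the function names, the small enum and the unsigned constants are assumptions made for illustration only.

```c
#include <stdbool.h>

/* Bit positions taken from the patch above.  */
#define bit_MPX     (1u << 14)  /* leaf 7, sub-leaf 0, %ebx */
#define bit_BNDREGS (1u << 3)   /* leaf 13, sub-leaf 0, %eax */
#define bit_BNDCSR  (1u << 4)   /* leaf 13, sub-leaf 0, %eax */

/* Hypothetical, shortened feature list for this sketch only.  */
enum feature { FEATURE_MPX, FEATURE_BNDREGS, FEATURE_BNDCSR };

/* Translate raw register values into a feature mask.  The caller is
   assumed to have obtained EBX_LEAF7 and EAX_LEAF13 from cpuid.  */
static unsigned
decode_mpx_features (unsigned ebx_leaf7, unsigned eax_leaf13)
{
  unsigned features = 0;
  if (ebx_leaf7 & bit_MPX)
    features |= 1u << FEATURE_MPX;
  if (eax_leaf13 & bit_BNDREGS)
    features |= 1u << FEATURE_BNDREGS;
  if (eax_leaf13 & bit_BNDCSR)
    features |= 1u << FEATURE_BNDCSR;
  return features;
}

/* "Fully supported" here means all three bits are present.  */
static bool
mpx_fully_supported (unsigned features)
{
  const unsigned need = (1u << FEATURE_MPX)
                        | (1u << FEATURE_BNDREGS)
                        | (1u << FEATURE_BNDCSR);
  return (features & need) == need;
}
```

Keeping the decoding separate from the cpuid call is only a choice made for this sketch; the patch itself performs both steps inside get_available_features.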
Re: [PATCH] PR ipa/63909 ICE: SIGSEGV in ipa_icf_gimple::func_checker::compare_bb()
On Thu, Nov 20, 2014 at 5:30 PM, Martin Liška mli...@suse.cz wrote:
> Hello.
>
> Following patch fixes ICE in IPA ICF. Problem was that number of non-debug
> statements in a BB can change (for instance by IPA split), so that the
> number is recomputed.

Huh, so can it get different for both candidates?  I think the stmt compare loop should be terminated on gsi_end_p of either iterator and return false for any remaining non-debug-stmts on the other.  Thus, not walk all stmts twice here.

As IPA split is run early I don't see how it should affect a real IPA pass though?

Thanks,
Richard.

> Patch can bootstrap on x86_64-linux-pc and no regression has been seen.
>
> Ready for trunk?
> Thanks,
> Martin
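An editorial sketch of the single-pass comparison suggested in this reply: stop when either sequence runs out and report a mismatch if the other still has non-debug statements. The statement type and both function names are hypothetical simplifications, not GCC's gimple iterator API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified statement: a kind code plus a debug flag.  */
struct stmt { int kind; bool is_debug; };

/* Advance *I past debug statements; return true if a non-debug
   statement remains at S[*I].  */
static bool
next_nondebug (const struct stmt *s, size_t n, size_t *i)
{
  while (*i < n && s[*i].is_debug)
    ++*i;
  return *i < n;
}

/* Compare two statement sequences in one pass, ignoring debug
   statements, without counting either sequence first.  */
static bool
compare_stmts (const struct stmt *a, size_t na,
               const struct stmt *b, size_t nb)
{
  size_t i = 0, j = 0;
  for (;;)
    {
      bool more_a = next_nondebug (a, na, &i);
      bool more_b = next_nondebug (b, nb, &j);
      if (!more_a || !more_b)
        /* Equal only if both sequences ran out together.  */
        return more_a == more_b;
      if (a[i].kind != b[j].kind)
        return false;
      ++i;
      ++j;
    }
}
```

Each statement is visited once, which is the saving over recounting both blocks before the comparison starts.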
Re: [PATCH, i386] Add new arg values for __builtin_cpu_supports
On Thu, Nov 20, 2014 at 07:36:03PM +0300, Ilya Enkovich wrote:
> Hi,
>
> MPX runtime checks some feature bits in order to check MPX is fully
> supported. Runtime does it by cpuid calls but there is a
> __builtin_cpu_supports which may be used for that. Unfortunately
> currently it doesn't support required bits. Will it be OK to add them
> for trunk?

I think using cpuid for that is just fine.  __builtin_cpu_supports is for ISA additions users might actually want to version code for; MPX stuff, as the instructions are nops without hw support, is not something one would multi-version a function for.  If anything, AVX512F and AVX512BW+VL might be good candidates for that, not MPX.

	Jakub
[PATCH][ARM] Make issue rate part of per-core tuning structs
Hi all,

This patch makes the arm_issue_rate function look up the issue rate of the processor from the tuning structs. This makes it look more like the aarch64 mechanism and centralises a processor-specific construct in the tuning structs, so we no longer have to remember to update the arm_issue_rate function every time a new core is added.

A new tuning struct is added for the marvell-pj4 in order to decouple it from the 9e tuning struct and enable us to set its correct issue rate to 2.

Bootstrapped and tested on arm-none-gnueabihf.

Ok for trunk?

Thanks,
Kyrill

2014-11-19  Kyrylo Tkachov  kyrylo.tkac...@arm.com

	* config/arm/arm-protos.h (struct tune_params): Add issue_rate field.
	* config/arm/arm.c (arm_slowmul_tune, arm_fastmul_tune,
	arm_strongarm_tune, arm_xscale_tune, arm_9e_tune, arm_v6t2_tune,
	arm_cortex_tune, arm_cortex_a8_tune, arm_cortex_a7_tune,
	arm_cortex_a15_tune, arm_cortex_a53_tune, arm_cortex_a57_tune,
	arm_cortex_a9_tune, arm_cortex_a12_tune, arm_v7m_tune, arm_v6m_tune,
	arm_fa726te_tune, arm_cortex_a5_tune): Specify issue_rate value.
	(arm_issue_rate): Look up issue rate from tuning structs.
	Remove large switch statement.
	(arm_marvell_pj4_tune): New struct.
	* config/arm/arm-cores.def (marvell-pj4): Use arm_marvell_pj4_tune
	struct.

commit a2466d31869cd7edd0a9de14d96427d361d97dd7
Author: Kyrylo Tkachov kyrylo.tkac...@arm.com
Date:   Wed Nov 19 16:24:03 2014 +

    [ARM] refactor issue_rate

diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def
index 637be15..12625c7 100644
--- a/gcc/config/arm/arm-cores.def
+++ b/gcc/config/arm/arm-cores.def
@@ -158,7 +158,7 @@ ARM_CORE("cortex-r7", cortexr7, cortexr7, 7R, FL_LDSCHED | FL_ARM_DIV, cortex
 ARM_CORE("cortex-m7", cortexm7, cortexm7, 7EM, FL_LDSCHED, cortex_m7)
 ARM_CORE("cortex-m4", cortexm4, cortexm4, 7EM, FL_LDSCHED, v7m)
 ARM_CORE("cortex-m3", cortexm3, cortexm3, 7M, FL_LDSCHED, v7m)
-ARM_CORE("marvell-pj4", marvell_pj4, marvell_pj4, 7A, FL_LDSCHED, 9e)
+ARM_CORE("marvell-pj4", marvell_pj4, marvell_pj4, 7A, FL_LDSCHED, marvell_pj4)
 
 /* V7 big.LITTLE implementations */
 ARM_CORE("cortex-a15.cortex-a7", cortexa15cortexa7, cortexa7, 7A, FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV, cortex_a15)
diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 71ce362..7d5bfd3 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -291,6 +291,8 @@ struct tune_params
   int max_insns_inline_memset;
   /* Bitfield encoding the fuseable pairs of instructions.  */
   unsigned int fuseable_ops : 1;
+  /* Issue rate of the processor.  */
+  unsigned int issue_rate;
 };
 
 extern const struct tune_params *current_tune;
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 9aa402f..94db2b2 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -1671,7 +1671,8 @@ const struct tune_params arm_slowmul_tune =
   false, false,       /* Prefer 32-bit encodings.  */
   false,              /* Prefer Neon for stringops.  */
   8,                  /* Maximum insns to inline memset.  */
-  ARM_FUSE_NOTHING    /* Fuseable pairs of instructions.  */
+  ARM_FUSE_NOTHING,   /* Fuseable pairs of instructions.  */
+  1                   /* Issue rate.  */
 };
 
 const struct tune_params arm_fastmul_tune =
@@ -1691,7 +1692,8 @@ const struct tune_params arm_fastmul_tune =
   false, false,       /* Prefer 32-bit encodings.  */
   false,              /* Prefer Neon for stringops.  */
   8,                  /* Maximum insns to inline memset.  */
-  ARM_FUSE_NOTHING    /* Fuseable pairs of instructions.  */
+  ARM_FUSE_NOTHING,   /* Fuseable pairs of instructions.  */
+  1                   /* Issue rate.  */
 };
 
 /* StrongARM has early execution of branches, so a sequence that is worth
@@ -1714,7 +1716,8 @@ const struct tune_params arm_strongarm_tune =
   false, false,       /* Prefer 32-bit encodings.  */
   false,              /* Prefer Neon for stringops.  */
   8,                  /* Maximum insns to inline memset.  */
-  ARM_FUSE_NOTHING    /* Fuseable pairs of instructions.  */
+  ARM_FUSE_NOTHING,   /* Fuseable pairs of instructions.  */
+  1                   /* Issue rate.  */
 };
 
 const struct tune_params arm_xscale_tune =
@@ -1734,7 +1737,8 @@ const struct tune_params arm_xscale_tune =
   false, false,       /* Prefer 32-bit encodings.  */
   false,              /* Prefer Neon for stringops.  */
   8,                  /* Maximum insns to inline memset.  */
-  ARM_FUSE_NOTHING    /* Fuseable pairs of instructions.  */
+  ARM_FUSE_NOTHING,   /* Fuseable pairs of instructions.  */
+  1                   /* Issue rate.  */
 };
 
 const struct tune_params arm_9e_tune =
@@ -1754,7 +1758,29 @@ const struct tune_params arm_9e_tune =
   false, false,       /* Prefer 32-bit encodings.  */
   false,              /* Prefer Neon for stringops.  */
   8,                  /* Maximum insns to inline memset.  */
-  ARM_FUSE_NOTHING    /* Fuseable pairs of instructions.  */
+
Re: [PATCH 1/2] teach mklog to get name / email from git config when available
On Thu, Nov 20, 2014 at 05:22:20PM +0100, Tom de Vries wrote:
> +my $conf = "$ENV{HOME}/.mklog";
> +if (-f $conf) {
> +open (CONF, $conf)
> +  or die "Could not open file '$conf' for reading: $!\n";
> +while (<CONF>) {
> +  if (m/^\s*NAME\s*=\s*(.*)\s*$/) {

The final \s* never matches anything since the .* gobbles up everything.  Use .*? if you really want to get rid of the trailing whitespace.


Segher
Re: [AArch64, Patch] Add range-check for Symbol + offset addressing.
On 20 November 2014 14:33, Tejas Belagod tejas.bela...@arm.com wrote:
> The same patch applies cleanly to 4.9. OK to commit?
>
> Thanks,
> Tejas.

Provided it regresses ok, yes.
/Marcus
Re: [PATCH][ARM] Make issue rate part of per-core tuning structs
I should say that the patch context depends on the macro fusion hook implementation posted here:
https://gcc.gnu.org/ml/gcc-patches/2014-11/msg00958.html

Kyrill

On 20/11/14 16:43, Kyrill Tkachov wrote:
> Hi all,
>
> This patch makes the arm_issue_rate function look up the issue rate of
> the processor from the tuning structs. This makes it look more like the
> aarch64 mechanism and centralises a processor-specific construct in the
> tuning structs, so we no longer have to remember to update the
> arm_issue_rate function every time a new core is added.
>
> A new tuning struct is added for the marvell-pj4 in order to decouple
> it from the 9e tuning struct and enable us to set its correct issue
> rate to 2.
>
> Bootstrapped and tested on arm-none-gnueabihf.
>
> Ok for trunk?
>
> Thanks,
> Kyrill
>
> 2014-11-19  Kyrylo Tkachov  kyrylo.tkac...@arm.com
>
>     * config/arm/arm-protos.h (struct tune_params): Add issue_rate field.
>     * config/arm/arm.c (arm_slowmul_tune, arm_fastmul_tune,
>     arm_strongarm_tune, arm_xscale_tune, arm_9e_tune, arm_v6t2_tune,
>     arm_cortex_tune, arm_cortex_a8_tune, arm_cortex_a7_tune,
>     arm_cortex_a15_tune, arm_cortex_a53_tune, arm_cortex_a57_tune,
>     arm_cortex_a9_tune, arm_cortex_a12_tune, arm_v7m_tune, arm_v6m_tune,
>     arm_fa726te_tune, arm_cortex_a5_tune): Specify issue_rate value.
>     (arm_issue_rate): Look up issue rate from tuning structs.
>     Remove large switch statement.
>     (arm_marvell_pj4_tune): New struct.
>     * config/arm/arm-cores.def (marvell-pj4): Use arm_marvell_pj4_tune
>     struct.
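An editorial sketch of the design described in this thread: the issue rate is stored in each core's tuning record and read through the current-tuning pointer, so there is no per-core switch to keep up to date. The two-field struct and the names below are simplified assumptions; the real tune_params has many more fields.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical tuning record for this sketch only.  */
struct tune_params
{
  const char *name;
  unsigned int issue_rate;
};

/* Values follow the thread: 1 for the slowmul tuning, 2 for
   marvell-pj4.  */
static const struct tune_params slowmul_tune = { "slowmul", 1 };
static const struct tune_params marvell_pj4_tune = { "marvell-pj4", 2 };

/* Selected once, when the target core is chosen.  */
static const struct tune_params *current_tune = &slowmul_tune;

/* The lookup is a field read, with no knowledge of individual cores.  */
static int
issue_rate (void)
{
  assert (current_tune != NULL);
  return (int) current_tune->issue_rate;
}
```

Adding a core then means adding one record; the lookup function does not change.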
[PATCH] Fix ubsan and C++14 constexpr ICEs (PR sanitizer/63956)
This patch fixes a bunch of ICEs related to C++14 constexprs and -fsanitize=undefined. We should ignore ubsan internal functions and ubsan builtins in constexpr functions in cxx_eval_call_expression.

Also add proper printing of internal functions into the C++ printer.

Bootstrapped/regtested on ppc64-linux, ok for trunk?

2014-11-20  Marek Polacek  pola...@redhat.com

	PR sanitizer/63956
	* constexpr.c: Include ubsan.h.
	(cxx_eval_call_expression): Bail out for IFN_UBSAN_{NULL,BOUNDS}
	internal functions and for ubsan builtins in constexpr functions.
	* error.c: Include internal-fn.h.
	(dump_expr): Add printing of internal functions.

	* g++.dg/ubsan/pr63956.C: New test.

diff --git gcc/cp/constexpr.c gcc/cp/constexpr.c
index 2678223..684e36f 100644
--- gcc/cp/constexpr.c
+++ gcc/cp/constexpr.c
@@ -32,6 +32,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimplify.h"
 #include "builtins.h"
 #include "tree-inline.h"
+#include "ubsan.h"
 
 static bool verify_constant (tree, bool, bool *, bool *);
 #define VERIFY_CONSTANT(X) \
@@ -1151,6 +1152,16 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree t,
   constexpr_call *entry;
   bool depth_ok;
 
+  if (fun == NULL_TREE)
+    switch (CALL_EXPR_IFN (t))
+      {
+      case IFN_UBSAN_NULL:
+      case IFN_UBSAN_BOUNDS:
+        return void_node;
+      default:
+        break;
+      }
+
   if (TREE_CODE (fun) != FUNCTION_DECL)
     {
       /* Might be a constexpr function pointer.  */
@@ -1171,6 +1182,10 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree t,
     }
   if (DECL_CLONED_FUNCTION_P (fun))
     fun = DECL_CLONED_FUNCTION (fun);
+
+  if (!current_function_decl && is_ubsan_builtin_p (fun))
+    return void_node;
+
   if (is_builtin_fn (fun))
     return cxx_eval_builtin_function_call (ctx, t,
                                            addr, non_constant_p, overflow_p);
diff --git gcc/cp/error.c gcc/cp/error.c
index 76f86cb..09789ad 100644
--- gcc/cp/error.c
+++ gcc/cp/error.c
@@ -33,6 +33,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-pretty-print.h"
 #include "c-family/c-objc.h"
 #include "ubsan.h"
+#include "internal-fn.h"
 
 #include <new>  // For placement-new.
 
@@ -2037,6 +2038,14 @@ dump_expr (cxx_pretty_printer *pp, tree t, int flags)
         tree fn = CALL_EXPR_FN (t);
         bool skipfirst = false;
 
+        /* Deal with internal functions.  */
+        if (fn == NULL_TREE)
+          {
+            pp_string (pp, internal_fn_name (CALL_EXPR_IFN (t)));
+            dump_call_expr_args (pp, t, flags, skipfirst);
+            break;
+          }
+
         if (TREE_CODE (fn) == ADDR_EXPR)
           fn = TREE_OPERAND (fn, 0);
 
diff --git gcc/testsuite/g++.dg/ubsan/pr63956.C gcc/testsuite/g++.dg/ubsan/pr63956.C
index e69de29..7bc0b77 100644
--- gcc/testsuite/g++.dg/ubsan/pr63956.C
+++ gcc/testsuite/g++.dg/ubsan/pr63956.C
@@ -0,0 +1,172 @@
+// PR sanitizer/63956
+// { dg-do compile }
+// { dg-options "-std=c++14 -fsanitize=undefined,float-divide-by-zero,float-cast-overflow" }
+
+#define SA(X) static_assert((X),#X)
+#define INT_MIN (-__INT_MAX__ - 1)
+
+constexpr int
+fn1 (int a, int b)
+{
+  if (b != 2)
+    a <<= b;
+  return a;
+}
+
+constexpr int i1 = fn1 (5, 3);
+constexpr int i2 = fn1 (5, -2);
+constexpr int i3 = fn1 (5, sizeof (int) * __CHAR_BIT__);
+constexpr int i4 = fn1 (5, 256);
+constexpr int i5 = fn1 (5, 2);
+constexpr int i6 = fn1 (-2, 4);
+constexpr int i7 = fn1 (0, 2);
+
+SA (i1 == 40);
+SA (i5 == 5);
+SA (i7 == 0);
+
+constexpr int
+fn2 (int a, int b)
+{
+  if (b != 2)
+    a >>= b;
+  return a;
+}
+
+constexpr int j1 = fn2 (4, 1);
+constexpr int j2 = fn2 (4, -1);
+constexpr int j3 = fn2 (10, sizeof (int) * __CHAR_BIT__);
+constexpr int j4 = fn2 (1, 256);
+constexpr int j5 = fn2 (5, 2);
+constexpr int j6 = fn2 (-2, 4);
+constexpr int j7 = fn2 (0, 4);
+
+SA (j1 == 2);
+SA (j5 == 5);
+SA (j7 == 0);
+
+constexpr int
+fn3 (int a, int b)
+{
+  if (b != 2)
+    a = a / b;
+  return a;
+}
+
+constexpr int k1 = fn3 (8, 4);
+constexpr int k2 = fn3 (7, 0); // { dg-error "is not a constant expression|constexpr call flows off" }
+constexpr int k3 = fn3 (INT_MIN, -1); // { dg-error "overflow in constant expression|constexpr call flows off" }
+
+SA (k1 == 2);
+
+constexpr float
+fn4 (float a, float b)
+{
+  if (b != 2.0)
+    a = a / b;
+  return a;
+}
+
+constexpr float l1 = fn4 (5.0, 3.0);
+constexpr float l2 = fn4 (7.0, 0.0); // { dg-error "is not a constant expression|constexpr call flows off" }
+
+constexpr int
+fn5 (const int *a, int b)
+{
+  if (b != 2)
+    b = a[b];
+  return b;
+}
+
+constexpr int m1[4] = { 1, 2, 3, 4 };
+constexpr int m2 = fn5 (m1, 3);
+constexpr int m3 = fn5 (m1, 4); // { dg-error "array subscript out of bound|constexpr call flows off" }
+
+constexpr int
+fn6 (const int &a, int b)
+{
+  if (b != 2)
+    b = a;
+  return b;
+}
+
+constexpr int
+fn7 (const int *a, int b)
+{
+  if
Re: [PATCH] Fix ubsan and C++14 constexpr ICEs (PR sanitizer/63956)
On Thu, Nov 20, 2014 at 06:14:52PM +0100, Marek Polacek wrote:
> This patch fixes a bunch of ICEs related to C++14 constexprs and
> -fsanitize=undefined. We should ignore ubsan internal functions
> and ubsan builtins in constexpr functions in cxx_eval_call_expression.
>
> Also add proper printing of internal functions into the C++ printer.
>
> Bootstrapped/regtested on ppc64-linux, ok for trunk?

I'd like Jason to review this.  But a few nits:

> @@ -1171,6 +1182,10 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree t,
>      }
>    if (DECL_CLONED_FUNCTION_P (fun))
>      fun = DECL_CLONED_FUNCTION (fun);
> +
> +  if (!current_function_decl && is_ubsan_builtin_p (fun))
> +    return void_node;
> +

I don't understand the !current_function_decl here.

Also, looking at is_ubsan_builtin_p definition, I'd say it should IMHO at least test DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_NORMAL before comparing the function name, you can declare __builtin_ubsan_foobarbaz () and use it and it won't be a builtin.

As for the testcase, I'd like to understand if C++ FE should reject the constexpr functions when used with arguments that trigger undefined behavior.  But certainly the behavior should not depend on whether -fsanitize=undefined or not.

Also, what is the reason for constexpr call flows off the end errors?  Shouldn't that be avoided if any error is found while interpreting the function?

	Jakub
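An editorial sketch of the second nit: check that a declaration really is a built-in before trusting its name. Everything here is a hypothetical simplification (the struct, the enum, and the "__builtin_ubsan_" prefix are made up for illustration and are not taken from GCC's is_ubsan_builtin_p).

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical, simplified view of a function declaration.  */
enum builtin_class { NOT_BUILT_IN, BUILT_IN_NORMAL };
struct fn_decl
{
  const char *name;
  enum builtin_class cls;
};

/* Accept a declaration only if it is a normal built-in AND its name
   carries the (illustrative) sanitizer prefix.  A user declaration
   that merely reuses the prefix is rejected by the class test.  */
static bool
is_sanitizer_builtin (const struct fn_decl *fn)
{
  static const char prefix[] = "__builtin_ubsan_";
  return fn->cls == BUILT_IN_NORMAL
         && strncmp (fn->name, prefix, sizeof prefix - 1) == 0;
}
```

Testing the class first also avoids a string comparison for the common case of ordinary user functions.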
[gomp4] fix a fortran bootstrap failure
This patch resolves a bootstrap failure in gomp-4_0-branch, which I probably introduced after I switched the cache error message from a gfc_error to a sorry. The code parameter isn't being used anymore by resolve_oacc_cache, so I've explicitly marked it as unused.

I've applied this patch to gomp-4_0-branch.

Cesar

2014-11-20  Cesar Philippidis  ce...@codesourcery.com

	gcc/fortran/
	* openmp.c (resolve_oacc_cache): Mark the code parameter as unused.

Index: gcc/fortran/openmp.c
===
--- gcc/fortran/openmp.c	(revision 442301)
+++ gcc/fortran/openmp.c	(working copy)
@@ -4600,7 +4600,7 @@ resolve_oacc_loop (gfc_code *code)
 
 
 static void
-resolve_oacc_cache (gfc_code *code)
+resolve_oacc_cache (gfc_code *code ATTRIBUTE_UNUSED)
 {
   sorry ("Sorry, !$ACC cache unimplemented yet");
 }
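An editorial sketch of marking a parameter as unused so that an unused-parameter warning (presumably promoted to an error during bootstrap) does not fire. It assumes a GCC-compatible compiler; GCC's own sources get ATTRIBUTE_UNUSED from a shared header, while here it is defined locally so the sketch stands alone, and resolve_stub is a hypothetical stand-in.

```c
/* Local definition for this sketch; expands to nothing on compilers
   without the GNU attribute syntax.  */
#if defined (__GNUC__)
# define ATTRIBUTE_UNUSED __attribute__ ((__unused__))
#else
# define ATTRIBUTE_UNUSED
#endif

/* Hypothetical stand-in for a resolver that no longer looks at its
   argument but must keep its signature.  */
static int
resolve_stub (const void *code ATTRIBUTE_UNUSED)
{
  return 0;
}
```

Keeping the parameter preserves the function's signature for its callers, which is why the attribute is used instead of removing the argument.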
Re: [PATCH] Fix ICEs in simplify_immed_subreg on OImode/XImode subregs (PR target/63910)
On Wed, Nov 19, 2014 at 02:23:47PM -0800, Mike Stump wrote:
> On Nov 19, 2014, at 1:57 PM, Jakub Jelinek ja...@redhat.com wrote:
> > Though, following patch is just fine for me too, I don't think it will
> > make a significant difference:
>
> This version is fine by me.

Richard, are you ok with that too?
Bootstrapped/regtested on x86_64-linux and i686-linux now.

2014-11-20  Jakub Jelinek  ja...@redhat.com

	PR target/63910
	* simplify-rtx.c (simplify_immed_subreg): Return NULL for integer
	modes wider than MAX_BITSIZE_MODE_ANY_INT.  If not using
	CONST_WIDE_INT, make sure r fits into CONST_DOUBLE.

	* gcc.target/i386/pr63910.c: New test.

--- gcc/simplify-rtx.c.jj	2014-11-19 09:17:15.491327992 +0100
+++ gcc/simplify-rtx.c	2014-11-19 12:28:16.223808178 +0100
@@ -5504,6 +5504,8 @@ simplify_immed_subreg (machine_mode oute
             HOST_WIDE_INT tmp[MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_WIDE_INT];
             wide_int r;
 
+            if (GET_MODE_PRECISION (outer_submode) > MAX_BITSIZE_MODE_ANY_INT)
+              return NULL_RTX;
             for (u = 0; u < units; u++)
               {
                 unsigned HOST_WIDE_INT buf = 0;
@@ -5515,10 +5517,13 @@ simplify_immed_subreg (machine_mode oute
                 tmp[u] = buf;
                 base += HOST_BITS_PER_WIDE_INT;
               }
-            gcc_assert (GET_MODE_PRECISION (outer_submode)
-                        <= MAX_BITSIZE_MODE_ANY_INT);
             r = wide_int::from_array (tmp, units,
                                       GET_MODE_PRECISION (outer_submode));
+#if TARGET_SUPPORTS_WIDE_INT == 0
+            /* Make sure r will fit into CONST_INT or CONST_DOUBLE.  */
+            if (wi::min_precision (r, SIGNED) > HOST_BITS_PER_DOUBLE_INT)
+              return NULL_RTX;
+#endif
             elems[elem] = immed_wide_int_const (r, outer_submode);
           }
           break;
--- gcc/testsuite/gcc.target/i386/pr63910.c.jj	2014-11-19 12:04:23.490489130 +0100
+++ gcc/testsuite/gcc.target/i386/pr63910.c	2014-11-19 12:04:23.490489130 +0100
@@ -0,0 +1,12 @@
+/* PR target/63910 */
+/* { dg-do compile } */
+/* { dg-options "-O -mstringop-strategy=vector_loop -mavx512f" } */
+
+extern void bar (float *c);
+
+void
+foo (void)
+{
+  float c[1024] = { };
+  bar (c);
+}

	Jakub
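An editorial sketch of the shape of this fix: test the width before filling a fixed-size buffer and decline gracefully, instead of asserting after the buffer has already been written. The limits, types and the pack_words name are hypothetical stand-ins chosen so the example runs on its own; it is not the simplify_immed_subreg code.

```c
#include <stdbool.h>

#define MAX_BITS  128u  /* hypothetical stand-in for MAX_BITSIZE_MODE_ANY_INT */
#define WORD_BITS 64u   /* hypothetical stand-in for HOST_BITS_PER_WIDE_INT */

/* Pack PRECISION bits of little-endian BYTES into OUT, which holds
   MAX_BITS / WORD_BITS words.  Return false, leaving OUT untouched,
   when the value is too wide for the buffer.  */
static bool
pack_words (const unsigned char *bytes, unsigned precision,
            unsigned long long out[MAX_BITS / WORD_BITS])
{
  unsigned nbytes = (precision + 7) / 8;
  unsigned units = (precision + WORD_BITS - 1) / WORD_BITS;

  /* Decline values wider than the buffer before writing anything.  */
  if (precision > MAX_BITS)
    return false;

  for (unsigned u = 0; u < units; u++)
    {
      unsigned long long buf = 0;
      for (unsigned i = 0; i < WORD_BITS / 8 && u * 8 + i < nbytes; i++)
        buf |= (unsigned long long) bytes[u * 8 + i] << (i * 8);
      out[u] = buf;
    }
  return true;
}
```

Returning a failure value lets the caller fall back to leaving the expression unsimplified, which mirrors the `return NULL_RTX` in the patch.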
[PATCH] Fix ICE with non-lvalue vector subscripts and make sure non-lvalue vector subscripts aren't used as lvalues (PR target/63764)
Hi!

This patch fixes ICEs if a non-lvalue vector (say cast of one vector to another vector type) was subscripted and used as lhs.
The following patch, if *vecp is not lvalue, will copy it to a temporary variable which can be made addressable for the subscription, and afterwards wrap it into a NON_LVALUE_EXPR so that it is properly rejected if later used on the lhs.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2014-11-20  Jakub Jelinek  ja...@redhat.com

	PR target/63764
c-family/
	* c-common.h (convert_vector_to_pointer_for_subscript): Change
	return type to bool.
	* c-common.c: Include gimple-expr.c.
	(convert_vector_to_pointer_for_subscript): Change return type to
	bool.  If *vecp is not lvalue_p and has VECTOR_TYPE, return true
	and copy it into a TARGET_EXPR and use that instead of *vecp
	directly.
c/
	* c-typeck.c (build_array_ref): Adjust
	convert_vector_to_pointer_for_subscript caller.  If it returns true,
	call non_lvalue_loc on the result.
cp/
	* typeck.c (cp_build_array_ref): Adjust
	convert_vector_to_pointer_for_subscript caller.  If it returns true,
	call non_lvalue_loc on the result.
testsuite/
	* c-c++-common/pr63764-1.c: New test.
	* c-c++-common/pr63764-2.c: New test.

--- gcc/c-family/c-common.h.jj	2014-11-19 15:39:26.606065628 +0100
+++ gcc/c-family/c-common.h	2014-11-20 08:38:02.527655971 +0100
@@ -1310,7 +1310,7 @@ extern tree build_userdef_literal (tree
 					   enum overflow_type overflow,
 					   tree num_string);
 
-extern void convert_vector_to_pointer_for_subscript (location_t, tree*, tree);
+extern bool convert_vector_to_pointer_for_subscript (location_t, tree *, tree);
 
 /* Possibe cases of scalar_to_vector conversion.  */
 enum stv_conv {
--- gcc/c-family/c-common.c.jj	2014-11-19 15:39:26.606065628 +0100
+++ gcc/c-family/c-common.c	2014-11-20 08:50:21.000573676 +0100
@@ -60,6 +60,7 @@ along with GCC; see the file COPYING3.
 #include "target-def.h"
 #include "gimplify.h"
 #include "wide-int-print.h"
+#include "gimple-expr.h"
 
 cpp_reader *parse_in;		/* Declared in c-pragma.h.  */
 
@@ -12030,22 +12031,47 @@ build_userdef_literal (tree suffix_id, t
 }
 
 /* For vector[index], convert the vector to a
-   pointer of the underlying type.  */
-void
+   pointer of the underlying type.  Return true if the resulting
+   ARRAY_REF should not be an lvalue.  */
+
+bool
 convert_vector_to_pointer_for_subscript (location_t loc,
-					 tree* vecp, tree index)
+					 tree *vecp, tree index)
 {
+  bool ret = false;
   if (TREE_CODE (TREE_TYPE (*vecp)) == VECTOR_TYPE)
     {
       tree type = TREE_TYPE (*vecp);
       tree type1;
 
+      ret = !lvalue_p (*vecp);
       if (TREE_CODE (index) == INTEGER_CST)
         if (!tree_fits_uhwi_p (index)
             || tree_to_uhwi (index) >= TYPE_VECTOR_SUBPARTS (type))
           warning_at (loc, OPT_Warray_bounds, "index value is out of bound");
 
-      c_common_mark_addressable_vec (*vecp);
+      if (ret)
+        {
+          tree tmp = create_tmp_var_raw (type, NULL);
+          DECL_SOURCE_LOCATION (tmp) = loc;
+          *vecp = c_save_expr (*vecp);
+          if (TREE_CODE (*vecp) == C_MAYBE_CONST_EXPR)
+            {
+              bool non_const = C_MAYBE_CONST_EXPR_NON_CONST (*vecp);
+              *vecp = C_MAYBE_CONST_EXPR_EXPR (*vecp);
+              *vecp
+                = c_wrap_maybe_const (build4 (TARGET_EXPR, type, tmp,
+                                              *vecp, NULL_TREE, NULL_TREE),
+                                      non_const);
+            }
+          else
+            *vecp = build4 (TARGET_EXPR, type, tmp, *vecp,
+                            NULL_TREE, NULL_TREE);
+          SET_EXPR_LOCATION (*vecp, loc);
+          c_common_mark_addressable_vec (tmp);
+        }
+      else
+        c_common_mark_addressable_vec (*vecp);
       type = build_qualified_type (TREE_TYPE (type), TYPE_QUALS (type));
       type1 = build_pointer_type (TREE_TYPE (*vecp));
       bool ref_all = TYPE_REF_CAN_ALIAS_ALL (type1);
@@ -12065,6 +12091,7 @@ convert_vector_to_pointer_for_subscript
       *vecp = build1 (ADDR_EXPR, type1, *vecp);
       *vecp = convert (type, *vecp);
     }
+  return ret;
 }
 
 /* Determine which of the operands, if any, is a scalar that needs to be
--- gcc/c/c-typeck.c.jj	2014-11-19 15:39:24.044113650 +0100
+++ gcc/c/c-typeck.c	2014-11-20 08:38:02.534655847 +0100
@@ -2495,7 +2495,8 @@ build_array_ref (location_t loc, tree ar
 
   gcc_assert (TREE_CODE (TREE_TYPE (index)) == INTEGER_TYPE);
 
-  convert_vector_to_pointer_for_subscript (loc, &array, index);
+  bool non_lvalue
+    = convert_vector_to_pointer_for_subscript (loc, &array, index);
 
   if (TREE_CODE (TREE_TYPE (array)) == ARRAY_TYPE)
     {
@@ -2557,6 +2558,8 @@ build_array_ref (location_t loc, tree
Re: SRA: don't drop clobbers
Hi,

On Mon, Nov 03, 2014 at 10:46:49PM +0100, Marc Glisse wrote:
On Mon, 3 Nov 2014, Marc Glisse wrote:
On Mon, 3 Nov 2014, Martin Jambor wrote:

I just applied your patch on top of trunk revision 217032 on my

Ah, that explains it, thanks. This patch is a follow-up to r217034. Still, I didn't expect the ICE you are seeing by applying this patch to older trunk, I'll try to reproduce that.

It is TODO_update_address_taken that used to remove clobbers, and as you said ESRA goes straight to TODO_update_ssa, which explains why the clobbers caused trouble. In any case, after r217034, update_ssa should handle clobbers much better. Could you take another look based on a more recent trunk, please?

Sorry for the delay. Anyway, on the current trunk (i.e. Tuesday checkout) the patch works as expected: there are assignments from default definitions now and, even though we do not warn as we should, the patch improves the generated code. The function foo from the testcase is optimized to return SR.1_2(D); as soon as release_ssa now, whereas unpatched trunk leaves an undefined load even in the optimized dump.

Thus, I like the patch and given that you posted it well before stage1 end, I'd like to see it committed. Richi, can you have a look and perhaps approve it?

Thanks,

Martin
Re: [Aarch64][BE][2/2] Fix vector load/stores to not use ld1/st1
On 14 November 2014 16:48, Alan Hayward alan.hayw...@arm.com wrote:

This is a new version of my BE patch from a few weeks ago. This is part 2 and covers all the aarch64 changes. When combined with the first patch, it fixes up movoi/ci/xi for Big Endian, so that we end up with the lsb of a big-endian integer in the low byte of the highest-numbered register.

This patch requires part 1 and David Sherwood’s patch: [AArch64] [BE] [1/2] Make large opaque integer modes endianness-safe.

When tested with David’s patch and [1/2] of this patch, no regressions were seen when testing aarch64 and x86_64 on make check.

Changelog:

2014-11-14  Alan Hayward  alan.hayw...@arm.com

* config/aarch64/aarch64.c (aarch64_classify_address): Allow extra addressing modes for BE.
(aarch64_print_operand): new operand for printing a q register+1.

Just a bunch of ChangeLog nits. ChangeLog entries are sentences. All of these entries should start with a capital letter. Perhaps this one would be better written as:

Add 'R' specifier.

(aarch64_simd_emit_reg_reg_move): replacement for

Replace with just:

(aarch64_simd_emit_reg_reg_move): Remove.

* config/aarch64/aarch64-protos.h (aarch64_simd_emit_reg_reg_move): replacement for aarch64_simd_disambiguate_copy.

How about:

(aarch64_simd_disambiguate_copy): Define.

etc

* config/aarch64/aarch64-simd.md (define_split): Use new aarch64_simd_emit_reg_reg_move.
(define_expand movmode): less restrictive predicates.
(define_insn *aarch64_movmode): Simplify and only allow for LE.
(define_insn *aarch64_be_movoi): New. BE only. Plant ldp or stp.

Just say: Define.

(define_insn *aarch64_be_movci): New. BE only. No instructions.
(define_insn *aarch64_be_movxi): New. BE only. No instructions.

Likewise.

(define_split): OI mov. Use new aarch64_simd_emit_reg_reg_move.
(define_split): CI mov. Use new aarch64_simd_emit_reg_reg_move. On BE plant movs for reg to/from mem case.

Drop this part.

(define_split): XI mov. Use new aarch64_simd_emit_reg_reg_move. On BE plant movs for reg to/from mem case.

Likewise.

+void aarch64_simd_emit_reg_reg_move (rtx *operands, enum machine_mode mode,
+                                     unsigned int count);

Drop the formal argument names.

Can you respin with these changes please?

/Marcus
[PATCH] Fix tree-ssa-strlen ICE introduced by r211956 (PR tree-optimization/61773)
Hi!

Before the r211956 changes, the only places that set si->stmt were required to check that stpcpy has been declared (with the right prototype) to signal the strlen pass that it can use stpcpy for optimization. But r211956 sets si->stmt also for a malloc call, which isn't in any way related to stpcpy. So, this patch moves the assertion where it really is needed (for strcat/strcpy and their checking variants cases).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2014-11-20  Jakub Jelinek  ja...@redhat.com

        PR tree-optimization/61773
        * tree-ssa-strlen.c (get_string_length): Don't assert stpcpy has been prototyped if si->stmt is BUILT_IN_MALLOC.

        * gcc.dg/pr61773.c: New test.

--- gcc/tree-ssa-strlen.c.jj    2014-11-19 18:47:59.0 +0100
+++ gcc/tree-ssa-strlen.c       2014-11-20 09:46:33.949017462 +0100
@@ -430,7 +430,6 @@ get_string_length (strinfo si)
       callee = gimple_call_fndecl (stmt);
       gcc_assert (callee && DECL_BUILT_IN_CLASS (callee) == BUILT_IN_NORMAL);
       lhs = gimple_call_lhs (stmt);
-      gcc_assert (builtin_decl_implicit_p (BUILT_IN_STPCPY));
       /* unshare_strinfo is intentionally not called here.  The (delayed)
          transformation of strcpy or strcat into stpcpy is done at the place
          of the former strcpy/strcat call and so can affect all the strinfos
@@ -479,6 +478,7 @@ get_string_length (strinfo si)
         case BUILT_IN_STRCPY_CHK:
         case BUILT_IN_STRCPY_CHKP:
         case BUILT_IN_STRCPY_CHK_CHKP:
+          gcc_assert (builtin_decl_implicit_p (BUILT_IN_STPCPY));
           if (gimple_call_num_args (stmt) == (with_bounds ? 4 : 2))
             fn = builtin_decl_implicit (BUILT_IN_STPCPY);
           else
--- gcc/testsuite/gcc.dg/pr61773.c.jj   2014-11-20 10:12:48.664616764 +0100
+++ gcc/testsuite/gcc.dg/pr61773.c      2014-11-20 10:13:47.384557904 +0100
@@ -0,0 +1,16 @@
+/* PR tree-optimization/61773 */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+void
+foo (char **x)
+{
+  char *p = __builtin_malloc (64);
+  char *q = __builtin_malloc (64);
+  __builtin_strcat (q, "abcde");
+  __builtin_strcat (p, "ab");
+  p[1] = q[3];
+  __builtin_strcat (p, q);
+  x[0] = p;
+  x[1] = q;
+}

        Jakub
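
For readers wondering why the strlen pass cares about stpcpy at all, here is a minimal user-level sketch, not taken from the patch. It only illustrates the property the strcpy/strcat to stpcpy transformation relies on: stpcpy (POSIX.1-2008) returns a pointer to the terminating NUL it stored, so the resulting length falls out of a pointer subtraction instead of a second scan. The helper name length_after_append is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L  /* ask for the stpcpy declaration */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, for illustration only: append SRC to the string
   already stored in DST, whose current length DST_LEN is known, and
   return the new length without rescanning DST.  The caller must make
   sure DST has room for the result.  */
static size_t
length_after_append (char *dst, size_t dst_len, const char *src)
{
  assert (dst != NULL && src != NULL);
  /* stpcpy returns a pointer to the NUL terminator it wrote.  */
  char *end = stpcpy (dst + dst_len, src);
  return (size_t) (end - dst);
}
```

With a buffer holding "ab" and a known length of 2, appending "cde" this way yields the new length directly, which is roughly the bookkeeping the pass wants to keep once it has rewritten a strcat.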
Re: New patch: [AArch64] [BE] [1/2] Make large opaque integer modes endianness-safe.
On 13 November 2014 10:09, David Sherwood david.sherw...@arm.com wrote:

gcc/:
2014-11-13  David Sherwood  david.sherw...@arm.com

* config/aarch64/aarch64-protos.h (aarch64_simd_attr_length_rglist, aarch64_reverse_mask): New decls.
* config/aarch64/iterators.md (UNSPEC_REV_REGLIST): New enum.
* config/aarch64/iterators.md (insn_count): New mode_attr.
* config/aarch64/aarch64-simd.md (vec_store_lanes(o/c/x)i, vec_load_lanes(o/c/x)i): Fixed to work for Big Endian.

Spell these out in full please, some folks like to be able to grep for function names in these logs.

* config/aarch64/aarch64-simd.md (aarch64_rev_reglist, aarch64_simd_(ld/st)(2/3/4)): Added.

Likewise.

* config/aarch64/aarch64.c (aarch64_simd_attr_length_rglist, aarch64_reverse_mask): Added.

It isn't clear to me how far through the various BE patches we need to get before 59810 is actually resolved?

Cheers
/Marcus
[ia64 PATCH] Fix up ia64 attribute handling (PR target/61137)
Hi!

Seems the gcc.target/ia64/small-addr-1.c testcase is failing on ia64 since r210262 but clearly has been failing for much longer if compiled with C++ (just there is insufficient testsuite coverage).

The problem is that for the model attribute (and apparently common_object on VMS too), the argument of that attribute is supposed to be an identifier rather than expression (for common_object either an identifier or string), and these days one has to tell the frontends about that in order not to get the argument parsed as an expression.

The following untested patch fixes that (tested on small-addr-1.c with a cross-compiler), I don't have ia64 hw nor spare cycles to test this though, so I'm just offering the patch as is if anyone wants to test it. Perhaps better testsuite coverage wouldn't hurt (test the model (small) attribute also in C++, perhaps test the common_object attribute on VMS?).

2014-11-20  Jakub Jelinek  ja...@redhat.com

        PR target/61137
        * config/ia64/ia64.c (ia64_attribute_takes_identifier_p): New function.
        (TARGET_ATTRIBUTE_TAKES_IDENTIFIER_P): Redefine to it.

--- gcc/config/ia64/ia64.c.jj   2014-11-11 00:06:23.0 +0100
+++ gcc/config/ia64/ia64.c      2014-11-20 11:51:59.729478773 +0100
@@ -324,6 +324,7 @@ static bool ia64_vms_valid_pointer_mode
 static tree ia64_vms_common_object_attribute (tree *, tree, tree, int, bool *)
      ATTRIBUTE_UNUSED;
 
+static bool ia64_attribute_takes_identifier_p (const_tree);
 static tree ia64_handle_model_attribute (tree *, tree, tree, int, bool *);
 static tree ia64_handle_version_id_attribute (tree *, tree, tree, int, bool *);
 static void ia64_encode_section_info (tree, rtx, int);
@@ -669,8 +670,26 @@ static const struct attribute_spec ia64_
 #undef TARGET_VECTORIZE_VEC_PERM_CONST_OK
 #define TARGET_VECTORIZE_VEC_PERM_CONST_OK ia64_vectorize_vec_perm_const_ok
 
+#undef TARGET_ATTRIBUTE_TAKES_IDENTIFIER_P
+#define TARGET_ATTRIBUTE_TAKES_IDENTIFIER_P ia64_attribute_takes_identifier_p
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
+/* Returns TRUE iff the target attribute indicated by ATTR_ID takes a plain
+   identifier as an argument, so the front end shouldn't look it up.  */
+
+static bool
+ia64_attribute_takes_identifier_p (const_tree attr_id)
+{
+  if (is_attribute_p ("model", attr_id))
+    return true;
+#if TARGET_ABI_OPEN_VMS
+  if (is_attribute_p ("common_object", attr_id))
+    return true;
+#endif
+  return false;
+}
+
 typedef enum
   {
     ADDR_AREA_NORMAL,  /* normal address area */

        Jakub
Re: [PATCH, ifcvt] Fix PR63917
On 11/20/2014 10:48 AM, Zhenqiang Chen wrote:

+/* Check X clobber CC reg or not.  */
+
+static bool
+clobber_cc_p (rtx x)
+{
+  RTX_CODE code = GET_CODE (x);
+  int i;
+
+  if (code == CLOBBER
+      && REG_P (XEXP (x, 0))
+      && (GET_MODE_CLASS (GET_MODE (XEXP (x, 0))) == MODE_CC))
+    return TRUE;
+  else if (code == PARALLEL)
+    for (i = 0; i < XVECLEN (x, 0); i++)
+      if (clobber_cc_p (XVECEXP (x, 0, i)))
+        return TRUE;
+  return FALSE;
+}

Why would you need something like this when modified_between_p or one of its kin ought to do the job?

r~
[PATCH] rs6000: Follow up for signed integer overflow fix
On 2014.11.20 at 08:59 -0500, David Edelsohn wrote:
On Thu, Nov 20, 2014 at 8:27 AM, Markus Trippelsdorf mar...@trippelsdorf.de wrote:

Running the testsuite after bootstrap-ubsan on gcc112 shows several issues. See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63426 for the full list. This patch fixes several of them.

Tested on powerpc64-unknown-linux-gnu. OK for trunk?

Thanks.

2014-11-20  Markus Trippelsdorf  mar...@trippelsdorf.de

        * config/rs6000/constraints.md: Avoid signed integer overflows.
        * config/rs6000/predicates.md: Likewise.
        * config/rs6000/rs6000.c (num_insns_constant_wide): Likewise.
        (includes_rldic_lshift_p): Likewise.
        (includes_rldicr_lshift_p): Likewise.
        * emit-rtl.c (const_wide_int_htab_hash): Likewise.
        * loop-iv.c (determine_max_iter): Likewise.
        (iv_number_of_iterations): Likewise.
        * tree-ssa-loop-ivopts.c (get_computation_cost_at): Likewise.
        * varasm.c (get_section_anchor): Likewise.

The rs6000 patches are okay. Someone like Richi or Jakub needs to approve the changes to the common parts of the compiler.

The patch needs a follow-up. It introduced a new compiler warning that I didn't notice, because I was unintentionally using --disable-werror during testing.

Fixed by casting a few 0s to unsigned HOST_WIDE_INT.

Tested with --enable-werror on powerpc64-unknown-linux-gnu. OK for trunk?

Thanks.

2014-11-20  Markus Trippelsdorf  mar...@trippelsdorf.de

        * config/rs6000/rs6000.c (includes_rldic_lshift_p): Cast 0 to unsigned.
        (includes_rldicr_lshift_p): Likewise.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index a9604cf3fa97..d7958b33ba1a 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -16197,10 +16197,10 @@ includes_rldic_lshift_p (rtx shiftop, rtx andop)
       unsigned HOST_WIDE_INT c, lsb, shift_mask;
 
       c = INTVAL (andop);
-      if (c == 0 || c == ~0)
+      if (c == 0 || c == ~(unsigned HOST_WIDE_INT) 0)
         return 0;
 
-      shift_mask = ~0;
+      shift_mask = ~(unsigned HOST_WIDE_INT) 0;
       shift_mask <<= INTVAL (shiftop);
 
       /* Find the least significant one bit.  */
@@ -16235,7 +16235,7 @@ includes_rldicr_lshift_p (rtx shiftop, rtx andop)
     {
       unsigned HOST_WIDE_INT c, lsb, shift_mask;
 
-      shift_mask = ~0;
+      shift_mask = ~(unsigned HOST_WIDE_INT) 0;
       shift_mask <<= INTVAL (shiftop);
       c = INTVAL (andop);
 
-- 
Markus
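
As a standalone illustration of the cast used above (a sketch only, not part of the patch): plain ~0 has type int, so it is the cast that makes the all-ones value unsigned and as wide as the variable it is compared with or assigned to. The message does not say which warning fired, so the sketch does not try to reproduce it. The typedef uhwi is a hypothetical stand-in for unsigned HOST_WIDE_INT, assumed here to be 64 bits wide, and both function names are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for GCC's unsigned HOST_WIDE_INT; assumed to be
   64 bits wide for the purposes of this sketch.  */
typedef uint64_t uhwi;

/* Build a mask with the low SHIFT bits clear, starting from an all-ones
   value of the full unsigned width, as the patched code does.  */
static uhwi
shift_mask_for (unsigned int shift)
{
  assert (shift < 64);          /* shifting by the full width is undefined */
  uhwi mask = ~(uhwi) 0;        /* every bit set, no signed arithmetic */
  mask <<= shift;
  return mask;
}

/* For contrast: ~0u is only as wide as unsigned int, so widening it
   afterwards leaves the upper bits clear.  */
static uhwi
narrow_all_ones (void)
{
  return (uhwi) ~0u;
}
```

Writing the cast before the complement keeps the whole computation in the unsigned wide type, which is what the two rs6000 hunks do.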