[Commited/WWW] Add Cavium ThunderX related changes to changes.html for gcc-7

2017-01-09 Thread Andrew Pinski
Just adding the changes that were done to add Cavium ThunderX to changes.html. Committed as obvious. Thanks, Andrew Index: changes.html === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-7/changes.html,v retrieving revision 1.34 diff -u -p

Re: [PATCH, gcc, wwwdocs] Document upcoming Qualcomm Falkor processor support

2017-01-11 Thread Andrew Pinski
On Wed, Jan 11, 2017 at 8:29 AM, Richard Earnshaw (lists) wrote: > On 06/01/17 12:11, Siddhesh Poyarekar wrote: >> Hi, >> >> This patch documents the newly added flag in gcc 7 for the upcoming >> Qualcomm Falkor processor core. >> >> Siddhesh >> >> Index:

Re: [PATCH][AArch64] PR target/71112: Properly create lowpart of pic_offset_table_rtx with -fpie

2016-11-29 Thread Andrew Pinski
On Tue, Nov 29, 2016 at 1:09 AM, Kyrill Tkachov wrote: > Hi all, > > This ICE only occurs on big-endian ILP32 -fpie code. The expansion code > generates the invalid load: > (insn 6 5 7 (set (reg/f:SI 76) > (unspec:SI [ > (mem/u/c:SI (lo_sum:SI

Re: [PATCH/AARCH64] Handle ILP32 multi-arch

2017-01-03 Thread Andrew Pinski
Ping? On Sat, Dec 10, 2016 at 1:24 PM, Andrew Pinski <pins...@gmail.com> wrote: > On Thu, Nov 10, 2016 at 6:58 PM, Andrew Pinski <pins...@gmail.com> wrote: >> On Tue, Oct 25, 2016 at 3:25 PM, Matthias Klose <d...@debian.org> wrote: >>> On 07.10.2016

[PATCH/AARCH64] Improve/correct ThunderX 1 cost model for Arith_shift

2016-12-30 Thread Andrew Pinski
Hi, Currently for the following function: int f(int a, int b) { return a + (b <<7); } GCC produces: add w0, w0, w1, lsl 7 But for ThunderX 1, it is better if the instruction was split allowing better scheduling to happen in most cases, the latency is the same. I get a small improvement
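Editorial sketch, not part of the quoted patch: C source cannot pick instructions, but the two shapes the cost model chooses between can be spelled out. f is the function from the post; f_split is a hypothetical name for the same computation written as a separate shift followed by an add. What the compiler actually emits for either one depends on the selected -mcpu and its cost tables.

```c
/* f is taken from the post; f_split is a hypothetical, equivalent
   spelling used only for illustration.  Both assume b << 7 does not
   overflow int. */
int f(int a, int b)
{
    return a + (b << 7);      /* may become: add w0, w0, w1, lsl 7 */
}

int f_split(int a, int b)
{
    int shifted = b << 7;     /* separate shift ... */
    return a + shifted;       /* ... followed by a plain add */
}
```

Both functions return the same value for every input where the shift is defined, so any difference is purely a matter of instruction selection and scheduling.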

[PATCH/AARCH64] Add -mcpu=thunderx2t99 support

2016-12-29 Thread Andrew Pinski
Hi, This patch adds -mcpu=thunderx2t99. Cavium has acquired the Vulcan IP from Broadcom. I am keeping the old -mcpu=vulcan as backwards compatible but renaming all of the structures to be based on the new name of the chip. In the next few weeks, I am auditing the current tuning and will be

[Committed] Lower iterator count on gcc.dg/atomic/c11-atomic-exec-5.c for AARCH64

2016-12-29 Thread Andrew Pinski
X 2 CN99xx). Thanks, Andrew Pinski ChangeLog: * gcc.dg/atomic/c11-atomic-exec-5.c: Lower ITER_COUNT to 100 for AARCH64. Index: testsuite/gcc.dg/atomic/c11-atomic-exec-5.c === --- testsuite/gcc.dg/atomic/c11-atomic-exec-5.c

Re: Fix ppc64le bootstrap comparison failure

2017-01-14 Thread Andrew Pinski
On Fri, Jan 13, 2017 at 10:24 PM, Jeff Law wrote: > > > Given a block with more than one dead store, one of which is the last > statement in the block, the existence of debugging statements can change the > generated code which is of course bad. > > The problem is when I moved the

Re: [PATCH][AArch64] Enable AES fusion with -mcpu=generic

2017-03-16 Thread Andrew Pinski
On Thu, Mar 16, 2017 at 10:22 AM, Wilco Dijkstra wrote: > Many supported cores implement fusion of AES instructions. When fusion > happens it can give a significant performance gain. If not, scheduling > fusion candidates next to each other has almost no effect on

Re: [PATCH][AArch64] Implement ALU_BRANCH fusion

2017-03-20 Thread Andrew Pinski
Basically the idea is to push the check for CC usage into the target macros (macro_fusion_pair_p in i386.c and aarch64.c are the only usage of compare/branch fusion) instead of keeping it in the general code. Also in aarch64.c's macro fusion you need to check that the branch instruction uses the same register as the othe

Re: Reload fix for an old aarch64 issue

2017-03-14 Thread Andrew Pinski
On Tue, Mar 14, 2017 at 4:22 PM, Bernd Schmidt wrote: > This triggered a kernel miscompilation with an old (4.8 I think) aarch64 > toolchain. Yes RHEL's 4.8 has the issue. So did Cavium/MontaVista's GCC 4.7 with AARCH64 support backported to it. We would work around the

Re: [PATCH][AArch64] Enable AUTOPREFETCHER_WEAK with -mcpu=generic

2017-04-05 Thread Andrew Pinski
On Wed, Apr 5, 2017 at 5:38 AM, Wilco Dijkstra wrote: > Many supported cores use the AUTOPREFETCHER_WEAK setting which tries > to order loads and stores to improve streaming performance. Since significant > gains were reported in http://patchwork.ozlabs.org/patch/534469/

Re: [RFC][PATCH][AArch64] Improve generic branch cost

2017-03-09 Thread Andrew Pinski
On Thu, Mar 9, 2017 at 6:42 AM, Wilco Dijkstra wrote: > Hi, > > Recently we've put a lot of effort into improving ifcvt to use CSEL on > AArch64. > In https://gcc.gnu.org/ml/gcc-patches/2015-11/msg01639.html James determined > the best value for AArch64 code generation.

Re: [PATCH] free MPFR caches in gimple-ssa-sprintf.c (PR 79699)

2017-03-02 Thread Andrew Pinski
On Thu, Mar 2, 2017 at 1:33 PM, Martin Sebor wrote: > On 03/02/2017 01:08 AM, Richard Biener wrote: >> >> On Thu, Mar 2, 2017 at 2:01 AM, Joseph Myers >> wrote: >>> >>> On Wed, 1 Mar 2017, Martin Sebor wrote: >>> Joseph, since you commented on the

[Committed] Add a few testcases

2017-04-02 Thread Andrew Pinski
Hi, While working on an out of tree optimization pass, I ran into a few failures which were not represented by the testsuite so I am adding them now. Committed after a bootstrap/test on aarch64-linux-gnu with no regressions. Thanks, Andrew Pinski testsuite/ChangeLog: * gcc.c-torture/compile

Re: [PATCH 9/9] c-family/c-cppbuiltin.c fix

2017-04-01 Thread Andrew Pinski
On Sat, Apr 1, 2017 at 9:54 AM, Andrew Jenner wrote: > I needed to apply the attached patch for ia16, so that > __LIBGCC_JCR_SECTION_NAME__ does not get defined unless > TARGET_USE_JCR_SECTION is. > > 2017-04-01 Andrew Jenner > > *

[PATCH] Handle BIT_INSERT_EXPR in hashable_expr_equal_p

2017-07-29 Thread Andrew Pinski
the same way as I had fixed PRE, by special casing BIT_INSERT_EXPR due to the implicit operand. OK? Bootstrapped and tested on aarch64-linux-gnu with no regressions. Thanks, Andrew Pinski ChangeLog: * tree-ssa-scopedtables.c (hashable_expr_equal_p): Check BIT_INSERT_EXPR's operand 1 to see

[Committed/AARCH64] Fix rdma for -mcpu=native

2017-08-12 Thread Andrew Pinski
CPU implementer : 0x42 CPU architecture: 8 CPU variant : 0x0 CPU part: 0x516 CPU revision: 1 Thanks, Andrew Pinski ChangeLog: * aarch64-option-extensions.def (rdma): Fix feature string to what Linux prints out in /proc/cpuinfo. Index: aarch64-option-extensions.def

[PATCH/AARCH64] Decimal floating point support for AARCH64

2017-07-13 Thread Andrew Pinski
double. In that they are passed via the floating registers (sN, dN, qN). Is this OK as an ABI? Is the patch OK? Bootstrapped and tested on aarch64-linux-gnu with --enable-decimal-float with no regressions and all of the dfp testcases pass. Thanks, Andrew Pinski gcc/ChangeLog: * config/aarch64

Re: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]

2017-07-09 Thread Andrew Pinski
On Mon, Jun 12, 2017 at 12:56 AM, Tamar Christina wrote: > Hi All, > > this patch implements an optimization rewriting > > x * copysign (1.0, y) and > x * copysign (-1.0, y) > > to: > > x ^ (y & (1 << sign_bit_position)) > > This is done by creating a special builtin
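Editorial sketch of the rewrite quoted above for the x * copysign (1.0, y) case, assuming double is the 64-bit IEEE 754 format so that sign_bit_position is 63. The name xorsign is a hypothetical helper, not the builtin the patch creates.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: computes x * copysign (1.0, y) as
   x ^ (y & sign-bit mask) on the raw bit patterns.
   Assumes double is IEEE 754 binary64. */
double xorsign(double x, double y)
{
    const uint64_t sign_mask = UINT64_C(1) << 63;
    uint64_t xb, yb;

    memcpy(&xb, &x, sizeof xb);   /* bit-cast without aliasing issues */
    memcpy(&yb, &y, sizeof yb);
    xb ^= yb & sign_mask;         /* flip x's sign iff y is negative */
    memcpy(&x, &xb, sizeof x);
    return x;
}
```

The XOR only touches the sign bit, so the magnitude of x is never changed; this is why the multiplication can be dropped.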

[Committed/AARCH64] Fix ICE with -mcpu=thunderx2t99

2017-07-07 Thread Andrew Pinski
Hi, After https://gcc.gnu.org/ml/gcc-cvs/2017-06/msg01066.html, there were many crashes with -mcpu=thunderx2t99. This patch fixes the crashes. Committed after bootstrap and test. Thanks, Andrew Pinski ChangeLog: * config/aarch64/aarch64.c (aarch_macro_fusion_pair_p): Check prev_set

Re: [RFC/SCCVN] Handle BIT_INSERT_EXPR in vn_nary_op_eq

2017-07-12 Thread Andrew Pinski
On Wed, Jul 12, 2017 at 9:10 PM, Marc Glisse <marc.gli...@inria.fr> wrote: > On Wed, 12 Jul 2017, Andrew Pinski wrote: > >> Hi, >> Unlike most other expressions, BIT_INSERT_EXPR has an implicit >> operand of the precision/size of the second operand. This means if

[RFC/SCCVN] Handle BIT_INSERT_EXPR in vn_nary_op_eq

2017-07-12 Thread Andrew Pinski
). Thanks, Andrew Pinski ChangeLog: * tree-ssa-sccvn.c (vn_nary_op_eq): Check BIT_INSERT_EXPR's operand 1 to see if the types precision matches. Index: tree-ssa-sccvn.c === --- tree-ssa-sccvn.c(revision 250159) +++ tree-ssa

Re: [PATCH/AARCH64] Decimal floating point support for AARCH64

2017-07-15 Thread Andrew Pinski
On Thu, Jul 13, 2017 at 5:12 PM, Andrew Pinski <apin...@cavium.com> wrote: > Hi, > This patch adds Decimal floating point support to aarch64. It is > the base support in that since there is no hardware support for DFP, > it just defines the ABI. The ABI I chose is that _Deci

Re: [RFC/SCCVN] Handle BIT_INSERT_EXPR in vn_nary_op_eq

2017-07-19 Thread Andrew Pinski
On Mon, Jul 17, 2017 at 3:02 AM, Richard Biener <richard.guent...@gmail.com> wrote: > On Thu, Jul 13, 2017 at 6:18 AM, Andrew Pinski <pins...@gmail.com> wrote: >> On Wed, Jul 12, 2017 at 9:10 PM, Marc Glisse <marc.gli...@inria.fr> wrote: >>> On Wed, 12 Jul 20

Re: [PATCH][AArch64] Improve addressing of TI/TFmode

2017-07-20 Thread Andrew Pinski
On Thu, Jul 20, 2017 at 5:49 AM, Wilco Dijkstra wrote: > In https://gcc.gnu.org/ml/gcc-patches/2017-06/msg01125.html Jiong > pointed out some addressing inefficiencies due to a recent change in > regcprop (https://gcc.gnu.org/ml/gcc-patches/2017-04/msg00775.html). > > This

Re: [RFC/SCCVN] Handle BIT_INSERT_EXPR in vn_nary_op_eq

2017-07-21 Thread Andrew Pinski
On Wed, Jul 19, 2017 at 11:13 AM, Richard Biener <richard.guent...@gmail.com> wrote: > On July 19, 2017 6:10:28 PM GMT+02:00, Andrew Pinski <pins...@gmail.com> > wrote: >>On Mon, Jul 17, 2017 at 3:02 AM, Richard Biener >><richard.guent...@gmail.com> wrote:

Re: [PATCH/AARCH64] Decimal floating point support for AARCH64

2017-07-21 Thread Andrew Pinski
On Fri, Jul 21, 2017 at 2:45 PM, Peter Bergner <berg...@vnet.ibm.com> wrote: > On 7/13/17 7:12 PM, Andrew Pinski wrote: >> This patch adds Decimal floating point support to aarch64. It is >> the base support in that since there is no hardware support for DFP,

[PATCH] unitialized memory access vs BIT_INSERT_EXPR

2017-07-21 Thread Andrew Pinski
OK? Bootstrapped and tested on aarch64-linux-gnu with no regressions. Thanks, Andrew Pinski * tree-ssa-uninit.c (warn_uninitialized_vars): Don't warn about memory accesses where the use is for the first operand of a BIT_INSERT. Index:

Re: [PATCH][2/2] Fix PR81502

2017-07-27 Thread Andrew Pinski
vl%edi, %eax > ret > > the pattern optimizes a BIT_FIELD_REF on a BIT_INSERT_EXPR by > either extracting from the destination or the inserted value. Note this optimization pattern was on my list to implement for bit-field optimizations after lowering. Thanks, Andrew Pinski > >

Re: [PATCH], PR target/81593, Optimize PowerPC vector sets coming from a vector extracts

2017-07-28 Thread Andrew Pinski
2 (4 bits)>; For the vector case, can't we write it as: _1 = BIT_FIELD_REF <high_4(D), 64, 64>; _2 = BIT_FIELD_REF <low_7(D), 64, 0>; res_8 = {_1, _2}; And then have some match.pd patterns (which might get complex), to rewrite that into VEC_PERM_EXPR? The reason why I ask that is because s

[Committed/AARCH64] Fix ThunderX fp vectorizer cost model

2017-07-26 Thread Andrew Pinski
were with GCC 6. Thanks, Andrew Pinski ChangeLog: * config/aarch64/aarch64.c (thunderx_vector_cost): Fix vec_fp_stmt_cost. Index: config/aarch64/aarch64.c === --- config/aarch64/aarch64.c(revision 250529) +++ config/aarch64/aarch64

[COMMITTED/AARCH64] Improve thunderx_vect_cost some more

2017-07-26 Thread Andrew Pinski
there. Committed as obvious after a bootstrap/test on aarch64-linux-gnu with no regressions. Thanks, Andrew Pinski * config/aarch64/aarch64.c (thunderx_vector_cost): Decrease cost of vec_unalign_load_cost and vec_unalign_store_cost. Index: config/aarch64/aarch64.c

[PATCH] [PR 81245] Fix tree-if-conv calling of update_stmt after fold_stmt

2017-06-29 Thread Andrew Pinski
? Bootstrapped and tested on aarch64-linux-gnu with no regressions. Thanks, Andrew Pinski ChangeLog: * tree-if-conv.c (predicate_scalar_phi): Update new_stmt if fold_stmt returned true. testsuite/ChangeLog: * gcc.dg/torture/pr81245.c: New testcase. Index: testsuite/gcc.dg/torture/pr81245.c

Re: [PATCH] [PR 81245] Fix tree-if-conv calling of update_stmt after fold_stmt

2017-06-30 Thread Andrew Pinski
On Fri, Jun 30, 2017 at 1:20 AM, Richard Biener <richard.guent...@gmail.com> wrote: > On Thu, Jun 29, 2017 at 10:12 PM, Andrew Pinski <pins...@gmail.com> wrote: >> Hi, >> As described in the bug, tree-if-conv is calling update_stmt on an >> old stmt which mi

Re: [PATCH][AArch64] Fix ILP32 memory access

2017-07-05 Thread Andrew Pinski
On Tue, Jun 27, 2017 at 6:39 AM, Wilco Dijkstra wrote: > This patch fixes a failure in gcc.target/aarch64/reload-valid-spoff.c > triggered by https://gcc.gnu.org/ml/gcc-patches/2017-06/msg01367.html - > it supersedes https://gcc.gnu.org/ml/gcc-patches/2017-06/msg01907.html

Re: [PATCH] warn on mem calls modifying objects of non-trivial types (PR 80560)

2017-07-05 Thread Andrew Pinski
memmove (slot_ + 1, slot_, (vec_->num++ - ix_) * sizeof (T));\ > ^ > > There is nothing wrong with the code being warned here. > While "struct btrace_insn" is trivial (has a user-provided default > ctor), i

Re: [PATCH] Forbid section anchors for ASan build (PR sanitizer/81697)

2017-08-08 Thread Andrew Pinski
riables alignment. Can you describe this a little bit more? What is going wrong here? Is it because there is no red zone between the variables? Also I noticed you are using .cc as the testcase file name, why don't you use .C instead and then you won't need the other patch which you just posted

[Committed/AARCH64] Fix gcc.target/aarch64/vect-xorsign_exec.c testcase

2017-08-09 Thread Andrew Pinski
Hi, This testcase checks the assembly and does an execute of it so it needs --save-temps like the other testcases. Committed as obvious after test on aarch64-linux-gnu with no regressions. Thanks, Andrew ChangeLog: * gcc.target/aarch64/vect-xorsign_exec.c: Add --save-temps to the options

Re: [PATCH] [Aarch64] Optimize subtract in shift counts

2017-08-07 Thread Andrew Pinski
On Mon, Aug 7, 2017 at 1:36 PM, Michael Collison wrote: > This patch improves code generation for shifts with subtract instructions > where the first operand to the subtract is equal to the bit-size of the > operation. > > > long f1(long x, int i) > { > return x >>

[Committed] Fix testsuite/gcc.target/aarch64/target_attr*.c testcases when -mcpu= or -march= supplied

2017-08-06 Thread Andrew Pinski
/\{,-mcpu=thunderx,-mcpu=thunderx2t99,-march=armv8-a,-march=armv8.1-a,-march=armv8.2-a\} and saw no failures. Thanks, Andrew Pinski ChangeLog: * gcc.target/aarch64/target_attr_10.c: Add -mcpu=generic. * gcc.target/aarch64/target_attr_13.c: Likewise. * gcc.target/aarch64/target_attr_15.c: Likewise

[Committed] Fix gcc.target/aarch64/_Float16_*.c when supplied a -mcpu= option

2017-08-06 Thread Andrew Pinski
with "--target_board=unix/\{,-mcpu=thunderx,-mcpu=thunderx2t99,-march=armv8-a,-march=armv8.1-a,-march=armv8.2-a\}". Thanks, Andrew Pinski ChangeLog: * gcc.target/aarch64/_Float16_1.c: Skip if supplied a -mcpu= option. * gcc.target/aarch64/_Float16_2.c: Likewise * gcc.target/aarch64/_Float16_3.c

[Committed] Fix gcc.target/aarch64/atomic_cmp_exchange_*.c for supplied -mcpu=/-march=

2017-08-06 Thread Andrew Pinski
=unix/\{,-mcpu=thunderx,-mcpu=thunderx2t99,-march=armv8-a,-march=armv8.1-a,-march=armv8.2-a\}. Thanks, Andrew Pinski ChangeLog: * gcc.target/aarch64/atomic_cmp_exchange_zero_reg_1.c: Pass -march=armv8-a+nolse, skip if -mcpu= is passed. * gcc.target/aarch64/atomic_cmp_exchange_zero_strong_1.c

[PATCH/AARCH64] Generate FRINTZ for (double)(long) under -ffast-math on aarch64

2017-08-18 Thread Andrew Pinski
with no regressions. Thanks, Andrew Pinski ChangeLog: * config/aarch64/aarch64.md (*frintz): New pattern. testsuite/ChangeLog: * testsuite/gcc.target/aarch64/floattointtofloat-1.c: New testcase. commit 9cef5e196729df5a197b81b72192d687683a057a Author: Andrew Pinski <apin...@cavium.com> Date: Thu Aug 17

Re: [PATCH] ggc-page loop

2017-05-02 Thread Andrew Pinski
On Tue, May 2, 2017 at 3:41 PM, Nathan Sidwell wrote: > This loop in ggc-page confused me, because the iterator is one greater than > the indexing value. Also the formatting of the array indexing is incorrect. > > Fixed thusly, and applied as obvious after booting on

Re: Go patches committed: merge recent changes to gofrontend

2017-05-10 Thread Andrew Pinski
On Wed, May 10, 2017 at 10:26 AM, Ian Lance Taylor wrote: > I have committed a large patch to update the Go frontend and libgo to > the recent changes in the gofrontend repository. I had postponed > merging changes during the GCC 7 release process. I am now merging > all the

Re: Go patches committed: merge recent changes to gofrontend

2017-05-10 Thread Andrew Pinski
On Wed, May 10, 2017 at 5:37 PM, Andrew Pinski <apin...@cavium.com> wrote: > On Wed, May 10, 2017 at 10:26 AM, Ian Lance Taylor <i...@golang.org> wrote: >> I have committed a large patch to update the Go frontend and libgo to >> the recent changes in the gofrontend r

Re: [PATCH] Add -dB option to disable backtraces

2017-05-16 Thread Andrew Pinski
f (diag_kind == DK_ICE && !context->disable_backtrace) state = backtrace_create_state (NULL, 0, bt_err_callback, NULL); int count = 0; if (state != NULL) backtrace_full (state, 2, bt_callback, bt_err_callback, (void *) );

Re: [PATCH] Add -dB option to disable backtraces

2017-05-16 Thread Andrew Pinski
On Tue, May 16, 2017 at 7:16 PM, Andi Kleen wrote: > From: Andi Kleen > > When running creduce on an ICE substantial amounts of the total > CPU time go to backtrace_qsort() (sorting dwarf of the compiler) for > printing the backtrace of the ICE. When

Re: [PATCH][AArch64] Allow const0_rtx operand for atomic compare-exchange patterns

2017-06-19 Thread Andrew Pinski
(const_int 2 [0x2]) ] UNSPECV_ATOMIC_CMPSW)) ]) "/home/apinski/src/local5/gcc/gcc/testsuite/gcc.target/aarch64/atomic_cmp_exchange_zero_reg_1.c":8 -1 (nil)) during RTL pass: vregs Note also your new testcase is broken even for defaulti

Re: [PATCH/AARCH64] Improve/correct ThunderX 1 cost model for Arith_shift

2017-06-20 Thread Andrew Pinski
On Mon, Jun 19, 2017 at 2:00 PM, Andrew Pinski <pins...@gmail.com> wrote: > On Wed, Jun 7, 2017 at 10:16 AM, James Greenhalgh > <james.greenha...@arm.com> wrote: >> On Fri, Dec 30, 2016 at 10:05:26PM -0800, Andrew Pinski wrote: >>> Hi, >>> Currently

Re: [PATCH][RFA] Fix -fstack-check with really big frames on aarch64

2017-06-22 Thread Andrew Pinski
On Thu, Jun 22, 2017 at 10:02 AM, Mike Stump wrote: > On Jun 22, 2017, at 8:32 AM, Jeff Law wrote: >> >> Sure. I'll do something with 20031023-1.c to ensure it or an equivalent >> is compiled with -fstack-check. That isn't totally unexpected. I >>

[PATCH/AARCH64] Improve cost of arithmetic instructions with shift/extend on ThunderX2T99

2017-06-23 Thread Andrew Pinski
int. Thanks, Andrew Pinski ChangeLog: * config/aarch64/aarch64-cost-tables.h (thunderx2t99_extra_costs): Increment Arith_shift, Arith_shift_reg, Log_shift, Log_shift_reg and Extend_arith by 1. Index: gcc/config/aarch64/aarch64-cost-tables.h

Re: [PATCH] fold a * (a > 0 ? 1 : -1) to abs(a) and related optimizations

2017-06-23 Thread Andrew Pinski
Forgot the patch On Fri, Jun 23, 2017 at 8:59 PM, Andrew Pinski <pins...@gmail.com> wrote: > Hi, > I saw this on llvm's review site (https://reviews.llvm.org/D34579) > and I thought why not add it to GCC. I expanded more than what was > done on the LLVM patch.

[PATCH] fold a * (a > 0 ? 1 : -1) to abs(a) and related optimizations

2017-06-23 Thread Andrew Pinski
1 : 1) into ABS(X). Transform X * (X < 0.0 ? -1.0 : 1.0) into ABS(X). Transform X * (X <= 0.0 ? -1.0 : 1.0) into ABS(X). The floating-point ones only happen when not honoring SNANS and not honoring signed zeros. OK? Bootstrapped and tested on aarch64-linux-gnu with no regressions. Thanks, Andrew Pinski ChangeLog: * match.pd ( X * (X >/>=/
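Editorial sketch of why the floating-point form is restricted to the case where signed zeros are not honored. The function names are hypothetical. For X = -0.0 the comparison X < 0.0 is false, so the product stays -0.0, whereas ABS(X) would give +0.0; the two only agree if the sign of zero is ignored. This assumes the code is built without -ffast-math or similar flags.

```c
#include <math.h>

/* Hypothetical name: the source form before the transformation. */
double mul_by_sign(double x)
{
    return x * (x < 0.0 ? -1.0 : 1.0);
}

/* Returns 1 when mul_by_sign (-0.0) still carries a negative sign,
   i.e. when it differs from what ABS would produce. */
int keeps_negative_zero(void)
{
    return signbit(mul_by_sign(-0.0)) != 0;
}
```

For every nonzero, non-NaN input the original expression and ABS(X) agree, which is what makes the fold attractive once signed zeros are out of the picture.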

Re: [PATCH] fold a * (a > 0 ? 1 : -1) to abs(a) and related optimizations

2017-06-24 Thread Andrew Pinski
On Fri, Jun 23, 2017 at 11:50 PM, Marc Glisse <marc.gli...@inria.fr> wrote: > On Fri, 23 Jun 2017, Andrew Pinski wrote: > >> Hi, >> I saw this on llvm's review site (https://reviews.llvm.org/D34579) >> and I thought why not add it to GCC. I expanded more than what

Re: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]

2017-06-24 Thread Andrew Pinski
On Mon, Jun 12, 2017 at 12:56 AM, Tamar Christina wrote: > Hi All, > > this patch implements an optimization rewriting > > x * copysign (1.0, y) and > x * copysign (-1.0, y) This reminds me: copysign(-1.0, y) can be just optimized to: copysign(1.0, y) I did that in my

Re: Simplify 3*x == 3*y for wrapping types

2017-06-24 Thread Andrew Pinski
On Sat, Jun 24, 2017 at 5:34 AM, Marc Glisse wrote: > Hello, > > I remember wanting to add this when the undefined-overflow case was > introduced a while ago. > > It turns out the tree where I wrote this wasn't clean. Since the rest is > details, I am including it in this

[PATCH] Fold (a > 0 ? 1.0 : -1.0) into copysign (1.0, a) and a * copysign (1.0, a) into abs(a)

2017-06-24 Thread Andrew Pinski
rm X * copysign (1.0, -X) into -abs(X). Transform copysign (-1.0, X) into copysign (1.0, X). The last one is there so if someone decides to write -1.0 instead of 1.0 in the code we would get the optimization still. OK? Bootstrapped and tested on aarch64-linux-gnu with no regressions. Thanks,
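Editorial sketch of why copysign (-1.0, X) and copysign (1.0, X) are interchangeable: copysign discards the sign of its first argument. copysign_bits is a hypothetical bit-level stand-in for the library function, assuming double is IEEE 754 binary64.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for copysign (mag, sgn): keep the magnitude
   bits of mag and take the sign bit from sgn.
   Assumes double is IEEE 754 binary64. */
double copysign_bits(double mag, double sgn)
{
    const uint64_t sign_mask = UINT64_C(1) << 63;
    uint64_t mb, sb;

    memcpy(&mb, &mag, sizeof mb);
    memcpy(&sb, &sgn, sizeof sb);
    mb = (mb & ~sign_mask) | (sb & sign_mask);  /* mag's own sign is dropped */
    memcpy(&mag, &mb, sizeof mag);
    return mag;
}
```

Because the first operand's sign bit is masked off before anything else happens, -1.0 and 1.0 are indistinguishable as magnitudes here.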

Re: [PATCH] fold a * (a > 0 ? 1 : -1) to abs(a) and related optimizations

2017-06-24 Thread Andrew Pinski
On Sat, Jun 24, 2017 at 12:47 PM, Marc Glisse <marc.gli...@inria.fr> wrote: > On Sat, 24 Jun 2017, Andrew Pinski wrote: > >>> * if X is NaN, we may get a qNaN with the wrong sign bit. We probably >>> don't >>> care much though...

Re: [PATCH] Fold (a > 0 ? 1.0 : -1.0) into copysign (1.0, a) and a * copysign (1.0, a) into abs(a)

2017-06-25 Thread Andrew Pinski
On Sun, Jun 25, 2017 at 11:18 AM, Andrew Pinski <pins...@gmail.com> wrote: > On Sun, Jun 25, 2017 at 1:28 AM, Marc Glisse <marc.gli...@inria.fr> wrote: >> +(for cmp (gt ge lt le) >> + outp (convert convert negate negate) >> + outn (negate negate convert co

Re: [RFC][AARCH64]Add 'r' integer register operand modifier. Document the common asm modifier for aarch64 target.

2017-06-25 Thread Andrew Pinski
On Tue, Jun 6, 2017 at 3:56 AM, Renlin Li wrote: > Hi all, > > In this patch, a new integer register operand modifier 'r' is added. This > will use the > proper register name according to the mode of corresponding operand. > > 'w' register for scalar integer mode smaller

Re: [PATCH] Fold (a > 0 ? 1.0 : -1.0) into copysign (1.0, a) and a * copysign (1.0, a) into abs(a)

2017-06-25 Thread Andrew Pinski
On Sun, Jun 25, 2017 at 1:28 AM, Marc Glisse wrote: > +(for cmp (gt ge lt le) > + outp (convert convert negate negate) > + outn (negate negate convert convert) > + /* Transform (X > 0.0 ? 1.0 : -1.0) into copysign(1, X). */ > + /* Transform (X >= 0.0 ? 1.0 : -1.0)

Re: [RFC][PR 67336][PING^2] Verify pointers during stack unwind

2017-06-25 Thread Andrew Pinski
On Sun, Jun 25, 2017 at 12:08 PM, Yuri Gribov wrote: > Hi all, > > Libgcc unwinder currently does not do any verification of pointers > which it chases on stack. In practice this not so rarely causes > segfaults when unwinding on corrupted stacks (e.g. when trying to >

Re: [GCC][PATCH][mid-end] Optimize x * copysign (1.0, y) [Patch (1/2)]

2017-06-25 Thread Andrew Pinski
On Sat, Jun 24, 2017 at 4:53 PM, Andrew Pinski <pins...@gmail.com> wrote: > On Mon, Jun 12, 2017 at 12:56 AM, Tamar Christina > <tamar.christ...@arm.com> wrote: >> Hi All, >> >> this patch implements an optimization rewriting >> >> x

Re: [PATCH/AARCH64] Improve/correct ThunderX 1 cost model for Arith_shift

2017-06-19 Thread Andrew Pinski
On Wed, Jun 7, 2017 at 10:16 AM, James Greenhalgh <james.greenha...@arm.com> wrote: > On Fri, Dec 30, 2016 at 10:05:26PM -0800, Andrew Pinski wrote: >> Hi, >> Currently for the following function: >> int f(int a, int b) >> { >> return a + (b <<7);

[PATCH/AARCH64 v2] Enable software prefetching (-fprefetch-loop-arrays) for ThunderX 88xxx

2017-06-20 Thread Andrew Pinski
Here is the updated patch based on the new infrastructure which is now included. OK? Bootstrapped and tested on aarch64-linux-gnu with no regressions and tested again on SPEC CPU 2006 on ThunderX T88 with the speed up mentioned before. Thanks, Andrew Pinski ChangeLog: * config/aarch64/aarch64

Re: [Patch AArch64] Add rcpc extension

2017-06-20 Thread Andrew Pinski
On Tue, Jun 20, 2017 at 6:50 AM, James Greenhalgh wrote: > > Hi, > > While GCC doesn't need to know anything about the RcPc extension for code > generation, we do need to add the extension flag to the string we pass > to the assembler when we're compiling for a CPU which

Re: [RFC][AARCH64]Add 'r' integer register operand modifier. Document the common asm modifier for aarch64 target.

2017-06-27 Thread Andrew Pinski
On Tue, Jun 27, 2017 at 8:27 AM, Renlin Li <renlin...@foss.arm.com> wrote: > Hi Andrew, > > On 25/06/17 22:38, Andrew Pinski wrote: >> >> On Tue, Jun 6, 2017 at 3:56 AM, Renlin Li <renlin...@foss.arm.com> wrote: >>> >>> Hi all,

Re: [PATCH] Fold (a > 0 ? 1.0 : -1.0) into copysign (1.0, a) and a * copysign (1.0, a) into abs(a)

2017-06-27 Thread Andrew Pinski
))) >>> >> +(if (types_match (type, double_type_node)) >>> >> + (BUILT_IN_COPYSIGN { build_one_cst (type); } (outp @0))) >>> >> +(if (types_match (type, long_double_type_node)) >>> >> + (BUILT_IN_COPYSIGNL { build_one_cst (type); } (outp

Re: [PATCH] PR c++/80544 strip cv-quals from cast results

2017-05-24 Thread Andrew Pinski
On Wed, May 24, 2017 at 8:07 PM, Andrew Pinski <pins...@gmail.com> wrote: > On Wed, May 24, 2017 at 12:29 PM, Jonathan Wakely <jwak...@redhat.com> wrote: >> On 24/05/17 14:50 -0400, Jason Merrill wrote: >>> >>> On Wed, May 24, 2017 at 10:20 AM, Jonatha

Re: [PATCH] PR c++/80544 strip cv-quals from cast results

2017-05-24 Thread Andrew Pinski
On Wed, May 24, 2017 at 12:29 PM, Jonathan Wakely wrote: > On 24/05/17 14:50 -0400, Jason Merrill wrote: >> >> On Wed, May 24, 2017 at 10:20 AM, Jonathan Wakely >> wrote: >>> >>> On 23/05/17 16:26 -0400, Jason Merrill wrote: On Tue, May 23,

Re: [PATCH 0/6] Improve -fprefetch-loop-arrays in general and for AArch64 in particular

2017-05-27 Thread Andrew Pinski
On Tue, Feb 28, 2017 at 1:53 AM, Maxim Kuvyrkov wrote: >> On Feb 20, 2017, at 5:38 PM, Kyrill Tkachov >> wrote: >> >> Hi Maxim, >> >> On 30/01/17 11:24, Maxim Kuvyrkov wrote: >>> This patch series improves -fprefetch-loop-arrays pass

Re: [patch, libfortran] AMD-specific versions of library matmul

2017-05-25 Thread Andrew Pinski
On Thu, May 25, 2017 at 6:43 PM, Jerry DeLisle wrote: > On 05/25/2017 02:57 PM, Thomas Koenig wrote: >> >> Hi everybody, >> >> I have committed the patch (with the corrections for the name) >> as rev 248472. >> >> The infrastructure is in place, so we will be able to make

Re: Reorgnanization of profile count maintenance code, part 1

2017-06-05 Thread Andrew Pinski
On Mon, Jun 5, 2017 at 8:37 AM, Jan Hubicka wrote: > Hi, > I have committed the following fix. I am seeing the following error while building aarch64-elf: /home/jenkins/workspace/BuildToolchainAARCH64_thunder_elf_upstream/toolchain/scripts/../src/gcc/shrink-wrap.c: In function

Re: [PATCH GCC][1/2]Feed bound computation to folder in loop split

2017-06-16 Thread Andrew Pinski
>>>>
>>>>> so folding should work. Where do you see that it does not? Note the
>>>>> code uses gimple_build (), not gimple_build_assign ().
>>>>
>>>> In spec2k6/hmmer, when building fast_algorithms.c with below command
>>>> line:
>>>> ./gcc -Ofast -S fast_algorithms.c -o fast_algorithms.S -fdump-tree-all
>>>> -fdump-tree-lsplit
>>>> The lsplit dump contains:
>>>> [12.75%]:
>>>> _124 = _197 + 1;
>>>> _123 = _124 + -1;
>>>> _115 = MIN_EXPR <_197, _124>;
>>>> Which is generated here.
>>>
>>>
>>> That means we miss a pattern in match.PD to handle this case.
>>
>> I see. I will withdraw this patch and look in that direction.
>
>
> For _123, we have
>
> /* (A +- CST1) +- CST2 -> A + CST3
> or
> /* Associate (p +p off1) +p off2 as (p +p (off1 + off2)). */
>
>
> For _115, we have
>
> /* min (a, a + CST) -> a where CST is positive. */
> /* min (a, a + CST) -> a + CST where CST is negative. */
> (simplify
> (min:c @0 (plus@2 @0 INTEGER_CST@1))
> (if (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)))
> (if (tree_int_cst_sgn (@1) > 0)
> @0
> @2)))
>
> What is the type of all those SSA_NAMEs?
https://gcc.gnu.org/ml/gcc-patches/2016-12/msg01352.html which added the min/max patterns. I forgot to get Naveen to mention I saw this while looking into loop splitting and why I was adding them.
Thanks,
Andrew Pinski
>
> --
> Marc Glisse
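Editorial sketch of the two simplifications discussed for the quoted lsplit dump, written as plain C. The function and variable names are hypothetical and only mirror the SSA names; n stands in for _197. Both results equal n as long as n + 1 does not overflow, which corresponds to the TYPE_OVERFLOW_UNDEFINED condition in the quoted pattern.

```c
/* Hypothetical names mirroring the dump; n plays the role of _197.
   Assumes n + 1 does not overflow (signed overflow is undefined). */
int t123_value(int n)
{
    int t124 = n + 1;              /* _124 = _197 + 1 */
    return t124 + -1;              /* _123 = _124 + -1, i.e. (A + 1) + -1 */
}

int t115_value(int n)
{
    int t124 = n + 1;              /* _124 = _197 + 1 */
    return n < t124 ? n : t124;    /* _115 = MIN_EXPR <_197, _124> */
}
```

With the (A +- CST1) +- CST2 rule the first function collapses to n, and with min (a, a + CST) -> a for positive CST so does the second.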

Re: Record equivalences for spill registers

2017-05-07 Thread Andrew Pinski
On Sun, May 7, 2017 at 10:26 PM, Andrew Pinski <pins...@gmail.com> wrote: > On Sun, May 7, 2017 at 9:30 PM, Jim Wilson <jim.wil...@linaro.org> wrote: >> On 05/05/2017 12:23 AM, Richard Sandiford wrote: >>> >>> 2017-05-05 Richard Sandiford <

Re: Record equivalences for spill registers

2017-05-07 Thread Andrew Pinski
in 2013): Switch tables are implemented using the tiny model but they are stored in the rodata section which means they could overflow the address. Note this patch most likely won't apply directly either: From: Andrew Pinski <apin...@cavium.com> Date: Thu, 15 Aug 2013 20:4

Re: [PATCH/AARCH64] Improve/correct ThunderX 1 cost model for Arith_shift

2017-05-07 Thread Andrew Pinski
On Fri, Dec 30, 2016 at 10:05 PM, Andrew Pinski <pins...@gmail.com> wrote: > Hi, > Currently for the following function: > int f(int a, int b) > { > return a + (b <<7); > } > > GCC produces: > add w0, w0, w1, lsl 7 > But for ThunderX 1, it is be

Re: [PATCH 2/n] [PR tree-optimization/78496] Simplify ASSERT_EXPRs to facilitate jump threading

2017-05-07 Thread Andrew Pinski
On Sun, May 7, 2017 at 8:06 AM, Jeff Law <l...@redhat.com> wrote: > On 05/06/2017 05:56 PM, Andrew Pinski wrote: >> >> On Sat, May 6, 2017 at 4:55 PM, Andrew Pinski <pins...@gmail.com> wrote: >>> >>> On Sat, May 6, 2017 at 8:03 AM, Jeff Law <l...@red

Re: [RFC][PATCH] Introduce -fdump*-folding

2017-05-03 Thread Andrew Pinski
On Wed, May 3, 2017 at 1:10 AM, Martin Liška wrote: > Hello > > Last release cycle I spent quite some time with reading of IVOPTS pass > dump file. Using -fdump*-details causes to generate a lot of 'Applying > pattern' > lines, which can make reading of a dump file more

Re: [PATCH 2/n] [PR tree-optimization/78496] Simplify ASSERT_EXPRs to facilitate jump threading

2017-05-06 Thread Andrew Pinski
On Sat, May 6, 2017 at 8:03 AM, Jeff Law wrote: > This is the 2nd of 3-5 patches to address pr78496. > > Jump threading will examine ASSERT_EXPRs as it walks the IL and record > information from those ASSERT_EXPRs into the available expression and > const/copies tables where

Re: [PATCH 2/n] [PR tree-optimization/78496] Simplify ASSERT_EXPRs to facilitate jump threading

2017-05-06 Thread Andrew Pinski
On Sat, May 6, 2017 at 4:55 PM, Andrew Pinski <pins...@gmail.com> wrote: > On Sat, May 6, 2017 at 8:03 AM, Jeff Law <l...@redhat.com> wrote: >> This is the 2nd of 3-5 patches to address pr78496. >> >> Jump threading will examine ASSERT_EXPRs as it walks the IL and

Re: [PATCH AArch64/V3]Add new patterns for vcond_mask and vec_cmp

2017-06-27 Thread Andrew Pinski
ttern. > (vcond_mask_): New pattern. LTGT support is missing and can be generated via __builtin_islessgreater . See PR 81228. Thanks, Andrew Pinski

Re: [RFC][AARCH64][PATCH 2/5]: Add number of hw prefetchers available to cpu_prefetch_tune

2017-09-14 Thread Andrew Pinski
On Thu, Sep 14, 2017 at 6:28 PM, Kugan Vivekanandarajah wrote: > This patch adds number of hw prefetchers available to > cpu_prefetch_tune so it can be used in loop unrolling decisions. Can you explain the difference between this and num_slots

Re: [RFC][PACH 3/5] Prevent tree unroller from completely unrolling inner loops if that results in excessive strided-loads in outer loop

2017-09-14 Thread Andrew Pinski
On Thu, Sep 14, 2017 at 6:30 PM, Kugan Vivekanandarajah wrote: > This patch prevents the tree unroller from completely unrolling inner loops if that > results in excessive strided-loads in the outer loop. Same comments from the RTL version. Though one more comment

Re: configure erroneously detects eh_frame misoptimization

2017-09-19 Thread Andrew Pinski
On Tue, Sep 19, 2017 at 4:42 PM, Steven Taschuk wrote: > The behaviour of echo for arguments containing the two-character > substring `\0` varies among implementations: in coreutils echo, > and in the builtins of ash, bash, busybox sh, csh, and fish, the two > characters `\0`

Re: [PATCH, AArch64] Disable reg offset in quad-word store for Falkor.

2017-09-22 Thread Andrew Pinski
On Fri, Sep 22, 2017 at 11:39 AM, Jim Wilson <jim.wil...@linaro.org> wrote: > On Fri, Sep 22, 2017 at 10:58 AM, Andrew Pinski <pins...@gmail.com> wrote: >> Two overall comments: >> * What about splitting register_offset into two different elements, >> one for no

Re: [RFC][AARCH64][PATCH 5/5] add aarch64_loop_unroll_adjust to limit partial unrolling in rtl based on strided-loads in loop

2017-09-17 Thread Andrew Pinski
On Sun, Sep 17, 2017 at 4:41 PM, Kugan Vivekanandarajah <kugan.vivekanandara...@linaro.org> wrote: > Hi Andrew, > > On 15 September 2017 at 13:36, Andrew Pinski <pins...@gmail.com> wrote: >> On Thu, Sep 14, 2017 at 6:33 PM, Kugan Vivekanandarajah >> <kuga

Re: [PATCH, AArch64] Disable reg offset in quad-word store for Falkor.

2017-09-22 Thread Andrew Pinski
On Fri, Sep 22, 2017 at 8:59 AM, Jim Wilson wrote: > On Fri, Sep 22, 2017 at 8:49 AM, Jim Wilson wrote: >> On Falkor, because of an idiosyncrasy of how the pipelines are designed, a >> quad-word store using a reg+reg addressing mode is almost twice

Re: Check that there are no missing probabilities

2017-10-13 Thread Andrew Pinski
On Fri, Oct 13, 2017 at 6:38 AM, Jan Hubicka wrote: > Hi, > this patch enables check that no edge probabilities are missing. This caused a bootstrap failure on aarch64-linux-gnu with go enabled. But I see you have disabled the code for now. Just for reference the failure:

Re: [PATCH v2, middle-end]: Introduce memory_blockage named insn pattern

2017-10-14 Thread Andrew Pinski
On Mon, Sep 18, 2017 at 2:06 PM, Uros Bizjak wrote: > On Tue, Sep 5, 2017 at 3:50 PM, Uros Bizjak wrote: >> Revised patch, incorporates fixes from Alexander's review comments. >> >> I removed some implementation details from Alexander's description of >>

Re: Improve int<->FP conversions

2017-09-10 Thread Andrew Pinski
On Sun, Sep 10, 2017 at 9:50 PM, Michael Collison wrote: > This patch improves the latency of code by eliminating two FP <-> integer > register transfers. > > An example: > > float f1(float x) > { > int y = x; > return (float)y; > } > > Trunk generates: > > f1: >
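The example function quoted in the message above is flattened onto one line by the archive; laid out as compilable C it reads as below. The comments are added here for illustration, and the generated assembly from the message is cut off in this listing, so it is not reproduced.

```c
/* Round-trips a float through int, as in the quoted example.
   The two conversions are the FP <-> integer operations that
   the patch aims to make cheaper. */
float f1(float x)
{
    int y = x;        /* float -> int, truncating toward zero */
    return (float)y;  /* int -> float */
}
```

For values that fit in an `int`, the function simply drops the fractional part, so `f1(3.7f)` yields `3.0f` and `f1(-2.5f)` yields `-2.0f`.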

Re: [AArch64], patch] PR71727 fix -mstrict-align

2017-09-11 Thread Andrew Pinski
On Tue, Jul 18, 2017 at 5:50 AM, Christophe Lyon wrote: > Hello, > > I've received a complaint that GCC for AArch64 would generate > vectorized code relying on unaligned memory accesses even when using > -mstrict-align. This is a problem for code where such accesses

Re: [PATCH][AArch64] Remove '*' from movsi/di/ti patterns

2017-09-12 Thread Andrew Pinski
On Tue, Sep 12, 2017 at 9:10 AM, James Greenhalgh wrote: > On Wed, Jul 26, 2017 at 02:46:14PM +0100, Wilco Dijkstra wrote: >> Remove the remaining uses of '*' from the movsi/di/ti patterns. >> Using '*' in alternatives is typically incorrect as it tells the register >>

Re: [PATCH][GCC][AArch64] Restrict lrint inlining on ILP32.

2017-09-09 Thread Andrew Pinski
On Fri, Aug 11, 2017 at 2:58 AM, Tamar Christina wrote: > Hi All, > > The inlining of lrint isn't valid in all cases on ILP32 when > -fno-math-errno is used because an inexact exception is raised in > certain circumstances. > > For ILP32 I now restrict the inlining only

Re: [PATCH][AArch64] Remove '*' from movsi/di/ti patterns

2017-09-09 Thread Andrew Pinski
On Wed, Jul 26, 2017 at 6:46 AM, Wilco Dijkstra wrote: > Remove the remaining uses of '*' from the movsi/di/ti patterns. > Using '*' in alternatives is typically incorrect as it tells the register > allocator to ignore those alternatives. So remove these from all the >

Re: C++ PATCH to reduced_constant_expression for partially-initialized objects

2017-09-12 Thread Andrew Pinski
tle sense. Are you compiling the cross compiler with the new native compiler? Since this patch only touches the C++ front-end, and only C++11 code at that, it makes even less sense. Are you sure this was not a bug in qemu which just happens to be showing up now? Even then this makes little sense, as the code generation between the two revisions should not touch anything related to Fortran. Thanks, Andrew Pinski > > Christophe

Re: [reviewed] qsort comparator consistency checking

2017-09-29 Thread Andrew Pinski
On Fri, Sep 29, 2017 at 10:46 AM, Christophe Lyon wrote: > Hi, > > > On 29 September 2017 at 15:29, Alexander Monakov wrote: >> Hello, >> >> I'm going to install the following patch on trunk in the next few hours. >> This revision doesn't offer

Re: [PATCH 2/2] [ARM] Add table of costs for AAarch32 addressing modes.

2017-08-25 Thread Andrew Pinski
On Fri, Aug 25, 2017 at 10:43 AM, Charles Baylis wrote: > On 9 June 2017 at 15:13, Richard Earnshaw (lists) > wrote: >> On 21/02/17 16:54, charles.bay...@linaro.org wrote: >>> From: Charles Baylis >>> >>> This patch
