Re: Register Pressure guided Unroll and Jam in GCC !!

2014-06-16 Thread Aaron Sawdey
/branches/lto-pressure Aaron Thanks Regards Ajit -- Aaron Sawdey, Ph.D. acsaw...@linux.vnet.ibm.com 050-2/C113 (507) 253-7520 home: 507/263-0782 IBM Linux Technology Center - PPC Toolchain

Re: Register Pressure guided Unroll and Jam in GCC !!

2014-06-16 Thread Aaron Sawdey
On Mon, 2014-06-16 at 14:42 -0400, Vladimir Makarov wrote: On 2014-06-16, 2:25 PM, Aaron Sawdey wrote: On Mon, 2014-06-16 at 14:14 +, Ajit Kumar Agarwal wrote: Hello All: I have worked on the Open64 compiler where the Register Pressure Guided Unroll and Jam gave a good amount

Re: Live range Analysis based on tree representations

2015-09-04 Thread Aaron Sawdey
On Thu, 2015-09-03 at 15:22 +, Ajit Kumar Agarwal wrote: > > > -Original Message- > From: Aaron Sawdey [mailto:acsaw...@linux.vnet.ibm.com] > Sent: Wednesday, September 02, 2015 8:23 PM > To: Ajit Kumar Agarwal > Cc: Jeff Law; vmaka...@redhat.com; Richard Bi

Re: Live range Analysis based on tree representations

2015-09-02 Thread Aaron Sawdey
improve register allocation so it doesn't fall down when register pressure gets high. The code is in a branch called lto-pressure. Aaron > > Thanks & Regards > Ajit

Re: Live range Analysis based on tree representations

2015-09-15 Thread Aaron Sawdey
On Sat, 2015-09-12 at 18:45 +, Ajit Kumar Agarwal wrote: > > > -Original Message- > From: Aaron Sawdey [mailto:acsaw...@linux.vnet.ibm.com] > Sent: Friday, September 04, 2015 11:51 PM > To: Ajit Kumar Agarwal > Cc: Jeff Law; vmaka...@redhat.com; Richard Bi

guessed profile counts leading to incorrect static branch hints on ppc64

2015-12-09 Thread Aaron Sawdey
on. */ esucc->probability = REG_BR_PROB_BASE; } } } It would appear that the guessed counts are getting changed inconsistently before this during the tree-ssa-dom pass. Any trail of breadcrumbs to follow through the forest would be helpful here ... Thanks! Aaron

determining reassociation width

2016-05-02 Thread Aaron Sawdey
whose terms are fp multiplies because now we have fused multiply-adds to consider. See PR 70912 for more on this. Suggestions? Thanks, Aaron

cmpstrnsi pattern should check for zero byte?

2016-11-01 Thread Aaron Sawdey
the "scmpu" instruction to do the comparison. The RX manual I found shows pseudocode for scmpu that both checks for a zero byte and compares the strings. If this isn't correct, please let me know here or on the patch itself. Thanks, Aaron

Re: k-byte memset/memcpy/strlen builtins

2017-01-11 Thread Aaron Sawdey
complicated than memset because of encodings etc. My > snippet > in question used a fixed-length encoding of 2 bytes, however. > > Another simple idea to tackle this would be a peephole optimization > but > I'm not sure if this is really feasible for something like memset. > Would

Re: help with PR78809 - inline strcmp for small constant strings

2017-08-05 Thread Aaron Sawdey
mp+ccmp and one branch. > > > > Even better would be wider loads if you either know the alignment > > of s or it's max size > > (although given the overhead of creating the return value that > > works best for equality). > > All those things are hand

Re: [RFC][GCC][rs6000] Remaining work for inline expansion of strncmp/strcmp/memcmp for powerpc

2018-10-18 Thread Aaron Sawdey
On 10/17/18 4:03 PM, Florian Weimer wrote: > * Aaron Sawdey: > >> I've previously posted a patch to add vector/vsx inline expansion of >> strcmp/strncmp for the power8/power9 processors. Here are some of the >> other items I have in the pipeline that I hope to get into

[RFC][GCC][rs6000] Remaining work for inline expansion of strncmp/strcmp/memcmp for powerpc

2018-10-17 Thread Aaron Sawdey
512 bytes inline before dumping to the library function. If anyone has any other input on the inline expansion work I've been doing for the rs6000 target, please let me know. Thanks! Aaron

Re: Fixing inline expansion of overlapping memmove and non-overlapping memcpy

2019-05-15 Thread Aaron Sawdey
On 5/15/19 1:01 PM, Jakub Jelinek wrote: > On Wed, May 15, 2019 at 12:59:01PM -0500, Aaron Sawdey wrote: >> 1) rename optab movmem and the underlying patterns to cpymem. >> 2) add a new optab movmem that is really memmove() and add support for >> having __builtin_memmove() u

Re: Fixing inline expansion of overlapping memmove and non-overlapping memcpy

2019-05-15 Thread Aaron Sawdey
On 5/15/19 9:02 AM, Michael Matz wrote: > On Wed, 15 May 2019, Aaron Sawdey wrote: >> Next question would be how do we move from the existing movmem pattern >> (which Michael Matz tells us should be renamed cpymem anyway) to this >> new thing. Are you proposing that we s

Re: Fixing inline expansion of overlapping memmove and non-overlapping memcpy

2019-05-15 Thread Aaron Sawdey
On 5/15/19 11:31 AM, Jakub Jelinek wrote: > On Wed, May 15, 2019 at 11:23:54AM -0500, Aaron Sawdey wrote: >> My goals for this are: >> * memcpy() call becomes __builtin_memcpy and goes to optab[cpymem] >> * memmove() call becomes __builtin_memmove (or __builtin_memcpy base

Fixing inline expansion of overlapping memmove and non-overlapping memcpy

2019-05-14 Thread Aaron Sawdey
this machinery that need to work a certain way, or other related issues that should be addressed in between expand_builtin_memcpy() and emit_block_move_via_movmem(). Thanks! Aaron

Re: Fixing inline expansion of overlapping memmove and non-overlapping memcpy

2019-05-15 Thread Aaron Sawdey
On 5/15/19 7:22 AM, Richard Biener wrote: > On Tue, May 14, 2019 at 9:21 PM Aaron Sawdey wrote: >> I'd be interested in any comments about pieces of this machinery that need to >> work a certain way, or other related issues that should be addressed in >> between

Re: Fixing inline expansion of overlapping memmove and non-overlapping memcpy

2019-05-15 Thread Aaron Sawdey
On 5/15/19 8:10 AM, Michael Matz wrote:> On Tue, 14 May 2019, Aaron Sawdey wrote: > >> memcpy -> expand with movmem pattern >> memmove (no overlap) -> transform to memcpy -> expand with movmem pattern >> memmove (overlap) -> remains memmove -> gl
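The memcpy/memmove split discussed in this thread can be pictured at the C level with a small sketch. This is only an illustration of the "no overlap means memcpy is safe" idea; `regions_overlap` and `move_bytes` are hypothetical helper names, not GCC functions, and the compiler makes this decision on its internal representation rather than with a run-time check like this one.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 if [dst, dst+n) and [src, src+n)
   share at least one byte, 0 otherwise. */
static int regions_overlap(const void *dst, const void *src, size_t n)
{
    uintptr_t d = (uintptr_t)dst;
    uintptr_t s = (uintptr_t)src;

    if (n == 0)
        return 0;
    return d < s ? (s - d < n) : (d - s < n);
}

/* Hypothetical helper: use the cheaper non-overlapping copy when that
   is known to be safe, otherwise fall back to memmove semantics. */
static void *move_bytes(void *dst, const void *src, size_t n)
{
    if (!regions_overlap(dst, src, n))
        return memcpy(dst, src, n);
    return memmove(dst, src, n);
}
```

Calling `move_bytes(buf + 1, buf, 4)` on an 8-byte buffer takes the memmove path, while `move_bytes(buf + 4, buf, 4)` takes the memcpy path.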

LTO inliner -- sensitivity to increasing register pressure

2014-04-18 Thread Aaron Sawdey
, Aaron

Re: LTO inliner -- sensitivity to increasing register pressure

2014-04-18 Thread Aaron Sawdey
in bzip2. Aaron

Re: LTO inliner -- sensitivity to increasing register pressure

2014-04-18 Thread Aaron Sawdey
, Apr 18, 2014 at 9:43 AM, Aaron Sawdey acsaw...@linux.vnet.ibm.com wrote: Honza, Seeing your recent patches relating to inliner heuristics for LTO, I thought I should mention some related work I'm doing. By way of introduction, I've recently joined the IBM LTC's PPC Toolchain team

Re: [Patch,optimization]: Optimized changes in the estimate register pressure cost.

2015-09-28 Thread Aaron Sawdey
t. size_cost = (estimate_reg_pressure_cost (new_regs[0] + regs_needed[0], regs_used, speed, call_p) - estimate_reg_pressure_cost (new_regs[0], regs_used, speed, call_p)); I'm not quite sure I understand the "wh

[PATCH] add reassociation width target function for power8

2016-05-04 Thread Aaron Sawdey
Hi, This patch enables TARGET_SCHED_REASSOCIATION_WIDTH for power8 and up. The widths returned are derived from testing with SPEC 2006 and some simple tests on power8. Bootstrapped and regtested on powerpc64le-unknown-linux-gnu, ok for trunk? 2016-05-04 Aaron Sawdey <ac
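As a rough illustration of what a reassociation width means, the sketch below sums eight values first as one serial dependency chain and then as two independent chains that are combined at the end. The function names are made up for this example, and integer arithmetic is used so that both forms give identical results; for floating point the reordering can change rounding, which is why it is normally only done under relaxed math flags.

```c
#include <stdint.h>

/* Width 1: a single chain, each add depends on the previous one. */
static uint64_t sum8_chain(const uint64_t v[8])
{
    return ((((((v[0] + v[1]) + v[2]) + v[3]) + v[4]) + v[5]) + v[6]) + v[7];
}

/* Width 2: two independent chains that can execute in parallel,
   joined by one final add. */
static uint64_t sum8_width2(const uint64_t v[8])
{
    uint64_t even = ((v[0] + v[2]) + v[4]) + v[6];
    uint64_t odd  = ((v[1] + v[3]) + v[5]) + v[7];

    return even + odd;
}
```

The second form has a shorter critical path (four dependent adds instead of seven) at the cost of one extra live value, which is the trade-off a target-specific width is meant to tune.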

[PATCH] add myself to MAINTAINERS

2016-05-04 Thread Aaron Sawdey
<baldr...@gcc.gnu.org> Sujoy Saraswati <sujoy.sarasw...@hpe.com> Trevor Saunders<tsaund...@mozilla.com> +Aaron Sawdey <acsaw...@linux.vnet.ibm.

Re: [PATCH, pr63256] update powerpc dg options

2017-02-01 Thread Aaron Sawdey
On Thu, 2017-01-19 at 17:00 -0600, Aaron Sawdey wrote: > SMS does process the loop in sms-8.c on powerpc now so I have updated > the options to reflect that. > > Test now passes on powerpc -m64/-m32/-m32 -mpowerpc64. Ok for trunk? > > testsuite/ChangeLog > 2017-01-19 

[PATCH][PR target/79170] fix memcmp builtin expansion sequence for rs6000 target.

2017-01-27 Thread Aaron Sawdey
on ppc64/ppc64le. Assuming regtest on ppc64/ppc64le passes, ok for trunk? 2017-01-27  Aaron Sawdey  <acsaw...@linux.vnet.ibm.com> PR target/79170 * gcc.dg/memcmp-1.c: Improved to catch failures seen in PR 79170. 2017-01-27  Aaron Sawdey  <acsaw...@linux.vnet.ibm.com>

Re: [PATCH, testsuite]: Use posix_memalign instead of aligned_alloc in gcc.dg/strncmp-2.c

2017-02-20 Thread Aaron Sawdey
down the same issue again. Aaron

[PATCH] portability fix for gcc.dg/strncmp-2.c testcase

2017-02-14 Thread Aaron Sawdey
This testcase I added failed to compile on AIX or older linux due to the use of aligned_alloc(). Now fixed to use posix_memalign if available, and valloc otherwise. Now it compiles and passes on x86_64 (fedora 25), ppc64 (RHEL6.8), and AIX. OK for trunk? 2017-02-14  Aaron Sawdey  <ac

[PATCH][PR target/79449][7 regression] fix ppc strncmp builtin expansion runtime boundary crossing check

2017-02-09 Thread Aaron Sawdey
and the new test case passes on x86_64 as well, ok for trunk? 2017-02-09  Aaron Sawdey  <acsaw...@linux.vnet.ibm.com> PR target/79449 * gcc.dg/strncmp-2.c: New. Test strncmp and memcmp builtin expansion for reading beyond a 4k boundary. 2017-02-09  Aaron Sawdey  
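The boundary condition at issue can be stated as a small predicate. The sketch below is an assumption about the shape of such a check, not the code GCC emits: `load_crosses_4k` is a hypothetical helper that reports whether a load of a given width starting at an address would touch bytes past the end of the enclosing 4 KiB block.

```c
#include <stddef.h>
#include <stdint.h>

#define BLOCK_4K 4096u

/* Hypothetical helper: returns 1 if a load of LOAD_BYTES bytes starting
   at ADDR would read past the end of ADDR's 4 KiB block. */
static int load_crosses_4k(const void *addr, size_t load_bytes)
{
    uintptr_t offset = (uintptr_t)addr & (BLOCK_4K - 1);

    return offset + load_bytes > BLOCK_4K;
}
```

An 8-byte load at offset 4088 ends exactly at the boundary and is fine; the same load at offset 4090 would read two bytes from the next block.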

[PATCH][PR target/79295][7 regression] fix ppc bcdadd insn pattern

2017-02-09 Thread Aaron Sawdey
The bcdadd pattern has the wrong constraints. The change Meissner supplied in PR79295 fixes the issue. Successfully bootstrapped on ppc64le, ok for trunk if regtest also passes? 2017-02-09  Aaron Sawdey  <acsaw...@linux.vnet.ibm.com> PR target/79295 * config/rs6000/alti

Re: [PATCH] portability fix for gcc.dg/strncmp-2.c testcase

2017-02-14 Thread Aaron Sawdey
On Tue, 2017-02-14 at 13:09 -0600, Segher Boessenkool wrote: > On Tue, Feb 14, 2017 at 11:56:50AM -0600, Aaron Sawdey wrote: > > This testcase I added failed to compile on AIX or older linux due > > to > > the use of aligned_alloc(). Now fixed to use posix_memalign if >

Re: [PATCH, bugfix] builtin expansion of strcmp for rs6000

2017-01-17 Thread Aaron Sawdey
On Tue, 2017-01-17 at 08:30 -0600, Peter Bergner wrote: > On 1/16/17 3:09 PM, Aaron Sawdey wrote: > > Here is an updated version of this patch. > > > > Tulio noted that glibc's strncmp test was failing. This turned out > > to > > be the use of signed HOST_WID

[PATCH] testcase for builtin expansion of strncmp and strcmp

2017-01-17 Thread Aaron Sawdey
included interested parties from targets that have a strncmp builtin. The test passes on x86_64 and on ppc64le with -mcpu=power6. It will not pass on ppc64/ppc64le -mcpu=power[78] until I check in my patch that segher ack'd yesterday and is currently regtesting. OK for trunk?

Re: [PATCH, bugfix] builtin expansion of strcmp for rs6000

2017-01-16 Thread Aaron Sawdey
, Aaron On Wed, 2017-01-11 at 11:26 -0600, Aaron Sawdey wrote: > This expands on the previous patch. For strcmp and for strncmp with N > larger than 64, the first 64 bytes of comparison is expanded inline > and > then a call to strcmp or strncmp is emitted to compare the remainder > i
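The "compare the first 64 bytes inline, then call the library for the rest" scheme can be modelled in plain C. This is a behavioural sketch only: the real expansion works on wider loads in generated code, and `strcmp_inline_prefix` and `INLINE_BYTES` are names invented for the example.

```c
#include <stddef.h>
#include <string.h>

enum { INLINE_BYTES = 64 };

/* Hypothetical model: handle the first INLINE_BYTES bytes locally and
   hand any remainder to the library strcmp. */
static int strcmp_inline_prefix(const char *a, const char *b)
{
    for (size_t i = 0; i < INLINE_BYTES; i++) {
        unsigned char ca = (unsigned char)a[i];
        unsigned char cb = (unsigned char)b[i];

        if (ca != cb)
            return (ca > cb) - (ca < cb);   /* difference found inline */
        if (ca == 0)
            return 0;                       /* both strings ended */
    }
    /* No difference and no terminator in the inline part. */
    return strcmp(a + INLINE_BYTES, b + INLINE_BYTES);
}
```

Strings longer than 64 bytes that differ only past that point exercise the hand-off to the library call.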

[PATCH, pr63256] update powerpc dg options

2017-01-19 Thread Aaron Sawdey
SMS does process the loop in sms-8.c on powerpc now so I have updated the options to reflect that. Test now passes on powerpc -m64/-m32/-m32 -mpowerpc64. Ok for trunk? testsuite/ChangeLog 2017-01-19  Aaron Sawdey  <acsaw...@linux.vnet.ibm.com> * gcc.dg/sms-8.c: Update options for p

[PATCH][PR target/79752] fix rs6000 power9 peephole2 for udiv/umod

2017-02-28 Thread Aaron Sawdey
(div:GPR (match_dup 1) +   (udiv:GPR (match_dup 1)  (match_dup 2))) (set (match_dup 3) (mult:GPR (match_dup 0) 2017-02-28  Aaron Sawdey  <acsaw...@linux.vnet.ibm.com> PR target/79752 * config/rs6000/rs6000.md (peephole2 for udiv/umod): Should emit
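Why the signedness of the divide matters can be seen in a small C model of a divide, multiply, subtract sequence used to obtain a remainder. This is a sketch under the assumption that the peephole computes the modulus from the quotient in this way; the function names are hypothetical, and the signed variant relies on the usual two's-complement conversion behaviour of GCC.

```c
#include <stdint.h>

/* Remainder from an unsigned quotient: a - (a / b) * b. */
static uint32_t umod_via_udiv(uint32_t a, uint32_t b)
{
    uint32_t q = a / b;

    return a - q * b;
}

/* Same sequence, but with a signed divide producing the quotient.
   For operands with the top bit set this gives a different answer. */
static uint32_t umod_via_signed_div(uint32_t a, uint32_t b)
{
    uint32_t q = (uint32_t)((int32_t)a / (int32_t)b);

    return a - q * b;
}
```

With `a = 0xFFFFFFF9` and `b = 7`, the unsigned version returns 4, which matches `a % b`, while the signed version treats `a` as -7 and returns 0.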

Re: [PATCH] builtin expansion of memcmp for powerpc

2016-09-24 Thread Aaron Sawdey
PPC64 Linux. This patch > should be using !BYTES_BIG_ENDIAN. Change made, I will commit as obvious once I bootstrap to double check my work on ppc64le. Sorry for the mess ... Aaron

[PATCH, RS6000] improve builtin expansion of memcmp for p7

2016-10-06 Thread Aaron Sawdey
r code for processors older than p8. Index: gcc/config/rs6000/rs6000.c === --- gcc/config/rs6000/rs

[PATCH, RS6000, Committed] increase buf size in rs6000_elf_asm_out_{constructor,destructor}

2016-10-06 Thread Aaron Sawdey
It seems we now have analysis that concludes these buffers may possibly overflow. This broke bootstrap on ppc64 BE. Bootstrap passed on ppc64 BE power7. Committing as pre-approved by Segher. 2016-10-06 Aaron Sawdey <acsaw...@linux.vnet.ibm.com> * config/rs6000/rs

[PATCH] builtin expansion of memcmp for powerpc

2016-09-22 Thread Aaron Sawdey
40 bytes. Bootstrap on powerpc64le, regtest in progress, OK for trunk if no new regressions? 2016-09-22  Aaron Sawdey  <acsaw...@linux.vnet.ibm.com> * config/rs6000/rs6000.md (cmpmemsi): New define_expand. * config/rs6000/rs6000.c (expand_block_compare): New functio

[PATCH PR77718]

2016-09-28 Thread Aaron Sawdey
_rtx, arg2_rtx); +} /* If SRC is a string constant and block move would be done by pieces, we can avoid loading the string from memory

[RFC PATCH] expand_strn_compare should attempt expansion even if neither string is constant

2016-10-26 Thread Aaron Sawdey
bootstrap/regtest on i386 as rs6000 does not as yet have an expansion for cmpstrsi or cmpstrnsi. Thanks, Aaron Index: builtins.c

Re: [PATCH 0/2] strncmp builtin expansion improvement

2016-11-05 Thread Aaron Sawdey
Aaron

Re: [PATCH 0/2] strncmp builtin expansion improvement

2016-11-07 Thread Aaron Sawdey
%rbx .cfi_def_cfa_offset 16 .cfi_offset 3, -16 movl %edx, %ebx movl $5, %edx call strncmp movl %ebx, %edx I think it's pretty clear from the code in expand_builtin_strncmp that if len1 and len2 are both NULL, you end up with len=len2 and then i

Re: [PATCH 2/2, expand] make expand_builtin_strncmp more general

2016-11-08 Thread Aaron Sawdey
Richard,   Thanks for the review ... comments below. On Tue, 2016-11-08 at 13:36 +0100, Richard Biener wrote: > On Tue, Nov 1, 2016 at 11:29 PM, Aaron Sawdey > <acsaw...@linux.vnet.ibm.com> wrote: > > > > This patch adds code to expand_builtin_strncmp so it also at

[PATCH v2 0/2] strncmp builtin expansion improvement

2016-11-16 Thread Aaron Sawdey
. Bootstrap & regtest passed on x86_64 with svn 242454, ok for trunk?

[PATCH v2 1/2, i386] cmpstrnsi needs string length

2016-11-16 Thread Aaron Sawdey
expansion of strncmp when neither string argument is constant. I've also changed the pattern to indicate that operand 3 may be clobbered (if it happens to be in cx already). 2016-11-16  Aaron Sawdey  <acsaw...@linux.vnet.ibm.com> * config/i386/i386.md (cmpstrnsi): New test to ba

[PATCH v2 2/2, expand] make expand_builtin_strncmp more general

2016-11-16 Thread Aaron Sawdey
This patch makes expand_builtin_strncmp attempt to expand via cmpstrnsi even if neither of the string arguments are string constants. 2016-11-16  Aaron Sawdey  <acsaw...@linux.vnet.ibm.com> * builtins.c (expand_builtin_strncmp): Attempt expansion of strncmp via cmpstrns

[committed] strncmp builtin expansion improvement

2016-11-17 Thread Aaron Sawdey
Committed to trunk as 242556 after removing the use->clobber change from cmpstrnsi and bootstrap/regtest. gcc/ChangeLog 2016-11-17  Aaron Sawdey  <acsaw...@linux.vnet.ibm.com> * config/i386/i386.md (cmpstrnsi): New test to bail out if neither string input is a string

Re: [PATCH 2/2, expand] make expand_builtin_strncmp more general

2016-11-04 Thread Aaron Sawdey
ChangeLog for this patch: 2016-11-03 Aaron Sawdey <acsaw...@linux.vnet.ibm.com> * builtins.c (expand_builtin_strncmp): Attempt expansion of strncmp via cmpstrnsi even if neither string is constant.

Re: [PATCH 1/2, i386] cmpstrnsi needs string length

2016-11-04 Thread Aaron Sawdey
ChangeLog for this patch: 2016-11-03 Aaron Sawdey <acsaw...@linux.vnet.ibm.com> * config/i386/i386.md (cmpstrnsi): New test to bail out if neither string input is a string constant.

[PATCH 2/2, expand] make expand_builtin_strncmp more general

2016-11-01 Thread Aaron Sawdey
This patch adds code to expand_builtin_strncmp so it also attempts expansion via cmpstrnsi in the case where c_strlen() returns NULL for both string arguments, meaning that neither one is a constant.

[PATCH 1/2, i386] cmpstrnsi needs string length

2016-11-01 Thread Aaron Sawdey
expansion of strncmp when neither string argument is constant. Index: config/i386/i386.md

[PATCH 0/2] strncmp builtin expansion improvement

2016-11-01 Thread Aaron Sawdey
cmpstrnsi in the case where c_strlen() returns NULL for both strings. With these two patches bootstrap passes on x86_64 linux, currently checking regtest. If clean, ok for trunk?

Re: [RFC PATCH] expand_strn_compare should attempt expansion even if neither string is constant

2016-11-02 Thread Aaron Sawdey
On Wed, 2016-11-02 at 13:41 +0100, Bernd Schmidt wrote: > On 10/27/2016 03:14 AM, Aaron Sawdey wrote: > > > > I'm currently working on a builtin expansion of strncmp for powerpc > > similar to the one for memcmp I checked recently. One thing I > > e

[PATCH, RS6000, PR77934] mtvsrdd needs b (base register) constraint on first input

2016-10-11 Thread Aaron Sawdey
Gcc 7 trunk was generating incorrect code for spec2k6 403.gcc due to this constraint issue. OK for trunk after bootstrap/regtest passes? 2016-10-06  Aaron Sawdey  <acsaw...@linux.vnet.ibm.com> PR target/77934 * config/rs6000/vmx.md (vsx_concat_): The mtvsrdd instr

Re: PING! Re: [PATCH, Fortran] Extension: COTAN and degree-valued trig intrinsics with -fdec-math

2016-10-11 Thread Aaron Sawdey
get_degrees(gfc_expr*)’: ../../gcc/gcc/fortran/iresolve.c:2728:14: error: ‘tmp’ was not declared in this scope and also this: ../../gcc/gcc/fortran/simplify.c: In function ‘void radians_f(__mpfr_struct*, mpfr_rnd_t)’: ../../gcc/gcc/fortran/simplify.c:1775:5: error: ‘mpfr_fmod_d’ was not declared in t

[PATCH] builtin expansion of strncmp for rs6000

2016-12-15 Thread Aaron Sawdey
for trunk if no new regressions? 2016-11-17 Aaron Sawdey <acsaw...@linux.vnet.ibm.com> * config/rs6000/rs6000-protos.h (expand_strn_compare): Declare. * config/rs6000/rs6000.md (UNSPEC_CMPB): New unspec. (cmpb3): pattern for generating cmpb. (cmpstrnsi): p

[PATCH, bugfix] builtin expansion of strcmp for rs6000

2017-01-11 Thread Aaron Sawdey
strcmp-1.c test case to check strcmp expansion. Also both now have a length 100 tests to check the transition from the inline comparison to the library call for the remainder. ChangeLog 2017-01-11  Aaron Sawdey  <acsaw...@linux.vnet.ibm.com> * config/rs6000/rs6000-protos.h (exp

Re: [PATCH] Add testcases to test builtin-expansion of memcmp and strncmp

2017-01-06 Thread Aaron Sawdey
Jeff, Thanks for the review. Committed as 244177 with requested changes. 2017-01-06 Aaron Sawdey <acsaw...@linux.vnet.ibm.com> * gcc.dg/memcmp-1.c: New. * gcc.dg/strncmp-1.c: New. Aaron

[PATCH] Add testcases to test builtin-expansion of memcmp and strncmp

2016-12-19 Thread Aaron Sawdey
parties from targets that have a strncmp builtin. The tests pass on ppc64le and x86_64. OK for trunk? Index: gcc/testsuite/gcc.dg/memcmp-1.c

Re: [PATCH] builtin expansion of strncmp for rs6000

2016-12-19 Thread Aaron Sawdey
nks, Aaron Index: gcc/config/rs6000/rs6000-protos.h === --- gcc/config/rs6000/rs6000-proto

[PATCH][PR target/80123][7 regression] new constraint wA to prevent r0 use in mtvsrdd

2017-03-21 Thread Aaron Sawdey
/regtest on 64-bit LE and BE, and also BE 32- bit. OK for trunk if everything passes? 2017-03-21  Aaron Sawdey  <acsaw...@linux.vnet.ibm.com> PR target/80123 * doc/md.texi (Constraints): Document wA constraint. * config/rs6000/constraints.md (wA): New. * config/

Re: [PATCH][PR target/80083][7 regression] fix power9 vsx-small-integer issue caused by wrong constraints

2017-03-20 Thread Aaron Sawdey
On Mon, 2017-03-20 at 11:11 -0500, Aaron Sawdey wrote: > Test libgomp doacross2.f90 failed only at -O1 because an incorrect > constraint on movsi_internal1 (for vspltisw) led to confusion between > vsx and float registers (fix credit to Meissner). In subsequent > discussion David Edel

[PATCH][PR target/80083][7 regression] fix power9 vsx-small-integer issue caused by wrong constraints

2017-03-20 Thread Aaron Sawdey
for xxspltib -1 that is also fixed now. Bootstrap/regtest reveals no errors on either power8 or power9. Ok for trunk? 2017-03-20  Aaron Sawdey  <acsaw...@linux.vnet.ibm.com> PR target/80083 * config/rs6000/rs6000.md (*movsi_internal1): incorrect constraints for alternati

[PATCH][PR target/79752] fix rs6000 power9 peephole2 for udiv/umod -- backported to gcc-6-branch

2017-03-14 Thread Aaron Sawdey
(match_dup 2))) + (udiv:GPR (match_dup 1) + (match_dup 2))) (set (match_dup 3) (mult:GPR (match_dup 0) (match_dup 2))) 2017-03-14 Aaron Sawdey <acsaw...@linux.vnet.ibm.com> Backport from mainline 2017-02-28 Aaron Sawdey <

[PATCH][PR target/80358][7 regression] Fix boundary check error in expand_block_compare

2017-04-07 Thread Aaron Sawdey
in progress passes? 2017-04-07  Aaron Sawdey  <acsaw...@linux.vnet.ibm.com> PR target/80358 * config/rs6000/rs6000.c (expand_block_compare): Fix boundary check. Index: gcc/config/rs6000/rs6000.c === --- gcc/config/

Re: [PATCH rs6000] remove implicit static var outputs of toc_relative_expr_p

2017-06-28 Thread Aaron Sawdey
Hi Segher, On Tue, 2017-06-27 at 18:35 -0500, Segher Boessenkool wrote: > Hi Aaron, > > On Tue, Jun 27, 2017 at 11:43:57AM -0500, Aaron Sawdey wrote: > > The function toc_relative_expr_p implicitly sets two static vars > > (tocrel_base and tocrel_offset) that are

Re: [PATCH rs6000] remove implicit static var outputs of toc_relative_expr_p

2017-06-29 Thread Aaron Sawdey
On Wed, 2017-06-28 at 18:19 -0500, Segher Boessenkool wrote: > On Wed, Jun 28, 2017 at 03:21:49PM -0500, Aaron Sawdey wrote: > > -toc_relative_expr_p (const_rtx op, bool strict) > > +toc_relative_expr_p (const_rtx op, bool strict, const_rtx > > *tocrel_base_ret, > >

[PATCH] reorganize block/string move/compare expansions out of rs6000.c

2017-06-22 Thread Aaron Sawdey
? 2017-06-22  Aaron Sawdey  <acsaw...@linux.vnet.ibm.com> * config/rs6000/rs6000-string.c (expand_block_clear, do_load_for_compare, select_block_compare_mode, compute_current_alignment, expand_block_compare, expand_strncmp_align_check, expand_strn_c

[PATCH rs6000] remove implicit static var outputs of toc_relative_expr_p

2017-06-27 Thread Aaron Sawdey
the only thing they are used for. Bootstrap and regtest passes in trunk 249639 (to avoid the bootstrap fail), ok for trunk? 2017-06-27  Aaron Sawdey  <acsaw...@linux.vnet.ibm.com> * config/rs6000/rs6000.c (toc_relative_expr_p): Make tocrel_base and tocrel_offset be pointe
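The refactoring pattern described here, replacing results left in file-scope statics with explicit output pointers, can be shown on a toy predicate. Everything in this sketch is hypothetical: `struct expr` stands in for an rtx, and the two functions only mirror the before/after shape of the change, not the real toc_relative_expr_p logic.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an expression: BASE + OFFSET or something else. */
struct expr {
    bool is_sum;
    int base;
    int offset;
};

/* Before: the predicate quietly records its findings in statics that
   callers must remember to read right after the call. */
static int found_base;
static int found_offset;

static bool sum_expr_p_implicit(const struct expr *e)
{
    if (!e->is_sum)
        return false;
    found_base = e->base;
    found_offset = e->offset;
    return true;
}

/* After: outputs are explicit parameters; pass NULL for any that the
   caller does not need. */
static bool sum_expr_p(const struct expr *e, int *base_ret, int *offset_ret)
{
    if (!e->is_sum)
        return false;
    if (base_ret)
        *base_ret = e->base;
    if (offset_ret)
        *offset_ret = e->offset;
    return true;
}
```

With the explicit form, a caller that only wants the yes/no answer passes NULL twice, and no later call can silently overwrite results that another caller still depends on.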

Re: [PATCH] Add -static-pie to GCC driver to create static PIE

2017-09-12 Thread Aaron Sawdey
" I don't see the problem on 252033. Thanks, Aaron

Re: [TESTSUITE]Use strncpy instead of strcpy in testsuite/gcc.dg/memcmp-1.c

2017-08-30 Thread Aaron Sawdey
> > Test Okay without any problem. Okay to commit? > > Regard, > Renlin > > > gcc/testsuite/ChangeLog: > > 2017-08-30 Renlin Li <renlin...@arm.com> > > * gcc.dg/memcmp-1.c (test_strncmp): Use strncpy instead of > strcpy.

[PATCH] rs6000: Cleanup bdz/bdnz insn/splitter, add new insn/splitter for bdzt/bdzf/bdnzt/bdnzf

2017-11-30 Thread Aaron Sawdey
will require the change to canonicalize_condition I posted before thanksgiving to prevent doloop from being confused by bdnzt et. al. Bootstrap/regtest passes on ppc64le. OK for trunk? 2017-11-30 Aaron Sawdey <acsaw...@linux.vnet.ibm.com> * config/rs6000/rs6000.md (cceq_ior_compare):

Re: [PATCH] make canonicalize_condition keep its promise

2017-11-29 Thread Aaron Sawdey
On Tue, 2017-11-21 at 11:45 -0600, Aaron Sawdey wrote: > On Tue, 2017-11-21 at 10:06 -0700, Jeff Law wrote: > > On 11/20/2017 06:41 AM, Aaron Sawdey wrote: > > > On Sun, 2017-11-19 at 16:44 -0700, Jeff Law wrote: > > > > On 11/15/2017 08:40 AM, Aaron Sawdey

Re: [PATCH] make canonicalize_condition keep its promise

2017-12-14 Thread Aaron Sawdey
On Thu, 2017-12-14 at 13:43 -0700, Jeff Law wrote: > On 11/21/2017 10:45 AM, Aaron Sawdey wrote: > > On Tue, 2017-11-21 at 10:06 -0700, Jeff Law wrote: > > > On 11/20/2017 06:41 AM, Aaron Sawdey wrote: > > > > On Sun, 2017-11-19 at 16:44 -0700, Jeff Law wrote: >

Re: [PATCH][rs6000][PR target/82190] fix mem size info in rtl generated by memcmp and strncmp/strcmp builtin expansion

2017-12-12 Thread Aaron Sawdey
On Tue, 2017-12-12 at 20:50 +0100, Jakub Jelinek wrote: > On Tue, Dec 12, 2017 at 01:40:41PM -0600, Aaron Sawdey wrote: > > 2017-12-12  Aaron Sawdey  <acsaw...@linux.vnet.ibm.com> > > > > PR target/82190 > > * config/rs6000/rs6000-string.c (expand_block_

[PATCH][rs6000][PR target/82190] fix mem size info in rtl generated by memcmp and strncmp/strcmp builtin expansion

2017-12-12 Thread Aaron Sawdey
to the size of the load being done regardless of how many bytes are being used. OK for trunk if bootstrap/regtest passes on ppc64le? 2017-12-12 Aaron Sawdey <acsaw...@linux.vnet.ibm.com> PR target/82190 * config/rs6000/rs6000-string.c (expand_bloc

[PATCH, rs6000] Allow memmov/memset builtin expansion to use unaligned vsx on p8/p9

2017-12-13 Thread Aaron Sawdey
This patch allows the use of unaligned vsx loads/stores for builtin expansion of memset and memmove on p8/p9. Performance of unaligned vsx instructions is good on these processors. OK for trunk if bootstrap/regtest on ppc64le passes? 2017-12-13  Aaron Sawdey  <acsaw...@linux.vnet.ibm.

Re: [PATCH] make canonicalize_condition keep its promise

2017-11-20 Thread Aaron Sawdey
On Sun, 2017-11-19 at 16:44 -0700, Jeff Law wrote: > On 11/15/2017 08:40 AM, Aaron Sawdey wrote: > > So, the story of this very small patch starts with me adding > > patterns > > for ppc instructions bdz[tf] and bdnz[tf] such as this: > > > >  

Re: [PATCH] make canonicalize_condition keep its promise

2017-11-21 Thread Aaron Sawdey
On Tue, 2017-11-21 at 10:06 -0700, Jeff Law wrote: > On 11/20/2017 06:41 AM, Aaron Sawdey wrote: > > On Sun, 2017-11-19 at 16:44 -0700, Jeff Law wrote: > > > On 11/15/2017 08:40 AM, Aaron Sawdey wrote: > > > > So, the story of this very small patch starts w

[PATCH] make canonicalize_condition keep its promise

2017-11-15 Thread Aaron Sawdey
y to go about this please let me know and I'll revise/retest. Bootstrap and regtest pass on ppc64le and x86_64. Ok for trunk? Thanks, Aaron 2017-11-15  Aaron Sawdey  <acsaw...@linux.vnet.ibm.com> * rtlanal.c (canonicalize_condition): Return 0 if final rtx does not have

[PATCH, rs6000] generate loop code for memcmp inline expansion

2017-12-11 Thread Aaron Sawdey
iption. (-mblock-compare-inline-loop-limit): New option. Index: gcc/config/rs6000/rs6000-st

[PATCH, rs6000] cleanup/refactor in rs6000-string.c

2018-06-14 Thread Aaron Sawdey
for trunk? Thanks! Aaron 2018-06-14 Aaron Sawdey * config/rs6000/rs6000-string.c (select_block_compare_mode): Check TARGET_EFFICIENT_OVERLAPPING_UNALIGNED here instead of in caller. (do_and3, do_and3_mask, do_compb3, do_rotl3): New functions

[PATCH, rs6000] don't use unaligned vsx for memset of less than 32 bytes

2018-06-25 Thread Aaron Sawdey
runs show the performance regression is fixed. Regstrap passes on powerpc64le, ok for trunk and backport to 8? Thanks, Aaron 2018-06-25 Aaron Sawdey * config/rs6000/rs6000-string.c (expand_block_clear): Don't use unaligned vsx for 16B memset.

Re: [PATCH] rs6000: Cleanup bdz/bdnz insn/splitter, add new insn/splitter for bdzt/bdzf/bdnzt/bdnzf

2018-01-08 Thread Aaron Sawdey
this cleanup and addition to the patterns and splitters for the branch decrement instructions as 256344. 2018-01-08 Aaron Sawdey <acsaw...@linux.vnet.ibm.com> * config/rs6000/rs6000.md (cceq_ior_compare): Remove * so I can use it to generate rtl. (cceq_ior_compare_complement):

Re: [PATCH, rs6000] generate loop code for memcmp inline expansion

2018-01-08 Thread Aaron Sawdey
On Tue, 2017-12-12 at 10:13 -0600, Segher Boessenkool wrote: > Please fix those trivialities, and it's okay for trunk (after the > rtlanal patch is approved too). Thanks! Here's the final version of this, which is committed as 256351. 2018-01-08 Aaron Sawdey <acsaw...@linux.vne

Re: [PATCH, rs6000] generate loop code for memcmp inline expansion

2018-01-10 Thread Aaron Sawdey
I'll check the runtime of that --- I added some test cases to memcmp- 1.c and probably it is now taking too long. I will revise it so it's no longer than it was before. Aaron On Wed, 2018-01-10 at 14:25 +, Szabolcs Nagy wrote: > On 08/01/18 19:37, Aaron Sawdey wrote: > > On Tue

[PATCH] reduce runtime of gcc.dg/memcmp-1.c test

2018-01-10 Thread Aaron Sawdey
This brings it back not quite to where it was but a lot more reasonable than what I put into 256351. 2018-01-10 Aaron Sawdey <acsaw...@linux.vnet.ibm.com> * gcc.dg/memcmp-1.c: Reduce runtime to something reasonable. OK for trunk? Thanks, Aaron

[PATCH][PR debug/83758] look more carefully for internal_arg_pointer in vt_add_function_parameter()

2018-01-29 Thread Aaron Sawdey
ppc64le and x86_64, ok for trunk? 2018-01-29 Aaron Sawdey <acsaw...@linux.vnet.ibm.com> * var-tracking.c (vt_add_function_parameter): Fix comparison of rtx. Index: gcc/var-tracking.c === --- gcc/var-tracking.c (

Re: [PATCH, rs6000][PR debug/83758] v2 rs6000_internal_arg_pointer should only return a register

2018-01-30 Thread Aaron Sawdey
ap, go tests run. Segher is currently regtesting on ppc64le power9. OK for trunk if tests pass? 2018-01-30 Aaron Sawdey <acsaw...@linux.vnet.ibm.com> * config/rs6000/rs6000.c (rs6000_internal_arg_pointer): Only return a reg rtx. -- Aaron Sawdey, Ph.D. acsaw...@linux.vnet.

Re: [PATCH][PR debug/83758] look more carefully for internal_arg_pointer in vt_add_function_parameter()

2018-01-30 Thread Aaron Sawdey
> in var-tracking.c. > rs6000/powerpcspe with -fsplit-stack are the only cases where > crtl->args.internal_arg_pointer is not a REG, so just running libgo > testsuite on powerpc{,64,64le} should cover it all. I'll give this a try today when I get to the office. Thanks, Aaron > > Jakub > -- Aaron Sawdey, Ph.D. acsaw...@linux.vnet.ibm.com 050-2/C113 (507) 253-7520 home: 507/263-0782 IBM Linux Technology Center - PPC Toolchain

Re: [PATCH] make canonicalize_condition keep its promise

2018-01-02 Thread Aaron Sawdey
ttach a simple > loop > descriptor to a loop that is not a simple loop. But clearly you > didn't > introduce that oddball behavior. Jeff, Thanks for sticking with this and reviewing, I have re-checked that regstrap still passes and committed as 256079. Aaron -- Aaron Sawdey, Ph.D.

Re: [PATCH][rs6000][PR target/82190] fix mem size info in rtl generated by memcmp and strncmp/strcmp builtin expansion

2018-01-02 Thread Aaron Sawdey
gnificantly larger. */ - if (TARGET_ALTIVEC && bytes >= 16 && align >= 128) + if (TARGET_ALTIVEC && bytes >= 16 && (TARGET_EFFICIENT_UNALIGNED_VSX || align >= 128)) { move_bytes = 16; mode = V4SImode; -- Aaro

[PATCH, rs6000] PR target/86222 fix truncation issue with constants when compiling -m32

2018-06-21 Thread Aaron Sawdey
and backport to 8? Thanks, Aaron 2018-06-19 Aaron Sawdey * config/rs6000/rs6000-string.c (expand_strn_compare): Handle -m32 correctly. -- Aaron Sawdey, Ph.D. acsaw...@linux.vnet.ibm.com 050-2/C113 (507) 253-7520 home: 507/263-0782 IBM Linux Technology Center - PPC

[PATCH, rs6000] refactor/cleanup in rs6000-string.c

2018-07-31 Thread Aaron Sawdey
Just teasing things apart a bit more in this function so I can add vec/vsx code generation without making it enormous and incomprehensible. Bootstrap/regtest passes on powerpc64le, ok for trunk? Thanks, Aaron 2018-07-31 Aaron Sawdey * config/rs6000/rs6000-string.c

[PATCH, rs6000] inline expansion of str[n]cmp using vec/vsx instructions

2018-08-22 Thread Aaron Sawdey
) and ppc64le (power8 and power9). Ok for trunk? Thanks! Aaron 2018-08-22 Aaron Sawdey * config/rs6000/altivec.md (altivec_eq): Remove star. * config/rs6000/rs6000-string.c (do_load_for_compare): Support vector load modes. (expand_strncmp_vec_sequence): New function

Re: [PATCH][AArch64] PR84114: Avoid reassociating FMA

2018-02-27 Thread Aaron Sawdey
n increases register pressure so it would be nice to be able to avoid causing issues as a result of that. -- Aaron Sawdey, Ph.D. acsaw...@linux.vnet.ibm.com 050-2/C113 (507) 253-7520 home: 507/263-0782 IBM Linux Technology Center - PPC Toolchain

PR target/84743 adjust reassociation widths for power8/power9

2018-03-12 Thread Aaron Sawdey
% Bottom line is net improvement for CPU2017 int compared with either current trunk, or disabling parallel reassociation. For CPU2017 fp, very small overall degradation. Currently doing regstrap on ppc64le, ok for trunk if results look good? Thanks! Aaron 2018-03-12 Aaron Sawdey <ac

[PATCH] rs6000: document options (PR85321)

2018-04-10 Thread Aaron Sawdey
This updates invoke.texi to document -mblock-compare-inline-limit, -mblock-compare-inline-loop-limit, and -mstring-compare-inline-limit. Tested with "make pdf", ok for trunk? 2018-04-10 Aaron Sawdey <acsaw...@linux.ibm.com> PR target/85321 * doc/invo
