/branches/lto-pressure
Aaron
Thanks & Regards
Ajit
--
Aaron Sawdey, Ph.D. acsaw...@linux.vnet.ibm.com
050-2/C113 (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain
On Mon, 2014-06-16 at 14:42 -0400, Vladimir Makarov wrote:
On 2014-06-16, 2:25 PM, Aaron Sawdey wrote:
On Mon, 2014-06-16 at 14:14 +, Ajit Kumar Agarwal wrote:
Hello All:
I have worked on the Open64 compiler where the Register Pressure Guided
Unroll and Jam gave a good amount
On Thu, 2015-09-03 at 15:22 +, Ajit Kumar Agarwal wrote:
>
>
> -Original Message-
> From: Aaron Sawdey [mailto:acsaw...@linux.vnet.ibm.com]
> Sent: Wednesday, September 02, 2015 8:23 PM
> To: Ajit Kumar Agarwal
> Cc: Jeff Law; vmaka...@redhat.com; Richard Bi
improve register allocation so it doesn't fall down when register
pressure gets high.
The code is in a branch called lto-pressure.
Aaron
>
> Thanks & Regards
> Ajit
>
--
Aaron Sawdey, Ph.D. acsaw...@linux.vnet.ibm.com
050-2/C113 (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain
On Sat, 2015-09-12 at 18:45 +, Ajit Kumar Agarwal wrote:
>
>
> -Original Message-
> From: Aaron Sawdey [mailto:acsaw...@linux.vnet.ibm.com]
> Sent: Friday, September 04, 2015 11:51 PM
> To: Ajit Kumar Agarwal
> Cc: Jeff Law; vmaka...@redhat.com; Richard Bi
on. */
esucc->probability = REG_BR_PROB_BASE;
}
}
}
It would appear that the guessed counts are getting changed
inconsistently before this during the tree-ssa-dom pass.
Any trail of breadcrumbs to follow through the forest would be helpful
here ...
Thanks!
Aaron
--
Aaron Sawde
whose terms are fp multiplies because now we have
fused multiply-adds to consider. See PR 70912 for more on this.
Suggestions?
Thanks,
Aaron
--
Aaron Sawdey, Ph.D. acsaw...@linux.vnet.ibm.com
050-2/C113 (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain
the "scmpu" instruction to do the
comparison. The RX manual I found showed pseudocode for scmpu that
shows it both checks for zero byte as well as comparing the strings.
If this isn't correct, please let me know here or on the patch itself.
Thanks,
Aaron
--
Aaron Sawdey, Ph
complicated than memset because of encodings etc. My
> snippet
> in question used a fixed-length encoding of 2 bytes, however.
>
> Another simple idea to tackle this would be a peephole optimization
> but
> I'm not sure if this is really feasible for something like memset.
> Would
mp+ccmp and one branch.
> >
> > Even better would be wider loads if you either know the alignment
> > of s or it's max size
> > (although given the overhead of creating the return value that
> > works best for equality).
>
> All those things are hand
On 10/17/18 4:03 PM, Florian Weimer wrote:
> * Aaron Sawdey:
>
>> I've previously posted a patch to add vector/vsx inline expansion of
>> strcmp/strncmp for the power8/power9 processors. Here are some of the
>> other items I have in the pipeline that I hope to get into
512 bytes inline before dumping to the library
function.
If anyone has any other input on the inline expansion work I've been
doing for the rs6000 target, please let me know.
Thanks!
Aaron
--
Aaron Sawdey, Ph.D. acsaw...@linux.vnet.ibm.com
050-2/C113 (507) 253-7520 home: 507/263-0782
IBM
On 5/15/19 1:01 PM, Jakub Jelinek wrote:
> On Wed, May 15, 2019 at 12:59:01PM -0500, Aaron Sawdey wrote:
>> 1) rename optab movmem and the underlying patterns to cpymem.
>> 2) add a new optab movmem that is really memmove() and add support for
>> having __builtin_memmove() u
On 5/15/19 9:02 AM, Michael Matz wrote:
> On Wed, 15 May 2019, Aaron Sawdey wrote:
>> Next question would be how do we move from the existing movmem pattern
>> (which Michael Matz tells us should be renamed cpymem anyway) to this
>> new thing. Are you proposing that we s
On 5/15/19 11:31 AM, Jakub Jelinek wrote:
> On Wed, May 15, 2019 at 11:23:54AM -0500, Aaron Sawdey wrote:
>> My goals for this are:
>> * memcpy() call becomes __builtin_memcpy and goes to optab[cpymem]
>> * memmove() call becomes __builtin_memmove (or __builtin_memcpy base
this machinery that need to
work a certain way, or other related issues that should be addressed in
between expand_builtin_memcpy() and emit_block_move_via_movmem().
Thanks!
Aaron
--
Aaron Sawdey, Ph.D. acsaw...@linux.vnet.ibm.com
050-2/C113 (507) 253-7520 home: 507/263-0782
IBM Linux Technolo
On 5/15/19 7:22 AM, Richard Biener wrote:
> On Tue, May 14, 2019 at 9:21 PM Aaron Sawdey wrote:
>> I'd be interested in any comments about pieces of this machinery that need to
>> work a certain way, or other related issues that should be addressed in
>> between
On 5/15/19 8:10 AM, Michael Matz wrote:
> On Tue, 14 May 2019, Aaron Sawdey wrote:
>
>> memcpy -> expand with movmem pattern
>> memmove (no overlap) -> transform to memcpy -> expand with movmem pattern
>> memmove (overlap) -> remains memmove -> gl
,
Aaron
--
Aaron Sawdey, Ph.D. acsaw...@linux.vnet.ibm.com
050-2/C113 (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain
in bzip2.
Aaron
--
Aaron Sawdey, Ph.D. acsaw...@linux.vnet.ibm.com
050-2/C113 (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain
, Apr 18, 2014 at 9:43 AM, Aaron Sawdey
acsaw...@linux.vnet.ibm.com wrote:
Honza,
Seeing your recent patches relating to inliner heuristics for LTO, I
thought I should mention some related work I'm doing.
By way of introduction, I've recently joined the IBM LTC's PPC Toolchain
team
t.
size_cost = (estimate_reg_pressure_cost (new_regs[0] + regs_needed[0],
regs_used, speed, call_p)
- estimate_reg_pressure_cost (new_regs[0],
regs_used, speed, call_p));
I'm not quite sure I understand the "wh
Hi,
This patch enables TARGET_SCHED_REASSOCIATION_WIDTH for power8 and up.
The widths returned are derived from testing with SPEC 2006 and some
simple tests on power8.
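
For readers unfamiliar with the hook: the reassociation width is, roughly, the number of independent operation chains the reassociation pass is allowed to create from one long dependent chain. The sketch below only illustrates that idea in source form; the function names are made up for this example, it is not code from the patch, and for floating point the compiler would only make this rewrite itself when reassociation is permitted (e.g. -ffast-math).

```c
#include <stddef.h>

/* Width 1: every add depends on the result of the previous add.  */
static double
sum_one_chain (const double *x, size_t n)
{
  double s = 0.0;
  for (size_t i = 0; i < n; i++)
    s += x[i];
  return s;
}

/* Width 2: two independent chains that can issue in parallel,
   combined once at the end.  */
static double
sum_two_chains (const double *x, size_t n)
{
  double s0 = 0.0, s1 = 0.0;
  size_t i = 0;
  for (; i + 1 < n; i += 2)
    {
      s0 += x[i];
      s1 += x[i + 1];
    }
  if (i < n)          /* odd element left over */
    s0 += x[i];
  return s0 + s1;
}
```

A wider split shortens the dependency chain but keeps more partial sums live at once, which is the register-pressure trade-off.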
Bootstrapped and regtested on powerpc64le-unknown-linux-gnu, ok for
trunk?
2016-05-04 Aaron Sawdey <ac
<baldr...@gcc.gnu.org>
Sujoy Saraswati
<sujoy.sarasw...@hpe.com>
Trevor Saunders <tsaund...@mozilla.com>
+Aaron Sawdey <acsaw...@linux.vnet.ibm.
On Thu, 2017-01-19 at 17:00 -0600, Aaron Sawdey wrote:
> SMS does process the loop in sms-8.c on powerpc now so I have updated
> the options to reflect that.
>
> Test now passes on powerpc -m64/-m32/-m32 -mpowerpc64. Ok for trunk?
>
> testsuite/ChangeLog
> 2017-01-19
on ppc64/ppc64le.
Assuming regtest on ppc64/ppc64le passes, ok for trunk?
2017-01-27 Aaron Sawdey <acsaw...@linux.vnet.ibm.com>
PR target/79170
* gcc.dg/memcmp-1.c: Improved to catch failures seen in PR 79170.
2017-01-27 Aaron Sawdey <acsaw...@linux.vnet.ibm.com>
down the same issue again.
Aaron
--
Aaron Sawdey, Ph.D. acsaw...@linux.vnet.ibm.com
050-2/C113 (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain
This testcase I added failed to compile on AIX or older linux due to
the use of aligned_alloc(). Now fixed to use posix_memalign if
available, and valloc otherwise.
Now it compiles and passes on x86_64 (fedora 25), ppc64 (RHEL6.8), and
AIX. OK for trunk?
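
A minimal sketch of that kind of fallback, for illustration only. It is not the code from the testcase: HAVE_POSIX_MEMALIGN, alloc_aligned and is_aligned are hypothetical names, and the real test may select the allocator differently.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L  /* ask the headers to declare posix_memalign */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: a configure-style macro, hypothetical here.  Set it to 0
   on systems that lack posix_memalign.  */
#ifndef HAVE_POSIX_MEMALIGN
#define HAVE_POSIX_MEMALIGN 1
#endif

#if !HAVE_POSIX_MEMALIGN
extern void *valloc (size_t);  /* legacy; declared by hand in case headers hide it */
#endif

/* Return SIZE bytes aligned to ALIGN (a power of two, at least
   sizeof (void *)), or NULL on failure.  */
static void *
alloc_aligned (size_t align, size_t size)
{
  assert (align >= sizeof (void *) && (align & (align - 1)) == 0);
#if HAVE_POSIX_MEMALIGN
  void *p = NULL;
  if (posix_memalign (&p, align, size) != 0)
    return NULL;
  return p;
#else
  /* valloc returns page-aligned memory, which is enough as long as
     ALIGN does not exceed the page size.  */
  return valloc (size);
#endif
}

/* Nonzero if P is aligned to ALIGN (a power of two).  */
static int
is_aligned (const void *p, size_t align)
{
  return ((uintptr_t) p & (align - 1)) == 0;
}
```

Memory from either path can be released with free().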
2017-02-14 Aaron Sawdey <ac
and the new test case
passes on x86_64 as well, ok for trunk?
2017-02-09 Aaron Sawdey <acsaw...@linux.vnet.ibm.com>
PR target/79449
* gcc.dg/strncmp-2.c: New. Test strncmp and memcmp builtin
expansion for reading beyond a 4k boundary.
2017-02-09 Aaron Sawdey
The bcdadd pattern has the wrong constraints. The change Meissner
supplied in PR79295 fixes the issue.
Successfully bootstrapped on ppc64le, ok for trunk if regtest also
passes?
2017-02-09 Aaron Sawdey <acsaw...@linux.vnet.ibm.com>
PR target/79295
* config/rs6000/alti
On Tue, 2017-02-14 at 13:09 -0600, Segher Boessenkool wrote:
> On Tue, Feb 14, 2017 at 11:56:50AM -0600, Aaron Sawdey wrote:
> > This testcase I added failed to compile on AIX or older linux due
> > to
> > the use of aligned_alloc(). Now fixed to use posix_memalign if
>
On Tue, 2017-01-17 at 08:30 -0600, Peter Bergner wrote:
> On 1/16/17 3:09 PM, Aaron Sawdey wrote:
> > Here is an updated version of this patch.
> >
> > Tulio noted that glibc's strncmp test was failing. This turned out
> > to
> > be the use of signed HOST_WID
included interested parties from targets that have a strncmp
builtin.
The test passes on x86_64 and on ppc64le with -mcpu=power6. It will not
pass on ppc64/ppc64le -mcpu=power[78] until I check in my patch that
segher ack'd yesterday and is currently regtesting. OK for trunk?
--
Aaron Sawdey, Ph.D
,
Aaron
On Wed, 2017-01-11 at 11:26 -0600, Aaron Sawdey wrote:
> This expands on the previous patch. For strcmp and for strncmp with N
> larger than 64, the first 64 bytes of comparison is expanded inline
> and
> then a call to strcmp or strncmp is emitted to compare the remainder
> i
SMS does process the loop in sms-8.c on powerpc now so I have updated
the options to reflect that.
Test now passes on powerpc -m64/-m32/-m32 -mpowerpc64. Ok for trunk?
testsuite/ChangeLog
2017-01-19 Aaron Sawdey <acsaw...@linux.vnet.ibm.com>
* gcc.dg/sms-8.c: Update options for p
(div:GPR (match_dup 1)
+ (udiv:GPR (match_dup 1)
(match_dup 2)))
(set (match_dup 3)
(mult:GPR (match_dup 0)
2017-02-28 Aaron Sawdey <acsaw...@linux.vnet.ibm.com>
PR target/79752
* config/rs6000/rs6000.md (peephole2 for udiv/umod): Should emit
PPC64 Linux. This patch
> should be using !BYTES_BIG_ENDIAN.
Change made, I will commit as obvious once I bootstrap to double check
my work on ppc64le.
Sorry for the mess ...
Aaron
--
Aaron Sawdey, Ph.D. acsaw...@linux.vnet.ibm.com
050-2/C113 (507) 253-7520 home: 507/263-0782
IBM Linux Te
r code for processors older than p8.
--
Aaron Sawdey, Ph.D. acsaw...@linux.vnet.ibm.com
050-2/C113 (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs
It seems we now have analysis that concludes these buffers may possibly
overflow. This broke bootstrap on ppc64 BE. Bootstrap passed on ppc64 BE
power7. Committing as pre-approved by Segher.
2016-10-06 Aaron Sawdey <acsaw...@linux.vnet.ibm.com>
* config/rs6000/rs
40 bytes.
Bootstrap on powerpc64le, regtest in progress, OK for trunk if no new
regressions?
2016-09-22 Aaron Sawdey <acsaw...@linux.vnet.ibm.com>
* config/rs6000/rs6000.md (cmpmemsi): New define_expand.
* config/rs6000/rs6000.c (expand_block_compare): New functio
_rtx, arg2_rtx);
+}
/* If SRC is a string constant and block move would be done
by pieces, we can avoid loading the string from memory
--
Aaron Sawdey, Ph.D. acsaw...@linux.vnet.ibm.com
050-2/C113 (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain
bootstrap/regtest on i386 as rs6000 does not as yet have an expansion
for cmpstrsi or cmpstrnsi.
Thanks,
Aaron
--
Aaron Sawdey, Ph.D. acsaw...@linux.vnet.ibm.com
050-2/C113 (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain
Index: builtins.c
Aaron
--
Aaron Sawdey, Ph.D. acsaw...@linux.vnet.ibm.com
050-2/C113 (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain
%rbx
.cfi_def_cfa_offset 16
.cfi_offset 3, -16
movl	%edx, %ebx
movl	$5, %edx
call	strncmp
movl	%ebx, %edx
I think it's pretty clear from the code in expand_builtin_strncmp that
if len1 and len2 are both NULL, you end up with len=len2 and then i
Richard,
Thanks for the review ... comments below.
On Tue, 2016-11-08 at 13:36 +0100, Richard Biener wrote:
> On Tue, Nov 1, 2016 at 11:29 PM, Aaron Sawdey
> <acsaw...@linux.vnet.ibm.com> wrote:
> >
> > This patch adds code to expand_builtin_strncmp so it also at
.
Bootstrap & regtest passed on x86_64 with svn 242454, ok for trunk?
--
Aaron Sawdey, Ph.D. acsaw...@linux.vnet.ibm.com
050-2/C113 (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain
expansion of strncmp when neither string argument is
constant. I've also changed the pattern to indicate that operand 3 may
be clobbered (if it happens to be in cx already).
2016-11-16 Aaron Sawdey <acsaw...@linux.vnet.ibm.com>
* config/i386/i386.md (cmpstrnsi): New test to ba
This patch makes expand_builtin_strncmp attempt to expand via cmpstrnsi
even if neither of the string arguments is a string constant.
2016-11-16 Aaron Sawdey <acsaw...@linux.vnet.ibm.com>
* builtins.c (expand_builtin_strncmp): Attempt expansion of strncmp
via cmpstrns
Committed to trunk as 242556 after removing the use->clobber change
from cmpstrnsi and bootstrap/regtest.
gcc/ChangeLog
2016-11-17 Aaron Sawdey <acsaw...@linux.vnet.ibm.com>
* config/i386/i386.md (cmpstrnsi): New test to bail out if neither
string input is a string
ChangeLog for this patch:
2016-11-03 Aaron Sawdey <acsaw...@linux.vnet.ibm.com>
* builtins.c (expand_builtin_strncmp): Attempt expansion of strncmp
via cmpstrnsi even if neither string is constant.
--
Aaron Sawdey, Ph.D. acsaw...@linux.vnet.ibm.com
050-2/C113 (507) 25
ChangeLog for this patch:
2016-11-03 Aaron Sawdey <acsaw...@linux.vnet.ibm.com>
* config/i386/i386.md (cmpstrnsi): New test to bail out if neither
string input is a string constant.
--
Aaron Sawdey, Ph.D. acsaw...@linux.vnet.ibm.com
050-2/C113 (507) 253-7520 home: 5
This patch adds code to expand_builtin_strncmp so it also attempts
expansion via cmpstrnsi in the case where c_strlen() returns NULL for
both string arguments, meaning that neither one is a constant.
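
For illustration, a call of the kind this affects might look like the sketch below. The function name is made up for this example; the point is only that both arguments are plain pointers, so neither is a string constant.

```c
#include <string.h>

/* Neither A nor B is a string literal, so the compiler cannot
   determine a constant length for either argument.  */
int
compare_prefix (const char *a, const char *b)
{
  return strncmp (a, b, 5);
}
```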
--
Aaron Sawdey, Ph.D. acsaw...@linux.vnet.ibm.com
050-2/C113 (507) 253-7520 home: 507/263
expansion of strncmp when neither string argument is
constant.
--
Aaron Sawdey, Ph.D. acsaw...@linux.vnet.ibm.com
050-2/C113 (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain
Index: config/i386/i386.md
cmpstrnsi in the case where c_strlen() returns NULL for both
strings.
With these two patches bootstrap passes on x86_64 linux, currently
checking regtest. If clean, ok for trunk?
--
Aaron Sawdey, Ph.D. acsaw...@linux.vnet.ibm.com
050-2/C113 (507) 253-7520 home: 507/263-0782
IBM Linux Technology
On Wed, 2016-11-02 at 13:41 +0100, Bernd Schmidt wrote:
> On 10/27/2016 03:14 AM, Aaron Sawdey wrote:
> >
> > I'm currently working on a builtin expansion of strncmp for powerpc
> > similar to the one for memcmp I checked recently. One thing I
> > e
GCC 7 trunk was generating incorrect code for spec2k6 403.gcc due to
this constraint issue. OK for trunk after bootstrap/regtest passes?
2016-10-06 Aaron Sawdey <acsaw...@linux.vnet.ibm.com>
PR target/77934
* config/rs6000/vmx.md (vsx_concat_): The mtvsrdd instr
get_degrees(gfc_expr*)’:
../../gcc/gcc/fortran/iresolve.c:2728:14: error: ‘tmp’ was not declared in this
scope
and also this:
../../gcc/gcc/fortran/simplify.c: In function ‘void radians_f(__mpfr_struct*,
mpfr_rnd_t)’:
../../gcc/gcc/fortran/simplify.c:1775:5: error: ‘mpfr_fmod_d’ was not declared
in t
for trunk if no new regressions?
2016-11-17 Aaron Sawdey <acsaw...@linux.vnet.ibm.com>
* config/rs6000/rs6000-protos.h (expand_strn_compare): Declare.
* config/rs6000/rs6000.md (UNSPEC_CMPB): New unspec.
(cmpb3): pattern for generating cmpb.
(cmpstrnsi): p
strcmp-1.c test case to check strcmp expansion. Also both now have a
length 100 tests to check the transition from the inline comparison to
the library call for the remainder.
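
As a rough model of the behaviour those length-100 tests exercise (not the generated code, which compares whole words rather than single bytes), the inline-then-library split can be sketched in C. INLINE_LIMIT and the function name are assumptions for this sketch; 64 is the inline size mentioned for this patch.

```c
#include <stddef.h>
#include <string.h>

enum { INLINE_LIMIT = 64 };  /* bytes compared inline in this model */

/* Compare up to N bytes of A and B: the first INLINE_LIMIT bytes
   "inline", then hand the remainder to the library strncmp.  */
static int
strncmp_inline_then_call (const char *a, const char *b, size_t n)
{
  size_t inline_n = n < INLINE_LIMIT ? n : INLINE_LIMIT;

  for (size_t i = 0; i < inline_n; i++)
    {
      unsigned char ca = (unsigned char) a[i];
      unsigned char cb = (unsigned char) b[i];
      if (ca != cb)
        return ca < cb ? -1 : 1;
      if (ca == '\0')
        return 0;  /* both strings ended inside the inline part */
    }

  if (n <= INLINE_LIMIT)
    return 0;

  /* Transition point: the remainder goes to the library call.  */
  return strncmp (a + INLINE_LIMIT, b + INLINE_LIMIT, n - INLINE_LIMIT);
}
```

A length-100 input with a difference past byte 64 forces the second path, which is why that length is useful for checking the handoff.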
ChangeLog
2017-01-11 Aaron Sawdey <acsaw...@linux.vnet.ibm.com>
* config/rs6000/rs6000-protos.h (exp
Jeff,
Thanks for the review. Committed as 244177 with requested changes.
2017-01-06 Aaron Sawdey <acsaw...@linux.vnet.ibm.com>
* gcc.dg/memcmp-1.c: New.
* gcc.dg/strncmp-1.c: New.
Aaron
parties from targets that have a strncmp
builtin.
The tests pass on ppc64le and x86_64. OK for trunk?
--
Aaron Sawdey, Ph.D. acsaw...@linux.vnet.ibm.com
050-2/C113 (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain
Index: gcc/testsuite/gcc.dg/memcmp-1.c
nks,
Aaron
--
Aaron Sawdey, Ph.D. acsaw...@linux.vnet.ibm.com
050-2/C113 (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain
Index: gcc/config/rs6000/rs6000-protos.h
===
--- gcc/config/rs6000/rs6000-proto
/regtest on 64-bit LE and BE, and also BE 32-
bit. OK for trunk if everything passes?
2017-03-21 Aaron Sawdey <acsaw...@linux.vnet.ibm.com>
PR target/80123
* doc/md.texi (Constraints): Document wA constraint.
* config/rs6000/constraints.md (wA): New.
* config/
On Mon, 2017-03-20 at 11:11 -0500, Aaron Sawdey wrote:
> Test libgomp doacross2.f90 failed only at -O1 because an incorrect
> constraint on movsi_internal1 (for vspltisw) led to confusion between
> vsx and float registers (fix credit to Meissner). In subsequent
> discussion David Edel
for xxspltib -1 that is also fixed now.
Bootstrap/regtest reveals no errors on either power8 or power9. Ok for
trunk?
2017-03-20 Aaron Sawdey <acsaw...@linux.vnet.ibm.com>
PR target/80083
* config/rs6000/rs6000.md (*movsi_internal1): incorrect constraints
for alternati
(match_dup 2)))
+ (udiv:GPR (match_dup 1)
+ (match_dup 2)))
(set (match_dup 3)
(mult:GPR (match_dup 0)
(match_dup 2)))
2017-03-14 Aaron Sawdey <acsaw...@linux.vnet.ibm.com>
Backport from mainline
2017-02-28 Aaron Sawdey <
in progress passes?
2017-04-07 Aaron Sawdey <acsaw...@linux.vnet.ibm.com>
PR target/80358
* config/rs6000/rs6000.c (expand_block_compare): Fix boundary check.
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/
Hi Segher,
On Tue, 2017-06-27 at 18:35 -0500, Segher Boessenkool wrote:
> Hi Aaron,
>
> On Tue, Jun 27, 2017 at 11:43:57AM -0500, Aaron Sawdey wrote:
> > The function toc_relative_expr_p implicitly sets two static vars
> > (tocrel_base and tocrel_offset) that are
On Wed, 2017-06-28 at 18:19 -0500, Segher Boessenkool wrote:
> On Wed, Jun 28, 2017 at 03:21:49PM -0500, Aaron Sawdey wrote:
> > -toc_relative_expr_p (const_rtx op, bool strict)
> > +toc_relative_expr_p (const_rtx op, bool strict, const_rtx
> > *tocrel_base_ret,
> >
?
2017-06-22 Aaron Sawdey <acsaw...@linux.vnet.ibm.com>
* config/rs6000/rs6000-string.c (expand_block_clear,
do_load_for_compare, select_block_compare_mode,
compute_current_alignment, expand_block_compare,
expand_strncmp_align_check, expand_strn_c
the only thing
they are used for.
Bootstrap and regtest passes in trunk 249639 (to avoid the bootstrap
fail), ok for trunk?
2017-06-27 Aaron Sawdey <acsaw...@linux.vnet.ibm.com>
* config/rs6000/rs6000.c (toc_relative_expr_p): Make tocrel_base
and tocrel_offset be pointe
"
I don't see the problem on 252033.
Thanks,
Aaron
--
Aaron Sawdey, Ph.D. acsaw...@linux.vnet.ibm.com
050-2/C113 (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain
>
> Test Okay without any problem. Okay to commit?
>
> Regard,
> Renlin
>
>
> gcc/testsuite/ChangeLog:
>
> 2017-08-30 Renlin Li <renlin...@arm.com>
>
> * gcc.dg/memcmp-1.c (test_strncmp): Use strncpy instead of
> strcpy.
--
Aaron Sa
will require the change to canonicalize_condition I
posted before thanksgiving to prevent doloop from being confused by
bdnzt et al.
Bootstrap/regtest passes on ppc64le. OK for trunk?
2017-11-30 Aaron Sawdey <acsaw...@linux.vnet.ibm.com>
* config/rs6000/rs6000.md (cceq_ior_compare):
On Tue, 2017-11-21 at 11:45 -0600, Aaron Sawdey wrote:
> On Tue, 2017-11-21 at 10:06 -0700, Jeff Law wrote:
> > On 11/20/2017 06:41 AM, Aaron Sawdey wrote:
> > > On Sun, 2017-11-19 at 16:44 -0700, Jeff Law wrote:
> > > > On 11/15/2017 08:40 AM, Aaron Sawdey
On Thu, 2017-12-14 at 13:43 -0700, Jeff Law wrote:
> On 11/21/2017 10:45 AM, Aaron Sawdey wrote:
> > On Tue, 2017-11-21 at 10:06 -0700, Jeff Law wrote:
> > > On 11/20/2017 06:41 AM, Aaron Sawdey wrote:
> > > > On Sun, 2017-11-19 at 16:44 -0700, Jeff Law wrote:
>
On Tue, 2017-12-12 at 20:50 +0100, Jakub Jelinek wrote:
> On Tue, Dec 12, 2017 at 01:40:41PM -0600, Aaron Sawdey wrote:
> > 2017-12-12 Aaron Sawdey <acsaw...@linux.vnet.ibm.com>
> >
> > PR target/82190
> > * config/rs6000/rs6000-string.c (expand_block_
to the size of the load being done
regardless of how many bytes are being used.
OK for trunk if bootstrap/regtest passes on ppc64le?
2017-12-12 Aaron Sawdey <acsaw...@linux.vnet.ibm.com>
PR target/82190
* config/rs6000/rs6000-string.c (expand_bloc
This patch allows the use of unaligned vsx loads/stores for builtin
expansion of memset and memcmp on p8/p9. Performance of unaligned vsx
instructions is good on these processors.
OK for trunk if bootstrap/regtest on ppc64le passes?
2017-12-13 Aaron Sawdey <acsaw...@linux.vnet.ibm.
On Sun, 2017-11-19 at 16:44 -0700, Jeff Law wrote:
> On 11/15/2017 08:40 AM, Aaron Sawdey wrote:
> > So, the story of this very small patch starts with me adding
> > patterns
> > for ppc instructions bdz[tf] and bdnz[tf] such as this:
> >
> >
On Tue, 2017-11-21 at 10:06 -0700, Jeff Law wrote:
> On 11/20/2017 06:41 AM, Aaron Sawdey wrote:
> > On Sun, 2017-11-19 at 16:44 -0700, Jeff Law wrote:
> > > On 11/15/2017 08:40 AM, Aaron Sawdey wrote:
> > > > So, the story of this very small patch starts w
y to go about this please let me know and I'll
revise/retest.
Bootstrap and regtest pass on ppc64le and x86_64. Ok for trunk?
Thanks,
Aaron
2017-11-15 Aaron Sawdey <acsaw...@linux.vnet.ibm.com>
* rtlanal.c (canonicalize_condition): Return 0 if final rtx
does not have
iption.
(-mblock-compare-inline-loop-limit): New option.
--
Aaron Sawdey, Ph.D. acsaw...@linux.vnet.ibm.com
050-2/C113 (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain
Index: gcc/config/rs6000/rs6000-st
for trunk?
Thanks!
Aaron
2018-06-14 Aaron Sawdey
* config/rs6000/rs6000-string.c (select_block_compare_mode): Check
TARGET_EFFICIENT_OVERLAPPING_UNALIGNED here instead of in caller.
(do_and3, do_and3_mask, do_compb3, do_rotl3): New functions
runs show
the performance regression is fixed.
Regstrap passes on powerpc64le, ok for trunk and backport to 8?
Thanks,
Aaron
2018-06-25 Aaron Sawdey
* config/rs6000/rs6000-string.c (expand_block_clear): Don't use
unaligned vsx for 16B memset.
--
Aaron Sawdey, Ph.D
this
cleanup and addition to the patterns and splitters for the branch
decrement instructions as 256344.
2018-01-08 Aaron Sawdey <acsaw...@linux.vnet.ibm.com>
* config/rs6000/rs6000.md (cceq_ior_compare): Remove * so I can use it
to generate rtl.
(cceq_ior_compare_complement):
On Tue, 2017-12-12 at 10:13 -0600, Segher Boessenkool wrote:
> Please fix those trivialities, and it's okay for trunk (after the
> rtlanal patch is approved too). Thanks!
Here's the final version of this, which is committed as 256351.
2018-01-08 Aaron Sawdey <acsaw...@linux.vne
I'll check the runtime of that --- I added some test cases to memcmp-
1.c and probably it is now taking too long. I will revise it so it's no
longer than it was before.
Aaron
On Wed, 2018-01-10 at 14:25 +, Szabolcs Nagy wrote:
> On 08/01/18 19:37, Aaron Sawdey wrote:
> > On Tue
This brings it back not quite to where it was but a lot more reasonable
than what I put into 256351.
2018-01-10 Aaron Sawdey <acsaw...@linux.vnet.ibm.com>
* gcc.dg/memcmp-1.c: Reduce runtime to something reasonable.
OK for trunk?
Thanks,
Aaron
--
Aaron Sawdey, Ph.D.
ppc64le and x86_64, ok for trunk?
2018-01-29 Aaron Sawdey <acsaw...@linux.vnet.ibm.com>
* var-tracking.c (vt_add_function_parameter): Fix comparison of rtx.
Index: gcc/var-tracking.c
===
--- gcc/var-tracking.c (
ap, go tests run.
Segher is currently regtesting on ppc64le power9. OK for trunk if tests
pass?
2018-01-30 Aaron Sawdey <acsaw...@linux.vnet.ibm.com>
* config/rs6000/rs6000.c (rs6000_internal_arg_pointer ): Only return
a reg rtx.
--
Aaron Sawdey, Ph.D. acsaw...@linux.vnet.
> in var-tracking.c.
> rs6000/powerpcspe with -fsplit-stack are the only cases where
> crtl->args.internal_arg_pointer is not a REG, so just running libgo
> testsuite on powerpc{,64,64le} should cover it all.
I'll give this a try today when I get to the office.
Thanks,
Aaron
>
> Jakub
>
--
Aaron Sawdey, Ph.D. acsaw...@linux.vnet.ibm.com
050-2/C113 (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain
ttach a simple
> loop
> descriptor to a loop that is not a simple loop. But clearly you
> didn't
> introduce that oddball behavior.
Jeff,
Thanks for sticking with this and reviewing, I have re-checked that
regstrap still passes and committed as 256079.
Aaron
--
Aaron Sawdey, Ph.D.
gnificantly larger. */
- if (TARGET_ALTIVEC && bytes >= 16 && align >= 128)
+ if (TARGET_ALTIVEC && bytes >= 16 && (TARGET_EFFICIENT_UNALIGNED_VSX || align >= 128))
{
move_bytes = 16;
mode = V4SImode;
--
Aaro
and backport to 8?
Thanks,
Aaron
2018-06-19 Aaron Sawdey
* config/rs6000/rs6000-string.c (expand_strn_compare): Handle -m32
correctly.
--
Aaron Sawdey, Ph.D. acsaw...@linux.vnet.ibm.com
050-2/C113 (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC
Just teasing things apart a bit more in this function so I can add
vec/vsx code generation without making it enormous and
incomprehensible.
Bootstrap/regtest passes on powerpc64le, ok for trunk?
Thanks,
Aaron
2018-07-31 Aaron Sawdey
* config/rs6000/rs6000-string.c
) and ppc64le (power8 and
power9). Ok for trunk?
Thanks!
Aaron
2018-08-22 Aaron Sawdey
* config/rs6000/altivec.md (altivec_eq): Remove star.
* config/rs6000/rs6000-string.c (do_load_for_compare): Support
vector load modes.
(expand_strncmp_vec_sequence): New function
n increases register pressure so it
would be nice to be able to avoid causing issues as a result of that.
--
Aaron Sawdey, Ph.D. acsaw...@linux.vnet.ibm.com
050-2/C113 (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain
%
Bottom line is net improvement for CPU2017 int compared with either
current trunk, or disabling parallel reassociation. For CPU2017 fp,
very small overall degradation.
Currently doing regstrap on ppc64le, ok for trunk if results look good?
Thanks!
Aaron
2018-03-12 Aaron Sawdey <ac
This updates invoke.texi to document -mblock-compare-inline-limit,
-mblock-compare-inline-loop-limit, and -mstring-compare-inline-limit.
Tested with "make pdf", ok for trunk?
2018-04-10 Aaron Sawdey <acsaw...@linux.ibm.com>
PR target/85321
* doc/invo