Re: [PATCH 2/4] The main m68k cc0 conversion

2019-11-28 Thread Bernd Schmidt
On 11/28/19 8:53 PM, Gunther Nikl wrote: > Bernd Schmidt: >> On 11/23/19 9:53 PM, Bernd Schmidt wrote: >> move.w %a4,%d0 >> - tst.b %d0 >> - jeq .L352 >> + jeq .L353 >> >> And the reason - that's a movqi using move.w. > >
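The snippet above turns on the width of the flag-setting operation. As a minimal sketch (a plain C model, not GCC or m68k backend code; both helper names are invented for illustration), the Z flag left behind by a 16-bit move.w and the Z flag that tst.b would compute disagree whenever the low byte is zero but the upper byte is not, which is why dropping the tst.b after a QImode move done with move.w can change the branch:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: Z flag as a 16-bit move.w into a data register
   would leave it, i.e. set only when the whole word is zero.  */
static bool z_after_move_w(uint16_t value)
{
    return value == 0;
}

/* Hypothetical model: Z flag as tst.b would compute it, i.e. set when
   the low 8 bits are zero, whatever the upper byte holds.  */
static bool z_after_tst_b(uint16_t value)
{
    return (uint8_t)value == 0;
}
```

For a register holding 0x0100 the two models disagree, so a jeq that relied on the flags from move.w would not be taken where the original tst.b/jeq pair would have branched.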

Re: [PATCH 2/4] The main m68k cc0 conversion

2019-11-25 Thread Bernd Schmidt
On 11/26/19 3:21 AM, Joseph Myers wrote: > > The soft-float ColdFire build (--with-arch=cf --with-cpu=54455 > --disable-multilib) successfully built libgcc and glibc, but ran into an > ICE building the glibc tests. Again, I've not bisected but this commit > seems likely to be responsible.

Re: [PATCH 2/4] The main m68k cc0 conversion

2019-11-25 Thread Bernd Schmidt
On 11/26/19 1:36 AM, Joseph Myers wrote: > I'm seeing a libgcc build failure for coldfire in my build-many-glibcs.py > bot (m68k-linux-gnu configured --with-arch=cf --disable-multilib). That's > building _mulsc3.o; I get assembler errors: I overlooked a difference in the 68881 vs coldfire

Autoinc vs reload and LRA

2019-11-25 Thread Bernd Schmidt
So I was curious what would happen if I turned on LRA for m68k. It turns out my autoinc patches from the cc0 patch set expose a bug in how LRA handles autoincrement. While it copies the logic from reload's inc_for_reload, it appears to be missing the find_reloads_address code to ensure an autoinc

Re: [PATCH 2/4] The main m68k cc0 conversion

2019-11-25 Thread Bernd Schmidt
On 11/25/19 1:38 PM, Tobias Burnus wrote: > Thanks for the m68k work! Can you also update > https://gcc.gnu.org/backends.html ? Committed as obvious. Bernd commit f42834ad5e77c05cb6bc0908b8fc9282fec7fc19 Author: Bernd Schmidt Date: Mon Nov 25 13:48:08 2019 +0100 Change backends

Re: [PATCH 2/4] The main m68k cc0 conversion

2019-11-25 Thread Bernd Schmidt
On 11/25/19 1:34 PM, John Paul Adrian Glaubitz wrote: > Are all 4 + 2 patches in now? Thus, can we close the bug? We're missing one piece for better autoinc generation, but that's a small optimization issue. The cc0 conversion is complete. Bernd

Re: [PATCH 2/4] The main m68k cc0 conversion

2019-11-25 Thread Bernd Schmidt
On 11/23/19 6:36 PM, Jeff Law wrote: > Not really. I've already indicated to Bernd that he should go ahead and > commit the changes and we can iterate on any problems that arise. After the last fix, I did some more testing and since I feel confident that it really is in good shape now, I

Re: [PATCH 2/4] The main m68k cc0 conversion

2019-11-25 Thread Bernd Schmidt
On 11/25/19 12:26 PM, Andreas Schwab wrote: > On Nov 24 2019, Bernd Schmidt wrote: > >> Whew, I think I have it. One tst instruction eliminated when it >> shouldn't have been: >> >> move.w %a4,%d0 >> - tst.b %d0 >> - jeq .L352 >>

Re: [PATCH 4/4] Fix autoinc cbranch

2019-11-24 Thread Bernd Schmidt
On 11/24/19 8:43 PM, Segher Boessenkool wrote: > But. Allowing autoinc into jump insns means those jump insns may then > eventually need an output reload; it may just have been because of that? That's almost certainly the reasoning, but as I pointed out in my original mail - reload is careful

Re: [PATCH 4/4] Fix autoinc cbranch

2019-11-24 Thread Bernd Schmidt
On 11/19/19 1:27 AM, Segher Boessenkool wrote: > The combine parts are okay for trunk, if you keep an eye out :-) Thanks, now committed. That leaves the auto-inc-dec part. Since we're being adventurous, I've also bootstrapped and tested the following in the meantime (on the gcc135 machine). This

Re: [PATCH ix86] Fix rtx_costs for flag-setting adds

2019-11-24 Thread Bernd Schmidt
y) @@ -1,3 +1,8 @@ +2019-11-24 Bernd Schmidt + + * config/i386/i386.c (ix86_rtx_costs): Handle care of a PLUS in a + COMPARE, representing an overflow detection. + 2019-11-23 Jan Hubicka * cif-code.def (MAX_INLINE_INSNS_SINGLE_O2_LIMIT): Remove. Index: gcc/con

Re: [PATCH 2/4] The main m68k cc0 conversion

2019-11-23 Thread Bernd Schmidt
On 11/23/19 9:53 PM, Bernd Schmidt wrote: > I'll spend a few more days trying to see if I can do something about the > bootstrap failure Mikael saw (currently trying to do a two-stage cross > build rather than a really slow bootstrap). Whew, I think I have it. One tst instruction elimin

Re: [PATCH 2/4] The main m68k cc0 conversion

2019-11-23 Thread Bernd Schmidt
On 11/23/19 6:36 PM, Jeff Law wrote: > Not really. I've already indicated to Bernd that he should go ahead and > commit the changes and we can iterate on any problems that arise. In the meantime I've made an aranym setup in addition to the qemu setup I had, and I've not been able to reproduce

Re: [PATCH ix86] Fix rtx_costs for flag-setting adds

2019-11-22 Thread Bernd Schmidt
On 11/22/19 3:04 PM, Uros Bizjak wrote: > On Fri, Nov 22, 2019 at 1:58 PM Bernd Schmidt wrote: >> >> A patch I posted recently fixes combine to take costs of JUMP_INSNs into >> account. That causes the pr30315 test to fail with -m32, since the cost >> of an add that

[PATCH ix86] Fix rtx_costs for flag-setting adds

2019-11-22 Thread Bernd Schmidt
A patch I posted recently fixes combine to take costs of JUMP_INSNs into account. That causes the pr30315 test to fail with -m32, since the cost of an add that sets the flags is estimated too high. The following seems to fix it. Bootstrapped and tested on x86_64-linux, ok? Bernd *

Re: [PATCH 3/4] Set costs for jumps in combine

2019-11-21 Thread Bernd Schmidt
On 11/22/19 1:42 AM, Segher Boessenkool wrote: > On Thu, Nov 21, 2019 at 02:36:53PM +0100, Bernd Schmidt wrote: >> Thanks. Just FYI, this is held up a little. I decided I'd also test on >> x86, and there it shows a case where ix86_rtx_cost misses something: the >> i386/pr3

Re: [PATCH 0/4] Eliminate cc0 from m68k

2019-11-21 Thread Bernd Schmidt
On 11/21/19 1:30 PM, Matthias Klose wrote: > > that would be apt build-dep gcc-9. The former would only install the build > dependencies of the gcc-defaults package. That gets me E: You must put some 'source' URIs in your sources.list where /etc/apt/sources.list looks like deb

Re: [PATCH 3/4] Set costs for jumps in combine

2019-11-21 Thread Bernd Schmidt
On 11/13/19 5:16 PM, Segher Boessenkool wrote: > On Wed, Nov 13, 2019 at 02:13:48PM +0100, Bernd Schmidt wrote: >> Also, it does not compute costs for jump >> insns, so they are always set to zero. As a consequence, any possible >> substitution is performed if a combination in

Re: [PATCH 0/4] Eliminate cc0 from m68k

2019-11-20 Thread Bernd Schmidt
On 11/20/19 8:27 PM, Mikael Pettersson wrote: > On Wed, Nov 20, 2019 at 3:16 PM Bernd Schmidt wrote: >> Probably best to just run tests on stage1 and hope something shows up. > > Ok, how do I do that? I've always just done 'make -k check' after > full bootstraps. >

Re: [PATCH 0/4] Eliminate cc0 from m68k

2019-11-20 Thread Bernd Schmidt
On 11/20/19 2:50 PM, Mikael Pettersson wrote: > On Mon, Nov 18, 2019 at 9:57 PM Mikael Pettersson > wrote: >> >> On Mon, Nov 18, 2019 at 8:31 PM Bernd Schmidt wrote: >>> >>> Hi Mikael, >>> >>>> This fixed the problem, thanks. >>>

Re: [PATCH 0/4] Eliminate cc0 from m68k

2019-11-18 Thread Bernd Schmidt
Hi Mikael, > This fixed the problem, thanks. Could you also run the testsuite to see if you can reproduce the g++.old-deja failures Andreas reported? Bernd

Re: [PATCH 2/4] The main m68k cc0 conversion

2019-11-18 Thread Bernd Schmidt
(Apologies to Jeff who's getting this twice because I didn't hit reply-all the first time.) On 11/17/19 6:56 PM, Jeff Law wrote: > While scanning this patch I did notice the introduction of > CC_STATUS_INIT in output_{and,ior,xor}si. You might want to check that. That is intentional.

Re: [PATCH 0/4] Eliminate cc0 from m68k

2019-11-16 Thread Bernd Schmidt
On 11/16/19 9:18 AM, Andreas Schwab wrote: > On Nov 16 2019, Bernd Schmidt wrote: > >> Well, there has to be some difference between what you are doing and >> what I am doing, because: >> >> Running /local/src/egcs/git/gcc/testsuite/g++.old-deja/old-deja.exp ... >

Re: [PATCH 0/4] Eliminate cc0 from m68k

2019-11-15 Thread Bernd Schmidt
On 11/15/19 11:50 PM, Andreas Schwab wrote: > On Nov 15 2019, Bernd Schmidt wrote: > >> I meant the compiler command line of course... for any -mcpu flags that >> might differ from my test run. > > There are none. Well, there has to be some difference between what you

Re: [PATCH 0/4] Eliminate cc0 from m68k

2019-11-15 Thread Bernd Schmidt
On 11/15/19 10:58 PM, Andreas Schwab wrote: > On Nov 15 2019, Bernd Schmidt wrote: > >> Any chance you could show the command lines from the log files or some >> other way of reproducing the issue? > > Executing on aranym: OMP_NUM_THREADS=2 > LD_LIBRARY_PATH=.:/da

Re: [PATCH 0/4] Eliminate cc0 from m68k

2019-11-15 Thread Bernd Schmidt
On 11/15/19 5:34 PM, Andreas Schwab wrote: > On Nov 15 2019, Bernd Schmidt wrote: > >> Are these with the patch? > > Yes. > >> Are you on real hardware > > No, I'm using aranym. Any chance you could show the command lines from the log files or some other way of reproducing the issue? Bernd

Re: [PATCH 0/4] Eliminate cc0 from m68k

2019-11-15 Thread Bernd Schmidt
On 11/15/19 2:48 PM, Andreas Schwab wrote: > Here are the results of running the testsuite on m68k-linux: > > http://gcc.gnu.org/ml/gcc-testresults/2019-11/msg00908.html > > This is a list of regressions: Are these with the patch? I'm not seeing any of these in my testing with qemu. Are you on

Re: [PATCH 1/4] Preliminary m68k patches

2019-11-14 Thread Bernd Schmidt
On 11/13/19 9:03 PM, Jeff Law wrote: > OK. I'd actually recommend this go ahead and get installed. My tester > will bootstrap it overnight. Alright, let me know how that turns out. What kind of machine do you have for that? Bernd

Re: [PATCH 0/4] Eliminate cc0 from m68k

2019-11-13 Thread Bernd Schmidt
On 11/13/19 7:16 PM, Segher Boessenkool wrote: > I tried this out with a kernel build (just the defconfig). > during RTL pass: jump2 > /home/segher/src/kernel/fs/binfmt_elf.c: In function 'elf_core_dump': > /home/segher/src/kernel/fs/binfmt_elf.c:2409:1: internal compiler error: in >

[PATCH 4/4] Fix autoinc cbranch

2019-11-13 Thread Bernd Schmidt
After the m68k cc0 conversion, there is one code quality regression that I can see: we no longer generate autoinc addressing modes in comparisons. This is because the parts of the compiler that generate autoinc are unwilling to substitute into jumps. If you look at the code in reload, you'll see

[PATCH 3/4] Set costs for jumps in combine

2019-11-13 Thread Bernd Schmidt
The combiner is somewhat strange about how it uses costs. If any of the insns involved in a comparison have a cost of 0, it does not verify that the substitution is cheaper. Also, it does not compute costs for jump insns, so they are always set to zero. As a consequence, any possible substitution
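The cost behaviour described above can be sketched as a small C model. This is a simplified illustration under stated assumptions, not the code in combine.c: the function name, the array-of-costs interface, and the "accept when not more expensive" comparison are all hypothetical stand-ins for the real logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the check described above.  A cost of 0 means
   "unknown"; if any original insn has unknown cost, the profitability
   test is skipped and the substitution is accepted unconditionally.  */
static bool substitution_accepted(const int *old_costs, size_t n_old,
                                  int new_cost)
{
    int old_total = 0;

    assert(old_costs != NULL && n_old > 0);
    for (size_t i = 0; i < n_old; i++) {
        if (old_costs[i] == 0)
            return true;          /* unknown cost: no verification done */
        old_total += old_costs[i];
    }
    /* All costs known: only accept if the result is not more expensive.  */
    return new_cost <= old_total;
}
```

In this model a jump insn whose cost was never computed contributes a 0, so any combination involving it passes regardless of how expensive the replacement is; once the jump has a real cost, the comparison in the last line takes effect.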

[PATCH 2/4] The main m68k cc0 conversion

2019-11-13 Thread Bernd Schmidt
This achieves the conversion by using combined cbranch/cstore patterns, and using a mechanism similar to the cc_status tracking to elide certain comparisons. Unlike cc_status, this is opt-in and requires a flags_valid attribute to be set for suitable instructions. Due to lack of test hardware,

Re: [PATCH 1/4] Preliminary m68k patches

2019-11-13 Thread Bernd Schmidt
This tidies up a few spots in the m68k backend in preparation for the large patch to follow. This is purely for review purposes: this patch has not been tested independently, and will be committed together with the following one. Noteworthy changes: Some patterns and peepholes were unified

[PATCH 0/4] Eliminate cc0 from m68k

2019-11-13 Thread Bernd Schmidt
This is a set of patches to convert m68k so that it no longer uses cc0. The approach is to combine cc0 setter/user pairs into cbranch and cstore patterns. It does not expose the flag register directly. Since m68k is a target that is not under active development, and probably receives very limited

Re: [RFC, vectorizer] Allow single element vector types for vector reduction operations

2017-09-07 Thread Bernd Schmidt
On 08/27/2017 09:36 PM, Jon Beniston wrote: > I have an out-of-tree GCC port and it is struggling supporting > auto-vectorization on some dot product instructions. For example, I have an > instruction that takes three operands which are all 32-bit general > registers. The second and third

Re: MAINTAINERS update

2017-07-03 Thread Bernd Schmidt
On 06/11/2017 08:03 PM, Gerald Pfeifer wrote: > On Tue, 30 May 2017, Bernd Schmidt wrote: >> On 05/30/2017 09:05 AM, Richard Biener wrote: >>> This leaves the nvptx and c6x ports without a maintainer. Do >>> you have any recommendations for a successor here? >>

Re: MAINTAINERS update

2017-05-30 Thread Bernd Schmidt
On 05/30/2017 09:05 AM, Richard Biener wrote: > This leaves the nvptx and c6x ports without a maintainer. Do you have > any recommendations for a successor here? Not really. It would be a shame to lose the C6X port though. If I'm CC'd on any bug reports I'm prepared to keep it working - if

Re: MAINTAINERS update

2017-05-29 Thread Bernd Schmidt
On 05/27/2017 12:52 PM, Bernd Schmidt wrote: I am no longer working for Red Hat, so I've updated my email address. Also, I don't expect to be around very much in the near future, so I've removed myself as maintainer for some areas. Judging by a reply I got, I may have been too terse. No need

MAINTAINERS update

2017-05-27 Thread Bernd Schmidt
(revision 248535) +++ ChangeLog (working copy) @@ -1,3 +1,8 @@ +2017-05-27 Bernd Schmidt <bschm...@redhat.com> + + * MAINTAINERS: Update my email address, and remove myself as + maintainer in some areas. + 2017-05-25 Eric Gallager <eg...@gwmail.gwu.edu> * MAINTAINERS: Add self to

PR78972, 80283: Extend TER with scheduling

2017-05-12 Thread Bernd Schmidt
If you look at certain testcases like the one for PR78972, you'll find that the code generated by TER is maximally pessimal in terms of register pressure: we can generate a large number of intermediate results, and defer all the statements that use them up. Another observation one can make is

Re: [PATCH 1/5] nvptx: implement SIMT enter/exit insns

2017-03-27 Thread Bernd Schmidt
On 03/27/2017 12:56 PM, Alexander Monakov wrote: Hello Bernd, Can you have a look at this patch (unchanged from previous posting in January)? The rest of the patches in the set are reviewed. On Wed, 22 Mar 2017, Alexander Monakov wrote: This patch adds handling of new

LRA fix for 80160

2017-03-24 Thread Bernd Schmidt
(revision 246472) +++ gcc/ChangeLog (working copy) @@ -1,3 +1,10 @@ +2017-03-25 Bernd Schmidt <bschm...@redhat.com> + + PR rtl-optimization/80160 + PR rtl-optimization/80159 + * lra-assigns.c (must_not_spill_p): Tighten new test to also take + reg_alternate_class into account. + 2017-03-24 Vl

Re: [PATCH] Decrease compile time memory with heavy find_base_{value,term} on i?86/x86_64 (PR rtl-optimization/63191, take 2)

2017-03-22 Thread Bernd Schmidt
On 03/22/2017 04:38 PM, Uros Bizjak wrote: LGTM, but I don't want to step on Bernd's toes, so let's wait for his opinion. I was waiting for yours really, that's the one that counts. Bernd

Re: [PATCH] Fix tree-prof/pr66295.c

2017-03-16 Thread Bernd Schmidt
On 03/15/2017 09:59 PM, Segher Boessenkool wrote: This testcase can only ever be built on x86 (it needs the "avx*" attributes). This patch skips the test elsewhere. Is this okay for trunk? Ok. Bernd

Re: [PATCH] Remove dead stores and initializations

2017-03-16 Thread Bernd Schmidt
On 03/16/2017 01:31 PM, Markus Trippelsdorf wrote: clang --analyze pointed out a number of dead stores and initializations. Tested on ppc64le. Ok for trunk? I'd say - not now. Ideally someone would delve into the commit history to figure out what happened with each of these, and whether any

Document PR79806 as a non-bug

2017-03-15 Thread Bernd Schmidt
I suggest we apply the following and close the PR as INVALID (not a bug). Ok? Bernd Index: pr65693.c === --- pr65693.c (revision 245685) +++ pr65693.c (working copy) @@ -2,6 +2,11 @@ /* { dg-do compile } */ /* { dg-options

Re: Combiner fix for PR79910

2017-03-15 Thread Bernd Schmidt
On 03/15/2017 04:00 PM, Bernd Schmidt wrote: On 03/15/2017 12:09 AM, Bernd Schmidt wrote: I'll retest with your suggestion and with the bitmap creation conditional on i1 being nonnull. Like this (also had to throw in a bitmap_empty_p). Retested as before. Ok? Oops, that one also has

Re: Combiner fix for PR79910

2017-03-15 Thread Bernd Schmidt
On 03/15/2017 12:09 AM, Bernd Schmidt wrote: I'll retest with your suggestion and with the bitmap creation conditional on i1 being nonnull. Like this (also had to throw in a bitmap_empty_p). Retested as before. Ok? Bernd Index: gcc/combine.c

Fix C6X hwloop issue

2017-03-15 Thread Bernd Schmidt
This fixes a failure in the testsuite. When we transform the doloop pattern, the decrement of the old iteration register goes away, which is a problem if it's used after the loop (where it should have the value zero). Committed. Bernd * config/c6x/c6x.c (predicate_insn): Avoid rtl sharing

Reload fix for an old aarch64 issue

2017-03-14 Thread Bernd Schmidt
This triggered a kernel miscompilation with an old (4.8 I think) aarch64 toolchain. Here's the reloads for the insn where things go wrong: Reloads for insn # 210 Reload 0: reload_in (DI) = (reg/v/f:DI 80 [ pgdata ]) GENERAL_REGS, RELOAD_FOR_INPUT_ADDRESS (opnum = 0)

Re: Combiner fix for PR79910

2017-03-14 Thread Bernd Schmidt
On 03/15/2017 12:03 AM, Jeff Law wrote: On 03/10/2017 04:24 PM, Bernd Schmidt wrote: PR rtl-optimization/79910 * combine.c (record_used_regs): New static function. (try_combine): Handle situations where there is an additional instruction between I2 and I3 which needs to have

Combiner fix for PR79910

2017-03-10 Thread Bernd Schmidt
In this PR, we have a few insns involved in two instruction combinations: insn 16: set r100 insn 27: some calculation insn 28: some calculation insn 32: using r100 insn 33: using r100 insn 35: some calculation Then we combine insns 27, 28 and 33, producing two output insns. As a result, insn

Re: Fix IRA issue, PR79728

2017-03-10 Thread Bernd Schmidt
Ping (minus the require-effective-target line, as Uros pointed out). Bernd On 03/03/2017 02:51 PM, Bernd Schmidt wrote: This is an ICE where setup_pressure_classes fails if xmm0 is a global reg. Instead of GENERAL/FLOAT/SSE/MMX_REGS, it computes only SSE_FIRST_REG as the third register class

Re: [PATCH] Fix out-of-bounds write in RTL function reader (PR bootstrap/79952)

2017-03-10 Thread Bernd Schmidt
On 03/10/2017 08:03 PM, David Malcolm wrote: print-rtl.c:rtx_writer::print_rtx_operand_code_0 has some special -casing for SYMBOL_REF, but if I'm reading things right we don't yet dump SYMBOL_REF_BLOCK and SYMBOL_REF_BLOCK_OFFSET, so we'd need to dump these somehow. Yeah. Perhaps as an extra

Re: [PATCH] Decrease compile time memory with heavy find_base_{value,term} on i?86/x86_64 (PR rtl-optimization/63191)

2017-03-10 Thread Bernd Schmidt
On 03/10/2017 06:53 PM, Jakub Jelinek wrote: + +template +static inline rtx +ix86_delegitimize_address_tmpl (rtx x) { Why is this a template and not a function arg? Bernd

Re: [PATCH] Fix out-of-bounds write in RTL function reader (PR bootstrap/79952)

2017-03-09 Thread Bernd Schmidt
On 03/09/2017 08:28 PM, David Malcolm wrote: The root cause is an out-of-bounds memory write in the RTL dump reader when handling SYMBOL_REFs with SYMBOL_FLAG_HAS_BLOCK_INFO set. Such SYMBOL_REFs are normally created by varasm.c:create_block_symbol, which has: Hmm, I don't actually recall

Re: C PATCH to fix c/79758 (ICE-on-invalid with function redefinition and old style decls)

2017-03-03 Thread Bernd Schmidt
On 03/03/2017 02:33 PM, Marek Polacek wrote: 2017-03-03 Marek Polacek PR c/79758 * c-decl.c (store_parm_decls_oldstyle): Check if the element of current_function_prototype_arg_types is error_mark_node. Fix formatting. Use TREE_VALUE

Fix IRA issue, PR79728

2017-03-03 Thread Bernd Schmidt
This is an ICE where setup_pressure_classes fails if xmm0 is a global reg. Instead of GENERAL/FLOAT/SSE/MMX_REGS, it computes only SSE_FIRST_REG as the third register class. The problem is that the costs for moving between SSE_FIRST_REG and SSE_REGS are inflated because we think we have no

Fix (work around) LRA infinite loop, PR78911

2017-03-03 Thread Bernd Schmidt
In this PR, we have an endless cycle in LRA, generating ever more instructions. The relevant portions of the dump seem to be these: ** Local #9: ** Creating newreg=130 from oldreg=128, assigning class CREG to r130 18:

Re: GCSE: Use HOST_WIDE_INT instead of int (PR, rtl-optimization/79574).

2017-03-02 Thread Bernd Schmidt
On 03/02/2017 06:50 PM, Martin Liška wrote: Hello. This is second part of fixes needed to not trigger integer overflow in gcse pass. So, how is this intended to work? The min/max stored in the param is an int, and by using a HOST_WIDE_INT here, we expect that it is a larger type and

Re: C PATCH to fix c/79758 (ICE-on-invalid with function redefinition and old style decls)

2017-03-02 Thread Bernd Schmidt
On 03/02/2017 06:35 PM, Marek Polacek wrote: While at it, I fixed wrong formatting in the nearby code. Also use NULL_TREE instead of 0 where appropriate. I really dislike those zeros-as-trees; one day I'll just go and turn them into NULL_TREEs. I sympathize, but it makes it harder to see

Re: [PATCH] Fix ICE with multiple conditional traps turned into unconditional in one bb (PR rtl-optimization/79780)

2017-03-02 Thread Bernd Schmidt
On 03/02/2017 10:23 AM, Jakub Jelinek wrote: 2017-03-02 Jakub Jelinek PR rtl-optimization/79780 * cprop.c (one_cprop_pass): When second and further conditional trap in a single basic block is turned into an unconditional trap, turn it into a

Re: [PATCH] Improve ifcvt (PR tree-optimization/79389)

2017-02-23 Thread Bernd Schmidt
On 02/23/2017 10:27 PM, Jakub Jelinek wrote: Now successfully bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? LGTM. Bernd

Re: [PATCH] Improve ifcvt (PR tree-optimization/79389)

2017-02-23 Thread Bernd Schmidt
On 02/23/2017 02:36 PM, Jakub Jelinek wrote: and both UNLT and GE can be reversed. But if the arguments of the condition are canonicalized, we run into: /* Test for an integer condition, or a floating-point comparison in which NaNs can be ignored. */ if (CONST_INT_P (arg0) ||

Re: [PATCH] Improve ifcvt (PR tree-optimization/79389)

2017-02-23 Thread Bernd Schmidt
On 02/23/2017 12:46 PM, Jakub Jelinek wrote: But as soon as we only have the (unlt (reg:DF 100) (reg:DF 97)), reversed_comparison_code fails on it: case UNLT: case UNLE: case UNGT: case UNGE: /* We don't have safe way to reverse these yet. */ return UNKNOWN; I do

Re: [PATCH 1/6] c6x: Fix for RTL checking

2017-02-21 Thread Bernd Schmidt
On 02/21/2017 03:48 PM, Segher Boessenkool wrote: 2017-02-21 Segher Boessenkool * config/c6x/c6x.c (predicate_insn): Do not incorrectly share RTL. Ok, thanks. Bernd

Re: fwprop fix for PR79405

2017-02-20 Thread Bernd Schmidt
On 02/17/2017 10:11 AM, Richard Biener wrote: Index: gcc/fwprop.c === --- gcc/fwprop.c (revision 245501) +++ gcc/fwprop.c (working copy) @@ -1478,7 +1478,8 @@ fwprop (void) Do not forward propagate addresses into

Re: Improving code generation in the nvptx back end

2017-02-20 Thread Bernd Schmidt
On 02/17/2017 02:09 PM, Thomas Schwinge wrote: Hi! On Fri, 17 Feb 2017 14:00:09 +0100, I wrote: [...] for "normal" functions there is no reason to use the ".param" space for passing arguments in and out of functions. We can then get rid of the boilerplate code to move ".param %in_ar*" into

fwprop fix for PR79405

2017-02-16 Thread Bernd Schmidt
We have two registers being assigned to each other: (set (reg 213) (reg 209)) (set (reg 209) (reg 213)) These being the only definitions, we are happy to forward propagate reg 209 for reg 213 into a third insn, making a new use for reg 209. We are then happy to forward propagate reg 213 for
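The ping-pong described above can be modelled in a few lines of C. This is only a sketch under assumptions: it is not fwprop.c, the table `copy_src` and both function names are hypothetical, and it assumes that each propagation step simply replaces a used register by the source of that register's single copy definition.

```c
#include <assert.h>
#include <stdbool.h>

#define NREGS 256
#define NO_COPY (-1)

/* Mark every register as not being defined by a plain register copy.  */
static void init_no_copies(int copy_src[NREGS])
{
    for (int r = 0; r < NREGS; r++)
        copy_src[r] = NO_COPY;
}

/* Hypothetical model: repeatedly replace the used register by the source
   of its only definition.  Returns true if a register with no copy
   definition is reached within max_steps, false if we are still
   substituting when the step budget runs out.  */
static bool propagation_settles(const int copy_src[NREGS], int use,
                                int max_steps, int *final_reg)
{
    assert(use >= 0 && use < NREGS);
    for (int step = 0; step < max_steps; step++) {
        int src = copy_src[use];
        if (src == NO_COPY) {
            *final_reg = use;
            return true;
        }
        use = src;
    }
    *final_reg = use;
    return false;
}
```

With only `copy_src[213] = 209` the use settles on reg 209 after one step. Adding `copy_src[209] = 213`, the mutual assignment from the snippet, makes the model alternate between the two registers until the budget is exhausted.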

Re: C PATCH to fix ICE with -Wdouble-promotion (PR c/79515)

2017-02-15 Thread Bernd Schmidt
On 02/15/2017 12:49 PM, Marek Polacek wrote: We ICEd on this testcase in do_warn_double_promotion because an invalid conversion had produced an error result type and accessing that via TYPE_MAIN_VARIANT crashes. Fixed in an obvious way. Bootstrapped/regtested on x86_64-linux, ok for trunk?

Re: [PATCH] Replace XALLOCAVEC with XCNEWVEC (PR c/79471).

2017-02-13 Thread Bernd Schmidt
On 02/13/2017 02:06 PM, Martin Liška wrote: On 02/13/2017 01:58 PM, Bernd Schmidt wrote: On 02/13/2017 11:15 AM, Martin Liška wrote: In order to not cause a stack overflow, let's use a vector allocated on heap instead of the one created by XALLOCAVEC. Patch can bootstrap on ppc64le-redhat

Re: [PATCH] Invalidate combiner's cached last value upon insn removal (PR rtl-optimization/79388, PR rtl-optimization/79450)

2017-02-13 Thread Bernd Schmidt
On 02/10/2017 08:50 PM, Jakub Jelinek wrote: 2017-02-10 Jakub Jelinek PR rtl-optimization/79388 PR rtl-optimization/79450 * combine.c (distribute_notes): When removing TEM_INSN for which corresponding dest has last value recorded, invalidate

Re: [PATCH] Replace XALLOCAVEC with XCNEWVEC (PR c/79471).

2017-02-13 Thread Bernd Schmidt
On 02/13/2017 11:15 AM, Martin Liška wrote: In order to not cause a stack overflow, let's use a vector allocated on heap instead of the one created by XALLOCAVEC. Patch can bootstrap on ppc64le-redhat-linux and survives regression tests. Ok. I'm surprised this is marked as a regression, but

Re: [patch] Fix PR middle-end/78468

2017-01-27 Thread Bernd Schmidt
On 01/27/2017 01:02 PM, Eric Botcazou wrote: The attached patch is a middle ground between the previously working and currently broken situations: if the back-end defines STACK_DYNAMIC_OFFSET, then the middle-end assumes that STACK_DYNAMIC_OFFSET maintains the alignment; if it doesn't, which

One more cprop trap_if fix, PR79194

2017-01-27 Thread Bernd Schmidt
This PR seems to be curable by fixing up the CFG a little earlier. Bootstrapped and tested on x86_64-linux, and it seems to cure the testcase with a ppc cross. I'd appreciate it if someone ran full ppc tests with this though. Ok? Bernd PR rtl-optimization/79194 * cprop.c (one_cprop_pass):

Re: [RFC] sched: Do not move expensive insns speculatively (PR68664)

2017-01-27 Thread Bernd Schmidt
On 01/27/2017 02:19 AM, Segher Boessenkool wrote: But what is "insn cost"? Latency is no good at all -- we *want* insns with higher latency to be earlier. fsqrt is not pipelined, and that is what makes it so costly. (This isn't modeled in the scheduling description btw: that would make the

Re: Improve things for PR71724, in combine/if_then_else_cond

2017-01-26 Thread Bernd Schmidt
On 01/25/2017 08:46 PM, Segher Boessenkool wrote: It turns out my patch (see the PR) causes (or at least triggers) miscompilations on tilegx. I will drop it for now. Curious, it looked very reasonable. What's needed to reproduce this? Bernd

Re: Improve things for PR71724, in combine/if_then_else_cond

2017-01-25 Thread Bernd Schmidt
On 01/25/2017 10:18 AM, Kyrill Tkachov wrote: The test is supposed to test the generation of the vsel instruction. I believe adding an -mcpu=cortex-a57 to the testcases would be best, as VSEL isn't actually available on Cortex-A5, it's just enabled by the -mfpu=fp-armv8 option. A more realistic

Re: Improve things for PR71724, in combine/if_then_else_cond

2017-01-24 Thread Bernd Schmidt
On 01/24/2017 06:03 PM, Christophe Lyon wrote: Ha... the regression occurred between r 244818 and r 244816, and I read r 244816 ChangeLog too quickly and did not notice it was modifying ifcvt.c in addition to x86-only files. So it's likely that it's your other patch for pr78634 that caused the

Re: Improve things for PR71724, in combine/if_then_else_cond

2017-01-24 Thread Bernd Schmidt
On 01/24/2017 05:50 PM, Kyrill Tkachov wrote: Actually trying it out with an explicit -mcpu=cortex-a5 (so -O2 -S -mfpu=fp-armv8 -mcpu=cortex-a57 -mfloat-abi=hard) I get the test failing before and after the patch. The code generated is vcmp.f64 d0, d1 vmrs APSR_nzcv,

Re: Improve things for PR71724, in combine/if_then_else_cond

2017-01-24 Thread Bernd Schmidt
On 01/24/2017 05:30 PM, Kyrill Tkachov wrote: The -mfpu is overridden in the testcase to add the ARMv8 instructions. So to reproduce the compilation in that testcase you'd want -mfpu=fp-armv8 or something equivalent rather than vfpv3-d16-fp16. Exact steps please. No one who's not well-versed

Re: Improve things for PR71724, in combine/if_then_else_cond

2017-01-24 Thread Bernd Schmidt
On 01/24/2017 09:38 AM, Christophe Lyon wrote: It seems that Bernd's patch causes regressions on arm-linux-gnueabihf --with-cpu=cortex-a5 --with-fpu=vfpv3-d16-fp16: gcc.target/arm/vselvcdf.c scan-assembler-times vselvs.f64\td[0-9]+ 1 gcc.target/arm/vselvcsf.c scan-assembler-times

Re: [PATCH v5] add -fprolog-pad=N,M option

2017-01-23 Thread Bernd Schmidt
There are still a few details that need addressing, and some questions I have. Also CCing Jakub to have another pair of eyes on the name of the section - I don't know if we want some sort of .gnu.something name. On 01/13/2017 01:19 PM, Torsten Duwe wrote: 2017-01-13 Torsten Duwe

Improve things for PR71724, in combine/if_then_else_cond

2017-01-20 Thread Bernd Schmidt
The PR is about infinite recursion in combine_simplify_rtx, because if_then_else_cond does strange things to an expression, and we end up simplifying something to itself. The patch below tries to address this by improving that function a little. As stated in the PR, the situation is that we

Another cprop trap_if fix, PR79125

2017-01-20 Thread Bernd Schmidt
This is essentially the same patch I sent for the previous instance of this problem, but this time applied to local_cprop_pass. Bootstrapped and tested on x86_64-linux, and it seems to fix the testcase with a ppc cross. Ok? Bernd PR rtl-optimization/79125 * cprop.c (local_cprop_pass):

Re: PR78634: ifcvt/i386 cost updates

2017-01-18 Thread Bernd Schmidt
On 12/09/2016 12:49 PM, Bernd Schmidt wrote: On 12/03/2016 10:49 AM, Uros Bizjak wrote: Based on the above explanation, the patch is OK. I'll be treating the ifcvt part of it as obvious. However, testing showed an issue with the i386 funcspec-11 test: /* PR target/36936 */ /* { dg-do

Re: [PATCH, GCC/LRA, gcc-5/6-branch] Fix PR78617: Fix conflict detection in rematerialization

2017-01-17 Thread Bernd Schmidt
On 01/16/2017 08:26 PM, Jeff Law wrote: On 01/13/2017 11:19 AM, Thomas Preudhomme wrote: Ping? I'm not sure if an ok from Vladimir is enough or if I also need RM approval. Vlad's approval is all you need. Is that a general rule? I'm never too certain on that. Bernd

[i386] New lea patterns for PR71321

2016-12-20 Thread Bernd Schmidt
The problem here is that we don't have complete coverage of lea patterns for HImode/QImode: the combiner can't recognize a (plus (ashift reg 2) reg) pattern it builds. My first idea was to canonicalize ASHIFT by constant inside PLUS to MULT. The docs say that this is done only inside a MEM
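The two RTL shapes mentioned above, a PLUS of an ASHIFT by 2 and a PLUS of a MULT by 4, compute the same value, which is what makes the canonicalization idea possible in the first place. A minimal C sketch of that equivalence for 16-bit (HImode-sized) operands follows; the function names are made up for illustration and nothing here is taken from the i386 backend.

```c
#include <stdint.h>

/* (plus (ashift index 2) base), truncated to 16 bits.  */
static uint16_t lea_shift_form(uint16_t index, uint16_t base)
{
    return (uint16_t)((uint16_t)(index << 2) + base);
}

/* (plus (mult index 4) base), truncated to 16 bits.  */
static uint16_t lea_mult_form(uint16_t index, uint16_t base)
{
    return (uint16_t)(index * 4u + base);
}
```

Both forms wrap identically at 16 bits, so a pattern written for either shape describes the same lea-style address computation.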

Re: [PATCH v3] add -fprolog-pad=N,M option

2016-12-19 Thread Bernd Schmidt
I'll consider myself agnostic as to whether this is a feature we want or need, so I'll just comment on some style questions. There are a fair number of coding style violations; I'll point some of them out, but please read the documents we have linked on this page:

Re: [PATCH 1/2] print-rtl.c: use '<' and '>' rather than % for pseudos in compact mode

2016-12-19 Thread Bernd Schmidt
On 12/16/2016 09:18 PM, David Malcolm wrote: The following patch implements the change for print-rtl.c. OK for trunk assuming it passes bootstrap? Yes. Bernd

Re: [PATCH] Fix assertions along default switch labels (PR tree-optimization/78819)

2016-12-16 Thread Bernd Schmidt
On 12/16/2016 12:49 PM, Marek Polacek wrote: But as this testcase shows, this breaks when the default label shares a label with another case. On this testcase, when we reach the switch, we know that argc is either 1, 2, or 3. So by the time we reach vrp2, the IR will have been optimized to

Re: Problem with pseudo-reg syntax in RTL frontend

2016-12-16 Thread Bernd Schmidt
On 12/14/2016 05:57 PM, David Malcolm wrote: Any preferences? (or other syntax ideas?). My preference is one of the currently-unused sigils e.g. "@3", or to wrap them in braces "{3}". Either might work, I'd vaguely prefer <3> over {3} but they're equivalent really. Maybe using "@" is

Re: [PATCH] Formatting and spelling fixes for ipa-cp.c

2016-12-15 Thread Bernd Schmidt
On 12/15/2016 05:51 PM, Jakub Jelinek wrote: This patch fixes what I've found quickly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Ok. Bernd

Re: cprop fix for PR78626

2016-12-14 Thread Bernd Schmidt
On 12/12/2016 03:21 PM, Bernd Schmidt wrote: On 12/10/2016 08:58 PM, Segher Boessenkool wrote: On Thu, Dec 08, 2016 at 01:21:04PM +0100, Bernd Schmidt wrote: This is another case where an optimization turns a trap_if unconditional. We have to defer changing the CFG, since the rest of cprop

Re: PR target/78213 revisited (was Re: [PATCH 5/9] Introduce selftest::locate_file (v4))

2016-12-14 Thread Bernd Schmidt
On 12/09/2016 08:32 PM, David Malcolm wrote: Thanks. Unfortunately, applying the "locate_file" patch https://gcc.gnu.org/ml/gcc-patches/2016-11/msg01186.html would now introduce a regression in a recently-added test case: The problem is that this DejaGnu test case uses -fself-test, and

Re: cprop fix for PR78626

2016-12-12 Thread Bernd Schmidt
On 12/10/2016 08:58 PM, Segher Boessenkool wrote: On Thu, Dec 08, 2016 at 01:21:04PM +0100, Bernd Schmidt wrote: This is another case where an optimization turns a trap_if unconditional. We have to defer changing the CFG, since the rest of cprop seems to blow up when we modify things while

Re: [nvptx] propagating conditionals in worker-vector partitioned loops

2016-12-09 Thread Bernd Schmidt
On 10/27/2016 12:29 AM, Cesar Philippidis wrote: Currently, the nvptx backend is only neutering the worker axis when propagating variables used in conditional expressions across the worker and vector axes. That's a problem with the worker-state spill and fill propagation implementation because

Re: [PATCH] PR78255: Make postreload aware of NO_FUNCTION_CSE

2016-12-09 Thread Bernd Schmidt
On 12/09/2016 05:16 PM, Andre Vieira (lists) wrote: Regardless, 'reload_cse_simplify' would never perform the opposite transformation. It checks whether it can replace anything within the first argument INSN, with the second argument TESTREG. As the name implies this will always be a register.

Re: [PATCH] PR78255: Make postreload aware of NO_FUNCTION_CSE

2016-12-09 Thread Bernd Schmidt
On 12/09/2016 04:34 PM, Andre Vieira (lists) wrote: Regardless, the other testcases I add in this patch show a sub-optimal transformation done by postreload, turning direct calls into indirect calls, for targets which have specifically pointed out that no CSE should be done on functions through

Re: [PATCH] PR78255: Make postreload aware of NO_FUNCTION_CSE

2016-12-09 Thread Bernd Schmidt
On 12/09/2016 03:03 PM, Andre Vieira (lists) wrote: This patch fixes the issue reported in PR78255 by making postreload aware it should not be performing CSE on functions if NO_FUNCTION_CSE is defined to true. Bootstrap and full regression on arm-none-linux-gnueabihf and
