Re: [PATCH, rs6000] Use bcdsub. instead of bcdadd. for bcd invalid number checking

2024-04-18 Thread Segher Boessenkool
On Thu, Apr 18, 2024 at 11:14:42AM +0800, Kewen.Lin wrote: > on 2024/4/18 10:01, HAO CHEN GUI wrote: > > This patch replace bcdadd. with bcdsub. for bcd invalid number checking. > > bcdadd on two same numbers might cause overflow which also set > > overflow/invalid bit so that we can't

Re: [PATCH] rs6000: Add OPTION_MASK_POWER8 [PR101865]

2024-04-12 Thread Segher Boessenkool
Hi! On Thu, Apr 11, 2024 at 11:23:02PM -0500, Peter Bergner wrote: > On 4/11/24 10:31 PM, Kewen.Lin wrote: > >> +;; This option exists only to create its MASK. It is not intended for > >> users. > >> +mdo-not-use-this-option > >> +Target RejectNegative Mask(POWER8) Var(rs6000_isa_flags)

Re: Combine patch ping

2024-04-11 Thread Segher Boessenkool
On Wed, Apr 10, 2024 at 08:32:39PM +0200, Uros Bizjak wrote: > On Wed, Apr 10, 2024 at 7:56 PM Segher Boessenkool > wrote: > > This is never okay. You cannot commit a patch without approval, *ever*. This is the biggest issue, to start with. It is fundamental. > > That pat

Re: Combine patch ping

2024-04-10 Thread Segher Boessenkool
On Sun, Apr 07, 2024 at 08:31:38AM +0200, Uros Bizjak wrote: > If there are no further comments, I plan to commit the referred patch > to the mainline on Wednesday. The latest version can be considered an > obvious patch that solves certain oversight in the original > implementation. This is

Re: [COMMITTED] testsuite/gcc.target/cris/pr93372-2.c: Handle xpass from combine improvement

2024-04-09 Thread Segher Boessenkool
Hi! On Fri, Apr 05, 2024 at 04:06:01AM +0200, Hans-Peter Nilsson wrote: > The xpassing change in generated code was as follows, at > r14-9788-gb7bd2ec73d66f7 (where I locally applied a revert > to verify that this suspect was the cause). That was so > much of an improvement that I had to share

Re: [PATCH] rtl-optimization/101523 - avoid re-combine after noop 2->2 combination

2024-04-05 Thread Segher Boessenkool
Hi! On Wed, Apr 03, 2024 at 01:07:41PM +0200, Richard Biener wrote: > The following avoids re-walking and re-combining the instructions > between i2 and i3 when the pattern of i2 doesn't change. > > Bootstrap and regtest running ontop of a reversal of > r14-9692-g839bc42772ba7a. Please include

[PATCH] combine: Don't combine if I2 does not change

2024-03-27 Thread Segher Boessenkool
ot the way to do it. Committed to trunk. Segher 2024-03-27 Segher Boessenkool PR rtl-optimization/101523 * combine.cc (try_combine): Don't do a 2-insn combination if it does not in fact change I2. --- gcc/combine.cc | 11 +++ 1 file changed, 11 insertion

Re: [PATCH v2] combine: Fix ICE in try_combine on pr112494.c [PR112560]

2024-03-18 Thread Segher Boessenkool
On Thu, Mar 07, 2024 at 11:46:54PM +0100, Uros Bizjak wrote: > > Can't you just describe the dataflow then, without an unspec? An unspec > > by definition does some (unspecified) operation on the data. > > Previously, it was defined as: > > (define_insn "*pushfl2" >[(set (match_operand:W 0

Re: [PATCH v2] combine: Fix ICE in try_combine on pr112494.c [PR112560]

2024-03-18 Thread Segher Boessenkool
On Thu, Mar 07, 2024 at 11:27:28PM +0100, Uros Bizjak wrote: > On Thu, Mar 7, 2024 at 11:07 PM Uros Bizjak wrote: > > > > (unspec:DI [ > > > > (reg:CC 17 flags) > > > > ] UNSPEC_PUSHFL) > > > > > > But that is invalid RTL? The only valid use of a CC is written as > > >

Re: [PATCH] rs6000: Fix issue in specifying PTImode as an attribute [PR106895]

2024-03-18 Thread Segher Boessenkool
Hi! On Fri, Feb 23, 2024 at 03:04:13PM +0530, jeevitha wrote: > PTImode attribute assists in generating even/odd register pairs on 128 bits. It is a mode, not an attribute. Attributes are on declarations, while modes are on a much more fundamental level (every value has a mode, in GCC!) > When

Re: [PATCH v2] combine: Fix ICE in try_combine on pr112494.c [PR112560]

2024-03-07 Thread Segher Boessenkool
On Thu, Mar 07, 2024 at 11:07:18PM +0100, Uros Bizjak wrote: > On Thu, Mar 7, 2024 at 10:37 PM Segher Boessenkool > wrote: > > > but can be something else, such as the above noted > > > > > > (unspec:DI [ > > > (reg:CC 17 flags) > > >

Re: [PATCH V3 2/2] rs6000: Load store fusion for rs6000 target using common infrastructure

2024-03-07 Thread Segher Boessenkool
On Fri, Mar 08, 2024 at 03:01:04AM +0530, Ajit Agarwal wrote: > > >> + Copyright (C) 2020-2023 Free Software Foundation, Inc. > > > > What in here is from 2020? > > > > Most things will be from 2024, too. First publication date is what > > counts. > > Please let me know the second

Re: [PATCH v2] combine: Fix ICE in try_combine on pr112494.c [PR112560]

2024-03-07 Thread Segher Boessenkool
On Thu, Mar 07, 2024 at 10:04:32PM +0100, Uros Bizjak wrote: [snip] > The part we want to fix deals with the *user* of the CC register. It > is not true that this is always COMPARISON_P, so EQ, NE, GE, LT, ... > in the form of > > (LT:CCGC (reg:CCGC 17 flags) (const_int 0)) > > but can be

Re: [PATCH v2] combine: Fix ICE in try_combine on pr112494.c [PR112560]

2024-03-07 Thread Segher Boessenkool
On Thu, Mar 07, 2024 at 12:22:04PM +0100, Uros Bizjak wrote: > As I understood find_single_use, it is returning RTX iff DEST is used > only a single time in an insn sequence following INSN. Connected by a log_link even, yeah. > We can reject the combination without worries of multiple uses.

Re: [PATCH v2] combine: Fix ICE in try_combine on pr112494.c [PR112560]

2024-03-07 Thread Segher Boessenkool
On Thu, Mar 07, 2024 at 11:45:45AM +0100, Richard Biener wrote: > The question is, whether a NULL cc_use_loc (find_single_use returning > NULL) means "there is no use" or it can mean "huh, don't know, maybe > more than one, maybe I was too stupid to indentify the single use". > The implementation

Re: [PATCH v2] combine: Fix ICE in try_combine on pr112494.c [PR112560]

2024-03-07 Thread Segher Boessenkool
On Thu, Mar 07, 2024 at 11:22:12AM +0100, Richard Biener wrote: > > > > Undo the combination if *cc_use_loc is not COMPARISON_P. Why, anyway? COMPARISON_P means things like LE. It does not even include actual RTX COMPARE. Segher

Re: [PATCH v2] combine: Fix ICE in try_combine on pr112494.c [PR112560]

2024-03-07 Thread Segher Boessenkool
On Thu, Mar 07, 2024 at 10:55:12AM +0100, Richard Biener wrote: > On Thu, 7 Mar 2024, Uros Bizjak wrote: > > This is > > > > 3236 /* Just replace the CC reg with a new mode. */ > > 3237 SUBST (XEXP (*cc_use_loc, 0), newpat_dest); > > 3238 undobuf.other_insn

Re: [PATCH V3 2/2] rs6000: Load store fusion for rs6000 target using common infrastructure

2024-02-29 Thread Segher Boessenkool
Hi! On Mon, Feb 19, 2024 at 04:24:37PM +0530, Ajit Agarwal wrote: > --- a/gcc/config.gcc > +++ b/gcc/config.gcc > @@ -518,7 +518,7 @@ or1k*-*-*) > ;; > powerpc*-*-*) > cpu_type=rs6000 > - extra_objs="rs6000-string.o rs6000-p8swap.o rs6000-logue.o" > +

Re: [PATCH V2] rs6000: Don't allow immediate value in the vsx_splat pattern [PR113950]

2024-02-28 Thread Segher Boessenkool
On Wed, Feb 28, 2024 at 11:58:15AM -0600, Peter Bergner wrote: > On 2/28/24 8:31 AM, Segher Boessenkool wrote: > > On Tue, Feb 27, 2024 at 04:50:02PM -0600, Peter Bergner wrote: > >> So it seems you're not NAKing the use of splat_input_operand, but > >> just tha

Re: [PATCH V2] rs6000: Don't allow immediate value in the vsx_splat pattern [PR113950]

2024-02-28 Thread Segher Boessenkool
On Tue, Feb 27, 2024 at 04:50:02PM -0600, Peter Bergner wrote: > On 2/27/24 6:40 AM, Segher Boessenkool wrote: > > On Tue, Feb 27, 2024 at 02:02:38AM +0530, jeevitha wrote: > > input_operand allows a lot of things that splat_input_operand does not, > > not just imm

Re: [PATCH V2] rs6000: Don't allow immediate value in the vsx_splat pattern [PR113950]

2024-02-27 Thread Segher Boessenkool
Hi! On Tue, Feb 27, 2024 at 02:02:38AM +0530, jeevitha wrote: > There is no immediate value splatting instruction in Power. Currently, those > values need to be stored in a register or memory. To address this issue, I > have updated the predicate for the second operand in vsx_splat to >

Re: Repost [PATCH 1/6] Add -mcpu=future

2024-02-23 Thread Segher Boessenkool
On Tue, Feb 20, 2024 at 06:35:34PM +0800, Kewen.Lin wrote: > on 2024/2/8 03:58, Michael Meissner wrote: > $ grep -r "define PROCESSOR_DEFAULT" gcc/config/rs6000/ > gcc/config/rs6000/aix71.h:#define PROCESSOR_DEFAULT PROCESSOR_POWER7 > gcc/config/rs6000/aix71.h:#define PROCESSOR_DEFAULT64

Re: [PATCH 0/2 V2] aarch64: Place target independent and dependent code in one file.

2024-02-22 Thread Segher Boessenkool
On Thu, Feb 22, 2024 at 07:49:20PM +0000, Richard Sandiford wrote: > Thanks for the update. This is still quite hard to review though. > Sorry to ask for another round, but could you split it up further? > The ideal thing would be if patches that move code do nothing other > than move code, and

Re: [PATCH] rs6000: Update instruction counts due to combine changes [PR112103]

2024-02-20 Thread Segher Boessenkool
On Tue, Feb 20, 2024 at 01:49:30PM -0600, Peter Bergner wrote: > I think this will become less fragile after we fix PR114004 which is You call it "fragile". I call it the testcase found the exact kind of bug this testcase was meant to find! Yes, the test should become quieter when the compiler

Re: [PATCH] rs6000: Neuter option -mpower{8,9}-vector [PR109987]

2024-02-20 Thread Segher Boessenkool
On Tue, Feb 20, 2024 at 05:27:07PM +0800, Kewen.Lin wrote: > > -mcpu=power8 implies -mvsx already. > > Yes, but users can specify -mno-vsx in RUNTESTFLAGS, dejagnu > framework can have different behaviors (options order) for > different versions, this explicit -mvsx is mainly for the >

Re: [PATCH] rs6000: Neuter option -mpower{8,9}-vector [PR109987]

2024-02-19 Thread Segher Boessenkool
Hi! On Tue, Jan 16, 2024 at 10:50:01AM +0800, Kewen.Lin wrote: > As PR109987 and its duplicated bugs show, -mno-power8-vector > (and -mno-power9-vector) cause some problems and as Segher > pointed out in [1] they are workaround options, so this patch > is to remove -m{no,}-power{8,9}-options.

Re: [PATCH] Turn on LRA on all targets

2024-02-16 Thread Segher Boessenkool
On Fri, Feb 16, 2024 at 11:34:55AM +0000, Maciej W. Rozycki wrote: > Not really, in particular because EH unwinding has to be reliable and > heuristics inherently is not. Yup. Which is why I did 0359465c703a for rs6000 six years ago (how time flies!) The commit message for that includes

Re: [PATCH] Turn on LRA on all targets

2024-02-16 Thread Segher Boessenkool
On Thu, Feb 15, 2024 at 08:41:42PM -0500, Paul Koning wrote: > > On Feb 15, 2024, at 5:56 PM, Segher Boessenkool > > wrote: > > > > On Thu, Feb 15, 2024 at 07:34:32PM +0000, Sam James wrote: > >> I have now started doing this in PR113932. > > > >

Re: [PATCH] Turn on LRA on all targets

2024-02-15 Thread Segher Boessenkool
On Thu, Feb 15, 2024 at 07:34:32PM +0000, Sam James wrote: > I have now started doing this in PR113932. Thank you! Segher

Re: Repost [PATCH 1/6] Add -mcpu=future

2024-02-08 Thread Segher Boessenkool
On Fri, Jan 05, 2024 at 06:35:37PM -0500, Michael Meissner wrote: > * config/rs6000/rs6000.opt (-mfuture): New undocumented debug switch. No. Never ever use a flag that does what -mcpu= should do. We're still trying to recover from previous such mistakes. Don't add more please. > +++

Re: Repost [PATCH 1/6] Add -mcpu=future

2024-02-08 Thread Segher Boessenkool
On Wed, Feb 07, 2024 at 05:21:10PM +0800, Kewen.Lin wrote: > on 2024/2/6 14:01, Michael Meissner wrote: > > It was more as a separation. The MPCCORE, CELL, PPCA2, and TITAN are rather > > old processors. I'll probably remove Titan soonish, btw. We have adjusted code around it for what, fifteen

Re: Repost [PATCH 1/6] Add -mcpu=future

2024-02-08 Thread Segher Boessenkool
On Tue, Feb 06, 2024 at 01:01:52AM -0500, Michael Meissner wrote: > > Nit: Named as "ISA_FUTURE_MASKS_SERVER" seems more accurate as it's > > constituted > > with ISA_3_1_MASKS_**SERVER** ... > > Well the _SERVER stuff was due to the power7 days when we still had to support > the E500 in the

Re: Repost [PATCH 0/6] PowerPC Future patches

2024-02-08 Thread Segher Boessenkool
Hi! On Fri, Jan 05, 2024 at 06:27:05PM -0500, Michael Meissner wrote: > In the current MMA subsystem for Power10, there are 8 512-bit accumulator > registers. These accumulators are each tied to sets of 4 FPR registers. When Four VSX registers -- the FP registers are only a 64 bit part of each

Re: [PATCH] Add a late-combine pass [PR106594]

2023-12-30 Thread Segher Boessenkool
Hi! On Tue, Oct 24, 2023 at 07:49:10PM +0100, Richard Sandiford wrote: > This patch adds a combine pass that runs late in the pipeline. But it is not. It is a completely new thing, and much closer to fwprop than to combine, too. Could you rename it to something else, please? Something less

Re: [PATCH 1/2] RTX_COST: Count instructions

2023-12-30 Thread Segher Boessenkool
On Fri, Dec 29, 2023 at 09:14:52PM -0700, Jeff Law wrote: > On 12/29/23 10:46, YunQiang Su wrote: > >When we try to combine RTLs, the result may be very complex, > >and `rtx_cost` may think that it need lots of costs. But in > >fact, it may match a pattern in machine descriptions, which > >may

Re: [PATCH] combine: Fix ICE in try_combine on pr112494.c [PR112560]

2023-11-30 Thread Segher Boessenkool
Hi! On Wed, Nov 29, 2023 at 02:20:03PM +0100, Uros Bizjak wrote: > On Wed, Nov 29, 2023 at 1:25 PM Richard Biener > wrote: > > On Wed, Nov 29, 2023 at 10:35 AM Uros Bizjak wrote: > I was assuming that if the CC reg is not used inside the comparison, > then the mode of CC reg is irrelevant. We

Re: [PATCH] rs6000, Add missing overloaded bcd builtin tests

2023-10-31 Thread Segher Boessenkool
On Tue, Oct 31, 2023 at 08:31:25AM -0700, Carl Love wrote: > > I just found that actually they have the test coverage, because we > > have > > > > #define __builtin_bcdcmpeq(a,b) __builtin_vec_bcdsub_eq(a,b,0) > > #define __builtin_bcdcmpgt(a,b) __builtin_vec_bcdsub_gt(a,b,0) > > #define

Re: [PATCH v2] swap: Fix incorrect lane extraction by vec_extract() [PR106770]

2023-10-28 Thread Segher Boessenkool
Hi! Please say "rs6000/p8swap:" in the subject, not "swap:" :-) On Sun, Sep 10, 2023 at 10:58:32PM +0530, Surya Kumari Jangala wrote: > Another issue with always handling swappable instructions is that it is > incorrect to do so in webs where loads/stores on quad word aligned > addresses are

Re: [PATCH] Cleanup: Replace UNSPEC_COPYSIGN with copysign RTL

2023-09-30 Thread Segher Boessenkool
On Fri, Sep 29, 2023 at 02:09:12PM -0400, Michael Meissner wrote: > * config/rs6000/rs6000.md (UNSPEC_COPYSIGN): Delete. > (copysign3_fcpsg): Use copysign RTL instead of UNSPEC. (typo, it is _fcpsgn) Nice to see unnecessary unspecs going away :-) Segher

Re: [PATCH V4, rs6000] Disable generation of scalar modulo instructions

2023-09-13 Thread Segher Boessenkool
Hi! On Fri, Jun 30, 2023 at 02:26:35PM -0500, Pat Haugen wrote: > gcc/ > * config/rs6000/rs6000.cc (rs6000_rtx_costs): Check if disabling > scalar modulo. "Check whether the modulo instruction is disabled?" > * config/rs6000/rs6000.md (mod3, *mod3): Disable. > (define_expand

Re: [MAINTAINERS/KERNEL SUMMIT] Trust and maintenance of file systems

2023-09-07 Thread Segher Boessenkool
On Thu, Sep 07, 2023 at 02:23:00PM +0300, Dan Carpenter wrote: > On Thu, Sep 07, 2023 at 06:04:09AM -0500, Segher Boessenkool wrote: > > On Thu, Sep 07, 2023 at 12:48:25PM +0300, Dan Carpenter via Gcc-patches > > wrote: > > > I started to hunt > > > down all

Re: [MAINTAINERS/KERNEL SUMMIT] Trust and maintenance of file systems

2023-09-07 Thread Segher Boessenkool
On Thu, Sep 07, 2023 at 07:22:45AM -0400, Steven Rostedt wrote: > On Thu, 7 Sep 2023 06:04:09 -0500 > Segher Boessenkool wrote: > > On Thu, Sep 07, 2023 at 12:48:25PM +0300, Dan Carpenter via Gcc-patches > > wrote: > > No. You should patch your program, instead.

Re: [MAINTAINERS/KERNEL SUMMIT] Trust and maintenance of file systems

2023-09-07 Thread Segher Boessenkool
On Thu, Sep 07, 2023 at 12:48:25PM +0300, Dan Carpenter via Gcc-patches wrote: > I started to hunt > down all the Makefile which add a -Werror but there are a lot and > eventually I got bored and gave up. I have a patch stack for that, since 2014 or so. I build Linux with unreleased GCC versions

Re: [PATCH] Fix typo in insn name.

2023-07-10 Thread Segher Boessenkool
Hi! On Mon, Jul 10, 2023 at 03:59:44PM -0400, Michael Meissner wrote: > In doing other work, I noticed that there was an insn: > > vsx_extract_v4sf__load > > Which did not have an iterator. I removed the useless . This patch does that, you mean. > --- a/gcc/config/rs6000/vsx.md > +++

Re: [PATCH] rs6000: Don't ICE when generating vector pair load/store insns [PR110411]

2023-07-06 Thread Segher Boessenkool
On Thu, Jul 06, 2023 at 02:48:19PM -0500, Peter Bergner wrote: > On 7/6/23 12:33 PM, Segher Boessenkool wrote: > > On Wed, Jul 05, 2023 at 05:21:18PM +0530, P Jeevitha wrote: > >> --- a/gcc/config/rs6000/rs6000.cc > >> +++ b/gcc/config/rs6000/rs6000

Re: [PATCH] rs6000: Don't ICE when generating vector pair load/store insns [PR110411]

2023-07-06 Thread Segher Boessenkool
Hi! On Wed, Jul 05, 2023 at 05:21:18PM +0530, P Jeevitha wrote: > The following patch has been bootstrapped and regtested on powerpc64le-linux. > > while generating vector pairs of load & store instruction, the src address > was treated as an altivec type and that type of address is invalid for

Re: [PATCH, V6] Fix power10 fusion and -fstack-protector, PR target/105325

2023-06-20 Thread Segher Boessenkool
Hi! The patch looks great now, thank you! But the commit message needs some work: First off, the subject, which is a short (50 character max!) summary of what the patch is about. Fix power10 fusion and -fstack-protector, PR target/105325 There is absolutely nothing to do with stack protector,

Re: [PATCH V3 1/4] rs6000: build constant via li;rotldi

2023-06-16 Thread Segher Boessenkool
Hi! On Fri, Jun 16, 2023 at 04:34:12PM +0800, Jiufu Guo wrote: > +/* Check if value C can be built by 2 instructions: one is 'li', another is > + rotldi. > + > + If so, *SHIFT is set to the shift operand of rotldi(rldicl), and *MASK > + is set to -1, and return true. Return false
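The li;rotldi idea quoted above can be illustrated with a brute-force sketch in C. This is not the algorithm from the patch: it only asks whether a 64-bit constant is some left-rotation of a sign-extended 16-bit immediate (the kind of value `li` can load). The names `rotl64` and `li_rotldi_candidate` are made up for illustration, and the mask handling mentioned in the quoted comment is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Rotate X left by S bits (S taken modulo 64), avoiding a shift by 64.  */
uint64_t
rotl64 (uint64_t x, unsigned s)
{
  s &= 63;
  return s ? (x << s) | (x >> (64 - s)) : x;
}

/* Hypothetical helper: return true if C equals rotl64 (IMM, SHIFT) for some
   sign-extended 16-bit IMM, storing the first such pair that is found.  */
bool
li_rotldi_candidate (uint64_t c, unsigned *shift, int16_t *imm)
{
  assert (shift != NULL && imm != NULL);
  for (unsigned s = 0; s < 64; s++)
    {
      /* Undo a left-rotate by S, then see whether the result fits "li".
         The unsigned-to-signed conversion assumes two's complement.  */
      int64_t v = (int64_t) rotl64 (c, 64 - s);
      if (v >= INT16_MIN && v <= INT16_MAX)
        {
          *shift = s;
          *imm = (int16_t) v;
          return true;
        }
    }
  return false;
}
```

A constant such as 0x1234000000000000 passes this check, while 0x0001000100010001 does not, because its set bits are 16 apart and no rotation gathers them into the low 15 bits.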

Re: [PATCH] rs6000: replace '(const_int 0)' to 'unspec:BLK [(const_int 0)]' for stack_tie

2023-06-15 Thread Segher Boessenkool
On Thu, Jun 15, 2023 at 03:00:40PM +0800, Jiufu Guo wrote: > >> This is the existing pattern. It may be read as an action > >> to clean an unknown-size memory block. > > > > Including a size zero memory block, yes. BLKmode was originally to do > > things like bcopy (before modern names like

Re: [PATCH] rs6000: replace '(const_int 0)' to 'unspec:BLK [(const_int 0)]' for stack_tie

2023-06-14 Thread Segher Boessenkool
On Wed, Jun 14, 2023 at 06:25:10PM +0200, Richard Biener wrote: > > Form rs6000.md: > > ; This is to explain that changes to the stack pointer should > > ; not be moved over loads from or stores to stack memory. > > (define_insn "stack_tie" > > That suggests it’s the hard register value that‘s

Re: [PATCH] rs6000: replace '(const_int 0)' to 'unspec:BLK [(const_int 0)]' for stack_tie

2023-06-14 Thread Segher Boessenkool
Hi! On Wed, Jun 14, 2023 at 10:04:20AM +0100, Richard Sandiford wrote: > I'd also understood it to be either. As in, it is a may-clobber > that can be used for must-clobber. Alternatively: the value stored > is unpredictable, and can therefore be the same as the current value. Yes, it is a set

Re: [PATCH] rs6000: replace '(const_int 0)' to 'unspec:BLK [(const_int 0)]' for stack_tie

2023-06-14 Thread Segher Boessenkool
On Wed, Jun 14, 2023 at 09:22:09AM +0000, Richard Biener wrote: > How can a clobber be validly dropped? Same as any other set: if no code executed after it can read whatever is written. This typically means a stack frame goes away, or simply no more code is executed *at all* after this. > For

Re: [PATCH] rs6000: replace '(const_int 0)' to 'unspec:BLK [(const_int 0)]' for stack_tie

2023-06-14 Thread Segher Boessenkool
Hi! On Wed, Jun 14, 2023 at 09:52:37AM +0000, Richard Biener wrote: > I see. So > > (parallel > (unspec stack_tie) > (clobber (mem:BLK ...))) Written like this, without a "set", *every* unspec has to be an unspec_volatile, for the same reason as all inline asms without outputs always are

Re: [PATCH] rs6000: replace '(const_int 0)' to 'unspec:BLK [(const_int 0)]' for stack_tie

2023-06-14 Thread Segher Boessenkool
Hi! On Wed, Jun 14, 2023 at 05:26:52PM +0800, Jiufu Guo wrote: > Richard Biener writes: > >> 3. "set (mem/c:DI (reg/f:DI 1 1) unspec:DI (const_int 0 [0]) > >> UNSPEC_TIE". > >>This avoids using BLK on unspec, but using DI. > > > > That gives the MEM a size which means we can interpret the

Re: [PATCH] rs6000: replace '(const_int 0)' to 'unspec:BLK [(const_int 0)]' for stack_tie

2023-06-14 Thread Segher Boessenkool
Hi! On Wed, Jun 14, 2023 at 07:59:04AM +0000, Richard Biener wrote: > On Wed, 14 Jun 2023, Jiufu Guo wrote: > > 3. "set (mem/c:DI (reg/f:DI 1 1) unspec:DI (const_int 0 [0]) > > UNSPEC_TIE". > >This avoids using BLK on unspec, but using DI. > > That gives the MEM a size which means we can

Re: [PATCH] rs6000: replace '(const_int 0)' to 'unspec:BLK [(const_int 0)]' for stack_tie

2023-06-14 Thread Segher Boessenkool
Hi! On Wed, Jun 14, 2023 at 12:06:29PM +0800, Jiufu Guo wrote: > Segher Boessenkool writes: > I'm also thinking about other solutions: > 1. "set (mem/c:BLK (reg/f:DI 1 1) (const_int 0 [0])" > This is the existing pattern. It may be read as an action > to clean an

Re: [PATCH] rs6000: replace '(const_int 0)' to 'unspec:BLK [(const_int 0)]' for stack_tie

2023-06-14 Thread Segher Boessenkool
Hi! On Wed, Jun 14, 2023 at 05:18:15PM +0800, Xi Ruoyao wrote: > The generic issue here is to fix (not "papering over") the signed > overflow, we need to perform the addition in a target machine mode. We > may always use Pmode (IIRC const_anchor was introduced for optimizing > some constant

Re: [PATCH] rs6000: replace '(const_int 0)' to 'unspec:BLK [(const_int 0)]' for stack_tie

2023-06-13 Thread Segher Boessenkool
Hi! As I said in a reply to the original patch: not okay. Sorry. But some comments on this patch: On Tue, Jun 13, 2023 at 08:23:35PM +0800, Jiufu Guo wrote: > + && XINT (SET_SRC (set), 1) == UNSPEC_TIE > + && XVECEXP (SET_SRC (set), 0, 0) == const0_rtx); This makes it required

Re: [PATCH] rs6000: replace '(const_int 0)' to 'unspec:BLK [(const_int 0)]' for stack_tie

2023-06-13 Thread Segher Boessenkool
Hi! On Tue, Jun 13, 2023 at 10:15:49AM +0800, Jiufu Guo wrote: > David Edelsohn writes: > > > > This definitely seems to be a better solution. > > > > The TARGET_CONST_ANCHOR change should not be part of this patch. Also > > there is no ChangeLog for the patch. > > Thanks a lot for your quick

Re: [PATCH v2] rs6000: fmr gets used instead of faster xxlor [PR93571]

2023-06-12 Thread Segher Boessenkool
Hi! On Sat, Feb 25, 2023 at 03:20:33PM +0530, Ajit Agarwal wrote: > Here is the patch that uses xxlor instead of fmr where possible. > Performance results shows that fmr is better in power9 and > power10 architectures whereas xxlor is better in power7 and > power 8 architectures. fmr is the only

Re: [PATCH V2] Optimize '(X - N * M) / N' to 'X / N - M' if valid

2023-06-09 Thread Segher Boessenkool
Hi! On Wed, Jun 07, 2023 at 04:21:11PM +0800, Jiufu Guo wrote: > This patch tries to optimize "(X - N * M) / N" to "X / N - M". > For C code, "/" towards zero (trunc_div), and "X - N * M" maybe > wrap/overflow/underflow. So, it is valid that "X - N * M" does > not cross zero and does not
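The validity condition in the quoted description ("does not cross zero") can be checked with a small worked example in C. The function names `div_before` and `div_after` are illustrative only and do not come from the patch; the sketch assumes inputs small enough that nothing overflows.

```c
#include <assert.h>

/* "(X - N * M) / N", computed as written.  The caller must pick values
   for which X - N * M does not overflow.  */
long
div_before (long x, long n, long m)
{
  assert (n != 0);
  return (x - n * m) / n;
}

/* The proposed replacement "X / N - M".  */
long
div_after (long x, long n, long m)
{
  assert (n != 0);
  return x / n - m;
}
```

With X = 7, N = 3, M = 2 both forms give 0, since X and X - N * M are both positive. With X = 5, N = 3, M = 2 the subtraction crosses zero: the original form computes -1 / 3, which truncates to 0, while the replacement computes 1 - 2 = -1, so the rewrite would be wrong there.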

[PATCH 2/2] rs6000: genfusion: Delete dead code

2023-06-06 Thread Segher Boessenkool
2023-06-06 Segher Boessenkool * config/rs6000/genfusion.pl: Delete some dead code. --- gcc/config/rs6000/genfusion.pl | 3 --- 1 file changed, 3 deletions(-) diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl index 2851bb7..82e8f86 100755 --- a/gcc/config

[PATCH 1/2] rs6000: genfusion: Rewrite load/compare code

2023-06-06 Thread Segher Boessenkool
;s or "qw" for lists of constants. 2023-06-06 Segher Boessenkool * config/rs6000/genfusion.pl (gen_ld_cmpi_p10_one): New, rewritten and split out from... (gen_ld_cmpi_p10): ... this. --- gcc/config/rs6000/genfusion.pl | 185 +++-

Re: [PATCH] rs6000: Remove duplicate expression [PR106907]

2023-06-05 Thread Segher Boessenkool
Hi! On Mon, Jun 05, 2023 at 12:11:42PM +0530, P Jeevitha wrote: > PR106907 has few warnings spotted from cppcheck. In that addressing duplicate > expression issue here. Here the same expression is used twice in logical > AND(&&) operation which result in same result so removing that. > >

Re: [PATCH V5, 2/2] PR target/105325: Fix memory constraints for power10 fusion.

2023-05-26 Thread Segher Boessenkool
On Wed, May 10, 2023 at 11:40:00AM -0400, Michael Meissner wrote: > This patch applies stricter predicates and constraints for LD and LWA > instructions with power10 fusion. These instructions are DS-form > instructions, > which means that the bottom 2 bits of the address must be 0. The low two
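As a rough illustration of the DS-form restriction discussed here, the following C sketch assumes the constraint applies to the 16-bit signed displacement, as in the Power ISA DS-form encoding (a 14-bit DS field with two zero bits appended). The helper names `ds_form_offset_ok` and `ds_field` are hypothetical and are not GCC predicates.

```c
#include <assert.h>
#include <stdbool.h>

/* True if OFFSET fits a signed 16-bit displacement and its low two bits
   are zero, i.e. it could be encoded in a DS-form instruction.  */
bool
ds_form_offset_ok (long offset)
{
  return offset >= -32768 && offset <= 32767 && (offset & 3) == 0;
}

/* The 14-bit DS field for a valid OFFSET: the displacement divided by 4,
   truncated to 14 bits (two's complement assumed for negative values).  */
long
ds_field (long offset)
{
  assert (ds_form_offset_ok (offset));
  return (offset / 4) & 0x3fff;
}
```

Under these assumptions an offset of 8 is encodable and an offset of 6 is not, which is why a plain D-form predicate is too permissive for `ld` and `lwa`.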

Re: [PATCH V5, 1/2] PR target/105325: Rewrite genfusion.pl's gen_ld_cmpi_p10 function.

2023-05-26 Thread Segher Boessenkool
Hi Mike, On Wed, May 10, 2023 at 11:38:55AM -0400, Michael Meissner wrote: > This patch rewrites the gen_ld_cmpi_p10 function in genfusion.pl to be > clearer. That is not at all what I asked for, even if I would agree the code is nicer to read now (I don't). What I asked for, what is needed,

Re: [PATCH] Only use NO_REGS in cost calculation when !hard_regno_mode_ok for GENERAL_REGS and mode.

2023-05-25 Thread Segher Boessenkool
On Thu, May 25, 2023 at 10:29:47AM -0400, Vladimir Makarov wrote: > > On 5/17/23 02:57, liuhongt wrote: > >r14-172-g0368d169492017 replaces GENERAL_REGS with NO_REGS in cost > >calculation when the preferred register class are not known yet. > >It regressed powerpc PR109610 and PR109858, it looks

Re: [PATCH] [testsuite] [powerpc] adjust -m32 counts for fold-vec-extract*

2023-05-25 Thread Segher Boessenkool
Hi Alex, On Thu, May 25, 2023 at 10:55:37AM -0300, Alexandre Oliva wrote: > On May 25, 2023, Segher Boessenkool wrote: > > Fwiw, updating the insn counts blindly like this > > ... is a claim that carries a wildly incorrect and insulting underlying > assumption: Sorry you f

Re: [PATCH] [testsuite] [powerpc] adjust -m32 counts for fold-vec-extract*

2023-05-25 Thread Segher Boessenkool
Hi! On Thu, May 25, 2023 at 07:05:55AM -0300, Alexandre Oliva wrote: > On May 25, 2023, "Kewen.Lin" wrote: > > So both lp64 and ilp32 have the same count, could we merge it and > > remove the selectors? > > We could, but... I thought I wouldn't, since they were different > before, and they're

Re: [PATCH v1] tree-ssa-sink: Improve code sinking pass.

2023-05-18 Thread Segher Boessenkool
Hi! On Thu, May 18, 2023 at 12:44:28PM +0530, Ajit Agarwal wrote: > This patch improves code sinking pass to sink statements before call to reduce > register pressure. An example would be useful :-) > * tree-ssa-sink.cc (statement_sink_location): Modifed to > move statements before

Re: [PATCH v5 1/4] rs6000: Enable REE pass by default

2023-05-16 Thread Segher Boessenkool
Hi! On Tue, May 16, 2023 at 11:45:28AM +0530, Ajit Agarwal wrote: > --- a/gcc/doc/invoke.texi > +++ b/gcc/doc/invoke.texi > @@ -12455,8 +12455,8 @@ Attempt to remove redundant extension instructions. > This is especially > helpful for the x86-64 architecture, which implicitly zero-extends in

Re: [PATCH] [powerpc] Add a peephole2 to eliminate redundant move from VSX_REGS to GENERAL_REGS when it's from memory.

2023-05-15 Thread Segher Boessenkool
On Thu, May 04, 2023 at 01:54:46PM +0800, liuhongt wrote: > r14-172-g0368d169492017 use NO_REGS instead of GENERAL_REGS in memory cost > calculation when preferred register class is unknown. > + /* Costs for NO_REGS are used in cost calculation on the > +1st pass when the preferred

Re: [PATCH, V4] PR target/105325, Make load/cmp fusion know about prefixed loads.

2023-05-02 Thread Segher Boessenkool
On Wed, Apr 26, 2023 at 12:18:36PM -0400, Michael Meissner wrote: > * gcc/config/rs6000/genfusion.pl (gen_ld_cmpi_p10): Improve generation > of the ld and lwa instructions which use the DS encoding instead of D. > Use the YZ constraint for these loads. Handle prefixed loads

Re: [committed] Convert xstormy16 to LRA

2023-05-02 Thread Segher Boessenkool
Hi! On Tue, May 02, 2023 at 05:20:49PM +0100, Roger Sayle wrote: > On 02 May 2023 14:49, Segher Boessenkool wrote: > Then combine inserts an additional copy: Combine makes sure a pseudo-to-pseudo move remains. Without that, combine will seize part of RA's job, and butcher it. It has

Re: [committed] Convert xstormy16 to LRA

2023-05-02 Thread Segher Boessenkool
On Tue, May 02, 2023 at 10:11:27AM -0400, Paul Koning wrote: > > On May 2, 2023, at 9:18 AM, Roger Sayle wrote: > > Yes, see the section -fsplit-wide-types in > > https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html > > Thanks. So I'm wondering why that would be a problem. > > The obvious

Re: [committed] Convert xstormy16 to LRA

2023-05-02 Thread Segher Boessenkool
Hi! On Tue, May 02, 2023 at 02:18:43PM +0100, Roger Sayle wrote: > On 02 May 2023 13:40, Paul Koning wrote: > > > On May 1, 2023, at 7:37 PM, Roger Sayle > > wrote: > > > The shiftsi.cc regression on xstormy16 is fixed by adding > > > -fno-split-wide-types. > > > In fact, if all the regression

Re: [PATCH v4 4/4] ree: Improve ree pass for rs6000 target using defined ABI interfaces.

2023-05-01 Thread Segher Boessenkool
Hi! On Sat, Apr 22, 2023 at 02:36:20PM +0530, Ajit Agarwal wrote: > * ree.cc (combline_reaching_defs): Add zero_extend > using defined abi interfaces. Typo. Also, please don't wrap lines early. Also, you are missing some changes in this file in the changelog. >

Re: [PATCH V5] Use reg mode to move sub blocks for parameters and returns

2023-05-01 Thread Segher Boessenkool
Hi! On Fri, Mar 17, 2023 at 11:39:52AM +0800, Jiufu Guo wrote: > gcc/testsuite/ChangeLog: > > * gcc.target/powerpc/pr65421-1.c: New test. > * gcc.target/powerpc/pr65421.c: New test. Please name the tests something else? -1.c and -2.c maybe. Or something more inspired. Just not

Re: [PATCH] Turn on LRA on all targets

2023-04-29 Thread Segher Boessenkool
Hi! On Mon, Apr 24, 2023 at 11:46:50AM +0200, Uros Bizjak wrote: > On Mon, Apr 24, 2023 at 11:19 AM Segher Boessenkool > wrote: > > We still need someone to test this on alpha now, years later, and give > > a final okay, but hearing this is encouraging :-) > > Please no

Re: [PATCH] powerpc: Fix up *branch_anddi3_dot for -m32 -mpowerpc64 [PR109566]

2023-04-25 Thread Segher Boessenkool
Hi! On Mon, Apr 24, 2023 at 05:54:02PM +0200, Jakub Jelinek wrote: > The problem is that the *branch_anddi3_dot define_insn_and_split > relies on the *rotldi3_mask_dot define_insn_and_split being recognized > during splitting. The rs6000_is_valid_rotate_dot_mask function checks whether > the

Re: [PATCH] rs6000: Fix predicate for const vector in sldoi_to_mov [PR109069]

2023-04-24 Thread Segher Boessenkool
Hi! On Mon, Mar 27, 2023 at 04:09:39PM +0800, Kewen.Lin wrote: > As PR109069 shows, commit r12-6537-g080a06fcb076b3 which > introduces define_insn_and_split sldoi_to_mov adopts > easy_vector_constant for const vector of interest, but it's > wrong since predicate easy_vector_constant doesn't

Re: [PATCH] Turn on LRA on all targets

2023-04-24 Thread Segher Boessenkool
On Mon, Apr 24, 2023 at 10:19:23AM +0200, Richard Biener wrote: > On Sun, Apr 23, 2023 at 6:48 PM Segher Boessenkool > wrote: > > > > This minimal patch enables LRA for all targets. It does not clean up > > the target code, nor does it do anything to generic code: it jus

Re: [PATCH] Turn on LRA on all targets

2023-04-24 Thread Segher Boessenkool
On Sun, Apr 23, 2023 at 11:06:41PM +0200, Uros Bizjak wrote: > > I send this patch now so that people can start testing. I don't plan to > > commit this for another week at least, for a week after GCC 13 release I > > guess? How does that plan sound to people? > > An old patch to enable Alpha

Re: [PATCH] Turn on LRA on all targets

2023-04-23 Thread Segher Boessenkool
On Sun, Apr 23, 2023 at 07:56:56PM +0100, Maciej W. Rozycki wrote: > On Sun, 23 Apr 2023, Segher Boessenkool wrote: > > 1) Targets that already always have LRA, but that redefine the hook > > anyway. These are gcn, pdp11, rx, sparc, vax, and xtensa. Nothing > > really chan

Re: [PATCH] Turn on LRA on all targets

2023-04-23 Thread Segher Boessenkool
(You didn't leave me in Cc: on the reply. Maybe you did a reply-to-only-one-person?) On Sun, Apr 23, 2023 at 11:01:05AM -0600, Jeff Law via Gcc-patches wrote: > On 4/23/23 10:47, Segher Boessenkool wrote: > >3) Targets that as of yet never used LRA. Many of those will be fine,

Re: [PATCH] Turn on LRA on all targets

2023-04-23 Thread Segher Boessenkool
Hi! On Sun, Apr 23, 2023 at 02:36:05PM -0400, Paul Koning wrote: > > On Apr 23, 2023, at 12:47 PM, Segher Boessenkool > > wrote: > > 1) Targets that already always have LRA, but that redefine the hook > > anyway. These are gcn, pdp11, rx, sparc, vax, and xtensa. No

[PATCH] Turn on LRA on all targets

2023-04-23 Thread Segher Boessenkool
This minimal patch enables LRA for all targets. It does not clean up the target code, nor does it do anything to generic code: it just deletes all target definitions of TARGET_LRA_P. There are three kinds of changes: 1) Targets that already always have LRA, but that redefine the hook anyway.

Re: [PATCH v4 1/4] rs6000: Enable REE pass by default

2023-04-22 Thread Segher Boessenkool
Hi! Please look at and reply to that message, with answers to the questions? And make sure you are listed in MAINTAINERS before anything else. Thanks! Segher

Re: [PATCH v3 1/4] ree: Default ree pass for O2 and above for rs6000 target.

2023-04-19 Thread Segher Boessenkool
Hi! The subject should be something like rs6000: Enable REE pass by default (and no period at the end). On Wed, Apr 19, 2023 at 11:23:07PM +0530, Ajit Agarwal wrote: > This is the patch-1 for improving ree pass for rs6000 target. It actually just enables it :-) The mail body should be the

Re: [PATCH] testsuite: filter out warning noise for CWE-1341 test

2023-04-13 Thread Segher Boessenkool
On Thu, Apr 13, 2023 at 07:39:01AM +, Richard Biener wrote: > On Thu, 13 Apr 2023, Jiufu Guo wrote: > I think this should be fixed in the analyzer, "stripping" malloc > tracking from fopen/fclose since it does this manually. I've adjusted > the bug accordingly. Yeah. > > > +/* This case

Re: [PATCH] combine, v4: Fix AND handling for WORD_REGISTER_OPERATIONS targets [PR109040]

2023-04-13 Thread Segher Boessenkool
On Wed, Apr 12, 2023 at 10:05:08PM -0600, Jeff Law wrote: > On 4/12/23 10:58, Jakub Jelinek wrote: > >Seems my cross defaulted to 32-bit compilation, reproduced it with > >additional -mabi=lp64 -march=rv64gv even on the pr108947.c test. > >So, let's include that test in the patch too: > > >

Re: [PATCH] combine, v3: Fix AND handling for WORD_REGISTER_OPERATIONS targets [PR109040]

2023-04-12 Thread Segher Boessenkool
On Wed, Apr 12, 2023 at 12:02:12PM +0200, Jakub Jelinek wrote: > On Wed, Apr 12, 2023 at 08:21:26AM +0200, Jakub Jelinek via Gcc-patches wrote: > > I would have expected something like > > WORD_REGISTER_OPERATIONS && known_le (GET_MODE_PRECISION (mode), > > BITS_PER_WORD) > > as the condition to

Re: [PATCH] testsuite: update requires for powerpc/float128-cmp2-runnable.c

2023-04-11 Thread Segher Boessenkool
On Tue, Apr 11, 2023 at 05:40:09PM +0800, Kewen.Lin wrote: > on 2023/4/11 17:14, guojiufu wrote: > > Thanks for raising this concern. > > The behavior to check about bif on FLOAT128_HW and emit an error message for > > requirements on quad-precision is added in gcc12. This is why gcc12 fails to >

Re: [PATCH, V3] PR target/70243 - Do not generate vmaddfp or vnmsubdp

2023-04-08 Thread Segher Boessenkool
Hi! On Sat, Apr 08, 2023 at 09:34:51AM -0400, Michael Meissner wrote: > The Altivec instructions vmaddfp and vnmsubfp have different rounding > behaviors > than the VSX xvmaddsp and xvnmsubsp instructions. In particular, generating > these instructions seems to break Eigen on big endian

Re: [PATCH, V2] PR target/70243: Do not generate vmaddfp and vnmsubfp

2023-04-07 Thread Segher Boessenkool
Hi! On Fri, Apr 07, 2023 at 02:34:01AM -0400, Michael Meissner wrote: > As we discussed in a private chat room, I modified the code to generate > vmaddfp > and vnmsubfp if -Ofast (-ffast-math) is used. As I said, that is no good. > This allows the compiler to > eliminate the extra move if the

Re: PR target/70243: Do not generate fmaddfp and fnmsubfp

2023-04-07 Thread Segher Boessenkool
Hi! On Fri, Apr 07, 2023 at 02:32:04AM -0400, Michael Meissner wrote: > On Thu, Apr 06, 2023 at 03:37:59PM -0500, Segher Boessenkool wrote: > > > This patch eliminates the generation of the Altivec fmaddfp and fnmsubfp > > > instructions as alternatives in the VSX ins

Re: PR target/70243: Do not generate fmaddfp and fnmsubfp

2023-04-06 Thread Segher Boessenkool
Hi! On Thu, Apr 06, 2023 at 11:12:11AM -0400, Michael Meissner wrote: > The Altivec instructions fmaddfp and fnmsubfp have different rounding > behaviors Those are not existing instructions. You mean "vmaddfp" etc. > than the VSX xvmaddsp and xvnmsubsp instructions. In particular, generating

Re: [RFA][Bug target/108892 ][13 regression] Force re-recognition after changing RTL structure of an insn

2023-04-05 Thread Segher Boessenkool
Hi again, On Wed, Apr 05, 2023 at 11:43:30AM -0600, Jeff Law wrote: > On 4/5/23 11:38, Segher Boessenkool wrote: > >Right. But it seems to me it has been there all those years? Does the > >new testcase fail on older branches? Even if not, it seems clear it is > >wrong

Re: [RFA][Bug target/108892 ][13 regression] Force re-recognition after changing RTL structure of an insn

2023-04-05 Thread Segher Boessenkool
On Wed, Apr 05, 2023 at 09:07:30AM -0600, Jeff Law wrote: > On 4/5/23 08:21, Segher Boessenkool wrote: > >On Wed, Mar 29, 2023 at 07:48:00AM -0600, Jeff Law wrote: > >>So as mentioned in the PR the underlying issue here is combine changes > >>the form of an existing

Re: [RFA][Bug target/108892 ][13 regression] Force re-recognition after changing RTL structure of an insn

2023-04-05 Thread Segher Boessenkool
Hi! On Wed, Mar 29, 2023 at 07:48:00AM -0600, Jeff Law wrote: > So as mentioned in the PR the underlying issue here is combine changes > the form of an existing insn, but fails to force re-recognition. As a > result other parts of the compiler blow up. [snip] > The fix is trivial, reset the
