Re: [PATCH][aarch64] Fix target/pr77729 - missed optimization related to zero extension

2017-10-24 Thread Steve Ellcey
Ping.  There was discussion of larger fixes for this including a type promotion pass but this patch seems small, safe, and in line with other platforms (like arm32).
Steve Ellcey
sell...@cavium.com
On Thu, 2017-09-14 at 11:43 -0700, Steve Ellcey wrote:
> On Thu, 2017-09-14 at 11:53 -0600, Jeff

Re: [PATCH][aarch64] Fix target/pr77729 - missed optimization related to zero extension

2017-09-14 Thread Steve Ellcey
On Thu, 2017-09-14 at 11:53 -0600, Jeff Law wrote:
>
> And I think that's starting to zero in on the problem --
> WORD_REGISTER_OPERATIONS is zero on aarch64 as you don't get extension
> to word_mode for W form registers.
>
> I wonder if what needs to happen is somehow look to extend that

Re: [PATCH][aarch64] Fix target/pr77729 - missed optimization related to zero extension

2017-09-14 Thread Jeff Law
On 09/14/2017 10:33 AM, Steve Ellcey wrote:
> On Thu, 2017-09-14 at 09:03 -0600, Jeff Law wrote:
>> On 09/13/2017 03:46 PM, Steve Ellcey wrote:
>>>
>>> In arm32 rtl expansion, when reading the QI memory location, I see
>>> these instructions get generated:
>>>
>>> (insn 10 3 11 2 (set (reg:SI

Re: [PATCH][aarch64] Fix target/pr77729 - missed optimization related to zero extension

2017-09-14 Thread Jeff Law
On 09/14/2017 10:33 AM, Steve Ellcey wrote:
> On Thu, 2017-09-14 at 09:03 -0600, Jeff Law wrote:
>> On 09/13/2017 03:46 PM, Steve Ellcey wrote:
>>>
>>> In arm32 rtl expansion, when reading the QI memory location, I see
>>> these instructions get generated:
>>>
>>> (insn 10 3 11 2 (set (reg:SI

Re: [PATCH][aarch64] Fix target/pr77729 - missed optimization related to zero extension

2017-09-14 Thread Steve Ellcey
On Thu, 2017-09-14 at 09:03 -0600, Jeff Law wrote:
> On 09/13/2017 03:46 PM, Steve Ellcey wrote:
> >
> > In arm32 rtl expansion, when reading the QI memory location, I see
> > these instructions get generated:
> >
> > (insn 10 3 11 2 (set (reg:SI 119)
> > (zero_extend:SI (mem:QI

Re: [PATCH][aarch64] Fix target/pr77729 - missed optimization related to zero extension

2017-09-14 Thread Jeff Law
On 09/13/2017 03:46 PM, Steve Ellcey wrote:
> On Wed, 2017-09-13 at 14:46 -0500, Segher Boessenkool wrote:
>> On Wed, Sep 13, 2017 at 06:13:50PM +0100, Kyrill Tkachov wrote:
>>>
>>> We are usually hesitant to add explicit subreg matching in the MD pattern
>>> (though I don't remember if there's

Re: [PATCH][aarch64] Fix target/pr77729 - missed optimization related to zero extension

2017-09-13 Thread Segher Boessenkool
On Wed, Sep 13, 2017 at 10:41:33PM +0000, Wilco Dijkstra wrote:
> Steve Ellcey wrote:
>
> > And in aarch64 rtl expansion I see:
> >
> > (insn 10 9 11 (set (reg:QI 81)
> > (mem:QI (reg/v/f:DI 80 [ string ]) [0 *string_9(D)+0 S1 A8]))
> > "pr77729.c":3 -1
> > (nil))
>
> Yes using

Re: [PATCH][aarch64] Fix target/pr77729 - missed optimization related to zero extension

2017-09-13 Thread Steve Ellcey
On Wed, 2017-09-13 at 22:39 +0000, Wilco Dijkstra wrote:
> Steve Ellcey wrote:
>
> > And in aarch64 rtl expansion I see:
> >
> > (insn 10 9 11 (set (reg:QI 81)
> > (mem:QI (reg/v/f:DI 80 [ string ]) [0 *string_9(D)+0 S1 A8])) "pr77729.c":3 -1
> > (nil))
>
> Yes using QI/HI mode

Re: [PATCH][aarch64] Fix target/pr77729 - missed optimization related to zero extension

2017-09-13 Thread Wilco Dijkstra
Steve Ellcey wrote:
> And in aarch64 rtl expansion I see:
>
> (insn 10 9 11 (set (reg:QI 81)
> (mem:QI (reg/v/f:DI 80 [ string ]) [0 *string_9(D)+0 S1 A8]))
> "pr77729.c":3 -1
> (nil))
Yes using QI/HI mode anywhere in the RTL seems perverse and incorrect given AArch64 doesn't

Re: [PATCH][aarch64] Fix target/pr77729 - missed optimization related to zero extension

2017-09-13 Thread Steve Ellcey
On Wed, 2017-09-13 at 14:46 -0500, Segher Boessenkool wrote:
> On Wed, Sep 13, 2017 at 06:13:50PM +0100, Kyrill Tkachov wrote:
> >
> > We are usually hesitant to add explicit subreg matching in the MD pattern
> > (though I don't remember if there's a hard rule against it).
>
> In this case this

Re: [PATCH][aarch64] Fix target/pr77729 - missed optimization related to zero extension

2017-09-13 Thread Segher Boessenkool
Hi!
On Wed, Sep 13, 2017 at 06:13:50PM +0100, Kyrill Tkachov wrote:
> +;; Specialized OR instruction for combiner. The AND is masking out bits
> +;; not needed in the OR (doing a zero_extend). The zero_extend is not
> +;; needed because we know from the subreg that the upper part of the reg
>

Re: [PATCH][aarch64] Fix target/pr77729 - missed optimization related to zero extension

2017-09-13 Thread Kyrill Tkachov
Hi Steve,
On 13/09/17 17:50, Steve Ellcey wrote:
This is a patch for PR target/77729 on aarch64. The code is doing an unneeded zero extend ('uxtb' in the original report, 'and' in the ToT sources). The patch looks a bit odd, it is a specialized define_insn for the combine pass. At some

[PATCH][aarch64] Fix target/pr77729 - missed optimization related to zero extension

2017-09-13 Thread Steve Ellcey
This is a patch for PR target/77729 on aarch64.  The code is doing an unneeded zero extend ('uxtb' in the original report, 'and' in the ToT sources). The patch looks a bit odd, it is a specialized define_insn for the combine pass.  At some point in combine (I never did find out where), the
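
For readers skimming the thread, the kind of redundancy being discussed can be sketched in C. This is only an illustration and not the testcase from PR target/77729: the function names or_byte_masked and or_byte_plain are hypothetical. On AArch64 a byte load (ldrb) already leaves the upper bits of the W register zero, so masking the loaded value with 0xff before the OR changes nothing; the two functions below always return the same result.

```c
#include <stdint.h>

/* Hypothetical helper (not from the PR): load a byte, then apply an
   explicit 0xff mask before the OR.  The mask stands in for the
   redundant 'uxtb' / 'and' mentioned in the thread. */
uint32_t or_byte_masked(const uint8_t *p)
{
    uint32_t v = *p;              /* byte load: value is already in 0..255 */
    return (v & 0xffu) | 0x20u;   /* mask cannot clear any set bit here */
}

/* Hypothetical helper (not from the PR): the same computation without
   the mask, which is what one would like the compiler to emit. */
uint32_t or_byte_plain(const uint8_t *p)
{
    uint32_t v = *p;              /* same zero-extended byte load */
    return v | 0x20u;
}
```

Because the two functions agree on every input byte, a compiler is free to drop the mask; the thread is about why GCC's combine pass did not do so on aarch64 at the time.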