[Bug target/77308] surprisingly large stack usage for sha512 on arm

2019-08-23 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 Wilco changed: What|Removed |Added Status|NEW |RESOLVED Resolution|---

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2018-11-19 Thread edlinger at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 Bernd Edlinger changed: What|Removed |Added Known to work||8.1.0 --- Comment #67 from Bernd

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2018-11-19 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 Martin Liška changed: What|Removed |Added CC||marxin at gcc dot gnu.org --- Comment

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2017-09-06 Thread edlinger at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #65 from Bernd Edlinger --- Author: edlinger Date: Wed Sep 6 07:47:52 2017 New Revision: 251752 URL: https://gcc.gnu.org/viewcvs?rev=251752&root=gcc&view=rev Log: 2017-09-06 Bernd Edlinger PR

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2017-09-04 Thread edlinger at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #64 from Bernd Edlinger --- Author: edlinger Date: Mon Sep 4 15:25:59 2017 New Revision: 251663 URL: https://gcc.gnu.org/viewcvs?rev=251663&root=gcc&view=rev Log: 2017-09-04 Bernd Edlinger PR

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-11-17 Thread edlinger at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #63 from Bernd Edlinger --- Author: edlinger Date: Thu Nov 17 13:47:24 2016 New Revision: 242549 URL: https://gcc.gnu.org/viewcvs?rev=242549&root=gcc&view=rev Log: 2016-11-17 Bernd Edlinger PR

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-11-09 Thread bernd.edlinger at hotmail dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #62 from Bernd Edlinger --- Both parts of the patch are now posted for review: https://gcc.gnu.org/ml/gcc-patches/2016-11/msg00523.html https://gcc.gnu.org/ml/gcc-patches/2016-11/msg00830.html

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-11-03 Thread bernd.edlinger at hotmail dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #61 from Bernd Edlinger --- Created attachment 39958 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39958&action=edit patch for enabling ldrdstrd peephole And this is what I will bootstrap in the next cycle. It will enable all

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-11-03 Thread bernd.edlinger at hotmail dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 Bernd Edlinger changed: What|Removed |Added Attachment #39940|0 |1 is obsolete|

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-11-03 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #59 from wilco at gcc dot gnu.org --- (In reply to Bernd Edlinger from comment #58) > (In reply to wilco from comment #57) > > (In reply to Bernd Edlinger from comment #56) > > > Agreed, I can split the patch. > > > > > > From what I

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-11-03 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #57 from wilco at gcc dot gnu.org --- (In reply to Bernd Edlinger from comment #56) > (In reply to wilco from comment #55) > > (In reply to Bernd Edlinger from comment #39) > > > Created attachment 39940 [details] > > > proposed

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-11-03 Thread bernd.edlinger at hotmail dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #58 from Bernd Edlinger --- (In reply to wilco from comment #57) > (In reply to Bernd Edlinger from comment #56) > > Agreed, I can split the patch. > > > > From what I understand, we should never emit ldrd/strd out of > > the

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-11-03 Thread bernd.edlinger at hotmail dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #56 from Bernd Edlinger --- (In reply to wilco from comment #55) > (In reply to Bernd Edlinger from comment #39) > > Created attachment 39940 [details] > > proposed patch, v2 > > > > last upload was accidentally truncated. > >

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-11-03 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #55 from wilco at gcc dot gnu.org --- (In reply to Bernd Edlinger from comment #39) > Created attachment 39940 [details] > proposed patch, v2 > > last upload was accidentally truncated. > uploaded the right patch. Right so looking

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-11-02 Thread bernd.edlinger at hotmail dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #54 from Bernd Edlinger --- (In reply to richard.earnshaw from comment #53) > On 02/11/16 11:57, bernd.edlinger at hotmail dot de wrote: > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 > > > > --- Comment #52 from Bernd

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-11-02 Thread richard.earnshaw at arm dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #53 from richard.earnshaw at arm dot com --- On 02/11/16 11:57, bernd.edlinger at hotmail dot de wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 > > --- Comment #52 from Bernd Edlinger --- > (In reply to wilco from

Re: [Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-11-02 Thread Richard Earnshaw (lists)
On 02/11/16 11:57, bernd.edlinger at hotmail dot de wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 > > --- Comment #52 from Bernd Edlinger --- > (In reply to wilco from comment #51) >> >> Indeed, that's the reason behind the existing check. However it disables all >> profitable

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-11-02 Thread bernd.edlinger at hotmail dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #52 from Bernd Edlinger --- (In reply to wilco from comment #51) > > Indeed, that's the reason behind the existing check. However it disables all > profitable bswap cases while still generating unaligned accesses if no bswap > is

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-11-02 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #51 from wilco at gcc dot gnu.org --- (In reply to Bernd Edlinger from comment #49) > (In reply to Bernd Edlinger from comment #48) > > (In reply to wilco from comment #22) > > > > > > Anyway, there is another bug: on AArch64 we

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-11-02 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #50 from Richard Earnshaw --- (In reply to wilco from comment #47) > (In reply to Richard Earnshaw from comment #46) > > (In reply to wilco from comment #44) > > > (In reply to Bernd Edlinger from comment #38) > > > > Created

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-11-02 Thread bernd.edlinger at hotmail dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #49 from Bernd Edlinger --- (In reply to Bernd Edlinger from comment #48) > (In reply to wilco from comment #22) > > > > Anyway, there is another bug: on AArch64 we correctly recognize there are 8 > > 1-byte loads, shifts and orrs

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-11-02 Thread bernd.edlinger at hotmail dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #48 from Bernd Edlinger --- (In reply to wilco from comment #22) > > Anyway, there is another bug: on AArch64 we correctly recognize there are 8 > 1-byte loads, shifts and orrs which can be replaced by a single 8-byte load > and a
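
The transformation described in the quoted comment #22 can be illustrated with a short C sketch. It is not taken from the bug report: the function names are hypothetical, `__builtin_bswap64` and the `__BYTE_ORDER__` macros are GCC extensions, and the sketch assumes the truncated sentence continues with a byte swap (the neighbouring comments talk about "bswap cases"). The second function only shows the shape of the code the optimization is meant to produce.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: big-endian 64-bit load written as eight 1-byte
   loads combined with shifts and ORs. */
static uint64_t load_be64_bytes(const unsigned char *p)
{
    assert(p != NULL);
    return ((uint64_t)p[0] << 56) | ((uint64_t)p[1] << 48) |
           ((uint64_t)p[2] << 40) | ((uint64_t)p[3] << 32) |
           ((uint64_t)p[4] << 24) | ((uint64_t)p[5] << 16) |
           ((uint64_t)p[6] << 8)  |  (uint64_t)p[7];
}

/* Hypothetical helper: the same value from one 8-byte load plus a byte
   swap. memcpy is used so that p need not be 8-byte aligned. */
static uint64_t load_be64_bswap(const unsigned char *p)
{
    uint64_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
    v = __builtin_bswap64(v);   /* little-endian host: swap is needed */
#endif
    return v;
}
```

Both functions return the same value for any 8-byte buffer; they differ only in how many memory accesses the source asks for.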

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-11-01 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #47 from wilco at gcc dot gnu.org --- (In reply to Richard Earnshaw from comment #46) > (In reply to wilco from comment #44) > > (In reply to Bernd Edlinger from comment #38) > > > Created attachment 39939 [details] > > > proposed

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-11-01 Thread bernd.edlinger at hotmail dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #45 from Bernd Edlinger --- (In reply to wilco from comment #44) > (In reply to Bernd Edlinger from comment #38) > > Created attachment 39939 [details] > > proposed patch, v2 > > > > > Unlike the previous patch, thumb1 stack usage

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-11-01 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #46 from Richard Earnshaw --- (In reply to wilco from comment #44) > (In reply to Bernd Edlinger from comment #38) > > Created attachment 39939 [details] > > proposed patch, v2 > > > > > Unlike the previous patch, thumb1 stack

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-11-01 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #44 from wilco at gcc dot gnu.org --- (In reply to Bernd Edlinger from comment #38) > Created attachment 39939 [details] > proposed patch, v2 > > Unlike the previous patch, thumb1 stack usage stays at 1588 bytes, > because thumb1

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-11-01 Thread bernd.edlinger at hotmail dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #43 from Bernd Edlinger --- (In reply to wilco from comment #41) > > ARM only uses the 2nd alternative (set_attr "arch" "any,a,t2,t2"), so this > is correct. There is no need to support this pattern for ARM as ARM doesn't > have

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-11-01 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #42 from wilco at gcc dot gnu.org --- (In reply to Bernd Edlinger from comment #40) > BTW: I found something strange in this pattern in neon.md: > > (define_insn_and_split "orndi3_neon" > [(set (match_operand:DI 0

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-11-01 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #41 from wilco at gcc dot gnu.org --- (In reply to Bernd Edlinger from comment #40) > BTW: I found something strange in this pattern in neon.md: > > (define_insn_and_split "orndi3_neon" > [(set (match_operand:DI 0

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-11-01 Thread bernd.edlinger at hotmail dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #40 from Bernd Edlinger --- BTW: I found something strange in this pattern in neon.md: (define_insn_and_split "orndi3_neon" [(set (match_operand:DI 0 "s_register_operand" "=w,?,?,?") (ior:DI (not:DI (match_operand:DI 2

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-11-01 Thread bernd.edlinger at hotmail dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 Bernd Edlinger changed: What|Removed |Added Attachment #39939|0 |1 is obsolete|

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-11-01 Thread bernd.edlinger at hotmail dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 Bernd Edlinger changed: What|Removed |Added Attachment #39898|0 |1 is obsolete|

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-11-01 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #37 from Richard Earnshaw --- (In reply to Bernd Edlinger from comment #34) > (In reply to Richard Earnshaw from comment #33) > > The logic is certainly strange. Some cores run LDRD less quickly than they > > can do LDM, or even

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-11-01 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #36 from wilco at gcc dot gnu.org --- (In reply to Bernd Edlinger from comment #34) > (In reply to Richard Earnshaw from comment #33) > > (In reply to Wilco from comment #32) > > > (In reply to Bernd Edlinger from comment #31) > > > >

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-11-01 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #35 from wilco at gcc dot gnu.org --- (In reply to Richard Earnshaw from comment #30) > (In reply to wilco from comment #29) > > Combine could help with > > merging 2 loads/stores into a single instruction. > > No, combine works

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-11-01 Thread bernd.edlinger at hotmail dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #34 from Bernd Edlinger --- (In reply to Richard Earnshaw from comment #33) > (In reply to Wilco from comment #32) > > (In reply to Bernd Edlinger from comment #31) > > > Furthermore, if I want to do -Os the third condition is FALSE

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-11-01 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #33 from Richard Earnshaw --- (In reply to Wilco from comment #32) > (In reply to Bernd Edlinger from comment #31) > > Furthermore, if I want to do -Os the third condition is FALSE too. > > But one ldrd must be shorter than two ldr ?

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-10-31 Thread wdijkstr at arm dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #32 from Wilco --- (In reply to Bernd Edlinger from comment #31) > Sure, combine can't help, especially because it runs before split1. > > But I wondered why this peephole2 is not enabled: > > (define_peephole2 ; ldrd > [(set

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-10-31 Thread bernd.edlinger at hotmail dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #31 from Bernd Edlinger --- Sure, combine can't help, especially because it runs before split1. But I wondered why this peephole2 is not enabled: (define_peephole2 ; ldrd [(set (match_operand:SI 0 "arm_general_register_operand"

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-10-31 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #30 from Richard Earnshaw --- (In reply to wilco from comment #29) > Combine could help with > merging 2 loads/stores into a single instruction. No, combine works strictly on dataflow dependencies. Two stores cannot be dataflow

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-10-31 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #29 from wilco at gcc dot gnu.org --- (In reply to Bernd Edlinger from comment #28) > With my latest patch I bootstrapped a configuration with > --with-arch=armv7-a --with-tune=cortex-a9 --with-fpu=vfpv3-d16 > --with-float=hard > > I

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-10-31 Thread bernd.edlinger at hotmail dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #28 from Bernd Edlinger --- With my latest patch I bootstrapped a configuration with --with-arch=armv7-a --with-tune=cortex-a9 --with-fpu=vfpv3-d16 --with-float=hard I noticed a single regression in gcc.target/arm/pr53447-*.c That

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-10-28 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #27 from wilco at gcc dot gnu.org --- (In reply to Bernd Edlinger from comment #26) > (In reply to wilco from comment #25) > > > > Alternatives can be disabled, there are flags, eg: > > > > (set_attr "arch"

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-10-27 Thread bernd.edlinger at hotmail dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #26 from Bernd Edlinger --- (In reply to wilco from comment #25) > > Alternatives can be disabled, there are flags, eg: > > (set_attr "arch" "neon_for_64bits,*,*,avoid_neon_for_64bits") > Ok I see, thanks. Still lots of insns

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-10-27 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #25 from wilco at gcc dot gnu.org --- (In reply to Bernd Edlinger from comment #24) > (In reply to Bernd Edlinger from comment #23) > > @@ -5020,7 +5020,7 @@ > > (define_insn_and_split "one_cmpldi2" > >[(set (match_operand:DI 0

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-10-27 Thread bernd.edlinger at hotmail dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #24 from Bernd Edlinger --- (In reply to Bernd Edlinger from comment #23) > @@ -5020,7 +5020,7 @@ > (define_insn_and_split "one_cmpldi2" >[(set (match_operand:DI 0 "s_register_operand" "=w,,,?w") > (not:DI

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-10-27 Thread bernd.edlinger at hotmail dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #23 from Bernd Edlinger --- (In reply to wilco from comment #22) > > What I meant is that your patch still makes a large difference on the > original test case despite making no difference in simple cases like the > above. For sure

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-10-27 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #22 from wilco at gcc dot gnu.org --- (In reply to Bernd Edlinger from comment #21) > (In reply to wilco from comment #20) > > > Wilco, where have you seen the additional registers used with my > > > previous patch, maybe we can try

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-10-27 Thread bernd.edlinger at hotmail dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #21 from Bernd Edlinger --- (In reply to wilco from comment #20) > > Wilco, where have you seen the additional registers used with my > > previous patch, maybe we can try to fix that somehow? > > What happens is that the move of

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-10-26 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #20 from wilco at gcc dot gnu.org --- (In reply to Bernd Edlinger from comment #19) > I think the problem with anddi iordi and xordi instructions is that > they obscure the data flow between low and high half words. > When they are

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-10-26 Thread bernd.edlinger at hotmail dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #19 from Bernd Edlinger --- I think the problem with anddi iordi and xordi instructions is that they obscure the data flow between low and high half words. When they are not enabled, we have the low and high parts expanded
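
As an illustration of the point in comment #19, a 64-bit bitwise operation on a 32-bit target is really two independent 32-bit operations. The sketch below is not from the bug report; the type and function names are hypothetical, and it models the two registers in plain C rather than in GCC's machine description.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 64-bit (DImode) value held in two 32-bit
   registers. */
struct halves {
    uint32_t lo;
    uint32_t hi;
};

static struct halves split64(uint64_t x)
{
    struct halves h;
    h.lo = (uint32_t)x;
    h.hi = (uint32_t)(x >> 32);
    return h;
}

static uint64_t join64(struct halves h)
{
    return ((uint64_t)h.hi << 32) | h.lo;
}

/* XOR computed per half: r.lo depends only on the low inputs and r.hi
   only on the high inputs, so the halves can be allocated separately. */
static uint64_t xor64_by_halves(uint64_t a, uint64_t b)
{
    struct halves ha = split64(a), hb = split64(b), r;
    r.lo = ha.lo ^ hb.lo;
    r.hi = ha.hi ^ hb.hi;
    assert(join64(r) == (a ^ b));   /* sanity check against native XOR */
    return join64(r);
}
```

The same decomposition applies to AND and OR. Reading the truncated comment as saying that a single 64-bit pattern hides this independence from earlier passes is an assumption.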

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-10-26 Thread bernd.edlinger at hotmail dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #18 from Bernd Edlinger --- Created attachment 39898 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39898&action=edit proposed patch This disables problematic di patterns when no fpu is used, and there is absolutely no chance that

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-10-26 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 wilco at gcc dot gnu.org changed: What|Removed |Added CC||wilco at gcc dot gnu.org ---

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-10-25 Thread bernd.edlinger at hotmail dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #16 from Bernd Edlinger --- Wow. look at this: Index: arm.md === --- arm.md (revision 241539) +++ arm.md (working copy) @@ -448,7 +448,7 @@

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-10-25 Thread bernd.edlinger at hotmail dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #15 from Bernd Edlinger --- (In reply to Wilco from comment #14) > (In reply to Bernd Edlinger from comment #13) > > I am still trying to understand why thumb1 seems to outperform thumb2. > > > > Obviously thumb1 does not have the

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-10-25 Thread wdijkstr at arm dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #14 from Wilco --- (In reply to Bernd Edlinger from comment #13) > I am still trying to understand why thumb1 seems to outperform thumb2. > > Obviously thumb1 does not have the shiftdi3 pattern, > but even if I remove these from

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-10-25 Thread bernd.edlinger at hotmail dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #13 from Bernd Edlinger --- I am still trying to understand why thumb1 seems to outperform thumb2. Obviously thumb1 does not have the shiftdi3 pattern, but even if I remove these from thumb2, the result is still not on par with thumb1.
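
For reference, a 64-bit left shift can be written in terms of two 32-bit halves as in the following sketch. This is a generic illustration with a hypothetical function name, not the expansion that GCC's shiftdi3 pattern performs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: 64-bit left shift built from 32-bit operations. */
static uint64_t shl64_by_halves(uint64_t x, unsigned n)
{
    uint32_t lo = (uint32_t)x;
    uint32_t hi = (uint32_t)(x >> 32);
    uint32_t out_lo, out_hi;

    assert(n < 64);   /* shifting a 64-bit value by 64 or more is undefined */

    if (n == 0) {
        out_lo = lo;
        out_hi = hi;
    } else if (n < 32) {
        /* Bits shifted out of the low half carry into the high half. */
        out_hi = (hi << n) | (lo >> (32 - n));
        out_lo = lo << n;
    } else {
        /* The low half moves entirely into the high half. */
        out_hi = lo << (n - 32);
        out_lo = 0;
    }
    return ((uint64_t)out_hi << 32) | out_lo;
}
```

Unlike a bitwise AND, OR or XOR, the high output half here depends on both input halves whenever 0 < n < 32.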

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-10-20 Thread wdijkstr at arm dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #12 from Wilco --- It looks like we need a different approach, I've seen the extra SETs use up more registers in some cases, and in other cases being optimized away early on... Doing shift expansion at the same time as all other DI

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-10-17 Thread edlinger at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #11 from Bernd Edlinger --- Author: edlinger Date: Mon Oct 17 17:46:59 2016 New Revision: 241273 URL: https://gcc.gnu.org/viewcvs?rev=241273&root=gcc&view=rev Log: 2016-10-17 Bernd Edlinger PR

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-08-23 Thread wdijkstr at arm dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 Wilco changed: What|Removed |Added CC||wdijkstr at arm dot com --- Comment #10 from

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-08-23 Thread ktkachov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 ktkachov at gcc dot gnu.org changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed|

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-08-22 Thread bernd.edlinger at hotmail dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #8 from Bernd Edlinger --- Analyzing the different thumb1/2 reload dumps, I see t2 often uses code like this to access spill slots: (insn 11576 8090 9941 5 (set (reg:SI 3 r3 [11890]) (plus:SI (reg/f:SI 13 sp)

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-08-22 Thread bernd.edlinger at hotmail dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #7 from Bernd Edlinger --- Even more surprising is this: While thumb2 code (-march=armv6t2 -mthumb) has about the same stack size as arm code (-marm), thumb1 code has only 1588 bytes stack, and it does not change with

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-08-21 Thread bernd.edlinger at hotmail dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 Bernd Edlinger changed: What|Removed |Added CC||vmakarov at gcc dot gnu.org ---

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-08-21 Thread bernd.edlinger at hotmail dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #5 from Bernd Edlinger --- Now I try to clear the out register when the shift < 32 Index: gcc/config/arm/arm.c === --- gcc/config/arm/arm.c(revision 239624)

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-08-21 Thread bernd.edlinger at hotmail dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #4 from Bernd Edlinger --- hmm, when I compare aarch64 vs. arm sha512.c.260r.reload with -O3 -fno-schedule-insns I see a big difference: aarch64 has only few spill regs subreg regs: Slot 0 regnos (width = 8): 856 Slot 1

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-08-21 Thread bernd.edlinger at hotmail dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 Bernd Edlinger changed: What|Removed |Added CC||bernd.edlinger at hotmail dot de ---

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-08-21 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #2 from Andrew Pinski --- For aarch64, the stack size is just 208 bytes.

[Bug target/77308] surprisingly large stack usage for sha512 on arm

2016-08-21 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308 --- Comment #1 from Andrew Pinski --- Does -fno-schedule-insns help? Sometimes the scheduler before the register allocator causes register pressure and forces more register spills.