[Bug tree-optimization/37312] -Os significantly faster than -O2 on test case with -funroll-all-loops

2008-09-02 Thread pinskia at gcc dot gnu dot org
--- Comment #6 from pinskia at gcc dot gnu dot org 2008-09-02 20:45 --- The main difference between -O2 and -Os is that csum_partial is inlined for -Os and unrolling is disabled for -Os. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37312

[Bug tree-optimization/37312] -Os significantly faster than -O2 on test case with -funroll-all-loops

2008-09-01 Thread pinskia at gmail dot com
--- Comment #5 from pinskia at gmail dot com 2008-09-01 20:41 --- Subject: Re: -Os significantly faster than -O2 on test case This is mostly because of extra register moves that IRA sometimes introduces. There is another bug about inline asm and the return register.

Re: [Bug tree-optimization/37312] -Os significantly faster than -O2 on test case

2008-09-01 Thread Andrew Thomas Pinski
This is mostly because of extra register moves that IRA sometimes introduces. There is another bug about inline asm and the return register. On Sep 1, 2008, at 7:36, "rguenth at gcc dot gnu dot org" <[EMAIL PROTECTED]> wrote: --- Comment #4 from rguenth at gcc

[Bug tree-optimization/37312] -Os significantly faster than -O2 on test case

2008-09-01 Thread rguenth at gcc dot gnu dot org
--- Comment #4 from rguenth at gcc dot gnu dot org 2008-09-01 14:36 --- Well, now -Os -funroll-all-loops doesn't do any unrolling anymore while it did before. With -O2 you get what you ask for - unrolled loops. -funroll-all-loops isn't really a flag to be used in general.

[Bug tree-optimization/37312] -Os significantly faster than -O2 on test case

2008-09-01 Thread andi-gcc at firstfloor dot org
--- Comment #3 from andi-gcc at firstfloor dot org 2008-09-01 14:20 --- Thanks for the us^whelpful comment. If you can suggest a way to do carry-preserving addition without inline assembler that would be fine, otherwise not. -Os seems to do something that improves it at least (and that
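For readers following the thread, the carry-preserving addition asked about in comment #3 can be sketched in portable C. This is only an illustration, not the code from the attached test case and not what the Linux kernel uses: the names add32_with_carry, fold32 and csum_words are hypothetical, and whether GCC turns this pattern into an add/adc sequence depends on the compiler version and target.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: 32-bit addition with end-around carry
 * (one's-complement style). Unsigned overflow wraps in C, so a
 * wrapped result is detected by sum < b; the lost carry bit is
 * then added back into the low end. */
static inline uint32_t add32_with_carry(uint32_t a, uint32_t b)
{
    uint32_t sum = a + b;
    return sum + (sum < b);
}

/* Hypothetical helper: fold a 32-bit partial sum into 16 bits.
 * The second step absorbs the carry the first step may produce. */
static inline uint16_t fold32(uint32_t sum)
{
    sum = (sum & 0xffffu) + (sum >> 16);
    sum = (sum & 0xffffu) + (sum >> 16);
    return (uint16_t)sum;
}

/* Hypothetical loop: accumulate n 32-bit words into a running sum
 * without inline assembly, leaving unrolling and scheduling to the
 * compiler. */
static uint32_t csum_words(const uint32_t *words, size_t n, uint32_t sum)
{
    for (size_t i = 0; i < n; i++)
        sum = add32_with_carry(sum, words[i]);
    return sum;
}
```

A caller would pass the buffer as 32-bit words and apply fold32 to the result. Accumulating into a uint64_t and folding once at the end is another common way to avoid tracking the carry on every step.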

[Bug tree-optimization/37312] -Os significantly faster than -O2 on test case

2008-09-01 Thread rguenth at gcc dot gnu dot org
--- Comment #2 from rguenth at gcc dot gnu dot org 2008-09-01 13:42 --- Uh, well. The code is mostly inline assembly which doesn't give GCC much freedom to do something. I guess -O2 simply optimizes "too much" around the asm. Try not using inline assembly instead.

[Bug tree-optimization/37312] -Os significantly faster than -O2 on test case

2008-09-01 Thread andi-gcc at firstfloor dot org
--- Comment #1 from andi-gcc at firstfloor dot org 2008-09-01 11:22 --- Created an attachment (id=16178) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16178&action=view) test case checksum functions extracted from the Linux kernel. Not preprocessed, but should compile on any x86