[Bug rtl-optimization/17236] inefficient code for long long multiply on x86

2008-09-06 Thread ubizjak at gmail dot com
--- Comment #30 from ubizjak at gmail dot com 2008-09-06 16:11 ---
Current mainline (4.4.0 20080906) produces:
        pushl   %ebx
        movl    8(%ebp), %eax
        movl    16(%ebp), %edx
        movl    20(%ebp), %ecx
        movl    12(%ebp), %ebx
        imull   %eax, %ecx

[Bug rtl-optimization/17236] inefficient code for long long multiply on x86

2008-09-06 Thread ubizjak at gmail dot com
--- Comment #31 from ubizjak at gmail dot com 2008-09-06 16:18 --- *** Bug 6585 has been marked as a duplicate of this bug. *** -- ubizjak at gmail dot com changed: What|Removed |Added

[Bug rtl-optimization/17236] inefficient code for long long multiply on x86

2008-09-06 Thread ubizjak at gmail dot com
-- ubizjak at gmail dot com changed:
           What    |Removed |Added
   Target Milestone|---     |4.4.0
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17236

[Bug rtl-optimization/17236] inefficient code for long long multiply on x86

2008-03-07 Thread bonzini at gnu dot org
--- Comment #29 from bonzini at gnu dot org 2008-03-07 08:26 --- ira branch produces the same code as with my patch. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17236

[Bug rtl-optimization/17236] inefficient code for long long multiply on x86

2008-03-07 Thread bonzini at gnu dot org
-- bonzini at gnu dot org changed:
           What    |Removed |Added
             Status|NEW     |SUSPENDED
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17236

[Bug rtl-optimization/17236] inefficient code for long long multiply on x86

2007-12-20 Thread bonzini at gnu dot org
--- Comment #27 from bonzini at gnu dot org 2007-12-20 13:53 --- I screwed up so I have to rerun most of SPECfp2000, but the results seem a wash. Can anybody try the patch I'll attach soon on a wide range of machines? -- Bug 17236 depends on bug 6585, which changed state. Bug 6585

[Bug rtl-optimization/17236] inefficient code for long long multiply on x86

2007-12-20 Thread bonzini at gnu dot org
--- Comment #28 from bonzini at gnu dot org 2007-12-20 14:15 ---
Created an attachment (id=14800)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14800&action=view)
combined patch
-- bonzini at gnu dot org changed: What|Removed |Added

[Bug rtl-optimization/17236] inefficient code for long long multiply on x86

2007-12-19 Thread bonzini at gnu dot org
--- Comment #17 from bonzini at gnu dot org 2007-12-19 09:49 ---
With this patch, GCC gets the preferences right, but it does not affect code generation.
Index: regclass.c
===
--- regclass.c (revision 130928)
+++

[Bug rtl-optimization/17236] inefficient code for long long multiply on x86

2007-12-19 Thread ubizjak at gmail dot com
--- Comment #18 from ubizjak at gmail dot com 2007-12-19 12:11 ---
Another baby step can be performed by:
Index: optabs.c
===
--- optabs.c (revision 131053)
+++ optabs.c (working copy)
@@ -1245,7 +1245,7 @@

[Bug rtl-optimization/17236] inefficient code for long long multiply on x86

2007-12-19 Thread bonzini at gnu dot org
--- Comment #19 from bonzini at gnu dot org 2007-12-19 12:13 ---
Created an attachment (id=14792)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14792&action=view)
patch to almost fix the bug
With this patch: 1) local-alloc first tries to allocate registers that go into small

[Bug rtl-optimization/17236] inefficient code for long long multiply on x86

2007-12-19 Thread bonzini at gnu dot org
--- Comment #21 from bonzini at gnu dot org 2007-12-19 12:43 ---
Created an attachment (id=14793)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14793&action=view)
two hunks only from the previous patch
Indeed, if I only use the regclass.c and local-alloc.c hunks, I get only one

[Bug rtl-optimization/17236] inefficient code for long long multiply on x86

2007-12-19 Thread bonzini at gnu dot org
--- Comment #20 from bonzini at gnu dot org 2007-12-19 12:32 --- There is a big catch-22. When GCC ties one of regs 64/66 with reg 61, it enlarges reg 61's live range to cover the live range of the tied range. When it does this, it basically locks up %edx for the whole live range of

[Bug rtl-optimization/17236] inefficient code for long long multiply on x86

2007-12-19 Thread ubizjak at gmail dot com
--- Comment #22 from ubizjak at gmail dot com 2007-12-19 13:11 ---
(In reply to comment #21)
Indeed, if I only use the regclass.c and local-alloc.c hunks, I get only one spill!
        pushl   %ebx
        movl    8(%esp), %edx
        movl    16(%esp), %eax
        movl

[Bug rtl-optimization/17236] inefficient code for long long multiply on x86

2007-12-19 Thread bonzini at gnu dot org
--- Comment #23 from bonzini at gnu dot org 2007-12-19 13:36 ---
Created an attachment (id=14794)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14794&action=view)
teach combine that reg = op(reg, mem) is better
Since combine operates on the whole pattern, it can be taught the trick

[Bug rtl-optimization/17236] inefficient code for long long multiply on x86

2007-12-19 Thread steven at gcc dot gnu dot org
--- Comment #24 from steven at gcc dot gnu dot org 2007-12-19 13:48 --- The patch in comment #23 might even be suitable for GCC 4.3 ... -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17236

[Bug rtl-optimization/17236] inefficient code for long long multiply on x86

2007-12-19 Thread bonzini at gnu dot org
--- Comment #25 from bonzini at gnu dot org 2007-12-19 13:50 --- Note that fwprop is not an exact term, because there *is* a memory load in each multiplication, and propagating a second memory operand will create an invalid insn. You may try to add a split from reg=op(mem1, mem2) to

[Bug rtl-optimization/17236] inefficient code for long long multiply on x86

2007-12-19 Thread bonzini at gnu dot org
--- Comment #26 from bonzini at gnu dot org 2007-12-19 13:53 --- I'm starting a SPEC run on the overall patch -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17236

[Bug rtl-optimization/17236] inefficient code for long long multiply on x86

2007-12-18 Thread bonzini at gnu dot org
--- Comment #9 from bonzini at gnu dot org 2007-12-18 08:05 --- Uros mentioned offlist that he wanted to hijack fwprop to always propagate stack slots into instructions. It would be a relatively useful piece of infrastructure to have a flag in MEMs that marks on-stack MEMs, because

[Bug rtl-optimization/17236] inefficient code for long long multiply on x86

2007-12-18 Thread ebotcazou at gcc dot gnu dot org
--- Comment #10 from ebotcazou at gcc dot gnu dot org 2007-12-18 08:10 ---
> It would be a relatively useful piece of infrastructure to have a flag in MEMs
> that marks on-stack MEMs, because other MEMs definitely must not be propagated
> blindly.
Depending on your needs, MEM_NOTRAP_P

[Bug rtl-optimization/17236] inefficient code for long long multiply on x86

2007-12-18 Thread ubizjak at gmail dot com
--- Comment #11 from ubizjak at gmail dot com 2007-12-18 13:47 ---
Generated code for a similar example is just plain stupid:
--cut here--
int test(long long a, long long b)
{
  return a * b;
}
--cut here--
gcc -O3:
test:
        pushl   %ebp
        movl    %esp, %ebp

[Bug rtl-optimization/17236] inefficient code for long long multiply on x86

2007-12-18 Thread bonzini at gnu dot org
--- Comment #12 from bonzini at gnu dot org 2007-12-18 16:01 --- The problem in comment #11 is that GCC generates a widening multiply, and cannot remove the DImode operations until after register allocation (!). While the root cause is a deficiency in RTL-level DCE, I suggest filing a
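To illustrate why the DImode operations in the comment #11 testcase are unnecessary, here is a minimal sketch (not code from the bug report). It assumes unsigned operands so that wraparound is well defined in C, and the function names are hypothetical. The point is only that the low 32 bits of a 64-bit product depend on nothing but the low 32 bits of each operand, so a single 32x32->32 multiply would suffice.

```c
#include <assert.h>
#include <stdint.h>

/* What the testcase asks for: a 64-bit product, truncated afterwards.
   (Hypothetical name; unsigned types are used to avoid signed overflow.) */
uint32_t mul_then_truncate(uint64_t a, uint64_t b)
{
    return (uint32_t)(a * b);
}

/* Equivalent result: truncate both operands first, then do one
   32x32->32 multiply.  The high halves of a and b never matter. */
uint32_t truncate_then_mul(uint64_t a, uint64_t b)
{
    return (uint32_t)a * (uint32_t)b;
}

/* Small self-check comparing the two forms on a few inputs. */
void check_truncated_mul(void)
{
    assert(mul_then_truncate(3, 5) == truncate_then_mul(3, 5));
    assert(mul_then_truncate(0xFFFFFFFF00000001ULL, 0x123456789ABCDEF0ULL)
           == truncate_then_mul(0xFFFFFFFF00000001ULL, 0x123456789ABCDEF0ULL));
}
```

Whether a given GCC version performs this narrowing is exactly what the comment is discussing; the sketch only shows that the transformation is valid.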

[Bug rtl-optimization/17236] inefficient code for long long multiply on x86

2007-12-18 Thread jakub at gcc dot gnu dot org
--- Comment #13 from jakub at gcc dot gnu dot org 2007-12-18 16:39 ---
I think tree level does the right thing, TER fixes this up and expand_expr is called with
  return (int) (b * a)
Later on expand_expr is called with
  mult_expr 0x2e9032c0 type integer_type 0x2e937840 long

[Bug rtl-optimization/17236] inefficient code for long long multiply on x86

2007-12-18 Thread bonzini at gnu dot org
--- Comment #14 from bonzini at gnu dot org 2007-12-18 16:50 --- The bug with 64*64->32 multiplication is now PR34522. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17236

[Bug rtl-optimization/17236] inefficient code for long long multiply on x86

2007-12-18 Thread ubizjak at gmail dot com
--- Comment #15 from ubizjak at gmail dot com 2007-12-18 18:20 ---
(In reply to comment #7)
        mull    %ebx
        leal    (%ecx,%edx), %esi  ; what the heck, a simple addl could do!
        movl    %esi, %edx
Something disturbs RA to emit two DImode moves:
(insn:HI 10 36

[Bug rtl-optimization/17236] inefficient code for long long multiply on x86

2007-12-18 Thread ubizjak at gmail dot com
--- Comment #16 from ubizjak at gmail dot com 2007-12-18 18:33 ---
(In reply to comment #15)
Note two moves [(insn 36) and (insn 37)] around (insn 12).
Bah. This is the correct sequence, around (insn 10) that seems to be the root of all problems:
(insn:HI 9 8 36 2 m.c:2 (parallel [

[Bug rtl-optimization/17236] inefficient code for long long multiply on x86

2007-12-17 Thread bonzini at gnu dot org
--- Comment #7 from bonzini at gnu dot org 2007-12-18 07:43 ---
The generated code has changed a lot recently, though it still uses two spills:
        pushl   %esi
        pushl   %ebx
        movl    12(%esp), %ebx  ; load alow
        movl    20(%esp), %esi  ; load

[Bug rtl-optimization/17236] inefficient code for long long multiply on x86

2007-12-17 Thread bonzini at gnu dot org
--- Comment #8 from bonzini at gnu dot org 2007-12-18 07:53 ---
For -mregparm=2 the code is this:
        pushl   %ebx
        movl    8(%esp), %ebx
        movl    12(%esp), %ecx
        imull   %ebx, %edx
        imull   %eax, %ecx
        addl    %edx, %ecx
        mull    %ebx
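For readers following the instruction sequences in this thread, here is a rough C sketch of the arithmetic that the two imull instructions, the addl and the mull implement. It is an illustration, not code from the bug report: the function name and the variables other than alow (which appears in comment #7) are hypothetical, and unsigned types are assumed so that wraparound is well defined. The low 64 bits of a signed long long product are the same as those of the unsigned product.

```c
#include <assert.h>
#include <stdint.h>

/* Low 64 bits of a 64x64 multiply, built from 32-bit pieces.
   (Hypothetical helper; assumes uint32_t arithmetic wraps mod 2^32.) */
uint64_t mul64_by_parts(uint64_t a, uint64_t b)
{
    uint32_t alow = (uint32_t)a, ahigh = (uint32_t)(a >> 32);
    uint32_t blow = (uint32_t)b, bhigh = (uint32_t)(b >> 32);

    /* Two truncating 32x32->32 products plus an add
       (the imull, imull, addl steps). */
    uint32_t cross = ahigh * blow + alow * bhigh;

    /* One widening 32x32->64 product (the mull step). */
    uint64_t low = (uint64_t)alow * blow;

    /* The cross terms only contribute to the upper 32 bits;
       ahigh * bhigh falls entirely outside the low 64 bits. */
    return low + ((uint64_t)cross << 32);
}

/* Small self-check against the compiler's own 64-bit multiply. */
void check_mul64_by_parts(void)
{
    assert(mul64_by_parts(7, 6) == 42);
    assert(mul64_by_parts(0x123456789ABCDEF0ULL, 0x0FEDCBA987654321ULL)
           == 0x123456789ABCDEF0ULL * 0x0FEDCBA987654321ULL);
}
```

The computation needs four 32-bit inputs and the mull result occupies %edx:%eax, which is why register pressure and spills dominate the discussion on 32-bit x86.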

[Bug rtl-optimization/17236] inefficient code for long long multiply on x86

2007-11-23 Thread steven at gcc dot gnu dot org
--- Comment #6 from steven at gcc dot gnu dot org 2007-11-23 20:48 --- *** Bug 6585 has been marked as a duplicate of this bug. *** -- steven at gcc dot gnu dot org changed: What|Removed |Added

[Bug rtl-optimization/17236] inefficient code for long long multiply on x86

2007-02-01 Thread roger at eyesopen dot com
--- Comment #5 from roger at eyesopen dot com 2007-02-02 00:17 --- It looks like Ian's recent subreg lowering pass patch has improved code generation on this testcase. Previously, we'd spill three integer registers to the stack for LLM; we're now down to two. [A significant

[Bug rtl-optimization/17236] inefficient code for long long multiply on x86

2005-03-15 Thread giovannibajo at libero dot it
--- Additional Comments From giovannibajo at libero dot it 2005-03-15 10:04 ---
Roger explains what else needs to be done here: http://gcc.gnu.org/ml/gcc-patches/2005-03/msg01386.html
Right now, after his patch, mainline generates this code:
        pushl   %edi
        pushl   %esi

[Bug rtl-optimization/17236] inefficient code for long long multiply on x86

2005-03-15 Thread giovannibajo at libero dot it
--- Additional Comments From giovannibajo at libero dot it 2005-03-15 10:07 --- Uros made some additional comments: http://gcc.gnu.org/ml/gcc-patches/2005-03/msg01427.html -- What|Removed |Added

[Bug rtl-optimization/17236] inefficient code for long long multiply on x86

2005-03-14 Thread cvs-commit at gcc dot gnu dot org
--- Additional Comments From cvs-commit at gcc dot gnu dot org 2005-03-14 18:24 ---
Subject: Bug 17236
CVSROOT:        /cvs/gcc
Module name:    gcc
Changes by:     [EMAIL PROTECTED] 2005-03-14 18:24:15
Modified files:
        gcc: ChangeLog optabs.c
Log message: