------- Additional Comments From tbptbp at gmail dot com  2005-01-30 08:07 -------
I'm sorry for providing such a poor testcase.
Here's the kind of *48 sequence I'm seeing with k8 codegen. It happens at a
point where there's already quite some register pressure, so it really doesn't
help that another register is needed (gcc has to push/pop whereas icc doesn't).

  402520:       mov    (%esi),%ecx
  402522:       mov    $0x30,%eax
  402527:       mov    0x3c(%esp),%ebx
  40252b:       add    $0x4,%esi
  40252e:       imul   %eax,%ecx
  402531:       add    0x94(%esp),%ecx
  402538:       mov    (%ecx),%edi
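
For reference, here is a minimal C sketch of the kind of source that could
lead to a *48 like the one above. It is not the original testcase: the struct
layout and the names record, load_mul and load_shift are made up for
illustration. The second function spells out 48 = 3 * 16 by hand, which on x86
can be expressed as an lea followed by a shift, with no extra register holding
the constant. Whether any given compiler emits that form is an assumption, not
something shown in the listing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 48-byte record; the real structure is not shown in the report. */
struct record {
    uint32_t first;     /* field read by the final load */
    uint32_t pad[11];   /* filler so that sizeof(struct record) == 48 */
};

/* Plain indexing: the byte offset is index * 48. */
static uint32_t load_mul(const struct record *table,
                         const uint32_t *indices, size_t k)
{
    assert(table != NULL && indices != NULL);
    return table[indices[k]].first;
}

/* Hand-written strength reduction: 48 = 3 * 16, so offset = (i + 2*i) << 4. */
static uint32_t load_shift(const struct record *table,
                           const uint32_t *indices, size_t k)
{
    assert(table != NULL && indices != NULL);
    size_t i = indices[k];
    size_t offset = (i + (i << 1)) << 4;
    return *(const uint32_t *)((const char *)table + offset);
}
```

Both functions read the same element; only the way the offset is formed
differs.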

After that there's a long string of vector operations where gcc computes all
addresses upfront (with shifts) and then uses a *1 scale factor; ICC only uses
a *8 scale factor for those and is faster.
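
To make the two addressing styles concrete, here is a rough C sketch. It
assumes 8-byte elements (double here; the actual element type is not given
above), and the names sum4_precomputed and sum4_scaled are hypothetical. The
first function shifts every index into a byte offset before any load, which
keeps all offsets live at once; the second leaves the *8 to the addressing
mode. A compiler is free to emit either form for either function, so this only
illustrates the difference, it does not reproduce it.

```c
#include <stddef.h>
#include <stdint.h>

/* All byte offsets computed up front with shifts, then used with scale *1. */
static double sum4_precomputed(const double *base, const uint32_t idx[4])
{
    const char *bytes = (const char *)base;
    size_t off0 = (size_t)idx[0] << 3;   /* index * 8 */
    size_t off1 = (size_t)idx[1] << 3;
    size_t off2 = (size_t)idx[2] << 3;
    size_t off3 = (size_t)idx[3] << 3;
    return *(const double *)(bytes + off0)
         + *(const double *)(bytes + off1)
         + *(const double *)(bytes + off2)
         + *(const double *)(bytes + off3);
}

/* Indices used directly; each access is base + index * 8. */
static double sum4_scaled(const double *base, const uint32_t idx[4])
{
    return base[idx[0]] + base[idx[1]] + base[idx[2]] + base[idx[3]];
}
```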


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19680
