[Bug target/82242] x86_64 bad optimization with -march

2017-09-20 Thread amonakov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82242

Alexander Monakov <amonakov at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2017-09-20
                 CC|                            |amonakov at gcc dot gnu.org,
                   |                            |vmakarov at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #3 from Alexander Monakov <amonakov at gcc dot gnu.org> ---
(Marc, for vectorization -fassociative-math would be required)

I think it's due to handling of possibly-throwing insns in register allocation:
allocno for the accumulator is considered to conflict with all SSE registers,
but the throwing call is outside of the loop, and we don't want spills inside
the loop.  Minimal C++ testcase, needs just -O2:

void c();
struct S {~S();};
double f(double *x, int n)
{
  S s;
  double r = 0;
  for (; n; n--)
    r += *x++;
  c();
  return r;
}

I don't understand why throwing calls are more special than normal calls for
IRA; it seems values in call-clobbered registers would need to be spilled
either way...
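
A comparison variant that could be compiled next to the testcase above, as a
sketch only. It assumes that the cleanup for `s` is what gives the call to c()
an exception edge inside f(); the name f_no_cleanup is hypothetical and not
part of the report. Empty definitions of c() and S::~S() are included only to
make the snippet link; for looking at codegen they would have to stay in
another translation unit, otherwise the compiler can prove that nothing throws.

```cpp
// Sketch for comparing codegen at -O2; not taken from the bug report.
void c();
struct S { ~S(); };

// Testcase from the comment: `s` needs a cleanup, so the call to c()
// has an exception edge inside f().
double f(double *x, int n)
{
  S s;
  double r = 0;
  for (; n; n--)
    r += *x++;
  c();
  return r;
}

// Hypothetical variant without the object: c() may still throw, but
// there is nothing to clean up in this function.
double f_no_cleanup(double *x, int n)
{
  double r = 0;
  for (; n; n--)
    r += *x++;
  c();
  return r;
}

// Placeholder definitions so the snippet links. Move these to a separate
// file when inspecting the generated assembly.
void c() {}
S::~S() {}
```

If the accumulator stays in an SSE register inside the loop only for
f_no_cleanup, that would point at the exception edge rather than at the call
itself.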

[Bug target/82242] x86_64 bad optimization with -march

2017-09-19 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82242

--- Comment #2 from Marc Glisse <glisse at gcc dot gnu.org> ---
Nothing gets vectorized :-(
Note that to fill the vector, this would be better

  std::vector<double> array(size, 1e-9);

In the reduction, we seem to do strange things with the accumulator.

addsd   (%rax), %xmm1
addq    $8, %rax
cmpq    %rbx, %rax
movsd   %xmm1, (%rsp)
jne     .L13

or

vmovq   %rbp, %xmm2
vaddsd  (%rax), %xmm2, %xmm1
addq    $8, %rax
vmovq   %xmm1, %rbp
cmpq    %rbx, %rax
jne     .L13

We aren't happy with xmm1: we save the value to memory in the first case, and
to an integer register in the second case, where we even restore the value from
that register...
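
For reference, the fill-then-reduce pattern discussed here can be sketched as
follows. The reporter's original program is not quoted in this thread, so the
helper names make_array and sum are hypothetical, and the element type is
assumed to be double (the loops above use addsd/vaddsd). Without
-fassociative-math the additions have to stay in source order, so a scalar
add per element is what one would expect in the loop; the extra store or
vmovq round trip is the surprising part.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: use the fill constructor rather than a separate
// loop that assigns every element.
std::vector<double> make_array(std::size_t size, double value)
{
  return std::vector<double>(size, value);
}

// Hypothetical helper: plain scalar reduction with a single accumulator.
double sum(const std::vector<double> &array)
{
  double r = 0;
  for (double v : array)
    r += v;
  return r;
}
```

A call such as sum(make_array(size, 1e-9)) exercises the same kind of loop
once the helpers are not inlined away.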