I've been trying to fix a bad tree-ssa related optimisation on
s390x and have come up with the attached experimental patch.  The
patch is not really good: it breaks some situations in which the
optimisation is useful.  With this code:

  void bar(long);
  void foo (char a)
  {
    long l;
    char b;

    b = a & 63;
    l = b;
    if (l > 9)
      bar(l);
  }

We get this representation before value range propagation:

  ...
  a_4 = *p_3(D);
  b_5 = a_4 & 63;
  l_6 = (long int) b_5;
  if (l_6 > 9)
  ...

Now, there's some code in tree-vrp.c:simplify_cond_using_ranges()
that folds b_5 into the if condition, because l_6 is just a sign
extension of b_5 and the value range of l_6 can also be
represented in the type of b (char).  The result is:

  if (b_5 > 9)

(On s390x we end up with "a & 63" stored in two separate
registers, extended to 32 bits in one and to 64 bits in the other,
adding up to two unnecessary instructions.)
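To illustrate why the fold is valid at the value level, here is a
minimal brute-force sketch (not part of the patch; the helper name
count_mismatches is made up for this example).  It walks over every
char value and checks that the wide and the narrow comparison agree.
Note that in C source the narrow comparison is still promoted to
int, so this only demonstrates the equivalence of the values, not
the code that gets generated.

```c
#include <limits.h>

/* Hypothetical helper: counts the inputs for which "l > 9" and
   "b > 9" disagree.  The fold is only safe if this is zero.  */
static int count_mismatches(void)
{
  int mismatches = 0;

  for (int a = CHAR_MIN; a <= CHAR_MAX; a++)
    {
      char b = (char) (a & 63);   /* b is always in [0, 63] */
      long l = b;                 /* the extension that VRP looks through */
      int wide = l > 9;           /* original condition on l_6 */
      int narrow = b > 9;         /* folded condition on b_5 */

      if (wide != narrow)
        mismatches++;
    }

  return mismatches;
}
```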

A naive idea to prevent folding in this situation was to suppress
it if it would introduce a second use of b_5 (i.e. b_5 was only
used in the cast before) while not eliminating all uses of l_6.
However, calling has_single_use() for both purposes proves not to
be good enough, and VRP does not do this kind of optimisation
yet.  It does not catch cases like

  if (l_6 > 9)
    ...
  else if (l_6 > 7)
    ...

where all occurrences of l_6 could be replaced, and simply looking
at the use counts is too coarse.
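A finer check than a plain use count might look roughly like the
sketch below.  This is a toy model only: the types and function
names are invented for illustration and are not GCC's data
structures or API, and the narrow type is assumed to be signed
char.  The idea is to narrow only when every use of the wide name
is a comparison that can be rewritten, so that the extension itself
becomes dead.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; these types are not GCC's.  */
enum use_kind { USE_COMPARE_CONST, USE_OTHER };

struct use
{
  enum use_kind kind;
  long constant;   /* meaningful for USE_COMPARE_CONST only */
};

/* True if a comparison against C can be done in the narrow type
   (assumed to be signed char in this sketch).  */
static bool constant_fits_narrow(long c)
{
  return c >= SCHAR_MIN && c <= SCHAR_MAX;
}

/* Narrowing is treated as worthwhile only if every use of the wide
   name can be rewritten, so that the extension goes away.  */
static bool all_uses_replaceable(const struct use *uses, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      if (uses[i].kind != USE_COMPARE_CONST)
        return false;
      if (!constant_fits_narrow(uses[i].constant))
        return false;
    }

  return true;
}
```

With this model the if/else-if chain above (comparisons against 9
and 7) would be accepted, while the first example would be rejected
because l is also passed to bar().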

--

Is VRP the right pass to do this optimisation, or should a later
pass rather attempt to eliminate the new use of b_5 instead?  Uli
has brought up the idea of a mini "sign extend elimination" pass
that checks whether the result of a sign extend could be replaced
by the original quantity in all places, and if so, eliminates the
ssa name.  (I guess that won't help with the above code because l
is also used as a function argument.)  What could a sensible
approach to deal with the situation look like?

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
