I've been trying to fix a bad tree-ssa related optimisation for s390x and have come up with the attached experimental patch. The patch is not really good: it breaks some situations in which the optimisation was useful.

With this code:

void bar(long);
void foo (char a)
{
  long l;
  char b;
  b = a & 63;
  l = b;
  if (l > 9)
    bar(l);
}

We get this representation before value range propagation:

  ...
  a_4 = *p_3(D);
  b_5 = a_4 & 63;
  l_6 = (long int) b_5;
  if (l_6 > 9)
  ...

Now, there's some code in tree-vrp.c:simplify_cond_using_ranges() that folds b_5 into the if condition, because l_6 is just a sign extension of b_5, and the value range of l_6 can also be represented by the type of b (char):

  if (b_5 > 9)

(On s390x we end up with "a & 63" stored in two separate registers, extended to 32 bits in one and to 64 bits in the other, adding up to two unnecessary instructions.)

A naive idea to prevent folding in this situation was to suppress it if it would introduce a second use of b_5 (i.e. b_5 was only used in the cast before) while not eliminating all uses of l_6. However, calling has_single_use() for both purposes proves to be not good enough, and VRP does not do this kind of optimisation yet. It does not catch cases like

  if (l_6 > 9)
    ...
  else if (l_6 > 7)
    ...

where all occurrences of l_6 could be replaced, and simply looking at the use counts is too coarse.

--

Is VRP the right pass to do this optimisation, or should a later pass rather attempt to eliminate the new use of b_5 instead?

Uli has brought up the idea of a mini "sign extend elimination" pass that checks whether the result of a sign extend could be replaced by the original quantity in all places, and if so, eliminates the ssa name. (I guess that won't help with the above code because l is also used as a function argument.)

What could a sensible approach to deal with the situation look like?

Ciao

Dominik ^_^  ^_^

--
Dominik Vogt
IBM Germany