Hello, I am just playing with loop unrolling on trees (for the purposes of prefetching), and I encountered the following problem. Consider loop
sum1 = 0; sum2 = 0; for (i = 0; i < n; i++) { x_1 = a[i]; y_1 = b[i]; sum1 += x_1 * y_1; sum2 += x_1 / y_1; } Now after unrolling we get (with some handling for remaining iterations of the loop) sum1 = 0; sum2 = 0; for (i = 0; i < n; i+=4) { x_1 = a[i]; y_1 = b[i]; sum1 += x_1 * y_1; sum2 += x_1 / y_1; x_2 = a[i+1]; y_2 = b[i+1]; sum1 += x_2 * y_2; sum2 += x_2 / y_2; x_3 = a[i+2]; y_3 = b[i+2]; sum1 += x_3 * y_3; sum2 += x_3 / y_3; x_4 = a[i+3]; y_4 = b[i+3]; sum1 += x_4 * y_4; sum2 += x_4 / y_4; } So far OK, but with ter, this becomes sum1 = 0; sum2 = 0; for (i = 0; i < n; i+=4) { x_1 = a[i]; y_1 = b[i]; x_2 = a[i+1]; y_2 = b[i+1]; x_3 = a[i+2]; y_3 = b[i+2]; x_4 = a[i+3]; y_4 = b[i+3]; sum1 += x_1 * y_1 + x_2 * y_2 + x_3 * y_3 + x_4 * y_4; sum2 += x_1 / y_1 + x_2 / y_2 + x_3 / y_3 + x_4 / y_4; } Now we need some 11 registers for the loop, instead of the original 5 (and the number of registers grows with the unroll factor). This causes huge regressions (more than 60% on sixtrack from spec2000). It might perhaps be a good idea to limit the size of expressions ter produces, or in some other way restrict the way it increases the lives of registers? Zdenek