[Bug tree-optimization/92649] dead store elimination by iteration domain pruning
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92649 --- Comment #8 from rguenther at suse dot de --- On Tue, 26 Nov 2019, jakub at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92649 > > Jakub Jelinek changed: > >What|Removed |Added > > CC||jakub at gcc dot gnu.org > > --- Comment #7 from Jakub Jelinek --- > Or it could be handled by constexpr-like evaluation of the loops But that's probibitly expensive for large number of iterations and doesn't work for non-constant niters (but related ones for earlier vs. later one)
[Bug tree-optimization/92649] dead store elimination by iteration domain pruning
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92649 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #7 from Jakub Jelinek --- Or it could be handled by constexpr-like evaluation of the loops, where we'd evaluate the side-effects at compile time both to PHIs after loop and written memory, and then FRE-like propagate that to the second loop if there is some cheap simple expression with which it can be replaced. Though, unsure if it wouldn't handle only toy testcases like these.
[Bug tree-optimization/92649] dead store elimination
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92649 --- Comment #6 from Richard Biener --- I wonder if it makes sense for a production compiler to do this kind of transformation. When presented with a program representation as SSA on a CFG such transform would be quite ad-hoc or rather translating the SSA + CFG representation into something a lot more high-level. On that high-level it would indeed be kind of a liveness analysis and dead code removal by means of iteration domain adjustments. I wonder whether the polyhedral representation that -floop-nest-optimize uses provides any advantage here (we certainly do not transfer knowledge that the polyhedral represents all uses of a specific array and thus unneeded computes can be eliminated - but if we did, would it make such transforms automagically by simply pruning the polyhedral?) In GCC the closest "high-level" representation we try to build is the RDG loop-distribution builds, but obviously that doesn't even cover the iteration domain. As said, we're still missing a classical loop fusion transform - such transform would benefit from a better "high-level" representation of ("simple") "loops" as well (and ideally fusion would work in concert with distribution, not as separate passes). When we then have such then pruning iteration domains from liveness analysis can be implemented in such a framework (and liveness analysis should work on that high-level representation). Alternatively loop splitting sounds like a transform this could piggy-back on (given liveness analysis is done), split the domain and elide one of the loops as dead. I would advise against trying to invent a new pass just for this particular transform.
[Bug tree-optimization/92649] dead store elimination
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92649 --- Comment #5 from Jiangning Liu --- Unrolling 1024 iterations would increase code size a lot, so usually we don't do that. 1024 is only an example. Without knowing we could eliminate most of them, we don't really want to do loop unrolling, I guess. Yes. Assigning 5 to all a's elements is only an example as well. It could be any random value or predefined number. Let me give a more complicated case, extern int rand(void); #define LIVE_SIZE 100 #define DATA_SIZE 256 int f(void) { int a[DATA_SIZE], b[DATA_SIZE][DATA_SIZE]; int i,j; long long s = 0; int next; for (i=0; i
[Bug tree-optimization/92649] dead store elimination
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92649 --- Comment #4 from Richard Biener --- (In reply to Jiangning Liu from comment #3) > It is a stupid test, but it is simplified from a real application. > > To solve even more complicated scenario, this simple case needs to be > addressed first. > > If we change the case to be as below, > > int f(void) > { > int i, a[1024], s=0; > > for (i=0; i<1024; i++) > a[i] = 5; > > for (i=0; i<37; i++) > s += a[i]; > return s; > } > > the loop peeling will not work, but compiler should still know the store to > elements with index >= 37 can all be eliminated. Can any framework in GCC > solve this problem? No, not when faced with loops. If both loops were completely unrolled then DSE would do this job but as it stands DSE doesn't know how to adjust a loops iteration space to elide dead stores. In the above case splitting the first loop into two so that the first iterates [0, 37[ and the second [37, 1024[ would make the problem easier for DSE. We could then also fuse the loops, eliding a completely. So I guess loop fusion might be the actual transform we're looking after (which for enablement needs the loop splitting). I suppose the "real" application doesn't iniitalize a[i] to all 5? But the first loop is actually some kind of initialization?
[Bug tree-optimization/92649] dead store elimination
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92649 --- Comment #3 from Jiangning Liu --- It is a stupid test, but it is simplified from a real application. To solve even more complicated scenario, this simple case needs to be addressed first. If we change the case to be as below, int f(void) { int i, a[1024], s=0; for (i=0; i<1024; i++) a[i] = 5; for (i=0; i<37; i++) s += a[i]; return s; } the loop peeling will not work, but compiler should still know the store to elements with index >= 37 can all be eliminated. Can any framework in GCC solve this problem?
[Bug tree-optimization/92649] dead store elimination
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92649 Richard Biener changed: What|Removed |Added Keywords||missed-optimization Status|UNCONFIRMED |NEW Last reconfirmed||2019-11-25 Ever confirmed|0 |1 --- Comment #2 from Richard Biener --- Confirmed. Note this would require DSE to adjust the iteration domain. Peeling one iteratiou would "solve" that. The testcase is also stupid, so what's the original motivating testcase?
[Bug tree-optimization/92649] dead store elimination
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92649 prathamesh3492 at gcc dot gnu.org changed: What|Removed |Added CC||prathamesh3492 at gcc dot gnu.org --- Comment #1 from prathamesh3492 at gcc dot gnu.org --- This is likely dup of PR89332. Thanks, Prathamesh