FranckQC opened a new pull request #9482:
URL: https://github.com/apache/tvm/pull/9482
Hi everyone,
We would like to upstream some work that we did at Qualcomm.
The goal of this PR is to implement a Common Subexpression Elimination (CSE)
pass for TIR, which aims at identifying redundant computations (both within
statements and within expressions), and to replace them by a new fresh
variable, introduced before the first occurrence of the redundant computation.
Note that it does not only try to do commoning on full expressions, but it
is also able to do it on subexpressions. For instance, if the program computes
the expression (w+x) + (y+z) and the expression (w+x)+u, it will introduce the
subexpression (w+x) into a new variable.
If we want so, it will be easily possible in the future to make the notion
of equivalence between terms more flexible, allowing for instance to identify
expressions modulo commutativity (identifying for instance (x+y) with (y+x)),
modulo associativity (identifying for instance (x+y)+z with x+(y+z)), etc.
Replacing only the function bool EquivalentTerms(const PrimExpr& a, const
PrimExpr& b) will be the only thing needed in order to do that. The typical way
to rewrite it for such extensions would be to compute a canonical representant
of a and a canonical representant of b and to then compare them with the strict
syntactical equality.
The main CSE pass is declared and implemented respectively in the files
common_subexpr_elim.h and common_subexpr_elim.cc.
The function Stmt CommonSubexpressionEliminator::VisitStmt(const Stmt& stmt)
is a good entry point as it contains many comments about what the pass is doing.
The general idea of this pass is that it tries to introduce at the current
level (the current root) the computations that are redundant and which are
possible to introduce there (they should only contain variables that are in
scope). This notion of variables in scope is implemented with a context, which
is a vector of pairs (var, MaybeValue). The context is not only used for
checking that variables that appear in candidate computations are known at this
point, but also for checking if a computation has already been introduced into
a variable.
For a greater flexibility in the future, there is a strong distinction
already in place between :
- Syntactic computations, which are maintained in a hashtable which
associates expressions (the computations already seen) to size_int (the number
of times the computation has been seen).
- Semantic entities, which are obtained from the syntactic computations
by merging equivalent computations (where this notion of "equivalent" is
customizable). Semantic entities are stored into a vector of pairs (expr,
size_int) where, again, the number is the number of times that expr or
equivalent computations have been seen.
The VisitStmt() method starts by computing the syntactic computations
(implemented in an auxiliary analysis), then it merges equivalent computations
to obtain the semantic computations. Then it sorts these semantic computations
from biggest to smallest in order to always consider first the biggest
computations. The rest will essentially be a loop over all these candidates,
which will stay sorted.
When dealing with a candidate computation, there are three cases that can
happen:
- 1 - Rare case A variable in the context already contains this
computation. This variable can't have been introduced by the CSE, as we would
have performed the replacements at the same time (see case 2). So this is the
case where the user himself (or the previous TIR passes) has written something
like "let x = A in ...A...A...)"
-> In this case, we simply perform the replacements of A with x in the
current result. These replacements are done by an auxiliary transform/Mutator,
declared and implemented in replace_expr_selected.h and in
replace_expr_selected.cc.
- 2 - Case where we need to introduce the current computation inside a
new variable This is the case where all the variables used by the current
computation are within scope (i.e. are present in the context) and where our
internal heuristic/predicate tells us to introduce this computation into a new
variable.
-> In this case, a new variable new_var_i is generated, all the
locations that use this computation in result are replaced by this fresh
variable (using the same auxiliary Mutator mentioned in 1.), and the current
result is replaced by let new_var_i = currentComputation in result.
- 3 - Case where we can't or don't want to introduce this computation
inside a new variable This is the case where we either can't introduce the
current computation inside a new variable (because it contains variables that
are not yet in scope there) or because our internal heuristic/predicate did not
want to introduce it.
-> In this case, we will compute the direct sub-expressions of the
current computation (implemented by an auxiliary analysis), and we will add
them to the vector of semantic computations so that they have a chance to be
considered later. Note that they are added while still preserving the order.
Note that we do not add all the sub-expressions of the current
expression but only its direct subexpressions given the fact that we always
consider them from biggest to smallest, and given that some candidates are
mutually exclusive. Otherwise it would be computationally more intensive and it
would pose the problem of cleaning the vector of candidate computations when
one of them gets introduced into a variable. Evaluating them lazily by only
looking at the direct sub-expressions is at the same time more efficient and
simpler.
Once the entire vector of semantic computations has been tried, the main
function VisitStmt() calls the general dispatcher , which will in turn call the
appropriate handlers. The only specific task of overridden handlers will be to
update the context appropriately as new variables are introduced into scope
(via Let-In, via For loop, etc) or leave the current scope. Thus, they will
update the context appropriately before and after the calls to VisitStmt() and
VisitExpr() on the child nodes.
Please do not hesitate if you have any question.
Thank you.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]