[GitHub] [tvm] FranckQC opened a new pull request #9482: Implementation of Common Subexpression Elimination for TIR

GitBox Tue, 09 Nov 2021 16:21:29 -0800


FranckQC opened a new pull request #9482:
URL: https://github.com/apache/tvm/pull/9482



   Hi everyone,
   
   We would like to upstream some work that we did at Qualcomm.
   The goal of this PR is to implement a Common Subexpression Elimination (CSE) 
pass for TIR, which aims at identifying redundant computations (both within 
statements and within expressions), and to replace them by a new fresh 
variable, introduced before the first occurrence of the redundant computation.
   
   Note that it does not only try to do commoning on full expressions, but it 
is also able to do it on subexpressions. For instance, if the program computes 
the expression (w+x) + (y+z) and the expression (w+x)+u, it will introduce the 
subexpression (w+x) into a new variable.
   
   If we want so, it will be easily possible in the future to make the notion 
of equivalence between terms more flexible, allowing for instance to identify 
expressions modulo commutativity (identifying for instance (x+y) with (y+x)), 
modulo associativity (identifying for instance (x+y)+z with x+(y+z)), etc. 
Replacing only the function bool EquivalentTerms(const PrimExpr& a, const 
PrimExpr& b) will be the only thing needed in order to do that. The typical way 
to rewrite it for such extensions would be to compute a canonical representant 
of a and a canonical representant of b and to then compare them with the strict 
syntactical equality.
   
   The main CSE pass is declared and implemented respectively in the files 
common_subexpr_elim.h and common_subexpr_elim.cc.
   The function Stmt CommonSubexpressionEliminator::VisitStmt(const Stmt& stmt) 
is a good entry point as it contains many comments about what the pass is doing.
   
   The general idea of this pass is that it tries to introduce at the current 
level (the current root) the computations that are redundant and which are 
possible to introduce there (they should only contain variables that are in 
scope). This notion of variables in scope is implemented with a context, which 
is a vector of pairs (var, MaybeValue). The context is not only used for 
checking that variables that appear in candidate computations are known at this 
point, but also for checking if a computation has already been introduced into 
a variable.
   
   For a greater flexibility in the future, there is a strong distinction 
already in place between :
   
      - Syntactic computations, which are maintained in a hashtable which 
associates expressions (the computations already seen) to size_int (the number 
of times the computation has been seen).
      - Semantic entities, which are obtained from the syntactic computations 
by merging equivalent computations (where this notion of "equivalent" is 
customizable). Semantic entities are stored into a vector of pairs (expr, 
size_int) where, again, the number is the number of times that expr or 
equivalent computations have been seen.
   
   The VisitStmt() method starts by computing the syntactic computations 
(implemented in an auxiliary analysis), then it merges equivalent computations 
to obtain the semantic computations. Then it sorts these semantic computations 
from biggest to smallest in order to always consider first the biggest 
computations. The rest will essentially be a loop over all these candidates, 
which will stay sorted.
   
   When dealing with a candidate computation, there are three cases that can 
happen:
   
      - 1 - Rare case A variable in the context already contains this 
computation. This variable can't have been introduced by the CSE, as we would 
have performed the replacements at the same time (see case 2). So this is the 
case where the user himself (or the previous TIR passes) has written something 
like "let x = A in ...A...A...)"
       -> In this case, we simply perform the replacements of A with x in the 
current result. These replacements are done by an auxiliary transform/Mutator, 
declared and implemented in replace_expr_selected.h and in 
replace_expr_selected.cc.
   
      - 2 - Case where we need to introduce the current computation inside a 
new variable This is the case where all the variables used by the current 
computation are within scope (i.e. are present in the context) and where our 
internal heuristic/predicate tells us to introduce this computation into a new 
variable.
       -> In this case, a new variable new_var_i is generated, all the 
locations that use this computation in result are replaced by this fresh 
variable (using the same auxiliary Mutator mentioned in 1.), and the current 
result is replaced by let new_var_i = currentComputation in result.
   
      - 3 - Case where we can't or don't want to introduce this computation 
inside a new variable This is the case where we either can't introduce the 
current computation inside a new variable (because it contains variables that 
are not yet in scope there) or because our internal heuristic/predicate did not 
want to introduce it.
       -> In this case, we will compute the direct sub-expressions of the 
current computation (implemented by an auxiliary analysis), and we will add 
them to the vector of semantic computations so that they have a chance to be 
considered later. Note that they are added while still preserving the order.
       Note that we do not add all the sub-expressions of the current 
expression but only its direct subexpressions given the fact that we always 
consider them from biggest to smallest, and given that some candidates are 
mutually exclusive. Otherwise it would be computationally more intensive and it 
would pose the problem of cleaning the vector of candidate computations when 
one of them gets introduced into a variable. Evaluating them lazily by only 
looking at the direct sub-expressions is at the same time more efficient and 
simpler.
   
   Once the entire vector of semantic computations has been tried, the main 
function VisitStmt() calls the general dispatcher , which will in turn call the 
appropriate handlers. The only specific task of overridden handlers will be to 
update the context appropriately as new variables are introduced into scope 
(via Let-In, via For loop, etc) or leave the current scope. Thus, they will 
update the context appropriately before and after the calls to VisitStmt() and 
VisitExpr() on the child nodes.
   
   Please do not hesitate if you have any question.
   Thank you.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

[GitHub] [tvm] FranckQC opened a new pull request #9482: Implementation of Common Subexpression Elimination for TIR

Reply via email to