[tvm-rfcs] branch main updated: [RFC] Introduce PresburgerSet (#99)

syfeng Tue, 28 Feb 2023 00:33:54 -0800

This is an automated email from the ASF dual-hosted git repository.

syfeng pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm-rfcs.git



The following commit(s) were added to refs/heads/main by this push:
     new e17994b  [RFC] Introduce PresburgerSet (#99)
e17994b is described below

commit e17994b90d9278a280d019f3c8ad9065c9a3f584
Author: multiverstack <[email protected]>
AuthorDate: Tue Feb 28 16:33:43 2023 +0800

    [RFC] Introduce PresburgerSet (#99)
    
    This RFC proposes to introduce PresburgerSet as another integer set 
utility, aiming to tackle more complex analysis problems.
    
    
    ---------
    
    Co-authored-by: Min Chen <[email protected]>
---
 rfcs/0099-introduce-PresburgerSet.md | 67 ++++++++++++++++++++++++++++++++++++
 1 file changed, 67 insertions(+)

diff --git a/rfcs/0099-introduce-PresburgerSet.md 
b/rfcs/0099-introduce-PresburgerSet.md
new file mode 100644
index 0000000..a8617aa
--- /dev/null
+++ b/rfcs/0099-introduce-PresburgerSet.md
@@ -0,0 +1,67 @@
+- Feature Name: Introduce PresburgerSet
+- Start Date: 2023-02-13
+- RFC PR: https://github.com/apache/tvm-rfcs/pull/99/
+- GitHub Issue: https://github.com/apache/tvm/issues/14006
+
+# Summary
+It would be great if TVMScript can grow into a generic programming language in 
marchine learning domain. To reach that, it seems some powerful analysis tools 
are needed. Integer set is pivotal in IR analyzing, but IntSet in TVM only 
represents ranges. This RFC is to seek an improvement for it so that we can 
perform the IR analysis more precisely. We found the Presburger Set in MLIR 
library could be leveraged for this purpose.
+# Motivation
+Current dependence analysis is carried out roughly and inconveniently. Due to 
the absence of necessary basic infrastructure, it's hard to consider complex if 
conditions in IR, and it's even more challenging to do element-wise dependence 
analysis, both for inside or between TIR block analysis.
+## Inner block analysis
+One goal of TVMscript is to provide an easy & flexible way to construct 
computation workload for both TVM compiler developers and machine learning 
algorithm developers. TVMscript requires users to annotate some extra 
information when programming, such as `T.block`/`T.remap`/`T.init`..., which 
seem not very intuitive if developers do not have deep compiler knowledge. If 
IR analysis can help users annotate this kind of information automatically, it 
would be a tremendous programming option  [...]
+```
+for i, j, k, m, n in grid(37, 23, 40, 57, 60):
+    if 3*m + 7n < 58 and 45*k + 77*j >= 34:
+        B[i*3324 + j*23 + k*103 + m*279] = A[i, j, k, m, n]
+```
+If we can detect the reduce axis, `T.init` pattern detection should not be a 
problem, then. Of course, this auto-detection should only work as an option 
because TVM developers may sometimes still want to handcraft some complex 
blocks, such as block nesting.
+## Interblock analysis
+Most TIR primitives and passes need to analyze the data denpendency between 
producer & consumer blocks. Without sufficient utility, it's hard to consider 
complex `if` conditions. Analysis without if-conditions leads to a rough 
dependency result and causes redundant data transfer & computation. The 
redundant workload could be neglective for CPU/GPU, but it could be painful for 
NPU, for which extra data is devastating both for DMA and computation. An 
if-condition-aware integer set should s [...]
+```
[email protected]_func
+def if_func(a: T.handle, c: T.handle) -> None:
+    A = T.match_buffer(a, (60,), "float32")
+    C = T.match_buffer(c, (20, 20), "float32")
+    with T.block():
+        B = T.alloc_buffer((60,), dtype="float32")
+        for i in T.grid(60,):
+            with T.block():
+                B[i] = A[i]
+        for i, j in T.grid(20, 20):
+            with T.block():
+                if i + j < 20 and i - j <= 0:
+                    C[i, j] = B[i + 2*j]
+                else:
+                    C[i, j] = 0.0
+```
+How to determine the maximum range of `B` and do compact buffer range for it? 
Shape of `B` only needs (39,) and currently CompactBufferAllocation does 
nothing on this. If we can determine the range automatically, the 
`T.Read`/`T.Write` annotations could be saved .
+# Guide-level explanation
+The proposal is to implement a `PresburgerSet` class. The key point is to 
support inequation constraints to consider if-conditions, so the inequation on 
multiple Vars will mainly be used to express the sets. The basic set 
manipulation functions and constructor functions in `IntSet` class will be 
reimplemented in `PresburgerSet`. An additional constructor function will be 
added to construct from inequations:
+```
+PresburgerSet FromConstraints(Array<Constraint> inequations)
+```
+In order to manage the analysis, we need to separate the vars in all the 
inequations into at least two kinds, the iteration(or domain) vars and other 
vars, say local(target) vars. `PresburgerSet` keeps the relationship between 
iteration vars and local vars, from iteration to local vars or vice versa. Some 
other utility functions are needed to transform the relationship, including:
+```
+PresburgerSet reverse()
+```
+  Reverse the relationship from local vars to iteration vars. So we can 
further analyze dependency based on read/write sets.
+```
+PresburgerSet apply_iteration()/apply_local()
+```
+  Merge two relationships targeting the iteration vars or local vars. Then we 
can propagate the relationship between multiple sets.
+```
+PresburgerSet solve_bounds(PrimExpr expr)
+```
+  Prove engine to solve the maximum/minimum optimization problem based on the 
inequations in `PresburgerSet`, such as simplex solver. Input parameter expr is 
the target expression of optimization problem.
+
+Other existing API, like intersect/union etc, will be reimplemented based on 
updated data structure.
+# Reference-level explanation
+You may have already noticed that this is just what other modern integer set 
library provides, like ISL. So an economical way to achieve this is to leverage 
existing public wheels. ISL is mostly used, but it seems not modular enough or 
open enough, so it could be difficult to integrate deeply. Presburger Set 
located in MLIR is modular designed and open developed. So building from it 
would be a good choice.
+
+No need to introduce MLIR as a source code submodule. Installing LLVM prebuilt 
package installs the necessary libs of Presburger Set, so MLIR can be 
integrated into TVM just like LLVM codegen uses LLVM libs, and it can be 
switched on/off on demand. The new-added util function needs to check whether 
MLIR is installed when called and falls back to the interval set when MLIR is 
not found.
+# Drawbacks
+The `PresburgerSet` serves as the basic infrastructure of IR analysis, and a 
wide range of lowering passes/primitives may need it. Part of its functionality 
is duplicated with `IntSet`, so people should make a decision which one to use 
according to the analysis task.
+# Alternatives
+The other way is to handcraft a copy of code similar to Presburger Set in 
MLIR, which minimizes the software dependence of TVM project, but it needs 
considerable effort and seems like reinventing the wheel, if there is no extra 
new idea to implement.
+# Future possiblities
+One day, when we make sure `PresburgerSet` can fully cover what `IntSet` 
provides, in terms of functionality and efficiency, we may consider phasing out 
the legacy IntSet, then no more decisions about `PresburgerSet` and `IntSet`.

[tvm-rfcs] branch main updated: [RFC] Introduce PresburgerSet (#99)

Reply via email to