leeexyz commented on a change in pull request #7497:
URL: https://github.com/apache/tvm/pull/7497#discussion_r586427561
##########
File path: tests/python/unittest/test_te_schedule_tensorize.py
##########
@@ -18,14 +18,22 @@
 from tvm import te
-def intrin_vadd(n):
+def intrin_vadd(xo, m, n):
     x = te.placeholder((n,), name="vx")
     y = te.placeholder((n,), name="vy")
-    z = te.compute(x.shape, lambda i: x[i] + y[i], name="z")
+    if m % n == 0:
+        body = lambda i: x[i] + y[i]
+    else:
+        body = lambda i: tvm.tir.Select(
+            xo * n + i < m, x[i] + y[i], tvm.tir.const(0, dtype=x.dtype)
+        )
+    z = te.compute(x.shape, body, name="z")
     def intrin_func(ins, outs):
         xx, yy = ins
         zz = outs[0]
+        # special handle needed to tackle tail loop part when m % n != 0
+        # here is tvm.min(n, m - xo * n)
Review comment:
> Is it possible to also improve `vadd` in this PR so that you can
reuse the `check` function for both test cases?
Thanks for the advice. I tried, but Tensorize is quite strict about the compute
pattern it matches, so I cannot use the same compute in `intrin_vadd` for both
cases. The two check functions also verify different results.
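
To make the difference between the two cases concrete, here is a minimal pure-Python sketch of the two intrinsic bodies from the diff. It is only a model, not TVM code: the helper names `vadd_tile`, `valid_lanes`, and `tiled_vadd` are hypothetical, and the zero padding of the tail tile is an assumption made so the model can read out-of-bounds lanes safely.

```python
# Pure-Python model of the two compute patterns; no TVM required.
# All function names here are hypothetical and exist only for illustration.


def vadd_tile(x_tile, y_tile, xo, m, n):
    """Model one n-wide tile of z = x + y over a length-m vector."""
    if m % n == 0:
        # Divisible case: every lane is in bounds, so a plain add suffices.
        return [x_tile[i] + y_tile[i] for i in range(n)]
    # Tail case: mirrors Select(xo * n + i < m, x[i] + y[i], 0) from the diff.
    return [x_tile[i] + y_tile[i] if xo * n + i < m else 0 for i in range(n)]


def valid_lanes(xo, m, n):
    """In-bounds lane count for tile xo, i.e. min(n, m - xo * n)."""
    return min(n, m - xo * n)


def tiled_vadd(x, y, n):
    """Add two equal-length lists tile by tile with tile width n."""
    m = len(x)
    num_tiles = (m + n - 1) // n  # ceiling division
    out = []
    for xo in range(num_tiles):
        # Assumption: pad out-of-bounds lanes with 0 so the model can index them.
        x_tile = [x[xo * n + i] if xo * n + i < m else 0 for i in range(n)]
        y_tile = [y[xo * n + i] if xo * n + i < m else 0 for i in range(n)]
        tile = vadd_tile(x_tile, y_tile, xo, m, n)
        # Keep only the lanes that belong to the real output.
        out.extend(tile[: valid_lanes(xo, m, n)])
    return out
```

With `m = 5` and `n = 2`, the last tile (`xo = 2`) has only one valid lane, so its body differs from the divisible case. That is roughly why a single compute declaration would not match both patterns.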