Re: [PR] [Relax][TIR] Introduce new `cumsum` op for gpu [tvm]

via GitHub Fri, 26 Apr 2024 08:24:47 -0700


Lunderberg commented on code in PR #16934:
URL: https://github.com/apache/tvm/pull/16934#discussion_r1581185569



##########
python/tvm/relax/backend/dispatch_sort_scan.py:
##########
@@ -154,7 +154,48 @@ def visit_call_(self, call: relax.Call) -> relax.Expr:
         if call.op.name in ("relax.cumprod", "relax.cumsum"):
             tgt = self._get_target(call.struct_info)
             axis = int(call.attrs.axis) if call.attrs.axis is not None else 
call.attrs.axis
+            shape = call.struct_info.shape
             kwargs = {}
+            if (
+                (axis == -1 or axis == len(shape) - 1)

Review Comment:
   For tensors of unknown shape, the `shape` field is none.  Instead of 
`len(call.struct_info.shape)`, can we use `call.struct_info.ndim`?  
(Alternatively, since it looks like the implementation requires an explicit 
shape in order to apply a reshape, we could add `shape is not None` to this 
condition.)



##########
tests/python/relax/test_backend_dispatch_sort_scan.py:
##########
@@ -399,5 +400,32 @@ def foo(x: R.Tensor((2, 3), "float32", "vulkan")):
     assert_structural_equal(mod, expected_mod)
 
 
[email protected]_cuda
+def test_dispatch_cumsum_gpu():
+    """Test cumsum kernel dispatch and numerical correctness"""
+
+    @I.ir_module
+    class Module:
+        @R.function
+        def main(x: R.Tensor(("m", "n"), "int32")):
+            with R.dataflow():
+                gv = R.cumsum(x, axis=-1, exclusive=False)
+                R.output(gv)
+            return gv
+
+    size = (8, 2000)
+    np_data = np.random.randint(0, 10, size).astype("int32")
+    np_cumsum = np.cumsum(np_data, axis=-1)
+    for target in ["cuda", "vulkan -supports_int64=1"]:

Review Comment:
   Nitpick: Use `@tvm.testing.parametrize_targets("cuda", "vulkan 
-supports_int64=1")` instead of looping over each target.  This performs each 
test case in a separate pytest environment, 
   
   * Exercises each test in a separate pytest case.  Can distinguish between 
failure on one specific backend as opposed to failure on every backend.
   * Applies the appropriate `@tvm.testing.requires_*` marks for each target.  
Currently, this test would fail if a developer runs it with `set(USE_CUDA ON)` 
and `set(USE_VULKAN OFF)`.
   
   ```python
   @tvm.testing.parametrize_targets("cuda", "vulkan -supports_int64=1")
   def test_dispatch_cumsum_gpu(target, dev):
      ...
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Re: [PR] [Relax][TIR] Introduce new `cumsum` op for gpu [tvm]

Reply via email to