[GitHub] [tvm] masahi commented on pull request #7123: Parallelize cumsum in get_valid_counts
masahi commented on pull request #7123: URL: https://github.com/apache/tvm/pull/7123#issuecomment-755824086 Does this mean the NMS kernel, and not the `get_valid_counts` kernel, has an issue? I recognize the thread launch config `(1,1,1),(1024,1,1)`; it comes from my NMS change. But that kernel should be `fused_vision_non_max_suppression_kernel2`, not `fused_vision_non_max_suppression_kernel1` as shown above, so this is weird. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [tvm] masahi commented on pull request #7123: Parallelize cumsum in get_valid_counts
masahi commented on pull request #7123: URL: https://github.com/apache/tvm/pull/7123#issuecomment-754966258 @anijain2305 @trevor-m We should definitely use a fixed, real image for CI testing, as the PyTorch MaskRCNN test does: https://github.com/apache/tvm/blob/4c13ae9d17d1709ed7a777ce1bb72212e8d2559d/tests/python/frontend/pytorch/test_object_detection.py#L90-L95 Please send a PR.
[GitHub] [tvm] masahi commented on pull request #7123: Parallelize cumsum in get_valid_counts
masahi commented on pull request #7123: URL: https://github.com/apache/tvm/pull/7123#issuecomment-754912340 Hmm, strange: after running the SSD test on GPU a few times, I cannot reproduce the error anymore. Could this error be random? One annoying thing about this model is that compilation is extremely slow and requires increasing the stack size limit.
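The comment does not say which stack limit has to be raised. As a minimal sketch, assuming the limit in question is the Python interpreter's recursion limit together with the thread stack size, a deeply recursive job can be run in a worker thread with a larger stack. The helper name `run_with_large_stack` and its default sizes are hypothetical and are not part of TVM.

```python
import sys
import threading


def run_with_large_stack(fn, stack_bytes=64 * 1024 * 1024, recursion_limit=50_000):
    """Run fn() in a worker thread with a larger stack (hypothetical helper)."""
    result = {}

    def target():
        # Capture the return value or the exception so the caller can see it.
        try:
            result["value"] = fn()
        except BaseException as exc:
            result["error"] = exc

    old_limit = sys.getrecursionlimit()
    # stack_size() applies to threads created afterwards and returns the old size.
    old_stack = threading.stack_size(stack_bytes)
    sys.setrecursionlimit(recursion_limit)
    try:
        worker = threading.Thread(target=target)
        worker.start()
        worker.join()
    finally:
        # Restore the previous settings for the rest of the process.
        sys.setrecursionlimit(old_limit)
        threading.stack_size(old_stack)

    if "error" in result:
        raise result["error"]
    return result["value"]
```

If the limit that matters is the shell's `ulimit -s` instead, it has to be raised before the process starts, and this helper would not be sufficient.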
[GitHub] [tvm] masahi commented on pull request #7123: Parallelize cumsum in get_valid_counts
masahi commented on pull request #7123: URL: https://github.com/apache/tvm/pull/7123#issuecomment-754891319 I can reproduce the issue by running the SSD test in tensorflow/test_forward.py with the cuda target:
```
terminate called after throwing an instance of 'dmlc::Error'
  what(): [05:42:13] /home/masa/projects/dev/tvm/src/runtime/cuda/cuda_device_api.cc:126:
---
An internal invariant was violated during the execution of TVM.
Please read TVM's error reporting guidelines.
More details can be found here: https://discuss.tvm.ai/t/error-reporting/7793.
---
  Check failed: e == cudaSuccess || e == cudaErrorCudartUnloading == false: CUDA: an illegal memory access was encountered
Stack trace:
  [bt] (0) /home/masa/projects/dev/tvm/build/libtvm.so(+0x14aa8e8) [0x7f4fcb8ca8e8]
  [bt] (1) /home/masa/projects/dev/tvm/build/libtvm.so(tvm::runtime::CUDADeviceAPI::FreeDataSpace(DLContext, void*)+0xe4) [0x7f4fcb8cabe4]
  [bt] (2) /home/masa/projects/dev/tvm/build/libtvm.so(tvm::runtime::NDArray::Internal::DefaultDeleter(tvm::runtime::Object*)+0x5b) [0x7f4fcb8593fb]
  [bt] (3) /home/masa/projects/dev/tvm/build/libtvm.so(tvm::runtime::NDArray::CopyTo(DLContext const&) const+0x325) [0x7f4fcb5e4915]
  [bt] (4) /home/masa/projects/dev/tvm/build/libtvm.so(tvm::runtime::vm::CopyTo(tvm::runtime::ObjectRef, DLContext const&)+0x311) [0x7f4fcb884b11]
  [bt] (5) /home/masa/projects/dev/tvm/build/libtvm.so(tvm::runtime::vm::VirtualMachine::RunLoop()+0x2aee) [0x7f4fcb880dde]
  [bt] (6) /home/masa/projects/dev/tvm/build/libtvm.so(tvm::runtime::vm::VirtualMachine::Invoke(tvm::runtime::vm::VMFunction const&, std::vector > const&)+0x27) [0x7f4fcb881c17]
  [bt] (7) /home/masa/projects/dev/tvm/build/libtvm.so(+0x14621f0) [0x7f4fcb8821f0]
  [bt] (8) /home/masa/projects/dev/tvm/build/libtvm.so(TVMFuncCall+0x63) [0x7f4fcb835613]
```
@trevor-m Are you sure this is caused by the `get_valid_counts` change? I've also changed NMS in https://github.com/apache/tvm/pull/7172. I hope that change is fine.
[GitHub] [tvm] masahi commented on pull request #7123: Parallelize cumsum in get_valid_counts
masahi commented on pull request #7123: URL: https://github.com/apache/tvm/pull/7123#issuecomment-752856182 @Laurawly The plan is to merge this first, then generalize the cumsum IR in this PR into a reusable exclusive scan primitive. After that, we can update our CUDA `argwhere` implementation to use exclusive scan + compaction, and introduce a NumPy-style `cumsum` operator.
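To illustrate the idea of exclusive scan + compaction mentioned above, here is a minimal sequential sketch in plain Python. It is not the TVM IR from this PR; the function names `exclusive_scan`, `compact_indices`, and `cumsum` are chosen for illustration only, and the input is assumed to be a 1-D list of 0/1 flags.

```python
def exclusive_scan(values):
    """Return out where out[i] == sum(values[:i]), so out[0] is 0."""
    out = []
    running = 0
    for v in values:
        out.append(running)
        running += v
    return out


def compact_indices(flags):
    """1-D argwhere built from an exclusive scan followed by a scatter."""
    offsets = exclusive_scan(flags)
    # Number of selected elements: last offset plus the last flag.
    total = offsets[-1] + flags[-1] if flags else 0
    result = [0] * total
    # Each iteration is independent, so on a GPU it could map to one thread.
    for i, flag in enumerate(flags):
        if flag:
            result[offsets[i]] = i
    return result


def cumsum(values):
    """Inclusive cumulative sum derived from the exclusive scan."""
    return [e + v for e, v in zip(exclusive_scan(values), values)]
```

The scan gives every selected element its output slot up front, which is why the scatter step has no dependency between iterations. A parallel implementation would replace the sequential loop in `exclusive_scan` with a parallel scan.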
[GitHub] [tvm] masahi commented on pull request #7123: Parallelize cumsum in get_valid_counts
masahi commented on pull request #7123: URL: https://github.com/apache/tvm/pull/7123#issuecomment-747865155 @mbrookhart Can you revive the disabled topi `get_valid_count` test? It seems this test needs some updating. https://github.com/apache/tvm/blob/76b4ad09386f26ff360d0276745fe882d3ba6b0d/tests/python/topi/python/test_topi_vision.py#L124-L129