[GitHub] [tvm] masahi commented on pull request #7303: [TOPI] Make cumsum IR reusable, add thrust scan
masahi commented on pull request #7303:
URL: https://github.com/apache/tvm/pull/7303#issuecomment-763456381

Thanks @mbrookhart @anijain2305

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
masahi commented on pull request #7303:
URL: https://github.com/apache/tvm/pull/7303#issuecomment-763291062

@anijain2305 I added an empty tensor test in https://github.com/apache/tvm/pull/7303/commits/20afc3243a17f48084204855f498c7f9af1cad7a

OpenCL seems to have a problem with zero-size buffers, but otherwise both the TIR scan and the thrust scan handle the empty case without issue. Please take a look.
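For reference, the behaviour such an empty-tensor test checks can be sketched against a NumPy baseline (this is an illustrative sketch, not the actual TVM test added in the commit above):

```python
import numpy as np

# Illustrative sketch: a cumsum/scan over an empty tensor should return
# an empty tensor of the same dtype rather than crashing. This mirrors
# what the empty-tensor test added above verifies for the TVM kernels.
data = np.array([], dtype="float32")
result = np.cumsum(data)

assert result.shape == (0,)
assert result.dtype == np.float32
```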
masahi commented on pull request #7303:
URL: https://github.com/apache/tvm/pull/7303#issuecomment-763264832

> Once it is merged, I can try on my end with TF models as well.

A perf improvement is not expected, since this only improves `get_valid_count` slightly if you use the thrust scan instead of the TIR scan. The purpose of this PR is to enable parallelization of other ops that are difficult to parallelize without a scan primitive. `argwhere` is a perfect example that I'll demonstrate soon after this one.

@anijain2305 The term you want to search for is "gpu stream compaction".
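To illustrate the "gpu stream compaction" idea mentioned above: an exclusive scan over a 0/1 predicate array gives each kept element its destination index in the compacted output, so filtering becomes three data-parallel passes (flag, scan, scatter). A minimal NumPy sketch, with illustrative function names rather than TVM APIs:

```python
import numpy as np

def exclusive_scan(x):
    # exclusive prefix sum: out[i] = sum(x[:i])
    # (inclusive cumsum minus the element itself)
    return np.cumsum(x) - x

def stream_compaction(data, predicate):
    # 1. Flag elements that satisfy the predicate.
    flags = predicate(data).astype(np.int64)
    # 2. Exclusive scan of the flags gives each flagged element
    #    its destination index in the compacted output.
    indices = exclusive_scan(flags)
    out = np.empty(int(flags.sum()), dtype=data.dtype)
    # 3. Scatter: each flagged element writes to its scanned index.
    #    On a GPU, each of these three steps parallelizes cleanly;
    #    the scan is the only step needing cross-thread cooperation.
    keep = flags == 1
    out[indices[keep]] = data[keep]
    return out

data = np.array([3, -1, 4, -1, 5, -9, 2])
print(stream_compaction(data, lambda x: x > 0))  # [3 4 5 2]
```

This is exactly the pattern an `argwhere`-style op needs: the output size is data-dependent, and the scan computes both the total count and each element's output slot.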
masahi commented on pull request #7303:
URL: https://github.com/apache/tvm/pull/7303#issuecomment-763257824

Hmm, interesting. I've never created a test case with an empty tensor; is that possible? Note that the IR is copied straight from https://github.com/apache/tvm/pull/7303, so the same guard against empty tensors is here: https://github.com/apache/tvm/blob/4e13a3f4a04300113e9332ef581859cb0a40a082/python/tvm/topi/cuda/scan.py#L59
masahi commented on pull request #7303:
URL: https://github.com/apache/tvm/pull/7303#issuecomment-763106769

1. Right now, inclusive scan can be supported by `exclusive_scan(data) + data`. I think that is fine for now, given that our scan IR is far from stable and we don't want to maintain two IRs just for the sake of removing the additional sum.

2. Yes, we can definitely do that. But this PR is already not small, and I want to keep the original IR as close as possible in this PR. There are other TODO items for scan (e.g. supporting other binary ops), so I hope we can address this problem in the future as well.

A related discussion point: do you expect scan performance on a non-innermost axis to be slower than the innermost case? If so (which I believe is the case), I think supporting non-innermost axes and other ranks by

```
reshape + transpose + innermost scan + reshape and transpose back
```

is a good solution. It is definitely preferable in terms of implementation simplicity, allowing the scan implementation to focus on 1D or 2D inputs scanned along the innermost axis.
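The two reductions discussed above can be checked with a small NumPy sketch (function names are illustrative, not TVM APIs): inclusive scan recovered as `exclusive_scan(data) + data`, and a scan along an arbitrary axis reduced to the innermost case by moving the axis to the end, scanning, and moving it back.

```python
import numpy as np

def exclusive_scan(x):
    # exclusive prefix sum along the innermost axis:
    # inclusive cumsum minus the element itself
    return np.cumsum(x, axis=-1) - x

def inclusive_scan(x):
    # inclusive = exclusive + data; costs one extra elementwise add,
    # which avoids maintaining a second scan IR
    return exclusive_scan(x) + x

def scan_along_axis(x, axis):
    # reduce any axis to the innermost case:
    # move the scan axis to the end, scan, move it back
    moved = np.moveaxis(x, axis, -1)
    return np.moveaxis(inclusive_scan(moved), -1, axis)

a = np.arange(6).reshape(2, 3)
assert np.array_equal(inclusive_scan(a), np.cumsum(a, axis=-1))
assert np.array_equal(scan_along_axis(a, 0), np.cumsum(a, axis=0))
```

The `moveaxis` here stands in for the "reshape + transpose" step in the comment; the actual scan kernel only ever sees a contiguous innermost axis.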