nverke opened a new pull request, #12394: URL: https://github.com/apache/tvm/pull/12394
**Background**

Creating more threads than there are HVX units degrades performance: a thread that cannot reserve an HVX unit ends up running its vector instructions on the scalar cores. Capping the maximum number of threads at the number of HVX units available on the device prevents this and results in better performance on Hexagon.

**Testing**

Ran the tests in simulator mode as well as on a Hexagon HDK to confirm that max_concurrency is locked to 4, as expected.
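
The capping rule described under Background can be sketched roughly as follows. This is an illustration only, not the code in this pull request: the function name `capped_max_concurrency` and the constant `HVX_UNITS` are hypothetical, and the value 4 is assumed from the max_concurrency observed in testing.

```python
# Hypothetical sketch; names and constants are illustrative, not TVM's API.
HVX_UNITS = 4  # assumption: matches the max_concurrency of 4 seen in testing


def capped_max_concurrency(requested_threads: int, hvx_units: int = HVX_UNITS) -> int:
    """Return the worker-thread count after capping at the HVX unit count."""
    if hvx_units < 1:
        raise ValueError("hvx_units must be at least 1")
    # Keep at least one worker, and never exceed the number of HVX units,
    # so that every worker thread can reserve an HVX unit of its own.
    return max(1, min(requested_threads, hvx_units))


if __name__ == "__main__":
    for requested in (1, 4, 6, 8):
        print(requested, "->", capped_max_concurrency(requested))
```

Under these assumptions, any request above the HVX unit count is clamped down to it, while smaller requests pass through unchanged.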
