lsy643 opened a new pull request #5613: URL: https://github.com/apache/incubator-tvm/pull/5613
For a dense operation such as `C = relay.op.dense(A, B)` with `A.shape = [215296, 1]` and `B.shape = [1, 4]`, neither of the current schedule algorithms, `dense_small_batch` or `dense_large_batch`, can find a valid config. The batch size is so large that it leads to a grid dimension larger than 65535, which is the maximum grid dimension size CUDA can currently handle. This PR therefore adds a new schedule algorithm to handle this situation.
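
The arithmetic behind the limit can be sketched in a few lines of Python. This is only an illustration, assuming one thread block per batch row along the affected grid axis; the actual axis binding in the existing schedules may differ. The helper names `grid_dim` and `min_rows_per_block` are hypothetical and are not part of TVM.

```python
# Illustrative sketch only: not TVM code, helper names are hypothetical.
MAX_GRID_DIM = 65535  # grid dimension limit cited in the description


def ceil_div(a, b):
    """Integer division rounded up."""
    return (a + b - 1) // b


def grid_dim(batch, rows_per_block):
    """Blocks needed along the batch axis when each block covers rows_per_block rows."""
    return ceil_div(batch, rows_per_block)


def min_rows_per_block(batch, max_grid=MAX_GRID_DIM):
    """Smallest rows-per-block value that keeps the grid dimension within max_grid."""
    return ceil_div(batch, max_grid)


batch = 215296  # A.shape[0] from the example above

# Assumed mapping of one block per row: the grid dimension exceeds the limit.
print(grid_dim(batch, 1), grid_dim(batch, 1) > MAX_GRID_DIM)  # 215296 True

# Covering several rows per block brings the grid dimension back under the limit.
rows = min_rows_per_block(batch)
print(rows, grid_dim(batch, rows))  # 4 53824
```

Under these assumptions, any schedule that covers at least 4 batch rows per block keeps the grid dimension for this shape within 65535.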