Lunderberg commented on PR #16183: URL: https://github.com/apache/tvm/pull/16183#issuecomment-2273924915
First results: it looks like the implicit conversions aren't necessarily causing the overhead. I benchmarked constructing a TIR buffer, with the shape/strides provided either as Python integers or as TIR IntImm.

On `main` (after the FFI changes), the runtime is about the same regardless of how the shape/strides are provided. On `5a67a00bcb` (last commit before the FFI changes), it's about 80% slower [...] there's about a 40% overhead when providing Python integers (`[1,2,3]`) as compared to providing TIR IntImm (`[T.int32(1), T.int32(2), T.int32(3)]`).

This is rather surprising, because I expected the overhead to be present after the FFI changes, but this benchmark suggests that there's less overhead.

Next up, seeing how the FFI changes impact Relax `R.call_tir` operators.
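For anyone wanting to reproduce a comparison of this kind, here is a rough sketch of how it might be set up. It is not the script used for the numbers above. The helper names `best_time` and `overhead_percent` are made up for illustration, the buffer dtype and the reuse of the shape list as strides are arbitrary choices, and the TVM part assumes `tvm.tir.decl_buffer` and `tvm.tir.IntImm` are available in the checkout being measured.

```python
import timeit


def best_time(fn, number=1000, repeat=5):
    """Best per-call wall time in seconds over `repeat` runs of `number` calls."""
    return min(timeit.repeat(fn, number=number, repeat=repeat)) / number


def overhead_percent(baseline_s, candidate_s):
    """Relative overhead of the candidate over the baseline, in percent."""
    return (candidate_s / baseline_s - 1.0) * 100.0


def main():
    try:
        import tvm
    except ImportError:
        print("TVM is not installed; skipping the buffer benchmark")
        return

    # Same values, once as plain Python integers and once as TIR IntImm.
    int_values = [1, 2, 3]
    imm_values = [tvm.tir.IntImm("int32", v) for v in int_values]

    def with_python_ints():
        # Assumption: reusing the list as strides is only to exercise the
        # argument conversion, not to describe a meaningful layout.
        tvm.tir.decl_buffer(int_values, "float32", strides=int_values)

    def with_int_imm():
        tvm.tir.decl_buffer(imm_values, "float32", strides=imm_values)

    t_imm = best_time(with_int_imm)
    t_int = best_time(with_python_ints)
    print(f"IntImm:      {t_imm * 1e6:.2f} us per call")
    print(f"Python ints: {t_int * 1e6:.2f} us per call")
    print(f"Overhead of Python ints: {overhead_percent(t_imm, t_int):+.1f}%")


if __name__ == "__main__":
    main()
```

Running the same script on `main` and on `5a67a00bcb` would give the two data points being compared. Taking the minimum over several repeats is a common way to reduce noise from other processes, but it is only one possible choice.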
