mbs-octoml opened a new pull request, #11979:
URL: https://github.com/apache/tvm/pull/11979
This does for the TensorRT integration what #11631 did for the CUTLASS
integration.
- All compilation options are captured within the attributes of a Target of
kind "tensorrt" (instead of the "relay.ext.tensorrt.options" attribute in
PassContext). This means all BYOC configuration options needed by Collage can
be captured uniformly by a list-of-Targets. It also means RPC boundaries (as
used internally at OctoML) only need to worry about maintaining the fidelity of
the Target instance(s) rather than reaching into the PassContext.
- Compilation is switched from function-at-a-time (relying on the
TECompiler) to IRModule-at-a-time (using the RelayToTIR target-specific hook
mechanism). Though not strictly necessary for Collage, I want to check that the
path is now clear to deprecate the support for BYOC in TECompiler.
- Get all the TensorRT tests going again, except for a few I've disabled
with a cross-link to a new issue #11765. CAUTION: The TensorRT runtime is not
supported in CI so many of these tests are just cosmetic.
- While trying to track down a 'free(): invalid pointer' error in
test_tensorrt_int8_exp.py, I made the TensorRT allocs/frees more robust, but it
turns out it's also broken in main. No harm leaving these changes in though.
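
To illustrate the first point above (options carried on the Target rather than in the PassContext), here is a minimal, self-contained sketch in plain Python. It does not use TVM: `SketchTarget`, `compile_with_targets`, and the option names `use_fp16` and `max_workspace_size` are hypothetical stand-ins chosen for illustration, not the actual API or attribute set from this PR. The idea it shows is that if every option lives on a target, serializing the list of targets is enough to carry the full configuration across an RPC-like boundary.

```python
import json
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SketchTarget:
    """Hypothetical stand-in for a compilation target (not TVM's Target class)."""

    kind: str
    attrs: dict = field(default_factory=dict)

    def to_json(self) -> str:
        # Everything needed to rebuild the target is in kind + attrs.
        return json.dumps({"kind": self.kind, "attrs": self.attrs}, sort_keys=True)

    @staticmethod
    def from_json(text: str) -> "SketchTarget":
        data = json.loads(text)
        return SketchTarget(kind=data["kind"], attrs=data["attrs"])


def compile_with_targets(targets):
    """Pretend compile step: reads options only from the targets it is given,
    never from ambient/global configuration."""
    return {t.kind: dict(t.attrs) for t in targets}


# Option names here are illustrative assumptions, not confirmed by the PR text.
targets = [
    SketchTarget("cuda"),
    SketchTarget("tensorrt", {"use_fp16": True, "max_workspace_size": 1 << 30}),
]

# Simulate an RPC boundary: only the serialized targets cross it.
wire = [t.to_json() for t in targets]
received = [SketchTarget.from_json(w) for w in wire]

options_seen = compile_with_targets(received)
print(options_seen)
```

The contrast is with a design where the backend reads its options from a separate context object: there, the remote side would also need that context reproduced faithfully, which is the extra coupling the first bullet says this change avoids.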