mbs-octoml opened a new pull request, #11770:
URL: https://github.com/apache/tvm/pull/11770
I tried to do to the TensorRT integration what #11631 did to the CUTLASS
integration, viz:
- Make sure all compilation options are passed in Target instances. This
helps Collage.
- Use a custom pass invoked via RelayToTIRTargetHooks instead of the
relay.ext.$toolchain mechanism. This helps us decouple external codegen
from lowering.
This PR collects the prep for that change:
- TensorRT uses the JSONSerializer visitor to encode each partition
function. Previously, when the visitor encountered a Constant it simply
generated and recorded a name for the constant. Then, completely
separately, and via a callback in TECompiler, the function was visited
again, in the same order and with the same name-generation convention,
by a ConstantUpdater to actually collect the bindings, which were then
encoded into a ConstLoaderModule to be made available at runtime.
However, if all TensorRT compilation is to be done by a stand-alone pass,
there's no TECompiler callback hackery available. So I've added a
"const_name_to_ndarray" attribute of type Map<String, runtime::NDArray>
to the IRModule, so that named constants can be accumulated throughout
compilation by any pass which needs to do so. The Graph, AOT and VM
executors are all updated to merge those constants into the final
runtime artifact.
(Compare with "Constants", the equivalent attribute for extracting TIR
AllocateConsts.)
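The accumulation pattern described above can be sketched in plain Python. This is a toy stand-in, not the TVM API: `Module`, `record_constant`, `run_codegen_pass` and `finalize` are invented names for illustration; only the attribute key "const_name_to_ndarray" comes from this PR.

```python
# Toy sketch of the pattern: passes record named constants into one
# module-level map, and the executor merges that map at the end.

ATTR_KEY = "const_name_to_ndarray"  # attribute name introduced by this PR


class Module:
    """Stand-in for an IRModule carrying a dictionary of attributes."""

    def __init__(self):
        self.attrs = {}


def record_constant(mod, name, value):
    # Any pass which externalizes a constant appends to the shared map.
    mod.attrs.setdefault(ATTR_KEY, {})[name] = value


def run_codegen_pass(mod, constants):
    # E.g. a stand-alone TensorRT-style pass naming constants it serializes.
    for name, value in constants.items():
        record_constant(mod, name, value)


def finalize(mod):
    # Executor-side merge: fold the accumulated constants into the
    # final runtime artifact (here just a plain dict).
    return dict(mod.attrs.get(ATTR_KEY, {}))


mod = Module()
run_codegen_pass(mod, {"trt_0_const_0": [1.0, 2.0]})
run_codegen_pass(mod, {"trt_1_const_0": [3.0]})
artifact_consts = finalize(mod)
```

The point of the shared attribute is that no single pass owns the map: each contributor only appends, and only the executor reads the whole thing.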
- The TensorRT tests use the create_executor interface, but it wasn't
quite ready for the new, more general form of passing a list of targets.
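The kind of generalization that bullet needs can be sketched with a small normalization helper. This is plain illustrative Python, not TVM code; `normalize_targets` is an invented name, standing in for accepting either a single target spec or a list of them.

```python
# Toy sketch: canonicalize a target argument that may be a single
# target string or a list/tuple of targets into a list.

def normalize_targets(target):
    """Return the target spec as a list of target strings."""
    if isinstance(target, str):
        return [target]
    if isinstance(target, (list, tuple)):
        return list(target)
    raise TypeError(f"unsupported target spec: {target!r}")


single = normalize_targets("llvm")
several = normalize_targets(["llvm", "cuda"])
```

Callers written against the old single-target form keep working, while new callers can hand over the full list.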
- I want TensorRT compilation to work out of the box, without the need
for any special targets when all the default options should apply. I've
gone back and made the CUTLASS integration I did follow the same
convention.
- TensorRT actually needs to 'undo' partitionings in some situations.
Added an InlineCompilerFunctions pass to make that robust. In
particular, it must undo both the 'partitioning' (i.e. separating out
the "Compiler" function) and any 'compositing' (i.e. separating out
small sub-graphs as "Composite" functions).