Lunderberg opened a new pull request, #15453: URL: https://github.com/apache/tvm/pull/15453
Prior to this commit, `nn.Module` emitted its Relax expressions directly into the body of the function being built by `relax.BlockBuilder.current()`. For large models, this can result in extremely large function bodies, making it difficult to identify regions within the module.

This commit adds an option, `nn.Module.define_subroutine`, which defaults to `False` (the current behavior). If it is set to `True` (within a subclass, on an instance, or globally on `nn.Module`), calling the module produces a subroutine representing the module's execution, together with a call into that subroutine. For example, calling a `class Linear(nn.Module)` would produce a `def linear(arg, weights)` function definition and a `module.linear(arg, weights)` function call. To ensure correct shape propagation, a separate subroutine is generated for each unique set of argument shapes passed to the `nn.Module` subclass.

The PR branch contains three commits for ease of review. The first commit extends `relax.BlockBuilder` so that new function definitions can be generated while another function is already being built. The second commit extends `nn.Module` to optionally generate and call subroutine functions. The third commit adds type checks for backwards compatibility with [`mlc-ai`'s implementation of llama](https://github.com/mlc-ai/mlc-llm/blob/main/mlc_llm/models/llama.py) when generating subroutines.
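The shape-keyed subroutine generation described above can be illustrated with a small pure-Python toy. This is only a sketch and does not use TVM: `ToyBuilder`, `ToyModule`, and the `linear1` naming scheme for a second shape variant are hypothetical and are not taken from the PR. Only the `define_subroutine` flag name and its default of `False` come from the description.

```python
class ToyBuilder:
    """Hypothetical stand-in for a function builder; not relax.BlockBuilder."""

    def __init__(self):
        self.functions = {}  # subroutine name -> argument shapes it was built for
        self.main_body = []  # statements emitted into the calling function
        self.cache = {}      # (module class name, argument shapes) -> subroutine name


class ToyModule:
    """Hypothetical stand-in for nn.Module."""

    # Class-level default. A subclass, an instance, or the base class itself
    # can override it, because Python attribute lookup checks the instance
    # first, then the subclass, then the base class.
    define_subroutine = False

    def __call__(self, builder, *shapes):
        cls_name = type(self).__name__
        if not self.define_subroutine:
            # Default behavior: the module body is emitted inline.
            builder.main_body.append(f"inline {cls_name.lower()}{shapes}")
            return
        key = (cls_name, shapes)
        name = builder.cache.get(key)
        if name is None:
            # First call with this set of shapes: define a new subroutine.
            # The numeric suffix is an assumption made for this sketch.
            variants = sum(1 for k in builder.cache if k[0] == cls_name)
            name = cls_name.lower() if variants == 0 else f"{cls_name.lower()}{variants}"
            builder.cache[key] = name
            builder.functions[name] = shapes
        # Every call emits a call site into the subroutine.
        builder.main_body.append(f"call {name}{shapes}")


class Linear(ToyModule):
    pass


builder = ToyBuilder()
layer = Linear()
layer(builder, (1, 16), (16, 16))  # flag is False: emitted inline
layer.define_subroutine = True     # opt in on this instance only
layer(builder, (1, 16), (16, 16))  # defines "linear" and calls it
layer(builder, (1, 16), (16, 16))  # same shapes: reuses "linear"
layer(builder, (4, 16), (16, 16))  # new shapes: defines a second subroutine
```

In this toy, the three call sites made after opting in share two subroutine definitions, one per distinct set of argument shapes, while the first call is still emitted inline.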
