manupa-arm commented on a change in pull request #22:
URL: https://github.com/apache/tvm-rfcs/pull/22#discussion_r710308724
##########
File path: rfcs/0022-tir-non-scalar-constants.md
##########
@@ -0,0 +1,107 @@
+
+- Feature Name: tir_non_scalar_constants
+- Start Date: 2021-06-01
+- RFC PR: https://github.com/apache/tvm-rfcs/pull/22
+- GitHub Issue: TBD
+
+# 1. Summary
+
+This RFC proposes how non-scalar constants could be represented in TIR and used by passes in the lowering process.
+
+# 2. Motivation
+
+Currently, the non-scalar constants could be represented in Relay (relay.Constant) to be used by relay passes but not in TIR. Therefore, when performing lowering using TIR passes, we have to maintain a side-channel of tir::Var to constant non-scalar data mapping to perform transformations that could use the knowledge where some of the data are constants.
+
+Few example scenarios as further motivation :
+
+## Weight compression
+
+When lowering for accelerators (E.g. : [Arm(R) Ethos(TM)-U NPU](https://github.com/apache/tvm-rfcs/pull/11)), certain operations will need to get tiled to co-optimize performance and memory utilization. Such tiling patterns create slices of weights that need compressing that will end up with varying sizes. Therefore, the knowledge of some tir::Vars refer to constants are critical in the level of TIR to perform this.
+
+## Memory Planning
+
+The TIR program has the ability to express both inter and intra operator memory requirement, post-scheduling as explained further by [Unified Static Memory Planning RFC](https://github.com/apache/tvm-rfcs/pull/9). It would be better if the constants could be embedded to the TIR PrimFunc. Moreover, this allows various [target-dependent lowerings](https://github.com/apache/tvm-rfcs/pull/10), to produce TIR PrimFuncs with constants in it.

Review comment:
Doesn't Relay use show_meta=off for this exact problem with relay.constant? I don't see a printing problem associated with introducing the node.
Have a look at the PR: https://github.com/apache/tvm/pull/8472/files

The reason we prefer not to store them in the IRModule is that our passes will perform optimizations (such as slicing weights using compute_at and performing compression again), and every such pass would have to go through an indirection. That feels unnecessary if we had a leaf node to represent constants. Moreover, given such passes, being unable to represent non-scalar constants in TIR feels a bit limiting. Personally, I would rather not express a construct so intimately tied to the compute as an 'attribute'. Let me know if you have further concerns here.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
