junrushao1994 commented on a change in pull request #25: URL: https://github.com/apache/tvm-rfcs/pull/25#discussion_r698092449
########## File path: rfcs/0025-add-pytorch-tvm.md ##########

```diff
@@ -0,0 +1,265 @@
+- Feature Name: PyTorchTVM
+- Start Date: 2021-08-24
+- RFC PR: [apache/tvm-rfcs#0025](https://github.com/apache/tvm-rfcs/pull/25)
+- GitHub Issue: TODO
+
+# Summary
+[summary]: #summary
+
+This RFC adds a `PyTorchTVM` module that compiles TorchScript to TVM and makes the accelerated module usable from PyTorch.
+
+To make TVM more accessible to PyTorch users, we propose a `PyTorchTVM` module that supports the following workflow:
+1. Convert a TorchScript module to a TVM graph.
+2. Build and tune the TVM graph.
+3. Export the well-tuned TVM graph as a PyTorch op.
+4. Trace the TVM PyTorch op with `torch.jit.trace` together with other PyTorch modules, then save/load/serve it as a normal PyTorch model.
+
+# Motivation
+[motivation]: #motivation
+
+The PyTorch framework is increasingly adopted for both research and production. At the same time, PyTorch lacks an effective inference acceleration toolchain, which is a major concern in industry. Existing acceleration paths include:
+
+* PyTorch → ONNX → TensorRT/TVM
+* PyTorch → TorchScript → TensorRT/TVM
+
+From our perspective, both ONNX and TensorRT have limitations:
+
+* ONNX cannot cover all models with dynamic control flow (e.g. `for` loops)
+* TensorRT can only accelerate some standard networks
```

Review comment:

```suggestion
Below are the two classic acceleration workflows that form the status quo:

- PyTorch -> ONNX -> TensorRT/TVM
- PyTorch -> TorchScript -> TensorRT/TVM

However, both workflows introduce one level of indirection, which means that the flaws of either level are inherited by the pipeline. For example:

- ONNX offers no support for models with dynamic control flow, so the first workflow cannot support dynamic models.
- The coverage of TensorRT is often limited to a range of standard neural networks, so both workflows, if offloaded to TensorRT, are hard to make effective on real-world irregular models.
Furthermore, neither of the existing workflows provides an interface practical enough for researchers to widely adopt and reuse. For example, loading the exported binary artifacts back into Python and using them together with PyTorch requires deep knowledge of TVM runtime modules.
```
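As background for step 4 of the proposed workflow, the trace/save/load round trip on the pure-PyTorch side can be sketched as follows. This is a minimal sketch: the TVM-backed op is replaced here by a plain `nn.Module` stand-in (`AddOne`), since the `PyTorchTVM` API is only proposed in this RFC and not yet available.

```python
import torch

# Hypothetical stand-in for the accelerated op; in the proposed workflow this
# would be the exported TVM PyTorch op rather than a plain module.
class AddOne(torch.nn.Module):
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + 1.0

model = AddOne().eval()
example_input = torch.zeros(2)

# Step 4: jit-trace the op with an example input...
traced = torch.jit.trace(model, example_input)

# ...then save/load/serve it as a normal TorchScript model.
torch.jit.save(traced, "model.pt")
reloaded = torch.jit.load("model.pt")
print(reloaded(torch.zeros(2)))  # tensor([1., 1.])
```

The point of the workflow is that once the TVM graph is wrapped as a PyTorch op, this standard TorchScript serialization path works unchanged, with no TVM-specific loading code on the serving side.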
