Archermmt commented on issue #15233: URL: https://github.com/apache/tvm/issues/15233#issuecomment-1769573439
@Lunderberg Emmm....I've also thought about which method is better: 1. Convert in C++ to enable eager error detection; 2. Convert by string generation to enable independent loading. Both have advantages and disadvantages.

The first method (let's say a converter, in either C++ or Python), like relax.builder, can check and normalize each op while building the graph, but that limits the deployment possibilities. For example, if I need to compare results between an old version of TVM without relax and the new unity version (which may be a real task for me....), I have to spend a lot of time setting up environments and dumping test data with the converter solution. And MSC is designed not only for converting to relax, but also to torch/torch2, tensorflow/tf2, tensorrt, and so on. Considering that models are dispatched to different frameworks and environments, the converter may not be a good solution.

The second method (let's say string generation), like the cutlass codegen, first generates strings and then processes them into a kernel/model/engine. That means the codegen process skips checking and normalization, which may lead to lazy error detection. However, the strings can be written out as script/C++ files and loaded in any environment. This method separates codegen from loading, which is essential for fast model release, especially on the cloud (where different environments and frameworks are used).

And as mentioned in the RFC: https://discuss.tvm.apache.org/t/rfc-unity-msc-introduction-to-multi-system-compiler/15251 MSC currently targets model optimization problems based on relax. That means the codegen part should be able to use features of the different frameworks, such as training, weight reuse/reloading, distributed systems, and so on. Currently I only have experience "describing" these features in Python with string generation (not that good at C++ -_-). To partially solve the error detection problem, the codegen in MSC generates not only the model but also a unit test.
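As a rough illustration of the string-generation trade-off described here, below is a toy sketch. It is not MSC's actual codegen: the graph format and every function name (`generate_model_source`, `generate_test_source`, `load_and_test`) are made up for this example. Generation itself does no checking, the emitted source can be loaded anywhere a Python interpreter exists, and the generated unit test is what surfaces mistakes at load time.

```python
# Toy illustration only: the graph format and all names below are
# hypothetical and are not part of MSC or TVM.

OP_TEMPLATES = {
    "add": "{out} = {a} + {b}",
    "mul": "{out} = {a} * {b}",
}


def generate_model_source(nodes, output):
    """Emit Python source for `model(x)`. Nothing is executed or checked here."""
    lines = ["def model(x):"]
    for node in nodes:
        # An unknown op is emitted as a plain call, so the mistake only
        # shows up later, when the generated code is run (lazy detection).
        template = OP_TEMPLATES.get(node["op"], "{out} = {op}({a}, {b})")
        lines.append("    " + template.format(**node))
    lines.append(f"    return {output}")
    return "\n".join(lines) + "\n"


def generate_test_source(sample_input, expected):
    """Emit a unit test next to the model so errors are caught on loading."""
    return (
        "def test_model():\n"
        f"    assert model({sample_input!r}) == {expected!r}\n"
    )


def load_and_test(source):
    """Load generated source in a fresh namespace and run its unit test."""
    namespace = {}
    exec(source, namespace)
    namespace["test_model"]()
    return namespace["model"]


if __name__ == "__main__":
    graph = [
        {"op": "add", "out": "t0", "a": "x", "b": "1"},
        {"op": "mul", "out": "t1", "a": "t0", "b": "2"},
    ]
    source = generate_model_source(graph, "t1") + generate_test_source(3, 8)
    print(source)  # this string could equally be written to a .py file
    model = load_and_test(source)
    print(model(10))
```

A graph containing an op that has no template (say `relu6`) still generates without complaint; the `NameError` only appears when `load_and_test` runs the generated test, which is the lazy-detection cost that the generated unit test is meant to contain.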
Using the unit test, developers can locate and solve problems efficiently. I think we can leave this part, i.e. enabling a C++ converter for MSC, as a TODO. After the main target is reached, I'll consider building a converter, or maybe directly using relax as the core IR.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
