zhiics opened a new pull request #6337: URL: https://github.com/apache/incubator-tvm/pull/6337
Currently, dynamic models can only be executed on CPU. GPU execution is not allowed for these models because they rely on shape functions to perform runtime type inference. These functions may contain various control logic to derive the shape of a tensor at runtime; they are never compute intensive and are therefore designed to run on CPU. Consequently, we must use CPU to execute these functions even when running the whole model on other devices. This PR enables heterogeneous execution for the Relay VM to support dynamic models on devices other than CPU. More specifically, it includes the following changes:

- [x] makes the memory_alloc and memory plan passes context aware when inserting vm/memory dialects.
- [x] designs a union-find based context analysis pass to analyze the device context of each IR node in a Relay program [Thanks @jroesch and @icemelon9 for help]
- [x] implements a DeviceCopy instruction in the VM to copy data directly across different devices.
- [x] enables GPU tests for various unit tests involving dynamic inputs/shape functions, namely those in test_any.py, test_adt.py, and test_vm.py
- [x] fixes several bugs in the VM that are manifested by heterogeneous execution

cc @icemelon9 @jroesch @mbrookhart @wweic
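To illustrate the idea behind the union-find based context analysis, here is a minimal, hypothetical sketch (not TVM's actual implementation): expressions connected by ordinary dataflow are unified into a single device domain, while a device copy is the only point where two domains with different devices may meet. All names (`DeviceDomain`, `find`, `unify`) are made up for this example.

```python
# Hypothetical sketch of union-find device-context analysis.
# Values that flow into the same op must live on the same device,
# so their domains are unified; a device copy bridges two domains.

class DeviceDomain:
    """Union-find node holding a (possibly unknown) device assignment."""
    def __init__(self, device=None):
        self.device = device   # e.g. "cpu" or "gpu"; None = not yet known
        self.parent = self

def find(d):
    # Path-compressing find: walk to the representative of the set.
    while d.parent is not d:
        d.parent = d.parent.parent
        d = d.parent
    return d

def unify(a, b):
    """Merge two domains; concrete device assignments must agree."""
    a, b = find(a), find(b)
    if a is b:
        return a
    if a.device is not None and b.device is not None and a.device != b.device:
        raise ValueError("conflicting device assignments")
    # Keep the concrete device if either side has one.
    if a.device is None:
        a.device = b.device
    b.parent = a
    return a

# Example: two intermediate tensors feed the same GPU kernel, so
# unifying one of them with a "gpu" domain propagates to the other.
x = DeviceDomain()
y = DeviceDomain()
unify(x, y)                     # x and y flow into the same op
unify(x, DeviceDomain("gpu"))   # the op is assigned to GPU
print(find(y).device)           # -> gpu
```

In this model, a shape function's inputs and outputs would be placed in a separate domain pinned to `"cpu"`, and the pass would insert a DeviceCopy wherever a CPU domain and a GPU domain are connected.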
