zhanghaohit opened a new pull request #6124: URL: https://github.com/apache/incubator-tvm/pull/6124
This is related to #5840 and is split from PR #5842.

Originally, device types were propagated based on the post-DFS traversal order of the graph, which may give inconsistent results if the argument order changes. In addition, it handles some cases incorrectly, e.g., the first residual block in Resnet50.

The first few layers of Resnet50 are depicted in the following figure (top to bottom is in DFS order). Basically, we want all the layers to run on the FPGA device, except the first and last few layers. In the original device propagation algorithm, which follows the post-DFS order, the conv2d layers in grey are assigned the `CPU` device type: we encounter `copy2` first, after which the three grey conv2d nodes are marked with the source device type of `copy2` (i.e., `CPU`), which is not correct.

<img src="https://raw.githubusercontent.com/4paradigm/incubator-tvm/feature/images/docs/resnet50.png" alt="Resnet50" width=300 style="float: centre; margin-left: 50px;" />

By changing the device annotation behaviour, we can support more complex graph structures.

----------------------------------------------------------------
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected]
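The ordering problem described in the PR text can be reproduced on a toy graph. The following is a minimal sketch in plain Python, not TVM code: `Node`, `post_dfs`, `propagate_by_order`, `propagate_by_edges`, `build`, and all node names are hypothetical. The graph is only loosely modelled on the description of the residual block, and the reverse walk over the post-DFS list is an assumed simplification of the original behaviour. The edge-following variant is one illustrative alternative and is not necessarily what this PR implements.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    """Toy graph node (hypothetical; not a TVM/Relay class)."""
    name: str
    args: list = field(default_factory=list)
    copy: tuple = None  # (src_device, dst_device) if this is a device-copy node


def post_dfs(root):
    """Return the nodes in post-order DFS, visiting arguments left to right."""
    order, seen = [], set()

    def visit(node):
        if id(node) in seen:
            return
        seen.add(id(node))
        for arg in node.args:
            visit(arg)
        order.append(node)

    visit(root)
    return order


def propagate_by_order(root, out_device):
    """Assumed simplification of order-based propagation: walk the post-DFS
    list backwards; every copy node resets the current device to its source."""
    current, devices = out_device, {}
    for node in reversed(post_dfs(root)):
        if node.copy is not None:
            current = node.copy[0]
        devices[node.name] = current
    return devices


def propagate_by_edges(root, in_device):
    """Illustrative alternative: follow data edges. A copy node yields its
    destination device; any other node inherits the device of its first input."""
    devices = {}
    for node in post_dfs(root):  # arguments are always visited before users
        if node.copy is not None:
            devices[node.name] = node.copy[1]
        elif node.args:
            devices[node.name] = devices[node.args[0].name]
        else:
            devices[node.name] = in_device
    return devices


def build(swap_add_args=False):
    """Residual-style block: two branches from `pool`, each behind its own copy."""
    x = Node("x")
    conv0 = Node("conv0", [x])
    pool = Node("pool", [conv0])
    copy1 = Node("copy1", [pool], copy=("CPU", "FPGA"))
    conv_a1 = Node("conv_a1", [copy1])
    conv_a2 = Node("conv_a2", [conv_a1])
    conv_a3 = Node("conv_a3", [conv_a2])
    copy2 = Node("copy2", [pool], copy=("CPU", "FPGA"))
    conv_b = Node("conv_b", [copy2])
    args = [conv_b, conv_a3] if swap_add_args else [conv_a3, conv_b]
    return Node("add", args)


if __name__ == "__main__":
    for swap in (False, True):
        by_order = propagate_by_order(build(swap), out_device="FPGA")
        by_edges = propagate_by_edges(build(swap), in_device="CPU")
        print("swap_add_args =", swap)
        print("  order-based:", by_order["conv_a1"], by_order["conv_b"])
        print("  edge-based: ", by_edges["conv_a1"], by_edges["conv_b"])
```

In this toy model the order-based walk places `conv_a1` to `conv_a3` on `CPU` because they precede `copy2` in the post-DFS list, and swapping the arguments of `add` moves the wrong assignment to `conv_b` instead. The edge-following variant gives the same placement for both argument orders.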
