zhanghaohit opened a new pull request #6124:
URL: https://github.com/apache/incubator-tvm/pull/6124


   This is related to #5840 and is split from PR #5842.
   
   Originally, the device type is propagated by traversing the graph in 
post-order DFS, so the result may change if the argument order changes. In 
addition, it handles some cases incorrectly, e.g., the first residual block in 
ResNet50. The first few layers of ResNet50 are depicted in the following figure 
(top to bottom is DFS order). We want all the layers to run on the FPGA 
device, except the first and last few layers. In the original device 
propagation algorithm, which follows the post-order DFS, the conv2d layers in 
grey are assigned the `CPU` device type: we encounter `copy2` first, and the 
three grey conv2d nodes are then marked with the source device type of `copy2` 
(i.e., `CPU`), which is not correct.
   
   
   <img 
src="https://raw.githubusercontent.com/4paradigm/incubator-tvm/feature/images/docs/resnet50.png"
        alt="Resnet50"
         width=300
        style="float: centre; margin-left: 50px;" />
   
   By changing the device annotation behaviour, we can support more complex 
graph structures.
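
To make the ordering issue described above concrete, here is a minimal, self-contained sketch of one order-independent way to assign devices. It is a toy model, not the code in this PR and not TVM's API: the function `propagate_devices`, the dictionary-based graph encoding, and the node names (`conv_a`, `shortcut`, and so on) are all hypothetical, and the graph is a simplified residual block rather than the exact one in the figure. The assumption is that each copy node fixes the device of its producer (source device) and of its consumers (destination device), and that every other edge connects two nodes on the same device.

```python
from collections import defaultdict, deque


def propagate_devices(inputs, copies, default="cpu"):
    """Toy device propagation (hypothetical helper, not a TVM function).

    inputs: maps each node name to the list of its input node names.
    copies: maps each copy node name to a (src_device, dst_device) pair.
    Returns a dict mapping every non-copy node to a device name.
    """
    neighbours = defaultdict(set)  # same-device links between non-copy nodes
    seeds = {}                     # devices fixed directly by a copy node

    def seed(node, dev):
        if seeds.setdefault(node, dev) != dev:
            raise ValueError(f"conflicting devices for {node}")

    nodes = set()
    for node, args in inputs.items():
        nodes.add(node)
        for arg in args:
            nodes.add(arg)
            if node in copies:
                if arg not in copies:
                    seed(arg, copies[node][0])  # producer runs on the source device
            elif arg in copies:
                seed(node, copies[arg][1])      # consumer runs on the destination device
            else:
                neighbours[node].add(arg)       # plain edge: both ends share a device
                neighbours[arg].add(node)

    # Flood-fill from the seeded nodes; visiting order does not affect the result.
    device = dict(seeds)
    queue = deque(seeds.items())
    while queue:
        node, dev = queue.popleft()
        for nb in neighbours[node]:
            if nb not in device:
                device[nb] = dev
                queue.append((nb, dev))
            elif device[nb] != dev:
                raise ValueError(f"conflicting devices for {nb}")

    # Nodes not reachable from any copy node fall back to the default device.
    for node in nodes:
        if node not in copies:
            device.setdefault(node, default)
    return device


# Simplified residual block: everything between copy1 and copy2 should be on FPGA.
inputs = {
    "conv0": ["data"],
    "copy1": ["conv0"],
    "pool": ["copy1"],
    "conv_a": ["pool"],
    "conv_b": ["conv_a"],
    "conv_c": ["conv_b"],
    "shortcut": ["pool"],
    "add": ["conv_c", "shortcut"],
    "copy2": ["add"],
    "dense": ["copy2"],
}
copies = {"copy1": ("cpu", "fpga"), "copy2": ("fpga", "cpu")}
result = propagate_devices(inputs, copies)
```

In this sketch the three convolutions on the main branch and the shortcut all end up on `fpga` regardless of the order in which nodes or arguments are listed, because the assignment depends only on which copy nodes bound each region of the graph and not on traversal order.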


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

