comaniac opened a new pull request #5059: [Draft][BYOC] Annotation Target with 
Merging
URL: https://github.com/apache/incubator-tvm/pull/5059
 
 
   This PR implements a Relay pass that annotates the graph with target 
devices. Unlike the existing annotation target pass (#4933), this pass 
implements the algorithm proposed in @mbaret's RFC 
(https://discuss.tvm.ai/t/relay-improved-graph-partitioning-algorithm/5830). In 
short, it greedily merges supported ops to minimize the number of generated 
subgraphs.
   
   Some highlights and lowlights for this PR:
   - The pass is general in that it supports multiple targets. For example, 
we can use ["dnnl", "trt"] to annotate the graph.
   - The pass relies on many utility functions that should be removed once 
https://discuss.tvm.ai/t/discuss-annotation-defined-subgraphs/5934 has been 
implemented.
   - This pass supports multiple outputs, but a subgraph with multiple 
outputs cannot be partitioned at the moment because the partition pass does 
not support multiple outputs yet.
   - The unit test uses exactly the same example as the RFC. The "add" nodes 
correspond to the blue nodes and the "subtract" nodes to the red nodes in the 
RFC figure.
   - I marked "subtract" as an unsupported op to demonstrate how this pass 
works, but we need a more suitable way to do so.
   
   Here is the example graph:
   ```
   def @main(%in_1: Tensor[(10, 10), float32], %in_2: Tensor[(10, 10), float32], %in_3: Tensor[(10, 10), float32], %in_4: Tensor[(10, 10), float32], %in_5: Tensor[(10, 10), float32], %in_6: Tensor[(10, 10), float32], %in_7: Tensor[(10, 10), float32], %in_8: Tensor[(10, 10), float32], %in_9: Tensor[(10, 10), float32], %in_10: Tensor[(10, 10), float32]) -> Tensor[(10, 10), float32] {
     %0 = add(%in_1, %in_2) /* ty=Tensor[(10, 10), float32] */;
     %1 = add(%in_3, %in_4) /* ty=Tensor[(10, 10), float32] */;
     %2 = add(%0, %1) /* ty=Tensor[(10, 10), float32] */;
     %3 = subtract(%in_5, %in_6) /* ty=Tensor[(10, 10), float32] */;
     %4 = subtract(%in_7, %3) /* ty=Tensor[(10, 10), float32] */;
     %5 = add(%2, %4) /* ty=Tensor[(10, 10), float32] */;
     %6 = subtract(%in_8, %5) /* ty=Tensor[(10, 10), float32] */;
     %7 = add(%in_9, %5) /* ty=Tensor[(10, 10), float32] */;
     %8 = add(%6, %7) /* ty=Tensor[(10, 10), float32] */;
     add(%in_10, %8) /* ty=Tensor[(10, 10), float32] */
   }
   ```
   
   After annotation with merge:
   
   ```
   def @main(%in_1: Tensor[(10, 10), float32], %in_2: Tensor[(10, 10), float32], %in_3: Tensor[(10, 10), float32], %in_4: Tensor[(10, 10), float32], %in_5: Tensor[(10, 10), float32], %in_6: Tensor[(10, 10), float32], %in_7: Tensor[(10, 10), float32], %in_8: Tensor[(10, 10), float32], %in_9: Tensor[(10, 10), float32], %in_10: Tensor[(10, 10), float32]) -> Tensor[(10, 10), float32] {
     %0 = annotation.compiler_begin(%in_10, meta[relay.attrs.CompilerAttrs][0]) /* ty=Tensor[(10, 10), float32] */;
     %1 = annotation.compiler_begin(%in_1, meta[relay.attrs.CompilerAttrs][1]) /* ty=Tensor[(10, 10), float32] */;
     %2 = annotation.compiler_begin(%in_2, meta[relay.attrs.CompilerAttrs][2]) /* ty=Tensor[(10, 10), float32] */;
     %3 = add(%1, %2) /* ty=Tensor[(10, 10), float32] */;
     %4 = annotation.compiler_begin(%in_3, meta[relay.attrs.CompilerAttrs][3]) /* ty=Tensor[(10, 10), float32] */;
     %5 = annotation.compiler_begin(%in_4, meta[relay.attrs.CompilerAttrs][4]) /* ty=Tensor[(10, 10), float32] */;
     %6 = add(%4, %5) /* ty=Tensor[(10, 10), float32] */;
     %7 = add(%3, %6) /* ty=Tensor[(10, 10), float32] */;
     %8 = subtract(%in_5, %in_6) /* ty=Tensor[(10, 10), float32] */;
     %9 = subtract(%in_7, %8) /* ty=Tensor[(10, 10), float32] */;
     %10 = annotation.compiler_begin(%9, meta[relay.attrs.CompilerAttrs][5]) /* ty=Tensor[(10, 10), float32] */;
     %11 = add(%7, %10) /* ty=Tensor[(10, 10), float32] */;
     %12 = annotation.compiler_end(%11, meta[relay.attrs.CompilerAttrs][6]) /* ty=Tensor[(10, 10), float32] */;
     %13 = subtract(%in_8, %12) /* ty=Tensor[(10, 10), float32] */;
     %14 = annotation.compiler_begin(%13, meta[relay.attrs.CompilerAttrs][7]) /* ty=Tensor[(10, 10), float32] */;
     %15 = annotation.compiler_begin(%in_9, meta[relay.attrs.CompilerAttrs][8]) /* ty=Tensor[(10, 10), float32] */;
     %16 = add(%15, %11) /* ty=Tensor[(10, 10), float32] */;
     %17 = annotation.compiler_end(%16, meta[relay.attrs.CompilerAttrs][9]) /* ty=Tensor[(10, 10), float32] */;
     %18 = annotation.compiler_begin(%17, meta[relay.attrs.CompilerAttrs][10]) /* ty=Tensor[(10, 10), float32] */;
     %19 = add(%14, %18) /* ty=Tensor[(10, 10), float32] */;
     %20 = add(%0, %19) /* ty=Tensor[(10, 10), float32] */;
     annotation.compiler_end(%20, meta[relay.attrs.CompilerAttrs][11]) /* ty=Tensor[(10, 10), float32] */
   }
   ```
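
To make the grouping in the annotated graph easier to follow, here is a toy, TVM-free Python sketch of greedy merging on the same example. It is not the code in this PR and not the RFC's exact algorithm: the names `annotate_with_merge`, `GRAPH` and `SUPPORTED` are hypothetical, the graph is a plain dict rather than Relay IR, and the conservative rule used to avoid cyclic regions (never merge two regions when one reaches the other through a node outside it) is an assumption made for illustration.

```python
from collections import defaultdict

# Assumption: only "add" is supported; "subtract" is treated as unsupported,
# mirroring the unit test described in this PR.
SUPPORTED = {"add"}

# The example graph as node -> (op, inputs), listed in topological order.
# Names starting with "in_" are graph inputs.
GRAPH = {
    "%0": ("add", ["in_1", "in_2"]),
    "%1": ("add", ["in_3", "in_4"]),
    "%2": ("add", ["%0", "%1"]),
    "%3": ("subtract", ["in_5", "in_6"]),
    "%4": ("subtract", ["in_7", "%3"]),
    "%5": ("add", ["%2", "%4"]),
    "%6": ("subtract", ["in_8", "%5"]),
    "%7": ("add", ["in_9", "%5"]),
    "%8": ("add", ["%6", "%7"]),
    "%9": ("add", ["in_10", "%8"]),
}


def annotate_with_merge(graph, supported):
    """Greedily group supported nodes into regions (toy sketch)."""
    region = {}               # supported node -> region id
    dep = defaultdict(set)    # node -> regions it reaches via a node outside them
    block = defaultdict(set)  # region id -> regions it must never merge with
    next_id = 0

    def relabel(old, new):
        # Fold region `old` into region `new` everywhere it is referenced.
        for n, r in region.items():
            if r == old:
                region[n] = new
        for table in (dep, block):
            for s in table.values():
                if old in s:
                    s.discard(old)
                    s.add(new)
        block[new] |= block.pop(old, set())

    for node, (op, inputs) in graph.items():
        # Regions this node already depends on through an outside node.
        taint = set().union(*(dep[i] for i in inputs))
        if op not in supported:
            # An unsupported node sits outside every region feeding it.
            dep[node] = taint | {region[i] for i in inputs if i in region}
            continue

        chosen = []
        for r in sorted({region[i] for i in inputs if i in region}):
            if r in taint:
                continue  # joining r would create a cycle through an outside node
            if any(r in block[c] or c in block[r] for c in chosen):
                continue  # r and an already chosen region must stay apart
            chosen.append(r)

        if chosen:
            target = chosen[0]
            for old in chosen[1:]:
                relabel(old, target)
        else:
            target = next_id  # no mergeable producer: start a new region
            next_id += 1

        region[node] = target
        dep[node] = taint | {
            region[i] for i in inputs if i in region and region[i] != target
        }
        block[target] |= dep[node]

    groups = defaultdict(list)
    for n, r in region.items():
        groups[r].append(n)
    return sorted(groups.values())


if __name__ == "__main__":
    for members in annotate_with_merge(GRAPH, SUPPORTED):
        print(members)
```

On this graph the sketch produces two regions, `%0, %1, %2, %5, %7` and `%8, %9` (using the node names of the unannotated graph), which matches the two `compiler_begin`/`compiler_end` groups in the annotated output. `%8` cannot join the first region because it also consumes `%6`, an unsupported `subtract` that itself depends on `%5`.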
   
   I'll need to clean up the code and refactor the unit test before it can be 
reviewed and merged. Meanwhile, @mbaret since you are also working on this 
pass, could you share your thoughts? We don't have to merge this PR if yours is 
almost done.
   
   cc @zhiics 
