vvchernov opened a new pull request #8781: URL: https://github.com/apache/tvm/pull/8781
**WORK IN PROGRESS.** The following issues were observed during testing:

1. The current ONNX frontend implementation with the activation fix fails the accuracy test for the stacked modification after tuning. Without tuning there is no issue, and the stacked bidirectional case passes (or the failure is simply not observed).
2. The new PyTorch GRU has the same problem. Note: the ONNX and PyTorch GRU implementations are close but not identical.
3. A more advanced GRU implementation is more than 20% faster, but shows stable accuracy failures after TVM tuning and none without it. It is kept local until the issue is resolved.
4. The ONNX GRU has issues related to an incorrect implementation, but the unit test only checks accuracy and does not catch them. More detailed investigation is needed; this could potentially be a big problem.
5. Performance results differ strongly between the ONNX and the new PyTorch GRU. Replacing the current ONNX frontend implementation with the new one is planned.

The GRU cell was unified and implemented in common.py; it is used by the PyTorch frontend of TVM. A critical fix was also made on the ONNX GRU implementation side.

Performance tests for different GRU modifications before and after the new implementation were carried out. The results are collected in the table below.

Table 1. Average time per run (microseconds) over 10000 runs. The following parameters are used (small input size): with biases = True, batch first = True, linear before reset = True, feature size = 5, hidden size = 10, number of stacked layers = 2, sequence length = 3, batch size = 1, trials number = 100.
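For reference, the cell semantics the unified implementation is expected to match can be sketched in NumPy as follows. This is a minimal illustrative sketch of one step of the ONNX `GRU` operator (gate order z, r, h), including the `linear_before_reset` attribute listed among the test parameters; the helper names are my own, not code from common.py:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_cell(x, h, W, R, b_w, b_r, linear_before_reset=True):
    """One step of the ONNX GRU operator (gate order: z, r, h).

    x: (feature,) input, h: (hidden,) previous hidden state.
    W: (3*hidden, feature) input weights, R: (3*hidden, hidden)
    recurrent weights, b_w / b_r: (3*hidden,) input / recurrent biases.
    Illustrative sketch, not the actual TVM frontend code.
    """
    hidden = h.shape[0]
    wz, wr, wh = np.split(W @ x + b_w, 3)
    rz, rr, rh = np.split(R @ h + b_r, 3)
    z = sigmoid(wz + rz)  # update gate
    r = sigmoid(wr + rr)  # reset gate
    if linear_before_reset:
        # reset gate applied AFTER the recurrent linear transform
        n = np.tanh(wh + r * rh)
    else:
        # reset gate applied to the hidden state BEFORE the transform
        n = np.tanh(wh + R[2 * hidden:] @ (r * h) + b_r[2 * hidden:])
    return (1.0 - z) * n + z * h
```

The two branches differ whenever the recurrent bias is nonzero, which is why `linear before reset = True` has to be handled explicitly rather than folded into the standard GRU formulation.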
TVM target is `llvm -mcpu=core-avx2`.

| Frontend name / GRU type | uni | b | s | sb |
| :----------------------- | :---: | :---: | :---: | :---: |
| Onnx                     | 23.3  | 48.6  | 46.4  | 100.8 |
| Onnx tuned               | 8.37  | 15.25 | 14.35 | 28.1  |
| Pytorch implemented      | 7.42  | 13.0  | 12.8  | 24.7  |
| Pytorch impl tuned       | 3.2   | 4.91  | 4.77  | 8.43  |
| Onnxruntime              | 13.06 | 16.89 | 19.2  | 25.9  |

There are several GRU types: uni – unidirectional, b – bidirectional, s – stacked (2 layers are used in the tests), sb – stacked bidirectional.

The PyTorch GRU compiled by TVM is faster than both the tuned ONNX GRU and onnxruntime, and the tuned PyTorch version is markedly faster still.
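The "average time per run over 10000 runs" metric in Table 1 can be reproduced with a simple timing loop like the one below. This harness is hypothetical (the PR does not show its benchmark script); `run` stands in for any zero-argument callable such as a compiled TVM module's run method:

```python
import time

def avg_time_per_run_us(run, n_runs=10000, n_warmup=100):
    """Average wall-clock time per call, in microseconds.

    `run` is any zero-argument callable. Warm-up iterations are
    executed first and excluded from the measurement, so one-time
    costs (lazy initialization, caches) do not skew the average.
    """
    for _ in range(n_warmup):
        run()
    start = time.perf_counter()
    for _ in range(n_runs):
        run()
    elapsed = time.perf_counter() - start
    return elapsed * 1e6 / n_runs
```

Averaging over many runs and discarding warm-up iterations is what makes microsecond-scale numbers like those in the table stable enough to compare across frontends.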
