wsl-inspur commented on pull request #5485:
URL: https://github.com/apache/incubator-tvm/pull/5485#issuecomment-624103994


   @FrozenGene 
   We tested the layout as you suggested, and the results are listed below. 
   
   
   kernel: 3x3x64x64, feature maps: 56x56x64
   batch | kernel (alpha, alpha, ci, co), ms | kernel (alpha, alpha, co, ci), ms
   1 | 0.0762 | 0.0766
   2 | 0.0911 | 0.0931
   4 | 0.1197 | 0.124
   8 | 0.1979 | 0.1942
   16 | 0.3453 | 0.3577
   32 | 0.6613 | 0.7161
   256 | 5.5837 | 5.3269
   
   
   kernel: 3x3x256x256, feature maps: 14x14x256
   batch | kernel (alpha, alpha, ci, co), ms | kernel (alpha, alpha, co, ci), ms
   1 | 0.0633 | 0.0694
   2 | 0.0825 | 0.0835
   4 | 0.1417 | 0.1562
   8 | 0.1829 | 0.1853
   16 | 0.264 | 0.277
   32 | 0.4506 | 0.4799
   256 | 3.9432 | 4.0867
   
   Note: the weight transform was pre-computed. The benchmarks were run on a 
T4 GPU (16GB, 70W). Latency is reported in ms.
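
For readers unfamiliar with the two layouts, here is a minimal pure-Python sketch of pre-computing the Winograd weight transform U = G g G^T and storing it either as (alpha, alpha, ci, co) or as (alpha, alpha, co, ci). It assumes F(2x2, 3x3), so alpha = 4. The tile size, the function names `transform_filter` and `precompute`, and the nested-list representation are illustrative assumptions, not the code in this PR.

```python
# Winograd F(2x2, 3x3) kernel-transform matrix G, so alpha = 4.
# The tile size is an assumption made for illustration only.
G = [[1.0, 0.0, 0.0],
     [0.5, 0.5, 0.5],
     [0.5, -0.5, 0.5],
     [0.0, 0.0, 1.0]]
ALPHA = len(G)


def transform_filter(g):
    """Return U = G @ g @ G^T (alpha x alpha) for one 3x3 filter g."""
    Gg = [[sum(G[a][r] * g[r][s] for r in range(3)) for s in range(3)]
          for a in range(ALPHA)]
    return [[sum(Gg[a][s] * G[b][s] for s in range(3)) for b in range(ALPHA)]
            for a in range(ALPHA)]


def precompute(weights, co_last=True):
    """weights[co][ci] is a 3x3 filter.

    Returns the transformed kernel indexed [a][b][ci][co] when co_last is
    True, otherwise indexed [a][b][co][ci].
    """
    n_co, n_ci = len(weights), len(weights[0])
    # Intermediate result indexed [co][ci][a][b].
    U = [[transform_filter(weights[o][i]) for i in range(n_ci)]
         for o in range(n_co)]
    if co_last:
        return [[[[U[o][i][a][b] for o in range(n_co)] for i in range(n_ci)]
                 for b in range(ALPHA)] for a in range(ALPHA)]
    return [[[[U[o][i][a][b] for i in range(n_ci)] for o in range(n_co)]
             for b in range(ALPHA)] for a in range(ALPHA)]
```

Both layouts hold the same values; only the order of the two innermost axes differs, which changes which channel index is contiguous in memory.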
   
   The two layouts perform at about the same level, and the kernel with the 
(alpha, alpha, ci, co) layout is slightly faster than the (alpha, alpha, co, ci) 
layout in most cases (12 of the 14 configurations above). 
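
To make the comparison easier to check, the following short script computes the relative latency difference per configuration. The latencies are copied from the two tables above; the names `results` and `summarize` are only illustrative.

```python
# Latencies in ms copied from the tables above: batch -> (ci_co, co_ci).
results = {
    "kernel 3x3x64x64, feature maps 56x56x64": {
        1: (0.0762, 0.0766), 2: (0.0911, 0.0931), 4: (0.1197, 0.124),
        8: (0.1979, 0.1942), 16: (0.3453, 0.3577), 32: (0.6613, 0.7161),
        256: (5.5837, 5.3269),
    },
    "kernel 3x3x256x256, feature maps 14x14x256": {
        1: (0.0633, 0.0694), 2: (0.0825, 0.0835), 4: (0.1417, 0.1562),
        8: (0.1829, 0.1853), 16: (0.264, 0.277), 32: (0.4506, 0.4799),
        256: (3.9432, 4.0867),
    },
}


def summarize(results):
    """Return (config, batch, percent) rows.

    percent > 0 means the (alpha, alpha, ci, co) layout is faster,
    measured relative to the (alpha, alpha, co, ci) latency.
    """
    rows = []
    for config, table in results.items():
        for batch, (ci_co, co_ci) in table.items():
            rows.append((config, batch, (co_ci - ci_co) / co_ci * 100.0))
    return rows


rows = summarize(results)
wins = sum(1 for _, _, pct in rows if pct > 0)
for config, batch, pct in rows:
    print(f"{config} | batch {batch:>3} | {pct:+.2f}%")
print(f"(alpha, alpha, ci, co) is faster in {wins} of {len(rows)} cases")
```

The two exceptions are batch 8 and batch 256 for the 3x3x64x64 kernel, where the (alpha, alpha, co, ci) layout is slightly faster.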
   
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
