Tantalus13A98B5F opened a new pull request #8605:
URL: https://github.com/apache/tvm/pull/8605


   This PR currently comes with:
   
   - NHWC and NCHW computes and schedules for x86, implemented as AutoTVM templates;
   - Relay integration, reusing the infrastructure contributed by the existing 1x1 sparse conv2d kernels;
   - A new Relay pass that converts models with frozen parameters into sparse models;
   - No tests for now; the coverage design still needs discussion.
   
   Sample usage is available [here](https://github.com/Tantalus13A98B5F/convbench/blob/main/resnet50_conv3x3/tvm/integration.ipynb).
   Tested single-threaded on an Intel Xeon 6248 with ResNet18, we get the following results:
   
   | Sparsity | Op Shape (NCHW) |                 |                 |               | ResNet18 Infer |
   |:--------:|:---------------:|:---------------:|:---------------:|:-------------:|:--------------:|
   |          |  10, 64, 56, 56 | 10, 128, 28, 28 | 10, 256, 14, 14 | 10, 512, 7, 7 |                |
   |   Dense  | 0.0113          | 0.0107          | 0.0107          | 0.0127        | 0.2144         |
   |    90%   | 0.0032          | 0.0031          | 0.0029          | 0.0032        | 0.0962         |
   |    80%   | 0.0048          | 0.0049          | 0.0046          | 0.0053        | 0.1199         |
   |    70%   | 0.0066          | 0.0065          | 0.0061          | 0.0065        | 0.1427         |
   |    60%   | 0.0082          | 0.0076          | 0.0075          | 0.0078        | 0.1634         |
   |    50%   | 0.0097          | 0.0085          | 0.0086          | 0.0092        | 0.1847         |
   |    40%   | 0.0109          | 0.0097          | 0.0098          | 0.0107        | 0.1966         |
   |    30%   | 0.0124          | 0.0110          | 0.0111          | 0.0122        | 0.2163         |
   |    20%   | 0.0136          | 0.0123          | 0.0124          | 0.0138        | 0.2315         |
   |    10%   | 0.0150          | 0.0137          | 0.0139          | 0.0154        | 0.2519         |
   
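   To conclude, compared with dense NCHWc convolutions, acceleration becomes possible at a sparsity of 40% or higher: at 40% every column is faster than the dense row, while at 30% most columns are already slower.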
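
The break-even point can be checked directly against the numbers in the table. The following is a minimal sketch in Python, assuming that the Dense row is the NCHWc baseline and that lower numbers are better (the table does not state units). The variable and function names are illustrative and not part of this PR.

```python
# Timings copied from the results table; units are whatever the PR reports.
# Column order: four op shapes (NCHW), then the full ResNet18 inference.
columns = ["10,64,56,56", "10,128,28,28", "10,256,14,14", "10,512,7,7", "ResNet18"]
dense = [0.0113, 0.0107, 0.0107, 0.0127, 0.2144]
sparse = {
    90: [0.0032, 0.0031, 0.0029, 0.0032, 0.0962],
    80: [0.0048, 0.0049, 0.0046, 0.0053, 0.1199],
    70: [0.0066, 0.0065, 0.0061, 0.0065, 0.1427],
    60: [0.0082, 0.0076, 0.0075, 0.0078, 0.1634],
    50: [0.0097, 0.0085, 0.0086, 0.0092, 0.1847],
    40: [0.0109, 0.0097, 0.0098, 0.0107, 0.1966],
    30: [0.0124, 0.0110, 0.0111, 0.0122, 0.2163],
    20: [0.0136, 0.0123, 0.0124, 0.0138, 0.2315],
    10: [0.0150, 0.0137, 0.0139, 0.0154, 0.2519],
}


def speedups(row):
    """Dense time divided by sparse time, per column (>1 means sparse wins)."""
    return [d / s for d, s in zip(dense, row)]


def faster_everywhere(row):
    """True if the sparse kernel beats the dense baseline in every column."""
    return all(ratio > 1.0 for ratio in speedups(row))


# Lowest sparsity level at which every column is faster than dense.
break_even = min(level for level, row in sparse.items() if faster_everywhere(row))

for level in sorted(sparse, reverse=True):
    ratios = ", ".join(f"{r:.2f}x" for r in speedups(sparse[level]))
    print(f"{level:>3}%: {ratios}")
print(f"break-even sparsity: {break_even}%")
```

The per-column ratios make it easy to see where the sparse kernels stop paying off for a given layer shape, as opposed to the end-to-end model.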
   
   @jcf94 

