antinucleon opened a new pull request #5377: [Blocksparse] Pipeline for 
lowering dense model to sparse-dense
URL: https://github.com/apache/incubator-tvm/pull/5377
 
 
   This PR adds a pass and helper functions to lower a dense model to a block sparse model.
   
   This PR introduces two new optimizations that rewrite weights, under the namespace ```relay.data_dep_optimization```:
   
   1. ```simplify_fc_transpose```
   This optimization automatically converts ```y = nn.dense(x, transpose(w, [1, 0]))``` to ```y = nn.dense(x, wt)```, pre-transposing the relevant parameters before further optimization.
   
   2. ```bsr_dense```
   This optimization automatically converts ```y = nn.dense(x, w)``` to the block sparse operation ```nn.sparse_dense``` when ```w``` meets the sparsity requirement (set manually so far).
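   The transpose folding in optimization (1) can be sketched in numpy (a minimal illustration of the rewrite, not the relay pass itself; shapes are hypothetical and ```nn.dense(x, w)``` is taken to compute ```x @ w.T``` as in relay):

   ```python
   import numpy as np

   # nn.dense(x, w) computes x @ w.T, so nn.dense(x, transpose(w, [1, 0]))
   # computes x @ w.  Folding the transpose into the parameter once
   # (wt = transpose(w, [1, 0])) removes the runtime transpose while
   # producing the same result.
   rng = np.random.default_rng(0)
   x = rng.standard_normal((4, 768))      # hypothetical activations
   w = rng.standard_normal((768, 3072))   # hypothetical weight

   # Before the rewrite: y = nn.dense(x, transpose(w, [1, 0]))
   before = x @ np.transpose(w, (1, 0)).T

   # After the rewrite: weight pre-transposed offline, y = nn.dense(x, wt)
   wt = np.transpose(w, (1, 0))
   after = x @ wt.T

   assert np.allclose(before, after)
   ```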
   
   Both optimizations should be run before the normal lowering steps.
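   The ```bsr_dense``` conversion in (2) stores the weight in BSR (block sparse row) format. A minimal sketch using scipy's BSR type (hypothetical shapes and sparsity; this illustrates the numerics, not the TVM schedule):

   ```python
   import numpy as np
   import scipy.sparse as sp

   rng = np.random.default_rng(0)
   bs_r, bs_c = 16, 1          # block size, as in the benchmark table below
   n, k = 256, 128             # hypothetical output / input feature dims
   x = rng.standard_normal((4, k))

   # Build a block-sparse weight: keep roughly 10% of (bs_r, bs_c) blocks.
   w = np.zeros((n, k))
   for i in range(0, n, bs_r):
       for j in range(0, k, bs_c):
           if rng.random() < 0.1:
               w[i:i + bs_r, j:j + bs_c] = rng.standard_normal((bs_r, bs_c))

   # Convert the dense weight to BSR, analogous to what bsr_dense does.
   w_bsr = sp.bsr_matrix(w, blocksize=(bs_r, bs_c))

   dense_out = x @ w.T             # what nn.dense(x, w) computes
   sparse_out = (w_bsr @ x.T).T    # the equivalent block sparse matmul

   assert np.allclose(dense_out, sparse_out)
   ```

   Since only the stored blocks participate in the multiply, runtime scales with the number of non-zero blocks rather than with the full weight size, which is where the speedups in the table come from.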
   
   I ran some initial experiments on BERT (uncased_L-12_H-768_A-12) with perfectly random block sparse weights. Not all operators are tuned, but this is enough to show the gain from block sparsity.
   
   Dense: 213 ms
   
   | Sparsity / Blocksize | (32, 1) | (16, 1) | (8, 1) |
   |----------------------|---------|---------|--------|
   | 0.95                 | 107 ms  | 58 ms   | 61 ms  |
   | 0.9                  | 167 ms  | 62 ms   | 67 ms  |
   | 0.85                 |         | 66 ms   | 74 ms  |
   
   
   This work can easily be extended to 1x1 convolution. However, current master doesn't contain good NHWC schedules for depthwise convolution, so that extension is not urgent to send as a PR. Once good NHWC schedules are available, we should see gains on CV tasks as well.
   
