antinucleon opened a new pull request #5377: [Blocksparse] Pipeline for lowering dense model to sparse-dense URL: https://github.com/apache/incubator-tvm/pull/5377

This PR adds a pass and helper functions for lowering a dense model to a block-sparse model. It introduces two new optimizations that transform weights, under the namespace ```relay.data_dep_optimization```:

1. ```simplify_fc_transpose``` — automatically converts ```y = nn.dense(x, transpose(w, [1, 0]))``` to ```y = nn.dense(x, wt)```, pre-transposing the relevant parameters before further optimization.
2. ```bsr_dense``` — automatically converts ```y = nn.dense(x, w)``` to the block-sparse operation ```nn.sparse_dense``` when ```w``` meets the sparsity requirement (currently set manually).

Both optimizations should be run before the normal lowering steps.

I ran some initial experiments on BERT (uncased_L-12_H-768_A-12) with perfectly random block-sparse weights. Not all operators are tuned, but the results are enough to show the gain from block sparsity.

Dense: 213 ms

| Sparsity / Blocksize | (32, 1) | (16, 1) | (8, 1) |
|----------------------|---------|---------|--------|
| 0.95                 | 107 ms  | 58 ms   | 61 ms  |
| 0.9                  | 167 ms  | 62 ms   | 67 ms  |
| 0.85                 |         | 66 ms   | 74 ms  |

This work can easily be extended to 1x1 convolution. However, the current master branch doesn't contain good NHWC schedules for depthwise convolution, so that extension is not urgent enough to PR yet. Once good NHWC schedules land, CV tasks should see gains as well.
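To clarify what the converted graph computes, here is a minimal NumPy sketch of a BSR-style sparse-dense, i.e. ```y = x @ w.T``` with the weight stored as nonzero blocks. The helper names ```dense_to_bsr``` and ```sparse_dense``` below are illustrative only, not the TVM API:

```python
import numpy as np

def dense_to_bsr(w, bs_r, bs_c):
    """Convert a dense weight matrix into BSR-style (data, indices, indptr)
    arrays, keeping only blocks that contain at least one nonzero entry."""
    m, n = w.shape
    data, indices, indptr = [], [], [0]
    for br in range(m // bs_r):
        for bc in range(n // bs_c):
            block = w[br * bs_r:(br + 1) * bs_r, bc * bs_c:(bc + 1) * bs_c]
            if np.any(block != 0):
                data.append(block)
                indices.append(bc)
        indptr.append(len(indices))
    return np.array(data), np.array(indices), np.array(indptr)

def sparse_dense(x, data, indices, indptr):
    """Compute y = x @ w.T from the BSR representation of w, touching only
    the stored (nonzero) blocks."""
    bs_r, bs_c = data.shape[1], data.shape[2]
    num_block_rows = len(indptr) - 1
    y = np.zeros((x.shape[0], num_block_rows * bs_r))
    for br in range(num_block_rows):
        for pos in range(indptr[br], indptr[br + 1]):
            bc = indices[pos]
            y[:, br * bs_r:(br + 1) * bs_r] += (
                x[:, bc * bs_c:(bc + 1) * bs_c] @ data[pos].T
            )
    return y
```

At 0.95 sparsity, only 5% of the blocks are stored, so the inner loop does roughly 5% of the multiply-accumulate work of the dense matmul, which is where the speedups in the table above come from.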
