wyc-ruiker commented on pull request #8402:
URL: https://github.com/apache/tvm/pull/8402#issuecomment-873933081
> Thanks for your continued contribution to the Tensor Core schedule!
> @wyc-ruiker I'll help review when I have time.
>
> p.s. Recently I added a new op `nn.matmul`, which extends `nn.dense` to
> support the data and weight tensors in either transposed or non-transposed
> format. For a model from a framework like TensorFlow, TVM inserts an extra
> transpose for `nn.dense`, while using `nn.matmul` avoids that.
> I'm not sure, but it may be beneficial to use it if your model suffers a
> performance hit from the inserted transpose.
```
%1552 = reshape(%1551, newshape=[-1, 64, 50]) /* ty=Tensor[(12, 64, 50), float32] */;
%1553 = transpose(%1552, axes=[0, 2, 1]) /* ty=Tensor[(12, 50, 64), float32] */;
%1554 = multiply(%1553, 16f /* ty=float32 */) /* ty=Tensor[(12, 50, 64), float32] */;
%1555 = round(%1554) /* ty=Tensor[(12, 50, 64), float32] */;
%1556 = clip(%1555, a_min=-127f, a_max=127f) /* ty=Tensor[(12, 50, 64), float32] */;
%1557 = cast(%1549, dtype="int8") /* ty=Tensor[(12, 50, 64), int8] */;
%1558 = cast(%1556, dtype="int8") /* ty=Tensor[(12, 50, 64), int8] */;
%1559 = nn.batch_matmul(%1557, %1558, meta[relay.attrs.BatchMatmulAttrs][61]) /* ty=Tensor[(12, 50, 50), int32] */;
```
Thanks! But in our ViT network, it looks like we have some performance issues
in the ops before `nn.batch_matmul`. Looking forward to your adding full
transpose support for `nn.batch_matmul`!
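As a minimal NumPy sketch (not TVM code, and the shapes are just taken from the
Relay graph above): `nn.batch_matmul(x, y)` treats `y` as already transposed on
its last two axes, so the explicit `transpose` at `%1553` only exists to put the
tensor into that layout. Since the quantization ops in between (`multiply`,
`round`, `clip`, `cast`) are elementwise and commute with transposition, a
`batch_matmul` that accepts a non-transposed right operand could consume the
reshaped tensor directly and the materialized transpose would become redundant:

```python
import numpy as np

# Shapes from the Relay graph: %1552 is (12, 64, 50); the transpose at %1553
# produces (12, 50, 64); the batch_matmul output is (12, 50, 50).
batch, m, k = 12, 50, 64
a = np.random.rand(batch, m, k).astype("float32")      # analogue of %1557
b_pre = np.random.rand(batch, k, m).astype("float32")  # analogue of %1552

# Today's path: materialize the transpose (12, 64, 50) -> (12, 50, 64) ...
b_t = b_pre.transpose(0, 2, 1)
# ... then batch_matmul(a, b_t) contracts a with b_t transposed back again.
out_with_transpose = a @ b_t.transpose(0, 2, 1)

# With a transpose flag on batch_matmul, b_pre could be consumed directly,
# and the explicit transpose op (plus its memory traffic) disappears.
out_fused = a @ b_pre

assert np.allclose(out_with_transpose, out_fused)
```

The two results are identical, which is why folding the layout choice into a
`batch_matmul` attribute (as was done for `nn.matmul`) can remove the transpose
from the graph entirely rather than just scheduling around it.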
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]