[GitHub] [tvm] Qianshui-Jiang commented on a diff in pull request #13642: [Tensorize][runtime] Add support for AMX(Advanced Matrix Extensions) through Tensor intrinsics

GitBox Wed, 28 Dec 2022 23:41:15 -0800


Qianshui-Jiang commented on code in PR #13642:
URL: https://github.com/apache/tvm/pull/13642#discussion_r1058782058



##########
python/tvm/topi/x86/tensor_intrin.py:
##########
@@ -348,3 +348,227 @@ def _instr(index):
         binds={data: a_buffer, kernel: b_buffer},
         default_buffer_params=buffer_params,
     )
+
+
+def dot_32x128x32_u8s8s32_sapphirerapids(LDA):
+    """
+    Int8 dot product by every 16x64 elements using AMX-TMUL Sapphire Rapids instructions.
+    The tdpxxd instruction takes two tile of uint8 and int8 datatype -- data[16][64] and
+    kernel[1][16][16][4] -- and computes a dot product of data[16][16] in int32 datatype.
+
+    (Physically, to efficiently leveraging the tile register, we constructing a 2x2 tiles
+    matmul which performs 32x128x32 in total)
+
+    The pseudo code is as follows:
+        for(k=0; k<2; k++){
+            for(n=0; n<2; n++){
+                tileload64(tmm_b, B)
+                for(m=0; m<2; m++){
+                    if(n==0)
+                        tileload64(tmm_a, A)

Review Comment:
   > Have you considered tensorizing at finer granularity? Meaning, instead of 
tensorizing the whole load + compute as one tensor intrin, tensorize each load 
and compute separately. That could make the hard-coded outer-loop factors (2 in 
this intrin) tunable.
   
   @masahi Yes, we considered making it work like Tensor Core, as mentioned 
here: https://github.com/apache/tvm/issues/4052. The problem is that the AMX 
intrinsics in LLVM require the user to assign the tile register (tmm) numbers 
by hand. If we separated the load and the compute, the register assignment 
would have to be left to the schedule, which would add a lot of redundant work. 
So in the end we decided to follow BRGEMM and tensorize the whole smallest 
compute unit as a single intrinsic.
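
To illustrate the point about hand-assigned registers, here is a rough pure-Python sketch of the 2x2-tile 32x128x32 loop nest from the docstring. It is only an emulation under stated assumptions, not the code in this PR: the register map (tmm0-3 for C, tmm4-5 for A, tmm6 for B), the helper names `load_tile`, `tile_dot` and `matmul_32x128x32`, and the plain K x N layout of B (instead of the packed kernel[1][16][16][4] layout) are all hypothetical choices made for readability.

```python
TILE_M, TILE_N, TILE_K = 16, 16, 64  # one tile: 16 rows x 64 int8 elements


def load_tile(mat, row0, col0, rows, cols):
    """Emulate a tile load: copy a rows x cols block out of a 2-D list."""
    return [row[col0:col0 + cols] for row in mat[row0:row0 + rows]]


def tile_dot(acc, a_tile, b_tile):
    """Emulate a u8 x s8 -> s32 tile dot product: acc[16][16] += a[16][64] . b[64][16]."""
    for i in range(TILE_M):
        for j in range(TILE_N):
            acc[i][j] += sum(a_tile[i][k] * b_tile[k][j] for k in range(TILE_K))


def matmul_32x128x32(A, B):
    """A is 32x128 (uint8 values), B is 128x32 (int8 values); returns 32x32 int results."""
    # Hypothetical fixed register map, chosen by hand as the comment describes:
    #   tmm0..tmm3 -> C accumulators (index 2*m + n)
    #   tmm4..tmm5 -> A tiles (index 4 + m)
    #   tmm6       -> B tile
    tmm = [None] * 8
    for r in range(4):
        tmm[r] = [[0] * TILE_N for _ in range(TILE_M)]  # zero the accumulators

    for k in range(2):
        for n in range(2):
            tmm[6] = load_tile(B, k * TILE_K, n * TILE_N, TILE_K, TILE_N)
            for m in range(2):
                if n == 0:
                    # A tiles are loaded once per k and reused for n == 1.
                    tmm[4 + m] = load_tile(A, m * TILE_M, k * TILE_K, TILE_M, TILE_K)
                tile_dot(tmm[2 * m + n], tmm[4 + m], tmm[6])

    # Store the four accumulator tiles back into a 32x32 result.
    C = [[0] * 32 for _ in range(32)]
    for m in range(2):
        for n in range(2):
            for i in range(TILE_M):
                C[m * TILE_M + i][n * TILE_N:(n + 1) * TILE_N] = tmm[2 * m + n][i]
    return C
```

Every load and dot product in this sketch names a specific register index. If the loads and the compute were separate intrinsics, the schedule would have to carry that index bookkeeping itself, which is the redundancy mentioned above.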



