merrymercy opened a new pull request #6981:
URL: https://github.com/apache/tvm/pull/6981


   Compiling Winograd conv2d is very slow, which in turn makes feature 
extraction in the auto-scheduler very slow.
   The cause is that simplifying the constant matrix expressions in 
Winograd conv2d takes a lot of time. One example of these expressions is
   ```
   A(i, j) = select(((floormod(i, 6) == 5) && (floormod(j, 4) == 3)), 1f, 
select(((floormod(i, 6) == 5) && (floormod(j, 4) == 2)), 0f, 
select(((floormod(i, 6) == 5) && (floormod(j, 4) == 1)), 0f, 
select(((floormod(i, 6) == 5) && (floormod(j, 4) == 0)), 0f, 
select(((floormod(i, 6) == 4) && (floormod(j, 4) == 3)), -8f, 
select(((floormod(i, 6) == 4) && (floormod(j, 4) == 2)), 4f, 
select(((floormod(i, 6) == 4) && (floormod(j, 4) == 1)), -2f, 
select(((floormod(i, 6) == 4) && (floormod(j, 4) == 0)), 1f, 
select(((floormod(i, 6) == 3) && (floormod(j, 4) == 3)), 0.125f, 
select(((floormod(i, 6) == 3) && (floormod(j, 4) == 2)), 0.25f, 
select(((floormod(i, 6) == 3) && (floormod(j, 4) == 1)), 0.5f, 
select(((floormod(i, 6) == 3) && (floormod(j, 4) == 0)), 1f, 
select(((floormod(i, 6) == 2) && (floormod(j, 4) == 3)), 1f, 
select(((floormod(i, 6) == 2) && (floormod(j, 4) == 2)), 1f, 
select(((floormod(i, 6) == 2) && (floormod(j, 4) == 1)), 1f, 
select(((floormod(i, 6) == 2) && (floormod(j, 4) == 0)), 1f, 
select(((floormod(i, 6) == 1) && (floormod(j, 4) == 3)), -1f, 
select(((floormod(i, 6) == 1) && (floormod(j, 4) == 2)), 1f, 
select(((floormod(i, 6) == 1) && (floormod(j, 4) == 1)), -1f, 
select(((floormod(i, 6) == 1) && (floormod(j, 4) == 0)), 1f, 
select(((floormod(i, 6) == 0) && (floormod(j, 4) == 3)), 0f, 
select(((floormod(i, 6) == 0) && (floormod(j, 4) == 2)), 0f, 
select(((floormod(i, 6) == 0) && (floormod(j, 4) == 1)), 0f, 
select(((floormod(i, 6) == 0) && (floormod(j, 4) == 0)), 1f, 
0f))))))))))))))))))))))))
   ```
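
   To make the size of this expression concrete, here is a minimal plain-Python sketch that rebuilds the same chain as a string from the 6x4 matrix whose entries can be read off the example above. It does not call TVM; the helper name `build_select_chain` is hypothetical and only mimics the shape of the expression, with one nested `select` per matrix entry.

```python
# Plain-Python sketch; it only builds a string and does not call TVM.
# The rows are read off the select chain above: A[r][c] is the value chosen
# when floormod(i, 6) == r and floormod(j, 4) == c.
A = [
    [1, 0, 0, 0],
    [1, -1, 1, -1],
    [1, 1, 1, 1],
    [1, 0.5, 0.25, 0.125],
    [1, -2, 4, -8],
    [0, 0, 0, 1],
]


def build_select_chain(matrix):
    """Wrap one select per entry around a default of 0f (hypothetical helper)."""
    rows, cols = len(matrix), len(matrix[0])
    expr = "0f"
    for r in range(rows):
        for c in range(cols):
            cond = f"((floormod(i, {rows}) == {r}) && (floormod(j, {cols}) == {c}))"
            # The entry visited last ends up as the outermost select.
            expr = f"select({cond}, {matrix[r][c]:g}f, {expr})"
    return expr


chain = build_select_chain(A)
num_selects = chain.count("select(")  # one select per matrix entry
```

   The nesting depth equals the number of matrix entries, so a simplifier that walks the expression has to visit every level each time the matrix is accessed.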
   But these constant matrix values are not needed for feature extraction, 
so this PR adds a new flag that replaces every access to a constant matrix 
with the constant value 1.
   With the flag enabled the generated IR is wrong, but it is good enough 
for feature extraction purposes.
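
   The idea behind the flag can be sketched on a toy expression tree. This is an illustration only, not the implementation in this PR: the node classes, the function `replace_const_matrix_loads`, and the tensor names `data_pack` and `A` are all hypothetical, and no TVM API is used.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Const:
    value: float


@dataclass(frozen=True)
class Load:
    tensor: str                # name of the tensor being read
    indices: Tuple[str, ...]   # symbolic index names


@dataclass(frozen=True)
class Mul:
    lhs: "Expr"
    rhs: "Expr"


Expr = Union[Const, Load, Mul]


def replace_const_matrix_loads(expr, const_tensors):
    """Replace every load from a tensor in const_tensors with Const(1.0)."""
    if isinstance(expr, Load) and expr.tensor in const_tensors:
        return Const(1.0)
    if isinstance(expr, Mul):
        return Mul(
            replace_const_matrix_loads(expr.lhs, const_tensors),
            replace_const_matrix_loads(expr.rhs, const_tensors),
        )
    return expr


# Hypothetical product of a data tensor with two reads of a constant matrix A.
original = Mul(
    Mul(Load("data_pack", ("r_a", "r_b")), Load("A", ("r_a", "vh"))),
    Load("A", ("r_b", "vw")),
)
rewritten = replace_const_matrix_loads(original, {"A"})
```

   In this sketch the multiplications and the access to the non-constant tensor are kept, so the structure of the computation is unchanged while the computed values are no longer correct.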
   
   For a Winograd conv2d, this cuts the GA search time from 300 s to 150 s 
on p3.2x.
   

