[GitHub] [tvm] comaniac commented on issue #7135: [Bug] AutoScheduler / Fuse Pass bug

GitBox Sat, 19 Dec 2020 15:05:37 -0800


comaniac commented on issue #7135:
URL: https://github.com/apache/tvm/issues/7135#issuecomment-748537304



   The root cause is that your script builds the model with `opt_level=0`, 
while auto_scheduler task extraction uses `opt_level=3`. After changing 
`opt_level` in `search_dense_gpu.py` to 3, this is what I got:
   
   ```
   Compile...
   -----------------------------------
   Cannot find tuned schedules for target=metal -keys=metal,gpu 
-max_num_threads=256, workload_key=["13da82b16db5a9fde8953f4c5667d2e4"]. A 
fallback TOPI schedule is used, which may bring great performance regression or 
even compilation failure. Compute DAG info:
   placeholder = PLACEHOLDER [1, 768]
   placeholder = PLACEHOLDER [768, 768]
   T_dense(i, j) += (placeholder[i, k]*placeholder[j, k])
   placeholder = PLACEHOLDER [768]
   T_add(ax0, ax1) = (T_dense[ax0, ax1] + placeholder[ax1])
   T_minimum(ax0, ax1) = min(T_add[ax0, ax1], 9f)
   T_maximum(ax0, ax1) = max(T_minimum[ax0, ax1], -9f)
   T_fast_tanh(ax0, ax1) = ((T_maximum[ax0, ax1]*(((T_maximum[ax0, 
ax1]*T_maximum[ax0, ax1])*(((T_maximum[ax0, ax1]*T_maximum[ax0,  ..(OMITTED).. 
*T_maximum[ax0, ax1])*(((T_maximum[ax0, ax1]*T_maximum[ax0, ax1])*1.19826e-06f) 
+ 0.000118535f)) + 0.00226843f)) + 0.00489353f))
   ```
   
   The task hash code `13da82b16db5a9fde8953f4c5667d2e4` matches one of the 
extracted tasks from the model:
   
   ```
   ========== Task 9  (workload key: ["13da82b16db5a9fde8953f4c5667d2e4"]) 
==========
   placeholder = PLACEHOLDER [1, 768]
   placeholder = PLACEHOLDER [768, 768]
   T_dense(i, j) += (placeholder[i, k]*placeholder[j, k])
   placeholder = PLACEHOLDER [768]
   T_add(ax0, ax1) = (T_dense[ax0, ax1] + placeholder[ax1])
   T_minimum(ax0, ax1) = min(T_add[ax0, ax1], 9f)
   T_maximum(ax0, ax1) = max(T_minimum[ax0, ax1], -9f)
   T_fast_tanh(ax0, ax1) = ((T_maximum[ax0, ax1]*(((T_maximum[ax0, 
ax1]*T_maximum[ax0, ax1])*(((T_maximum[ax0, ax1]*T_maximum[ax0,  ..(OMITTED).. 
*T_maximum[ax0, ax1])*(((T_maximum[ax0, ax1]*T_maximum[ax0, ax1])*1.19826e-06f) 
+ 0.000118535f)) + 0.00226843f)) + 0.00489353f))
   ```
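
To make this kind of comparison less manual, here is a minimal sketch that pulls the workload hashes out of both logs and reports which "Cannot find tuned schedules" keys correspond to an extracted task. It only uses the standard library and is not part of TVM; the function names are hypothetical, and it assumes each key appears on a single line in the two formats shown above.

```python
import re

# Matches both forms seen in the logs above (an assumption based on this output only):
#   workload_key=["<hash>"]     in the compile-time warning
#   workload key: ["<hash>"]    in the task extraction header
_KEY_RE = re.compile(r'workload[_ ]key[=:]\s*\["([0-9a-f]+)"\]')


def workload_keys(log_text):
    """Return the set of workload hashes mentioned in a log (hypothetical helper)."""
    return set(_KEY_RE.findall(log_text))


def classify_missing_keys(compile_log, extraction_log):
    """Split the keys from compile-time warnings into two sets.

    extracted:     the key matches an extracted task, so the task exists
                   but no tuned schedule was found for it.
    not_extracted: the key matches no extracted task, which suggests the
                   build and the task extraction saw different workloads.
    """
    missing = workload_keys(compile_log)
    tasks = workload_keys(extraction_log)
    return missing & tasks, missing - tasks
```

With the two logs in this comment, the single key `13da82b16db5a9fde8953f4c5667d2e4` would land in the first set. A key in the second set is a hint to check whether the build and the extraction use the same `opt_level`.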
   
   In conclusion, this is not really a bug, but we may need a better way to 
configure task extraction. I'm closing this issue for now, and we could 
follow up with an RFC on the discuss forum.
   
   cc @merrymercy 


