SubjectNoi opened a new issue, #16491:
URL: https://github.com/apache/tvm/issues/16491

   ### Expected behavior
   
   All assertions in the test file pass.
   
   ### Actual behavior
   
   An assertion fails as follows:
   
   ```
   Traceback (most recent call last):
     File "/myhome/workspace/tvm/tests/python/meta_schedule/test_meta_schedule_mma_m16n8k8_auto_tensorization.py", line 1314, in <module>
       test_mma_script_after_build()
     File "/myhome/workspace/tvm/tests/python/meta_schedule/test_meta_schedule_mma_m16n8k8_auto_tensorization.py", line 1233, in test_mma_script_after_build
       assert rt_mod.imported_modules[0].get_source() == expected_cuda_script
   ```
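
Because the first assertion compares two whole source strings, its failure does not show where they differ. A minimal sketch of how the difference could be printed, using only the standard library. The helper name `source_diff` and the stand-in strings are hypothetical and are not part of the test file; in the test, the two inputs would be `rt_mod.imported_modules[0].get_source()` and `expected_cuda_script`.

```python
import difflib


def source_diff(actual: str, expected: str, context: int = 2) -> str:
    """Return a unified diff of two source strings ('' if they are identical)."""
    lines = difflib.unified_diff(
        expected.splitlines(),
        actual.splitlines(),
        fromfile="expected_cuda_script",
        tofile="generated_source",
        n=context,
        lineterm="",
    )
    return "\n".join(lines)


# Stand-in strings for illustration only (not real TVM output).
expected = "__global__ void k() {\n  int x = 1;\n}\n"
actual = "__global__ void k() {\n  int x = 2;\n}\n"
print(source_diff(actual, expected))
```

Lines prefixed with `-` appear only in the expected script, and lines prefixed with `+` appear only in the generated source.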
   If I comment out the call to `test_mma_script_after_build()`, a later assertion in `test_mma_tune()` still fails:
   
   ```
   2024-01-30 17:48:39 [INFO] Logging directory: /tmp/tmpdhcpc25t/logs
   2024-01-30 17:48:39 [INFO] LocalRunner: max_workers = 1
   2024-01-30 17:48:40 [INFO] [task_scheduler.cc:159] Initializing Task #0: "main"
   2024-01-30 17:48:40 [INFO] [task_scheduler.cc:320] 
    ID | Name |       FLOP | Weight | Speed (GFLOPS) | Latency (us) | Weighted Latency (us) | Trials | Done 
   ---------------------------------------------------------------------------------------------------------
     0 | main | 2147483648 |      1 |            N/A |          N/A |                   N/A |      0 |      
   ---------------------------------------------------------------------------------------------------------
   Total trials: 0
   Total latency (us): 0
   
   2024-01-30 17:48:40 [INFO] [task_scheduler.cc:180] TaskScheduler picks Task #0: "main"
   2024-01-30 17:50:30 [INFO] [task_scheduler.cc:193] Sending 8 sample(s) to builder
   2024-01-30 17:50:41 [INFO] [task_scheduler.cc:195] Sending 8 sample(s) to runner
   2024-01-30 17:50:42 [DEBUG] XGB iter   0: tr-p-rmse: 0.548714   tr-a-peak@32: 0.775022  tr-rmse: 0.551039       tr-rmse: 0.551039
   2024-01-30 17:50:42 [DEBUG] XGB iter  25: tr-p-rmse: 0.033089   tr-a-peak@32: 1.000000  tr-rmse: 0.609717       tr-rmse: 0.609717
   2024-01-30 17:50:42 [DEBUG] XGB iter  50: tr-p-rmse: 0.033089   tr-a-peak@32: 1.000000  tr-rmse: 0.609717       tr-rmse: 0.609717
   2024-01-30 17:50:42 [DEBUG] XGB stopped. Best iteration: [15] tr-p-rmse:0.03309 tr-a-peak@32:1.00000    tr-rmse:0.60972 tr-rmse:0.60972 
   2024-01-30 17:50:42 [INFO] [task_scheduler.cc:237] [Updated] Task #0: "main"
   2024-01-30 17:50:42 [INFO] [task_scheduler.cc:320] 
    ID | Name |       FLOP | Weight | Speed (GFLOPS) | Latency (us) | Weighted Latency (us) | Trials | Done 
   ---------------------------------------------------------------------------------------------------------
     0 | main | 2147483648 |      1 |     68595.0219 |      31.3067 |               31.3067 |      8 |      
   ---------------------------------------------------------------------------------------------------------
   Total trials: 8
   Total latency (us): 31.3067
   
   2024-01-30 17:50:42 [INFO] [task_scheduler.cc:260] Task #0 has finished. Remaining task(s): 0
   2024-01-30 17:50:42 [INFO] [task_scheduler.cc:320] 
    ID | Name |       FLOP | Weight | Speed (GFLOPS) | Latency (us) | Weighted Latency (us) | Trials | Done 
   ---------------------------------------------------------------------------------------------------------
     0 | main | 2147483648 |      1 |     68595.0219 |      31.3067 |               31.3067 |      8 |    Y 
   ---------------------------------------------------------------------------------------------------------
   Total trials: 8
   Total latency (us): 31.3067
   
   Traceback (most recent call last):
     File "/myhome/workspace/tvm/tests/python/meta_schedule/test_meta_schedule_mma_m16n8k8_auto_tensorization.py", line 1315, in <module>
       test_mma_tune()
     File "/myhome/workspace/tvm/tests/python/meta_schedule/test_meta_schedule_mma_m16n8k8_auto_tensorization.py", line 1308, in test_mma_tune
       tune("float16")
     File "/myhome/workspace/tvm/tests/python/meta_schedule/test_meta_schedule_mma_m16n8k8_auto_tensorization.py", line 1306, in tune
       assert np.allclose(c_tvm.numpy(), c_cublas.numpy(), rtol=1e-2)
   AssertionError
   ```
   When I print the `c_tvm` tensor, I get:
   ```
   [[0.000e+00 0.000e+00 0.000e+00 ... 0.000e+00 0.000e+00 0.000e+00]
    [0.000e+00 0.000e+00 0.000e+00 ... 0.000e+00 0.000e+00 0.000e+00]
    [0.000e+00 0.000e+00 0.000e+00 ... 0.000e+00 0.000e+00 0.000e+00]
    ...
    [1.305e-01 5.063e-01 4.082e-01 ... 2.666e-01 5.225e-01 3.525e-01]
    [8.296e-01 9.082e-01 9.170e-01 ... 8.447e-01 8.735e-01 8.403e-01]
    [6.484e-01 6.525e-02 4.817e-01 ... 6.489e-01 2.588e+02 2.414e+02]]
   ```
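
The printed tensor suggests that whole rows are zero and a few entries are very large. A rough sketch of how the mismatch could be summarized instead of relying on a bare `np.allclose` failure. The helper `summarize_mismatch` and the small stand-in arrays are hypothetical and only illustrate the idea; in the test, the inputs would be `c_tvm.numpy()` and `c_cublas.numpy()`, with the same `rtol=1e-2` as the failing assertion.

```python
import numpy as np


def summarize_mismatch(actual, expected, rtol=1e-2, atol=1e-8):
    """Summarize where a 2-D result deviates from its reference."""
    actual = np.asarray(actual, dtype=np.float64)
    expected = np.asarray(expected, dtype=np.float64)
    abs_err = np.abs(actual - expected)
    # Same element-wise tolerance rule that np.allclose applies.
    bad = ~np.isclose(actual, expected, rtol=rtol, atol=atol)
    # Rows in which every element of the result is exactly zero.
    zero_rows = np.flatnonzero(~actual.any(axis=1))
    worst = np.unravel_index(np.argmax(abs_err), actual.shape)
    return {
        "mismatch_fraction": float(bad.mean()),
        "all_zero_rows": zero_rows.tolist(),
        "worst_index": tuple(int(i) for i in worst),
        "max_abs_error": float(abs_err.max()),
    }


# Stand-in data for illustration only: one zeroed row and one outlier.
expected = np.ones((4, 4))
actual = expected.copy()
actual[0] = 0.0
actual[3, 3] = 250.0
print(summarize_mismatch(actual, expected))
```

A summary like this would show whether the wrong values are confined to particular rows or tiles of the output, which may help tell unwritten output regions apart from incorrectly computed ones.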
   ### Environment
   
   GPU: RTX6000 Ada
   NVCC: 12.3
   Ubuntu: 20.04
   Driver: 535.133
   
   ### Steps to reproduce
   
   Go to `tests/python/meta_schedule` and run `python test_meta_schedule_mma_m16n8k8_auto_tensorization.py`.
   
   ### Triage
   
   * needs-triage
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
