SubjectNoi opened a new issue, #16491: URL: https://github.com/apache/tvm/issues/16491
### Expected behavior

The assertions pass.

### Actual behavior

The assertion fails as follows:

```
Traceback (most recent call last):
  File "/myhome/workspace/tvm/tests/python/meta_schedule/test_meta_schedule_mma_m16n8k8_auto_tensorization.py", line 1314, in <module>
    test_mma_script_after_build()
  File "/myhome/workspace/tvm/tests/python/meta_schedule/test_meta_schedule_mma_m16n8k8_auto_tensorization.py", line 1233, in test_mma_script_after_build
    assert rt_mod.imported_modules[0].get_source() == expected_cuda_script
```

If I comment this function out, I still get a failed assertion:

```
2024-01-30 17:48:39 [INFO] Logging directory: /tmp/tmpdhcpc25t/logs
2024-01-30 17:48:39 [INFO] LocalRunner: max_workers = 1
2024-01-30 17:48:40 [INFO] [task_scheduler.cc:159] Initializing Task #0: "main"
2024-01-30 17:48:40 [INFO] [task_scheduler.cc:320]
 ID | Name |       FLOP | Weight | Speed (GFLOPS) | Latency (us) | Weighted Latency (us) | Trials | Done
---------------------------------------------------------------------------------------------------------
  0 | main | 2147483648 |      1 |            N/A |          N/A |                   N/A |      0 |
---------------------------------------------------------------------------------------------------------
Total trials: 0
Total latency (us): 0

2024-01-30 17:48:40 [INFO] [task_scheduler.cc:180] TaskScheduler picks Task #0: "main"
2024-01-30 17:50:30 [INFO] [task_scheduler.cc:193] Sending 8 sample(s) to builder
2024-01-30 17:50:41 [INFO] [task_scheduler.cc:195] Sending 8 sample(s) to runner
2024-01-30 17:50:42 [DEBUG] XGB iter   0: tr-p-rmse: 0.548714 tr-a-peak@32: 0.775022 tr-rmse: 0.551039 tr-rmse: 0.551039
2024-01-30 17:50:42 [DEBUG] XGB iter  25: tr-p-rmse: 0.033089 tr-a-peak@32: 1.000000 tr-rmse: 0.609717 tr-rmse: 0.609717
2024-01-30 17:50:42 [DEBUG] XGB iter  50: tr-p-rmse: 0.033089 tr-a-peak@32: 1.000000 tr-rmse: 0.609717 tr-rmse: 0.609717
2024-01-30 17:50:42 [DEBUG] XGB stopped. Best iteration: [15] tr-p-rmse:0.03309 tr-a-peak@32:1.00000 tr-rmse:0.60972 tr-rmse:0.60972
2024-01-30 17:50:42 [INFO] [task_scheduler.cc:237] [Updated] Task #0: "main"
2024-01-30 17:50:42 [INFO] [task_scheduler.cc:320]
 ID | Name |       FLOP | Weight | Speed (GFLOPS) | Latency (us) | Weighted Latency (us) | Trials | Done
---------------------------------------------------------------------------------------------------------
  0 | main | 2147483648 |      1 |     68595.0219 |      31.3067 |               31.3067 |      8 |
---------------------------------------------------------------------------------------------------------
Total trials: 8
Total latency (us): 31.3067

2024-01-30 17:50:42 [INFO] [task_scheduler.cc:260] Task #0 has finished. Remaining task(s): 0
2024-01-30 17:50:42 [INFO] [task_scheduler.cc:320]
 ID | Name |       FLOP | Weight | Speed (GFLOPS) | Latency (us) | Weighted Latency (us) | Trials | Done
---------------------------------------------------------------------------------------------------------
  0 | main | 2147483648 |      1 |     68595.0219 |      31.3067 |               31.3067 |      8 |    Y
---------------------------------------------------------------------------------------------------------
Total trials: 8
Total latency (us): 31.3067

Traceback (most recent call last):
  File "/myhome/workspace/tvm/tests/python/meta_schedule/test_meta_schedule_mma_m16n8k8_auto_tensorization.py", line 1315, in <module>
    test_mma_tune()
  File "/myhome/workspace/tvm/tests/python/meta_schedule/test_meta_schedule_mma_m16n8k8_auto_tensorization.py", line 1308, in test_mma_tune
    tune("float16")
  File "/myhome/workspace/tvm/tests/python/meta_schedule/test_meta_schedule_mma_m16n8k8_auto_tensorization.py", line 1306, in tune
    assert np.allclose(c_tvm.numpy(), c_cublas.numpy(), rtol=1e-2)
AssertionError
```

When I print out the `c_tvm` tensor, I get:

```
[[0.000e+00 0.000e+00 0.000e+00 ... 0.000e+00 0.000e+00 0.000e+00]
 [0.000e+00 0.000e+00 0.000e+00 ... 0.000e+00 0.000e+00 0.000e+00]
 [0.000e+00 0.000e+00 0.000e+00 ... 0.000e+00 0.000e+00 0.000e+00]
 ...
 [1.305e-01 5.063e-01 4.082e-01 ... 2.666e-01 5.225e-01 3.525e-01]
 [8.296e-01 9.082e-01 9.170e-01 ... 8.447e-01 8.735e-01 8.403e-01]
 [6.484e-01 6.525e-02 4.817e-01 ... 6.489e-01 2.588e+02 2.414e+02]]
```

### Environment

* GPU: RTX6000 Ada
* NVCC: 12.3
* Ubuntu: 20.04
* Driver: 535.133

### Steps to reproduce

Go to `tests/python/meta_schedule` and run `python test_meta_schedule_mma_m16n8k8_auto_tensorization.py`.

### Triage

* needs-triage

--
This is an automated message from the Apache Git Service.
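To narrow down where `c_tvm` diverges from `c_cublas`, a small NumPy helper along the following lines could report which rows fail the tolerance check and which rows are entirely zero. This is only a sketch and not part of the TVM test file: the function name `summarize_mismatch` and the synthetic arrays in the demo are hypothetical, and it assumes 2-D inputs such as `c_tvm.numpy()` and `c_cublas.numpy()`.

```python
import numpy as np


def summarize_mismatch(actual, expected, rtol=1e-2, atol=1e-8):
    """Summarize where `actual` deviates from `expected` (hypothetical helper).

    Uses the same elementwise criterion as np.allclose:
    |actual - expected| <= atol + rtol * |expected|.
    """
    actual = np.asarray(actual, dtype=np.float64)
    expected = np.asarray(expected, dtype=np.float64)
    close = np.isclose(actual, expected, rtol=rtol, atol=atol)
    # Rows containing at least one element outside the tolerance.
    bad_rows = np.flatnonzero(~close.all(axis=1))
    # Rows that are entirely zero in `actual`.
    zero_rows = np.flatnonzero((actual == 0).all(axis=1))
    return {
        "num_mismatched": int((~close).sum()),
        "max_abs_err": float(np.abs(actual - expected).max()),
        "bad_rows": bad_rows.tolist(),
        "all_zero_rows": zero_rows.tolist(),
    }


if __name__ == "__main__":
    # Synthetic stand-ins for c_cublas.numpy() and c_tvm.numpy();
    # these are NOT the values from the failing run.
    rng = np.random.default_rng(0)
    expected = rng.random((8, 8)) + 1.0
    actual = expected.copy()
    actual[:3] = 0.0  # mimic leading rows that stay zero
    print(summarize_mismatch(actual, expected))
```

In the failing test this could be called as `summarize_mismatch(c_tvm.numpy(), c_cublas.numpy())` just before the `np.allclose` assertion, to see whether the mismatches are confined to particular rows.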
