The GitHub Actions job "Teams" on tvm.git/main has succeeded. Run started by GitHub user raindrops-0199 (triggered by raindrops-0199).
Head commit for run:
2a16d98b9ec1899f11c35a70c779285812d360c2 / Shushi Hong <[email protected]>
[S-TIR] Fix software pipeline offsets for legacy MMA intrinsics (#19742)

This pr fixes `InjectSoftwarePipeline` to rewrite opaque buffer offsets
for legacy PTX MMA intrinsics, including `ptx_ldmatrix_legacy`,
`ptx_mma_legacy`, `mma_store_legacy`, and `mma_fill_legacy`.

### Failure

`test_async_nested_pipeline_mma_gemm_ideal_annotation` failed with a real
numerical mismatch during the final GEMM result check:

```text
Mismatched elements: 6744298 / 16777216 (40.2%)
Max absolute difference: 6.706421
Max relative difference: 0.00653978
```

A diagnostic run with a fixed seed reproduced the same class of error:

```text
nested_full bad 6826830 / 16777216 max_abs 7.0405273 mean_abs 0.9874781
```

The adjacent simple pipeline test passed, and disabling async copy did not
change the mismatch, so this was not a cp.async lowering issue, tolerance
noise, or GPU flakiness.

### Root Cause

`InjectSoftwarePipeline` may add a leading version dimension to pipeline
buffers. Normal buffer loads/stores and newer opaque PTX intrinsics already
had their offsets rewritten to include the pipeline version slot. However,
the legacy MMA intrinsics were not covered.

In this test, the warp buffers became multi-versioned, but the legacy
`ldmatrix`/`mma` offsets still pointed to the original slot. As a result,
the second `ldmatrix` stage overwrote the first warp fragment, and both MMA
stages read the same fragment. That skipped one K fragment and duplicated
another, producing the numerical mismatch above.

Report URL: https://github.com/apache/tvm/actions/runs/27399094142

With regards,
GitHub Actions via GitBox

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
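
The root cause described in the commit message can be illustrated with a toy model. This is a sketch only and not TVM code: `run_pipeline`, `slot`, and the scalar "fragments" are hypothetical stand-ins. It assumes a two-stage pipeline in which the load for iteration k+1 is issued before the compute for iteration k, and a warp buffer with one slot per pipeline version. When the offsets ignore the version slot, every load and every read hits slot 0.

```python
def run_pipeline(fragments, num_versions, rewrite_offsets):
    """Toy model: accumulate K fragments through a software pipeline.

    fragments        -- one scalar per K step (stand-in for a warp fragment)
    num_versions     -- number of versioned slots in the pipeline buffer
    rewrite_offsets  -- True: offsets include the version slot (fixed case)
                        False: offsets always point at slot 0 (buggy case)
    """
    buf = [0.0] * num_versions

    def slot(k):
        # The fixed offset selects slot (k mod num_versions);
        # the unrewritten offset stays at the original slot.
        return k % num_versions if rewrite_offsets else 0

    acc = 0.0
    # Prologue: load the first fragment.
    buf[slot(0)] = fragments[0]
    for k in range(len(fragments)):
        # The load for the next iteration runs ahead of this compute step.
        if k + 1 < len(fragments):
            buf[slot(k + 1)] = fragments[k + 1]
        # Stand-in for the MMA step: consume the fragment for iteration k.
        acc += buf[slot(k)]
    return acc


fixed = run_pipeline([1.0, 10.0], num_versions=2, rewrite_offsets=True)
buggy = run_pipeline([1.0, 10.0], num_versions=2, rewrite_offsets=False)
print(fixed, buggy)
```

In the buggy case the second load overwrites the first fragment before it is consumed, so the accumulator sees the second fragment twice and the first one never, which mirrors the skipped and duplicated K fragments described above.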
