================
@@ -50,6 +50,19 @@ memory and LDS memory.
   void @llvm.amdgcn.global.store.async.from.lds.type(ptr %dst, ptr %src)
   void @llvm.amdgcn.cluster.load.async.to.lds.type(ptr %dst, ptr %src)
+**GFX1250 Tensor DMA Instructions**
+
+.. code-block:: llvm
+
+  void @llvm.amdgcn.tensor.load.to.lds(...)
+  void @llvm.amdgcn.tensor.store.from.lds(...)
+
+These intrinsics are asynchronous despite the absence of ``async`` in their
+names. They are tracked by the ``TENSOR_CNT`` hardware counter and participate
+in the ``asyncmark`` / ``wait.asyncmark`` framework just like the intrinsics
+above. Equivalently, the caller may issue an explicit ``s_wait_tensorcnt``
+instead of using ``asyncmark`` / ``wait.asyncmark``.
----------------
ssahasra wrote:

Remove this whole paragraph. Too much information. The whole point of `asyncmark` is to abstract away details like `TENSOR_CNT`. If users need the old way of doing things, they will have to go read the ISA doc for that.

https://github.com/llvm/llvm-project/pull/200775
_______________________________________________
cfe-commits mailing list
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
