ThomasRaoux wrote:

@AlexMaclean, this PR doesn't seem to be NFC. It generates different PTX than
before. In particular, I see extra `mov.b32      %r, global_smem;` instructions
that seem to produce different SASS and cause regressions in some Triton
workloads. Before this PR I would only have one of those moves; now I see
multiple. I haven't debugged why yet.

Is this something you have noticed?

https://github.com/llvm/llvm-project/pull/145581