jayfoad wrote:

> > Why would you restrict this to "non-zero counter values"?
>
> When a waitcnt already has a zero counter value expanding it would just
> generate another waitcnt(0), which provides no additional profiling
> granularity. If you believe there's a use case for expanding waitcnt(0), I'd
> be happy to discuss it.

The requirement is that instead of emitting e.g. `s_waitcnt vmcnt(2)` you should emit e.g.:

```
s_waitcnt vmcnt(4)
s_waitcnt vmcnt(3)
s_waitcnt vmcnt(2)
```

The starting value "4" here is assuming that SIInsertWaitcnts already knows that the upper bound on this counter's value is 5, so 4 is the highest value you can wait for that will have any effect.

Similarly, instead of `s_waitcnt vmcnt(0)` you should emit:

```
s_waitcnt vmcnt(4)
s_waitcnt vmcnt(3)
s_waitcnt vmcnt(2)
s_waitcnt vmcnt(1)
s_waitcnt vmcnt(0)
```

> > Why does this need a new subtarget feature?
>
> On this, I am still not fully sure what would be the best approach to handle.
> I there any suggestion from you?

I don't know why you added a subtarget feature. I would suggest not adding it -- unless there is some reason for it that I am missing.

https://github.com/llvm/llvm-project/pull/169345
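
To make the expansion rule above concrete, here is a rough Python sketch of the sequence I would expect for a given target and upper bound. It is only an illustration, not code from the PR or from SIInsertWaitcnts: the function name `expand_waitcnt` and its parameters are made up, and it assumes the pass can supply an upper bound on the counter at that point. It also assumes the final wait is always kept, even when the bound says it would have no effect.

```python
def expand_waitcnt(counter: str, target: int, upper_bound: int) -> list[str]:
    """Return the stepped-down s_waitcnt sequence ending at `target`.

    `upper_bound` is a hypothetical input: the largest value the counter
    can hold at this point. Waiting for `upper_bound` itself can never
    stall, so the sequence starts one below it.
    """
    if target < 0:
        raise ValueError("counter value must be non-negative")
    # Never start below the requested target; in that case only the
    # original wait is emitted (an assumption of this sketch).
    start = max(upper_bound - 1, target)
    return [f"s_waitcnt {counter}({n})" for n in range(start, target - 1, -1)]


if __name__ == "__main__":
    # The two cases from the comment above, with an upper bound of 5.
    print("\n".join(expand_waitcnt("vmcnt", 2, 5)))
    print("\n".join(expand_waitcnt("vmcnt", 0, 5)))
```

Note that a target of 0 is handled by the same loop as any other value, which is why I don't see a reason to special-case non-zero counter values.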
