jayfoad wrote:

> > Why would you restrict this to "non-zero counter values"?
> 
> When a waitcnt already has a zero counter value, expanding it would just 
> generate another waitcnt(0), which provides no additional profiling 
> granularity. If you believe there's a use case for expanding waitcnt(0), I'd 
> be happy to discuss it.

The requirement is that instead of emitting e.g. `s_waitcnt vmcnt(2)` you 
should emit e.g.:
```
s_waitcnt vmcnt(4)
s_waitcnt vmcnt(3)
s_waitcnt vmcnt(2)
```
The starting value of 4 assumes that SIInsertWaitcnts already knows that the 
upper bound on this counter's value is 5, so 4 is the highest value you can 
wait for that has any effect.

Similarly instead of `s_waitcnt vmcnt(0)` you should emit:
```
s_waitcnt vmcnt(4)
s_waitcnt vmcnt(3)
s_waitcnt vmcnt(2)
s_waitcnt vmcnt(1)
s_waitcnt vmcnt(0)
```
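
As a rough sketch of the rule behind both examples above: count down from one below the known upper bound to the requested value, emitting one wait per step. The helper name `expandVmcntWait`, its signature, and the choice to return instruction text are all hypothetical and only for illustration; none of this is taken from SIInsertWaitcnts.

```cpp
#include <string>
#include <vector>

// Hypothetical helper for illustration only; it is not an LLVM API.
// UpperBound: upper bound on the counter's value at this point (assumed to
//             be known already, as in the examples above).
// Target:     the counter value the original s_waitcnt waits for.
std::vector<std::string> expandVmcntWait(unsigned UpperBound, unsigned Target) {
  std::vector<std::string> Waits;
  auto Emit = [&Waits](unsigned N) {
    Waits.push_back("s_waitcnt vmcnt(" + std::to_string(N) + ")");
  };

  // Waiting for UpperBound or more has no effect, so there is nothing to
  // expand; keep the single original wait.
  if (Target >= UpperBound) {
    Emit(Target);
    return Waits;
  }

  // Emit UpperBound-1, UpperBound-2, ..., Target, one wait per value.
  // Target < UpperBound here, so the loop always reaches N == Target.
  for (unsigned N = UpperBound - 1;; --N) {
    Emit(N);
    if (N == Target)
      break;
  }
  return Waits;
}
```

With an upper bound of 5, a target of 2 yields the three waits from the first example and a target of 0 yields the five from the second, so zero needs no special case.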

> > Why does this need a new subtarget feature?
> 
> On this, I am still not fully sure what the best approach would be. 
> Do you have any suggestions?

I don't know why you added a subtarget feature. I would suggest not adding it 
-- unless there is some reason for it that I am missing.

https://github.com/llvm/llvm-project/pull/169345