On 18/6/2025 09:07, Kito Cheng wrote:
Maybe it's a good time to create a -mtune=generic and copy-and-modify
from rocket?
Indeed, it appears to be the most suitable solution.
Thanks,
Yangyu Chen
On Wed, Jun 18, 2025 at 6:59 AM Jeff Law <jeffreya...@gmail.com> wrote:
On 6/17/25 10:51 AM, Yangyu Chen wrote:
On 17/6/2025 20:42, Jeff Law wrote:
On 6/16/25 10:08 PM, Dongyan Chen wrote:
Hi, I've come across a question regarding the branch cost in gcc. In the link
https://gcc.godbolt.org/z/hnddevd5h, gcc does not optimize away the conditional
branch, while llvm does. I eventually discovered that the value of the branch
cost was too small. Moreover, in that link, if I add "-mbranch-cost=4" (a larger
number can also be used) for gcc, the Zicond extension is used as expected. So, is
it necessary to modify the branch cost for gcc? According to the source code, the
default mtune is rocket, which has a branch cost of 3. I think it should be set to 4.
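For readers without access to the Compiler Explorer link, the kind of code under discussion can be sketched as follows. The function below is an assumption inferred from the test name in the ChangeLog (compare two registers, return one of two registers); it is not the source behind the link, and `select_reg` is a hypothetical name.

```c
/* Hypothetical stand-in for the code in the Compiler Explorer link:
   compare two register operands and return one of two register operands.
   One way to inspect the effect of the branch cost (toolchain name assumed):
     riscv64-unknown-linux-gnu-gcc -O2 -march=rv64gc_zicond -S select.c
     riscv64-unknown-linux-gnu-gcc -O2 -march=rv64gc_zicond -mbranch-cost=4 -S select.c
   and compare the two assembly outputs. */
long select_reg(long a, long b, long x, long y)
{
    /* A select like this can be lowered either to a conditional branch
       or to a branchless Zicond sequence, depending on the cost model. */
    return (a == b) ? x : y;
}
```

The function behaves identically either way; only the generated instruction sequence differs.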
gcc/ChangeLog:
* config/riscv/riscv.cc: Change the branch cost.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/zicond-primitiveSemantics_compare_reg_reg_return_reg_reg.c: New test.
So I'd be a lot more comfortable with this if someone that knows the
rocket uarch could chime in or if we had wider data on how this
behaves in general. One pico-sized benchmark isn't a great way to
evaluate something like this.
The rocket core is quite simple, utilizing a five-stage in-order scalar
pipeline with a 3-cycle branch mis-predict penalty.
However, there is a trade-off here:
- Use branch
  - 2-3 fewer dynamic instructions per loop iteration
  - 3-cycle penalty when the branch is mispredicted
- Use Zicond
  - No branch mis-predict penalty
  - 2-3 extra dynamic instructions per loop iteration
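The trade-off above can be made concrete with a rough break-even calculation. This is a minimal sketch, assuming each extra Zicond instruction costs about one cycle (plausible for an in-order scalar pipeline, but an assumption) and that a correctly predicted branch costs nothing extra. The function names are hypothetical.

```c
/* Expected extra cycles per iteration for the branch version:
   the mispredict penalty weighted by how often the branch is mispredicted. */
static double branch_extra_cycles(double mispredict_rate, double penalty_cycles)
{
    return mispredict_rate * penalty_cycles;
}

/* Mispredict rate above which the Zicond version wins, assuming each
   extra Zicond instruction costs one cycle (an assumption of this sketch). */
static double break_even_rate(double zicond_extra_insns, double penalty_cycles)
{
    return zicond_extra_insns / penalty_cycles;
}
```

Under these assumptions, a 3-cycle penalty with 2-3 extra instructions gives a break-even mispredict rate of roughly 67-100%, so the branch almost always wins; a penalty of about 10 cycles lowers the break-even point to roughly 20-30%.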
I agree that this might not be helpful for rocket-chip. However, rocket-chip
is the default tuning for RISC-V, and AFAIK every rocket core that has been
taped out lacks the Zicond extension, so I think it's acceptable to adjust
this for the benefit of the wider RISC-V ecosystem, as branch misprediction
on large OoO cores usually incurs a penalty of about 10 cycles.
No, that's not a good reason.
You could make the argument that instead of defaulting to rocket we
should use a default generic tuning model. That would make much more
sense than deliberately choosing the wrong values for the rocket uarch
because it happens to be used as the default.
Jeff