Hi,
Here is the configuration I usually use. A few settings are somewhat
unrealistic, for example the large SQ size, but I choose them to
account for better features on a real machine that we do not have at the
moment. The command would have:
--cpu-type=DerivO3CPU
Majid,
These are all great suggestions! Do you have a configuration file that you
would be willing to share? It would be a huge benefit to the community if
we had some better default configurations in the "examples" for gem5
configuration files.
We're also trying to use the new standard library
I think it is hard to reach real-machine levels of bandwidth (BW). But
looking at your stats, I found that lsqFullEvents is high.
You can make the CPU more aggressive: increasing the load/store queue
sizes and the ROB depth are the minimal changes you can make. I
usually do at least ROB
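
To check a counter such as lsqFullEvents before and after resizing the queues, a small script can pull it out of stats.txt. This is only a sketch: the helper name find_stats and the sample text are made up for illustration, and it assumes the usual stats.txt layout of "name value # description". Exact stat names differ between gem5 versions, so it matches on a suffix.

```python
def find_stats(stats_text, suffix):
    """Return {stat_name: value} for every stat whose name ends with `suffix`."""
    found = {}
    for line in stats_text.splitlines():
        # Drop the trailing "# description" part, then split into columns.
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2 or not fields[0].endswith(suffix):
            continue
        try:
            found[fields[0]] = float(fields[1])
        except ValueError:
            continue  # second column is not numeric, skip this line
    return found


# Hypothetical sample, NOT real simulation output.
SAMPLE = """\
---------- Begin Simulation Statistics ----------
simInsts                                      1000000    # Number of instructions simulated
system.cpu.iew.lsqFullEvents                    42000    # Number of times the LSQ has become full
system.cpu.iew.iqFullEvents                       150    # Number of times the IQ has become full
---------- End Simulation Statistics   ----------
"""

stalls = find_stats(SAMPLE, "lsqFullEvents")
insts = find_stats(SAMPLE, "simInsts")["simInsts"]
for name, value in stalls.items():
    # Normalise by instruction count so runs of different length compare fairly.
    print(f"{name}: {value:.0f} ({1000 * value / insts:.1f} per 1000 instructions)")
```

For a real run, replace SAMPLE with the contents of m5out/stats.txt (or wherever your output directory is).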
Hi Majid,
Thanks for your suggestion! I checked the default number of MSHRs (in
configs/common/Caches.py) and found that the defaults for L1 and L2 are 4
and 20, respectively.
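
A rough way to see why these MSHR counts matter is Little's law: the miss bandwidth a cache can sustain is bounded by (outstanding misses x line size) / miss latency. The sketch below assumes 64-byte lines and a 100 ns miss latency. Both numbers are illustrative assumptions, not measurements, and the function names are made up. It also ignores prefetches and multiple targets per MSHR.

```python
import math


def max_miss_bandwidth_gbs(mshrs, line_bytes=64, miss_latency_ns=100.0):
    """Upper bound on miss bandwidth in GB/s from Little's law.

    One byte per nanosecond equals one GB/s (1e9 bytes per second).
    """
    return mshrs * line_bytes / miss_latency_ns


def mshrs_needed(target_gbs, line_bytes=64, miss_latency_ns=100.0):
    """Smallest MSHR count whose bound reaches target_gbs."""
    return math.ceil(target_gbs * miss_latency_ns / line_bytes)


# The default counts quoted above: 4 for L1 and 20 for L2.
for level, mshrs in (("L1", 4), ("L2", 20)):
    print(f"{level}: {mshrs} MSHRs -> at most "
          f"{max_miss_bandwidth_gbs(mshrs):.2f} GB/s")

# Hypothetical target of 10 GB/s under the same assumptions.
print("MSHRs needed for 10 GB/s:", mshrs_needed(10.0))
```

Under these assumptions, 4 MSHRs cap a single L1 at about 2.56 GB/s no matter how fast the memory behind it is.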
According to the PACT'18 paper "Cimple: Instruction and Memory Level
Parallelism: A DSL for Uncovering ILP and MLP", it
Hi,
Make sure your system has enough MSHRs. Out of the box, L1 and L2 are set
to have only a few MSHR entries.
Also, the stride prefetcher is not the best; you may want to try something
better. DCPT gives me better numbers.
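
As a sketch of what these two changes could look like in a classic-cache config script: the class names, sizes, and MSHR counts below are illustrative assumptions, not recommended values, and the remaining parameters mirror what I believe the stock Caches.py uses. It only defines the cache classes; wiring them into a system is left to your own script.

```python
# Sketch only: this needs gem5's embedded Python (e.g. `gem5.opt script.py`).
# The m5 package is not importable from a plain Python interpreter.
from m5.objects import Cache, DCPTPrefetcher


class L1DCacheMoreMSHRs(Cache):  # hypothetical class name
    size = "64kB"                # assumed size, adjust to your setup
    assoc = 2
    tag_latency = 2
    data_latency = 2
    response_latency = 2
    mshrs = 16                   # assumed value, raised from the stock 4
    tgts_per_mshr = 20
    prefetcher = DCPTPrefetcher()  # DCPT instead of a stride prefetcher


class L2CacheMoreMSHRs(Cache):   # hypothetical class name
    size = "2MB"                 # assumed size, adjust to your setup
    assoc = 8
    tag_latency = 20
    data_latency = 20
    response_latency = 20
    mshrs = 64                   # assumed value, raised from the stock 20
    tgts_per_mshr = 12
    write_buffers = 8
    prefetcher = DCPTPrefetcher()
```

If you use the se.py/fs.py scripts instead, the prefetcher can usually be selected with the --l1d-hwp-type and --l2-hwp-type options, assuming your gem5 version provides them. The MSHR counts still have to be changed in the cache classes.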
On Fri, Apr 15, 2022 at 4:57 AM Zicong Wang via gem5-users <
gem5-users@gem5.org> wrote: