Hi Marcelo,
For future reference, if someone else has this issue: another possibility
is that the branch predictor is the problem. It looks like it could be
predicting that the instruction is a branch. I'm not sure whether that's
specifically because of the compressed format, though. It's another pl
Hi Ciro,
As you seem to have figured out, running dynamically linked executables
has only been tested for x86_64 native platforms. It *is supported* if your
binary is x86 and your native machine is x86. I'm not sure what it would
take to get this working for native ARM machines (e.g., simulating
Hi Da,
"For size > 512, the whole stats.txt is identical."
This isn't surprising: 512 * 4KB = 2MB. So, if your workload's working set
is only 1MB, then with at least 512 entries you are only seeing compulsory
(cold) misses.
Try running larger workloads and/or workloads with more reuse.
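To make the effect concrete, here is a toy illustration (plain Python, not gem5 code): a fully associative LRU structure run over a workload that touches 256 distinct pages twice. Once the entry count covers the working set, only the 256 compulsory misses remain, and adding more entries changes nothing.

```python
# Toy fully associative LRU "cache": count misses for a given entry count.
from collections import OrderedDict

def count_misses(entries, accesses):
    cache = OrderedDict()
    misses = 0
    for addr in accesses:
        if addr in cache:
            cache.move_to_end(addr)      # refresh LRU position
        else:
            misses += 1
            cache[addr] = True
            if len(cache) > entries:
                cache.popitem(last=False)  # evict least recently used
    return misses

# 256 distinct pages (e.g. a 1MB working set of 4KB pages), each reused once.
working_set = list(range(256)) * 2

print(count_misses(512, working_set))    # 256: compulsory misses only
print(count_misses(1024, working_set))   # 256: identical, as in your stats
print(count_misses(128, working_set))    # 512: capacity misses appear
```

Shrinking the structure below the working set (the 128-entry run) is what finally makes the miss counts diverge.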
Cheers,
Jason
On Thu,
Hi Muhammad,
Generally, if sendTimingReq fails, you have to save the packet so you can
resend it. In my Learning gem5 code, I *try* to simplify the retry logic so
that this is hidden. Instead of saving the packet in the cache code, the
packet is saved in the port code. Also, the code in Learning g
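The pattern described above can be sketched as follows. This is plain Python, not real gem5 code (gem5's version is C++); the method names mirror gem5's port interface (sendTimingReq / recvReqRetry), but `BusyPeer` is a simplified stand-in for the responder.

```python
# Sketch of the retry pattern: the port, not the cache code, saves the
# rejected packet and resends it when the peer signals a retry.

class ReqPort:
    """Request port that hides retry handling from its owner."""
    def __init__(self, peer):
        self.peer = peer
        self.blocked_pkt = None            # packet saved for a later resend

    def send_packet(self, pkt):
        assert self.blocked_pkt is None, "port is already blocked"
        if not self.peer.recv_timing_req(pkt):
            self.blocked_pkt = pkt         # save here, not in the cache code

    def recv_req_retry(self):
        # The peer has room again: resend the packet we saved.
        pkt, self.blocked_pkt = self.blocked_pkt, None
        self.send_packet(pkt)

class BusyPeer:
    """Stand-in responder: rejects requests while busy."""
    def __init__(self):
        self.busy = False
        self.received = []

    def recv_timing_req(self, pkt):
        if self.busy:
            return False                   # sender must wait for a retry
        self.received.append(pkt)
        return True

peer = BusyPeer()
port = ReqPort(peer)
peer.busy = True
port.send_packet("pkt0")    # rejected: port quietly saves it
peer.busy = False
port.recv_req_retry()       # saved packet is resent and delivered
print(peer.received)        # ['pkt0']
```

The owner of the port only ever calls `send_packet`; the save-and-resend bookkeeping is invisible to it, which is the simplification Jason mentions.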
Hi Tariq,
It's up to you what you want the latency for SSE instructions to be. It
depends on what architecture you're simulating. Unfortunately, we currently
don't have any "known good" configurations for x86 cores so you'll have to
come up with your own :). Here are some examples of numbers you cou
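One concrete place to set such latencies is the O3 CPU's functional-unit pool in your Python config. This is only a sketch: the opLat values below are placeholders (not a validated x86 model), and which opClass your SSE uops actually map to depends on the ISA implementation -- check src/cpu/o3/FuncUnitConfig.py for the stock definitions.

```python
# gem5 config fragment (not standalone-runnable): a custom SIMD FP unit
# with made-up latencies, attached to an O3 CPU's FU pool.
from m5.objects import FUDesc, FUPool, OpDesc

class MySimdFU(FUDesc):
    opList = [OpDesc(opClass='SimdFloatAdd', opLat=4),    # placeholder
              OpDesc(opClass='SimdFloatMult', opLat=5),   # placeholder
              OpDesc(opClass='SimdFloatDiv', opLat=12)]   # placeholder
    count = 2

# Combine with the other FU descriptions your core needs, then:
#   cpu.fuPool = FUPool(FUList=[...other units..., MySimdFU()])
```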
Thanks, it does work in non-native QEMU user mode with the -L option:
https://github.com/cirosantilli/linux-kernel-module-cheat/tree/b60c6f1b9c8bb7c34cf2c7fbce6f035d11483d4c#qemu-user-mode
On Mon, May 28, 2018 at 5:01 PM, Jason Lowe-Power wrote:
> Hi Ciro,
>
> As you seemed to have figured out, r
Thanks Jason. I also came across the same document earlier but I just
wanted to ask about this in general.
On Mon, May 28, 2018 at 11:09 AM, Jason Lowe-Power
wrote:
> Hi Tariq,
>
> It's up to you what you want the latency for SSE instructions to be. It
> depends on what architecture you're simul
Hi, Jason
Sorry for my unclear description before. For our workload,
the switch_cpus.dtb miss rate with 64 TLB entries is 154654 / 1589214 =
9.73%; the miss rate with 1048576 TLB entries is 154360 / 1583757 = 9.75%.
Both runs use 20ms of warm-up in atomic mode and 2.5ms of real simulation
with O3
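For reference, the quoted counts work out to nearly identical miss rates either way:

```python
# Quick arithmetic check on the quoted DTLB counts.
small_tlb = 154654 / 1589214     # 64-entry DTLB
huge_tlb  = 154360 / 1583757     # 1048576-entry DTLB
print(f"{small_tlb:.2%}")        # 9.73%
print(f"{huge_tlb:.2%}")         # 9.75%
```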