Re: [gem5-users] Indeterministic gem5 behavior

2019-09-18 Thread Shehab Elsayed
Since the problem seems to be caused by the faulty load instruction being re-executed before the faulty version is committed, is it safe to skip load execution if the instruction has a ReExec fault and its Executed flag is set, and to manually reset the Executed flag once the faulty instruction is

Re: [gem5-users] Indeterministic gem5 behavior

2019-09-16 Thread Shehab Elsayed
I ran some more tests and it doesn't always break at the same instruction (MOV_R_M). However, what seems to be common is that the instruction causing the problem (let's call it instruction A) conflicts with another instruction in the 'checkSnoop' function in 'src/cpu/o3/lsq_unit_impl.hh'.

Re: [gem5-users] Indeterministic gem5 behavior

2019-09-12 Thread Shehab Elsayed
Looks like this is the instruction causing the assertion failure: MOV_R_M : ld r9, DS:[rdx + 0x10] On Wed, Sep 11, 2019 at 5:20 PM Pouya Fotouhi wrote: > If you use --debug-flags=ExecAll,Decode and narrow down your trace to the > Ticks that you know the load is failing with --debug-start

Re: [gem5-users] Indeterministic gem5 behavior

2019-09-11 Thread Pouya Fotouhi
If you use --debug-flags=ExecAll,Decode and narrow down your trace to the Ticks that you know the load is failing with --debug-start and --debug-end you should be able to get that. Best, On Wed, Sep 11, 2019 at 2:15 PM Shehab Elsayed wrote: > Is there a way to get the macroop from the
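
The suggestion above can be sketched as a small helper that builds the gem5 command line for a trace window ending at the failing tick. This is only an illustration: the helper name, the 5,000,000-tick window, the trace file name, and the config script path are assumptions, not something given in the thread. The binary path and the tick come from the error message and trace posted elsewhere in this thread.

```python
def gem5_trace_command(gem5_binary, config_script, fail_tick,
                       window=5_000_000,
                       flags=("ExecAll", "Decode"),
                       trace_file="trace.out"):
    """Build an argv list that limits debug output to the ticks before a failure.

    `window` (how many ticks before the failure to trace) and `trace_file`
    are illustrative defaults, not values from the thread.
    """
    start_tick = max(0, fail_tick - window)
    return [
        gem5_binary,
        "--debug-flags=" + ",".join(flags),
        f"--debug-start={start_tick}",   # begin tracing at this tick
        f"--debug-end={fail_tick}",      # stop tracing at the failing tick
        f"--debug-file={trace_file}",    # written under gem5's output directory
        config_script,
    ]


if __name__ == "__main__":
    cmd = gem5_trace_command(
        "build/X86_MESI_Two_Level/gem5.opt",  # binary named in the assertion message
        "configs/myconfigs/run.py",           # hypothetical config script path
        fail_tick=769413949,                  # tick seen in the LSQ trace in this thread
    )
    print(" ".join(cmd))
```

Passing the list to `subprocess.run(cmd)` would launch the simulation; any options for the config script itself would be appended after its path.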

Re: [gem5-users] Indeterministic gem5 behavior

2019-09-11 Thread Shehab Elsayed
Is there a way to get the macroop from the corresponding instruction pointer? On Wed, Sep 11, 2019 at 5:07 PM Pouya Fotouhi wrote: > Hi Shehab, > > Can you please confirm what is the macroop that is issuing that load? I > suspect it's one of the 128-bit instructions (maybe recently

Re: [gem5-users] Indeterministic gem5 behavior

2019-09-11 Thread Pouya Fotouhi
Hi Shehab, Can you please confirm what is the macroop that is issuing that load? I suspect it's one of the 128-bit instructions (maybe recently non-temporal ones that I added) that are executed as two 64-bit loads, and possibly the second one is failing due to the cda check that we do, and that

Re: [gem5-users] Indeterministic gem5 behavior

2019-09-11 Thread Shehab Elsayed
So the load instruction actually gets executed twice, causing the assertion to fail the second time.
769413949: system.switch_cpus.iew.lsq.thread0: Doing memory access for inst [sn:15059405] PC (0x810ed626=>0x810ed62a).(1=>2)
769413949: system.switch_cpus.iew.lsq.thread0:
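
One way to spot such double executions in a long trace is to count the "Doing memory access" lines per sequence number. The following is a minimal sketch that assumes the trace lines have the same shape as the one quoted above. The function name is made up for illustration, and the second sample line (including its tick) is invented to exercise the code; it is not taken from the actual trace.

```python
import re
from collections import defaultdict

# Matches lines such as:
# 769413949: system.switch_cpus.iew.lsq.thread0: Doing memory access for inst [sn:15059405] PC (...)
ACCESS_RE = re.compile(
    r"^\s*(\d+): (\S+): Doing memory access for inst \[sn:(\d+)\]"
)


def repeated_memory_accesses(lines):
    """Return {sequence number: [ticks]} for instructions accessed more than once."""
    ticks_by_sn = defaultdict(list)
    for line in lines:
        match = ACCESS_RE.match(line)
        if match:
            tick, _unit, seq_num = match.groups()
            ticks_by_sn[int(seq_num)].append(int(tick))
    return {sn: ticks for sn, ticks in ticks_by_sn.items() if len(ticks) > 1}


if __name__ == "__main__":
    sample = [
        # First line is the one quoted in the message above.
        "769413949: system.switch_cpus.iew.lsq.thread0: Doing memory access "
        "for inst [sn:15059405] PC (0x810ed626=>0x810ed62a).(1=>2)",
        # Hypothetical second access, added only to demonstrate detection.
        "769414500: system.switch_cpus.iew.lsq.thread0: Doing memory access "
        "for inst [sn:15059405] PC (0x810ed626=>0x810ed62a).(1=>2)",
    ]
    print(repeated_memory_accesses(sample))
```

For a real run, the same function can be fed an open trace file (`with open(path) as f: repeated_memory_accesses(f)`), since it only iterates over lines.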

Re: [gem5-users] Indeterministic gem5 behavior

2019-09-09 Thread Pouya Fotouhi
You can try dumping the Exec trace for the last few million ticks to see what is going on in your LSQ and why you have a load instruction that is not executed. Best, On Mon, Sep 9, 2019 at 11:28 AM Shehab Elsayed wrote: > I am not sure that prefetch_nta is the problem. For different runs the >

Re: [gem5-users] Indeterministic gem5 behavior

2019-09-09 Thread Shehab Elsayed
I am not sure that prefetch_nta is the problem. For different runs the simulation would fail after different periods after printing the prefetch_nta warning message. Also, from what I have seen in different posts it seems that this warning has been around for a while. I tried compiling my hello

Re: [gem5-users] Indeterministic gem5 behavior

2019-09-08 Thread Pouya Fotouhi
Hi Shehab, Good, that's "progress"! My guess off the top of my head is that you used a "more recent" compiler (compared to what other gem5 users tend to use), and thus some instructions are being generated that were not critical to the execution of applications other users had so far (and that's

Re: [gem5-users] Indeterministic gem5 behavior

2019-09-07 Thread Shehab Elsayed
I am sorry for the late update. I tried running with MESI_Two_Level, but the simulation ends with this error:
warn: instruction 'prefetch_nta' unimplemented
gem5.opt: build/X86_MESI_Two_Level/cpu/o3/lsq_unit.hh:621: Fault LSQUnit::read(LSQUnit::LSQRequest*, int) [with Impl = O3CPUImpl; Fault =

Re: [gem5-users] Indeterministic gem5 behavior

2019-09-06 Thread Pouya Fotouhi
Hi Shehab, As Jason pointed out, I won’t be surprised if you are having issues with classic caches running workloads that rely on locking mechanisms. Your pthread implementation is possibly using some synchronization variables which require cache coherence to maintain their correctness, and

Re: [gem5-users] Indeterministic gem5 behavior

2019-09-06 Thread Jason Lowe-Power
Hi Shehab, IIRC, there are some issues when using classic caches + x86 + multiple cores in full system mode. I suggest using Ruby (MESI_Two_Level or MOESI_hammer) for FS simulations. Jason On Fri, Sep 6, 2019 at 11:24 AM Shehab Elsayed wrote: > My latest experiments are with the classical

Re: [gem5-users] Indeterministic gem5 behavior

2019-09-06 Thread Shehab Elsayed
My latest experiments are with the classical memory system, but I remember trying Ruby and it was not different. I am using kernel 4.8.13 and ubuntu-16.04.1-server-amd64 disk image. I am using Pthreads for my Hello World program. On Fri, Sep 6, 2019 at 1:13 PM Pouya Fotouhi wrote: > Hi Shehab,

Re: [gem5-users] Indeterministic gem5 behavior

2019-09-06 Thread Shehab Elsayed
First of all, thanks for your replies, Ryan and Jason. I have already pulled the latest changes by Pouya and the problem still persists. As for checkpointing, I was originally doing exactly what Jason mentioned and ran into the same problem. I then switched to not checkpointing just to avoid any

Re: [gem5-users] Indeterministic gem5 behavior

2019-09-06 Thread Jason Lowe-Power
Hi Shehab, One quick note: There is *no way* to have deterministic behavior when running with KVM. Since you are using the hardware, the underlying host OS will influence the execution path of the workload. To try to narrow down the bug you're seeing, you can try to take a checkpoint after

Re: [gem5-users] Indeterministic gem5 behavior

2019-09-06 Thread Gambord, Ryan
Yes, running in full system. I can't even run ARM FS on my home computer (Arch Linux) or on campus (CentOS 7). Under a VM with Ubuntu LTS, it runs fine. For x86 FS, PARSEC benchmarks work better out of the box when run under a VM. Ryan Gambord On Fri, Sep 6, 2019, 08:21 Shehab Elsayed wrote: >

Re: [gem5-users] Indeterministic gem5 behavior

2019-09-06 Thread Shehab Elsayed
That's interesting. Are you using Full System as well? I don't think FS behavior is supposed to be so dependent on the host environment! On Fri, Sep 6, 2019 at 11:16 AM Gambord, Ryan wrote: > I have found that gem5 behavior is sensitive to the execution environment. > I now run gem5 inside an

Re: [gem5-users] Indeterministic gem5 behavior

2019-09-06 Thread Gambord, Ryan
I have found that gem5 behavior is sensitive to the execution environment. I now run gem5 inside an Ubuntu VM on QEMU and have had much more consistent results. I haven't tried running KVM gem5 inside a KVM QEMU VM, so I am not sure how that works, but it might be worth trying. Ryan On Fri, Sep 6,

Re: [gem5-users] Indeterministic gem5 behavior

2019-09-06 Thread Shehab Elsayed
I was wondering if anyone is running into the same problem or if anyone has any suggestions on how to proceed with debugging this problem. On Mon, Jul 29, 2019 at 4:57 PM Shehab Elsayed wrote: > Sorry for the spam. I just forgot to mention that the system configuration > I am using is mainly

Re: [gem5-users] Indeterministic gem5 behavior

2019-07-29 Thread Shehab Elsayed
Sorry for the spam. I just forgot to mention that the system configuration I am using is mainly from https://github.com/darchr/gem5/tree/jason/kvm-testing/configs/myconfigs. Shehab Y. Elsayed, MSc. PhD Student The Edwards

Re: [gem5-users] Indeterministic gem5 behavior

2019-07-29 Thread Shehab Elsayed
I have set up a repo with gem5 that demonstrates the problem. The repo includes the latest version of gem5 from gem5's github repo with a few patches applied to get KVM working together with the kernel binary and disk image I am using. You can get the repo at

Re: [gem5-users] Indeterministic gem5 behavior

2019-07-23 Thread Shehab Elsayed
When I enable the Exec debug flag I can see that it seems to be stuck in a spin lock (queued_spin_lock_slowpath) On Fri, Jul 19, 2019 at 5:28 PM Shehab Elsayed wrote: > Hello All, > > I have a gem5 X86 full system set up that starts with KVM cores and then > switches to O3 cores once the

[gem5-users] Indeterministic gem5 behavior

2019-07-19 Thread Shehab Elsayed
Hello All, I have a gem5 X86 full system setup that starts with KVM cores and then switches to O3 cores once the benchmark reaches the region of interest. Right now I am testing with a simple multithreaded hello world benchmark. Sometimes the benchmark completes successfully while other times gem5