If you use --debug-flags=ExecAll,Decode and narrow the trace down, with --debug-start and --debug-end, to the ticks where you know the load is failing, you should be able to get that.
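A sketch of such an invocation (the tick window here is copied from the trace earlier in this thread, and the config script path is a placeholder for whatever run script you normally use):

```shell
# Dump a decoded exec trace only around the failing window.
# Substitute your own tick values and config script.
build/X86/gem5.opt \
    --debug-flags=ExecAll,Decode \
    --debug-start=7694139000000 \
    --debug-end=7694139500000 \
    --debug-file=exec_window.out \
    configs/example/fs.py ...
```

Writing to a separate --debug-file keeps the (very large) trace out of simout.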
Best,

On Wed, Sep 11, 2019 at 2:15 PM Shehab Elsayed <[email protected]> wrote:

Is there a way to get the macroop from the corresponding instruction pointer?

On Wed, Sep 11, 2019 at 5:07 PM Pouya Fotouhi <[email protected]> wrote:

Hi Shehab,

Can you confirm which macroop is issuing that load? I suspect it's one of the 128-bit instructions (maybe the non-temporal ones that I recently added) that are executed as two 64-bit loads, and possibly the second one is failing due to the cda check that we do, and that stops the load from being committed.

Best,

On Wed, Sep 11, 2019 at 1:16 PM Shehab Elsayed <[email protected]> wrote:

So the load instruction actually gets executed twice, causing the assertion to fail the second time:

7694139490000: system.switch_cpus.iew.lsq.thread0: Doing memory access for inst [sn:15059405] PC (0xffffffff810ed626=>0xffffffff810ed62a).(1=>2)
7694139490000: system.switch_cpus.iew.lsq.thread0: Load [sn:15059405] not executed from fault
7694139490000: system.switch_cpus.iew.lsq.thread0: 1- Setting [sn:15059405] as executed (I added this message to track when LSQ instructions are set as executed)

I believe this instruction should then be committed and removed from the LSQ before being executed again; however, this does not happen. Instead, it gets executed again before being removed, and then comes the assertion failure that it has already executed.

I see that it gets sent to commit:

7694139490000: system.switch_cpus.iew: Sending instructions to commit, [sn:15059405] PC (0xffffffff810ed626=>0xffffffff810ed62a).(1=>2).

but it never actually reaches commit and is never removed from the LSQ.
On Mon, Sep 9, 2019 at 3:01 PM Pouya Fotouhi <[email protected]> wrote:

You can try dumping an Exec trace for the last few million ticks and see what is going on in your LSQ and why you have a load instruction that is not executed.

Best,

On Mon, Sep 9, 2019 at 11:28 AM Shehab Elsayed <[email protected]> wrote:

I am not sure that prefetch_nta is the problem. For different runs, the simulation fails after different periods after printing the prefetch_nta warning message. Also, from what I have seen in different posts, this warning has been around for a while.

I tried compiling my hello world program with -march=athlon64, alone and together with -O0, and the same problem happens.

Also, I am building my benchmark on the disk image directly using qemu, and the gcc on the image is version 5.4.0.

On Sun, Sep 8, 2019 at 4:14 PM Pouya Fotouhi <[email protected]> wrote:

Hi Shehab,

Good, that's "progress"!
My guess off the top of my head is that you used a "more recent" compiler (compared to what other gem5 users tend to use), and thus some instructions are being generated that were not critical to the execution of applications other users had so far (which is mostly why those instructions are not yet implemented). I think you have two options:

1. You can try implementing prefetch_nta, and possibly ignore the non-temporal hint (i.e. implement it as a cacheable prefetch). You can start by looking at the implementation of the other prefetch instructions we have in gem5 (basically you can do the same :) ).
2. Try compiling your application (I think we are still talking about the hello world, right?) targeting an older architecture (you can go as extreme as -march=athlon64) with fewer optimizations, to avoid these performance optimizations (reducing cache pollution in this particular case) that your compiler is trying to apply.

My suggestion is to go with the first one, since running real applications compiled for an older architecture with fewer optimizations on a "newer" system is equivalent to not using parts/features of your system (e.g. SIMD units, direct prefetch, etc.), which would (possibly) directly impact any study you are working on.

Best,

On Sat, Sep 7, 2019 at 8:27 PM Shehab Elsayed <[email protected]> wrote:

I am sorry for the late update. I tried running with MESI_Two_Level, but the simulation ends with this error:

warn: instruction 'prefetch_nta' unimplemented

gem5.opt: build/X86_MESI_Two_Level/cpu/o3/lsq_unit.hh:621: Fault LSQUnit<Impl>::read(LSQUnit<Impl>::LSQRequest*, int) [with Impl = O3CPUImpl; Fault = std::shared_ptr<FaultBase>; LSQUnit<Impl>::LSQRequest = LSQ<O3CPUImpl>::LSQRequest]: Assertion `!load_inst->isExecuted()' failed.

I believe this has something to do with a recent update, since I don't remember seeing it before. And this error happens even with just 2 cores and 2 threads.

On Fri, Sep 6, 2019 at 3:16 PM Pouya Fotouhi <[email protected]> wrote:

Hi Shehab,

As Jason pointed out, I won't be surprised if you are having issues with classic caches running workloads that rely on locking mechanisms.
Your pthread implementation is possibly using some synchronization variables which require cache coherence to maintain correctness, and classic caches (at least for now) don't support that.

Switch to Ruby caches (I suggest MESI Two Level to begin with), and given your kernel version, you should get stable behavior from gem5.

Best,

On Fri, Sep 6, 2019 at 11:47 AM Jason Lowe-Power <[email protected]> wrote:

Hi Shehab,

IIRC, there are some issues when using classic caches + x86 + multiple cores in full system mode. I suggest using Ruby (MESI_two_level or MOESI_hammer) for FS simulations.

Jason

On Fri, Sep 6, 2019 at 11:24 AM Shehab Elsayed <[email protected]> wrote:

My latest experiments are with the classic memory system, but I remember trying Ruby and it was no different. I am using kernel 4.8.13 and the ubuntu-16.04.1-server-amd64 disk image. I am using Pthreads for my Hello World program.

On Fri, Sep 6, 2019 at 1:13 PM Pouya Fotouhi <[email protected]> wrote:

Hi Shehab,

Can you confirm a few details about the configuration you are using? Are you using classic caches or Ruby? What kernel version and disk image are you using? What is the implementation of your "multithreaded hello world" (are you using OMP)?

Best,

On Fri, Sep 6, 2019 at 8:58 AM Shehab Elsayed <[email protected]> wrote:

First of all, thanks for your replies, Ryan and Jason.
I have already pulled the latest changes by Pouya, and the problem still persists.

As for checkpointing, I was originally doing exactly what Jason mentioned and ran into the same problem. I then switched to not checkpointing, just to avoid any problems that might be caused by checkpointing (if any). My plan was to go back to checkpointing after proving that it works without it.

However, the problem doesn't seem to be related to KVM, as Linux boots reliably every time. The problem happens after the benchmark starts execution, and it seems to happen only when running multiple cores (>=4). My latest experiments with a single core and 8 threads for the benchmark seem to be working fine. But once I increase the number of simulated cores, problems happen.

Also, I have posted a link to the repo I am using to run my tests in a previous message. I have also added 2 debug traces with the Exec flag, for a working and a non-working example.

On Fri, Sep 6, 2019 at 11:28 AM Jason Lowe-Power <[email protected]> wrote:

Hi Shehab,

One quick note: there is *no way* to have deterministic behavior when running with KVM. Since you are using the hardware, the underlying host OS will influence the execution path of the workload.

To try to narrow down the bug you're seeing, you can try to take a checkpoint after booting with KVM. Then, the execution from the checkpoint should be deterministic since it is 100% in gem5.
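Using the stock fs.py options as an illustration (the exact scripts in this thread differ, so treat the flags and paths below as assumptions to adapt), the boot-then-deterministic-replay flow looks roughly like this:

```shell
# 1) Boot fast under KVM and take a checkpoint (e.g. triggered by an
#    m5 checkpoint call from inside the guest), then exit.
build/X86/gem5.opt configs/example/fs.py \
    --cpu-type=X86KvmCPU --checkpoint-dir=ckpt ...

# 2) Restore from checkpoint 1 with the O3 CPU; from this point on,
#    execution is 100% inside gem5 and should be deterministic.
build/X86/gem5.opt configs/example/fs.py \
    --cpu-type=DerivO3CPU --caches -r 1 --checkpoint-dir=ckpt ...
```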
BTW, I doubt you can run the KVM CPU in a VM, since this would require your hardware and the VM to support nested virtualization. There *is* support for this in the Linux kernel, but I don't think it has been widely deployed outside of specific cloud environments.

One other note: Pouya has pushed some changes which implement some x86 instructions that were causing issues for him. You can try the current gem5 mainline to see if that helps.

Cheers,
Jason

On Fri, Sep 6, 2019 at 8:22 AM Shehab Elsayed <[email protected]> wrote:

That's interesting. Are you using full system as well? I don't think FS behavior is supposed to be so dependent on the host environment!

On Fri, Sep 6, 2019 at 11:16 AM Gambord, Ryan <[email protected]> wrote:

I have found that gem5's behavior is sensitive to the execution environment. I now run gem5 inside an Ubuntu VM on qemu and have had much more consistent results. I haven't tried running KVM gem5 inside a KVM qemu VM, so I'm not sure how that works, but it might be worth trying.

Ryan

On Fri, Sep 6, 2019, 08:07 Shehab Elsayed <[email protected]> wrote:

I was wondering if anyone is running into the same problem, or if anyone has suggestions on how to proceed with debugging it.
On Mon, Jul 29, 2019 at 4:57 PM Shehab Elsayed <[email protected]> wrote:

Sorry for the spam. I just forgot to mention that the system configuration I am using is mainly from https://github.com/darchr/gem5/tree/jason/kvm-testing/configs/myconfigs.

Shehab Y. Elsayed, MSc.
PhD Student
The Edwards S. Rogers Sr. Dept. of Electrical and Computer Engineering
University of Toronto
E-mail: [email protected]

On Mon, Jul 29, 2019 at 4:08 PM Shehab Elsayed <[email protected]> wrote:

I have set up a repo with gem5 that demonstrates the problem. The repo includes the latest version of gem5 from gem5's GitHub repo, with a few patches applied to get KVM working together with the kernel binary and disk image I am using. You can get the repo at https://github.com/ShehabElsayed/gem5_debug.git.

These steps should reproduce the problem:
1- scons build/X86/gem5.opt
2- ./scripts/get_fs_stuff.sh
3- ./scripts/run_fs.sh 8

I have also included sample m5term outputs for both a 2-thread run (m5out_2t) and an 8-thread run (m5out_8t).

Any help is really appreciated.
On Tue, Jul 23, 2019 at 11:01 AM Shehab Elsayed <[email protected]> wrote:

When I enable the Exec debug flag, I can see that it seems to be stuck in a spin lock (queued_spin_lock_slowpath).

On Fri, Jul 19, 2019 at 5:28 PM Shehab Elsayed <[email protected]> wrote:

Hello All,

I have a gem5 X86 full system setup that starts with KVM cores and then switches to O3 cores once the benchmark reaches the region of interest. Right now I am testing with a simple multithreaded hello world benchmark. Sometimes the benchmark completes successfully, while at other times gem5 just seems to hang after starting the benchmark. I believe it is still executing some instructions, but without making any progress. The chance of this behavior (indeterminism) happening increases as the number of simulated cores or the number of threads created by the benchmark increases.

Any ideas what might be the reason for this, or how I can start debugging this problem?

Note: I have tried the patch in https://gem5-review.googlesource.com/c/public/gem5/+/19568 but the problem persists.

Thanks!
--
Pouya Fotouhi
PhD Candidate
Department of Electrical and Computer Engineering
University of California, Davis

_______________________________________________
gem5-users mailing list
[email protected]
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
