[gem5-users] Support for multicore for X86 O3 CPU Full System with classical memory
Hello All, I am running into trouble running full system simulation with 2 X86 O3 cores and classical memory system where the system seems to get stuck during script execution. I read in other (relatively) old posts that multicore might not be supported for this configuration (X86-O3-classical memory). I was just wondering if this is still the case or if I should investigate more to see if I am doing something wrong. The same configuration works for 1 X86 O3 core as well as for multicore AtomicSimple cores. Thanks in advance, Shehab ___ gem5-users mailing list gem5-users@gem5.org http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
Re: [gem5-users] [EXT] Accessing logical (software) thread ID in gem5
UPDATE: If I call curTaskPIDFromTaskStruct() instead of curTaskPID() both kernels give the same error of "panic: vtophys page walk returned fault"

On Mon, Jan 14, 2019 at 12:19 PM Shehab Elsayed wrote:
> So, I tried adding this patch to kernels 4.3 and 4.8.13. Both kernels
> compile successfully but then I run into problems in gem5. Here is what
> happens:
>
> 1- Kernel 4.3:
> When curThreadInfo() gets called on a context switch it gives this error
> "panic:curThreadInfo() not implemented for this ISA"
>
> 2- Kernel 4.8.13:
> When curTaskPID() gets called on a context switch it gives this error
> "panic: vtophys page walk returned fault"
>
> Also, I don't think I mentioned this before. I am running full system
> simulation with X86 cores.
Re: [gem5-users] [EXT] Accessing logical (software) thread ID in gem5
So, I tried adding this patch to kernels 4.3 and 4.8.13. Both kernels compile successfully, but then I run into problems in gem5. Here is what happens:

1- Kernel 4.3:
When curThreadInfo() gets called on a context switch it gives this error:
"panic:curThreadInfo() not implemented for this ISA"

2- Kernel 4.8.13:
When curTaskPID() gets called on a context switch it gives this error:
"panic: vtophys page walk returned fault"

Also, I don't think I mentioned this before: I am running full system simulation with X86 cores.

On Sat, Jan 12, 2019 at 12:17 PM Paul Rosenfeld (prosenfeld) <prosenf...@micron.com> wrote:
> You’ll need to build a kernel with the extra annotations. I’m not sure if
> they are supported in the latest kernels, but I know they were supported in
> v4.3. Here is the commit that adds those symbols:
> https://github.com/gem5/linux-arm-gem5-legacy/commit/516ba2d255b502b1dad07662bd18110f3bf37b1b
>
> *From:* gem5-users [mailto:gem5-users-boun...@gem5.org] *On Behalf Of* Shehab Elsayed
> *Sent:* Friday, January 11, 2019 10:58 PM
> *To:* gem5 users mailing list
> *Subject:* Re: [gem5-users] [EXT] Accessing logical (software) thread ID in gem5
>
> Thank you all for your help. I have been trying to get what Paul
> suggested to work; however, I keep running into this problem:
>
> warn: Unable to find kernel symbol thread_info_task
> warn: Kernel not compiled with task_struct info; can't get currently executing task/process/thread name/ids!
>
> I am not sure what I need to modify in the kernel in order to get it to
> work. Any suggestions?
>
> On Fri, Jan 11, 2019 at 9:27 AM Paul Rosenfeld (prosenfeld) <prosenf...@micron.com> wrote:
>
> You could take the approach previously implemented by ARM, which is to add
> a few annotations to your kernel that allow you to find the task_info
> structures in kernel memory and then ask gem5 to hook the kernel process
> switch function. Each time the kernel context switches on a core, you get a
> callback into gem5 and you can look up the process info. It’s been a while
> since I’ve worked on this sort of thing, but you might be able to look at
> this patch for some hints about where to look:
>
> https://gem5-review.googlesource.com/c/public/gem5/+/2640
>
> Cheers,
> Paul
>
> *From:* gem5-users [mailto:gem5-users-boun...@gem5.org] *On Behalf Of* Shehab Elsayed
> *Sent:* Thursday, January 10, 2019 11:53 AM
> *To:* gem5-users@gem5.org
> *Subject:* [EXT] [gem5-users] Accessing logical (software) thread ID in gem5
>
> Hello All,
>
> I was wondering if there is a way to differentiate between different
> logical (software) threads in gem5. I am trying to collect some stats for
> each logical thread, and so far all I could find in gem5 is access to
> physical threads. I know that logical threads are the responsibility of the
> OS, but is there any way for gem5 to access the logical thread ID?
>
> One option is to pin threads to cores, but this only works if the number of
> cores is at least equal to the number of logical threads. However, I will
> need to run some experiments where the number of logical threads exceeds the
> number of cores, in which case multiple logical threads will be assigned
> to the same core, and in order to differentiate between them I need the
> logical thread ID.
>
> Thank you very much in advance.
>
> Best Regards,
> Shehab
Re: [gem5-users] Support for multicore for X86 O3 CPU Full System with classical memory
This is the post I was referring to:
https://www.mail-archive.com/gem5-users@gem5.org/msg10498.html

The second paragraph describes the problem with more than one O3 CPU, Full System, and the classical memory system at the time of the post.

On Fri, Nov 16, 2018 at 5:38 PM Gabe Black wrote:
> I'm not specifically aware of a problem with O3 and multicore on x86. Can
> you point out the posts you're referring to? O3 and multicore both add a
> dimension of complexity, so that would be a more likely place for bugs to
> crop up.
>
> Gabe
>
> On Fri, Nov 16, 2018 at 7:20 AM Shehab Elsayed wrote:
>
>> Hello All,
>>
>> I am running into trouble running full system simulation with 2 X86 O3
>> cores and classical memory system where the system seems to get stuck
>> during script execution. I read in other (relatively) old posts that
>> multicore might not be supported for this configuration (X86-O3-classical
>> memory).
>>
>> I was just wondering if this is still the case or if I should investigate
>> more to see if I am doing something wrong.
>>
>> The same configuration works for 1 X86 O3 core as well as for multicore
>> AtomicSimple cores.
>>
>> Thanks in advance,
>>
>> Shehab
[gem5-users] Accessing logical (software) thread ID in gem5
Hello All,

I was wondering if there is a way to differentiate between different logical (software) threads in gem5. I am trying to collect some stats for each logical thread, and so far all I could find in gem5 is access to physical threads. I know that logical threads are the responsibility of the OS, but is there any way for gem5 to access the logical thread ID?

One option is to pin threads to cores, but this only works if the number of cores is at least equal to the number of logical threads. However, I will need to run some experiments where the number of logical threads exceeds the number of cores, in which case multiple logical threads will be assigned to the same core, and in order to differentiate between them I need the logical thread ID.

Thank you very much in advance.

Best Regards,
Shehab
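To make the bookkeeping in the question concrete: whatever mechanism ends up supplying the software thread ID, the stats side reduces to remembering which thread currently owns each core and charging events to it. The following is a minimal plain-Python sketch of that idea only. `PerThreadStats`, `on_context_switch` and `record` are hypothetical names chosen for illustration and are not gem5 APIs, and the thread IDs are made up.

```python
from collections import defaultdict


class PerThreadStats:
    """Attribute per-core events to the software thread running on that core.

    Hypothetical helper for illustration only; none of these names are gem5 APIs.
    """

    def __init__(self):
        self.current = {}  # core id -> software thread id now running there
        self.counters = defaultdict(lambda: defaultdict(int))  # tid -> stat -> count

    def on_context_switch(self, core, next_tid):
        # Would be called whenever the OS schedules a different thread on `core`.
        self.current[core] = next_tid

    def record(self, core, stat, amount=1):
        # Charge the event to whichever thread currently owns the core.
        tid = self.current.get(core)
        if tid is not None:
            self.counters[tid][stat] += amount


stats = PerThreadStats()
# Two software threads (101, 102) time-share core 0; thread 103 runs on core 1.
stats.on_context_switch(0, 101)
stats.record(0, "loads", 5)
stats.on_context_switch(0, 102)
stats.record(0, "loads", 2)
stats.on_context_switch(1, 103)
stats.record(1, "loads", 7)
stats.on_context_switch(0, 101)
stats.record(0, "loads", 1)

for tid in sorted(stats.counters):
    print(tid, dict(stats.counters[tid]))
```

Because the counters are keyed by thread ID rather than by core, the result does not depend on how many threads share a core, which is the case pinning cannot handle.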
Re: [gem5-users] [EXT] Accessing logical (software) thread ID in gem5
Thank you all for your help. I have been trying to get what Paul suggested to work; however, I keep running into this problem:

warn: Unable to find kernel symbol thread_info_task
warn: Kernel not compiled with task_struct info; can't get currently executing task/process/thread name/ids!

I am not sure what I need to modify in the kernel in order to get it to work. Any suggestions?

On Fri, Jan 11, 2019 at 9:27 AM Paul Rosenfeld (prosenfeld) <prosenf...@micron.com> wrote:
> You could take the approach previously implemented by ARM, which is to add
> a few annotations to your kernel that allow you to find the task_info
> structures in kernel memory and then ask gem5 to hook the kernel process
> switch function. Each time the kernel context switches on a core, you get a
> callback into gem5 and you can look up the process info. It’s been a while
> since I’ve worked on this sort of thing, but you might be able to look at
> this patch for some hints about where to look:
>
> https://gem5-review.googlesource.com/c/public/gem5/+/2640
>
> Cheers,
> Paul
[gem5-users] Full System hangs or exits to login
Hello All,

When I try running some benchmarks (Splash 3) in Full System, gem5 seems to hang after some time, usually in the middle of a print statement in the gem5 terminal. Other times it exits the benchmark to the Ubuntu login prompt!

My setup: I create a checkpoint after booting with X86 KVM cores. Then I boot from the checkpoint using the Atomic CPU model and later switch to OoO CPU models after reaching the ROI. Simulation usually hangs after reaching the ROI. I am using a classical X86 system without Ruby.

Any ideas what might be going wrong, or where I could start looking to debug this problem?

Best Regards,
Shehab
[gem5-users] Full system sometimes ignores rcS script
Hello All,

Is it normal that a full system simulation sometimes ignores the rcS script and goes directly to the Ubuntu login screen?

I sometimes run into this problem, and then I just run the command again without any change and it starts executing the rcS script.

Thanks in advance,
Shehab
Re: [gem5-users] Full system sometimes ignores rcS script
Yes, I am. But even if I am not creating the checkpoint, and instead use an rcS script that runs the benchmarks directly, gem5 still sometimes goes to the login screen instead of starting the benchmark.

On Thu, Jun 27, 2019 at 9:03 PM Gambord, Ryan wrote:
> Are you using the hackback script with a checkpoint?
>
> Ryan Gambord
>
> On Thu, Jun 27, 2019 at 2:48 PM Shehab Elsayed wrote:
>
>> Hello All,
>>
>> Is it normal that a full system simulation sometimes ignores the rcS
>> script and just goes to the ubuntu log in screen directly?
>>
>> I sometimes run into this problem and then I just run the command again
>> without any change and it starts executing the rcS script.
>>
>> Thanks in advance,
>> Shehab
Re: [gem5-users] Indeterministic gem5 behavior
So actually the load instruction gets executed twice, causing the assertion to fail the second time.

769413949: system.switch_cpus.iew.lsq.thread0: Doing memory access for inst [sn:15059405] PC (0x810ed626=>0x810ed62a).(1=>2)
769413949: system.switch_cpus.iew.lsq.thread0: Load [sn:15059405] not executed from fault
769413949: system.switch_cpus.iew.lsq.thread0: 1- Setting [sn:15059405] as executed

(I added the last message to track when LSQ instructions are set as executed.)

I believe this instruction should then be committed and removed from the LSQ before being executed again; however, this does not happen. Instead, it gets executed again before being removed, and then comes the assertion failure because it has already executed.

I see that it gets sent to commit:

769413949: system.switch_cpus.iew: Sending instructions to commit, [sn:15059405] PC (0x810ed626=>0x810ed62a).(1=>2).

but it never actually gets committed and removed from the LSQ.

On Mon, Sep 9, 2019 at 3:01 PM Pouya Fotouhi wrote:
> You can try dumping Exec trace for the last few million ticks and see what
> is going on in your LSQ and why you have a load instruction that is not
> executed.
>
> Best,
>
> On Mon, Sep 9, 2019 at 11:28 AM Shehab Elsayed wrote:
>
>> I am not sure that prefetch_nta is the problem. For different runs the
>> simulation would fail after different periods after printing the
>> prefetch_nta warning message. Also, from what I have seen in different
>> posts, it seems that this warning has been around for a while.
>>
>> I tried compiling my hello world program with -march=athlon64 alone and
>> together with -O0, and the same problem happens.
>>
>> Also, I am building my benchmark on the disk image directly using
>> qemu, and the gcc on the image is version 5.4.0.
>>
>> On Sun, Sep 8, 2019 at 4:14 PM Pouya Fotouhi wrote:
>>
>>> Hi Shehab,
>>>
>>> Good, that's "progress"!
>>> My guess off the top of my head is that you used a "more recent"
>>> compiler (compared to what other gem5 users tend to use), and thus some
>>> instructions are being generated that were not critical to the execution of
>>> applications other users had so far (and that's mostly why those
>>> instructions are not yet implemented). I think you have two options:
>>>
>>>    1. You can try implementing prefetch_nta, and possibly ignore the
>>>    non-temporal hint (i.e. implement it as a cacheable prefetch). You can
>>>    start by looking at the implementation of the other prefetch instructions
>>>    we have in gem5 (basically you can do the same :) ).
>>>    2. Try compiling your application (I think we are still talking
>>>    about the hello world, right?), and target an older architecture (you can
>>>    go as extreme as march=athlon64) with fewer optimizations involved, to
>>>    avoid these performance optimizations (reducing cache pollution in this
>>>    particular case) that your compiler is trying to apply.
>>>
>>> My suggestion is to go with the first one, since running real
>>> applications compiled for an older architecture with less optimization on a
>>> "newer" system is the equivalent of not using "parts/features" of your
>>> system (e.g. SIMD units, direct prefetch, etc.), which would (possibly)
>>> directly impact any study you are working on.
>>>
>>> Best,
>>>
>>> On Sat, Sep 7, 2019 at 8:27 PM Shehab Elsayed wrote:
>>>
>>>> I am sorry for the late update. I tried running with MESI_Two_Level but
>>>> the simulation ends with this error.
>>>>
>>>> warn: instruction 'prefetch_nta' unimplemented
>>>>
>>>> gem5.opt: build/X86_MESI_Two_Level/cpu/o3/lsq_unit.hh:621: Fault
>>>> LSQUnit::read(LSQUnit::LSQRequest*, int) [with Impl =
>>>> O3CPUImpl; Fault = std::shared_ptr; LSQUnit::LSQRequest =
>>>> LSQ::LSQRequest]: Assertion `!load_inst->isExecuted()' failed.
>>>>
>>>> Which I believe has something to do with a recent update, since I don't
>>>> remember seeing it before. And this error happens even for just 2 cores and
>>>> 2 threads.
>>>>
>>>> On Fri, Sep 6, 2019 at 3:16 PM Pouya Fotouhi wrote:
>>>>
>>>>> Hi Shehab,
>>>>>
>>>>> As Jason pointed out, I won’t be surprised if you are having issues
>>>>> with classic caches running workloads that re
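Following one instruction through a long debug trace, as done by hand in the message above, can be scripted. This is a minimal sketch that assumes only the "<tick>: <object>: <message>" line shape and the "[sn:N]" tag visible in the quoted trace lines. The function names are hypothetical, and the sample holds just the four lines quoted in the message.

```python
import re
from collections import defaultdict

# Debug lines in the excerpt look like "<tick>: <object>: <message>".
LINE_RE = re.compile(r"^\s*(\d+): ([\w.]+): (.*)$")
SN_RE = re.compile(r"\[sn:(\d+)\]")


def events_by_sn(lines):
    """Group (tick, object, message) tuples by the sequence number they mention."""
    grouped = defaultdict(list)
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip anything that is not a trace line
        tick, obj, msg = int(m.group(1)), m.group(2), m.group(3)
        for sn in SN_RE.findall(msg):
            grouped[int(sn)].append((tick, obj, msg))
    return grouped


def repeated_accesses(grouped, marker="Doing memory access"):
    """Return sequence numbers whose memory access is logged more than once."""
    return sorted(
        sn for sn, events in grouped.items()
        if sum(marker in msg for _, _, msg in events) > 1
    )


# The four trace lines quoted in the message above.
SAMPLE = [
    "769413949: system.switch_cpus.iew.lsq.thread0: Doing memory access for inst [sn:15059405] PC (0x810ed626=>0x810ed62a).(1=>2)",
    "769413949: system.switch_cpus.iew.lsq.thread0: Load [sn:15059405] not executed from fault",
    "769413949: system.switch_cpus.iew.lsq.thread0: 1- Setting [sn:15059405] as executed",
    "769413949: system.switch_cpus.iew: Sending instructions to commit, [sn:15059405] PC (0x810ed626=>0x810ed62a).(1=>2).",
]

grouped = events_by_sn(SAMPLE)
for tick, obj, msg in grouped[15059405]:
    print(tick, obj, msg)
print("executed more than once:", repeated_accesses(grouped))
```

The excerpt logs the access only once, so the second function reports nothing for it. Run over the full trace file (for example `events_by_sn(open(path))`, with `path` being wherever the trace was written), a non-empty result would list the loads that reached the memory-access stage twice.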
Re: [gem5-users] Indeterministic gem5 behavior
Is there a way to get the macroop from the corresponding instruction pointer?

On Wed, Sep 11, 2019 at 5:07 PM Pouya Fotouhi wrote:
> Hi Shehab,
>
> Can you please confirm which macroop is issuing that load? I
> suspect it's one of the 128-bit instructions (maybe the non-temporal
> ones that I recently added) that are executed as two 64-bit loads, and
> possibly the second one is failing due to the cda check that we do, and
> that stops the load from being committed.
>
> Best,
>
> On Wed, Sep 11, 2019 at 1:16 PM Shehab Elsayed wrote:
>
>> So actually load instruction gets executed twice causing the assertion to
>> fail on the second time.
>>
>> 769413949: system.switch_cpus.iew.lsq.thread0: Doing memory access for inst [sn:15059405] PC (0x810ed626=>0x810ed62a).(1=>2)
>> 769413949: system.switch_cpus.iew.lsq.thread0: Load [sn:15059405] not executed from fault
>> 769413949: system.switch_cpus.iew.lsq.thread0: 1- Setting [sn:15059405] as executed
>> (I added this message to track when LSQ instructions are set as executed)
>>
>> I believe this instruction should then be committed and removed from the
>> LSQ before being executed again, however, this does not happen. Instead it
>> gets executed again before being removed and then comes the assertion
>> failure that it has already executed.
>>
>> I see that it gets sent to commit
>>
>> 769413949: system.switch_cpus.iew: Sending instructions to commit, [sn:15059405] PC (0x810ed626=>0x810ed62a).(1=>2).
>>
>> but it never actually gets to commit and removed from LSQ.
Re: [gem5-users] Indeterministic gem5 behavior
My latest experiments are with the classical memory system, but I remember trying Ruby and it was no different. I am using kernel 4.8.13 and the ubuntu-16.04.1-server-amd64 disk image. I am using Pthreads for my Hello World program.

On Fri, Sep 6, 2019 at 1:13 PM Pouya Fotouhi wrote:
> Hi Shehab,
>
> Can you confirm a few details about the configuration you are using? Are
> you using classic caches or Ruby? What is the kernel version and disk image
> you are using? What is the implementation of your "multithreaded hello
> world" (are you using OMP)?
>
> Best,
>
> On Fri, Sep 6, 2019 at 8:58 AM Shehab Elsayed wrote:
>
>> First of all, thanks for your replies, Ryan and Jason.
>>
>> I have already pulled the latest changes by Pouya and the problem still
>> persists.
>>
>> As for checkpointing, I was originally doing exactly what Jason
>> mentioned and ran into the same problem. I then switched to not
>> checkpointing just to avoid any problems that might be caused by
>> checkpointing (if any). My plan was to go back to checkpointing after
>> proving that it works without it.
>>
>> However, the problem doesn't seem to be related to KVM, as Linux boots
>> reliably every time. The problem happens after the benchmark starts
>> execution, and it seems to happen only when running multiple cores
>> (>=4). My latest experiments with a single core and 8 threads for the
>> benchmark seem to be working fine. But once I increase the number of
>> simulated cores, problems happen.
>>
>> Also, I posted a link to the repo I am using to run my tests in a
>> previous message. I have also added 2 debug traces with the Exec flag for a
>> working and a non-working example.
>>
>> On Fri, Sep 6, 2019 at 11:28 AM Jason Lowe-Power wrote:
>>
>>> Hi Shehab,
>>>
>>> One quick note: There is *no way* to have deterministic behavior when
>>> running with KVM. Since you are using the hardware, the underlying host OS
>>> will influence the execution path of the workload.
>>>
>>> To try to narrow down the bug you're seeing, you can try to take a
>>> checkpoint after booting with KVM. Then, the execution from the checkpoint
>>> should be deterministic since it is 100% in gem5.
>>>
>>> BTW, I doubt you can run the KVM CPU in a VM since this would require
>>> your hardware and the VM to support nested virtualization. There *is*
>>> support for this in the Linux kernel, but I don't think it's been widely
>>> deployed outside of specific cloud environments.
>>>
>>> One other note: Pouya has pushed some changes which implement some x86
>>> instructions that were causing issues for him. You can try with the current
>>> gem5 mainline to see if that helps.
>>>
>>> Cheers,
>>> Jason
>>>
>>> On Fri, Sep 6, 2019 at 8:22 AM Shehab Elsayed wrote:
>>>
>>>> That's interesting. Are you using Full System as well? I don't think FS
>>>> behavior is supposed to be so dependent on the host environment!
>>>>
>>>> On Fri, Sep 6, 2019 at 11:16 AM Gambord, Ryan wrote:
>>>>
>>>>> I have found that gem5 behavior is sensitive to the execution
>>>>> environment. I now run gem5 inside an ubuntu vm on qemu and have had much
>>>>> more consistent results. I haven't tried running kvm gem5 inside a kvm
>>>>> qemu vm, so not sure how that works, but might be worth trying.
>>>>>
>>>>> Ryan
>>>>>
>>>>> On Fri, Sep 6, 2019, 08:07 Shehab Elsayed wrote:
>>>>>
>>>>>> I was wondering if anyone is running into the same problem or if
>>>>>> anyone has any suggestions on how to proceed with debugging this problem.
>>>>>>
>>>>>> On Mon, Jul 29, 2019 at 4:57 PM Shehab Elsayed wrote:
>>>>>>
>>>>>>> Sorry for the spam. I just forgot to mention that the system
>>>>>>> configuration I am using is mainly from
>>>>>>> https://github.com/darchr/gem5/tree/jason/kvm-testing/configs/myconfigs
>>>>>>>
>>>>>>> Shehab Y. Elsayed, MSc.
>>>>>>> PhD Student
>>>>>>> The Edwards
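Since runs restored from a checkpoint should be deterministic, as noted in the quoted reply, a working and a non-working trace can be compared to find where they first differ. This is a minimal sketch that assumes each trace line starts with "<tick>: ". The function names are hypothetical, and the sample lines are placeholders rather than real gem5 output.

```python
from itertools import zip_longest


def strip_tick(line):
    """Drop the leading "<tick>: " so runs that differ only in timing still match."""
    head, sep, rest = line.partition(": ")
    return rest if sep and head.strip().isdigit() else line


def first_divergence(trace_a, trace_b, key=strip_tick):
    """Return (line_number, a_line, b_line) for the first mismatch, or None."""
    for n, (a, b) in enumerate(zip_longest(trace_a, trace_b), start=1):
        # A missing line (None) means one trace ended before the other.
        if a is None or b is None or key(a) != key(b):
            return n, a, b
    return None


# Placeholder lines standing in for two Exec traces.
good = ["100: cpu0: step A", "200: cpu0: step B", "300: cpu0: step C"]
bad = ["100: cpu0: step A", "250: cpu0: step B", "300: cpu0: step X"]
print(first_divergence(good, bad))
```

Both arguments can be open file objects, so large traces are compared line by line without loading them whole. Passing `key=lambda s: s` makes the comparison tick-sensitive, which is the stricter check if the two runs are expected to be bit-identical.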
Re: [gem5-users] Indeterministic gem5 behavior
Looks like this is the instruction causing the assertion failure:

MOV_R_M : ld r9, DS:[rdx + 0x10]

On Wed, Sep 11, 2019 at 5:20 PM Pouya Fotouhi wrote:
> If you use --debug-flags=ExecAll,Decode and narrow down your trace to the
> ticks where you know the load is failing, with --debug-start and --debug-end,
> you should be able to get that.
>
> Best,
>
> On Wed, Sep 11, 2019 at 2:15 PM Shehab Elsayed wrote:
>
>> Is there a way to get the macroop from the corresponding instruction
>> pointer?
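The suggestion quoted above amounts to tracing only a window of ticks around the failure. This is a minimal sketch of assembling such a command line, using the flags named in the message plus `--debug-file`. The window size, the trace file name, and the config script path are assumptions for illustration, and the binary path is inferred from the build directory in the assertion message earlier in the thread.

```python
import shlex


def debug_window_cmd(gem5_binary, config_script, fail_tick, window=100_000,
                     flags=("ExecAll", "Decode"), debug_file="trace.out"):
    """Build a gem5 command line that traces only the ticks around fail_tick."""
    start = max(0, fail_tick - window)  # never ask for a negative tick
    end = fail_tick + window
    return [
        gem5_binary,
        "--debug-flags=" + ",".join(flags),
        f"--debug-start={start}",
        f"--debug-end={end}",
        f"--debug-file={debug_file}",
        config_script,
    ]


cmd = debug_window_cmd(
    "build/X86_MESI_Two_Level/gem5.opt",  # inferred from the assertion message
    "configs/my_fs_config.py",            # hypothetical config script path
    fail_tick=769413949,                  # tick from the LSQ trace lines in this thread
)
print(shlex.join(cmd))
```

The simulator options go before the config script, and any script-specific arguments would follow it. A narrower window keeps the trace small once the failing tick is known more precisely.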
Re: [gem5-users] Indeterministic gem5 behavior
I ran some more tests and it doesn't always break at the same instruction (MOV_R_M). However, what seems to be common is that the instruction causing the problem (let's call it instruction A) conflicts with another instruction in the 'checkSnoop' function in 'src/cpu/o3/lsq_unit_impl.hh'. As a result, 'A' ends up with a ReExec fault and is marked as executed and sent to commit, but it never actually commits before being executed again.

What I understood from the comments in this function is that offending instructions should be squashed (since I am using X86, which sets needsTSO to True and consequently force_squash to True as well). But as far as I can tell, the instructions are not squashed.

I am not sure if this helps narrow down the problem, but I hope it does!
Re: [gem5-users] Indeterministic gem5 behavior
I am not sure that prefetch_nta is the problem. Across different runs, the simulation fails at different points after printing the prefetch_nta warning message. Also, from what I have seen in different posts, it seems that this warning has been around for a while.

I tried compiling my hello world program with -march=athlon64, both alone and together with -O0, and the same problem happens.

Also, I am building my benchmark directly on the disk image using qemu, and the gcc on the image is version 5.4.0.

On Sun, Sep 8, 2019 at 4:14 PM Pouya Fotouhi wrote:
> Hi Shehab,
>
> Good, that's "progress"!
> My guess off the top of my head is that you used a "more recent" compiler
> (compared to what other gem5 users tend to use), and thus some instructions
> are being generated that were not critical to the execution of applications
> other users had so far (and that's mostly why those instructions are not
> yet implemented). I think you have two options:
>
> 1. You can try implementing prefetch_nta, and possibly ignore the
>    non-temporal hint (i.e. implement it as a cacheable prefetch). You can
>    start by looking at the implementation of the other prefetch instructions
>    we have in gem5 (basically you can do the same :) ).
> 2. Try compiling your application (I think we are still talking about
>    the hello world, right?), and target an older architecture (you can go as
>    extreme as march=athlon64) with fewer optimizations involved to avoid these
>    performance optimizations (reducing cache pollution in this
>    particular case) that your compiler is trying to apply.
>
> My suggestion is to go with the first one, since running real applications
> compiled for an older architecture with less optimization on a "newer"
> system is the equivalent of not using "parts/features" of your system (e.g.
> SIMD units, direct prefetch, etc), which would (possibly) directly impact
> any study you are working on.
>
> Best,
Re: [gem5-users] Indeterministic gem5 behavior
I am sorry for the late update. I tried running with MESI_Two_Level but the simulation ends with this error:

warn: instruction 'prefetch_nta' unimplemented

gem5.opt: build/X86_MESI_Two_Level/cpu/o3/lsq_unit.hh:621: Fault LSQUnit::read(LSQUnit::LSQRequest*, int) [with Impl = O3CPUImpl; Fault = std::shared_ptr; LSQUnit::LSQRequest = LSQ::LSQRequest]: Assertion `!load_inst->isExecuted()' failed.

I believe this has something to do with a recent update, since I don't remember seeing it before. The error happens even with just 2 cores and 2 threads.

On Fri, Sep 6, 2019 at 3:16 PM Pouya Fotouhi wrote:
> Hi Shehab,
>
> As Jason pointed out, I won’t be surprised if you are having issues with
> classic caches running workloads that rely on locking mechanisms. Your
> pthread implementation is possibly using some synchronization variables
> which require cache coherence to maintain correctness, and classic
> caches (at least for now) don’t support that.
>
> Switch to Ruby caches (I suggest MESI Two Level to begin with), and given
> your kernel version you should be getting stable behavior from gem5.
>
> Best,
>
> On Fri, Sep 6, 2019 at 11:47 AM Jason Lowe-Power wrote:
>> Hi Shehab,
>>
>> IIRC, there are some issues when using classic caches + x86 + multiple
>> cores in full system mode. I suggest using Ruby (MESI_two_level or
>> MOESI_hammer) for FS simulations.
>>
>> Jason
>>
>> On Fri, Sep 6, 2019 at 11:24 AM Shehab Elsayed wrote:
>>> My latest experiments are with the classical memory system, but I
>>> remember trying Ruby and it was not different. I am using kernel 4.8.13
>>> and the ubuntu-16.04.1-server-amd64 disk image. I am using Pthreads for
>>> my Hello World program.
>>>
>>> On Fri, Sep 6, 2019 at 1:13 PM Pouya Fotouhi wrote:
>>>> Hi Shehab,
>>>>
>>>> Can you confirm a few details about the configuration you are using?
>>>> Are you using classic caches or Ruby? What is the kernel version and disk
>>>> image you are using? What is the implementation of your "multithreaded
>>>> hello world" (are you using OMP)?
>>>>
>>>> Best,
Re: [gem5-users] Indeterministic gem5 behavior
Since the problem seems to be that the faulty load instruction is re-executed before the faulty version is committed, is it safe to skip load execution if the instruction has a ReExec fault and its Executed flag is set, and to manually reset the Executed flag once the faulty instruction is completed?

This is roughly what I had in mind. I am not very familiar with the flow of instructions through the pipeline in the O3 implementation, so I would appreciate some feedback on whether this is safe to implement.

    DefaultIEW::executeInsts()
    {
        . . . .
        } else if (inst->isLoad()) {
            // Added check
            if ((dynamic_cast<ReExec*>(inst->fault.get()) != nullptr) &&
                (inst->isExecuted())) {
                if (inst->isCompleted())
                    inst->clearExecuted();
                continue;
            }
            // End of added check
        . . . .
    }

I am having better luck with this modification, but Ruby (MOESI_hammer) still sometimes runs into a deadlock.
Re: [gem5-users] Indeterministic gem5 behavior
First of all, thanks for your replies, Ryan and Jason.

I have already pulled the latest changes by Pouya and the problem still persists.

As for checkpointing, I was originally doing exactly what Jason mentioned and ran into the same problem. I then switched to not checkpointing just to avoid any problems that might be caused by checkpointing (if any). My plan was to go back to checkpointing after proving that it works without it.

However, the problem doesn't seem to be related to KVM, as Linux boots reliably every time. The problem happens after the benchmark starts execution, and it seems to happen only when running multiple cores (>=4). My latest experiments with a single core and 8 threads for the benchmark seem to be working fine, but once I increase the number of simulated cores, problems happen.

Also, I have posted a link to the repo I am using to run my tests in a previous message. I have also added 2 debug traces with the Exec flag, one for a working example and one for a non-working example.

On Fri, Sep 6, 2019 at 11:28 AM Jason Lowe-Power wrote:
> Hi Shehab,
>
> One quick note: There is *no way* to have deterministic behavior when
> running with KVM. Since you are using the hardware, the underlying host OS
> will influence the execution path of the workload.
>
> To try to narrow down the bug you're seeing, you can try to take a
> checkpoint after booting with KVM. Then, the execution from the checkpoint
> should be deterministic since it is 100% in gem5.
>
> BTW, I doubt you can run the KVM CPU in a VM since this would require your
> hardware and the VM to support nested virtualization. There *is* support
> for this in the Linux kernel, but I don't think it's been widely deployed
> outside of specific cloud environments.
>
> One other note: Pouya has pushed some changes which implement some x86
> instructions that were causing issues for him. You can try with the current
> gem5 mainline to see if that helps.
>
> Cheers,
> Jason
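One way to put a working and a non-working trace side by side is to look for the first line where they differ. The sketch below is only an illustration: the helper names `strip_tick` and `first_divergence` are made up here, and it assumes each trace line starts with a "<tick>: " field as in the excerpts quoted in this thread. Following Jason's note, the comparison is only meaningful for runs restored from the same checkpoint, since anything that ran under KVM will differ anyway.

```python
from itertools import zip_longest


def strip_tick(line):
    """Drop a leading '<tick>: ' field so timing-only shifts are ignored."""
    head, sep, rest = line.partition(": ")
    return rest if sep and head.strip().isdigit() else line


def first_divergence(good_lines, bad_lines, ignore_ticks=False):
    """Return (index, good_line, bad_line) for the first differing line,
    or None if both traces are identical. A missing line (one trace is
    shorter than the other) is reported as None in the tuple."""
    for index, (good, bad) in enumerate(zip_longest(good_lines, bad_lines)):
        key_good, key_bad = good, bad
        if ignore_ticks:
            key_good = strip_tick(good) if good is not None else None
            key_bad = strip_tick(bad) if bad is not None else None
        if key_good != key_bad:
            return index, good, bad
    return None
```

With `ignore_ticks=True` the comparison skips pure timing differences and reports the first point where the two runs actually execute something different.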
Re: [gem5-users] Indeterministic gem5 behavior
I was wondering if anyone is running into the same problem or if anyone has any suggestions on how to proceed with debugging this problem.
Re: [gem5-users] Indeterministic gem5 behavior
That's interesting. Are you using Full System as well? I don't think FS behavior is supposed to be so dependent on the host environment!

On Fri, Sep 6, 2019 at 11:16 AM Gambord, Ryan wrote:
> I have found that gem5 behavior is sensitive to the execution environment.
> I now run gem5 inside an ubuntu vm on qemu and have had much more
> consistent results. I haven't tried running kvm gem5 inside a kvm qemu vm,
> so not sure how that works, but might be worth trying.
>
> Ryan
Re: [gem5-users] Indeterministic gem5 behavior
I have set up a repo with gem5 that demonstrates the problem. The repo includes the latest version of gem5 from gem5's github repo with a few patches applied to get KVM working together with the kernel binary and disk image I am using. You can get the repo at https://github.com/ShehabElsayed/gem5_debug.git.

These steps should reproduce the problem:
1- scons build/X86/gem5.opt
2- ./scripts/get_fs_stuff.sh
3- ./scripts/run_fs.sh 8

I have also included sample m5term outputs for both a 2 thread run (m5out_2t) and an 8 thread run (m5out_8t).

Any help is really appreciated.
Re: [gem5-users] Indeterministic gem5 behavior
Sorry for the spam. I just forgot to mention that the system configuration I am using is mainly from https://github.com/darchr/gem5/tree/jason/kvm-testing/configs/myconfigs.

Shehab Y. Elsayed, MSc.
PhD Student
The Edwards S. Rogers Sr. Dept. of Electrical and Computer Engineering
University of Toronto
E-mail: shehaby...@gmail.com
[gem5-users] Indeterministic gem5 behavior
Hello All,

I have a gem5 X86 full system setup that starts with KVM cores and then switches to O3 cores once the benchmark reaches the region of interest. Right now I am testing with a simple multithreaded hello world benchmark. Sometimes the benchmark completes successfully, while at other times gem5 just seems to hang after starting the benchmark. I believe it is still executing some instructions but without making any progress. The chance of this behavior (indeterminism) happening increases as the number of simulated cores or the number of threads created by the benchmark increases.

Any ideas what might be the reason for this or how I can start debugging this problem?

Note: I have tried the patch in https://gem5-review.googlesource.com/c/public/gem5/+/19568 but the problem persists.

Thanks!
Re: [gem5-users] Indeterministic gem5 behavior
When I enable the Exec debug flag I can see that it seems to be stuck in a spin lock (queued_spin_lock_slowpath)
[gem5-users] Panic: Invalid microop with X86 and MinorCPU
Hello All,

I recently started testing the MinorCPU model and X86_MOESI_hammer in full system. However, the simulation terminates with this error:

panic: Invalid microop

The exact same test runs fine with TimingSimpleCPU. Is there anything special that has to be done in order to get MinorCPU to work properly?

Thanks in advance,
Shehab
[gem5-users] Adding pseudo instructions in ARM
Hello All,

I was wondering if there is a document or a tutorial somewhere describing how to add pseudo instructions in the ARM architecture. I went through the process before, but for X86. However, I see there are some differences, especially in adding the extra instructions to the ISA.

Thanks in advance.

Best Regards,
Shehab
[gem5-users] O3 LSQ operation
Hello All,

First of all, I should point out that this is related to a previous question I posted in https://www.mail-archive.com/gem5-users@gem5.org/msg16701.html, but I believe the discussion has diverged from the initial problem, so I thought it might be a good idea to start another thread.

I am trying to understand how the LSQ is implemented in gem5, since I am seeing some behavior that I can't explain. I expect two similar load instructions to be handled similarly by the LSQ, but I am seeing different behavior in my test.

Here are the traces (--debug-flags=LSQUnit,LSQ,Commit,ROB,IEW,IQ,Decode,RubySlicc) for 2 loads (856511 and 856518).

Load 856511:
---
15573591436383: system.o3Cpu.decode: [tid:0] Processing instruction [sn:856511] with PC (0x8105bc89=>0x8105bc8d).(0=>1)
15573591436716: global: [sn:856511] has 1 ready out of 3 sources. RTI 0)
15573591436716: global: [sn:856511] has 2 ready out of 3 sources. RTI 0)
15573591437382: system.o3Cpu.iew: [tid:0] Issue: Adding PC (0x8105bc89=>0x8105bc8d).(0=>1) [sn:856511] [tid:0] to IQ.
15573591437382: system.o3Cpu.iew.lsq.thread0: Inserting load PC (0x8105bc89=>0x8105bc8d).(0=>1), idx:13 [sn:856511]
15573591437382: system.o3Cpu.iq: Adding instruction [sn:856511] PC (0x8105bc89=>0x8105bc8d).(0=>1) to the IQ.
15573591441045: system.o3Cpu.iq: Waking up a dependent instruction, [sn:856511] PC (0x8105bc89=>0x8105bc8d).(0=>1).
15573591441045: global: [sn:856511] has 3 ready out of 3 sources. RTI 0)
15573591441045: system.o3Cpu.iq: Instruction is ready to issue, putting it onto the ready list, PC (0x8105bc89=>0x8105bc8d).(0=>1) opclass:47 [sn:856511].
15573591441045: system.o3Cpu.iq: Thread 0: Issuing instruction PC (0x8105bc89=>0x8105bc8d).(0=>1) [sn:856511]
15573591441378: system.o3Cpu.iew: Execute: Processing PC (0x8105bc89=>0x8105bc8d).(0=>1), [tid:0] [sn:856511].
15573591441378: system.o3Cpu.iew.lsq.thread0: Executing load PC (0x8105bc89=>0x8105bc8d).(0=>1), [sn:856511]
15573591441378: system.o3Cpu.iew.lsq.thread0: Doing memory access for inst [sn:856511] PC (0x8105bc89=>0x8105bc8d).(0=>1)
15573591441711: system.o3Cpu.iew.lsq.thread0: -- inst [sn:856511] to pktAddr :0xb9ea4300
15573591441711: system.o3Cpu.iew.lsq.thread0: Conflicting load at addr 0xb9ea4300 [sn:856511]
15573591442377: system.o3Cpu.commit: [tid:0] Can't commit, Instruction [sn:856511] PC (0x8105bc89=>0x8105bc8d).(0=>1) is head of ROB and not ready
15573591442710: system.o3Cpu.commit: [tid:0] Can't commit, Instruction [sn:856511] PC (0x8105bc89=>0x8105bc8d).(0=>1) is head of ROB and not ready
15573591443043: system.o3Cpu.commit: [tid:0] Can't commit, Instruction [sn:856511] PC (0x8105bc89=>0x8105bc8d).(0=>1) is head of ROB and not ready
15573591443376: system.o3Cpu.commit: [tid:0] Can't commit, Instruction [sn:856511] PC (0x8105bc89=>0x8105bc8d).(0=>1) is head of ROB and not ready
15573591443709: system.o3Cpu.commit: [tid:0] Can't commit, Instruction [sn:856511] PC (0x8105bc89=>0x8105bc8d).(0=>1) is head of ROB and not ready
15573591444042: system.o3Cpu.commit: [tid:0] Can't commit, Instruction [sn:856511] PC (0x8105bc89=>0x8105bc8d).(0=>1) is head of ROB and not ready
15573591444375: system.o3Cpu.commit: [tid:0] Can't commit, Instruction [sn:856511] PC (0x8105bc89=>0x8105bc8d).(0=>1) is head of ROB and not ready
15573591444708: system.o3Cpu.commit: [tid:0] Can't commit, Instruction [sn:856511] PC (0x8105bc89=>0x8105bc8d).(0=>1) is head of ROB and not ready
15573591445041: system.o3Cpu.commit: [tid:0] Can't commit, Instruction [sn:856511] PC (0x8105bc89=>0x8105bc8d).(0=>1) is head of ROB and not ready
15573591445374: system.o3Cpu.commit: [tid:0] Can't commit, Instruction [sn:856511] PC (0x8105bc89=>0x8105bc8d).(0=>1) is head of ROB and not ready

Load 856518:
---
15573591436716: system.o3Cpu.decode: [tid:0] Processing instruction [sn:856518] with PC (0x8105bc95=>0x8105bc99).(0=>1)
15573591437049: global: [sn:856518] has 1 ready out of 3 sources. RTI 0)
15573591437049: global: [sn:856518] has 2 ready out of 3 sources. RTI 0)
15573591437715: system.o3Cpu.iew: [tid:0] Issue: Adding PC (0x8105bc95=>0x8105bc99).(0=>1) [sn:856518] [tid:0] to IQ.
15573591437715: system.o3Cpu.iew.lsq.thread0: Inserting load PC (0x8105bc95=>0x8105bc99).(0=>1), idx:14 [sn:856518]
15573591437715: system.o3Cpu.iq: Adding instruction [sn:856518] PC (0x8105bc95=>0x8105bc99).(0=>1) to the IQ.
15573591441045: system.o3Cpu.iq: Waking up a dependent instruction, [sn:856518] PC (0x8105bc95=>0x8105bc99).(0=>1).
15573591441045: global: [sn:856518] has 3 ready out of 3 sources.
[gem5-users] Configuring cache bandwidth in Ruby/classical memory system
Hello All,

I was wondering if there is a way to specify the bandwidths for the different caches in the simulated system. For example, let's say I want to simulate a system that resembles the Ivy Bridge architecture as closely as possible, which according to this link (https://en.wikichip.org/wiki/intel/microarchitectures/ivy_bridge_(client)) has the following:

1- 16B/cycle L1I-cache bandwidth
2- 32B/cycle L1D-cache load bandwidth
3- 16B/cycle L1D-cache store bandwidth
4- 32B/cycle L2-L1 bandwidth

Is there an easy way to configure those values in either the Ruby or classical memory systems?

Thanks in advance,
Shehab
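[Editor's sketch] The per-cycle figures in the post above can be turned into two numbers that are often easier to reason about: how many cycles one cache line occupies a port, and the resulting peak bandwidth. The following is only worked arithmetic, not a gem5 configuration and not a gem5 API. The helper names are hypothetical, and the 64-byte line size and 3 GHz clock used in the note below are assumptions that do not come from the post.

```cpp
#include <cstdint>

// Hypothetical helper (not part of gem5): number of cycles needed to move one
// cache line across a port that transfers `bytes_per_cycle` bytes per cycle.
constexpr std::uint64_t
cyclesPerLine(std::uint64_t line_bytes, std::uint64_t bytes_per_cycle)
{
    // Round up: a partially filled final transfer still costs a whole cycle.
    return (line_bytes + bytes_per_cycle - 1) / bytes_per_cycle;
}

// Hypothetical helper (not part of gem5): peak bandwidth in bytes per second
// for a port running at `clock_hz` cycles per second.
constexpr std::uint64_t
peakBytesPerSecond(std::uint64_t bytes_per_cycle, std::uint64_t clock_hz)
{
    return bytes_per_cycle * clock_hz;
}
```

Under the assumed 64-byte line, cyclesPerLine(64, 16) gives 4 cycles for the 16B/cycle ports and cyclesPerLine(64, 32) gives 2 cycles for the 32B/cycle ports. At an assumed 3 GHz clock, peakBytesPerSecond(32, 3000000000ULL) is 96,000,000,000 bytes per second.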
Re: [gem5-users] Problem with DerivO3CPU and Ruby in FS
Hi Ciro, Thank you very much for your reply. I tried the patch you sent on a quick test and it seems to be working. I will update this thread if I run into any issues. Thanks again for your help. Best Regards, Shehab On Thu, Oct 24, 2019 at 9:28 AM Ciro Santilli wrote: > Also, could you try your content with the patch: > https://gem5-review.googlesource.com/c/public/gem5/+/21819 to see if it > fixes it? I've been told it might be the solution. > -- > *From:* Ciro Santilli > *Sent:* Thursday, October 24, 2019 1:05 PM > *To:* gem5 users mailing list ; shehaby...@gmail.com > > *Subject:* Re: [gem5-users] Problem with DerivO3CPU and Ruby in FS > > Shehab, thanks for reporting this. > > Can you share your test program, and full gem5 CLI so we can try to > reproduce? > > It is a full system simulation, and then you run the application from > userland after boot? > > Do you boot with Atomic and restore the checkpoint, or did full boot in O3? > ---------- > *From:* gem5-users on behalf of Shehab > Elsayed > *Sent:* Tuesday, October 22, 2019 7:39 PM > *To:* gem5 users mailing list > *Subject:* Re: [gem5-users] Problem with DerivO3CPU and Ruby in FS > > I think I might have found the commit that causes this problem. This is it > https://github.com/gem5/gem5/commit/46da8fb805407cdc224abe788e8c666f3b0dadd1 > > Specifically, the modification that is causing the problems is the one in > src/cpu/o3/lsq_impl.hh. In pushRequest(...) there is is this check: > /* This is the place were instructions get the effAddr. */ > > if (req->isTranslationComplete()) { > > // PROBLEMATIC LINE: > // The following line was modified in the commit mentioned above. 
> if (req->isMemAccessRequired()) { // New line > > //if(inst->getFault() == NoFault) { // Old line > > > inst->effAddr = req->getVaddr(); > > inst->effSize = size; > > inst->effAddrValid(true); > > > > if (cpu->checker) { > > inst->reqToVerify = > std::make_shared(*req->request()); > > } > > Fault fault; > > if (isLoad) > > fault = cpu->read(req, inst->lqIdx); > > else > > fault = cpu->write(req, data, inst->sqIdx); > > // inst->getFault() may have the first-fault of a > > // multi-access split request at this point. > > // Overwrite that only if we got another type of fault > > // (e.g. re-exec). > > if (fault != NoFault) > > inst->getFault() = fault; > > } else if (isLoad) { > > inst->setMemAccPredicate(false); > > // Commit will have to clean up whatever happened. Set this > > // instruction as executed. > > inst->setExecuted(); > > } > > } > > When built with the new line, I end up having the assertion failure. > However, with the old line I don't run into this problem. And this problem > doesn't seem to be fixed in later commits. > > Right now, just using the older version of this specific line seems to fix > the issue I was having, but I don't think this is a proper fix as I believe > it doesn't achieve what the commit was trying to achieve. Any ideas on how > to properly fix this? > > Thanks! > > On Thu, Oct 17, 2019 at 10:34 AM Shehab Elsayed > wrote: > > Hello All, > > Could someone please confirm whether DerivO3CPU with Ruby are working > properly? > > I have been having the same problem with both X86 and ARM when running a > multithreaded application (a simple hello world program with 2 threads) on > DerivO3CPU and Ruby. After some time I get an assertion failure: > > Assertion `!load_inst->isExecuted()' failed > > The problem doesn't happen for ARM when using the classical memory model > or even when using Ruby with a different core model (tested with MinorCPU > on ARM). > > Thank you very much in advance. 
> > Best Regards, > Shehab > > IMPORTANT NOTICE: The contents of this email and any attachments are > confidential and may also be privileged. If you are not the intended > recipient, please notify the sender immediately and do not disclose the > contents to any other person, use it for any purpose, or store or copy the > information in any medium. Thank you. > ___ gem5-users mailing list gem5-users@gem5.org http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
[gem5-users] Problem with DerivO3CPU and Ruby in FS
Hello All,

Could someone please confirm whether DerivO3CPU with Ruby is working properly?

I have been having the same problem with both X86 and ARM when running a multithreaded application (a simple hello world program with 2 threads) on DerivO3CPU and Ruby. After some time I get an assertion failure:

Assertion `!load_inst->isExecuted()' failed

The problem doesn't happen for ARM when using the classical memory model or even when using Ruby with a different core model (tested with MinorCPU on ARM).

Thank you very much in advance.

Best Regards,
Shehab
Re: [gem5-users] Delayed printing to telnet localhost 3456
Just an update: It seems that the problem is not there or at least not as severe when I used the aarch64-ubuntu-trusty-headless.img disk image. My previous experiments were using the linaro-minimal-aarch64.img disk image. I am not sure what makes the difference between both images. On Fri, Oct 4, 2019 at 11:55 AM Shehab Elsayed wrote: > Hi Jason, > > Thanks for the quick reply. I will give it a try with the m5 writefile. I > will also keep this thread updated with any findings I might have. > > Best Regards, > Shehab > > On Fri, Oct 4, 2019 at 11:46 AM Jason Lowe-Power > wrote: > >> Hi Shehab, >> >> This is a great question. I've noticed this as well, and I'm not sure why >> it occurs. I'll put it on the to do list to look into (or you can look into >> it and see if you can figure it out :)). >> >> One option to get around this that we've been playing with is using m5 >> writefile to output important information instead of the terminal. >> >> Cheers, >> Jason >> >> On Fri, Oct 4, 2019 at 8:40 AM Shehab Elsayed >> wrote: >> >>> Hello All, >>> >>> I have an rcS script that prints a statement to the terminal and then >>> switches cpus and does some more stuff. I can see the effect of switching >>> cpus in the gem5 (host) terminal a very long time before the printed >>> statement starts appearing on the simulated (guest) terminal. I even have >>> to add sleep statement before m5 exit just to be able to see the printed >>> statement, otherwise, it gem5 will exit without event the statement being >>> printed. >>> >>> Is such a big delay expected and normal? >>> >>> Thanks in advance, >>> Shehab >>> ___ >>> gem5-users mailing list >>> gem5-users@gem5.org >>> http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users >> >> ___ >> gem5-users mailing list >> gem5-users@gem5.org >> http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users > > ___ gem5-users mailing list gem5-users@gem5.org http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
Re: [gem5-users] Problem with DerivO3CPU and Ruby in FS
I think I might have found the commit that causes this problem. This is it:
https://github.com/gem5/gem5/commit/46da8fb805407cdc224abe788e8c666f3b0dadd1

Specifically, the modification that is causing the problems is the one in src/cpu/o3/lsq_impl.hh. In pushRequest(...) there is this check:

    /* This is the place were instructions get the effAddr. */
    if (req->isTranslationComplete()) {
        // PROBLEMATIC LINE:
        // The following line was modified in the commit mentioned above.
        if (req->isMemAccessRequired()) { // New line
        //if(inst->getFault() == NoFault) { // Old line
            inst->effAddr = req->getVaddr();
            inst->effSize = size;
            inst->effAddrValid(true);

            if (cpu->checker) {
                inst->reqToVerify = std::make_shared<Request>(*req->request());
            }
            Fault fault;
            if (isLoad)
                fault = cpu->read(req, inst->lqIdx);
            else
                fault = cpu->write(req, data, inst->sqIdx);
            // inst->getFault() may have the first-fault of a
            // multi-access split request at this point.
            // Overwrite that only if we got another type of fault
            // (e.g. re-exec).
            if (fault != NoFault)
                inst->getFault() = fault;
        } else if (isLoad) {
            inst->setMemAccPredicate(false);
            // Commit will have to clean up whatever happened. Set this
            // instruction as executed.
            inst->setExecuted();
        }
    }

When built with the new line, I end up with the assertion failure. With the old line, I don't run into this problem. The problem doesn't seem to be fixed in later commits.

Right now, just using the older version of this specific line seems to fix the issue I was having, but I don't think this is a proper fix, as I believe it doesn't achieve what the commit was trying to achieve. Any ideas on how to properly fix this?

Thanks!

On Thu, Oct 17, 2019 at 10:34 AM Shehab Elsayed wrote:
> Hello All,
>
> Could someone please confirm whether DerivO3CPU with Ruby are working
> properly?
>
> I have been having the same problem with both X86 and ARM when running a
> multithreaded application (a simple hello world program with 2 threads) on
> DerivO3CPU and Ruby. After some time I get an assertion failure:
>
> Assertion `!load_inst->isExecuted()' failed
>
> The problem doesn't happen for ARM when using the classical memory model
> or even when using Ruby with a different core model (tested with MinorCPU
> on ARM).
>
> Thank you very much in advance.
>
> Best Regards,
> Shehab
Re: [gem5-users] Delayed printing to telnet localhost 3456
Hi Jason, Thanks for the quick reply. I will give it a try with the m5 writefile. I will also keep this thread updated with any findings I might have. Best Regards, Shehab On Fri, Oct 4, 2019 at 11:46 AM Jason Lowe-Power wrote: > Hi Shehab, > > This is a great question. I've noticed this as well, and I'm not sure why > it occurs. I'll put it on the to do list to look into (or you can look into > it and see if you can figure it out :)). > > One option to get around this that we've been playing with is using m5 > writefile to output important information instead of the terminal. > > Cheers, > Jason > > On Fri, Oct 4, 2019 at 8:40 AM Shehab Elsayed > wrote: > >> Hello All, >> >> I have an rcS script that prints a statement to the terminal and then >> switches cpus and does some more stuff. I can see the effect of switching >> cpus in the gem5 (host) terminal a very long time before the printed >> statement starts appearing on the simulated (guest) terminal. I even have >> to add sleep statement before m5 exit just to be able to see the printed >> statement, otherwise, it gem5 will exit without event the statement being >> printed. >> >> Is such a big delay expected and normal? >> >> Thanks in advance, >> Shehab >> ___ >> gem5-users mailing list >> gem5-users@gem5.org >> http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users > > ___ > gem5-users mailing list > gem5-users@gem5.org > http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users ___ gem5-users mailing list gem5-users@gem5.org http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
[gem5-users] Delayed printing to telnet localhost 3456
Hello All,

I have an rcS script that prints a statement to the terminal, then switches CPUs and does some more work. I can see the effect of switching CPUs in the gem5 (host) terminal a very long time before the printed statement starts appearing on the simulated (guest) terminal. I even have to add a sleep statement before m5 exit just to be able to see the printed statement; otherwise, gem5 exits without the statement ever being printed.

Is such a big delay expected and normal?

Thanks in advance,
Shehab
Re: [gem5-users] Adding pseudo instructions in ARM
Hi Ciro, Thank you for your reply. I actually did that and managed to get it to work but forgot to update thread. I believe what confused me is the difference in the implementation of instruction decoding between AARCH64 and X86. Thanks again, Shehab On Fri, Oct 4, 2019 at 12:14 PM Ciro Santilli wrote: > Shebab, I don't think such tutorial exits. > > Could you try to just copy what is done for one of the other instructions? > It should not be hard if you try this way. > > Then if you have a more precise questions, do follow up. > > -- > *From:* gem5-users on behalf of Shehab > Elsayed > *Sent:* Tuesday, October 1, 2019 3:30 PM > *To:* gem5 users mailing list > *Subject:* [gem5-users] Adding pseudo instructions in ARM > > Hello All, > > I was wondering if there a document or a tutorial somewhere describing how > to add pseudo instructions in the ARM architecture. I went through the > process before but for X86. However, I see there are some differences > especially in adding the extra instructions to the ISA. > > Thanks in advance. > > Best Regards, > Shehab > IMPORTANT NOTICE: The contents of this email and any attachments are > confidential and may also be privileged. If you are not the intended > recipient, please notify the sender immediately and do not disclose the > contents to any other person, use it for any purpose, or store or copy the > information in any medium. Thank you. > ___ gem5-users mailing list gem5-users@gem5.org http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
[gem5-users] Changing configuration between checkpoint and restore
Hello All,

I was wondering which configuration parameters are safe to change between taking a checkpoint and restoring from the same checkpoint. For example: cache configuration, core configuration, number of LLC banks, number of cores.

Also, is there a way to tell whether a configuration can be safely changed between checkpointing and restoring?

Thank you very much in advance.

Best Regards,
Shehab
Re: [gem5-users] Fwd: Changing configuration between checkpoint and restore
I am afraid I don't know what you mean! On Mon, Jan 6, 2020 at 11:11 AM CS18M010 RICHA CHAUDHRY < cs18m...@iittp.ac.in> wrote: > how to install gem 5 initial stage > > -- Forwarded message ----- > From: Shehab Elsayed > Date: Mon, Jan 6, 2020 at 9:39 PM > Subject: [gem5-users] Changing configuration between checkpoint and restore > To: gem5 users mailing list > > > Hello All, > > I was wondering which configuration parameters are safe to change between > taking a checkpoint and restoring from the same checkpoint. For example, > cache configuration, core configuration, number of LLC banks, number of > cores, > > Also, Is there a way to tell whether a configuration can be safely changed > between checkpointing and restoring? > > Thank you very much in advance. > > Best Regards, > Shehab > ___ > gem5-users mailing list > gem5-users@gem5.org > http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users > ___ > gem5-users mailing list > gem5-users@gem5.org > http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users ___ gem5-users mailing list gem5-users@gem5.org http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
Re: [gem5-users] Changing configuration between checkpoint and restore
Thanks, Ciro! This makes sense. So I guess one example for things that shouldn't change between checkpointing and restoring is the number of cores while the cache sizes should be OK to change. On Mon, Jan 6, 2020 at 11:50 AM Ciro Santilli wrote: > On Mon, Jan 6, 2020 at 4:09 PM Shehab Elsayed > wrote: > > > > Hello All, > > > > I was wondering which configuration parameters are safe to change > between taking a checkpoint and restoring from the same checkpoint. For > example, cache configuration, core configuration, number of LLC banks, > number of cores, > > > > Also, Is there a way to tell whether a configuration can be safely > changed between checkpointing and restoring? > > > > I'm not 100% sure, but I believe that in general things which are not > architecturally visible can be switched safely. > > If it is visible, you have to be careful that the software might > expect one state previously read, but now the hardware suddenly > changed to a new one. > > Also note that cache sizes are not currently exposed to the guest: > > https://stackoverflow.com/questions/49008792/why-doesnt-the-linux-kernel-see-the-cache-sizes-in-the-gem5-emulator-in-full-sy > > > Thank you very much in advance. > > > > Best Regards, > > Shehab > > ___ > > gem5-users mailing list > > gem5-users@gem5.org > > http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users > ___ gem5-users mailing list gem5-users@gem5.org http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
[gem5-users] Simulation terminates successfully on one machine and runs into assertion failure on another
Hello All,

I am trying to run the same experiment (ARM full system with Ruby MESI_Three_level that boots from a checkpoint) on two different machines. It terminates successfully on one but runs into an assertion failure on the other.

This is the terminating message:

Addr Request::getPaddr() const: Assertion `privateFlags.isSet(VALID_PADDR)' failed

I tried running with the Exec debug flag and comparing both traces. This is the difference I found:

< 47134939: system.cpu_cluster.bef_roi_cpus0 T0 : @ext4_iget+1244: ldrsw x2, [x21, #8]: MemRead : D=0x5e41dd0c A=0xffc07b1ce008
< 471349390500: system.cpu_cluster.bef_roi_cpus0 T0 : @ext4_iget+1248: str x2, [x19, #80] : MemWrite : D=0x5e41dd0c A=0xffc07c205d00
---
> 47134939: system.cpu_cluster.bef_roi_cpus0 T0 : @ext4_iget+1244: ldrsw x2, [x21, #8]: MemRead : D=0x5e4ebbdf A=0xffc07b1ce008
> 471349390500: system.cpu_cluster.bef_roi_cpus0 T0 : @ext4_iget+1248: str x2, [x19, #80] : MemWrite : D=0x5e4ebbdf A=0xffc07c205d00
17299344d17299343
< 474920248000: system.cpu_cluster.bef_roi_cpus0 T0 : 0x7fbf4f9510: su

Any ideas how I should proceed with debugging this problem?

Thank you very much in advance.

Best Regards,
Shehab
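[Editor's sketch] The trace comparison described in the post above amounts to finding the first line at which the two Exec traces disagree, since everything after that point may be a consequence of it. A minimal sketch of that step follows. The function name and the sentinel constant are hypothetical, and the sketch assumes both traces have already been read into memory as one string per line.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sentinel: returned when both traces are identical.
constexpr std::size_t kNoDivergence = static_cast<std::size_t>(-1);

// Hypothetical helper (not part of gem5): index of the first line at which
// two traces differ. If one trace is a strict prefix of the other, the
// length of the shorter trace is returned.
std::size_t
firstDivergence(const std::vector<std::string> &a,
                const std::vector<std::string> &b)
{
    const std::size_t common = a.size() < b.size() ? a.size() : b.size();
    for (std::size_t i = 0; i < common; ++i) {
        if (a[i] != b[i])
            return i;  // earliest differing line
    }
    return a.size() == b.size() ? kNoDivergence : common;
}
```

The returned index only locates the earliest difference. Deciding why the two machines disagree at that line still requires looking at the instruction and the data it read.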
Re: [gem5-users] Simulation terminates successfully on one machine and runs into assertion failure on another
Hi Ciro, Thank you very much for your reply. I tried the patch in the link you sent and it seems to be working so far. Best Regards, Shehab On Tue, Mar 10, 2020 at 7:17 AM Ciro Santilli wrote: > Please give us stack trace (GDB it if none), full gem5 CLI, gem5 git > version, as much detail as possible about content, and ensure you have > https://gem5-review.googlesource.com/c/public/gem5/+/22283/4 ideally on a > bug report at: > https://gem5-review.googlesource.com/c/public/gem5/+/22283/4 > -- > *From:* gem5-users on behalf of Shehab > Elsayed > *Sent:* Monday, March 9, 2020 8:47 PM > *To:* gem5 users mailing list > *Subject:* [gem5-users] Simulation terminates successfully on one machine > and runs into assertion failure on another > > Hello All, > > I am trying to run the same experiment (ARM full system with RUBY > MESI_Three_level that boots from checkpoint) on two different machines. It > terminates successfully on one but runs into assertion failure on the other. > > This is the terminating message: > Addr Request::getPaddr() const: Assertion > `privateFlags.isSet(VALID_PADDR)' failed > > I tried running with the Exec debug flag and comparing both traces. This > is the difference I found: > < 47134939: system.cpu_cluster.bef_roi_cpus0 T0 : @ext4_iget+1244: > ldrsw x2, [x21, #8]: MemRead : D=0x5e41dd0c > A=0xffc07b1ce008 > < 471349390500: system.cpu_cluster.bef_roi_cpus0 T0 : @ext4_iget+1248: > str x2, [x19, #80] : MemWrite : D=0x5e41dd0c > A=0xffc07c205d00 > --- > > > 47134939: system.cpu_cluster.bef_roi_cpus0 T0 : @ext4_iget+1244: > ldrsw x2, [x21, #8]: MemRead : D=0x5e4ebbdf > A=0xffc07b1ce008 > > 471349390500: system.cpu_cluster.bef_roi_cpus0 T0 : @ext4_iget+1248: > str x2, [x19, #80] : MemWrite : D=0x5e4ebbdf > A=0xffc07c205d00 > 17299344d17299343 > < 474920248000: system.cpu_cluster.bef_roi_cpus0 T0 : 0x7fbf4f9510: > su > > Any ideas how I should proceed with debugging this problem? > > Thank you very much in advance. 
> > Best Regards, > Shehab > ___ > gem5-users mailing list > gem5-users@gem5.org > http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users ___ gem5-users mailing list gem5-users@gem5.org http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
Re: [gem5-users] Ruby functional read fails and potential fix
Hi Ciro, Thank you for your reply. Looks like this patch does address the problem I mentioned along with some other ones. Thanks for sharing. Best Regards, Shehab On Thu, Apr 9, 2020 at 1:45 PM Ciro Santilli wrote: > Thanks for this Shehab, > > Could you compare your changes to this patchset: > https://gem5-review.googlesource.com/c/public/gem5/+/22022/1 > > On Thu, Apr 9, 2020 at 6:22 PM Shehab Elsayed > wrote: > > > > Hello All, > > > > I was running some experiments and I ran into a problem with ruby where > a functional read was failing. After some investigation I found that the > reason was that the functional read was trying to read a line that was in a > MaybeStale state (no ReadOnly or ReadWrite versions). > > > > I implemented a fix which so far seems to be working fine but I am not > sure if there is a deeper problem that needs fixing or if my fix could > present future problems. > > > > I am running Full System simulations with ARM ISA and MESI_Three_Level. > > > > Here is my fix (I have marked new lines with //--NEW--//): > > Basically what this fix does is perform the functional read from the > controller that has the line in the MaybeStale state if no ReadOnly or > ReadWrite versions in any controller. > > > > bool > > RubySystem::functionalRead(PacketPtr pkt) > > { > > Addr address(pkt->getAddr()); > > Addr line_address = makeLineAddress(address); > > > > AccessPermission access_perm = AccessPermission_NotPresent; > > int num_controllers = m_abs_cntrl_vec.size(); > > > > DPRINTF(RubySystem, "Functional Read request for %#x\n", address); > > > > unsigned int num_ro = 0; > > unsigned int num_rw = 0; > > unsigned int num_busy = 0; > > unsigned int num_backing_store = 0; > > unsigned int num_invalid = 0; > > unsigned int num_maybe_stale = 0;//--NEW--// > > > > // In this loop we count the number of controllers that have the > given > > // address in read only, read write and busy states. 
> > for (unsigned int i = 0; i < num_controllers; ++i) { > > > > // Ignore ATD controllers for functional reads > > if (m_abs_cntrl_vec[i]->getType() == MachineType_ATD) { > > continue; > > } > > > > access_perm = m_abs_cntrl_vec[i]-> > getAccessPermission(line_address); > > if (access_perm == AccessPermission_Read_Only) > > num_ro++; > > else if (access_perm == AccessPermission_Read_Write) > > num_rw++; > > else if (access_perm == AccessPermission_Busy) > > num_busy++; > > else if (access_perm == AccessPermission_Backing_Store) > > // See RubySlicc_Exports.sm for details, but Backing_Store > is meant > > // to represent blocks in memory *for Broadcast/Snooping > protocols*, > > // where memory has no idea whether it has an exclusive copy > of data > > // or not. > > num_backing_store++; > > else if (access_perm == AccessPermission_Invalid || > > access_perm == AccessPermission_NotPresent) > > num_invalid++; > > else if (access_perm == AccessPermission_Maybe_Stale) > //--NEW--// > > num_maybe_stale++; > //--NEW--// > > } > > > > // This if case is meant to capture what happens in a Broadcast/Snoop > > // protocol where the block does not exist in the cache hierarchy. > You > > // only want to read from the Backing_Store memory if there is no > copy in > > // the cache hierarchy, otherwise you want to try to read the RO or > RW > > // copies existing in the cache hierarchy (covered by the else > statement). > > // The reason is because the Backing_Store memory could easily be > stale, if > > // there are copies floating around the cache hierarchy, so you want > to read > > // it only if it's not in the cache hierarchy at all. 
> > if (num_invalid == (num_controllers - 1) && num_backing_store == 1) { > > DPRINTF(RubySystem, "only copy in Backing_Store memory, read > from it\n"); > > for (unsigned int i = 0; i < num_controllers; ++i) { > > access_perm = > m_abs_cntrl_vec[i]->getAccessPermission(line_address); > > if (access_perm == AccessPermission_Backing_Store) { > > m_abs_cntrl_vec[i]->functionalRead(line_addr
[gem5-users] Ruby functional read fails and potential fix
Hello All,

I was running some experiments and ran into a problem with Ruby where a functional read was failing. After some investigation I found that the functional read was trying to read a line that was in the MaybeStale state, with no ReadOnly or ReadWrite version in any controller. I implemented a fix which so far seems to be working fine, but I am not sure whether there is a deeper problem that needs fixing or whether my fix could cause problems in the future. I am running Full System simulations with ARM ISA and MESI_Three_Level.

Here is my fix (I have marked new lines with //--NEW--//). Basically, it performs the functional read from the controller that has the line in the MaybeStale state if no controller has a ReadOnly or ReadWrite version.

bool
RubySystem::functionalRead(PacketPtr pkt)
{
    Addr address(pkt->getAddr());
    Addr line_address = makeLineAddress(address);

    AccessPermission access_perm = AccessPermission_NotPresent;
    int num_controllers = m_abs_cntrl_vec.size();

    DPRINTF(RubySystem, "Functional Read request for %#x\n", address);

    unsigned int num_ro = 0;
    unsigned int num_rw = 0;
    unsigned int num_busy = 0;
    unsigned int num_backing_store = 0;
    unsigned int num_invalid = 0;
    unsigned int num_maybe_stale = 0; //--NEW--//

    // In this loop we count the number of controllers that have the given
    // address in read only, read write and busy states.
    for (unsigned int i = 0; i < num_controllers; ++i) {

        // Ignore ATD controllers for functional reads
        if (m_abs_cntrl_vec[i]->getType() == MachineType_ATD) {
            continue;
        }

        access_perm = m_abs_cntrl_vec[i]->
            getAccessPermission(line_address);
        if (access_perm == AccessPermission_Read_Only)
            num_ro++;
        else if (access_perm == AccessPermission_Read_Write)
            num_rw++;
        else if (access_perm == AccessPermission_Busy)
            num_busy++;
        else if (access_perm == AccessPermission_Backing_Store)
            // See RubySlicc_Exports.sm for details, but Backing_Store is meant
            // to represent blocks in memory *for Broadcast/Snooping protocols*,
            // where memory has no idea whether it has an exclusive copy of data
            // or not.
            num_backing_store++;
        else if (access_perm == AccessPermission_Invalid ||
                 access_perm == AccessPermission_NotPresent)
            num_invalid++;
        else if (access_perm == AccessPermission_Maybe_Stale) //--NEW--//
            num_maybe_stale++; //--NEW--//
    }

    // This if case is meant to capture what happens in a Broadcast/Snoop
    // protocol where the block does not exist in the cache hierarchy. You
    // only want to read from the Backing_Store memory if there is no copy in
    // the cache hierarchy, otherwise you want to try to read the RO or RW
    // copies existing in the cache hierarchy (covered by the else statement).
    // The reason is because the Backing_Store memory could easily be stale, if
    // there are copies floating around the cache hierarchy, so you want to read
    // it only if it's not in the cache hierarchy at all.
    if (num_invalid == (num_controllers - 1) && num_backing_store == 1) {
        DPRINTF(RubySystem, "only copy in Backing_Store memory, read from it\n");
        for (unsigned int i = 0; i < num_controllers; ++i) {
            access_perm = m_abs_cntrl_vec[i]->getAccessPermission(line_address);
            if (access_perm == AccessPermission_Backing_Store) {
                m_abs_cntrl_vec[i]->functionalRead(line_address, pkt);
                return true;
            }
        }
    } else if (num_ro > 0 || num_rw >= 1 || num_maybe_stale > 0) { //--NEW--//
        if (num_rw > 1) {
            // We iterate over the vector of abstract controllers, and return
            // the first copy found. If we have more than one cache with block
            // in writable permission, the first one found would be returned.
            warn("More than one Abstract Controller with RW permission for "
                 "addr: %#x on cacheline: %#x.", address, line_address);
        }
        // In Broadcast/Snoop protocols, this covers if you know the block
        // exists somewhere in the caching hierarchy, then you want to read any
        // valid RO or RW block. In directory protocols, same thing, you want
        // to read any valid readable copy of the block.
        DPRINTF(RubySystem, "num_busy = %d, num_ro = %d, num_rw = %d\n",
                num_busy, num_ro, num_rw);
        // In this loop, we try to figure which controller has a read only or
        // a read write copy of the given address. Any valid copy would suffice
        // for a functional read.
        // Sometimes the functional read is to a line that has recently
        // transitioned to MaybeStale state and no other controller has it in
        // a
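The selection order described in this post can be sketched in isolation as follows. This is only a model, not gem5 code: the `Perm` enum, `firstWith` and `pickFunctionalReadSource` are hypothetical names, the ATD filtering is left out, and the MaybeStale fallback follows the prose description of the fix because the quoted patch is cut off before that part.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the AccessPermission values named in the post.
enum class Perm { ReadOnly, ReadWrite, Busy, BackingStore, Invalid, NotPresent, MaybeStale };

// Index of the first controller whose permission is `a` or `b`, or -1 if none.
static int firstWith(const std::vector<Perm>& perms, Perm a, Perm b)
{
    for (std::size_t i = 0; i < perms.size(); ++i) {
        if (perms[i] == a || perms[i] == b)
            return static_cast<int>(i);
    }
    return -1;
}

// Returns the index of the controller a functional read would use, or -1.
int pickFunctionalReadSource(const std::vector<Perm>& perms)
{
    std::size_t ro = 0, rw = 0, backing = 0, invalid = 0, stale = 0;
    for (Perm p : perms) {
        switch (p) {
          case Perm::ReadOnly:     ++ro; break;
          case Perm::ReadWrite:    ++rw; break;
          case Perm::BackingStore: ++backing; break;
          case Perm::Invalid:
          case Perm::NotPresent:   ++invalid; break;
          case Perm::MaybeStale:   ++stale; break;
          case Perm::Busy:         break;  // counted in the original, not needed here
        }
    }

    // Only the backing store holds the block: read from it.
    if (backing == 1 && invalid + 1 == perms.size())
        return firstWith(perms, Perm::BackingStore, Perm::BackingStore);

    // Any readable cached copy is preferred.
    if (ro > 0 || rw > 0)
        return firstWith(perms, Perm::ReadOnly, Perm::ReadWrite);

    // Proposed fallback: no RO/RW copy anywhere, so use a MaybeStale one.
    if (stale > 0)
        return firstWith(perms, Perm::MaybeStale, Perm::MaybeStale);

    return -1;
}
```

With this ordering a MaybeStale copy is used only when no controller reports a ReadOnly or ReadWrite copy, which matches the intent stated above. Whether the MaybeStale data is actually up to date at that point depends on the protocol and is not something this sketch can answer.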
[gem5-users] Question about Ruby cache latencies
Hello All,

In MOESI_CMP_directory, the Ruby cache latencies (tagAccessLatency and dataAccessLatency) are included in the SLICC cache controllers through the cacheResponseLatency() function. However, the function is only used for messages that include a data response, while all other messages use the controller's latency as defined in the corresponding SLICC file.

My question is the following: doesn't the cache still need to at least access the tag array for other actions that might not include sending data? For example, when receiving a request to invalidate a certain block, the cache would need to access the tag array first to invalidate the block before sending the Ack. If that is the case, shouldn't sendAck get its enqueue latency using cacheResponseLatency() as well? Right now it is hardcoded to response_latency regardless of tagAccessLatency.

Is my understanding correct? Or am I missing something?

___
gem5-users mailing list -- gem5-users@gem5.org
To unsubscribe send an email to gem5-users-le...@gem5.org
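To make the question concrete, here is a small sketch of the two behaviours being compared. It is not the SLICC code from MOESI_CMP_directory: `Cycles`, `CacheLatencies`, `fixedAckLatency` and `tagAwareAckLatency` are hypothetical names, and taking the maximum of the two latencies is just one assumed way a tag lookup could be folded into the Ack's enqueue latency.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

using Cycles = std::uint64_t;  // stand-in for gem5's Cycles type

// Hypothetical bundle of the two cache latencies named in the post.
struct CacheLatencies {
    Cycles tagAccessLatency;
    Cycles dataAccessLatency;
};

// Behaviour described in the post: the Ack is enqueued after the fixed
// controller latency, whatever the tag array costs.
Cycles fixedAckLatency(Cycles responseLatency)
{
    return responseLatency;
}

// Assumed alternative: the Ack cannot leave before the tag lookup that
// invalidates the block has completed.
Cycles tagAwareAckLatency(Cycles responseLatency, const CacheLatencies& cache)
{
    return std::max(responseLatency, cache.tagAccessLatency);
}
```

For example, with response_latency = 1 and tagAccessLatency = 4 the first function yields 1 cycle and the second 4 cycles, which is the gap the question is about.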
[gem5-users] Re: Question about Ruby cache latencies
I see. Thanks for the explanation!

On Thu, May 14, 2020 at 3:36 PM Tiago Muck wrote:

> Right now it's possible to redefine the mandatoryQueueLatency function to
> return the cache latency, but this only works for L1 hit latency. It's
> currently not possible to have a fully generic model since each protocol
> can have different assumptions regarding how a cache lookup/update latency
> would affect each transaction.
>
> Best,
> Tiago
> ----------
> *From:* Shehab Elsayed
> *Sent:* Thursday, May 14, 2020 11:50 AM
> *To:* gem5 users mailing list
> *Cc:* Tiago Muck
> *Subject:* Re: [gem5-users] Question about Ruby cache latencies
>
> Thank you very much for your reply and explanation, Tiago!
>
> Wouldn't it be more generic to add the latencies at the time of performing
> the access in the cache itself instead of having it in the controllers,
> since any cache access should incur access latency? I am not sure how easy
> that would be though, given the way Ruby works right now. I don't know the
> exact details of Ruby operation, but I took a quick look and noticed that
> getEntry(...) can be called multiple times for the same request which, I
> guess, makes my suggestion more difficult to add.
>
> On Tue, May 12, 2020 at 12:11 PM Tiago Muck via gem5-users <
> gem5-users@gem5.org> wrote:
>
> Hi Shehab,
>
> Your understanding is correct, there are some cases that are not being
> handled. This https://gem5-review.googlesource.com/c/public/gem5/+/18414
> patched MOESI_CMP_directory to some extent (there was no cache latency
> being considered before) but was not a complete solution. Other than the
> case you mentioned, MOESI_CMP_directory is also currently missing the
> transaction annotations so it can generate stalls on cache/directory bank
> access conflicts.
>
> Best,
> Tiago
[gem5-users] Re: Question about Ruby cache latencies
Thank you very much for your reply and explanation, Tiago!

Wouldn't it be more generic to add the latencies at the time of performing the access in the cache itself instead of having it in the controllers, since any cache access should incur access latency? I am not sure how easy that would be though, given the way Ruby works right now. I don't know the exact details of Ruby operation, but I took a quick look and noticed that getEntry(...) can be called multiple times for the same request which, I guess, makes my suggestion more difficult to add.

On Tue, May 12, 2020 at 12:11 PM Tiago Muck via gem5-users <gem5-users@gem5.org> wrote:

> Hi Shehab,
>
> Your understanding is correct, there are some cases that are not being
> handled. This https://gem5-review.googlesource.com/c/public/gem5/+/18414
> patched MOESI_CMP_directory to some extent (there was no cache latency
> being considered before) but was not a complete solution. Other than the
> case you mentioned, MOESI_CMP_directory is also currently missing the
> transaction annotations so it can generate stalls on cache/directory bank
> access conflicts.
>
> Best,
> Tiago
[gem5-users] Re: GEM5/Ruby and MESI_Three_Level protocol
Which files do you think are missing? There are some shared files between MESI_Three_Level and MESI_Two_Level, such as the L2 controller. You can find a list of all files used by the MESI_Three_Level protocol in src/mem/ruby/protocol/MESI_Three_Level.slicc.

I hope this helps.

On Thu, May 28, 2020 at 11:37 AM Javed Osmany via gem5-users <gem5-users@gem5.org> wrote:

> Hello
>
> 1. I am able to successfully generate the executable gem5 simulator
>    for [ARM ISA, MESI_Three_Level protocol]. The command I used being:
>
>    a. scons -j4 build/ARM_MESI_3_level/gem5.opt --default=ARM
>       PROTOCOL=MESI_Three_Level SLICC_HTML=True
>
> 2. Also, I am able to successfully generate the executable gem5
>    simulator for [X86 ISA, MESI_Three_Level protocol]. The command I used
>    being:
>
>    a. scons -j4 build/X86_MESI_3_level/gem5.opt --default=X86
>       PROTOCOL=MESI_Three_Level SLICC_HTML=True
>
> However, if I look in src/mem/ruby/protocol, the code for MESI_Three_Level
> is as follows:
>
> [j00533938@lhrplinux1 protocol]$ ll MESI_Three_Level*
> -rw-rw-r-- 1 j00533938 j00533938 40031 May 28 09:17 MESI_Three_Level-L0cache.sm
> -rw-rw-r-- 1 j00533938 j00533938 36841 May 28 09:17 MESI_Three_Level-L1cache.sm
> -rw-rw-r-- 1 j00533938 j00533938  4270 May 28 09:17 MESI_Three_Level-msg.sm
> -rw-rw-r-- 1 j00533938 j00533938   316 Mar 20 17:42 MESI_Three_Level.slicc
>
> Therefore, it looks to me that the code for MESI_Three_Level is not
> complete. Thus it is not clear to me how the executable gem5 simulator for
> MESI_Three_Level is being generated.
>
> Any thoughts on this please?
>
> Thanks in advance.
>
> JO
[gem5-users] Re: 2 level TLB in ARM Full System with Ruby
Hi Ciro,

Thanks for your reply! I don't remember seeing this patch before. I will check it out.

The reason I specified Ruby is that one solution I found posted used a cache as a second-level TLB and modified the port connections accordingly. However, that was a cache from the classic memory system and therefore wouldn't work with Ruby.

Thanks again!

Best Regards,
Shehab

On Wed, Jul 8, 2020 at 8:30 AM Ciro Santilli wrote:

> Shehab, sorry for the delay, I had to check a few things about this.
>
> First, are you aware that there is a not-yet-merged patch that implements
> a two-level TLB at:
> https://github.com/giactra/gem5/commit/3022ecc8a06a9182b2cf1936941901a785c1b21d
> ?
>
> It hasn't been merged because we noticed that it broke Linux boot, I think.
> But we would like to merge it in the following months.
>
> I'm not sure why Ruby vs classic would matter since the TLB sits behind
> caches anyway? I believe that model will work for either classic or Ruby.
> ----------
> *From:* Shehab Elsayed via gem5-users
> *Sent:* Tuesday, June 23, 2020 12:20 AM
> *To:* gem5 users mailing list
> *Cc:* Shehab Elsayed
> *Subject:* [gem5-users] 2 level TLB in ARM Full System with Ruby
>
> Hello All,
>
> I was wondering if there is a way to simulate a system with 2 levels of
> TLBs in full system simulation with Ruby for ARM?
>
> I have seen other examples that use the classical memory model and use a
> cache as the second-level TLB. Is there something similar that can be
> done in the Ruby memory system? Can I use a standalone RubyCache as the
> second-level TLB?
>
> Thank you very much in advance.
>
> Best Regards,
> Shehab
[gem5-users] 2 level TLB in ARM Full System with Ruby
Hello All,

I was wondering if there is a way to simulate a system with 2 levels of TLBs in full system simulation with Ruby for ARM?

I have seen other examples that use the classical memory model and use a cache as the second-level TLB. Is there something similar that can be done in the Ruby memory system? Can I use a standalone RubyCache as the second-level TLB?

Thank you very much in advance.

Best Regards,
Shehab
[gem5-users] Modelling a deeper pipeline (O3-ARM)
Hello All,

I am trying to model a deeper O3 pipeline as suggested in https://gem5-users.gem5.narkive.com/LNMJQ1M5/model-deeper-pipeline-in-x86 but I keep running into assertion failures related to the time buffers and skid buffers, even though the patch mentioned in that link is already included in my gem5 version.

Is there any relation between the different pipeline delay values, widths, forward and backward communication sizes, and any other parameters of the O3 cores that has to be maintained to avoid running into problems?

For reference, these are the assertion failures I am facing, depending on the values I choose:

/cpu/timebuf.hh:54: void TimeBuffer::valid(int) const [with T = DefaultRenameDefaultIEW]: Assertion `idx >= -past && idx <= future' failed.

cpu/o3/decode_impl.hh:425: void DefaultDecode::skidInsert(ThreadID) [with Impl = O3CPUImpl; ThreadID = short int]: Assertion `skidBuffer[tid].size() <= skidBufferMax' failed.

Best Regards,
Shehab
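The first assertion quoted above is a bounds check on the index used to read a time buffer. The sketch below reproduces only that condition so it can be tried against a set of candidate values. `TimeBufferBounds` and `delayFits` are hypothetical names, and the idea that a stage reads its input at index `-delay` is an assumption about how the pipeline delays map onto the buffer, not something stated in this thread.

```cpp
#include <cassert>

// Minimal stand-in for the bounds that the quoted TimeBuffer assertion
// enforces: an index is valid if it lies in [-past, future].
struct TimeBufferBounds {
    int past;    // how many cycles back the buffer can be read
    int future;  // how many cycles ahead the buffer can be written

    bool valid(int idx) const { return idx >= -past && idx <= future; }
};

// Assumption: a stage with a stage-to-stage delay of `delay` cycles reads
// the buffer at index -delay, so the delay must not exceed `past`.
bool delayFits(const TimeBufferBounds& buf, int delay)
{
    return buf.valid(-delay);
}
```

Under that assumption, a buffer created with past = 5 accepts a delay of 5 but not 6, so increasing a delay without also increasing the corresponding buffer depth would be one way to trip the first assertion.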
[gem5-users] Setting up cluster for gem5
Hello All,

My group is in the process of upgrading our cluster, and since many of us are using gem5 I was wondering if anyone has experience or recommendations they would like to share about the process for a smooth gem5 operation. Mainly I am concerned about 2 issues:

1) Required hard disk and memory on the nodes for a smooth gem5 operation.
2) OS and job management systems or any software-related recommendations.

Thank you very much in advance.

Best Regards,
Shehab
[gem5-users] Re: Setting up cluster for gem5
Thank you so much for your reply, Daniel! It is really helpful.

On Fri, Nov 6, 2020 at 10:21 AM Daniel Gerzhoy wrote:

> Hey Shehab,
>
> I've been working with gem5 on my group's research cluster for a while
> now.
>
> 1) Gem5 isn't very memory hungry in my experience. Sometimes long
> simulations (I'm talking 3 weeks+) will start bloating to GBs of RAM, but
> it's usually not paging so it doesn't slow things down (depends on the
> program you are running).
> I exclusively use Syscall Emulation mode, so that may not apply in Full
> System.
>
> *It is however single-threaded. So if your entire group is running many
> experiments at the same time make sure you have a ton of cores.*
>
> 2) As for job management, I created my own system for
> configuring/running/parsing etc. that I've built with Python.
>
> gem5 as of recently has been shipped with Dockerfiles. I use the gcn3
> Dockerfile for instance. I'd recommend using them.
> Again I use a custom solution here, but I'm pretty sure container job
> management is a solved problem. I think one of them is Kubernetes (see
> https://kubernetes.io/).
> I don't have experience with anything like that, but I'm sure that would
> be useful.
>
> Also, if you plan on editing gem5 and your source code is going to be
> located on the cluster, I'd recommend using code-server (
> https://github.com/cdr/code-server).
> It broadcasts an instance of vscode to a web page that you can access from
> anywhere. I used to use gvim and bash scripts and it was hell. Code server
> was a life-saver.
>
> If you (or anyone else) already have a solution for editing code on the
> cluster I would be interested in what it is.
>
> Good luck!
>
> Dan Gerzhoy
> PhD Candidate, Computer Engineering
> University of Maryland College Park
>
> On Fri, Nov 6, 2020 at 8:38 AM Shehab Elsayed via gem5-users <
> gem5-users@gem5.org> wrote:
>
>> Hello All,
>>
>> My group is in the process of upgrading our cluster and since many of us
>> are using gem5 I was wondering if anyone has experience or recommendation
>> they would like to share about the process for a smooth gem5 operation.
>> Mainly I am concerned about 2 issues:
>>
>> 1) Required hard disk and memory on the nodes for a smooth gem5
>> operation.
>> 2) OS and job management systems or any software related recommendations.
>>
>> Thank you very much in advance.
>>
>> Best Regards,
>> Shehab