Re: [m5-dev] [PATCH 2 of 5] Fix Branch Prediction for MIPS. Namely, i dont see the reason for updating the branch history on anything that happens in decode. this should happen only in commit, right?
OK So I finally figured out what's wrong here and admittedly I'm a bit apprehensive of the universal (all ISA) fix will be but we'll see. So let me explain the situation from the top: (1) In the Decode of O3, an unconditional pc-relative branch is resolved and the prediciton history table is updated to remove the resolved branch (since we've already updated the predictors with the result). However, in the Execute stage the branch is seen as mispredicted AGAIN, causing the code to want to update and delete something from the prediction history. This triggers an assert and breaks exeuction. (2) So a few questions emerge: (2a) Why does this work with the previous hack disabling the early resolve of branches in Decode? Because the mispredict only happens once and in Execute the fetch is redirected to the results of the execution so we only experience performance degradation instead of non-functional peformance. (2b) How can we mispredict a branch in Execute after we've previously resolved that branch in Decode? It just seems that the mispredict() function isnt handling the MIPS delay slots correctly. When a branch happens in MIPS, it sets the NPC = PC + 4 and NNPC = target. However the code shows this: return readPredPC() != readNextPC() || readPredNPC() != readNextNPC() || readPredMicroPC() != readNextMicroPC(); Consider for MIPS, the predPC is the target and the predNPC is target+4, then you're saying: target != pc+4 and target+4 != target which will be wrong in most cases. (3) How could this work for SPARC? I'm not sure. My guess is that instructions in SPARC are disregarding the delay slot in execution and setting the NPC to the target instead of setting the NPC to PC+4. SIDE ISSUE: Gabe, is SPARC not using the Unconditional Control flag? I ran SPARC and put this assert (assert(!inst-UnCondCtrl()) in the decode-branch-resolve optimization and it ran fine (for hello-world) So if my guess on (3) is right then I dont right away see a quick fix other than using a compiler directive. The MIPS code needs this in mispredict(): #if THE_ISA == MIPS_ISA // Special case since a not-taken, cond. delay slot, effectively // nullifies the delay slot instruction if (isCondDelaySlot() !predictTaken) { return predPC != nextPC; } else { return predPC != nextNPC; } #else return readPredPC() != readNextPC() || readPredNPC() != readNextNPC() || readPredMicroPC() != readNextMicroPC(); #endif So that's the main issue that's happening and forcing MIPS to break rather quickly. My solution for a fix? I guess it all depends on the answer to (3), but it seems like there is going to have to be a uniform way for how instructions specify their NPC and NNPC when an instruction executes. I need to take a break from coding MIPS to handle a round of InOrder issues (research!) but hopefully this discussion can lead us into a uniform solution for when I get back to the code. On Thu, Apr 16, 2009 at 12:52 AM, Steve Reinhardt ste...@gmail.com wrote: I see now, I missed that little change buried at the bottom. I recall that condition about unconditional PC-relative branches finding out in decode that their target was mispredicted too. I'm wondering if you're not just hacking around the real problem here... I don't see how this particular case is MIPS-specific, but I could believe that there may be some MIPS-specific details in how it's handled (like maybe the nextPC is not getting updated correctly because of branch delay slots or something). So your fix may be avoiding this correctness problem by preventing the model from handling this particular class of misprediction properly, forcing it to use an unrealistic but not broken path instead, while the right answer is to fix the real problem. Steve On Wed, Apr 15, 2009 at 9:16 PM, Korey Sewell ksew...@umich.edu wrote: There is a bunch of DPRINTS in there that I kept in figuring out what's going on. The actual fix is small in fetch_impl.hh: // Update the branch predictor. +#if THE_ISA == MIPS_ISA + if (fromDecode-decodeInfo[tid]. branchMispredict 0) { +#else if (fromDecode-decodeInfo[tid].branchMispredict) { +#endif A little lower in the fall-out of that selection statement you'll see 2 different squash functions that get called in the branch predictor: - One squashes all history behind the mispredicted instruction - The other squashes all history and then updates some branch history state (which is kind of confusing) I edited out the 2nd one from happening and things work again. The reason why you get a mispredict in decode is I think the fetching scenario behind an unconditional PC-relative branch. You wont know exactly which direction to go until after decode. On Wed, Apr 15, 2009 at 11:44 PM, Steve Reinhardt ste...@gmail.com wrote: I'm confused... this patch looks like it pretty much
Re: [m5-dev] Big push coming
I can see where this patch solves the immediate problem, but it's not clear it's not just a band-aid. The real question is what is the code supposed to be doing? The larger context in FullO3CPUImpl::init() is this: for (int tid=0; tid number_of_threads; tid++) { #if FULL_SYSTEM ThreadContext *src_tc = threadContexts[tid]; #else ThreadContext *src_tc = thread[tid]-getTC(); #endif // Threads start in the Suspended State if (src_tc-status() != ThreadContext::Suspended) { continue; } #if FULL_SYSTEM TheISA::initCPU(src_tc, src_tc-contextId()); #endif } So the idea is that we only call initCPU() if the thread is in the Suspended state (I believe there's a more straightforward way to code that!), which the comment implies is because we only want to call initCPU() when the thread is just starting up. There's no indication of why we would ever call FullO3CPUImpl::init() when the thread isn't just starting up though (i.e., did this check ever fail in practice?). Does it have something to do with creating an O3CPU but not starting it up right away because we're starting in SimpleCPU instead and are going to do a switchover later? This condition isn't present in any of the SimpleCPU models. As far as wakeup, I think there should be at least a warning and maybe an assertion if you wakeup a Halted cpu. My thought is that a Halted CPU doesn't have a PC to run from, so it doesn't make sense just to wake it up. If there is a PC for it to run from it should be in Suspended. If it's assigned a PC but you don't want it to start running right away, it should transition from Halted to Suspended at that time. This model may not work cleanly with starting up a CPU via an interrupt, and I'm willing to rethink it if that's the case, but let's do it deliberately and not ad hoc. Sorry for not catching this before I committed... I thought I had run all of the long regressions, but somehow I missed this one (or missed the fact that it failed). Steve On Thu, Apr 16, 2009 at 10:37 PM, Gabe Black gbl...@eecs.umich.edu wrote: The patch below seems to fix it. I think there may be a similar issue with the in order model. There's a bug in the wakeup function of I believe all the CPU models where it only wakes up from Suspended and just returns if you're Halted. I have a patch in my queue that fixes it for the simple CPU, but a more general fix would be appropriate, I think. I'll leave committing these to you guys since you know your code best. Gabe diff -r be16ad28822f src/cpu/o3/cpu.cc --- a/src/cpu/o3/cpu.cc Wed Apr 15 13:18:24 2009 -0700 +++ b/src/cpu/o3/cpu.cc Thu Apr 16 22:32:49 2009 -0700 @@ -575,7 +575,7 @@ ThreadContext *src_tc = thread[tid]-getTC(); #endif // Threads start in the Suspended State -if (src_tc-status() != ThreadContext::Suspended) { +if (src_tc-status() != ThreadContext::Halted) { continue; } ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] View Traceflags From Command Line...
Yea, that's helpful. I guess I am looking for a more fine-grained control of things in terms of show me all flags that have to do with me running O3CPU. You would think O3CPUAll would be enough, but there are some base flags that are relevant (e.g. TLB or Cache) that I would want to include. I could understand if most people thought the functionality present is enough though... On Fri, Apr 17, 2009 at 9:32 AM, Lisa Hsu h...@eecs.umich.edu wrote: m5.opt --trace-help that will list all available trace flags, including composite ones. m5.opt -h will give a big usage blurb too, so that you can see if you want is available. Lisa On Fri, Apr 17, 2009 at 2:56 AM, Korey Sewell ksew...@umich.edu wrote: I'm wondering if anybody would find this useful or not. Basically, when debugging, I often find myself going through SConscript files trying to remember or find out what trace-flag is available for a particular object. For instance, just now, I wanted to view the Exec trace but didnt want the symbols. And a little earlier, I wanted to figure out if I could just print out the branch prediction info but none of the other information. So it's slightly annoying for me to have to do that all the time. One solution might for me to just remember all the trace flags for things I care about :) But it might be nice to novice and expert users if you could view trace-flags for a particular object or set of objects from the command line. I imagine I want something like: ./m5.debug --view-flags=O3CPU So does anybody else think that would be a useful thing? -- -- Korey L Sewell Graduate Student - PhD Candidate Computer Science Engineering University of Michigan ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev -- -- Korey L Sewell Graduate Student - PhD Candidate Computer Science Engineering University of Michigan ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] View Traceflags From Command Line...
You could just cobble everything you want together into a composite flag and call it O3CPUAll.then not only would it be easy to see but easy to turn on. On Fri, Apr 17, 2009 at 11:47 AM, Korey Sewell ksew...@umich.edu wrote: Yea, that's helpful. I guess I am looking for a more fine-grained control of things in terms of show me all flags that have to do with me running O3CPU. You would think O3CPUAll would be enough, but there are some base flags that are relevant (e.g. TLB or Cache) that I would want to include. I could understand if most people thought the functionality present is enough though... On Fri, Apr 17, 2009 at 9:32 AM, Lisa Hsu h...@eecs.umich.edu wrote: m5.opt --trace-help that will list all available trace flags, including composite ones. m5.opt -h will give a big usage blurb too, so that you can see if you want is available. Lisa On Fri, Apr 17, 2009 at 2:56 AM, Korey Sewell ksew...@umich.edu wrote: I'm wondering if anybody would find this useful or not. Basically, when debugging, I often find myself going through SConscript files trying to remember or find out what trace-flag is available for a particular object. For instance, just now, I wanted to view the Exec trace but didnt want the symbols. And a little earlier, I wanted to figure out if I could just print out the branch prediction info but none of the other information. So it's slightly annoying for me to have to do that all the time. One solution might for me to just remember all the trace flags for things I care about :) But it might be nice to novice and expert users if you could view trace-flags for a particular object or set of objects from the command line. I imagine I want something like: ./m5.debug --view-flags=O3CPU So does anybody else think that would be a useful thing? -- -- Korey L Sewell Graduate Student - PhD Candidate Computer Science Engineering University of Michigan ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev -- -- Korey L Sewell Graduate Student - PhD Candidate Computer Science Engineering University of Michigan ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] View Traceflags From Command Line...
Yea, at some point it becomes subjective (and dependent on the specific problem you're tracking down) to say what's related. If O3CPUAll isn't sufficient, what is there in All that is too much noise to deal with? Basically the only thing in All that I find truly annoying is the Event stuff, so if I'm not tracing something specific like Cache or Bus I usually find that All,-Event is pretty good. Steve On Fri, Apr 17, 2009 at 8:59 AM, Korey Sewell ksew...@umich.edu wrote: Sure I cooouuuld, but something tells me adding Exec, TLB, Cache, etc. etc. would not be met with open-arms since technically I'm merging different object flags together BUT all these different objects are related to each other. I guess I want a per-configuration relevant trace flag listing so nothing slips under the cracks when I'm trying to debug something. Maybe that's specific to how I want to use M5 so I could see why people think the *All Flags are sufficient. On Fri, Apr 17, 2009 at 11:50 AM, Lisa Hsu h...@eecs.umich.edu wrote: You could just cobble everything you want together into a composite flag and call it O3CPUAll.then not only would it be easy to see but easy to turn on. On Fri, Apr 17, 2009 at 11:47 AM, Korey Sewell ksew...@umich.edu wrote: Yea, that's helpful. I guess I am looking for a more fine-grained control of things in terms of show me all flags that have to do with me running O3CPU. You would think O3CPUAll would be enough, but there are some base flags that are relevant (e.g. TLB or Cache) that I would want to include. I could understand if most people thought the functionality present is enough though... On Fri, Apr 17, 2009 at 9:32 AM, Lisa Hsu h...@eecs.umich.edu wrote: m5.opt --trace-help that will list all available trace flags, including composite ones. m5.opt -h will give a big usage blurb too, so that you can see if you want is available. Lisa On Fri, Apr 17, 2009 at 2:56 AM, Korey Sewell ksew...@umich.edu wrote: I'm wondering if anybody would find this useful or not. Basically, when debugging, I often find myself going through SConscript files trying to remember or find out what trace-flag is available for a particular object. For instance, just now, I wanted to view the Exec trace but didnt want the symbols. And a little earlier, I wanted to figure out if I could just print out the branch prediction info but none of the other information. So it's slightly annoying for me to have to do that all the time. One solution might for me to just remember all the trace flags for things I care about :) But it might be nice to novice and expert users if you could view trace-flags for a particular object or set of objects from the command line. I imagine I want something like: ./m5.debug --view-flags=O3CPU So does anybody else think that would be a useful thing? -- -- Korey L Sewell Graduate Student - PhD Candidate Computer Science Engineering University of Michigan ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev -- -- Korey L Sewell Graduate Student - PhD Candidate Computer Science Engineering University of Michigan ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev -- -- Korey L Sewell Graduate Student - PhD Candidate Computer Science Engineering University of Michigan ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] View Traceflags From Command Line...
I'm not sure what you mean. You just make up a traceflag calle O3CPUAll, or even KoreysO3Flags , you could make whatever you want, it won't affect anyone. See src/cpu/SConscript and look at the CompoundFlags - that's how Exec works, it's a collection of lots of other flags. On Fri, Apr 17, 2009 at 11:59 AM, Korey Sewell ksew...@umich.edu wrote: Sure I cooouuuld, but something tells me adding Exec, TLB, Cache, etc. etc. would not be met with open-arms since technically I'm merging different object flags together BUT all these different objects are related to each other. I guess I want a per-configuration relevant trace flag listing so nothing slips under the cracks when I'm trying to debug something. Maybe that's specific to how I want to use M5 so I could see why people think the *All Flags are sufficient. On Fri, Apr 17, 2009 at 11:50 AM, Lisa Hsu h...@eecs.umich.edu wrote: You could just cobble everything you want together into a composite flag and call it O3CPUAll.then not only would it be easy to see but easy to turn on. On Fri, Apr 17, 2009 at 11:47 AM, Korey Sewell ksew...@umich.edu wrote: Yea, that's helpful. I guess I am looking for a more fine-grained control of things in terms of show me all flags that have to do with me running O3CPU. You would think O3CPUAll would be enough, but there are some base flags that are relevant (e.g. TLB or Cache) that I would want to include. I could understand if most people thought the functionality present is enough though... On Fri, Apr 17, 2009 at 9:32 AM, Lisa Hsu h...@eecs.umich.edu wrote: m5.opt --trace-help that will list all available trace flags, including composite ones. m5.opt -h will give a big usage blurb too, so that you can see if you want is available. Lisa On Fri, Apr 17, 2009 at 2:56 AM, Korey Sewell ksew...@umich.edu wrote: I'm wondering if anybody would find this useful or not. Basically, when debugging, I often find myself going through SConscript files trying to remember or find out what trace-flag is available for a particular object. For instance, just now, I wanted to view the Exec trace but didnt want the symbols. And a little earlier, I wanted to figure out if I could just print out the branch prediction info but none of the other information. So it's slightly annoying for me to have to do that all the time. One solution might for me to just remember all the trace flags for things I care about :) But it might be nice to novice and expert users if you could view trace-flags for a particular object or set of objects from the command line. I imagine I want something like: ./m5.debug --view-flags=O3CPU So does anybody else think that would be a useful thing? -- -- Korey L Sewell Graduate Student - PhD Candidate Computer Science Engineering University of Michigan ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev -- -- Korey L Sewell Graduate Student - PhD Candidate Computer Science Engineering University of Michigan ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev -- -- Korey L Sewell Graduate Student - PhD Candidate Computer Science Engineering University of Michigan ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] [PATCH 2 of 5] Fix Branch Prediction for MIPS. Namely, i dont see the reason for updating the branch history on anything that happens in decode. this should happen only in commit, right?
Thanks for digging into this in more detail, Korey. It sounds to me like maybe the problem is that MIPS is setting predPC to the branch target, when (to account for the delay slot) it ought to be leaving predPC as PC+4 and setting predNPC to be the branch target. That way if the prediction turns out to be correct, then you set PC - predPC and NPC - predNPC and you get the right behavior for the following instruction. Is there a reason it doesn't work this way? Steve On Fri, Apr 17, 2009 at 2:24 AM, Korey Sewell ksew...@umich.edu wrote: OK So I finally figured out what's wrong here and admittedly I'm a bit apprehensive of the universal (all ISA) fix will be but we'll see. So let me explain the situation from the top: (1) In the Decode of O3, an unconditional pc-relative branch is resolved and the prediciton history table is updated to remove the resolved branch (since we've already updated the predictors with the result). However, in the Execute stage the branch is seen as mispredicted AGAIN, causing the code to want to update and delete something from the prediction history. This triggers an assert and breaks exeuction. (2) So a few questions emerge: (2a) Why does this work with the previous hack disabling the early resolve of branches in Decode? Because the mispredict only happens once and in Execute the fetch is redirected to the results of the execution so we only experience performance degradation instead of non-functional peformance. (2b) How can we mispredict a branch in Execute after we've previously resolved that branch in Decode? It just seems that the mispredict() function isnt handling the MIPS delay slots correctly. When a branch happens in MIPS, it sets the NPC = PC + 4 and NNPC = target. However the code shows this: return readPredPC() != readNextPC() || readPredNPC() != readNextNPC() || readPredMicroPC() != readNextMicroPC(); Consider for MIPS, the predPC is the target and the predNPC is target+4, then you're saying: target != pc+4 and target+4 != target which will be wrong in most cases. (3) How could this work for SPARC? I'm not sure. My guess is that instructions in SPARC are disregarding the delay slot in execution and setting the NPC to the target instead of setting the NPC to PC+4. SIDE ISSUE: Gabe, is SPARC not using the Unconditional Control flag? I ran SPARC and put this assert (assert(!inst-UnCondCtrl()) in the decode-branch-resolve optimization and it ran fine (for hello-world) So if my guess on (3) is right then I dont right away see a quick fix other than using a compiler directive. The MIPS code needs this in mispredict(): #if THE_ISA == MIPS_ISA // Special case since a not-taken, cond. delay slot, effectively // nullifies the delay slot instruction if (isCondDelaySlot() !predictTaken) { return predPC != nextPC; } else { return predPC != nextNPC; } #else return readPredPC() != readNextPC() || readPredNPC() != readNextNPC() || readPredMicroPC() != readNextMicroPC(); #endif So that's the main issue that's happening and forcing MIPS to break rather quickly. My solution for a fix? I guess it all depends on the answer to (3), but it seems like there is going to have to be a uniform way for how instructions specify their NPC and NNPC when an instruction executes. I need to take a break from coding MIPS to handle a round of InOrder issues (research!) but hopefully this discussion can lead us into a uniform solution for when I get back to the code. On Thu, Apr 16, 2009 at 12:52 AM, Steve Reinhardt ste...@gmail.com wrote: I see now, I missed that little change buried at the bottom. I recall that condition about unconditional PC-relative branches finding out in decode that their target was mispredicted too. I'm wondering if you're not just hacking around the real problem here... I don't see how this particular case is MIPS-specific, but I could believe that there may be some MIPS-specific details in how it's handled (like maybe the nextPC is not getting updated correctly because of branch delay slots or something). So your fix may be avoiding this correctness problem by preventing the model from handling this particular class of misprediction properly, forcing it to use an unrealistic but not broken path instead, while the right answer is to fix the real problem. Steve On Wed, Apr 15, 2009 at 9:16 PM, Korey Sewell ksew...@umich.edu wrote: There is a bunch of DPRINTS in there that I kept in figuring out what's going on. The actual fix is small in fetch_impl.hh: // Update the branch predictor. +#if THE_ISA == MIPS_ISA +if (fromDecode-decodeInfo[tid]. branchMispredict 0) { +#else if (fromDecode-decodeInfo[tid].branchMispredict) { +#endif A little lower in the fall-out of that selection statement you'll
Re: [m5-dev] View Traceflags From Command Line...
I'm not sure what you mean. You just make up a traceflag calle O3CPUAll, or even KoreysO3Flags , you could make whatever you want, it won't affect anyone. See src/cpu/SConscript and look at the CompoundFlags - that's how Exec works, it's a collection of lots of other flags. But committing KoreysO3Flags to the tree would be annoying. You can turn on flags from the python, so you could just add an option to your script to turn on flags. You could also use $HOME/.m5/options.py to modify the default set of flags. I could add support for defining your own composite flags in something like $HOME/.m5/flags.py if people think that would be useful. It would actually be rather easy to do. Nate ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] View Traceflags From Command Line...
Yea, that's helpful. I guess I am looking for a more fine-grained control of things in terms of show me all flags that have to do with me running O3CPU. We don't actually keep that kind of information anywhere. All traceflags are currently global. Nate ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] View Traceflags From Command Line...
Oh, definitely not a commit of KoreysO3, that would obviously have to stay in one's patch tree...but a reasonable commit with a reasonable name that is usedul - I think wouldn't be so bad. Lisa On Fri, Apr 17, 2009 at 1:35 PM, nathan binkert n...@binkert.org wrote: I'm not sure what you mean. You just make up a traceflag calle O3CPUAll, or even KoreysO3Flags , you could make whatever you want, it won't affect anyone. See src/cpu/SConscript and look at the CompoundFlags - that's how Exec works, it's a collection of lots of other flags. But committing KoreysO3Flags to the tree would be annoying. You can turn on flags from the python, so you could just add an option to your script to turn on flags. You could also use $HOME/.m5/options.py to modify the default set of flags. I could add support for defining your own composite flags in something like $HOME/.m5/flags.py if people think that would be useful. It would actually be rather easy to do. Nate ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
[m5-dev] encumbered: add a hack to ingore a return value even when gcc wa...
changeset fb895218046e in /z/repo/encumbered details: http://repo.m5sim.org/encumbered?cmd=changeset;node=fb895218046e summary: add a hack to ingore a return value even when gcc was told that we shouldn't diffstat: 3 files changed, 6 insertions(+), 3 deletions(-) eio/eio.cc|4 ++-- eio/exolex.cc |2 +- eio/libexo.h |3 +++ diffs (46 lines): diff -r 3ef40471424f -r fb895218046e eio/eio.cc --- a/eio/eio.ccWed Mar 11 09:15:25 2009 -0700 +++ b/eio/eio.ccFri Apr 17 11:49:34 2009 -0700 @@ -221,7 +221,7 @@ return false; /* read and check EIO file header */ -fgets(buf, 512, fd); +$ignore(fgets(buf, 512, fd)); /* check the header */ if (strcmp(buf, EIO_FILE_HEADER)) @@ -515,7 +515,7 @@ int real_fd = sim_fd(tgt_fd); if (real_fd = 0) { - write(real_fd, blob-as_blob.data, blob-as_blob.size); + $ignore(write(real_fd, blob-as_blob.data, blob-as_blob.size)); } } } diff -r 3ef40471424f -r fb895218046e eio/exolex.cc --- a/eio/exolex.cc Wed Mar 11 09:15:25 2009 -0700 +++ b/eio/exolex.cc Fri Apr 17 11:49:34 2009 -0700 @@ -584,7 +584,7 @@ /* This used to be an fputs(), but since the string might contain NUL's, * we now use fwrite(). */ -#define ECHO (void) fwrite( yytext, yyleng, 1, yyout ) +#define ECHO $ignore(fwrite( yytext, yyleng, 1, yyout )) #endif /* Gets input and stuffs it into buf. number of characters read, or YY_NULL, diff -r 3ef40471424f -r fb895218046e eio/libexo.h --- a/eio/libexo.h Wed Mar 11 09:15:25 2009 -0700 +++ b/eio/libexo.h Fri Apr 17 11:49:34 2009 -0700 @@ -77,6 +77,9 @@ #include eio/alpha_exo.h #include sim/host.hh +// Gross hack to ignore a return value even with a gcc warning +#define $ignore(X) do { if (X) {} } while(0) + #ifdef __cplusplus extern C { #endif ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Big push coming
On Fri, Apr 17, 2009 at 9:42 AM, Gabe Black gbl...@eecs.umich.edu wrote: So the idea is that we only call initCPU() if the thread is in the Suspended state (I believe there's a more straightforward way to code that!), which the comment implies is because we only want to call initCPU() when the thread is just starting up. There's no indication of why we would ever call FullO3CPUImpl::init() when the thread isn't just starting up though (i.e., did this check ever fail in practice?). Does it have something to do with creating an O3CPU but not starting it up right away because we're starting in SimpleCPU instead and are going to do a switchover later? This condition isn't present in any of the SimpleCPU models. I'm not an expert on this code, but I think it's preventing excessively restarting the CPU if you do a switchover or resume from a checkpoint. Yea, that's exactly my assumption too, but I'd feel much better if someone could explain in detail why it's necessary rather than just leaving it in out of the fear of the unknown. For example, it seems wrong that init() would get called more than once on the same object. Also if there are cases where init() is called and the thread is *not* in its initial state, then when where did the thread's state get changed? As far as wakeup, I think there should be at least a warning and maybe an assertion if you wakeup a Halted cpu. My thought is that a Halted CPU doesn't have a PC to run from, so it doesn't make sense just to wake it up. If there is a PC for it to run from it should be in Suspended. If it's assigned a PC but you don't want it to start running right away, it should transition from Halted to Suspended at that time. This model may not work cleanly with starting up a CPU via an interrupt, and I'm willing to rethink it if that's the case, but let's do it deliberately and not ad hoc. So halted would be basically powered off and uninitialized and suspended would be ready and waiting for something to do? Pretty much... Halted is uninitialized from a SW perspective, anyway, while Suspended is not just waiting for something to do, but knowing what it supposed to be doing and just waiting for the signal to go do it (e.g., waiting on a lock or an interrupt). In that case, I'd want to initialize the unused CPUs to suspended and make the halt microop use suspended because the CPUs couldn't be started otherwise. If I'm just using it wrong I'll stop doing that :) Yes, one of the little ironies is that this means that the x86 HLT instruction should put the thread in the Suspended state and not the Halted state. The unused CPUs in x86 FS mode do start out in Suspended, since that's what the x86-specific startupCPU() function does with them (at the same time it activates the boot CPU). This seems exactly right to me: ThreadContexts start out as Halted by default, then can be put in Active or Suspended as appropriate when they are configured to run software, either by the Process code (for SE mode) or by the ISA-specific code (for FS mode), since the details of MP boot are ISA/platform specific. Steve ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Big push coming
I'm not an expert on this code, but I think it's preventing excessively restarting the CPU if you do a switchover or resume from a checkpoint. Yea, that's exactly my assumption too, but I'd feel much better if someone could explain in detail why it's necessary rather than just leaving it in out of the fear of the unknown. For example, it seems wrong that init() would get called more than once on the same object. Also if there are cases where init() is called and the thread is *not* in its initial state, then when where did the thread's state get changed? Can we even figure out who did what? Can you trace hg annotate through the renames and repository conversions and stuff to understand it? The thing is, I think the unfortunate answer is that the code just grew completely organically without anyone really understanding what's going on. Nate ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Big push coming
On Fri, Apr 17, 2009 at 1:37 PM, nathan binkert n...@binkert.org wrote: I'm not an expert on this code, but I think it's preventing excessively restarting the CPU if you do a switchover or resume from a checkpoint. Yea, that's exactly my assumption too, but I'd feel much better if someone could explain in detail why it's necessary rather than just leaving it in out of the fear of the unknown. For example, it seems wrong that init() would get called more than once on the same object. Also if there are cases where init() is called and the thread is *not* in its initial state, then when where did the thread's state get changed? Can we even figure out who did what? Can you trace hg annotate through the renames and repository conversions and stuff to understand it? The thing is, I think the unfortunate answer is that the code just grew completely organically without anyone really understanding what's going on. Nate Here's the changeset; it doesn't really provide any clues that I could find: http://repo.m5sim.org/m5/rev/db5f2da1271a I whacked that test entirely (so that the O3 init() calls initCPU() unconditionally in FS mode) and am running the long regressions and so far they've all passed. However I don't think there's a switchover in the regressions so I'll have to test that separately. Steve ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Big push coming
Quoting Steve Reinhardt ste...@gmail.com: On Fri, Apr 17, 2009 at 1:37 PM, nathan binkert n...@binkert.org wrote: I'm not an expert on this code, but I think it's preventing excessively restarting the CPU if you do a switchover or resume from a checkpoint. Yea, that's exactly my assumption too, but I'd feel much better if someone could explain in detail why it's necessary rather than just leaving it in out of the fear of the unknown. For example, it seems wrong that init() would get called more than once on the same object. Also if there are cases where init() is called and the thread is *not* in its initial state, then when where did the thread's state get changed? Can we even figure out who did what? Can you trace hg annotate through the renames and repository conversions and stuff to understand it? The thing is, I think the unfortunate answer is that the code just grew completely organically without anyone really understanding what's going on. Nate Here's the changeset; it doesn't really provide any clues that I could find: http://repo.m5sim.org/m5/rev/db5f2da1271a I whacked that test entirely (so that the O3 init() calls initCPU() unconditionally in FS mode) and am running the long regressions and so far they've all passed. However I don't think there's a switchover in the regressions so I'll have to test that separately. Steve You should also be sure to test starting from a checkpoint which I believe is different from a switchover, although I'm not sure. I've done things in the past that seem fine and pass regressions but break that mechanism. Speaking of which, since I and potentially others don't know off hand how to use checkpointing (still!), we should probably make a regression to test those. I know we've agreed on that before, but this is another case where it would be good to have so I thought I'd bring it up. Gabe ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Big push coming
On Fri, Apr 17, 2009 at 1:43 PM, Gabriel Michael Black gbl...@eecs.umich.edu wrote: You should also be sure to test starting from a checkpoint which I believe is different from a switchover, although I'm not sure. It shouldn't be different from O3's perspective; we always restore from a checkpoint into a SimpleCPU model and then switch over to O3, since you can't directly restore a SimpleCPU checkpoint into and O3 model. I've done things in the past that seem fine and pass regressions but break that mechanism. Speaking of which, since I and potentially others don't know off hand how to use checkpointing (still!), we should probably make a regression to test those. I know we've agreed on that before, but this is another case where it would be good to have so I thought I'd bring it up. Agreed, absolutely. Steve ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Big push coming
Quoting Steve Reinhardt ste...@gmail.com: On Fri, Apr 17, 2009 at 1:43 PM, Gabriel Michael Black gbl...@eecs.umich.edu wrote: You should also be sure to test starting from a checkpoint which I believe is different from a switchover, although I'm not sure. It shouldn't be different from O3's perspective; we always restore from a checkpoint into a SimpleCPU model and then switch over to O3, since you can't directly restore a SimpleCPU checkpoint into and O3 model. But you could checkpoint O3 and then restore directly, couldn't you? Gabe ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Big push coming
But you could checkpoint O3 and then restore directly, couldn't you? No, we currently don't support that. We also don't support checkpointing the caches. Nate ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Big push coming
No, we don't support that. O3 is too complicated, has too much state to checkpoint, so we only checkpoint and restore AtomicCPU, then move into the desired CPU model post-checkpoint. On Fri, Apr 17, 2009 at 7:26 PM, Gabriel Michael Black gbl...@eecs.umich.edu wrote: Quoting Steve Reinhardt ste...@gmail.com: On Fri, Apr 17, 2009 at 1:43 PM, Gabriel Michael Black gbl...@eecs.umich.edu wrote: You should also be sure to test starting from a checkpoint which I believe is different from a switchover, although I'm not sure. It shouldn't be different from O3's perspective; we always restore from a checkpoint into a SimpleCPU model and then switch over to O3, since you can't directly restore a SimpleCPU checkpoint into and O3 model. But you could checkpoint O3 and then restore directly, couldn't you? Gabe ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
[m5-dev] changeset in m5: o3, inorder: fix FS bug due to initializing Thr...
changeset fc2e234b4404 in /z/repo/m5 details: http://repo.m5sim.org/m5?cmd=changeset;node=fc2e234b4404 description: o3, inorder: fix FS bug due to initializing ThreadState to Halted. For some reason o3 FS init() only called initCPU if the thread state was Suspended, which was no longer the case. There's no apparent reason to check, so I whacked the test completely rather than changing the check to Halted. The inorder init() was also updated to be symmetric, though the previous code was just a fancy no-op. diffstat: 2 files changed, 9 insertions(+), 20 deletions(-) src/cpu/inorder/cpu.cc | 14 ++ src/cpu/o3/cpu.cc | 15 +++ diffs (61 lines): diff -r f1a9f7f6e7c6 -r fc2e234b4404 src/cpu/inorder/cpu.cc --- a/src/cpu/inorder/cpu.ccWed Apr 15 23:12:00 2009 -0700 +++ b/src/cpu/inorder/cpu.ccFri Apr 17 16:54:58 2009 -0700 @@ -29,6 +29,8 @@ * */ +#include config/full_system.hh + #include arch/utility.hh #include cpu/exetrace.hh #include cpu/activity.hh @@ -420,16 +422,12 @@ for (int i = 0; i number_of_threads; ++i) thread[i]-inSyscall = true; +#if FULL_SYSTEM for (int tid=0; tid number_of_threads; tid++) { - -ThreadContext *src_tc = thread[tid]-getTC(); - -// Threads start in the Suspended State -if (src_tc-status() != ThreadContext::Suspended) { -continue; -} - +ThreadContext *src_tc = threadContexts[tid]; +TheISA::initCPU(src_tc, src_tc-contextId()); } +#endif // Clear inSyscall. for (int i = 0; i number_of_threads; ++i) diff -r f1a9f7f6e7c6 -r fc2e234b4404 src/cpu/o3/cpu.cc --- a/src/cpu/o3/cpu.cc Wed Apr 15 23:12:00 2009 -0700 +++ b/src/cpu/o3/cpu.cc Fri Apr 17 16:54:58 2009 -0700 @@ -568,21 +568,12 @@ for (int i = 0; i number_of_threads; ++i) thread[i]-inSyscall = true; +#if FULL_SYSTEM for (int tid=0; tid number_of_threads; tid++) { -#if FULL_SYSTEM ThreadContext *src_tc = threadContexts[tid]; -#else -ThreadContext *src_tc = thread[tid]-getTC(); +TheISA::initCPU(src_tc, src_tc-contextId()); +} #endif -// Threads start in the Suspended State -if (src_tc-status() != ThreadContext::Suspended) { -continue; -} - -#if FULL_SYSTEM -TheISA::initCPU(src_tc, src_tc-contextId()); -#endif -} // Clear inSyscall. for (int i = 0; i number_of_threads; ++i) ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] changeset in m5: o3, inorder: fix FS bug due to initializing Thr...
OK, this code passes the long regressions (for me, anyway), and I also did an ad-hoc test of checkpoint/resume/switchover and fast-forward and those seemed to go OK as well. Steve On Fri, Apr 17, 2009 at 4:59 PM, Steve Reinhardt steve.reinha...@amd.comwrote: changeset fc2e234b4404 in /z/repo/m5 details: http://repo.m5sim.org/m5?cmd=changeset;node=fc2e234b4404 description: o3, inorder: fix FS bug due to initializing ThreadState to Halted. For some reason o3 FS init() only called initCPU if the thread state was Suspended, which was no longer the case. There's no apparent reason to check, so I whacked the test completely rather than changing the check to Halted. The inorder init() was also updated to be symmetric, though the previous code was just a fancy no-op. diffstat: 2 files changed, 9 insertions(+), 20 deletions(-) src/cpu/inorder/cpu.cc | 14 ++ src/cpu/o3/cpu.cc | 15 +++ diffs (61 lines): diff -r f1a9f7f6e7c6 -r fc2e234b4404 src/cpu/inorder/cpu.cc --- a/src/cpu/inorder/cpu.ccWed Apr 15 23:12:00 2009 -0700 +++ b/src/cpu/inorder/cpu.ccFri Apr 17 16:54:58 2009 -0700 @@ -29,6 +29,8 @@ * */ +#include config/full_system.hh + #include arch/utility.hh #include cpu/exetrace.hh #include cpu/activity.hh @@ -420,16 +422,12 @@ for (int i = 0; i number_of_threads; ++i) thread[i]-inSyscall = true; +#if FULL_SYSTEM for (int tid=0; tid number_of_threads; tid++) { - -ThreadContext *src_tc = thread[tid]-getTC(); - -// Threads start in the Suspended State -if (src_tc-status() != ThreadContext::Suspended) { -continue; -} - +ThreadContext *src_tc = threadContexts[tid]; +TheISA::initCPU(src_tc, src_tc-contextId()); } +#endif // Clear inSyscall. for (int i = 0; i number_of_threads; ++i) diff -r f1a9f7f6e7c6 -r fc2e234b4404 src/cpu/o3/cpu.cc --- a/src/cpu/o3/cpu.cc Wed Apr 15 23:12:00 2009 -0700 +++ b/src/cpu/o3/cpu.cc Fri Apr 17 16:54:58 2009 -0700 @@ -568,21 +568,12 @@ for (int i = 0; i number_of_threads; ++i) thread[i]-inSyscall = true; +#if FULL_SYSTEM for (int tid=0; tid number_of_threads; tid++) { -#if FULL_SYSTEM ThreadContext *src_tc = threadContexts[tid]; -#else -ThreadContext *src_tc = thread[tid]-getTC(); +TheISA::initCPU(src_tc, src_tc-contextId()); +} #endif -// Threads start in the Suspended State -if (src_tc-status() != ThreadContext::Suspended) { -continue; -} - -#if FULL_SYSTEM -TheISA::initCPU(src_tc, src_tc-contextId()); -#endif -} // Clear inSyscall. for (int i = 0; i number_of_threads; ++i) ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Big push coming
Just to make sure I'm interpreting this correctly, I should be able to suspend a halted CPU to go directly to suspended, correct? The simple CPU asserts in that case. The attached patch fixes it, but I want to make sure I'm not getting the status stuff turned around again. Gabe Lisa Hsu wrote: No, we don't support that. O3 is too complicated, has too much state to checkpoint, so we only checkpoint and restore AtomicCPU, then move into the desired CPU model post-checkpoint. On Fri, Apr 17, 2009 at 7:26 PM, Gabriel Michael Black gbl...@eecs.umich.edu mailto:gbl...@eecs.umich.edu wrote: Quoting Steve Reinhardt ste...@gmail.com mailto:ste...@gmail.com: On Fri, Apr 17, 2009 at 1:43 PM, Gabriel Michael Black gbl...@eecs.umich.edu mailto:gbl...@eecs.umich.edu wrote: You should also be sure to test starting from a checkpoint which I believe is different from a switchover, although I'm not sure. It shouldn't be different from O3's perspective; we always restore from a checkpoint into a SimpleCPU model and then switch over to O3, since you can't directly restore a SimpleCPU checkpoint into and O3 model. But you could checkpoint O3 and then restore directly, couldn't you? Gabe ___ m5-dev mailing list m5-dev@m5sim.org mailto:m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev diff --git a/src/cpu/simple/atomic.cc b/src/cpu/simple/atomic.cc --- a/src/cpu/simple/atomic.cc +++ b/src/cpu/simple/atomic.cc @@ -270,6 +270,9 @@ assert(thread_num == 0); assert(thread); + +if (_status == Idle) +return; assert(_status == Running); diff --git a/src/cpu/simple/timing.cc b/src/cpu/simple/timing.cc --- a/src/cpu/simple/timing.cc +++ b/src/cpu/simple/timing.cc @@ -232,6 +232,9 @@ assert(thread_num == 0); assert(thread); + +if (_status == Idle) +return; assert(_status == Running); ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev