OK.... So I finally figured out what's wrong here and admittedly I'm a
bit apprehensive of the universal (all ISA) fix will be but we'll see.
So let me explain the situation from the top:
(1) In the Decode of O3, an unconditional pc-relative branch is
resolved and the prediciton history table is updated to remove the
resolved branch (since we've already updated the predictors with the
result). However, in the Execute stage the branch is seen as
mispredicted AGAIN, causing the code to want to update and delete
something from the prediction history. This triggers an assert and
breaks exeuction.
(2) So a few questions emerge:
(2a) Why does this work with the previous hack disabling the early
resolve of branches in Decode?
Because the mispredict only happens once and in Execute the fetch is
redirected to the results of the execution so we only experience
performance degradation instead of non-functional peformance.
(2b) How can we mispredict a branch in Execute after we've previously
resolved that branch in Decode?
It just seems that the mispredict() function isnt handling the MIPS
delay slots correctly. When a branch happens in MIPS, it sets the NPC
= PC + 4 and NNPC = target. However the code shows this:
return readPredPC() != readNextPC() ||
readPredNPC() != readNextNPC() ||
readPredMicroPC() != readNextMicroPC();
Consider for MIPS, the predPC is the target and the predNPC is
target+4, then you're saying:
target != pc+4
and
target+4 != target
which will be wrong in most cases.
(3) How could this work for SPARC?
I'm not sure. My guess is that instructions in SPARC are disregarding
the delay slot in execution and setting the NPC to the target instead
of setting the NPC to PC+4.
SIDE ISSUE: Gabe, is SPARC not using the Unconditional Control flag? I
ran SPARC and put this assert (assert(!inst->UnCondCtrl()) in the
decode-branch-resolve optimization and it ran fine (for hello-world)
So if my guess on (3) is right then I dont right away see a quick fix
other than using a compiler directive. The MIPS code needs this in
mispredict():
#if THE_ISA == MIPS_ISA
// Special case since a not-taken, cond. delay slot, effectively
// nullifies the delay slot instruction
if (isCondDelaySlot() && !predictTaken) {
return predPC != nextPC;
} else {
return predPC != nextNPC;
}
#else
return readPredPC() != readNextPC() ||
readPredNPC() != readNextNPC() ||
readPredMicroPC() != readNextMicroPC();
#endif
So that's the main issue that's happening and forcing MIPS to break
rather quickly. My solution for a fix? I guess it all depends on the
answer to (3), but it seems like there is going to have to be a
uniform way for how instructions specify their NPC and NNPC when an
instruction executes.
I need to take a break from coding MIPS to handle a round of InOrder
issues (research!) but hopefully this discussion can lead us into a
uniform solution for when I get back to the code.
On Thu, Apr 16, 2009 at 12:52 AM, Steve Reinhardt <[email protected]> wrote:
> I see now, I missed that little change buried at the bottom.
>
> I recall that condition about unconditional PC-relative branches finding out
> in decode that their target was mispredicted too.
>
> I'm wondering if you're not just hacking around the real problem here... I
> don't see how this particular case is MIPS-specific, but I could believe
> that there may be some MIPS-specific details in how it's handled (like maybe
> the nextPC is not getting updated correctly because of branch delay slots or
> something). So your fix may be avoiding this correctness problem by
> preventing the model from handling this particular class of misprediction
> properly, forcing it to use an unrealistic but not broken path instead,
> while the right answer is to fix the real problem.
>
> Steve
>
> On Wed, Apr 15, 2009 at 9:16 PM, Korey Sewell <[email protected]> wrote:
>>
>> There is a bunch of DPRINTS in there that I kept in figuring out what's
>> going on.
>>
>> The actual fix is small in fetch_impl.hh:
>> " // Update the branch predictor.
>> +#if THE_ISA == MIPS_ISA
>> + if (fromDecode->decodeInfo[tid].
>> branchMispredict && 0) {
>> +#else
>> if (fromDecode->decodeInfo[tid].branchMispredict) {
>> +#endif
>> "
>> A little lower in the fall-out of that selection statement you'll see 2
>> different squash functions that get called in the branch predictor:
>> - One squashes all history behind the mispredicted instruction
>> - The other squashes all history and then updates some branch history
>> state (which is kind of confusing)
>>
>> I edited out the 2nd one from happening and things work again.
>>
>> The reason why you get a mispredict in decode is I think the fetching
>> scenario behind an unconditional PC-relative branch. You wont know exactly
>> which direction to go until after decode.
>>
>>
>> On Wed, Apr 15, 2009 at 11:44 PM, Steve Reinhardt <[email protected]>
>> wrote:
>>>
>>> I'm confused... this patch looks like it pretty much just inserts a bunch
>>> of DPRINTFs (plus the seemingly mistaken inclusion of a whitespace change in
>>> mips/isa/decoder.isa).
>>>
>>> I don't know anything specific about the O3 predictor, but it makes sense
>>> that the history is updated speculatively in decode, but I don't see how you
>>> can even know whether a branch is mispredicted while you're still in the
>>> decode stage, so the fixup would have to wait at least until execute (and
>>> you may want to wait until commit if that makes recovery substantially
>>> simpler).
>>>
>>> Steve
>>>
>>> On Wed, Apr 15, 2009 at 8:15 PM, Korey Sewell <[email protected]> wrote:
>>>>
>>>> This is the patch that will prevent me from committing.
>>>>
>>>> Basically, the decode stage is finding a mispredicted branch and then
>>>> deleting information out the branch history table for squashed instructions
>>>> AND itself.
>>>>
>>>> This messes things up for when the actual mispredicted instruction gets
>>>> to commit and wants to update the branch history.
>>>>
>>>> So basically, I ignored that condition and it branch prediction started
>>>> working. I did the same for ALPHA and SPARC and it seemed to work for Hello
>>>> World as well (maybe I just got lucky?)
>>>>
>>>> The next thing I guess I should do is run the quick/long regressions on
>>>> ALPHA/SPARC to make sure I'm not breaking anything if I were to remove that
>>>> code.
>>>>
>>>> Anybody have any thoughts on this (Gabe, Kevin?)?
>>>>
>>>>
>>>> On Wed, Apr 15, 2009 at 11:06 PM, Korey Sewell <[email protected]>
>>>> wrote:
>>>>>
>>>>> # HG changeset patch
>>>>> # User Korey Sewell <[email protected]>
>>>>> # Date 1239850525 14400
>>>>> # Node ID ce539a58327b78b049f8af0bf8a00737f2b97640
>>>>> # Parent 7b685fee8773c2fe0bd09ec7abb0a43873aeed92
>>>>> Fix Branch Prediction for MIPS. Namely, i dont see the reason for
>>>>> updating the branch history on anything that happens in decode. this
>>>>> should
>>>>> happen only in commit, right?
>>>>>
>>>>> diff --git a/src/arch/mips/isa/decoder.isa
>>>>> b/src/arch/mips/isa/decoder.isa
>>>>> --- a/src/arch/mips/isa/decoder.isa
>>>>> +++ b/src/arch/mips/isa/decoder.isa
>>>>> @@ -135,7 +135,7 @@ decode OPCODE_HI default Unknown::unknow
>>>>> 0x2: movz({{ Rd = (Rt == 0) ? Rs : Rd; }});
>>>>> 0x3: movn({{ Rd = (Rt != 0) ? Rs : Rd; }});
>>>>> #if FULL_SYSTEM
>>>>> - 0x4: syscall({{
>>>>> + 0x4: syscall({{
>>>>> fault = new SystemCallFault();
>>>>> }});
>>>>> #else
>>>>> diff --git a/src/cpu/o3/bpred_unit_impl.hh
>>>>> b/src/cpu/o3/bpred_unit_impl.hh
>>>>> --- a/src/cpu/o3/bpred_unit_impl.hh
>>>>> +++ b/src/cpu/o3/bpred_unit_impl.hh
>>>>> @@ -173,6 +173,10 @@ BPredUnit<Impl>::predict(DynInstPtr &ins
>>>>> tid, pred_taken, inst->readPC());
>>>>> }
>>>>>
>>>>> + DPRINTF(Fetch, "BranchPred: [tid:%i]: [sn:%i] Creating prediction
>>>>> history "
>>>>> + "for PC %#x\n",
>>>>> + tid, inst->seqNum, inst->readPC());
>>>>> +
>>>>> PredictorHistory predict_record(inst->seqNum, PC, pred_taken,
>>>>> bp_history, tid);
>>>>>
>>>>> @@ -249,7 +253,7 @@ void
>>>>> void
>>>>> BPredUnit<Impl>::update(const InstSeqNum &done_sn, unsigned tid)
>>>>> {
>>>>> - DPRINTF(Fetch, "BranchPred: [tid:%i]: Commiting branches until "
>>>>> + DPRINTF(Fetch, "BranchPred: [tid:%i]: Committing branches until "
>>>>> "[sn:%lli].\n", tid, done_sn);
>>>>>
>>>>> while (!predHist[tid].empty() &&
>>>>> @@ -290,7 +294,12 @@ BPredUnit<Impl>::squash(const InstSeqNum
>>>>> // This call should delete the bpHistory.
>>>>> BPSquash(pred_hist.front().bpHistory);
>>>>>
>>>>> + DPRINTF(Fetch, "BranchPred: [tid:%i]: Removing history for
>>>>> [sn:%i] "
>>>>> + "PC %#x.\n", tid, pred_hist.front().seqNum,
>>>>> pred_hist.front().PC);
>>>>> +
>>>>> pred_hist.pop_front();
>>>>> +
>>>>> + DPRINTF(Fetch, "[tid:%i]: predHist.size(): %i\n", tid,
>>>>> predHist[tid].size());
>>>>> }
>>>>>
>>>>> }
>>>>> @@ -320,7 +329,13 @@ BPredUnit<Impl>::squash(const InstSeqNum
>>>>> // corresponding to the squash. In that case, don't bother trying
>>>>> to
>>>>> // fix up the entry.
>>>>> if (!pred_hist.empty()) {
>>>>> - assert(pred_hist.front().seqNum == squashed_sn);
>>>>> + if (pred_hist.front().seqNum != squashed_sn)
>>>>> + {
>>>>> + DPRINTF(Fetch, "PredHist.front.sn = %i | squashed_sn =
>>>>> %i.\n",
>>>>> + pred_hist.front().seqNum, squashed_sn);
>>>>> + assert(pred_hist.front().seqNum == squashed_sn);
>>>>> + }
>>>>> +
>>>>> if (pred_hist.front().usedRAS) {
>>>>> ++RASIncorrect;
>>>>> }
>>>>> @@ -329,7 +344,13 @@ BPredUnit<Impl>::squash(const InstSeqNum
>>>>> pred_hist.front().bpHistory);
>>>>>
>>>>> BTB.update(pred_hist.front().PC, corr_target, tid);
>>>>> +
>>>>> + DPRINTF(Fetch, "BranchPred: [tid:%i]: Removing history for
>>>>> [sn:%i] "
>>>>> + "PC %#x.\n", tid, pred_hist.front().seqNum,
>>>>> pred_hist.front().PC);
>>>>> +
>>>>> pred_hist.pop_front();
>>>>> +
>>>>> + DPRINTF(Fetch, "[tid:%i]: predHist.size(): %i\n", tid,
>>>>> predHist[tid].size());
>>>>> }
>>>>> }
>>>>>
>>>>> diff --git a/src/cpu/o3/fetch_impl.hh b/src/cpu/o3/fetch_impl.hh
>>>>> --- a/src/cpu/o3/fetch_impl.hh
>>>>> +++ b/src/cpu/o3/fetch_impl.hh
>>>>> @@ -939,7 +939,11 @@ DefaultFetch<Impl>::checkSignalsAndUpdat
>>>>> "from decode.\n",tid);
>>>>>
>>>>> // Update the branch predictor.
>>>>> +#if THE_ISA == MIPS_ISA
>>>>> + if (fromDecode->decodeInfo[tid].branchMispredict && 0) {
>>>>> +#else
>>>>> if (fromDecode->decodeInfo[tid].branchMispredict) {
>>>>> +#endif
>>>>> branchPred.squash(fromDecode->decodeInfo[tid].doneSeqNum,
>>>>> fromDecode->decodeInfo[tid].nextPC,
>>>>> fromDecode->decodeInfo[tid].branchTaken,
>>>>> diff --git a/src/cpu/o3/iew_impl.hh b/src/cpu/o3/iew_impl.hh
>>>>> --- a/src/cpu/o3/iew_impl.hh
>>>>> +++ b/src/cpu/o3/iew_impl.hh
>>>>> @@ -1282,11 +1282,11 @@ DefaultIEW<Impl>::executeInsts()
>>>>> fetchRedirect[tid] = true;
>>>>>
>>>>> DPRINTF(IEW, "Execute: Branch mispredict detected.\n");
>>>>> - DPRINTF(IEW, "Predicted target was %#x, %#x.\n",
>>>>> + DPRINTF(IEW, "Predicted target was PC:%#x,
>>>>> NPC:%#x.\n",
>>>>> inst->readPredPC(), inst->readPredNPC());
>>>>> DPRINTF(IEW, "Execute: Redirecting fetch to PC: %#x,"
>>>>> " NPC: %#x.\n", inst->readNextPC(),
>>>>> - inst->readNextNPC());
>>>>> + inst->readNextPC());
>>>>> // If incorrect, then signal the ROB that it must be
>>>>> squashed.
>>>>> squashDueToBranch(inst, tid);
>>>>>
>>>>> _______________________________________________
>>>>> m5-dev mailing list
>>>>> [email protected]
>>>>> http://m5sim.org/mailman/listinfo/m5-dev
>>>>
>>>>
>>>>
>>>> --
>>>> ----------
>>>> Korey L Sewell
>>>> Graduate Student - PhD Candidate
>>>> Computer Science & Engineering
>>>> University of Michigan
>>>>
>>>> _______________________________________________
>>>> m5-dev mailing list
>>>> [email protected]
>>>> http://m5sim.org/mailman/listinfo/m5-dev
>>>>
>>>
>>>
>>> _______________________________________________
>>> m5-dev mailing list
>>> [email protected]
>>> http://m5sim.org/mailman/listinfo/m5-dev
>>>
>>
>>
>>
>> --
>> ----------
>> Korey L Sewell
>> Graduate Student - PhD Candidate
>> Computer Science & Engineering
>> University of Michigan
>>
>> _______________________________________________
>> m5-dev mailing list
>> [email protected]
>> http://m5sim.org/mailman/listinfo/m5-dev
>>
>
>
> _______________________________________________
> m5-dev mailing list
> [email protected]
> http://m5sim.org/mailman/listinfo/m5-dev
>
>
--
----------
Korey L Sewell
Graduate Student - PhD Candidate
Computer Science & Engineering
University of Michigan
_______________________________________________
m5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/m5-dev