Thanks for showing that trace output. It does look like branchTarget should be used there, since the pipeline hasn't recognized the branch as resolved quite yet (or else inst->readNextPC would show the taken branch).
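To make the difference concrete, here is a minimal standalone sketch. It is not m5 code: BranchInfo, kInstSize, kSn2 and the two helper functions are hypothetical names made up for illustration, and the values are the ones reported for [sn:2] in the trace quoted below. It assumes 4-byte instructions, which matches the 0x8c41 -> 0x8c45 step in that trace.

```cpp
#include <cstdint>

// Hypothetical stand-in for the few per-instruction values involved.
// This is NOT the m5 DynInst class, just an illustration.
struct BranchInfo {
    std::uint64_t pc;            // what inst->readPC() reported
    std::uint64_t predNextPC;    // what inst->readNextPC() reported (set at fetch)
    std::uint64_t branchTarget;  // what inst->branchTarget() reported at decode
};

// Assumed instruction size; stands in for sizeof(TheISA::MachInst).
constexpr std::uint64_t kInstSize = 4;

// Check as written in the quoted squash code: compares the *predicted*
// next PC against the fall-through PC.
constexpr bool takenFromPredictedNextPC(const BranchInfo &b) {
    return b.predNextPC != b.pc + kInstSize;
}

// Check as proposed in the thread: compares the decoded branch target
// against the fall-through PC.
constexpr bool takenFromBranchTarget(const BranchInfo &b) {
    return b.branchTarget != b.pc + kInstSize;
}

// Values for [sn:2] copied from the trace: PC, readNextPC, branchTarget.
constexpr BranchInfo kSn2{0x4005, 0x4009, 0x8c41};
```

With these values the first helper evaluates to false, because the predicted next PC (0x4009) is just the fall-through, while the second evaluates to true, which is the direction an unconditional branch to 0x8c41 should report.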
I would say submit the patch, BUT I'm not sure you are using the most recent code (from the dev repo). Am I correct that you are using the stable repository rather than the development repository? The code I'm looking at says this:

toFetch->decodeInfo[tid].branchTaken = inst->pcState().branching();

The dev repo has a different syntax to check for branching (using the PCState objects now), but I still don't think it's handled correctly (I'm going to forward this email to m5-dev for discussion on what the fix should be).

With regard to your questions about branch paths, I would just note that a branch can be squashed in fetch (after prediction), in decode (early resolution), and in execute/"iew" (resolution). Those would be good places to start monitoring branch paths.

The correct fix should be to actually go ahead and resolve the branch

On Sat, Mar 26, 2011 at 5:05 PM, reena panda <[email protected]> wrote:
> Hi Korey,
>
> Following is the trace dump. You can check the instruction with sequence
> number 2.
>
> Following trace corresponds to when the instruction was fetched and passed
> through the branch predictor.
>
> 34000: system.cpu.fetch: [tid:0]: Adding instructions to queue to decode.
> 34000: system.cpu.fetch: In fetch() function, fetch PC = 0x4001, fetch NPC = 0x4005, next PC = 0x4001, next NPC = 0x4005
> 34000: system.cpu.fetch: In lookupAndUpdateNextPC function, Seq_Num = 1, Instruction = nop (bis r31,r31,r31), PC = 0x4001
> 34000: system.cpu.fetch: In fetch function,while loop, fetch PC = 0x4005, fetch NPC = 0x4009, next PC = 0x4005, next NPC = 0x4009
> 34000: system.cpu.fetch: In lookupAndUpdateNextPC function, Seq_Num = 2, Instruction = br r1,0x8c41, PC = 0x4005
> 34000: system.cpu.BPredUnit: BranchPred: [tid:0]: Unconditional control.
> 34000: system.cpu.BPredUnit: BranchPred: [tid:0]: [sn:2] Creating prediction history for PC 0x4005
> 34000: system.cpu.BPredUnit: BranchPred: [tid:0]: BTB doesn't have a valid entry.
> 34000: system.cpu.BPredUnit: BranchPred: [tid:0]: [sn:2]: History entry added.predHist.size(): 1
> 34000: system.cpu.fetch: [tid:0]: [sn:2]:Branch predicted to be not taken or no valid BTB entry found.
> 34000: system.cpu.fetch: [tid:0]: [sn:2] Branch predicted to go to 0x4009 and then 0x400d.
>
> The instruction sequence that I am having difficulty following is:
>
> toFetch->decodeInfo[tid].nextPC = inst->branchTarget();
> toFetch->decodeInfo[tid].nextNPC = inst->branchTarget() +
>     sizeof(TheISA::MachInst);
>
> toFetch->decodeInfo[tid].branchTaken = inst->readNextPC() !=
>     (inst->readPC() + sizeof(TheISA::MachInst));
>
> DPRINTF(Decode, "In Decode squash,inst->branchTarget()= %#x,inst->branchTarget() + machInst = %#x, inst->readnextPC = %#x, inst->readPC = %#x, seq num = %i\n",
>         inst->branchTarget(), toFetch->decodeInfo[tid].nextNPC,
>         inst->readNextPC(), inst->readPC(), inst->seqNum);
> //Added this DPRINTF
>
> When the instruction(sn = 2) reaches decode, following trace is
> collected:-
>
> 34500: system.cpu.decode: In decode Insts, branchtarget = 0x8c41, predPC = 0x4009
> 34500: system.cpu.decode: In Decode squash, inst->branchTarget()= 0x8c41, inst->branchTarget() + machInst = 0x8c45, inst->readnextPC = 0x4009, inst->readPC = 0x4005, seq num = 2
>
> As a result, when the squash reaches back the fetch stage, "actually_taken"
> variable is set to 0, when it should be 1.
>
> 35000: system.cpu.fetch: [tid:0]: Squashing instructions due to squash from decode.
> 35000: system.cpu.BPredUnit: BranchPred: [tid:0]: Squashing from sequence number 2, setting target to 0x8c41.
> 35000: system.cpu.BPredUnit: Squashing BranchPred: [tid:0]: Squashing branches until [sn:2].
> 35000: system.cpu.BPredUnit: BranchPred: [tid:0]: Removing history for [sn:16] PC 0x403d.
> 35000: system.cpu.BPredUnit: BranchPred: [tid:0]: Removing history for [sn:2] PC 0x4005., correct target = 0x8c41, Actually taken = 0
> 35000: system.cpu.BPredUnit: [tid:0]: predHist.size(): 0
> 35000: system.cpu.fetch: Squashing from decode with PC = 0x8c41, NPC = 0x8c45
> 35000: system.cpu.fetch: [tid:0]: Squashing from decode.
> 35000: system.cpu.fetch: [tid:0]: Squashing, setting PC to: 0x8c41, NPC to: 0x8c45.
> 35000: system.cpu.fetch: Running stage.
>
>
> What I am interested in looking at is the actual taken/not taken
> path/directions of each branch instruction as they . I thought I could
> capture it at the fetch stage, when a branch commits or when a squash
> signal gets generated for the branch. Is there any other way I can capture
> them?
>
> Thanks,
> Reena
>
> On Sat, Mar 26, 2011 at 3:41 PM, Ali Saidi <[email protected]> wrote:
>>
>> On Mar 26, 2011, at 3:17 PM, reena panda wrote:
>>
>> Hi,
>>
>> I am using m5 in ALPHA FS mode, with O3 CPU model. I was going through
>> the fetch/decode stage implementation in m5. But I can't understand
>> properly, the way branch misprediction is handled in the decode/fetch
>> stage of pipeline. Please correct me if I am wrong, but the way it is
>> currently implemented in m5 is as follows:-
>>
>> Suppose an unconditional branch(PC = x, say) is fetched in the fetch
>> cycle, its branch prediction history is immediately updated as a "taken
>> branch". Now lets say, its actual target is "Y". But suppose the entry
>> corresponding to the branch PC (x) is not found in the BTB, then the next
>> PC and nextNPC are still updated to x+4, x+8 respectively.
>> Since unconditional branches can be resolved in the decode stage, the
>> following check is correctly performed in decodeInsts function (in
>> decode_impl.hh):-
>>
>> if (inst->branchTarget() != inst->readPredPC()) {
>>     ++decodeBranchMispred;
>>     squash(inst, inst->threadNumber);
>> }
>>
>> But what is odd is that in squash function, the following information is
>> sent back to fetch stage:-
>>
>> toFetch->decodeInfo[tid].nextPC = inst->branchTarget();
>> toFetch->decodeInfo[tid].nextNPC = inst->branchTarget() +
>>     sizeof(TheISA::MachInst);
>> toFetch->decodeInfo[tid].branchTaken = inst->readNextPC() !=
>>     (inst->readPC() + sizeof(TheISA::MachInst));
>>
>> The third statement is odd because it compares nextPC with PC( i.e, x+4
>> with x+4) yields branch direction as "not-taken", Which is wrong and
>> would update the branch predictors incorrectly. Branch is actually an
>> "unconditional taken branch" in the example. Then, should not the last
>> line be something like this:-
>> toFetch->decodeInfo[tid].branchTaken = inst->branchTarget() !=
>>     (inst->readPC() + sizeof(TheISA::MachInst));
>>
>> Please point if I am missing something here? I can't understand the
>> working correctly. Also, can some one give me pointers on how to infer
>> total branch misprediction statistics from the stats.txt file, the stats
>> seem to be scattered across the different pipeline stages. Are they all
>> disjoint/or is there any degree of overlap between them?
>>
>> This might be a bug, but it might be that nextPc gets updated along the
>> way. Without diving into the code I don't really know, however have you
>> tried your fix? Does it work? Do branch predictions go down when running
>> code after you make the fix?
>> Ali
>>
>>
>> _______________________________________________
>> m5-users mailing list
>> [email protected]
>> http://m5sim.org/cgi-bin/mailman/listinfo/m5-users

-- 
- Korey
