Hi,
I have made some changes to the code: I have added a new flag (named
pushRAS) in the PredictorHistory structure. This flag tracks whether the
RAS has been pushed or not during a prediction. Then, in the squash
function it is used to pop the RAS if necessary.
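To make the idea concrete, here is a minimal, self-contained sketch of the
change. It is not the actual gem5 code: ReturnAddrStack, predictUncondCall,
squashValidBTBOnly and squashWithPushFlag are simplified, hypothetical
stand-ins, and the 4-byte return-address offset is an assumption. Only the
names PredictorHistory, validBTB and pushRAS come from the real discussion.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-ins; NOT the real gem5 classes.
using Addr = std::uint64_t;

struct ReturnAddrStack {
    std::vector<Addr> entries;
    void push(Addr a) { entries.push_back(a); }
    void pop() { if (!entries.empty()) entries.pop_back(); }
    std::size_t size() const { return entries.size(); }
};

struct PredictorHistory {
    bool validBTB = false; // the BTB supplied a target for this prediction
    bool pushRAS = false;  // new flag: this prediction pushed onto the RAS
};

// Predict an unconditional call. The return address is pushed even when
// the BTB has no valid target for the call.
PredictorHistory predictUncondCall(ReturnAddrStack &ras, Addr pc, bool btbHit)
{
    PredictorHistory hist;
    ras.push(pc + 4);      // assumption: fixed 4-byte instructions
    hist.pushRAS = true;   // remember that this prediction touched the RAS
    hist.validBTB = btbHit;
    return hist;
}

// Simplified version of the old behaviour: the pop is guarded by validBTB,
// so a squashed call that missed in the BTB leaves a stale RAS entry.
void squashValidBTBOnly(ReturnAddrStack &ras, const PredictorHistory &hist)
{
    if (hist.validBTB)
        ras.pop();
}

// Sketch of the modification: pop whenever the prediction pushed.
void squashWithPushFlag(ReturnAddrStack &ras, const PredictorHistory &hist)
{
    if (hist.pushRAS)
        ras.pop();
}
```

With a BTB miss on an unconditional call, squashValidBTBOnly leaves the
pushed entry on the stack, whereas squashWithPushFlag removes it.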
I have run tests on a subset of SPEC2006 compiled for ARM. In most
cases, there are fewer RAS mispredictions with this modification.
However, I have one case where the simulator is not able to finish the
simulation. Here is a part of the trace output (debug-flags=All,-Event):
112977432760000: system.cpu.icache-mem_side_port-MemSidePort: Queue system.cpu.icache-mem_side_port-MemSidePort received retry
112977432760000: system.tol2bus: recvTimingReq: src system.tol2bus-p0 ReadReq 0x4f000
112977432760000: system.switch_cpus_1.iew.lsq: received pkt for addr:0x4f000 ReadReq
112977432760000: system.l2-cpu_side_port: Scheduling a retry while blocked
112977432760000: system.tol2bus: recvTimingReq: src system.tol2bus-p0 ReadReq 0x4f000 RETRY
112977432760000: system.tol2bus: The bus is now occupied from tick 112977432760000 to 112977432761000
112977432760000: system.cpu.icache-mem_side_port-MemSidePort: now waiting on a retry
112977432761000: system.cpu.icache-mem_side_port-MemSidePort: Queue system.cpu.icache-mem_side_port-MemSidePort received retry
112977432761000: system.tol2bus: recvTimingReq: src system.tol2bus-p0 ReadReq 0x4f000
112977432761000: system.switch_cpus_1.iew.lsq: received pkt for addr:0x4f000 ReadReq
112977432761000: system.l2-cpu_side_port: Scheduling a retry while blocked
112977432761000: system.tol2bus: recvTimingReq: src system.tol2bus-p0 ReadReq 0x4f000 RETRY
...
I don't know whether this happens because of my modifications (although I
don't see how it could be related) or because I have stumbled upon a bug.
Nathanaël
On 22/05/2012 04:56, Ali Saidi wrote:
Hi Nathanaël,
It's quite possible there is a bug. Unfortunately, I don't think anyone knows
the branch predictor code extremely well.
Please see inline.
On May 21, 2012, at 3:36 PM, Nathanaël Prémillieu wrote:
Hi all,
I am currently looking at the branch prediction code.
In the predict function (src/cpu/o3/bpred_unit_impl.hh: 160), if the BTB entry
is not valid for a taken prediction, the prediction pred_taken is changed to
false and the history is changed. Moreover, the RAS is popped if it is a call
(l.254). But the RAS is popped only if the call is conditional (I don't know
exactly why).
If it is an unconditional call, the RAS is not popped.
I think the reason the code is like this is that you're definitely branching
in this case. When there is no target, fetch will be redirected to the target
when the branch resolves, so there will never be another opportunity to put
the branch into the return address stack. However, I would assume that the
conditional branch would have the same issue, so this seems to be a problem.
And if this unconditional call needs to be squashed (by the squash function,
l.298), the RAS needs to be popped. For this instruction, the validBTB flag is
not set, so line 316 of the squash function is not executed and thus the
entry is not popped.
That seems to be an issue too.
From my current understanding of the code, this seems to be a bug. As I don't
know the whole code, this case may be handled in some other function, or it
may not be a problem at all.
Any ideas on the subject?
I think it might be a problem. Could you try to suggest a fix and see if it
affects performance?
Thanks,
Ali
_______________________________________________
gem5-users mailing list
gem5-users@gem5.org
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users