2012/9/25 Yann Vernier <[email protected]>: > On Tue, 25 Sep 2012 03:16:48 -1000 > Julius Baxter <[email protected]> wrote: > >> On Tue, Sep 25, 2012 at 12:28 AM, Yann Vernier >> <[email protected]> wrote: >> > The thread http://opencores.org/forum,OpenRISC,0,5005 about GDB >> > failing to single-step after certain breakpoints led us to a logic >> > error in the debug unit. When the pipeline is stalled for other >> > reasons, for instance an instruction fetch (therefore affected by >> > instruction cache), at the cycle _after_ a trap instruction is >> > decoded, the trap signal is never lowered since no new instruction >> > is decoded. This locks the debug unit into stalling continuously. >> >> Hi Yann, >> >> Sounds like good work on tracking this down. >> >> Can this be recreated in simulation via the JTAG-via-VPI interface? >> I'm not saying you need to, but it would be good to know that this bug >> is fixed by observing it in simulation, and/or having a regression >> test which, or instructions how to, trigger this bug. How did you work >> this out this is the right fix? > > Yes. We confirmed the issue on hardware, reproduced it in simulation > with VPI interface, saw the sig_trap line getting stuck, and found that > it overrides stall register updates over JTAG via code inspection. The > test case we used was or1200-cbasic, where (for this particular > compilation) breakpoints worked on main but not test_bss. gdb was > manually controlled, using break and si commands. > The observed failure was when if_stall coincided with except_stop. > > This fix was then verified in simulation and synthesis. > >> > Attached is a workaround to make the debug unit aware that the >> > sig_trap and sig_syscall signals only update when the ex stage is >> > not stalled. This fixes the issue we've observed, although similar >> > conditions may exist for any other exceptions. Also note that the >> > debug interface uses a priority encoder to only catch one of the >> > events that occur; short-lived events may therefore be ignored >> > completely. >> >> Can you elaborate on this? Which "short-lived events" are you talking >> about? > > By short-lived I mean exception signals which go back to 0 without being > acknowledged. sig_trap and sig_syscall are simply two of the bits in the > except_stop vector (which goes to the debug unit) as well as its relative > except_trig (goes to the CPU). I'm simply not sure if any others may be > short-lived, but the debug unit only catches one per cycle because the > du_except_stop to except_stop recoder is a priority encoder. > sig_trap is normally short-lived, but being part of the pipeline, may > stall along with it. > > The patch forces the debug unit stall reason of a trap exception to > seem short lived, such that if the debug unit happens to stop for > another reason, it may not be evident that it was about to trap as > well. In current gdb use, the DSR register only captures trap, so it > may not be a problem. > > I'm not sure why the debug unit is written to only capture one reason > per cycle anyway. It's going to stall anyhow, just not record all > triggers. > >> > Any comments? Personally I feel it's a bit of a stopgap measure, >> > addressing this problem narrowly, but it's better than the current >> > state. >> >> Can you please post the patch inline? > > Certainly (assuming my mail client won't mangle it). > > > Index: or1200_du.v > =================================================================== > --- or1200_du.v (revision 663) > +++ or1200_du.v (working copy) > @@ -412,6 +412,7 @@ > wire dwcr0_sel, > dwcr1_sel; // DWCR selects > reg dbg_bp_r; > +reg ex_freeze_q; > `ifdef OR1200_DU_HWBKPTS > reg [31:0] match_cond0_ct; > reg [31:0] match_cond1_ct; > @@ -529,12 +530,16 @@ > assign dwcr1_sel = (spr_cs && (spr_addr[`OR1200_DUOFS_BITS] == > `OR1200_DU_DWCR1)); > `endif > > +// Track previous ex_freeze to detect when signals are updated > +always @(posedge clk) > + ex_freeze_q <= ex_freeze; > + > // > // Decode started exception > // > // du_except_stop comes from or1200_except > // > -always @(du_except_stop) begin > +always @(du_except_stop or ex_freeze_q) begin > except_stop = 14'b00_0000_0000_0000; > casez (du_except_stop) > 14'b1?_????_????_????: > @@ -566,16 +571,16 @@ > except_stop[`OR1200_DU_DRR_RE] = 1'b1; > end > 14'b00_0000_0000_01??: begin > - except_stop[`OR1200_DU_DRR_TE] = 1'b1; > + except_stop[`OR1200_DU_DRR_TE] = 1'b1 & ~ex_freeze_q; > end > 14'b00_0000_0000_001?: begin > except_stop[`OR1200_DU_DRR_FPE] = 1'b1; > end > 14'b00_0000_0000_0001: > - except_stop[`OR1200_DU_DRR_SCE] = 1'b1; > + except_stop[`OR1200_DU_DRR_SCE] = 1'b1 & ~ex_freeze_q; > default: > except_stop = 14'b00_0000_0000_0000; > - endcase > + endcase // casez (du_except_stop) > end > > // > _______________________________________________ > Openrisc mailing list > [email protected] > http://lists.opencores.org/listinfo/openrisc
Hi, I think we should try to not get this patch be lost in the ML. Yann, you should at least post this on the openrisc forum. I think a lot of people (including me) observed this anoying bug. I'll try that patch as soon as I can. Thanks for your investigation on this. Franck. _______________________________________________ OpenRISC mailing list [email protected] http://lists.openrisc.net/listinfo/openrisc
