Re: [OpenRISC] [Openrisc] [RFC patch] OR1200 debug unit failing to continue from some stalls

Stefan Kristiansson Wed, 03 Oct 2012 19:31:11 -0700

On Thu, Oct 4, 2012 at 1:46 AM, Franck Jullien <[email protected]>wrote:


> 2012/9/25 Yann Vernier <[email protected]>:
> > On Tue, 25 Sep 2012 03:16:48 -1000
> > Julius Baxter <[email protected]> wrote:
> >
> >> On Tue, Sep 25, 2012 at 12:28 AM, Yann Vernier
> >> <[email protected]> wrote:
> >> > The thread http://opencores.org/forum,OpenRISC,0,5005 about GDB
> >> > failing to single-step after certain breakpoints led us to a logic
> >> > error in the debug unit. When the pipeline is stalled for other
> >> > reasons, for instance an instruction fetch (therefore affected by
> >> > instruction cache), at the cycle _after_ a trap instruction is
> >> > decoded, the trap signal is never lowered since no new instruction
> >> > is decoded. This locks the debug unit into stalling continuously.
> >>
> >> Hi Yann,
> >>
> >> Sounds like good work on tracking this down.
> >>
> >> Can this be recreated in simulation via the JTAG-via-VPI interface?
> >> I'm not saying you need to, but it would be good to know that this bug
> >> is fixed by observing it in simulation, and/or having a regression
> >> test which, or instructions how to, trigger this bug. How did you work
> >> this out this is the right fix?
> >
> > Yes. We confirmed the issue on hardware, reproduced it in simulation
> > with VPI interface, saw the sig_trap line getting stuck, and found that
> > it overrides stall register updates over JTAG via code inspection. The
> > test case we used was or1200-cbasic, where (for this particular
> > compilation) breakpoints worked on main but not test_bss. gdb was
> > manually controlled, using break and si commands.
> > The observed failure was when if_stall coincided with except_stop.
> >
> > This fix was then verified in simulation and synthesis.
> >
> >> > Attached is a workaround to make the debug unit aware that the
> >> > sig_trap and sig_syscall signals only update when the ex stage is
> >> > not stalled. This fixes the issue we've observed, although similar
> >> > conditions may exist for any other exceptions. Also note that the
> >> > debug interface uses a priority encoder to only catch one of the
> >> > events that occur; short-lived events may therefore be ignored
> >> > completely.
> >>
> >> Can you elaborate on this? Which "short-lived events" are you talking
> >> about?
> >
> > By short-lived I mean exception signals which go back to 0 without being
> > acknowledged. sig_trap and sig_syscall are simply two of the bits in the
> > except_stop vector (which goes to the debug unit) as well as its relative
> > except_trig (goes to the CPU). I'm simply not sure if any others may be
> > short-lived, but the debug unit only catches one per cycle because the
> > du_except_stop to except_stop recoder is a priority encoder.
> > sig_trap is normally short-lived, but being part of the pipeline, may
> > stall along with it.
> >
> > The patch forces the debug unit stall reason of a trap exception to
> > seem short lived, such that if the debug unit happens to stop for
> > another reason, it may not be evident that it was about to trap as
> > well. In current gdb use, the DSR register only captures trap, so it
> > may not be a problem.
> >
> > I'm not sure why the debug unit is written to only capture one reason
> > per cycle anyway. It's going to stall anyhow, just not record all
> > triggers.
> >
> >> > Any comments? Personally I feel it's a bit of a stopgap measure,
> >> > addressing this problem narrowly, but it's better than the current
> >> > state.
> >>
> >> Can you please post the patch inline?
> >
> > Certainly (assuming my mail client won't mangle it).
> >
> >
> > Index: or1200_du.v
> > ===================================================================
> > --- or1200_du.v (revision 663)
> > +++ or1200_du.v (working copy)
> > @@ -412,6 +412,7 @@
> >  wire                           dwcr0_sel,
> >                                 dwcr1_sel;      // DWCR selects
> >  reg                            dbg_bp_r;
> > +reg                            ex_freeze_q;
> >  `ifdef OR1200_DU_HWBKPTS
> >  reg    [31:0]                  match_cond0_ct;
> >  reg    [31:0]                  match_cond1_ct;
> > @@ -529,12 +530,16 @@
> >  assign dwcr1_sel = (spr_cs && (spr_addr[`OR1200_DUOFS_BITS] ==
> `OR1200_DU_DWCR1));
> >  `endif
> >
> > +// Track previous ex_freeze to detect when signals are updated
> > +always @(posedge clk)
> > +  ex_freeze_q <= ex_freeze;
> > +
> >  //
> >  // Decode started exception
> >  //
> >  // du_except_stop comes from or1200_except
> >  //
> > -always @(du_except_stop) begin
> > +always @(du_except_stop or ex_freeze_q) begin
> >         except_stop = 14'b00_0000_0000_0000;
> >         casez (du_except_stop)
> >                 14'b1?_????_????_????:
> > @@ -566,16 +571,16 @@
> >                         except_stop[`OR1200_DU_DRR_RE] = 1'b1;
> >                 end
> >                 14'b00_0000_0000_01??: begin
> > -                       except_stop[`OR1200_DU_DRR_TE] = 1'b1;
> > +                       except_stop[`OR1200_DU_DRR_TE] = 1'b1 &
> ~ex_freeze_q;
> >                 end
> >                 14'b00_0000_0000_001?: begin
> >                         except_stop[`OR1200_DU_DRR_FPE] = 1'b1;
> >                 end
> >                 14'b00_0000_0000_0001:
> > -                       except_stop[`OR1200_DU_DRR_SCE] = 1'b1;
> > +                       except_stop[`OR1200_DU_DRR_SCE] = 1'b1 &
> ~ex_freeze_q;
> >                 default:
> >                         except_stop = 14'b00_0000_0000_0000;
> > -       endcase
> > +       endcase // casez (du_except_stop)
> >  end
> >
> >  //
> > _______________________________________________
> > Openrisc mailing list
> > [email protected]
> > http://lists.opencores.org/listinfo/openrisc
>
> Hi,
>
> I think we should try to not get this patch be lost in the ML. Yann,
> you should at least post this on the openrisc forum. I think a lot of
> people (including me) observed this anoying bug. I'll try that patch
> as soon as I can.
>
> Thanks for your investigation on this.
>
> Franck.
>
>
I'd even say, commit it.
No-one is objecting to it, it's perhaps not perfect,
but no-one have anything better to suggest.

Stefan

_______________________________________________
OpenRISC mailing list
[email protected]
http://lists.openrisc.net/listinfo/openrisc

Re: [OpenRISC] [Openrisc] [RFC patch] OR1200 debug unit failing to continue from some stalls

Reply via email to