On Fri, Jun 5, 2015 at 12:58 PM, Peter Crosthwaite <peter.crosthwa...@xilinx.com> wrote: > On Thu, Jun 4, 2015 at 7:00 PM, Alistair Francis > <alistair.fran...@xilinx.com> wrote: >> On Thu, Jun 4, 2015 at 8:52 AM, Peter Crosthwaite >> <peter.crosthwa...@xilinx.com> wrote: >>> On Wed, May 27, 2015 at 12:37 AM, Alistair Francis >>> <alistair.fran...@xilinx.com> wrote: >>>> Previously the stream_running() function didn't check >>>> if the DMA was halted. This caused hangs in recent versions >>>> of MicroBlaze u-boot. Correct stream_running() to check >>>> DMASR_HALTED as well as DMACR_RUNSTOP. >>>> >>> >>> So I'm stuggling with this one. Partly because I think HALTED might be >>> misimplemented in existing code. I did some digging, and AFAICS, >>> HALTED is conditional on !DAMCR_RUNSTOP. I think i might have got >>> 210914e29975d17e635f9e8c1f7478c0ed7a208f wrong: >>> >>> @@ -276,7 +276,7 @@ static void stream_process_mem2s(struct Stream *s, >>> stream_desc_load(s, s->regs[R_CURDESC]); >>> >>> if (s->desc.status & SDESC_STATUS_COMPLETE) { >>> - s->regs[R_DMASR] |= DMASR_IDLE; >>> + s->regs[R_DMASR] |= DMASR_HALTED; >>> break; >>> } >>> >>> Stepping back and ignoring the existing implementation of HALTED there >>> are 4 states of RS:H (RUNSTOP and HALTED): >>> >>> !RS && H - this is the off state. doc refers to this as the "halted" state. >>> RS && !H - This is the running state. >>> !RS && !H - This is the transient state. Software has cleared RS but >>> there s still something on AXI bus so cant assert halted yet. >>> RS && H - This is an invalid state. >>> >>> Current code reaches the invalid state on the ring buffer full case >>> due to the bug above. My thoery is >>> 210914e29975d17e635f9e8c1f7478c0ed7a208f should have just been: >>> >>> if (s->desc.status & SDESC_STATUS_COMPLETE) { >>> - s->regs[R_DMASR] |= DMASR_IDLE; >>> break; >>> } >>> >>> Now I think there is yet another bug in that clearing RS doesn't seem >>> to be able to reliably set the HALTED bit (only in the unrelated case >>> of a ring buffer fill). >>> >>> I'm starting to question whether the HALTED bit as far as QEMU is >>> concerned should just be a straight negation of RS. Depending on what >>> the conditions cause a transient and what doesn't, the transient as I >>> describe above may evaporate as we can get away with this simple >>> shortcut. >> >> Hey Peter, >> >> Ok, so you are saying maybe we should only have 'Running' and 'Off' >> states? >> >>> >>> This would make this patch obsolete without fixing your bug :). >>> >>> So running on the assumption that HALTED is misimplemented your patch >>> is doing something with that behaviour. The misimplemented HALTED is >>> currently holding the state of "we are blocked on a full buffer". If >>> you can point me which of the 3 call sites of stream_running was >>> giving you problems I might have more clues. >> >> So the problem was basically because: >> - axienet_eth_rx_notify() calls the DMA push functions >> >> - xilinx_axidma_data_stream_can_push() returned true >> - Therefore xilinx_axidma_data_stream_push() could be called >> >> - This meant that stream_process_s2mem() was called >> - But it would break straight away as >> 's->desc.status & SDESC_STATUS_COMPLETE' would return true. >> - This meant it would cause an infinite loop as the stream could be >> pushed, but nothing was ever pushed. As ret was always zero >> the code never broke out of the second while loop in >> axienet_eth_rx_notify() >> > > Ok I think I know what is up. It happens by fluke, that the bad > implementation of halted is exactly the state you need to correct this > issue. You could capture it as a new boolean in the state struct and > just check it in can_push. I'm looking at ways to refactor it to avoid
Do you mean capture the running/!running state in the state struct? > the extra calls to _push though. It might be possible to refactor such > that all of stream_running, stream_idle and the SDESC_STATUS_COMPLETE > happen in can_push. This means can_push has to do the > stream_desc_load. But that is ok, as the current solution will already > have spurious fetches of the descriptor. Then make can_push itself the > iteration condition for _mem2s's loop which will handle descriptor > fetches for you in the process. Is it really that bad to be calling _push? Thanks, Alistair > > Regards, > Peter > >> Thanks, >> >> Alistair >> >>> >>> FYI you patch may still be correct but I wondering whether is has >>> uncovered a bug that should lead to a rework of this. >>> >>> Regards, >>> Peter >>> >>>> Signed-off-by: Alistair Francis <alistair.fran...@xilinx.com> >>>> Reviewed-by: Sai Pavan Boddu <saip...@xilinx.com> >>>> --- >>>> hw/dma/xilinx_axidma.c | 3 ++- >>>> 1 files changed, 2 insertions(+), 1 deletions(-) >>>> >>>> diff --git a/hw/dma/xilinx_axidma.c b/hw/dma/xilinx_axidma.c >>>> index d06002d..27fba40 100644 >>>> --- a/hw/dma/xilinx_axidma.c >>>> +++ b/hw/dma/xilinx_axidma.c >>>> @@ -154,7 +154,8 @@ static inline int stream_resetting(struct Stream *s) >>>> >>>> static inline int stream_running(struct Stream *s) >>>> { >>>> - return s->regs[R_DMACR] & DMACR_RUNSTOP; >>>> + return s->regs[R_DMACR] & DMACR_RUNSTOP && >>>> + !(s->regs[R_DMASR] & DMASR_HALTED); >>>> } >>>> >>>> static inline int stream_idle(struct Stream *s) >>>> -- >>>> 1.7.1 >>>> >>>> >>> >> >