[Dri-devel] Re: NO_BATCH_DISPATCH path

Leif Delgass Wed, 05 Jun 2002 12:18:12 -0700

On Wed, 5 Jun 2002, José Fonseca wrote:

> On Tue, 4 Jun 2002, Leif Delgass wrote:
> > On Tue, 4 Jun 2002, Jose Fonseca wrote:
> > 
> > > Leif,
> > >
> > > As I was restructuring part of the DMA submission code I realized that
> > > there is a fault in the NO_BATCH_DISPATCH path. You don't guarantee
> > > that the card doesn't stop at any momment because you've chosen not
> > > check if a change in the last buffer continuation flag is sucessful or
> > > not, and wait for idle & restart if not. This means that in _every_
> > > wait loop we have to account for possibility of the card to stop in
> > > result of that (this include e.g., _wait_ring, but many others).
> > 
> > I'm checking for idle (but not waiting) on each commit by way of the
> > do_dma_flush.  If the card is idle, a new pass is started.  If it has
> > almost caught up to the tail but the card's not idle yet, it'll be seen on
> > the next commit or in freelist_get.  I don't think there's a fault here,
> 
> In _wait_ring you don't check if card is idle (and resume) so you could
> be forever waiting for free space space.


I do see the problem there, I was thinking that this couldn't be triggered
with the current code because wait_ring would never be entered (can't fill
the ring).  I realized that ring.space is only kept up to date with regard
to the ring head by wait_ring, so it is possible.  I went back and added a
flush to the ring space macro in my version and it seemed to fix the
lockup I was having with the q3demo, although I haven't tested long enough
to see if there aren't other lockups.
 
> In do_dma_idle you just called _wait_for_idle, but you didn't account
> for the possibility of the card hadn't finished processing.

The reason for this was that I have a separate ioctl for flushing DMA.  
DMA can be idle without the queue being empty, if the queue is empty then
the card is quiescent.  Most places where there is a wait for idle, it's
preceeded by a flush, e.g. in Read/WritePixels, the XAA sync, etc.  The
question is:  are there still places where we call the idle ioctl when
quiescense is _not_ required (there is no preceeding flush)?  If not, we
wouldn't need the flush ioctl (assuming we never need to flush from
outside the drm without waiting for idle) and your version of do_dma_idle
is fine since it flushes before waiting for idle.
 
> These are just two I think of. I don't recall if there are more. I dealt
> with the problem by making the check for card idle but with buffers to
> process in UPDATE_HEAD.

I see why you did this, but I think it obfuscates the code a bit to flush 
inside the UPDATE_RING_HEAD macro.  Maybe we could rename it to 
UPDATE_HEAD_WITH_FLUSH or something to make it more explicit.  Also, I 
noticed that you removed the bounds check on BM_GUI_TABLE.  Without that, 
you need to be certain that the card is not locked up when starting a new 
pass, because the ring_head could be an invalid address.  I guess if the 
card is locked, GUI_STAT will still read as active, so maybe the check 
isn't necessary, but we should make sure that the new path can handle a 
lockup.  One of the reasons I wanted to keep the old path in the code for 
a while was to make sure there isn't a problem with using BM_GUI_TABLE 
the way we are that we haven't encountered or tracked down yet.
 
> > it's just a different approach.  My thinking was:  how do you determine if
> > moving the end of list flag was successful?  You'll end up needing a wait
> > loop to see if the card passes the previous tail (which can often be only
> > one descriptor away from the new tail).  That wastes CPU cycles, decreases
> > concurrency, and increases the chance of the card going idle before the
> > next commit (by finishing the ring).  Maybe we could assume that if the
> 
> Yes. You're right. Waiting would be a waste. As I said above, that
> problem is already overcomed by checking everytime it updates the head.
> 
> > head is a certain number of descriptors behind the previous tail, the card
> > will continue on and not go idle -- that would reduce the need for a wait
> > loop to situations where it's likely the card will go idle.  Of course, if
> > the draw engine locks up, then the card will stop, so there's really 3
> > possibilities: running, finished, or locked.
> > 
> > > I tried to make quick fix to this, but it didn't solve anything (I can
> > > reproduce the bug everytime I want by runing TO - a Mod to UT), so I
> > > stopped trying to concentrate on what I was doing before. I just wanted
> > > to tell you because you're probably looking at that too.
> > >
> > > Regarding the restructuring I'm doing, is basically a simplification to
> > > bring more closely what I had previously in mind. I'm making the RING_*
> > > macros to commit the buffers immediately, and the DMA_* macros to fire
> > > up one buffer (also commiting immediately).
> > 
> > The DMAADVANCE macro already does commit one buffer immediately if
> > NO_BATCH_DISPATCH is set.  I guess I'll have to wait and see what you had
> 
> As I was writing I realised that you were already doing that on the
> NO_BATCH (the several code paths _really_ confuse one trying to
> understand the code..). In fact as I was tweaking around I rewrote
> several functions and deleted all because I realised that I was going on
> the same direction of what you did before! ;-) But it was ok, since now I
> know why things are the are the way they are.
> 
> > in mind here.  BTW, I hadn't gotten to making pseudo-DMA work with the new
> > path yet, but I think we should have that for debugging.
> >
> 
> As I said in my previous mail (I hadn't received this one yet), it's not
> on my priority list. I prefer have the DMA work properly and without
> faults, than losing time maintaining more than one code path. Having the
> boths paths in runtime is gonna be difficult. Perhaps better is leave
> the #ifdef's and in the end just make the pseudo-dma code play with the
> rest and whoever want's use that path changes a macro and recompiles the
> DRM.

I understand, and I know it makes the code harder to read.  I think there
are still problems with DMA on ppc, but I haven't heard from Peter for a
while.  I was thinking of pseudo-DMA as a way to debug DMA more than a
path that would be for everyday use.  I've used it for this purpose and
it's been helpful to have, but we can remove it once things are known to
be stable.

> > ...
> >
> > OK, I'm going to take a break from the drm and wait to see your changes.
> > I'm going to take a look at 2D accel.
> 
> Now I have an old test box (thanks to Keith) and another friend gave me
> a AGP Mach64 which was no longer being used, so I can do much more (and
> quicker) since I don't need hang the same computer I work on anymore.
> 
> Jose Fonseca

In general I think the changes look good.  Here's a couple of minor fixes 
for your patch:

- dma_start needs a prototype in mach64_drv.h since it's referenced in 
the macros, which are used in mach64_state.c.

- I think you have an endianess problem with ADVANCE_RING.  The command
read from the ring will by cpu endianess, and then you're or/and-ing with
a little endian value and writing back with the |= and &=.  I think you 
need to move the read inside the cpu_to_le32 and use a simple assignment:

ring[(write-2) & mask] = cpu_to_le32( ring[(write-2) & mask] | DMA_EOL );

maybe use a temp var for the dword offset?

-- 
Leif Delgass 
http://www.retinalburn.net



_______________________________________________________________

Don't miss the 2002 Sprint PCS Application Developer's Conference
August 25-28 in Las Vegas -- http://devcon.sprintpcs.com/adp/index.cfm

_______________________________________________
Dri-devel mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/dri-devel

[Dri-devel] Re: NO_BATCH_DISPATCH path

Reply via email to