[Open-graphics] Re: Scatter-gather?

Daniel Phillips Tue, 15 Mar 2005 20:39:34 -0800

On Tuesday 15 March 2005 21:34, Timothy Miller wrote:
> On Tue, 15 Mar 2005 18:29:58 -0500, Daniel Phillips wrote:
> > ...let's see if it is a serious problem.  Suppose each indirect DMA
> > command is 8 bytes.  Suppose we are loading textures at 128 MB/s,
> > or 128 KB/ms, or 32 pages/ms.  Suppose we are willing to take an
> > interrupt and wake up a task that loads up the next batch of DMA
> > commands every 10 ms.  That is 8 bytes/command * 32 pages/ms * 10
> > ms = 2560 bytes of ring buffer space, so round it up to 4K, which
> > allows for enough slack to take an interrupt and refill the buffer
> > before it drains completely.
>
> It would take much longer than that to drain the ring buffer.  Each
> entry in the ring buffer would point to some other DMA transaction
> that would itself likely take a long time.


But all _DMA_ has to be initiated from the command ring buffer.  And if
the command ring buffer is full of 4K DMA commands, it drains at a rate
of 256 bytes a millisecond (8 bytes/command * 32 commands/ms).  Did I
miss something?
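
For reference, the sizing arithmetic above can be checked with a short sketch. The figures are the ones used in this thread (8-byte commands, 4K pages, 128 KB/ms, 10 ms refill interval). The helper names and the choice to round up to a power of two are only assumptions for illustration, and 128 MB/s is treated as exactly 128 KB/ms, as in the estimate above.

```python
# Back-of-envelope sizing for the command ring buffer.
# All names here are hypothetical; the numbers come from the thread.
PAGE_SIZE = 4 * 1024   # bytes moved by one indirect DMA command
CMD_SIZE = 8           # bytes per indirect DMA command


def ring_bytes_needed(bandwidth_bytes_per_ms, refill_interval_ms):
    """Bytes of command ring consumed between two refills."""
    commands_per_ms = bandwidth_bytes_per_ms // PAGE_SIZE
    return CMD_SIZE * commands_per_ms * refill_interval_ms


def round_up_pow2(n):
    """Smallest power of two >= n (an assumed rounding policy)."""
    size = 1
    while size < n:
        size *= 2
    return size


pci_bandwidth = 128 * 1024                             # 128 KB/ms
drain_rate = CMD_SIZE * (pci_bandwidth // PAGE_SIZE)   # bytes of ring per ms
needed = ring_bytes_needed(pci_bandwidth, 10)          # 10 ms between refills
ring_size = round_up_pow2(needed)
slack_ms = (ring_size - needed) / drain_rate           # time left to refill

# Same estimate at 100x the bandwidth (the "two orders of magnitude" case).
fast_needed = ring_bytes_needed(100 * pci_bandwidth, 10)
fast_ring_size = round_up_pow2(fast_needed)

print(drain_rate, needed, ring_size, slack_ms)
print(fast_needed, fast_ring_size)
```

With these inputs the PCI case needs 2560 bytes, which rounds up to 4K and leaves about 6 ms of slack at 256 bytes/ms. The 100x case lands at roughly 256K.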

> > That is the PCI case.  PCI-e can transfer one or two orders of
> > magnitude faster.  Say we could somehow keep up with this,
> > regardless of whether it is fanciful at this point.  Then we might
> > need, say, a 256K ring buffer, and we could not be sure of being
> > able to find that much unfragmented physical memory, except at boot
> > time.
> >
> > Now, we probably will only ever initialize the ring buffer at boot
> > time, but say for the sake of argument that we want to fix this
> > theoretical problem.  The way I would propose to do it is:
> > initialize the command ring buffer by loading a number of physical
> > page addresses via PIO, so that the command ring buffer does not
> > have to be physically contiguous. In other words, the DMA hardware
> > translates ring buffer addresses through a table of up to, say, 64
> > 4K pages, so the ring buffer does not have to be allocated from
> > physically contiguous memory.
> >
> > Indirect DMA pages will be locked down by DRI via a software
> > interface, so the hardware doesn't have to worry about that at all.
> >
> > What do you think?  Seriously, I doubt it would be a real problem
> > to just make the command ring buffer physically contiguous and hope
> > for the best at init time.  But if it isn't too hard, maybe we
> > should solve the fragmentation problem definitively, at least on
> > paper.  We can always put this under the "after initial release but
> > before ASIC" category.
>
> I'll consider the idea of being able to take a LIST of address ranges
> for the ring buffer.  Say, maybe 8 entries or something.

Hmm, and force the command ring buffer to be physically contiguous?  In
that case, I'd say don't bother with the 8-entry list, just let the
ring buffer be a huge thing that can only be allocated at boot.  Because
after a Linux kernel has been running for a while, you can't be sure of
successfully getting even 2 physically contiguous pages, let alone a
whole bunch.  Reducing the problem by a factor of 8 doesn't make it go
away.

In my opinion, we only have to solve this problem on paper - we have to 
show how the card won't always be a contiguous memory pig.  But there 
is no practical need to implement the solution in the initial release.
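
As a paper sketch only, the page-table translation proposed in the quoted text could be modelled like this. The class name, method name and example addresses are hypothetical; only the 4K page size and the limit of 64 entries come from the proposal. A real implementation would live in the DMA hardware, with the driver loading the table via PIO.

```python
# Software model of a ring buffer addressed through a small page table,
# so the pages backing the ring need not be physically contiguous.
PAGE_SIZE = 4096
MAX_RING_PAGES = 64   # 64 * 4K = 256K of ring at most


class RingPageTable:
    def __init__(self, page_addrs):
        if not 0 < len(page_addrs) <= MAX_RING_PAGES:
            raise ValueError("need between 1 and 64 pages")
        if any(addr % PAGE_SIZE for addr in page_addrs):
            raise ValueError("page addresses must be 4K aligned")
        self.pages = list(page_addrs)
        self.ring_size = len(self.pages) * PAGE_SIZE

    def translate(self, ring_offset):
        """Map a ring offset to a physical address, wrapping at the end."""
        ring_offset %= self.ring_size
        index, offset = divmod(ring_offset, PAGE_SIZE)
        return self.pages[index] + offset


# Three scattered physical pages standing in for a fragmented allocation.
table = RingPageTable([0x7A000, 0x13000, 0x42000])
first = table.translate(0)                    # start of page 0
second = table.translate(PAGE_SIZE + 8)       # 8 bytes into page 1
wrapped = table.translate(3 * PAGE_SIZE + 16) # past the end, wraps to page 0
print(hex(first), hex(second), hex(wrapped))
```

The point of the model is that the hardware only ever sees single 4K pages, so the driver can build the ring from whatever pages it gets at any time, not just at boot.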

Regards,

Daniel
_______________________________________________
Open-graphics mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)
