On Sat, 8 Oct 2005 09:54:17 -0400
Timothy Miller <[EMAIL PROTECTED]> wrote:


> It's a pain to deal with.  I had to write special macros that would
> stick different numbers of words into the buffer efficiently.
> 
> But really, although drawing commands are allowed in this buffer,
> you're really supposed to use the indirect buffer for drawing.  That
> makes this less of a big deal.  If this is a real problem, I just need
> to add a NO-OP packet that's one word that you can use as filler.
> 
> > IMHO it wouldn't be a bad idea to have fixed-size packets, as these
> > would not only make the hardware simpler, but also simplify
> > driver programming (given that we do not lose functionality by
> > restricting ourselves to a fixed size).
> 
> But then you have to pull extra words over the bus.  That's
> inefficient, especially for PCI.  I think making them variable is an
> advantage since you can minimize the data transfer.

OK, but then I would at least use a granularity of 4 or 8 bytes.
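
To illustrate what such a granularity would cost, here is a minimal
sketch that computes how many one-word filler packets (the NO-OP packet
proposed above) would have to follow a packet to pad it to a given
granularity. The function name filler_words and the assumption that
every word is 32 bits wide are mine, not part of the proposal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: command words are 32 bits (4 bytes) wide. */
#define WORD_BYTES 4u

/*
 * Hypothetical helper: number of one-word NO-OP filler packets needed
 * after a packet of `words` words so that the total size becomes a
 * multiple of `granularity_bytes`. The granularity must be a nonzero
 * multiple of the word size.
 */
static inline size_t filler_words(size_t words, size_t granularity_bytes)
{
    assert(granularity_bytes != 0);
    assert(granularity_bytes % WORD_BYTES == 0);

    size_t unit = granularity_bytes / WORD_BYTES; /* words per unit */
    size_t rem = words % unit;                    /* words past the last boundary */

    return rem == 0 ? 0 : unit - rem;
}
```

With 8-byte granularity a three-word packet needs one filler word, so
the overhead is at most one word per packet. With 4-byte granularity
no padding is ever needed, since every packet is already a whole
number of words.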

> > s/kernel/driver/
> > The driver does not necessarily have to be in the kernel.
> 
> Well, yes, but it's probably most efficient to have that driver
> accessible via system calls to the kernel and to not have to send
> messages to another process.  Most of the time, we're just instructing
> the driver to load another indirect packet into the ring buffer, and
> we want to minimize overhead.

Sure, I just wanted to make clear that we should use the term
"driver" rather than "kernel".

> > > 0: Indirect DMA commands
> > > This packet is allowed in the ring buffer but not an indirect buffer.
> > > The packet contains two words:
> > > [4:0][1:privileged][27:length]
> > > [32:starting address]
> > > When this packet is read from the ring buffer, the indirect sequencer
> > > starts reading words from this address and continues reading until
> > > length words have been read.  If the privileged bit is not set, the
> > > particular indirect buffer is only allowed to contain rendering
> > > commands.
> >
> > Where is the start address? In card memory or host memory?
> 
> These are engine commands.  Engine commands always come from the host,
> and their target addresses are implicit.

Would it make sense to have an indirect buffer already loaded
into the graphics card's memory? I.e., if there is some command
sequence that we want to run again and again, we could avoid the
DMA transfers from host memory.

> > I would use 64 bits for the host address here from the beginning.
> > It will make the upgrade to 64-bit systems simpler.
> 
> You have a point.  But I think perhaps I should make it selectable, like this:
> 
> [4:2][1:64-bit host address][27:length]
> ...

I would not. Just make it 64-bit from the beginning (the additional
4 bytes should not hurt) and ignore the upper 32 bits for now.
A flag in the config space signaling whether the card can handle
64-bit addresses should be enough for the driver to select the
correct method.
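
As a rough sketch of what I mean, the indirect DMA packet would simply
grow from two words to three. The code below is only an illustration:
the bit positions (opcode in the top 4 bits, then the privileged bit,
then the 27-bit length), the low-word-first order of the address, and
the names OP_INDIRECT_DMA and encode_indirect_dma are my assumptions,
not something that has been specified.

```c
#include <assert.h>
#include <stdint.h>

/* Opcode 0 is the indirect DMA command from the packet list. */
enum { OP_INDIRECT_DMA = 0 };

/* The length field is 27 bits wide. */
#define LENGTH_MASK 0x07FFFFFFu

/*
 * Hypothetical encoder for a three-word indirect DMA packet:
 *   word 0: [4:opcode][1:privileged][27:length]  (assumed MSB first)
 *   word 1: host address, lower 32 bits          (assumed order)
 *   word 2: host address, upper 32 bits          (ignored by 32-bit cards)
 */
static inline void encode_indirect_dma(uint32_t out[3], int privileged,
                                       uint32_t length_words,
                                       uint64_t host_addr)
{
    assert(length_words <= LENGTH_MASK);

    out[0] = ((uint32_t)OP_INDIRECT_DMA << 28)
           | ((privileged ? 1u : 0u) << 27)
           | (length_words & LENGTH_MASK);
    out[1] = (uint32_t)(host_addr & 0xFFFFFFFFu);
    out[2] = (uint32_t)(host_addr >> 32);
}
```

On a card that only does 32-bit addressing the driver would have to
make sure the third word is zero, which is exactly the check the
config space flag would enable.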

> Note that the PCI controller in this version doesn't support 64-bit
> addresses.  

Yes, but it will in the not too distant future.
I wouldn't be surprised if the second version of the controller
already did 64-bit transfers.

> The kernel developers have found all sorts of clever ways
> to use the AMD iommu to deal with 32-bit devices in a 64-bit address
> space, so I'm not too worried.  But reserving it isn't a bad idea.

The problem is not transferring the data but allocating
DMA-capable memory. It's quite a PITA for driver developers to
ensure that the memory they allocate is in a range the
device can access, not to mention the problems this causes for
memory management.


> > > 3: Engine upload indirect
[...]
> > Here again, 64bit for host address.
> > And where does the image end up?
> 
> It ends up wherever you programmed the GPU to put it.  It'll show up
> as a register in a GPU unit that I haven't defined yet.  You program
> some registers in that unit with some information about where you want
> to draw.  When you write to this register (via PIO or this DMA
> command), each word you write causes a pixel to be emitted down the
> pipeline.  This way, you can use the GPU to control where you're
> drawing.

Could you please explain this a bit further?
I'm not familiar enough with graphics cards and how they
work to understand what you want to achieve here.


> Actually, I have a better idea:  We'll give the rasterizer a special
> "single-step" state.  Rather than rasterizing automatically, it waits
> for you to send it pixels, and it uses those pixels as though they
> were the primary shade color.  Since it's single-step, the state can
> be saved, modified, or restored at any time, unlike some GPUs that
> lock up when you haven't sent them as much data as you said you would.

That sounds good and easy to deal with, both in hardware and software.

> > Also I would change this command to allow multi-plane formats,
> > as today most software systems deliver images not interleaved
> > but as three (or four) separate planes. But keep in mind
> > that there are subsampled planes that use 4:2:2 and 4:2:0
> > (which are easy to deal with) and other, more obscure formats
> > (which are not so easy to deal with). I would say, for practical
> > purposes, limit subsampling to 4:2:2 and 4:2:0 and let the others
> > be converted in software.
> 
> The host interface is going to have an elaborate format-conversion
> facility.  Everything has to end up 32-bits internally, so you can put
> this into a state where 8-bit pixels (packed) are converted to AAAA or
> XAAA (where X is programmable) formats.  YUV and other such things are
> options too.

Sure, but how do you do that? My ideal graphics card just takes
the two dimensions (X*Y), the subsampling used, the order of
the planes (YUV, YVU, RGB, BGR, and all variants with A), the stride,
one (or three/four) pointers to where the planes are located, and
a position where to show it on the screen. Of course it would
transfer the image atomically and thus not suffer from tearing.

But as this is not possible, some way to upload the image to
card memory first (converting it to the internal representation on
the fly) combined with triple buffering should be enough.
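
To show why I consider 4:2:2 and 4:2:0 easy to deal with, here is a
small sketch of the chroma plane size calculation the upload path
would need. 4:2:2 halves the chroma width, 4:2:0 halves both width and
height, and odd sizes round up. The type and function names are made
up for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for the subsampling modes under discussion. */
enum subsampling { SUB_444, SUB_422, SUB_420 };

struct plane_dims {
    uint32_t width;
    uint32_t height;
};

/*
 * Dimensions of one chroma plane for a luma plane of w x h pixels.
 * Halving rounds up so that odd sizes still cover the whole image.
 */
static inline struct plane_dims chroma_dims(uint32_t w, uint32_t h,
                                            enum subsampling s)
{
    struct plane_dims d = { w, h };

    if (s == SUB_422 || s == SUB_420)
        d.width = w / 2 + (w & 1u);   /* horizontal halving */
    if (s == SUB_420)
        d.height = h / 2 + (h & 1u);  /* vertical halving */

    return d;
}
```

For a 720x576 frame in 4:2:0 this gives two 360x288 chroma planes, so
the card only needs shifts and adds to find the plane sizes once it
knows the subsampling mode.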


> > Also maybe it would be an idea, if we use non-uniform sized packets,
> > to have a specialized packet to upload mouse pointer pixmap/masks.
> > On the other hand these can be stored in the card memory
> > and used by switching the pointing registers.
> 
> Changing the mouse pointer is VERY infrequent (relatively speaking). 
> It's also something asynchronous to other drawing operations that you
> want to happen immediately.  As such, I've decided it's going to be
> all PIO.

Ok.

                        Attila Kinali