Re: [Open-graphics] Shader kernels: How many local variables?

Timothy Normand Miller Wed, 07 Oct 2009 10:06:50 -0700

Let's say we just give each engine 256 registers.  (I'd rather it were
smaller.)  That means that each engine, which can run up to 8 threads
(default for now), needs 2048 registers or 4 block RAMs, just for the
register file.  Also, since the icache is shared across four (we're
defaulting to this for now) shaders, that means each shader requires
4.25 BRAMs.
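
As a rough check of the register-file arithmetic above, here is a minimal
sketch.  The 32-bit register width and the 16 Kbit of usable data per
block RAM are my assumptions, not figures stated in this thread, and the
function name is made up for illustration.

```python
# Sketch of the register-file sizing; REG_BITS and BRAM_DATA_BITS are assumptions.
REGS_PER_THREAD = 256        # from the text: 256 registers
THREADS_PER_ENGINE = 8       # from the text: up to 8 threads per engine
REG_BITS = 32                # assumed register width
BRAM_DATA_BITS = 16 * 1024   # assumed usable data bits per block RAM


def regfile_brams(regs_per_thread=REGS_PER_THREAD, threads=THREADS_PER_ENGINE):
    """Return how many block RAMs one engine's register file needs."""
    total_bits = regs_per_thread * threads * REG_BITS
    return -(-total_bits // BRAM_DATA_BITS)  # ceiling division


print(regfile_brams())      # 256 registers per thread
print(regfile_brams(128))   # what halving the register count would buy
```

Under those assumptions, 256 registers works out to the 4 BRAMs quoted
above, and halving the register count halves the BRAM cost.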

For sort-first, if we want the global dcache to be the same size as
one memory row (which is also arbitrary, because we do expect reads to
happen to other memory rows, from other surfaces), and we want to give
the texture engine the same amount (why not), then that's four BRAMs
for the global dcache.

We're going to need BRAMs for the memory and video system.  Currently
(IIRC), OGD1 uses small queues for PCI access to memory, but that only
works because PCI is really slow compared to the memory system.  With an
engine in there, we'll need a more substantial queueing system.

There are four memory controllers, and each one has:
- One 64-bit return queue for video head 0
- One 64-bit return queue for video head 1
- One 96-bit write/command queue
- One 64-bit read return queue

That adds up to 9 BRAMs per controller, totaling 36.
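
The 9-per-controller figure can be reproduced with a short sketch,
assuming each block RAM contributes a 32-bit-wide slice of a queue.
That port width is my assumption; the queue widths are the ones listed
above.

```python
# Sketch of the queue BRAM count; BRAM_PORT_BITS is an assumption.
QUEUE_WIDTHS_BITS = {
    "video head 0 return": 64,
    "video head 1 return": 64,
    "write/command": 96,
    "read return": 64,
}
BRAM_PORT_BITS = 32    # assumed width contributed by one block RAM
NUM_CONTROLLERS = 4    # from the text


def brams_for_width(width_bits):
    """Block RAMs needed side by side to build a queue of this width."""
    return -(-width_bits // BRAM_PORT_BITS)  # ceiling division


per_controller = sum(brams_for_width(w) for w in QUEUE_WIDTHS_BITS.values())
total_queue_brams = per_controller * NUM_CONTROLLERS

print(per_controller, total_queue_brams)
```

With a 32-bit slice per BRAM this gives 2 + 2 + 3 + 2 = 9 per controller
and 36 overall.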

I'm assuming that each shader will have one small (distributed RAM)
queue for writes and read requests and another for read return.
Writes and commands spill from the shader queues into the global one
to be processed.  We'll have to see how that random logic adds up.

It seems clear to me that we could combine some of the queues with the
caches.  We (probably) have to keep the video queues, but by combining
the caches with the read return queues, we end up with basically
double the cache space at no extra cost.

I'm missing things, I'm sure of it, but just going from these numbers,
if we have N BRAMs total, then we can fit in (N-36)/4.5 shaders,
rounded down.  For instance, in the Spartan 3 4000, which has 96 BRAMs,
(96-36)/4.5 = 13.3, so we can fit 13 shader engines.  (Assuming the
logic fits too.)  That's not too bad and will allow us to do a lot of
scalability testing.

Xilinx has some high-end Spartan 6 boards that IIRC are quite
expensive, but we might buy one for some extended testing.  The
largest Spartan 6 part has 268 BRAMs, so we can fit 51 shaders.
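
The shader-count estimate is easy to play with in a few lines.  This is
only a sketch of the formula above; the function and parameter names are
made up for illustration.  Note that the formula uses 4.5 BRAMs per
shader while the earlier paragraph says 4.25, so the per-shader cost is
left as a parameter.

```python
import math


def max_shaders(total_brams, fixed_brams=36, brams_per_shader=4.5):
    """Estimate how many shaders fit, going by block RAM count alone.

    fixed_brams is the memory-controller queue budget (36 in the text);
    brams_per_shader defaults to the 4.5 used in the formula.
    """
    available = total_brams - fixed_brams
    return max(0, math.floor(available / brams_per_shader))


print(max_shaders(96))    # Spartan 3 4000
print(max_shaders(268))   # largest Spartan 6
print(max_shaders(96, brams_per_shader=4.25))
```

With 4.5 BRAMs per shader this reproduces the 13 and 51 figures; with
4.25 the Spartan 3 4000 estimate goes up by one.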

Of course, all of this assumes that 256 registers is the right number.

On Tue, Oct 6, 2009 at 5:53 PM, Hugh Fisher wrote:
> Timothy Normand Miller wrote:
>>>
>>> The OpenGL spec requires at least 16 4-way vector attributes for
>>> vertex shaders, and at least 32 4-way vector varying values for
>>> fragment shaders.
>>
>> So, 128 regs, just for arguments.
>
> My bad. That's not a simultaneous load, since the vertex and fragment
> shaders don't need to run at the same time. (And in a sort-to-tiles
> architecture which you seem to be suggesting, all the vertex shading
> has to be done before fragment shading.)
>
> So the max number of incoming arguments is 32 for fragment shaders.
> The number of outgoing arguments is <= incoming for vertex shaders,
> and much < incoming for fragment shaders.
>
> And again assuming the target market is not high end gaming or HPC,
> you could easily aim for 8 arguments in registers and the rest
> passed in slower memory, like the MIPS calling conventions.
>
>>> For the fixed function pipeline the vertex shader is the more complex
>>> one, needing 4 argument vectors and enough working registers for a
>>> full matrix x vector transform and Gouraud lighting equation with one
>>> light source.
>>
>> Even in this case, I'm not sure how many scalars it translates into.
>
> There are sample GLSL shaders around that emulate the standard fixed
> function pipeline. I'll go through the code of one and try to figure
> this out.
>
> --
>        Hugh Fisher
>        CECS, ANU
>

-- 
Timothy Normand Miller
http://www.cse.ohio-state.edu/~millerti
Open Graphics Project
