Sorry, replying to self.

I just remembered other benefits of the partitioned bram based shader.

#1 you can save and restore frozen execution contexts to the framebuffer, allowing task switching by (re)using the shader's DMA unit.

#2 If you can reuse the same shader program and execution context for each fragment as you move across a scanline, I'm betting that the execution state will effectively cache the 'hot' section of the texture(s).

#3 If you then break each partition of the data memory into several 'pages', you can effectively pass arguments into an existing shader program by switching only one or two of the pages into the execution context.

#4 It would probably be possible to modify the instruction and data memories to be treated sort of like normal instruction and data caches; but I don't think doing that would be a good idea.
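
To make #1 and #3 a bit more concrete, here is a rough software model in Python, not hardware and not a definitive design. Everything in it is an assumption made up for illustration: the names (ShaderCore, switch_page, save_context, restore_context), the tiny page and register sizes, and the use of a plain list as the "framebuffer" with a slice copy standing in for the DMA unit.

```python
# Hypothetical model of a paged BRAM data memory with context save/restore.
# All names and sizes are illustrative assumptions, not a real design.

PAGE_WORDS = 4   # assumed page size in words (tiny, for illustration)
NUM_SLOTS = 2    # assumed number of page slots visible to the program
NUM_REGS = 4     # assumed register count


class ShaderCore:
    def __init__(self, num_pages=4):
        # Local data memory, modeled as a list of fixed-size pages.
        self.pages = [[0] * PAGE_WORDS for _ in range(num_pages)]
        self.pc = 0
        self.regs = [0] * NUM_REGS
        # slot -> physical page; the program only ever sees slot addresses.
        self.page_table = list(range(NUM_SLOTS))

    def load(self, addr):
        slot, offset = divmod(addr, PAGE_WORDS)
        return self.pages[self.page_table[slot]][offset]

    def store(self, addr, value):
        slot, offset = divmod(addr, PAGE_WORDS)
        self.pages[self.page_table[slot]][offset] = value

    def switch_page(self, slot, physical_page):
        # Point #3: pass new arguments by remapping a slot; nothing is copied.
        self.page_table[slot] = physical_page

    def save_context(self, framebuffer, base):
        # Point #1: freeze pc, registers, page table and the mapped pages
        # into the framebuffer (the slice write stands in for a DMA burst).
        words = [self.pc] + self.regs + self.page_table
        for page in self.page_table:
            words += self.pages[page]
        framebuffer[base:base + len(words)] = words
        return len(words)

    def restore_context(self, framebuffer, base):
        # Read the words back in the same order save_context wrote them.
        pos = base
        self.pc = framebuffer[pos]
        pos += 1
        self.regs = framebuffer[pos:pos + NUM_REGS]
        pos += NUM_REGS
        self.page_table = framebuffer[pos:pos + NUM_SLOTS]
        pos += NUM_SLOTS
        for page in self.page_table:
            self.pages[page] = framebuffer[pos:pos + PAGE_WORDS]
            pos += PAGE_WORDS
```

In this model, switching arguments costs one page-table write, while a full task switch costs a copy proportional to the number of mapped pages, which is roughly the trade-off I had in mind.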

Basically, the overall shader architecture I was last working on was a network of simple CPUs, each with a few K of almost-zero-latency local data and program memory (the size set by the BRAM size of the FPGA family). The programming model would be sort of like MPI, with a shared, incoherent, but fine-grain lockable framebuffer.
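
As a very loose sketch of that programming model, the Python below uses threads as stand-ins for the simple CPUs, queues as the MPI-style message links, and one lock per framebuffer tile as the fine-grain locking. The names (Framebuffer, add_to_tile, node, run_pipeline), the tile sizes, and the chain topology are all assumptions for illustration; the lack of coherence is not modeled, since no node caches framebuffer data here.

```python
# Hypothetical model: message-passing nodes sharing a tile-locked framebuffer.
# Names, sizes and topology are illustrative assumptions only.
import queue
import threading

TILE_COUNT = 4   # assumed lock granularity: one lock per tile
TILE_WORDS = 4   # assumed words per tile


class Framebuffer:
    def __init__(self):
        self.words = [0] * (TILE_COUNT * TILE_WORDS)
        self.locks = [threading.Lock() for _ in range(TILE_COUNT)]

    def add_to_tile(self, tile, value):
        # Only the touched tile is locked; other tiles stay available.
        with self.locks[tile]:
            base = tile * TILE_WORDS
            for i in range(TILE_WORDS):
                self.words[base + i] += value


def node(inbox, outbox, fb):
    local = []  # stands in for the node's private local data memory
    while True:
        msg = inbox.get()
        if msg is None:
            # Shutdown marker: pass it downstream, then stop.
            if outbox is not None:
                outbox.put(None)
            return
        tile, value = msg
        local.append(value)
        fb.add_to_tile(tile, value)
        if outbox is not None:
            outbox.put(msg)  # forward the work item to the next node


def run_pipeline(num_nodes, messages):
    fb = Framebuffer()
    links = [queue.Queue() for _ in range(num_nodes)]
    threads = []
    for rank in range(num_nodes):
        outbox = links[rank + 1] if rank + 1 < num_nodes else None
        t = threading.Thread(target=node, args=(links[rank], outbox, fb))
        t.start()
        threads.append(t)
    for msg in messages:
        links[0].put(msg)
    links[0].put(None)
    for t in threads:
        t.join()
    return fb.words
```

Calling run_pipeline(3, [(0, 1), (1, 2)]) sends two (tile, value) messages down a chain of three nodes, each of which adds the value into the named tile under that tile's lock.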

This does not sound like the direction OGP is going to go in, but hey typing this up got my juices going.

-John

--
John R. Culp
[email protected]
_______________________________________________
Open-graphics mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)
