Reply to the open-hardware list.

Vinicius Santos wrote:
On Dec 15, 2007 5:47 PM,  <[EMAIL PROTECTED]> wrote:

--- [EMAIL PROTECTED] wrote:
I am not speaking in generic terms; I am speaking about real-world examples of current complex vector shaders and pixel shaders.

I speak of the actual code which we intend to use; it contains a lot of vector and matrix operations.

--- We aren't trying to make a programmable shader with these SIMD units, are we? I think we can easily use the SIMD units in a pseudo-fixed pipeline.

Look at some previous posts on the subject:

http://article.gmane.org/gmane.comp.graphics.opengraphics/2445
http://article.gmane.org/gmane.comp.graphics.opengraphics/2553
http://article.gmane.org/gmane.comp.graphics.opengraphics/2461

These are two shader examples and an analysis of various functions used.

People might hate me for feeding an IMHO fruitless discussion

This is not a discussion.  It is NB using the rhetorical device commonly
called 'hit and run'.  I have made the mistake of trying to answer his
postings, but now realize that this is just a waste of my time.  It
started with an offhand remark about Sun having released the FGX RTL
code (VIS instructions). I guess that I will have to stop making such remarks when there are list members who seem interested mostly in finding a way to say that I am wrong about something.

Might I add here that there is no black and white here (except perhaps for the fact that RISC processors don't, by definition, use microcode). There are things wrong with every possible way that we might do this. That is what engineering is all about. You do not find the perfect solution to a problem, you choose what you think is the best solution from among various good solutions.

but previous threads (including those cited above) don't exactly show how "real world shaders" translate into scalar operations:

1-You have shader examples written in a high-level (OGSL) language

2-You have some compiled into a specific (arbvp1 or arbfp1) architecture assembly [1]

3-You have a dataflow profiling of DirectX shaders, I assume already
 compiled into the architecture assembly.

The way I see it, the device driver gets that arbvp assembly code and
translates it into your own architecture's code. So can arbvp be efficiently translated into a systolic/SIMD/multi-core architecture? (stalls, dependencies, etc.) [*]

Those examples are not directly related to our project.

In any case, dot products are vector operations and they don't have
any dependencies when run on a vector processor that has a wide enough word (might not be a necessary condition) and the MAC instruction. However, dependencies might be an issue with some of the code in ogmodel.cpp.
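As a rough illustration of the point about dot products and the MAC instruction, here is a minimal C sketch. The names mac, dot_mac and mat4_transform, and the 4-wide LANES value, are made up for this example and are not taken from ogmodel.cpp. In this scalar sketch the accumulator is carried from one MAC to the next; the four rows of the matrix transform are independent of each other, which is the part a wide vector unit could run in parallel.

```c
#include <assert.h>
#include <stddef.h>

#define LANES 4  /* assumed vector width: one x,y,z,w vector per word */

/* Hypothetical MAC primitive: acc + a*b, one multiplier per call. */
static float mac(float acc, float a, float b)
{
    return acc + a * b;
}

/* Dot product written as a chain of MAC operations. */
float dot_mac(const float *a, const float *b, size_t n)
{
    assert(a != NULL && b != NULL);
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++)
        acc = mac(acc, a[i], b[i]);
    return acc;
}

/* 4x4 row-major matrix times a 4-vector. Each output element is its
 * own dot product and does not read any other output element. */
void mat4_transform(const float m[LANES * LANES],
                    const float v[LANES],
                    float out[LANES])
{
    for (size_t row = 0; row < LANES; row++)
        out[row] = dot_mac(&m[row * LANES], v, LANES);
}
```

Whether the multiplies inside one dot product can also be issued together depends on the word width and on how the sum across lanes is done, which this sketch does not model.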

I think that OGSL, or C, can be translated into any of these architectures. The major issue is how much hardware it would require to implement it in various ways.

Using multiple SIMD processors controlled by microcode has the advantage that it is totally reconfigurable and you know how much hardware it will use. It also should be scalable. I have no doubt that a systolic array processor will run faster. The questions are how much faster it will run and how much hardware it will require.
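To make "SIMD processors controlled by microcode" a little more concrete, here is a toy C model. Everything in it is an assumption for illustration only: the three micro-operations, the 4-lane width, the 8-entry register file and the names micro_op, simd_unit and simd_run do not come from any actual design discussed on the list. The point is only that the function units stay fixed while the microprogram decides what they compute.

```c
#include <assert.h>
#include <stddef.h>

#define LANES 4   /* assumed SIMD width */
#define NREGS 8   /* assumed register file size */

/* Hypothetical micro-operations; not taken from any real design. */
typedef enum { UOP_MUL, UOP_ADD, UOP_MAC } uop_kind;

typedef struct {
    uop_kind op;
    unsigned dst, src_a, src_b;   /* register indices */
} micro_op;

typedef struct {
    float r[NREGS][LANES];        /* each register holds one value per lane */
} simd_unit;

/* Run a microprogram on one SIMD unit. Every lane executes the same
 * micro-operation on its own data. */
void simd_run(simd_unit *u, const micro_op *prog, size_t len)
{
    assert(u != NULL && prog != NULL);
    for (size_t pc = 0; pc < len; pc++) {
        const micro_op *m = &prog[pc];
        assert(m->dst < NREGS && m->src_a < NREGS && m->src_b < NREGS);
        for (size_t lane = 0; lane < LANES; lane++) {
            float a = u->r[m->src_a][lane];
            float b = u->r[m->src_b][lane];
            float *d = &u->r[m->dst][lane];
            switch (m->op) {
            case UOP_MUL: *d = a * b;      break;
            case UOP_ADD: *d = a + b;      break;
            case UOP_MAC: *d = *d + a * b; break;
            }
        }
    }
}
```

In this model the hardware cost is set by LANES, NREGS and the number of units, not by the length of the microprogram, which is one way to read the remark that you know how much hardware it will use.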

After that, can said architecture be implemented efficiently in an FPGA?
 (complexity, space, etc.)

There are two efficiency issues here: how well shader code maps onto
an architecture, and the complexity of said architecture. People are jumping from one issue to another, leaving questions unanswered.

Yes, that is 'hit and run' and I would very much like to have a coherent
discussion instead. I am not convinced of what the best solution is, so I would like to discuss it. I see no value in jumping from one issue to another to the point that I am sure that NB contradicted himself. I may have as well, which I would attribute to the fact that I am ill and easily confused -- can't seem to get the brain functioning fully. :-)

This whole discussion doesn't change a couple of facts:

1-As Timothy pointed out, there is already a tested model for OGA1.

Actually, there is tested C code (ogmodel.cpp).  IIUC, the
implementation of this in Verilog is not yet done.  I don't know if TM
has started on it yet.  He said that he had some 2D stuff running in
simulation.

It might be interesting if future versions were improvements on that (i.e., OGA2 adds more reconfigurability to the fixed pipeline, instead of being a whole new architecture).

There are certainly advantages of configurable hardware -- especially if
it is to be a custom ASIC.  IIUC, ATI & nVidia use configurable arrays
of processors.  3D-Labs switched from a fixed pixel pipeline (Oxygen) to
multiple processors.

2-OGD1 isn't released yet, so people can't hack with it to figure out
 its limits in any of the propositions.

Quite true, however the number of integer hardware multipliers (and
their width) in the Xilinx chip is known.  I think that this is going to
be the limiting factor, but it might turn out differently.
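A back-of-the-envelope way to see why the multiplier count could be the limiting factor is sketched below in C. The function names and the model itself are assumptions for illustration: it supposes every multiplier can start one multiply per clock, ignores stalls and operand width, and takes the multiplies-per-pixel figure as an input rather than deriving it from any real shader.

```c
#include <assert.h>
#include <stdint.h>

/* Cycles needed to issue mults_per_pixel multiplies when at most
 * hw_multipliers can start per cycle (assumes full pipelining, no stalls). */
uint64_t cycles_per_pixel(uint64_t mults_per_pixel, uint64_t hw_multipliers)
{
    assert(mults_per_pixel > 0);
    assert(hw_multipliers > 0);
    /* ceiling division */
    return (mults_per_pixel + hw_multipliers - 1) / hw_multipliers;
}

/* Upper bound on pixels per second at a given clock rate. */
uint64_t max_pixels_per_second(uint64_t clock_hz,
                               uint64_t mults_per_pixel,
                               uint64_t hw_multipliers)
{
    return clock_hz / cycles_per_pixel(mults_per_pixel, hw_multipliers);
}
```

With purely made-up figures of 16 multiplies per pixel, 4 multipliers and a 100 MHz clock, this bound works out to 25 million pixels per second. The real figures for the Xilinx chip would have to be substituted.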

3-Any of the models will have to be able to handle Compiz and high resolutions efficiently. That might be a limiting factor for programmable shaders (at least in an FPGA).

The number of hardware multipliers is always going to be a limiting
factor even if designing a custom ASIC because they take up a lot of
real estate on the chip and, therefore, consume a lot of power.

--
JRT
_______________________________________________
Open-hardware mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-hardware
