Hi Nicolas

2012/5/28 Nicolas Boulay <[email protected]>:
> How can you be efficient on a fully scalar shader with a single
> decoder for 4 ALUs? How do you manage the register/memory banks with
> many ports? You need many ports to feed many scalar pipelines, but
> many ports mean slower accesses.
>
> GPUs use large register banks to avoid going to RAM as much as
> possible (32K registers for Fermi?).
>
The design is still a paper design at this point. There were a few
things that weren't fully thought through. But I was working on the
design with a Spartan-6 FPGA as the target hardware, so a lot of BRAM
was to be used, along with flip-flops in the fabric.

> I am trying to think about a very high-level instruction design to
> fill many scalar pipelines. GPUs usually have many data formats that
> are manipulated in registers (packed RGB, etc.). But why not also add
> square matrices of fixed size 2 to 4, diagonals (for complex and
> quaternion calculations), and vectors of "any" size (and arrays of
> vectors)?
>
> For example, multiplying an array of vectors of size 4 by a single
> matrix of size 4 could use many resources at the same time
> efficiently.
>
We had that discussion in the past. Scalar is more practical in that
you use less hardware than with a vector architecture. That means
better utilisation of the hardware, but at the cost of having to
unroll vector operations into scalar ones.
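
To illustrate the unrolling cost, here is a minimal Python sketch of a
4x4 matrix times 4-vector product expanded into scalar operations. The
op names ("mul", "mad") and register names (m00..m33, x0..x3, y0..y3)
are assumptions made up for this example; they are not the OGA2
instruction set.

```python
def unroll_mat4_vec4():
    """Expand y = M * x (4x4 matrix, 4-vector) into scalar ops.

    Each op is a tuple (opcode, dst, src_a, src_b). The opcodes and
    register names are hypothetical, chosen only for illustration.
    """
    ops = []
    for row in range(4):
        dst = f"y{row}"
        # The first term initialises the accumulator: y = m * x
        ops.append(("mul", dst, f"m{row}0", "x0"))
        # The remaining terms are multiply-adds: y = y + m * x
        for col in range(1, 4):
            ops.append(("mad", dst, f"m{row}{col}", f"x{col}"))
    return ops


def run_scalar(ops, regs):
    """Execute the scalar ops one at a time on a copy of the registers."""
    regs = dict(regs)
    for opcode, dst, a, b in ops:
        if opcode == "mul":
            regs[dst] = regs[a] * regs[b]
        elif opcode == "mad":
            regs[dst] = regs[dst] + regs[a] * regs[b]
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return regs
```

One vector-level operation becomes 16 scalar operations here (4 per
output component). The four output components are independent of each
other, so they could in principle be spread over four scalar engines.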

Regards,
André
>
>
> 2012/5/28 Andre Pouliot <[email protected]>:
>> Hi,
>> it's more a traditional GPGPU design with some twists. The basic
>> design is a scalar engine. Multiple scalar engines are controlled by
>> a single instruction decoder pipeline. Each step of the pipeline has
>> its own thread. For example, for a pipeline with a depth of 8 you
>> would have more than 8 programs running. This prevents pipeline
>> flushes or stalls caused by data dependencies. Each scalar engine
>> runs its own fragment. For 4 scalar engines with a pipeline depth of
>> 8 you would run 32 threads at the same time, with 4xN waiting, N
>> being the number of programs queued for a time slot in the round
>> robin. The 32 threads could be controlled by 8 different programs.
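
The thread arithmetic above can be sketched in a few lines of Python.
This is only an illustration under simple assumptions (one thread slot
per pipeline stage, strict round-robin issue); the function names are
made up for the example and are not taken from the OGA2 spec.

```python
def threads_in_flight(num_engines, pipeline_depth):
    """Threads active at once: one per pipeline stage per scalar engine."""
    return num_engines * pipeline_depth


def schedule(pipeline_depth, cycles):
    """Thread slot that issues an instruction on each cycle (round robin)."""
    return [cycle % pipeline_depth for cycle in range(cycles)]


def min_issue_gap(slots):
    """Smallest number of cycles between two issues of the same slot."""
    last_seen = {}
    gap = None
    for cycle, slot in enumerate(slots):
        if slot in last_seen:
            distance = cycle - last_seen[slot]
            gap = distance if gap is None else min(gap, distance)
        last_seen[slot] = cycle
    return gap
```

With 4 engines and a depth of 8, threads_in_flight(4, 8) gives the 32
threads mentioned above. Under the round-robin assumption, a given
thread issues only once every 8 cycles, so its previous instruction has
already left the 8-stage pipeline before the next one enters, which is
why no flush or stall is needed for data dependencies within a thread.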
>>
>> If I remember right we were also talking about network on chip for 
>> scalability.
>>
>> Kenneth and I had some things that are well specced out in a
>> document. That doc needs to be reorganized; it was quite a mess at
>> the time, as I didn't know how to write a good spec document. Still
>> learning how. Also, some things were still under discussion. We spent
>> many hours discussing some of the different options and trying to
>> break them.
>>
>> For energy efficiency, I don't remember whether we specified
>> anything; I would need to look at that document again.
>> I'll try to find all the material again and make it accessible for
>> people to look at. If someone wants access to contribute, it will be
>> on a case-by-case basis.
>>
>> Regards,
>> André
>>
>> 2012/5/28 Xiaohan Ma <[email protected]>:
>>> Hi Tim
>>>
>>> Can you give more info about OGA2, which Andre spec'd out? Is this a
>>> traditional GPU design, or are energy-efficient ideas involved?
>>>
>>> Thanks
>>> Xiaohan
>>>
>>> On May 27, 2012, at 2:19 PM, Timothy Normand Miller <[email protected]> 
>>> wrote:
>>>
>>>> I'm not trying to start an argument as to whether or not "intellectual
>>>> property" is real.  Maybe I'll blog about that some time.  :)  I
>>>> nevertheless need to point out that being an employee of a State
>>>> University of New York binds me in certain ways.
>>>>
>>>> http://research.binghamton.edu/Innovation/IntellectualProperty.php
>>>>
>>>> The bottom line for me is that I need to stay far away from any
>>>> cash-flow that might occur.  And regarding the IP owned by Traversal,
>>>> Traversal is defunct, and the IP ownership fell back to me, Howard,
>>>> and Andy.  We're ready to transfer that, and some responsible
>>>> facilitator(s) need(s) to take ownership (literal or figurative) and
>>>> see where the project can leverage it.  I think that there needs to
>>>> still be some centralized entity who can relicense the IP without
>>>> having to ask permission from 1000 contributors.
>>>>
>>>> So, on to what the OGP can do...
>>>>
>>>> ARM has cornered the market on energy-efficient CPUs.  And ARM is
>>>> entirely fabless.  Maybe the OGP can corner the market on
>>>> energy-efficient GPUs.  The design would be dual-licensed GPL and
>>>> commercial, where for production purposes, all GPL viral-like
>>>> characteristics can be stripped in exchange for money, with the
>>>> understanding that breaking binary compatibility with the open design
>>>> (thereby potentially creating a closed architecture) will cost a LOT
>>>> more to license.  Our chosen facilitator would handle the money and
>>>> fund whatever seems useful to fund.  Mostly prototype hardware,
>>>> reference designs, donations to other projects, etc.  Linux Fund took
>>>> over the Open Hardware Foundation, so we can use that.
>>>>
>>>> Of course, most companies that set out, a priori, to be fabless and
>>>> license IP for profit tend to fail disastrously.  But we're not trying
>>>> to sustain a company on this.  Indeed, the profit margin would have to
>>>> be painfully small in order to be the least bit competitive anyhow.
>>>> Our objective is to put a completely open GPU design out on the
>>>> market, and that isn't necessarily profitable.
>>>>
>>>> So just for fun and science, let's see what we can design.  André
>>>> Pouliot and Kenneth Østby spec'd out a GPU shader engine design called
>>>> OGA2.  Let's start there.  The first thing to do is my favorite part,
>>>> which is to argue about architectural design decisions.  Then we make
>>>> a C-based prototype to determine functional efficiency, then we code
>>>> it in Verilog and synthesize it down to gate level so we can
>>>> judge energy efficiency.
>>>>
>>>> Think about leveraging the brainpower of the FOSS community to design
>>>> a GPU that outperforms and is more energy-efficient than PowerVR.  A
>>>> compelling-enough design would get market penetration.  Eventually, it
>>>> would make its way from embedded systems into desktop systems and
>>>> supercomputers (GPGPU, etc.), and we would all benefit from having an
>>>> open architecture dominate in graphics.
>>>>
>>>> --
>>>> Timothy Normand Miller
>>>> http://www.cse.ohio-state.edu/~millerti
>>>> Open Graphics Project
>>>> _______________________________________________
>>>> Open-graphics mailing list
>>>> [email protected]
>>>> http://lists.duskglow.com/mailman/listinfo/open-graphics
>>>> List service provided by Duskglow Consulting, LLC (www.duskglow.com)