Hi Nicolas,

2012/5/28 Nicolas Boulay <[email protected]>:
> How could you be efficient on a fully scalar shader with a single
> decoder for 4 ALUs? How do you manage a register/memory bank with many
> ports? You need many ports to feed many scalar pipelines, but many
> ports mean slower accesses.
>
> GPUs use large register banks to avoid going to RAM as much as
> possible (32K registers for Fermi?).
>

The design is still a paper design at this point. A few things weren't
fully thought through. I was designing with a Spartan-6 FPGA as the
target hardware, so a lot of BRAM was to be used, along with flip-flops
in the fabric.

> I am trying to think about a very high-level instruction design to
> fill many scalar pipelines. GPUs usually have many data formats
> manipulated in registers (packed RGB, etc.). But why not also add
> square matrices of fixed size 2 to 4, diagonals (for complex and
> quaternion arithmetic), and vectors of "any" size (and arrays of
> vectors)?
>
> For example, multiplying an array of size-4 vectors by a single 4x4
> matrix could use many resources at the same time efficiently.
>

We had that discussion in the past. Scalar is more practical because it
uses less hardware than a vector architecture. That means better
utilisation of the hardware, at the cost of having to unroll vector
operations into scalar ones.

Regards,
André

>
> 2012/5/28 Andre Pouliot <[email protected]>:
>> Hi,
>> it's more of a traditional GPGPU design with some twists. The basic
>> design is a scalar engine. Multiple scalar engines are controlled by
>> a single instruction decoder pipeline. Each stage of the pipeline has
>> its own thread. For example, a pipeline with a depth of 8 would have
>> more than 8 programs running. This prevents pipeline flushes or
>> stalls caused by data dependencies. Each scalar engine runs its own
>> fragment. With 4 scalar engines and a pipeline depth of 8, you would
>> run 32 threads at the same time, with 4xN waiting, N being the
>> programs in the queue waiting for a time slot in the round robin.
>> The 32 threads could be controlled by 8 different programs.
>>
>> If I remember right, we were also talking about a network on chip
>> for scalability.
>>
>> Kenneth and I had some things well specified in a document. That doc
>> needs to be reorganized; it was quite a mess at the time, as I didn't
>> know how to write a good spec document. Still learning how. Also,
>> some things were still under discussion. We spent many hours
>> discussing the different options and trying to break them.
>>
>> For energy efficiency, I don't remember if we specified anything; I
>> would need to look at that document again.
>> I'll try to find all the material again and make it accessible for
>> people to look at. If someone wants access to contribute, it will be
>> on a case-by-case basis.
>>
>> Regards,
>> André
>>
>> 2012/5/28 Xiaohan Ma <[email protected]>:
>>> Hi Tim
>>>
>>> Can you share more info about OGA2, which Andre spec'd out? Is this a
>>> traditional GPU design, or are energy-efficient ideas involved?
>>>
>>> Thanks
>>> Xiaohan
>>>
>>> On May 27, 2012, at 2:19 PM, Timothy Normand Miller <[email protected]>
>>> wrote:
>>>
>>>> I'm not trying to start an argument as to whether or not "intellectual
>>>> property" is real. Maybe I'll blog about that some time. :) I
>>>> nevertheless need to point out that being an employee of a State
>>>> University of New York binds me in certain ways.
>>>>
>>>> http://research.binghamton.edu/Innovation/IntellectualProperty.php
>>>>
>>>> The bottom line for me is that I need to stay far away from any
>>>> cash-flow that might occur. And regarding the IP owned by Traversal,
>>>> Traversal is defunct, and the IP ownership fell back to me, Howard,
>>>> and Andy. We're ready to transfer that, and some responsible
>>>> facilitator(s) need(s) to take ownership (literal or figurative) and
>>>> see where the project can leverage it. I think that there needs to
>>>> still be some centralized entity who can relicense the IP without
>>>> having to ask permission from 1000 contributors.
>>>>
>>>> So, on to what the OGP can do...
>>>>
>>>> ARM has cornered the market on energy-efficient CPUs. And ARM is
>>>> entirely fabless. Maybe the OGP can corner the market on
>>>> energy-efficient GPUs.
>>>> The design would be dual-licensed GPL and
>>>> commercial, where for production purposes, all GPL viral-like
>>>> characteristics can be stripped in exchange for money, with the
>>>> understanding that breaking binary compatibility with the open design
>>>> (thereby potentially creating a closed architecture) will cost a LOT
>>>> more to license. Our chosen facilitator would handle the money and
>>>> fund whatever seems useful to fund. Mostly prototype hardware,
>>>> reference designs, donations to other projects, etc. Linux Fund took
>>>> over the Open Hardware Foundation, so we can use that.
>>>>
>>>> Of course, most companies that set out, a priori, to be fabless and
>>>> license IP for profit tend to fail disastrously. But we're not trying
>>>> to sustain a company on this. Indeed, the profit margin would have to
>>>> be painfully small in order to be the least bit competitive anyhow.
>>>> Our objective is to put a completely open GPU design out on the
>>>> market, and that isn't necessarily profitable.
>>>>
>>>> So just for fun and science, let's see what we can design. André
>>>> Pouliot and Kenneth Østby spec'd out a GPU shader engine design called
>>>> OGA2. Let's start there. The first thing to do is my favorite part,
>>>> which is to argue about architectural design decisions. Then we make
>>>> a C-based prototype to determine functional efficiency, then we code
>>>> it in Verilog and synthesize it for gate-level synthesis so we can
>>>> judge energy efficiency.
>>>>
>>>> Think about leveraging the brainpower of the FOSS community to design
>>>> a GPU that outperforms and is more energy-efficient than PowerVR. A
>>>> compelling-enough design would get market penetration. Eventually, it
>>>> would make its way from embedded systems into desktop systems and
>>>> supercomputers (GPGPU, etc.), and we would all benefit from having an
>>>> open architecture dominate in graphics.
>>>>
>>>> --
>>>> Timothy Normand Miller
>>>> http://www.cse.ohio-state.edu/~millerti
>>>> Open Graphics Project
>>>> _______________________________________________
>>>> Open-graphics mailing list
>>>> [email protected]
>>>> http://lists.duskglow.com/mailman/listinfo/open-graphics
>>>> List service provided by Duskglow Consulting, LLC (www.duskglow.com)
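
As a rough illustration of the interleaving André describes in the quoted message (4 scalar engines, pipeline depth 8, round-robin issue), here is a minimal Python sketch. The function names and the simple one-instruction-per-cycle model are assumptions made for illustration; they are not taken from the OGA2 spec. The idea it shows: if at least as many programs as pipeline stages take turns issuing, a program's previous instruction has left the pipeline before its next one enters, so no flush or stall is needed for data dependencies.

```python
# Hypothetical parameters, taken from the example numbers in the thread.
NUM_ENGINES = 4      # scalar engines sharing one instruction decoder
PIPELINE_DEPTH = 8   # pipeline stages, each holding a different program


def threads_in_flight(num_engines: int, pipeline_depth: int) -> int:
    """Each stage holds one program; each engine runs its own fragment of it."""
    return num_engines * pipeline_depth


def issue_schedule(num_programs: int, pipeline_depth: int, cycles: int) -> list[int]:
    """Return the program id issued on each cycle under plain round robin."""
    if num_programs < pipeline_depth:
        raise ValueError("need at least one program per pipeline stage")
    return [cycle % num_programs for cycle in range(cycles)]


def min_reissue_gap(schedule: list[int]) -> int:
    """Smallest number of cycles between two issues of the same program."""
    last_seen: dict[int, int] = {}
    gap = len(schedule)
    for cycle, program in enumerate(schedule):
        if program in last_seen:
            gap = min(gap, cycle - last_seen[program])
        last_seen[program] = cycle
    return gap


if __name__ == "__main__":
    print(threads_in_flight(NUM_ENGINES, PIPELINE_DEPTH))  # 32, as in the thread
    schedule = issue_schedule(num_programs=8, pipeline_depth=PIPELINE_DEPTH, cycles=64)
    # In this model the gap must be >= PIPELINE_DEPTH to avoid hazards.
    print(min_reissue_gap(schedule) >= PIPELINE_DEPTH)
```

With more than 8 programs in the rotation, the gap only grows, which matches the remark that extra programs simply wait for a time slot.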

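
To make the scalar-versus-vector trade-off from the top of the thread concrete, the sketch below unrolls Nicolas's example (an array of size-4 vectors multiplied by a single 4x4 matrix) into scalar multiply-adds and spreads the vectors over the scalar engines. It is only a sketch under assumptions: the function names, the round-robin assignment of vectors to engines, and the counting of one operation per multiply-add are all hypothetical and not from the OGA2 design.

```python
def mat4_vec4_scalar(matrix, vector):
    """Multiply a 4x4 matrix by a 4-vector using only scalar multiply-adds.

    Returns the result vector and the number of scalar operations issued.
    """
    result = []
    ops = 0
    for row in matrix:
        acc = 0.0
        for m, v in zip(row, vector):
            acc += m * v  # one scalar multiply-add
            ops += 1
        result.append(acc)
    return result, ops


def transform_array(matrix, vectors, num_engines=4):
    """Assign each vector to a scalar engine in turn (assumed policy)."""
    results = []
    ops_per_engine = [0] * num_engines
    for index, vector in enumerate(vectors):
        result, ops = mat4_vec4_scalar(matrix, vector)
        results.append(result)
        ops_per_engine[index % num_engines] += ops
    return results, ops_per_engine


if __name__ == "__main__":
    identity = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
    vectors = [[1, 2, 3, 4]] * 8
    results, ops_per_engine = transform_array(identity, vectors)
    print(results[0])
    print(ops_per_engine)
```

Each matrix-vector product costs 16 scalar multiply-adds here, so the unrolling André mentions trades longer instruction streams for simpler hardware, while independent vectors keep every engine busy.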