This is from Nick on the LLVM list:
---------- Forwarded message ----------
From: Nick Lewycky <[email protected]>
Date: Mon, Aug 22, 2011 at 4:42 AM
Subject: Re: [LLVMdev] Xilinx zynq-7000 (7030) as a Gallium3D LLVM FPGA target
To: Luke Kenneth Casson Leighton <[email protected]>

[-llvmdev] I think this stuff is really cool personally, though I don't have time to pursue it myself, and I don't think llvmdev is interested.

Luke Kenneth Casson Leighton wrote:
>
> On Sun, Aug 21, 2011 at 5:27 AM, Nick Lewycky <[email protected]> wrote:
>
>>>> The way in which Gallium3D targets LLVM is that it waits until it
>>>> receives the shader program from the application, then compiles that
>>>> down to LLVM IR.
>
>>> nick.... the Zynq-7000 series of Dual-Core Cortex-A9 800 MHz 28 nm CPUs
>>> have an on-board Series 7 Artix-7 or Kintex-7 FPGA (depending on the
>
>>> so - does that change things at all? :)
>>
>> No, because that doesn't have:
>> - nearly enough gates. Recall that a modern GPU has more gates than a
>> modern CPU, so you're orders of magnitude away.
>> - quite enough I/O bandwidth. Assuming off-chip TMDS/LVDS (sensible, given
>> that neither the ARM core nor the FPGA has a high enough clock rate),
>
> well the Series 7 has 6.6 Gb/s adjustable serial transceivers, and
> there are cases where people have actually implemented DVI / HDMI with
> that. but yes, a TFP410 would be a good idea :)
>
>> the limiting I/O bandwidth is between the GPU and its video memory.
>> That product claims it can do DDR3, which is not quite the same as GDDR5.
>
> ahh, given that OGP is creating a graphics card with a PCI (33 MHz)
> bus, 256 MB DDR2 RAM and a Spartan-3 (4000), i think that trying to set
> sights on "the absolute latest and greatest in GPU technology" is a
> leeetle ambitious :)

Nail on the head. The issue is how good a graphics card you can make, as free software+hardware, and how inexpensively.
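[Editor's note: a rough sanity check on the memory-bandwidth point above. This sketch estimates framebuffer traffic; the resolution, refresh rate, and overdraw figures are my assumptions, not numbers from the thread.]

```python
# Back-of-the-envelope framebuffer bandwidth estimate.
# Assumptions (not from the thread): 1280x1024 @ 60 Hz, 32-bit color,
# and an average overdraw of 4 (each pixel written ~4 times per frame).
width, height, hz, bytes_per_pixel = 1280, 1024, 60, 4
overdraw = 4

scanout = width * height * hz * bytes_per_pixel  # bytes/s just to refresh the display
render = scanout * overdraw                      # extra writes while rendering
total = scanout + render

# For scale: a single 32-bit DDR3-1066 channel peaks around 4.3 GB/s,
# while a GDDR5 card on a wide bus reaches well over 100 GB/s.
print(f"scanout: {scanout / 1e9:.2f} GB/s")
print(f"total:   {total / 1e9:.2f} GB/s")
```

Even this modest workload eats a noticeable slice of a single DDR3 channel before textures, geometry, or depth traffic enter the picture, which is the gap Nick is pointing at.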
If you're okay with being 5 years behind the latest and greatest nvidia/amd cards, then you've got a goal to pursue and lots of engineering tradeoffs you can make.

You probably know all this, but let me just recap what I know about GPUs, and maybe you can jump in where I'm wrong.

Post-fixed-pipeline hardware looks mostly like a CPU. You've got an instruction set with ALUs, branch operations, and some really funny load/store operations usually labelled "texture ops", even though what they load/store generally aren't textures. For a concrete example, here's the ISA for the R700: http://developer.amd.com/gpu_assets/R700-family_instruction_set_architecture.pdf with the actual instructions listed in chapter 9.

The trick, then, is that they pretend you've got "threaded" execution: a large number of copies of the same program, each apparently running in parallel. They *don't* actually run in parallel (as much as they appear to); instead, when a thread stalls (say, to do a memory lookup), another thread runs. It's just pipelined. Now, they do run in parallel a bit: a GPU might have 16 ALUs so it can run 16 threads in parallel, while having 2048 threads lined up ready to go upon the first stall.

In terms of hardware, this means you've just got simple CPU parts, copied over and over and over. The same ALUs, the same register files, with many copies. So that's step one, and if it were me, that's how I'd start: just build a GPU that looks like a modern GPU. Sure, the number of copies of these circuits you can have will be limited by the number of gates on your FPGA, but that's fine. You could try hand-coding all that in Verilog/VHDL, but there are tools which let you code in a higher-level language and lower that down to Verilog/VHDL for you. Some of them are even based on LLVM. :)

The next thing to realize is that this GPU seems to be missing things. I mean, OpenGL isn't a series of threads running in parallel with lots of pipelining. What gives?
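[Editor's note: the latency-hiding scheme Nick describes can be sketched as a toy simulator. The thread count, ALU count, latency, and the five-op "program" below are all invented for illustration; real hardware schedules at much finer granularity.]

```python
from collections import deque

# Toy latency-hiding scheduler: many threads, few ALUs (hypothetical numbers).
NUM_THREADS = 64    # threads queued up, ready to go
NUM_ALUS = 4        # threads actually issuing per cycle
MEM_LATENCY = 10    # cycles a "texture op" parks a thread

# Every thread runs the same tiny program: alu, mem, alu, mem, alu.
PROGRAM = ["alu", "mem", "alu", "mem", "alu"]

ready = deque(range(NUM_THREADS))  # thread ids ready to issue
pc = [0] * NUM_THREADS             # per-thread program counter
stalled = {}                       # thread id -> cycle it wakes up
done = set()
cycle = 0

while len(done) < NUM_THREADS:
    # Wake threads whose memory ops have completed.
    for t in [t for t, wake in stalled.items() if wake <= cycle]:
        del stalled[t]
        ready.append(t)
    # Issue up to NUM_ALUS ready threads this cycle; stalled ones
    # simply lose their slot to whoever is next in line.
    for _ in range(min(NUM_ALUS, len(ready))):
        t = ready.popleft()
        op = PROGRAM[pc[t]]
        pc[t] += 1
        if pc[t] == len(PROGRAM):
            done.add(t)
        elif op == "mem":
            stalled[t] = cycle + MEM_LATENCY  # park until the load returns
        else:
            ready.append(t)
    cycle += 1

print(f"finished {NUM_THREADS} threads in {cycle} cycles")
```

The point of the exercise: with enough threads queued up, the ALUs stay busy even though every individual thread spends most of its life waiting on memory.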
Well, on a modern graphics system, the drivers do everything else. Really. These guys know that you've got at least a 2.8 GHz dual-core system and aren't afraid to do everything they can get away with on your CPU. Older GPUs got away with a lot: for example, while the ISA has a branch instruction, all the threads in a program were required to branch the exact same way. Newer GPUs aren't so restrictive, and also have MMUs to prevent them from being, well, from being big gaping in-hardware security holes.

LLVM would be a good start to writing a driver for this. From our point of view, it's just a new CPU architecture. Given such a backend in LLVM, Gallium3D will probably be able to drive it without substantial changes. LLVM is not a *great* start for writing a driver for a GPU, because LLVM doesn't have any support for EPIC architectures, scheduling around pipeline stalls across multiple units, etc. However, I'm hopeful that this will improve before too long. People have written backends for GPUs before, but nobody has yet released their code because it would help their competitors with the hard stuff. Eventually someone will, and everyone will converge on improving that implementation.

Now, there are other ways to slice this problem. You could decide that the "graphics driver"-type stuff runs on the ARM core and the shaders get pumped into the FPGA. That's a fascinating design trade-off where you get to slash the number of FPGA gates you need by only reserving them for the shaders in use, unless the shaders are too big (This. Will. Happen.), in which case the new problem to solve is how to break them down into reusable pieces. I have no sense of how difficult or how performant this would actually be.

Bottom line: it sounds like a board like the Zynq is something you could try to implement a graphics card with, and see how fast it goes. Start by getting a baseline; then people can improve upon it once you've got something straightforward and working out the door.
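[Editor's note: the "all threads must branch the same way" restriction above is commonly worked around with execution masks: every lane steps through both arms of a divergent branch in lock-step, and a per-lane mask decides which results get committed. A minimal sketch, where the 8-lane width and the tiny "shader" are my assumptions:]

```python
# SIMT branch divergence via execution masks (hypothetical 8-lane machine).
# All lanes execute both arms of the if/else; the mask gates the writes.
LANES = 8
x = list(range(LANES))          # per-lane input: 0..7
out = [0] * LANES

cond = [v % 2 == 0 for v in x]  # per-lane branch condition

# "Then" arm: computed on every lane, committed only where cond is true.
then_val = [v * 10 for v in x]
# "Else" arm: also computed on every lane, committed where cond is false.
else_val = [v + 100 for v in x]

# Masked commit: each lane keeps the arm its condition selects.
for lane in range(LANES):
    out[lane] = then_val[lane] if cond[lane] else else_val[lane]

print(out)  # → [0, 101, 20, 103, 40, 105, 60, 107]
```

The cost is that divergent code pays for both arms, which is why the older lock-step restriction Nick mentions was cheap to build and why divergence is still something shader writers avoid.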
The first revision doesn't have to be fast. :)

Nick

_______________________________________________
Open-graphics mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)
