This is from Nick on the LLVM list:

---------- Forwarded message ----------
From: Nick Lewycky <[email protected]>
Date: Mon, Aug 22, 2011 at 4:42 AM
Subject: Re: [LLVMdev] Xilinx zynq-7000 (7030) as a Gallium3D LLVM FPGA target
To: Luke Kenneth Casson Leighton <[email protected]>


[-llvmdev]

I think this stuff is really cool personally, though I don't have time
to pursue it myself, and I don't think llvmdev is interested.

Luke Kenneth Casson Leighton wrote:
>
> On Sun, Aug 21, 2011 at 5:27 AM, Nick Lewycky<[email protected]>  wrote:
>
>>>> The way in which Gallium3D targets LLVM is that it waits until it
>>>> receives the shader program from the application, then compiles that
>>>> down to LLVM IR.
>
>>>  nick.... the Zynq-7000 series of Dual-Core Cortex-A9 800MHz 28nm CPUs
>>> have an on-board Series 7 Artix-7 or Kintex-7 FPGA (depending on the
>
>>>  so - does that change things at all? :)
>>
>> No, because that doesn't have:
>>  - nearly enough gates. Recall that a modern GPU has more gates than a
>> modern CPU, so you're orders of magnitude away.
>>  - quite enough I/O bandwidth. Assuming off-chip TMDS/LVDS (sensible, given
>> that neither the ARM core nor the FPGA has a high enough clock rate),
>
>  well the Series 7 has 6.6Gb/s adjustable Serial Transceivers, and
> there are cases where people have actually implemented DVI / HDMI with
> that.  but yes, a TFP410 would be a good idea :)
>
>> the limiting I/O bandwidth is between the GPU and its video memory. That
>> product claims it can do DDR3, which is not quite the same as GDDR5.
>
>  ahh, given that OGP is creating a Graphics Card with a PCI (33MHz)
> bus, 256MB DDR2 RAM and a Spartan 3 (4000), i think that trying to set
> sights on "the absolute latest and greatest in GPU Technology" is a
> leeetle ambitious :)

Nail on the head. The issue is how good a graphics card you can make,
as free software+hardware, and how inexpensively. If you're okay with
being 5 years behind the latest and greatest nvidia/amd cards, then
you've got a goal to pursue and lots of engineering tradeoffs you can
make.

You probably know all this, but let me just recap what I know about
GPUs, and maybe you can jump in where I'm wrong.

Post-fixed-pipeline hardware looks mostly like a CPU. You've got an
instruction set with ALUs, branch operations, and some really funny
load/store operations usually labelled "texture ops", even though what
they load/store often isn't actually textures. For a concrete example,
here's the ISA for the R700:

http://developer.amd.com/gpu_assets/R700-family_instruction_set_architecture.pdf

with the actual instructions listed in chapter 9. The trick, then, is
that they pretend you've got "threaded" execution: a large number of
copies of the same program, each of which appears to run in parallel.
They *don't* all run in parallel (as much as they appear to); instead,
when a thread stalls (say, to do a memory lookup), another thread runs
in its place. It's just pipelined.

Now, they do run in parallel, a bit. A GPU might have 16 ALUs so it
can run 16 threads in parallel, while having 2048 threads lined up
ready to go upon the first stall. In terms of hardware, this means
that you've just got simple CPU parts, copied over and over and over.
The same ALUs, the same register files, with many copies.
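
To make that scheduling model concrete, here's a toy C simulation of
it: a handful of ALUs draining a big pool of thread contexts, with any
thread that stalls on a memory fetch parked until its data comes back.
Every number in it (16 ALUs, 2048 contexts, a 100-cycle latency, "every
other op is a fetch") is made up for illustration; it isn't modelled on
any real GPU.

/* Toy model of GPU latency hiding: NUM_ALUS threads really execute each
 * cycle, drawn from a much larger pool of contexts; a context that issues
 * a "texture fetch" is parked for MEM_LATENCY cycles while others run. */
#include <stdio.h>

#define NUM_ALUS     16      /* threads that actually execute per cycle  */
#define NUM_THREADS  2048    /* contexts queued up, ready to hide stalls */
#define MEM_LATENCY  100     /* cycles a memory lookup takes             */
#define WORK_OPS     8       /* ALU ops each thread must retire in total */

struct ctx {
    int ops_done;            /* ALU ops already retired                  */
    int ready_at;            /* cycle when the pending fetch comes back  */
};

static struct ctx pool[NUM_THREADS];   /* zero-initialised thread pool   */

int main(void)
{
    int finished = 0, cycle = 0;

    while (finished < NUM_THREADS) {
        int issued = 0;
        /* Round-robin over the pool; issue to at most NUM_ALUS ready threads. */
        for (int i = 0; i < NUM_THREADS && issued < NUM_ALUS; i++) {
            if (pool[i].ops_done >= WORK_OPS || pool[i].ready_at > cycle)
                continue;                     /* done, or waiting on memory */
            pool[i].ops_done++;
            issued++;
            if (pool[i].ops_done == WORK_OPS)
                finished++;                   /* this thread is finished    */
            else if (pool[i].ops_done % 2 == 0)
                pool[i].ready_at = cycle + MEM_LATENCY;  /* pretend fetch   */
        }
        cycle++;
    }
    printf("%d threads on %d ALUs finished in %d cycles\n",
           NUM_THREADS, NUM_ALUS, cycle);
    return 0;
}

With the pool that deep, the fetch latency is almost entirely hidden;
shrink NUM_THREADS down to a few dozen and the cycle count balloons,
which is exactly the effect the big thread pool exists to avoid.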

So that's step one, and if it were me, that's how I'd start. Just
build a GPU that looks like a modern GPU. Sure, the number of copies
of these circuits you can have will be limited by the # of gates on
your FPGA, but that's fine. You could try hand-coding all that in
Verilog/VHDL, but there are tools which let you code in a higher-level
language and lower it down to Verilog/VHDL for you. Some of them are
even based on LLVM. :)
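
For a sense of what those higher-level flows consume, here's the kind
of plain-C loop you might hand to a C-to-gates tool instead of writing
the datapath by hand. The alpha-blend kernel is just an illustrative
example of mine; no particular vendor tool or pragma set is assumed,
and a real flow would want its own annotations for unrolling and
pipelining.

/* One scanline of src-over alpha blending on RGBA8888 pixels:
 * dst = src*alpha + dst*(1 - alpha), done per 8-bit channel.
 * A simple data-parallel loop like this is the shape an HLS tool can
 * unroll and pipeline into a small fixed-function unit. */
#include <stdint.h>

void blend_scanline(const uint32_t *src, uint32_t *dst, int n)
{
    for (int i = 0; i < n; i++) {
        uint32_t s = src[i], d = dst[i];
        uint32_t a = s >> 24;                     /* source alpha, 0..255 */
        uint32_t out = 0;

        for (int c = 0; c < 3; c++) {             /* blend R, G and B     */
            uint32_t sc = (s >> (c * 8)) & 0xff;
            uint32_t dc = (d >> (c * 8)) & 0xff;
            uint32_t bc = (sc * a + dc * (255 - a)) / 255;
            out |= bc << (c * 8);
        }
        dst[i] = out | (0xffu << 24);             /* keep the result opaque */
    }
}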

The next thing to realize is that this GPU seems to be missing things.
I mean, OpenGL isn't a series of threads running in parallel with lots
of pipelining. What gives? Well, on a modern graphics system, the
drivers do everything else. Really. These guys know that you've got at
least a 2.8GHz dual-core system and aren't afraid to do everything
they can get away with on your CPU.

Older GPUs got away with a lot. For example, while the ISA has a
branch instruction, all the threads in a program were required to
branch the exact same way. Newer GPUs aren't so restrictive, and also
have MMUs to prevent them from being, well, big gaping in-hardware
security holes.
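
For what it's worth, the classic way a compiler copes with "everyone
must branch the same way" is if-conversion: emit both sides of the
conditional and pick a result per thread with a select, so control
flow stays uniform. Here's a generic C sketch of the idea; it isn't
modelled on any particular GPU's codegen, and the toy shade functions
are invented purely for illustration.

/* What the shader author wrote: a per-pixel branch, which would diverge
 * because different threads see different values of x. */
float shade_branchy(float x)
{
    if (x > 0.5f)
        return x * 2.0f;                  /* "expensive" path */
    else
        return x + 0.1f;                  /* "cheap" path     */
}

/* What gets emitted for hardware that can't diverge: both paths are
 * computed unconditionally, and a per-thread predicate selects the
 * result, so every thread follows exactly the same control flow. */
float shade_selected(float x)
{
    float then_val = x * 2.0f;
    float else_val = x + 0.1f;
    int   pred     = (x > 0.5f);          /* per-thread predicate        */
    return pred ? then_val : else_val;    /* a select, not a real branch */
}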

LLVM would be a good start to writing a driver for this. From our
point of view, it's just a new CPU architecture. Given such a backend
in LLVM, Gallium3D will probably be able to drive it without
substantial changes.

LLVM is not a *great* start for writing a driver for a GPU because
LLVM doesn't have any support for EPIC architectures, scheduling
around pipeline stalls across multiple units, etc. However, I'm
hopeful that this will improve before too long. People have written
backends for GPUs before but nobody has yet released their code
because it would help their competitors with the hard stuff.
Eventually someone will, and everyone will converge on improving that
implementation.

Now, there's other ways to slice this problem. You could decide that
the "graphics driver"-type stuff runs on the ARM core and the shaders
get pumped into the FPGA. That's a fascinating design trade-off where
you get to slash the number of FPGA gates you need by only reserving
them for the shaders in use, unless the shaders are too big (This.
Will. Happen.) in which case the new problem to solve is how to break
them down into reusable pieces. I have no sense of how difficult or
how performant this would actually be.

Bottom line, it sounds like a board like the Zynq is something you
could try to implement a graphics card with, and see how fast it
goes. Start by getting a baseline; then people can improve upon it
once you've got something straightforward and working out the door.
The first revision doesn't have to be fast. :)

Nick
_______________________________________________
Open-graphics mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)
