Jim,

If you or anyone else on this list is interested in learning more about the
Anton architecture, there are a bunch of links here:
http://www.deshawresearch.com/publications.html

There are a couple that give good descriptions of the Anton architecture. I
read most of the computer-related ones over the summer. Yes, that's my idea
of light summer reading!

Prentice

On 12/22/2011 12:33 PM, Lux, Jim (337C) wrote:
> That's an interesting approach of combining ASICs with FPGAs. ASICs will
> blow the doors off anything else in a FLOP/Joule, FLOPS/kg, or FLOPS/dollar
> contest, for the tasks the ASIC is designed for. FPGAs to handle the
> routing/sequencing/variable parts of the problem and ASICs to do the
> crunching is a great idea. It's sort of the same idea as including DSP or
> PowerPC cores on a Xilinx FPGA, at a more macro scale.
> (And of interest in the HPC world, since early 2nd-generation Hypercubes
> from Intel used Xilinx FPGAs as their routing fabric.)
>
> The challenge with this kind of hardware design is PWB design. Sure, you
> have 1100+ pins coming out of that FPGA; now you have to route them
> somewhere, and do it in a manufacturable board. I've worked recently with
> a board that had 22 layers, and we were at the ragged edge of tolerances
> with the close-pitch column grid array parts we had to use.
>
> I would expect the clever folks at DE Shaw did an integrated design with
> their ASIC: make the ASIC pinouts line up with the FPGAs' and make the
> routing problem simpler.
>
> On 12/22/11 8:53 AM, "Prentice Bisbal" <[email protected]> wrote:
>
>> Just for the record: I'm only the messenger. I noticed a
>> not-insignificant number of booths touting FPGAs at SC11 this year, so I
>> reported on it. I also mentioned other forms of accelerators, like GPUs
>> and Intel's MIC architecture.
>>
>> The Anton computer architecture isn't just an FPGA; it also has
>> custom-designed processors (ASICs). The ASICs handle the parts of the
>> molecular dynamics (MD) algorithms that are well understood and unlikely
>> to change, and the FPGAs handle the parts of the algorithms that may
>> change or might have room for further optimization.
>>
>> As far as I know, only 8 or 9 Antons have been built. One is at the
>> Pittsburgh Supercomputing Center (PSC); the rest are for internal use at
>> DE Shaw. A single Anton consists of 512 cores and takes up 6 or 8 racks.
>> Despite its small size, it's orders of magnitude faster at doing MD
>> calculations than even supercomputers like Jaguar and Roadrunner with
>> hundreds of thousands of processors. So overall, Anton is several orders
>> of magnitude faster than a supercomputer based on general-purpose
>> processors, and I'm sure it uses a LOT less power. I don't think the
>> Antons are clustered together, so I'm pretty sure the published
>> performance on MD simulations is for a single Anton with 512 cores.
>>
>> Keep in mind that Anton was designed to do only one thing: MD. It
>> probably can't even run LINPACK, and if it did, I'm sure its score would
>> be awful. Also, the designers cut corners where they knew they safely
>> could, like using fixed-precision (or is it fixed-point?) math, so the
>> hardware design is only half the story in this example.
>>
>> Prentice
>>
>> On 12/22/2011 11:27 AM, Lux, Jim (337C) wrote:
>>> The problem with FPGAs (and I use a fair number of them) is that you're
>>> never going to get the same picojoules-per-bit-transition kind of power
>>> consumption that you do with a purpose-designed processor. The extra
>>> logic needed to make it "reconfigurable", and the physical junction
>>> sizes as well, make it so.
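To put a rough number on that picojoules-per-transition point, here is a
back-of-envelope sketch in C of the switching energy E ~ 1/2 * C * V^2 for a
single node. The capacitance and supply-voltage figures are made up purely
for illustration; they aren't measurements of any real ASIC or FPGA.

    /* Back-of-envelope switching energy per transition, E = 1/2 * C * V^2.
     * All component values below are assumed, illustrative numbers only. */
    #include <stdio.h>

    static double energy_fj(double c_farads, double v_volts)
    {
        /* 0.5 * C * V^2 gives joules; scale to femtojoules */
        return 0.5 * c_farads * v_volts * v_volts * 1e15;
    }

    int main(void)
    {
        /* Hypothetical loads: a hard-wired ASIC gate drives a short local
         * wire, while the same logical transition in an FPGA also charges
         * pass transistors and long programmable routing, so its effective
         * load capacitance is much larger. */
        double c_asic = 5e-15;   /* 5 fF effective load (assumed) */
        double c_fpga = 50e-15;  /* 50 fF effective load (assumed) */
        double vdd    = 1.0;     /* 1.0 V core supply (assumed) */

        printf("ASIC-like node: %.3f fJ per transition\n", energy_fj(c_asic, vdd));
        printf("FPGA-like node: %.3f fJ per transition\n", energy_fj(c_fpga, vdd));
        return 0;
    }

With those assumed numbers the FPGA-style node costs about 10x the energy per
transition, and a bit that toggles many such nodes on its way through the
routing fabric is how you get to the picojoules-per-bit scale Jim mentions.
The real ratio depends entirely on the process and the design.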
>>>
>>> What you will find is that on certain kinds of problems, you can
>>> implement a more efficient algorithm in an FPGA than you can in a
>>> conventional processor or GPU. So, for that class of problem, the FPGA
>>> is a winner (things lending themselves to fixed-point systolic-array
>>> type processes are good candidates).
>>>
>>> Bear in mind also that while an FPGA may have, say, a 10-million-gate
>>> equivalent, any given practical design is going to use a small fraction
>>> of those gates. Fortunately, most of those unused gates aren't toggling,
>>> so they don't consume clock-related power, but they do consume leakage
>>> current, so the whole clock rate vs. core voltage trade-off winds up a
>>> bit different for FPGAs.
>>>
>>> The biggest problem with FPGAs is that they are difficult to write
>>> high-performance software for. With FORTRAN on conventional, vectorized,
>>> and pipelined processors, we've got 50 years of compiler-writing
>>> expertise, real high-performance libraries, and literally millions of
>>> people who know how to code in FORTRAN or C or something, so if you're
>>> looking for the highest-performance coders, even at the 4-sigma level,
>>> you've got a fair number to choose from. For numerical computation in
>>> FPGAs, not so many. I'd guess that a large fraction of FPGA developers
>>> are doing one of two things: 1) digital signal processing, flow-through
>>> kinds of stuff (error-correcting codes, compression/decompression,
>>> crypto); 2) bus interface and data handling (PCI bus, disk drive
>>> controllers, etc.).
>>>
>>> Interestingly, even with the relative scarcity of FPGA developers
>>> versus conventional CPU software developers, the average salaries
>>> aren't that far apart. The distribution for "generic coders" is wider
>>> (particularly on the low end; barriers to entry are lower for C, Java,
>>> what-have-you code monkeys), but there are very, very few people making
>>> more than, say, 150-200k/yr doing either (except in a few anomalous
>>> industries where compensation is higher than normal in general, and
>>> leaving out "equity participation" type deals).
>>>
>>> On 12/22/11 7:42 AM, "Prentice Bisbal" <[email protected]> wrote:
>>>
>>>> On 12/22/2011 09:57 AM, Eugen Leitl wrote:
>>>>> On Thu, Dec 22, 2011 at 09:43:55AM -0500, Prentice Bisbal wrote:
>>>>>
>>>>>> Or if your German is rusty:
>>>>>>
>>>>>> http://www.zdnet.com/blog/computers/amd-radeon-hd-7970-graphics-card-launched-benchmarked-fastest-single-gpu-board-available/7204
>>>>>
>>>>> Wonder what kind of response will be forthcoming from nVidia,
>>>>> given developments like
>>>>> http://www.theregister.co.uk/2011/11/14/arm_gpu_nvidia_supercomputer/
>>>>>
>>>>> It does seem that x86 is dead, despite good Bulldozer performance
>>>>> in Interlagos:
>>>>>
>>>>> http://www.heise.de/newsticker/meldung/AMDs-Serverprozessoren-mit-Bulldozer-Architektur-legen-los-1378230.html
>>>>>
>>>>> (Engage dekrautizer of your choice.)
>>>>>
>>>> At SC11, it was clear that everyone was looking for ways around the
>>>> power wall. I saw 5 or 6 different booths touting the use of FPGAs for
>>>> improved performance/efficiency. I don't remember there being a single
>>>> FPGA booth in the past.
>>>> Whether the accelerator is a GPU, FPGA, GRAPE, Intel MIC, or something
>>>> else, I think it's clear that HPC architecture is going to change
>>>> radically in the next couple of years, unless some major breakthrough
>>>> occurs for commodity processors.
>>>>
>>>> I think DE Shaw Research's Anton computer, which uses FPGAs and custom
>>>> processors, is an excellent example of what the future of HPC might
>>>> look like.
>>>>
>>>> --
>>>> Prentice

_______________________________________________
Beowulf mailing list, [email protected] sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf
