Between the list problems and my ISP's mail meltdown (mail is taking up to 48 hours - or more - to get through to me) I'm not quite on top of replying to things just now :)
I copied this from the web archive to reply to since the email still hasn't arrived here...
Dan M wrote:
> http://news.bbc.co.uk/1/hi/technology/3180872.stm
>
> " Everyone would love a supercomputer but with a price tag of around
> $100m each they are not easy to come by.
>
> But in the United States staff and students at Virginia Tech have built
> one of the world's most powerful supercomputers for just $5m by
> plugging together hundreds of the latest computers from Apple.
>
> The project involved placing 1,100 brand new Apple G5 towers side by
> side, making it the world's most powerful "homebuilt" system.
>
> It is capable of 17.6 trillion floating point operations per second,
> with a combined storage capacity of 176 terabytes. "
> OK, I'm a bit confused here. I double-checked on Apple's website, and see
> that each tower has two 2 GHz processors. As a result, we see that we
> have roughly 4.4 THz of processing.
Cycles don't mean much in modern CPU design, where processors are designed to execute more than one operation per clock cycle.
> I know about array processing, but this machine's specs do not seem to
> include that. They claim it to be about 1.8x a single-processor P4. Putting
> that together, I cannot see how there would be 4 floating point
> operations per clock cycle.
I only did some undergraduate computer engineering many years ago (being a software guy), but I googled up the following explanation, which is at least mostly rightish :)
I presume the figures are worked out like this: the G5 has two 64-bit floating point units per chip, and the multiply-add instruction produces a combined floating point multiply *and* addition per clock cycle once the pipeline is filled. Thus Rpeak (which the quoted number is based on) is 4 floating point operations per clock cycle per chip. Rmax (what they get when they actually benchmark it on LINPACK or whatever) will probably be 50% to 75% of that, depending on the compiler (the IBM compiler is *very* good) and of course on how well all that clever hardware actually performs :)
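As a sanity check, the arithmetic above can be run in a few lines of Python. This is just a back-of-the-envelope sketch: the node count and clock speed come from the article, the 4 flops/cycle/chip from the Rpeak reasoning above, and the 50-75% efficiency range is the rough rule of thumb mentioned, not a measured figure.

```python
# Back-of-the-envelope check of the quoted supercomputer numbers.
nodes = 1100            # Apple G5 towers (from the article)
cpus_per_node = 2       # dual 2 GHz processors per tower
clock_hz = 2.0e9        # 2 GHz clock
fpus_per_cpu = 2        # two 64-bit floating point units per chip
flops_per_fpu = 2       # fused multiply-add = 2 flops per cycle

rpeak = nodes * cpus_per_node * clock_hz * fpus_per_cpu * flops_per_fpu
print(f"Rpeak = {rpeak / 1e12:.1f} Tflop/s")  # prints "Rpeak = 17.6 Tflop/s"

# Rmax (actual LINPACK result) is typically some fraction of Rpeak.
for eff in (0.50, 0.75):
    print(f"Rmax at {eff:.0%} efficiency = {rpeak * eff / 1e12:.1f} Tflop/s")
```

So the quoted 17.6 trillion flops is the theoretical peak, not what the machine would score on an actual benchmark run.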
> This must be a case of multiplying instead of
> dividing by 4, which is more typical: 4 clock cycles per CPU per operation.
Modern CPUs have multiple execution units all chugging away simultaneously. The G5 CPU has 2 floating point, 2 integer, 2 vector, and 2 load-store units, plus a branch unit and a condition-register unit. It can fetch up to eight instructions per clock cycle, dispatch up to five per cycle to the function units, issue up to eight per cycle from the function units, and retire results for up to five per cycle: more than 200 instructions can be 'in flight' at any moment. It's what they call a superscalar RISC design :)
--
William T Goodall
Mail : [EMAIL PROTECTED]
Web  : http://www.wtgab.demon.co.uk
Blog : http://radio.weblogs.com/0111221/
"I have always wished that my computer would be as easy to use as my telephone. My wish has come true. I no longer know how to use my telephone." - Bjarne Stroustrup
_______________________________________________
http://www.mccmedia.com/mailman/listinfo/brin-l
