On Sun, Nov 27, 2005 at 12:34:26PM -0800, [EMAIL PROTECTED] wrote:

> Yes.  My assertion is that for most interesting real-world problems, the
> Cell isn't a good fit.

AI is an interesting real-world problem. AI is massively parallel.
Gaming is an interesting real-world problem. Gaming is massively parallel.
So are physical simulations. 

I would go further: most interesting problems are parallel. (In fact,
it's the strictly sequential problems that are pathological, not the
other way round, as most people still seem to think.)
 
> In the real world, I think this is mostly wrong.  Remember your constraints:
> programmers (especially good ones) are extremely rare and expensive.   Fast
> computers are not.  Unless your problem is truly massive (Google-scale), you

It used to be that you could ramp up performance through clock and CPU
advances alone. That stopped being true a couple of years ago; the limit
now is power dissipation per operation. Everybody is moving to multicores.
Where those can't reach, you have to make do with commodity clusters on
not-so-commodity interconnects (Myrinet, Quadrics, InfiniBand, SCI, etc.),
which give you latencies of a few microseconds on small messages.

So basically the times of the nearly-free lunch are over. Programmers
now have to do more work to make their code run faster on new machines.

> should be optimizing for programmer productivity rather than FLOPS / $.
> That means using as little parallelism as possible.  In most cases, that
> means no parallelism at all-- a modern CPU can handle an astonishing amount
> of work, if well programmed.

Why don't you see those 10 GHz cores right now? Why has memory bandwidth
been unable to keep up with CPU core clocks for many years now? It's
certainly not for lack of trying. The old approach worked well for a long
time, but it had to stop at some point.
 
> If you must use parallelism, I assert (and it sounds like you agree) that a
> message-based architecture is very often a better choice than a
> multi-threaded one, purely because it's easier to work with.

Absolutely. The multithreaded (SMP) approach doesn't scale. The illusion
of a global memory is a difficult one to maintain in a write-intensive
context, and it won't go beyond 16-64 cores. If you want >10^3-way
parallelism, you have to go message passing. There is no alternative.
 
> Multi-core architectures are obviously the future.  My objection is to the
> specifics of the Cell architecture, which makes trade-offs around symmetry
> and memory access which I (and many others) consider very sub-optimal for
> most real-world applications.

I agree it has a few warts. However, I'm happy to see a new design at all.
 
> You're right that current CPU architectures have serious issues with memory
> latency, and that managing those issues effectively is part of what
> separates good from mediocre programmers.  The problem is that those issues
> look to be much more severe on the Cell than on competing architectures.

Less so, because you don't have to worry about a cache. There is only
local memory and nonlocal memory. The local memory has a pleasantly flat
access model. The nonlocal memory likes to be operated on in burst mode
in order to achieve maximal throughput. Otherwise, it's unremarkable;
there is not much novelty so far.
 
> There's another real-world issue with the Cell, which has to do with
> lifecycle.  The very strong consensus in the gaming community is that to
> write a decent PS3 app, you'll need to throw away all your existing code and
> start from scratch.  The first generation of apps will probably be

That's very possibly true (except for PS2 emulation, which they will
have to ship in order for the platform to succeed).

> profoundly mediocre, as developers take time to get a feel for the new
> architecture.  The lifetime of the architecture will be about 5 years, at
> the end of which time all code written for it is almost certain to be a dead
> end.  That's painful but survivable if you're in the console games business.
> It's a disaster if you're in the AI business (unless your timeframe for a
> seed AI is < 5 years...)

I've always argued that you have to write for MPI today (this has
been true for more than a decade now). MPI has always worked well on
SMP (which is just a special case), but it has also worked on every other
interconnect. The opposite has never been true, because this universe
doesn't allow coherency of spatially distributed busy systems without a
lot of relativistic signalling to and fro, and a lot of waiting on
decisions.
 
> This is another example of Sony optimizing for the wrong problem-- they're
> maximizing theoretical FLOPS at the expense of real-world programmer
> productivity.

They had a budget. That budget buys you a certain amount of silicon real
estate. They had to make the Cell that way (warts and all) because they
could no longer make it single-core (arguably, this was true for the PS2
as well, which had even more warts -- a miracle it succeeded at all).
 
> Don't get me wrong-- I make my living writing massive distributed
> applications.  When you have to parallelize, you have to parallelize.  But
> you should do so in a very thoughtful and deliberate manner.

Absolutely no disagreement there.

-- 
Eugen* Leitl <a href="http://leitl.org">leitl</a>
______________________________________________________________
ICBM: 48.07100, 11.36820            http://www.leitl.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A  7779 75B0 2443 8B29 F6BE
