I can't comment too deeply without treading into NDA land, but from public sources it's safe to say that the difficulty in programming the SPEs isn't so much in code generation (although automatic parallelization will continue to be a grail) as in data movement. The SPEs have only an explicit "cache": the local memory is very fast, but it doesn't share an address space with the PU. Instead, you DMA chunks back and forth as needed, relying on DMA bandwidth (and a sufficiency of channels) to fill relatively large chunks of "cache" explicitly rather than relying on automated fills at cache-line granularity. That makes you think about the machine architecture pretty much any time you design an algorithm to run on the SPE, which I guess puts it in line with other parallelization methods :-( Wherefrom, of course, comes the interesting systems work :-)
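
To make the explicit data movement concrete, here is a minimal C sketch of double buffering: fetch the next chunk into one local buffer while computing on the other. It is an illustration only, not the actual SPE programming interface. The names dma_get, dma_wait, CHUNK and sum_streamed are hypothetical, and memcpy stands in for an asynchronous DMA transfer so the sketch runs on any machine.

```c
#include <stddef.h>
#include <string.h>

#define CHUNK 1024  /* floats per transfer; hypothetical size, not a hardware limit */

/* Hypothetical stand-in for an asynchronous DMA "get". On real hardware this
   would start a transfer and return immediately; here it is a plain memcpy. */
static void dma_get(float *local, const float *main_mem, size_t n) {
    memcpy(local, main_mem, n * sizeof *local);
}

/* Hypothetical stand-in for waiting on the transfer into buffer `buf`.
   A no-op here, because the memcpy above has already completed. */
static void dma_wait(int buf) { (void)buf; }

/* Sum n floats that live in "main memory" by streaming them through two
   local buffers, so the next fetch can overlap with the current compute. */
float sum_streamed(const float *main_mem, size_t n) {
    static float local[2][CHUNK];   /* plays the role of SPE local memory */
    float total = 0.0f;
    size_t off = 0;
    size_t len = n < CHUNK ? n : CHUNK;
    int cur = 0;

    if (n == 0) return 0.0f;
    dma_get(local[cur], main_mem, len);          /* prime the first buffer */

    while (off < n) {
        size_t next_off = off + len;
        size_t next_len = 0;
        if (next_off < n) {
            /* Kick off the fetch of the next chunk into the other buffer. */
            next_len = (n - next_off) < CHUNK ? (n - next_off) : CHUNK;
            dma_get(local[cur ^ 1], main_mem + next_off, next_len);
        }
        dma_wait(cur);                           /* current chunk must be in */
        for (size_t i = 0; i < len; i++)
            total += local[cur][i];              /* compute on local data only */
        off = next_off;
        len = next_len;
        cur ^= 1;                                /* swap buffers */
    }
    return total;
}
```

Even in this toy form, the chunk size and buffer count are part of the algorithm's design rather than something the hardware decides for you.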

Paul

On 24-May-05, at 4:18 PM, Jack Johnson wrote:

On 5/24/05, Paul Lalonde <[EMAIL PROTECTED]> wrote:

The SPEs, of course, are the interesting part from the systems point
of view.


I'm a layman, so speaking completely out my posterior here, but I read
a paper somewhere that led me to believe that some of the vectorizing
techniques used on the Crays could be applied well to the Cell
processors.  True?

Not that I recall seeing a Cray port on sources.... ;)

-Jack

