The viability of this idea, however, really comes down to one
question: how slow is access to main RAM from the SPEs? Is it
significantly slower than on a traditional modern computer? If
so, the idea may not be viable (unless one is doing simple GP/BOA
problems where the fitness function can fit inside the SPE's 256KB
local store, which is not generally the case with AGI-related
evolutionary learning problems).
I've seen a figure of 20 gigabytes/second for Cell main memory bandwidth. I don't know whether that figure is correct, but if it is, then while (obviously, necessarily) slower than on-chip memory, it's faster than the main memory bandwidth of any other processor I know of.
I'm guessing main memory _latency_ isn't going to be amazingly hot, given that Cell is designed primarily for vector-style workloads. (Of course, main memory latency isn't very good on any modern processor, though it's less bad on some (e.g. Athlon-64/Opteron) than others.)
Assuming you're clever about overlapping the fetch of one chunk of data with the processing of another, I think performance will in large part be determined by whether you can arrange your data in long sequential chunks at contiguous addresses - i.e. you slurp many kilobytes in one go before moving somewhere else, rather than fetching a couple of words, following a random pointer to somewhere else, fetching another couple of words, and so on.
Or (roughly speaking, for illustration purposes only, you need not remind me why this is not entirely accurate): Fortran-style arrays = fast, Lisp-style linked lists = slow.
Of course, this is to a considerable extent going to be true of all processors for the foreseeable future, so work done to optimize your data structures for this won't be wasted even if you end up not using Cell.
- Russell