On Wednesday, February 08, 2012 06:15:01 PM Mark Hahn wrote:
> > The APU concept has a few interesting points but certainly also a few
> > major problems (when comparing it to a cpu + stand alone gpu setup):
> >
> >  * Memory bandwidth to all those FPUs
>
> well, sorta. my experience with GP-GPU programming today is that your
> first goal is to avoid touching anything offchip anyway (spilling, etc),
> so I'm not sure this is a big problem. obviously, the integrated GPU
> is a small slice of a "real" add-in GPU, so needs proportionately
> less bandwidth.

Well, yes, you want to avoid touching memory on a GPU (just as you do on a
CPU). But just as you can't completely avoid it on a CPU, you can't on a GPU
either. On a current CPU socket you see maybe 20 GB/s and 50 GF, and the GPU,
which is much faster flop-wise, is also a lot faster in memory access
(>200 GB/s).

Now, I admit I'm not a GPU programmer, but are you saying those 200 GB/s
aren't needed? My assumption was that what holds for CPU codes (they depend
on cache for performance but still need good memory bandwidth) holds on GPUs
too.

Anyway, my point, I guess, was mostly that it's a lot easier to sort out
hundreds of gigabytes per second to memory on a device with RAM directly on
the PCB than on a server socket.

Also, if the APU is a "small slice of a real GPU" then I question the point
(not much GPU power per classic core or per total system footprint).

...

> I think the real question is whether someone will produce a minimalist
> APU node. since Llano has on-die PCIE, it seems like you'd need only
> APU, 2-4 dimms and a network chip or two. that's going to add up to
> very little beyond the APU's 65 or 100W TDP... (I figure 150/node
> including PSU overhead.)

I think anything beyond early testing is a fair bit into the future. For the
APU to become interesting I think we need a few (or all) of:

 * Memory shared with the CPU in some usable way (did not say the c-word..)
 * A proper number-crunching version (ECC...)
 * A fairly high-TDP part on a socket with good memory bandwidth
 * Noticeably better "host to device" bandwidth and, even more, latency

And don't get me wrong, I'm not saying the above is particularly unlikely...

/Peter
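
To put rough numbers on the bandwidth-per-flop point, here is a minimal
sketch that turns the 20 GB/s and 50 GF socket figures quoted above into a
"bytes per flop" balance and a simple roofline-style bound. The >200 GB/s GPU
figure is also from the post, but no GPU flop rate is given there, so
GPU_PEAK_GFLOPS below is a purely hypothetical placeholder, and the function
names are made up for illustration.

```python
def bytes_per_flop(bandwidth_gb_s, peak_gflops):
    """Machine balance: bytes of memory traffic available per flop at peak."""
    return bandwidth_gb_s / peak_gflops


def attainable_gflops(bandwidth_gb_s, peak_gflops, flops_per_byte):
    """Roofline-style bound: a kernel doing `flops_per_byte` flops per byte
    moved to or from memory cannot exceed either the compute peak or
    bandwidth * intensity."""
    return min(peak_gflops, bandwidth_gb_s * flops_per_byte)


# Figures quoted in the post for a current CPU socket.
CPU_BW_GB_S = 20.0
CPU_PEAK_GFLOPS = 50.0

# GPU bandwidth is the post's ">200 GB/s" taken at its lower bound.
GPU_BW_GB_S = 200.0
# ASSUMPTION: hypothetical placeholder, not a number from the post.
GPU_PEAK_GFLOPS = 500.0

if __name__ == "__main__":
    for name, bw, peak in [
        ("CPU socket", CPU_BW_GB_S, CPU_PEAK_GFLOPS),
        ("GPU (assumed peak)", GPU_BW_GB_S, GPU_PEAK_GFLOPS),
    ]:
        balance = bytes_per_flop(bw, peak)
        # A kernel doing 1 flop per byte is bandwidth-bound on both devices.
        bound = attainable_gflops(bw, peak, flops_per_byte=1.0)
        print(f"{name}: {balance:.2f} bytes/flop, "
              f"at 1 flop/byte at most {bound:.0f} GF")
```

With these inputs a kernel that does one flop per byte moved is limited by
memory on both devices, which is the sense in which the bandwidth still
matters even when the code tries hard to stay on-chip.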
_______________________________________________ Beowulf mailing list, [email protected] sponsored by Penguin Computing To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
