Hi,

On 02/12/2013 06:40 PM, Jed Brown wrote:
> On Tue, Feb 12, 2013 at 6:06 PM, Tim Tautges <tautges at mcs.anl.gov>
> wrote:
>
> > I'm kind of surprised at the > 10k element crossover myself. For
> > the strong scaling cases, at high core counts, that's not terribly
> > far from the number of DOFS per processor, is it? I guess CPUs will
> > be slower than the Xeon in most cases (BGx), or fewer (Titan), but
> > still.
>
> Which crossover are you referring to? The CPU versus GTX285 at about 20k
> dofs, but with only very small gains for another order of magnitude?

I assume it's the cross-over of Xeon Phi vs. Xeon, but almost all
cross-overs happen in the 10k-100k region and are due to PCI-Express
latency. To scale this to large clusters, replace Xeon Phi with a compute
node and PCI-Express with GB-Ethernet/Infiniband, change the timescale a
bit, and you're probably not too far off...

> For 2D Laplace, we expect to see strong scaling peter out around a
> couple thousand dofs per core. It can go a little further on Blue Gene
> because the network is much faster and the cores are a bit slower.
>
> Titan has a lot of (premium price) GPUs that you have to use to utilize
> the machine well, but it's unclear whether the architecture is
> delivering a science/dollars advantage, even if you ignore development
> costs to port and re-tune codes and the environment costs (it's more
> complicated to build and run, so it takes people longer to get running).
> I think the main justification is speculation about what future hardware
> will look like, not cost-effectiveness today.

One should also keep the influence of TOP500 in mind here. I don't think
this machine was built primarily for domain decomposition and the like,
but rather for "approaching Exascale via Linpack".

Best regards,
Karli
