Hi Jed,

> Which crossover are you referring to? The CPU versus GTX285 at about 20k
> dofs, but with only very small gains for another order of magnitude?
>
> I assume it's the cross-over of Xeon Phi vs. Xeon,
>
> MIC is slower than Xeon by more than an order of magnitude at 10k dofs.

Tim was referring to the cross-over at >10k...

> but almost all cross-overs happen in the 10k-100k region and are due
> to PCI-Express latency.
>
> Why is PCI-Express latency important here? Can't the MIC code run
> entirely on the device?

Almost all, i.e. OpenCL and CUDA. Native mode ought to be the exception, but there it's the OpenMP overhead that is the limiting factor. Single-core on the MIC is not really an option either...

It would be interesting to play with a pthreads-threadpool implementation on the MIC to see how much performance can really be obtained for smallish problems.

Best regards,
Karli
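
For illustration only, here is a rough sketch of why a fixed per-operation cost (PCI-Express latency for OpenCL/CUDA, or thread fork/join overhead in native mode) pushes the cross-over towards larger problem sizes. The linear cost model, the function names and all numbers are assumptions made up for this example; none of them are measurements from the benchmarks discussed in this thread.

```python
def host_time(n, host_rate):
    """Assumed host cost: no fixed overhead, n dofs at host_rate dofs/s."""
    return n / host_rate


def device_time(n, overhead_s, device_rate):
    """Assumed device cost: fixed overhead plus n dofs at device_rate dofs/s."""
    return overhead_s + n / device_rate


def crossover_size(overhead_s, host_rate, device_rate):
    """Problem size n at which both assumed cost models give the same time.

    Solves n / host_rate = overhead_s + n / device_rate for n.
    """
    if device_rate <= host_rate:
        return float("inf")  # in this model the device never catches up
    return overhead_s * host_rate * device_rate / (device_rate - host_rate)


# Placeholder values, chosen only to make the arithmetic easy to follow.
OVERHEAD_S = 10e-6   # hypothetical fixed cost per operation, in seconds
HOST_RATE = 1e9      # hypothetical host throughput, dofs per second
DEVICE_RATE = 5e9    # hypothetical accelerator throughput, dofs per second

n_star = crossover_size(OVERHEAD_S, HOST_RATE, DEVICE_RATE)
print(f"cross-over at about {n_star:.0f} dofs under these assumptions")
```

In this model the cross-over size scales linearly with the fixed overhead, so halving the overhead (for instance with a cheaper thread-dispatch mechanism) would halve the problem size at which the accelerator starts to pay off.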
