On 11/13/12 2:54 AM, Paul Mullowney wrote:
> Every test we've done shows that the MKL triangular solve doesn't
> scale at all on a sandy bridge multi-core. I doubt it will be any
> different on the Xeon Phi.
>
> -Paul

Do you mean sparse or dense solves? Sparse triangular solves are
sequential in MKL. PARDISO also does it sequentially.
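
To illustrate why a sparse triangular solve resists multi-core scaling, here is a minimal sketch of forward substitution on a lower-triangular matrix in CSR format. It is not MKL's or PARDISO's code. The function names `csr_lower_solve` and `dependency_levels` are hypothetical, and the sketch assumes the diagonal entry is stored last in each row. The second function counts dependency levels: rows in the same level could run in parallel, but a matrix with a long dependency chain (such as the bidiagonal example) leaves one row per level.

```python
def csr_lower_solve(indptr, indices, data, b):
    """Solve L x = b by forward substitution.

    L is lower triangular in CSR form; this sketch assumes the
    diagonal entry is the last stored entry of each row.
    """
    n = len(b)
    x = [0.0] * n
    for i in range(n):
        start, end = indptr[i], indptr[i + 1]
        s = b[i]
        # Row i needs x[j] for every off-diagonal column j < i,
        # so it cannot start until those rows are finished.
        for k in range(start, end - 1):
            s -= data[k] * x[indices[k]]
        x[i] = s / data[end - 1]
    return x


def dependency_levels(indptr, indices):
    """Level of row i = 1 + the highest level among rows it depends on.

    Rows that share a level are independent of each other.
    """
    n = len(indptr) - 1
    level = [0] * n
    for i in range(n):
        for k in range(indptr[i], indptr[i + 1] - 1):
            level[i] = max(level[i], level[indices[k]] + 1)
    return level


# Bidiagonal example: L = [[2, 0, 0], [1, 2, 0], [0, 1, 2]]
indptr = [0, 1, 3, 5]
indices = [0, 0, 1, 1, 2]
data = [2.0, 1.0, 2.0, 1.0, 2.0]
b = [2.0, 3.0, 5.0]

x = csr_lower_solve(indptr, indices, data, b)
levels = dependency_levels(indptr, indices)
```

Here every row lands in its own level, so there is nothing to run concurrently. Matrices with more independent rows per level would expose some parallelism, at the cost of a synchronization between levels.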

Anton

>>>
>>>>
>>>> In terms of raw numbers, $2,649 for 320 GB/sec and 8 GB of memory
>>>> is quite a lot compared to the $500 of a Radeon HD 7970 GHz Edition
>>>> at 288 GB/sec and 3 GB memory. My hope is that Xeon Phi can do
>>>> better than GPUs in kernels requiring frequent global
>>>> synchronizations, e.g. ILU-substitutions.
>>>
>>> But, but, but it runs the Intel instruction set, that is clearly
>>> worth 5+ times the price :-)
>>
>> I'm tempted to say 'yes', but at a second thought I'm not so sure
>> whether any of us is actually programming in x86 assembly (again)?
>> Part of the GPU/accelerator hype is arguably due to a rediscovery of
>> programming close to hardware, even though it was/is non-x86. With
>> Xeon Phi we might now observe some sort of compiler war instead of
>> low-level kernel tuning - is this what we want?
>>
>> Best regards,
>> Karli
>>
>
>
