> A partial counter-point is that MatSolve with OpenMP is unlikely to be near
> the throughput of MPI-based MatSolve because the high-concurrency paths are
> not going to provide very good memory locality, and may cause horrible
> degradation due to cache-coherency issues.
I am just going to throw down the gauntlet here and say that, on a reasonably UMA multi-core processor, I can match or beat in (my choice of) OpenMP or pthreads the performance of anything that can be implemented in MPI.

-Aron
