Thanks for your message and the pointer.
The incomplete factorizations have been around for a while, and on
recent hardware they tend to be less competitive (note that the paper's
benchmarks use a Tesla 2050, which is ~7 years old).
The fine-grained parallel version here:
is an attractive alternative (and available in the master branch of
PETSc through ViennaCL), yet it also has drawbacks.
On 02/14/2018 12:02 AM, Jonathan Perry-Houts wrote:
> I'm not sure if this is the right place to post this, but I wanted to
> point out a new white paper I stumbled across about preconditioned
> iterative solvers on GPU's:
> The speed-ups are not huge, but they're not negligible either. I thought
> it might be of interest to some of you.