Hi guys,

> Provided you have a good parallel sparse direct solve for a single SM,
> you could unleash 32 direct solves (or perhaps 16) which run concurrently
> on the K20x. One only needs to set an environment variable to use Hyper-Q.

Thanks for your input on Hyper-Q. I'm afraid this still won't give the performance Marc and Ed are looking for, mostly because there is simply not enough parallelism in sparse direct solvers for systems of that size (cf. Jed's comment). Sparse direct solvers might actually work quite well on the CPU if the symbolic factorization is carried out once in a preprocessing step, so that only the numerical factorization has to be recomputed in each Picard iteration.
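
To make the symbolic/numeric split concrete, here is a rough pure-Python sketch. It is only an illustration under my own assumptions, not what any particular solver library does: I assume a symmetric positive definite matrix whose sparsity pattern stays fixed across Picard iterations, I use Cholesky instead of a general LU, and the toy problem (a ring-coupled system with diagonal 4 + u_i^2) as well as all function names are made up for this example.

```python
import math


def symbolic_cholesky(n, pattern):
    """Row indices of each column of L, from the strictly lower pattern of A."""
    struct = [set() for _ in range(n)]
    for i, j in pattern:
        struct[j].add(i)
    for j in range(n):
        if struct[j]:
            parent = min(struct[j])  # elimination-tree parent of column j
            struct[parent] |= struct[j] - {parent}  # fill passed up the tree
    return [sorted(s) for s in struct]


def numeric_cholesky(n, entries, struct):
    """Compute the values of L on the precomputed pattern (right-looking)."""
    diag = [entries[(j, j)] for j in range(n)]
    cols = [{i: entries.get((i, j), 0.0) for i in struct[j]} for j in range(n)]
    for j in range(n):
        diag[j] = math.sqrt(diag[j])
        for i in struct[j]:
            cols[j][i] /= diag[j]
        # Update the trailing columns touched by column j.
        for idx, k in enumerate(struct[j]):
            lkj = cols[j][k]
            diag[k] -= lkj * lkj
            for i in struct[j][idx + 1:]:
                cols[k][i] -= cols[j][i] * lkj
    return diag, cols


def solve(n, diag, cols, struct, rhs):
    """Solve L L^T x = rhs by forward and backward substitution."""
    x = list(rhs)
    for j in range(n):
        x[j] /= diag[j]
        for i in struct[j]:
            x[i] -= cols[j][i] * x[j]
    for j in reversed(range(n)):
        for i in struct[j]:
            x[j] -= cols[j][i] * x[i]
        x[j] /= diag[j]
    return x


def assemble(u, pattern):
    """Toy nonlinear operator A(u): diagonal 4 + u_i^2, off-diagonals -1."""
    entries = {(i, i): 4.0 + u[i] ** 2 for i in range(len(u))}
    entries.update({ij: -1.0 for ij in pattern})
    return entries


def picard(n, pattern, rhs, max_iter=50, tol=1e-12):
    struct = symbolic_cholesky(n, pattern)  # done once, pattern is fixed
    u = [0.0] * n
    for _ in range(max_iter):
        # Only the numerical factorization is repeated in each iteration.
        diag, cols = numeric_cholesky(n, assemble(u, pattern), struct)
        u_new = solve(n, diag, cols, struct, rhs)
        if max(abs(a - c) for a, c in zip(u_new, u)) < tol:
            return u_new
        u = u_new
    return u


if __name__ == "__main__":
    n = 6
    ring = {(i, i - 1) for i in range(1, n)} | {(n - 1, 0)}
    print(symbolic_cholesky(n, ring))
    print(picard(n, ring, [1.0] * n))
```

The ring coupling is chosen so that the factor has fill in its last row, which is exactly the information the symbolic step computes once up front. In a real code you would of course use the symbolic and numeric phases of an existing sparse direct solver instead of hand-written loops.
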
Best regards,
Karli
