Re: [petsc-users] MatSolve_SeqAIJ_NaturalOrdering

2016-04-22 Thread Jed Brown
Randall Mackie writes:

> After profiling our code, we have found that most of the time is spent in 
> MatSolve_SeqAIJ_NaturalOrdering, which upon inspection is just doing simple 
> forward and backward solves of already factored ILU matrices.
>
> We think that we should be able to see improvement by replacing these with 
> optimized versions from Intel MKL (or other optimized BLAS).

Doing so would actually give up memory optimizations that are only in
PETSc.

http://hpc.sagepub.com/content/25/4/386 (also 
http://www.mcs.anl.gov/uploads/cels/papers/P1658.pdf)

Run with -log_summary and look at the rows for MatSolve versus MatMult.
Is the flop/s rate (last column) close?  If so, then you're already
getting a high fraction of memory bandwidth.
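
As a rough illustration of that comparison, the sketch below turns a reported flop rate into an implied memory traffic figure. It is only a back-of-the-envelope aid, not anything PETSc provides: the function names, the 25% closeness threshold, and the cost model (one 8-byte value plus one 4-byte column index per stored nonzero, two flops per nonzero, vector and row-pointer traffic ignored) are all assumptions, and the rate is assumed to be reported in Mflop/s.

```python
def estimated_bandwidth_gbs(mflops, bytes_per_nonzero=12.0, flops_per_nonzero=2.0):
    """Rough memory traffic (GB/s) implied by a reported Mflop/s rate.

    Assumed cost model: every stored nonzero is read once (8-byte value
    plus 4-byte column index) and contributes one multiply and one add.
    Vector and row-pointer traffic is ignored, so this is a lower bound.
    """
    bytes_per_flop = bytes_per_nonzero / flops_per_nonzero
    return mflops * 1e6 * bytes_per_flop / 1e9


def rates_are_close(matsolve_mflops, matmult_mflops, tolerance=0.25):
    """True if the two rates differ by at most `tolerance` of the larger one.

    The 25% default is an arbitrary illustrative threshold.
    """
    larger = max(matsolve_mflops, matmult_mflops)
    return abs(matsolve_mflops - matmult_mflops) <= tolerance * larger
```

Feeding in the MatSolve and MatMult rates from the log and comparing the implied traffic with the machine's measured memory bandwidth (for example from a STREAM run) gives a sense of how much headroom is left.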



[petsc-users] MatSolve_SeqAIJ_NaturalOrdering

2016-04-22 Thread Randall Mackie
After profiling our code, we have found that most of the time is spent in 
MatSolve_SeqAIJ_NaturalOrdering, which upon inspection is just doing simple 
forward and backward solves of already factored ILU matrices.
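
For readers unfamiliar with the operation, here is a minimal sketch of such a forward and backward solve on factors stored in compressed sparse row form. It is not PETSc's code and does not reproduce PETSc's internal storage for factored matrices; the function name, the combined L/U layout (unit-diagonal L below the diagonal, U on and above it, columns sorted within each row), and the `diag` index array are assumptions made for illustration.

```python
def ilu_solve(row_ptr, col_idx, vals, diag, b):
    """Solve (L U) x = b for factors stored together in one CSR structure.

    Assumed layout: row i occupies vals[row_ptr[i]:row_ptr[i+1]] with
    columns sorted; diag[i] is the position of the diagonal entry, so
    entries before it belong to L (unit diagonal implied) and entries
    from it onward belong to U.
    """
    n = len(b)

    # Forward solve L y = b (L has an implicit unit diagonal).
    y = [0.0] * n
    for i in range(n):
        s = b[i]
        for k in range(row_ptr[i], diag[i]):
            s -= vals[k] * y[col_idx[k]]
        y[i] = s

    # Backward solve U x = y.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = y[i]
        for k in range(diag[i] + 1, row_ptr[i + 1]):
            s -= vals[k] * x[col_idx[k]]
        x[i] = s / vals[diag[i]]
    return x
```

Each stored entry is read once and used for a single multiply and subtract, which is the same access pattern as a sparse matrix-vector product.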

We think that we should be able to see improvement by replacing these with 
optimized versions from Intel MKL (or other optimized BLAS).

For example, Intel MKL has these routines:

https://software.intel.com/en-us/node/468572

Is it possible to replace the PETSc triangular solves with a more optimized 
version?

Thanks, 

Randy M.