Most operations in PETSc do not benefit much from vectorization because they 
are memory-bound. That should not discourage you from compiling PETSc with 
AVX2/AVX512, though. We have added a new matrix format (currently named ELL, 
but it will be renamed to SELL shortly) that can make MatMult ~2X faster than 
the AIJ format. Its MatMult kernel is hand-optimized with AVX intrinsics and 
works on any Intel processor that supports AVX, AVX2, or AVX512, e.g. Haswell, 
Broadwell, Xeon Phi, Skylake. We have also been optimizing the AIJ MatMult 
kernel for these architectures. Note that you have to use AVX compiler flags 
to take advantage of the optimized kernels and the new matrix format.
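
To give a rough idea of why a sliced ELLPACK layout vectorizes well, here is a 
minimal scalar sketch in plain C. It is not the PETSc kernel: the SellMatrix 
struct, the sell_matmult function, and the slice height of 4 (one 256-bit 
register of doubles) are assumptions made for illustration only. Rows are 
grouped into slices, each slice is padded to its longest row and stored 
column by column, so the innermost loop walks contiguous values and maps 
naturally onto one SIMD multiply-add.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed slice height: 4 doubles fill one 256-bit AVX register. */
#define SLICE_HEIGHT 4

/* Hypothetical container for a sliced-ELLPACK matrix (illustration only,
 * not the data structure PETSc uses internally). */
typedef struct {
    size_t        nrows;    /* number of matrix rows                         */
    size_t        nslices;  /* ceil(nrows / SLICE_HEIGHT)                    */
    const size_t *sliceptr; /* start of each slice in val/colidx;
                               length nslices + 1                            */
    const double *val;      /* values, column-major inside a slice,
                               padded with zeros                             */
    const size_t *colidx;   /* column index per value; padding entries must
                               point at a valid column of x                  */
} SellMatrix;

/* Compute y = A * x, one slice at a time. */
static void sell_matmult(const SellMatrix *A, const double *x, double *y)
{
    assert(A != NULL && x != NULL && y != NULL);

    for (size_t s = 0; s < A->nslices; s++) {
        double acc[SLICE_HEIGHT] = {0.0};
        size_t begin = A->sliceptr[s];
        size_t end   = A->sliceptr[s + 1];

        /* Each step consumes one padded column of the slice. */
        for (size_t k = begin; k < end; k += SLICE_HEIGHT) {
            /* SLICE_HEIGHT contiguous values: a candidate for one SIMD
             * load, a gather from x, and one fused multiply-add. */
            for (size_t r = 0; r < SLICE_HEIGHT; r++) {
                acc[r] += A->val[k + r] * x[A->colidx[k + r]];
            }
        }

        /* Write back, skipping padding rows in the last slice. */
        for (size_t r = 0; r < SLICE_HEIGHT; r++) {
            size_t row = s * SLICE_HEIGHT + r;
            if (row < A->nrows) {
                y[row] = acc[r];
            }
        }
    }
}
```

For a 3x3 matrix with rows [1 0 2], [0 3 0], [4 0 5], a single slice holds 
val = {1,3,4,0, 2,0,5,0} and colidx = {0,1,0,0, 2,0,2,0}. The zeros are 
padding, and that padding is the price paid for the regular access pattern.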

Hong (Mr.)

> On Nov 12, 2017, at 10:35 PM, Xiangdong <[email protected]> wrote:
> 
> Hello everyone,
> 
> Can someone comment on the vectorization of PETSc? For example, for the 
> MatMult function, will it perform better or run faster if it is compiled with 
> avx2 or avx512?
> 
> Thank you.
> 
> Best,
> Xiangdong
