Xiangdong,

If you are running on an Intel-based system that supports recent instruction sets such as AVX2 or AVX-512, and you have access to the Intel compilers, then telling the compiler to target those instruction sets (e.g., "-xCORE-AVX2" or "-xMIC-AVX512") will probably give you a noticeable gain in performance. The gain will be much smaller than you would expect from something very CPU-bound like xGEMM, but, in my experience, it will be noticeable (remember, even a memory-bound code is not limited by the memory subsystem 100% of the time).

I don't know how well the non-Intel compilers auto-vectorize, so your mileage may vary with those.

As Hong has pointed out, there are some places in the PETSc source where we have introduced code using AVX/AVX-512 intrinsics. For that code, you should see a benefit with any compiler that supports these intrinsics, since you are not relying on the auto-vectorizer.
Best regards,
Richard

On Mon, Nov 13, 2017 at 8:32 AM, Zhang, Hong <hongzh...@anl.gov> wrote:

> Most operations in PETSc would not benefit much from vectorization since
> they are memory-bound. But this should not discourage you from compiling
> PETSc with AVX2/AVX512. We have added a new matrix format (currently named
> ELL, but will be changed to SELL shortly) that can make MatMult ~2X faster
> than the AIJ format. The MatMult kernel is hand-optimized with AVX
> intrinsics. It works on any Intel processor that supports AVX, AVX2, or
> AVX512, e.g. Haswell, Broadwell, Xeon Phi, Skylake. On the other hand, we
> have been optimizing the AIJ MatMult kernel for these architectures as
> well. One has to use AVX compiler flags in order to take advantage of
> the optimized kernels and the new matrix format.
>
> Hong (Mr.)
>
> > On Nov 12, 2017, at 10:35 PM, Xiangdong <epsco...@gmail.com> wrote:
> >
> > Hello everyone,
> >
> > Can someone comment on the vectorization of PETSc? For example, for the
> > MatMult function, will it perform better or run faster if it is compiled
> > with avx2 or avx512?
> >
> > Thank you.
> >
> > Best,
> > Xiangdong