The proposed fix is #if defined(PETSC_USE_AVX512_KERNELS) && && && && && in https://gitlab.com/petsc/petsc/merge_requests/2213/diffs
but note that PETSC_USE_AVX512_KERNELS does not even do a configure check to make sure it is valid. The user has to guess that passing that flag will work. Of course a proper configure test is needed and since a proper test is needed it can handle all the issues in one place instead of having one issue in configure and n - 1 in the source code. This is a basic implementation disagreement, I hate CPP and think it should be used minimally, you hate configure and think it should be used minimally. Barry > On Oct 25, 2019, at 1:54 PM, Jed Brown <[email protected]> wrote: > > "Smith, Barry F. via petsc-dev" <[email protected]> writes: > >> This needs to be fixed properly with a configure test(s) and not with huge >> and inconsistent checks like this >> >> #if defined(PETSC_HAVE_IMMINTRIN_H) && defined(__AVX512F__) && >> defined(PETSC_USE_REAL_DOUBLE) && !defined(PETSC_USE_COMPLEX) && >> !defined(PETSC_USE_64BIT_INDICES) or this >> >> #elif defined(PETSC_USE_AVX512_KERNELS) && defined(PETSC_HAVE_IMMINTRIN_H) >> && defined(__AVX512F__) && defined(PETSC_USE_REAL_DOUBLE) && >> !defined(PETSC_USE_COMPLEX) && !defined(PETSC_USE_64BIT_INDICES) && >> !defined(PETSC_SKIP_IMMINTRIN_H_CUDAWORKAROUND) >> >> >> >> self.useAVX512Kernels = self.framework.argDB['with-avx512-kernels'] >> if self.useAVX512Kernels: >> self.addDefine('USE_AVX512_KERNELS', 1) >> >> Here you should check that the needed include files are available, that >> it is 32 bit integers, that defined(__AVX512F__) exists and that appropriate >> functions exist. > > What happens when one AVX512 kernel gets a 64-bit integer or > single-precision implementation? You'll have new macros for each > combination of criteria instead of using && on the normal macros? > > We should investigate whether #pragma omp simd or ivdep can generate > comparable vectorization without directly using the intrinsics -- those > are quite a bit more portable when they work, and they often work with > some experimentation. > >> Maybe you need two configure checks if there are two different types of >> functionality you are trying to catch? >> >> Continuing to hack away with gross #if def means this crud is >> unmaintainable and will always haunt you. Yes the incremental cost of doing >> a proper configure test is there but once that is done the maintenance costs >> (which have been haunting use for months) will be gone. >> >> Barry >> >> >> >> >> >> >> >> >> >>> On Oct 25, 2019, at 2:49 AM, Lisandro Dalcin via petsc-dev >>> <[email protected]> wrote: >>> >>> >>> >>> On Fri, 25 Oct 2019 at 01:40, Balay, Satish <[email protected]> wrote: >>> I'm curious why this issue comes up for you. The code was unrelated to >>> --with-avx512-kernels=0 option. >>> >>> Its relying on __AVX512F__and PETSC_HAVE_IMMINTRIN_H flags. And >>> assumes immintrin.h has a definition for _mm512_reduce_add_pd() >>> >>> Is the flag __AVX512F__ always set on your machine by gcc? >>> >>> And does this change based on the hardware? I just tried this build >>> [same os/compiler] on "Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz" - and >>> can't reproduce the issue. >>> >>> I do see _mm512_reduce_add_pd is missing from immintrin.h - but the >>> flag __AVX512F__ is not set for me. >>> >>> >>> Of course it is not set, you are just invoking the preprocessor. Try this >>> way: >>> >>> $ cat xyz.c >>> #if defined __AVX512F__ >>> #error "avx512f flag set" >>> #endif >>> >>> $ gcc -march=native -c xyz.c >>> xyz.c:2:2: error: #error "avx512f flag set" >>> #error "avx512f flag set" >>> >>> I forgot to mention my XXXOPTFLAGS, full reconfigure script below. >>> Do we have some Ubuntu 16 builder using system GCC? >>> Maybe we should use `-march=native -O3 -g3` in one of these builders? >>> >>> $ cat arch-gnu-opt/lib/petsc/conf/reconfigure-arch-gnu-opt.py >>> #!/usr/bin/python >>> if __name__ == '__main__': >>> import sys >>> import os >>> sys.path.insert(0, os.path.abspath('config')) >>> import configure >>> configure_options = [ >>> '--COPTFLAGS=-march=native -mtune=native -O3', >>> '--CXXOPTFLAGS=-march=native -mtune=native -O3', >>> '--FOPTFLAGS=-march=native -mtune=native -O3', >>> '--download-metis=1', >>> '--download-p4est=1', >>> '--download-parmetis=1', >>> '--with-avx512-kernels=0', >>> '--with-debugging=0', >>> '--with-fortran-bindings=0', >>> '--with-zlib=1', >>> 'CC=mpicc', >>> 'CXX=mpicxx', >>> 'FC=mpifort', >>> 'PETSC_ARCH=arch-gnu-opt', >>> ] >>> configure.petsc_configure(configure_options) >>> >>> >>> -- >>> Lisandro Dalcin >>> ============ >>> Research Scientist >>> Extreme Computing Research Center (ECRC) >>> King Abdullah University of Science and Technology (KAUST) >>> http://ecrc.kaust.edu.sa/
