This should generate an SSE2 binary:

    'COPTFLAGS=-g',
    'FOPTFLAGS=-g',

This should generate a KNL binary:

    'COPTFLAGS=-g -xMIC-AVX512 -O3',
    'FOPTFLAGS=-g -xMIC-AVX512 -O3',

This should generate an SSE2 binary that also supports CORE-AVX2 dispatch:

    '--COPTFLAGS=-g -axcore-avx2',
    '--FOPTFLAGS=-g -axcore-avx2',

I don't see a good reason for the third option to fail.  Please report this
bug to Intel.

You might also verify that this works:

    '--COPTFLAGS=-g -xCORE-AVX2',
    '--FOPTFLAGS=-g -xCORE-AVX2',

In general, avoid compiling for SSE on KNL, because mixing legacy SSE and
AVX code incurs SSE-AVX transition penalties (a web search should turn up
the details). Are you trying to generate a single binary that is portable
to both ancient Core/Xeon processors and KNL?  If so, I recommend using AVX
(Sandy Bridge) - preferably AVX2 (Haswell) - as your oldest ISA target when
generating a portable binary that includes KNL support.
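
As a rough aid for choosing that oldest ISA target, here is a minimal Python
sketch (not part of PETSc or the Intel toolchain) that lists which SIMD levels
a machine advertises. It assumes a Linux x86 system whose /proc/cpuinfo has a
"flags" line; the function names and the short list of flags checked are
illustrative choices, not an exhaustive or authoritative mapping.

```python
# Sketch: report which SIMD ISA levels a Linux x86 CPU advertises.
# Flag spellings follow /proc/cpuinfo; helper names are hypothetical.

ISA_FLAGS = ["sse2", "avx", "avx2", "avx512f"]  # ordered oldest to newest


def cpu_flags(cpuinfo_text):
    """Return the set of feature flags from the first 'flags' line."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            # Line looks like "flags\t\t: fpu vme ... avx2 ..."
            return set(line.split(":", 1)[1].split())
    return set()  # no 'flags' line found (e.g. non-x86 system)


def supported_isas(cpuinfo_text):
    """Return the entries of ISA_FLAGS that the CPU reports, in order."""
    flags = cpu_flags(cpuinfo_text)
    return [isa for isa in ISA_FLAGS if isa in flags]
```

Running supported_isas() on the contents of /proc/cpuinfo from each machine
you need to support shows the oldest level they all share, which is the level
the baseline of a portable binary would have to target.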

Jeff

On Tue, Apr 10, 2018 at 2:23 PM, Satish Balay <ba...@mcs.anl.gov> wrote:

> I tried a few builds with:
>
>     '--with-64-bit-indices=1',
>     '--with-memalign=64',
>     '--with-blaslapack-dir=/home/intel/18/compilers_and_libraries_2018.0.128/linux/mkl',
>     '--with-cc=icc',
>     '--with-fc=ifort',
>     '--with-cxx=0',
>     '--with-debugging=0',
>     '--with-mpi=0',
>
> And then changed the OPTFLAGS:
>
> 1.  'basic -g' - works fine
>
>     'COPTFLAGS=-g',
>     'FOPTFLAGS=-g',
>
> 2. 'avx512' - works fine
>
>     'COPTFLAGS=-g -xMIC-AVX512 -O3',
>     'FOPTFLAGS=-g -xMIC-AVX512 -O3',
>
> 3. 'avx2' - breaks.
>
>     '--COPTFLAGS=-g -axcore-avx2',
>     '--FOPTFLAGS=-g -axcore-avx2',
>
> With a breakpoint at dmdavecrestorearrayf903_() in gdb, I see that the
> stack is fine during the first call to dmdavecrestorearrayf903_()
> but is corrupted by the second call, i.e. ierr=0x7fffffffb4a0 changes
> to ierr=0x0.
>
> >>>>>>>>>>
>
> Breakpoint 1, dmdavecrestorearrayf903_ (da=0x603098 <test_$DA1.0.1>, v=0x6030c0 <test_$VEC2.0.1>, a=0x401abd <test+2301>,
>     ierr=0x7fffffffb4a0) at /home/petsc/petsc.barry-test/src/dm/impls/da/f90-custom/zda1f90.c:153
> 153     {
> (gdb) where
> #0  dmdavecrestorearrayf903_ (da=0x603098 <test_$DA1.0.1>, v=0x6030c0 <test_$VEC2.0.1>, a=0x401abd <test+2301>, ierr=0x7fffffffb4a0)
>     at /home/petsc/petsc.barry-test/src/dm/impls/da/f90-custom/zda1f90.c:153
> #1  0x0000000000401abd in test () at ex1f.F90:80
> #2  0x00000000004011ae in main ()
> #3  0x00007fffef1c3c05 in __libc_start_main () from /lib64/libc.so.6
> #4  0x00000000004010b9 in _start ()
> (gdb) c
> Continuing.
>
> Breakpoint 1, dmdavecrestorearrayf903_ (da=0x603098 <test_$DA1.0.1>, v=0x6030b8 <test_$VEC1.0.1>, a=0x401ada <test+2330>, ierr=0x0)
>     at /home/petsc/petsc.barry-test/src/dm/impls/da/f90-custom/zda1f90.c:153
> 153     {
> (gdb) where
> #0  dmdavecrestorearrayf903_ (da=0x603098 <test_$DA1.0.1>, v=0x6030b8 <test_$VEC1.0.1>, a=0x401ada <test+2330>, ierr=0x0)
>     at /home/petsc/petsc.barry-test/src/dm/impls/da/f90-custom/zda1f90.c:153
> #1  0x0000000000401ada in test () at ex1f.F90:81
> #2  0x00000000004011ae in main ()
> #3  0x00007fffef1c3c05 in __libc_start_main () from /lib64/libc.so.6
> #4  0x00000000004010b9 in _start ()
> (gdb)
>
> >>>>>>>>>
>
> It's not clear to me why this happens [or why it works with
> -xMIC-AVX512 but breaks with -axcore-avx2].
>
> Perhaps Richard or Jeff has better insight on this.
>
> BTW: The above run is with:
>
> bash-4.2$ icc --version
> icc (ICC) 18.0.0 20170811
>
> Satish
>
> On Mon, 9 Apr 2018, Satish Balay wrote:
>
> > I'm able to reproduce this problem on a KNL box [with the attached test code], but it goes away if I rebuild without the option --with-64-bit-indices.
> >
> > Will have to check further..
> >
> > Satish
> >
> >
> > On Thu, 5 Apr 2018, Randall Mackie wrote:
> >
> > > Dear PETSc users,
> > >
> > > I’m curious if anyone else experiences problems using DMDAVecGetArrayF90 in conjunction with Intel compilers?
> > > We have had many problems (typically 11 SEGV segmentation violations) when PETSc is compiled in optimize mode (with various combinations of options).
> > > These same codes run valgrind clean with gfortran, so I assume this is an Intel bug, but before we submit a bug report I wanted to see if anyone else had similar experiences?
> > > We have basically gone back and replaced our calls to DMDAVecGetArrayF90 with calls to VecGetArrayF90 and pass those pointers into a “local” subroutine that works fine.
> > >
> > > In case anyone is curious, the attached test code shows this behavior when PETSc is compiled with the following options:
> > >
> > > ./configure \
> > >   --with-clean=1 \
> > >   --with-debugging=0 \
> > >   --with-fortran=1 \
> > >   --with-64-bit-indices \
> > >   --download-mpich=../mpich-3.3a2.tar.gz \
> > >   --with-blas-lapack-dir=/opt/intel/mkl \
> > >   --with-cc=icc \
> > >   --with-fc=ifort \
> > >   --with-cxx=icc \
> > >   --FOPTFLAGS='-O2 -xSSSE3 -axcore-avx2' \
> > >   --COPTFLAGS='-O2 -xSSSE3 -axcore-avx2' \
> > >   --CXXOPTFLAGS='-O2 -xSSSE3 -axcore-avx2' \
> > >
> > >
> > >
> > > Thanks, Randy M.
> > >
> > >
> >




-- 
Jeff Hammond
jeff.scie...@gmail.com
http://jeffhammond.github.io/
