On Tue, 10 Apr 2018, Jeff Hammond wrote:

> This should generate an SSE2 binary:
>     'COPTFLAGS=-g',
>     'FOPTFLAGS=-g',
> This should generate a KNL binary:
>     'COPTFLAGS=-g -xMIC-AVX512 -O3',
>     'FOPTFLAGS=-g -xMIC-AVX512 -O3',
> This should generate an SSE2 binary that also supports CORE-AVX2 dispatch:
>     '--COPTFLAGS=-g -axcore-avx2',
>     '--FOPTFLAGS=-g -axcore-avx2',
> I don't see a good reason for the third option to fail.  Please report this
> bug to Intel.
> You might also verify that this works:
>     '--COPTFLAGS=-g -xCORE-AVX2',
>     '--FOPTFLAGS=-g -xCORE-AVX2',

This fails the same way as -axcore-avx2.

> In general, one should avoid compiling for SSE on KNL, because SSE-AVX
> transition penalties need to be avoided (google should find the details).
> Are you trying to generate a single binary that is portable to ancient
> Core/Xeon and KNL?

My goal here is to reproduce the issue reported by Randy. I assumed the KNL
box we have would be the easiest way to do that.


>  I recommend that you use AVX (Sandy Bridge) -
> preferably AVX2 (Haswell) - as your oldest ISA target when generating a
> portable binary that includes KNL support.
> Jeff
> On Tue, Apr 10, 2018 at 2:23 PM, Satish Balay <ba...@mcs.anl.gov> wrote:
> > I tried a few builds with:
> >
> >     '--with-64-bit-indices=1',
> >     '--with-memalign=64',
> >     '--with-blaslapack-dir=/home/intel/18/compilers_and_libraries_2018.0.128/linux/mkl',
> >     '--with-cc=icc',
> >     '--with-fc=ifort',
> >     '--with-cxx=0',
> >     '--with-debugging=0',
> >     '--with-mpi=0',
> >
> > And then changed the OPTFLAGS:
> >
> > 1.  'basic -g' - works fine
> >
> >     'COPTFLAGS=-g',
> >     'FOPTFLAGS=-g',
> >
> > 2. 'avx512' - works fine
> >
> >     'COPTFLAGS=-g -xMIC-AVX512 -O3',
> >     'FOPTFLAGS=-g -xMIC-AVX512 -O3',
> >
> > 3. 'avx2' - breaks.
> >
> >     '--COPTFLAGS=-g -axcore-avx2',
> >     '--FOPTFLAGS=-g -axcore-avx2',
> >
> > With a breakpoint at dmdavecrestorearrayf903_() in gdb, I see that the
> > stack is fine during the first call to dmdavecrestorearrayf903_()
> > but is corrupted by the time of the second call to
> > dmdavecrestorearrayf903_() [i.e. ierr=0x7fffffffb4a0 changes to
> > ierr=0x0].
> >
> > >>>>>>>>>>
> >
> > Breakpoint 1, dmdavecrestorearrayf903_ (da=0x603098 <test_$DA1.0.1>, v=0x6030c0 <test_$VEC2.0.1>, a=0x401abd <test+2301>,
> >     ierr=0x7fffffffb4a0) at /home/petsc/petsc.barry-test/src/dm/impls/da/f90-custom/zda1f90.c:153
> > 153     {
> > (gdb) where
> > #0  dmdavecrestorearrayf903_ (da=0x603098 <test_$DA1.0.1>, v=0x6030c0 <test_$VEC2.0.1>, a=0x401abd <test+2301>, ierr=0x7fffffffb4a0)
> >     at /home/petsc/petsc.barry-test/src/dm/impls/da/f90-custom/zda1f90.c:153
> > #1  0x0000000000401abd in test () at ex1f.F90:80
> > #2  0x00000000004011ae in main ()
> > #3  0x00007fffef1c3c05 in __libc_start_main () from /lib64/libc.so.6
> > #4  0x00000000004010b9 in _start ()
> > (gdb) c
> > Continuing.
> >
> > Breakpoint 1, dmdavecrestorearrayf903_ (da=0x603098 <test_$DA1.0.1>, v=0x6030b8 <test_$VEC1.0.1>, a=0x401ada <test+2330>, ierr=0x0)
> >     at /home/petsc/petsc.barry-test/src/dm/impls/da/f90-custom/zda1f90.c:153
> > 153     {
> > (gdb) where
> > #0  dmdavecrestorearrayf903_ (da=0x603098 <test_$DA1.0.1>, v=0x6030b8 <test_$VEC1.0.1>, a=0x401ada <test+2330>, ierr=0x0)
> >     at /home/petsc/petsc.barry-test/src/dm/impls/da/f90-custom/zda1f90.c:153
> > #1  0x0000000000401ada in test () at ex1f.F90:81
> > #2  0x00000000004011ae in main ()
> > #3  0x00007fffef1c3c05 in __libc_start_main () from /lib64/libc.so.6
> > #4  0x00000000004010b9 in _start ()
> > (gdb)
> >
> > >>>>>>>>>
> >
> > It's not clear to me why this happens [or why it works with
> > -xMIC-AVX512 but breaks with -axcore-avx2].
> >
> > Perhaps Richard or Jeff has better insight on this.
> >
> > BTW: The above run is with:
> >
> > bash-4.2$ icc --version
> > icc (ICC) 18.0.0 20170811
> >
> > Satish
> >
> > On Mon, 9 Apr 2018, Satish Balay wrote:
> >
> > > I'm able to reproduce this problem on knl box [with the attached test
> > > code]. But it goes away if I rebuild without the option
> > > --with-64-bit-indices.
> > >
> > > Will have to check further..
> > >
> > > Satish
> > >
> > >
> > > On Thu, 5 Apr 2018, Randall Mackie wrote:
> > >
> > > > Dear PETSc users,
> > > >
> > > > I’m curious if anyone else experiences problems using
> > > > DMDAVecGetArrayF90 in conjunction with Intel compilers?
> > > > We have had many problems (typically 11 SEGV segmentation violations)
> > > > when PETSc is compiled in optimize mode (with various combinations of
> > > > options).
> > > > These same codes run valgrind clean with gfortran, so I assume this is
> > > > an Intel bug, but before we submit a bug report I wanted to see if anyone
> > > > else had similar experiences?
> > > > We have basically gone back and replaced our calls to
> > > > DMDAVecGetArrayF90 with calls to VecGetArrayF90 and pass those pointers
> > > > into a “local” subroutine that works fine.
> > > >
> > > > In case anyone is curious, the attached test code shows this behavior
> > > > when PETSc is compiled with the following options:
> > > >
> > > > ./configure \
> > > >   --with-clean=1 \
> > > >   --with-debugging=0 \
> > > >   --with-fortran=1 \
> > > >   --with-64-bit-indices \
> > > >   --download-mpich=../mpich-3.3a2.tar.gz \
> > > >   --with-blas-lapack-dir=/opt/intel/mkl \
> > > >   --with-cc=icc \
> > > >   --with-fc=ifort \
> > > >   --with-cxx=icc \
> > > >   --FOPTFLAGS='-O2 -xSSSE3 -axcore-avx2' \
> > > >   --COPTFLAGS='-O2 -xSSSE3 -axcore-avx2' \
> > > >   --CXXOPTFLAGS='-O2 -xSSSE3 -axcore-avx2'
> > > >
> > > >
> > > >
> > > > Thanks, Randy M.
> > > >
> > > >
> > >
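
The flag sets compared in this thread can be gathered in one place to make
the comparison easier to repeat. What follows is a minimal sketch in Python,
assuming only the options quoted above. The names BASE_OPTIONS,
OPTFLAG_VARIANTS and build_options are hypothetical, the site-specific
--with-blaslapack-dir path is left out, and the '--COPTFLAGS=' spelling is
taken from the third variant. The script only prints command lines; it does
not run configure.

```python
# Sketch only: collects the configure options quoted in this thread.
# BASE_OPTIONS, OPTFLAG_VARIANTS and build_options are hypothetical names.
import shlex

# Options common to all builds in the thread (MKL path omitted on purpose).
BASE_OPTIONS = [
    '--with-64-bit-indices=1',
    '--with-memalign=64',
    '--with-cc=icc',
    '--with-fc=ifort',
    '--with-cxx=0',
    '--with-debugging=0',
    '--with-mpi=0',
]

# Optimization flag sets tried in the thread, with the reported outcome.
OPTFLAG_VARIANTS = {
    'basic': '-g',                    # reported to work
    'avx512': '-g -xMIC-AVX512 -O3',  # reported to work
    'axcore-avx2': '-g -axcore-avx2', # reported to break
    'xcore-avx2': '-g -xCORE-AVX2',   # reported to fail the same way
}


def build_options(variant):
    """Return a new option list for the named flag variant."""
    flags = OPTFLAG_VARIANTS[variant]
    return BASE_OPTIONS + ['--COPTFLAGS=' + flags, '--FOPTFLAGS=' + flags]


if __name__ == '__main__':
    for name in OPTFLAG_VARIANTS:
        args = ' '.join(shlex.quote(opt) for opt in build_options(name))
        print('# ' + name)
        print('./configure ' + args)
```

Keeping the shared options in one list means the builds differ only in
COPTFLAGS and FOPTFLAGS, which is the variable being tested here.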
