On Tue, Apr 10, 2018 at 4:39 PM, Satish Balay <ba...@mcs.anl.gov> wrote:

> On Tue, 10 Apr 2018, Jeff Hammond wrote:
>
> > This should generate an SSE2 binary:
> >
> >     'COPTFLAGS=-g',
> >     'FOPTFLAGS=-g',
> >
> > This should generate a KNL binary:
> >
> >     'COPTFLAGS=-g -xMIC-AVX512 -O3',
> >     'FOPTFLAGS=-g -xMIC-AVX512 -O3',
> >
> > This should generate an SSE2 binary that also supports CORE-AVX2 dispatch:
> >
> >     '--COPTFLAGS=-g -axcore-avx2',
> >     '--FOPTFLAGS=-g -axcore-avx2',
> >
> > I don't see a good reason for the third option to fail.  Please report
> > this bug to Intel.
> >
> > You might also verify that this works:
> >
> >     '--COPTFLAGS=-g -xCORE-AVX2',
> >     '--FOPTFLAGS=-g -xCORE-AVX2',
>
> This fails the same way as -axcore-avx2.
>
>
Can you try on a non-KNL host?  It's a bug either way, but I want to
determine whether the KNL host is the issue.
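
One quick way to tell a KNL host from an ordinary Xeon before comparing runs is to look at the CPU flags Linux reports. The sketch below is only an illustration: the helper name `classify_host` and its return labels are made up here, and it assumes the Linux `/proc/cpuinfo` flag names `avx512er`, `avx512f` and `avx2`.

```python
def classify_host(cpuinfo_text):
    """Return a rough ISA class from the text of /proc/cpuinfo (Linux only)."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        # The first "flags" line is enough; cores normally report the same set.
        if line.startswith("flags") and ":" in line:
            flags = set(line.split(":", 1)[1].split())
            break
    if "avx512er" in flags:
        return "xeon-phi"      # AVX-512ER is only found on Xeon Phi (KNL/KNM)
    if "avx512f" in flags:
        return "avx512-xeon"
    if "avx2" in flags:
        return "avx2"
    return "older"
```

On a Linux box this would be called as `classify_host(open("/proc/cpuinfo").read())`; anything other than `"xeon-phi"` counts as a non-KNL host for the purpose of this test.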


> >
> > In general, one should avoid compiling for SSE on KNL, because SSE-AVX
> > transition penalties need to be avoided (google should find the details).
> > Are you trying to generate a single binary that is portable to ancient
> > Core/Xeon and KNL?
>
> My usage here is to reproduce this issue reported by Randy - assumed the
> knl box we have is the easiest way..
>
>
Based only on what I see below, Randy doesn't seem to be reporting a
KNL-specific issue.  Is that incorrect?

I strongly recommend generating KNL-specific binaries for KNL, in which
case, the original issue should be investigated on non-KNL systems.
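
As a rough sketch of what separate per-target builds could look like in the option-list style quoted below: the helper name `opt_flags` and the target labels are mine, not PETSc's, and the flags are just the ones already quoted in this thread.

```python
# Hypothetical helper: one set of OPTFLAGS per target instead of one
# binary that dispatches at run time.
ISA_FLAG = {
    "knl": "-xMIC-AVX512",     # KNL-only binary
    "haswell": "-xCORE-AVX2",  # non-KNL Xeon (AVX2) binary
}


def opt_flags(target):
    """Return the COPTFLAGS/FOPTFLAGS entries for one target."""
    flags = "-g %s -O3" % ISA_FLAG[target]
    return ["COPTFLAGS=" + flags, "FOPTFLAGS=" + flags]
```

With this, `opt_flags("knl")` reproduces the second (working) option pair quoted below, and the non-KNL build gets its own list and its own build directory.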

Again, there is clearly a bug here, but it helps to localize the problem as
much as possible.
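
To make the localization concrete, one could enumerate the builds worth trying on each host. This is only a bookkeeping sketch: the flag sets and the `--with-64-bit-indices` toggle come from the messages quoted below, the function name `build_matrix` is made up, and nothing here actually runs configure.

```python
from itertools import product

# Flag sets already tried or suggested in this thread.
FLAG_SETS = ["-g", "-g -xMIC-AVX512 -O3", "-g -axcore-avx2", "-g -xCORE-AVX2"]


def build_matrix(flag_sets=FLAG_SETS, index_bits=(32, 64)):
    """Yield (label, configure options) for each flag set and index width."""
    for flags, bits in product(flag_sets, index_bits):
        opts = [
            "--with-64-bit-indices=%d" % (1 if bits == 64 else 0),
            "COPTFLAGS=" + flags,
            "FOPTFLAGS=" + flags,
        ]
        yield "%s / %d-bit indices" % (flags, bits), opts
```

Running the same eight combinations on a KNL and a non-KNL host should show whether the failure follows the host, the flags, or the index width.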

Jeff


> Satish
>
> >  I recommend that you use AVX (Sandy Bridge) -
> > preferably AVX2 (Haswell) - as your oldest ISA target when generating a
> > portable binary that includes KNL support.
> >
> > Jeff
> >
> > On Tue, Apr 10, 2018 at 2:23 PM, Satish Balay <ba...@mcs.anl.gov> wrote:
> >
> > > I tried a few builds with:
> > >
> > >     '--with-64-bit-indices=1',
> > >     '--with-memalign=64',
> > >     '--with-blaslapack-dir=/home/intel/18/compilers_and_libraries_2018.0.128/linux/mkl',
> > >     '--with-cc=icc',
> > >     '--with-fc=ifort',
> > >     '--with-cxx=0',
> > >     '--with-debugging=0',
> > >     '--with-mpi=0',
> > >
> > > And then changed the OPTFLAGS:
> > >
> > > 1.  'basic -g' - works fine
> > >
> > >     'COPTFLAGS=-g',
> > >     'FOPTFLAGS=-g',
> > >
> > > 2. 'avx512' - works fine
> > >
> > >     'COPTFLAGS=-g -xMIC-AVX512 -O3',
> > >     'FOPTFLAGS=-g -xMIC-AVX512 -O3',
> > >
> > > 3. 'avx2' - breaks.
> > >
> > >     '--COPTFLAGS=-g -axcore-avx2',
> > >     '--FOPTFLAGS=-g -axcore-avx2',
> > >
> > > With a breakpoint at dmdavecrestorearrayf903_() in gdb, I see that the
> > > stack is fine during the first call to dmdavecrestorearrayf903_()
> > > but is corrupted by the time of the second call to
> > > dmdavecrestorearrayf903_() [i.e. ierr=0x7fffffffb4a0 changes to
> > > ierr=0x0].
> > >
> > > >>>>>>>>>>
> > >
> > > Breakpoint 1, dmdavecrestorearrayf903_ (da=0x603098 <test_$DA1.0.1>, v=0x6030c0 <test_$VEC2.0.1>, a=0x401abd <test+2301>,
> > >     ierr=0x7fffffffb4a0) at /home/petsc/petsc.barry-test/src/dm/impls/da/f90-custom/zda1f90.c:153
> > > 153     {
> > > (gdb) where
> > > #0  dmdavecrestorearrayf903_ (da=0x603098 <test_$DA1.0.1>, v=0x6030c0 <test_$VEC2.0.1>, a=0x401abd <test+2301>, ierr=0x7fffffffb4a0)
> > >     at /home/petsc/petsc.barry-test/src/dm/impls/da/f90-custom/zda1f90.c:153
> > > #1  0x0000000000401abd in test () at ex1f.F90:80
> > > #2  0x00000000004011ae in main ()
> > > #3  0x00007fffef1c3c05 in __libc_start_main () from /lib64/libc.so.6
> > > #4  0x00000000004010b9 in _start ()
> > > (gdb) c
> > > Continuing.
> > >
> > > Breakpoint 1, dmdavecrestorearrayf903_ (da=0x603098 <test_$DA1.0.1>, v=0x6030b8 <test_$VEC1.0.1>, a=0x401ada <test+2330>, ierr=0x0)
> > >     at /home/petsc/petsc.barry-test/src/dm/impls/da/f90-custom/zda1f90.c:153
> > > 153     {
> > > (gdb) where
> > > #0  dmdavecrestorearrayf903_ (da=0x603098 <test_$DA1.0.1>, v=0x6030b8 <test_$VEC1.0.1>, a=0x401ada <test+2330>, ierr=0x0)
> > >     at /home/petsc/petsc.barry-test/src/dm/impls/da/f90-custom/zda1f90.c:153
> > > #1  0x0000000000401ada in test () at ex1f.F90:81
> > > #2  0x00000000004011ae in main ()
> > > #3  0x00007fffef1c3c05 in __libc_start_main () from /lib64/libc.so.6
> > > #4  0x00000000004010b9 in _start ()
> > > (gdb)
> > >
> > > >>>>>>>>>
> > >
> > > It's not clear to me why this happens [and why it would work with
> > > -xMIC-AVX512 but break with -axcore-avx2].
> > >
> > > Perhaps Richard, Jeff have better insight on this.
> > >
> > > BTW: The above run is with:
> > >
> > > bash-4.2$ icc --version
> > > icc (ICC) 18.0.0 20170811
> > >
> > > Satish
> > >
> > > On Mon, 9 Apr 2018, Satish Balay wrote:
> > >
> > > > I'm able to reproduce this problem on knl box [with the attached test
> > > > code]. But it goes away if I rebuild without the option
> > > > --with-64-bit-indices.
> > > >
> > > > Will have to check further..
> > > >
> > > > Satish
> > > >
> > > >
> > > > On Thu, 5 Apr 2018, Randall Mackie wrote:
> > > >
> > > > > Dear PETSc users,
> > > > >
> > > > > I’m curious if anyone else experiences problems using
> > > > > DMDAVecGetArrayF90 in conjunction with Intel compilers?
> > > > > We have had many problems (typically 11 SEGV segmentation violations)
> > > > > when PETSc is compiled in optimize mode (with various combinations of
> > > > > options).
> > > > > These same codes run valgrind clean with gfortran, so I assume this is
> > > > > an Intel bug, but before we submit a bug report I wanted to see if anyone
> > > > > else had similar experiences?
> > > > > We have basically gone back and replaced our calls to
> > > > > DMDAVecGetArrayF90 with calls to VecGetArrayF90 and pass those pointers
> > > > > into a “local” subroutine that works fine.
> > > > >
> > > > > In case anyone is curious, the attached test code shows this behavior
> > > > > when PETSc is compiled with the following options:
> > > > >
> > > > > ./configure \
> > > > >   --with-clean=1 \
> > > > >   --with-debugging=0 \
> > > > >   --with-fortran=1 \
> > > > >   --with-64-bit-indices \
> > > > >   --download-mpich=../mpich-3.3a2.tar.gz \
> > > > >   --with-blas-lapack-dir=/opt/intel/mkl \
> > > > >   --with-cc=icc \
> > > > >   --with-fc=ifort \
> > > > >   --with-cxx=icc \
> > > > >   --FOPTFLAGS='-O2 -xSSSE3 -axcore-avx2' \
> > > > >   --COPTFLAGS='-O2 -xSSSE3 -axcore-avx2' \
> > > > >   --CXXOPTFLAGS='-O2 -xSSSE3 -axcore-avx2'
> > > > >
> > > > >
> > > > >
> > > > > Thanks, Randy M.
> > > > >
> > > > >
> > > >
> >
> >
> >
> >
> >
>



-- 
Jeff Hammond
jeff.scie...@gmail.com
http://jeffhammond.github.io/
