Satish,

    Thanks for running this, but it is block size 15 that is breaking, not 12 :-).
It is crashing with memory corruption while building the matrix on Solaris,
but I am having trouble getting it to cause problems elsewhere.

  Barry

  I think it is just code that was not previously properly tested in the
nightly builds; the code has been around for a while. Or it could be a bug in
my test program.





> On Feb 22, 2021, at 10:31 PM, Satish Balay <[email protected]> wrote:
> 
> I get the following with a debug build.
> 
>>>>>>>>> 
> balay@petsc-02:/scratch/balay/petsc/src/mat/tests$ make ex238
> gcc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas 
> -fstack-protector -fvisibility=hidden -g3  -fPIC -Wall -Wwrite-strings 
> -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector 
> -fvisibility=hidden -g3    -I/scratch/balay/petsc/include 
> -I/scratch/balay/petsc/arch-linux-c-debug/include     ex238.c  
> -Wl,-rpath,/scratch/balay/petsc/arch-linux-c-debug/lib 
> -L/scratch/balay/petsc/arch-linux-c-debug/lib 
> -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/7 -L/usr/lib/gcc/x86_64-linux-gnu/7 
> -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu 
> -Wl,-rpath,/lib/x86_64-linux-gnu -L/lib/x86_64-linux-gnu -lpetsc -llapack 
> -lblas -lpthread -lm -lX11 -lstdc++ -ldl -lgfortran -lm -lgfortran -lm 
> -lgcc_s -lquadmath -lstdc++ -ldl -o ex238
> balay@petsc-02:/scratch/balay/petsc/src/mat/tests$ valgrind --tool=memcheck  
> ./ex238 -mat_block_size 12
> ==34355== Memcheck, a memory error detector
> ==34355== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
> ==34355== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
> ==34355== Command: ./ex238 -mat_block_size 12
> ==34355== 
> ==34355== Warning: set address range perms: large range [0x59e43040, 
> 0xb696a840) (undefined)
> <<<<<<<<
> 
> Hang? It takes a long time. Trying a different example.
> 
>>>>>>> 
> balay@petsc-02:/scratch/balay/petsc/src/mat/tests$ make ex237
> gcc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas 
> -fstack-protector -fvisibility=hidden -g3  -fPIC -Wall -Wwrite-strings 
> -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector 
> -fvisibility=hidden -g3    -I/scratch/balay/petsc/include 
> -I/scratch/balay/petsc/arch-linux-c-debug/include     ex237.c  
> -Wl,-rpath,/scratch/balay/petsc/arch-linux-c-debug/lib 
> -L/scratch/balay/petsc/arch-linux-c-debug/lib 
> -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/7 -L/usr/lib/gcc/x86_64-linux-gnu/7 
> -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu 
> -Wl,-rpath,/lib/x86_64-linux-gnu -L/lib/x86_64-linux-gnu -lpetsc -llapack 
> -lblas -lpthread -lm -lX11 -lstdc++ -ldl -lgfortran -lm -lgfortran -lm 
> -lgcc_s -lquadmath -lstdc++ -ldl -o ex237
> balay@petsc-02:/scratch/balay/petsc/src/mat/tests$ valgrind --tool=memcheck 
> -q ./ex237 -f 
> /scratch/balay/petsc/share/petsc/datafiles/matrices/spd-real-int32-float64
> Benchmarking MatMult: with A seqaij 12x12
> Benchmarking MatProduct AB: with A seqaij 12x12 and B seqdense 12x2
> Benchmarking MatProduct AB: with A seqaij 12x12 and B seqdense 12x4
> Benchmarking MatProduct AB: with A seqaij 12x12 and B seqdense 12x8
> Benchmarking MatProduct AB: with A seqaij 12x12 and B seqdense 12x16
> Benchmarking MatProduct AB: with A seqaij 12x12 and B seqdense 12x32
> Benchmarking MatProduct AB: with A seqaij 12x12 and B seqdense 12x64
> Benchmarking MatProduct AB: with A seqaij 12x12 and B seqdense 12x128
> balay@petsc-02:/scratch/balay/petsc/src/mat/tests$ 
> 
> <<<<<<<<<
> 
> So the likely issue is that this opt build uses '-march=native' [perhaps
> this valgrind version is older than the CPU].
> 
> OK, trying an optimized build on an older CPU, aka es [@gce]
> 
>>>>>> 
> 
> balay@es:/scratch/balay/petsc/src/mat/tests$ make ex237
> gcc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas 
> -fstack-protector -fvisibility=hidden -march=native -O3  -fPIC -Wall 
> -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector 
> -fvisibility=hidden -march=native -O3    -I/scratch/balay/petsc/include 
> -I/scratch/balay/petsc/arch-linux-c-opt/include     ex237.c  
> -Wl,-rpath,/scratch/balay/petsc/arch-linux-c-opt/lib 
> -L/scratch/balay/petsc/arch-linux-c-opt/lib 
> -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/7 -L/usr/lib/gcc/x86_64-linux-gnu/7 
> -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu 
> -Wl,-rpath,/lib/x86_64-linux-gnu -L/lib/x86_64-linux-gnu -lpetsc -llapack 
> -lblas -lpthread -lm -lX11 -lstdc++ -ldl -lgfortran -lm -lgfortran -lm 
> -lgcc_s -lquadmath -lstdc++ -ldl -o ex237
> balay@es:/scratch/balay/petsc/src/mat/tests$ valgrind --tool=memcheck -q 
> ./ex237 -f 
> /scratch/balay/petsc/share/petsc/datafiles/matrices/spd-real-int32-float64
> Benchmarking MatMult: with A seqaij 12x12
> Benchmarking MatProduct AB: with A seqaij 12x12 and B seqdense 12x2
> Benchmarking MatProduct AB: with A seqaij 12x12 and B seqdense 12x4
> Benchmarking MatProduct AB: with A seqaij 12x12 and B seqdense 12x8
> Benchmarking MatProduct AB: with A seqaij 12x12 and B seqdense 12x16
> Benchmarking MatProduct AB: with A seqaij 12x12 and B seqdense 12x32
> Benchmarking MatProduct AB: with A seqaij 12x12 and B seqdense 12x64
> Benchmarking MatProduct AB: with A seqaij 12x12 and B seqdense 12x128
> balay@es:/scratch/balay/petsc/src/mat/tests$ 
> 
> <<<<<<<
> 
> Satish
> 
> 
> 
> On Mon, 22 Feb 2021, Barry Smith wrote:
> 
>> 
>>  I knew they hate Macs but now Linux? Any trustworthy machines to run 
>> valgrind?
>> 
>> 
>> $ petscmpiexec -valgrind -n 1 ./ex238 -mat_block_size 12
>> ==14144== 
>> ==14144== Process terminating with default action of signal 4 (SIGILL)
>> ==14144==  Illegal opcode at address 0x4F808A9
>> ==14144==    at 0x4F808A9: PetscSetDisplay (in 
>> /scratch/bsmith/petsc/arch-add-baij-12/lib/libpetsc.so.3.014.4)
>> ==14144==    by 0x4F086BD: PetscOptionsCheckInitial_Private (in 
>> /scratch/bsmith/petsc/arch-add-baij-12/lib/libpetsc.so.3.014.4)
>> ==14144==    by 0x4F0D5BC: PetscInitialize (in 
>> /scratch/bsmith/petsc/arch-add-baij-12/lib/libpetsc.so.3.014.4)
>> ==14144==    by 0x108D0E: main (in /scratch/bsmith/petsc/src/mat/tests/ex238)
>> Illegal instruction (core dumped)
>> /scratch/bsmith/petsc/src/mat/tests (barry/2021-02-12/add-baij-12=) 
>> arch-add-baij-12
>> $ echo $PETSC_OPTIONS
>> 
>> /scratch/bsmith/petsc/src/mat/tests (barry/2021-02-12/add-baij-12=) 
>> arch-add-baij-12
>> $ hostname 
>> petsc-02
>> /scratch/bsmith/petsc/src/mat/tests (barry/2021-02-12/add-baij-12=) 
>> arch-add-baij-12
>> $ uname -a
>> Linux petsc-02 4.15.0-135-generic #139-Ubuntu SMP Mon Jan 18 17:38:24 UTC 
>> 2021 x86_64 x86_64 x86_64 GNU/Linux
>> /scratch/bsmith/petsc/src/mat/tests (barry/2021-02-12/add-baij-12=) 
>> arch-add-baij-12
>> $ which valgrind
>> /usr/bin/valgrind
>> 
>> $ make ex237
>> gcc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas 
>> -fstack-protector -fvisibility=hidden -march=native -O3  -fPIC -Wall 
>> -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector 
>> -fvisibility=hidden -march=native -O3    -I/scratch/bsmith/petsc/include 
>> -I/scratch/bsmith/petsc/arch-add-baij-12/include     ex237.c  
>> -Wl,-rpath,/scratch/bsmith/petsc/arch-add-baij-12/lib 
>> -L/scratch/bsmith/petsc/arch-add-baij-12/lib -lpetsc -llapack -lblas 
>> -lpthread -lm -lX11 -lquadmath -ldl -o ex237
>> /scratch/bsmith/petsc/src/mat/tests (barry/2021-02-12/add-baij-12=) 
>> arch-add-baij-12
>> $ petscmpiexec -valgrind -n 1 ./ex237
>> ==14841== 
>> ==14841== Process terminating with default action of signal 4 (SIGILL)
>> ==14841==  Illegal opcode at address 0x4F808A9
>> ==14841==    at 0x4F808A9: PetscSetDisplay (in 
>> /scratch/bsmith/petsc/arch-add-baij-12/lib/libpetsc.so.3.014.4)
>> ==14841==    by 0x4F086BD: PetscOptionsCheckInitial_Private (in 
>> /scratch/bsmith/petsc/arch-add-baij-12/lib/libpetsc.so.3.014.4)
>> ==14841==    by 0x4F0D5BC: PetscInitialize (in 
>> /scratch/bsmith/petsc/arch-add-baij-12/lib/libpetsc.so.3.014.4)
>> ==14841==    by 0x109DE0: main (in /scratch/bsmith/petsc/src/mat/tests/ex237)
>> Illegal instruction (core dumped)
>> /scratch/bsmith/petsc/src/mat/tests (barry/2021-02-12/add-baij-12=) 
>> arch-add-baij-12
>> 
>> 
> 
