There are always diffs (between different compilers, versions, hardware) - 
they don't necessarily indicate failures.

Very large diffs [as Barry mentioned] might require a closer look.
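
To make "small diff" versus "very large diff" concrete, here is a minimal Python sketch that compares the numbers in two output lines by relative difference. It is not how the PETSc test harness compares output; the function name and the three sample lines are made up for illustration.

```python
import re

# Matches integers, decimals and exponent forms such as 1.5e-07.
NUMBER = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?")

def max_relative_diff(expected: str, actual: str) -> float:
    """Return the largest relative difference between paired numbers.

    Raises ValueError when the two texts hold a different count of numbers,
    which signals a structural difference (for example missing output lines).
    """
    a = [float(x) for x in NUMBER.findall(expected)]
    b = [float(x) for x in NUMBER.findall(actual)]
    if len(a) != len(b):
        raise ValueError(f"number count differs: {len(a)} vs {len(b)}")
    worst = 0.0
    for x, y in zip(a, b):
        scale = max(abs(x), abs(y))
        if scale > 0.0:
            worst = max(worst, abs(x - y) / scale)
    return worst

# Hypothetical output lines, not taken from any PETSc test.
reference = "iter = 3, Function value 1.2345678e-02, Residual: 4.0e-09"
rounding = "iter = 3, Function value 1.2345679e-02, Residual: 4.0e-09"
suspect = "iter = 3, Function value 7.9e-01, Residual: 4.0e-09"

print(max_relative_diff(reference, rounding))  # tiny: last-digit noise
print(max_relative_diff(reference, suspect))   # large: worth a closer look
```

A diff of the first kind is what one expects when switching compilers or hardware; the second kind, or a ValueError from missing lines, is the kind that deserves investigation.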

Depending upon your purpose - you might consider that you have a valid PETSc 
install and start using it.

Obviously - based on the problem that you are solving - you might need to 
revisit this issue [check how your application behaves with a debug 
build versus an optimized build, and experiment with different optimization flags]
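
As one illustration of why a flag such as -ffp-contract (used in the configure line quoted below) can change results: when contraction is enabled the compiler may turn a*b + c into a single fused multiply-add, which rounds once instead of twice. The Python sketch below emulates both evaluations with exact rational arithmetic. The helper names and the input values are chosen for illustration only; this does not reproduce any of the failing tests.

```python
from fractions import Fraction

def fused_multiply_add(a: float, b: float, c: float) -> float:
    """Emulate a hardware FMA: compute a*b + c exactly, then round once."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)  # Fraction -> float conversion is correctly rounded

def separate_multiply_add(a: float, b: float, c: float) -> float:
    """Ordinary evaluation: a*b is rounded, then the sum is rounded again."""
    return a * b + c

# Inputs picked so that a*b = 1 - 2**-60, which rounds to 1.0 in double.
a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0

print(separate_multiply_add(a, b, c))  # the product rounds to 1.0 first
print(fused_multiply_add(a, b, c))     # keeps the -2**-60 term
```

Neither answer is "wrong" in the sense of a compiler bug; they are two valid roundings, and an iterative solver can amplify such differences into visible changes in the printed output.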

Satish

On Wed, 7 Apr 2021, Chen Gang wrote:

> ARM: configure with the following: -ffp-contract=off
> 
> 
> ./configure -with-debugging=0 COPTFLAGS='-O3 -ffp-contract=off 
> -march=armv8.2-a -mtune=tsv110' CXXOPTFLAGS='-O3 -ffp-contract=off 
> -march=armv8.2-a -mtune=tsv110' FOPTFLAGS='-O3 -ffp-contract=off 
> -march=armv8.2-a -mtune=tsv110' --with-x=1 -download-fblaslapack 
> PETSC-KERNEL-USE-UNROLL-4
> 
> 
> There are 3 failures, all related to FC code
> 
> 
> 
> 
> # -------------
> #   Summary
> # -------------
> # FAILED diff-tao_bound_tutorials-plate2f_2 
> diff-tao_bound_tutorials-plate2f_1 diff-vec_is_is_tutorials-ex2f_1
> # success 7547/9528 tests (79.2%)
> # failed 3/9528 tests (0.0%)
> # todo 225/9528 tests (2.4%)
> # skip 1753/9528 tests (18.4%)
> #
> # Wall clock time for tests: 1371 sec
> # Approximate CPU time (not incl. build time): 106722.62999999999 sec
> #
> # To rerun failed tests:
> #     /opt/rh/devtoolset-9/root/usr/bin/gmake -f gmakefile 
> test test-fail=1
> #
> # Timing summary (actual test time / total CPU time):
> #   dm_tests-ex36_3dp1: 467.68 sec / 477.71 sec
> #   dm_impls_stag_tests-ex1_multidof_3: 395.16 sec / 398.45 sec
> #   ts_tutorials-ex29_1: 236.54 sec / 238.30 sec
> #   dm_impls_stag_tests-ex1_basic_2: 205.41 sec / 207.34 sec
> #   dm_tests-ex34_1: 178.94 sec / 180.26 sec
> 
> ------------------ Original ------------------
> From: "petsc-dev" <[email protected]>;
> Date: Wed, Apr 7, 2021 1:06
> To: "Barry Smith" <[email protected]>;
> Cc: "Chen Gang" <[email protected]>; "Alp Dener" <[email protected]>; "petsc-dev" <[email protected]>; "cglwdm" <[email protected]>;
> Subject: Re: [petsc-dev] Petsc: Error code 1
> 
> 
> 
> > See the attachments. alltest.log is on a machine with 96 cores, ARM, 
> > with FC, gcc9.3.5, mpich3.4.1, fblaslapack; 6 failures
> 
> Perhaps this is an issue with ARM - and such diffs are expected - as we 
> already have multiple alt files for some of these tests
> 
> $ ls -lt src/tao/bound/tutorials/output/plate2f_*
> -rw-r--r--. 1 balay balay 1029 Mar 23 19:48 
> src/tao/bound/tutorials/output/plate2f_1_alt.out
> -rw-r--r--. 1 balay balay 1071 Mar 23 19:48 
> src/tao/bound/tutorials/output/plate2f_1.out
> -rw-r--r--. 1 balay balay 1029 Mar 23 19:48 
> src/tao/bound/tutorials/output/plate2f_2_alt.out
> -rw-r--r--. 1 balay balay 1071 Mar 23 19:48 
> src/tao/bound/tutorials/output/plate2f_2.out
> 
> >>>>>>
> not ok diff-vec_is_is_tutorials-ex2f_1 # Error code: 1
> #       16,24d15
> #       <   5
> #       <   7
> #       <   9
> #       <  11
> #       <  13
> #       <  15
> #       <  17
> #       <  19
> #       <  21
> <<<<<
> 
> This one is puzzling - missing Fortran stdout? Perhaps a compile issue on ARM? 
> [it's a sequential example - so can't blame MPI]
> 
> Or they are all related to the optimization flags used? What configure 
> options were used for the build?
> 
> Satish
> 
> On Tue, 6 Apr 2021, Barry Smith wrote:
> 
> > 
> >    Alp,
> > 
> >    Except for the first test, these are all optimization 
> > problems (mostly in Fortran). The function values are very different so I am 
> > sending it to our optimization expert to take a look at it. The differences 
> > could possibly be related to the use of real() and maybe the direct use of 
> > floating point numbers that the compiler first treats as single and then 
> > converts to double thus losing precision.
> > 
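
To illustrate the precision loss described above: in Fortran a literal such as 0.1 without a kind suffix is default (single precision) real, so assigning it to a double precision variable keeps only about 7 correct digits. The Python sketch below mimics that by rounding a double to single and widening it again. The helper name is made up, and this is only an analogy, not code from the failing tests.

```python
import struct

def to_single(x: float) -> float:
    """Round a double to the nearest IEEE single and widen it back to double."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

as_double = 0.1             # analogous to the Fortran literal 0.1d0
as_single = to_single(0.1)  # analogous to a default-real 0.1 stored in a double

print(repr(as_double))
print(repr(as_single))
print(abs(as_single - as_double))  # error on the order of 1e-9, not 1e-17
```

An error of that size in a constant is far above double precision roundoff, so it can plausibly shift the function values an optimizer prints.
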
> >    Chen Gang, I assume you compiled with the default standard 
> > precision PETSc configure options?
> > 
> > 
> > 
> > On Apr 6, 2021, at 3:56 AM, Chen Gang <[email protected]> wrote:
> > 
> > 
> > See the attachments. alltest.log is on a machine with 96 cores, ARM, 
> > with FC, gcc9.3.5, mpich3.4.1, fblaslapack; 6 failures
> > alltest2.log is on an intel machine with 40 cores, x86, without 
> > FC; icc & mkl & intel mpi; only 1 failure
> > 
> > ------------------ Original ------------------
> > From: "petsc-dev" <[email protected]>;
> > Date: Tue, Apr 6, 2021 12:38
> > To: "petsc-dev" <[email protected]>;
> > Cc: "Chen Gang" <[email protected]>; "cglwdm" <[email protected]>;
> > Subject: Re: [petsc-dev] Petsc: Error code 1
> &gt; 
> > Note: do not use '-j' with alltests.
> > 
> > And run alltests on both machines [but *not* at the same time on 
> > machines] and send us logs from both the runs.
> > 
> > Satish
> > 
> > 
> > On Mon, 5 Apr 2021, Satish Balay wrote:
> > 
> > > Try:
> > >
> > > make alltests TIMEOUT=600
> > >
> > > And send us the complete log (alltests.log)
> > >
> > > Satish
> > >
> > > On Tue, 6 Apr 2021, Chen Gang wrote:
> > >
> > > > Dear sir,
> > > >
> > > >
> > > > The result of make check is OK. And I did set the timeout to a 
> > > > larger value, which keeps me from getting timeout errors. The thing is 
> > > > I have two machines, and I get error code 1 in different tests on 
> > > > different machines. I don't know what error code 1 is. What causes 
> > > > this? How can I fix the failing tests?
> > > >
> > > >
> > > > ------------------ Original ------------------
> > > > From: Satish Balay <[email protected]>
> > > > Date: Tue, Apr 6, 2021 0:18 PM
> > > > To: Chen Gang <[email protected]>
> > > > Cc: petsc-dev <[email protected]>, cglwdm <[email protected]>
> > > > Subject: Re: [petsc-dev] Petsc: Error code 1
> > >
> > 
> > <alltests2.log><alltests.log>
> > 
> >
