Hi Matt,

I am sure that the partitioning is exactly the same: I have an external tool that partitions the mesh before launching the FE code, so for all the runs the mesh partition has been created only once and then reused.
For the case where I wanted every ghost node to be shared by two and only two processors, I used simple geometries like rings or bars with structured meshes. Once again the partitions have been created once and then reused. The initial residuals and the initial matrix are exactly the same.

I have added some lines to my code: after calling MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY); I made a matrix-vector product between A and a vector of ones, then computed the norm of the resulting vector. You will see below the results for 4 linear system solves (two with 2.3.0 and two with 2.3.3p8). Mainly:

1/ for all runs the result of the matrix * unit-vector product is the same: 6838.31173987650
2/ the initial residual as well: 1.50972105381228e+006
3/ at iteration 40 all the runs provide exactly the same residual: Iteration= 40 residual= 2.64670054e+003 tolerance= 3.01944211e+000
4/ with 2.3.0 the final residual is always the same: 3.19392726797939e+000
5/ with 2.3.3p8 the final residual varies after iteration 40. Some statistics made with 12 successive runs: we obtained 3.19515221050523 five times, 3.19369843187027 twice, 3.19373947848208 three times, and the two remaining values once each.

RUN1:  3.19515221050523e+000
RUN2:  3.19515221050523e+000
RUN3:  3.19369843187027e+000
RUN4:  3.19588480582213e+000
RUN5:  3.19515221050523e+000
RUN6:  3.19373947848208e+000
RUN7:  3.19515221050523e+000
RUN8:  3.19384417350916e+000
RUN9:  3.19515221050523e+000
RUN10: 3.19373947848208e+000
RUN11: 3.19369843187027e+000
RUN12: 3.19373947848208e+000

So: same initial residual, same result for the matrix * unit-vector product, same residual at iteration 40.

I always used the options:
OptionTable: -ksp_truemonitor
OptionTable: -log_summary

Any ideas will be very welcome; don't hesitate to ask if you need additional tests. Could it perhaps be reuse of a buffer that has not been properly released?
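For reference, the check amounts to the following minimal sketch, written against the current PETSc C API (the 2.3.x names differ slightly, e.g. MatGetVecs instead of MatCreateVecs and CHKERRQ instead of PetscCall; the helper name is only for illustration). It prints the quantity reported as "Norm A*One" in the logs below:

  #include <petscmat.h>

  /* Print || A * ones ||_2 right after MatAssemblyEnd(). */
  PetscErrorCode CheckMatTimesOnes(Mat A)
  {
    Vec       ones, y;
    PetscReal norm;

    PetscCall(MatCreateVecs(A, &ones, &y));  /* vectors with A's parallel layout */
    PetscCall(VecSet(ones, 1.0));            /* the vector of ones */
    PetscCall(MatMult(A, ones, y));          /* y = A * ones */
    PetscCall(VecNorm(y, NORM_2, &norm));    /* the value logged as "Norm A*One" */
    PetscCall(PetscPrintf(PETSC_COMM_WORLD, "Norm A*One = %.15g\n", (double)norm));
    PetscCall(VecDestroy(&ones));
    PetscCall(VecDestroy(&y));
    return 0;
  }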
Best regards,
Etienne

------------------------------------------------------------------------

With 2.3.0: Using Petsc Release Version 2.3.0, Patch 44, April, 26, 2005

RUN1:
Norm A*One = 6838.31173987650
 * Resolution method : Preconditioned Conjugate Residual
 * Preconditioner    : BJACOBI with ILU, Blocks of 1
 *
 * Initial Residual  : 1.50972105381228e+006
Iteration=  1 residual= 9.59236416e+004 tolerance= 7.54860527e-002
Iteration=  2 residual= 8.46044988e+004 tolerance= 1.50972105e-001
Iteration= 66 residual= 3.73014307e+001 tolerance= 4.98207948e+000
Iteration= 67 residual= 3.75579067e+001 tolerance= 5.05756553e+000
Iteration= 68 residual= 3.19392727e+000 tolerance= 5.13305158e+000
 *
 * Number of iterations : 68
 * Convergence code     : 3
 * Final Residual Norm  : 3.19392726797939e+000
 * PETSC : Resolution time : 1.000389 seconds

RUN2:
Norm A*One = 6838.31173987650
 * Resolution method : Preconditioned Conjugate Residual
 * Preconditioner    : BJACOBI with ILU, Blocks of 1
 *
 * Initial Residual  : 1.50972105381228e+006
Iteration=  1 residual= 9.59236416e+004 tolerance= 7.54860527e-002
Iteration=  2 residual= 8.46044988e+004 tolerance= 1.50972105e-001
Iteration= 10 residual= 2.73382943e+004 tolerance= 7.54860527e-001
Iteration= 20 residual= 7.27122933e+003 tolerance= 1.50972105e+000
Iteration= 30 residual= 8.42209039e+003 tolerance= 2.26458158e+000
Iteration= 40 residual= 2.64670054e+003 tolerance= 3.01944211e+000
Iteration= 50 residual= 3.17446784e+002 tolerance= 3.77430263e+000
Iteration= 60 residual= 3.53234217e+001 tolerance= 4.52916316e+000
Iteration= 66 residual= 3.73014307e+001 tolerance= 4.98207948e+000
Iteration= 67 residual= 3.75579067e+001 tolerance= 5.05756553e+000
Iteration= 68 residual= 3.19392727e+000 tolerance= 5.13305158e+000
 *
 * Number of iterations : 68
 * Convergence code     : 3
 * Final Residual Norm  : 3.19392726797939e+000
 * PETSC : Resolution time : 0.888913 seconds

********************************************************************************

WITH 2.3.3p8: Using Petsc Release Version 2.3.3, Patch 8, Fri Nov 16 17:03:40 CST 2007
HG revision: 414581156e67e55c761739b0deb119f7590d0f4b

RUN1:
Norm A*One = 6838.31173987650
 * Resolution method : Preconditioned Conjugate Residual
 * Preconditioner    : BJACOBI with ILU, Blocks of 1
 *
 * Initial Residual  : 1.50972105381228e+006
Iteration=  1 residual= 9.59236416e+004 tolerance= 7.54860527e-002
Iteration=  2 residual= 8.46044988e+004 tolerance= 1.50972105e-001
Iteration= 10 residual= 2.73382943e+004 tolerance= 7.54860527e-001
Iteration= 20 residual= 7.27122933e+003 tolerance= 1.50972105e+000
Iteration= 30 residual= 8.42209039e+003 tolerance= 2.26458158e+000
Iteration= 40 residual= 2.64670054e+003 tolerance= 3.01944211e+000
Iteration= 50 residual= 3.17446756e+002 tolerance= 3.77430263e+000
Iteration= 60 residual= 3.53234489e+001 tolerance= 4.52916316e+000
Iteration= 65 residual= 7.12874932e+000 tolerance= 4.90659342e+000
Iteration= 66 residual= 3.72396571e+001 tolerance= 4.98207948e+000
Iteration= 67 residual= 3.75096723e+001 tolerance= 5.05756553e+000
Iteration= 68 residual= 3.19515221e+000 tolerance= 5.13305158e+000
 *
 * Number of iterations : 68
 * Convergence code     : 3
 * Final Residual Norm  : 3.19515221050523e+000
 * PETSC : Resolution time : 0.928915 seconds

RUN2:
Norm A*One = 6838.31173987650
 * Resolution method : Preconditioned Conjugate Residual
 * Preconditioner    : BJACOBI with ILU, Blocks of 1
 *
 * Initial Residual  : 1.50972105381228e+006
Iteration=  1 residual= 9.59236416e+004 tolerance= 7.54860527e-002
Iteration=  2 residual= 8.46044988e+004 tolerance= 1.50972105e-001
Iteration= 10 residual= 2.73382943e+004 tolerance= 7.54860527e-001
Iteration= 20 residual= 7.27122933e+003 tolerance= 1.50972105e+000
Iteration= 30 residual= 8.42209039e+003 tolerance= 2.26458158e+000
Iteration= 40 residual= 2.64670054e+003 tolerance= 3.01944211e+000
Iteration= 50 residual= 3.17446774e+002 tolerance= 3.77430263e+000
Iteration= 60 residual= 3.53233608e+001 tolerance= 4.52916316e+000
Iteration= 65 residual= 7.12937602e+000 tolerance= 4.90659342e+000
Iteration= 66 residual= 3.72832632e+001 tolerance= 4.98207948e+000
Iteration= 67 residual= 3.75447170e+001 tolerance= 5.05756553e+000
Iteration= 68 residual= 3.19369843e+000 tolerance= 5.13305158e+000
 *
 * Number of iterations : 68
 * Convergence code     : 3
 * Final Residual Norm  : 3.19369843187027e+000
 * PETSC : Resolution time : 0.872702 seconds

Etienne

-----Original Message-----
From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] On behalf of Matthew Knepley
Sent: Wednesday, September 24, 2008 19:15
To: petsc-users at mcs.anl.gov
Subject: Re: Non repetability issue and difference between 2.3.0 and 2.3.3

On Wed, Sep 24, 2008 at 11:21 AM, Etienne PERCHAT
<etienne.perchat at transvalor.com> wrote:
> Dear Petsc users,
>
> I come again with my comparisons between v2.3.0 and v2.3.3p8.
>
> I face a non-repeatability issue with v2.3.3 that I didn't have with v2.3.0.
> I have read the exchanges made in March on a related subject, but in my case
> it is at the first linear system solution that two successive runs differ.
>
> It happens when the number of processors used is greater than 2, even on a
> standard PC.
> I am solving MPIBAIJ symmetric systems with the Conjugate Residual method,
> preconditioned with ILU(1) and Block Jacobi between subdomains.
> This system is the result of an FE assembly on an unstructured mesh.
>
> I made all the runs using -log_summary and -ksp_truemonitor.
>
> Starting with the same initial matrix and RHS, each run using 2.3.3p8
> provides slightly different results, while we obtain exactly the same
> solution with v2.3.0.
>
> With Petsc 2.3.3p8:
>
> Run1: Iteration= 68 residual= 3.19515221e+000 tolerance= 5.13305158e+000 0
> Run2: Iteration= 68 residual= 3.19588481e+000 tolerance= 5.13305158e+000 0
> Run3: Iteration= 68 residual= 3.19384417e+000 tolerance= 5.13305158e+000 0
>
> With Petsc 2.3.0:
>
> Run1: Iteration= 68 residual= 3.19369843e+000 tolerance= 5.13305158e+000 0
> Run2: Iteration= 68 residual= 3.19369843e+000 tolerance= 5.13305158e+000 0
>
> If I made a 4-proc run with a mesh partitioning such that any node could be
> located on more than 2 procs, I did not face the problem.

It is not clear whether you have verified that on different runs, the
partitioning is exactly the same.

   Matt

> I first thought about an MPI problem related to the order in which messages
> are received and then summed.
> But then it would have been exactly the same with 2.3.0?
>
> Any tips/ideas?
>
> Thanks in advance.
> Best regards,
>
> Etienne Perchat

--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener
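A side note on the message-ordering hypothesis in the quoted mail above: floating-point addition is not associative, so if the ghost-node contributions are summed in a different order from one run to the next, differences confined to the last digits, like the ones reported here, are exactly what one would expect. A tiny stand-alone illustration in plain C (not tied to PETSc):

  #include <stdio.h>

  /* Summing the same three values in two different orders gives results
   * that differ in the last bits of the double. */
  int main(void)
  {
    double a = 0.1, b = 0.2, c = 0.3;
    printf("(a + b) + c = %.17g\n", (a + b) + c);  /* one arrival order   */
    printf("a + (b + c) = %.17g\n", a + (b + c));  /* another arrival order */
    return 0;
  }

The two printed sums differ in the last bits even though the operands are identical, which is the size of the run-to-run variations seen with 2.3.3p8.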
