On Fri, Aug 28, 2020 at 11:30 AM Roy Stogner <royst...@oden.utexas.edu>
wrote:

>
> On Fri, 28 Aug 2020, John Peterson wrote:
>
> > On Fri, Aug 21, 2020 at 9:51 AM Nikhil Vaidya <nikhilvaidy...@gmail.com>
> > wrote:
> >
> >> I need to print the sparse matrices (Petsc) and vectors involved in my
> >> calculations to file using print_matlab(). I have observed that the
> >> matrices and vectors that are written to the matlab scripts in serial
> and
> >> parallel runs are not identical. Is this actually the case or am I
> missing
> >> something?
> >>
> >
> > By "not identical" I guess you mean that they don't match in all digits,
> > but are they at least "close"? It's normal to have floating point
> > differences between serial and parallel runs, but they should be due to
> > different orders of operations, and therefore on the order of 10-100
> > times machine epsilon (in a relative sense).
>
> To expand: it's even normal to have floating point differences between
> different parallel runs.  Both MPI reductions and threading pool
> algorithms typically operate on "I'll begin summing the first data I
> see ready" for efficiency, and "the first data I see ready" depends on
> how loaded each CPU and network device is, meaning the reductions are
> practically done in random order.  IIRC there's even a funny bit in
> the MPI standard where they find a very polite and professional way
> of saying, "If you don't like it, then why don't you go write your own
> reduction code!?"
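
Just to make the order-of-operations point concrete, here's a throwaway
example (nothing MPI- or libMesh-specific about it):

// Summing the same numbers in two different orders gives two different
// answers in floating point; a parallel reduction effectively picks one
// of those orders at random each run.
#include <iostream>
#include <numeric>
#include <vector>

int main ()
{
  std::vector<double> x = {1e16, 1., -1e16, 1.};

  const double forward = std::accumulate(x.begin(),  x.end(),  0.);
  const double reverse = std::accumulate(x.rbegin(), x.rend(), 0.);

  // Typically prints "1 vs. 0" with IEEE doubles.
  std::cout << forward << " vs. " << reverse << std::endl;

  return 0;
}

The numbers there are chosen to exaggerate the effect; in a reasonably
well-conditioned assembly and solve you'd expect the differences to stay
down at the 10-100 * machine epsilon level mentioned above.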
>
> And if you're using PETSc?  You're probably not even using the same
> algorithm in serial vs parallel; the default (for performance /
> robustness reasons) is Block Jacobi (between processors) + ILU0
> (within a processor), so the very definition of your preconditioner
> depends on your partitioning.  This is a much bigger issue than the
> order of operations problem.  Because of it, if you want to be able to
> do testing on different processor counts (or partitioner settings or
> solver algorithm choices or preconditioner algorithm choices), you
> can't safely assert that a "gold" regression test standard will be
> repeatable to a tolerance any better than your solver tolerance (or
> even equal to your solver tolerance, thanks to conditioning issues).
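
If you do want solves that are directly comparable across processor counts,
one option (at a performance cost) is to take the partition-dependent
preconditioner out of the picture, e.g. by running with PETSc options along
the lines of

  -ksp_type gmres -pc_type jacobi -ksp_rtol 1e-12

Point Jacobi (or -pc_type none) doesn't depend on the partitioning, so what's
left is just the reduction-order noise described above.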
>

Hi Nikhil,

A colleague just reminded me that the DOFs are numbered and distributed
differently in serial vs. parallel, depending on the number of processors you
have. So in general the rows and columns of a matrix written to file in serial
are permuted relative to those of one written in parallel, and you can't
compare the two files entry-for-entry.

Assuming you have only nodal DOFs and the mesh doesn't get renumbered in
serial vs. parallel, you could theoretically write out the non-zero matrix
entries row by row in a permuted order (e.g. node number order) in both
serial and parallel, and then check that it matches. Probably a bit more
effort than you were looking to spend, unfortunately.
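
If you do want to go that route, something along these lines might be a
starting point. It's completely untested, and it assumes nodal DOFs with a
single variable, a PetscMatrix underneath, and the dof_indices(Node *, ...)
overload of DofMap (check your libMesh version). Each rank dumps its
locally-owned rows with row and column indices translated to node ids, so the
per-rank files can be concatenated, sorted, and diffed between a serial and a
parallel run:

#include "libmesh/dof_map.h"
#include "libmesh/mesh_base.h"
#include "libmesh/node.h"
#include "libmesh/petsc_matrix.h"

#include <fstream>
#include <map>
#include <string>
#include <vector>

using namespace libMesh;

// Untested sketch: write the nonzeros of "matrix" with row and column
// indices reported as node ids instead of dof ids, one file per rank.
void dump_rows_by_node (const MeshBase & mesh,
                        const DofMap & dof_map,
                        PetscMatrix<Number> & matrix,
                        const std::string & prefix)
{
  // Map every dof index visible on this processor (local + ghosted
  // nodes) back to the id of the node it lives on.  Assumes a single
  // variable, so node id <-> dof index is one-to-one.
  std::map<dof_id_type, dof_id_type> dof_to_node;
  std::vector<dof_id_type> di;
  for (const Node * node : mesh.node_ptr_range())
    {
      dof_map.dof_indices (node, di);
      for (const dof_id_type d : di)
        dof_to_node[d] = node->id();
    }

  // One output file per processor; concatenate and sort afterwards.
  std::ofstream out (prefix + "." + std::to_string(mesh.processor_id()));

  for (const Node * node : mesh.local_node_ptr_range())
    {
      dof_map.dof_indices (node, di);

      for (const dof_id_type row : di)
        {
          // Raw PETSc access to the nonzeros of this locally-owned row;
          // the matrix must already be assembled (closed).
          PetscInt ncols;
          const PetscInt * cols;
          const PetscScalar * vals;
          MatGetRow (matrix.mat(), static_cast<PetscInt>(row),
                     &ncols, &cols, &vals);

          for (PetscInt c = 0; c < ncols; ++c)
            out << node->id() << " "
                << dof_to_node[static_cast<dof_id_type>(cols[c])] << " "
                << vals[c] << "\n";

          MatRestoreRow (matrix.mat(), static_cast<PetscInt>(row),
                         &ncols, &cols, &vals);
        }
    }
}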

-- 
John

_______________________________________________
Libmesh-users mailing list
Libmesh-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/libmesh-users
