I’ve switched the default parallel LU solver back to MUMPS and set MUMPS to use AMD ordering (anything other than METIS would do), which seems to avoid MUMPS crashing when PETSc is configured with recent METIS versions.
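If you want to force the ordering yourself, it can be set through the PETSc options database: MUMPS selects its ordering via ICNTL(7), where (per the MUMPS user guide) 0 is AMD and 5 is METIS. A minimal sketch, using the same DOLFIN PETScOptions interface that comes up later in this thread:

    from dolfin import PETScOptions

    # ICNTL(7) picks the fill-reducing ordering for the symbolic
    # factorization: 0 = AMD, 5 = METIS (see the MUMPS user guide).
    # Set this before the first solve so PETSc passes it on to MUMPS.
    PETScOptions.set('mat_mumps_icntl_7', 0)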
Garth

On 27 Mar 2014, at 11:52, Garth N. Wells <[email protected]> wrote:

> 
> On 26 Mar 2014, at 18:45, Jan Blechta <[email protected]> wrote:
> 
>> On Wed, 26 Mar 2014 17:16:13 +0100
>> "Garth N. Wells" <[email protected]> wrote:
>> 
>>> 
>>> On 26 Mar 2014, at 16:56, Jan Blechta <[email protected]> wrote:
>>> 
>>>> On Wed, 26 Mar 2014 16:29:11 +0100
>>>> "Garth N. Wells" <[email protected]> wrote:
>>>> 
>>>>> 
>>>>> On 26 Mar 2014, at 16:26, Jan Blechta <[email protected]> wrote:
>>>>> 
>>>>>> On Wed, 26 Mar 2014 16:16:25 +0100
>>>>>> Johannes Ring <[email protected]> wrote:
>>>>>> 
>>>>>>> On Wed, Mar 26, 2014 at 1:39 PM, Jan Blechta
>>>>>>> <[email protected]> wrote:
>>>>>>>> As a follow-up to the 'Broken PETSc wrappers?' thread on this
>>>>>>>> list, can anyone reproduce an incorrect (orders of magnitude off)
>>>>>>>> norm using superlu_dist on the following example? Both in serial
>>>>>>>> and parallel. Thanks,
>>>>>>> 
>>>>>>> This is the result I got:
>>>>>>> 
>>>>>>> Serial:
>>>>>>> 
>>>>>>> L2 norm mumps        0.611356580181
>>>>>>> L2 norm superlu_dist 92.4733890983
>>>>>>> 
>>>>>>> Parallel (2 processes):
>>>>>>> 
>>>>>>> L2 norm mumps        0.611356580181
>>>>>>> L2 norm superlu_dist 220.027905995
>>>>>>> L2 norm mumps        0.611356580181
>>>>>>> L2 norm superlu_dist 220.027905995
>>>>>> 
>>>>>> The superlu_dist results are obviously wrong. Do we have broken
>>>>>> installations, or is there something wrong with the library?
>>>>>> 
>>>>>> In the latter case I would suggest switching the default back to
>>>>>> MUMPS. (Additionally, MUMPS has Cholesky factorization!) What was
>>>>>> your motivation for switching to superlu_dist, Garth?
>>>>>> 
>>>>> 
>>>>> MUMPS often fails in parallel with global dofs, and there is no
>>>>> indication that the MUMPS developers are willing to fix bugs.
>>>> 
>>>> I'm not sure what you mean by 'MUMPS fails’.
>>> 
>>> Crashes.
>>> 
>>>> I also observe that MUMPS sometimes fails because the size of the
>>>> work arrays estimated during symbolic factorization is not
>>>> sufficient for the actual numeric factorization with pivoting. But
>>>> this is hardly a bug.
>>> 
>>> It has bugs with some versions of SCOTCH. We’ve been over this
>>> before. What you describe above indeed isn’t a bug, but just poor
>>> software design in MUMPS.
>>> 
>>>> It can be analyzed simply by increasing the verbosity
>>>> 
>>>> PETScOptions.set('mat_mumps_icntl_4', 3)
>>>> 
>>>> and fixed by increasing the 'work array increase percentage'
>>>> 
>>>> PETScOptions.set('mat_mumps_icntl_14', 50)  # default=25
>>>> 
>>>> or by decreasing the pivoting threshold. I suspect that a frequent
>>>> reason for this is using partitions that are too small (too many
>>>> processes). (Users should also use Cholesky, and positive-definite
>>>> Cholesky whenever possible. The numerics are much better and more
>>>> things are predictable in the analysis phase.)
>>>> 
>>>> On the other hand, superlu_dist is computing rubbish without any
>>>> warning for me and Johannes. Can you reproduce that?
>>>> 
>>> 
>>> I haven’t had time to look. We should have unit testing for LU
>>> solvers. From memory, I don’t think we do.
>> 
>> OK, the fix is to switch the column ordering
>> 
>> PETScOptions.set('mat_superlu_dist_colperm', col_ordering)
>> 
>> col_ordering    | properties
>> ----------------|----------------------------------------------------
>> NATURAL         | works, large fill-in
>> MMD_AT_PLUS_A   | works, smallest fill-in (for this case)
>> MMD_ATA         | works, reasonable fill-in
>> METIS_AT_PLUS_A | computes rubbish (default on my system for this case)
>> PARMETIS        | supported only in parallel, computes rubbish
>> 
>> or the row ordering
>> 
>> PETScOptions.set('mat_superlu_dist_rowperm', row_ordering)
>> 
>> row_ordering | properties
>> -------------|-------------------------------------------------------
>> NATURAL      | works, good fill-in
>> LargeDiag    | computes rubbish (default on my system for this case)
>> 
>> or both.
>> 
> 
> Good digging. Is there any way to know when superlu_dist is going to
> return garbage? It’s concerning that it can silently return a solution
> that is way off.
> 
> Garth
> 
>> Jan
>> 
>>> 
>>> Garth
>>> 
>>>> Jan
>>>> 
>>>>> 
>>>>> Garth
>>>>> 
>>>>>> Jan
>>>>>> 
>>>>>>> 
>>>>>>> Johannes
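To tie the thread together, here is a minimal sketch of the kind of comparison discussed above, with Jan's column-ordering workaround applied up front. The Poisson problem is a stand-in (the original test script was an attachment and is not reproduced in the thread), and the sketch assumes a DOLFIN build where LUSolver accepts a backend name such as 'mumps' or 'superlu_dist', so the norms will differ from those Johannes reported:

    from dolfin import *

    # Jan's workaround: the default METIS_AT_PLUS_A column ordering gave
    # wrong answers, so force MMD_AT_PLUS_A before the first solve.
    PETScOptions.set('mat_superlu_dist_colperm', 'MMD_AT_PLUS_A')

    # Stand-in Poisson problem on the unit square.
    mesh = UnitSquareMesh(32, 32)
    V = FunctionSpace(mesh, 'CG', 1)
    u, v = TrialFunction(V), TestFunction(V)
    a = inner(grad(u), grad(v))*dx
    L = Constant(1.0)*v*dx
    bc = DirichletBC(V, 0.0, 'on_boundary')

    A, b = assemble_system(a, L, bc)

    # Solve with each LU backend and compare the norms.
    for method in ['mumps', 'superlu_dist']:
        uh = Function(V)
        LUSolver(method).solve(A, uh.vector(), b)
        print('L2 norm %-13s %.12g' % (method, norm(uh, 'L2')))

With a working superlu_dist ordering, the two backends should agree to solver tolerance; a discrepancy of orders of magnitude, as in the results above, is the symptom being discussed.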
