On Sat, 31 Dec 2016, Eric Chamberland wrote:

> Hi,
>
> I am just starting to debug a bug encountered with and only with SuperLU_Dist
> combined with MKL on a 2 processes validation test.
>
> (the same test works fine with MUMPS on 2 processes).
>
> I just noticed that the SuperLU_Dist version installed by PETSc configure
> script is 5.1.0 and the latest SuperLU_DIST is 5.1.3.

If you use petsc-master - it will install 5.1.3 by default.

>
> Before going further, I just want to ask:
>
> Is there any specific reason to stick to 5.1.0?

We don't usually upgrade externalpackage versions in PETSc releases [unless
it's tested to work and fixes known bugs]. There could be API changes - or
build changes that can potentially conflict.

From what I know - 5.1.3 should work with petsc-3.7 [it fixes a couple of
bugs].

You might be able to do the following with petsc-3.7 [with git
externalpackage repos]

--download-superlu_dist --download-superlu_dist-commit=v5.1.3

Satish

> Here is some more information:
>
> On process 2 I have this printed in stdout:
>
> Intel MKL ERROR: Parameter 6 was incorrect on entry to DTRSM .
>
> and in stderr:
>
> Test.ProblemeEFGen.opt: malloc.c:2369: sysmalloc: Assertion `(old_top ==
> (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) - __builtin_offsetof
> (struct malloc_chunk, fd)))) && old_size == 0) || ((unsigned long) (old_size)
> >= (unsigned long)((((__builtin_offsetof (struct malloc_chunk,
> fd_nextsize))+((2 *(sizeof(size_t))) - 1)) & ~((2 *(sizeof(size_t))) - 1))) &&
> ((old_top)->size & 0x1) && ((unsigned long) old_end & pagemask) == 0)' failed.
> [saruman:15771] *** Process received signal ***
>
> This is the 7th call to KSPSolve in the same execution. Here is the last
> KSPView:
>
> KSP Object:(o_slin) 2 MPI processes
>   type: preonly
>   maximum iterations=10000, initial guess is zero
>   tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
>   left preconditioning
>   using NONE norm type for convergence test
> PC Object:(o_slin) 2 MPI processes
>   type: lu
>     LU: out-of-place factorization
>     tolerance for zero pivot 2.22045e-14
>     matrix ordering: natural
>     factor fill ratio given 0., needed 0.
>       Factored matrix follows:
>         Mat Object: 2 MPI processes
>           type: mpiaij
>           rows=382, cols=382
>           package used to perform factorization: superlu_dist
>           total: nonzeros=0, allocated nonzeros=0
>           total number of mallocs used during MatSetValues calls =0
>             SuperLU_DIST run parameters:
>               Process grid nprow 2 x npcol 1
>               Equilibrate matrix TRUE
>               Matrix input mode 1
>               Replace tiny pivots FALSE
>               Use iterative refinement FALSE
>               Processors in row 2 col partition 1
>               Row permutation LargeDiag
>               Column permutation METIS_AT_PLUS_A
>               Parallel symbolic factorization FALSE
>               Repeated factorization SamePattern
>   linear system matrix = precond matrix:
>   Mat Object: (o_slin) 2 MPI processes
>     type: mpiaij
>     rows=382, cols=382
>     total: nonzeros=4458, allocated nonzeros=4458
>     total number of mallocs used during MatSetValues calls =0
>       using I-node (on process 0) routines: found 109 nodes, limit used is 5
>
> I know this information is not enough to help debug, but I would like to know
> if PETSc guys will upgrade to 5.1.3 before trying to debug anything.
>
> Thanks,
> Eric
>
>
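
For reference, a minimal sketch of how the two configure options suggested in
the reply above might be passed to PETSc's configure script. It assumes the
intended option name is --download-superlu_dist-commit, uses a hypothetical
default for PETSC_DIR, and leaves out whatever other options your build
already needs. It only prints the command, so it can be reviewed before
anything is run.

```shell
#!/bin/sh
# Sketch only. The PETSC_DIR default below is a hypothetical location;
# point it at your own petsc-3.7 git checkout.
PETSC_DIR="${PETSC_DIR:-$HOME/petsc}"

# The two options from the reply above (commit option spelling is assumed).
SUPERLU_OPTS="--download-superlu_dist --download-superlu_dist-commit=v5.1.3"

# Build the command as a string and print it instead of running it, so it
# can be checked and merged with the rest of your usual configure options.
CONFIGURE_CMD="./configure $SUPERLU_OPTS"
echo "cd $PETSC_DIR && $CONFIGURE_CMD"
```

After reconfiguring and rebuilding, re-running the failing 2-process test would
show whether the DTRSM error still appears with 5.1.3.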
