Ah - ok. A bug in superlu_dist. The version string in CMakeLists.txt needs
updating for every release:

set(VERSION_MAJOR "5")
set(VERSION_MINOR "1")
set(VERSION_BugFix "0")

cc:ing Sherry.

Satish

On Sat, 31 Dec 2016, Eric Chamberland wrote:

> Ah ok, I see! Here, look at the file name in the configure.log:
>
> Install the project...
> /usr/bin/cmake -P cmake_install.cmake
> -- Install configuration: "DEBUG"
> -- Installing: /opt/petsc-master_debug/lib/libsuperlu_dist.so.5.1.0
> -- Installing: /opt/petsc-master_debug/lib/libsuperlu_dist.so.5
>
> It is saying 5.1.0, but in fact you are right: it is 5.1.3 that is
> downloaded!!! :)
>
> And FWIW, the nightly automatic compilation of PETSc starts within a brand
> new and empty directory each night...
>
> Thanks to both of you again! :)
>
> Eric
>
>
> On 2016-12-31 at 13:17, Satish Balay wrote:
> >
> > ===============================================================================
> > Trying to download
> > git://https://github.com/xiaoyeli/superlu_dist for
> > SUPERLU_DIST
> >
> > ===============================================================================
> >
> > Executing: git clone https://github.com/xiaoyeli/superlu_dist
> > Executing:
> > /pmi/cmpbib/compilation_BIB_gcc_redhat_petsc-master_debug/COMPILE_AUTO/petsc-master-debug/arch-linux2-c-debug/externalpackages/git.superlu_dist
> > stdout: Cloning into
> > '/pmi/cmpbib/compilation_BIB_gcc_redhat_petsc-master_debug/COMPILE_AUTO/petsc-master-debug/arch-linux2-c-debug/externalpackages/git.superlu_dist'...
> > Looking for SUPERLU_DIST at git.superlu_dist,
> > hg.superlu_dist or a directory starting with
> > ['superlu_dist']
> > Found a copy of SUPERLU_DIST in git.superlu_dist
> > Executing: ['git', 'rev-parse', '--git-dir']
> > stdout: .git
> > Executing: ['git', 'cat-file', '-e', 'v5.1.3^{commit}']
> > Executing: ['git', 'rev-parse', 'v5.1.3']
> > stdout: 7306f704c6c8d5113def649b76def3c8eb607690
> > Executing: ['git', 'stash']
> > stdout: No local changes to save
> > Executing: ['git', 'clean', '-f', '-d', '-x']
> > Executing: ['git', 'checkout', '-f',
> > Executing: '7306f704c6c8d5113def649b76def3c8eb607690']
> > <<<<<<<<
> >
> > Per log below - it's using 5.1.3. Why did you think you got 5.1.0?
> >
> > Satish
> >
> > On Sat, 31 Dec 2016, Eric Chamberland wrote:
> >
> > > Hi,
> > >
> > > ok I will test with 5.1.3 with the option you gave me
> > > (--download-superlu_dist-commit=v5.1.3).
> > >
> > > But from what you and Matthew said, I should have 5.1.3 with
> > > petsc-master, but last night's log shows me library file name 5.1.0:
> > >
> > > http://www.giref.ulaval.ca/~cmpgiref/petsc-master-debug/2016.12.31.02h00m01s_configure.log
> > >
> > > So I am a bit confused: Why did I get 5.1.0 last night? (I use the
> > > petsc-master tarball, is it the reason?)
> > >
> > > Thanks,
> > >
> > > Eric
> > >
> > >
> > > On 2016-12-31 at 11:52, Satish Balay wrote:
> > > > On Sat, 31 Dec 2016, Eric Chamberland wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > I am just starting to debug a bug encountered with and only with
> > > > > SuperLU_Dist combined with MKL on a 2 processes validation test.
> > > > >
> > > > > (the same test works fine with MUMPS on 2 processes).
> > > > >
> > > > > I just noticed that the SuperLU_Dist version installed by PETSc
> > > > > configure script is 5.1.0 and the latest SuperLU_DIST is 5.1.3.
> > > > If you use petsc-master - it will install 5.1.3 by default.
> > > > > Before going further, I just want to ask:
> > > > >
> > > > > Is there any specific reason to stick to 5.1.0?
> > > > We don't usually upgrade externalpackage version in PETSc releases
> > > > [unless it's tested to work and fixes known bugs]. There could be API
> > > > changes - or build changes that can potentially conflict.
> > > >
> > > > From what I know - 5.1.3 should work with petsc-3.7 [it fixes a couple
> > > > of bugs].
> > > >
> > > > You might be able to do the following with petsc-3.7 [with git
> > > > externalpackage repos]
> > > >
> > > > --download-superlu_dist --download-superlu_dist-commit=v5.1.3
> > > >
> > > > Satish
> > > >
> > > > > Here is some more information:
> > > > >
> > > > > On process 2 I have this printed in stdout:
> > > > >
> > > > > Intel MKL ERROR: Parameter 6 was incorrect on entry to DTRSM .
> > > > >
> > > > > and in stderr:
> > > > >
> > > > > Test.ProblemeEFGen.opt: malloc.c:2369: sysmalloc: Assertion `(old_top ==
> > > > > (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) -
> > > > > __builtin_offsetof (struct malloc_chunk, fd)))) && old_size == 0) ||
> > > > > ((unsigned long) (old_size) >= (unsigned long)((((__builtin_offsetof
> > > > > (struct malloc_chunk, fd_nextsize))+((2 *(sizeof(size_t))) - 1)) &
> > > > > ~((2 *(sizeof(size_t))) - 1))) && ((old_top)->size & 0x1) &&
> > > > > ((unsigned long) old_end & pagemask) == 0)' failed.
> > > > > [saruman:15771] *** Process received signal ***
> > > > >
> > > > > This is the 7th call to KSPSolve in the same execution. Here is the
> > > > > last KSPView:
> > > > >
> > > > > KSP Object:(o_slin) 2 MPI processes
> > > > >   type: preonly
> > > > >   maximum iterations=10000, initial guess is zero
> > > > >   tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
> > > > >   left preconditioning
> > > > >   using NONE norm type for convergence test
> > > > > PC Object:(o_slin) 2 MPI processes
> > > > >   type: lu
> > > > >     LU: out-of-place factorization
> > > > >     tolerance for zero pivot 2.22045e-14
> > > > >     matrix ordering: natural
> > > > >     factor fill ratio given 0., needed 0.
> > > > >       Factored matrix follows:
> > > > >         Mat Object: 2 MPI processes
> > > > >           type: mpiaij
> > > > >           rows=382, cols=382
> > > > >           package used to perform factorization: superlu_dist
> > > > >           total: nonzeros=0, allocated nonzeros=0
> > > > >           total number of mallocs used during MatSetValues calls =0
> > > > >             SuperLU_DIST run parameters:
> > > > >               Process grid nprow 2 x npcol 1
> > > > >               Equilibrate matrix TRUE
> > > > >               Matrix input mode 1
> > > > >               Replace tiny pivots FALSE
> > > > >               Use iterative refinement FALSE
> > > > >               Processors in row 2 col partition 1
> > > > >               Row permutation LargeDiag
> > > > >               Column permutation METIS_AT_PLUS_A
> > > > >               Parallel symbolic factorization FALSE
> > > > >               Repeated factorization SamePattern
> > > > >   linear system matrix = precond matrix:
> > > > >   Mat Object: (o_slin) 2 MPI processes
> > > > >     type: mpiaij
> > > > >     rows=382, cols=382
> > > > >     total: nonzeros=4458, allocated nonzeros=4458
> > > > >     total number of mallocs used during MatSetValues calls =0
> > > > >       using I-node (on process 0) routines: found 109 nodes, limit
> > > > >       used is 5
> > > > >
> > > > > I know this information is not enough to help debug, but I would like
> > > > > to know if PETSc guys will upgrade to 5.1.3 before trying to debug
> > > > > anything.
> > > > >
> > > > > Thanks,
> > > > > Eric
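
The mismatch found in this thread (tag v5.1.3 checked out, while CMakeLists.txt still declares 5.1.0 and so names the library libsuperlu_dist.so.5.1.0) could be caught by a small release check. The following is only a rough sketch in Python, assuming the version is declared with the three set(VERSION_...) lines quoted at the top of the thread. The helper names cmake_version and matches_tag are hypothetical and are not part of PETSc or SuperLU_DIST.

```python
import re

# Matches lines such as: set(VERSION_MAJOR "5")
# Assumption: the version is declared exactly in this form, as quoted in the thread.
_VERSION_RE = re.compile(r'set\(\s*VERSION_(MAJOR|MINOR|BugFix)\s+"(\d+)"\s*\)')


def cmake_version(cmake_text):
    """Return 'major.minor.bugfix' built from the set(VERSION_*) lines."""
    parts = dict(_VERSION_RE.findall(cmake_text))
    return ".".join(parts[key] for key in ("MAJOR", "MINOR", "BugFix"))


def matches_tag(cmake_text, tag):
    """Return True if the declared version equals the tag without its leading 'v'."""
    bare_tag = tag[1:] if tag.startswith("v") else tag
    return cmake_version(cmake_text) == bare_tag


# The three lines quoted in the thread, used here as sample input.
CMAKE_SNIPPET = """
set(VERSION_MAJOR "5")
set(VERSION_MINOR "1")
set(VERSION_BugFix "0")
"""

if __name__ == "__main__":
    print("declared version:", cmake_version(CMAKE_SNIPPET))
    print("matches v5.1.3:", matches_tag(CMAKE_SNIPPET, "v5.1.3"))
```

With the snippet from the thread, the declared version does not match the v5.1.3 tag, which is the inconsistency that made the installed file name look like an older release.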
