I think there is definitely a problem.
After looking at the files installed either from the petsc-master
tarball or from the manual configure I just did with
--download-superlu_dist-commit=v5.1.3, the file include/superlu_defs.h
has these values:
#define SUPERLU_DIST_MAJOR_VERSION 5
#define SUPERLU_DIST_MINOR_VERSION 1
#define SUPERLU_DIST_PATCH_VERSION 0
What's wrong?
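To rule out a misreading of the header, the three macros can be pulled
out of superlu_defs.h programmatically. This is only a minimal Python
sketch: the function name parse_superlu_dist_version and the inline
sample text are illustrative assumptions, not part of PETSc or
SuperLU_DIST. In practice the text would be read from the installed
include/superlu_defs.h.

```python
import re

# Matches lines such as "#define SUPERLU_DIST_MAJOR_VERSION 5".
_VERSION_MACRO = re.compile(
    r"^\s*#\s*define\s+SUPERLU_DIST_(MAJOR|MINOR|PATCH)_VERSION\s+(\d+)",
    re.MULTILINE,
)


def parse_superlu_dist_version(header_text):
    """Return (major, minor, patch) from the SUPERLU_DIST_*_VERSION macros.

    Hypothetical helper for illustration; raises ValueError if any of the
    three macros is missing from the text.
    """
    parts = {name: int(value) for name, value in _VERSION_MACRO.findall(header_text)}
    missing = [key for key in ("MAJOR", "MINOR", "PATCH") if key not in parts]
    if missing:
        raise ValueError("missing version macros: " + ", ".join(missing))
    return (parts["MAJOR"], parts["MINOR"], parts["PATCH"])


# Sample copied from the values quoted in this message.
sample = """\
#define SUPERLU_DIST_MAJOR_VERSION 5
#define SUPERLU_DIST_MINOR_VERSION 1
#define SUPERLU_DIST_PATCH_VERSION 0
"""

print(parse_superlu_dist_version(sample))  # prints (5, 1, 0)
```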
Eric
On 2016-12-31 at 13:26, Eric Chamberland wrote:
Ah ok, I see! Here look at the file name in the configure.log:
Install the project...
/usr/bin/cmake -P cmake_install.cmake
-- Install configuration: "DEBUG"
-- Installing: /opt/petsc-master_debug/lib/libsuperlu_dist.so.5.1.0
-- Installing: /opt/petsc-master_debug/lib/libsuperlu_dist.so.5
It is saying 5.1.0, but in fact you are right: it is 5.1.3 that is
downloaded!!! :)
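The mismatch between the installed file name and the requested tag can
be checked mechanically. The following is a rough Python sketch, not a
PETSc tool: the helper names library_version_from_log and
tag_to_version are made up for this example, and the sample lines are
the two "Installing" lines quoted above.

```python
import re

# Version suffix of the fully versioned shared library file name.
_SONAME = re.compile(r"libsuperlu_dist\.so\.(\d+)\.(\d+)\.(\d+)")


def library_version_from_log(log_text):
    """Return (major, minor, patch) from the first fully versioned
    libsuperlu_dist.so.X.Y.Z file name in the text, or None if absent."""
    match = _SONAME.search(log_text)
    return tuple(int(part) for part in match.groups()) if match else None


def tag_to_version(tag):
    """Turn a tag such as 'v5.1.3' into (5, 1, 3)."""
    return tuple(int(part) for part in tag.lstrip("v").split("."))


install_lines = """\
-- Installing: /opt/petsc-master_debug/lib/libsuperlu_dist.so.5.1.0
-- Installing: /opt/petsc-master_debug/lib/libsuperlu_dist.so.5
"""

lib = library_version_from_log(install_lines)
tag = tag_to_version("v5.1.3")
print(lib, tag, lib == tag)  # prints (5, 1, 0) (5, 1, 3) False
```

A False here only says that the file name and the tag disagree; it does
not say which source tree was actually built.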
And FWIW, the nightly automatic compilation of PETSc starts within a
brand new and empty directory each night...
Thanks to both of you again! :)
Eric
On 2016-12-31 at 13:17, Satish Balay wrote:
===============================================================================
Trying to download
git://https://github.com/xiaoyeli/superlu_dist for SUPERLU_DIST
===============================================================================
Executing: git clone
https://github.com/xiaoyeli/superlu_dist
/pmi/cmpbib/compilation_BIB_gcc_redhat_petsc-master_debug/COMPILE_AUTO/petsc-master-debug/arch-linux2-c-debug/externalpackages/git.superlu_dist
stdout: Cloning into
'/pmi/cmpbib/compilation_BIB_gcc_redhat_petsc-master_debug/COMPILE_AUTO/petsc-master-debug/arch-linux2-c-debug/externalpackages/git.superlu_dist'...
Looking for SUPERLU_DIST at git.superlu_dist,
hg.superlu_dist or a directory starting with ['superlu_dist']
Found a copy of SUPERLU_DIST in git.superlu_dist
Executing: ['git', 'rev-parse', '--git-dir']
stdout: .git
Executing: ['git', 'cat-file', '-e', 'v5.1.3^{commit}']
Executing: ['git', 'rev-parse', 'v5.1.3']
stdout: 7306f704c6c8d5113def649b76def3c8eb607690
Executing: ['git', 'stash']
stdout: No local changes to save
Executing: ['git', 'clean', '-f', '-d', '-x']
Executing: ['git', 'checkout', '-f',
'7306f704c6c8d5113def649b76def3c8eb607690']
<<<<<<<<
Per log below - it's using 5.1.3. Why did you think you got 5.1.0?
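The same reading of the log can be done with a short script. This is a
sketch under assumptions: it relies on the "Executing: [...]" and
"stdout:" line shapes seen in this particular excerpt, and the function
name summarize_checkout is invented for the example.

```python
import re

# 'git rev-parse <ref>' followed by the resolved 40-character SHA on stdout.
# The ref must not start with '-', which skips 'rev-parse --git-dir'.
_REV_PARSE = re.compile(
    r"\['git',\s*'rev-parse',\s*'([^'-][^']*)'\]\s*stdout:\s*([0-9a-f]{40})"
)
# 'git checkout -f <sha>'; \s* tolerates the line wrap seen in the log.
_CHECKOUT = re.compile(r"\['git',\s*'checkout',\s*'-f',\s*'([0-9a-f]{40})'\]")


def summarize_checkout(log_text):
    """Return the requested ref, the SHA it resolved to, and the SHA that
    was checked out, using None for anything not found in the text."""
    resolved = _REV_PARSE.search(log_text)
    checked = _CHECKOUT.search(log_text)
    return {
        "ref": resolved.group(1) if resolved else None,
        "resolved_sha": resolved.group(2) if resolved else None,
        "checked_out_sha": checked.group(1) if checked else None,
    }


excerpt = """\
Executing: ['git', 'rev-parse', '--git-dir']
stdout: .git
Executing: ['git', 'cat-file', '-e', 'v5.1.3^{commit}']
Executing: ['git', 'rev-parse', 'v5.1.3']
stdout: 7306f704c6c8d5113def649b76def3c8eb607690
Executing: ['git', 'stash']
stdout: No local changes to save
Executing: ['git', 'clean', '-f', '-d', '-x']
Executing: ['git', 'checkout', '-f',
'7306f704c6c8d5113def649b76def3c8eb607690']
"""

info = summarize_checkout(excerpt)
print(info["ref"], info["resolved_sha"] == info["checked_out_sha"])
# prints v5.1.3 True
```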
Satish
On Sat, 31 Dec 2016, Eric Chamberland wrote:
Hi,
OK, I will test with 5.1.3 using the option you gave me
(--download-superlu_dist-commit=v5.1.3).
But from what you and Matthew said, I should have 5.1.3 with
petsc-master, yet last night's log shows the library file name as
5.1.0:
http://www.giref.ulaval.ca/~cmpgiref/petsc-master-debug/2016.12.31.02h00m01s_configure.log
So I am a bit confused: why did I get 5.1.0 last night? (I use the
petsc-master tarball; is that the reason?)
Thanks,
Eric
On 2016-12-31 at 11:52, Satish Balay wrote:
On Sat, 31 Dec 2016, Eric Chamberland wrote:
Hi,
I am just starting to debug a bug that occurs only with SuperLU_DIST
combined with MKL, on a 2-process validation test.
(The same test works fine with MUMPS on 2 processes.)
I just noticed that the SuperLU_DIST version installed by the PETSc
configure script is 5.1.0, while the latest SuperLU_DIST is 5.1.3.
If you use petsc-master - it will install 5.1.3 by default.
Before going further, I just want to ask:
Is there any specific reason to stick to 5.1.0?
We don't usually upgrade externalpackage versions in PETSc releases
[unless it's tested to work and fixes known bugs]. There could be API
changes - or build changes that can potentially conflict.
From what I know - 5.1.3 should work with petsc-3.7 [it fixes a
couple of bugs].
You might be able to do the following with petsc-3.7 [with git
externalpackage repos]
--download-superlu_dist --download-superlu_dist-commit=v5.1.3
Satish
Here is some more information:
On process 2 I have this printed in stdout:
Intel MKL ERROR: Parameter 6 was incorrect on entry to DTRSM .
and in stderr:
Test.ProblemeEFGen.opt: malloc.c:2369: sysmalloc: Assertion
`(old_top == (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) -
__builtin_offsetof (struct malloc_chunk, fd)))) && old_size == 0) ||
((unsigned long) (old_size) >= (unsigned long)((((__builtin_offsetof
(struct malloc_chunk, fd_nextsize))+((2 *(sizeof(size_t))) - 1)) &
~((2 *(sizeof(size_t))) - 1))) && ((old_top)->size & 0x1) &&
((unsigned long) old_end & pagemask) == 0)' failed.
[saruman:15771] *** Process received signal ***
This is the 7th call to KSPSolve in the same execution. Here is the
last KSPView:
KSP Object:(o_slin) 2 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using NONE norm type for convergence test
PC Object:(o_slin) 2 MPI processes
type: lu
LU: out-of-place factorization
tolerance for zero pivot 2.22045e-14
matrix ordering: natural
factor fill ratio given 0., needed 0.
Factored matrix follows:
Mat Object: 2 MPI processes
type: mpiaij
rows=382, cols=382
package used to perform factorization: superlu_dist
total: nonzeros=0, allocated nonzeros=0
total number of mallocs used during MatSetValues calls =0
SuperLU_DIST run parameters:
Process grid nprow 2 x npcol 1
Equilibrate matrix TRUE
Matrix input mode 1
Replace tiny pivots FALSE
Use iterative refinement FALSE
Processors in row 2 col partition 1
Row permutation LargeDiag
Column permutation METIS_AT_PLUS_A
Parallel symbolic factorization FALSE
Repeated factorization SamePattern
linear system matrix = precond matrix:
Mat Object: (o_slin) 2 MPI processes
type: mpiaij
rows=382, cols=382
total: nonzeros=4458, allocated nonzeros=4458
total number of mallocs used during MatSetValues calls =0
using I-node (on process 0) routines: found 109 nodes, limit used is 5
I know this information is not enough to help debug, but before trying
to debug anything I would like to know whether the PETSc developers
will upgrade to 5.1.3.
Thanks,
Eric