Hi,

OK, staying on petsc-3.7.4 but building it with SuperLU_DIST 5.1.3 (--download-superlu_dist-commit=v5.1.3) fixed the issue!

Thanks to both of you and happy new year! :)

Eric



On 2016-12-31 at 11:51, Matthew Knepley wrote:
On Sat, Dec 31, 2016 at 9:53 AM, Eric Chamberland <[email protected]> wrote:

    Hi,

    I am just starting to debug a failure that occurs only with
    SuperLU_DIST combined with MKL, on a 2-process validation test.

    (The same test works fine with MUMPS on 2 processes.)

    I just noticed that the SuperLU_DIST version installed by the PETSc
    configure script is 5.1.0, while the latest SuperLU_DIST is 5.1.3.

    Before going further, I just want to ask:

    Is there any specific reason to stick to 5.1.0?


Can you debug in 'master' which does have 5.1.3, including an important bug fix?

   Matt


    Here is some more information:

    On process 2 I have this printed in stdout:

    Intel MKL ERROR: Parameter 6 was incorrect on entry to DTRSM .

    and in stderr:

    Test.ProblemeEFGen.opt: malloc.c:2369: sysmalloc: Assertion
    `(old_top == (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2]))
    - __builtin_offsetof (struct malloc_chunk, fd)))) && old_size ==
    0) || ((unsigned long) (old_size) >= (unsigned
    long)((((__builtin_offsetof (struct malloc_chunk,
    fd_nextsize))+((2 *(sizeof(size_t))) - 1)) & ~((2
    *(sizeof(size_t))) - 1))) && ((old_top)->size & 0x1) && ((unsigned
    long) old_end & pagemask) == 0)' failed.
    [saruman:15771] *** Process received signal ***
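
A note for readers decoding the MKL message above: BLAS routines validate their arguments in order and report the 1-based position of the first invalid one. The sketch below mimics the argument checks of the reference BLAS DTRSM in Python. It assumes MKL numbers its parameters the same way as the reference Fortran signature, and the function name `dtrsm_arg_check` and the sample values are hypothetical. Under that assumption, "Parameter 6" corresponds to N, the column count of B, being negative. This only decodes the message and does not locate the underlying bug.

```python
def dtrsm_arg_check(side, uplo, transa, diag, m, n, lda, ldb):
    """Return the 1-based index of the first invalid DTRSM argument, or 0.

    Positions follow the Fortran signature
    DTRSM(SIDE, UPLO, TRANSA, DIAG, M, N, ALPHA, A, LDA, B, LDB),
    and the checks run in the same order as in the reference BLAS.
    """
    side, uplo, transa, diag = (s.upper() for s in (side, uplo, transa, diag))
    # Order of the triangular matrix A: M if A is applied from the left, else N.
    nrowa = m if side == "L" else n
    if side not in ("L", "R"):
        return 1
    if uplo not in ("U", "L"):
        return 2
    if transa not in ("N", "T", "C"):
        return 3
    if diag not in ("U", "N"):
        return 4
    if m < 0:
        return 5
    if n < 0:
        return 6
    if lda < max(1, nrowa):
        return 9
    if ldb < max(1, m):
        return 11
    return 0


# Hypothetical call with a negative column count, for illustration only.
info = dtrsm_arg_check("L", "L", "N", "U", m=4, n=-1, lda=4, ldb=4)
print(f"Parameter {info} was incorrect on entry to DTRSM")
```

A negative dimension reaching BLAS usually means the caller computed a block size from corrupted or inconsistent data, which would also be consistent with the heap assertion that follows in stderr.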

    This is the 7th call to KSPSolve in the same execution. Here is
    the last KSPView:

    KSP Object:(o_slin) 2 MPI processes
      type: preonly
      maximum iterations=10000, initial guess is zero
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
      left preconditioning
      using NONE norm type for convergence test
    PC Object:(o_slin) 2 MPI processes
      type: lu
        LU: out-of-place factorization
        tolerance for zero pivot 2.22045e-14
        matrix ordering: natural
        factor fill ratio given 0., needed 0.
          Factored matrix follows:
            Mat Object:         2 MPI processes
              type: mpiaij
              rows=382, cols=382
              package used to perform factorization: superlu_dist
              total: nonzeros=0, allocated nonzeros=0
              total number of mallocs used during MatSetValues calls =0
                SuperLU_DIST run parameters:
                  Process grid nprow 2 x npcol 1
                  Equilibrate matrix TRUE
                  Matrix input mode 1
                  Replace tiny pivots FALSE
                  Use iterative refinement FALSE
                  Processors in row 2 col partition 1
                  Row permutation LargeDiag
                  Column permutation METIS_AT_PLUS_A
                  Parallel symbolic factorization FALSE
                  Repeated factorization SamePattern
      linear system matrix = precond matrix:
      Mat Object:  (o_slin)   2 MPI processes
        type: mpiaij
        rows=382, cols=382
        total: nonzeros=4458, allocated nonzeros=4458
        total number of mallocs used during MatSetValues calls =0
          using I-node (on process 0) routines: found 109 nodes, limit used is 5

    I know this information is not enough to help debug, but I would
    like to know whether the PETSc developers plan to upgrade to 5.1.3
    before I try to debug anything.

    Thanks,
    Eric




--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener
