On Thu, Dec 13, 2012 at 1:20 PM, Nachiket Gokhale <gokhalen at gmail.com> wrote:

> I am trying to solve a complex matrix equation which was assembled using
> MatCompositeMerge using MUMPS and LU preconditioner. It seems to me that
> the solve is stuck in the factorization phase. It is taking 20 mins or so,
> using 16 processes. A problem of the same size using reals instead of
> complex was solved previously in approximately a minute using 4 processes.
> Mumps output of -mat_mumps_icntl_4 1 at the end of this email. Does anyone
> have any ideas about what the problem may be?
>

Complex arithmetic is much more expensive, and you can lose some of the
optimizations made in the code. I think you have to wait longer than this.
Also, you should try attaching the debugger to a process to see whether it
is computing or waiting.

   Matt

> Thanks,
>
> -Nachiket
>
>
>
> Entering ZMUMPS driver with JOB, N, NZ = 1 122370 0
>
> ZMUMPS 4.10.0
> L U Solver for unsymmetric matrices
> Type of parallelism: Working host
>
> ****** ANALYSIS STEP ********
>
> ** Max-trans not allowed because matrix is distributed
> ... Structural symmetry (in percent)= 100
> Density: NBdense, Average, Median = 0 42 26
> Ordering based on METIS
> A root of estimated size 2736 has been selected for Scalapack.
>
> Leaving analysis phase with ...
> INFOG(1) = 0
> INFOG(2) = 0
> -- (20) Number of entries in factors (estim.) = 563723522
> -- (3) Storage of factors (REAL, estimated) = 565185337
> -- (4) Storage of factors (INT , estimated) = 3537003
> -- (5) Maximum frontal size (estimated) = 15239
> -- (6) Number of nodes in the tree = 7914
> -- (32) Type of analysis effectively used = 1
> -- (7) Ordering option effectively used = 5
> ICNTL(6) Maximum transversal option = 0
> ICNTL(7) Pivot order option = 7
> Percentage of memory relaxation (effective) = 35
> Number of level 2 nodes = 35
> Number of split nodes = 8
> RINFOG(1) Operations during elimination (estim)= 4.877D+12
> Distributed matrix entry format (ICNTL(18)) = 3
> ** Rank of proc needing largest memory in IC facto : 0
> ** Estimated corresponding MBYTES for IC facto : 3661
> ** Estimated avg. MBYTES per work. proc at facto (IC) : 2018
> ** TOTAL space in MBYTES for IC factorization : 32289
> ** Rank of proc needing largest memory for OOC facto : 0
> ** Estimated corresponding MBYTES for OOC facto : 3462
> ** Estimated avg. MBYTES per work. proc at facto (OOC) : 1787
> ** TOTAL space in MBYTES for OOC factorization : 28599
> Entering ZMUMPS driver with JOB, N, NZ = 2 122370 5211070
>
> ****** FACTORIZATION STEP ********
>
>
> GLOBAL STATISTICS PRIOR NUMERICAL FACTORIZATION ...
> NUMBER OF WORKING PROCESSES = 16
> OUT-OF-CORE OPTION (ICNTL(22)) = 0
> REAL SPACE FOR FACTORS = 565185337
> INTEGER SPACE FOR FACTORS = 3537003
> MAXIMUM FRONTAL SIZE (ESTIMATED) = 15239
> NUMBER OF NODES IN THE TREE = 7914
> Convergence error after scaling for ONE-NORM (option 7/8) = 0.79D+00
> Maximum effective relaxed size of S = 199523439
> Average effective relaxed size of S = 98303057
>
> REDISTRIB: TOTAL DATA LOCAL/SENT = 657185 14022665
> GLOBAL TIME FOR MATRIX DISTRIBUTION = 0.4805
> ** Memory relaxation parameter ( ICNTL(14) ) : 35
> ** Rank of processor needing largest memory in facto : 0
> ** Space in MBYTES used by this processor for facto : 3661
> ** Avg. Space in MBYTES per working proc during facto : 2018
>

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener
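
The point in the reply that complex arithmetic is much more expensive can be roughly quantified from the figures in the quoted MUMPS output. The Python sketch below is a back-of-the-envelope estimate, not a measurement. It assumes double precision, that RINFOG(1) counts one operation per entry-level multiply or add (so a complex operation costs about 4 real flops on average), perfect parallel scaling, and a purely hypothetical sustained rate of 1 Gflop/s per process. The function names are illustrative and are not part of PETSc or MUMPS.

```python
"""Rough cost comparison of real vs. complex LU factorization."""

# Copied from the quoted MUMPS analysis output.
FACTOR_ENTRIES = 563_723_522   # INFOG(20): estimated entries in factors
ELIMINATION_OPS = 4.877e12     # RINFOG(1): estimated operations during elimination
NUM_PROCS = 16                 # number of working processes

# Double-precision storage per entry, in bytes.
BYTES_PER_REAL = 8
BYTES_PER_COMPLEX = 16

# Real flops per multiply-add: a real one is 1 mul + 1 add; a complex one is
# a complex multiply (4 mul + 2 add) plus a complex add (2 add).
FLOPS_PER_REAL_MULADD = 2
FLOPS_PER_COMPLEX_MULADD = 8


def factor_storage_gib(entries: int, bytes_per_entry: int) -> float:
    """Memory needed to hold the factor entries, in GiB."""
    return entries * bytes_per_entry / 2**30


def complex_to_real_flop_ratio() -> float:
    """How many times more real flops a complex multiply-add needs."""
    return FLOPS_PER_COMPLEX_MULADD / FLOPS_PER_REAL_MULADD


def estimated_minutes(real_flops: float, gflops_per_proc: float, procs: int) -> float:
    """Wall time in minutes, assuming perfect parallel scaling (optimistic)."""
    return real_flops / (gflops_per_proc * 1e9 * procs) / 60.0


if __name__ == "__main__":
    ratio = complex_to_real_flop_ratio()
    print("complex/real flop ratio:", ratio)
    print("factor storage, real    (GiB):",
          round(factor_storage_gib(FACTOR_ENTRIES, BYTES_PER_REAL), 2))
    print("factor storage, complex (GiB):",
          round(factor_storage_gib(FACTOR_ENTRIES, BYTES_PER_COMPLEX), 2))

    # ASSUMPTION: RINFOG(1) is an entry-level operation count, so the real
    # flop count for the complex case is about `ratio` times larger.
    # ASSUMPTION: 1 Gflop/s sustained per process (hypothetical value).
    minutes = estimated_minutes(ELIMINATION_OPS * ratio, 1.0, NUM_PROCS)
    print("estimated factorization time (min):", round(minutes, 1))
```

Under these assumptions the complex factors need twice the memory of real factors with the same sparsity pattern, and each multiply-add costs four times as many real flops. The time estimate is only as good as the assumed sustained rate, so it is better read as an order of magnitude than as a prediction.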
