venkatesh: The differences are in MatSolve(), as Matt said. The questions are:
1. Why does np=2 call MatSolve() 7624 times, while np=8 calls it 9600 times? With slepc/sinvert/lu, would both runs use the same algorithm?
2. With np=2, each MatSolve takes 0.027 sec, while np=8 takes 0.065 sec. Would MatSolve be scalable? The matrix factors for np=2 and np=8 might be very different. We would like to know what the MUMPS developers say about it.

Hong

> Hi,
> I have emailed the mumps-user list.
> Actually the cluster has 8 nodes with 16 cores, and other codes scale well.
> I wanted to ask: if this job takes much time, then if I submit on more cores, I have to increase icntl(14), which would again take a long time.
>
> So is there another way?
>
> cheers,
> Venkatesh
>
> On Mon, May 18, 2015 at 7:16 PM, Matthew Knepley <[email protected]> wrote:
>
>> On Mon, May 18, 2015 at 8:29 AM, venkatesh g <[email protected]> wrote:
>>
>>> Hi, I have attached the performance logs for 2 jobs on different processors. I had to increase the workspace icntl(14) when I submit on more cores, since it fails with a small value of icntl(14).
>>>
>>> 1. performance_log1.txt is run on 8 cores (option given: -mat_mumps_icntl_14 200)
>>> 2. performance_log2.txt is run on 2 cores (option given: -mat_mumps_icntl_14 85)
>>
>> 1) Your number of iterates increased from 7600 to 9600, but that is a relatively small effect.
>>
>> 2) MUMPS is just taking a lot longer to do the forward/backward solve. You might try emailing their list. However, I would bet that your system has enough bandwidth for 2 procs and not much more.
>>
>> Thanks,
>>
>>   Matt
>>
>>> Venkatesh
>>>
>>> On Sun, May 17, 2015 at 6:13 PM, Matthew Knepley <[email protected]> wrote:
>>>
>>>> On Sun, May 17, 2015 at 1:38 AM, venkatesh g <[email protected]> wrote:
>>>>
>>>>> Hi, thanks for the information. I have now increased the workspace by adding '-mat_mumps_icntl_14 100'.
>>>>>
>>>>> It works. However, the problem is: if I submit on 1 core I get the answer in 200 secs, but with 4 cores and '-mat_mumps_icntl_14 100' it takes 3500 secs.
>>>>
>>>> Send the output of -log_summary for all performance queries. Otherwise we are just guessing.
>>>>
>>>>   Matt
>>>>
>>>>> My command line is: 'mpiexec -np 4 ./ex7 -f1 a2 -f2 b2 -eps_nev 1 -st_type sinvert -eps_max_it 5000 -st_ksp_type preonly -st_pc_type lu -st_pc_factor_mat_solver_package mumps -mat_mumps_icntl_14 100'
>>>>>
>>>>> Kindly let me know.
>>>>>
>>>>> Venkatesh
>>>>>
>>>>> On Sat, May 16, 2015 at 7:10 PM, David Knezevic <[email protected]> wrote:
>>>>>
>>>>>> On Sat, May 16, 2015 at 8:08 AM, venkatesh g <[email protected]> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>> I am trying to solve the AX = lambda BX eigenvalue problem.
>>>>>>>
>>>>>>> A and B are of size 3600x3600.
>>>>>>>
>>>>>>> I run with this command:
>>>>>>>
>>>>>>> 'mpiexec -np 4 ./ex7 -f1 a2 -f2 b2 -eps_nev 1 -st_type sinvert -eps_max_it 5000 -st_ksp_type preonly -st_pc_type lu -st_pc_factor_mat_solver_package mumps'
>>>>>>>
>>>>>>> I get this error (I get a result only when I give 1 or 2 processors):
>>>>>>> Reading COMPLEX matrices from binary files...
>>>>>>> [0]PETSC ERROR: --------------------- Error Message ------------------------------------
>>>>>>> [0]PETSC ERROR: Error in external library!
>>>>>>> [0]PETSC ERROR: Error reported by MUMPS in numerical factorization phase: INFO(1)=-9, INFO(2)=2024
>>>>>>
>>>>>> The MUMPS error types are described in Chapter 7 of the MUMPS manual.
>>>>>> In this case you have INFO(1)=-9, which is explained in the manual as:
>>>>>>
>>>>>> "-9 Main internal real/complex workarray S too small. If INFO(2) is positive, then the number of entries that are missing in S at the moment when the error is raised is available in INFO(2). If INFO(2) is negative, then its absolute value should be multiplied by 1 million. If an error -9 occurs, the user should increase the value of ICNTL(14) before calling the factorization (JOB=2) again, except if ICNTL(23) is provided, in which case ICNTL(23) should be increased."
>>>>>>
>>>>>> This says that you should use ICNTL(14) to increase the working space size:
>>>>>>
>>>>>> "ICNTL(14) is accessed by the host both during the analysis and the factorization phases. It corresponds to the percentage increase in the estimated working space. When significant extra fill-in is caused by numerical pivoting, increasing ICNTL(14) may help. Except in special cases, the default value is 20 (which corresponds to a 20% increase)."
>>>>>>
>>>>>> So, for example, you can avoid this error via the following command-line argument to PETSc: "-mat_mumps_icntl_14 30", where 30 indicates that we allow a 30% increase in the workspace instead of the default 20%.
>>>>>>
>>>>>> David
>>>>
>>>> --
>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
>>>> -- Norbert Wiener
>>
>> --
>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
>> -- Norbert Wiener
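For reference, the MUMPS manual text quoted above also mentions ICNTL(23) as an alternative to repeatedly bumping ICNTL(14). Below is a minimal command-line sketch, reusing the executable and matrix files from the thread (ex7, a2, b2) and assuming PETSc exposes ICNTL(23) through the same -mat_mumps_icntl_<n> option mechanism it uses for ICNTL(14); the 2000 MB figure is only an illustrative value, not a recommendation:

  # Grow the MUMPS workspace estimate by 50% (ICNTL(14) is a percentage increase over the analysis estimate).
  mpiexec -np 4 ./ex7 -f1 a2 -f2 b2 -eps_nev 1 -st_type sinvert -eps_max_it 5000 \
      -st_ksp_type preonly -st_pc_type lu -st_pc_factor_mat_solver_package mumps \
      -mat_mumps_icntl_14 50

  # Alternatively, give MUMPS an explicit per-process working-memory budget in MB (ICNTL(23))
  # instead of re-running with ever larger percentages.
  mpiexec -np 4 ./ex7 -f1 a2 -f2 b2 -eps_nev 1 -st_type sinvert -eps_max_it 5000 \
      -st_ksp_type preonly -st_pc_type lu -st_pc_factor_mat_solver_package mumps \
      -mat_mumps_icntl_23 2000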

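For Hong's questions about the MatSolve() counts and per-call times, the relevant numbers sit on the MatSolve and MatLUFactorNum lines of the -log_summary output Matt asked for. A sketch of how the two runs behind the attached logs could be reproduced and compared, assuming the same executable and inputs as above (the grep only filters the standard PETSc event summary, nothing MUMPS-specific):

  # 2-core run (workspace option from performance_log2.txt)
  mpiexec -np 2 ./ex7 -f1 a2 -f2 b2 -eps_nev 1 -st_type sinvert -eps_max_it 5000 \
      -st_ksp_type preonly -st_pc_type lu -st_pc_factor_mat_solver_package mumps \
      -mat_mumps_icntl_14 85 -log_summary | grep -E 'MatSolve|MatLUFactor'

  # 8-core run (workspace option from performance_log1.txt)
  mpiexec -np 8 ./ex7 -f1 a2 -f2 b2 -eps_nev 1 -st_type sinvert -eps_max_it 5000 \
      -st_ksp_type preonly -st_pc_type lu -st_pc_factor_mat_solver_package mumps \
      -mat_mumps_icntl_14 200 -log_summary | grep -E 'MatSolve|MatLUFactor'

  # Dividing the MatSolve time by its call count in each log gives the per-call cost
  # (0.027 sec vs 0.065 sec in the thread), separating solve scalability from the
  # difference in iteration counts.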