Ok, this will load the matrices in parallel, correct?

On Sun, May 24, 2015 at 7:36 PM, Matthew Knepley <[email protected]> wrote:

On Sun, May 24, 2015 at 8:57 AM, venkatesh g <[email protected]> wrote:
> I am using the MatLoad option as in the ex7.c code provided by SLEPc:
>
>     ierr = MatLoad(A,viewer);CHKERRQ(ierr);
>
> There is no problem here, right? Or is any additional option required for
> very large matrices when running the eigensolver in parallel?

This will load the matrix from the viewer (presumably disk). There are no
special options for large matrices.

Thanks,

   Matt
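For reference, MatLoad distributes the rows of the matrix across the processes
of the communicator the Mat was created on, so loading on PETSC_COMM_WORLD is
already a parallel load. A minimal sketch of the loading pattern, assuming the
binary file name "a2" mentioned later in this thread and a PETSc build that
matches the matrices (complex scalars here); error checking is abbreviated and
this is not the actual ex7.c source:

    #include <petscmat.h>

    /* Sketch: load a binary matrix in parallel, in the style of ex7.c.
       MatLoad reads the file and distributes row blocks across all
       processes of PETSC_COMM_WORLD. */
    int main(int argc, char **argv)
    {
      Mat         A;
      PetscViewer viewer;

      PetscInitialize(&argc, &argv, NULL, NULL);

      /* Open the binary file on the world communicator */
      PetscViewerBinaryOpen(PETSC_COMM_WORLD, "a2", FILE_MODE_READ, &viewer);

      /* Create the matrix on the same communicator and let MatLoad fill it */
      MatCreate(PETSC_COMM_WORLD, &A);
      MatSetFromOptions(A);
      MatLoad(A, viewer);

      PetscViewerDestroy(&viewer);
      MatDestroy(&A);
      PetscFinalize();
      return 0;
    }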
On Sat, May 23, 2015 at 5:43 PM, Matthew Knepley <[email protected]> wrote:

On Sat, May 23, 2015 at 7:09 AM, venkatesh g <[email protected]> wrote:
> Hi, Thanks.
> Per node it has 24 cores and each core has 4 GB RAM. The job was submitted
> on 10 nodes.
>
> So, does it mean it requires 10G for one core? Or for 1 node?

The error message from MUMPS said that it tried to allocate 10G. We must
assume each process tried to do the same thing. That means if you scheduled
24 processes on a node, it would try to allocate at least 240G, which is in
excess of what you specify above.

Note that this has nothing to do with PETSc. It is all in the documentation
for that machine and its scheduling policy.

Thanks,

   Matt

On Sat, May 23, 2015 at 5:17 PM, Matthew Knepley <[email protected]> wrote:

On Sat, May 23, 2015 at 6:44 AM, venkatesh g <[email protected]> wrote:
> Hi,
> The same eigenproblem runs with 120 GB RAM on a serial machine in Matlab.
>
> On the Cray I launched it with 240*4 GB RAM in parallel. So it has to fit,
> right?

I do not know how MUMPS allocates memory, but the message is unambiguous.
Also, this is concerned with the memory available per node. Do you know how
many processes per node were scheduled? The message below indicates that it
was trying to allocate 10G for one process.

> And for small matrices it shows negative scaling, i.e. the 24-core run is
> faster.

Yes, for strong scaling you always get slowdown eventually, since overheads
dominate work; see Amdahl's Law.

Thanks,

   Matt

> I have attached the submission script. Please see it and kindly let me know.
>
> cheers,
> Venkatesh

On Sat, May 23, 2015 at 4:58 PM, Matthew Knepley <[email protected]> wrote:

On Sat, May 23, 2015 at 2:39 AM, venkatesh g <[email protected]> wrote:
> Hi again,
>
> I have installed PETSc and SLEPc on the Cray with the Intel compilers and
> MUMPS.
>
> I am getting this error when I solve the eigenvalue problem with large
> matrices:
>
>     [201]PETSC ERROR: Error reported by MUMPS in numerical factorization
>     phase: Cannot allocate required memory 9632 megabytes

It ran out of memory on the node.

> Also it is again not scaling well for small matrices.

MUMPS strong scaling for small matrices is not very good. Weak scaling means
looking at big matrices.

Thanks,

   Matt

> Kindly let me know what to do.
>
> cheers,
> Venkatesh
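Related to the per-process allocation question above: MUMPS reports its own
memory estimates in its INFOG array, and PETSc exposes them on the factored
matrix through MatMumpsGetInfog(). A rough sketch of how one might query this
from the SLEPc side is below; the helper name is hypothetical, it must be
called after the factorization has been set up (e.g. after EPSSetUp() or
EPSSolve()), and the INFOG indices 16/17 are assumed here to be the
per-process maximum and total estimates in megabytes, which should be checked
against the MUMPS users' guide:

    #include <slepceps.h>

    /* Sketch: ask MUMPS how much memory it expects for the factorization
       of the spectral transformation's operator. */
    static PetscErrorCode ReportMumpsMemory(EPS eps)
    {
      ST             st;
      KSP            ksp;
      PC             pc;
      Mat            F;
      PetscInt       maxMB, totalMB;
      PetscErrorCode ierr;

      PetscFunctionBeginUser;
      ierr = EPSGetST(eps, &st);CHKERRQ(ierr);
      ierr = STGetKSP(st, &ksp);CHKERRQ(ierr);
      ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
      ierr = PCFactorGetMatrix(pc, &F);CHKERRQ(ierr);  /* the MUMPS factor */
      /* Assumed INFOG meanings: 16 = estimate on the busiest process (MB),
         17 = total estimate over all processes (MB); see the MUMPS manual. */
      ierr = MatMumpsGetInfog(F, 16, &maxMB);CHKERRQ(ierr);
      ierr = MatMumpsGetInfog(F, 17, &totalMB);CHKERRQ(ierr);
      ierr = PetscPrintf(PETSC_COMM_WORLD,
                         "MUMPS estimate: %D MB on the busiest process, %D MB total\n",
                         maxMB, totalMB);CHKERRQ(ierr);
      PetscFunctionReturn(0);
    }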
On Tue, May 19, 2015 at 3:02 PM, Matthew Knepley <[email protected]> wrote:

On Tue, May 19, 2015 at 1:04 AM, venkatesh g <[email protected]> wrote:
> Hi,
>
> I have attached the log of the command which I gave on the master node:
> make streams NPMAX=32
>
> I don't know why it says 'It appears you have only 1 node'. But other codes
> run in parallel with good scaling on 8 nodes.

If you look at the STREAMS numbers, you can see that your system is only able
to support about 2 cores with the available memory bandwidth. Thus for
bandwidth-constrained operations (almost everything in sparse linear algebra
and solvers), your speedup will not be bigger than 2.

Other codes may do well on this machine, but they would be compute
constrained, using things like DGEMM.

Thanks,

   Matt

> Kindly let me know.
>
> Venkatesh

On Mon, May 18, 2015 at 11:21 PM, Barry Smith <[email protected]> wrote:

Run the streams benchmark on this system and send the results.
http://www.mcs.anl.gov/petsc/documentation/faq.html#computers

On May 18, 2015, at 11:14 AM, venkatesh g <[email protected]> wrote:
> Hi,
> I have emailed the mumps-user list.
> Actually the cluster has 8 nodes with 16 cores, and other codes scale well.
> I wanted to ask: if this job takes much time, then if I submit on more
> cores, I have to increase icntl(14), which would again take a long time.
>
> So is there another way?
>
> cheers,
> Venkatesh

On Mon, May 18, 2015 at 7:16 PM, Matthew Knepley <[email protected]> wrote:

On Mon, May 18, 2015 at 8:29 AM, venkatesh g <[email protected]> wrote:
> Hi, I have attached the performance logs for 2 jobs on different numbers of
> processors. I had to increase the workspace icntl(14) when I submit on more
> cores, since it fails with a small value of icntl(14).
>
> 1. performance_log1.txt is run on 8 cores (option given: -mat_mumps_icntl_14 200)
> 2. performance_log2.txt is run on 2 cores (option given: -mat_mumps_icntl_14 85)

1) Your number of iterates increased from 7600 to 9600, but that is a
relatively small effect.

2) MUMPS is just taking a lot longer to do the forward/backward solve. You
might try emailing the MUMPS list about that. However, I would bet that your
system has enough bandwidth for 2 procs and not much more.

Thanks,

   Matt

> Venkatesh
On Sun, May 17, 2015 at 6:13 PM, Matthew Knepley <[email protected]> wrote:

On Sun, May 17, 2015 at 1:38 AM, venkatesh g <[email protected]> wrote:
> Hi, Thanks for the information. I have now increased the workspace by adding
> '-mat_mumps_icntl_14 100'.
>
> It works. However, the problem is that if I submit on 1 core I get the
> answer in 200 secs, but with 4 cores and '-mat_mumps_icntl_14 100' it takes
> 3500 secs.

Send the output of -log_summary for all performance queries. Otherwise we are
just guessing.

   Matt

> My command line is:
>
>     mpiexec -np 4 ./ex7 -f1 a2 -f2 b2 -eps_nev 1 -st_type sinvert
>       -eps_max_it 5000 -st_ksp_type preonly -st_pc_type lu
>       -st_pc_factor_mat_solver_package mumps -mat_mumps_icntl_14 100
>
> Kindly let me know.
>
> Venkatesh
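For readers following along, the command line above maps onto the following
SLEPc calls. This is only a minimal sketch of that configuration, not the
actual ex7.c source: the function name is hypothetical, A and B are assumed to
be already loaded MPI matrices, and the pre-3.9 spelling
PCFactorSetMatSolverPackage is used to match the
-st_pc_factor_mat_solver_package option quoted in this thread:

    #include <slepceps.h>

    /* Sketch: generalized eigenproblem A x = lambda B x with shift-and-invert,
       factored by MUMPS, mirroring the command-line options quoted above. */
    PetscErrorCode SolveWithSinvertMumps(Mat A, Mat B)
    {
      EPS            eps;
      ST             st;
      KSP            ksp;
      PC             pc;
      PetscErrorCode ierr;

      PetscFunctionBeginUser;
      ierr = EPSCreate(PETSC_COMM_WORLD, &eps);CHKERRQ(ierr);
      ierr = EPSSetOperators(eps, A, B);CHKERRQ(ierr);
      ierr = EPSSetProblemType(eps, EPS_GNHEP);CHKERRQ(ierr);   /* generalized, non-Hermitian */
      ierr = EPSSetDimensions(eps, 1, PETSC_DEFAULT, PETSC_DEFAULT);CHKERRQ(ierr); /* -eps_nev 1 */
      ierr = EPSSetTolerances(eps, PETSC_DEFAULT, 5000);CHKERRQ(ierr);             /* -eps_max_it 5000 */

      ierr = EPSGetST(eps, &st);CHKERRQ(ierr);
      ierr = STSetType(st, STSINVERT);CHKERRQ(ierr);            /* -st_type sinvert */
      ierr = STGetKSP(st, &ksp);CHKERRQ(ierr);
      ierr = KSPSetType(ksp, KSPPREONLY);CHKERRQ(ierr);         /* -st_ksp_type preonly */
      ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
      ierr = PCSetType(pc, PCLU);CHKERRQ(ierr);                 /* -st_pc_type lu */
      ierr = PCFactorSetMatSolverPackage(pc, MATSOLVERMUMPS);CHKERRQ(ierr);

      ierr = EPSSetFromOptions(eps);CHKERRQ(ierr); /* still honors -mat_mumps_icntl_14, etc. */
      ierr = EPSSolve(eps);CHKERRQ(ierr);
      ierr = EPSDestroy(&eps);CHKERRQ(ierr);
      PetscFunctionReturn(0);
    }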
On Sat, May 16, 2015 at 7:10 PM, David Knezevic <[email protected]> wrote:

On Sat, May 16, 2015 at 8:08 AM, venkatesh g <[email protected]> wrote:
> Hi,
> I am trying to solve the AX = lambda BX eigenvalue problem.
>
> A and B are of size 3600x3600.
>
> I run with this command:
>
>     mpiexec -np 4 ./ex7 -f1 a2 -f2 b2 -eps_nev 1 -st_type sinvert
>       -eps_max_it 5000 -st_ksp_type preonly -st_pc_type lu
>       -st_pc_factor_mat_solver_package mumps
>
> I get this error (I get a result only when I use 1 or 2 processors):
>
>     Reading COMPLEX matrices from binary files...
>     [0]PETSC ERROR: --------------------- Error Message ------------------------------------
>     [0]PETSC ERROR: Error in external library!
>     [0]PETSC ERROR: Error reported by MUMPS in numerical factorization phase:
>     INFO(1)=-9, INFO(2)=2024

The MUMPS error types are described in Chapter 7 of the MUMPS manual. In this
case you have INFO(1)=-9, which is explained in the manual as:

  "-9 Main internal real/complex workarray S too small. If INFO(2) is
  positive, then the number of entries that are missing in S at the moment
  when the error is raised is available in INFO(2). If INFO(2) is negative,
  then its absolute value should be multiplied by 1 million. If an error -9
  occurs, the user should increase the value of ICNTL(14) before calling the
  factorization (JOB=2) again, except if ICNTL(23) is provided, in which case
  ICNTL(23) should be increased."

This says that you should use ICNTL(14) to increase the working space size:

  "ICNTL(14) is accessed by the host both during the analysis and the
  factorization phases. It corresponds to the percentage increase in the
  estimated working space. When significant extra fill-in is caused by
  numerical pivoting, increasing ICNTL(14) may help. Except in special cases,
  the default value is 20 (which corresponds to a 20% increase)."

So, for example, you can avoid this error via the following command-line
argument to PETSc: "-mat_mumps_icntl_14 30", where 30 indicates that we allow
a 30% increase in the workspace instead of the default 20%.

David
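The same ICNTL(14) setting can also be made from code, since PETSc exposes the
MUMPS control parameters on the factored matrix. Below is a minimal sketch of
that pattern for a plain KSP (the helper name is hypothetical, and the
pre-3.9 function names are used to match the option spellings in this thread).
With SLEPc's ST, as in ex7, the command-line option David suggests is usually
the simpler route, because the ST configures its inner KSP during EPSSetUp():

    #include <petscksp.h>

    /* Sketch (plain KSP, not SLEPc): set MUMPS ICNTL(14), the percentage
       workspace increase, from code instead of via -mat_mumps_icntl_14.
       Assumes KSPSetOperators(ksp, A, A) has already been called. */
    PetscErrorCode UseMumpsWithWorkspace(KSP ksp, PetscInt pct)
    {
      PC             pc;
      Mat            F;
      PetscErrorCode ierr;

      PetscFunctionBeginUser;
      ierr = KSPSetType(ksp, KSPPREONLY);CHKERRQ(ierr);
      ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
      ierr = PCSetType(pc, PCLU);CHKERRQ(ierr);
      ierr = PCFactorSetMatSolverPackage(pc, MATSOLVERMUMPS);CHKERRQ(ierr);
      ierr = PCFactorSetUpMatSolverPackage(pc);CHKERRQ(ierr);  /* create the MUMPS factor */
      ierr = PCFactorGetMatrix(pc, &F);CHKERRQ(ierr);
      ierr = MatMumpsSetIcntl(F, 14, pct);CHKERRQ(ierr);       /* e.g. 30 for a 30% increase */
      PetscFunctionReturn(0);
    }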
