On Mon, May 25, 2015 at 2:54 AM, venkatesh g <[email protected]> wrote:
> Ok, this will load the matrices in parallel, correct?

Yes.

   Matt

> On Sun, May 24, 2015 at 7:36 PM, Matthew Knepley <[email protected]> wrote:
>
>> On Sun, May 24, 2015 at 8:57 AM, venkatesh g <[email protected]> wrote:
>>
>>> I am using the MatLoad option as in the ex7.c code provided by SLEPc:
>>>
>>>   ierr = MatLoad(A,viewer);CHKERRQ(ierr);
>>>
>>> There is no problem here, right? Or is any additional option required for very large matrices when running the eigensolver in parallel?
>>
>> This will load the matrix from the viewer (presumably disk). There are no options for large matrices.
>>
>>   Thanks,
>>
>>      Matt
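For reference, the MatLoad pattern discussed above, reading a matrix from a PETSc binary file so that it ends up distributed across all processes of the communicator, looks roughly like the following. This is a minimal sketch rather than the actual ex7.c source; the file name "a2" is a placeholder and the error handling is abbreviated.

    /* Sketch: load a matrix from a PETSc binary file in parallel. */
    #include <petscmat.h>

    int main(int argc, char **argv)
    {
      Mat            A;
      PetscViewer    viewer;
      PetscErrorCode ierr;

      ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;

      /* Open the binary file collectively on PETSC_COMM_WORLD */
      ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD, "a2", FILE_MODE_READ, &viewer);CHKERRQ(ierr);
      ierr = MatCreate(PETSC_COMM_WORLD, &A);CHKERRQ(ierr);
      ierr = MatSetFromOptions(A);CHKERRQ(ierr);   /* matrix type can be chosen at run time */
      ierr = MatLoad(A, viewer);CHKERRQ(ierr);     /* reads the file and distributes the rows */
      ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);

      /* ... hand A (and similarly B) to the eigensolver ... */

      ierr = MatDestroy(&A);CHKERRQ(ierr);
      ierr = PetscFinalize();
      return 0;
    }

Whether the load is done on 1 process or 240, the same code applies; the parallel distribution comes from the communicator, not from any extra option, which is consistent with the answer above.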
>>> cheers,
>>> Venkatesh
>>>
>>> On Sat, May 23, 2015 at 5:43 PM, Matthew Knepley <[email protected]> wrote:
>>>
>>>> On Sat, May 23, 2015 at 7:09 AM, venkatesh g <[email protected]> wrote:
>>>>
>>>>> Hi,
>>>>> Thanks.
>>>>> Per node it has 24 cores and each core has 4 GB RAM. And the job was submitted on 10 nodes.
>>>>>
>>>>> So, does it mean it requires 10 GB for one core? Or for one node?
>>>>
>>>> The error message from MUMPS said that it tried to allocate 10 GB. We must assume each process tried to do the same thing. That means if you scheduled 24 processes on a node, it would try to allocate at least 240 GB, which is in excess of what you specify above.
>>>>
>>>> Note that this has nothing to do with PETSc. It is all in the documentation for that machine and its scheduling policy.
>>>>
>>>>   Thanks,
>>>>
>>>>      Matt
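To spell out the arithmetic behind the answer above, using only the numbers quoted in this thread and assuming the scheduler packs one MPI process per core:

    MUMPS request per process (from the error quoted further down):  9632 MB, i.e. roughly 10 GB
    Processes placed on one fully packed node:                       24
    Total requested on that node:                                    24 x 10 GB = about 240 GB
    Memory physically available on one node:                         24 cores x 4 GB = 96 GB

So a fully packed node would be asked for more than twice the memory it has, which is consistent with the allocation failure reported below.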
>>>>> cheers,
>>>>> Venkatesh
>>>>>
>>>>> On Sat, May 23, 2015 at 5:17 PM, Matthew Knepley <[email protected]> wrote:
>>>>>
>>>>>> On Sat, May 23, 2015 at 6:44 AM, venkatesh g <[email protected]> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>> The same eigenproblem runs with 120 GB RAM on a serial machine in MATLAB.
>>>>>>>
>>>>>>> On the Cray I ran it with 240*4 GB RAM in parallel. So it should fit, right?
>>>>>>
>>>>>> I do not know how MUMPS allocates memory, but the message is unambiguous. Also, this is concerned with the memory available per node. Do you know how many processes per node were scheduled? The message below indicates that it was trying to allocate 10 GB for one process.
>>>>>>
>>>>>>> And for small matrices it shows negative scaling, i.e. 24 cores run faster.
>>>>>>
>>>>>> Yes, for strong scaling you always get slowdown eventually, since overheads dominate work; see Amdahl's Law.
>>>>>>
>>>>>>   Thanks,
>>>>>>
>>>>>>      Matt
>>>>>>
>>>>>>> I have attached the submission script.
>>>>>>>
>>>>>>> Please see it, and kindly let me know.
>>>>>>>
>>>>>>> cheers,
>>>>>>> Venkatesh
>>>>>>>
>>>>>>> On Sat, May 23, 2015 at 4:58 PM, Matthew Knepley <[email protected]> wrote:
>>>>>>>
>>>>>>>> On Sat, May 23, 2015 at 2:39 AM, venkatesh g <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hi again,
>>>>>>>>>
>>>>>>>>> I have installed PETSc and SLEPc on the Cray with the Intel compilers and MUMPS.
>>>>>>>>>
>>>>>>>>> I am getting this error when I solve an eigenvalue problem with large matrices:
>>>>>>>>>
>>>>>>>>>   [201]PETSC ERROR: Error reported by MUMPS in numerical factorization phase: Cannot allocate required memory 9632 megabytes
>>>>>>>>
>>>>>>>> It ran out of memory on the node.
>>>>>>>>
>>>>>>>>> Also, it is again not scaling well for small matrices.
>>>>>>>>
>>>>>>>> MUMPS strong scaling for small matrices is not very good. Weak scaling is looking at big matrices.
>>>>>>>>
>>>>>>>>   Thanks,
>>>>>>>>
>>>>>>>>      Matt
>>>>>>>>
>>>>>>>>> Kindly let me know what to do.
>>>>>>>>>
>>>>>>>>> cheers,
>>>>>>>>> Venkatesh
>>>>>>>>>
>>>>>>>>> On Tue, May 19, 2015 at 3:02 PM, Matthew Knepley <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> On Tue, May 19, 2015 at 1:04 AM, venkatesh g <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> I have attached the log of the command which I ran on the master node: make streams NPMAX=32
>>>>>>>>>>>
>>>>>>>>>>> I don't know why it says 'It appears you have only 1 node'. But other codes run in parallel with good scaling on 8 nodes.
>>>>>>>>>>
>>>>>>>>>> If you look at the STREAMS numbers, you can see that your system is only able to support about 2 cores with the available memory bandwidth. Thus for bandwidth-constrained operations (almost everything in sparse linear algebra and solvers), your speedup will not be bigger than 2.
>>>>>>>>>>
>>>>>>>>>> Other codes may do well on this machine, but they would be compute-constrained, using things like DGEMM.
>>>>>>>>>>
>>>>>>>>>>   Thanks,
>>>>>>>>>>
>>>>>>>>>>      Matt
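A rough way to read the bandwidth argument above (illustrative only; the real numbers are in the attached STREAMS log, which is not reproduced here): for a memory-bandwidth-bound kernel,

    achievable speedup on one node  <=  (aggregate STREAMS bandwidth at saturation) / (single-process STREAMS bandwidth)

If the measured STREAMS rate stops increasing after about 2 processes, that ratio is about 2, so sparse factorizations and triangular solves cannot be expected to run much more than 2x faster on that node, regardless of how many cores are used.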
>>>>>>>>>>> Kindly let me know.
>>>>>>>>>>>
>>>>>>>>>>> Venkatesh
>>>>>>>>>>>
>>>>>>>>>>> On Mon, May 18, 2015 at 11:21 PM, Barry Smith <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Run the streams benchmark on this system and send the results.
>>>>>>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#computers
>>>>>>>>>>>>
>>>>>>>>>>>> > On May 18, 2015, at 11:14 AM, venkatesh g <[email protected]> wrote:
>>>>>>>>>>>> >
>>>>>>>>>>>> > Hi,
>>>>>>>>>>>> > I have emailed the MUMPS user list.
>>>>>>>>>>>> > Actually the cluster has 8 nodes with 16 cores, and other codes scale well.
>>>>>>>>>>>> > I wanted to ask: if this job takes a long time, and I submit on more cores, I have to increase ICNTL(14), which would again take a long time.
>>>>>>>>>>>> >
>>>>>>>>>>>> > So is there another way?
>>>>>>>>>>>> >
>>>>>>>>>>>> > cheers,
>>>>>>>>>>>> > Venkatesh
>>>>>>>>>>>> >
>>>>>>>>>>>> > On Mon, May 18, 2015 at 7:16 PM, Matthew Knepley <[email protected]> wrote:
>>>>>>>>>>>> > On Mon, May 18, 2015 at 8:29 AM, venkatesh g <[email protected]> wrote:
>>>>>>>>>>>> > Hi, I have attached the performance logs for 2 jobs on different processors. I had to increase the workspace ICNTL(14) when I submit on more cores, since it fails with a small value of ICNTL(14).
>>>>>>>>>>>> >
>>>>>>>>>>>> > 1. performance_log1.txt is run on 8 cores (option given: -mat_mumps_icntl_14 200)
>>>>>>>>>>>> > 2. performance_log2.txt is run on 2 cores (option given: -mat_mumps_icntl_14 85)
>>>>>>>>>>>> >
>>>>>>>>>>>> > 1) Your number of iterates increased from 7600 to 9600, but that is a relatively small effect.
>>>>>>>>>>>> >
>>>>>>>>>>>> > 2) MUMPS is just taking a lot longer to do the forward/backward solve. You might try emailing the list about that. However, I would bet that your system has enough bandwidth for 2 procs and not much more.
>>>>>>>>>>>> >
>>>>>>>>>>>> >    Thanks,
>>>>>>>>>>>> >
>>>>>>>>>>>> >       Matt
>>>>>>>>>>>> >
>>>>>>>>>>>> > Venkatesh
>>>>>>>>>>>> >
>>>>>>>>>>>> > On Sun, May 17, 2015 at 6:13 PM, Matthew Knepley <[email protected]> wrote:
>>>>>>>>>>>> > On Sun, May 17, 2015 at 1:38 AM, venkatesh g <[email protected]> wrote:
>>>>>>>>>>>> > Hi, thanks for the information. I have now increased the workspace by adding '-mat_mumps_icntl_14 100'.
>>>>>>>>>>>> >
>>>>>>>>>>>> > It works. However, the problem is: if I submit on 1 core I get the answer in 200 secs, but with 4 cores and '-mat_mumps_icntl_14 100' it takes 3500 secs.
>>>>>>>>>>>> >
>>>>>>>>>>>> > Send the output of -log_summary for all performance queries. Otherwise we are just guessing.
>>>>>>>>>>>> >
>>>>>>>>>>>> >    Matt
>>>>>>>>>>>> >
>>>>>>>>>>>> > My command line is: 'mpiexec -np 4 ./ex7 -f1 a2 -f2 b2 -eps_nev 1 -st_type sinvert -eps_max_it 5000 -st_ksp_type preonly -st_pc_type lu -st_pc_factor_mat_solver_package mumps -mat_mumps_icntl_14 100'
>>>>>>>>>>>> >
>>>>>>>>>>>> > Kindly let me know.
>>>>>>>>>>>> >
>>>>>>>>>>>> > Venkatesh
>>>>>>>>>>>> >
>>>>>>>>>>>> > On Sat, May 16, 2015 at 7:10 PM, David Knezevic <[email protected]> wrote:
>>>>>>>>>>>> > On Sat, May 16, 2015 at 8:08 AM, venkatesh g <[email protected]> wrote:
>>>>>>>>>>>> > Hi,
>>>>>>>>>>>> > I am trying to solve the AX = lambda BX eigenvalue problem.
>>>>>>>>>>>> >
>>>>>>>>>>>> > A and B are of size 3600x3600.
>>>>>>>>>>>> >
>>>>>>>>>>>> > I run with this command:
>>>>>>>>>>>> >
>>>>>>>>>>>> > 'mpiexec -np 4 ./ex7 -f1 a2 -f2 b2 -eps_nev 1 -st_type sinvert -eps_max_it 5000 -st_ksp_type preonly -st_pc_type lu -st_pc_factor_mat_solver_package mumps'
>>>>>>>>>>>> >
>>>>>>>>>>>> > I get this error (I get a result only when I use 1 or 2 processors):
>>>>>>>>>>>> > Reading COMPLEX matrices from binary files...
>>>>>>>>>>>> > [0]PETSC ERROR: --------------------- Error Message ------------------------------------
>>>>>>>>>>>> > [0]PETSC ERROR: Error in external library!
>>>>>>>>>>>> > [0]PETSC ERROR: Error reported by MUMPS in numerical factorization phase: INFO(1)=-9, INFO(2)=2024
>>>>>>>>>>>> >
>>>>>>>>>>>> > The MUMPS error types are described in Chapter 7 of the MUMPS manual. In this case you have INFO(1)=-9, which is explained in the manual as:
>>>>>>>>>>>> >
>>>>>>>>>>>> > "-9 Main internal real/complex workarray S too small. If INFO(2) is positive, then the number of entries that are missing in S at the moment when the error is raised is available in INFO(2). If INFO(2) is negative, then its absolute value should be multiplied by 1 million. If an error -9 occurs, the user should increase the value of ICNTL(14) before calling the factorization (JOB=2) again, except if ICNTL(23) is provided, in which case ICNTL(23) should be increased."
>>>>>>>>>>>> >
>>>>>>>>>>>> > This says that you should use ICNTL(14) to increase the working space size:
>>>>>>>>>>>> >
>>>>>>>>>>>> > "ICNTL(14) is accessed by the host both during the analysis and the factorization phases. It corresponds to the percentage increase in the estimated working space. When significant extra fill-in is caused by numerical pivoting, increasing ICNTL(14) may help. Except in special cases, the default value is 20 (which corresponds to a 20% increase)."
>>>>>>>>>>>> >
>>>>>>>>>>>> > So, for example, you can avoid this error via the following command line argument to PETSc: "-mat_mumps_icntl_14 30", where 30 indicates that we allow a 30% increase in the workspace instead of the default 20%.
>>>>>>>>>>>> >
>>>>>>>>>>>> > David

--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener
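Putting the pieces of this thread together, a minimal sketch of the solver setup being discussed (a generalized eigenproblem A x = lambda B x loaded from binary files, solved with shift-and-invert, with the LU factorization delegated to MUMPS and extra workspace requested through ICNTL(14)) could look like the following. It is modeled on the command lines quoted above, not on the actual SLEPc ex7.c source; the hard-coded file names and the choice of passing the MUMPS options through the options database are assumptions, and the option values are only examples.

    #include <slepceps.h>

    int main(int argc, char **argv)
    {
      Mat            A, B;
      EPS            eps;
      ST             st;
      PetscViewer    viewer;
      PetscErrorCode ierr;

      ierr = SlepcInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;

      /* Load A and B in parallel from PETSc binary files (names are placeholders) */
      ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD, "a2", FILE_MODE_READ, &viewer);CHKERRQ(ierr);
      ierr = MatCreate(PETSC_COMM_WORLD, &A);CHKERRQ(ierr);
      ierr = MatSetFromOptions(A);CHKERRQ(ierr);
      ierr = MatLoad(A, viewer);CHKERRQ(ierr);
      ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);

      ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD, "b2", FILE_MODE_READ, &viewer);CHKERRQ(ierr);
      ierr = MatCreate(PETSC_COMM_WORLD, &B);CHKERRQ(ierr);
      ierr = MatSetFromOptions(B);CHKERRQ(ierr);
      ierr = MatLoad(B, viewer);CHKERRQ(ierr);
      ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);

      /* Generalized non-Hermitian eigenproblem A x = lambda B x */
      ierr = EPSCreate(PETSC_COMM_WORLD, &eps);CHKERRQ(ierr);
      ierr = EPSSetOperators(eps, A, B);CHKERRQ(ierr);
      ierr = EPSSetProblemType(eps, EPS_GNHEP);CHKERRQ(ierr);

      /* Shift-and-invert spectral transformation; the inner direct solve
         is configured from the options database, e.g.
           -st_ksp_type preonly -st_pc_type lu
           -st_pc_factor_mat_solver_package mumps -mat_mumps_icntl_14 30   */
      ierr = EPSGetST(eps, &st);CHKERRQ(ierr);
      ierr = STSetType(st, STSINVERT);CHKERRQ(ierr);

      ierr = EPSSetFromOptions(eps);CHKERRQ(ierr);  /* also picks up -eps_nev, -eps_max_it */
      ierr = EPSSolve(eps);CHKERRQ(ierr);

      ierr = EPSDestroy(&eps);CHKERRQ(ierr);
      ierr = MatDestroy(&A);CHKERRQ(ierr);
      ierr = MatDestroy(&B);CHKERRQ(ierr);
      ierr = SlepcFinalize();
      return 0;
    }

A run corresponding to the commands in the thread would then be, for example (the executable name is whatever the sketch is compiled to):

    mpiexec -np 4 ./sketch -eps_nev 1 -eps_max_it 5000 -st_ksp_type preonly -st_pc_type lu -st_pc_factor_mat_solver_package mumps -mat_mumps_icntl_14 30 -log_summary

where -log_summary produces the performance data requested earlier in the thread.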
