You should try running under valgrind, see:  
http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind

    You can also run in the debugger (yes, this is tricky on a batch system, but 
possible) and see exactly what triggers the floating point exception, or, when it 
"hangs", interrupt the program to see where it is "hanging".


  Barry



> On Jul 8, 2015, at 12:34 PM, Anthony Paul Haas <[email protected]> wrote:
> 
> Hi,
> 
> I have used the switch -mat_superlu_dist_parsymbfact in my pbs script. 
> However, although my program worked fine with sequential symbolic 
> factorization, I get one of the following 2 behaviors when I run with 
> parallel symbolic factorization (depending on the number of processors that I 
> use):
> 
> 1) the program just hangs (it seems stuck in some subroutine ==> see 
> test.out-hangs)
> 2) I get a floating point exception ==> see test.out-floating-point-exception
> 
> Note that as suggested in the Superlu manual, I use a power of 2 number of 
> procs. Are there any tunable parameters for the parallel symbolic 
> factorization? Note that when I build my sparse matrix, most elements I add 
> are nonzero of course but to simplify the programming, I also add a few zero 
> elements in the sparse matrix. I was thinking that maybe if the parallel 
> symbolic factorization proceeds by block, there could be some blocks where the 
> pivot would be zero, hence creating the FPE?
> 
> Thanks,
> 
> Anthony
> 
> 
> 
> On Wed, Jul 8, 2015 at 6:46 AM, Xiaoye S. Li <[email protected]> wrote:
> Did you find out how to change the option to use parallel symbolic factorization? 
>  Perhaps the PETSc team can help. 
> 
> Sherry
> 
> 
> On Tue, Jul 7, 2015 at 3:58 PM, Xiaoye S. Li <[email protected]> wrote:
> Is there an inquiry function that tells you all the available options?
> 
> Sherry
> 
> On Tue, Jul 7, 2015 at 3:25 PM, Anthony Paul Haas <[email protected]> 
> wrote:
> Hi Sherry,
> 
> Thanks for your message. I have used superlu_dist default options. I did not 
> realize that I was doing serial symbolic factorization. That is probably the 
> cause of my problem. 
> Each node on Garnet has 60 GB of usable memory and I can run with 1, 2, 4, 8, 16, or 
> 32 cores per node. 
> 
> So I should use: 
> 
> -mat_superlu_dist_r 20
> -mat_superlu_dist_c 32
> 
> How do you specify the parallel symbolic factorization option? Is it 
> -mat_superlu_dist_matinput 1
> 
> Thanks,
> 
> Anthony
> 
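The memory available to each MPI task follows directly from the node figures quoted above. A minimal sketch of that arithmetic, assuming one MPI task per core and memory shared evenly among the tasks on a node (both are assumptions; the helper name `memory_per_task_gb` is made up for illustration):

```python
NODE_MEMORY_GB = 60.0  # usable memory per node, as stated in the message above


def memory_per_task_gb(tasks_per_node, node_memory_gb=NODE_MEMORY_GB):
    """Memory per MPI task, assuming an even split across tasks on a node."""
    if tasks_per_node < 1:
        raise ValueError("tasks_per_node must be at least 1")
    return node_memory_gb / tasks_per_node


# Core counts per node listed in the message above.
for cores in (1, 2, 4, 8, 16, 32):
    print(f"{cores:2d} tasks/node: {memory_per_task_gb(cores):.3f} GB per MPI task")
```

With serial symbolic factorization the whole graph of A has to fit in the memory of a single task, so fewer tasks per node leaves more room for that one task.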
> 
> On Tue, Jul 7, 2015 at 3:08 PM, Xiaoye S. Li <[email protected]> wrote:
> For superlu_dist failure, this occurs during symbolic factorization.  Since 
> you are using serial symbolic factorization, it requires the entire graph of 
> A to be available in the memory of one MPI task. How much memory do you have 
> for each MPI task?
> 
> It won't help even if you use more processes.  You should try the parallel 
> symbolic factorization option.
> 
> Another point.  You set up the process grid as:
>        Process grid nprow 32 x npcol 20 
> For better performance, you should swap the grid dimensions. That is, it's 
> better to use 20 x 32; never make nprow larger than npcol.
> 
> 
> Sherry
> 
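The grid advice above (nprow no larger than npcol) can be sketched as a small helper. This is only an illustration: choosing the most nearly square factorization is an assumption added here, not something stated in the thread, and the function names are made up. The two option names are the ones used elsewhere in this thread.

```python
import math


def pick_process_grid(nprocs):
    """Return (nprow, npcol) with nprow * npcol == nprocs and nprow <= npcol.

    Picks the most nearly square factorization (an assumption for this sketch).
    """
    if nprocs < 1:
        raise ValueError("nprocs must be at least 1")
    nprow = math.isqrt(nprocs)
    # Walk down from floor(sqrt(nprocs)) to the nearest divisor.
    while nprocs % nprow != 0:
        nprow -= 1
    return nprow, nprocs // nprow


def grid_options(nprocs):
    """Format the grid as the PETSc command-line options quoted in this thread."""
    nprow, npcol = pick_process_grid(nprocs)
    return f"-mat_superlu_dist_r {nprow} -mat_superlu_dist_c {npcol}"


print(grid_options(640))  # 640 processes, i.e. the 32 x 20 grid from the log
```

For 640 processes this yields the 20 x 32 grid recommended above.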
> 
> On Tue, Jul 7, 2015 at 1:27 PM, Barry Smith <[email protected]> wrote:
> 
>    I would suggest running a sequence of problems, 101 by 101, 111 by 111, etc., 
> and getting the memory usage in each case (when you run out of memory you can get 
> NO useful information out about memory needs). You can then plot memory usage 
> as a function of problem size to get a handle on how much memory it is using. 
>  You can also run on more and more processes (which have a total of more 
> memory) to see how large a problem you may be able to reach.
> 
>    MUMPS also has an "out of core" version (which we have never used) that 
> could, in theory anyway, let you reach larger problems if you have lots of 
> disk space, but you are on your own figuring out how to use it.
> 
>   Barry
> 
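The scaling study suggested above (run several grid sizes, record memory, plot against problem size) can be turned into a rough extrapolation. The sketch below assumes memory follows a power law mem = a * n**p in the grid dimension n, which is an assumption and not something claimed in the thread. The numbers in the demo are synthetic, generated from a known power law to exercise the fit; they are not measurements from this code.

```python
import math


def fit_power_law(sizes, mem_bytes):
    """Least-squares fit of mem = a * n**p in log-log space; returns (a, p)."""
    if len(sizes) != len(mem_bytes) or len(sizes) < 2:
        raise ValueError("need at least two (size, memory) pairs")
    xs = [math.log(n) for n in sizes]
    ys = [math.log(m) for m in mem_bytes]
    xbar = sum(xs) / len(xs)
    ybar = sum(ys) / len(ys)
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    p = sxy / sxx
    a = math.exp(ybar - p * xbar)
    return a, p


def predict(a, p, n):
    """Extrapolated memory for grid dimension n under the fitted power law."""
    return a * n ** p


# Synthetic placeholder data (NOT real measurements): mem = 8 * n**3.
sizes = [101, 111, 121, 131]
synthetic_mem = [8.0 * n ** 3 for n in sizes]
a, p = fit_power_law(sizes, synthetic_mem)
print(f"fitted exponent p = {p:.3f}")
print(f"extrapolated memory at n = 151: {predict(a, p, 151):.3e} bytes")
```

In practice the memory values would come from the successful runs, and the extrapolation is only as good as the power-law assumption over the fitted range.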
> > On Jul 7, 2015, at 2:37 PM, Anthony Paul Haas <[email protected]> 
> > wrote:
> >
> > Hi Jose,
> >
> > In my code, I use PETSc once to solve a linear system to get the baseflow 
> > (without using SLEPc) and then I use SLEPc to do the stability analysis of 
> > that baseflow. This is why there are some SLEPc options that are not used 
> > in test.out-superlu_dist-151x151 (when I am solving for the baseflow with 
> > PETSc only). I have attached a 101x101 case for which I get the 
> > eigenvalues. That case works fine. However, if I increase to 151x151, I get 
> > the error that you can see in test.out-superlu_dist-151x151 (similar error 
> > with mumps: see test.out-mumps-151x151 line 2918). If you look at the very 
> > end of the files test.out-superlu_dist-151x151 and test.out-mumps-151x151, 
> > you will see that the last info message printed is:
> >
> > On Processor (after EPSSetFromOptions)  0    memory:    0.65073152000E+08   
> >        =====>  (see line 807 of module_petsc.F90)
> >
> > This means that the memory error probably occurs in the call to EPSSolve 
> > (see module_petsc.F90 line 810). I would like to evaluate how much memory 
> > is required by the most memory intensive operation within EPSSolve. Since I 
> > am solving a generalized EVP, I would imagine that it would be the LU 
> > decomposition. But is there an accurate way of doing it?
> >
> > Before starting with iterative solvers, I would like to exploit direct 
> > solvers as much as I can. I tried GMRES with the default preconditioner at some 
> > point but I had convergence problems. What solver/preconditioner would you 
> > recommend for a generalized non-Hermitian (EPS_GNHEP) EVP?
> >
> > Thanks,
> >
> > Anthony
> >
> > On Tue, Jul 7, 2015 at 12:17 AM, Jose E. Roman <[email protected]> wrote:
> >
> > El 07/07/2015, a las 02:33, Anthony Haas escribió:
> >
> > > Hi,
> > >
> > > I am computing eigenvalues using PETSc/SLEPc and superlu_dist for the LU 
> > > decomposition (my problem is a generalized eigenvalue problem). The code 
> > > runs fine for a grid with 101x101 but when I increase to 151x151, I get 
> > > the following error:
> > >
> > > Can't expand MemType 1: jcol 16104   (and then [NID 00037] 2015-07-06 
> > > 19:19:17 Apid 31025976: OOM killer terminated this process.)
> > >
> > > It seems to be a memory problem. I monitor the memory usage as far as I 
> > > can and it seems that memory usage is pretty low. The most memory 
> > > intensive part of the program is probably the LU decomposition in the 
> > > context of the generalized EVP. Is there a way to evaluate how much 
> > > memory will be required for that step? I am currently running the debug 
> > > version of the code which I would assume would use more memory?
> > >
> > > I have attached the output of the job. Note that the program uses PETSc 
> > > twice: 1) to solve a linear system for which no problem occurs, and, 2) 
> > > to solve the Generalized EVP with SLEPc, where I get the error.
> > >
> > > Thanks
> > >
> > > Anthony
> > > <test.out-superlu_dist-151x151>
> >
> > In the output you are attaching there are no SLEPc objects in the report 
> > and SLEPc options are not used. It seems that SLEPc calls are skipped?
> >
> > Do you get the same error with MUMPS? Have you tried to solve linear 
> > systems with a preconditioned iterative solver?
> >
> > Jose
> >
> >
> > <module_petsc.F90><test.out-mumps-151x151><test.out_superlu_dist-101x101><test.out-superlu_dist-151x151>
> 
> 
> 
> 
> 
> 
> <test.out-hangs><test.out-floating-point-exception>
