If I use MatView to check the type of my matrix, it replies mpiaij, not seqaij. Am I correct in understanding your comments to mean that the reason for the error is that, when I do the spectrum slicing, a SeqAIJ copy of the matrix is created for each processor?
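(A minimal sketch for checking the type programmatically, using only standard PETSc calls; H is the matrix handed to EPSSetOperators in the code quoted below:

    MatType mtype;                                   /* MatType is a string alias */
    ierr = MatGetType(H,&mtype);CHKERRV(ierr);       /* type of the global matrix */
    ierr = PetscPrintf(PETSC_COMM_WORLD,"Matrix type: %s\n",mtype);CHKERRV(ierr);

This reports only the type of the matrix living on PETSC_COMM_WORLD; per Jose's explanation below, the SeqAIJ matrices in the traceback are the redundant per-partition copies, which are separate objects.)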
On Sat, Aug 29, 2015 at 12:55 AM, Jose E. Roman <[email protected]> wrote:

> You are doing a spectrum slicing run (EPSSetInterval) with “size”
> partitions (EPSKrylovSchurSetPartitions), so every single process will be
> in charge of computing a subinterval. Each subcommunicator needs a
> redundant copy of the matrix, and in this case this copy is SeqAIJ since
> subcommunicators consist of just one process. You will probably need to
> share this memory across a set of processes and use MUMPS for the
> factorization. Try setting e.g. size/8 partitions.
>
> Jose
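(A sketch of that suggestion, assuming eps, pc, and size are the variables from the code quoted below, and using the PETSc 3.6-era name PCFactorSetMatSolverPackage, renamed PCFactorSetMatSolverType in later releases:

    PetscInt npart = size/8;                         /* several processes per partition */
    if (npart < 1) npart = 1;                        /* guard small runs */
    ierr = EPSKrylovSchurSetPartitions(eps,npart);CHKERRV(ierr);
    ierr = PCSetType(pc,PCCHOLESKY);CHKERRV(ierr);
    ierr = PCFactorSetMatSolverPackage(pc,MATSOLVERMUMPS);CHKERRV(ierr);

The same configuration should also be reachable at runtime via -eps_krylovschur_partitions <n> -st_pc_factor_mat_solver_package mumps.)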
> > On 29/8/2015, at 1:14, Barry Smith <[email protected]> wrote:
> >
> > It is using a SeqAIJ matrix, not a parallel matrix. Increasing the
> > number of cores won't affect the size of a sequential matrix since it
> > must be stored entirely on one process. Perhaps you need to use
> > parallel matrices?
> >
> > [1]PETSC ERROR: #4 MatDuplicate_SeqAIJ() line 4103 in /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/mat/impls/aij/seq/aij.c
> > [1]PETSC ERROR: #5 MatDuplicate() line 4252 in /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/mat/interface/matrix.c
> >
> >> On Aug 28, 2015, at 4:13 PM, Hank Lamm <[email protected]> wrote:
> >>
> >> Hi All,
> >>
> >> I am having a problem running PETSc 3.6 and SLEPc 3.6 on Stampede. My
> >> code should be a simple eigenvalue solver, but when I attempt to solve
> >> large problems (8488x8488 matrices) I get errors:
> >>
> >> --------------------- Error Message --------------------------------------------------------------
> >> [1]Total space allocated 1736835920 bytes
> >> [1]PETSC ERROR: Out of memory. This could be due to allocating
> >> [1]PETSC ERROR: too large an object or bleeding by not properly
> >> [1]PETSC ERROR: destroying unneeded objects.
> >> [1]PETSC ERROR: Memory allocated 1736835920 Memory used by process 1769742336
> >> [1]PETSC ERROR: [0]PETSC ERROR: Memory requested 864587796
> >> [1]PETSC ERROR: [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> >> [1]PETSC ERROR: #8 STSetUp() line 305 in /work/03324/hlammiv/slepc-3.6.0/src/sys/classes/st/interface/stsolve.c
> >> [1]PETSC ERROR: [0]PETSC ERROR: #1 MatDuplicateNoCreate_SeqAIJ() line 4030 in /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/mat/impls/aij/seq/aij.c
> >> [1]PETSC ERROR: #2 PetscTrMallocDefault() line 188 in /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/sys/memory/mtr.c
> >> [1]PETSC ERROR: #4 MatDuplicate_SeqAIJ() line 4103 in /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/mat/impls/aij/seq/aij.c
> >> [1]PETSC ERROR: #5 MatDuplicate() line 4252 in /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/mat/interface/matrix.c
> >> [1]PETSC ERROR: #6 STMatMAXPY_Private() line 379 in /work/03324/hlammiv/slepc-3.6.0/src/sys/classes/st/interface/stsolve.c
> >> [1]PETSC ERROR: #7 STSetUp_Sinvert() line 131 in /work/03324/hlammiv/slepc-3.6.0/src/sys/classes/st/impls/sinvert/sinvert.c
> >> [1]PETSC ERROR: #8 STSetUp() line 305 in /work/03324/hlammiv/slepc-3.6.0/src/sys/classes/st/interface/stsolve.c
> >> [1]PETSC ERROR: #9 EPSSliceGetInertia() line 295 in /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/ks-slice.c
> >> [1]PETSC ERROR: #10 EPSSetUp_KrylovSchur_Slice() line 425 in /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/ks-slice.c
> >> [1]PETSC ERROR: #11 EPSSetUp_KrylovSchur() line 89 in /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/krylovschur.c
> >> [1]PETSC ERROR: #12 EPSSetUp() line 121 in /work/03324/hlammiv/slepc-3.6.0/src/eps/interface/epssetup.c
> >> [1]PETSC ERROR: #13 EPSSliceGetEPS() line 267 in /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/ks-slice.c
> >> [1]PETSC ERROR: #14 EPSSetUp_KrylovSchur_Slice() line 368 in /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/ks-slice.c
> >> [1]PETSC ERROR: #15 EPSSetUp_KrylovSchur() line 89 in /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/krylovschur.c
> >> [1]PETSC ERROR: #16 EPSSetUp() line 121 in /work/03324/hlammiv/slepc-3.6.0/src/eps/interface/epssetup.c
> >> [1]PETSC ERROR: #17 EPSSolve() line 88 in /work/03324/hlammiv/slepc-3.6.0/src/eps/interface/epssolve.c
> >> [1]PETSC ERROR: #18 eigensolver() line 64 in /work/03324/hlammiv/TMSWIFT/src/solver.cpp
> >> [1]Current space PetscMalloc()ed 1.73683e+09, max space PetscMalloced() 1.73684e+09
> >> [1]Current process memory 1.76979e+09 max process memory 1.76979e+09
> >>
> >> The curious thing about this error is that if I increase the number of
> >> nodes, from 32 to 64 to 128, the amount of memory per node doesn't
> >> decrease. I have used valgrind and it doesn't seem to be a memory leak.
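(This observation is consistent with the explanation above: with one partition per process, every process holds a full SeqAIJ copy of H, so the per-process footprint is set by the matrix size and cannot shrink as nodes are added. One way to watch this directly is to print per-process usage around EPSSolve; a minimal sketch using PETSc's built-in counters, assuming rank is the variable from the code below:

    PetscLogDouble mem,mal;
    ierr = PetscMemoryGetCurrentUsage(&mem);CHKERRV(ierr);  /* resident size of this process */
    ierr = PetscMallocGetCurrentUsage(&mal);CHKERRV(ierr);  /* bytes currently PetscMalloc'ed */
    ierr = PetscSynchronizedPrintf(PETSC_COMM_WORLD,"[%d] mem %g malloc %g\n",(int)rank,mem,mal);CHKERRV(ierr);
    ierr = PetscSynchronizedFlush(PETSC_COMM_WORLD,PETSC_STDOUT);CHKERRV(ierr);
)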
> >>
> >> The relevant code piece is:
> >>
> >> void eigensolver(PetscErrorCode ierr, params *params, Mat &H, int argc, char **argv)
> >> {
> >>   EPS eps;                /* eigenproblem solver context */
> >>   EPSType type;
> >>   ST st;
> >>   KSP ksp;
> >>   PC pc;
> >>   PetscReal tol,error;
> >>   PetscReal lower,upper;
> >>   //PetscInt nev=dim,maxit,its;
> >>   PetscInt nev,maxit,its,nconv;
> >>   Vec xr,xi;
> >>   PetscScalar kr,ki;
> >>   PetscReal re,im;
> >>   PetscViewer viewer;
> >>   PetscInt rank;
> >>   PetscInt size;
> >>   std::string eig_file_n;
> >>   std::ofstream eig_file;
> >>   char ofile[100];
> >>
> >>   MPI_Comm_rank(PETSC_COMM_WORLD,&rank);
> >>   MPI_Comm_size(PETSC_COMM_WORLD,&size);
> >>
> >>   ierr = PetscPrintf(PETSC_COMM_WORLD,"---Beginning Eigenvalue Solver---\n");CHKERRV(ierr);
> >>   ierr = EPSCreate(PETSC_COMM_WORLD,&eps);CHKERRV(ierr);
> >>
> >>   eig_file_n.append(params->ofile_n);
> >>   eig_file_n.append("_eval");
> >>   eig_file.open(eig_file_n.c_str(),std::ofstream::trunc);
> >>
> >>   //Set operators. In this case, it is a standard eigenvalue problem
> >>   ierr = EPSSetOperators(eps,H,NULL);CHKERRV(ierr);
> >>   ierr = EPSSetProblemType(eps,EPS_HEP);CHKERRV(ierr);
> >>
> >>   ierr = EPSSetType(eps,EPSKRYLOVSCHUR);CHKERRV(ierr);
> >>
> >>   ierr = EPSGetST(eps,&st);CHKERRV(ierr);
> >>   ierr = STSetType(st,STSINVERT);CHKERRV(ierr);
> >>
> >>   ierr = STGetKSP(st,&ksp);CHKERRV(ierr);
> >>   ierr = KSPSetType(ksp,KSPPREONLY);CHKERRV(ierr);
> >>   ierr = KSPGetPC(ksp,&pc);CHKERRV(ierr);
> >>   ierr = PCSetType(pc,PCCHOLESKY);CHKERRV(ierr);
> >>   ierr = EPSKrylovSchurSetPartitions(eps,size);CHKERRV(ierr);
> >>
> >>   for(PetscInt i=0;i<params->nf;i++){
> >>     lower=std::pow(2.0*params->m[i]-params->m[i]*params->alpha*params->alpha,2.0);
> >>     upper=4.0*params->m[i]*params->m[i];
> >>     ierr = EPSSetInterval(eps,lower,upper);
> >>     ierr = EPSSetWhichEigenpairs(eps,EPS_ALL);
> >>     //Set solver parameters at runtime
> >>     ierr = EPSSetFromOptions(eps);CHKERRV(ierr);
> >>     // ierr = EPSSetWhichEigenpairs(eps,EPS_SMALLEST_REAL);
> >>
> >>     ierr = MatCreateVecs(H,NULL,&xr);CHKERRV(ierr);
> >>     ierr = MatCreateVecs(H,NULL,&xi);CHKERRV(ierr);
> >>
> >>     ierr = EPSSolve(eps);CHKERRV(ierr);
> >>
> >>     ierr = EPSGetIterationNumber(eps,&its);CHKERRV(ierr);
> >>     ierr = PetscPrintf(PETSC_COMM_WORLD," Number of iterations of the method: %D\n",its);CHKERRV(ierr);
> >>
> >>     //Optional: Get some information from the solver and display it
> >>     ierr = EPSGetType(eps,&type);CHKERRV(ierr);
> >>     ierr = PetscPrintf(PETSC_COMM_WORLD," Solution method: %s\n\n",type);CHKERRV(ierr);
> >>     ierr = EPSGetDimensions(eps,&nev,NULL,NULL);CHKERRV(ierr);
> >>     ierr = PetscPrintf(PETSC_COMM_WORLD," Number of requested eigenvalues: %D\n",nev);CHKERRV(ierr);
> >>     ierr = EPSGetTolerances(eps,&tol,&maxit);CHKERRV(ierr);
> >>     ierr = PetscPrintf(PETSC_COMM_WORLD," Stopping condition: tol=%.4g, maxit=%D\n",tol,maxit);CHKERRV(ierr);
> >>
> >>     ierr = EPSGetConverged(eps,&nconv);CHKERRV(ierr);
> >>     ierr = PetscPrintf(PETSC_COMM_WORLD," Number of converged eigenpairs: %D\n\n",nconv);CHKERRV(ierr);
> >>
> >>     strcpy(ofile,params->ofile_n);
> >>     strcat(ofile,"_evecr");
> >>
> >>     ierr = PetscViewerASCIIOpen(PETSC_COMM_WORLD,ofile,&viewer);CHKERRV(ierr);
> >>
> >>     if (nconv>0)
> >>     {
> >>       ierr = PetscPrintf(PETSC_COMM_WORLD,
> >>              "           k          ||Ax-kx||/||kx||\n"
> >>              "   ----------------- ------------------\n");CHKERRV(ierr);
> >>
> >>       for (PetscInt i=0;i<nconv;i++)
> >>       {
> >>         //Get converged eigenpairs: i-th eigenvalue is stored in kr (real part) and ki (imaginary part)
> >>         ierr = EPSGetEigenpair(eps,i,&kr,&ki,xr,xi);CHKERRV(ierr);
> >>         //Compute the relative error associated to each eigenpair
> >>         ierr = EPSComputeError(eps,i,EPS_ERROR_RELATIVE,&error);CHKERRV(ierr);
> >>
> >> #if defined(PETSC_USE_COMPLEX)
> >>         re = PetscRealPart(kr);
> >>         im = PetscImaginaryPart(kr);
> >> #else
> >>         re = kr;
> >>         im = ki;
> >> #endif
> >>
> >>         if (im!=0.0)
> >>         {
> >>           ierr = PetscPrintf(PETSC_COMM_WORLD," %9f%+9f j %12g\n",re,im,error);CHKERRV(ierr);
> >>           if(rank==0) eig_file << re << " " << im << " " << error << std::endl;
> >>         } else
> >>         {
> >>           ierr = PetscPrintf(PETSC_COMM_WORLD," %12f %12g\n",re,error);CHKERRV(ierr);
> >>           if(rank==0) eig_file << re << " " << 0 << " " << error << std::endl;
> >>         }
> >>
> >>         ierr = VecView(xr,viewer);CHKERRV(ierr);
> >>       }
> >>       ierr = PetscPrintf(PETSC_COMM_WORLD,"\n");CHKERRV(ierr);
> >>     }
> >>   }
> >>   eig_file.close();
> >>   ierr = EPSDestroy(&eps);CHKERRV(ierr);
> >>   ierr = PetscViewerDestroy(&viewer);CHKERRV(ierr);
> >>   ierr = VecDestroy(&xr);CHKERRV(ierr);
> >>   ierr = VecDestroy(&xi);CHKERRV(ierr);
> >>
> >>   ierr = PetscPrintf(PETSC_COMM_WORLD,"---Finishing Eigenvalue Solver---\n");CHKERRV(ierr);
> >> }
> >>
> >> Thanks,
> >> Hank
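(Separate from the slicing issue: the quoted loop recreates xr, xi, and viewer on every pass over params->nf but destroys each of them only once after the loop, so every iteration after the first leaks those objects. A sketch of the hoisted pattern, using the same calls as the code above:

    ierr = MatCreateVecs(H,NULL,&xr);CHKERRV(ierr);          /* create once, before the loop */
    ierr = MatCreateVecs(H,NULL,&xi);CHKERRV(ierr);
    for (PetscInt i=0; i<params->nf; i++) {
      /* ... set interval, EPSSolve, report results as above ... */
      ierr = PetscViewerASCIIOpen(PETSC_COMM_WORLD,ofile,&viewer);CHKERRV(ierr);
      /* ... VecView(xr,viewer) for each converged pair ... */
      ierr = PetscViewerDestroy(&viewer);CHKERRV(ierr);      /* destroy inside the loop */
    }
    ierr = VecDestroy(&xr);CHKERRV(ierr);                    /* destroy once, after the loop */
    ierr = VecDestroy(&xi);CHKERRV(ierr);
)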
