On Aug 9, 2011, at 2:54 AM, Shitij Bhargava wrote:

> Thanks Jose, Barry.
>
> I tried what you said, but that gives me an error:
>
> [0]PETSC ERROR: --------------------- Error Message ------------------------------------
> [0]PETSC ERROR: Argument out of range!
> [0]PETSC ERROR: Can only get local values, trying 9!
>
> This is probably because here I am trying to insert all rows of the matrix through process 0, but process 0 doesn't own all the rows.
>
> In any case, this seems very "unnatural", so I am using MPIAIJ the right way as you said, where I assemble the MPIAIJ matrix in parallel instead of only on one process. I have done that, and am running the code on the cluster right now. It's going to take a long, long time to finish,

   It shouldn't take a long time to finish. Are you sure you are creating all the objects with PETSC_COMM_WORLD and not PETSC_COMM_SELF? Have you done the correct matrix preallocation (http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#efficient-assembly)? Is each process generating just its part of the matrix?

   Barry
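A minimal sketch, in C, of the kind of assembly Barry is asking about: each rank creates the matrix on PETSC_COMM_WORLD, preallocates it, and fills only the rows it owns. The matrix size, nonzero estimates, and values below are placeholders rather than anything from Shitij's code, and exact call signatures can differ a little between PETSc versions.

```c
#include <petscmat.h>

int main(int argc, char **argv)
{
  Mat      A;
  PetscInt n = 1000, Istart, Iend, i;   /* placeholder global size */

  PetscInitialize(&argc, &argv, NULL, NULL);

  /* Create the matrix on the world communicator, not PETSC_COMM_SELF */
  MatCreate(PETSC_COMM_WORLD, &A);
  MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n);
  MatSetType(A, MATMPIAIJ);

  /* Preallocate with rough per-row estimates for the diagonal and
     off-diagonal blocks (placeholder numbers) */
  MatMPIAIJSetPreallocation(A, 5, NULL, 2, NULL);

  /* Each rank generates and inserts only the rows it owns */
  MatGetOwnershipRange(A, &Istart, &Iend);
  for (i = Istart; i < Iend; i++) {
    PetscScalar v = 2.0;                /* placeholder entry */
    MatSetValues(A, 1, &i, 1, &i, &v, INSERT_VALUES);
  }
  MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
  MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);

  MatDestroy(&A);
  PetscFinalize();
  return 0;
}
```

Off-process entries passed to MatSetValues are legal; they are stashed and sent to their owners during MatAssemblyBegin/MatAssemblyEnd. Generating each row on the rank that owns it is what keeps memory and assembly time balanced, which is the point of Barry's questions.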
> so I can't confirm some of my doubts, which I am asking below:
>
> 1. If I run the code with 1 process, and say it takes M memory (peak) while solving for eigenvalues, then when I run it with N processes, each will take nearly M/N memory (peak), probably a little more, right? And for doing this, I don't have to use any special MPI stuff... the fact that I am using MPIAIJ, building the EPS object from it, and then calling EPSSolve() is enough? I mean, EPSSolve() is internally distributing memory and computational effort automatically when I use MPIAIJ and run the code with many processes, right?
> This confusion is there because when I use top while running the code with 8 processes, each of them showed nearly 250 MB initially, but each has grown to use 270 MB in about 70 minutes. I understand that the Krylov-Schur method is such that memory requirements increase slowly, but the peak on any process will be less than if I ran only one process, right? (Even though their memory requirements are growing, they will grow to some M/N only, right?)
>
> Actually, the fact that in this case each of the processes creates its own EPS context, initializes it itself, and then calls EPSSolve() itself without any "interaction" with other processes makes me wonder if they really are working together, or just individually (I would have verified this myself, but the program would take way too much time, and I know I would have to kill it sooner or later)... or is the fact that they initialize their own EPS context with THEIR part of the MPIAIJ matrix enough to make them "cooperate and work together"? (Although I think this is what Barry meant in that last post, I am not too sure.)
>
> I am not too comfortable with the MPI way of thinking right now; probably this is why I have this confusion.
>
> Anyway, I can't thank you guys enough. I would have been scrounging through documentation again and again to no avail if you had not helped me the way you did. The responses were always prompt, always to the point (even though my questions were sometimes not, probably because I didn't completely understand the problems I was facing... but you always knew what I was asking) and very clear. At this moment I don't know much about PETSc/SLEPc myself, but I will be sure to contribute back to this list when I do. I have nothing but sincere gratitude for you guys.
>
> Thank you very much!
>
> Shitij
>
>
> On 9 August 2011 00:58, Barry Smith <bsmith at mcs.anl.gov> wrote:
>
> On Aug 8, 2011, at 2:14 AM, Shitij Bhargava wrote:
>
> > Thank you Jed. That was indeed the problem. I installed a separate MPI for PETSc/SLEPc, but was running my program with a default, already installed one.
> >
> > Now, I have a different question. What I want to do is this:
> >
> > 1. Only one process, say root, calculates the matrix in SeqAIJ format.
> > 2. Then root creates the EPS context, eps, and initializes it, sets parameters, problem type, etc. properly.
> > 3. After this the root process broadcasts this eps object to the other processes.
> > 4. I use EPSSolve to solve for eigenvalues (all processes together in cooperation, resulting in memory distribution).
> > 5. I get the results from root.
>
> We do have an undocumented routine MatDistribute_MPIAIJ(MPI_Comm comm,Mat gmat,PetscInt m,MatReuse reuse,Mat *inmat) in src/mat/impls/aij/mpi/mpiaij.c that will take a SeqAIJ matrix and distribute it over a larger MPI communicator.
>
> Note that you cannot create the EPS context etc. on the root process and then broadcast the object, but once the matrix is distributed you can simply create the EPS context etc. on the parallel communicator where the matrix is and run with that.
>
>    Barry
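A sketch of the approach Barry describes: once the matrix is distributed, every rank makes the same EPS calls on the parallel communicator, and EPSSolve then runs as one collective computation rather than N separate ones. The helper name, the Hermitian problem type, and the printed output are illustrative assumptions, and call details can vary slightly across SLEPc versions.

```c
#include <slepceps.h>

/* Hypothetical helper: A is an already-assembled MPIAIJ matrix
   that lives on PETSC_COMM_WORLD. */
PetscErrorCode solve_eigenproblem(Mat A)
{
  EPS         eps;
  PetscInt    i, nconv;
  PetscScalar kr, ki;

  /* Create the solver on the same communicator as the matrix,
     not on PETSC_COMM_SELF and not only on rank 0 */
  EPSCreate(PETSC_COMM_WORLD, &eps);
  EPSSetOperators(eps, A, NULL);        /* standard problem A x = lambda x */
  EPSSetProblemType(eps, EPS_HEP);      /* assumed Hermitian; adjust if not */
  EPSSetFromOptions(eps);               /* honors -eps_nev, -eps_mpd, -eps_monitor, ... */

  /* Collective call: all ranks enter it, and the work and storage for the
     Krylov basis are distributed across them */
  EPSSolve(eps);

  EPSGetConverged(eps, &nconv);
  for (i = 0; i < nconv; i++) {
    EPSGetEigenvalue(eps, i, &kr, &ki);
    PetscPrintf(PETSC_COMM_WORLD, "eigenvalue %d: %g\n", (int)i, (double)PetscRealPart(kr));
  }
  EPSDestroy(&eps);
  return 0;
}
```

Each process executes these same lines; nothing needs to be broadcast. Creating the EPS on PETSC_COMM_WORLD, the communicator of the matrix, is what makes the ranks cooperate, and the basis vectors are distributed the same way the matrix rows are, which is where the roughly M/N per-process memory in Shitij's question comes from.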
> > Is this possible? I am not able to broadcast the EPS object, because it is not an MPI_Datatype. Is there any PETSc/SLEPc function for this? I am avoiding using MPIAIJ because that will mean making many changes in the existing code, including the numerous write(*,*) statements (I would have to convert them to PetscPrintf in Fortran or something like that).
> > So I want a single process to handle matrix generation and assembly, but I want to solve the eigenproblem in parallel with different processes. Running the subroutine EPSSolve in parallel, and hence distributing memory, is the only reason I want to use MPI.
> >
> > Thanks a lot!!
> >
> > Shitij
> >
> > On 8 August 2011 11:05, Jed Brown <jedbrown at mcs.anl.gov> wrote:
> > On Mon, Aug 8, 2011 at 00:29, Shitij Bhargava <shitij.cse at gmail.com> wrote:
> > I ran it with:
> >
> > mpirun -np 2 ./slepcEigenMPI -eps_monitor
> >
> > I didn't do exactly what you said, because the matrix generation part in the actual program is quite time consuming itself. But I assume what I am doing is equivalent to what you meant to do? Also, I set MPD to PETSC_DECIDE, because I didn't know what to set it to for this matrix dimension.
> >
> > This is the output I get (part of the output):
> > MATRIX ASSMEBLY DONE !!!!!!!!
> >
> > MATRIX ASSMEBLY DONE !!!!!!!!
> >
> > 1 EPS nconv=98 first unconverged value (error) 1490.88 (1.73958730e-05)
> > 1 EPS nconv=98 first unconverged value (error) 1490.88 (1.73958730e-05)
> > 2 EPS nconv=282 first unconverged value (error) 3.04636e-27 (2.49532175e-04)
> > 2 EPS nconv=282 first unconverged value (error) 3.04636e-27 (2.49532175e-04)
> >
> > The most likely case is that you have more than one MPI implementation installed and that you are running with a different implementation than you built with. Compare the outputs:
> >
> > $ ldd ./slepcEigenMPI
> > $ which mpirun
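On the concern about the existing write(*,*) statements: the usual pattern is to route output through PetscPrintf on PETSC_COMM_WORLD, or to guard the existing writes by rank, so each message is printed once no matter how many processes run. A small sketch in C (Shitij's program is Fortran, so this only illustrates the pattern; the message text is just an example):

```c
#include <stdio.h>
#include <petscsys.h>

int main(int argc, char **argv)
{
  PetscMPIInt rank;

  PetscInitialize(&argc, &argv, NULL, NULL);
  MPI_Comm_rank(PETSC_COMM_WORLD, &rank);

  /* Prints once, from rank 0 only, regardless of the number of processes */
  PetscPrintf(PETSC_COMM_WORLD, "MATRIX ASSEMBLY DONE\n");

  /* Equivalent explicit guard, closer to wrapping an existing write/print */
  if (rank == 0) {
    printf("MATRIX ASSEMBLY DONE (rank-0 guard)\n");
  }

  PetscFinalize();
  return 0;
}
```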
