Hi Matt,

I found the error, and it was my mistake. I was using serial code: my program tried to write values that belong to other processes, the off-process values you mentioned. After I changed the code to run in parallel mode, it works now. Sorry for wasting your time. :-)
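To illustrate the fix described above, here is a minimal sketch in plain Python (not PETSc or libMesh code). It assumes the matrix rows are split into contiguous blocks, as evenly as possible, which is the layout PETSc chooses when local sizes are left to PETSC_DECIDE. The function names and the count of 4 processes are hypothetical; the row count 1343201 is the number of DOFs from the original report. In a real PETSc program the owned range would come from MatGetOwnershipRange().

```python
def ownership_range(n_rows, n_ranks, rank):
    """Return the half-open row range [start, end) owned by `rank`.

    Assumption: contiguous, near-even split. Every rank gets
    n_rows // n_ranks rows and the first n_rows % n_ranks ranks
    get one extra row.
    """
    base, extra = divmod(n_rows, n_ranks)
    start = rank * base + min(rank, extra)
    end = start + base + (1 if rank < extra else 0)
    return start, end


def off_process_rows(assembled_rows, n_rows, n_ranks, rank):
    """Count rows that `rank` writes to but does not own.

    Each such row would have to be stashed and shipped to its owner
    during assembly.
    """
    start, end = ownership_range(n_rows, n_ranks, rank)
    owned = range(start, end)
    return sum(1 for row in assembled_rows if row not in owned)


if __name__ == "__main__":
    n_rows = 1343201  # DOFs reported in the thread
    n_ranks = 4       # hypothetical process count

    for rank in range(n_ranks):
        start, end = ownership_range(n_rows, n_ranks, rank)

        # Serial-style assembly: every rank loops over all rows.
        serial_style = off_process_rows(range(n_rows), n_rows, n_ranks, rank)

        # Parallel-style assembly: each rank loops over its own rows only.
        parallel_style = off_process_rows(range(start, end), n_rows, n_ranks, rank)

        print(f"rank {rank}: owns [{start}, {end}), "
              f"off-process rows serial-style={serial_style}, "
              f"parallel-style={parallel_style}")
```

In a real finite element assembly a small number of off-process entries at partition boundaries is normal and, as Matt notes, not inherently bad. The problem in this thread was that nearly all entries written by a process were off-process.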

thanx

pan

--- Matthew Knepley <knepley at gmail.com> wrote:

> Here is the trace:
>
> [0]PETSC ERROR: Memory allocated 865987336 Memory used by process 1591005184
> [0]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info.
> [0]PETSC ERROR: Memory requested 1310720296!
> [0]PETSC ERROR: PetscTrMallocDefault() line 188 in src/sys/src/memory/mtr.c
> [0]PETSC ERROR: MatStashExpand_Private() line 240 in src/mat/utils/matstash.c
> [0]PETSC ERROR: MatStashValuesRow_Private() line 276 in src/mat/utils/matstash.c
> [0]PETSC ERROR: MatSetValues_MPIAIJ() line 199 in src/mat/impls/aij/mpi/mpiaij.c
> [0]PETSC ERROR: MatSetValues() line 702 in src/mat/interface/matrix.c
> [0]PETSC ERROR: User provided function() line 312 in unknowndirectory/src/numerics/petsc_matrix.C
>
> So, you did not write petsc_matrix? What is happening here is
> off-processor values are being set with MatSetValues(). That means
> we have to stash them. This is not inherently bad, but the stash space
> grows so large that memory on the node is exhausted. This is very rare
> with a PDE problem on a mesh. That what leads me to think that too many
> values are being generated on a single proc.
>
>   Matt
>
> On 9/5/07, li pan <li76pan at yahoo.com> wrote:
> > hi Matt,
> > I'm using libmesh. So I have no idea how the values
> > were set. Before, I was connecting several computers
> > in my office. And I didn't have this problem.
> > Recently, I tried to install all libraries to a linux
> > cluster. And I've got this problem. I don't know why.
> > mpdtrace shows all the connected nodes I want. The
> > only one difference is, all the nodes are mounted to a
> > headnode. In my office I didn't use mount.
> > Could this be the reason?
> >
> > thanx
> >
> > pan
> >
> > --- Matthew Knepley <knepley at gmail.com> wrote:
> >
> > > Are you trying to set all the values from a single
> > > processor?
> > >
> > >   Matt
> > >
> > > On 9/4/07, li pan <li76pan at yahoo.com> wrote:
> > > > Dear all,
> > > > I recently installed Petsc on a linux cluster and
> > > > tried to solve a linear equation in parallel way. I
> > > > used 3D Hex mesh. Mesh dimension is 181, 181, 41. The
> > > > number of Dofs are 1343201.
> > > > In serial run, there was no problem. But at parallel
> > > > run, there was memory allocation problem.
> > > >
> > > > -----------------------------------------------------------------------
> > > > [0]PETSC ERROR: PetscMallocAlign() line 62 in src/sys/src/memory/mal.c
> > > > [0]PETSC ERROR: Out of memory. This could be due to allocating
> > > > [0]PETSC ERROR: too large an object or bleeding by not properly
> > > > [0]PETSC ERROR: destroying unneeded objects.
> > > > [3]PETSC ERROR: MatSetValues() line 702 in src/mat/interface/matrix.c
> > > > [3]PETSC ERROR: User provided function() line 312 in unknowndirectory/src/numerics/petsc_matrix.C
> > > > [cli_3]: aborting job:
> > > > application called MPI_Abort(comm=0x84000000, 55) - process 3
> > > > [0]PETSC ERROR: Memory allocated 865987336 Memory used by process 1591005184
> > > > [0]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info.
> > > > [0]PETSC ERROR: Memory requested 1310720296!
> > > > [0]PETSC ERROR: PetscTrMallocDefault() line 188 in src/sys/src/memory/mtr.c
> > > > [0]PETSC ERROR: MatStashExpand_Private() line 240 in src/mat/utils/matstash.c
> > > > [0]PETSC ERROR: MatStashValuesRow_Private() line 276 in src/mat/utils/matstash.c
> > > > [0]PETSC ERROR: MatSetValues_MPIAIJ() line 199 in src/mat/impls/aij/mpi/mpiaij.c
> > > > [0]PETSC ERROR: MatSetValues() line 702 in src/mat/interface/matrix.c
> > > > [0]PETSC ERROR: User provided function() line 312 in unknowndirectory/src/numerics/petsc_matrix.C
> > > > [cli_0]: aborting job:
> > > > application called MPI_Abort(comm=0x84000000, 55) - process 0
> > > > rank 3 in job 1 hpc16_44261 caused collective abort of all ranks
> > > > exit status of rank 3: return code 55
> > > >
> > > > I checked memory on all the nodes. Each of them has
> > > > more than 2.5 GB before program starts.
> > > > What could be the reason?
> > > >
> > > > thanx
> > > >
> > > > pan
> > >
> > > --
> > > What most experimenters take for granted before they begin their
> > > experiments is infinitely more interesting than any results to which
> > > their experiments lead.
> > > -- Norbert Wiener
