I updated the results for the Bratu problem on our SGI. It has 8 cores per node (two 4-core processors per node), and I ran on 1 to 256 cores. The log_summary output for both studies is attached.

Question: is there anything about the memory usage of this problem that doesn't scale? Based on log_summary, the memory usage looks steady at < 1 GB per core. I ask because last night I tried one more level of refinement for the weak scaling study on 1024 cores and it crashed. The same job ran fine on 512 cores this morning, so I'm hoping the crash was a temporary system problem.
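One quick check I'm doing on the attached logs, assuming each .out file just concatenates the log_summary output from the individual runs, is to pull out the memory lines and compare them across core counts (case-insensitive match, since the exact label varies a bit between PETSc versions):

  % grep -i "memory" weakScaling.out
  % grep -i "memory" strongScaling.out

If the per-process numbers stay flat as the core count grows, memory weak scaling is probably fine and the 1024-core crash was something else.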
Notes: There is a shift in the strong scaling curve as the job fills up the first node (i.e. from 1 to 16 cores); after that it looks perfect. The shift seems reasonable given that each 4-core processor shares a cache. The weak scaling shows slight growth in the wall clock time, from 6.3 seconds to 17 seconds. I'm going to rerun that study with a larger coarse grid to push the runtime up to several minutes (a possible invocation is sketched below, after the quoted thread).

Graphs: https://proteus.usace.army.mil/home/pub/17/

On Thu, Apr 11, 2013 at 12:46 PM, Jed Brown <jedbrown at mcs.anl.gov> wrote:
> Chris Kees <cekees at gmail.com> writes:
>
>> Thanks a lot. I did a little example with the Bratu problem and posted it
>> here:
>>
>> https://proteus.usace.army.mil/home/pub/17/
>>
>> I used boomeramg instead of geometric multigrid because I was getting
>> an error with the options above:
>>
>> %mpiexec -np 4 ./ex5 -mx 129 -my 129 -Nx 2 -Ny 2 -pc_type mg -pc_mg_levels 2
>> [0]PETSC ERROR: --------------------- Error Message
>> ------------------------------------
>> [0]PETSC ERROR: Argument out of range!
>> [0]PETSC ERROR: New nonzero at (66,1) caused a malloc!
>> [0]PETSC ERROR:
>> ------------------------------------------------------------------------
>
> That test hard-codes evil things (presumably for testing purposes,
> though maybe the functionality has been subsumed). Please use
> src/snes/examples/tutorials/ex5.c instead.
>
>   mpiexec -n 4 ./ex5 -da_grid_x 65 -da_grid_y 65 -pc_type mg -log_summary -da_refine 1
>
> Increase '-da_refine 1' to get higher resolution. (This will increase
> the number of MG levels used by PCMG.)
>
> Switch '-da_refine 1' to '-snes_grid_sequence 1' if you want FMG, but
> note that it's trickier to profile because proportionately more time is
> spent in coarse levels (although the total solve time is lower).
>
>> I like the ice paper and will try to get the contractor started on
>> reproducing those results.
>>
>> -Chris
>>
>> On Wed, Apr 10, 2013 at 1:13 PM, Nystrom, William D <wdn at lanl.gov> wrote:
>>> Sorry. I overlooked that the URL was using the git protocol. My bad.
>>>
>>> Dave
>>>
>>> ________________________________________
>>> From: Jed Brown [five9a2 at gmail.com] on behalf of Jed Brown [jedbrown at mcs.anl.gov]
>>> Sent: Wednesday, April 10, 2013 12:10 PM
>>> To: Nystrom, William D; For users of the development version of PETSc;
>>> Chris Kees
>>> Subject: Re: [petsc-dev] examples/benchmarks for weak and strong scaling
>>> exercise
>>>
>>> "Nystrom, William D" <wdn at lanl.gov> writes:
>>>
>>>> Jed,
>>>>
>>>> I tried cloning your tme-ice git repo as follows and it failed:
>>>>
>>>> % git clone --recursive git://github.com/jedbrown/tme-ice.git tme_ice
>>>> Cloning into 'tme_ice'...
>>>> fatal: unable to connect to github.com:
>>>> github.com[0: 204.232.175.90]: errno=Connection timed out
>>>>
>>>> I'm doing this from an xterm that allows me to clone petsc just fine.
>>>
>>> You're using https or ssh to clone PETSc, but the git:// protocol to
>>> clone tme-ice. The LANL network is blocking that port, so just use the
>>> https or ssh protocol.
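P.S. For the larger-coarse-grid weak scaling rerun mentioned above, something along these lines is what I have in mind, following the ex5 command Jed suggested in the quoted thread; the grid size, core count, and refinement level are placeholders, not numbers I've tested:

  % mpiexec -n 256 ./ex5 -da_grid_x 257 -da_grid_y 257 -pc_type mg -da_refine 2 -log_summary

For weak scaling the refinement level goes up with the core count, so the points per core stay roughly constant: each '-da_refine' step in 2D quadruples the unknowns, so it pairs with a 4x increase in cores.

And for Dave, in case it saves a round trip, the same clone over https should get through the LANL firewall (untested from here):

  % git clone --recursive https://github.com/jedbrown/tme-ice.git tme_ice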
-------------- next part --------------
A non-text attachment was scrubbed...
Name: weakScaling.out
Type: application/octet-stream
Size: 63112 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/petsc-dev/attachments/20130412/28762b9b/attachment-0002.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: strongScaling.out
Type: application/octet-stream
Size: 77424 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/petsc-dev/attachments/20130412/28762b9b/attachment-0003.obj>
