[petsc-dev] examples/benchmarks for weak and strong scaling exercise
On Fri, Apr 12, 2013 at 9:18 AM, Chris Kees cekees at gmail.com wrote:

> I updated the results for the Bratu problem on our SGI. It has 8 cores per
> node (two 4-core processors per node), and I ran from 1 to 256 cores. The
> log_summary output is attached for both studies.

Strong scaling: this looks fine. You get the classic memory-bandwidth starvation after 2 cores on the same node (although your scaling does not completely bottom out), and across nodes the scaling is great.

Weak scaling: I have to go through the logs, but obviously something is wrong. I am betting it is the failure to increase the number of GMG levels with increasing problem size.

> Is there anything about the memory usage of that problem that doesn't
> scale? The memory usage looks steady at 1 GB per core based on
> log_summary. I ask because last night I tried to do one more level of
> refinement for weak scaling on 1024 cores and it crashed. I ran the same
> job on 512 cores this morning, and it ran fine, so I'm hoping the issue
> was a temporary system problem.

No, the memory usage is scalable.

Thanks,

Matt

> Notes: There is a shift in the strong-scaling curve as it fills up the
> first node (i.e., from 1 to 16 cores); after that it looks perfect. The
> shift seems reasonable given the sharing of the cache by 4 cores. The weak
> scaling shows growth in the wall clock from 6.3 seconds to 17 seconds. I'm
> going to run that again with a larger coarse grid in order to increase the
> runtime to several minutes.
>
> Graphs: https://proteus.usace.army.mil/home/pub/17/

[...]

-- 
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener
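The weak-scaling growth Chris reports (6.3 s at the small end, 17 s at the large end) works out to a fairly low parallel efficiency. A minimal sketch of the arithmetic (the helper name is mine, not a PETSc utility; ideal weak scaling would keep the wall-clock time constant):

```shell
# Weak-scaling efficiency: with a fixed problem size per core, ideal
# scaling keeps the wall-clock time constant, so efficiency = t_base / t_n.
weak_eff() {
  awk -v t0="$1" -v tn="$2" 'BEGIN { printf "%.2f", t0 / tn }'
}

# Times reported in the thread: 6.3 s and 17 s.
weak_eff 6.3 17   # prints 0.37
```

An efficiency this far below 1 is consistent with Matt's guess that the number of multigrid levels was not growing with the problem size, so coarse-grid solves got relatively more expensive.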
[petsc-dev] examples/benchmarks for weak and strong scaling exercise
Thanks a lot. I did a little example with the Bratu problem and posted it here: https://proteus.usace.army.mil/home/pub/17/

I used boomeramg instead of geometric multigrid because I was getting an error with the options above:

%mpiexec -np 4 ./ex5 -mx 129 -my 129 -Nx 2 -Ny 2 -pc_type mg -pc_mg_levels 2
[0]PETSC ERROR: - Error Message
[0]PETSC ERROR: Argument out of range!
[0]PETSC ERROR: New nonzero at (66,1) caused a malloc!
[0]PETSC ERROR:

I like the ice paper and will try to get the contractor started on reproducing those results.

-Chris

[...]
[petsc-dev] examples/benchmarks for weak and strong scaling exercise
What is going on with those results? In both cases the first parallel run is seriously outperforming the single-core run. I'd be interested in seeing the two log_summary outputs. I can only assume that the PETSc developers have an "if (serial) sleep(10);" buried somewhere in the source from their last Gordon Bell run :)

A

On Thu, Apr 11, 2013 at 6:33 PM, Chris Kees cekees at gmail.com wrote:

> Thanks a lot. I did a little example with the Bratu problem and posted it
> here: https://proteus.usace.army.mil/home/pub/17/

[...]
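The "parallel beats serial" effect above is easy to quantify: speedup on p cores is S = t1/tp, and S > p (superlinear speedup) usually points to cache effects or a memory-starved serial baseline rather than anything buried in the library. A sketch with made-up numbers (the helper name is mine):

```shell
# Speedup: serial time divided by parallel time. A value above the core
# count is "superlinear", typically a cache-capacity effect.
speedup() {
  awk -v t1="$1" -v tp="$2" 'BEGIN { printf "%.1f", t1 / tp }'
}

# Hypothetical example: 24 s serial, 10 s on 2 cores.
speedup 24.0 10.0   # prints 2.4, i.e. superlinear on 2 cores
```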
[petsc-dev] examples/benchmarks for weak and strong scaling exercise
Chris Kees cekees at gmail.com writes:

> Thanks a lot. I did a little example with the Bratu problem and posted it
> here: https://proteus.usace.army.mil/home/pub/17/ I used boomeramg instead
> of geometric multigrid because I was getting an error with the options
> above:
>
> %mpiexec -np 4 ./ex5 -mx 129 -my 129 -Nx 2 -Ny 2 -pc_type mg -pc_mg_levels 2
> [0]PETSC ERROR: - Error Message
> [0]PETSC ERROR: Argument out of range!
> [0]PETSC ERROR: New nonzero at (66,1) caused a malloc!
> [0]PETSC ERROR:

That test hard-codes evil things (presumably for testing purposes, though maybe the functionality has been subsumed). Please use src/snes/examples/tutorials/ex5.c instead:

mpiexec -n 4 ./ex5 -da_grid_x 65 -da_grid_y 65 -pc_type mg -log_summary -da_refine 1

Increase '-da_refine 1' to get higher resolution. (This will increase the number of MG levels used by PCMG.) Switch '-da_refine 1' to '-snes_grid_sequence 1' if you want FMG, but note that it's trickier to profile because proportionately more time is spent on coarse levels (although the total solve time is lower).

> I like the ice paper and will try to get the contractor started on
> reproducing those results.
>
> -Chris

[...]
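Each '-da_refine' step in Jed's command line roughly doubles the resolution. A sketch of the resulting grid sizes, assuming the usual rule for refining a non-periodic DMDA, where an m-point grid becomes 2m - 1 points per dimension (the helper function is hypothetical, not a PETSc tool):

```shell
# Grid size per dimension after r applications of '-da_refine',
# assuming each refinement maps m points to 2*m - 1.
refined_size() {
  m="$1"; r="$2"
  i=0
  while [ "$i" -lt "$r" ]; do
    m=$(( 2 * m - 1 ))
    i=$(( i + 1 ))
  done
  echo "$m"
}

# Starting from the 65x65 coarse grid in the command line above:
refined_size 65 1   # prints 129
refined_size 65 3   # prints 513
```

This is also why the number of PCMG levels grows with '-da_refine': each refinement adds one level between the 65x65 coarse grid and the fine grid.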
[petsc-dev] examples/benchmarks for weak and strong scaling exercise
On Wed, Apr 10, 2013 at 10:04 AM, Chris Kees cekees at gmail.com wrote:

> Hi guys, could somebody point me to some examples you guys routinely use
> for weak and strong scaling studies (maybe even with scripts, option
> files, or prior results on recent hardware)? I'm thinking of 3D Poisson
> with finite differences and geometric multigrid, or something like that.

I am trying to write this stuff down, but there is not much written right now, so let's start at the beginning. SNES ex5 is the simplest example that can be used for this, I think. It is 2D Poisson (actually Bratu). It's really easy to get weak scaling by adjusting the grid size using

  -da_grid_x m -da_grid_y n

You can turn on MG using

  -pc_type mg -pc_mg_levels n

although the slightly non-intuitive thing is that the grid size you input is then for the coarse grid.

> We've been trying to work toward scaling studies of the fieldsplit and
> Schur complement preconditioners for our multiphase flow solvers, but I'm
> realizing that we need to do more thorough testing of the PETSc
> installation itself and make sure we're using timing/profiling best
> practices and such. We are using petsc-dev on the hardware below. I
> promise to quit using petsc-dev as soon as the next release comes out :)
> Several versions of PETSc are also installed by the system maintainers,
> but my sense is that there is very little testing done on any of the
> installations.

I think using petsc-dev is the right thing, and it is now much, much more stable (the 'master' branch on Bitbucket).

Thanks,

Matt

> http://www.erdc.hpc.mil/hardware/index.html
>
> Chris
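Matt's recipe can be turned into a small weak-scaling sweep. A sketch (the helper and the particular process counts are mine, not from the thread): quadruple the process count while doubling the grid in each dimension, which keeps the number of grid points per process roughly fixed.

```shell
# Emit one weak-scaling run for SNES ex5: np processes on an m x m grid.
# Doubling m per dimension while quadrupling np holds work per process
# roughly constant (65^2/1 = 4225 vs 129^2/4 = 4160 vs 257^2/16 = 4128).
weak_cmd() {
  np="$1"; m="$2"
  echo "mpiexec -n $np ./ex5 -da_grid_x $m -da_grid_y $m -pc_type mg -log_summary"
}

weak_cmd 1 65
weak_cmd 4 129
weak_cmd 16 257
```

Note that with -pc_type mg and a fixed -pc_mg_levels, the coarse grid grows along with the fine grid, which is one way to end up with the poor weak scaling discussed later in the thread.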
[petsc-dev] examples/benchmarks for weak and strong scaling exercise
Chris Kees cekees at gmail.com writes:

> Hi guys, could somebody point me to some examples you guys routinely use
> for weak and strong scaling studies (maybe even with scripts, option
> files, or prior results on recent hardware)? I'm thinking of 3D Poisson
> with finite differences and geometric multigrid, or something like that.

One option would be to use src/snes/examples/tutorials/ex48.c with the configurations from http://dx.doi.org/10.1137/110834512 (http://59A2.org/files/hstat.pdf), which you can find in the paper repository: https://github.com/jedbrown/tme-ice. Look in shaheen/b/. Those runs used DMMG, so the command lines will have to be modified slightly, but it should be straightforward, and you can compare to the runex48_* targets in src/snes/examples/tutorials/makefile.

> We are using petsc-dev on the hardware below. I promise to quit using
> petsc-dev as soon as the next release comes out :)

We're actually happy to have people using petsc-dev. One motivation for our new workflow is that we can now provide a pretty stable 'master', so that we can interact with users on new features without the latency of a release cycle and without frequent breakage.
[petsc-dev] examples/benchmarks for weak and strong scaling exercise
Jed,

I tried cloning your tme-ice git repo as follows, and it failed:

% git clone --recursive git://github.com/jedbrown/tme-ice.git tme_ice
Cloning into 'tme_ice'...
fatal: unable to connect to github.com:
github.com[0: 204.232.175.90]: errno=Connection timed out

I'm doing this from an xterm that allows me to clone petsc just fine. Any idea what the problem might be?

Dave

From: petsc-dev-bounces at mcs.anl.gov [petsc-dev-bounces at mcs.anl.gov] on behalf of Jed Brown [jedbr...@mcs.anl.gov]
Sent: Wednesday, April 10, 2013 9:22 AM
To: Chris Kees; petsc-dev at mcs.anl.gov
Subject: Re: [petsc-dev] examples/benchmarks for weak and strong scaling exercise

[...]
[petsc-dev] examples/benchmarks for weak and strong scaling exercise
Sorry, I overlooked that the URL was using the git protocol. My bad.

Dave

From: Jed Brown [five9a2 at gmail.com] on behalf of Jed Brown [jedbr...@mcs.anl.gov]
Sent: Wednesday, April 10, 2013 12:10 PM
To: Nystrom, William D; For users of the development version of PETSc; Chris Kees
Subject: Re: [petsc-dev] examples/benchmarks for weak and strong scaling exercise

Nystrom, William D wdn at lanl.gov writes:

> Jed, I tried cloning your tme-ice git repo as follows, and it failed:
>
> % git clone --recursive git://github.com/jedbrown/tme-ice.git tme_ice
> Cloning into 'tme_ice'...
> fatal: unable to connect to github.com:
> github.com[0: 204.232.175.90]: errno=Connection timed out
>
> I'm doing this from an xterm that allows me to clone petsc just fine.

You're using https or ssh to clone PETSc, but the git:// protocol to clone tme-ice. The LANL network is blocking that port, so just use the https or ssh protocol.
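The fix Jed describes amounts to rewriting the URL scheme; git:// uses port 9418, which firewalls commonly block, while https works almost everywhere. A sketch (the helper is mine; plain `git clone --recursive https://github.com/jedbrown/tme-ice.git tme_ice` is of course enough by hand):

```shell
# Rewrite a git:// clone URL to https so it passes restrictive firewalls.
to_https() {
  printf '%s\n' "$1" | sed 's|^git://|https://|'
}

to_https 'git://github.com/jedbrown/tme-ice.git'
# prints https://github.com/jedbrown/tme-ice.git

# Then: git clone --recursive "$(to_https 'git://github.com/jedbrown/tme-ice.git')" tme_ice
```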