Re: [petsc-users] Configuring PETSc for KNL

2017-04-06 Thread Jed Brown
Lawrence Mitchell writes: > On 06/04/17 12:25, Matthew Knepley wrote: >> I'm not sure whether getting the Intel acronyms mixed up (KNL vs MKL) >> makes the quote above better or worse. >> >> >> Too cryptic. Are you saying that this cannot be what is

Re: [petsc-users] Configuring PETSc for KNL

2017-04-06 Thread Lawrence Mitchell
On 06/04/17 12:25, Matthew Knepley wrote: > I'm not sure whether getting the Intel acronyms mixed up (KNL vs MKL) > makes the quote above better or worse. > > > Too cryptic. Are you saying that this cannot be what is happening? How > would you explain > the drop off in performance? >

Re: [petsc-users] Configuring PETSc for KNL

2017-04-06 Thread Matthew Knepley
On Wed, Apr 5, 2017 at 10:26 PM, Jed Brown wrote: > Matthew Knepley writes: > > > On Wed, Apr 5, 2017 at 12:23 PM, Justin Chang > wrote: > > > >> I simply ran these KNL simulations in flat mode with the following > options: > >> > >>

Re: [petsc-users] Configuring PETSc for KNL

2017-04-05 Thread Jed Brown
Matthew Knepley writes: > On Wed, Apr 5, 2017 at 12:23 PM, Justin Chang wrote: > >> I simply ran these KNL simulations in flat mode with the following options: >> >> srun -n 64 -c 4 --cpu_bind=cores numactl -p 1 ./ex48 >> >> Basically I told it that

Re: [petsc-users] Configuring PETSc for KNL

2017-04-05 Thread Matthew Knepley
On Wed, Apr 5, 2017 at 12:23 PM, Justin Chang wrote: > I simply ran these KNL simulations in flat mode with the following options: > > srun -n 64 -c 4 --cpu_bind=cores numactl -p 1 ./ex48 > > Basically I told it that MCDRAM usage in NUMA domain 1 is preferred. I >

Re: [petsc-users] Configuring PETSc for KNL

2017-04-05 Thread Justin Chang
I simply ran these KNL simulations in flat mode with the following options: srun -n 64 -c 4 --cpu_bind=cores numactl -p 1 ./ex48 Basically I told it that MCDRAM usage in NUMA domain 1 is preferred. I followed the last example:
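
A note on the launch line in this message: with `numactl`, `-p` (`--preferred`) asks for memory on the given node but falls back to other nodes once it is full, whereas `-m` (`--membind`) allocates only from the listed nodes. The sketch below only prints the two variants and launches nothing. `build_launch` is a hypothetical helper, and node 1 is taken from the post; the node layout on a given machine can be checked with `numactl -H`.

```shell
#!/bin/sh
# Dry-run sketch: print the launch line instead of executing it.
# build_launch is a hypothetical helper, not part of SLURM, numactl, or PETSc.
build_launch() {
  # $1 = numactl policy flag: -p (prefer the node) or -m (bind strictly to it)
  # $2 = NUMA node that holds MCDRAM (1 in the flat-mode runs described above)
  printf '%s\n' "srun -n 64 -c 4 --cpu_bind=cores numactl $1 $2 ./ex48"
}

build_launch -p 1   # the variant used in the post: MCDRAM preferred, DDR as fallback
build_launch -m 1   # strict variant: allocations come only from node 1
```

With `-p`, a job whose working set exceeds the MCDRAM capacity keeps running out of DDR, so timings can mix both memory types.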

Re: [petsc-users] Configuring PETSc for KNL

2017-04-05 Thread Matthew Knepley
On Wed, Apr 5, 2017 at 11:54 AM, Zhang, Hong wrote: > > > On Apr 5, 2017, at 10:53 AM, Jed Brown wrote: > > > > "Zhang, Hong" writes: > > > >> On Apr 4, 2017, at 10:45 PM, Justin Chang > wrote: >

Re: [petsc-users] Configuring PETSc for KNL

2017-04-05 Thread Zhang, Hong
> On Apr 5, 2017, at 10:53 AM, Jed Brown wrote: > > "Zhang, Hong" writes: > >> On Apr 4, 2017, at 10:45 PM, Justin Chang >> > wrote: >> >> So I tried the following options: >> >> -M 40 >> -N 40 >> -P 5 >>

Re: [petsc-users] Configuring PETSc for KNL

2017-04-05 Thread Jed Brown
"Zhang, Hong" writes: > On Apr 4, 2017, at 10:45 PM, Justin Chang > > wrote: > > So I tried the following options: > > -M 40 > -N 40 > -P 5 > -da_refine 1/2/3/4 > -log_view > -mg_coarse_pc_type gamg > -mg_levels_0_pc_type gamg >

Re: [petsc-users] Configuring PETSc for KNL

2017-04-05 Thread Zhang, Hong
On Apr 4, 2017, at 10:45 PM, Justin Chang wrote: So I tried the following options: -M 40 -N 40 -P 5 -da_refine 1/2/3/4 -log_view -mg_coarse_pc_type gamg -mg_levels_0_pc_type gamg -mg_levels_1_sub_pc_type cholesky -pc_type mg -thi_mat_type baij
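
The `-da_refine 1/2/3/4` in the quoted option list is presumably shorthand for four separate runs, one per refinement level, as with the earlier `-da_refine 1/2/3` tests in this thread. A minimal sketch of that sweep follows, printing the commands rather than running them. `print_runs` is a hypothetical helper, and the launcher prefix (srun, numactl) is left out on purpose.

```shell
#!/bin/sh
# Solver options as quoted in the message, without -da_refine.
opts="-M 40 -N 40 -P 5 -log_view -mg_coarse_pc_type gamg -mg_levels_0_pc_type gamg -mg_levels_1_sub_pc_type cholesky -pc_type mg -thi_mat_type baij"

# print_runs is a hypothetical helper: it echoes one command per refinement level (dry run).
print_runs() {
  for level in 1 2 3 4; do
    printf '%s\n' "./ex48 $opts -da_refine $level"
  done
}

print_runs
```

PETSc does not depend on the order of command-line options, so appending `-da_refine` at the end is equivalent to the order in the post.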

Re: [petsc-users] Configuring PETSc for KNL

2017-04-05 Thread Matthew Knepley
On Wed, Apr 5, 2017 at 7:54 AM, Jed Brown wrote: > Matthew Knepley writes: > > > On Wed, Apr 5, 2017 at 12:12 AM, Richard Mills > > wrote: > > > >> On Tue, Apr 4, 2017 at 9:10 PM, Jed Brown wrote: > >> > >>>

Re: [petsc-users] Configuring PETSc for KNL

2017-04-05 Thread Jed Brown
Matthew Knepley writes: > On Wed, Apr 5, 2017 at 12:12 AM, Richard Mills > wrote: > >> On Tue, Apr 4, 2017 at 9:10 PM, Jed Brown wrote: >> >>> Barry Smith writes: >>> >>> >These results seem reasonable to

Re: [petsc-users] Configuring PETSc for KNL

2017-04-05 Thread Matthew Knepley
On Wed, Apr 5, 2017 at 12:12 AM, Richard Mills wrote: > On Tue, Apr 4, 2017 at 9:10 PM, Jed Brown wrote: > >> Barry Smith writes: >> >> >These results seem reasonable to me. >> > >> >What makes you think that KNL should be

Re: [petsc-users] Configuring PETSc for KNL

2017-04-04 Thread Richard Mills
On Tue, Apr 4, 2017 at 9:10 PM, Jed Brown wrote: > Barry Smith writes: > > >These results seem reasonable to me. > > > >What makes you think that KNL should be doing better than it does in > comparison to Haswell? > > > >The entire reason for

Re: [petsc-users] Configuring PETSc for KNL

2017-04-04 Thread Barry Smith
> On Apr 4, 2017, at 11:10 PM, Jed Brown wrote: > > Barry Smith writes: > >> These results seem reasonable to me. >> >> What makes you think that KNL should be doing better than it does in >> comparison to Haswell? >> >> The entire reason for

Re: [petsc-users] Configuring PETSc for KNL

2017-04-04 Thread Jed Brown
Barry Smith writes: >These results seem reasonable to me. > >What makes you think that KNL should be doing better than it does in > comparison to Haswell? > >The entire reason for the existence of KNL is that it is a way for >Intel to be able to "compete"

Re: [petsc-users] Configuring PETSc for KNL

2017-04-04 Thread Jed Brown
Justin Chang writes: > So I tried the following options: > > -M 40 > -N 40 > -P 5 > -da_refine 1/2/3/4 > -log_view > -mg_coarse_pc_type gamg > -mg_levels_0_pc_type gamg > -mg_levels_1_sub_pc_type cholesky > -pc_type mg > -thi_mat_type baij > > Performance improved

Re: [petsc-users] Configuring PETSc for KNL

2017-04-04 Thread Barry Smith
These results seem reasonable to me. What makes you think that KNL should be doing better than it does in comparison to Haswell? The entire reason for the existence of KNL is that it is a way for Intel to be able to "compete" with Nvidia GPUs for numerics and data processing, for

Re: [petsc-users] Configuring PETSc for KNL

2017-04-04 Thread Jed Brown
Justin Chang writes: > Attached are the job output files (which include -log_view) for SNES ex48 > run on a single haswell and knl node (32 and 64 cores respectively). > Started off with a coarse grid of size 40x40x5 and ran three different > tests with -da_refine 1/2/3 and

Re: [petsc-users] Configuring PETSc for KNL

2017-04-04 Thread Karl Rupp
Hey, here's some data on what you should see with STREAM when comparing against conventional Xeons: https://www.karlrupp.net/2016/07/knights-landing-vs-knights-corner-haswell-ivy-bridge-and-sandy-bridge-stream-benchmark-results/ Note that MCDRAM only pays off if you can keep enough cores

Re: [petsc-users] Configuring PETSc for KNL

2017-04-04 Thread Jed Brown
Justin Chang writes: > Thanks everyone for the helpful advice. So I tried all the suggestions > including using libsci. The performance did not improve for my particular > runs, which I think suggests the problem parameters chosen for my tests > (SNES ex48) are not optimal

Re: [petsc-users] Configuring PETSc for KNL

2017-04-04 Thread Justin Chang
Attached are the job output files (which include -log_view) for SNES ex48 run on a single Haswell and KNL node (32 and 64 cores respectively). Started off with a coarse grid of size 40x40x5 and ran three different tests with -da_refine 1/2/3 and -pc_type mg. What's interesting/strange is that if I

Re: [petsc-users] Configuring PETSc for KNL

2017-04-04 Thread Zhang, Hong
I did some quick tests (with a different example) on a single KNL node and a single Haswell node, both using 4 processes. Check below for the results about MatMult. The total running time on KNL is a bit more than twice that on Haswell. So I think the results Justin got with SNES ex48

Re: [petsc-users] Configuring PETSc for KNL

2017-04-04 Thread Matthew Knepley
On Tue, Apr 4, 2017 at 10:57 AM, Justin Chang wrote: > Thanks everyone for the helpful advice. So I tried all the suggestions > including using libsci. The performance did not improve for my particular > runs, which I think suggests the problem parameters chosen for my tests

Re: [petsc-users] Configuring PETSc for KNL

2017-04-04 Thread Justin Chang
Thanks everyone for the helpful advice. So I tried all the suggestions including using libsci. The performance did not improve for my particular runs, which I think suggests the problem parameters chosen for my tests (SNES ex48) are not optimal for KNL. Does anyone have example test runs I could

Re: [petsc-users] Configuring PETSc for KNL

2017-04-03 Thread Richard Mills
Yes, one should rely on MKL (or Cray LibSci, if using the Cray toolchain) on Cori. But I'm guessing that this will make no noticeable difference for what Justin is doing. --Richard On Mon, Apr 3, 2017 at 12:57 PM, murat keçeli wrote: > How about replacing

Re: [petsc-users] Configuring PETSc for KNL

2017-04-03 Thread murat keçeli
How about replacing --download-fblaslapack with vendor specific BLAS/LAPACK? Murat On Mon, Apr 3, 2017 at 2:45 PM, Richard Mills wrote: > On Mon, Apr 3, 2017 at 12:24 PM, Zhang, Hong wrote: > >> >> On Apr 3, 2017, at 1:44 PM, Justin Chang
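
As a sketch of what the suggested swap would look like on the configure line: the snippet below rewrites an option string and prints it, without running `./configure`. `swap_blas` is a hypothetical helper, the option string is abbreviated to a few of the options visible in the original post (the full line is cut off in this archive), and pointing `--with-blaslapack-dir` at `$MKLROOT` is an assumption about where MKL is installed.

```shell
#!/bin/sh
# swap_blas is a hypothetical helper: it edits an option string and prints the result.
swap_blas() {
  # $1 = original configure options, $2 = replacement BLAS/LAPACK option
  printf '%s\n' "$1" | sed "s|--download-fblaslapack|$2|"
}

# Abbreviated: a few of the options shown in the original post, not the full line.
orig="--download-fblaslapack --with-cc=cc --with-cxx=CC --with-fc=ftn --with-debugging=0"

# Single quotes keep $MKLROOT literal so the printed line can be pasted into a shell later.
swap_blas "$orig" '--with-blaslapack-dir=$MKLROOT'
```

Per the follow-up in this thread, the swap is not expected to change the timings noticeably for this workload.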

Re: [petsc-users] Configuring PETSc for KNL

2017-04-03 Thread Richard Mills
On Mon, Apr 3, 2017 at 12:24 PM, Zhang, Hong wrote: > > On Apr 3, 2017, at 1:44 PM, Justin Chang wrote: > > Richard, > > This is what my job script looks like: > > #!/bin/bash > #SBATCH -N 16 > #SBATCH -C knl,quad,flat > #SBATCH -p regular > #SBATCH -J

Re: [petsc-users] Configuring PETSc for KNL

2017-04-03 Thread Zhang, Hong
On Apr 3, 2017, at 1:44 PM, Justin Chang wrote: Richard, This is what my job script looks like: #!/bin/bash #SBATCH -N 16 #SBATCH -C knl,quad,flat #SBATCH -p regular #SBATCH -J knlflat1024 #SBATCH -L SCRATCH #SBATCH -o knlflat1024.o%j #SBATCH

Re: [petsc-users] Configuring PETSc for KNL

2017-04-03 Thread Justin Chang
Richard, This is what my job script looks like: #!/bin/bash #SBATCH -N 16 #SBATCH -C knl,quad,flat #SBATCH -p regular #SBATCH -J knlflat1024 #SBATCH -L SCRATCH #SBATCH -o knlflat1024.o%j #SBATCH --mail-type=ALL #SBATCH --mail-user=jychan...@gmail.com #SBATCH -t 00:20:00 #run the application: cd

Re: [petsc-users] Configuring PETSc for KNL

2017-04-03 Thread Richard Mills
Fixing typo: Meant to say "Keep in mind that individual KNL cores are much less powerful than an individual Haswell *core*." --Richard On Mon, Apr 3, 2017 at 11:36 AM, Richard Mills wrote: > Hi Justin, > > How is the MCDRAM (on-package "high-bandwidth memory")

Re: [petsc-users] Configuring PETSc for KNL

2017-04-03 Thread Richard Mills
Hi Justin, How is the MCDRAM (on-package "high-bandwidth memory") configured for your KNL runs? And if it is in "flat" mode, what are you doing to ensure that you use the MCDRAM? Doing this wrong seems to be one of the most common reasons for unexpected poor performance on KNL. I'm not that

[petsc-users] Configuring PETSc for KNL

2017-04-03 Thread Justin Chang
Hi all, On NERSC's Cori I have the following configure options for PETSc: ./configure --download-fblaslapack --with-cc=cc --with-clib-autodetect=0 --with-cxx=CC --with-cxxlib-autodetect=0 --with-debugging=0 --with-fc=ftn --with-fortranlib-autodetect=0 --with-mpiexec=srun --with-64-bit-indices=1