Lawrence Mitchell writes:
> On 06/04/17 12:25, Matthew Knepley wrote:
>> I'm not sure whether getting the Intel acronyms mixed up (KNL vs MKL)
>> makes the quote above better or worse.
>>
>> Too cryptic. Are you saying that this cannot be what is happening? How
>> would you explain the drop off in performance?
Justin Chang writes:
> I simply ran these KNL simulations in flat mode with the following options:
>
> srun -n 64 -c 4 --cpu_bind=cores numactl -p 1 ./ex48
>
> Basically I told it that MCDRAM usage in NUMA domain 1 is preferred. I
> followed the last example:
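As an aside, on a quad/flat node the MCDRAM shows up as a CPU-less NUMA node, so the binding is easy to check. The commands below are a sketch: only the `-p 1` line comes from the run above; the `numactl -H` check and the `-m` variant are additions for illustration.

```shell
# List the NUMA topology: in flat mode, MCDRAM appears as NUMA node 1
# (memory but no CPUs). This confirms which node number to target.
numactl -H

# Prefer MCDRAM, spilling over to DDR4 once the 16 GB are exhausted
# (this is the form used in the run above):
srun -n 64 -c 4 --cpu_bind=cores numactl -p 1 ./ex48

# Strictly bind to MCDRAM instead, so allocations fail loudly rather
# than silently landing in DDR4:
srun -n 64 -c 4 --cpu_bind=cores numactl -m 1 ./ex48
```

Note that `numactl -p` (`--preferred`) only expresses a preference, so a working set larger than MCDRAM quietly falls back to DDR4; `-m` (`--membind`) makes any overflow visible.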
"Zhang, Hong" writes:
> On Apr 4, 2017, at 10:45 PM, Justin Chang
> > wrote:
>
> So I tried the following options:
>
> -M 40
> -N 40
> -P 5
> -da_refine 1/2/3/4
> -log_view
> -mg_coarse_pc_type gamg
> -mg_levels_0_pc_type gamg
>
Justin Chang writes:
> So I tried the following options:
>
> -M 40
> -N 40
> -P 5
> -da_refine 1/2/3/4
> -log_view
> -mg_coarse_pc_type gamg
> -mg_levels_0_pc_type gamg
> -mg_levels_1_sub_pc_type cholesky
> -pc_type mg
> -thi_mat_type baij
>
> Performance improved
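For reference, those options assembled into a single hypothetical command line, reusing the srun/numactl invocation quoted earlier in the thread (the combination is a sketch, not a command that appears verbatim in the thread):

```shell
# Sketch only: SNES ex48 with the multigrid options listed above;
# -da_refine shown at one of the values tried (1/2/3/4).
srun -n 64 -c 4 --cpu_bind=cores numactl -p 1 ./ex48 \
  -M 40 -N 40 -P 5 -da_refine 3 -log_view \
  -pc_type mg -mg_coarse_pc_type gamg -mg_levels_0_pc_type gamg \
  -mg_levels_1_sub_pc_type cholesky -thi_mat_type baij
```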
Barry Smith writes:
> These results seem reasonable to me.
>
> What makes you think that KNL should be doing better than it does in
> comparison to Haswell?
>
> The entire reason for the existence of KNL is that it is a way for Intel to
> be able to "compete" with Nvidia GPUs for numerics and data processing, for
Hey,
here's some data on what you should see with STREAM when comparing
against conventional Xeons:
https://www.karlrupp.net/2016/07/knights-landing-vs-knights-corner-haswell-ivy-bridge-and-sandy-bridge-stream-benchmark-results/
Note that MCDRAM only pays off if you can keep enough cores
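One quick way to see that gap on a flat-mode node is to pin the same STREAM binary to each memory domain in turn. This is a sketch: `./stream` is a placeholder for whatever STREAM build is at hand, and the bandwidth figures in the comments are rough orders of magnitude (in line with the benchmark page linked above), not measurements.

```shell
# Run the same STREAM binary out of DDR4 (NUMA node 0), then out of
# MCDRAM (NUMA node 1), on a quad/flat KNL node.
numactl -m 0 ./stream   # DDR4: triad on the order of ~90 GB/s
numactl -m 1 ./stream   # MCDRAM: triad of several hundred GB/s
```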
Justin Chang writes:
> Attached are the job output files (which include -log_view) for SNES ex48
> run on a single Haswell and KNL node (32 and 64 cores respectively).
> Started off with a coarse grid of size 40x40x5 and ran three different
> tests with -da_refine 1/2/3 and -pc_type mg.
> What's interesting/strange is that if i
I did some quick tests (with a different example) on a single KNL node and a
single Haswell node, both using 4 processes. See below for the MatMult
results. The total running time on KNL is a bit more than twice that on
Haswell. So I think the results Justin got with SNES ex48
Justin Chang writes:
> Thanks everyone for the helpful advice. So I tried all the suggestions,
> including using libsci. The performance did not improve for my particular
> runs, which I think suggests the problem parameters chosen for my tests
> (SNES ex48) are not optimal for KNL. Does anyone have example test runs I
> could
Yes, one should rely on MKL (or Cray LibSci, if using the Cray toolchain)
on Cori. But I'm guessing that this will make no noticeable difference for
what Justin is doing.
--Richard
How about replacing --download-fblaslapack with vendor specific
BLAS/LAPACK?
Murat
Justin Chang writes:
> Richard,
>
> This is what my job script looks like:
>
> #!/bin/bash
> #SBATCH -N 16
> #SBATCH -C knl,quad,flat
> #SBATCH -p regular
> #SBATCH -J knlflat1024
> #SBATCH -L SCRATCH
> #SBATCH -o knlflat1024.o%j
> #SBATCH --mail-type=ALL
> #SBATCH --mail-user=jychan...@gmail.com
> #SBATCH -t 00:20:00
> #run the application:
> cd
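The archive cuts the script off at the `cd` line. The tail of a flat-mode script of this shape would plausibly look like the sketch below; the directory and binary are placeholders (the original target was truncated), and the srun line mirrors the flat-mode command quoted elsewhere in the thread.

```shell
#!/bin/bash
#SBATCH -C knl,quad,flat
#SBATCH -p regular
#SBATCH -t 00:20:00

# run the application (placeholder directory and binary; the cd target
# in the original script was cut off by the archive):
cd $SCRATCH/ex48_run
srun -n 64 -c 4 --cpu_bind=cores numactl -p 1 ./ex48
```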
Fixing typo: Meant to say "Keep in mind that individual KNL cores are much
less powerful than an individual Haswell *core*."
--Richard
Richard Mills writes:
> Hi Justin,
>
> How is the MCDRAM (on-package "high-bandwidth memory") configured for your
> KNL runs? And if it is in "flat" mode, what are you doing to ensure that
> you use the MCDRAM? Doing this wrong seems to be one of the most common
> reasons for unexpected poor performance on KNL.
>
> I'm not that
Justin Chang writes:
> Hi all,
>
> On NERSC's Cori I have the following configure options for PETSc:
>
> ./configure --download-fblaslapack --with-cc=cc --with-clib-autodetect=0
> --with-cxx=CC --with-cxxlib-autodetect=0 --with-debugging=0 --with-fc=ftn
> --with-fortranlib-autodetect=0 --with-mpiexec=srun --with-64-bit-indices=1
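Picking up Murat's and Richard's suggestion from earlier in the thread, a vendor-BLAS variant of this configure line might look like the sketch below. With the Cray wrappers (cc/CC/ftn), LibSci is typically linked automatically once --download-fblaslapack is dropped; the explicit MKL form assumes the MKLROOT environment variable is set by the MKL module.

```shell
# Sketch 1: rely on Cray LibSci via the compiler wrappers -- same line
# as above, minus --download-fblaslapack.
./configure --with-cc=cc --with-clib-autodetect=0 \
  --with-cxx=CC --with-cxxlib-autodetect=0 --with-debugging=0 --with-fc=ftn \
  --with-fortranlib-autodetect=0 --with-mpiexec=srun --with-64-bit-indices=1

# Sketch 2: point PETSc at MKL explicitly (assumes $MKLROOT is set).
./configure --with-blaslapack-dir=$MKLROOT --with-cc=cc --with-cxx=CC \
  --with-fc=ftn --with-debugging=0 --with-mpiexec=srun --with-64-bit-indices=1
```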