In case someone wants to learn more about the hierarchical partitioning
algorithm, here is a reference:
https://arxiv.org/pdf/1809.02666.pdf
Thanks,
Fande
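(For anyone who wants to try it in PETSc: the hierarchical scheme should be
available as the MatPartitioning type "hierarch", selectable at run time with
-mat_partitioning_type hierarch; see the MATPARTITIONINGHIERARCH manual page
for the exact options in your PETSc version.)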
MPI rank distribution (e.g., 8 ranks per node or 16 ranks per node) is usually
managed by workload managers such as Slurm or PBS through your job scripts, and
is out of PETSc's control.
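(As an illustration, assuming Slurm: the job script requests the layout with
directives such as "#SBATCH --nodes=4" and "#SBATCH --ntasks-per-node=8" and
then launches the PETSc executable with srun; PBS has analogous resource
requests in its job scripts.)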
From: Amin Sadeghi
Date: Wednesday, March 25, 2020 at 4:40 PM
To: Junchao Zhang
Cc: Mark Adams, PETSc users
That's great. Thanks for creating this great piece of software!
Amin
Junchao, thank you for doing the experiment. I guess TACC Frontera nodes
have higher memory bandwidth (maybe a more modern CPU architecture, although
I'm not familiar with which hardware affects memory bandwidth) than Compute
Canada's Graham.
Mark, I did as you suggested. As you suspected, running
I repeated your experiment on one node of TACC Frontera:
1 rank: 85.0s
16 ranks: 8.2s, 10x speedup
32 ranks: 5.7s, 15x speedup
--Junchao Zhang
Also, a better test is to see where streams pretty much saturates, then run
that many processes per node and do the same test by increasing the number of
nodes. This will tell you how well your network communication is doing.
But this result has a lot of stuff in "network communication" that can be
further
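(To spell the procedure out: first run the streams benchmark on a single node
with increasing process counts to find where the measured bandwidth stops
growing, say at 16 ranks; then run the actual solve with 16 ranks per node on
1, 2, 4, ... nodes. Any loss of scaling beyond one node is then attributable
to inter-node communication rather than to memory bandwidth.)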
Thank you Matt and Mark for the explanation. That makes sense. Please
correct me if I'm wrong: I think instead of asking for the whole node with
32 cores, if I ask for more nodes, say 4 or 8, but each with 8 cores, then
I should see much better speedups. Is that correct?
I would guess that you are saturating the memory bandwidth. After you make
PETSc (make all) it will suggest that you test it (make test) and suggest
that you run streams (make streams).
I see Matt answered, but let me add that when you make streams you will see
the memory rate for 1, 2, 3, ..., NP processes.
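(For example, running "make streams NPMAX=32" from PETSC_DIR reports the
measured memory bandwidth for 1 through 32 MPI processes, which makes the
saturation point easy to read off; the exact invocation may vary slightly
between PETSc versions, so check the suggestion printed by make all.)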
Hi,
I ran KSP example 45 on a single node with 32 cores and 125GB memory using
1, 16 and 32 MPI processes. Here's a comparison of the time spent during
KSP.solve:
- 1 MPI process: ~98 sec, speedup: 1X
- 16 MPI processes: ~12 sec, speedup: ~8X
- 32 MPI processes: ~11 sec, speedup: ~9X
Since the
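(For context: KSP tutorial ex45 solves a 3D Laplacian discretized on a DMDA
grid. Assuming the executable has been built in its tutorials directory, a
typical run is "mpiexec -n 16 ./ex45 -da_refine 3 -log_view"; the -log_view
summary breaks the solve time into phases such as MatMult and VecAXPY, which
helps separate memory-bandwidth effects from communication costs.)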
Dear everyone,
I’m new to petsc4py and I’m trying to run a simple finite element code that
uses DMPLEX to load a .msh file (created by Gmsh). In version 3.10 the code was
working but I recently upgraded to 3.12 and I get the following error:
(.pydev) ➜ testmodule git:(e0bc9ae) ✗ mpirun -np 2
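(The error message itself is cut off in the archive. For anyone trying to
reproduce the setup, here is a minimal petsc4py sketch of loading and
distributing a Gmsh mesh; the filename mesh.msh is a placeholder, and DMPlex
behavior changed between 3.10 and 3.12, so check the release notes for your
version.)

    # Minimal petsc4py sketch. Assumptions: a Gmsh file named mesh.msh exists
    # and petsc4py is recent enough to provide DMPlex.createFromFile.
    from petsc4py import PETSc

    # Read the mesh; PETSc infers the Gmsh format from the file extension.
    dm = PETSc.DMPlex().createFromFile('mesh.msh')

    # Distribute the mesh across MPI ranks (a no-op on a single rank).
    dm.distribute()

    # Print basic information about the resulting DM.
    dm.view()

Run it under MPI as in the original report, e.g. "mpirun -np 2 python
yourscript.py" (the script name here is hypothetical).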