Re: [petsc-users] Galerkin projection using petsc4py

2023-10-10 Thread Pierre Jolivet
I disagree with what Mark and Matt are saying: your code is fine, the error 
message is fine, petsc4py is fine (in this instance).
It’s not a typical use case of MatPtAP(), which is mostly designed for MatAIJ, 
not MatDense.
On the one hand, in the MatDense case, indeed there will be a mismatch between 
the number of columns of A and the number of rows of P, as written in the error 
message.
On the other hand, there is not much to optimize when computing C = P’ A P with 
everything being dense.
I would just write this as B = A P and then C = P’ B (but then you may face the 
same issue as initially reported, please let us know then).
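
Something along these lines should do it (an untested sketch reusing the A and Phi 
from your script; matMult() and transposeMatMult() are the petsc4py wrappers around 
MatMatMult() and MatTransposeMatMult()):

B = A.matMult(Phi)                  # B = A * Phi, of size [m x k]
A_prime = Phi.transposeMatMult(B)   # A' = Phi' * B, of size [k x k]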

Thanks,
Pierre

> On 11 Oct 2023, at 2:42 AM, Mark Adams  wrote:
> 
> This looks like a false positive or there is some subtle bug here that we are 
> not seeing.
> Could this be the first time parallel PtAP has been used (and reported) in 
> petsc4py?
> 
> Mark
> 
> On Tue, Oct 10, 2023 at 8:27 PM Matthew Knepley wrote:
>> On Tue, Oct 10, 2023 at 5:34 PM Thanasis Boutsikakis wrote:
>>> Hi all,
>>> 
>>> Revisiting my code and the proposed solution from Pierre, I realized this 
>>> only works in serial. The reason is that PETSc partitions those matrices 
>>> only row-wise, which leads to an error due to the mismatch between the 
>>> number of columns of A (non-partitioned) and the number of rows of Phi 
>>> (partitioned).
>> 
>> Are you positive about this? P^T A P is designed to run in this scenario, so 
>> either we have a bug or the diagnosis is wrong.
>> 
>>   Thanks,
>> 
>>  Matt
>>  
>>> """Experimenting with PETSc mat-mat multiplication"""
>>> 
>>> import time
>>> 
>>> import numpy as np
>>> from colorama import Fore
>>> from firedrake import COMM_SELF, COMM_WORLD
>>> from firedrake.petsc import PETSc
>>> from mpi4py import MPI
>>> from numpy.testing import assert_array_almost_equal
>>> 
>>> from utilities import Print
>>> 
>>> nproc = COMM_WORLD.size
>>> rank = COMM_WORLD.rank
>>> 
>>> def create_petsc_matrix(input_array, sparse=True):
>>> """Create a PETSc matrix from an input_array
>>> 
>>> Args:
>>> input_array (np array): Input array
>>> partition_like (PETSc mat, optional): Petsc matrix. Defaults to 
>>> None.
>>> sparse (bool, optional): Toggle for sparese or dense. Defaults to 
>>> True.
>>> 
>>> Returns:
>>> PETSc mat: PETSc mpi matrix
>>> """
>>> # Check if input_array is 1D and reshape if necessary
>>> assert len(input_array.shape) == 2, "Input array should be 
>>> 2-dimensional"
>>> global_rows, global_cols = input_array.shape
>>> size = ((None, global_rows), (global_cols, global_cols))
>>> 
>>> # Create a sparse or dense matrix based on the 'sparse' argument
>>> if sparse:
>>> matrix = PETSc.Mat().createAIJ(size=size, comm=COMM_WORLD)
>>> else:
>>> matrix = PETSc.Mat().createDense(size=size, comm=COMM_WORLD)
>>> matrix.setUp()
>>> 
>>> local_rows_start, local_rows_end = matrix.getOwnershipRange()
>>> 
>>> for counter, i in enumerate(range(local_rows_start, local_rows_end)):
>>> # Calculate the correct row in the array for the current process
>>> row_in_array = counter + local_rows_start
>>> matrix.setValues(
>>> i, range(global_cols), input_array[row_in_array, :], addv=False
>>> )
>>> 
>>> # Assembly the matrix to compute the final structure
>>> matrix.assemblyBegin()
>>> matrix.assemblyEnd()
>>> 
>>> return matrix
>>> 
>>> # 
>>> # EXP: Galerkin projection of an mpi PETSc matrix A with an mpi PETSc 
>>> matrix Phi
>>> #  A' = Phi.T * A * Phi
>>> # [k x k] <- [k x m] x [m x m] x [m x k]
>>> # 
>>> 
>>> m, k = 100, 7
>>> # Generate the random numpy matrices
>>> np.random.seed(0)  # sets the seed to 0
>>> A_np = np.random.randint(low=0, high=6, size=(m, m))
>>> Phi_np = np.random.randint(low=0, high=6, size=(m, k))
>>> 
>>> # 
>>> # TEST: Galerking projection of numpy matrices A_np and Phi_np
>>> # 
>>> Aprime_np = Phi_np.T @ A_np @ Phi_np
>>> Print(f"MATRIX Aprime_np [{Aprime_np.shape[0]}x{Aprime_np.shape[1]}]")
>>> Print(f"{Aprime_np}")
>>> 
>>> # Create A as an mpi matrix distributed on each process
>>> A = create_petsc_matrix(A_np, sparse=False)
>>> 
>>> # Create Phi as an mpi matrix distributed on each process
>>> Phi = create_petsc_matrix(Phi_np, sparse=False)
>>> 
>>> # Create an empty PETSc matrix object to store the result of the PtAP 
>>> operation.
>>> # This will hold the result A' = Phi.T * A * Phi after the computation.
>>> A_prime = create_petsc_matrix(np.zeros((k, k)), sparse=False)
>>> 
>>> # Perform the PtAP (Phi Transpose times A times Phi) operation.
>>> # In mathematical terms, this operation is A' = Phi.T * A * Phi.
>>> # 

Re: [petsc-users] FEM Implementation of NS with SUPG Stabilization

2023-10-10 Thread Brandon Denton via petsc-users
My initial plan was to write a new code using only PETSc. However, I don't see 
how to do what I want within the point-wise residual function. Am I missing 
something?

Yes. I would be interested in collaborating on the ceed-fluids. I took a quick 
look at the links you provided and it looks interesting. I'll warn you though. 
I'm a Mechanical Engineer by trade/training. The calculus and programming 
sometimes take me a little while to wrap my head around. Let me know how I can 
help. In the meantime, I'll continue to review the information you sent over.

From: Jed Brown 
Sent: Tuesday, October 10, 2023 10:18 PM
To: Brandon Denton ; petsc-users@mcs.anl.gov 

Subject: Re: [petsc-users] FEM Implementation of NS with SUPG Stabilization

Do you want to write a new code using only PETSc or would you be up for 
collaborating on ceed-fluids, which is a high-performance compressible SUPG 
solver based on DMPlex with good GPU support? It uses the metric to compute 
covariant length for stabilization. We have YZβ shock capturing, though it 
hasn't been tested much beyond shock tube experiments. (Most of our work has 
been low Mach.)

https://libceed.org/en/latest/examples/fluids/
https://github.com/CEED/libCEED/blob/main/examples/fluids/qfunctions/stabilization.h#L76


On Tue, Oct 10, 2023, at 7:34 PM, Brandon Denton via petsc-users wrote:
Good Evening,

I am looking to implement a form of Navier-Stokes with SUPG Stabilization and 
shock capturing using PETSc's FEM infrastructure. In this implementation, I 
need access to the cell's shape function gradients and natural coordinate 
gradients for calculations within the point-wise residual calculations. How do 
I get these quantities at the quadrature points? The signatures for f0 and f1 
don't seem to contain this information.

Thank you in advance for your time.
Brandon



Re: [petsc-users] FEM Implementation of NS with SUPG Stabilization

2023-10-10 Thread Jed Brown
Do you want to write a new code using only PETSc or would you be up for 
collaborating on ceed-fluids, which is a high-performance compressible SUPG 
solver based on DMPlex with good GPU support? It uses the metric to compute 
covariant length for stabilization. We have YZβ shock capturing, though it 
hasn't been tested much beyond shock tube experiments. (Most of our work has 
been low Mach.)

https://libceed.org/en/latest/examples/fluids/
https://github.com/CEED/libCEED/blob/main/examples/fluids/qfunctions/stabilization.h#L76


On Tue, Oct 10, 2023, at 7:34 PM, Brandon Denton via petsc-users wrote:
> Good Evening,
> 
> I am looking to implement a form of Navier-Stokes with SUPG Stabilization and 
> shock capturing using PETSc's FEM infrastructure. In this implementation, I 
> need access to the cell's shape function gradients and natural coordinate 
> gradients for calculations within the point-wise residual calculations. How 
> do I get these quantities at the quadrature points? The signatures for f0 and 
> f1 don't seem to contain this information.
> 
> Thank you in advance for your time.
> Brandon


[petsc-users] FEM Implementation of NS with SUPG Stabilization

2023-10-10 Thread Brandon Denton via petsc-users
Good Evening,

I am looking to implement a form of Navier-Stokes with SUPG Stabilization and 
shock capturing using PETSc's FEM infrastructure. In this implementation, I 
need access to the cell's shape function gradients and natural coordinate 
gradients for calculations within the point-wise residual calculations. How do 
I get these quantities at the quadrature points? The signatures for f0 and f1 
don't seem to contain this information.

Thank you in advance for your time.
Brandon


Re: [petsc-users] Galerkin projection using petsc4py

2023-10-10 Thread Mark Adams
This looks like a false positive or there is some subtle bug here that we
are not seeing.
Could this be the first time parallel PtAP has been used (and reported) in
petsc4py?

Mark

On Tue, Oct 10, 2023 at 8:27 PM Matthew Knepley  wrote:

> On Tue, Oct 10, 2023 at 5:34 PM Thanasis Boutsikakis <
> thanasis.boutsika...@corintis.com> wrote:
>
>> Hi all,
>>
>> Revisiting my code and the proposed solution from Pierre, I realized this
>> only works in serial. The reason is that PETSc partitions those matrices
>> only row-wise, which leads to an error due to the mismatch between the
>> number of columns of A (non-partitioned) and the number of rows of Phi
>> (partitioned).
>>
>
> Are you positive about this? P^T A P is designed to run in this scenario,
> so either we have a bug or the diagnosis is wrong.
>
>   Thanks,
>
>  Matt
>
>
>> """Experimenting with PETSc mat-mat multiplication"""
>>
>> import time
>>
>> import numpy as np
>> from colorama import Fore
>> from firedrake import COMM_SELF, COMM_WORLD
>> from firedrake.petsc import PETSc
>> from mpi4py import MPI
>> from numpy.testing import assert_array_almost_equal
>>
>> from utilities import Print
>>
>> nproc = COMM_WORLD.size
>> rank = COMM_WORLD.rank
>>
>> def create_petsc_matrix(input_array, sparse=True):
>> """Create a PETSc matrix from an input_array
>>
>> Args:
>> input_array (np array): Input array
>> partition_like (PETSc mat, optional): Petsc matrix. Defaults to None.
>> sparse (bool, optional): Toggle for sparese or dense. Defaults to True.
>>
>> Returns:
>> PETSc mat: PETSc mpi matrix
>> """
>> # Check if input_array is 1D and reshape if necessary
>> assert len(input_array.shape) == 2, "Input array should be 2-dimensional"
>> global_rows, global_cols = input_array.shape
>> size = ((None, global_rows), (global_cols, global_cols))
>>
>> # Create a sparse or dense matrix based on the 'sparse' argument
>> if sparse:
>> matrix = PETSc.Mat().createAIJ(size=size, comm=COMM_WORLD)
>> else:
>> matrix = PETSc.Mat().createDense(size=size, comm=COMM_WORLD)
>> matrix.setUp()
>>
>> local_rows_start, local_rows_end = matrix.getOwnershipRange()
>>
>> for counter, i in enumerate(range(local_rows_start, local_rows_end)):
>> # Calculate the correct row in the array for the current process
>> row_in_array = counter + local_rows_start
>> matrix.setValues(
>> i, range(global_cols), input_array[row_in_array, :], addv=False
>> )
>>
>> # Assembly the matrix to compute the final structure
>> matrix.assemblyBegin()
>> matrix.assemblyEnd()
>>
>> return matrix
>>
>> # 
>> # EXP: Galerkin projection of an mpi PETSc matrix A with an mpi PETSc
>> matrix Phi
>> # A' = Phi.T * A * Phi
>> # [k x k] <- [k x m] x [m x m] x [m x k]
>> # 
>>
>> m, k = 100, 7
>> # Generate the random numpy matrices
>> np.random.seed(0) # sets the seed to 0
>> A_np = np.random.randint(low=0, high=6, size=(m, m))
>> Phi_np = np.random.randint(low=0, high=6, size=(m, k))
>>
>> # 
>> # TEST: Galerking projection of numpy matrices A_np and Phi_np
>> # 
>> Aprime_np = Phi_np.T @ A_np @ Phi_np
>> Print(f"MATRIX Aprime_np [{Aprime_np.shape[0]}x{Aprime_np.shape[1]}]")
>> Print(f"{Aprime_np}")
>>
>> # Create A as an mpi matrix distributed on each process
>> A = create_petsc_matrix(A_np, sparse=False)
>>
>> # Create Phi as an mpi matrix distributed on each process
>> Phi = create_petsc_matrix(Phi_np, sparse=False)
>>
>> # Create an empty PETSc matrix object to store the result of the PtAP
>> operation.
>> # This will hold the result A' = Phi.T * A * Phi after the computation.
>> A_prime = create_petsc_matrix(np.zeros((k, k)), sparse=False)
>>
>> # Perform the PtAP (Phi Transpose times A times Phi) operation.
>> # In mathematical terms, this operation is A' = Phi.T * A * Phi.
>> # A_prime will store the result of the operation.
>> A_prime = A.ptap(Phi)
>>
>> Here is the error
>>
>> MATRIX mpiaij A [100x100]
>> Assembled
>>
>> Partitioning for A:
>>   Rank 0: Rows [0, 34)
>>   Rank 1: Rows [34, 67)
>>   Rank 2: Rows [67, 100)
>>
>> MATRIX mpiaij Phi [100x7]
>> Assembled
>>
>> Partitioning for Phi:
>>   Rank 0: Rows [0, 34)
>>   Rank 1: Rows [34, 67)
>>   Rank 2: Rows [67, 100)
>>
>> Traceback (most recent call last):
>>   File "/Users/boutsitron/work/galerkin_projection.py", line 87, in
>> 
>> A_prime = A.ptap(Phi)
>>   ^^^
>>   File "petsc4py/PETSc/Mat.pyx", line 1525, in petsc4py.PETSc.Mat.ptap
>> petsc4py.PETSc.Error: error code 60
>> [0] MatPtAP() at
>> /Users/boutsitron/firedrake/src/petsc/src/mat/interface/matrix.c:9896
>> [0] MatProductSetFromOptions() at
>> /Users/boutsitron/firedrake/src/petsc/src/mat/interface/matproduct.c:541
>> [0] MatProductSetFromOptions_Private() at
>> /Users/boutsitron/firedrake/src/petsc/src/mat/interface/matproduct.c:435
>> [0] MatProductSetFromOptions_MPIAIJ() at
>> 

Re: [petsc-users] Parallel DMPlex

2023-10-10 Thread Matthew Knepley
On Tue, Oct 10, 2023 at 7:01 PM erdemguer  wrote:

>
> Hi,
> Sorry for my late response. I tried your suggestions and I think I made
> some progress, but I still have issues. Let me explain my latest mesh
> routine:
>
>
>1. DMPlexCreateBoxMesh
>2. DMSetFromOptions
>3. PetscSectionCreate
>4. PetscSectionSetNumFields
>5. PetscSectionSetFieldDof
>6. PetscSectionSetDof
>7. PetscSectionSetUp
>8. DMSetLocalSection
>9. DMSetAdjacency
>10. DMPlexDistribute
>
>
> It's still not working, but it's promising: if I call DMPlexGetDepthStratum
> for cells, I can see that after distribution the processors have more cells.
>

Please send the output of DMPlexView() for each incarnation of the mesh.
What I do is put

  DMViewFromOptions(dm, NULL, "-dm1_view")

with a different string after each call.


> But I couldn't figure out how to decide where the ghost/processor boundary
> cells start.
>

Please send the actual code because the above is not specific enough. For
example, you will not have
"ghost cells" unless you partition with overlap. This is because by default
cells are the partitioned quantity,
so each process gets a unique set.

  Thanks,

  Matt


> In older mails I saw there is a function DMPlexGetHybridBounds but I
> think that function is deprecated. I tried to use
> DMPlexGetCellTypeStratum as in ts/tutorials/ex11_sa.c but I'm getting -1
> as cEndInterior before and after distribution. I tried it for
> DM_POLYTOPE_FV_GHOST, DM_POLYTOPE_INTERIOR_GHOST polytope types. I also
> tried calling DMPlexComputeCellTypes before DMPlexGetCellTypeStratum but
> nothing changed. I think I can calculate the ghost cell indices using
> cStart/cEnd before & after distribution but I think there is a better way
> I'm currently missing.
>
> Thanks again,
> Guer.
>
> --- Original Message ---
> On Thursday, September 28th, 2023 at 10:42 PM, Matthew Knepley <
> knep...@gmail.com> wrote:
>
> On Thu, Sep 28, 2023 at 3:38 PM erdemguer via petsc-users <
> petsc-users@mcs.anl.gov> wrote:
>
>> Hi,
>>
>> I am currently using DMPlex in my code. It runs serially at the moment,
>> but I'm interested in adding parallel options. Here is my workflow:
>>
>> Create a DMPlex mesh from GMSH.
>> Reorder it with DMPlexPermute.
>> Create necessary pre-processing arrays related to the mesh/problem.
>> Create field(s) with multi-dofs.
>> Create residual vectors.
>> Define a function to calculate the residual for each cell and, use SNES.
>> As you can see, I'm not using FV or FE structures (most examples do).
>> Now, I'm trying to implement this in parallel using a similar approach.
>> However, I'm struggling to understand how to create corresponding vectors
>> and how to obtain index sets for each processor. Is there a tutorial or
>> paper that covers this topic?
>>
>
> The intention was that there is enough information in the manual to do
> this.
>
> Using PetscFE/PetscFV is not required. However, I strongly encourage you
> to use PetscSection. Without this, it would be incredibly hard to do what
> you want. Once the DM has a Section, it can do things like automatically
> create vectors and matrices for you. It can redistribute them, subset them,
> etc. The Section describes how dofs are assigned to pieces of the mesh
> (mesh points). This is in the manual, and there are a few examples that do
> it by hand.
>
> So I suggest changing your code to use PetscSection, and then letting us
> know if things still do not work.
>
> Thanks,
>
> Matt
>
>> Thank you.
>> Guer.
>>
>> Sent with Proton Mail  secure email.
>>
>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
> 
>
>
>

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/ 


Re: [petsc-users] Galerkin projection using petsc4py

2023-10-10 Thread Matthew Knepley
On Tue, Oct 10, 2023 at 5:34 PM Thanasis Boutsikakis <
thanasis.boutsika...@corintis.com> wrote:

> Hi all,
>
> Revisiting my code and the proposed solution from Pierre, I realized this
> only works in serial. The reason is that PETSc partitions those matrices
> only row-wise, which leads to an error due to the mismatch between the
> number of columns of A (non-partitioned) and the number of rows of Phi
> (partitioned).
>

Are you positive about this? P^T A P is designed to run in this scenario,
so either we have a bug or the diagnosis is wrong.

  Thanks,

 Matt


> """Experimenting with PETSc mat-mat multiplication"""
>
> import time
>
> import numpy as np
> from colorama import Fore
> from firedrake import COMM_SELF, COMM_WORLD
> from firedrake.petsc import PETSc
> from mpi4py import MPI
> from numpy.testing import assert_array_almost_equal
>
> from utilities import Print
>
> nproc = COMM_WORLD.size
> rank = COMM_WORLD.rank
>
> def create_petsc_matrix(input_array, sparse=True):
> """Create a PETSc matrix from an input_array
>
> Args:
> input_array (np array): Input array
> partition_like (PETSc mat, optional): Petsc matrix. Defaults to None.
> sparse (bool, optional): Toggle for sparese or dense. Defaults to True.
>
> Returns:
> PETSc mat: PETSc mpi matrix
> """
> # Check if input_array is 1D and reshape if necessary
> assert len(input_array.shape) == 2, "Input array should be 2-dimensional"
> global_rows, global_cols = input_array.shape
> size = ((None, global_rows), (global_cols, global_cols))
>
> # Create a sparse or dense matrix based on the 'sparse' argument
> if sparse:
> matrix = PETSc.Mat().createAIJ(size=size, comm=COMM_WORLD)
> else:
> matrix = PETSc.Mat().createDense(size=size, comm=COMM_WORLD)
> matrix.setUp()
>
> local_rows_start, local_rows_end = matrix.getOwnershipRange()
>
> for counter, i in enumerate(range(local_rows_start, local_rows_end)):
> # Calculate the correct row in the array for the current process
> row_in_array = counter + local_rows_start
> matrix.setValues(
> i, range(global_cols), input_array[row_in_array, :], addv=False
> )
>
> # Assembly the matrix to compute the final structure
> matrix.assemblyBegin()
> matrix.assemblyEnd()
>
> return matrix
>
> # 
> # EXP: Galerkin projection of an mpi PETSc matrix A with an mpi PETSc
> matrix Phi
> # A' = Phi.T * A * Phi
> # [k x k] <- [k x m] x [m x m] x [m x k]
> # 
>
> m, k = 100, 7
> # Generate the random numpy matrices
> np.random.seed(0) # sets the seed to 0
> A_np = np.random.randint(low=0, high=6, size=(m, m))
> Phi_np = np.random.randint(low=0, high=6, size=(m, k))
>
> # 
> # TEST: Galerking projection of numpy matrices A_np and Phi_np
> # 
> Aprime_np = Phi_np.T @ A_np @ Phi_np
> Print(f"MATRIX Aprime_np [{Aprime_np.shape[0]}x{Aprime_np.shape[1]}]")
> Print(f"{Aprime_np}")
>
> # Create A as an mpi matrix distributed on each process
> A = create_petsc_matrix(A_np, sparse=False)
>
> # Create Phi as an mpi matrix distributed on each process
> Phi = create_petsc_matrix(Phi_np, sparse=False)
>
> # Create an empty PETSc matrix object to store the result of the PtAP
> operation.
> # This will hold the result A' = Phi.T * A * Phi after the computation.
> A_prime = create_petsc_matrix(np.zeros((k, k)), sparse=False)
>
> # Perform the PtAP (Phi Transpose times A times Phi) operation.
> # In mathematical terms, this operation is A' = Phi.T * A * Phi.
> # A_prime will store the result of the operation.
> A_prime = A.ptap(Phi)
>
> Here is the error
>
> MATRIX mpiaij A [100x100]
> Assembled
>
> Partitioning for A:
>   Rank 0: Rows [0, 34)
>   Rank 1: Rows [34, 67)
>   Rank 2: Rows [67, 100)
>
> MATRIX mpiaij Phi [100x7]
> Assembled
>
> Partitioning for Phi:
>   Rank 0: Rows [0, 34)
>   Rank 1: Rows [34, 67)
>   Rank 2: Rows [67, 100)
>
> Traceback (most recent call last):
>   File "/Users/boutsitron/work/galerkin_projection.py", line 87, in
> 
> A_prime = A.ptap(Phi)
>   ^^^
>   File "petsc4py/PETSc/Mat.pyx", line 1525, in petsc4py.PETSc.Mat.ptap
> petsc4py.PETSc.Error: error code 60
> [0] MatPtAP() at
> /Users/boutsitron/firedrake/src/petsc/src/mat/interface/matrix.c:9896
> [0] MatProductSetFromOptions() at
> /Users/boutsitron/firedrake/src/petsc/src/mat/interface/matproduct.c:541
> [0] MatProductSetFromOptions_Private() at
> /Users/boutsitron/firedrake/src/petsc/src/mat/interface/matproduct.c:435
> [0] MatProductSetFromOptions_MPIAIJ() at
> /Users/boutsitron/firedrake/src/petsc/src/mat/impls/aij/mpi/mpimatmatmult.c:2372
> [0] MatProductSetFromOptions_MPIAIJ_PtAP() at
> /Users/boutsitron/firedrake/src/petsc/src/mat/impls/aij/mpi/mpimatmatmult.c:2266
> [0] Nonconforming object sizes
> [0] Matrix local dimensions are incompatible, Acol (0, 100) != Prow (0,34)
> Abort(1) on node 0 (rank 0 in comm 496): application called
> 

Re: [petsc-users] Parallel DMPlex

2023-10-10 Thread erdemguer via petsc-users
Hi,
Sorry for my late response. I tried your suggestions and I think I made some 
progress, but I still have issues. Let me explain my latest mesh routine:

- DMPlexCreateBoxMesh

- DMSetFromOptions
- PetscSectionCreate
- PetscSectionSetNumFields
- PetscSectionSetFieldDof

- PetscSectionSetDof

- PetscSectionSetUp
- DMSetLocalSection
- DMSetAdjacency
- DMPlexDistribute
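
Written out as a rough, untested petsc4py sketch (placeholder mesh size, one field 
with a single dof per cell), that sequence is:

from petsc4py import PETSc

dm = PETSc.DMPlex().createBoxMesh([4, 4, 4], simplex=False)
dm.setFromOptions()

sec = PETSc.Section().create(comm=dm.comm)
sec.setNumFields(1)
pStart, pEnd = dm.getChart()
sec.setChart(pStart, pEnd)
cStart, cEnd = dm.getHeightStratum(0)   # cells
for c in range(cStart, cEnd):
    sec.setFieldDof(c, 0, 1)            # dofs of field 0 on this cell
    sec.setDof(c, 1)                    # total dofs on this cell
sec.setUp()
dm.setLocalSection(sec)

dm.setBasicAdjacency(True, False)       # FV-style adjacency (cells through faces)
sf = dm.distribute(overlap=1)           # overlap > 0 is what creates ghost cells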

It's still not working, but it's promising: if I call DMPlexGetDepthStratum for 
cells, I can see that after distribution the processors have more cells. But I 
couldn't figure out how to decide where the ghost/processor-boundary cells 
start. In older mails I saw there is a function DMPlexGetHybridBounds, but I 
think that function is deprecated. I tried to use DMPlexGetCellTypeStratum as 
in ts/tutorials/ex11_sa.c, but I'm getting -1 as cEndInterior before and after 
distribution. I tried it for the DM_POLYTOPE_FV_GHOST and DM_POLYTOPE_INTERIOR_GHOST 
polytope types. I also tried calling DMPlexComputeCellTypes before 
DMPlexGetCellTypeStratum, but nothing changed. I think I can calculate the ghost 
cell indices using cStart/cEnd before & after distribution, but I think there is 
a better way that I'm currently missing.

Thanks again,
Guer.

--- Original Message ---
On Thursday, September 28th, 2023 at 10:42 PM, Matthew Knepley 
 wrote:

> On Thu, Sep 28, 2023 at 3:38 PM erdemguer via petsc-users 
>  wrote:
>
>> Hi,
>>
>> I am currently using DMPlex in my code. It runs serially at the moment, but 
>> I'm interested in adding parallel options. Here is my workflow:
>>
>> Create a DMPlex mesh from GMSH.
>> Reorder it with DMPlexPermute.
>> Create necessary pre-processing arrays related to the mesh/problem.
>> Create field(s) with multi-dofs.
>> Create residual vectors.
>> Define a function to calculate the residual for each cell and, use SNES.
>> As you can see, I'm not using FV or FE structures (most examples do). Now, 
>> I'm trying to implement this in parallel using a similar approach. However, 
>> I'm struggling to understand how to create corresponding vectors and how to 
>> obtain index sets for each processor. Is there a tutorial or paper that 
>> covers this topic?
>
> The intention was that there is enough information in the manual to do this.
>
> Using PetscFE/PetscFV is not required. However, I strongly encourage you to 
> use PetscSection. Without this, it would be incredibly hard to do what you 
> want. Once the DM has a Section, it can do things like automatically create 
> vectors and matrices for you. It can redistribute them, subset them, etc. The 
> Section describes how dofs are assigned to pieces of the mesh (mesh points). 
> This is in the manual, and there are a few examples that do it by hand.
>
> So I suggest changing your code to use PetscSection, and then letting us know 
> if things still do not work.
>
> Thanks,
>
> Matt
>
>> Thank you.
>> Guer.
>>
>> Sent with [Proton Mail](https://proton.me/) secure email.
>
> --
>
> What most experimenters take for granted before they begin their experiments 
> is infinitely more interesting than any results to which their experiments 
> lead.
> -- Norbert Wiener
>
> [https://www.cse.buffalo.edu/~knepley/](http://www.cse.buffalo.edu/~knepley/)

Re: [petsc-users] Galerkin projection using petsc4py

2023-10-10 Thread Thanasis Boutsikakis
Hi all,

Revisiting my code and the proposed solution from Pierre, I realized this only 
works in serial. The reason is that PETSc partitions those matrices only 
row-wise, which leads to an error due to the mismatch between the number of 
columns of A (non-partitioned) and the number of rows of Phi (partitioned).

"""Experimenting with PETSc mat-mat multiplication"""

import time

import numpy as np
from colorama import Fore
from firedrake import COMM_SELF, COMM_WORLD
from firedrake.petsc import PETSc
from mpi4py import MPI
from numpy.testing import assert_array_almost_equal

from utilities import Print

nproc = COMM_WORLD.size
rank = COMM_WORLD.rank

def create_petsc_matrix(input_array, sparse=True):
    """Create a PETSc matrix from an input_array

    Args:
        input_array (np array): Input array
        sparse (bool, optional): Toggle for sparse or dense. Defaults to True.

    Returns:
        PETSc mat: PETSc MPI matrix
    """
    # Check that input_array is 2-dimensional
    assert len(input_array.shape) == 2, "Input array should be 2-dimensional"
    global_rows, global_cols = input_array.shape
    size = ((None, global_rows), (global_cols, global_cols))

    # Create a sparse or dense matrix based on the 'sparse' argument
    if sparse:
        matrix = PETSc.Mat().createAIJ(size=size, comm=COMM_WORLD)
    else:
        matrix = PETSc.Mat().createDense(size=size, comm=COMM_WORLD)
    matrix.setUp()

    local_rows_start, local_rows_end = matrix.getOwnershipRange()

    for counter, i in enumerate(range(local_rows_start, local_rows_end)):
        # Calculate the correct row in the array for the current process
        row_in_array = counter + local_rows_start
        matrix.setValues(
            i, range(global_cols), input_array[row_in_array, :], addv=False
        )

    # Assemble the matrix to compute the final structure
    matrix.assemblyBegin()
    matrix.assemblyEnd()

    return matrix

# 
# EXP: Galerkin projection of an mpi PETSc matrix A with an mpi PETSc matrix Phi
#  A' = Phi.T * A * Phi
# [k x k] <- [k x m] x [m x m] x [m x k]
# 

m, k = 100, 7
# Generate the random numpy matrices
np.random.seed(0)  # sets the seed to 0
A_np = np.random.randint(low=0, high=6, size=(m, m))
Phi_np = np.random.randint(low=0, high=6, size=(m, k))

# 
# TEST: Galerkin projection of numpy matrices A_np and Phi_np
# 
Aprime_np = Phi_np.T @ A_np @ Phi_np
Print(f"MATRIX Aprime_np [{Aprime_np.shape[0]}x{Aprime_np.shape[1]}]")
Print(f"{Aprime_np}")

# Create A as an mpi matrix distributed on each process
A = create_petsc_matrix(A_np, sparse=False)

# Create Phi as an mpi matrix distributed on each process
Phi = create_petsc_matrix(Phi_np, sparse=False)

# Create an empty PETSc matrix object to store the result of the PtAP operation.
# This will hold the result A' = Phi.T * A * Phi after the computation.
A_prime = create_petsc_matrix(np.zeros((k, k)), sparse=False)

# Perform the PtAP (Phi Transpose times A times Phi) operation.
# In mathematical terms, this operation is A' = Phi.T * A * Phi.
# A_prime will store the result of the operation.
A_prime = A.ptap(Phi)

Here is the error

MATRIX mpiaij A [100x100]
Assembled

Partitioning for A:
  Rank 0: Rows [0, 34)
  Rank 1: Rows [34, 67)
  Rank 2: Rows [67, 100)

MATRIX mpiaij Phi [100x7]
Assembled

Partitioning for Phi:
  Rank 0: Rows [0, 34)
  Rank 1: Rows [34, 67)
  Rank 2: Rows [67, 100)

Traceback (most recent call last):
  File "/Users/boutsitron/work/galerkin_projection.py", line 87, in 
A_prime = A.ptap(Phi)
  ^^^
  File "petsc4py/PETSc/Mat.pyx", line 1525, in petsc4py.PETSc.Mat.ptap
petsc4py.PETSc.Error: error code 60
[0] MatPtAP() at 
/Users/boutsitron/firedrake/src/petsc/src/mat/interface/matrix.c:9896
[0] MatProductSetFromOptions() at 
/Users/boutsitron/firedrake/src/petsc/src/mat/interface/matproduct.c:541
[0] MatProductSetFromOptions_Private() at 
/Users/boutsitron/firedrake/src/petsc/src/mat/interface/matproduct.c:435
[0] MatProductSetFromOptions_MPIAIJ() at 
/Users/boutsitron/firedrake/src/petsc/src/mat/impls/aij/mpi/mpimatmatmult.c:2372
[0] MatProductSetFromOptions_MPIAIJ_PtAP() at 
/Users/boutsitron/firedrake/src/petsc/src/mat/impls/aij/mpi/mpimatmatmult.c:2266
[0] Nonconforming object sizes
[0] Matrix local dimensions are incompatible, Acol (0, 100) != Prow (0,34)
Abort(1) on node 0 (rank 0 in comm 496): application called 
MPI_Abort(PYOP2_COMM_WORLD, 1) - process 0

Any thoughts?

Thanks,
Thanos

> On 5 Oct 2023, at 14:23, Thanasis Boutsikakis 
>  wrote:
> 
> This works Pierre. Amazing input, thanks a lot!
> 
>> On 5 Oct 2023, at 14:17, Pierre Jolivet  wrote:
>> 
>> Not a petsc4py expert here, but you may to try instead:
>> A_prime = A.ptap(Phi)
>> 

Re: [petsc-users] Scalability problem using PETSc with local installed OpenMPI

2023-10-10 Thread Barry Smith

  Run STREAMS with 

MPI_BINDING="-map-by socket --bind-to core --report-bindings" make mpistreams

send the result

  Also run 

lscpu
numactl -H

if they are available on your machine, send the result


> On Oct 10, 2023, at 10:17 AM, Gong Yujie  wrote:
> 
> Dear Barry,
> 
> I tried to use the binding as suggested by PETSc: 
> mpiexec -n 4 --map-by socket --bind-to socket --report-bindings
> But it does not seem to improve the performance. Here is the make streams log:
> 
> Best Regards,
> Yujie
> 
> mpicc -o MPIVersion.o -c -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing 
> -Wno-unknown-pragmas -fstack-protector -fvisibility=hidden -g -O
> -I/home/tt/petsc-3.16.0/include 
> -I/home/tt/petsc-3.16.0/arch-linux-c-opt/include `pwd`/MPIVersion.c
> Running streams with 'mpiexec --oversubscribe ' using 'NPMAX=16'
> 1  26119.1937   Rate (MB/s)
> 2  29833.4281   Rate (MB/s) 1.1422
> 3  65338.5050   Rate (MB/s) 2.50155
> 4  59832.7482   Rate (MB/s) 2.29076
> 5  48629.8396   Rate (MB/s) 1.86184
> 6  58569.4289   Rate (MB/s) 2.24239
> 7  63827.1144   Rate (MB/s) 2.44369
> 8  57448.5349   Rate (MB/s) 2.19948
> 9  61405.3273   Rate (MB/s) 2.35097
> 10  68021.6111   Rate (MB/s) 2.60428
> 11  71289.0422   Rate (MB/s) 2.72937
> 12  76900.6386   Rate (MB/s) 2.94422
> 13  80198.6807   Rate (MB/s) 3.07049
> 14  64846.3685   Rate (MB/s) 2.48271
> 15  83072.8631   Rate (MB/s) 3.18053
> 16  70128.0166   Rate (MB/s) 2.68492
> 
> Traceback (most recent call last):
>   File "process.py", line 89, in 
> process(sys.argv[1],len(sys.argv)-2)
>   File "process.py", line 33, in process
> speedups[i] = triads[i]/triads[0]
> TypeError: 'dict_values' object does not support indexing
> make[2]: [makefile:47: mpistream] Error 1 (ignored)
> Traceback (most recent call last):
>   File "process.py", line 89, in 
> process(sys.argv[1],len(sys.argv)-2)
>   File "process.py", line 33, in process
> speedups[i] = triads[i]/triads[0]
> TypeError: 'dict_values' object does not support indexing
> make[2]: [makefile:79: mpistreams] Error 1 (ignored)
> From: Barry Smith 
> Sent: Tuesday, October 10, 2023 9:59 PM
> To: Gong Yujie 
> Cc: petsc-users@mcs.anl.gov 
> Subject: Re: [petsc-users] Scalability problem using PETSc with local 
> installed OpenMPI
>  
> 
>   Take a look at 
> https://petsc.org/release/faq/#what-kind-of-parallel-computers-or-clusters-are-needed-to-use-petsc-or-why-do-i-get-little-speedup
> 
>   Check the binding that OpenMPI is using (by the way, there are much more 
> recent OpenMPI versions, I suggest using them). Run the STREAMS benchmark as 
> indicated on that page.
> 
>   Barry
> 
> 
>> On Oct 10, 2023, at 9:27 AM, Gong Yujie  wrote:
>> 
>> Dear PETSc developers,
>> 
>> I installed OpenMPI3 first and then installed PETSc with that MPI. Currently 
>> I'm facing a scalability issue. In detail, I tested using plain OpenMPI to 
>> calculate the addition of two distributed arrays and I get good scalability. 
>> The problem is that when I calculate the addition of two vectors in PETSc, I 
>> don't get any scalability. For the same problem size, PETSc takes much more 
>> time than merely using OpenMPI. 
>> 
>> My PETSc version is 3.16.0 and the version of OpenMPI is 3.1.4. Hope you can 
>> give me some suggestions.
>> 
>> Best Regards,
>> Yujie



Re: [petsc-users] Scalability problem using PETSc with local installed OpenMPI

2023-10-10 Thread Barry Smith

  Take a look at 
https://petsc.org/release/faq/#what-kind-of-parallel-computers-or-clusters-are-needed-to-use-petsc-or-why-do-i-get-little-speedup

  Check the binding that OpenMPI is using (by the way, there are much more 
recent OpenMPI versions, I suggest using them). Run the STREAMS benchmark as 
indicated on that page.

  Barry


> On Oct 10, 2023, at 9:27 AM, Gong Yujie  wrote:
> 
> Dear PETSc developers,
> 
> I installed OpenMPI3 first and then installed PETSc with that MPI. Currently 
> I'm facing a scalability issue. In detail, I tested using plain OpenMPI to 
> calculate the addition of two distributed arrays and I get good scalability. 
> The problem is that when I calculate the addition of two vectors in PETSc, I 
> don't get any scalability. For the same problem size, PETSc takes much more 
> time than merely using OpenMPI. 
> 
> My PETSc version is 3.16.0 and the version of OpenMPI is 3.1.4. Hope you can 
> give me some suggestions.
> 
> Best Regards,
> Yujie



Re: [petsc-users] Scalability problem using PETSc with local installed OpenMPI

2023-10-10 Thread Matthew Knepley
On Tue, Oct 10, 2023 at 9:28 AM Gong Yujie 
wrote:

> Dear PETSc developers,
>
> I installed OpenMPI3 first and then installed PETSc with that MPI. Currently
> I'm facing a scalability issue. In detail, I tested using plain OpenMPI to
> calculate the addition of two distributed arrays and I get good scalability.
> The problem is that when I calculate the addition of two vectors in PETSc, I
> don't get any scalability. For the same problem size, PETSc takes much more
> time than merely using OpenMPI.
>
> My PETSc version is 3.16.0 and the version of OpenMPI is 3.1.4. Hope you
> can give me some suggestions.
>

1. For any performance question, we really need to see the output of
-log_view for each run.

2. I am not sure I understand your question. Vector addition does not
involve communication. Thus it will scale perfectly in the absence of load
imbalance.
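
For example, a minimal (untested) petsc4py sketch of the kind of vector-addition
test I mean, run with something like "mpiexec -n 4 python axpy_test.py -log_view"
(script name and sizes are placeholders):

import sys
import petsc4py
petsc4py.init(sys.argv)  # forwards options such as -log_view to PETSc
from petsc4py import PETSc

n = 10000000  # global vector length (placeholder)
x = PETSc.Vec().createMPI(n)
y = PETSc.Vec().createMPI(n)
x.set(1.0)
y.set(2.0)
y.axpy(1.0, x)  # y <- y + x; VecAXPY is purely local, no communication

The VecAXPY row of the -log_view table then shows the time and flop rate for the
addition on each run.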

  Thanks,

  Matt


> Best Regards,
> Yujie
>
-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/ 


[petsc-users] Scalability problem using PETSc with local installed OpenMPI

2023-10-10 Thread Gong Yujie
Dear PETSc developers,

I installed OpenMPI3 first and then installed PETSc with that MPI. Currently 
I'm facing a scalability issue. In detail, I tested using plain OpenMPI to 
calculate the addition of two distributed arrays and I get good scalability. 
The problem is that when I calculate the addition of two vectors in PETSc, I 
don't get any scalability. For the same problem size, PETSc takes much more 
time than merely using OpenMPI.

My PETSc version is 3.16.0 and the version of OpenMPI is 3.1.4. Hope you can 
give me some suggestions.

Best Regards,
Yujie