Re: [petsc-users] Parallel TS for ODE

Zhang, Hong via petsc-users Thu, 29 Apr 2021 06:29:08 -0700

Before you have a space-dependent SIR model, it does not make much sense to use 
DM and even consider parallelizing your code.


On Apr 29, 2021, at 2:57 AM, Francesco Brarda 
<[email protected]<mailto:[email protected]>> wrote:

Hi,
The plan is actually to move to a SIR model also with the space.
I understand that doing a SIR model in parallel will not bring any benefits, 
but I have been asked to do it as part of a project I am involved in.

I defined the DM as follows
ierr = 
DMDACreate1d(PETSC_COMM_WORLD,DM_BOUNDARY_NONE,3,3,3,NULL,&da);CHKERRQ(ierr);

The size of DMDA is usually associated with the size of your computational 
domain, e.g. number of nodes in space. In your case, you have 3 degrees of 
freedom on each node, but you have only one node since the model has not been 
extended to the space yet.

Hong (Mr.)


I am not sure whether I correctly understood this command properly. The vector 
should have 3 components (S, I, R) and 3 DOF as it is defined only when the 
three coordinates have been set.
Then I create a global vector X. When I set the initial conditions as below

static PetscErrorCode InitialConditions(TS ts,Vec X, void *ctx)
{
  PetscErrorCode    ierr;
  AppCtx            *appctx = (AppCtx*) ctx;
  PetscScalar       *x;
  DM                da;

  PetscFunctionBeginUser;
  ierr = TSGetDM(ts,&da);CHKERRQ(ierr);

  /* Get pointers to vector data */
  ierr = DMDAVecGetArray(da,X,(void*)&x);CHKERRQ(ierr);

  x[0] = appctx->N - appctx->p[2];
  x[1] = appctx->p[2];
  x[2] = 0.0;

  ierr = DMDAVecRestoreArray(da,X,(void*)&x);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

I have the error:

[0]PETSC ERROR: Invalid argument
[0]PETSC ERROR: Wrong subtype object:Parameter # 1 must have implementation da 
it is shell
[0]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for 
trouble shooting.
[0]PETSC ERROR: Petsc Release Version 3.14.4, unknown
[0]PETSC ERROR: ./par_sir_model on a arch-debug named srvulx13 by fbrarda Thu 
Apr 29 09:36:17 2021
[0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ 
--with-fc=gfortran --with-openblas-dir=/opt/packages/openblas/0.2.13-gcc 
--download-mpich PETSC_ARCH=arch-debug
[0]PETSC ERROR: #1 DMDAVecGetArray() line 48 in 
/home/fbrarda/petsc/src/dm/impls/da/dagetarray.c
[0]PETSC ERROR: #2 InitialConditions() line 175 in par_sir_model.c
[0]PETSC ERROR: #3 main() line 295 in par_sir_model.c
[0]PETSC ERROR: No PETSc Option Table entries

I would be very happy to receive any advices to fix the code.
Best,
Francesco

Il giorno 20 apr 2021, alle ore 21:35, Matthew Knepley 
<[email protected]<mailto:[email protected]>> ha scritto:

On Tue, Apr 20, 2021 at 1:17 PM Francesco Brarda 
<[email protected]<mailto:[email protected]>> wrote:
Thank you for the advices, I would just like to convert the code I already have 
to see what might happen once parallelized.
Do you think it is better to put the 3 equations into a 1d Distributed Array 
with 3 dofs and run the job with multiple procs regardless of how many 
equations I have? Is it possible?

If you plan in the end to use a structured grid, this is a great plan. If not, 
this is not a good plan.

  Thanks,

     Matt

Thank you,
Francesco

Il giorno 20 apr 2021, alle ore 17:57, Stefano Zampini 
<[email protected]<mailto:[email protected]>> ha scritto:

It does not make sense to parallelize to 1 equation per process, unless that 
single equation per process is super super super costly.
Is this work you are doing used to understand PETSc parallelization strategy? 
if so, there are multiple examples in the sourcetree that you can look at to 
populate matrices and vectors in parallel

Il giorno mar 20 apr 2021 alle ore 17:52 Francesco Brarda 
<[email protected]<mailto:[email protected]>> ha scritto:
In principle the entire code was for 1 proc only. The functions were built with 
VecGetArray(). While adapting the code for multiple procs I thought using 
VecGetOwnershipRange was a possible way to allocate the equations in the vector 
using multiple procs. What do you think, please?

Thank you,
Francesco

Il giorno 20 apr 2021, alle ore 16:43, Matthew Knepley 
<[email protected]<mailto:[email protected]>> ha scritto:

On Tue, Apr 20, 2021 at 10:41 AM Francesco Brarda 
<[email protected]<mailto:[email protected]>> wrote:
I was trying to follow Barry's advice some time ago, but I guess that's not the 
way he meant it. How should I refer to the values contained in x? With 
Distributed Arrays?

That is how you get values from x. However, I cannot understand at all what you 
are doing with "mybase".

   Matt

Thanks
Francesco

 Even though it will not scale and will deliver slower performance it is 
completely possible for you to solve the 3 variable problem using 3 MPI ranks. 
Or 10 mpi ranks. You would just create vectors/matrices with 1 degree of 
freedom for the first three ranks and no degrees of freedom for the later 
ranks. During your function evaluation (and Jacobian evaluation) for TS you 
will need to set up the appropriate communication to get the values you need on 
each rank to evaluate the parts of the function evaluation needed by that rank. 
This is true for parallelizing any computation.

 Barry



Il giorno 20 apr 2021, alle ore 15:40, Matthew Knepley 
<[email protected]<mailto:[email protected]>> ha scritto:

On Tue, Apr 20, 2021 at 9:36 AM Francesco Brarda 
<[email protected]<mailto:[email protected]>> wrote:
Hi!
I tried to implement the SIR model taking into account the fact that I will 
only use 3 MPI ranks at this moment.
I built vectors and matrices following the examples already available. In 
particular, I defined the functions required similarly (RHSFunction, IFunction, 
IJacobian), as follows:

I don't think this makes sense. You use "mybase" to distinguish between 3 
procs, which would indicate that each procs has only
1 degree of freedom. However, you use x[1] on each proc, indicating it has at 
least 2 dofs.

  Thanks,

     Matt

static PetscErrorCode RHSFunction(TS ts,PetscReal t,Vec X,Vec F,void *ctx)
{
  PetscErrorCode    ierr;
  AppCtx            *appctx = (AppCtx*) ctx;
  PetscScalar       f;//, *x_localptr;
  const PetscScalar *x;
  PetscInt          mybase;

  PetscFunctionBeginUser;
  ierr = VecGetOwnershipRange(X,&mybase,NULL);CHKERRQ(ierr);
  ierr = VecGetArrayRead(X,&x);CHKERRQ(ierr);
  if (mybase == 0) {
    f    = (PetscScalar) (-appctx->p1*x[0]*x[1]/appctx->N);
    ierr = VecSetValues(F,1,&mybase,&f,INSERT_VALUES);
  }
  if (mybase == 1) {
    f    = (PetscScalar) (appctx->p1*x[0]*x[1]/appctx->N-appctx->p2*x[1]);
    ierr = VecSetValues(F,1,&mybase,&f,INSERT_VALUES);
  }
  if (mybase == 2) {
    f    = (PetscScalar) (appctx->p2*x[1]);
    ierr = VecSetValues(F,1,&mybase,&f,INSERT_VALUES);
  }
  ierr = VecRestoreArrayRead(X,&x);CHKERRQ(ierr);
  ierr = VecAssemblyBegin(F);CHKERRQ(ierr);
  ierr = VecAssemblyEnd(F);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}


Whilst for the Jacobian I did:


static PetscErrorCode IJacobian(TS ts,PetscReal t,Vec X,Vec Xdot,PetscReal 
a,Mat A,Mat B,void *ctx)
{
  PetscErrorCode    ierr;
  AppCtx            *appctx = (AppCtx*) ctx;
  PetscInt          mybase, rowcol[] = {0,1,2};
  const PetscScalar *x;

  PetscFunctionBeginUser;
  ierr = MatGetOwnershipRange(B,&mybase,NULL);CHKERRQ(ierr);
  ierr = VecGetArrayRead(X,&x);CHKERRQ(ierr);
  if (mybase == 0) {
    const PetscScalar J[] = {a + appctx->p1*x[1]/appctx->N, 
appctx->p1*x[0]/appctx->N, 0};
    ierr = MatSetValues(B,1,&mybase,3,rowcol,J,INSERT_VALUES);CHKERRQ(ierr);
  }
  if (mybase == 1) {
    const PetscScalar J[] = {- appctx->p1*x[1]/appctx->N, a - 
appctx->p1*x[0]/appctx->N + appctx->p2, 0};
    ierr = MatSetValues(B,1,&mybase,3,rowcol,J,INSERT_VALUES);CHKERRQ(ierr);
  }
  if (mybase == 2) {
    const PetscScalar J[] = {0, - appctx->p2, a};
    ierr = MatSetValues(B,1,&mybase,3,rowcol,J,INSERT_VALUES);CHKERRQ(ierr);
  }
  ierr    = VecRestoreArrayRead(X,&x);CHKERRQ(ierr);

  ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  if (A != B) {
    ierr = MatAssemblyBegin(B,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatAssemblyEnd(B,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  }
  PetscFunctionReturn(0);
}

This code does not provide the correct result, that is, the solution is the 
initial condition, either using implicit or explicit methods. Is the way I 
defined these objects wrong? How can I fix it?
I also tried to print the Jacobian with the following commands but it does not 
work (blank rows and error message). How should I print the Jacobian?

ierr = TSGetIJacobian(ts,NULL,&K, NULL, NULL); CHKERRQ(ierr);
ierr = MatAssemblyBegin(K,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
ierr = MatAssemblyEnd(K,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
ierr = MatView(K,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);

I would very much appreciate any kind of help or advice.
Best,
Francesco

Il giorno 2 apr 2021, alle ore 04:45, Barry Smith 
<[email protected]<mailto:[email protected]>> ha scritto:



On Apr 1, 2021, at 9:17 PM, Zhang, Hong via petsc-users 
<[email protected]<mailto:[email protected]>> wrote:



On Mar 31, 2021, at 2:53 AM, Francesco Brarda 
<[email protected]<mailto:[email protected]>> wrote:

Hi everyone!

I am trying to solve a system of 3 ODEs (a basic SIR model) with TS. 
Sequentially works pretty well, but I need to switch it into a parallel version.
I started working with TS not very long time ago, there are few questions I’d 
like to share with you and if you have any advices I’d be happy to hear.
First of all, do I need to use a DM object even if the model is only time 
dependent? All the examples I found were using that object for the other 
variable when solving PDEs.

Are you considering SIR on a spatial domain? If so, you can parallelize your 
model in the spatial domain using DM. Splitting the three variables in the ODE 
among processors would not scale.

 Even though it will not scale and will deliver slower performance it is 
completely possible for you to solve the 3 variable problem using 3 MPI ranks. 
Or 10 mpi ranks. You would just create vectors/matrices with 1 degree of 
freedom for the first three ranks and no degrees of freedom for the later 
ranks. During your function evaluation (and Jacobian evaluation) for TS you 
will need to set up the appropriate communication to get the values you need on 
each rank to evaluate the parts of the function evaluation needed by that rank. 
This is true for parallelizing any computation.

 Barry





Hong (Mr.)

When I preallocate the space for the Jacobian matrix, is it better to decide 
the local or global space?

Best,
Francesco



--
What most experimenters take for granted before they begin their experiments is 
infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/



--
What most experimenters take for granted before they begin their experiments is 
infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/



--
Stefano



--
What most experimenters take for granted before they begin their experiments is 
infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/

Re: [petsc-users] Parallel TS for ODE

Reply via email to