Re: [petsc-users] Parallel TS for ODE

Francesco Brarda Thu, 29 Apr 2021 07:19:46 -0700

I am using DM to share the data when I define the functions outside the main 
(RHSFunction, InitialConditions and so on).
In the sequential model, following the available examples, I used VecGetArray. 
As a replacement for a parallel model, I thought DM array would be a good 
alternative. Would you recommend using another tool?


Best,
Francesco


> Il giorno 29 apr 2021, alle ore 15:27, Zhang, Hong <[email protected]> ha 
> scritto:
> 
> Before you have a space-dependent SIR model, it does not make much sense to 
> use DM and even consider parallelizing your code.  
> 
>> On Apr 29, 2021, at 2:57 AM, Francesco Brarda <[email protected] 
>> <mailto:[email protected]>> wrote:
>> 
>> Hi,
>> The plan is actually to move to a SIR model also with the space. 
>> I understand that doing a SIR model in parallel will not bring any benefits, 
>> but I have been asked to do it as part of a project I am involved in.
>> 
>> I defined the DM as follows
>> ierr = 
>> DMDACreate1d(PETSC_COMM_WORLD,DM_BOUNDARY_NONE,3,3,3,NULL,&da);CHKERRQ(ierr);
> 
> The size of DMDA is usually associated with the size of your computational 
> domain, e.g. number of nodes in space. In your case, you have 3 degrees of 
> freedom on each node, but you have only one node since the model has not been 
> extended to the space yet.
> 
> Hong (Mr.)
> 
>> 
>> I am not sure whether I correctly understood this command properly. The 
>> vector should have 3 components (S, I, R) and 3 DOF as it is defined only 
>> when the three coordinates have been set.
>> Then I create a global vector X. When I set the initial conditions as below 
>> 
>> static PetscErrorCode InitialConditions(TS ts,Vec X, void *ctx)
>> { 
>>   PetscErrorCode    ierr;
>>   AppCtx            *appctx = (AppCtx*) ctx;
>>   PetscScalar       *x;
>>   DM                da;
>>   
>>   PetscFunctionBeginUser;
>>   ierr = TSGetDM(ts,&da);CHKERRQ(ierr);
>>   
>>   /* Get pointers to vector data */
>>   ierr = DMDAVecGetArray(da,X,(void*)&x);CHKERRQ(ierr);
>>   
>>   x[0] = appctx->N - appctx->p[2];
>>   x[1] = appctx->p[2];
>>   x[2] = 0.0;
>>   
>>   ierr = DMDAVecRestoreArray(da,X,(void*)&x);CHKERRQ(ierr);
>>   PetscFunctionReturn(0);
>> }
>> 
>> I have the error:
>> 
>> [0]PETSC ERROR: Invalid argument
>> [0]PETSC ERROR: Wrong subtype object:Parameter # 1 must have implementation 
>> da it is shell
>> [0]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html 
>> <https://www.mcs.anl.gov/petsc/documentation/faq.html> for trouble shooting.
>> [0]PETSC ERROR: Petsc Release Version 3.14.4, unknown 
>> [0]PETSC ERROR: ./par_sir_model on a arch-debug named srvulx13 by fbrarda 
>> Thu Apr 29 09:36:17 2021
>> [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ 
>> --with-fc=gfortran --with-openblas-dir=/opt/packages/openblas/0.2.13-gcc 
>> --download-mpich PETSC_ARCH=arch-debug
>> [0]PETSC ERROR: #1 DMDAVecGetArray() line 48 in 
>> /home/fbrarda/petsc/src/dm/impls/da/dagetarray.c
>> [0]PETSC ERROR: #2 InitialConditions() line 175 in par_sir_model.c
>> [0]PETSC ERROR: #3 main() line 295 in par_sir_model.c
>> [0]PETSC ERROR: No PETSc Option Table entries
>> 
>> I would be very happy to receive any advices to fix the code.
>> Best,
>> Francesco
>> 
>>> Il giorno 20 apr 2021, alle ore 21:35, Matthew Knepley <[email protected] 
>>> <mailto:[email protected]>> ha scritto:
>>> 
>>> On Tue, Apr 20, 2021 at 1:17 PM Francesco Brarda <[email protected] 
>>> <mailto:[email protected]>> wrote:
>>> Thank you for the advices, I would just like to convert the code I already 
>>> have to see what might happen once parallelized.
>>> Do you think it is better to put the 3 equations into a 1d Distributed 
>>> Array with 3 dofs and run the job with multiple procs regardless of how 
>>> many equations I have? Is it possible? 
>>> 
>>> If you plan in the end to use a structured grid, this is a great plan. If 
>>> not, this is not a good plan.
>>> 
>>>   Thanks,
>>> 
>>>      Matt
>>>  
>>> Thank you,
>>> Francesco
>>> 
>>>> Il giorno 20 apr 2021, alle ore 17:57, Stefano Zampini 
>>>> <[email protected] <mailto:[email protected]>> ha scritto:
>>>> 
>>>> It does not make sense to parallelize to 1 equation per process, unless 
>>>> that single equation per process is super super super costly.
>>>> Is this work you are doing used to understand PETSc parallelization 
>>>> strategy? if so, there are multiple examples in the sourcetree that you 
>>>> can look at to populate matrices and vectors in parallel
>>>> 
>>>> Il giorno mar 20 apr 2021 alle ore 17:52 Francesco Brarda 
>>>> <[email protected] <mailto:[email protected]>> ha scritto:
>>>> In principle the entire code was for 1 proc only. The functions were built 
>>>> with VecGetArray(). While adapting the code for multiple procs I thought 
>>>> using VecGetOwnershipRange was a possible way to allocate the equations in 
>>>> the vector using multiple procs. What do you think, please? 
>>>> 
>>>> Thank you,
>>>> Francesco
>>>> 
>>>>> Il giorno 20 apr 2021, alle ore 16:43, Matthew Knepley <[email protected] 
>>>>> <mailto:[email protected]>> ha scritto:
>>>>> 
>>>>> On Tue, Apr 20, 2021 at 10:41 AM Francesco Brarda 
>>>>> <[email protected] <mailto:[email protected]>> wrote:
>>>>> I was trying to follow Barry's advice some time ago, but I guess that's 
>>>>> not the way he meant it. How should I refer to the values contained in x? 
>>>>> With Distributed Arrays?
>>>>> 
>>>>> That is how you get values from x. However, I cannot understand at all 
>>>>> what you are doing with "mybase".
>>>>> 
>>>>>    Matt
>>>>>  
>>>>> Thanks 
>>>>> Francesco
>>>>> 
>>>>>>>  Even though it will not scale and will deliver slower performance it 
>>>>>>> is completely possible for you to solve the 3 variable problem using 3 
>>>>>>> MPI ranks. Or 10 mpi ranks. You would just create vectors/matrices with 
>>>>>>> 1 degree of freedom for the first three ranks and no degrees of freedom 
>>>>>>> for the later ranks. During your function evaluation (and Jacobian 
>>>>>>> evaluation) for TS you will need to set up the appropriate 
>>>>>>> communication to get the values you need on each rank to evaluate the 
>>>>>>> parts of the function evaluation needed by that rank. This is true for 
>>>>>>> parallelizing any computation.
>>>>>>> 
>>>>>>>  Barry
>>>>> 
>>>>> 
>>>>> 
>>>>>> Il giorno 20 apr 2021, alle ore 15:40, Matthew Knepley 
>>>>>> <[email protected] <mailto:[email protected]>> ha scritto:
>>>>>> 
>>>>>> On Tue, Apr 20, 2021 at 9:36 AM Francesco Brarda 
>>>>>> <[email protected] <mailto:[email protected]>> wrote:
>>>>>> Hi!
>>>>>> I tried to implement the SIR model taking into account the fact that I 
>>>>>> will only use 3 MPI ranks at this moment.
>>>>>> I built vectors and matrices following the examples already available. 
>>>>>> In particular, I defined the functions required similarly (RHSFunction, 
>>>>>> IFunction, IJacobian), as follows:
>>>>>> 
>>>>>> I don't think this makes sense. You use "mybase" to distinguish between 
>>>>>> 3 procs, which would indicate that each procs has only
>>>>>> 1 degree of freedom. However, you use x[1] on each proc, indicating it 
>>>>>> has at least 2 dofs.
>>>>>> 
>>>>>>   Thanks,
>>>>>> 
>>>>>>      Matt
>>>>>>  
>>>>>> static PetscErrorCode RHSFunction(TS ts,PetscReal t,Vec X,Vec F,void 
>>>>>> *ctx)
>>>>>> { 
>>>>>>   PetscErrorCode    ierr;
>>>>>>   AppCtx            *appctx = (AppCtx*) ctx;
>>>>>>   PetscScalar       f;//, *x_localptr; 
>>>>>>   const PetscScalar *x;
>>>>>>   PetscInt          mybase;
>>>>>>   
>>>>>>   PetscFunctionBeginUser;
>>>>>>   ierr = VecGetOwnershipRange(X,&mybase,NULL);CHKERRQ(ierr);
>>>>>>   ierr = VecGetArrayRead(X,&x);CHKERRQ(ierr);
>>>>>>   if (mybase == 0) {
>>>>>>     f    = (PetscScalar) (-appctx->p1*x[0]*x[1]/appctx->N);
>>>>>>     ierr = VecSetValues(F,1,&mybase,&f,INSERT_VALUES);
>>>>>>   }
>>>>>>   if (mybase == 1) {
>>>>>>     f    = (PetscScalar) 
>>>>>> (appctx->p1*x[0]*x[1]/appctx->N-appctx->p2*x[1]);
>>>>>>     ierr = VecSetValues(F,1,&mybase,&f,INSERT_VALUES);
>>>>>>   }
>>>>>>   if (mybase == 2) {
>>>>>>     f    = (PetscScalar) (appctx->p2*x[1]);
>>>>>>     ierr = VecSetValues(F,1,&mybase,&f,INSERT_VALUES);
>>>>>>   }
>>>>>>   ierr = VecRestoreArrayRead(X,&x);CHKERRQ(ierr);
>>>>>>   ierr = VecAssemblyBegin(F);CHKERRQ(ierr);
>>>>>>   ierr = VecAssemblyEnd(F);CHKERRQ(ierr);
>>>>>>   PetscFunctionReturn(0);
>>>>>> }
>>>>>> 
>>>>>> 
>>>>>> Whilst for the Jacobian I did:
>>>>>> 
>>>>>> 
>>>>>> static PetscErrorCode IJacobian(TS ts,PetscReal t,Vec X,Vec 
>>>>>> Xdot,PetscReal a,Mat A,Mat B,void *ctx)
>>>>>> { 
>>>>>>   PetscErrorCode    ierr;
>>>>>>   AppCtx            *appctx = (AppCtx*) ctx;
>>>>>>   PetscInt          mybase, rowcol[] = {0,1,2};
>>>>>>   const PetscScalar *x;
>>>>>>   
>>>>>>   PetscFunctionBeginUser;
>>>>>>   ierr = MatGetOwnershipRange(B,&mybase,NULL);CHKERRQ(ierr);
>>>>>>   ierr = VecGetArrayRead(X,&x);CHKERRQ(ierr);
>>>>>>   if (mybase == 0) {
>>>>>>     const PetscScalar J[] = {a + appctx->p1*x[1]/appctx->N, 
>>>>>> appctx->p1*x[0]/appctx->N, 0};
>>>>>>     ierr = 
>>>>>> MatSetValues(B,1,&mybase,3,rowcol,J,INSERT_VALUES);CHKERRQ(ierr);
>>>>>>   }
>>>>>>   if (mybase == 1) {
>>>>>>     const PetscScalar J[] = {- appctx->p1*x[1]/appctx->N, a - 
>>>>>> appctx->p1*x[0]/appctx->N + appctx->p2, 0};
>>>>>>     ierr = 
>>>>>> MatSetValues(B,1,&mybase,3,rowcol,J,INSERT_VALUES);CHKERRQ(ierr);
>>>>>>   }
>>>>>>   if (mybase == 2) {
>>>>>>     const PetscScalar J[] = {0, - appctx->p2, a};
>>>>>>     ierr = 
>>>>>> MatSetValues(B,1,&mybase,3,rowcol,J,INSERT_VALUES);CHKERRQ(ierr);
>>>>>>   }
>>>>>>   ierr    = VecRestoreArrayRead(X,&x);CHKERRQ(ierr);
>>>>>>   
>>>>>>   ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
>>>>>>   ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
>>>>>>   if (A != B) {
>>>>>>     ierr = MatAssemblyBegin(B,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
>>>>>>     ierr = MatAssemblyEnd(B,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
>>>>>>   }
>>>>>>   PetscFunctionReturn(0);
>>>>>> }
>>>>>> 
>>>>>> This code does not provide the correct result, that is, the solution is 
>>>>>> the initial condition, either using implicit or explicit methods. Is the 
>>>>>> way I defined these objects wrong? How can I fix it? 
>>>>>> I also tried to print the Jacobian with the following commands but it 
>>>>>> does not work (blank rows and error message). How should I print the 
>>>>>> Jacobian?
>>>>>> 
>>>>>> ierr = TSGetIJacobian(ts,NULL,&K, NULL, NULL); CHKERRQ(ierr);
>>>>>> ierr = MatAssemblyBegin(K,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
>>>>>> ierr = MatAssemblyEnd(K,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
>>>>>> ierr = MatView(K,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);               
>>>>>>              
>>>>>> 
>>>>>> I would very much appreciate any kind of help or advice.
>>>>>> Best,
>>>>>> Francesco
>>>>>> 
>>>>>>> Il giorno 2 apr 2021, alle ore 04:45, Barry Smith <[email protected] 
>>>>>>> <mailto:[email protected]>> ha scritto:
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>>> On Apr 1, 2021, at 9:17 PM, Zhang, Hong via petsc-users 
>>>>>>>> <[email protected] <mailto:[email protected]>> wrote:
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> On Mar 31, 2021, at 2:53 AM, Francesco Brarda 
>>>>>>>>> <[email protected] <mailto:[email protected]>> wrote:
>>>>>>>>> 
>>>>>>>>> Hi everyone!
>>>>>>>>> 
>>>>>>>>> I am trying to solve a system of 3 ODEs (a basic SIR model) with TS. 
>>>>>>>>> Sequentially works pretty well, but I need to switch it into a 
>>>>>>>>> parallel version. 
>>>>>>>>> I started working with TS not very long time ago, there are few 
>>>>>>>>> questions I’d like to share with you and if you have any advices I’d 
>>>>>>>>> be happy to hear.
>>>>>>>>> First of all, do I need to use a DM object even if the model is only 
>>>>>>>>> time dependent? All the examples I found were using that object for 
>>>>>>>>> the other variable when solving PDEs.
>>>>>>>> 
>>>>>>>> Are you considering SIR on a spatial domain? If so, you can 
>>>>>>>> parallelize your model in the spatial domain using DM. Splitting the 
>>>>>>>> three variables in the ODE among processors would not scale.
>>>>>>> 
>>>>>>>  Even though it will not scale and will deliver slower performance it 
>>>>>>> is completely possible for you to solve the 3 variable problem using 3 
>>>>>>> MPI ranks. Or 10 mpi ranks. You would just create vectors/matrices with 
>>>>>>> 1 degree of freedom for the first three ranks and no degrees of freedom 
>>>>>>> for the later ranks. During your function evaluation (and Jacobian 
>>>>>>> evaluation) for TS you will need to set up the appropriate 
>>>>>>> communication to get the values you need on each rank to evaluate the 
>>>>>>> parts of the function evaluation needed by that rank. This is true for 
>>>>>>> parallelizing any computation.
>>>>>>> 
>>>>>>>  Barry
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>>> 
>>>>>>>> Hong (Mr.)  
>>>>>>>> 
>>>>>>>>> When I preallocate the space for the Jacobian matrix, is it better to 
>>>>>>>>> decide the local or global space?
>>>>>>>>> 
>>>>>>>>> Best,
>>>>>>>>> Francesco
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> -- 
>>>>>> What most experimenters take for granted before they begin their 
>>>>>> experiments is infinitely more interesting than any results to which 
>>>>>> their experiments lead.
>>>>>> -- Norbert Wiener
>>>>>> 
>>>>>> https://www.cse.buffalo.edu/~knepley/ 
>>>>>> <https://www.cse.buffalo.edu/~knepley/>
>>>>> 
>>>>> 
>>>>> 
>>>>> -- 
>>>>> What most experimenters take for granted before they begin their 
>>>>> experiments is infinitely more interesting than any results to which 
>>>>> their experiments lead.
>>>>> -- Norbert Wiener
>>>>> 
>>>>> https://www.cse.buffalo.edu/~knepley/ 
>>>>> <https://www.cse.buffalo.edu/~knepley/>
>>>> 
>>>> 
>>>> 
>>>> -- 
>>>> Stefano
>>> 
>>> 
>>> 
>>> -- 
>>> What most experimenters take for granted before they begin their 
>>> experiments is infinitely more interesting than any results to which their 
>>> experiments lead.
>>> -- Norbert Wiener
>>> 
>>> https://www.cse.buffalo.edu/~knepley/ 
>>> <https://www.cse.buffalo.edu/~knepley/>
>> 
>

Re: [petsc-users] Parallel TS for ODE

Reply via email to