Re: [petsc-users] Parallel TS for ODE

Stefano Zampini Tue, 20 Apr 2021 13:07:40 -0700

> 
> 
> Thank you for the advices, I would just like to convert the code I already 
> have to see what might happen once parallelized.


You are not really listening to our advices. I can tell you what happens to 3 
coupled ODEs split  on 3 processes. The solver will be slower, by far.

> Do you think it is better to put the 3 equations into a 1d Distributed Array 
> with 3 dofs and run the job with multiple procs regardless of how many 
> equations I have? Is it possible? 
> 
> Thank you,
> Francesco
> 
>> Il giorno 20 apr 2021, alle ore 17:57, Stefano Zampini 
>> <[email protected] <mailto:[email protected]>> ha scritto:
>> 
>> It does not make sense to parallelize to 1 equation per process, unless that 
>> single equation per process is super super super costly.
>> Is this work you are doing used to understand PETSc parallelization 
>> strategy? if so, there are multiple examples in the sourcetree that you can 
>> look at to populate matrices and vectors in parallel
>> 
>> Il giorno mar 20 apr 2021 alle ore 17:52 Francesco Brarda 
>> <[email protected] <mailto:[email protected]>> ha scritto:
>> In principle the entire code was for 1 proc only. The functions were built 
>> with VecGetArray(). While adapting the code for multiple procs I thought 
>> using VecGetOwnershipRange was a possible way to allocate the equations in 
>> the vector using multiple procs. What do you think, please? 
>> 
>> Thank you,
>> Francesco
>> 
>>> Il giorno 20 apr 2021, alle ore 16:43, Matthew Knepley <[email protected] 
>>> <mailto:[email protected]>> ha scritto:
>>> 
>>> On Tue, Apr 20, 2021 at 10:41 AM Francesco Brarda 
>>> <[email protected] <mailto:[email protected]>> wrote:
>>> I was trying to follow Barry's advice some time ago, but I guess that's not 
>>> the way he meant it. How should I refer to the values contained in x? With 
>>> Distributed Arrays?
>>> 
>>> That is how you get values from x. However, I cannot understand at all what 
>>> you are doing with "mybase".
>>> 
>>>    Matt
>>>  
>>> Thanks 
>>> Francesco
>>> 
>>>>>  Even though it will not scale and will deliver slower performance it is 
>>>>> completely possible for you to solve the 3 variable problem using 3 MPI 
>>>>> ranks. Or 10 mpi ranks. You would just create vectors/matrices with 1 
>>>>> degree of freedom for the first three ranks and no degrees of freedom for 
>>>>> the later ranks. During your function evaluation (and Jacobian 
>>>>> evaluation) for TS you will need to set up the appropriate communication 
>>>>> to get the values you need on each rank to evaluate the parts of the 
>>>>> function evaluation needed by that rank. This is true for parallelizing 
>>>>> any computation.
>>>>> 
>>>>>  Barry
>>> 
>>> 
>>> 
>>>> Il giorno 20 apr 2021, alle ore 15:40, Matthew Knepley <[email protected] 
>>>> <mailto:[email protected]>> ha scritto:
>>>> 
>>>> On Tue, Apr 20, 2021 at 9:36 AM Francesco Brarda 
>>>> <[email protected] <mailto:[email protected]>> wrote:
>>>> Hi!
>>>> I tried to implement the SIR model taking into account the fact that I 
>>>> will only use 3 MPI ranks at this moment.
>>>> I built vectors and matrices following the examples already available. In 
>>>> particular, I defined the functions required similarly (RHSFunction, 
>>>> IFunction, IJacobian), as follows:
>>>> 
>>>> I don't think this makes sense. You use "mybase" to distinguish between 3 
>>>> procs, which would indicate that each procs has only
>>>> 1 degree of freedom. However, you use x[1] on each proc, indicating it has 
>>>> at least 2 dofs.
>>>> 
>>>>   Thanks,
>>>> 
>>>>      Matt
>>>>  
>>>> static PetscErrorCode RHSFunction(TS ts,PetscReal t,Vec X,Vec F,void *ctx)
>>>> { 
>>>>   PetscErrorCode    ierr;
>>>>   AppCtx            *appctx = (AppCtx*) ctx;
>>>>   PetscScalar       f;//, *x_localptr; 
>>>>   const PetscScalar *x;
>>>>   PetscInt          mybase;
>>>>   
>>>>   PetscFunctionBeginUser;
>>>>   ierr = VecGetOwnershipRange(X,&mybase,NULL);CHKERRQ(ierr);
>>>>   ierr = VecGetArrayRead(X,&x);CHKERRQ(ierr);
>>>>   if (mybase == 0) {
>>>>     f    = (PetscScalar) (-appctx->p1*x[0]*x[1]/appctx->N);
>>>>     ierr = VecSetValues(F,1,&mybase,&f,INSERT_VALUES);
>>>>   }
>>>>   if (mybase == 1) {
>>>>     f    = (PetscScalar) (appctx->p1*x[0]*x[1]/appctx->N-appctx->p2*x[1]);
>>>>     ierr = VecSetValues(F,1,&mybase,&f,INSERT_VALUES);
>>>>   }
>>>>   if (mybase == 2) {
>>>>     f    = (PetscScalar) (appctx->p2*x[1]);
>>>>     ierr = VecSetValues(F,1,&mybase,&f,INSERT_VALUES);
>>>>   }
>>>>   ierr = VecRestoreArrayRead(X,&x);CHKERRQ(ierr);
>>>>   ierr = VecAssemblyBegin(F);CHKERRQ(ierr);
>>>>   ierr = VecAssemblyEnd(F);CHKERRQ(ierr);
>>>>   PetscFunctionReturn(0);
>>>> }
>>>> 
>>>> 
>>>> Whilst for the Jacobian I did:
>>>> 
>>>> 
>>>> static PetscErrorCode IJacobian(TS ts,PetscReal t,Vec X,Vec Xdot,PetscReal 
>>>> a,Mat A,Mat B,void *ctx)
>>>> { 
>>>>   PetscErrorCode    ierr;
>>>>   AppCtx            *appctx = (AppCtx*) ctx;
>>>>   PetscInt          mybase, rowcol[] = {0,1,2};
>>>>   const PetscScalar *x;
>>>>   
>>>>   PetscFunctionBeginUser;
>>>>   ierr = MatGetOwnershipRange(B,&mybase,NULL);CHKERRQ(ierr);
>>>>   ierr = VecGetArrayRead(X,&x);CHKERRQ(ierr);
>>>>   if (mybase == 0) {
>>>>     const PetscScalar J[] = {a + appctx->p1*x[1]/appctx->N, 
>>>> appctx->p1*x[0]/appctx->N, 0};
>>>>     ierr = 
>>>> MatSetValues(B,1,&mybase,3,rowcol,J,INSERT_VALUES);CHKERRQ(ierr);
>>>>   }
>>>>   if (mybase == 1) {
>>>>     const PetscScalar J[] = {- appctx->p1*x[1]/appctx->N, a - 
>>>> appctx->p1*x[0]/appctx->N + appctx->p2, 0};
>>>>     ierr = 
>>>> MatSetValues(B,1,&mybase,3,rowcol,J,INSERT_VALUES);CHKERRQ(ierr);
>>>>   }
>>>>   if (mybase == 2) {
>>>>     const PetscScalar J[] = {0, - appctx->p2, a};
>>>>     ierr = 
>>>> MatSetValues(B,1,&mybase,3,rowcol,J,INSERT_VALUES);CHKERRQ(ierr);
>>>>   }
>>>>   ierr    = VecRestoreArrayRead(X,&x);CHKERRQ(ierr);
>>>>   
>>>>   ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
>>>>   ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
>>>>   if (A != B) {
>>>>     ierr = MatAssemblyBegin(B,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
>>>>     ierr = MatAssemblyEnd(B,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
>>>>   }
>>>>   PetscFunctionReturn(0);
>>>> }
>>>> 
>>>> This code does not provide the correct result, that is, the solution is 
>>>> the initial condition, either using implicit or explicit methods. Is the 
>>>> way I defined these objects wrong? How can I fix it? 
>>>> I also tried to print the Jacobian with the following commands but it does 
>>>> not work (blank rows and error message). How should I print the Jacobian?
>>>> 
>>>> ierr = TSGetIJacobian(ts,NULL,&K, NULL, NULL); CHKERRQ(ierr);
>>>> ierr = MatAssemblyBegin(K,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
>>>> ierr = MatAssemblyEnd(K,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
>>>> ierr = MatView(K,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);                 
>>>>            
>>>> 
>>>> I would very much appreciate any kind of help or advice.
>>>> Best,
>>>> Francesco
>>>> 
>>>>> Il giorno 2 apr 2021, alle ore 04:45, Barry Smith <[email protected] 
>>>>> <mailto:[email protected]>> ha scritto:
>>>>> 
>>>>> 
>>>>> 
>>>>>> On Apr 1, 2021, at 9:17 PM, Zhang, Hong via petsc-users 
>>>>>> <[email protected] <mailto:[email protected]>> wrote:
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> On Mar 31, 2021, at 2:53 AM, Francesco Brarda 
>>>>>>> <[email protected] <mailto:[email protected]>> wrote:
>>>>>>> 
>>>>>>> Hi everyone!
>>>>>>> 
>>>>>>> I am trying to solve a system of 3 ODEs (a basic SIR model) with TS. 
>>>>>>> Sequentially works pretty well, but I need to switch it into a parallel 
>>>>>>> version. 
>>>>>>> I started working with TS not very long time ago, there are few 
>>>>>>> questions I’d like to share with you and if you have any advices I’d be 
>>>>>>> happy to hear.
>>>>>>> First of all, do I need to use a DM object even if the model is only 
>>>>>>> time dependent? All the examples I found were using that object for the 
>>>>>>> other variable when solving PDEs.
>>>>>> 
>>>>>> Are you considering SIR on a spatial domain? If so, you can parallelize 
>>>>>> your model in the spatial domain using DM. Splitting the three variables 
>>>>>> in the ODE among processors would not scale.
>>>>> 
>>>>>  Even though it will not scale and will deliver slower performance it is 
>>>>> completely possible for you to solve the 3 variable problem using 3 MPI 
>>>>> ranks. Or 10 mpi ranks. You would just create vectors/matrices with 1 
>>>>> degree of freedom for the first three ranks and no degrees of freedom for 
>>>>> the later ranks. During your function evaluation (and Jacobian 
>>>>> evaluation) for TS you will need to set up the appropriate communication 
>>>>> to get the values you need on each rank to evaluate the parts of the 
>>>>> function evaluation needed by that rank. This is true for parallelizing 
>>>>> any computation.
>>>>> 
>>>>>  Barry
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>>> 
>>>>>> Hong (Mr.)  
>>>>>> 
>>>>>>> When I preallocate the space for the Jacobian matrix, is it better to 
>>>>>>> decide the local or global space?
>>>>>>> 
>>>>>>> Best,
>>>>>>> Francesco
>>>> 
>>>> 
>>>> 
>>>> -- 
>>>> What most experimenters take for granted before they begin their 
>>>> experiments is infinitely more interesting than any results to which their 
>>>> experiments lead.
>>>> -- Norbert Wiener
>>>> 
>>>> https://www.cse.buffalo.edu/~knepley/ 
>>>> <http://www.cse.buffalo.edu/~knepley/>
>>> 
>>> 
>>> -- 
>>> What most experimenters take for granted before they begin their 
>>> experiments is infinitely more interesting than any results to which their 
>>> experiments lead.
>>> -- Norbert Wiener
>>> 
>>> https://www.cse.buffalo.edu/~knepley/ <http://www.cse.buffalo.edu/~knepley/>
>> 
>> 
>> -- 
>> Stefano
>

Re: [petsc-users] Parallel TS for ODE

Reply via email to