> > > Thank you for the advices, I would just like to convert the code I already > have to see what might happen once parallelized.
You are not really listening to our advices. I can tell you what happens to 3 coupled ODEs split on 3 processes. The solver will be slower, by far. > Do you think it is better to put the 3 equations into a 1d Distributed Array > with 3 dofs and run the job with multiple procs regardless of how many > equations I have? Is it possible? > > Thank you, > Francesco > >> Il giorno 20 apr 2021, alle ore 17:57, Stefano Zampini >> <[email protected] <mailto:[email protected]>> ha scritto: >> >> It does not make sense to parallelize to 1 equation per process, unless that >> single equation per process is super super super costly. >> Is this work you are doing used to understand PETSc parallelization >> strategy? if so, there are multiple examples in the sourcetree that you can >> look at to populate matrices and vectors in parallel >> >> Il giorno mar 20 apr 2021 alle ore 17:52 Francesco Brarda >> <[email protected] <mailto:[email protected]>> ha scritto: >> In principle the entire code was for 1 proc only. The functions were built >> with VecGetArray(). While adapting the code for multiple procs I thought >> using VecGetOwnershipRange was a possible way to allocate the equations in >> the vector using multiple procs. What do you think, please? >> >> Thank you, >> Francesco >> >>> Il giorno 20 apr 2021, alle ore 16:43, Matthew Knepley <[email protected] >>> <mailto:[email protected]>> ha scritto: >>> >>> On Tue, Apr 20, 2021 at 10:41 AM Francesco Brarda >>> <[email protected] <mailto:[email protected]>> wrote: >>> I was trying to follow Barry's advice some time ago, but I guess that's not >>> the way he meant it. How should I refer to the values contained in x? With >>> Distributed Arrays? >>> >>> That is how you get values from x. However, I cannot understand at all what >>> you are doing with "mybase". >>> >>> Matt >>> >>> Thanks >>> Francesco >>> >>>>> Even though it will not scale and will deliver slower performance it is >>>>> completely possible for you to solve the 3 variable problem using 3 MPI >>>>> ranks. Or 10 mpi ranks. You would just create vectors/matrices with 1 >>>>> degree of freedom for the first three ranks and no degrees of freedom for >>>>> the later ranks. During your function evaluation (and Jacobian >>>>> evaluation) for TS you will need to set up the appropriate communication >>>>> to get the values you need on each rank to evaluate the parts of the >>>>> function evaluation needed by that rank. This is true for parallelizing >>>>> any computation. >>>>> >>>>> Barry >>> >>> >>> >>>> Il giorno 20 apr 2021, alle ore 15:40, Matthew Knepley <[email protected] >>>> <mailto:[email protected]>> ha scritto: >>>> >>>> On Tue, Apr 20, 2021 at 9:36 AM Francesco Brarda >>>> <[email protected] <mailto:[email protected]>> wrote: >>>> Hi! >>>> I tried to implement the SIR model taking into account the fact that I >>>> will only use 3 MPI ranks at this moment. >>>> I built vectors and matrices following the examples already available. In >>>> particular, I defined the functions required similarly (RHSFunction, >>>> IFunction, IJacobian), as follows: >>>> >>>> I don't think this makes sense. You use "mybase" to distinguish between 3 >>>> procs, which would indicate that each procs has only >>>> 1 degree of freedom. However, you use x[1] on each proc, indicating it has >>>> at least 2 dofs. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> static PetscErrorCode RHSFunction(TS ts,PetscReal t,Vec X,Vec F,void *ctx) >>>> { >>>> PetscErrorCode ierr; >>>> AppCtx *appctx = (AppCtx*) ctx; >>>> PetscScalar f;//, *x_localptr; >>>> const PetscScalar *x; >>>> PetscInt mybase; >>>> >>>> PetscFunctionBeginUser; >>>> ierr = VecGetOwnershipRange(X,&mybase,NULL);CHKERRQ(ierr); >>>> ierr = VecGetArrayRead(X,&x);CHKERRQ(ierr); >>>> if (mybase == 0) { >>>> f = (PetscScalar) (-appctx->p1*x[0]*x[1]/appctx->N); >>>> ierr = VecSetValues(F,1,&mybase,&f,INSERT_VALUES); >>>> } >>>> if (mybase == 1) { >>>> f = (PetscScalar) (appctx->p1*x[0]*x[1]/appctx->N-appctx->p2*x[1]); >>>> ierr = VecSetValues(F,1,&mybase,&f,INSERT_VALUES); >>>> } >>>> if (mybase == 2) { >>>> f = (PetscScalar) (appctx->p2*x[1]); >>>> ierr = VecSetValues(F,1,&mybase,&f,INSERT_VALUES); >>>> } >>>> ierr = VecRestoreArrayRead(X,&x);CHKERRQ(ierr); >>>> ierr = VecAssemblyBegin(F);CHKERRQ(ierr); >>>> ierr = VecAssemblyEnd(F);CHKERRQ(ierr); >>>> PetscFunctionReturn(0); >>>> } >>>> >>>> >>>> Whilst for the Jacobian I did: >>>> >>>> >>>> static PetscErrorCode IJacobian(TS ts,PetscReal t,Vec X,Vec Xdot,PetscReal >>>> a,Mat A,Mat B,void *ctx) >>>> { >>>> PetscErrorCode ierr; >>>> AppCtx *appctx = (AppCtx*) ctx; >>>> PetscInt mybase, rowcol[] = {0,1,2}; >>>> const PetscScalar *x; >>>> >>>> PetscFunctionBeginUser; >>>> ierr = MatGetOwnershipRange(B,&mybase,NULL);CHKERRQ(ierr); >>>> ierr = VecGetArrayRead(X,&x);CHKERRQ(ierr); >>>> if (mybase == 0) { >>>> const PetscScalar J[] = {a + appctx->p1*x[1]/appctx->N, >>>> appctx->p1*x[0]/appctx->N, 0}; >>>> ierr = >>>> MatSetValues(B,1,&mybase,3,rowcol,J,INSERT_VALUES);CHKERRQ(ierr); >>>> } >>>> if (mybase == 1) { >>>> const PetscScalar J[] = {- appctx->p1*x[1]/appctx->N, a - >>>> appctx->p1*x[0]/appctx->N + appctx->p2, 0}; >>>> ierr = >>>> MatSetValues(B,1,&mybase,3,rowcol,J,INSERT_VALUES);CHKERRQ(ierr); >>>> } >>>> if (mybase == 2) { >>>> const PetscScalar J[] = {0, - appctx->p2, a}; >>>> ierr = >>>> MatSetValues(B,1,&mybase,3,rowcol,J,INSERT_VALUES);CHKERRQ(ierr); >>>> } >>>> ierr = VecRestoreArrayRead(X,&x);CHKERRQ(ierr); >>>> >>>> ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>>> ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>>> if (A != B) { >>>> ierr = MatAssemblyBegin(B,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>>> ierr = MatAssemblyEnd(B,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>>> } >>>> PetscFunctionReturn(0); >>>> } >>>> >>>> This code does not provide the correct result, that is, the solution is >>>> the initial condition, either using implicit or explicit methods. Is the >>>> way I defined these objects wrong? How can I fix it? >>>> I also tried to print the Jacobian with the following commands but it does >>>> not work (blank rows and error message). How should I print the Jacobian? >>>> >>>> ierr = TSGetIJacobian(ts,NULL,&K, NULL, NULL); CHKERRQ(ierr); >>>> ierr = MatAssemblyBegin(K,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>>> ierr = MatAssemblyEnd(K,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >>>> ierr = MatView(K,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr); >>>> >>>> >>>> I would very much appreciate any kind of help or advice. >>>> Best, >>>> Francesco >>>> >>>>> Il giorno 2 apr 2021, alle ore 04:45, Barry Smith <[email protected] >>>>> <mailto:[email protected]>> ha scritto: >>>>> >>>>> >>>>> >>>>>> On Apr 1, 2021, at 9:17 PM, Zhang, Hong via petsc-users >>>>>> <[email protected] <mailto:[email protected]>> wrote: >>>>>> >>>>>> >>>>>> >>>>>>> On Mar 31, 2021, at 2:53 AM, Francesco Brarda >>>>>>> <[email protected] <mailto:[email protected]>> wrote: >>>>>>> >>>>>>> Hi everyone! >>>>>>> >>>>>>> I am trying to solve a system of 3 ODEs (a basic SIR model) with TS. >>>>>>> Sequentially works pretty well, but I need to switch it into a parallel >>>>>>> version. >>>>>>> I started working with TS not very long time ago, there are few >>>>>>> questions I’d like to share with you and if you have any advices I’d be >>>>>>> happy to hear. >>>>>>> First of all, do I need to use a DM object even if the model is only >>>>>>> time dependent? All the examples I found were using that object for the >>>>>>> other variable when solving PDEs. >>>>>> >>>>>> Are you considering SIR on a spatial domain? If so, you can parallelize >>>>>> your model in the spatial domain using DM. Splitting the three variables >>>>>> in the ODE among processors would not scale. >>>>> >>>>> Even though it will not scale and will deliver slower performance it is >>>>> completely possible for you to solve the 3 variable problem using 3 MPI >>>>> ranks. Or 10 mpi ranks. You would just create vectors/matrices with 1 >>>>> degree of freedom for the first three ranks and no degrees of freedom for >>>>> the later ranks. During your function evaluation (and Jacobian >>>>> evaluation) for TS you will need to set up the appropriate communication >>>>> to get the values you need on each rank to evaluate the parts of the >>>>> function evaluation needed by that rank. This is true for parallelizing >>>>> any computation. >>>>> >>>>> Barry >>>>> >>>>> >>>>> >>>>> >>>>>> >>>>>> Hong (Mr.) >>>>>> >>>>>>> When I preallocate the space for the Jacobian matrix, is it better to >>>>>>> decide the local or global space? >>>>>>> >>>>>>> Best, >>>>>>> Francesco >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> <http://www.cse.buffalo.edu/~knepley/> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ <http://www.cse.buffalo.edu/~knepley/> >> >> >> -- >> Stefano >
