In order to get good performance you need to compute d_nz and o_nz and call MatMPIAIJSetPreallocation().
Barry > On Nov 28, 2016, at 2:18 PM, Massoud Rezavand <[email protected]> > wrote: > > Hi, > > Thanks. > > As you know, in SPH method, the calculations are done over the neighboring > particles (j) that fall inside a support domain defined by a circle over the > particle of interest (i). Since the Lagrangian nature the method, the number > of neighboring particles are varying slightly over time, e.g. in a 2D domain > this number is varying between 43 to 51 (in my experience). > > The number of nonzeros per row (A_ij) is equal to the number of neighboring > particles and normally is not fixed over time, therefore, we put the elements > dynamically at each time step and we have to calculate d_nz and o_nz at each > time iteration. > > In order to preallocate the matrix, another way would be to calculate the > number of neighboring particles and set that as the number of nonzeros per > row. Doing so, do you recommend to use : > > MatMPIAIJSetPreallocation() > > to preallocate A to achieve the best performance? > > > Regards, > > Massoud > > > On 11/28/2016 06:36 PM, Barry Smith wrote: >>> On Nov 28, 2016, at 10:30 AM, Massoud Rezavand >>> <[email protected]> wrote: >>> >>> Dear Barry, >>> >>> You recommended me to directly use MatSetValues() and not to put the matrix >>> in a parallel CSR matrix. >>> >>> In order to count the d_nz and o_nz I have to put the entries into a >>> sequential CSR matrix >> If you don't know the number of nonzeros per row how are you going to put >> the values into a sequential CSR format? >> On the other hand if you can figure out the number of nonzeros per row >> without creating the matrix how come you cannot figure out the d_nz and o_nz? >> >> >>> and then do the MatMPIAIJSetPreallocation() and then do the MatSet Values(). >> If you do put the values into a sequential CSR format, which it is not >> clear to me is needed, then you can just call >> MatCreateMPIAIJWithArrays() and skip the "MatMPIAIJSetPreallocation() and >> then do the MatSet Values()" >> >> Barry >> >> >> >>> Does it effect the performance ? >>> >>> >>> Regards, >>> >>> Massoud >>> >>> On 11/21/2016 08:10 PM, Barry Smith wrote: >>>>> On Nov 21, 2016, at 12:20 PM, Massoud Rezavand >>>>> <[email protected]> wrote: >>>>> >>>>> Thank you very much. >>>>> >>>>> Yes I am developing the new 3D version in Parallel with the PPM (the new >>>>> generation OpenFPM, not released yet) library which generates the >>>>> particles and decomposes the domain. >>>>> >>>>> I don't have the parallel matrix generation yet. In the old version I had >>>>> CSR format and a vector of knowns (b). >>>>> So, should I use MatSetValuesStencil() ? >>>> MatSetValuesStencil is for finite differences on a structured grid. I >>>> don't think it makes sense for your application. >>>> >>>> You need to use MatMPIAIJSetPreallocation() and then MatSetValues() to >>>> put the entries in. >>>> >>>>> What do you recommend for creating the vector of knowns (b)? >>>> Just use VecCreateMPI() >>>>> On the other hand, due to the convergence issues for millions of >>>>> particles in ISPH, I have to use a preconditioner. In a paper I saw they >>>>> have used BoomerAMG from HYPRE. Do you have any recommendation? >>>> We have many to try, it is not clear that any would be particularly >>>> good for SPH. Certainly try BoomerAMG >>>> >>>>> I saw an example ( ex19.c) using BoomerAMG. Should I follow that? >>>>> >>>>> >>>>> PS: regarding the unbalance sparsity in SPH, yes in contrast to the >>>>> mesh-based methods, the A matrix in ISPH is changing over the time but >>>>> the number of non-zeros is defined by the number of neighboring particles >>>>> which in most cases is constant. >>>>> >>>>> Cheers, >>>>> >>>>> Massoud >>>>> >>>>> >>>>> >>>>> On 11/21/2016 06:18 PM, Barry Smith wrote: >>>>>>> On Nov 21, 2016, at 10:33 AM, Massoud Rezavand >>>>>>> <[email protected]> >>>>>>> wrote: >>>>>>> >>>>>>> Dear all, >>>>>>> >>>>>>> I am going to use PETSc in an Incompressible SPH code to solve the >>>>>>> pressure Poisson equation as a linear system of equations. >>>>>>> >>>>>>> In my old sequential code I used the PCG method or the BiCGSTAB with >>>>>>> jacobi preconditioner. >>>>>>> I used to store the coefficient matrix (A) in CSR (AIJ) format and >>>>>>> solve it. >>>>>>> >>>>>>> My question is that how should I import the CSR metrix and the known >>>>>>> vector (b) into the library to be solved? Is there an example to show >>>>>>> how to import and external system of eqs. into PETSc? >>>>>>> >>>>>> For sequential code it is straightforward. >>>>>> >>>>>> If you already have the matrix in CSR format you can call >>>>>> MatCreateSeqAIJWithArrays() to create the PETSc matrix without copying >>>>>> the data. You can use VecCreateSeqWithArray() to provide the vector. Or >>>>>> you can use VecPlaceArray() to use the array of vector values you >>>>>> provide. >>>>>> >>>>>> >>>>>>> In my case, the computational domain is decomposed by another library, >>>>>>> so does it effect the performance of PETSc? >>>>>>> >>>>>> I read this to mean you want the new code to be parallel (but the old >>>>>> one is sequential)? >>>>>> >>>>>> If you don't currently have matrix generation in parallel I urge you >>>>>> strongly to directly use MatSetValues() to generate your matrix, do not >>>>>> first put the matrix entries into some kind of parallel CSR format. If >>>>>> you already have the matrix in "parallel" CSR format you can use >>>>>> MatCreateMPIAIJWithArrays() to copy the matrix over to CSR format. >>>>>> >>>>>> It is my understanding that SPH can produce matrices with very >>>>>> unbalance sparsity. It is important to take this into account if you >>>>>> wish to run in parallel since if you end up with some processes having >>>>>> many more nonzeros than other processes you will get very poor >>>>>> performance. >>>>>> >>>>>> >>>>>> Barry >>>>>> >>>>>> >>>>>> >>>>>>> With the best regards, >>>>>>> Massoud >>>>>>> >>>>>>> >
