I just submitted the issue: https://gitlab.com/petsc/petsc/issues/535
What I really want is an exact Block Tri-diagonal solver on GPU. Since for block tridiagonal system, ILU0 would be the same as ILU. So I tried the chowiluviennacl. but I found that the default parameters does not produce the same ILU0 factorization as the CPU ones (PCILU). My guess is that if I increase the number of sweeps chow_patel_ilu_config.sweeps(3), it may give a better result. So the option Keys would be helpful. Since Mark mentioned the Superlu's GPU feature, can I use superlu or hypre's GPU functionality through PETSc? Thank you. Xiangdong On Wed, Jan 15, 2020 at 2:22 PM Matthew Knepley <[email protected]> wrote: > On Wed, Jan 15, 2020 at 1:48 PM Xiangdong <[email protected]> wrote: > >> In the ViennaCL manual >> http://viennacl.sourceforge.net/doc/manual-algorithms.html >> >> It did expose two parameters: >> >> // configuration of preconditioner: >> viennacl::linalg::chow_patel_tag chow_patel_ilu_config; >> chow_patel_ilu_config.sweeps(3); // three nonlinear sweeps >> chow_patel_ilu_config.jacobi_iters(2); // two Jacobi iterations per >> triangular 'solve' Rx=r >> >> and mentioned that: >> The number of nonlinear sweeps and Jacobi iterations need to be set >> problem-specific for best performance. >> >> In the PETSc' implementation: >> >> viennacl::linalg::chow_patel_tag ilu_tag; >> ViennaCLAIJMatrix *mat = (ViennaCLAIJMatrix*)gpustruct->mat; >> ilu->CHOWILUVIENNACL = new >> viennacl::linalg::chow_patel_ilu_precond<viennacl::compressed_matrix<PetscScalar> >> >(*mat, ilu_tag); >> >> The default is used. Is it possible to expose these two parameters so >> that user can change it through option keys? >> > > Yes. Do you mind making an issue for it? That way we can better keep track. > > https://gitlab.com/petsc/petsc/issues > > Thanks, > > Matt > > >> Thank you. >> >> Xiangdong >> >> On Wed, Jan 15, 2020 at 12:40 PM Matthew Knepley <[email protected]> >> wrote: >> >>> On Wed, Jan 15, 2020 at 9:59 AM Xiangdong <[email protected]> wrote: >>> >>>> Maybe I am not clear. I want to solve the block tridiagonal system >>>> Tx=b a few times with same T but different b. On CPU, I can have it by >>>> applying the ILU0 and reuse the factorization. Since it is block >>>> tridiagonal, ILU0 would give same results as LU. >>>> >>>> I am trying to do the same thing on GPU with chowiluviennacl, but found >>>> default factorization does not produce the exact factorization for >>>> tridiagonal system. Can we tight the drop off tolerance so that it can work >>>> as LU for tridiagonal system? >>>> >>> >>> There are no options in our implementation. You could look at the >>> ViennaCL manual to see if we missed something. >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Thank you. >>>> >>>> Xiangdong >>>> >>>> On Wed, Jan 15, 2020 at 9:41 AM Matthew Knepley <[email protected]> >>>> wrote: >>>> >>>>> On Wed, Jan 15, 2020 at 9:36 AM Xiangdong <[email protected]> wrote: >>>>> >>>>>> Can chowiluviennacl do ilu0? >>>>>> >>>>>> I need to solve a tri-diagonal system directly. If I apply the PCILU, >>>>>> I will obtain the exact solution with preonly + pcilu. However, the >>>>>> preonly + chowiluviennacl will not provide the exact solution. Any option >>>>>> keys to set the CHOWILUVIENNACL filling level or dropping off tolerance >>>>>> like the standard ilu? >>>>>> >>>>> >>>>> No. However, such a scheme makes less sense here. This algorithm >>>>> spawns a individual threads for individual elements. Drop tolerance >>>>> is not less work, it is sparser, but that should not matter for a >>>>> tridiagonal system. Levels also is not applicable since you have only 1 >>>>> level. >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> Thank you. >>>>>> >>>>>> Best, >>>>>> Xiangdong >>>>>> >>>>>> >>>>>> >>>>>> On Tue, Jan 14, 2020 at 10:05 PM Matthew Knepley <[email protected]> >>>>>> wrote: >>>>>> >>>>>>> On Tue, Jan 14, 2020 at 9:56 PM Xiangdong <[email protected]> >>>>>>> wrote: >>>>>>> >>>>>>>> Dear Developers, >>>>>>>> >>>>>>>> I have a quick question about the chowiluviennacl. When I tried to >>>>>>>> use it, I found that it only works for np=1, not np>1. However, in the >>>>>>>> description of chowiluviennacl.cxx, it says "the ViennaCL Chow-Patel >>>>>>>> parallel ILU preconditioner". >>>>>>>> >>>>>>> >>>>>>> By parallel, this means shared memory parallelism on the GPU. >>>>>>> >>>>>>> >>>>>>>> I am wondering whether I am using it correctly. >>>>>>>> Does chowiluviennacl work for np>1? >>>>>>>> >>>>>>> >>>>>>> I do not believe so. I do not see why it could not be extended, but >>>>>>> that would mean writing some more code. >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>> >>>>>>>> In addition, are there option keys for the chowiluviennacl one can >>>>>>>> try? >>>>>>>> Thank you. >>>>>>>> >>>>>>>> Best, >>>>>>>> Xiangdong >>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their >>>>>>> experiments is infinitely more interesting than any results to which >>>>>>> their >>>>>>> experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> >>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>> <http://www.cse.buffalo.edu/~knepley/> >>>>>>> >>>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> <http://www.cse.buffalo.edu/~knepley/> >>>>> >>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> <http://www.cse.buffalo.edu/~knepley/> >>> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > <http://www.cse.buffalo.edu/~knepley/> >
