Danyang: You must access the inner PC and set the shift on it. See petsc/src/ksp/ksp/examples/tutorials/ex7.c
For example, I added the following to petsc/src/ksp/ksp/examples/tutorials/ex2.c at line 191:

  PetscBool isbjacobi;
  PC        pc;
  ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
  ierr = PetscObjectTypeCompare((PetscObject)pc,PCBJACOBI,&isbjacobi);CHKERRQ(ierr);
  if (isbjacobi) {
    PetscInt nlocal;
    KSP      *subksp;
    PC       subpc;
    ierr = KSPSetUp(ksp);CHKERRQ(ierr);
    ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
    /* Extract the array of KSP contexts for the local blocks */
    ierr = PCBJacobiGetSubKSP(pc,&nlocal,NULL,&subksp);CHKERRQ(ierr);
    ierr = PetscPrintf(PETSC_COMM_SELF,"isbjacobi, nlocal %D, setting option on subpc...\n",nlocal);CHKERRQ(ierr);
    for (i=0; i<nlocal; i++) {  /* i is declared in the surrounding ex2.c code */
      ierr = KSPGetPC(subksp[i],&subpc);CHKERRQ(ierr);
      ierr = PCFactorSetShiftType(subpc,MAT_SHIFT_NONZERO);CHKERRQ(ierr);
    }
  }

> Dear Hong and Barry,
>
> I have implemented this option in the code, since we also need to read the
> configuration from a file for convenience. When I pass the options on the
> command line, it works fine; however, when the code reads the options from
> the configuration file, it does not. The code has two sets of equations,
> flow and reactive, with the prefixes set to "flow_" and "react_". When I
> run the code using
>
> mpiexec -n 4 ../executable -flow_sub_pc_factor_shift_type nonzero
> -react_sub_pc_factor_shift_type nonzero
>
> it works. However, if I run using
>
> mpiexec -n 4 ../executable
>
> and let the executable read the options from the file, the shift set by
> "call PCFactorSetShiftType(pc_flow, MAT_SHIFT_NONZERO, ierr)" (or
> MAT_SHIFT_NONE, MAT_SHIFT_POSITIVE_DEFINITE, ...) just does not take
> effect. Am I missing something here?
>
> Below is the pseudocode I use for the flow equations; the reactive
> equations are handled similarly.
>
> call MatCreateAIJ(Petsc_Comm_World,nndof,nndof,nngbldof, &
>                   nngbldof,d_nz,PETSC_NULL_INTEGER,o_nz, &
>                   PETSC_NULL_INTEGER,a_flow,ierr)
> CHKERRQ(ierr)
>
> call MatSetFromOptions(a_flow,ierr)
> CHKERRQ(ierr)
>
> call KSPCreate(Petsc_Comm_World, ksp_flow, ierr)
> CHKERRQ(ierr)
>
> call KSPAppendOptionsPrefix(ksp_flow,"flow_",ierr)
> CHKERRQ(ierr)
>
> call KSPSetInitialGuessNonzero(ksp_flow, &
>                                b_initial_guess_nonzero_flow, ierr)
> CHKERRQ(ierr)
>
> call KSPSetDM(ksp_flow,dmda_flow%da,ierr)
> CHKERRQ(ierr)
>
> call KSPSetDMActive(ksp_flow,PETSC_FALSE,ierr)
> CHKERRQ(ierr)
>
> !!!!*********CHECK IF READ OPTION FROM FILE*********!!!!
> if (read_option_from_file) then
>
>    call KSPSetType(ksp_flow, KSPGMRES, ierr)   ! or KSPBCGS or others...
>    CHKERRQ(ierr)
>
>    call KSPGetPC(ksp_flow, pc_flow, ierr)
>    CHKERRQ(ierr)
>
>    call PCSetType(pc_flow, PCBJACOBI, ierr)    ! or PCILU or PCJACOBI or PCHYPRE ...
>    CHKERRQ(ierr)
>
>    call PCFactorSetShiftType(pc_flow, MAT_SHIFT_NONZERO, ierr)  ! or MAT_SHIFT_NONE, MAT_SHIFT_POSITIVE_DEFINITE ...
>    CHKERRQ(ierr)
>
> end if
>
> call PCFactorGetMatSolverPackage(pc_flow, solver_pkg_flow, ierr)
> CHKERRQ(ierr)
>
> call compute_jacobian(rank,dmda_flow%da, &
>                       a_flow,a_in,ia_in,ja_in,nngl_in, &
>                       row_idx_l2pg,col_idx_l2pg, &
>                       b_non_interlaced)
>
> call KSPSetFromOptions(ksp_flow,ierr)
> CHKERRQ(ierr)
>
> call KSPSetUp(ksp_flow,ierr)
> CHKERRQ(ierr)
>
> call KSPSetUpOnBlocks(ksp_flow,ierr)
> CHKERRQ(ierr)
>
> call KSPSolve(ksp_flow,b_flow,x_flow,ierr)
> CHKERRQ(ierr)
>
> Thanks and Regards,
>
> Danyang
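A Fortran version of the same sub-PC fix, matching the pseudocode above, would look roughly like the sketch below. This is a minimal sketch, not a tested drop-in: ksp_flow and pc_flow are the handles from the pseudocode, the fixed bound of 128 on the subksp array is illustrative only (the PETSc Fortran interface needs a caller-allocated array large enough to hold all local sub-KSPs), and it is assumed to run after KSPSetFromOptions, replacing the direct PCFactorSetShiftType(pc_flow, ...) call, which does not reach the block-Jacobi sub-PCs.

      PetscErrorCode ierr
      PetscBool      isbjacobi
      PetscInt       nlocal, firstlocal, i
      PC             subpc
      KSP            subksp(128)   ! illustrative bound; must cover all local blocks

      ! The sub-KSPs only exist after setup
      call KSPSetUp(ksp_flow, ierr)
      CHKERRQ(ierr)
      call KSPGetPC(ksp_flow, pc_flow, ierr)
      CHKERRQ(ierr)
      call PetscObjectTypeCompare(pc_flow, PCBJACOBI, isbjacobi, ierr)
      CHKERRQ(ierr)
      if (isbjacobi) then
         ! Extract the KSP contexts for the local blocks and set the shift on each
         call PCBJacobiGetSubKSP(pc_flow, nlocal, firstlocal, subksp, ierr)
         CHKERRQ(ierr)
         do i = 1, nlocal
            call KSPGetPC(subksp(i), subpc, ierr)
            CHKERRQ(ierr)
            call PCFactorSetShiftType(subpc, MAT_SHIFT_NONZERO, ierr)
            CHKERRQ(ierr)
         end do
      end if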
> On 17-05-24 06:32 PM, Hong wrote:
>
> Remove your option '-vecload_block_size 10'.
> Hong
>
> On Wed, May 24, 2017 at 3:06 PM, Danyang Su <danyang...@gmail.com> wrote:
>
>> Dear Hong,
>>
>> I just tested with different numbers of processors for the same matrix.
>> It sometimes fails with "ERROR: Arguments are incompatible", depending on
>> the number of processors: it works fine using 4, 8, or 24 processors, but
>> fails using 16 or 48. The error information is attached. I tested this on
>> my local computer with 6 cores / 12 threads. Any suggestions on this?
>>
>> Thanks,
>>
>> Danyang
>>
>> On 17-05-24 12:28 PM, Danyang Su wrote:
>>
>> Hi Hong,
>>
>> Awesome. Thanks for testing the case. I will try your options for the
>> code and get back to you later.
>>
>> Regards,
>>
>> Danyang
>>
>> On 17-05-24 12:21 PM, Hong wrote:
>>
>> Danyang:
>> I tested your data. Your matrices encountered zero pivots, e.g.
>>
>> petsc/src/ksp/ksp/examples/tutorials (master)
>> $ mpiexec -n 24 ./ex10 -f0 a_react_in_2.bin -rhs b_react_in_2.bin
>> -ksp_monitor -ksp_error_if_not_converged
>>
>> [15]PETSC ERROR: Zero pivot in LU factorization:
>> http://www.mcs.anl.gov/petsc/documentation/faq.html#zeropivot
>> [15]PETSC ERROR: Zero pivot row 1249 value 2.05808e-14 tolerance 2.22045e-14
>> ...
>>
>> Adding the option '-sub_pc_factor_shift_type nonzero', I got
>>
>> mpiexec -n 24 ./ex10 -f0 a_react_in_2.bin -rhs b_react_in_2.bin
>> -ksp_monitor -ksp_error_if_not_converged -sub_pc_factor_shift_type nonzero
>> -mat_view ascii::ascii_info
>>
>> Mat Object: 24 MPI processes
>>   type: mpiaij
>>   rows=450000, cols=450000
>>   total: nonzeros=6991400, allocated nonzeros=6991400
>>   total number of mallocs used during MatSetValues calls =0
>>     not using I-node (on process 0) routines
>> 0 KSP Residual norm 5.849777711755e+01
>> 1 KSP Residual norm 6.824179430230e-01
>> 2 KSP Residual norm 3.994483555787e-02
>> 3 KSP Residual norm 6.085841461433e-03
>> 4 KSP Residual norm 8.876162583511e-04
>> 5 KSP Residual norm 9.407780665278e-05
>> Number of iterations = 5
>> Residual norm 0.00542891
>>
>> Hong
>>
>>> Hi Matt,
>>>
>>> Yes. The matrix is a 450000x450000 sparse matrix. Hypre takes hundreds
>>> of iterations, not in every timestep but in most of them. The matrix is
>>> not well conditioned, with nonzero entries ranging from 1.0e-29 to 1.0e2.
>>> I also double-checked whether anything was wrong in the parallel version;
>>> however, the matrix is the same as in the sequential version, up to some
>>> relatively very small round-off error. Usually for such ill-conditioned
>>> matrices a direct solver should be faster than an iterative solver,
>>> right? But when I use the sequential iterative solver with ILU
>>> preconditioning, developed almost 20 years ago by others, it converges
>>> fast with an appropriate factorization level. In other words, with 24
>>> processors using hypre, the speed is almost the same as the old
>>> sequential iterative solver on 1 processor.
>>>
>>> I use mostly the default configuration, which gives pretty good speedup
>>> for the general cases, and I am not sure whether I am missing something
>>> for this problem.
>>>
>>> Thanks,
>>>
>>> Danyang
>>>
>>> On 17-05-24 11:12 AM, Matthew Knepley wrote:
>>>
>>> On Wed, May 24, 2017 at 12:50 PM, Danyang Su <danyang...@gmail.com> wrote:
>>>
>>>> Hi Matthew and Barry,
>>>>
>>>> Thanks for the quick response.
>>>>
>>>> I also tried SuperLU and MUMPS; both work, but they are about four
>>>> times slower than ILU(dt) preconditioning through hypre, with the 24
>>>> processors I have tested.
>>>>
>>> You mean the total time is 4x? And you are taking hundreds of iterates?
>>> That seems hard to believe, unless you are dropping a huge number of
>>> elements.
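As an aside on the SuperLU/MUMPS runs mentioned above: a full sparse LU through MUMPS can be requested at run time with the options -pc_type lu -pc_factor_mat_solver_package mumps (option names from the PETSc 3.7 era of this thread; newer releases rename the latter), or in code. A hedged Fortran sketch, reusing the ksp_flow and pc_flow handles from the pseudocode earlier in the thread:

      ! Replace the block-Jacobi/ILU preconditioner with a full parallel LU via MUMPS
      call KSPGetPC(ksp_flow, pc_flow, ierr)
      CHKERRQ(ierr)
      call PCSetType(pc_flow, PCLU, ierr)
      CHKERRQ(ierr)
      call PCFactorSetMatSolverPackage(pc_flow, MATSOLVERMUMPS, ierr)
      CHKERRQ(ierr)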
>>>
>>>> When I look into the convergence information, the method using ILU(dt)
>>>> still takes 200 to 3000 linear iterations for each Newton iteration.
>>>> One reason is that this equation is hard to solve. For the general
>>>> cases, the same method works very well and gets very good speedup.
>>>>
>>> I do not understand what you mean here.
>>>
>>>> I also wonder whether I am using hypre correctly for this case. Is
>>>> there any way to check this, and is it possible to increase the
>>>> factorization level through hypre?
>>>>
>>> I don't know.
>>>
>>> Matt
>>>
>>>> Thanks,
>>>>
>>>> Danyang
>>>>
>>>> On 17-05-24 04:59 AM, Matthew Knepley wrote:
>>>>
>>>> On Wed, May 24, 2017 at 2:21 AM, Danyang Su <danyang...@gmail.com> wrote:
>>>>
>>>>> Dear All,
>>>>>
>>>>> I use PCFactorSetLevels for ILU and PCFactorSetFill for the other
>>>>> preconditioners in my code, to help with problems that the default
>>>>> options cannot solve. However, I found that the latter,
>>>>> PCFactorSetFill, does not take effect for my problem. The matrices and
>>>>> rhs as well as the solutions are available from the link below. I
>>>>> obtain the solution using the hypre preconditioner, which takes 7 and
>>>>> 38 iterations for matrix 1 and matrix 2 respectively. However, with
>>>>> any other preconditioner the solver just fails at the first matrix. I
>>>>> have tested these matrices using a native sequential solver (not
>>>>> PETSc) with ILU preconditioning. If I set the incomplete factorization
>>>>> level to 0, that sequential solver takes more than 100 iterations; if
>>>>> I increase the factorization level to 1 or more, it takes only a few
>>>>> iterations. This suggests the factorization level for these matrices
>>>>> should be increased. However, when I tried that in PETSc, it just did
>>>>> not work.
>>>>>
>>>>> The matrices and rhs can be obtained from the link below:
>>>>>
>>>>> https://eilinator.eos.ubc.ca:8443/index.php/s/CalUcq9CMeblk4R
>>>>>
>>>>> Would anyone help to check whether you can make this work by
>>>>> increasing the PC factorization level or fill?
>>>>>
>>>> We have ILU(k) supported in serial. However, ILU(dt), which takes a
>>>> drop tolerance, only works through Hypre:
>>>>
>>>> http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html
>>>>
>>>> I recommend you try SuperLU or MUMPS, which can both be downloaded
>>>> automatically by configure, and which do a full sparse LU.
>>>>
>>>> Thanks,
>>>>
>>>> Matt
>>>>
>>>>> Thanks and regards,
>>>>>
>>>>> Danyang
>>>>>
>>>> --
>>>> What most experimenters take for granted before they begin their
>>>> experiments is infinitely more interesting than any results to which
>>>> their experiments lead.
>>>> -- Norbert Wiener
>>>>
>>>> http://www.caam.rice.edu/~mk51/
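On the ILU(k) level question that opened this thread: with the default block-Jacobi/ILU setup, the level can be raised at run time with -sub_pc_factor_levels <k> (or -pc_factor_levels <k> in serial), or programmatically with PCFactorSetLevels on each sub-PC. A minimal hedged sketch, under the same assumptions as the sub-KSP example earlier in the thread (subksp and nlocal obtained from PCBJacobiGetSubKSP after KSPSetUp; the level of 2 is illustrative only):

      PetscInt level
      level = 2                    ! illustrative ILU(k) level
      do i = 1, nlocal
         call KSPGetPC(subksp(i), subpc, ierr)
         CHKERRQ(ierr)
         call PCSetType(subpc, PCILU, ierr)
         CHKERRQ(ierr)
         call PCFactorSetLevels(subpc, level, ierr)
         CHKERRQ(ierr)
      end do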