Hi Hong,

It works like a charm. I really appreciate your help.

Regards,

Danyang


On 17-05-25 07:49 AM, Hong wrote:
Danyang:
You must access the inner PC, then set the shift. See
petsc/src/ksp/ksp/examples/tutorials/ex7.c

For example, I added the following to petsc/src/ksp/ksp/examples/tutorials/ex2.c, line 191:
  PetscBool isbjacobi;
  PC        pc;
  ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
  ierr = PetscObjectTypeCompare((PetscObject)pc,PCBJACOBI,&isbjacobi);CHKERRQ(ierr);
  if (isbjacobi) {
    PetscInt nlocal,i;
    KSP      *subksp;
    PC       subpc;

    /* KSPSetUp() must be called before the block sub-KSP contexts exist */
    ierr = KSPSetUp(ksp);CHKERRQ(ierr);

    /* Extract the array of KSP contexts for the local blocks */
    ierr = PCBJacobiGetSubKSP(pc,&nlocal,NULL,&subksp);CHKERRQ(ierr);
    ierr = PetscPrintf(PETSC_COMM_WORLD,"isbjacobi, nlocal %D, set option to subpc...\n",nlocal);CHKERRQ(ierr);
    for (i=0; i<nlocal; i++) {
      /* set the zero-pivot shift on each block's factored preconditioner */
      ierr = KSPGetPC(subksp[i],&subpc);CHKERRQ(ierr);
      ierr = PCFactorSetShiftType(subpc,MAT_SHIFT_NONZERO);CHKERRQ(ierr);
    }
  }
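
The same shift can also be set from the options database without code changes:

  -sub_pc_factor_shift_type nonzero

where the "sub_" prefix addresses the block Jacobi sub-solvers.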


    Dear Hong and Barry,

    I have implemented this option in the code, as we also need to read
    the configuration from a file for convenience. When I pass the
    options on the command line it works fine; however, when the code
    reads them from the configuration file it does not. The code has two
    sets of equations, flow and reactive, with the prefixes set to
    "flow_" and "react_". When I run the code using

    mpiexec -n 4 ../executable -flow_sub_pc_factor_shift_type nonzero \
        -react_sub_pc_factor_shift_type nonzero

    it works. However, if I run using

    mpiexec -n 4 ../executable

    and let the executable read the options from the file, it fails at
    the call PCFactorSetShiftType(pc_flow,MAT_SHIFT_NONZERO,ierr)
    (likewise for the none or positive_definite shift types). Am I
    missing something here?

    Below is the pseudo code I use for the flow equations; the reactive
    equations are handled similarly.

          call MatCreateAIJ(Petsc_Comm_World,nndof,nndof,nngbldof, &
                            nngbldof,d_nz,PETSC_NULL_INTEGER,o_nz, &
                            PETSC_NULL_INTEGER,a_flow,ierr)
          CHKERRQ(ierr)

          call MatSetFromOptions(a_flow,ierr)
          CHKERRQ(ierr)

          call KSPCreate(Petsc_Comm_World,ksp_flow,ierr)
          CHKERRQ(ierr)

          call KSPAppendOptionsPrefix(ksp_flow,"flow_",ierr)
          CHKERRQ(ierr)

          call KSPSetInitialGuessNonzero(ksp_flow, &
                  b_initial_guess_nonzero_flow,ierr)
          CHKERRQ(ierr)

          call KSPSetDM(ksp_flow,dmda_flow%da,ierr)
          CHKERRQ(ierr)

          call KSPSetDMActive(ksp_flow,PETSC_FALSE,ierr)
          CHKERRQ(ierr)

          !!!!*********CHECK IF READ OPTION FROM FILE*********!!!!
          if (read_option_from_file) then

            call KSPSetType(ksp_flow,KSPGMRES,ierr)   !or KSPBCGS or others...
            CHKERRQ(ierr)

            call KSPGetPC(ksp_flow,pc_flow,ierr)
            CHKERRQ(ierr)

            call PCSetType(pc_flow,PCBJACOBI,ierr)    !or PCILU or PCJACOBI or PCHYPRE ...
            CHKERRQ(ierr)

            call PCFactorSetShiftType(pc_flow,MAT_SHIFT_NONZERO,ierr)   !or MAT_SHIFT_NONE, MAT_SHIFT_POSITIVE_DEFINITE ...
            CHKERRQ(ierr)

          end if

          call PCFactorGetMatSolverPackage(pc_flow,solver_pkg_flow,ierr)
          CHKERRQ(ierr)

          call compute_jacobian(rank,dmda_flow%da,                  &
                                a_flow,a_in,ia_in,ja_in,nngl_in,    &
                                row_idx_l2pg,col_idx_l2pg,          &
                                b_non_interlaced)

          call KSPSetFromOptions(ksp_flow,ierr)
          CHKERRQ(ierr)

          call KSPSetUp(ksp_flow,ierr)
          CHKERRQ(ierr)

          call KSPSetUpOnBlocks(ksp_flow,ierr)
          CHKERRQ(ierr)

          call KSPSolve(ksp_flow,b_flow,x_flow,ierr)
          CHKERRQ(ierr)
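
    For reference, a minimal C sketch of the options-file route (the
    file name "solver.opts" is a placeholder): if an options file is
    passed to PetscInitialize, its entries, including prefixed ones such
    as -flow_sub_pc_factor_shift_type nonzero, enter the options
    database and are later picked up by KSPSetFromOptions.

        /* sketch only; "solver.opts" is a hypothetical options file */
        #include <petscksp.h>

        int main(int argc,char **argv)
        {
          KSP            ksp;
          PetscErrorCode ierr;

          /* the third argument of PetscInitialize() names an options file */
          ierr = PetscInitialize(&argc,&argv,"solver.opts",NULL);if (ierr) return ierr;
          ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);
          ierr = KSPAppendOptionsPrefix(ksp,"flow_");CHKERRQ(ierr);
          /* reads any "flow_"-prefixed options from the database */
          ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
          ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
          ierr = PetscFinalize();
          return 0;
        }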


    Thanks and Regards,

    Danyang

    On 17-05-24 06:32 PM, Hong wrote:
    Remove your option '-vecload_block_size 10'.
    Hong

    On Wed, May 24, 2017 at 3:06 PM, Danyang Su <[email protected]> wrote:

        Dear Hong,

        I just tested the same matrix with different numbers of
        processors. It sometimes fails with "ERROR: Arguments are
        incompatible" depending on the processor count: it works fine
        with 4, 8, or 24 processors, but fails with 16 or 48. The
        error information is attached. I tested this on my local
        computer with 6 cores / 12 threads. Any suggestions?

        Thanks,

        Danyang


        On 17-05-24 12:28 PM, Danyang Su wrote:

        Hi Hong,

        Awesome. Thanks for testing the case. I will try your
        options for the code and get back to you later.

        Regards,

        Danyang


        On 17-05-24 12:21 PM, Hong wrote:
        Danyang:
        I tested your data.
        Your matrices encountered zero pivots, e.g.
        petsc/src/ksp/ksp/examples/tutorials (master)
        $ mpiexec -n 24 ./ex10 -f0 a_react_in_2.bin -rhs
        b_react_in_2.bin -ksp_monitor -ksp_error_if_not_converged

        [15]PETSC ERROR: Zero pivot in LU factorization:
        http://www.mcs.anl.gov/petsc/documentation/faq.html#zeropivot
        [15]PETSC ERROR: Zero pivot row 1249 value 2.05808e-14
        tolerance 2.22045e-14
        ...

        Adding option '-sub_pc_factor_shift_type nonzero', I got
        mpiexec -n 24 ./ex10 -f0 a_react_in_2.bin -rhs
        b_react_in_2.bin -ksp_monitor -ksp_error_if_not_converged
        -sub_pc_factor_shift_type nonzero -mat_view ascii::ascii_info

        Mat Object: 24 MPI processes
        type: mpiaij
        rows=450000, cols=450000
        total: nonzeros=6991400, allocated nonzeros=6991400
        total number of mallocs used during MatSetValues calls =0
        not using I-node (on process 0) routines
          0 KSP Residual norm 5.849777711755e+01
          1 KSP Residual norm 6.824179430230e-01
          2 KSP Residual norm 3.994483555787e-02
          3 KSP Residual norm 6.085841461433e-03
          4 KSP Residual norm 8.876162583511e-04
          5 KSP Residual norm 9.407780665278e-05
        Number of iterations =   5
        Residual norm 0.00542891

        Hong

            Hi Matt,

            Yes. The matrix is a 450000x450000 sparse matrix. hypre
            takes hundreds of iterations, not in every timestep but in
            most of them. The matrix is not well conditioned, with
            nonzero entries ranging from 1.0e-29 to 1.0e2. I also
            double-checked whether anything is wrong in the parallel
            version; the matrix is the same as in the sequential
            version except for some relatively small round-off error.
            Usually, for such ill-conditioned matrices, a direct
            solver should be faster than an iterative solver, right?
            But when I use a sequential iterative solver with ILU
            preconditioning, developed almost 20 years ago by others,
            it converges fast with an appropriate factorization level.
            In other words, with 24 processors and hypre the speed is
            almost the same as the old sequential iterative solver on
            1 processor.

            I use mostly the default configuration for the general
            case, with pretty good speedup, and I am not sure whether
            I am missing something for this problem.

            Thanks,

            Danyang


            On 17-05-24 11:12 AM, Matthew Knepley wrote:
            On Wed, May 24, 2017 at 12:50 PM, Danyang Su
            <[email protected]> wrote:

                Hi Matthew and Barry,

                Thanks for the quick response.

                I also tried SuperLU and MUMPS; both work, but they
                are about four times slower than the ILU(dt)
                preconditioner through hypre with the 24 processors I
                tested.

            You mean the total time is 4x? And you are taking
            hundreds of iterates? That seems hard to believe,
            unless you are dropping
            a huge number of elements.

                When I look at the convergence information, the
                method using ILU(dt) still takes 200 to 3000 linear
                iterations for each Newton iteration. One reason is
                that this equation is hard to solve. For the general
                cases, the same method works great and gets very good
                speedup.

            I do not understand what you mean here.

                I also wonder whether I am using hypre correctly for
                this case. Is there any way to check this, or is it
                possible to increase the factorization level through
                hypre?

            I don't know.

              Matt

                Thanks,

                Danyang


                On 17-05-24 04:59 AM, Matthew Knepley wrote:
                On Wed, May 24, 2017 at 2:21 AM, Danyang Su
                <[email protected]> wrote:

                    Dear All,

                    I use PCFactorSetLevels for ILU and
                    PCFactorSetFill for other preconditioners in my
                    code, to help with problems that are hard to solve
                    with the default options. However, I found that
                    the latter, PCFactorSetFill, does not take effect
                    for my problem. The matrices and rhs, as well as
                    the solutions, are available from the link below.
                    I obtain the solution using the hypre
                    preconditioner; it takes 7 and 38 iterations for
                    matrix 1 and matrix 2, respectively. However, if I
                    use another preconditioner, the solver fails on
                    the first matrix. I have tested this matrix with a
                    native sequential solver (not PETSc) using ILU
                    preconditioning: with the incomplete factorization
                    level set to 0 that solver takes more than 100
                    iterations, while with level 1 or more it takes
                    only a few iterations. This suggests the
                    factorization level for these matrices should be
                    increased. However, when I tried that in PETSc, it
                    just does not work.
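
                    (For reference, a sketch of the corresponding
                    run-time options for the default PETSc ILU, which
                    I believe are -pc_type ilu -pc_factor_levels 1 in
                    serial, or -sub_pc_type ilu -sub_pc_factor_levels
                    1 for the block Jacobi sub-solvers in parallel.)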

                    Matrix and rhs can be obtained from the link
                    below.

                    https://eilinator.eos.ubc.ca:8443/index.php/s/CalUcq9CMeblk4R

                    Would anyone help check whether this can be made
                    to work by increasing the PC factorization level
                    or fill?


                We have ILU(k) supported in serial. However, ILU(dt),
                which takes a drop tolerance, only works through
                Hypre:

                http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html

                I recommend you try SuperLU or MUMPS, which can
                both be downloaded automatically by configure, and
                do a full sparse LU.
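
                For example, a sketch (the solver-package option name
                as of PETSc 3.7; both packages can be installed with
                the configure options --download-superlu_dist and
                --download-mumps):

                    mpiexec -n 24 ./ex10 -f0 a_react_in_2.bin -rhs \
                        b_react_in_2.bin -pc_type lu \
                        -pc_factor_mat_solver_package mumps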

                  Thanks,

                    Matt

                    Thanks and regards,

                    Danyang





                --
                What most experimenters take for granted before they
                begin their experiments is infinitely more interesting
                than any results to which their experiments lead.
                -- Norbert Wiener

                http://www.caam.rice.edu/~mk51/











