> On Jul 13, 2015, at 2:05 PM, Anthony Haas <[email protected]> wrote: > > Hi, > > I ran my program under Valgrind with 2 processors using the following: > > ${PETSC_DIR}/${PETSC_ARCH}/bin/mpiexec -n 2 valgrind --tool=memcheck > --leak-check=full --show-leak-kinds=all --num-callers=20 > --log-file=valgrind.log.%p ./BiGlobal_Spectral_Petsc.x -malloc off -ksp_type > preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -eps_view > -st_ksp_type preonly -st_pc_type lu -st_pc_factor_mat_solver_package > superlu_dist > > Note that in my program, I first use PETSc alone to solve a linear system of > equations (see module_baseflows.F90) and then later in the code I use SLEPc > to get the eigenvalues (see module_petsc.F90). My main program is called > BiGlobal_Spectral_Petsc.F90 > > I have attached the log files from Valgrind. It seems that quite a few errors > occur in pzgstrf. See valgrind.log.31371 and 31372. Do I have any control > over these errors?
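A note on that last question: if the reports really do originate inside the library rather than in your own code, Valgrind can at least be told to hide them so the rest of the log stays readable. A minimal sketch of a suppressions file, assuming the offending frames sit under pzgstrf; the entry name is a placeholder, and the actual error kind ("Cond" below is only an example) and frames should be copied from valgrind.log.31371/31372, e.g. by rerunning with --gen-suppressions=all:

   {
      suppress_pzgstrf_reports
      Memcheck:Cond
      ...
      fun:pzgstrf
   }

Saving such entries in a file (any name, e.g. superlu_dist.supp) and adding --suppressions=superlu_dist.supp to the valgrind command above filters those reports without affecting the checking of your own code.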
These are likely bugs in SuperLU_Dist. > Then starting from line 403 in valgrind.log.31371 and 458 in > valgrind.log.31372, I have a series of errors that seem to be related to a > VecCreateSeq and to a VecScatterCreateToZero (see in module_baseflows.F90). > Is there something I can do to avoid these errors? These are because you are not destroying all your vectors at the end. > > Thanks > > Anthony > > > > > > > > On 07/08/2015 01:49 PM, Barry Smith wrote: >> You should try running under valgrind, see: >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> >> You can also run in the debugger (yes this is tricky on a batch system >> but possible) and see exactly what triggers the floating point exception or >> when it "hangs" interrupt the program to see where it is "hanging". >> >> >> Barry >> >> >> >>> On Jul 8, 2015, at 12:34 PM, Anthony Paul Haas <[email protected]> >>> wrote: >>> >>> Hi, >>> >>> I have used the switch -mat_superlu_dist_parsymbfact in my pbs script. >>> However, although my program worked fine with sequential symbolic >>> factorization, I get one of the following 2 behaviors when I run with >>> parallel symbolic factorization (depending on the number of processors that >>> I use): >>> >>> 1) the program just hangs (it seems stuck in some subroutine ==> see >>> test.out-hangs) >>> 2) I get a floating point exception ==> see >>> test.out-floating-point-exception >>> >>> Note that as suggested in the Superlu manual, I use a power of 2 number of >>> procs. Are there any tunable parameters for the parallel symbolic >>> factorization? Note that when I build my sparse matrix, most elements I add >>> are nonzero of course but to simplify the programming, I also add a few >>> zero elements in the sparse matrix. I was thinking that maybe if the >>> parallel symbolic factorization proceed by block, there could be some >>> blocks where the pivot would be zero, hence creating the FPE?? >>> >>> Thanks, >>> >>> Anthony >>> >>> >>> >>> On Wed, Jul 8, 2015 at 6:46 AM, Xiaoye S. Li <[email protected]> wrote: >>> Did you find out how to change option to use parallel symbolic >>> factorization? Perhaps PETSc team can help. >>> >>> Sherry >>> >>> >>> On Tue, Jul 7, 2015 at 3:58 PM, Xiaoye S. Li <[email protected]> wrote: >>> Is there an inquiry function that tells you all the available options? >>> >>> Sherry >>> >>> On Tue, Jul 7, 2015 at 3:25 PM, Anthony Paul Haas <[email protected]> >>> wrote: >>> Hi Sherry, >>> >>> Thanks for your message. I have used superlu_dist default options. I did >>> not realize that I was doing serial symbolic factorization. That is >>> probably the cause of my problem. >>> Each node on Garnet has 60GB usable memory and I can run with 1,2,4,8,16 or >>> 32 core per node. >>> >>> So I should use: >>> >>> -mat_superlu_dist_r 20 >>> -mat_superlu_dist_c 32 >>> >>> How do you specify the parallel symbolic factorization option? is it >>> -mat_superlu_dist_matinput 1 >>> >>> Thanks, >>> >>> Anthony >>> >>> >>> On Tue, Jul 7, 2015 at 3:08 PM, Xiaoye S. Li <[email protected]> wrote: >>> For superlu_dist failure, this occurs during symbolic factorization. Since >>> you are using serial symbolic factorization, it requires the entire graph >>> of A to be available in the memory of one MPI task. How much memory do you >>> have for each MPI task? >>> >>> It won't help even if you use more processes. You should try to use >>> parallel symbolic factorization option. >>> >>> Another point. 
You set up process grid as: >>> Process grid nprow 32 x npcol 20 >>> For better performance, you should swap the grid dimensions. That is, it's >>> better to use 20 x 32; never give nprow larger than npcol. >>> >>> >>> Sherry >>> >>> >>> On Tue, Jul 7, 2015 at 1:27 PM, Barry Smith <[email protected]> wrote: >>> >>> I would suggest running a sequence of problems, 101 by 101, 111 by 111, >>> etc., and get the memory usage in each case (when you run out of memory you >>> can get NO useful information out about memory needs). You can then plot >>> memory usage as a function of problem size to get a handle on how much >>> memory it is using. You can also run on more and more processes (which >>> have a total of more memory) to see how large a problem you may be able to >>> reach. >>> >>> MUMPS also has an "out of core" version (which we have never used) that >>> could in theory let you get to large problems if you have lots of >>> disk space, but you are on your own figuring out how to use it. >>> >>> Barry >>> >>>> On Jul 7, 2015, at 2:37 PM, Anthony Paul Haas <[email protected]> >>>> wrote: >>>> >>>> Hi Jose, >>>> >>>> In my code, I first use PETSc to solve a linear system to get the baseflow >>>> (without using SLEPc) and then I use SLEPc to do the stability analysis of >>>> that baseflow. This is why there are some SLEPc options that are not used >>>> in test.out-superlu_dist-151x151 (when I am solving for the baseflow with >>>> PETSc only). I have attached a 101x101 case for which I get the >>>> eigenvalues. That case works fine. However, if I increase to 151x151, I get >>>> the error that you can see in test.out-superlu_dist-151x151 (similar error >>>> with mumps: see test.out-mumps-151x151, line 2918). If you look at the very >>>> end of the files test.out-superlu_dist-151x151 and test.out-mumps-151x151, >>>> you will see that the last info message printed is: >>>> >>>> On Processor (after EPSSetFromOptions) 0 memory: 0.65073152000E+08 >>>> =====> (see line 807 of module_petsc.F90) >>>> >>>> This means that the memory error probably occurs in the call to EPSSolve >>>> (see module_petsc.F90 line 810). I would like to evaluate how much memory >>>> is required by the most memory-intensive operation within EPSSolve. Since >>>> I am solving a generalized EVP, I would imagine that it would be the LU >>>> decomposition. But is there an accurate way of doing it? >>>> >>>> Before starting with iterative solvers, I would like to exploit direct >>>> solvers as much as I can. I tried GMRES with the default preconditioner at some >>>> point but I had convergence problems. What solver/preconditioner would you >>>> recommend for a generalized non-Hermitian (EPS_GNHEP) EVP? >>>> >>>> Thanks, >>>> >>>> Anthony >>>> >>>> On Tue, Jul 7, 2015 at 12:17 AM, Jose E. Roman <[email protected]> wrote: >>>> >>>> On 07/07/2015, at 02:33, Anthony Haas wrote: >>>> >>>>> Hi, >>>>> >>>>> I am computing eigenvalues using PETSc/SLEPc and superlu_dist for the LU >>>>> decomposition (my problem is a generalized eigenvalue problem). The code >>>>> runs fine for a grid with 101x101 but when I increase to 151x151, I get >>>>> the following error: >>>>> >>>>> Can't expand MemType 1: jcol 16104 (and then [NID 00037] 2015-07-06 >>>>> 19:19:17 Apid 31025976: OOM killer terminated this process.) >>>>> >>>>> It seems to be a memory problem. I monitor the memory usage as far as I >>>>> can and it seems that memory usage is pretty low.
The most memory >>>>> intensive part of the program is probably the LU decomposition in the >>>>> context of the generalized EVP. Is there a way to evaluate how much >>>>> memory will be required for that step? I am currently running the debug >>>>> version of the code which I would assume would use more memory? >>>>> >>>>> I have attached the output of the job. Note that the program uses twice >>>>> PETSc: 1) to solve a linear system for which no problem occurs, and, 2) >>>>> to solve the Generalized EVP with SLEPc, where I get the error. >>>>> >>>>> Thanks >>>>> >>>>> Anthony >>>>> <test.out-superlu_dist-151x151> >>>> In the output you are attaching there are no SLEPc objects in the report >>>> and SLEPc options are not used. It seems that SLEPc calls are skipped? >>>> >>>> Do you get the same error with MUMPS? Have you tried to solve linear >>>> systems with a preconditioned iterative solver? >>>> >>>> Jose >>>> >>>> >>>> <module_petsc.F90><test.out-mumps-151x151><test.out_superlu_dist-101x101><test.out-superlu_dist-151x151> >>> >>> >>> >>> >>> >>> <test.out-hangs><test.out-floating-point-exception> > > > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > !! SPECTRAL TEMPORAL BIGLOBAL SOLVER ! > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > > PROGRAM Biglobal_Spectral > > USE PETSC > USE TYPEDEF > USE MAPPINGS > USE BASEFLOW > USE WRITE_PRINT > USE FOURIER_CHEBYSHEV > > IMPLICIT NONE > > include 'mpif.h' > > ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -! > ! Variables Definition ! > ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -! > > !...Testing and dummy variables > real(dp) :: dummy > > !...Problem parameters > integer(i4b),parameter :: nx=11,ny=11,nxy=nx*ny,N=4*nxy > > !Grid size for 3D Poiseuille baseflow > integer(i4b),parameter :: nnx=nx,nny=ny > > real(dp) :: Re,alpha,beta,Lx,xmin,xmax,ymin,ymax,AA,xll,yll > > !...Problem Flags > integer(i4b) :: fourier,testcase,formulation,algorithm > > !Mapping Flags > integer(i4b) :: map,mapx,mapy > > !...Indices > integer(i4b) :: i,j,iii,jjj,ij,skip,imin,imax,jmin,jmax > > !...Output files > integer(i4b) :: file_num > character(128) :: filetype,filename,file_name > > !...Boundary conditions variables > integer(i4b) :: ct1,ct2,ct3,ct4 > integer(i4b),dimension(:),allocatable :: fst_bc,wall_bc,outlet_bc,inlet_bc > integer(i4b),dimension(:),allocatable :: bcond_px,bcond_py,bcond_pz > integer(i4b),dimension(:),allocatable :: bcond_u,bcond_v,bcond_w,bcond_p > integer(i4b),dimension(:),allocatable :: bcond > > !...Chebyshev related > real(dp),dimension(:),allocatable :: xi,yi,x,y,xfourier > > !...Baseflow variables > real(dp),dimension(:,:),allocatable :: usol,uxsol,uysol > real(dp),dimension(:,:),allocatable :: vsol,vxsol,vysol > real(dp),dimension(:,:),allocatable :: wsol,wxsol,wysol > > !For Hiemenz Baseflow > integer(i4b) :: npts_hz > real(dp) :: ymax_hz,epstol,si > > real(dp),dimension(:),allocatable :: eta > real(dp),dimension(:),allocatable :: u0,v0,v1,v2,v3 > real(dp),dimension(:),allocatable :: w0,w1,w2 > real(dp),dimension(:),allocatable :: etarev,u0rev,v0rev,w0rev > > real(dp),dimension(:,:),allocatable :: Ubar,Vbar,Wbar > > !...Grid variables > real(dp),dimension(:,:),allocatable :: xx,yy,xx1,yy1 > > !...Vectorization array > integer(i4b),dimension(:),allocatable :: li > > !...MPI > integer(i4b) :: rank,size,ierr > > !...Spline interpolation > !real(dp),dimension(:),allocatable :: bsp,csp,dsp > !real(dp),dimension(:),allocatable :: interp_u0,interp_v0,interp_w0 > > real(dp) :: seval > > 
!...External Subroutines > external seval > > ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -! > ! Beginning of program ! > ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -! > > call MPI_INIT(ierr) > > call MPI_COMM_SIZE(MPI_COMM_WORLD,size,ierr) > call MPI_COMM_RANK(MPI_COMM_WORLD,rank,ierr) > > if (rank==0) call cpu_time(start_total) > > !...Parameters for BiGlobal Computation > > testcase=2 ! 1 --> Plane Poiseuille Flow > ! 2 --> 3D Poiseuille Flow > ! 3 --> Attachment line flow (Hiemenz) > ! 4 --> Laminar Separation Bubble > ! 5 --> Artificially parallel boundary-layer (cf. > Rodriguez thesis) > > formulation = 1 ! 1 --> x,y inhomogeneous > ! 2 --> y,z inhomogeneous > > algorithm = 1 ! 1 --> QZ algorith > ! 2 --> Arnoldi algorithm > > > !...Arrays allocation > > allocate(li(nx)) > allocate(xi(nx),yi(ny)) > allocate(x(nx),y(ny)) > allocate(xfourier(nx)) > > allocate(uvec(nxy),uxvec(nxy),uyvec(nxy)) > allocate(vvec(nxy),vxvec(nxy),vyvec(nxy)) > allocate(wvec(nxy),wxvec(nxy),wyvec(nxy)) > > allocate(xx(nx,ny),yy(nx,ny)) > allocate(D1X(nx,nx),D2X(nx,nx)) > allocate(D1Y(ny,ny),D2Y(ny,ny)) > allocate(DMX(nx,nx,2),DMY(ny,ny,2)) > > !allocate(interp_u0(ny),interp_v0(ny),interp_w0(ny)) > > li = 0 > xi = 0.0d0 > yi = 0.0d0 > x = 0.0d0 > y = 0.0d0 > xfourier = 0.0d0 > uvec = 0.0d0 > uxvec = 0.0d0 > uyvec = 0.0d0 > vvec = 0.0d0 > vxvec = 0.0d0 > vyvec = 0.0d0 > wvec = 0.0d0 > wxvec = 0.0d0 > wyvec = 0.0d0 > xx = 0.0d0 > yy = 0.0d0 > D1X = 0.0d0 > D2X = 0.0d0 > D1Y = 0.0d0 > D2Y = 0.0d0 > DMX = 0.0d0 > DMY = 0.0d0 > > !...Test Cases Parameters > > if ( testcase == 2 ) then > > mapx=1 > mapy=0 > > Re=1000 > > if (formulation == 1)then > beta = pi > elseif (formulation == 2) then > alpha = pi > endif > > AA=1.0 > > xmin=-AA > xmax=AA > > ymin=-1.0 > ymax=1.0 > > !grid for baseflow > !nnx=nx > !nny=ny > > endif > > !...Working Array for Converting 2D Indices to 1D > > li=0 > > do i=1,nx > li(i)=(i-1)*ny > enddo > > !...Get Collocation Points and Chebyshev Differentiation Matrices > > call cheby(nx-1,xi,DMX) ! (cf. Spectral Methods in Matlab pages 53-55) > call cheby(ny-1,yi,DMY) > > xll=0 > > ! To get matrices D1X,D2X,D1Y,D2Y > call set_mappings(nx,ny,mapx,mapy,xmin,xmax,ymin,ymax,xi,yi,xll,yll,x,y) > > do i=1,nx > do j=1,ny > > xx(i,j)=x(i) ! not equivalent to matlab meshgrid!!! > yy(i,j)=y(j) ! not equivalent to matlab meshgrid!!! 
> > enddo > enddo > > > !...Compute Baseflow (3D Poiseuille) > > if ( testcase == 2 ) then > > allocate(xx1(nnx,nny),yy1(nnx,nny)) > > allocate(usol(nnx,nny),uxsol(nnx,nny),uysol(nnx,nny)) > allocate(vsol(nnx,nny),vxsol(nnx,nny),vysol(nnx,nny)) > allocate(wsol(nnx,nny),wxsol(nnx,nny),wysol(nnx,nny)) > > xx1 = 0.0d0 > yy1 = 0.0d0 > usol = 0.0d0 > uxsol = 0.0d0 > uysol = 0.0d0 > vsol = 0.0d0 > vxsol = 0.0d0 > vysol = 0.0d0 > wsol = 0.0d0 > wxsol = 0.0d0 > wysol = 0.0d0 > > if (rank == 0) then > WRITE(*,*) > WRITE(*,'(A,I3,A,I3)')'3D POISEUILLE BASEFLOW COMPUTED ON GRID WITH > NX =',nnx,' NY =',nny > WRITE(*,*) > call cpu_time(start) > endif > > !currently I am taking same grid for baseflow as for biglobal ==> no > interpolation > call Poiseuille_3D_PETSC(nnx,nny,AA,usol,xx1,yy1) > > if (rank==0) then > call cpu_time(finish) > print '("Time for baseflow computation = ",f15.4," > seconds.")',finish-start > > file_num=10 > file_name='check_baseflow_3D_poiseuille.dat' > > call > write_to_tecplot(file_num,file_name,nnx,nny,xx1,yy1,usol,usol,usol) > > endif > > vsol=0*usol > wsol=0*usol > > if (formulation==1) then > > wsol=usol > usol=0.*wsol > vsol=0.*wsol > > endif > > !Compute derivatives > uxsol = MATMUL(D1X,usol) > uysol = MATMUL(usol,TRANSPOSE(D1Y)) > > vxsol = MATMUL(D1X,vsol) > vysol = MATMUL(vsol,TRANSPOSE(D1Y)) > > wxsol = MATMUL(D1X,wsol) > wysol = MATMUL(wsol,TRANSPOSE(D1Y)) > > !Vectorize the matrices > do i=1,nx > do j=1,ny > > ij=li(i)+j > > uvec(ij) = usol(i,j) > vvec(ij) = vsol(i,j) > wvec(ij) = wsol(i,j) > > uxvec(ij) = uxsol(i,j) > vxvec(ij) = vxsol(i,j) > wxvec(ij) = wxsol(i,j) > > uyvec(ij) = uysol(i,j) > vyvec(ij) = vysol(i,j) > wyvec(ij) = wysol(i,j) > > enddo > enddo > > endif > > !call MPI_Barrier(MPI_COMM_WORLD,ierr)!;CHKERRQ(ierr) > > !...Build Matrices, Set BCs and Solve Problem > > call SetupSolveGEVP(nx,ny,nxy,alpha,beta,Re) > > if (rank==0) then > call cpu_time(finish_total) > print '("Total time = ",f15.4," seconds.")',finish_total-start_total > endif > > deallocate(li) > deallocate(xi,yi) > deallocate(x,y) > deallocate(xfourier) > > deallocate(uvec,uxvec,uyvec) > deallocate(vvec,vxvec,vyvec) > deallocate(wvec,wxvec,wyvec) > > deallocate(xx,yy) > deallocate(D1X,D2X) > deallocate(D1Y,D2Y) > deallocate(DMX,DMY) > > deallocate(xx1,yy1) > > deallocate(usol,uxsol,uysol) > deallocate(vsol,vxsol,vysol) > deallocate(wsol,wxsol,wysol) > > call MPI_FINALIZE(ierr) > > > END PROGRAM Biglobal_Spectral > > MODULE BASEFLOW > > USE TYPEDEF > USE WRITE_PRINT > USE FOURIER_CHEBYSHEV > > IMPLICIT NONE > > CONTAINS > > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
> > SUBROUTINE Poiseuille_3D_PETSC(nx,ny,AA,usol,xx,yy) > > #include <petsc/finclude/petscsys.h> > #include <petsc/finclude/petscvec.h> > #include <petsc/finclude/petscmat.h> > #include <petsc/finclude/petscmat.h90> > #include <petsc/finclude/petscvec.h90> > #include <petsc/finclude/petscpc.h> > #include <petsc/finclude/petscksp.h> > #include <petsc/finclude/petscviewer.h> > #include <petsc/finclude/petscviewer.h90> > > #include <slepc/finclude/slepcsys.h> > #include <slepc/finclude/slepceps.h> > > !...Regular Variables > > integer(i4b),intent(in) :: nx,ny > > real(dp),intent(in) :: AA > real(dp),dimension(nx,ny),intent(inout) :: usol,xx,yy > > integer(i4b) :: i,j,ij > integer(i4b) :: rank,size,ierr,source > integer(i4b) :: nxy,nxm2,nym2,nxym2 > > integer(i4b),dimension(nx) :: li > integer(i4b),dimension(nx-2) :: lim2 > > integer(i4b),dimension(:),allocatable :: loc > > real(dp) :: ymin,ymax,xmin,xmax > real(dp) :: dxidx,d2xidx2,dyidy,d2yidy2 > > real(dp),dimension(nx) :: xi,x > real(dp),dimension(ny) :: yi,y > > real(dp),dimension(nx,nx) :: D2XB > real(dp),dimension(ny,ny) :: D2YB > real(dp),dimension(nx,nx,2) :: DMXB > real(dp),dimension(ny,ny,2) :: DMYB > > real(dp),dimension(nx*ny) :: usol_vec > real(dp),dimension( nx-2,ny-2 ) :: sol_mat > > complex(dpc) :: U00 > complex(dpc) :: atest > complex(dpc),dimension(nx-2,nx-2) :: D2XBs > complex(dpc),dimension(ny-2,ny-2) :: D2YBs > complex(dpc),dimension( nx-2,ny-2 ) :: sol_mat_cmplx > complex(dpc),dimension(:),allocatable :: xwork1 > > !!$ allocate(li(nx)) > !!$ allocate(lim2(nx-2)) > > !!! NEED TO TRANSFORM ALL THESE ARRAYS TO ALLOCATABLE!!!!!! > > !...Petsc Variables > PetscInt ii,jj,ione,ct > PetscInt jmin,jmax > PetscInt Istart,Iend > PetscInt IstartVec,IendVec > > PetscReal tol > PetscScalar aone > > PetscViewer viewer > > PC pc > KSP ksp > VecScatter ctx > Mat DXX,DYY > !Mat DLOAD > Vec frhs,frhs2 > Vec sol,sol_seq > > > !For VecGetArrayReadF90 > !PetscScalar,pointer :: xx_v(:) !Fortran90 pointer to the array > PetscOffset i_x > PetscScalar sol_array(1) > > !...MPI > !PetscMPIInt rank,size > > ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -! > ! Beginning of program ! > ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -! > > call MPI_COMM_SIZE(MPI_COMM_WORLD,size,ierr) > call MPI_COMM_RANK(MPI_COMM_WORLD,rank,ierr) > > call PetscInitialize(PETSC_NULL_CHARACTER,ierr) > > !call PetscOptionsInsertString("-mat_superlu_dist_parsymbfact",ierr) > > !...Check if nz and ny are odd (necessary to set max velocity to 1) > > if ( mod(nx,2)==0 .or. mod(ny,2)==0 ) then > write(*,*) > STOP'STOPPING... IN 3D POISEUILLE BASEFLOW NZ,NY NEED TO BE ODD' > endif > > !...Set Domain Sizes and Initialize other variables > > xmin=-AA > xmax=AA > > ymin=-1 > ymax=1 > > ione=1 > > nxy=nx*ny > > !Variables to Build DXX and DYY > nxm2=nx-2 > nym2=ny-2 > nxym2 = nxm2*nym2 > > !...Mat to Vec and Vec to Mat conversion > > lim2=0 > do i=1,nxm2 > lim2(i) = (i-1)*nym2 > enddo > > !...Get collocation points and chebyshev differentiation matrices > > call cheby(nx-1,xi,DMXB) > call cheby(ny-1,yi,DMYB) > > ! x-mapping from [-1,1]->[-XMAX,XMAX] > x=xmax*xi > > ! 
y-mapping (from [-1,1]->[-1,1]) > y=yi > > !Compute the metric terms > dxidx = 1/xmax > d2xidx2 = 0 > > dyidy = 1 > d2yidy2 = 0 > > !Scale the differentiation matrix (due to the use of the mapping) > > !x-direction > D2XB = d2xidx2*DMXB(:,:,1) + dxidx**2*DMXB(:,:,2) > D2XBs = D2XB(2:nx-1,2:nx-1) > > !y-direction > D2YB = d2yidy2*DMYB(:,:,1) + dyidy**2*DMYB(:,:,2) > D2YBs = D2YB(2:ny-1,2:ny-1) > > xx=0.0 > yy=0.0 > > do i=1,nx > do j=1,ny > > xx(i,j)=x(i) > yy(i,j)=y(j) > > enddo > enddo > > !...Create Petsc right-hand-side vector and derivation matrices > call VecCreateSeq(PETSC_COMM_SELF,nxym2,sol_seq,ierr) > call VecCreateMPI(PETSC_COMM_WORLD,PETSC_DECIDE,nxym2,frhs,ierr) ! right > hand side vector > call VecCreateMPI(PETSC_COMM_WORLD,PETSC_DECIDE,nxym2,frhs2,ierr) ! right > hand side vector # 2 > > call > MatCreateAIJ(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,nxym2,nxym2,nx,PETSC_NULL_INTEGER,nx,PETSC_NULL_INTEGER,DXX,ierr) > call > MatCreateAIJ(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,nxym2,nxym2,ny,PETSC_NULL_INTEGER,ny,PETSC_NULL_INTEGER,DYY,ierr) > > > !...Determine which rows of the vector/matrix are locally owned. > call VecGetOwnershipRange(frhs,IstartVec,IendVec,ierr) > call MatGetOwnershipRange(DXX,Istart,Iend,ierr) > > if ( IstartVec /= Istart .or. IendVec /= Iend ) then > stop 'Problem in Baseflow Computation' > endif > > !...Fill in Sequential and Parallel Vectors > allocate(xwork1(Iend-Istart)) > allocate(loc(Iend-Istart)) > > ct=0 > do i=Istart,Iend-1 > ct=ct+1 > loc(ct)=i > xwork1(ct)=(-2.0d0,0.0d0) > enddo > > call VecSetValues(frhs,Iend-Istart,loc,xwork1,INSERT_VALUES,ierr) > call VecZeroEntries(sol_seq,ierr) > > !...Assemble Vectors > call VecAssemblyBegin(frhs,ierr) > call VecAssemblyEnd(frhs,ierr) > > call VecAssemblyBegin(sol_seq,ierr) > call VecAssemblyEnd(sol_seq,ierr) > > call VecDuplicate(frhs,sol,ierr) ! parallel solution vector > > !...Display vector > !call VecView(frhs,PETSC_VIEWER_STDOUT_WORLD,ierr) ! ==> seems very slow > > !...Build DXX in Parallel > > !!$ if (rank==0) then > !!$ call print_matrix('D2XB',nx,nx,D2XB) > !!$ call print_matrix('D2YB',ny,ny,D2YB) > !!$ > !!$ call print_matrix_cmplx('D2XBs',nx-2,nx-2,D2XBs) > !!$ call print_matrix_cmplx('D2YBs',ny-2,ny-2,D2YBs) > !!$ endif > > if (rank == 0) call cpu_time(start) > > do i=Istart,Iend-1 > > ii=1+floor((i+0.01)/nym2) > jj=1 > > jmin=mod(i,nym2) > jmax=jmin+(nxm2-1)*nym2 > > do j=jmin,jmax,nym2 > > call MatSetValues(DXX,ione,i,ione,j,D2XBs(ii,jj),INSERT_VALUES,ierr) > jj=jj+1 > > enddo > > enddo > > !...Build DYY in Parallel > > do i=Istart,Iend-1 > > ii=mod(i,nym2)+1 > jj=1 > > jmin=floor((i+0.01)/nym2)*nym2 > > do j=jmin,jmin+(nym2-1) > > call MatSetValues(DYY,ione,i,ione,j,D2YBs(ii,jj),INSERT_VALUES,ierr) > jj=jj+1 > > enddo > > enddo > > !...Assemble DXX and DYY > > call MatAssemblyBegin(DXX,MAT_FINAL_ASSEMBLY,ierr) > call MatAssemblyEnd(DXX,MAT_FINAL_ASSEMBLY,ierr) > > call MatAssemblyBegin(DYY,MAT_FINAL_ASSEMBLY,ierr) > call MatAssemblyEnd(DYY,MAT_FINAL_ASSEMBLY,ierr) > > if (rank==0) then > call cpu_time(finish) > print '("Time for build and assembly of baseflow operator = ",f15.4," > seconds.")',finish-start > endif > > !call MatView(DXX,PETSC_VIEWER_STDOUT_WORLD,ierr) > !call MatView(DYY,PETSC_VIEWER_STDOUT_WORLD,ierr) > > aone = (1.0d0,0.0d0) > > if (rank == 0) call cpu_time(start) > call MatAXPY(DXX,aone,DYY,DIFFERENT_NONZERO_PATTERN,ierr) ! 
on exit DXX = > a*DYY + DXX > if (rank==0) then > call cpu_time(finish) > print '("Time for Matrix summation = ",f15.4," seconds.")',finish-start > > endif > > > !...Write Matrix to ASCII and Binary Format > !!$ call PetscViewerASCIIOpen(PETSC_COMM_WORLD,"Amat.m",viewer,ierr) > !!$ call MatView(DXX,viewer,ierr) > !!$ call PetscViewerDestroy(viewer,ierr) > !!$ > !!$ call > PetscViewerBinaryOpen(PETSC_COMM_WORLD,"Amat_binary.m",FILE_MODE_WRITE,viewer,ierr) > !!$ call MatView(DXX,viewer,ierr) > !!$ call PetscViewerDestroy(viewer,ierr) > > !...Load a Matrix in Binary Format > !!$ call > PetscViewerBinaryOpen(PETSC_COMM_WORLD,"Amat_binary.m",FILE_MODE_READ,viewer,ierr) > !!$ call MatCreate(PETSC_COMM_WORLD,DLOAD,ierr) > !!$ call MatSetType(DLOAD,MATAIJ,ierr) > !!$ call MatLoad(DLOAD,viewer,ierr) > !!$ call PetscViewerDestroy(viewer,ierr) > > !call MatTranspose(DXX,MatReuse reuse,Mat *B) > !call MatView(DXX,PETSC_VIEWER_STDOUT_WORLD,ierr) > > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > !! SOLVE POISSON EQUATION FOR THE FIRST TIME !! > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > > !...Create linear solver context > call KSPCreate(PETSC_COMM_WORLD,ksp,ierr) > > !...Set operators. Here the matrix that defines the linear system also serves > as the preconditioning matrix. > !call KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN,ierr) !aha > commented and replaced by next line > call KSPSetOperators(ksp,DXX,DXX,ierr) ! remember: here DXX=DXX+DYY > > !call KSPSetOperators(ksp,DLOAD,DLOAD,ierr) ! remember: here DXX=DXX+DYY > > tol = 1.e-10 > !call > KSPSetTolerances(ksp,tol,PETSC_DEFAULT_REAL,PETSC_DEFAULT_REAL,PETSC_DEFAULT_INTEGER,ierr) > ! set relative tolerance and uses defualt for absolute and divergence tol > call > KSPSetTolerances(ksp,tol,tol,PETSC_DEFAULT_REAL,PETSC_DEFAULT_INTEGER,ierr) ! > set relative and absolute tolerances and uses defualt for divergence tol > > !...Set the Direct (LU) Solver > !call KSPSetType(ksp,KSPPREONLY,ierr) > !call KSPGetPC(ksp,pc,ierr) > !call PCSetType(pc,PCLU,ierr) > !call PCFactorSetMatSolverPackage(pc,MATSOLVERMUMPS,ierr) ! > MATSOLVERSUPERLU_DIST > > !...Set runtime options, e.g., -ksp_type <type> -pc_type <type> -ksp_monitor > -ksp_rtol <rtol> > !These options will override those specified above as long as > KSPSetFromOptions() is called _after_ > !any other customization routines. 
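> !   For reference, the concrete runtime options used earlier in this thread were
> !     -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist
> !   for this baseflow solve, and
> !     -st_ksp_type preonly -st_pc_type lu -st_pc_factor_mat_solver_package superlu_dist
> !   (optionally with -mat_superlu_dist_parsymbfact) for the SLEPc solve later in the listing.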
> > call KSPSetFromOptions(ksp,ierr) > > !...Sets an ADDITIONAL function to be called at every iteration to monitor > the residual/error etc > call > KSPMonitorSet(ksp,KSPMonitorTrueResidualNorm,PETSC_NULL_OBJECT,PETSC_NULL_FUNCTION,ierr) > > > !...Solve the linear system > call KSPSolve(ksp,frhs,sol,ierr) > > !...View solver info (could use -ksp_view instead) > > call KSPView(ksp,PETSC_VIEWER_STDOUT_WORLD,ierr) > > !...Extract Maximum Value (at center of duct) from Solution Vector > > call VecScatterCreateToZero(sol,ctx,sol_seq,ierr) > !scatter as many times as you need > call VecScatterBegin(ctx,sol,sol_seq,INSERT_VALUES,SCATTER_FORWARD,ierr) > call VecScatterEnd(ctx,sol,sol_seq,INSERT_VALUES,SCATTER_FORWARD,ierr) > !destroy scatter context and local vector when no longer needed > call VecScatterDestroy(ctx,ierr) > > !call VecView(sol_seq,PETSC_VIEWER_STDOUT_WORLD,ierr) > > call VecGetArray(sol_seq,sol_array,i_x,ierr) > > if (rank==0) then > > sol_mat_cmplx=0.0d0 > > do i=1,nxm2 > do j=1,nym2 > ij=lim2(i)+j > sol_mat_cmplx(i,j) = sol_array(i_x + ij) > enddo > enddo > > U00 = sol_mat_cmplx((nxm2+1)/2,(nym2+1)/2) > > endif > > source=0 > Call MPI_Bcast(U00,1,MPI_DOUBLE_COMPLEX,source,MPI_COMM_WORLD,ierr) > > !call VecGetArrayReadF90(sol,xx_v,ierr) ! requires #include > <petsc/finclude/petscvec.h90> > !atest = xx_v(3) > !call VecRestoreArrayReadF90(sol,xx_v,ierr) > !write(*,*)'atest',atest > > call VecRestoreArray(sol_seq,sol_array,i_x,ierr) > > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > !! SOLVE POISSON EQUATION FOR THE SECOND TIME !! > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > > !...Fill in New Right-Hand-Side Vector (frhs2) > call VecSetValues(frhs2,Iend-Istart,loc,xwork1/U00,INSERT_VALUES,ierr) > > !...Assemble New RHS Vector > call VecAssemblyBegin(frhs2,ierr) > call VecAssemblyEnd(frhs2,ierr) > > !...Solve System > call KSPSolve(ksp,frhs2,sol,ierr) > > !...Get Final Solution > call VecScatterCreateToZero(sol,ctx,sol_seq,ierr) > call VecScatterBegin(ctx,sol,sol_seq,INSERT_VALUES,SCATTER_FORWARD,ierr) > call VecScatterEnd(ctx,sol,sol_seq,INSERT_VALUES,SCATTER_FORWARD,ierr) > call VecScatterDestroy(ctx,ierr) > > call VecGetArray(sol_seq,sol_array,i_x,ierr) > > if (rank==0) then > > usol=0.0d0 > sol_mat_cmplx=0.0d0 > > do i=1,nxm2 > do j=1,nym2 > ij=lim2(i)+j > sol_mat_cmplx(i,j) = sol_array(i_x + ij) > enddo > enddo > > usol(2:nx-1,2:ny-1) = dble(sol_mat_cmplx) > > endif > > call VecRestoreArray(sol_seq,sol_array,i_x,ierr) > > !...Send usol to all other processes (vectorize, broadcast and matrixize) > li=0 > do i=1,nx > li(i) = (i-1)*ny > enddo > > if (rank==0) then > > do i=1,nx > do j=1,ny > ij=li(i)+j > usol_vec(ij)=usol(i,j) > enddo > enddo > > endif > > call MPI_Bcast(usol_vec,nx*ny,MPI_DOUBLE,source,MPI_COMM_WORLD,ierr) > > if (rank/=0) then > > do i=1,nx > do j=1,ny > ij=li(i)+j > usol(i,j)=usol_vec(ij) > enddo > enddo > > endif > > !...Free work space. All PETSc objects should be destroyed when they are no > longer needed. > > deallocate(loc) > deallocate(xwork1) > > call KSPDestroy(ksp,ierr) > call VecDestroy(frhs,ierr) > call VecDestroy(frhs2,ierr) > call VecDestroy(sol,ierr) > call VecDestroy(sol_seq,ierr) > call MatDestroy(DXX,ierr) > call MatDestroy(DYY,ierr) > > call PetscFinalize(ierr) > > > END SUBROUTINE Poiseuille_3D_PETSC > > > > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > > > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > !!!!!!! 
HIEMENZ FLOW SOLVER !!!!!!!! > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > > > !************************************************************************************************* > ! on Veltins: ifort hiemenz.f90 -convert big_endian -r8 -i4 -O3 -132 -o > hiemenz.x > ! > ************************************************************************************************ > !program hiemenz > > subroutine hiemenz(ymax,eps,n,si,eta,u0,v0,v1,v2,v3,w0,w1,w2) > > implicit none > > real(dp) eps,ymax,si > integer(i4b) n,i > > ! real(dp),allocatable,dimension(:) :: eta > ! real(dp),allocatable,dimension(:) :: u0,v0,v1,v2,v3 > ! real(dp),allocatable,dimension(:) :: w0,w1,w2 > > real(dp),dimension(n) :: eta > real(dp),dimension(n) :: u0,v0,v1,v2,v3 > real(dp),dimension(n) :: w0,w1,w2 > > write(*,*) > write(*,*) > '*******************************************************************' > write(*,*) ' HIEMENZ SOLVER > ' > write(*,*) > '*******************************************************************' > write(*,*) > > ! > !..... Numerical Parameters > ! > > !Domain height and number of points > > !ymax=100 > !n=500001 > > !Tolerance for Newton iteration > > !eps=1e-12 > > !Initial guess for Newton iteration: f''(0) > > !si = -1.232587656820134 > > ! > !..... Allocate arrays > ! > ! allocate(eta(n)) > ! allocate(u0(n),v0(n),v1(n),v2(n),v3(n)) > ! allocate(w0(n),w1(n),w2(n)) > > ! > !..... Compute solution > ! > > call hiemenz_fct(ymax,eps,n,si,eta,u0,v0,v1,v2,v3,w0,w1,w2) > > ! > !..... Write out solution in ASCII format > ! > > write(*,*) > write(*,*) 'Writing solution to hiemenz.dat' > write(*,*) > > open(unit=15,file='hiemenz.dat',form='formatted') > > do i=1,n > write(15,'(9E21.11)') > eta(i),u0(i),v0(i),v1(i),v2(i),v3(i),w0(i),w1(i),w2(i) > end do > > close(15) > > > > end subroutine hiemenz > > > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > > > subroutine hiemenz_fct(ymax,eps,n,si,eta,u0,v0,v1,v2,v3,w0,w1,w2) > > implicit none > > integer(i4b) i,iter,n,j > real(dp) ymax,res,p,eps,s,si > > real(dp),allocatable,dimension(:,:) :: v,w > > real(dp) eta(n) > real(dp) u0(n),v0(n),v1(n),v2(n),v3(n) > real(dp) w0(n),w1(n),w2(n) > > !external w_ode > > write(*,*) 'eps=',eps > write(*,*) 'ymax=',ymax > write(*,*) 'N=',n > write(*,*) > > ! > ! Allocate arrays > ! > allocate(v (6,n)) > allocate(w (5,n)) > > ! > ! Create grid > ! > do i=1,n > eta(i) = ymax * real(i-1)/real(n-1) > end do > ! > ! Newton-Raphson iteration for v > ! > iter = 0 > s = si > res = 1. > write(*,*) 'Newton-Raphson iteration for v: s=',s > > do while (res.gt.eps) > > iter=iter+1 > > call hiemenz_eval(s,eta,n,v) > > s = s - (v(2,n) + 1.0) / v(5,n) > res = abs(v(2,n) + 1.0) > > write(*,*) iter,s,res > > end do > > write(*,*) 'Number of iterations ',iter > write(*,*) > write(*,*) "v''(0)=",v(3,1) > > ! > ! Now compute w > ! > w(1,1) = 0. > w(2,1) = v(3,1) > > w(3,1) = 0. > w(4,1) = 0. > w(5,1) = v(3,1) > > do i=2,n > call rk4(w(1,i-1),w(1,i),eta(i)-eta(i-1),5,w_ode) > end do > > ! > ! Rescale w to match boundary conditions at infinity > ! > p = 1./w(1,n) > > do i=1,n > w(1,i) = w(1,i) * p > w(2,i) = w(2,i) * p > end do > > write(*,*) "w'(0)=",w(2,1) > > ! > ! Now compute v''', w'' from the RHS and copy all values to new arrays > ! 
> > do i=1,n > > u0(i) = -v(2,i) > v0(i) = v(1,i) > v1(i) = v(2,i) > v2(i) = v(3,i) > v3(i) = -v(2,i)*v(2,i) + v(1,i)*v(3,i) + 1.0 > w0(i) = w(1,i) > w1(i) = w(2,i) > w2(i) = w(2,i)*v(1,i) > > end do > > > deallocate(v,w) > > end subroutine hiemenz_fct > > > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > > > subroutine hiemenz_eval(s,eta,n,f) > > implicit none > > integer(i4b) n,i > real(dp) s > real(dp) eta(n),f(6,n) > > !external v_ode > > f(1,1) = 0. > f(2,1) = 0. > f(3,1) = s > > f(4,1) = 0. > f(5,1) = 0. > f(6,1) = 1. > > do i=2,n > call rk4(f(1,i-1),f(1,i),eta(i)-eta(i-1),6,v_ode) > end do > > end subroutine hiemenz_eval > > > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > > > subroutine v_ode(y,dy) > > implicit none > > real(dp) y(6),dy(6) > > dy(1) = y(2) > dy(2) = y(3) > dy(3) = -y(2)*y(2) + y(1)*y(3) + 1.0 > > dy(4) = y(5) > dy(5) = y(6) > dy(6) = y(3)*y(4) - 2.0*y(2)*y(5) + y(1)*y(6) > > end subroutine v_ode > > > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > > > subroutine w_ode(y,dy) > > implicit none > > real(dp) y(5),dy(5) > > dy(1) = y(2) > dy(2) = y(2)*y(3) > > dy(3) = y(4) > dy(4) = y(5) > dy(5) = -y(4)*y(4) + y(3)*y(5) + 1.0 > > end subroutine w_ode > > > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > > > subroutine rk4(y1,y2,h,nn,der) > ! > ! Perform one RK4 step with step size h of a n-vector > ! Input: y1(n) old step > ! h step size > ! nn vector size > ! der derivative function > ! Output: y2(nn) new step > ! > implicit none > > integer(i4b) nn > > real(dp) y1(nn),y2(nn),y(nn) > real(dp) k1(nn),k2(nn),k3(nn),k4(nn) > real(dp) h > > external der > ! > ! First RK substep > ! > > y(:) = y1(:) > call der(y,k1) > k1(:)=h*k1(:) > > ! > ! Second RK substep > ! > > y(:) = y1(:) + .5*k1(:) > call der(y,k2) > k2(:)=h*k2(:) > > ! > ! Third RK substep > ! > > y(:) = y1(:) + .5*k2(:) > call der(y,k3) > k3(:)=h*k3(:) > > ! > ! Fourth RK substep > ! > > y(:) = y1(:) + k3(:) > call der(y,k4) > k4(:)=h*k4(:) > > ! > ! Compose solution after full RK step > ! > y2(:) = y1(:) + ( k1(:) + 2.0*k2(:) + 2.0*k3(:) + k4(:) ) / 6.0 > > end subroutine rk4 > > > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > > > subroutine compute_baseflow_derivative(DXYZ,uvwvec,nx,ny,UVWxyz) > > implicit none > > integer(i4b) :: i,j > > integer(i4b),intent(in) :: nx,ny > > real(dp),dimension(nx*ny),intent(in) :: uvwvec > real(dp),dimension(nx*ny,nx*ny),intent(in) :: DXYZ > real(dp),dimension(nx*ny,nx*ny),intent(out) :: UVWxyz > > UVWxyz = 0.0 > > do i=1,nx*ny > do j=1,nx*ny > > UVWxyz(i,i) = UVWxyz(i,i) + DXYZ(i,j)*uvwvec(j) > > enddo > enddo > > end subroutine compute_baseflow_derivative > > > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > > > > subroutine compute_UVW_DXYZ(DXYZ,uvwvec,nx,ny,UVW_DXYZ) > > implicit none > > integer(i4b) :: i,j > > integer(i4b),intent(in) :: nx,ny > > real(dp),dimension(nx*ny),intent(in) :: uvwvec > real(dp),dimension(nx*ny,nx*ny),intent(in) :: DXYZ > real(dp),dimension(nx*ny,nx*ny),intent(out) :: UVW_DXYZ > > do i=1,nx*ny > do j=1,nx*ny > > UVW_DXYZ(i,j) = uvwvec(i)*DXYZ(i,j) > > enddo > enddo > > end subroutine compute_UVW_DXYZ > > > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > > > > > > END MODULE BASEFLOW > > > > > > > > > > > > > > > > > > > > > > > > > > > !!$ > !!$ > !!$!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > !!$!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > !!$ > !!$ SUBROUTINE Poiseuille_3D(nz,ny,AA,usol,zz,yy) > !!$ > !!$ character(128) :: file_name > !!$ integer(i4b) :: INFO,NRHS,ct,i,j,N,file_num,m1,n1,p1,q1 > !!$ > !!$ integer(i4b),intent(in) :: nz,ny > !!$ real(dp),intent(in) :: AA > !!$ > !!$ real(dp) :: ymin,ymax,zmin,zmax,dzidz,d2zidz2,dyidy,d2yidy2,U00 > !!$ real(dp) :: start,finish > !!$ > !!$ real(dp),dimension(nz,nz,2) :: DMZ > !!$ real(dp),dimension(ny,ny,2) :: DMY > !!$ > !!$ real(dp),dimension(nz,nz) :: D2Z > !!$ real(dp),dimension(ny,ny) :: D2Y > !!$ > !!$ real(dp),dimension(nz-2,nz-2) :: D2Zs > !!$ real(dp),dimension(ny-2,ny-2) :: D2Ys > !!$ > !!$ real(dp),dimension(nz-2,nz-2) :: IZ > !!$ real(dp),dimension(ny-2,ny-2) :: IY > !!$ > !!$ real(dp),dimension(nz,ny) :: usol,zz,yy > !!$ real(dp),dimension(nz-2,ny-2) :: utmp > !!$ > !!$ real(dp),dimension(nz) :: zi,z > !!$ real(dp),dimension(ny) :: yi,y > !!$ > !!$ real(dp),dimension( (nz-2)*(ny-2) ) :: frhs > !!$ real(dp),dimension( (nz-2)*(ny-2),(nz-2)*(ny-2) ) :: AMAT,DZZ,DYY > !!$ > !!$ integer(i4b),dimension((nz-2)*(ny-2)) :: IPIV > !!$ > !!$ frhs=0.0 > !!$ > !!$!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > !!$!!!!!!! 3D POISEUILLE FLOW !!!!!!!! > !!$!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > !!$ > !!$ zmin=-AA > !!$ zmax=AA > !!$ > !!$ ymin=-1 > !!$ ymax=1 > !!$ > !!$ ! > !!$ !! .....Create Grid > !!$ ! > !!$ > !!$ if ( mod(nz,2)==0 .or. mod(ny,2)==0 ) then > !!$ WRITE(*,*) > !!$ STOP'STOPPING... IN 3D POISEUILLE BASEFLOW NZ,NY NEED TO BE ODD' > !!$ endif > !!$ > !!$ ! > !!$ !! .....Get collocation points and chebyshev differentiation matrices > !!$ ! > !!$ > !!$ call cheby(nz-1,zi,DMZ) > !!$ call cheby(ny-1,yi,DMY) > !!$ > !!$ ! z-mapping from [-1,1]->[-ZMAX,ZMAX] > !!$ z=zmax*zi > !!$ > !!$ ! 
y-mapping (from [-1,1]->[-1,1]) > !!$ y=yi > !!$ > !!$ !Compute the metric terms > !!$ dzidz = 1/zmax > !!$ d2zidz2 = 0 > !!$ > !!$ dyidy = 1 > !!$ d2yidy2 = 0 > !!$ > !!$ !Scale the differentiation matrix (due to the use of the mapping) > !!$ > !!$ !z-direction > !!$ !D1Z = dzidz*DMZ(:,:,1) > !!$ D2Z = d2zidz2*DMZ(:,:,1) + dzidz**2*DMZ(:,:,2) > !!$ D2Zs = D2Z(2:nz-1,2:nz-1) > !!$ > !!$ !y-direction > !!$ !D1Y = dyidy*DMY(:,:,1) > !!$ D2Y = d2yidy2*DMY(:,:,1) + dyidy**2*DMY(:,:,2) > !!$ D2Ys = D2Y(2:ny-1,2:ny-1) > !!$ > !!$ IZ=0.0 > !!$ IY=0.0 > !!$ > !!$ do i=1,nz-2 > !!$ IZ(i,i)=1.0 > !!$ enddo > !!$ > !!$ do i=1,ny-2 > !!$ IY(i,i)=1.0 > !!$ enddo > !!$ > !!$ m1=size(IY,1) > !!$ n1=size(IY,2) > !!$ p1=size(D2Zs,1) > !!$ q1=size(D2Zs,2) > !!$ > !!$ call mykron(IY,D2Zs,m1,n1,p1,q1,DZZ) > !!$ > !!$ m1=size(D2Ys,1) > !!$ n1=size(D2Ys,2) > !!$ p1=size(IZ,1) > !!$ q1=size(IZ,2) > !!$ > !!$ call mykron(D2Ys,IZ,m1,n1,p1,q1,DYY) > !!$ > !!$ AMAT = DYY + DZZ > !!$ > !!$ !CALL PRINT_MATRIX( 'DZZ', (nz-2)*(ny-2), (nz-2)*(ny-2), DZZ ) > !!$ !CALL PRINT_MATRIX( 'DYY', (nz-2)*(ny-2), (nz-2)*(ny-2),DYY ) > !!$ > !!$ > !!$!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > !!$ !! SOLVE POISSON EQUATION FOR THE FIRST TIME !! > !!$!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > !!$ > !!$ > !!$ > !!$ do i=1,(ny-2)*(nz-2) > !!$ frhs(i)=-2.0 > !!$ enddo > !!$ > !!$ INFO=0 > !!$ NRHS=1 ! only 1 rhs here > !!$ N=(nz-2)*(ny-2) > !!$ > !!$ > !!$ ! > !!$ ! Compute the LU factorization of A. > !!$ ! > !!$ > !!$ ! Note: on exit of dgetrf, AMAT is not AMAT anymore but contains the > factors L and U > !!$ ! from the factorization A = P*L*U; the unit diagonal elements of L > are not stored. > !!$ > !!$ !call PRINT_MATRIX( 'AMAT', N, N, AMAT ) > !!$ > !!$ > !!$ call cpu_time(start) > !!$ CALL dgetrf( N, N, AMAT, N, IPIV, INFO ) > !!$ call cpu_time(finish) > !!$ > !!$ !write(*,*)'AMAT AFTER',AMAT > !!$ > !!$ print '("Time LU = ",f15.4," seconds.")',finish-start > !!$ > !!$ IF( INFO .NE. 0 ) THEN > !!$ WRITE(*,*)'INFO',INFO > !!$ STOP 'PROBLEM IN LAPACK (Poiseuille_3D)' > !!$ ENDIF > !!$ > !!$ > !!$ ! > !!$ ! Solve the system A*X = B, overwriting B with X. > !!$ ! > !!$ > !!$ IF( INFO.EQ.0 ) THEN > !!$ > !!$ call cpu_time(start) > !!$ CALL dgetrs( 'No transpose', n, nrhs, AMAT, N, IPIV, frhs, N, INFO ) > !!$ call cpu_time(finish) > !!$ > !!$ print '("Time Linear System Solve = ",f15.4," > seconds.")',finish-start > !!$ write(*,*) > !!$ > !!$ END IF > !!$ > !!$ ct=0 > !!$ do j=1,ny-2 > !!$ do i=1,nz-2 > !!$ > !!$ ct=ct+1 > !!$ utmp(i,j) = frhs(ct) > !!$ > !!$ enddo > !!$ enddo > !!$ > !!$ usol = 0.0 > !!$ usol(2:nz-1,2:ny-1) = utmp(:,:) > !!$ > !!$ zz=0.0 > !!$ yy=0.0 > !!$ > !!$ do i=1,nz > !!$ do j=1,ny > !!$ > !!$ zz(i,j)=z(i) > !!$ yy(i,j)=y(j) > !!$ > !!$ enddo > !!$ enddo > !!$ > !!$ U00 = usol( (nz+1)/2 , (ny+1)/2 ) > !!$ > !!$ > !!$ > !!$!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > !!$!! SOLVE POISSON EQUATION FOR THE SECOND TIME !! > !!$!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > !!$ > !!$ > !!$ utmp=0.0 > !!$ usol=0.0 > !!$ frhs=0.0 > !!$ > !!$ !New,scaled rhs > !!$ do i=1,(ny-2)*(nz-2) > !!$ frhs(i)=-2.0/U00 > !!$ enddo > !!$ > !!$ ! > !!$ ! Solve the system A*X = B, overwriting B with X. > !!$ ! > !!$ > !!$ ! Note: here we already have the LU factorization so that we only need > a call to dgetrs > !!$ > !!$ CALL dgetrs( 'No transpose', N, NRHS, AMAT, N, IPIV, frhs, N, INFO ) > !!$ > !!$ IF( INFO .NE. 
0 ) THEN > !!$ WRITE(*,*)'INFO',INFO > !!$ STOP 'PROBLEM IN LAPACK (Poiseuille_3D -- solve 2)' > !!$ ENDIF > !!$ > !!$ !write(*,*)'frhs',frhs > !!$ > !!$ ct=0 > !!$ do j=1,ny-2 > !!$ do i=1,nz-2 > !!$ > !!$ ct=ct+1 > !!$ utmp(i,j) = frhs(ct) > !!$ > !!$ enddo > !!$ enddo > !!$ > !!$ usol(2:nz-1,2:ny-1) = utmp(:,:) > !!$ > !!$ > !!$ END SUBROUTINE Poiseuille_3D > !!$ > !!$ > ! > ! Description: Build complex matrices A and B in the context of a generalized > ! eigenvalue problem (GEVP) > ! > ! > !/*T > ! Concepts: Build complex matrices > ! Processors: n > !T*/ > ! > ! ----------------------------------------------------------------------- > > MODULE PETSC > > USE TYPEDEF > USE WRITE_PRINT > > IMPLICIT NONE > > CONTAINS > > Subroutine SetupSolveGEVP(nx,ny,nxy,alpha,beta,Re) > > #include <petsc/finclude/petscsys.h> > #include <petsc/finclude/petscvec.h> > #include <petsc/finclude/petscmat.h> > #include <petsc/finclude/petscmat.h90> > #include <petsc/finclude/petscpc.h> > #include <petsc/finclude/petscksp.h> > > #include <slepc/finclude/slepcsys.h> > #include <slepc/finclude/slepceps.h> > > !...Non PETSc/SLEPc Variables > > real(dp),intent(in) :: alpha,beta,Re > integer(i4b),intent(in) :: nx,ny,nxy > > integer(i4b) :: ij > integer(i4b) :: sizem,rank,source > > integer(i4b),dimension(:),allocatable :: idxm1,idxn1 > integer(i4b),dimension(:),allocatable :: idxm2,idxn2 > integer(i4b),dimension(:),allocatable :: idxm3,idxn3 > > integer(i4b),dimension(:),allocatable :: li > integer(i4b),dimension(:),allocatable :: all_bc > integer(i4b),dimension(:),allocatable :: u_bc,v_bc,w_bc,p_bc > integer(i4b),dimension(:),allocatable :: fst_bc,wall_bc > integer(i4b),dimension(:),allocatable :: outlet_bc,inlet_bc > > > complex(dpc) :: ibeta,Rec,alphac,betac,small,sigma > complex(dpc),dimension(:,:),allocatable :: D1Xc,D2Xc,D1Yc,D2Yc > > complex(dpc),dimension(nxy) :: uvec_cmplx,uxvec_cmplx,uyvec_cmplx > complex(dpc),dimension(nxy) :: vvec_cmplx,vxvec_cmplx,vyvec_cmplx > complex(dpc),dimension(nxy) :: wvec_cmplx,wxvec_cmplx,wyvec_cmplx > > complex(dpc),dimension(3*nxy) :: wvec_cmplx3 > complex(dpc),dimension(ny,ny) :: VDIAG,VDY > > !...Temporary variables > integer(i4b),dimension(:),allocatable :: loc > complex(dpc),dimension(:),allocatable :: xwork1 > > > !...SLEPC VARIABLES > EPS eps > EPSType tname > > ST st > PC pc > > Vec xr,xi > > PetscReal error > PetscScalar kr,ki > > PetscLogDouble mem > > PetscInt maxit,nconv > PetscInt nev,ncv,mpd > PetscInt three > PetscInt ct,ct1,ct2,ct3,ct4 > > PetscInt icntl,ival,ival1 > > PetscReal tolslepc > PetscInt maxiter > > !...PETSC VARIABLES > > PetscReal norm,tol,nrm,nrm1 > PetscInt i,j,k,II,JJ,its > > PetscInt jmin,jmax > PetscInt ione,ntimes > PetscErrorCode ierr > PetscBool flg > PetscScalar v,one,zero,rhs > > KSP ksp > PetscRandom rctx > > !...MPI > !PetscMPIInt rank,size > > !...Matrices indices > > PetscInt IstartA,IendA,IstartB,IendB > > !...Vectors > > Vec i_vec,frhs,sol > > !...Matrices > > Mat A,B > > ! Note: Any user-defined Fortran routines (such as MyKSPMonitor) > ! MUST be declared as external. > > !external MyKSPMonitor,MyKSPConverged > > !#if !defined(PETSC_USE_COMPLEX) > !#error "this code requires PETSc --with-scalar-type=complex build" > !#endif > !#if !defined(PETSC_USE_REAL_DOUBLE) > !#error "this code requires PETSc --with-precision=real build" > !#endif > > ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -! > ! Beginning of program ! > ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -! 
> > call SlepcInitialize(PETSC_NULL_CHARACTER,ierr)!;CHKERRQ(ierr) > > call PetscMemorySetGetMaximumUsage(ierr)!;CHKERRQ(ierr) > > call PetscOptionsInsertString("-mat_superlu_dist_parsymbfact",ierr) > > call MPI_COMM_SIZE(MPI_COMM_WORLD,sizem,ierr)!;CHKERRQ(ierr) > call MPI_COMM_RANK(MPI_COMM_WORLD,rank,ierr)!;CHKERRQ(ierr) > > call MPI_Barrier(MPI_COMM_WORLD,ierr)!;CHKERRQ(ierr) > !call PetscMemoryGetMaximumUsage(mem,ierr) > call PetscMemoryGetCurrentUsage(mem,ierr)!;CHKERRQ(ierr) > if (rank==0) write(*,'(A,I3,A,E21.11)')'On Processor (entering > module)',rank,' memory:',mem > > !call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-nx',nx,flg,ierr) > !call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-ny',ny,flg,ierr) > > ione=1 ! integer > ibeta = imag_unit*beta > one = (1.0d0,0.0d0) > zero = (0.0d0,0.0d0) > Rec=Re > alphac=alpha > betac=beta > > source=0 > > ct=0 > > !...Convert Arrays to Complex > > uvec_cmplx=uvec > uxvec_cmplx=uxvec > uyvec_cmplx=uyvec > > vvec_cmplx=vvec > vxvec_cmplx=vxvec > vyvec_cmplx=vyvec > > wvec_cmplx=wvec > wxvec_cmplx=wxvec > wyvec_cmplx=wyvec > > wvec_cmplx3(1:nxy)=wvec_cmplx > wvec_cmplx3(nxy+1:2*nxy)=wvec_cmplx > wvec_cmplx3(2*nxy+1:3*nxy)=wvec_cmplx > > allocate(D1Xc(nx,nx),D2Xc(nx,nx),D1Yc(ny,ny),D2Yc(ny,ny)) > > D1Xc=D1X > D1Yc=D1Y > > D2Xc=D2X > D2Yc=D2Y > > !call > MPI_Bcast(wvec_cmplx3,3*nxy,MPI_DOUBLE_COMPLEX,source,MPI_COMM_WORLD,ierr)!;CHKERRQ(ierr) > > !...Parallel Matrices > > !Create A > call > MatCreateAIJ(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,4*nxy,4*nxy,2*(nx+ny),PETSC_NULL_INTEGER,max(nx,ny),PETSC_NULL_INTEGER,A,ierr)!;CHKERRQ(ierr) > > !Create B > > call > MatCreateAIJ(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,4*nxy,4*nxy,1,PETSC_NULL_INTEGER,0,PETSC_NULL_INTEGER,B,ierr)!;CHKERRQ(ierr) > > call MatGetOwnershipRange(A,IstartA,IendA,ierr)!;CHKERRQ(ierr) > call MatGetOwnershipRange(B,IstartB,IendB,ierr)!;CHKERRQ(ierr) > > !write(*,*)'IstartA,IendA',IstartA,IendA > !write(*,*)'IstartB,IendB',IstartB,IendB > > if (IstartA > 4*nxy .or. IstartB > 4*nxy .or. IendA > 4*nxy .or. IendB > > 4*nxy) then > stop 'indices problem in module_petsc' > endif > > !...NEED TO TEST WITHOUT THIS NEXT LINE TO UNDERSTAND > !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! > !...This line avoids error [0]PETSC ERROR: Argument out of range [0]PETSC > ERROR: New nonzero at (470,470)... > call > MatSetOption(A,MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_FALSE,ierr)!;CHKERRQ(ierr) > > > !...Get Memory Usage > call MPI_Barrier(MPI_COMM_WORLD,ierr)!;CHKERRQ(ierr) > !call PetscMemoryGetMaximumUsage(mem,ierr) > call PetscMemoryGetCurrentUsage(mem,ierr)!;CHKERRQ(ierr) > if (rank==0) write(*,'(A,I3,A,E21.11)')'On Processor (after > matcreate)',rank,' memory:',mem > > !...Build A in Parallel > > if (rank == 0) call cpu_time(start) > > do i=IstartA,IendA-1 > > if ( i < nxy ) then > > if (abs(uyvec_cmplx(i+1))>1e10) STOP '1' > call > MatSetValues(A,ione,i,ione,i+nxy,uyvec_cmplx(i+1),INSERT_VALUES,ierr)!;CHKERRQ(ierr) > ! Uy (block 1,2) > > !Build DX (block 1,4) > ii=1+floor((i+0.01)/ny) > jj=1 > > jmin=mod(i,ny) > jmax=jmin+(nx-1)*ny > > do j=jmin,jmax,ny > > if (abs(D1Xc(ii,jj))>1e10) STOP '2' > call > MatSetValues(A,ione,i,ione,j+3*nxy,D1Xc(ii,jj),INSERT_VALUES,ierr) ! 
Build DX > (block 1,4) > jj=jj+1 > > enddo > > !Build -DXX/Re (part of block 1,1) > ii=1+floor((i+0.01)/ny) > jj=1 > > jmin=mod(i,ny) > jmax=jmin+(nx-1)*ny > > do j=jmin,jmax,ny > > if (abs(-D2Xc(ii,jj)/Rec)>1e10) STOP '3' > call > MatSetValues(A,ione,i,ione,j,-D2Xc(ii,jj)/Rec,INSERT_VALUES,ierr) !Build > -DXX/Re (part of block 1,1) > jj=jj+1 > > enddo > > > elseif ( i > nxy-1 .and. i < 2*nxy ) then > > if (abs(vxvec_cmplx(i-nxy+1))>1e10) STOP '4' > call > MatSetValues(A,ione,i,ione,i-nxy,vxvec_cmplx(i-nxy+1),INSERT_VALUES,ierr)!;CHKERRQ(ierr) > ! Vx (block 2,1) > > !Build DY (block 2,4) > ii=mod(i,ny)+1 > jj=1 > > jmin=floor((i-nxy+0.01)/ny)*ny > > do j=jmin,jmin+(ny-1) > > if (abs(D1Yc(ii,jj))>1e10) STOP '5' > call > MatSetValues(A,ione,i,ione,j+3*nxy,D1Yc(ii,jj),INSERT_VALUES,ierr) ! Build DY > (block 2,4) > jj=jj+1 > > enddo > > > !Build -DXX/Re (part of block 2,2) > ii=1+floor((i-nxy+0.01)/ny) > jj=1 > > jmin=mod(i,ny) > jmax=jmin+(nx-1)*ny > > do j=jmin,jmax,ny > > if (abs(-D2Xc(ii,jj)/Rec)>1e10) STOP '6' > call > MatSetValues(A,ione,i,ione,j+nxy,-D2Xc(ii,jj)/Rec,INSERT_VALUES,ierr) !Build > -DXX/Re (part of block 2,2) > jj=jj+1 > > enddo > > > > > elseif ( i > 2*nxy-1 .and. i < 3*nxy ) then > > call MatSetValues(A,ione,i,ione,i+nxy,ibeta,INSERT_VALUES,ierr)! > iBeta (block 3,4) > > call > MatSetValues(A,ione,i,ione,i-2*nxy,wxvec_cmplx(i-2*nxy+1),INSERT_VALUES,ierr)! > Wx (block 3,1) > call MatSetValues(A,ione,i,ione,i > -nxy,wyvec_cmplx(i-2*nxy+1),INSERT_VALUES,ierr)! Wy (block 3,2) > > > !Build -DXX/Re (part of block 3,3) > ii=1+floor((i-2*nxy+0.01)/ny) > jj=1 > > jmin=mod(i,ny) > jmax=jmin+(nx-1)*ny > > do j=jmin,jmax,ny > > call > MatSetValues(A,ione,i,ione,j+2*nxy,-D2Xc(ii,jj)/Rec,INSERT_VALUES,ierr) > !Build -DXX/Re (part of block 3,3) > jj=jj+1 > > enddo > > > elseif ( i > 3*nxy-1 ) then > > call MatSetValues(A,ione,i,ione,i-nxy,ibeta,INSERT_VALUES,ierr)! > iBeta (block 4,3) > > !Build DX (block 4,1) > ii=1+floor(((i-3*nxy)+0.01)/ny) > jj=1 > > jmin=mod(i,ny) > jmax=jmin+(nx-1)*ny > > do j=jmin,jmax,ny > > call MatSetValues(A,ione,i,ione,j,D1Xc(ii,jj),INSERT_VALUES,ierr) > jj=jj+1 > > enddo > > !Build DY (block 4,2) > ii=mod(i,ny)+1 > jj=1 > > jmin=floor((i-3*nxy+0.01)/ny)*ny > > do j=jmin,jmin+(ny-1) > > call > MatSetValues(A,ione,i,ione,j+nxy,D1Yc(ii,jj),INSERT_VALUES,ierr) > jj=jj+1 > > enddo > > endif > > enddo > > if (rank==0) then > > call cpu_time(finish) > write(*,*) > print '("Time Partial Parallel Build A #1 = ",f21.3," > seconds.")',finish-start > > endif > > > call MatAssemblyBegin(A,MAT_FLUSH_ASSEMBLY,ierr)!;CHKERRQ(ierr) ! > necessary to switch between INSERT_VALUES and ADD_VALUES > call MatAssemblyEnd(A,MAT_FLUSH_ASSEMBLY,ierr)!;CHKERRQ(ierr) ! 
> necessary to switch between INSERT_VALUES and ADD_VALUES > > > > !...Finish Blocks A11, A22 and A33 > > if (rank == 0) call cpu_time(start) > > VDY=(0.0d0,0.0d0) > VDIAG=(0.0d0,0.0d0) > > do i=IstartA,IendA-1 > if (i < 3*nxy) then > call > MatSetValues(A,ione,i,ione,i,betac**2/Rec+imag_unit*betac*wvec_cmplx3(i+1),ADD_VALUES,ierr)!;CHKERRQ(ierr) > endif > enddo > > do i=IstartA,IendA-1 > > if ( i < nxy ) then > > !Build -DYY/Rec (part of block 1,1) > ii=mod(i,ny)+1 > jj=1 > > jmin=floor((i+0.01)/ny)*ny > > do j=jmin,jmin+(ny-1) > > call > MatSetValues(A,ione,i,ione,j,-D2Yc(ii,jj)/Rec,ADD_VALUES,ierr) > jj=jj+1 > > enddo > > > !Build U*DX (part of block 1,1) > ii=1+floor((i+0.01)/ny) > jj=1 > > jmin=mod(i,ny) > jmax=jmin+(nx-1)*ny > > do j=jmin,jmax,ny > > call > MatSetValues(A,ione,i,ione,j,uvec_cmplx(i+1)*D1Xc(ii,jj),ADD_VALUES,ierr) > jj=jj+1 > > enddo > > > !Build V*DY (part of block 1,1) > ii=mod(i,ny)+1 > jj=1 > > jmin=floor((i+0.01)/ny)*ny > > do j=jmin,jmin+(ny-1) > call > MatSetValues(A,ione,i,ione,j,vvec_cmplx(i+1)*D1Yc(ii,jj),ADD_VALUES,ierr) > jj=jj+1 > > enddo > > > call > MatSetValues(A,ione,i,ione,i,uxvec_cmplx(i+1),ADD_VALUES,ierr)!;CHKERRQ(ierr) > ! Ux (block 1,1) > > > > > elseif ( i > nxy-1 .and. i < 2*nxy ) then > > !Build -DYY/Rec (part of block 2,2) > ii=mod(i,ny)+1 > jj=1 > > jmin=floor((i-nxy+0.01)/ny)*ny > > do j=jmin,jmin+(ny-1) > > call > MatSetValues(A,ione,i,ione,j+nxy,-D2Yc(ii,jj)/Rec,ADD_VALUES,ierr) ! Build DY > (block 2,4) > jj=jj+1 > > enddo > > > !Build U*DX (part of block 2,2) > ii=1+floor((i-nxy+0.01)/ny) > jj=1 > > jmin=mod(i,ny) > jmax=jmin+(nx-1)*ny > > do j=jmin,jmax,ny > > call > MatSetValues(A,ione,i,ione,j+nxy,uvec_cmplx(i-nxy+1)*D1Xc(ii,jj),ADD_VALUES,ierr) > > jj=jj+1 > > enddo > > > !Build V*DY (part of block 2,2) > ii=mod(i,ny)+1 > jj=1 > > jmin=floor((i-nxy+0.01)/ny)*ny > > do j=jmin,jmin+(ny-1) > call > MatSetValues(A,ione,i,ione,j+nxy,vvec_cmplx(i-nxy+1)*D1Yc(ii,jj),ADD_VALUES,ierr) > jj=jj+1 > > enddo > > call > MatSetValues(A,ione,i,ione,i,vyvec_cmplx(i-nxy+1),ADD_VALUES,ierr)!;CHKERRQ(ierr) > ! Vy (block 2,2) > > > elseif ( i > 2*nxy-1 .and. i < 3*nxy ) then > > !Build -DYY/Rec (part of block 3,3) > ii=mod(i,ny)+1 > jj=1 > > jmin=floor((i-2*nxy+0.01)/ny)*ny > > do j=jmin,jmin+(ny-1) > > call > MatSetValues(A,ione,i,ione,j+2*nxy,-D2Yc(ii,jj)/Rec,ADD_VALUES,ierr) ! 
Build > DY (block 2,4) > jj=jj+1 > > enddo > > > !Build U*DX (part of block 3,3) > ii=1+floor((i-2*nxy+0.01)/ny) > jj=1 > > jmin=mod(i,ny) > jmax=jmin+(nx-1)*ny > > do j=jmin,jmax,ny > > call > MatSetValues(A,ione,i,ione,j+2*nxy,uvec_cmplx(i-2*nxy+1)*D1Xc(ii,jj),ADD_VALUES,ierr) > > jj=jj+1 > > enddo > > > !Build V*DY (part of block 3,3) > ii=mod(i,ny)+1 > jj=1 > > jmin=floor((i-2*nxy+0.01)/ny)*ny > > do j=jmin,jmin+(ny-1) > call > MatSetValues(A,ione,i,ione,j+2*nxy,vvec_cmplx(i-2*nxy+1)*D1Yc(ii,jj),ADD_VALUES,ierr) > jj=jj+1 > > enddo > > endif > > enddo > > > if (rank==0) then > > call cpu_time(finish) > write(*,*) > print '("Time Partial Parallel Build A #2 = ",f21.3," > seconds.")',finish-start > > endif > > !$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ > !$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ > > > !...Artificially create diagonal element for A ==> TO BE REMOVED > small = (0.000001d0,0.0d0) > do i=IstartA,IendA-1 > call MatSetValues(A,ione,i,ione,i,small,ADD_VALUES,ierr)!;CHKERRQ(ierr) > > enddo > > call MPI_Barrier(MPI_COMM_WORLD,ierr)!;CHKERRQ(ierr) > !call PetscMemoryGetMaximumUsage(mem,ierr) > call PetscMemoryGetCurrentUsage(mem,ierr)!;CHKERRQ(ierr) > if (rank==0) write(*,'(A,I3,A,E21.11)')'On Processor (A just before > assembly)',rank,' memory:',mem > > > if (rank==0) then > call cpu_time(start) > endif > > !...Assemble A > call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr)!;CHKERRQ(ierr) > call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr)!;CHKERRQ(ierr) > !call MatView(A,PETSC_VIEWER_STDOUT_WORLD,ierr) > > if (rank==0) then > call cpu_time(finish) > write(*,*) > print '("Time to assemble A = ",f21.3," seconds.")',finish-start > endif > > > call MPI_Barrier(MPI_COMM_WORLD,ierr)!;CHKERRQ(ierr) > !call PetscMemoryGetMaximumUsage(mem,ierr) > call PetscMemoryGetCurrentUsage(mem,ierr)!;CHKERRQ(ierr) > if (rank==0) write(*,'(A,I3,A,E21.11)')'On Processor (A build and > assembled)',rank,' memory:',mem > > > !...Build B in parallel > if (rank==0) then > call cpu_time(start) > endif > > do i=IstartB,IendB-1 > if (i < 3*nxy) then > call > MatSetValues(B,ione,i,ione,i,imag_unit,INSERT_VALUES,ierr)!;CHKERRQ(ierr) > endif > enddo > > !...Assemble B > call MatAssemblyBegin(B,MAT_FINAL_ASSEMBLY,ierr)!;CHKERRQ(ierr) > call MatAssemblyEnd(B,MAT_FINAL_ASSEMBLY,ierr)!;CHKERRQ(ierr) > !call MatView(B,PETSC_VIEWER_STDOUT_WORLD,ierr) > > call MPI_Barrier(MPI_COMM_WORLD,ierr)!;CHKERRQ(ierr) > !call PetscMemoryGetMaximumUsage(mem,ierr) > call PetscMemoryGetCurrentUsage(mem,ierr)!;CHKERRQ(ierr) > if (rank==0) write(*,'(A,I3,A,E21.11)')'On Processor (B build and > assembled)',rank,' memory:',mem > > > call MatNorm(B,NORM_FROBENIUS,nrm1,ierr)!;CHKERRQ(ierr) > > !!$ if (rank==0) then > !!$ write(*,*)'norm B:',nrm1 > !!$ endif > > if (rank==0) then > call cpu_time(finish) > write(*,*) > print '("Time to build and assemble B = ",f21.3," > seconds.")',finish-start > endif > > > !...Set Boundary Conditions > > if (rank==0) then > > allocate(li(nx)) > > do i=1,nx > li(i)=(i-1)*ny > enddo > > ct1=0 > ct2=0 > ct3=0 > ct4=0 > > do i=1,nx > do j=1,ny > > if (j==1) then ! fst_bc > ct1=ct1+1 > elseif (j==ny) then ! wall_bc > ct2=ct2+1 > elseif (i==1) then ! outlet_bc > ct3=ct3+1 > elseif (i==nx) then ! inlet_bc > ct4=ct4+1 > endif > > enddo > enddo > > ct=ct1+ct2+ct3+ct4 > > endif ! 
>
>   call MPI_Bcast(ct,1,MPI_INT,source,MPI_COMM_WORLD,ierr)!;CHKERRQ(ierr)
>   call MPI_Bcast(ct1,1,MPI_INT,source,MPI_COMM_WORLD,ierr)!;CHKERRQ(ierr)
>   call MPI_Bcast(ct2,1,MPI_INT,source,MPI_COMM_WORLD,ierr)!;CHKERRQ(ierr)
>   call MPI_Bcast(ct3,1,MPI_INT,source,MPI_COMM_WORLD,ierr)!;CHKERRQ(ierr)
>   call MPI_Bcast(ct4,1,MPI_INT,source,MPI_COMM_WORLD,ierr)!;CHKERRQ(ierr)
>
>   allocate(fst_bc(ct1))
>   allocate(wall_bc(ct2))
>   allocate(outlet_bc(ct3))
>   allocate(inlet_bc(ct4))
>
>   allocate(all_bc(3*ct))
>
>   allocate(u_bc(ct))
>   allocate(v_bc(ct))
>   allocate(w_bc(ct))
>   allocate(p_bc(ct))
>
>   if (rank == 0) then
>
>      ct1=0
>      ct2=0
>      ct3=0
>      ct4=0
>
>      do i=1,nx
>         do j=1,ny
>
>            ij=li(i)+j-1
>
>            if (j==1) then ! fst_bc
>               ct1=ct1+1
>               fst_bc(ct1)=ij
>            elseif (j==ny) then ! wall_bc
>               ct2=ct2+1
>               wall_bc(ct2)=ij
>            elseif (i==1) then ! outlet_bc
>               ct3=ct3+1
>               outlet_bc(ct3)=ij
>            elseif (i==nx) then ! inlet_bc
>               ct4=ct4+1
>               inlet_bc(ct4)=ij
>            endif
>
>         enddo
>      enddo
>
>      u_bc(1:ct1)=fst_bc
>      u_bc(ct1+1:ct1+ct2)=wall_bc
>      u_bc(ct1+ct2+1:ct1+ct2+ct3)=outlet_bc
>      u_bc(ct1+ct2+ct3+1:ct1+ct2+ct3+ct4)=inlet_bc
>
>      v_bc=u_bc+nxy
>      w_bc=v_bc+nxy
>      p_bc=w_bc+nxy
>
>      all_bc=[u_bc,v_bc,w_bc]
>
>   endif ! if (rank == 0)
>
>   call MPI_Bcast(all_bc,3*ct,MPI_INT,source,MPI_COMM_WORLD,ierr)!;CHKERRQ(ierr)
>
>   !For parallel case, all processes that share matrix MUST call this routine, regardless of whether any rows being zeroed are owned by them.
>   call MatZeroRows(A,3*ct,all_bc, one,zero,zero,ierr);CHKERRQ(ierr)
>   call MatZeroRows(B,3*ct,all_bc,zero,zero,zero,ierr);CHKERRQ(ierr)
>
>   !call MatView(B,PETSC_VIEWER_STDOUT_WORLD,ierr)
>
>   ! cannot be inside an if (rank==0) statement!!!
>   call MatNorm(A,NORM_FROBENIUS,nrm,ierr)!;CHKERRQ(ierr)
>   call MatNorm(B,NORM_FROBENIUS,nrm1,ierr)!;CHKERRQ(ierr)
>
>   if (rank==0) then
>      write(*,*)'norm A:',nrm
>      write(*,*)'norm B:',nrm1
>   endif
>
> !!!!!! TESTING HOW BIG LU CAN BE !!!!!!!!!!!!!!!!
>
> !!$ sigma = (1.0d0,0.0d0)
> !!$ call MatAXPY(A,-sigma,B,DIFFERENT_NONZERO_PATTERN,ierr) ! on exit A = A-sigma*B
> !!$
> !!$ !...Create linear solver context
> !!$ call KSPCreate(PETSC_COMM_WORLD,ksp,ierr)
> !!$
> !!$!...Set operators. Here the matrix that defines the linear system also serves as the preconditioning matrix.
> !!$ !call KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN,ierr) !aha commented and replaced by next line
> !!$ call KSPSetOperators(ksp,A,A,ierr) ! remember: here A = A-sigma*B
> !!$
> !!$ tol = 1.e-10
> !!$ call KSPSetTolerances(ksp,tol,tol,PETSC_DEFAULT_REAL,PETSC_DEFAULT_INTEGER,ierr) ! set relative and absolute tolerances and uses default for divergence tol
> !!$
> !!$!...Set the Direct (LU) Solver
> !!$ call KSPSetType(ksp,KSPPREONLY,ierr)
> !!$ call KSPGetPC(ksp,pc,ierr)
> !!$ call PCSetType(pc,PCLU,ierr)
> !!$ call PCFactorSetMatSolverPackage(pc,MATSOLVERSUPERLU_DIST,ierr) ! MATSOLVERSUPERLU_DIST MATSOLVERMUMPS
> !!$
> !!$!...Create right-hand-side vector
> !!$ call MatCreateVecs(A,frhs,PETSC_NULL_OBJECT,ierr)
> !!$ call MatCreateVecs(A,sol,PETSC_NULL_OBJECT,ierr)
> !!$
> !!$ allocate(xwork1(IendA-IstartA))
> !!$ allocate(loc(IendA-IstartA))
> !!$
> !!$ ct=0
> !!$ do i=IstartA,IendA-1
> !!$    ct=ct+1
> !!$    loc(ct)=i
> !!$    xwork1(ct)=(1.0d0,0.0d0)
> !!$ enddo
> !!$
> !!$ call VecSetValues(frhs,IendA-IstartA,loc,xwork1,INSERT_VALUES,ierr)
> !!$ call VecZeroEntries(sol,ierr)
> !!$
> !!$ deallocate(xwork1,loc)
> !!$
> !!$!...Assemble Vectors
> !!$ call VecAssemblyBegin(frhs,ierr)
> !!$ call VecAssemblyEnd(frhs,ierr)
> !!$
> !!$!...Solve the linear system
> !!$ call KSPSolve(ksp,frhs,sol,ierr)
> !!$
> !!$ !call VecView(sol,PETSC_VIEWER_STDOUT_WORLD,ierr)
> !!$
> !!$ STOP
>
> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
> !!!!!!!!!!!!!!!!!!!!!!!           SLEPC            !!!!!!!!!!!!!!!!!!!!!!!
> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
>
>   if (rank==0) call cpu_time(start)
>
>   call MPI_Barrier(MPI_COMM_WORLD,ierr)!;CHKERRQ(ierr)
>   call PetscMemoryGetCurrentUsage(mem,ierr)!;CHKERRQ(ierr)
>   if (rank==0) write(*,'(A,I3,A,E21.11)')'On Processor (begin slepc)',rank,' memory:',mem
>
> !...Get vector(s) compatible with the matrix, i.e. with the same parallel layout
>   call MatCreateVecs(A,xr,xi,ierr)
> !!$ call MatCreateVecs(A,xr,PETSC_NULL_OBJECT,ierr)
> !!$ call MatCreateVecs(A,xi,PETSC_NULL_OBJECT,ierr)
>
> !...Create Eigensolver Context and Spectral Transformation context
>   call EPSCreate(PETSC_COMM_WORLD,eps,ierr)
>   !call STCreate(PETSC_COMM_WORLD,st,ierr)
>
> !...Set Operators and Problem Type - Generalized Eigenvalue Problem
>   call EPSSetOperators(eps,A,B,ierr)
>   !call EPSSetProblemType(eps,EPS_PGNHEP,ierr) ! THIS ONE LEADS TO AN ERROR --> CHECK WHY BECAUSE MY B SEEMS FITTING THAT CRITERION
>   call EPSSetProblemType(eps,EPS_GNHEP,ierr)
>
> !...Select Portion of Spectrum
>   call EPSSetTarget(eps,(3.0d0,0.d0),ierr)
>   call EPSSetWhichEigenpairs(eps,EPS_TARGET_MAGNITUDE,ierr)
>
> !...Select Algorithm for Eigenvalue Computation (default is Krylov-Schur)
>   call EPSSetType(eps,EPSKRYLOVSCHUR,ierr) !EPSARNOLDI,EPSKRYLOVSCHUR
>
>   call MPI_Barrier(MPI_COMM_WORLD,ierr)!;CHKERRQ(ierr)
>   call PetscMemoryGetCurrentUsage(mem,ierr)!;CHKERRQ(ierr)
>   if (rank==0) write(*,'(A,I3,A,E21.11)')'On Processor (slepc after EPSSetType)',rank,' memory:',mem
>
> !...Set Tolerance and Maxiter for Solver
>   tolslepc=1e-12
>   maxiter=500
>   !call EPSSetTolerances(eps,PETSC_DEFAULT_REAL,maxiter,ierr)
>   call EPSSetTolerances(eps,tolslepc,maxiter,ierr)
>
> !...Sets number of eigenvalues to compute and dimension of subspace
>   nev=200
>   ncv=3*nev
>   mpd=10*nev
>   !call EPSSetDimensions(eps,nev,ncv,mpd,ierr) PETSC_DEFAULT
>   call EPSSetDimensions(eps,nev,PETSC_DEFAULT_INTEGER,PETSC_DEFAULT_INTEGER,ierr)
>
>   call MPI_Barrier(MPI_COMM_WORLD,ierr)!;CHKERRQ(ierr)
>   call PetscMemoryGetCurrentUsage(mem,ierr)!;CHKERRQ(ierr)
>   if (rank==0) write(*,'(A,I3,A,E21.11)')'On Processor (before EPSGetST)',rank,' memory:',mem
>
> !...Set the Spectral Transformation --> Use Shift-And-Invert
>   call EPSGetST(eps,st,ierr)
>   call STSetType(st,STSINVERT,ierr)
>
> !...Set Linear Solver
>   call STGetKSP(st,ksp,ierr)
>   call KSPSetType(ksp,KSPPREONLY,ierr)
>   call KSPGetPC(ksp,pc,ierr)
>   call PCSetType(pc,PCLU,ierr) ! could try PCCHOLESKY
>   call PCFactorSetMatSolverPackage(pc,MATSOLVERSUPERLU_DIST,ierr) ! MATSOLVERSUPERLU_DIST,MATSOLVERMUMPS
>
> !...Set Solver Parameters at Runtime
>   call EPSSetFromOptions(eps,ierr)
>
>   call MPI_Barrier(MPI_COMM_WORLD,ierr)!;CHKERRQ(ierr)
>   call PetscMemoryGetCurrentUsage(mem,ierr)!;CHKERRQ(ierr)
>   if (rank==0) write(*,'(A,I3,A,E21.11)')'On Processor (after EPSSetFromOptions)',rank,' memory:',mem
>
> !...Solve the Eigensystem
>   call EPSSolve(eps,ierr)
>
>   call EPSGetIterationNumber(eps,its,ierr)
>   if (rank .eq. 0) then
>      write(*,110) its
>   endif
>  110 format (/' Number of iterations of the method:',I4)
>
> !...Optional: Get some information from the solver and display it
>   call EPSGetType(eps,tname,ierr)
>   if (rank .eq. 0) then
>      write(*,120) tname
>   endif
>  120 format (' Solution method: ',A)
>
>   nev=10
>   call EPSGetDimensions(eps,nev,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,ierr)
>   if (rank .eq. 0) then
>      write(*,130) nev
>   endif
>  130 format (' Number of requested eigenvalues:',I4)
>
>   call EPSGetTolerances(eps,tol,maxit,ierr)
>   if (rank .eq. 0) then
>      write(*,140) tol, maxit
>   endif
>  140 format (' Stopping condition: tol=',1P,E10.4,', maxit=',I4)
>
> ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> ! Display solution and clean up
> ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
>
> !...Get number of converged eigenpairs
>   call EPSGetConverged(eps,nconv,ierr)
>   if (rank .eq. 0) then
>      write(*,150) nconv
>   endif
>  150 format (' Number of converged eigenpairs:',I4/)
>
> !...Display eigenvalues and relative errors
>   if (nconv.gt.0) then
>      if (rank .eq. 0) then
>         write(*,*) '          kr                     ki             ||Ax-kx||/||kx||'
>         write(*,*) '   -----------------     -----------------     ------------------'
>      endif
>      do i=0,nconv-1
>
>         !...Get converged eigenpairs: i-th eigenvalue is stored in kr (real part) and ki (imaginary part)
>         call EPSGetEigenpair(eps,i,kr,ki,xr,xi,ierr)
>
>         !...Compute the relative error associated to each eigenpair
>         call EPSComputeError(eps,i,EPS_ERROR_RELATIVE,error,ierr)
>         if (rank .eq. 0) then
>            write(*,160) PetscRealPart(kr),PetscImaginaryPart(kr), error
>         endif
>  160    format (1P,' ',E21.11,' ',E21.11,' ',E21.11)
>
>      enddo
>   endif
>
> !...Free work space
>
>   if (rank==0) deallocate(li)
>
>   deallocate(D1Xc,D2Xc,D1Yc,D2Yc)
>
>   deallocate(fst_bc)
>   deallocate(wall_bc)
>   deallocate(outlet_bc)
>   deallocate(inlet_bc)
>
>   deallocate(all_bc)
>
>   deallocate(u_bc)
>   deallocate(v_bc)
>   deallocate(w_bc)
>   deallocate(p_bc)
>
>   !call STDestroy(st,ierr)
>   call EPSDestroy(eps,ierr)
>   call MatDestroy(A,ierr)
>   call MatDestroy(B,ierr)
>   !call VecDestroy(i_vec,ierr)
>   call VecDestroy(xr,ierr)
>   call VecDestroy(xi,ierr)
>
> !...Finalize
>   call SlepcFinalize(ierr)
>
>   if (rank==0) then
>      call cpu_time(finish)
>      write(*,*)
>      print '("Total time for SLEPC = ",f21.3," seconds.")',finish-start
>   endif
>
> END Subroutine SETUPSOLVEGEVP
>
> END MODULE PETSC
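
A note on the remark at the top of the listing about switching between INSERT_VALUES and ADD_VALUES: PETSc requires a flush assembly between the two modes on the same matrix. Below is a minimal sketch of that pattern, using a placeholder matrix Amat and placeholder index/value variables (irow, icol, val_ins, val_add) rather than anything from the code above, and assuming the same declarations and include files as the module.

   ! set some entries in insert mode
   call MatSetValues(Amat,ione,irow,ione,icol,val_ins,INSERT_VALUES,ierr)

   ! flush pending values before switching from INSERT_VALUES to ADD_VALUES
   call MatAssemblyBegin(Amat,MAT_FLUSH_ASSEMBLY,ierr)
   call MatAssemblyEnd(Amat,MAT_FLUSH_ASSEMBLY,ierr)

   ! now entries may be accumulated in add mode
   call MatSetValues(Amat,ione,irow,ione,icol,val_add,ADD_VALUES,ierr)

   ! a single final assembly once all values have been set
   call MatAssemblyBegin(Amat,MAT_FINAL_ASSEMBLY,ierr)
   call MatAssemblyEnd(Amat,MAT_FINAL_ASSEMBLY,ierr)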
>
> [... the remainder of the quoted file, after END MODULE PETSC, is commented-out scratch code and comment blocks carried over from the PETSc Fortran examples (debug write statements, MatCreate/MatSetSizes and VecCreateSeq snippets, KSP setup notes, and the MyKSPMonitor/MyKSPConverged sample routines) ...]
>
> <valgrind.log.31371><valgrind.log.31372>
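
On the MatZeroRows calls in the listing: the routine is collective, so every rank that shares the matrix has to call it even if it owns none of the rows being zeroed, which is why the boundary-row list is broadcast to all ranks first. A minimal sketch of that pattern follows, with placeholder names (nrows_bc, rows_bc, diag_one) and assuming the default 32-bit PetscInt so that MPI_INT is the matching MPI type, as in the code above.

   ! rank 0 assembled the global list of boundary rows; share it with all ranks
   call MPI_Bcast(rows_bc,nrows_bc,MPI_INT,0,MPI_COMM_WORLD,ierr)

   ! collective call: every rank passes the same global list, so each row is
   ! seen by the rank that owns it; no right-hand-side/solution vectors given
   call MatZeroRows(A,nrows_bc,rows_bc,diag_one,PETSC_NULL_OBJECT,PETSC_NULL_OBJECT,ierr)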
