Thanks
Anthony
On 07/08/2015 01:49 PM, Barry Smith wrote:
You should try running under valgrind, see:
http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
You can also run in the debugger (yes, this is tricky on a batch system but possible) and see
exactly what triggers the floating point exception, or, when it "hangs", interrupt the
program to see where it is "hanging".
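For reference, a typical valgrind invocation along the lines of the FAQ would look roughly like the line below; the launcher (mpiexec) and executable name are placeholders here and will differ on a Cray batch system:

   mpiexec -n 2 valgrind --tool=memcheck -q --num-callers=20 --log-file=valgrind.log.%p ./yourprog -malloc off <other PETSc options>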
Barry
On Jul 8, 2015, at 12:34 PM, Anthony Paul Haas <[email protected]> wrote:
Hi,
I have used the switch -mat_superlu_dist_parsymbfact in my pbs script. However,
although my program worked fine with sequential symbolic factorization, I get
one of the following 2 behaviors when I run with parallel symbolic
factorization (depending on the number of processors that I use):
1) the program just hangs (it seems stuck in some subroutine ==> see
test.out-hangs)
2) I get a floating point exception ==> see test.out-floating-point-exception
Note that, as suggested in the SuperLU manual, I use a power-of-2 number of
procs. Are there any tunable parameters for the parallel symbolic
factorization? Note that when I build my sparse matrix, most elements I add are
of course nonzero, but to simplify the programming I also add a few zero
elements to the sparse matrix. I was thinking that if the parallel
symbolic factorization proceeds by blocks, there could be some blocks where the
pivot would be zero, hence creating the FPE?
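As an aside, if you want to avoid storing those explicit zeros at all (not necessarily related to the FPE), one option, sketched here with the same MatSetOption call already used elsewhere in the attached code, is to ask PETSc to drop zero entries at insertion time, before the MatSetValues loop that fills the matrix A:

   call MatSetOption(A,MAT_IGNORE_ZERO_ENTRIES,PETSC_TRUE,ierr)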
Thanks,
Anthony
On Wed, Jul 8, 2015 at 6:46 AM, Xiaoye S. Li <[email protected]> wrote:
Did you find out how to change the option to use parallel symbolic factorization?
Perhaps the PETSc team can help.
Sherry
On Tue, Jul 7, 2015 at 3:58 PM, Xiaoye S. Li <[email protected]> wrote:
Is there an inquiry function that tells you all the available options?
Sherry
On Tue, Jul 7, 2015 at 3:25 PM, Anthony Paul Haas <[email protected]>
wrote:
Hi Sherry,
Thanks for your message. I have used superlu_dist default options. I did not
realize that I was doing serial symbolic factorization. That is probably the
cause of my problem.
Each node on Garnet has 60GB of usable memory and I can run with 1, 2, 4, 8, 16 or 32
cores per node.
So I should use:
-mat_superlu_dist_r 20
-mat_superlu_dist_c 32
How do you specify the parallel symbolic factorization option? Is it
-mat_superlu_dist_matinput 1?
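For what it's worth, the switch that ends up being used at the top of this thread to request parallel symbolic factorization is the runtime option below (this assumes SuperLU_DIST is already selected as the factorization package, e.g. via PCFactorSetMatSolverPackage as in the attached code):

   -mat_superlu_dist_parsymbfact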
Thanks,
Anthony
On Tue, Jul 7, 2015 at 3:08 PM, Xiaoye S. Li <[email protected]> wrote:
The superlu_dist failure occurs during symbolic factorization. Since you
are using serial symbolic factorization, it requires the entire graph of A to
be available in the memory of one MPI task. How much memory do you have for
each MPI task?
It won't help even if you use more processes. You should try to use the parallel
symbolic factorization option.
Another point. You set up process grid as:
Process grid nprow 32 x npcol 20
For better performance, you should swap the grid dimensions. That is, it's better
to use 20 x 32; never make nprow larger than npcol.
Sherry
On Tue, Jul 7, 2015 at 1:27 PM, Barry Smith <[email protected]> wrote:
I would suggest running a sequence of problems, 101 by 101, 111 by 111, etc.,
and getting the memory usage in each case (once you have run out of memory you can get
NO useful information about memory needs). You can then plot memory usage
as a function of problem size to get a handle on how much memory it is using.
You can also run on more and more processes (which have a total of more memory)
to see how large a problem you may be able to reach.
MUMPS also has an "out of core" version (which we have never used) that
could in theory let you get to large problems if you have lots of disk space, but
you are on your own figuring out how to use it.
Barry
On Jul 7, 2015, at 2:37 PM, Anthony Paul Haas <[email protected]> wrote:
Hi Jose,
In my code, I first use PETSc to solve a linear system to get the baseflow
(without using SLEPc) and then I use SLEPc to do the stability analysis of that
baseflow. This is why there are some SLEPc options that are not used in
test.out-superlu_dist-151x151 (when I am solving for the baseflow with PETSc
only). I have attached a 101x101 case for which I get the eigenvalues. That
case works fine. However, if I increase to 151x151, I get the error that you can
see in test.out-superlu_dist-151x151 (similar error with MUMPS: see
test.out-mumps-151x151, line 2918). If you look at the very end of the files
test.out-superlu_dist-151x151 and test.out-mumps-151x151, you will see that the
last info message printed is:
On Processor (after EPSSetFromOptions) 0 memory: 0.65073152000E+08
=====> (see line 807 of module_petsc.F90)
This means that the memory error probably occurs in the call to EPSSolve (see
module_petsc.F90 line 810). I would like to evaluate how much memory is
required by the most memory intensive operation within EPSSolve. Since I am
solving a generalized EVP, I would imagine that it would be the LU
decomposition. But is there an accurate way of doing it?
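One crude way to bracket that (similar to what module_petsc.F90 already does around other calls) is to query PETSc's view of the current memory usage on each rank right before and after EPSSolve; a minimal sketch, reusing the mem, rank and ierr variables already declared in SetupSolveGEVP:

   call PetscMemoryGetCurrentUsage(mem,ierr)
   if (rank==0) write(*,'(A,E21.11)')'Memory before EPSSolve:',mem
   call EPSSolve(eps,ierr)
   call PetscMemoryGetCurrentUsage(mem,ierr)
   if (rank==0) write(*,'(A,E21.11)')'Memory after EPSSolve:',mem

Note this reports current usage, not the peak reached inside the factorization, so it only gives a lower bound.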
Before moving to iterative solvers, I would like to exploit direct solvers as much
as I can. I tried GMRES with the default preconditioner at some point but
had convergence problems. What solver/preconditioner would you recommend for a
generalized non-Hermitian (EPS_GNHEP) EVP?
Thanks,
Anthony
On Tue, Jul 7, 2015 at 12:17 AM, Jose E. Roman <[email protected]> wrote:
On 07/07/2015, at 02:33, Anthony Haas wrote:
Hi,
I am computing eigenvalues using PETSc/SLEPc and superlu_dist for the LU
decomposition (my problem is a generalized eigenvalue problem). The code runs
fine for a grid with 101x101 but when I increase to 151x151, I get the
following error:
Can't expand MemType 1: jcol 16104 (and then [NID 00037] 2015-07-06 19:19:17
Apid 31025976: OOM killer terminated this process.)
It seems to be a memory problem. I monitor the memory usage as far as I can, and
it seems that memory usage is pretty low. The most memory-intensive part of the
program is probably the LU decomposition in the context of the generalized EVP.
Is there a way to evaluate how much memory will be required for that step? I am
currently running the debug version of the code, which I would assume uses
more memory?
I have attached the output of the job. Note that the program uses PETSc twice:
1) to solve a linear system, for which no problem occurs, and 2) to solve the
generalized EVP with SLEPc, where I get the error.
Thanks
Anthony
<test.out-superlu_dist-151x151>
In the output you attached there are no SLEPc objects in the report, and the
SLEPc options are not used. It seems that the SLEPc calls are being skipped?
Do you get the same error with MUMPS? Have you tried to solve linear systems
with a preconditioned iterative solver?
Jose
<module_petsc.F90><test.out-mumps-151x151><test.out_superlu_dist-101x101><test.out-superlu_dist-151x151>
<test.out-hangs><test.out-floating-point-exception>
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!! SPECTRAL TEMPORAL BIGLOBAL SOLVER !
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
PROGRAM Biglobal_Spectral
USE PETSC
USE TYPEDEF
USE MAPPINGS
USE BASEFLOW
USE WRITE_PRINT
USE FOURIER_CHEBYSHEV
IMPLICIT NONE
include 'mpif.h'
! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -!
! Variables Definition !
! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -!
!...Testing and dummy variables
real(dp) :: dummy
!...Problem parameters
integer(i4b),parameter :: nx=11,ny=11,nxy=nx*ny,N=4*nxy
!Grid size for 3D Poiseuille baseflow
integer(i4b),parameter :: nnx=nx,nny=ny
real(dp) :: Re,alpha,beta,Lx,xmin,xmax,ymin,ymax,AA,xll,yll
!...Problem Flags
integer(i4b) :: fourier,testcase,formulation,algorithm
!Mapping Flags
integer(i4b) :: map,mapx,mapy
!...Indices
integer(i4b) :: i,j,iii,jjj,ij,skip,imin,imax,jmin,jmax
!...Output files
integer(i4b) :: file_num
character(128) :: filetype,filename,file_name
!...Boundary conditions variables
integer(i4b) :: ct1,ct2,ct3,ct4
integer(i4b),dimension(:),allocatable :: fst_bc,wall_bc,outlet_bc,inlet_bc
integer(i4b),dimension(:),allocatable :: bcond_px,bcond_py,bcond_pz
integer(i4b),dimension(:),allocatable :: bcond_u,bcond_v,bcond_w,bcond_p
integer(i4b),dimension(:),allocatable :: bcond
!...Chebyshev related
real(dp),dimension(:),allocatable :: xi,yi,x,y,xfourier
!...Baseflow variables
real(dp),dimension(:,:),allocatable :: usol,uxsol,uysol
real(dp),dimension(:,:),allocatable :: vsol,vxsol,vysol
real(dp),dimension(:,:),allocatable :: wsol,wxsol,wysol
!For Hiemenz Baseflow
integer(i4b) :: npts_hz
real(dp) :: ymax_hz,epstol,si
real(dp),dimension(:),allocatable :: eta
real(dp),dimension(:),allocatable :: u0,v0,v1,v2,v3
real(dp),dimension(:),allocatable :: w0,w1,w2
real(dp),dimension(:),allocatable :: etarev,u0rev,v0rev,w0rev
real(dp),dimension(:,:),allocatable :: Ubar,Vbar,Wbar
!...Grid variables
real(dp),dimension(:,:),allocatable :: xx,yy,xx1,yy1
!...Vectorization array
integer(i4b),dimension(:),allocatable :: li
!...MPI
integer(i4b) :: rank,size,ierr
!...Spline interpolation
!real(dp),dimension(:),allocatable :: bsp,csp,dsp
!real(dp),dimension(:),allocatable :: interp_u0,interp_v0,interp_w0
real(dp) :: seval
!...External Subroutines
external seval
! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -!
! Beginning of program !
! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -!
call MPI_INIT(ierr)
call MPI_COMM_SIZE(MPI_COMM_WORLD,size,ierr)
call MPI_COMM_RANK(MPI_COMM_WORLD,rank,ierr)
if (rank==0) call cpu_time(start_total)
!...Parameters for BiGlobal Computation
testcase=2 ! 1 --> Plane Poiseuille Flow
! 2 --> 3D Poiseuille Flow
! 3 --> Attachment line flow (Hiemenz)
! 4 --> Laminar Separation Bubble
! 5 --> Artificially parallel boundary-layer (cf. Rodriguez thesis)
formulation = 1 ! 1 --> x,y inhomogeneous
! 2 --> y,z inhomogeneous
algorithm = 1 ! 1 --> QZ algorithm
! 2 --> Arnoldi algorithm
!...Arrays allocation
allocate(li(nx))
allocate(xi(nx),yi(ny))
allocate(x(nx),y(ny))
allocate(xfourier(nx))
allocate(uvec(nxy),uxvec(nxy),uyvec(nxy))
allocate(vvec(nxy),vxvec(nxy),vyvec(nxy))
allocate(wvec(nxy),wxvec(nxy),wyvec(nxy))
allocate(xx(nx,ny),yy(nx,ny))
allocate(D1X(nx,nx),D2X(nx,nx))
allocate(D1Y(ny,ny),D2Y(ny,ny))
allocate(DMX(nx,nx,2),DMY(ny,ny,2))
!allocate(interp_u0(ny),interp_v0(ny),interp_w0(ny))
li = 0
xi = 0.0d0
yi = 0.0d0
x = 0.0d0
y = 0.0d0
xfourier = 0.0d0
uvec = 0.0d0
uxvec = 0.0d0
uyvec = 0.0d0
vvec = 0.0d0
vxvec = 0.0d0
vyvec = 0.0d0
wvec = 0.0d0
wxvec = 0.0d0
wyvec = 0.0d0
xx = 0.0d0
yy = 0.0d0
D1X = 0.0d0
D2X = 0.0d0
D1Y = 0.0d0
D2Y = 0.0d0
DMX = 0.0d0
DMY = 0.0d0
!...Test Cases Parameters
if ( testcase == 2 ) then
mapx=1
mapy=0
Re=1000
if (formulation == 1)then
beta = pi
elseif (formulation == 2) then
alpha = pi
endif
AA=1.0
xmin=-AA
xmax=AA
ymin=-1.0
ymax=1.0
!grid for baseflow
!nnx=nx
!nny=ny
endif
!...Working Array for Converting 2D Indices to 1D
li=0
do i=1,nx
li(i)=(i-1)*ny
enddo
!...Get Collocation Points and Chebyshev Differentiation Matrices
call cheby(nx-1,xi,DMX) ! (cf. Spectral Methods in Matlab pages 53-55)
call cheby(ny-1,yi,DMY)
xll=0
! To get matrices D1X,D2X,D1Y,D2Y
call set_mappings(nx,ny,mapx,mapy,xmin,xmax,ymin,ymax,xi,yi,xll,yll,x,y)
do i=1,nx
do j=1,ny
xx(i,j)=x(i) ! not equivalent to matlab meshgrid!!!
yy(i,j)=y(j) ! not equivalent to matlab meshgrid!!!
enddo
enddo
!...Compute Baseflow (3D Poiseuille)
if ( testcase == 2 ) then
allocate(xx1(nnx,nny),yy1(nnx,nny))
allocate(usol(nnx,nny),uxsol(nnx,nny),uysol(nnx,nny))
allocate(vsol(nnx,nny),vxsol(nnx,nny),vysol(nnx,nny))
allocate(wsol(nnx,nny),wxsol(nnx,nny),wysol(nnx,nny))
xx1 = 0.0d0
yy1 = 0.0d0
usol = 0.0d0
uxsol = 0.0d0
uysol = 0.0d0
vsol = 0.0d0
vxsol = 0.0d0
vysol = 0.0d0
wsol = 0.0d0
wxsol = 0.0d0
wysol = 0.0d0
if (rank == 0) then
WRITE(*,*)
WRITE(*,'(A,I3,A,I3)')'3D POISEUILLE BASEFLOW COMPUTED ON GRID WITH NX =',nnx,' NY =',nny
WRITE(*,*)
call cpu_time(start)
endif
!currently I am taking same grid for baseflow as for biglobal ==> no interpolation
call Poiseuille_3D_PETSC(nnx,nny,AA,usol,xx1,yy1)
if (rank==0) then
call cpu_time(finish)
print '("Time for baseflow computation = ",f15.4,"
seconds.")',finish-start
file_num=10
file_name='check_baseflow_3D_poiseuille.dat'
call write_to_tecplot(file_num,file_name,nnx,nny,xx1,yy1,usol,usol,usol)
endif
vsol=0*usol
wsol=0*usol
if (formulation==1) then
wsol=usol
usol=0.*wsol
vsol=0.*wsol
endif
!Compute derivatives
uxsol = MATMUL(D1X,usol)
uysol = MATMUL(usol,TRANSPOSE(D1Y))
vxsol = MATMUL(D1X,vsol)
vysol = MATMUL(vsol,TRANSPOSE(D1Y))
wxsol = MATMUL(D1X,wsol)
wysol = MATMUL(wsol,TRANSPOSE(D1Y))
!Vectorize the matrices
do i=1,nx
do j=1,ny
ij=li(i)+j
uvec(ij) = usol(i,j)
vvec(ij) = vsol(i,j)
wvec(ij) = wsol(i,j)
uxvec(ij) = uxsol(i,j)
vxvec(ij) = vxsol(i,j)
wxvec(ij) = wxsol(i,j)
uyvec(ij) = uysol(i,j)
vyvec(ij) = vysol(i,j)
wyvec(ij) = wysol(i,j)
enddo
enddo
endif
!call MPI_Barrier(MPI_COMM_WORLD,ierr)!;CHKERRQ(ierr)
!...Build Matrices, Set BCs and Solve Problem
call SetupSolveGEVP(nx,ny,nxy,alpha,beta,Re)
if (rank==0) then
call cpu_time(finish_total)
print '("Total time = ",f15.4," seconds.")',finish_total-start_total
endif
deallocate(li)
deallocate(xi,yi)
deallocate(x,y)
deallocate(xfourier)
deallocate(uvec,uxvec,uyvec)
deallocate(vvec,vxvec,vyvec)
deallocate(wvec,wxvec,wyvec)
deallocate(xx,yy)
deallocate(D1X,D2X)
deallocate(D1Y,D2Y)
deallocate(DMX,DMY)
deallocate(xx1,yy1)
deallocate(usol,uxsol,uysol)
deallocate(vsol,vxsol,vysol)
deallocate(wsol,wxsol,wysol)
call MPI_FINALIZE(ierr)
END PROGRAM Biglobal_Spectral
MODULE BASEFLOW
USE TYPEDEF
USE WRITE_PRINT
USE FOURIER_CHEBYSHEV
IMPLICIT NONE
CONTAINS
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
SUBROUTINE Poiseuille_3D_PETSC(nx,ny,AA,usol,xx,yy)
#include <petsc/finclude/petscsys.h>
#include <petsc/finclude/petscvec.h>
#include <petsc/finclude/petscmat.h>
#include <petsc/finclude/petscmat.h90>
#include <petsc/finclude/petscvec.h90>
#include <petsc/finclude/petscpc.h>
#include <petsc/finclude/petscksp.h>
#include <petsc/finclude/petscviewer.h>
#include <petsc/finclude/petscviewer.h90>
#include <slepc/finclude/slepcsys.h>
#include <slepc/finclude/slepceps.h>
!...Regular Variables
integer(i4b),intent(in) :: nx,ny
real(dp),intent(in) :: AA
real(dp),dimension(nx,ny),intent(inout) :: usol,xx,yy
integer(i4b) :: i,j,ij
integer(i4b) :: rank,size,ierr,source
integer(i4b) :: nxy,nxm2,nym2,nxym2
integer(i4b),dimension(nx) :: li
integer(i4b),dimension(nx-2) :: lim2
integer(i4b),dimension(:),allocatable :: loc
real(dp) :: ymin,ymax,xmin,xmax
real(dp) :: dxidx,d2xidx2,dyidy,d2yidy2
real(dp),dimension(nx) :: xi,x
real(dp),dimension(ny) :: yi,y
real(dp),dimension(nx,nx) :: D2XB
real(dp),dimension(ny,ny) :: D2YB
real(dp),dimension(nx,nx,2) :: DMXB
real(dp),dimension(ny,ny,2) :: DMYB
real(dp),dimension(nx*ny) :: usol_vec
real(dp),dimension( nx-2,ny-2 ) :: sol_mat
complex(dpc) :: U00
complex(dpc) :: atest
complex(dpc),dimension(nx-2,nx-2) :: D2XBs
complex(dpc),dimension(ny-2,ny-2) :: D2YBs
complex(dpc),dimension( nx-2,ny-2 ) :: sol_mat_cmplx
complex(dpc),dimension(:),allocatable :: xwork1
!!$ allocate(li(nx))
!!$ allocate(lim2(nx-2))
!!! NEED TO TRANSFORM ALL THESE ARRAYS TO ALLOCATABLE!!!!!!
!...Petsc Variables
PetscInt ii,jj,ione,ct
PetscInt jmin,jmax
PetscInt Istart,Iend
PetscInt IstartVec,IendVec
PetscReal tol
PetscScalar aone
PetscViewer viewer
PC pc
KSP ksp
VecScatter ctx
Mat DXX,DYY
!Mat DLOAD
Vec frhs,frhs2
Vec sol,sol_seq
!For VecGetArrayReadF90
!PetscScalar,pointer :: xx_v(:) !Fortran90 pointer to the array
PetscOffset i_x
PetscScalar sol_array(1)
!...MPI
!PetscMPIInt rank,size
! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -!
! Beginning of program !
! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -!
call MPI_COMM_SIZE(MPI_COMM_WORLD,size,ierr)
call MPI_COMM_RANK(MPI_COMM_WORLD,rank,ierr)
call PetscInitialize(PETSC_NULL_CHARACTER,ierr)
!call PetscOptionsInsertString("-mat_superlu_dist_parsymbfact",ierr)
!...Check if nz and ny are odd (necessary to set max velocity to 1)
if ( mod(nx,2)==0 .or. mod(ny,2)==0 ) then
write(*,*)
STOP'STOPPING... IN 3D POISEUILLE BASEFLOW NZ,NY NEED TO BE ODD'
endif
!...Set Domain Sizes and Initialize other variables
xmin=-AA
xmax=AA
ymin=-1
ymax=1
ione=1
nxy=nx*ny
!Variables to Build DXX and DYY
nxm2=nx-2
nym2=ny-2
nxym2 = nxm2*nym2
!...Mat to Vec and Vec to Mat conversion
lim2=0
do i=1,nxm2
lim2(i) = (i-1)*nym2
enddo
!...Get collocation points and chebyshev differentiation matrices
call cheby(nx-1,xi,DMXB)
call cheby(ny-1,yi,DMYB)
! x-mapping from [-1,1]->[-XMAX,XMAX]
x=xmax*xi
! y-mapping (from [-1,1]->[-1,1])
y=yi
!Compute the metric terms
dxidx = 1/xmax
d2xidx2 = 0
dyidy = 1
d2yidy2 = 0
!Scale the differentiation matrix (due to the use of the mapping)
!x-direction
D2XB = d2xidx2*DMXB(:,:,1) + dxidx**2*DMXB(:,:,2)
D2XBs = D2XB(2:nx-1,2:nx-1)
!y-direction
D2YB = d2yidy2*DMYB(:,:,1) + dyidy**2*DMYB(:,:,2)
D2YBs = D2YB(2:ny-1,2:ny-1)
xx=0.0
yy=0.0
do i=1,nx
do j=1,ny
xx(i,j)=x(i)
yy(i,j)=y(j)
enddo
enddo
!...Create Petsc right-hand-side vector and derivation matrices
call VecCreateSeq(PETSC_COMM_SELF,nxym2,sol_seq,ierr)
call VecCreateMPI(PETSC_COMM_WORLD,PETSC_DECIDE,nxym2,frhs,ierr) ! right hand side vector
call VecCreateMPI(PETSC_COMM_WORLD,PETSC_DECIDE,nxym2,frhs2,ierr) ! right hand side vector # 2
call MatCreateAIJ(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,nxym2,nxym2,nx,PETSC_NULL_INTEGER,nx,PETSC_NULL_INTEGER,DXX,ierr)
call MatCreateAIJ(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,nxym2,nxym2,ny,PETSC_NULL_INTEGER,ny,PETSC_NULL_INTEGER,DYY,ierr)
!...Determine which rows of the vector/matrix are locally owned.
call VecGetOwnershipRange(frhs,IstartVec,IendVec,ierr)
call MatGetOwnershipRange(DXX,Istart,Iend,ierr)
if ( IstartVec /= Istart .or. IendVec /= Iend ) then
stop 'Problem in Baseflow Computation'
endif
!...Fill in Sequential and Parallel Vectors
allocate(xwork1(Iend-Istart))
allocate(loc(Iend-Istart))
ct=0
do i=Istart,Iend-1
ct=ct+1
loc(ct)=i
xwork1(ct)=(-2.0d0,0.0d0)
enddo
call VecSetValues(frhs,Iend-Istart,loc,xwork1,INSERT_VALUES,ierr)
call VecZeroEntries(sol_seq,ierr)
!...Assemble Vectors
call VecAssemblyBegin(frhs,ierr)
call VecAssemblyEnd(frhs,ierr)
call VecAssemblyBegin(sol_seq,ierr)
call VecAssemblyEnd(sol_seq,ierr)
call VecDuplicate(frhs,sol,ierr) ! parallel solution vector
!...Display vector
!call VecView(frhs,PETSC_VIEWER_STDOUT_WORLD,ierr) ! ==> seems very slow
!...Build DXX in Parallel
!!$ if (rank==0) then
!!$ call print_matrix('D2XB',nx,nx,D2XB)
!!$ call print_matrix('D2YB',ny,ny,D2YB)
!!$
!!$ call print_matrix_cmplx('D2XBs',nx-2,nx-2,D2XBs)
!!$ call print_matrix_cmplx('D2YBs',ny-2,ny-2,D2YBs)
!!$ endif
if (rank == 0) call cpu_time(start)
do i=Istart,Iend-1
ii=1+floor((i+0.01)/nym2)
jj=1
jmin=mod(i,nym2)
jmax=jmin+(nxm2-1)*nym2
do j=jmin,jmax,nym2
call MatSetValues(DXX,ione,i,ione,j,D2XBs(ii,jj),INSERT_VALUES,ierr)
jj=jj+1
enddo
enddo
!...Build DYY in Parallel
do i=Istart,Iend-1
ii=mod(i,nym2)+1
jj=1
jmin=floor((i+0.01)/nym2)*nym2
do j=jmin,jmin+(nym2-1)
call MatSetValues(DYY,ione,i,ione,j,D2YBs(ii,jj),INSERT_VALUES,ierr)
jj=jj+1
enddo
enddo
!...Assemble DXX and DYY
call MatAssemblyBegin(DXX,MAT_FINAL_ASSEMBLY,ierr)
call MatAssemblyEnd(DXX,MAT_FINAL_ASSEMBLY,ierr)
call MatAssemblyBegin(DYY,MAT_FINAL_ASSEMBLY,ierr)
call MatAssemblyEnd(DYY,MAT_FINAL_ASSEMBLY,ierr)
if (rank==0) then
call cpu_time(finish)
print '("Time for build and assembly of baseflow operator = ",f15.4,"
seconds.")',finish-start
endif
!call MatView(DXX,PETSC_VIEWER_STDOUT_WORLD,ierr)
!call MatView(DYY,PETSC_VIEWER_STDOUT_WORLD,ierr)
aone = (1.0d0,0.0d0)
if (rank == 0) call cpu_time(start)
call MatAXPY(DXX,aone,DYY,DIFFERENT_NONZERO_PATTERN,ierr) ! on exit DXX = a*DYY + DXX
if (rank==0) then
call cpu_time(finish)
print '("Time for Matrix summation = ",f15.4," seconds.")',finish-start
endif
!...Write Matrix to ASCII and Binary Format
!!$ call PetscViewerASCIIOpen(PETSC_COMM_WORLD,"Amat.m",viewer,ierr)
!!$ call MatView(DXX,viewer,ierr)
!!$ call PetscViewerDestroy(viewer,ierr)
!!$
!!$ call PetscViewerBinaryOpen(PETSC_COMM_WORLD,"Amat_binary.m",FILE_MODE_WRITE,viewer,ierr)
!!$ call MatView(DXX,viewer,ierr)
!!$ call PetscViewerDestroy(viewer,ierr)
!...Load a Matrix in Binary Format
!!$ call PetscViewerBinaryOpen(PETSC_COMM_WORLD,"Amat_binary.m",FILE_MODE_READ,viewer,ierr)
!!$ call MatCreate(PETSC_COMM_WORLD,DLOAD,ierr)
!!$ call MatSetType(DLOAD,MATAIJ,ierr)
!!$ call MatLoad(DLOAD,viewer,ierr)
!!$ call PetscViewerDestroy(viewer,ierr)
!call MatTranspose(DXX,MatReuse reuse,Mat *B)
!call MatView(DXX,PETSC_VIEWER_STDOUT_WORLD,ierr)
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!! SOLVE POISSON EQUATION FOR THE FIRST TIME !!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!...Create linear solver context
call KSPCreate(PETSC_COMM_WORLD,ksp,ierr)
!...Set operators. Here the matrix that defines the linear system also serves as the preconditioning matrix.
!call KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN,ierr) !aha commented and replaced by next line
call KSPSetOperators(ksp,DXX,DXX,ierr) ! remember: here DXX=DXX+DYY
!call KSPSetOperators(ksp,DLOAD,DLOAD,ierr) ! remember: here DXX=DXX+DYY
tol = 1.e-10
!call KSPSetTolerances(ksp,tol,PETSC_DEFAULT_REAL,PETSC_DEFAULT_REAL,PETSC_DEFAULT_INTEGER,ierr) ! set relative tolerance and uses default for absolute and divergence tol
call KSPSetTolerances(ksp,tol,tol,PETSC_DEFAULT_REAL,PETSC_DEFAULT_INTEGER,ierr) ! set relative and absolute tolerances and uses default for divergence tol
!...Set the Direct (LU) Solver
!call KSPSetType(ksp,KSPPREONLY,ierr)
!call KSPGetPC(ksp,pc,ierr)
!call PCSetType(pc,PCLU,ierr)
!call PCFactorSetMatSolverPackage(pc,MATSOLVERMUMPS,ierr) ! MATSOLVERSUPERLU_DIST
!...Set runtime options, e.g., -ksp_type <type> -pc_type <type> -ksp_monitor -ksp_rtol <rtol>
!These options will override those specified above as long as KSPSetFromOptions() is called _after_
!any other customization routines.
call KSPSetFromOptions(ksp,ierr)
!...Sets an ADDITIONAL function to be called at every iteration to monitor the residual/error etc.
call KSPMonitorSet(ksp,KSPMonitorTrueResidualNorm,PETSC_NULL_OBJECT,PETSC_NULL_FUNCTION,ierr)
!...Solve the linear system
call KSPSolve(ksp,frhs,sol,ierr)
!...View solver info (could use -ksp_view instead)
call KSPView(ksp,PETSC_VIEWER_STDOUT_WORLD,ierr)
!...Extract Maximum Value (at center of duct) from Solution Vector
call VecScatterCreateToZero(sol,ctx,sol_seq,ierr)
!scatter as many times as you need
call VecScatterBegin(ctx,sol,sol_seq,INSERT_VALUES,SCATTER_FORWARD,ierr)
call VecScatterEnd(ctx,sol,sol_seq,INSERT_VALUES,SCATTER_FORWARD,ierr)
!destroy scatter context and local vector when no longer needed
call VecScatterDestroy(ctx,ierr)
!call VecView(sol_seq,PETSC_VIEWER_STDOUT_WORLD,ierr)
call VecGetArray(sol_seq,sol_array,i_x,ierr)
if (rank==0) then
sol_mat_cmplx=0.0d0
do i=1,nxm2
do j=1,nym2
ij=lim2(i)+j
sol_mat_cmplx(i,j) = sol_array(i_x + ij)
enddo
enddo
U00 = sol_mat_cmplx((nxm2+1)/2,(nym2+1)/2)
endif
source=0
Call MPI_Bcast(U00,1,MPI_DOUBLE_COMPLEX,source,MPI_COMM_WORLD,ierr)
!call VecGetArrayReadF90(sol,xx_v,ierr) ! requires #include <petsc/finclude/petscvec.h90>
!atest = xx_v(3)
!call VecRestoreArrayReadF90(sol,xx_v,ierr)
!write(*,*)'atest',atest
call VecRestoreArray(sol_seq,sol_array,i_x,ierr)
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!! SOLVE POISSON EQUATION FOR THE SECOND TIME !!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!...Fill in New Right-Hand-Side Vector (frhs2)
call VecSetValues(frhs2,Iend-Istart,loc,xwork1/U00,INSERT_VALUES,ierr)
!...Assemble New RHS Vector
call VecAssemblyBegin(frhs2,ierr)
call VecAssemblyEnd(frhs2,ierr)
!...Solve System
call KSPSolve(ksp,frhs2,sol,ierr)
!...Get Final Solution
call VecScatterCreateToZero(sol,ctx,sol_seq,ierr)
call VecScatterBegin(ctx,sol,sol_seq,INSERT_VALUES,SCATTER_FORWARD,ierr)
call VecScatterEnd(ctx,sol,sol_seq,INSERT_VALUES,SCATTER_FORWARD,ierr)
call VecScatterDestroy(ctx,ierr)
call VecGetArray(sol_seq,sol_array,i_x,ierr)
if (rank==0) then
usol=0.0d0
sol_mat_cmplx=0.0d0
do i=1,nxm2
do j=1,nym2
ij=lim2(i)+j
sol_mat_cmplx(i,j) = sol_array(i_x + ij)
enddo
enddo
usol(2:nx-1,2:ny-1) = dble(sol_mat_cmplx)
endif
call VecRestoreArray(sol_seq,sol_array,i_x,ierr)
!...Send usol to all other processes (vectorize, broadcast and matrixize)
li=0
do i=1,nx
li(i) = (i-1)*ny
enddo
if (rank==0) then
do i=1,nx
do j=1,ny
ij=li(i)+j
usol_vec(ij)=usol(i,j)
enddo
enddo
endif
call MPI_Bcast(usol_vec,nx*ny,MPI_DOUBLE,source,MPI_COMM_WORLD,ierr)
if (rank/=0) then
do i=1,nx
do j=1,ny
ij=li(i)+j
usol(i,j)=usol_vec(ij)
enddo
enddo
endif
!...Free work space. All PETSc objects should be destroyed when they are no longer needed.
deallocate(loc)
deallocate(xwork1)
call KSPDestroy(ksp,ierr)
call VecDestroy(frhs,ierr)
call VecDestroy(frhs2,ierr)
call VecDestroy(sol,ierr)
call VecDestroy(sol_seq,ierr)
call MatDestroy(DXX,ierr)
call MatDestroy(DYY,ierr)
call PetscFinalize(ierr)
END SUBROUTINE Poiseuille_3D_PETSC
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!! HIEMENZ FLOW SOLVER !!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!*************************************************************************************************
! on Veltins: ifort hiemenz.f90 -convert big_endian -r8 -i4 -O3 -132 -o hiemenz.x
!************************************************************************************************
!program hiemenz
subroutine hiemenz(ymax,eps,n,si,eta,u0,v0,v1,v2,v3,w0,w1,w2)
implicit none
real(dp) eps,ymax,si
integer(i4b) n,i
! real(dp),allocatable,dimension(:) :: eta
! real(dp),allocatable,dimension(:) :: u0,v0,v1,v2,v3
! real(dp),allocatable,dimension(:) :: w0,w1,w2
real(dp),dimension(n) :: eta
real(dp),dimension(n) :: u0,v0,v1,v2,v3
real(dp),dimension(n) :: w0,w1,w2
write(*,*)
write(*,*) '*******************************************************************'
write(*,*) '                           HIEMENZ SOLVER                          '
write(*,*) '*******************************************************************'
write(*,*)
!
!..... Numerical Parameters
!
!Domain height and number of points
!ymax=100
!n=500001
!Tolerance for Newton iteration
!eps=1e-12
!Initial guess for Newton iteration: f''(0)
!si = -1.232587656820134
!
!..... Allocate arrays
!
! allocate(eta(n))
! allocate(u0(n),v0(n),v1(n),v2(n),v3(n))
! allocate(w0(n),w1(n),w2(n))
!
!..... Compute solution
!
call hiemenz_fct(ymax,eps,n,si,eta,u0,v0,v1,v2,v3,w0,w1,w2)
!
!..... Write out solution in ASCII format
!
write(*,*)
write(*,*) 'Writing solution to hiemenz.dat'
write(*,*)
open(unit=15,file='hiemenz.dat',form='formatted')
do i=1,n
write(15,'(9E21.11)') eta(i),u0(i),v0(i),v1(i),v2(i),v3(i),w0(i),w1(i),w2(i)
end do
close(15)
end subroutine hiemenz
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
subroutine hiemenz_fct(ymax,eps,n,si,eta,u0,v0,v1,v2,v3,w0,w1,w2)
implicit none
integer(i4b) i,iter,n,j
real(dp) ymax,res,p,eps,s,si
real(dp),allocatable,dimension(:,:) :: v,w
real(dp) eta(n)
real(dp) u0(n),v0(n),v1(n),v2(n),v3(n)
real(dp) w0(n),w1(n),w2(n)
!external w_ode
write(*,*) 'eps=',eps
write(*,*) 'ymax=',ymax
write(*,*) 'N=',n
write(*,*)
!
! Allocate arrays
!
allocate(v (6,n))
allocate(w (5,n))
!
! Create grid
!
do i=1,n
eta(i) = ymax * real(i-1)/real(n-1)
end do
!
! Newton-Raphson iteration for v
!
iter = 0
s = si
res = 1.
write(*,*) 'Newton-Raphson iteration for v: s=',s
do while (res.gt.eps)
iter=iter+1
call hiemenz_eval(s,eta,n,v)
s = s - (v(2,n) + 1.0) / v(5,n)
res = abs(v(2,n) + 1.0)
write(*,*) iter,s,res
end do
write(*,*) 'Number of iterations ',iter
write(*,*)
write(*,*) "v''(0)=",v(3,1)
!
! Now compute w
!
w(1,1) = 0.
w(2,1) = v(3,1)
w(3,1) = 0.
w(4,1) = 0.
w(5,1) = v(3,1)
do i=2,n
call rk4(w(1,i-1),w(1,i),eta(i)-eta(i-1),5,w_ode)
end do
!
! Rescale w to match boundary conditions at infinity
!
p = 1./w(1,n)
do i=1,n
w(1,i) = w(1,i) * p
w(2,i) = w(2,i) * p
end do
write(*,*) "w'(0)=",w(2,1)
!
! Now compute v''', w'' from the RHS and copy all values to new arrays
!
do i=1,n
u0(i) = -v(2,i)
v0(i) = v(1,i)
v1(i) = v(2,i)
v2(i) = v(3,i)
v3(i) = -v(2,i)*v(2,i) + v(1,i)*v(3,i) + 1.0
w0(i) = w(1,i)
w1(i) = w(2,i)
w2(i) = w(2,i)*v(1,i)
end do
deallocate(v,w)
end subroutine hiemenz_fct
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
subroutine hiemenz_eval(s,eta,n,f)
implicit none
integer(i4b) n,i
real(dp) s
real(dp) eta(n),f(6,n)
!external v_ode
f(1,1) = 0.
f(2,1) = 0.
f(3,1) = s
f(4,1) = 0.
f(5,1) = 0.
f(6,1) = 1.
do i=2,n
call rk4(f(1,i-1),f(1,i),eta(i)-eta(i-1),6,v_ode)
end do
end subroutine hiemenz_eval
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
subroutine v_ode(y,dy)
implicit none
real(dp) y(6),dy(6)
dy(1) = y(2)
dy(2) = y(3)
dy(3) = -y(2)*y(2) + y(1)*y(3) + 1.0
dy(4) = y(5)
dy(5) = y(6)
dy(6) = y(3)*y(4) - 2.0*y(2)*y(5) + y(1)*y(6)
end subroutine v_ode
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
subroutine w_ode(y,dy)
implicit none
real(dp) y(5),dy(5)
dy(1) = y(2)
dy(2) = y(2)*y(3)
dy(3) = y(4)
dy(4) = y(5)
dy(5) = -y(4)*y(4) + y(3)*y(5) + 1.0
end subroutine w_ode
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
subroutine rk4(y1,y2,h,nn,der)
!
! Perform one RK4 step with step size h of a n-vector
! Input: y1(n) old step
! h step size
! nn vector size
! der derivative function
! Output: y2(nn) new step
!
implicit none
integer(i4b) nn
real(dp) y1(nn),y2(nn),y(nn)
real(dp) k1(nn),k2(nn),k3(nn),k4(nn)
real(dp) h
external der
!
! First RK substep
!
y(:) = y1(:)
call der(y,k1)
k1(:)=h*k1(:)
!
! Second RK substep
!
y(:) = y1(:) + .5*k1(:)
call der(y,k2)
k2(:)=h*k2(:)
!
! Third RK substep
!
y(:) = y1(:) + .5*k2(:)
call der(y,k3)
k3(:)=h*k3(:)
!
! Fourth RK substep
!
y(:) = y1(:) + k3(:)
call der(y,k4)
k4(:)=h*k4(:)
!
! Compose solution after full RK step
!
y2(:) = y1(:) + ( k1(:) + 2.0*k2(:) + 2.0*k3(:) + k4(:) ) / 6.0
end subroutine rk4
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
subroutine compute_baseflow_derivative(DXYZ,uvwvec,nx,ny,UVWxyz)
implicit none
integer(i4b) :: i,j
integer(i4b),intent(in) :: nx,ny
real(dp),dimension(nx*ny),intent(in) :: uvwvec
real(dp),dimension(nx*ny,nx*ny),intent(in) :: DXYZ
real(dp),dimension(nx*ny,nx*ny),intent(out) :: UVWxyz
UVWxyz = 0.0
do i=1,nx*ny
do j=1,nx*ny
UVWxyz(i,i) = UVWxyz(i,i) + DXYZ(i,j)*uvwvec(j)
enddo
enddo
end subroutine compute_baseflow_derivative
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
subroutine compute_UVW_DXYZ(DXYZ,uvwvec,nx,ny,UVW_DXYZ)
implicit none
integer(i4b) :: i,j
integer(i4b),intent(in) :: nx,ny
real(dp),dimension(nx*ny),intent(in) :: uvwvec
real(dp),dimension(nx*ny,nx*ny),intent(in) :: DXYZ
real(dp),dimension(nx*ny,nx*ny),intent(out) :: UVW_DXYZ
do i=1,nx*ny
do j=1,nx*ny
UVW_DXYZ(i,j) = uvwvec(i)*DXYZ(i,j)
enddo
enddo
end subroutine compute_UVW_DXYZ
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
END MODULE BASEFLOW
!!$
!!$
!!$!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!$!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!$
!!$ SUBROUTINE Poiseuille_3D(nz,ny,AA,usol,zz,yy)
!!$
!!$ character(128) :: file_name
!!$ integer(i4b) :: INFO,NRHS,ct,i,j,N,file_num,m1,n1,p1,q1
!!$
!!$ integer(i4b),intent(in) :: nz,ny
!!$ real(dp),intent(in) :: AA
!!$
!!$ real(dp) :: ymin,ymax,zmin,zmax,dzidz,d2zidz2,dyidy,d2yidy2,U00
!!$ real(dp) :: start,finish
!!$
!!$ real(dp),dimension(nz,nz,2) :: DMZ
!!$ real(dp),dimension(ny,ny,2) :: DMY
!!$
!!$ real(dp),dimension(nz,nz) :: D2Z
!!$ real(dp),dimension(ny,ny) :: D2Y
!!$
!!$ real(dp),dimension(nz-2,nz-2) :: D2Zs
!!$ real(dp),dimension(ny-2,ny-2) :: D2Ys
!!$
!!$ real(dp),dimension(nz-2,nz-2) :: IZ
!!$ real(dp),dimension(ny-2,ny-2) :: IY
!!$
!!$ real(dp),dimension(nz,ny) :: usol,zz,yy
!!$ real(dp),dimension(nz-2,ny-2) :: utmp
!!$
!!$ real(dp),dimension(nz) :: zi,z
!!$ real(dp),dimension(ny) :: yi,y
!!$
!!$ real(dp),dimension( (nz-2)*(ny-2) ) :: frhs
!!$ real(dp),dimension( (nz-2)*(ny-2),(nz-2)*(ny-2) ) :: AMAT,DZZ,DYY
!!$
!!$ integer(i4b),dimension((nz-2)*(ny-2)) :: IPIV
!!$
!!$ frhs=0.0
!!$
!!$!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!$!!!!!!! 3D POISEUILLE FLOW !!!!!!!!
!!$!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!$
!!$ zmin=-AA
!!$ zmax=AA
!!$
!!$ ymin=-1
!!$ ymax=1
!!$
!!$ !
!!$ !! .....Create Grid
!!$ !
!!$
!!$ if ( mod(nz,2)==0 .or. mod(ny,2)==0 ) then
!!$ WRITE(*,*)
!!$ STOP'STOPPING... IN 3D POISEUILLE BASEFLOW NZ,NY NEED TO BE ODD'
!!$ endif
!!$
!!$ !
!!$ !! .....Get collocation points and chebyshev differentiation matrices
!!$ !
!!$
!!$ call cheby(nz-1,zi,DMZ)
!!$ call cheby(ny-1,yi,DMY)
!!$
!!$ ! z-mapping from [-1,1]->[-ZMAX,ZMAX]
!!$ z=zmax*zi
!!$
!!$ ! y-mapping (from [-1,1]->[-1,1])
!!$ y=yi
!!$
!!$ !Compute the metric terms
!!$ dzidz = 1/zmax
!!$ d2zidz2 = 0
!!$
!!$ dyidy = 1
!!$ d2yidy2 = 0
!!$
!!$ !Scale the differentiation matrix (due to the use of the mapping)
!!$
!!$ !z-direction
!!$ !D1Z = dzidz*DMZ(:,:,1)
!!$ D2Z = d2zidz2*DMZ(:,:,1) + dzidz**2*DMZ(:,:,2)
!!$ D2Zs = D2Z(2:nz-1,2:nz-1)
!!$
!!$ !y-direction
!!$ !D1Y = dyidy*DMY(:,:,1)
!!$ D2Y = d2yidy2*DMY(:,:,1) + dyidy**2*DMY(:,:,2)
!!$ D2Ys = D2Y(2:ny-1,2:ny-1)
!!$
!!$ IZ=0.0
!!$ IY=0.0
!!$
!!$ do i=1,nz-2
!!$ IZ(i,i)=1.0
!!$ enddo
!!$
!!$ do i=1,ny-2
!!$ IY(i,i)=1.0
!!$ enddo
!!$
!!$ m1=size(IY,1)
!!$ n1=size(IY,2)
!!$ p1=size(D2Zs,1)
!!$ q1=size(D2Zs,2)
!!$
!!$ call mykron(IY,D2Zs,m1,n1,p1,q1,DZZ)
!!$
!!$ m1=size(D2Ys,1)
!!$ n1=size(D2Ys,2)
!!$ p1=size(IZ,1)
!!$ q1=size(IZ,2)
!!$
!!$ call mykron(D2Ys,IZ,m1,n1,p1,q1,DYY)
!!$
!!$ AMAT = DYY + DZZ
!!$
!!$ !CALL PRINT_MATRIX( 'DZZ', (nz-2)*(ny-2), (nz-2)*(ny-2), DZZ )
!!$ !CALL PRINT_MATRIX( 'DYY', (nz-2)*(ny-2), (nz-2)*(ny-2),DYY )
!!$
!!$
!!$!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!$ !! SOLVE POISSON EQUATION FOR THE FIRST TIME !!
!!$!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!$
!!$
!!$
!!$ do i=1,(ny-2)*(nz-2)
!!$ frhs(i)=-2.0
!!$ enddo
!!$
!!$ INFO=0
!!$ NRHS=1 ! only 1 rhs here
!!$ N=(nz-2)*(ny-2)
!!$
!!$
!!$ !
!!$ ! Compute the LU factorization of A.
!!$ !
!!$
!!$ ! Note: on exit of dgetrf, AMAT is not AMAT anymore but contains the factors L and U
!!$ ! from the factorization A = P*L*U; the unit diagonal elements of L are not stored.
!!$
!!$ !call PRINT_MATRIX( 'AMAT', N, N, AMAT )
!!$
!!$
!!$ call cpu_time(start)
!!$ CALL dgetrf( N, N, AMAT, N, IPIV, INFO )
!!$ call cpu_time(finish)
!!$
!!$ !write(*,*)'AMAT AFTER',AMAT
!!$
!!$ print '("Time LU = ",f15.4," seconds.")',finish-start
!!$
!!$ IF( INFO .NE. 0 ) THEN
!!$ WRITE(*,*)'INFO',INFO
!!$ STOP 'PROBLEM IN LAPACK (Poiseuille_3D)'
!!$ ENDIF
!!$
!!$
!!$ !
!!$ ! Solve the system A*X = B, overwriting B with X.
!!$ !
!!$
!!$ IF( INFO.EQ.0 ) THEN
!!$
!!$ call cpu_time(start)
!!$ CALL dgetrs( 'No transpose', n, nrhs, AMAT, N, IPIV, frhs, N, INFO )
!!$ call cpu_time(finish)
!!$
!!$ print '("Time Linear System Solve = ",f15.4," seconds.")',finish-start
!!$ write(*,*)
!!$
!!$ END IF
!!$
!!$ ct=0
!!$ do j=1,ny-2
!!$ do i=1,nz-2
!!$
!!$ ct=ct+1
!!$ utmp(i,j) = frhs(ct)
!!$
!!$ enddo
!!$ enddo
!!$
!!$ usol = 0.0
!!$ usol(2:nz-1,2:ny-1) = utmp(:,:)
!!$
!!$ zz=0.0
!!$ yy=0.0
!!$
!!$ do i=1,nz
!!$ do j=1,ny
!!$
!!$ zz(i,j)=z(i)
!!$ yy(i,j)=y(j)
!!$
!!$ enddo
!!$ enddo
!!$
!!$ U00 = usol( (nz+1)/2 , (ny+1)/2 )
!!$
!!$
!!$
!!$!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!$!! SOLVE POISSON EQUATION FOR THE SECOND TIME !!
!!$!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!$
!!$
!!$ utmp=0.0
!!$ usol=0.0
!!$ frhs=0.0
!!$
!!$ !New,scaled rhs
!!$ do i=1,(ny-2)*(nz-2)
!!$ frhs(i)=-2.0/U00
!!$ enddo
!!$
!!$ !
!!$ ! Solve the system A*X = B, overwriting B with X.
!!$ !
!!$
!!$ ! Note: here we already have the LU factorization so that we only need a call to dgetrs
!!$
!!$ CALL dgetrs( 'No transpose', N, NRHS, AMAT, N, IPIV, frhs, N, INFO )
!!$
!!$ IF( INFO .NE. 0 ) THEN
!!$ WRITE(*,*)'INFO',INFO
!!$ STOP 'PROBLEM IN LAPACK (Poiseuille_3D -- solve 2)'
!!$ ENDIF
!!$
!!$ !write(*,*)'frhs',frhs
!!$
!!$ ct=0
!!$ do j=1,ny-2
!!$ do i=1,nz-2
!!$
!!$ ct=ct+1
!!$ utmp(i,j) = frhs(ct)
!!$
!!$ enddo
!!$ enddo
!!$
!!$ usol(2:nz-1,2:ny-1) = utmp(:,:)
!!$
!!$
!!$ END SUBROUTINE Poiseuille_3D
!!$
!!$
!
! Description: Build complex matrices A and B in the context of a generalized
! eigenvalue problem (GEVP)
!
!
!/*T
! Concepts: Build complex matrices
! Processors: n
!T*/
!
! -----------------------------------------------------------------------
MODULE PETSC
USE TYPEDEF
USE WRITE_PRINT
IMPLICIT NONE
CONTAINS
Subroutine SetupSolveGEVP(nx,ny,nxy,alpha,beta,Re)
#include <petsc/finclude/petscsys.h>
#include <petsc/finclude/petscvec.h>
#include <petsc/finclude/petscmat.h>
#include <petsc/finclude/petscmat.h90>
#include <petsc/finclude/petscpc.h>
#include <petsc/finclude/petscksp.h>
#include <slepc/finclude/slepcsys.h>
#include <slepc/finclude/slepceps.h>
!...Non PETSc/SLEPc Variables
real(dp),intent(in) :: alpha,beta,Re
integer(i4b),intent(in) :: nx,ny,nxy
integer(i4b) :: ij
integer(i4b) :: sizem,rank,source
integer(i4b),dimension(:),allocatable :: idxm1,idxn1
integer(i4b),dimension(:),allocatable :: idxm2,idxn2
integer(i4b),dimension(:),allocatable :: idxm3,idxn3
integer(i4b),dimension(:),allocatable :: li
integer(i4b),dimension(:),allocatable :: all_bc
integer(i4b),dimension(:),allocatable :: u_bc,v_bc,w_bc,p_bc
integer(i4b),dimension(:),allocatable :: fst_bc,wall_bc
integer(i4b),dimension(:),allocatable :: outlet_bc,inlet_bc
complex(dpc) :: ibeta,Rec,alphac,betac,small,sigma
complex(dpc),dimension(:,:),allocatable :: D1Xc,D2Xc,D1Yc,D2Yc
complex(dpc),dimension(nxy) :: uvec_cmplx,uxvec_cmplx,uyvec_cmplx
complex(dpc),dimension(nxy) :: vvec_cmplx,vxvec_cmplx,vyvec_cmplx
complex(dpc),dimension(nxy) :: wvec_cmplx,wxvec_cmplx,wyvec_cmplx
complex(dpc),dimension(3*nxy) :: wvec_cmplx3
complex(dpc),dimension(ny,ny) :: VDIAG,VDY
!...Temporary variables
integer(i4b),dimension(:),allocatable :: loc
complex(dpc),dimension(:),allocatable :: xwork1
!...SLEPC VARIABLES
EPS eps
EPSType tname
ST st
PC pc
Vec xr,xi
PetscReal error
PetscScalar kr,ki
PetscLogDouble mem
PetscInt maxit,nconv
PetscInt nev,ncv,mpd
PetscInt three
PetscInt ct,ct1,ct2,ct3,ct4
PetscInt icntl,ival,ival1
PetscReal tolslepc
PetscInt maxiter
!...PETSC VARIABLES
PetscReal norm,tol,nrm,nrm1
PetscInt i,j,k,II,JJ,its
PetscInt jmin,jmax
PetscInt ione,ntimes
PetscErrorCode ierr
PetscBool flg
PetscScalar v,one,zero,rhs
KSP ksp
PetscRandom rctx
!...MPI
!PetscMPIInt rank,size
!...Matrices indices
PetscInt IstartA,IendA,IstartB,IendB
!...Vectors
Vec i_vec,frhs,sol
!...Matrices
Mat A,B
! Note: Any user-defined Fortran routines (such as MyKSPMonitor)
! MUST be declared as external.
!external MyKSPMonitor,MyKSPConverged
!#if !defined(PETSC_USE_COMPLEX)
!#error "this code requires PETSc --with-scalar-type=complex build"
!#endif
!#if !defined(PETSC_USE_REAL_DOUBLE)
!#error "this code requires PETSc --with-precision=real build"
!#endif
! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -!
! Beginning of program !
! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -!
call SlepcInitialize(PETSC_NULL_CHARACTER,ierr)!;CHKERRQ(ierr)
call PetscMemorySetGetMaximumUsage(ierr)!;CHKERRQ(ierr)
call PetscOptionsInsertString("-mat_superlu_dist_parsymbfact",ierr)
call MPI_COMM_SIZE(MPI_COMM_WORLD,sizem,ierr)!;CHKERRQ(ierr)
call MPI_COMM_RANK(MPI_COMM_WORLD,rank,ierr)!;CHKERRQ(ierr)
call MPI_Barrier(MPI_COMM_WORLD,ierr)!;CHKERRQ(ierr)
!call PetscMemoryGetMaximumUsage(mem,ierr)
call PetscMemoryGetCurrentUsage(mem,ierr)!;CHKERRQ(ierr)
if (rank==0) write(*,'(A,I3,A,E21.11)')'On Processor (entering module)',rank,' memory:',mem
!call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-nx',nx,flg,ierr)
!call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-ny',ny,flg,ierr)
ione=1 ! integer
ibeta = imag_unit*beta
one = (1.0d0,0.0d0)
zero = (0.0d0,0.0d0)
Rec=Re
alphac=alpha
betac=beta
source=0
ct=0
!...Convert Arrays to Complex
uvec_cmplx=uvec
uxvec_cmplx=uxvec
uyvec_cmplx=uyvec
vvec_cmplx=vvec
vxvec_cmplx=vxvec
vyvec_cmplx=vyvec
wvec_cmplx=wvec
wxvec_cmplx=wxvec
wyvec_cmplx=wyvec
wvec_cmplx3(1:nxy)=wvec_cmplx
wvec_cmplx3(nxy+1:2*nxy)=wvec_cmplx
wvec_cmplx3(2*nxy+1:3*nxy)=wvec_cmplx
allocate(D1Xc(nx,nx),D2Xc(nx,nx),D1Yc(ny,ny),D2Yc(ny,ny))
D1Xc=D1X
D1Yc=D1Y
D2Xc=D2X
D2Yc=D2Y
!call MPI_Bcast(wvec_cmplx3,3*nxy,MPI_DOUBLE_COMPLEX,source,MPI_COMM_WORLD,ierr)!;CHKERRQ(ierr)
!...Parallel Matrices
!Create A
call MatCreateAIJ(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,4*nxy,4*nxy,2*(nx+ny),PETSC_NULL_INTEGER,max(nx,ny),PETSC_NULL_INTEGER,A,ierr)!;CHKERRQ(ierr)
!Create B
call MatCreateAIJ(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,4*nxy,4*nxy,1,PETSC_NULL_INTEGER,0,PETSC_NULL_INTEGER,B,ierr)!;CHKERRQ(ierr)
call MatGetOwnershipRange(A,IstartA,IendA,ierr)!;CHKERRQ(ierr)
call MatGetOwnershipRange(B,IstartB,IendB,ierr)!;CHKERRQ(ierr)
!write(*,*)'IstartA,IendA',IstartA,IendA
!write(*,*)'IstartB,IendB',IstartB,IendB
if (IstartA > 4*nxy .or. IstartB > 4*nxy .or. IendA > 4*nxy .or. IendB > 4*nxy) then
stop 'indices problem in module_petsc'
endif
!...NEED TO TEST WITHOUT THIS NEXT LINE TO UNDERSTAND
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!...This line avoids error [0]PETSC ERROR: Argument out of range [0]PETSC ERROR: New nonzero at (470,470)...
call MatSetOption(A,MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_FALSE,ierr)!;CHKERRQ(ierr)
!...Get Memory Usage
call MPI_Barrier(MPI_COMM_WORLD,ierr)!;CHKERRQ(ierr)
!call PetscMemoryGetMaximumUsage(mem,ierr)
call PetscMemoryGetCurrentUsage(mem,ierr)!;CHKERRQ(ierr)
if (rank==0) write(*,'(A,I3,A,E21.11)')'On Processor (after matcreate)',rank,' memory:',mem
!...Build A in Parallel
if (rank == 0) call cpu_time(start)
do i=IstartA,IendA-1
if ( i < nxy ) then
if (abs(uyvec_cmplx(i+1))>1e10) STOP '1'
call MatSetValues(A,ione,i,ione,i+nxy,uyvec_cmplx(i+1),INSERT_VALUES,ierr)!;CHKERRQ(ierr) ! Uy (block 1,2)
!Build DX (block 1,4)
ii=1+floor((i+0.01)/ny)
jj=1
jmin=mod(i,ny)
jmax=jmin+(nx-1)*ny
do j=jmin,jmax,ny
if (abs(D1Xc(ii,jj))>1e10) STOP '2'
call MatSetValues(A,ione,i,ione,j+3*nxy,D1Xc(ii,jj),INSERT_VALUES,ierr) ! Build DX (block 1,4)
jj=jj+1
enddo
!Build -DXX/Re (part of block 1,1)
ii=1+floor((i+0.01)/ny)
jj=1
jmin=mod(i,ny)
jmax=jmin+(nx-1)*ny
do j=jmin,jmax,ny
if (abs(-D2Xc(ii,jj)/Rec)>1e10) STOP '3'
call MatSetValues(A,ione,i,ione,j,-D2Xc(ii,jj)/Rec,INSERT_VALUES,ierr) !Build -DXX/Re (part of block 1,1)
jj=jj+1
enddo
elseif ( i > nxy-1 .and. i < 2*nxy ) then
if (abs(vxvec_cmplx(i-nxy+1))>1e10) STOP '4'
call MatSetValues(A,ione,i,ione,i-nxy,vxvec_cmplx(i-nxy+1),INSERT_VALUES,ierr)!;CHKERRQ(ierr) ! Vx (block 2,1)
!Build DY (block 2,4)
ii=mod(i,ny)+1
jj=1
jmin=floor((i-nxy+0.01)/ny)*ny
do j=jmin,jmin+(ny-1)
if (abs(D1Yc(ii,jj))>1e10) STOP '5'
call MatSetValues(A,ione,i,ione,j+3*nxy,D1Yc(ii,jj),INSERT_VALUES,ierr) ! Build DY (block 2,4)
jj=jj+1
enddo
!Build -DXX/Re (part of block 2,2)
ii=1+floor((i-nxy+0.01)/ny)
jj=1
jmin=mod(i,ny)
jmax=jmin+(nx-1)*ny
do j=jmin,jmax,ny
if (abs(-D2Xc(ii,jj)/Rec)>1e10) STOP '6'
call MatSetValues(A,ione,i,ione,j+nxy,-D2Xc(ii,jj)/Rec,INSERT_VALUES,ierr) !Build -DXX/Re (part of block 2,2)
jj=jj+1
enddo
elseif ( i > 2*nxy-1 .and. i < 3*nxy ) then
call MatSetValues(A,ione,i,ione,i+nxy,ibeta,INSERT_VALUES,ierr) ! iBeta (block 3,4)
call MatSetValues(A,ione,i,ione,i-2*nxy,wxvec_cmplx(i-2*nxy+1),INSERT_VALUES,ierr) ! Wx (block 3,1)
call MatSetValues(A,ione,i,ione,i-nxy,wyvec_cmplx(i-2*nxy+1),INSERT_VALUES,ierr) ! Wy (block 3,2)
!Build -DXX/Re (part of block 3,3)
ii=1+floor((i-2*nxy+0.01)/ny)
jj=1
jmin=mod(i,ny)
jmax=jmin+(nx-1)*ny
do j=jmin,jmax,ny
call MatSetValues(A,ione,i,ione,j+2*nxy,-D2Xc(ii,jj)/Rec,INSERT_VALUES,ierr) !Build -DXX/Re (part of block 3,3)
jj=jj+1
enddo
elseif ( i > 3*nxy-1 ) then
call MatSetValues(A,ione,i,ione,i-nxy,ibeta,INSERT_VALUES,ierr) ! iBeta (block 4,3)
!Build DX (block 4,1)
ii=1+floor(((i-3*nxy)+0.01)/ny)
jj=1
jmin=mod(i,ny)
jmax=jmin+(nx-1)*ny
do j=jmin,jmax,ny
call MatSetValues(A,ione,i,ione,j,D1Xc(ii,jj),INSERT_VALUES,ierr)
jj=jj+1
enddo
!Build DY (block 4,2)
ii=mod(i,ny)+1
jj=1
jmin=floor((i-3*nxy+0.01)/ny)*ny
do j=jmin,jmin+(ny-1)
call MatSetValues(A,ione,i,ione,j+nxy,D1Yc(ii,jj),INSERT_VALUES,ierr)
jj=jj+1
enddo
endif
enddo
if (rank==0) then
call cpu_time(finish)
write(*,*)
print '("Time Partial Parallel Build A #1 = ",f21.3,"
seconds.")',finish-start
endif
call MatAssemblyBegin(A,MAT_FLUSH_ASSEMBLY,ierr)!;CHKERRQ(ierr) ! necessary to switch between INSERT_VALUES and ADD_VALUES
call MatAssemblyEnd(A,MAT_FLUSH_ASSEMBLY,ierr)!;CHKERRQ(ierr) ! necessary to switch between INSERT_VALUES and ADD_VALUES
!...Finish Blocks A11, A22 and A33
if (rank == 0) call cpu_time(start)
VDY=(0.0d0,0.0d0)
VDIAG=(0.0d0,0.0d0)
do i=IstartA,IendA-1
if (i < 3*nxy) then
call MatSetValues(A,ione,i,ione,i,betac**2/Rec+imag_unit*betac*wvec_cmplx3(i+1),ADD_VALUES,ierr)!;CHKERRQ(ierr)
endif
enddo
do i=IstartA,IendA-1
if ( i < nxy ) then
!Build -DYY/Rec (part of block 1,1)
ii=mod(i,ny)+1
jj=1
jmin=floor((i+0.01)/ny)*ny
do j=jmin,jmin+(ny-1)
call MatSetValues(A,ione,i,ione,j,-D2Yc(ii,jj)/Rec,ADD_VALUES,ierr)
jj=jj+1
enddo
!Build U*DX (part of block 1,1)
ii=1+floor((i+0.01)/ny)
jj=1
jmin=mod(i,ny)
jmax=jmin+(nx-1)*ny
do j=jmin,jmax,ny
call MatSetValues(A,ione,i,ione,j,uvec_cmplx(i+1)*D1Xc(ii,jj),ADD_VALUES,ierr)
jj=jj+1
enddo
!Build V*DY (part of block 1,1)
ii=mod(i,ny)+1
jj=1
jmin=floor((i+0.01)/ny)*ny
do j=jmin,jmin+(ny-1)
call MatSetValues(A,ione,i,ione,j,vvec_cmplx(i+1)*D1Yc(ii,jj),ADD_VALUES,ierr)
jj=jj+1
enddo
call MatSetValues(A,ione,i,ione,i,uxvec_cmplx(i+1),ADD_VALUES,ierr)!;CHKERRQ(ierr) ! Ux (block 1,1)
elseif ( i > nxy-1 .and. i < 2*nxy ) then
!Build -DYY/Rec (part of block 2,2)
ii=mod(i,ny)+1
jj=1
jmin=floor((i-nxy+0.01)/ny)*ny
do j=jmin,jmin+(ny-1)
call MatSetValues(A,ione,i,ione,j+nxy,-D2Yc(ii,jj)/Rec,ADD_VALUES,ierr) ! Build -DYY/Rec (part of block 2,2)
jj=jj+1
enddo
!Build U*DX (part of block 2,2)
ii=1+floor((i-nxy+0.01)/ny)
jj=1
jmin=mod(i,ny)
jmax=jmin+(nx-1)*ny
do j=jmin,jmax,ny
call MatSetValues(A,ione,i,ione,j+nxy,uvec_cmplx(i-nxy+1)*D1Xc(ii,jj),ADD_VALUES,ierr)
jj=jj+1
enddo
!Build V*DY (part of block 2,2)
ii=mod(i,ny)+1
jj=1
jmin=floor((i-nxy+0.01)/ny)*ny
do j=jmin,jmin+(ny-1)
call MatSetValues(A,ione,i,ione,j+nxy,vvec_cmplx(i-nxy+1)*D1Yc(ii,jj),ADD_VALUES,ierr)
jj=jj+1
enddo
call MatSetValues(A,ione,i,ione,i,vyvec_cmplx(i-nxy+1),ADD_VALUES,ierr)!;CHKERRQ(ierr) ! Vy (block 2,2)
elseif ( i > 2*nxy-1 .and. i < 3*nxy ) then
!Build -DYY/Rec (part of block 3,3)
ii=mod(i,ny)+1
jj=1
jmin=floor((i-2*nxy+0.01)/ny)*ny
do j=jmin,jmin+(ny-1)
call MatSetValues(A,ione,i,ione,j+2*nxy,-D2Yc(ii,jj)/Rec,ADD_VALUES,ierr) ! Build -DYY/Rec (part of block 3,3)
jj=jj+1
enddo
!Build U*DX (part of block 3,3)
ii=1+floor((i-2*nxy+0.01)/ny)
jj=1
jmin=mod(i,ny)
jmax=jmin+(nx-1)*ny
do j=jmin,jmax,ny
call MatSetValues(A,ione,i,ione,j+2*nxy,uvec_cmplx(i-2*nxy+1)*D1Xc(ii,jj),ADD_VALUES,ierr)
jj=jj+1
enddo
!Build V*DY (part of block 3,3)
ii=mod(i,ny)+1
jj=1
jmin=floor((i-2*nxy+0.01)/ny)*ny
do j=jmin,jmin+(ny-1)
call MatSetValues(A,ione,i,ione,j+2*nxy,vvec_cmplx(i-2*nxy+1)*D1Yc(ii,jj),ADD_VALUES,ierr)
jj=jj+1
enddo
endif
enddo
if (rank==0) then
call cpu_time(finish)
write(*,*)
print '("Time Partial Parallel Build A #2 = ",f21.3,"
seconds.")',finish-start
endif
!$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
!$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
!...Artificially create diagonal element for A ==> TO BE REMOVED
small = (0.000001d0,0.0d0)
do i=IstartA,IendA-1
call MatSetValues(A,ione,i,ione,i,small,ADD_VALUES,ierr)!;CHKERRQ(ierr)
enddo
call MPI_Barrier(MPI_COMM_WORLD,ierr)!;CHKERRQ(ierr)
!call PetscMemoryGetMaximumUsage(mem,ierr)
call PetscMemoryGetCurrentUsage(mem,ierr)!;CHKERRQ(ierr)
if (rank==0) write(*,'(A,I3,A,E21.11)')'On Processor (A just before assembly)',rank,' memory:',mem
if (rank==0) then
call cpu_time(start)
endif
!...Assemble A
call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr)!;CHKERRQ(ierr)
call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr)!;CHKERRQ(ierr)
!call MatView(A,PETSC_VIEWER_STDOUT_WORLD,ierr)
if (rank==0) then
call cpu_time(finish)
write(*,*)
print '("Time to assemble A = ",f21.3," seconds.")',finish-start
endif
call MPI_Barrier(MPI_COMM_WORLD,ierr)!;CHKERRQ(ierr)
!call PetscMemoryGetMaximumUsage(mem,ierr)
call PetscMemoryGetCurrentUsage(mem,ierr)!;CHKERRQ(ierr)
if (rank==0) write(*,'(A,I3,A,E21.11)')'On Processor (A build and assembled)',rank,' memory:',mem
!...Build B in parallel
if (rank==0) then
call cpu_time(start)
endif
do i=IstartB,IendB-1
if (i < 3*nxy) then
call MatSetValues(B,ione,i,ione,i,imag_unit,INSERT_VALUES,ierr)!;CHKERRQ(ierr)
endif
enddo
!...Assemble B
call MatAssemblyBegin(B,MAT_FINAL_ASSEMBLY,ierr)!;CHKERRQ(ierr)
call MatAssemblyEnd(B,MAT_FINAL_ASSEMBLY,ierr)!;CHKERRQ(ierr)
!call MatView(B,PETSC_VIEWER_STDOUT_WORLD,ierr)
call MPI_Barrier(MPI_COMM_WORLD,ierr)!;CHKERRQ(ierr)
!call PetscMemoryGetMaximumUsage(mem,ierr)
call PetscMemoryGetCurrentUsage(mem,ierr)!;CHKERRQ(ierr)
if (rank==0) write(*,'(A,I3,A,E21.11)')'On Processor (B build and assembled)',rank,' memory:',mem
call MatNorm(B,NORM_FROBENIUS,nrm1,ierr)!;CHKERRQ(ierr)
!!$ if (rank==0) then
!!$ write(*,*)'norm B:',nrm1
!!$ endif
if (rank==0) then
call cpu_time(finish)
write(*,*)
print '("Time to build and assemble B = ",f21.3,"
seconds.")',finish-start
endif
!...Set Boundary Conditions
if (rank==0) then
allocate(li(nx))
do i=1,nx
li(i)=(i-1)*ny
enddo
ct1=0
ct2=0
ct3=0
ct4=0
do i=1,nx
do j=1,ny
if (j==1) then ! fst_bc
ct1=ct1+1
elseif (j==ny) then ! wall_bc
ct2=ct2+1
elseif (i==1) then ! outlet_bc
ct3=ct3+1
elseif (i==nx) then ! inlet_bc
ct4=ct4+1
endif
enddo
enddo
ct=ct1+ct2+ct3+ct4
endif ! if (rank == 0)
call MPI_Bcast(ct,1,MPI_INT,source,MPI_COMM_WORLD,ierr)!;CHKERRQ(ierr)
call MPI_Bcast(ct1,1,MPI_INT,source,MPI_COMM_WORLD,ierr)!;CHKERRQ(ierr)
call MPI_Bcast(ct2,1,MPI_INT,source,MPI_COMM_WORLD,ierr)!;CHKERRQ(ierr)
call MPI_Bcast(ct3,1,MPI_INT,source,MPI_COMM_WORLD,ierr)!;CHKERRQ(ierr)
call MPI_Bcast(ct4,1,MPI_INT,source,MPI_COMM_WORLD,ierr)!;CHKERRQ(ierr)
allocate(fst_bc(ct1))
allocate(wall_bc(ct2))
allocate(outlet_bc(ct3))
allocate(inlet_bc(ct4))
allocate(all_bc(3*ct))
allocate(u_bc(ct))
allocate(v_bc(ct))
allocate(w_bc(ct))
allocate(p_bc(ct))
if (rank == 0) then
ct1=0
ct2=0
ct3=0
ct4=0
do i=1,nx
do j=1,ny
ij=li(i)+j-1
if (j==1) then ! fst_bc
ct1=ct1+1
fst_bc(ct1)=ij
elseif (j==ny) then ! wall_bc
ct2=ct2+1
wall_bc(ct2)=ij
elseif (i==1) then ! outlet_bc
ct3=ct3+1
outlet_bc(ct3)=ij
elseif (i==nx) then ! inlet_bc
ct4=ct4+1
inlet_bc(ct4)=ij
endif
enddo
enddo
u_bc(1:ct1)=fst_bc
u_bc(ct1+1:ct1+ct2)=wall_bc
u_bc(ct1+ct2+1:ct1+ct2+ct3)=outlet_bc
u_bc(ct1+ct2+ct3+1:ct1+ct2+ct3+ct4)=inlet_bc
v_bc=u_bc+nxy
w_bc=v_bc+nxy
p_bc=w_bc+nxy
all_bc=[u_bc,v_bc,w_bc]
endif ! if (rank == 0)
call MPI_Bcast(all_bc,3*ct,MPI_INT,source,MPI_COMM_WORLD,ierr)!;CHKERRQ(ierr)
!For parallel case, all processes that share matrix MUST call this routine, regardless of whether any rows being zeroed are owned by them.
call MatZeroRows(A,3*ct,all_bc, one,zero,zero,ierr);CHKERRQ(ierr)
call MatZeroRows(B,3*ct,all_bc,zero,zero,zero,ierr);CHKERRQ(ierr)
!call MatView(B,PETSC_VIEWER_STDOUT_WORLD,ierr)
! cannot be inside an if (rank==0) statement!!!
call MatNorm(A,NORM_FROBENIUS,nrm,ierr)!;CHKERRQ(ierr)
call MatNorm(B,NORM_FROBENIUS,nrm1,ierr)!;CHKERRQ(ierr)
if (rank==0) then
write(*,*)'norm A:',nrm
write(*,*)'norm B:',nrm1
endif
!!!!!! TESTING HOW BIG LU CAN BE !!!!!!!!!!!!!!!!
!!$ sigma = (1.0d0,0.0d0)
!!$ call MatAXPY(A,-sigma,B,DIFFERENT_NONZERO_PATTERN,ierr) ! on exit A = A-sigma*B
!!$
!!$ !...Create linear solver context
!!$ call KSPCreate(PETSC_COMM_WORLD,ksp,ierr)
!!$
!!$!...Set operators. Here the matrix that defines the linear system also serves as the preconditioning matrix.
!!$ !call KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN,ierr) !aha commented and replaced by next line
!!$ call KSPSetOperators(ksp,A,A,ierr) ! remember: here A = A-sigma*B
!!$
!!$ tol = 1.e-10
!!$ call KSPSetTolerances(ksp,tol,tol,PETSC_DEFAULT_REAL,PETSC_DEFAULT_INTEGER,ierr) ! set relative and absolute tolerances and uses default for divergence tol
!!$
!!$!...Set the Direct (LU) Solver
!!$ call KSPSetType(ksp,KSPPREONLY,ierr)
!!$ call KSPGetPC(ksp,pc,ierr)
!!$ call PCSetType(pc,PCLU,ierr)
!!$ call PCFactorSetMatSolverPackage(pc,MATSOLVERSUPERLU_DIST,ierr) ! MATSOLVERSUPERLU_DIST MATSOLVERMUMPS
!!$
!!$!...Create right-hand-side vector
!!$ call MatCreateVecs(A,frhs,PETSC_NULL_OBJECT,ierr)
!!$ call MatCreateVecs(A,sol,PETSC_NULL_OBJECT,ierr)
!!$
!!$ allocate(xwork1(IendA-IstartA))
!!$ allocate(loc(IendA-IstartA))
!!$
!!$ ct=0
!!$ do i=IstartA,IendA-1
!!$ ct=ct+1
!!$ loc(ct)=i
!!$ xwork1(ct)=(1.0d0,0.0d0)
!!$ enddo
!!$
!!$ call VecSetValues(frhs,IendA-IstartA,loc,xwork1,INSERT_VALUES,ierr)
!!$ call VecZeroEntries(sol,ierr)
!!$
!!$ deallocate(xwork1,loc)
!!$
!!$!...Assemble Vectors
!!$ call VecAssemblyBegin(frhs,ierr)
!!$ call VecAssemblyEnd(frhs,ierr)
!!$
!!$!...Solve the linear system
!!$ call KSPSolve(ksp,frhs,sol,ierr)
!!$
!!$ !call VecView(sol,PETSC_VIEWER_STDOUT_WORLD,ierr)
!!$
!!$ STOP
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!! SLEPC !!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
if (rank==0) call cpu_time(start)
call MPI_Barrier(MPI_COMM_WORLD,ierr)!;CHKERRQ(ierr)
call PetscMemoryGetCurrentUsage(mem,ierr)!;CHKERRQ(ierr)
if (rank==0) write(*,'(A,I3,A,E21.11)')'On Processor (begin slepc)',rank,' memory:',mem
!...Get vector(s) compatible with the matrix, i.e. with the same parallel layout
call MatCreateVecs(A,xr,xi,ierr)
!!$ call MatCreateVecs(A,xr,PETSC_NULL_OBJECT,ierr)
!!$ call MatCreateVecs(A,xi,PETSC_NULL_OBJECT,ierr)
!...Create Eigensolver Context and Spectral Transformation context
call EPSCreate(PETSC_COMM_WORLD,eps,ierr)
!call STCreate(PETSC_COMM_WORLD,st,ierr)
!...Set Operators and Problem Type - Generalized Eigenvalue Problem
call EPSSetOperators(eps,A,B,ierr)
!call EPSSetProblemType(eps,EPS_PGNHEP,ierr) ! THIS ONE LEADS TO AN ERROR --> CHECK WHY BECAUSE MY B SEEMS FITTING THAT CRITERION
call EPSSetProblemType(eps,EPS_GNHEP,ierr)
!...Select Portion of Spectrum
call EPSSetTarget(eps,(3.0d0,0.d0),ierr)
call EPSSetWhichEigenpairs(eps,EPS_TARGET_MAGNITUDE,ierr)
!...Select Algorithm for Eigenvalue Computation (default is Krylov-Schur)
call EPSSetType(eps,EPSKRYLOVSCHUR,ierr) !EPSARNOLDI,EPSKRYLOVSCHUR
call MPI_Barrier(MPI_COMM_WORLD,ierr)!;CHKERRQ(ierr)
call PetscMemoryGetCurrentUsage(mem,ierr)!;CHKERRQ(ierr)
if (rank==0) write(*,'(A,I3,A,E21.11)')'On Processor (slepc after EPSSetType)',rank,' memory:',mem
!...Set Tolerance and Maxiter for Solver
tolslepc=1e-12
maxiter=500
!call EPSSetTolerances(eps,PETSC_DEFAULT_REAL,maxiter,ierr)
call EPSSetTolerances(eps,tolslepc,maxiter,ierr)
!...Sets number of eigenvalues to compute and dimension of subspace
nev=200
ncv=3*nev
mpd=10*nev
!call EPSSetDimensions(eps,nev,ncv,mpd,ierr) PETSC_DEFAULT
call EPSSetDimensions(eps,nev,PETSC_DEFAULT_INTEGER,PETSC_DEFAULT_INTEGER,ierr)
call MPI_Barrier(MPI_COMM_WORLD,ierr)!;CHKERRQ(ierr)
call PetscMemoryGetCurrentUsage(mem,ierr)!;CHKERRQ(ierr)
if (rank==0) write(*,'(A,I3,A,E21.11)')'On Processor (before EPSGetST)',rank,' memory:',mem
!...Set the Spectral Transformation --> Use Shift-And-Invert
call EPSGetST(eps,st,ierr)
call STSetType(st,STSINVERT,ierr)
!...Set Linear Solver
call STGetKSP(st,ksp,ierr)
call KSPSetType(ksp,KSPPREONLY,ierr)
call KSPGetPC(ksp,pc,ierr)
call PCSetType(pc,PCLU,ierr) ! could try PCCHOLESKY
call PCFactorSetMatSolverPackage(pc,MATSOLVERSUPERLU_DIST,ierr) ! MATSOLVERSUPERLU_DIST,MATSOLVERMUMPS
!...Set Solver Parameters at Runtime
call EPSSetFromOptions(eps,ierr)
call MPI_Barrier(MPI_COMM_WORLD,ierr)!;CHKERRQ(ierr)
call PetscMemoryGetCurrentUsage(mem,ierr)!;CHKERRQ(ierr)
if (rank==0) write(*,'(A,I3,A,E21.11)')'On Processor (after EPSSetFromOptions)',rank,' memory:',mem
!...Solve the Eigensystem
call EPSSolve(eps,ierr)
call EPSGetIterationNumber(eps,its,ierr)
if (rank .eq. 0) then
write(*,110) its
endif
110 format (/' Number of iterations of the method:',I4)
!...Optional: Get some information from the solver and display it
call EPSGetType(eps,tname,ierr)
if (rank .eq. 0) then
write(*,120) tname
endif
120 format (' Solution method: ',A)
nev=10
call EPSGetDimensions(eps,nev,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,ierr)
if (rank .eq. 0) then
write(*,130) nev
endif
130 format (' Number of requested eigenvalues:',I4)
call EPSGetTolerances(eps,tol,maxit,ierr)
if (rank .eq. 0) then
write(*,140) tol, maxit
endif
140 format (' Stopping condition: tol=',1P,E10.4,', maxit=',I4)
! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
! Display solution and clean up
! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
!...Get number of converged eigenpairs
call EPSGetConverged(eps,nconv,ierr)
if (rank .eq. 0) then
write(*,150) nconv
endif
150 format (' Number of converged eigenpairs:',I4/)
!...Display eigenvalues and relative errors
if (nconv.gt.0) then
if (rank .eq. 0) then
write(*,*) ' kr ki ||Ax-kx||/||kx||'
write(*,*) ' ----------------- ----------------- ------------------'
endif
do i=0,nconv-1
!...Get converged eigenpairs: i-th eigenvalue is stored in kr (real part) and ki (imaginary part)
call EPSGetEigenpair(eps,i,kr,ki,xr,xi,ierr)
!...Compute the relative error associated to each eigenpair
call EPSComputeError(eps,i,EPS_ERROR_RELATIVE,error,ierr)
if (rank .eq. 0) then
write(*,160) PetscRealPart(kr),PetscImaginaryPart(kr), error
endif
160 format (1P,' ',E21.11,' ',E21.11,' ',E21.11)
enddo
endif
!...Free work space
if (rank==0) deallocate(li)
deallocate(D1Xc,D2Xc,D1Yc,D2Yc)
deallocate(fst_bc)
deallocate(wall_bc)
deallocate(outlet_bc)
deallocate(inlet_bc)
deallocate(all_bc)
deallocate(u_bc)
deallocate(v_bc)
deallocate(w_bc)
deallocate(p_bc)
!call STDestroy(st,ierr)
call EPSDestroy(eps,ierr)
call MatDestroy(A,ierr)
call MatDestroy(B,ierr)
!call VecDestroy(i_vec,ierr)
call VecDestroy(xr,ierr)
call VecDestroy(xi,ierr)
!...Finalize
call SlepcFinalize(ierr)
if (rank==0) then
call cpu_time(finish)
write(*,*)
print '("Total time for SLEPC = ",f21.3," seconds.")',finish-start
endif
END Subroutine SETUPSOLVEGEVP
END MODULE PETSC
!!$ write(*,*)'fst_bc',fst_bc
!!$ write(*,*)'wall_bc',wall_bc
!!$ write(*,*)'outlet_bc',outlet_bc
!!$ write(*,*)'inlet_bc',inlet_bc
!!$
!!$ write(*,*)'all_bc',all_bc
!!$ write(*,*)'ct1+ct2+ct3+ct4',ct1+ct2+ct3+ct4
!!$ call MatCreate(PETSC_COMM_WORLD,A,ierr)
!!$ call MatSetSizes(A,PETSC_DECIDE,PETSC_DECIDE,4*nxy,4*nxy,ierr)
!!$ call MatSetFromOptions(A,ierr)
!!$ call MatSetUp(A,ierr)
!call VecCreateSeq(PETSC_COMM_SELF,nxy,i_vec,ierr) ! sequential vector
!call VecSetValues(i_vec,nxy,loc,xwork1,INSERT_VALUES,ierr)
! Assemble vector
!call VecAssemblyBegin(i_vec,ierr)
!call VecAssemblyEnd(i_vec,ierr)
! Create parallel matrix, specifying only its global dimensions.
! When using MatCreate(), the matrix format can be specified at
! runtime. Also, the parallel partitioning of the matrix is
! determined by PETSc at runtime.
!!$ call MatCreate(PETSC_COMM_WORLD,B,ierr)
!!$ call MatSetSizes(B,PETSC_DECIDE,PETSC_DECIDE,4*nxy,4*nxy,ierr)
!!$ call MatSetFromOptions(B,ierr)
!!$ call MatSetUp(B,ierr)
! Currently, all PETSc parallel matrix formats are partitioned by
! contiguous chunks of rows across the processors. Determine which
! rows of the matrix are locally owned.
! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
! Include files
! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
!
! This program uses CPP for preprocessing, as indicated by the use of
! PETSc include files in the directory petsc/include/finclude. This
! convention enables use of the CPP preprocessor, which allows the use
! of the #include statements that define PETSc objects and variables.
!
! Use of the conventional Fortran include statements is also supported.
! In this case, the PETSc include files are located in the directory
! petsc/include/foldinclude.
!
! Since one must be very careful to include each file no more than once
! in a Fortran routine, application programmers must explicitly list
! each file needed for the various PETSc components within their
! program (unlike the C/C++ interface).
!
!write(*,*)'IstartA,IendA',IstartA,IendA
!write(*,*)'IstartB,IendB',IstartB,IendB
!!$#include <finclude/petscsys.h>
!!$#include <finclude/petscvec.h>
!!$#include <finclude/petscmat.h>
!!$#include <finclude/petscmat.h90>
!!$#include <finclude/petscpc.h>
!!$#include <finclude/petscksp.h>
!!$
!!$#include <finclude/slepcsys.h>
!!$#include <finclude/slepceps.h>
!!$ do i=1,nx
!!$ do j=1,nx
!!$ D1X(i,j)=100+j+(i-1)*nx
!!$ enddo
!!$ enddo
!!$ do i=1,ny
!!$ do j=1,ny
!!$ D1Y(i,j)=999.d0
!!$ enddo
!!$ enddo
! Create parallel vectors.
! - Here, the parallel partitioning of the vector is determined by
! PETSc at runtime. We could also specify the local dimensions
! if desired -- or use the more general routine VecCreate().
! - When solving a linear system, the vectors and matrices MUST
! be partitioned accordingly. PETSc automatically generates
! appropriately partitioned matrices and vectors when MatCreate()
! and VecCreate() are used with the same communicator.
! - Note: We form 1 vector from scratch and then duplicate as needed.
!call VecCreateMPI(PETSC_COMM_WORLD,PETSC_DECIDE,m*n,u,ierr)
!call VecSetFromOptions(u,ierr)
!call VecDuplicate(u,b,ierr)
!call VecDuplicate(b,x,ierr)
! Set exact solution; then compute right-hand-side vector.
! By default we use an exact solution of a vector with all
! elements of 1.0; Alternatively, using the runtime option
! -random_sol forms a solution vector with random components.
!!$ call PetscOptionsHasName(PETSC_NULL_CHARACTER,'-random_exact_sol',flg,ierr)
!!$ if (flg) then
!!$ call PetscRandomCreate(PETSC_COMM_WORLD,rctx,ierr)
!!$ call PetscRandomSetFromOptions(rctx,ierr)
!!$ call VecSetRandom(u,rctx,ierr)
!!$ call PetscRandomDestroy(rctx,ierr)
!!$ else
!!$ call VecSet(u,one,ierr)
!!$ endif
!!$ call MatMult(A,u,b,ierr)
! View the exact solution vector if desired
!!$ call PetscOptionsHasName(PETSC_NULL_CHARACTER,"-view_exact_sol",flg,ierr)
!!$ if (flg) then
!!$ call VecView(u,PETSC_VIEWER_STDOUT_WORLD,ierr)
!!$ endif
! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
! Create the linear solver and set various options
! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
! Create linear solver context
!!$ call KSPCreate(PETSC_COMM_WORLD,ksp,ierr)
! Set operators. Here the matrix that defines the linear system
! also serves as the preconditioning matrix.
!call KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN,ierr) !aha commented and replaced by next line
!!$ call KSPSetOperators(ksp,A,A,ierr)
! Set linear solver defaults for this problem (optional).
! - By extracting the KSP and PC contexts from the KSP context,
! we can then directly call any KSP and PC routines
! to set various options.
! - The following four statements are optional; all of these
! parameters could alternatively be specified at runtime via
! KSPSetFromOptions(). All of these defaults can be
! overridden at runtime, as indicated below.
! We comment out this section of code since the Jacobi
! preconditioner is not a good general default.
! call KSPGetPC(ksp,pc,ierr)
! ptype = PCJACOBI
! call PCSetType(pc,ptype,ierr)
! tol = 1.e-7
! call KSPSetTolerances(ksp,tol,PETSC_DEFAULT_DOUBLE_PRECISION,
! & PETSC_DEFAULT_DOUBLE_PRECISION,PETSC_DEFAULT_INTEGER,ierr)
! Set user-defined monitoring routine if desired
!call PetscOptionsHasName(PETSC_NULL_CHARACTER,'-my_ksp_monitor',flg,ierr) ! flg set to true if that option has been specified
!if (flg) then
! call KSPMonitorSet(ksp,MyKSPMonitor,PETSC_NULL_OBJECT,PETSC_NULL_FUNCTION,ierr) ! call MyKSPMonitor
!endif
!tol = 1.e-10
!call KSPSetTolerances(ksp,tol,PETSC_DEFAULT_REAL,PETSC_DEFAULT_REAL,PETSC_DEFAULT_INTEGER,ierr) ! set relative tolerance and use defaults for absolute and divergence tolerances
!call KSPSetTolerances(ksp,tol,tol,PETSC_DEFAULT_REAL,PETSC_DEFAULT_INTEGER,ierr) ! set relative and absolute tolerances and use default for divergence tolerance
! Set runtime options, e.g.,
! -ksp_type <type> -pc_type <type> -ksp_monitor -ksp_rtol <rtol>
! These options will override those specified above as long as
! KSPSetFromOptions() is called _after_ any other customization
! routines.
!call KSPSetFromOptions(ksp,ierr)
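!   (A sketch of such a run, with an illustrative executable name and option
!    values only:
!      mpiexec -n 4 ./solver -ksp_type gmres -pc_type bjacobi -ksp_rtol 1e-8 -ksp_monitor
!    Any of these override the settings hard-wired above, provided
!    KSPSetFromOptions() is called after them.)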
! Set convergence test routine if desired
!call PetscOptionsHasName(PETSC_NULL_CHARACTER,'-my_ksp_convergence',flg,ierr)
!if (flg) then
! call KSPSetConvergenceTest(ksp,MyKSPConverged,PETSC_NULL_OBJECT,PETSC_NULL_FUNCTION,ierr)
!endif
!call KSPMonitorSet(ksp,KSPMonitorTrueResidualNorm,PETSC_NULL_OBJECT,PETSC_NULL_FUNCTION,ierr);
!
! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
! Solve the linear system
! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
!call KSPSolve(ksp,b,x,ierr)
!
! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
! View solver info (could use -ksp_view instead)
! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
!added by aha
!!$ write(*,*)''
!!$ write(*,*)'Start of KSPView'
!!$ write(*,*)'----------------'
!!$ write(*,*)''
!!$ call KSPView(ksp,PETSC_VIEWER_STDOUT_WORLD,ierr)
!!$ write(*,*)''
!!$ write(*,*)'End of KSPView'
!!$ write(*,*)'--------------'
!!$ write(*,*)''
! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
! Check solution and clean up
! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
! Check the error
!!$ call VecAXPY(x,neg_one,u,ierr)
!!$ call VecNorm(x,NORM_2,norm,ierr)
!!$ call KSPGetIterationNumber(ksp,its,ierr)
!!$ if (rank .eq. 0) then
!!$ if (norm .gt. 1.e-12) then
!!$ write(6,100) norm,its
!!$ else
!!$ write(6,110) its
!!$ endif
!!$ endif
!!$ 100 format('Norm of error ',e11.4,' iterations ',i5)
!!$ 110 format('Norm of error < 1.e-12,iterations ',i5)
! Free work space. All PETSc objects should be destroyed when they
! are no longer needed.
!!$ call KSPDestroy(ksp,ierr)
!!$ call VecDestroy(u,ierr)
!!$ call VecDestroy(x,ierr)
!!$ call VecDestroy(b,ierr)
!!$ call MatDestroy(A,ierr)
! Always call PetscFinalize() before exiting a program. This routine
! - finalizes the PETSc libraries as well as MPI
! - provides summary and diagnostic information if certain runtime
! options are chosen (e.g., -log_summary). See PetscFinalize()
! manpage for more information.
! --------------------------------------------------------------
!
! MyKSPMonitor - This is a user-defined routine for monitoring
! the KSP iterative solvers.
!
! Input Parameters:
! ksp - iterative context
! n - iteration number
! rnorm - 2-norm (preconditioned) residual value (may be estimated)
! dummy - optional user-defined monitor context (unused here)
!
!!$ subroutine MyKSPMonitor(ksp,n,rnorm,dummy,ierr)
!!$
!!$ implicit none
!!$
!!$#include <finclude/petscsys.h>
!!$#include <finclude/petscvec.h>
!!$#include <finclude/petscksp.h>
!!$
!!$ KSP ksp
!!$ Vec x
!!$ PetscErrorCode ierr
!!$ PetscInt n,dummy
!!$ PetscMPIInt rank
!!$ double precision rnorm
!!$
!!$! Build the solution vector
!!$
!!$ call KSPBuildSolution(ksp,PETSC_NULL_OBJECT,x,ierr)
!!$
!!$! Write the solution vector and residual norm to stdout
!!$! - Note that the parallel viewer PETSC_VIEWER_STDOUT_WORLD
!!$! handles data from multiple processors so that the
!!$! output is not jumbled.
!!$
!!$ call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr)
!!$ if (rank .eq. 0) write(6,100) n
!!$ call VecView(x,PETSC_VIEWER_STDOUT_WORLD,ierr)
!!$ if (rank .eq. 0) write(6,200) n,rnorm
!!$
!!$ 100 format('iteration ',i5,' solution vector:')
!!$ 200 format('iteration ',i5,' residual norm ',e11.4)
!!$ ierr = 0
!!$
!!$ end subroutine MyKSPMonitor
! --------------------------------------------------------------
!
! MyKSPConverged - This is a user-defined routine for testing
! convergence of the KSP iterative solvers.
!
! Input Parameters:
! ksp - iterative context
! n - iteration number
! rnorm - 2-norm (preconditioned) residual value (may be estimated)
! dummy - optional user-defined monitor context (unused here)
!
!!$ subroutine MyKSPConverged(ksp,n,rnorm,flag,dummy,ierr)
!!$
!!$ implicit none
!!$
!!$#include <finclude/petscsys.h>
!!$#include <finclude/petscvec.h>
!!$#include <finclude/petscksp.h>
!!$
!!$ KSP ksp
!!$ PetscErrorCode ierr
!!$ PetscInt n,dummy
!!$ KSPConvergedReason flag
!!$ double precision rnorm
!!$
!!$ if (rnorm .le. .05) then
!!$ flag = 1
!!$ else
!!$ flag = 0
!!$ endif
!!$ ierr = 0
!!$
!!$ end subroutine MyKSPConverged