On May 31, 2014, at 6:06 PM, Sun, Hui <[email protected]> wrote:

> Thanks Jed. It converges now. With a 32 by 32 grid, it takes 3.1076 seconds 
> on 8 cores and 7.13586 seconds on 2 cores. With a 64 by 64 grid, it takes 
> 18.1767s and 55.0017s respectively. That seems quite reasonable. 
> 
> By the way, how do I know which matrix solver and which preconditioner is 
> being called? 

  Run with -snes_view  (or -ts_view if using the ODE integrators).

> 
> Besides, I have another question: I am trying to program finite differences 
> for 2D Stokes flow with Dirichlet or Neumann boundary conditions, using a 
> staggered MAC grid. I looked through all the examples in snes; there are three 
> Stokes flow examples, all of which are finite element. I was thinking about 
> naming (i-1/2,j), (i,j-1/2) and (i,j) all as (i,j), then defining u, v, p as 
> three PetscScalars on (i,j), but in that case u will have one more column than 
> p and v will have one more row than p. If there is already something in PETSc 
> for MAC grids, then I don't have to worry about those details. Do you know of 
> any examples or references doing that? 

   Unfortunately the DMDA is not ideal for this since it only supports the same 
number of dof at each grid point. To do a MAC grid you need to decouple the 
extra “variables” and not use their values. For example, in two dimensions with 
u (velocity in x direction), v (velocity in y direction) and p (pressure at 
cell centers), and pure Dirichlet boundary conditions, create a DMDA with a dof 
of three and for each cell treat the first component as u (on the left side of 
the cell), the second as v (on the lower side of the cell) and the third as p 
(at the center of the cell). For the final row of cells across the top there is 
no u or p, just the v along the bottoms of the cells, and for the final column 
of cells along the right there is only a u. So make all the “extra” equations 
simply f.v[i][j] = x.v[i][j] (or x.p or x.u depending on where) and put a 1 on 
the diagonal of that row/column of the Jacobian. Yes, it is a little annoyingly 
cumbersome.
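
   A minimal sketch of that bookkeeping in plain C, without any PETSc calls. It 
assumes u sits on the left face of each cell (the (i-1/2,j) points in the 
question), v on the bottom face and p at the center, all stored on an (nx+1) by 
(ny+1) array of points with three dof each. The names Field, u_is_active, 
v_is_active, p_is_active, fill_dummy_residuals and count_dummy_dofs are made up 
for illustration; in a real code the same tests would go inside the DMDA local 
residual and Jacobian routines.

```c
#include <assert.h>
#include <stdbool.h>

/* One stored grid point with three dof, mimicking a dof=3 DMDA entry. */
typedef struct {
  double u, v, p;
} Field;

/* nx, ny are the numbers of physical cells in x and y.
   Storage has (nx+1) x (ny+1) points, indexed i = 0..nx, j = 0..ny. */

/* u lives on left faces: nx+1 faces per row of cells, but only ny rows. */
static bool u_is_active(int i, int j, int nx, int ny)
{
  (void)i; (void)nx;
  return j < ny;
}

/* v lives on bottom faces: ny+1 faces per column of cells, but only nx columns. */
static bool v_is_active(int i, int j, int nx, int ny)
{
  (void)j; (void)ny;
  return i < nx;
}

/* p lives at cell centers: only the nx x ny physical cells. */
static bool p_is_active(int i, int j, int nx, int ny)
{
  return i < nx && j < ny;
}

/* Set the residual of every unused dof to f = x, so the matching Jacobian
   row is just a 1 on the diagonal. Residuals of active dof are left alone;
   the real Stokes stencil would fill those. */
void fill_dummy_residuals(int nx, int ny, const Field *x, Field *f)
{
  assert(nx > 0 && ny > 0);
  for (int j = 0; j <= ny; j++) {
    for (int i = 0; i <= nx; i++) {
      int k = j * (nx + 1) + i;
      if (!u_is_active(i, j, nx, ny)) f[k].u = x[k].u;
      if (!v_is_active(i, j, nx, ny)) f[k].v = x[k].v;
      if (!p_is_active(i, j, nx, ny)) f[k].p = x[k].p;
    }
  }
}

/* Count how many stored dof are dummies for a given grid. */
int count_dummy_dofs(int nx, int ny)
{
  int count = 0;
  for (int j = 0; j <= ny; j++) {
    for (int i = 0; i <= nx; i++) {
      count += !u_is_active(i, j, nx, ny);
      count += !v_is_active(i, j, nx, ny);
      count += !p_is_active(i, j, nx, ny);
    }
  }
  return count;
}
```

   For a 2 by 2 cell grid this marks 11 of the 27 stored dof as dummies: the 
extra column on the right keeps only u, the extra row on top keeps only v, and 
the top-right corner point keeps nothing.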

   Barry
 
> 
> Hui
> 
> 
> 
> ________________________________________
> From: Jed Brown [[email protected]]
> Sent: Saturday, May 31, 2014 2:48 PM
> To: Sun, Hui; [email protected]
> Subject: RE: [petsc-users] Question on DMMGSetSNESLocal from 
> snes/example/tutorials/ex19.c
> 
> "Sun, Hui" <[email protected]> writes:
> 
>> Thank you Jed for explaining this to me. I tried to compile and run with the 
>> following options:
>> ./ex19 -lidvelocity 100 -grashof 1e4 -da_grid_x 32 -da_grid_y 32 -da_refine 
>> 2 -snes_monitor_short -snes_converged_reason
>> 
>> 1). I use 2 cores and get the following output:
>> lid velocity = 100, prandtl # = 1, grashof # = 10000
>>  0 SNES Function norm 1111.93
>>  1 SNES Function norm 829.129
>>  2 SNES Function norm 532.66
>>  3 SNES Function norm 302.926
>>  4 SNES Function norm 3.64014
>>  5 SNES Function norm 0.0410053
>>  6 SNES Function norm 4.57951e-06
>> Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 6
>> Number of SNES iterations = 6
>> Time cost for creating solver context 0.000907183 s, and for solve 25.2764 s.
>> 
>> 2). I use 8 cores and get the following output:
>> lid velocity = 100, prandtl # = 1, grashof # = 10000
>>  0 SNES Function norm 1111.93
>>  1 SNES Function norm 829.049
>>  2 SNES Function norm 532.616
>>  3 SNES Function norm 303.165
>>  4 SNES Function norm 3.93436
>> Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE iterations 4
>> Number of SNES iterations = 4
>> Time cost for creating solver context 0.0244079 s, and for solve 25.0607 s.
>> 
>> First of all, the two runs yield different results.
> 
> The linear solve did not converge in the second case.
> 
> Run a more robust linear solver.  These problems can get difficult, but
> I think -pc_type asm -sub_pc_type lu should be sufficient.
> 
>> Secondly, the time cost comparison doesn't seem to be scaling correctly.
>> ( I have used petsctime.h to calculate the time cost. )
> 
> 1. Run in optimized mode.
> 
> 2. Don't use more processes than you have cores (I don't know if this
> affects you).
> 
> 3. This problem is too small to take advantage of much (if any)
> parallelism.
