Re: [petsc-users] Trying to understand -log_view when using HIP kernels (ex34)

2024-01-19 Thread Dave May
Thanks Barry!

On Fri, 19 Jan 2024 at 12:18, Barry Smith  wrote:

>
>   Junchao
>
> I run the following on the CI machine, why does this happen? With
> trivial solver options it runs ok.
>
> bsmith@petsc-gpu-02:/scratch/bsmith/petsc/src/ksp/ksp/tutorials$ ./ex34
> -da_grid_x 192 -da_grid_y 192 -da_grid_z 192 -dm_mat_type seqaijhipsparse
> -dm_vec_type seqhip -ksp_max_it 10 -ksp_monitor -ksp_type richardson
> -ksp_view -log_view -mg_coarse_ksp_max_it 2 -mg_coarse_ksp_type richardson
> -mg_coarse_pc_type none -mg_levels_ksp_type richardson -mg_levels_pc_type
> none -options_left -pc_mg_levels 3 -pc_mg_log -pc_type mg
>
> *[0]PETSC ERROR: - Error Message
> --*
>
> [0]PETSC ERROR: GPU error
>
> [0]PETSC ERROR: hipSPARSE errorcode 3 (HIPSPARSE_STATUS_INVALID_VALUE)
>
> [0]PETSC ERROR: WARNING! There are unused option(s) set! Could be the
> program crashed before usage or a spelling mistake, etc!
>
> [0]PETSC ERROR:   Option left: name:-options_left (no value) source:
> command line
>
> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
>
> [0]PETSC ERROR: Petsc Release Version 3.20.3, unknown
>
> [0]PETSC ERROR: ./ex34 on a  named petsc-gpu-02 by bsmith Fri Jan 19
> 14:15:20 2024
>
> [0]PETSC ERROR: Configure options
> --package-prefix-hash=/home/bsmith/petsc-hash-pkgs --with-make-np=24
> --with-make-test-np=8 --with-hipc=/opt/rocm-5.4.3/bin/hipcc
> --with-hip-dir=/opt/rocm-5.4.3 COPTFLAGS="-g -O" FOPTFLAGS="-g -O"
> CXXOPTFLAGS="-g -O" HIPOPTFLAGS="-g -O" --with-cuda=0 --with-hip=1
> --with-precision=double --with-clanguage=c --download-kokkos
> --download-kokkos-kernels --download-hypre --download-magma
> --with-magma-fortran-bindings=0 --download-mfem --download-metis
> --with-strict-petscerrorcode PETSC_ARCH=arch-ci-linux-hip-double
>
> [0]PETSC ERROR: #1 MatMultAddKernel_SeqAIJHIPSPARSE() at
> /scratch/bsmith/petsc/src/mat/impls/aij/seq/seqhipsparse/aijhipsparse.hip.cpp:3131
>
> [0]PETSC ERROR: #2 MatMultAdd_SeqAIJHIPSPARSE() at
> /scratch/bsmith/petsc/src/mat/impls/aij/seq/seqhipsparse/aijhipsparse.hip.cpp:3004
>
> [0]PETSC ERROR: #3 MatMultAdd() at
> /scratch/bsmith/petsc/src/mat/interface/matrix.c:2770
>
> [0]PETSC ERROR: #4 MatInterpolateAdd() at
> /scratch/bsmith/petsc/src/mat/interface/matrix.c:8603
>
> [0]PETSC ERROR: #5 PCMGMCycle_Private() at
> /scratch/bsmith/petsc/src/ksp/pc/impls/mg/mg.c:87
>
> [0]PETSC ERROR: #6 PCMGMCycle_Private() at
> /scratch/bsmith/petsc/src/ksp/pc/impls/mg/mg.c:83
>
> [0]PETSC ERROR: #7 PCApply_MG_Internal() at
> /scratch/bsmith/petsc/src/ksp/pc/impls/mg/mg.c:611
>
> [0]PETSC ERROR: #8 PCApply_MG() at
> /scratch/bsmith/petsc/src/ksp/pc/impls/mg/mg.c:633
>
> [0]PETSC ERROR: #9 PCApply() at
> /scratch/bsmith/petsc/src/ksp/pc/interface/precon.c:498
>
> [0]PETSC ERROR: #10 KSP_PCApply() at
> /scratch/bsmith/petsc/include/petsc/private/kspimpl.h:383
>
> [0]PETSC ERROR: #11 KSPSolve_Richardson() at
> /scratch/bsmith/petsc/src/ksp/ksp/impls/rich/rich.c:106
>
> [0]PETSC ERROR: #12 KSPSolve_Private() at
> /scratch/bsmith/petsc/src/ksp/ksp/interface/itfunc.c:906
>
> [0]PETSC ERROR: #13 KSPSolve() at
> /scratch/bsmith/petsc/src/ksp/ksp/interface/itfunc.c:1079
>
> [0]PETSC ERROR: #14 main() at ex34.c:52
>
> [0]PETSC ERROR: PETSc Option Table entries:
>
>   Dave,
>
> Trying to debug the 7% now, but having trouble running, as you see
> above.
>
>
>
> On Jan 19, 2024, at 3:02 PM, Dave May  wrote:
>
> Thank you Barry and Junchao for these explanations. I'll turn on
> -log_view_gpu_time.
>
> Do either of you have any thoughts regarding why the percentage of flops
> being reported on the GPU is not 100% for MGSmooth Level {0,1,2} for this
> solver configuration?
>
> This number should have nothing to do with timings as it reports the ratio
> of operations performed on the GPU and CPU, presumably obtained from
> PetscLogFlops() and PetscLogGpuFlops().
>
> Cheers,
> Dave
>
> On Fri, 19 Jan 2024 at 11:39, Junchao Zhang 
> wrote:
>
>> Try to also add -log_view_gpu_time,
>> https://petsc.org/release/manualpages/Profiling/PetscLogGpuTime/
>>
>> --Junchao Zhang
>>
>>
>> On Fri, Jan 19, 2024 at 11:35 AM Dave May 
>> wrote:
>>
>>> Hi all,
>>>
>>> I am trying to understand the logging information associated with the
>>> %flops-performed-on-the-gpu reported by -log_view when running
>>>   src/ksp/ksp/tutorials/ex34
>>> with the following options
>>> -da_grid_x 192
>>> -da_grid_y 192
>>&

Re: [petsc-users] Trying to understand -log_view when using HIP kernels (ex34)

2024-01-19 Thread Dave May
Thank you Barry and Junchao for these explanations. I'll turn on
-log_view_gpu_time.

Do either of you have any thoughts regarding why the percentage of flops
being reported on the GPU is not 100% for MGSmooth Level {0,1,2} for this
solver configuration?

This number should have nothing to do with timings as it reports the ratio
of operations performed on the GPU and CPU, presumably obtained from
PetscLogFlops() and PetscLogGpuFlops().
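As a minimal illustration of those two calls (the event name and flop counts
below are made up and not taken from ex34), an event that does all of its
floating point work on the GPU would be logged roughly like this:

  PetscLogEvent MY_KERNEL;
  PetscInt      n = 1000000; /* illustrative problem size */

  PetscCall(PetscLogEventRegister("MyKernel", PETSC_OBJECT_CLASSID, &MY_KERNEL));
  PetscCall(PetscLogEventBegin(MY_KERNEL, 0, 0, 0, 0));
  /* ... launch the HIP kernel here ... */
  PetscCall(PetscLogGpuFlops(2.0 * n)); /* accumulated into the event's GPU flop total */
  PetscCall(PetscLogFlops(0.0));        /* accumulated into the event's CPU flop total */
  PetscCall(PetscLogEventEnd(MY_KERNEL, 0, 0, 0, 0));

If that is how the counters work, "GPU %F" for the event would presumably be
100 * (GPU flops) / (CPU flops + GPU flops).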

Cheers,
Dave

On Fri, 19 Jan 2024 at 11:39, Junchao Zhang  wrote:

> Try to also add -log_view_gpu_time,
> https://petsc.org/release/manualpages/Profiling/PetscLogGpuTime/
>
> --Junchao Zhang
>
>
> On Fri, Jan 19, 2024 at 11:35 AM Dave May  wrote:
>
>> Hi all,
>>
>> I am trying to understand the logging information associated with the
>> %flops-performed-on-the-gpu reported by -log_view when running
>>   src/ksp/ksp/tutorials/ex34
>> with the following options
>> -da_grid_x 192
>> -da_grid_y 192
>> -da_grid_z 192
>> -dm_mat_type seqaijhipsparse
>> -dm_vec_type seqhip
>> -ksp_max_it 10
>> -ksp_monitor
>> -ksp_type richardson
>> -ksp_view
>> -log_view
>> -mg_coarse_ksp_max_it 2
>> -mg_coarse_ksp_type richardson
>> -mg_coarse_pc_type none
>> -mg_levels_ksp_type richardson
>> -mg_levels_pc_type none
>> -options_left
>> -pc_mg_levels 3
>> -pc_mg_log
>> -pc_type mg
>>
>> This config is not intended to actually solve the problem, rather it is a
>> stripped down set of options designed to understand what parts of the
>> smoothers are being executed on the GPU.
>>
>>> With respect to the log file attached, my first set of questions relates
>>> to the data reported under "Event Stage 2: MG Apply".
>>
>> [1] Why is the log littered with nan's?
>> * I don't understand how and why "GPU Mflop/s" should be reported as nan
>> when a value is given for "GPU %F" (see MatMult for example).
>>
>>> * For events executed on the GPU, I assume the column "Time (sec)"
>>> relates to "CPU execute time"; this would explain why we see a nan in "Time
>>> (sec)" for MatMult.
>>> If my assumption is correct, how should I interpret the column "Flop
>>> (Max)", which is showing 1.92e+09?
>>> I would assume that if "Time (sec)" relates to the CPU then "Flop (Max)"
>>> should also relate to the CPU, and GPU flops would be logged in "GPU Mflop/s".
>>
>> [2] More curious is that within "Event Stage 2: MG Apply" KSPSolve,
>> MGSmooth Level 0, MGSmooth Level 1, MGSmooth Level 2 all report "GPU %F" as
>> 93. I believe this value should be 100 as the smoother (and coarse grid
>> solver) are configured as richardson(2)+none and thus should run entirely
>> on the GPU.
>> Furthermore, when one inspects all events listed under "Event Stage 2: MG
>> Apply" those events which do flops correctly report "GPU %F" as 100.
>> And the events showing "GPU %F" = 0 such as
>>   MatHIPSPARSCopyTo, VecCopy, VecSet, PCApply, DCtxSync
>> don't do any flops (on the CPU or GPU) - which is also correct
>> (although non GPU events should show nan??)
>>
>> Hence I am wondering what is the explanation for the missing 7% from "GPU
>> %F" for KSPSolve and MGSmooth {0,1,2}??
>>
>> Does anyone understand this -log_view, or can explain to me how to
>> interpret it?
>>
>> It could simply be that:
>> a) something is messed up with -pc_mg_log
>> b) something is messed up with the PETSc build
>> c) I am putting too much faith in -log_view and should profile the code
>> differently.
>>
>> Either way I'd really like to understand what is going on.
>>
>>
>> Cheers,
>> Dave
>>
>>
>>
>>


[petsc-users] Trying to understand -log_view when using HIP kernels (ex34)

2024-01-19 Thread Dave May
Hi all,

I am trying to understand the logging information associated with the
%flops-performed-on-the-gpu reported by -log_view when running
  src/ksp/ksp/tutorials/ex34
with the following options
-da_grid_x 192
-da_grid_y 192
-da_grid_z 192
-dm_mat_type seqaijhipsparse
-dm_vec_type seqhip
-ksp_max_it 10
-ksp_monitor
-ksp_type richardson
-ksp_view
-log_view
-mg_coarse_ksp_max_it 2
-mg_coarse_ksp_type richardson
-mg_coarse_pc_type none
-mg_levels_ksp_type richardson
-mg_levels_pc_type none
-options_left
-pc_mg_levels 3
-pc_mg_log
-pc_type mg

This config is not intended to actually solve the problem, rather it is a
stripped down set of options designed to understand what parts of the
smoothers are being executed on the GPU.

With respect to the log file attached, my first set of questions relates to
the data reported under "Event Stage 2: MG Apply".

[1] Why is the log littered with nan's?
* I don't understand how and why "GPU Mflop/s" should be reported as nan
when a value is given for "GPU %F" (see MatMult for example).

* For events executed on the GPU, I assume the column "Time (sec)" relates
to "CPU execute time"; this would explain why we see a nan in "Time (sec)"
for MatMult.
If my assumption is correct, how should I interpret the column "Flop (Max)",
which is showing 1.92e+09?
I would assume that if "Time (sec)" relates to the CPU then "Flop (Max)" should
also relate to the CPU, and GPU flops would be logged in "GPU Mflop/s".

[2] More curious is that within "Event Stage 2: MG Apply" KSPSolve,
MGSmooth Level 0, MGSmooth Level 1, MGSmooth Level 2 all report "GPU %F" as
93. I believe this value should be 100 as the smoother (and coarse grid
solver) are configured as richardson(2)+none and thus should run entirely
on the GPU.
Furthermore, when one inspects all events listed under "Event Stage 2: MG
Apply", those events which do flops correctly report "GPU %F" as 100.
And the events showing "GPU %F" = 0, such as
  MatHIPSPARSCopyTo, VecCopy, VecSet, PCApply, DCtxSync,
don't do any flops (on the CPU or GPU) - which is also correct
(although non-GPU events should show nan??).

Hence I am wondering what is the explanation for the missing 7% from "GPU
%F" for KSPSolve and MGSmooth {0,1,2}??

Does anyone understand this -log_view, or can explain to me how to
interpret it?

It could simply be that:
a) something is messed up with -pc_mg_log
b) something is messed up with the PETSc build
c) I am putting too much faith in -log_view and should profile the code
differently.

Either way I'd really like to understand what is going on.


Cheers,
Dave


ex34_192_mg_seqhip_richardson_pcnone.o5748667
Description: Binary data


Re: [petsc-users] sources of floating point randomness in JFNK in serial

2023-05-04 Thread Dave May
Is your code valgrind clean?

On Thu 4. May 2023 at 05:54, Mark Lohry  wrote:

> Try -pc_type none.
>>
>
> With -pc_type none the 0 KSP residual looks identical. But *sometimes*
> it's producing exactly the same history and other times it's gradually
> changing. I'm reasonably confident my residual evaluation has no
> randomness; see info after the petsc output.
>
> solve history 1:
>
>   0 SNES Function norm 3.424003312857e+04
> 0 KSP Residual norm 3.424003312857e+04
> 1 KSP Residual norm 2.87173536e+04
> 2 KSP Residual norm 2.490276931041e+04
> ...
>20 KSP Residual norm 7.449686034356e+03
>   Linear solve converged due to CONVERGED_ITS iterations 20
>   1 SNES Function norm 1.085015821006e+04
>
> solve history 2, identical to 1:
>
>   0 SNES Function norm 3.424003312857e+04
> 0 KSP Residual norm 3.424003312857e+04
> 1 KSP Residual norm 2.87173536e+04
> 2 KSP Residual norm 2.490276931041e+04
> ...
>20 KSP Residual norm 7.449686034356e+03
>   Linear solve converged due to CONVERGED_ITS iterations 20
>   1 SNES Function norm 1.085015821006e+04
>
> solve history 3, identical KSP at 0 and 1, slight change at 2, growing
> difference to the end:
>   0 SNES Function norm 3.424003312857e+04
> 0 KSP Residual norm 3.424003312857e+04
> 1 KSP Residual norm 2.87173536e+04
> 2 KSP Residual norm 2.490276930242e+04
> ...
>  20 KSP Residual norm 7.449686095424e+03
>   Linear solve converged due to CONVERGED_ITS iterations 20
>   1 SNES Function norm 1.085015646971e+04
>
>
> This is using a standard explicit 3-stage Runge-Kutta smoother for 10
> iterations, so 30 calls of the same residual evaluation, with identical
> residuals every time.
>
> run 1:
>
> # iterationrho rhourhov
>  rhoEabs_res rel_res umin
>  vmaxvminelapsed_time
> #
>
>
>   1.0e+00  1.086860616292e+00  2.782316758416e+02
>  4.482867643761e+00  2.993435920340e+02 2.04353e+02
> 1.0e+00-8.23945e-15-6.15326e-15-1.35563e-14
> 6.34834e-01
>   2.0e+00  2.310547487017e+00  1.079059352425e+02
>  3.958323921837e+00  5.058927165686e+02 2.58647e+02
> 1.26568e+00-1.02539e-14-9.35368e-15-1.69925e-14
> 6.40063e-01
>   3.0e+00  2.361005867444e+00  5.706213331683e+01
>  6.130016323357e+00  4.688968362579e+02 2.36201e+02
> 1.15585e+00-1.19370e-14-1.15216e-14-1.59733e-14
> 6.45166e-01
>   4.0e+00  2.16751863e+00  3.757541401594e+01
>  6.313917437428e+00  4.054310291628e+02 2.03612e+02
> 9.96372e-01-1.81831e-14-1.28312e-14-1.46238e-14
> 6.50494e-01
>   5.0e+00  1.941443738676e+00  2.884190334049e+01
>  6.237106158479e+00  3.539201037156e+02 1.77577e+02
> 8.68970e-01 3.56633e-14-8.74089e-15-1.0e-14
> 6.55656e-01
>   6.0e+00  1.736947124693e+00  2.429485695670e+01
>  5.996962200407e+00  3.148280178142e+02 1.57913e+02
> 7.72745e-01-8.98634e-14-2.41152e-14-1.39713e-14
> 6.60872e-01
>   7.0e+00  1.564153212635e+00  2.149609219810e+01
>  5.786910705204e+00  2.848717011033e+02 1.42872e+02
> 6.99144e-01-2.95352e-13-2.48158e-14-2.39351e-14
> 6.66041e-01
>   8.0e+00  1.419280815384e+00  1.950619804089e+01
>  5.627281158306e+00  2.606623371229e+02 1.30728e+02
> 6.39715e-01 8.98941e-13 1.09674e-13 3.78905e-14
> 6.71316e-01
>   9.0e+00  1.296115915975e+00  1.794843530745e+01
>  5.514933264437e+00  2.401524522393e+02 1.20444e+02
> 5.89394e-01 1.70717e-12 1.38762e-14 1.09825e-13
> 6.76447e-01
>   1.0e+01  1.189639693918e+00  1.665381754953e+01
>  5.433183087037e+00  2.222572900473e+02 1.11475e+02
> 5.45501e-01-4.22462e-12-7.15206e-13-2.28736e-13
> 6.81716e-01
>
> run N:
>
>
> #
>
>
> # iterationrho rhourhov
>  rhoEabs_res rel_res umin
>  vmaxvminelapsed_time
> #
>
>
>   1.0e+00  1.086860616292e+00  2.782316758416e+02
>  4.482867643761e+00  2.993435920340e+02 2.04353e+02
> 1.0e+00-8.23945e-15-6.15326e-15-1.35563e-14
> 6.23316e-01
>   2.0e+00  2.310547487017e+00  1.079059352425e+02
>  3.958323921837e+00  5.058927165686e+02 2.58647e+02
> 1.26568e+00-1.02539e-14-9.35368e-15-1.69925e-14
> 6.28510e-01
>   3.0e+00  2.361005867444e+00  5.706213331683e+01
>  6.130016323357e+00  4.688968362579e+02 2.36201e+02
> 1.15585e+00-1.19370e-14-1.15216e-14-1.59733e-14
> 

Re: [petsc-users] DMSWARM with DMDA and KSP

2023-05-01 Thread Dave May
On Mon 1. May 2023 at 18:57, Matthew Young 
wrote:

> Thanks for the suggestion to keep DMs separate, and for pointing me toward
> that example. I now have a DM for the particle quantities (i.e., density
> and flux) and another for the potential. I'm hoping to use
> KSPSetComputeOperators with PCGAMG, so I packed the density DM into the
> application context and set the potential DM on the KSP, but I'm not sure
> how to communicate changes in the KSP DM (e.g., coarsening) to the density
> DM inside my operator function.
>

I don’t think you need to.

GAMG only requires the fine-grid operator - this will be the matrix
assembled from KSPSetComputeOperators. Hence the density DM and potential DM
fields only need to be managed by you on the finest level.

However, if you wanted to use PCMG with rediscretized operators on every
level, then you would need the density DM field defined on each level of
your geometric multigrid hierarchy. This could be done (possibly less than
ideally) by calling DMCreateInterpolation() and then using the Mat to
interpolate the density from the finest level to the next coarsest level
(and so on).
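As a rough sketch of that interpolation step (dmCoarse, dmFine and the vector
names below are placeholders of mine, not taken from any example):

#include <petscdm.h>

static PetscErrorCode RestrictDensityToCoarse(DM dmCoarse, DM dmFine, Vec rhoFine, Vec rhoCoarse)
{
  Mat interp;
  Vec scale;

  PetscFunctionBeginUser;
  /* interpolation from the coarse DMDA to the fine DMDA, plus its scaling vector */
  PetscCall(DMCreateInterpolation(dmCoarse, dmFine, &interp, &scale));
  /* move the density from the fine level down to the coarse level */
  PetscCall(MatRestrict(interp, rhoFine, rhoCoarse));
  /* rescale so that a constant fine-level field stays constant on the coarse level */
  PetscCall(VecPointwiseMult(rhoCoarse, rhoCoarse, scale));
  PetscCall(VecDestroy(&scale));
  PetscCall(MatDestroy(&interp));
  PetscFunctionReturn(PETSC_SUCCESS);
}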

Thanks,
Dave


>
> --Matt
> ==
> Matthew Young, PhD (he/him)
> Research Scientist II
> Space Science Center
> University of New Hampshire
> matthew.yo...@unh.edu
> ==
>
>
> On Sun, Apr 30, 2023 at 1:52 PM Matthew Knepley  wrote:
>
>> On Sun, Apr 30, 2023 at 1:12 PM Matthew Young <
>> myoung.space.scie...@gmail.com> wrote:
>>
>>> Hi all,
>>>
>>> I am developing a particle-in-cell code that models ions as particles
>>> and electrons as an inertialess fluid. I use a PIC DMSWARM for the ions,
>>> which I gather into density and flux before solving a linear system for the
>>> electrostatic potential (phi). I currently have one DMDA with 5 degrees of
>>> freedom -- one each for density, 3 flux components, and phi.
>>>
>>> When setting up the linear system to solve for phi, I've been following
>>> examples like KSP ex34.c and ex42.c when writing the KSP operator and RHS
>>> functions but I'm not sure I have the right approach, since 4 of the DOFs
>>> are known and 1 is unknown.
>>>
>>> I saw this thread
>>> 
>>> that recommended using DMDAGetReducedDMDA, which I gather has been
>>> deprecated in favor of DMDACreateCompatibleDMDA. Is that a good approach
>>> for managing a regular grid with known and unknown quantities on each node?
>>> Could a composite DM be useful? Has anyone else worked on a problem like
>>> this?
>>>
>>
>> I recommend making a different DM for each kind of solve you want.
>> DMDACreateCompatibleDMDA() should be the implementation of DMClone(), but
>> we have yet to harmonize all things for all DMs. I would create one DM for
>> your Vlasov components and one for the Poisson.
>> We follow this strategy in our Vlasov-Poisson test for Landau damping:
>> https://gitlab.com/petsc/petsc/-/blob/main/src/dm/impls/swarm/tests/ex9.c
>>
>>   Thanks,
>>
>>  Matt
>>
>>
>>> --Matt
>>> ==
>>> Matthew Young, PhD (he/him)
>>> Research Scientist II
>>> Space Science Center
>>> University of New Hampshire
>>> matthew.yo...@unh.edu
>>> ==
>>>
>>
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which their
>> experiments lead.
>> -- Norbert Wiener
>>
>> https://www.cse.buffalo.edu/~knepley/
>> 
>>
>


Re: [petsc-users] MPI+OpenMP+MKL

2023-04-07 Thread Dave May
On Fri 7. Apr 2023 at 07:06, Astor Piaz  wrote:

> Hello petsc-users,
> I am trying to use a code that is parallelized with a combination of
> OpenMP and MKL parallelisms, where OpenMP threads are able to spawn MPI
> processes.
>

Is this really the correct way to go?


Would it not be more suitable (or simpler) to run your application on an
MPI sub-communicator which maps one rank to, say, one compute node, and then
within each rank of the sub-communicator utilize your threaded OpenMP / MKL
code using as many physical threads as there are cores per node (and/or
hyperthreads if that is effective for you)?
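A sketch of what I mean (all names are mine; one rank per compute node is kept
for PETSc and the remaining cores are left to your OpenMP / MKL threads):

#include <petscsys.h>

int main(int argc, char **argv)
{
  MPI_Comm nodecomm, subcomm;
  int      noderank;

  MPI_Init(&argc, &argv);
  /* group the ranks that share a compute node */
  MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0, MPI_INFO_NULL, &nodecomm);
  MPI_Comm_rank(nodecomm, &noderank);
  /* keep only the first rank on each node; the other ranks get MPI_COMM_NULL */
  MPI_Comm_split(MPI_COMM_WORLD, noderank == 0 ? 0 : MPI_UNDEFINED, 0, &subcomm);
  if (subcomm != MPI_COMM_NULL) {
    PETSC_COMM_WORLD = subcomm; /* must be set before PetscInitialize() */
    PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
    /* ... FGMRES + MatShell live on subcomm; inside the MatShell each rank
       spawns its OpenMP / MKL threads for the node-local work ... */
    PetscCall(PetscFinalize());
  }
  MPI_Comm_free(&nodecomm);
  MPI_Finalize();
  return 0;
}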

Thanks,
Dave

> I have carefully scheduled the processes such that the right amount is
> launched, at the right time.
> When trying to use my code inside a MatShell (for later use in an FGMRES
> KSPSolver), MKL processes are not being used.
>
> I am sorry if this has been asked before.
> What configuration should I use in order to profit from MPI+OpenMP+MKL
> parallelism?
>
> Thank you!
> --
> Astor
>


Re: [petsc-users] Memory Usage in Matrix Assembly.

2023-03-14 Thread Dave May
On Tue, 14 Mar 2023 at 07:59, Pantelis Moschopoulos <
pmoschopou...@outlook.com> wrote:

> Dear Dave,
>
> Yes, I observe this in parallel runs. How I can change the parallel layout
> of the matrix? In my implementation, I read the mesh file, and the I split
> the domain where the first rank gets the first N elements, the second rank
> gets the next N elements etc. Should I use metis to distribute elements?
>


> Note that I use continuous finite elements, which means that some values
> will be cached in a temporary buffer.
>

Sure. With CG FE you will always have some DOFs which need to be cached;
however, the number of cached values will be minimized if you follow Barry's
advice. If you do what Barry suggests, only the DOFs which live on the
boundary of your element-wise defined sub-domains would need to be cached.

Thanks,
Dave


>
> Thank you very much,
> Pantelis
> --
> *From:* Dave May 
> *Sent:* Tuesday, March 14, 2023 4:40 PM
> *To:* Pantelis Moschopoulos 
> *Cc:* petsc-users@mcs.anl.gov 
> *Subject:* Re: [petsc-users] Memory Usage in Matrix Assembly.
>
>
>
> On Tue 14. Mar 2023 at 07:15, Pantelis Moschopoulos <
> pmoschopou...@outlook.com> wrote:
>
> Hi everyone,
>
> I am a new Petsc user that incorporates Petsc for FEM in a Fortran code.
> My question concerns the sudden increase of the memory that Petsc needs
> during the assembly of the jacobian matrix. After this point, memory is
> freed. It seems to me like Petsc performs memory allocations and the
> deallocations during assembly.
> I have used the following commands with no success:
> CALL MatSetOption(petsc_A, MAT_NEW_NONZERO_LOCATIONS,PETSC_FALSE,ier)
> CALL MatSetOption(petsc_A, MAT_NEW_NONZERO_LOCATION_ERR,PETSC_TRUE,ier)
> CALL MatSetOption(petsc_A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_TRUE,ier).
> CALL MatSetOption(petsc_A, MAT_KEEP_NONZERO_PATTERN,PETSC_TRUE,ier)
>
> The structure of the matrix does not change during my simulation, just the
> values. I am expecting this behavior the first time that I create this
> matrix because the preallocation instructions that I use are not very
> accurate but this continues every time I assemble the matrix.
> What I am missing here?
>
>
> I am guessing this observation is seen when you run a parallel job.
>
> MatSetValues() will cache values in a temporary memory buffer if the
> values are to be sent to a different MPI rank.
> Hence if the parallel layout of your matrix doesn’t closely match the
> layout of the DOFs on each mesh sub-domain, then a huge number of values
> can potentially be cached. After you call MatAssemblyBegin(),
> MatAssemblyEnd() this cache will be freed.
>
> Thanks,
> Dave
>
>
>
> Thank you very much,
> Pantelis
>
>


Re: [petsc-users] Memory Usage in Matrix Assembly.

2023-03-14 Thread Dave May
On Tue 14. Mar 2023 at 07:15, Pantelis Moschopoulos <
pmoschopou...@outlook.com> wrote:

> Hi everyone,
>
> I am a new Petsc user that incorporates Petsc for FEM in a Fortran code.
> My question concerns the sudden increase of the memory that Petsc needs
> during the assembly of the jacobian matrix. After this point, memory is
> freed. It seems to me like Petsc performs memory allocations and the
> deallocations during assembly.
> I have used the following commands with no success:
> CALL MatSetOption(petsc_A, MAT_NEW_NONZERO_LOCATIONS,PETSC_FALSE,ier)
> CALL MatSetOption(petsc_A, MAT_NEW_NONZERO_LOCATION_ERR,PETSC_TRUE,ier)
> CALL MatSetOption(petsc_A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_TRUE,ier).
> CALL MatSetOption(petsc_A, MAT_KEEP_NONZERO_PATTERN,PETSC_TRUE,ier)
>
> The structure of the matrix does not change during my simulation, just the
> values. I am expecting this behavior the first time that I create this
> matrix because the preallocation instructions that I use are not very
> accurate but this continues every time I assemble the matrix.
> What I am missing here?
>

I am guessing this observation is seen when you run a parallel job.

MatSetValues() will cache values in a temporary memory buffer if the values
are to be sent to a different MPI rank.
Hence if the parallel layout of your matrix doesn’t closely match the
layout of the DOFs on each mesh sub-domain, then a huge number of values
can potentially be cached. After you call MatAssemblyBegin() and
MatAssemblyEnd(), this cache will be freed.
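As a quick way to see how well the two layouts match, you could print the
locally owned row range of the matrix (a sketch in C; petsc_A is your matrix,
the other variable names are mine):

  PetscInt    rstart, rend;
  PetscMPIInt rank;

  PetscCallMPI(MPI_Comm_rank(PETSC_COMM_WORLD, &rank));
  PetscCall(MatGetOwnershipRange(petsc_A, &rstart, &rend));
  PetscCall(PetscSynchronizedPrintf(PETSC_COMM_WORLD, "[%d] owns rows %" PetscInt_FMT " -- %" PetscInt_FMT "\n", rank, rstart, rend - 1));
  PetscCall(PetscSynchronizedFlush(PETSC_COMM_WORLD, PETSC_STDOUT));
  /* every MatSetValues() row index outside [rstart, rend) ends up in the
     temporary buffer and is communicated during MatAssemblyBegin/End */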

Thanks,
Dave



> Thank you very much,
> Pantelis
>


Re: [petsc-users] PetscViewer with 64bit

2023-02-14 Thread Dave May
On Tue 14. Feb 2023 at 21:27, Jed Brown  wrote:

> Dave May  writes:
>
> > On Tue 14. Feb 2023 at 17:17, Jed Brown  wrote:
> >
> >> Can you share a reproducer? I think I recall the format requiring
> certain
> >> things to be Int32.
> >
> >
> > By default, the byte offset used with the appended data format is
> UInt32. I
> > believe that’s where the sizeof(int) is coming from. This default is
> > annoying as it limits the total size of your appended data to be < 3 GB.
> > That said, in the opening of the paraview file you can add this attribute
> >
> > header_type="UInt64"
>
> You mean in the header of the .vtu?


Yeah, within the open VTKFile tag.
Like this
<VTKFile type="xxx" byte_order="LittleEndian" header_type="UInt64">

Do you happen to have an example or pointers to docs describing this
> feature?


Example yes - will send it to you tomorrow. Docs… not really. Only stuff
like this

https://kitware.github.io/paraview-docs/latest/python/paraview.simple.XMLPStructuredGridWriter.html


https://kitware.github.io/paraview-docs/v5.8.0/python/paraview.simple.XMLMultiBlockDataWriter.html

All the writers seem to support it.


Can we always do this?


Yep!


It isn't mentioned in these:
>
> https://vtk.org/wp-content/uploads/2015/04/file-formats.pdf   (PDF was
> created in 2003)
>
> https://kitware.github.io/vtk-examples/site/VTKFileFormats/#xml-file-formats
>

Yes I know. I’ve tied myself in knots for years because of the
assumption that the offset had to be an int.

Credit for the discovery goes to Carsten Uphoff. He showed me this.

Cheers,
Dave



> > then the size of the offset is now UInt64 and now large files can be
> > finally written.
> >
> >
> > Cheers,
> > Dave
> >
> >
> >
> >
> >>
> >> Mike Michell  writes:
> >>
> >> > Thanks for the note.
> >> > I understood that PETSc calculates the offsets for me through
> "boffset"
> >> > variable in plexvtu.c file. Please correct me if it is wrong.
> >> >
> >> > If plexvtu.c has a bug, it could be around "write file header" part in
> >> > which the boffset is also computed. Is this correct? I am not using
> >> complex
> >> > number.
> >> > There are several mixed parts among "Int32, UInt8, PetscInt_FMT,
> >> > PetscInt64_FMT" in writing the header.
> >> >
> >> > Which combination of those flags is correct for 64bit indices? I am
> gonna
> >> > modify plexvtu.c file with "#if defined(PETSC_USE_64BIT_INDICES)"
> >> > statement, but I do not know what is the correct form of the header
> flag
> >> > for 64bit indices.
> >> >
> >> > It is also confusing to me:
> >> > boffset += gpiece[r].ncells * sizeof(PetscInt) + sizeof(int);
> >> > How is sizeof(PetscInt) different from sizeof(int)?
> >> >
> >> > Thanks,
> >> > Mike
> >> >
> >> >
> >> >> On Tue, Feb 14, 2023 at 11:45 AM Mike Michell  >
> >> >> wrote:
> >> >>
> >> >>> I was trying to modify the header flags from "Int32" to "Int64", but
> >> the
> >> >>> problem was not resolved. Could I get any additional comments?
> >> >>>
> >> >>
> >> >> The calculated offsets are not correct I think.
> >> >>
> >> >>   Matt
> >> >>
> >> >>
> >> >>> Thanks,
> >> >>> Mike
> >> >>>
> >> >>>
> >> >>>> Thanks for the comments.
> >> >>>> To be precise on the question, the entire part of the header of the
> >> .vtu
> >> >>>> file is attached:
> >> >>>>
> >> >>>> 
> >> >>>>  >> byte_order="LittleEndian">
> >> >>>>   
> >> >>>> 
> >> >>>>   
> >> >>>>  >> NumberOfComponents="3"
> >> >>>> format="appended" offset="0" />
> >> >>>>   
> >> >>>>   
> >> >>>>  >> >>>> NumberOfComponents="1" format="appended" offset="116932" />
> >> >>>>  >> >>>>  NumberOfComponents="1" format="appen

Re: [petsc-users] PetscViewer with 64bit

2023-02-14 Thread Dave May
On Tue 14. Feb 2023 at 21:03, Dave May  wrote:

>
>
> On Tue 14. Feb 2023 at 17:17, Jed Brown  wrote:
>
>> Can you share a reproducer? I think I recall the format requiring certain
>> things to be Int32.
>
>
> By default, the byte offset used with the appended data format is UInt32.
> I believe that’s where the sizeof(int) is coming from. This default is
> annoying as it limits the total size of your appended data to be < 3 GB.
>

Oops, I meant to type 4 GB

That said, in the opening of the paraview file you can add this attribute
>
> header_type="UInt64"
>
> then the size of the offset is now UInt64 and now large files can be
> finally written.
>
>
> Cheers,
> Dave
>
>
>
>
>>
>> Mike Michell  writes:
>>
>> > Thanks for the note.
>> > I understood that PETSc calculates the offsets for me through "boffset"
>> > variable in plexvtu.c file. Please correct me if it is wrong.
>> >
>> > If plexvtu.c has a bug, it could be around "write file header" part in
>> > which the boffset is also computed. Is this correct? I am not using
>> complex
>> > number.
>> > There are several mixed parts among "Int32, UInt8, PetscInt_FMT,
>> > PetscInt64_FMT" in writing the header.
>> >
>> > Which combination of those flags is correct for 64bit indices? I am
>> gonna
>> > modify plexvtu.c file with "#if defined(PETSC_USE_64BIT_INDICES)"
>> > statement, but I do not know what is the correct form of the header flag
>> > for 64bit indices.
>> >
>> > It is also confusing to me:
>> > boffset += gpiece[r].ncells * sizeof(PetscInt) + sizeof(int);
>> > How is sizeof(PetscInt) different from sizeof(int)?
>> >
>> > Thanks,
>> > Mike
>> >
>> >
>> >> On Tue, Feb 14, 2023 at 11:45 AM Mike Michell 
>> >> wrote:
>> >>
>> >>> I was trying to modify the header flags from "Int32" to "Int64", but
>> the
>> >>> problem was not resolved. Could I get any additional comments?
>> >>>
>> >>
>> >> The calculated offsets are not correct I think.
>> >>
>> >>   Matt
>> >>
>> >>
>> >>> Thanks,
>> >>> Mike
>> >>>
>> >>>
>> >>>> Thanks for the comments.
>> >>>> To be precise on the question, the entire part of the header of the
>> .vtu
>> >>>> file is attached:
>> >>>>
>> >>>> 
>> >>>> > byte_order="LittleEndian">
>> >>>>   
>> >>>> 
>> >>>>   
>> >>>> > NumberOfComponents="3"
>> >>>> format="appended" offset="0" />
>> >>>>   
>> >>>>   
>> >>>> > >>>> NumberOfComponents="1" format="appended" offset="116932" />
>> >>>> > >>>>  NumberOfComponents="1" format="appended" offset="372936" />
>> >>>> > >>>>  NumberOfComponents="1" format="appended" offset="404940" />
>> >>>>   
>> >>>>   
>> >>>> > >>>> format="appended" offset="408944" />
>> >>>>   
>> >>>>   
>> >>>> > >>>> NumberOfComponents="1" format="appended" offset="424948" />
>> >>>>   
>> >>>> 
>> >>>> 
>> >>>>   
>> >>>> > NumberOfComponents="3"
>> >>>> format="appended" offset="463928" />
>> >>>>   
>> >>>>   
>> >>>> > >>>> NumberOfComponents="1" format="appended" offset="580860" />
>> >>>> > >>>>  NumberOfComponents="1" format="appended" offset="836864" />
>> >>>> > >>>>  NumberOfComponents="1" format="appended" offset="868868" />
>> >>>>   
>> >>>>   
>> >>>> &g

Re: [petsc-users] PetscViewer with 64bit

2023-02-14 Thread Dave May
On Tue 14. Feb 2023 at 17:17, Jed Brown  wrote:

> Can you share a reproducer? I think I recall the format requiring certain
> things to be Int32.


By default, the byte offset used with the appended data format is UInt32. I
believe that’s where the sizeof(int) is coming from. This default is
annoying as it limits the total size of your appended data to be < 3 GB.
That said, in the opening of the paraview file you can add this attribute

header_type="UInt64"

then the size of the offset is now UInt64 and now large files can be
finally written.


Cheers,
Dave




>
> Mike Michell  writes:
>
> > Thanks for the note.
> > I understood that PETSc calculates the offsets for me through "boffset"
> > variable in plexvtu.c file. Please correct me if it is wrong.
> >
> > If plexvtu.c has a bug, it could be around "write file header" part in
> > which the boffset is also computed. Is this correct? I am not using
> complex
> > number.
> > There are several mixed parts among "Int32, UInt8, PetscInt_FMT,
> > PetscInt64_FMT" in writing the header.
> >
> > Which combination of those flags is correct for 64bit indices? I am gonna
> > modify plexvtu.c file with "#if defined(PETSC_USE_64BIT_INDICES)"
> > statement, but I do not know what is the correct form of the header flag
> > for 64bit indices.
> >
> > It is also confusing to me:
> > boffset += gpiece[r].ncells * sizeof(PetscInt) + sizeof(int);
> > How is sizeof(PetscInt) different from sizeof(int)?
> >
> > Thanks,
> > Mike
> >
> >
> >> On Tue, Feb 14, 2023 at 11:45 AM Mike Michell 
> >> wrote:
> >>
> >>> I was trying to modify the header flags from "Int32" to "Int64", but
> the
> >>> problem was not resolved. Could I get any additional comments?
> >>>
> >>
> >> The calculated offsets are not correct I think.
> >>
> >>   Matt
> >>
> >>
> >>> Thanks,
> >>> Mike
> >>>
> >>>
>  Thanks for the comments.
>  To be precise on the question, the entire part of the header of the
> .vtu
>  file is attached:
> 
>  
>   byte_order="LittleEndian">
>    
>  
>    
>   NumberOfComponents="3"
>  format="appended" offset="0" />
>    
>    
>    NumberOfComponents="1" format="appended" offset="116932" />
>     NumberOfComponents="1" format="appended" offset="372936" />
>     NumberOfComponents="1" format="appended" offset="404940" />
>    
>    
>    format="appended" offset="408944" />
>    
>    
>    NumberOfComponents="1" format="appended" offset="424948" />
>    
>  
>  
>    
>   NumberOfComponents="3"
>  format="appended" offset="463928" />
>    
>    
>    NumberOfComponents="1" format="appended" offset="580860" />
>     NumberOfComponents="1" format="appended" offset="836864" />
>     NumberOfComponents="1" format="appended" offset="868868" />
>    
>    
>    format="appended" offset="872872" />
>    
>    
>    NumberOfComponents="1" format="appended" offset="76" />
>    
>  
>    
>    
> 
> 
>  Thanks,
>  Mike
> 
> 
> > On Sun, Feb 12, 2023 at 6:15 PM Mike Michell 
> > wrote:
> >
> >> Dear PETSc team,
> >>
> >> I am a user of PETSc with Fortran. My code uses DMPlex to handle dm
> >> object. To print out output variable and mesh connectivity, I use
> VecView()
> >> by defining PetscSection on that dm and borrow a vector. The type
> of the
> >> viewer is set to PETSCVIEWERVTK.
> >>
> >> With 32bit indices, the above work flow has no issue. However, if
> >> PETSc is configured with 64bit indices, my output .vtu file has an
> error if
> >> I open the file with visualization tools, such as Paraview or
> Tecplot,
> >> saying that:
> >> "Cannot read cell connectivity from Cells in piece 0 because the
> >> "offsets" array is not monotonically increasing or starts with a
> value
> >> other than 0."
> >>
> >> If I open the .vtu file from terminal, I can see such a line:
> >> ...
> >>  >> format="appended" offset="580860" />
> >> ...
> >>
> >> I expected "DataArray type="Int64", since the PETSc has 64bit
> indices.
> >> Could I get recommendations that I need to check to resolve the
> issue?
> >>
> >
> > This is probably a bug. We will look at it.
> >
> > Jed, I saw that Int32 is hardcoded in plexvtu.c, but sizeof(PetscInt)
> > is used to calculate the offset, which looks inconsistent. Can you
> take a
> > look?
> >
> >   Thanks,
> >
> >  Matt
> >
> >
> >> Thanks,
> >> Mike
> >>
> >
> >
> > --
> > What most experimenters take for granted before they begin their
> > experiments is infinitely 

Re: [petsc-users] coordinate degrees of freedom for 2nd-order gmsh mesh

2023-01-12 Thread Dave May
On Thu 12. Jan 2023 at 17:58, Blaise Bourdin  wrote:

> Out of curiosity, what is the rationale for _reading_ high order gmsh
> meshes?
>

GMSH can use a CAD engine like OpenCascade. This provides geometric
representations via things like BSplines. Such geometric representations are
not exposed to the user's application code, nor are they embedded in any
mesh format GMSH emits. The next best thing is to use a high-order
representation of the mesh geometry and project the CAD geometry (say a
BSpline) into this higher-order function space. The projection of the
geometry is a quantity that can be described with the .msh format.

Is it so that one can write data back in native gmsh format?
>

No.

Cheers,
Dave

>



> Regards,
> Blaise
>
>
> On Jan 12, 2023, at 7:13 PM, Matthew Knepley  wrote:
>
> On Thu, Jan 12, 2023 at 1:33 PM Jed Brown  wrote:
>
>> It's confusing, but this line makes high order simplices always read as
>> discontinuous coordinate spaces. I would love if someone would revisit
>> that, perhaps also using DMPlexSetIsoperiodicFaceSF(),
>
>
> Perhaps as a switch, but there is no way I am getting rid of the current
> periodicity. As we have discussed before, breaking the topological relation
> is a non-starter for me.
>
> It does look like higher order Gmsh does read as DG. We can just project
> that to CG for non-periodic stuff.
>
>   Thanks,
>
> Matt
>
> which should simplify the code and avoid the confusing cell coordinates
>> pattern. Sadly, I don't have time to dive in.
>>
>>
>> https://gitlab.com/petsc/petsc/-/commit/066ea43f7f75752f012be6cd06b6107ebe84cc6d#3616cad8148970af5b97293c49492ff893e25b59_1552_1724
>>
>> "Daniel R. Shapero"  writes:
>>
>> > Sorry either your mail system or mine prevented me from attaching the
>> file,
>> > so I put it on pastebin:
>> > https://pastebin.com/awFpc1Js
>> >
>> > On Wed, Jan 11, 2023 at 4:54 PM Matthew Knepley 
>> wrote:
>> >
>> >> Can you send the .msh file? I still have not installed Gmsh :)
>> >>
>> >>   Thanks,
>> >>
>> >>  Matt
>> >>
>> >> On Wed, Jan 11, 2023 at 2:43 PM Daniel R. Shapero 
>> wrote:
>> >>
>> >>> Hi all -- I'm trying to read in 2nd-order / piecewise quadratic meshes
>> >>> that are generated by gmsh and I don't understand how the coordinates
>> are
>> >>> stored in the plex. I've been discussing this with Matt Knepley here
>> >>> <
>> https://urldefense.com/v3/__https://github.com/firedrakeproject/firedrake/issues/982__;!!K-Hz7m0Vt54!hL9WLR51ieyHFZx8N9AjhDwJCRpvmQto9CL1XOTkkAxFfUbtsabHuBDOATnWyP6lQszhA2gOStva7A$
>> >
>> >>> as it pertains to Firedrake but I think this is more an issue at the
>> PETSc
>> >>> level.
>> >>>
>> >>> This code
>> >>> <
>> https://urldefense.com/v3/__https://gist.github.com/danshapero/a140daaf951ba58c48285ec29f5973cc__;!!K-Hz7m0Vt54!hL9WLR51ieyHFZx8N9AjhDwJCRpvmQto9CL1XOTkkAxFfUbtsabHuBDOATnWyP6lQszhA2hho2eD1g$
>> >
>> >>> uses gmsh to generate a 2nd-order mesh of the unit disk, read it into
>> a
>> >>> DMPlex, print out the number of cells in each depth stratum, and
>> finally
>> >>> print a view of the coordinate DM's section. The resulting mesh has 64
>> >>> triangles, 104 edges, and 41 vertices. For 2nd-order meshes, I'd
>> expected
>> >>> there to be 2 degrees of freedom at each node and 2 at each edge. The
>> >>> output is:
>> >>>
>> >>> ```
>> >>> Depth strata: [(64, 105), (105, 209), (0, 64)]
>> >>>
>> >>> PetscSection Object: 1 MPI process
>> >>>   type not yet set
>> >>> 1 fields
>> >>>   field 0 with 2 components
>> >>> Process 0:
>> >>>   (   0) dim 12 offset   0
>> >>>   (   1) dim 12 offset  12
>> >>>   (   2) dim 12 offset  24
>> >>> ...
>> >>>   (  62) dim 12 offset 744
>> >>>   (  63) dim 12 offset 756
>> >>>   (  64) dim  0 offset 768
>> >>>   (  65) dim  0 offset 768
>> >>> ...
>> >>>   ( 207) dim  0 offset 768
>> >>>   ( 208) dim  0 offset 768
>> >>>   PetscSectionSym Object: 1 MPI process
>> >>> type: label
>> >>> Label 'depth'
>> >>> Symmetry for stratum value 0 (0 dofs per point): no symmetries
>> >>> Symmetry for stratum value 1 (0 dofs per point): no symmetries
>> >>> Symmetry for stratum value 2 (12 dofs per point):
>> >>>   Orientation range: [-3, 3)
>> >>> Symmetry for stratum value -1 (0 dofs per point): no symmetries
>> >>> ```
>> >>>
>> >>> The output suggests that there are 12 degrees of freedom in each
>> >>> triangle. That would mean the coordinate field is discontinuous
>> across cell
>> >>> boundaries. Can someone explain what's going on? I tried reading the
>> .msh
>> >>> file but it's totally inscrutable to me. I'm happy to RTFSC if someone
>> >>> points me in the right direction. Matt tells me that the coordinate
>> field
>> >>> should only be discontinuous if the mesh is periodic, but this mesh
>> >>> shouldn't be periodic.
>> >>>
>> >>
>> >>
>> >> --
>> >> What most experimenters take for granted before they begin their
>> >> experiments is infinitely more interesting than any results to which
>> their
>> >> experiments lead.
>> >> 

Re: [petsc-users] locate DMSwarm particles with respect to a background DMDA mesh

2022-12-22 Thread Dave May
On Thu, 22 Dec 2022 at 12:08, Matteo Semplice 
wrote:

>
> On 22/12/22 20:06, Dave May wrote:
>
>
>
> On Thu 22. Dec 2022 at 10:27, Matteo Semplice <
> matteo.sempl...@uninsubria.it> wrote:
>
>> Dear Dave and Matt,
>>
>> I am really dealing with two different use cases in a code that will
>> compute a levelset function passing through a large set of points. If I had
>> DMSwarmSetMigrateType() and if it were safe to switch the migration mode
>> back and forth in the same swarm, this would cover all my use cases here.
>> Is it safe to add it back to petsc? Details below if you are curious.
>>
>> 1) During preprocessing I am loading a point cloud from disk (in whatever
>> order it comes) and need to send the particles to the right ranks. Since
>> the background DM is a DMDA I can easily figure out the destination rank.
>> This would be covered by your suggestion not to attach the DM, except that
>> later I need to locate these points with respect to the background cells in
>> order to initialize data on the Vecs associated to the DMDA.
>>
>> 2) Then I need to implement a semilagrangian time evolution scheme. For
>> this I'd like to send particles around at the "foot of characteristic",
>> collect data there and then send them back to the originating point. The
>> first migration would be based on particle coordinates
>> (DMSwarmMigrate_DMNeighborScatter and the restriction to only neighbouring
>> ranks is perfect), while for the second move it would be easier to just
>> send them back to the originating rank, which I can easily store in an Int
>> field in the swarm. Thus at each timestep I'd need to swap migrate types in
>> this swarm (DMScatter for moving them to the feet and BASIC to send them
>> back).
>>
>
> When you use BASIC, you would have to explicitly call the point location
> routine from your code as BASIC does not interact with the DM.
>
> Based on what I see in the code, switching migrate modes between basic
> and dmneighbourscatter should be safe.
>
> If you are fine calling the point location from your side then what you
> propose should work.
>
> If I understood the code correctly, BASIC will just migrate particles
> sending them to what is stored in DMSwarmField_rank, right?
>

Correct.


> That'd be easy since I can create a SWARM with all the data I need and an
> extra int field (say "original_rank") and copy those values into
> DMSwarmField_rank before calling migrate for the "going back" step. After
> this backward migration I do not need to locate particles again (e.g. I do
> not need DMSwarmSortGetAccess after the BASIC migration, but only after the
> DMNeighborScatter one).
>

Okay

> Thus having back DMSwarmSetMigrateType() should be enough for me.
>

Okay. Thanks for clarifying.

Cheers,
Dave



> Thanks
>
> Matteo
>
>
> Cheers
> Dave
>
>
>
>> Thanks
>>
>> Matteo
>> On 22/12/22 18:40, Dave May wrote:
>>
>> Hey Matt,
>>
>> On Thu 22. Dec 2022 at 05:02, Matthew Knepley  wrote:
>>
>>> On Thu, Dec 22, 2022 at 6:28 AM Matteo Semplice <
>>> matteo.sempl...@uninsubria.it> wrote:
>>>
>>>> Dear all
>>>>
>>>> please ignore my previous email and read this one: I have better
>>>> localized the problem. Maybe DMSwarmMigrate is designed to migrate
>>>> particles only to first neighbouring ranks?
>>>>
>>> Yes, I believe that was the design.
>>>
>>> Dave, is this correct?
>>>
>>
>> Correct. DMSwarmMigrate_DMNeighborScatter() only scatters points to the
>> neighbour ranks - where neighbours are defined by the DM provided to
>> represent the mesh.
>>
>> DMSwarmMigrate_DMNeighborScatter() is selected by default if you attach
>> a DM.
>>
>> The scatter method should be overridden with
>>
>> DMSwarmSetMigrateType()
>>
>> however it appears this method no longer exists.
>>
>> If one can determine the exact rank where points should be sent
>> and it is not going to be the neighbour rank (given by the DM), I would
>> suggest not attaching the DM at all.
>>
>> However if this is not possible and one wanted to scatter to say the
>> neighbours neighbours, we will have to add a new interface and refactor
>> things a little bit.
>>
>> Cheers
>> Dave
>>
>>
>>
>>>   Thanks,
>>>
>>> Matt
>>>
>>>
>>>> On 22/12/22 11:44, Matteo Semplice wrote:
>>>

Re: [petsc-users] locate DMSwarm particles with respect to a background DMDA mesh

2022-12-22 Thread Dave May
On Thu 22. Dec 2022 at 10:27, Matteo Semplice 
wrote:

> Dear Dave and Matt,
>
> I am really dealing with two different use cases in a code that will
> compute a levelset function passing through a large set of points. If I had
> DMSwarmSetMigrateType() and if it were safe to switch the migration mode
> back and forth in the same swarm, this would cover all my use cases here.
> Is it safe to add it back to petsc? Details below if you are curious.
>
> 1) During preprocessing I am loading a point cloud from disk (in whatever
> order it comes) and need to send the particles to the right ranks. Since
> the background DM is a DMDA I can easily figure out the destination rank.
> This would be covered by your suggestion not to attach the DM, except that
> later I need to locate these points with respect to the background cells in
> order to initialize data on the Vecs associated to the DMDA.
>
> 2) Then I need to implement a semilagrangian time evolution scheme. For
> this I'd like to send particles around at the "foot of characteristic",
> collect data there and then send them back to the originating point. The
> first migration would be based on particle coordinates
> (DMSwarmMigrate_DMNeighborScatter and the restriction to only neighbouring
> ranks is perfect), while for the second move it would be easier to just
> send them back to the originating rank, which I can easily store in an Int
> field in the swarm. Thus at each timestep I'd need to swap migrate types in
> this swarm (DMScatter for moving them to the feet and BASIC to send them
> back).
>

When you use BASIC, you would have to explicitly call the point location
routine from your code as BASIC does not interact with the DM.

Based on what I see in the code, switching migrate modes between basic and
dmneighbourscatter should be safe.

If you are fine calling the point location from your side then what you
propose should work.
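A sketch of that swap (assuming the rank field is stored as PetscInt and that
you registered an extra PetscInt field called "original_rank"; that field name
and "swarm" are placeholders, not PETSc API):

  PetscInt  npoints, p;
  PetscInt *target, *origin;

  PetscCall(DMSwarmGetLocalSize(swarm, &npoints));
  PetscCall(DMSwarmGetField(swarm, DMSwarmField_rank, NULL, NULL, (void **)&target));
  PetscCall(DMSwarmGetField(swarm, "original_rank", NULL, NULL, (void **)&origin));
  for (p = 0; p < npoints; ++p) target[p] = origin[p]; /* send each particle back home */
  PetscCall(DMSwarmRestoreField(swarm, "original_rank", NULL, NULL, (void **)&origin));
  PetscCall(DMSwarmRestoreField(swarm, DMSwarmField_rank, NULL, NULL, (void **)&target));
  /* with the BASIC migrate type this ships every particle to target[p] */
  PetscCall(DMSwarmMigrate(swarm, PETSC_TRUE));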

Cheers
Dave



> Thanks
>
> Matteo
> On 22/12/22 18:40, Dave May wrote:
>
> Hey Matt,
>
> On Thu 22. Dec 2022 at 05:02, Matthew Knepley  wrote:
>
>> On Thu, Dec 22, 2022 at 6:28 AM Matteo Semplice <
>> matteo.sempl...@uninsubria.it> wrote:
>>
>>> Dear all
>>>
>>> please ignore my previous email and read this one: I have better
>>> localized the problem. Maybe DMSwarmMigrate is designed to migrate
>>> particles only to first neighbouring ranks?
>>>
>> Yes, I believe that was the design.
>>
>> Dave, is this correct?
>>
>
> Correct. DMSwarmMigrate_DMNeighborScatter() only scatters points to the
> neighbour ranks - where neighbours are defined by the DM provided to
> represent the mesh.
>
> DMSwarmMigrate_DMNeighborScatter() is selected by default if you attach a
> DM.
>
> The scatter method should be overridden with
>
> DMSwarmSetMigrateType()
>
> however it appears this method no longer exists.
>
> If one can determine the exact rank where points should be sent and
> it is not going to be the neighbour rank (given by the DM), I would suggest
> not attaching the DM at all.
>
> However if this is not possible and one wanted to scatter to say the
> neighbours neighbours, we will have to add a new interface and refactor
> things a little bit.
>
> Cheers
> Dave
>
>
>
>>   Thanks,
>>
>> Matt
>>
>>
>>> On 22/12/22 11:44, Matteo Semplice wrote:
>>>
>>> Dear everybody,
>>>
>>> I have dug a bit into the code and I am able to add more information.
>>> On 02/12/22 12:48, Matteo Semplice wrote:
>>>
>>> Hi.
>>> I am sorry to take this up again, but further tests show that it's not
>>> right yet.
>>>
>>> On 04/11/22 12:48, Matthew Knepley wrote:
>>>
>>> On Fri, Nov 4, 2022 at 7:46 AM Matteo Semplice <
>>> matteo.sempl...@uninsubria.it> wrote:
>>>
>>>> On 04/11/2022 02:43, Matthew Knepley wrote:
>>>>
>>>> On Thu, Nov 3, 2022 at 8:36 PM Matthew Knepley 
>>>> wrote:
>>>>
>>>>> On Thu, Oct 27, 2022 at 11:57 AM Semplice Matteo <
>>>>> matteo.sempl...@uninsubria.it> wrote:
>>>>>
>>>>>> Dear Petsc developers,
>>>>>> I am trying to use a DMSwarm to locate a cloud of points with
>>>>>> respect to a background mesh. In the real application the points will be
>>>>>> loaded from disk, but I have created a small demo in which
>>>>>>
>>>>>>- each processor creates Npart particles, all within the domain
>&

Re: [petsc-users] locate DMSwarm particles with respect to a background DMDA mesh

2022-12-22 Thread Dave May
Hey Matt,

On Thu 22. Dec 2022 at 05:02, Matthew Knepley  wrote:

> On Thu, Dec 22, 2022 at 6:28 AM Matteo Semplice <
> matteo.sempl...@uninsubria.it> wrote:
>
>> Dear all
>>
>> please ignore my previous email and read this one: I have better
>> localized the problem. Maybe DMSwarmMigrate is designed to migrate
>> particles only to first neighbouring ranks?
>>
> Yes, I believe that was the design.
>
> Dave, is this correct?
>

Correct. DMSwarmMigrate_DMNeighborScatter() only scatters points to the
neighbour ranks - where neighbours are defined by the DM provided to
represent the mesh.

DMSwarmMigrate_DMNeighborScatter() is selected by default if you attach a
DM.

The scatter method should be overridden with

DMSwarmSetMigrateType()

however it appears this method no longer exists.

If one can determine the exact rank where points should be sent and
it is not going to be the neighbour rank (given by the DM), I would suggest
not attaching the DM at all.

However if this is not possible and one wanted to scatter to say the
neighbours neighbours, we will have to add a new interface and refactor
things a little bit.

Cheers
Dave



>   Thanks,
>
> Matt
>
>
>> On 22/12/22 11:44, Matteo Semplice wrote:
>>
>> Dear everybody,
>>
>> I have dug a bit into the code and I am able to add more information.
>> On 02/12/22 12:48, Matteo Semplice wrote:
>>
>> Hi.
>> I am sorry to take this up again, but further tests show that it's not
>> right yet.
>>
>> On 04/11/22 12:48, Matthew Knepley wrote:
>>
>> On Fri, Nov 4, 2022 at 7:46 AM Matteo Semplice <
>> matteo.sempl...@uninsubria.it> wrote:
>>
>>> On 04/11/2022 02:43, Matthew Knepley wrote:
>>>
>>> On Thu, Nov 3, 2022 at 8:36 PM Matthew Knepley 
>>> wrote:
>>>
 On Thu, Oct 27, 2022 at 11:57 AM Semplice Matteo <
 matteo.sempl...@uninsubria.it> wrote:

> Dear Petsc developers,
> I am trying to use a DMSwarm to locate a cloud of points with
> respect to a background mesh. In the real application the points will be
> loaded from disk, but I have created a small demo in which
>
>- each processor creates Npart particles, all within the domain
>covered by the mesh, but not all in the local portion of the mesh
>- migrate the particles
>
> After migration most particles are not any more in the DMSwarm (how
> many and which ones seems to depend on the number of cpus, but it never
> happens that all particle survive the migration process).
>
> Thanks for sending this. I found the problem. Someone has some overly
 fancy code inside DMDA to figure out the local bounding box from the
 coordinates.
 It is broken for DM_BOUNDARY_GHOSTED, but we never tested with this. I
 will fix it.

>>>
>>> Okay, I think this fix is correct
>>>
>>>   https://gitlab.com/petsc/petsc/-/merge_requests/5802
>>> 
>>>
>>> I incorporated your test as src/dm/impls/da/tests/ex1.c. Can you take a
>>> look and see if this fixes your issue?
>>>
>>> Yes, we have tested 2d and 3d, with various combinations of
>>> DM_BOUNDARY_* along different directions and it works like a charm.
>>>
>>> On a side note, neither DMSwarmViewXDMF nor DMSwarmMigrate seem to be
>>> implemented for 1d: I get
>>>
>>> [0]PETSC ERROR: No support for this operation for this object type
>>> 
>>> [0]PETSC
>>> ERROR: Support not provided for 1D
>>>
>>> However, currently I have no need for this feature.
>>>
>>> Finally, if the test is meant to stay in the source, you may remove the
>>> call to DMSwarmRegisterPetscDatatypeField as in the attached patch.
>>>
>>> Thanks a lot!!
>>>
>> Thanks! Glad it works.
>>
>>Matt
>>
>> There are still problems when not using 1, 2 or 4 cpus. Any other number
>> of cpus that I've tested does not work correctly.
>>
>> I have now modified private_DMDALocatePointsIS_2D_Regular to print out
>> some debugging information. I see that this is called twice during
>> migration, once before and once after DMSwarmMigrate_DMNeighborScatter. If
>> I understand correctly, the second call to
>> private_DMDALocatePointsIS_2D_Regular should be able to locate all
>> particles owned by the rank but it fails for some of them because they have
>> been sent to the wrong rank (despite being well away from process
>> boundaries).
>>
>> For example, running the example src/dm/impls/da/tests/ex1.c with Nx=21
>> (20x20 Q1 elements on [-1,1]X[-1,1]) with 

Re: [petsc-users] Efficiently build a matrix from two asymmetric diagonal block matrices

2022-07-21 Thread Dave May
On Thu 21. Jul 2022 at 14:06, Matthew Knepley  wrote:

> On Thu, Jul 21, 2022 at 6:28 AM Emile Soutter 
> wrote:
>
>> Dear all,
>>
>> I am struggling with the following simple problem: having a first matrix
>> B1 of size n1 x m1 and a second matrix B2 of size n2 x m2, build a matrix M of
>> size (n1+n2)x(m1+m2) where the blocks B1 and B2 are the "diagonal" of M
>> (M[0:n1,0:m1]=B1, M[n1:(n1+n2),m1:(m1+m2)]=B2). In my case, the blocks B1
>> and B2 are obtained from another routine, directly in the petsc matrix form
>> (or pyop2.Sparsity form). However the blocks are not squared (n1,n2,m1,m2
>> are all different integers). The operation is easy to do with the SetValues
>> option. However, it takes a large amount of time (too much) when the system
>> becomes large. I struggle to do it efficiently and in parallel. What method
>> do you recommend to use to do this as fast as possible?
>>
>> Thanks you for any tips,
>>
>
> I think it depends on what you want to do with the final matrix. If you
> only want MatMult, then I think you can just use MatNest
>
>   https://petsc.org/main/docs/manualpages/Mat/MatCreateNest/
>
> which will wrap up the submatrices.
>


As an assembled matrix is sought, a follow-up suggestion might be to first
create a MatNest representation and pass this to MatConvert to convert the
Nest Mat into an MPIAIJ Mat.

The MatNest object is pretty lightweight and doesn’t use much memory as it
refers (via a pointer) to the original matrices. Hence this two-step
approach might be appropriate.
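A sketch of that two-step path (B1 and B2 are your existing blocks; the index
sets are left as NULL so they are inferred from the block layouts):

  Mat blocks[4], N, M;

  blocks[0] = B1;   blocks[1] = NULL;
  blocks[2] = NULL; blocks[3] = B2;
  PetscCall(MatCreateNest(PETSC_COMM_WORLD, 2, NULL, 2, NULL, blocks, &N));
  /* flatten the (lightweight) nest into a single assembled AIJ matrix */
  PetscCall(MatConvert(N, MATAIJ, MAT_INITIAL_MATRIX, &M));
  PetscCall(MatDestroy(&N));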

Cheers
Dave

However, if you want to manipulate the values (factorization, relaxation,
> etc) then you need
> to assemble a monolithic matrix. For this you could create the global
> matrix, and then use
>
>   https://petsc.org/main/docs/manualpages/Mat/MatCreateLocalRef/
>
> to get a submatrix to assemble directly into, which you pass to your
> assembly routine. Clearly this is more complicated, but
> sometimes necessary.
>
>   Thanks,
>
>   Matt
>
>
>> Emile
>>
>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
> 
>


Re: [petsc-users] Mat created by DMStag cannot access ghost points

2022-05-31 Thread Dave May
On Tue 31. May 2022 at 16:28, Ye Changqing  wrote:

> Dear developers of PETSc,
>
> I encountered a problem when using the DMStag module. The program could be
> executed perfectly in serial, while errors are thrown out in parallel
> (using mpiexec). Some rows in Mat cannot be accessed in local processes
> when looping all elements in DMStag. The DM object I used only has one DOF
> in each element. Hence, I could switch to the DMDA module easily, and the
> program now is back to normal.
>
> Some snippets are below.
>
> Initialise a DMStag object:
> PetscCall(DMStagCreate2d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE,
> DM_BOUNDARY_NONE, M, N, PETSC_DECIDE, PETSC_DECIDE, 0, 0, 1,
> DMSTAG_STENCIL_BOX, 1, NULL, NULL, &(s_ctx->dm_P)));
> Created a Mat:
> PetscCall(DMCreateMatrix(s_ctx->dm_P, A));
> Loop:
> PetscCall(DMStagGetCorners(s_ctx->dm_V, &startx, &starty, &startz, &nx,
> &ny, &nz, &nExtrax, &nExtray, &nExtraz));
> for (ey = starty; ey < starty + ny; ++ey)
> for (ex = startx; ex < startx + nx; ++ex)
> {
> ...
> PetscCall(DMStagMatSetValuesStencil(s_ctx->dm_P, *A, 2, &row[0], 2,
> &col[0], &val_A[0][0], ADD_VALUES));  // The traceback shows the problem is
> in here.
> }
>

In addition to the code or MWE, please forward us the complete stack trace
/ error thrown to stdout.

Thanks,
Dave



> Best,
> Changqing
>
>


Re: [petsc-users] MatColoring

2022-05-10 Thread Dave May
On Tue 10. May 2022 at 18:51, Tang, Qi  wrote:

> We are using SNES + TS + dmstag. The current bottleneck is the number of
> residual evaluation (more than 300 per Jacobian building using the default
> coloring from dmstag).
>

I suspect that this high count stems from the fact that the nonzero pattern
defined by DMCreateMatrix for DMStag is not specialized for your particular
stencil. I.e., it just considers the stencil width and shape, and assumes all
cell, face and vertex dofs are connected. Is that correct?

Would a simple solution to drop the 300 residual evals be just to define
your own Amat with a nonzero pattern which is specific to your actual
stencil?

The code to define such a nonzero pattern is probably not too hard to
write using MatPreallocator.
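
As a rough, hedged sketch (the DM name dm, the final matrix A and the
insertion loop are placeholders you would replace with your DMStag-specific
code):

  Mat preall, A;
  Vec x;
  PetscInt m, M;

  PetscCall(DMCreateGlobalVector(dm, &x));
  PetscCall(VecGetLocalSize(x, &m));
  PetscCall(VecGetSize(x, &M));
  PetscCall(VecDestroy(&x));

  PetscCall(MatCreate(PETSC_COMM_WORLD, &preall));
  PetscCall(MatSetType(preall, MATPREALLOCATOR));
  PetscCall(MatSetSizes(preall, m, m, M, M));
  PetscCall(MatSetUp(preall));
  /* loop over the cells here and insert a zero at every (row, col) pair your
     stencil actually couples, e.g. with MatSetValues() */
  ...
  PetscCall(MatAssemblyBegin(preall, MAT_FINAL_ASSEMBLY));
  PetscCall(MatAssemblyEnd(preall, MAT_FINAL_ASSEMBLY));

  PetscCall(MatCreate(PETSC_COMM_WORLD, &A));
  PetscCall(MatSetType(A, MATAIJ));
  PetscCall(MatSetSizes(A, m, m, M, M));
  /* copy the recorded pattern into A; PETSC_TRUE also inserts the zeros, so
     A ends up with a defined nonzero structure (which is what colouring needs) */
  PetscCall(MatPreallocatorPreallocate(preall, PETSC_TRUE, A));
  PetscCall(MatDestroy(&preall));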

Just a thought.

Thanks,
Dave


We talked to Patrick and we are not sure how to improve further.
>
> So it looks like we should play with mat_coloring_type and see if others
> give us better performance.
>
> If there is anything else we can play with, please also let us know. We
> also lag Jacobian and only build once every three Newton iterations, which
> works well. Thanks,
>
> Qi
>
>
>
> On May 10, 2022, at 10:35 AM, Barry Smith  wrote:
>
>
>   This depends to some degree on how you are accessing/applying the
> Jacobian construction process.
>
>If you are using SNES or TS then the SNES object handles most of the
> work of organizing the PETSc objects needed to compute the Jacobian and you
> can control the choices via the options database.
>
>With SNES/TS using a DM you can skip calling SNESSetJacobian() and
> simply use the option -snes_fd_color to have SNES compute the Jacobian for
> you. By default, the coloring is obtained from the DM (which is generally
> the best available coloring), you can have it use a coloring computed
> directly from the matrix structure with the
> options -snes_fd_color_use_mat -mat_coloring_type  jp power sl lf ld or
> greedy (see MatColoringType, MatColoringSetFromOptions)
> Use  -mat_fd_coloring_view to get information on the computation of the
> Jacobian from the coloring. Use  -mat_coloring_view to get information on
> the coloring used. (see MatFDColoringSetFromOptions)
>
>   If you are using SNES/TS but not using a DM you need to compute the
> nonzero structure of the matrix yourself into J and call
> SNESSetJacobian(snes,J,J,SNESComputeJacobianDefaultColor,matfdcoloring);
> you need to create the matfdcoloring object with MatFDColoringCreate()
> using an iscoloring you obtained with MatColoringCreate() etc.  The same
> command line arguments as above allow you to control the coloring algorithm
> used and to view them etc.
>
>   Barry
>
>
>
>
>
>
>
>
> On May 10, 2022, at 11:40 AM, Jorti, Zakariae via petsc-users <
> petsc-users@mcs.anl.gov> wrote:
>
> Hi,
>
> I am solving a non-linear problem and using a finite difference
> approximation with coloring to compute the Jacobian matrix.
>
> There are several coloring algorithms available in PETSc as indicated here:
> https://petsc.org/release/docs/manualpages/Mat/MatColoring.html
> 
>
> And I was wondering how to switch from one to another in the Jacobian
> setup routine and also how to check which coloring algorithm I am currently
> using.
>
> Thank you.
>
> Zakariae Jorti
>
>
>
>


Re: [petsc-users] GMRES for outer solver

2022-05-01 Thread Dave May
On Sun 1. May 2022 at 07:03, Amneet Bhalla  wrote:

> How about using a fixed number of Richardson iterations as a Krylov
> preconditioner to a GMRES solver?
>

That is fine.

Would that lead to a linear operation?
>

Yes.
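
One way to set that up from the command line is with PCKSP; the option names
below assume the outer KSP uses the default (empty) prefix, so the inner
solver picks up the ksp_ prefix:

  -ksp_type gmres
  -pc_type ksp
  -ksp_ksp_type richardson
  -ksp_ksp_max_it 5
  -ksp_ksp_norm_type none
  -ksp_pc_type jacobi

With a fixed max_it and no norm-based convergence test, the inner Richardson
applies the same fixed polynomial every time, so the preconditioner is a
linear operator and plain GMRES (rather than FGMRES) suffices, as Jed notes
below.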



> On Sat, Apr 30, 2022 at 8:21 PM Jed Brown  wrote:
>
>> In general, no. A fixed number of Krylov iterations (CG, GMRES, etc.) is
>> a nonlinear operation.
>>
>> A fixed number of iterations of a method with a fixed polynomial, such as
>> Chebyshev, is a linear operation so you don't need a flexible outer method.
>>
>> Ramakrishnan Thirumalaisamy  writes:
>>
>> > Hi,
>> >
>> > I have a Krylov solver with a preconditioner that is also a Krylov
>> solver.
>> > I know I can use "fgmres" for the outer solver but can I use gmres for
>> the
>> > outer solver with a fixed number of iterations in the Krylov
>> > preconditioners?
>> >
>> >
>> > Thanks,
>> > Rama
>>
> --
> --Amneet
>
>
>
>


Re: [petsc-users] DMSwarm losing particles with a non-uniform mesh

2022-04-04 Thread Dave May
On Mon, 4 Apr 2022 at 12:07, Joauma Marichal 
wrote:

> Hello,
>
> I have written before as I am trying use the DMSwarm library to track
> particles over a collocated non-uniform mesh with ghost cells.
> I have been able to deal with the collocated and ghost cell issues by
> creating an intermediate DMDA.
> However, I lose particles when my mesh is non-uniform. I have re-written a
> function similar to DMDASetUniformCoordinates but I still have issues when
> my cells have varying sizes.
>

Right.

As I wrote in my previous email, the native PIC support with DA will only
work with coordinates created using DMDASetUniformCoordinates.
The point location DMDA provides is very simple - you can find it here
  src/dm/impls/da/dageometry.c : DMLocatePoints_DA_Regular()

If your cell DM coordinates are not uniform, then you need to provide your
own point location routine to the DM.
In your code, you would do this as follows
  da_swarm->ops->locatepoints = CUSTOM_POINT_LOCATION_FUNCTION
with a signature matching

PetscErrorCode DMLocatePoints(DM dm, Vec pos, DMPointLocationType ltype, PetscSF cellSF)
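
A rough skeleton, modelled on DMLocatePoints_DA_Regular (the binning of a
coordinate into a local cell index is left as a placeholder, since it depends
on how you store your non-uniform spacing):

  static PetscErrorCode DMLocatePoints_MyNonUniform(DM dm, Vec pos, DMPointLocationType ltype, PetscSF cellSF)
  {
    const PetscScalar *coor;
    PetscSFNode       *cells;
    PetscInt           p, npoints, bs;

    PetscFunctionBegin;
    PetscCall(VecGetBlockSize(pos, &bs));      /* = spatial dimension */
    PetscCall(VecGetLocalSize(pos, &npoints));
    npoints /= bs;
    PetscCall(PetscMalloc1(npoints, &cells));
    PetscCall(VecGetArrayRead(pos, &coor));
    for (p = 0; p < npoints; p++) {
      cells[p].rank  = 0;
      cells[p].index = DMLOCATEPOINT_POINT_NOT_FOUND;
      /* search your (non-uniform) coordinate arrays, e.g. by bisection, for the
         local cell containing coor[bs*p .. bs*p+bs-1]; if found, set
         cells[p].index = local_cell_index; */
    }
    PetscCall(VecRestoreArrayRead(pos, &coor));
    PetscCall(PetscSFSetGraph(cellSF, npoints, npoints, NULL, PETSC_OWN_POINTER, cells, PETSC_OWN_POINTER));
    PetscFunctionReturn(0);
  }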




> I attach a small code to this e-mail that reads particles coordinates from
> a file and stores them in a DMSwarm structure. My code works well when I
> use uniform coordinates but whenever I change this, I lose several
> particles after calling the migration function.
>
> Can it be due to how I define my coordinates? If yes, why? Or is it due to
> how the migrate function is implemented?
>

The problem has nothing to do with Migrate. Here is the issue.
When you set the type to DMSWARM_PIC, a particular migration function was
selected.
The migration function selected does the following:
* It calls the point location routine (DMLocatePoints) for the cellDM you
provided for all swarm points
* Any swarm points which were not located in the sub-domain of the cellDM
are scattered to the neighbouring cellDM sub-domains.
* When the scatter has finished, the DM point location routine is called
again on the received swarm points.
* If any received swarm points are located in the sub-domain, they are
added to the swarm object.

Thanks,
Dave


> Best regards,
>
> Joauma
>
> PS. the code is run with: mpirun -np 3 ./cobpor.
>


Re: [petsc-users] DMSwarm

2022-03-25 Thread Dave May
Hi,

On Wed 23. Mar 2022 at 18:52, Matthew Knepley  wrote:

> On Wed, Mar 23, 2022 at 11:09 AM Joauma Marichal <
> joauma.maric...@uclouvain.be> wrote:
>
>> Hello,
>>
>> I sent an email last week about an issue I had with DMSwarm but did not
>> get an answer yet. If there is any other information needed or anything I
>> could try to solve it, I would be happy to do them...
>>
>
> I got a chance to run the code. I believe this undercovered a bug in our
> implementation of point location with DMDA. I will make an Issue.
>
> Your example runs correctly for me if you replace DM_BOUNDARY_GHOSTED with
> DM_BOUNDARY_NONE in the DMDACreate3d.
> Can you try that?
>

The PIC support in place between DMSwarm and DMDA only works when the DA
points define the vertices of a set of quads / hexes AND if the mesh is
uniform, ie you defined the coordinates using SetUniformCoordinates. The
point location routine is very simple.

There is no way for the DA infrastructure to know what the points in the DA
physically represent (i.e. vertices, cell centroids or face centroids). The DA
just defines a set of logically ordered points which can be indexed in an
i,j,k manner.

So if you are using the DA to represent cell centered data then the point
location routine will give incorrect results. Also, the coordinates from
SetUniformCoordinates won’t give you what you expect either if the x0,x1
you provide define the start,end coordinates of the physical boundary, but
you interpret the DA points to be cell centers.

There are several options you can pursue.
1/ Make an independent DMDA which represents the vertices of your mesh. Use
this DA with your DMSwarm.
2/ Provide your own point location routine for your collocated DA
representation.
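
For option 1/, a sketch of the vertex DM might look like the following (Nx,
Ny, Nz are your cell counts, x0..z1 the physical bounds, and swarm your
DMSwarm; all names are illustrative):

  DM da_vert;
  PetscCall(DMDACreate3d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE,
                         DMDA_STENCIL_BOX, Nx + 1, Ny + 1, Nz + 1,  /* vertex counts, not cell counts */
                         PETSC_DECIDE, PETSC_DECIDE, PETSC_DECIDE,
                         1, 1, NULL, NULL, NULL, &da_vert));
  PetscCall(DMSetUp(da_vert));
  PetscCall(DMDASetUniformCoordinates(da_vert, x0, x1, y0, y1, z0, z1));  /* physical boundary */
  PetscCall(DMSwarmSetCellDM(swarm, da_vert));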

Thanks,
Dave



>   Thanks,
>
>  Matt
>
>
>> Thanks a lot for your help.
>>
>> Best regards,
>> Joauma
>>
>> --
>> *From:* Joauma Marichal
>> *Sent:* Friday, March 18, 2022 4:02 PM
>> *To:* petsc-users@mcs.anl.gov 
>> *Subject:* DMSwarm
>>
>> Hello,
>>
>> I am writing to you as I am trying to implement a Lagrangian Particle
>> Tracking method to my eulerian solver that relies on a 3D collocated DMDA.
>>
>> I have been using examples to develop a first basic code. The latter
>> creates particles on rank 0 with random coordinates on the whole domain and
>> then migrates them to the rank corresponding to these coordinates.
>> Unfortunately, as I migrate I am losing some particles. I came to
>> understand that when I create a DMDA with 6 grid points in each of the 3
>> directions and then set coordinates in between 0 and 1 using
>> DMDASetUniformCoordinates, and running on 2 processors, I obtain the
>> following coordinates values on each proc:
>> [Proc 0] X = 0.00 0.20 0.40 0.60 0.80 1.00
>> [Proc 0] Y = 0.00 0.20 0.40 0.60 0.80 1.00
>> [Proc 0] Z = 0.00 0.20 0.40
>> [Proc 1] X = 0.00 0.20 0.40 0.60 0.80 1.00
>> [Proc 1] Y = 0.00 0.20 0.40 0.60 0.80 1.00
>> [Proc 1] Z = 0.60 0.80 1.00 .
>> Furthermore, it appears that the particles that I am losing are (in the
>> case of 2 processors) located in between z = 0.4 and z = 0.6. How can this
>> be avoided?
>> I attach my code to this email (I run it using mpirun -np 2 ./cobpor).
>>
>> Furthermore, my actual code relies on a collocated 3D DMDA, however the
>> DMDASetUniformCoordinates seems to be working for staggered grids
>> only... How would you advice to deal with particles in this case?
>>
>> Thanks a lot for your help.
>>
>> Best regards,
>> Joauma
>>
>>
>>
>>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
> 
>


Re: [petsc-users] Finite difference approximation of Jacobian

2021-12-13 Thread Dave May
On Mon, 13 Dec 2021 at 20:13, Matthew Knepley  wrote:

> On Mon, Dec 13, 2021 at 1:52 PM Dave May  wrote:
>
>> On Mon, 13 Dec 2021 at 19:29, Matthew Knepley  wrote:
>>
>>> On Mon, Dec 13, 2021 at 1:16 PM Dave May 
>>> wrote:
>>>
>>>>
>>>>
>>>> On Sat 11. Dec 2021 at 22:28, Matthew Knepley 
>>>> wrote:
>>>>
>>>>> On Sat, Dec 11, 2021 at 1:58 PM Tang, Qi  wrote:
>>>>>
>>>>>> Hi,
>>>>>> Does anyone have comment on finite difference coloring with DMStag?
>>>>>> We are using DMStag and TS to evolve some nonlinear equations implicitly.
>>>>>> It would be helpful to have the coloring Jacobian option with that.
>>>>>>
>>>>>
>>>>> Since DMStag produces the Jacobian connectivity,
>>>>>
>>>>
>>>> This is incorrect.
>>>> The DMCreateMatrix implementation for DMSTAG only sets the number of
>>>> nonzeros (very inaccurately). It does not insert any zero values and thus
>>>> the nonzero structure is actually not defined.
>>>> That is why coloring doesn’t work.
>>>>
>>>
>>> Ah, thanks Dave.
>>>
>>> Okay, we should fix that. It is perfectly possible to compute the nonzero
>>> pattern from the DMStag information.
>>>
>>
>> Agreed. The API for DMSTAG is complete enough to enable one to
>> loop over the cells, and for all quantities defined on the cell (centre,
>> face, vertex),
>> insert values into the appropriate slot in the matrix.
>> Combined with MATPREALLOCATOR, I believe a compact and readable
>> code should be possible to write for the preallocation (cf DMDA).
>>
>> I think the only caveat with the approach of using all quantities defined
>> on the cell is that it may slightly over-allocate, depending on how the
>> user wishes to impose the boundary condition, or slightly over-allocate
>> for, say, Stokes where there is no pressure-pressure coupling term.
>>
>
> Yes, and would not handle higher-order stencils. I think the
> over-allocation is livable for the first implementation.
>
>
Sure, but neither does DMDA.

The user always has to know what they are doing and set the stencil width
accordingly.
I actually had this point listed in my initial email (and the stencil
growth issue when using FD for nonlinear problems),
however I deleted it as all the same issues exist in DMDA and no one
complains (at least not loudly) :D





>   Thanks,
>
>  Matt
>
>
>> Thanks,
>> Dave
>>
>>
>>> Paging Patrick :)
>>>
>>>   Thanks,
>>>
>>> Matt
>>>
>>>
>>>> Thanks,
>>>> Dave
>>>>
>>>>
>>>> you can use -snes_fd_color_use_mat. It has many options. Here is an
>>>>> example of us using that:
>>>>>
>>>>>
>>>>> https://gitlab.com/petsc/petsc/-/blob/main/src/snes/tutorials/ex19.c#L898
>>>>>
>>>>>   Thanks,
>>>>>
>>>>>  Matt
>>>>>
>>>>>
>>>>>> Thanks,
>>>>>> Qi
>>>>>>
>>>>>>
>>>>>> On Oct 15, 2021, at 3:07 PM, Jorti, Zakariae via petsc-users <
>>>>>> petsc-users@mcs.anl.gov> wrote:
>>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> Does the Jacobian approximation using coloring and finite
>>>>>> differencing of the function evaluation work in DMStag?
>>>>>> Thank you.
>>>>>> Best regards,
>>>>>>
>>>>>> Zakariae
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> What most experimenters take for granted before they begin their
>>>>> experiments is infinitely more interesting than any results to which their
>>>>> experiments lead.
>>>>> -- Norbert Wiener
>>>>>
>>>>> https://www.cse.buffalo.edu/~knepley/
>>>>> <http://www.cse.buffalo.edu/~knepley/>
>>>>>
>>>>
>>>
>>> --
>>> What most experimenters take for granted before they begin their
>>> experiments is infinitely more interesting than any results to which their
>>> experiments lead.
>>> -- Norbert Wiener
>>>
>>> https://www.cse.buffalo.edu/~knepley/
>>> <http://www.cse.buffalo.edu/~knepley/>
>>>
>>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
> <http://www.cse.buffalo.edu/~knepley/>
>


Re: [petsc-users] Finite difference approximation of Jacobian

2021-12-13 Thread Dave May
On Mon, 13 Dec 2021 at 19:55, Tang, Qi  wrote:

> Matt and Dave,
>
> Thanks, this is consistent with what we found. If Patrick or someone can
> add some basic coloring option with DMStag, that would be very useful for
> our project.
>
>
Colouring only requires the non-zero structure of the matrix.
So actually colouring is supported.
The only thing missing for you is that the matrix returned from
DMCreateMatrix
for DMSTAG does not have a defined non-zero structure.
Once that is set / defined, colouring will just work.

Qi
>
>
>
> On Dec 13, 2021, at 11:52 AM, Dave May  wrote:
>
>
>
> On Mon, 13 Dec 2021 at 19:29, Matthew Knepley  wrote:
>
>> On Mon, Dec 13, 2021 at 1:16 PM Dave May  wrote:
>>
>>>
>>>
>>> On Sat 11. Dec 2021 at 22:28, Matthew Knepley  wrote:
>>>
>>>> On Sat, Dec 11, 2021 at 1:58 PM Tang, Qi  wrote:
>>>>
>>>>> Hi,
>>>>> Does anyone have comment on finite difference coloring with DMStag? We
>>>>> are using DMStag and TS to evolve some nonlinear equations implicitly. It
>>>>> would be helpful to have the coloring Jacobian option with that.
>>>>>
>>>>
>>>> Since DMStag produces the Jacobian connectivity,
>>>>
>>>
>>> This is incorrect.
>>> The DMCreateMatrix implementation for DMSTAG only sets the number of
>>> nonzeros (very inaccurately). It does not insert any zero values and thus
>>> the nonzero structure is actually not defined.
>>> That is why coloring doesn’t work.
>>>
>>
>> Ah, thanks Dave.
>>
>> Okay, we should fix that. It is perfectly possible to compute the nonzero
>> pattern from the DMStag information.
>>
>
> Agreed. The API for DMSTAG is complete enough to enable one to
> loop over the cells, and for all quantities defined on the cell (centre,
> face, vertex),
> insert values into the appropriate slot in the matrix.
> Combined with MATPREALLOCATOR, I believe a compact and readable
> code should be possible to write for the preallocation (cf DMDA).
>
> I think the only caveat with the approach of using all quantities defined
> on the cell is that it may slightly over-allocate, depending on how the
> user wishes to impose the boundary condition, or slightly over-allocate
> for, say, Stokes where there is no pressure-pressure coupling term.
>
> Thanks,
> Dave
>
>
>> Paging Patrick :)
>>
>>   Thanks,
>>
>> Matt
>>
>>
>>> Thanks,
>>> Dave
>>>
>>>
>>> you can use -snes_fd_color_use_mat. It has many options. Here is an
>>>> example of us using that:
>>>>
>>>>
>>>> https://gitlab.com/petsc/petsc/-/blob/main/src/snes/tutorials/ex19.c#L898
>>>>
>>>>   Thanks,
>>>>
>>>>  Matt
>>>>
>>>>
>>>>> Thanks,
>>>>> Qi
>>>>>
>>>>>
>>>>> On Oct 15, 2021, at 3:07 PM, Jorti, Zakariae via petsc-users <
>>>>> petsc-users@mcs.anl.gov> wrote:
>>>>>
>>>>> Hello,
>>>>>
>>>>> Does the Jacobian approximation using coloring and finite
>>>>> differencing of the function evaluation work in DMStag?
>>>>> Thank you.
>>>>> Best regards,
>>>>>
>>>>> Zakariae
>>>>>
>>>>>
>>>>>
>>>>
>>>> --
>>>> What most experimenters take for granted before they begin their
>>>> experiments is infinitely more interesting than any results to which their
>>>> experiments lead.
>>>> -- Norbert Wiener
>>>>
>>>> https://www.cse.buffalo.edu/~knepley/
>>>>
>>>
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which their
>> experiments lead.
>> -- Norbert Wiener
>>
>> https://www.cse.buffalo.edu/~knepley/
>>
>
>


Re: [petsc-users] Finite difference approximation of Jacobian

2021-12-13 Thread Dave May
On Mon, 13 Dec 2021 at 19:29, Matthew Knepley  wrote:

> On Mon, Dec 13, 2021 at 1:16 PM Dave May  wrote:
>
>>
>>
>> On Sat 11. Dec 2021 at 22:28, Matthew Knepley  wrote:
>>
>>> On Sat, Dec 11, 2021 at 1:58 PM Tang, Qi  wrote:
>>>
>>>> Hi,
>>>> Does anyone have comment on finite difference coloring with DMStag? We
>>>> are using DMStag and TS to evolve some nonlinear equations implicitly. It
>>>> would be helpful to have the coloring Jacobian option with that.
>>>>
>>>
>>> Since DMStag produces the Jacobian connectivity,
>>>
>>
>> This is incorrect.
>> The DMCreateMatrix implementation for DMSTAG only sets the number of
>> nonzeros (very inaccurately). It does not insert any zero values and thus
>> the nonzero structure is actually not defined.
>> That is why coloring doesn’t work.
>>
>
> Ah, thanks Dave.
>
> Okay, we should fix that. It is perfectly possible to compute the nonzero
> pattern from the DMStag information.
>

Agreed. The API for DMSTAG is complete enough to enable one to
loop over the cells, and for all quantities defined on the cell (centre,
face, vertex),
insert values into the appropriate slot in the matrix.
Combined with MATPREALLOCATOR, I believe a compact and readable
code should be possible to write for the preallocation (cf DMDA).

I think the only caveat with the approach of using all quantities defined
on the cell is that it may slightly over-allocate, depending on how the user
wishes to impose the boundary condition, or slightly over-allocate for, say,
Stokes where there is no pressure-pressure coupling term.

Thanks,
Dave


> Paging Patrick :)
>
>   Thanks,
>
> Matt
>
>
>> Thanks,
>> Dave
>>
>>
>> you can use -snes_fd_color_use_mat. It has many options. Here is an
>>> example of us using that:
>>>
>>>
>>> https://gitlab.com/petsc/petsc/-/blob/main/src/snes/tutorials/ex19.c#L898
>>>
>>>   Thanks,
>>>
>>>  Matt
>>>
>>>
>>>> Thanks,
>>>> Qi
>>>>
>>>>
>>>> On Oct 15, 2021, at 3:07 PM, Jorti, Zakariae via petsc-users <
>>>> petsc-users@mcs.anl.gov> wrote:
>>>>
>>>> Hello,
>>>>
>>>> Does the Jacobian approximation using coloring and finite differencing
>>>> of the function evaluation work in DMStag?
>>>> Thank you.
>>>> Best regards,
>>>>
>>>> Zakariae
>>>>
>>>>
>>>>
>>>
>>> --
>>> What most experimenters take for granted before they begin their
>>> experiments is infinitely more interesting than any results to which their
>>> experiments lead.
>>> -- Norbert Wiener
>>>
>>> https://www.cse.buffalo.edu/~knepley/
>>> <http://www.cse.buffalo.edu/~knepley/>
>>>
>>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
> <http://www.cse.buffalo.edu/~knepley/>
>


Re: [petsc-users] Finite difference approximation of Jacobian

2021-12-13 Thread Dave May
On Sat 11. Dec 2021 at 22:28, Matthew Knepley  wrote:

> On Sat, Dec 11, 2021 at 1:58 PM Tang, Qi  wrote:
>
>> Hi,
>> Does anyone have comment on finite difference coloring with DMStag? We
>> are using DMStag and TS to evolve some nonlinear equations implicitly. It
>> would be helpful to have the coloring Jacobian option with that.
>>
>
> Since DMStag produces the Jacobian connectivity,
>

This is incorrect.
The DMCreateMatrix implementation for DMSTAG only sets the number of
nonzeros (very inaccurately). It does not insert any zero values and thus
the nonzero structure is actually not defined.
That is why coloring doesn’t work.

Thanks,
Dave


you can use -snes_fd_color_use_mat. It has many options. Here is an example
> of us using that:
>
>
> https://gitlab.com/petsc/petsc/-/blob/main/src/snes/tutorials/ex19.c#L898
>
>   Thanks,
>
>  Matt
>
>
>> Thanks,
>> Qi
>>
>>
>> On Oct 15, 2021, at 3:07 PM, Jorti, Zakariae via petsc-users <
>> petsc-users@mcs.anl.gov> wrote:
>>
>> Hello,
>>
>> Does the Jacobian approximation using coloring and finite differencing
>> of the function evaluation work in DMStag?
>> Thank you.
>> Best regards,
>>
>> Zakariae
>>
>>
>>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
> 
>


Re: [petsc-users] GAMG memory consumption

2021-11-24 Thread Dave May
I think your run with -pc_type mg is defining a multigrid hierarchy with
only a single level. (A single-level mg PC would also explain the 100+
iterations required to converge.) The gamg configuration is definitely
coarsening your problem and has a deeper hierarchy. A single-level
hierarchy will require less memory than a multilevel hierarchy.
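
One way to check this (the options already exist in PETSc; whether your grid
sizes permit further geometric coarsening is an assumption) is to inspect the
solver with -ksp_view, which reports the number of MG levels, and to request
a deeper hierarchy explicitly, e.g.:

  -pc_type mg -pc_mg_levels 4 -ksp_view

That would make the memory comparison with gamg more like-for-like.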

Cheers,
Dave

On Wed 24. Nov 2021 at 19:03, Matthew Knepley  wrote:

> On Wed, Nov 24, 2021 at 12:26 PM Karthikeyan Chockalingam - STFC UKRI <
> karthikeyan.chockalin...@stfc.ac.uk> wrote:
>
>> Hello,
>>
>>
>>
>> I would like to understand why more memory is consumed by -pc_type gamg
>> compared to -pc_type mg for the same problem size
>>
>>
>>
>> ksp/ksp/tutorial: ./ex45 -da_grid_x 368 -da_grid_x 368 -da_grid_x 368
>> -ksp_type cg
>>
>>
>>
>> -pc_type mg
>>
>>
>>
>> Maximum (over computational time) process memory:total 1.9399e+10
>> max 9.7000e+09 min 9.6992e+09
>>
>>
>>
>> -pc_type gamg
>>
>>
>>
>> Maximum (over computational time) process memory:total 4.9671e+10
>> max 2.4836e+10 min 2.4835e+10
>>
>>
>>
>>
>> Am I right in understanding that the memory limiting factor is ‘max
>> 2.4836e+10’ as it is the maximum memory used at any given time?
>>
>
> Yes, I believe so.
>
> GAMG is using A_C = P^T A P, where P is the prolongation from coarse to
> fine, in order to compute the coarse operator A_C, rather than
> rediscretization, since it does not have any notion of discretization or
> coarse meshes. This takes more memory.
>
>   Thanks,
>
> Matt
>
>
>> I have attached the -log_view output of both the preconditioners.
>>
>>
>>
>> Best regards,
>>
>> Karthik.
>>
>>
>>
>> This email and any attachments are intended solely for the use of the
>> named recipients. If you are not the intended recipient you must not use,
>> disclose, copy or distribute this email or any of its attachments and
>> should notify the sender immediately and delete this email from your
>> system. UK Research and Innovation (UKRI) has taken every reasonable
>> precaution to minimise risk of this email or any attachments containing
>> viruses or malware but the recipient should carry out its own virus and
>> malware checks before opening the attachments. UKRI does not accept any
>> liability for any losses or damages which the recipient may sustain due to
>> presence of any viruses.
>>
>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
> 
>


Re: [petsc-users] Scaling of the Petsc Binary Viewer

2021-07-07 Thread Dave May
On Wed 7. Jul 2021 at 20:41, Thibault Bridel-Bertomeu <
thibault.bridelberto...@gmail.com> wrote:

> Dear all,
>
> I have been having issues with large Vec (based on DMPLex) and massive MPI
> I/O  ... it looks like the data that is written by the Petsc Binary Viewer
> is gibberish for large meshes split on a high number of processes. For
> instance, I am using a mesh that has around 50 million cells, split on 1024
> processors.
> The computation seems to run fine, the timestep computed from the data
> makes sense so I think internally everything is fine. But when I look at
> the solution (one example attached) it's noise - at this point it should
> show a bow shock developing on the left near the step.
> The piece of code I use is below for the output :
>
> call DMGetOutputSequenceNumber(dm, save_seqnum,
> save_seqval, ierr); CHKERRA(ierr)
> call DMSetOutputSequenceNumber(dm, -1, 0.d0, ierr);
> CHKERRA(ierr)
> write(filename,'(A,I8.8,A)') "restart_", stepnum, ".bin"
> call PetscViewerCreate(PETSC_COMM_WORLD, binViewer, ierr);
> CHKERRA(ierr)
> call PetscViewerSetType(binViewer, PETSCVIEWERBINARY,
> ierr); CHKERRA(ierr)
> call PetscViewerFileSetMode(binViewer, FILE_MODE_WRITE,
> ierr); CHKERRA(ierr);
> call PetscViewerBinarySetUseMPIIO(binViewer, PETSC_TRUE,
> ierr); CHKERRA(ierr);
>
>

Do you get the correct output if you don’t call the function above (or
equivalently use PETSC_FALSE)?


call PetscViewerFileSetName(binViewer, trim(filename), ierr); CHKERRA(ierr)
> call VecView(X, binViewer, ierr); CHKERRA(ierr)
> call PetscViewerDestroy(binViewer, ierr); CHKERRA(ierr)
> call DMSetOutputSequenceNumber(dm, save_seqnum,
> save_seqval, ierr); CHKERRA(ierr)
>
> I do not think there is anything wrong with it but of course I would be
> happy to hear your feedback.
> Nonetheless my question was : how far have you tested the binary mpi i/o
> of a Vec ? Does it make some sense that for a 50 million cell mesh split on
> 1024 processes, it could somehow fail ?
> Or is it my python drawing method that is completely incapable of handling
> this dataset ? (paraview displays the same thing though so I'm not sure ...)
>

Are you using the Python tools provided within PETSc to load the Vec from
file?


Thanks,
Dave



> Thank you very much for your advice and help !!!
>
> Thibault
>


Re: [petsc-users] Change Amat in FormJacobian

2021-06-14 Thread Dave May
On Mon 14. Jun 2021 at 17:27, Anton Popov  wrote:

>
> On 14.06.21 15:04, Dave May wrote:
>
>
> Hi Anton,
>
> Hi Dave,
>
>
> On Mon, 14 Jun 2021 at 14:47, Anton Popov  wrote:
>
>> Hi Barry & Matt,
>>
>> thanks for your quick response. These options were exactly what I needed
>> and expected:
>>
>> -pc_mg_galerkin pmat
>> -pc_use_amat false
>>
>> I just assumed that it’s a default behavior of the PC object.
>>
>> So to clarify my case, I don't use nonlinear multigrid. Galerkin is
>> expected to deal with Pmat only, and it's enough if Amat implements a
>> matrix-vector product for the Krylov accelerator.
>>
>> Matt, the reason for switching Amat during the iteration is a quite
>> common Picard-Newton combination. Jacobian matrix gives accurate updates
>> close to the solution, but is rather unstable far from the solution. Picard
>> matrix (approximate Jacobian) is quite the opposite – it’s kind of stable,
>> but slow. So the idea is to begin the iteration with Picard matrix, and
>> switch to the Jacobian later.
>>
>> If the assembled matrices are used, then the standard SNES interface is
>> just perfect. I can decide how to fill the matrices. But I don’t bother
>> with  Jacobian assembly and want to use a built-in MFFD approximation
>> instead. I did quite a few tests previously and figured out that MFFD is
>> practically the same as closed-from matrix-free Jacobian for the later
>> stages of the iteration. The Picard matrix still does a good job as a
>> preconditioner. But it is important to start the iteration with Picard and
>> only change to MFFD later.
>>
>> Is my workaround with the shell matrix acceptable, or there is a better
>> solution?
>>
>
> Given what you write, it sounds like you already have a good heuristic for
> when to switch from Picard to Newton.
> Hence I think the simplest option is just to use two separate SNES objects
> - one for performing Picard and one for Newton.
>
>
> Yes, I considered this option initially. Sometimes it is necessary to
> switch back and forth between the methods, so it becomes a bit messy to
> organize this in the code.
>
> But maybe if Newton fails after Picard, I should just reduce the time step
> and restart, instead of switching back to Picard. Thanks, Dave.
>
>
Oh yeah, for a nasty multiphysics problem a single SNES is likely not the
end of the story! Definitely there is almost certainly an entire other
outer loop of several nonlinear solver strategies slapped together in some
problem specific manner in an effort to get the monolithic nonlinear
residual down. Aborting the time step and dropping dt is often the thing
you fall back to when all of those fail.

Thanks,
Dave


Thanks,
>
> Anton
>
>
> The stopping condition for the Picard object would encode your heuristic
> to abort earlier when the solution was deemed sufficiently accurate.
>
> Thanks,
> Dave
>
>
>>
>> Thanks,
>> Anton
>> On 13.06.21 20:52, Barry Smith wrote:
>>
>>
>>   Anton,
>>
>>   -pc_mg_galerkin pmat
>>
>>   Though it seems simple, there is some subtly in swapping out matrices
>> with SNES.
>>
>>   When using multigrid with SNES there are at least five distinct uses of
>> the Jacobian operator.
>>
>>- Perform matrix-vector product in line search to check Wolf
>>   sufficient decrease convergence criteria
>>   - Perform the matrix-vector product for the Krylov accelerator of
>>   the system
>>   - Perform smoothing on the finest level of MG
>>   - Perform the matrix-vector product needed on the finest level of
>>   MG to compute the residual that will be restricted to the coarser 
>> level of
>>   multigrid
>>   - When using Galerkin products to compute the coarser grid
>>   operator performing the Galerkin matrix triple product
>>
>>
>> when one swaps out the mat, which of these do they wish to change? The
>> first two seem to naturally go together as do the last three. In your case
>> I presume you want to swap for the first two, but always use pmat for the
>> last three? To achieve this you will also need -pc_use_amat  false
>>
>> If you run with -info and -snes_view it will print out some of the
>> information about which operator it is using for each case, but probably
>> not all of them.
>>
>> Note: if the pmat is actually an accurate computation of the Jacobian
>> then it is likely best not to ever use a matrix-free product. It is only
>> when pmat is approximated in some specific way that usi

Re: [petsc-users] Change Amat in FormJacobian

2021-06-14 Thread Dave May
Hi Anton,

On Mon, 14 Jun 2021 at 14:47, Anton Popov  wrote:

> Hi Barry & Matt,
>
> thanks for your quick response. These options were exactly what I needed
> and expected:
>
> -pc_mg_galerkin pmat
> -pc_use_amat false
>
> I just assumed that it’s a default behavior of the PC object.
>
> So to clarify my case, I don't use nonlinear multigrid. Galerkin is
> expected to deal with Pmat only, and it's enough if Amat implements a
> matrix-vector product for the Krylov accelerator.
>
> Matt, the reason for switching Amat during the iteration is a quite common
> Picard-Newton combination. Jacobian matrix gives accurate updates close to
> the solution, but is rather unstable far from the solution. Picard matrix
> (approximate Jacobian) is quite the opposite – it’s kind of stable, but
> slow. So the idea is to begin the iteration with Picard matrix, and switch
> to the Jacobian later.
>
> If the assembled matrices are used, then the standard SNES interface is
> just perfect. I can decide how to fill the matrices. But I don’t bother
> with  Jacobian assembly and want to use a built-in MFFD approximation
> instead. I did quite a few tests previously and figured out that MFFD is
> practically the same as closed-from matrix-free Jacobian for the later
> stages of the iteration. The Picard matrix still does a good job as a
> preconditioner. But it is important to start the iteration with Picard and
> only change to MFFD later.
>
> Is my workaround with the shell matrix acceptable, or there is a better
> solution?
>

Given what you write, it sounds like you already have a good heuristic for
when to switch from Picard to Newton.
Hence I think the simplest option is just to use two separate SNES objects
- one for performing Picard and one for Newton.
The stopping condition for the Picard object would encode your heuristic to
abort earlier when the solution was deemed sufficiently accurate.
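
A minimal sketch of that arrangement (the residual routine FormFunction, the
Picard assembly routine FormPicard, and r, P, x, user are assumed names from
your code):

  SNES snes_picard, snes_newton;
  Mat  Amf;  /* MFFD operator, used only in the Newton phase */

  PetscCall(SNESCreate(PETSC_COMM_WORLD, &snes_picard));
  PetscCall(SNESSetOptionsPrefix(snes_picard, "picard_"));
  PetscCall(SNESSetFunction(snes_picard, r, FormFunction, &user));
  PetscCall(SNESSetJacobian(snes_picard, P, P, FormPicard, &user));    /* assembled Picard matrix */
  PetscCall(SNESSetFromOptions(snes_picard));

  PetscCall(SNESCreate(PETSC_COMM_WORLD, &snes_newton));
  PetscCall(SNESSetOptionsPrefix(snes_newton, "newton_"));
  PetscCall(SNESSetFunction(snes_newton, r, FormFunction, &user));
  PetscCall(MatCreateSNESMF(snes_newton, &Amf));
  PetscCall(SNESSetJacobian(snes_newton, Amf, P, FormPicard, &user));  /* MFFD Amat, Picard Pmat;
      FormPicard would typically also call MatAssemblyBegin/End on the Amat it receives */
  PetscCall(SNESSetFromOptions(snes_newton));

  PetscCall(SNESSolve(snes_picard, NULL, x));  /* run until your heuristic says "good enough" */
  PetscCall(SNESSolve(snes_newton, NULL, x));  /* continue from the same iterate with Newton */

The distinct prefixes then let the two phases be tuned independently from the
command line (-picard_snes_..., -newton_snes_..., and likewise for their
KSP/PC options).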

Thanks,
Dave


>
> Thanks,
> Anton
> On 13.06.21 20:52, Barry Smith wrote:
>
>
>   Anton,
>
>   -pc_mg_galerkin pmat
>
>   Though it seems simple, there is some subtly in swapping out matrices
> with SNES.
>
>   When using multigrid with SNES there are at least five distinct uses of
> the Jacobian operator.
>
>- Perform matrix-vector product in line search to check Wolf
>   sufficient decrease convergence criteria
>   - Perform the matrix-vector product for the Krylov accelerator of
>   the system
>   - Perform smoothing on the finest level of MG
>   - Perform the matrix-vector product needed on the finest level of
>   MG to compute the residual that will be restricted to the coarser level 
> of
>   multigrid
>   - When using Galerkin products to compute the coarser grid operator
>   performing the Galerkin matrix triple product
>
>
> when one swaps out the mat, which of these do they wish to change? The
> first two seem to naturally go together as do the last three. In your case
> I presume you want to swap for the first two, but always use pmat for the
> last three? To achieve this you will also need -pc_use_amat  false
>
> If you run with -info and -snes_view it will print out some of the
> information about which operator it is using for each case, but probably
> not all of them.
>
> Note: if the pmat is actually an accurate computation of the Jacobian then
> it is likely best not to ever use a matrix-free product. It is only when
> pmat is approximated in some specific way that using the matrix-free
> product would be useful to insure the "Newton" method actually computes a
> Newton step.
>
> Barry
>
>
>
> On Jun 13, 2021, at 11:21 AM, Matthew Knepley  wrote:
>
> On Sun, Jun 13, 2021 at 10:55 AM Anton Popov  wrote:
>
>> Hi,
>>
>> I want a simple(?) thing. I want to decide and be able to assign the
>> Jacobian matrix (Amat) on the fly within the FormJacobian function (i.e.
>> during Newton iteration) to one of the following options:
>>
>> 1) built-in MFFD approximation
>> 2) assembled preconditioner matrix (Pmat)
>>
>> I have not found this scenario demonstrated in any example, therefore
>> I’m asking how to do that.
>>
>> Currently I do the following:
>>
>> 1) setup Amat as a shell matrix with a MATOP_MULT operation that simply
>> retrieves a matrix object form its context and calls MatMult on it.
>>
>> 2) if I need MFFD, I put a matrix generated with MatCreateSNESMF in the
>> Amat context (of course I also call MatMFFDComputeJacobian before that).
>>
>> 3) if I need Pmat, I simply put Pmat in the Amat context.
>>
>> 4) call MatAssemblyBegin/End on Amat
>>
>> So far so good.
>>
>> However, shell Amat and assembled Pmat generate a problem if Galerkin
>> multigrid is requested as a preconditioner (I just test on 1 CPU):
>>
>> [0]PETSC ERROR: MatPtAP requires A, shell, to be compatible with P,
>> seqaij (Misses composed function MatPtAP_shell_seqaij_C)
>> [0]PETSC ERROR: #1 MatPtAP()
>> [0]PETSC ERROR: #2 MatGalerkin()
>> [0]PETSC ERROR: #3 PCSetUp_MG()
>> [0]PETSC ERROR: #4 PCSetUp()
>> 

Re: [petsc-users] Data transfer between DMDA-managed Vecs

2021-04-19 Thread Dave May
On Tue, 20 Apr 2021 at 01:06, Constantine Khrulev 
wrote:

> Hi,
>
> I would like to transfer values from one DMDA-managed Vec (i.e. created
> using DMCreateGlobalVector() or equivalent) to a Vec managed using a
> different DMDA instance (same number of elements, same number of degrees
> of freedom, *different* domain decomposition).
>
> What approach would you recommend?
>
>
VecScatter.


> Bonus question: what about DMDAs using different MPI communicators?
>

VecScatter. :D

We do exactly this in PCTELESCOPE using VecScatter.
It might be worth snooping through telescope_dmda.c which you can find here

https://www.mcs.anl.gov/petsc/petsc-current/src/ksp/pc/impls/telescope/telescope_dmda.c.html
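
For the same-communicator case, a hedged sketch of one possible setup (da1/x1
are the source DMDA/Vec, da2/x2 the destination, and both DMDAs are assumed to
have identical global sizes and dof) is to go through the natural ordering,
where the two Vecs share one global index space:

  Vec        nat1, nat2;
  IS         is;
  VecScatter sct;
  PetscInt   rstart, rend;

  PetscCall(DMDACreateNaturalVector(da1, &nat1));
  PetscCall(DMDACreateNaturalVector(da2, &nat2));
  PetscCall(DMDAGlobalToNaturalBegin(da1, x1, INSERT_VALUES, nat1));
  PetscCall(DMDAGlobalToNaturalEnd(da1, x1, INSERT_VALUES, nat1));

  /* nat1 and nat2 use the same (natural) numbering but are distributed
     differently; scatter the entries this rank owns in nat2 out of nat1 */
  PetscCall(VecGetOwnershipRange(nat2, &rstart, &rend));
  PetscCall(ISCreateStride(PETSC_COMM_WORLD, rend - rstart, rstart, 1, &is));
  PetscCall(VecScatterCreate(nat1, is, nat2, is, &sct));
  PetscCall(VecScatterBegin(sct, nat1, nat2, INSERT_VALUES, SCATTER_FORWARD));
  PetscCall(VecScatterEnd(sct, nat1, nat2, INSERT_VALUES, SCATTER_FORWARD));

  PetscCall(DMDANaturalToGlobalBegin(da2, nat2, INSERT_VALUES, x2));
  PetscCall(DMDANaturalToGlobalEnd(da2, nat2, INSERT_VALUES, x2));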

Thanks,
Dave


> Thanks!
>
> --
> Constantine
>
>


Re: [petsc-users] Code speedup after upgrading

2021-03-23 Thread Dave May
Nice to hear!
The answer is simple, PETSc is awesome :)

Jokes aside, assuming both petsc builds were configured with
--with-debugging=0, I don’t think there is a definitive answer to your
question with the information you provided.

It could be as simple as one specific implementation you use was improved
between petsc releases. Not being an Ubuntu expert, the change might be
associated with using a different compiler, and/or a more efficient BLAS
implementation (non-threaded vs threaded). However I doubt this is the
origin of your 2x performance increase.

If you really want to understand where the performance improvement
originated from, you’d need to send to the email list the result of
-log_view from both the old and new versions, running the exact same
problem.

From that info, we can see what implementations in PETSc are being used and
where the time reduction is occurring. Knowing that, it should be easier
to provide an explanation for it.


Thanks,
Dave


On Tue 23. Mar 2021 at 06:24, Mohammad Gohardoust 
wrote:

> Hi,
>
> I am using a code which is based on petsc (and also parmetis). Recently I
> made the following changes and now the code is running about two times
> faster than before:
>
>- Upgraded Ubuntu 18.04 to 20.04
>- Upgraded petsc 3.13.4 to 3.14.5
>- This time I installed parmetis and metis directly via petsc by
>--download-parmetis --download-metis flags instead of installing them
>separately and using --with-parmetis-include=... and
>--with-parmetis-lib=... (the version of installed parmetis was 4.0.3 
> before)
>
> I was wondering what can possibly explain this speedup? Does anyone have
> any suggestions?
>
> Thanks,
> Mohammad
>


Re: [petsc-users] error message

2021-03-16 Thread Dave May
On Tue, 16 Mar 2021 at 19:50, Sam Guo  wrote:

> Dear PETSc dev team,
>When there is an PETSc error, I go following overly verbose error
> message. Is it possible to get a simple error message like "Initial vector
> is zero or belongs to the deflation space"?
>
>
When an error occurs and the execution is halted, a verbose and informative
error message is shown.
I would argue this is useful (very useful), and should never ever be
shortened or truncated.

This error thrown by PETSc gives you a stack trace. You can see where the
error occurred, and the calling code which resulted in the error. In
anything but a trivial code, this information is incredibly useful to
isolate and fix the problem. I also think it's neat that you see the stack
without having to even use a debugger.

Currently if your code does not produce errors, no message is displayed.
However, when an error occurs, a loud, long and informative message is
displayed - and the code stops.
What is the use case which would cause / require you to change the current
behaviour?

Thanks,
Dave



> Thanks,
> Sam
>
> [0]PETSC ERROR: - Error Message 
> --
> [0]PETSC ERROR: Initial vector is zero or belongs to the deflation space
> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for 
> trouble shooting.
> [0]PETSC ERROR: Petsc Release Version 3.11.3, Jun, 26, 2019
> [0]PETSC ERROR: Unknown Name on a arch-starccmplus_serial_real named 
> pl2denbg0033pc.net.plm.eds.com by cd3ppw Tue Mar 16 16:19:28 2021
> [0]PETSC ERROR: Configure options --with-x=0 --with-fc=0 --with-debugging=0 
> --with-blaslapack-dir=/usr/local/jenkins/dev1/mkl/2017.2-cda-001/linux/lib/intel64/../..
>  --with-mpi=0 -CFLAGS=-O3 -CXXFLAGS=-O3 --with-clean=1 --force 
> --with-scalar-type=real
> [0]PETSC ERROR: #1 
>  
> EPSGetStartVector() line 806 in ../../../slepc/src/eps/interface/epssolve.c
> [0]PETSC ERROR: #2 
>  
> EPSSolve_KrylovSchur_Default() line 259 in 
> ../../../slepc/src/eps/impls/krylov/krylovschur/krylovschur.c
> [0]PETSC ERROR: #3 
>  
> EPSSolve() line 149 in ../../../slepc/src/eps/interface/epssolve.c
>
>


Re: [petsc-users] Block unstructured grid

2021-03-11 Thread Dave May
On Thu, 11 Mar 2021 at 09:27, Mathieu Dutour 
wrote:

> Dear all,
>
> I would like to work with a special kind of linear system that ought to be
> very common but I am not sure that it is possible in PETSC.
>
> What we have is an unstructured grid with say 3.10^5 nodes in it.
> At each node, we have a number of frequency/direction and together
> this makes about 1000 values at the node. So, in total the linear system
> has say 3.10^8 values.
>
> We managed to implement this system with Petsc but the performance
> was unsatisfactory.
>

I think part of the reason the answers you are getting aren't helpful to
you is
that you have not identified "what" exactly you find to be unsatisfactory.
Nor is it obvious what you consider to be satisfactory.

For example, does "unsatisfactory" relate to any of these items?
* memory usage of the matrix
* time taken to assemble the matrix
* time taken to perform MatMult()
* solve time

If it does, providing the output from -log_view (from an optimized build of
petsc) would be helpful,
and moreover it would provide developers with a baseline result with which
they could compare to
should any implementation changes be made.

Having established what functionality is causing you concern, it would then
be helpful for you to explain
why you think it should be better, e.g. based on a performance model, prior
experience with
other software, etc.

More information would help.

Thanks,
Dave


> We think that Petsc is not exploiting the special
> structure of the matrix and we wonder if this structure can be implemented
> in Petsc.
>
> By special structure we mean the following. An entry in the linear system
> is of the form (i, j) with 1<=i<=1000 and 1<=j<=N   with N = 3.10^5.
> The node (i , j) is adjacent to all the nodes (i' , j) and thus they make
> a block
> diagonal entry. But the node (i , j) is also adjacent to some nodes (i ,
> j')
> [About 6 such nodes, but it varies].
>
> Would there be a way to exploit this special structure in Petsc? I think
> this should be fairly common and significant speedup could be obtained.
>
> Best,
>
>   Mathieu
>


Re: [petsc-users] using preconditioner with SLEPc

2021-02-08 Thread Dave May
On Mon 8. Feb 2021 at 17:40, Dave May  wrote:

>
>
> On Mon 8. Feb 2021 at 15:49, Matthew Knepley  wrote:
>
>> On Mon, Feb 8, 2021 at 9:37 AM Jose E. Roman  wrote:
>>
>>> The problem can be written as A0*v=omega*B0*v and you want the
>>> eigenvalues omega closest to zero. If the matrices were explicitly
>>> available, you would do shift-and-invert with target=0, that is
>>>
>>>   (A0-sigma*B0)^{-1}*B0*v=theta*vfor sigma=0, that is
>>>
>>>   A0^{-1}*B0*v=theta*v
>>>
>>> and you compute EPS_LARGEST_MAGNITUDE eigenvalues theta=1/omega.
>>>
>>> Matt: I guess you should have EPS_LARGEST_MAGNITUDE instead of
>>> EPS_SMALLEST_REAL in your code. Are you getting the eigenvalues you need?
>>> EPS_SMALLEST_REAL will give slow convergence.
>>>
>>
>> Thanks Jose! I am not understanding some step. I want the smallest
>> eigenvalues. Should I use EPS_SMALLEST_MAGNITUDE? I appear to get what I
>> want
>> using SMALLEST_REAL, but as you say it might be slower than it has to be.
>>
>
>
> With shift-and-invert you want to use EPS_LARGEST_MAGNITUDE as Jose says.
> The largest magnitude v
>


Sorry “v” should be “theta”!

eigenvalues you obtain (see Jose equation above) from the transformed
> system correspond to the smallest magnitude omega eigenvalues of the
> original problem.
>
> Cheers
> Dave
>
>
>> Also, sometime I would like to talk about incorporating the multilevel
>> eigensolver. I am sure you could make lots of improvements to my initial
>> attempt. I will send
>> you a separate email, since I am getting serious about testing it.
>>
>>   Thanks,
>>
>>  Matt
>>
>>
>>> Florian: I would not recommend setting the KSP matrices directly, it may
>>> produce strange side-effects. We should have an interface function to pass
>>> this matrix. Currently there is STPrecondSetMatForPC() but it has two
>>> problems: (1) it is intended for STPRECOND, so cannot be used with
>>> Krylov-Schur, and (2) it is not currently available in the python interface.
>>>
>>> The approach used by Matt is a workaround that does not use ST, so you
>>> can handle linear solves with a KSP of your own.
>>>
>>> As an alternative, since your problem is symmetric, you could try
>>> LOBPCG, assuming that the leftmost eigenvalues are those that you want
>>> (e.g. if all eigenvalues are non-negative). In that case you could use
>>> STPrecondSetMatForPC(), but the remaining issue is calling it from python.
>>>
>>> If you are using the git repo, I could add the relevant code.
>>>
>>> Jose
>>>
>>>
>>>
>>> > El 8 feb 2021, a las 14:22, Matthew Knepley 
>>> escribió:
>>> >
>>> > On Mon, Feb 8, 2021 at 7:04 AM Florian Bruckner 
>>> wrote:
>>> > Dear PETSc / SLEPc Users,
>>> >
>>> > my question is very similar to the one posted here:
>>> >
>>> https://lists.mcs.anl.gov/pipermail/petsc-users/2018-August/035878.html
>>> >
>>> > The eigensystem I would like to solve looks like:
>>> > B0 v = 1/omega A0 v
>>> > B0 and A0 are both hermitian, A0 is positive definite, but only given
>>> as a linear operator (matshell). I am looking for the largest eigenvalues
>>> (=smallest omega).
>>> >
>>> > I also have a sparse approximation P0 of the A0 operator, which i
>>> would like to use as precondtioner, using something like this:
>>> >
>>> > es = SLEPc.EPS().create(comm=fd.COMM_WORLD)
>>> > st = es.getST()
>>> > ksp = st.getKSP()
>>> > ksp.setOperators(self.A0, self.P0)
>>> >
>>> > Unfortunately PETSc still complains that it cannot create a
>>> preconditioner for a type 'python' matrix although P0.type == 'seqaij' (but
>>> A0.type == 'python').
>>> > By the way, should P0 be an approximation of A0 or does it have to
>>> include B0?
>>> >
>>> > Right now I am using the krylov-schur method. Are there any
>>> alternatives if A0 is only given as an operator?
>>> >
>>> > Jose can correct me if I say something wrong.
>>> >
>>> > When I did this, I made a shell operator for the action of A0^{-1} B0
>>> which has a KSPSolve() in it, so you can use your P0 preconditioning
>>> matrix, and
>>> > then handed that to EPS. You can see me do it here:
>>> >
>>> >
>>> https://gitlab.com/knepley/bamg/-/blob/master/src/coarse/bamgCoarseSpace.c#L123
>>> >
>>> > I had a hard time getting the embedded solver to work the way I
>>> wanted, but maybe that is the better way.
>>> >
>>> >   Thanks,
>>> >
>>> >  Matt
>>> >
>>> > thanks for any advice
>>> > best wishes
>>> > Florian
>>> >
>>> >
>>> > --
>>> > What most experimenters take for granted before they begin their
>>> experiments is infinitely more interesting than any results to which their
>>> experiments lead.
>>> > -- Norbert Wiener
>>> >
>>> > https://www.cse.buffalo.edu/~knepley/
>>>
>>>
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which their
>> experiments lead.
>> -- Norbert Wiener
>>
>> https://www.cse.buffalo.edu/~knepley/
>> <http://www.cse.buffalo.edu/~knepley/>
>>
>


Re: [petsc-users] using preconditioner with SLEPc

2021-02-08 Thread Dave May
On Mon 8. Feb 2021 at 15:49, Matthew Knepley  wrote:

> On Mon, Feb 8, 2021 at 9:37 AM Jose E. Roman  wrote:
>
>> The problem can be written as A0*v=omega*B0*v and you want the
>> eigenvalues omega closest to zero. If the matrices were explicitly
>> available, you would do shift-and-invert with target=0, that is
>>
>>   (A0-sigma*B0)^{-1}*B0*v=theta*vfor sigma=0, that is
>>
>>   A0^{-1}*B0*v=theta*v
>>
>> and you compute EPS_LARGEST_MAGNITUDE eigenvalues theta=1/omega.
>>
>> Matt: I guess you should have EPS_LARGEST_MAGNITUDE instead of
>> EPS_SMALLEST_REAL in your code. Are you getting the eigenvalues you need?
>> EPS_SMALLEST_REAL will give slow convergence.
>>
>
> Thanks Jose! I am not understanding some step. I want the smallest
> eigenvalues. Should I use EPS_SMALLEST_MAGNITUDE? I appear to get what I
> want
> using SMALLEST_REAL, but as you say it might be slower than it has to be.
>


With shift-and-invert you want to use EPS_LARGEST_MAGNITUDE as Jose says.
The largest magnitude v eigenvalues you obtain (see Jose equation above)
from the transformed system correspond to the smallest magnitude omega
eigenvalues of the original problem.

Cheers
Dave


> Also, sometime I would like to talk about incorporating the multilevel
> eigensolver. I am sure you could make lots of improvements to my initial
> attempt. I will send
> you a separate email, since I am getting serious about testing it.
>
>   Thanks,
>
>  Matt
>
>
>> Florian: I would not recommend setting the KSP matrices directly, it may
>> produce strange side-effects. We should have an interface function to pass
>> this matrix. Currently there is STPrecondSetMatForPC() but it has two
>> problems: (1) it is intended for STPRECOND, so cannot be used with
>> Krylov-Schur, and (2) it is not currently available in the python interface.
>>
>> The approach used by Matt is a workaround that does not use ST, so you
>> can handle linear solves with a KSP of your own.
>>
>> As an alternative, since your problem is symmetric, you could try LOBPCG,
>> assuming that the leftmost eigenvalues are those that you want (e.g. if all
>> eigenvalues are non-negative). In that case you could use
>> STPrecondSetMatForPC(), but the remaining issue is calling it from python.
>>
>> If you are using the git repo, I could add the relevant code.
>>
>> Jose
>>
>>
>>
>> > El 8 feb 2021, a las 14:22, Matthew Knepley 
>> escribió:
>> >
>> > On Mon, Feb 8, 2021 at 7:04 AM Florian Bruckner 
>> wrote:
>> > Dear PETSc / SLEPc Users,
>> >
>> > my question is very similar to the one posted here:
>> > https://lists.mcs.anl.gov/pipermail/petsc-users/2018-August/035878.html
>> >
>> > The eigensystem I would like to solve looks like:
>> > B0 v = 1/omega A0 v
>> > B0 and A0 are both hermitian, A0 is positive definite, but only given
>> as a linear operator (matshell). I am looking for the largest eigenvalues
>> (=smallest omega).
>> >
>> > I also have a sparse approximation P0 of the A0 operator, which i would
>> like to use as precondtioner, using something like this:
>> >
>> > es = SLEPc.EPS().create(comm=fd.COMM_WORLD)
>> > st = es.getST()
>> > ksp = st.getKSP()
>> > ksp.setOperators(self.A0, self.P0)
>> >
>> > Unfortunately PETSc still complains that it cannot create a
>> preconditioner for a type 'python' matrix although P0.type == 'seqaij' (but
>> A0.type == 'python').
>> > By the way, should P0 be an approximation of A0 or does it have to
>> include B0?
>> >
>> > Right now I am using the krylov-schur method. Are there any
>> alternatives if A0 is only given as an operator?
>> >
>> > Jose can correct me if I say something wrong.
>> >
>> > When I did this, I made a shell operator for the action of A0^{-1} B0
>> which has a KSPSolve() in it, so you can use your P0 preconditioning
>> matrix, and
>> > then handed that to EPS. You can see me do it here:
>> >
>> >
>> https://gitlab.com/knepley/bamg/-/blob/master/src/coarse/bamgCoarseSpace.c#L123
>> >
>> > I had a hard time getting the embedded solver to work the way I wanted,
>> but maybe that is the better way.
>> >
>> >   Thanks,
>> >
>> >  Matt
>> >
>> > thanks for any advice
>> > best wishes
>> > Florian
>> >
>> >
>> > --
>> > What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which their
>> experiments lead.
>> > -- Norbert Wiener
>> >
>> > https://www.cse.buffalo.edu/~knepley/
>>
>>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
> 
>


Re: [petsc-users] Enhancing MatScale computing time

2020-10-23 Thread Dave May
On Thu 22. Oct 2020 at 21:23, Antoine Côté 
wrote:

> Hi,
>
> I'm working with a 3D DMDA, with 3 dof per "node", used to create a sparse
> matrix Mat K. The Mat is modified repeatedly by the program, using the
> commands (in that order) :
>
> MatZeroEntries(K)
> In a for loop : MatSetValuesLocal(K, 24, irow, 24, icol, vals, ADD_VALUES)
> MatAssemblyBegin(K, MAT_FINAL_ASSEMBLY)
> MatAssemblyEnd(K, MAT_FINAL_ASSEMBLY)
> MatDiagonalScale(K, vec1, vec1)
> MatDiagonalSet(K, vec2, ADD_VALUES)
>

Why not just assemble the entire operator you seek locally in vals?
You would then avoid the calls to MatDiagonalScale and MatDiagonalSet by
instead calling VecGetArrayRead on vec1 and vec2 and using the local parts
of these vectors you need with vals. You probably need to scatter vec1,
vec2 first before VecGetArrayRead.
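
A rough sketch of folding the vec1 scaling into the element insertion (this
assumes irow/icol are the same ghosted local indices, per dof, that you pass
to MatSetValuesLocal, and that vec1_loc is a local/ghosted form of vec1
obtained with DMGlobalToLocalBegin/End):

  const PetscScalar *s1;
  PetscInt           i, j;

  PetscCall(VecGetArrayRead(vec1_loc, &s1));
  /* scale the 24x24 element contribution in place: K_e <- D1 * K_e * D1 */
  for (i = 0; i < 24; i++) {
    for (j = 0; j < 24; j++) vals[24 * i + j] *= s1[irow[i]] * s1[icol[j]];
  }
  PetscCall(VecRestoreArrayRead(vec1_loc, &s1));
  PetscCall(MatSetValuesLocal(K, 24, irow, 24, icol, vals, ADD_VALUES));
  /* the vec2 diagonal shift could likewise be added directly, e.g. with one
     extra MatSetValuesLocal() per locally owned row, instead of MatDiagonalSet() */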

Thanks,
Dave




> Computing time seems high and I would like to improve it. Running tests
> with "-log_view" tells me that MatScale() is the bottleneck (50% of total
> computing time). From manual pages, I've tried a few tweaks:
>
>- DMSetMatType(da, MATMPIBAIJ) : "For problems with multiple degrees
>of freedom per node, ... BAIJ can significantly enhance performance",
>Chapter 14.2.4
>- Used MatMissingDiagonal() to confirm there is no missing diagonal
>entries : "If the matrix Y is missing some diagonal entries this routine
>can be very slow", MatDiagonalSet() manual
>- Tried MatSetOption()
>   - MAT_NEW_NONZERO_LOCATIONS == PETSC_FALSE : to increase assembly
>   efficiency
>   - MAT_NEW_NONZERO_LOCATION_ERR == PETSC_TRUE : "When true, assembly
>   processes have one less global reduction"
>   - MAT_NEW_NONZERO_ALLOCATION_ERR == PETSC_TRUE : "When true,
>   assembly processes have one less global reduction"
>   - MAT_USE_HASH_TABLE == PETSC_TRUE : "Improve the searches during
>   matrix assembly"
>
> According to "-log_view", assembly is fast (0% of total time), and the
> use of a DMDA makes me believe preallocation isn't the cause of performance
> issue.
>
> I would like to know how could I improve MatScale(). What are the best
> practices (during allocation, when defining Vecs and Mats, the DMDA, etc.)?
> Instead of MatDiagonalScale(), should I use another command to obtain the
> same result faster?
>
> Thank you very much!
>
> Antoine Côté
>
>


Re: [petsc-users] Test convergence with non linear preconditioners

2020-08-07 Thread Dave May
On Fri 7. Aug 2020 at 18:21, Adolfo Rodriguez  wrote:

> Great, that works. What would be the way to change the ilu level, I need
> to use ilu(1). I
>

You want to use the option

-xxx_pc_factor_levels 1

where xxx is the prefix of the appropriate FAS level.

See here

https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCILU.html
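
For the coarse level in your -snes_view output that would presumably be
(guessing from the prefixes you quoted):

-npc_fas_coarse_pc_factor_levels 1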





would assume that I can accomplish that by means of:
>
> -npc_fas_coarse_pc_type ilu
> -npc_fas_coarse_pc_ilu_levels 1
>
> I noticed that the first line actually works, but I am not sure about the
> second one.
>

If you want to see which options are used / unused, add the command line
option

-options_left 1

Thanks,
Dave



> Thanks,
> Adolfo
>
>
> 
>  Virus-free.
> www.avast.com
> 
> <#m_5466096191530741685_m_-242385451441738719_DAB4FAD8-2DD7-40BB-A1B8-4E2AA1F9FDF2>
>
> On Thu, Aug 6, 2020 at 9:23 PM Matthew Knepley  wrote:
>
>> On Thu, Aug 6, 2020 at 10:08 PM Adolfo Rodriguez 
>> wrote:
>>
>>> Considering the output produced by snes_view (attachment), would be
>>> possible to change the linear solver tolerances and the preconditioning
>>> level or type?
>>>
>>
>> Yes. The options prefix is shown for each subsolver. For example, you can
>> change the linear solver type for the coarse level of FAS using
>>
>>   -npc_fas_coarse_ksp_type bigcg
>>
>> Notice that you are using 1 level of FAS, so it's the same as just Newton.
>>
>>   Thanks,
>>
>>  Matt
>>
>>
>>> Adolfo
>>>
>>>
>>> 
>>>  Virus-free.
>>> www.avast.com
>>> 
>>> <#m_5466096191530741685_m_-242385451441738719_m_5425691679335133860_m_9065914518290907664_DAB4FAD8-2DD7-40BB-A1B8-4E2AA1F9FDF2>
>>>
>>> On Wed, Aug 5, 2020 at 8:31 PM Barry Smith  wrote:
>>>

Turtles on top of turtles on top of turtles.

It is probably easiest for you to look at the actual code to see how
 it handles things

   1) the SNESFAS uses a SNES for each of the levels; for each of these
 level SNES you can control the convergence criteria (either from the
 command line with the appropriate prefix (highly recommended) or with
 function calls (not recommended)), and you can even provide your own convergence
 functions (run with -snes_view to see the various solvers and their
 prefixes).

   2) at the finest level of SNESFAS it does call

 /* Test for convergence */
 if (isFine) {
   ierr = (*snes->ops->converged)(snes,snes->iter,0.0,0.0,snes->norm,&snes->reason,snes->cnvP);CHKERRQ(ierr);
   if (snes->reason) break;
 }

 src/snes/impls/fas/fas.c line 881 so at least in theory you can provide
 your own convergence test function.

   It was certainly our intention that users can control all the
 convergence knobs for arbitrary imbedded nonlinear solvers including FAS
 but, of course, there may be bugs so let us know what doesn't work.

  Generally the model for FAS is to run a single (or small number of)
 iteration(s) on the level solves and so not directly use convergence
 tolerances like rtol to control the number of iterations on a level but you
 should be able to set any criteria you want.

   You should be able to run with -snes_view and change some of the
 criteria on the command line and see the changes presented in the
 -snes_view output, plus see differences in convergence behavior.



   Barry



 On Aug 5, 2020, at 8:10 PM, Adolfo Rodriguez  wrote:

 It looks like I cannot really change the test function or anything else
 for this particular SNES solver  (I am using SNESFas). Basically, I am
 trying to use the ideas exposed in the paper on Composing scalable solvers
 but it seems that SNESFas does not allow to change the function for testing
 convergence, or anything else. Is this correct?

 Adolfo


 
  Virus-free.
 www.avast.com
 

 On Wed, Aug 5, 2020 at 3:41 PM Barry Smith  wrote:

>
>Adolfo,
>
>  You can also just change the tolerances for the inner solve using
> the options data base and the prefix for the inner solve.
>
>  When you run with -snes_view it will show the prefix for each of
> the (nested) solvers. You can also run with -help to get all the possible
> options for the inner solvers.

Re: [petsc-users] Error on INTEGER SIZE using DMDACreate3d

2020-07-21 Thread Dave May
On Tue, 21 Jul 2020 at 12:32, Pierpaolo Minelli 
wrote:

> Hi,
>
> I have asked to compile a Petsc Version updated and with 64bit indices.
> Now I have Version 3.13.3 and these are the configure options used:
>
> #!/bin/python
> if __name__ == '__main__':
>   import sys
>   import os
>   sys.path.insert(0, os.path.abspath('config'))
>   import configure
>   configure_options = [
> '--CC=mpiicc',
> '--CXX=mpiicpc',
> '--download-hypre',
> '--download-metis',
> '--download-mumps=yes',
> '--download-parmetis',
> '--download-scalapack',
> '--download-superlu_dist',
> '--known-64-bit-blas-indices',
>
>   
> '--prefix=/cineca/prod/opt/libraries/petsc/3.13.3_int64/intelmpi--2018--binary',
> '--with-64-bit-indices=1',
>
>   
> '--with-blaslapack-dir=/cineca/prod/opt/compilers/intel/pe-xe-2018/binary/mkl',
> '--with-cmake-dir=/cineca/prod/opt/tools/cmake/3.12.0/none',
> '--with-debugging=0',
> '--with-fortran-interfaces=1',
> '--with-fortran=1',
> 'FC=mpiifort',
> 'PETSC_ARCH=arch-linux2-c-opt',
>   ]
>   configure.petsc_configure(configure_options)
>
> Now, I receive an error on hypre:
>
> forrtl: error (78): process killed (SIGTERM)
> Image              PC        Routine            Line        Source
> libHYPRE-2.18.2.s  2B33CF465D3F  for__signal_handl Unknown  Unknown
> libpthread-2.17.s  2B33D5BFD370  Unknown   Unknown  Unknown
> libpthread-2.17.s  2B33D5BF96D3  pthread_cond_wait Unknown  Unknown
> libiomp5.so2B33DBA14E07  Unknown   Unknown  Unknown
> libiomp5.so2B33DB98810C  Unknown   Unknown  Unknown
> libiomp5.so2B33DB990578  Unknown   Unknown  Unknown
> libiomp5.so2B33DB9D9659  Unknown   Unknown  Unknown
> libiomp5.so2B33DB9D8C39  Unknown   Unknown  Unknown
> libiomp5.so2B33DB993BCE  __kmpc_fork_call  Unknown  Unknown
> PIC_3D 004071C0  Unknown   Unknown  Unknown
> PIC_3D 00490299  Unknown   Unknown  Unknown
> PIC_3D 00492C17  Unknown   Unknown  Unknown
> PIC_3D 0040562E  Unknown   Unknown  Unknown
> libc-2.17.so   2B33DC5BEB35  __libc_start_main
>   Unknown  Unknown
> PIC_3D 00405539  Unknown   Unknown  Unknown
>
> Is it possible that I need to ask also to compile hypre with an option for
> 64bit indices?
> Is it possible to instruct this inside Petsc configure?
> Alternatively, is it possible to use a different multigrid PC inside PETSc
> that accept 64bit indices?
>

You can use
  -pc_type gamg
All native PETSc implementations support 64bit indices.


>
> Thanks in advance
>
> Pierpaolo
>
>
> Il giorno 27 mag 2020, alle ore 11:26, Stefano Zampini <
> stefano.zamp...@gmail.com> ha scritto:
>
> You need a version of PETSc compiled with 64bit indices, since the message
> indicates the number of dofs in this case is larger the INT_MAX
> 2501×3401×1601 = 13617947501
>
> I also suggest you upgrade to a newer version, 3.8.3 is quite old as the
> error message reports
>
> Il giorno mer 27 mag 2020 alle ore 11:50 Pierpaolo Minelli <
> pierpaolo.mine...@cnr.it> ha scritto:
>
>> Hi,
>>
>> I am trying to solve a Poisson equation on this grid:
>>
>> Nx = 2501
>> Ny = 3401
>> Nz = 1601
>>
>> I received this error:
>>
>> [0]PETSC ERROR: - Error Message
>> --
>> [0]PETSC ERROR: Overflow in integer operation:
>> http://www.mcs.anl.gov/petsc/documentation/faq.html#64-bit-indices
>> [0]PETSC ERROR: Mesh of 2501 by 3401 by 1 (dof) is too large for 32 bit
>> indices
>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html
>> for trouble shooting.
>> [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017
>> [0]PETSC ERROR:
>> /marconi_scratch/userexternal/pminelli/PIC3D/2500_3400_1600/./PIC_3D on a
>> arch-linux2-c-opt named r129c09s02 by pminelli Tu
>> e May 26 20:16:34 2020
>> [0]PETSC ERROR: Configure options
>> --prefix=/cineca/prod/opt/libraries/petsc/3.8.3/intelmpi--2018--binary
>> CC=mpiicc FC=mpiifort CXX=mpiicpc
>> F77=mpiifort F90=mpiifort --with-debugging=0
>> --with-blaslapack-dir=/cineca/prod/opt/compilers/intel/pe-xe-2018/binary/mkl
>> --with-fortran=1
>> --with-fortran-interfaces=1
>> --with-cmake-dir=/cineca/prod/opt/tools/cmake/3.5.2/none
>> --with-mpi-dir=/cineca/prod/opt/compilers/intel/pe-xe-
>> 2018/binary/impi/2018.4.274 --download-scalapack --download-mumps=yes
>> --download-hypre --download-superlu_dist --download-parmetis --downlo
>> ad-metis
>> [0]PETSC ERROR: #1 DMSetUp_DA_3D() line 218 in
>> /marconi/prod/build/libraries/petsc/3.8.3/intelmpi--2018--binary/BA_WORK/petsc-3.8.3/src/dm/
>> impls/da/da3.c
>> [0]PETSC ERROR: #2 DMSetUp_DA() line 25 in
>> 

Re: [petsc-users] [Ext] Re: Question on SLEPc + computing SVD with a "matrix free" matrix

2020-06-25 Thread Dave May
On Thu 25. Jun 2020 at 08:23, Ernesto Prudencio via petsc-users <
petsc-users@mcs.anl.gov> wrote:

> Thank you, Jose.
>
> However, in the case of a "matrix free" matrix, the APIs on PETSc seem to
> allow just the implementation of A.v, not of A' . w
>
> One could create another "matrix free" matrix B which could make the role
> of computing z = B . w = A' . w. But how could one force the routine
> MatMultTranspose(A, w, z) to call the routine for B? Here I am assuming
> that MatMultTranspose(A, w, z) is the routine that SLEPc calls internally
> in its algorithms when the user sets for implicit transpose.
>
> I see two possibilities (for brain storming, since I don't know if such
> approaches would be acceptable for the PETSc team and/or the SLEPc team):
> 1) PETSc could add an entry for a second routine that computes [ A' . w ]
> when calling MatMFFDSetFunction()
>

MatMFFD is designed for providing the action of the jacobian associated
with a nonlinear problem F(x)=0. It uses F to provide a finite difference
approx of the action J w. It does not support actions J’ w.

You will need to create a MatShell and supply the methods for MatMult and
MatMultTranspose (as Jose suggested).
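
A minimal sketch of what that looks like (illustrative only: UserMult,
UserMultTranspose, AppCtx and the sizes mloc,nloc,M,N are placeholder names
you would replace with your own routines and data):

  typedef struct { PetscInt dummy; /* whatever you need to apply A and A' */ } AppCtx;

  PetscErrorCode UserMult(Mat A,Vec x,Vec y)
  {
    AppCtx         *ctx;
    PetscErrorCode ierr;
    ierr = MatShellGetContext(A,&ctx);CHKERRQ(ierr);
    /* compute y = A x using ctx */
    return 0;
  }

  PetscErrorCode UserMultTranspose(Mat A,Vec x,Vec y)
  {
    AppCtx         *ctx;
    PetscErrorCode ierr;
    ierr = MatShellGetContext(A,&ctx);CHKERRQ(ierr);
    /* compute y = A' x using ctx */
    return 0;
  }

  /* ... in main ... */
  AppCtx ctx;
  Mat    A;
  ierr = MatCreateShell(PETSC_COMM_WORLD,mloc,nloc,M,N,&ctx,&A);CHKERRQ(ierr);
  ierr = MatShellSetOperation(A,MATOP_MULT,(void (*)(void))UserMult);CHKERRQ(ierr);
  ierr = MatShellSetOperation(A,MATOP_MULT_TRANSPOSE,(void (*)(void))UserMultTranspose);CHKERRQ(ierr);

You can then hand A to SVDSetOperator() and use
SVDSetImplicitTranspose(svd,PETSC_TRUE) as Jose described.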

Thanks
Dave



2) SLEPc could add an entry for the routine for computing [ A' . w ] when
> the matrix A is "matrix free"
>
> Thanks again,
>
> Ernesto.
>
>
> Schlumberger-Private
>
> -Original Message-
> From: Jose E. Roman [mailto:jro...@dsic.upv.es]
> Sent: Thursday, June 25, 2020 1:01 AM
> To: Ernesto Prudencio 
> Cc: petsc-users@mcs.anl.gov
> Subject: [Ext] Re: [petsc-users] Question on SLEPc + computing SVD with a
> "matrix free" matrix
>
> Yes, you have to set it with SVDSetImplicitTranspose(), but then the
> matrix-free matrix should implement both "A.v" and "A'.v" operations.
> Jose
>
>
> > El 24 jun 2020, a las 23:25, Ernesto Prudencio via petsc-users <
> petsc-users@mcs.anl.gov> escribió:
> >
> > Hi,
> >
> > Is it possible to compute a SVD for a “matrix free” matrix?
> >
> > At first, it seems it would be ok with a MatCreateMFFD() and
> MatMFFDSetFunction(), because one could then provide the routine that
> computes “A . v” for any given v, which is an operation needed by SLEPc.
> However, one would also need to set up the SVD object in SLEPc with an
> implicit transpose. Would that be possible?
> >
> > Thanks in advance,
> >
> > Ernesto.
> >
> > Schlumberger-Private
>


Re: [petsc-users] Regarding P4est

2020-06-18 Thread Dave May
On Thu 18. Jun 2020 at 01:20, Mark Adams  wrote:

> PETSc does take pains to keep it clean in Valgrind, to make it more
> useful ...
>

Yes of course!

As I understood, the code being discussed was derived / based on ex11, and
not identical to ex11 (eg flux definitions have changed). Hence there’s
some user code in the mix which is not guaranteed to be valgrind clean.





> And yes there is tree structure to this error, and p4est is a tree code.
>
> Try with uniform bathymetry, maybe your mapping is messed up by some
> reordering by p4est.
>
>
> On Wed, Jun 17, 2020 at 6:47 PM MUKKUND SUNJII 
> wrote:
>
>> No, I have not checked it using Valgrind. Perhaps it will help me trace
>> the problem.
>>
>> Regards,
>>
>> Mukkund
>>
>> On 18 Jun 2020, at 00:43, Dave May  wrote:
>>
>> Is the code valgrind clean?
>>
>> On Wed, 17 Jun 2020 at 23:25, MUKKUND SUNJII 
>> wrote:
>>
>>> I agree with the structured nature of the noise. I did play around with
>>> the PetscFV implementation a bit to allow for the computation of different
>>> fluxes left and right side of every interface.
>>>
>>> Nevertheless it is indeed strange that the problem disappears when I use
>>> a PLEX dm.
>>>
>>> Regards,
>>>
>>> Mukkund
>>>
>>> On 17 Jun 2020, at 22:53, Dave May  wrote:
>>>
>>>
>>>
>>> On Wed 17. Jun 2020 at 21:21, MUKKUND SUNJII 
>>> wrote:
>>>
>>>> Yes, precisely! I am not sure how I can replicate using the original
>>>> version of ex11.c because it does not support bathymetry.
>>>>
>>>> Regardless, to demonstrate the discrepancy, I have uploaded three
>>>> plots. The scenario is a lake at rest. Essentially, you have a varying
>>>> bathymetry but a level water surface. If the model is well balanced, then
>>>> the water surface height must not change. The description of the files are
>>>> below
>>>>
>>>> 1) Bathymetry.png : It shows you the bathymetry profile (z(x)) and the
>>>> water surface height (H = h+z(x)) at t = 0.
>>>> 
>>>>
>>>> 2) Plex.png : This is the water surface height after 1 time step (0.007055
>>>> sec)  and the dm type is Plex. As you can see, the water surface
>>>> height is undisturbed as expected.
>>>> 
>>>>
>>>> 3) P4est.png : This is the result after 1 time step (same final time)
>>>> if I set the dm type as p4est. The noise is in the order of 1e-3 to be a
>>>> little more specific. Since its not specifically at the boundaries and more
>>>> or less spread throughout, it could indeed be noise introduced. But of
>>>> course I could be wrong.
>>>> 
>>>>
>>>>
>>> The (wrong) result has seemingly a lot of structure. Have you verified
>>> your code using p4est is valgrind clean? This looks too much like a weird
>>> indexing bug for me to not ask this question.
>>>
>>> Thanks,
>>> Dave
>>>
>>>
>>> Maybe this paints a better picture.
>>>>
>>>> Regards,
>>>>
>>>> Mukkund
>>>>
>>>> For your reference, the Riemann Solver is a modified version of the HLL
>>>> solver: *A simple well-balanced and positive numerical scheme for the
>>>> shallow-water system by **Emmanuel Audusse, Christophe Chalons,
>>>> Philippe Ung. *
>>>> (
>>>> https://www.intlpress.com/site/pub/files/_fulltext/journals/cms/2015/0013/0005/CMS-2015-0013-0005-a011.pdf
>>>> )
>>>>
>>>> On 17 Jun 2020, at 20:47, Mark Adams  wrote:
>>>>
>>>> So you get this noise with a regular grid in p4est. So the same grid as
>>>> with Plex, and you are not getting the same results.
>>>>
>>>> I don't know of any difference from p4est on a non-adapted grid. Can
>>>> you reproduce this with ex11?
>>>>
>>>> Matt and Toby could answer this better.
>>>>
>>>>
>>>> On Wed, Jun 17, 2020 at 1:33 PM MUKKUND SUNJII 
>>>> wrote:
>>>> Greetings,
>>>>
>>>> I am a master’s student working on the shallow water model of the TS
>>>> example 'ex11.c' as part of my thesis. Therefore, I am working with
>>>> DMForest for the implementation of adaptive grids. I have a question and an
>>>> observation.
>>>>
>>>> I am t

Re: [petsc-users] Regarding P4est

2020-06-17 Thread Dave May
Is the code valgrind clean?

On Wed, 17 Jun 2020 at 23:25, MUKKUND SUNJII 
wrote:

> I agree with the structured nature of the noise. I did play around with
> the PetscFV implementation a bit to allow for the computation of different
> fluxes left and right side of every interface.
>
> Nevertheless it is indeed strange that the problem disappears when I use a
> PLEX dm.
>
> Regards,
>
> Mukkund
>
> On 17 Jun 2020, at 22:53, Dave May  wrote:
>
>
>
> On Wed 17. Jun 2020 at 21:21, MUKKUND SUNJII 
> wrote:
>
>> Yes, precisely! I am not sure how I can replicate using the original
>> version of ex11.c because it does not support bathymetry.
>>
>> Regardless, to demonstrate the discrepancy, I have uploaded three plots.
>> The scenario is a lake at rest. Essentially, you have a varying bathymetry
>> but a level water surface. If the model is well balanced, then the water
>> surface height must not change. The description of the files are below
>>
>> 1) Bathymetry.png : It shows you the bathymetry profile (z(x)) and the
>> water surface height (H = h+z(x)) at t = 0.
>> 
>>
>> 2) Plex.png : This is the water surface height after 1 time step (0.007055
>> sec)  and the dm type is Plex. As you can see, the water surface height
>> is undisturbed as expected.
>> 
>>
>> 3) P4est.png : This is the result after 1 time step (same final time) if
>> I set the dm type as p4est. The noise is in the order of 1e-3 to be a
>> little more specific. Since its not specifically at the boundaries and more
>> or less spread throughout, it could indeed be noise introduced. But of
>> course I could be wrong.
>> 
>>
>>
> The (wrong) result has seemingly a lot of structure. Have you verified
> your code using p4est is valgrind clean? This looks too much like a weird
> indexing bug for me to not ask this question.
>
> Thanks,
> Dave
>
>
> Maybe this paints a better picture.
>>
>> Regards,
>>
>> Mukkund
>>
>> For your reference, the Riemann Solver is a modified version of the HLL
>> solver: *A simple well-balanced and positive numerical scheme for the
>> shallow-water system by **Emmanuel Audusse, Christophe Chalons, Philippe
>> Ung. *
>> (
>> https://www.intlpress.com/site/pub/files/_fulltext/journals/cms/2015/0013/0005/CMS-2015-0013-0005-a011.pdf
>> )
>>
>> On 17 Jun 2020, at 20:47, Mark Adams  wrote:
>>
>> So you get this noise with a regular grid in p4est. So the same grid as
>> with Plex, and you are not getting the same results.
>>
>> I don't know of any difference from p4est on a non-adapted grid. Can you
>> reproduce this with ex11?
>>
>> Matt and Toby could answer this better.
>>
>>
>> On Wed, Jun 17, 2020 at 1:33 PM MUKKUND SUNJII 
>> wrote:
>> Greetings,
>>
>> I am a master’s student working on the shallow water model of the TS
>> example 'ex11.c' as part of my thesis. Therefore, I am working with
>> DMForest for the implementation of adaptive grids. I have a question and an
>> observation.
>>
>> I am trying to find relevant information about interpolation that takes
>> place through the routine DMForestTransferVec. Perhaps it could be my
>> inability to find it, but I am unable to locate the implementation of the
>> routine
>>
>> (forest->transfervec)(dmIn,vecIn,dmOut,vecOut,useBCs,time).
>>
>> Any information on this particular routine is highly appreciated.
>>
>> Furthermore, I have developed a well balanced Riemann Solver that
>> includes topography in the model. In the process of testing both the
>> non-adaptive and adaptive version, I found that my results differed when I
>> changed the type of DM. For instance, when I run a scenario in a fixed,
>> non-adaptive grid  with a DM of type 'P4est', I find that the well balanced
>> nature is lost due to small perturbations all across the domain. However,
>> this does not occur when I use a DM of type ‘plex’. Is there a radical
>> change in the routines between the two DM’s? This is not as much of a
>> question as it is an observation.
>>
>> Thank you for all of your suggestions!
>>
>> Regards,
>>
>> Mukkund
>>
>>
>


Re: [petsc-users] Question on reverse scatters from VecScatterCreateToAll

2020-06-05 Thread Dave May
On Fri 5. Jun 2020 at 13:52, Fabian Jakub <
fabian.ja...@physik.uni-muenchen.de> wrote:

> Dear Petsc list,
>
> I have a question regarding reverse vec-scatters:
>
> I have a shared memory solver that I want to use on a distributed DMDA and
> average its results.
>
> The shared mem solver needs some of the global state.
>
> So I want to create a full copy of a global vec on each master rank of a
> machine, compute some result
>
> and average the results back into the global vec.
>
>
> For example: the parallel layout for a global vector is
>
> 3 ranks in comm_world, each with Ndof = 6, so global size is 18
>
> Then I create a mpi sub communicator with MPI_COMM_TYPE_SHARED
>
> Then I create a local vec on each rank, with sizes 9 on each rank_0 on the
> sub comm, size 0 otherwise
>
>   if ( sub_id == 0 ) then
> call VecGetSize(gvec, Nlocal, ierr); call CHKERR(ierr)
>   else
> Nlocal = 0
>   endif
>   call VecCreateSeq(PETSC_COMM_SELF, Nlocal, lvec, ierr)
>
> This yields for example (first 2 ranks in a shared mem subcomm, 3rd rank
> on another machine):
>
> Global rank_id  |   sub_rank_id   |   local_size_gvec |  local_size_lvec
>
> 0   |  0|
> 6| 18
>
> 1   |  1|
> 6| 0
>
> 2   |  0|
> 6| 18
>
>
> To copy the global vec on each shared mem master, I do:
>
> ISCreateStride(PETSC_COMM_SELF, Nlocal, 0, 1, is, ierr)
>
> VecScatterCreate(gvec, is, lvec, is, ctx, ierr)
>
> VecScatterBegin(ctx, gvec, lvec, INSERT_VALUES, SCATTER_FORWARD, ierr)
>
> VecScatterEnd  (ctx, gvec, lvec, INSERT_VALUES, SCATTER_FORWARD, ierr)
>
>
> That works fine.
>
> Then I want to do the reverse, i.e. add all values from the local vec to
> the global vec on comm_world to generate the average of the results.
>
> I tried:
>
> VecSet(gvec, zero, ierr)
>
> VecScatterBegin(ctx, gvec, lvec, ADD_VALUES, SCATTER_REVERSE, ierr)
>
>
> I was hoping to get the sum of svec in the global vec, so that gvec /
> comm_size(sub_comm_id==0) gives the mean.
>
> However, I get the following error:
>
>
> Nonconforming object sizes
>
> Vector wrong size 18 for scatter 6 (scatter reverse and vector to != ctx
> from size)
>
>
> Going with the same approach with VecScatterCreateToAll
> 
> leads to the same issue.
>
> Do you have suggestions on how I could/should achieve it?
>
>

You have to flip the lvec, gvec args in reverse mode (as the man page
states under Notes).
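That is, keep the same scatter context but call
VecScatterBegin(ctx, lvec, gvec, ADD_VALUES, SCATTER_REVERSE, ierr) followed
by the matching VecScatterEnd with the same arguments; in reverse mode lvec
is the "from" vector and gvec the "to" vector.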

Thanks
Dave



> Many thanks!
>
> Fabian
>


Re: [petsc-users] Running example problem

2020-06-04 Thread Dave May
On Thu, 4 Jun 2020 at 14:17, Dave May  wrote:

>
>
> On Thu, 4 Jun 2020 at 14:15, Matthew Knepley  wrote:
>
>> On Thu, Jun 4, 2020 at 9:12 AM Fazlul Huq  wrote:
>>
>>> Somehow, make is not working.
>>> Please find the attachment herewith for the terminal readout.
>>>
>>
>> Since you built with PETSC_ARCH=linux-gnu, you need that in your
>> environment.
>>
>
> Or just do
>
> make ex5 PETSC_ARCH=linux-gnu
>

(Sorry, I hit send without checking your png.)
The command should be

make ex5 PETSC_ARCH=arch-linux2-c-debug


>
>
>
>
>>
>>   Thanks,
>>
>>  Matt
>>
>>
>>> Thank you.
>>>
>>> Sincerely,
>>> Huq
>>>
>>> On Thu, Jun 4, 2020 at 7:57 AM Matthew Knepley 
>>> wrote:
>>>
>>>> On Thu, Jun 4, 2020 at 8:53 AM Fazlul Huq  wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> I have a very preliminary question!
>>>>> I think I have installed PETSc correctly cause I got following on the
>>>>> terminal:
>>>>>
>>>>> Command:
>>>>> make PETSC_DIR=/home/huq2090/petsc-3.10.2 PETSC_ARCH=linux-gnu check
>>>>> Response:
>>>>> Running check examples to verify correct installation
>>>>> Using PETSC_DIR=/home/huq2090/petsc-3.10.2 and PETSC_ARCH=linux-gnu
>>>>> C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI
>>>>> process
>>>>> C/C++ example src/snes/tutorials/ex19 run successfully with 2 MPI
>>>>> processes
>>>>> Fortran example src/snes/tutorials/ex5f run successfully with 1 MPI
>>>>> process
>>>>> Completed test examples
>>>>>
>>>>> Now, I am trying to run an example from the list of provided examples.
>>>>> Can you please help me out, how to run an example problem from the
>>>>> list of examples?
>>>>> I mean, how to make executable and run the executable?
>>>>>
>>>>
>>>> cd $PETSC_DIR
>>>> cd src/snes/tutorials
>>>> make ex5
>>>> ./ex5 -snes_monitor
>>>>
>>>>   Thanks,
>>>>
>>>>  Matt
>>>>
>>>>
>>>>> Thank you.
>>>>> Sincerely,
>>>>> Huq
>>>>> --
>>>>>
>>>>> Fazlul Huq
>>>>> Graduate Research Assistant
>>>>> Department of Nuclear, Plasma & Radiological Engineering (NPRE)
>>>>> University of Illinois at Urbana-Champaign (UIUC)
>>>>> E-mail: huq2...@gmail.com
>>>>>
>>>>
>>>>
>>>> --
>>>> What most experimenters take for granted before they begin their
>>>> experiments is infinitely more interesting than any results to which their
>>>> experiments lead.
>>>> -- Norbert Wiener
>>>>
>>>> https://www.cse.buffalo.edu/~knepley/
>>>> <http://www.cse.buffalo.edu/~knepley/>
>>>>
>>>
>>>
>>> --
>>>
>>> Fazlul Huq
>>> Graduate Research Assistant
>>> Department of Nuclear, Plasma & Radiological Engineering (NPRE)
>>> University of Illinois at Urbana-Champaign (UIUC)
>>> E-mail: huq2...@gmail.com
>>>
>>
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which their
>> experiments lead.
>> -- Norbert Wiener
>>
>> https://www.cse.buffalo.edu/~knepley/
>> <http://www.cse.buffalo.edu/~knepley/>
>>
>


Re: [petsc-users] Running example problem

2020-06-04 Thread Dave May
On Thu, 4 Jun 2020 at 14:15, Matthew Knepley  wrote:

> On Thu, Jun 4, 2020 at 9:12 AM Fazlul Huq  wrote:
>
>> Somehow, make is not working.
>> Please find the attachment herewith for the terminal readout.
>>
>
> Since you built with PETSC_ARCH=linux-gnu, you need that in your
> environment.
>

Or just do

make ex5 PETSC_ARCH=linux-gnu




>
>   Thanks,
>
>  Matt
>
>
>> Thank you.
>>
>> Sincerely,
>> Huq
>>
>> On Thu, Jun 4, 2020 at 7:57 AM Matthew Knepley  wrote:
>>
>>> On Thu, Jun 4, 2020 at 8:53 AM Fazlul Huq  wrote:
>>>
 Hello,

 I have a very preliminary question!
 I think I have installed PETSc correctly cause I got following on the
 terminal:

 Command:
 make PETSC_DIR=/home/huq2090/petsc-3.10.2 PETSC_ARCH=linux-gnu check
 Response:
 Running check examples to verify correct installation
 Using PETSC_DIR=/home/huq2090/petsc-3.10.2 and PETSC_ARCH=linux-gnu
 C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI
 process
 C/C++ example src/snes/tutorials/ex19 run successfully with 2 MPI
 processes
 Fortran example src/snes/tutorials/ex5f run successfully with 1 MPI
 process
 Completed test examples

 Now, I am trying to run an example from the list of provided examples.
 Can you please help me out, how to run an example problem from the list
 of examples?
 I mean, how to make executable and run the executable?

>>>
>>> cd $PETSC_DIR
>>> cd src/snes/tutorials
>>> make ex5
>>> ./ex5 -snes_monitor
>>>
>>>   Thanks,
>>>
>>>  Matt
>>>
>>>
 Thank you.
 Sincerely,
 Huq
 --

 Fazlul Huq
 Graduate Research Assistant
 Department of Nuclear, Plasma & Radiological Engineering (NPRE)
 University of Illinois at Urbana-Champaign (UIUC)
 E-mail: huq2...@gmail.com

>>>
>>>
>>> --
>>> What most experimenters take for granted before they begin their
>>> experiments is infinitely more interesting than any results to which their
>>> experiments lead.
>>> -- Norbert Wiener
>>>
>>> https://www.cse.buffalo.edu/~knepley/
>>> 
>>>
>>
>>
>> --
>>
>> Fazlul Huq
>> Graduate Research Assistant
>> Department of Nuclear, Plasma & Radiological Engineering (NPRE)
>> University of Illinois at Urbana-Champaign (UIUC)
>> E-mail: huq2...@gmail.com
>>
>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
> 
>


Re: [petsc-users] Agglomeration for Multigrid on Unstructured Meshes

2020-06-01 Thread Dave May
On Tue 2. Jun 2020 at 03:30, Matthew Knepley  wrote:

> On Mon, Jun 1, 2020 at 7:03 PM Danyang Su  wrote:
>
>> Thanks Jed for the quick response. Yes I am asking about the
>> repartitioning of coarse grids in geometric multigrid for unstructured
>> mesh. I am happy with AMG. Thanks for letting me know.
>>
>
> All the pieces are there, we just have not had users asking for this, and
> it will take some work to put together.
>

Matt - I created a branch for you and Lawrence last year which added full
support for PLEX within Telescope. This implementation was not a fully
automated agglomeration strategy - it utilized the partition associated with
the DM returned from DMGetCoarseDM. Hence the job of building the
distributed coarse hierarchy was left to the user.

I’m pretty sure that code got merged into master as the branch also
contained several bug fixes for Telescope. Or am I mistaken?

Cheers
Dave




>   Thanks,
>
> Matt
>
>
>> Danyang
>>
>> On 2020-06-01, 1:47 PM, "Jed Brown"  wrote:
>>
>> I assume you're talking about repartitioning of coarse grids in
>> geometric multigrid -- that hasn't been implemented.
>>
>>
>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCTELESCOPE.html
>>
>> But you can use an algebraic multigrid that does similar communicator
>> reduction, and can be applied to the original global problem or just
>> on
>> the "coarse" problem of an initial geometric hierarchy.
>>
>> Danyang Su  writes:
>>
>> > Dear All,
>> >
>> >
>> >
>> > I recalled there was a presentation ‘Extreme-scale multigrid
>> components with PETSc’  taling about agglomeration in parallel multigrid,
>> with future plan to extend to support unstructured meshes. Is this under
>> development or to be added?
>> >
>> >
>> >
>> > Thanks and regards,
>> >
>> >
>> >
>> > Danyang
>>
>>
>>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
> 
>


Re: [petsc-users] a question about MatSetValue

2020-05-21 Thread Dave May
On Thu 21. May 2020 at 12:17, Yang Bo (Asst Prof) 
wrote:

> Hi Dave,
>
> Yes it is parallel so the preallocation calls are not lowered by the
> allocation.
>
> I am trying to use MatXAIJSetPreallocation, but not sure how, since the
> following link does not give an example:
>
>
> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatXAIJSetPreallocation.html
>
> If I have the following matrix:
>
> 0 1 2 0
> 1 0 0 0
> 2 0 1 3
> 0 0 3 2
>
> How should I put in the parameters of MatXAIJSetPreallocation?
>

Please read this page to understand the info required

https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMPIAIJSetPreallocation.html


Compute everything as described above and give the results to
MatXAIJSetPreallocation(). MatXAIJSetPreallocation() is just a helper
function to hide all the implementation specific setters.
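
To make that concrete for the small 4x4 example you wrote out (a sketch only,
assuming 2 MPI ranks owning 2 rows each; dnnz/onnz count the nonzeros in the
diagonal and off-diagonal blocks of the rows you own, as defined in the
MatMPIAIJSetPreallocation page):

  PetscInt dnnz[2],onnz[2];

  if (rank == 0) {            /* owns rows 0,1 and columns 0,1 */
    dnnz[0] = 1; onnz[0] = 1; /* row 0: col 1 is local, col 2 is off-process */
    dnnz[1] = 1; onnz[1] = 0; /* row 1: col 0 is local */
  } else {                    /* owns rows 2,3 and columns 2,3 */
    dnnz[0] = 2; onnz[0] = 1; /* row 2: cols 2,3 local, col 0 off-process */
    dnnz[1] = 2; onnz[1] = 0; /* row 3: cols 2,3 local */
  }
  ierr = MatXAIJSetPreallocation(A,1,dnnz,onnz,NULL,NULL);CHKERRQ(ierr);

The block size is 1 here, and the last two arguments (the upper-triangular
counts used by SBAIJ) can be NULL.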

Thanks
Dave






> Thanks!
>
> Cheers,
>
> Yang Bo
>
>
> On 21 May 2020, at 5:42 PM, Dave May  wrote:
>
>
>
> On Thu 21. May 2020 at 10:49, Yang Bo (Asst Prof) 
> wrote:
>
>> Hi Dave,
>>
>> Thank you very much for your reply. That is indeed the problem. I have
>> been working with matrices in Slepc but I don’t really understand it. I
>> tried to preallocate but it still does not work.
>>
>
> Meaning the number of reported mallocs is still non-zero?
> Is the number reported with you preallocation calls lower than what you
> originally saw?
>
> If you look at my code below:
>>
>> ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr);
>> ierr = MatSetSizes(A,PETSC_DECIDE,PETSC_DECIDE,h_dim,h_dim);
>>  // h_dim is the dimension of the square matrix A
>> ierr = MatSetFromOptions(A);CHKERRQ(ierr);
>> ierr = MatSetUp(A);CHKERRQ(ierr);
>> ierr = MatGetOwnershipRange(A,&Istart,&Iend);CHKERRQ(ierr);
>>
>> MatSeqAIJSetPreallocation(A,0,nnz);
>>  // I try to preallocate here, where nnz is the array containing the
>> number of non-zero entries each row
>>
>> for (int i=0;i<row.size();i++) {
>> MatSetValue(A,row[i],column[i],h[i],INSERT_VALUES);
>> }
>>
>> I am not sure what other information I need to give for the
>> pre-allocation…
>>
>
> This looks fine. However MatSeqAIJSetPreallocation() has no effect if the
> Mat type is not SEQAIJ.
>
> Are you running in parallel? If yes then the Mat type will be MATMPIAIJ
> and you either have to call the MPI specific preallocator or use the
> generic one I pointed you too.
>
> Thanks
> Dave
>
>
>
>> Cheers,
>>
>> Yang Bo
>>
>>
>>
>> On 21 May 2020, at 4:08 PM, Dave May  wrote:
>>
>> *-info | grep malloc*
>>
>>
>> --
>>
>> CONFIDENTIALITY: This email is intended solely for the person(s) named
>> and may be confidential and/or privileged. If you are not the intended
>> recipient, please delete it, notify us and do not copy, use, or disclose
>> its contents.
>> Towards a sustainable earth: Print only when necessary. Thank you.
>>
>
>


Re: [petsc-users] a question about MatSetValue

2020-05-21 Thread Dave May
On Thu 21. May 2020 at 10:49, Yang Bo (Asst Prof) 
wrote:

> Hi Dave,
>
> Thank you very much for your reply. That is indeed the problem. I have
> been working with matrices in Slepc but I don’t really understand it. I
> tried to preallocate but it still does not work.
>

Meaning the number of reported mallocs is still non-zero?
Is the number reported with you preallocation calls lower than what you
originally saw?

If you look at my code below:
>
> ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr);
> ierr = MatSetSizes(A,PETSC_DECIDE,PETSC_DECIDE,h_dim,h_dim);
>  // h_dim is the dimension of the square matrix A
> ierr = MatSetFromOptions(A);CHKERRQ(ierr);
> ierr = MatSetUp(A);CHKERRQ(ierr);
> ierr = MatGetOwnershipRange(A,&Istart,&Iend);CHKERRQ(ierr);
>
> MatSeqAIJSetPreallocation(A,0,nnz);   //
> I try to preallocate here, where nnz is the array containing the number of
> non-zero entries each row
>
> for (int i=0;i<row.size();i++) {
> MatSetValue(A,row[i],column[i],h[i],INSERT_VALUES);
> }
>
> I am not sure what other information I need to give for the pre-allocation…
>

This looks fine. However MatSeqAIJSetPreallocation() has no effect if the
Mat type is not SEQAIJ.

Are you running in parallel? If yes then the Mat type will be MATMPIAIJ and
you either have to call the MPI specific preallocator or use the generic
one I pointed you too.

Thanks
Dave



> Cheers,
>
> Yang Bo
>
>
>
> On 21 May 2020, at 4:08 PM, Dave May  wrote:
>
> *-info | grep malloc*
>
>
> --
>
> CONFIDENTIALITY: This email is intended solely for the person(s) named and
> may be confidential and/or privileged. If you are not the intended
> recipient, please delete it, notify us and do not copy, use, or disclose
> its contents.
> Towards a sustainable earth: Print only when necessary. Thank you.
>


Re: [petsc-users] a question about MatSetValue

2020-05-21 Thread Dave May
On Thu, 21 May 2020 at 08:55, Yang Bo (Asst Prof) 
wrote:

> Hi Everyone,
>
> I have a question about adding values to the matrix. The code I have is
>
>
> for (int i=0;i<row.size();i++) {
> MatSetValue(A,row[i],column[i],h[i],INSERT_VALUES);
> }
>
> where row.size() is a large number. It seems the running time of this
> procedure does not scale linearly with row.size(). As row.size() gets
> bigger, the time it takes increases exponentially.


It sounds like your matrix is not properly preallocated. Could this be the
case?

You can confirm / deny this by running with the command line options (shown
in bold)

./your-exec * -info | grep malloc*

If all is good you will see something like this

[0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
*total number of mallocs used during MatSetValues calls=0*

If the reported number of mallocs in your code is not 0, please read these
pages:

https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatSeqAIJSetPreallocation.html

https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMPIAIJSetPreallocation.html


You may like to use the generic preallocator (depending on the type of you
Mat).

https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatXAIJSetPreallocation.html


Thanks
Dave




> Am I doing something wrong and can I do better than that?
>
> Thanks and stay healthy!
>
> Cheers,
>
> Yang Bo
> 
>
> CONFIDENTIALITY: This email is intended solely for the person(s) named and
> may be confidential and/or privileged. If you are not the intended
> recipient, please delete it, notify us and do not copy, use, or disclose
> its contents.
> Towards a sustainable earth: Print only when necessary. Thank you.
>


Re: [petsc-users] PetscObjectGetComm

2020-04-22 Thread Dave May
On Wed 22. Apr 2020 at 07:11, Marius Buerkle  wrote:

> Hi,
>
> What is PetscObjectGetComm expected to return?


As Patrick said, it returns the communicator associated with the petsc
object.

I thought it would give the MPI communicator the object lives on. So if I
> create A matrix on PETSC_COMM_WORLD a call of PetscObjectGetComm for A it
> would return PETSC_COMM_WORLD? But it seems to return something else, and
> while most of the nodes return a similar communicator some are giving a
> different one.


How are you actually comparing the communicators (send a code snippet)? Which
MPI implementation are you using? And when comparing comms, is the comparison
code written in C or Fortran?


That said, is there a way to get the MPI communicator a matrix lives on?


You are using the correct function. There is a macro as well but it’s best
to use the function.
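
Also note that communicators cannot reliably be compared with == (congruent
comms may have different handles); use MPI_Comm_compare(). A sketch in C
(illustrative):

  MPI_Comm    comm;
  PetscMPIInt result;

  ierr = PetscObjectGetComm((PetscObject)A,&comm);CHKERRQ(ierr);
  MPI_Comm_compare(comm,PETSC_COMM_WORLD,&result);
  /* result is one of MPI_IDENT, MPI_CONGRUENT, MPI_SIMILAR, MPI_UNEQUAL */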

Thanks,
Dave



>
> Best,
> Marius
>


Re: [petsc-users] petsc error disappears when I print something in the function

2020-04-21 Thread Dave May
On Tue 21. Apr 2020 at 16:47, Matthew Knepley  wrote:

> You are overwriting memory somewhere. The prints just move it around. I
> suggest running with valgrind.
>

Matt is right. However, judging by the code snippet I bet all the arrays in
question are statically allocated, thus valgrind may be of somewhat limited
use.

If you send the entire function, or all the related pieces of code, someone
in the list might spot the error.

Thanks
Dave




>   Thanks,
>
> Matt
>
> On Tue, Apr 21, 2020 at 10:44 AM Kaushik Vijaykumar 
> wrote:
>
>> Hello group,
>>
>> I have been trying to navigate a weird error that I have found in my FEA
>> code that I am developing using PetSc. The error occurs when, i execute a
>> call to stiffness generation of a Tetrahederal element. The code returns an
>> memory error if I don't print the following statements in the function,
>> (see the ierr print statements below):
>>
>> for (i1=1; i1<4; i1++)  // Loop 6b
>>   {
>> for (j1=1; j1<4; j1++) // Loop 7b
>>   {
>>   for (k1=1; k1<4; k1++) // Loop 8
>> {
>> for (l1=1; l1<4; l1++) // Loop 9
>> {
>>   s[ii1+i1-1][jj1+j1-1] =
>> s[ii1+i1-1][jj1+j1-1]+C4[i1][k1][j1][l1]*w[k1][l1]*weight;
>>   ierr = PetscFPrintf(PETSC_COMM_WORLD,outfile,"C4 %f
>> \n",C4[i1][k1][j1][l1]);CHKERRQ(ierr);
>>   ierr = PetscFPrintf(PETSC_COMM_WORLD,outfile,"w %f
>> \n",w[k1][l1]);CHKERRQ(ierr);
>>   ierr = PetscFPrintf(PETSC_COMM_WORLD,outfile,"weight %f
>> \n",weight);CHKERRQ(ierr);
>> }   // Loop 9
>> }  // Loop 8
>>   } // Loop 7b
>> } // Loop 6b
>>
>>
>> Any help on this is really appreciated.
>>
>> Thanks
>> Kaushik
>>
>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
> 
>


Re: [petsc-users] error: too few arguments to function call (PetscOptionsHasName)

2020-04-17 Thread Dave May
Please always use "reply-all" so that your messages go to the list.
This is standard mailing list etiquette.  It is important to preserve
threading for people who find this discussion later and so that we do
not waste our time re-answering the same questions that have already
been answered in private side-conversations.  You'll likely get an
answer faster that way too.


On Fri, 17 Apr 2020 at 09:20, huabel  wrote:

> I have checked that manual, I mean why a new version release include old
> versions of petsc api, why not update them all to new version?
>


I understand now. I thought that the code you couldn't compile was
something you wrote, however I now see it is actually living in the PETSc
src tree.
I also note that PLogEvent.c also fails to compile for the same reason.

The fact that the API change was not propagated throughout these two files in
src/benchmarks (PetscMalloc.c and PLogEvent.c) is an oversight.

I am surprised that this did not get caught, as:
(i) API changes are usually applied via smart scripting, and
(ii) I imagined that the regression testing would have picked up this issue.
These files were also broken in v3.12.

Thanks for the bug report.






>
>
> On Apr 17, 2020, at 16:10, Dave May  wrote:
>
> Old versions of petsc had 3 args for this function, latest version expects
> 4 (as the compiler error indicates).
>
> When in doubt as to what these args are, please refer to the extensive man
> pages. You can find them all here
>
>
> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/singleindex.html
>
>
> The page you want for this func is here
>
>
> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Sys/PetscOptionsHasName.html
>
>
> Tip: It is wise to avoid performing a google search of the function name.
> It can bring you to the man page for an old version of petsc sometimes and
> this can lead to confusion. Best go directly to the URL above (or access
> the pages through the petsc web page) to ensure you are looking at the
> appropriate man pages
>
>
> Thanks
> Dave
>
>
>
> On Fri 17. Apr 2020 at 09:43, huabel via petsc-users <
> petsc-users@mcs.anl.gov> wrote:
>
>> Dear PETSc users,
>>
>> I’m learn some base for PETSc , compile file src/benchmarks/PetscMalloc.c
>> , get next error. (Use PETSc 3.13.0)
>>
>> >pwd
>> src/benchmarks
>>
>> >mpicc PetscMalloc.c
>> *PetscMalloc.c:53:49: **error: **too few arguments to function call,
>> expected 4, have 3*
>>   ierr = PetscOptionsHasName(NULL,"-malloc",&flg);CHKERRQ(ierr);
>> * ~~~^*
>> */usr/local/include/petscoptions.h:18:1: note: *'PetscOptionsHasName'
>> declared here
>> PETSC_EXTERN PetscErrorCode PetscOptionsHasName(PetscOptions,const
>> char[],const char[],PetscBool*);
>> *^*
>> */usr/local/include/petscsys.h:106:24: note: *expanded from macro
>> 'PETSC_EXTERN'
>> #  define PETSC_EXTERN extern PETSC_VISIBILITY_PUBLIC
>> *   ^*
>> 1 error generated.
>>
>>  Thanks.
>>Abel
>>
>
>


Re: [petsc-users] error: too few arguments to function call (PetscOptionsHasName)

2020-04-17 Thread Dave May
Old versions of petsc had 3 args for this function, latest version expects
4 (as the compiler error indicates).

When in doubt as to what these args are, please refer to the extensive man
pages. You can find them all here

https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/singleindex.html


The page you want for this func is here

https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Sys/PetscOptionsHasName.html


Tip: It is wise to avoid performing a google search of the function name.
It can bring you to the man page for an old version of petsc sometimes and
this can lead to confusion. Best go directly to the URL above (or access
the pages through the petsc web page) to ensure you are looking at the
appropriate man pages
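
For the failing line in PetscMalloc.c the updated call would presumably look
like this (the flag variable name is whatever is declared in that file):

  ierr = PetscOptionsHasName(NULL,NULL,"-malloc",&flg);CHKERRQ(ierr);

i.e. the new first argument is the PetscOptions object (NULL for the default
global options database), followed by the optional prefix, the option name and
the output flag.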


Thanks
Dave



On Fri 17. Apr 2020 at 09:43, huabel via petsc-users <
petsc-users@mcs.anl.gov> wrote:

> Dear PETSc users,
>
> I’m learn some base for PETSc , compile file src/benchmarks/PetscMalloc.c
> , get next error. (Use PETSc 3.13.0)
>
> >pwd
> src/benchmarks
>
> >mpicc PetscMalloc.c
> *PetscMalloc.c:53:49: **error: **too few arguments to function call,
> expected 4, have 3*
>   ierr = PetscOptionsHasName(NULL,"-malloc",&flg);CHKERRQ(ierr);
> * ~~~^*
> */usr/local/include/petscoptions.h:18:1: note: *'PetscOptionsHasName'
> declared here
> PETSC_EXTERN PetscErrorCode PetscOptionsHasName(PetscOptions,const
> char[],const char[],PetscBool*);
> *^*
> */usr/local/include/petscsys.h:106:24: note: *expanded from macro
> 'PETSC_EXTERN'
> #  define PETSC_EXTERN extern PETSC_VISIBILITY_PUBLIC
> *   ^*
> 1 error generated.
>
>  Thanks.
>Abel
>


Re: [petsc-users] Inquiry about the setup for multigrid as a preconditioner in Petsc.

2020-03-12 Thread Dave May
You want to look at the bottom of each of these web pages

https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DM/DMCreateInjection.html

https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DM/DMCreateInterpolation.html

https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DMDA/DMCreateInterpolationScale.html

At the bottom you will see URLs to the current set of DM implementations
which implement Injection, Interpolation.
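
If you want to jump straight to the DMDA versions in the source tree, a grep
like this should locate them (illustrative):

  grep -rlE "DMCreateInterpolation_DA|DMCreateInjection_DA" $PETSC_DIR/src/dm/impls/da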

Thanks
Dave

On Thu, 12 Mar 2020 at 21:14, Xiaodong Liu  wrote:

> Hi, all,
>
> I am practising multigrid as a preconditioner in Petsc. From the previous
> resource, there are 2 main ways to set up the multigrid preconditioner,
>
> 1). For general cae,
> KSPCreate(MPI Comm comm,KSP *ksp);
> KSPGetPC(KSP ksp,PC *pc);
> PCSetType(PC pc,PCMG);
> PCMGSetLevels(pc,int levels,MPI Comm *comms);
> PCMGSetType(PC pc,PCMGType mode);
> PCMGSetCycleType(PC pc,PCMGCycleType ctype);
> ...
> PCMGSetInterpolation(PC pc,int level,Mat P);
> PCMGSetRestriction(PC pc,int level,Mat R);
>
> The above means that I need to specify a lot details, e.g., cycletype.
> interpolation and restriction matrix, coarse solver, etc.
>
> 2) For the case of structured mesh, (DMDA is enough)
> Taking the following case as an example,
>
> https://www.mcs.anl.gov/petsc/petsc-3.6/src/ksp/ksp/examples/tutorials/ex25.c.html
>
>  50:   KSPCreate(PETSC_COMM_WORLD,);
>  51:   DMDACreate1d(PETSC_COMM_WORLD,DM_BOUNDARY_NONE,-3,1,1,0,);
>  52:   KSPSetDM(ksp,da);
>  53:   KSPSetComputeRHS(ksp,ComputeRHS,);
>  54:   KSPSetComputeOperators(ksp,ComputeMatrix,);
>  55:   KSPSetFromOptions(ksp);
>  56:   KSPSolve(ksp,NULL,NULL);
>
> DMDA handles all the multigrid setting automatically, e.g., interpolation
> and restriction matrix.
> If my understanding is right, *my question is where to find these source
> file to define these default interpolation and restriction matrix. *
>
> Thanks,S
> Xiaodong Liu, PhD
> X: Computational Physics Division
> Los Alamos National Laboratory
> P.O. Box 1663,
> Los Alamos, NM 87544
> 505-709-0534
>


Re: [petsc-users] Choosing VecScatter Method in Matrix-Vector Product

2020-01-22 Thread Dave May
On Wed 22. Jan 2020 at 16:12, Felix Huber 
wrote:

> Hello,
>
> I currently investigate why our code does not show the expected weak
> scaling behaviour in a CG solver.


Can you please send representative log files which characterize the lack of
scaling (include the full log_view)?

Are you using a KSP/PC configuration which should weak scale?
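
For example, something like (illustrative command line)

  mpiexec -n <nranks> ./your_app <your options> -log_view > run_<nranks>ranks.log

for each process count you ran, so we can compare the event timings.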

Thanks
Dave


Therefore I wanted to try out
> different communication methods for the VecScatter in the matrix-vector
> product. However, it seems like PETSc (version 3.7.6) always chooses
> either MPI_Alltoallv or MPI_Alltoallw when I pass different options via
> the PETSC_OPTIONS environment variable. Does anybody know, why this
> doesn't work as I expected?
>
> The matrix is a MPIAIJ matrix and created by a finite element
> discretization of a 3D Laplacian. Therefore it only communicates with
> 'neighboring' MPI ranks. Not sure if it helps, but the code is run on a
> Cray XC40.
>
> I tried the `ssend`, `rsend`, `sendfirst`, `reproduce` and no options
> from
>
> https://www.mcs.anl.gov/petsc/petsc-3.7/docs/manualpages/Vec/VecScatterCreate.html
> which all result in a MPI_Alltoallv. When combined with `nopack` the
> communication uses MPI_Alltoallw.
>
> Best regards,
> Felix
>
>


Re: [petsc-users] DMDA Error

2020-01-21 Thread Dave May
Hi Anthony,

On Tue, 21 Jan 2020 at 08:25, Anthony Jourdon 
wrote:

> Hello,
>
> I made a test to try to reproduce the error.
> To do so I modified the file $PETSC_DIR/src/dm/examples/tests/ex35.c
> I attach the file in case of need.
>
> The same error is reproduced for 1024 mpi ranks. I tested two problem
> sizes (2*512+1x2*64+1x2*256+1 and 2*1024+1x2*128+1x2*512+1) and the error
> occured for both cases, the first case is also the one I used to run before
> the OS and mpi updates.
> I also run the code with -malloc_debug and nothing more appeared.
>
> I attached the configure command I used to build a debug version of petsc.
>

The error indicates the problem occurs on the bold line below (e.g. within
MPI_Isend())

  /* Post the Isends with the message length-info */
  for (i=0,j=0; i<size; i++) {
    if (ilengths[i]) {
      ierr = MPI_Isend((void*)(ilengths+i),1,MPI_INT,i,tag,comm,s_waits+j);CHKERRQ(ierr);
      j++;
    }
  }
> Thank you for your time,
> Sincerly.
> Anthony Jourdon
>
>
> --
> *De :* Zhang, Junchao 
> *Envoyé :* jeudi 16 janvier 2020 16:49
> *À :* Anthony Jourdon 
> *Cc :* petsc-users@mcs.anl.gov 
> *Objet :* Re: [petsc-users] DMDA Error
>
> It seems the problem is triggered by DMSetUp. You can write a small test
> creating the DMDA with the same size as your code, to see if you can
> reproduce the problem. If yes, it would be much easier for us to debug it.
> --Junchao Zhang
>
>
> On Thu, Jan 16, 2020 at 7:38 AM Anthony Jourdon <
> jourdon_anth...@hotmail.fr> wrote:
>
> Dear Petsc developer,
>
>
> I need assistance with an error.
>
>
> I run a code that uses the DMDA related functions. I'm using petsc-3.8.4.
>
>
> This code used to run very well on a super computer with the OS SLES11.
>
> Petsc was built using an intel mpi 5.1.3.223 module and intel mkl version
> 2016.0.2.181
>
> The code was running with no problem on 1024 and more mpi ranks.
>
>
> Recently, the OS of the computer has been updated to RHEL7
>
> I rebuilt Petsc using new available versions of intel mpi (2019U5) and mkl
> (2019.0.5.281) which are the same versions for compilers and mkl.
>
> Since then I tested to run the exact same code on 8, 16, 24, 48, 512 and
> 1024 mpi ranks.
>
> Until 1024 mpi ranks no problem, but for 1024 an error related to DMDA
> appeared. I snip the first lines of the error stack here and the full error
> stack is attached.
>
>
> [534]PETSC ERROR: #1 PetscGatherMessageLengths() line 120 in
> /scratch2/dlp/appli_local/SCR/OROGEN/petsc3.8.4_MPI/petsc-3.8.4/src/sys/utils/mpimesg.c
>
> [534]PETSC ERROR: #2 VecScatterCreate_PtoS() line 2288 in
> /scratch2/dlp/appli_local/SCR/OROGEN/petsc3.8.4_MPI/petsc-3.8.4/src/vec/vec/utils/vpscat.c
>
> [534]PETSC ERROR: #3 VecScatterCreate() line 1462 in
> /scratch2/dlp/appli_local/SCR/OROGEN/petsc3.8.4_MPI/petsc-3.8.4/src/vec/vec/utils/vscat.c
>
> [534]PETSC ERROR: #4 DMSetUp_DA_3D() line 1042 in
> /scratch2/dlp/appli_local/SCR/OROGEN/petsc3.8.4_MPI/petsc-3.8.4/src/dm/impls/da/da3.c
>
> [534]PETSC ERROR: #5 DMSetUp_DA() line 25 in
> /scratch2/dlp/appli_local/SCR/OROGEN/petsc3.8.4_MPI/petsc-3.8.4/src/dm/impls/da/dareg.c
>
> [534]PETSC ERROR: #6 DMSetUp() line 720 in
> /scratch2/dlp/appli_local/SCR/OROGEN/petsc3.8.4_MPI/petsc-3.8.4/src/dm/interface/dm.c
>
>
>
> Thank you for your time,
>
> Sincerly,
>
>
> Anthony Jourdon
>
>


Re: [petsc-users] error handling

2020-01-20 Thread Dave May
On Mon 20. Jan 2020 at 19:47, Sam Guo  wrote:

> Can I assume if there is MatCreat or VecCreate, I should clean up the
> memory myself?
>

Yes. You will need to call the matching Destroy function.
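
For the snippet you quoted that would mean something like (sketch):

  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = EPSDestroy(&eps);CHKERRQ(ierr);
  ierr = VecDestroy(&xr);CHKERRQ(ierr);
  ierr = VecDestroy(&xi);CHKERRQ(ierr);
  ierr = SlepcFinalize();

before returning, once those objects have actually been created.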



> On Mon, Jan 20, 2020 at 10:45 AM Sam Guo  wrote:
>
>> I only include the first few lines of SLEPc example. What about following
>>   ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr);
>>   ierr = MatSetSizes(A,PETSC_DECIDE,PETSC_DECIDE,n,n);CHKERRQ(ierr);
>> Is there any memory  lost?
>>
>> On Mon, Jan 20, 2020 at 10:41 AM Dave May 
>> wrote:
>>
>>>
>>>
>>> On Mon 20. Jan 2020 at 19:39, Sam Guo  wrote:
>>>
>>>> I don't have a specific case yet. Currently every call of PETSc is
>>>> checked. If ierr is not zero, print the error and return. For example,
>>>>Mat A; /* problem matrix */
>>>>EPS eps; /* eigenproblem solver context */
>>>>EPSType type;
>>>>   PetscReal error,tol,re,im;
>>>>   PetscScalar kr,ki; Vec xr,xi;
>>>>   PetscInt n=30,i,Istart,Iend,nev,maxit,its,nconv;
>>>>   PetscErrorCode ierr;
>>>>   ierr = SlepcInitialize(&argc,&argv,(char*)0,help);CHKERRQ(ierr);
>>>>   ierr = PetscOptionsGetInt(NULL,NULL,"-n",&n,NULL);CHKERRQ(ierr);
>>>>ierr = PetscPrintf(PETSC_COMM_WORLD,"\n1-D Laplacian Eigenproblem,
>>>> n=%D\n\n",n);CHKERRQ(ierr);
>>>>
>>>> I am wondering if the memory is lost by calling CHKERRQ.
>>>>
>>>
>>> No.
>>>
>>>
>>>
>>>> On Mon, Jan 20, 2020 at 10:14 AM Dave May 
>>>> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Mon 20. Jan 2020 at 19:11, Sam Guo  wrote:
>>>>>
>>>>>> Dear PETSc dev team,
>>>>>>If PETSc function returns an error, what's the correct way to
>>>>>> clean PETSc?
>>>>>>
>>>>>
>>>>> The answer depends on the error message reported. Send the complete
>>>>> error message and a better answer can be provided.
>>>>>
>>>>> Particularly how to clean up the memory?
>>>>>>
>>>>>
>>>>> Totally depends on the objects which aren’t being freed. You need to
>>>>> provide more information
>>>>>
>>>>> Thanks
>>>>> Dave
>>>>>
>>>>>
>>>>>> Thanks,
>>>>>> Sam
>>>>>>
>>>>>


Re: [petsc-users] error handling

2020-01-20 Thread Dave May
On Mon 20. Jan 2020 at 19:39, Sam Guo  wrote:

> I don't have a specific case yet. Currently every call of PETSc is
> checked. If ierr is not zero, print the error and return. For example,
>Mat A; /* problem matrix */
>EPS eps; /* eigenproblem solver context */
>EPSType type;
>   PetscReal error,tol,re,im;
>   PetscScalar kr,ki; Vec xr,xi;
>   PetscInt n=30,i,Istart,Iend,nev,maxit,its,nconv;
>   PetscErrorCode ierr;
>   ierr = SlepcInitialize(&argc,&argv,(char*)0,help);CHKERRQ(ierr);
>   ierr = PetscOptionsGetInt(NULL,NULL,"-n",&n,NULL);CHKERRQ(ierr);
>ierr = PetscPrintf(PETSC_COMM_WORLD,"\n1-D Laplacian Eigenproblem,
> n=%D\n\n",n);CHKERRQ(ierr);
>
> I am wondering if the memory is lost by calling CHKERRQ.
>

No.



> On Mon, Jan 20, 2020 at 10:14 AM Dave May  wrote:
>
>>
>>
>> On Mon 20. Jan 2020 at 19:11, Sam Guo  wrote:
>>
>>> Dear PETSc dev team,
>>>If PETSc function returns an error, what's the correct way to clean
>>> PETSc?
>>>
>>
>> The answer depends on the error message reported. Send the complete error
>> message and a better answer can be provided.
>>
>> Particularly how to clean up the memory?
>>>
>>
>> Totally depends on the objects which aren’t being freed. You need to
>> provide more information
>>
>> Thanks
>> Dave
>>
>>
>>> Thanks,
>>> Sam
>>>
>>


Re: [petsc-users] error handling

2020-01-20 Thread Dave May
On Mon 20. Jan 2020 at 19:11, Sam Guo  wrote:

> Dear PETSc dev team,
>If PETSc function returns an error, what's the correct way to clean
> PETSc?
>

The answer depends on the error message reported. Send the complete error
message and a better answer can be provided.

Particularly how to clean up the memory?
>

Totally depends on the objects which aren’t being freed. You need to
provide more information

Thanks
Dave


> Thanks,
> Sam
>


Re: [petsc-users] killed 9 signal after upgrade from petsc 3.9.4 to 3.12.2

2020-01-10 Thread Dave May
On Sat 11. Jan 2020 at 00:04, Santiago Andres Triana 
wrote:

> Hi Barry, petsc-users:
>
> Just updated to petsc-3.12.3 and the performance is about the same as
> 3.12.2, i.e. about 2x the memory use of petsc-3.9.4
>
>
> petsc-3.12.3 (uses superlu_dist-6.2.0)
>
> Summary of Memory Usage in PETSc
> Maximum (over computational time) process memory:total 2.9368e+10
> max 1.2922e+09 min 1.1784e+09
> Current process memory:  total 2.8192e+10
> max 1.2263e+09 min 1.1456e+09
> Maximum (over computational time) space PetscMalloc()ed: total 2.7619e+09
> max 1.4339e+08 min 8.6494e+07
> Current space PetscMalloc()ed:   total 3.6127e+06
> max 1.5053e+05 min 1.5053e+05
>
>
> petsc-3.9.4
>
> Summary of Memory Usage in PETSc
> Maximum (over computational time) process memory:total 1.5695e+10
> max 7.1985e+08 min 6.0131e+08
> Current process memory:  total 1.3186e+10
> max 6.9240e+08 min 4.2821e+08
> Maximum (over computational time) space PetscMalloc()ed: total 3.1290e+09
> max 1.5869e+08 min 1.0179e+08
> Current space PetscMalloc()ed:   total 1.8808e+06
> max 7.8368e+04 min 7.8368e+04
>
>
> However, it seems that the culprit is superlu_dist: I recompiled current
> petsc/slepc with superlu_dist-5.4.0 (used option
> --download-superlu_dist=/home/spin/superlu_dist-5.4.0.tar.gz) and the
> result is this:
>
> petsc-3.12.3 with superlu_dist-5.4.0:
>
> Summary of Memory Usage in PETSc
> Maximum (over computational time) process memory:total 1.5636e+10
> max 7.1217e+08 min 5.9963e+08
> Current process memory:  total 1.3401e+10
> max 6.5498e+08 min 4.2626e+08
> Maximum (over computational time) space PetscMalloc()ed: total 2.7619e+09
> max 1.4339e+08 min 8.6494e+07
> Current space PetscMalloc()ed:   total 3.6127e+06
> max 1.5053e+05 min 1.5053e+05
>
> I could not compile petsc-3.12.3 with the exact superlu_dist version that
> petsc-3.9.4 uses (5.3.0), but will try newer versions to see how they
> perform ... I guess I should address this issue to the superlu mantainers?
>

Yes.



> Thanks!
> Santiago
>
> On Fri, Jan 10, 2020 at 9:19 PM Smith, Barry F. 
> wrote:
>
>>
>>   Can you please try v3.12.3  There was some funky business mistakenly
>> added related to partitioning that has been fixed in 3.12.3
>>
>>Barry
>>
>>
>> > On Jan 10, 2020, at 1:57 PM, Santiago Andres Triana 
>> wrote:
>> >
>> > Dear all,
>> >
>> > I ran the program with valgrind --tool=massif, the results are cryptic
>> to me ... not sure who's the memory hog! the logs are attached.
>> >
>> > The command I used is:
>> > mpiexec -n 24 valgrind --tool=massif --num-callers=20
>> --log-file=valgrind.log.%p ./ex7 -f1 A.petsc -f2 B.petsc -eps_nev 1 $opts
>> -eps_target -4.008e-3+1.57142i -eps_target_magnitude -eps_tol 1e-14
>> >
>> > Is there any possibility to install a version of superlu_dist (or
>> mumps) different from what the petsc version automatically downloads?
>> >
>> > Thanks!
>> > Santiago
>> >
>> >
>> > On Thu, Jan 9, 2020 at 10:04 PM Dave May 
>> wrote:
>> > This kind of issue is difficult to untangle because you have
>> potentially three pieces of software which might have changed between v3.9
>> and v3.12, namely
>> > PETSc, SLEPC and SuperLU_dist.
>> > You need to isolate which software component is responsible for the 2x
>> increase in memory.
>> >
>> > When I look at the memory usage in the log files, things look very very
>> similar for the raw PETSc objects.
>> >
>> > [v3.9]
>> > --- Event Stage 0: Main Stage
>> >
>> >   Viewer 4  3 2520 0.
>> >   Matrix15 15125236536 0.
>> >   Vector22 22 19713856 0.
>> >Index Set10 10   995280 0.
>> >  Vec Scatter 4  4 4928 0.
>> >   EPS Solver 1  1 2276 0.
>> >   Spectral Transform 1  1  848 0.
>> >Basis Vectors 1  1 2168 0.
>> >  PetscRandom 1  1  662 0.
>> >   Region 1  1  672 0.
>> >Direct Solver 1  117440 0.
>> >Krylo

Re: [petsc-users] killed 9 signal after upgrade from petsc 3.9.4 to 3.12.2

2020-01-09 Thread Dave May
This kind of issue is difficult to untangle because you have potentially
three pieces of software which might have changed between v3.9 and v3.12,
namely
PETSc, SLEPC and SuperLU_dist.
You need to isolate which software component is responsible for the 2x
increase in memory.

When I look at the memory usage in the log files, things look very very
similar for the raw PETSc objects.

[v3.9]
--- Event Stage 0: Main Stage


              Viewer     4     3         2520     0.
              Matrix    15    15    125236536     0.
              Vector    22    22     19713856     0.
           Index Set    10    10       995280     0.
         Vec Scatter     4     4         4928     0.
          EPS Solver     1     1         2276     0.
  Spectral Transform     1     1          848     0.
       Basis Vectors     1     1         2168     0.
         PetscRandom     1     1          662     0.
              Region     1     1          672     0.
       Direct Solver     1     1        17440     0.
       Krylov Solver     1     1         1176     0.
      Preconditioner     1     1         1000     0.

versus

[v3.12]

--- Event Stage 0: Main Stage


              Viewer     4     3         2520     0.
              Matrix    15    15    125237144     0.
              Vector    22    22     19714528     0.
           Index Set    10    10       995096     0.
         Vec Scatter     4     4         3168     0.
   Star Forest Graph     4     4         3936     0.
          EPS Solver     1     1         2292     0.
  Spectral Transform     1     1          848     0.
       Basis Vectors     1     1         2184     0.
         PetscRandom     1     1          662     0.
              Region     1     1          672     0.
       Direct Solver     1     1        17456     0.
       Krylov Solver     1     1         1400     0.
      Preconditioner     1     1         1000     0.

Certainly there is no apparent factor 2x increase in memory usage in the
underlying petsc objects themselves.
Furthermore, the counts of creations of petsc objects in toobig.log and
justfine.log match, indicating that none of the implementations used in
either PETSc or SLEPc have fundamentally changed wrt the usage of the
native petsc objects.

It is also curious that VecNorm is called 3 times in "justfine.log" and 19
times in "toobig.log" - although I don't see how that could be related to
your problem...

The above at least gives me the impression that the issue of memory increase is
likely not coming from PETSc.
I just read Barry's useful email which is even more compelling and also
indicates SLEPc is not the likely culprit either as it uses PetscMalloc()
internally.

Some options to identify the problem:

1/ Eliminate SLEPc as a possible culprit by not calling EPSSolve() and
rather just call KSPSolve() with some RHS vector.
* If you still see a 2x increase, switch the preconditioner to using
-pc_type bjacobi -ksp_max_it 10 rather than superlu_dist.
If the memory usage is good, you can be pretty certain the issue arises
internally to superlu_dist (a minimal sketch of this test is given below, after option 2/).

2/ Leave your code as is and perform your profiling using mumps rather than
superlu_dist.
This is a less reliable test than 1/ since the mumps implementation used
with v3.9 and v3.12 may differ...
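
Regarding 1/ above, here is a minimal sketch of the SLEPc-free test. It assumes
the matrix A has already been loaded with MatLoad(), exactly as ex7 does; run it
with -memory_view to compare the two preconditioner choices. The names below are
only illustrative.

  KSP            ksp;
  Vec            x, b;
  PetscErrorCode ierr;

  ierr = MatCreateVecs(A, &x, &b);CHKERRQ(ierr);
  ierr = VecSet(b, 1.0);CHKERRQ(ierr);             /* any nonzero RHS will do for this test */
  ierr = KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp, A, A);CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);     /* -pc_type lu -pc_factor_mat_solver_type superlu_dist,
                                                      or -pc_type bjacobi -ksp_max_it 10 */
  ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  ierr = VecDestroy(&x);CHKERRQ(ierr);
  ierr = VecDestroy(&b);CHKERRQ(ierr);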

Thanks
Dave

On Thu, 9 Jan 2020 at 20:17, Santiago Andres Triana 
wrote:

> Dear all,
>
> I think parmetis is not involved since I still run out of memory if I use
> the following options:
> export opts='-st_type sinvert -st_ksp_type preonly -st_pc_type lu
> -st_pc_factor_mat_solver_type superlu_dist -eps_true_residual 1'
> and  issuing:
> mpiexec -n 24 ./ex7 -f1 A.petsc -f2 B.petsc -eps_nev 1 -eps_target
> -4.008e-3+1.57142i $opts -eps_target_magnitude -eps_tol 1e-14 -memory_view
>
> Bottom line is that the memory usage of petsc-3.9.4 / slepc-3.9.2 is much
> lower than current version. I can only solve relatively small problems
> using the 3.12 series :(
> I have an example with smaller matrices that will likely fail in a 32 Gb
> ram machine with petsc-3.12 but runs just fine with petsc-3.9. The
> -memory_view output is
>
> with petsc-3.9.4: (log 'justfine.log' attached)
>
> Summary of Memory Usage in PETSc
> Maximum (over computational time) process memory:total 1.6665e+10
> max 7.5674e+08 min 6.4215e+08
> Current process memory:  total 1.5841e+10
> max 7.2881e+08 min 6.0905e+08
> Maximum (over computational time) space PetscMalloc()ed: total 3.1290e+09
> max 1.5868e+08 min 1.0179e+08
> Current space PetscMalloc()ed:   total 1.8808e+06
> max 7.8368e+04 min 7.8368e+04
>
>
> with petsc-3.12.2: (log 'toobig.log' attached)
>
> Summary of Memory Usage in PETSc
> Maximum (over 

Re: [petsc-users] Changing nonzero structure and Jacobian coloring

2019-10-16 Thread Dave May via petsc-users
What Ellen wants to do seems exactly the same use case as required by
dynamic AMR.

Some thoughts:
* If the target problem is nonlinear, then you will need to evaluate the
Jacobian more than once (with the same nonzero pattern) per time step. You
would also have to solve a linear problem at each Newton iterate.
Collectively I think both tasks will consume much more time than that
required to create a new matrix and determine / set the nonzero pattern
(which is only required once per time step).

* If you are using an incompressible SPH method (e.g. you use a kernel with
a constant compact support) then you will have code which allows you to
efficiently determine which particles interact, e.g. via a background cell
structure, thus you have a means to infer the nonzero structure.
However computing the off-diagonal counts can be a pain...

* Going further, provided you have a unique id assigned to each particle,
you can use MatPreallocator (
https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatPreallocatorPreallocate.html)
to easily define the exact nonzero pattern.

Given all the above, I don’t see why you shouldn’t try your idea of
creating a new matrix at each step and assembling the Jacobian.
Why not just try using MatPreallocator and profile the time required to
define the nonzero structure?
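
To make that concrete, a minimal sketch of the MatPreallocator workflow might
look like the following. Here N, row and col are placeholders for your global
problem size and the couplings found by your neighbour search, and the loop over
interacting pairs is yours; PETSC_TRUE asks for the nonzeros of J to be zeroed
as well.

  Mat            preall, J;
  PetscInt       row, col;            /* filled from your neighbour search */
  PetscErrorCode ierr;

  ierr = MatCreate(PETSC_COMM_WORLD, &preall);CHKERRQ(ierr);
  ierr = MatSetType(preall, MATPREALLOCATOR);CHKERRQ(ierr);
  ierr = MatSetSizes(preall, PETSC_DECIDE, PETSC_DECIDE, N, N);CHKERRQ(ierr);
  ierr = MatSetUp(preall);CHKERRQ(ierr);
  /* loop over your interacting particle pairs and insert a (zero) entry at
     every (row, col) location the Jacobian will later contain */
  ierr = MatSetValue(preall, row, col, 0.0, INSERT_VALUES);CHKERRQ(ierr);
  ierr = MatAssemblyBegin(preall, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(preall, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

  ierr = MatCreate(PETSC_COMM_WORLD, &J);CHKERRQ(ierr);
  ierr = MatSetType(J, MATAIJ);CHKERRQ(ierr);
  ierr = MatSetSizes(J, PETSC_DECIDE, PETSC_DECIDE, N, N);CHKERRQ(ierr);
  ierr = MatPreallocatorPreallocate(preall, PETSC_TRUE, J);CHKERRQ(ierr);
  ierr = MatDestroy(&preall);CHKERRQ(ierr);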

I like Barry’s idea of defining the preconditioner for the Jacobian using
an operator defined via kernels with smaller compact support. I’d be
interested to know how effective that is as a preconditioner.

There is potentially a larger issue to consider (if your application runs
in parallel). Whilst the number of particles is constant in time, the
number of particles per MPI rank will likely change as particles advect
(I'm assuming you decomposed the problem using the background search cell
grid and do not load balance the particles, which is commonly done with
incompressible SPH implementations). As a result, the local size of the Vec
object which stores the solution will change between time steps. Vec cannot
be re-sized, hence you will not only need to change the nonzero structure
of the Jacobian but you will also need to destroy/create all Vec's objects
and all objects associated with the nonlinear solve. Given this, I'm not
even sure you can use TS for your use case (hopefully a TS expert will
comment on this).

My experience has been that creating new objects (Vec, Mat, KSP, SNES) in
PETSc is fast (compared to a linear solve). You might have to give up on
using TS, and instead roll your own time integrator. By doing this you will
have control of only a SNES object (plus a Mat for J and Vecs for the residual
and solution) which you can create and destroy within each time step. To
use FD coloring you would need to provide this function
SNESComputeJacobianDefaultColor to SNESSetJacobian(), along with a
preallocated matrix (which you define using MatPreallocator).
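
A sketch of that wiring, with J preallocated as above (r, x, FormFunction and
user are placeholders for your residual vector, solution vector, residual
routine and application context):

  SNES           snes;
  PetscErrorCode ierr;

  ierr = SNESCreate(PETSC_COMM_WORLD, &snes);CHKERRQ(ierr);
  ierr = SNESSetFunction(snes, r, FormFunction, &user);CHKERRQ(ierr);  /* your residual routine */
  ierr = SNESSetJacobian(snes, J, J, SNESComputeJacobianDefaultColor, NULL);CHKERRQ(ierr);
  ierr = SNESSetFromOptions(snes);CHKERRQ(ierr);
  ierr = SNESSolve(snes, NULL, x);CHKERRQ(ierr);
  ierr = SNESDestroy(&snes);CHKERRQ(ierr);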


Thanks
Dave



On Wed 16. Oct 2019 at 13:25, Matthew Knepley via petsc-users <
petsc-users@mcs.anl.gov> wrote:

> On Tue, Oct 15, 2019 at 4:56 PM Smith, Barry F. via petsc-users <
> petsc-users@mcs.anl.gov> wrote:
>
>>
>>   Because of the difficulty of maintaining a nonzero matrix for such
>> problems aren't these problems often done "matrix-free"?
>>
>>   So you provide a routine that computes the action of the operator but
>> doesn't form the operator explicitly (and you hope that you don't need a
>> preconditioner). There are simple ways (but less optimal) to do this as
>> well as more sophisticated (such as multipole methods).
>>
>>   If the convergence of the linear solver is too slow (due to lack of
>> preconditioner) you might consider continuing with matrix free but at each
>> new Newton solve (or several Newton solves) construct a very sparse matrix
>> that captures just the very local coupling in the problem. Once particles
>> have moved around a bit you would throw away the old matrix and construct a
>> new one. For example the matrix might just captures interactions between
>> particles that are less than some radius away from themselves. You could
>> use a direct solver or iterative solver to solve this very sparse system.
>>
>
> I tried to do this with Dan Negrut many years ago and had the same
> problems. That is part of why incompressible SPH never works,
> since you need global modes.
>
>   Thanks,
>
>  Matt
>
>
>>   Barry
>>
>> This is why KSPSetOperators and SNESSetJacobian and TSSetRHS/IJacobain
>> take two Jacobian matrices, the first holds the matrix free Jacobian and
>> the second holds the approximation used inside the preconditioner.
>>
>>
>>
>> > On Oct 15, 2019, at 3:29 PM, Ellen M. Price <
>> ellen.pr...@cfa.harvard.edu> wrote:
>> >
>> > Thanks for the reply, Barry! Unfortunately, this is a particle code
>> > (SPH, specifically), so the particle neighbors, which influence the
>> > properties, change over time; the degrees of freedom is a constant, as
>> > is the particle number. Any thoughts, 

Re: [petsc-users] DMDAGetElements and global/local element number

2019-09-12 Thread Dave May via petsc-users
Please always use "reply-all" so that your messages go to the list.
This is standard mailing list etiquette.  It is important to preserve
threading for people who find this discussion later and so that we do
not waste our time re-answering the same questions that have already
been answered in private side-conversations.  You'll likely get an
answer faster that way too.

On Thu, 12 Sep 2019 at 22:26, Emmanuel Ayala  wrote:

> Thank you for the answer.
>
> El jue., 12 de sep. de 2019 a la(s) 15:21, Dave May (
> dave.mayhe...@gmail.com) escribió:
>
>>
>>
>> On Thu, 12 Sep 2019 at 20:21, Emmanuel Ayala via petsc-users <
>> petsc-users@mcs.anl.gov> wrote:
>>
>>> Hi everyone, it would be great if someone can give me a hint for this
>>> issue, i have been trying to figure out how to solve it, but i did not
>>> succeed
>>>
>>> I'm using DMDA to generate a 3D mesh (DMDA_ELEMENT_Q1). I'm trying to
>>> fill a MPI matrix with some values wich are related to the dofs of each
>>> element node, moreover i need to set this values based on the element
>>> number. Something like:
>>>
>>> mpi_A(total_elements X total_dofs)
>>>
>>>                      total_dofs
>>> row_0 (element_0)    a_0 a_1 a_2 ... a_23
>>> row_1 (element_1)    a_0 a_1 a_2 ... a_23
>>> row_2 (element_2)    a_0 a_1 a_2 ... a_23
>>> .
>>> .
>>> .
>>> row_n (element_n)    a_0 a_1 a_2 ... a_23
>>>
>>> The element number is related to the row index. And the matrix values
>>> are set depending of the DOFs related to the element.
>>>
>>> With DMDAGetElements i can read the LOCAL nodes connected to the element
>>> and then the DOFs associated to the element. I can handle the local and
>>> global relations with DMGetLocalToGlobalMapping, MatSetLocalToGlobalMapping
>>> and MatSetValuesLocal. BUT i CAN NOT understand how to know the element
>>> number in LOCAL or GLOBAL contex. DMDAGetElements gives the NUMBER OF
>>> ELEMENTS owned in the local process, but there is not any information about
>>> the local or global ELEMENT NUMBER.
>>>
>>> How to know the local or global element number related to the data
>>> provided by DMDAGetElements?
>>>
>>
>> The DMDA defines cells of the same type (quads in 2D or hexes in 3D), hence
>> every cell has the same number of vertices.
>>
>
>> DMDAGetElements(DM dm,PetscInt *nel,PetscInt *nen,const PetscInt *e[])
>> nel - number of local elements
>> nen - number of element nodes
>> e - the local indices of the elements' vertices
>>
>> e[] defines the ordering of the elements. e[] is an array containing all
>> of the element-vertex maps. Since each element in the DMDA has the same
>> number of vertices, the first nen values in e[] correspond to the vertices
>> (local index) associated with the first element. The next nen values in e[]
>> correspond to the vertices of the second element.  The vertices for any
>> (local) element with the index "cid" can be sought via e[nen*cid + i] where
>> i would range from 0 to nen-1.
>>
>>
> You are right. I can handle the local information, i think the idea is:
>
> for ( PetscInt i = 0; i < nel; i++ )
> for (PetscInt j = 0; j < nen; j++)
> PetscSynchronizedPrintf(PETSC_COMM_WORLD,"local element %d :
> e[%d] = %d\n", i, j, e[i*nen+j]);
>
> BUT, it does not give information regarding to the ELEMENT identifier
> (number). I need the element number to ordering the elements inside of a
> MPI matrix. I want to access to each element data by means of the matrix
> row . I mean, in the row_0 there is the information (spreading through the
> columns) of the element_0.
>

I think this is a mis-understanding. The element number is not related to a
row in the matrix. The element is associated with vertices (basis
functions), and each vertex (basis) in the DMDA is given a unique index.
The index of that basis corresponds to a row (column) if it's a test
(trial) function. So if you have any element defined by the array e[], you
know how to insert values into a matrix by using the vertex indices.
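
As a minimal sketch (dof = 1 assumed, with A obtained from DMCreateMatrix() so
that the local-to-global mapping is already attached; elem_vals denotes the
nen x nen element contribution you compute yourself):

  const PetscInt *e, *idx;
  PetscInt        nel, nen, cid;
  PetscScalar    *elem_vals;          /* your nen x nen element matrix */
  PetscErrorCode  ierr;

  ierr = DMDAGetElements(dm, &nel, &nen, &e);CHKERRQ(ierr);
  for (cid = 0; cid < nel; cid++) {
    idx = &e[nen*cid];                /* local vertex (basis) indices of element cid */
    /* ... compute elem_vals for this element ... */
    ierr = MatSetValuesLocal(A, nen, idx, nen, idx, elem_vals, ADD_VALUES);CHKERRQ(ierr);
  }
  ierr = DMDARestoreElements(dm, &nel, &nen, &e);CHKERRQ(ierr);
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);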



>
> The element vertices are numbered starting from 0, for each process. It
> does not give information about the element number.
>
> Why would you ever want, or need, the global element number? What is the
>> use case?
>>
>
> I'm performing topology optimization, and it is part of gradient
>

Re: [petsc-users] DMDAGetElements and global/local element number

2019-09-12 Thread Dave May via petsc-users
On Thu, 12 Sep 2019 at 20:21, Emmanuel Ayala via petsc-users <
petsc-users@mcs.anl.gov> wrote:

> Hi everyone, it would be great if someone can give me a hint for this
> issue, i have been trying to figure out how to solve it, but i did not
> succeed
>
> I'm using DMDA to generate a 3D mesh (DMDA_ELEMENT_Q1). I'm trying to fill
> a MPI matrix with some values wich are related to the dofs of each element
> node, moreover i need to set this values based on the element number.
> Something like:
>
> mpi_A(total_elements X total_dofs)
>
>                      total_dofs
> row_0 (element_0)    a_0 a_1 a_2 ... a_23
> row_1 (element_1)    a_0 a_1 a_2 ... a_23
> row_2 (element_2)    a_0 a_1 a_2 ... a_23
> .
> .
> .
> row_n (element_n)    a_0 a_1 a_2 ... a_23
>
> The element number is related to the row index. And the matrix values are
> set depending of the DOFs related to the element.
>
> With DMDAGetElements i can read the LOCAL nodes connected to the element
> and then the DOFs associated to the element. I can handle the local and
> global relations with DMGetLocalToGlobalMapping, MatSetLocalToGlobalMapping
> and MatSetValuesLocal. BUT i CAN NOT understand how to know the element
> number in LOCAL or GLOBAL contex. DMDAGetElements gives the NUMBER OF
> ELEMENTS owned in the local process, but there is not any information about
> the local or global ELEMENT NUMBER.
>
> How to know the local or global element number related to the data
> provided by DMDAGetElements?
>

The DMDA defines cells of the same type (quads in 2D or hexes in 3D), hence
every cell has the same number of vertices.

DMDAGetElements(DM dm,PetscInt *nel,PetscInt *nen,const PetscInt *e[])
nel - number of local elements
nen - number of element nodes
e - the local indices of the elements' vertices

e[] defines the ordering of the elements. e[] is an array containing all of
the element-vertex maps. Since each element in the DMDA has the same number
of vertices, the first nen values in e[] correspond to the vertices (local
index) associated with the first element. The next nen values in e[]
correspond to the vertices of the second element.  The vertices for any
(local) element with the index "cid" can be sought via e[nen*cid + i] where
i would range from 0 to nen-1.

Why would you ever want, or need, the global element number? What is the
use case?

Thanks,
Dave




>
> Thank you.
>


Re: [petsc-users] [petsc-dev] Working Group Beginners: Feedback On Layout

2019-08-16 Thread Dave May via petsc-users
I think it would useful to have links to all the man pages in the table of
contents.

I also think it would be useful to have links to the man pages for specific
key functions which are fundamental to the objectives of the tutorial.
These could appear at the end of the tutorial under a new section heading
(eg "Further reading"). It would be good to keep the list of man pages
displayed to a minimum to avoid info overload and obscuring the primary
objectives of the tut.

Overall I like it. Nice work.

Cheers,
Dave

On Fri, 16 Aug 2019 at 16:22, Faibussowitsch, Jacob via petsc-dev <
petsc-...@mcs.anl.gov> wrote:

> Hello All PETSC Developers/Users!
>
> As many of you may or may not know, PETSc recently held an all-hands
> strategic meeting to chart the medium term course for the group. As part of
> this meeting a working group was formed to focus on beginner tutorial
> guides aimed at bringing new users up to speed on how to program basic to
> intermediate PETSc scripts. We have just completed a first draft of our
> template for these guides and would like to ask you all for your feedback!
> Any and all feedback would be greatly appreciated, however please limit
> your feedback to the general *layout* and *structure*. The visual
> presentation of the web page and content is still all a WIP, and is not
> necessarily representative of the finished product.
>
> That being said, in order to keep the project moving forward we will *soft-cap
> feedback collection by the end of next Friday (August 23)* so that we can
> get started on writing the tutorials and integrating them with the rest of
> the revamped user-guides. Please email me directly at
> jfaibussowit...@anl.gov with your comments! Be sure to include specific
> details and examples of what you like and don’t like with your mail.
>
> Here is the template:
> http://patricksanan.com/temp/_build/html/introductory_tutorial_ksp.html
>
> Sincerely,
>
> Jacob Faibussowitsch
>
>


Re: [petsc-users] strange error using fgmres

2019-05-05 Thread Dave May via petsc-users
On Mon, 6 May 2019 at 02:18, Smith, Barry F. via petsc-users <
petsc-users@mcs.anl.gov> wrote:

>
>
>   Even if you don't get failures on the smaller version of a code it can
> still be worth running with valgrind (when you can't run valgrind on the
> massive problem) because often the problem is still there on the smaller
> problem, just less directly visible but valgrind can still find it.
>
>
> > [13]PETSC ERROR: Object is in wrong state
> > [13]PETSC ERROR: Clearing DM of global vectors that has a global vector
> obtained with DMGetGlobalVector()
>
>You probably have a work vector obtained with DMGetGlobalVector() that
> you forgot to return with DMRestoreGlobalVector(). Though I would expect
> that this would reproduce on any size problem.


I'd fix the DM issue first before addressing the solver problem. I suspect
the DM error could cause the solver error.

Yep - something is wrong with your management of vectors associated with
one of your DM's. You can figure out if this is the case by running with
-log_view. Make sure the summary of the objects reported shows that the
number of Vecs created and destroyed matches. At the very least, if there
is a mismatch, make sure this difference does not increase as you do
additional optimization solves (or time steps).

As Barry says, you don't need to run a large scale job to detect this, nor
do you need to run through many optimization solves - the problem exists
and is detectable and thus fixable for all job sizes.
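
For reference, the usual shape of the fix is simply to pair every Get with a
Restore before the DM is destroyed (a sketch; the names are illustrative):

  Vec            work;
  PetscErrorCode ierr;

  ierr = DMGetGlobalVector(dm, &work);CHKERRQ(ierr);
  /* ... use work as a temporary ... */
  ierr = DMRestoreGlobalVector(dm, &work);CHKERRQ(ierr);   /* must precede DMDestroy(&dm) */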


>
>Barry
>
>
> > On May 5, 2019, at 5:21 PM, Randall Mackie via petsc-users <
> petsc-users@mcs.anl.gov> wrote:
> >
> > In solving a nonlinear optimization problem, I was recently
> experimenting with fgmres using the following options:
> >
> > -nlcg_ksp_type fgmres \
> > -nlcg_pc_type ksp \
> > -nlcg_ksp_ksp_type bcgs \
> > -nlcg_ksp_pc_type jacobi \
> > -nlcg_ksp_rtol 1e-6 \
> > -nlcg_ksp_ksp_max_it 300 \
> > -nlcg_ksp_max_it 200 \
> > -nlcg_ksp_converged_reason \
> > -nlcg_ksp_monitor_true_residual \
> >
> > I sometimes randomly will get an error like the following:
> >
> > Residual norms for nlcg_ solve.
> >   0 KSP unpreconditioned resid norm 3.371606868500e+04 true resid norm
> 3.371606868500e+04 ||r(i)||/||b|| 1.e+00
> >   1 KSP unpreconditioned resid norm 2.322590778002e+02 true resid norm
> 2.322590778002e+02 ||r(i)||/||b|| 6.888676137487e-03
> >   2 KSP unpreconditioned resid norm 8.262440884758e+01 true resid norm
> 8.262440884758e+01 ||r(i)||/||b|| 2.450594392232e-03
> >   3 KSP unpreconditioned resid norm 3.660428333809e+01 true resid norm
> 3.660428333809e+01 ||r(i)||/||b|| 1.085662853522e-03
> >   3 KSP unpreconditioned resid norm 0.e+00 true resid norm
>  -nan ||r(i)||/||b||   -nan
> > Linear nlcg_ solve did not converge due to DIVERGED_PC_FAILED iterations
> 3
> >PC_FAILED due to SUBPC_ERROR
> >
> > This usually happens after a few nonlinear optimization iterations,
> meaning that it’s worked perfectly fine until this point.
> > How can using jacobi pc all of a sudden cause a NaN, if it’s worked
> perfectly fine before?
> >
> > Some other errors in the output log file are as follows, although I have
> no idea if they result from the above error or not:
> >
> > [13]PETSC ERROR: Object is in wrong state
> > [13]PETSC ERROR: Clearing DM of global vectors that has a global vector
> obtained with DMGetGlobalVector()
> > [13]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html
> for trouble shooting.
> > [13]PETSC ERROR: Petsc Release Version 3.11.1, Apr, 12, 2019
> >
> >
> > [27]PETSC ERROR: #1 DMClearGlobalVectors() line 196 in
> /state/std2/FEMI/PETSc/petsc-3.11.1/src/dm/interface/dmget.c
> > [27]PETSC ERROR: Configure options --with-clean=1
> --with-scalar-type=complex --with-debugging=0 --with-fortran=1
> --with-blaslapack-dir=/state/std2/intel_2018/m
> > kl --with-mkl_pardiso-dir=/state/std2/intel_2018/mkl
> --with-mkl_cpardiso-dir=/state/std2/intel_2018/mkl
> --download-mumps=../external/mumps_v5.1.2-p1.tar.gz --d
> > ownload-scalapack=../external/scalapack-2.0.2.tgz --with-cc=mpiicc
> --with-fc=mpiifort --with-cxx=mpiicc --FOPTFLAGS="-O3 -xHost"
> --COPTFLAGS="-O3 -xHost" --CXX
> > OPTFLAGS="-O3 -xHost"
> >
> >
> > #2 DMDestroy() line 752 in
> /state/std2/FEMI/PETSc/petsc-3.11.1/src/dm/interface/dm.c
> > [72]PETSC ERROR: #3 PetscObjectDereference() line 624 in
> /state/std2/FEMI/PETSc/petsc-3.11.1/src/sys/objects/inherit.c
> > [72]PETSC ERROR: #4 PetscObjectListDestroy() line 156 in
> /state/std2/FEMI/PETSc/petsc-3.11.1/src/sys/objects/olist.c
> > [72]PETSC ERROR: #5 PetscHeaderDestroy_Private() line 122 in
> /state/std2/FEMI/PETSc/petsc-3.11.1/src/sys/objects/inherit.c
> > [72]PETSC ERROR: #6 VecDestroy() line 412 in
> /state/std2/FEMI/PETSc/petsc-3.11.1/src/vec/vec/interface/vector.c
> >
> >
> >
> > This is a large run taking many hours to get to this problem. I will try
> to run in debug mode, but given that this seems to be randomly happening

Re: [petsc-users] Confusing Schur preconditioner behaviour

2019-03-19 Thread Dave May via petsc-users
Hi Colin,

On Tue, 19 Mar 2019 at 09:33, Cotter, Colin J 
wrote:

> Hi Dave,
>
> >If you are doing that, then you need to tell fieldsplit to use the Amat
> to define the splits otherwise it will define the Schur compliment as
> >S = B22 - B21 inv(B11) B12
> >preconditiones with B22, where as what you want is
> >S = A22 - A21 inv(A11) A12
> >preconditioned with B22.
>
> >If your operators are set up this way and you didn't indicate to use Amat
> to define S this would definitely explain why preonly works but iterating
> on Schur does not.
>
> Yes, thanks - this solves it! I need pc_use_amat.
>

Okay great. But doesn't that option eradicate your custom Schur complement
object which you inserted into the Bmat in the (2,2) slot?

I thought you would use the option
-pc_fieldsplit_diag_use_amat

In general for fieldsplit (Schur) I found that the best way to manage user
defined Schur complement preconditioners is via PCFieldSplitSetSchurPre().

https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCFieldSplitSetSchurPre.html#PCFieldSplitSetSchurPre
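
Usage is a one-liner once you have assembled your own Schur preconditioning
matrix (a sketch; Sp denotes that user-defined approximation and pc the
fieldsplit PC):

  ierr = PCFieldSplitSetSchurPre(pc, PC_FIELDSPLIT_SCHUR_PRE_USER, Sp);CHKERRQ(ierr);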

Also, for solver debugging purposes with fieldsplit and MatNest, I find it
incredibly useful to attach textual names to all the matrices going to into
FieldSplit. You can use PetscObjectSetName() with each of your sub-matrices
in the Amat and the Bmat, and any schur complement operators. The textual
names will be displayed in KSP view. In that way you have a better chance
of understanding which operators are being used where. (Note that this
trick is less useful when the Amat and Bmat are AIJ matrices.)
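
For example (a sketch, using the names that appear in the view below):

  ierr = PetscObjectSetName((PetscObject)Auu, "Auu");CHKERRQ(ierr);
  ierr = PetscObjectSetName((PetscObject)Aup, "Aup");CHKERRQ(ierr);
  ierr = PetscObjectSetName((PetscObject)Apu, "Apu");CHKERRQ(ierr);
  ierr = PetscObjectSetName((PetscObject)App, "App");CHKERRQ(ierr);
  ierr = PetscObjectSetName((PetscObject)S,   "S*");CHKERRQ(ierr);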

Below is an example KSPView associated with a 2x2 block system where I've
attached the names Auu, Aup, Apu, App, and S* to the Amat sub-matrices and the
Schur complement preconditioner.

PC Object:(dcy_) 1 MPI processes

  type: fieldsplit

FieldSplit with Schur preconditioner, factorization FULL

Preconditioner for the Schur complement formed from Sp, an assembled
approximation to S, which uses (lumped, if requested) A00's diagonal's
inverse

Split info:

Split number 0 Defined by IS

Split number 1 Defined by IS

KSP solver for A00 block

  KSP Object:  (dcy_fieldsplit_u_)   1 MPI processes

type: preonly

maximum iterations=1, initial guess is zero

tolerances:  relative=1e-05, absolute=1e-50, divergence=1.

left preconditioning

using NONE norm type for convergence test

  PC Object:  (dcy_fieldsplit_u_)   1 MPI processes

type: lu

  LU: out-of-place factorization

  tolerance for zero pivot 2.22045e-14

  matrix ordering: nd

  factor fill ratio given 0., needed 0.

Factored matrix follows:

  Mat Object:   1 MPI processes

type: seqaij

rows=85728, cols=85728

package used to perform factorization: umfpack

total: nonzeros=0, allocated nonzeros=0

total number of mallocs used during MatSetValues calls =0

  not using I-node routines

  UMFPACK run parameters:

Control[UMFPACK_PRL]: 1.

Control[UMFPACK_STRATEGY]: 0.

Control[UMFPACK_DENSE_COL]: 0.2

Control[UMFPACK_DENSE_ROW]: 0.2

Control[UMFPACK_AMD_DENSE]: 10.

Control[UMFPACK_BLOCK_SIZE]: 32.

Control[UMFPACK_FIXQ]: 0.

Control[UMFPACK_AGGRESSIVE]: 1.

Control[UMFPACK_PIVOT_TOLERANCE]: 0.1

Control[UMFPACK_SYM_PIVOT_TOLERANCE]: 0.001

Control[UMFPACK_SCALE]: 1.

Control[UMFPACK_ALLOC_INIT]: 0.7

Control[UMFPACK_DROPTOL]: 0.

Control[UMFPACK_IRSTEP]: 0.

Control[UMFPACK_ORDERING]: AMD (not using the PETSc
ordering)

linear system matrix = precond matrix:

Mat Object:Auu(dcy_fieldsplit_u_) 1 MPI
processes

  type: seqaij

  rows=85728, cols=85728

  total: nonzeros=1028736, allocated nonzeros=1028736

  total number of mallocs used during MatSetValues calls =0

using I-node routines: found 21432 nodes, limit used is 5

KSP solver for S = A11 - A10 inv(A00) A01

  KSP Object:  (dcy_fieldsplit_p_)   1 MPI processes

type: fgmres

  GMRES: restart=30, using Classical (unmodified) Gram-Schmidt
Orthogonalization with no iterative refinement

  GMRES: happy breakdown tolerance 1e-30

maximum iterations=300, initial guess is zero

tolerances:  relative=0.01, absolute=1e-50, divergence=1.

right preconditioning

using UNPRECONDITIONED norm type for convergence test

  PC Object:  (dcy_fieldsplit_p_)   1 MPI processes

type: lu

  LU: out-of-place 

Re: [petsc-users] PETSC address vector c++ access

2018-11-30 Thread Dave May via petsc-users
On Fri, 30 Nov 2018 at 14:50, RAELI ALICE via petsc-users <
petsc-users@mcs.anl.gov> wrote:

> Hi All,
> My team is working on a PETSC version of an existent code.
> In order to convert the main part of this work retaining the c++ levels of
> abstraction,
> we would access to c++ vector data structures easily.
>
> We would like to use the memory area allocated using c++ as a Petsc Vector
>
> a supposed pseudocode could be:
>
>
> int data[] = { 1,2,3,4,5,6,7,8,9 };
>

To avoid any confusion (or disappointment) later, I want to point out that
your pseudo code is wrong (specifically the line above) and as written would
result in a SEGV.

A PETSc Vec object can ONLY represent PetscScalar data types.

It may well have been a typo on your part, but I think it's important to
emphasise that the Vec object does not try to mimic a C++ template with the
data type as an argument.
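
If the underlying storage really is of type PetscScalar, it can be wrapped in a
Vec without copying, e.g. via VecCreateSeqWithArray() (a minimal sketch):

  PetscScalar    data[] = {1, 2, 3, 4, 5, 6, 7, 8, 9};
  Vec            v;
  PetscErrorCode ierr;

  ierr = VecCreateSeqWithArray(PETSC_COMM_SELF, 1, 9, data, &v);CHKERRQ(ierr);
  /* v now reads and writes the same memory as data[]; destroy v before data disappears */
  ierr = VecDestroy(&v);CHKERRQ(ierr);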

Thanks,
  Dave


>
>
> unsigned int sizeVectorPetsC = data.size();
> Vec v(data, sizeVectorPetsC);   (Can it exist with easy-access petsc routines?)
> cout << "The mapped vector v is: [";
>
> PetscScalar *vecArray;
>
> PetscGetArray(v, &vecArray);
> for (unsigned int i = 0; i < sizeVectorPetsC; i++)
> {
>   Informations of v are the informations of data. (same memory read by both data structures)
> }
> PetscRestoreArray(v, &vecArray);
> cout << "]" << endl;
>
>
> Is this kind of duality between C++ and PETSc objects provided without
> a re-copy of the concerned information?
>
> Thank you,
> Alice
>


Re: [petsc-users] [SLEPc] ex5 fails, error in lapack

2018-10-28 Thread Dave May
On Sun, 28 Oct 2018 at 21:46, Santiago Andres Triana 
wrote:

> Hi Dave,
>
> Indeed, I added that last arg myself after the configure script asked for
> it (--with-batch seems to need it). I just tried with petsc-3.9.1, without
> the --with-batch and --known-64-blas-indices=1 options and everything is
> working nicely.
>

Great.

I believe as a general rule, flags such as -known-64-bit-xxx are only
required to be specified by the user when using system-provided packages
(actually any package not installed by petsc's configure). If you use
--download-yyy then petsc's configure defines how package yyy is to be
configured and built, hence it knows whether it used 64 bit ints, or not -
the user does not (and probably should not) provide a flag to indicate what
petsc's configuration already knows.


Thanks,
  Dave

I will try again later with the latest version.
>

Ok.



> Thanks!
>
> Santiago
>
> On Sun, Oct 28, 2018 at 10:31 AM Dave May  wrote:
>
>>
>>
>> On Sun, 28 Oct 2018 at 09:37, Santiago Andres Triana 
>> wrote:
>>
>>> Hi petsc-users,
>>>
>>> I am experiencing problems running ex5 and ex7 from the slepc tutorial.
>>> This is after upgrade to petsc-3.10.2 and slepc-3.10.1. Has anyone run into
>>> this problem? see the error message below. Any help or advice would be
>>> highly appreciated. Thanks in advance!
>>>
>>> Santiago
>>>
>>>
>>>
>>> trianas@hpcb-n02:/home/trianas/slepc-3.10.1/src/eps/examples/tutorials>
>>> ./ex5 -eps_nev 4
>>>
>>> Markov Model, N=120 (m=15)
>>>
>>> [0]PETSC ERROR: - Error Message
>>> --
>>> [0]PETSC ERROR: Error in external library
>>> [0]PETSC ERROR: Error in LAPACK subroutine hseqr: info=0
>>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html
>>> for trouble shooting.
>>> [0]PETSC ERROR: Petsc Release Version 3.10.2, Oct, 09, 2018
>>> [0]PETSC ERROR: ./ex5 on a arch-linux2-c-opt named hpcb-n02 by trianas
>>> Sun Oct 28 09:30:18 2018
>>> [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768
>>> --known-level1-dcache-linesize=64 --known-level1-dcache-assoc=8
>>> --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2
>>> --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8
>>> --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8
>>> --known-bits-per-byte=8 --known-memcmp-ok=1 --known-sizeof-MPI_Comm=4
>>> --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-int64_t=1
>>> --known-mpi-c-double-complex=1 --known-has-attribute-aligned=1
>>> --with-scalar-type=complex --download-mumps=1 --download-parmetis
>>> --download-metis --download-scalapack=1 --download-fblaslapack=1
>>> --with-debugging=0 --download-superlu_dist=1 --download-ptscotch=1
>>> CXXOPTFLAGS="-O3 -march=native" FOPTFLAGS="-O3 -march=native"
>>> COPTFLAGS="-O3 -march=native" --with-batch --known-64-bit-blas-indices=1
>>>
>>
>> I think this last arg is wrong if you use --download-fblaslapack.
>>
>> Did you explicitly add this option yourself?
>>
>>
>> [0]PETSC ERROR: #1 DSSolve_NHEP() line 586 in
>>> /space/hpc-home/trianas/slepc-3.10.1/src/sys/classes/ds/impls/nhep/dsnhep.c
>>> [0]PETSC ERROR: #2 DSSolve() line 586 in
>>> /space/hpc-home/trianas/slepc-3.10.1/src/sys/classes/ds/interface/dsops.c
>>> [0]PETSC ERROR: #3 EPSSolve_KrylovSchur_Default() line 275 in
>>> /space/hpc-home/trianas/slepc-3.10.1/src/eps/impls/krylov/krylovschur/krylovschur.c
>>> [0]PETSC ERROR: #4 EPSSolve() line 148 in
>>> /space/hpc-home/trianas/slepc-3.10.1/src/eps/interface/epssolve.c
>>> [0]PETSC ERROR: #5 main() line 90 in
>>> /home/trianas/slepc-3.10.1/src/eps/examples/tutorials/ex5.c
>>> [0]PETSC ERROR: PETSc Option Table entries:
>>> [0]PETSC ERROR: -eps_nev 4
>>> [0]PETSC ERROR: End of Error Message ---send entire
>>> error message to petsc-ma...@mcs.anl.gov--
>>> application called MPI_Abort(MPI_COMM_WORLD, 76) - process 0
>>> [unset]: write_line error; fd=-1 buf=:cmd=abort exitcode=76
>>> :
>>> system msg for write_line failure : Bad file descriptor
>>>
>>>


Re: [petsc-users] [SLEPc] ex5 fails, error in lapack

2018-10-28 Thread Dave May
On Sun, 28 Oct 2018 at 09:37, Santiago Andres Triana 
wrote:

> Hi petsc-users,
>
> I am experiencing problems running ex5 and ex7 from the slepc tutorial.
> This is after upgrade to petsc-3.10.2 and slepc-3.10.1. Has anyone run into
> this problem? see the error message below. Any help or advice would be
> highly appreciated. Thanks in advance!
>
> Santiago
>
>
>
> trianas@hpcb-n02:/home/trianas/slepc-3.10.1/src/eps/examples/tutorials>
> ./ex5 -eps_nev 4
>
> Markov Model, N=120 (m=15)
>
> [0]PETSC ERROR: - Error Message
> --
> [0]PETSC ERROR: Error in external library
> [0]PETSC ERROR: Error in LAPACK subroutine hseqr: info=0
> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html
> for trouble shooting.
> [0]PETSC ERROR: Petsc Release Version 3.10.2, Oct, 09, 2018
> [0]PETSC ERROR: ./ex5 on a arch-linux2-c-opt named hpcb-n02 by trianas Sun
> Oct 28 09:30:18 2018
> [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768
> --known-level1-dcache-linesize=64 --known-level1-dcache-assoc=8
> --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2
> --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8
> --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8
> --known-bits-per-byte=8 --known-memcmp-ok=1 --known-sizeof-MPI_Comm=4
> --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-int64_t=1
> --known-mpi-c-double-complex=1 --known-has-attribute-aligned=1
> --with-scalar-type=complex --download-mumps=1 --download-parmetis
> --download-metis --download-scalapack=1 --download-fblaslapack=1
> --with-debugging=0 --download-superlu_dist=1 --download-ptscotch=1
> CXXOPTFLAGS="-O3 -march=native" FOPTFLAGS="-O3 -march=native"
> COPTFLAGS="-O3 -march=native" --with-batch --known-64-bit-blas-indices=1
>

I think this last arg is wrong if you use --download-fblaslapack.

Did you explicitly add this option yourself?


[0]PETSC ERROR: #1 DSSolve_NHEP() line 586 in
> /space/hpc-home/trianas/slepc-3.10.1/src/sys/classes/ds/impls/nhep/dsnhep.c
> [0]PETSC ERROR: #2 DSSolve() line 586 in
> /space/hpc-home/trianas/slepc-3.10.1/src/sys/classes/ds/interface/dsops.c
> [0]PETSC ERROR: #3 EPSSolve_KrylovSchur_Default() line 275 in
> /space/hpc-home/trianas/slepc-3.10.1/src/eps/impls/krylov/krylovschur/krylovschur.c
> [0]PETSC ERROR: #4 EPSSolve() line 148 in
> /space/hpc-home/trianas/slepc-3.10.1/src/eps/interface/epssolve.c
> [0]PETSC ERROR: #5 main() line 90 in
> /home/trianas/slepc-3.10.1/src/eps/examples/tutorials/ex5.c
> [0]PETSC ERROR: PETSc Option Table entries:
> [0]PETSC ERROR: -eps_nev 4
> [0]PETSC ERROR: End of Error Message ---send entire
> error message to petsc-ma...@mcs.anl.gov--
> application called MPI_Abort(MPI_COMM_WORLD, 76) - process 0
> [unset]: write_line error; fd=-1 buf=:cmd=abort exitcode=76
> :
> system msg for write_line failure : Bad file descriptor
>
>


Re: [petsc-users] Shell Matrix Operations required for KSP solvers?

2018-10-23 Thread Dave May
On Tue, 23 Oct 2018 at 02:24, Matthew Knepley  wrote:

> On Mon, Oct 22, 2018 at 7:44 PM Andrew Ho  wrote:
>
>> I have a specialized matrix structure I'm trying to take advantage of for
>> solving large scale (non)linear systems. I think for this purpose using a
>> Shell matrix is sufficient for interfacing with PETSc's KSP linear solvers.
>>
>> Looking at the examples which use shell matrices, it seems most only
>> require implementing MatMult, and sometimes MatMultTranspose. Is there a
>> list of what operations are required (or optional but good to have) for the
>> different KSP solver types? This is specifically for the KSP solve itself,
>> not constructing the actual matrix. I'd also be interested if any of the
>> required/optional operations changes if preconditioners (left and/or right)
>> are used.
>>
>
> There is no list, but its hard to think of another operation KSP would ask
> for. Preconditioners are another story unfortunately. They often want
> explicit access to matrix entries. its really unusual for KSPs to work
> without a good preconditioner (the notable exception being well-conditioned
> systems like some boundary integral operators).
>

The most basic PC is Jacobi - for that to work you need to implement
MatGetDiagonal.
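
A sketch of what that looks like for a shell matrix. MyGetDiagonal is a
hypothetical user routine; the VecSet is only a placeholder for computing your
operator's true diagonal.

  PetscErrorCode MyGetDiagonal(Mat A, Vec d)
  {
    PetscErrorCode ierr;
    ierr = VecSet(d, 1.0);CHKERRQ(ierr);   /* placeholder: fill d with the actual diagonal */
    return 0;
  }

  /* registration, done once after MatCreateShell(): */
  ierr = MatShellSetOperation(A, MATOP_GET_DIAGONAL, (void (*)(void))MyGetDiagonal);CHKERRQ(ierr);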

Thanks,
  Dave



>   Thanks,
>
> Matt
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
> 
>


Re: [petsc-users] KSP and matrix-free matrix (shell)

2018-10-18 Thread Dave May
On Thu, 18 Oct 2018 at 17:57, Florian Lindner  wrote:

> Hello,
>
> I try to use the KSP solver package together with a shell matrix:
>
>
>   MyContext mycontext; // an empty struct, not sure it it's needed?
>   Mat s;
>   ierr = MatCreateShell(PETSC_COMM_WORLD, size, size, PETSC_DECIDE,
> PETSC_DECIDE, , );
>   ierr = MatShellSetOperation(s, MATOP_MULT, (void(*)(void))usermult);
> CHKERRQ(ierr);
>
> To simulate a meaningfull usermult, I use MatMult on an actual existing
> matrix of same dimensions:
>
> extern PetscErrorCode usermult(Mat m ,Vec x, Vec y)
> {
>   PetscErrorCode ierr = 0;
>   ierr = MatMult(matrix, x, y);
>   printf("Call\n");
>   return ierr;
> }
>
> Btw, what is the significance of the Mat m argument here?


m is your shell matrix. You should be calling

MatShellGetContext(m,(void**)&mycontext);

https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatShellGetContext.html

to get your user context inside usermult to retrieve any data you need for
the mat-vec product. Where the hell is the variable "matrix"? Is it a
global variable?? If yes - don't do that.
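
A sketch of the intended pattern (assuming the assembled test matrix lives
inside your MyContext rather than in a global):

  typedef struct {
    Mat Areal;   /* the assembled matrix used to emulate the operator */
  } MyContext;

  PetscErrorCode usermult(Mat m, Vec x, Vec y)
  {
    MyContext     *ctx;
    PetscErrorCode ierr;

    ierr = MatShellGetContext(m, (void **)&ctx);CHKERRQ(ierr);
    ierr = MatMult(ctx->Areal, x, y);CHKERRQ(ierr);
    return 0;
  }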


>
> matrix is created like:
>
>   ierr = MatCreate(PETSC_COMM_WORLD, ); CHKERRQ(ierr);
>   ierr = MatSetSizes(matrix, size, size, PETSC_DECIDE, PETSC_DECIDE);
> CHKERRQ(ierr);
>   ierr = MatSetFromOptions(matrix); CHKERRQ(ierr);
>   ierr = MatSetUp(matrix); CHKERRQ(ierr);
>
>
>   MatMult(s, b, x);
>
> works. The usermult function is called.
>
> But trying to use a KSP gives an error:
>
>   KSP solver;
>   KSPCreate(PETSC_COMM_WORLD, );
>   KSPSetFromOptions(solver);
>   KSPSetOperators(solver, s, s);
>
>
> error:
>
> [0]PETSC ERROR: - Error Message
> --
> [0]PETSC ERROR: See
> http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html for
> possible LU and Cholesky solvers
> [0]PETSC ERROR: Could not locate a solver package. Perhaps you must
> ./configure with --download-
> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html
> for trouble shooting.
> [0]PETSC ERROR: Petsc Release Version 3.9.3, unknown
> [0]PETSC ERROR: ./a.out on a arch-linux2-c-opt named asaru by lindnefn Thu
> Oct 18 17:39:52 2018
> [0]PETSC ERROR: Configure options --with-debugging=0 COPTFLAGS="-O3
> -march=native -mtune=native" CXXOPTFLAGS="-O3 -march=native -mtune=native"
> FOPTFLAGS="-O3 -march=native -mtune=native" --download-petsc4py
> --download-mpi4py --with-mpi-dir=/opt/mpich
> [0]PETSC ERROR: #1 MatGetFactor() line 4328 in
> /home/lindnefn/software/petsc/src/mat/interface/matrix.c
> [0]PETSC ERROR: #2 PCSetUp_ILU() line 142 in
> /home/lindnefn/software/petsc/src/ksp/pc/impls/factor/ilu/ilu.c
> [0]PETSC ERROR: #3 PCSetUp() line 923 in
> /home/lindnefn/software/petsc/src/ksp/pc/interface/precon.c
> [0]PETSC ERROR: #4 KSPSetUp() line 381 in
> /home/lindnefn/software/petsc/src/ksp/ksp/interface/itfunc.c
> [0]PETSC ERROR: #5 KSPSolve() line 612 in
> /home/lindnefn/software/petsc/src/ksp/ksp/interface/itfunc.c
>
> Do I need to MatShellSetOperations additional operations? Like
> MATOP_ILUFACTOR? How can I know what operations to implement?
>
> Best Thanks,
> Florian
>
>
>


Re: [petsc-users] Increasing norm with finer mesh

2018-10-16 Thread Dave May
On Wed, 17 Oct 2018 at 03:15, Weizhuo Wang  wrote:

> I just tried both, neither of them make a difference. I got exactly the
> same curve with either combination.
>

Try using right preconditioning.

https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/KSP/KSPSetPCSide.html


Use the options:

-ksp_type gmres -ksp_pc_side right -ksp_rtol 1e-12

Or;

-ksp_type fgmres  -ksp_rtol 1e-12

FGMRES does right preconditioning by default.



> Thanks!
>
> Wang weizhuo
>
> On Tue, Oct 16, 2018 at 8:06 PM Matthew Knepley  wrote:
>
>> On Tue, Oct 16, 2018 at 7:26 PM Weizhuo Wang 
>> wrote:
>>
>>> Hello again!
>>>
>>> After some tweaking the code is giving right answers now. However it
>>> start to disagree with MATLAB results ('traditional' way using matrix
>>> inverse) when the grid is larger than 100*100. My PhD advisor and I
>>> suspects that the default dimension of the Krylov subspace is 100 in the
>>> test case we are running. If so, is there a way to increase the size of the
>>> subspace?
>>>
>>
>> 1) The default subspace size is 30, not 100. You can increase the
>> subspace size using
>>
>>-ksp_gmres_restart n
>>
>> 2) The problem is likely your tolerance. The default solver tolerance is
>> 1e-5. You can change it using
>>
>>-ksp_rtol 1e-9
>>
>>   Thanks,
>>
>>  Matt
>>
>>
>>>
>>> [image: Disagrees.png]
>>>
>>> Thanks!
>>>
>>> Wang Weizhuo
>>>
>>> On Tue, Oct 9, 2018 at 2:50 AM Mark Adams  wrote:
>>>
 To reiterate what Matt is saying, you seem to have the exact solution
 on a 10x10 grid. That makes no sense unless the solution can be represented
 exactly by your FE space (eg, u(x,y) = x + y).

 On Mon, Oct 8, 2018 at 9:33 PM Matthew Knepley 
 wrote:

> On Mon, Oct 8, 2018 at 9:28 PM Weizhuo Wang 
> wrote:
>
>> The code is attached in case anyone wants to take a look, I will try
>> the high frequency scenario later.
>>
>
> That is not the error. It is superconvergence at the vertices. The
> real solution is trigonometric, so your
> linear interpolants or whatever you use is not going to get the right
> value in between mesh points. You
> need to do a real integral over the whole interval to get the L_2
> error.
>
>   Thanks,
>
>  Matt
>
>
>> On Mon, Oct 8, 2018 at 7:58 PM Mark Adams  wrote:
>>
>>>
>>>
>>> On Mon, Oct 8, 2018 at 6:58 PM Weizhuo Wang 
>>> wrote:
>>>
 The first plot is the norm with the flag -pc_type lu with respect
 to number of grids in one axis (n), and the second plot is the norm 
 without
 the flag -pc_type lu.

>>>
>>> So you are using the default PC w/o LU. The default is ILU. This
>>> will reduce high frequency effectively but is not effective on the low
>>> frequency error. Don't expect your algebraic error reduction to be at 
>>> the
>>> same scale as the residual reduction (what KSP measures).
>>>
>>>

>>
>> --
>> Wang Weizhuo
>>
>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
> 
>

>>>
>>> --
>>> Wang Weizhuo
>>>
>>
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which their
>> experiments lead.
>> -- Norbert Wiener
>>
>> https://www.cse.buffalo.edu/~knepley/
>> 
>>
>
>
> --
> Wang Weizhuo
>


Re: [petsc-users] Some Problems in Modifying Parallel Programs

2018-10-15 Thread Dave May
On Mon, 15 Oct 2018 at 16:54, Matthew Knepley  wrote:

> On Mon, Oct 15, 2018 at 10:42 AM Yingjie Wu  wrote:
>
>> Dear Petsc developer:
>> Hi,
>> Thank you very much for your previous reply.
>> I recently wanted to modify my program to parallel version, and
>> encountered some problems in modifying it.
>> 1. There are functions (read matrix) in the program that reads files,will
>> they affect my parallelism?
>>
>
> No, MatLoad works in parallel. This will work unchanged if you want the
> default layout. If you want a special partition,
> you must call MatSetSizes() after MatCreate().
>
>
>> The codes are as follows:
>>
>> ierr = PetscViewerBinaryOpen (PETSC_COMM_WORLD, file, FILE_MODE_READ,
>> ); CHKERRQ (ierr);
>> ierr = MatCreate (PETSC_COMM_WORLD, ); CHKERRQ (ierr);
>> ierr = MatSetFromOptions (A1); CHKERRQ (ierr);
>> ierr = MatCreate (PETSC_COMM_WORLD, ); CHKERRQ (ierr);
>> ierr = MatSetFromOptions (A2); CHKERRQ (ierr);
>> ierr = MatLoad (A1, viewer); CHKERRQ (ierr);
>> ierr = MatLoad (A2, viewer); CHKERRQ (ierr);
>> ierr = PetscViewerDestroy (); CHKERRQ (ierr);
>>
>> I read two matrix information from a binary file and wanted to use it on
>> each processor (in fact, my goal was to use these two matrices to give
>> initial values to the two field variables). The program can run in serial
>> time. Now I want to change it to parallel program. What do I need to
>> modify?
>> 2. Following the last question, I used the following code in giving
>> initial value procedure FormInitialGuess():
>>
>> ierr = MatSeqAIJGetArray (A1, _phi1); CHKERRQ (ierr);
>> ierr = MatSeqAIJGetArray (A2, _phi2); CHKERRQ (ierr);
>>
>> I found this function on manualpages, and I felt that it could fulfill my
>> need to represent the elements of the matrix in arrays to give field
>> variables an initial value in each grid. The matrix A1 and A2 above are
>> actually the matrix information that was read from the file before. Now I
>> want to modify it as a parallel program. How do I use matrix information to
>> give initial values in parallel? (In program, field variables are
>> implemented through DM because parallel computing and Ghost Value are
>> supported)
>>
>
> I do not understand the use of matrices to initialize field values.
> Usually field values are stored on Vec objects, and this is
> the philosophy of DM objects.
>

Maybe this is a "matlab style" idea / way of thinking, e.g. representing 2d
meshes and fields as matrices because they "look kinda the same". If
that is the case - don't do it! Use vectors to represent fields.
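
For example, if the initial field data were written out as a Vec (rather than a
Mat) with VecView(), loading it in parallel into a DM-managed vector is
straightforward (a sketch; the file name is illustrative and da is your DMDA):

  Vec            phi1;
  PetscViewer    viewer;
  PetscErrorCode ierr;

  ierr = DMCreateGlobalVector(da, &phi1);CHKERRQ(ierr);
  ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD, "phi1.bin", FILE_MODE_READ, &viewer);CHKERRQ(ierr);
  ierr = VecLoad(phi1, viewer);CHKERRQ(ierr);
  ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);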



>   Thanks,
>
>  Matt
>
>
>> Thanks,
>> Yingjie
>>
>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
> 
>


Re: [petsc-users] Failure of MUMPS

2018-10-12 Thread Dave May
On Thu, 11 Oct 2018 at 20:26, Michael Wick 
wrote:

> Thanks for all the suggestions!
>
> Increasing the value of icntl_14 in MUMPS helps a lot for my case.
>
> Do you have any suggestions for higher-order methods in saddle-point
> problems?
>

If the saddle point system arises from Stokes or an incompressible
elasticity formulation, then the standard block factorizations of
Silvester, Elman, Wathen will work very well for high-order - assuming of
course you use inf-sup stable basis for u/p. For Stokes/elasticity, the
pressure mass matrix is a decent spectrally equivalent operator for the
Schur complement.
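
As a concrete starting point, a typical set of options for such a block
preconditioner might look like this (a sketch only: fields 0/1 are assumed to
be velocity/pressure, and a pressure mass matrix is assumed to have been
supplied, e.g. via PCFieldSplitSetSchurPre(); tune per problem):

  -pc_type fieldsplit
  -pc_fieldsplit_type schur
  -pc_fieldsplit_schur_fact_type upper
  -fieldsplit_0_ksp_type preonly
  -fieldsplit_0_pc_type gamg
  -fieldsplit_1_ksp_type gmres
  -fieldsplit_1_ksp_rtol 1e-2
  -fieldsplit_1_pc_type jacobi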

These preconditioners are discussed here:

* Michele Benzi, Gene H. Golub, and Jörg Liesen, Numerical solution of
saddle point problems,
Acta Numerica, 14 (2005), pp. 1–137.

* Howard C. Elman, David J. Silvester, and Andrew J. Wathen, Finite
elements and fast iterative
solvers: with applications in incompressible fluid dynamics, Oxford
University Press, 2014.

High order examples can be found here:

* https://arxiv.org/abs/1607.03936

* Rudi, Johann, Georg Stadler, and Omar Ghattas. "Weighted BFBT
Preconditioner for Stokes Flow Problems with Highly Heterogeneous
Viscosity." SIAM Journal on Scientific Computing 39.5 (2017): S272-S297.

Note that in the Rudi et al. papers, due to the highly variable nature of the
viscosity, the authors advocate using a more complex definition of the
preconditioner for the Schur complement. Whether you need to use their
approach is dependent on the nature of the problem you are solving.

Thanks,
  Dave






>
> Mike
>
> Dave May  于2018年10月11日周四 上午1:50写道:
>
>>
>>
>> On Sat, 6 Oct 2018 at 12:42, Matthew Knepley  wrote:
>>
>>> On Fri, Oct 5, 2018 at 9:08 PM Mike Wick 
>>> wrote:
>>>
>>>> Hello PETSc team:
>>>>
>>>> I am trying to solve a PDE problem with high-order finite elements. The
>>>> matrix is getting denser and my experience is that MUMPS just outperforms
>>>> iterative solvers.
>>>>
>>>
>>> If the problem is elliptic, there is a lot of evidence that the P1
>>> preconditioner is decent for the system. Some people
>>> just project the system to P1, invert that with multigrid, and use that
>>> as the PC for Krylov. It should be worth trying.
>>>
>>
>> Matt means project to P1 directly from your high order function space in
>> one step. It is definitely worth trying.
>> For those interested, this approach is first described and discussed (to
>> my knowledge) in this paper:
>>
>> Persson, Per-Olof, and Jaime Peraire. "An efficient low memory implicit
>> DG algorithm for time dependent problems." *44th AIAA Aerospace Sciences
>> Meeting and Exhibit*. 2006.
>>
>>
>>> Moreover, as Jed will tell you, forming matrices for higher order is
>>> counterproductive. You should apply those matrix-free.
>>>
>>
>> I definitely agree with that.
>>
>> Cheers,
>>   Dave
>>
>>
>>
>>>
>>>   Thanks,
>>>
>>>  Matt
>>>
>>>
>>>> For certain problems, MUMPS just fail in the middle for no clear
>>>> reason. I just wander if there is any suggestion to improve the robustness
>>>> of MUMPS? Or in general, any suggestion for interative solver with very
>>>> high-order finite elements?
>>>>
>>>> Thanks!
>>>>
>>>> Mike
>>>>
>>>
>>>
>>> --
>>> What most experimenters take for granted before they begin their
>>> experiments is infinitely more interesting than any results to which their
>>> experiments lead.
>>> -- Norbert Wiener
>>>
>>> https://www.cse.buffalo.edu/~knepley/
>>> <http://www.cse.buffalo.edu/~knepley/>
>>>
>>


Re: [petsc-users] Failure of MUMPS

2018-10-11 Thread Dave May
On Sat, 6 Oct 2018 at 12:42, Matthew Knepley  wrote:

> On Fri, Oct 5, 2018 at 9:08 PM Mike Wick 
> wrote:
>
>> Hello PETSc team:
>>
>> I am trying to solve a PDE problem with high-order finite elements. The
>> matrix is getting denser and my experience is that MUMPS just outperforms
>> iterative solvers.
>>
>
> If the problem is elliptic, there is a lot of evidence that the P1
> preconditioner is decent for the system. Some people
> just project the system to P1, invert that with multigrid, and use that as
> the PC for Krylov. It should be worth trying.
>

Matt means project to P1 directly from your high order function space in
one step. It is definitely worth trying.
For those interested, this approach is first described and discussed (to my
knowledge) in this paper:

Persson, Per-Olof, and Jaime Peraire. "An efficient low memory implicit DG
algorithm for time dependent problems." *44th AIAA Aerospace Sciences
Meeting and Exhibit*. 2006.


> Moreover, as Jed will tell you, forming matrices for higher order is
> counterproductive. You should apply those matrix-free.
>

I definitely agree with that.

Cheers,
  Dave



>
>   Thanks,
>
>  Matt
>
>
>> For certain problems, MUMPS just fail in the middle for no clear reason.
>> I just wander if there is any suggestion to improve the robustness of
>> MUMPS? Or in general, any suggestion for interative solver with very
>> high-order finite elements?
>>
>> Thanks!
>>
>> Mike
>>
>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
> 
>


Re: [petsc-users] Fwd: Implementing a homotopy solver

2018-09-29 Thread Dave May
On Sat, 29 Sep 2018 at 16:09, Matthew Knepley  wrote:

> On Sat, Sep 29, 2018 at 9:47 AM zakaryah  wrote:
>
>> Hi Matt - thanks for all your help.
>>
>> Let's say I want exactly the same solver for the tangent vector and the
>> SNES update, so I should reuse the KSP.
>>
>
If you want to do this, there is no need or reason to call
KSPSetFromOptions() inside your Jacobian evaluator - just call
SNESSetFromOptions once on the outer object.



>> My attempt to do this looks like the summary of FormJacobian() in the
>> previous message:
>>
>>- assemble Jacobian
>>- assemble RHS
>>- get KSP from the SNES passed to FormJacobian()
>>- KSPSetFromOptions()
>>- KSPSetOperators()
>>- KSPSolve()
>>
>> I'm not sure that's the right approach, but it doesn't work - the
>> KSPSolve() in the summary, i.e. the one for the tangent vector, seems to
>> work fine.  But the next KSPSolve() looks strange - it seems to use a
>> preconditioner even with -pc_type none, etc.  This makes me think I am
>> doing something seriously wrong.
>>
>
> 1) To get things going, just make the separate. Then we can optimize.
>
> 2) Give -ksp_view -ksp_monitor_true_residual -ksp_converged_reason to see
> what is happening
>
>   Matt
>
>
>> On Sat, Sep 29, 2018, 8:16 AM Matthew Knepley  wrote:
>>
>>> On Fri, Sep 28, 2018 at 11:13 PM zakaryah  wrote:
>>>
 I'm working on a homotopy solver which follows a zero curve through
 solution space.  A central aspect of this method is to calculate the vector
 tangent to the curve, predict the next point, then correct using iteration
 of, e.g. Newton's method, orthogonal to the tangent vector.

 Previously, we discussed the possibilities of implementing this within
 PETSc's SNES.  Within my FormJacobian() function, I construct the linear
 system which defines the tangent vector, solve it, then add the vector to
 the nullspace of the Jacobian.  I think that in principle this can work,
 but I suspect I'm doing something wrong.

 Here's a summary of the code within FormJacobian():

- Set values and assemble Jacobian matrix A - this is working fine
- Set values and assemble RHS vector b for linear system defining
tangent vector n - this is working fine
- SNESGetKSP(snes,_ksp) - I thought it made sense to use the KSP
associated with the SNES, hoping that PCs which use a factorization 
 could
be reused when the SNES calls KSPSolve() to calculate the update
- KSPSetFromOptions(my_ksp) - not sure this is necessary but one of
my problems is setting options for this KSP from the command line and 
 even
with this call it doesn't seem to be working properly
- MatSetNullSpace(A,NULL) - remove any existing null space from
Jacobian
- KSPSetOperators(my_ksp,A,P) - P is the other matrix in
FormJacobian()
- VecSet(n,0) - set initial guess to zero
- KSPSolve(my_ksp,b,n) - these solves appear to work, i.e. use the
options passed from the command line with -ksp_XXX or -pc_XXX
- VecNormalize(n,NULL)
-

 MatNullSpaceCreate(PetscObjectComm((PetscObject)A),PETSC_FALSE,1,,)
- MatSetNullSpace(A,nullsp)
- MatNullSpaceDestroy()
- return

 The immediate problem is that the subsequent KSPSolve(), i.e. the one
 called internally by SNESSolve(), behaves strangely.  For example, if I use
 -pc_type none -ksp_monitor -ksp_monitor_true_residual, then the KSPSolve()
 that I call within FormJacobian() looks correct - "preconditioned" norm and
 true norm are identical, and both converge as I expect (i.e. slowly but
 geometrically).  However, the subsequent KSPSolve(), internal to the
 SNESSolve(), has large differences between the preconditioned norm and the
 true norm.  In addition, the KSP does not converge in the true residual,
 but I'll have a hard time debugging that without knowing how to properly
 set the options.

>>>
>>> We need to clear up the usage first. If you want EXACTLY the same solver
>>> for both solvers, then reuse
>>> the KSP, otherwise do not do it. Does it work then?
>>>
>>>   Thanks,
>>>
>>>  Matt
>>>
>>>
 I hope someone can help me see what I'm doing wrong.

 On Sun, Jul 22, 2018 at 9:09 PM zakaryah  wrote:

> Thanks Matt and Barry,
>
> Matt - if I do the calculation in FormJacobian(), which makes by far
> the most sense and is as per your suggestion, do I need to set the
> operators of the SNES's KSP back to whatever they were before I set them? 
>  The
> linear system I want to solve within FormJacobian() involves the Jacobian
> matrix itself, and I want to remove the "nullspace" from that same matrix
> within FormFunction().
>
> Barry - I'm trying to implement a homotopy solver.  In short, I have a
> system of n nonlinear equations in n variables, F(x), 

Re: [petsc-users] Checking if a vector is a localvector of a given DMDA

2018-09-25 Thread Dave May
On Tue, 25 Sep 2018 at 13:31, Phil Tooley 
wrote:

> Thanks both,
>
> I now have what I need.  For now I am checking that the vector I am passed
> has the same local size, global size, and Comm as the vector provided by
> DMGetLocalVector, mostly because I already have a compatibility check
> function written.  (I assume this requires a malloc and free behind the
> scenes)
>

Not necessarily. The Get/Restore strategy will re-use internally cached
vectors.


> At some point I will likely change to explicitly checking for comm size of
> one and appropriate global and local sizes based on the DMDA properties
> instead, for now I want to get to an alpha version I can let people play
> with.
>
> Phil
>
> On 25/09/18 13:07, Dave May wrote:
>
>
>
> On Tue, 25 Sep 2018 at 13:20, Matthew Knepley  wrote:
>
>> On Tue, Sep 25, 2018 at 7:03 AM Dave May  wrote:
>>
>>> On Tue, 25 Sep 2018 at 11:49, Phil Tooley 
>>> wrote:
>>>
>>>> Hi all,
>>>>
>>>> Given a vector I know I can get an associated DM (if there is one) by
>>>> calling VecGetDM, but I need to also be able to check that
>>>>
>>>> a) the vector is the localvector of that DM rather than the global
>>>>
>>>
>>> Given the vector, you can check the communicator size via
>>> PetscObjectGetComm()
>>>
>>>
>>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Sys/PetscObjectGetComm.html
>>> and then MPI_Comm_size()
>>> If the comm size is 1, it is a local vector.
>>>
>>
>> In serial, both vectors have comm size 1.
>>
>
> Right - and the local and global sizes are the same.
>
>  My point was to check the comm size first. If it's 1 then you have a
> candidate for a local vector. Then you'd check the vec global size matches
> the dmda local size. If the commsize is anything other than 1 then it
> cannot be a local vector
>
>
>>Matt
>>
>>
>>> You can check the size matches your local DMDA space by using
>>> DMDAGetGhostCorners()
>>>
>>>
>>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DMDA/DMDAGetGhostCorners.html
>>>
>>> and return the quantities m, n, and p.
>>>
>>> You also need to use  DMDAGetInfo()
>>>
>>>
>>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DMDA/DMDAGetInfo.html
>>>
>>> The important quantity you want returned is "dof"
>>>
>>> If m x n x p x dof matches the number returned by VecGetSize() (assuming
>>> you know the vector is sequential) then you know the local space will fit
>>> within your vector.
>>>
>>>
>>>
>>>>
>>>> b) the DM is a DMDA rather than some other subclass
>>>>
>>>
>>> See Matt's answer
>>>
>>>
>>>>
>>>> I can't immediately see routines that do what I need, but I am likely
>>>> missing something obvious. Is there a way to achieve the above?
>>>>
>>>> Thanks
>>>>
>>>> Phil
>>>>
>>>> --
>>>> Phil Tooley
>>>> Research Software Engineering
>>>> University of Sheffield
>>>>
>>>>
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which their
>> experiments lead.
>> -- Norbert Wiener
>>
>> https://www.cse.buffalo.edu/~knepley/
>> <http://www.cse.buffalo.edu/%7Eknepley/>
>>
>
> --
> Phil Tooley
> Research Software Engineering
> University of Sheffield
>
>


Re: [petsc-users] Checking if a vector is a localvector of a given DMDA

2018-09-25 Thread Dave May
On Tue, 25 Sep 2018 at 13:20, Matthew Knepley  wrote:

> On Tue, Sep 25, 2018 at 7:03 AM Dave May  wrote:
>
>> On Tue, 25 Sep 2018 at 11:49, Phil Tooley 
>> wrote:
>>
>>> Hi all,
>>>
>>> Given a vector I know I can get an associated DM (if there is one) by
>>> calling VecGetDM, but I need to also be able to check that
>>>
>>> a) the vector is the localvector of that DM rather than the global
>>>
>>
>> Given the vector, you can check the communicator size via
>> PetscObjectGetComm()
>>
>>
>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Sys/PetscObjectGetComm.html
>> and then MPI_Comm_size()
>> If the comm size is 1, it is a local vector.
>>
>
> In serial, both vectors have comm size 1.
>

Right - and the local and global sizes are the same.

 My point was to check the comm size first. If it's 1 then you have a
candidate for a local vector. Then you'd check the vec global size matches
the dmda local size. If the commsize is anything other than 1 then it
cannot be a local vector


>Matt
>
>
>> You can check the size matches your local DMDA space by using
>> DMDAGetGhostCorners()
>>
>>
>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DMDA/DMDAGetGhostCorners.html
>>
>> and return the quantities m, n, and p.
>>
>> You also need to use  DMDAGetInfo()
>>
>>
>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DMDA/DMDAGetInfo.html
>>
>> The important quantity you want returned is "dof"
>>
>> If m x n x p x dof matches the number returned by VecGetSize() (assuming
>> you know the vector is sequential) then you know the local space will fit
>> within your vector.
>>
>>
>>
>>>
>>> b) the DM is a DMDA rather than some other subclass
>>>
>>
>> See Matt's answer
>>
>>
>>>
>>> I can't immediately see routines that do what I need, but I am likely
>>> missing something obvious. Is there a way to achieve the above?
>>>
>>> Thanks
>>>
>>> Phil
>>>
>>> --
>>> Phil Tooley
>>> Research Software Engineering
>>> University of Sheffield
>>>
>>>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
> <http://www.cse.buffalo.edu/~knepley/>
>


Re: [petsc-users] Checking if a vector is a localvector of a given DMDA

2018-09-25 Thread Dave May
On Tue, 25 Sep 2018 at 11:49, Phil Tooley 
wrote:

> Hi all,
>
> Given a vector I know I can get an associated DM (if there is one) by
> calling VecGetDM, but I need to also be able to check that
>
> a) the vector is the localvector of that DM rather than the global
>

Given the vector, you can check the communicator size via
PetscObjectGetComm()


https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Sys/PetscObjectGetComm.html
and then MPI_Comm_size()
If the comm size is 1, it is a local vector.

You can check the size matches your local DMDA space by using
DMDAGetGhostCorners()

https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DMDA/DMDAGetGhostCorners.html

and retrieve the quantities m, n, and p.

You also need to use  DMDAGetInfo()

https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DMDA/DMDAGetInfo.html

The important quantity you want returned is "dof"

If m x n x p x dof matches the number returned by VecGetSize() (assuming
you know the vector is sequential) then you know the local space will fit
within your vector.
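Putting the above together, a compatibility check could look roughly like the sketch below (error checking omitted, written assuming the 3D quantities m, n, p described above; the function name is just illustrative):

PetscBool VecLooksLikeLocalVecOfDMDA(DM da,Vec v)
{
  MPI_Comm    comm;
  PetscMPIInt csize;
  PetscInt    m,n,p,dof,N;

  PetscObjectGetComm((PetscObject)v,&comm);
  MPI_Comm_size(comm,&csize);
  if (csize != 1) return PETSC_FALSE;               /* local vectors are sequential */
  DMDAGetGhostCorners(da,NULL,NULL,NULL,&m,&n,&p);  /* ghosted (local) extent       */
  DMDAGetInfo(da,NULL,NULL,NULL,NULL,NULL,NULL,NULL,&dof,NULL,NULL,NULL,NULL,NULL);
  VecGetSize(v,&N);
  return (m*n*p*dof == N) ? PETSC_TRUE : PETSC_FALSE;
}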



>
> b) the DM is a DMDA rather than some other subclass
>

See Matt's answer


>
> I can't immediately see routines that do what I need, but I am likely
> missing something obvious. Is there a way to achieve the above?
>
> Thanks
>
> Phil
>
> --
> Phil Tooley
> Research Software Engineering
> University of Sheffield
>
>


Re: [petsc-users] Use block Jacobi preconditioner with SNES

2018-08-27 Thread Dave May
On Mon, 27 Aug 2018 at 10:12, Ali Reza Khaz'ali 
wrote:

>  > Okay, interesting.  I take it you either are not running in parallel
> or need to have several subdomains (of varying size) per process.
>  > One approach would be to use PCASM (with zero overlap, it is
> equivalent to Block Jacobi) and then use -mat_partitioning_type to
> select a partitioner (could be a general graph partitioner or could be a
> custom implementation that you provide).  I don't know if this would
> feel overly clumsy or ultimately be a cleaner and more generic approach.
>
> Thanks for the answer. I'm still running a serial code. I plan to
> parallelize it after finding a suitable solver. Unfortunately, I do not
> know how to use PCASM, and therefore, I'm going to learn it. In
> addition, I found another possible solution with MATNEST. However, I do
> not know if MATNEST is suitable for my application or if it can be used
> with SNES. I'd be grateful if you could kindly guide me about it.
>
>
>
>  > Lets discuss this point a bit further. I assume your system is
> sparse. Sparse direct solvers can solve systems fairly efficiently for
> hundreds of thousands of unknowns. How big do you want? Also, do you
> plan on having more than 500K unknowns per process? If not, why not just
> use sparse direct solvers on each process?
>
> Thanks for the answer. My system is sparse, and also a variable sized
> block matrix. For example, for a small size simulation, I have about 7K
> unknowns. For ONE TIME STEP, Intel MKL PARDISO took about 30 minutes to
> solve my system, while occupying about 2.5GB out of my 4GB RAM. Since I
> have to simulate at least 1 time steps, the runtime (and the
> required RAM) would be unacceptable.
>
>
>
>  > If none of the suggestions provided is to your taste, why not just
> build the preconditioner matrix yourself? Seems you have precise
> requirements and the relevant info of the individual blocks, so you
> should be able to construct the preconditioner, either using A (original
> operator) or directly from the discrete problem.
>
> Thanks for your answer. As I stated, I have built a preconditioner for
> it.


I mean directly build the preconditioner with the required block diagonal
structure. Then you don't need to use something like PCBJACOBI or PCASM to
extract the block diagonal operator from your original operator.
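As a rough serial sketch of what I mean (this is not your code; it assumes the blocks partition the rows contiguously, that A is already assembled, and it omits error checking):

/* Copy only the variable-sized diagonal blocks of A into a second matrix P and
   pass P as the preconditioning matrix, e.g. SNESSetJacobian(snes,A,P,FormJacobian,ctx).
   A direct or incomplete factorization applied to P then acts as a variable-block
   Jacobi preconditioner. */
PetscErrorCode BuildBlockDiagonalPmat(Mat A,PetscInt nb,const PetscInt bsizes[],Mat *P)
{
  PetscInt b,i,j,n,row0,*nnz;

  MatGetSize(A,&n,NULL);
  PetscMalloc1(n,&nnz);
  for (b=0,row0=0; b<nb; b++) {                /* preallocate one dense block per row */
    for (i=0; i<bsizes[b]; i++) nnz[row0+i] = bsizes[b];
    row0 += bsizes[b];
  }
  MatCreateSeqAIJ(PETSC_COMM_SELF,n,n,0,nnz,P);
  PetscFree(nnz);
  for (b=0,row0=0; b<nb; b++) {
    for (i=row0; i<row0+bsizes[b]; i++) {
      for (j=row0; j<row0+bsizes[b]; j++) {
        PetscScalar v;
        MatGetValues(A,1,&i,1,&j,&v);          /* entries not stored in A come back as 0 */
        if (v != 0.0) MatSetValue(*P,i,j,v,INSERT_VALUES);
      }
    }
    row0 += bsizes[b];
  }
  MatAssemblyBegin(*P,MAT_FINAL_ASSEMBLY);
  MatAssemblyEnd(*P,MAT_FINAL_ASSEMBLY);
  return 0;
}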

> My preconditioner does not require a large memory, however, it has a
> low performance (even on GPU). Therefore, I'm trying to use PETSc
> functions and modules to solve the system more efficiently. I do not
> think there is any other library more suited than PETSc for the job.
>
>
>
> Best Regards,
> Ali
>


Re: [petsc-users] Use block Jacobi preconditioner with SNES

2018-08-27 Thread Dave May
If none of the suggestions provided is to your taste, why not just build
the preconditioner matrix yourself? Seems you have precise requirements
and the relevant info of the individual blocks, so you should be able to
construct the preconditioner, either using A (original operator) or
directly from the discrete problem.



On Mon, 27 Aug 2018 at 05:33, Matthew Knepley  wrote:

> On Sat, Aug 25, 2018 at 2:30 PM Ali Reza Khaz'ali 
> wrote:
>
>> Dear Barry and Jed,
>>
>> Thanks for your great suggestions. I set the options as Jed proposed and
>> it worked. However, Block Jacobi preconditioner treats the entire
>> Jacobian matrix as only one giant block. It is good in small systems,
>> but in larger ones, it requires a lot of memory and runtime, as direct
>> methods do.  In addition, I cannot use -pc_bjacobi_blocks option,
>> because the block sizes of the Jacobian matrix are variable (in my
>> sample code, SadeqSize array contain block sizes), and apparently
>> -pc_bjacobi_blocks option assumes the block sizes are equal.
>> I have to add an explanation about the unusually high tolerances in my code
>> sample, in fact, I am facing a serious ill-conditioned problem. The
>> direct methods (like LU) solve the system without problem, but I could
>> not find any preconditioner+iterative solver that can solve it with the
>> required precision.  Since direct solvers are useless in large systems,
>>
>
> Lets discuss this point a bit further. I assume your system is sparse.
> Sparse
> direct solvers can solve systems fairly efficiently for hundreds of
> thousands of
> unknowns. How big do you want? Also, do you plan on having more than 500K
> unknowns per process? If not, why not just use sparse direct solvers on
> each process?
>
>   Thanks,
>
>  Matt
>
> I had to design a specific preconditioner for it (I wrote a paper about
>> it: https://link.springer.com/article/10.1007/s10596-014-9421-3), but
>> the designed preconditioner was rather slow (even on GPU), and I hope
>> that I can find a faster solution in PETSc. So for now I am searching
>> for a combination that just works for me, and then, I'll refine it.
>>
>> Best wishes,
>> Ali
>>
>> On 8/25/2018 4:31 AM, Smith, Barry F. wrote:
>> > What you would like to do is reasonable but unfortunately the order
>> of the operations in SNES/KSP means that the KSPSetUp() cannot be called
>> before the SNESSolve() because the Jacobian matrix has yet to be provided
>> to KSP. Thus you have to, as Jed suggested, use the options database to set
>> the options. Note that if you want the options "hardwired" into the code
>> you can use PetscOptionsSetValue() from the code.
>> >
>> >Barry
>> >
>> >
>> >> On Aug 24, 2018, at 1:48 PM, Ali Reza Khaz'ali 
>> wrote:
>> >>
>> >> Hi,
>> >>
>> >> I am trying to use block Jacobi preconditioner in SNES (SNESNEWTONLS).
>> However, PCBJacobiGetSubKSP function returns an error stating "Object is in
>> wrong state, Must call KSPSetUp() or PCSetUp() first". When I add KSPSetUp,
>> I got and error from them as: "Must call DMShellSetGlobalVector() or
>> DMShellSetCreateGlobalVector()", and if PCSetUp is added, "Object is in
>> wrong state, Matrix must be set first" error is printed.
>> >>
>> >> Below is a part of my code. It is run serially. Any help is much
>> appreciated.
>> >>
>> >>  ierr = SNESGetKSP(snes, &Petsc_ksp);
>> >>  CHKERRQ(ierr);
>> >>  ierr = KSPGetPC(Petsc_ksp, &Petsc_pc);
>> >>  CHKERRQ(ierr);
>> >>  ierr = KSPSetTolerances(Petsc_ksp, 1.e-3, 1e-3, PETSC_DEFAULT,
>> 2);
>> >>  CHKERRQ(ierr);
>> >>  ierr = SNESSetTolerances(snes, 1e-1, 1e-1, 1e-1, 2000, 2000);
>> >>  CHKERRQ(ierr);
>> >>  ierr = SNESSetType(snes, SNESNEWTONLS);
>> >>  CHKERRQ(ierr);
>> >>  ierr = KSPSetType(Petsc_ksp, KSPGMRES);
>> >>  CHKERRQ(ierr);
>> >>  ierr = PCSetType(Petsc_pc, PCBJACOBI);
>> >>  CHKERRQ(ierr);
>> >>  ierr = PCSetType(Petsc_pc, PCBJACOBI);
>> >>  CHKERRQ(ierr);
>> >>  ierr = PCBJacobiSetTotalBlocks(Petsc_pc, 2*Nx*Ny*Nz, SadeqSize);
>> >>  CHKERRQ(ierr);
>> >>
>> >>  SNESSetUp(snes);
>> >>  CHKERRQ(ierr);
>> >>  ierr = PCBJacobiGetSubKSP(Petsc_pc, , , );
>> >>  CHKERRQ(ierr);
>> >>
>> >>  for (i = 0; i < nLocal; i++) {
>> >>  ierr = KSPGetPC(subKSP[i], &SubPc);
>> >>  CHKERRQ(ierr);
>> >>  ierr = PCSetType(SubPc, PCLU);
>> >>  CHKERRQ(ierr);
>> >>  ierr = PCFactorSetMatSolverPackage(SubPc, "mkl_pardiso");
>> >>  CHKERRQ(ierr);
>> >>  ierr = KSPSetType(subKSP[i], KSPPREONLY);
>> >>  CHKERRQ(ierr);
>> >>  ierr = KSPSetTolerances(subKSP[i], 1.e-6, PETSC_DEFAULT,
>> PETSC_DEFAULT, PETSC_DEFAULT);
>> >>  CHKERRQ(ierr);
>> >>  }
>> >>  ierr = SNESSolve(snes, NULL, Petsc_X);
>> >>  CHKERRQ(ierr);
>> >>
>> >>
>> >>
>> >> --
>> >> Ali Reza Khaz’ali
>> >> Assistant Professor of Petroleum Engineering,
>> >> Department of Chemical Engineering
>> >> Isfahan University of 

Re: [petsc-users] Question about I/O in PETSc

2018-08-26 Thread Dave May
On Sun, 26 Aug 2018 at 03:54, Yingjie Wu  wrote:

> Dear PETSc developer:
> Hello,
> I am a student of nuclear energy science from Tsinghua University. I want
> to do some work of neutron numerical simulation based on PETSc. At present,
> some examples of learning and testing PETSc have the following questions:
>
> 1. How to input data to PETSc conveniently?
>

What sort of data? Heavy data or stuff like model parameters / simulation
params / meta data?

For single values or relatively short arrays the methods
PetscOptionsGetInt()
and
PetscOptionsGetIntArray()
are useful. There are variants for reals, bools and chars as well.

See

http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Sys/PetscOptionsGetInt.html

Options can be parsed from the command line or collected in a text file
which is parsed.
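A minimal sketch of that pattern (the option names -nx and -kappa are made up for illustration, and error checking is omitted):

#include <petscsys.h>

int main(int argc,char **argv)
{
  PetscInt  nx = 10;          /* default value, overridden by -nx <int>              */
  PetscReal kappa[8];
  PetscInt  nk = 8;           /* in: max entries to read; out: number actually read  */
  PetscBool set;

  PetscInitialize(&argc,&argv,NULL,NULL);
  PetscOptionsGetInt(NULL,NULL,"-nx",&nx,&set);
  PetscOptionsGetRealArray(NULL,NULL,"-kappa",kappa,&nk,&set);
  PetscPrintf(PETSC_COMM_WORLD,"nx = %d, read %d kappa values\n",(int)nx,(int)nk);
  PetscFinalize();
  return 0;
}

You would run it as, e.g., ./ex -nx 64 -kappa 1.0,2.5,0.1, or collect the same
options in a text file and pass it with -options_file.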

> In some examples of VEC, PETSc supports data input and output in .dat and
> HDF5 format. I don't know much about HDF5, and the binary data file I can't
> open with any software.
>

You can use MATLAB
See here

https://www.mcs.anl.gov/petsc/documentation/faq.html#matlab

I suggest following point b

Thanks,
  Dave


> I need an easy-to-view input and output format, preferably a file format
> that can be opened with a text editor for subsequent analysis and data
> loading.
>

> I'm not very good at using "petsc-users" mailing list. I'm very sorry if I
> disturb you.
> And I am looking forward to your reply.
> Thanks,
> Yingjie
>


Re: [petsc-users] MatSetValues error with ViennaCL types

2018-08-15 Thread Dave May
On Thu, 16 Aug 2018 at 04:44, Manuel Valera  wrote:

> Thanks Matthew and Barry,
>
> Now my code looks like:
>
> call DMSetMatrixPreallocateOnly(daDummy,PETSC_TRUE,ierr)
>
> call DMSetMatType(daDummy,MATMPIAIJVIENNACL,ierr)
>> call DMSetVecType(daDummy,VECMPIVIENNACL,ierr)
>>
> call DMCreateMatrix(daDummy,A,ierr)
>> call MatSetFromOptions(A,ierr)
>
> call MatSetUp(A,ierr)
>> [...]
>> call
>> MatSetValues(A,1,row,sumpos,pos(0:iter-1),vals(0:iter-1),INSERT_VALUES,ierr)
>> [...]
>> call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr)
>> call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr)
>
>
> And i get a different error, now is:
>
> [0]PETSC ERROR: - Error Message
> --
> [0]PETSC ERROR: Argument out of range
> [0]PETSC ERROR: Column too large: col 10980 max 124
> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html
> for trouble shooting.
> [0]PETSC ERROR: Petsc Development GIT revision: v3.9.2-549-g779ab53  GIT
> Date: 2018-05-31 17:31:13 +0300
> [0]PETSC ERROR: ./gcmLEP.GPU on a cuda-debug named node50 by valera Wed
> Aug 15 19:40:00 2018
> [0]PETSC ERROR: Configure options PETSC_ARCH=cuda-debug
> --with-mpi-dir=/usr/lib64/openmpi --COPTFLAGS=-O2 --CXXOPTFLAGS=-O2
> --FOPTFLAGS=-O2 --with-shared-libraries=1 --with-debugging=1 --with-cuda=1
> --CUDAFLAGS=-arch=sm_60 --with-blaslapack-dir=/usr/lib64 --download-viennacl
> [0]PETSC ERROR: #1 MatSetValues_SeqAIJ() line 442 in
> /home/valera/petsc/src/mat/impls/aij/seq/aij.c
> [0]PETSC ERROR: #2 MatSetValues() line 1339 in
> /home/valera/petsc/src/mat/interface/matrix.c
>
>
> Thanks again,
>


This error has nothing to do with the matrix type being used. The size of the
matrix is defined by the particular DM. You should be using DM associated
data/APIs to set values in the matrix.

It's not obvious how the args here

  call MatSetValues(A,1,row,sumpos,pos(0:iter-1),vals(0:iter-1),INSERT_VALUES,ierr)

actually relate to the DM. Code snippets aren't helpful in this case to
understand the error.

I suggest you send a complete example illustrating your actual problem.

Thanks,
  Dave




>
>
>
>
>
>
>
> On Wed, Aug 15, 2018 at 7:02 PM, Smith, Barry F. 
> wrote:
>
>>
>>   Should be
>>
>> call DMSetMatType(daDummy,MATMPIAIJVIENNACL,ierr)
>> call DMSetVecType(daDummy,VECMPIVIENNACL,ierr)
>> call DMCreateMatrix(daDummy,A,ierr)
>>
>>   and remove the rest. You need to set the type of Mat you want the DM to
>> return BEFORE you create the matrix.
>>
>>   Barry
>>
>>
>>
>> > On Aug 15, 2018, at 4:45 PM, Manuel Valera  wrote:
>> >
>> > Ok thanks for clarifying that, i wasn't sure if there were different
>> types,
>> >
>> > Here is a stripped down version of my code, it seems like the
>> preallocation is working now since the matrix population part is working
>> without problem, but here it is for illustration purposes:
>> >
>> > call DMSetMatrixPreallocateOnly(daDummy,PETSC_TRUE,ierr)
>> > call DMCreateMatrix(daDummy,A,ierr)
>> > call MatSetFromOptions(A,ierr)
>> > call DMSetMatType(daDummy,MATMPIAIJVIENNACL,ierr)
>> > call DMSetVecType(daDummy,VECMPIVIENNACL,ierr)
>> > call
>> MatMPIAIJSetPreallocation(A,19,PETSC_NULL_INTEGER,19,PETSC_NULL_INTEGER,ierr)
>> > call MatSetUp(A,ierr)
>> > [...]
>> > call
>> MatSetValues(A,1,row,sumpos,pos(0:iter-1),vals(0:iter-1),INSERT_VALUES,ierr)
>> > [...]
>> > call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr)
>> > call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr)
>> >
>> > Adding the first line there did the trick,
>> >
>> > Now the problem seems to be the program is not recognizing the matrix
>> as ViennaCL type when i try with more than one processor, i get now:
>> >
>> > [0]PETSC ERROR: - Error Message
>> --
>> > [0]PETSC ERROR: No support for this operation for this object type
>> > [0]PETSC ERROR: Currently only handles ViennaCL matrices
>> > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html
>> for trouble shooting.
>> > [0]PETSC ERROR: Petsc Development GIT revision: v3.9.2-549-g779ab53
>> GIT Date: 2018-05-31 17:31:13 +0300
>> > [0]PETSC ERROR: ./gcmLEP.GPU on a cuda-debug named node50 by valera Wed
>> Aug 15 14:44:22 2018
>> > [0]PETSC ERROR: Configure options PETSC_ARCH=cuda-debug
>> --with-mpi-dir=/usr/lib64/openmpi --COPTFLAGS=-O2 --CXXOPTFLAGS=-O2
>> --FOPTFLAGS=-O2 --with-shared-libraries=1 --with-debugging=1 --with-cuda=1
>> --CUDAFLAGS=-arch=sm_60 --with-blaslapack-dir=/usr/lib64 --download-viennacl
>> > [0]PETSC ERROR: #1 PCSetUp_SAVIENNACL() line 47 in
>> /home/valera/petsc/src/ksp/pc/impls/saviennaclcuda/saviennacl.cu
>> > [0]PETSC ERROR: #2 PCSetUp() line 932 in
>> /home/valera/petsc/src/ksp/pc/interface/precon.c
>> > [0]PETSC ERROR: #3 KSPSetUp() line 381 in
>> /home/valera/petsc/src/ksp/ksp/interface/itfunc.c
>> >
>> > When running with:
>> >
>> > mpirun -n 1 ./gcmLEP.GPU tc=TestCases/LockRelease/LE_6x6x6/

Re: [petsc-users] memory corruption when using harmonic extraction with SLEPc

2018-08-06 Thread Dave May
Please always use "reply-all" so that your messages go to the list.
This is standard mailing list etiquette.  It is important to preserve
threading for people who find this discussion later and so that we do
not waste our time re-answering the same questions that have already
been answered in private side-conversations.  You'll likely get an
answer faster that way too.




On 6 August 2018 at 17:17, Moritz Cygorek  wrote:

> Hi,
>
>
> I have found that the memory corruption takes place at the end of the
> first iteration or the beginning of the second iteration, i.e., at a time
> where I expect the harmonic extraction part to do some calculations.
>

Where is the report from valgrind indicating the exact file/function/line
where the problem occurred?



> I've then played around with the command line options and I found that
> harmonic extraction works when -eps_ncv is set to smaller values.
>
> I have the feeling that the memory needs to sustain this number of vectors
> and the eigenvectors for all processes individually. If I run multiple
> processes on a single computer, much more memory is required for storage
> than when I only use a single process, eventually allocating all of the
> available memory
>
>
> Do you know if this behavior of the harmonic extraction routine
> intended/necessary?
>
>
> Regards,
> Moritz
>
>
>
>
>
>
>
> --
> *From:* Dave May 
> *Sent:* Friday, August 3, 2018 12:59:54 AM
> *To:* Moritz Cygorek
> *Cc:* petsc-users@mcs.anl.gov
> *Subject:* Re: [petsc-users] memory corruption when using harmonic
> extraction with SLEPc
>
> On Thu, 2 Aug 2018 at 21:32, Moritz Cygorek  wrote:
>
>> Hi,
>>
>>
>> I want to diagonalize a huge sparse matrix and I'm using the Krylov-Schur
>> method with harmonic extraction (command line option -eps_harmonic )
>> implemented in SLEPc.
>>
>>
>> I manually distribute a sparse matrix across several CPUs and everything
>> works fine when:
>>
>> - I do _not_ use harmonic extraction
>>
>> - I use harmonic extraction on only a single CPU
>>
>>
>> If I try to use harmonic extraction on multiple CPUs, I get a memory
>> corruption.
>>
>> I'm not quite sure where to look at, but somewhere in the output, I find:
>>
>>
>>
>> [1]PETSC ERROR: PetscMallocValidate: error detected at
>> PetscSignalHandlerDefault() line 145 in /home/applications/sources/
>> libraries/petsc-3.9.3/src/sys/error/signal.c
>> [1]PETSC ERROR: Memory [id=0(9072)] at address 0x145bcd0 is corrupted
>> (probably write past end of array)
>> [1]PETSC ERROR: Memory originally allocated in DSAllocateWork_Private()
>> line 74 in /home/applications/sources/libraries/slepc-3.9.2/src/sys/
>> classes/ds/interface/dspriv.c
>>
>>
>> Now, I have the feeling that this might be a bug in SLEPc because, if I
>> had messed up the matrix initialization and distribution, I should also get
>> a memory corruption when I don't use harmonic extraction, right?
>>
>
> Not necessarily
>
>
>> Any suggestions what might be going on?
>>
>
> Run your code through valgrind and make sure that your application code is
> clean. See here
>
> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
>
> If errors are detected in your application code, fix them and see if the
> slepc errors go away. If your code is valgrind clean, send through the
> relevant chunk of the valgrind report indicating exactly where the error is
> occurring
>
> Thanks,
>   Dave
>
>
>
>> Regards,
>>
>> Moritz
>>
>>
>>
>>
>>
>>
>>
>>


Re: [petsc-users] memory corruption when using harmonic extraction with SLEPc

2018-08-02 Thread Dave May
On Thu, 2 Aug 2018 at 21:32, Moritz Cygorek  wrote:

> Hi,
>
>
> I want to diagonalize a huge sparse matrix and I'm using the Krylov-Schur
> method with harmonic extraction (command line option -eps_harmonic )
> implemented in SLEPc.
>
>
> I manually distribute a sparse matrix across several CPUs and everything
> works fine when:
>
> - I do _not_ use harmonic extraction
>
> - I use harmonic extraction on only a single CPU
>
>
> If I try to use harmonic extraction on multiple CPUs, I get a memory
> corruption.
>
> I'm not quite sure where to look at, but somewhere in the output, I find:
>
>
>
> [1]PETSC ERROR: PetscMallocValidate: error detected at
> PetscSignalHandlerDefault() line 145 in
> /home/applications/sources/libraries/petsc-3.9.3/src/sys/error/signal.c
> [1]PETSC ERROR: Memory [id=0(9072)] at address 0x145bcd0 is corrupted
> (probably write past end of array)
> [1]PETSC ERROR: Memory originally allocated in DSAllocateWork_Private()
> line 74 in
> /home/applications/sources/libraries/slepc-3.9.2/src/sys/classes/ds/interface/dspriv.c
>
>
> Now, I have the feeling that this might be a bug in SLEPc because, if I
> had messed up the matrix initialization and distribution, I should also get
> a memory corruption when I don't use harmonic extraction, right?
>

Not necessarily


> Any suggestions what might be going on?
>

Run your code through valgrind and make sure that your application code is
clean. See here

http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind

If errors are detected in your application code, fix them and see if the
slepc errors go away. If your code is valgrind clean, send through the
relevant chunk of the valgrind report indicating exactly where the error is
occurring

Thanks,
  Dave



> Regards,
>
> Moritz
>
>
>
>
>
>
>
>


Re: [petsc-users] Fieldsplit - Schur Complement Reduction - Efficient Preconditioner for Schur Complement

2018-07-27 Thread Dave May
On Wed, 25 Jul 2018 at 12:24, Buesing, Henrik <
hbues...@eonerc.rwth-aachen.de> wrote:

> The problem is from two-phase flow in porous media. I have written down
> the equations and the operators from the 2x2 block Jacobian (see
> attachment).
>

Have you just tried plain old multiplicative fieldsplit with LU as the
preconditioner for the diagonal blocks?

What are the iteration counts like?

I think that choice should not be terrible for this system. I'd try this
first before looking at Schur complements.

The J11 block of the Jacobian is made elliptic.
>
>
>
> Maybe someone could tell me how to go on from there to build a good
> preconditioner?
>
> Thank you!
> Henrik
>
>
>
>
>
> --
>
> Dipl.-Math. Henrik Büsing
>
> Institute for Applied Geophysics and Geothermal Energy
>
> E.ON Energy Research Center
>
> RWTH Aachen University
>
>
>
> Mathieustr. 10| Tel +49 (0)241 80 49907
>
> 52074 Aachen, Germany | Fax +49 (0)241 80 49889
>
>
>
> http://www.eonerc.rwth-aachen.de/GGE
>
> hbues...@eonerc.rwth-aachen.de
>
>
>
> *Von:* Dave May 
> *Gesendet:* Mittwoch, 25. Juli 2018 11:37
> *An:* Buesing, Henrik 
> *Cc:* Matthew Knepley ; PETSc 
>
>
> *Betreff:* Re: [petsc-users] Fieldsplit - Schur Complement Reduction -
> Efficient Preconditioner for Schur Complement
>
>
>
>
>
>
>
> On 25 July 2018 at 10:34, Buesing, Henrik 
> wrote:
>
> Dear Matt! Dear Dave!
>
>
>
> Thank you for your messages! I pursued your option 1) and the solver I
> sent is what I ended up with. Thus, I would like to pursue option 2): Find
> a better preconditioner than the a11 block.
>
>
> From a technical viewpoint I understand how I would build a matrix that is
> used as a preconditioner for the Schur complement.
> But, from a mathematical viewpoint I do not know what to assemble. How do
> I find a good preconditioner for my problem? How would I tackle such a
> problem?
>
>
>
> Where does your discrete saddle point system come from?
>
> Stokes? Navier Stokes? Something else?
>
> Maybe someone on the list can advise you.
>
>
>
>
> Thank you!
> Henrik
>
>
>
>
>
> --
>
> Dipl.-Math. Henrik Büsing
>
> Institute for Applied Geophysics and Geothermal Energy
>
> E.ON Energy Research Center
>
> RWTH Aachen University
>
>
>
> Mathieustr. 10 | Tel +49 (0)241 80 49907
>
> 52074 Aachen, Germany | Fax +49 (0)241 80 49889
>
>
>
> http://www.eonerc.rwth-aachen.de/GGE
>
> hbues...@eonerc.rwth-aachen.de
>
>
>
> *Von:* Dave May 
> *Gesendet:* Mittwoch, 25. Juli 2018 11:14
> *An:* Matthew Knepley 
> *Cc:* Buesing, Henrik ; PETSc <
> petsc-users@mcs.anl.gov>
> *Betreff:* Re: [petsc-users] Fieldsplit - Schur Complement Reduction -
> Efficient Preconditioner for Schur Complement
>
>
>
>
>
>
>
> On 25 July 2018 at 09:48, Matthew Knepley  wrote:
>
> On Wed, Jul 25, 2018 at 4:24 AM Buesing, Henrik <
> hbues...@eonerc.rwth-aachen.de> wrote:
>
> Dear all,
>
> I would like to improve the iterative solver [1]. As I understand it I
> would need to improve the preconditioner for the Schur complement.
>
> How would I do that?
>
>
>
> 1) I always start from the exact thing (full Schur factorization with
> exact solves and back off parts until I am happy)
>
>
>
> 2) Is the a11 block a good preconditioner for your Schur complement? If
> not, I would start by replacing that matrix
>
> with something better.
>
>
>
> Some additional info. If you want to pursue option 2, you need to call
>
>
>
>   PCFieldSplitSetSchurPre()
>
>
>
>
> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCFieldSplitSetSchurPre.html#PCFieldSplitSetSchurPre
>
>
>
> with PC_FIELDSPLIT_SCHUR_PRE_USER
> <http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCFieldSplitSchurPreType.html#PCFieldSplitSchurPreType>
> (second arg) and your user defined schur complement preconditioner (last
> arg).
>
>
>
> Thanks,
>
>   Dave
>
>
>
>
>
>   Thanks,
>
>
>
> Matt
>
>
>
> Thank you for your help!
> Henrik
>
>
>
> [1]
> -ksp_max_it 100 -ksp_rtol 1e-6 -ksp_atol 1e-50 -ksp_type fgmres -pc_type
> fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_schur_precondition a11
> -fieldsplit_p_w_ksp_type preonly -fieldsplit_S_n_ksp_type gmres
> -fieldsplit_p_w_pc_type hypre -fieldsplit_p_w_pc_hypre_type boomeramg
> -fieldsplit_S_n_pc_type hypre -fieldsplit_S_n_pc_hypre_type boomeramg
> -fiel

Re: [petsc-users] Fieldsplit - Schur Complement Reduction - Efficient Preconditioner for Schur Complement

2018-07-25 Thread Dave May
On 25 July 2018 at 10:34, Buesing, Henrik 
wrote:

> Dear Matt! Dear Dave!
>
>
>
> Thank you for your messages! I pursued your option 1) and the solver I
> sent is what I ended up with. Thus, I would like to pursue option 2): Find
> a better preconditioner than the a11 block.
>
>
> From a technical viewpoint I understand how I would build a matrix that is
> used as a preconditioner for the Schur complement.
> But, from a mathematical viewpoint I do not know what to assemble. How do
> I find a good preconditioner for my problem? How would I tackle such a
> problem?
>

Where does your discrete saddle point system come from?
Stokes? Navier Stokes? Something else?
Maybe someone on the list can advise you.


>
> Thank you!
> Henrik
>
>
>
>
>
> --
>
> Dipl.-Math. Henrik Büsing
>
> Institute for Applied Geophysics and Geothermal Energy
>
> E.ON Energy Research Center
>
> RWTH Aachen University
>
>
>
> Mathieustr. 10 | Tel +49 (0)241 80 49907
>
> 52074 Aachen, Germany | Fax +49 (0)241 80 49889
>
>
>
> http://www.eonerc.rwth-aachen.de/GGE
>
> hbues...@eonerc.rwth-aachen.de
>
>
>
> *Von:* Dave May 
> *Gesendet:* Mittwoch, 25. Juli 2018 11:14
> *An:* Matthew Knepley 
> *Cc:* Buesing, Henrik ; PETSc <
> petsc-users@mcs.anl.gov>
> *Betreff:* Re: [petsc-users] Fieldsplit - Schur Complement Reduction -
> Efficient Preconditioner for Schur Complement
>
>
>
>
>
>
>
> On 25 July 2018 at 09:48, Matthew Knepley  wrote:
>
> On Wed, Jul 25, 2018 at 4:24 AM Buesing, Henrik <
> hbues...@eonerc.rwth-aachen.de> wrote:
>
> Dear all,
>
> I would like to improve the iterative solver [1]. As I understand it I
> would need to improve the preconditioner for the Schur complement.
>
> How would I do that?
>
>
>
> 1) I always start from the exact thing (full Schur factorization with
> exact solves and back off parts until I am happy)
>
>
>
> 2) Is the a11 block a good preconditioner for your Schur complement? If
> not, I would start by replacing that matrix
>
> with something better.
>
>
>
> Some additional info. If you want to pursue option 2, you need to call
>
>
>
>   PCFieldSplitSetSchurPre()
>
>
>
> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/
> PCFieldSplitSetSchurPre.html#PCFieldSplitSetSchurPre
>
>
>
> with PC_FIELDSPLIT_SCHUR_PRE_USER
> <http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCFieldSplitSchurPreType.html#PCFieldSplitSchurPreType>
> (second arg) and your user defined schur complement preconditioner (last
> arg).
>
>
>
> Thanks,
>
>   Dave
>
>
>
>
>
>   Thanks,
>
>
>
> Matt
>
>
>
> Thank you for your help!
> Henrik
>
>
>
> [1]
> -ksp_max_it 100 -ksp_rtol 1e-6 -ksp_atol 1e-50 -ksp_type fgmres -pc_type
> fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_schur_precondition
> a11 -fieldsplit_p_w_ksp_type preonly -fieldsplit_S_n_ksp_type gmres
> -fieldsplit_p_w_pc_type hypre -fieldsplit_p_w_pc_hypre_type boomeramg
> -fieldsplit_S_n_pc_type hypre -fieldsplit_S_n_pc_hypre_type boomeramg
> -fieldsplit_S_n_ksp_max_it 100 fieldsplit_S_n_ksp_rtol 1e-2
>
>
> --
> Dipl.-Math. Henrik Büsing
> Institute for Applied Geophysics and Geothermal Energy
> E.ON Energy Research Center
> RWTH Aachen University
>
> Mathieustr. 10 | Tel +49 (0)241 80 49907
> 52074 Aachen, Germany | Fax +49 (0)241 80 49889
>
> http://www.eonerc.rwth-aachen.de/GGE
> hbues...@eonerc.rwth-aachen.de
>
>
>
>
> --
>
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
>
>
> https://www.cse.buffalo.edu/~knepley/ <http://www.caam.rice.edu/~mk51/>
>
>
>


Re: [petsc-users] Fieldsplit - Schur Complement Reduction - Efficient Preconditioner for Schur Complement

2018-07-25 Thread Dave May
On 25 July 2018 at 09:48, Matthew Knepley  wrote:

> On Wed, Jul 25, 2018 at 4:24 AM Buesing, Henrik <
> hbues...@eonerc.rwth-aachen.de> wrote:
>
>> Dear all,
>>
>> I would like to improve the iterative solver [1]. As I understand it I
>> would need to improve the preconditioner for the Schur complement.
>>
>> How would I do that?
>>
>
> 1) I always start from the exact thing (full Schur factorization with
> exact solves and back off parts until I am happy)
>
> 2) Is the a11 block a good preconditioner for your Schur complement? If
> not, I would start by replacing that matrix
> with something better.
>

Some additional info. If you want to pursue option 2, you need to call

  PCFieldSplitSetSchurPre()

http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCFieldSplitSetSchurPre.html#PCFieldSplitSetSchurPre

with PC_FIELDSPLIT_SCHUR_PRE_USER

(second arg) and your user defined schur complement preconditioner (last
arg).
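In code that amounts to roughly the following sketch (Sp stands for your user-assembled approximation of the Schur complement S = A11 - A10 inv(A00) A01; assembling Sp is the problem-specific part, and the KSP/IS setup and error checking are omitted):

KSP ksp;  PC pc;  Mat Sp;                    /* Sp assembled by you            */
KSPGetPC(ksp,&pc);
PCSetType(pc,PCFIELDSPLIT);
PCFieldSplitSetType(pc,PC_COMPOSITE_SCHUR);  /* Schur complement reduction     */
PCFieldSplitSetSchurPre(pc,PC_FIELDSPLIT_SCHUR_PRE_USER,Sp);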

Thanks,
  Dave


>
>   Thanks,
>
> Matt
>
>
>> Thank you for your help!
>> Henrik
>>
>>
>>
>> [1]
>> -ksp_max_it 100 -ksp_rtol 1e-6 -ksp_atol 1e-50 -ksp_type fgmres -pc_type
>> fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_schur_precondition
>> a11 -fieldsplit_p_w_ksp_type preonly -fieldsplit_S_n_ksp_type gmres
>> -fieldsplit_p_w_pc_type hypre -fieldsplit_p_w_pc_hypre_type boomeramg
>> -fieldsplit_S_n_pc_type hypre -fieldsplit_S_n_pc_hypre_type boomeramg
>> -fieldsplit_S_n_ksp_max_it 100 fieldsplit_S_n_ksp_rtol 1e-2
>>
>>
>> --
>> Dipl.-Math. Henrik Büsing
>> Institute for Applied Geophysics and Geothermal Energy
>> E.ON Energy Research Center
>> RWTH Aachen University
>>
>> Mathieustr. 10 | Tel +49 (0)241 80 49907
>> 52074 Aachen, Germany | Fax +49 (0)241 80 49889
>>
>> http://www.eonerc.rwth-aachen.de/GGE
>> hbues...@eonerc.rwth-aachen.de
>>
>>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/ 
>


Re: [petsc-users] petsc4py: parallel matrix-vector multiplication

2018-05-06 Thread Dave May
On Sun, 6 May 2018 at 17:52, Robert Speck <r.sp...@fz-juelich.de> wrote:

> OK, thanks, I see. This is basically what happens in the poisson2d.py
> example, too, right?
>
> I tried it with the shell matrix (?) used in the poisson2d example and
> it works right away, but then I fail to see how to make use of the
> preconditioners for KSP (see my original message)..


If you want to assemble something within the operator created by
DMCreateMatrix(), take a look at this function

http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatSetValuesStencil.html

Study some of the examples listed at the bottom of the page.

(But you will have to concern yourself with the spatial decomposition, just
like in the poisson2d and Bratu example.)
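For reference, a single interior row of a 5-point Laplacian would be inserted like the C sketch below (boundary handling omitted; A is assumed to come from DMCreateMatrix() on the DMDA, and petsc4py exposes the same stencil-based insertion):

MatStencil  row, cols[5];
PetscScalar vals[5] = {4.0, -1.0, -1.0, -1.0, -1.0};

row.i = i;   row.j = j;               /* logical grid indices, not global indices */
cols[0].i = i;   cols[0].j = j;
cols[1].i = i-1; cols[1].j = j;
cols[2].i = i+1; cols[2].j = j;
cols[3].i = i;   cols[3].j = j-1;
cols[4].i = i;   cols[4].j = j+1;
MatSetValuesStencil(A,1,&row,5,cols,vals,INSERT_VALUES);
/* ... then MatAssemblyBegin/End(A,MAT_FINAL_ASSEMBLY) after all rows ... */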

Thanks,
  Dave




>
> Thanks again!
> -Robert-
>
>
> On 06.05.18 16:52, Jed Brown wrote:
> > Robert Speck <r.sp...@fz-juelich.de> writes:
> >
> >> Thanks for your reply and help. Yes, this is going to be a PDE solver
> >> for structured grids. The first goal would be IDC (or Crank-Nicholson)
> >> for the heat equation, which would require both solving a linear system
> >> and application of the matrix.
> >>
> >> The code I wrote for testing parallel matrix-vector multiplication can
> >> be found here:
> >>
> https://github.com/Parallel-in-Time/pySDC/blob/petsc/pySDC/playgrounds/PETSc/playground_matmult.py
> >>
> >> Both vectors and matrix come from the DMDA, but I guess filling them is
> >> done in a wrong way? Or do I need to convert global/natural vectors to
> >> local ones somewhere?
> >
> > Global and Natural are not the same (see user's manual for details).
> > The matrix acts on a Global vector.  See
> > petsc4py/demo/bratu3d/bratu3d.py for examples of efficiently setting
> > values (and computing residuals) using Global vectors.  It should be
> > simpler/cleaner code than you currently have.
> >
> >>
> >> Best
> >> -Robert-
> >>
> >>
> >>
> >> On 06.05.18 14:44, Dave May wrote:
> >>> On Sun, 6 May 2018 at 10:40, Robert Speck <r.sp...@fz-juelich.de
> >>> <mailto:r.sp...@fz-juelich.de>> wrote:
> >>>
> >>> Hi!
> >>>
> >>> I would like to do a matrix-vector multiplication (besides using
> linear
> >>> solvers and so on) with petsc4py. I took the matrix from this
> example
> >>>
> >>> (
> https://bitbucket.org/petsc/petsc4py/src/master/demo/kspsolve/petsc-mat.py
> )
> >>>
> >>>
> >>> This example only produces a matrix. And from the code the matrix
> >>> produced is identical in serial or parallel.
> >>>
> >>>
> >>>
> >>> and applied it to a PETSc Vector. All works well in serial, but in
> >>> parallel (in particular if ordering becomes relevant) the resulting
> >>> vector looks very different.
> >>>
> >>>
> >>> Given this, the way you defined the x vector in y = A x must be
> >>> different when run on 1 versus N mpi ranks.
> >>>
> >>>
> >>> Using the shell matrix of this example
> >>> (
> https://bitbucket.org/petsc/petsc4py/src/master/demo/poisson2d/poisson2d.py
> )
> >>> helps, but then I cannot use matrix-based preconditioners for KSP
> >>> directly (right?). I also tried using DMDA for creating vectors and
> >>> matrix and for taking care of their ordering (which seems to be my
> >>> problem here), but that did not help either.
> >>>
> >>> So, my question is this: How do I do easy parallel matrix-vector
> >>> multiplication with petsc4py in a way that allows me to use
> parallel
> >>> linear solvers etc. later on? I want to deal with spatial
> decomposition
> >>> as little as possible.
> >>>
> >>>
> >>> What's the context - are you solving a PDE?
> >>>
> >>> Assuming you are using your own grid object (e.g. as you might have if
> >>> solving a PDE), and assuming you are not solving a 1D problem, you
> >>> actually have to "deal" with the spatial decomposition otherwise
> >>> performance could be quite terrible - even for something simple like a
> 5
> >>> point Laplacian on a structured grid in 2D
> >>>
> >>> What data structures should I use? DMDA or
> >>> PETSc.Vec() and PETSc.Mat() or something else?

Re: [petsc-users] petsc4py: parallel matrix-vector multiplication

2018-05-06 Thread Dave May
On Sun, 6 May 2018 at 10:40, Robert Speck  wrote:

> Hi!
>
> I would like to do a matrix-vector multiplication (besides using linear
> solvers and so on) with petsc4py. I took the matrix from this example
>
(https://bitbucket.org/petsc/petsc4py/src/master/demo/kspsolve/petsc-mat.py)


This example only produces a matrix. And from the code the matrix produced
is identical in serial or parallel.



> and applied it to a PETSc Vector. All works well in serial, but in
> parallel (in particular if ordering becomes relevant) the resulting
> vector looks very different.


Given this, the way you defined the x vector in y = A x must be different
when run on 1 versus N mpi ranks.


> Using the shell matrix of this example
> (
> https://bitbucket.org/petsc/petsc4py/src/master/demo/poisson2d/poisson2d.py
> )
> helps, but then I cannot use matrix-based preconditioners for KSP
> directly (right?). I also tried using DMDA for creating vectors and
> matrix and for taking care of their ordering (which seems to be my
> problem here), but that did not help either.
>
> So, my question is this: How do I do easy parallel matrix-vector
> multiplication with petsc4py in a way that allows me to use parallel
> linear solvers etc. later on? I want to deal with spatial decomposition
> as little as possible.


What's the context - are you solving a PDE?

Assuming you are using your own grid object (e.g. as you might have if
solving a PDE), and assuming you are not solving a 1D problem, you actually
have to "deal" with the spatial decomposition otherwise performance could
be quite terrible - even for something simple like a 5 point Laplacian on a
structured grid in 2D

> What data structures should I use? DMDA or
> PETSc.Vec() and PETSc.Mat() or something else?


The mat vec product is not causing you a problem. Your issue appears to be
that you do not have a way to label entries in a vector in a consistent
manner.

What's the objective? Are you solving a PDE? If yes, structured grid? If
yes again, use the DMDA. It takes care of all the local-to-global and
global-to-local mapping you need.
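As a sketch of that workflow (shown with the C API; petsc4py provides the same functionality through its DMDA class; the grid size here is arbitrary and error checking is omitted):

DM  da;
Mat A;
Vec x,y;

DMDACreate2d(PETSC_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DMDA_STENCIL_STAR,
             65,65,PETSC_DECIDE,PETSC_DECIDE,1,1,NULL,NULL,&da);
DMSetUp(da);
DMCreateMatrix(da,&A);        /* sized and preallocated consistently with the DMDA */
DMCreateGlobalVector(da,&x);
VecDuplicate(x,&y);
/* fill A with MatSetValuesStencil(), fill x through DMDAVecGetArray() */
MatMult(A,x,y);               /* parallel mat-vec; the ordering is handled by the DMDA */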

Thanks,
  Dave


>
> Thanks!
> -Robert-
>
> --
> Dr. Robert Speck
> Juelich Supercomputing Centre
> Institute for Advanced Simulation
> Forschungszentrum Juelich GmbH
> 52425 Juelich, Germany
>
> Tel: +49 2461 61 1644
> Fax: +49 2461 61 6656
>
> Email:   r.sp...@fz-juelich.de
> Website: http://www.fz-juelich.de/ias/jsc/speck_r
> PinT:http://www.fz-juelich.de/ias/jsc/pint
>
>
>
>


Re: [petsc-users] Problems with VecGetArray under sub-communicators

2018-04-22 Thread Dave May
On Sun, 22 Apr 2018 at 20:13, Zin Lin  wrote:

> Hi
> I am experiencing possible memory issues with VecGetArray when it is used
> under sub-communicators (when I split the PETSC_COMM_WORLD to multiple
> subcomms). The following is the minimal code. Basically, you can test that
> if you parallelize the vector to more than one processor under a subcomm,
> the array obtained from the VecGetArray call doesn't seem to be correct.
> Please test it with
>
> 1)  mpirun -np 1 ./example -ncomms 1
> 2)  mpirun -np 2 ./example -ncomms 2
> 3)  mpirun -np 2 ./example -ncomms 1
> 4)  mpirun -np 4 ./example -ncomms 2
>
> you will 1) and 2) work as expected while in 3) and 4) some entries of the
> array are assigned erroneous values.
>

First - your call to PetscPrintf contains a mistake. The second instance of
i is missing.

Second (more important), you cannot access the raw array obtained via
VecGetArray() using global indices. You must use local indices.
Change the access to _u[i-ns] and the code should work.
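That is, the loop should look something like this sketch:

VecGetOwnershipRange(u,&ns,&ne);
VecGetArray(u,&_u);
for (i = ns; i < ne; i++) {      /* i runs over the global indices owned by this rank */
  PetscScalar v = _u[i - ns];    /* the raw array is addressed with local indices     */
  /* ... use v ... */
}
VecRestoreArray(u,&_u);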

Also, debugging based on what printf() tells you can be misleading. It's
much better to use valgrind - see here

https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind


Thanks,
  Dave


> Any input will be appreciated.
> Thanks
> Zin
>
> Minimal Code
>
> PetscErrorCode main(int argc, char **argv)
> {
>
>   MPI_Init(NULL, NULL);
>   PetscInitialize(&argc,&argv,NULL,NULL);
>   int size;
>   MPI_Comm_size(MPI_COMM_WORLD, &size);
>   PetscPrintf(PETSC_COMM_WORLD,"\tThe total number of processors is
> %d\n",size);
>
>   //check if the number of processors is divisible by the number of
> subcomms
>   int ncomms, np_per_comm;
>   PetscOptionsGetInt(NULL,"-ncomms",&ncomms,NULL);
>   if(!(size%ncomms==0)) SETERRQ(PETSC_COMM_WORLD,1,"The number of
> processes must be a multiple of ncomms so that it is divisible by the
> number of subcomms.");
>   np_per_comm=size/ncomms;
>   PetscPrintf(PETSC_COMM_WORLD,"\tThe number of subcomms is %d.\n\tEach
> subcomm has %d processors.\n",ncomms,np_per_comm);
>
>   //calculate the colour of each subcomm ( = rank of each processor /
> number of processors in each subcomm )
>   //note once calculated, the colour is fixed throughout the entire run
>   int rank;
>   MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>   MPI_Comm subcomm;
>   int colour = rank/np_per_comm;
>   MPI_Comm_split(MPI_COMM_WORLD, colour, rank, &subcomm);
>
>   Vec u;
>   PetscScalar *_u;
>   int i,ns,ne;
>   PetscScalar tmp;
>
>   VecCreateMPI(subcomm,PETSC_DECIDE,10,&u);
>   VecSet(u,1.0+PETSC_i*0);
>
>   VecGetArray(u,&_u);
>   VecGetOwnershipRange(u,&ns,&ne);
>   for(i=ns;i<ne;i++){
> VecGetValues(u,1,&i,&tmp);
> PetscPrintf(PETSC_COMM_SELF,"colour %d, u[%d]_array = %g + i * (%g),
> u[%d]_vec = %g + i * %g \n",colour,i,creal(_u[i]),cimag(_u[i]),
>

Insert i here

creal(tmp),cimag(tmp));
>   }
>   VecRestoreArray(u,&_u);
>
>   PetscFinalize();
>   return 0;
>
> }
>
>


Re: [petsc-users] error running parallel on cluster

2018-04-18 Thread Dave May
On 18 April 2018 at 23:52, Matthew Knepley <knep...@gmail.com> wrote:

> On Wed, Apr 18, 2018 at 5:52 PM, Sepideh Kavousi <skav...@lsu.edu> wrote:
>
>> Mathew and Dave,
>>
>> Thank you so much it is working perfectly now.
>>
>
> Excellent.
>
> If you want your name to appear on the next PETSc release as a
> contributor, you
> can make a PR with this change :)
>

Here is a URL describing the PR's protocol for PETSc contribs:

https://bitbucket.org/petsc/petsc/wiki/pull-request-instructions-git



>
>   Thanks,
>
>      Matt
>
>
>> Sepideh
>> --
>> *From:* Dave May <dave.mayhe...@gmail.com>
>> *Sent:* Wednesday, April 18, 2018 3:13:33 PM
>> *To:* Sepideh Kavousi
>> *Cc:* Matthew Knepley; petsc-users@mcs.anl.gov
>> *Subject:* Re: [petsc-users] error running parallel on cluster
>>
>>
>>
>> On 18 April 2018 at 21:06, Sepideh Kavousi <skav...@lsu.edu> wrote:
>>
>> Mathew
>>
>> I added the lines and I still have the same issue. It may be a silly
>> question but should I configure and install petsc again using this new
>> lines added? or changing the line is enough? the highlighted lines are the
>> lines I modified.
>>
>>
>> PetscErrorCode ierr;
>>   DM dm;
>>   DMTS_DA*dmdats = (DMTS_DA*)ctx;
>>   DMDALocalInfo  info;
>>   VecXloc,Xdotloc;
>>   void   *x,*f,*xdot;
>>
>>   PetscFunctionBegin;
>>   PetscValidHeaderSpecific(ts,TS_CLASSID,1);
>>   PetscValidHeaderSpecific(X,VEC_CLASSID,2);
>>   PetscValidHeaderSpecific(F,VEC_CLASSID,3);
>>   if (!dmdats->ifunctionlocal) SETERRQ(PetscObjectComm((Petsc
>> Object)ts),PETSC_ERR_PLIB,"Corrupt context");
>>   ierr = TSGetDM(ts,);CHKERRQ(ierr);
>>   ierr = DMGetLocalVector(dm,);CHKERRQ(ierr);
>>   ierr = DMGlobalToLocalBegin(dm,Xdot,INSERT_VALUES,Xdotloc);CHKERRQ(
>> ierr);
>>   ierr = DMGlobalToLocalEnd(dm,Xdot,INSERT_VALUES,Xdotloc);CHKERRQ(ierr);
>>
>>   ierr = DMGetLocalVector(dm,);CHKERRQ(ierr);
>>   ierr = DMGlobalToLocalBegin(dm,X,INSERT_VALUES,Xloc);CHKERRQ(ierr);
>>   ierr = DMGlobalToLocalEnd(dm,X,INSERT_VALUES,Xloc);CHKERRQ(ierr);
>>   ierr = DMDAGetLocalInfo(dm,);CHKERRQ(ierr);
>>   ierr = DMDAVecGetArray(dm,Xloc,);CHKERRQ(ierr);
>>   ierr = DMDAVecGetArray(dm,Xdotloc,);CHKERRQ(ierr);
>>
>>
>> Don't forget to include these calls (in this order) after you are done
>> with the Xdotloc vector
>>
>> ierr = DMDAVecRestoreArray(dm,Xdotloc,);CHKERRQ(ierr);
>> ierr = DMRestoreLocalVector(dm,);CHKERRQ(ierr);
>>
>> Failure to do so will result in a memory leak.
>>
>>
>>
>> Thanks,
>>
>> Sepideh
>> --
>> *From:* Matthew Knepley <knep...@gmail.com>
>> *Sent:* Tuesday, April 17, 2018 5:59:12 PM
>>
>> *To:* Sepideh Kavousi
>> *Cc:* petsc-users@mcs.anl.gov
>> *Subject:* Re: [petsc-users] error running parallel on cluster
>>
>> On Tue, Apr 17, 2018 at 3:07 PM, Sepideh Kavousi <skav...@lsu.edu> wrote:
>>
>> The reason  I can not use your method is that the input arguments of
>> SetIFunctionLocal are the arrays of x,x_t instead of x,x_t vectors. In your
>> method which was:
>>
>> ierr *=* DMGetLocalVector(dm,*&*Xdotloc);CHKERRQ(ierr);  ierr *=* 
>> DMGlobalToLocalBegin(dm,Xdot,INSERT_VALUES,Xdotloc);CHKERRQ(ierr);  ierr *=* 
>> DMGlobalToLocalEnd(dm,Xdot,INSERT_VALUES,Xdotloc);CHKERRQ(ierr);
>>   ierr *=* DMDAVecGetArray(dm,Xdotloc,*&*xdot);CHKERRQ(ierr);
>>
>>
>> You misunderstand my suggestion. I mean stick this code in right here in
>> PETSc
>>
>> https://bitbucket.org/petsc/petsc/annotate/be3efd428a942676a
>> 0189b3273b3c582694ff011/src/ts/utils/dmdats.c?at=master
>> viewer=file-view-default#dmdats.c-68
>>
>> Then the X_t array you get in your local function will be ghosted.
>>
>>Matt
>>
>>
>> I need to have the vector of Xdot, not the array. So I think I should use
>> SetIFunction instead of SetIFunctionLocal.
>>
>>
>> Sepideh
>> --
>> *From:* Matthew Knepley <knep...@gmail.com>
>> *Sent:* Tuesday, April 17, 2018 1:22:53 PM
>> *To:* Sepideh Kavousi
>> *Cc:* petsc-users@mcs.anl.gov
>> *Subject:* Re: [petsc-users] error running parallel on cluster
>>
>> On Tue, Apr 17, 2018 at 1:50 PM, Sepideh Kavousi <skav...@lsu.edu> wrote:
>>
>> Mathew,
>> I previously use DMDATSSetIF

Re: [petsc-users] error running parallel on cluster

2018-04-18 Thread Dave May
On 18 April 2018 at 21:06, Sepideh Kavousi  wrote:

> Mathew
>
> I added the lines and I still have the same issue. It may be a silly
> question but should I configure and install petsc again using this new
> lines added? or changing the line is enough? the highlighted lines are the
> lines I modified.
>
>
> PetscErrorCode ierr;
>   DM dm;
>   DMTS_DA*dmdats = (DMTS_DA*)ctx;
>   DMDALocalInfo  info;
>   VecXloc,Xdotloc;
>   void   *x,*f,*xdot;
>
>   PetscFunctionBegin;
>   PetscValidHeaderSpecific(ts,TS_CLASSID,1);
>   PetscValidHeaderSpecific(X,VEC_CLASSID,2);
>   PetscValidHeaderSpecific(F,VEC_CLASSID,3);
>   if (!dmdats->ifunctionlocal) SETERRQ(PetscObjectComm((
> PetscObject)ts),PETSC_ERR_PLIB,"Corrupt context");
>   ierr = TSGetDM(ts,);CHKERRQ(ierr);
>   ierr = DMGetLocalVector(dm,);CHKERRQ(ierr);
>   ierr = DMGlobalToLocalBegin(dm,Xdot,INSERT_VALUES,Xdotloc);
> CHKERRQ(ierr);
>   ierr = DMGlobalToLocalEnd(dm,Xdot,INSERT_VALUES,Xdotloc);CHKERRQ(ierr);
>   ierr = DMGetLocalVector(dm,);CHKERRQ(ierr);
>   ierr = DMGlobalToLocalBegin(dm,X,INSERT_VALUES,Xloc);CHKERRQ(ierr);
>   ierr = DMGlobalToLocalEnd(dm,X,INSERT_VALUES,Xloc);CHKERRQ(ierr);
>   ierr = DMDAGetLocalInfo(dm,);CHKERRQ(ierr);
>   ierr = DMDAVecGetArray(dm,Xloc,);CHKERRQ(ierr);
>   ierr = DMDAVecGetArray(dm,Xdotloc,);CHKERRQ(ierr);
>
>
Don't forget to include these calls (in this order) after you are done with
the Xdotloc vector

ierr = DMDAVecRestoreArray(dm,Xdotloc,);CHKERRQ(ierr);
ierr = DMRestoreLocalVector(dm,);CHKERRQ(ierr);

Failure to do so will result in a memory leak.



> Thanks,
>
> Sepideh
> --
> *From:* Matthew Knepley 
> *Sent:* Tuesday, April 17, 2018 5:59:12 PM
>
> *To:* Sepideh Kavousi
> *Cc:* petsc-users@mcs.anl.gov
> *Subject:* Re: [petsc-users] error running parallel on cluster
>
> On Tue, Apr 17, 2018 at 3:07 PM, Sepideh Kavousi  wrote:
>
> The reason  I can not use your method is that the input arguments of
> SetIFunctionLocal are the arrays of x,x_t instead of x,x_t vectors. In your
> method which was:
>
> ierr *=* DMGetLocalVector(dm,*&*Xdotloc);CHKERRQ(ierr);  ierr *=* 
> DMGlobalToLocalBegin(dm,Xdot,INSERT_VALUES,Xdotloc);CHKERRQ(ierr);  ierr *=* 
> DMGlobalToLocalEnd(dm,Xdot,INSERT_VALUES,Xdotloc);CHKERRQ(ierr);
>   ierr *=* DMDAVecGetArray(dm,Xdotloc,*&*xdot);CHKERRQ(ierr);
>
>
> You misunderstand my suggestion. I mean stick this code in right here in
> PETSc
>
> https://bitbucket.org/petsc/petsc/annotate/be3efd428a942676a
> 0189b3273b3c582694ff011/src/ts/utils/dmdats.c?at=master
> viewer=file-view-default#dmdats.c-68
>
> Then the X_t array you get in your local function will be ghosted.
>
>Matt
>
>
> I need to have the vector of Xdot, not the array. So I think I should use
> SetIFunction instead of SetIFunctionLocal.
>
>
> Sepideh
> --
> *From:* Matthew Knepley 
> *Sent:* Tuesday, April 17, 2018 1:22:53 PM
> *To:* Sepideh Kavousi
> *Cc:* petsc-users@mcs.anl.gov
> *Subject:* Re: [petsc-users] error running parallel on cluster
>
> On Tue, Apr 17, 2018 at 1:50 PM, Sepideh Kavousi  wrote:
>
> Mathew,
> I previously use DMDATSSetIFunctionLocal(user.d
> a,INSERT_VALUES,(DMDATSIFunctionLocal) FormFunction,) in my code.
> If I want to use your solution I can not use it because in the FormFunction
> definition I must use arrays, not vectors.So to solve this issue I followed
> two methods where none were able to solve it.
> 1- in first method I decided to use TSSetIFunction instead of
> DMDATSSetIFunctionLocal
>
> for this means first in the main function, I use TSSetDM and  my form
> function variables were as:
> PetscErrorCode FormFunction(TS ts,PetscScalar t,Vec Y,Vec Ydot,Vec F,
> struct VAR_STRUCT *user) {
> .
> .
> .
> .
> ierr = TSGetDM(ts,);CHKERRQ(ierr);
> ierr= DMDAGetLocalInfo(dmda,) ;CHKERRQ(ierr);
>
> ierr = DMGetLocalVector(dmda,_local);CHKERRQ(ierr);
> ierr = DMGlobalToLocalBegin(dmda,Ydot,INSERT_VALUES,Ydot_local);CHK
> ERRQ(ierr);
> ierr = DMGlobalToLocalEnd(dmda,Ydot,INSERT_VALUES,Ydot_local);CHKER
> RQ(ierr);
> .
> .
> .
>
> }
> But still, it does not consider vectors y,ydot,f related to dmda (problem
> executing DMDAVecGetArray)
>
>
> We cannot help you if you do not show full error messages.
>
> Why not fix the code with SetIFunctionLocal(), as I said in my last email.
> I will fix PETSc proper in branch at the end of the week. I
> have a proposal due tomorrow, so I cannot do it right now.
>
>   Thanks,
>
> Matt
>
>
> 2- In the second method I decided to use DMTSSetIFunction,
> but FormFunction still has the form of a TSIFunction, where we do not pass a
> dm object, and I think PETSc does not understand that dm and da are connected,
> although I have used TSSetDM in the main function.
>
> Can you please tell me what I should do?
> Regards,
> Sepideh
>
>
>
>
> --
> *From:* Matthew 

Re: [petsc-users] petsc4py: reuse setup for multiple solver calls?

2018-04-06 Thread Dave May
On Fri, 6 Apr 2018 at 07:48, Robert Speck  wrote:

> Thank you for your answer! Please see below for comments/questions.
>
> On 05.04.18 12:53, Matthew Knepley wrote:
> > On Thu, Apr 5, 2018 at 6:39 AM, Robert Speck wrote:
> >
> > Hi!
> >
> > I would like to use petsc4py for my own Python library. Installation
> > went well, first tests (serial and parallel) look good.
> >
> > Here is what I want to do: I have my own time-stepping playground
> and I
> > want petsc4py to be one of my backbones for the data types and
> (linear
> > or non-linear, serial or parallel) solvers. I don't want to use
> PETSc's
> > time-steppers, at least not for now. So, I saw in some examples, in
> > particular the ones shipped with petsc4py, that the standard way of
> > running one of PETSc's solvers is a bunch of setup routines, then
> > setting the right-hand side and solve.
> >
> > Now, I don't want to rerun the whole setup part each time I call the
> > solver. I know that I can change the right-hand side without having
> to
> > do so, but what if I change a parameter of my operator like, say, the
> > time-step size or some material parameter?
> >
> > Take the simplest case: Say I have my own implicit Euler written in
> > Python. I know the right-hand side F of my ODE, so in each step I
> want
> > to solve "I - dt*F". But the time-step changes every now and then,
> so I
> > cannot pre-assemble everything once and for all (or I don't want to).
> > What do I need to rerun before I can use the solver again, what can I
> > reuse? Could I just assemble F and combine it with the identity and
> the
> > parameter dt right before I call the solver? How would that look
> like?
> >
> > I'm pretty new to PETSc and to petsc4py, so please forgive any
> stupidity
> > or ignorance in these questions. I'm happy to take any advice, links
> to
> > examples or previous questions. Thanks!
> >
> >
> > For linear solves which stay the same size, you just have to call
> > SetOperators
> > again with the new operator.
>
> OK, this sounds straightforward. Thanks!
>
> >
> > For nonlinear solves which stay the same size, you do nothing.
>
> "nothing" in terms of "nothing you can do" or "nothing you have to do"?


Nothing you have to do.


>
> >
> > If the system size changes, it generally better to create the object.
>
> What does this mean?


Possibly a typo - Matt is advising to "re-create" the objects if the system
sizes change. PETSc objects are typically not dynamic with respect to their
size (e.g. Mat, Vec); however, creating new instances is generally cheap
(matrices can be the exception if you don't preallocate).
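For the implicit-Euler example from the original question, the reuse pattern
could look like the sketch below (plain C shown; petsc4py's ksp.setOperators(J)
and ksp.solve(b, y) wrap the same routines). A, J, b, y, dt and the assumption
that F is linear with matrix A are illustrative:

  /* one-time setup elsewhere: KSPCreate, KSPSetFromOptions, assembly of A */
  ierr = MatDuplicate(A, MAT_COPY_VALUES, &J);CHKERRQ(ierr);  /* J = A        */
  ierr = MatScale(J, -dt);CHKERRQ(ierr);                      /* J = -dt*A    */
  ierr = MatShift(J, 1.0);CHKERRQ(ierr);                      /* J = I - dt*A */

  /* whenever dt changes, rebuild J (e.g. MatCopy(A,J,SAME_NONZERO_PATTERN)
     followed by the same MatScale/MatShift) and simply hand it back: */
  ierr = KSPSetOperators(ksp, J, J);CHKERRQ(ierr);
  ierr = KSPSolve(ksp, b, y);CHKERRQ(ierr);

The KSP itself (type, tolerances, options) is untouched; only the
preconditioner setup is redone for the new operator at the next solve.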



>
> Thanks again!
> -Robert-
>
>
>
>
>
>
> 
>
> 
>
> 
>
> 
>
>


Re: [petsc-users] Obtaining compiling and building information from a.out

2018-03-27 Thread Dave May
On 27 March 2018 at 10:16, TAY wee-beng  wrote:

> Hi,
>
> I have been compiling and building different versions of my CFD code with the
> intel 2016 and 2018 compilers, and also with different compiling options.
>
> I tested a version of my a.out and it is much faster than the other a.out,
> using only 3 min instead of more than 10min to solve a certain case using
> GAMG.
>
> However, I can't recall how it was compiled. I only know that I used the
> intel 2016 compiler.
>
> So is there any way I can find out how the a.out was compiled? Like what
> options were used?


Since you posted to the list I presume "a.out" links against petsc...
If so, run your code with
  -log_view

Upon calling PetscFinalize(), you will get all the options given to PETSc
configure, plus the CFLAGS, link lines, etc.
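For example, something along the lines of

  ./a.out -log_view > run.log

and then look near the bottom of run.log: the summary ends with the configure
options and the compiler/flag lines the library was built with.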


Re: [petsc-users] Load balancing / redistributing a 1D DM

2018-03-05 Thread Dave May
On 5 March 2018 at 09:29, Åsmund Ervik  wrote:

> Hi all,
>
> We have a code that solves the 1D multiphase Euler equations, using some
> very expensive thermodynamic calls in each cell in each time step. The
> computational time for different cells varies significantly in the spatial
> direction (due to different thermodynamic states), and varies slowly from
> timestep to timestep.
>
> Currently the code runs in serial, but I would like to use a PETSc DM of
> some sort to run it in parallel. There will be no linear or nonlinear
> PETSc solves etc., just a distributed mesh, at least initially. The code is
> Fortran.
>
> Now for my question: Is it possible to do dynamic load balancing using a
> plain 1D DMDA, somehow? There is some mention of this for PCTELESCOPE, but
> I guess it only works for linear solves? Or could I use an index set or
> some other PETSc structure? Or do I need to use a 1D DMPLEX?
>

I don't think TELESCOPE is what you want to use.

TELESCOPE redistributes a DMDA from one MPI communicator to another MPI
communicator with fewer ranks. I would not describe its functionality as
"load balancing". Re-distribution could be interpreted as load balancing
onto a different communicator, with an equal "load" associated with each
point in the DMDA - but that is not what you are after. In addition, I
didn't add support within TELESCOPE to re-distribute a 1D DMDA as that
use-case almost never arises.

For a 1D problem such as yours, I would use your favourite graph
partitioner (Metis, ParMetis, Scotch) together with your cell-based
weighting and repartition the data yourself.

This is not a very helpful comment but I'll make it anyway...
If your code were in C or C++, and you didn't want to mess around with any
MPI calls at all from your application code, I think you could use the
DMSWARM object pretty easily to perform the load balancing. I haven't tried
this exact use-case myself, but in principle you could take the output from
Metis (which tells you the rank each point in the graph should move to)
and directly shove this info into a DMSWARM object and then ask it to migrate
your data.
DMSWARM lets you define and migrate (across a communicator) any data type
you like - it doesn't have to be a PetscReal or PetscScalar; you can define C
structs, for example. Unfortunately I haven't had the time to add Fortran
support for DMSWARM yet.


Cheers,
  Dave
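To make that concrete, a rough sketch of the idea in C (check the DMSwarm man
pages for your PETSc version; ncells_local, part[] and the "thermo_state"
field are illustrative, with part[p] holding the target rank for local cell p
as returned by Metis):

  DM            swarm;
  PetscInt      p, bs, *rankval;
  PetscDataType dtype;

  ierr = DMCreate(PETSC_COMM_WORLD,&swarm);CHKERRQ(ierr);
  ierr = DMSetType(swarm,DMSWARM);CHKERRQ(ierr);
  ierr = DMSwarmSetType(swarm,DMSWARM_BASIC);CHKERRQ(ierr);
  ierr = DMSwarmInitializeFieldRegister(swarm);CHKERRQ(ierr);
  /* one swarm point per 1D cell; register whatever per-cell data must move */
  ierr = DMSwarmRegisterPetscDatatypeField(swarm,"thermo_state",3,PETSC_REAL);CHKERRQ(ierr);
  ierr = DMSwarmFinalizeFieldRegister(swarm);CHKERRQ(ierr);
  ierr = DMSwarmSetLocalSizes(swarm,ncells_local,0);CHKERRQ(ierr);

  /* ... fill "thermo_state" via DMSwarmGetField/DMSwarmRestoreField ... */

  /* set the destination rank of each point (the Metis output) and migrate */
  ierr = DMSwarmGetField(swarm,"DMSwarm_rank",&bs,&dtype,(void**)&rankval);CHKERRQ(ierr);
  for (p = 0; p < ncells_local; p++) rankval[p] = part[p];
  ierr = DMSwarmRestoreField(swarm,"DMSwarm_rank",&bs,&dtype,(void**)&rankval);CHKERRQ(ierr);
  ierr = DMSwarmMigrate(swarm,PETSC_TRUE);CHKERRQ(ierr);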






Thanks,
  Dave


>
> If the latter, how do I make a 1D DMPLEX? All the variables are stored in
> cell centers (collocated), so it's a completely trivial "mesh". I tried
> reading the DMPLEX manual, and looking at examples, but I'm having trouble
> penetrating the FEM lingo / abstract nonsense.
>
> Best regards,
> Åsmund
>


Re: [petsc-users] Accessing a field values of Staggered grid

2018-02-13 Thread Dave May
On 13 February 2018 at 21:17, Matthew Knepley  wrote:

> On Tue, Feb 13, 2018 at 3:21 PM, Mohammad Hassan Baghaei <
> mhbagh...@mail.sjtu.edu.cn> wrote:
>
>> Hi
>>
>> I am filling the local vector from a dm that has a section layout. The thing
>> is I want to know how I can see the field variable values defined on the edges
>> of the staggered grid. Whenever I output to VTK and open it in ParaView, I can
>> see the main grid, but I cannot see the values which are defined on edges, and
>> that makes me unsure about the way I fill the local vector. How would I be
>> able to check the field values on the staggered grid?
>>
>
> VTK does not have a way to specify data on edges, only on cells or
> vertices.
>

This is not entirely true.

At least for a staggered grid, where you have one DOF per edge, you can
represent the edge data via the type VTK_VERTEX.
You won't generate a beautiful picture, as your field will be rendered as a
set of points (your edge faces) - but you can at least inspect the values
within ParaView.

Thanks,
  Dave
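For example, a minimal legacy-format .vtk file along these lines (two
hypothetical edge DOFs written as VTK_VERTEX cells, cell type id 1) will load
in ParaView and lets you probe the edge values with the point-data tools:

  # vtk DataFile Version 2.0
  edge dofs as vertices
  ASCII
  DATASET UNSTRUCTURED_GRID
  POINTS 2 double
  0.5 0.0 0.0
  1.0 0.5 0.0
  CELLS 2 4
  1 0
  1 1
  CELL_TYPES 2
  1
  1
  POINT_DATA 2
  SCALARS edge_flux double 1
  LOOKUP_TABLE default
  0.25
  -0.75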


>
>   Thanks,
>
> Matt
>
>
>> Thanks
>>
>> Amir
>>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/ 
>


Re: [petsc-users] Weak scaling test with geometric multigrid

2018-01-10 Thread Dave May
Henrik,


On Wed, 10 Jan 2018 at 16:39, Smith, Barry F.  wrote:

>
>   DMDA requires that there be at least 1 grid point in each direction on
> each process (this simplified the implementation a huge amount but gives up
> flexibility). In your case you have 16 processes in a particular direction
> but a total of only 12 grid points, hence there are not enough grid points to
> have at least 1 per process.


This particular implementation limitation can be overcome using
PCTELESCOPE. It allows you to repartition the coarse levels onto fewer
ranks.

Cheers,
  Dave
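For instance, options along these lines (illustrative - check the PCTELESCOPE
man page for your PETSc version) put a telescope PC on the coarse level so the
coarse DMDA is repartitioned onto fewer ranks, where it can be coarsened
further with another MG:

  -mg_coarse_pc_type telescope
  -mg_coarse_pc_telescope_reduction_factor 4
  -mg_coarse_telescope_pc_type mg
  -mg_coarse_telescope_pc_mg_levels 3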



>
>   Barry
>
>
> > On Jan 10, 2018, at 5:25 AM, Buesing, Henrik <
> hbues...@eonerc.rwth-aachen.de> wrote:
> >
> > Dear all,
> >
> > I am doing a weak scaling test using geometric multigrid. As I increase
> the number of cells and the number of processes, I also increase the number
> of multigrid levels. With 64 cores, 12288 cells in the x-direction and 11
> multigrid levels, I see error message [1].
> >
> > Could you help me understand what is happening here?
> >
> > The characteristics of the weak scaling test are summarized in table
> [2]. Refinement level 7-9 went through fine.
> >
> > Thank you!
> > Henrik
> >
> > [1]
> >
> > [0]PETSC ERROR: - Error Message
> --
> > [0]PETSC ERROR: Argument out of range
> > [0]PETSC ERROR: Partition in x direction is too fine! 12 16
> > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html
> for trouble shooting.
> > [0]PETSC ERROR: Petsc Development GIT revision: v3.8.2-48-g851ec02  GIT
> Date: 2017-12-05 09:52:17 -0600
> > [0]PETSC ERROR: shem_fw64gnu_const.x on a gnu_openmpi named
> linuxihfc033.rz.RWTH-Aachen.DE by hb111949 Wed Jan 10 11:48:09 2018
> > [0]PETSC ERROR: Configure options --download-fblaslapack --with-cc=mpicc
> -with-fc=mpif90 --with-cxx=mpicxx --download-hypre --download-superlu_dist
> --download-suitesparse --download-scalapack --download-blacs
> --download-hdf5 --download-parmetis --download-metis --with-debugging=0
> --download-mumps
> > [0]PETSC ERROR: #1 DMSetUp_DA_3D() line 299 in
> /rwthfs/rz/cluster/work/hb111949/Code/petsc/src/dm/impls/da/da3.c
> > [0]PETSC ERROR: #2 DMSetUp_DA() line 25 in
> /rwthfs/rz/cluster/work/hb111949/Code/petsc/src/dm/impls/da/dareg.c
> > [0]PETSC ERROR: #3 DMSetUp() line 720 in
> /rwthfs/rz/cluster/work/hb111949/Code/petsc/src/dm/interface/dm.c
> > [0]PETSC ERROR: #4 DMCoarsen_DA() line 1203 in
> /rwthfs/rz/cluster/work/hb111949/Code/petsc/src/dm/impls/da/da.c
> > [0]PETSC ERROR: #5 DMCoarsen() line 2427 in
> /rwthfs/rz/cluster/work/hb111949/Code/petsc/src/dm/interface/dm.c
> > [0]PETSC ERROR: #6 PCSetUp_MG() line 618 in
> /rwthfs/rz/cluster/work/hb111949/Code/petsc/src/ksp/pc/impls/mg/mg.c
> > [0]PETSC ERROR: #7 PCSetUp() line 924 in
> /rwthfs/rz/cluster/work/hb111949/Code/petsc/src/ksp/pc/interface/precon.c
> > [0]PETSC ERROR: #8 KSPSetUp() line 381 in
> /rwthfs/rz/cluster/work/hb111949/Code/petsc/src/ksp/ksp/interface/itfunc.c
> > [0]PETSC ERROR: #9 KSPSolve() line 612 in
> /rwthfs/rz/cluster/work/hb111949/Code/petsc/src/ksp/ksp/interface/itfunc.c
> > [0]PETSC ERROR: #10 SNESSolve_NEWTONLS() line 224 in
> /rwthfs/rz/cluster/work/hb111949/Code/petsc/src/snes/impls/ls/ls.c
> > [0]PETSC ERROR: #11 SNESSolve() line 4179 in
> /rwthfs/rz/cluster/work/hb111949/Code/petsc/src/snes/interface/snes.c
> >
> >
> > [2]
> >
> > # refinement level | # cores | # cells in x | # cells in y | # cells in z | # mg levels
> >                  7 |       1 |         1536 |            1 |          256 |           8
> >                  8 |       4 |         3072 |            1 |          512 |           9
> >                  9 |      16 |         6144 |            1 |         1024 |          10
> >                 10 |      64 |        12288 |            1 |         2048 |          11
> >
> > --
> > Dipl.-Math. Henrik Büsing
> > Institute for Applied Geophysics and Geothermal Energy
> > E.ON Energy Research Center
> > RWTH Aachen University
> > --
> > Mathieustr. 10|Tel +49 (0)241 80 49907
> > 52074 Aachen, Germany |Fax +49 (0)241 80 49889
> > --
> > http://www.eonerc.rwth-aachen.de/GGE
> > hbues...@eonerc.rwth-aachen.de
> > --
>
>


Re: [petsc-users] Auxiliary fields for multigrid

2017-11-17 Thread Dave May
On Fri, 17 Nov 2017 at 04:26, zakaryah .  wrote:

> I have equations which depend on some external data.  I'm not sure about
> the approach for making this compatible with multigrid - i.e. making sure
> those external fields are properly refined/coarsened.
>

Multigrid does not strictly require you to restrict your external data.
For example: (i) algebraic MG (PCGAMG) requires only the fine level
operator; (ii) you can define coarse level operators with geometric MG using
Galerkin projection. The latter only requires one to specify how to
interpolate your DM fields from fine to coarse (which PLEX and DA provide).

Why not try these out for your problem first?


Are there PETSc examples that do this?
>

Not that I'm aware of. Most use Galerkin or rediscretize the operator on
the coarser levels.

Thanks,
  Dave
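For instance, to try the two approaches above from the command line
(illustrative option names - check the PCGAMG/PCMG man pages for your PETSc
version):

  -pc_type gamg                                      (algebraic MG: only the fine operator is needed)
  -pc_type mg -pc_mg_levels 3 -pc_mg_galerkin pmat   (geometric MG, coarse operators by Galerkin projection)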

>


Re: [petsc-users] ISGlobalToLocalMappingApplyBlock

2017-11-15 Thread Dave May
On Wed, 15 Nov 2017 at 05:55, Adrian Croucher 
wrote:

> hi
>
> I'm trying to use ISGlobalToLocalMappingApplyBlock() and am a bit
> puzzled about the results it's giving.
>
> I've attached a small test to illustrate. It just sets up a
> local-to-global mapping with 10 elements. Running on two processes the
> first has global indices 0 - 4 and the the second has 5 - 9. I then try
> to find the local index corresponding to global index 8.
>
> If I set the blocksize parameter to 1, it correctly gives the results -1
> on rank 0 and 3 on rank 1.
>
> But if I set the blocksize to 2 (or more), the results are -253701943 on
> rank 0 and -1 on rank 1. Neither of these is what I expected - I thought
> they should be the same as in the blocksize 1 case.


The man page says to use "block global numbering".


>
> I'm presuming the global indices I pass in to
> ISGlobalToLocalMappingApplyBlock() should be global block indices (i.e.
> not scaled up by blocksize).


Yes, the indices should relate to the blocks

> If I do scale them up it doesn't give the
> answers I expect either.
>
> Or am I wrong to expect this to give the same results regardless of
> blocksize?


Yep.

However, the large negative number being printed looks like an uninitialized
variable. This seems odd, as with mode = MASK, nout should equal N and any
requested block indices not in the IS should result in -1 being inserted in
your local_indices array.

What's the value of nout?

Thanks,
  Dave
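For reference, a small sketch of how the block variant is meant to be driven
(hypothetical sizes: 2 ranks, 5 blocks of size 2 per rank, so rank 0 owns
points 0-9 and rank 1 owns points 10-19); both the input and output indices
are block indices:

  ISLocalToGlobalMapping ltog;
  PetscInt               i, blk[5], gblk, lblk, nout;
  PetscMPIInt            rank;

  ierr = MPI_Comm_rank(PETSC_COMM_WORLD,&rank);CHKERRQ(ierr);
  for (i = 0; i < 5; i++) blk[i] = 5*rank + i;   /* block (not point) indices */
  ierr = ISLocalToGlobalMappingCreate(PETSC_COMM_WORLD,2,5,blk,PETSC_COPY_VALUES,&ltog);CHKERRQ(ierr);

  gblk = 4;  /* global block index 4, i.e. points 8 and 9 */
  ierr = ISGlobalToLocalMappingApplyBlock(ltog,IS_GTOLM_MASK,1,&gblk,&nout,&lblk);CHKERRQ(ierr);
  /* expect nout == 1 everywhere; lblk == 4 on rank 0 and -1 on rank 1 */
  ierr = ISLocalToGlobalMappingDestroy(&ltog);CHKERRQ(ierr);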


>
> Cheers, Adrian
>
> --
> Dr Adrian Croucher
> Senior Research Fellow
> Department of Engineering Science
> University of Auckland, New Zealand
> email: a.crouc...@auckland.ac.nz
> tel: +64 (0)9 923 4611
>
>

