Re: [petsc-users] global indices of a vector in each process

2023-01-30 Thread Barry Smith

  VecGetOwnershipRange() works with any vector, including DM-created vectors. 
You could not have the global ownership you describe below; each process always 
owns a contiguous range of global indices.

  Perhaps you are thinking of the natural ordering values? For those, if you 
are using DMDA, take a look at DMDAGetAO() and DMDACreateNaturalVector().
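
  For reference, a minimal sketch of querying the ownership range of a vector 
(the variable names are illustrative, not from your code):

    Vec      V;            /* e.g. from DMCreateGlobalVector(dm, &V) */
    PetscInt rstart, rend;

    /* this rank owns the contiguous global indices rstart, ..., rend-1 */
    PetscCall(VecGetOwnershipRange(V, &rstart, &rend));
    PetscCall(PetscPrintf(PETSC_COMM_SELF, "owned global indices: %" PetscInt_FMT " to %" PetscInt_FMT "\n", rstart, rend - 1));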

  Barry


> On Jan 30, 2023, at 10:36 AM, Venugopal, Vysakh (venugovh) via petsc-users 
>  wrote:
> 
> Hello,
>  
> I am using a DMCreateGlobalVector to create a vector V. If V is divided into 
> m processes, is there a way to get the global indices of V assigned to each 
> process?
>  
> For example: V = [10, 20, 30, 40, 50, 60, 70, 80].
> If MPI process 0 has [10, 40, 50, 60] and process 1 has [20, 30, 70, 80], is 
> it possible to get the indices for process 0 as [0,3,4,5] and process 1 as 
> [1,2,6,7]?
>  
> Thanks,
>  
> Vysakh
> ---
> Vysakh Venugopal
> Ph.D. Candidate
> Department of Mechanical Engineering
> University of Cincinnati, Cincinnati, OH 45221-0072



[petsc-users] Register for the next PETSc users meeting: Chicago June 5-7, 2023

2023-01-25 Thread Barry Smith

  We are pleased to announce the next PETSc users meeting in Chicago on June 
5-7, 2023: https://petsc.org/release/community/meetings/2023 
Please register now and submit your talks.

  The meeting will include a lightning tutorial for new users on Monday morning 
June 5, a workshop for potential PETSc contributors, a speed-dating session to 
discuss your application needs with PETSc developers, mini-tutorials on 
advanced PETSc topics, as well as user presentations on work done with PETSc.

  Mark your calendars for the 2024 PETSc users meeting, May 23-24, in Cologne, 
Germany.

  Barry




Re: [petsc-users] compile PETSc on win using clang

2023-01-25 Thread Barry Smith

  Please do as Satish previously suggested

~/petsc/lib/petsc/bin/win32fe/win32fe cl --use clang --verbose sizeof.c


  Also do 

clang sizeof.c

and send the output of both, where sizeof.c is 

#include <stdio.h>
int main(int argc,char **args)
{
   printf("%d\n",(int)sizeof(int));
   return 0;
}

I may have typos in my sample code so please fix those.

Barry


> On Jan 24, 2023, at 6:22 PM, Guo, Sam  wrote:
> 
> Attached please find configure.log.
> 
> error messgae:
> C:\home\xian\dev\star\petsc\src\sys\objects\device\INTERF~1\device.cxx(486): 
> error C2065: 'PETSC_DEVICE_CASE': undeclared identifier
> 
> From: Satish Balay <ba...@mcs.anl.gov>
> Sent: Tuesday, January 24, 2023 2:00 PM
> To: Barry Smith <bsm...@petsc.dev>
> Cc: Guo, Sam (DI SW STS SDDEV MECH PHY FEA FW) <sam@siemens.com>; petsc-users@mcs.anl.gov
> Subject: Re: [petsc-users] compile PETSc on win using clang
>  
> https://www.wikihow.com/Install-Clang-on-Windows
> 
> Is the clang you have from visual studio - as described above?
> 
> We don't have experience with using this variant of clang.
> 
> If its compatible with 'cl' - and supports the same command interface as 'cl' 
> then the following might work [assuming clang.exe is the compiler binary 
> installed - and available in PATH]:
> 
> '--with-cc=win32fe cl --use clang'
> 
> Satish
> 
> 
> On Tue, 24 Jan 2023, Barry Smith wrote:
> 
> > 
> >Are you using clang as a replacement for the
> > 
> >*  "Unix-like" Cygwin GNU compilers compilers or 
> > 
> >*  MinGW GNU compilers that are compatible with the Microsoft compilers?
> > 
> >   If the former, follow the instructions for using the Cygwin GNU 
> > compilers, if the latter follow the directions for the MinGW compilers. 
> > 
> >   Send the configure.log and make.log if things go wrong and we'll help you 
> > out.
> > 
> >   Barry
> > 
> > 
> > 
> > 
> > > On Jan 24, 2023, at 4:01 PM, Guo, Sam <sam@siemens.com> wrote:
> > > 
> > > Hi PETSc dev team,
> > >I try to compile PETSc on win using clang. I am wondering if you could 
> > > give me some hint. (I’ve already made intel compiler work on win using 
> > > win32fe icl).
> > >  
> > > Thanks,
> > > Sam Guo
> > 
> > 
> 



Re: [petsc-users] compile PETSc on win using clang

2023-01-24 Thread Barry Smith

   Are you using clang as a replacement for the

   *  "Unix-like" Cygwin GNU compilers compilers or 

   *  MinGW GNU compilers that are compatible with the Microsoft compilers?

  If the former, follow the instructions for using the Cygwin GNU compilers, if 
the latter follow the directions for the MinGW compilers. 

  Send the configure.log and make.log if things go wrong and we'll help you out.

  Barry




> On Jan 24, 2023, at 4:01 PM, Guo, Sam  wrote:
> 
> Hi PETSc dev team,
>I try to compile PETSc on win using clang. I am wondering if you could 
> give me some hint. (I’ve already made intel compiler work on win using 
> win32fe icl).
>  
> Thanks,
> Sam Guo



Re: [petsc-users] Cmake problem on an old cluster

2023-01-19 Thread Barry Smith


  Remove 
--download-cmake=/home/danyangs/soft/petsc/petsc-3.18.3/packages/cmake-3.25.1.tar.gz
  and install CMake yourself. Then configure PETSc with 
--with-cmake=<directory where you installed it>.

  Barry


> On Jan 19, 2023, at 1:46 PM, Danyang Su  wrote:
> 
> Hi All,
> 
> I am trying to install the latest PETSc on an old cluster but always get some 
> error information at the step of cmake. The system installed cmake is V3.2.3, 
> which is out-of-date for PETSc. I tried to use --download-cmake first, it 
> does not work. Then I tried to clean everything (delete the petsc_arch 
> folder), download the latest cmake myself and pass the path to the 
> configuration, the error is still there.
> 
> The compiler there is a bit old, intel-14.0.2 and openmpi-1.6.5. I have no 
> problem to install PETSc-3.13.6 there. The latest version cannot pass 
> configuration, unfortunately. Attached is the last configuration I have tried.
> 
> --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 
> --download-cmake=/home/danyangs/soft/petsc/petsc-3.18.3/packages/cmake-3.25.1.tar.gz
>  --download-mumps --download-scalapack --download-parmetis --download-metis 
> --download-ptscotch --download-fblaslapack --download-hypre 
> --download-superlu_dist --download-hdf5=yes --with-hdf5-fortran-bindings 
> --with-debugging=0 COPTFLAGS="-O2 -march=native -mtune=native" 
> CXXOPTFLAGS="-O2 -march=native -mtune=native" FOPTFLAGS="-O2 -march=native 
> -mtune=native"
> 
> Is there any solution for this?
> 
> Thanks,
> 
> Danyang
> 
> 
> 



Re: [petsc-users] Nonconforming object sizes using TAO (TAOBNTR)

2023-01-17 Thread Barry Smith

   It appears that Tao is not written to allow multiple TaoSolve() on the same 
Tao object; this is different from KSP, SNES, and TS. 

   If you look at the converged reason at the beginning of the second 
TaoSolve() you will see it is the reason from the first solve, and all the Tao 
data structures are still in the state they were in when the previous 
TaoSolve() ended. Thus the second solve reuses stale flags and previous 
matrices.

   Fixing this would be a largish process I think. 

   I added an error check for TaoSolve that checks if converged reason is not 
iterating (meaning the Tao object was previously used and left in a bad state) 
so that this same problem won't come up for other users. 
https://gitlab.com/petsc/petsc/-/merge_requests/5986
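
   Until then, a workaround (just a sketch; the callback and data names below 
are placeholders for whatever your code already provides) is to create a fresh 
Tao object for each solve:

    for (PetscInt k = 0; k < nsolves; ++k) {
      Tao tao;
      PetscCall(TaoCreate(PETSC_COMM_WORLD, &tao));
      PetscCall(TaoSetType(tao, TAOBNTR));
      PetscCall(TaoSetSolution(tao, x));
      PetscCall(TaoSetObjectiveAndGradient(tao, NULL, FormFunctionGradient, &user)); /* your callback */
      PetscCall(TaoSetHessian(tao, H, H, FormHessian, &user));                       /* your callback */
      PetscCall(TaoSetVariableBounds(tao, xl, xu));
      PetscCall(TaoSetFromOptions(tao));
      PetscCall(TaoSolve(tao));
      PetscCall(TaoDestroy(&tao));
    }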

  Barry


> On Jan 16, 2023, at 5:07 PM, Blaise Bourdin  wrote:
> 
> Hi,
> 
> I am attaching a small modification of the eptorsion1.c example that 
> replicates the issue. It looks like this bug is triggered when the upper and 
> lower bounds are equal on enough (10) degrees of freedom.
>  
> 
>  Elastic-Plastic Torsion Problem -
> mx: 10 my: 10   
> 
> i: 0
> i: 1
> i: 2
> i: 3
> i: 4
> i: 5
> i: 6
> i: 7
> i: 8
> i: 9
> [0]PETSC ERROR: - Error Message 
> --
> [0]PETSC ERROR: Nonconforming object sizes
> [0]PETSC ERROR: Preconditioner number of local rows 44 does not equal input 
> vector size 54
> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
> [0]PETSC ERROR: Petsc Development GIT revision: v3.18.2-242-g4615508c7fc  GIT 
> Date: 2022-11-28 10:21:46 -0600
> [0]PETSC ERROR: ./eptorsion1 on a ventura-gcc12.2-arm64-g64 named 
> bblaptop.math.mcmaster.ca by blaise Mon Jan 16 17:06:57 2023
> [0]PETSC ERROR: Configure options --CFLAGS="-Wimplicit-function-declaration 
> -Wunused" --FFLAGS="-ffree-line-length-none -fallow-argument-mismatch 
> -Wunused" --download-ctetgen=1 --download-exodusii=1 --download-hdf5=1 
> --download-hypre=1 --download-metis=1 --download-netcdf=1 --download-mumps=1 
> --download-parmetis=1 --download-pnetcdf=1 --download-scalapack 
> --download-triangle=1 --download-zlib=1 --with-64-bit-indices=1 
> --with-debugging=1 --with-exodusii-fortran-bindings --with-shared-libraries=1 
> --with-x11=0
> [0]PETSC ERROR: #1 PCApply() at 
> /opt/HPC/petsc-main/src/ksp/pc/interface/precon.c:434
> [0]PETSC ERROR: #2 KSP_PCApply() at 
> /opt/HPC/petsc-main/include/petsc/private/kspimpl.h:380
> [0]PETSC ERROR: #3 KSPCGSolve_STCG() at 
> /opt/HPC/petsc-main/src/ksp/ksp/impls/cg/stcg/stcg.c:76
> [0]PETSC ERROR: #4 KSPSolve_Private() at 
> /opt/HPC/petsc-main/src/ksp/ksp/interface/itfunc.c:898
> [0]PETSC ERROR: #5 KSPSolve() at 
> /opt/HPC/petsc-main/src/ksp/ksp/interface/itfunc.c:1070
> [0]PETSC ERROR: #6 TaoBNKComputeStep() at 
> /opt/HPC/petsc-main/src/tao/bound/impls/bnk/bnk.c:459
> [0]PETSC ERROR: #7 TaoSolve_BNTR() at 
> /opt/HPC/petsc-main/src/tao/bound/impls/bnk/bntr.c:138
> [0]PETSC ERROR: #8 TaoSolve() at 
> /opt/HPC/petsc-main/src/tao/interface/taosolver.c:177
> [0]PETSC ERROR: #9 main() at eptorsion1.c:166
> [0]PETSC ERROR: No PETSc Option Table entries
> [0]PETSC ERROR: End of Error Message ---send entire error 
> message to petsc-ma...@mcs.anl.gov--
> Abort(60) on node 0 (rank 0 in comm 16): application called 
> MPI_Abort(MPI_COMM_SELF, 60) - process 0
> 
> I hope that this helps.
> 
> Blaise
> 
> 
>> On Jan 16, 2023, at 3:14 PM, Alexis Marboeuf  
>> wrote:
>> 
>> Hi Matt,
>> After investigation, it fails because, at some point, the boolean needH is 
>> set to PETSC_FALSE when initializing the BNK method with TAOBNKInitialize 
>> (line 103 of $PETSC_DIR/src/tao/bound/impls/bnk/bntr.c). The Hessian and the 
>> precondtitioner are thus not updated throughout the TAO iterations. It has 
>> something to do with the option BNK_INIT_INTERPOLATION set by default. It 
>> works when I choose BNK_INIT_CONSTANT. In my case, in all the successful 
>> calls of TAOSolve, the computed trial objective value is better than the 
>> current value which implies needH = PETSC_TRUE within TAOBNKInitialize. At 
>> some point, the trial value becomes equal to the current objective value up 
>> to machine precision and then, needH = PETSC_FALSE. I have to admit I am 
>> struggling understanding how that boolean needH is computed when BNK is 
>> initialized with BNK_INIT_INTERPOLATION. Can you help me with that?
>> Thanks a lot.
>> Alexis
>> From: Alexis Marboeuf 
>> Sent: Saturday, January 14, 2023 05:24
>> To: Matthew Knepley 
>> Cc: petsc-users@mcs.anl.gov 
>> Subject: RE: [petsc-users] Nonconforming object sizes using TAO (TAOBNTR)
>>  
>> Hi Matt,
>> Indeed, it fails on 1 process with the same error. The source code is 
>> available here: https://github.com/bourdin/mef90 (branch 
>> marboeuf/vdef-tao-test)

Re: [petsc-users] about repeat of expensive functions using VecScatterCreateToAll

2023-01-17 Thread Barry Smith


> On Jan 17, 2023, at 3:12 PM, Venugopal, Vysakh (venugovh) via petsc-users 
>  wrote:
> 
> Hi,
>  
> I am doing the following thing.
>  
> Step 1. Create DM object and get global vector ‘V’ using DMGetGlobalVector.
> Step 2. Doing some parallel operations on V.
> Step 3. I am using VecScatterCreateToAll on V to create a sequential vector 
> ‘V_SEQ’ using VecScatterBegin/End with SCATTER_FORWARD.
> Step 4. I am performing an expensive operation on V_SEQ and outputting the 
> updated V_SEQ.
> Step 5. I am using VecScatterBegin/End with SCATTER_REVERSE (global and 
> sequential flipped) to get V that is updated with new values from V_SEQ.
> Step 6. I continue using this new V on the rest of the parallelized program.
>  
> Question: Suppose I have n MPI processes, is the expensive operation in Step 
> 4 repeated n times? If yes, is there a workaround such that the operation in 
> Step 4 is performed only once? I would like to follow the same structure as 
> steps 1 to 6 with step 4 only performed once.

  Each MPI rank is doing the same operations on its copy of the sequential 
vector. Since they are running in parallel it probably does not matter much 
that each is doing the same computation. Step 5 does not require any MPI since 
only part of the sequential vector (which everyone has) is needed in the 
parallel vector.

  You could use VecScatterCreateToZero(); then step 3 would require less 
communication, but step 5 would require communication to get parts of the 
solution from rank 0 back to the other ranks. The time for step 4 would be 
roughly the same.
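
  For reference, a minimal sketch of the scatter pattern in steps 3-5 (assuming 
V is your DM global vector):

    Vec        V_SEQ;
    VecScatter ctx;

    /* step 3: every rank gets a full sequential copy of V */
    PetscCall(VecScatterCreateToAll(V, &ctx, &V_SEQ));
    PetscCall(VecScatterBegin(ctx, V, V_SEQ, INSERT_VALUES, SCATTER_FORWARD));
    PetscCall(VecScatterEnd(ctx, V, V_SEQ, INSERT_VALUES, SCATTER_FORWARD));

    /* step 4: expensive operation on V_SEQ (done redundantly on every rank) */

    /* step 5: push entries back into the parallel vector (arguments flipped for SCATTER_REVERSE) */
    PetscCall(VecScatterBegin(ctx, V_SEQ, V, INSERT_VALUES, SCATTER_REVERSE));
    PetscCall(VecScatterEnd(ctx, V_SEQ, V, INSERT_VALUES, SCATTER_REVERSE));

    PetscCall(VecScatterDestroy(&ctx));
    PetscCall(VecDestroy(&V_SEQ));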

  You will likely only see a worthwhile improvement in performance if you can 
parallelize the computation in step 4. What are you doing that is 
computationally intense and requires all the data on a rank?

Barry

>  
> Thanks,
>  
> Vysakh Venugopal
> ---
> Vysakh Venugopal
> Ph.D. Candidate
> Department of Mechanical Engineering
> University of Cincinnati, Cincinnati, OH 45221-0072



Re: [petsc-users] PETSC install

2023-01-12 Thread Barry Smith

  If you put the variable in the .bashrc file then you must 

  source ~/.bashrc 

  before running the 

  make check


  Barry


> On Jan 11, 2023, at 11:52 PM, Sijie Zhang  wrote:
> 
> Yes, I followed the exact instructions. I’m using bash. I put the 
> environmental variable in the .bashrc file. This only happens to my intel 
> i129700 workstation. Is it because of the hardware?
>  
> Thanks.
>  
> Sijie 
>  
> Sent from Mail for Windows
>  
> From: Matthew Knepley <knep...@gmail.com>
> Sent: Wednesday, January 11, 2023 7:40 PM
> To: Barry Smith <bsm...@petsc.dev>
> Cc: Sijie Zhang <zhan2...@purdue.edu>; petsc-users@mcs.anl.gov
> Subject: Re: [petsc-users] PETSC install
>  
>  
> On Wed, Jan 11, 2023 at 2:33 PM Barry Smith <bsm...@petsc.dev> wrote:
>  
>   Did you do exactly: 
>  
> export HWLOC_HIDE_ERRORS=2
> make check
>  
> Also, what shell are you using? The command above is for bash, but if you use 
> csh it is different.
>  
>   Thanks,
>  
>  Matt
>  
> ?
> 
> 
> 
> 
> On Jan 11, 2023, at 6:51 PM, Sijie Zhang <zhan2...@purdue.edu> wrote:
>  
> Hi,
> 
> I tried that but it's showing the same error.  Can you help me to take a look 
> at that?
> 
> Thanks.
> 
> Sijie
> 
> +
> 
> Running check examples to verify correct installation
> Using PETSC_DIR=/home/zhangsijie1995/Documents/Package/petsc-3.18.3 and 
> PETSC_ARCH=arch-linux-c-debug
> Possible error running C/C++ src/snes/tutorials/ex19 with 1 MPI process
> See https://petsc.org/release/faq/
> hwloc/linux: Ignoring PCI device with non-16bit domain.
> Pass --enable-32bits-pci-domain to configure to support such devices
> (warning: it would break the library ABI, don't enable unless really needed).
> lid velocity = 0.0016, prandtl # = 1., grashof # = 1.
> Number of SNES iterations = 2
> Possible error running C/C++ src/snes/tutorials/ex19 with 2 MPI processes
> See https://petsc.org/release/faq/
> hwloc/linux: Ignoring PCI device with non-16bit domain.
> Pass --enable-32bits-pci-domain to configure to support such devices
> (warning: it would break the library ABI, don't enable unless really needed).
> hwloc/linux: Ignoring PCI device with non-16bit domain.
> Pass --enable-32bits-pci-domain to configure to support such devices
> (warning: it would break the library ABI, don't enable unless really needed).
> lid velocity = 0.0016, prandtl # = 1., grashof # = 1.
> Number of SNES iterations = 2
> ***Error detected during compile or link!***
> See https://petsc.org/release/faq/
> /home/zhangsijie1995/Documents/Package/petsc-3.18.3/src/snes/tutorials ex5f
> *
> /opt/intel/oneapi/mpi/2021.8.0/bin/mpif90 -fPIC -Wall -ffree-line-length-none 
> -ffree-line-length-0 -Wno-lto-type-mismatch -Wno-unused-dummy-argument -g -O0 
>   -I/home/zhangsijie1995/Documents/Package/petsc-3.18.3/include 
> -I/home/zhangsijie1995/Documents/Package/petsc-3.18.3/arch-linux-c-debug/include
>  -I/opt/intel/oneapi/mkl/2023.0.0/include 
> -I/opt/intel/oneapi/mpi/2021.8.0/include ex5f.F90  
> -Wl,-rpath,/home/zhangsijie1995/Documents/Package/petsc-3.18.3/arch-linux-c-debug/lib
>  -L/home/zhangsijie1995/Documents/Package/petsc-3.18.3/arch-linux-c-debug/lib 
> -Wl,-rpath,/opt/intel/oneapi/mkl/2023.0.0/lib/intel64 
> -L/opt/intel/oneapi/mkl/2023.0.0/lib/intel64 
> -Wl,-rpath,/home/zhangsijie1995/Documents/Package/petsc-3.18.3/I_MPI_SUBSTITUTE_INSTALLDIR/lib/release
>  
> -L/home/zhangsijie1995/Documents/Package/petsc-3.18.3/I_MPI_SUBSTITUTE_INSTALLDIR/lib/release
>  
> -Wl,-rpath,/home/zhangsijie1995/Documents/Package/petsc-3.18.3/I_MPI_SUBSTITUTE_INSTALLDIR/lib
>  
> -L/home/zhangsijie1995/Documents/Package/petsc-3.18.3/I_MPI_SUBSTITUTE_INSTALLDIR/lib
>  -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/11 
> -L/usr/lib/gcc/x86_64-linux-gnu/11 -lpetsc -lmkl_intel_lp64 -lmkl_core 
> -lmkl_sequential -lpthread -lm -lstdc++ -ldl -lmpifort -lmpi -lrt -lpthread 
> -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lstdc++ -ldl -o ex5f
> f951: Warning: Nonexistent include directory 
> ‘I_MPI_SUBSTITUTE_INSTALLDIR/include/gfortran/11.1.0’ [-Wmissing-include-dirs]
> f951: Warning: Nonexistent include directory 
> ‘I_MPI_SUBSTITUTE_INSTALLDIR/include’ [-Wmissing-include-dirs]
> Possible error running Fortran example src/snes/tutorials/ex5f with 1 MPI 
> process
> See https://petsc.org/releas

Re: [petsc-users] PETSC install

2023-01-11 Thread Barry Smith

  Did you do exactly:

export HWLOC_HIDE_ERRORS=2
make check

?


> On Jan 11, 2023, at 6:51 PM, Sijie Zhang  wrote:
> 
> Hi,
> 
> I tried that but it's showing the same error.  Can you help me to take a look 
> at that?
> 
> Thanks.
> 
> Sijie
> 
> +
> 
> Running check examples to verify correct installation
> Using PETSC_DIR=/home/zhangsijie1995/Documents/Package/petsc-3.18.3 and 
> PETSC_ARCH=arch-linux-c-debug
> Possible error running C/C++ src/snes/tutorials/ex19 with 1 MPI process
> See https://petsc.org/release/faq/
> hwloc/linux: Ignoring PCI device with non-16bit domain.
> Pass --enable-32bits-pci-domain to configure to support such devices
> (warning: it would break the library ABI, don't enable unless really needed).
> lid velocity = 0.0016, prandtl # = 1., grashof # = 1.
> Number of SNES iterations = 2
> Possible error running C/C++ src/snes/tutorials/ex19 with 2 MPI processes
> See https://petsc.org/release/faq/
> hwloc/linux: Ignoring PCI device with non-16bit domain.
> Pass --enable-32bits-pci-domain to configure to support such devices
> (warning: it would break the library ABI, don't enable unless really needed).
> hwloc/linux: Ignoring PCI device with non-16bit domain.
> Pass --enable-32bits-pci-domain to configure to support such devices
> (warning: it would break the library ABI, don't enable unless really needed).
> lid velocity = 0.0016, prandtl # = 1., grashof # = 1.
> Number of SNES iterations = 2
> ***Error detected during compile or link!***
> See https://petsc.org/release/faq/
> /home/zhangsijie1995/Documents/Package/petsc-3.18.3/src/snes/tutorials ex5f
> *
> /opt/intel/oneapi/mpi/2021.8.0/bin/mpif90 -fPIC -Wall -ffree-line-length-none 
> -ffree-line-length-0 -Wno-lto-type-mismatch -Wno-unused-dummy-argument -g -O0 
>   -I/home/zhangsijie1995/Documents/Package/petsc-3.18.3/include 
> -I/home/zhangsijie1995/Documents/Package/petsc-3.18.3/arch-linux-c-debug/include
>  -I/opt/intel/oneapi/mkl/2023.0.0/include 
> -I/opt/intel/oneapi/mpi/2021.8.0/include ex5f.F90  
> -Wl,-rpath,/home/zhangsijie1995/Documents/Package/petsc-3.18.3/arch-linux-c-debug/lib
>  -L/home/zhangsijie1995/Documents/Package/petsc-3.18.3/arch-linux-c-debug/lib 
> -Wl,-rpath,/opt/intel/oneapi/mkl/2023.0.0/lib/intel64 
> -L/opt/intel/oneapi/mkl/2023.0.0/lib/intel64 
> -Wl,-rpath,/home/zhangsijie1995/Documents/Package/petsc-3.18.3/I_MPI_SUBSTITUTE_INSTALLDIR/lib/release
>  
> -L/home/zhangsijie1995/Documents/Package/petsc-3.18.3/I_MPI_SUBSTITUTE_INSTALLDIR/lib/release
>  
> -Wl,-rpath,/home/zhangsijie1995/Documents/Package/petsc-3.18.3/I_MPI_SUBSTITUTE_INSTALLDIR/lib
>  
> -L/home/zhangsijie1995/Documents/Package/petsc-3.18.3/I_MPI_SUBSTITUTE_INSTALLDIR/lib
>  -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/11 
> -L/usr/lib/gcc/x86_64-linux-gnu/11 -lpetsc -lmkl_intel_lp64 -lmkl_core 
> -lmkl_sequential -lpthread -lm -lstdc++ -ldl -lmpifort -lmpi -lrt -lpthread 
> -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lstdc++ -ldl -o ex5f
> f951: Warning: Nonexistent include directory 
> ‘I_MPI_SUBSTITUTE_INSTALLDIR/include/gfortran/11.1.0’ [-Wmissing-include-dirs]
> f951: Warning: Nonexistent include directory 
> ‘I_MPI_SUBSTITUTE_INSTALLDIR/include’ [-Wmissing-include-dirs]
> Possible error running Fortran example src/snes/tutorials/ex5f with 1 MPI 
> process
> See https://petsc.org/release/faq/
> hwloc/linux: Ignoring PCI device with non-16bit domain.
> Pass --enable-32bits-pci-domain to configure to support such devices
> (warning: it would break the library ABI, don't enable unless really needed).
> Number of SNES iterations =     3
> Completed test examples
> Error while running make check
> gmake[1]: *** [makefile:149: check] Error 1
> make: *** [GNUmakefile:17: check] Error 2
> 
> 
> From: Barry Smith 
> Sent: Wednesday, January 11, 2023 8:22 AM
> To: Sijie Zhang
> Cc: petsc-users@mcs.anl.gov
> Subject: Re: [petsc-users] PETSC install
> 
> 
> 
> https://petsc.org/release/faq/#what-does-the-message-hwloc-linux-ignoring-pci-device-with-non-16bit-domain-mean
> 
> On Jan 11, 2023, at 5:03 AM, Sijie Zhang  wrote:
> 
> Hi,
> 
> When I try to install petsc on my workstation, I got the following error. Can 
> you help me with that?
> 
> Thank you and best regards.
> 
> Sijie
> 
> 
> 
> Running check examples to verify correct installation
> Using PETSC_DIR=/home/zhangsijie1995/Documents/Packa

Re: [petsc-users] PETSC install

2023-01-11 Thread Barry Smith

https://petsc.org/release/faq/#what-does-the-message-hwloc-linux-ignoring-pci-device-with-non-16bit-domain-mean

> On Jan 11, 2023, at 5:03 AM, Sijie Zhang  wrote:
> 
> Hi,
> 
> When I try to install petsc on my workstation, I got the following error. Can 
> you help me with that?
> 
> Thank you and best regards.
> 
> Sijie
> 
> 
> 
> Running check examples to verify correct installation
> Using PETSC_DIR=/home/zhangsijie1995/Documents/Package/petsc-3.18.3 and 
> PETSC_ARCH=arch-linux-c-debug
> Possible error running C/C++ src/snes/tutorials/ex19 with 1 MPI process
> See https://petsc.org/release/faq/
> hwloc/linux: Ignoring PCI device with non-16bit domain.
> Pass --enable-32bits-pci-domain to configure to support such devices
> (warning: it would break the library ABI, don't enable unless really needed).
> lid velocity = 0.0016, prandtl # = 1., grashof # = 1.
> Number of SNES iterations = 2
> Possible error running C/C++ src/snes/tutorials/ex19 with 2 MPI processes
> See https://petsc.org/release/faq/
> hwloc/linux: Ignoring PCI device with non-16bit domain.
> Pass --enable-32bits-pci-domain to configure to support such devices
> (warning: it would break the library ABI, don't enable unless really needed).
> hwloc/linux: Ignoring PCI device with non-16bit domain.
> Pass --enable-32bits-pci-domain to configure to support such devices
> (warning: it would break the library ABI, don't enable unless really needed).
> lid velocity = 0.0016, prandtl # = 1., grashof # = 1.
> Number of SNES iterations = 2
> ***Error detected during compile or link!***
> See https://petsc.org/release/faq/
> /home/zhangsijie1995/Documents/Package/petsc-3.18.3/src/snes/tutorials ex5f
> *
> /opt/intel/oneapi/mpi/2021.8.0/bin/mpif90 -fPIC -Wall -ffree-line-length-none 
> -ffree-line-length-0 -Wno-lto-type-mismatch -Wno-unused-dummy-argument -g -O0 
>   -I/home/zhangsijie1995/Documents/Package/petsc-3.18.3/include 
> -I/home/zhangsijie1995/Documents/Package/petsc-3.18.3/arch-linux-c-debug/include
>  -I/opt/intel/oneapi/mkl/2023.0.0/include 
> -I/opt/intel/oneapi/mpi/2021.8.0/include ex5f.F90  
> -Wl,-rpath,/home/zhangsijie1995/Documents/Package/petsc-3.18.3/arch-linux-c-debug/lib
>  -L/home/zhangsijie1995/Documents/Package/petsc-3.18.3/arch-linux-c-debug/lib 
> -Wl,-rpath,/opt/intel/oneapi/mkl/2023.0.0/lib/intel64 
> -L/opt/intel/oneapi/mkl/2023.0.0/lib/intel64 
> -Wl,-rpath,/home/zhangsijie1995/Documents/Package/petsc-3.18.3/I_MPI_SUBSTITUTE_INSTALLDIR/lib/release
>  
> -L/home/zhangsijie1995/Documents/Package/petsc-3.18.3/I_MPI_SUBSTITUTE_INSTALLDIR/lib/release
>  
> -Wl,-rpath,/home/zhangsijie1995/Documents/Package/petsc-3.18.3/I_MPI_SUBSTITUTE_INSTALLDIR/lib
>  
> -L/home/zhangsijie1995/Documents/Package/petsc-3.18.3/I_MPI_SUBSTITUTE_INSTALLDIR/lib
>  -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/11 
> -L/usr/lib/gcc/x86_64-linux-gnu/11 -lpetsc -lmkl_intel_lp64 -lmkl_core 
> -lmkl_sequential -lpthread -lm -lstdc++ -ldl -lmpifort -lmpi -lrt -lpthread 
> -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lstdc++ -ldl -o ex5f
> f951: Warning: Nonexistent include directory 
> ‘I_MPI_SUBSTITUTE_INSTALLDIR/include/gfortran/11.1.0’ [-Wmissing-include-dirs]
> f951: Warning: Nonexistent include directory 
> ‘I_MPI_SUBSTITUTE_INSTALLDIR/include’ [-Wmissing-include-dirs]
> Possible error running Fortran example src/snes/tutorials/ex5f with 1 MPI 
> process
> See https://petsc.org/release/faq/
> hwloc/linux: Ignoring PCI device with non-16bit domain.
> Pass --enable-32bits-pci-domain to configure to support such devices
> (warning: it would break the library ABI, don't enable unless really needed).
> Number of SNES iterations = 3
> Completed test examples
> Error while running make check
> gmake[1]: *** [makefile:149: check] Error 1
> make: *** [GNUmakefile:17: check] Error 2



Re: [petsc-users] GPU implementation of serial smoothers

2023-01-10 Thread Barry Smith

  The default is some kind of Jacobi plus Chebyshev; for a certain class of 
problems it is quite good.
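
  For example, a sketch of the kind of runtime options involved (assuming a 
CUDA build; adjust to your own setup):

    -mat_type aijcusparse -vec_type cuda -pc_type gamg \
      -mg_levels_ksp_type chebyshev -mg_levels_pc_type jacobi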



> On Jan 10, 2023, at 3:31 PM, Mark Lohry  wrote:
> 
> So what are people using for GAMG configs on GPU? I was hoping petsc today 
> would be performance competitive with AMGx but it sounds like that's not the 
> case?
> 
> On Tue, Jan 10, 2023 at 3:03 PM Jed Brown <j...@jedbrown.org> wrote:
>> Mark Lohry <mlo...@gmail.com> writes:
>> 
>> > I definitely need multigrid. I was under the impression that GAMG was
>> > relatively cuda-complete, is that not the case? What functionality works
>> > fully on GPU and what doesn't, without any host transfers (aside from
>> > what's needed for MPI)?
>> >
>> > If I use -ksp-pc_type gamg -mg_levels_pc_type pbjacobi -mg_levels_ksp_type
>> > richardson is that fully on device, but -mg_levels_pc_type ilu or
>> > -mg_levels_pc_type sor require transfers?
>> 
>> You can do `-mg_levels_pc_type ilu`, but it'll be extremely slow (like 20x 
>> slower than an operator apply). One can use Krylov smoothers, though that's 
>> more synchronization. Automatic construction of operator-dependent 
>> multistage smoothers for linear multigrid (because Chebyshev only works for 
>> problems that have eigenvalues near the real axis) is something I've wanted 
>> to develop for at least a decade, but time is always short. I might put some 
>> effort into p-MG with such smoothers this year as we add DDES to our 
>> scale-resolving compressible solver.



Re: [petsc-users] GPU implementation of serial smoothers

2023-01-10 Thread Barry Smith


> On Jan 10, 2023, at 2:19 PM, Mark Lohry  wrote:
> 
>> Is DILU a point-block method? We have -pc_type pbjacobi (and vpbjacobi if 
>> the node size is not uniform). They are good choices for scale-resolving CFD 
>> on GPUs.
> 
> I was hoping you'd know :)  pbjacobi is underperforming ilu by a pretty wide 
> margin on some of the systems i'm looking at.
> 
>> We don't have colored smoothers currently in PETSc.
> 
> So what happens under the hood when I run -mg_levels_pc_type sor on GPU? Are 
> you actually decomposing the matrix into lower and computing updates with 
> matrix multiplications? Or is it just the standard serial algorithm with 
> thread safety ignored?

  It is running the regular SOR on the CPU and needs to copy up the vector and 
copy down the result.
> 
> On Tue, Jan 10, 2023 at 1:52 PM Barry Smith <bsm...@petsc.dev> wrote:
>> 
>>   We don't have colored smoothers currently in PETSc.
>> 
>> > On Jan 10, 2023, at 12:56 PM, Jed Brown <j...@jedbrown.org> wrote:
>> > 
>> > Is DILU a point-block method? We have -pc_type pbjacobi (and vpbjacobi if 
>> > the node size is not uniform). They are good choices for scale-resolving 
>> > CFD on GPUs.
>> > 
>> > Mark Lohry <mlo...@gmail.com> writes:
>> > 
>> >> I'm running GAMG with CUDA, and I'm wondering how the nominally serial
>> >> smoother algorithms are implemented on GPU? Specifically SOR/GS and ILU(0)
>> >> -- in e.g. AMGx these are applied by first creating a coloring, and the
>> >> smoother passes are done color by color. Is this how it's done in petsc 
>> >> AMG?
>> >> 
>> >> Tangential, AMGx and OpenFOAM offer something called "DILU", diagonal ILU.
>> >> Is there an equivalent in petsc?
>> >> 
>> >> Thanks,
>> >> Mark
>> 



Re: [petsc-users] GPU implementation of serial smoothers

2023-01-10 Thread Barry Smith


  We don't have colored smoothers currently in PETSc.

> On Jan 10, 2023, at 12:56 PM, Jed Brown  wrote:
> 
> Is DILU a point-block method? We have -pc_type pbjacobi (and vpbjacobi if the 
> node size is not uniform). They are good choices for scale-resolving CFD on 
> GPUs.
> 
> Mark Lohry  writes:
> 
>> I'm running GAMG with CUDA, and I'm wondering how the nominally serial
>> smoother algorithms are implemented on GPU? Specifically SOR/GS and ILU(0)
>> -- in e.g. AMGx these are applied by first creating a coloring, and the
>> smoother passes are done color by color. Is this how it's done in petsc AMG?
>> 
>> Tangential, AMGx and OpenFOAM offer something called "DILU", diagonal ILU.
>> Is there an equivalent in petsc?
>> 
>> Thanks,
>> Mark



Re: [petsc-users] Eliminating rows and columns which are zeros

2023-01-10 Thread Barry Smith

  Yes, after the solve x will contain correct values for ALL the locations, 
including the zeroed-out rows. Your use case is exactly what redistribute is 
for.
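
  A sketch of that workflow (assuming A, b, x and the IS of Dirichlet rows, 
isout, are already set up as in your code):

    PetscCall(MatZeroRowsColumnsIS(A, isout, 1.0, x, b));

    KSP ksp;
    PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
    PetscCall(KSPSetOperators(ksp, A, A));
    PetscCall(KSPSetFromOptions(ksp));          /* run with -pc_type redistribute */
    PetscCall(KSPSolve(ksp, b, x));

    /* x is in the ORIGINAL ordering; index the owned part with the usual global indices */
    PetscInt           rstart, rend;
    const PetscScalar *xa;
    PetscCall(VecGetOwnershipRange(x, &rstart, &rend));
    PetscCall(VecGetArrayRead(x, &xa));
    /* xa[i - rstart] holds the value of global row i, for rstart <= i < rend */
    PetscCall(VecRestoreArrayRead(x, &xa));
    PetscCall(KSPDestroy(&ksp));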

  Barry


> On Jan 10, 2023, at 11:25 AM, Karthikeyan Chockalingam - STFC UKRI 
>  wrote:
> 
> Thank you Barry. This is great!
>  
> I plan to solve using ‘-pc_type redistribute’ after applying the Dirichlet bc 
> using
> MatZeroRowsColumnsIS(A, isout, 1, x, b); 
>  
> While I retrieve the solution data from x (after the solve) – can I index 
> them using the original ordering (if I may say that)?
>  
> Kind regards,
> Karthik.
>  
> From: Barry Smith <bsm...@petsc.dev>
> Date: Tuesday, 10 January 2023 at 16:04
> To: Chockalingam, Karthikeyan (STFC,DL,HC) <karthikeyan.chockalin...@stfc.ac.uk>
> Cc: petsc-users@mcs.anl.gov <petsc-users@mcs.anl.gov>
> Subject: Re: [petsc-users] Eliminating rows and columns which are zeros
> 
>  
> https://petsc.org/release/docs/manualpages/PC/PCREDISTRIBUTE/#pcredistribute  
>  -pc_type redistribute
>  
>  
> It does everything for you. Note that if the right hand side for any of the 
> "zero" rows is nonzero then the system is inconsistent and the system does 
> not have a solution.
>  
> Barry
>  
> 
> 
> On Jan 10, 2023, at 10:30 AM, Karthikeyan Chockalingam - STFC UKRI via 
> petsc-users <petsc-users@mcs.anl.gov> wrote:
>  
> Hello,
>  
> I am assembling a MATIJ of size N, where a very large number of rows (and 
> corresponding columns), are zeros. I would like to potentially eliminate them 
> before the solve.
>  
> For instance say N=7
>  
>  0  0  0  0  0  0  0
>  0  1 -1  0  0  0  0
>  0 -1  2  0  0  0 -1
>  0  0  0  0  0  0  0
>  0  0  0  0  0  0  0
>  0  0  0  0  0  0  0
>  0  0 -1  0  0  0  1
>  
> I would like to reduce it to a 3x3
>  
> 1 -1 0
> -1 2 -1
> 0 -1 1
>  
> I do know the size N.
>  
> Q1) How do I do it?
> Q2) Is it better to eliminate them as it would save a lot of memory?
> Q3) At the moment, I don’t know which rows (and columns) have the zero 
> entries but with some effort I probably can find them. Should I know which 
> rows (and columns) I am eliminating?
>  
> Thank you.
>  
> Karthik.
> 



Re: [petsc-users] Eliminating rows and columns which are zeros

2023-01-10 Thread Barry Smith

https://petsc.org/release/docs/manualpages/PC/PCREDISTRIBUTE/#pcredistribute   
-pc_type redistribute


It does everything for you. Note that if the right hand side for any of the 
"zero" rows is nonzero then the system is inconsistent and the system does not 
have a solution.

Barry


> On Jan 10, 2023, at 10:30 AM, Karthikeyan Chockalingam - STFC UKRI via 
> petsc-users  wrote:
> 
> Hello,
>  
> I am assembling a MATIJ of size N, where a very large number of rows (and 
> corresponding columns), are zeros. I would like to potentially eliminate them 
> before the solve.
>  
> For instance say N=7
>  
>  0  0  0  0  0  0  0
>  0  1 -1  0  0  0  0
>  0 -1  2  0  0  0 -1
>  0  0  0  0  0  0  0
>  0  0  0  0  0  0  0
>  0  0  0  0  0  0  0
>  0  0 -1  0  0  0  1
>  
> I would like to reduce it to a 3x3
>  
> 1 -1 0
> -1 2 -1
> 0 -1 1
>  
> I do know the size N.
>  
> Q1) How do I do it?
> Q2) Is it better to eliminate them as it would save a lot of memory?
> Q3) At the moment, I don’t know which rows (and columns) have the zero 
> entries but with some effort I probably can find them. Should I know which 
> rows (and columns) I am eliminating?
>  
> Thank you.
>  
> Karthik.
> 



Re: [petsc-users] Error running configure on HDF5 in PETSc-3.18.3

2023-01-06 Thread Barry Smith

  Please email your latest configure.log  (with no Conda stuff) to 
petsc-ma...@mcs.anl.gov 

  The configuration of HDF5 (done by HDF5) is objecting to some particular 
aspect of your current Fortran compiler, we need to figure out the exact 
objection.

  Barry


> On Jan 6, 2023, at 5:14 PM, Danyang Su  wrote:
> 
> Hi Pierre,
>  
> I have tried to exclude Conda related environment variables but it does not 
> work. Instead, if I include ‘--download-hdf5=yes’ but exclude 
> ‘--with-hdf5-fortran-bindings’ in the configuration, PETSc can be configured 
> and installed without problem, even with Conda related environment activated. 
> However, since my code requires fortran interface to HDF5, I do need 
> ‘--with-hdf5-fortran-bindings’, otherwise, my code cannot be compiled.
>  
> Any other suggestions?
>  
> Thanks,
>  
> Danyang
>  
> From: Pierre Jolivet <pie...@joliv.et>
> Date: Friday, January 6, 2023 at 7:59 AM
> To: Danyang Su <danyang...@gmail.com>
> Cc: <petsc-users@mcs.anl.gov>
> Subject: Re: [petsc-users] Error running configure on HDF5 in PETSc-3.18.3
>  
>  
> 
> 
>> On 6 Jan 2023, at 4:49 PM, Danyang Su <danyang...@gmail.com> wrote:
>>  
>> Hi All,
>>  
>> I get ‘Error running configure on HDF5’ in PETSc-3.18.3 on MacOS, but no 
>> problem on Ubuntu. Attached is the configuration log file. 
>>  
>> ./configure --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-mumps 
>> --download-scalapack --download-parmetis --download-metis 
>> --download-ptscotch --download-fblaslapack --download-mpich --download-hypre 
>> --download-superlu_dist --download-hdf5=yes --with-debugging=0 
>> --download-cmake --with-hdf5-fortran-bindings
>>  
>> Any idea on this?
>  
> Could you try to reconfigure in a shell without conda being activated?
> You have 
> PATH=/Users/danyangsu/Soft/Anaconda3/bin:/Users/danyangsu/Soft/Anaconda3/condabin:[…]
>  which typically results in a broken configuration.
>  
> Thanks,
> Pierre
> 
> 
>> Thanks,
>>  
>> Danyang



Re: [petsc-users] MatCreateSeqAIJWithArrays for GPU / cusparse

2023-01-05 Thread Barry Smith

  So Jed's "everyone" now consists of "no one" and Jed can stop complaining 
that "everyone" thinks it is a bad idea.



> On Jan 5, 2023, at 11:50 PM, Junchao Zhang  wrote:
> 
> 
> 
> 
>> On Thu, Jan 5, 2023 at 10:32 PM Barry Smith <bsm...@petsc.dev> wrote:
>> 
>> 
>> > On Jan 5, 2023, at 3:42 PM, Jed Brown <j...@jedbrown.org> wrote:
>> > 
>> > Mark Adams <mfad...@lbl.gov> writes:
>> > 
>> >> Support of HIP and CUDA hardware together would be crazy, 
>> > 
>> > I don't think it's remotely crazy. libCEED supports both together and it's 
>> > very convenient when testing on a development machine that has one of each 
>> > brand GPU and simplifies binary distribution for us and every package that 
>> > uses us. Every day I wish PETSc could build with both simultaneously, but 
>> > everyone tells me it's silly.
>> 
>>   Not everyone at all; just a subset of everyone. Junchao is really the 
>> hold-out :-)
> I am not, instead I think we should try (I fully agree it can ease binary 
> distribution).  But satish needs to install such a machine first :)
> There are issues out of our control if we want to mix GPUs in execution.  For 
> example, how to do VecAXPY on a cuda vector and a hip vector? Shall we do it 
> on the host? Also, there are no gpu-aware MPI implementations supporting 
> messages between cuda memory and hip memory.
>> 
>>   I just don't care about "binary packages" :-); I think they are an archaic 
>> and bad way of thinking about code distribution (but yes the alternatives 
>> need lots of work to make them flawless, but I think that is where the work 
>> should go in the packaging world.)
>> 
>>I go further and think one should be able to automatically use a CUDA 
>> vector on a HIP device as well, it is not hard in theory but requires 
>> thinking about how we handle classes and subclasses a little to make it 
>> straightforward; or perhaps Jacob has fixed that also?
>  



Re: [petsc-users] MatCreateSeqAIJWithArrays for GPU / cusparse

2023-01-05 Thread Barry Smith



> On Jan 5, 2023, at 3:42 PM, Jed Brown  wrote:
> 
> Mark Adams  writes:
> 
>> Support of HIP and CUDA hardware together would be crazy, 
> 
> I don't think it's remotely crazy. libCEED supports both together and it's 
> very convenient when testing on a development machine that has one of each 
> brand GPU and simplifies binary distribution for us and every package that 
> uses us. Every day I wish PETSc could build with both simultaneously, but 
> everyone tells me it's silly.

  Not everyone at all; just a subset of everyone. Junchao is really the 
hold-out :-)

  I just don't care about "binary packages" :-); I think they are an archaic 
and bad way of thinking about code distribution (but yes the alternatives need 
lots of work to make them flawless, but I think that is where the work should 
go in the packaging world.)

   I go further and think one should be able to automatically use a CUDA vector 
on a HIP device as well, it is not hard in theory but requires thinking about 
how we handle classes and subclasses a little to make it straightforward; or 
perhaps Jacob has fixed that also?



Re: [petsc-users] setting a vector with VecSetValue versus VecSetValues

2023-01-05 Thread Barry Smith

  Note there is also VecSetValuesLocal() that takes ghosted local indices 
(ghosted local indices are different from your meaning of "local indices"). See 
https://petsc.org/release/docs/manualpages/Vec/VecSetValuesLocal/  
https://petsc.org/release/docs/manualpages/Vec/VecSetLocalToGlobalMapping/
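
  A sketch of the ghosted-local-index route (assuming ltog_indices[] holds the 
global index of each of your nlocal local entries, ghosts included):

    ISLocalToGlobalMapping ltog;
    PetscCall(ISLocalToGlobalMappingCreate(PETSC_COMM_WORLD, 1, nlocal, ltog_indices, PETSC_COPY_VALUES, &ltog));
    PetscCall(VecSetLocalToGlobalMapping(x, ltog));

    for (PetscInt i = 0; i < nlocal; i++) {
      PetscScalar val = (PetscScalar)u[i];
      PetscCall(VecSetValuesLocal(x, 1, &i, &val, INSERT_VALUES)); /* local index, mapped internally */
    }
    PetscCall(VecAssemblyBegin(x));
    PetscCall(VecAssemblyEnd(x));
    PetscCall(ISLocalToGlobalMappingDestroy(&ltog));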

  Barry


> On Jan 5, 2023, at 12:41 PM, Alfredo Jaramillo  
> wrote:
> 
> omg... for some reason I was thinking it takes local indices. Yes.. modifying 
> that line the code works well.
> 
> thank you!
> Alfredo
> 
> On Thu, Jan 5, 2023 at 10:38 AM Junchao Zhang wrote:
>> VecSetValue() also needs global indices, so you need PetscInt gl_row = 
>> (PetscInt)(i)+rstart; 
>> 
>> --Junchao Zhang
>> 
>> 
>> On Thu, Jan 5, 2023 at 11:25 AM Alfredo Jaramillo wrote:
>>> dear PETSc developers,
>>> 
>>> I have a code where I copy an array to a distributed petsc vector with the 
>>> next lines:
>>> 
>>> 1    for (int i = 0; i < ndof_local; i++) {
>>> 2        PetscInt gl_row = (PetscInt)(i)+rstart;
>>> 3        PetscScalar val = (PetscScalar)u[i];
>>> 4        VecSetValues(x,1,&gl_row,&val,INSERT_VALUES);
>>> 5    }
>>> 
>>> // for (int i = 0; i < ndof_local; i++) {
>>> // PetscInt gl_row = (PetscInt)(i);
>>> // PetscScalar val = (PetscScalar)u[i];
>>> // VecSetValue(x,gl_row,val,INSERT_VALUES);
>>> // }
>>> 
>>> VecAssemblyBegin(x);
>>> VecAssemblyEnd(x);
>>> 
>>> This works as expected. If, instead of using lines 1-5, I use the lines 
>>> where VecSetValue is used with local indices, then the vector is null on 
>>> all the processes but rank 0, and the piece of information at rank zero is 
>>> incorrect.
>>> 
>>> What could I be doing wrong?
>>> 
>>> bests regards
>>> Alfredo



Re: [petsc-users] Question - about the 'Hint for performance tuning'

2023-01-02 Thread Barry Smith


> On Jan 2, 2023, at 9:27 AM, 김성익  wrote:
> 
> There are more questions.
> 
> 1. Following your comments, but there is an error as below.
> 
> How can I fix this?

   The previous value of PETSC_ARCH was not arch-main-debug. Do ls $PETSC_DIR 
and locate the directory whose name begins with arch-; that will tell you the 
previous value used for PETSC_ARCH.

> 
> 2. After changing the optimized build, then how can I set the debug mode 
> again?

   export PETSC_ARCH=arch-<whatever the previous value was> and then recompile 
your code (note you do not need to recompile PETSc, just your executable).

> 
> 3.Following your comments, the new makefile is as below. Is it right?
> CFLAGS   = -O3
> FFLAGS   =
> CPPFLAGS =
> FPPFLAGS =
> COPTFLAGS= -march=native
>  
> app : a1.o a2.o a3.o a4.o
>$(LINK.C) -o $@ $^ $(LDLIBS)
>  
> include ${PETSC_DIR}/lib/petsc/conf/rules
> include ${PETSC_DIR}/lib/petsc/conf/test

Best not to set these values in the makefile at all because they will affect 
all compilers. Just set them with ./configure COPTFLAGS="-O3 -march=native"
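
For example (a sketch; adjust the flags to your compilers), reconfigure with

    ./configure --with-debugging=0 COPTFLAGS="-O3 -march=native" \
                CXXOPTFLAGS="-O3 -march=native" FOPTFLAGS="-O3 -march=native"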

> 
> 
> 4. I have no such processors. Where can I find benchmark information about 
> STREAMS?

  Run make mpistreams in $PETSC_DIR to run the STREAMS benchmark.


> 
> 
> Thanks, 
> Hyung Kim
> 
> 
> 
> 
> 
> 
> 
> 
> On Mon, Jan 2, 2023 at 11:03 PM, Matthew Knepley <knep...@gmail.com> wrote:
>> On Mon, Jan 2, 2023 at 4:16 AM 김성익 wrote:
>>> Hello,
>>> 
>>> Happy new year!!
>>> 
>>>  
>>> I have some questions about “Hint for performance tuning” in user guide of 
>>> petsc.
>>> 
>>>  
>>> 1. In the “Performance Pitfalls and Advice” section, there are 2 modes 
>>> “debug” and “optimized builds. My current setup is debug mode. So I want to 
>>> change for test the performance the optimized build mode. However, if I 
>>> configure again, does the existing debug mode disappear? Is there any way 
>>> to coexist the 2 modes and use them in the run the application?
>>> 
>> Suppose your current arch is named "arch-main-debug". Then you can make an 
>> optimized version using
>> 
>>   cd $PETSC_DIR
>>   ./arch-main-debug/lib/petsc/conf/reconfigure-arch-main-debug.py 
>> --with-debugging=0 --PETSC_ARCH=arch-main-opt
>>  
>>> 2. In the guide, there are some paragraphs about optimization level of 
>>> compiler. To control the optimization level of compiler, I put the ‘-O3’ as 
>>> below. Is this right??
>>> 
>>> CFLAGS   = -O3
>>> FFLAGS   =
>>> CPPFLAGS =
>>> FPPFLAGS =
>>>  
>>> app : a1.o a2.o a3.o a4.o
>>>$(LINK.C) -o $@ $^ $(LDLIBS)
>>>  
>>> include ${PETSC_DIR}/lib/petsc/conf/rules
>>> include ${PETSC_DIR}/lib/petsc/conf/test
>> 
>> You could dp this, but that only changes it for that directory. It is best 
>> to do it by reconfiguring.
>>  
>>> 3. In the guide, user should put ‘-march=native’ for using AVX2 or 
>>> AVX-512. Where should I put the ‘-march=native’ for using AVX?
>>> 
>> You can add --COPTFLAGS="<flags>" with any flags you want to the 
>> configure.
>> 
>>> 4. After read the “Hint for performance tuning” I understood that for 
>>> good performance and scalability user should use the multiple node and 
>>> multiple socket . However, before composing cluster system, many users just 
>>> can use desktop system. 
>>> In that case, between intel 13th i9 and amd ryzen 7950x, can the 7950x, 
>>> which has an arichitecture similar to the server processor, be more 
>>> suitable for petsc? (Because the architecture of intel desktop cpu is 
>>> big.little.)
>>> 
>> 
>> A good guide is to run the STREAMS benchmark on the processor. PETSc 
>> performance closely tracks that.
>> 
>>   Thanks,
>> 
>>  Matt
>>  
>>>  
>>> Thanks,
>>> 
>>> Hyung Kim
>>> 
>> 
>> 
>> -- 
>> What most experimenters take for granted before they begin their experiments 
>> is infinitely more interesting than any results to which their experiments 
>> lead.
>> -- Norbert Wiener
>> 
>> https://www.cse.buffalo.edu/~knepley/ 



Re: [petsc-users] Global Vector Generated from Section memory error

2022-12-28 Thread Barry Smith

  This is a mysterious stack. It is inside memalign() that Valgrind has found 
the code accessing memory outside of the allocated block size, yet memalign() 
is presumably the routine that is in the middle of the process of doing the 
allocation!  This could indicate that some (undetected) memory corruption 
occurred earlier in the run and has corrupted memalign()'s data structures. 
I presume this is the first warning message?

  You can try running without Valgrind but with the PETSc option -malloc_debug 
and see if that detects corruption more clearly.

   Barry


> On Dec 28, 2022, at 2:10 PM, Nicholas Arnold-Medabalimi  
> wrote:
> 
> Hi Petsc Users
> 
> I've been working with vectors generated from a DM and getting some odd 
> memory errors. Using Valgrind, I have been able to trace the issue to 
> DMCreateGlobalVector. I've reduced the code to a relatively simple routine 
> (very similar to example 7) and attached it. I suspect the issue comes down 
> to something improperly set in the section. The code, when integrated, will 
> run correctly 10-30% of the time and otherwise give a memory corruption error.
> 
>  Any insight on the issue or possible error on my part would be appreciated.
> 
> 
> Using Valgrind, I get the following error. 
> 
> ==27064== Invalid write of size 8
> ==27064==at 0x10C91E: main (section_vector_build.cpp:184)
> ==27064==  Address 0xc4aa248 is 4 bytes after a block of size 3,204 alloc'd
> ==27064==at 0x483E0F0: memalign (in 
> /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==27064==by 0x483E212: posix_memalign (in 
> /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==27064==by 0x4C4DAB0: PetscMallocAlign (mal.c:54)
> ==27064==by 0x4C5262F: PetscTrMallocDefault (mtr.c:186)
> ==27064==by 0x4C501F7: PetscMallocA (mal.c:420)
> ==27064==by 0x527E8A9: VecCreate_MPI_Private (pbvec.c:485)
> ==27064==by 0x527F04F: VecCreate_MPI (pbvec.c:523)
> ==27064==by 0x53E7097: VecSetType (vecreg.c:89)
> ==27064==by 0x527FBC8: VecCreate_Standard (pbvec.c:547)
> ==27064==by 0x53E7097: VecSetType (vecreg.c:89)
> ==27064==by 0x6CD77C0: DMCreateGlobalVector_Section_Private (dmi.c:58)
> ==27064==by 0x61D52DB: DMCreateGlobalVector_Plex (plexcreate.c:4130)
> 
> 
> 
> -- 
> Nicholas Arnold-Medabalimi
> 
> Ph.D. Candidate
> Computational Aeroscience Lab
> University of Michigan
> 



Re: [petsc-users] extract preconditioner matrix

2022-12-14 Thread Barry Smith


> On Dec 14, 2022, at 9:10 PM, 김성익  wrote:
> 
> Hello,
> 
> 
> 
> I tried to find the way to adapt my own preconditioner.
> In other words, I want to apply and solve a new preconditioner rather than 
> using the existing one in Petsc.
> 
> So, my questions are as below
> 1. Is this possible to adapt my own preconditioner??

   There are a variety of ways to provide your own preconditioner; you can use 
https://petsc.org/release/docs/manualpages/PC/PCSHELL/ and take the 
preconditioner completely into your own hands. But often one builds a 
preconditioner by combining multiple simpler preconditioners: for example 
PCFIELDSPLIT, discussed in https://petsc.org/release/docs/manual/ksp/; even 
block Jacobi https://petsc.org/release/docs/manualpages/PC/PCBJACOBI/#pcbjacobi 
is built up from smaller preconditioners.  

  What particular type of preconditioner are you planning to build? Other users 
may have ideas on how to do it.
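
  For the PCSHELL route, a minimal sketch (the apply routine below is only a 
placeholder for whatever your preconditioner actually does, and "myctx" is 
whatever data it needs):

    static PetscErrorCode MyPCApply(PC pc, Vec r, Vec z)
    {
      void *ctx;
      PetscFunctionBeginUser;
      PetscCall(PCShellGetContext(pc, &ctx));
      /* z = M^{-1} r : replace this copy with your own preconditioner action */
      PetscCall(VecCopy(r, z));
      PetscFunctionReturn(0);
    }

    /* ... after KSPCreate() and KSPSetOperators() ... */
    PC pc;
    PetscCall(KSPGetPC(ksp, &pc));
    PetscCall(PCSetType(pc, PCSHELL));
    PetscCall(PCShellSetContext(pc, &myctx));
    PetscCall(PCShellSetApply(pc, MyPCApply));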
> 
> 2. Also is it possible to extract preconditioner matrix created in Petsc?

I'm not sure what you mean by this. Preconditioners are very rarely represented 
directly by a matrix (that would be too inefficient). Rather one provides 
functions that apply the action of the preconditioner. As noted above one 
provides such functions in PETSc using 
https://petsc.org/release/docs/manualpages/PC/PCSHELL/.  


> 
> 3. Is this possible to separate preconditioning & solving procedure to check 
> the result of each process in Petsc??

   The KSP and the PC work together to provide an overall good solver. One can 
focus on the preconditioner's quality by using it with several different Krylov 
methods. For example KSPRICHARDSON (-ksp_type richardson) does essentially 
nothing, so the convergence (or lack of convergence) is determined by the 
preconditioner only. For non-positive-definite matrices 
https://petsc.org/release/docs/manualpages/KSP/KSPBCGS/#kspbcgs is generally 
weaker than https://petsc.org/release/docs/manualpages/KSP/KSPGMRES/#kspgmres, 
so how your preconditioner works with these two different KSP methods can help 
you evaluate the preconditioner.

  Feel free to ask more detailed questions as you use the PETSc solvers,

  Barry

> 
> 
> Thanks,
> Hyung Kim



Re: [petsc-users] Saving solution with monitor function

2022-12-14 Thread Barry Smith

  See, for example 
https://petsc.org/release/docs/manualpages/TS/TSTrajectoryGetVecs/ and 
https://petsc.org/release/docs/manualpages/TS/TSTrajectoryGetUpdatedHistoryVecs/
 

  One does not directly access the data inside the trajectory; one calls 
functions in the API to obtained desired information.  If you need specific 
information that it does not currently provide we can attempt to provide 
additional functionality.
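
  A sketch of reading an interpolated solution back after the solve (assuming 
TSSetSaveTrajectory() was called before TSSolve(), as in your code):

    TSTrajectory tj;
    Vec          U, Udot;
    PetscReal    t = 0.5;   /* any time inside the solved interval */

    PetscCall(TSGetTrajectory(ts, &tj));
    PetscCall(TSTrajectoryGetUpdatedHistoryVecs(tj, ts, t, &U, &Udot));
    /* ... use U (the state at time t) ... */
    PetscCall(TSTrajectoryRestoreUpdatedHistoryVecs(tj, &U, &Udot));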


  Barry


> On Dec 14, 2022, at 1:07 PM, Guglielmo, Tyler Hardy via petsc-users 
>  wrote:
> 
> Thanks Matt,
>  
> I’m a bit confused on where the trajectory is being stored in the 
> TSTrajectory object.
>  
> Basically I have run 
>  
> TSSetSaveTrajectory(ts);
> …
> TSSolve(ts, x);
> TSTrajectory tj;
> TSGetTrajectory(ts, &tj);
> TSTrajectorySetType(tj, ts, TSTRAJECTORYMEMORY);
>  
> How is the object supposed to be accessed to find the entire trajectory?  I 
> couldn’t find a clear example of where this is laid out in the documentation.
>  
> The TSTrajectory object looks like some complicated struct, but parsing which 
> pointer is pointing to the solution has alluded me.
>  
> Thanks for your time!
>  
> Best,
> Tyler
>  
>  
> From: Matthew Knepley <knep...@gmail.com>
> Date: Tuesday, December 13, 2022 at 6:41 AM
> To: Guglielmo, Tyler Hardy <gugliel...@llnl.gov>
> Cc: petsc-users@mcs.anl.gov
> Subject: Re: [petsc-users] Saving solution with monitor function
> 
> On Tue, Dec 13, 2022 at 8:40 AM Guglielmo, Tyler Hardy via petsc-users 
> <petsc-users@mcs.anl.gov> wrote:
> Hi all,
>  
> I am a new PETSc user (and new to MPI in general), and was wondering if 
> someone could help me out with what I am sure is a basic question (if this is 
> not the appropriate email list or there is a better place please let me 
> know!).
>  
> Basically, I am writing a code that requires a solution to an ODE that will 
> be used later on during runtime.  I have written the basic ODE solver using 
> TSRK, however I haven’t thought of a good way to store the actual solution at 
> all time steps throughout the time evolution.  I would like to avoid writing 
> each time step to a file through the monitor function, and instead just plug 
> each time step into an array.
>  
> How is this usually done?  I suppose the user defined struct that gets passed 
> into the monitor function could contain a pointer to an array in main?  This 
> is how I would do this if the program wasn’t of the MPI variety, but I am not 
> sure how to properly declare a pointer to an array declared as Vec and built 
> through the usual PETSc process.  Any tips are greatly appreciated
>  
> I think this is what TSTrajectory is for. I believe you want 
> https://petsc.org/main/docs/manualpages/TS/TSTRAJECTORYMEMORY/ 
> 
>  
>   Thanks,
>  
>   Matt
>  
> Thanks for your time,
> Tyler
>  
> +
> Tyler Guglielmo
> Postdoctoral Researcher
> Lawrence Livermore National Lab
> Office: 925-423-6186
> Cell: 210-480-8000
> +
> 
>  
> -- 
> What most experimenters take for granted before they begin their experiments 
> is infinitely more interesting than any results to which their experiments 
> lead.
> -- Norbert Wiener
>  
> https://www.cse.buffalo.edu/~knepley/ 
> 


Re: [petsc-users] Saving solution with monitor function

2022-12-13 Thread Barry Smith


  It is also possible to read the solutions back from the trajectory object 
from your running code. It is not just for saving to files.


> On Dec 13, 2022, at 12:51 PM, Zhang, Hong via petsc-users 
>  wrote:
> 
> Tyler,
> 
> The quickest solution is to use TSTrajectory as Matt mentioned. You can add 
> the following command line options to save the solution into a binary file 
> under a folder at each time step.
> 
> -ts_save_trajectory -ts_trajectory_type visualization 
> 
> The folder name and the file name can be customized with 
> -ts_trajectory_dirname and -ts_trajectory_file_template.
> 
> If you want to load these files into Matlab, you can use some scripts in 
> share/petsc/matlab/ such as PetscReadBinaryTrajectory.m and PetscBinaryRead.m.
> 
> The python versions of these scripts are available in lib/petsc/bin/.
> 
> Hong(Mr.)
> 
>> On Dec 13, 2022, at 12:14 AM, Guglielmo, Tyler Hardy via petsc-users 
>> <petsc-users@mcs.anl.gov> wrote:
>> 
>> Hi all,
>>  
>> I am a new PETSc user (and new to MPI in general), and was wondering if 
>> someone could help me out with what I am sure is a basic question (if this 
>> is not the appropriate email list or there is a better place please let me 
>> know!).
>>  
>> Basically, I am writing a code that requires a solution to an ODE that will 
>> be used later on during runtime.  I have written the basic ODE solver using 
>> TSRK, however I haven’t thought of a good way to store the actual solution 
>> at all time steps throughout the time evolution.  I would like to avoid 
>> writing each time step to a file through the monitor function, and instead 
>> just plug each time step into an array.
>>  
>> How is this usually done?  I suppose the user defined struct that gets 
>> passed into the monitor function could contain a pointer to an array in 
>> main?  This is how I would do this if the program wasn’t of the MPI variety, 
>> but I am not sure how to properly declare a pointer to an array declared as 
>> Vec and built through the usual PETSc process.  Any tips are greatly 
>> appreciated!
>>  
>> Thanks for your time,
>> Tyler
>>  
>> +
>> Tyler Guglielmo
>> Postdoctoral Researcher
>> Lawrence Livermore National Lab
>> Office: 925-423-6186
>> Cell: 210-480-8000
>> +
> 



Re: [petsc-users] parallelize matrix assembly process

2022-12-13 Thread Barry Smith
-03
> 
> [0]  PCSetUp(): Setting up PC with same nonzero pattern
> 
> [1]  PetscCommDuplicate(): Using internal PETSc communicator 
> 139620736897504 94891083133744
> 
> [0]  PetscCommDuplicate(): Using internal PETSc communicator 
> 139687279636960 94370403730224
> 
> [1]  PetscCommDuplicate(): Using internal PETSc communicator 
> 139620736897504 94891083133744
> 
> [1]  PetscCommDuplicate(): Using internal PETSc communicator 
> 139620736897504 94891083133744
> 
> [0]  PetscCommDuplicate(): Using internal PETSc communicator 
> 139687279636960 94370403730224
> 
> [0]  PetscCommDuplicate(): Using internal PETSc communicator 
> 139687279636960 94370403730224
> 
> [0]  PCSetUp(): Leaving PC with identical preconditioner since operator 
> is unchanged
> 
> [0]  PCSetUp(): Leaving PC with identical preconditioner since operator 
> is unchanged
> 
> [0]  KSPConvergedDefault(): Linear solver has converged. Residual norm 
> 1.539725065974e-19 is less than relative tolerance 1.e-05 times 
> initial right hand side norm 8.015104666105e-06 at iteration 1
> 
> Solving Time : 4.662785sec
> 
> [0]  PetscCommDuplicate(): Using internal PETSc communicator 
> 139687279636960 94370403730224
> 
> [0]  PetscCommDuplicate(): Using internal PETSc communicator 
> 139687279636960 94370403730224
> 
> Solving Time : 4.664515sec
> 
> [1]  PetscCommDuplicate(): Using internal PETSc communicator 
> 139620736897504 94891083133744
> 
> [1]  PetscCommDuplicate(): Using internal PETSc communicator 
> 139620736897504 94891083133744
> 
> [0]  VecAssemblyBegin_MPI_BTS(): Stash has 661 entries, uses 5 mallocs.
> 
> [0]  VecAssemblyBegin_MPI_BTS(): Block-Stash has 0 entries, uses 0 
> mallocs.
> 
> [1]  MatAssemblyBegin_MPIAIJ(): Stash has 461184 entries, uses 0 mallocs.
> 
> [0]  MatAssemblyBegin_MPIAIJ(): Stash has 460416 entries, uses 0 mallocs.
> 
> Assemble Time : 5.238257sec
> 
> [1]  PetscCommDuplicate(): Using internal PETSc communicator 
> 139620736897504 94891083133744
> 
> [1]  PetscCommDuplicate(): Using internal PETSc communicator 
> 139620736897504 94891083133744
> 
> Assemble Time : 5.236535sec
> 
> [0]  PetscCommDuplicate(): Using internal PETSc communicator 
> 139687279636960 94370403730224
> 
> [0]  PetscCommDuplicate(): Using internal PETSc communicator 
> 139687279636960 94370403730224
> 
>  
>  TIME : 1.00, TIME_STEP : 1.00,  ITER : 3, RESIDUAL : 
> 3.705062e-08
> 
>  TIME0 : 1.00
> 
> [0]  VecAssemblyBegin_MPI_BTS(): Stash has 13891 entries, uses 0 mallocs.
> 
> [0]  VecAssemblyBegin_MPI_BTS(): Block-Stash has 0 entries, uses 0 
> mallocs.
> 
>  
>  TIME : 1.00, TIME_STEP : 1.00,  ITER : 3, RESIDUAL : 
> 3.705062e-08
> 
>  TIME0 : 1.00
> 
> [1]  PetscFinalize(): PetscFinalize() called
> 
> [0]  VecAssemblyBegin_MPI_BTS(): Stash has 661 entries, uses 5 mallocs.
> 
> [0]  VecAssemblyBegin_MPI_BTS(): Block-Stash has 0 entries, uses 0 
> mallocs.
> 
> [0]  PetscFinalize(): PetscFinalize() called
> 
> 
> On Tue, Dec 13, 2022 at 12:50 AM Barry Smith <bsm...@petsc.dev> wrote:
>> 
>>The problem is possibly due to most elements being computed on "wrong" 
>> MPI rank and thus requiring almost all the matrix entries to be "stashed" 
>> when computed and then sent off to the owning MPI rank.  Please send ALL the 
>> output of a parallel run with -info so we can see how much communication is 
>> done in the matrix assembly.
>> 
>>   Barry
>> 
>> 
>> > On Dec 12, 2022, at 6:16 AM, 김성익 > > <mailto:ksi2...@gmail.com>> wrote:
>> > 
>> > Hello,
>> > 
>> > 
>> > I need some keyword or some examples for parallelizing matrix assemble 
>> > process.
>> > 
>> > My current state is as below.
>> > - Finite element analysis code for Structural mechanics.
>> > - problem size : 3D solid hexa element (number of elements : 125,000), 
>> > number of degree of freedom : 397,953
>> > - Matrix type : seqaij, matrix set preallocation by using 
>> > MatSeqAIJSetPreallocation
>> > - Matrix assemble time by using 1 core : 120 sec
>> >for (int i=0; i<125000; i++) {
>> > ~~ element matrix calculation}
>> >matassemblybegin
>> >matassemblyend
>> > - Matrix assemble time by using 8 core : 70,234sec
>> >   int start, end;
>> >   VecGetOwnershipRange( element_vec, &start, &end );
>> >   for (int i=start; i<end; i++) {
>> >~~ element matrix calculation
>> >matassemblybegin
>> >matassemblyend
>> > 
>> > 
>> > As you see the state, the parallel case spent a lot of time than 
>> > sequential case..
>> > How can I speed up in this case?
>> > Can I get some keyword or examples for parallelizing assembly of matrix in 
>> > finite element analysis ?
>> > 
>> > Thanks,
>> > Hyung Kim
>> > 
>> 



Re: [petsc-users] parallelize matrix assembly process

2022-12-12 Thread Barry Smith


   The problem is possibly due to most elements being computed on the "wrong" MPI 
rank, thus requiring almost all the matrix entries to be "stashed" when 
computed and then sent off to the owning MPI rank.  Please send ALL the output 
of a parallel run with -info so we can see how much communication is done in 
the matrix assembly.
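
   For reference, the pattern we are aiming for looks roughly like the sketch below (C, schematic only; NEN, my_first_elem, my_last_elem and ComputeElementMatrix() are placeholders for the application's own element partition and data, and error checking is abbreviated). The point is that each rank loops only over the elements assigned to it, so most of the generated entries belong to locally owned rows and little has to be stashed and communicated at assembly time.

  PetscInt    rstart, rend, rows[NEN];
  PetscScalar ke[NEN * NEN];

  PetscCall(MatGetOwnershipRange(A, &rstart, &rend));         /* rows [rstart, rend) live on this rank */
  for (PetscInt e = my_first_elem; e < my_last_elem; e++) {   /* only the elements assigned to this rank */
    ComputeElementMatrix(e, rows, ke);                        /* fills global row indices and the NEN x NEN block */
    PetscCall(MatSetValues(A, NEN, rows, NEN, rows, ke, ADD_VALUES));
  }
  PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
  PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));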

  Barry


> On Dec 12, 2022, at 6:16 AM, 김성익  wrote:
> 
> Hello,
> 
> 
> I need some keyword or some examples for parallelizing matrix assemble 
> process.
> 
> My current state is as below.
> - Finite element analysis code for Structural mechanics.
> - problem size : 3D solid hexa element (number of elements : 125,000), number 
> of degree of freedom : 397,953
> - Matrix type : seqaij, matrix set preallocation by using 
> MatSeqAIJSetPreallocation
> - Matrix assemble time by using 1 core : 120 sec
>for (int i=0; i<125000; i++) {
> ~~ element matrix calculation}
>matassemblybegin
>matassemblyend
> - Matrix assemble time by using 8 core : 70,234sec
>   int start, end;
>   VecGetOwnershipRange( element_vec, &start, &end );
>   for (int i=start; i<end; i++) {
>~~ element matrix calculation
>matassemblybegin
>matassemblyend
> 
> 
> As you see the state, the parallel case spent a lot of time than sequential 
> case..
> How can I speed up in this case?
> Can I get some keyword or examples for parallelizing assembly of matrix in 
> finite element analysis ?
> 
> Thanks,
> Hyung Kim
> 



Re: [petsc-users] Union of sequential vecs

2022-12-09 Thread Barry Smith

  Ok, so you want the unique sorted list of integers from all the seq vectors 
on every MPI rank?

   1) VecScatterCreateToAll() to get all values on all ranks (make the sequential 
      vectors MPI vectors instead).
   2) Create an integer array long enough to hold all of them.
   3) Use VecGetArray() and a for loop to copy all the values into the integer array.
   4) Use PetscSortRemoveDupsInt() on the integer array.

  Now each rank has all the desired values.
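
   A minimal sketch of that recipe in C (my sketch, assuming the values are really integers stored in an MPI Vec vin, with error checking abbreviated):

  #include <petscvec.h>

  PetscErrorCode GatherUniqueSorted(Vec vin, PetscInt *nunique, PetscInt **unique)
  {
    VecScatter         ctx;
    Vec                vall;
    const PetscScalar *a;
    PetscInt           n, *vals;

    PetscFunctionBeginUser;
    PetscCall(VecScatterCreateToAll(vin, &ctx, &vall));  /* every rank gets every value */
    PetscCall(VecScatterBegin(ctx, vin, vall, INSERT_VALUES, SCATTER_FORWARD));
    PetscCall(VecScatterEnd(ctx, vin, vall, INSERT_VALUES, SCATTER_FORWARD));
    PetscCall(VecGetLocalSize(vall, &n));
    PetscCall(PetscMalloc1(n, &vals));
    PetscCall(VecGetArrayRead(vall, &a));
    for (PetscInt i = 0; i < n; i++) vals[i] = (PetscInt)PetscRealPart(a[i]); /* values are integers stored as scalars */
    PetscCall(VecRestoreArrayRead(vall, &a));
    PetscCall(PetscSortRemoveDupsInt(&n, vals));         /* sort and drop duplicates */
    *nunique = n;
    *unique  = vals;
    PetscCall(VecScatterDestroy(&ctx));
    PetscCall(VecDestroy(&vall));
    PetscFunctionReturn(0);
  }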



> On Dec 9, 2022, at 3:24 PM, Karthikeyan Chockalingam - STFC UKRI 
>  wrote:
> 
> That is where I am stuck, I don’t know who to combine them to get Vec = 
> {2,5,7,8,10,11,12}.
> I just want them in an MPI vector.
>  
> I finally plan to call VecScatterCreateToAll so that all processor gets a 
> copy.
>  
> Thank you.
>  
> Kind regards,
> Karthik.
>  
> From: Barry Smith mailto:bsm...@petsc.dev>>
> Date: Friday, 9 December 2022 at 20:04
> To: Chockalingam, Karthikeyan (STFC,DL,HC) 
>  <mailto:karthikeyan.chockalin...@stfc.ac.uk>>
> Cc: petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov> 
> mailto:petsc-users@mcs.anl.gov>>
> Subject: Re: [petsc-users] Union of sequential vecs
> 
>  
>   How are you combining them to get Vec = {2,5,7,8,10,11,12}?
>  
>   Do you want the values to remain on the same MPI rank as before, just in an 
> MPI vector?
>  
>  
> 
> 
> On Dec 9, 2022, at 2:28 PM, Karthikeyan Chockalingam - STFC UKRI via 
> petsc-users mailto:petsc-users@mcs.anl.gov>> wrote:
>  
> Hi,
>  
> I want to take the union of a set of sequential vectors, each living in a 
> different processor.
>  
> Say, 
> Vec_Seq1 = {2,5,7}
> Vec_Seq2 = {5,8,10,11}
> Vec_Seq3 = {5,2,12}.
>  
> Finally, get the union of all them Vec = {2,5,7,8,10,11,12}.
>  
> I initially wanted to create a parallel vector and insert the (sequential 
> vector) values but I do not know, to which index to insert the values to. But 
> I do know the total size of Vec (which in this case is 7).
>  
> Any help is much appreciated.
>  
> Kind regards,
> Karthik.
>  
>  
>  
> This email and any attachments are intended solely for the use of the named 
> recipients. If you are not the intended recipient you must not use, disclose, 
> copy or distribute this email or any of its attachments and should notify the 
> sender immediately and delete this email from your system. UK Research and 
> Innovation (UKRI) has taken every reasonable precaution to minimise risk of 
> this email or any attachments containing viruses or malware but the recipient 
> should carry out its own virus and malware checks before opening the 
> attachments. UKRI does not accept any liability for any losses or damages 
> which the recipient may sustain due to presence of any viruses. 
> 



Re: [petsc-users] Union of sequential vecs

2022-12-09 Thread Barry Smith

  How are you combining them to get Vec = {2,5,7,8,10,11,12}?

  Do you want the values to remain on the same MPI rank as before, just in an 
MPI vector?



> On Dec 9, 2022, at 2:28 PM, Karthikeyan Chockalingam - STFC UKRI via 
> petsc-users  wrote:
> 
> Hi,
>  
> I want to take the union of a set of sequential vectors, each living in a 
> different processor.
>  
> Say, 
> Vec_Seq1 = {2,5,7}
> Vec_Seq2 = {5,8,10,11}
> Vec_Seq3 = {5,2,12}.
>  
> Finally, get the union of all them Vec = {2,5,7,8,10,11,12}.
>  
> I initially wanted to create a parallel vector and insert the (sequential 
> vector) values but I do not know, to which index to insert the values to. But 
> I do know the total size of Vec (which in this case is 7).
>  
> Any help is much appreciated.
>  
> Kind regards,
> Karthik.
>  
>  
>  
> This email and any attachments are intended solely for the use of the named 
> recipients. If you are not the intended recipient you must not use, disclose, 
> copy or distribute this email or any of its attachments and should notify the 
> sender immediately and delete this email from your system. UK Research and 
> Innovation (UKRI) has taken every reasonable precaution to minimise risk of 
> this email or any attachments containing viruses or malware but the recipient 
> should carry out its own virus and malware checks before opening the 
> attachments. UKRI does not accept any liability for any losses or damages 
> which the recipient may sustain due to presence of any viruses. 
> 



Re: [petsc-users] Fortran Interface NULL object / Casting

2022-12-08 Thread Barry Smith

   You would use PETSC_NULL_DMLABEL but Matt needs to customize the PETSc 
Fortran stub for DMAddField() for you to handle accepting the NULL from PETSc.

   Barry





> On Dec 8, 2022, at 1:05 PM, Nicholas Arnold-Medabalimi  
> wrote:
> 
> Hi Petsc Users
> 
> I am trying to use DMAddField in a Fortran code. I had some questions on 
> casting/passing NULL. I follow how to pass NULL for standard types (INT, 
> CHAR, etc). 
> 
> Is there a method/best practice for passing NULL for Petsc type arguments? 
> (In this case DMAddLabel I'd want to pass NULL to a DMLabel Parameter not an 
> intrinsic type)
> 
> For the 2nd question, In some cases in C we would cast a more specific object 
> to Petsc Object 
> DMAddField(dm, NULL, (PetscObject)fvm); (where fvm is a PetscFVM type)
> Is there a method or practice to do the same in Fortran.
> 
> Thanks
> Nicholas
> 
> -- 
> Nicholas Arnold-Medabalimi
> 
> Ph.D. Candidate
> Computational Aeroscience Lab
> University of Michigan



Re: [petsc-users] prevent linking to multithreaded BLAS?

2022-12-08 Thread Barry Smith



> On Dec 7, 2022, at 11:56 PM, Jed Brown  wrote:
> 
> It isn't always wrong to link threaded BLAS. For example, a user might need 
> to call threaded BLAS on the side (but the application can only link one) or 
> a sparse direct solver might want threading for the supernode.

  Indeed, the user asked specifically whether, for their workflow, configure could 
ensure, based on an additional configure argument, that they did not get a threaded 
BLAS; they did not ask that configure never give a threaded BLAS, or even that 
it give a non-threaded BLAS by default.


> We could test at runtime whether child threads exist/are created when calling 
> BLAS and deliver a warning. 

  How does one test for this? Some standard Unix API for checking this?

> 
> Barry Smith  writes:
> 
>>   There would need to be, for example, some symbol in all the threaded BLAS 
>> libraries that is not in the unthreaded libraries. Of at least in some of 
>> the threaded libraries but never in the unthreaded. 
>> 
>>   BlasLapack.py could check for the special symbol(s) to determine.
>> 
>>  Barry
>> 
>> 
>>> On Dec 7, 2022, at 4:47 PM, Mark Lohry  wrote:
>>> 
>>> Thanks, yes, I figured out the OMP_NUM_THREADS=1 way while triaging it, and 
>>> the --download-fblaslapack way occurred to me. 
>>> 
>>> I was hoping for something that "just worked" (refuse to build in this 
>>> case) but I don't know if it's programmatically possible for petsc to tell 
>>> whether or not it's linking to a threaded BLAS?
>>> 
>>> On Wed, Dec 7, 2022 at 4:35 PM Satish Balay >> <mailto:ba...@mcs.anl.gov>> wrote:
>>>> If you don't specify a blas to use - petsc configure will guess and use 
>>>> what it can find.
>>>> 
>>>> So only way to force it use a particular blas is to specify one [one way 
>>>> is --download-fblaslapack]
>>>> 
>>>> Wrt multi-thread openblas -  you can force it run single threaded [by one 
>>>> of these 2 env variables]
>>>> 
>>>># Use single thread openblas
>>>>export OPENBLAS_NUM_THREADS=1
>>>>export OMP_NUM_THREADS=1
>>>> 
>>>> Satish
>>>> 
>>>> 
>>>> On Wed, 7 Dec 2022, Mark Lohry wrote:
>>>> 
>>>>> I ran into an unexpected issue -- on an NP-core machine, each MPI rank of
>>>>> my application was launching NP threads, such that when running with
>>>>> multiple ranks the machine was quickly oversubscribed and performance
>>>>> tanked.
>>>>> 
>>>>> The root cause of this was petsc linking against the system-provided
>>>>> library (libopenblas0-pthread in this case) set by the update-alternatives
>>>>> in ubuntu. At some point this machine got updated to using the threaded
>>>>> blas implementation instead of serial; not sure how, and I wouldn't have
>>>>> noticed if I weren't running interactively.
>>>>> 
>>>>> Is there any mechanism in petsc or its build system to prevent linking
>>>>> against an inappropriate BLAS, or do I need to be diligent about manually
>>>>> setting the BLAS library in the configuration stage?
>>>>> 
>>>>> Thanks,
>>>>> Mark
>>>>> 
>>>> 



Re: [petsc-users] prevent linking to multithreaded BLAS?

2022-12-07 Thread Barry Smith

   There would need to be, for example, some symbol in all the threaded BLAS 
libraries that is not in the unthreaded libraries, or at least in some of the 
threaded libraries but never in the unthreaded ones. 

   BlasLapack.py could check for the special symbol(s) to determine.

  Barry


> On Dec 7, 2022, at 4:47 PM, Mark Lohry  wrote:
> 
> Thanks, yes, I figured out the OMP_NUM_THREADS=1 way while triaging it, and 
> the --download-fblaslapack way occurred to me. 
> 
> I was hoping for something that "just worked" (refuse to build in this case) 
> but I don't know if it's programmatically possible for petsc to tell whether 
> or not it's linking to a threaded BLAS?
> 
> On Wed, Dec 7, 2022 at 4:35 PM Satish Balay  > wrote:
>> If you don't specify a blas to use - petsc configure will guess and use what 
>> it can find.
>> 
>> So only way to force it use a particular blas is to specify one [one way is 
>> --download-fblaslapack]
>> 
>> Wrt multi-thread openblas -  you can force it run single threaded [by one of 
>> these 2 env variables]
>> 
>> # Use single thread openblas
>> export OPENBLAS_NUM_THREADS=1
>> export OMP_NUM_THREADS=1
>> 
>> Satish
>> 
>> 
>> On Wed, 7 Dec 2022, Mark Lohry wrote:
>> 
>> > I ran into an unexpected issue -- on an NP-core machine, each MPI rank of
>> > my application was launching NP threads, such that when running with
>> > multiple ranks the machine was quickly oversubscribed and performance
>> > tanked.
>> > 
>> > The root cause of this was petsc linking against the system-provided
>> > library (libopenblas0-pthread in this case) set by the update-alternatives
>> > in ubuntu. At some point this machine got updated to using the threaded
>> > blas implementation instead of serial; not sure how, and I wouldn't have
>> > noticed if I weren't running interactively.
>> > 
>> > Is there any mechanism in petsc or its build system to prevent linking
>> > against an inappropriate BLAS, or do I need to be diligent about manually
>> > setting the BLAS library in the configuration stage?
>> > 
>> > Thanks,
>> > Mark
>> > 
>> 



Re: [petsc-users] prevent linking to multithreaded BLAS?

2022-12-07 Thread Barry Smith


  We don't have configure code to detect if the BLAS is thread parallel, nor do 
we have code to tell it not to use a thread parallel version. 

  The exception is MKL: when MKL is being used we do force it to not use the threaded BLAS.

  A "cheat" would be for you to just set the environmental variable BLAS uses 
for number of threads to 1 always, then you would not need to worry about 
checking to avoid the "bad" library.

  Barry




> On Dec 7, 2022, at 4:21 PM, Mark Lohry  wrote:
> 
> I ran into an unexpected issue -- on an NP-core machine, each MPI rank of my 
> application was launching NP threads, such that when running with multiple 
> ranks the machine was quickly oversubscribed and performance tanked.
> 
> The root cause of this was petsc linking against the system-provided library 
> (libopenblas0-pthread in this case) set by the update-alternatives in ubuntu. 
> At some point this machine got updated to using the threaded blas 
> implementation instead of serial; not sure how, and I wouldn't have noticed 
> if I weren't running interactively.
> 
> Is there any mechanism in petsc or its build system to prevent linking 
> against an inappropriate BLAS, or do I need to be diligent about manually 
> setting the BLAS library in the configuration stage?
> 
> Thanks,
> Mark



Re: [petsc-users] About Preconditioner and MUMPS

2022-12-06 Thread Barry Smith


> On Dec 6, 2022, at 5:15 AM, 김성익  wrote:
> 
> Hello,
> 
>  
> I have some questions about pc and mumps_icntl.
> 
> 1. What’s the difference between adopt preconditioner by code (for 
> example, PetscCall(PCSetType(pc,PCLU)) and option -pc_type lu??
> And also, What’s the priority between code pcsettype and option -pc_type ??
> 
> 
> 2. When I tried to use METIS in MUMPS, I adopted metis by option (for 
> example, -mat_mumps_icntl_7 5). In this situation, it is impossible to use 
> metis without pc_type lu. However, in my case pc type lu makes the 
> performance poor. So I don’t want to use lu preconditioner. How can I do this?
> 
   The package MUMPS has an option to use METIS in its ordering process, which 
can be turned on as indicated while using MUMPS.  Most preconditioners that 
PETSc can use do not use METIS for any purpose, hence there is no option to turn 
on its use.  For what purpose do you wish to use METIS? Partitioning, ordering, 
...?


 
>  
> Thanks,
> 
> Hyung Kim
> 



Re: [petsc-users] PETSc geometric multigrid use

2022-12-01 Thread Barry Smith

  I don't think you want to use -pc_type gamg if you want geometric 
multigrid. You can use -pc_type mg with the DMDA. The only thing I think you 
need to change from the default use of DMDA and -pc_type mg is to provide 
custom code that computes the interpolation between levels, to take the 
curvilinear grids into account.
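
   A rough sketch in C of what that could look like (my sketch, not tested; nlevels and BuildCurvilinearInterpolation() are placeholders for your own level count and your routine that builds the interpolation matrix between two levels of the curvilinear grid). Here the coarse operators are formed with Galerkin products so only the interpolation has to be supplied; whether you instead let the DMDA hierarchy drive the setup is a design choice worth checking against the PCMG manual pages.

  PC pc;
  PetscCall(KSPGetPC(ksp, &pc));
  PetscCall(PCSetType(pc, PCMG));
  PetscCall(PCMGSetLevels(pc, nlevels, NULL));
  PetscCall(PCMGSetGalerkin(pc, PC_MG_GALERKIN_BOTH));   /* coarse operators built as R A P */
  for (PetscInt l = 1; l < nlevels; l++) {
    Mat P;
    PetscCall(BuildCurvilinearInterpolation(l, &P));     /* user-provided, hypothetical */
    PetscCall(PCMGSetInterpolation(pc, l, P));           /* your curvilinear-aware interpolation */
    PetscCall(MatDestroy(&P));
  }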

  Barry


> On Dec 1, 2022, at 3:43 PM, Alfredo J Duarte Gomez  
> wrote:
> 
> Good morning,
> 
> Good afternoon,
> 
> I am testing the performance of some preconditioners on a problem I have and 
> I wanted to try geometric multigrid. I am using a DMDA object and respective 
> matrices (the mesh is structured, but not rectangular, it is curvilinear)
> 
> I have used the default version of algebraic multigrid (that is -pc_gamg_type 
> agg) very easily.
> 
> When I try to use  -pc_gamg_type geo, it returns an error.
> 
> What additional lines of code do I have to add to use the geo option? 
> 
> Or does this require some sort of fast remeshing routine, given that I am 
> using curvilinear grids?
> 
> Thank you and have a good day.
> 
> -Alfredo
> 
> -- 
> Alfredo Duarte
> Graduate Research Assistant
> The University of Texas at Austin



Re: [petsc-users] Report Bug TaoALMM class

2022-12-01 Thread Barry Smith

  Stephan,

The Tao author has kindly fixed the bug in the code you found in 
https://gitlab.com/petsc/petsc/-/merge_requests/5890  Please let us know if 
this works and resolves your problem.

   Thanks

   Barry


> On Nov 22, 2022, at 4:03 AM, Stephan Köhler 
>  wrote:
> 
> Yeah, I also read this.  But for me as a reader "highly recommended" does not 
> mean that Armijo does not work correct with ALMM :).
> 
> And I think that Armijo is easier than trust-region in the sense that if 
> Armijo does not work, than I did something wrong in the Hessian or the 
> gradient.  In trust-region methods, there are more parameters and maybe I set 
> one of them wrong or something like that.
> At least, this is my view.  But I'm also not an expert on trust-region 
> methods.
> 
> On 12.11.22 06:00, Barry Smith wrote:
>> 
>>   I noticed this in the TAOALMM manual page.
>> 
>> It is also highly recommended that the subsolver chosen by the user utilize 
>> a trust-region
>> strategy for globalization (default: TAOBQNKTR) especially if the outer 
>> problem features bound constraints.
>> 
>> I am far from an expert on these topics.
>> 
>>> On Nov 4, 2022, at 7:43 AM, Stephan Köhler 
>>>  
>>> <mailto:stephan.koeh...@math.tu-freiberg.de> wrote:
>>> 
>>> Barry,
>>> 
>>> this is a nonartificial code.  This is a problem in the ALMM subsolver.  I 
>>> want to solve a problem with a TaoALMM solver what then happens is:
>>> 
>>> TaoSolve(tao)/* TaoALMM solver */
>>>|
>>>|
>>>|>   This calls the TaoALMM subsolver routine
>>>   
>>>  TaoSolve(subsolver)
>>>|
>>>|
>>>|--->   The subsolver does not correctly 
>>> work, at least with an Armijo line search, since the solution is 
>>> overwritten within the line search.  
>>>In my case, the subsolver does not 
>>> make any progress although it is possible.
>>> 
>>> To get to my real problem you can simply change line 268 to if(0)  (from 
>>> if(1) -> if(0)) and line 317 from // ierr = TaoSolve(tao); 
>>> CHKERRQ(ierr);  ---> ierr = TaoSolve(tao); CHKERRQ(ierr);
>>> What you can see is that the solver does not make any progress, but it 
>>> should make progress.
>>> 
>>> To be honest, I do not really know why the option 
>>> -tao_almm_subsolver_tao_ls_monitor has know effect if the ALMM solver is 
>>> called and not the subsolver. I also do not know why 
>>> -tao_almm_subsolver_tao_view prints as termination reason for the subsolver 
>>> 
>>>  Solution converged:    ||g(X)|| <= gatol
>>> 
>>> This is obviously not the case.  I set the tolerance
>>> -tao_almm_subsolver_tao_gatol 1e-8 \
>>> -tao_almm_subsolver_tao_grtol 1e-8 \
>>>  
>>> I encountered this and then I looked into the ALMM class and therefore I 
>>> tried to call the subsolver (previous example).
>>> 
>>> I attach the updated programm and also the options.
>>> 
>>> Stephan
>>> 
>>> 
>>> 
>>> 
>>> 
>>> On 03.11.22 22:15, Barry Smith wrote:
>>>> 
>>>>   Thanks for your response and the code. I understand the potential 
>>>> problem and how your code demonstrates a bug if the 
>>>> TaoALMMSubsolverObjective() is used in the manner you use in the example 
>>>> where you directly call TaoComputeObjective() multiple times line a line 
>>>> search code might.
>>>> 
>>>>   What I don't have or understand is how to reproduce the problem in a 
>>>> real code that uses Tao. That is where the Tao Armijo line search code has 
>>>> a problem when it is used (somehow) in a Tao solver with ALMM. You suggest 
>>>> "If you have an example for your own, you can switch the Armijo line 
>>>> search by the option -tao_ls_type armijo.  The thing is that it will cause 
>>>> no problems if the line search accepts the steps with step length one."  I 
>>>> don't see how to do this if I use -tao_type almm I cannot use -tao_ls_type 
>>>> armijo; that is the option -tao_ls_type doesn't seem to me to be usable in 
>>>> the context of almm (since almm internally does directly its own trust 
>>>> region approach for globalization). If we remove the if (

Re: [petsc-users] About MatMumpsSetIcntl function

2022-11-30 Thread Barry Smith

  Note you can use -help to have the running code print all possible options it 
can currently handle. This produces a lot of output so generally, one can do 
things like

./code various options -help | grep mumps 

to see what the exact option is named for mumps in your situation. 

   Also, if you had run with -mpi_ksp_view, as suggested in the output you 
previously sent (*** Use -mpi_ksp_view to see the MPI KSP parameters ***), it 
would show the options prefix for the MUMPS options (which in this case is 
-mpi_), so you would know the option should be given as -mpi_mat_mumps_icntl_7 5.

  So, in summary, since nested solvers can have complicated nesting and options 
names, one can determine the exact names without guessing by either 

1) running with -help  and searching for the appropriate full option name or 

2) running with the appropriate view command and locating the options prefix 
for the particular solve you want to control.

We do this all the time: even though we wrote the code, we cannot fully remember 
the options or figure out in our heads the full naming of nested solver 
options, so we use 1) and 2) to determine them.
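
   If one prefers to set the MUMPS control in code rather than via options, the usual sequence is sketched below (C; ksp, A, b, x are assumed to already exist, error checking abbreviated, and this shows the generic KSP/PCLU pattern rather than your exact nested-solver configuration). The key point, related to the earlier "Only for factored matrix" error, is that MatMumpsSetIcntl() must be called on the factored matrix obtained from the PC after the solver package has been set up.

  PC  pc;
  Mat F;

  PetscCall(KSPSetOperators(ksp, A, A));
  PetscCall(KSPGetPC(ksp, &pc));
  PetscCall(PCSetType(pc, PCLU));
  PetscCall(PCFactorSetMatSolverType(pc, MATSOLVERMUMPS));
  PetscCall(PCFactorSetUpMatSolverType(pc));  /* creates the MUMPS factor matrix */
  PetscCall(PCFactorGetMatrix(pc, &F));       /* F is the factored-matrix object */
  PetscCall(MatMumpsSetIcntl(F, 7, 5));       /* ICNTL(7) = 5: METIS ordering */
  PetscCall(KSPSolve(ksp, b, x));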

Barry


> On Nov 30, 2022, at 12:52 AM, 김성익  wrote:
> 
> Hello,
> 
> 
> I tried to adopt METIS option in MUMPS by using 
> ' PetscCall(MatMumpsSetIcntl(Mat, 7, 5));'
> 
> However, there is an error as follows
> 
> [0]PETSC ERROR: Object is in wrong state
> [0]PETSC ERROR: Only for factored matrix
> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
> [0]PETSC ERROR: Petsc Release Version 3.18.1, unknown 
> [0]PETSC ERROR: ./app on a arch-linux-c-debug named ubuntu by ksi2443 Tue Nov 
> 29 21:12:41 2022
> [0]PETSC ERROR: Configure options -download-mumps -download-scalapack 
> -download-parmetis -download-metis
> [0]PETSC ERROR: #1 MatMumpsSetIcntl() at 
> /home/ksi2443/petsc/src/mat/impls/aij/mpi/mumps/mumps.c:2478
> [0]PETSC ERROR: #2 main() at /home/ksi2443/Downloads/coding/a1.c:149
> [0]PETSC ERROR: No PETSc Option Table entries
> 
> How can I fix this error?
> 
> Thank you for your help.
> 
> 
> Hyung Kim



Re: [petsc-users] Fortran DMLoad bindings

2022-11-25 Thread Barry Smith

   The branch is now available for you to use DMLoad() from Fortran.


> On Nov 25, 2022, at 1:34 PM, Barry Smith  wrote:
> 
> 
>Nicholas
> 
>I will add the Fortran stubs for these two functions shortly in the git 
> branch barry/2022-11-25/add-dm-view-load-fortran/release
> 
>Barry
>   
>> On Nov 25, 2022, at 1:05 PM, Pierre Jolivet  wrote:
>> 
>> That example has no DMLoad(), and the interface is indeed not automatically 
>> generated 
>> https://gitlab.com/petsc/petsc/-/blob/main/src/dm/interface/dm.c#L4075
>> I’m not sure why, though.
>> 
>> Thanks,
>> Pierre
>> 
>>> On 25 Nov 2022, at 6:42 PM, Mark Adams  wrote:
>>> 
>>> It looks like it is available with an example here: 
>>> 
>>> https://petsc.org/main/src/dm/impls/plex/tutorials/ex3f90.F90.html
>>> 
>>> Try 'cd src/dm/impls/plex/tutorials; make ex3f90'
>>> 
>>> Mark
>>> 
>>> 
>>> 
>>> 
>>> On Fri, Nov 25, 2022 at 6:32 AM Nicholas Arnold-Medabalimi 
>>> mailto:narno...@umich.edu>> wrote:
>>>> Good Morning
>>>> 
>>>> I am adding some Petsc for mesh management into an existing Fortran 
>>>> Solver. I'd like to use the DMLoad() function to read in a generated 
>>>> DMPlex (using DMView from a companion C code I've been using to debug). It 
>>>> appears there isn't an existing binding for that function (or I might be 
>>>> making a mistake.) 
>>>> 
>>>>  I noticed some outdated user posts about using the more general 
>>>> PetscObjectView to achieve the result, but I can't seem to replicate it 
>>>> (and it might be outdated information).
>>>> 
>>>> Any assistance on this would be appreciated.
>>>> 
>>>> Happy Thanksgiving & Sincerely
>>>> Nicholas
>>>> 
>>>> -- 
>>>> Nicholas Arnold-Medabalimi
>>>> 
>>>> Ph.D. Candidate
>>>> Computational Aeroscience Lab
>>>> University of Michigan
>> 
> 



Re: [petsc-users] Fortran DMLoad bindings

2022-11-25 Thread Barry Smith

   Nicholas

   I will add the Fortran stubs for these two functions shortly in the git 
branch barry/2022-11-25/add-dm-view-load-fortran/release

   Barry
  
> On Nov 25, 2022, at 1:05 PM, Pierre Jolivet  wrote:
> 
> That example has no DMLoad(), and the interface is indeed not automatically 
> generated 
> https://gitlab.com/petsc/petsc/-/blob/main/src/dm/interface/dm.c#L4075
> I’m not sure why, though.
> 
> Thanks,
> Pierre
> 
>> On 25 Nov 2022, at 6:42 PM, Mark Adams  wrote:
>> 
>> It looks like it is available with an example here: 
>> 
>> https://petsc.org/main/src/dm/impls/plex/tutorials/ex3f90.F90.html
>> 
>> Try 'cd src/dm/impls/plex/tutorials; make ex3f90'
>> 
>> Mark
>> 
>> 
>> 
>> 
>> On Fri, Nov 25, 2022 at 6:32 AM Nicholas Arnold-Medabalimi 
>> mailto:narno...@umich.edu>> wrote:
>>> Good Morning
>>> 
>>> I am adding some Petsc for mesh management into an existing Fortran Solver. 
>>> I'd like to use the DMLoad() function to read in a generated DMPlex (using 
>>> DMView from a companion C code I've been using to debug). It appears there 
>>> isn't an existing binding for that function (or I might be making a 
>>> mistake.) 
>>> 
>>>  I noticed some outdated user posts about using the more general 
>>> PetscObjectView to achieve the result, but I can't seem to replicate it 
>>> (and it might be outdated information).
>>> 
>>> Any assistance on this would be appreciated.
>>> 
>>> Happy Thanksgiving & Sincerely
>>> Nicholas
>>> 
>>> -- 
>>> Nicholas Arnold-Medabalimi
>>> 
>>> Ph.D. Candidate
>>> Computational Aeroscience Lab
>>> University of Michigan
> 



Re: [petsc-users] comparing HYPRE on CPU vs GPU

2022-11-24 Thread Barry Smith


  Probably some of these questions are best asked of the hypre folks, but I can 
answer some of them.

> On Nov 24, 2022, at 11:19 AM, nicola varini  wrote:
> 
> Dear all, I am comparing the HYPRE boomeramg preconditioner on CPU and GPU.
> It looks like the defaults are different, therefore I tried to specify the 
> match the options.
> I have a few observations/questions:
> 1) Why are the residuals at step 0 different?

  By default, PETSc uses left preconditioning; thus, the norms reported are for 
the preconditioned residual and will be different for different preconditioner 
options (and possibly slightly different due to numerical differences even when 
run with the same preconditioner options).  You can use -ksp_pc_side right to 
have the true residual norm printed; this should be the same except for 
possibly, hopefully, small numerical effects.

> 2) If the options are the same the number of iterations should be the same?

  Numerical changes could affect these. Hopefully, not much if all the 
algorithms are truly the same.

> 3) Why on the GPU side the PCApply doesn't seem to be done on GPU?
> I noticed that if I change certain options, like 
> pc_hypre_boomeramg_relax_type_all symmetric SOR/Jacobi, the KSPSolve get 
> slower and performs on CPU.

   Hypre probably does not have, by default, a GPU version of SOR, so it must 
be run on a CPU (Petsc also does not currently have a GPU SOR).

> However, how can I diagnose if the computations are perfomed on GPU or not?

  Run with -log_view -log_view_gpu_time and look at the final columns that give 
some information on time and communication to the GPU. But since hypre does not 
log flops, the flop rates for computations done by hypre are meaningless.


> 
> Thanks very much,
> 
> Nicola
> 



Re: [petsc-users] Petsc Fortran Memory stack trace

2022-11-21 Thread Barry Smith

  My understanding of Fortran bounds checking is that before each array access 
in Fortran it checks whether the index is valid for the array you are 
accessing; that is, that it lies between start and end if you had declared the array as 

  double precision, dimension (start:end) :: A

 It should also work if the array is a Fortran allocatable array or if you 
obtain the array from PETSc with VecGetArrayF90() and all its friends and 
relations.

 PETSc should not change the behavior above.

  Now if there is memory corruption (or some other error) somewhere else (like 
in PETSc, or a more subtle problem in your code) rather than simply an array 
access out of bounds, then yes, you can get more complicated error messages 
that would usually include the PETSc stack trace. 

  Instead of using valgrind you can also try running the PETSc program with 
-malloc_debug; this is sort of a poor person's version of valgrind but can 
sometimes provide more useful information.

  When debugging, always make sure PETSc was NOT configured with 
--with-debugging=0. 

  You can send specific error messages that are cryptic to 
petsc-ma...@mcs.anl.gov  and we may be able to 
help decipher them.

  Barry





> On Nov 21, 2022, at 2:16 PM, Nicholas Arnold-Medabalimi  
> wrote:
> 
> Hi Petsc users
> 
> I'm working on an integration of Petsc into an existing fortran code. Most of 
> my memory debugging is very primitive and is usually accomplished using the 
> -check bounds option in the compiler. However with Petsc attached the stack 
> trace becomes much more opaque compared to the original code. At least as far 
> as I can tell the error becomes much harder to pin down (just pointing to 
> libpetsc.so). Any assistance in getting more informative error messages or 
> checks would be much appreciated.
> 
> Sincerely
> Nicholas
> 
> -- 
> Nicholas Arnold-Medabalimi
> 
> Ph.D. Candidate
> Computational Aeroscience Lab
> University of Michigan



Re: [petsc-users] Using multiple MPI ranks with COO interface crashes in some cases

2022-11-14 Thread Barry Smith

"No method productsymbolic for Mat of type (null)"

Either the entire matrix (header) data structure has gotten corrupted or the 
matrix type was never set. Can you run with valgrind to see if there is any 
memory corruption? 

> On Nov 14, 2022, at 1:24 PM, Fackler, Philip via petsc-users 
>  wrote:
> 
> In Xolotl's "feature-petsc-kokkos" branch, I have moved our code to use the 
> COO interface for preallocating and setting values in the Jacobian matrix. I 
> have found that with some of our test cases, using more than one MPI rank 
> results in a crash. Way down in the preconditioner code in petsc a Mat gets 
> computed that has "null" for the "productsymbolic" member of its "ops". It's 
> pretty far removed from where we compute the Jacobian entries, so I haven't 
> been able (so far) to track it back to an error in my code. I'd appreciate 
> some help with this from someone who is more familiar with the petsc guts so 
> we can figure out what I'm doing wrong. (I'm assuming it's a bug in Xolotl.)
> 
> Note that this is using the kokkos backend for Mat and Vec in petsc, but with 
> a serial-only build of kokkos and kokkos-kernels. So, it's a CPU-only 
> multiple MPI rank run.
> 
> Here's a paste of the error output showing the relevant parts of the call 
> stack:
> 
> [ERROR] [0]PETSC ERROR:
> [ERROR] - Error Message 
> --
> [ERROR] [1]PETSC ERROR: 
> [ERROR] - Error Message 
> --
> [ERROR] [1]PETSC ERROR: 
> [ERROR] [0]PETSC ERROR: 
> [ERROR] No support for this operation for this object type
> [ERROR] [1]PETSC ERROR: 
> [ERROR] No support for this operation for this object type
> [ERROR] [0]PETSC ERROR: 
> [ERROR] No method productsymbolic for Mat of type (null)
> [ERROR] No method productsymbolic for Mat of type (null)
> [ERROR] [0]PETSC ERROR: 
> [ERROR] [1]PETSC ERROR: 
> [ERROR] See https://petsc.org/release/faq/ for trouble shooting.
> [ERROR] See https://petsc.org/release/faq/ for trouble shooting.
> [ERROR] [0]PETSC ERROR: 
> [ERROR] [1]PETSC ERROR: 
> [ERROR] Petsc Development GIT revision: v3.18.1-115-gdca010e0e9a  GIT Date: 
> 2022-10-28 14:39:41 +
> [ERROR] Petsc Development GIT revision: v3.18.1-115-gdca010e0e9a  GIT Date: 
> 2022-10-28 14:39:41 +
> [ERROR] [1]PETSC ERROR: 
> [ERROR] [0]PETSC ERROR: 
> [ERROR] Unknown Name on a  named PC0115427 by 4pf Mon Nov 14 13:22:01 2022
> [ERROR] Unknown Name on a  named PC0115427 by 4pf Mon Nov 14 13:22:01 2022
> [ERROR] [1]PETSC ERROR: 
> [ERROR] [0]PETSC ERROR: 
> [ERROR] Configure options PETSC_DIR=/home/4pf/repos/petsc 
> PETSC_ARCH=arch-kokkos-serial-debug --with-debugging=1 --with-cc=mpicc 
> --with-cxx=mpicxx --with-fc=0 --with-cudac=0 
> --prefix=/home/4pf/build/petsc/serial-debug/install --with-64-bit-indices 
> --with-shared-libraries 
> --with-kokkos-dir=/home/4pf/build/kokkos/serial/install 
> --with-kokkos-kernels-dir=/home/4pf/build/kokkos-kernels/serial/install
> [ERROR] Configure options PETSC_DIR=/home/4pf/repos/petsc 
> PETSC_ARCH=arch-kokkos-serial-debug --with-debugging=1 --with-cc=mpicc 
> --with-cxx=mpicxx --with-fc=0 --with-cudac=0 
> --prefix=/home/4pf/build/petsc/serial-debug/install --with-64-bit-indices 
> --with-shared-libraries 
> --with-kokkos-dir=/home/4pf/build/kokkos/serial/install 
> --with-kokkos-kernels-dir=/home/4pf/build/kokkos-kernels/serial/install
> [ERROR] [1]PETSC ERROR: 
> [ERROR] [0]PETSC ERROR: 
> [ERROR] #1 MatProductSymbolic_MPIAIJKokkos_AB() at 
> /home/4pf/repos/petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx:918
> [ERROR] #1 MatProductSymbolic_MPIAIJKokkos_AB() at 
> /home/4pf/repos/petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx:918
> [ERROR] [1]PETSC ERROR: 
> [ERROR] [0]PETSC ERROR: 
> [ERROR] #2 MatProductSymbolic_MPIAIJKokkos() at 
> /home/4pf/repos/petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx:1138
> [ERROR] #2 MatProductSymbolic_MPIAIJKokkos() at 
> /home/4pf/repos/petsc/src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx:1138
> [ERROR] [1]PETSC ERROR: 
> [ERROR] [0]PETSC ERROR: 
> [ERROR] #3 MatProductSymbolic() at 
> /home/4pf/repos/petsc/src/mat/interface/matproduct.c:793
> [ERROR] #3 MatProductSymbolic() at 
> /home/4pf/repos/petsc/src/mat/interface/matproduct.c:793
> [ERROR] [1]PETSC ERROR: 
> [ERROR] [0]PETSC ERROR: 
> [ERROR] #4 MatProduct_Private() at 
> /home/4pf/repos/petsc/src/mat/interface/matrix.c:9820
> [ERROR] #4 MatProduct_Private() at 
> /home/4pf/repos/petsc/src/mat/interface/matrix.c:9820
> [ERROR] [0]PETSC ERROR: 
> [ERROR] [1]PETSC ERROR: 
> [ERROR] #5 MatMatMult() at 
> /home/4pf/repos/petsc/src/mat/interface/matrix.c:9897
> [ERROR] #5 MatMatMult() at 
> /home/4pf/repos/petsc/src/mat/interface/matrix.c:9897
> [ERROR] [0]PETSC ERROR: 
> [ERROR] [1]PETSC ERROR: 
> [ERROR] #6 PCGAMGOptProlongator_AGG() at 
> /home/4pf/repos/petsc/src/ksp/pc/impls/gamg/agg.c:769
> [ERROR] #6 PCGAMGOptProlongator_AGG() at 
> 

Re: [petsc-users] On PCFIELDSPLIT and its implementation

2022-11-14 Thread Barry Smith

  Can you clarify what you mean? For some classes of problems, PCFIELDSPLIT can 
be a very efficacious preconditioner; for example, when certain fields have a very 
different mathematical structure from the others. In those cases it is worth using 
AIJ and PCFIELDSPLIT instead of keeping BAIJ.


> On Nov 14, 2022, at 2:21 PM, Edoardo alinovi  
> wrote:
> 
> Hi Barry no worries!
> 
> Thanks for letting me know! It is not a problem for me to use MPIAIJ, do you 
> think field split will be a game changer?
>  
> 
> 
> On Mon, Nov 14, 2022, at 20:13 Barry Smith <bsm...@petsc.dev> wrote:
>> 
>>Very sorry for wasting so much of your time. The PCFIELDSPLIT generally 
>> will not work with BAIJ matrices because the MatCreateSubMatrix() for BAIJ 
>> requires indexing by block in the matrix. Your code should work if you use 
>> MPIAIJ matrices.  Note you can still use MatSetValuesBlocked() with MPAIJ 
>> matrices.
>> 
>>   Barry
>> 
>> 
>> > On Nov 10, 2022, at 5:24 PM, Edoardo alinovi > > <mailto:edoardo.alin...@gmail.com>> wrote:
>> > 
>> > True, 
>> > 
>> > Maybe somebody merged it already? I have  attached my silly example. 
>> > 
>> > To compile:
>> > mpifort -L$PETSC_DIR/$PETSC_ARCH/lib -lpetsc -fdefault-real-8 -o test 
>> > test.F90 -I$PETSC_DIR/include -I$PETSC_DIR/$PETSC_ARCH/include
>> > 
>> > Do you need the petsc code MAtt did? 
>> > 
>> 



Re: [petsc-users] On PCFIELDSPLIT and its implementation

2022-11-14 Thread Barry Smith


   Very sorry for wasting so much of your time. PCFIELDSPLIT generally will 
not work with BAIJ matrices because MatCreateSubMatrix() for BAIJ requires 
indexing by block in the matrix. Your code should work if you use MPIAIJ 
matrices.  Note you can still use MatSetValuesBlocked() with MPIAIJ matrices.
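
   For illustration, a sketch in C of the switch (my sketch; the Fortran calls are analogous, and m, M, bs, nz, browidx, bcolidx and vals are placeholders for your local/global sizes, block size, preallocation estimate, block indices and the bs*bs block of values):

  Mat A;

  PetscCall(MatCreate(PETSC_COMM_WORLD, &A));
  PetscCall(MatSetSizes(A, m, m, M, M));
  PetscCall(MatSetType(A, MATMPIAIJ));    /* instead of MATMPIBAIJ */
  PetscCall(MatSetBlockSize(A, bs));      /* keep the block size, e.g. 3 for u,v,p */
  PetscCall(MatMPIAIJSetPreallocation(A, nz, NULL, nz, NULL));
  PetscCall(MatSetValuesBlocked(A, 1, &browidx, 1, &bcolidx, vals, INSERT_VALUES)); /* blocked insertion still works with AIJ */
  PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
  PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));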

  Barry


> On Nov 10, 2022, at 5:24 PM, Edoardo alinovi  
> wrote:
> 
> True, 
> 
> Maybe somebody merged it already? I have  attached my silly example. 
> 
> To compile:
> mpifort -L$PETSC_DIR/$PETSC_ARCH/lib -lpetsc -fdefault-real-8 -o test 
> test.F90 -I$PETSC_DIR/include -I$PETSC_DIR/$PETSC_ARCH/include
> 
> Do you need the petsc code MAtt did? 
> 



Re: [petsc-users] Report Bug TaoALMM class

2022-11-11 Thread Barry Smith

  I noticed this in the TAOALMM manual page.

It is also highly recommended that the subsolver chosen by the user utilize a 
trust-region
strategy for globalization (default: TAOBQNKTR) especially if the outer problem 
features bound constraints.

I am far from an expert on these topics.

> On Nov 4, 2022, at 7:43 AM, Stephan Köhler 
>  wrote:
> 
> Barry,
> 
> this is a nonartificial code.  This is a problem in the ALMM subsolver.  I 
> want to solve a problem with a TaoALMM solver what   then happens is:
> 
> TaoSolve(tao)/* TaoALMM solver */
>|
>|
>|>   This calls the TaoALMM subsolver routine
>   
>  TaoSolve(subsolver)
>|
>|
>|--->   The subsolver does not correctly work, 
> at least with an Armijo line search, since the solution is overwritten within 
> the line search.  
>In my case, the subsolver does not 
> make any progress although it is possible.
> 
> To get to my real problem you can simply change line 268 to if(0)  (from 
> if(1) -> if(0)) and line 317 from // ierr = TaoSolve(tao); CHKERRQ(ierr); 
>  ---> ierr = TaoSolve(tao); CHKERRQ(ierr);
> What you can see is that the solver does not make any progress, but it should 
> make progress.
> 
> To be honest, I do not really know why the option 
> -tao_almm_subsolver_tao_ls_monitor has know effect if the ALMM solver is 
> called and not the subsolver. I also do not know why 
> -tao_almm_subsolver_tao_view prints as termination reason for the subsolver 
> 
>  Solution converged:||g(X)|| <= gatol
> 
> This is obviously not the case.  I set the tolerance
> -tao_almm_subsolver_tao_gatol 1e-8 \
> -tao_almm_subsolver_tao_grtol 1e-8 \
>  
> I encountered this and then I looked into the ALMM class and therefore I 
> tried to call the subsolver (previous example).
> 
> I attach the updated programm and also the options.
> 
> Stephan
> 
> 
> 
> 
> 
> On 03.11.22 22:15, Barry Smith wrote:
>> 
>>   Thanks for your response and the code. I understand the potential problem 
>> and how your code demonstrates a bug if the TaoALMMSubsolverObjective() is 
>> used in the manner you use in the example where you directly call 
>> TaoComputeObjective() multiple times line a line search code might.
>> 
>>   What I don't have or understand is how to reproduce the problem in a real 
>> code that uses Tao. That is where the Tao Armijo line search code has a 
>> problem when it is used (somehow) in a Tao solver with ALMM. You suggest "If 
>> you have an example for your own, you can switch the Armijo line search by 
>> the option -tao_ls_type armijo.  The thing is that it will cause no problems 
>> if the line search accepts the steps with step length one."  I don't see how 
>> to do this if I use -tao_type almm I cannot use -tao_ls_type armijo; that is 
>> the option -tao_ls_type doesn't seem to me to be usable in the context of 
>> almm (since almm internally does directly its own trust region approach for 
>> globalization). If we remove the if (1) code from your example, is there 
>> some Tao options I can use to get the bug to appear inside the Tao solve?
>> 
>> I'll try to explain again, I agree that the fact that the Tao solution is 
>> aliased (within the ALMM solver) is a problem with repeated calls to 
>> TaoComputeObjective() but I cannot see how these repeated calls could ever 
>> happen in the use of TaoSolve() with the ALMM solver. That is when is this 
>> "design problem" a true problem as opposed to just a potential problem that 
>> can be demonstrated in artificial code?
>> 
>> The reason I need to understand the non-artificial situation it breaks 
>> things is to come up with an appropriate correction for the current code.
>> 
>>   Barry
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>>> On Nov 3, 2022, at 12:46 PM, Stephan Köhler 
>>>  
>>> <mailto:stephan.koeh...@math.tu-freiberg.de> wrote:
>>> 
>>> Barry,
>>> 
>>> so far, I have not experimented with trust-region methods, but I can 
>>> imagine that this "design feature" causes no problem for trust-region 
>>> methods, if the old point is saved and after the trust-region check fails 
>>> the old point is copied to the actual point.  But the implementation of the 
>>> Armijo line search method does not work that way.  Here, the actual point 

Re: [petsc-users] Report Bug TaoALMM class

2022-11-11 Thread Barry Smith

  I am still working to understand. I have a PETSc branch 
barry/2022-11-11/fixes-for-tao/release where I have made a few fixes/improvements 
to help me run and debug with your code.

I made a tiny change to your code, passing Hessian twice, and ran with 

./test_tao_neohooke -tao_monitor -tao_view -tao_max_it 500 -tao_converged_reason -tao_lmvm_recycle -tao_type nls -tao_ls_monitor

 and got 

18 TAO,  Function value: -0.0383888,  Residual: 7.46748e-11 
  TAO  solve converged due to CONVERGED_GATOL iterations 18

Is this what you expect? Also works with ntr

If I run with 

./test_tao_neohooke -tao_monitor -tao_view -tao_max_it 1 -tao_converged_reason -tao_type lmvm -tao_ls_monitor

 I get 

2753 TAO,  Function value: -0.0161685,  Residual: 0.120782 
0 LSFunction value: -0.0161685,Step length: 0.
1 LSFunction value: 4.49423e+307,Step length: 1.
stx: 0., fx: -0.0161685, dgx: -0.0145883
sty: 0., fy: -0.0161685, dgy: -0.0145883
2 LSFunction value: -0.0161685,Step length: 0.
stx: 0., fx: -0.0161685, dgx: -0.0145883
sty: 1., fy: 4.49423e+307, dgy: 5.68594e+307
2754 TAO,  Function value: -0.0161685,  Residual: 0.120782 
  TAO  solve did not converge due to DIVERGED_LS_FAILURE iteration 2754

Note the insane fy value that pops up at the end.

The next one

./test_tao_neohooke -tao_monitor -tao_view -tao_max_it 500 -tao_converged_reason -tao_lmvm_recycle -tao_type owlqn -tao_ls_monitor
  0 TAO,  Function value: 0.,  Residual: 0. 
  TAO  solve converged due to CONVERGED_GATOL iterations 0

fails right off the bat, somehow the initial residual norm is 0, which should 
not depend on the solver (maybe a bug in Tao?)

bmrm gets stuck far from the minimum found by the Newton methods.

1719 TAO,  Function value: -2.36706e-06,  Residual: 1.94494e-09

I realize this is still far from the problem you reported (and I agree it is a 
bug); I am working to understand enough to provide a proper fix for the bug 
instead of just doing something ad hoc.

 Barry



> On Nov 4, 2022, at 7:43 AM, Stephan Köhler 
>  wrote:
> 
> Barry,
> 
> this is a nonartificial code.  This is a problem in the ALMM subsolver.  I 
> want to solve a problem with a TaoALMM solver what   then happens is:
> 
> TaoSolve(tao)/* TaoALMM solver */
>|
>|
>|>   This calls the TaoALMM subsolver routine
>   
>  TaoSolve(subsolver)
>|
>|
>|--->   The subsolver does not correctly work, 
> at least with an Armijo line search, since the solution is overwritten within 
> the line search.  
>In my case, the subsolver does not 
> make any progress although it is possible.
> 
> To get to my real problem you can simply change line 268 to if(0)  (from 
> if(1) -> if(0)) and line 317 from // ierr = TaoSolve(tao); CHKERRQ(ierr); 
>  ---> ierr = TaoSolve(tao); CHKERRQ(ierr);
> What you can see is that the solver does not make any progress, but it should 
> make progress.
> 
> To be honest, I do not really know why the option 
> -tao_almm_subsolver_tao_ls_monitor has know effect if the ALMM solver is 
> called and not the subsolver. I also do not know why 
> -tao_almm_subsolver_tao_view prints as termination reason for the subsolver 
> 
>  Solution converged:||g(X)|| <= gatol
> 
> This is obviously not the case.  I set the tolerance
> -tao_almm_subsolver_tao_gatol 1e-8 \
> -tao_almm_subsolver_tao_grtol 1e-8 \
>  
> I encountered this and then I looked into the ALMM class and therefore I 
> tried to call the subsolver (previous example).
> 
> I attach the updated programm and also the options.
> 
> Stephan
> 
> 
> 
> 
> 
> On 03.11.22 22:15, Barry Smith wrote:
>> 
>>   Thanks for your response and the code. I understand the potential problem 
>> and how your code demonstrates a bug if the TaoALMMSubsolverObjective() is 
>> used in the manner you use in the example where you directly call 
>> TaoComputeObjective() multiple times line a line search code might.
>> 
>>   What I don't have or understand is how to reproduce the problem in a real 
>> code that uses Tao. That is where the Tao Armijo line search code has a 
>> problem when it is used (somehow) in a Tao solver with ALMM. You suggest "If 
>> you have an example for your own, you can switch the Armijo line search by 
>> the option -tao_ls_type armijo.  The thing is that it will cause no problems 
>> if the line search accepts the steps with step length one."  I don't see how 
>> to

Re: [petsc-users] Report Bug TaoALMM class

2022-11-11 Thread Barry Smith


> On Nov 4, 2022, at 7:43 AM, Stephan Köhler 
>  wrote:
> 
> Barry,
> 
> this is a nonartificial code.  This is a problem in the ALMM subsolver.  I 
> want to solve a problem with a TaoALMM solver what   then happens is:
> 
> TaoSolve(tao)/* TaoALMM solver */
>|
>|
>|>   This calls the TaoALMM subsolver routine
>   
>  TaoSolve(subsolver)
>|
>|
>|--->   The subsolver does not correctly work, 
> at least with an Armijo line search, since the solution is overwritten within 
> the line search.  
>In my case, the subsolver does not 
> make any progress although it is possible.
> 
> To get to my real problem you can simply change line 268 to if(0)  (from 
> if(1) -> if(0)) and line 317 from // ierr = TaoSolve(tao); CHKERRQ(ierr); 
>  ---> ierr = TaoSolve(tao); CHKERRQ(ierr);
> What you can see is that the solver does not make any progress, but it should 
> make progress.
> 
> To be honest, I do not really know why the option 
> -tao_almm_subsolver_tao_ls_monitor has know effect if the ALMM solver is 
> called and not the subsolver. I also do not know why 
> -tao_almm_subsolver_tao_view prints as termination reason for the subsolver 
> 
>  Solution converged:||g(X)|| <= gatol
> 
> This is obviously not the case.  I set the tolerance
> -tao_almm_subsolver_tao_gatol 1e-8 \
> -tao_almm_subsolver_tao_grtol 1e-8 \

  This is because TaoSolve_ALMM adaptively sets the tolerances for the 
subsolver 

/* update subsolver tolerance */
PetscCall(PetscInfo(tao, "Subsolver tolerance: ||G|| <= %e\n", 
(double)auglag->gtol));
PetscCall(TaoSetTolerances(auglag->subsolver, auglag->gtol, 0.0, 0.0));

  So any values one sets initially are ignored. Unfortunately, given the 
organization of TaoSetFromOptions() as a general tool, there is no way to have 
ALMM not accept the command-line tolerances, or to produce a message at the end 
that they have been ignored. Hence the user thinks they have been set and gets 
confused when they seem to be ignored. I don't see any way to prevent this 
confusion cleanly.


   I am still digging through all the nesting here.


>  
> I encountered this and then I looked into the ALMM class and therefore I 
> tried to call the subsolver (previous example).
> 
> I attach the updated programm and also the options.
> 
> Stephan
> 
> 
> 
> 
> 
> On 03.11.22 22:15, Barry Smith wrote:
>> 
>>   Thanks for your response and the code. I understand the potential problem 
>> and how your code demonstrates a bug if the TaoALMMSubsolverObjective() is 
>> used in the manner you use in the example where you directly call 
>> TaoComputeObjective() multiple times line a line search code might.
>> 
>>   What I don't have or understand is how to reproduce the problem in a real 
>> code that uses Tao. That is where the Tao Armijo line search code has a 
>> problem when it is used (somehow) in a Tao solver with ALMM. You suggest "If 
>> you have an example for your own, you can switch the Armijo line search by 
>> the option -tao_ls_type armijo.  The thing is that it will cause no problems 
>> if the line search accepts the steps with step length one."  I don't see how 
>> to do this if I use -tao_type almm I cannot use -tao_ls_type armijo; that is 
>> the option -tao_ls_type doesn't seem to me to be usable in the context of 
>> almm (since almm internally does directly its own trust region approach for 
>> globalization). If we remove the if (1) code from your example, is there 
>> some Tao options I can use to get the bug to appear inside the Tao solve?
>> 
>> I'll try to explain again, I agree that the fact that the Tao solution is 
>> aliased (within the ALMM solver) is a problem with repeated calls to 
>> TaoComputeObjective() but I cannot see how these repeated calls could ever 
>> happen in the use of TaoSolve() with the ALMM solver. That is when is this 
>> "design problem" a true problem as opposed to just a potential problem that 
>> can be demonstrated in artificial code?
>> 
>> The reason I need to understand the non-artificial situation it breaks 
>> things is to come up with an appropriate correction for the current code.
>> 
>>   Barry
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>>> On Nov 3, 2022, at 12:46 PM, Stephan Köhler 
>>>  
>>> <mailto:stephan.koeh...@math.tu-freiberg.de> wrote:
>>> 
>>> B

Re: [petsc-users] On PCFIELDSPLIT and its implementation

2022-11-10 Thread Barry Smith


  Can you share the code that produces the problem below?

> On Nov 10, 2022, at 3:52 PM, Edoardo alinovi  
> wrote:
> 
> The thing is, even I pass the following options: 
> 
> -UPeqn_pc_type fieldsplit -UPeqn_pc_fieldsplit_0_fields 0,1 
> -UPeqn_pc_fieldsplit_1_fields 2 -UPeqn_pc_fieldsplit_type SCHUR 
> -UPeqn_pc_fieldsplit_block_size 3
> 
> I am getting the same issue, so there must be something fundamental in the 
> way I am splitting :/
> 
> 



Re: [petsc-users] On PCFIELDSPLIT and its implementation

2022-11-10 Thread Barry Smith

  Hmm, that branch does not appear to exist.



> On Nov 10, 2022, at 3:48 PM, Edoardo alinovi  
> wrote:
> 
> I am sorry Barry,
> 
> I told you it works, but it is not. I changed to index to integer, but I am 
> still getting this: 
> 
> [0]PETSC ERROR: - Error Message 
> --
> [0]PETSC ERROR: Nonconforming object sizes
> [0]PETSC ERROR: Local column sizes 6 do not add up to total number of columns 
> 9
> 
> I am a bit tired and I was running linux's test instead of my program -.-
> 
> This is the branch Matt did if you whish to try... -> 
> https://gitlab.com/petsc/petsc/-/commits/knepley/fix-fieldsplit-fortran
> 
> Cheers



Re: [petsc-users] On PCFIELDSPLIT and its implementation

2022-11-10 Thread Barry Smith

  These beasts should be PetscInt, not real

 real :: ufields(2), pfields(1)


   Side note. We do not recommend using options like -fdefault-real-8 because 
the compiler may change values in surprising ways.  You can use PetscReal to 
declare real numbers and this will automatically match with single or double 
precision based on the PETSc configure options.

   What version of PETSc are you using? We don't have Fortran stubs for the 
calls to PCFieldSplitSetFields in the latest release. I should add them.



> On Nov 9, 2022, at 12:00 PM, Edoardo alinovi  
> wrote:
> 
> Hi Matt,
> 
> it took a bit more than 1s, but I can reproduce the error in the attached 
> file.
> 
> To compile:
> mpifort -L$PETSC_DIR/$PETSC_ARCH/lib -lpetsc -fdefault-real-8 -o test 
> test.F90 -I$PETSC_DIR/include -I$PETSC_DIR/$PETSC_ARCH/include
> 
> Please run it in serial as I have hardcoded some dimensions to code this up 
> faster.
> 
> Thank you!
> 



Re: [petsc-users] Error configuring external packages

2022-11-07 Thread Barry Smith


   The cmake in your path is broken

TESTING: locateCMake from 
config.packages.cmake(/gpfs/fs1/home/h/hngharti/hngharti/lsoft/petsc-gnu/config/BuildSystem/config/packages/cmake.py:53)
Looking for default CMake executable
Checking for program 
/scinet/niagara/software/2019b/opt/gcc-9.4.0/openmpi/4.1.1/bin/cmake...not found
Checking for program 
/scinet/niagara/software/2019b/opt/base/gcc/9.4.0/bin/cmake...not found
Checking for program 
/home/h/hngharti/hngharti/lsoft/cmake-3.23.1/bin/cmake...found
  Defined make macro "CMAKE" to 
"/home/h/hngharti/hngharti/lsoft/cmake-3.23.1/bin/cmake"
Looking for default CTest executable
Checking for program 
/scinet/niagara/software/2019b/opt/gcc-9.4.0/openmpi/4.1.1/bin/ctest...not found
Checking for program 
/scinet/niagara/software/2019b/opt/base/gcc/9.4.0/bin/ctest...not found
Checking for program 
/home/h/hngharti/hngharti/lsoft/cmake-3.23.1/bin/ctest...found
  Defined make macro "CTEST" to 
"/home/h/hngharti/hngharti/lsoft/cmake-3.23.1/bin/ctest"
Executing: /home/h/hngharti/hngharti/lsoft/cmake-3.23.1/bin/cmake --version
cmake --version failed: Could not execute 
"['/home/h/hngharti/hngharti/lsoft/cmake-3.23.1/bin/cmake --version']":
/home/h/hngharti/hngharti/lsoft/cmake-3.23.1/bin/cmake: error while loading 
shared libraries: libfabric.so.1: cannot open shared object file: No such file 
or directory

You need to either install a working cmake in your PATH or have PETSc install one
for you using the configure option --download-cmake.
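
For example, a minimal sketch of such a configure invocation (keep whatever other
options you already pass to configure; --download-metis is shown only because METIS
was the package that failed for you):

  ./configure --download-cmake --download-metis [your other configure options]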



> On Nov 7, 2022, at 12:48 PM, Hom Nath Gharti  wrote:
> 
> Dear all,
> 
> I am trying to compile the latest version, but I am getting the following 
> error:
> ***
>  UNABLE to CONFIGURE with GIVEN OPTIONS(see configure.log for 
> details):
> ---
> Error configuring METIS with CMake
> ***
> 
> When I remove this package from the configure option, I get the same error 
> for other packages. Is there something wrong with my configure command or 
> compilers?
> 
> Attached is the configure log file.
> 
> Best,
> Hom Nath
> 



Re: [petsc-users] PCMGSetResidual and fortran

2022-11-04 Thread Barry Smith


  Steven,

Could you please send your test code. It is possible there is a bug in our 
Fortran interface since we do not test it for this functionality. 

  Barry


> On Nov 4, 2022, at 10:50 AM, Steven Dargaville  
> wrote:
> 
> Hi all
> 
> I have a quick question regarding the use of PCMGSetResidual within fortran 
> code. I'm calling PCMGSetResidual from within fortran: 
> 
> call PCMGSetResidual(pc_mg, petsc_level, mg_residual, coarse_matrix ierr) 
> 
> and just for testing purposes I've written a trivial residual evaluation 
> routine:
> 
> subroutine mg_residual(mat, b, x, r, ierr)
>   !!< Compute the residual
> 
>   ! ~~
>   type(tMat) :: mat
>   type(tVec) :: b, x, r
>   PetscErrorCode :: ierr
>   ! ~~
> 
>   print *, "inside residual evaluation"
>   call MatResidual(mat, b, x, r, ierr)
>   
>end subroutine
> 
> The problem I am having is that this segfaults when the residual routine is 
> called. Valgrind shows that it is failing in the fortran interface in 
> ftn-custom/zmgfuncf.c, with the message:
> 
> ==24742== Invalid read of size 8
> ==24742==at 0x5B7CBC0: ourresidualfunction (in 
> /home/projects/dependencies/petsc_main/arch-linux-c-opt/lib/libpetsc.so.3.015.0)
> ==24742==by 0x5B6D804: PCMGMCycle_Private (in 
> /home/projects/dependencies/petsc_main/arch-linux-c-opt/lib/libpetsc.so.3.015.0)
> 
> ==24742== Process terminating with default action of signal 11 (SIGSEGV)
> ==24742==  Access not within mapped region at address 0x0
> ==24742==at 0x5B7CBC0: ourresidualfunction (in 
> /home/sdargavi/projects/dependencies/petsc_main/arch-linux-c-opt/lib/libpetsc.so.3.015.0)
> ==24742==by 0x5B6D804: PCMGMCycle_Private (in 
> /home/sdargavi/projects/dependencies/petsc_main/arch-linux-c-opt/lib/libpetsc.so.3.015.0)
> 
> I'm guessing this is because the fortran_func_pointers isn't pointing to the 
> mg_residual routine, but I am not sure why. I noticed that in the C code of 
> PCMGSetResidual that it calls MatDestroy on the A matrix in mg_levels and 
> replaces it with the mat passed in:
> 
> if (mat) PetscObjectReference((PetscObject)mat);
> MatDestroy([l]->A);
> mglevels[l]->A = mat
> 
> so I modified my code to call PCMGSetResidual either before the operators are 
> set, or after but passing in an extra copy, but this doesn't seem to help.
> 
> I'm guessing I'm doing something silly, but just wondering if anyone had any 
> ideas? Thanks for your help
> Steven



Re: [petsc-users] On the usage of MatSetValuesBlocked

2022-11-04 Thread Barry Smith



> On Nov 4, 2022, at 6:55 AM, Edoardo alinovi  wrote:
> 
> Thanks Matt,
> 
> I have found out that setValuesblocked will work if I do:
> 
> call MatCreateVecs(A, x, y, ierr)
> call setValuesBlocked(x, nblocks, varray, ierr) 

  Ah, likely the block size for the vector was not correct, leading to the 
memory corruption. MatCreateVecs() creates a vector compatible with the matrix, 
same block size and parallel layout so you don't need to worry about setting 
those values yourself.

  Barry

> 
> However, there is nogetValuesBlocked. Not the end of the world, it is handy 
> to set and get stuff by block and not by single entry :)
> 
> Cheers



Re: [petsc-users] [petsc-maint] Issues linking petsc header files and lib from FORTRAN codes

2022-11-03 Thread Barry Smith

 Please send your attempted makefile and we'll see if we can get it working.

  I am not sure if we can organize the include files as Fortran compiler 
include files easily. We've always used the preprocessor approach. The Intel 
compiler docs indicate the procedure for finding the Fortran compiler include 
files 
https://www.intel.com/content/www/us/en/develop/documentation/fortran-compiler-oneapi-dev-guide-and-reference/top/program-structure/use-include-files.html
 is the same as for the preprocessor include files, so I don't understand how using
the Fortran compiler include file approach would make the makefiles any simpler for
users.
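
   In the meantime, a minimal makefile sketch in the usual PETSc style that generally
works for a Fortran example such as ex83f.F90 (assuming PETSC_DIR and PETSC_ARCH are
set in your environment; recipe lines must begin with a tab; build with "make ex83f"):

include ${PETSC_DIR}/lib/petsc/conf/variables
include ${PETSC_DIR}/lib/petsc/conf/rules

ex83f: ex83f.o
	-${FLINKER} -o ex83f ex83f.o ${PETSC_LIB}
	${RM} ex83f.o

The two include lines pull in the compiler, the preprocessor include paths, and the
PETSc link line, which is why no paths need to be hard coded in the makefile.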


  Barry


> On Nov 3, 2022, at 8:58 PM, Jianbo Long  wrote:
> 
> Hello,
> 
> I'm struggling to make my FORTRAN code work with petsc as I cannot link the 
> required header files (e.g., petscksp.h) and compiled library files to my 
> FORTRAN code. 
> 
> Compiling petsc was not a problem. However, even with the fortran examples 
> (see those on https://petsc.org/main/docs/manual/fortran/) and the guide on 
> using petsc in c++ and fortran codes (see Section "Writing C/C++ or Fortran 
> Applications" at https://petsc.org/main/docs/manual/getting_started/), I 
> still cannot make my FORTRAN code work. 
> 
> The Fortran test code is exactly the example code ex83f.F90 (see attached 
> files). Aftering following the 2nd method in the Guide (see the picture 
> below), I still get errors:
> 
> petsc/finclude/petscksp.h: No such file or directory
> 
> Even if I set up the path of the header file correctly in my own makefile 
> without using environment variables, I still can only find the file 
> "petscksp.h" for my code. Of course, the trouble is that all other headers 
> files required by KSP are recursively included in this petscksp.h file, and I 
> have no way to link them together for my Fortran code. 
> 
> So, here are my questions:
> 1) in the Guide, how exactly are we supposed to set up the environment 
> variables  PETSC_DIR  and PETSC_ARCH ? More details and examples would be 
> extremely helpful !
> 2) Is there a way to get rid of the preprocessor statement 
>  #include 
> when using c++/Fortran codes ?
> 
> For example, when using MUMPS package in a Fortran code, we can simply use 
> compiler 'include', rather than a preprocessor, to extract all required 
> variables for the user's codes :
>   INCLUDE 'zmumps_struc.h'
> where the header file zmumps_struc.h is already provided in the package. 
> Similarly, I think it's much more portable and easier when using petsc in 
> other codes if we can make it work to use petsc. 
> 
> (Note: similar issues were discussed before, see 
> https://lists.mcs.anl.gov/mailman/htdig/petsc-users/2019-January/037499.html. 
> Unfortunately, I have no clue about the solution archived there ...)
> 
> Any thoughts and solutions would be much appreciated !
> 
> Thanks,
> Jianbo Long
> 
> 
> 



Re: [petsc-users] On the usage of MatSetValuesBlocked

2022-11-03 Thread Barry Smith

   You should pass 1 and 1 not 3, because you are setting one block.

   Regarding all the integer values passed in to PETSc routines To be 
completely portable you need to declare them as PetscInt and pass the 
variables. But if you do not use --with-64-bit-indices in ./configure and you 
do not use some Fortran compiler option to promote all integers to 64 bit 
integers just passing in directly the 0 etc is fine.

   The reason you get different results with a 0 literal or a Fortran integer
variable is that, because of the 3, PETSc tries to read three indices starting from
the 0 value you pass in, so it is reading memory it is not supposed to be reading.
Once you change the 3 to 1 it will likely be fine.
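
   As a sketch, the corrected call for inserting a single 3x3 block (assuming the
block size of A was already set to 3 through MatSetBlockSize() or the preallocation
routine) might look like:

  PetscInt    :: row(1), col(1)
  PetscScalar :: b(3,3)

  row(1) = 0                ! block row 0
  col(1) = 0                ! block column 0
  b      = 11.0
  call MatSetValuesBlocked(A, 1, row, 1, col, b, INSERT_VALUES, ierr)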

  Barry


> On Nov 3, 2022, at 5:56 PM, Edoardo alinovi  wrote:
> 
> Barry,
> 
> Can you please provide me with an example on how to use MatSetValuesBlocked? 
> 
> To play it easy, let's say that I want to insert a 3x3 block matrix b into 
> the matrix A, rows 0-2, columns 0-2. Up to what I've understood (very few 
> apparently XD ), I would do like this:
> 
> b(3,3) = 11.0 
> call MatSetValuesBlocked(A, 3, 0, 3, 0, b, INSERT_VALUES, ierr).  
> 
> This does not work at all, I get this result that does not make any sense 
> 
> 
> 
> It places 6 values instead of 9 and it they are in odd locations (0 1 2 9 10 
> 11).
> 
> Also I noted that I am getting  different results if in place of  the zero in 
> red I use a fortran integer 
> 
> Super thanks for the help



Re: [petsc-users] Report Bug TaoALMM class

2022-11-03 Thread Barry Smith

  Thanks for your response and the code. I understand the potential problem and 
how your code demonstrates a bug if the TaoALMMSubsolverObjective() is used in 
the manner you use in the example where you directly call TaoComputeObjective() 
multiple times like a line search code might.

  What I don't have or understand is how to reproduce the problem in a real 
code that uses Tao. That is where the Tao Armijo line search code has a problem 
when it is used (somehow) in a Tao solver with ALMM. You suggest "If you have 
an example for your own, you can switch the Armijo line search by the option 
-tao_ls_type armijo.  The thing is that it will cause no problems if the line 
search accepts the steps with step length one."  I don't see how to do this if 
I use -tao_type almm I cannot use -tao_ls_type armijo; that is the option 
-tao_ls_type doesn't seem to me to be usable in the context of almm (since almm 
internally does directly its own trust region approach for globalization). If 
we remove the if (1) code from your example, is there some Tao options I can 
use to get the bug to appear inside the Tao solve?

I'll try to explain again, I agree that the fact that the Tao solution is 
aliased (within the ALMM solver) is a problem with repeated calls to 
TaoComputeObjective() but I cannot see how these repeated calls could ever 
happen in the use of TaoSolve() with the ALMM solver. That is when is this 
"design problem" a true problem as opposed to just a potential problem that can 
be demonstrated in artificial code?

The reason I need to understand the non-artificial situation it breaks things 
is to come up with an appropriate correction for the current code.

  Barry







> On Nov 3, 2022, at 12:46 PM, Stephan Köhler 
>  wrote:
> 
> Barry,
> 
> so far, I have not experimented with trust-region methods, but I can imagine 
> that this "design feature" causes no problem for   trust-region methods, 
> if the old point is saved and after the trust-region check fails the old 
> point is copied to the actual point.  But the implementation of the Armijo 
> line search method does not work that way.  Here, the actual point will 
> always be overwritten.  Only if the line search fails, then the old point is 
> restored, but then the TaoSolve method ends with a line search failure. 
> 
> If you have an example for your own, you can switch the Armijo line search by 
> the option -tao_ls_type armijo.  The thing is that it will cause no problems 
> if the line search accepts the steps with step length one.  
> It is also possible that, by luck, it will cause no problems, if the 
> "excessive" step brings a reduction of the objective
> 
> Otherwise, I attach my example, which is not minimal, but here you can see 
> that it causes problems.  You need to set the paths to the PETSc library in 
> the makefile.  You find the options for this problem in the 
> run_test_tao_neohooke.sh script.
> The import part begins at line 292 in test_tao_neohooke.cpp
> 
> Stephan
> 
> On 02.11.22 19:04, Barry Smith wrote:
>>   Stephan,
>> 
>> I have located the troublesome line in TaoSetUp_ALMM() it has the line
>> 
>>   auglag->Px = tao->solution;
>> 
>> and in alma.h it has 
>> 
>> Vec  Px, LgradX, Ce, Ci, G; /* aliased vectors (do not destroy!) */
>> 
>> Now auglag->P in some situations alias auglag->P  and in some cases 
>> auglag->Px serves to hold a portion of auglag->P. So then in 
>> TaoALMMSubsolverObjective_Private()
>> the lines
>> 
>> PetscCall(VecCopy(P, auglag->P));
>>  PetscCall((*auglag->sub_obj)(auglag->parent));
>> 
>> causes, just as you said, tao->solution to be overwritten by the P at which 
>> the objective function is being computed. In other words, the solution of 
>> the outer Tao is aliased with the solution of the inner Tao, by design. 
>> 
>> You are definitely correct, the use of TaoALMMSubsolverObjective_Private and 
>> TaoALMMSubsolverObjectiveAndGradient_Private  in a line search would be 
>> problematic. 
>> 
>> I am not an expert at these methods or their implementations. Could you 
>> point to an actual use case within Tao that triggers the problem. Is there a 
>> set of command line options or code calls to Tao that fail due to this 
>> "design feature". Within the standard use of ALMM I do not see how the 
>> objective function would be used within a line search. The TaoSolve_ALMM() 
>> code is self-correcting in that if a trust region check fails it 
>> automatically rolls back the solution.
>> 
>>   Barry
>> 
>> 
>> 
>> 
>>> On Oct 28, 2022, at 4:27 AM, Stephan Köhler 
>>

Re: [petsc-users] Advice on coupling linear physics with Allen-Cahn

2022-11-03 Thread Barry Smith



> On Nov 3, 2022, at 2:33 PM, Mike Welland  wrote:
> 
> I am coupling a linear diffusion equation with Allen-Cahn in a time dependent 
> problem. I'd like to take advantage of the linear block to speed things up. 
> I'm trying two approaches:
> 
> 1. Allen-Cahn with double well potential: phi^2*(1-phi^2), which makes it 
> nonlinear. The best performance I have is with geometric multigrid on the 
> full system. I tried using a schur complement with the linear diffusion block 
> on A00 (both inside mg levels, and just mg on S) but didn't get good 
> performance. 

With geometric multigrid there is not much setup cost (so reusing it is not 
important).

> 
> 2. Allen-Cahn with the 'obstacle' potential: phi*(1-phi) which is linear but 
> needs the vi solver to keep 0<=phi<=1. My whole system becomes linear 
> (great!) but needs the nonlinear steps for the vi solver, and I'm not sure if 
> it is reusing the factorization since the DOFs are being changed with the 
> active step. 

   You are correct. Since the problem (size) changes for each solve not much of 
anything can be directly reused in the solver. But with geometric multigrid 
there is not much setup cost (so reusing it is not important).


> 
> Any suggestion / guidance would be appreciated! 
> Thanks!



Re: [petsc-users] On the usage of MatSetValuesBlocked

2022-11-03 Thread Barry Smith

   The error indicates that not enough nonzero blocks have been preallocated. Try
something really simple: preallocate for one block and put in one block, then call
MatAssemblyBegin/End() and MatView(); if that works, work up to your full problem.
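
   A sketch of such a minimal test (block size 3, a single 1x1 block matrix, run on
one process; all names are placeholders, and the literal integers are fine as long as
you do not configure with --with-64-bit-indices):

  Mat            :: A
  PetscInt       :: row(1), col(1)
  PetscScalar    :: b(3,3)
  PetscErrorCode :: ierr

  call MatCreate(PETSC_COMM_WORLD, A, ierr)
  call MatSetSizes(A, 3, 3, 3, 3, ierr)
  call MatSetType(A, MATMPIBAIJ, ierr)
  call MatMPIBAIJSetPreallocation(A, 3, 1, PETSC_NULL_INTEGER, 0, PETSC_NULL_INTEGER, ierr)
  call MatSetUp(A, ierr)

  row(1) = 0
  col(1) = 0
  b      = 1.0
  call MatSetValuesBlocked(A, 1, row, 1, col, b, INSERT_VALUES, ierr)
  call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr)
  call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr)
  call MatView(A, PETSC_VIEWER_STDOUT_WORLD, ierr)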

  Barry


> On Nov 3, 2022, at 3:11 PM, Edoardo alinovi  wrote:
> 
> Just to give a bit more of context I am doing like this:
> 
> call MatCreate(PETSC_COMM_WORLD, this%A, ierr)
> 
> call MatSetSizes(this%A, lm, lm, M, M, ierr) ! lm is the local size 
> of the matrix , M the global one
> 
> call MatSetType(this%A, myType, ierr) ! myType is MATMPIBAIJ
>  
> call MatSetBlockSize(this%A, 4-bdim, ierr) ! 4-bdim is equal to 3 in 
> this case
> 
> call MatMPIBAIJSetPreallocation(this%A, 4-bdim, flubioSolvers%d_nz, 
> mesh%d_nnz, flubioSolvers%o_nz, mesh%o_nnz, ierr) ! d_nnz and o_nnz is the 
> number of diagonal and off diagonal non zero blocks
> 
> call MatSetUp(this%A, ierr)
> 
>... some non relevant code .
> 
>  call MatSetValuesBlocked(this%A, 4-bdim, 
> mesh%cellGlobalAddr(iElement)-1, 4-bdim, mesh%cellGlobalAddr(iElement)-1, 
> blockValues, INSERT_VALUES, ierr)  mesh%cellGlobalAddr(iElement)-1 is an 
> integer equal (not an array) to the block element number the block matrix 
> blockValues (3x3) belongs to.
> 
> Any evident errors? 
> 
> 
> 
> 



Re: [petsc-users] On the usage of MatSetValuesBlocked

2022-11-03 Thread Barry Smith



> On Nov 3, 2022, at 1:16 PM, Edoardo alinovi  wrote:
> 
> Ah, I was forgetting the most important thing... Are the size of idxm and 
> idxn equal to one if I insert 1 block or should I specify all the rows and 
> columns in the block?

  Yes, for a single block they are one.  The block size is set with
MatSetBlockSize(), in the preallocation routines, or automatically if you use
DMCreateMatrix(); the block size is not passed in during MatSetValuesBlocked(),
the matrix uses its internal value.

 Barry

> 
> I am getting some memory issues with unallocated non zero values so I must 
> have made some mistake here... :( 
> 
> Sorry for the ton of questions! 



Re: [petsc-users] On the usage of MatSetValuesBlocked

2022-11-03 Thread Barry Smith

  You can find the current F90 interface definitions we support for 
MatSetValuesBlocked() in 

/src/mat/f90-mod/petscmat.h90



> On Nov 3, 2022, at 12:16 PM, Edoardo alinovi  
> wrote:
> 
> Hello Jed/Barry/Petsc friends
> 
> I am trying to assemble a block matrix with 3x3 in 2D and 4x4 blocks in 3D 
> coming from the fully coupled NS equation.
> 
> I am not sure I am understanding the example provided here:
> https://petsc.org/main/docs/manualpages/Mat/MatSetValuesBlocked/
> 
> The description says that " v - a logically two-dimensional array of values", 
> while in the example is passed as a 1D array.
> 
> Should I pass a 2D array like v(1:nComp,1:nComp) or an array v(1:nComp*nComp) 
> to MatSetValuesBlocked (I am using fortran that's why I start form 1 in my 
> arrays) ? 

  We intend to support both approaches, whatever is most convenient for the 
rest of your code. (Perhaps more interface definitions are needed?)  

  When we say "logically" two-dimensional, this is intended to mean that you can
pass a one-dimensional array that contains all the nonzeros in the block in the
order of the first column, followed by the second column, etc. (how Fortran lays out
two-dimensional arrays in memory). But in most circumstances I would guess providing
the two-dimensional Fortran array directly is more natural.
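
   As a sketch of the equivalence (assuming a 3x3 block, with irow and icol declared
as PetscInt arrays of length one and v2d already filled), the following two calls
insert the same values:

  PetscScalar :: v2d(3,3), v1d(9)

  ! two-dimensional form
  call MatSetValuesBlocked(A, 1, irow, 1, icol, v2d, INSERT_VALUES, ierr)

  ! equivalent one-dimensional form: first column, then second column, etc.
  v1d(1:3) = v2d(:,1)
  v1d(4:6) = v2d(:,2)
  v1d(7:9) = v2d(:,3)
  call MatSetValuesBlocked(A, 1, irow, 1, icol, v1d, INSERT_VALUES, ierr)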

> 
> Are idxm and idxn the global index of each block right? I guess PETSc will 
> correctly assign each block row and column as it knows the matrix has a given 
> structure.

   Yes
> 
> Thank you as always! 
> 
> 



Re: [petsc-users] Bug report LMVM matrix class

2022-11-02 Thread Barry Smith

  Thanks for the bug report with reproducing example. I have a fix in 
https://gitlab.com/petsc/petsc/-/merge_requests/5797

  Barry


> On Nov 2, 2022, at 6:52 AM, Stephan Köhler 
>  wrote:
> 
> Dear PETSc/Tao team,
> 
> it seems to be that there is a bug in the LMVM matrix class:
> 
> In the function MatCreateVecs_LMVM, see, e.g., 
> https://petsc.org/release/src/ksp/ksp/utils/lmvm/lmvmimpl.c.html at line 214.
> it is not checked if the vectors  *L, or *R are NULL.  This is, in 
> particular, a problem if this matrix class is combined with the Schur 
> complement matrix class, since MatMult_SchurComplement
> calls this function with NULL as *R, see, e.g. 
> https://petsc.org/release/src/ksp/ksp/utils/schurm/schurm.c.html at line 66.
> 
> I attach a minimal example.  You need to modify the paths to the PETSc 
> installation in the makefile.
> 
> Best regards
> Stephan Köhler
> 
> -- 
> Stephan Köhler
> TU Bergakademie Freiberg
> Institut für numerische Mathematik und Optimierung
> 
> Akademiestraße 6
> 09599 Freiberg
> Gebäudeteil Mittelbau, Zimmer 2.07
> 
> Telefon: +49 (0)3731 39-3173 (Büro)
> 
> 



Re: [petsc-users] Report Bug TaoALMM class

2022-11-02 Thread Barry Smith


  Stephan,

I have located the troublesome line in TaoSetUp_ALMM() it has the line

  auglag->Px = tao->solution;

and in alma.h it has 

Vec  Px, LgradX, Ce, Ci, G; /* aliased vectors (do not destroy!) */

Now auglag->Px in some situations aliases auglag->P and in some cases auglag->Px
serves to hold a portion of auglag->P. So then in 
TaoALMMSubsolverObjective_Private()
the lines

PetscCall(VecCopy(P, auglag->P));
 PetscCall((*auglag->sub_obj)(auglag->parent));

causes, just as you said, tao->solution to be overwritten by the P at which the 
objective function is being computed. In other words, the solution of the outer 
Tao is aliased with the solution of the inner Tao, by design. 

You are definitely correct, the use of TaoALMMSubsolverObjective_Private and 
TaoALMMSubsolverObjectiveAndGradient_Private  in a line search would be 
problematic. 

I am not an expert at these methods or their implementations. Could you point 
to an actual use case within Tao that triggers the problem. Is there a set of 
command line options or code calls to Tao that fail due to this "design 
feature". Within the standard use of ALMM I do not see how the objective 
function would be used within a line search. The TaoSolve_ALMM() code is 
self-correcting in that if a trust region check fails it automatically rolls 
back the solution.

  Barry




> On Oct 28, 2022, at 4:27 AM, Stephan Köhler 
>  wrote:
> 
> Dear PETSc/Tao team,
> 
> it seems to be that there is a bug in the TaoALMM class:
> 
> In the methods TaoALMMSubsolverObjective_Private and 
> TaoALMMSubsolverObjectiveAndGradient_Private the vector where the function 
> value for the augmented Lagrangian is evaluate
> is copied into the current solution, see, e.g., 
> https://petsc.org/release/src/tao/constrained/impls/almm/almm.c.html line 672 
> or 682.  This causes subsolver routine to not converge if the line search for 
> the subsolver rejects the step length 1. for some
> update.  In detail:
> 
> Suppose the current iterate is xk and the current update is dxk. The line 
> search evaluates the augmented Lagrangian now at (xk + dxk).  This causes 
> that the value (xk + dxk) is copied in the current solution.  If the point 
> (xk + dxk) is rejected, the line search should
> try the point (xk + alpha * dxk), where alpha < 1.  But due to the copying, 
> what happens is that the point ((xk + dxk) + alpha * dxk) is evaluated, see, 
> e.g., https://petsc.org/release/src/tao/linesearch/impls/armijo/armijo.c.html 
> line 191.
> 
> Best regards
> Stephan Köhler
> 
> -- 
> Stephan Köhler
> TU Bergakademie Freiberg
> Institut für numerische Mathematik und Optimierung
> 
> Akademiestraße 6
> 09599 Freiberg
> Gebäudeteil Mittelbau, Zimmer 2.07
> 
> Telefon: +49 (0)3731 39-3173 (Büro)
> 
> 



Re: [petsc-users] KSP on GPU

2022-10-31 Thread Barry Smith

  Please send the full output when you run with the monitors I mentioned turned on.
If one approach is converging and one is not, then we should be able to see this in
the convergence output, with the residuals printed for the two runs getting further
and further apart.

  Barry


> On Oct 31, 2022, at 1:56 AM, Carl-Johan Thore  wrote:
> 
> The GPU supports double precision and I didn’t explicitly tell PETSc to use 
> float when compiling, so
> I guess it uses double? What’s the easiest way to check?
>  
> Barry, running -ksp_view shows that the solver options are the same for CPU 
> and GPU. The only
> difference is the coarse grid solver for gamg (“the package used to perform 
> factorization:”) which
> is petsc for CPU and cusparse for GPU. I tried forcing the GPU to use petsc 
> via
> -fieldsplit_0_mg_coarse_sub_pc_factor_mat_solver_type, but then ksp failed to 
> converge
> even on the first topology optimization iteration.  
>  
> -ksp_view also shows differences in the eigenvalues from the Chebyshev 
> smoother. For example,
>  
> GPU: 
>Down solver (pre-smoother) on level 2 ---
>   KSP Object: (fieldsplit_0_mg_levels_2_) 1 MPI process
> type: chebyshev
>   eigenvalue targets used: min 0.109245, max 1.2017
>   eigenvalues provided (min 0.889134, max 1.09245) with
>  
> CPU: 
>   eigenvalue targets used: min 0.112623, max 1.23886
>   eigenvalues provided (min 0.879582, max 1.12623)
>  
> But I guess such differences are expected?
>  
> /Carl-Johan
>  
> From: Matthew Knepley mailto:knep...@gmail.com>> 
> Sent: den 30 oktober 2022 22:00
> To: Barry Smith mailto:bsm...@petsc.dev>>
> Cc: Carl-Johan Thore  <mailto:carl-johan.th...@liu.se>>; petsc-users@mcs.anl.gov 
> <mailto:petsc-users@mcs.anl.gov>
> Subject: Re: [petsc-users] KSP on GPU
>  
> On Sun, Oct 30, 2022 at 3:52 PM Barry Smith  <mailto:bsm...@petsc.dev>> wrote:
>  
>In general you should expect similar but not identical conference 
> behavior. 
>  
> I suggest running with all the monitoring you can. 
> -ksp_monitor_true_residual -fieldsplit_0_monitor_true_residual 
> -fieldsplit_1_monitor_true_residual and compare the various convergence 
> between the CPU and GPU. Also run with -ksp_view and check that the various 
> solver options are the same (they should be).
>  
> Is the GPU using float or double?
>  
>Matt
>  
>   Barry
>  
> 
> 
> On Oct 30, 2022, at 11:02 AM, Carl-Johan Thore via petsc-users 
> mailto:petsc-users@mcs.anl.gov>> wrote:
>  
> Hi,
>  
> I'm solving a topology optimization problem with Stokes flow discretized by a 
> stabilized Q1-Q0 finite element method
> and using BiCGStab with the fieldsplit preconditioner to solve the linear 
> systems. The implementation
> is based on DMStag, runs on Ubuntu via WSL2, and works fine with PETSc-3.18.1 
> on multiple CPU cores and the following
> options for the preconditioner:
>  
> -fieldsplit_0_ksp_type preonly \
> -fieldsplit_0_pc_type gamg \
> -fieldsplit_0_pc_gamg_reuse_interpolation 0 \
> -fieldsplit_1_ksp_type preonly \
> -fieldsplit_1_pc_type jacobi 
>  
> However, when I enable GPU computations by adding two options -
>  
> ...
> -dm_vec_type cuda \
> -dm_mat_type aijcusparse \
> -fieldsplit_0_ksp_type preonly \
> -fieldsplit_0_pc_type gamg \
> -fieldsplit_0_pc_gamg_reuse_interpolation 0 \
> -fieldsplit_1_ksp_type preonly \
> -fieldsplit_1_pc_type jacobi 
>  
> - KSP still works fine the first couple of topology optimization iterations 
> but then
> stops with "Linear solve did not converge due to DIVERGED_DTOL ..".
>  
> My question is whether I should expect the GPU versions of the linear solvers 
> and pre-conditioners
> to function exactly as their CPU counterparts (I got this impression from the 
> documentation),
> in which case I've probably made some mistake in my own code, or whether 
> there are other/additional
> settings or modifications I should use to run on the GPU (an NVIDIA Quadro 
> T2000)?
>  
> Kind regards,
>  
> Carl-Johan
>  
> 
>  
> -- 
> What most experimenters take for granted before they begin their experiments 
> is infinitely more interesting than any results to which their experiments 
> lead.
> -- Norbert Wiener
>  
> https://www.cse.buffalo.edu/~knepley/ 
> <https://eur01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.cse.buffalo.edu%2F~knepley%2F=05%7C01%7Ccarl-johan.thore%40liu.se%7C8113f968516a44cce6ba08dabab9b28e%7C913f18ec7f264c5fa816784fe9a58edd%7C0%7C0%7C638027603971893036%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C=gQ45cdsgo404NeBzZ7e6c5zNXhYOy39ZPfZzqtGDaDk%3D=0>


Re: [petsc-users] KSP on GPU

2022-10-30 Thread Barry Smith

   In general you should expect similar but not identical convergence behavior. 

I suggest running with all the monitoring you can. 
-ksp_monitor_true_residual -fieldsplit_0_monitor_true_residual 
-fieldsplit_1_monitor_true_residual and compare the various convergence between 
the CPU and GPU. Also run with -ksp_view and check that the various solver 
options are the same (they should be).

  Barry


> On Oct 30, 2022, at 11:02 AM, Carl-Johan Thore via petsc-users 
>  wrote:
> 
> Hi,
>  
> I'm solving a topology optimization problem with Stokes flow discretized by a 
> stabilized Q1-Q0 finite element method
> and using BiCGStab with the fieldsplit preconditioner to solve the linear 
> systems. The implementation
> is based on DMStag, runs on Ubuntu via WSL2, and works fine with PETSc-3.18.1 
> on multiple CPU cores and the following
> options for the preconditioner:
>  
> -fieldsplit_0_ksp_type preonly \
> -fieldsplit_0_pc_type gamg \
> -fieldsplit_0_pc_gamg_reuse_interpolation 0 \
> -fieldsplit_1_ksp_type preonly \
> -fieldsplit_1_pc_type jacobi 
>  
> However, when I enable GPU computations by adding two options -
>  
> ...
> -dm_vec_type cuda \
> -dm_mat_type aijcusparse \
> -fieldsplit_0_ksp_type preonly \
> -fieldsplit_0_pc_type gamg \
> -fieldsplit_0_pc_gamg_reuse_interpolation 0 \
> -fieldsplit_1_ksp_type preonly \
> -fieldsplit_1_pc_type jacobi 
>  
> - KSP still works fine the first couple of topology optimization iterations 
> but then
> stops with "Linear solve did not converge due to DIVERGED_DTOL ..".
>  
> My question is whether I should expect the GPU versions of the linear solvers 
> and pre-conditioners
> to function exactly as their CPU counterparts (I got this impression from the 
> documentation),
> in which case I've probably made some mistake in my own code, or whether 
> there are other/additional
> settings or modifications I should use to run on the GPU (an NVIDIA Quadro 
> T2000)?
>  
> Kind regards,
>  
> Carl-Johan



Re: [petsc-users] Report Bug TaoALMM class

2022-10-28 Thread Barry Smith


  Stephan,

 Thanks for your detailed report. 

 Do you have a reproducing example? 

 I am having trouble following the logic you indicate below.  

It is copying the P into auglag->P. Is auglag->P the "current solution" you are
referring to? 

Is it because of the line PetscCall(TaoSetSolution(auglag->subsolver, 
auglag->P))? So I am assuming that you mean the current solution of the 
auglag->subsolver Tao?

This means that TaoALMMSubsolverObjective_Private() always sets the 
subsolver Tao current solution to the input P. Are you saying this is flawed 
logic in the implementation of the entire ALMM solver? 

Since the auglag->P gets overwritten for every 
TaoALMMSubsolverObjective_Private()  with the new P passed in I don't see where 
((xk + dxk) + alpha * dxk) would occur. Would it first have xk + dxk passed in 
for P and then the next time have xk + alpha * dxk passed in?

  Barry

PetscErrorCode TaoALMMSubsolverObjective_Private(Tao tao, Vec P, PetscReal 
*Lval, void *ctx)
{
  TAO_ALMM *auglag = (TAO_ALMM *)ctx;

  PetscFunctionBegin;
  PetscCall(VecCopy(P, auglag->P));
  PetscCall((*auglag->sub_obj)(auglag->parent));
  *Lval = auglag->Lval;
  PetscFunctionReturn(0);
}



> On Oct 28, 2022, at 4:27 AM, Stephan Köhler 
>  wrote:
> 
> Dear PETSc/Tao team,
> 
> it seems to be that there is a bug in the TaoALMM class:
> 
> In the methods TaoALMMSubsolverObjective_Private and 
> TaoALMMSubsolverObjectiveAndGradient_Private the vector where the function 
> value for the augmented Lagrangian is evaluate
> is copied into the current solution, see, e.g., 
> https://petsc.org/release/src/tao/constrained/impls/almm/almm.c.html line 672 
> or 682.  This causes subsolver routine to not converge if the line search for 
> the subsolver rejects the step length 1. for some
> update.  In detail:
> 
> Suppose the current iterate is xk and the current update is dxk. The line 
> search evaluates the augmented Lagrangian now at (xk + dxk).  This causes 
> that the value (xk + dxk) is copied in the current solution.  If the point 
> (xk + dxk) is rejected, the line search should
> try the point (xk + alpha * dxk), where alpha < 1.  But due to the copying, 
> what happens is that the point ((xk + dxk) + alpha * dxk) is evaluated, see, 
> e.g., https://petsc.org/release/src/tao/linesearch/impls/armijo/armijo.c.html 
> line 191.
> 
> Best regards
> Stephan Köhler
> 
> -- 
> Stephan Köhler
> TU Bergakademie Freiberg
> Institut für numerische Mathematik und Optimierung
> 
> Akademiestraße 6
> 09599 Freiberg
> Gebäudeteil Mittelbau, Zimmer 2.07
> 
> Telefon: +49 (0)3731 39-3173 (Büro)
> 
> 



Re: [petsc-users] Does flagging a matrix as symmetric improving performances?

2022-10-21 Thread Barry Smith


  Set the flag; it cannot harm. I cannot see that hypre boomerAMG has options 
to indicate matrix symmetry to take advantage of it.

  Stefano or Mark would know better than I.

  No, do not change MPIAIJ it offers the most preconditioners.

  Barry




> On Oct 21, 2022, at 11:36 AM, Edoardo alinovi  
> wrote:
> 
> Hi Berry, 
> 
> Thank you for the hints. Actually, I am using MPIAIJ, does this mean I need 
> to change the matrix format? 
> 
> Most of the time I am using CG + boomerAMG, do you think that using a sym 
> matrix can be any good for the performance?  
> 
> Cheers
> 
> 



Re: [petsc-users] Does flagging a matrix as symmetric improving performances?

2022-10-21 Thread Barry Smith


 For most solvers, just setting this flag will not, by itself, improve the setup
time. An exception to this is PCGAMG in the latest release, where it will improve
the solution time. 

 For some preconditioner situations you can use MATSBAIJ to store the matrix; this
can help noticeably because only half of the matrix is stored and computed with.

 You can select KSP methods that specifically work only for symmetric operators 
(or symmetric positive definite operators) to get faster solve times but you 
need to do this explicitly, it is not done automatically.

  If you know the matrix is symmetric there is likely never a reason not to set the
flag. Note also the flag MAT_SYMMETRY_ETERNAL, which can be used in conjunction with
MAT_SYMMETRIC if you know the matrix will remain symmetric despite changes to its
numerical values.
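
  For example, a minimal sketch of setting both flags from Fortran on your assembled
matrix A:

  call MatSetOption(A, MAT_SYMMETRIC, PETSC_TRUE, ierr)
  call MatSetOption(A, MAT_SYMMETRY_ETERNAL, PETSC_TRUE, ierr)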

  Barry


> On Oct 21, 2022, at 10:52 AM, Edoardo alinovi  
> wrote:
> 
> Hi PETSc friends,
> 
> As per object, do you think that flagging a matrix as symmetric might improve 
> setup times of the preconditioner?
> 
> Thank you as always.



Re: [petsc-users] Periodic boundary condition

2022-10-20 Thread Barry Smith

  Some of the valgrind information does not appear to make sense

PetscMemcpy() is not calling SNESSolve() so I suspect there must be some 
serious corruption of something to this impossible stack trace

==236074==by 0x6FD160F: SNESSolve (snes.c:4569)
==236074==by 0x711917E: PetscMemcpy (bdf.c:223)

From

==236074== Conditional jump or move depends on uninitialised value(s)
==236074==at 0x5F10977: MatFDColoringSetUpBlocked_AIJ_Private (fdaij.c:146)
==236074==by 0x5F10977: MatFDColoringSetUp_SeqXAIJ.A (fdaij.c:284)
==236074==by 0x585892A: MatFDColoringSetUp (fdmatrix.c:242)
==236074==by 0x6FE5037: SNESComputeJacobianDefaultColor (snesj2.c:79)
==236074==by 0x6FC8E4E: SNESComputeJacobian (snes.c:2717)
==236074==by 0x70236E7: SNESSolve_NEWTONTR (tr.c:324)
==236074==by 0x6FD160F: SNESSolve (snes.c:4569)
==236074==by 0x711917E: PetscMemcpy (bdf.c:223)
==236074==by 0x711917E: PetscCitationsRegister (petscsys.h:2689)
==236074==by 0x711917E: TSStep_BDF.A (bdf.c:265)
==236074==by 0x70C363A: TSStep (ts.c:3757)
==236074==by 0x70C1999: TSSolve (ts.c:4154)
==236074==by 0x402594: main (one.c:391)

I suggest you run with -malloc_debug instead of valgrind and see if any errors 
are reported. If so you can add the macro CHKMEMQ; inside your function 
evaluation where you write to memory to see if anything is writing to the wrong 
location. For example wherever you assign aF such as

aF[j][i].vx=(x3+x4+x5+x6+x7+x8+x9-x1-x2)*user->hx;

This can help you determine the exact line number where you are writing to the
wrong location and determine what might be the cause.
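
For example (a sketch only; the assignment is taken from your code):

  aF[j][i].vx = (x3+x4+x5+x6+x7+x8+x9-x1-x2)*user->hx;
  CHKMEMQ; /* with -malloc_debug, checks the PETSc heap for corruption at this point */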


> On Oct 20, 2022, at 6:45 PM, Sepideh Kavousi  wrote:
> 
> Hello,
> I want to solve my 5 PDEs based on finite  difference method using periodic 
> BC in x-direction and non-periodic in y-direction but I run into error 
> (Segmentation Violation, probably memory access out of range).
> For this, I discretize my equation in FormFunction function. My PDE 
> discretization in (i,j) node needs data on (i+1,j), (i+2,j), (i-1,j), 
> (i-2,j), (i,j+1), (i,j+2), (i,j-1), (i,j-2) points.
> In my previous codes that the x-direction was non-periodic (no flux) boundary 
> condition, I:
> i)implemented the no flux BC for i=0 and i=Nx-1,
> ii)   set i+2= Nx-1 in discretizing (Nx-2,j) and i+2= 0 in 
> discretizing (1,j)
> iii) discretized my equation for i=1..Nx-2.
> I am not sure how I should do the periodic BC. From the following discussions 
> (https://lists.mcs.anl.gov/pipermail/petsc-users/2012-May/013476.html 
>   
> andhttps://lists.mcs.anl.gov/pipermail/petsc-users/2016-May/029273.html 
> ), I 
> guess I should not do step (i) (stated above) for the x-boundaries and just 
> do step (iii) for i=0..Nx-1. If I just focus on solving 2 of the PDEs which 
> does need data on (i+2,j), (i-2,j), (i,j+2), (i,j-2) points for discretizing 
> equation in (i,j) node, I still run into error:
> Running with Valgrind (just 1 processor) gave the following file. I did not 
> find any information which gives me hint on the error source.
> Can you please help me to find the error? 
> Best,
> Sepideh
>  
> ==236074== Conditional jump or move depends on uninitialised value(s)
> ==236074==at 0x4C29E39: malloc (vg_replace_malloc.c:309)
> ==236074==by 0x1B79E59B: MPID_Init (mpid_init.c:1649)
> ==236074==by 0x1B73FAEA: MPIR_Init_thread (initthread.c:717)
> ==236074==by 0x1B73D795: PMPI_Init_thread (initthread.c:1061)
> ==236074==by 0x5264A94: PetscInitialize (pinit.c:907)
> ==236074==by 0x40219D: main (one.c:335)
> ==236074== 
> ==236074== Conditional jump or move depends on uninitialised value(s)
> ==236074==at 0x2183B805: ??? (in /usr/lib64/libpsm2.so.2.2)
> ==236074==by 0x218323C1: ??? (in /usr/lib64/libpsm2.so.2.2)
> ==236074==by 0x218341C7: ??? (in /usr/lib64/libpsm2.so.2.2)
> ==236074==by 0x400F9C2: _dl_init (in /usr/lib64/ld-2.17.so)
> ==236074==by 0x401459D: dl_open_worker (in /usr/lib64/ld-2.17.so)
> ==236074==by 0x400F7D3: _dl_catch_error (in /usr/lib64/ld-2.17.so)
> ==236074==by 0x4013B8A: _dl_open (in /usr/lib64/ld-2.17.so)
> ==236074==by 0x1AEA4FAA: dlopen_doit (in /usr/lib64/libdl-2.17.so)
> ==236074==by 0x400F7D3: _dl_catch_error (in /usr/lib64/ld-2.17.so)
> ==236074==by 0x1AEA55AC: _dlerror_run (in /usr/lib64/libdl-2.17.so)
> ==236074==by 0x1AEA5040: dlopen@@GLIBC_2.2.5 (in /usr/lib64/libdl-2.17.so)
> ==236074==by 0x1B8198DC: MPID_nem_ofi_init (ofi_init.c:158)
> ==236074==by 0x1B7B5C55: ??? (mpid_nem_init.c:231)
> ==236074==by 0x1B7B3F92: MPID_nem_init_ckpt (mpid_nem_init.c:954)
> ==236074==by 0x1B580640: MPIDI_CH3_Init (ch3_init.c:125)
> ==236074==by 0x1B79F02D: MPID_Init (mpid_init.c:1857)
> ==236074==by 0x1B73FAEA: 

Re: [petsc-users] how to reuse Mumps factorization

2022-10-19 Thread Barry Smith

   Every time a matrix entry gets changed PETSc tracks this, so by default the
next KSP solve repeats the numerical factorization if the matrix has changed.
Otherwise it reuses the still-current factorization. 

   If you are calling KSP directly, you can call KSPSetReusePreconditioner() to
prevent KSP from automatically performing a new factorization; it will then use
the out-of-date preconditioner, but if you use a KSPType of, for example,
KSPGMRES, it will still solve the linear system correctly, just taking some
extra iterations. Reusing the preconditioner can be faster if the matrix does
not change too much, since a numerical factorization takes a lot of time.

   If you use SNES you can control "lagging" the preconditioner with 
SNESSetLagPreconditioner() 
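
   For instance, a minimal sketch with KSP (C calling sequence; ksp, b, x are
placeholders for your own objects, and as noted above lagging the factorization is
only safe with an iterative KSPType such as KSPGMRES, not with a direct solve alone):

  /* matrix values changed only slightly: keep the existing MUMPS factorization */
  PetscCall(KSPSetReusePreconditioner(ksp, PETSC_TRUE));
  PetscCall(KSPSolve(ksp, b, x));

  /* matrix changed substantially: allow a new numerical factorization on the next solve */
  PetscCall(KSPSetReusePreconditioner(ksp, PETSC_FALSE));
  PetscCall(KSPSolve(ksp, b, x));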

  Barry


> On Oct 19, 2022, at 9:15 AM, Matthew Knepley  wrote:
> 
> On Wed, Oct 19, 2022 at 9:13 AM 袁煕  > wrote:
> Hello,
> 
> I am using Mumps to solve a problem with multiple time steps. The matrix 
> structure does not change  but its value may or may not change  during those 
> steps. That means I should reuse the symbolic factorization but recall 
> numeric factorization when needed.
> 
> I have found the following anwser of a similar question
> https://lists.mcs.anl.gov/pipermail/petsc-users/2020-August/041846.html 
> 
> 
> which says "it automatically uses the same factorization", but I don't know 
> if it includes numerical factorization also.
> 
> My question is : 
> 1. Does numeric factorization do automatically? If not
> 
> Yes.
> 
>   Thanks,
> 
>  Matt
>  
> 2. Could I control when numeric factorization should be done and how to do it?
> 
> Much thanks
> 
> YUAN
> 
> 
> -- 
> What most experimenters take for granted before they begin their experiments 
> is infinitely more interesting than any results to which their experiments 
> lead.
> -- Norbert Wiener
> 
> https://www.cse.buffalo.edu/~knepley/ 



Re: [petsc-users] Issue with single precision complex numbers in petsc4py

2022-10-13 Thread Barry Smith

  Is there any reason you can't use the most recent version of PETSc4py? The one you
are working with is several years old.



> On Oct 13, 2022, at 8:53 PM, Peng Sun  wrote:
> 
> Hi Hong,
> 
> Thanks for the advice.  I could not install petsc4py with the 
> --with-petsc4py=1 option, which gave me an "No rule to make target 
> 'petsc4py-install'" error when I ran "make install".   That was why I needed 
> to install petsc4py separately after the PETSc was installed.
> 
> Best regards,
> Peng Sun
> From: Zhang, Hong mailto:hongzh...@anl.gov>>
> Sent: Thursday, October 13, 2022 4:30 PM
> To: Peng Sun mailto:p...@outlook.com>>
> Cc: petsc-users@mcs.anl.gov  
> mailto:petsc-users@mcs.anl.gov>>
> Subject: Re: [petsc-users] Issue with single precision complex numbers in 
> petsc4py
>  
> It seems that you installed petsc4py separately. I would suggest to add the 
> configure option --with-petsc4py=1 and follow the instructions to set 
> PYTHONPATH before using petsc4py.
> 
> Hong (Mr.)
> 
> > On Oct 13, 2022, at 10:42 AM, Peng Sun  > > wrote:
> > 
> > Hi Matt,
> > 
> > Sure, please see the attached configure.log file.  Thanks!
> > 
> > Best regards,
> > Peng Sun
> > 
> > 
> > From: Matthew Knepley mailto:knep...@gmail.com>>
> > Sent: Thursday, October 13, 2022 6:34 AM
> > To: Peng Sun mailto:p...@outlook.com>>
> > Cc: petsc-users@mcs.anl.gov  
> > mailto:petsc-users@mcs.anl.gov>>
> > Subject: Re: [petsc-users] Issue with single precision complex numbers in 
> > petsc4py
> >  
> > First send configure.log so we can see the setup.
> > 
> >   Thanks,
> > 
> >   Matt
> > 
> > On Thu, Oct 13, 2022 at 12:53 AM Peng Sun  > > wrote:
> > Dear PETSc community, 
> > 
> > I have a question regarding the single precision complex numbers of 
> > petsc4py.  I configured PETSc with the “--with-scalar-type=complex 
> > --with-precision=single" option before compiling, but all the DA structures 
> > I created with petsc4py had double precision.  
> > 
> > Here is a minimum test code on Python 3.8/PETSc 3.12/petsc4py 3.12: both 
> > print commands show data type of complex128.  Could anybody please help me? 
> >  Thanks!
> > 
> > import
> >  petsc4py
> > 
> > import
> >  sys
> > petsc4py.init(sys.argv)
> > 
> > from petsc4py import
> >  PETSc
> > 
> > da=PETSc.DA().create(sizes=[
> > 2,2,2],dof=1,stencil_type=0,stencil_width=1,boundary_type=1
> > )
> > da_1 = da.createGlobalVec()
> > 
> > print
> > (petsc4py.PETSc.ComplexType)
> > 
> > print(da_1.getArray().dtype)
> > 
> > 
> > 
> > Best regards,
> > Peng Sun
> > 
> > 
> > -- 
> > What most experimenters take for granted before they begin their 
> > experiments is infinitely more interesting than any results to which their 
> > experiments lead.
> > -- Norbert Wiener
> > 
> > https://www.cse.buffalo.edu/~knepley/ 
> > 
> > 



Re: [petsc-users] Laplace Equation preconditioner

2022-10-12 Thread Barry Smith

   What KSP are you using? DIVERGED_BREAKDOWN is very rare for KSPGMRES. If you 
are using one of its lesser cousins like bcgs you might consider switching to 
bcgsl or gmres.

   I assume that because of the boundary conditions or the discretization you do not
have a symmetric positive definite operator and thus cannot use CG.
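
   For example, options along these lines (a sketch; add your solver prefix if the
KSP has one):

  -ksp_type bcgsl -ksp_monitor_true_residual -ksp_converged_reason
  -ksp_type gmres -ksp_gmres_restart 200 -ksp_monitor_true_residual -ksp_converged_reason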

  Barry


> On Oct 12, 2022, at 2:00 PM, Matthew Knepley  wrote:
> 
> On Wed, Oct 12, 2022 at 1:38 PM Alfredo J Duarte Gomez  > wrote:
> Good morning PETSC users,
> 
> I have a current solver that requires the solution of a Laplace equation to 
> be reused for all future time steps.
> 
> The configuration is axisymmetric with Dirichlet BCs at the top and bottom 
> boundaries, and Zero Neumman conditions at the axis and far field. The grid 
> is curvilinear and structured. 
> 
> So far I have been using PCHYPRE boomeramg as the preconditioner, which often 
> works well, but I have also experienced DIVERGED_BREAKDOWN on many occasions. 
> When I use direct solver PCMUMPS it always produces a satisfactory answer, 
> which gives me confidence that the solution exists for the given grid in all 
> these configurations where boomeramg fails.
> 
> I do not know why Hypre is breaking down. Did you try ML or GAMG? They are 
> easier to diagnose I think.
> 
>   Thanks,
> 
>  Matt
>  
> I am looking for recommendations on other preconditioners to try in this 
> problem that can produce a solution faster than PCMUMPS, or recommendations 
> on which parameters to adjust on boomeramg.
> 
> Thank you,
> 
> 
> -- 
> Alfredo Duarte
> Graduate Research Assistant
> The University of Texas at Austin
> 
> 
> -- 
> What most experimenters take for granted before they begin their experiments 
> is infinitely more interesting than any results to which their experiments 
> lead.
> -- Norbert Wiener
> 
> https://www.cse.buffalo.edu/~knepley/ 



Re: [petsc-users] VecScatter

2022-10-11 Thread Barry Smith

   You need to first convert the Vec to the "natural" ordering and then bring 
that vector down to one rank. Something like

DMDACreateNaturalVector(dm, &n);
DMDAGlobalToNaturalBegin(dm, xy, INSERT_VALUES, n);
DMDAGlobalToNaturalEnd(dm, xy, INSERT_VALUES, n);
VecScatterCreateToZero(n, &scat, &Xseq);
VecScatterBegin(scat, n, Xseq, INSERT_VALUES, SCATTER_FORWARD);
VecScatterEnd(scat, n, Xseq, INSERT_VALUES, SCATTER_FORWARD);



> On Oct 11, 2022, at 7:45 PM, Jorti, Zakariae via petsc-users 
>  wrote:
> 
> Hello, 
> 
> I have a code that handles a PETSc Vec on many procs and I would like to use 
> VecScatterCreateToZero to have all elements of this vector on a single proc, 
> call VecGetArrayRead on this proc to get the corresponding array to carry out 
> some diagnostics.
> Unfortunately, what I noticed is that the ordering of the initial Vec is not 
> preserved after the VecScatterCreateToZero call. Is there a way to have the 
> same ordering for both Vecs? 
> You will find below a minimal example that demonstrates the issue.
> Many thanks,
> 
> Zakariae   
> 
> -
> 
> static char help[] = "Demonstrates ordering change after 
> VecScatterCreateToZero call.\n\n";
> 
> 
> #include 
> #include 
> #include 
> 
> int main(int argc,char **argv)
> {
>   Vecxy;
>   DM da;
>   PetscErrorCode ierr;
>   PetscInt   m = 11, n = 11, dof = 2;
>   PetscMPIInt rank;
>   DMDACoor2d **coors;
> 
>   ierr = PetscInitialize(,,(char*)0,help);if (ierr) return ierr;
>   ierr = DMDACreate2d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE, 
> DM_BOUNDARY_NONE,DMDA_STENCIL_BOX,m,n,PETSC_DECIDE,PETSC_DECIDE,dof,1,0,0,);CHKERRQ(ierr);
>   ierr = DMSetFromOptions(da);CHKERRQ(ierr);
>   ierr = DMSetUp(da);CHKERRQ(ierr);
>   ierr = DMDASetUniformCoordinates(da,0.0,1.0,0.0,1.0,0.0,1.0);CHKERRQ(ierr);
>   ierr = DMGetCoordinates(da,);CHKERRQ(ierr);
> 
>   PetscInt  i, j, ixs, ixm, iys, iym;
>   MPI_Comm_rank(PETSC_COMM_WORLD, );
> 
>   DMDAGetCorners(da, , , 0, , , 0);
>   DMDAVecGetArray(da, xy, );
>   for (j = iys; j < iys + iym; j++) {
> for (i = ixs; i < ixs + ixm; i++) {
> PetscPrintf(PETSC_COMM_SELF, "rank=%d, %d, %d, (%g, %g)\n",rank, 
> i, j,coors[j][i].x,coors[j][i].y);
> }
>   }
>   DMDAVecRestoreArray(da, xy, );
> 
>   VecScatter  scat;
>   Vec Xseq;
>   const PetscScalar *array; 
>
>   /* create scater to zero */
>   VecScatterCreateToZero(xy, , );
>   VecScatterBegin(scat, xy, Xseq, INSERT_VALUES, SCATTER_FORWARD);
>   VecScatterEnd(scat, xy, Xseq, INSERT_VALUES, SCATTER_FORWARD);
> 
>   if (rank == 0) {
> PetscInt sizeX;
> VecGetSize(Xseq, ); 
> PetscPrintf(PETSC_COMM_SELF,"The size of Xseq is %d, and the grid size is 
> %d\n",sizeX,11*11);
> VecGetArrayRead(Xseq, );
> 
> for (j = 0; j < 11; j++) {
>   for (i = 0; i < 11; i++) {
> PetscPrintf(PETSC_COMM_SELF, "%d,%d, (%g,%g)\n", i, j, 
> (double)array[2*(j*11+i)], (double)array[1+2*(j*11+i)]);
>   }
> }
> VecRestoreArrayRead(Xseq, );
>   }
> 
>   /*
>  Free work space.  All PETSc objects should be destroyed when they
>  are no longer needed.
>   */
>   ierr = DMDestroy();CHKERRQ(ierr);
>   ierr = PetscFinalize();
>   return ierr;
> }



Re: [petsc-users] make all check error

2022-10-11 Thread Barry Smith


https://petsc.org/release/faq/#what-does-the-message-hwloc-linux-ignoring-pci-device-with-non-16bit-domain-mean


> On Oct 10, 2022, at 11:54 PM, Sijie Zhang  wrote:
> 
> Hi,
> 
> When I try to install petsc on my PC and run the make all check commend it 
> has the following error. Can you help me to troubleshoot that?
> 
> Thanks.
> 
> Sijie
> 
> 
> Running check examples to verify correct installation
> Using PETSC_DIR=/home/zhangsijie1995/Documents/Packages/petsc-3.18.0 and 
> PETSC_ARCH=arch-linux-c-debug
> Possible error running C/C++ src/snes/tutorials/ex19 with 1 MPI process
> See https://petsc.org/release/faq/
> hwloc/linux: Ignoring PCI device with non-16bit domain.
> Pass --enable-32bits-pci-domain to configure to support such devices
> (warning: it would break the library ABI, don't enable unless really needed).
> lid velocity = 0.0016, prandtl # = 1., grashof # = 1.
> Number of SNES iterations = 2
> Possible error running C/C++ src/snes/tutorials/ex19 with 2 MPI processes
> See https://petsc.org/release/faq/
> hwloc/linux: Ignoring PCI device with non-16bit domain.
> Pass --enable-32bits-pci-domain to configure to support such devices
> (warning: it would break the library ABI, don't enable unless really needed).
> hwloc/linux: Ignoring PCI device with non-16bit domain.
> Pass --enable-32bits-pci-domain to configure to support such devices
> (warning: it would break the library ABI, don't enable unless really needed).
> lid velocity = 0.0016, prandtl # = 1., grashof # = 1.
> Number of SNES iterations = 2
> Possible error running Fortran example src/snes/tutorials/ex5f with 1 MPI 
> process
> See https://petsc.org/release/faq/
> hwloc/linux: Ignoring PCI device with non-16bit domain.
> Pass --enable-32bits-pci-domain to configure to support such devices
> (warning: it would break the library ABI, don't enable unless really needed).
> Number of SNES iterations = 3
> Completed test examples
> Error while running make check
> gmake[1]: *** [makefile:149: check] Error 1
> make: *** [GNUmakefile:17: check] Error 2
> 
> 



[petsc-users] PETSc 2023 User Meeting and New Release 3.18

2022-10-09 Thread Barry Smith
We are pleased to announce the

 - next PETSc users meeting, June 5-7, 2023, in Chicago on the campus of IIT 
(mark your calendars, further information will be forthcoming soon) and the

 - release of PETSc version 3.18.0 at 
https://petsc.org/release/install/download/ 
<https://petsc.org/release/install/download/>.  This release includes

 + far more extensive solvers on GPUs, including an efficient 
MatSetValuesCOO() for setting matrix values directly on the GPU

 + improved interfaces to Kokkos, so user GPU code can now be written in 
Kokkos, CUDA, or HIP.

 + vastly improved documentation at petsc.org with a more comprehensive 
user manual and better search capabilities (due to Patrick Sanan)

We recommend upgrading to PETSc 3.18.0 soon. As always, please report problems 
to  petsc-ma...@mcs.anl.gov <mailto:petsc-ma...@mcs.anl.gov> and ask questions 
at petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov>

A list of the major changes and updates can be found at 
https://petsc.org/release/docs/changes/318 
<https://petsc.org/release/docs/changes/318>

The final update to petsc-3.17, i.e., petsc-3.17.5, is also available.

This release includes contributions from

AdelekeBankole
Aidan Hamilton
Albert Cowie
Alexis Marboeuf
Barry Smith
Blaise Bourdin
Dave May
David Andrs
David Wells
Fande Kong
ftrigaux
Getnet Betrie
Hong Zhang
Jacob Faibussowitsch
James Wright
JDBetteridge
Jed Brown
Jeremy L Thompson
Joe Wallwork
Jose Roman
Junchao Zhang
Justin Chang
Kaushik Kulkarni
Kerry Key
Koki Sagiyama
Lawrence Mitchell
Lisandro Dalcin
Mark Adams
Martin Diehl
Matthew Knepley
Matthew Woehlk
Min RK
Mr. Hong Zhang
Patrick Farrell
Patrick Sanan
Pierre Jolivet
Richard Tran Mills
Romain Beucher
Satish Balay
Scott Kruger
Stefano Zampini
suyashtn
Toby Isaac
Todd Munson
Umberto Zerbinati
Vaclav Hapla
Zongze Yang

and bug reports/proposed improvements received from

Brad Aagaard
Abylay Zhumekenov
Aidan Hamilton
Alfredo J Duarte Gomez
Bro H
Collins, Eric
Benjamin Dudson
Ed Bueler
Erhan Turan
Eric Chamberland
Fackler, Philip
f...@rzg.mpg.de <mailto:f...@rzg.mpg.de>
Glenn Hammond
Henrik Büsing
Jacob Simon Merson
Jesper Lund-Carlson
Jin Chen
John Snyder
Jose E. Roman
Kaustubh Khedkar
Lisandro Dalcin
Lucas Banting
Matthew Knepley
Nicolas Berlie
Nicolas Tardieu
 Robert Nourgaliev
Olivier Jamond
Patrick Sanan
Pierre Jolivet
Rafel Amer Ramon
Randall J LeVeque
Richard F. Katz
Sanjay Govindjee
san.tempo...@gmail.com <mailto:san.tempo...@gmail.com>
Sidarth Narayanan
Stefano Zampini
TAY wee-beng
Victor Eijkhout
Victoria Hamtiaux
Xiaoye S. Li
Xu Hui
Yang Liu
Ye Changqing
Zakariae Jorti



Re: [petsc-users] suppress CUDA warning & choose MCA parameter for mpirun during make PETSC_ARCH=arch-linux-c-debug check

2022-10-08 Thread Barry Smith


  True, but when users send reports back to us they will never have used the 
VERBOSE=1 option, so it requires one more round trip of email to get this 
additional information. 

> On Oct 8, 2022, at 6:48 PM, Jed Brown  wrote:
> 
> Barry Smith  writes:
> 
>>   I hate these kinds of make rules that hide what the compiler is doing (in 
>> the name of having less output, I guess) it makes it difficult to figure out 
>> what is going wrong.
> 
> You can make VERBOSE=1 with CMake-generated makefiles.



Re: [petsc-users] suppress CUDA warning & choose MCA parameter for mpirun during make PETSC_ARCH=arch-linux-c-debug check

2022-10-08 Thread Barry Smith

    I hate these kinds of make rules that hide what the compiler is doing (in the
name of having less output, I guess); it makes it difficult to figure out what is
going wrong.

    Anyways, either some of the MPI libraries are missing from the link line or they
are in the wrong order, and thus the linker is not able to resolve the symbols
properly. Here are a bunch of discussions on why that error message can appear: 
https://stackoverflow.com/questions/19901934/libpthread-so-0-error-adding-symbols-dso-missing-from-command-line
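
    Since the executable is built with CMake (see the CMakeLists.txt snippet quoted
below), one possible sketch is to link MPI explicitly so that symbols such as
MPI_Abort are resolved; the exact target_link_libraries line is a guess at your
project, assuming the wtm.x target from your error output and that find_package(MPI)
succeeds:

  find_package(MPI REQUIRED)
  target_link_libraries(wtm.x PRIVATE PkgConfig::PETSC MPI::MPI_CXX)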


  Barry


> On Oct 7, 2022, at 11:45 PM, Rob Kudyba  wrote:
> 
> The error changes now and at an earlier place, 66% vs 70%:
> make LDFLAGS="-Wl,--copy-dt-needed-entries"
> Consolidate compiler generated dependencies of target fmt
> [ 12%] Built target fmt
> Consolidate compiler generated dependencies of target richdem
> [ 37%] Built target richdem
> Consolidate compiler generated dependencies of target wtm
> [ 62%] Built target wtm
> Consolidate compiler generated dependencies of target wtm.x
> [ 66%] Linking CXX executable wtm.x
> /usr/bin/ld: libwtm.a(transient_groundwater.cpp.o): undefined reference to 
> symbol 'MPI_Abort'
> /path/to/openmpi-4.1.1_ucx_cuda_11.0.3_support/lib/libmpi.so.40: error adding 
> symbols: DSO missing from command line
> collect2: error: ld returned 1 exit status
> make[2]: *** [CMakeFiles/wtm.x.dir/build.make:103: wtm.x] Error 1
> make[1]: *** [CMakeFiles/Makefile2:225: CMakeFiles/wtm.x.dir/all] Error 2
> make: *** [Makefile:136: all] Error 2
> 
> So perhaps PET_Sc is now being found. Any other suggestions?
> 
> On Fri, Oct 7, 2022 at 11:18 PM Rob Kudyba  > wrote:
> 
> Thanks for the quick reply. I added these options to make and make check 
> still produce the warnings so I used the command like this:
> make PETSC_DIR=/path/to/petsc PETSC_ARCH=arch-linux-c-debug  MPIEXEC="mpiexec 
> -mca orte_base_help_aggregate 0 --mca opal_warn_on_missing_libcuda 0 -mca pml 
> ucx --mca btl '^openib'" check
> Running check examples to verify correct installation
> Using PETSC_DIR=/path/to/petsc and PETSC_ARCH=arch-linux-c-debug
> C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process
> C/C++ example src/snes/tutorials/ex19 run successfully with 2 MPI processes
> Completed test examples
> 
> Could be useful for the FAQ.
> You mentioned you had "OpenMPI 4.1.1 with CUDA aware",  so I think a workable 
> mpicc should automatically find cuda libraries.  Maybe you unloaded cuda 
> libraries?
> Oh let me clarify, OpenMPI is CUDA aware however this code and the node where 
> PET_Sc is compiling does not have a GPU, hence not needed and using the 
> MPIEXEC option worked during the 'check' to suppress the warning. 
> 
> I'm not trying to use PetSC to compile and linking appears to go awry:
> [ 58%] Building CXX object 
> CMakeFiles/wtm.dir/src/update_effective_storativity.cpp.o
> [ 62%] Linking CXX static library libwtm.a
> [ 62%] Built target wtm
> [ 66%] Building CXX object CMakeFiles/wtm.x.dir/src/WTM.cpp.o
> [ 70%] Linking CXX executable wtm.x
> /usr/bin/ld: cannot find -lpetsc
> collect2: error: ld returned 1 exit status
> make[2]: *** [CMakeFiles/wtm.x.dir/build.make:103: wtm.x] Error 1
> make[1]: *** [CMakeFiles/Makefile2:269: CMakeFiles/wtm.x.dir/all] Error 2
> make: *** [Makefile:136: all] Error 2
> It seems cmake could not find petsc.   Look at 
> $PETSC_DIR/share/petsc/CMakeLists.txt and try to modify your CMakeLists.txt.
> 
> There is an explicit reference to the path in CMakeLists.txt:
> # NOTE: You may need to update this path to identify PETSc's location
> set(ENV{PKG_CONFIG_PATH} 
> "$ENV{PKG_CONFIG_PATH}:/path/to/petsc/arch-linux-cxx-debug/lib/pkgconfig/")
> pkg_check_modules(PETSC PETSc>=3.17.1 IMPORTED_TARGET REQUIRED)
> message(STATUS "Found PETSc ${PETSC_VERSION}")
> add_subdirectory(common/richdem EXCLUDE_FROM_ALL)
> add_subdirectory(common/fmt EXCLUDE_FROM_ALL)
>  
> And that exists:
> ls /path/to/petsc/arch-linux-cxx-debug/lib/pkgconfig/
> petsc.pc  PETSc.pc
> 
>  Is there an environment variable I'm missing? I've seen the suggestion 
> 
>  to add it to LD_LIBRARY_PATH which I did with export 
> LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$PETSC_DIR/$PETSC_ARCH/lib and that points 
> to:
> ls -l /path/to/petsc/arch-linux-c-debug/lib
> total 83732
> lrwxrwxrwx 1 rk3199 user   18 Oct  7 13:56 libpetsc.so -> 
> libpetsc.so.3.18.0
> lrwxrwxrwx 1 rk3199 user   18 Oct  7 13:56 libpetsc.so.3.18 -> 
> libpetsc.so.3.18.0
> -rwxr-xr-x 1 rk3199 user 85719200 Oct  7 13:56 libpetsc.so.3.18.0
> drwxr-xr-x 3 rk3199 user 4096 Oct  6 10:22 petsc
> drwxr-xr-x 2 rk3199 user 4096 Oct  6 10:23 pkgconfig
> 
> Anything else to check?
> If modifying  CMakeLists.txt does not work, you can try export 
> LIBRARY_PATH=$LIBRARY_PATH:$PETSC_DIR/$PETSC_ARCH/lib
> LD_LIBRARY_PATHis is for run time, but the error 

Re: [petsc-users] PetscLogView produces nan's instead of timing data when using GPUs

2022-10-05 Thread Barry Smith

  It prints NaN to indicate that the time for that event is not known 
accurately. But the times for the larger events that contain these events are 
known. So, for example, the time for KSPSolve is known but not the time for 
VecNorm. The other numbers in the events, like the number of times called, that 
are not NaN are correct as displayed.

  This is done because correctly tracking the times of the individual events 
requires synchronizations that slow down the entire calculation a bit; for 
example, KSPSolve will register a longer time than it would if the smaller 
events were not timed.

  To display the times of the smaller events also use -log_view_gpu_time, but 
note this will increase the times of the larger events a bit.
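
  For example, to get the detailed per-event GPU timings (the executable name 
here is just a placeholder):

./myapp -log_view -log_view_gpu_time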

  Barry


> On Oct 5, 2022, at 4:47 PM, Sajid Ali  
> wrote:
> 
> Hi PETSc-developers, 
> 
> I'm having trouble with getting performance logs from an application that 
> uses PETSc. There are no issues when I run it on a CPU, but every time a GPU 
> is used there is no timing data and almost all times are replaced by times 
> that are just `nan` (on two different clusters). I am attaching the log files 
> for both cases with this email. Could someone explain what is happening here ?
> 
> In case it helps, here are the routines used to initialize/finalize the 
> application that also handle initializing/finalizing PETSc and printing the 
> PETSc performance logs to PETSC_VIEWER_STDOUT_WORLD : 
> https://github.com/fnalacceleratormodeling/synergia2/blob/devel3/src/synergia/utils/utils.h
>  
> 
> 
> Thank You,
> Sajid Ali (he/him) | Research Associate
> Scientific Computing Division
> Fermi National Accelerator Laboratory
> s-sajid-ali.github.io 



Re: [petsc-users] code with TS throws error at the end

2022-10-05 Thread Barry Smith

  Can you try running with valgrind? 
https://petsc.org/release/faq/?highlight=valgrind#what-does-corrupt-argument-or-caught-signal-or-segv-or-segmentation-violation-or-bus-error-mean-can-i-use-valgrind-or-cuda-memcheck-to-debug-memory-corruption-issues
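
  For example (a sketch; adapt the launcher and executable to your own run):

mpiexec -n 1 valgrind --tool=memcheck -q --num-callers=20 --log-file=valgrind.log.%p ./exe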
 


  Barry


> On Oct 5, 2022, at 7:39 AM, ashish bhole  wrote:
> 
> Hi All,
> 
> I am writing a code in Fortran to solve a linear advection equation using 
> PETSc 3.18.0 (Vec and TS). It seems to work fine on my HP elitebook laptop 
> with Fedora 30 OS and GCC 9.3.1. It gives acceptable numerical solutions, but 
> throws the following error at the end, with as well as without parallel 
> computing. The same error also appears with the slightly older version I 
> tried: PETSc 3.14.0.
> 
> The error message gives a hint for error locations, but I am unable to figure 
> out what is wrong. I have attached a snippet for my TS usage lines. I spent 
> some time searching for similar error reports but it was not so fruitful. So 
> I am approaching the PETSc community for help understanding this error. 
> Thank you.
> 
> 
> [0]PETSC ERROR: PetscTrFreeDefault() called from VecDestroy_Seq() at 
> /home/abhole/lib/petsc-3.18.0/src/vec/vec/impls/seq/bvec2.c:753
> [0]PETSC ERROR: Block [id=2154(800)] at address 0x2309ac0 is corrupted 
> (probably write past end of array)
> [0]PETSC ERROR: Block allocated in VecCreate_Seq() at 
> /home/abhole/lib/petsc-3.18.0/src/vec/vec/impls/seq/bvec3.c:34
> [0]PETSC ERROR: - Error Message 
> --
> [0]PETSC ERROR: Memory corruption: https://petsc.org/release/faq/#valgrind 
> 
> [0]PETSC ERROR: Corrupted memory
> [0]PETSC ERROR: See https://petsc.org/release/faq/ 
>  for trouble shooting.
> [0]PETSC ERROR: Petsc Release Version 3.18.0, Sep 30, 2022 
> [0]PETSC ERROR: ./exe on a arch-linux-c-debug named ischia by abhole Wed Oct  
> 5 13:06:27 2022
> [0]PETSC ERROR: Configure options --with-cc=mpicc --with-cxx=mpicxx 
> --with-fc=mpif90 --download-fblaslapack --download-scalapack --download-mumps 
> --download-superlu --download-ptscotch 
> --with-metis-include=/user/abhole/home/lib/metis-5.1.0/include 
> --with-metis-lib=/user/abhole/home/lib/metis-5.1.0/lib/libmetis.a -lmetis 
> --with-parmetis-include=/user/abhole/home/lib/parmetis-4.0.3/include 
> --with-parmetis-lib=/user/abhole/home/lib/parmetis-4.0.3/lib/libparmetis.a 
> -lparmetis -lmetis 
> --with-hdf5-include=/user/abhole/home/lib/hdf5-1.8.18/include 
> --with-hdf5-lib=/user/abhole/home/lib/hdf5-1.8.18/lib64/libhdf5.a 
> --with-valgrind=1 --with-scalar-type=real --with-precision=double
> [0]PETSC ERROR: #1 PetscTrFreeDefault() at 
> /home/abhole/lib/petsc-3.18.0/src/sys/memory/mtr.c:305
> [0]PETSC ERROR: #2 VecDestroy_Seq() at 
> /home/abhole/lib/petsc-3.18.0/src/vec/vec/impls/seq/bvec2.c:753
> [0]PETSC ERROR: #3 VecDestroy() at 
> /home/abhole/lib/petsc-3.18.0/src/vec/vec/interface/vector.c:521
> [0]PETSC ERROR: #4 VecDestroyVecs_Default() at 
> /home/abhole/lib/petsc-3.18.0/src/vec/vec/interface/vector.c:977
> [0]PETSC ERROR: #5 VecDestroyVecs() at 
> /home/abhole/lib/petsc-3.18.0/src/vec/vec/interface/vector.c:606
> [0]PETSC ERROR: #6 TSRKTableauReset() at 
> /home/abhole/lib/petsc-3.18.0/src/ts/impls/explicit/rk/rk.c:1102
> [0]PETSC ERROR: #7 TSReset_RK() at 
> /home/abhole/lib/petsc-3.18.0/src/ts/impls/explicit/rk/rk.c:1109
> [0]PETSC ERROR: #8 TSReset() at 
> /home/abhole/lib/petsc-3.18.0/src/ts/interface/ts.c:2644
> [0]PETSC ERROR: #9 TSDestroy() at 
> /home/abhole/lib/petsc-3.18.0/src/ts/interface/ts.c:2706
> [0]PETSC ERROR: #10 main.F90:159
> --
> 
> -- With Regards
> Ashish Bhole
> 



Re: [petsc-users] How to use Intel OneApi mpi wrappers on Linux

2022-10-03 Thread Barry Smith

   That is indeed disappointing. mpicc and mpiicc are simple scripts that 
select the compiler based on multiple criteria, including the environment 
variables, so it is curious that this functionality does not work.

  Barry


> On Oct 3, 2022, at 9:58 AM, Paolo Lampitella  
> wrote:
> 
> Hi Barry,
>  
> thanks for the suggestion. I tried this but doesn’t seem to work as expected. 
> That is, configure actually works, but it is because it is not seeing the 
> LLVM based compilers, only the intel classical ones. Yet the variables seem 
> correctly exported.
>  
> Paolo
>  
>  
> Da: Barry Smith <mailto:bsm...@petsc.dev>
> Inviato: lunedì 3 ottobre 2022 15:19
> A: Paolo Lampitella <mailto:paololampite...@hotmail.com>
> Cc: petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov>
> Oggetto: Re: [petsc-users] How to use Intel OneApi mpi wrappers on Linux
>  
>  
> bsmith@petsc-01:~$ mpicc
> This script invokes an appropriate specialized C MPI compiler driver.
> The following ways (priority order) can be used for changing default
> compiler name (gcc):
>1. Command line option:  -cc=
>2. Environment variable: I_MPI_CC (current value '')
>3. Environment variable: MPICH_CC (current value '')
> 
> 
> So 
> export I_MPI_CC=icx 
> export I_MPI_CXX=icpx
> export I_MPI_FC=ifx 
>  
> should do the trick.
>  
> 
> 
> On Oct 3, 2022, at 5:43 AM, Paolo Lampitella  <mailto:paololampite...@hotmail.com>> wrote:
>  
> Dear PETSc users and developers,
>  
> as per the title, I recently installed the base and HPC Intel OneApi toolkits 
> on a machine running Ubuntu 20.04.
>  
> As you probably know, OneApi comes with the classical compilers (icc, icpc, 
> ifort) and relative mpi wrappers (mpiicc, mpiicpc, mpiifort) as well as with 
> the new LLVM based compilers (icx, icpx, ifx).
>  
> My experience so far with PETSc on Linux has been without troubles using both 
> gcc compilers and either Mpich or OpenMPI and Intel classical compilers and 
> MPI.
>  
> However, I have now troubles using the MPI wrappers of the new LLVM compilers 
> as, in fact, there aren’t dedicated mpi wrappers for them. Instead, they can 
> be used with certain flags for the classical wrappers:
>  
> mpiicc -cc=icx
> mpiicpc -cxx=icpx
> mpiifort -fc=ifx
>  
> The problem I have is that I have no idea how to pass them correctly to the 
> configure and whatever comes after that.
>  
> Admittedly, I am just starting to use the new compilers, so I have no clue 
> how I would use them in other projects as well.
>  
> I started with an alias in my .bash_aliases (which works for simple 
> compilation tests from command line) but doesn’t with configure.
>  
> I also tried adding the flags to the COPTFLAGS, CXXOPTFLAGS and FOPTFLAGS but 
> didn’t work as well.
>  
> Do you have any experience with the new Intel compilers and, in case, could 
> you share hot to properly use them with MPI?
>  
> Thanks
>  
> Paolo



Re: [petsc-users] How to use Intel OneApi mpi wrappers on Linux

2022-10-03 Thread Barry Smith

bsmith@petsc-01:~$ mpicc
This script invokes an appropriate specialized C MPI compiler driver.
The following ways (priority order) can be used for changing default
compiler name (gcc):
   1. Command line option:  -cc=
   2. Environment variable: I_MPI_CC (current value '')
   3. Environment variable: MPICH_CC (current value '')

So 
export I_MPI_CC=icx 
export I_MPI_CXX=icpx
export I_MPI_FC=ifx 

should do the trick.
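
To check which underlying compiler a wrapper will actually invoke, something 
like this can be used (a sketch; -show is the MPICH-style flag these wrappers 
typically accept and it prints the underlying compile line without running it):

export I_MPI_CC=icx
mpicc -show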


> On Oct 3, 2022, at 5:43 AM, Paolo Lampitella  
> wrote:
> 
> Dear PETSc users and developers,
>  
> as per the title, I recently installed the base and HPC Intel OneApi toolkits 
> on a machine running Ubuntu 20.04.
>  
> As you probably know, OneApi comes with the classical compilers (icc, icpc, 
> ifort) and relative mpi wrappers (mpiicc, mpiicpc, mpiifort) as well as with 
> the new LLVM based compilers (icx, icpx, ifx).
>  
> My experience so far with PETSc on Linux has been without troubles using both 
> gcc compilers and either Mpich or OpenMPI and Intel classical compilers and 
> MPI.
>  
> However, I have now troubles using the MPI wrappers of the new LLVM compilers 
> as, in fact, there aren’t dedicated mpi wrappers for them. Instead, they can 
> be used with certain flags for the classical wrappers:
>  
> mpiicc -cc=icx
> mpiicpc -cxx=icpx
> mpiifort -fc=ifx
>  
> The problem I have is that I have no idea how to pass them correctly to the 
> configure and whatever comes after that.
>  
> Admittedly, I am just starting to use the new compilers, so I have no clue 
> how I would use them in other projects as well.
>  
> I started with an alias in my .bash_aliases (which works for simple 
> compilation tests from command line) but doesn’t with configure.
>  
> I also tried adding the flags to the COPTFLAGS, CXXOPTFLAGS and FOPTFLAGS but 
> didn’t work as well.
>  
> Do you have any experience with the new Intel compilers and, in case, could 
> you share hot to properly use them with MPI?
>  
> Thanks
>  
> Paolo



Re: [petsc-users] PETSc usage issues

2022-09-28 Thread Barry Smith

   -pc_type fieldsplit -pc_fieldsplit_detect_saddle_point -pc_fieldsplit_type 
schur

   Now there will be two additional decisions you need to make how to 
precondition the A00 block and the Schur complement. 

   For the A00 block the option is 

   -fieldsplit_0_pc_type <something>, where, depending on your problem, gamg 
may be a good place to start 
   -fieldsplit_0_ksp_type preonly (likely is the default)

   For the Schur complement start with just using the default and see how the 
convergence goes.

   Use -ksp_view to see all the parts of the preconditioner and 
-ksp_monitor_true_residual to see how it is converging.

   Run with -help | grep fieldsplit to see all possible options and of course 
consult https://petsc.org/release/docs/manualpages/PC/PCFIELDSPLIT.html 
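
   Putting these together on the command line might look like the following (a 
sketch; the executable name is a placeholder and gamg is only a suggested 
starting point):

./myapp -pc_type fieldsplit -pc_fieldsplit_detect_saddle_point \
        -pc_fieldsplit_type schur \
        -fieldsplit_0_ksp_type preonly -fieldsplit_0_pc_type gamg \
        -ksp_view -ksp_monitor_true_residual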


  Barry




   

> On Sep 27, 2022, at 11:47 PM, wangzj997  wrote:
> 
> Dear PETSc development team:
> 
> Currently, I am learning and trying to use PETSc's KSP to solve 
> large-scale sparse linear systems Ax = b, where A is symmetric positive 
> definite and nonsingular. However, the main diagonal of A contains many 0 
> items, which leads to the fact that many preconditioners cannot be used when 
> using MPI for multi-process solution, and the number of iterations is large, 
> and the convergence is slow. May I ask how to solve this problem? If it is 
> necessary to make the main diagonal of A all non-zero items, is there any 
> solution in PETSc?
> 
> I would be grateful if you would reply and answer my question.
> 
> Best Regards.
> 
> 
> 
> 



Re: [petsc-users] Solve Linear System with Field Split Preconditioner

2022-09-27 Thread Barry Smith

Composed preconditioners (like field split) have multiple moving parts and 
you need to "tune" them for each part separately; you cannot just run the 
entire preconditioner, get slow convergence on the entire problem and then give 
up. 

 So step one is to get a good preconditioner for the A00 block, and not 
worry about the entire fieldsplit yet (once you get good convergence on the A00 
block you can tune the Schur complement preconditioner but without good 
convergence on the A00 block it makes no sense to try to tune the Schur 
complement preconditioner). 

You can run with options to monitor convergence of the A00 block and try to 
tune for that: -fieldsplit_0_ksp_monitor_true_residual -fieldsplit_0_ksp_view 
-fieldsplit_0_pc_type gamg, and control GAMG options with 
-fieldsplit_0_pc_gamg_*

 As Mark said, you first need to provide the coordinate and null space 
information for the A00 block to have any hope of good performance

 
> On Sep 27, 2022, at 1:45 AM, 晓峰 何  wrote:
> 
> Hi Barry,
> 
> A00 is formed from elliptic operator. 
> 
> I tried GAMG with A00, but it was extremely slow to solve the system with 
> field-split preconditioner(I’m not sure I did it with the right way).
> 
> Thanks,
> Xiaofeng
> 
>> On Sep 26, 2022, at 23:11, Barry Smith > <mailto:bsm...@petsc.dev>> wrote:
>> 
>>   
>>   What is your A00 operator? ILU is almost never a good choice for large 
>> scale problems. If it is an elliptic operator that using a PC of gamg may 
>> work well for the A00 preconditioner instead of ILU.
>> 
>>   Barry
>> 
>>   For moderate size problems you can use a PC type LU for AOO to help you 
>> understand the best preconditioner to use for the A11 (the Schur complement 
>> block), once you have a good preconditioner for the A11 block you would then 
>> go back and determine a good preconditioner for the A00 block.
>> 
>>> On Sep 26, 2022, at 10:08 AM, 晓峰 何 >> <mailto:tlan...@hotmail.com>> wrote:
>>> 
>>> Hello Jed,
>>> 
>>> The saddle point is due to Lagrange multipliers, thus the size of A11 is 
>>> much smaller than A00.
>>> 
>>> 
>>> 
>>> Best Regards,
>>> 
>>> Xiaofeng
>>> 
>>> 
>>>> On Sep 26, 2022, at 21:03, Jed Brown >>> <mailto:j...@jedbrown.org>> wrote:
>>>> 
>>>> Lagrange multipliers
>>> 
>> 
> 



Re: [petsc-users] Strange mpi timing and CPU load when -np > 2

2022-09-26 Thread Barry Smith


  It is important to check out 
https://petsc.org/main/faq/?highlight=faq#why-is-my-parallel-solver-slower-than-my-sequential-solver-or-i-have-poor-speed-up
 


  In particular you will need to set appropriate binding for the mpiexec and 
should run the streams benchmark (make mpistreams) using the binding to find 
the potential performance of the system.

  If you are using a thread enabled BLAS/LAPACK that utilizes all the cores 
then you can get oversubscription and thus slow performance during BLAS/LAPACK 
calls. We try not to link with thread enabled BLAS/LAPACK by default. See 
https://petsc.org/main/docs/manual/blas-lapack/?highlight=thread%20blas
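
   For example (a sketch; the binding flags shown are OpenMPI's and the process 
count matches the run described below):

cd $PETSC_DIR && make mpistreams
mpiexec -n 3 --bind-to core --map-by core ./main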

   Barry



> On Sep 26, 2022, at 12:39 PM, Duan Junming via petsc-users 
>  wrote:
> 
> Dear all,
> 
> I am using PETSc 3.17.4 on a Linux server, compiling with --download-exodus 
> --download-hdf5 --download-openmpi --download-triangle --with-fc=0 
> --with-debugging=0 PETSC_ARCH=arch-linux-c-opt COPTFLAGS="-g -O3" 
> CXXOPTFLAGS="-g -O3".
> The strange thing is when I run my code with mpirun -np 1 ./main, the CPU 
> time is 30s.
> When I use mpirun -np 2 ./main, the CPU time is 16s. It's OK.
> But when I use more than 2 CPUs, like mpirun -np 3 ./main, the CPU time is 
> 30s.
> The output of command time is: real 0m30.189s, user 9m3.133s, sys 10m55.715s.
> I can also see that the CPU load is about 100% for each process when np = 2, 
> but the CPU load goes to 2000%, 1000%, 1000% for each process (the server has 
> 40 CPUs).
> Do you have any idea about this?
> 
> Thanks in advance!



Re: [petsc-users] Solve Linear System with Field Split Preconditioner

2022-09-26 Thread Barry Smith
  
  What is your A00 operator? ILU is almost never a good choice for large scale 
problems. If it is an elliptic operator then using a PC of gamg may work well 
for the A00 preconditioner instead of ILU.

  Barry

  For moderate size problems you can use a PC type of LU for A00 to help you 
understand the best preconditioner to use for the A11 (the Schur complement) 
block; once you have a good preconditioner for the A11 block you would then go 
back and determine a good preconditioner for the A00 block.

> On Sep 26, 2022, at 10:08 AM, 晓峰 何  wrote:
> 
> Hello Jed,
> 
> The saddle point is due to Lagrange multipliers, thus the size of A11 is much 
> smaller than A00.
> 
> 
> 
> Best Regards,
> 
> Xiaofeng
> 
> 
>> On Sep 26, 2022, at 21:03, Jed Brown > > wrote:
>> 
>> Lagrange multipliers
> 



Re: [petsc-users] C++ error! MPI_Finalize() could not be located!

2022-09-25 Thread Barry Smith


   It appears you want to use MPI (if not pass --with-mpi=0 also).

   Thus you must either

1)  have the MPI compiler wrappers in your path (mpicc, mpicxx, mpif90) or 
use --with-mpi-dir=somedirectory where MPI is installed and do NOT provide the 
compiler names (since MPI provides compiler wrappers). or

2) use the current configure options and add --download-mpich to have MPI 
installed for you.
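
   For example (sketches only; adjust the path to your MPI installation):

./configure --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90   # wrappers already in PATH
./configure --with-mpi-dir=/path/to/mpi                          # let configure find the wrappers there
./configure --download-mpich                                     # have PETSc build MPICH for you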

  Barry





> On Sep 25, 2022, at 4:49 PM, Laryssa Abdala  wrote:
> 
> Hello, 
> 
> I am trying to configure PETSc and have been getting the following message: 
> TESTING: CxxMPICheck from 
> config.packages.MPI(config/BuildSystem/config/packages/MPI.py:416)
>   
>
> ***
>  UNABLE to CONFIGURE with GIVEN OPTIONS(see configure.log for 
> details):
> ---
> C++ error! MPI_Finalize() could not be located!
> ***
> 
> I attached my configure.log file. Any suggestions on how to fix this issue 
> would be appreciated.
> 
> Thanks, 
> Laryssa
> 



Re: [petsc-users] PCApplySymmetricRight for PCBJACOBI (fwd)

2022-09-25 Thread Barry Smith

  Thanks for the bug report; your fix is correct. I have corrected it in PETSc 
and also added support for multiple blocks per MPI rank in 
https://gitlab.com/petsc/petsc/-/merge_requests/5678 


  Barry


> 
> 
> -- Forwarded message --
> Date: Thu, 22 Sep 2022 02:31:37 +0300
> From: Abylay Zhumekenov 
> To: petsc-users@mcs.anl.gov
> Subject: [petsc-users] PCApplySymmetricRight for PCBJACOBI
> 
> Hello,
> 
> I have been playing around with a block Jacobi preconditioner (PCJACOBI)
> with an incomplete Cholesky (PCICC) sub-preconditioner. Although you cannot
> set KSPSetPCSide to PC_SYMMETRIC for PCBJACOBI, you still can do it for
> PCICC. I was surprised to find that PCApplySymmetricLeft is properly
> applied to a vector. However, PCApplySymmetricRight throws an error:
> ...
>[0]PETSC ERROR: Object is in wrong state
>[0]PETSC ERROR: VecPlaceArray() was already called on this vector,
> without a call to VecResetArray()
>...
>[0]PETSC ERROR: #3 PCApplySymmetricLeft_BJacobi_Singleblock() at
> /home/abylay/KAUST/Libraries/C/petsc/src/ksp/pc/impls/bjacobi/bjacobi.c:660
>[0]PETSC ERROR: #4 PCApplySymmetricLeft() at
> /home/abylay/KAUST/Libraries/C/petsc/src/ksp/pc/interface/precon.c:532
> ...
> The problem goes away if I add:
> ...
>PetscCall(VecResetArray(bjac->x));
>PetscCall(VecResetArray(bjac->y));
> ...
> at line 698 in source file bjacobi.c. I don't know if it is a bug, and how
> I should report it, just wanted let someone know if it is.
> Thanks.
> 
> Abylay Zhumekenov



Re: [petsc-users] PETSc with 64 bit indices and MKL Sparse BLAS fails to build

2022-09-25 Thread Barry Smith

   Likely you can fix the problem by adding

#if defined(PETSC_HAVE_MKL_INTEL_ILP64)
  #define MKL_ILP64
#endif

before the #include  in

src/mat/impls/aij/seq/aijmkl/aijmkl.c  and 
src/mat/impls/baij/seq/baijmkl/baijmkl.c

  Please let us know if this resolves the problem.

  Barry
> On Sep 25, 2022, at 12:37 AM, Bro H  wrote:
> 
> Barry, thank you for answering. I did some further testing. My MKL
> version is 20220002 as detected by PETSc. I tried to compile one of
> the examples distributed with MKL
> (${MKLROOT}/latest/examples/examples_core_c.tgz/c/sparse_blas/source/sparse_csr.c)
> that contains calls to mkl_sparse_d_mv function from Sparse BLAS,
> among others. It compiled and ran without errors with 64-bit indices.
> 
> I used the following command to compile this example on Debian 11:
> 
> gcc `mkl_link_tool --quiet -c gnu_c -o gomp -i ilp64 -opts`
> sparse_csr.c `mkl_link_tool --quiet -c gnu_c -o gomp -i ilp64 -libs`
> 
> mkl_link_tool is bundled with MKL. It returns:
> 
> `mkl_link_tool --quiet -c gnu_c -o gomp -i ilp64 -opts` = -DMKL_ILP64
> -m64  -I"${MKLROOT}/include"
> 
> `mkl_link_tool --quiet -c gnu_c -o gomp -i ilp64 -opts` =
> -L${MKLROOT}/lib/intel64 -Wl,--no-as-needed -lmkl_intel_ilp64
> -lmkl_gnu_thread -lmkl_core -lgomp -lpthread -lm -ldl
> 
> So in theory it should work when PetscInt is 64 bits, am I right? The
> only problem is with the build scripts or source code in PETSc?
> 
> Per your suggestion, I have tried to change requires32bitint to 0 in
> mkl_sparse.py and mkl_sparse_optimize.py routines. PETSc used 64-bit
> indices, but configuration also set MKL to 32-bit indices, so I added
> --with-64-bit-blas-indices. I then ran into "Cannot use mkl_sparse
> with 64-bit BLAS/Lapack indices" error, and so I tried to change
> requires32bitintblas to 0 in config/BuildSystem/config/package.py.
> Configuration was successful with both PETSc and MKL set to 64-bit
> indices, however, it looks like PETSc was still using 32-bit indices
> during compilation of aijmkl implementation, with warnings such as:
> 
> petsc-3.17.4/src/mat/impls/aij/seq/aijmkl/aijmkl.c:282:98: warning:
> passing argument 7 of ‘mkl_sparse_d_export_csr’ from incompatible
> pointer type
> 
> Are there any other modifications required to make PETSc work with
> 64-bit MKL Sparse BLAS?
> 
> On Sun, Sep 25, 2022 at 3:54 AM Barry Smith  wrote:
>> 
>> 
>>  It is possible they recently added support for using it with 64 bit 
>> integers. You would need to through their documents to see how to get 
>> mkl_spblas.h. to use 64 bit integers and if the library for 64 bit integer 
>> has a different name that would need to be linked to.
>> 
>>  You would need to remove the requires32bitint = 1 from the various 
>> config/BuildSystem/config/packages/mkl_*.py routines to get configure to 
>> accept MKL sparse with 64 bit integers.
>> 
>> 
>>> On Sep 24, 2022, at 11:04 AM, Bro H  wrote:
>>> 
>>> Hello.
>>> 
>>> I would like to build PETSc with support for MKL Sparse BLAS, so that
>>> I can use MATAIJMKL for improved performance, but I also need to use
>>> 64 bit indices. I'm trying to build using the following parameters:
>>> 
>>> ./configure --force --prefix="/opt/libs/petsc" --with-precision=double
>>> --with-64-bit-indices
>>> --with-blas-lapack-dir=/opt/intel/oneapi/mkl/latest/
>>> --with-mkl_sparse=1 --with-mkl_sparse_optimize=1 --with-debugging=0
>>> --with-shared-libraries=1 --with-cxx=0 --with-mpi=1 --with-hdf5=1
>>> --with-hdf5-dir="${HDF5_ROOT}" --with-openmp=1
>>> 
>>> Configuration fails when using --with-64-bit-indices simultaneously
>>> with --with-mkl_sparse=1 and/or --with-mkl_sparse_optimize=1 with the
>>> following possible errors:
>>> 
>>> "Cannot use mkl_sparse with 64 bit integers, it is not coded for this
>>> capability"
>>> "Cannot use mkl_sparse_optimize with 64 bit integers, it is not coded
>>> for this capability"
>>> 
>>> But doesn't the latest version of MKL Sparse BLAS support 64 bit
>>> indices? It is using MKL_INT in mkl_spblas.h.
>>> 
>>> Is it possible to build PETSc with 64 bit indices and MKL Sparse BLAS?
>> 



Re: [petsc-users] PETSc with 64 bit indices and MKL Sparse BLAS fails to build

2022-09-24 Thread Barry Smith


  It is possible they recently added support for using it with 64 bit integers. 
You would need to go through their documentation to see how to get mkl_spblas.h 
to use 64 bit integers and whether the 64 bit integer library has a different 
name that would need to be linked to.

  You would need to remove the requires32bitint = 1 from the various 
config/BuildSystem/config/packages/mkl_*.py routines to get configure to accept 
MKL sparse with 64 bit integers.


> On Sep 24, 2022, at 11:04 AM, Bro H  wrote:
> 
> Hello.
> 
> I would like to build PETSc with support for MKL Sparse BLAS, so that
> I can use MATAIJMKL for improved performance, but I also need to use
> 64 bit indices. I'm trying to build using the following parameters:
> 
> ./configure --force --prefix="/opt/libs/petsc" --with-precision=double
> --with-64-bit-indices
> --with-blas-lapack-dir=/opt/intel/oneapi/mkl/latest/
> --with-mkl_sparse=1 --with-mkl_sparse_optimize=1 --with-debugging=0
> --with-shared-libraries=1 --with-cxx=0 --with-mpi=1 --with-hdf5=1
> --with-hdf5-dir="${HDF5_ROOT}" --with-openmp=1
> 
> Configuration fails when using --with-64-bit-indices simultaneously
> with --with-mkl_sparse=1 and/or --with-mkl_sparse_optimize=1 with the
> following possible errors:
> 
> "Cannot use mkl_sparse with 64 bit integers, it is not coded for this
> capability"
> "Cannot use mkl_sparse_optimize with 64 bit integers, it is not coded
> for this capability"
> 
> But doesn't the latest version of MKL Sparse BLAS support 64 bit
> indices? It is using MKL_INT in mkl_spblas.h.
> 
> Is it possible to build PETSc with 64 bit indices and MKL Sparse BLAS?



Re: [petsc-users] Using matrix-free with KSP

2022-09-13 Thread Barry Smith

  This is becoming a nightmare. 

  You pointed out something I had not thought about. Not only is the nonzero 
structure different in your case but so is the local to global mapping. Because 
certain of your ghost points are connected to different global points than in 
the standard case, the calls to MatSetValuesStencil() will not work correctly 
for the nonstandard ghost points.

The mapping is constructed in DMSetUp_DA_3D() and it is very tedious code.  In 
theory, the idx[] entries could be patched just before the call to 
ISLocalToGlobalMappingCreate() but it may be tedious to figure out exactly 
which local points need to have their global point changed. In theory it is 
easy, but the code is extremely tedious to navigate to get to exactly the place 
that needs changing.

 I'm sorry, we may have been attempting to push the DMDA construct too far from 
its simple, safe home territory and thus ended up in a morass. 

 I suggest looking at the DMSetUp_DA_3D() to see if you can figure out how to 
get the specific ghost points properly mapped but I am not sure if that is a 
reasonable task. Otherwise, I have no good suggestions. If the local to global 
mapping cannot be fixed then the entire model of using MatSetValuesStencil 
falls apart.

Barry


> On Sep 13, 2022, at 12:03 AM, Tu, Jiannan  wrote:
> 
> Indeed, it is not a standard grid structure that DMDA handles by default. 
> Normally, a grid (k, j, i) has connectivity at grid (k, j-1, i). I understand 
> if j > 0, DMDA correctly allocates memory for it. When j = 0, j-1 = -1, this 
> (k, -1, i) grid is actually the grid across the pole with k-index = (k+a3/2) 
> % a3, where a3 is the number of grids for k-index. Attached figure 
> illustrates a pair of such grids. Assume red grid is at (k = 10, j = 0, i), 
> its corresponding grid *blue dot) cross the pole is at (k = 55, j =0, i) if 
> a3 = 90.
>  
> The question is how to make DMDA allocate memory at such local column 
> stencils like (col.k = 55,  col.j = 0, col.i = i, col.c = 0), instead of the 
> column stencil (col.k = 10, col.j = -1, col.i = i, col.c = 0).
>  
> I tried insert at stencils like (col.k = 10, col.j = -1, col.i = i, col.c = 
> 0). The matrix assemble has no issues but TSSolve failed. I guess probably 
> due to wrong column locations of those Jacobian elements.
>  
> Thank you,
> Jiannan
>  
> From: Barry Smith mailto:bsm...@petsc.dev>> 
> Sent: Monday, September 12, 2022 4:18 PM
> To: Tu, Jiannan mailto:jiannan...@uml.edu>>
> Cc: petsc-users mailto:petsc-users@mcs.anl.gov>>
> Subject: Re: [petsc-users] Using matrix-free with KSP
>  
> CAUTION: This email was sent from outside the UMass Lowell network.
>  
>  
>   I understand in principle, I just cannot wrap my head around a sphere :-)
>  
> 
> 
> On Sep 12, 2022, at 3:45 PM, Tu, Jiannan  <mailto:jiannan...@uml.edu>> wrote:
>  
> Simulation domain is in spherical coordinates. The grid along co-latitude 
> direction is that j = 0 on one side of the pole and j = -1 (in fact index 0) 
> on the other side of the pole. The longitude index k for j = 0, say is 0 
> degree then for j = -1 is 180 degrees. kcm accounts for this change in 
> k-index when going from one side of the pole to another side.
>  
> I'll see if I can solve the problem with the method you suggested. 
>  
> Thank you very much.
> Jiannan
> From: Barry Smith mailto:bsm...@petsc.dev>>
> Sent: Monday, September 12, 2022 9:31 AM
> To: Tu, Jiannan mailto:jiannan...@uml.edu>>
> Cc: petsc-users mailto:petsc-users@mcs.anl.gov>>
> Subject: Re: [petsc-users] Using matrix-free with KSP
>  
> CAUTION: This email was sent from outside the UMass Lowell network.
>  
>  
>   I am not able to visualize the pattern you refer to but I am pretty sure it 
> is not a type of connectivity that DMDA handles by default since DMDA is for 
> relatively simply structured meshes. Here is what I suggest. 
>  
>The routine that does the preallocation for DMDA that you are using is in 
> src/dm/impls/da/fdda.c called DMCreateMatrix_DA_3d_MPIAIJ_Fill().  On each 
> MPI rank it determines the connectivity for each local grid vertex (calledthe 
> variable row in the routine) with all the other grid vertices (these are 
> stored  temporarily  in the array cols[]).
> The call to MatPreallocateSetLocal() takes this coupling information and 
> stores it temporarily. The calls to 
>  
>   PetscCall(MatSeqAIJSetPreallocation(J, 0, dnz));
>   PetscCall(MatMPIAIJSetPreallocation(J, 0, dnz, 0, onz));
>   MatPreallocateEnd(dnz, onz);
>   PetscCall(MatSetLocalToGlobalMapping(J, ltog, ltog));
>  
> at the end of the loop over the local grid vertices then provide the 
> information to the matrix so it now has the correct preallocation and cor

Re: [petsc-users] Using matrix-free with KSP

2022-09-12 Thread Barry Smith

  I understand in principle, I just cannot wrap my head around a sphere :-)


> On Sep 12, 2022, at 3:45 PM, Tu, Jiannan  wrote:
> 
> Simulation domain is in spherical coordinates. The grid along co-latitude 
> direction is that j = 0 on one side of the pole and j = -1 (in fact index 0) 
> on the other side of the pole. The longitude index k for j = 0, say is 0 
> degree then for j = -1 is 180 degrees. kcm accounts for this change in 
> k-index when going from one side of the pole to another side.
> 
> I'll see if I can solve the problem with the method you suggested. 
> 
> Thank you very much.
> Jiannan
> From: Barry Smith mailto:bsm...@petsc.dev>>
> Sent: Monday, September 12, 2022 9:31 AM
> To: Tu, Jiannan mailto:jiannan...@uml.edu>>
> Cc: petsc-users mailto:petsc-users@mcs.anl.gov>>
> Subject: Re: [petsc-users] Using matrix-free with KSP
>  
> CAUTION: This email was sent from outside the UMass Lowell network.
> 
> 
>   I am not able to visualize the pattern you refer to but I am pretty sure it 
> is not a type of connectivity that DMDA handles by default since DMDA is for 
> relatively simply structured meshes. Here is what I suggest.
> 
>The routine that does the preallocation for DMDA that you are using is in 
> src/dm/impls/da/fdda.c called DMCreateMatrix_DA_3d_MPIAIJ_Fill().  On each 
> MPI rank it determines the connectivity for each local grid vertex (calledthe 
> variable row in the routine) with all the other grid vertices (these are 
> stored  temporarily  in the array cols[]).
> The call to MatPreallocateSetLocal() takes this coupling information and 
> stores it temporarily. The calls to 
> 
>   PetscCall(MatSeqAIJSetPreallocation(J, 0, dnz));
>   PetscCall(MatMPIAIJSetPreallocation(J, 0, dnz, 0, onz));
>   MatPreallocateEnd(dnz, onz);
>   PetscCall(MatSetLocalToGlobalMapping(J, ltog, ltog));
> 
> at the end of the loop over the local grid vertices then provide the 
> information to the matrix so it now has the correct preallocation and correct 
> local to global mapping needed by MatSetValuesStencil()
> 
> The same loop structure is then called again where it inserts zero values 
> into the matrix at all the correctly preallocated locations.
> 
> You can copy this routine and modify it slightly to include the "special 
> coupling" that your problem requires. I do not understand your formula for 
> kcm so cannot tell you exactly what you need to change but it may correspond 
> to remapping the kk values at the extremes of the loop for (kk = kstart; kk < 
> kend + 1; kk++) {  so that the
> values nc * (slot + ii + gnx * jj + gnx * gny * kk) correspond to the correct 
> coupling location. I recommend drawing a picture of a case with a very small 
> size for nx,ny, and nz so you can see exactly how the mapping takes place and 
> trace through your modified code that it does what you need.
> 
> Barry
> 
> 
> 
> 
>> On Sep 11, 2022, at 11:01 PM, Tu, Jiannan > <mailto:jiannan...@uml.edu>> wrote:
>> 
>> Barry,
>>  
>> Creating DNDA with periodic boundary BC and using km=k-1; kp=k+1 even at the 
>> BCs solve the “augment out of range” problem for inserting elements at theae 
>> boundaries of k-index. Now the problem is due to wrapping around the north 
>> pole when j=0 and jm = -1 is reset as j=0 but on the other side of the pole 
>> with k-index replacing by kcm = [kmax+1)/2\ % (kmax+1). In my case, the 
>> “Augment out of range” happens at global row/column (26, 5265026), exactly 
>> at wrapping jm=-1 to j=0 to the other side.
>>  
>> I am thinking about how to properly use ghost grid j = -1 by setting 
>> appropriate BC there and inserting at that location without wrapping.
>>  
>> Jiannan
>>  
>> From: Barry Smith mailto:bsm...@petsc.dev>> 
>> Sent: Sunday, September 11, 2022 12:10 PM
>> To: Tu, Jiannan mailto:jiannan...@uml.edu>>
>> Cc: petsc-users mailto:petsc-users@mcs.anl.gov>>
>> Subject: Re: [petsc-users] Using matrix-free with KSP
>>  
>> CAUTION: This email was sent from outside the UMass Lowell network.
>>  
>>  
>> 
>> 
>> On Sep 11, 2022, at 9:56 AM, Tu, Jiannan > <mailto:jiannan...@uml.edu>> wrote:
>>  
>> Barry,
>>  
>> Thank you.
>>  
>> Yes, the DMDA is created with periodic BC. But it might not be the periodic 
>> BC since in azimuthal direction, the ghost grid -1 is actually grid Nmax, 
>> and Nmax+1 is grid 0.
>>  
>>Consider the following, one has n points from 0 to n-1  where n = 4, so I 
>> think with your definition of Nmax, Nmax is 3, Nmax + 1 is 4,
>>  
>>  

Re: [petsc-users] Problems with PCMGSetInterpolation

2022-09-12 Thread Barry Smith

It made some progress :-)

It would be quickest if you just sent me your code (you can use 
petsc-ma...@mcs.anl.gov <mailto:petsc-ma...@mcs.anl.gov> if you don't want it 
public).

> Somehow I thought that setting ComputeOperators and ComputeRHS will be 
> enough..

I think it is failing because, though you provide code to fill up the matrices, 
you never provide the matrices themselves; it may be implicit in the use of 
KSPSetComputeOperators() that you provide a DM that can create the matrices for 
each of the levels automatically inside PCSetUp_MG().

Barry



> On Sep 12, 2022, at 1:26 PM, Oleg Shatrov  wrote:
> 
> Unfortunately it still failed.
> 
> [0]PETSC ERROR: - Error Message 
> --
> [0]PETSC ERROR: Must call DMShellSetGlobalVector() or 
> DMShellSetCreateGlobalVector()
> [0]PETSC ERROR: See https://petsc.org/release/faq/ 
> <https://petsc.org/release/faq/> for trouble shooting.
> [0]PETSC ERROR: Petsc Development GIT revision: v3.17.3-770-gceb9926454c  GIT 
> Date: 2022-07-15 21:35:25 +
> [0]PETSC ERROR: ./a.out on a  named MacBook-Pro-Oleg.local by olegshatrov Mon 
> Sep 12 19:59:34 2022
> [0]PETSC ERROR: Configure options --prefix=/usr/local CFLAGS="-m64 -O0 -g" 
> CXXFLAGS="-m64 -O0 -g" FCFLAGS="-m64 -O0 -g -ffree-line-length-none" 
> --with-mem-debug=1 --with-debugging=1 --with-mpi-dir=/usr/local/ 
> --with-moab=true
> [0]PETSC ERROR: #1 DMCreateGlobalVector_Shell() at 
> /Users/olegshatrov/Documents/GitHub/petsc/src/dm/impls/shell/dmshell.c:206
> [0]PETSC ERROR: #2 DMCreateGlobalVector() at 
> /Users/olegshatrov/Documents/GitHub/petsc/src/dm/interface/dm.c:998
> [0]PETSC ERROR: #3 DMGetGlobalVector() at 
> /Users/olegshatrov/Documents/GitHub/petsc/src/dm/interface/dmget.c:161
> [0]PETSC ERROR: #4 KSPCreateVecs() at 
> /Users/olegshatrov/Documents/GitHub/petsc/src/ksp/ksp/interface/iterativ.c:1616
> [0]PETSC ERROR: #5 KSPSetWorkVecs() at 
> /Users/olegshatrov/Documents/GitHub/petsc/src/ksp/ksp/interface/iterativ.c:1684
> [0]PETSC ERROR: #6 KSPSetUp_Chebyshev() at 
> /Users/olegshatrov/Documents/GitHub/petsc/src/ksp/ksp/impls/cheby/cheby.c:46
> [0]PETSC ERROR: #7 KSPSetUp() at 
> /Users/olegshatrov/Documents/GitHub/petsc/src/ksp/ksp/interface/itfunc.c:362
> [0]PETSC ERROR: #8 PCSetUp_MG() at 
> /Users/olegshatrov/Documents/GitHub/petsc/src/ksp/pc/impls/mg/mg.c:1189
> [0]PETSC ERROR: #9 PCSetUp() at 
> /Users/olegshatrov/Documents/GitHub/petsc/src/ksp/pc/interface/precon.c:991
> [0]PETSC ERROR: #10 KSPSetUp() at 
> /Users/olegshatrov/Documents/GitHub/petsc/src/ksp/ksp/interface/itfunc.c:401
> [0]PETSC ERROR: #11 main() at ex28.c:67
> [0]PETSC ERROR: No PETSc Option Table entries
> [0]PETSC ERROR: End of Error Message ---send entire error 
> message to petsc-ma...@mcs.anl.gov--
> ------
> 
> Somehow I thought that setting ComputeOperators and ComputeRHS will be 
> enough..
> 
> Oleg.
> 
> 
> пн, 12 сент. 2022 г. в 19:23, Barry Smith  <mailto:bsm...@petsc.dev>>:
> 
> In PCSetUp_MG() it is creating needed vectors for each level. In order to 
> create the vectors it needs to have a "template" of the vectors. It tries
> several ways to get such a template. 
> 
> If you have provided an vector for each level with PCMGSetRhs() it uses 
> VecDuplicate for the needed vectors.
> 
> If you provided a matrix for each level with PCMGSetOperators() it uses the 
> MatCreateVectors() using that matrix
> 
> If you have not provided either of those it tries DMCreateGlobalVector() for 
> the DM associated with that level, if there is no DM it defaults to DMShell 
> and then errors.
> 
> Since it errors this seems to indicate you did not provide an operator for 
> each level, nor a right hand side vector. 
> 
> Please find attached a patch that may resolve the issue for.
> 
> In the petsc directory do
> patch -p1 < mg.patch
> make libs
> 
> then rerun your example. Please let us know if it resolves the problem and if 
> not send the complete error message
> 
> Barry
> 
> 
> 
> 
>> On Sep 12, 2022, at 11:02 AM, Oleg Shatrov > <mailto:shatrov.ole...@gmail.com>> wrote:
>> 
>> Hello!
>> 
>> Can somebody explain to me what I am doing wrong here?
>> I took tutorial 28 from KSP folder and modified it a little in order to 
>> create manual DM interpolations and set them to PC.
>> When i run the application I receive an error:
>> 
>> [0]PETSC ERROR: - Error Message 
>> --
>> 

Re: [petsc-users] Problems with PCMGSetInterpolation

2022-09-12 Thread Barry Smith
    In PCSetUp_MG() it is creating the needed vectors for each level. In order 
to create the vectors it needs to have a "template" of the vectors. It tries 
several ways to get such a template.

    If you have provided a vector for each level with PCMGSetRhs() it uses 
VecDuplicate() for the needed vectors.

    If you provided a matrix for each level with PCMGSetOperators() it uses 
MatCreateVecs() on that matrix.

    If you have not provided either of those it tries DMCreateGlobalVector() 
for the DM associated with that level; if there is no DM it defaults to DMShell 
and then errors.

    Since it errors, this seems to indicate you did not provide an operator for 
each level, nor a right hand side vector.

    Please find attached a patch that may resolve the issue for you.

    In the petsc directory do

patch -p1 < mg.patch
make libs

then rerun your example. Please let us know if it resolves the problem and if 
not send the complete error message.

    Barry

[attachment: mg.patch]
> On Sep 12, 2022, at 11:02 AM, Oleg Shatrov  wrote:
> 
> Hello!
> 
> Can somebody explain to me what I am doing wrong here?
> I took tutorial 28 from the KSP folder and modified it a little in order to 
> create manual DM interpolations and set them to PC.
> When I run the application I receive an error:
> 
> [0]PETSC ERROR: - Error Message --
> [0]PETSC ERROR: Must call DMShellSetGlobalVector() or DMShellSetCreateGlobalVector()
> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
> [0]PETSC ERROR: Petsc Development GIT revision: v3.17.3-770-gceb9926454c  GIT Date: 2022-07-15 21:35:25 +
> [0]PETSC ERROR: ./a.out on a  named MacBook-Pro-Oleg.local by olegshatrov Mon Sep 12 17:53:13 2022
> [0]PETSC ERROR: Configure options --prefix=/usr/local CFLAGS="-m64 -O0 -g" CXXFLAGS="-m64 -O0 -g" FCFLAGS="-m64 -O0 -g -ffree-line-length-none" --with-mem-debug=1 --with-debugging=1 --with-mpi-dir=/usr/local/ --with-moab=true
> [0]PETSC ERROR: #1 DMCreateGlobalVector_Shell() at /Users/olegshatrov/Documents/GitHub/petsc/src/dm/impls/shell/dmshell.c:206
> [0]PETSC ERROR: #2 DMCreateGlobalVector() at /Users/olegshatrov/Documents/GitHub/petsc/src/dm/interface/dm.c:998
> [0]PETSC ERROR: #3 DMGetGlobalVector() at /Users/olegshatrov/Documents/GitHub/petsc/src/dm/interface/dmget.c:161
> [0]PETSC ERROR: #4 KSPCreateVecs() at /Users/olegshatrov/Documents/GitHub/petsc/src/ksp/ksp/interface/iterativ.c:1616
> [0]PETSC ERROR: #5 PCSetUp_MG() at /Users/olegshatrov/Documents/GitHub/petsc/src/ksp/pc/impls/mg/mg.c:1129
> [0]PETSC ERROR: #6 PCSetUp() at /Users/olegshatrov/Documents/GitHub/petsc/src/ksp/pc/interface/precon.c:991
> [0]PETSC ERROR: #7 KSPSetUp() at /Users/olegshatrov/Documents/GitHub/petsc/src/ksp/ksp/interface/itfunc.c:401
> [0]PETSC ERROR: #8 main() at ex28.c:67
> [0]PETSC ERROR: No PETSc Option Table entries
> [0]PETSC ERROR: End of Error Message ---send entire error message to petsc-ma...@mcs.anl.gov
> 
> And this is strange, since I did not provide any Shell DM.
> When I comment out lines with setting the interpolation matrix to PCMG everything works fine.
> 
> I am testing on MacOS 12.3.
> MPI: Open MPI: 4.1.4
> gcc version 12.1.0
> The program is compiled with mpicc ex28.c -I /usr/local/include/ -L /usr/local/lib -lpetsc
> 
> Thanks in advance,
> Oleg


Re: [petsc-users] Using matrix-free with KSP

2022-09-12 Thread Barry Smith

  I am not able to visualize the pattern you refer to but I am pretty sure it 
is not a type of connectivity that DMDA handles by default since DMDA is for 
relatively simply structured meshes. Here is what I suggest.

   The routine that does the preallocation for DMDA that you are using is in 
src/dm/impls/da/fdda.c called DMCreateMatrix_DA_3d_MPIAIJ_Fill().  On each MPI 
rank it determines the connectivity for each local grid vertex (called the 
variable row in the routine) with all the other grid vertices (these are stored 
temporarily in the array cols[]).
The call to MatPreallocateSetLocal() takes this coupling information and stores 
it temporarily. The calls to 

  PetscCall(MatSeqAIJSetPreallocation(J, 0, dnz));
  PetscCall(MatMPIAIJSetPreallocation(J, 0, dnz, 0, onz));
  MatPreallocateEnd(dnz, onz);
  PetscCall(MatSetLocalToGlobalMapping(J, ltog, ltog));

at the end of the loop over the local grid vertices then provide the 
information to the matrix so it now has the correct preallocation and correct 
local to global mapping needed by MatSetValuesStencil()

The same loop structure is then called again where it inserts zero values into 
the matrix at all the correctly preallocated locations.

You can copy this routine and modify it slightly to include the "special 
coupling" that your problem requires. I do not understand your formula for kcm 
so cannot tell you exactly what you need to change but it may correspond to 
remapping the kk values at the extremes of the loop for (kk = kstart; kk < kend 
+ 1; kk++) {  so that the
values nc * (slot + ii + gnx * jj + gnx * gny * kk) correspond to the correct 
coupling location. I recommend drawing a picture of a case with a very small 
size for nx,ny, and nz so you can see exactly how the mapping takes place and 
trace through your modified code that it does what you need.

Barry




> On Sep 11, 2022, at 11:01 PM, Tu, Jiannan  wrote:
> 
> Barry,
>  
> Creating DNDA with periodic boundary BC and using km=k-1; kp=k+1 even at the 
> BCs solve the “augment out of range” problem for inserting elements at theae 
> boundaries of k-index. Now the problem is due to wrapping around the north 
> pole when j=0 and jm = -1 is reset as j=0 but on the other side of the pole 
> with k-index replacing by kcm = [kmax+1)/2\ % (kmax+1). In my case, the 
> “Augment out of range” happens at global row/column (26, 5265026), exactly at 
> wrapping jm=-1 to j=0 to the other side.
>  
> I am thinking about how to properly use ghost grid j = -1 by setting 
> appropriate BC there and inserting at that location without wrapping.
>  
> Jiannan
>  
> From: Barry Smith mailto:bsm...@petsc.dev>> 
> Sent: Sunday, September 11, 2022 12:10 PM
> To: Tu, Jiannan mailto:jiannan...@uml.edu>>
> Cc: petsc-users mailto:petsc-users@mcs.anl.gov>>
> Subject: Re: [petsc-users] Using matrix-free with KSP
>  
> CAUTION: This email was sent from outside the UMass Lowell network.
>  
>  
> 
> 
> On Sep 11, 2022, at 9:56 AM, Tu, Jiannan  <mailto:jiannan...@uml.edu>> wrote:
>  
> Barry,
>  
> Thank you.
>  
> Yes, the DMDA is created with periodic BC. But it might not be the periodic 
> BC since in azimuthal direction, the ghost grid -1 is actually grid Nmax, and 
> Nmax+1 is grid 0.
>  
>Consider the following, one has n points from 0 to n-1  where n = 4, so I 
> think with your definition of Nmax, Nmax is 3, Nmax + 1 is 4,
>  
>  
> -1   0  1   2  3  4
>  +*   *   *   * +
>  
> Thus there is one grid point at each end.  The * represent regular grid 
> points and the + ghost locations. Is this your situation? 
>  
> This is what we call periodic with a stencil width of 1 with DMDA and the 
> DMCreateMatrix() should automatically provide the correct preallocation and 
> nonzero structure.
>  
> Can you please do a MatView() on the result from DMCreateMatrix() for a very 
> small grid before you set any values and you should see the locations for the 
> coupling across the periodic boundary.
>  
> It sounds like you have a more complicated cross periodic coupling, we need 
> to understand it exactly so we can determine how to proceed.
>  
>   Barry
>  
>  
> 
>  
> Still the problem is how to properly pre-allocate matrix. Otherwise either 
> “Augument out of range” error or code stuck at inserting values if 
> MatSetOption() is called.
>  
> Jiannan
>  
> From: Barry Smith mailto:bsm...@petsc.dev>> 
> Sent: Sunday, September 11, 2022 12:10 AM
> To: Tu, Jiannan mailto:jiannan...@uml.edu>>
> Cc: petsc-users mailto:petsc-users@mcs.anl.gov>>
> Subject: Re: [petsc-users] Using matrix-free with KSP
>  
> CAUTION:

Re: [petsc-users] Using matrix-free with KSP

2022-09-11 Thread Barry Smith

  MatSetValuesStencil 
https://petsc.org/release/docs/manualpages/Mat/MatSetValuesStencil.html

 For periodic boundary conditions use negative indices for values to the left 
(below 0; that are to be
   obtained by wrapping values from right edge). For values to the right of the 
last entry using that index plus one
   etc to obtain values that obtained by wrapping the values from the left 
edge. This does not work for anything but the
   `DM_BOUNDARY_PERIODIC` boundary type.


Thus I believe it should be 

>   km = k-1;
>   kp = k+1;

  Does this not work?
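
  For illustration, a minimal sketch (the loop variables and Jac follow your 
code below; this is not your full stencil) of how the wrap-around neighbor in k 
can be addressed once the DMDA is created with DM_BOUNDARY_PERIODIC in that 
direction:

MatStencil  row, col;
PetscScalar val = 1.0;
row.k = k; row.j = j; row.i = i; row.c = ir;
col.k = k - 1;   /* may be -1 at k = 0; PETSc wraps it to the last k plane */
col.j = j; col.i = i; col.c = ir;
PetscCall(MatSetValuesStencil(Jac, 1, &row, 1, &col, &val, INSERT_VALUES));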


> On Sep 11, 2022, at 12:58 PM, Tu, Jiannan  wrote:
> 
> Barry,
>  
> Thank you. Here is part of the jacbian, showing the non-zero structure 
> (vals[nv] all set to 1 for simplicity). Np is the last regular grid in 
> k-index, a3 is the number of grids also in k-index. Nth is the max j grid 
> index. This may provide useful information for you to help me determine how 
> to pre-allocate the matrix.
>  
> Jiannan
>  
> 
> double vals [10];
> MatStencil row, col [10];
>  
> for (k = zs; k < zs+zm; k++) {
> row.k=k; zk=k-zs; 
>  
> if (k == 0) km = Np; else km = k-1;
> if (k == Np) kp = 0; else kp = k+1;
>  
> for (j = ys; j < ys+ym; j++) {
> row.j=j;
>  
> if (j == 0) {jm = j; kcm = (k+a3/2) % a3;}
> else {jm = j-1; kcm = k;}
> if (j == Nth) {jp = Nth; kcp = (k+a3/2) % a3;}
> else {jp = j+1; kcp = k;}
>  
> for (i = xs; i < xs+xm; i++) {
> row.i=i;
>  
> for (ir = 0; ir < 26; ir++) {
> row.c=ir; nv=0;
>  
> col[nv].k=k; col[nv].j=j;
>   col[nv].i=im;  //column at i-1 grid
>   col[nv].c=ir;
> vals[nv] = 1.0;
> nv++;
>  
> col[nv].k=k; col[nv].j=j;
>col[nv].i=i;
> col[nv].c=ir;
> vals[nv] = 1.0;
> nv++;
>  
> col[nv].k=k; col[nv].j=j;
>   col[nv].i=ip;  //column at i+1 grid
>   col[nv].c=ir;
> vals[nv] = 1.0;
> nv++;
>  
> col[nv].k=kcm; col[nv].j=jm; //j-1 grid. wrapped to the 
> other side of noth pole if j=0
>col[nv].i=i; col[nv].c=ir;
> vals[nv] = 1.0;
> nv++;
>  
> col[nv].k=kcp; col[nv].j=jp; //j-+ grid. wrapped to the 
> other side of south pole if j=max
>   col[nv].i=i; col[nv].c=ir;
> vals[nv] = 1.0;
> nv++;
>  
> col[nv].k=km; //k-1 grid. if k=0 then km=Np
>   col[nv].j=j;
>   col[nv].i=i; col[nv].c=ir;
> vals[nv] = 1.0;
> nv++;
>  
> col[nv].k=kp; //k+1 grid. if k=Np then kp=0
>col[nv].j=j;
>col[nv].i=i; col[nv].c=ir;
> vals[nv] = 1.0;
> nv++;
>  
> col[nv].k=k; col[nv].j=j; col[nv].i=i;
>col[nv].c=11; //intra couping to component 11
> vals[nv] = 1.0;
> nv++;
>  
> col[nv].k=k; col[nv].j=j; col[nv].i=i;
>    col[nv].c=22; //intra couping to component 22
> vals[nv] = 1.0;
> nv++;
>  
> MatSetValuesStencil(Jac,1,,nv,col,vals,INSERT_VALUES);
> }
> }
> }
> }
>  
> From: Barry Smith mailto:bsm...@petsc.dev>> 
> Sent: Sunday, September 11, 2022 12:10 PM
> To: Tu, Jiannan mailto:jiannan...@uml.edu>>
> Cc: petsc-users mailto:petsc-users@mcs.anl.gov>>
> Subject: Re: [petsc-users] Using matrix-free with KSP
>  
> CAUTION: This email was sent from outside the UMass Lowell network.
>  
>  
> 
> 
> On Sep 11, 2022, at 9:56 AM, Tu, Jiannan  <mailto:jiannan...@uml.edu>> wrote:
>  
> Barry,
>  
> Thank you.
>  
> Yes, the DMDA is created with periodic BC. But it might not be the periodic 
> BC since in azimuthal direction, the ghost grid -1 is actually grid Nmax, and 
> Nmax+1 is grid 0.
>  
>Consider the following, one has n points from 0 to n-1  where n = 4, so I 
> think with your definition of Nmax, Nmax is 3, Nmax + 1 is 4,
>  
>  
> 

Re: [petsc-users] Using matrix-free with KSP

2022-09-11 Thread Barry Smith


> On Sep 11, 2022, at 9:56 AM, Tu, Jiannan  wrote:
> 
> Barry,
>  
> Thank you.
>  
> Yes, the DMDA is created with periodic BC. But it might not be the periodic 
> BC since in azimuthal direction, the ghost grid -1 is actually grid Nmax, and 
> Nmax+1 is grid 0.

   Consider the following, one has n points from 0 to n-1  where n = 4, so I 
think with your definition of Nmax, Nmax is 3, Nmax + 1 is 4,


-1   0   1   2   3   4
 +   *   *   *   *   +

Thus there is one grid point at each end.  The * represent regular grid points 
and the + ghost locations. Is this your situation? 

This is what we call periodic with a stencil width of 1 with DMDA and the 
DMCreateMatrix() should automatically provide the correct preallocation and 
nonzero structure.

Can you please do a MatView() on the result from DMCreateMatrix() for a very 
small grid before you set any values and you should see the locations for the 
coupling across the periodic boundary.
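
A small standalone test along these lines would show it (a sketch; the grid 
sizes and dof here are placeholders):

DM  da;
Mat J;
PetscCall(DMDACreate3d(PETSC_COMM_WORLD, DM_BOUNDARY_PERIODIC, DM_BOUNDARY_PERIODIC,
                       DM_BOUNDARY_PERIODIC, DMDA_STENCIL_STAR, 4, 4, 4,
                       PETSC_DECIDE, PETSC_DECIDE, PETSC_DECIDE, 1, 1,
                       NULL, NULL, NULL, &da));
PetscCall(DMSetUp(da));
PetscCall(DMCreateMatrix(da, &J));
PetscCall(MatView(J, PETSC_VIEWER_STDOUT_WORLD)); /* look for the wrap-around columns */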

It sounds like you have a more complicated cross periodic coupling, we need to 
understand it exactly so we can determine how to proceed.

  Barry

 
>  
> Still the problem is how to properly pre-allocate matrix. Otherwise either 
> “Augument out of range” error or code stuck at inserting values if 
> MatSetOption() is called.
>  
> Jiannan
>  
> From: Barry Smith mailto:bsm...@petsc.dev>> 
> Sent: Sunday, September 11, 2022 12:10 AM
> To: Tu, Jiannan mailto:jiannan...@uml.edu>>
> Cc: petsc-users mailto:petsc-users@mcs.anl.gov>>
> Subject: Re: [petsc-users] Using matrix-free with KSP
>  
> CAUTION: This email was sent from outside the UMass Lowell network.
>  
>  
>   Have you created the DMDA with the periodic boundary conditions arguments? 
> Or are your boundary conditions not simply periodic? 
>  
>   Compare DMCreateMatrix_DA_3d_MPIAIJ() and 
> DMCreateMatrix_DA_3d_MPIAIJ_Fill() in src/dm/impls/da/fdda.c I think they 
> both attempt to handle the periodic case.
>  
>   Barry
>  
>  
>  
> 
> 
> On Sep 10, 2022, at 10:14 PM, Tu, Jiannan  <mailto:jiannan...@uml.edu>> wrote:
>  
> Barry,
>  
> I think I understand how to set values for the intra-grid coupling. But I am 
> not clear about the way to set values for inter-grid coupling. When a 
> component couples to itself at above, below, right, left, front and back 
> grids, diagonal is set as 1 for the second array of DMDASetBlockFills(). How 
> about if involving more grids? How does DMDA know where to allocate memory? 
> In my case, a component couples to itself at the other end of the grid 
> because of periodic boundary condition. There is an error message “Argument 
> out of range. Insert new nonzero at global row/column (26, 520526) into 
> matrix” because of this far away coupling. The column index 520526 is related 
> to the other end of the grid in azimuthal direction.
>  
> Thanks,
> Jiannan
>  
>  
>  
> From: Barry Smith mailto:bsm...@petsc.dev>> 
> Sent: Saturday, September 10, 2022 1:10 PM
> To: Tu, Jiannan mailto:jiannan...@uml.edu>>
> Cc: petsc-users mailto:petsc-users@mcs.anl.gov>>
> Subject: Re: [petsc-users] Using matrix-free with KSP
>  
> CAUTION: This email was sent from outside the UMass Lowell network.
>  
>  
>   If you are using DMCreateMatrix() then MatMPIAIJSetPreallocation is not 
> needed, it is automatically handled by the DM.
>  
>   It sounds like that the values you set in DMDASetBlockFills() did not 
> include all the coupling, hence the preallocation was wrong, hence the time 
> to insert values was too long.
>  
>   Barry
>  
> 
> 
> 
> On Sep 10, 2022, at 8:16 AM, Tu, Jiannan  <mailto:jiannan...@uml.edu>> wrote:
>  
> Just found MatMPIAIJSetPreallocation is needed. Now Jacobian insertion and 
> assembly is fast.
>  
> From: Barry Smith mailto:bsm...@petsc.dev>> 
> Sent: Thursday, September 8, 2022 9:53 PM
> To: Tu, Jiannan mailto:jiannan...@uml.edu>>
> Subject: Re: [petsc-users] Using matrix-free with KSP
>  
> CAUTION: This email was sent from outside the UMass Lowell network.
>  
>  
> 
> 
> 
> 
> On Sep 8, 2022, at 8:34 PM, Tu, Jiannan  <mailto:jiannan...@uml.edu>> wrote:
>  
> Barry,
>  
> Thank you.
>  
> I set up these two arrays according to the non-zero pattern of the Jacobian. 
> The DMDASetBlockFills() is then called after the call to DMCreateMatrix(da, 
> ).
>  
>   Absolutely not! It must be called before DMCreateMatrix() so that the DM 
> knows how to construct exactly the matrix you need.
>  
> 
> 
> 
> 
> Nevertheless, the program execution is still killed with exit code 9, which I assume is due to overuse of the memory.

Re: [petsc-users] Using matrix-free with KSP

2022-09-10 Thread Barry Smith

  Have you created the DMDA with the periodic boundary conditions arguments? Or 
are your boundary conditions not simply periodic? 

  Compare DMCreateMatrix_DA_3d_MPIAIJ() and DMCreateMatrix_DA_3d_MPIAIJ_Fill() 
in src/dm/impls/da/fdda.c I think they both attempt to handle the periodic case.

  Barry




> On Sep 10, 2022, at 10:14 PM, Tu, Jiannan  wrote:
> 
> Barry,
>  
> I think I understand how to set values for the intra-grid coupling. But I am 
> not clear about the way to set values for inter-grid coupling. When a 
> component couples to itself at above, below, right, left, front and back 
> grids, diagonal is set as 1 for the second array of DMDASetBlockFills(). How 
> about if involving more grids? How does DMDA know where to allocate memory? 
> In my case, a component couples to itself at the other end of the grid 
> because of periodic boundary condition. There is an error message “Argument 
> out of range. Insert new nonzero at global row/column (26, 520526) into 
> matrix” because of this far away coupling. The column index 520526 is related 
> to the other end of the grid in azimuthal direction.
>  
> Thanks,
> Jiannan
>  
>  
>  
> From: Barry Smith mailto:bsm...@petsc.dev>> 
> Sent: Saturday, September 10, 2022 1:10 PM
> To: Tu, Jiannan mailto:jiannan...@uml.edu>>
> Cc: petsc-users mailto:petsc-users@mcs.anl.gov>>
> Subject: Re: [petsc-users] Using matrix-free with KSP
>  
> CAUTION: This email was sent from outside the UMass Lowell network.
>  
>  
>   If you are using DMCreateMatrix() then MatMPIAIJSetPreallocation is not 
> needed, it is automatically handled by the DM.
>  
>   It sounds like that the values you set in DMDASetBlockFills() did not 
> include all the coupling, hence the preallocation was wrong, hence the time 
> to insert values was too long.
>  
>   Barry
>  
> 
> 
> On Sep 10, 2022, at 8:16 AM, Tu, Jiannan  <mailto:jiannan...@uml.edu>> wrote:
>  
> Just found MatMPIAIJSetPreallocation is needed. Now Jacobian insertion and 
> assembly is fast.
>  
> From: Barry Smith mailto:bsm...@petsc.dev>> 
> Sent: Thursday, September 8, 2022 9:53 PM
> To: Tu, Jiannan mailto:jiannan...@uml.edu>>
> Subject: Re: [petsc-users] Using matrix-free with KSP
>  
> CAUTION: This email was sent from outside the UMass Lowell network.
>  
>  
> 
> 
> 
> On Sep 8, 2022, at 8:34 PM, Tu, Jiannan  <mailto:jiannan...@uml.edu>> wrote:
>  
> Barry,
>  
> Thank you.
>  
> I set up these two arrays according to the non-zero pattern of the Jacobian. 
> The DMDASetBlockFills() is then called after the call to DMCreateMatrix(da, 
> ).
>  
>   Absolutely not! It must be called before DMCreateMatrix() so that the DM 
> knows how to construct exactly the matrix you need.
>  
> 
> 
> 
> Nevertheless, the program execution is still killed with exit code 9, which I 
> assume is due to overuse of the memory. My desktop machine has 32 GB RAM.
>  
> I ran the code with a smaller mesh, 100x45x90, the result is the same. With 
> that number of grids, 26 unknowns, and up to 10 non-zero at each matrix row, 
> the estimated memory to store 100x45x90x26x10 elements should be less than 1 
> GB (double precision). I am wondering why the matrix still takes too much 
> memory. Maybe I have to use matrix-free?
>  
> Thank you,
> Jiannan
>  
> From: Barry Smith mailto:bsm...@petsc.dev>> 
> Sent: Thursday, September 8, 2022 4:19 PM
> To: Tu, Jiannan mailto:jiannan...@uml.edu>>
> Subject: Re: [petsc-users] Using matrix-free with KSP
>  
> CAUTION: This email was sent from outside the UMass Lowell network.
>  
>  
> 
> 
> 
> 
> On Sep 8, 2022, at 3:39 PM, Tu, Jiannan  <mailto:jiannan...@uml.edu>> wrote:
>  
> Barry,
>  
> Thank you very much for the detailed description.
>  
> So the first array, dfill, is for intra grids and the second one, ofill is 
> for the inter-grid coupling?
>  
> In the case for inter-grid coupling, a component is only coupled to itself, 
> say x in your example is coupled to itself on above, below, right, left, 
> front, and back, how to set values for the second array? 
>  
> Then the off-diagonal one would just have values on its diagonal.
> 
> 
> 
>  
> Jiannan
>  
> From: Barry Smith <mailto:bsm...@petsc.dev>
> Sent: Thursday, September 8, 2022 2:12 PM
> To: Tu, Jiannan <mailto:jiannan...@uml.edu>
> Cc: petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov>
> Subject: Re: [petsc-users] Using matrix-free with KSP
>  
> CAUTION: This email was sent from outside the UMass Lowell network.
>  
>  
> 
> 
> 
> 
> 
> On Sep 8, 2022, 

Re: [petsc-users] Using matrix-free with KSP

2022-09-10 Thread Barry Smith

  If you are using DMCreateMatrix() then MatMPIAIJSetPreallocation is not 
needed, it is automatically handled by the DM.

  It sounds like that the values you set in DMDASetBlockFills() did not include 
all the coupling, hence the preallocation was wrong, hence the time to insert 
values was too long.

  Barry


> On Sep 10, 2022, at 8:16 AM, Tu, Jiannan  wrote:
> 
> Just found MatMPIAIJSetPreallocation is needed. Now Jacobian insertion and 
> assembly is fast.
>  
> From: Barry Smith mailto:bsm...@petsc.dev>> 
> Sent: Thursday, September 8, 2022 9:53 PM
> To: Tu, Jiannan mailto:jiannan...@uml.edu>>
> Subject: Re: [petsc-users] Using matrix-free with KSP
>  
> CAUTION: This email was sent from outside the UMass Lowell network.
>  
>  
> 
> 
> On Sep 8, 2022, at 8:34 PM, Tu, Jiannan  <mailto:jiannan...@uml.edu>> wrote:
>  
> Barry,
>  
> Thank you.
>  
> I set up these two arrays according to the non-zero pattern of the Jacobian. 
> The DMDASetBlockFills() is then called after the call to DMCreateMatrix(da, 
> ).
>  
>   Absolutely not! It must be called before DMCreateMatrix() so that the DM 
> knows how to construct exactly the matrix you need.
>  
> 
> 
> Nevertheless, the program execution is still killed with exit code 9, which I 
> assume is due to overuse of the memory. My desktop machine has 32 GB RAM.
>  
> I ran the code with a smaller mesh, 100x45x90, the result is the same. With 
> that number of grids, 26 unknowns, and up to 10 non-zero at each matrix row, 
> the estimated memory to store 100x45x90x26x10 elements should be less than 1 
> GB (double precision). I am wondering why the matrix still takes too much 
> memory. Maybe I have to use matrix-free?
>  
> Thank you,
> Jiannan
>  
> From: Barry Smith mailto:bsm...@petsc.dev>> 
> Sent: Thursday, September 8, 2022 4:19 PM
> To: Tu, Jiannan mailto:jiannan...@uml.edu>>
> Subject: Re: [petsc-users] Using matrix-free with KSP
>  
> CAUTION: This email was sent from outside the UMass Lowell network.
>  
>  
> 
> 
> 
> On Sep 8, 2022, at 3:39 PM, Tu, Jiannan  <mailto:jiannan...@uml.edu>> wrote:
>  
> Barry,
>  
> Thank you very much for the detailed description.
>  
> So the first array, dfill, is for intra grids and the second one, ofill is 
> for the inter-grid coupling?
>  
> In the case for inter-grid coupling, a component is only coupled to itself, 
> say x in your example is coupled to itself on above, below, right, left, 
> front, and back, how to set values for the second array? 
>  
> Then the off-diagonal one would just have values on its diagonal.
> 
> 
>  
> Jiannan
>  
> From: Barry Smith <mailto:bsm...@petsc.dev>
> Sent: Thursday, September 8, 2022 2:12 PM
> To: Tu, Jiannan <mailto:jiannan...@uml.edu>
> Cc: petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov>
> Subject: Re: [petsc-users] Using matrix-free with KSP
>  
> CAUTION: This email was sent from outside the UMass Lowell network.
>  
>  
> 
> 
> 
> 
> On Sep 8, 2022, at 1:47 PM, Tu, Jiannan  <mailto:jiannan...@uml.edu>> wrote:
>  
> Barry,
>  
> Thank you very much.
>  
> DMDASetBlockFills() needs two input arrays specifying sparsity pattern. In my 
> case, I have 26 components at each grid. Is it correct that I need to have 
> two arrays of 26x26 elements with value 1 for coupling and 0 for no coupling?
>  
>Exactly
> 
> 
> 
> 
> Also how to determine when the value should be 1 or 0? 
>  
>   That is problem dependent. Given paper-and-pencil you know for your stencil 
> which components are coupled with other components at each grid point and 
> which components are coupled with components at neighboring grid points. 
>  
>   Note the second array contains information about all the neighboring grid 
> points. Above, below, right, left, front, back. Normally the coupling is 
> symmetric so the nonzero structures of those 6 blocks are identical. If, for 
> your particular problem, the nonzero structures of the six are not identical, 
> you use the union of the nonzero structures of the six of them.
>  
>  
>   2d example say you have x,y,z at each grid point and the inter-point 
> coupling is x is connected to y and y is connected to x then the first array is
>  
> [1 1 0; 1 1 0; 0 0 1]; the zeros are because x is not coupled to z and y 
> is not coupled to z.
>  
> For intra grid points, say x is coupled to z and itself, z is coupled to x 
> and itself and y is not coupled to itself then
>  
> [1 0 1; 0 0 0; 1 0 1]; the middle row is zero because y is not coupled to 
> anything and the zeros in the first and last are because x and z are not coupled to y.

Re: [petsc-users] Make error --download-hdf5-fortran-bindings=1

2022-09-08 Thread Barry Smith


> On Sep 8, 2022, at 8:38 PM, Park, Heeho via petsc-users 
>  wrote:
> 
> Hi PETSc Developers,
>  
> I am having trouble compiling with the --download-hdf5-fortran-bindings=yes 
> option on Sandia’s HPC system.
> It compiles with make version 3.82, but fails to compile with make version 
> 4.2.1 in the environment.
> The issues I’m having redirecting to make version 3.82 while having the more 
> recent versions of gnu compilers like gnu/10.2.1


ptest.f90:20:7: Fatal Error: Cannot read module file 
‘/opt/openmpi/4.1/gnu/lib/mpi.mod’ opened at (1), because it was created by a 
different version of GNU Fortran
compilation terminated.

The MPI (Fortran side) cannot be used with the GFortran version you are trying 
to use. You need to have an MPI install compatible with the GNU Fortran version 
you want to use. 

I suggest showing this error to your sys-admins and asking for help in 
determining what MPI and Fortran compilers to use to get what you need.

> Is there a way around this? By the way, adding --download-make=1 did not help
> Attached is the configure log. Thank you for your help.
>  
> heepark@boca1:/projects/ADSM/software/petsc-v3.17.2$ make -v
> GNU Make 4.2.1
> Built for x86_64-redhat-linux-gnu
> Copyright (C) 1988-2016 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later  >
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.
>  
> heepark@boca1:/projects/ADSM/software/petsc-v3.17.2$ ./run_configure_gnu_O.sh
> =
>  Configuring PETSc to compile on your system  
>
> =
> =
> Running configure on HDF5; this may take several minutes  
>
> =
>   
> =
> 
> Running make on HDF5; this may take several minutes   
>
> =
> 
> ***
>  UNABLE to CONFIGURE with GIVEN OPTIONS(see configure.log for 
> details):
> ---
> Error running make; make install on HDF5
>  
>  
> Best,
>  
> Heeho D. Park, Ph.D.
>Computational Scientist & Engineer
> Center for Energy & Earth Systems
> Applied Systems Analysis and Research Dept.
>Sandia National Laboratories 
>Email: heep...@sandia.gov 
>Web: Homepage 
>  
> 



Re: [petsc-users] Using matrix-free with KSP

2022-09-08 Thread Barry Smith


> On Sep 8, 2022, at 1:47 PM, Tu, Jiannan  wrote:
> 
> Barry,
>  
> Thank you very much.
>  
> DMDASetBlockFills() needs two input arrays specifying sparsity pattern. In my 
> case, I have 26 components at each grid. Is it correct that I need to have 
> two arrays of 26x26 elements with value 1 for coupling and 0 for no coupling?

   Exactly

> Also how to determine when the value should be 1 or 0?

  That is problem dependent. Given paper-and-pencil you know for your stencil 
which components are coupled with other components at each grid point and which 
components are coupled with components at neighboring grid points. 

  Note the second array contains information about all the neighboring grid 
points. Above, below, right, left, front, back. Normally the coupling is 
symmetric so the nonzero structures of those 6 blocks are identical. If, for 
your particular problem, the nonzero structures of the six are not identical, you 
use the union of the nonzero structures of the six of them.


  2d example say you have x,y,z at each grid point and the inter-point 
coupling is x is connected to y and y is connected to x then the first array is

[1 1 0; 1 1 0; 0 0 1]; the zeros are because x is not coupled to z and y is 
not coupled to z.

For intra grid points, say x is coupled to z and itself, z is coupled to x and 
itself and y is not coupled to itself then

[1 0 1; 0 0 0; 1 0 1]; the middle row is zero because y is not coupled to 
anything and the zeros in the first and last are because x and z are not 
coupled to y.
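
  As a hedged illustration (not part of the original message), the x,y,z example
above might translate into code as follows, assuming a DMDA da already created with
dof = 3 (the x, y, z components) and interpreting the first array as dfill and the
second as ofill, per the earlier exchange in this thread:

  PetscInt dfill[9] = {1, 1, 0,   /* x row: x-x and x-y coupling within a grid point   */
                       1, 1, 0,   /* y row: y-x and y-y coupling within a grid point   */
                       0, 0, 1};  /* z row: z couples only to itself within a point    */
  PetscInt ofill[9] = {1, 0, 1,   /* x row: x-x and x-z coupling to neighboring points */
                       0, 0, 0,   /* y row: no coupling to neighboring points          */
                       1, 0, 1};  /* z row: z-x and z-z coupling to neighboring points */
  Mat J;

  PetscCall(DMDASetBlockFills(da, dfill, ofill)); /* must be called before DMCreateMatrix() */
  PetscCall(DMCreateMatrix(da, &J));              /* preallocation now uses the reduced fill */

The 3x3 arrays are stored row by row; a 1 means that coupling is preallocated and a
0 means it is skipped, which is what reduces the memory compared to assuming full
coupling among all the dof at a grid point.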


  
> Each component involves 7 stencils (i-1, j, k), (i, j, k), (i+1, j, k), (i, 
> j-1, k), (i, j+1, k), (i, j, k-1), and (i, j, k+1), and couples with several 
> other components.
>  
> Jiannan
>  
>  
>  
> Sent from Mail <https://go.microsoft.com/fwlink/?LinkId=550986> for Windows
>  
> From: Barry Smith <mailto:bsm...@petsc.dev>
> Sent: Wednesday, September 7, 2022 11:53 AM
> To: Tu, Jiannan <mailto:jiannan...@uml.edu>
> Cc: petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov>
> Subject: Re: [petsc-users] Using matrix-free with KSP
>  
> CAUTION: This email was sent from outside the UMass Lowell network.
>  
>  
> DMDASetBlockFills() or DMDASetBlockFillsSparse() (they are just different 
> ways of providing the same information) will help you here enormously, your 
> type of case is exactly what they were designed for.
>  
>  Barry
>  
> 
> 
> On Sep 7, 2022, at 10:47 AM, Tu, Jiannan  <mailto:jiannan...@uml.edu>> wrote:
>  
> Barry and Hong,
>  
> Thank you. 
>  
> There are 26 components at each grid and they are not fully coupled in terms 
> of stiff functions. Mutual coupling is among about 6 components. 
>  
> I would prefer not using matrix-free since the Jacobian is not difficult to 
> calculate and only up to 10 non-zeros at each row. I'll try 
> DMDASetBlockFills() or DMDASetBlockFillsSparse() and see how they can reduce 
> the memory usage.
>  
> Jiannan
> 
> From: Barry Smith mailto:bsm...@petsc.dev>>
> Sent: Tuesday, September 6, 2022 11:33 PM
> To: Tu, Jiannan mailto:jiannan...@uml.edu>>
> Cc: petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov> 
> mailto:petsc-users@mcs.anl.gov>>
> Subject: Re: [petsc-users] Using matrix-free with KSP
>  
> CAUTION: This email was sent from outside the UMass Lowell network.
>  
>  
> 
> 
> On Sep 6, 2022, at 11:00 PM, Tu, Jiannan  <mailto:jiannan...@uml.edu>> wrote:
>  
> I am using TS IMEX to solve a large DAE system. The DAE is obtained by 
> applying finite FV method to 3-D multi-specie ion/neutral fluids equations 
> with magnetic induction equation. The Jacobian for stiff part is formed by 
> using MatSetValuesStencil(). The Jacobian matrix is very sparse, no more than 
> 10 non-zeros on each row. MatSetValuesStencil requires local to global 
> mapping by calling ISLocaltoGlobalMapping(). Could you please instruct me how 
> to use local to global mapping? 
>  
>DMDA automatically sets up the ISLocaltoGlobalMapping() so you should not 
> need to.
>  
> I also tried using DMCreateMatrix() to create the Jacobian. While local to 
> global mapping is not necessary, the matrix takes too much memory and 
> requires 64-bit indices. I would prefer to take advantage of the sparsity of 
> the Jacobian, pre-allocate the matrix to use as little memory as possible so 
> that the code can be run on a multi-core desktop.
>  
>   If using matrix-free does not work for you because the linear solves do not 
> converge or converge too slowly. Then you might be able to decrease the 
> memory used in the matrix. The DMCreateMatrix() does take advantage of 
> sparsity and tries to preallocate only what it needs.

Re: [petsc-users] Using matrix-free with KSP

2022-09-07 Thread Barry Smith

DMDASetBlockFills() or DMDASetBlockFillsSparse() (they are just different ways 
of providing the same information) will help you here enormously; your type of 
case is exactly what they were designed for.

 Barry
> 


> On Sep 7, 2022, at 10:47 AM, Tu, Jiannan  wrote:
> 
> Barry and Hong,
> 
> Thank you. 
> 
> There are 26 components at each grid and they are not fully coupled in terms 
> of stiff functions. Mutual coupling is among about 6 components. 
> 
> I would prefer not using matrix-free since the Jacobian is not difficult to 
> calculate and only up to 10 non-zeros at each row. I'll try 
> DMDASetBlockFills() or DMDASetBlockFillsSparse() and see how they can reduce 
> the memory usage.
> 
> Jiannan
> From: Barry Smith mailto:bsm...@petsc.dev>>
> Sent: Tuesday, September 6, 2022 11:33 PM
> To: Tu, Jiannan mailto:jiannan...@uml.edu>>
> Cc: petsc-users@mcs.anl.gov <mailto:petsc-users@mcs.anl.gov> 
> mailto:petsc-users@mcs.anl.gov>>
> Subject: Re: [petsc-users] Using matrix-free with KSP
>  
> CAUTION: This email was sent from outside the UMass Lowell network.
> 
> 
> 
>> On Sep 6, 2022, at 11:00 PM, Tu, Jiannan > <mailto:jiannan...@uml.edu>> wrote:
>> 
>> I am using TS IMEX to solve a large DAE system. The DAE is obtained by 
>> applying finite FV method to 3-D multi-specie ion/neutral fluids equations 
>> with magnetic induction equation. The Jacobian for stiff part is formed by 
>> using MatSetValuesStencil(). The Jacobian matrix is very sparse, no more 
>> than 10 non-zeros on each row. MatSetValuesStencil requires local to global 
>> mapping by calling ISLocaltoGlobalMapping(). Could you please instruct me 
>> how to use local to global mapping? 
> 
>DMDA automatically sets up the ISLocaltoGlobalMapping() so you should not 
> need to.
>> 
>>  
>> I also tried using DMCreateMatrix() to create the Jacobian. While local to 
>> global mapping is not necessary, the matrix takes too much memory and 
>> requires 64-bit indices. I would prefer to take advantage of the sparsity of 
>> the Jacobian, pre-allocate the matrix to use as little memory as possible so 
>> that the code can be run on a multi-core desktop.
> 
>   If using matrix-free does not work for you because the linear solves do not 
> converge or converge too slowly. Then you might be able to decrease the 
> memory used in the matrix. The DMCreateMatrix() does take advantage of 
> sparsity and tries to preallocate only what it needs. Often what it 
> preallocates is the best possible,  but for multicomponent problems it has to 
> assume there is full coupling within all the degrees of freedom that live at 
> each at each grid point. How many components live at each grid point in your 
> model and are they not all coupled to each other in the equations? If they 
> are not fully coupled you can use either DMDASetBlockFills() or 
> DMDASetBlockFillsSparse() to indicate the reduced coupling that actually 
> exists for you model.
> 
>   Barry
> 
>>  
>> Thank you very much for your advice.
>>  
>> Jiannan



Re: [petsc-users] [slepc] nan output for eigenvectors

2022-09-07 Thread Barry Smith


   You can run with -fp_trap to have the program stop as soon as the first NaN 
or Inf appears; this can help track down why it is happening. In a debugger you 
can also set the debugger to trap on floating point exceptions (syntax is 
debugger dependent) to focus in on where it first happens.

  Barry
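
  As a minimal sketch (not from the original message), the same trapping can also be
turned on programmatically around just the suspect region, assuming a SLEPc code in
which eps is the EPS solver object and PETSc 3.17+ so that PetscCall() is available:

  PetscCall(PetscFPTrapPush(PETSC_FP_TRAP_ON)); /* stop at the first FP exception from here on */
  PetscCall(EPSSolve(eps));                     /* the computation suspected of producing NaN  */
  PetscCall(PetscFPTrapPop());                  /* restore the previous trapping mode          */

Running the whole program with -fp_trap has the same effect globally; the push/pop
pair just narrows it to one section of the code.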


> On Sep 7, 2022, at 3:20 AM, Jose E. Roman  wrote:
> 
> 
> 
>> On 7 Sept 2022, at 6:18, Patrick Alken wrote:
>> 
>> I sometimes get NaN output values in computed eigenvectors for the 
>> generalized symmetric eigenvalue problem produced by slepc. Is this a known 
>> issue, and is it related to the conditioning of the matrix pair (A,B)? Is 
>> there some way to compute a "condition number" of the matrix pair ahead of 
>> time to see if i have a good chance of getting stable eigenvectors out?
> 
> You should never get NaN. Can you send a reproducible example?
> 
>> 
>> In a possibly related issue, i am finding that petsc/slepc compiled with 
>> debugging vs optimization can produce very different eigenvectors for the 
>> same problem, while the eigenvalues are the same. The eigenvectors seem more 
>> accurate when I use the debugging version of the libraries. Could this be 
>> also a conditioning problem with the matrix pair?
> 
> What do you mean more accurate? The residual norm computed with 
> EPSComputeError() should be below the tolerance in both debugging and 
> optimized versions.
> 
> Jose
> 


